• Exploring the Principles and Mechanisms of the SCN in Oracle RAC


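    While reading today I came across the topic of the SCN in RAC. To understand its internal principles and mechanisms better, I searched widely, but unfortunately there is very little usable reference material. Most posts online simply repeat one another, and material that explains the internals in detail is almost nonexistent. I am posting some useful excerpts below for my own reference and for fellow practitioners.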

    1. Achieving Read Consistency
    One of the main characteristics of the Oracle database is the ability to simultaneously provide different
    views of data. This characteristic is called multi-version read consistency. Queries will be read
    consistently; writers won’t block readers, and vice versa. Of course, multi-version read consistency also
    holds true for RAC databases, but a little more work is involved.
    The System Change Number is an Oracle internal timestamp that is crucial for read consistency. If
    the local instance requires a read-consistent version of a block, it contacts the block’s resource master to
    ascertain whether a version of the block with the same SCN, or with a more recent SCN, exists in the buffer
    cache of any remote instance. If such a block exists, then the resource master will send a request to the
    relevant remote instance to forward a read-consistent version of the block to the local instance. If the
    remote instance is holding a version of the block at the requested SCN, it sends the block immediately. If
    the remote instance is holding a newer version of the block, it creates a consistent-read (CR) copy of the
    block, applies undo to the copy to revert it to the correct SCN, and sends it over the interconnect.
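    The last step above (copy the block, then apply undo until the copy is no newer than the requested SCN) can be sketched as a toy model. This is only an illustration under assumptions: the `Block` and `UndoRecord` classes, their fields, and the function name are hypothetical and do not reflect Oracle's internal structures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UndoRecord:
    prev_scn: int    # SCN the block had before the change (hypothetical field)
    prev_value: str  # block contents before the change


@dataclass
class Block:
    scn: int
    value: str
    undo: List[UndoRecord] = field(default_factory=list)  # oldest first, newest last


def consistent_read_copy(block: Block, query_scn: int) -> Block:
    """Return a copy of `block` rolled back to `query_scn` or older.

    The original block is never modified; only the copy is rolled back.
    """
    copy = Block(block.scn, block.value, list(block.undo))
    while copy.scn > query_scn:
        if not copy.undo:
            # Loosely analogous to a "snapshot too old" situation.
            raise LookupError("not enough undo to reach the requested SCN")
        rec = copy.undo.pop()          # newest undo record first
        copy.scn, copy.value = rec.prev_scn, rec.prev_value
    return copy


# A block changed at SCN 110 ("A" -> "B") and again at SCN 120 ("B" -> "C").
blk = Block(120, "C", [UndoRecord(100, "A"), UndoRecord(110, "B")])
cr = consistent_read_copy(blk, 115)    # a query that started at SCN 115
```

    In this model a query at SCN 115 gets the version from SCN 110, while the current block at SCN 120 stays untouched in the cache.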
    2. Synchronizing System Change Numbers
    System Change Numbers are internal time stamps generated and used by the Oracle database. All events
    happening in the database are assigned SCNs, and so are transactions. The implementation Oracle uses
    to allow read consistency relies heavily on SCNs and information in the undo tablespaces to produce
    read-consistent information. System change numbers need to be in sync across the cluster. Two
    different schemes to keep SCNs current on all cluster nodes are used in Real Application Clusters: the
    broadcast-on-commit scheme and the Lamport scheme.
    The broadcast-on-commit scheme is the default scheme in 10g Release 2 and newer; it addresses a
    known problem with the Lamport scheme. Historically, the Lamport scheme was the default scheme—it
    promised better scalability as SCN propagation happened as part of other (not necessarily related)
    cluster communication and not immediately after a commit is issued on a node. This was deemed
    sufficient in most situations by Oracle, and documents available on My Oracle Support seem to confirm
    this. However, there was a problem with the Lamport scheme: It was possible for SCNs of a node to lag
    behind another node’s SCNs—especially if there was little messaging activity. The lagging of system
    change numbers meant that committed transactions on a node were “seen” a little later by the instance
    lagging behind.
    On the other hand, the broadcast-on-commit scheme is a bit more resource intensive. The log
    writer process LGWR updates the global SCN after every commit and broadcasts it to all other instances.
    The deprecated max_commit_propagation_delay initialization parameter allowed the database
    administrator to influence the default behavior in RAC 11.1; the parameter has been removed in Oracle
    11.2.
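    To make the difference between the two schemes concrete, here is a small simulation. It is a sketch only: the `Instance` class and its methods are invented for illustration, and the real propagation runs through LGWR and the interconnect rather than direct method calls.

```python
class Instance:
    """Toy RAC instance holding a local SCN (hypothetical model, not Oracle code)."""

    def __init__(self, name):
        self.name = name
        self.scn = 0
        self.peers = []

    def receive(self, scn):
        # Lamport rule: never move backwards, adopt the highest SCN seen.
        self.scn = max(self.scn, scn)

    def send_message(self, peer):
        # Lamport scheme: the current SCN piggybacks on ordinary cluster traffic.
        peer.receive(self.scn)

    def commit(self, broadcast_on_commit):
        self.scn += 1
        if broadcast_on_commit:
            # Broadcast-on-commit: push the new SCN to every peer right away.
            for peer in self.peers:
                peer.receive(self.scn)
        return self.scn


def make_pair():
    a, b = Instance("inst1"), Instance("inst2")
    a.peers, b.peers = [b], [a]
    return a, b
```

    With `broadcast_on_commit=False`, the second instance stays behind until some unrelated message happens to arrive, which is the lag described above. With `broadcast_on_commit=True`, it catches up at commit time, at the cost of one extra message per peer per commit.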
    3. Wait for master SCN
    Each instance in the cluster will generate its own SCN and subsequently, using the propagation method,
    will resynchronize to the highest SCN in the cluster.
    This wait indicates the number of times the foreground processes waited for SCNs to be acknowledged from
    other instances in the cluster.
    Before Oracle database 10g Release 2, the method of SCN propagation was driven by the parameter
    MAX_COMMIT_PROPAGATION_DELAY. Setting this to a value higher than zero uses the Lamport algorithm. Now this
    parameter is deprecated and is maintained for backward compatibility only and defaults to zero. This functionality
    is now driven by the underscore (hidden) parameter _IMMEDIATE_COMMIT_PROPAGATION and has a Boolean value of
    TRUE or FALSE.
    When the value of the parameter is set to TRUE (default) Oracle uses the “Broadcast on Commit” (BOC) algorithm for
    messaging. Although the method of propagation remains similar to the Lamport algorithm, in the case of BOC, the
    global high water mark for the SCNs sent and received is maintained, thereby reducing messaging traffic for global
    SCN synchronization and in turn improving overall performance.
    4.
    In earlier versions such as Oracle 9i, the System Commit Number (Commit SCN) of every
    commit is broadcast to all the nodes, and the log writer is held up
    until all the pending redo is written to disk. Starting with Oracle 10g, the wait is greatly
    reduced because the broadcast and commit are asynchronous. This means the system waits until it
    is sure that all nodes have seen the Commit SCN. Any message with an SCN greater than
    Commit SCN is deemed sufficient.
    Before doing a broadcast, the process checks whether it has already received a higher SCN
    from that instance. Previously, the same SCN was used to determine whether a foreground process or an LMS had to
    be posted. With Oracle 10g, this is decoupled: an SCN is needed to release foregrounds and an
    SCN is needed for shipping buffers. The init.ora parameter “_lgwr_async_broadcasts = true” can
    be used to change the broadcast method.
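    The "check before broadcast" idea in this excerpt can be illustrated with a small bookkeeping class. This is a minimal sketch under assumptions: the class and method names are hypothetical, and it treats an SCN equal to the Commit SCN as sufficient, whereas the excerpt only states that a greater SCN is sufficient.

```python
class BroadcastTracker:
    """Tracks the highest SCN received from each peer (hypothetical helper)."""

    def __init__(self, peers):
        self.seen_from = {peer: 0 for peer in peers}

    def on_message(self, peer, scn):
        # Any message from a peer may carry an SCN; remember the highest one.
        self.seen_from[peer] = max(self.seen_from[peer], scn)

    def peers_needing_broadcast(self, commit_scn):
        # Assumption: a peer that has already reported an SCN >= commit_scn
        # is known to be caught up, so no explicit broadcast is needed for it.
        return [p for p, scn in self.seen_from.items() if scn < commit_scn]

    def commit_visible_everywhere(self, commit_scn):
        return not self.peers_needing_broadcast(commit_scn)


tracker = BroadcastTracker(["inst2", "inst3"])
tracker.on_message("inst2", 105)                 # inst2 is already past SCN 100
pending = tracker.peers_needing_broadcast(100)   # only inst3 still needs it
```

    The point of the sketch is that the commit does not have to wait for a dedicated acknowledgement from every peer; any message that proves a peer has moved past the Commit SCN serves the same purpose.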
    5.
    1) The number of outstanding broadcasts increased from 3 to 8. This improves throughput but does not affect latency.
    2) LGWR can now issue direct and indirect sends. This frees up the local LMS processes and improves latency.
    3) Processing is not limited to LMS0. The SCN is hashed to determine which LMS process will send the message (indirect send) or process the broadcast and send the ACK back to the broadcasting node. This improves general performance by reducing the load on the local (indirect sends) and remote LMS0 processes.
    4) Broadcast and acknowledgement messages are no longer blocked by DRM events. This improves BOC latency by eliminating the up to 0.5-second delay introduced by Dynamic Remastering.
    5) All Cache Fusion messages can now carry the broadcast SCN. This reduces the need for explicit broadcasts, thereby reducing the number of messages on the private interconnect and possibly reducing latency.
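    Item 3 says the SCN is hashed to pick an LMS process instead of always using LMS0. The excerpt does not give the hash function, so the sketch below simply assumes a modulo over the number of LMS processes; the function name and the modulo choice are illustrative assumptions, not Oracle's actual algorithm.

```python
from collections import Counter


def lms_for_scn(scn: int, lms_count: int) -> int:
    """Pick an LMS process index for a broadcast SCN.

    Assumption: a plain modulo stands in for the real (undocumented) hash.
    """
    if lms_count <= 0:
        raise ValueError("lms_count must be positive")
    return scn % lms_count


# Eight consecutive commit SCNs spread over four LMS processes.
load = Counter(lms_for_scn(scn, 4) for scn in range(1000, 1008))
```

    Because consecutive SCNs land on different processes, the broadcast work that used to queue up on LMS0 is spread across all of them.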
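    Observations:
    1) Many sources say that the SCN is synchronized across all nodes to keep data views consistent. On reflection, though, if the only goal were data consistency between nodes, it would be enough to send the SCN to the resource's master node at commit time; there would be no need to send it to every node.
    2) Following on from that, the main purpose of sending the SCN to all nodes at commit time is to maintain the SCN high-water mark across the whole cluster, so that any node can obtain the highest SCN at any moment without excessive messaging or added latency.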
     
  • Original article: https://www.cnblogs.com/lhdz_bj/p/9110536.html