ITPub博客

首页 > Linux操作系统 > Linux操作系统 > oracle实验记录 (oracle 详细分析redo(2))

oracle实验记录 (oracle 详细分析redo(2))

原创 Linux操作系统 作者:fufuh2o 时间:2009-10-23 14:27:56 0 删除 编辑

*******关于log buffer的 设置************
 

 

 

Tuning the Log Buffer Size
Normally, tuning the log buffer size is a matter of careful balance, but you can have your cake and eat it too by using the hidden _log_io_size parameter.

How is log buffer space used?
The log buffer in the SGA is internally divided into blocks of the log block size. The value specified by the log_buffer parameter must be a multiple of the log block size. On some operating systems it is rounded up at instance start-up if necessary. Each block in the log buffer maps to a block in the current online log file, most of the time.

Consider for example the instance illustrated in the following figure. This instance has 100 log blocks in the log buffer. After a log switch, an SGA variable (b) is set to indicate that the base disk block for the log buffer is block 2. Another SGA variable (i) is used for the index into the log buffer for redo generation. This is shown as pointing to log buffer block 30 shortly after the log switch. A little later, after several log writes have occurred, the base disk block for the log buffer is disk block 82. This maps to log buffer block 81. Redo generation continues and wraps from log buffer block 100 to log buffer block 1. At the moment illustrated the index into the log buffer for redo generation is block 5. This maps to disk block 106. The log buffer is used cyclically in this way until i maps to the last block in the log file.

 

When does LGWR write?
There are two classes of LGWR writes: background writes, and sync writes. The distinction is merely as to whether another process has to wait for the log write to complete.

A background write by LGWR can be triggered in either of two ways. Whenever a process allocates space in the log buffer, the number of used log buffer blocks is calculated. This calculation is based on the two SGA variables mentioned above, and is performed under the protection of the redo allocation latch. If the number of used blocks is greater than or equal to the number of blocks specified by the _log_io_size parameter, and if LGWR is not already active, then LGWR is posted to perform. a background write. The default value for _log_io_size is 1/3 of the log buffer expressed in log blocks (with an upper bound equivalent to 1M under Oracle8). If more than this number of log buffer blocks are used, then the redo allocation latch is released and the redo writing latch is taken to check whether LGWR is already active.

When LGWR is waiting to be posted, it sleeps on an rdbms ipc message wait with a timeout of 3 seconds (as does DBWn). A background write may also be performed if this sleep times out and LGWR finds that there is some redo available to write.

The most common cause of log sync writes is the end of a transaction. When a process finishes a transaction, normally with a commit, it generates a commit marker in the redo stream. However, the transaction is not recoverable until the log block containing that commit marker has been flushed to disk. Therefore, before the process finishing the transaction can continue, it must wait for LGWR to flush that log block to disk. That process therefore posts LGWR and then sleeps on a log file sync wait with a timeout of 1 second. If several commits occur in distinct transactions before LGWR wakes up, then the commit markers are all flushed to disk in a single sync write. This is sometimes called a group commit .

The only other process to wait for log sync writes is DBWn. DBWn cannot write a database block to disc if the redo for the most recent change to that block has not yet been flushed to the redo log file. Otherwise, if the instance were to crash before that redo could be written, there would be no way to rollback those uncommitted changes, because the redo for the corresponding undo would not yet have been written either, because it would have been part of the same unwritten redo entry.

To prevent this, Oracle maintains a record of the most recent change to each current mode database block in the buffer header structure. This is stored as a redo block address (RBA) comprised of the log file sequence number, the log file block number and an offset into the block in bytes. There are three redo block addresses in the buffer header structure - the low RBA, the recovery RBA and the high RBA. It is the high RBA that is used at this point. Before writing a batch of database blocks, DBWn finds the highest high redo block address that needs to be synced before the batch can be written. DBWn then takes the redo allocation latch to ensure that the required redo block address has already been written by LGWR, and if not, it posts LGWR and sleeps on a log file sync wait.

What if the log buffer is too small?
If the log buffer is too small, then log buffer space waits will be seen during bursts of redo generation. LGWR may not begin to write redo until the _log_io_size threshold (by default, 1/3 of the log buffer or 1M whichever is less) has been exceeded, and the remainder of the log buffer may be filled before LGWR can complete its writes and free some space in the log buffer.

Ideally, the log buffer should be large enough to cope with all bursts of redo generation, without any log buffer space waits. Commonly, the most severe bursts of redo generation occur immediately after a log switch, when redo generation has been disabled for some time, and there is a backlog of demand for log buffer space.

Can the log buffer be too big?
However, if the log buffer is very big, then the default _log_io_size threshold will be big also, and so background writes may seldom be triggered. This means that all the redo will have to be flushed by sync writes, and so log file sync waits will take longer than otherwise. This impacts commit response time, and possibly DBWn performance as well.

Setting _log_io_size
Of course, it is not necessary to carefully balance log buffer space waits against log file sync waits. You can have both a large log_buffer setting to avoid log buffer space waits, and a relatively small _log_io_size setting to minimize log file sync waits and reduce their duration.

However, _log_io_size must not be so small as to keep LGWR unduly active. LGWR uses the redo allocation latch both before and after each write. If LGWR is over active, then there is increased risk of redo allocation latch contention. An over active LGWR also wastes CPU time by performing multiple small writes, and by running more often which involves extra context switches. This can impact LGWR's operating system scheduling priority, which in turn degrades LGWR performance even further.

For raw log files, or file system based log files with direct I/O enabled, the optimal setting for _log_io_size is normally just below the maximum physical I/O size supported by the operating system (device driver). For file system based log files without direct I/O, a smaller setting normally works better.

Note that the _log_io_size setting must be specified in log blocks, whereas the log_buffer parameter is set in bytes. Note further that setting _log_io_size is no excuse for having a wastefully large log_buffer setting. It is very rare to require a log buffer more than a few hundred kilobytes in size.


大致翻译下:
LGWR 分为2种写,一种为 后台写,一种为同步写 ,后台写 在4种情况下发生 1.每3S LGWR启动一次,2.若dbwn启动时发现dirty block对应的redo entries还没写入redofile,则先触发lgwr将redo entries写入redofile,3._log_io_size控制 1/3 log buffer写入(使用的日志缓冲数超过了该值,那么redo allocation latch将会被释放并且将使用redo writing latch检查lgwr是否为active。
),4.log buffer 中redo 1m,同步写只有一种情况 COMMIT.

 

关于rba(redo block address)
the log file sequence number (4 bytes)
the log file block number (4 bytes)
the byte offset into the block at which the redo record starts (2 bytes)

rba:由日志文件序号,日志文件block号,redo record在 redo block中的起始位置


RBAs are used in the following important ways.

With respect to a dirty block in the buffer cache, the low RBA is the address of the redo for the first change that was applied to the block since it was last clean, and the high RBA is the address of the redo for the most recent change to have been applied to the block.
Dirty buffers are maintained on the buffer cache checkpoint queues in low RBA order. The checkpoint RBA is the point up to which DBWn has written buffers from the checkpoint queues if incremental checkpointing is enabled -- otherwise it is the RBA of last full thread checkpoint. The checkpoint RBA is copied into the checkpoint progress record of the controlfile by the checkpoint heartbeat once every 3 seconds. Instance recovery, when needed, begins from the checkpoint RBA recorded in the controlfile. The target RBA is the point up to which DBWn should seek to advance the checkpoint RBA to satisfy instance recovery objectives.

The on-disk RBA is the point up to which LGWR has flushed the redo thread to the online log files. DBWn may not write a block for which the high RBA is beyond the on-disk RBA. Otherwise transaction recovery (rollback) would not be possible, because the redo needed to undo a change is always in the same redo record as the redo for the change itself.

 

当user update一个块 ,比如块A,block a成为dirty block, oracle为这个block a 产生redo record ,此redo record在 logfile中的 位置就是 rba,过一会 user 又一个update更新block a ,block此时已经是dirty block, 又会为它生成一条redo record,则第一个update对应的 redo record 的rba叫做block a的 lrba(low redo block address),第2个update的redo record对应的 rba叫做 hrba(high redo block address),lrba-hrba就是 检查点队列上 dirty block 的 顺序(checkpoint queue)

dbwn 写 dirty bufer时候 ,dbwn会找到需要同步的high rba,然后获得redo allocation latch 保证要写入datafile的块已经写入(用high rba与 on disk rba相比),否则触发lgwr(8I前在log file sync事件上等待,8i后不需要了)

ON DISK RBA:就是 redo log file中 最后一条redo record 的 rba,如果dbwr 要写的dirty block的 rba比这个大,证明 对应的redo record 还没写入disk,要触发lgwr


lgwr几种情况具体解释:

1. 3S 自动触发lgwr
lgwr处于空闲状态,将依赖于rdbms ipc message 等待,处于修面状态,当 3S后 出现超时, 如果lgwr发现有要redo 写出,那么lgwr将执行写操作 log file parallel write等待事件将会出现(LGWR的一个专属事件,将log buffer写到log file group member时 lgwr进程在该事件上等待完成)

SQL> alter system set event='10046 trace name context forever,level 12' scope=sp
file;

系统已更改。

SQL> startup force
ORA-32004: obsolete and/or deprecated parameter(s) specified

SQL> show parameter background_dump_dest

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
background_dump_dest                 string      F:\ORACLE\PRODUCT\10.2.0\ADMIN
                                                 \XHTEST\BDUMP
查看lgwr的trace

 


WAIT #0: nam='rdbms ipc message' ela= 16934 timeout=204 p2=0 p3=0 obj#=-1 tim=17681800618
WAIT #0: nam='log file parallel write' ela= 2322 files=1 blocks=1 requests=1 obj#=-1 tim=17681820724
WAIT #0: nam='rdbms ipc message' ela= 889179 timeout=197 p2=0 p3=0 obj#=-1 tim=17682733749
WAIT #0: nam='log file parallel write' ela= 304 files=1 blocks=2 requests=1 obj#=-1 tim=17682734184
WAIT #0: nam='rdbms ipc message' ela= 14637 timeout=108 p2=0 p3=0 obj#=-1 tim=17682750577
WAIT #0: nam='log file parallel write' ela= 314 files=1 blocks=2 requests=1 obj#=-1 tim=17682750990
WAIT #0: nam='rdbms ipc message' ela= 3228 timeout=107 p2=0 p3=0 obj#=-1 tim=17682755665
WAIT #0: nam='log file parallel write' ela= 700 files=1 blocks=2 requests=1 obj#=-1 tim=17682756449
WAIT #0: nam='rdbms ipc message' ela= 3745 timeout=107 p2=0 p3=0 obj#=-1 tim=17682761894


可以看到 这2个事件交替出现,是3S 发生一次lgwr写

 


SQL> alter system set event='10046 trace name context off' scope=spfile;

系统已更改。

SQL> startup force


2.事务提交

commit or rollback :发出这两条命令的时候将 是同步提交,发生lgwr
当事务提交时 在这些redo没有写入 redo log file时,是不能恢复的,oracle在 redo stream中记录一个 提交标记,在事务返回成功标志给用户前,必须等待lgwr写完成,
进程通知lgwr写执行,并以log file sync等待事件 休眠,超时1S


SQL> set linesize 132
SQL> column name format a30
SQL> column value format a25
SQL> select
  2    x.ksppinm  name,
  3    y.ksppstvl  value,
  4    y.ksppstdf  isdefault,
  5    decode(bitand(y.ksppstvf,7),1,'MODIFIED',4,'SYSTEM_MOD','FALSE')  ismod,
  6    decode(bitand(y.ksppstvf,2),2,'TRUE','FALSE')  isadj,x. KSPPDESC
  7  from
  8    sys.x$ksppi x,
  9    sys.x$ksppcv y
 10  where
 11    x.inst_id = userenv('Instance') and
 12    y.inst_id = userenv('Instance') and
 13    x.indx = y.indx and
 14    x.ksppinm like '%_&par%'
 15  order by
 16    translate(x.ksppinm, ' _', ' ')
 17  /
输入 par 的值:  wait_for_sync
原值   14:   x.ksppinm like '%_&par%'
新值   14:   x.ksppinm like '%_wait_for_sync%'

NAME                           VALUE                     ISDEFAULT ISMOD      IS
ADJ
------------------------------ ------------------------- --------- ---------- --
---
KSPPDESC
--------------------------------------------------------------------------------
----------------------------------------------------
_wait_for_sync                 TRUE                      TRUE      FALSE      FA
LSE
wait for sync on commit MUST BE ALWAYS TRUE

 

_wait_for_sync   参数设置为false 可以 避免 redo file sync等待,提交的进程不会等待 lgwr完成.


3.log buffer 1/3满

SQL> set linesize 132
SQL> column name format a30
SQL> column value format a25
SQL> select
  2    x.ksppinm  name,
  3    y.ksppstvl  value,
  4    y.ksppstdf  isdefault,
  5    decode(bitand(y.ksppstvf,7),1,'MODIFIED',4,'SYSTEM_MOD','FALSE')  ismod,
  6    decode(bitand(y.ksppstvf,2),2,'TRUE','FALSE')  isadj,x. KSPPDESC
  7  from
  8    sys.x$ksppi x,
  9    sys.x$ksppcv y
 10  where
 11    x.inst_id = userenv('Instance') and
 12    y.inst_id = userenv('Instance') and
 13    x.indx = y.indx and
 14    x.ksppinm like '%_&par%'
 15  order by
 16    translate(x.ksppinm, ' _', ' ')
 17  /
输入 par 的值:  log_io_size
原值   14:   x.ksppinm like '%_&par%'
新值   14:   x.ksppinm like '%_log_io_size%'

NAME                           VALUE                     ISDEFAULT ISMOD      IS
ADJ
------------------------------ ------------------------- --------- ---------- --
---
KSPPDESC
--------------------------------------------------------------------------------
----------------------------------------------------
_log_io_size                   0                         TRUE      FALSE      FA
LSE
automatically initiate log write if this many redo blocks in buffer


default为0表示dirty redo block占用log buffer 1/3时启用,如果这个值改为非0,若该参数值不大于log buffer 1/2时,该值作为
启动lgwr的一个启动值,若大于 log buffer 1/2 则忽略设置的值以 log buffer  的1/2 为 lGWR的启动条件.


4.dbwr写前
dbwr要写出的 dirty block的hight rba超过 lgwr的 on-disk rba , dbwr通知lgwr执行写出,(8I 前此时dbwr 等待log file sync事件,8i后 dbwr将这些要写到disk datafile的dirty block放入一个延迟队列,同时通知lgwr写出,这样 dbwr可以继续执行,无需等待。

 

 

 


关于log buffer space
当会话由于日志缓冲区空间不足而无法将重做日志条目复制到日志缓冲区时,会话将在log buffer space事件上等待。LGWR负责写出重做条目,腾出日志缓冲区空间。
当会话必须等待日志缓冲区中的空间变成可用以写入新的信息时产生该事件。
LGWR进程周期性地从日志缓冲区写入重做日志文件,使得日志缓冲区可以重复可以。
该等待事件表示:应用程序生成重做日志的速度比LGWR进程将其写入重做日志文件的速度快。

SQL> conn / as sysdba
已连接。
SQL> desc v$sesstat;
 名称                                      是否为空? 类型
 ----------------------------------------- -------- -----------------

 SID                                                NUMBER
 STATISTIC#                                         NUMBER
 VALUE                                              NUMBER
SQL> desc v$statname;
 名称                                      是否为空? 类型
 ----------------------------------------- -------- -----------------

 STATISTIC#                                         NUMBER
 NAME                                               VARCHAR2(64)
 CLASS                                              NUMBER
 STAT_ID                                            NUMBER

连接一个这两个表

SQL> ed
已写入 file afiedt.buf

  1* select sid,value from v$sesstat s where statistic#=(select statistic# from
v$statname t where name='redo buffer allocation retries')
SQL> /

       SID      VALUE
---------- ----------
       137          0
       146          0
       148          0
       150          0
       156          0
       157          0
       159          0
       160          0
       161          0
       162          0
       163          0

       SID      VALUE
---------- ----------
       164          0
       165          0
       166          0
       167          0
       168          0
       169          0
       170          0

已选择18行。
  1* select name,value,class from v$sysstat s where statistic#=(select statistic
# from v$statname t where name='redo buffer allocation retries')
SQL> /

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
     CLASS
----------
redo buffer allocation retries                                            0
         2


SQL> col name format a40
SQL> /

NAME                                          VALUE      CLASS
---------------------------------------- ---------- ----------
redo buffer allocation retries                    0          2


以上 查询可以看到 session procss或 system级 必须等待 空间次数(等待lgwr flush  log buffer)

redo buffer allocation retries :表示再次尝试从log buffer中分空间的次数(最好为0),process第一次没请求成功 会触发LGWR 然后等待完成或其它PROCESS 已经触发而 这个进程等待lgwr完成,完成后再次尝试从log buffer中分配空间
用 redo buffer allocation retries和redo entries可以计算出从PGA copy change vector到SGA LOG BUFFER 时必须等待的重做记录的数量所占的比例
最好为0 或<1%如果>1%且不断边大 说明从PGA copy change vector到SGA LOG BUFFER时必须等待log buffer空的日志块,可以考虑加大log buffer or  提高lgwr写效率


所以设置 log buffer 最好是lgwr写到online redofile时,log buffer还有剩余空间可以 让 其它进程使用,当lgwr完成后 log buffer不要有太多剩余空间

LGWR 进程启动 受 下面这个隐藏参数影响,default为0表示dirty redo block占用log buffer 1/3时启用,如果这个值改为非0,若该参数值不大于log buffer 1/2时,该值作为
启动lgwr的一个启动值,若大于 log buffer 1/2 则忽略设置的值以 log buffer  的1/2 为 lGWR的启动条件

 

 

_log_io_size                   0                         TRUE      FALSE      FA
LSE
automatically initiate log write if this many redo blocks in buffer


另外 只要log buffer超过1M 那么lgwr就会启动


log buffer大小设置
1.5*(平均每个事务产生的redo record大小*每秒事务的提交量)


SQL> select to_number(value,99999999999) from v$sysstat where name='user commits';

TO_NUMBER(VALUE,99999999999)
----------------------------
                         1631

找到user commit次数

SQL> select (sysdate - startup_time)*24*60*60 as seconds from v$instance;

   SECONDS
----------
     19957

SQL> select 1631/19957 from dual;

1631/19957
----------
 .08172571
 实验环境,可以看到每秒提交事务的数量非常少.

SQL> select value  from v$sysstat where name = 'redo blocks written
';

value
----------
     17494


  1* select a.redoblocks/b.trancount from (select value redoblocks from v$syssta
t where name='redo blocks written') a ,(select value trancount from v$sysstat wh
ere name='user commits') b
SQL> /

A.REDOBLOCKS/B.TRANCOUNT
------------------------
              10.3542499
计算 出平均 每个事务产生的redo record大小 (按块)

SQL> select 1.5*10.3542499* .08172571 from dual;

1.5*10.3542499*.08172571
------------------------
              1.26931264

 

 

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/12020513/viewspace-617286/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论

注册时间:2009-06-26

  • 博文量
    182
  • 访问量
    427867