ITPub博客

首页 > 数据库 > Oracle > Oracle 11g ADG Automatic Block Repair(ABMR自动块修复)功能测试

Oracle 11g ADG Automatic Block Repair(ABMR自动块修复)功能测试

Oracle 作者:尛样儿 时间:2014-03-23 23:27:35 0 删除 编辑

11G ADG automatic repair corrupt data blocks(ABMR自动坏块修复)测试一


ORACLE 11g ACTIVE Data Guard 不仅提供了physical standby 可以open read-only ,还提供了一个无论是主备库发现currupt block的情况下,在满足一定条件下,透明的有ABMR后台进程自动恢复该块从另一方的可用的block.下面附上一段官方解释,开始我的测试。

Automatic Block Repair

Automatic block repair allows corrupt data blocks to be automatically repaired as soon as the corruption is detected. This feature reduces the amount of time that data is inaccessible due to block corruption. This reduces block recovery time by using up-to-date good blocks in real-time, as opposed to retrieving blocks from disk or tape backups, or from Flashback logs.

Note:
Oracle Active Data Guard is packaged as a separate database option for Oracle Enterprise Edition. It requires a license for production database and all of the physical standby databases that are used for the Oracle Active Data Guard option.

if
A corrupt data block is discovered on a primary database
then
A physical standby database operating in real-time query mode can be used to repair corrupt data blocks in a primary database. If possible, any corrupt data block encountered when a primary database is accessed will be automatically replaced with an uncorrupted copy of that block from a physical standby database operating in real-time query mode. An ORA-1578 error is returned when automatic repair is not possible.
if
A corrupt data block is discovered on a physical standby database
then
The server attempts to automatically repair the corruption by obtaining a copy of the block from the primary database if the following database initialization parameters are configured on the standby database:

Configure the LOG_ARCHIVE_CONFIG parameter with a DG_CONFIG list
Configure a LOG_ARCHIVE_DEST_n parameter for the primary database

You can also manually repair a corrupted data block by using the RMAN RECOVER BLOCK command.

read more http://docs.oracle.com/cd/E11882_01/server.112/e17157/unplanned.htm#HAOVW141

PORA92 is primary site, SORA245 is standby site

搭建测试环境11G ADG 部门省略。

sys@PORA92>create tablespace tbs_test datafile '/oradata/pora92/pora92/tbs_test01.dbf' size 10m autoextend on maxsize 1g;
Tablespace created.

sys@PORA92>conn anbob
Enter password: 
Connected.
anbob@PORA92>create table testbmr tablespace tbs_test as select 1 id from dual ;
Table created.

anbob@PORA92>select distinct  dbms_rowid.rowid_block_number(rowid) from testbmr;
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
------------------------------------
                                 131

1,测试主库的currupt block auto repair

[oracle@dbserver92 ~]$ dd of=/oradata/pora92/pora92/tbs_test01.dbf bs=8192 seek=131 conv=notrunc count=1 if=/dev/zero
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 3.6111e-05 seconds, 227 MB/s

[oracle@dbserver92 ~]$ dbv file='/oradata/pora92/pora92/tbs_test01.dbf' blocksize=8192
DBVERIFY: Release 11.2.0.4.0 - Production on Thu Feb 20 14:20:04 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /oradata/pora92/pora92/tbs_test01.dbf
Page 131 is marked corrupt
Corrupt block relative dba: 0x01400083 (file 5, block 131)
Completely zero block found during dbv: 

DBVERIFY - Verification complete

Total Pages Examined         : 1280
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 127
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 1152
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1079914 (0.1079914) sys@PORA92>alter system flush buffer_cache; sys@PORA92>select * from anbob.testbmr;
                  ID
--------------------
                   1

[oracle@dbserver92 ~]$ dbv file='/oradata/pora92/pora92/tbs_test01.dbf' blocksize=8192
DBVERIFY: Release 11.2.0.4.0 - Production on Thu Feb 20 14:21:48 2014

DBVERIFY - Verification complete

Total Pages Examined         : 1280
Total Pages Processed (Data) : 1
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 130
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 1149
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1079997 (0.1079997)

alert log
===========================
ALTER SYSTEM: Flushing buffer cache
2014-02-20 14:22:23.779000 +08:00
Hex dump of (file 5, block 131) in trace file /oracle/diag/rdbms/pora92/pora92/trace/pora92_ora_21197.trc
Corrupt block relative dba: 0×01400083 (file 5, block 131)
Completely zero block found during buffer read
Reading datafile ‘/oradata/pora92/pora92/tbs_test01.dbf’ for corruption at rdba: 0×01400083 (file 5, block 131)
Reread (file 5, block 131) found same corrupt data (no logical check)
Starting background process ABMR
ABMR started with pid=32, OS id=21199
Automatic block media recovery service is active.
Automatic block media recovery requested for (file# 5, block# 131)
Automatic block media recovery successful for (file# 5, block# 131)
Automatic block media recovery successful for (file# 5, block# 131)

2,测试备库的currupt block auto repair

– on standby site

>select name,log_mode,CONTROLFILE_TYPE,OPEN_MODE,PROTECTION_MODE, REMOTE_ARCHIVE, DATABASE_ROLE from v$database;

NAME      LOG_MODE     CONTROL OPEN_MODE            PROTECTION_MODE      REMOTE_A DATABASE_ROLE
--------- ------------ ------- -------------------- -------------------- -------- ----------------
PORA92    ARCHIVELOG   STANDBY READ ONLY WITH APPLY MAXIMUM PERFORMANCE  ENABLED  PHYSICAL STANDBY


[oracle@dbserver58 ~]$ dd of=/oradata/sora245/tbs_test01.dbf bs=8192 seek=131 conv=notrunc count=1 if=/dev/zero                                     
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 6.7053e-05 seconds, 122 MB/s
[oracle@dbserver58 ~]$ dbv file='/oradata/sora245/tbs_test01.dbf' blocksize=8192                                     

DBVERIFY: Release 11.2.0.4.0 - Production on Thu Feb 20 14:51:08 2014

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /oradata/sora245/tbs_test01.dbf
Page 131 is marked corrupt Corrupt block relative dba: 0x01400083 (file 5, block 131) Completely zero block found during dbv: 



DBVERIFY - Verification complete

Total Pages Examined         : 1280
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 130
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 1149 Total Pages Marked Corrupt   : 1 Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1079997 (0.1079997)


因为备库通常我们很少用select,先试一下此时通常的操作rman 会遭遇什么?

RMAN> backup database;
...
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/oradata/sora245/system01.dbf
..
including current SPFILE in backup set
channel ORA_DISK_1: starting piece 1 at 20-FEB-14
channel ORA_DISK_1: finished piece 1 at 20-FEB-14
piece handle=/flrvarea/SORA245/SORA245/backupset/2014_02_20/o1_mf_ncsnf_TAG20140220T145617_9jc9rn1k_.bkp tag=TAG20140220T145617 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================

RMAN-03009: failure of backup command on ORA_DISK_1 channel at 02/20/2014 14:56:18
ORA-19566: exceeded limit of 0 corrupt blocks for file /oradata/sora245/tbs_test01.dbf

note:
now RMAN helped me to prove that there is a block corruption as the backup is failed with “ORA-19566: exceeded limit of 0 corrupt blocks”
Also, now Oracle reported the error in alert log as following

alert log
=====================
2014-02-20 15:22:53.353000 +08:00
Hex dump of (file 5, block 131) in trace file /oracle/diag/rdbms/sora245/sora245/trace/sora245_ora_2318.trc
Corrupt block relative dba: 0×01400083 (file 5, block 131)
Completely zero block found during backing up datafile
Reread of blocknum=131, file=/oradata/sora245/tbs_test01.dbf. found same corrupt data
Reread of blocknum=131, file=/oradata/sora245/tbs_test01.dbf. found same corrupt data
Reread of blocknum=131, file=/oradata/sora245/tbs_test01.dbf. found same corrupt data
Reread of blocknum=131, file=/oradata/sora245/tbs_test01.dbf. found same corrupt data
Reread of blocknum=131, file=/oradata/sora245/tbs_test01.dbf. found same corrupt data
Deleted Oracle managed file /flrvarea/SORA245/SORA245/backupset/2014_02_20/o1_mf_nnndf_TAG20140220T152253_9jccbf9x_.bkp

sys@sora245>select * from anbob.testbmr;
ID
——————–
1

sys@sora245>alter system flush buffer_cache;
System altered.

sys@sora245>select dbms_rowid.ROWID_RELATIVE_FNO(rowid), dbms_rowid.rowid_block_number(rowid) from anbob.testbmr;
DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID) DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
———————————— ————————————
5 131

sys@sora245>select * from anbob.testbmr;
ID
——————–
1

sys@sora245>select * from V$DATABASE_BLOCK_CORRUPTION;
FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO
——————– ——————– ——————– ——————– ———
5 131 1 0 ALL ZERO


NOTE:
这里出了个小插曲, 可以查询select,但是块提示还是curuppted,此刻再次rman backup 还是报此块损坏,exp 逻辑正常,而且alert 日志也没有出现abmr的日志输出。

下面查看是不是禁用了ABMR?_auto_bmr 隐藏控制。和log_archive_config 参数archive_log_dest_N配置也是正常的

sys@sora245>@p bmr
old   4: and lower(n.ksppinm) like lower('%&1%')
new   4: and lower(n.ksppinm) like lower('%bmr%')

NAME                                     VALUE
---------------------------------------- ----------------------------------------
_auto_bmr                                enabled
_auto_bmr_req_timeout                    60
_auto_bmr_sess_threshold                 30
_auto_bmr_pub_timeout                    10
_auto_bmr_fc_time                        60
_auto_bmr_bg_time                        3600
_auto_bmr_sys_threshold                  100

后来做了 dump block trace发现物理块上是空块,多次flush buffer cache的操作查询仍然可以,那只有来自buffer cache,下面确认的确是。

sys@sora245>select file#,block#,status,objd,stale from v$bh where file#=5 and block#=131;

               FILE#               BLOCK# STATUS                     OBJD S
-------------------- -------------------- ---------- -------------------- -
                   5                  131 mrec                      87361 N


TIP:
mrec state means block in media recovery mode.

还曾经怀疑备库的noformated block不可以被automatic repair,其实不是的。解决上面的问题是在主库对该条记录进行更新,如果发于standby 的block 是空或zero all,会自动re-formated.

-- on primary site

sys@PORA92>update anbob.testbmr set id=3 where id=1;
1 row updated.

sys@PORA92>commit;

-- on standby site

sys@sora245>select * from anbob.testbmr;
                  ID
--------------------
                   3

后做了data block trace 发现块是更新了,dbv, rman 也检查正常通过了。下面来看一下buffer cache中flush 不掉的mrec block,通过重启redo apply清除。

sys@sora245>select file#,block#,status,objd,stale from v$bh where file#=5 and block#=131;

               FILE#               BLOCK# STATUS                     OBJD S
-------------------- -------------------- ---------- -------------------- -
                   5                  131 free                      87361 N
                   5                  131 mrec                      87361 N
                   5                  131 free                      87361 N		

sys@sora245>alter database recover managed standby database cancel;

sys@sora245>alter system flush buffer_cache;

sys@sora245>select file#,block#,status,objd,stale from v$bh where file#=5 and block#=131;

               FILE#               BLOCK# STATUS                     OBJD S
-------------------- -------------------- ---------- -------------------- -
                   5                  131 free                      87361 N
                   5                  131 free                      87361 N
                   5                  131 free                      87361 N

对于上面的standby 端的automatic block repair 是失败的,因为有BH中的mrec状态 block,虽然后来通过primary site的块更新覆盖的standby 的block,在下一篇我会重新测试,如果bh 中没有mrec的block 是不是可以触发ABMR.


11G ADG automatic repair corrupt data blocks(ABMR自动坏块修复)测试二


上一篇11G ADG automatic repair corrupt data blocks(ABMR自动坏块修复)测试一中,primary site 的ABMR 测试是成功的,而standby site 的ABMR没有成功, 后来是怀疑buffer header中有mrec 的block, 现在我接着测试如果buffer header 中没有此类block,ABMR是否成功?以前用bbed 标记corrupt block的方式, 如果有corrupt block时rman backup 又会怎么样?实验开始..

sys@sora245>select dbms_rowid.ROWID_RELATIVE_FNO(rowid), dbms_rowid.rowid_block_number(rowid) from anbob.testbmr;
DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID) DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
------------------------------------ ------------------------------------
                                   5                                  131

sys@sora245>select file#,block#,status,objd,stale from v$bh where file#=5 and block#=131;

               FILE#               BLOCK# STATUS                     OBJD S
-------------------- -------------------- ---------- -------------------- -
                   5                  131 free                      87361 N
                   5                  131 xcur                      87361 N

sys@sora245>alter system flush buffer_cache;
System altered.

sys@sora245>select file#,block#,status,objd,stale from v$bh where file#=5 and block#=131;
               FILE#               BLOCK# STATUS                     OBJD S
-------------------- -------------------- ---------- -------------------- -
                   5                  131 free                      87361 N
                   5                  131 free                      87361 N [oracle@dbserver58 ~]$ dd of=/oradata/sora245/tbs_test01.dbf bs=8192 seek=131 conv=notrunc count=1 if=/dev/zero                                   
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 3.8116e-05 seconds, 215 MB/s


sys@sora245>select * from anbob.testbmr;
                  ID
--------------------
                   3

alert log writen the following :
=======================================
Hex dump of (file 5, block 131) in trace file /oracle/diag/rdbms/sora245/sora245/trace/sora245_ora_10022.trc
Corrupt block relative dba: 0×01400083 (file 5, block 131)
Completely zero block found during multiblock buffer read
Reading datafile ‘/oradata/sora245/tbs_test01.dbf’ for corruption at rdba: 0×01400083 (file 5, block 131)
Reread (file 5, block 131) found same corrupt data (no logical check)
Automatic block media recovery requested for (file# 5, block# 131)
2014-02-21 13:41:02.134000 +08:00
Automatic block media recovery requested for (file# 5, block# 131)
Automatic block media recovery successful for (file# 5, block# 131)
Automatic block media recovery successful for (file# 5, block# 131)

TIP:
可以看到这次用了同样的dd 破坏造成的物理corrupt block,buffer header中没有mrec 状态的block 时是可以成功自动repair的(select时).

BBED 构造逻辑currup block及rman 备份测试

[oracle@dbserver58 ~]$ bbed filename=/oradata/sora245/tbs_test01.dbf mode=edit
Password: 

BBED: Release 2.0.0.0.0 - Limited Production on Fri Feb 21 16:46:05 2014

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

************* !!! For Oracle Internal Use only !!! ***************

BBED> set block 131
        BLOCK#          131

BBED> dump                              
 File: /oradata/sora245/tbs_test01.dbf (0)
 Block: 131              Offsets:    0 to  511           Dba:0x00000000
------------------------------------------------------------------------
 06a20000 83004001 a4951000 00000206 3fc50000 01000000 41550100 b77a1000 
 00000000 03003200 80004001 ffff0000 00000000 00000000 00000000 00800000 
 b77a1000 0a000800 a4030000 bc01c000 ff003f00 01200000 a4951000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00010100 
 ffff1400 771f631f 631f0000 01007a1f 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 

 <32 bytes per line>

BBED> corrupt block 131              
Block marked media corrupt.

BBED> dump
 File: /oradata/sora245/tbs_test01.dbf (0)
 Block: 131              Offsets:    0 to  511           Dba:0x00000000
------------------------------------------------------------------------
 06a20000 83000000 00000000 0000ff04 6fc60000 01000000 41550100 b77a1000 
 00000000 03003200 80004001 ffff0000 00000000 00000000 00000000 00800000 
 b77a1000 0a000800 a4030000 bc01c000 ff003f00 01200000 a4951000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00010100 
 ffff1400 771f631f 631f0000 01007a1f 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 

 <32 bytes per line>
 
 RMAN> backup datafile 5;
 
Starting backup at 21-FEB-14
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=228 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00005 name=/oradata/sora245/tbs_test01.dbf
channel ORA_DISK_1: starting piece 1 at 21-FEB-14
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 02/21/2014 16:50:14
ORA-19566: exceeded limit of 0 corrupt blocks for file /oradata/sora245/tbs_test01.dbf

sys@sora245>alter system flush buffer_cache;
System altered.

sys@sora245>select * from anbob.testbmr;
                  ID
--------------------
                   3
alert log
============================
Hex dump of (file 5, block 131) in trace file /oracle/diag/rdbms/sora245/sora245/trace/sora245_ora_10982.trc
Corrupt block relative dba: 0x01400083 (file 5, block 131)
Bad header found during backing up datafile
Data in bad block:
 type: 6 format: 2 rdba: 0x00000083
 last change scn: 0x0000.00000000 seq: 0xff flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x000006ff
 check value in block header: 0xc66f
 computed block checksum: 0x0
Reread of blocknum=131, file=/oradata/sora245/tbs_test01.dbf. found same corrupt data
Reread of blocknum=131, file=/oradata/sora245/tbs_test01.dbf. found same corrupt data
Reread of blocknum=131, file=/oradata/sora245/tbs_test01.dbf. found same corrupt data
Reread of blocknum=131, file=/oradata/sora245/tbs_test01.dbf. found same corrupt data
Reread of blocknum=131, file=/oradata/sora245/tbs_test01.dbf. found same corrupt data
Deleted Oracle managed file /flrvarea/SORA245/SORA245/backupset/2014_02_21/o1_mf_nnndf_TAG20140221T165012_9jg4t51o_.bkp
2014-02-21 16:50:47.965000 +08:00
ALTER SYSTEM: Flushing buffer cache
2014-02-21 16:50:49.831000 +08:00
Hex dump of (file 5, block 131) in trace file /oracle/diag/rdbms/sora245/sora245/trace/sora245_ora_10987.trc
Corrupt block relative dba: 0x01400083 (file 5, block 131)
Bad header found during multiblock buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x00000083
 last change scn: 0x0000.00000000 seq: 0xff flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x000006ff
 check value in block header: 0xc66f
 computed block checksum: 0x0
Reading datafile '/oradata/sora245/tbs_test01.dbf' for corruption at rdba: 0x01400083 (file 5, block 131)
Reread (file 5, block 131) found same corrupt data (no logical check)
Automatic block media recovery requested for (file# 5, block# 131)
Automatic block media recovery requested for (file# 5, block# 131)
Automatic block media recovery successful for (file# 5, block# 131)
Automatic block media recovery successful for (file# 5, block# 131)


TIP:
可以看到bbed 只是修改了block的几个标志位造成corrupt block, rman 备份时会出错,通过select 可以触发ABMR,也会在primary site ,block 发现更新时覆盖standby site的corrupt block,此时alert 无记录。


转载自:
http://www.anbob.com/archives/2156.html
http://www.anbob.com/archives/2163.html

--end--

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/23135684/viewspace-1127927/,如需转载,请注明出处,否则将追究法律责任。

下一篇: More about AWR
请登录后发表评论 登录
全部评论
Oracle数据库管理员,Oracle数据库系统构架员;2012年7月出版《构建最高可用Oracle数据库系统:Oracle 11gR2 RAC管理、维护与性能优化》一书;Oracle 10g OCM。

注册时间:2010-01-05

  • 博文量
    483
  • 访问量
    5355053