
A Look at Oracle Data Recovery Advisor (DRA), Part 2

Original post. Category: Linux Operating Systems. Author: realkid4. Date: 2013-09-08 09:17:34

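In the previous part we introduced the Data Recovery Advisor (DRA) feature of Oracle 11g and walked through a failure that occurred during startup. In this part we look at repairing a failure that occurs while the database is running.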

 

4. Failures While the Database Is Running

 

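Among the failures that hit a running Oracle database, corrupt blocks and accidentally deleted files are the most common, especially when a junior DBA is just getting started. Let's simulate that scenario.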

 

The undo tablespace is one of Oracle's core tablespaces, and losing its datafile causes a fairly serious failure.

 

 

SQL> select file_name from dba_data_files where tablespace_name='UNDOTBS1';

 

FILE_NAME

--------------------------------------------------------------------------------

/u01/oradata/WILSON/datafile/o1_mf_undotbs1_7xt3yzl5_.dbf

 

 

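The database is currently open when the undo datafile suddenly disappears at the OS level. Here we simulate the deletion by renaming the file.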

 

 

[oracle@bspdev datafile]$ ls -l | grep undo

-rw-r----- 1 oracle oinstall 346038272 Sep  6 07:21 o1_mf_undotbs1_7xt3yzl5_.dbf

[oracle@bspdev datafile]$ mv o1_mf_undotbs1_7xt3yzl5_.dbf o1_mf_undotbs1_7xt3yzl5_.dbf.bak

[oracle@bspdev datafile]$ ls -l | grep undo

-rw-r----- 1 oracle oinstall 346038272 Sep  6 07:21 o1_mf_undotbs1_7xt3yzl5_.dbf.bak
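A loss like this can also be spotted from the OS side. The following is only a minimal sketch, assuming the FILE_NAME values from dba_data_files have been saved to a text file, one path per line. The function name report_missing_files is hypothetical and not part of any Oracle tool.

```shell
# Hypothetical helper: read datafile paths (one per line) from stdin
# and print the ones that no longer exist on disk.
report_missing_files() {
    while IFS= read -r path; do
        # Skip blank lines such as those SQL*Plus leaves in spooled output.
        [ -z "$path" ] && continue
        [ -e "$path" ] || printf 'MISSING: %s\n' "$path"
    done
}
```

It could be called as `report_missing_files < datafiles.txt`, where datafiles.txt is an assumed file holding the spooled paths. For the scenario above it would be expected to flag the renamed undo datafile.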

 

 

At this point the alert log shows the "checker" from the previous part at work.

 

 

Fri Sep 06 07:25:47 2013

Checker run found 1 new persistent data failures

Fri Sep 06 07:26:34 2013

Starting background process SMCO

Fri Sep 06 07:26:34 2013

SMCO started with pid=19, OS id=4819

Fri Sep 06 07:26:46 2013

Errors in file /u01/diag/rdbms/wilson/wilson/trace/wilson_mmnl_4418.trc:

ORA-01116: error in opening database file 3

ORA-01110: data file 3: '/u01/oradata/WILSON/datafile/o1_mf_undotbs1_7xt3yzl5_.dbf'

ORA-27041: unable to open file

Linux Error: 2: No such file or directory

Additional information: 3

Fri Sep 06 07:26:48 2013

Errors in file /u01/diag/rdbms/wilson/wilson/trace/wilson_m000_4835.trc:

ORA-01116: error in opening database file 3

ORA-01110: data file 3: '/u01/oradata/WILSON/datafile/o1_mf_undotbs1_7xt3yzl5_.dbf'

ORA-27041: unable to open file

Linux Error: 2: No such file or directory

Additional information: 3

 

 

An error is logged roughly every two seconds: the file is gone and cannot be opened.
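Because the same error repeats, it helps to reduce the log to the distinct datafiles involved. This is a rough sketch that relies only on the ORA-01110 message format shown above. The function name failed_datafiles is hypothetical.

```shell
# Hypothetical helper: print "file# path" once for each datafile named
# in an ORA-01110 message, however often the error repeats in the log.
failed_datafiles() {
    sed -n "s/.*ORA-01110: data file \([0-9][0-9]*\): '\([^']*\)'.*/\1 \2/p" "$1" | sort -u
}
```

Usage would look like `failed_datafiles /u01/diag/rdbms/wilson/wilson/trace/alert_wilson.log`. The alert log file name here is an assumption based on the trace directory seen in the messages above, so adjust it to the actual location.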

 

Now we run the list failure command in RMAN to see the failure that was recorded.

 

 

RMAN> list failure all;

 

List of Database Failures

=========================

 

Failure ID Priority Status    Time Detected Summary

---------- -------- --------- ------------- -------

242        HIGH     OPEN      06-SEP-13     One or more non-system datafiles are missing

 

 

We run advise failure to see what Oracle recommends.

 

 

RMAN> advise failure ;

 

List of Database Failures

=========================

 

Failure ID Priority Status    Time Detected Summary

---------- -------- --------- ------------- -------

242        HIGH     OPEN      06-SEP-13     One or more non-system datafiles are missing

 

analyzing automatic repair options; this may take some time

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=30 device type=DISK

analyzing automatic repair options complete

 

Mandatory Manual Actions

========================

1. If file /u01/oradata/WILSON/datafile/o1_mf_undotbs1_7xt3yzl5_.dbf was unintentionally renamed or moved, restore it

2. Automatic repairs may be available if you shutdown the database and restart it in mount mode

3. Contact Oracle Support Services if the preceding recommendations cannot be used, or if they do not fix the failures selected for repair

 

Optional Manual Actions

=======================

no manual actions available

 

Automated Repair Options

========================

no automatic repair options available

 

 

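Note that the Automated Repair Options section contains no script, which suggests that Oracle has no good option while the database is open. Under Mandatory Manual Actions, DRA asks us to restart the database in mount mode before an automatic repair script may become available. Manual actions are the steps that Oracle believes the user must carry out by hand before it can go any further.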

 

Restart the database and bring it to the mount state.

 

-- force shutdown

RMAN> shutdown abort;

Oracle instance shut down

 

RMAN> startup mount;

 

connected to target database (not started)

Oracle instance started

database mounted

 

Total System Global Area     849530880 bytes

 

Fixed Size                     1339824 bytes

Variable Size                616566352 bytes

Database Buffers             226492416 bytes

Redo Buffers                   5132288 bytes

 

 

Now run DRA again and look at the failure and the advice it gives.

 

 

RMAN> advise failure;

 

List of Database Failures

=========================

 

Failure ID Priority Status    Time Detected Summary

---------- -------- --------- ------------- -------

242        HIGH     OPEN      06-SEP-13     One or more non-system datafiles are missing

 

analyzing automatic repair options; this may take some time

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=18 device type=DISK

analyzing automatic repair options complete

 

Mandatory Manual Actions

========================

no manual actions available

 

Optional Manual Actions

=======================

1. If file /u01/oradata/WILSON/datafile/o1_mf_undotbs1_7xt3yzl5_.dbf was unintentionally renamed or moved, restore it

 

Automated Repair Options

========================

Option Repair Description

------ ------------------

1      Restore and recover datafile 3 

  Strategy: The repair includes complete media recovery with no data loss

  Repair script: /u01/diag/rdbms/wilson/wilson/hm/reco_1850469943.hm

 

 

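In the previous part we opened the .hm file by hand to read the script. We can also use the repair failure preview command to display the statements that will be executed.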

 

 

RMAN> repair failure preview;

 

Strategy: The repair includes complete media recovery with no data loss

Repair script: /u01/diag/rdbms/wilson/wilson/hm/reco_1850469943.hm

 

contents of repair script:

   # restore and recover datafile

   restore datafile 3;

   recover datafile 3;

 

 

Note: DRA has found that we have a backup of the undo datafile and the redo needed to roll it forward, so a restore followed by a recover brings the file back quickly.
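The list, advise and preview steps shown so far change nothing in the database, so they lend themselves to being run as one unattended pass. The sketch below is only an illustration. The function name dra_preview_commands is hypothetical, and the invocation in the usage note assumes OS authentication on the database host.

```shell
# Print the RMAN commands for a read-only DRA pass. Nothing here modifies
# the database: "repair failure preview" only displays the repair script.
# "advise failure" comes first because the repair commands rely on advice
# produced earlier in the same RMAN session.
dra_preview_commands() {
    cat <<'EOF'
list failure;
advise failure;
repair failure preview;
EOF
}
```

One possible use is `dra_preview_commands | rman target /`, reading the output before deciding whether to run the actual repair interactively.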

 

If the preview shows nothing unexpected, run repair failure to perform the recovery.

 

 

RMAN> repair failure;

 

Strategy: The repair includes complete media recovery with no data loss

Repair script: /u01/diag/rdbms/wilson/wilson/hm/reco_1850469943.hm

 

contents of repair script:

   # restore and recover datafile

   restore datafile 3;

   recover datafile 3;

 

Do you really want to execute the above repair (enter YES or NO)? yes

executing repair script

 

Starting restore at 06-SEP-13

using channel ORA_DISK_1

 

channel ORA_DISK_1: starting datafile backup set restore

channel ORA_DISK_1: specifying datafile(s) to restore from backup set

channel ORA_DISK_1: restoring datafile 00003 to /u01/oradata/WILSON/datafile/o1_mf_undotbs1_7xt3yzl5_.dbf

channel ORA_DISK_1: reading from backup piece /u01/flash_recovery_area/WILSON/backupset/2013_09_06/o1_mf_nnndf_TAG20130906T061608_92l0od6w_.bkp

channel ORA_DISK_1: piece handle=/u01/flash_recovery_area/WILSON/backupset/2013_09_06/o1_mf_nnndf_TAG20130906T061608_92l0od6w_.bkp tag=TAG20130906T061608

channel ORA_DISK_1: restored backup piece 1

channel ORA_DISK_1: restore complete, elapsed time: 00:00:25

Finished restore at 06-SEP-13

 

Starting recover at 06-SEP-13

using channel ORA_DISK_1

 

starting media recovery

media recovery complete, elapsed time: 00:00:02

 

Finished recover at 06-SEP-13

repair failure complete

 

-- optionally open the database

Do you want to open the database (enter YES or NO)? yes

database opened

 

 

We can follow the recovery steps in the alert log.

 

-- restore phase

Fri Sep 06 07:35:49 2013

Full restore complete of datafile 3 /u01/oradata/WILSON/datafile/o1_mf_undotbs1_92l5b0v4_.dbf.  Elapsed time: 0:00:15

  checkpoint is 3838694

  last deallocation scn is 3817636

  Undo Optimization current scn is 3815429

Fri Sep 06 07:35:54 2013

alter database recover datafile list clear

Completed: alter database recover datafile list clear

-- recovery phase

alter database recover if needed

 datafile 3

Media Recovery Start

Serial Media Recovery started

Recovery of Online Redo Log: Thread 1 Group 2 Seq 176 Reading mem 0

  Mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_2_870n48hc_.log

  Mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_2_870n4dtl_.log

Recovery of Online Redo Log: Thread 1 Group 3 Seq 177 Reading mem 0

  Mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_3_870n4lsg_.log

  Mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_3_870n4o31_.log

Recovery of Online Redo Log: Thread 1 Group 1 Seq 178 Reading mem 0

  Mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_1_870n42n1_.log

  Mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_1_870n44z3_.log

Media Recovery Complete (wilson)

Completed: alter database recover if needed

 datafile 3

Fri Sep 06 07:36:04 2013

alter database open

 

 

The failure is now cleared.

 

 

RMAN> list failure;

no failures found that match specification

 

 

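Finally, there is one more command available: change failure. Its purpose is to change the status of a failure explicitly. The most common use is this: when a failure occurs and we fix it outside RMAN, for example by restoring from a cold backup, the recorded failure may keep its OPEN status. We can then use change failure to set the status to CLOSED, for example: change failure all closed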

 

5. Conclusion

 

Note that in the current 11g release, Data Recovery Advisor does not yet support RAC environments.

 

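As releases advance, more and more Advisors appear around us. For now an Advisor is just an expert knowledge base to consult: we can take its advice or leave it. Many veteran DBAs look down on these "flashy" products. In my view there is no need for that.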

 

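As tools emerge, automation and intelligence are the inevitable course for any technology. In early releases some Advisors may have had one problem or another, but it is undeniable that with continual improvement they keep getting smarter. Full intelligence is only a matter of time.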

 

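So what should we, as DBAs of the traditional kind, do? First, fundamentals must be the foundation. Every technology, and the recent Oracle releases in particular, follows the basic framework and mechanisms laid down in the 9i era. Many of the flashy products are built on top of that, so understanding the fundamentals matters. Second, business value. The development DBA role is an important way to show business value: bringing database thinking into architecture design and the development process makes our systems fit together more smoothly. Finally, industry expertise. Oracle is static, while the industries that use it are varied, each with its own characteristics and priorities. A DBA, a senior one especially, is far more sensitive to business data than many people on the development team. Put that value to work and there will be no shortage of room to grow.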

 

From "ITPUB Blog", link: http://blog.itpub.net/17203031/viewspace-772358/. If you reproduce this article, please credit the source; otherwise legal liability will be pursued.
