ITPub博客

首页 > Linux操作系统 > Linux操作系统 > 【备份恢复】不使用rman工具就能恢复被rm删除的数据文件案例

【备份恢复】不使用rman工具就能恢复被rm删除的数据文件案例

原创 Linux操作系统 作者:leonarding 时间:2012-07-27 01:53:15 0 删除 编辑

引题:朋友一时兴起使用了rm**,删除了oracle数据文件后找我帮忙,我在帮朋友恢复数据库时,遇到了当recover时,报错不能找到28739号归档日志,这样我就不能同步scn,更不能打开数据库了。这是归档日志不连续的典型案例,我最后告诉他要做好心理准备。事情还没有完,这个真实案例引发了我的思考,如果当时在朋友没有做rman拯救措施的情况下,可不可能不使用rman即可恢复数据文件呢!最后我找到了答案:)

案例
1.
系统 solaris SunOS TJLT-YDWG6 5.9 Generic_122300-25 sun4u sparc SUNW,Sun-Fire-V890
DB  Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bi
2.
案情描述
现场工程师使用了rm -rf *.dbf 命令把所有的数据文件全部删除了
现在有54日的备份
restore database until time "to_date('2012-05-04 12:00:00','yyyy-mm-dd hh24:mi:ss')";
  进行恢复显示finish restore complete 没有问题已经把文件 复制回来了
进行同步
RMAN> recover database until time "to_date('2012-05-04 11:00:00','yyyy-mm-dd hh24:mi:ss')";
                           
Starting recover at 2012-07-26 14:02:42
using channel ORA_DISK_1
starting media recovery
unable to find archive log
archive log thread=1 sequence=28739
   缺少28739号的归档日志,导致undotbs01.dbf文件不一致
Oracle Error:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 2 needs more recovery to be consistent
ORA-01110: data file 2: '/opt/oradata/kpidb/undotbs01.dbf'
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 07/26/2012 14:02:51
RMAN-06054: media recovery requesting unknown log: thread 1 seq 28739 lowscn 1513525474
Leonarding
2012.7.26

在我们工作中可能会经常发生这样类似的突发状况,在遇到此情况下首先要做的就是冷静,上面发生的问题到了我这里之后,我就发现数据库已经变成了mount状态,在使用文件句柄方式恢复数据文件已经为时已晚,所以我采用了常规的恢复方式,没想到啊没想到,归档日志还不全,立马我整个人都“斯巴达”了,最后告诉朋友做DBA是需要勇气的。

下面我用自己的测试库演示一下操作系统rm级别的删除数据文件后,数据库仍然处于open状态的时候使用文件句柄来恢复被rm删除的数据文件,并最终顺利打开数据库的实验!我是一个比较严谨的人,所以一上来我先做个了“压缩全库备份”做到有备无患

描述一下场景:

操作系统:Enterprise Linux Enterprise Linux AS release 4 (October Update 8)

数据库版本:Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod

一、备份重于一切

因此在演示之前我们先做一下备份

RMAN> show all;

RMAN配置参数区,如下都是默认值

RMAN configuration parameters are:

冗余配置保留政策:冗余数是1

CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default

开启增量备份:关

CONFIGURE BACKUP OPTIMIZATION OFF; # default

默认备份设备是磁盘

CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default

控制文件自动备份:关

CONFIGURE CONTROLFILE AUTOBACKUP OFF; # default

控制文件自动备份目录和格式:%F  【备份设备:Disk】

CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default

备份的并行度:1,备份类型为备份集

CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default

数据文件采用复制方式备份

CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default

归档日志采用复制方式备份

CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default

最大值:无限制

CONFIGURE MAXSETSIZE TO UNLIMITED; # default

加密数据库:关

CONFIGURE ENCRYPTION FOR DATABASE OFF; # default

加密算法采用AES128

CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default

归档日志删除策略:空

CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default

快照控制文件名

CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/10.2.0/db_1/dbs/snapcf_LEO.f'; # default

配置RMAN默认备份介质存放目录到/home/oracle/backup

RMAN> configure channel device type disk format '/home/oracle/backup/DB_LEO_%U';

new RMAN configuration parameters:

CONFIGURE CHANNEL DEVICE TYPE DISK FORMAT   '/home/oracle/backup/DB_LEO_%U';

new RMAN configuration parameters are successfully stored

配置控制文件自动备份并保存到/home/oracle/backup/control目录

RMAN> configure controlfile autobackup on;              启动控制文件自动备份

new RMAN configuration parameters:

CONFIGURE CONTROLFILE AUTOBACKUP ON;

new RMAN configuration parameters are successfully stored

RMAN> configure controlfile autobackup format for device type disk to  设置控制文件自动备份目录和格式

'/home/oracle/backup/control/cf_LEO_%F';

new RMAN configuration parameters:

CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO

'/home/oracle/backup/control/cf_LEO_%F';

new RMAN configuration parameters are successfully stored

显示配置后RMAN环境变量

RMAN configuration parameters are:

CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default

CONFIGURE BACKUP OPTIMIZATION OFF; # default

CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default

CONFIGURE CONTROLFILE AUTOBACKUP ON;

CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/home/oracle/backup/control/cf_LEO_%F';

CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default

CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default

CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default

CONFIGURE CHANNEL DEVICE TYPE DISK FORMAT   '/home/oracle/backup/DB_LEO_%U';

CONFIGURE MAXSETSIZE TO UNLIMITED; # default

CONFIGURE ENCRYPTION FOR DATABASE OFF; # default

CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default

CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default

CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/10.2.0/db_1/dbs/snapcf_LEO.f'; # default

使用rman命令压缩备份数据库

RMAN> backup as compressed backupset full database format

2> '/home/oracle/backup/full_bk1_%u%p%s.rmn' include current controlfile

3> plus

4> archivelog format '/home/oracle/backup/arch_bk1_%u%p%s.rmn'  delete all input; (删除备份过的旧归档日志)

Starting backup at 26-JUL-12

current log archived

allocated channel: ORA_DISK_1

channel ORA_DISK_1: sid=318 devtype=DISK

channel ORA_DISK_1: starting compressed archive log backupset

channel ORA_DISK_1: specifying archive log(s) in backup set

input archive log thread=1 sequence=5 recid=1 stamp=784059130

input archive log thread=1 sequence=6 recid=6 stamp=788723549

input archive log thread=1 sequence=7 recid=4 stamp=788723547

input archive log thread=1 sequence=8 recid=5 stamp=788723548

channel ORA_DISK_1: starting piece 1 at 26-JUL-12

channel ORA_DISK_1: finished piece 1 at 26-JUL-12

piece handle=/home/oracle/backup/arch_bk1_01nh3iuq11.rmn tag=TAG20120726T235048 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:16

channel ORA_DISK_1: deleting archive log(s)

archive log filename=/home/oracle/arch/LEO/1_5_784052660.dbf recid=1 stamp=784059130

archive log filename=/home/oracle/arch/LEO/1_6_784052660.dbf recid=6 stamp=788723549

archive log filename=/home/oracle/arch/LEO/1_6_784052660.dbf recid=2 stamp=784059169

archive log filename=/home/oracle/arch/LEO/1_7_784052660.dbf recid=4 stamp=788723547

archive log filename=/home/oracle/arch/LEO/1_7_784052660.dbf recid=3 stamp=784059221

archive log filename=/home/oracle/arch/LEO/1_8_784052660.dbf recid=5 stamp=788723548

channel ORA_DISK_1: starting compressed archive log backupset

channel ORA_DISK_1: specifying archive log(s) in backup set

input archive log thread=1 sequence=1 recid=8 stamp=789695445

channel ORA_DISK_1: starting piece 1 at 26-JUL-12

channel ORA_DISK_1: finished piece 1 at 26-JUL-12

piece handle=/home/oracle/backup/arch_bk1_02nh3ivb12.rmn tag=TAG20120726T235048 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:09

channel ORA_DISK_1: deleting archive log(s)

archive log filename=/home/oracle/arch/LEO/1_1_788725257.dbf recid=8 stamp=789695445

channel ORA_DISK_1: starting compressed archive log backupset

channel ORA_DISK_1: specifying archive log(s) in backup set

input archive log thread=1 sequence=1 recid=7 stamp=788725258

channel ORA_DISK_1: starting piece 1 at 26-JUL-12

channel ORA_DISK_1: finished piece 1 at 26-JUL-12

piece handle=/home/oracle/backup/arch_bk1_03nh3ivk13.rmn tag=TAG20120726T235048 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02

channel ORA_DISK_1: deleting archive log(s)

archive log filename=/home/oracle/arch/LEO/1_1_788723547.dbf recid=7 stamp=788725258

Finished backup at 26-JUL-12

 

Starting backup at 26-JUL-12

using channel ORA_DISK_1

channel ORA_DISK_1: starting compressed full datafile backupset

channel ORA_DISK_1: specifying datafile(s) in backupset

input datafile fno=00001 name=/u01/app/oracle/oradata/LEO/file1/system01.dbf

input datafile fno=00003 name=/u01/app/oracle/oradata/LEO/file1/sysaux01.dbf

input datafile fno=00002 name=/u01/app/oracle/oradata/LEO/file1/undotbs01.dbf

input datafile fno=00005 name=/u01/app/oracle/oradata/LEO/file1/example01.dbf

input datafile fno=00004 name=/u01/app/oracle/oradata/LEO/file1/users01.dbf

channel ORA_DISK_1: starting piece 1 at 26-JUL-12

channel ORA_DISK_1: finished piece 1 at 26-JUL-12

piece handle=/home/oracle/backup/full_bk1_04nh3ivo14.rmn tag=TAG20120726T235119 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:04:56

channel ORA_DISK_1: starting compressed full datafile backupset

channel ORA_DISK_1: specifying datafile(s) in backupset

including current control file in backupset

channel ORA_DISK_1: starting piece 1 at 26-JUL-12

channel ORA_DISK_1: finished piece 1 at 26-JUL-12

piece handle=/home/oracle/backup/full_bk1_05nh3j9015.rmn tag=TAG20120726T235119 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02

Finished backup at 26-JUL-12

 

Starting backup at 26-JUL-12

current log archived

using channel ORA_DISK_1

channel ORA_DISK_1: starting compressed archive log backupset

channel ORA_DISK_1: specifying archive log(s) in backup set

input archive log thread=1 sequence=2 recid=9 stamp=789695779

channel ORA_DISK_1: starting piece 1 at 26-JUL-12

channel ORA_DISK_1: finished piece 1 at 26-JUL-12

piece handle=/home/oracle/backup/arch_bk1_06nh3j9316.rmn tag=TAG20120726T235619 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02

channel ORA_DISK_1: deleting archive log(s)

archive log filename=/home/oracle/arch/LEO/1_2_788725257.dbf recid=9 stamp=789695779

Finished backup at 26-JUL-12

 

Starting Control File and SPFILE Autobackup at 26-JUL-12

piece handle=/home/oracle/backup/control/cf_LEO_c-1558319476-20120726-00 comment=NONE

Finished Control File and SPFILE Autobackup at 26-JUL-12

ll        看一下我们的备份文件已经生成,下面可以安心做实验啦

-rw-r-----  1 oracle oinstall 11180032 Jul 26 23:51 arch_bk1_01nh3iuq11.rmn

-rw-r-----  1 oracle oinstall  7031808 Jul 26 23:51 arch_bk1_02nh3ivb12.rmn

-rw-r-----  1 oracle oinstall  1030144 Jul 26 23:51 arch_bk1_03nh3ivk13.rmn

-rw-r-----  1 oracle oinstall     8704 Jul 26 23:56 arch_bk1_06nh3j9316.rmn

drwxr-xr-x  2 oracle oinstall     4096 Jul 26 23:56 control

-rw-r-----  1 oracle oinstall 57221120 Jul 26 23:56 full_bk1_04nh3ivo14.rmn

-rw-r-----  1 oracle oinstall  1097728 Jul 26 23:56 full_bk1_05nh3j9015.rmn

二、模拟数据文件删除

SYS@LEO> select status from v$instance;   查看数据库是否是open状态,是open没有问题

STATUS

------------

OPEN

SYS@LEO> select name from v$datafile;    查看存在的数据文件

NAME

--------------------------------------------------------------------------------

/u01/app/oracle/oradata/LEO/file1/system01.dbf

/u01/app/oracle/oradata/LEO/file1/undotbs01.dbf

/u01/app/oracle/oradata/LEO/file1/sysaux01.dbf

/u01/app/oracle/oradata/LEO/file1/users01.dbf

/u01/app/oracle/oradata/LEO/file1/example01.dbf

SYS@LEO> host rm /u01/app/oracle/oradata/LEO/file1/example01.dbf  删除example01.dbf数据文件

我们检查一下删除example01.dbf文件后数据库是不是还处在open状态

SYS@LEO> select status from v$instance;    呵呵  就像我们在开始说的一下,open木有问题

STATUS

------------

OPEN

我们检查一下删除example01.dbf文件后表空间处于什么状态呢?

SYS@LEO> select TABLESPACE_NAME,status from dba_tablespaces;

TABLESPACE_NAME                STATUS

------------------------------ ---------

SYSTEM                         ONLINE

UNDOTBS                        ONLINE

SYSAUX                          ONLINE

TEMPTS1                        ONLINE

USERS                           ONLINE

TEMP1                          ONLINE

TEMP2                          ONLINE

EXAMPLE                        ONLINE      我们看到EXAMPLE还是联机状态,为什么是这样呢,我们不是已经把文件删除了嘛!对喽 可能有人已经想到了,我们是从操作系统层面上直接删除的(没有通过数据库删除),所以数据库此时还认为这个文件没有变化,数据字典中记录的还是完好的状态

conn ls/ls                                          切换用户

数据文件被删除,所以创建表失败(当我们往数据文件上写数据时,数据库才会检查文件状态)

LS@LEO> create table t1 tablespace example as select * from liusheng;

create table t1 tablespace example as select * from liusheng  

ERROR at line 1:

ORA-01116: error in opening database file 5

ORA-01110: data file 5: '/u01/app/oracle/oradata/LEO/file1/example01.dbf'

ORA-27041: unable to open file

Linux Error: 2: No such file or directory

Additional information: 3

注:如果我们在其他数据文件上创建表时,是可以正常创建的,因为其他数据文件没有被破坏

LS@LEO> create table t2 tablespace users as select * from liusheng;

Table created.

三、通过文件句柄恢复数据文件

查询dbwr进程pid

[oracle@secdb1 bdump]$ ps -ef|grep dbw|grep -v grep

oracle   26745     1  0 Jul26 ?        00:00:01 ora_dbw0_LEO

注:1.可能有的朋友对  grep –v grep 这句不是很明白,-v 选项反向过滤oracle    2405   421  0 01:03 pts/9    00:00:00 grep dbw 这行记录的意思,只保留 ora_dbw0_LEO 这行记录,方便观察

2. dbwr进程会打开所有访问数据文件的句柄。在proc目录中可以查到,目录名是进程PIDfdfile discription)表示文件描述符。

cd  /proc/26745/fd

ls -l

lr-x------  1 oracle oinstall 64 Jul 27 01:09 0 -> /dev/null

lr-x------  1 oracle oinstall 64 Jul 27 01:09 1 -> /dev/null

lrwx------  1 oracle oinstall 64 Jul 27 01:09 10 -> /u01/app/oracle/admin/LEO/adump/ora_26737.aud

lr-x------  1 oracle oinstall 64 Jul 27 01:09 11 -> /dev/zero

lr-x------  1 oracle oinstall 64 Jul 27 01:09 12 -> /dev/zero

lr-x------  1 oracle oinstall 64 Jul 27 01:09 13 -> /u01/app/oracle/product/10.2.0/db_1/rdbms/mesg/oraus.msb

lrwx------  1 oracle oinstall 64 Jul 27 01:09 14 -> /u01/app/oracle/product/10.2.0/db_1/dbs/hc_LEO.dat

lrwx------  1 oracle oinstall 64 Jul 27 01:09 15 -> /u01/app/oracle/product/10.2.0/db_1/dbs/lkLEO

lrwx------  1 oracle oinstall 64 Jul 27 01:09 16 -> /u01/app/oracle/oradata/LEO/file1/control01.ctl

lrwx------  1 oracle oinstall 64 Jul 27 01:09 17 -> /u01/app/oracle/oradata/LEO/file2/control02.ctl

lrwx------  1 oracle oinstall 64 Jul 27 01:09 18 -> /u01/app/oracle/oradata/LEO/file3/control03.ctl

lrwx------  1 oracle oinstall 64 Jul 27 01:09 19 -> /u01/app/oracle/oradata/LEO/file1/system01.dbf

lr-x------  1 oracle oinstall 64 Jul 27 01:09 2 -> /dev/null

lrwx------  1 oracle oinstall 64 Jul 27 01:09 20 -> /u01/app/oracle/oradata/LEO/file1/undotbs01.dbf

lrwx------  1 oracle oinstall 64 Jul 27 01:09 21 -> /u01/app/oracle/oradata/LEO/file1/sysaux01.dbf

lrwx------  1 oracle oinstall 64 Jul 27 01:09 22 -> /u01/app/oracle/oradata/LEO/file1/users01.dbf

lrwx------  1 oracle oinstall 64 Jul 27 01:09 23 -> /u01/app/oracle/oradata/LEO/file1/example01.dbf (deleted)

lrwx------  1 oracle oinstall 64 Jul 27 01:09 24 -> /u01/app/oracle/oradata/LEO/file1/temp01.dbf

lrwx------  1 oracle oinstall 64 Jul 27 01:09 25 -> /u01/app/oracle/oradata/LEO/file1/temp1_01.dbf

lrwx------  1 oracle oinstall 64 Jul 27 01:09 26 -> /u01/app/oracle/oradata/LEO/file1/temp1_02.dbf

lr-x------  1 oracle oinstall 64 Jul 27 01:09 3 -> /dev/null

lr-x------  1 oracle oinstall 64 Jul 27 01:09 4 -> /dev/null

l-wx------  1 oracle oinstall 64 Jul 27 01:09 5 -> /u01/app/oracle/admin/LEO/udump/leo_ora_26737.trc

l-wx------  1 oracle oinstall 64 Jul 27 01:09 6 -> /u01/app/oracle/admin/LEO/bdump/alert_LEO.log

lrwx------  1 oracle oinstall 64 Jul 27 01:09 7 -> /u01/app/oracle/product/10.2.0/db_1/dbs/lkinstLEO (deleted)

l-wx------  1 oracle oinstall 64 Jul 27 01:09 8 -> /u01/app/oracle/admin/LEO/bdump/alert_LEO.log

lrwx------  1 oracle oinstall 64 Jul 27 01:09 9 -> /u01/app/oracle/product/10.2.0/db_1/dbs/hc_LEO.dat

注:“23 -> /u01/app/oracle/oradata/LEO/file1/example01.dbf (deleted) 被删除数据文件会被标示(deleted),我们使用的是redhat linux,如果在solaris系统中使用lsof命令查看句柄和文件对应关系,我们通过copy文件句柄方式恢复数据文件到原位置

cp /proc/26745/fd/23 /u01/app/oracle/oradata/LEO/file1/example01.dbf

确认example01.dbf 数据文件已经恢复成功

ll /u01/app/oracle/oradata/LEO/file1/example01.dbf

-rw-r-----  1 oracle oinstall 209723392 Jul 27 01:22 /u01/app/oracle/oradata/LEO/file1/example01.dbf

四、数据文件recover

LS@LEO> alter database datafile 5 offline;  先把example01.dbf文件脱机

Database altered.

LS@LEO> recover datafile 5;                  完成介质恢复,实质同步控制文件、redo日志scn

Media recovery complete.

LS@LEO> alter database datafile 5 online;  再把example01.dbf文件联机

Database altered.

五、测试是否可以正常创建表t1

LS@LEO> create table t1 tablespace example as select * from liusheng;

Table created.                                                                                              成功创建

LS@LEO> select * from t1;

   ORDERID NAME       LS_DATE

---------- ---------- -------------------

         1 ls1        1981-01-02 00:00:00

         1 ls2        1998-01-03 00:00:00

         1 ls3        1999-01-04 00:00:00

         1 ls4        2000-01-05 00:00:00

         1 ls5        2000-01-06 00:00:00

         1 ls6        2001-01-07 00:00:00

         1 ls7        2001-01-08 00:00:00

         1 ls8        2002-01-09 00:00:00

         1 ls9        2002-01-10 00:00:00

         1 ls10       2011-01-11 00:00:00

10 rows selected.

小结:当我们在Linux系统中不小心rm了数据文件时,一定要冷静,不要做关闭数据库、重启操作系统等危险操作,因为在不了解数据库运行状态的前提下做这些往往是徒劳的,还可能造成无法挽回的后果。此时我们不妨在数据库open状态下使用文件句柄方式来恢复被我们rm掉的数据文件。因为只要数据文件被某个进程使用着,那么这个进程就会一直持有这个数据文件的句柄,那么所指向的数据文件依然可以读写,我们可以从proc->pid->fd目录中找到被删除的数据文件句柄(deletedcopy到原位置来恢复,这也是在恢复窗口时间很短的情况下快速恢复的好方法。

 

Leonarding

2012.7.26

天津&summer

分享技术~收获快乐

Bloghttp://space.itpub.net/26686207

 
 
 
 

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/26686207/viewspace-737123/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论
刘盛,网名leonarding,the first ACEA in china, Oracle OCM10g&11g RHCE, ACOUG Core Member Blogger, DATAGURU Oracle数据库版主, ITPUB HADOOP版版主, DB2中国专家组成员, 专注于&Oracle&Hadoop&Nosql&SAS, 提倡“分享技术~成就梦想”, 新浪微博:@itpub_leonarding QQ:40354446

注册时间:2012-02-10

  • 博文量
    246
  • 访问量
    3227636