Oracle: recovering when all redo log groups are lost

Original | Oracle | Author: 贺子_DBA时代 | Date: 2019-04-20 22:31:11

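Suppose every redo log group has been lost. How to recover depends on which of the following cases applies:

1. Oracle in NOARCHIVELOG mode, database shut down consistently

2. Oracle in NOARCHIVELOG mode, database shut down inconsistently

3. Oracle in ARCHIVELOG mode, database shut down consistently

4. Oracle in ARCHIVELOG mode, database shut down inconsistently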

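I: Oracle in NOARCHIVELOG mode, consistent shutdown

While running this experiment I hit something odd. I deleted all the redo log files at the operating-system level, yet the database went on creating tables and inserting rows normally. My understanding was that a commit triggers LGWR to flush redo from the redo log buffer to the redo log files, so with the files deleted the commit should fail. It did not: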

[root@testdb59 /data/u01/app/oracle/oradata/stdb59]# ll

total 13697796

-rw-r----- 1 oracle oinstall 144916480 Apr 5 22:30 control01.ctl

-rw-r----- 1 oracle oinstall 2147491840 Apr 5 22:26 liuwenhe.dbf

-rw-r----- 1 oracle oinstall 52429312 Apr 5 22:26 redo01.log

-rw-r----- 1 oracle oinstall 52429312 Apr 5 22:29 redo03.log

-rw-r----- 1 oracle oinstall 4938801152 Apr 5 22:26 soe3.dbf

-rw-r----- 1 oracle oinstall 2469404672 Apr 5 22:26 soe.dbf

-rw-r----- 1 oracle oinstall 2705334272 Apr 5 22:26 sysaux01.dbf

-rw-r----- 1 oracle oinstall 786440192 Apr 5 22:26 system01.dbf

-rw-r----- 1 oracle oinstall 30416896 Oct 16 12:37 temp01.dbf

-rw-r----- 1 oracle oinstall 1073750016 Apr 5 22:26 temp.dbf

-rw-r----- 1 oracle oinstall 309338112 Apr 5 22:26 undotbs01.dbf

-rw-r----- 1 oracle oinstall 166469632 Apr 5 22:26 users01.dbf

Delete the redo log files:

[root@testdb59 /data/u01/app/oracle/oradata/stdb59]# rm *.log

List the directory again; the redo log files are indeed gone:

[root@testdb59 /data/u01/app/oracle/oradata/stdb59]# ll

total 13595388

-rw-r----- 1 oracle oinstall 144916480 Apr 5 22:50 control01.ctl

-rw-r----- 1 oracle oinstall 2147491840 Apr 5 22:50 liuwenhe.dbf

-rw-r----- 1 oracle oinstall 4938801152 Apr 5 22:50 soe3.dbf

-rw-r----- 1 oracle oinstall 2469404672 Apr 5 22:50 soe.dbf

-rw-r----- 1 oracle oinstall 2705334272 Apr 5 22:50 sysaux01.dbf

-rw-r----- 1 oracle oinstall 786440192 Apr 5 22:50 system01.dbf

-rw-r----- 1 oracle oinstall 30416896 Oct 16 12:37 temp01.dbf

-rw-r----- 1 oracle oinstall 1073750016 Apr 5 22:41 temp.dbf

-rw-r----- 1 oracle oinstall 309338112 Apr 5 22:50 undotbs01.dbf

-rw-r----- 1 oracle oinstall 166469632 Apr 5 22:50 users01.dbf

SQL> create table t(int int);

Table created.

SQL> insert into t values (100);

1 row created.

SQL> commit;

SQL> alter system switch logfile;

System altered.

SQL> alter system checkpoint;

System altered.

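This puzzled me at first. After asking my instructor, I learned that the Oracle processes still hold open file handles to the deleted files, so writes keep succeeding. After a restart the handles are gone and the errors appear.

(An aside: after rm the file no longer shows up in the directory, but it has not really gone away. The mechanism is not complicated. Tools that recover deleted files this way need an ext3 or ext4 file system. ext3 is a journaling file system, and it stores a file as an inode plus data blocks.

What are inodes and blocks? Think of a partition as a book: the blocks are the content of the pages, and the inodes are the table of contents. To find a file, the system first looks up its inode and then follows the inode to the blocks on disk.

Now for deletion. Deleting a file does not wipe its data from the disk. rm only removes the directory entry that links the file name to its inode, and the data itself stays on disk, which is also why deletes are so fast, on Windows too. While a process still has the file open, as the Oracle background processes do here, the kernel keeps the inode and its blocks allocated, so writes through the open handle continue to work. The old data is only overwritten when new files are later written to the same location, so after an accidental delete, avoid writing anything to that location.)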
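The open-handle behaviour described above can be reproduced outside Oracle. The following is a minimal sketch, assuming a POSIX system such as Linux; `write_after_unlink` is a hypothetical helper written for this illustration and has nothing to do with Oracle's own code. It creates a temporary file, removes its name while the descriptor is still open, and then writes and reads through that descriptor.

```python
import os
import tempfile


def write_after_unlink(payload):
    """Unlink a file while it is open, then write and read it via the descriptor.

    Returns (path_still_exists, bytes_read_back). Assumes POSIX unlink semantics.
    """
    fd, path = tempfile.mkstemp(suffix=".log")  # stand-in for a redo log file
    try:
        os.unlink(path)                      # same effect as `rm` on the file name
        still_exists = os.path.exists(path)  # the name is gone from the directory
        os.write(fd, payload)                # the open descriptor still accepts writes
        os.lseek(fd, 0, os.SEEK_SET)         # rewind to read back what was written
        data = os.read(fd, len(payload))
        return still_exists, data
    finally:
        os.close(fd)                         # after this the kernel can free the blocks


if __name__ == "__main__":
    print(write_after_unlink(b"redo record"))
```

This mirrors what happened in the experiment: the writes only start failing once the process that held the descriptor has gone away, which for the database means after the instance is restarted.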

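Because the database was shut down consistently, no instance recovery is needed, and so the lost redo is not needed either. The log groups can simply be cleared and re-created, or recover database can be used to re-create the lost redo. That gives two ways to recover in this case:

Method 1: clear the affected redo log groups, which drops and re-creates the files.

SQL> shutdown immediate # consistent shutdown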

Database closed.

Database dismounted.

ORACLE instance shut down.

SQL> startup mount

ORACLE instance started.

Total System Global Area 1603411968 bytes

Fixed Size 2253664 bytes

Variable Size 1275071648 bytes

Database Buffers 318767104 bytes

Redo Buffers 7319552 bytes

Database mounted.

SQL> archive log list;

Database log mode No Archive Mode

Automatic archival Disabled

Archive destination USE_DB_RECOVERY_FILE_DEST

Oldest online log sequence 30641

Current log sequence 30642

Drop and re-create, or simply clear, every redo log group, including the CURRENT and ACTIVE ones:

SQL> alter database clear logfile group 1;

Database altered.

SQL> alter database clear logfile group 3;

Database altered.

SQL> alter database open ;

Database altered.

Method 2: re-create the redo logs through recover. In my experiments this method sometimes raised errors; if it does, fall back to Method 1.

SQL> shutdown immediate

Database closed.

Database dismounted.

ORACLE instance shut down.

SQL> startup mount

ORACLE instance started.

Total System Global Area 830930944 bytes

Fixed Size 2257800 bytes

Variable Size 536874104 bytes

Database Buffers 289406976 bytes

Redo Buffers 2392064 bytes

Database mounted.

SQL>

### Re-create the lost redo files; they are only created automatically once the database is opened with RESETLOGS.

SQL> recover database until cancel;

Media recovery complete.

SQL> alter database open resetlogs;

Database altered.

II: Oracle in NOARCHIVELOG mode, inconsistent shutdown

[root@testdb59 /data/u01/app/oracle/oradata/stdb59]# rm -f *.log

SQL> shu abort ### inconsistent shutdown

ORACLE instance shut down.

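At this point both the clear and the recover database approaches used earlier fail, because instance recovery is now required. For how to tell when instance recovery is needed, see my other article (Oracle原理-----关于oracle实例恢复的前滚和回滚的理解). The errors look like this:

First try re-creating the logs. Clearing the current log group fails with a message that the log is still needed, because after an inconsistent shutdown the lost ACTIVE and CURRENT redo really is required for instance recovery.

Start the database in mount state first, then: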

SQL> alter database clear logfile group 3;

alter database clear logfile group 3

*

ERROR at line 1:

ORA-01624: log 3 needed for crash recovery of instance stdb59 (thread 1)

ORA-00312: online log 3 thread 1:

'/data/u01/app/oracle/oradata/stdb59/redo03.log'

Next try recover database. It cannot succeed either, because the redo needed for instance recovery is gone:

SQL> recover database until cancel;

ORA-00279: change 21959466 generated at 04/06/2019 21:15:45 needed for thread 1

ORA-00289: suggestion :

/data/u01/app/oracle/fast_recovery_area/STDB59/archivelog/2019_04_06/o1_mf_1_2_%

u_.arc

ORA-00280: change 21959466 for thread 1 is in sequence #2

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

CANCEL

ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below

ORA-01194: file 1 needs more recovery to be consistent

ORA-01110: data file 1: '/data/u01/app/oracle/oradata/stdb59/system01.dbf'

ORA-01112: media recovery not started

SQL> alter database open RESETLOGS;

alter database open RESETLOGS

ERROR at line 1:

ORA-01194: file 1 needs more recovery to be consistent

ORA-01110: data file 1: '/data/u01/app/oracle/oradata/stdb59/system01.dbf'

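For this situation, recover as follows:

Force the database open with the hidden parameter _allow_resetlogs_corruption. With this parameter set, Oracle skips some consistency checks during open, so the database can be opened even though it is in an inconsistent state.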

SQL> create pfile='/home/oracle/pfile.ora' from spfile;

File created.

Then add the following line to /home/oracle/pfile.ora:

*._allow_resetlogs_corruption=true
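The pfile edit above, and the later removal of the parameter once the database is open again, are plain text operations. The sketch below shows one way to script them; `add_param` and `remove_param` are hypothetical helper names, and the sample pfile contents in the usage note are made up for illustration.

```python
PARAM = "*._allow_resetlogs_corruption=true"


def add_param(pfile_text, param=PARAM):
    """Return the pfile text with the parameter line appended if it is missing."""
    lines = pfile_text.splitlines()
    if param not in (line.strip() for line in lines):
        lines.append(param)
    return "\n".join(lines) + "\n"


def remove_param(pfile_text, name="_allow_resetlogs_corruption"):
    """Return the pfile text without any line that sets the named parameter."""
    kept = []
    for line in pfile_text.splitlines():
        # "*._allow_resetlogs_corruption=true" -> "_allow_resetlogs_corruption"
        key = line.split("=")[0].strip().split(".")[-1]
        if key != name:
            kept.append(line)
    return "\n".join(kept) + "\n"
```

For example, `add_param("*.open_cursors=300\n")` returns the same text with the hidden parameter on a new last line, and passing that result to `remove_param` gives back the original text.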

SQL> startup mount pfile='/home/oracle/pfile.ora';

SQL> recover database until cancel; # re-create the lost redo files

ORA-00279: change 21959471 generated at 04/06/2019 22:34:01 needed for thread 1

ORA-00289: suggestion :

/data/u01/app/oracle/fast_recovery_area/STDB59/archivelog/2019_04_06/o1_mf_1_2_%

u_.arc

ORA-00280: change 21959471 for thread 1 is in sequence #2

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

CANCEL

ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below

ORA-01194: file 1 needs more recovery to be consistent

ORA-01110: data file 1: '/data/u01/app/oracle/oradata/stdb59/system01.dbf'

ORA-01112: media recovery not started

With luck, the database can now be opened directly with RESETLOGS:

SQL> alter database open RESETLOGS;

Database altered.

If you hit the error below instead, the control file has to be rebuilt:

SQL> alter database open RESETLOGS;

alter database open RESETLOGS

*

ERROR at line 1:

ORA-01092: ORACLE instance terminated. Disconnection forced

ORA-00704: bootstrap process failure

ORA-00704: bootstrap process failure

ORA-00600: internal error code, arguments: [2662], [0], [21959484], [0],

[21959877], [4194545], [], [], [], [], [], []

Process ID: 13177

Session ID: 63 Serial number: 5
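The arguments of the ORA-00600 [2662] error above can be read as numbers. The sketch below assumes the argument layout commonly described for this error (current SCN wrap, current SCN base, dependent SCN wrap, dependent SCN base, then a data block address); that layout is an assumption here and should be checked against Oracle Support before relying on it. `decode_ora600_2662` is a hypothetical helper, and the file/block split assumes a smallfile tablespace, where the upper 10 bits of the block address are the relative file number.

```python
def decode_ora600_2662(args):
    """Decode the five numeric arguments that follow [2662] (assumed layout)."""
    cur_wrap, cur_base, dep_wrap, dep_base, dba = args
    current = (cur_wrap << 32) + cur_base      # SCN the instance is at
    dependent = (dep_wrap << 32) + dep_base    # SCN found in the block
    return {
        "current_scn": current,
        "dependent_scn": dependent,
        "gap": dependent - current,            # how far the instance SCN is behind
        "file_no": dba >> 22,                  # relative file number
        "block_no": dba & 0x3FFFFF,            # block number within that file
    }


if __name__ == "__main__":
    print(decode_ora600_2662([0, 21959484, 0, 21959877, 4194545]))
```

Read this way, the error in the transcript says a block in file 1 carries an SCN slightly ahead of the one the instance is using, which fits the later step of advancing the in-memory SCN.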

Rebuilding the control file

1) Using alter database backup controlfile directly, as below, fails:

SQL> alter database backup controlfile to trace as '/data/u01/control_rebuild.trc';

alter database backup controlfile to trace as '/data/u01/control_rebuild.trc'

*

ERROR at line 1:

ORA-16433: The database must be opened in read/write mode.

2) Alternatively, write the CREATE CONTROLFILE statement by hand in the format below.

Query the redo log information:

SQL> select GROUP#,MEMBER from v$logfile;

GROUP# MEMBER

3 /data/u01/app/oracle/oradata/stdb59/redo03.log

1 /data/u01/app/oracle/oradata/stdb59/redo01.log

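Query the datafile information. Note that the statement captured below lists v$logfile again; the datafile names needed for the script come from v$datafile (select name from v$datafile):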

SQL> select MEMBER from v$logfile;

MEMBER

--------------------------------------------------------------------------------

/data/u01/app/oracle/oradata/stdb59/redo03.log

/data/u01/app/oracle/oradata/stdb59/redo01.log

/data/u01/app/oracle/oradata/stdb59/redo04.log

/data/u01/app/oracle/oradata/stdb59/redo05.log

/data/u01/app/oracle/oradata/stdb59/redo06.log

/data/u01/app/oracle/oradata/stdb59/redo07.log

Find the database character set:

SQL> select userenv('language') nls_lang from dual;

NLS_LANG

----------------------------------------------------

AMERICAN_AMERICA.AL32UTF8

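Then write the script that creates the control file. Note that testdb57 here is the database name (db_name); if this is a primary that was converted from an ADG standby, do not use db_unique_name.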

CREATE CONTROLFILE REUSE DATABASE 'testdb57' NORESETLOGS ARCHIVELOG

MAXLOGFILES 50

MAXLOGMEMBERS 5

MAXDATAFILES 100

MAXINSTANCES 8

MAXLOGHISTORY 226

LOGFILE

GROUP 3 '/data/u01/app/oracle/oradata/stdb59/redo03.log' SIZE 50M,

GROUP 1 '/data/u01/app/oracle/oradata/stdb59/redo01.log' SIZE 50M

DATAFILE

'/data/u01/app/oracle/oradata/stdb59/system01.dbf',

'/data/u01/app/oracle/oradata/stdb59/sysaux01.dbf',

'/data/u01/app/oracle/oradata/stdb59/undotbs01.dbf',

'/data/u01/app/oracle/oradata/stdb59/users01.dbf',

'/data/u01/app/oracle/oradata/stdb59/liuwenhe.dbf',

'/data/u01/app/oracle/oradata/stdb59/soe.dbf',

'/data/u01/app/oracle/oradata/stdb59/soe3.dbf'

CHARACTER SET AL32UTF8;
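A script like the one above can also be assembled from the query results instead of being typed by hand. The following is a minimal sketch; `build_create_controlfile` is a hypothetical helper, the MAX* defaults are simply the values used in this article, and the generated text should be reviewed before it is run.

```python
# Limits taken from the script in this article; adjust for your own database.
DEFAULT_LIMITS = (
    ("MAXLOGFILES", 50),
    ("MAXLOGMEMBERS", 5),
    ("MAXDATAFILES", 100),
    ("MAXINSTANCES", 8),
    ("MAXLOGHISTORY", 226),
)


def build_create_controlfile(db_name, logfiles, datafiles, charset,
                             log_size="50M", archivelog=True,
                             resetlogs=False, limits=DEFAULT_LIMITS):
    """Build CREATE CONTROLFILE text from (group, member) pairs and datafile names."""
    header = "CREATE CONTROLFILE REUSE DATABASE '{}' {} {}".format(
        db_name,
        "RESETLOGS" if resetlogs else "NORESETLOGS",
        "ARCHIVELOG" if archivelog else "NOARCHIVELOG",
    )
    limit_lines = ["{} {}".format(name, value) for name, value in limits]
    # Entries are comma-separated; the last entry of each list has no comma.
    log_lines = ",\n".join(
        "GROUP {} '{}' SIZE {}".format(group, member, log_size)
        for group, member in logfiles
    )
    data_lines = ",\n".join("'{}'".format(name) for name in datafiles)
    parts = [header] + limit_lines + [
        "LOGFILE", log_lines,
        "DATAFILE", data_lines,
        "CHARACTER SET {};".format(charset),
    ]
    return "\n".join(parts)
```

Feeding it the group and member pairs from v$logfile, the names from v$datafile, and the character set from the NLS_LANG query yields text in the same layout as the hand-written script.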

Then start the database in nomount state and run the creation script:

SQL> startup nomount pfile='/home/oracle/pfile.ora';

ORACLE instance started.

Total System Global Area 1603411968 bytes

Fixed Size 2253664 bytes

Variable Size 1275071648 bytes

Database Buffers 318767104 bytes

Redo Buffers 7319552 bytes

CREATE CONTROLFILE REUSE DATABASE 'testdb57' NORESETLOGS ARCHIVELOG

MAXLOGFILES 50

MAXLOGMEMBERS 5

MAXDATAFILES 100

MAXINSTANCES 8

MAXLOGHISTORY 226

LOGFILE

GROUP 3 '/data/u01/app/oracle/oradata/stdb59/redo03.log' SIZE 50M,

GROUP 1 '/data/u01/app/oracle/oradata/stdb59/redo01.log' SIZE 50M

DATAFILE

'/data/u01/app/oracle/oradata/stdb59/system01.dbf',

'/data/u01/app/oracle/oradata/stdb59/sysaux01.dbf',

'/data/u01/app/oracle/oradata/stdb59/undotbs01.dbf',

'/data/u01/app/oracle/oradata/stdb59/users01.dbf',

'/data/u01/app/oracle/oradata/stdb59/liuwenhe.dbf',

'/data/u01/app/oracle/oradata/stdb59/soe.dbf',

'/data/u01/app/oracle/oradata/stdb59/soe3.dbf'

CHARACTER SET AL32UTF8;

Control file created.

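Next, use oradebug to advance the SCN held in memory so that the recover step below can re-create the lost redo files, because recover reads the in-memory SCN. Note that alter session set events '10015 trace name adjust_scn level 10'; no longer works as of 11.2.0.4.

(An aside on the Oracle SCN. Inside the database the SCN is a monotonically increasing number, and it is recorded in the control file, the datafiles, the online redo logs, the archived logs, and the backup sets. Internally it is stored in two parts, Base and Wrap. Base holds the low-order part of the SCN in 32 bits. When the value overflows those 32 bits, the system carries into Wrap. In other words, Wrap counts how many times Base has rolled over 4G, that is 2^32, values.)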
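The Base and Wrap arithmetic in the aside can be written down directly. This is a small sketch under the 32-bit Base described above; `split_scn` and `join_scn` are hypothetical helper names.

```python
BASE_BITS = 32  # Base is the low 32 bits of the SCN, as described above


def split_scn(scn):
    """Split an SCN into (wrap, base)."""
    wrap = scn >> BASE_BITS
    base = scn & ((1 << BASE_BITS) - 1)
    return wrap, base


def join_scn(wrap, base):
    """Rebuild the full SCN from its wrap and base parts."""
    return (wrap << BASE_BITS) | base


if __name__ == "__main__":
    # 4318209770 is a FIRST_CHANGE# value from the v$log output in case IV.
    print(split_scn(4318209770))
    print(split_scn(21959486))
```

An SCN below 2^32, such as 21959486 in this case, has a wrap of 0, so only the 4-byte base needs to be written.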

SQL> oradebug poke 0x06001AE70 4 0x001B7740

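This oradebug command advances the SCN. In poke, the first argument is the memory address to write to, the second is the number of bytes to write, and the third is the value to write. The value is read as decimal by default; here it is given in hexadecimal (with a 0x prefix). Each 4-byte segment corresponds to 8 hexadecimal digits.

First look up the datafile checkpoint SCN recorded in the control file: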

SQL> select file#, checkpoint_change# from v$datafile;

FILE# CHECKPOINT_CHANGE#

---------- ------------------

1 21959486

2 21959486

3 21959486

4 21959486

5 21959486

6 21959486

7 21959486

7 rows selected.

SQL> oradebug setmypid

Statement processed.

SQL> oradebug DUMPvar SGA kcsgscn_

kcslf kcsgscn_ [06001AE70, 06001AEA0) = 014F14A2 00000001 00000000 00000000 000000EB 00000000 00000000 00000000 00000000 00000000 6001AB50 00000000

SQL> oradebug poke 0x06001AE70 4 21959486

BEFORE: [06001AE70, 06001AE74) = 00000000

AFTER: [06001AE70, 06001AE74) = 014F133E

(Alternatively, convert 21959486 to hexadecimal first and then poke that value:

SQL> select to_char(21959486, 'XXXXXXXXXXX') from dual;

TO_CHAR(2195

------------

14F133E

SQL> oradebug poke 0x06001AE70 4 0x14F133E

BEFORE: [06001AE70, 06001AE74) = 00000000

AFTER: [06001AE70, 06001AE74) = 014F133E)

Dumping the variable again shows that it has indeed become 014F133E (21959486 in decimal):

SQL> oradebug DUMPvar SGA kcsgscn_

kcslf kcsgscn_ [06001AE70, 06001AEA0) = 014F133E 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 6001AB50 00000000
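The decimal to hexadecimal conversion for the poke value can be double-checked outside the database. The sketch below only formats the command text; `poke_command` is a hypothetical helper, and the address 0x06001AE70 is the one from this instance's DUMPvar output, so it will differ on other systems.

```python
def poke_command(address, scn_base, length=4):
    """Format an oradebug poke command with the value as zero-padded hex.

    `length` is the number of bytes written; each byte is two hex digits.
    """
    if not 0 <= scn_base < (1 << (8 * length)):
        raise ValueError("value does not fit in {} bytes".format(length))
    value = "0x{:0{width}X}".format(scn_base, width=2 * length)
    return "oradebug poke {} {} {}".format(address, length, value)


if __name__ == "__main__":
    print(poke_command("0x06001AE70", 21959486))
```

Rejecting values that do not fit in the given length keeps an SCN with a non-zero wrap from being silently truncated into the 4-byte base.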

Then run recover to perform an incomplete recovery:

SQL> recover database until cancel;

ORA-00279: change 21959486 generated at 04/06/2019 23:52:28 needed for thread 1

ORA-00289: suggestion :

/data/u01/app/oracle/fast_recovery_area/STDB59/archivelog/2019_04_07/o1_mf_1_2_%

u_.arc

ORA-00280: change 21959486 for thread 1 is in sequence #2

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

CANCEL

ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below

ORA-01194: file 1 needs more recovery to be consistent

ORA-01110: data file 1: '/data/u01/app/oracle/oradata/stdb59/system01.dbf'

ORA-01112: media recovery not started

SQL> alter database open resetlogs;

Database altered.

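Recovery is now complete.

III: Oracle in ARCHIVELOG mode, consistent shutdown

This case is the same as case I: no instance recovery is needed, so all the redo groups can simply be cleared and re-created, or re-created through recover.

Method 1: clear the affected redo log groups, which drops and re-creates the files.

SQL> shutdown immediate # consistent shutdown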

Database closed.

Database dismounted.

ORACLE instance shut down.

SQL> startup mount

ORACLE instance started.

Total System Global Area 1603411968 bytes

Fixed Size 2253664 bytes

Variable Size 1275071648 bytes

Database Buffers 318767104 bytes

Redo Buffers 7319552 bytes

Database mounted.

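Drop and re-create, or simply clear, every redo log group, including the CURRENT and ACTIVE ones. In ARCHIVELOG mode, a group that has not been archived yet has to be cleared with alter database clear unarchived logfile group n instead: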

SQL> alter database clear logfile group 1;

Database altered.

SQL> alter database clear logfile group 3;

Database altered.

SQL> alter database open ;

Database altered.

Method 2: re-create the redo logs through recover. In my experiments this method sometimes raised errors; if it does, fall back to Method 1.

SQL> shutdown immediate

Database closed.

Database dismounted.

ORACLE instance shut down.

SQL> startup mount

ORACLE instance started.

Total System Global Area 830930944 bytes

Fixed Size 2257800 bytes

Variable Size 536874104 bytes

Database Buffers 289406976 bytes

Redo Buffers 2392064 bytes

Database mounted.

SQL>

### Re-create the lost redo files; they are only created automatically once the database is opened with RESETLOGS.

SQL> recover database until cancel;

Media recovery complete.

SQL> alter database open resetlogs;

Database altered.

IV: Oracle in ARCHIVELOG mode, inconsistent shutdown

In this case the only option is an incomplete recovery using the archived logs.

SQL> select * from v$log;

GROUP# THREAD# SEQUENCE# BYTES BLOCKSIZE MEMBERS ARC

---------- ---------- ---------- ---------- ---------- ---------- ---

STATUS FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME

---------------- ------------- --------- ------------ ---------

1 1 39 52428800 512 1 YES

INACTIVE 4318162327 20-APR-19 4318209770 20-APR-19

3 1 40 52428800 512 1 NO

CURRENT 4318209770 20-APR-19 2.8147E+14

SQL> archive log list;

Database log mode Archive Mode

Automatic archival Enabled

Archive destination USE_DB_RECOVERY_FILE_DEST

Oldest online log sequence 39

Next log sequence to archive 40

Current log sequence 40

Delete the redo log files:

[oracle@testdb59 stdb59]$ rm -f *.log

Then shut down inconsistently:

SQL> shu abort

ORACLE instance shut down.

Recovery steps:

SQL> startup mount

ORACLE instance started.

Total System Global Area 1603411968 bytes

Fixed Size 2253664 bytes

Variable Size 1275071648 bytes

Database Buffers 318767104 bytes

Redo Buffers 7319552 bytes

Database mounted.

### Re-create the lost redo files; they are only created automatically once the database is opened with RESETLOGS.

SQL> recover database until cancel;

Media recovery complete.

Try opening with RESETLOGS. If it fails as below, the hidden parameter _allow_resetlogs_corruption is needed again:

SQL> alter database open RESETLOGS;

alter database open RESETLOGS

*

ERROR at line 1:

ORA-01194: file 1 needs more recovery to be consistent

ORA-01110: data file 1: '/data/u01/app/oracle/oradata/stdb59/system01.dbf'

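Force the database open with the hidden parameter _allow_resetlogs_corruption. With this parameter set, Oracle skips some consistency checks during open, so the database can be opened even though it is in an inconsistent state.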

SQL> create pfile='/home/oracle/pfile.ora' from spfile;

File created.

Then add the following line to /home/oracle/pfile.ora:

*._allow_resetlogs_corruption=true

SQL> startup mount pfile='/home/oracle/pfile.ora';

SQL> alter database open RESETLOGS;

Database altered.

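Then shut the database down consistently, remove the hidden parameter _allow_resetlogs_corruption, and restart the database.

Summary: with or without archiving, once the database has been shut down inconsistently, recovery generally has to fall back on the hidden parameter _allow_resetlogs_corruption. After a consistent shutdown recovery is much simpler: start in mount state and re-create the lost redo files.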
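The four cases can be condensed into a small lookup. This is only a sketch of the decision logic walked through in this article; `recovery_plan` is a hypothetical helper, and the returned strings are reminders of the steps, not a script to run as is.

```python
def recovery_plan(archivelog, clean_shutdown):
    """Return the ordered steps this article uses for one of its four cases."""
    if clean_shutdown:
        # Cases I and III: no instance recovery, so the lost redo is not needed.
        return [
            "startup mount",
            "alter database clear logfile group <n>;  -- repeat for every group",
            "alter database open;",
        ]
    # Cases II and IV: crash recovery wants the redo that was lost.
    steps = [
        "startup mount",
        "recover database until cancel;",
        "alter database open resetlogs;",
        "on ORA-01194: add *._allow_resetlogs_corruption=true to a pfile and retry",
    ]
    if not archivelog:
        # Only case II needed this extra fallback in the experiments above.
        steps.append("on ORA-00600 [2662]: re-create the control file, "
                     "advance the SCN with oradebug, then retry")
    steps.append("shut down cleanly, remove the hidden parameter, restart")
    return steps


if __name__ == "__main__":
    for step in recovery_plan(archivelog=False, clean_shutdown=False):
        print(step)
```

The archiving mode only changes the plan after an inconsistent shutdown, which matches the summary: a consistent shutdown makes the lost redo irrelevant in both modes.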

Source: ITPUB Blog, link: http://blog.itpub.net/29654823/viewspace-2642066/. Please credit the source when reposting; otherwise legal action may be taken.
