ITPub博客

首页 > 数据库 > Oracle > oracle 11g rac 共享存储坏掉后数据库恢复

oracle 11g rac 共享存储坏掉后数据库恢复

原创 Oracle 作者:selectshen 时间:2015-10-23 18:44:53 0 删除 编辑
共享存储坏掉,存在共享存储的OCR,votedisk,ASM,database数据全部丢失.本地主机保存有数据库的备份.两个主机的软件是没有问题,所以此时只需要重新配置gi,然后恢复数据库.

以下模拟:
host:ct66rac01,ct66rac02
os:centos 6.6
db:11.2.0.4
GRID_HOME=/u01/app/11.2.0/grid
ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1

当前数据库正常状态:
[grid@ct66rac01 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.DATA.dg    ora....up.type ONLINE    ONLINE    ct66rac01
ora....ER.lsnr ora....er.type ONLINE    ONLINE    ct66rac01
ora....N1.lsnr ora....er.type ONLINE    ONLINE    ct66rac01
ora.asm        ora.asm.type   ONLINE    ONLINE    ct66rac01
ora....SM1.asm application    ONLINE    ONLINE    ct66rac01
ora....01.lsnr application    ONLINE    ONLINE    ct66rac01
ora....c01.gsd application    OFFLINE   OFFLINE
ora....c01.ons application    ONLINE    ONLINE    ct66rac01
ora....c01.vip ora....t1.type ONLINE    ONLINE    ct66rac01
ora....SM2.asm application    ONLINE    ONLINE    ct66rac02
ora....02.lsnr application    ONLINE    ONLINE    ct66rac02
ora....c02.gsd application    OFFLINE   OFFLINE
ora....c02.ons application    ONLINE    ONLINE    ct66rac02
ora....c02.vip ora....t1.type ONLINE    ONLINE    ct66rac02
ora.cvu        ora.cvu.type   ONLINE    ONLINE    ct66rac01
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora....network ora....rk.type ONLINE    ONLINE    ct66rac01
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    ct66rac01
ora.ons        ora.ons.type   ONLINE    ONLINE    ct66rac01
ora.rac11g.db  ora....se.type ONLINE    ONLINE    ct66rac01
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    ct66rac01

模拟存储坏掉:
[root@ct66rac01 bin]#  dd if=/dev/zero bs=1M count=100 of=/dev/sdb1
[root@ct66rac01 bin]#  dd if=/dev/zero bs=1M count=100 of=/dev/sdb2

开始修复:
1.清空cluster的配置信息
--先停止各节点crs
[root@ct66rac01 bin]# cd /u01/app/11.2.0/grid/bin/
[root@ct66rac01 bin]# ./crsctl stop crs -f
[root@ct66rac02 ~]# cd /u01/app/11.2.0/grid/bin/
[root@ct66rac02 bin]# ./crsctl stop crs -f
--清空各节点cluster配置信息,注意最后的节点加-lastnode
[root@ct66rac01 bin]# cd /u01/app/11.2.0/grid/crs/install/
[root@ct66rac01 install]# ./rootcrs.pl -deconfig -force
[root@ct66rac02 bin]# cd /u01/app/11.2.0/grid/crs/install/
[root@ct66rac02 install]#  ./rootcrs.pl -deconfig -force -lastnode

2.重新配置gi
/u01/app/11.2.0/grid/crs/install/crsconfig_params这里面存放了之前gi的配置信息.
共享存储重建之后,如果路径和之前的不一样,可以修改/u01/app/11.2.0/grid/crs/install/crsconfig_params这个文件改为当前的配置.
如果配置信息变化太大,也可以直接通过图形化执行/u01/app/11.2.0/grid/crs/config/config.sh去重新输入环境信息,以完成gi重建.

如果环境没有变化,只需要执行以下:
--重建cluster配置信息,
[root@ct66rac01 ~]# cd /u01/app/11.2.0/grid
[root@ct66rac01 grid]# ./root.sh

[root@ct66rac02 ~]# cd /u01/app/11.2.0/grid
[root@ct66rac02 grid]# ./root.sh

--此时完成之后,可以看到,连asm也重建完成
[grid@ct66rac01 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.DATA.dg    ora....up.type ONLINE    ONLINE    ct66rac01
ora....N1.lsnr ora....er.type ONLINE    ONLINE    ct66rac01
ora.asm        ora.asm.type   ONLINE    ONLINE    ct66rac01
ora....SM1.asm application    ONLINE    ONLINE    ct66rac01
ora....c01.gsd application    OFFLINE   OFFLINE
ora....c01.ons application    ONLINE    ONLINE    ct66rac01
ora....c01.vip ora....t1.type ONLINE    ONLINE    ct66rac01
ora....SM2.asm application    ONLINE    ONLINE    ct66rac02
ora....c02.gsd application    OFFLINE   OFFLINE
ora....c02.ons application    ONLINE    ONLINE    ct66rac02
ora....c02.vip ora....t1.type ONLINE    ONLINE    ct66rac02
ora.cvu        ora.cvu.type   ONLINE    ONLINE    ct66rac01
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora....network ora....rk.type ONLINE    ONLINE    ct66rac01
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    ct66rac01
ora.ons        ora.ons.type   ONLINE    ONLINE    ct66rac01
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    ct66rac01

--重新注册本地listenter到ocr
[grid@ct66rac01 ~]$ srvctl add listener -l listener
[grid@ct66rac01 ~]$ srvctl start listener
[grid@ct66rac01 ~]$ srvctl status listener
Listener LISTENER is enabled
Listener LISTENER is running on node(s): ct66rac01,ct66rac02

--添加数据库的ASM磁盘组
可以通过sqlplus或者asmca,然后通srvctl add diskgroup重新注册diskgroup到ocr.
这里模拟环境的数据库用的和ocr相同的diskgroup,所以不需要新建.

3.gi重新配置完成,开始还原数据库
如果你是dp备份,那需要先通过DBCA去重建数据库,再impdp导入到数据库.
如果是rman备份,执行以下:

[root@ct66rac01 grid]# su - oracle
--显示当前的备份在/home/oracle下
[oracle@ct66rac01 ~]$ ll full*
-rw-r----- 1 oracle asmadmin 1134043136 Oct 22 17:59 full_db_RAC11G_3_1_1.dbfbk
-rw-r----- 1 oracle asmadmin   18579456 Oct 22 17:59 full_db_RAC11G_4_1_1.dbfbk

[oracle@ct66rac01 ~]$ ORACLE_SID=rac11g1
[oracle@ct66rac01 ~]$ rman target /
--启动到nomount,此处不需要参数文件
RMAN> startup nomount;

startup failed: ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+DATA/rac11g/spfilerac11g.ora'
ORA-17503: ksfdopn:2 Failed to open file +DATA/rac11g/spfilerac11g.ora
ORA-15056: additional error message
ORA-17503: ksfdopn:2 Failed to open file +DATA/rac11g/spfilerac11g.ora
ORA-15173: entry 'rac11g' does not exist in directory '/'
ORA-06512: at line 4

starting Oracle instance without parameter file for retrieval of spfile
Oracle instance started

Total System Global Area    1068937216 bytes

Fixed Size                     2260088 bytes
Variable Size                281019272 bytes
Database Buffers             780140544 bytes
Redo Buffers                   5517312 bytes
--还原spfile
--此处报错是因为目录不存在,要先在asm的+DATA下建相应目录
RMAN> restore spfile from '/home/oracle/full_db_RAC11G_4_1_1.dbfbk';

Starting restore at 23-OCT-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=174 device type=DISK

channel ORA_DISK_1: restoring spfile from AUTOBACKUP /home/oracle/full_db_RAC11G_4_1_1.dbfbk
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 10/23/2015 14:36:58
ORA-19870: error while restoring backup piece /home/oracle/full_db_RAC11G_4_1_1.dbfbk
ORA-32015: unable to restore SPFILE
ORA-17502: ksfdcre:4 Failed to create file +DATA/rac11g/spfilerac11g.ora
ORA-15056: additional error message
ORA-17502: ksfdcre:4 Failed to create file +DATA/rac11g/spfilerac11g.ora
ORA-15173: entry 'rac11g' does not exist in directory '/'

RMAN> exit
--在+DATA下建rac11g目录
[root@ct66rac01 grid]# su - grid
[grid@ct66rac01 ~]$ asmcmd
ASMCMD> ls
DATA/
ASMCMD> cd DATA
ASMCMD> ls
ct66rac-scan/
ASMCMD> mkdir rac11g
ASMCMD> exit

[root@ct66rac01 grid]# su - oracle
[oracle@ct66rac01 ~]$ ORACLE_SID=rac11g1
[oracle@ct66rac01 ~]$ rman target /

Recovery Manager: Release 11.2.0.4.0 - Production on Fri Oct 23 14:38:44 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: DUMMY (not mounted)
--重新还原spfile
RMAN>  restore spfile from '/home/oracle/full_db_RAC11G_4_1_1.dbfbk';
--用还原后的spfile开启到nomount
RMAN> startup force nomount;
--还原控制文件
RMAN>  restore controlfile from '/home/oracle/full_db_RAC11G_4_1_1.dbfbk';
--启动到mount
RMAN> alter database mount;
--把/home/oracle下备份文件信息记录到控制文件
RMAN> catalog start with '/home/oracle/full';

searching for all files that match the pattern /home/oracle/full

List of Files Unknown to the Database
=====================================
File Name: /home/oracle/full_db_RAC11G_4_1_1.dbfbk

Do you really want to catalog the above files (enter YES or NO)? yes
cataloging files...
cataloging done

List of Cataloged Files
=======================
File Name: /home/oracle/full_db_RAC11G_4_1_1.dbfbk

RMAN> crosscheck backup;
--还原数据库
RMAN> restore database;

RMAN> recover database;
--打开数据库
RMAN> alter database open resetlogs;

database opened

RMAN> exit
--添加数据库信息到ocr
[oracle@ct66rac01 ~]$ srvctl status database -d rac11g
PRCD-1120 : The resource for database rac11g could not be found.
PRCR-1001 : Resource ora.rac11g.db does not exist
[oracle@ct66rac01 ~]$ echo $ORACLE_HOME
/u01/app/oracle/product/11.2.0/db_1
[oracle@ct66rac01 ~]$ srvctl add database -d rac11g -o /u01/app/oracle/product/11.2.0/db_1 -c RAC
[oracle@ct66rac01 ~]$ srvctl add instance -d rac11g -i rac11g1 -n ct66rac01
[oracle@ct66rac01 ~]$ srvctl add instance -d rac11g -i rac11g2 -n ct66rac02
[oracle@ct66rac01 ~]$ srvctl start database -d rac11g

--修复完成,查看数据库状态
[root@ct66rac01 bin]# ./crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.DATA.dg    ora....up.type ONLINE    ONLINE    ct66rac01
ora....ER.lsnr ora....er.type ONLINE    ONLINE    ct66rac01
ora....N1.lsnr ora....er.type ONLINE    ONLINE    ct66rac01
ora.asm        ora.asm.type   ONLINE    ONLINE    ct66rac01
ora....SM1.asm application    ONLINE    ONLINE    ct66rac01
ora....01.lsnr application    ONLINE    ONLINE    ct66rac01
ora....c01.gsd application    OFFLINE   OFFLINE
ora....c01.ons application    ONLINE    ONLINE    ct66rac01
ora....c01.vip ora....t1.type ONLINE    ONLINE    ct66rac01
ora....SM2.asm application    ONLINE    ONLINE    ct66rac02
ora....02.lsnr application    ONLINE    ONLINE    ct66rac02
ora....c02.gsd application    OFFLINE   OFFLINE
ora....c02.ons application    ONLINE    ONLINE    ct66rac02
ora....c02.vip ora....t1.type ONLINE    ONLINE    ct66rac02
ora.cvu        ora.cvu.type   ONLINE    ONLINE    ct66rac01
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora....network ora....rk.type ONLINE    ONLINE    ct66rac01
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    ct66rac01
ora.ons        ora.ons.type   ONLINE    ONLINE    ct66rac01
ora.rac11g.db  ora....se.type ONLINE    ONLINE    ct66rac01
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    ct66rac01



来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/28539951/viewspace-1815958/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论

注册时间:2014-01-05

  • 博文量
    169
  • 访问量
    1460326