ITPub Blog

Recovering from OCR Disk Corruption

Original  Oracle  Author: oracle_mao  Time: 2016-03-29 17:12:29


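Approach:
1. Corrupt the OCR (by overwriting the disks with dd).
2. Shut down the clusterware on all nodes; all resources and the databases are stopped as well.
3. On one node, run crsctl start crs -excl -nocrs to start ASM.
4. Connect as sysasm, create a new disk group, and create the ASM spfile in that disk group. Then stop ASM with shutdown immediate and start it again with startup so that it uses the new parameter file.
5. Restore the OCR.
6. Restore the voting disks.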
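
The sequence above can be sketched as a small dry-run script that only prints the commands in order and executes nothing. The variable names GRID_BIN, NEW_DG and OCR_DUMP are illustrative assumptions; their defaults are the values used later in this post.

```shell
#!/bin/sh
# Dry run: print the recovery commands in order, execute nothing.
# GRID_BIN, NEW_DG and OCR_DUMP are hypothetical names with defaults from this post.
GRID_BIN=${GRID_BIN:-/u01/app/11.2.0/grid/bin}
NEW_DG=${NEW_DG:-CRSVOTEDISK}
OCR_DUMP=${OCR_DUMP:-/root/ocr_0326.dmp}

print_recovery_plan() {
    # Step 2: stop the clusterware stack (run on every node).
    echo "$GRID_BIN/crsctl stop crs"
    # Step 3: on one node, start in exclusive mode without crsd.
    echo "$GRID_BIN/crsctl start crs -excl -nocrs"
    # Step 4: done inside sqlplus / as sysasm, so shown only as a reminder.
    echo "# sqlplus / as sysasm: create diskgroup $NEW_DG, then create spfile='+$NEW_DG' from pfile"
    # Step 5: restore the OCR from the logical export.
    echo "$GRID_BIN/ocrconfig -import $OCR_DUMP"
    # Step 6: recreate the voting files in the new disk group.
    echo "$GRID_BIN/crsctl replace votedisk +$NEW_DG"
}

print_recovery_plan
```

Printing rather than running keeps the sketch safe to try on any machine; each line still has to be run as root on the right node.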




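Steps:
Before corrupting the OCR, first check that OCR backups exist. There are two kinds of backup: manual and automatic.
Start with a quick look at the redundancy type of the disk group that holds the OCR.
This is my test environment; the OCR disk group uses normal redundancy.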
SQL> select name,type from v$asm_diskgroup;


NAME                           TYPE
------------------------------ ------
DATA                           NORMAL


This is another environment, where the OCR disk group uses external redundancy.
SQL> select name,type from v$asm_diskgroup;
 
NAME                           TYPE
------------------------------ ------
DATA_PTL                       EXTERN
OCR_PTL                        EXTERN


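Check the automatic backups.
The default automatic backup path is $CRS_HOME/cdata/$CRS_NAME (here /u01/app/11.2.0/grid/cdata/cluster01).
Automatic backups are taken on only one node. If that node fails, Oracle automatically switches the backup to another node.
By default, Oracle keeps five automatic OCR backups: the three most recent (taken four hours apart, as the timestamps below show), one from the previous day, and one from the previous week.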
[root@host01 bin]# ./ocrconfig -showbackup


host01     2016/03/24 06:48:42     /u01/app/11.2.0/grid/cdata/cluster01/backup00.ocr


host01     2016/03/24 02:48:42     /u01/app/11.2.0/grid/cdata/cluster01/backup01.ocr


host01     2016/03/23 22:48:41     /u01/app/11.2.0/grid/cdata/cluster01/backup02.ocr


host01     2016/03/23 14:48:40     /u01/app/11.2.0/grid/cdata/cluster01/day.ocr


host01     2016/03/23 14:48:40     /u01/app/11.2.0/grid/cdata/cluster01/week.ocr
PROT-25: Manual backups for the Oracle Cluster Registry are not available
[root@host01 bin]# ll /u01/app/11.2.0/grid/cdata/cluster01/
total 43344
-rw------- 1 root root 7385088 Mar 24 06:48 backup00.ocr
-rw------- 1 root root 7385088 Mar 24 02:48 backup01.ocr
-rw------- 1 root root 7385088 Mar 23 22:48 backup02.ocr
-rw------- 1 root root 7385088 Mar 24 02:48 day_.ocr
-rw------- 1 root root 7385088 Mar 23 14:48 day.ocr
-rw------- 1 root root 7385088 Mar 23 14:48 week.ocr
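
When restoring from an automatic backup rather than an export, the newest backup file has to be picked from this listing. A minimal sketch, assuming the four-column line format shown above (node, date, time, path); the function name latest_ocr_backup is hypothetical.

```shell
#!/bin/sh
# Read `ocrconfig -showbackup` output on stdin and print the newest backup path.
# Assumes lines of the form: <node> <YYYY/MM/DD> <HH:MM:SS> <path ending in .ocr>
latest_ocr_backup() {
    awk 'NF == 4 && $4 ~ /\.ocr$/ { print $2, $3, $4 }' |
        LC_ALL=C sort |        # date and time sort correctly as plain text
        tail -n 1 |
        awk '{ print $3 }'
}

# Sample taken from the listing in this post.
sample='host01     2016/03/24 06:48:42     /u01/app/11.2.0/grid/cdata/cluster01/backup00.ocr
host01     2016/03/24 02:48:42     /u01/app/11.2.0/grid/cdata/cluster01/backup01.ocr
host01     2016/03/23 14:48:40     /u01/app/11.2.0/grid/cdata/cluster01/week.ocr
PROT-25: Manual backups for the Oracle Cluster Registry are not available'

printf '%s\n' "$sample" | latest_ocr_backup
```

On a live node the input would come from ./ocrconfig -showbackup instead of the sample; the printed path is the kind of argument ocrconfig -restore expects.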


Check OCR integrity:
[oracle@host01 bin]$ pwd
/u01/app/11.2.0/grid/bin
[oracle@host01 bin]$ ./cluvfy comp ocr -n all


Verifying OCR integrity 


Checking OCR integrity...


Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations




ASM Running check passed. ASM is running on all specified nodes


Checking OCR config file "/etc/oracle/ocr.loc"...


OCR config file "/etc/oracle/ocr.loc" check successful




Disk group for ocr location "+DATA" available on all the nodes




NOTE: 
This check does not verify the integrity of the OCR contents. Execute 'ocrcheck' as a privileged user to verify the contents of OCR.


OCR integrity check passed


Verification of OCR integrity was successful.
ocrcheck can also be used to check OCR integrity:
[oracle@host01 bin]$ ./ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3064
         Available space (kbytes) :     259056
         ID                       :  764742178
         Device/File Name         :      +DATA
                                    Device/File integrity check succeeded


                                    Device/File not configured


                                    Device/File not configured


                                    Device/File not configured


                                    Device/File not configured


         Cluster registry integrity check succeeded


         Logical corruption check bypassed due to non-privileged user


[oracle@host01 bin]$ 
[root@host02 ~]# /u01/app/11.2.0/grid/bin/ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3064
         Available space (kbytes) :     259056
         ID                       :  764742178
         Device/File Name         :      +DATA
                                    Device/File integrity check succeeded


                                    Device/File not configured


                                    Device/File not configured


                                    Device/File not configured


                                    Device/File not configured


         Cluster registry integrity check succeeded


         Logical corruption check succeeded
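
The two runs above differ in one line: as the oracle user the logical corruption check is bypassed, while as root it succeeds. A small sketch that summarizes ocrcheck output along those lines; the function name ocrcheck_summary and its messages are illustrative assumptions, and the matched strings are the ones shown in the output above.

```shell
#!/bin/sh
# Summarize `ocrcheck` output read from stdin.
# Returns non-zero unless both the integrity and logical checks succeeded or
# the logical check was merely bypassed for a non-privileged user.
ocrcheck_summary() {
    out=$(cat)
    case $out in
        *"Cluster registry integrity check succeeded"*) ;;
        *) echo "integrity: not confirmed"; return 1 ;;
    esac
    case $out in
        *"Logical corruption check succeeded"*)
            echo "integrity: ok, logical: ok" ;;
        *"Logical corruption check bypassed"*)
            echo "integrity: ok, logical: bypassed (rerun as root)" ;;
        *)
            echo "integrity: ok, logical: unknown"; return 1 ;;
    esac
}

# Demo with the two relevant lines from the non-root run above.
printf '%s\n' \
    'Cluster registry integrity check succeeded' \
    'Logical corruption check bypassed due to non-privileged user' |
    ocrcheck_summary
```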
Check the voting disk information:
[oracle@host01 bin]$ ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   a19ebdf15f374f81bf6af6cd31ed413f (/dev/oracleasm/disks/ASMDISK1) [DATA]
 2. ONLINE   6854e5df9d314f36bfc0a65db37c5db1 (/dev/oracleasm/disks/ASMDISK2) [DATA]
 3. ONLINE   45f2a22766864fd8bf00694c9d8029d3 (/dev/oracleasm/disks/ASMDISK3) [DATA]
Located 3 voting disk(s).
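
It helps to know exactly which devices back the voting files before any disk is touched. A minimal sketch that extracts the device paths of the ONLINE voting files from the output format shown above; the function name online_voting_devices is hypothetical.

```shell
#!/bin/sh
# Read `crsctl query css votedisk` output on stdin and print one device path
# per ONLINE voting file. Assumes the column layout shown in this post:
#   <n>. ONLINE <file universal id> (<device path>) [<disk group>]
online_voting_devices() {
    awk '$2 == "ONLINE" {
        path = $4
        gsub(/[()]/, "", path)   # strip the surrounding parentheses
        print path
    }'
}

sample='##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   a19ebdf15f374f81bf6af6cd31ed413f (/dev/oracleasm/disks/ASMDISK1) [DATA]
 2. ONLINE   6854e5df9d314f36bfc0a65db37c5db1 (/dev/oracleasm/disks/ASMDISK2) [DATA]
 3. ONLINE   45f2a22766864fd8bf00694c9d8029d3 (/dev/oracleasm/disks/ASMDISK3) [DATA]
Located 3 voting disk(s).'

printf '%s\n' "$sample" | online_voting_devices
```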


Manually back up the OCR (a logical export):
[root@host02 ~]# cd /u01/app/11.2.0/grid/bin/
[root@host02 bin]# ./ocrconfig -export /home/oracle/ocr_0326.dmp


Export the ASM parameter file:
[oracle@host01 bin]$ export ORACLE_HOME=/u01/app/11.2.0/grid
[oracle@host01 bin]$ export ORACLE_SID=+ASM1
[oracle@host01 bin]$ /u01/app/oracle/product/11.2.0/dbhome_1/bin/sqlplus / as sysasm


SQL*Plus: Release 11.2.0.3.0 Production on Sat Mar 26 14:57:03 2016


Copyright (c) 1982, 2011, Oracle.  All rights reserved.




Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Real Application Clusters and Automatic Storage Management options


SQL> select name,state from v$asm_diskgroup;


NAME                           STATE
------------------------------ -----------
DATA                           MOUNTED


SQL> show parameter spfile


NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      +DATA/cluster01/asmparameterfi
                                                 le/registry.253.906557533
SQL> create pfile='/home/oracle/asmpfile.ora' from spfile;


File created.


SQL> exit
[oracle@host01 ~]$ cat asmpfile.ora 
*.asm_diskstring='/dev/oracleasm/disks'
*.asm_power_limit=1
*.diagnostic_dest='/u01/app/oracle'
*.instance_type='asm'
*.large_pool_size=12M
*.remote_login_passwordfile='EXCLUSIVE'


Corrupt the OCR disks:
[oracle@host01 bin]$ dd if=/dev/zero of=/dev/oracleasm/disks/ASMDISK1 bs=1024 count=1000
1000+0 records in
1000+0 records out
1024000 bytes (1.0 MB) copied, 0.00300442 seconds, 341 MB/s
[oracle@host01 bin]$ dd if=/dev/zero of=/dev/oracleasm/disks/ASMDISK2 bs=1024 count=1000
1000+0 records in
1000+0 records out
1024000 bytes (1.0 MB) copied, 0.072793 seconds, 14.1 MB/s
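Note that restoring the OCR requires the target disk group to be mounted. The OCR can therefore also be restored to a different disk group, in which case /etc/oracle/ocr.loc must be updated to point to it.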
[root@host02 bin]# ./crsctl stop cluster 
CRS-2673: Attempting to stop 'ora.crsd' on 'host02'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'host02'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN2.lsnr' on 'host02'
CRS-2673: Attempting to stop 'ora.registry.acfs' on 'host02'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'host02'
CRS-2673: Attempting to stop 'ora.oc4j' on 'host02'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN3.lsnr' on 'host02'
CRS-2673: Attempting to stop 'ora.cvu' on 'host02'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'host02'
CRS-2677: Stop of 'ora.cvu' on 'host02' succeeded
CRS-2672: Attempting to start 'ora.cvu' on 'host01'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'host02' succeeded
CRS-2673: Attempting to stop 'ora.host02.vip' on 'host02'
CRS-2677: Stop of 'ora.LISTENER_SCAN2.lsnr' on 'host02' succeeded
CRS-2673: Attempting to stop 'ora.scan2.vip' on 'host02'
CRS-2676: Start of 'ora.cvu' on 'host01' succeeded
CRS-2677: Stop of 'ora.scan2.vip' on 'host02' succeeded
CRS-2672: Attempting to start 'ora.scan2.vip' on 'host01'
CRS-2677: Stop of 'ora.LISTENER_SCAN3.lsnr' on 'host02' succeeded
CRS-2673: Attempting to stop 'ora.scan3.vip' on 'host02'
CRS-2677: Stop of 'ora.scan3.vip' on 'host02' succeeded
CRS-2672: Attempting to start 'ora.scan3.vip' on 'host01'
CRS-2677: Stop of 'ora.host02.vip' on 'host02' succeeded
CRS-2672: Attempting to start 'ora.host02.vip' on 'host01'
CRS-2676: Start of 'ora.scan2.vip' on 'host01' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN2.lsnr' on 'host01'
CRS-2676: Start of 'ora.scan3.vip' on 'host01' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN3.lsnr' on 'host01'
CRS-2676: Start of 'ora.host02.vip' on 'host01' succeeded
CRS-2677: Stop of 'ora.registry.acfs' on 'host02' succeeded
CRS-2676: Start of 'ora.LISTENER_SCAN3.lsnr' on 'host01' succeeded
CRS-2676: Start of 'ora.LISTENER_SCAN2.lsnr' on 'host01' succeeded
CRS-2677: Stop of 'ora.oc4j' on 'host02' succeeded
CRS-2672: Attempting to start 'ora.oc4j' on 'host01'
CRS-2676: Start of 'ora.oc4j' on 'host01' succeeded
CRS-2677: Stop of 'ora.DATA.dg' on 'host02' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'host02'
CRS-2677: Stop of 'ora.asm' on 'host02' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'host02'
CRS-2677: Stop of 'ora.ons' on 'host02' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'host02'
CRS-2677: Stop of 'ora.net1.network' on 'host02' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'host02' has completed
CRS-2677: Stop of 'ora.crsd' on 'host02' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'host02'
CRS-2673: Attempting to stop 'ora.evmd' on 'host02'
CRS-2673: Attempting to stop 'ora.asm' on 'host02'
CRS-2677: Stop of 'ora.evmd' on 'host02' succeeded
CRS-2677: Stop of 'ora.asm' on 'host02' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'host02'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'host02' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'host02' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'host02'
CRS-2677: Stop of 'ora.cssd' on 'host02' succeeded
[root@host02 bin]# ./crsctl start cluster --- startup fails
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'host02'
CRS-2676: Start of 'ora.cssdmonitor' on 'host02' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'host02'
CRS-2672: Attempting to start 'ora.diskmon' on 'host02'
CRS-2676: Start of 'ora.diskmon' on 'host02' succeeded
CRS-2674: Start of 'ora.cssd' on 'host02' failed
CRS-2679: Attempting to clean 'ora.cssd' on 'host02'
CRS-2681: Clean of 'ora.cssd' on 'host02' succeeded
CRS-5804: Communication error with agent process
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'host02'
CRS-2676: Start of 'ora.cssdmonitor' on 'host02' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'host02'
CRS-2672: Attempting to start 'ora.diskmon' on 'host02'
CRS-2676: Start of 'ora.diskmon' on 'host02' succeeded
CRS-2674: Start of 'ora.cssd' on 'host02' failed
CRS-2679: Attempting to clean 'ora.cssd' on 'host02'
CRS-2681: Clean of 'ora.cssd' on 'host02' succeeded
CRS-5804: Communication error with agent process
CRS-4000: Command Start failed, or completed with errors.
[root@host02 host02]# tail -20 alerthost02.log 
2016-03-26 15:03:56.872
[cssd(8781)]CRS-1603:CSSD on node host02 shutdown by user.
2016-03-26 15:03:56.877
[ohasd(5017)]CRS-2765:Resource 'ora.cssdmonitor' has failed on server 'host02'.
2016-03-26 15:03:57.835
[ohasd(5017)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
2016-03-26 15:14:03.500
[/u01/app/11.2.0/grid/bin/cssdagent(8768)]CRS-5818:Aborted command 'start' for resource 'ora.cssd'. Details at (:CRSAGF00113:) {0:0:148} in /u01/app/11.2.0/grid/log/host02/agent/ohasd/oracssdagent_root/oracssdagent_root.log.
2016-03-26 15:14:16.599
[cssd(9425)]CRS-1713:CSSD daemon is started in clustered mode
2016-03-26 15:14:17.118
[cssd(9425)]CRS-1637:Unable to locate configured voting file with ID a19ebdf1-5f374f81-bf6af6cd-31ed413f; details at (:CSSNM00020:) in /u01/app/11.2.0/grid/log/host02/cssd/ocssd.log
2016-03-26 15:14:17.119
[cssd(9425)]CRS-1637:Unable to locate configured voting file with ID 6854e5df-9d314f36-bfc0a65d-b37c5db1; details at (:CSSNM00020:) in /u01/app/11.2.0/grid/log/host02/cssd/ocssd.log
2016-03-26 15:14:17.119
[cssd(9425)]CRS-1705:Found 1 configured voting files but 2 voting files are required, terminating to ensure data integrity; details at (:CSSNM00021:) in /u01/app/11.2.0/grid/log/host02/cssd/ocssd.log
2016-03-26 15:14:17.119
[cssd(9425)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /u01/app/11.2.0/grid/log/host02/cssd/ocssd.log
2016-03-26 15:14:17.140
[cssd(9425)]CRS-1603:CSSD on node host02 shutdown by user.
[root@host02 bin]# ./ocrcheck
PROT-602: Failed to retrieve data from the cluster registry
PROC-26: Error while accessing the physical storage
ORA-29701: unable to connect to Cluster Synchronization Service
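
The CRS-1705 message in the alert log follows from majority arithmetic: CSSD needs a strict majority of the configured voting files, and two of the three were overwritten. A short sketch of that calculation; the function names are hypothetical.

```shell
#!/bin/sh
# Strict majority of n configured voting files: floor(n/2) + 1.
required_voting_files() {
    echo $(( $1 / 2 + 1 ))
}

# $1 = configured voting files, $2 = files CSSD can still locate.
css_can_start() {
    [ "$2" -ge "$(required_voting_files "$1")" ]
}

# Values from this post: 3 configured, ASMDISK1 and ASMDISK2 overwritten.
configured=3
located=1
if css_can_start "$configured" "$located"; then
    echo "CSSD can start"
else
    echo "CSSD cannot start: needs $(required_voting_files "$configured"), found $located"
fi
```

With three voting files the threshold is two, which matches "Found 1 configured voting files but 2 voting files are required" in the log.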


Start the clusterware stack in exclusive mode without starting crsd:
[root@host01 bin]# ./crsctl disable crs
CRS-4621: Oracle High Availability Services autostart is disabled.
[root@host01 bin]# reboot
[root@host01 bin]# ./crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'host01'
CRS-2676: Start of 'ora.mdnsd' on 'host01' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'host01'
CRS-2676: Start of 'ora.gpnpd' on 'host01' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'host01'
CRS-2672: Attempting to start 'ora.gipcd' on 'host01'
CRS-2676: Start of 'ora.cssdmonitor' on 'host01' succeeded
CRS-2676: Start of 'ora.gipcd' on 'host01' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'host01'
CRS-2672: Attempting to start 'ora.diskmon' on 'host01'
CRS-2676: Start of 'ora.diskmon' on 'host01' succeeded
CRS-2676: Start of 'ora.cssd' on 'host01' succeeded
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'host01'
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'host01'
CRS-2672: Attempting to start 'ora.ctssd' on 'host01'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'host01' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'host01'
CRS-2676: Start of 'ora.ctssd' on 'host01' succeeded
CRS-2676: Start of 'ora.drivers.acfs' on 'host01' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'host01' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'host01'
CRS-2676: Start of 'ora.asm' on 'host01' succeeded
[root@host01 bin]# ./crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.
[root@host01 bin]# ps -ef |grep smon
oracle    5984     1  0 16:03 ?        00:00:00 asm_smon_+ASM1
root      6079  5305  0 16:03 pts/1    00:00:00 grep smon
Create the ASM parameter file:
[oracle@host01 ~]$ export ORACLE_SID=+ASM1
[oracle@host01 ~]$ export ORACLE_HOME=/u01/app/11.2.0/grid
[oracle@host01 ~]$ /u01/app/oracle/product/11.2.0/dbhome_1/bin/asmcmd
ASMCMD> ls
ASMCMD> exit
[oracle@host01 ~]$ /u01/app/oracle/product/11.2.0/dbhome_1/bin/sqlplus / as sysasm


SQL*Plus: Release 11.2.0.3.0 Production on Sat Mar 26 16:05:25 2016


Copyright (c) 1982, 2011, Oracle.  All rights reserved.




Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Real Application Clusters and Automatic Storage Management options


SQL> show parameter spfile;


NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string
SQL> create diskgroup CRSVOTEDISK normal redundancy disk '/dev/oracleasm/disks/ASMDISK7','/dev/oracleasm/disks/ASMDISK8','/dev/oracleasm/disks/ASMDISK9'  attribute 'compatible.asm'='11.2.0.0.0', 'compatible.rdbms'='11.2.0.0.0';


Diskgroup created.
Or:
create diskgroup MAO  external  redundancy disk  '/dev/oracleasm/disks/ASMDISK10' attribute 'compatible.asm'='11.2.0.0.0', 'compatible.rdbms'='11.2.0.0.0';


SQL> create spfile='+CRSVOTEDISK' from pfile='/home/oracle/asmpfile.ora';


File created.


[oracle@host01 ~]$ /u01/app/oracle/product/11.2.0/dbhome_1/bin/asmcmd
ASMCMD> ls
CRSVOTEDISK/
ASMCMD> ls C*
cluster01/
ASMCMD> ls C*/c*
ASMPARAMETERFILE/
ASMCMD> ls C*/c*/*
REGISTRY.253.907517283
Point the OCR location at the new disk group:
[root@host01 bin]# vi /etc/oracle/ocr.loc
#ocrconfig_loc=+DATA
ocrconfig_loc=+CRSVOTEDISK 
local_only=FALSE
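
The same edit has to be repeated on every node, so it may be convenient to script it instead of using vi. A minimal sketch, assuming the key=value layout of ocr.loc shown above; the function name set_ocr_location is hypothetical, and the demo works on a throwaway copy rather than /etc/oracle/ocr.loc.

```shell
#!/bin/sh
# Rewrite the active ocrconfig_loc line in an ocr.loc file, keeping a .bak copy.
# $1 = path to the ocr.loc file, $2 = new location such as +CRSVOTEDISK
set_ocr_location() {
    file=$1
    new_loc=$2
    cp -p "$file" "$file.bak" || return 1
    # Only a line that starts with ocrconfig_loc= is changed; commented lines stay as they are.
    sed "s|^ocrconfig_loc=.*|ocrconfig_loc=$new_loc|" "$file.bak" > "$file"
}

# Demo on a temporary file with the original contents.
tmp=${TMPDIR:-/tmp}/ocr.loc.demo.$$
printf 'ocrconfig_loc=+DATA\nlocal_only=FALSE\n' > "$tmp"
set_ocr_location "$tmp" +CRSVOTEDISK
cat "$tmp"
rm -f "$tmp" "$tmp.bak"
```

Writing back through a redirect keeps the ownership and permissions of the existing file, and the .bak copy plays the role of the commented-out old line.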




[root@host01 bin]# ./crsctl query css votedisk
Located 0 voting disk(s).
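[root@host01 bin]# ./ocrconfig -import /root/ocr_0326.dmp --- restore the OCR
The OCR has now been imported into the ASM disk group created above, because /etc/oracle/ocr.loc names the disk group to import into. (The export was taken earlier on host02 as /home/oracle/ocr_0326.dmp, so it presumably had to be copied to /root on this node first.) The voting disk configuration may be inconsistent at this point, so use the following command to move the voting disks into the same disk group as the OCR.
[root@host01 bin]# ./crsctl replace votedisk +CRSVOTEDISK  --- restore the voting disks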
Successful addition of voting disk 4245abe430544f49bff06fe9e1debf54.
Successful addition of voting disk 8c470e646aaa4f28bfe517868b7d06b1.
Successful addition of voting disk af827499ece74f85bfa184aebfa1eeee.
Successfully replaced voting disk group with +CRSVOTEDISK.
CRS-4266: Voting file(s) successfully replaced
[root@host01 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3064
         Available space (kbytes) :     259056
         ID                       : 2105760671
         Device/File Name         : +CRSVOTEDISK
                                    Device/File integrity check succeeded


                                    Device/File not configured


                                    Device/File not configured


                                    Device/File not configured


                                    Device/File not configured


         Cluster registry integrity check succeeded


         Logical corruption check succeeded


[root@host01 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   4245abe430544f49bff06fe9e1debf54 (/dev/oracleasm/disks/ASMDISK7) [CRSVOTEDISK]
 2. ONLINE   8c470e646aaa4f28bfe517868b7d06b1 (/dev/oracleasm/disks/ASMDISK8) [CRSVOTEDISK]
 3. ONLINE   af827499ece74f85bfa184aebfa1eeee (/dev/oracleasm/disks/ASMDISK9) [CRSVOTEDISK]
Located 3 voting disk(s).
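
Before restarting the cluster, it is worth confirming that every voting file now lives in the new disk group. A small sketch of that check against the output format above; the function name all_votedisks_in_group is hypothetical.

```shell
#!/bin/sh
# Succeed only if at least one voting file is ONLINE and every ONLINE voting
# file is in the expected disk group.
# $1 = expected disk group name; stdin = `crsctl query css votedisk` output
all_votedisks_in_group() {
    awk -v dg="[$1]" '
        $2 == "ONLINE" { total++; if ($NF == dg) ok++ }
        END { exit !(total > 0 && ok == total) }
    '
}

sample=' 1. ONLINE   4245abe430544f49bff06fe9e1debf54 (/dev/oracleasm/disks/ASMDISK7) [CRSVOTEDISK]
 2. ONLINE   8c470e646aaa4f28bfe517868b7d06b1 (/dev/oracleasm/disks/ASMDISK8) [CRSVOTEDISK]
 3. ONLINE   af827499ece74f85bfa184aebfa1eeee (/dev/oracleasm/disks/ASMDISK9) [CRSVOTEDISK]
Located 3 voting disk(s).'

if printf '%s\n' "$sample" | all_votedisks_in_group CRSVOTEDISK; then
    echo "voting files are all in +CRSVOTEDISK"
else
    echo "voting files are missing or in another disk group"
fi
```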
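Make the same ocr.loc change on node 2, then restart the clusterware on both nodes. Because autostart was disabled earlier with crsctl disable crs, it can be re-enabled with crsctl enable crs once the stack starts normally.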
[root@host01 bin]# ./crsctl stop crs

From "ITPUB Blog", link: http://blog.itpub.net/24500180/viewspace-2071834/. If you repost this article, please credit the source; otherwise legal action may be taken.
