OCR AND VOTING DISK

Original post on ITPub Blog (Linux Operating System category). Author: wshxgxiaoli. Posted 2012-07-05 13:53:13.
OCR (Oracle Cluster Registry) maintains the configuration information of the whole cluster, covering both RAC and Clusterware resources: node membership, databases, instances, services, listeners, applications, and so on.
The "amnesia" problem arises because each node keeps its own copy of the configuration, and a change made on one node is not synchronized to the others. The best solution is to keep a single configuration for the whole cluster that all nodes share: whichever node a change is made on, the same configuration file is modified, so no change is lost.
OCR records the configuration as key-value pairs; when you configure with OEM, DBCA, or SRVCTL, this is the file being updated.
When a node wants to modify OCR content, its local OCR process submits the request to the OCR process on the master node; the master OCR process performs the physical read/write and then synchronizes the OCR cache contents on all nodes.

VOTING DISK: manages cluster node membership; the records here decide which nodes belong to the cluster. When a split-brain occurs, it arbitrates which partition keeps control of the cluster, while the other partitions must be evicted. The voting disk uses a "majority available" algorithm: with multiple voting disks configured, more than half of them must be usable for Clusterware to keep working. For example, with 4 voting disks configured, the cluster still works after losing one; after losing two, more than half are no longer available, the cluster goes down immediately, and all nodes reboot. So when adding voting disks, add two rather than just one. This differs from OCR, which needs only one configured copy.
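The majority rule above can be sketched as a tiny shell check (an illustrative helper, not an Oracle tool; the numbers mirror the 4-disk example in the text):

```shell
# quorum_ok TOTAL ONLINE: succeed only when strictly more than half of the
# configured voting disks are still usable (the "majority available" rule).
quorum_ok() {
  local total=$1 online=$2
  [ $(( online * 2 )) -gt "$total" ]
}

quorum_ok 4 3 && echo "4 configured, 1 lost: cluster keeps running"
quorum_ok 4 2 || echo "4 configured, 2 lost: no majority, all nodes reboot"
```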

Clearly this information is critical, so it should be backed up as part of routine work (OCR is also backed up automatically). When OCR or the voting disk fails, restore from a backup if one exists; without a backup, the only option is to rebuild.

Voting disk backup and restore test:
1. Find the voting disk location:
crsctl query css votedisk
 0.     0    /ocfs/clusterware/votingdisk

located 1 votedisk(s).

2. Back up the voting disk:
dd if=/ocfs/clusterware/votingdisk of=/home/oracle/rman/voting_disk.bak
20000+0 records in
20000+0 records out
10240000 bytes (10 MB) copied, 1.01971 seconds, 10.0 MB/s
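After a dd backup like the one above, it is cheap to verify the copy bit-for-bit before relying on it (a hedged sketch; the helper function is illustrative, not from the article):

```shell
# Copy a raw device/file with dd, then compare the copy byte-for-byte
# with cmp. Returns non-zero if either the copy or the comparison fails.
backup_and_verify() {
  local src=$1 bak=$2
  dd if="$src" of="$bak" bs=4k 2>/dev/null || return 1
  cmp -s "$src" "$bak"
}

# e.g.: backup_and_verify /ocfs/clusterware/votingdisk /home/oracle/rman/voting_disk.bak
```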

3. Restore the voting disk:
 dd if=/home/oracle/rman/voting_disk.bak of=/ocfs/clusterware/votingdisk
20000+0 records in
20000+0 records out
10240000 bytes (10 MB) copied, 0.590575 seconds, 17.3 MB/s

After this restore test the node rebooted automatically. Cause: the restore was run while the cluster was healthy and CSS was still using the voting disk, rather than after an actual voting-disk failure. In a real recovery, stop CRS on all nodes before restoring with dd.

4. Examine the voting disk contents:
strings voting_disk.bak
SslcLlik
SslcLlik
SslcLlik
SslcLlik
SslcLlik
SslcLlik
SslcLlik
SslcLlik
SslcLlik

OCR backup and restore test:
Because the OCR contents are so important, Oracle backs them up automatically every 4 hours, keeping the last 3 backups plus the final backup of the previous day and of the previous week. The backup is performed by the CRSD process on the master node; the default location is $CRS_HOME/cdata/<cluster name> (here /home/oracle/product/10.2.0/crs_1/cdata/crs), and it can be moved to a new directory with ocrconfig -backuploc. The backup file names rotate to reflect backup order, with the most recent always named backup00.ocr. Besides the local copies, the DBA should also keep a copy on other storage to guard against storage failure.
Oracle recommends backing up OCR before cluster changes such as adding or removing nodes or changing RAC IPs; ocrconfig -export writes a logical backup to a named file. After a replace or restore operation, Oracle recommends a full check with cluvfy comp ocr -n all. Backup and restore of OCR use the ocrconfig command.
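The commands just mentioned, collected in one place (10g syntax as used elsewhere in this article; run as root, and treat the backup-location path as an assumed placeholder):

```shell
ocrconfig -export /home/oracle/rman/ocr.exp -s online   # logical backup while CRS is up
ocrconfig -showbackup                                   # list the automatic physical backups
ocrconfig -backuploc /u02/ocrbackup                     # relocate automatic backups (path assumed)
cluvfy comp ocr -n all                                  # full check after a replace/restore
```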

1. List the OCR backups:
ls -l
total 13500
-rw-r--r-- 1 root root 4595712 May  7 19:50 backup00.ocr
-rw-r--r-- 1 root root 4595712 May  7 19:50 day.ocr
-rw-r--r-- 1 root root 4595712 May  7 19:50 week.ocr
rac1-> pwd
/home/oracle/product/10.2.0/crs_1/cdata/crs
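The article notes that a DBA should keep a copy of these automatic backups on separate storage. A minimal sketch (the helper function and destination are illustrative assumptions):

```shell
# Copy the rotating OCR backup files that CRSD maintains (backup00/01/02.ocr,
# day.ocr, week.ocr) from the automatic backup directory to safe storage.
# Fails if nothing was found to copy.
copy_ocr_backups() {
  local src_dir=$1 dst_dir=$2
  mkdir -p "$dst_dir" || return 1
  local f copied=0
  for f in "$src_dir"/backup0?.ocr "$src_dir"/day.ocr "$src_dir"/week.ocr; do
    [ -f "$f" ] && cp -p "$f" "$dst_dir/" && copied=$((copied + 1))
  done
  [ "$copied" -gt 0 ]
}

# e.g.: copy_ocr_backups /home/oracle/product/10.2.0/crs_1/cdata/crs /backup/ocr
```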

2. ocrconfig help:
ocrconfig -help
Name:
        ocrconfig - Configuration tool for Oracle Cluster Registry.

Synopsis:
        ocrconfig [option]
        option:
                -export <filename> [-s online]
                                                    - Export cluster register contents to a file
                -import <filename>                  - Import cluster registry contents from a file
                -upgrade [<user> [<group>]]
                                                    - Upgrade cluster registry from previous version
                -downgrade [-version <version string>]
                                                    - Downgrade cluster registry to the specified version
                -backuploc <dirname>                - Configure periodic backup location
                -showbackup                         - Show backup information
                -restore <filename>                 - Restore from physical backup
                -replace ocr|ocrmirror [<filename>] - Add/replace/remove a OCR device/file
                -overwrite                          - Overwrite OCR configuration on disk
                -repair ocr|ocrmirror <filename>    - Repair local OCR configuration
                -help                               - Print out this help information

Note:
        A log file will be created in
        $ORACLE_HOME/log/<hostname>/client/ocrconfig_<pid>.log. Please ensure
        you have file creation privileges in the above directory before
        running this tool.

3. Restore OCR
Stop CRS on both nodes first:
/etc/init.d/init.crs stop
ocrconfig -showbackup
Then restore from a physical backup with ocrconfig -restore:
[root@rac1 ~]# /home/oracle/product/10.2.0/crs_1/bin/ocrconfig -restore /home/oracle/product/10.2.0/crs_1/cdata/crs/backup00.ocr
[root@rac1 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     262144
         Used space (kbytes)      :       4304
         Available space (kbytes) :     257840
         ID                       : 1278044310
         Device/File Name         : /ocfs/clusterware/ocr
                                    Device/File integrity check succeeded

                                    Device/File not configured

         Cluster registry integrity check succeeded
Start CRS on both nodes.
Export the OCR contents:
[root@rac1 ~]# cd /home/oracle/product/10.2.0/crs_1/bin/
[root@rac1 bin]# ./ocrconfig -export /home/oracle/rman/ocr.exp

Check CRS status:
 ./crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy

Destroy the OCR contents:
dd if=/dev/zero of=/ocfs/clusterware/ocr bs=1024 count=102400
102400+0 records in
102400+0 records out
104857600 bytes (105 MB) copied, 15.1351 seconds, 6.9 MB/s

Check OCR consistency:
[root@rac1 bin]# ./ocrcheck
PROT-601: Failed to initialize ocrcheck

Check consistency with the CLUVFY tool:
/home/oracle/clusterware/cluvfy/runcluvfy.sh comp ocr -n all

Verifying OCR integrity
Unable to retrieve nodelist from Oracle clusterware.

Verification cannot proceed.

Restore the OCR contents with import:
./ocrconfig -import /home/oracle/rman/ocr.exp
[root@rac1 bin]#

Check OCR again:
./ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     262144
         Used space (kbytes)      :       4304
         Available space (kbytes) :     257840
         ID                       :   86171642
         Device/File Name         : /ocfs/clusterware/ocr
                                    Device/File integrity check succeeded

                                    Device/File not configured

         Cluster registry integrity check succeeded

Check with the CLUVFY tool:
 /home/oracle/clusterware/cluvfy/runcluvfy.sh comp ocr -n all

Verifying OCR integrity

Checking OCR integrity...

Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations.

Uniqueness check for OCR device passed.

Checking the version of OCR...
OCR of correct Version "2" exists.

Checking data integrity of OCR...
Data integrity check for OCR passed.

OCR integrity check passed.

Verification of OCR integrity was successful.

Verification passed.
 crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.....CRM.cs application    ONLINE    ONLINE    rac1        
ora....db1.srv application    ONLINE    ONLINE    rac1        
ora.devdb.db   application    ONLINE    ONLINE    rac2        
ora....b1.inst application    ONLINE    ONLINE    rac1        
ora....b2.inst application    ONLINE    ONLINE    rac2        
ora....SM1.asm application    ONLINE    ONLINE    rac1        
ora....C1.lsnr application    ONLINE    ONLINE    rac1        
ora.rac1.gsd   application    ONLINE    ONLINE    rac1        
ora.rac1.ons   application    ONLINE    ONLINE    rac1        
ora.rac1.vip   application    ONLINE    ONLINE    rac1        
ora....SM2.asm application    ONLINE    ONLINE    rac2        
ora....C2.lsnr application    ONLINE    ONLINE    rac2        
ora.rac2.gsd   application    ONLINE    ONLINE    rac2        
ora.rac2.ons   application    ONLINE    ONLINE    rac2        
ora.rac2.vip   application    ONLINE    ONLINE    rac2        

A physical backup can also be taken with dd:
dd if=/ocfs/clusterware/ocr of=/home/oracle/rman/ocr.bak
204800+0 records in
204800+0 records out
104857600 bytes (105 MB) copied, 40.2978 seconds, 2.6 MB/s
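Repeated dd backups like the one above overwrite the same file; a timestamped name keeps a history instead (illustrative sketch; the helper and paths are assumptions, not from the article):

```shell
# Write a physical backup of SRC into DIR under a timestamped name,
# then print the path of the file that was written.
timestamped_backup() {
  local src=$1 dir=$2
  local out="$dir/$(basename "$src")_$(date +%Y%m%d_%H%M%S).bak"
  dd if="$src" of="$out" bs=4k 2>/dev/null && echo "$out"
}

# e.g.: timestamped_backup /ocfs/clusterware/ocr /home/oracle/rman
```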


That covers backup and restore; now for the rebuild:
1. Take a backup of OCR and the voting disk first:
 dd if=/ocfs/clusterware/votingdisk of=/home/oracle/rman/voting_disk.bak
20000+0 records in
20000+0 records out
10240000 bytes (10 MB) copied, 6.65133 seconds, 1.5 MB/s

2. Stop the database-related services:
rac1-> srvctl stop instance -d devdb -i devdb1
rac1-> srvctl stop instance -d devdb -i devdb2
rac1-> srvctl stop asm -n rac1
rac1-> srvctl stop asm -n rac2
rac1-> srvctl stop nodeapps -n rac1
rac1-> srvctl stop nodeapps -n rac2

3. Stop CRS on all nodes:
Node 1:
[root@rac1 bin]# ./crsctl stop crs
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.
Node 2:
 ./crsctl stop crs
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.

4. Back up the Clusterware home on each node:
 cp -rf crs_1 crs_1_bak

5. Run $ORA_CRS_HOME/install/rootdelete.sh on all nodes:
pwd
/home/oracle/product/10.2.0/crs_1/install
[root@rac1 install]# ls
cluster.ini         install.excl  paramfile.crs  rootaddnode.sbs   rootdeletenode.sh  rootlocaladd
cmdllroot.sh        install.incl  preupdate.sh   rootconfig        rootdelete.sh      rootupgrade
envVars.properties  make.log      readme.txt     rootdeinstall.sh  rootinstall        templocal
[root@rac1 install]# ./rootde
rootdeinstall.sh   rootdeletenode.sh  rootdelete.sh      
[root@rac1 install]# ./rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
Stopping resources.
Error while stopping resources. Possible cause: CRSD is down.
Stopping CSSD.
Unable to communicate with the CSS daemon.
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'

6. On the node where the installation was originally performed, run $ORA_CRS_HOME/install/rootdeinstall.sh; remember, this script runs only on the installing node:
 ./rootdeinstall.sh

Removing contents from OCR device
2560+0 records in
2560+0 records out
10485760 bytes (10 MB) copied, 1.05943 seconds, 9.9 MB/s

7. Check that no CRS processes remain; if the commands return nothing, continue:
[root@rac1 install]# ps -elf | grep -i 'ocs[s]d'
[root@rac1 install]# ps -elf | grep -i 'cr[s]d.bin'
[root@rac1 install]# ps -elf | grep -i 'ev[m]d.bin'
[root@rac1 install]#
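The `ocs[s]d` / `cr[s]d.bin` patterns above use a small trick worth spelling out: the character class stops grep from matching its own process entry, because grep's command line contains the literal brackets, which the pattern does not match:

```shell
# A line describing the real daemon matches the pattern:
echo "root  2101  1  /u01/crs/bin/crsd.bin reboot" | grep -c 'cr[s]d.bin'

# grep's own ps entry contains the brackets literally, so it does NOT match
# (grep -c prints 0 and exits non-zero; || true keeps the script going):
echo "root  9999  1  grep -i cr[s]d.bin" | grep -c 'cr[s]d.bin' || true
```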

8. Run $ORA_CRS_HOME/root.sh on the installing node:
 ./root.sh
WARNING: directory '/home/oracle/product/10.2.0' is not owned by root
WARNING: directory '/home/oracle/product' is not owned by root
WARNING: directory '/home/oracle' is not owned by root
Checking to see if Oracle CRS stack is already configured

Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/home/oracle/product/10.2.0' is not owned by root
WARNING: directory '/home/oracle/product' is not owned by root
WARNING: directory '/home/oracle' is not owned by root
assigning default hostname rac1 for node 1.
assigning default hostname rac2 for node 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node :
node 1: rac1 rac1-priv rac1
node 2: rac2 rac2-priv rac2
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Now formatting voting device: /ocfs/clusterware/votingdisk
Format of 1 voting devices complete.
Startup will be queued to init within 90 seconds.
/etc/profile: line 57: ulimit: pipe size: cannot modify limit: Invalid argument
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
        rac1
CSS is inactive on these nodes.
        rac2
Local node checking complete.
Run root.sh on remaining nodes to start CRS daemons.

9. Run root.sh under $ORA_CRS_HOME on the other node as well.
At the end it reports:
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps
Error 0(Native: listNetInterfaces:[3])
  [Error 0(Native: listNetInterfaces:[3])]
Workaround:
[root@rac2 bin]# ./oifcfg getif
[root@rac2 bin]# ./oifcfg iflist
eth0  192.168.1.0
eth1  10.10.10.0
[root@rac2 bin]# ./oifcfg setif -global eth0/192.168.1.0:public
[root@rac2 bin]# ./oifcfg setif -global eth1/10.10.10.0:cluster_interconnect
You have new mail in /var/spool/mail/root
[root@rac2 bin]# ./oifcfg getif
eth0  192.168.1.0  global  public
eth1  10.10.10.0  global  cluster_interconnect
Then run VIPCA from a graphical session.
Continue once that configuration is complete.

10. Check CRS status:
 crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.rac1.gsd   application    ONLINE    ONLINE    rac1        
ora.rac1.ons   application    ONLINE    ONLINE    rac1        
ora.rac1.vip   application    ONLINE    ONLINE    rac1        
ora.rac2.gsd   application    ONLINE    ONLINE    rac2        
ora.rac2.ons   application    ONLINE    ONLINE    rac2        
ora.rac2.vip   application    ONLINE    ONLINE    rac2        

11. Configure the listeners (netca):
rac1-> mv /home/oracle/product/10.2.0/db_1/network/admin/listener.ora /tmp/listener.ora.bak
rac2-> mv /home/oracle/product/10.2.0/db_1/network/admin/listener.ora /tmp/listener.ora.bak
Run NETCA.
After the configuration completes, the listener services appear in the resource list:
rac1-> crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....C1.lsnr application    ONLINE    ONLINE    rac1        
ora.rac1.gsd   application    ONLINE    ONLINE    rac1        
ora.rac1.ons   application    ONLINE    ONLINE    rac1        
ora.rac1.vip   application    ONLINE    ONLINE    rac1        
ora....C2.lsnr application    ONLINE    ONLINE    rac2        
ora.rac2.gsd   application    ONLINE    ONLINE    rac2        
ora.rac2.ons   application    ONLINE    ONLINE    rac2        
ora.rac2.vip   application    ONLINE    ONLINE    rac2        

12. Add the other resources back into OCR:
rac1-> srvctl add asm -n rac1 -i +ASM1 -o /home/oracle/product/10.2.0/db_1
rac1-> srvctl add asm -n rac2 -i +ASM2 -o /home/oracle/product/10.2.0/db_1
rac1-> srvctl add database -d devdb -o /home/oracle/product/10.2.0/db_1
rac1-> srvctl add instance -d devdb -i devdb1 -n rac1
rac1-> srvctl add instance -d devdb -i devdb2 -n rac2
rac1-> srvctl add service -d devdb -s oltp -r devdb1,devdb2 -P BASIC

13. After that, check the services again:
 crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.devdb.db   application    OFFLINE   OFFLINE               
ora....b1.inst application    OFFLINE   OFFLINE               
ora....b2.inst application    OFFLINE   OFFLINE               
ora....oltp.cs application    OFFLINE   OFFLINE               
ora....db1.srv application    OFFLINE   OFFLINE               
ora....db2.srv application    OFFLINE   OFFLINE               
ora....SM1.asm application    OFFLINE   OFFLINE               
ora....C1.lsnr application    ONLINE    ONLINE    rac1        
ora.rac1.gsd   application    ONLINE    ONLINE    rac1        
ora.rac1.ons   application    ONLINE    ONLINE    rac1        
ora.rac1.vip   application    ONLINE    ONLINE    rac1        
ora....SM2.asm application    OFFLINE   OFFLINE               
ora....C2.lsnr application    ONLINE    ONLINE    rac2        
ora.rac2.gsd   application    ONLINE    ONLINE    rac2        
ora.rac2.ons   application    ONLINE    ONLINE    rac2        
ora.rac2.vip   application    ONLINE    ONLINE    rac2        

14. Start the resources:
rac1-> srvctl start asm -n rac1
rac1-> srvctl start asm -n rac2
rac1-> srvctl start database -d devdb
rac1-> srvctl start service -d devdb

15. Check again:
rac1-> crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.devdb.db   application    ONLINE    ONLINE    rac1        
ora....b1.inst application    ONLINE    ONLINE    rac1        
ora....b2.inst application    ONLINE    ONLINE    rac2        
ora....oltp.cs application    ONLINE    ONLINE    rac1        
ora....db1.srv application    ONLINE    ONLINE    rac1        
ora....db2.srv application    ONLINE    ONLINE    rac2        
ora....SM1.asm application    ONLINE    ONLINE    rac1        
ora....C1.lsnr application    ONLINE    ONLINE    rac1        
ora.rac1.gsd   application    ONLINE    ONLINE    rac1        
ora.rac1.ons   application    ONLINE    ONLINE    rac1        
ora.rac1.vip   application    ONLINE    ONLINE    rac1        
ora....SM2.asm application    ONLINE    ONLINE    rac2        
ora....C2.lsnr application    ONLINE    ONLINE    rac2        
ora.rac2.gsd   application    ONLINE    ONLINE    rac2        
ora.rac2.ons   application    ONLINE    ONLINE    rac2        
ora.rac2.vip   application    ONLINE    ONLINE    rac2  

16. Run an environment check:
 cluvfy stage -post crsinst -n rac1,rac2

Performing post-checks for cluster services setup

Checking node reachability...
Node reachability check passed from node "rac1".


Checking user equivalence...
User equivalence check passed for user "oracle".

Checking Cluster manager integrity...


Checking CSS daemon...
Daemon status check passed for "CSS daemon".

Cluster manager integrity check passed.

Checking cluster integrity...


Cluster integrity check passed


Checking OCR integrity...

Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations.

Uniqueness check for OCR device passed.

Checking the version of OCR...
OCR of correct Version "2" exists.

Checking data integrity of OCR...
Data integrity check for OCR passed.

OCR integrity check passed.

Checking CRS integrity...

Checking daemon liveness...
Liveness check passed for "CRS daemon".

Checking daemon liveness...
Liveness check passed for "CSS daemon".

Checking daemon liveness...
Liveness check passed for "EVM daemon".

Checking CRS health...
CRS health check passed.

CRS integrity check passed.

Checking node application existence...


Checking existence of VIP node application (required)
Check passed.

Checking existence of ONS node application (optional)
Check passed.

Checking existence of GSD node application (optional)
Check passed.


Post-check for cluster services setup was successful.


Rebuild complete.

This article is adapted from http://blog.csdn.net/tianlesoftware/article/details/6050606 and the book 《大话RAC》. The OCR and voting disk management tests above (backup, restore, and rebuild) all passed. Thanks also to 21大湿 for the guidance!