A Maintenance Guide to Cluster Configuration Changes in a RAC Environment

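Original · Linux Operating Systems · Author: willkk88 · Date: 2011-07-06 20:45:34

While maintaining a RAC cluster, events such as a data-center relocation, a network redesign, or a storage change often make it necessary to change the cluster's public IPs and VIPs, or to change the CRS OCR and voting disk. The procedures below are drawn from hands-on maintenance work and were verified by testing; they are meant as a reference for anyone who runs into similar operations.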

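Environment used in this article:

Operating system: Oracle Enterprise Linux 5

Oracle version: 10.2.0.4

Oracle architecture: RAC + ASM

I. Changing the public, private, and VIP addresses in a RAC environment

1) Target configuration

The hosts configuration before the change: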

127.0.0.1               localhost

138.30.0.101            rac1

138.30.0.102            rac2

138.30.0.201            rac1-vip

138.30.0.202            rac2-vip

100.100.1.1             rac1-priv

100.100.1.2             rac2-priv

The target configuration:

192.168.0.128            rac1

192.168.0.129            rac2

192.168.0.201            rac1-vip

192.168.0.202            rac2-vip

10.10.1.1                 rac1-priv

10.10.1.2                 rac2-priv

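2) Stop the CRS-managed resources

Stop every resource managed by CRS, including the instances (database and ASM), services, and applications.

The current CRS status is as follows: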

[oracle@RAC1 ~]$ crs_stat -t

Name           Type           Target    State     Host        

------------------------------------------------------------

ora....SM1.asm application    ONLINE    ONLINE    rac1        

ora....C1.lsnr application    ONLINE    ONLINE    rac1        

ora.rac1.gsd   application    ONLINE    ONLINE    rac1        

ora.rac1.ons   application    ONLINE    ONLINE    rac1        

ora.rac1.vip   application    ONLINE    ONLINE    rac1        

ora....SM2.asm application    ONLINE    ONLINE    rac2        

ora....C2.lsnr application    ONLINE    ONLINE    rac2        

ora.rac2.gsd   application    ONLINE    ONLINE    rac2        

ora.rac2.ons   application    ONLINE    ONLINE    rac2        

ora.rac2.vip   application    ONLINE    ONLINE    rac2        

ora.racdb.db   application    ONLINE    ONLINE    rac2        

ora....b1.inst application    ONLINE    ONLINE    rac1        

ora....b2.inst application    ONLINE    ONLINE    rac2

Running crs_stop -all is the quickest way to stop every CRS resource:

[oracle@RAC1 ~]$ crs_stop -all

Attempting to stop `ora.rac1.gsd` on member `rac1`

Attempting to stop `ora.rac1.ons` on member `rac1`

Attempting to stop `ora.rac2.gsd` on member `rac2`

Attempting to stop `ora.rac2.ons` on member `rac2`

Attempting to stop `ora.racdb.db` on member `rac2`

Stop of `ora.rac1.gsd` on member `rac1` succeeded.

Stop of `ora.rac1.ons` on member `rac1` succeeded.

Stop of `ora.rac2.gsd` on member `rac2` succeeded.

Stop of `ora.rac2.ons` on member `rac2` succeeded.

Stop of `ora.racdb.db` on member `rac2` succeeded.

`ora.racdb.racdb1.inst` is already OFFLINE.

`ora.racdb.racdb2.inst` is already OFFLINE.

Attempting to stop `ora.rac1.ASM1.asm` on member `rac1`

Attempting to stop `ora.rac2.ASM2.asm` on member `rac2`

Attempting to stop `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1`

Attempting to stop `ora.rac2.LISTENER_RAC2.lsnr` on member `rac2`

Stop of `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1` succeeded.

Attempting to stop `ora.rac1.vip` on member `rac1`

Stop of `ora.rac1.vip` on member `rac1` succeeded.

Stop of `ora.rac2.ASM2.asm` on member `rac2` succeeded.

Stop of `ora.rac2.LISTENER_RAC2.lsnr` on member `rac2` succeeded.

Attempting to stop `ora.rac2.vip` on member `rac2`

Stop of `ora.rac1.ASM1.asm` on member `rac1` succeeded.

Stop of `ora.rac2.vip` on member `rac2` succeeded.

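Alternatively, stop the resources one at a time with srvctl:

srvctl stop database -d racdb     # stop the database

srvctl stop asm -n rac1           # stop the ASM instance on rac1

srvctl stop asm -n rac2           # stop the ASM instance on rac2

srvctl stop nodeapps -n rac1      # stop the node applications on rac1

srvctl stop nodeapps -n rac2      # stop the node applications on rac2

Run crs_stat -t to confirm that every resource is OFFLINE.

Then stop CRS on each of the two nodes as root: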

[root@RAC1 bin]# ./crsctl stop crs

[root@RAC2 bin]# ./crsctl stop crs
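The shutdown order above can be written down as a small dry-run helper. This is only a sketch: print_stop_plan is a hypothetical function name, it takes the database name followed by the node names as arguments, and it prints the commands instead of running them so the sequence can be reviewed first.

```shell
#!/bin/sh
# Dry-run sketch: print the shutdown commands in the order used above.
# Nothing is executed; print_stop_plan is a hypothetical helper name.
print_stop_plan() {
    db="$1"
    shift
    echo "srvctl stop database -d ${db}"
    for node in "$@"; do
        echo "srvctl stop asm -n ${node}"
    done
    for node in "$@"; do
        echo "srvctl stop nodeapps -n ${node}"
    done
    for node in "$@"; do
        echo "crsctl stop crs    # as root on ${node}"
    done
}

# Example: print_stop_plan racdb rac1 rac2
```

Printing rather than executing keeps the helper safe to try on any machine; the commands themselves are the ones shown in this step.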

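3) Change the IP configuration at the operating-system level

Update /etc/hosts at the operating-system level; the entries must be consistent on both nodes.

Change the IP addresses on each of the two nodes; the network interfaces used in this article are eth0 and eth1.

The addresses can also be changed by editing the interface configuration files under /etc/sysconfig/network-scripts/:

-rw-r--r-- 4 root root   205 Mar  5 13:41 ifcfg-eth1

-rw-r--r-- 4 root root   229 Mar  5 13:41 ifcfg-eth0

After the change, re-activate eth0 and eth1 on each node.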
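To keep /etc/hosts consistent across both nodes, the new entries can be generated once and reviewed before being copied into place. The sketch below assumes the host names and target addresses listed in step 1); rewrite_hosts is a hypothetical helper that reads a hosts file and prints the remapped copy to standard output without touching the original.

```shell
#!/bin/sh
# Sketch: print a copy of a hosts file with the cluster names remapped to the
# target addresses from step 1). rewrite_hosts is a hypothetical helper; it
# reads the file named in $1 and writes to stdout only.
rewrite_hosts() {
    awk '
        BEGIN {
            new["rac1"]      = "192.168.0.128"
            new["rac2"]      = "192.168.0.129"
            new["rac1-vip"]  = "192.168.0.201"
            new["rac2-vip"]  = "192.168.0.202"
            new["rac1-priv"] = "10.10.1.1"
            new["rac2-priv"] = "10.10.1.2"
        }
        /^[ \t]*#/  { print; next }                 # keep comment lines as-is
        ($2 in new) { $1 = new[$2]; print; next }   # swap in the new address
        { print }                                   # pass everything else through
    ' "$1"
}

# Example: rewrite_hosts /etc/hosts > /tmp/hosts.new
```

Rewritten lines come out with single-space separators because awk rebuilds the record; lines that do not name a cluster host are left exactly as they were.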

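4) Start CRS

Start CRS with crsctl. Because the CRS-managed resources start along with CRS, stop them again as described in step 2), but leave the CRS daemons running: the IP changes in the following steps require CRS to be up.

Alternatively, before starting CRS, use srvctl to keep the database, instances, and services from starting with CRS:

-- Disable automatic startup of the database
srvctl disable database -d racdb
-- Disable automatic startup of individual instances
srvctl disable instance -d racdb -i racdb1
srvctl disable instance -d racdb -i racdb2

-- Disable a service on an instance

srvctl disable service -d racdb -s service_name -i racdb1
-- Show the configuration details
srvctl config database -d racdb -a

The -a option lists detailed information, for example: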

[root@RAC1 bin]# ./srvctl config database -d racdb

rac1 racdb1 /home/oracle/10gR2/db

rac2 racdb2 /home/oracle/10gR2/db

[root@RAC1 bin]# ./srvctl config database -d racdb -a

rac1 racdb1 /home/oracle/10gR2/db

rac2 racdb2 /home/oracle/10gR2/db

DB_NAME: null

ORACLE_HOME: /home/oracle/10gR2/db

SPFILE: null

DOMAIN: null

DB_ROLE: null

START_OPTIONS: null

POLICY:  AUTOMATIC

ENABLE FLAG: DB ENABLED

After automatic startup of the database has been disabled with srvctl, the output changes as follows:

[root@RAC1 bin]# ./srvctl disable database -d racdb

[root@RAC1 bin]# ./srvctl config database -d racdb -a

rac1 racdb1 /home/oracle/10gR2/db

rac2 racdb2 /home/oracle/10gR2/db

DB_NAME: null

ORACLE_HOME: /home/oracle/10gR2/db

SPFILE: null

DOMAIN: null

DB_ROLE: null

START_OPTIONS: null

POLICY:  MANUAL

ENABLE FLAG: DB DISABLED, INST DISABLED ON racdb1 racdb2

 

Verification:

In the crs_stat -t output every resource should be OFFLINE, and crsctl check crs should report:

CSS appears healthy

CRS appears healthy

EVM appears healthy
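The "every resource is OFFLINE" check can be scripted instead of read by eye. The sketch below assumes the crs_stat -t column layout shown in this article (State is the fourth column); all_in_state is a hypothetical helper that reads that output on standard input and succeeds only if every resource row is in the requested state.

```shell
#!/bin/sh
# Sketch: read `crs_stat -t` output on stdin and succeed only if every
# resource row shows the wanted State (4th column).
# all_in_state is a hypothetical helper name.
all_in_state() {
    awk -v want="$1" '
        NF == 0 || $1 == "Name" || /^-/ { next }   # skip blanks, header, rule
        { rows++ }
        $4 != want { bad = 1 }
        END { if (bad || rows == 0) exit 1 }       # no rows counts as failure
    '
}

# Example (on a cluster node):
#   crs_stat -t | all_in_state OFFLINE && echo "all resources are offline"
```

The same helper can be called with ONLINE after the final restart to confirm that everything came back.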

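5) Change the IP configuration in CRS

The configuration is changed with the oifcfg utility, which defines and modifies the network interface attributes that Oracle Clusterware needs, including the interface's subnet, netmask, and interface type.

oifcfg identifies an interface with a specification of the form:

interface_name/subnet:interface_type
An interface can be configured in one of two ways, global or node-specific. Global means the configuration is the same on every node of the cluster, that is, the nodes are symmetric; node-specific means this node's configuration differs from that of the other nodes, that is, it is asymmetric.
iflist: list the network interfaces
getif: show the configuration of an interface
setif: configure an interface
delif: delete an interface configuration
-- Show the interfaces of type public
[root@rac1 bin]# ./oifcfg getif -type public

-- Delete the interface configuration
[root@rac1 bin]# ./oifcfg delif -global

-- Add the interface configuration
[root@rac1 bin]# ./oifcfg setif -global eth0/192.168.0.0:public
[root@rac1 bin]# ./oifcfg setif -global eth1/10.10.1.0:cluster_interconnect

Note: the address ends in 0 because it identifies a subnet (the network address), not a single host.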
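The subnet value passed to oifcfg setif is the host address ANDed with its netmask. A minimal sketch of that calculation in plain sh follows; network_address is a hypothetical helper name, and the netmasks in the examples are assumptions chosen to match the addresses in this article.

```shell
#!/bin/sh
# Sketch: derive the subnet (network address) that oifcfg setif expects
# from a host IP and its netmask. network_address is a hypothetical helper.
network_address() {
    old_ifs=$IFS
    IFS=.
    # Split both dotted quads into eight positional parameters.
    set -- $1 $2
    IFS=$old_ifs
    # Bitwise AND of each address octet with the matching mask octet.
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

# Example: network_address 192.168.0.128 255.255.255.0
```

For the target addresses used here with a 255.255.255.0 mask, this yields the 192.168.0.0 and 10.10.1.0 subnets used in the setif commands.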



The steps for changing the IP configuration in CRS are as follows:

-- Show the current configuration

./oifcfg getif -global

eth0  138.30.0.0  global  public

eth1  100.100.1.0  global  cluster_interconnect

-- Delete the current configuration

./oifcfg delif -global eth0

./oifcfg delif -global eth1

-- Add the new public and private subnets

./oifcfg setif -global eth0/192.168.0.0:public eth1/10.10.1.0:cluster_interconnect

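-- List the interfaces and subnets after the change (iflist reports what the operating system sees; oifcfg getif shows what is stored for the cluster)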

./oifcfg iflist 

eth0  192.168.0.0

eth1  10.10.1.0

-- Set the new VIP addresses

./srvctl modify nodeapps -n rac1 -A 192.168.0.201/255.255.255.0/eth0

./srvctl modify nodeapps -n rac2 -A 192.168.0.202/255.255.255.0/eth0

The steps above need to be run on one node only.
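After the setif commands, the stored configuration can be checked mechanically. The sketch below assumes the four-column oifcfg getif output format shown earlier in this step (interface, subnet, scope, type); has_global_interface is a hypothetical helper that reads that output on standard input.

```shell
#!/bin/sh
# Sketch: read `oifcfg getif` output on stdin and succeed if the given
# interface is registered globally with the expected subnet and type.
# has_global_interface is a hypothetical helper name.
has_global_interface() {
    awk -v ifname="$1" -v subnet="$2" -v iftype="$3" '
        $1 == ifname && $2 == subnet && $3 == "global" && $4 == iftype { found = 1 }
        END { if (!found) exit 1 }
    '
}

# Example (on a cluster node):
#   ./oifcfg getif -global | has_global_interface eth0 192.168.0.0 public
```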

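6) Update the listener files

Update every IP address referenced in the listener configuration files.

7) Restart CRS and verify the result

If automatic startup was disabled with srvctl in step 4), re-enable it first with the matching srvctl enable commands. Then restart CRS with crsctl and run crs_stat -t to check that every resource is ONLINE, as shown below: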

[root@RAC1 bin]# ./crs_stat -t

Name           Type           Target    State     Host        

------------------------------------------------------------

ora....SM1.asm application    ONLINE    ONLINE    rac1        

ora....C1.lsnr application    ONLINE    ONLINE    rac1        

ora.rac1.gsd   application    ONLINE    ONLINE    rac1        

ora.rac1.ons   application    ONLINE    ONLINE    rac1        

ora.rac1.vip   application    ONLINE    ONLINE    rac1        

ora....SM2.asm application    ONLINE    ONLINE    rac2        

ora....C2.lsnr application    ONLINE    ONLINE    rac2        

ora.rac2.gsd   application    ONLINE    ONLINE    rac2        

ora.rac2.ons   application    ONLINE    ONLINE    rac2        

ora.rac2.vip   application    ONLINE    ONLINE    rac2        

ora.racdb.db   application    ONLINE    ONLINE    rac1        

ora....b1.inst application    ONLINE    ONLINE    rac1        

ora....b2.inst application    ONLINE    ONLINE    rac2

 

If the output matches the above, the change is complete.

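II. Repairing a damaged CRS OCR and voting disk

Oracle Clusterware relies on two storage components: the voting disk and the OCR. The voting disk records node membership, that is, which nodes belong to the RAC cluster; the record is updated whenever a node is added or removed. The voting disk must reside on shared storage, typically a raw device. To protect it, multiple voting disks can be configured. Clusterware applies a majority rule: when several voting disks exist, more than half of them must be accessible at the same time for Clusterware to operate, and if that majority is lost the cluster goes down immediately. Oracle therefore recommends an odd number of voting disks, such as 1, 3, or 5. Each voting disk is about 20 MB in size.
The OCR (Oracle Cluster Registry) records the configuration of the CRS resources, such as databases, ASM, instances, listeners, and VIPs. Everything the CRS daemons manage comes from the OCR, which stores this configuration as a directory tree of key-value pairs. The OCR is about 100 MB in size.
The information held in the voting disk and the OCR is critical: if either is lost or damaged, Clusterware cannot start, and neither can the RAC database as a whole. Both therefore need to be backed up thoroughly. This article describes how to recover from a damaged OCR and voting disk when no backup exists.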
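The majority rule is easy to tabulate. The following sketch computes, for a given number of voting disks, how many must remain accessible and how many can be lost; both function names are hypothetical.

```shell
#!/bin/sh
# Sketch of the majority rule described above: with n voting disks,
# strictly more than half must stay accessible.
votedisk_quorum() {
    echo $(( $1 / 2 + 1 ))            # minimum number of accessible disks
}

votedisk_tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))     # disks that may fail before quorum is lost
}

# Example: votedisk_quorum 3; votedisk_tolerated_failures 3
```

Under this rule, 3 disks and 4 disks both tolerate a single failure, which is why an even count adds no extra protection over the odd count below it.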

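1) Prepare the test environment

Wipe the contents of the CRS OCR and voting disk, and clean up the related CRS configuration.

With the CRS resources and CRS itself stopped, run the rootdelete.sh and rootdeinstall.sh scripts in the install directory under CRS_HOME on both nodes, then use dd to zero out the raw devices that hold the OCR and the voting disk:

dd if=/dev/zero of=/dev/raw/raw2 bs=1M count=100

dd if=/dev/zero of=/dev/raw/raw1 bs=1M count=100

Note: never run these commands against a production system.

The raw devices used by the CRS OCR and voting disk can be identified as follows: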

[root@RAC1 bin]# ./ocrcheck

Status of Oracle Cluster Registry is as follows :

         Version                  :          2

         Total space (kbytes)     :     524024

         Used space (kbytes)      :       3948

         Available space (kbytes) :     520076

         ID                       :  566773499

         Device/File Name         : /dev/raw/raw1

                                    Device/File integrity check succeeded

                                    Device/File not configured

         Cluster registry integrity check succeeded

[root@RAC1 bin]# ./crsctl query css votedisk

 0.     0    /dev/raw/raw2

located 1 votedisk(s).
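The device paths in the two reports above can be pulled out with a couple of awk filters, for example to record them before maintenance. This is a sketch that assumes the output layouts shown here; votedisk_paths and ocr_paths are hypothetical helper names, and both read the command output on standard input.

```shell
#!/bin/sh
# Sketch: extract device paths from the two reports shown above.
# Both helpers read the command output on stdin; their names are hypothetical.
votedisk_paths() {
    # `crsctl query css votedisk` rows look like: " 0.     0    /dev/raw/raw2"
    awk '$1 ~ /^[0-9]+\.$/ { print $3 }'
}

ocr_paths() {
    # `ocrcheck` rows look like: "Device/File Name         : /dev/raw/raw1"
    awk -F: '/Device\/File Name/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Example (on a cluster node):
#   ./crsctl query css votedisk | votedisk_paths
#   ./ocrcheck | ocr_paths
```

The resulting paths could feed a backup step, such as copying the voting disk with dd or exporting the OCR with ocrconfig -export; check the Clusterware documentation for your release before relying on either.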

 

2) Rebuild the OCR and voting disk

Run root.sh from the CRS_HOME directory on node 1 and then on node 2. The output is as follows.

Node 1:

[root@RAC1 crs]# /home/oracle/10gR2/crs/root.sh

WARNING: directory '/home/oracle/10gR2' is not owned by root

WARNING: directory '/home/oracle' is not owned by root

Checking to see if Oracle CRS stack is already configured

 

Setting the permissions on OCR backup directory

Setting up NS directories

Oracle Cluster Registry configuration upgraded successfully

WARNING: directory '/home/oracle/10gR2' is not owned by root

WARNING: directory '/home/oracle' is not owned by root

Successfully accumulated necessary OCR keys.

Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.

node :

node 1: rac1 rac1-priv rac1

node 2: rac2 rac2-priv rac2

Creating OCR keys for user 'root', privgrp 'root'..

Operation successful.

Now formatting voting device: /dev/raw/raw2

Format of 1 voting devices complete.

Startup will be queued to init within 30 seconds.

Adding daemons to inittab

Expecting the CRS daemons to be up within 600 seconds.

CSS is active on these nodes.

        rac1

CSS is inactive on these nodes.

        rac2

Local node checking complete.

Run root.sh on remaining nodes to start CRS daemons.

Node 2:

[root@RAC2 ~]# /home/oracle/10gR2/crs/root.sh

WARNING: directory '/home/oracle/10gR2' is not owned by root

WARNING: directory '/home/oracle' is not owned by root

Checking to see if Oracle CRS stack is already configured

 

Setting the permissions on OCR backup directory

Setting up NS directories

Oracle Cluster Registry configuration upgraded successfully

WARNING: directory '/home/oracle/10gR2' is not owned by root

WARNING: directory '/home/oracle' is not owned by root

clscfg: EXISTING configuration version 3 detected.

clscfg: version 3 is 10G Release 2.

Successfully accumulated necessary OCR keys.

Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.

node :

node 1: rac1 rac1-priv rac1

node 2: rac2 rac2-priv rac2

clscfg: Arguments check out successfully.

 

NO KEYS WERE WRITTEN. Supply -force parameter to override.

-force is destructive and will destroy any previous cluster

configuration.

Oracle Cluster Registry for cluster has already been initialized

Startup will be queued to init within 30 seconds.

Adding daemons to inittab

Expecting the CRS daemons to be up within 600 seconds.

CSS is active on these nodes.

        rac1

        rac2

CSS is active on all nodes.

Waiting for the Oracle CRSD and EVMD to start

Oracle CRS stack installed and running under init(1M)

Running vipca(silent) for configuring nodeapps

 

Creating VIP application resource on (2) nodes...

Creating GSD application resource on (2) nodes...

Creating ONS application resource on (2) nodes...

Starting VIP application resource on (2) nodes...

Starting GSD application resource on (2) nodes...

Starting ONS application resource on (2) nodes...

 

 

Done.

 

Once root.sh completes, CRS starts automatically, and crs_stat -t shows:

[root@RAC2 bin]# ./crs_stat -t

Name           Type           Target    State     Host        

------------------------------------------------------------

ora.rac1.gsd   application    ONLINE    ONLINE    rac1        

ora.rac1.ons   application    ONLINE    ONLINE    rac1        

ora.rac1.vip   application    ONLINE    ONLINE    rac1        

ora.rac2.gsd   application    ONLINE    ONLINE    rac2        

ora.rac2.ons   application    ONLINE    ONLINE    rac2        

ora.rac2.vip   application    ONLINE    ONLINE    rac2

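Only the gsd, ons, and vip application resources have been created.

3) Register the remaining resources with the cluster

Use srvctl to register the remaining resources with the cluster.

-- Register the database and its instances with CRS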

srvctl add database -d racdb -o /home/oracle/10gR2/db

srvctl add instance -d racdb -i racdb1 -n rac1

srvctl add instance -d racdb -i racdb2 -n rac2

-- Register the ASM instances with CRS

srvctl add asm -n rac1 -i +ASM1 -o /home/oracle/10gR2/db

srvctl add asm -n rac2 -i +ASM2 -o /home/oracle/10gR2/db

-- Set the dependency of each database instance on its ASM instance

[oracle@RAC1 ~]$ srvctl modify instance -d racdb -i racdb1 -s +ASM1

[oracle@RAC1 ~]$ srvctl modify instance -d racdb -i racdb2 -s +ASM2

-- Add a service to CRS

srvctl add service -d racdb -s service_name -r racdb1 -a racdb2 -P BASIC

Options:

-s: service name
-r: preferred instance
-a: available (failover) instance
-P: TAF policy; one of NONE (the default), BASIC, or PRECONNECT
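For clusters with more nodes, the database, instance, and ASM registration commands above follow a regular pattern. The dry-run sketch below prints them for a list of nodes; it assumes the naming used in this article (instances racdb1, racdb2 and ASM instances +ASM1, +ASM2, numbered in node order) and print_register_plan is a hypothetical helper name. Nothing is executed.

```shell
#!/bin/sh
# Dry-run sketch: print the srvctl registration commands for a database whose
# instances follow the <db>N / +ASMN naming used in this article.
# Nothing is executed; print_register_plan is a hypothetical helper name.
print_register_plan() {
    db="$1"
    home="$2"
    shift 2
    echo "srvctl add database -d ${db} -o ${home}"
    n=0
    for node in "$@"; do
        n=$(( n + 1 ))
        echo "srvctl add instance -d ${db} -i ${db}${n} -n ${node}"
        echo "srvctl add asm -n ${node} -i +ASM${n} -o ${home}"
        echo "srvctl modify instance -d ${db} -i ${db}${n} -s +ASM${n}"
    done
}

# Example: print_register_plan racdb /home/oracle/10gR2/db rac1 rac2
```

The per-node grouping differs slightly from the order shown above, but each modify instance command still comes after the ASM instance it depends on has been added.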

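4) Register the node listeners with the cluster

On each of the two nodes, create a listener .cap file in the crs/public directory under $ORACLE_HOME:

vi  $ORACLE_HOME/crs/public/ora.rac1.LISTENER_RAC1.lsnr.cap

Add the following content (node 1 is shown as the example):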

NAME=ora.rac1.LISTENER_RAC1.lsnr

TYPE=application

ACTION_SCRIPT=/home/oracle/10gR2/db/bin/racgwrap

CHECK_INTERVAL=600

ACTIVE_PLACEMENT=1

DESCRIPTION=CRS application for listener on node

HOSTING_MEMBERS=rac1

PLACEMENT=favored

REQUIRED_RESOURCES=ora.rac1.vip

Then register the listener with the cluster using crs_register:

crs_register ora.rac1.LISTENER_RAC1.lsnr
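Since only node 1 is shown, the same file for another node can be produced from a small template. This sketch assumes the listener is named LISTENER_ followed by the node name in upper case, as in this article; render_listener_cap is a hypothetical helper that prints the .cap content for a given node and Oracle home.

```shell
#!/bin/sh
# Sketch: print the listener .cap content for any node, following the node 1
# example above. render_listener_cap is a hypothetical helper; it assumes the
# listener is named LISTENER_<NODE IN UPPER CASE>, as in this article.
render_listener_cap() {
    node="$1"
    oracle_home="$2"
    upper=$(printf '%s' "$node" | tr '[:lower:]' '[:upper:]')
    cat <<EOF
NAME=ora.${node}.LISTENER_${upper}.lsnr
TYPE=application
ACTION_SCRIPT=${oracle_home}/bin/racgwrap
CHECK_INTERVAL=600
ACTIVE_PLACEMENT=1
DESCRIPTION=CRS application for listener on node
HOSTING_MEMBERS=${node}
PLACEMENT=favored
REQUIRED_RESOURCES=ora.${node}.vip
EOF
}

# Example: render_listener_cap rac2 /home/oracle/10gR2/db
```

The output can be redirected into the matching file under $ORACLE_HOME/crs/public on that node before running crs_register there.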

Finally, start the resources:

[oracle@RAC2 ~]$ crs_start -all

5) Verify the result

Check with crs_stat -t: if every resource that was added is ONLINE, the OCR and voting disk have been repaired.

[root@RAC1 bin]# ./crs_stat -t

Name           Type           Target    State     Host        

------------------------------------------------------------

ora....SM1.asm application    ONLINE    ONLINE    rac1        

ora....C1.lsnr application    ONLINE    ONLINE    rac1        

ora.rac1.gsd   application    ONLINE    ONLINE    rac1        

ora.rac1.ons   application    ONLINE    ONLINE    rac1        

ora.rac1.vip   application    ONLINE    ONLINE    rac1        

ora....SM2.asm application    ONLINE    ONLINE    rac2        

ora....C2.lsnr application    ONLINE    ONLINE    rac2        

ora.rac2.gsd   application    ONLINE    ONLINE    rac2        

ora.rac2.ons   application    ONLINE    ONLINE    rac2        

ora.rac2.vip   application    ONLINE    ONLINE    rac2        

ora.racdb.db   application    ONLINE    ONLINE    rac1        

ora....b1.inst application    ONLINE    ONLINE    rac1        

ora....b2.inst application    ONLINE    ONLINE    rac2

Source: ITPUB Blog, http://blog.itpub.net/24145320/viewspace-701552/. Please credit the source when republishing; otherwise legal action may be pursued.
