
[Repost] Reinstalling a crashed Oracle RAC node

Category: Linux | Author: andyxu | Posted: 2009-12-10 14:47:26

On Linux, one node of an Oracle RAC cluster has crashed while the other node is still running normally. How do I reinstall the crashed node? Any advice would be appreciated!

I have already reinstalled the OS on the crashed node, but how do I restore it to the RAC cluster? Should I delete node1's information from the cluster via node2 and then add it back as a new node, or can it be restored directly without deleting the old node information?
What are the exact steps? Many thanks in advance!

node1: crashed
node2: healthy

After reinstalling node1, the hostname is still node1. Can it be added back directly, without deleting the old node information first?
Could someone walk through the exact steps?

 

First clean out all of node1's information from node2, then add node1 back.
Steps:
1. On node2, run DBCA and delete node1's instance;
2. If ASM is in use, remove the ASM instance: srvctl remove asm -n node1;
3. On node2, run the updateNodeList script for the database home:
runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES=node2";
4. On node2, run the rootdeletenode.sh script:
$CRS_HOME/install/rootdeletenode.sh node1;
5. On node2, run the updateNodeList script again to update the CRS inventory:
runInstaller -updateNodeList ORACLE_HOME=$CRS_HOME "CLUSTER_NODES=node2";
6. Verify that the node was removed successfully:
cluvfy comp crs -n all
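The deletion steps above can be collected into one dry-run script that only prints the commands, so you can review them before running anything on node2. The CRS home path is an assumption; adjust it to your environment.

```shell
#!/bin/sh
# Dry-run: print the node1-removal commands to run on node2. Nothing is executed.
CRS_HOME=/u01/app/crs   # assumption: adjust to your CRS home
FAILED=node1
REMAINING=node2

STEP2="srvctl remove asm -n $FAILED"
STEP3="runInstaller -updateNodeList ORACLE_HOME=\$ORACLE_HOME \"CLUSTER_NODES=$REMAINING\""
STEP4="$CRS_HOME/install/rootdeletenode.sh $FAILED"
STEP5="runInstaller -updateNodeList ORACLE_HOME=$CRS_HOME \"CLUSTER_NODES=$REMAINING\""
STEP6="cluvfy comp crs -n all"

echo "dbca   # step 1: delete $FAILED's instance interactively"
for cmd in "$STEP2" "$STEP3" "$STEP4" "$STEP5" "$STEP6"; do
  echo "$cmd"
done
```

Printing the commands first makes it easy to spot a wrong node name or home path before anything destructive runs.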

To add the node back:

1 Install CRS
2 Add ONS
3 Install ASM
4 Configure the listener
5 Install the database software
6 Add a database instance on this node


On the new node1, configure everything exactly as on node2, including the environment settings.
1. On node2, as the oracle user, go to $CRS_HOME/oui/bin and run the addNode.sh script;
follow the prompts to add the node.
2. On node2, as the oracle user, go to $ORACLE_HOME/oui/bin and run the addNode.sh script;
follow the prompts to add the node.
3. Configure the listener:
on node1, run netca, select cluster database, and follow the prompts.
4. On node2, run DBCA to add the new instance:
choose ...Cluster database..., then Instance Management, then Add an instance, then ...
The rest is straightforward; just follow the prompts.
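The add-back sequence above can be sketched as a shell runbook. This is a dry run that only prints the commands for review; the node names and home paths are assumptions taken from this thread, so adjust them to your environment.

```shell
#!/bin/sh
# Dry-run runbook for adding node1 back (prints commands only; nothing is executed).
CRS_HOME=/u01/app/crs                         # assumption
ORACLE_HOME=/u01/app/oracle/product/db_1      # assumption
NEW_NODE=node1

# 1. From node2, extend the clusterware to the new node.
CMD_CRS="cd $CRS_HOME/oui/bin && ./addNode.sh"
# 2. From node2, extend the database software to the new node.
CMD_DB="cd $ORACLE_HOME/oui/bin && ./addNode.sh"

echo "$CMD_CRS"
echo "$CMD_DB"
echo "netca   # run on $NEW_NODE, choose the cluster configuration"
echo "dbca    # run on node2: Instance Management -> Add an instance"
```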

Note: adapt some of the commands above to your environment. I am only describing the method and cannot guarantee it matches your setup exactly; try a few times and think it through, and you will succeed. The overall idea is: first delete the instance, then ASM, then the database software, then the CRS information; adding the node back is the exact reverse.

This document is intended to provide the steps to be taken to remove a node from the Oracle cluster when the node itself is unavailable due to an OS or hardware issue that prevents it from starting up. The removed node can be added back after it is fixed.

The steps to remove a node from a cluster are already documented in the Oracle documentation at

Version Documentation Link
10gR2 http://download.oracle.com/docs/ ... elunix.htm#BEIFDCAF
11g http://download.oracle.com/docs/ ... erware.htm#BEIFDCAF


This note is different because the documentation covers the scenario where the node is accessible and the removal is a planned procedure. This note covers the scenario where the node cannot boot, so it is not possible to run the clusterware commands from that node.

Solution
Summary
Basically all the steps documented in the Oracle® Clusterware Administration and Deployment Guide must be followed. The difference here is that we skip the steps that are to be executed on the node which is not available and we run some extra commands on the other node which is going to remain in the cluster to remove the resources from the node that is to be removed.

Example Configuration

Node names: halinux1, halinux2
OS: RHAS 4.0 Update 4 (both nodes)
Oracle Clusterware: Oracle 11g (both nodes)


Assume that halinux2 is down due to a hardware failure and cannot even boot. The plan is to remove it from the clusterware, fix the issue, and then add it back to the clusterware. This document covers the steps to remove the node from the clusterware.

Initial Stage
At this stage, the Oracle Clusterware on halinux1 (the good node) is up and running. The node halinux2 is down and cannot be accessed. Note that the virtual IP of halinux2 is running on node 1; the rest of halinux2's resources are OFFLINE.



Step 1 - Remove oifcfg information for the failed node
Most installations use the global flag of the oifcfg command and can therefore skip this step. They can confirm this using

$oifcfg getif
If the output of the above command returns global, you can skip this step (executing the delete command against a global definition will return an error).

If the output of the oifcfg getif command does not return global, delete the failed node's interface definition with the following command:

$oifcfg delif -node <nodename>
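The global-vs-node-specific check can be scripted. The getif output below is a hypothetical sample of the command's format (interface, subnet, global/node flag, interface type); the decision logic itself is plain shell:

```shell
#!/bin/sh
# Hypothetical `oifcfg getif` output; on a real cluster run: oifcfg getif
GETIF_OUTPUT="eth0 192.168.1.0 global public
eth1 10.0.0.0 global cluster_interconnect"

# If any line lacks the 'global' flag, node-specific definitions exist
# and the failed node's entries must be deleted with oifcfg delif.
if printf '%s\n' "$GETIF_OUTPUT" | grep -qv ' global '; then
  DECISION="node-specific"
  echo "oifcfg delif -node halinux2   # dry-run print only"
else
  DECISION="global"
  echo "All interface definitions are global; skip Step 1."
fi
```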



Step 2 - Remove ONS information
Execute the following command as root to find the remote port number in use:

$cat $CRS_HOME/opmn/conf/ons.config
Then remove the information pertaining to the node to be deleted using

#$CRS_HOME/bin/racgons remove_config halinux2:6200
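The remote port can be pulled out of ons.config with awk instead of reading the file by eye. The config contents below are a hypothetical sample of the file's key=value format:

```shell
#!/bin/sh
# Hypothetical ons.config contents; on a real system read the file with:
#   cat $CRS_HOME/opmn/conf/ons.config
ONS_CONFIG="localport=6113
remoteport=6200
loglevel=3
useocr=on"

# Split each line on '=' and print the value for the remoteport key.
REMOTE_PORT=$(printf '%s\n' "$ONS_CONFIG" | awk -F= '$1 == "remoteport" {print $2}')
echo "racgons remove_config halinux2:$REMOTE_PORT   # dry-run print only"
```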


Step 3 - Remove resources
In this step, the resources that were defined on this node have to be removed. These resources include (a) the database, (b) the instance, and (c) ASM. A list of these can be obtained by running the crs_stat -t command from any node.
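The typical removal commands for these resources can be sketched as a dry run (commands are printed, not executed). The database name, instance name, and node name below are assumptions for illustration:

```shell
#!/bin/sh
# Dry-run: print the srvctl commands that remove the failed node's resources.
# Assumed names: database 'orcl', failed node 'halinux2', its instance 'orcl2'.
FAILED_NODE=halinux2
DB=orcl
INST=orcl2

CMD_INST="srvctl remove instance -d $DB -i $INST"
CMD_ASM="srvctl remove asm -n $FAILED_NODE"

echo "# On the surviving node, after stopping the resources:"
echo "$CMD_INST"
echo "$CMD_ASM"
echo "crs_stat -t   # verify the node's resources are gone"
```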



Step 4 - Execute rootdeletenode.sh
From the node that you are not deleting, execute the following command as root to find the node number of the node you want to delete:

#$CRS_HOME/bin/olsnodes -n
This number is then passed to the rootdeletenode.sh command, which is executed as root from any node that is going to remain in the cluster.

#$CRS_HOME/install/rootdeletenode.sh halinux2,2
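Extracting the node number from the olsnodes output can be scripted as well. The output below is a hypothetical sample matching this example configuration:

```shell
#!/bin/sh
# Hypothetical `olsnodes -n` output; on a real cluster run: $CRS_HOME/bin/olsnodes -n
OLSNODES_OUTPUT="halinux1 1
halinux2 2"

FAILED_NODE=halinux2
# Print the second field of the line whose first field is the failed node.
NODE_NUM=$(printf '%s\n' "$OLSNODES_OUTPUT" | awk -v n="$FAILED_NODE" '$1 == n {print $2}')
echo "rootdeletenode.sh $FAILED_NODE,$NODE_NUM   # dry-run print only"
```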

Step 5 - Update the inventory
From the node which is going to remain in the cluster, run the following command as the owner of the CRS_HOME. The argument passed to CLUSTER_NODES is a comma-separated list of the node names that are going to remain in the cluster. This step also needs to be performed from the ASM home and the RAC home.

$CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$CRS_HOME "CLUSTER_NODES=halinux1" CRS=TRUE                       ## Optionally enclose the host names with {}
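Since the inventory update must be run from each remaining home (CRS, ASM, and RAC), a small loop keeps the invocations consistent. This is a dry-run sketch; the home paths are assumptions, and CRS=TRUE applies only to the CRS home:

```shell
#!/bin/sh
# Dry-run: print the updateNodeList invocation for each Oracle home.
REMAINING_NODES="halinux1"
CRS_HOME=/u01/app/crs           # assumption
ASM_HOME=/u01/app/asm           # assumption
RAC_HOME=/u01/app/oracle/db     # assumption

# CRS home gets the extra CRS=TRUE flag.
CRS_CMD="$CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$CRS_HOME \"CLUSTER_NODES=$REMAINING_NODES\" CRS=TRUE"
echo "$CRS_CMD"

for home in "$ASM_HOME" "$RAC_HOME"; do
  echo "$home/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$home \"CLUSTER_NODES=$REMAINING_NODES\""
done
```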

Source: ITPUB blog, link: http://blog.itpub.net/110321/viewspace-622128/. Please credit the source when reposting.
