Installing a RAC 10.2.0.2 Environment on Solaris 8 (Part 6)

Original | Category: Linux Operating Systems | Author: yangtingkun | Date: 2007-03-20 00:00:00

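For a while I have been testing the installation of an Oracle 10g R2 RAC environment on Solaris. I ran into many problems, but it finally succeeded. Here is a brief summary of the installation steps, the problems encountered, and how they were solved.

This part covers applying patch 5117016 to the Oracle RAC environment, and the bug that appeared after applying it.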

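For operating system preparation, see Installing a RAC 10.2.0.2 Environment on Solaris 8 (Part 1): http://yangtingkun.itpub.net/post/468/271797

For the Oracle Clusterware installation, see Installing a RAC 10.2.0.2 Environment on Solaris 8 (Part 2): http://yangtingkun.itpub.net/post/468/271812

For the Oracle software installation and ASM configuration, see Installing a RAC 10.2.0.2 Environment on Solaris 8 (Part 3): http://yangtingkun.itpub.net/post/468/272088

For creating the RAC database, see Installing a RAC 10.2.0.2 Environment on Solaris 8 (Part 4): http://yangtingkun.itpub.net/post/468/272138

For installing the Oracle 10.2.0.2 patch, see Installing a RAC 10.2.0.2 Environment on Solaris 8 (Part 5): http://yangtingkun.itpub.net/post/468/272201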


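Anyone who has used 10.2.0.2 probably knows that it has a serious problem: LIBSERVER10.A INCORRECTLY LOCATED IN $ORACLE_HOME/RDBMS/LIB/

Because of this, patch 5117016 must be applied before any later patch can be installed on, or any upgrade applied to, a 10.2.0.2 database.

So this bug is patched here as well.

Download p5117016_10202_SOLARIS64.zip, copy the file to both nodes, and unzip it: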

$ cd /data/patch/5117016
$ ls
p5117016_10202_SOLARIS64.zip
$ unzip p5117016_10202_SOLARIS64.zip
Archive: p5117016_10202_SOLARIS64.zip
creating: 5117016/
creating: 5117016/files/
creating: 5117016/etc/
creating: 5117016/etc/config/
inflating: 5117016/etc/config/inventory
inflating: 5117016/etc/config/actions
creating: 5117016/etc/xml/
inflating: 5117016/etc/xml/GenericActions.xml
inflating: 5117016/etc/xml/ShiphomeDirectoryStructure.xml
creating: 5117016/custom/
creating: 5117016/custom/scripts/
inflating: 5117016/custom/scripts/pre
inflating: 5117016/README.txt

Stop all Oracle processes, using the method for stopping processes given earlier in this series:

$ srvctl stop db -d testrac
$ srvctl stop asm -n racnode1
$ srvctl stop asm -n racnode2
$ srvctl stop listener -n racnode1
$ srvctl stop listener -n racnode2

If the agent or Enterprise Manager processes are running, stop them on both nodes with the emctl stop agent and emctl stop dbconsole commands.

Then, as root, run the following on both nodes:

# /etc/init.d/init.crs stop
Shutting down Oracle Cluster Ready Services (CRS):
Mar 15 17:47:59.301 | INF | daemon shutting down
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.
Shutdown has begun. The daemons should exit soon.

Use ps -ef to check that all Oracle-related processes have stopped.

Then perform the following steps on each of the two nodes:
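The ps -ef check above can be sketched as a small filter. This is only an illustration: the function names are hypothetical, the process name patterns (database and ASM background processes, the listener, and the CRS daemons) are an assumed, non-exhaustive list, and a POSIX-style shell such as ksh or bash is assumed.

```shell
# Sketch: filter `ps -ef` output (read from stdin) for processes that look
# Oracle-related. The patterns are an assumed, non-exhaustive list.
list_oracle_procs() {
    # Bracketing the first letter keeps the filter from matching its own
    # command line when it appears in the ps output.
    awk '/[o]ra_|[a]sm_|[t]nslsnr|[c]rsd\.bin|[o]cssd\.bin|[e]vmd\.bin/'
}

# Succeeds only when the ps output on stdin contains no such process.
oracle_procs_stopped() {
    [ -z "$(list_oracle_procs)" ]
}

# Usage on each node:
#   ps -ef | list_oracle_procs
#   ps -ef | oracle_procs_stopped && echo "no Oracle processes found"
```

An empty result from the first command suggests the node is ready for patching; anything it prints is worth checking by hand before running opatch.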

$ cd /data/patch/5117016/5117016
$ $ORACLE_HOME/OPatch/opatch apply -local
Invoking OPatch 10.2.0.2.0

Oracle interim Patch Installer version 10.2.0.2.0
Copyright (c) 2005, Oracle Corporation. All rights reserved..


Oracle Home : /data/oracle/product/10.2/database
Central Inventory : /data/oracle/oraInventory
from : /data/oracle/product/10.2/database/oraInst.loc
OPatch version : 10.2.0.2.0
OUI version : 10.2.0.2.0
OUI location : /data/oracle/product/10.2/database/oui
Log file location : /data/oracle/product/10.2/database/cfgtoollogs/opatch/opatch-2007_Mar_15_18-03-44-CST_Thu.log

ApplySession applying interim patch '5117016' to OH '/data/oracle/product/10.2/database'

You selected -local option, hence OPatch will patch the local system only.


Please shutdown Oracle instances running out of this ORACLE_HOME on the local system.
(Oracle Home = '/data/oracle/product/10.2/database')

Is the local system ready for patching?

Do you want to proceed? [y|n]
y
User Responded with: Y
Backing up files and inventory (not for auto-rollback) for the Oracle Home
Backing up files affected by the patch '5117016' for restore. This might take a while...
Backing up files affected by the patch '5117016' for rollback. This might take a while...
Execution of 'sh /data/patch/5117016/5117016/custom/scripts/pre -apply 5117016 ':

Return Code = 0

Patching component oracle.rdbms, 10.2.0.2.0...
Running make for target ioracle
ApplySession adding interim patch '5117016' to inventory

Verifying the update...
Inventory check OK: Patch ID 5117016 is registered in Oracle Home inventory with proper meta-data.
Files check OK: Files from Patch ID 5117016 are present in Oracle Home.

The local system has been patched and can be restarted.


OPatch succeeded.
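To double-check the result on each node afterwards, one option is to look for the patch ID in the output of opatch lsinventory. The sketch below is an assumption about how that check could be scripted, not part of the original procedure; patch_listed is a hypothetical helper name, and it only tests whether the ID appears anywhere in the text it reads.

```shell
# Sketch: succeed if the text on stdin (for example the output of
# `opatch lsinventory`) mentions the patch ID passed as the first argument.
patch_listed() {
    grep "$1" > /dev/null
}

# Usage on each node:
#   $ORACLE_HOME/OPatch/opatch lsinventory | patch_listed 5117016 \
#       && echo "patch 5117016 is listed in the inventory"
```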

Once both nodes have been patched, restart the RAC environment as root:

# /etc/init.d/init.crs start
Startup will be queued to init within 30 seconds.

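After running the command, wait a while and check that ASM, the database, and the listeners have all started normally. If they have, the installation is complete.

Frustratingly, though, a serious problem showed up after installing this patch:

The Oracle instance on racnode2 could no longer start: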
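One way to make this check less manual is to scan the output of crs_stat -t for resources that are not yet online. The sketch below is a rough helper under stated assumptions: offline_resources is a hypothetical name, and it assumes the tabular layout of crs_stat -t (a header line, a dashed rule, then one row per resource with the columns Name, Type, Target, State, Host).

```shell
# Sketch: read `crs_stat -t` style output on stdin and print the names of
# resources whose State column (4th field) is not ONLINE.
# Assumes two leading lines (header and dashed rule) before the resource rows.
offline_resources() {
    awk 'NR > 2 && NF >= 4 && $4 != "ONLINE" { print $1 }'
}

# Usage (with the Clusterware bin directory on PATH):
#   crs_stat -t | offline_resources
# The srvctl status commands give a per-component view as well, for example:
#   srvctl status database -d testrac
#   srvctl status asm -n racnode1
#   srvctl status nodeapps -n racnode1
```

If the function prints nothing, every listed resource reports ONLINE. Resources can take some time to come up after init.crs start, so an early run may still list a few.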

$ sqlplus "/ as sysdba"

SQL*Plus: Release 10.2.0.2.0 - Production on Thu Mar 15 18:16:04 2007

Copyright (c) 1982, 2005, Oracle. All Rights Reserved.

Connected to an idle instance.

SQL> startup
ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+DISK/testrac/spfiletestrac.ora'
ORA-17503: ksfdopn:2 Failed to open file +DISK/testrac/spfiletestrac.ora
ORA-03113: end-of-file on communication channel

At the same time, the following error appears in the alert log:

Errors in file /data/oracle/admin/testrac/udump/testrac2_ora_4598.trc:
ORA-07445: exception encountered: core dump [kkxcms()+1160] [SIGSEGV] [Address not mapped to object] [0x000000168] [] []

A search on Metalink showed that this is an Oracle bug, described in Note:390591.1, Subject: RAC instances cannot be started after applying 10.2.0.2 patchset

Applies to:

Oracle Server - Enterprise Edition - Version: 10.2.0.2 to 10.2.0.2
This problem can occur on any platform.

Symptoms

After applying the 10.2.0.2 patchset the following problem occurs :
The instance can be started only on one node. This is the node where the Oracle Universal Installer was started.

The following messages occur on the offending nodes while trying to startup the DB using an spfile within sqlplus:

SQL> startup nomount
ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+SISTEMA/pge/spfilepgec.ora'
ORA-17503: ksfdopn:2 Failed to open file +SISTEMA/pge/spfilepgec.ora
ORA-03113: end-of-file on communication channel
SQL>

and the following error can be seen in the alert log of the ASM instance:

Errors in file /opt/sw/app/oracle/admin/+ASM/udump/+asm1_ora_9677.trc:
ORA-07445: exception encountered: core dump [kkxsyn()+740] [SIGSEGV] [Address not mapped to object] [0x000000168] [] []
Wed Sep 6 18:56:08 2006
Trace dumping is performing id=[cdmp_20060906185608]

No error is dumped in the alert.log of the instance.

If a pfile is used an error message occurs in the alert log of the instance :


replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=23, OS id=928
Wed Sep 6 18:51:20 2006
Errors in file /opt/sw/app/oracle/admin/pge/udump/pgec2_ora_29898.trc:
ORA-07445: exception encountered: core dump [kkxcms()+1160] [SIGSEGV] [Address not mapped to object] [0x000000168] [] []
Wed Sep 6 18:51:22 2006
Trace dumping is performing id=[cdmp_20060906185122]

Cause

Installing the 10.2.0.2 patchset in a RAC installation on any Unix platform does not correctly update the libknlopt.a file on all nodes. The local node where the installer is run does update libknlopt.a but remote nodes do not get the updated file. This can lead to dumps or internal errors on the remote nodes if Oracle is subsequently relinked.

Solution

There are two solutions for this problem:

1) Manual copy of the "libknlopt.a" library to the offending nodes :

- ensure all instances are shut down
- manually copy $ORACLE_HOME/rdbms/lib/libknlopt.a from the local node to all remote nodes
- relink Oracle on all nodes :
make -f ins_rdbms.mk ioracle

2) Install the patchset on every node using the "-local" option:

On Unix:
runInstaller -updateNodeList -local ORACLE_HOME=$ORACLE_HOME CLUSTER_NODES=node1,node2,...

On Windows:
setup.exe -updateNodeList -local ORACLE_HOME=%ORACLE_HOME% CLUSTER_NODES=node1,node2,...

References

@ Bug 5128575 - Libknlopt.A In 10.2.0.2 Not Relinked In Rac Installation With Patched .O Modules
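Since the note attributes the failure to libknlopt.a not being updated on the remote nodes, it may be worth confirming that the library really differs between nodes before relinking. The following is a minimal sketch, not something taken from the note: same_file is a hypothetical helper, and the /tmp path is an arbitrary choice.

```shell
# Sketch: succeed when the two files are byte-for-byte identical.
same_file() {
    cmp -s "$1" "$2"
}

# Usage on racnode2: fetch racnode1's copy to a scratch location, then compare.
#   rcp racnode1:$ORACLE_HOME/rdbms/lib/libknlopt.a /tmp/libknlopt.a.racnode1
#   same_file /tmp/libknlopt.a.racnode1 $ORACLE_HOME/rdbms/lib/libknlopt.a \
#       || echo "libknlopt.a differs from racnode1"
```

A difference is consistent with the cause described in the note but does not prove it. Identical files would suggest looking elsewhere.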

I tested solution 1 from Oracle's note by relinking manually on racnode2. Before doing this, make sure the Oracle database is shut down:

$ rcp racnode1:$ORACLE_HOME/rdbms/lib/libknlopt.a $ORACLE_HOME/rdbms/lib/libknlopt.a
$ cd $ORACLE_HOME/rdbms/lib
$ /usr/ccs/bin/make -f ins_rdbms.mk ioracle
chmod 755 /data/oracle/product/10.2/database/bin

- Linking Oracle
rm -f /data/oracle/product/10.2/database/rdbms/lib/oracle
/usr/ccs/bin/ld -o /data/oracle/product/10.2/database/rdbms/lib/oracle -L/data/oracle/product/10.2/database/rdbms/lib/ -L/data/oracle/product/10.2/database/lib/ -dy /data/oracle/product/10.2/database/lib/prod/lib/v9/crti.o /data/oracle/product/10.2/database/lib/prod/lib/v9/crt1.o /data/oracle/product/10.2/database/rdbms/lib/opimai.o /data/oracle/product/10.2/database/rdbms/lib/ssoraed.o /data/oracle/product/10.2/database/rdbms/lib/ttcsoi.o /data/oracle/product/10.2/database/rdbms/lib/defopt.o -z allextract -lperfsrv10 -z defaultextract /data/oracle/product/10.2/database/lib/nautab.o /data/oracle/product/10.2/database/lib/naeet.o /data/oracle/product/10.2/database/lib/naect.o /data/oracle/product/10.2/database/lib/naedhs.o /data/oracle/product/10.2/database/rdbms/lib/config.o -lserver10 -lodm10 -lnnet10 -lskgxp10 -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lxml10 -lcore10 -lunls10 -lsnls10 -lnls10 -lcore10 -lnls10 -lhasgen10 -lcore10 -lskgxn2 -locr10 -locrb10 -locrutl10 -lhasgen10 -lcore10 -lskgxn2 -lclient10 -lvsn10 -lcommon10 -lgeneric10 -lknlopt `if /usr/ccs/bin/ar tv /data/oracle/product/10.2/database/rdbms/lib/libknlopt.a | grep xsyeolap.o > /dev/null 2>&1 ; then echo "-loraolap10" ; fi` -lslax10 -lpls10 -lplp10 -lserver10 -lclient10 -lvsn10 -lcommon10 -lgeneric10 -lknlopt -lslax10 -lpls10 -lplp10 -ljox10 -lserver10 -lclsra10 -ldbcfg10 -locijdbcst10 -lwwg `cat /data/oracle/product/10.2/database/lib/ldflags` -lnsslb10 -lncrypt10 -lnsgr10 -lnzjs10 -ln10 -lnnz10 -lnl10 -lnro10 `cat /data/oracle/product/10.2/database/lib/ldflags` -lnsslb10 -lncrypt10 -lnsgr10 -lnzjs10 -ln10 -lnnz10 -lnl10 -lmm -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lxml10 -lcore10 -lunls10 -lsnls10 -lnls10 -lcore10 -lnls10 `cat /data/oracle/product/10.2/database/lib/ldflags` -lnsslb10 -lncrypt10 -lnsgr10 -lnzjs10 -ln10 -lnnz10 -lnl10 -lnro10 `cat /data/oracle/product/10.2/database/lib/ldflags` -lnsslb10 -lncrypt10 -lnsgr10 -lnzjs10 -ln10 
-lnnz10 -lnl10 -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lxml10 -lcore10 -lunls10 -lsnls10 -lnls10 -lcore10 -lnls10 `if /usr/ccs/bin/ar tv /data/oracle/product/10.2/database/rdbms/lib/libknlopt.a | grep "kxmnsd.o" > /dev/null 2>&1 ; then echo " " ; else echo "-lordsdo10"; fi` -lctxc10 -lctx10 -lzx10 -lgx10 -lctx10 -lzx10 -lgx10 -lordimt10 -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lxml10 -lcore10 -lunls10 -lsnls10 -lnls10 -lcore10 -lnls10 -lsnls10 -lunls10 -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lcore10 -lsnls10 -lnls10 -lxml10 -lcore10 -lunls10 -lsnls10 -lnls10 -lcore10 -lnls10 `cat /data/oracle/product/10.2/database/lib/sysliblist` -R /opt/SUNWcluster/lib/sparcv9:/data/oracle/product/10.2/database/lib:/opt/ORCLcluster/lib/ -Y P,:/opt/SUNWcluster/lib/sparcv9:/opt/ORCLcluster/lib/:/usr/ccs/lib/sparcv9:/usr/lib/sparcv9 -Qy -lc -laio -lposix4 -lkstat -lm /data/oracle/product/10.2/database/lib/prod/lib/v9/crtn.o
mv -f /data/oracle/product/10.2/database/bin/oracle /data/oracle/product/10.2/database/bin/oracleO
mv /data/oracle/product/10.2/database/rdbms/lib/oracle /data/oracle/product/10.2/database/bin/oracle
chmod 6751 /data/oracle/product/10.2/database/bin/oracle

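Possibly because the ASM instance had not been shut down, the problem persisted after relinking. Rebooting the whole system resolved it.

My recommendation: do not apply patch 5117016 unless it is necessary. Alternatively, consider upgrading the database directly to 10.2.0.3.

Source: ITPUB Blog, link: http://blog.itpub.net/4227/viewspace-69208/. If you reproduce this article, please credit the source; otherwise legal action will be pursued.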
