Installing Oracle 11.1.0.6 RAC in Silent Mode on Solaris 10 (Part 8)

Original post  Category: Linux operating systems  Author: yangtingkun  Date: 2009-01-28 23:49:15

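The host environment is essentially the same as the one described in the earlier articles on installing Oracle 11.1.0.6 RAC on Solaris 10. The main difference is that there is no Volume Cluster Manager this time, so the plan is to use Oracle ASM instead. Since the installation steps themselves are no different, I chose silent mode for this RAC installation.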

This part describes the problems encountered during the installation.

Installing Oracle 11.1.0.6 RAC in Silent Mode on Solaris 10 (Part 1): http://yangtingkun.itpub.net/post/468/477442

Installing Oracle 11.1.0.6 RAC in Silent Mode on Solaris 10 (Part 2): http://yangtingkun.itpub.net/post/468/477443

Installing Oracle 11.1.0.6 RAC in Silent Mode on Solaris 10 (Part 3): http://yangtingkun.itpub.net/post/468/477444

Installing Oracle 11.1.0.6 RAC in Silent Mode on Solaris 10 (Part 4): http://yangtingkun.itpub.net/post/468/477446

Installing Oracle 11.1.0.6 RAC in Silent Mode on Solaris 10 (Part 5): http://yangtingkun.itpub.net/post/468/477447

Installing Oracle 11.1.0.6 RAC in Silent Mode on Solaris 10 (Part 6): http://yangtingkun.itpub.net/post/468/477448

Installing Oracle 11.1.0.6 RAC in Silent Mode on Solaris 10 (Part 7): http://yangtingkun.itpub.net/post/468/477600

 

 

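Although I have done both silent installations and RAC 11g installations more than once, this was my first time installing RAC in silent mode, and also my first time installing ASM in silent mode, so running into some problems was inevitable.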

The first error was really caused by carelessness. During the silent-mode Clusterware installation, the following error was reported:

$ ./runInstaller -silent -noconfig -responseFile /data/cluster/response/my_crs.rsp
Starting Oracle Universal Installer...

Checking Temp space: must be greater than 180 MB.   Actual 57590 MB    Passed
Checking swap space: must be greater than 150 MB.   Actual 57802 MB    Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2008-08-29_01-52-22PM. Please wait ...$ Oracle Universal Installer, Version 11.1.0.6.0 Production
Copyright (C) 1999, 2007, Oracle. All rights reserved.

OUI-10203:The specified response file '/data/cluster/response/my_crs.rsp' is not found. Make sure that the response file specified exists and you have read privileges to this file.

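A search on Metalink showed that the problem was caused by an incorrect setting of the FROM_LOCATION parameter in the response file. This parameter was set to a relative path at the time, and I had assumed the default value would be correct, so I never verified it.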

After setting it to the correct absolute path and rerunning the installation, the problem was resolved.
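The OUI-10203 message points at the response file itself, while the actual cause here was a relative FROM_LOCATION inside it. A quick check before launching the installer could look like the sketch below. The function name is hypothetical, and it assumes the response file uses plain NAME="value" lines and that POSIX sed, tail and tr are available.

```shell
# Hypothetical helper (not part of OUI): print whether FROM_LOCATION in a
# response file is an absolute path. Assumes a plain NAME="value" line.
from_location_is_absolute() {
    rsp_file=$1
    # Keep the last uncommented FROM_LOCATION assignment, then drop the quotes.
    value=$(sed -n 's/^[[:space:]]*FROM_LOCATION[[:space:]]*=[[:space:]]*//p' "$rsp_file" |
            tail -n 1 | tr -d '"')
    case $value in
        /*) echo "absolute: $value" ;;
        *)  echo "not absolute: $value"; return 1 ;;
    esac
}
```

It would be called as from_location_is_absolute /data/cluster/response/my_crs.rsp before running runInstaller.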

The second error was also caused by carelessness, again during the Clusterware installation:

OUI-10155:Error while setting variable sl_tableList: The following node names are invalid, as they do not resolve to a valid IP address:

ser2-vip.

I checked the response-file parameter SL_TABLELIST and found nothing wrong with it. The mistake turned out to be in /etc/hosts:

127.0.0.1       localhost
172.0.2.62      ser1    ser1.   loghost
172.0.2.63      ser2
172.0.2.68      ser1-vip
172.0.2.69      serv2-vip
10.0.2.1        ser1-priv
10.0.2.2        ser2-priv

Here ser2-vip had been mistyped as serv2-vip, which caused the error. The file was corrected to:

127.0.0.1       localhost
172.0.2.62      ser1    ser1.   loghost
172.0.2.63      ser2
172.0.2.68      ser1-vip
172.0.2.69      ser2-vip
10.0.2.1        ser1-priv
10.0.2.2        ser2-priv

After checking all network settings on both nodes and confirming they were correct, I reran runInstaller and the problem was resolved.
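A typo like serv2-vip is easy to miss by eye. The following sketch checks that every node name the installer will be given appears in a hosts file. Both function names are hypothetical, and the check only reads the file; it does not query the system resolver. It assumes an awk that supports -v (on Solaris 10 that would be nawk or /usr/xpg4/bin/awk).

```shell
# Succeed if NAME ($2) appears as a hostname or alias in the hosts file ($1).
hosts_has_name() {
    awk -v name="$2" '
        { sub(/#.*/, "")                      # drop comments
          for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }
    ' "$1"
}

# Report every requested name that is missing from the hosts file.
check_cluster_names() {
    hosts_file=$1; shift
    missing=""
    for n in "$@"; do
        hosts_has_name "$hosts_file" "$n" || missing="$missing $n"
    done
    if [ -n "$missing" ]; then
        echo "unresolved:$missing"
        return 1
    fi
    echo "all names found"
}

# Example call, to be run on each node:
# check_cluster_names /etc/hosts ser1 ser2 ser1-vip ser2-vip ser1-priv ser2-priv
```

On a live node, getent hosts ser2-vip would additionally confirm what the configured name service actually returns.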

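The third problem was a consequence of the second. Because I had not chosen to delete the installation directory during the uninstall, the directory was not empty, and the following error appeared: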

SEVERE:OUI-10029:You have specified a non-empty directory to install this product. It is recommended to specify either an empty or a non-existent directory. You may, however, choose to ignore this message if the directory contains Operating System generated files or subdirectories like lost+found.

This one is easy to fix: just clear the directory manually.
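Before a retry, the target directory can be checked for leftovers. The sketch below uses a hypothetical function name and treats lost+found as ignorable, as the OUI-10029 message itself suggests. It reports only the first leftover entry it finds.

```shell
# Hypothetical pre-install check: is the target directory absent or empty?
install_dir_is_clean() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "ok: $dir does not exist"
        return 0
    fi
    # Visit regular and hidden entries; unmatched patterns are skipped below.
    for entry in "$dir"/* "$dir"/.[!.]* "$dir"/..?*; do
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        case ${entry##*/} in
            lost+found) ;;                    # OS-generated, may be ignored
            *) echo "not empty: $entry"; return 1 ;;
        esac
    done
    echo "ok: $dir is empty"
}
```

For this installation the call would be install_dir_is_clean /data/oracle/product/11.1/crs, run on every node.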

The fourth problem appeared while running the /data/oracle/product/11.1/crs/root.sh script:

root@ser1 # . /data/oracle/product/11.1/crs/root.sh
WARNING: directory '/data/oracle/product/11.1' is not owned by root
WARNING: directory '/data/oracle/product' is not owned by root
WARNING: directory '/data/oracle' is not owned by root
WARNING: directory '/data' is not owned by root
Checking to see if Oracle CRS stack is already configured

Setting the permissions on OCR backup directory
Setting up Network socket directories
Failed to upgrade Oracle Cluster Registry configuration

After a long investigation, it turned out to be an error I had hit before: Oracle does not accept the "s0" slice. For details see: http://yangtingkun.itpub.net/post/468/272473

In short: Oracle will not use the s0 slice by default. If an s0 slice is specified as the OCR or voting disk, running root.sh produces the same error message: Failed to upgrade Oracle Cluster Registry configuration

The original configuration had set the OCR to /dev/rdsk/emcpower0a, which corresponds exactly to the s0 slice, and that caused the error.

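So the only option was to uninstall the Clusterware completely and reassign the raw devices for the OCR and voting disk. The reinstallation then went through without any problems.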
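Since this mistake forces a full uninstall, it may be worth screening device names before they go into the response file. The sketch below is only a name-based heuristic under two assumptions: that a trailing "a" on an emcpower device means slice 0 (as stated above for emcpower0a), and that native Solaris devices follow the usual c#t#d#s# naming. The function name is hypothetical.

```shell
# Hypothetical name-based check: flag devices that look like slice 0.
check_ocr_device() {
    case ${1##*/} in
        c[0-9]*s0|emcpower[0-9]*a)
            echo "slice 0, pick another slice: $1"; return 1 ;;
        *)  echo "ok: $1" ;;
    esac
}
```

For example, check_ocr_device /dev/rdsk/emcpower0a is flagged, while /dev/rdsk/emcpower0g passes.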

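The fifth problem was yet another repeat of an earlier mistake. I had clearly run into it before, yet fell into the same trap again. For details see http://yangtingkun.itpub.net/post/468/407375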

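Once the software installation completes, root.sh has to be run, but this script has two bugs. First, when invoking it as ./root.sh there must be no space between the '.' and the '/', otherwise it falls into an infinite loop. Second, if the database is created in silent mode, the OUI_SILENT parameter inside root.sh is set incorrectly and must be manually changed to FALSE before the script is run. Unfortunately I hit both problems again this time. Hopefully I will not repeat the same mistakes the next time I install 11g.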

The sixth problem was encountered when creating ASM in silent mode:

$ dbca -silent -responseFile /data/database/response/my_asm.rsp
Look at the log file "/data/oracle/cfgtoollogs/dbca/silent.log" for further details.
$ more /data/oracle/cfgtoollogs/dbca/silent.log
Failed to retrieve network listener resources required for the Real Application Clusters high availability extensions configurations on the following nodes: [ser1", "ser2].

Do you want listeners on port 1521 with prefix LISTENER to be created on nodes [ser1", "ser2] automatically?  If you would like to configure the listener with different properties, run NetCA before continuing.
Listener creation failed with error: Failed to create a profile for listener "[LISTENER_SER1"]" on node "ser1"", "PRKC-1056 : Failed to get the hostname for node ser1"
PRKH-1001 : HASContext Internal Error
  [OCR Error(Native: getHostName:[21])]"..

It is strongly recommended to run NetCA to configure listeners before continuing.  Do you want to continue with the operation?
The operation will be stopped. Re-run DBCA after successfully running NetCA.

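Judging from the error message, Oracle clearly failed to parse the node names. On inspection, I found I had forgotten the braces when entering the node parameter. After changing it to NODELIST={"ser1","ser2"} and reinstalling, the problem persisted, except that the error message now also included the braces.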

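The DBCA response file settings differ somewhat from the OUI response file settings. After changing the parameter to NODELIST=ser1,ser2 and trying again, the error disappeared.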
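The conversion from the OUI-style list to the form that worked for DBCA here is mechanical: remove the braces, the double quotes, and any spaces. A minimal sketch, with a hypothetical function name:

```shell
# Turn an OUI-style string list such as {"ser1","ser2"} into the plain
# comma-separated form that DBCA accepted in this installation.
to_dbca_nodelist() {
    printf '%s\n' "$1" | tr -d '{}" '
}
```

For example, to_dbca_nodelist '{"ser1","ser2"}' prints ser1,ser2, which can then be written into the response file as NODELIST=ser1,ser2.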

The seventh error was also related to the ASM configuration. The following problem appeared while configuring ASM:

$ dbca -silent -responseFile /data/database/response/my_asm.rsp
Look at the log file "/data/oracle/cfgtoollogs/dbca/silent1.log" for further details.
$ more /data/oracle/cfgtoollogs/dbca/silent1.log
Failed to retrieve network listener resources required for the Real Application Clusters high availability extensions configurations on the following nodes: [ser1, ser2].

Do you want listeners on port 1521 with prefix LISTENER to be created on nodes [ser1, ser2] automatically?  If you would like to configure the listener with different properties, run NetCA before continuing.
ORA-15018: diskgroup cannot be created
ORA-15031: disk specification '/dev/rdsk/emcpower0g' matches no disks
ORA-15025: could not open disk '/dev/rdsk/emcpower0g'
ORA-15056: additional error message

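Since there was nothing wrong with the path, it looked like a permissions problem, so I changed the ownership of the raw device as root: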

root@ser1 # ls -l /dev/rdsk/emcpower0g
lrwxrwxrwx   1 root     root          33 May 30 14:58 /dev/rdsk/emcpower0g -> ../../devices/pseudo/emcp@0:g,raw
root@ser1 # chown oracle:oinstall /dev/rdsk/emcpower0g
root@ser1 # ls -l /dev/rdsk/emcpower0g
lrwxrwxrwx   1 root     root          33 May 30 14:58 /dev/rdsk/emcpower0g -> ../../devices/pseudo/emcp@0:g,raw

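The ownership shown for the raw device did not change after the chown, because ls -l reports the symbolic link itself while chown follows the link and changes the underlying device node. The ASM configuration error, however, did change: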

$ dbca -silent -responseFile /data/database/response/my_asm.rsp
Look at the log file "/data/oracle/cfgtoollogs/dbca/silent2.log" for further details.
$ more /data/oracle/cfgtoollogs/dbca/silent2.log
Could not mount the diskgroup on remote node ser2 using connection service ser2:1521:+ASM2.  Ensure that the listener is running on this node and the ASM instance is registered to the listener.  Received the following error:

 ORA-15032: not all alterations performed
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"

Could not mount the diskgroup on remote node ser2 using connection service ser2:1521:+ASM2.  Ensure that the listener is running on this node and the ASM instance is registered to the listener.  Received the following error:

 ORA-15032: not all alterations performed
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"

It seems the same ownership change has to be made on node 2 as well:

root@ser2 # chown oracle:oinstall emcpower0g

After rerunning dbca, the error was resolved.
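Because /dev/rdsk/emcpower0g is a symbolic link, plain ls -l cannot confirm whether the chown took effect. Adding -L makes ls report on the link target instead. The helper below is a small sketch with a hypothetical name; it prints owner and group of whatever the path resolves to.

```shell
# Print "owner:group" of the file a path resolves to.
device_owner() {
    # -L: report on the file the symlink points to, not the link itself.
    ls -lL "$1" | awk '{ print $3 ":" $4 }'
}
```

Running device_owner /dev/rdsk/emcpower0g on each node makes it easy to compare the ownership on ser1 and ser2 before starting dbca.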

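The eighth problem appeared when creating the database. Because the ASM password had not been set, the installation prompted for it, but as soon as Enter was pressed the script stopped: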

$ dbca -silent -responseFile /data/database/response/my_db.rsp
Enter ASM SYS user password:
 
$

After supplying ASM_SYS_PASSWORD as one of the parameters and rerunning, this problem went away.

However, this run produced no message at all:

$ dbca -silent -responseFile /data/database/response/my_db.rsp

The command simply ended, with no indication of either failure or success.

So I had to go to the directory where dbca keeps its logs:

$ cd /data/oracle/cfgtoollogs/dbca/
$ ls -l
total 50
-rw-r-----   1 oracle   oinstall     840 Sep  2 15:33 silent.log
-rw-r-----   1 oracle   oinstall     854 Sep  2 15:40 silent0.log
-rw-r-----   1 oracle   oinstall     581 Sep  2 15:46 silent1.log
-rw-r-----   1 oracle   oinstall     697 Sep  2 15:51 silent2.log
-rw-r-----   1 oracle   oinstall       1 Sep  2 16:06 silent3.log
-rw-r-----   1 oracle   oinstall   19714 Sep  2 17:47 trace.log_OraDbHome1
$ more trace.log_OraDbHome1
[main] [17:47:42:597] [CommandLineArguments.process:639]  CommandLineArguments->process: number of arguments = 3
[main] [17:47:42:605] [CommandLineArguments.loadNodeinfo:3885]  CommandLineArguments:loadNodeinfo: length of m_nodeinfo = 2
[main] [17:47:42:606] [CommandLineArguments.loadNodeinfo:3894]  CommandLineArguments->loadNodeinfo: Node is {"ser1"
[main] [17:47:42:606] [CommandLineArguments.loadNodeinfo:3894]  CommandLineArguments->loadNodeinfo: Node is "ser2"}
[main] [17:47:42:609] [CommandLineArguments.validateArguments:3372]  CommandLineArguments->process: in Operation Type is Creation/GenerateScripts Mode condition
[main] [17:47:42:609] [OracleHome.hasEELicense:204]  Running script to determine licensing: /data/oracle/product/11.1/database/bin/bndlchk
.
.
.
[Thread-7] [17:47:44:658] [StreamReader.run:65]  OUTPUT>ser1    1
[Thread-7] [17:47:44:661] [StreamReader.run:65]  OUTPUT>ser2    2
[main] [17:47:44:668] [RuntimeExec.runCommand:144]  runCommand: process returns 0
[main] [17:47:44:668] [RuntimeExec.runCommand:161]  RunTimeExec: output>
[main] [17:47:44:668] [RuntimeExec.runCommand:164]  ser1        1
[main] [17:47:44:668] [RuntimeExec.runCommand:164]  ser2        2
[main] [17:47:44:669] [RuntimeExec.runCommand:170]  RunTimeExec: error>
[main] [17:47:44:669] [RuntimeExec.runCommand:192]  Returning from RunTimeExec.runCommand
[main] [17:47:44:669] [ClusterInfo.getClusterNodeMap:960]  Number of nodes=2
[main] [17:47:44:670] [ClusterInfo.getClusterNodeMap:972]  i=0 nodeName=ser1 nodeNum=1
[main] [17:47:44:670] [ClusterInfo.getClusterNodeMap:972]  i=1 nodeName=ser2 nodeNum=2
[main] [17:47:44:670] [StepContext.getInstanceNumbers:2968]  firstNodeNum=1
[main] [17:47:44:670] [StepContext.getInstanceNumbers:2991]  node={"ser1" nodeNum=null
[main] [17:47:44:671] [StepContext.getInstanceNumbers:3006]  nodeNames[0]={"ser1" instance number=null
[main] [17:47:44:671] [StepContext.getInstanceNumbers:2991]  node="ser2"} nodeNum=null
[main] [17:47:44:671] [StepContext.getInstanceNumbers:3006]  nodeNames[1]="ser2"} instance number=null
Exception in thread "main" java.lang.NumberFormatException: null
        at java.lang.Integer.parseInt(Integer.java:415)
        at java.lang.Integer.parseInt(Integer.java:497)
        at oracle.sysman.assistants.util.step.StepContext.getDBInstanceNumbers(StepContext.java:3038)
        at oracle.sysman.assistants.dbca.backend.Host.createOPSConfiguration(Host.java:957)
        at oracle.sysman.assistants.dbca.backend.SilentHost.performOperation(SilentHost.java:186)
        at oracle.sysman.assistants.dbca.backend.Host.startOperation(Host.java:3090)
        at oracle.sysman.assistants.dbca.Dbca.execute(Dbca.java:115)
        at oracle.sysman.assistants.dbca.Dbca.main(Dbca.java:180)

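According to the error log, the problem still seems to lie with the braces. After removing them and retrying, the problem was resolved. Oracle's own internal tools do not even follow a consistent standard; there is this obvious a difference between OUI and DBCA. Worst of all, the comments at the top of Oracle's own dbca response file state: StringList :  {"",""}. All one can say is that dbca's handling of this point is buggy.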

Note that, as with ASM earlier, both the braces and the double quotes have to be removed here.
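Since a silent dbca run can end without printing anything, as it did in the eighth problem, it helps to scan the trace log for the lines that matter. The sketch below uses a hypothetical function name and searches one log file for Java exceptions and for ORA- and PRKx- style error codes. The patterns are a guess at what is useful, not an exhaustive list.

```shell
# Print, with line numbers, the lines of a dbca log that look like errors.
dbca_trace_errors() {
    grep -n -e 'Exception in thread' -e 'ORA-[0-9]' -e 'PRK[A-Z]-[0-9]' "$1"
}
```

For the run above this would be dbca_trace_errors /data/oracle/cfgtoollogs/dbca/trace.log_OraDbHome1. The function returns a non-zero status when nothing matches.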

 

 

Source: ITPUB blog, link: http://blog.itpub.net/4227/viewspace-544644/. If you repost this article, please credit the source; otherwise legal action may be taken.
