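The host environment is essentially identical to the one described in the earlier articles on installing Oracle 11.1.0.6 RAC on Solaris 10. The main difference is that there is no Volume Cluster Manager this time, so the plan is to use Oracle ASM instead. Because the installation steps themselves are no different, this time RAC was installed in silent mode.
This part describes the problems encountered during the installation.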
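Silent-mode installation of an Oracle 11.1.0.6 RAC environment on Solaris 10 (Part 1): http://yangtingkun.itpub.net/post/468/477442
Silent-mode installation of an Oracle 11.1.0.6 RAC environment on Solaris 10 (Part 2): http://yangtingkun.itpub.net/post/468/477443
Silent-mode installation of an Oracle 11.1.0.6 RAC environment on Solaris 10 (Part 3): http://yangtingkun.itpub.net/post/468/477444
Silent-mode installation of an Oracle 11.1.0.6 RAC environment on Solaris 10 (Part 4): http://yangtingkun.itpub.net/post/468/477446
Silent-mode installation of an Oracle 11.1.0.6 RAC environment on Solaris 10 (Part 5): http://yangtingkun.itpub.net/post/468/477447
Silent-mode installation of an Oracle 11.1.0.6 RAC environment on Solaris 10 (Part 6): http://yangtingkun.itpub.net/post/468/477448
Silent-mode installation of an Oracle 11.1.0.6 RAC environment on Solaris 10 (Part 7): http://yangtingkun.itpub.net/post/468/477600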
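Although I have done both silent installations and 11g RAC installations more than once, this was the first silent-mode installation of RAC and also the first silent installation of ASM, so running into some problems was unavoidable.
The first error was really caused by carelessness. During the silent-mode installation of the clusterware, the following error was reported: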
$ ./runInstaller -silent -noconfig -responseFile /data/cluster/response/my_crs.rsp
Starting Oracle Universal Installer...
Checking Temp space: must be greater than 180 MB. Actual 57590 MB Passed
Checking swap space: must be greater than 150 MB. Actual 57802 MB Passed
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2008-08-29_01-52-22PM. Please wait ...$ Oracle Universal Installer, Version 11.1.0.6.0 Production
Copyright (C) 1999, 2007, Oracle. All rights reserved.
OUI-10203:The specified response file '/data/cluster/response/my_crs.rsp' is not found. Make sure that the response file specified exists and you have read privileges to this file.
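A search on Metalink showed that the problem was caused by an incorrect FROM_LOCATION parameter in the response file, even though the message points at the response file itself. The parameter was set to a relative path; I had assumed the default value would be correct and never verified it.
After setting it to the correct absolute path and rerunning the installation, the problem was resolved.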
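Because OUI-10203 blames the response file rather than the parameter inside it, a quick pre-flight check of FROM_LOCATION can save a round trip. The following is only a minimal sketch: the function name `check_from_location` is hypothetical, and it assumes the response file carries an uncommented line of the form `FROM_LOCATION="..."`.

```shell
# Hypothetical pre-flight check: print the FROM_LOCATION value found in a
# response file and succeed only if it is an absolute path that exists.
check_from_location() {
    rsp=$1
    # Take the last uncommented FROM_LOCATION line and strip optional quotes.
    loc=$(sed -n 's/^[[:space:]]*FROM_LOCATION[[:space:]]*=[[:space:]]*//p' "$rsp" | tail -1 | tr -d '"')
    echo "FROM_LOCATION=$loc"
    case $loc in
        /*) [ -e "$loc" ] ;;   # absolute path: it must exist
        *)  return 1 ;;        # relative or empty value: reject
    esac
}
```

It could be run as `check_from_location /data/cluster/response/my_crs.rsp` before starting runInstaller; a non-zero exit status suggests the value should be reviewed.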
The second error was again caused by carelessness, still during the clusterware installation. The error was:
OUI-10155:Error while setting variable sl_tableList: The following node names are invalid, as they do not resolve to a valid IP address:
ser2-vip.
The SL_TABLELIST parameter in the response file showed nothing abnormal. The mistake turned out to be in /etc/hosts:
127.0.0.1 localhost
172.0.2.62 ser1 ser1. loghost
172.0.2.63 ser2
172.0.2.68 ser1-vip
172.0.2.69 serv2-vip
10.0.2.1 ser1-priv
10.0.2.2 ser2-priv
Here ser2-vip had been mistyped as serv2-vip, which caused the error. The file was corrected to:
127.0.0.1 localhost
172.0.2.62 ser1 ser1. loghost
172.0.2.63 ser2
172.0.2.68 ser1-vip
172.0.2.69 ser2-vip
10.0.2.1 ser1-priv
10.0.2.2 ser2-priv
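After checking all network settings on both nodes and confirming they were correct, runInstaller was executed again and the problem was resolved.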
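A typo like this is easy to catch mechanically by checking every node name used in the response file against the hosts file. The sketch below is an illustration only: `missing_hosts` is a hypothetical helper, it looks at a hosts-format file and nothing else (no DNS or other name services), and on Solaris 10 nawk or /usr/xpg4/bin/awk may be needed in place of awk for the `-v` option.

```shell
# Hypothetical helper: print each given node name that does not appear as a
# name field in a hosts-format file (field 1 = address, fields 2.. = names).
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Strip comments, then compare the name exactly against every name field.
        if ! awk -v n="$name" '
                { sub(/#.*/, "") }
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit !found }' "$hosts_file"; then
            echo "$name"
        fi
    done
}
```

For example, `missing_hosts /etc/hosts ser1 ser2 ser1-vip ser2-vip ser1-priv ser2-priv` prints nothing when every name is present, and would list ser2-vip against the mistyped file shown earlier. It should be run on both nodes.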
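The third problem is a follow-on from the second. Because the uninstall had not been set to delete the installation directory, the directory was not empty, and the following error appeared: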
SEVERE:OUI-10029:You have specified a non-empty directory to install this product. It is recommended to specify either an empty or a non-existent directory. You may, however, choose to ignore this message if the directory contains Operating System generated files or subdirectories like lost+found.
This one is easy to handle: clear the directory manually.
The fourth problem appeared while running the /data/oracle/product/11.1/crs/root.sh script. The error was:
root@ser1 # . /data/oracle/product/11.1/crs/root.sh
WARNING: directory '/data/oracle/product/11.1' is not owned by root
WARNING: directory '/data/oracle/product' is not owned by root
WARNING: directory '/data/oracle' is not owned by root
WARNING: directory '/data' is not owned by root
Checking to see if Oracle CRS stack is already configured
Setting the permissions on OCR backup directory
Setting up Network socket directories
Failed to upgrade Oracle Cluster Registry configuration
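After a long investigation, it turned out to be an error I had hit before: Oracle does not accept the "s0" slice. For details, see: http://yangtingkun.itpub.net/post/468/272473
In short: by default Oracle will not use the s0 slice. If an s0 slice is specified as the OCR or voting disk, root.sh fails with this same message: Failed to upgrade Oracle Cluster Registry configuration.
The original configuration had set the OCR to /dev/rdsk/emcpower0a, which corresponds exactly to the s0 slice, hence the error.
The only option was to uninstall the clusterware completely and reassign the raw devices for the OCR and voting disk. The reinstallation then went through without any problems.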
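Since this mistake only shows up at the root.sh stage, it may be worth screening the OCR and voting disk paths before installing. The sketch below is an assumption-laden illustration: `is_slice0` is a hypothetical helper, and its name patterns (a standard Solaris `c...s0` device, or an `emcpower` device ending in `a`, as with emcpower0a above) are generalized from the single case in this article.

```shell
# Hypothetical check: succeed if a device path looks like slice 0.
# Patterns are assumptions: cXtXdXs0, or emcpower<N>a as in this article.
is_slice0() {
    case ${1##*/} in          # look at the file name only
        c*s0|emcpower*a) return 0 ;;
        *)               return 1 ;;
    esac
}
```

A usage example: `is_slice0 /dev/rdsk/emcpower0a && echo "avoid slice 0 for OCR or voting disk"`.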
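The fifth problem was again a repeat of an earlier mistake. I had clearly hit it before and still fell into the same trap; for details, see http://yangtingkun.itpub.net/post/468/407375
When the software installation finishes, root.sh has to be run, but this script has two bugs. First, when invoking ./root.sh there must be no space between the '.' and the '/', otherwise the script falls into an infinite loop. Second, setting up the database in silent mode leaves the OUI_SILENT parameter in root.sh set incorrectly; it has to be changed to FALSE by hand before the script is run. Unfortunately I hit both problems again this time. Hopefully I will not repeat them the next time I install 11g.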
The sixth problem appeared when creating ASM in silent mode:
$ dbca -silent -responseFile /data/database/response/my_asm.rsp
Look at the log file "/data/oracle/cfgtoollogs/dbca/silent.log" for further details.
$ more /data/oracle/cfgtoollogs/dbca/silent.log
Failed to retrieve network listener resources required for the Real Application Clusters high availability extensions configurations on the following nodes: [ser1", "ser2].
Do you want listeners on port 1521 with prefix LISTENER to be created on nodes [ser1", "ser2] automatically? If you would like to configure the listener with different properties, run NetCA before continuing.
Listener creation failed with error: Failed to create a profile for listener "[LISTENER_SER1"]" on node "ser1"", "PRKC-1056 : Failed to get the hostname for node ser1"
PRKH-1001 : HASContext Internal Error
[OCR Error(Native: getHostName:[21])]"..
It is strongly recommended to run NetCA to configure listeners before continuing. Do you want to continue with the operation?
The operation will be stopped. Re-run DBCA after successfully running NetCA.
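According to the error message, Oracle clearly failed while parsing the node names. On inspection, I had forgotten the braces when entering the node parameter. After changing it to NODELIST={"ser1","ser2"} and rerunning, the problem persisted, except that the error message now included the braces as well.
The response-file syntax of DBCA differs somewhat from that of OUI. After changing the parameter to NODELIST=ser1,ser2 and trying again, the error disappeared.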
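When a node list is copied from an OUI response file into a DBCA one, the conversion amounts to dropping the braces and double quotes. A minimal sketch follows; `normalize_nodelist` is a hypothetical name, and the target format is simply the `ser1,ser2` form that worked here.

```shell
# Hypothetical helper: turn an OUI-style list such as {"ser1","ser2"}
# into the plain comma-separated form that DBCA accepted in this article.
normalize_nodelist() {
    # Delete braces, double quotes, spaces and tabs; keep the commas.
    printf '%s\n' "$1" | tr -d '{}" \t'
}
```

For instance, `normalize_nodelist '{"ser1","ser2"}'` prints `ser1,ser2`, which can then be used as the NODELIST value.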
The seventh error was also related to ASM configuration. The following problem appeared while configuring ASM:
$ dbca -silent -responseFile /data/database/response/my_asm.rsp
Look at the log file "/data/oracle/cfgtoollogs/dbca/silent1.log" for further details.
$ more /data/oracle/cfgtoollogs/dbca/silent1.log
Failed to retrieve network listener resources required for the Real Application Clusters high availability extensions configurations on the following nodes: [ser1, ser2].
Do you want listeners on port 1521 with prefix LISTENER to be created on nodes [ser1, ser2] automatically? If you would like to configure the listener with different properties, run NetCA before continuing.
ORA-15018: diskgroup cannot be created
ORA-15031: disk specification '/dev/rdsk/emcpower0g' matches no disks
ORA-15025: could not open disk '/dev/rdsk/emcpower0g'
ORA-15056: additional error message
Since the path was correct, it looked like a permissions problem, so ownership of the raw device was changed as root:
root@ser1 # ls -l /dev/rdsk/emcpower0g
lrwxrwxrwx 1 root root 33 May 30 14:58 /dev/rdsk/emcpower0g -> ../../devices/pseudo/emcp@0:g,raw
root@ser1 # chown oracle:oinstall /dev/rdsk/emcpower0g
root@ser1 # ls -l /dev/rdsk/emcpower0g
lrwxrwxrwx 1 root root 33 May 30 14:58 /dev/rdsk/emcpower0g -> ../../devices/pseudo/emcp@0:g,raw
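Although the ls -l output looks unchanged after the chown (ls -l reports on the symbolic link itself, whereas chown without -h changes the device node the link points to), the error reported by the ASM configuration was now different: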
$ dbca -silent -responseFile /data/database/response/my_asm.rsp
Look at the log file "/data/oracle/cfgtoollogs/dbca/silent2.log" for further details.
$ more /data/oracle/cfgtoollogs/dbca/silent2.log
Could not mount the diskgroup on remote node ser2 using connection service ser2:1521:+ASM2. Ensure that the listener is running on
this node and the ASM instance is registered to the listener. Received the following error:
ORA-15032: not all alterations performed
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"
Could not mount the diskgroup on remote node ser2 using connection service ser2:1521:+ASM2. Ensure that the listener is running on
this node and the ASM instance is registered to the listener. Received the following error:
ORA-15032: not all alterations performed
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"
Apparently the same ownership change has to be made on node 2:
root@ser2 # chown oracle:oinstall emcpower0g
After rerunning dbca, the error was resolved.
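To confirm on each node that the ownership change actually reached the device, the link target can be listed instead of the link. The sketch below uses `ls -L` to follow the symbolic link and `-n` to print numeric IDs; the function name `target_owner` is hypothetical.

```shell
# Hypothetical helper: print "uid:gid" of the file a path resolves to.
# -L makes ls follow the symbolic link; -n prints numeric owner and group.
target_owner() {
    ls -lnL "$1" | awk '{ print $3 ":" $4 }'
}
```

Running `target_owner /dev/rdsk/emcpower0g` on ser1 and ser2 and comparing the result with `id oracle` is one way to spot a node where the chown was missed.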
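The eighth problem appeared while creating the database. Because no ASM password had been set, the run prompted for one, but as soon as Enter was pressed the script stopped: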
$ dbca -silent -responseFile /data/database/response/my_db.rsp
Enter ASM SYS user password:
$
After supplying ASM_SYS_PASSWORD as one of the parameters and rerunning, that problem went away.
This time, however, the run produced no output at all:
$ dbca -silent -responseFile /data/database/response/my_db.rsp
The command simply ended, with no error or success message of any kind.
The only option was to look in the dbca log directory:
$ cd /data/oracle/cfgtoollogs/dbca/
$ ls -l
total 50
-rw-r----- 1 oracle oinstall 840 Sep 2 15:33 silent.log
-rw-r----- 1 oracle oinstall 854 Sep 2 15:40 silent0.log
-rw-r----- 1 oracle oinstall 581 Sep 2 15:46 silent1.log
-rw-r----- 1 oracle oinstall 697 Sep 2 15:51 silent2.log
-rw-r----- 1 oracle oinstall 1 Sep 2 16:06 silent3.log
-rw-r----- 1 oracle oinstall 19714 Sep 2 17:47 trace.log_OraDbHome1
$ more trace.log_OraDbHome1
[main] [17:47:42:597] [CommandLineArguments.process:639] CommandLineArguments->process: number of arguments = 3
[main] [17:47:42:605] [CommandLineArguments.loadNodeinfo:3885] CommandLineArguments:loadNodeinfo: length of m_nodeinfo = 2
[main] [17:47:42:606] [CommandLineArguments.loadNodeinfo:3894] CommandLineArguments->loadNodeinfo: Node is {"ser1"
[main] [17:47:42:606] [CommandLineArguments.loadNodeinfo:3894] CommandLineArguments->loadNodeinfo: Node is "ser2"}
[main] [17:47:42:609] [CommandLineArguments.validateArguments:3372] CommandLineArguments->process: in Operation Type is Creation/GenerateScripts Mode condition
[main] [17:47:42:609] [OracleHome.hasEELicense:204] Running script to determine licensing: /data/oracle/product/11.1/database/bin/bndlchk
.
.
.
[Thread-7] [17:47:44:658] [StreamReader.run:65] OUTPUT>ser1 1
[Thread-7] [17:47:44:661] [StreamReader.run:65] OUTPUT>ser2 2
[main] [17:47:44:668] [RuntimeExec.runCommand:144] runCommand: process returns 0
[main] [17:47:44:668] [RuntimeExec.runCommand:161] RunTimeExec: output>
[main] [17:47:44:668] [RuntimeExec.runCommand:164] ser1 1
[main] [17:47:44:668] [RuntimeExec.runCommand:164] ser2 2
[main] [17:47:44:669] [RuntimeExec.runCommand:170] RunTimeExec: error>
[main] [17:47:44:669] [RuntimeExec.runCommand:192] Returning from RunTimeExec.runCommand
[main] [17:47:44:669] [ClusterInfo.getClusterNodeMap:960] Number of nodes=2
[main] [17:47:44:670] [ClusterInfo.getClusterNodeMap:972] i=0 nodeName=ser1 nodeNum=1
[main] [17:47:44:670] [ClusterInfo.getClusterNodeMap:972] i=1 nodeName=ser2 nodeNum=2
[main] [17:47:44:670] [StepContext.getInstanceNumbers:2968] firstNodeNum=1
[main] [17:47:44:670] [StepContext.getInstanceNumbers:2991] node={"ser1" nodeNum=null
[main] [17:47:44:671] [StepContext.getInstanceNumbers:3006] nodeNames[0]={"ser1" instance number=null
[main] [17:47:44:671] [StepContext.getInstanceNumbers:2991] node="ser2"} nodeNum=null
[main] [17:47:44:671] [StepContext.getInstanceNumbers:3006] nodeNames[1]="ser2"} instance number=null
Exception in thread "main" java.lang.NumberFormatException: null
at java.lang.Integer.parseInt(Integer.java:415)
at java.lang.Integer.parseInt(Integer.java:497)
at oracle.sysman.assistants.util.step.StepContext.getDBInstanceNumbers(StepContext.java:3038)
at oracle.sysman.assistants.dbca.backend.Host.createOPSConfiguration(Host.java:957)
at oracle.sysman.assistants.dbca.backend.SilentHost.performOperation(SilentHost.java:186)
at oracle.sysman.assistants.dbca.backend.Host.startOperation(Host.java:3090)
at oracle.sysman.assistants.dbca.Dbca.execute(Dbca.java:115)
at oracle.sysman.assistants.dbca.Dbca.main(Dbca.java:180)
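According to the trace log, the problem again seems to lie with the braces. After removing them and retrying, the problem was resolved. Oracle's own tools do not even follow a consistent standard; OUI and DBCA differ this visibly. Worst of all, the comment at the top of dbca's own response file says: StringList : {"
Note that, as with ASM earlier, both the braces and the double quotes have to be removed here.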
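Because dbca in silent mode can exit without printing anything, as it did here, it may help to scan the trace log for a Java exception after each run. This is only a sketch; `trace_has_exception` is a hypothetical name, and the search string is taken from the trace shown above.

```shell
# Hypothetical helper: print the numbered lines of a dbca trace log that
# report a Java exception; exit status is non-zero when none are found.
trace_has_exception() {
    grep -n 'Exception in thread' "$1"
}
```

For example: `trace_has_exception /data/oracle/cfgtoollogs/dbca/trace.log_OraDbHome1 && echo "dbca failed, check the trace"`.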
From the ITPUB blog, link: http://blog.itpub.net/4227/viewspace-544644/. Please credit the source when reposting; otherwise legal liability will be pursued.