Common Reasons Why 11gR2 Clusterware Processes Fail to Start

Linux Operating System  Author: 410192979  Date: 2016-04-07 17:05:19

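Grid Infrastructure resources fail to start: analysis of common causes

OHASD fails to start

Cause analysis:

1  The OS run level is set incorrectly

-- Linux run levels, for reference: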

rc0.d - System Halted

rc1.d - Single User Mode

rc2.d - Single User Mode with Networking

rc3.d - Multi-User Mode - boot up in text mode

rc4.d - Not yet Defined

rc5.d - Multi-User Mode - boot up in X Windows

rc6.d - Shutdown & Reboot

Check the run levels configured for ohasd (the second field of its inittab entry)

oracle@> more /etc/inittab | grep ohasd

h1:2:respawn:/etc/init.ohasd run >/dev/null 2>&1 </dev/null

Check the current run level of the system; ohasd is started only if this level is listed in the inittab entry

oracle@> who -r

   .        run-level 2 Jul 03 07:46       2    0    S
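The comparison between the inittab entry and the `who -r` output can be scripted. The following is a minimal sketch, not part of the original note: the function names `current_runlevel` and `ohasd_runlevel_ok` are hypothetical, and it assumes the standard `id:runlevels:action:process` inittab layout and a `who -r` format in which the level follows the word `run-level`, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read `who -r` output on stdin and print the token
# that follows "run-level" (the current run level).
current_runlevel() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "run-level") { print $(i + 1); exit } }'
}

# Hypothetical helper: succeed if run level $2 is listed in the run-level
# field (second colon-separated field) of the inittab entry $1.
ohasd_runlevel_ok() {
    level=$2
    [ -n "$level" ] || return 1
    levels=$(printf '%s\n' "$1" | cut -d: -f2)
    case "$levels" in
        *"$level"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Sample values copied from the output above. On a live system one would use
# "$(grep init.ohasd /etc/inittab)" and "$(who -r | current_runlevel)".
entry='h1:2:respawn:/etc/init.ohasd run >/dev/null 2>&1 </dev/null'
level=$(printf '%s\n' '   .        run-level 2 Jul 03 07:46       2    0    S' | current_runlevel)

if ohasd_runlevel_ok "$entry" "$level"; then
    echo "run level $level starts init.ohasd"
else
    echo "run level $level does not start init.ohasd"
fi
```

With the sample values, the entry lists run level 2 and the system is at run level 2, so the check passes.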

2 Whether "init.ohasd run" is running

If "init.ohasd run" is not running, ohasd.bin will not start

oracle@> ps -ef | grep ohasd | grep -v grep

    root  7340038        1  10   Jul 03      - 1022:34 /u001/app/11.2.0.2/grid/bin/ohasd.bin reboot

root  9568408        1   0   Jul 03      -  0:00 /bin/sh /etc/init.ohasd run

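If init.ohasd does not start in time, an error like the following is reported: "[ohasd(<pid>)] CRS-0715:Oracle High Availability Service has timed out waiting for init.ohasd to be started."

Note: starting with Linux 6 releases (for example OL6/RHEL6), inittab is deprecated and init.ohasd is configured under /etc/init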
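To make the process check in this step repeatable, the `ps -ef` output can be fed to a small filter. This is only a sketch under stated assumptions: the function name `ohasd_process_state` is hypothetical, and it simply looks for the strings `init.ohasd run` and `ohasd.bin` in the process list, as the manual check above does.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` output on stdin and report which of the
# two OHASD processes are present.
ohasd_process_state() {
    awk '
        /init\.ohasd run/ { wrapper = 1 }
        /ohasd\.bin/      { daemon = 1 }
        END {
            if (wrapper && daemon) print "both running"
            else if (wrapper)      print "init.ohasd running, ohasd.bin missing"
            else if (daemon)       print "ohasd.bin running, init.ohasd missing"
            else                   print "neither running"
        }'
}

# Sample lines copied from the ps output above. On a live system one would
# run: ps -ef | ohasd_process_state
printf '%s\n' \
    '    root  7340038        1  10   Jul 03      - 1022:34 /u001/app/11.2.0.2/grid/bin/ohasd.bin reboot' \
    'root  9568408        1   0   Jul 03      -  0:00 /bin/sh /etc/init.ohasd run' |
    ohasd_process_state
```

A result of "neither running" or "ohasd.bin running, init.ohasd missing" points back to the run level and init configuration checks.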

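3 Whether Clusterware autostart is enabled

Run $GRID_HOME/bin/crsctl config crs to check whether CRS is configured to start automatically

If autostart is disabled or cannot be determined, the OS log shows entries like: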

Feb 29 16:20:36 racnode1 logger: Oracle Cluster Ready Services startup disabled.

Feb 29 16:20:36 racnode1 logger: Could not access /var/opt/oracle/scls_scr/racnode1/root/ohasdstr

-- the file cannot be accessed or does not exist
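The "Could not access" message can be reproduced by testing the control file directly. The sketch below is an illustration only: `check_ohasdstr` is a hypothetical name, the path is the one from the log line above with `racnode1` as the example node, and the directory prefix is assumed to vary by platform.

```shell
#!/bin/sh
# Hypothetical helper: report whether the autostart control file $1 can be
# read, which is the condition the "Could not access" message refers to.
check_ohasdstr() {
    file=$1
    if [ ! -e "$file" ]; then
        echo "missing: $file"
        return 1
    elif [ ! -r "$file" ]; then
        echo "not readable: $file"
        return 1
    fi
    echo "readable: $file"
}

# Path taken from the log line above; the node name is an example.
check_ohasdstr "/var/opt/oracle/scls_scr/racnode1/root/ohasdstr" ||
    echo "autostart setting cannot be determined"
```

Run it as the user that starts the stack, because readability depends on who is asking.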

4 Whether the Oracle Local Registry (OLR) is accessible

 ls -altr $GRID_HOME/cdata/*.olr

If the OLR is inaccessible or corrupted, ohasd.log contains entries like:

2010-01-24 22:59:10.470: [ default][1373676464] Initializing OLR

2010-01-24 22:59:10.472: [  OCROSD][1373676464]utopen:6m':failed in stat OCR file/disk /ocw/grid/cdata/rac1.olr, errno=2, os err string=No such file or directory

2010-01-24 22:59:10.472: [  OCROSD][1373676464]utopen:7:failed to open any OCR file/disk, errno=2, os err string=No such file or directory

2010-01-24 22:59:10.473: [  OCRRAW][1373676464]proprinit: Could not open raw device

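5 ohasd.bin cannot access the socket files

The network socket files are usually located under /tmp, /var or /opt (typically in a .oracle subdirectory such as /tmp/.oracle or /var/tmp/.oracle)

ohasd.log shows: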

2010-06-29 10:31:01.570: [ COMMCRS][1206901056]clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=procr_local_conn_0_PROL))

 

2010-06-29 10:31:01.571: [  OCRSRV][1217390912]th_listen: CLSCLISTEN failed clsc_ret= 3, addr= [(ADDRESS=(PROTOCOL=ipc)(KEY=procr_local_conn_0_PROL))]

2010-06-29 10:31:01.571: [  OCRSRV][3267002960]th_init: Local listener did not reach valid state
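The log signatures for checks 4 and 5 can be searched for in one pass. The following sketch is not an Oracle tool: `classify_ohasd_log` is a hypothetical name, the two patterns are copied from the excerpts above, and real logs may word these errors differently across versions.

```shell
#!/bin/sh
# Hypothetical helper: read ohasd.log lines on stdin and print one hint per
# known failure signature. The patterns come from the excerpts above.
classify_ohasd_log() {
    awk '
        /failed to open any OCR file\/disk/ && !olr++ {
            print "OLR not accessible: check $GRID_HOME/cdata/*.olr"
        }
        /clsclisten: Permission denied/ && !sock++ {
            print "socket files not accessible: check their ownership and permissions"
        }'
}

# Sample lines copied from the excerpts above. On a live system the input
# would be the ohasd.log file under $GRID_HOME/log/<node>/ohasd/.
printf '%s\n' \
    '2010-01-24 22:59:10.472: [  OCROSD][1373676464]utopen:7:failed to open any OCR file/disk, errno=2, os err string=No such file or directory' \
    '2010-06-29 10:31:01.570: [ COMMCRS][1206901056]clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=procr_local_conn_0_PROL))' |
    classify_ohasd_log
```

Each hint is printed at most once, however often the pattern repeats in the log.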

6 ohasd.bin cannot access its log directory

The OS messages file (syslog) shows:

Feb 20 10:47:08 racnode1 OHASD[9566]: OHASD exiting; Directory /ocw/grid/log/racnode1/ohasd not found.

7 ohasd fails to start (hangs)

ps -ef | grep ohasd.bin shows that ohasd.bin has started, but ohasd.log has not been updated for a long time. Tracing the process with truss shows:

15058/1:         0.1995 close(2147483646)                               Err#9 EBADF

15058/1:         0.1996 close(2147483645)                               Err#9 EBADF

A pstack trace of ohasd.bin shows:

_close  sclssutl_closefiledescriptors  main ..

This is caused by bug 11834289, which is fixed in 11.2.0.3

 

OHASD agents fail to start

OHASD.bin spawns four agents:

oraagent: responsible for ora.asm, ora.evmd, ora.gipcd, ora.gpnpd, ora.mdnsd etc
orarootagent: responsible for ora.crsd, ora.ctssd, ora.diskmon, ora.drivers.acfs etc
cssdagent / cssdmonitor: responsible for ora.cssd(for ocssd.bin) and ora.cssdmonitor(for cssdmonitor itself)

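1 The most common problem is that the agent's log directory lacks the required ownership or permissions

2 The agent binary is corrupted, so the agent cannot start; the log shows: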

2011-05-03 11:11:13.189

[ohasd(25303)]CRS-5828:Could not start agent '/ocw/grid/bin/orarootagent_grid'. Details at (:CRSAGF00130:) {0:0:2} in /ocw/grid/log/racnode1/ohasd/ohasd.log.

 

OCSSD.bin fails to start

ocssd.bin needs the following conditions to start:

1 The GPnP profile is accessible

-- the profile stores the CSS discovery string

<orcl:CSS-Profile id="css" DiscoveryString="/s001/oracle_crsdata/votingfile" LeaseDuration="400"/> -- here the voting disk is not stored in ASM

2 The voting disk is accessible

Use the DiscoveryString found in the GPnP profile in step 1 to locate it

3 The network is working
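Steps 1 and 2 above can be combined: pull the DiscoveryString out of the profile, then check what it resolves to. This is a rough sketch, assuming the profile text is already available (for example from the profile.xml file under the Grid home); `css_discovery_string` is a hypothetical name, and the `ls` check is only meaningful when the voting files are on a file system rather than in ASM.

```shell
#!/bin/sh
# Hypothetical helper: read a GPnP profile on stdin and print the value of
# the DiscoveryString attribute of the CSS-Profile element.
css_discovery_string() {
    sed -n 's/.*<orcl:CSS-Profile[^>]*DiscoveryString="\([^"]*\)".*/\1/p'
}

# Sample element copied from the profile excerpt above.
profile_line='<orcl:CSS-Profile id="css" DiscoveryString="/s001/oracle_crsdata/votingfile" LeaseDuration="400"/>'
disc=$(printf '%s\n' "$profile_line" | css_discovery_string)
echo "CSS discovery string: $disc"

# List what the discovery string matches. $disc is left unquoted on purpose
# so that wildcard patterns in the string expand.
if [ -z "$disc" ]; then
    echo "no DiscoveryString found in profile"
elif ! ls -l $disc 2>/dev/null; then
    echo "no voting file visible under $disc"
fi
```

The pattern is anchored on the CSS-Profile element, so a DiscoveryString on another element of the same profile is not picked up.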

 

 

CRSD.bin fails to start

1 Whether ocssd is running

2 Whether the OCR is accessible

3 The crsd.bin pid file exists and points to the crsd.bin process

oracle@ justin> pwd

/u001/app/11.2.0.2/grid/crs/init

oracle@ justin> more justin.pid

22347868

oracle@ justin> ps -ef | grep 22347868

    root 22347868        1   6   Jul 03      - 1279:53 /u001/app/11.2.0.2/grid/bin/crsd.bin reboot

If this file does not exist, or the pid in it points to a process other than crsd.bin, crsd cannot start normally; see orarootagent_root.log for details
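The pid file check in step 3 can be wrapped in a function. The sketch below is illustrative only: `pidfile_points_to` is a hypothetical name, the path is the one from the session above, and it relies on the POSIX `ps -p <pid> -o args=` form to read the command line of the process.

```shell
#!/bin/sh
# Hypothetical helper: succeed if pid file $1 exists and the process it
# names has $2 in its command line.
pidfile_points_to() {
    pidfile=$1
    name=$2
    [ -r "$pidfile" ] || { echo "pid file missing: $pidfile"; return 1; }
    pid=$(tr -d '[:space:]' < "$pidfile")
    case "$pid" in
        ''|*[!0-9]*) echo "pid file has no valid pid: $pidfile"; return 1 ;;
    esac
    # An empty result means no process with that pid exists.
    args=$(ps -p "$pid" -o args= 2>/dev/null) || args=''
    case "$args" in
        '')        echo "pid $pid is not running"; return 1 ;;
        *"$name"*) echo "pid $pid is $name"; return 0 ;;
        *)         echo "pid $pid is not $name: $args"; return 1 ;;
    esac
}

# Example with the path and file name shown above.
pidfile_points_to /u001/app/11.2.0.2/grid/crs/init/justin.pid crsd.bin ||
    echo "crsd.bin pid file check failed"
```

A stale pid file that names an unrelated process is reported separately from a pid that no longer exists, since the two cases call for different follow-up.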

4 Permissions on the CRSD-related executables are set incorrectly

-- check crsd.bin and crsd under $GRID_HOME/bin

 

Reference: My Oracle Support document 1050908.1

Source: "ITPUB博客", link: http://blog.itpub.net/26366371/viewspace-2076948/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
