Oracle RAC: Case Study of Intermittent Listener Outages on the First Node

Original  Linux Operating System  Author: paulyibinyi  Date: 2011-07-06 20:26:32

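1          Overview

Problem summary: on the morning of 2010-5-31, the listener_p730a resource on node p730a went offline, causing the application to switch over to node p730b. After that, the listener on p730a dropped intermittently, roughly every four or five days.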

Operating system: AIX 6100

Database: Oracle 10.2.0.5 RAC

Storage: EMC CX4-960

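2          Problem Description

On the morning of 2010-5-31, the listener_p730a resource on node p730a went offline, causing the application to switch over to node p730b. Afterwards, the listener on p730a went offline roughly every four or five days and had to be started manually each time.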

3          Troubleshooting Process

1.   The problem can be worked around temporarily by restarting the listener:

  srvctl stop listener -n p730a

  srvctl start listener -n p730a

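2.   Check the operating system logs

The most recent entries were dated 2011.5.18; the operating system logged no related errors after that.

3.   Review the CRS log on node p730a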

2011-05-30 09:19:37.294: [  CRSAPP][11834]32CheckResource error for ora.p730b.vip error code = 1

2011-05-30 09:19:37.308: [  CRSRES][11834]32In stateChanged, ora.p730b.vip target is ONLINE

2011-05-30 09:19:37.309: [  CRSRES][11834]32ora.p730b.vip on p730a went OFFLINE unexpectedly

2011-05-30 09:19:37.309: [  CRSRES][11834]32StopResource: setting CLI values

2011-05-30 09:19:37.321: [  CRSRES][11834]32Attempting to stop `ora.p730b.vip` on member `p730a`

2011-05-30 09:19:37.689: [  CRSRES][11834]32Stop of `ora.p730b.vip` on member `p730a` succeeded.

2011-05-30 09:19:37.690: [  CRSRES][11834]32ora.p730b.vip RESTART_COUNT=0 RESTART_ATTEMPTS=0

2011-05-30 09:19:37.692: [  CRSRES][11834]32ora.p730b.vip failed on p730a relocating.

2011-05-30 09:19:37.755: [  CRSRES][11834]32Attempting to start `ora.p730b.vip` on member `p730b`

2011-05-30 09:19:44.705: [  CRSRES][11834]32Start of `ora.p730b.vip` on member `p730b` failed.

2011-05-30 09:21:08.879: [  CRSAPP][11841]32CheckResource error for ora.p730a.vip error code = 1

2011-05-30 09:21:08.883: [  CRSRES][11841]32In stateChanged, ora.p730a.vip target is ONLINE

2011-05-30 09:21:08.883: [  CRSRES][11841]32ora.p730a.vip on p730a went OFFLINE unexpectedly

2011-05-30 09:21:08.883: [  CRSRES][11841]32StopResource: setting CLI values

2011-05-30 09:21:08.903: [  CRSRES][11841]32Attempting to stop `ora.p730a.vip` on member `p730a`

2011-05-30 09:21:09.280: [  CRSRES][11841]32Stop of `ora.p730a.vip` on member `p730a` succeeded.

2011-05-30 09:21:09.280: [  CRSRES][11841]32ora.p730a.vip RESTART_COUNT=0 RESTART_ATTEMPTS=0

2011-05-30 09:21:09.283: [  CRSRES][11841]32ora.p730a.vip failed on p730a relocating.

2011-05-30 09:21:09.321: [  CRSRES][11841]32StopResource: setting CLI values

2011-05-30 09:21:09.330: [  CRSRES][11841]32Attempting to stop `ora.p730a.LISTENER_P730A.lsnr` on member `p730a`

2011-05-30 09:22:26.511: [  CRSRES][11841]32Stop of `ora.p730a.LISTENER_P730A.lsnr` on member `p730a` succeeded.

2011-05-30 09:22:26.527: [  CRSRES][11841]32Attempting to start `ora.p730a.vip` on member `p730b`

2011-05-30 09:22:28.006: [  CRSRES][11841]32Start of `ora.p730a.vip` on member `p730b` succeeded.

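The log shows that the listener on p730a went offline mainly because the VIP on p730a went offline: CRS stopped ora.p730a.LISTENER_P730A.lsnr and then automatically relocated the p730a VIP resource to node p730b.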
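To find how often this happens, the CRS log can be scanned for the "went OFFLINE unexpectedly" entries. The following is a minimal sketch, assuming the log lines have the same layout as the excerpt above; the function name list_unexpected_offline is hypothetical and not part of any Oracle tool.

```shell
#!/bin/sh
# Print "date time resource node" for every resource that CRS reports as
# having gone OFFLINE unexpectedly. Assumes the line layout of the excerpt:
#   <date> <time>: [  CRSRES][<thread>]32<resource> on <node> went OFFLINE unexpectedly
list_unexpected_offline() {
    # $1: path to a crsd log file (supplied by the caller)
    awk '/went OFFLINE unexpectedly/ {
        # Fields 1 and 2 are the date and the time; drop the trailing colon.
        ts = $1 " " $2
        sub(/:$/, "", ts)
        # Strip everything up to the last "]" plus the digits that follow it.
        msg = $0
        sub(/^.*\][0-9]*/, "", msg)
        # msg is now "<resource> on <node> went OFFLINE unexpectedly".
        split(msg, parts, " ")
        print ts, parts[1], parts[3]
    }' "$1"
}
```

For example, list_unexpected_offline /path/to/crsd.log lists one event per line; for the excerpt above it would report ora.p730b.vip at 09:19:37.309 and ora.p730a.vip at 09:21:08.883, both on p730a. The log path is a placeholder to be replaced with the actual crsd log location on the node.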

 

4.   Enable debug tracing on the VIP resource

 crsctl debug log res "ora.p730a.vip:5"

        The resulting trace files are written under $ORA_CRS_HOME/log/p730a/.

5.      Follow MetaLink document ID 1297867.1

    Modify the racgvip script using the steps below:

1. Stop all node applications.
% srvctl stop nodeapps -n

2. Back up, then modify the racgvip script.

Change:
# timeout of ping in number of loops (1 sec)
PING_TIMEOUT=" -c 1 -w 1"

To:
# timeout of ping in number of loops (3 sec)
PING_TIMEOUT=" -c 1 -w 3"

3. Start the node applications and other necessary resources.
% srvctl start nodeapps -n
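Step 2 above (back up, then edit) could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name patch_ping_timeout is hypothetical, the path to racgvip must be supplied by the caller, and the PING_TIMEOUT line is assumed to look exactly like the "Change:" line quoted above. It should be run only while the node applications are stopped, as in step 1.

```shell
#!/bin/sh
# Back up a racgvip script, then raise the ping wait (-w) from 1 to 3 seconds.
patch_ping_timeout() {
    # $1: path to the racgvip script (an assumption; pass the real location)
    script=$1
    backup="$script.bak"
    # Refuse to overwrite an earlier backup.
    [ -e "$backup" ] && { echo "backup $backup already exists" >&2; return 1; }
    cp -p "$script" "$backup" || return 1
    # Write the edited copy to a temporary file, because in-place sed
    # is not available on every platform.
    sed 's/^PING_TIMEOUT=" -c 1 -w 1"$/PING_TIMEOUT=" -c 1 -w 3"/' \
        "$backup" > "$script.new" || return 1
    # Copy the contents back so the original file keeps its owner and mode.
    cat "$script.new" > "$script" && rm -f "$script.new"
    # Show the resulting setting for a visual check.
    grep -n '^PING_TIMEOUT=' "$script"
}
```

If the printed line still shows -w 1, the script's PING_TIMEOUT line differs from the one assumed here and should be edited by hand; the .bak copy allows the change to be rolled back.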

6.   Disable debug

crsctl debug log res "ora.p730a.vip:0"

 In a later phone call, the customer confirmed that since the racgvip script was modified, the listener outage on p730a has not recurred.

4          Conclusions and Recommendations

For unusual CRS problems, enable debug tracing to generate logs and pinpoint the cause.

Enable debug


crsctl debug log res "ora.p730a.vip:5"
crsctl debug log res "ora.p730b.vip:5"


  Disable debug

 

crsctl debug log res "ora.p730a.vip:0"
crsctl debug log res "ora.p730b.vip:0"
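When several VIP resources need the same debug level, the commands above can be generated rather than typed one by one. The sketch below only prints the crsctl lines and executes nothing itself; the function name vip_debug_commands is hypothetical, and the resource names are assumed to follow the ora.<node>.vip pattern seen in this case.

```shell
#!/bin/sh
# Print one "crsctl debug log res" command per node for the given debug level.
# Nothing is executed here; pipe the output to sh to run the commands.
vip_debug_commands() {
    # $1: debug level (5 to enable, 0 to disable, as used above)
    # remaining arguments: node names
    level=$1
    shift
    for node in "$@"; do
        printf 'crsctl debug log res "ora.%s.vip:%s"\n' "$node" "$level"
    done
}
```

For example, vip_debug_commands 5 p730a p730b prints the two enable commands shown above, and vip_debug_commands 0 p730a p730b prints the matching disable commands. Printing first and piping to sh only after review keeps the step easy to audit.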

Source: "ITPUB Blog", link: http://blog.itpub.net/7199859/viewspace-701543/. Please credit the source when reposting; otherwise legal action may be pursued.
