
A business system's listener terminates abnormally about once every 10 days (TNS-12537)

Original post · Oracle · Author: xueshancheng · Date: 2021-08-15 21:39:36

1 Checked the listener log: the request to stop the listener was issued by CRS.

2 Checked the CRS alert log: it contains error messages.


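3 Checked crsd.log: the listener's state transitions are abnormal. The problem matches the document "VIP, SCAN VIP/Listener Fails Over and Listener Stops After Short Public Network Hiccup (Doc ID 1333165.1)"; points 4 and 5 below are also consistent with that document.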

4 Checked orarootagent_root.log



5 Checked oraagent_<user>.log: the listener was forcibly stopped and its restart failed.


Cause:

Taken together, points 1-5 match the situation described in "VIP, SCAN VIP/Listener Fails Over and Listener Stops After Short Public Network Hiccup (Doc ID 1333165.1)", so the configuration was changed as that document describes.

 

The relevant official Oracle document follows:

VIP, SCAN VIP/Listener Fails Over and Listener Stops After Short Public Network Hiccup (Doc ID 1333165.1)

 

APPLIES TO:

Oracle Database - Enterprise Edition - Version 11.2.0.1 to 12.1.0.1 [Release 11.2 to 12.1]

Information in this document applies to any platform.

SYMPTOMS

After a check times out, the 11gR2 Grid Infrastructure network resource (usually ora.net1.network) goes to the INTERMEDIATE state, then returns to ONLINE shortly afterward. This note does not discuss the cause of the check timeout; the most common cause is a public network hiccup.

 

Once the network resource goes into the INTERMEDIATE state, resource dependencies may cause the VIP, services, SCAN VIP/SCAN listener, ora.cvu, ora.ons and so on to fail over or go offline, which can cause unnecessary connectivity problems for that period. After the network resource is back online, the affected resources may not come back online.

 

·  $GRID_HOME/log/<node>/crsd/crsd.log

2011-06-12 07:12:31.261: [    AGFW][10796] {0:1:2881} Received state change for ora.net1.network racnode1 1 [old state = ONLINE, new state = UNKNOWN]

2011-06-12 07:12:31.261: [    AGFW][10796] {0:1:2881} Received state LABEL change for ora.net1.network racnode1 1 [old label  = , new label = CHECK TIMED OUT]

..

2011-06-12 07:12:31.297: [   CRSPE][12081] {0:1:2881} RI [ora.net1.network racnode1 1] new external state [INTERMEDIATE] old value: [ONLINE] on racnode1 label = [CHECK TIMED OUT] 

..

2011-06-12 07:12:31.981: [    AGFW][10796] {0:1:2882} Received state change for ora.net1.network racnode1 1 [old state = UNKNOWN, new state = ONLINE]

..

2011-06-12 07:12:32.307: [   CRSPE][12081] {0:1:2881} RI [ora.LISTENER.lsnr racnode1 1] new internal state: [STOPPING] old value: [STABLE]

2011-06-12 07:12:32.308: [   CRSPE][12081] {0:1:2881} CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'racnode1'

·  $GRID_HOME/log/<node>/agent/crsd/orarootagent_root/orarootagent_root.log

2011-06-12 07:12:08.965: [    AGFW][2070] {1:27767:2} Created alert : (:CRSAGF00113:) :  Aborting the command: check for resource: ora.net1.network racnode1 1

2011-06-12 07:12:08.966: [ora.net1.network][2070] {1:27767:2} [check] clsn_agent::abort {

..

2011-06-12 07:12:31.257: [    AGFW][2070] {1:27767:2} Command: check for resource: ora.net1.network racnode1 1 completed with status: TIMEDOUT

2011-06-12 07:12:31.258: [    AGFW][2314] {1:27767:2} ora.net1.network racnode1 1 state changed from: ONLINE to: UNKNOWN

2011-06-12 07:12:31.258: [    AGFW][2314] {1:27767:2} ora.net1.network racnode1 1 would be continued to monitored!

2011-06-12 07:12:31.258: [    AGFW][2314] {1:27767:2} ora.net1.network racnode1 1 state details has changed from:  to: CHECK TIMED OUT

..

2011-06-12 07:12:31.923: [ora.net1.network][2314][F-ALGO] {1:27767:2} CHECK initiated by timer for: ora.net1.network racnode1 1

..

2011-06-12 07:12:31.973: [ora.net1.network][8502][F-ALGO] {1:27767:2} [check] Command check for resource: ora.net1.network racnode1 1 completed with status ONLINE

2011-06-12 07:12:31.978: [    AGFW][2314] {1:27767:2} ora.net1.network racnode1 1 state changed from: UNKNOWN to: ONLINE

·  $GRID_HOME/log/<node>/agent/crsd/oraagent_<user>/oraagent_<user>.log

2011-06-12 07:12:32.335: [    AGFW][2314] {0:1:2881} Agent received the message: RESOURCE_STOP[ora.LISTENER.lsnr racnode1 1] ID 4099:14792

2011-06-12 07:12:32.335: [    AGFW][2314] {0:1:2881} Preparing STOP command for: ora.LISTENER.lsnr racnode1 1

2011-06-12 07:12:32.335: [    AGFW][2314] {0:1:2881} ora.LISTENER.lsnr racnode1 1 state changed from: ONLINE to: STOPPING

 

·  $GRID_HOME/log/<node>/alert<node>.log

2012-01-10 06:48:18.474 [/ocw/grid/bin/orarootagent.bin(10485902)]CRS-5818:Aborted command 'check for resource: ora.net1.network racnode1 1' for resource 'ora.net1.network'. Details at (:CRSAGF00113:) {1:24200:2} in /ocw/grid/log/racnode1/agent/crsd/orarootagent_root/orarootagent_root.log.

2012-01-10 06:48:43.481 [/ocw/grid/bin/oraagent.bin(8847542)]CRS-5016:Process "/ocw/grid/bin/lsnrctl" spawned by agent "/ocw/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/ocw/grid/log/racnode1/agent/crsd/oraagent_grid/oraagent_grid.log"

2012-01-10 06:48:43.552 [/ocw/grid/bin/oraagent.bin(8847542)]CRS-5016:Process "/ocw/grid/opmn/bin/onsctli" spawned by agent "/ocw/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/ocw/grid/log/racnode1/agent/crsd/oraagent_grid/oraagent_grid.log"
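The signature to look for in crsd.log is the network resource flipping ONLINE to UNKNOWN and back within a second or so. A minimal sketch for pulling those transitions out of a large log is shown here; the function name net_state_changes and its output format are illustrative assumptions, not Oracle tooling, and the pattern is based only on the "Received state change" lines quoted above.

```shell
#!/bin/sh
# Sketch: summarise state transitions of one CRS resource from crsd.log.
# net_state_changes is a hypothetical helper, not part of Grid Infrastructure.
# Reads log text on stdin; $1 is the resource name (default ora.net1.network).
net_state_changes() {
    res="${1:-ora.net1.network}"
    # Keep only "Received state change" lines for this resource, then
    # reduce each one to: <timestamp> <old state> -> <new state>
    grep -F "Received state change for ${res} " |
        sed -E 's/^([0-9-]+ [0-9:.]+): .*\[old state = ([A-Z]+), new state = ([A-Z]+)\].*$/\1 \2 -> \3/'
}
```

It could be run as `net_state_changes ora.net1.network < $GRID_HOME/log/<node>/crsd/crsd.log`. The "state LABEL change" lines are skipped on purpose, since only the state transitions matter for spotting the hiccup.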

 

 

 

CAUSE

 

SOLUTION

The issue is addressed by the fixes for several different bugs:

1.  bug 12680491  fixes the dependence between network and VIP

 

The fix for bug 12680491 adds the intermediate modifier to the stop dependency between the network resource and the VIP, which avoids unnecessary resource state changes. It is included in 11.2.0.2 GI PSU4, 11.2.0.3 GI PSU3, 11.2.0.3 Windows Patch 7, 11.2.0.4 and above. This fix is recommended over the fix for bug 12378938 because it avoids the issue in the first place.

 

Once patch for this bug is applied, the following needs to be executed to change the dependence for all VIPs:

# $GRID_HOME/bin/crsctl modify res ora.<racnode1>.vip -attr "STOP_DEPENDENCIES=hard(intermediate:ora.<net1>.network)"

 

For example:

# /ocw/grid/bin/crsctl modify res ora.racnode1.vip -attr "STOP_DEPENDENCIES=hard(intermediate:ora.net1.network)"

Once the attribute is changed, nodeapps/VIP must be restarted for the change to take effect.
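Because the change has to be made for every VIP, it may help to generate the commands first and review them before running anything as root. The following is a dry-run sketch only: vip_dep_cmds is a hypothetical helper, and the node names passed to it are assumptions to be replaced with the real cluster node names.

```shell
#!/bin/sh
# Sketch: print (do not execute) the crsctl command for each node's VIP.
# vip_dep_cmds is a hypothetical helper. $1 is the network resource
# short name (for example net1); the remaining arguments are node names.
vip_dep_cmds() {
    net="$1"
    shift
    for node in "$@"; do
        # $GRID_HOME is printed literally so the output can be reviewed
        # and then pasted into a root shell where GRID_HOME is set.
        printf '$GRID_HOME/bin/crsctl modify res ora.%s.vip -attr "STOP_DEPENDENCIES=hard(intermediate:ora.%s.network)"\n' "$node" "$net"
    done
}
```

For example, `vip_dep_cmds net1 racnode1 racnode2` prints one command per node. Printing instead of executing keeps the step reviewable, which matters because the change only takes effect after the VIPs are restarted.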

 

2.  bug 13582411  fixes the dependence between network and SCAN VIP/listener

The fix for bug 13582411 adds the intermediate modifier to the stop dependency between the network resource and the SCAN VIP, which avoids unnecessary resource state changes. It is included in 11.2.0.3 GI PSU4, 11.2.0.4 and above.

 

Once patch for this bug is applied, the following needs to be executed to change the dependence for all SCAN VIPs and to restart SCAN VIPs:

# $GRID_HOME/bin/crsctl modify res ora.scan<n>.vip -attr "STOP_DEPENDENCIES=hard(intermediate:ora.net<n>.network)"

For example:

# /ocw/grid/bin/crsctl modify res ora.scan1.vip -attr "STOP_DEPENDENCIES=hard(intermediate:ora.net1.network)"

# /ocw/grid/bin/crsctl modify res ora.scan2.vip -attr "STOP_DEPENDENCIES=hard(intermediate:ora.net1.network)"

# /ocw/grid/bin/crsctl modify res ora.scan3.vip -attr "STOP_DEPENDENCIES=hard(intermediate:ora.net1.network)"

# /ocw/grid/bin/srvctl stop scan -f

$ /ocw/grid/bin/srvctl start scan_listener 
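After modifying the resources, it is worth confirming that each profile really carries the intermediate modifier. The sketch below assumes that `crsctl stat res <name> -p` prints the resource profile as NAME=VALUE lines, one of which is STOP_DEPENDENCIES; has_intermediate_dep is a hypothetical helper that only inspects that text.

```shell
#!/bin/sh
# Sketch: succeed if a resource profile on stdin has the intermediate
# modifier in its STOP_DEPENDENCIES attribute.
# has_intermediate_dep is a hypothetical helper; the input is assumed to
# be NAME=VALUE lines as printed by "crsctl stat res <name> -p".
has_intermediate_dep() {
    grep '^STOP_DEPENDENCIES=' | grep 'intermediate:' >/dev/null
}
```

A possible use is `$GRID_HOME/bin/crsctl stat res ora.scan1.vip -p | has_intermediate_dep || echo "ora.scan1.vip not yet modified"`, repeated for each SCAN VIP and node VIP.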

 

3.  bug 17435488  fixes the dependence between network and ora.cvu and ora.ons

The fix adds the intermediate modifier to the stop dependency between the network resource and ora.cvu and ora.ons, which avoids unnecessary resource state changes. It is included in 12.1.0.2.


Source: ITPUB blog, link: http://blog.itpub.net/69996316/viewspace-2787039/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
