AIX RAC 9i: interconnect (heartbeat) cable disconnection test

Original post | Category: Linux operating systems | Author: westzq1984 | Posted: 2009-05-11 22:04:29

SQL> select * from gv$instance;

   INST_ID INSTANCE_NUMBER INSTANCE_NAME    HOST_ VERSION           STARTUP_T STATUS       PAR    THREAD# ARCHIVE LOG_SWITCH_
---------- --------------- ---------------- ----- ----------------- --------- ------------ --- ---------- ------- -----------
LOGINS     SHU DATABASE_STATUS   INSTANCE_ROLE      ACTIVE_ST
---------- --- ----------------- ------------------ ---------
         1               1 rac1             P61A  9.2.0.8.0         11-MAY-09 OPEN         YES          1 STOPPED
ALLOWED    NO  ACTIVE            PRIMARY_INSTANCE   NORMAL

         2               2 rac2             P61B  9.2.0.8.0         11-MAY-09 OPEN         YES          2 STOPPED
ALLOWED    NO  ACTIVE            PRIMARY_INSTANCE   NORMAL


SQL> select inst_id,open_mode from gv$database;

   INST_ID OPEN_MODE
---------- ----------
         1 READ WRITE
         2 READ WRITE

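Unplug the interconnect (heartbeat) cable of instance 2 (P61B).

All nodes hang: queries on instance 1 do not return, logging in from a newly opened window hangs, and logging in to instance 2 hangs as well.

Node 1 alert log (P61A)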

Mon May 11 21:54:30 2009
IPC Send timeout detected. Sender ospid 250018
Mon May 11 21:54:31 2009
IPC Send timeout detected. Sender ospid 348268
Mon May 11 21:55:02 2009
Communications reconfiguration: instance 1
Evicting instance 2 from cluster
Mon May 11 21:55:29 2009
Waiting for instances to leave:
2
Mon May 11 21:55:33 2009
Trace dumping is performing id=[cdmp_20090511215503]
Mon May 11 21:55:39 2009
Reconfiguration started (old inc 2, new inc 4)
List of nodes:
 0
 Nested/batched reconfiguration detected.
 Global Resource Directory frozen
one node partition
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
 Resources and enqueues cleaned out
 Resources remastered 605
 601 GCS shadows traversed, 0 cancelled, 0 closed
 200 GCS resources traversed, 0 cancelled
 set master node info
 Submitted all remote-enqueue requests
 Update rdomain variables
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 601 GCS shadows traversed, 0 replayed, 0 unopened
 Submitted all GCS remote-cache requests
 0 write requests issued in 601 GCS resources
 0 PIs marked suspect, 0 flush PI msgs
Mon May 11 21:55:39 2009
Reconfiguration complete
 Post SMON to start 1st pass IR
Mon May 11 21:55:39 2009
Instance recovery: looking for dead threads
Mon May 11 21:55:39 2009
Beginning instance recovery of 1 threads
Mon May 11 21:55:39 2009
Started redo scan
Mon May 11 21:55:39 2009
Completed redo scan
 182 redo blocks read, 32 data blocks need recovery
Mon May 11 21:55:39 2009
Started recovery at
 Thread 2: logseq 5, block 3, scn 0.0
Mon May 11 21:55:39 2009
Recovery of Online Redo Log: Thread 2 Group 3 Seq 5 Reading mem 0
  Mem# 0 errs 0: /dev/rtrac_redo2_11
Mon May 11 21:55:40 2009
Completed redo application
Mon May 11 21:55:40 2009
Ended recovery at
 Thread 2: logseq 5, block 185, scn 0.209203
 8 data blocks read, 32 data blocks written, 182 redo blocks read
Ending instance recovery of 1 threads
SMON: about to recover undo segment 11
SMON: mark undo segment 11 as available
SMON: about to recover undo segment 12
SMON: mark undo segment 12 as available
SMON: about to recover undo segment 13
SMON: mark undo segment 13 as available
SMON: about to recover undo segment 14
SMON: mark undo segment 14 as available
SMON: about to recover undo segment 15
SMON: mark undo segment 15 as available
SMON: about to recover undo segment 16
SMON: mark undo segment 16 as available
SMON: about to recover undo segment 17
SMON: mark undo segment 17 as available
SMON: about to recover undo segment 18
SMON: mark undo segment 18 as available
SMON: about to recover undo segment 19
SMON: mark undo segment 19 as available
SMON: about to recover undo segment 20
SMON: mark undo segment 20 as available

Node 2 alert log (P61B)
Mon May 11 21:54:35 2009
IPC Send timeout detected. Sender ospid 299166
Mon May 11 21:55:07 2009
Communications reconfiguration: instance 0
IPC Send timeout detected. Sender ospid 393310
Mon May 11 21:55:37 2009
Trace dumping is performing id=[cdmp_20090511215507]
Mon May 11 21:55:43 2009
Errors in file /u01/app/oracle/admin/rac/bdump/rac2_lmon_393310.trc:
ORA-29740: evicted by member 0, group incarnation 3
Mon May 11 21:55:43 2009
LMON: terminating instance due to error 29740
Instance terminated by LMON, pid = 393310
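
To pick the eviction out of a longer alert log, the entries can be grouped by their timestamp lines and scanned for ORA- codes. The following is only a minimal sketch: the function names `parse_alert_log` and `find_ora_errors` are made up for illustration, and it assumes every entry starts with a timestamp line in the format shown above (English day and month names).

```python
import re
from datetime import datetime

# Timestamp layout used in the alert log above, e.g. "Mon May 11 21:55:43 2009".
TS_FORMAT = "%a %b %d %H:%M:%S %Y"
ORA_CODE = re.compile(r"ORA-\d{5}")


def parse_alert_log(text):
    """Group alert-log lines under the timestamp line that precedes them."""
    entries = []
    for line in text.splitlines():
        line = line.rstrip()
        if not line:
            continue
        try:
            stamp = datetime.strptime(line, TS_FORMAT)
        except ValueError:
            # Not a timestamp: attach the line to the most recent entry.
            if entries:
                entries[-1][1].append(line)
            continue
        entries.append((stamp, []))
    return entries


def find_ora_errors(entries):
    """Return (timestamp, code) pairs for every ORA-nnnnn code found."""
    return [(stamp, code)
            for stamp, lines in entries
            for line in lines
            for code in ORA_CODE.findall(line)]


# Node 2 excerpt copied from the log above.
NODE2_LOG = """\
Mon May 11 21:54:35 2009
IPC Send timeout detected. Sender ospid 299166
Mon May 11 21:55:07 2009
Communications reconfiguration: instance 0
IPC Send timeout detected. Sender ospid 393310
Mon May 11 21:55:37 2009
Trace dumping is performing id=[cdmp_20090511215507]
Mon May 11 21:55:43 2009
Errors in file /u01/app/oracle/admin/rac/bdump/rac2_lmon_393310.trc:
ORA-29740: evicted by member 0, group incarnation 3
Mon May 11 21:55:43 2009
LMON: terminating instance due to error 29740
Instance terminated by LMON, pid = 393310
"""

if __name__ == "__main__":
    for stamp, code in find_ora_errors(parse_alert_log(NODE2_LOG)):
        print(stamp.isoformat(sep=" "), code)
```

Run against the node 2 excerpt, this reports the single ORA-29740 entry logged at 21:55:43, just before LMON terminates the instance.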

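From the moment the network failure was detected to the moment the takeover completed took about 70 seconds, and node B was evicted from the cluster. However, from unplugging the cable to the failure being detected took about 6 minutes.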
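
The 70-second figure can be checked against the node 1 alert log timestamps. The sketch below is an illustration only: the helper `seconds_between` is hypothetical, and it assumes the first "IPC Send timeout detected" entry marks detection and the "Ended recovery at" entry marks the end of the takeover.

```python
from datetime import datetime

# Timestamp layout used in the alert log, e.g. "Mon May 11 21:54:30 2009".
TS_FORMAT = "%a %b %d %H:%M:%S %Y"


def seconds_between(start, end):
    """Whole seconds elapsed between two alert-log timestamp strings."""
    delta = datetime.strptime(end, TS_FORMAT) - datetime.strptime(start, TS_FORMAT)
    return int(delta.total_seconds())


# Timestamps copied from the node 1 (P61A) alert log.
first_ipc_timeout = "Mon May 11 21:54:30 2009"  # IPC Send timeout detected
eviction_decided = "Mon May 11 21:55:02 2009"   # Evicting instance 2 from cluster
recovery_ended = "Mon May 11 21:55:40 2009"     # Ended recovery at

detect_to_evict = seconds_between(first_ipc_timeout, eviction_decided)
evict_to_recovered = seconds_between(eviction_decided, recovery_ended)
total = seconds_between(first_ipc_timeout, recovery_ended)

if __name__ == "__main__":
    print("detection -> eviction decision:", detect_to_evict, "s")      # 32 s
    print("eviction decision -> recovery end:", evict_to_recovered, "s")  # 38 s
    print("total:", total, "s")                                          # 70 s
```

The roughly 6 minutes between unplugging the cable and the first timeout cannot be derived this way, because the unplug time itself is not recorded in the log.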


Query on node 1 (P61A)
SQL> /
select inst_id,open_mode from gv$database
                              *
ERROR at line 1:
ORA-12805: parallel query server died unexpectedly


SQL> SQL> SQL> SQL> SQL> SQL> SQL> SQL> /
select inst_id,open_mode from gv$database
*
ERROR at line 1:
ORA-12805: parallel query server died unexpectedly


SQL> /

   INST_ID OPEN_MODE
---------- ----------
         1 READ WRITE

SQL> /

   INST_ID OPEN_MODE
---------- ----------
         1 READ WRITE

Source: ITPUB blog, link: http://blog.itpub.net/8242091/viewspace-594777/. If you repost this article, please credit the source; otherwise legal action may be taken.
