ITPub博客

首页 > 数字化转型 > ERP > 统计信息收集出现DFS等待导致实例HANG死

统计信息收集出现DFS等待导致实例HANG死

原创 ERP 作者:yangtingkun 时间:2013-07-07 23:36:57 0 删除 编辑

客户10.2.0.4 RAC环境,出现大量的library cache lockcursor: pin S wait on X等待,经分析是由于统计信息收集僵死导致的。

[@more@]

数据库在8点到9点期间,数据库两个节点都存在明显的cursor: pin S wait on Xlibrary cache lock的等待:

Event

Waits

Time(s)

Avg Wait(ms)

% Total Call Time

Wait Class

cursor: pin S wait on X

1,573,056

30,651

19

146.2

Concurrency

library cache lock

31,757

7,009

221

33.4

Concurrency

CPU time

6,416

30.6

DFS lock handle

12,381

2,979

241

14.2

Other

latch: library cache

1,646

1,974

1,199

9.4

Concurrency

Event

Waits

Time(s)

Avg Wait(ms)

% Total Call Time

Wait Class

cursor: pin S wait on X

17,586,451

342,437

19

812.1

Concurrency

library cache lock

63,657

30,153

474

71.5

Concurrency

CPU time

3,820

9.1

db file sequential read

241,761

1,094

5

2.6

User I/O

inactive session

1,105

1,090

986

2.6

Other

两个节点的等待现象基本一致,而节点1上还存在明显的DFS lock handle等待事件。

通过分析ASH信息,发现library cache lockcursor: pin S wait on X等待基本上都是6点之后才开始出现:

SQL> select trunc(sample_time, 'hh24') time, count(*)
2 from WRH$_ACTIVE_SESSION_HISTORY ash, wrh$_event_name en
3 where ash.event_id = en.event_id
4 and sample_time >= to_timestamp('20130703', 'yyyymmdd')
5 and event_name in ('cursor: pin S wait on X', 'library cache lock')
6 group by trunc(sample_time, 'hh24')
7 order by 1;

TIME COUNT(*)
------------------- ----------
2013-07-03 00:00:00 4
2013-07-03 01:00:00 8
2013-07-03 02:00:00 3
2013-07-03 03:00:00 8
2013-07-03 04:00:00 9
2013-07-03 05:00:00 14
2013-07-03 06:00:00 348
2013-07-03 07:00:00 2138
2013-07-03 08:00:00 41576
2013-07-03 09:00:00 50108
2013-07-03 10:00:00 2452

而观察出现cursor: pin S wait on Xlibrary cache lock等待会话的BLOCKING SESSION,发现大部分处于等待的会话被会话37锁定:

SQL> select to_char(sample_time, 'hh24:mi:ss.ff') time, session_id sid,
2 decode(en.event_name, 'cursor: pin S wait on X', to_number(substr(ltrim(to_char(p2, '0xxxxxxxxxxxxxxx')), 1, 8), 'xxxxxxxx'), blocking_session) b_sid,
3 en.event_name
4 from WRH$_ACTIVE_SESSION_HISTORY ash, wrh$_event_name en
5 where ash.event_id = en.event_id(+)
6 and instance_number = 1
7 and sample_time >= to_timestamp('201307030800', 'yyyymmddhh24mi')
8 and sample_time <= to_timestamp('201307030801', 'yyyymmddhh24mi')
9 order by 1;

TIME SID B_SID EVENT_NAME
------------------ ---------- ---------- ------------------------------
08:00:09.569 457 4294967291 latch: library cache
08:00:09.569 52 981 cursor: pin S wait on X
08:00:09.569 168 73 library cache lock
08:00:09.569 1763 4294967291 SQL*Net message from client
08:00:09.569 123 4294967295 kksfbc child completion
08:00:09.569 476 4294967291 kst: async disk IO
08:00:09.569 500 4294967292 db file sequential read
08:00:09.569 327 4294967291 SQL*Net message to client
08:00:09.569 233 4294967291 db file scattered read
08:00:09.569 73 73 DFS lock handle
08:00:09.569 1889 4294967291 gc current block 2-way
08:00:09.569 1969 981 cursor: pin S wait on X
08:00:09.569 857 981 cursor: pin S wait on X
08:00:09.569 812 168 cursor: pin S wait on X
08:00:09.569 981 73 library cache lock
08:00:19.669 52 981 cursor: pin S wait on X
08:00:19.669 812 168 cursor: pin S wait on X
08:00:19.669 857 981 cursor: pin S wait on X
08:00:19.669 1969 981 cursor: pin S wait on X
08:00:19.669 233 4294967291 db file scattered read
08:00:19.669 500 4294967292 db file sequential read
08:00:19.669 1763 4294967291 db file sequential read
08:00:19.669 476 4294967291 kst: async disk IO
08:00:19.669 457 4294967291 kksfbc child completion
08:00:19.669 73 73 DFS lock handle
08:00:19.669 654 4294967291 SQL*Net message to client
08:00:19.669 168 73 library cache lock
08:00:19.669 981 73 library cache lock
08:00:29.769 1969 981 cursor: pin S wait on X
08:00:29.769 52 981 cursor: pin S wait on X
08:00:29.769 457 4294967291 kksfbc child completion
08:00:29.769 981 73 library cache lock
08:00:29.769 168 73 library cache lock
08:00:29.769 500 4294967292 db file sequential read
08:00:29.769 73 73 DFS lock handle
08:00:29.769 654 4294967291 SQL*Net message to client
08:00:29.769 327 4294967291 SQL*Net message to client
08:00:29.769 1763 4294967291 db file sequential read
08:00:29.769 857 981 cursor: pin S wait on X
08:00:29.769 123 4294967295 latch: library cache
08:00:29.769 196 4294967295 latch: library cache
08:00:29.769 233 4294967295 latch: library cache
08:00:29.769 812 168 cursor: pin S wait on X
08:00:29.769 476 4294967291 kst: async disk IO
08:00:39.879 476 4294967291 kst: async disk IO
08:00:39.879 658 4294967291 db file sequential read
08:00:39.879 500 4294967292 db file sequential read
08:00:39.879 1969 981 cursor: pin S wait on X
08:00:39.879 857 981 cursor: pin S wait on X
08:00:39.879 812 168 cursor: pin S wait on X
08:00:39.879 52 981 cursor: pin S wait on X
08:00:39.879 981 73 library cache lock
08:00:39.879 168 73 library cache lock
08:00:39.879 1849 4294967291 SQL*Net message to client
08:00:39.879 327 4294967291 SQL*Net message to client
08:00:39.879 235 4294967291 rdbms ipc reply
08:00:39.879 19 4294967291 rdbms ipc reply
08:00:39.879 73 73 DFS lock handle
08:00:39.879 123 4294967294 enq: WF - contention
08:00:49.987 476 4294967291 kst: async disk IO
08:00:49.987 500 4294967292 db file sequential read
08:00:49.987 382 4294967291 gc cr block busy
08:00:49.987 1969 981 cursor: pin S wait on X
08:00:49.987 857 981 cursor: pin S wait on X
08:00:49.987 812 168 cursor: pin S wait on X
08:00:49.987 235 4294967295 kksfbc child completion
08:00:49.987 658 4294967295 latch: library cache
08:00:49.987 19 4294967291 latch: library cache
08:00:49.987 981 73 library cache lock
08:00:49.987 168 73 library cache lock
08:00:49.987 1889 4294967291 SQL*Net message from client
08:00:49.987 73 73 DFS lock handle
08:00:49.987 52 981 cursor: pin S wait on X

可以判断,导致数据库实例HANG死的原因是因为SID73的进程:

SQL> select to_char(sample_time, 'hh24:mi:ss'), program, action, event_name
2 from wrh$_active_session_history ash, wrh$_event_name en
3 where session_id = 73
4 and ash.event_id = en.event_id(+)
5 and sample_time >= to_date('201307030550', 'yyyymmddhh24mi')
6 and sample_time <= to_date('201307030601', 'yyyymmddhh24mi')
7 order by 1;

TO_CHAR( PROGRAM ACTION EVENT_NAME
-------- ---------------------------------------- ------------------------------ ---------------------------
05:50:07 oracle@orcldb5 (J002) GATHER_STATS_JOB db file sequential read
05:50:17 oracle@orcldb5 (J002) GATHER_STATS_JOB db file sequential read
05:50:27 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:50:37 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:50:47 oracle@orcldb5 (J002) GATHER_STATS_JOB gc cr block 2-way
05:50:57 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:51:08 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:51:18 oracle@orcldb5 (J002) GATHER_STATS_JOB db file scattered read
05:51:28 oracle@orcldb5 (J002) GATHER_STATS_JOB gc cr multi block request
05:51:38 oracle@orcldb5 (J002) GATHER_STATS_JOB direct path write temp
05:51:48 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:51:58 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:52:08 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:52:18 or

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/4227/viewspace-1060686/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论
暂无介绍

注册时间:2007-12-29

  • 博文量
    1955
  • 访问量
    10424731