ITPub博客

首页 > 数据库 > Oracle > 由oradebug poke process allocation latch引发dfs lock handle等待事件进一步分析

由oradebug poke process allocation latch引发dfs lock handle等待事件进一步分析

原创 Oracle 作者:wisdomone1 时间:2015-11-14 01:20:24 0 删除 编辑

结论

1,阻塞process allocation latch,数据库会出现latch free及dfs lock handle
2,dfs lock handle通过systemstate dump找不到有价值的信息
  却可通过v$session的p1,p2,p3找到有价值的信息
3,因为dfs lock handle的p1,p2,p3含义不知,只能借助于mos,找到文章:Understanding 'DFS Lock Handle' and Cross-Instance (CI) Call Waits [ID 458292.1]
  定位到锁CI
4,关于锁CI的含义请见v$lock_type
5, 关于ID1及ID2请见   Understanding 'DFS Lock Handle' and Cross-Instance (CI) Call Waits [ID 458292.1]
6,根据ID1及ID2不同的值,即会在不同的解决思路
7,ORACLE获取锁TYEP的脚本
select chr(bitand(p1,-16777216)/16777215) || chr(bitand(p1, 16711680)/65535) "Lock_type" from dual; --lock_type对应v$lock.type



测试



--oradebug suspend process allocation latch前
SQL> select sid,serial#,paddr from v$session where sid=(select sid from v$mystat where rownum=1);


       SID    SERIAL# PADDR
---------- ---------- ----------------
       153        742 0000000083A60520


SQL> select pid,spid from v$process where addr='0000000083A60520';


       PID SPID
---------- ------------
        22 28711


SQL> create table t_process(a int);


Table created.


SQL> select sid,serial#,program,event,blocking_session,p1,p1text,p2,p2text,p3,p3text from v$session where  sid=153;


       SID    SERIAL# PROGRAM                        EVENT           BLOCKING_SESSION         P1 P1TEXT                  P2 P2TEXT                  P3 P3TEXT
---------- ---------- ------------------------------ --------------- ---------------- ---------- --------------- ---------- --------------- ---------- ---------------
       153        742 sqlplus@jingfa1 (TNS V1-V3)    SQL*Net message                  1650815232 driver id                1 #bytes                   0
                                                      from client




---oradebug suspend process allocation latch后


SQL> select name,addr,latch#,level# from v$latch where name='process allocation';


NAME                 ADDR                 LATCH#     LEVEL#
-------------------- ---------------- ---------- ----------
process allocation   0000000060007498          3          1


SQL> select 'oradebug poke 0x'||addr||' 4 0x00000001;' from v$latch where latch#=3;


'ORADEBUGPOKE0X'||ADDR||'40X00000001;'
----------------------------------------------
oradebug poke 0x0000000060007498 4 0x00000001;        


SQL> oradebug setmypid
Statement processed.
SQL> oradebug poke 0x0000000060007498 4 0x00000001;
BEFORE: [060007498, 06000749C) = 00000000
AFTER:  [060007498, 06000749C) = 00000001


可见dml不受影响
SQL> insert into t_process values(1);


1 row created.


SQL> commit;


Commit complete.


而发生检查点,不过过了一会儿就正常了,这里就不好说process allocation latch对于发生检查点的影响程度了
SQL> alter system checkpoint;


等待dfs lock handle
SQL> select sid,serial#,program,event,blocking_session,p1,p1text,p2,p2text,p3,p3text from v$session where  sid=153;


       SID    SERIAL# PROGRAM                        EVENT           BLOCKING_SESSION         P1 P1TEXT                  P2 P2TEXT                  P3 P3TEXT
---------- ---------- ------------------------------ --------------- ---------------- ---------- --------------- ---------- --------------- ---------- ---------------
       153        742 sqlplus@jingfa1 (TNS V1-V3)    DFS lock handle                  1128857605 type|mode                2 id1                      2 id2






SQL> select sid,serial#,program,event from v$session where type='USER' and wait_class<>'Idle';


       SID    SERIAL# PROGRAM                        EVENT
---------- ---------- ------------------------------ --------------------------------------------------
       132        866 sqlplus@jingfa1 (TNS V1-V3)    SQL*Net message to client




新增2个无法登陆的会话,可见后台台进程mmon及lmon分别在等待dfs lock handle及latch free
SQL> select sid,serial#,program,event from v$session where   wait_class<>'Idle';


       SID    SERIAL# PROGRAM                        EVENT
---------- ---------- ------------------------------ --------------------------------------------------
       132        866 sqlplus@jingfa1 (TNS V1-V3)    SQL*Net message to client
       156          1 oracle@jingfa1 (MMON)          DFS lock handle
       167          1 oracle@jingfa1 (LMON)          latch free    


新增一个无法登陆会话,MMON又变成等待os thread startup
SQL> select sid,serial#,program,event from v$session where   wait_class<>'Idle';


       SID    SERIAL# PROGRAM                        EVENT
---------- ---------- ------------------------------ --------------------------------------------------
       132        866 sqlplus@jingfa1 (TNS V1-V3)    SQL*Net message to client
       156          1 oracle@jingfa1 (MMON)          os thread startup
       167          1 oracle@jingfa1 (LMON)          latch free          




继续新增一个无法登陆会话,等待会话由3个变成了4个
SQL> select sid,serial#,program,event from v$session where   wait_class<>'Idle';


       SID    SERIAL# PROGRAM                        EVENT
---------- ---------- ------------------------------ --------------------------------------------------
       132        866 sqlplus@jingfa1 (TNS V1-V3)    SQL*Net message to client
       156          1 oracle@jingfa1 (MMON)          DFS lock handle
       160          1 oracle@jingfa1 (CKPT)          os thread startup
       167          1 oracle@jingfa1 (LMON)          latch free


 新增会话,同上
 
 新增会话,由4个会话变成了3个会话,少了CKPT进程
 SQL> select sid,serial#,program,event from v$session where   wait_class<>'Idle';


       SID    SERIAL# PROGRAM                        EVENT
---------- ---------- ------------------------------ --------------------------------------------------
       132        866 sqlplus@jingfa1 (TNS V1-V3)    SQL*Net message to client
       156          1 oracle@jingfa1 (MMON)          DFS lock handle
       167          1 oracle@jingfa1 (LMON)          latch free      




新增会话,同上


分析dump,暂未分析出有价值的内容
SQL> oradebug setmypid
Statement processed.
SQL> oradebug dump systemstate 3
Statement processed.
SQL> oradebug tracefile_name
/u01/app/oracle/admin/jingfa/udump/jingfa1_ora_27463.trc 


转换思路,dfs lock handle几个参数含义到底是什么
SQL> select sid,serial#,program,event,p1,p1text,p2,p2text,p3,p3text from v$session where   wait_class<>'Idle';


       SID    SERIAL# PROGRAM                        EVENT                   P1 P1TEXT                  P2 P2TEXT                  P3 P3TEXT
---------- ---------- ------------------------------ --------------- ---------- --------------- ---------- --------------- ---------- ---------------
       132        866 sqlplus@jingfa1 (TNS V1-V3)    SQL*Net message 1650815232 driver id                1 #bytes                   0
                                                      to client


       156          1 oracle@jingfa1 (MMON)          DFS lock handle 1128857605 type|mode               10 id1                      2 id2
       167          1 oracle@jingfa1 (LMON)          latch free      1610642584 address                  3 number              177447 tries


也上DUMP文件匹配,但还是不知是什么含义
SQL> select to_char('1128857605','xxxxxxxxx') from dual;


TO_CHAR('1
----------
  43490005


经过查阅MOS发现如下文章:
Understanding 'DFS Lock Handle' and Cross-Instance (CI) Call Waits [ID 458292.1]
获知:id1及id2的含义:可见id1=10,即 Create Remote parallel query Server ,说明与进程创建有关系,也就是说无法创建进程
      id2=2,即调用后台进程中一个函数


1 reuse (checkpoint and invalidate block range) 
2 LGWR Checkpointing and Hot Backup 
3 DBWR syncronization of SGA with control file 
4 log file add/drop/rename notification 
5 miscellaneous CTWR operations 
7 Manged standby recovery related 
8 alter rollback segment optimal 
9 Signal Query Servers/coordinator 
10 Create Remote parallel query Server 
11 Set Global Partitions 
12 Stop Disk Writes 
13 Drop Sort Segments 
14 Release unused space from Sort Segments 
15 instance Recovery for Parallel operation Group 
16 Validate parallel slave Lock Value 
17 check transaction state objects 
18 object reuse request 
19 rolling release checks 
20 propagate begin backup scn for a file 
21 refresh top plan (for db scheduler 
22 clear checkpoint progress record 
23 drop temp file 
24 QUiesce database Restricted 
25 Update Dscn Tracking (ktcndt 
26 Purge dictionary Object number Cache 
27 set Database Force Logging mode 
28 invalidate cached file address translations 
29 Cursor Unauthorize Mode 
30 process waiters after row cache requeue 
31 Active Change Directory extent relocation 
32 block change tracking state change 
33 kgl mulitversion obsolete 
34 set previous resetlogs data 
35 set recovery destination pointer 
36 fast object reuse request 
38 ASM diskgroup discovery wait 
39 ASM diskgroup release 
40 ASM push DB updates 
41 ASM add ACD chunk 
42 ASM map resize message 
43 ASM map lock message 
44 ASM map unlock message (phase 1 
45 ASM map unlock message (phase 2 
46 ASM generate add disk redo marker 
d47 ASM check of PST validity 
48 ASM offline disk CIC 
49 Logical Standby Sync Point SCN 
50 update SQL Tuning Base existence bitvector 
51 PQ induced Checkpointing 
52 ASM F1X0 relocation 
53 Scheduler autostart 
54 KZS increment grant/revoke counter 
55 ASM disk operation message 
56 ASM I/O error emulation 
57 DB Supp log cursor invalidation 
58 Cache global range invalidation 
59 Cache global object invalidation 
60 ASM Pre-Existing Extent Lock wait 
61 Perform a ksk action through DBWR 
62 ASM diskgroup refresh wait 
63 KCBO object checkpoint 
64 KCBO object pq checkpoint 
65 global health check event 
66 Oracle Label Security refresh 
67 thread internal enable 
68 cross-instance registration 
69 KGL purge unused subheaps 
70 clear pin instance flag 
71 Rolling operations CIC
The following can be used to understand id2:


1 used to pass in parameters
2 used to invoke the function in backgroud process 
3 used to indicate the foreground has not returned
4 mounted excl, use to allocate mechanism 
5 used to queue up interested clients  


基于上述分析,获取fs lock handle到底是什么锁
SQL> select chr(bitand(1128857605,-16777216)/16777215) || chr(bitand(1128857605, 16711680)/65535) "Lock_name" from dual;


Lock_name
--------------------
CI


获取CI锁的含义,用于协调跨实例函数的调用,可见用于RAC
SQL> select type,name,id1_tag,id2_tag,is_user,description from v$lock_type where type='CI';


TYPE            NAME                 ID1_TAG    ID2_TAG    IS_USER    DESCRIPTION
--------------- -------------------- ---------- ---------- ---------- --------------------------------------------------
CI              Cross-Instance Call  opcode     type       NO         Coordinates cross-instance function invocations
                Invocation






发现没有任何会话持CI锁
SQL> select sid,type,id1,id2,lmode,request from v$lock where type='CI';


no rows selected




一运行全局GV$LOCK马上就HANG住了


SQL> select sid,type,id1,id2,lmode,request from gv$lock where type='CI';


在另一个节点查询看看,也没有内容
SQL> select sid,type,id1,id2,lmode,request from v$lock where type='CI';


no rows selected




---换个思路,查看对应节点的告警日志


上述查询一直HNAG时,告警日志是没有信息的,但如果你强制取消,出现如下信息,可见是启动并行进程时报错
引申,DB HANG或出故障时,查询GV视图会去创建并行查询进程(到于为何要创建这个并行查询进程,将在下文分析)
Fri Nov 13 12:06:58 2015
Timed out trying to start process PZ99.




这个就和前面的联系起来吧,哈哈,因为oradebug suspend process allocation latch




我们resume process allocation latch
[oracle@jingfa1 bdump]$ ps -ef|grep ora_|wc -l
25
[oracle@jingfa1 bdump]$ 


我们再次运行gv$lock查询
可见没有进程没有变化
[oracle@jingfa1 bdump]$ ps -ef|grep ora_|wc -l
25
[oracle@jingfa1 bdump]$ 






来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/9240380/viewspace-1837910/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论
提供针对oracle初学者及进阶的数据库培训,欢迎大家咨询: 微信: wisdomone 微信公众号: lovedb 工作经历: 中国普天 北京科蓝 北京云和恩墨 北京神州新桥

注册时间:2008-04-04

  • 博文量
    2189
  • 访问量
    11977358