Using oradebug dump hanganalyze to analyze Oracle hangs on Oracle 10.2.0.1 RAC, part 6

Original post | Oracle | Author: wisdomone1 | Date: 2015-11-13 23:43:40

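Conclusions

1. To simulate a database hang I tried oradebug suspend on ckpt, dbwr, smon and lmd without success; the background processes need further study.
2. Setting the process allocation latch with oradebug poke reproduced the case where new sessions cannot log in.
3. In the tests so far, ordinary wait events stay in other chains; only latch or mutex waits show up in open chains.
4. To diagnose latch free, use v$session.p1 or p2 to identify the specific latch,
   then use v$latch_misses to find the underlying cause.
5. For the name-service call wait event there is no clear guidance on how to resolve it.
   Neither a systemstate dump nor a processstate dump showed anything useful.
   strace gave some useful information: poll returned an error because the system call was interrupted, and the process then retried the same action over and over.
6. Being able to quickly understand the functions that appear in strace errors is very important,
   and so is the ability to relate those functions back to Oracle.
7. Tracing the process behind name-service call wait with strace -p revealed some of how that process works.
8. The trail finally leads to sockets; how to relate those sockets to Oracle still needs more thought.
9. After the process allocation latch was set via oradebug, the earliest new sessions could still connect, while later ones created no entry in v$session.
   I do not fully understand this yet and need to study it further.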






Test



--- Monitoring session
SQL> select sid,serial#,paddr from v$session where sid=(select sid from v$mystat where rownum=1);


       SID    SERIAL# PADDR
---------- ---------- ----------------
       153          8 0000000083A5E580


SQL> select pid,spid from v$process where addr='0000000083A5E580';


       PID SPID
---------- ------------
        18 16457






---- oradebug suspend on lgwr, ckpt, dbwr and lmon did not produce a hang, because my understanding of these background processes is still shallow


--- Dump before any wait event is introduced
Open chains found:
Other chains found:
Chain 1 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/138/1/0x83a63c78/29290/Streams AQ: qmn slave idle wait>
Chain 2 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/143/10/0x83a624c0/29028/Streams AQ: qmn coordinator idle>
Chain 3 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/146/1/0x83a60d08/28837/Streams AQ: waiting for messages>
Chain 4 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/152/14/0x83a63490/29288/Streams AQ: waiting for time man>
Chain 5 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/153/3/0x83a5e580/28897/No Wait>
Chain 6 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/170/1/0x83a56ee8/28481/DIAG idle wait>
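When several dumps have to be compared, it can help to turn the chain entries into structured records first. The following is a minimal Python sketch, not part of the original test. It assumes the entries keep the `<cnode/sid/sess_srno/proc_ptr/ospid/wait_event>` layout shown above; the names `ChainNode` and `parse_nodes` are made up for illustration.

```python
import re
from typing import NamedTuple


class ChainNode(NamedTuple):
    cnode: int
    sid: int
    sess_srno: int
    proc_ptr: str
    ospid: int
    wait_event: str


# Matches data entries only; the "<cnode/sid/...>" header line is skipped
# because its first field is not numeric.
NODE_RE = re.compile(r"<(\d+)/(\d+)/(\d+)/(0x[0-9a-fA-F]+)/(\d+)/([^>]*)>")


def parse_nodes(text):
    """Return one ChainNode per chain entry found in a hanganalyze excerpt."""
    return [
        ChainNode(int(c), int(s), int(n), p, int(o), e.strip())
        for c, s, n, p, o, e in NODE_RE.findall(text)
    ]


sample = """\
Chain 5 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/153/3/0x83a5e580/28897/No Wait>
Chain 6 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/170/1/0x83a56ee8/28481/DIAG idle wait>
"""

nodes = parse_nodes(sample)
for node in nodes:
    print(node.sid, node.ospid, node.wait_event)
```

The sid and ospid fields can then be matched against v$session and v$process, as the monitoring queries at the start of the test do by hand.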


--- Dump after adding an ordinary wait event


The session for the newly added wait event does not appear in other chains; it appears in state of nodes.
SQL> delete from t_lock where rownum=1;


1 row deleted.


Open chains found:
Other chains found:
Chain 1 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/138/1/0x83a63c78/29290/Streams AQ: qmn slave idle wait>
Chain 2 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/143/10/0x83a624c0/29028/Streams AQ: qmn coordinator idle>
Chain 3 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/146/1/0x83a60d08/28837/Streams AQ: waiting for messages>
Chain 4 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/152/14/0x83a63490/29288/Streams AQ: waiting for time man>
Chain 5 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/153/8/0x83a5e580/16457/No Wait>
Chain 6 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/170/1/0x83a56ee8/28481/DIAG idle wait>


[147]/0/148/928/0x83b62eb0/15740/IGN/13/14//none    






--- Use oradebug to simulate hung session logins, i.e. poke the process allocation latch
SQL> select name,addr,latch#,level# from v$latch where name='process allocation';


NAME                           ADDR                 LATCH#     LEVEL#
------------------------------ ---------------- ---------- ----------
process allocation             0000000060007498          3          1




SQL> select 'oradebug poke 0x'||addr||' 4 0x00000001;' from v$latch where latch#=3;


'ORADEBUGPOKE0X'||ADDR||'40X00000001;'
----------------------------------------------
oradebug poke 0x0000000060007498 4 0x00000001;


SQL> oradebug setmypid
Statement processed.
SQL> oradebug poke 0x0000000060007498 4 0x00000001;
BEFORE: [060007498, 06000749C) = 00000000
AFTER:  [060007498, 06000749C) = 00000001


-- A new session cannot connect; the login hangs
[oracle@jingfa1 ~]$ sqlplus tbs_zxy/system


SQL*Plus: Release 10.2.0.1.0 - Production on Fri Nov 13 02:58:46 2015


Copyright (c) 1982, 2005, Oracle.  All rights reserved.


Other chains gains one entry: session 167, waiting on latch free
Open chains found:
Other chains found:
Chain 1 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/138/1/0x83a63c78/29290/Streams AQ: qmn slave idle wait>
Chain 2 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/143/10/0x83a624c0/29028/Streams AQ: qmn coordinator idle>
Chain 3 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/146/1/0x83a60d08/28837/Streams AQ: waiting for messages>
Chain 4 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/152/14/0x83a63490/29288/Streams AQ: waiting for time man>
Chain 5 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/153/8/0x83a5e580/16457/No Wait>
Chain 6 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/167/1/0x83a57eb8/28485/latch free> -- new
Chain 7 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/170/1/0x83a56ee8/28481/DIAG idle wait>




-- A second new session cannot log in
[root@jingfa1 ~]# su - oracle
[oracle@jingfa1 ~]$ sqlplus tbs_zxy/system


SQL*Plus: Release 10.2.0.1.0 - Production on Fri Nov 13 06:03:34 2015


Copyright (c) 1982, 2005, Oracle.  All rights reserved.
Other chains gains one more entry
Open chains found:
Other chains found:
Chain 1 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/138/1/0x83a63c78/29290/Streams AQ: qmn slave idle wait>
Chain 2 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/143/10/0x83a624c0/29028/Streams AQ: qmn coordinator idle>
Chain 3 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/146/1/0x83a60d08/28837/Streams AQ: waiting for messages>
Chain 4 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/152/14/0x83a63490/29288/Streams AQ: waiting for time man>
Chain 5 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/153/8/0x83a5e580/16457/No Wait>
Chain 6 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/156/1/0x83a5cdc8/28532/os thread startup>  -- new
Chain 7 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/167/1/0x83a57eb8/28485/latch free>
Chain 8 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/170/1/0x83a56ee8/28481/DIAG idle wait>




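-- A third new session cannot log in


Open chains now has entries: session 167 moved from other chains to open chains, and session 148 is new.
In addition, the extra dump levels grew from the original 5 and 10 to include 4 and 6, four levels in total.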
Open chains found:
Chain 1 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/167/1/0x83a57eb8/28485/latch free> -- moved
 -- <0/148/1074/0x83a5fd38/18767/name-service call wait> -- new
Other chains found:
Chain 2 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/138/1/0x83a63c78/29290/Streams AQ: qmn slave idle wait>
Chain 3 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/143/10/0x83a624c0/29028/Streams AQ: qmn coordinator idle>
Chain 4 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/146/1/0x83a60d08/28837/Streams AQ: waiting for messages>
Chain 5 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/152/14/0x83a63490/29288/Streams AQ: waiting for time man>
Chain 6 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/153/8/0x83a5e580/16457/No Wait>
Chain 7 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/156/1/0x83a5cdc8/28532/DFS lock handle>
Chain 8 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/170/1/0x83a56ee8/28481/DIAG idle wait>
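Spotting which session moved or changed between two dumps is easy to get wrong by eye. The sketch below is a hypothetical helper, assuming the section headers "Open chains found:" and "Other chains found:" and the entry layout shown in the dumps. It reports sessions that are new or whose section or wait event changed. `sessions_by_section` and `diff_dumps` are illustrative names, and the two sample strings are abbreviated from the second and third dumps.

```python
import re

NODE_RE = re.compile(r"<(\d+)/(\d+)/(\d+)/(0x[0-9a-fA-F]+)/(\d+)/([^>]*)>")


def sessions_by_section(dump):
    """Map sid -> (section, wait_event) for a hanganalyze chain listing."""
    section = None
    result = {}
    for line in dump.splitlines():
        if line.startswith("Open chains found"):
            section = "open"
        elif line.startswith("Other chains found"):
            section = "other"
        match = NODE_RE.search(line)
        if match and section:
            result[int(match.group(2))] = (section, match.group(6).strip())
    return result


def diff_dumps(before, after):
    """Report sessions that are new, or whose section or wait event changed."""
    old, new = sessions_by_section(before), sessions_by_section(after)
    changes = {}
    for sid, state in new.items():
        if sid not in old:
            changes[sid] = ("new", state)
        elif old[sid] != state:
            changes[sid] = ("changed", old[sid], state)
    return changes


before = """\
Open chains found:
Other chains found:
    <0/156/1/0x83a5cdc8/28532/os thread startup>
    <0/167/1/0x83a57eb8/28485/latch free>
"""
after = """\
Open chains found:
    <0/167/1/0x83a57eb8/28485/latch free>
 -- <0/148/1074/0x83a5fd38/18767/name-service call wait>
Other chains found:
    <0/156/1/0x83a5cdc8/28532/DFS lock handle>
"""

changes = diff_dumps(before, after)
for sid in sorted(changes):
    print(sid, changes[sid])
```

On these excerpts the helper flags session 148 as new, session 167 as moved from other to open, and session 156 as having switched wait events.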




Extra information that will be dumped at higher levels:
[level  4] :   1 node dumps -- [REMOTE_WT] [LEAF] [LEAF_NW] 
[level  5] :   7 node dumps -- [SINGLE_NODE] [SINGLE_NODE_NW] [IGN_DMP] 
[level  6] :   1 node dumps -- [NLEAF] 
[level 10] :  18 node dumps -- [IGN] 




-- A fourth new session also cannot log in


Open chains and other chains are unchanged
Open chains found:
Chain 1 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/167/1/0x83a57eb8/28485/latch free>
 -- <0/148/1074/0x83a5fd38/18767/name-service call wait>
Other chains found:
Chain 2 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/138/1/0x83a63c78/29290/Streams AQ: qmn slave idle wait>
Chain 3 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/143/10/0x83a624c0/29028/Streams AQ: qmn coordinator idle>
Chain 4 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/146/1/0x83a60d08/28837/Streams AQ: waiting for messages>
Chain 5 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/152/14/0x83a63490/29288/Streams AQ: waiting for time man>
Chain 6 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/153/8/0x83a5e580/16457/No Wait>
Chain 7 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/156/1/0x83a5cdc8/28532/DFS lock handle>
Chain 8 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/170/1/0x83a56ee8/28481/DIAG idle wait>




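-- A fifth new session also cannot log in


The following is based on the fourth failed login above.
None of the sessions that failed to log in appear in open chains, other chains, or state of nodes. Let's analyze why.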
SQL> select distinct type from v$session;


TYPE
----------
USER
BACKGROUND


SQL> select count(*) from v$session where type='USER';


  COUNT(*)
----------
         5


SQL> select sid,serial#,program,event from v$session where  type='USER' order by 1;


       SID    SERIAL# PROGRAM                                          EVENT
---------- ---------- ------------------------------------------------ ----------------------------------------------------------------
       144          1 racgimon@jingfa1 (TNS V1-V3)                     SQL*Net message from client
       145          1 racgimon@jingfa1 (TNS V1-V3)                     SQL*Net message from client
       146          1 racgimon@jingfa1 (TNS V1-V3)                     Streams AQ: waiting for messages in the queue
       148       1074                                                  name-service call wait
       153          8 sqlplus@jingfa1 (TNS V1-V3)                      SQL*Net message to client


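-- Start another session that fails to log in and check whether the V$SESSION output above changes. It does not: a session that has not logged in successfully gets no row in V$SESSION.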
SQL> select sid,serial#,program,event from v$session where  type='USER' order by 1;


       SID    SERIAL# PROGRAM                                          EVENT
---------- ---------- ------------------------------------------------ ----------------------------------------------------------------
       144          1 racgimon@jingfa1 (TNS V1-V3)                     SQL*Net message from client
       145          1 racgimon@jingfa1 (TNS V1-V3)                     SQL*Net message from client
       146          1 racgimon@jingfa1 (TNS V1-V3)                     Streams AQ: waiting for messages in the queue
       148       1074                                                  name-service call wait
       153          8 sqlplus@jingfa1 (TNS V1-V3)                      SQL*Net message to client


      
 Next, let's bring in the v$latch family of views, starting with the latch free wait above.
      
Session 167 is waiting on latch free
SQL> select sid,serial#,program,event,blocking_session,p1,p1text,p2,p2text,p3,p3text from v$session where  sid=167;


       SID    SERIAL# PROGRAM                        EVENT                BLOCKING_SESSION         P1 P1TEXT                  P2 P2TEXT                  P3 P3TEXT
---------- ---------- ------------------------------ -------------------- ---------------- ---------- --------------- ---------- --------------- ---------- ---------------
       167          1 oracle@jingfa1 (LMON)          latch free                            1610642584 address                  3 number              135485 tries


p2 is the latch number, so the wait is on the process allocation latch, which ties back to the oradebug poke earlier in the test
SQL> select name,latch# from v$latch where latch#=3;


NAME                                                   LATCH#
-------------------------------------------------- ----------
process allocation                                          3


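Alternatively, use p1, the latch address. p1 is reported in decimal (1610642584 = 0x60007498), while v$latch.addr is displayed as 16 hex digits, so pad the hex value with eight leading zeros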
SQL> select name,latch# from v$latch where addr='0000000060007498';


NAME                                                   LATCH#
-------------------------------------------------- ----------
process allocation                                          3
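The conversion from the decimal p1 value to the string used in the v$latch.addr predicate can be checked with a couple of lines of Python. This is only a worked example of the arithmetic; `latch_addr_from_p1` is a made-up helper name, and the 16-digit width is taken from the ADDR values displayed in this test (a 64-bit instance).

```python
def latch_addr_from_p1(p1: int) -> str:
    """Format a decimal latch address as 16 upper-case hex digits."""
    return format(p1, "016X")


addr = latch_addr_from_p1(1610642584)
print(addr)  # 0000000060007498
```

The result is the same address that was passed to oradebug poke earlier.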


Now look at v$latch_misses
SQL> select parent_name,nwfail_count,sleep_count,wtr_slp_count,longhold_count,location from v$latch_misses where parent_name='process allocation';


PARENT_NAME                                        NWFAIL_COUNT SLEEP_COUNT WTR_SLP_COUNT LONGHOLD_COUNT LOCATION
-------------------------------------------------- ------------ ----------- ------------- -------------- ----------------------------------------------------------------
process allocation                                            0           0        841049              0 ksuapc
process allocation                                            0           0             0              0 ksukia
process allocation                                            0           0             0              0 ksucrp
process allocation                                            0     1019502        178453              0 ksufap: active procs
process allocation                                            0           0             0              0 ksdxwcwpt
process allocation                                            0           0             0              0 ksdxwdwpt
process allocation                                            0           0             0              0 ksusigskip
process allocation                                            0           0             0              0 ksu_reserve
process allocation                                            0           0             0              0 ksu_unreserve
process allocation                                            0           0             0              0 ksu_unreserve_proc
process allocation                                            0           0             0              0 ksudlp


11 rows selected.


In production you can use the following SQL to find which latch, and which code location within it, has the most contention
SQL> set pause on
SQL> select parent_name,nwfail_count,sleep_count,wtr_slp_count,longhold_count,location from v$latch_misses order by 3 desc;




PARENT_NAME                                        NWFAIL_COUNT SLEEP_COUNT WTR_SLP_COUNT LONGHOLD_COUNT LOCATION
-------------------------------------------------- ------------ ----------- ------------- -------------- ----------------------------------------------------------------
process allocation                                            0     1058941        183540              0 ksufap: active procs
ges resource hash list                                        0           5             0              0 kjrmas1: lookup master node






Now look at session 148's wait event, name-service call wait
SQL> select sid,serial#,program,event,blocking_session,p1,p1text,p2,p2text,p3,p3text from v$session where  sid=148;




       SID    SERIAL# PROGRAM         EVENT                BLOCKING_SESSION         P1 P1TEXT                  P2 P2TEXT                  P3 P3TEXT
---------- ---------- --------------- -------------------- ---------------- ---------- --------------- ---------- --------------- ---------- ---------------
       148       1074                 name-service call wa                          50 waittime                 0                          0
                                      it








SQL> select paddr from v$session where sid=148;




PADDR
----------------
0000000083A5FD38


SQL> select spid from v$process where addr='0000000083A5FD38';




SPID
------------
18767




A processstate dump turned up nothing useful
SQL> oradebug setospid 18767
Oracle pid: 21, Unix process pid: 18767, image: oracle@jingfa1 (PZ99)
SQL> oradebug dump processstate 10
Statement processed.
SQL> oradebug tracefile_name
/u01/app/oracle/admin/jingfa/bdump/jingfa1_pz99_18767.trc


The wait class of the event is Other
SQL> select sid,wait_class from v$session_wait class where sid=148;


       SID WAIT_CLASS
---------- ----------------------------------------------------------------
       148 Other




Let's try tracing the process with strace
[oracle@jingfa1 ~]$ strace -p 18767
poll([{fd=13, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=19, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 2, 500) = -1 EINTR (Interrupted system call)
--- SIGALRM (Alarm clock) @ 0 (0) ---
rt_sigprocmask(SIG_BLOCK, [], NULL, 8)  = 0
times(NULL)                             = 434946713
rt_sigprocmask(SIG_BLOCK, [ALRM], NULL, 8) = 0
times(NULL)                             = 434946713
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={5, 0}}, NULL) = 0
rt_sigprocmask(SIG_UNBLOCK, [ALRM], NULL, 8) = 0
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={5, 0}}, NULL) = 0
rt_sigprocmask(SIG_UNBLOCK, [], NULL, 8) = 0
rt_sigreturn(0x1)                       = -1 EINTR (Interrupted system call)


Locate the file descriptors (fd) used by the poll call
[oracle@jingfa1 fd]$ pwd
/proc/18767/fd
[oracle@jingfa1 fd]$ ll
total 0
lr-x------ 1 oracle oinstall 64 Nov 13 07:09 0 -> /dev/null
lr-x------ 1 oracle oinstall 64 Nov 13 07:09 1 -> /dev/null
lrwx------ 1 oracle oinstall 64 Nov 13 07:09 10 -> /u01/app/oracle/product/10.2.0/db_1/dbs/lkinstjingfa1 (deleted)
lr-x------ 1 oracle oinstall 64 Nov 13 07:09 11 -> /dev/zero
lrwx------ 1 oracle oinstall 64 Nov 13 07:09 12 -> /u01/app/oracle/admin/jingfa/adump/ora_28419.aud
lrwx------ 1 oracle oinstall 64 Nov 13 07:09 13 -> socket:[1736890]  -- fd 13
lr-x------ 1 oracle oinstall 64 Nov 13 07:09 14 -> /dev/zero
lrwx------ 1 oracle oinstall 64 Nov 13 07:09 15 -> /u01/app/oracle/product/10.2.0/db_1/dbs/hc_jingfa1.dat  
lr-x------ 1 oracle oinstall 64 Nov 13 07:09 16 -> /dev/zero
lr-x------ 1 oracle oinstall 64 Nov 13 07:09 17 -> /u01/app/oracle/product/10.2.0/db_1/rdbms/mesg/oraus.msb
lrwx------ 1 oracle oinstall 64 Nov 13 07:09 18 -> socket:[1736894]
lrwx------ 1 oracle oinstall 64 Nov 13 07:09 19 -> socket:[1736897]  -- fd 19
l-wx------ 1 oracle oinstall 64 Nov 13 07:09 2 -> /u01/app/oracle/admin/jingfa/bdump/jingfa1_pz99_18767.trc
lr-x------ 1 oracle oinstall 64 Nov 13 07:09 3 -> /dev/null
lr-x------ 1 oracle oinstall 64 Nov 13 07:09 4 -> /dev/null
l-wx------ 1 oracle oinstall 64 Nov 13 07:09 5 -> /u01/app/oracle/admin/jingfa/udump/jingfa1_ora_28419.trc
l-wx------ 1 oracle oinstall 64 Nov 13 07:09 6 -> /u01/app/oracle/admin/jingfa/bdump/alert_jingfa1.log
lrwx------ 1 oracle oinstall 64 Nov 13 07:09 7 -> /u01/app/oracle/product/10.2.0/db_1/dbs/hc_jingfa1.dat
l-wx------ 1 oracle oinstall 64 Nov 13 07:09 8 -> /u01/app/oracle/admin/jingfa/bdump/alert_jingfa1.log
lr-x------ 1 oracle oinstall 64 Nov 13 07:09 9 -> /dev/null   
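One possible next step for relating these sockets to something recognizable is to take the inode number from the `socket:[...]` link and look it up in the kernel's socket tables under /proc/net. The Python sketch below shows the idea under several assumptions: the socket is an IPv4 TCP or UDP socket (a UNIX-domain socket would have to be looked up in /proc/net/unix, which has a different column layout), the host is little-endian as on x86, and the sample table row is invented for illustration and is not data from the test system. `socket_inode` and `find_endpoint` are hypothetical helper names.

```python
import re
import socket
import struct


def socket_inode(link_target):
    """Extract the inode from a /proc/<pid>/fd link target such as 'socket:[123]'."""
    match = re.fullmatch(r"socket:\[(\d+)\]", link_target.strip())
    return int(match.group(1)) if match else None


def find_endpoint(inode, proc_net_text):
    """Search text in /proc/net/tcp or /proc/net/udp format for an inode.

    Returns (local_ip, local_port) or None. IPv4 only; assumes the
    little-endian address encoding used on x86.
    """
    for line in proc_net_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) > 9 and fields[9] == str(inode):
            ip_hex, port_hex = fields[1].split(":")
            ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
            return ip, int(port_hex, 16)
    return None


# Link target copied from the listing above.
print(socket_inode("socket:[1736890]"))

# Invented sample row (inode 12345), only to show the parsing.
sample = (
    "  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode\n"
    "   0: 0100007F:0035 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 12345 2 0000000000000000 0\n"
)
print(find_endpoint(12345, sample))
```

On a live system the second argument would be the contents of /proc/net/udp or /proc/net/tcp read while the process is still running.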




Check man poll for the documentation of this system call
[oracle@jingfa1 ~]$ man poll
POLL(2)                    Linux Programmer’s Manual                   POLL(2)


NAME
       poll, ppoll - wait for some event on a file descriptor


SYNOPSIS
       #include <poll.h>


       int poll(struct pollfd *fds, nfds_t nfds, int timeout); -- three parameters in total


       #define _GNU_SOURCE
       #include <poll.h>


       int ppoll(struct pollfd *fds, nfds_t nfds,
               const struct timespec *timeout, const sigset_t *sigmask);


DESCRIPTION
       poll() performs a similar task to select(2): it waits for one of a set of file descriptors to become ready to perform I/O.


       Author's note: the set of file descriptors is passed as poll's first argument, an array of structures (comparable to an array in Oracle)
       The set of file descriptors to be monitored is specified in the fds argument, which is an array of nfds structures of the following form:
           Each element of the first argument has three fields:
           struct pollfd {
               int   fd;         /* file descriptor */
               short events;     /* requested events */
               short revents;    /* returned events */
           };
           };


       The field fd contains a file descriptor for an open file. 


       The field events is an input parameter, a bitmask specifying the events the application is interested in.


       The  field  revents  is  an output parameter, filled by the kernel with the events that actually occurred.  The bits returned in revents can include any of
       those specified in events, or one of the values POLLERR, POLLHUP, or POLLNVAL.  (These three bits are meaningless in the events field, and will be  set  in
       the revents field whenever the corresponding condition is true.)


       If none of the events requested (and no error) has occurred for any of the file descriptors, then poll() blocks until one of the events occurs.


       Author's note: the third argument is the timeout, the maximum time poll may block before returning to the caller; a negative value means wait indefinitely
       The  timeout  argument  specifies an upper limit on the time for which poll() will block, in milliseconds.  Specifying a negative value in timeout means an
       infinite timeout.


       Author's note: the event bits are defined in poll.h, as listed below
       The bits that may be set/returned in events and revents are defined in <poll.h>:


              POLLIN There is data to read.


              POLLPRI
                     There is urgent data to read (e.g., out-of-band data on TCP socket; pseudo-terminal master in packet mode has seen state change in slave).


              POLLOUT
                     Writing now will not block.


              POLLRDHUP (since Linux 2.6.17)
                     Stream socket peer closed connection, or shut down writing half of connection.  The _GNU_SOURCE feature test macro must be defined  in  order
                     to obtain this definition.


              POLLERR
                     Error condition (output only).


              POLLHUP
                     Hang up (output only).


              POLLNVAL
                     Invalid request: fd not open (output only).


       When compiling with _XOPEN_SOURCE defined, one also has the following, which convey no further information beyond the bits listed above:


              POLLRDNORM
                     Equivalent to POLLIN.


              POLLRDBAND
                     Priority band data can be read (generally unused on Linux).


              POLLWRNORM
                     Equivalent to POLLOUT.


              POLLWRBAND
                     Priority data may be written.


       Linux also knows about, but does not use POLLMSG.


   ppoll()
       The  relationship between poll() and ppoll() is analogous to the relationship between select() and pselect(): like pselect(), ppoll() allows an application
       to safely wait until either a file descriptor becomes ready or until a signal is caught.


       Other than the difference in the timeout argument, the following ppoll() call:


           ready = ppoll(&fds, nfds, timeout, &sigmask);


       is equivalent to atomically executing the following calls:


           sigset_t origmask;


           sigprocmask(SIG_SETMASK, &sigmask, &origmask);
           ready = poll(&fds, nfds, timeout);
           sigprocmask(SIG_SETMASK, &origmask, NULL);


       See the description of pselect(2) for an explanation of why ppoll() is necessary.


       The timeout argument specifies an upper limit on the amount of time that ppoll() will block.  This argument is a pointer to a structure  of  the  following
       form:


         struct timespec {
             long    tv_sec;         /* seconds */
             long    tv_nsec;        /* nanoseconds */
         };


       If timeout is specified as NULL, then ppoll() can block indefinitely.


Return value: a positive count of ready descriptors on success, 0 on timeout, -1 on error
RETURN VALUE
       On  success,  a  positive  number  is returned; this is the number of structures which have non-zero revents fields (in other words, those descriptors with
       events or errors reported).  A value of 0 indicates that the call timed out and no file descriptors were ready. On error, -1 is returned, and errno is  set
       appropriately.


Possible error values
ERRORS
       EBADF  An invalid file descriptor was given in one of the sets.


       EFAULT The array given as argument was not contained in the calling program’s address space.


       EINTR  A signal occurred before any requested event. -- this is the error seen in the strace output above


       EINVAL The nfds value exceeds the RLIMIT_NOFILE value.


       ENOMEM There was no space to allocate file descriptor tables.


Now back to the poll call
poll([{fd=13, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=19, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 2, 500) = -1 EINTR (Interrupted system call)


The poll call was interrupted (EINTR) while waiting for the following requested events:
  POLLIN There is data to read.


  POLLPRI
                     There is urgent data to read (e.g., out-of-band data on TCP socket; pseudo-terminal master in packet mode has seen state change in slave).


  When compiling with _XOPEN_SOURCE defined, one also has the following, which convey no further information beyond the bits listed above:


  POLLRDNORM
                     Equivalent to POLLIN.


  POLLRDBAND
                     Priority band data can be read (generally unused on Linux).
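The difference between a timeout (return value 0) and a ready descriptor is easy to reproduce outside Oracle. The following Python sketch uses the standard `select.poll` wrapper on a local socket pair. It is an illustration of the system call's behaviour only, it says nothing about what the Oracle process is waiting for, and it assumes a platform where `select.poll` is available (Linux, not Windows). The helper `describe` is a made-up name.

```python
import select
import socket


def describe(mask):
    """Return the names of the common poll bits set in an event mask."""
    names = ["POLLIN", "POLLPRI", "POLLOUT", "POLLERR", "POLLHUP", "POLLNVAL"]
    return [name for name in names if mask & getattr(select, name)]


a, b = socket.socketpair()
poller = select.poll()
poller.register(a, select.POLLIN | select.POLLPRI)

# Nothing has been written yet, so this call times out after 50 ms and
# returns an empty list (the C call would return 0).
timed_out = poller.poll(50)

# After the peer writes one byte, the same call reports the fd as readable.
b.send(b"x")
ready = poller.poll(50)
ready_names = describe(ready[0][1])

print(timed_out)
print(ready_names)

a.close()
b.close()
```

In the strace output the traced process is in the first situation almost all the time: nothing arrives on fd 13 or fd 19 before the timeout expires.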




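Repeated strace runs show the call being retried over and over: first a series of timeouts, and finally
poll([{fd=13, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=19, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 2, 500) = -1 
In other words, the process polls fds 13 and 19 several times with nothing ready (timeout), then a poll is interrupted (the first trace shows a SIGALRM arriving, which is consistent with the 5-second timer armed by setitimer), and the same cycle starts again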


poll([{fd=13, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=19, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 2, 250) = 0 (Timeout)
times({tms_utime=43, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435139784
poll([{fd=13, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=19, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 2, 172) = 0 (Timeout)
times({tms_utime=43, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435139801
getrusage(RUSAGE_SELF, {ru_utime={0, 439933}, ru_stime={0, 91986}, ...}) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 439933}, ru_stime={0, 91986}, ...}) = 0
times({tms_utime=43, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435139801
poll([{fd=13, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=19, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 2, 500) = 0 (Timeout)
times({tms_utime=43, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435139851
getrusage(RUSAGE_SELF, {ru_utime={0, 439933}, ru_stime={0, 91986}, ...}) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 439933}, ru_stime={0, 91986}, ...}) = 0
times({tms_utime=43, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435139851
poll([{fd=13, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=19, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 2, 500) = 0 (Timeout)
times({tms_utime=43, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435139901
getrusage(RUSAGE_SELF, {ru_utime={0, 439933}, ru_stime={0, 91986}, ...}) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 439933}, ru_stime={0, 91986}, ...}) = 0
times({tms_utime=43, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435139902
poll([{fd=13, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=19, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 2, 500) = 0 (Timeout)
times({tms_utime=43, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435139952
getrusage(RUSAGE_SELF, {ru_utime={0, 439933}, ru_stime={0, 91986}, ...}) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 439933}, ru_stime={0, 91986}, ...}) = 0
times({tms_utime=43, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435139952
poll([{fd=13, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=19, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 2, 500) = 0 (Timeout)
times({tms_utime=44, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435140002
getrusage(RUSAGE_SELF, {ru_utime={0, 440932}, ru_stime={0, 91986}, ...}) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 440932}, ru_stime={0, 91986}, ...}) = 0
times({tms_utime=44, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435140002
poll([{fd=13, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=19, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 2, 500) = 0 (Timeout)
times({tms_utime=44, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435140052
getrusage(RUSAGE_SELF, {ru_utime={0, 440932}, ru_stime={0, 91986}, ...}) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 440932}, ru_stime={0, 91986}, ...}) = 0
times({tms_utime=44, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435140052
poll([{fd=13, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=19, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 2, 500) = 0 (Timeout)
times({tms_utime=44, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435140102
getrusage(RUSAGE_SELF, {ru_utime={0, 440932}, ru_stime={0, 91986}, ...}) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 440932}, ru_stime={0, 91986}, ...}) = 0
times({tms_utime=44, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435140103
poll([{fd=13, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=19, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 2, 500) = 0 (Timeout)
times({tms_utime=44, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435140153
getrusage(RUSAGE_SELF, {ru_utime={0, 440932}, ru_stime={0, 92985}, ...}) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 440932}, ru_stime={0, 92985}, ...}) = 0
times({tms_utime=44, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435140153
poll([{fd=13, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=19, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 2, 500) = 0 (Timeout)
times({tms_utime=44, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435140203
getrusage(RUSAGE_SELF, {ru_utime={0, 440932}, ru_stime={0, 92985}, ...}) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 440932}, ru_stime={0, 92985}, ...}) = 0
times({tms_utime=44, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435140203
poll([{fd=13, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=19, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 2, 500) = 0 (Timeout)
times({tms_utime=44, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435140253
getrusage(RUSAGE_SELF, {ru_utime={0, 440932}, ru_stime={0, 93985}, ...}) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 440932}, ru_stime={0, 93985}, ...}) = 0
times({tms_utime=44, tms_stime=9, tms_cutime=0, tms_cstime=0}) = 435140254
poll([{fd=13, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=19, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 2, 500) = -1 EINTR (Interrupted system call)
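The values returned by times() give a quick cross-check on the trace. They are clock ticks, so the difference across each poll call should match the 500 ms timeout if the tick rate is 100 per second. That rate is an assumption here (it is the usual value on Linux, but it was not checked on the test system). The tick pairs below are copied from the trace: the times() value just before and just after four consecutive 500 ms polls.

```python
CLK_TCK = 100  # assumed ticks per second; check with `getconf CLK_TCK`

# (ticks before poll, ticks after poll), copied from the strace output
pairs = [
    (435139801, 435139851),
    (435139851, 435139901),
    (435139902, 435139952),
    (435139952, 435140002),
]

elapsed_ms = [(after - before) * 1000 // CLK_TCK for before, after in pairs]
print(elapsed_ms)  # [500, 500, 500, 500]
```

Each poll therefore blocks for its full timeout, which supports the reading that nothing ever arrives on the two sockets. The waittime value of 50 shown in P1 for name-service call wait would equal the same 0.5 seconds if it is expressed in centiseconds, but that unit is a guess, not something confirmed by the test.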








Source: "ITPUB Blog", link: http://blog.itpub.net/9240380/viewspace-1837759/. If you repost this article, please credit the source; otherwise legal liability may be pursued.
