ITPub博客

首页 > 数据库 > Oracle > [20211203]为什么出现负数3.txt

[20211203]为什么出现负数3.txt

原创 Oracle 作者:lfree 时间:2021-12-03 16:49:45 0 删除 编辑

[20211203]为什么出现负数3.txt

--//前几天看华为监控sql语句遇到的问题,链接http://blog.itpub.net/267265/viewspace-2844951/
--//里面计算的latch_hit 大于100。也就是gets相减出现负数的情况。

--//首先考虑awr 做snapshot时从那里取值的,我仔细想一下应该来源V$LATCH_PARENT或者V$LATCH_CHILDREN。
--//继续昨天的探究看看。

1.环境:
> @ ver1
PORT_STRING                    VERSION        BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

2.分析:
> select sum(gets),name from V$LATCH_PARENT where name ='cache buffers chains' group by name
                         union all
                         select sum(gets),name from V$LATCH_CHILDREN where name ='cache buffers chains' group by name;
      SUM(GETS) NAME
--------------- ----------------------------------------
            167 cache buffers chains
  1438354062593 cache buffers chains

> /
      SUM(GETS) NAME
--------------- ----------------------------------------
            167 cache buffers chains
  1438470539683 cache buffers chains

--//很明显,计算应该是snap时sum(gets) V$LATCH_CHILDREN视图的总计。这样V$LATCH_CHILDREN的gets最大是power(2,32)-1=4294967295.

> select power(2,32)-1 from dual ;
  POWER(2,32)-1
---------------
     4294967295

> select * from (select latch#,child#,gets,name from V$LATCH_CHILDREN where name ='cache buffers chains' order by gets desc) where rownum<=3;
         LATCH#          CHILD#            GETS NAME
--------------- --------------- --------------- ----------------------------------------
            177          226480      4203680676 cache buffers chains
            177          281284      4203453046 cache buffers chains
            177          299292      4045467193 cache buffers chains

--//这样如果这些子拴锁的gets溢出,这样总和就会变小了。出现snap上下相减出现负数的可能性。
--//生产系统不能乱来,我通过测试环境演示这个情况。

3.环境:

SCOTT@book> @ ver1
PORT_STRING                    VERSION        BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

SCOTT@book> select rowid,dept.* from dept where rownum=1;
ROWID                  DEPTNO DNAME          LOC
------------------ ---------- -------------- -------------
AAAVRCAAEAAAACHAAA         10 ACCOUNTING     NEW YORK

SCOTT@book> @ rowid AAAVRCAAEAAAACHAAA
    OBJECT       FILE      BLOCK        ROW ROWID_DBA            DBA                  TEXT
---------- ---------- ---------- ---------- -------------------- -------------------- ----------------------------------------
     87106          4        135          0  0x1000087           4,135                alter system dump datafile 4 block 135 ;

SYS@book> @ bh 4 135
   INST_ID HLADDR              DBARFIL     DBABLK      CLASS CLASS_TYPE         STATE             TCH CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ BA               LE_ADDR          OBJECT_NAME
---------- ---------------- ---------- ---------- ---------- ------------------ ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------- ---------------- --------------------
         1 0000000084D25320          4        135          1 data block         xcur                1          0          0          0          0          0 000000006F2E0000 00               DEPT
--//记下HLADDR=0000000084D25320.

SYS@book> select addr,name,gets  from V$LATCH_CHILDREN where addr=hextoraw('0000000084D25320');
ADDR             NAME                                           GETS
---------------- ---------------------------------------- ----------
0000000084D25320 cache buffers chains                          17033

SCOTT@book> select rowid,dept.* from dept where rownum=1;
ROWID                  DEPTNO DNAME          LOC
------------------ ---------- -------------- -------------
AAAVRCAAEAAAACHAAA         10 ACCOUNTING     NEW YORK

SYS@book> select addr,name,gets  from V$LATCH_CHILDREN where addr=hextoraw('0000000084D25320');
ADDR             NAME                                           GETS
---------------- ---------------------------------------- ----------
0000000084D25320 cache buffers chains                          17035
--//查询1次,gets增加2.

SYS@book> select *  from V$LATCH_CHILDREN where addr=hextoraw('0000000084D25320')
  2  @ prxx
==============================
ADDR                          : 0000000084D25320
LATCH#                        : 177
CHILD#                        : 1793
LEVEL#                        : 1
NAME                          : cache buffers chains
HASH                          : 3563305585
GETS                          : 17035
MISSES                        : 0
SLEEPS                        : 0
IMMEDIATE_GETS                : 324
IMMEDIATE_MISSES              : 0
WAITERS_WOKEN                 : 0
WAITS_HOLDING_LATCH           : 0
SPIN_GETS                     : 0
...
PL/SQL procedure successfully completed.
--//可以发现gets=17035,增加2次。

SYS@book> oradebug peek 0x0000000084D25320 64
[084D25320, 084D25360) = 00000000 00000000 0000428B 000000B1 00000001 00000712 00000000 00000000 00000144 00000000 00000000 00000000 00000000 00000000 ...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SYS@book> @ calc 17035 + 0
         DEC                  HEX
------------ --------------------
17035.000000                 428B

--//324 = 0x144,IMMEDIATE_GETS在后面。b1 = 177,估计是LATCH#.
--//注意看下线先位置对应的正是gets的数量。

--//先做一次awr的snapshot。
SYS@book> exec dbms_workload_repository.create_snapshot();
PL/SQL procedure successfully completed.

SYS@book> oradebug poke 0x0000000084D25328 4 0xffffff00
BEFORE: [084D25328, 084D2532C) = 0000428B
AFTER:  [084D25328, 084D2532C) = FFFFFF00

SYS@book> select addr,name,gets  from V$LATCH_CHILDREN where addr=hextoraw('0000000084D25320');
ADDR             NAME                                           GETS
---------------- ---------------------------------------- ----------
0000000084D25320 cache buffers chains                     4294967040

--//2^32-1 = 4294967295,很接近溢出情况了。

SYS@book> exec dbms_workload_repository.create_snapshot();
PL/SQL procedure successfully completed.

SYS@book> select addr,name,gets  from V$LATCH_CHILDREN where addr=hextoraw('0000000084D25320');
ADDR             NAME                                           GETS
---------------- ---------------------------------------- ----------
0000000084D25320 cache buffers chains                     4294967058

--//4294967295-4294967058 = 237,多执行以上查询多次。

SYS@book> select addr,name,gets  from V$LATCH_CHILDREN where addr=hextoraw('0000000084D25320');
ADDR             NAME                                           GETS
---------------- ---------------------------------------- ----------
0000000084D25320 cache buffers chains                             30

--//OK,现在溢出去了。

SYS@book> exec dbms_workload_repository.create_snapshot();
PL/SQL procedure successfully completed.

SYS@book> select max(snap_id) from DBA_HIST_SNAPSHOT;
MAX(SNAP_ID)
------------
        1950

SELECT SNAP_ID
              --   ,DBID
              ,INSTANCE_NUMBER
              ,MISSES
              ,GETS
              ,LATCH_NAME
          FROM DBA_HIST_LATCH
         WHERE     latch_name = 'cache buffers chains'
               AND snap_id IN (1950, 1949, 1948);

   SNAP_ID INSTANCE_NUMBER     MISSES       GETS LATCH_NAME
---------- --------------- ---------- ---------- ----------------------------------------------------------------
      1948               1        103   25364624 cache buffers chains
      1949               1        103 4320392703 cache buffers chains
      1950               1        103   25472305 cache buffers chains

--//哈哈,演示出来了,中间的snap_id=1949的gets最大,更加说明一点华为监控这个无用,对于大型数据库逻辑读超高的情况下。

set numw 12
select * from (
  SELECT SNAP_ID
        --,DBID
        ,INSTANCE_NUMBER
        ,GETS
        ,misses
        ,MISSES - LAG (MISSES, 1, NULL) OVER (ORDER BY SNAP_ID) MISSES1
        ,GETS - LAG (GETS, 1, NULL) OVER (ORDER BY SNAP_ID) GETS1
    FROM (  SELECT SNAP_ID
                  ,DBID
                  ,INSTANCE_NUMBER
                  ,SUM (MISSES) MISSES
                  ,SUM (GETS) GETS
              FROM DBA_HIST_LATCH
             WHERE     INSTANCE_NUMBER IN (SELECT instance_number
                                             FROM v$INSTANCE)
                   AND DBID IN (SELECT DBID
                                  FROM v$database)
          GROUP BY SNAP_ID, DBID, INSTANCE_NUMBER)
ORDER BY SNAP_ID DESC) where rownum<=2;

     SNAP_ID INSTANCE_NUMBER         GETS       MISSES      MISSES1        GETS1
------------ --------------- ------------ ------------ ------------ ------------
        1950               1    260591322        38478            0  -4294716073
        1949               1   4555307395        38478            6   4295358093

--//可以看出对于大型生产系统看这样的监控毫无用处,查询大于latch_hit大于100的情况也许是问题最严重的时候。

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/267265/viewspace-2845698/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论
熟悉oracle相关技术,擅长sql优化,rman备份与恢复,熟悉linux shell编程。

注册时间:2008-01-03

  • 博文量
    3135
  • 访问量
    6839378