ITPub博客

首页 > 数据库 > Oracle > [20211026]关于18c row cache mutex.txt

[20211026]关于18c row cache mutex.txt

原创 Oracle 作者:lfree 时间:2021-10-26 10:21:15 0 删除 编辑

[20211026]关于18c row cache mutex.txt

--//晚上,看了许多row cache mutex相关链接,发现如下:
ksun-oracle.blogspot.com/2020/08/row-cache-object-and-row-cache-mutex.html
ksun-oracle.blogspot.com/2018/07/row-cache-mutex-in-oracle-122010_28.html
ksun-oracle.blogspot.com/2017/07/nlsdatabaseparameters-dcprops-latch-row.html


--//ksun-oracle.blogspot.com 站点的测试比较详细.从上面介绍可以看出.12.1.0.2.0 (12cR1) 还再使用"latch: row cache
--//objects",从Oracle 12.2.0.1.0 (12cR2)开始才使用 "row cache mutex" 替换"latch: row cache objects".
In Oracle 12.2.0.1.0 (12cR2), "row cache mutex" replaced 12.1.0.2.0 (12cR1) "latch: row cache objects", similar to
"latch: library cache" substitution by "library cache: mutex X" in the previous release.

--//看了anbob网站,感觉row cache mutex相关的bug非常多.也许oracle每一项新特性都伴随大量的bug,稳定性还不足.
--//我们生产系统使用的ODA 的oracle版本竟然是19.0.0.0.0,上线之前都没有打任何补丁,无语...

--//ksun-oracle.blogspot.com 还拿nls_database_parameters视图做了许多相关测试.昨天上午上班无法....,回家仔细看了一下.
--//手头已经没有12.1.X版本,无法在生产系统19c上做这样的测试,找了一台虚拟机上运行18c的测试看看。

1.环境:
SYS@192.168.a.b:1521/orcl> @ ver
SYS@192.168.a.b:1521/orcl> @ prxx
==============================
PORT_STRING                   : x86_64/Linux 2.4.xx
VERSION                       : 18.0.0.0.0
BANNER                        : Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
BANNER_FULL                   : Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Version 18.3.0.0.0
BANNER_LEGACY                 : Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
CON_ID                        : 0
PL/SQL procedure successfully completed.

2.测试环境建立:
create table job_times (sid number, time_ela number,method varchar2(20));

$ cat m14.txt
set verify off
host sleep $(echo &&3/50 | bc -l )
variable vmethod varchar2(20);
exec :vmethod := '&&2';
insert into job_times values ( sys_context ('userenv', 'sid') ,dbms_utility.get_time ,:vmethod) ;
commit ;
declare
v_id number;
v_d date;
v_val VARCHAR2(1000);
l_count PLS_INTEGER;
begin
    for i in 1 .. &&1 loop
        --//select /*+ &&3 */ value into v_val from v$nls_parameters where parameter = 'NLS_CHARACTERSET';
        select /*+ &&3 */ value into v_val from nls_database_parameters where parameter = 'NLS_CHARACTERSET';
        --//select sql_text into v_val from v$sql where rownum=1;
        --//SELECT /*+ &&3 */ seq1.NEXTVAL into v_id FROM DUAL;
    end loop;
end ;
/
update job_times set time_ela = dbms_utility.get_time - time_ela where sid=sys_context ('userenv', 'sid') and method= (standard_in) 1: syntax errorvmethod;
commit;
quit

3.测试:
--//没想到很慢,我只能修改循环次数以及并发数量。
$ zzdate;seq 10 | xargs -I{} -P 10 sqlplus -s -l "sys/xxxxxxx@orcl as sysdba" @m14.txt 1e5 P=10 {} 1 >/dev/null;zzdate
trunc(sysdate)+09/24+16/1440+31/86400 == 2021/10/26 09:16:31
trunc(sysdate)+09/24+21/1440+13/86400 == 2021/10/26 09:21:13

SYS@192.168.a.b:1521/orclcdb> @tpt/ashtop event,p1,p2,p3 "module like 'sqlplus%' and username='SYS' and event like 'row cache%'" trunc(sysdate)+09/24+16/1440+31/86400 trunc(sysdate)+09/24+21/1440+13/86400
    Total
  Seconds     AAS %This   EVENT             P1   P2  P3 FIRST_SEEN          LAST_SEEN
--------- ------- ------- ---------------- --- ---- --- ------------------- -------------------
     1663     5.9   92% | row cache mutex   15   19   0 2021-10-26 09:16:33 2021-10-26 09:21:11
      144      .5    8% | row cache mutex   60   19   0 2021-10-26 09:16:37 2021-10-26 09:21:12
        8      .0    0% | row cache mutex   15   10   0 2021-10-26 09:16:46 2021-10-26 09:21:09
        2      .0    0% | row cache mutex   60   10   0 2021-10-26 09:17:32 2021-10-26 09:17:32
--//可以发现如果应用存在这样类似语句大量并发执行,会拖累整个系统性能。

SYS@orcl> Select method,count(*),round(avg(TIME_ELA),0),sum(TIME_ELA) from job_times group by method order by 3 ;
METHOD                 COUNT(*) ROUND(AVG(TIME_ELA),0) SUM(TIME_ELA)
-------------------- ---------- ---------------------- -------------
P=10                         10                  27925        279253
--//大约平均280秒上下。

SYS@192.168.a.b:1521/orclcdb> @ ev_name 'row cache mutex'
SYS@192.168.a.b:1521/orclcdb> @ prxx
==============================
EVENT#                        : 344
EVENT_ID                      : 306610566
NAME                          : row cache mutex
PARAMETER1                    : cache id
PARAMETER2                    : where requested
PARAMETER3                    :
WAIT_CLASS_ID                 : 3875070507
WAIT_CLASS#                   : 4
WAIT_CLASS                    : Concurrency
DISPLAY_NAME                  : row cache mutex
CON_ID                        : 0
PL/SQL procedure successfully completed.

SYS@192.168.a.b:1521/orclcdb> column usage format 9999999
SYS@192.168.a.b:1521/orclcdb> select * from v$rowcache where cache# in (15,60);
CACHE# TYPE   SUBORDINATE# PARAMETER   COUNT USAGE FIXED      GETS FASTGETS GETMISSES SCANS SCANMISSES SCANCOMPLETES MODIFICATIONS FLUSHES DLM_REQUESTS DLM_CONFLICTS DLM_RELEASES CON_ID
------ ------ ------------ ----------- ----- ----- ----- --------- -------- --------- ----- ---------- ------------- ------------- ------- ------------ ------------- ------------ ------
    15 PARENT              dc_props       89    89     0 257978364        0      8146     4          0             4             4       4            0             0            0      0
    60 PARENT              dc_cdbprops     7     7     0  19065107        0       581     0          0             0             6       6            0             0            0      0
--//CACHE#=15 对应dc_props,CACHE#=60对应dc_cdbprops。
--//顺便看以上相关语句的执行计划,可以发现访问的是X$PROPS。
Plan hash value: 2762963881
------------------------------------------------------------------
| Id  | Operation        | Name    | E-Rows |E-Bytes| Cost (%CPU)|
------------------------------------------------------------------
|   0 | SELECT STATEMENT |         |        |       |     1 (100)|
|*  1 |  FIXED TABLE FULL| X$PROPS |      1 |    31 |     0   (0)|
------------------------------------------------------------------

--//查看视图的定义如下:
SYS@192.168.a.b:1521/orclcdb> @ ddl sys.nls_database_parameters
C300
------------------------------------------------------------------------------------------------------------------
CREATE OR REPLACE FORCE NONEDITIONABLE VIEW "SYS"."NLS_DATABASE_PARAMETERS" ("PARAMETER", "VALUE") AS
  select name,
       substr(value$, 1, 64)
from x$props
where name like 'NLS%';

--//如果对比11g的定义:
SCOTT@book> @ ver1
PORT_STRING                    VERSION        BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

SCOTT@book> @ ddl sys.nls_database_parameters
C300
---------------------------------------------------------------------------------------
CREATE OR REPLACE FORCE VIEW "SYS"."NLS_DATABASE_PARAMETERS" ("PARAMETER", "VALUE") AS
  select name,
       substr(value$, 1, 40)
from props$
where name like 'NLS%';
--//可以发现两者访问的对象不同。前者是x$props,后者是props$。

SYS@orcl> select cache_name, existent, count(*) cnt from v$rowcache_parent where cache_name = 'dc_props' group by cache_name, existent;
CACHE_NA E        CNT
-------- - ----------
dc_props N         24
dc_props Y         36

SYS@192.168.a.b:1521/orclcdb> select cache_name, existent, count(*) cnt from v$rowcache_parent where cache_name = 'dc_cdbprops' group by cache_name, existent;
CACHE_NAME             EX          CNT
---------------------- -- ------------
dc_cdbprops            N            11
dc_cdbprops            Y             1

SYS@192.168.a.b:1521/orclcdb> select * from x$mutex_sleep where mutex_type='Row Cache' order by sleeps desc;
ADDR                     INDX      INST_ID       CON_ID MUTEX_TYPE   MUTEX_TYPE_ID  LOCATION_ID LOCATION                         SLEEPS    WAIT_TIME
---------------- ------------ ------------ ------------ ------------ ------------- ------------ ------------------------------- ------- ------------
00007F646C176500            4            1            0 Row Cache                4           19 [19] kqrpre                     9924170  50159384303
00007F646C176620            6            1            0 Row Cache                4           10 [10] kqreqd                      117833     98550710
00007F646C1763E0            2            1            0 Row Cache                4           32 [32] kqrsfd                        1996      1329094
00007F646C176590            5            1            0 Row Cache                4           17 [17] kqrCreateUsingSecondaryKey     187       280745
00007F646C1766B0            7            1            0 Row Cache                4            8 [08] kqrget                          42        18877
00007F646C176350            1            1            0 Row Cache                4           33 [33] kqrsrd                          36        50532
00007F646C176740            8            1            0 Row Cache                4            3 [03] kqrUpdateHashTable              33         9042
00007F646C1762C0            0            1            0 Row Cache                4           34 [34] kqrssc                           7        40739
00007F646C176470            3            1            0 Row Cache                4           31 [31] kqrcmt                           1            8
9 rows selected.
--//locate_id=19,10 与等待事件P2对应。

SELECT *
  FROM (  SELECT mutex_addr, COUNT (*)
            FROM x$mutex_sleep_history
           WHERE mutex_type = 'Row Cache'
        GROUP BY mutex_addr
        ORDER BY 2 DESC)
 WHERE ROWNUM <= 5;

MUTEX_ADDR         COUNT(*)
---------------- ----------
0000000089F20D10         33
0000000089F20AB8         32
0000000089F20D70         30
0000000089F20D58         30
0000000089F209E0         19

SYS@192.168.a.b:1521/orclcdb> @ tpt/fcha 0000000089F20D10
Find in which heap (UGA, PGA or Shared Pool) the memory address 0000000089F20D10 resides...
WARNING!!! This script will query X$KSMSP, which will cause heavy shared pool latch contention
in systems under load and with large shared pool. This may even completely hang
your instance until the query has finished! You probably do not want to run this in production!
Press ENTER to continue, CTRL+C to cancel...

LOC    KSMCHPTR             KSMCHIDX     KSMCHDUR KSMCHCOM                             KSMCHSIZ KSMCHCLS             KSMCHTYP KSMCHPAR
------ ---------------- ------------ ------------ -------------------------------- ------------ ---------------- ------------ ----------------
SGA    00000000890CD000            1            1 permanent memor                      15579088 perm                        0 00

SYS@192.168.a.b:1521/orclcdb> @ tpt/fcha 0000000089F20AB8
LOC    KSMCHPTR             KSMCHIDX     KSMCHDUR KSMCHCOM                             KSMCHSIZ KSMCHCLS             KSMCHTYP KSMCHPAR
------ ---------------- ------------ ------------ -------------------------------- ------------ ---------------- ------------ ----------------
SGA    00000000890CD000            1            1 permanent memor                      15579088 perm                        0 00

--//可以发现都是位于0x00000000890CD000 ,0x00000000890CD000+15579088(十进制)区域。

4.继续测试:
--//如果换成select /*+ &&3 */ value into v_val from v$nls_parameters where parameter = 'NLS_CHARACTERSET';语句,不需要访
--//问dc_props,dc_cdbprops等数据字典。执行情况呢?

$ cat m14.txt
set verify off
host sleep $(echo &&3/50 | bc -l )
variable vmethod varchar2(20);
exec :vmethod := '&&2';
insert into job_times values ( sys_context ('userenv', 'sid') ,dbms_utility.get_time ,:vmethod) ;
commit ;
declare
v_id number;
v_d date;
v_val VARCHAR2(1000);
l_count PLS_INTEGER;
begin
    for i in 1 .. &&1 loop
        select /*+ &&3 */ value into v_val from v$nls_parameters where parameter = 'NLS_CHARACTERSET';
        --//select /*+ &&3 */ value into v_val from nls_database_parameters where parameter = 'NLS_CHARACTERSET';
        --//select sql_text into v_val from v$sql where rownum=1;
        --//SELECT /*+ &&3 */ seq1.NEXTVAL into v_id FROM DUAL;
    end loop;
end ;
/
update job_times set time_ela = dbms_utility.get_time - time_ela where sid=sys_context ('userenv', 'sid') and method= (standard_in) 1: syntax errorvmethod;
commit;
quit

$ zzdate;seq 10 | xargs -I{} -P 10 sqlplus -s -l "sys/xxxxxxx@orcl as sysdba" @m14.txt 1e5 P=10x {} 1 >/dev/null;zzdate
trunc(sysdate)+10/24+02/1440+31/86400 == 2021/10/26 10:02:31
trunc(sysdate)+10/24+02/1440+39/86400 == 2021/10/26 10:02:39
--//仅仅需要8秒执行完成。

SYS@orcl> Select method,count(*),round(avg(TIME_ELA),0),sum(TIME_ELA) from job_times group by method order by 3 ;
METHOD                 COUNT(*) ROUND(AVG(TIME_ELA),0) SUM(TIME_ELA)
-------------------- ---------- ---------------------- -------------
P=10x                        10                    630          6303
P=10                         10                  27925        279253

SYS@192.168.a.b:1521/orclcdb> @tpt/ashtop event,p1,p2,p3 "module like 'sqlplus%' and username='SYS' and event like 'row cache%'" trunc(sysdate)+10/24+02/1440+31/86400 trunc(sysdate)+10/24+02/1440+39/86400
no rows selected

--//没有任何row cache相关等待事件。

SYS@192.168.a.b:1521/orclcdb> @tpt/ashtop event,p1,p2,p3 "module like 'sqlplus%' and username='SYS' " trunc(sysdate)+10/24+02/1440+31/86400 trunc(sysdate)+10/24+02/1440+39/86400
    Total
  Seconds     AAS %This   EVENT                       P1            P2                   P3 FIRST_SEEN          LAST_SEEN
--------- ------- ------- ------------------- ---------- ------------- -------------------- ------------------- -------------------
       39     4.9   59% |                          65536             1                    0 2021-10-26 10:02:33 2021-10-26 10:02:38
       10     1.3   15% | CRS call completion         14             6                    0 2021-10-26 10:02:32 2021-10-26 10:02:32
        6      .8    9% |                     3390613612  721554505728 18446744069421400154 2021-10-26 10:02:33 2021-10-26 10:02:38
        6      .8    9% |                     3390613612 1228360646656 18446744069421400068 2021-10-26 10:02:33 2021-10-26 10:02:38
        5      .6    8% |                     1613826344           592                    0 2021-10-26 10:02:34 2021-10-26 10:02:38

--// CRS call completion 也没有遇到,有机会研究。
--//执行计划如下:
select /*+ 1 */ value  from v$nls_parameters where parameter =
'NLS_CHARACTERSET'

Plan hash value: 1805486652

---------------------------------------------------------------------------
| Id  | Operation        | Name             | E-Rows |E-Bytes| Cost (%CPU)|
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                  |        |       |     1 (100)|
|*  1 |  FIXED TABLE FULL| X$NLS_PARAMETERS |      1 |    32 |     0   (0)|
---------------------------------------------------------------------------
--//同样是X表,所以特别提醒,一些监视或者应用递归语句以及一些系统视图尽量规避数据字段相关信息的访问。

--//另外这台数据库已经运行很久了,看看row cache objects latch的情况。
SYS@orcl> select name, latch#,gets from v$latch where name in ('row cache objects');
NAME                     LATCH#       GETS
-------------------- ---------- ----------
row cache objects           544      13295

SYS@orcl> select name, latch#,gets from v$latch where name in ('row cache objects');
NAME                     LATCH#       GETS
-------------------- ---------- ----------
row cache objects           544      13296

--//执行1次增加1,说明12R2以后我估计row cahce已经使用mutex代替latch。
--//19c也类似:
> select name, latch#,gets from v$latch where name in ('row cache objects');

NAME                                         LATCH#       GETS
---------------------------------------- ---------- ----------
row cache objects                               569       1890

> select name, latch#,gets from v$latch where name in ('row cache objects');
NAME                                         LATCH#       GETS
---------------------------------------- ---------- ----------
row cache objects                               569       1891

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/267265/viewspace-2839264/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论
熟悉oracle相关技术,擅长sql优化,rman备份与恢复,熟悉linux shell编程。

注册时间:2008-01-03

  • 博文量
    3085
  • 访问量
    6808089