ITPub博客

首页 > 数据库 > Oracle > [20190409]pre_page_sga=true与连接缓慢的问题.txt

[20190409]pre_page_sga=true与连接缓慢的问题.txt

原创 Oracle 作者:lfree 时间:2019-04-11 09:51:30 0 删除 编辑

[20190409]pre_page_sga=true与连接缓慢的问题.txt


--//曾经遇到11g下设置pre_page_sga=true启动缓慢的问题(没有使用hugepages).

--//链接:http://blog.itpub.net/267265/viewspace-2295412/

--//实际上这样系统也会遇到连接缓慢的情况,通过测试说明问题.


1.建立pfile:

$ cat initxxxx.ora

db_name=xxxx

instance_name=xxxx

sga_target=20G

#sga_target=0

sga_max_size=20G

pre_page_sga=true


$ export ORACLE_SID=xxxx


SYS@xxxx> startup nomount

ORACLE instance started.

Total System Global Area 2.1379E+10 bytes

Fixed Size                  2262656 bytes

Variable Size            2818574720 bytes

Database Buffers         1.8522E+10 bytes

Redo Buffers               36073472 bytes


$ grep -i page /proc/meminfo

AnonPages:        239336 kB

PageTables:       701008 kB

AnonHugePages:         0 kB

HugePages_Total:     104

HugePages_Free:       48

HugePages_Rsvd:       41

HugePages_Surp:        0

Hugepagesize:       2048 kB


--//才启动没有任何用户连接,PageTables=701008 kB.


2.测试登陆问题:

$ time sqlplus -s -l  / as sysdba <<<quit

real    0m4.944s

user    0m0.024s

sys     0m0.016s

--//单个连接需要5秒.如果多个用户并发呢?

$ cat aa.sql

quit


$ time seq 50 | xargs -I{} -P 50 sqlplus -s -l  / as sysdba @ aa.sql

real    0m13.978s

user    0m1.833s

sys     0m1.017s

--//多个用户同时登录,全部完成需要13秒.


3.分析:

$ strace -f -c sqlplus -s -l  / as sysdba <<<quit

Process 3120 attached

Process 3120 detached

% time     seconds  usecs/call     calls    errors syscall

------ ----------- ----------- --------- --------- ----------------

 99.85    0.419648       69941         6           shmdt

  0.10    0.000434           1       413         2 read

  0.01    0.000046           0       287       127 open

  0.01    0.000045           0       180           mmap

  0.01    0.000037           0       178           close

  0.01    0.000032           0        79           rt_sigaction

  ....

------ ----------- ----------- --------- --------- ----------------

100.00    0.420259                  1970       227 total


$ man shmdt

...

shmdt() detaches the shared memory segment located at the address specified by shmaddr from the address space of the

calling process.  The to-be-detached segment must be currently attached with shmaddr equal to the value returned by the

attaching shmat() call.


On a successful shmdt() call the system updates the members of the shmid_ds structure associated with the shared memory

segment as follows:


shm_dtime is set to the current time.

shm_lpid is set to the process-ID of the calling process.

shm_nattch is decremented by one.  If it becomes 0 and the segment is marked for deletion, the segment is deleted.


--//给人的感觉好像消耗在调用shmdt,实际上消耗的时间0.419648.也就是这样还是无法定位问题在那里.


$ strace -fTtt  -e shmdt sqlplus -s -l  / as sysdba

Process 6046 attached

[pid  6046] 08:41:55.548586 shmdt(0x7f2c41600000) = 0 <0.000036>

quit

08:41:59.992835 shmdt(0x60000000)       = 0 <0.000042>

08:41:59.992983 shmdt(0x68000000)       = 0 <0.000042>

08:41:59.993097 shmdt(0x6c000000)       = 0 <0.360175>

08:42:00.353446 shmdt(0x4ec000000)      = 0 <0.037200>

08:42:00.390737 shmdt(0x560000000)      = 0 <0.000026>

Process 6046 detached

--//我手工执行quit,可以发现shmdt调用发生在退出阶段,也就是不是它导致登陆缓慢.


$ ipcs -smt


------ Shared Memory Attach/Detach/Change Times --------

shmid      owner      attached             detached             changed

352092161  oracle      Apr 10 17:26:09      Apr 10 17:26:17      Apr 10 17:09:55

352124930  oracle      Apr 10 17:26:09      Apr 10 17:26:17      Apr 10 17:09:55

352157699  oracle      Apr 10 17:26:09      Apr 10 17:26:18      Apr 10 17:09:55

352190469  oracle      Apr 10 17:26:09      Apr 10 17:26:18      Apr 10 17:09:55

352223238  oracle      Apr 10 17:26:09      Apr 10 17:26:18      Apr 10 17:09:55


------ Semaphore Operation/Change Times --------

semid    owner      last-op                    last-changed

310738944 oracle      Wed Apr 10 17:11:33 2019   Wed Apr 10 17:11:33 2019

--//从执行时间看4.5秒并不是消耗在这里.


4.继续分析,排除调用shmdt的情况:


$ strace -fTr -o /tmp/a2.txt  sqlplus  -s -l / as sysdba <<<quit

$ awk '{print $2}' /tmp/a2.txt | sort -nr | head

5.041521

0.287119

0.064745

0.008298

0.006386

0.004288

0.004272

0.001319

0.000982

0.000978


--//查看a2.txt

3366       0.000067 stat("/u01/app/oracle/product/11.2.0.4/dbhome_1/lib", {st_mode=S_IFDIR|0755, st_size=12288, ...}) = 0 <0.000025>

3366       0.000112 chdir("/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs") = 0 <0.000027>

3366       5.041521 mmap(NULL, 143360, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4c2c0b3000 <0.000048>

3366       0.000407 mmap(NULL, 143360, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4c2c090000 <0.000015>

3366       0.000127 mmap(NULL, 143360, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4c2c06d000 <0.000013>

3366       0.000218 getcwd("/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs"..., 256) = 46 <0.000016>

3366       0.000087 access("/etc/intchg.ora", F_OK) = -1 ENOENT (No such file or directory) <0.000019>

...

--//也就是在chdir调用与mmap调用之间有5秒不知道在干什么?(chdir 仅仅需要0.000027秒).strace看不出来!!

--//如果你使用top -u oracle观察,你可以发现RES字段不断在增加,接近SGA设置.我的感觉就是在这里建立pagetables的过程.


$ cat aa.sql

host sleep 1

quit


$ cat a.sh

#! /bin/bash

export ORACLE_SID=xxxx

sqlplus  -s -l / as sysdba @ aa.sql &

a=$(($!+2))

top -p $a -d 1 -b


$ . a.sh | tee /tmp/aa.txt

...

--//按ctrl+c中断.


$ grep PID /tmp/aa.txt | head -1; egrep 'oracle' /tmp/aa.txt

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND

 8498 oracle    20   0 20.2g 1.5g 1.5g R 93.9  1.2   0:00.47 oracle

 8498 oracle    20   0 20.2g 5.8g 5.8g R 99.7  4.6   0:01.47 oracle

 8498 oracle    20   0 20.2g  10g  10g R 99.8  8.4   0:02.47 oracle

 8498 oracle    20   0 20.2g  15g  15g R 99.8 12.2   0:03.47 oracle

--//sleep 1

 8498 oracle    20   0 20.2g  19g  19g S 95.9 15.7   0:04.43 oracle

 8498 oracle    20   0 20.2g  19g  19g S  0.0 15.7   0:04.43 oracle

 8498 oracle    20   0 20.0g  19g  19g R  1.0 15.6   0:04.44 oracle

--//你可以发现RES,SHR都在增加,CPU使用率在9X%,感觉11g版本在设置pre_page_sga=true下有问题.

--//大致定位时间定位在建立pagetables的过程.


5.改用hugepages呢?

$ grep -i page /proc/meminfo

AnonPages:        244144 kB

PageTables:        11508 kB

AnonHugePages:         0 kB

HugePages_Total:   26000

HugePages_Free:    15800

HugePages_Rsvd:       41

HugePages_Surp:        0

Hugepagesize:       2048 kB


$ time  sqlplus -s -l  / as sysdba <<<quit

real    0m0.180s

user    0m0.029s

sys     0m0.011s


$ strace -fTr -o /tmp/a3.txt  sqlplus  -s -l / as sysdba <<<quit

$ awk '{print $2}' /tmp/a3.txt | sort -nr | head

0.121929

0.009182

0.009144

0.009047

0.005081

0.002050

0.001984

0.001932

0.001321

0.001243


--//查看a3.txt

8716       0.000095 chdir("/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs") = 0 <0.000019>

--//这里等待时间很小.

8716       0.121929 mmap(NULL, 143360, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f552659d000 <0.000029>

8716       0.000143 mmap(NULL, 143360, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f552657a000 <0.000020>

8716       0.000166 mmap(NULL, 143360, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5526557000 <0.000020>

8716       0.000260 getcwd("/u01/app/oracle/product/11.2.0.4/dbhome_1/dbs"..., 256) = 46 <0.000023>


--//总之:

如果pre_page_sga=true最好开启hugepages.

很奇怪的是12c 缺省pre_page_sga=true,没有开启hugepages的情况下,没有这个问题.


来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/267265/viewspace-2640993/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论
熟悉oracle相关技术,擅长sql优化,rman备份与恢复,熟悉linux shell编程。

注册时间:2008-01-03

  • 博文量
    2859
  • 访问量
    6646232