ITPub博客

首页 > Linux操作系统 > Linux操作系统 > 硬解析带来高CPU消耗的诊断

硬解析带来高CPU消耗的诊断

原创 Linux操作系统 作者:Ora-600 时间:2009-06-16 14:22:32 0 删除 编辑

客户的系统进行压力测试时发现CPU压力非常大,正常情况下CPU负载在30-40%,而峰值期间CPU基本上100%,以至于事务无法处理完成,甚至无法登陆。

经过客户的描述,这个压力测试实际上压力应该不大,仅仅设置了100个客户端的并发,而且单个客户端整个功能模块完成实际上仅需要25秒钟,但并发一多,CPU负载就非常高,以至于单个功能模块完成时间被延长至几百秒。

到客户现场,在操作系统级察看,CPU大量消耗在user上,ioCPU等待很少,确定CPU是消耗在应用层面。

[oracle@localhost ~]$ sar -u 3 50

Linux 2.6.18-92.el5 (localhost.localdomain)     20090615  _x86_64_        (8 CPU)

 

145650     CPU     %user     %nice   %system   %iowait    %steal     %idle

145653     all     93.29      0.00      1.08      0.00      0.00      5.62

145656     all     90.54      0.00      1.33      0.00      0.00      8.12

145659     all     93.00      0.00      1.08      0.00      0.00      5.93

145702     all     92.34      0.00      0.92      0.00      0.00      6.75

145705     all     93.25      0.00      0.87      0.00      0.00      5.88

145708     all     92.06      0.00      1.02      0.00      0.00      6.92

145711     all     91.95      0.00      0.87      0.00      0.00      7.18

145714     all     92.71      0.00      0.92      0.00      0.00      6.38

145717     all     92.19      0.00      0.87      0.00      0.00      6.94

145720     all     94.96      0.00      0.79      0.00      0.00      4.25

145723     all     92.91      0.00      0.83      0.00      0.00      6.26

145726     all     92.06      0.00      0.70      0.00      0.00      7.24

 

由于客户描述的情况中并发量与CPU消耗有直接关系,因此怀疑有应用级并发的竞争,于是收集了一下Statspack报告,在命中率方面:

Instance Efficiency Percentages (Target 100%)

Buffer Nowait %:

99.98

Redo NoWait %:

99.98

Buffer Hit %:

100.00

In-memory Sort %:

100.00

Library Hit %:

74.95

Soft Parse %:

72.85

Execute to Parse %:

0.27

Latch Hit %:

97.18

Parse CPU to Parse Elapsd %:

9.47

% Non-Parse CPU:

7.65

库高速缓存的命中率很低,硬解析很高,而硬解析与CPU消耗有直接关系。

在等待事件方面:

Top 5 Timed Events

Event

Waits

Time(s)

Avg Wait(ms)

% Total Call Time

Wait Class

latch: library cache

5,679,771

363,979

64

86.1

Concurrency

CPU time

 

41,658

 

9.9

 

latch: shared pool

1,908,510

6,061

3

1.4

Concurrency

latch free

192,585

5,334

28

1.3

Other

latch: cache buffers chains

19,228

823

43

.2

Concurrency

很明显,主要等待都集中在与库高速缓存相关的Latch竞争上,库高速缓存是主要的性能问题,而且硬解析也主要在消耗CPU,看来性能瓶颈已经表现出来了,通过下面会话级的直接查询,也发现在大量会话中同时都存在着latch: library cache 竞争:

select event from v$session_wait

where event not in ('SQL*Net message from client','rdbms ipc message')

 

EVENT

----------------------------------------------------------------

SQL*Net message to client

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: shared pool

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

latch: library cache

。。。。。。

smon timer

pmon timer

 

目前客户的共享池有544M,而在程序运行过程中共享池的剩余空间还有59M,同时从报告中的共享池大小建议上看,对于目前的应用来说,共享池只需要400M就足够,因此造成库高速缓存命中率低,硬解析高,CPU大量消耗的主要原因很可能是SQL代码没有共享。

Shared Pool Size(M)

SP Size Factr

Est LC Size (M)

Est LC Mem Obj

Est LC Time Saved (s)

Est LC Time Saved Factr

Est LC Load Time (s)

Est LC Load Time Factr

Est LC Mem Obj Hits

416

0.76

67

7,820

3,652

0.98

466,989

1.00

3,518,320

480

0.88

130

15,164

3,710

1.00

466,931

1.00

3,523,512

544

1.00

193

22,507

3,727

1.00

466,914

1.00

3,523,751

608

1.12

256

29,890

3,730

1.00

466,911

1.00

3,523,816

672

1.24

319

37,269

3,731

1.00

466,910

1.00

3,523,863

736

1.35

382

44,593

3,731

1.00

466,910

1.00

3,523,915

800

1.47

445

51,975

3,732

1.00

466,909

1.00

3,523,952

864

1.59

508

59,424

3,733

1.00

466,908

1.00

3,524,016

928

1.71

571

66,882

3,735

1.00

466,906

1.00

3,524,077

992

1.82

634

74,314

3,744

1.00

466,897

1.00

3,524,148

1,056

1.94

697

81,781

3,744

1.00

466,897

1.00

3,524,190

1,120

2.06

760

89,216

3,748

1.01

466,893

1.00

3,524,267

 

查询内存中的所有SQL语句,发现有两条SQL语句执行的条数占了内存所有SQL语句的80%,而这些语句的文本基本相同,仅仅是where子句中的值不同。

select sql_text from v$sql order by sql_text;

 

select * from olat_examinationresult where examlog_id =243035 and question_id=1961                                     

select * from olat_examinationresult where examlog_id =243035 and question_id=1974                                     

select * from olat_examinationresult where examlog_id =243035 and question_id=2078                                     

select * from olat_examinationresult where examlog_id =243035 and question_id=2114                                     

select * from olat_examinationresult where examlog_id =243035 and question_id=2121                                     

select * from olat_examinationresult where examlog_id =243035 and question_id=2330                                     

select * from olat_examinationresult where examlog_id =243035 and question_id=2349                                     

select * from olat_examinationresult where examlog_id =243035 and question_id=2389                                      

select * from olat_examinationresult where examlog_id =243035 and question_id=2390                                     

select * from olat_examinationresult where examlog_id =243035 and question_id=2420                                      

select * from olat_examinationresult where examlog_id =243035 and question_id=2519                                     

select * from olat_examinationresult where examlog_id =243035 and question_id=2520                                      

select * from olat_examinationresult where examlog_id =243035 and question_id=2809                                     

select * from olat_examinationresult where examlog_id =243035 and question_id=2826                                     

select * from olat_examinationresult where examlog_id =243035 and question_id=2837                                     

select * from olat_examinationresult where examlog_id =243035 and question_id=2839                                     

select * from olat_examinationresult where examlog_id =243035 and question_id=2847                                     

select * from olat_examinationresult where examlog_id =243035 and question_id=2871                                     

select * from olat_examinationresult where examlog_id =243035 and question_id=2893                                     

select * from olat_examinationresult where examlog_id =243035 and question_id=2895                                     

select * from olat_examinationresult where examlog_id =243035 and question_id=3168                                     

select * from olat_examinationresult where examlog_id =243035 and question_id=3264                                     

。。。。。。       

 

select t1.id, t1.examination_id, t1.user_id,to_char(t1.start_time,'yyyy-mm-dd hh24:mi:ss')as start_time, to_char(t1.end_

time,'yyyy-mm-dd hh24:mi:ss')as end_time, state, t1.right_rate, t1.result,t1.check_user, t1.isnot_pass, to_char(t1.last_

time,'yyyy-mm-dd hh24:mi:ss') as last_time, t1.USE_TIME,t1.TESTPAPER_ID,t2.name from olat_examinationlog t1 inner join o

lat_examination t2 on t1.examination_id = t2.id where t1.id=242499                                         

select t1.id, t1.examination_id, t1.user_id,to_char(t1.start_time,'yyyy-mm-dd hh24:mi:ss')as start_time, to_char(t1.end_

time,'yyyy-mm-dd hh24:mi:ss')as end_time, state, t1.right_rate, t1.result,t1.check_user, t1.isnot_pass, to_char(t1.last_

time,'yyyy-mm-dd hh24:mi:ss') as last_time, t1.USE_TIME,t1.TESTPAPER_ID,t2.name from olat_examinationlog t1 inner join o

lat_examination t2 on t1.examination_id = t2.id where t1.id=242500                                         

select t1.id, t1.examination_id, t1.user_id,to_char(t1.start_time,'yyyy-mm-dd hh24:mi:ss')as start_time, to_char(t1.end_

time,'yyyy-mm-dd hh24:mi:ss')as end_time, state, t1.right_rate, t1.result,t1.check_user, t1.isnot_pass, to_char(t1.last_

time,'yyyy-mm-dd hh24:mi:ss') as last_time, t1.USE_TIME,t1.TESTPAPER_ID,t2.name from olat_examinationlog t1 inner join o

lat_examination t2 on t1.examination_id = t2.id where t1.id=242501                                         

select t1.id, t1.examination_id, t1.user_id,to_char(t1.start_time,'yyyy-mm-dd hh24:mi:ss')as start_time, to_char(t1.end_

time,'yyyy-mm-dd hh24:mi:ss')as end_time, state, t1.right_rate, t1.result,t1.check_user, t1.isnot_pass, to_char(t1.last_

time,'yyyy-mm-dd hh24:mi:ss') as last_time, t1.USE_TIME,t1.TESTPAPER_ID,t2.name from olat_examinationlog t1 inner join o

lat_examination t2 on t1.examination_id = t2.id where t1.id=242502                                        

select t1.id, t1.examination_id, t1.user_id,to_char(t1.start_time,'yyyy-mm-dd hh24:mi:ss')as start_time, to_char(t1.end_

time,'yyyy-mm-dd hh24:mi:ss')as end_time, state, t1.right_rate, t1.result,t1.check_user, t1.isnot_pass, to_char(t1.last_

time,'yyyy-mm-dd hh24:mi:ss') as last_time, t1.USE_TIME,t1.TESTPAPER_ID,t2.name from olat_examinationlog t1 inner join o

lat_examination t2 on t1.examination_id = t2.id where t1.id=242503                                          

select t1.id, t1.examination_id, t1.user_id,to_char(t1.start_time,'yyyy-mm-dd hh24:mi:ss')as start_time, to_char(t1.end_

time,'yyyy-mm-dd hh24:mi:ss')as end_time, state, t1.right_rate, t1.result,t1.check_user, t1.isnot_pass, to_char(t1.last_

time,'yyyy-mm-dd hh24:mi:ss') as last_time, t1.USE_TIME,t1.TESTPAPER_ID,t2.name from olat_examinationlog t1 inner join o

lat_examination t2 on t1.examination_id = t2.id where t1.id=242504                                 

。。。。。。

 

问题已经很明显了,应用中存在两条循环执行的SQL语句,这两条语句都存在没有使用绑定变量的问题,因此每次使用不同的值都不得不硬解析,因此并发越多,需要硬解析的SQL语句越多,从而带来更多的CPU消耗,因此让用户将这两条语句的值换成绑定变量,CPU消耗的峰值降低到了15%以下。

 

                              

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/22235/viewspace-606501/,如需转载,请注明出处,否则将追究法律责任。

下一篇: 没有了~
请登录后发表评论 登录
全部评论

注册时间:2008-05-11

  • 博文量
    31
  • 访问量
    43702