
A Statspack Tuning Exercise for a Sister Project

Original post · ERP · Author: foreverlee · 2006-03-24 13:11:45


STATSPACK report for

DB Name DB Id Instance Inst Num Release Cluster Host
------------ ----------- ------------ -------- ----------- ------- ------------
GPS 4182057045 gps 1 9.2.0.4.0 NO dbsvr1

I collected performance data from 24-Feb-05 16:00:03 to 24-Feb-05 22:00:05 as the basis for this report.
Snap Id Snap Time Sessions Curs/Sess Comment
------- ------------------ -------- --------- -------------------
Begin Snap: 2 24-Feb-05 16:00:03 44 6.4
End Snap: 8 24-Feb-05 22:00:05 47 6.0
Elapsed: 360.03 (mins)


Current database configuration:
Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 1,904M Std Block Size: 8K
Shared Pool Size: 224M Log Buffer: 1,024K


Current transaction load on the database
Note that the database handles 99.53 transactions per second (application work plus Oracle data dictionary activity), which shows it was quite busy between 24-Feb-05 16:00:03 and 24-Feb-05 22:00:05.
Physical reads (disk reads) per second are 4,558.93 blocks × 8 KB ≈ 36,471 KB, i.e. roughly 36 MB of disk-to-memory I/O every second just to serve the workload. This needs to come down; seeing this figure, I already suspected the buffer cache hit ratio would not be high.
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 84,012.03 844.10
Logical reads: 7,510.41 75.46
Block changes: 514.31 5.17
Physical reads: 4,558.93 45.81
Physical writes: 9.13 0.09
User calls: 215.01 2.16
Parses: 4.12 0.04
Hard parses: 0.01 0.00
Sorts: 2.13 0.02
Logons: 0.12 0.00
Executes: 152.38 1.53
Transactions: 99.53


Instance efficiency analysis:
Buffer Hit %: the buffer cache hit ratio is 39.30%, which is quite low; it should normally be above 90%. Raising it requires work on several fronts, covered later.
Buffer Nowait %: the percentage of buffer gets satisfied without waiting is 99.99%, which indirectly reflects the success rate of logical reads. It should normally be above 99%.
Soft Parse %: the soft parse ratio is 99.83%. A high value usually means bind variables are in use (a shared pool with enough room to keep SQL execution plans cached can also push it up, but relying on that is wrong). If it is too low, change the application to use bind variables, or consider cursor_sharing = similar (9i and later). It should normally be above 99%; this is a very important metric.
Because the soft parse ratio is closely tied to the shared pool configuration, it comes up again below.
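As a quick sketch of the cursor_sharing workaround mentioned above (the parameter is standard Oracle; whether SIMILAR is the right value here would need testing, and fixing the application to use bind variables remains the real solution):

```sql
-- Check the current setting
SELECT name, value FROM v$parameter WHERE name = 'cursor_sharing';

-- 9i and later: share cursors for statements differing only in literal values
ALTER SYSTEM SET cursor_sharing = 'SIMILAR';
```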

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.99 Redo NoWait %: 100.00
Buffer Hit %: 39.30 In-memory Sort %: 100.00
Library Hit %: 99.99 Soft Parse %: 99.83
Execute to Parse %: 97.29 Latch Hit %: 99.97
Parse CPU to Parse Elapsd %: 100.28 % Non-Parse CPU: 99.79


Shared pool statistics (this part matters a lot for performance).
Memory Usage %: shared pool utilization. It should hold steady (steady!) between 75% and 90%; lower wastes memory, higher suggests the pool is too small. Here it is around 37%, which means the shared pool still has ample room to build and keep new SQL execution plans.
% SQL with executions>1: the share of SQL statements executed more than once (if this is too small, bind variables are probably not being used). Here it is about 39%.
What "executed more than once" implies: identical SQL statements that use bind variables should share a single execution plan in the shared pool.
% Memory for SQL w/exec>1: memory consumed by SQL executed more than once divided by memory consumed by all SQL; the higher the better, since it indicates thorough use of bind variables. Here it is about 35%.
Shared Pool Statistics Begin End
------ ------
Memory Usage %: 35.90 38.07
% SQL with executions>1: 39.18 39.44
% Memory for SQL w/exec>1: 35.17 34.50


Top 5 database wait events
log file sync: when a client commits or rolls back a transaction, LGWR (an internal Oracle process whose main job is writing redo) must flush the redo generated up to that point into the online log files, and the session waits for that write to complete. This wait event typically has these causes:
1) The physical disks themselves are slow at writing.
2) The members of each redo log group should be spread evenly across different disks, not different partitions of the same disk.
SQL> select group#,MEMBER from v$logfile;

GROUP#
----------
MEMBER
--------------------------------------------------------------------------------
3
/opt/ora9/oradata/gps/redo03.log

2
/opt/ora9/oradata/gps/redo02.log

1
/opt/ora9/oradata/gps/redo01.log
We have not mirrored the log group members here, but that alone does not mean the log file layout is problem-free. Here is why:
SQL> select file_name from dba_data_files;
FILE_NAME
--------------------------------------------------------------------------------
/opt/ora9/oradata/gps/system01.dbf
/opt/ora9/oradata/gps/perfstat.dbf
/opt/ora9/oradata/gps/cwmlite01.dbf
/opt/ora9/oradata/gps/drsys01.dbf
/gps/example01.dbf
/opt/ora9/oradata/gps/indx01.dbf
/opt/ora9/oradata/gps/odm01.dbf
/opt/ora9/oradata/gps/tools01.dbf
/gps/users01.dbf
/opt/ora9/oradata/gps/xdb01.dbf
/gps/users02.dbf
FILE_NAME
--------------------------------------------------------------------------------
/gps/undotbs21.dbf
Log files and data files should live on different disks to reduce disk contention (the log file location used by a default Oracle installation needs adjusting).
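A sketch of how a log member could be moved onto a separate disk (the /disk_a path is hypothetical; repeat per group, and only drop a member once its group is no longer CURRENT):

```sql
-- Add a member for group 1 on a different physical disk
ALTER DATABASE ADD LOGFILE MEMBER '/disk_a/oradata/gps/redo01b.log' TO GROUP 1;

-- Force a switch so group 1 goes inactive, then drop the old member
ALTER SYSTEM SWITCH LOGFILE;
ALTER DATABASE DROP LOGFILE MEMBER '/opt/ora9/oradata/gps/redo01.log';
```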

db file scattered read: this wait usually points to full table scans. During a full table scan, blocks are, for performance reasons, read into scattered (non-contiguous) buffers in the DB cache. If this event is high, some fully scanned tables may be missing an index, carrying the wrong index, or failing to use the index they have.
Drilling down to find which table is responsible points to MCC2.COUNTER_SINGLE:
SQL> select TARGET,count(*) from v$session_longops
2 group by TARGET;
TARGET COUNT(*)
---------------------------------------------------------------- ----------
MCC2.COUNTER_SINGLE 127


control file parallel write: the same story as log file sync. When installing Oracle, the control files should be separated from the data files, but here the redo logs, all the data files, and the control files sit on one disk. Business data access is heavy, so writes to that disk are dense, and when Oracle updates the control files (roughly every three seconds) it has to wait.
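A sketch of multiplexing the control files across two disks (the /disk_a and /disk_b paths are hypothetical; this assumes an spfile, and the files must be copied to the new locations at the OS level while the instance is down):

```sql
ALTER SYSTEM SET control_files =
    '/disk_a/oradata/gps/control01.ctl',
    '/disk_b/oradata/gps/control02.ctl'
    SCOPE = SPFILE;
-- Then: SHUTDOWN IMMEDIATE, copy the control files to the new paths, STARTUP
```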

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
log file sync 2,161,184 16,122 69.26
CPU time 5,043 21.66
db file scattered read 6,213,357 1,448 6.22
control file parallel write 6,904 602 2.58
latch free 3,237 13 .06
-------------------------------------------------------------









Problems and preliminary solutions
1. The redo logs, all the data files, and the control files are on the same disk.
Solution: put the redo logs, control files, and static data files on disk A, and the dynamic data files (mainly business data) on disk B.

2. Bind variables are not used enough.
Solution: see "Tuning your application -- use bind variables.txt".


3. Index usage, in particular whether the indexes on COUNTER_SINGLE are reasonable and correctly used (solving this class of problem takes cooperation on several fronts).
See: "Tuning your system -- indexes (1): using indexes correctly.txt".


4. SQL that needs tuning.
I think the first two queries should be run in parallel; we do have 4 CPUs.
1>
SELECT "UTC","RECORDID","ONLONGITUDE","ONLATITUDE","PRICE"
  FROM "MCC2"."COUNTER_SINGLE" "A1"
 WHERE "RECORDID" > 0
   AND "ONLATITUDE" <= :1 AND "ONLATITUDE" >= :2
   AND "ONLONGITUDE" <= :3 AND "ONLONGITUDE" >= :4
   AND "UTC" <= :5 AND "UTC" >= :6
2>
SELECT "UTC","RECORDID","OFFLONGITUDE","OFFLATITUDE","PRICE"
  FROM "MCC2"."COUNTER_SINGLE" "A1"
 WHERE "RECORDID" > 0
   AND "OFFLATITUDE" <= :1 AND "OFFLATITUDE" >= :2
   AND "OFFLONGITUDE" <= :3 AND "OFFLONGITUDE" >= :4
   AND "UTC" <= :5 AND "UTC" >= :6
3> Replace IN with a table join:
select DISTINCT OGM.opid
  from OGMEMBER OGM, OGOPT
 where OGM.opid = :1
   and OGOPT.moptcode = :2
   and OGM.ogid in (select OGOPT.ogid
                      from OGOPT OGOPT, SERVICEGROUP SG, SERVICEUNIT SU
                     where SU.oemcode = :3
                       and SU.commaddr = :4
                       and SG.sgid = SU.sgid
                       and OGOPT.sgid = SG.sgid)
4> Replace IN with a table join:
select * from MCC_LASTTRACK
 where SUID in (select SUID from SERVICEVIEW where MAC = :1)
becomes (assuming SUID is unique within SERVICEVIEW, so the join returns the same rows):
select ml.*
  from MCC_LASTTRACK ml, SERVICEVIEW ser
 where ml.suid = ser.suid
   and ser.mac = :1;    (MCC_LASTTRACK.suid is indexed)

5. The shared pool.
The shared pool is currently 224 MB, yet its utilization sits around 37%, while the database processes 99.53 transactions per second (application work plus the Oracle data dictionary). With the database this busy and utilization still only 37%, my guess is that the Suzhou Tianze application issues relatively few distinct query shapes, just in very large volumes. If the number of distinct queries is small, the set of execution plans produced once the application uses bind variables should not keep growing.
I think sizing the shared pool at around 150 MB is appropriate.
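The resize could look like this (a sketch based on the 150 MB estimate above; it assumes an spfile, and on 9i the new value must fit within sga_max_size):

```sql
ALTER SYSTEM SET shared_pool_size = 150M SCOPE = SPFILE;
```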

6. The DB cache.
Buffer Hit % is currently 39.30.
Several factors feed into this problem: items 2, 3, and 4 above.
Of course the db_cache_size parameter can also be raised; 2.5 GB is worth considering.
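A sketch of that change (2.5 GB per the suggestion above; it assumes an spfile, that sga_max_size can accommodate the increase, and that the host has the physical memory to back it):

```sql
ALTER SYSTEM SET db_cache_size = 2560M SCOPE = SPFILE;
```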


7. A quick look at the queries against COUNTER_SINGLE.

1>
Buffer Gets Executions Gets per Exec %Total CPU Time (s) Elapsd Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
49,813,694 384 129,723.2 30.7 1800.14 1763.39 1923892574
Module: oracle@dc6.gpssz.com (TNS V1-V3)
SELECT "UTC","RECORDID","ONLONGITUDE","ONLATITUDE","PRICE" FROM
"MCC2"."COUNTER_SINGLE" "A1" WHERE "RECORDID">0 AND "ONLATITUDE"
<=:1 AND "ONLATITUDE">=:2 AND "ONLONGITUDE"<=:3 AND "ONLONGITUDE
">=:4 AND "UTC"<=:5 AND "UTC">=:6
Analysis:
SELECT "UTC","RECORDID","ONLONGITUDE","ONLATITUDE","PRICE"
FROM "MCC2"."COUNTER_SINGLE" "A1"
WHERE "RECORDID">0
AND "ONLATITUDE"<=:1
AND "ONLATITUDE">=:2
AND "ONLONGITUDE"<=:3
AND "ONLONGITUDE">=:4
AND "UTC"<=:5
AND "UTC">=:6
No index on ONLONGITUDE and ONLATITUDE? That makes for one terrifying full table scan (FTS).

Indexes on COUNTER_SINGLE:
SQL> column INDEX_NAME for a20
SQL> column COLUMN_NAME for a20
SQL> column TABLE_NAME for a20
SQL> l
1 select INDEX_NAME,COLUMN_NAME,TABLE_NAME from dba_ind_columns
2 where INDEX_OWNER='MCC2'
3* and TABLE_NAME='COUNTER_SINGLE'
SQL> /

INDEX_NAME COLUMN_NAME TABLE_NAME
-------------------- -------------------- --------------------
COUNTER_SINGLE_IDX SUID COUNTER_SINGLE
COUNTER_SINGLE_IDX RECORDID COUNTER_SINGLE
COUNTER_SINGLE_PK SUID COUNTER_SINGLE
COUNTER_SINGLE_PK UTC COUNTER_SINGLE

2> Same story: no index on OFFLONGITUDE and OFFLATITUDE?
49,813,585 384 129,722.9 30.7 1841.68 1802.62 635376328
Module: oracle@dc6.gpssz.com (TNS V1-V3)
SELECT "UTC","RECORDID","OFFLONGITUDE","OFFLATITUDE","PRICE"
  FROM "MCC2"."COUNTER_SINGLE" "A1"
 WHERE "RECORDID" > 0
   AND "OFFLATITUDE" <= :1 AND "OFFLATITUDE" >= :2
   AND "OFFLONGITUDE" <= :3 AND "OFFLONGITUDE" >= :4
   AND "UTC" <= :5 AND "UTC" >= :6
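Given the predicates in queries 1> and 2> and the index listing above, composite indexes covering the coordinate columns look like the missing piece. A sketch (the index names and column order are my own suggestion and should be validated against actual selectivity and execution plans):

```sql
CREATE INDEX MCC2.COUNTER_SINGLE_ON_IDX
    ON MCC2.COUNTER_SINGLE (ONLONGITUDE, ONLATITUDE, UTC);

CREATE INDEX MCC2.COUNTER_SINGLE_OFF_IDX
    ON MCC2.COUNTER_SINGLE (OFFLONGITUDE, OFFLATITUDE, UTC);
```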

3>
33,962,880 14 2,425,920.0 20.9 85.44 83.67 1822985056
Module: JDBC Thin Client
select trim(to_char(su.OEMCODE, '000X')) || ':' || su.COMMADDR a
s MAC, su.SUID,o.OEMNAME,v.VNAME,v.VCAT,sg.sgname,c.cname,c2.cn
ame as cname2,c3.cname as cname3 from SERVICEUNIT su,SERVICEGRO
UP sg, VEHICLE v, taxi_vehicle tv,OEM o,CUSTOMER c,CUSTOMER c2,C
USTOMER c3 where su.SGID in (select DISTINCT sgid from OGOPT

Analysis:
First, retrieve the full SQL text by its hash_value:
SQL> select sql_text from v$sqltext where hash_value=&hash_value order by piece;
Enter value for hash_value: 1822985056
old 1: select sql_text from v$sqltext where hash_value=&hash_value order by piece
new 1: select sql_text from v$sqltext where hash_value=1822985056 order by piece

SQL_TEXT
----------------------------------------------------------------
select trim(to_char(su.OEMCODE, '000X')) || ':' || su.COMMADDR as MAC,
       su.SUID, o.OEMNAME, v.VNAME, v.VCAT, sg.sgname,
       c.cname, c2.cname as cname2, c3.cname as cname3
from SERVICEUNIT su,
SERVICEGROUP sg,
VEHICLE v,
taxi_vehicle tv,
OEM o,
CUSTOMER c,
CUSTOMER c2,
CUSTOMER c3
where su.SGID in
(select DISTINCT sgid
from OGOPT
where OGID in ( select OGID
from OGMEMBER
where OPID = :1))
and su.VID = v.VID
and su.vid = tv.vid(+)
and su.OEMCODE=o.OEMCODE
and su.sgid = sg.sgid and su.cid = c.cid
and tv.driver2id = c2.cid(+)
and tv.driver3id = c3.cid(+)
and su.stid = 1
order by mac
First of all, the IN subqueries should be rewritten as joins.
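A sketch of that rewrite (abbreviated select list; DISTINCT compensates for the row fan-out the IN previously absorbed, and the result should be verified against the original query):

```sql
select distinct su.SUID, o.OEMNAME, v.VNAME, sg.sgname  -- abbreviated select list
  from SERVICEUNIT su, SERVICEGROUP sg, VEHICLE v, OEM o,
       OGOPT og, OGMEMBER ogm
 where ogm.opid   = :1
   and og.ogid    = ogm.ogid
   and su.sgid    = og.sgid
   and su.vid     = v.vid
   and su.oemcode = o.oemcode
   and su.sgid    = sg.sgid
   and su.stid    = 1;
```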



From the ITPUB blog, link: http://blog.itpub.net/76065/viewspace-822632/. Please credit the source when reposting; unauthorized reproduction will be pursued legally.
