ITPub博客

STATSPACK报告解读

原创 Linux操作系统 作者:yiyu_66 时间:2007-12-12 09:40:45 0 删除 编辑

STATSPACK报告解读

STATSPACK report for

DB Name DB Id Instance Inst Num Release Cluster Host

------------ ----------- ------------ -------- ----------- ------- ------------

his 1834191151 his 1 9.2.0.4.0 NO server

这部分主要是数据库环境的一些信息,例如实例名、数据库名、数据库版本等信息。http://spaces.msn.com/kang-biao

Snap Id Snap Time Sessions Curs/Sess Comment

------- ------------------ -------- --------- -------------------

Begin Snap: 1 21-3 -06 09:55:14 406 40.9

End Snap: 2 21-3 -06 10:34:18 412 42.9

Elapsed: 39.07 (mins)

这部分主要是收集的数据点的时间和两个收集点之间的间隔时间,这两个时间点收集的信息是本报表产生所依赖的信息,关于间隔时间的问题,在网上有很大的争论,我个人的建议是能够在数据库负载最大的时候采集,间隔时间最好是15分钟左右,这样能够将我们想要知道的信息收集到。http://spaces.msn.com/kang-biao

Cache Sizes (end)

~~~~~~~~~~~~~~~~~

Buffer Cache: 2,048M Std Block Size: 8K

Shared Pool Size: 1,024M Log Buffer: 10,486K

这一部分是关于缓存SGA大小的部分:包括数据缓存,共享池大小,日志缓存和本数据库的块的大小。http://spaces.msn.com/kang-biao

Load Profile

~~~~~~~~~~~~ Per Second Per Transaction

--------------- ---------------

Redo size: 47,187.06 9,279.07

Logical reads: 81,094.36 15,946.74

Block changes: 292.32 57.48

Physical reads: 1,068.40 210.09

Physical writes: 77.41 15.22

User calls: 1,405.84 276.45

Parses: 257.87 50.71

Hard parses: 28.21 5.55

Sorts: 112.05 22.03

Logons: 0.05 0.01

Executes: 346.84 68.20

Transactions: 5.09

这部分应该说是比较多的内容,

每秒和每个事务产生的重做数量,在这上面产生的为47K9K

逻辑读和物理读:还可以看到,逻辑读和物理读之间的比重(210.09/15,946.741.3%,意思就是说有13%的逻辑读将产生物理读。

这里比较重要的是分析的数据,在上面标为红色的就是我们需要重要分析的部分,例如上面的每秒钟有258个分析,而硬分析为28个,应该说对一个系统来说这样的数值是有点高,这样系统的负载会比较高,尤其是CPU的负载,硬分析数据库需要递归执行SQL,这样是很消耗CPU资源的,要想消除硬分析,需要在应用程序中绑定变量。

还有一个需要注意的内容是LOGONS,如果发现这个数值比较高就要看看数据库是不是有什么问题,为什么有人频繁登陆数据库。这里平均每20秒有一次连接。

% Blocks changed per Read: 0.36 Recursive Call %: 25.28

Rollback per transaction %: 0.23 Rows per Sort: 388.01

% Blocks changed per Read:的意思是有64%的逻辑读是用于只读,只有36%的是用于修改块。

Recursive Call %:这里的意思是只有25%的调用是使用的PLSQL

Rollback per transaction %:这个部分的值如果很低将是一个很好的现象,因为对于Oracle来说回滚是需要消耗很多资源的。

Rows per Sort:这个是平均每个排序的行数,这里是388行,但是在上面每秒钟的排序时112次,每秒钟要有112×38843456行的排序,这个说明这个数据库的排序很多,而为什么要有这么大的排序,需要DBA好好考虑的问题,这也是数据库负载增高的原因之一,尤其CPU资源的占用。同时这么大的排序需要比较大的排序区。

Instance Efficiency Percentages (Target 100%)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Buffer Nowait %: 100.00 Redo NoWait %: 100.00

Buffer Hit %: 98.70 In-memory Sort %: 100.00

Library Hit %: 95.81 Soft Parse %: 89.06

Execute to Parse %: 25.65 Latch Hit %: 99.81

Parse CPU to Parse Elapsd %: 21.78 % Non-Parse CPU: 68.74

这里是Instance Efficiency Percentages,这里的目标是100%,这里我认为比较重要的是,软分析率Soft Parse %。其中软分析率是软分析和硬分析的比率。如果这个值接近100%的话,说明这个数据库系统几乎都绑定数据了,我们期望看到高的软分析率,以为这样可以减少系统的负担。

如果在一个系统中硬分析每秒大于100次的话,会给这个系统带来很高的负担,可能造成一系列的性能问题,应该必须去调查,高硬分析率必将伴随着在共享池和数据字典缓存中latch的问题,检查在top 5等待事件上是否存在‘latch free’事件,如果存在,需要我们进一步检查报告中的latch部分。这个也是大多数系统中存在地问题,在这个报告中就出现这样的问题,主要问题是在应用程序编写的过程中,程序员没有绑定变量而造成的问题。

而软分析可以在系统中每秒进行300次或者更高的分析,但是不必要的软分析也能限制应用的扩展性,需要优化SQL文,能够做到每一个会话只进行一次软分析,能够执行多次。

Parse CPU to Parse Elapsd %这个时一个比较难理解的一个概念,例如这个例子上这个值为21.78%,意思就是说和如果在正常执行的状态上,即没有任何地等待的分析时间,相比,这个系统上CPU花费在等待得时间是正常时间的4.58(1/21.78%)倍。这是个很低的数字,说明花费了4倍的时间去做这件事情,如果找个比率是100%的话,说明在处理时候没有任何的等待。

% Non-Parse CPU这个意思是花费在做实际工作上的时间与花费在查询分析上的时间的比较,使用的公式为:

Round(100*1-(parse_cpu/tot_cpu),2)

从这个公式上看如果tot_cpu比较高的话,就说明这个系统的大部分工作时执行查询工作,而不是分析工作,从上述的例子来看,% Non-Parse CPU: 68.74,说明这个数据库只是有68%的时间用于查询,而32%的用于分析上。

其他的部分的含义,我就不解释了。大家 可以参考一下下面的介绍。

Buffer Nowait Ratio

Is the percentage of requests a server process makes for a specific buffer where the buffer was immediately available; all buffer types are included in this statistic. If the ratio is low, determine which type of block is being contended for by examining the Buffer Wait Statistics section of the Statspack report.

Buffer Hit Ratio

This statistic is also known as the buffer cache hit ratio. This is the percentage of requests for a articular block which are satisfied within the cache without the need for physical IO.

Although historically known as one of the most important statistics, the buffer cache hit ratio can sometimes be misleading. A high (e.g. 99%) cache hit ratio normally indicates the cache is adequately sized, however this may not be the case in all circumstances. For example,

frequently executed SQL statements which repeatedly refer to a small number of buffers via indexed lookups can skew the buffer gets statistic. When these blocks are read, they are placed at the most recently used (MRU) end of the buffer cache; iterative access to these blocks can artificially inflate the cache hit ratio. This makes tuning the buffer cache a challenging activity.

On some sites, it is possible to identify a too small buffer cache by the appearance of the ‘write complete waits’ event, which indicates that hot blocks (i.e. blocks which are still being modified) are aging out of the cache while they are still needed; check the Wait events section for evidence of this event.

Alternatively, a lower buffer cache hit ratio does not necessarily mean the cache is too small; it may be that (potentially valid) full table scans are artificially reducing what is otherwise a good hit ratio.

Library hit ratio.

The ratio indicates the number of pin requests which result in pin hits. A pin hit occurs when the SQL or PL/SQL code you wish to execute is already in the library cache and is valid to execute.

A low library cache hit percentage could imply SQL is prematurely aging out of the shared pool as the shared pool may be small, or that unsharable SQL is being used. Also compare with the soft parse ratio; if they are both low, then investigate whether there is a parsing issue.

Redo no-wait Ratio

This ratio is indicative of the number of redo-entries generated for which there was space immediately available in the redo log. The percentage is calculated as followed:

100 x (1- (redo log space requests/redo entries))

The ‘redo log space request’ statistic is incremented when an Oracle process attempts to write a redo entry, however there was not sufficient space remaining in the online redo log.

The ‘redo entries’ statistic is incremented for each entry made to the redo log.

Frequent, or slow log switches may be contributing to waits for redo log space. If you are switching logs frequently (e.g. more than once every 15 minutes) this may be improved by increasing the size of the online redo logs. If the log switches are not frequent, check the disks the redo logs reside on to see if log switches are taking a long time due to a slow IO system. If the IO system is overloaded,either move the redo logs to disks with less activity, place the logs on dedicated disks or faster devices.

Shared Pool Statistics Begin End

------ ------

Memory Usage %: 87.94 88.78

% SQL with executions>1: 28.28 30.16

% Memory for SQL w/exec>1: 37.27 38.50

这部分主要是介绍的是共享池的一些统计信息:

Memory Usage %:共享池的使用率,这个数值根据tom的建议为75%-90%,如果太低则是浪费内存,如果太高会使共享池内的内容老化。

% SQL with executions>1:是在共享池中执行次数超过一次的SQL语句的比率。这个数据应该是越高越好。

% Memory for SQL w/exec>1:这个是与不经常使用的SQL文相比,经常使用的SQL文消耗的内存的比率是多少,这个数据总体上与执行次数多于一次的SQL文的比率非常接近,除非某些查询任务消耗的内存没有规律。

Top 5 Timed Events

~~~~~~~~~~~~~~~~~~ % Total

Event Waits Time (s) Ela Time

-------------------------------------------- ------------ ----------- --------

latch free 459,303 5,636 47.12

CPU time 3,757 31.41

db file sequential read 859,956 1,894 15.83

db file scattered read 267,639 384 3.21

log file sync 11,529 109 .91

-------------------------------------------------------------

这部分的内容在网上,有很多人都在说这些问题的产生的原因,如何去诊断系统的问题等等,我在这里就不详细谈,以后有时间好好去解析这个问题。

原文地址:http://hi.baidu.com/utilities/blog/item/519632f565d20a24bc310989.html

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/81912/viewspace-412/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论

注册时间:2007-12-11

  • 博文量
    12
  • 访问量
    52971