Oracle OS Watcher Tool: A Detailed Usage Guide

Original | Linux Operating Systems | Author: rongshiyuan | Date: 2012-07-09 11:00:42
http://blog.csdn.net/tianlesoftware/article/details/7316191
 

  PID USERNAME THR PRI NICE  SIZE   RES STATE  TIME   %CPU COMMAND
  ...                                                      oswsub.sh
20661 cedavis    1  30    0 1096K  880K sleep  0:00  0.09% vmstat
20668 cedavis    1   0    0 1904K 1240K run    0:00  0.05% oswsub.sh
20674 cedavis    1   0    0  968K  624K sleep  0:00  0.05% sleep
20663 cedavis    1  20    0 1080K  864K sleep  0:00  0.05% mpstat

2.4.6.1 Field Descriptions

(1) load averages: 0.11, 0.07, 0.06    12:50:36

This line displays the load averages over the last 1, 5 and 15 minutes as well as the system time. This is quite handy, as top basically includes a timestamp along with the data capture.

Load average is defined as the average number of processes in the run queue. A runnable Unix process is one that is available right now to consume CPU resources and is not blocked on I/O or on a system call. The higher the load average, the more work your machine is doing.

The three numbers are the average depth of the run queue over the last 1, 5, and 15 minutes. In this example, 0.11 processes were on the run queue on average over the last minute, 0.07 over the last 5 minutes, and 0.06 over the last 15 minutes. It is important to determine the average load of the system through benchmarking and then look for deviations. A dramatic rise in the load average can indicate a serious performance problem.
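As a rough illustration of the benchmarking advice above, the following Python sketch pulls the three load averages out of a top header line and compares them with a baseline. The function names, the baseline value and the factor of 3 are illustrative assumptions, not part of OSWatcher; a real baseline has to be measured on the system in question.

```python
import re

def parse_load_averages(line):
    """Return the 1-, 5- and 15-minute load averages from a top header line."""
    match = re.search(
        r"load averages?:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", line)
    if match is None:
        raise ValueError("no load averages found in: %r" % line)
    return tuple(float(value) for value in match.groups())

def exceeds_baseline(loads, baseline, factor=3.0):
    """True if any load average is more than `factor` times the baseline.

    Both `baseline` and `factor` are assumptions; choose them from
    benchmarks of the system being monitored.
    """
    return any(load > baseline * factor for load in loads)

sample = "load averages:  0.11,  0.07,  0.06    12:50:36"
loads = parse_load_averages(sample)
print(loads, exceeds_baseline(loads, baseline=0.10))
```

A multiplicative factor is only one way to define a "deviation"; comparing against a rolling average of earlier snapshots would serve the same purpose.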

(2) 136 processes: 133 sleeping, 2 running, 1 on cpu

This line displays the total number of processes running at the time of the last update. It also indicates how many Unix processes exist, how many are sleeping (blocked on I/O or a system call), how many are stopped (someone in a shell has suspended it), and how many are actually assigned to a CPU. This last number will not be greater than the number of processors on the machine, and the value should also correlate to the machine's load average provided the load average is less than the number of CPUs. Like load average, the total number of processes on a healthy machine usually varies just a small amount over time. Suddenly having a significantly larger or smaller number of processes could be a warning sign.
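To track the process counts over time, the summary line can be broken into its parts. The sketch below is a minimal example that assumes the Solaris-style wording shown above ("N processes: N state, ..."); the function name is hypothetical.

```python
import re

def parse_process_summary(line):
    """Return (total, {state: count}) from a top process summary line."""
    match = re.match(r"\s*(\d+) processes:\s*(.*)", line)
    if match is None:
        raise ValueError("unrecognized summary line: %r" % line)
    total = int(match.group(1))
    # Each item looks like "133 sleeping" or "1 on cpu", separated by commas.
    pairs = re.findall(r"(\d+)\s+([a-z ]+?)(?:,|$)", match.group(2))
    states = {name.strip(): int(count) for count, name in pairs}
    return total, states

total, states = parse_process_summary(
    "136 processes: 133 sleeping, 2 running, 1 on cpu")
print(total, states)
```

Storing the total from each snapshot makes a sudden jump or drop in the number of processes easy to spot.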

(3) Memory: 2048M real, 1061M free, 542M swap in use, 1605M swap free

The "Memory:" line is very important. It reflects how much real and swap memory a computer has, and how much is free. "Real" memory is the amount of RAM installed in the system, a.k.a. the "physical" memory. "Swap" is virtual memory stored on the machine's disk.

Once a computer runs out of physical memory and starts using swap space, its performance deteriorates dramatically. If you run out of swap, you'll likely crash your programs or the OS.
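The Memory line lends itself to a simple check of how much swap is in use. The following sketch assumes every value is reported in megabytes with an "M" suffix, as in the sample above; other top versions may use different units, and the function names are hypothetical.

```python
import re

def parse_memory_line(line):
    """Parse a Solaris-style top 'Memory:' line into megabyte values."""
    pattern = (r"Memory:\s*(\d+)M real,\s*(\d+)M free,\s*"
               r"(\d+)M swap in use,\s*(\d+)M swap free")
    match = re.search(pattern, line)
    if match is None:
        raise ValueError("unrecognized Memory line: %r" % line)
    real, free, swap_used, swap_free = (int(v) for v in match.groups())
    return {"real": real, "free": free,
            "swap_used": swap_used, "swap_free": swap_free}

def swap_used_fraction(mem):
    """Fraction of total swap in use (0.0 if no swap is configured)."""
    total = mem["swap_used"] + mem["swap_free"]
    return mem["swap_used"] / total if total else 0.0

mem = parse_memory_line(
    "Memory: 2048M real, 1061M free, 542M swap in use, 1605M swap free")
print(mem, round(swap_used_fraction(mem), 3))
```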

(4) Individual process fields

Field      Description
PID        Process ID of process
USERNAME   Username of process
THR        Number of threads in the process
PRI        Priority of process
NICE       Nice value of process
SIZE       Total size of a process, including code and data, plus the stack space, in kilobytes
RES        Amount of physical memory used by the process
STATE      Current CPU state of process. The states can be S for sleeping, D for uninterruptible sleep, R for running, T for stopped/traced, and Z for zombie
TIME       The CPU time that a process has used since it started
%CPU       The CPU time that a process has used since the last update
COMMAND    The task's command name

2.4.6.2 What to Look For

(1) Large run queue. A large number of processes waiting in the run queue may be an indication that your system does not have sufficient CPU capacity.

(2) Process consuming lots of CPU. A process which is "hogging" CPU is always suspect. If this process is an Oracle foreground process, it's most likely running an expensive query that should be tuned. Oracle background processes should not hog CPU for long periods of time.

(3) High load averages. Processes should not be backed up on the run queue for extended periods of time.

(4) Low swap space. This is an indication you are running low on memory.
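Item (2) can be automated once the process rows are parsed. The sketch below assumes the eleven-column row layout described in the field table (PID through COMMAND); the threshold is an arbitrary illustrative value, set low here only because the sample rows are nearly idle, and the function names are hypothetical.

```python
FIELD_NAMES = ["PID", "USERNAME", "THR", "PRI", "NICE", "SIZE",
               "RES", "STATE", "TIME", "%CPU", "COMMAND"]

def parse_process_row(row):
    """Split one top process row into a dict keyed by field name."""
    # COMMAND may contain spaces, so split at most 10 times.
    fields = row.split(None, 10)
    if len(fields) != len(FIELD_NAMES):
        raise ValueError("expected %d fields: %r" % (len(FIELD_NAMES), row))
    record = dict(zip(FIELD_NAMES, fields))
    record["%CPU"] = float(record["%CPU"].rstrip("%"))
    return record

def cpu_hogs(rows, threshold_pct):
    """Return parsed rows at or above threshold_pct, busiest first."""
    records = [parse_process_row(row) for row in rows]
    hogs = [r for r in records if r["%CPU"] >= threshold_pct]
    return sorted(hogs, key=lambda r: r["%CPU"], reverse=True)

ROWS = [
    "20661 cedavis 1 30 0 1096K 880K sleep 0:00 0.09% vmstat",
    "20668 cedavis 1 0 0 1904K 1240K run 0:00 0.05% oswsub.sh",
    "20674 cedavis 1 0 0 968K 624K sleep 0:00 0.05% sleep",
    "20663 cedavis 1 20 0 1080K 864K sleep 0:00 0.05% mpstat",
]

# A realistic threshold would be far higher than 0.08 percent.
for record in cpu_hogs(ROWS, threshold_pct=0.08):
    print(record["PID"], record["COMMAND"], record["%CPU"])
```

A single busy snapshot proves little; a process that appears in this list across many consecutive snapshots is the one worth investigating.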

2.4.7 oswvmstat

_vmstat_YY.MM.DD:HH24.dat

These files will contain output from the 'vmstat' command that is obtained and archived by OSWatcher Black Box at specified intervals. These files will only exist if 'vmstat' is installed on the OS and if the OSWbb user has privileges to run the utility.

--This file contains the output of the vmstat command.

The name vmstat comes from "report virtual memory statistics". The vmstat utility does a bit more than this, though. In addition to reporting virtual memory, vmstat reports certain kernel statistics about processes, disk, trap, and CPU activity.

The vmstat utility is fairly standard across UNIX platforms, but each platform has a slightly different version of it. You should consult your operating system man pages for specifics. The sample provided below is for Solaris.

OSWbb runs the vmstat utility at the specified interval and stores the data in the oswvmstat subdirectory under the archive directory. The data is stored in hourly archive files. Each entry in the file contains a timestamp prefixed by *** embedded in the vmstat output. Notice there are 3 entries for each timestamp. You should always ignore the first entry, as this entry is always invalid. The second and third entries are valid, but the second entry is 1 second later than the timestamp and the third entry is 2 seconds later.

Sample vmstat file produced by OSWbb

***Fri Jan 28 12:50:36 EST 2005

 procs          memory                     page         disk         faults        cpu
 r b w    swap    free   re   mf pi po fr de sr  dd f0 s0 --   in   sy   cs  us sy  id
 0 0 0 1761344 1246520    1    6  0  0  0  0  0   2  0  0  0  380 1364  900   4  1  95
 0 0 0 1643920 1086776  331 1485  8 16 16  0  0  31  0  0  0  447 4966 1315  15 31  54
 0 0 0 1643872 1086728    6    0  0  0  0  0  0   0  0  0  0  389 1472  932   0  0 100
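The file layout described above (a *** timestamp line, then three samples of which the first is discarded) can be read with a few lines of Python. This is a minimal sketch, not the parser OSWbba uses: it assumes the Solaris layout of the sample, treats any all-numeric row as a sample, and drops the time zone abbreviation because Python's strptime does not reliably parse names such as EST.

```python
from datetime import datetime, timedelta

SAMPLE = """\
***Fri Jan 28 12:50:36 EST 2005
 procs     memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr dd f0 s0 --   in   sy   cs us sy id
 0 0 0 1761344 1246520 1 6 0 0 0 0 0 2 0 0 0 380 1364 900 4 1 95
 0 0 0 1643920 1086776 331 1485 8 16 16 0 0 31 0 0 0 447 4966 1315 15 31 54
 0 0 0 1643872 1086728 6 0 0 0 0 0 0 0 0 0 0 389 1472 932 0 0 100
"""

def parse_timestamp(line):
    """Parse '***Fri Jan 28 12:50:36 EST 2005', ignoring the zone name."""
    parts = line.lstrip("*").split()
    if len(parts) != 6:
        raise ValueError("unexpected timestamp line: %r" % line)
    del parts[4]  # drop the time zone abbreviation
    return datetime.strptime(" ".join(parts), "%a %b %d %H:%M:%S %Y")

def parse_vmstat_archive(text):
    """Yield (sample_time, values) for the valid samples of each snapshot."""
    timestamp = None
    seen = 0  # numeric rows seen since the last *** line
    for line in text.splitlines():
        if line.startswith("***"):
            timestamp = parse_timestamp(line)
            seen = 0
            continue
        fields = line.split()
        if timestamp is None or not fields:
            continue
        if not all(field.isdigit() for field in fields):
            continue  # column header rows
        seen += 1
        if seen == 1:
            continue  # the first entry after a timestamp is always invalid
        # Second entry is 1 second after the timestamp, third is 2 seconds.
        yield timestamp + timedelta(seconds=seen - 1), [int(f) for f in fields]

for when, values in parse_vmstat_archive(SAMPLE):
    print(when, "idle:", values[-1])
```

The last value of each row is the idle percentage on this layout; on other platforms the column order differs, so the positions would need to be taken from that platform's header row.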

2.4.7.1 Field Descriptions

The vmstat output is actually broken up into six sections: procs, memory, page, disk, faults and CPU. Each section is outlined in the following table.

Field   Description

PROCS
r       Number of processes that are in a wait state and basically not doing anything but waiting to run
b       Number of processes that were in sleep mode and were interrupted since the last update
w       Number of processes that have been swapped out by mm and vm subsystems and have yet to run

MEMORY
swap    The amount of swap space currently available
free    The size of the free list

PAGE
re      page reclaims
mf      minor faults
pi      kilobytes paged in
po      kilobytes paged out
fr      kilobytes freed
de      anticipated short-term memory shortfall (Kbytes)
sr      pages scanned by clock algorithm

DISK
bi      Disk blocks sent to disk devices in blocks per second

FAULTS
in      Interrupts per second, including the CPU clocks
sy      System calls
cs      Context switches per second within the kernel

CPU
us      Percentage of CPU cycles spent on user processes
sy      Percentage of CPU cycles spent on system processes
id      Percentage of unused CPU cycles or idle time when the CPU is basically doing nothing

2.4.7.2 What to look for

The following information should be used as a guideline and not considered hard and fast rules. The information documented below comes from Adrian Cockcroft's book, Sun Performance Tuning. Other operating systems like HP and Linux may have different thresholds.

(1) Large run queue. Adrian Cockcroft defines anything over 4 processes per CPU on the run queue as the threshold for CPU saturation. This is certainly a problem if it lasts for any long period of time.

(2) CPU utilization. The amount of time spent running system code should not exceed 30%, especially if idle time is close to 0%.

(3) A combination of large run queue with no idle CPU is an indication the system has insufficient CPU capacity.

(4) Memory bottlenecks are determined by the scan rate (sr). The scan rate is the pages scanned by the clock algorithm per second. If the scan rate (sr) is continuously over 200 pages per second, then there is a memory shortage.

(5) Disk problems may be identified if the number of processes blocked exceeds the number of processes on the run queue.
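The five guidelines above translate directly into checks over a window of vmstat samples. The sketch below is one possible reading of them: "close to 0%" idle is taken as 5% or less, and "continuously" as every sample in the window, both of which are assumptions. The dictionary keys (r, b, sr, cpu_sy, cpu_id) and the function name are hypothetical.

```python
def vmstat_warnings(samples, cpu_count, idle_floor=5):
    """Return warning strings for a window of vmstat samples.

    Each sample is a dict with keys r, b, sr, cpu_sy and cpu_id.
    """
    if not samples:
        return []
    limit = 4 * cpu_count  # more than 4 runnable processes per CPU
    found = []
    if any(s["r"] > limit for s in samples):
        found.append("run queue above 4 processes per CPU")
    if any(s["cpu_sy"] > 30 for s in samples):
        found.append("system time above 30%")
    if any(s["r"] > limit and s["cpu_id"] <= idle_floor for s in samples):
        found.append("large run queue with no idle CPU")
    if all(s["sr"] > 200 for s in samples):
        found.append("scan rate continuously above 200 pages per second")
    if any(s["b"] > s["r"] for s in samples):
        found.append("more blocked processes than runnable ones")
    return found

# The two valid samples from the vmstat file shown earlier.
window = [
    {"r": 0, "b": 0, "sr": 0, "cpu_sy": 31, "cpu_id": 54},
    {"r": 0, "b": 0, "sr": 0, "cpu_sy": 0, "cpu_id": 100},
]
print(vmstat_warnings(window, cpu_count=1))
```

Since these thresholds come from Solaris tuning guidance, they should be adjusted before being applied to other platforms.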

III. Configuring OS Watcher to Start Automatically

MOS: How To Start OSWatcher Black Box Every System Boot [ID 580513.1]

Oracle support often recommends that the OSWatcher Black Box(*) tool be run for an extended period. Should the system reboot during this time, the system administrator must manually restart OSWatcher and allow it to run until the necessary data have been collected.

--The more data OSW collects, the more useful it is for analyzing the system, so we can configure OSW to start automatically.

To automate this procedure, a simple shell script can be used. Care must be taken to avoid accidentally overwriting the log data upon a restart. The script must also ensure that the OSWatcher tool is run using the correct user privileges.

--OSW can be started automatically with a script, but take care that a restart after a failure does not overwrite the existing logs. That data is important for analysis; if the historical data were overwritten when OSW starts automatically, it could no longer help us diagnose the problem.

The osw-service package can be downloaded from MOS, or from my CSDN page:

http://download.csdn.net/detail/tianlesoftware/4109807

The osw-service RPM package provides a script to run OSWatcher at system boot and to shut it down gracefully at system shutdown. It provides an "osw" service that can be controlled using the standard Linux init(1) script controls:

--The osw-service RPM package provides a script that runs OSWatcher when the system boots and stops it gracefully at system shutdown. The package provides an osw service that is controlled through the standard Linux init(1) script controls:

# /sbin/chkconfig osw on
# /sbin/service osw start

The osw-service RPM package is available as an attachment to this note. Download and install it as any other RPM package. A source RPM is provided for completeness.

[root@rac1 OS Watcher Tool]# rpm -ivh osw_service_0_0_2_1_noarch.rpm

Preparing... ########################################### [100%]

1:osw-service ########################################### [100%]

Before starting the service, first change the settings in the /etc/sysconfig/osw configuration file to fit your situation:

--After installing the osw service and before starting it, edit /etc/sysconfig/osw as follows:

# Set OSWHOME to the directory where your OSWatcher tools are installed
OSWHOME=/u01/oswbb

# Set OSWINTERVAL to the number of seconds between collections
OSWINTERVAL=60
# Set OSWRETENTION to the number of hours logs are to be retained
OSWRETENTION=1
# Set OSWUSER to the owner of the OSWHOME directory
OSWUSER=oracle
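A mistyped setting in this file only surfaces when the service starts, so it can be worth validating the file first. The sketch below is an illustrative checker, not part of the osw-service package (whose init script presumably reads the file with the shell): it parses simple KEY=VALUE lines and checks the four settings shown above. The function names are hypothetical.

```python
import shlex

SAMPLE_CONFIG = """\
# Set OSWHOME to the directory where your OSWatcher tools are installed
OSWHOME=/u01/oswbb
OSWINTERVAL=60
OSWRETENTION=1
OSWUSER=oracle
"""

REQUIRED = ("OSWHOME", "OSWINTERVAL", "OSWRETENTION", "OSWUSER")

def parse_sysconfig(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # shlex removes shell quoting and trailing comments from the value.
        tokens = shlex.split(value, comments=True)
        settings[key.strip()] = tokens[0] if tokens else ""
    return settings

def check_settings(settings):
    """Return a list of problems; an empty list means the file looks sane."""
    problems = ["missing " + key for key in REQUIRED if key not in settings]
    for key in ("OSWINTERVAL", "OSWRETENTION"):
        if key in settings:
            value = settings[key]
            if not (value.isdigit() and int(value) > 0):
                problems.append("%s must be a positive integer" % key)
    return problems

settings = parse_sysconfig(SAMPLE_CONFIG)
print(settings, check_settings(settings))
```

To check the real file, pass the contents of /etc/sysconfig/osw to parse_sysconfig instead of SAMPLE_CONFIG.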

Once this is done, the command:

# /sbin/service osw start

will start the OSWatcher tool upon every boot.

--Once the file has been edited, the OSWatcher autostart script can be started. From then on, OSWatcher starts automatically every time the system reboots.

Note one issue:

[root@rac1 u01]# service osw start

Starting OSWatcher: bash: line 7: ./startOSW.sh: No such file or directory

[FAILED]

OSWatcher was changed after version 4.0, which is why the service fails with this error at startup. Copying startOSWbb.sh to startOSW.sh is enough to fix it:

rac1:/u01/oswbb> cp startOSWbb.sh startOSW.sh

The OSWatcher logs will be stored in ${OSWHOME}/archive as normal. When the osw-service is started, any prior ${OSWHOME}/archive directory will be moved to ${OSWHOME}/archive- first.

--The OSWatcher logs are stored in ${OSWHOME}/archive. When osw-service starts, any existing ${OSWHOME}/archive directory is first moved to ${OSWHOME}/archive-, and only then does OSWatcher start, which removes the risk of the logs being overwritten.
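The move-aside step described above can be sketched as follows. This is not the code from the osw-service package: the text does not show what follows "archive-" in the new directory name, so the timestamp suffix used here is purely an assumption, as is the function name.

```python
import os
import tempfile
import time

def rotate_archive(osw_home, suffix=None):
    """Rename <osw_home>/archive to <osw_home>/archive-<suffix> if it exists.

    The default timestamp suffix is an assumption; the name that
    osw-service really uses is not shown in the text.
    """
    archive = os.path.join(osw_home, "archive")
    if not os.path.isdir(archive):
        return None  # nothing to preserve
    if suffix is None:
        suffix = time.strftime("%Y%m%d%H%M%S")
    target = archive + "-" + suffix
    if os.path.exists(target):
        raise FileExistsError(target)  # never overwrite an older copy
    os.rename(archive, target)
    return target

# Demonstration in a throwaway directory rather than a real OSWHOME.
with tempfile.TemporaryDirectory() as home:
    os.makedirs(os.path.join(home, "archive", "oswvmstat"))
    moved = rotate_archive(home, suffix="demo")
    print(os.path.basename(moved),
          os.path.isdir(os.path.join(home, "archive")))
```

Refusing to overwrite an existing target keeps the guarantee that earlier logs survive even if two restarts happen within the same second.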

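IV. Installing and Configuring OS Watcher Black Box Analyzer

MOS: OS Watcher Black Box Analyzer User Guide [ID 461053.1]

OSWatcher collects data into its archive, but the raw files are awkward to analyze directly, so Oracle provides the OSWbba tool, which analyzes the data collected by OSWbb and presents it as graphs.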

OSWbba is written in Java and requires Java version 1.4.2 or higher. OSWbba can run on any Unix X Windows or PC Windows platform. An X Windows environment is required because OSWbba uses Oracle Chartbuilder, which requires it.

--OSWbba is written in Java, so running it requires at least Java 1.4.2. OSWbba can run on any platform.

OSWbba parses all the OSWbb vmstat, iostat and top utility log files contained in an archive directory. Once the data is parsed, the user is presented with a command line menu which has options for displaying graphs, creating binary gif files of these graphs, generating an html report containing all the graphs with narrative on what to look for, and, new in this release, self-analyzing the files OSWbb creates.

--OSWbb collects data with commands such as vmstat and iostat and stores it in the archive directory; OSWbba analyzes that data. After parsing, the user can work with the data through a command-line menu, choosing to display graphs, generate gif files of the graphs, or produce an html report.

In other words, OSWbba presents the data collected by OSWbb graphically.

OSWbba is certified to run on the following platforms:

--OSWbba can run on the following platforms:

(1) AIX

(2) Solaris

(3) HP-UX

(4) Linux

(5) Windows XP

2.1 Installing OSWbba

OSWbba requires no installation. It comes shipped as a standalone java jar file with OSWbb v4.0.0 and higher.

--OSWbba needs no installation; it is a standalone Java jar.

2.2 Starting OSWbba

Before starting OSWbba, Java 1.4.2 or later must be installed. If Oracle is already installed, the Oracle installation directory also contains a Java installation.

[root@rac1 oswbb]# su - oracle

rac1:/home/oracle>java -version

java version "1.6.0_20"

OpenJDK Runtime Environment (IcedTea6 1.9.7) (rhel-1.39.1.9.7.el6-x86_64)

OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)

--The Java installed here is version 1.6.

To use Oracle's Java instead, add its location to the PATH environment variable, for example:

PATH=$ORACLE_HOME/jdk/bin:$PATH

rac1:/u02/app/oracle/product/11.2.0/db_1/jdk/bin>./java -version

java version "1.5.0_30"

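Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_30-b03)

Java HotSpot(TM) 64-Bit Server VM (build 1.5.0_30-b03, mixed mode)

--The Oracle version here is 11.2.0.3, and its bundled Java is version 1.5.

OSWbba must be run with the -i option, which specifies the input directory: the full path of the OSWbb log archive. The directory must have the same structure as the OSWbb archive and contain its subdirectories, such as oswvmstat, oswiostat, oswps, oswtop and oswnetstat.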
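Before launching OSWbba it can be useful to confirm that the directory passed to -i has the expected layout. The sketch below checks only for the five subdirectories named above; that list is introduced with "such as", so it may be incomplete, and the function name is hypothetical.

```python
import os
import tempfile

# Subdirectories named in the text; a real archive may contain more.
EXPECTED_SUBDIRS = ("oswvmstat", "oswiostat", "oswps", "oswtop", "oswnetstat")

def missing_subdirs(archive_dir, expected=EXPECTED_SUBDIRS):
    """Return the expected subdirectories that are absent from archive_dir."""
    return [name for name in expected
            if not os.path.isdir(os.path.join(archive_dir, name))]

# Demonstration against a temporary directory with two subdirectories.
with tempfile.TemporaryDirectory() as archive:
    for name in ("oswvmstat", "oswtop"):
        os.mkdir(os.path.join(archive, name))
    print(missing_subdirs(archive))
```

For a real check, call missing_subdirs("/u01/oswbb/archive") and treat a non-empty result as a reason to inspect the directory before running the analyzer.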

--Note that displaying the graphs requires X Windows, so run the following in a graphical session:

[root@rac1 u02]# xhost +

access control disabled, clients can connect from any host

Then run the following command:

rac1:/u01/oswbb> java -jar oswbba.jar -i /u01/oswbb/archive

Starting OSW Black Box Analyzer V4.0

OSWatcher Black Box Analyzer Written by Oracle Center of Expertise

Copyright (c) 2012 by Oracle Corporation

Parsing Data. Please Wait...

Parsing file rac1_iostat_12.03.03.2200.dat...

Parsing file rac1_vmstat_12.03.03.2200.dat...

Parsing file rac1_top_12.03.03.2200.dat ...

Parsing Completed.

Enter 1 to Display CPU Process Queue Graphs

Enter 2 to Display CPU Utilization Graphs

Enter 3 to Display CPU Other Graphs

Enter 4 to Display Memory Graphs

Enter 5 to Display Disk IO Graphs

Enter 6 to Generate All CPU Gif Files

Enter 7 to Generate All Memory Gif Files

Enter 8 to Generate All Disk Gif Files

Enter L to Specify Alternate Location of Gif Directory

Enter T to Specify Different Time Scale

Enter D to Return to Default Time Scale

Enter R to Remove Currently Displayed Graphs

Enter P to Generate A Profile

Enter A to Analyze Data

Enter Q to Quit Program

Please Select an Option:2

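Press Q here to exit OSWbba.

The resulting graphs were shown at this point in the original post (images not preserved).

The session above is interactive; OSWbba can also be run entirely from the command line:

java -jar oswbba.jar -i -P -L -6 -7 -8 -B

The options correspond to the menu entries described above (their argument values are omitted here); 6, 7 and 8 generate the image files.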

OSWbba parses all the archive files in memory prior to generating graphs or performing an analysis. If you have a large number of files to parse, you may need to allocate more memory to the java heap. If you see out-of-memory errors such as java.lang.OutOfMemoryError, increase the size of the java heap with the -Xmx flag.

--OSWbba parses all the archive files in memory before generating graphs; if there are many files to parse, the Java heap size can be set explicitly.

$ java -jar -Xmx512M oswbba.jar -i /u01/oswbb/archive

Starting OSWbba V4.0.0
OSWatcher Black Box Analyzer Written by Oracle Center of Expertise
Copyright (c) 2012 by Oracle Corporation

Parsing Data. Please Wait...
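When OSWbba is launched from a script, the heap size is convenient to keep as a parameter. The sketch below only builds the argument list; the function name and defaults are hypothetical, and it places -Xmx before -jar, which is the conventional position for JVM options.

```python
import shlex

def oswbba_command(archive_dir, heap_mb=None, jar="oswbba.jar"):
    """Build the argument list for launching OSWbba on archive_dir."""
    command = ["java"]
    if heap_mb is not None:
        command.append("-Xmx%dM" % heap_mb)  # maximum Java heap size
    command += ["-jar", jar, "-i", archive_dir]
    return command

cmd = oswbba_command("/u01/oswbb/archive", heap_mb=512)
print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can be passed to subprocess.run; if the run fails with java.lang.OutOfMemoryError, it can be rebuilt with a larger heap_mb value.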

From the ITPUB blog, link: http://blog.itpub.net/17252115/viewspace-734946/. If you reproduce this article, please credit the source; otherwise legal liability will be pursued.
