Performance and Tuning on the Solaris 6 7 8

Original post. Category: Linux Operating Systems. Author: yanggq. Date: 2019-07-21 13:51:02
Performance Tuning for Solaris

When a system is running slowly and performance is degrading, it is difficult to know what the cause is. Whether the cause is a lack of memory, disk subsystem bottleneck, or limited scalability of a particular application, there are ways to find, understand, and possibly remove the root cause.

This article gives suggestions on where to start. It covers how to approach performance concerns and address some common performance bottlenecks, introducing a number of concepts such as Intimate Shared Memory (ISM) and priority paging, which are intertwined with performance. The emphasis is on the Solaris 2.6, 7, and 8 Operating Environments. It is not a complete treatment of all performance issues, but is intended as a place to start, to stimulate your thinking about Solaris system performance and suggest where to go next.

1. Approaching Performance Problems

Performance, perhaps more than any other aspect of computer system behavior, requires a holistic approach. To identify a cause rooted in one or more components, a structured approach is a must.

The practical upshot is that for performance the single most important part of the troubleshooting process is to define the problem you are trying to solve. In practical terms this means defining an operation or test case for which:

A) You know how fast it goes now.
B) You have a requirement for it to go "X" times faster, or it has gone "X" times faster under different circumstances.

Setting the baseline from which to start is the first step. Performance analysis is a top-down sport starting by defining the problem to be solved with a clear and concise statement. If you want a system to go faster, you still need to define what attribute of that system you aim to improve and what tradeoffs you will and won't accept. Until you can clearly describe the symptoms of the problem/opportunity, identifying the root cause will always be hit or miss.

Performance analysis is much like detective work where we establish the facts of the case through evidence and observation, being very careful not to jump to a premature conclusion that does not fit the facts -- only naming the suspect when the weight of evidence is overwhelming.

Be skeptical about all assumptions. What others state as a fact may really be an assumption that may or may not be correct. If the assumption is wrong, you may be working with false evidence and will arrive at an incorrect conclusion.

Some words of warning. The Solaris OE is in most cases very good at tuning itself for the workload in hand. The later the release, the less tuning should be required. It has often been found that the root cause of a performance problem is an earlier attempt at performance tuning. Pay attention to the application first and the Operating Environment last.

After any change to the system configuration, such as memory size or disk layout, check that existing performance settings are still valid. The same applies to an upgrade, where carrying parameters over from the old release may limit the performance of the new OE.

2. Performance Monitoring

2.1. Start at the Top

What operation(s) do you see that are symptoms of the performance problem(s)?

For example, are particular types of database query, file, or network operations slower than you think they should be? How specific can you be about the operation in terms of providing a test case, such as an SQL query or 30 lines of C?

Define your problem statement as precisely as possible to explain "what is wrong with what" to your best knowledge. Some examples of good problem statements include:

· An SQL query takes two times longer on VxFS when compared to UFS.

· SVR4 message queue operations take 30 percent longer on OE revision "A" compared to OE revision "B."

· Login to system "A" takes three times longer than login to system "B."

A problem statement should not contain the solution or a possible solution.

Most times, getting a clear statement of the problem is more than halfway to solving it. It is important to take into account the perspective of the user in stating the problem you are trying to solve, which means taking the application perspective. This discipline goes against human nature, which tries to prove or disprove a possible cause by experimenting, rather than assessing the merit of a cause relative to observed facts.

Poor problem statements include:

· mpstat "wt" column shows a high wait time.

· User jobs take too long.

The boundary between the correct functioning of a system and its applications and a performance problem is often a gray area. Entire system hangs and process hangs are beyond the scope of this article. If you suspect incorrect functioning of the system as opposed to a performance problem, then log a call with your Sun Solution Center to develop a course of action. A prerequisite for a high-performance system is that it function correctly.

As part of your proactive maintenance schedule, it is worth checking /var/adm/messages for indications of hardware issues such as disk retries or excessive message generation.

It is well worth looking back at the history of the system. If the system has performed better in the past, draw a timeline detailing the changes made before poor performance was first noticed and the occasions on which it has been seen since.

2.2. Know What Your Systems Do in Good Times

It is a good idea to keep some examples of how your system operates properly. You can easily collect and store monthly performance data, such as:

· *stat family: vmstat, mpstat, iostat, vxstat

· sar

· ps output to show what processes are running (prstat on the Solaris 8 OE)

In addition, a number of commercial and unsupported products are available for performance monitoring.

One of the issues with many such products is that threshold values are different for different hardware configurations. For example, certain values would be considered excessive and may bring a 400-MHz system to a crawl, but they may be acceptable for a 900-MHz system.

2.3. Looking for a Performance Bottleneck

Once you have defined the performance problem you are trying to solve, the next step is to narrow down the area in which the bottleneck occurs.

Questions worth asking at this stage include:

A. What can the application tell me about what it sees as a bottleneck? Taking Oracle as an example, an Oracle DBA should know what BSTAT/ESTAT are and how to run and interpret them. Again, taking the application perspective, BSTAT/ESTAT may show the bottleneck that is limiting Oracle performance and serve as a guide for further analysis.

B. Where are we spending the most time, in kernel or user land? Answer with vmstat, mpstat or sar, ps, and prstat.

C. Are all resources of a similar type equally busy? The intent is to find unequal distribution of resources. For example, one disk may be a bottleneck, or one CPU may be busier than the others. For CPUs, look at mpstat. For disks, use iostat.

D. What process or processes are using the most resources? To see the top processes using CPU and memory resources, use:

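          ps -eo pid,pcpu,args | sort +1n
          ps -eo pid,vsz,args | sort +1n

          /usr/ucb/ps aux | more

Here pcpu is the percentage of CPU used (%cpu) and vsz is the size of the process in kilobytes of virtual memory.

With sort +1n the listing is in ascending order, so the heaviest users of CPU and memory appear at the end of the output. /usr/ucb/ps aux puts the heaviest CPU users at the top.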
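As a minimal sketch of the same idea with the heaviest consumers listed first and the header line kept in place, the pipeline can be wrapped in a small function. The name top_by_column and the sample rows are hypothetical, and the sketch assumes a sort that accepts the POSIX -k key syntax.

```shell
#!/bin/sh
# top_by_column N: read ps-style output on stdin, keep the header line,
# and print the N rows with the largest value in column 2.
# Hypothetical helper; assumes a sort that supports POSIX -k keys.
top_by_column() {
    IFS= read -r header               # first line is the column header
    printf '%s\n' "$header"
    sort -k 2,2nr | head -n "$1"      # numeric, descending on column 2
}

# Made-up sample in the layout of `ps -eo pid,pcpu,args`.
ps_sample='  PID %CPU COMMAND
  312  0.4 /usr/sbin/nscd
 1040 61.2 oracle
  977 12.5 /usr/lib/sendmail'

printf '%s\n' "$ps_sample" | top_by_column 2

# On a live system: ps -eo pid,pcpu,args | top_by_column 10
```

Swapping pcpu for vsz in the ps format gives the same ranking by virtual memory size.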

The Solaris 8 OE provides prstat, which gives a running commentary of CPU and memory use. The output from prstat -cvm is very useful.

We now look at how to use some of the common Solaris commands for initial performance analysis.

2.3.1. vmstat - Using the vmstat Command

The command vmstat is concise. Here we can see an example of insufficient CPU capacity for the executing applications.

% vmstat 15
procs     memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr m0 m1 m2 m3   in   sy   cs us sy id
45 0 0 2887216 182104 3 707 449 6 455 0 80 2  6  1  0 1531 5797  983 61 30  9
58 0 0 2831312 46408 5 983 582 56 3211 0 492 0 0 0  0 1413 4797 1027 69 31  0
55 0 0 2830944 56064 2 649 656 3 806 0 121 0  0  0  0 1441 4627  989 69 31  0
57 0 0 2827704 48760 4 818 723 6 800 0 121 0  0  1  0 1606 4316 1160 66 34  0
56 0 0 2824712 47512 6 857 604 56 1736 0 261 0 0 1  0 1584 4939 1086 68 32  0
58 0 0 2813400 47056 7 856 673 33 2374 0 355 0 0 0  0 1676 5112 1114 70 30  0
60 1 0 2816712 49464 7 861 720 6 731 0 110 7  0  3  0 2329 6131 1067 64 36  0
58 0 0 2817552 48392 4 585 521 0 996 0 146 0  0  0  0 1357 6724 1059 71 29  0

Always ignore the first line of vmstat output, which reports averages since boot rather than the current interval. The column labeled "r" under the "procs" section is the run queue of processes waiting to get on the CPUs. The "id" column is CPU idle time. This machine lacks the CPU resources to keep up with process demand: the run queue stays above 50, idle time is 0, and the majority of CPU time is spent in user space (see the "us" column).

Two approaches can be taken here: first, add extra CPUs, or second, profile the application code to determine whether part of the application can be optimized. A great deal of effort can be expended profiling sections of code, sometimes for little gain. It's a good idea to be realistic when assessing your potential "return on investment" in relation to your time.
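The reading of the "r" and "id" columns described above can be automated in a rough way. The following is a sketch, not a definitive check: the function name flag_cpu_saturation is hypothetical, the CPU count of 4 is an assumption (the article does not state it), and the column positions are those of the vmstat output shown earlier.

```shell
#!/bin/sh
# flag_cpu_saturation NCPU: read `vmstat <interval>` output on stdin and
# print the samples where the run queue exceeds NCPU and idle time is 0.
# Assumes the column layout shown above: "r" is the first column and
# "id" is the last. The first data line (since-boot average) is skipped.
flag_cpu_saturation() {
    awk -v ncpu="$1" '
        $1 !~ /^[0-9]+$/ { next }     # skip the two header lines
        ++seen == 1      { next }     # skip the since-boot summary line
        $1 > ncpu && $NF == 0 {
            printf "sample %d: run queue %d, idle %d%%\n", seen, $1, $NF
        }'
}

# First three samples from the vmstat output above.
vmstat_sample='procs     memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr m0 m1 m2 m3   in   sy   cs us sy id
45 0 0 2887216 182104 3 707 449 6 455 0 80 2  6  1  0 1531 5797  983 61 30  9
58 0 0 2831312 46408 5 983 582 56 3211 0 492 0 0 0  0 1413 4797 1027 69 31  0
55 0 0 2830944 56064 2 649 656 3 806 0 121 0  0  0  0 1441 4627  989 69 31  0'

printf '%s\n' "$vmstat_sample" | flag_cpu_saturation 4

# On a live system: vmstat 15 | flag_cpu_saturation 4
```

A run queue that is persistently larger than the number of CPUs, combined with no idle time, is the pattern the example output shows.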

2.3.2. mpstat - Using the mpstat Command

The mpstat command reports per-processor statistics, with each row of the table representing the activity of one processor.

$ mpstat 5
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0   20   0 3592  3350 2338 1355   43  184  285    0  4578    9   6   1  84
  1   19   0  304   465  283 2139  135  398  140    0  6170    9   6   1  85
  2   25   0  352   507  295 2153  158  433  183    0  7508   12   7   1  81
  3   26   0  357   513  302 2082  155  425  181    0  7460   12   7   0  81
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    3   0 3879  3773 2754 1832   61  322  339    0  3424   12   7   0  81
  1    2   0  555   544  264 3040  197  670  112    0  4828   15   6   0  78
  2   11   0  188   595  269 3141  219  738  121    0  5291   18   6   1  75
  3   65   0  185   585  279 2660  211  673  110    0  5420   22   9   0  69
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    6   0 4028  3633 2620 1695   51  287  343    0  2857   12   8   0  80
  1    7   0  150   545  265 3044  196  663  117    0  4374   14   4   0  81
  2   14   0  226   602  279 2823  225  707  103    0  4715   22   4   1  73
  3    2   0  125   600  282 2810  230  699  118    0  4665   18   4   0  78

mpstat identifies what each CPU is spending its time doing: for example, the distribution of system, user, wait, and idle time, system calls made, lock contention, interrupts, faults, and cross calls.

See the mpstat(1M) man page for details of each column.
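To see whether the CPUs are evenly loaded, the usr and sys columns can be averaged per processor across samples. This is a sketch under stated assumptions: the function name cpu_busy_average is hypothetical, the column positions (usr in column 13, sys in column 14) are taken from the header shown above, and the first sample is discarded on the assumption that, as with vmstat, it reports averages since boot.

```shell
#!/bin/sh
# cpu_busy_average: read `mpstat <interval>` output on stdin and print
# "<cpu> <average usr+sys percent>" for each CPU, one line per CPU.
# Assumes the 16-column layout shown above; skips the first sample.
cpu_busy_average() {
    awk '
        $1 == "CPU" { block++; next }        # each header starts a new sample
        block < 2   { next }                 # ignore the first sample
        { busy[$1] += $13 + $14; n[$1]++ }   # usr is column 13, sys is column 14
        END { for (c in busy) printf "%d %.1f\n", c, busy[c] / n[c] }
    ' | sort -n
}

# The three samples from the mpstat output above.
mpstat_sample='CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0   20   0 3592  3350 2338 1355   43  184  285    0  4578    9   6   1  84
  1   19   0  304   465  283 2139  135  398  140    0  6170    9   6   1  85
  2   25   0  352   507  295 2153  158  433  183    0  7508   12   7   1  81
  3   26   0  357   513  302 2082  155  425  181    0  7460   12   7   0  81
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    3   0 3879  3773 2754 1832   61  322  339    0  3424   12   7   0  81
  1    2   0  555   544  264 3040  197  670  112    0  4828   15   6   0  78
  2   11   0  188   595  269 3141  219  738  121    0  5291   18   6   1  75
  3   65   0  185   585  279 2660  211  673  110    0  5420   22   9   0  69
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    6   0 4028  3633 2620 1695   51  287  343    0  2857   12   8   0  80
  1    7   0  150   545  265 3044  196  663  117    0  4374   14   4   0  81
  2   14   0  226   602  279 2823  225  707  103    0  4715   22   4   1  73
  3    2   0  125   600  282 2810  230  699  118    0  4665   18   4   0  78'

printf '%s\n' "$mpstat_sample" | cpu_busy_average
```

For this sample the four averages are within a few percent of each other, so CPU time is reasonably balanced. The same approach applied to the xcal or intr columns would show that CPU 0 handles most cross calls and interrupts.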

2.3.3. iostat - Using the iostat Command

The iostat command reports disk usage. Each row of the table represents the activity of one disk. Frequently used options include:

Option     Description
n          Identifies disks according to cXtYdZ.
x          Reports extended statistics.
z          New in the Solaris 8 OE. Omits lines where no disk activity has taken place in the sampling interval, which shortens the output and highlights active disks.
p and P    Reports per-partition I/O statistics, which are useful when looking at swap devices.
E          Useful for identifying disks that are generating errors.

Table 1: iostat Options

iostat also reports activity over NFS, although this can make the output rather long.
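One way to pick out a disk bottleneck from extended statistics is to filter on the percent-busy column. The sketch below assumes the column order printed by iostat -xn (r/s, w/s, kr/s, kw/s, wait, actv, wsvc_t, asvc_t, %w, %b, device). The function name busy_disks, the 60 percent threshold, and the sample figures are all hypothetical.

```shell
#!/bin/sh
# busy_disks LIMIT: read `iostat -xn` output on stdin and print the
# devices whose %b (percent busy, column 10) is at or above LIMIT.
# Assumes 11 columns with the device name last.
busy_disks() {
    awk -v limit="$1" '
        NF == 11 && $1 ~ /^[0-9.]+$/ && $10 + 0 >= limit + 0 {
            printf "%s busy=%d%% asvc_t=%s\n", $11, $10, $8
        }'
}

# Made-up sample in the iostat -xn layout.
iostat_sample='                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.3    1.2    2.4    9.6  0.0  0.0    0.0    8.1   0   1 c0t0d0
   85.0  110.4 2720.0 3532.8  0.0  6.2    0.0   31.7   0  97 c1t2d0
    4.1    2.0   32.8   16.0  0.0  0.1    0.0   11.3   0   5 c1t3d0'

printf '%s\n' "$iostat_sample" | busy_disks 60

# On a live system: iostat -xn 5 | busy_disks 60
```

A single device that is far busier than its peers, as c1t2d0 is in this invented sample, is the kind of unequal distribution that question C above is looking for.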

2.3.4. truss - Your Friend

The truss(1) utility executes a specified command and produces a trace of the system calls it performs, the signals it receives, and the machine faults it incurs.

truss can also follow the execution of an existing process. It is a very useful tool to narrow down what resources an application is requesting from the kernel that are slow or are used to excess.

If you don't know about truss, then read the man page and give it a try. The -m option is very useful for showing faults such as page faults. The -c option gives a summary of:

· System calls

· Faults

· Signals

· Cumulative times spent in each system call type

· Number of failed system calls

2.3.5. lockstat - Contention for Resources

Kernel locks protect multiple updates to data structures and control access to resources such as disk caches, network caches, and various kernel caches.

lockstat executes a command and reports all kernel lock activity for the duration of that command, irrespective of the process or device that made the request for a lock. See the lockstat(1M) man page. The option -s 10 reports the stack of the kernel threads contending on each lock.

2.3.6. trapstat - Runtime Trap Statistics

trapstat is a tool to provide runtime trap statistics on an UltraSPARC® processor running an otherwise stock Solaris kernel. For I-TLB and D-TLB misses, trapstat can optionally display the amount of time spent in the operating system's TLB miss handler. For interrupt vector traps, trapstat can optionally display the interrupting device.

2.3.7. gprof - Application Profiling

For C, C++, and FORTRAN applications, try compiling with -xpg and executing the program with a typical workload that demonstrates the performance problem. Run gprof on the generated gmon.out file. This will show where the application is spending most of its time.

Forte TeamWare (formerly Sun WorkShop TeamWare, now part of Sun Studio developer tools) has a number of useful tools, such as the analyzer which provides a graphical representation of where the application is spending its time. For further details, see Sun Studio and Forte TeamWare Documentation and Rajat Garg and Ilya Sharapov's Sun BluePrints book, Techniques for Optimizing Applications: High Performance Computing.

2.3.8. proc Tools

The proc tools are utilities that exercise features of /proc to report attributes of a process, such as:

· pstack - the call stack

· ptree - a tree of process relationships

· pfiles - a list of open file descriptors

· pldd - a list of dynamic libraries in use by the running processes

See the proc(1) man page for more information.

3. Some Commonly Asked Questions and Some Suggestions

3.1. What Do 64-Bit Sizing and Capacities Provide?

From a performance point of view, the ability to run 64-bit applications has two main benefits. The first is that much larger problems can be solved efficiently using a bigger process address space. The second is that integer arithmetic computations get to use 64-bit registers and operations.

Overall, programs get slightly larger due to larger pointer values in code and data structures. This, in turn, means that CPU caches are a little less likely to have enough cache lines, and a slight slowdown might occur in programs that could run just as well in a 32-bit environment.

In a 64-bit kernel, kernel thread stacks are 16 Kbytes rather than 8 Kbytes, though the effect is often negligible.

3.2. Free Memory

Examining a Solaris system to determine the amount of memory that is free has traditionally been an area of confusion.

For releases before the Solaris 8 OE, to look for a shortage of memory, do not rely upon the "free" column or the "sr" column alone. A low value in the "free" column is not in itself an indication of a lack of memory. The page cache is holding onto pages in case they may be needed again. The VM subsystem will only reclaim memory when needed.

Much has been written on this subject in the SunWorld articles and Sun Performance and Tuning - Java and the Internet. To determine if there is a lack of memory, examine the 12th column ("sr" or scan rate) in conjunction with I/O traffic to the swap partitions (using iostat -P) on disk. The "sr" column may have high figures if a large amount of I/O is being generated through the file system and the page scanner needs to run in order to free up pages for I/O.

The pageout scanner runs only when the free list shrinks below a threshold (lotsfree in pages). Any process or file inactive and not locked in memory may be paged out. The size of the freelist will appear to shrink and will remain at that value (lotsfree). The page daemon will start to scan for memory to be reclaimed from the page cache and exited and idle processes when the amount on the freelist drops below the lotsfree threshold. There is no way for the "free" value to grow much above the threshold, because there is no way to get the page scanner to reclaim memory beyond the threshold. It is more efficient for pages to be left in the page cache, rather than needlessly put on the free list.

The Solaris 8 OE implements a more efficient algorithm within the segmap driver to provide the pages required for I/O. The "free" column in vmstat really reflects memory that is free and not used by the page cache. The -p option has been added to vmstat to give a more accurate breakdown of paging behavior.

For individual processes, the pmap command reports the address space layout of an individual process (-x option is useful).

3.3. Priority Paging

Priority paging was introduced with the Solaris 7 OE and was back-ported to the Solaris 2.6 OE (kernel patch 105181-XX) and the Solaris 2.5.1 OE (kernel patch 103640-XX). Recent versions of both patches are available from the SunSolve Online program.

Priority paging provides an improved paging algorithm that can significantly enhance system response when the file system is being used. Priority paging introduces a new additional watermark, cachefree. The paging parameters are now:

                minfree < desfree < lotsfree < cachefree   

By default the new behavior is turned off in the Solaris 2.5.1, 2.6, and 7 Operating Environments, so it is important to enable this functionality on systems that are paging noticeably. cachefree is set to lotsfree if priority_paging is not enabled. If it is enabled, then cachefree is set to 2 times lotsfree by default.

Adjusting this parameter tends to make switching between windows on desktop systems faster, and this is a big help for systems running databases that read large files into memory from the file system. For systems that perform a large amount of I/O through a file system, speed increases of several hundred percent have been seen for compute-intensive jobs with a large data set.

The Solaris 8 OE uses a different algorithm, which removes the limiting factor of previous releases where the page scanner had to scan for memory to supply the segmap driver with memory in which to place I/O. All pages that the segmap no longer uses are put on a list allowing immediate reuse. Do not set priority_paging in the Solaris 8 OE. In addition, the Solaris 8 OE should not require tuning of virtual memory parameters, except on large systems where setting fastscan and maxpgio to higher values may be beneficial.
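On the releases where it applies, priority paging is enabled with the line "set priority_paging=1" in /etc/system. The following sketch checks a file in /etc/system syntax against the guidance above. The function name check_priority_paging and its messages are hypothetical, the release argument is assumed to be a uname -r string (5.7 for the Solaris 7 OE, 5.8 for the Solaris 8 OE), and a grep that understands POSIX character classes is assumed.

```shell
#!/bin/sh
# check_priority_paging FILE RELEASE: report whether FILE (in /etc/system
# syntax) enables priority paging, and whether that suits RELEASE
# (a `uname -r` string such as 5.7 or 5.8). Hypothetical helper.
check_priority_paging() {
    if grep '^[[:space:]]*set[[:space:]][[:space:]]*priority_paging[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$1" >/dev/null
    then
        case "$2" in
            5.8) echo "priority_paging is set: remove it on the Solaris 8 OE" ;;
            *)   echo "priority_paging is enabled" ;;
        esac
    else
        case "$2" in
            5.8) echo "priority_paging is not set: correct for the Solaris 8 OE" ;;
            *)   echo "priority_paging is not enabled (the default)" ;;
        esac
    fi
}

# Demonstration on a scratch file rather than the live /etc/system.
demo=${TMPDIR:-/tmp}/system.demo.$$
printf '%s\n' '* comment lines in /etc/system start with an asterisk' \
              'set priority_paging=1' > "$demo"
check_priority_paging "$demo" 5.7
check_priority_paging "$demo" 5.8
rm -f "$demo"

# On a live system: check_priority_paging /etc/system "$(uname -r)"
```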

For more information on priority paging, refer to Sun Performance, Priority Paging Frequently Asked Questions.

3.4. Intimate Shared Memory (ISM)

With ISM, shared memory is locked in physical memory and cannot be paged out. Memory management data structures that are normally created on a per-process basis are created once and then shared by every process. In the Solaris 2.6 OE, a further optimization takes place as the kernel tries to find 4-Mbyte contiguous blocks of physical memory that can be used as large pages to map the shared memory. This greatly reduces memory management unit overhead. (See page 333 of Sun Performance and Tuning - Java and the Internet.) By default, applications such as Oracle, Informix, and Sybase use a special flag to specify that they want ISM.

ISM is an important optimization that makes more efficient use of the kernel and hardware resources involved in the implementation of virtual memory. In addition, ISM provides a means of keeping heavily used shared pages locked in memory.

Intimate shared memory is enabled by default, and there is no need to edit the /etc/system file to turn on this feature. In a kernel with current patch levels, turning off ISM can cause system degradation and possibly a hang condition. In addition, database configuration files, such as Oracle's init.ora file, should not have use_ism=false because it turns off ISM.

3.5. Swap Configurations Related to Shared Memory

To understand swap configurations related to shared memory, see "Clearing Up Swap Space Confusion" by Adrian Cockcroft.

The two primary considerations in setting swap space size are to have enough:

1. Memory to avoid swapping in common operation

2. Swap to get a crash dump

3.6. Interprocess Communication (IPC) Parameters

The values for the following IPC parameters need to be determined by your database administrator (DBA). Sun Solution Centers cannot give recommendations for what the actual IPC parameter settings should be. These values are application dependent.

It is extremely easy to mistype the /etc/system setting for IPC parameters. Such an error can have a significant performance impact on the application. To check for a typo, trawl through /var/adm/messages for a message of the form:

      genunix: [ID 492708 kern.notice] sorry, variable 'seminfo_semopn' 
      is not defined in the 'semsys' 

This indicates a typo in the line. Grep for "sorry."
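The grep can be taken one step further to list just the misspelled names. A minimal sketch, assuming each message occupies a single line in the log as syslog writes it; the function name undefined_tunables, the timestamps, and the host name in the scratch log are made up.

```shell
#!/bin/sh
# undefined_tunables FILE: list the variable names from "sorry, variable"
# messages in a syslog-style file. Hypothetical helper.
undefined_tunables() {
    grep "sorry, variable" "$1" |
        sed "s/.*sorry, variable '\([^']*\)'.*/\1/" |
        sort -u
}

# Scratch log for illustration; the timestamps and host name are made up.
log=${TMPDIR:-/tmp}/messages.demo.$$
cat > "$log" <<'EOF'
Jul 21 13:50:01 host1 genunix: [ID 492708 kern.notice] sorry, variable 'seminfo_semopn' is not defined in the 'semsys'
Jul 21 13:50:02 host1 sendmail[201]: an unrelated message
EOF
undefined_tunables "$log"     # prints: seminfo_semopn
rm -f "$log"

# On a live system: undefined_tunables /var/adm/messages
```

Each name printed corresponds to a line in /etc/system that the kernel rejected at boot, so the intended setting is not in effect.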

The Solaris 8 OE has better default IPC values than previous releases.

For releases previous to the Solaris 2.6 OE, more swap space (as "backing store") is needed for shared memory. swap -l reports sizes in 512-byte blocks, so divide the block counts by 2 to get kilobytes, or by 2048 to get megabytes. There should be at least twice as much swap available as allocated shared memory (shmmax).
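The arithmetic can be sketched as follows. swap -l reports sizes in 512-byte blocks, so 2048 blocks make one megabyte. The helper names blocks_to_mb and swap_covers_shm and the example figures are hypothetical, and a shell with POSIX $(( )) arithmetic is assumed.

```shell
#!/bin/sh
# Hypothetical helpers; assume a shell with POSIX $(( )) arithmetic.

# blocks_to_mb BLOCKS: convert a count of 512-byte blocks to megabytes.
# 2 blocks = 1 KB, so 2048 blocks = 1 MB.
blocks_to_mb() {
    echo $(( $1 / 2048 ))
}

# swap_covers_shm BLOCKS SHM_BYTES: succeed when swap is at least twice
# the allocated shared memory (the rule of thumb for pre-2.6 releases).
swap_covers_shm() {
    [ $(( $1 * 512 )) -ge $(( 2 * $2 )) ]
}

blocks_to_mb 1048576                         # prints 512
if swap_covers_shm 1048576 268435456; then   # 512 MB swap, 256 MB shared memory
    echo "swap is sufficient"
else
    echo "add swap"
fi
```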

Here are the default and maximum values for shmmax:

            Default           Maximum              Releases
shmmax      1048576 (1 MB)    4294967295 (4 GB)    2.5.1, 2.6, 32-bit Solaris 7
                              2147483647 (2 GB)    2.5 or lower

In the Solaris 2.6 OE, shmmax and shmmin are unsigned integers (32 bit). In the 32-bit Solaris 7 OE, shmmax and shmmin are unsigned integers (32 bit). In the 64-bit Solaris 7 OE, shmmax and shmmin are unsigned longs (64 bit). In all cases, shmmni and shmseg are signed integers (32 bit). Table 2 summarizes these parameters and their types.

Command    Solaris 2.6 32-bit       Solaris 7 32-bit         Solaris 7 64-bit
shmmax     unsigned int (32 bit)    unsigned int (32 bit)    unsigned long (64 bit)

Source: ITPUB blog, link: http://blog.itpub.net/26706/viewspace-64600/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
