Linux Performance Analysis Tools

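CPU usage: top, topas (AIX), prstat (Solaris)

Disk utilization: iostat, sar -d 1 10

Memory status: free (the free column of the "-/+ buffers/cache" line is the memory actually available), vmstat 1 3 (sample once per second, three samples in total).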

linux:/etc/rc.d # free

total used free shared buffers cached

Mem: 8082700 3882856 4199844 0 16980 2600376

-/+ buffers/cache: 1265500 6817200

Swap: 2096472 0 2096472

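6817200 = 4199844 + 16980 + 2600376: the free value on the second line of the free output ("-/+ buffers/cache") is the memory actually available to applications, that is, free + buffers + cached from the Mem: line. All values are in KB, the default unit of free.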
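The arithmetic above can be reproduced with a short sketch. The function names `available_kb` and `used_by_apps_kb` are hypothetical helpers introduced here for illustration; the input numbers are the ones from the Mem: line of the free output shown earlier.

```python
def available_kb(free_kb: int, buffers_kb: int, cached_kb: int) -> int:
    """Memory that can be handed to applications: free + buffers + cached (KB)."""
    return free_kb + buffers_kb + cached_kb


def used_by_apps_kb(used_kb: int, buffers_kb: int, cached_kb: int) -> int:
    """Memory actually held by applications: used - buffers - cached (KB)."""
    return used_kb - buffers_kb - cached_kb


# Values taken from the "Mem:" line of the sample output (all in KB).
print(available_kb(4199844, 16980, 2600376))     # 6817200
print(used_by_apps_kb(3882856, 16980, 2600376))  # 1265500
```

Both results match the "-/+ buffers/cache" line of the sample output.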

linux:/etc/rc.d # vmstat

procs -----------memory---------- ---swap-- -----io---- -system-- -----cpu------

r b swpd free buff cache si so bi bo in cs us sy id wa st

5 0 0 4199588 16988 2600384 0 0 0 12 11 13 0 1 98 1 0

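The buff, cache, and free values reported by vmstat correspond to the first line (Mem:) of the free output; the small differences arise only because the two commands ran at different moments.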

top - 11:44:09 up 8 days, 2:20, 7 users, load average: 0.00, 0.03, 0.04

Tasks: 155 total, 1 running, 154 sleeping, 0 stopped, 0 zombie

Cpu(s): 0.4%us, 0.5%sy, 0.0%ni, 98.4%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st

Mem: 8082700k total, 3882356k used, 4200344k free, 18056k buffers

Swap: 2096472k total, 0k used, 2096472k free, 2600416k cached

The buffers, cached, and free values reported by top likewise correspond to the first line (Mem:) of the free output.

1. vmstat

Sample vmstat output on Linux:

procs -----------memory---------- ---swap-- -----io---- -system-- -----cpu------

r b swpd free buff cache si so bi bo in cs us sy id wa st

0 0 582632 151717 14344 705264 0 1 6 25 9 11 4 3 93 0 0

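CPU performance:

r In run queue

The number of processes that are running or waiting for a CPU time slice. If this value stays above the number of CPUs in the system for a long time, the CPU is a bottleneck and more CPU capacity is needed.

b Blocked for resources (I/O, paging, etc.)

The number of processes waiting for a resource, for example I/O or swapping. A high value indicates an I/O or memory bottleneck.

us user time, including nice time

The percentage of CPU time consumed by user processes. A high us means user processes are using a lot of CPU time; if it stays above 50% for a long time, consider optimizing the program or its algorithms.

sy system time

The percentage of CPU time spent in the kernel.

As a rule of thumb, the reference value for us+sy is 80%; if us+sy exceeds 80%, CPU resources may be insufficient.

If us+sy stays below 30% while b is frequently large, processes are waiting on I/O or swapping, which points to a memory or I/O problem rather than a CPU problem.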
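The rules of thumb above can be written down as a small check. This is only a sketch: the function name `cpu_findings` and the `b_high` threshold for a "large" b are assumptions made for illustration, not anything defined by vmstat, and the rules are meant for values sustained over many samples rather than for a single line.

```python
def cpu_findings(r, b, us, sy, ncpu, b_high=2):
    """Apply the rules of thumb for the vmstat CPU columns.

    r, b   : run-queue and blocked process counts
    us, sy : user and system CPU percentages
    ncpu   : number of CPUs in the machine
    b_high : assumed cut-off for "b is large" (hypothetical value)
    """
    findings = []
    if r > ncpu:
        findings.append("run queue longer than CPU count: possible CPU shortage")
    if us + sy > 80:
        findings.append("us+sy above 80%: possible CPU shortage")
    if us + sy < 30 and b >= b_high:
        findings.append("low CPU use but many blocked processes: check I/O or memory")
    return findings


# The sample vmstat line shown earlier (r=0, b=0, us=4, sy=3) raises no finding.
print(cpu_findings(r=0, b=0, us=4, sy=3, ncpu=4))  # []
```

In practice the inputs would be averages over a longer vmstat run, since a single sample can be misleading.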

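Memory performance:

swpd

The amount of virtual memory (swap) in use, in KB. A non-zero swpd is generally nothing to worry about and does not affect performance as long as si and so stay at 0 over time.

free

The amount of idle physical memory, in KB.

buff

Memory used as buffers (buffer cache), in KB. Buffers are generally needed only for reads and writes on block devices.

cache

Memory used as page cache, in KB. It mostly caches file system data, and frequently accessed files are kept in it. A large cache value means many files are cached; if bi in the io columns is small at the same time, the file system is being served efficiently from the cache.

si Amount of memory swapped in from disk (/s).

The amount of memory swapped in from disk to memory per second.

so Amount of memory swapped out to disk (/s).

The amount of memory swapped out from memory to disk per second.

Normally si and so are both 0. If they stay non-zero over a long period, the system is short of memory and more memory should be added.

in

The number of interrupts per second.

cs

The number of context switches per second.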
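The si/so rule above depends on the values being non-zero persistently, not just once. A minimal sketch of that test follows; the function name `swapping_persistently` and the `min_fraction` cut-off are assumptions chosen for illustration.

```python
def swapping_persistently(samples, min_fraction=0.9):
    """Return True if si or so is non-zero in most of the samples.

    samples      : iterable of (si, so) pairs, one per vmstat interval
    min_fraction : assumed share of intervals that counts as "long-term"
    """
    samples = list(samples)
    if not samples:
        return False
    active = sum(1 for si, so in samples if si > 0 or so > 0)
    return active / len(samples) >= min_fraction


# One busy interval among ten quiet ones is not a memory shortage.
print(swapping_persistently([(0, 0)] * 9 + [(5, 0)]))  # False
# Swap traffic in every interval is.
print(swapping_persistently([(12, 30)] * 10))          # True
```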

2. sar

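sar -w 1 100 reports swapping and switching activity. The field names below are those of the System V sar found on Solaris and HP-UX; Linux sysstat reports swapping with sar -W (pswpin/s, pswpout/s) and context switches with sar -w (cswch/s).

swpin/s: Number of process swapins per second;
swpot/s: Number of process swapouts per second;
bswin/s: Number of 512-byte units transferred for swapins per second;
bswot/s: Number of 512-byte units transferred for swapouts per second;
pswch/s: Number of process context switches per second.
Analysis:
If swpin/s is greater than zero, pay close attention to swpot/s.
Also watch pswch/s; a large value means processes are being switched frequently.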

sar -b 1 100 reports buffer cache activity.

bread/s: Number of physical reads per second from the disk (or other block devices) to the buffer cache;
bwrit/s: Number of physical writes per second from the buffer cache to the disk (or other block device);
lread/s: Number of reads per second from buffer cache;
lwrit/s: Number of writes per second to buffer cache;
%rcache: Buffer cache hit ratio for read requests, i.e., 1 - bread/lread;
%wcache: Buffer cache hit ratio for write requests, i.e., 1 - bwrit/lwrit;
pread/s: Number of reads per second from character device using the physio() (i.e., raw I/O) mechanism;
pwrit/s: Number of writes per second to character device using the physio() (i.e., raw I/O) mechanism.
Analysis:

If %rcache is below 90% and %wcache is not around 70%, find out which applications are issuing what kind of reads and writes, and consider whether the buffer cache needs to be enlarged.
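The hit-ratio definitions above (1 - bread/lread and 1 - bwrit/lwrit) can be checked by hand with a short sketch. The function name `hit_ratio_percent` is hypothetical, and the sample rates are made-up numbers, not measurements.

```python
def hit_ratio_percent(physical_per_s, logical_per_s):
    """Buffer cache hit ratio in percent: 100 * (1 - physical / logical).

    Returns None when there were no logical requests in the interval,
    because the ratio is undefined in that case.
    """
    if logical_per_s == 0:
        return None
    return 100.0 * (logical_per_s - physical_per_s) / logical_per_s


# Illustrative rates: 5 physical reads for every 100 logical reads.
print(hit_ratio_percent(5, 100))    # 95.0
# 40 physical writes for every 100 logical writes.
print(hit_ratio_percent(40, 100))   # 60.0
```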

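sar -d 1 100 reports activity for each disk and tape device.

device: Device name;
%busy: Portion of time device was busy servicing a request;
avque: Average number of requests outstanding for the device;
r+w/s: Number of data transfers per second (read and writes) from and to the device;
blks/s: Number of 512-byte blocks transferred per second from and to the device;
avwait: Average time (in milliseconds) that transfer requests waited idly on queue for the device;
avserv: Average time (in milliseconds) to service each transfer request (includes seek, rotational latency, and data transfer times) for the device.
Analysis (await, svctm, and %util are the Linux sysstat names for the wait time, service time, and busy percentage):
await is the average wait time per device I/O operation, in milliseconds (time in the queue plus service time).
svctm is the average service time per device I/O operation, in milliseconds.
%util is the percentage of each second spent on I/O operations.
Disk I/O performance is usually judged as follows:
     Under normal conditions svctm is smaller than await. svctm depends mainly on disk performance; CPU and memory load also affect it, and an excessive number of requests indirectly increases it.
     await generally depends on svctm, the I/O queue length, and the I/O request pattern. If svctm is very close to await, there is almost no I/O waiting and the disk is performing well. If await is much higher than svctm, requests are waiting too long in the I/O queue and applications on the system will slow down; replacing the disk with a faster one can solve this.
     %util is another important indicator of disk I/O. If %util approaches 100%, too many I/O requests are being issued and the I/O system is running at full load, so the disk may be a bottleneck. Left that way, system performance will inevitably suffer; the fix is to optimize the program or move to a faster, higher-performance disk.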
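The three criteria above can be combined into one check. This is a sketch under stated assumptions: the function name `disk_findings`, the `wait_factor` that defines "await much higher than svctm", and the `util_limit` that defines "close to 100%" are all illustrative choices, and the inputs are invented values.

```python
def disk_findings(await_ms, svctm_ms, util_pct, wait_factor=3.0, util_limit=90.0):
    """Apply the await/svctm/%util rules of thumb to one device.

    wait_factor : assumed ratio await/svctm that counts as "much higher"
    util_limit  : assumed %util that counts as "close to 100%"
    """
    findings = []
    if svctm_ms > 0 and await_ms / svctm_ms >= wait_factor:
        findings.append("await far above svctm: requests are queuing")
    if util_pct >= util_limit:
        findings.append("%util near 100%: device close to saturation")
    return findings


# await close to svctm and low utilization: nothing to report.
print(disk_findings(await_ms=6.0, svctm_ms=5.0, util_pct=20.0))  # []
# Long queue wait on a nearly saturated device: both rules fire.
print(disk_findings(await_ms=80.0, svctm_ms=5.0, util_pct=99.0))
```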

3. iostat

iostat -x 1

Linux 2.6.32.12-0.7-default (linux) 08/06/13 _x86_64_

avg-cpu: %user %nice %system %iowait %steal %idle

4.10 0.00 3.12 0.35 0.00 92.42

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util

sda 0.10 8.93 0.46 3.81 22.06 93.00 26.93 0.06 13.61 5.12 2.19

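The fields reported for each I/O device mean the following:

rrqm/s: Number of read requests merged per second, i.e. rmerge/s.

wrqm/s: Number of write requests merged per second, i.e. wmerge/s.

r/s: Number of read I/O operations completed per second, i.e. rio/s.

w/s: Number of write I/O operations completed per second, i.e. wio/s.

rsec/s: Number of sectors read per second.

wsec/s: Number of sectors written per second.

rkB/s: Kilobytes read per second. Half of rsec/s, because each sector is 512 bytes.

wkB/s: Kilobytes written per second. Half of wsec/s.

avgrq-sz: Average size of each device I/O operation, in sectors.

avgqu-sz: Average I/O queue length, i.e. aveq/1000 (aveq is measured in milliseconds).

await: Average wait time per device I/O operation, in milliseconds.

svctm: Average service time per device I/O operation, in milliseconds.

%util: Percentage of each second spent on I/O operations, in other words the share of each second during which the I/O queue was non-empty.

If %util approaches 100%, too many I/O requests are being issued and the I/O system is running at full load, so the disk may be a bottleneck.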
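To make the relationships between these fields concrete, the sketch below parses the sda line from the sample output above and derives rkB/s, wkB/s, and the average request size from the other columns. The helper names `parse_device_line` and `sectors_to_kb` are hypothetical; the header and data line are copied from the sample.

```python
HEADER = "Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util"
LINE = "sda 0.10 8.93 0.46 3.81 22.06 93.00 26.93 0.06 13.61 5.12 2.19"


def parse_device_line(header, line):
    """Map the column names of an `iostat -x` header to one device's values."""
    names = header.split()[1:]          # drop the leading "Device:" label
    fields = line.split()
    stats = dict(zip(names, (float(v) for v in fields[1:])))
    return fields[0], stats


def sectors_to_kb(sectors_per_s):
    """Convert 512-byte sectors per second to kilobytes per second."""
    return sectors_per_s / 2


name, stats = parse_device_line(HEADER, LINE)
print(name)                              # sda
print(sectors_to_kb(stats["rsec/s"]))    # 11.03
print(sectors_to_kb(stats["wsec/s"]))    # 46.5

# Average request size in sectors: sectors moved divided by requests completed.
avg_request = (stats["rsec/s"] + stats["wsec/s"]) / (stats["r/s"] + stats["w/s"])
print(round(avg_request, 2))             # close to the reported avgrq-sz of 26.93
```

The recomputed request size differs slightly from the reported 26.93 because the printed columns are already rounded to two decimals.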

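Testing I/O speed with dd:

dd bs=1024 count=10240 if=/dev/zero of=./tt    # file system write speed (data may still sit in the page cache)

dd bs=1024 count=10240 if=/dev/zero of=./tt conv=fdatasync    # physical disk write speed (data is flushed to disk before dd exits)

dd bs=1024 count=10240 if=./tt of=/dev/null    # file system read speed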
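The difference between the cached and the synced write test can also be sketched in Python. This is a rough analogue under stated assumptions, not a replacement for dd: the function name `write_throughput_mb_s` is hypothetical, Python buffers writes in user space where dd issues one write() per block, and `os.fdatasync` is used when the platform provides it (falling back to `os.fsync` otherwise).

```python
import os
import tempfile
import time


def write_throughput_mb_s(directory, block_size=1024, count=10240, sync=False):
    """Write count blocks of zeros to a temp file and return MB/s.

    With sync=True the data is flushed to the device before the clock
    stops, similar in spirit to dd's conv=fdatasync.
    """
    block = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(count):
                f.write(block)
            if sync:
                f.flush()  # push Python's buffer to the kernel first
                flush_to_disk = getattr(os, "fdatasync", os.fsync)
                flush_to_disk(f.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the test file
    return block_size * count / elapsed / 1e6


if __name__ == "__main__":
    print("cached write: %.1f MB/s" % write_throughput_mb_s("."))
    print("synced write: %.1f MB/s" % write_throughput_mb_s(".", sync=True))
```

The defaults mirror the dd commands above (1024-byte blocks, 10240 of them, 10 MiB in total); on most systems the synced figure is noticeably lower than the cached one.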
