[Original] Dashu's Troubleshooting Notes (6): High iowait and High Load on the Dubbo Monitor Service

Date: 2021-09-25 03:14:26

1 Problem

The server hosting the Dubbo monitor is in an abnormal state: iowait stays high, and so does load. The monitoring graphs are as follows:

iowait graph:

[Figure: server iowait before the fix]

load graph:

[Figure: server load before the fix]

2 Analysis

The iotop command shows the current disk I/O of the system and how much disk I/O each process is using:

[Figure: iotop output]

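From its output we can identify the pid of the process consuming the I/O.

Running

cat /proc/${pid}/io

shows that process's detailed read and write counters.

Running

ps aux|grep ${pid}

shows the exact command line of the process.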
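To turn the `/proc/${pid}/io` counters into a rate, two samples can be compared. The following is a minimal sketch, not part of the original investigation; the class and method names are hypothetical. It relies only on the file consisting of `name: value` lines such as `read_bytes` and `write_bytes`.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

public class ProcIoSketch {
    /** Parses the "name: value" lines of /proc/<pid>/io into a map. */
    public static Map<String, Long> parse(String content) {
        Map<String, Long> counters = new LinkedHashMap<>();
        for (String line : content.split("\n")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // skip blank or unexpected lines
            }
            String name = line.substring(0, colon).trim();
            long value = Long.parseLong(line.substring(colon + 1).trim());
            counters.put(name, value);
        }
        return counters;
    }

    /** Average bytes per second between two samples of one counter. */
    public static double ratePerSecond(long before, long after, double seconds) {
        return (after - before) / seconds;
    }

    private static Map<String, Long> sample(String pid) throws Exception {
        byte[] raw = Files.readAllBytes(Paths.get("/proc", pid, "io"));
        return parse(new String(raw, StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) throws Exception {
        // "self" samples this JVM; pass a pid to inspect another process (Linux only).
        String pid = args.length > 0 ? args[0] : "self";
        Map<String, Long> first = sample(pid);
        Thread.sleep(1000);
        Map<String, Long> second = sample(pid);
        System.out.printf("read:  %.1f KB/s%n",
                ratePerSecond(first.get("read_bytes"), second.get("read_bytes"), 1.0) / 1024);
        System.out.printf("write: %.1f KB/s%n",
                ratePerSecond(first.get("write_bytes"), second.get("write_bytes"), 1.0) / 1024);
    }
}
```

Reading another process's `io` file typically requires running as the same user or as root.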

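These commands identified the process as Dubbo's monitor process. Dumping its thread stacks with jstack showed that, apart from the RUNNABLE threads sitting in

java.lang.Thread.State: RUNNABLE

at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)

two threads looked suspicious: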

"DubboMonitorTimer-thread-1" daemon prio=10 tid=0x00007f53b0593000 nid=0x363e runnable [0x00007f53ac36d000]

java.lang.Thread.State: RUNNABLE

at java.io.FileInputStream.readBytes(Native Method)

at java.io.FileInputStream.read(FileInputStream.java:272)

at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)

at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)

at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)

- locked <0x0000000779d506f8> (a java.io.FileReader)

at java.io.InputStreamReader.read(InputStreamReader.java:184)

at java.io.BufferedReader.fill(BufferedReader.java:154)

at java.io.BufferedReader.readLine(BufferedReader.java:317)

- locked <0x0000000779d506f8> (a java.io.FileReader)

at java.io.BufferedReader.readLine(BufferedReader.java:382)

at com.alibaba.dubbo.monitor.simple.SimpleMonitorService.appendData(SimpleMonitorService.java:322)

at com.alibaba.dubbo.monitor.simple.SimpleMonitorService.draw(SimpleMonitorService.java:263)

at com.alibaba.dubbo.monitor.simple.SimpleMonitorService.access$300(SimpleMonitorService.java:64)

at com.alibaba.dubbo.monitor.simple.SimpleMonitorService$2.run(SimpleMonitorService.java:137)

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)

at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)

at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)

at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:744)

Locked ownable synchronizers:

- <0x0000000788104268> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"DubboMonitorAsyncWriteLogThread" daemon prio=10 tid=0x00007f53b05ba000 nid=0x363d runnable [0x00007f53ac3af000]

java.lang.Thread.State: RUNNABLE

at java.io.FileOutputStream.writeBytes(Native Method)

at java.io.FileOutputStream.write(FileOutputStream.java:345)

at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)

at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)

at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)

at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)

- locked <0x000000077a028918> (a java.io.FileWriter)

at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)

at com.alibaba.dubbo.monitor.simple.SimpleMonitorService.write(SimpleMonitorService.java:213)

at com.alibaba.dubbo.monitor.simple.SimpleMonitorService.access$100(SimpleMonitorService.java:64)

at com.alibaba.dubbo.monitor.simple.SimpleMonitorService$1.run(SimpleMonitorService.java:120)

at java.lang.Thread.run(Thread.java:744)

Locked ownable synchronizers:

- None

One of these two threads is reading and the other is writing. The Dubbo monitor source code shows:

[Figure: Dubbo monitor source code]

The monitor runs two threads: one continuously writes the statistics logs, and the other performs a draw operation every 300 s.
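The text only states that draw runs every 300 s; how it is scheduled is not shown. The sketch below assumes a `ScheduledExecutorService` with a fixed delay between runs, which is consistent with the `ScheduledThreadPoolExecutor` frames in the stack trace above. The class and method names are hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class PeriodicDrawSketch {
    /** Runs drawTask repeatedly, waiting intervalSeconds after each run finishes. */
    public static ScheduledFuture<?> scheduleDraw(ScheduledExecutorService timer,
                                                  Runnable drawTask,
                                                  long intervalSeconds) {
        // Fixed delay: a slow draw pushes the next one back instead of overlapping it.
        return timer.scheduleWithFixedDelay(drawTask, intervalSeconds, intervalSeconds,
                TimeUnit.SECONDS);
    }

    public static void main(String[] args) {
        ScheduledExecutorService timer = Executors.newScheduledThreadPool(1);
        ScheduledFuture<?> future =
                scheduleDraw(timer, () -> System.out.println("draw"), 300);
        System.out.println("next draw in about "
                + future.getDelay(TimeUnit.SECONDS) + " s");
        timer.shutdownNow(); // stop the demo instead of waiting 300 s
    }
}
```

With a fixed delay, the cost of each draw matters twice: a draw that re-reads a lot of data keeps the disk busy for longer, and the interval between draws only starts once it finishes.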

The monitor stores each day's logs in its own log directory. Checking the log size inside one day's directory:

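A day's logs total only a few tens of MB, whereas iotop reported read at 1520K/s and write at 3.43M/s. Writing that little log data cannot account for the I/O, so the log-writing thread is ruled out and the problem most likely lies in draw. The source of draw shows that it keeps re-reading the statistics logs while rendering the charts: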

[Figure: source code of draw]

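draw walks the log directory, and the appendData method reads the log files. Because the monitor has been running for a long time, there are a great many per-day directories, and every draw has to walk all of them and read their files. The historical data never changes, yet it is re-read and re-drawn over and over. This is most likely where the problem lies.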
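As a rough check of the write-volume argument above: if the 3.43M/s reported by iotop were all log data, one day of logs would be several orders of magnitude larger than a few tens of MB. A small sketch of the arithmetic, with a hypothetical class name, assuming the rate is sustained for a full day:

```java
public class WriteVolumeCheck {
    /** Megabytes written in one day at a sustained rate of mbPerSecond. */
    public static double megabytesPerDay(double mbPerSecond) {
        return mbPerSecond * 24 * 60 * 60;
    }

    public static void main(String[] args) {
        double perDay = megabytesPerDay(3.43); // write rate reported by iotop
        System.out.printf("%.0f MB per day, about %.0f GB%n", perDay, perDay / 1024);
    }
}
```

That works out to roughly 296,000 MB per day, so the statistics logs themselves can only be a tiny fraction of the observed I/O.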

3 Solution

Modify the code as follows:

[Figure: modified draw code]
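The idea of the change can be sketched as filtering the per-day directories down to the current day before any file is read. This is an illustration rather than the actual patch: the class and method names are hypothetical, and the `yyyyMMdd` directory naming is an assumption about the on-disk layout.

```java
import java.io.File;
import java.io.FileFilter;
import java.text.SimpleDateFormat;
import java.util.Date;

public class DrawTodayOnlySketch {
    /** Name of the per-day directory for a date; the yyyyMMdd layout is assumed. */
    public static String dayDirName(Date date) {
        return new SimpleDateFormat("yyyyMMdd").format(date);
    }

    /** True only for the current day's directory; historical days are skipped. */
    public static boolean shouldDraw(String dirName, String today) {
        return today.equals(dirName);
    }

    /** Returns the directories under root that draw still needs to process. */
    public static File[] dirsToDraw(File root, Date now) {
        final String today = dayDirName(now);
        File[] dirs = root.listFiles(new FileFilter() {
            @Override
            public boolean accept(File f) {
                return f.isDirectory() && shouldDraw(f.getName(), today);
            }
        });
        // listFiles returns null when root does not exist or is not a directory.
        return dirs == null ? new File[0] : dirs;
    }

    public static void main(String[] args) {
        File root = new File(args.length > 0 ? args[0] : ".");
        for (File dir : dirsToDraw(root, new Date())) {
            System.out.println("would draw " + dir.getPath());
        }
    }
}
```

The trade-off is that charts for past days are no longer regenerated, which is acceptable here because their underlying data no longer changes.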

That is, draw now processes only the current day's logs. After the change, the server's iowait looks like this:

[Figure: server iowait after the fix]

load graph:

[Figure: server load after the fix]

Problem solved.