First, find the PID:
top
Identify the process consuming the most CPU; here it is PID 15495.
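If you want a non-interactive capture instead, a sketch assuming a procps-ng top that supports batch mode and field sorting:

# One batch iteration, sorted by CPU usage (flags assume procps-ng top)
top -b -n 1 -o %CPU | head -n 12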
Next, find the TID:
ps -mp 15495 -o THREAD,tid,time
Identify the thread consuming the most CPU; here it is TID 18448.
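Alternatively, assuming a procps top built with thread support, you can watch the threads of the process directly; the busiest thread's TID appears in the PID column of the thread view:

# -H shows individual threads, -p limits output to one process
top -H -p 15495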
Convert the TID to hexadecimal, since jstack reports native thread IDs as hex nid values:
printf "%x\n" 18448
4810
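The lookup and conversion can also be scripted. A sketch assuming a procps ps that accepts -L and the standard lwp/pcpu format keys:

# Pick the busiest thread of a PID and print its TID plus the hex nid
pid=15495
tid=$(ps -L -p "$pid" -o lwp=,pcpu= | sort -rn -k2 | head -n 1 | awk '{print $1}')
printf 'tid=%s nid=0x%x\n' "$tid" "$tid"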
Print the stack trace:
jstack 15495 | grep 4810 -A 30
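Grepping the bare hex value can match unrelated text; anchoring on the nid field that jstack prints is more precise:

jstack 15495 | grep -A 30 'nid=0x4810'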
For example, you might find a stack like the following:
- "regionserver60020-smallCompactions-1438827962552" daemon prio=10 tid=0x00007f4ce1903800 nid=0xe2a72 runnable [0x00007f443b8f6000]
- java.lang.Thread.State: RUNNABLE
- at com.hadoop.compression.lzo.LzoDecompressor.decompressBytesDirect(Native Method)
- at com.hadoop.compression.lzo.LzoDecompressor.decompress(LzoDecompressor.java:315)
- - locked <0x00007f450c42d820> (a com.hadoop.compression.lzo.LzoDecompressor)
- at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:88)
- at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
- at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
- at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
- - locked <0x00007f4494423b70> (a java.io.BufferedInputStream)
- at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
- at org.apache.hadoop.hbase.io.compress.Compression.decompress(Compression.java:439)
- at org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultDecodingContext.prepareDecoding(HFileBlockDefaultDecodingContext.java:91)
- at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1522)
- at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1314)
- at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:358)
- at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.readNextDataBlock(HFileReaderV2.java:610)
- at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:724)
- at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:136)
- at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:108)
- at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:507)
- at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:217)
- at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:76)
- at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:109)
- at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1106)
- at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1482)
- at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:475)
- at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
- at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
- at java.lang.Thread.run(Thread.java:745)
The stack shows the thread spending its time reading HFile blocks (LZO decompression) during a compaction; several such RUNNABLE compaction threads were active at once, which is what was driving the CPU usage.
When an application uses a lot of CPU, unless it genuinely is compute-intensive, the usual cause is an infinite (busy) loop.
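To reproduce the busy-loop symptom and practice the workflow above, a minimal hypothetical test target (a shell loop standing in for a runaway Java thread):

# Pure busy loop: pins ~100% of one core and rises to the top of `top`
while :; do :; done &
echo "busy-loop pid: $!"    # feed this PID to the top/ps steps above
# when finished, kill that PID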