I have a situation at work where we run a Java EE server with several applications deployed on it. Lately, we've been getting frequent OutOfMemoryExceptions. We suspect some of the apps might be behaving badly, perhaps leaking memory.
The problem is, we can't really tell which one. We have run some memory profilers (like YourKit), and they're pretty good at showing which classes use the most memory. But they don't show the relationships between classes, so we're left in a situation like this: we see that there are, say, lots of Strings, int arrays, and HashMap entries, but we can't really tell which application or package they come from.
Is there a way of knowing where these objects come from, so we can try to pinpoint the packages (or apps) that are allocating the most memory?
3 Answers
#1
3
There are several things that one could do in this situation:
- Configure the Java EE application server to produce a heap dump on OOME. This feature has been available via a JVM parameter since the Java 1.5 days (see the example flags after this list). Once a dump has been obtained, it can be analyzed offline with tools like Eclipse MAT. The important part is figuring out the dominator tree.
- Perform memory profiling on a test server; NetBeans is good at this. This is bound to take more time than the first option when it comes to finding the root cause, since the exact conditions of the memory allocation failure must be reproduced. If you have automated integration/functional tests, deducing the root cause will be easier. The trick is to take periodic heap dumps and analyze the classes that are contributing to the increase in heap consumption. There might not necessarily be a leak - it could simply be a case of insufficient heap size.
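For reference, on a HotSpot JVM the on-OOME dump flags usually look something like this (the dump path is just an example, not from the answer):

-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/dumps/appserver.hprof

And for the periodic heap dumps mentioned in the second point, a dump can be requested on demand with jmap (the file name is again just an example):

jmap -dump:format=b,file=heap-snapshot-1.hprof $PID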
#2
0
A quick thought is that you could probably do some reflection, if you don't mind the performance trade-off...
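To illustrate, here is a rough, hypothetical sketch of what that could look like: walking the object graph from a suspected root via reflection and counting reachable instances per class. The class and variable names below are made up for illustration, not taken from the question.

import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;

public class ReflectiveTally {

    // Walks the object graph reachable from 'root' and counts instances per class.
    public static Map<String, Integer> countReachable(Object root) throws IllegalAccessException {
        Map<String, Integer> counts = new HashMap<>();
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Object> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Object obj = stack.pop();
            if (!seen.add(obj)) continue;                // already visited
            Class<?> type = obj.getClass();
            counts.merge(type.getName(), 1, Integer::sum);
            if (type.isArray()) {                        // follow elements of reference arrays only
                if (!type.getComponentType().isPrimitive()) {
                    for (int i = 0; i < Array.getLength(obj); i++) {
                        Object element = Array.get(obj, i);
                        if (element != null) stack.push(element);
                    }
                }
                continue;
            }
            for (Class<?> c = type; c != null; c = c.getSuperclass()) {
                for (Field f : c.getDeclaredFields()) {
                    if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) continue;
                    try {
                        f.setAccessible(true);           // may be refused for JDK internals on newer, module-locked JDKs
                    } catch (RuntimeException e) {
                        continue;
                    }
                    Object value = f.get(obj);
                    if (value != null) stack.push(value);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // 'suspectCache' stands in for whatever structure one of the apps is suspected of growing.
        Map<String, Object> suspectCache = new HashMap<>();
        suspectCache.put("key", Arrays.asList("a", "b", "c"));
        countReachable(suspectCache).forEach((k, v) -> System.out.println(v + "\t" + k));
    }
}

Note that this only counts instances rather than bytes, and traversing a large graph via reflection is slow, so it is more of a spot-check on a suspected structure than a replacement for a profiler or heap dump.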
#3
0
What I have found helpful is:
jmap -J-d64 -histo $PID
(remove the -J-d64 option on 32-bit architectures)
This will output something like this:
 num     #instances         #bytes  class name
----------------------------------------------
   1:       4040792     6446686072  [B
   2:       3420444     1614800480  [C
   3:       3365261      701539904  [I
   4:       7109024      227488768  java.lang.ThreadLocal$ThreadLocalMap$Entry
   5:       6659946      159838704  java.util.concurrent.locks.ReentrantReadWriteLock$Sync$HoldCounter
And then from there you can try to diagnose the problem further, for example by diffing successive snapshots to see which classes keep growing (a small sketch of that is at the end of this answer).
This will only pause the VM for a brief time, even for big heaps, so you can safely do this in production (during off-peak hours, hopefully :) )
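To make the diffing concrete, here is a hedged sketch (not from the answer, and assuming Java 9 or later) that parses two saved jmap -histo outputs in the format shown above and prints the classes whose byte counts grew the most; the file names are just examples.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

public class HistoDiff {

    // Parses lines like "   1:   4040792   6446686072  [B" into className -> bytes.
    static Map<String, Long> parse(Path file) throws IOException {
        Map<String, Long> bytesByClass = new HashMap<>();
        for (String line : Files.readAllLines(file)) {
            String[] cols = line.trim().split("\\s+");
            if (cols.length < 4 || !cols[0].endsWith(":")) continue;   // skip header, ruler, total
            bytesByClass.merge(cols[3], Long.parseLong(cols[2]), Long::sum);
        }
        return bytesByClass;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Long> before = parse(Paths.get(args[0]));   // e.g. histo-before.txt
        Map<String, Long> after = parse(Paths.get(args[1]));    // e.g. histo-after.txt
        after.entrySet().stream()
             .map(e -> Map.entry(e.getKey(), e.getValue() - before.getOrDefault(e.getKey(), 0L)))
             .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
             .limit(20)
             .forEach(e -> System.out.printf("%,15d  %s%n", e.getValue(), e.getKey()));
    }
}

Run it against two histograms taken some minutes (or hours) apart; the classes at the top of the growth list are the ones worth chasing back to a particular application or package.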