在Java中分析本机方法 - 奇怪的结果

时间:2022-12-20 02:45:29

I have been using Yourkit 8.0 to profile a mathematically-intensive application running under Mac OS X (10.5.7, Apple JDK 1.6.0_06-b06-57), and have noticed some strange behavior in the CPU profiling results.

我一直在使用Yourkit 8.0来分析在Mac OS X(10.5.7,Apple JDK 1.6.0_06-b06-57)下运行的数学密集型应用程序,并注意到CPU分析结果中有一些奇怪的行为。

For instance - I did a profiling run using sampling, which reported that 40% of the application's 10-minute runtime was spent in the StrictMath.atan method. I found this puzzling, but I took it at it's word and spent a bit of time replacing atan with an extremely simply polynomial fit.

例如 - 我使用抽样进行了分析运行,其中报告了应用程序的10分钟运行时的40%用于StrictMath.atan方法。我发现这令人费解,但我接受了它的话,花了一点时间用非常简单的多项式拟合替换atan。

When I ran the application again, it took almost exactly the same time as before (10 minutes) - but my atan replacement showed up nowhere in the profiling results. Instead, the runtime percentages of the other major hotspots simply increased to make up for it.

当我再次运行应用程序时,它几乎与以前完全相同(10分钟) - 但我的atan替换在分析结果中无处可见。相反,其他主要热点的运行时百分比只是增加以弥补它。

To summarize:

RESULTS WITH StrictMath.atan (native method)
Total runtime: 10 minutes
Method 1: 20%
Method 2: 20%
Method 3: 20%
StrictMath.atan: 40%

结果使用StrictMath.atan(本机方法)总运行时间:10分钟方法1:20%方法2:20%方法3:20%StrictMath.atan:40%

RESULTS WITH simplified, pure Java atan
Total runtime: 10 minutes
Method 1: 33%
Method 2: 33%
Method 3: 33%

结果简化,纯Java atan总运行时间:10分钟方法1:33%方法2:33%方法3:33%

(Methods 1,2,3 do not perform any atan calls)

(方法1,2,3不执行任何atan调用)

Any idea what is up with this behavior? I got the same results using EJ-Technologies' JProfiler. It seems like the JDK profiling API reports inaccurate results for native methods, at least under OS X.

知道这种行为是什么吗?我使用EJ-Technologies的JProfiler获得了相同的结果。似乎JDK概要分析API报告了本机方法的不准确结果,至少在OS X下是这样。

6 个解决方案

#1


This can happen because of inconsistencies of when samples are taken. So for example, if a method uses a fair amount of time, but doesn't take very long to execute, it is possible for the sampling to miss it. Also, I think garbage collection never happens during a sample, but if some code causes a lot of garbage collection it can greatly contribute to a slowdown without appearing in the sample.

这可能是因为采样时不一致。因此,例如,如果方法使用相当长的时间,但执行时间不长,则采样可能会错过它。此外,我认为垃圾收集在样本期间永远不会发生,但是如果某些代码导致大量垃圾收集,它可以极大地促进减速而不会出现在样本中。

In similar situation I've found it very helpful to run twice, once with tracing as well as once with sampling. If a method appears in both it is probably using a lot of CPU, otherwise it could well just be an artifact of the sampling process.

在类似的情况下,我发现运行两次非常有帮助,一次是跟踪,一次是采样。如果一个方法同时出现,它可能使用了大量的CPU,否则它很可能只是采样过程的一个人为因素。

#2


Since you're using a Mac, you might try Apple's Shark profiler (free download from ADC) which has Java support and Apple's performance group has put a fair amount of time into the tool.

由于您使用的是Mac,您可以尝试Apple的Shark分析器(从ADC免费下载),它具有Java支持,Apple的性能组已经花了相当多的时间进入该工具。

As Nick pointed out, sampling can be misleading if the sample interval is close enough to the function's execution time and the profiler rarely checks when the function is actually executing. I don't know whether YourKit supports this but in Shark you can change the sampling interval to something other than the default 10ms and see if the results are substantially different. There's also a separate call-tracing mode which will record every function enter/return - this completely avoids the possibility of aliasing errors but collects a ton of data and higher overhead, which could matter if your app is doing any sort of real-time processing.

正如Nick指出的那样,如果采样间隔足够接近函数的执行时间,并且分析器很少检查函数何时实际执行,则采样可能会产生误导。我不知道YourKit是否支持这个,但是在Shark中,您可以将采样间隔更改为默认值10ms以外的其他值,并查看结果是否有很大差异。还有一个单独的调用跟踪模式,它将记录每个函数的进入/返回 - 这完全避免了混叠错误的可能性,但收集了大量数据和更高的开销,这可能很重要,如果您的应用程序正在进行任何类型的实时处理。

#3


You may want to look at the parameters that are being passed into the three methods. It may be that the time is being spent generating return values or in methods that are creating a lot of temporary objects.

您可能希望查看传递给三种方法的参数。可能是生成返回值所花费的时间,或者是创建大量临时对象的方法。

#4


I find YourKit greatly exaggerates the cost of calling sub-methods (due to its logging method, I assume). If you only follow the advice that the profile gives you you'll end up just merging functions with no real gain as the HotSpot usually does excellently for this.

我发现YourKit极大地夸大了调用子方法的成本(由于它的记录方法,我假设)。如果你只是按照配置文件给你的建议,你最终只会合并功能而没有真正的收益,因为HotSpot通常对此非常好。

Therefore, I'd highly advise to test batches completely outside profilers too, to get a better idea whether changes are really beneficial (it may seem obvious but this cost me some development time).

因此,我强烈建议在分析器之外完全测试批次,以便更好地了解变化是否真正有益(这看起来很明显,但这花费了我一些开发时间)。

#5


It worth noting that Java methods can be inlined if they are small enough, however native methods are inlined under different rules. If a method is inlined it doesn't appear in the profiler (certainly not YourKit anyway)

值得注意的是,如果Java方法足够小,可以内联,但是本机方法在不同规则下内联。如果方法是内联的,它不会出现在探查器中(当然不是YourKit)

#6


Profilers can be like that.

Profilers可以是这样的。

This is the method I use.

这是我使用的方法。

Works every time.

每次都有效。

And this is why.

这就是原因。

#1


This can happen because of inconsistencies of when samples are taken. So for example, if a method uses a fair amount of time, but doesn't take very long to execute, it is possible for the sampling to miss it. Also, I think garbage collection never happens during a sample, but if some code causes a lot of garbage collection it can greatly contribute to a slowdown without appearing in the sample.

这可能是因为采样时不一致。因此,例如,如果方法使用相当长的时间,但执行时间不长,则采样可能会错过它。此外,我认为垃圾收集在样本期间永远不会发生,但是如果某些代码导致大量垃圾收集,它可以极大地促进减速而不会出现在样本中。

In similar situation I've found it very helpful to run twice, once with tracing as well as once with sampling. If a method appears in both it is probably using a lot of CPU, otherwise it could well just be an artifact of the sampling process.

在类似的情况下,我发现运行两次非常有帮助,一次是跟踪,一次是采样。如果一个方法同时出现,它可能使用了大量的CPU,否则它很可能只是采样过程的一个人为因素。

#2


Since you're using a Mac, you might try Apple's Shark profiler (free download from ADC) which has Java support and Apple's performance group has put a fair amount of time into the tool.

由于您使用的是Mac,您可以尝试Apple的Shark分析器(从ADC免费下载),它具有Java支持,Apple的性能组已经花了相当多的时间进入该工具。

As Nick pointed out, sampling can be misleading if the sample interval is close enough to the function's execution time and the profiler rarely checks when the function is actually executing. I don't know whether YourKit supports this but in Shark you can change the sampling interval to something other than the default 10ms and see if the results are substantially different. There's also a separate call-tracing mode which will record every function enter/return - this completely avoids the possibility of aliasing errors but collects a ton of data and higher overhead, which could matter if your app is doing any sort of real-time processing.

正如Nick指出的那样,如果采样间隔足够接近函数的执行时间,并且分析器很少检查函数何时实际执行,则采样可能会产生误导。我不知道YourKit是否支持这个,但是在Shark中,您可以将采样间隔更改为默认值10ms以外的其他值,并查看结果是否有很大差异。还有一个单独的调用跟踪模式,它将记录每个函数的进入/返回 - 这完全避免了混叠错误的可能性,但收集了大量数据和更高的开销,这可能很重要,如果您的应用程序正在进行任何类型的实时处理。

#3


You may want to look at the parameters that are being passed into the three methods. It may be that the time is being spent generating return values or in methods that are creating a lot of temporary objects.

您可能希望查看传递给三种方法的参数。可能是生成返回值所花费的时间,或者是创建大量临时对象的方法。

#4


I find YourKit greatly exaggerates the cost of calling sub-methods (due to its logging method, I assume). If you only follow the advice that the profile gives you you'll end up just merging functions with no real gain as the HotSpot usually does excellently for this.

我发现YourKit极大地夸大了调用子方法的成本(由于它的记录方法,我假设)。如果你只是按照配置文件给你的建议,你最终只会合并功能而没有真正的收益,因为HotSpot通常对此非常好。

Therefore, I'd highly advise to test batches completely outside profilers too, to get a better idea whether changes are really beneficial (it may seem obvious but this cost me some development time).

因此,我强烈建议在分析器之外完全测试批次,以便更好地了解变化是否真正有益(这看起来很明显,但这花费了我一些开发时间)。

#5


It worth noting that Java methods can be inlined if they are small enough, however native methods are inlined under different rules. If a method is inlined it doesn't appear in the profiler (certainly not YourKit anyway)

值得注意的是,如果Java方法足够小,可以内联,但是本机方法在不同规则下内联。如果方法是内联的,它不会出现在探查器中(当然不是YourKit)

#6


Profilers can be like that.

Profilers可以是这样的。

This is the method I use.

这是我使用的方法。

Works every time.

每次都有效。

And this is why.

这就是原因。