Why does running Java compilations in parallel show linear growth in time?

Time: 2022-08-02 14:35:43
time javac Main.java                                      --> 0m1.050s
time javac Main.java & javac Main.java                    --> 0m1.808s
time javac Main.java & javac Main.java & javac Main.java  --> 0m2.690s
time javac Main.java & ... 8 time                         --> 0m8.309s

When we run javac commands in parallel, each additional javac command adds about 1 second to the time it takes for all of them to complete.

Why is there linear growth in time?

Are all javac processes involved in some kind of locking while they run? If so, how can it be overcome so that the time does not grow linearly?


PS: I have tried the above on a single-core machine, a dual-core machine, and a 4-core machine; all showed the same behaviour.

PS2: environment RedHat7, javac 1.7.0_79

1 Answer

#1


27  

The java compiler already handles dividing its work across available processors, even when only compiling a single file. Therefore running separate compiler instances in parallel yourself won't yield the performance gains you are expecting.

To demonstrate this, I generated a large (1 million lines, 10,000 methods) Java program in a single file called Main1.java. I then made additional copies as Main2.java through Main8.java. Compile times are as follows:
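
For anyone wanting to reproduce this, a file of roughly that shape could be generated with a sketch like the one below. This is not the generator used for the timings here: `gen_java`, the method names `m0`, `m1`, ... and the trivial method bodies are all made up for illustration, and how such a file compares in compile time is untested.

```shell
# Hypothetical generator (not the one used for the timings in this answer).
# Writes <class>.java containing <methods> static methods, each with <body>
# one-line statements, plus a small main method.
gen_java() {
    class=$1 methods=$2 body=$3
    awk -v c="$class" -v m="$methods" -v b="$body" 'BEGIN {
        print "public class " c " {"
        for (i = 0; i < m; i++) {
            print "    static int m" i "(int x) {"
            for (j = 0; j < b; j++) print "        x += " j ";"
            print "        return x;"
            print "    }"
        }
        print "    public static void main(String[] args) {"
        print "        System.out.println(m0(1));"
        print "    }"
        print "}"
    }' > "$class.java"
}
```

Each method takes body + 3 lines and the class and main scaffolding adds 5 more, so `gen_java Main1 10000 97` should write about 1,000,005 lines to Main1.java. Calling it again with Main2 through Main8 gives the other seven files. The measured compile times follow.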

Single file compile:

time javac Main1.java &    --> (real) 11.6 sec

Watching this single file compile in top revealed processor usage mostly in the 200-400% range (indicating multiple CPU usage, 100% per CPU), with occasional spikes in the 700% range (the max on this machine is 800% since there are 8 processors).

Next, two files simultaneously:

time javac Main1.java &    --> (real) 14.5 sec
time javac Main2.java &    --> (real) 14.8 sec

So it only took 14.8 seconds to compile two, when it took 11.6 seconds to compile one. That's definitely non-linear. It was clear by looking at top while these were running that again each java compiler was only taking advantage of at most four CPUs at once (with occasional spikes higher). Because of this, the two compilers ran across eight CPUs mostly in parallel with each other.
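
A run like this could be scripted with a small helper that starts one compile per file in the background and then waits for all of them. This is a minimal sketch, assuming a POSIX shell. `compile_each` is a hypothetical name, and the compiler command is passed in as the first argument so the helper itself assumes nothing about javac:

```shell
# Hypothetical helper: run "<cmd> <file>" in the background for every file
# given, then block until all of those background jobs have exited.
compile_each() {
    cmd=$1
    shift
    for f in "$@"; do
        "$cmd" "$f" &
    done
    wait
}
```

In bash, `time compile_each javac Main1.java Main2.java` would then report the wall-clock time for the whole batch, which should roughly match the slowest of the individual times listed above.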

Next, four files simultaneously:

time javac Main1.java &    --> (real) 24.2 sec
time javac Main2.java &    --> (real) 24.6 sec
time javac Main3.java &    --> (real) 25.0 sec
time javac Main4.java &    --> (real) 25.0 sec

Okay, here we've hit the wall. We can no longer out-parallelize the compiler. Four files took 25 seconds when two took 14.8. There's a little optimization there but it's mostly a linear time increase.

Finally, eight simultaneously:

time javac Main1.java &    --> (real) 51.9 sec
time javac Main2.java &    --> (real) 52.3 sec
time javac Main3.java &    --> (real) 52.5 sec
time javac Main4.java &    --> (real) 53.0 sec
time javac Main5.java &    --> (real) 53.4 sec
time javac Main6.java &    --> (real) 53.5 sec
time javac Main7.java &    --> (real) 53.6 sec
time javac Main8.java &    --> (real) 54.6 sec

This was actually a little worse than linear, as eight took 54.6 seconds while four only took 25.0.
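
To compare the four runs on one scale, the slowest "real" time of each batch can be divided by the number of files in it. The helper below is only a sketch of that arithmetic (`per_file` is a hypothetical name); the inputs are the timings reported above:

```shell
# Hypothetical helper: wall-clock seconds per file for one batch.
# $1 = slowest "real" time in the batch, $2 = number of files compiled.
per_file() {
    LC_ALL=C awk -v t="$1" -v n="$2" 'BEGIN { printf "%.3f\n", t / n }'
}

per_file 11.6 1   # prints 11.600
per_file 14.8 2   # prints 7.400
per_file 25.0 4   # prints 6.250
per_file 54.6 8   # prints 6.825
```

So the wall time per file drops from 11.6 s to 7.4 s when going from one compiler to two, and then stays between roughly 6 and 7 s for four and eight.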

So I think the takeaway from all this is to have faith that the compiler will do a decent job trying to optimize the work you give it across the available CPU resources, and that trying to add additional parallelization by hand will have limited (if any) benefit.

Edit:

For reference, there are two entries I found in Oracle's bug database regarding enhancing javac to take advantage of multiple processors:

  • Bug ID: JDK-6629150 -- The original complaint, which was eventually marked as a duplicate of:
  • Bug ID: JDK-6713663 -- Suggests the resolution, and based on the "Resolved Date" it appears that multi-processor support in javac was added on 2008-06-12.
