Is there a way to speed up Javadoc (takes 7 minutes)

Time: 2022-01-31 00:04:21

I am building the Javadoc for a module with 2,509 classes. This currently takes 7 minutes, or about 6 files per second.

I have tried

mvn -T 1C install

However, javadoc only uses one CPU. Is there a way to use more cores and/or speed it up?

I am using Oracle JDK 8 update 112. My dev machine has 16 cores and 128 GB of memory.

Running Flight Recorder, I can see that there is only one thread, main.

For those who are interested, I've used the following options:

<plugin>
    <artifactId>maven-javadoc-plugin</artifactId>
    <configuration>
        <additionalJOptions>
            <additionalJOption>-J-XX:+UnlockCommercialFeatures</additionalJOption>
            <additionalJOption>-J-XX:+FlightRecorder</additionalJOption>
            <additionalJOption>-J-XX:StartFlightRecording=name=test,filename=/tmp/myrecording-50.jfr,dumponexit=true</additionalJOption>
            <additionalJOption>-J-XX:FlightRecorderOptions=loglevel=debug</additionalJOption>
        </additionalJOptions>
    </configuration>
</plugin>

NOTE: One workaround is to skip Javadoc generation altogether:

-Dmaven.javadoc.skip=true

6 answers

#1


5  

Running Maven with -T1C will cause Maven to try to build modules in parallel, so if you have a multi-module project, at best it will build each module's javadoc in parallel (if the dependency graph between your modules allows it).

The javadoc process itself is single-threaded, so you won't be able to use multiple cores to generate the javadoc of one single module.

However, since you have many classes (and possibly many {@link} tags or similar?), the javadoc process might benefit from a larger heap. Have you looked into GC activity? Try adding this to your configuration and see if it helps:

<additionalJOption>-J-Xms2g</additionalJOption>
<additionalJOption>-J-Xmx2g</additionalJOption>

#2


4  

@lbndev is right, at least with the default Doclet (com.sun.tools.doclets.formats.html.HtmlDoclet) that is supplied with Javadoc. A look through the source confirms the single-threaded implementation:

  • The starting point is in the parent class, AbstractDoclet.startGeneration().

  • This in turn calls new ClassTree(), which calls ClassTree.buildTree(), which uses a for loop to iterate over the list of classes, generating models of the classes.

  • The next step is AbstractDoclet.generateClassFiles(), again a for loop, this time over the packages.

  • This then drills down to HtmlDoclet.generateClassFiles(), which iterates over package.allClasses(), again in a for loop.

(Those links are to JDK 8 source. With JDK 11 the classes have moved, but the basic for loops in HtmlDoclet and AbstractDoclet are still there.)

Some sample-based profiling confirmed that these methods are the bottleneck.

This won't be what you're hoping to hear, but it looks like there is no option for multi-threading in the current standard Javadoc, at least within a single Maven module.

generateClassFiles() and similar methods would lend themselves well to a bit of multithreading, though this would probably need to be a change in the JDK. As mentioned below, AbstractDoclet.isValidDoclet() even actively blocks subclassing of HtmlDoclet. Trying to reimplement some of those loops as a third party would mean pulling in a lot of other code.
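
To make the idea concrete, here is a minimal sketch of what parallelising a per-class loop could look like in principle. It is not the JDK's code: ParallelPagesSketch, renderPage and renderAll are hypothetical names, and renderPage merely stands in for the real per-class page generation, which touches shared state that this toy version does not have.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: these names are hypothetical and not part of the standard doclet.
final class ParallelPagesSketch {

    // Stand-in for the per-class work done inside a generateClassFiles() loop.
    static String renderPage(String className) {
        return "<html><body><h1>" + className + "</h1></body></html>";
    }

    // Renders one page per class on a fixed-size pool; results keep the input order.
    static List<String> renderAll(List<String> classNames, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String name : classNames) {
                Callable<String> task = () -> renderPage(name);
                futures.add(pool.submit(task));
            }
            List<String> pages = new ArrayList<>();
            for (Future<String> future : futures) {
                pages.add(future.get()); // blocks until that page is done
            }
            return pages;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> pages = renderAll(Arrays.asList("Foo", "Bar", "Baz"),
                Runtime.getRuntime().availableProcessors());
        System.out.println(pages.size() + " pages rendered");
    }
}
```

Each task here is independent, which is exactly what the real doclet does not guarantee, so treat this as the shape of the change rather than a drop-in patch.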

A scan around other Doclet implementations (e.g. javadown) only found a similar implementation style around the package and class drilldown. It's possible others on this thread will know more.

Thinking a bit more widely, there might be room for tuning around DocFileFactory. It's clearly marked as an internal class (not even public in the package), but it does abstract the writing of the (HTML) files. It seems possible that an alternative version could buffer the HTML in memory, or stream directly to a zip file, to improve I/O performance. But clearly anyone doing this would also need to accept the risk of the JDK tools changing underneath them.
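
As a rough illustration of the "stream directly to a zip file" idea, the sketch below writes generated pages as entries of one archive instead of thousands of small files. ZipPagesSketch, writeZip and entryNames are hypothetical names for this example; it does not plug into DocFileFactory, it only shows the java.util.zip side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Illustrative only: a hypothetical helper, not a DocFileFactory replacement.
final class ZipPagesSketch {

    // Writes every (path -> HTML) pair as one entry of a single in-memory zip archive.
    static byte[] writeZip(Map<String, String> pages) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer, StandardCharsets.UTF_8)) {
            for (Map.Entry<String, String> page : pages.entrySet()) {
                zip.putNextEntry(new ZipEntry(page.getKey()));
                zip.write(page.getValue().getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Lists entry names in archive order, to check what was written.
    static List<String> entryNames(byte[] archive) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, String> pages = new LinkedHashMap<>();
        pages.put("com/example/Foo.html", "<html>Foo</html>");
        pages.put("com/example/Bar.html", "<html>Bar</html>");
        System.out.println(entryNames(writeZip(pages)));
    }
}
```

Whether this actually helps depends on where the time goes; the profiling mentioned above points at the generation loops rather than at disk writes, so any gain from batching the output is an open question.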

#3


0  

javadoc, and the standard doclet, are currently fundamentally single-threaded.

It is "on the radar" to improve this, primarily by generating pages in parallel, but this means retrofitting thread safety onto various shared data structures.
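
As an illustration of what thread-safe shared data structures mean in practice, here is a small sketch in which tasks running in parallel all register class names in one shared index. SharedIndexSketch and buildIndex are hypothetical; the real doclet's data structures are different and more numerous. The point is only that a plain HashMap would be a data race here, while ConcurrentHashMap and ConcurrentSkipListSet are safe to update from several threads.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListSet;

// Illustrative only: a hypothetical shared index, not the doclet's real data model.
final class SharedIndexSketch {

    // Groups class names by first letter while being updated from parallel tasks.
    // Assumes every name is non-empty.
    static Map<Character, Set<String>> buildIndex(List<String> classNames) {
        Map<Character, Set<String>> index = new ConcurrentHashMap<>();
        classNames.parallelStream().forEach(name ->
                index.computeIfAbsent(name.charAt(0), key -> new ConcurrentSkipListSet<>())
                     .add(name));
        return index;
    }

    public static void main(String[] args) {
        Map<Character, Set<String>> index =
                buildIndex(Arrays.asList("Alpha", "Beta", "Apple", "Banana"));
        System.out.println(index.get('A')); // names starting with 'A', sorted
        System.out.println(index.get('B')); // names starting with 'B', sorted
    }
}
```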

#4


0  

You can have Maven use multiple threads per core across all cores.

For example:

mvn -T 4C install # will use 4 threads per available CPU core

You can change 4 above to whatever number you want. You have a machine with lots of resources. Try 8 or 16.

Also, have you tried using javadoc-no-fork? This will ensure javadoc is not triggered a second time: https://maven.apache.org/plugins/maven-javadoc-plugin/examples/javadoc-nofork.html

#5


0  

Maven customization is a way to speed up javadoc generation.

Another approach would be to change the doclet used for generating the javadoc. The Maven Javadoc plugin allows you to change the doclet:

https://maven.apache.org/plugins/maven-javadoc-plugin/examples/alternate-doclet.html

I found the following commercial doclet (I'm not affiliated with them in any way), which claims to be faster than traditional javadoc. It offers free, trial and commercial licenses. If you're really eager to speed up your javadoc build, it may be worth checking whether it's worth the price:

http://www.filigris.com/docflex-javadoc

Maybe open-source alternatives exist on the internet...

#6


-3  

Use Doxygen instead of the regular Maven Javadoc generation that you are using now.
