How many threads are spawned by parallelStream in Java 8?

Time: 2022-10-06 21:03:04

In JDK 8, how many threads are spawned when I'm using parallelStream? For instance, in the code:

list.parallelStream().forEach(/** Do Something */);

If this list has 100000 items, how many threads will be spawned?

Also, does each thread get the same number of items to work on, or are the items allotted randomly?

2 answers

#1


41  

Oracle's implementation[1] of parallel streams uses the current thread and, if needed, also the threads of the default fork/join pool, ForkJoinPool.commonPool(), whose default size is one less than the number of cores of your CPU.

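To see this on a given machine, a small experiment can help. The following is a minimal sketch, not part of the original answer: the class name ParallelThreadCount and the method itemsPerThread are made up for illustration. It reports the common pool's parallelism and counts how many items each thread processed. On Oracle/OpenJDK the calling thread usually shows up next to the ForkJoinPool.commonPool-worker-N threads, and the per-thread counts are typically uneven, since the work is split recursively and idle threads steal chunks rather than receiving a fixed share.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class; the names are assumptions made for illustration.
final class ParallelThreadCount {

    // Runs a parallel forEach over n items and returns, per thread name,
    // how many items that thread processed.
    static Map<String, Integer> itemsPerThread(int n) {
        List<Integer> list = IntStream.range(0, n).boxed().collect(Collectors.toList());
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        // ConcurrentHashMap.merge is atomic, so concurrent updates are safe.
        list.parallelStream().forEach(i ->
                counts.merge(Thread.currentThread().getName(), 1, Integer::sum));
        return counts;
    }

    public static void main(String[] args) {
        System.out.println("Available processors:    "
                + Runtime.getRuntime().availableProcessors());
        System.out.println("Common pool parallelism: "
                + ForkJoinPool.getCommonPoolParallelism());
        itemsPerThread(100_000)
                .forEach((name, count) -> System.out.println(name + " -> " + count));
    }
}
```

The exact thread names and counts vary from run to run and between JVMs, so the output is best read as an observation, not a guarantee.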

The default size of the common pool can be changed with this property:

-Djava.util.concurrent.ForkJoinPool.common.parallelism=8

Alternatively, you can use your own pool:

ForkJoinPool myPool = new ForkJoinPool(8);
// The lambda body is a single expression, so it takes no trailing semicolon.
myPool.submit(() ->
    list.parallelStream().forEach(/* Do Something */)
).get();
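
For completeness, here is a self-contained sketch of the same idea, under the assumption that the stream is started from inside a task of the custom pool. The class name CustomPoolDemo and the method threadsUsed are hypothetical. It records which threads ran the work, wraps the checked exceptions thrown by get(), and shuts the pool down afterwards.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class; the names are assumptions made for illustration.
final class CustomPoolDemo {

    // Runs a parallel forEach over n items from inside a dedicated pool and
    // returns the names of the threads that processed at least one item.
    static Set<String> threadsUsed(int poolSize, int n) {
        List<Integer> list = IntStream.range(0, n).boxed().collect(Collectors.toList());
        Set<String> names = ConcurrentHashMap.newKeySet();
        ForkJoinPool myPool = new ForkJoinPool(poolSize);
        try {
            // The stream's subtasks are forked from a worker of myPool, so on
            // Oracle/OpenJDK they run in myPool instead of the common pool.
            myPool.submit(() ->
                    list.parallelStream().forEach(i ->
                            names.add(Thread.currentThread().getName()))
            ).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Stream task failed", e.getCause());
        } finally {
            myPool.shutdown(); // a custom pool is not cleaned up automatically
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(threadsUsed(8, 100_000));
    }
}
```

On Oracle/OpenJDK the printed names belong to the custom pool (ForkJoinPool-N-worker-M) and not to the common pool, but this relies on behavior that the stream API does not document.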

Regarding the order, jobs will be executed as soon as a thread is available, in no specific order.

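If the order matters, forEachOrdered can be used instead of forEach, at the cost of some parallelism. A minimal sketch of the difference follows; the class EncounterOrderDemo and its methods are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; the names are assumptions made for illustration.
final class EncounterOrderDemo {

    // forEach on a parallel stream: elements are added in whatever order
    // the worker threads happen to reach them.
    static List<Integer> unordered(List<Integer> in) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        in.parallelStream().forEach(out::add);
        return out;
    }

    // forEachOrdered: elements are processed in the encounter order of the
    // source list, even though the stream is parallel.
    static List<Integer> ordered(List<Integer> in) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        in.parallelStream().forEachOrdered(out::add);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> input = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            input.add(i);
        }
        System.out.println("forEach:        " + unordered(input));
        System.out.println("forEachOrdered: " + ordered(input));
    }
}
```

Both methods visit every element exactly once; only the second one preserves the order of the source list.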

As @Holger correctly pointed out, this is an implementation-specific detail (with just one vague reference at the bottom of a document). Both approaches work on Oracle's JVM but are not guaranteed to work on JVMs from other vendors: the property might not exist in a non-Oracle implementation, and Streams might not use a ForkJoinPool internally at all, which would render the alternative based on the behavior of ForkJoinTask.fork completely useless (see here for details on this).

#2


3  

While @uraimo is correct, the answer depends on exactly what "Do Something" does. The parallel streams API uses the CountedCompleter class, which has some interesting problems. Since the F/J framework does not use a separate object to hold results, long chains may result in an OutOfMemoryError (OOME). Long chains can also sometimes cause a stack overflow. The answer to those problems is the use of the Paraquential technique, as I pointed out in this article.

The other problem is excessive thread creation when using nested parallel forEach.
