Dividing an uneven number between threads

Time: 2021-04-23 17:36:09

I am just learning threads in Java and I want to sort a list of words alphabetically. My program reads the words from a text file and puts them in a String array. The user can choose how many threads to use. I want to split the array into chunks that are as even as possible, so that each thread can sort its own chunk.

So to my question:

How can I split array.length as evenly as possible across the threads? My mind is blanking and I can't think of a smart way to do this.

E.g.: if I have an array.length of 22 and 4 threads, how can I give the threads pieces of the array of size 6, 6, 5 and 5? It needs to work for any numbers given.

I tried to explain it the best I could, please ask if something was unclear! Thank you!

5 answers

#1


4  

It doesn't need to be as even as possible. If one thread has 6 elements, that thread determines how long the whole job takes, so it doesn't matter how many of the other threads also have up to 6.

You can do

int chunkSize = (tasks + threads - 1) / threads; // divide by threads, rounded up
for (int t = 0; t < threads; t++) {
    int start = t * chunkSize;
    int end = Math.min(start + chunkSize, tasks);
    executor.submit(() -> {
        // inside the thread
        for (int i = start; i < end; i++) {
            process(i);
        }
    });
}

Note: if you use Stream.of(array).parallel(), it actually creates more than one task per thread. This mitigates the problem that some batches might take longer than others even though they have the same number of elements.
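
A minimal runnable sketch of the rounded-up chunking above, applied to the question's sorting task. The class name ChunkedSort, the method sortChunks, the fixed thread pool and the one-minute timeout are assumptions made for illustration, not part of the answer. Each chunk is sorted in place, so the sorted chunks would still need a merge step afterwards.

```java
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ChunkedSort {
    // Sorts each chunk of `words` in place on a pool thread.
    // Chunk size is words.length / threads, rounded up (assumes threads > 0).
    static void sortChunks(String[] words, int threads) throws InterruptedException {
        int chunkSize = (words.length + threads - 1) / threads;
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            // Clamp both bounds so trailing threads may get an empty range.
            int start = Math.min(t * chunkSize, words.length);
            int end = Math.min(start + chunkSize, words.length);
            if (start < end) {
                executor.submit(() -> Arrays.sort(words, start, end));
            }
        }
        executor.shutdown();
        // Wait for all chunk sorts to finish before the caller reads the array.
        executor.awaitTermination(1, TimeUnit.MINUTES);
    }

    public static void main(String[] args) throws InterruptedException {
        String[] words = {"pear", "apple", "fig", "kiwi", "date", "plum", "lime"};
        sortChunks(words, 3);
        // Chunks [0,3), [3,6), [6,7) are each sorted, but not yet merged.
        System.out.println(Arrays.toString(words));
        // prints [apple, fig, pear, date, kiwi, plum, lime]
    }
}
```

With 22 elements and 4 threads this scheme yields chunks of 6, 6, 6 and 4 rather than 6, 6, 5 and 5, which is the trade-off the answer argues is acceptable.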

#2


5  

Let me just take your example as it will be easy to explain. 22 elements amongst 4 threads.

22 % 4 = 2. This gives you the number of threads that will get one element more than the remaining threads.

22 / 4 = 5. This gives you the minimum number of elements per thread.

Now start dividing your array into chunks of 5 elements and assign one chunk to each thread until you are left with (22 % 4) = 2 threads. Assign each of those the remaining 5 + 1 = 6 elements.
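
The arithmetic above could be sketched as follows. The class name RemainderSplit and the method chunkSizes are hypothetical, and the sketch assumes k > 0 and n >= 0. As in the description, the threads handled last receive the extra element.

```java
import java.util.Arrays;

public class RemainderSplit {
    // Returns one chunk size per thread for n elements and k threads.
    // The last (n % k) threads get one element more than the others.
    static int[] chunkSizes(int n, int k) {
        int base = n / k;   // minimum number of elements per thread
        int extra = n % k;  // number of threads that get base + 1
        int[] sizes = new int[k];
        for (int t = 0; t < k; t++) {
            sizes[t] = base + (t >= k - extra ? 1 : 0);
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(chunkSizes(22, 4))); // [5, 5, 6, 6]
    }
}
```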

#3


0  

You can do it in two phases. First: divide the length by the thread count, discarding the remainder, to get the base chunk size. Second: split the remainder between the chunks, +1 per chunk. Some chunks won't get the +1.

#4


0  

Given n elements and k threads, you should assign 1 + n/k elements to the first n % k threads, and n/k elements to the remaining threads.

In your case, you have n = 22 and k = 4, so n/k = 5 (rounded down) and n % k = 2. The first 2 threads therefore have 5 + 1 elements assigned to them, and the remaining 2 threads have 5 assigned to them.
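
Because the larger chunks come first in this scheme, the start index of each thread's slice has a closed form: t * (n / k) + min(t, n % k). The sketch below illustrates that; the class name ChunkBounds and the method start are hypothetical, and it assumes k > 0. The end of slice t is simply the start of slice t + 1.

```java
public class ChunkBounds {
    // Start index of thread t's slice when the first (n % k) threads
    // each take one extra element. start(n, k, k) equals n.
    static int start(int n, int k, int t) {
        return t * (n / k) + Math.min(t, n % k);
    }

    public static void main(String[] args) {
        int n = 22, k = 4;
        for (int t = 0; t < k; t++) {
            int from = start(n, k, t);
            int to = start(n, k, t + 1); // exclusive end
            System.out.println("thread " + t + ": [" + from + ", " + to + ")");
        }
        // prints:
        // thread 0: [0, 6)
        // thread 1: [6, 12)
        // thread 2: [12, 17)
        // thread 3: [17, 22)
    }
}
```

Computing the bounds directly avoids keeping a running index, so each thread can derive its own slice from its number alone.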

#5


0  

In order to make sure that the threads have a "similar" workload, it is important to find an even distribution. This is particularly important when the number of threads is "high" compared to the number of elements. For this case, one should make sure that the numbers of elements that the threads are responsible for differ by at most 1.

To achieve this, you can compute the remainder of dividing the number of elements (the array length, in your case) by the number of threads, and distribute this remainder, one by one, among the tasks.

I had the same problem a while ago. In fact, I tried to solve it in a slightly more general form, for a ParallelRangeExecutor class, which required computing the start and end indices of the intervals of an arbitrary range (which does not need to start at index 0). The following is "extracted" from this class:

import java.util.Arrays;

public class EvenTaskDistribution
{
    public static void main(String[] args)
    {
        test( 22, 4);
        test( 21, 4);
        test(100, 3);
        test(  3, 4);
    }

    private static void test(int numElements, int parallelism)
    {
        int taskSizes[] = computeTaskSizes(parallelism, 0, numElements);
        System.out.printf("Distributing %4d elements among %4d threads: %s\n",
            numElements, parallelism, Arrays.toString(taskSizes));
    }

    public static int[] computeTaskSizes(
        int parallelism, int globalMin, int globalMax)
    {
        if (parallelism <= 0)
        {
            throw new IllegalArgumentException(
                "Parallelism must be positive, but is " + parallelism);
        }
        if (globalMin > globalMax)
        {
            throw new IllegalArgumentException(
                "The global minimum may not be larger than the global " + 
                "maximum. Global minimum is "+globalMin+", " + 
                "global maximum is "+globalMax);
        }
        int range = globalMax - globalMin;
        if (range == 0)
        {
            return new int[0];
        }
        int numTasks = Math.min(range, parallelism);
        int localRange = (range - 1) / numTasks + 1;
        int spare = localRange * numTasks - range;
        int currentIndex = globalMin;
        int taskSizes[] = new int[numTasks];
        for (int i = 0; i < numTasks; i++)
        {
            final int min = currentIndex;
            final int max = min + localRange - (i < spare ? 1 : 0);
            taskSizes[i] = max - min; 
            currentIndex = max;
        }
        return taskSizes;
    }
}

The output is

Distributing   22 elements among    4 threads: [5, 5, 6, 6]
Distributing   21 elements among    4 threads: [5, 5, 5, 6]
Distributing  100 elements among    3 threads: [33, 33, 34]
Distributing    3 elements among    4 threads: [1, 1, 1]

(The last one shows one of the corner cases that one might have to take into account. E.g. one could expect [1,1,1,0] here. But this can easily be adjusted depending on the application case).
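
One possible adjustment for that corner case, as a sketch only: the class name PaddedTaskSizes and the method paddedTaskSizes are hypothetical, the range is assumed to start at 0, and numElements >= 0 and parallelism > 0 are assumed. It reuses the same arithmetic but always returns one entry per thread, leaving zeros for threads that receive nothing.

```java
import java.util.Arrays;

public class PaddedTaskSizes {
    // Always returns `parallelism` entries; idle threads get size 0.
    static int[] paddedTaskSizes(int numElements, int parallelism) {
        int[] sizes = new int[parallelism];                 // zero-initialized
        int numTasks = Math.min(numElements, parallelism);  // threads that get work
        if (numTasks == 0) {
            return sizes;
        }
        int localRange = (numElements - 1) / numTasks + 1;  // division rounded up
        int spare = localRange * numTasks - numElements;    // tasks that get one less
        for (int i = 0; i < numTasks; i++) {
            sizes[i] = localRange - (i < spare ? 1 : 0);
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(paddedTaskSizes(3, 4)));  // [1, 1, 1, 0]
        System.out.println(Arrays.toString(paddedTaskSizes(22, 4))); // [5, 5, 6, 6]
    }
}
```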
