Better for multithreading: a single-item function or a collection function?

I don't know if I worded it correctly, but for a simple example let's say we have a collection of Point3 values (say 1M).

We have a method called Offset that adds another Point3 value to each of these values, returning new Point3 values. Let's say the method is static.

The Point3 type is immutable.

The question is, should I have a method like this:

1. public static Point3 Offset(Point3 a, Point3 b)

2. public static IEnumerable<Point3> Offset(IEnumerable<Point3> a, IEnumerable<Point3> b)

To me, #1 seems like the better choice for splitting the work into separate tasks for different threads.

What do you think? Are there any advantages to #1 or #2?

4 answers

#1

Option 1 is the logical core operation. With .NET 4.0 you can achieve the same operation as option 2 using the Zip operator. From memory, instead of:

var newPoints = Offset(firstPoints, secondPoints);

you'd write:

var newPoints = firstPoints.Zip(secondPoints, (p1, p2) => Offset(p1, p2));

You may want to consider making Offset an extension method on Point3 if you're using .NET 3.5 as well. (Alternatively, if you control the Point3 type, this operation sounds like a logical addition, so overloading + would let you write (p1, p2) => p1 + p2 in the call to Zip.)
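
As a rough sketch of the operator idea, assuming a hypothetical Point3 with X, Y and Z fields (the question does not show the real type) and a reasonably modern C# compiler, the type and the Zip call might look like this. ZipDemo and OffsetAll are made-up names for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical immutable Point3; the real type in the question is not shown.
public readonly struct Point3
{
    public double X { get; }
    public double Y { get; }
    public double Z { get; }

    public Point3(double x, double y, double z) { X = x; Y = y; Z = z; }

    // Option 1 from the question: offset a single point by another.
    public static Point3 Offset(Point3 a, Point3 b) =>
        new Point3(a.X + b.X, a.Y + b.Y, a.Z + b.Z);

    // Operator form, so callers can write p1 + p2.
    public static Point3 operator +(Point3 a, Point3 b) => Offset(a, b);
}

public static class ZipDemo
{
    // Pairs the two sequences element by element and adds each pair.
    public static List<Point3> OffsetAll(IEnumerable<Point3> first, IEnumerable<Point3> second) =>
        first.Zip(second, (p1, p2) => p1 + p2).ToList();
}
```

Note that Zip stops at the end of the shorter sequence, so mismatched lengths are truncated silently rather than reported.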

If you're not using .NET 4.0 but Zip appeals to you, we have an implementation in MoreLINQ - it's pretty simple.

So far, nothing has been related to multi-threading... now I don't know offhand whether there's a PLINQ implementation of Zip in .NET 4.0, but it would make sense for there to be one, IMO.
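
For what it's worth, .NET 4.0 does ship a Zip overload on ParallelQuery<T> (ParallelEnumerable.Zip). A minimal sketch of using it, again assuming a hypothetical Point3 layout and made-up names ParallelZipDemo and OffsetAll:

```csharp
using System;
using System.Linq;

// Hypothetical immutable Point3; the real type in the question is not shown.
public readonly struct Point3
{
    public double X { get; }
    public double Y { get; }
    public double Z { get; }

    public Point3(double x, double y, double z) { X = x; Y = y; Z = z; }

    public static Point3 Offset(Point3 a, Point3 b) =>
        new Point3(a.X + b.X, a.Y + b.Y, a.Z + b.Z);
}

public static class ParallelZipDemo
{
    // Both inputs must be ParallelQuery<T> for the PLINQ Zip overload to apply.
    // AsOrdered keeps element i of the first source paired with element i of
    // the second, and keeps the results in source order.
    public static Point3[] OffsetAll(Point3[] first, Point3[] second) =>
        first.AsParallel().AsOrdered()
             .Zip(second.AsParallel().AsOrdered(), (p1, p2) => Point3.Offset(p1, p2))
             .ToArray();
}
```

Whether this beats a plain loop for something as cheap as adding three numbers is worth measuring; the per-element work may be too small to pay for the coordination overhead.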

#2

You should probably have the first, and have the second call the first.

#3

#1 seems simpler and cleaner, and you could always parallelize it from outside. I don't see a reason to use #2 exclusively, unless you've neglected to state a crucial detail. If you decide you want to routinely parallelize this sort of loop in the same way, make #2 call #1.
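
A minimal sketch of "make #2 call #1", assuming the inputs are arrays rather than the IEnumerable<Point3> from the question (indexable inputs make the loop easy to split) and a hypothetical Point3 layout. Point3Ops is a made-up name.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical immutable Point3; the real type in the question is not shown.
public readonly struct Point3
{
    public double X { get; }
    public double Y { get; }
    public double Z { get; }

    public Point3(double x, double y, double z) { X = x; Y = y; Z = z; }
}

public static class Point3Ops
{
    // #1: the core operation on a single pair of points.
    public static Point3 Offset(Point3 a, Point3 b) =>
        new Point3(a.X + b.X, a.Y + b.Y, a.Z + b.Z);

    // #2: a convenience overload that calls #1 for each pair, in parallel.
    public static Point3[] Offset(Point3[] a, Point3[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Inputs must have the same length.");

        var result = new Point3[a.Length];
        // Each iteration writes only its own slot, so no locking is needed.
        Parallel.For(0, a.Length, i => result[i] = Offset(a[i], b[i]));
        return result;
    }
}
```

Callers who want a different threading strategy can still use the single-point overload directly and parallelize from outside.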

#4

My answer is both. I like having the simplest functions possible, so #1 is good. At the same time, convenience methods to operate on lists are darn useful, and can do the hard work of spawning threads if it's appropriate.

One of my beefs with Java (well, almost all languages, but Java is new enough that they should have known better) is that they still haven't done a good job of making the base library take advantage of multiple threads, or provided many mechanisms to help developers with that. There really should be a generic function to do "apply this function to all elements in this list", and have that function figure out how many cores are available, how big the list is, what the overhead is, and optimize accordingly.
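
In C#, the language of the question, a crude version of such a helper could be sketched on top of PLINQ. The ParallelMap class, the Map method and the SequentialThreshold cutoff are all hypothetical names and values for illustration; a real cutoff would have to come from measurement.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ParallelMap
{
    // Hypothetical cutoff below which parallelism is assumed not to pay off.
    private const int SequentialThreshold = 10000;

    // Applies f to every element and returns the results in input order.
    public static TResult[] Map<T, TResult>(IList<T> items, Func<T, TResult> f)
    {
        // Small inputs or a single core: skip the parallel overhead entirely.
        if (items.Count < SequentialThreshold || Environment.ProcessorCount == 1)
            return items.Select(f).ToArray();

        // Larger inputs: let PLINQ partition the work across the available cores.
        return items.AsParallel().AsOrdered().Select(f).ToArray();
    }
}
```

This only looks at list size and core count; it does not estimate the cost of f itself, which is the harder part of "optimize accordingly".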
