Why doesn't java.util.Collection implement the new Stream interface?

Time: 2022-02-27 22:04:44

I just took some time to start looking into the Java 8 buzz about streams and lambdas. What surprised me is that you cannot apply Stream operations such as .map() and .filter() directly to a java.util.Collection. Is there a technical reason why the java.util.Collection interface was not extended with default implementations of these Stream operations?

Googling a bit, I see lots of examples of people coding along the pattern of:

List<String> list = someListExpression;
List<String> anotherList = list.stream().map(x -> f(x)).collect(Collectors.toList());

which becomes very clumsy if you have a lot of these stream operations in your code. Since .stream() and .collect() are irrelevant to what you want to express, you would rather say:

List<String> list = someListExpression;
List<String> anotherList = list.map(x -> f(x));

1 solution

#1


78  

Yes, there are excellent reasons for these decisions :)

The key is the difference between eager and lazy operations. The examples you give under the first question show eager operations where mapping or filtering a list produces a new list. There's nothing wrong with this, but it is often not what you want, because you're often doing way more work than you need; an eager operation must operate on every element, and produce a new collection. If you're composing multiple operations (filter-map-reduce), you're doing a lot of extra work. On the other hand, lazy operations compose beautifully; if you do:

 Optional<Person> tallestGuy = people.stream()
                                     .filter(p -> p.getGender() == MALE)
                                     .max(comparing(Person::getHeight));

the filter and reduce (max) operations are fused together into a single pass. This is very efficient.
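
To make the single-pass claim concrete, here is a minimal sketch that logs the order in which each stage sees each element. The class name FusedPassDemo, the trace helper, and the use of plain integer heights in place of the Person type are assumptions made for illustration; they are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class FusedPassDemo {
    // Hypothetical helper: records which stage touches which element, in order.
    static List<String> trace(List<Integer> heights) {
        List<String> log = new ArrayList<>();
        Optional<Integer> tallest = heights.stream()
                .filter(h -> { log.add("filter " + h); return h > 170; })
                .peek(h -> log.add("max sees " + h))   // runs only for elements that passed the filter
                .max(Comparator.naturalOrder());
        log.add("result " + tallest.orElse(-1));
        return log;
    }

    public static void main(String[] args) {
        trace(Arrays.asList(165, 180, 175)).forEach(System.out::println);
    }
}
```

On a sequential stream the log interleaves the stages (filter 165, filter 180, max sees 180, filter 175, max sees 175, result 180): each element travels through the whole pipeline before the next one is pulled, and no intermediate list is built.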

So, why not expose the Stream methods right on List? Well, we tried it like that. Among numerous other reasons, we found that mixing lazy methods like filter() and eager methods like removeAll() was confusing to users. By grouping the lazy methods into a separate abstraction, it becomes much clearer; the methods on List are those that mutate the list; the methods on Stream are those that deal in composable, lazy operations on data sequences regardless of where that data lives.
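
The difference between the two kinds of method can be seen in a small sketch. The class LazyVersusEager and its two helpers are hypothetical names chosen for this illustration; only removeAll(), stream(), filter() and collect() come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyVersusEager {
    // Eager: removeAll() has finished mutating the list by the time it returns.
    static List<String> eagerRemove(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        copy.removeAll(Arrays.asList("bob"));
        return copy;
    }

    // Lazy: filter() only describes work; the predicate runs when a terminal operation starts.
    static int[] predicateCalls(List<String> names) {
        AtomicInteger calls = new AtomicInteger();
        Stream<String> pending = names.stream()
                .filter(n -> { calls.incrementAndGet(); return !n.equals("bob"); });
        int beforeTerminal = calls.get();        // pipeline built, nothing evaluated yet
        pending.collect(Collectors.toList());    // terminal operation triggers the traversal
        int afterTerminal = calls.get();
        return new int[] { beforeTerminal, afterTerminal };
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("alice", "bob", "carol");
        System.out.println(eagerRemove(names));
        System.out.println(Arrays.toString(predicateCalls(names)));
    }
}
```

For the three-element list in main, the predicate count is 0 before collect() and 3 after it, while removeAll() has already changed its list when it returns. Two methods with such different timing sitting side by side on List is the kind of mix the answer describes as confusing.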

So, the way you suggest it is great if you want to do really simple things, but starts to fall apart when you try to build on it. Is the extra stream() method annoying? Sure. But keeping the abstractions for data structures (which are largely about organizing data in memory) and streams (which are largely about composing aggregate behavior) separate scales better to more sophisticated operations.

To your second question, you can do this relatively easily: implement the stream methods like this:

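// convertToStream() stands for however your class produces a Stream<T> of its elements
public <U> Stream<U> map(Function<? super T, ? extends U> mapper) { return convertToStream().map(mapper); }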

But that's just swimming against the tide; better to just implement an efficient stream() method.
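
As a rough sketch of what "just implement an efficient stream() method" could look like, consider a small array-backed container. Bag is a made-up class for this example, not a JDK type, and Arrays.stream() is only one reasonable way to supply the stream; a container with a different layout might build one from a Spliterator instead.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical container used only to illustrate the two approaches.
public class Bag<T> {
    private final T[] items;

    @SafeVarargs
    public Bag(T... items) {
        this.items = items.clone();   // defensive copy so callers cannot alter the contents
    }

    // Preferred: one entry point; every Stream operation is then available to callers.
    public Stream<T> stream() {
        return Arrays.stream(items);
    }

    // The forwarding style from the answer: works, but needs one such method per operation.
    public <U> Stream<U> map(Function<? super T, ? extends U> mapper) {
        return stream().map(mapper);
    }

    public static void main(String[] args) {
        Bag<String> bag = new Bag<>("a", "bb", "ccc");
        List<Integer> lengths = bag.stream().map(String::length).collect(Collectors.toList());
        System.out.println(lengths);
    }
}
```

With stream() in place, filter, map, max and the rest need no per-method forwarding, which is why the forwarding map() above adds little beyond saving one call.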
