In Java streams, is peek really only for debugging?

Date: 2021-11-08 19:09:42

I'm reading up about Java streams and discovering new things as I go along. One of the new things I found was the peek() function. Almost everything I've read on peek says it should be used to debug your Streams.

What if I had a Stream<Account>, where each Account has username and password fields and login() and loggedIn() methods?

I also have

Consumer<Account> login = account -> account.login();

and

Predicate<Account> loggedIn = account -> account.loggedIn();

Why would this be so bad?

List<Account> accounts; //assume it's been setup
List<Account> loggedInAccount = 
accounts.stream()
    .peek(login)
    .filter(loggedIn)
    .collect(Collectors.toList());

Now, as far as I can tell, this does exactly what it's intended to do. It:

  • Takes a list of accounts
  • Tries to log in to each account
  • Filters out any accounts which aren't logged in
  • Collects the logged in accounts into a new list

What is the downside of doing something like this? Any reason I shouldn't proceed? Lastly, if not this solution then what?

The original version of this used the .filter() method as follows:

.filter(account -> {
        account.login();
        return account.loggedIn();
    })

5 answers

#1


48  

The key takeaway from this:

Don't use the API in an unintended way, even if it accomplishes your immediate goal. That approach may break in the future, and it is also unclear to future maintainers.

There is no harm in breaking this out to multiple operations, as they are distinct operations. There is harm in using the API in an unclear and unintended way, which may have ramifications if this particular behavior is modified in future versions of Java.

Using forEach on this operation would make it clear to the maintainer that there is an intended side effect on each element of accounts, and that you are performing some operation that can mutate it.

It's also more conventional in the sense that peek is an intermediate operation which doesn't operate on the entire collection until the terminal operation runs, but forEach is indeed a terminal operation. This way, you can make strong arguments around the behavior and the flow of your code as opposed to asking questions about if peek would behave the same as forEach does in this context.

accounts.forEach(a -> a.login());
List<Account> loggedInAccounts = accounts.stream()
                                         .filter(Account::loggedIn)
                                         .collect(Collectors.toList());
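
For anyone who wants to run the two-step approach above, here is a minimal self-contained sketch. The DemoAccount class is a hypothetical stand-in for the question's Account (its login() simply succeeds when the password is non-empty), and LoginThenFilterDemo and loginAll are names invented for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for the question's Account class. Here login()
// succeeds only when the password is non-empty; a real class may differ.
class DemoAccount {
    private final String username;
    private final String password;
    private boolean loggedIn;

    DemoAccount(String username, String password) {
        this.username = username;
        this.password = password;
    }

    void login() { loggedIn = !password.isEmpty(); }
    boolean loggedIn() { return loggedIn; }
    String username() { return username; }
}

class LoginThenFilterDemo {
    static List<DemoAccount> loginAll(List<DemoAccount> accounts) {
        // Step 1: the side effect, stated explicitly and applied to every element.
        accounts.forEach(DemoAccount::login);
        // Step 2: a pipeline without side effects that only selects elements.
        return accounts.stream()
                .filter(DemoAccount::loggedIn)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<DemoAccount> accounts = Arrays.asList(
                new DemoAccount("alice", "secret"),
                new DemoAccount("bob", ""));
        for (DemoAccount account : loginAll(accounts)) {
            System.out.println(account.username());
        }
    }
}
```

Because the login step no longer lives inside the stream, changing the terminal operation later cannot silently skip logins.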

#2


65  

The important thing you have to understand is that streams are driven by the terminal operation. The terminal operation determines whether all elements have to be processed, or any at all. So collect is an operation which processes each item, whereas findAny may stop processing items once it has encountered a matching element.
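
A small sketch of that difference, assuming a sequential stream over a short list. It uses findFirst rather than findAny so the result is predictable, and the class and method names are made up for the illustration. Each method records which elements the peek action actually saw.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PeekShortCircuitDemo {
    // Returns the elements the peek action saw when the pipeline ends in findFirst().
    static List<String> seenWithFindFirst(List<String> names) {
        List<String> seen = new ArrayList<>();
        names.stream()
             .peek(seen::add)
             .filter(n -> n.startsWith("b"))
             .findFirst();
        return seen;
    }

    // Returns the elements the peek action saw when the pipeline ends in collect().
    static List<String> seenWithCollect(List<String> names) {
        List<String> seen = new ArrayList<>();
        names.stream()
             .peek(seen::add)
             .filter(n -> n.startsWith("b"))
             .collect(Collectors.toList());
        return seen;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("alice", "bob", "carol");
        System.out.println(seenWithFindFirst(names));
        System.out.println(seenWithCollect(names));
    }
}
```

With the OpenJDK implementation, the findFirst pipeline stops after "bob", so "carol" never reaches peek, while the collect pipeline visits all three names.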

And count() may not process any elements at all when it can determine the size of the stream without processing the items. This optimization is not made in Java 8 but is made in Java 9 and later, so there might be surprises when you switch from Java 8 and have code relying on count() processing all items. This is also connected to other implementation-dependent details: for example, even in Java 9 the reference implementation is not able to predict the size of an infinite stream source combined with limit, although no fundamental limitation prevents such a prediction.
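
The count() behavior is easy to observe. The sketch below (class and method names are illustrative) counts how often the peek action runs; on Java 8 it should run once per element, while on Java 9 and later it is typically skipped entirely because the list's size is known up front.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class PeekCountDemo {
    // Returns { result of count(), number of times the peek action ran }.
    static long[] countWithPeek(List<String> items) {
        AtomicInteger peeked = new AtomicInteger();
        long count = items.stream()
                .peek(item -> peeked.incrementAndGet())
                .count();
        return new long[] { count, peeked.get() };
    }

    public static void main(String[] args) {
        long[] result = countWithPeek(Arrays.asList("A", "B", "C"));
        System.out.println("count = " + result[0] + ", peek invocations = " + result[1]);
    }
}
```

The count is correct either way; only the side effect differs, which is exactly why relying on it is fragile.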

Since peek allows “performing the provided action on each element as elements are consumed from the resulting stream”, it does not mandate processing of elements but will perform the action depending on what the terminal operation needs. This implies that you have to use it with great care if you need particular processing, e.g. want to apply an action to all elements. It works if the terminal operation is guaranteed to process all items, but even then, you must be sure that the next developer does not change the terminal operation (or that you do not forget that subtle aspect yourself).

Further, while streams guarantee maintaining the encounter order for certain combinations of operations even for parallel streams, these guarantees do not apply to peek. When collecting into a list, the resulting list will have the right order for ordered parallel streams, but the peek action may get invoked in an arbitrary order and concurrently.
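
The following sketch illustrates the ordering point with a parallel stream of integers. The names are invented for the example, and a thread-safe queue is used because the peek action may run on several threads at once.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelPeekOrderDemo {
    // Collects 0..n-1 in parallel and records the order in which peek saw them.
    static List<Integer> collectInParallel(int n, Queue<Integer> peekOrder) {
        return IntStream.range(0, n)
                .boxed()
                .parallel()
                .peek(peekOrder::add)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Queue<Integer> peekOrder = new ConcurrentLinkedQueue<>();
        List<Integer> result = collectInParallel(20, peekOrder);
        System.out.println("collected: " + result);
        System.out.println("peek saw:  " + peekOrder);
    }
}
```

The collected list stays in encounter order, while the order recorded by peek can differ from run to run.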

So the most useful thing you can do with peek is to find out whether a stream element has been processed which is exactly what the API documentation says:

This method exists mainly to support debugging, where you want to see the elements as they flow past a certain point in a pipeline

#3


9  

Perhaps a rule of thumb should be that if you do use peek outside the "debug" scenario, you should only do so if you're sure of what the terminating and intermediate filtering conditions are. For example:

return list.stream().map(foo->foo.getBar())
                    .peek(bar->bar.publish("HELLO"))
                    .collect(Collectors.toList());

seems to be a valid case where you want, in one operation, to transform all Foos to Bars and tell them all hello.

Seems more efficient and elegant than something like:

List<Bar> bars = list.stream().map(foo->foo.getBar()).collect(Collectors.toList());
bars.forEach(bar->bar.publish("HELLO"));
return bars;

and you don't end up iterating a collection twice.

#4


2  

Although I agree with most answers above, I have one case in which using peek actually seems like the cleanest way to go.

Similar to your use case, suppose you want to filter only on active accounts and then perform a login on these accounts.

accounts.stream()
    .filter(Account::isActive)
    .peek(login)
    .collect(Collectors.toList());

Peek is helpful here because it avoids the redundant return statement that the equivalent map call needs, while still not having to iterate the collection twice:

accounts.stream()
    .filter(Account::isActive)
    .map(account -> {
        account.login();
        return account;
    })
    .collect(Collectors.toList());

#5


1  

I would say that peek provides the ability to decentralize code that can mutate stream objects, or modify global state (based on them), instead of stuffing everything into a simple or composed function passed to a terminal method.

Now the question might be: should we mutate stream objects or change global state from within functions in functional-style Java programming?

If the answer to either of the two questions above is yes (or: in some cases yes), then peek() is definitely not only for debugging purposes, for the same reason that forEach() isn't only for debugging purposes.

For me, choosing between forEach() and peek() comes down to the following: do I want pieces of code that mutate stream objects to be attached to a composable, or do I want them attached directly to the stream?

I think peek() will pair better with Java 9 methods, e.g. takeWhile() may need to decide when to stop iteration based on an already mutated object, so pairing it with forEach() would not have the same effect.
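
A rough sketch of that pairing, which needs Java 9 or later for takeWhile(). TakeWhileAccount is a hypothetical class invented for this example (its login() succeeds only if the account was created with a valid password), and the pipeline is assumed to be sequential.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical account type, for illustration only.
class TakeWhileAccount {
    final String username;
    private final boolean validPassword;
    private boolean loggedIn;
    int loginAttempts;

    TakeWhileAccount(String username, boolean validPassword) {
        this.username = username;
        this.validPassword = validPassword;
    }

    void login() {
        loginAttempts++;
        loggedIn = validPassword;
    }

    boolean loggedIn() { return loggedIn; }
}

class PeekTakeWhileDemo {
    // Logs accounts in one by one and keeps those before the first failed login.
    static List<TakeWhileAccount> loginUntilFirstFailure(List<TakeWhileAccount> accounts) {
        return accounts.stream()
                .peek(TakeWhileAccount::login)          // mutate the element first
                .takeWhile(TakeWhileAccount::loggedIn)  // then decide based on the mutated state
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<TakeWhileAccount> accounts = Arrays.asList(
                new TakeWhileAccount("alice", true),
                new TakeWhileAccount("bob", false),
                new TakeWhileAccount("carol", true));
        for (TakeWhileAccount account : loginUntilFirstFailure(accounts)) {
            System.out.println(account.username);
        }
    }
}
```

Here takeWhile() sees each account only after peek() has attempted the login, which a terminal forEach() could not provide. The caveats from the other answers still apply: this relies on how the pipeline happens to be evaluated.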

P.S. I have not referenced map() anywhere because in case we want to mutate objects (or global state), rather than generating new objects, it works exactly like peek().
