在Apache Beam中强制流式传输空窗格/窗口

时间:2022-03-22 15:36:03

I am trying to implement a pipeline and takes in a stream of data and every minutes output a True if there is any element in the minute interval or False if there is none. The pane (with forever time trigger) or window (fixed window) does not seem to trigger if there is no element for the duration.

我正在尝试实现一个管道并接收一个数据流,如果在分钟间隔中有任何元素,则每分钟输出一个True,如果没有,则输出False。如果持续时间没有元素,则窗格(具有永久时间触发器)或窗口(固定窗口)似乎不会触发。

One workaround I am thinking is to put the stream into a global window, use a ValueState to keep a queue to accumulate the data and a timer as a trigger to exam the queue. I wonder if there is any neater way of achieving this.

我想的一个解决方法是将流放入全局窗口,使用ValueState保持队列以累积数据,并使用计时器作为检查队列的触发器。我想知道是否有更简洁的方法来实现这一目标。

Thanks.

谢谢。

1 个解决方案

#1


1  

I think your timers and state solution is a good way to do this. However, keep in mind that your timers will not be set until you receive at least one element for a key.

我认为你的计时器和状态解决方案是一个很好的方法。但是,请记住,在收到密钥的至少一个元素之前,不会设置计时器。

If this is an issue, then the other thing you could do is inject a PCollection so that every window is guaranteed to have at least one dummy element. Then you can use ValueState to check if any element besides the dummy element has arrived. Or alternatively use Count.PerElement over the window and check if there is more than 1 element(An additional element, that is not the dummy element) for that window.

如果这是一个问题,那么你可以做的另一件事就是注入一个PCollection,这样每个窗口都保证至少有一个虚拟元素。然后,您可以使用ValueState来检查除虚拟元素之外的任何元素是否已到达。或者在窗口上使用Count.PerElement并检查该窗口是否有多于1个元素(一个附加元素,不是虚拟元素)。

#1


1  

I think your timers and state solution is a good way to do this. However, keep in mind that your timers will not be set until you receive at least one element for a key.

我认为你的计时器和状态解决方案是一个很好的方法。但是,请记住,在收到密钥的至少一个元素之前,不会设置计时器。

If this is an issue, then the other thing you could do is inject a PCollection so that every window is guaranteed to have at least one dummy element. Then you can use ValueState to check if any element besides the dummy element has arrived. Or alternatively use Count.PerElement over the window and check if there is more than 1 element(An additional element, that is not the dummy element) for that window.

如果这是一个问题,那么你可以做的另一件事就是注入一个PCollection,这样每个窗口都保证至少有一个虚拟元素。然后,您可以使用ValueState来检查除虚拟元素之外的任何元素是否已到达。或者在窗口上使用Count.PerElement并检查该窗口是否有多于1个元素(一个附加元素,不是虚拟元素)。