I am trying to create aggregators to count values that satisfy a condition across all input data . I looked into documentation and found the below for creation .
我正在尝试创建聚合器来计算满足所有输入数据条件的值。我查看了文档,发现了下面的创建。
https://cloud.google.com/dataflow/java-sdk/JavaDoc/com/google/cloud/dataflow/sdk/transforms/Aggregator ..
I am using : google-cloud-dataflow-java-sdk-all - 2.4.0 (apache beam based)
我正在使用:google-cloud-dataflow-java-sdk-all - 2.4.0(基于apache beam)
However I am not able to find the corresponding class in the new beam api.. I looked into org.apache.beam.sdk.transforms package .
但是我无法在新的光束api中找到相应的类。我查看了org.apache.beam.sdk.transforms包。
Can you please let me know how can I use aggregators with dataflow runner in new api . ?
能告诉我如何在新api中使用带有dataflow runner的聚合器。 ?
2 个解决方案
#1
1
The link you have is for the old SDK (1.x).
您拥有的链接是旧SDK(1.x)。
In SDK 2.x, you should refer to apache-beam
SDK. For the Aggregators
you mentioned, if I understand correctly, it's for adding counters during processing. I guess the corresponding package should be org.apache.beam.sdk.metrics
.
在SDK 2.x中,您应该参考apache-beam SDK。对于您提到的聚合器,如果我理解正确的话,那就是在处理过程中添加计数器。我猜对应的包应该是org.apache.beam.sdk.metrics。
Package org.apache.beam.sdk.metrics Metrics allow exporting information about the execution of a pipeline.
包org.apache.beam.sdk.metrics指标允许导出有关管道执行的信息。
and org.apache.beam.sdk.metrics.Counter
interface:
和org.apache.beam.sdk.metrics.Counter接口:
A metric that reports a single long value and can be incremented or decremented.
报告单个长值并可递增或递减的度量标准。
#2
0
As of now, there seem to be no replacement for the Aggregator class in Apache Beam SDK 2.X. An alternate solution to count values respecting a condition would be Transforms. By using the GroupBy transform to collect data meeting a condition and then the Combine transform, you can get a count of the input data respecting the condition.
截至目前,似乎没有替代Apache Beam SDK 2.X中的Aggregator类。计算关于条件的值的替代解决方案是变换。通过使用GroupBy变换来收集满足条件的数据,然后使用Combine变换,您可以获得关于条件的输入数据的计数。
#1
1
The link you have is for the old SDK (1.x).
您拥有的链接是旧SDK(1.x)。
In SDK 2.x, you should refer to apache-beam
SDK. For the Aggregators
you mentioned, if I understand correctly, it's for adding counters during processing. I guess the corresponding package should be org.apache.beam.sdk.metrics
.
在SDK 2.x中,您应该参考apache-beam SDK。对于您提到的聚合器,如果我理解正确的话,那就是在处理过程中添加计数器。我猜对应的包应该是org.apache.beam.sdk.metrics。
Package org.apache.beam.sdk.metrics Metrics allow exporting information about the execution of a pipeline.
包org.apache.beam.sdk.metrics指标允许导出有关管道执行的信息。
and org.apache.beam.sdk.metrics.Counter
interface:
和org.apache.beam.sdk.metrics.Counter接口:
A metric that reports a single long value and can be incremented or decremented.
报告单个长值并可递增或递减的度量标准。
#2
0
As of now, there seem to be no replacement for the Aggregator class in Apache Beam SDK 2.X. An alternate solution to count values respecting a condition would be Transforms. By using the GroupBy transform to collect data meeting a condition and then the Combine transform, you can get a count of the input data respecting the condition.
截至目前,似乎没有替代Apache Beam SDK 2.X中的Aggregator类。计算关于条件的值的替代解决方案是变换。通过使用GroupBy变换来收集满足条件的数据,然后使用Combine变换,您可以获得关于条件的输入数据的计数。