Selecting elements in processElement() - Apache Beam

I know that when we implement a ParDo transform, we pick up individual elements from our data (basically separated by "\n"). But what if I have an element that occupies two lines in my file? Can I apply my own condition to decide what counts as an element, or must each element always fit on a single line?

1 solution

#1

Reading text files is controlled by TextIO, not by ParDo; I suppose that's what you meant. Right now TextIO does split files into one element per line, but there is work in progress to change that. You can follow it at https://issues.apache.org/jira/browse/BEAM-2802.

It would help that work if you said more about your file format, so we can make sure it is in scope.
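Until something like that is available, one possible workaround is to do the record splitting yourself. The sketch below is plain Python with no Beam calls, and both helper names (`group_lines`, `split_records`) are made up for illustration. It assumes the contents of one whole file are available in a single place, for example inside one processElement() call that receives a file's text. Regrouping lines after a line-per-element read is not reliable, because the elements of a PCollection have no guaranteed order.

```python
from typing import Iterable, Iterator, List


def group_lines(lines: Iterable[str], lines_per_record: int = 2) -> Iterator[str]:
    """Join every `lines_per_record` consecutive lines into one record.

    Hypothetical helper: assumes records have a fixed number of lines.
    """
    buffer: List[str] = []
    for line in lines:
        buffer.append(line.rstrip("\n"))
        if len(buffer) == lines_per_record:
            yield "\n".join(buffer)
            buffer = []
    if buffer:
        # Trailing partial record: emitted as-is in this sketch.
        yield "\n".join(buffer)


def split_records(text: str, delimiter: str = "\n\n") -> List[str]:
    """Split whole-file text on a custom delimiter, dropping empty pieces.

    Hypothetical helper: assumes records are separated by `delimiter`
    (a blank line by default) rather than by a single newline.
    """
    return [piece for piece in text.split(delimiter) if piece.strip()]


if __name__ == "__main__":
    for record in group_lines(["id=1\n", "name=a\n", "id=2\n", "name=b\n"]):
        print(repr(record))
```

Which helper fits depends on the file format: `group_lines` suits records with a fixed line count, while `split_records` suits records separated by a distinct marker.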
