I need to instantiate and use a GcsUtil from within a CombineFn subclass, and it looks like I need to hand a PipelineOptions instance to the GcsUtilFactory. However, I cannot find a way to retrieve an instance of the PipelineOptions class (unlike in DoFns).
Is there an API to retrieve the current pipeline's options at runtime? Keeping the options in a field doesn't seem to work, and it blocks the pipeline upload to the Dataflow service.
Thanks! G
1 solution
#1
Reading from GCS within the CombineFn is likely to be problematic. For instance, you wouldn't get any of the caching that side inputs give you.
Depending on what kind of configuration you're trying to do, your best bet is probably to use a ParDo/DoFn before running the Combine.
Separately, it probably does make sense for PipelineOptions to be made accessible from within the CombineFn. I've made a note of this, and we'll take a look.
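In the meantime, one possible workaround is to copy only the values you need out of the options when you construct the CombineFn, rather than holding the options object itself. The sketch below uses plain Java with hypothetical stand-in classes (FakeOptions, HoldsOptions, HoldsValues are made up for illustration and are not SDK types). It assumes the upload failure comes from the function being Java-serialized while it holds a non-serializable field, which I have not confirmed for your pipeline.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for an options object that is NOT Serializable.
class FakeOptions {
    private final String bucket;
    FakeOptions(String bucket) { this.bucket = bucket; }
    String getBucket() { return bucket; }
}

// Hypothetical stand-in for a combine function that keeps the options in a field.
class HoldsOptions implements Serializable {
    private static final long serialVersionUID = 1L;
    final FakeOptions options; // non-serializable field: serialization will fail
    HoldsOptions(FakeOptions options) { this.options = options; }
}

// Same idea, but only the needed value is copied out at construction time.
class HoldsValues implements Serializable {
    private static final long serialVersionUID = 1L;
    final String bucket; // plain String: serializes without trouble
    HoldsValues(FakeOptions options) { this.bucket = options.getBucket(); }
}

class OptionsCaptureDemo {
    // Serializes the value to bytes and reads it back, as a runner might
    // do when shipping a function to workers.
    static <T extends Serializable> T roundTrip(T value)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    // Returns true if the value survives a serialization round trip.
    static boolean canSerialize(Serializable value) {
        try {
            roundTrip(value);
            return true;
        } catch (IOException | ClassNotFoundException e) {
            // NotSerializableException is an IOException and lands here.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        FakeOptions options = new FakeOptions("gs://example-bucket");
        System.out.println(canSerialize(new HoldsOptions(options)));    // false
        System.out.println(roundTrip(new HoldsValues(options)).bucket); // gs://example-bucket
    }
}
```

This only helps if the values you need are known when the pipeline is constructed. It does not address the caching concern above, so doing the GCS work in a ParDo before the Combine is still the safer route.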