Dataflow job fails with an "output property missing" error

Date: 2022-05-24 14:44:13

My Dataflow pipeline job fails with a message saying the output property is missing, even though I pass the output parameter as an argument.

Error:

Exception in thread "main" java.lang.IllegalArgumentException: Class interface org.apache.beam.runners.dataflow.options.DataflowPipelineOptions missing a property named 'output'.
    at org.apache.beam.sdk.options.PipelineOptionsFactory.parseObjects(PipelineOptionsFactory.java:1483)
    at org.apache.beam.sdk.options.PipelineOptionsFactory.access$400(PipelineOptionsFactory.java:110)
    at org.apache.beam.sdk.options.PipelineOptionsFactory$Builder.as(PipelineOptionsFactory.java:297)
    at com.example.DataValidationPipeline.getOptions(DataValidationPipeline.java:36)
    at com.example.DataValidationPipeline.main(DataValidationPipeline.java:50)

1 solution

#1


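What seems to be happening is that the output argument is not being accepted the way the pipeline expects. The exception says that DataflowPipelineOptions, the interface passed to as(...) in getOptions, has no property named 'output', so the argument is rejected even though it is present on the command line. The WordCount example accepts --output because it declares that property in its own options interface. I recommend reviewing this document, where you can see that when you build and run the Dataflow pipeline you set the output argument in the command like this: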

  mvn compile exec:java \
  -Dexec.mainClass=com.example.WordCount \
  -Dexec.args="--project=<my-cloud-project> \
  --stagingLocation=gs://<my-wordcount-storage-bucket>/staging/ \
  --output=gs://<my-wordcount-storage-bucket>/output \
  --runner=DataflowRunner"

This assumes you are using Maven; if you are using Eclipse, you may review this document.
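
To illustrate why the argument is rejected, here is a minimal, Beam-free sketch of how a property name can be matched against the setters an options interface declares. Everything in it (BaseOptions, ValidationOptions, hasProperty, requireProperty) is a hypothetical stand-in written for this answer, not Beam's actual code. In a real pipeline the equivalent change would likely be an interface that extends DataflowPipelineOptions, declares getOutput() and setOutput(String), and is passed to as(...) in place of DataflowPipelineOptions.class.

```java
import java.lang.reflect.Method;

public class OptionsPropertyCheck {

    // Hypothetical stand-in for a built-in options interface that has no 'output' property.
    interface BaseOptions {
        String getProject();
        void setProject(String value);
    }

    // Hypothetical custom interface that adds the 'output' property on top of the base one.
    interface ValidationOptions extends BaseOptions {
        String getOutput();
        void setOutput(String value);
    }

    // Returns true if the interface, or any interface it extends, declares a
    // one-argument setter for the given property name.
    static boolean hasProperty(Class<?> optionsInterface, String name) {
        String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
        for (Method m : optionsInterface.getMethods()) {
            if (m.getName().equals(setter) && m.getParameterCount() == 1) {
                return true;
            }
        }
        return false;
    }

    // Mimics the failure mode: an argument whose name is not a declared property is rejected.
    static void requireProperty(Class<?> optionsInterface, String name) {
        if (!hasProperty(optionsInterface, name)) {
            throw new IllegalArgumentException(
                "Class " + optionsInterface + " missing a property named '" + name + "'.");
        }
    }

    public static void main(String[] args) {
        // The custom interface declares 'output', so this check passes silently.
        requireProperty(ValidationOptions.class, "output");

        // The base interface does not declare it, so the same argument is rejected.
        try {
            requireProperty(BaseOptions.class, "output");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is that passing --output on the command line is not enough on its own: the interface the options are parsed into also has to declare the property.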
