gcloud topic escaping in Apache Beam

Time: 2021-10-26 15:36:06

I'm trying to run a Dataflow job through the gcloud command:

gcloud beta dataflow jobs run test --gcs-location gs://bucket/templates/templateName --parameters query="select a.name,b.salary,a.id from table1 a join table2 b on a.id = b.id"

But I get an error saying:

ERROR: (gcloud.beta.dataflow.jobs.run) argument --parameters: Bad syntax for dict arg: [b.salary]. Please see gcloud topic escaping if you would like information on escaping list or dictionary flag values.

I read the documentation for gcloud topic escaping but cannot figure out how to apply it here. Can somebody please help me with this?

Thanks.

1 solution

#1


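The --parameters flag takes a dictionary as its argument. By default gcloud separates dictionary entries with commas, so the commas inside the query are read as entry separators, and b.salary is rejected because it is not a key=value pair. As described in gcloud topic escaping, you need to specify an alternate delimiter for the dictionary's elements, even though we only have one element here.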
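
To see why gcloud complains about [b.salary], it can help to split the flag value on commas by hand. The snippet below is only a simplified model of comma-delimited parsing, not gcloud's actual parser; the variable names VALUE and PIECES are made up for the illustration.

```shell
# The flag value as gcloud receives it once the shell has stripped the quotes.
VALUE='query=select a.name,b.salary,a.id from table1 a join table2 b on a.id = b.id'

# Split on the default "," separator, one piece per line.
PIECES=$(printf '%s\n' "$VALUE" | tr ',' '\n')
printf '%s\n' "$PIECES"

# Flag every piece that contains no "=" and so cannot be a key=value pair.
printf '%s\n' "$PIECES" | grep -v '=' | sed 's/^/not a key=value pair: /'
```

The value falls apart into three pieces, and the only one without an "=" is b.salary, which is the piece named in the error message.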

Therefore we can pick an arbitrary delimiter that does not occur in the query, such as ":", and declare it with ^:^ at the start of the value (notice the change before query=):

gcloud beta dataflow jobs run test --gcs-location gs://bucket/templates/templateName --parameters ^:^query="select a.name,b.salary,a.id from table1 a join table2 b on a.id = b.id"
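
The command above passes a single parameter. With several parameters, the chosen delimiter also separates the entries, so it must not appear in any value. The following is a dry-run sketch under assumptions: the second parameter, output, and its gs:// path are hypothetical and only there to show why ":" would no longer work, and the command is printed rather than executed.

```shell
# Values to pass. OUTPUT is a hypothetical second parameter, for illustration only.
QUERY='select a.name,b.salary,a.id from table1 a join table2 b on a.id = b.id'
OUTPUT='gs://bucket/output/result'

# OUTPUT contains ":", so ":" cannot be the delimiter here; "|" occurs in neither value.
DELIM='|'
case "${QUERY}${OUTPUT}" in
  *"$DELIM"*) echo "delimiter '$DELIM' occurs in a value; pick another" >&2; exit 1 ;;
esac

# ^DELIM^ at the start of the value declares the delimiter; entries follow, separated by it.
PARAMS="^${DELIM}^query=${QUERY}${DELIM}output=${OUTPUT}"

# Dry run: print the command instead of running it.
printf 'gcloud beta dataflow jobs run test --gcs-location gs://bucket/templates/templateName --parameters "%s"\n' "$PARAMS"
```

Quoting the whole value keeps the shell from interpreting "|" as a pipe and keeps the spaces in the query inside a single argument.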

On an actual template (provided by Google):

gcloud beta dataflow jobs run test --gcs-location=gs://dataflow-templates/wordcount/template_file --parameters ^:^query="select a.name,b.salary,a.id from table1 a join table2 b on a.id = b.id"

This returns INVALID_ARGUMENT: (bf23ae8a2a6f1efe): The workflow could not be created. Causes: (bf23ae8a2a6f165b): Found unexpected parameters: ['query' (perhaps you meant 'runner')]. That shows the escaping issue is fixed: Dataflow now understands that we are passing a query parameter. The Google template simply does not define such a parameter, so it rejects the job, which is the expected behavior.
