I would like to use the experimental option that allows me to update a BigQuery schema when performing a load job.
I'm using Dataflow and the built-in BigQueryIO.write from the SDK.
I saw that with JobConfigurationLoad.setSchemaUpdateOptions(ALLOW_FIELD_ADDITION) from the BigQuery API it's possible, but I can't find the equivalent in BigQueryIO.
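For clarity, this is roughly the BigQuery API call I mean (a minimal sketch with the google-api-services-bigquery client, not Dataflow/Beam code):

```java
import com.google.api.services.bigquery.model.JobConfigurationLoad;
import java.util.Collections;

// Sketch: on the BigQuery API side, the schema update option is a string
// set on the load configuration of a load job.
JobConfigurationLoad load = new JobConfigurationLoad()
    .setSchemaUpdateOptions(Collections.singletonList("ALLOW_FIELD_ADDITION"));
```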
Does it exist somewhere, or can I override some part of BigQueryIO to do that?
Thank you very much,
1 Answer
#1
AFAIK, that experimental option is not yet exposed via the Dataflow/Beam APIs in BigQueryIO, and it would not be a trivial task to override something in that class - I wouldn't recommend going down that route.
One workaround I can think of would be to redirect your sink to GCS instead of BigQuery, and then perform a normal BigQuery load job (or jobs) at the end of your pipeline. That way you can use the option.
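For example, a load job issued at the end of the pipeline could look roughly like this (a minimal sketch using the google-api-services-bigquery client; the bucket, project, dataset, table and source format are placeholders that depend on what your pipeline actually writes to GCS):

```java
import com.google.api.services.bigquery.Bigquery;
import com.google.api.services.bigquery.model.Job;
import com.google.api.services.bigquery.model.JobConfiguration;
import com.google.api.services.bigquery.model.JobConfigurationLoad;
import com.google.api.services.bigquery.model.TableReference;
import java.util.Collections;

// Load the files the pipeline wrote to GCS into BigQuery, allowing new fields.
JobConfigurationLoad load = new JobConfigurationLoad()
    .setSourceUris(Collections.singletonList("gs://my-bucket/pipeline-output/*.json"))
    .setSourceFormat("NEWLINE_DELIMITED_JSON")
    .setDestinationTable(new TableReference()
        .setProjectId("my-project")
        .setDatasetId("my_dataset")
        .setTableId("my_table"))
    .setWriteDisposition("WRITE_APPEND")
    // The experimental option the question is about:
    .setSchemaUpdateOptions(Collections.singletonList("ALLOW_FIELD_ADDITION"));

Job job = new Job().setConfiguration(new JobConfiguration().setLoad(load));

// 'bigquery' would be an authenticated com.google.api.services.bigquery.Bigquery client:
// bigquery.jobs().insert("my-project", job).execute();
```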