Parsing JSON into key-value pairs in a Dataflow job

Date: 2021-07-19 15:35:16

How can I parse JSON data in Apache Beam and store it in a BigQuery table? For example, given this JSON data:

[{"name":"stack"},{"id":"100"}]

How can I parse this JSON and convert it to a PCollection of key-value pairs (K, V) to be stored in a BQ table? I appreciate your help!

1 Solution

#1


2  

Typically you would use the JSON parser built into your programming language (are you using Java or Python?). Then create a TableRow object from the parsed data and use it as the element of the PCollection that you pass to the BQ table.
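
As a minimal sketch of the parsing step, assuming the Python SDK (where a BigQuery row is a plain dict rather than a Java TableRow) and a table with two string columns, name and id. The helper name to_row is hypothetical and only used for illustration:

```python
import json


def to_row(json_line):
    """Parse one JSON object and return a dict keyed by column name."""
    record = json.loads(json_line)
    if not isinstance(record, dict):
        raise ValueError("expected a JSON object (map) at the root")
    # Assumption: the target table has exactly these two STRING columns.
    return {"name": str(record["name"]), "id": str(record["id"])}


row = to_row('{"name":"stack", "id":"100"}')
print(row)
```

Each dict returned this way can serve as one element of the PCollection that is written to BigQuery.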

Note: Some JSON parsers disallow JSON that starts with a root list, as shown in your example, and prefer a root map like the one below. Python's json library does accept a root list, but a BigQuery row is still a single map of column names to values, so the root-map form is the easier one to work with.

{"name":"stack", "id":"100"}
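
If the input does arrive as a root list like the one in the question, one option is to merge the list items into a single map before building the row. A minimal sketch using Python's standard json module follows. The helper name merge_root_list is hypothetical, and the sketch assumes every list item is a small JSON object whose keys do not clash:

```python
import json


def merge_root_list(raw):
    """Merge a root list of JSON objects into a single dict."""
    parsed = json.loads(raw)
    if isinstance(parsed, dict):
        # Already a root map; nothing to merge.
        return parsed
    merged = {}
    for item in parsed:
        # Assumption: later items may overwrite earlier keys.
        merged.update(item)
    return merged


merged = merge_root_list('[{"name":"stack"},{"id":"100"}]')
# Key-value pairs, in case the pipeline needs (K, V) tuples instead of a row dict.
pairs = sorted(merged.items())
print(merged)
print(pairs)
```

The merged dict has the root-map shape shown above, and the sorted list of tuples is one way to get explicit key-value pairs.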

Please see this example pipeline for how to create the PCollection and use BigQueryIO.
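
The linked example is not reproduced here. As a rough sketch of what such a pipeline could look like with the Beam Python SDK (where the BigQuery sink is beam.io.WriteToBigQuery): the table name my-project:my_dataset.my_table, the schema string, and the helper names to_row and run are all assumptions for illustration, not taken from the original example.

```python
import json


def to_row(json_line):
    """Parse one JSON object into a dict that maps column name to value."""
    return json.loads(json_line)


def run(table="my-project:my_dataset.my_table"):
    # Imported lazily so the parsing helper can be used without Beam installed.
    import apache_beam as beam  # requires the apache-beam[gcp] package

    with beam.Pipeline() as pipeline:
        (
            pipeline
            # Assumption: a single in-memory record stands in for the real source.
            | "CreateInput" >> beam.Create(['{"name":"stack", "id":"100"}'])
            | "ParseJson" >> beam.Map(to_row)
            | "WriteToBQ" >> beam.io.WriteToBigQuery(
                table,
                schema="name:STRING,id:STRING",
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )
```

Calling run() needs Google Cloud credentials and an existing dataset, and batch writes typically also need a temp_location pipeline option, so the sketch only defines the function and does not invoke it.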

You may also want to consider using one of the X to BigQuery template pipelines.
