How can I read nested structures using Apache Beam Python SDK?
lines = p | io.Read(io.BigQuerySource('project:test.beam_in'))
results in
"reason": "invalidQuery",
"message": "Cannot output multiple independently repeated fields at the same time. Found classification_item_distribution and category_cat_name"
Is it possible to read nested structures?
2 Answers
#1
1
This is a property of BigQuery. The two ways to execute such a query are to disable result flattening (by BigQuery) or to explicitly flatten fields in your query.
With the current Python SDK only the latter is available. See "Flattening Google Analytics data (with repeated fields) not working anymore" for a guide on where and how to invoke the FLATTEN function.
The feature to disable flattening is filed as BEAM-877 if you care to subscribe to updates or discuss.
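As a rough sketch of the explicit-flatten option: legacy SQL provides a `FLATTEN(table, field)` operator, which can be applied to one of the two repeated fields named in the error so that only one repeated field is left. The helper `build_flatten_query` below is hypothetical (it is not part of the Beam SDK), and passing the string through `io.BigQuerySource(query=...)` is an assumption about how you would wire it into the pipeline from the question. `SELECT *` is used only for brevity; you may need to list columns explicitly.

```python
def build_flatten_query(table, repeated_field, columns=("*",)):
    """Build a legacy SQL query that flattens one repeated field.

    Hypothetical helper for illustration; not part of the Beam SDK.
    """
    column_list = ", ".join(columns)
    # Legacy SQL refers to tables as [project:dataset.table].
    return "SELECT {} FROM FLATTEN([{}], {})".format(
        column_list, table, repeated_field)


query = build_flatten_query(
    "project:test.beam_in", "classification_item_distribution")
print(query)

# Assumed usage in the pipeline from the question:
# lines = p | io.Read(io.BigQuerySource(query=query))
```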
#2
1
You can now read nested results directly in Beam Python by adding flatten_results=False when creating your source:
lines = p | io.Read(io.BigQuerySource('project:test.beam_in', flatten_results=False))
See source here.
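With flattening disabled, each row should arrive with its repeated fields still nested. The sketch below assumes rows are Python dicts whose repeated fields are lists; the sample row is made up, since the real schema of test.beam_in is not shown in the question. The hypothetical helper `explode_repeated` shows one way to expand a single repeated field inside the pipeline if a later step wants one element per value.

```python
def explode_repeated(row, field):
    """Yield one copy of `row` per element of the repeated `field`."""
    for value in row.get(field) or []:
        flat = dict(row)  # shallow copy so the input row is not mutated
        flat[field] = value
        yield flat


# Made-up sample row standing in for an unflattened BigQuery result.
sample_row = {
    "id": 1,
    "category_cat_name": ["a", "b"],
    "classification_item_distribution": [0.25, 0.75],
}

exploded = list(explode_repeated(sample_row, "category_cat_name"))
print(exploded)

# Assumed usage with the `lines` PCollection from the answer:
# exploded_rows = lines | beam.FlatMap(explode_repeated, "category_cat_name")
```

Only the chosen field is expanded; the other repeated field stays a list on every output row, which avoids the cross product that BigQuery refuses to produce.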