I have a wide, flat table stored in Google BigQuery in a format similar to the following:
log_date:integer,sessionid:integer,computer:string,ip:string,event_id:integer,amount:float
I'm trying to recreate this table in a hierarchical nested format with two nested levels, using the following schema:
[
{
"name": "log_date",
"type": "integer"
},
{
"name": "session",
"type": "record",
"mode": "repeated",
"fields": [
{
"name": "sessionid",
"type": "integer"
},
{
"name": "computer",
"type": "string"
},
{
"name": "ip",
"type": "string"
},
{
"name": "event",
"type": "record",
"mode": "repeated",
"fields": [
{
"name": "event_id",
"type": "integer"
},
{
"name": "amount",
"type": "float"
}
]
}
]
}
]
What is the best way to generate the JSON-formatted data file from a BigQuery table? Is there a different and faster approach than:
1. downloading the table into an external CSV file,
2. building the JSON records and writing them into an external file,
3. uploading the external JSON file into a new BigQuery table?
Is there a direct process that generates JSON from existing tables?
Thank you, H
1 solution
#1
1
There isn't currently a way to automatically transform the data to a nested format. If you'd like to get the data out in JSON format rather than CSV, you can use the bq extract command with the --destination_format flag set to NEWLINE_DELIMITED_JSON, e.g.:
bq extract \
--destination_format=NEWLINE_DELIMITED_JSON \
yourdataset.table \
gs://your_bucket/result*.json
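Since the nesting has to be done outside BigQuery, the middle step (building the nested records) could look roughly like the Python sketch below. It is only an illustration, not an official tool: the function names nest_rows and convert are hypothetical, it assumes the exported files have been copied locally, it assumes that sessionid, computer and ip together identify one session, and it keeps all rows in memory, so a large table would need to be processed in chunks (for example, one log_date at a time).

```python
import json


def nest_rows(rows):
    """Group flat rows into log_date -> session[] -> event[] records."""
    by_date = {}  # dicts keep insertion order in Python 3.7+
    for row in rows:
        sessions = by_date.setdefault(row["log_date"], {})
        # Assumption: these three columns together identify one session.
        key = (row["sessionid"], row["computer"], row["ip"])
        if key not in sessions:
            sessions[key] = {
                "sessionid": row["sessionid"],
                "computer": row["computer"],
                "ip": row["ip"],
                "event": [],
            }
        sessions[key]["event"].append(
            {"event_id": row["event_id"], "amount": row["amount"]}
        )
    for log_date, sessions in by_date.items():
        yield {"log_date": log_date, "session": list(sessions.values())}


def convert(flat_path, nested_path):
    """Read newline-delimited flat JSON, write newline-delimited nested JSON."""
    with open(flat_path) as src, open(nested_path, "w") as dst:
        rows = (json.loads(line) for line in src if line.strip())
        for record in nest_rows(rows):
            dst.write(json.dumps(record) + "\n")
```

The resulting file has one JSON object per line, matching the nested schema from the question, and could then be loaded into a new table with bq load and --source_format=NEWLINE_DELIMITED_JSON, passing that schema as a file.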