如何在BigQuery中完全使用嵌套模式创建新表

时间:2021-07-09 15:22:47

I've got a nested table A in BigQuery with a schema as follows:

我在BigQuery中有一个嵌套的表A,其架构如下:

    {
    "name": "page_event",
    "mode": "repeated",
    "type": "RECORD",
    "fields": [
        {
            "name": "id",
            "type": "STRING"
        }
    ]
    }

I would like to enrich table A with data from other table and save result as a new nested table. Let's say I would like to add "description" field to table A (creating table B), so my schema will be as follows:

我想用其他表中的数据来丰富表A,并将结果保存为新的嵌套表。假设我想在表A(创建表B)中添加“description”字段,所以我的模式如下:

    {
    "name": "page_event",
    "mode": "repeated",
    "type": "RECORD",
    "fields": [
        {
            "name": "id",
            "type": "STRING"
        },
        {
            "name": "description",
            "type": "STRING"
        }
    ]
    }

How do I do this in BigQuery? It seems, that there are no functions for creating nested structures in BigQuery SQL (except NEST functions, which produces a list - but this function doesn't seem to work, failing with Unexpected error)

我如何在BigQuery中执行此操作?似乎没有在BigQuery SQL中创建嵌套结构的函数(除了NEST函数,它产生一个列表 - 但是这个函数似乎不起作用,因意外错误而失败)

The only way of doing this I can think of, is to:

我能想到的唯一方法就是:

  • use string concatenation functions to produce table B with single field called "json" with content being enriched data from A, converted to json string
  • 使用字符串连接函数生成表B,其中单个字段称为“json”,内容为来自A的丰富数据,转换为json字符串

  • export B to GCS as set of files F
  • 将B作为文件集F导出到GCS

  • load F as table C
  • 将F加载为表C.

Is there an easier way to do it?

有更简单的方法吗?

1 个解决方案

#1


1  

To enrich schema of existing table one can use tables patch API
https://cloud.google.com/bigquery/docs/reference/v2/tables/patch

为了丰富现有表的架构,可以使用表补丁API https://cloud.google.com/bigquery/docs/reference/v2/tables/patch

Request will look like below

请求将如下所示

PATCH https://www.googleapis.com/bigquery/v2/projects/{project_id}/datasets/{dataset_id}/tables/{table_id}?key={YOUR_API_KEY}

{
 "schema": {
  "fields": [
   {
    "name": "page_event",
    "mode": "repeated",
    "type": "RECORD",
    "fields": [
     {
      "name": "id",
      "type": "STRING"
     },
     {
      "name": "description",
      "type": "STRING"
     }
    ]
   }
  ]
 }
}

Before Patch

如何在BigQuery中完全使用嵌套模式创建新表

After Patch

如何在BigQuery中完全使用嵌套模式创建新表

#1


1  

To enrich schema of existing table one can use tables patch API
https://cloud.google.com/bigquery/docs/reference/v2/tables/patch

为了丰富现有表的架构,可以使用表补丁API https://cloud.google.com/bigquery/docs/reference/v2/tables/patch

Request will look like below

请求将如下所示

PATCH https://www.googleapis.com/bigquery/v2/projects/{project_id}/datasets/{dataset_id}/tables/{table_id}?key={YOUR_API_KEY}

{
 "schema": {
  "fields": [
   {
    "name": "page_event",
    "mode": "repeated",
    "type": "RECORD",
    "fields": [
     {
      "name": "id",
      "type": "STRING"
     },
     {
      "name": "description",
      "type": "STRING"
     }
    ]
   }
  ]
 }
}

Before Patch

如何在BigQuery中完全使用嵌套模式创建新表

After Patch

如何在BigQuery中完全使用嵌套模式创建新表