Does Google BigQuery support ARRAY<STRING>?

Time: 2021-09-18 14:35:10

I am pushing data from Google Dataflow to Google BigQuery. I have a TableRow object with data in it, and one of its columns contains an array of strings.

From here, I found that Google BigQuery supports an array column type, so I tried to create the table with ARRAY<STRING> as the column type. But I got the error below:

com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
{
  "code" : 400,
  "errors" : [ {
    "domain" : "global",
    "message" : "Invalid value for: ARRAY<STRING> is not a valid value",
    "reason" : "invalid"
  } ],
  "message" : "Invalid value for: ARRAY<STRING> is not a valid value"
}
com.google.cloud.dataflow.sdk.util.UserCodeException.wrapIf(UserCodeException.java:47)
com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.wrapUserCodeException(DoFnRunnerBase.java:369)
com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.finishBundle(DoFnRunnerBase.java:162)
com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn.finishBundle(SimpleParDoFn.java:194)
com.google.cloud.dataflow.sdk.runners.worker.ForwardingParDoFn.finishBundle(ForwardingParDoFn.java:47)

Here is the code that I use to write values into BigQuery:

    .apply(BigQueryIO.Write.named("Write enriched data")
               .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
               .withSchema(getSchema())
               .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
               .to("table_name"));

And here is the schema construction:

private static TableSchema getSchema() {
    List<TableFieldSchema> fields = new ArrayList<>();

    fields.add(new TableFieldSchema().setName("column1").setType("STRING"));
    fields.add(new TableFieldSchema().setName("column2").setType("STRING"));
    fields.add(new TableFieldSchema().setName("array_column").setType("ARRAY<STRING>"));

    return new TableSchema().setFields(fields);
}

How can I insert an array of strings into a BigQuery table?

1 Answer

#1


4  

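To define an ARRAY<STRING> column in BigQuery, I set the field's type to 'STRING' and its mode to 'REPEATED'. "ARRAY<STRING>" itself is not accepted as a type name in the table schema, which is why the request fails with 400.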

In Python, for instance, it's defined as field = SchemaField(name='field_1', field_type='STRING', mode='REPEATED')

From what I can see, the Java client offers the same options: you can define the type as STRING and the mode as REPEATED.
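As a rough illustration of the type-plus-mode idea, here is a minimal sketch that describes the question's three columns as name/type/mode entries and prints them in the JSON layout BigQuery uses for schema fields. It deliberately uses only plain Java collections so it runs without the client library. The class name RepeatedFieldSketch and the helpers field, schema and toJson are hypothetical names made up for this example, and NULLABLE is written out explicitly although it is the default mode. With the TableFieldSchema class from the question, the corresponding change would presumably be to keep setType("STRING") on array_column and chain setMode("REPEATED") onto it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the schema objects; not part of any Google SDK.
public class RepeatedFieldSketch {

    // Describes one column with the three attributes of a BigQuery
    // schema field: name, type, and mode.
    static Map<String, String> field(String name, String type, String mode) {
        Map<String, String> f = new LinkedHashMap<>();
        f.put("name", name);
        f.put("type", type);
        f.put("mode", mode);
        return f;
    }

    // Schema from the question, with the array column declared as a
    // REPEATED STRING instead of the rejected "ARRAY<STRING>" type.
    static List<Map<String, String>> schema() {
        List<Map<String, String>> fields = new ArrayList<>();
        fields.add(field("column1", "STRING", "NULLABLE"));
        fields.add(field("column2", "STRING", "NULLABLE"));
        fields.add(field("array_column", "STRING", "REPEATED"));
        return fields;
    }

    // Renders the fields as a JSON array of objects. Assumes the values
    // contain no characters that would need JSON escaping.
    static String toJson(List<Map<String, String>> fields) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) sb.append(",");
            sb.append("{");
            int j = 0;
            for (Map.Entry<String, String> e : fields.get(i).entrySet()) {
                if (j++ > 0) sb.append(",");
                sb.append('"').append(e.getKey()).append("\":\"")
                  .append(e.getValue()).append('"');
            }
            sb.append("}");
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        System.out.println(toJson(schema()));
    }
}
```

On the data side, the value stored under array_column in each TableRow would then be a list of strings (for example a java.util.List<String>) rather than a single string.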
