Dataflow can't read a BigQuery dataset in the "asia-northeast1" region

Time: 2021-08-14 15:24:04

I have a BigQuery dataset located in the new "asia-northeast1" region. I'm trying to run a templated Dataflow pipeline (running in the Australia region) to read a table from it. It throws the following error, even though the dataset/table does indeed exist:


Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
{
  "code" : 404,
  "errors" : [ {
    "domain" : "global",
    "message" : "Not found: Dataset grey-sort-challenge:Konnichiwa_Tokyo",
    "reason" : "notFound"
  } ],
  "message" : "Not found: Dataset grey-sort-challenge:Konnichiwa_Tokyo"
}

Am I doing something wrong here?


import java.util.stream.Collectors;

import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.options.Default;
import org.apache.beam.sdk.options.Description;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.options.ValueProvider;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import com.google.api.services.bigquery.model.TableRow;

import static org.apache.beam.runners.dataflow.options.DataflowPipelineWorkerPoolOptions.AutoscalingAlgorithmType.THROUGHPUT_BASED;
import static org.apache.beam.sdk.io.FileBasedSink.CompressionType.GZIP;

/**
 * BigQuery -> ParDo -> GCS (one file)
 */
public class BigQueryTableToOneFile {
    public static void main(String[] args) throws Exception {
        PipelineOptionsFactory.register(TemplateOptions.class);
        TemplateOptions options = PipelineOptionsFactory
                .fromArgs(args)
                .withValidation()
                .as(TemplateOptions.class);
        options.setAutoscalingAlgorithm(THROUGHPUT_BASED);
        Pipeline pipeline = Pipeline.create(options);
        pipeline.apply(BigQueryIO.read().from(options.getBigQueryTableName()).withoutValidation())
                .apply(ParDo.of(new DoFn<TableRow, String>() {
                    @ProcessElement
                    public void processElement(ProcessContext c) throws Exception {
                        String commaSep = c.element().values()
                                .stream()
                                .map(cell -> cell.toString().trim())
                                .collect(Collectors.joining("\",\""));
                        c.output(commaSep);
                    }
                }))
                .apply(TextIO.write().to(options.getOutputFile())
                        .withoutSharding()
                        .withWritableByteChannelFactory(GZIP)
                );
        pipeline.run();
    }

    public interface TemplateOptions extends DataflowPipelineOptions {
        @Description("The BigQuery table to read from in the format project:dataset.table")
        @Default.String("bigquery-samples:wikipedia_benchmark.Wiki1k")
        ValueProvider<String> getBigQueryTableName();

        void setBigQueryTableName(ValueProvider<String> value);

        @Description("The name of the output file to produce in the format gs://bucket_name/filename.csv")
        @Default.String("gs://bigquery-table-to-one-file/output/bar.csv.gz")
        ValueProvider<String> getOutputFile();

        void setOutputFile(ValueProvider<String> value);
    }
}
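For what it's worth, a quick way to rule out a malformed table spec (as opposed to a region problem) is to check it against the project:dataset.table shape that BigQueryIO.read().from(...) expects. A minimal sketch — the helper below is hypothetical, not part of Beam:

```java
// Hypothetical helper: splits a "project:dataset.table" spec into its parts,
// throwing if the shape is wrong. Handy as a sanity check before launching the pipeline.
public class TableSpecCheck {
    static String[] parse(String spec) {
        int colon = spec.indexOf(':');
        int dot = spec.indexOf('.', colon + 1);
        if (colon < 0 || dot < 0) {
            throw new IllegalArgumentException(
                    "Expected project:dataset.table, got: " + spec);
        }
        return new String[] {
                spec.substring(0, colon),        // project
                spec.substring(colon + 1, dot),  // dataset
                spec.substring(dot + 1)          // table
        };
    }

    public static void main(String[] args) {
        for (String part : parse("bigquery-samples:wikipedia_benchmark.Wiki1k")) {
            System.out.println(part);
        }
    }
}
```

In this case the spec parses fine, which points the finger back at the dataset's location rather than the reference itself.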

Args:

--project=grey-sort-challenge
--runner=DataflowRunner
--jobName=bigquery-table-to-one-file
--maxNumWorkers=1
--zone=australia-southeast1-a
--stagingLocation=gs://bigquery-table-to-one-file/jars
--tempLocation=gs://bigquery-table-to-one-file/tmp
--templateLocation=gs://bigquery-table-to-one-file/template

Job id: 2018-05-05_05_37_08-8260293482986343692



1 Answer

#1

Sorry about that issue. It will be addressed in the upcoming Beam SDK 2.5.0 (you can try using current head snapshots from the Beam repo)
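For anyone wanting to try a head snapshot before 2.5.0 lands, a sketch of what that could look like in a Maven pom.xml — the repository URL and snapshot version string are assumptions and may have moved on:

```xml
<!-- Assumed snapshot repository and version; adjust to whatever the Beam repo currently publishes. -->
<repositories>
  <repository>
    <id>apache.snapshots</id>
    <url>https://repository.apache.org/snapshots/</url>
  </repository>
</repositories>

<dependencies>
  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
    <version>2.5.0-SNAPSHOT</version>
  </dependency>
</dependencies>
```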
