Data transfer from Hive to Google Storage / BigQuery

Date: 2022-12-16 15:25:51

I have some Hive tables in an on-premise Hadoop cluster.
I need to transfer the tables to BigQuery in Google Cloud.

Can you suggest any Google tools or open-source tools for the data transfer?

Thanks in advance

1 solution

#1


BigQuery can import Avro files.

This means you can do something like:

INSERT OVERWRITE TABLE target_avro_hive_table SELECT * FROM source_hive_table;
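
For that INSERT to work, target_avro_hive_table must already exist as an Avro-backed Hive table. A minimal sketch of such a table, with placeholder columns (the id/name schema below is an assumption; match it to source_hive_table):

-- Placeholder schema: swap id/name for the actual columns of source_hive_table.
CREATE TABLE target_avro_hive_table (
  id BIGINT,
  name STRING
)
STORED AS AVRO;  -- tells Hive to write the table's data files in Avro format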

You can then load the underlying .avro files into BigQuery with the bq command-line tool or the console UI:

bq load --source_format=AVRO your_dataset.something something.avro

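
The .avro files Hive produces live in HDFS on the on-premise cluster, so they first have to be copied somewhere bq can read them: local disk, or for anything large, a Cloud Storage bucket. A sketch of that step, assuming the Cloud Storage connector is installed on the cluster; the bucket name and paths below are placeholders:

# Copy the table's Avro data files from HDFS to Cloud Storage.
hadoop distcp \
    hdfs:///user/hive/warehouse/target_avro_hive_table \
    gs://your_bucket/target_avro_hive_table

# Load every file under that prefix in one job. A bare * is used because
# Hive's output files (e.g. 000000_0) usually have no .avro extension.
bq load --source_format=AVRO \
    your_dataset.something \
    "gs://your_bucket/target_avro_hive_table/*"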
