Export Google Cloud Datastore and import it into BigQuery programmatically

Date: 2021-08-05 14:55:41

I'm looking for a way to export my Cloud Datastore and import it into BigQuery daily. The manual process is described in Google's documentation, but I can't find a clean way to automate it.

4 Answers

#1


There isn't a simple way to do this, but you can separate it into two parts: creating App Engine backups and loading them into BigQuery.

You can use scheduled backups to create Datastore backups periodically (https://cloud.google.com/appengine/articles/scheduled_backups).

You can then use Apps Script to automate the BigQuery portion (https://developers.google.com/apps-script/advanced/bigquery#load_csv_data), or use an App Engine cron job to do the same thing.
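For the BigQuery portion, the load can also be driven directly from the BigQuery API rather than Apps Script. Here is a minimal Python sketch, assuming hypothetical project, dataset, and backup-path names; the google-cloud-bigquery client accepts Datastore backups as a load source:

```python
from google.cloud import bigquery

# All names below are hypothetical placeholders.
client = bigquery.Client(project="my-project")

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.DATASTORE_BACKUP,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,  # replace the table on each daily run
)

# Point at the backup_info metadata file produced by the Datastore backup.
uri = "gs://my-backup-bucket/path/to/MyKind.backup_info"

load_job = client.load_table_from_uri(
    uri, "my_dataset.my_kind_table", job_config=job_config
)
load_job.result()  # wait for the load job to finish
```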

#2


As of last week there's a proper way to automate this. The most important part is gcloud beta datastore export.

I created a script around it: https://github.com/chees/datastore2bigquery. You could run this in a cron job.
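The gcloud command wraps the Datastore Admin API, so the export can also be triggered from code instead of a shell script. A minimal Python sketch, assuming a hypothetical project and bucket (the datastore_admin_v1 client ships with the google-cloud-datastore package):

```python
from google.cloud import datastore_admin_v1

# Project and bucket names are hypothetical placeholders.
client = datastore_admin_v1.DatastoreAdminClient()

operation = client.export_entities(
    request={
        "project_id": "my-project",
        "output_url_prefix": "gs://my-backup-bucket",
    }
)

response = operation.result()  # block until the managed export completes
print(response.output_url)     # gs:// prefix holding the export metadata
```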

See here for a demo of how it works: https://www.youtube.com/watch?v=dGyQCE3bWkU

#3


Building on @Jordan's answer above, the steps to do this would be:

1) Make a storage bucket (see the sketch after this list)

2) Export your Datastore entities to this bucket

3) Open the BigQuery web UI and load the data using the Google Cloud Storage file path.
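A programmatic version of step 1, as a minimal Python sketch with hypothetical names (steps 2 and 3 can follow the export and load sketches in the answers above):

```python
from google.cloud import storage

# Project and bucket name are hypothetical placeholders.
client = storage.Client(project="my-project")
bucket = client.create_bucket("my-datastore-exports", location="US")
print(f"Created bucket gs://{bucket.name}")
```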

A full tutorial with images is available at this post.

#4


It is possible using the following code. It basically uses App Engine cron jobs and the BigQuery API.

https://github.com/wenzhe/appengine_datastore_bigquery
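Tying the pieces together, a minimal App Engine-style handler could kick off the export on a schedule. The sketch below is an assumption-laden illustration, not the linked repo's code: it uses Flask and the Datastore Admin API with hypothetical names, and a cron.yaml entry pointing at /tasks/export would run it daily:

```python
from flask import Flask
from google.cloud import datastore_admin_v1

app = Flask(__name__)

@app.route("/tasks/export")
def export_datastore():
    # Project and bucket names are hypothetical placeholders.
    client = datastore_admin_v1.DatastoreAdminClient()
    client.export_entities(
        request={
            "project_id": "my-project",
            "output_url_prefix": "gs://my-backup-bucket",
        }
    )
    # The export runs asynchronously; a follow-up job can load the
    # result into BigQuery once the operation completes.
    return "Export started", 200
```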
