We have a large dataset from an App Engine app in our Datastore. Now I want to run some ETL on it and push the results to BigQuery, and I thought of using a Dataflow batch job.
All examples I find are using this class to query the Datastore:
import com.google.api.services.datastore.DatastoreV1.Query;
And that does work. However, I'm not familiar with this DatastoreV1 API and would like to use the API provided with the App Engine SDK, like this:
import com.google.appengine.api.datastore.Query;
The problem is that the DatastoreIO doesn't accept these queries:
PCollection<Entity> projects = p.apply(Read.from(DatastoreIO.source().withQuery(q).withDataset(DATASET_ID)));
It will only take DatastoreV1.Query objects. Is there any way to use the APIs provided by App Engine? I'm much more familiar with those calls. Better yet, if we could use Objectify, that would be awesome :)
Thanks!
1 Answer
#1
0
This isn't possible with the current implementation of the API: DatastoreIO only accepts DatastoreV1.Query objects. We can look at adding this as a feature, and would gladly accept a pull request to expand the current functionality. The AppEngine team is also actively working on increasing interoperability between their SDK and the Datastore API.
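Until something like that exists, queries have to be rewritten by hand against DatastoreV1. As a rough illustration of what such a translation involves, here is a minimal sketch that maps the names of the App Engine SDK's Query.FilterOperator constants to the names of the DatastoreV1 PropertyFilter.Operator constants. The class FilterOperatorMapping and its method toV1Operator are hypothetical names made up for this example and are not part of either SDK. It works on plain strings so that it runs without either library on the classpath, and it assumes the operator sets as I understand them: NOT_EQUAL and IN have no single-filter counterpart in DatastoreV1.

```java
import java.util.Optional;

// Hypothetical helper for illustration only; not part of the App Engine SDK
// or the Dataflow/Datastore client libraries.
final class FilterOperatorMapping {

    private FilterOperatorMapping() {}

    // Takes the name of a com.google.appengine.api.datastore.Query.FilterOperator
    // constant and returns the name of the matching
    // DatastoreV1.PropertyFilter.Operator constant, or an empty Optional when
    // there is no direct one-to-one equivalent.
    static Optional<String> toV1Operator(String appEngineOperator) {
        switch (appEngineOperator) {
            case "EQUAL":
            case "LESS_THAN":
            case "LESS_THAN_OR_EQUAL":
            case "GREATER_THAN":
            case "GREATER_THAN_OR_EQUAL":
                // These five comparison operators share the same name in both APIs.
                return Optional.of(appEngineOperator);
            default:
                // NOT_EQUAL and IN (and anything unrecognised) cannot be
                // expressed as a single DatastoreV1 property filter.
                return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String[] samples = {"EQUAL", "GREATER_THAN", "NOT_EQUAL", "IN"};
        for (String op : samples) {
            System.out.println(op + " -> "
                + toV1Operator(op).orElse("(no direct equivalent)"));
        }
    }
}
```

Ancestor queries are a separate case: in the App Engine SDK they are set on the Query itself, whereas DatastoreV1 expresses them as a HAS_ANCESTOR filter on the key, so they would need their own handling in any real converter.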