One of my batch jobs failed tonight with a runtime exception. It writes data to Datastore, just like the roughly 200 other jobs that ran tonight. This one failed with a very long chain of causes; the root of it should be this:
Caused by: com.google.datastore.v1.client.DatastoreException: I/O error, code=UNAVAILABLE
at com.google.datastore.v1.client.RemoteRpc.makeException(RemoteRpc.java:126)
at com.google.datastore.v1.client.RemoteRpc.call(RemoteRpc.java:95)
at com.google.datastore.v1.client.Datastore.commit(Datastore.java:84)
at com.google.cloud.dataflow.sdk.io.datastore.DatastoreV1$DatastoreWriterFn.flushBatch(DatastoreV1.java:925)
at com.google.cloud.dataflow.sdk.io.datastore.DatastoreV1$DatastoreWriterFn.processElement(DatastoreV1.java:892)
Caused by: java.io.IOException: insufficient data written
at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.close(HttpURLConnection.java:3501)
at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:81)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:981)
at com.google.datastore.v1.client.RemoteRpc.call(RemoteRpc.java:87)
at com.google.datastore.v1.client.Datastore.commit(Datastore.java:84)
at com.google.cloud.dataflow.sdk.io.datastore.DatastoreV1$DatastoreWriterFn.flushBatch(DatastoreV1.java:925)
at com.google.cloud.dataflow.sdk.io.datastore.DatastoreV1$DatastoreWriterFn.processElement(DatastoreV1.java:892)
at com.google.cloud.dataflow.sdk.util.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:49)
at com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.processElement(DoFnRunnerBase.java:139)
at com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:188)
at com.google.cloud.dataflow.sdk.runners.worker.ForwardingParDoFn.processElement(ForwardingParDoFn.java:42)
at com.google.cloud.dataflow.sdk.runners.
How can this happen? It's very similar to all the other jobs I run. I am using Dataflow SDK version 1.9.0 and the standard DatastoreIO.v1().write....
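For context, a minimal sketch of how such a write is typically wired up with the 1.9.0 SDK; the project ID, kind, and entity contents below are placeholders, not the actual job code:

```java
import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.coders.protobuf.ProtoCoder;
import com.google.cloud.dataflow.sdk.io.datastore.DatastoreIO;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
import com.google.cloud.dataflow.sdk.transforms.Create;
import com.google.datastore.v1.Entity;

import static com.google.datastore.v1.client.DatastoreHelper.makeKey;
import static com.google.datastore.v1.client.DatastoreHelper.makeValue;

public class DatastoreWriteSketch {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // One example entity; in the real job the PCollection<Entity> comes
    // from the upstream transforms.
    Entity entity = Entity.newBuilder()
        .setKey(makeKey("ExampleKind", "example-id"))               // placeholder kind/name
        .putProperties("payload", makeValue("some value").build())
        .build();

    p.apply(Create.of(entity).withCoder(ProtoCoder.of(Entity.class)))
        // The standard Datastore sink from the 1.9.0 SDK, as used in the failing job.
        .apply(DatastoreIO.v1().write().withProjectId("my-gcp-project")); // placeholder project

    p.run();
  }
}
```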
The job IDs with this error message:
2017-08-29_17_05_19-6961364220840664744
2017-08-29_16_40_46-15665765683196208095
Is it possible to retrieve the errors/logs of a job from an outside application (not the Cloud Console), so that jobs can be restarted automatically when they fail for temporary reasons such as quota issues but would normally succeed? Thanks in advance.
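A rough, untested sketch of what I have in mind for that last part, assuming the generated google-api-services-dataflow (v1b3) client can be used from an outside application to read a job's state and error-level messages; the project ID, application name, and retry decision are placeholders:

```java
import com.google.api.client.googleapis.auth.oauth2.GoogleCredential;
import com.google.api.client.googleapis.javanet.GoogleNetHttpTransport;
import com.google.api.client.json.jackson2.JacksonFactory;
import com.google.api.services.dataflow.Dataflow;
import com.google.api.services.dataflow.model.Job;
import com.google.api.services.dataflow.model.JobMessage;
import com.google.api.services.dataflow.model.ListJobMessagesResponse;

import java.util.Collections;

public class JobMonitorSketch {
  public static void main(String[] args) throws Exception {
    String projectId = "my-gcp-project";                       // placeholder
    String jobId = "2017-08-29_17_05_19-6961364220840664744";  // one of the failed jobs

    GoogleCredential credential = GoogleCredential.getApplicationDefault()
        .createScoped(Collections.singleton("https://www.googleapis.com/auth/cloud-platform"));

    Dataflow dataflow = new Dataflow.Builder(
            GoogleNetHttpTransport.newTrustedTransport(),
            JacksonFactory.getDefaultInstance(),
            credential)
        .setApplicationName("job-monitor")                     // placeholder
        .build();

    // Current state of the job, e.g. JOB_STATE_DONE or JOB_STATE_FAILED.
    Job job = dataflow.projects().jobs().get(projectId, jobId).execute();
    System.out.println("State: " + job.getCurrentState());

    // Error-level job messages, roughly what the Cloud Console shows.
    ListJobMessagesResponse response = dataflow.projects().jobs().messages()
        .list(projectId, jobId)
        .setMinimumImportance("JOB_MESSAGE_ERROR")
        .execute();
    if (response.getJobMessages() != null) {
      for (JobMessage message : response.getJobMessages()) {
        System.out.println(message.getMessageText());
        // Here one could match transient errors (quota, UNAVAILABLE, ...)
        // and trigger a resubmission of the pipeline.
      }
    }
  }
}
```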
1 solution
#1
0
This is most likely because DatastoreIO is trying to write more mutations in one RPC call than the Datastore RPC size limit allows. This is data-dependent: perhaps the data for this job differs somewhat from the data for the other jobs. In any case, this issue was fixed in 2.1.0, so updating your SDK should help.
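For reference, the 2.x SDKs are Apache Beam based, so after upgrading the Datastore sink comes from the org.apache.beam packages. A minimal sketch with a placeholder project ID; Create.empty only stands in for the job's real input:

```java
import com.google.datastore.v1.Entity;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.extensions.protobuf.ProtoCoder;
import org.apache.beam.sdk.io.gcp.datastore.DatastoreIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;

public class DatastoreWriteV2Sketch {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Placeholder input; the real job produces a PCollection<Entity> upstream.
    p.apply(Create.empty(ProtoCoder.of(Entity.class)))
        // Same sink as before, now from the Beam-based 2.x packages; per the
        // answer above, the RPC size issue is fixed as of 2.1.0.
        .apply(DatastoreIO.v1().write().withProjectId("my-gcp-project")); // placeholder

    p.run().waitUntilFinish();
  }
}
```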