I have a streaming Dataflow job running that reads from a Pub/Sub subscription.
After a period of time, or perhaps after processing a certain amount of data, I want the pipeline to stop by itself. I don't want my Compute Engine instances running indefinitely.
When I cancel the job through the Dataflow console, it shows up as a failed job.
Is there a way to achieve this? Am I missing something, or is that feature missing from the API?
2 Solutions
#1
Could you do something like this?
Pipeline pipeline = ...;
... (construct the streaming pipeline) ...

final DataflowPipelineJob job =
    DataflowPipelineRunner.fromOptions(pipelineOptions)
        .run(pipeline);

Thread.sleep(yourTimeoutMillis); // let the job run for the desired duration
job.cancel();
#2
I was able to drain (cancel without losing data) a running streaming job on Dataflow with the REST API.
See my answer.
Use the REST Update method, with this body:

{ "requestedState": "JOB_STATE_DRAINING" }
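As a concrete sketch of that REST call (assuming the `v1b3` `projects.locations.jobs.update` endpoint and a `gcloud`-authenticated environment; `PROJECT_ID`, `REGION`, and `JOB_ID` are placeholders you must fill in):

```shell
# Request that a running streaming Dataflow job drain:
# it stops pulling new data, finishes in-flight work, then terminates.
curl -X PUT \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  -d '{"requestedState": "JOB_STATE_DRAINING"}' \
  "https://dataflow.googleapis.com/v1b3/projects/PROJECT_ID/locations/REGION/jobs/JOB_ID"
```

The same state transition can also be requested from your pipeline's own language via the Dataflow client libraries, so the sleep-then-stop pattern from answer #1 can drain instead of cancel.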