How to stop a streaming pipeline in Google Cloud Dataflow

Date: 2022-02-01 15:23:13

I have a streaming Dataflow pipeline running that reads from a Pub/Sub subscription.

After a period of time, or perhaps after processing a certain amount of data, I want the pipeline to stop by itself. I don't want my Compute Engine instances to run indefinitely.

When I cancel the job through the Dataflow console, it is shown as a failed job.

Is there a way to achieve this? Am I missing something, or is that feature missing from the API?

2 Answers

#1


Could you do something like this?


Pipeline pipeline = ...;
// ... construct the streaming pipeline ...
final DataflowPipelineJob job =
    DataflowPipelineRunner.fromOptions(pipelineOptions)
                          .run(pipeline);
Thread.sleep(timeoutMillis);  // your desired running time, in milliseconds
job.cancel();
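Generalizing the snippet above, the question also asks to stop after a certain amount of data. A minimal sketch of that shutdown logic is below; `StoppableJob` is a hypothetical stand-in for `DataflowPipelineJob` (the real class exposes `cancel()`, and a processed-element count would have to come from the job's aggregator or metric values):

```java
public class JobStopper {
    // Hypothetical stand-in for DataflowPipelineJob: the real class has cancel(),
    // while elementsProcessed() here abstracts over reading a job metric.
    interface StoppableJob {
        boolean isDone();
        long elementsProcessed();
        void cancel() throws Exception;
    }

    // Poll until the job finishes on its own, the wall-clock limit passes,
    // or the element limit is reached; then cancel if it is still running.
    static void stopAfterLimit(StoppableJob job, long maxRuntimeMillis,
                               long maxElements, long pollMillis) throws Exception {
        long deadline = System.currentTimeMillis() + maxRuntimeMillis;
        while (!job.isDone()
                && System.currentTimeMillis() < deadline
                && job.elementsProcessed() < maxElements) {
            Thread.sleep(pollMillis);
        }
        if (!job.isDone()) {
            job.cancel();  // note: cancel discards in-flight data; drain does not
        }
    }
}
```

Bear in mind that `cancel()` drops any in-flight data; if that matters, prefer draining as described in answer #2.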

#2



I was able to drain (cancel a job without losing data) a running streaming job on Dataflow with the REST API.

See my answer


Use the REST Update method with this body:

{ "requestedState": "JOB_STATE_DRAINING" }

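For reference, the update call is an HTTP PUT against the job resource. Here is a minimal Java sketch that only builds the request URL and body (the project, region, and job ID below are made-up placeholders, and actually sending the request would additionally need an OAuth 2.0 bearer token):

```java
public class DrainRequest {
    // projects.locations.jobs.update endpoint of the Dataflow REST API (HTTP PUT).
    static String drainUrl(String project, String location, String jobId) {
        return "https://dataflow.googleapis.com/v1b3/projects/" + project
             + "/locations/" + location + "/jobs/" + jobId;
    }

    // Request body asking the service to drain rather than cancel the job.
    static final String DRAIN_BODY = "{ \"requestedState\": \"JOB_STATE_DRAINING\" }";

    public static void main(String[] args) {
        // Placeholder IDs for illustration only.
        System.out.println("PUT " + drainUrl("my-project", "us-central1",
                                             "2022-02-01_00_00_00-123"));
        System.out.println(DRAIN_BODY);
    }
}
```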
