How to set diskSourceImage in a Google Dataflow pipeline

Date: 2021-12-25 15:38:54

I've been trying to use custom-made images to run my Google Dataflow pipeline. Based on the information at https://cloud.google.com/compute/docs/reference/latest/images, I tested the following code snippets:

DataflowPipelineOptions options = PipelineOptionsFactory.create().as(DataflowPipelineOptions.class);
...
// Each of the following calls was tried separately.
// Attempt 1: bare image name
options.setDiskSourceImage("ubuntu-1504-vivid-v20150911");
// Attempt 2: partial resource path
options.setDiskSourceImage("projects/ubuntu-os-cloud/global/images/ubuntu-1504-vivid-v20150911");
// Attempt 3: full URL using the 'beta' API version
options.setDiskSourceImage("https://www.googleapis.com/compute/beta/projects/ubuntu-os-cloud/global/images/ubuntu-1504-vivid-v20150911");

All of the above attempts led to the following error in my pipeline:

(b9c7b66a676906f4): Unable to create VMs. Causes: (b9c7b66a67690aef): Error: Message: Invalid value for field 'resource.disks[0].initializeParams.sourceImage': '[edited]'. Must be the URL to a Compute resource of the correct type HTTP Code: 400

2 answers

#1


1  

Using a custom disk image with Dataflow is not a viable option. The diskSourceImage flag is deprecated and will be removed in a future SDK release. It is no longer supported because the Dataflow service relies on versioned resources in the VM image, so Dataflow needs control of the VM image in order to upgrade it as necessary. If users supply their own custom images, we have no way of keeping them in sync with the requirements of the Dataflow service.

If your custom VM image is based on a Dataflow image, you would be able to run jobs with it only until the next Dataflow VM image is released. There is no reasonable way to keep a custom image in sync with Dataflow's VM images, so this would not keep working.

If you would like to customize the VM image, please let us know why (e.g. send us an email at dataflow-feedback@google.com) so that we can either suggest an alternative solution or consider supporting your use case in the future.

#2


0  

There's a subtle issue with setDiskSourceImage: it uses 'beta' instead of the current 'v1' version of the Compute Engine API. If you try the following, it should work:

options.setDiskSourceImage("https://www.googleapis.com/compute/v1/projects/ubuntu-os-cloud/global/images/ubuntu-1504-vivid-v20150911");
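For reference, the URL above can also be assembled from its parts. The following is only a minimal sketch: the class ImageUrls and its methods v1ImageUrl and betaToV1 are hypothetical helpers, not part of the Dataflow SDK, and it uses plain string handling only. It does not guarantee that the Dataflow service will accept the image, since answer #1 states the flag is deprecated.

```java
public final class ImageUrls {
    // Base of the Compute Engine 'v1' API, as used in the URL in this answer.
    private static final String COMPUTE_V1 = "https://www.googleapis.com/compute/v1/";

    private ImageUrls() {}

    /** Builds the full 'v1' image URL from a project ID and an image name (hypothetical helper). */
    public static String v1ImageUrl(String project, String image) {
        return COMPUTE_V1 + "projects/" + project + "/global/images/" + image;
    }

    /** Rewrites a 'beta' URL to 'v1'; strings without "/compute/beta/" are returned unchanged. */
    public static String betaToV1(String url) {
        return url.replace("/compute/beta/", "/compute/v1/");
    }

    public static void main(String[] args) {
        // Prints the same URL that is passed to setDiskSourceImage in this answer.
        System.out.println(v1ImageUrl("ubuntu-os-cloud", "ubuntu-1504-vivid-v20150911"));
    }
}
```

The resulting string would then be passed to options.setDiskSourceImage(...) in place of the hard-coded literal.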
