Apache Beam / GCP Dataflow: reading video/image files

Time: 2021-09-13 15:36:50

I am struggling to understand how to create a pipeline that reads and manipulates a video/image file. The documentation is quite bare-bones, with no examples and few comments. Maybe the beam.io package could help, e.g. the LocalFileSystem class.

However, I have no idea how to use it to create a working pipeline that reads a file and applies some transform (e.g. frame extraction with ffmpeg).

I am using Python; however, if Java is better documented I can switch.

Any examples? Any help? Thanks in advance.

1 answer

#1


0  

IMHO, you can make ffmpeg available on the workers in order to use it for image/video processing. To upload specified resources to the workers instead of the default ones, use the filesToStage pipeline option. To use this option, you should use the Java SDK, since it is not available in Python.

See this SO question for more details about using ffmpeg in a pipeline, and this question for an overview of the process.
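
Since the question mentions Python, here is a rough sketch of what the reading and frame-extraction part might look like with the Python SDK. It is not taken from the linked questions. It assumes an ffmpeg binary is already on each worker's PATH (how it gets there is outside this sketch), that each file fits in worker memory, and the names ffmpeg_frame_command, extract_frames and run are made up for illustration. It uses fileio.MatchFiles and fileio.ReadMatches rather than LocalFileSystem directly.

```python
import os
import subprocess
import tempfile


def ffmpeg_frame_command(video_path, out_dir, fps=1):
    """Build the ffmpeg argument list that writes `fps` frames per second as PNGs."""
    return [
        "ffmpeg", "-i", video_path,
        "-vf", f"fps={fps}",
        os.path.join(out_dir, "frame_%05d.png"),
    ]


def extract_frames(readable_file, fps=1):
    """Copy one matched file to local disk, run ffmpeg, and yield
    (source_path, frame_name, frame_bytes) for every frame produced.

    `readable_file` is expected to be a fileio.ReadableFile.
    Assumption: ffmpeg is installed on the worker and on its PATH.
    """
    source = readable_file.metadata.path
    with tempfile.TemporaryDirectory() as tmp:
        local_video = os.path.join(tmp, "input" + os.path.splitext(source)[1])
        with open(local_video, "wb") as f:
            f.write(readable_file.read())  # loads the whole file into memory
        subprocess.run(ffmpeg_frame_command(local_video, tmp, fps), check=True)
        for name in sorted(os.listdir(tmp)):
            if name.startswith("frame_"):
                with open(os.path.join(tmp, name), "rb") as f:
                    yield source, name, f.read()


def run(file_pattern, argv=None):
    """Match files, extract frames from each one, and print a summary line per frame."""
    # Imported here so the helpers above can be used without Beam installed.
    import apache_beam as beam
    from apache_beam.io import fileio
    from apache_beam.options.pipeline_options import PipelineOptions

    with beam.Pipeline(options=PipelineOptions(argv)) as p:
        (
            p
            | "Match" >> fileio.MatchFiles(file_pattern)
            | "Open" >> fileio.ReadMatches()
            | "ExtractFrames" >> beam.FlatMap(extract_frames)
            | "Describe" >> beam.Map(lambda t: f"{t[0]} {t[1]} {len(t[2])} bytes")
            | "Print" >> beam.Map(print)
        )
```

Calling run with a glob such as a local path pattern or a gs:// pattern would start the pipeline. In a real job the final steps would probably write the frames somewhere instead of printing a summary.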
