I have a video training course supplied as AVI files. Most of the screens are shown as slides with a mouse pointer moving around on them.
I'd like to capture a screenshot of the slide automatically whenever the screen changes, ignoring small changes caused by the mouse pointer moving around.
I want to do this so I can paste the images into a Word or HTML document that I can annotate as I learn. At the moment I'm taking screenshots by hand, which is slow and tedious, and the course is very long (around 24 hours of total play time).
I know Python well, but I'm unsure how to extract stills from a video file, and then how to compare one still with the next to see how much they differ, so I can decide which to keep and which to discard.
Can anyone suggest how to go about doing this?
2 Answers
#1
10
A tool like ffmpeg is suited for extracting images from a video. From the manual:
ffmpeg -i foo.avi -r 1 -s WxH -f image2 foo-%03d.jpeg
This will extract one video frame per second from the video and will output them in files named foo-001.jpeg, foo-002.jpeg, etc. Images will be rescaled to fit the new WxH values.
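If you'd rather drive the extraction from Python than type the command by hand, a thin subprocess wrapper around the same invocation is enough. A sketch, assuming ffmpeg is on your PATH (the `extract_frames` helper name and its defaults are my own, not part of ffmpeg):

```python
import subprocess

def extract_frames_cmd(video_path, out_pattern="frame-%04d.jpeg", fps=1):
    """Build the same ffmpeg command line as the manual's example."""
    return ["ffmpeg", "-i", video_path,
            "-r", str(fps),      # frames per second to sample
            "-f", "image2",      # image-sequence muxer
            out_pattern]         # e.g. frame-0001.jpeg, frame-0002.jpeg, ...

def extract_frames(video_path, **kwargs):
    """Run ffmpeg; raises CalledProcessError if the extraction fails."""
    subprocess.run(extract_frames_cmd(video_path, **kwargs), check=True)
```

For a 24-hour course, sampling at 1 fps already produces ~86,000 frames, which is another argument for the key-frame approach below.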
Comparing them for differences can then perhaps be done by PIL and/or OpenCV.
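As a sketch of the comparison step with PIL (Pillow): measure what fraction of the pixels changed between consecutive frames, and treat a frame as a new slide only when that fraction is large. The `noise` and `threshold` values here are guesses you would tune against your material:

```python
from PIL import Image, ImageChops

def changed_fraction(img_a, img_b, noise=16):
    """Fraction of pixels whose grey-level difference exceeds `noise`."""
    diff = ImageChops.difference(img_a.convert("L"), img_b.convert("L"))
    changed = sum(1 for px in diff.getdata() if px > noise)
    return changed / (diff.width * diff.height)

def is_new_slide(prev_frame, frame, threshold=0.05):
    """A moving mouse pointer only touches a tiny area of the screen, so it
    stays under the threshold; a slide change alters most of the pixels."""
    return changed_fraction(prev_frame, frame) > threshold
```

You would then walk the extracted frames in order and keep a frame only when `is_new_slide` fires against the last kept frame.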
EDIT: I just realized that it would probably be more efficient to grab only the key frames (intra frames), because those occur when a drastic change in the scene happens. A quick google turns up this:
ffmpeg -i foo.avi -vsync 0 -vf select="eq(pict_type\,PICT_TYPE_I)" -s WxH -f image2 foo-%03d.jpeg
#2
5
What you basically want is scene detection. framedifferenceanalyzer is an educational proof of concept in Python that does exactly that, and should provide a good starting point for learning about the problem itself.
As for implementing it yourself, ffmpeg is the ideal tool for converting a video into a sequence of frames - I probably wouldn't attempt doing that part in pure Python.
For calculating the difference between frames you could maybe use ImageMagick (its compare tool in particular). There are several Python bindings for ImageMagick, for example PythonMagick or magickwand to name just two.
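Instead of a full binding you can also just shell out to ImageMagick's `compare` tool. A sketch, assuming `compare` is installed: with `-metric AE` it counts the differing pixels, writes that number to stderr, and exits non-zero when the images differ (so no `check=True` here). The helper names are my own:

```python
import subprocess

def compare_cmd(img_a, img_b, metric="AE"):
    # "null:" discards the visual diff image compare would otherwise write.
    return ["compare", "-metric", metric, img_a, img_b, "null:"]

def differing_pixels(img_a, img_b):
    proc = subprocess.run(compare_cmd(img_a, img_b),
                          capture_output=True, text=True)
    return float(proc.stderr.split()[0])
```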
You could also use OpenCV to do the image analysis. OpenCV is a library of high-performance, high-quality computer vision algorithms, and probably one of the most powerful tools out there for things like this. However, it somewhat assumes that you have a certain knowledge of computer vision / image processing and already have a good idea of what you're looking for.