How do byte-range requests (pseudo-streaming) for HTML5 video work?

If you play an HTML5 video that is hosted on a server that accepts range requests, then when you seek ahead to a non-buffered part of the video you'll notice from the network traffic that the browser makes a byte-range request. I'm assuming that the browser computes the byte by knowing the total video size ahead of time and assuming a constant bitrate (if you click halfway along the progress bar, it requests the byte at the halfway point). But especially if the video is variable bitrate, it seems unlikely that the requested byte could really correspond to the time point that the user clicked on, and the byte would likely fall in the middle of a frame.

How does the browser know what the beginning of the next frame is, once it's begun fetching at some arbitrary byte?

2 Answers

#1

I assume your video is in an MP4 container. The MP4 file format contains a hierarchical structure of 'boxes' (also called atoms). One of these boxes is the Time-to-Sample (stts) box, which stores the duration of every frame in a compact, run-length form, so the frame at a given time can be found by summing durations. From there you can find the 'chunk' that contains the frame using the Sample-to-Chunk (stsc) box. Finally, the Chunk Offset (stco) box gives you the byte offset of that chunk in the file.
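
The lookup chain described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the tables are passed in as already-parsed Python lists, the `stsc` entries are reduced to `(first_chunk, samples_per_chunk)` pairs, and the function names are hypothetical. It stops at the chunk's byte offset; pinpointing the exact sample inside the chunk would also need per-sample sizes, which are not covered here.

```python
def sample_for_time(stts, timescale, seconds):
    """Return the 0-based sample index whose time span covers `seconds`.

    stts is a list of (sample_count, sample_delta) runs, with deltas
    expressed in timescale units.
    """
    target = int(seconds * timescale)
    elapsed = 0  # media time covered by earlier runs
    index = 0    # samples covered by earlier runs
    for count, delta in stts:
        run_duration = count * delta
        if target < elapsed + run_duration:
            return index + (target - elapsed) // delta
        elapsed += run_duration
        index += count
    return index - 1  # past the end: clamp to the last sample


def chunk_for_sample(stsc, sample_index):
    """Return the 0-based chunk index that holds `sample_index`.

    stsc is a list of (first_chunk, samples_per_chunk) entries, where
    first_chunk is 1-based, as in the file format.
    """
    first_sample = 0  # first sample of the current run of chunks
    for i, (first_chunk, per_chunk) in enumerate(stsc):
        if i + 1 < len(stsc):
            chunks_in_run = stsc[i + 1][0] - first_chunk
            samples_in_run = chunks_in_run * per_chunk
            if sample_index >= first_sample + samples_in_run:
                first_sample += samples_in_run
                continue
        return (first_chunk - 1) + (sample_index - first_sample) // per_chunk
    raise ValueError("empty stsc table")


def chunk_offset_for_time(stts, stsc, stco, timescale, seconds):
    """Map a seek time to the byte offset of the chunk containing that frame."""
    sample = sample_for_time(stts, timescale, seconds)
    chunk = chunk_for_sample(stsc, sample)
    return stco[chunk]
```

With made-up tables such as `stts = [(10, 40), (4, 80)]`, `stsc = [(1, 4), (3, 2)]`, `stco = [48, 4000, 8100, 10200, 12100]` and a timescale of 1000, a seek to 0.5 s lands on sample 11, which sits in the fourth chunk at byte 10200.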

The total duration of the movie is stored in the Movie Header (mvhd) box. When you move the scrub handle, a time is estimated from the duration of the movie and the position where you let go of the handle. The byte offset is then calculated from the file header downloaded previously, and a request is made.
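
As a rough sketch of that last step, assuming the scrub position is available as a fraction between 0 and 1 and that the byte offset has already been looked up in the header tables, the seek time and the resulting `Range` header could be formed like this. The function names are hypothetical, and real browsers may request a bounded range rather than an open-ended one.

```python
def seek_time(fraction, duration_units, timescale):
    """Convert a scrub-bar position (0.0 to 1.0) into seconds.

    duration_units and timescale stand in for the values read from mvhd:
    the duration is stored in timescale units per second.
    """
    fraction = min(max(fraction, 0.0), 1.0)  # clamp to the bar's extent
    return fraction * duration_units / timescale


def range_header(byte_offset):
    """Build an open-ended HTTP Range header starting at byte_offset."""
    return {"Range": f"bytes={byte_offset}-"}
```

For example, releasing the handle at the midpoint of a movie with a duration of 600000 units at a timescale of 1000 gives a seek time of 300 seconds, and an offset of 10200 produces `Range: bytes=10200-`.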

Edit: If it is not MP4, other containers have similar mechanisms. The codec is irrelevant.

#2

Many video/media types, such as MPEG, are encoded in fixed-size packets.

The MPEG transport stream was designed around 188-byte packets (a size originally chosen for compatibility with the ATM transport layer, though that is now obsolete). So if you seek to a multiple of that 188-byte size, the player will read valid packets and recover sync when it finds the beginning of a frame.
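
To make the resynchronization idea concrete, here is a minimal sketch. It relies on one detail not stated above: each transport stream packet begins with the sync byte 0x47. The function names and the choice of checking three consecutive packets are illustrative assumptions, not how any particular player is implemented.

```python
PACKET_SIZE = 188  # transport stream packet length in bytes
SYNC_BYTE = 0x47   # first byte of every transport stream packet


def align_to_packet(offset):
    """Round a byte offset down to a multiple of the packet size."""
    return offset - offset % PACKET_SIZE


def find_packet_start(buf, needed=3):
    """Return the smallest offset at which `needed` consecutive packets
    all begin with the sync byte, or -1 if there is none.

    Checking several packets in a row avoids locking onto a stray 0x47
    that merely happens to occur inside a payload.
    """
    last_start = len(buf) - PACKET_SIZE * (needed - 1) - 1
    for start in range(0, min(PACKET_SIZE, last_start + 1)):
        if all(buf[start + k * PACKET_SIZE] == SYNC_BYTE for k in range(needed)):
            return start
    return -1
```

If the file itself starts on a packet boundary, seeking to `align_to_packet(offset)` should already land on a sync byte; `find_packet_start` covers the case where the fetched data begins mid-packet.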

An actual picture can be displayed once the browser/player reaches an I-frame (or keyframe), which can be decoded independently of any other frames. P- and B-frames are predicted from other frames, so if you seek to one of them you can't yet construct a picture.
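
A toy illustration of that rule, assuming the frame types after the seek point are already known as a list of the letters "I", "P" and "B" (a hypothetical representation; a real decoder reads the type from each picture header):

```python
def first_displayable(frame_types):
    """Return the index of the first frame that can be decoded on its own,
    or None if no I-frame appears in the sequence."""
    for index, frame_type in enumerate(frame_types):
        if frame_type == "I":
            return index
    return None
```

For a stream that resumes as `["B", "P", "B", "I", "P"]`, the first three frames would have to be skipped and display could start at index 3.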

See:

http://en.wikipedia.org/wiki/MPEG_transport_stream

http://en.wikipedia.org/wiki/MPEG-1#Frame.2Fpicture.2Fblock_types
