How to use VideoToolbox to decompress an H.264 video stream

Time: 2021-08-01 04:16:49

I had a lot of trouble figuring out how to use Apple's hardware-accelerated video framework to decompress an H.264 video stream. After a few weeks I figured it out and wanted to share an extensive example, since I couldn't find one.

My goal is to give a thorough, instructive example of Video Toolbox, which was introduced in WWDC '14 session 513. My code will not compile or run as-is, since it needs to be integrated with an elementary H.264 stream (such as video read from a file or streamed from the network) and needs to be tweaked for your specific case.

I should mention that I have very little experience with video en/decoding except what I learned while googling the subject. I don't know all the details about video formats, parameter structure etc. so I've only included what I think you need to know.

I am using Xcode 6.2 and have deployed to iOS devices running iOS 8.1 and 8.2.

5 answers

#1


138  

Concepts:

NALUs: A NALU is simply a chunk of data of varying length that begins with a start code header, 0x00 00 00 01 YY, where the low 5 bits of YY tell you what type of NALU this is and therefore what type of data follows the header. (Since only those 5 bits matter, I use YY & 0x1F to extract them.) I list all of these types in the array NSString * const naluTypesStrings[], but you don't need to know what they all are.
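
As a minimal sketch of that masking step, here is the header byte pulled apart in plain C (which also compiles inside an Objective-C file). The function names are made up for illustration and are not part of any Apple framework; the bit layout is the one the H.264 NAL unit header uses.

```c
#include <stdint.h>

/* The NAL unit header is the single byte that follows the start code.
   Bit layout, most significant bit first:
   forbidden_zero_bit (1) | nal_ref_idc (2) | nal_unit_type (5) */

/* Hypothetical helper: returns the NALU type, i.e. the low 5 bits. */
static inline int nalu_type_from_header(uint8_t header)
{
    return header & 0x1F;
}

/* Hypothetical helper: returns nal_ref_idc, the 2 bits above the type.
   A value of 0 means the NALU is not used as a reference by other pictures. */
static inline int nalu_ref_idc_from_header(uint8_t header)
{
    return (header >> 5) & 0x03;
}
```

For example, a header byte of 0x67 gives type 7 (SPS), 0x68 gives type 8 (PPS) and 0x65 gives type 5 (IDR slice).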

Parameters: Your decoder needs parameters so it knows how the H.264 video data is stored. The two you need to set are the Sequence Parameter Set (SPS) and the Picture Parameter Set (PPS), and each has its own NALU type number. You don't need to know what the parameters mean; the decoder knows what to do with them.

H.264 Stream Format: In most H.264 streams, you will receive an initial set of SPS and PPS parameters followed by an I-frame (aka IDR frame or flush frame) NALU. Then you will receive several P-frame NALUs (maybe a few dozen or so), then another set of parameters (which may be the same as the initial parameters) and an I-frame, more P-frames, etc. I-frames are much bigger than P-frames. Conceptually, you can think of the I-frame as an entire image of the video, and the P-frames as just the changes made relative to that I-frame, until you receive the next I-frame.
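
One practical consequence of this layout is that a receiver joining mid-stream has to drop P-frames until it has seen the parameter sets and an I-frame. The following is only a sketch of that bookkeeping in plain C; StreamGate and stream_gate_accept are hypothetical names for illustration, and the sketch ignores every NALU type other than 1, 5, 7 and 8.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping, not part of VideoToolbox. */
typedef struct {
    bool have_sps;   /* a type 7 NALU has been seen */
    bool have_pps;   /* a type 8 NALU has been seen */
    bool have_idr;   /* a type 5 NALU has been accepted */
} StreamGate;

/* Returns true when a NALU of this type should be handed to the decoder. */
static bool stream_gate_accept(StreamGate *gate, int nalu_type)
{
    switch (nalu_type) {
    case 7:                       /* SPS: remembered, not decoded as a frame */
        gate->have_sps = true;
        return false;
    case 8:                       /* PPS: remembered, not decoded as a frame */
        gate->have_pps = true;
        return false;
    case 5:                       /* IDR frame: needs both parameter sets */
        if (!(gate->have_sps && gate->have_pps)) return false;
        gate->have_idr = true;
        return true;
    case 1:                       /* P-frame: needs an IDR frame to build on */
        return gate->have_sps && gate->have_pps && gate->have_idr;
    default:                      /* other types are ignored in this sketch */
        return false;
    }
}
```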

Procedure:

  1. Generate individual NALUs from your H.264 stream. I cannot show code for this step since it depends a lot on what video source you're using. I made a graphic to show what I was working with ("data" in the graphic is "frame" in my following code), but your case may, and probably will, differ. My method receivedRawVideoFrame: is called every time I receive a frame (uint8_t *frame), which is one of 2 types. In the diagram, those 2 frame types are the 2 big purple boxes.

  2. Create a CMVideoFormatDescriptionRef from your SPS and PPS NALUs with CMVideoFormatDescriptionCreateFromH264ParameterSets( ). You cannot display any frames without doing this first. The SPS and PPS may look like a jumble of numbers, but VTD knows what to do with them. All you need to know is that CMVideoFormatDescriptionRef is a description of video data, like width/height, format type (kCMPixelFormat_32BGRA, kCMVideoCodecType_H264 etc.), aspect ratio, color space etc. Your decoder will hold onto the parameters until a new set arrives (sometimes parameters are resent regularly even when they haven't changed).

  3. Re-package your IDR and non-IDR frame NALUs according to the "AVCC" format. This means removing the NALU start codes and replacing them with a 4-byte header that states the length of the NALU. You don't need to do this for the SPS and PPS NALUs. (Note that the 4-byte NALU length header is big-endian, so a UInt32 length must be converted from host byte order to big-endian, for example with CFSwapInt32HostToBig, before it is copied into the CMBlockBuffer. I do this in my code with the htonl function call.)

  4. Package the IDR and non-IDR NALU frames into a CMBlockBuffer. Do not do this with the SPS and PPS parameter NALUs. All you need to know about CMBlockBuffers is that they are a way to wrap arbitrary blocks of data in Core Media. (Any compressed video data in a video pipeline is wrapped in one.)

  5. Package the CMBlockBuffer into CMSampleBuffer. All you need to know about CMSampleBuffers is that they wrap up our CMBlockBuffers with other information (here it would be the CMVideoFormatDescription and CMTime, if CMTime is used).

  6. Create a VTDecompressionSessionRef and feed the sample buffers into VTDecompressionSessionDecodeFrame( ). Alternatively, you can use AVSampleBufferDisplayLayer and its enqueueSampleBuffer: method, and you won't need to use VTDecompSession. It's simpler to set up, but unlike VTD it will not report errors if something goes wrong.

  7. In the VTDecompSession callback, use the resultant CVImageBufferRef to display the video frame. If you need to convert your CVImageBuffer to a UIImage, see my * answer here.
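
Step 3 of the procedure is the one that is easiest to get subtly wrong, so here is a minimal sketch of it in plain C. It assumes the buffer holds exactly one NALU that begins with a 4-byte start code; annexb_to_avcc_in_place is a hypothetical name, not an Apple API. Writing the length byte by byte produces big-endian output on any host, which is the same result the htonl call gives in the code further down.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: rewrites one NALU in place from
   [00 00 00 01][payload] to [4-byte big-endian payload length][payload].
   Returns 0 on success, -1 if the buffer does not begin with 00 00 00 01. */
static int annexb_to_avcc_in_place(uint8_t *nalu, size_t size)
{
    static const uint8_t start_code[4] = { 0x00, 0x00, 0x00, 0x01 };

    if (nalu == NULL || size < 5 || memcmp(nalu, start_code, 4) != 0) {
        return -1;
    }

    /* The length counts the payload only, not the 4 header bytes. */
    uint32_t payload_length = (uint32_t)(size - 4);

    nalu[0] = (uint8_t)(payload_length >> 24);
    nalu[1] = (uint8_t)(payload_length >> 16);
    nalu[2] = (uint8_t)(payload_length >> 8);
    nalu[3] = (uint8_t)(payload_length);
    return 0;
}
```

Because the start code and the length header are both 4 bytes, the payload does not have to move. With a 3-byte start code the payload would have to be shifted by one byte or copied into a new buffer.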

Other notes:

  • H.264 streams can vary a lot. From what I learned, NALU start code headers are sometimes 3 bytes (0x00 00 01) and sometimes 4 (0x00 00 00 01). My code works for 4 bytes; you will need to change a few things around if you're working with 3.

  • If you want to know more about NALUs, I found this answer to be very helpful. In my case, I found that I didn't need to ignore the "emulation prevention" bytes as described, so I personally skipped that step but you may need to know about that.

  • If your VTDecompressionSession outputs an error number (like -12909), look up the error code in your Xcode project. Find the VideoToolbox framework in your project navigator, open it and find the header VTErrors.h. If you can't find it, I've also included all the error codes below in another answer.
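
For the first note above (3-byte versus 4-byte start codes), this is a rough sketch in plain C of a scanner that accepts both lengths and splits a buffer into NALU payloads. NaluSpan, find_start_code and split_nalus are hypothetical names, and the sketch assumes the whole stream chunk is already in memory. It is not how my code below works, since that code assumes 4-byte start codes at known positions.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset and length of one NALU payload, start code excluded. */
typedef struct {
    size_t offset;
    size_t length;
} NaluSpan;

/* Finds the next start code at or after `from`. On success returns its
   offset and stores its length (3 or 4) in *code_len.
   Returns `size` when no start code remains. */
static size_t find_start_code(const uint8_t *buf, size_t size,
                              size_t from, size_t *code_len)
{
    for (size_t i = from; i + 3 <= size; i++) {
        if (buf[i] != 0x00 || buf[i + 1] != 0x00) continue;
        if (buf[i + 2] == 0x01) { *code_len = 3; return i; }
        if (i + 4 <= size && buf[i + 2] == 0x00 && buf[i + 3] == 0x01) {
            *code_len = 4;
            return i;
        }
    }
    return size;
}

/* Fills `out` with up to `max` payload spans and returns how many were found. */
static size_t split_nalus(const uint8_t *buf, size_t size,
                          NaluSpan *out, size_t max)
{
    size_t count = 0;
    size_t code_len = 0;
    size_t pos = find_start_code(buf, size, 0, &code_len);

    while (pos < size && count < max) {
        size_t payload = pos + code_len;
        size_t next_len = 0;
        size_t next = find_start_code(buf, size, payload, &next_len);

        out[count].offset = payload;
        out[count].length = next - payload;   /* runs to the next start code or the end */
        count++;

        pos = next;
        code_len = next_len;
    }
    return count;
}
```

The first byte of each span is the NALU header, so buf[span.offset] & 0x1F gives its type.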

Code Example:

So let's start by importing the VT framework (VT = Video Toolbox) and declaring some properties.

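#import <AVFoundation/AVFoundation.h>   // for AVSampleBufferDisplayLayer
#import <VideoToolbox/VideoToolbox.h>

// class extension; THISCLASSNAME is a placeholder for your own class
@interface THISCLASSNAME ()
@property (nonatomic, assign) CMVideoFormatDescriptionRef formatDesc;
@property (nonatomic, assign) VTDecompressionSessionRef decompressionSession;
@property (nonatomic, strong) AVSampleBufferDisplayLayer *videoLayer;
@property (nonatomic, assign) int spsSize;
@property (nonatomic, assign) int ppsSize;
@end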

The following array is only used so that you can print out what type of NALU frame you are receiving. If you know what all these types mean, good for you, you know more about H.264 than me :) My code only handles types 1, 5, 7 and 8.

NSString * const naluTypesStrings[] =
{
    @"0: Unspecified (non-VCL)",
    @"1: Coded slice of a non-IDR picture (VCL)",    // P frame
    @"2: Coded slice data partition A (VCL)",
    @"3: Coded slice data partition B (VCL)",
    @"4: Coded slice data partition C (VCL)",
    @"5: Coded slice of an IDR picture (VCL)",      // I frame
    @"6: Supplemental enhancement information (SEI) (non-VCL)",
    @"7: Sequence parameter set (non-VCL)",         // SPS parameter
    @"8: Picture parameter set (non-VCL)",          // PPS parameter
    @"9: Access unit delimiter (non-VCL)",
    @"10: End of sequence (non-VCL)",
    @"11: End of stream (non-VCL)",
    @"12: Filler data (non-VCL)",
    @"13: Sequence parameter set extension (non-VCL)",
    @"14: Prefix NAL unit (non-VCL)",
    @"15: Subset sequence parameter set (non-VCL)",
    @"16: Reserved (non-VCL)",
    @"17: Reserved (non-VCL)",
    @"18: Reserved (non-VCL)",
    @"19: Coded slice of an auxiliary coded picture without partitioning (non-VCL)",
    @"20: Coded slice extension (non-VCL)",
    @"21: Coded slice extension for depth view components (non-VCL)",
    @"22: Reserved (non-VCL)",
    @"23: Reserved (non-VCL)",
    @"24: STAP-A Single-time aggregation packet (non-VCL)",
    @"25: STAP-B Single-time aggregation packet (non-VCL)",
    @"26: MTAP16 Multi-time aggregation packet (non-VCL)",
    @"27: MTAP24 Multi-time aggregation packet (non-VCL)",
    @"28: FU-A Fragmentation unit (non-VCL)",
    @"29: FU-B Fragmentation unit (non-VCL)",
    @"30: Unspecified (non-VCL)",
    @"31: Unspecified (non-VCL)",
};

Now this is where all the magic happens.

-(void) receivedRawVideoFrame:(uint8_t *)frame withSize:(uint32_t)frameSize isIFrame:(int)isIFrame
{
    OSStatus status = noErr;   // initialized because it is tested below even when no call has set it yet

    uint8_t *data = NULL;
    uint8_t *pps = NULL;
    uint8_t *sps = NULL;

    // I know what my H.264 data source's NALUs look like so I know start code index is always 0.
    // if you don't know where it starts, you can use a for loop similar to how i find the 2nd and 3rd start codes
    int startCodeIndex = 0;
    int secondStartCodeIndex = 0;
    int thirdStartCodeIndex = 0;

    long blockLength = 0;

    CMSampleBufferRef sampleBuffer = NULL;
    CMBlockBufferRef blockBuffer = NULL;

    int nalu_type = (frame[startCodeIndex + 4] & 0x1F);
    NSLog(@"~~~~~~~ Received NALU Type \"%@\" ~~~~~~~~", naluTypesStrings[nalu_type]);

    // if we havent already set up our format description with our SPS PPS parameters, we
    // can't process any frames except type 7 that has our parameters
    if (nalu_type != 7 && _formatDesc == NULL)
    {
        NSLog(@"Video error: Frame is not an I Frame and format description is null");
        return;
    }

    // NALU type 7 is the SPS parameter NALU
    if (nalu_type == 7)
    {
        // find where the second start code (0x00 00 00 01) begins; this is where the PPS starts,
        // and its index also gives us the length of the SPS (start code included)
        for (int i = startCodeIndex + 4; i < startCodeIndex + 40 && i + 3 < (int)frameSize; i++)
        {
            if (frame[i] == 0x00 && frame[i+1] == 0x00 && frame[i+2] == 0x00 && frame[i+3] == 0x01)
            {
                secondStartCodeIndex = i;
                _spsSize = secondStartCodeIndex;   // includes the header in the size
                break;
            }
        }

        // find what the second NALU type is
        nalu_type = (frame[secondStartCodeIndex + 4] & 0x1F);
        NSLog(@"~~~~~~~ Received NALU Type \"%@\" ~~~~~~~~", naluTypesStrings[nalu_type]);
    }

    // type 8 is the PPS parameter NALU
    if(nalu_type == 8)
    {
        // find where the NALU after this one starts so we know how long the PPS parameter is
        for (int i = _spsSize + 4; i < _spsSize + 30 && i + 3 < (int)frameSize; i++)
        {
            if (frame[i] == 0x00 && frame[i+1] == 0x00 && frame[i+2] == 0x00 && frame[i+3] == 0x01)
            {
                thirdStartCodeIndex = i;
                _ppsSize = thirdStartCodeIndex - _spsSize;
                break;
            }
        }

        // allocate enough data to fit the SPS and PPS parameters into our data objects.
        // VTD doesn't want you to include the start code header (4 bytes long) so we add the - 4 here
        sps = malloc(_spsSize - 4);
        pps = malloc(_ppsSize - 4);

        // copy in the actual sps and pps values, again ignoring the 4 byte header
        memcpy (sps, &frame[4], _spsSize-4);
        memcpy (pps, &frame[_spsSize+4], _ppsSize-4);

        // now we set our H264 parameters
        uint8_t*  parameterSetPointers[2] = {sps, pps};
        size_t parameterSetSizes[2] = {_spsSize-4, _ppsSize-4};

        status = CMVideoFormatDescriptionCreateFromH264ParameterSets(kCFAllocatorDefault, 2, 
                                                (const uint8_t *const*)parameterSetPointers, 
                                                parameterSetSizes, 4, 
                                                &_formatDesc);

        NSLog(@"\t\t Creation of CMVideoFormatDescription: %@", (status == noErr) ? @"successful!" : @"failed...");
        if(status != noErr) NSLog(@"\t\t Format Description ERROR type: %d", (int)status);

        // See if decomp session can convert from previous format description 
        // to the new one, if not we need to remake the decomp session.
        // This snippet was not necessary for my applications but it could be for yours
        /*BOOL needNewDecompSession = (VTDecompressionSessionCanAcceptFormatDescription(_decompressionSession, _formatDesc) == NO);
         if(needNewDecompSession)
         {
             [self createDecompSession];
         }*/

        // now lets handle the IDR frame that (should) come after the parameter sets
        // I say "should" because that's how I expect my H264 stream to work, YMMV
        nalu_type = (frame[thirdStartCodeIndex + 4] & 0x1F);
        NSLog(@"~~~~~~~ Received NALU Type \"%@\" ~~~~~~~~", naluTypesStrings[nalu_type]);
    }

    // create our VTDecompressionSession.  This isn't necessary if you choose to use AVSampleBufferDisplayLayer
    if((status == noErr) && (_decompressionSession == NULL))
    {
        [self createDecompSession];
    }

    // type 5 is an IDR frame NALU.  The SPS and PPS NALUs should always be followed by an IDR (or IFrame) NALU, as far as I know
    if(nalu_type == 5)
    {
        // find the offset, or where the SPS and PPS NALUs end and the IDR frame NALU begins
        int offset = _spsSize + _ppsSize;
        blockLength = frameSize - offset;
        data = malloc(blockLength);
        data = memcpy(data, &frame[offset], blockLength);

        // replace the start code header on this NALU with its size.
        // AVCC format requires that you do this.  
        // htonl converts the unsigned int from host to network byte order
        uint32_t dataLength32 = htonl (blockLength - 4);
        memcpy (data, &dataLength32, sizeof (uint32_t));

        // create a block buffer from the IDR NALU
        status = CMBlockBufferCreateWithMemoryBlock(NULL, data,  // memoryBlock to hold buffered data
                                                    blockLength,  // block length of the mem block in bytes.
                                                    kCFAllocatorNull, NULL,
                                                    0, // offsetToData
                                                    blockLength,   // dataLength of relevant bytes, starting at offsetToData
                                                    0, &blockBuffer);

        NSLog(@"\t\t BlockBufferCreation: \t %@", (status == kCMBlockBufferNoErr) ? @"successful!" : @"failed...");
    }

    // NALU type 1 is non-IDR (or PFrame) picture
    if (nalu_type == 1)
    {
        // non-IDR frames do not have an offset due to SPS and PPS, so the approach
        // is similar to the IDR frames just without the offset
        blockLength = frameSize;
        data = malloc(blockLength);
        data = memcpy(data, &frame[0], blockLength);

        // again, replace the start header with the size of the NALU
        uint32_t dataLength32 = htonl (blockLength - 4);
        memcpy (data, &dataLength32, sizeof (uint32_t));

        status = CMBlockBufferCreateWithMemoryBlock(NULL, data,  // memoryBlock to hold data. If NULL, block will be alloc when needed
                                                    blockLength,  // overall length of the mem block in bytes
                                                    kCFAllocatorNull, NULL,
                                                    0,     // offsetToData
                                                    blockLength,  // dataLength of relevant data bytes, starting at offsetToData
                                                    0, &blockBuffer);

        NSLog(@"\t\t BlockBufferCreation: \t %@", (status == kCMBlockBufferNoErr) ? @"successful!" : @"failed...");
    }

    // now create our sample buffer from the block buffer,
    if(status == noErr)
    {
        // here I'm not bothering with any timing specifics since in my case we displayed all frames immediately
        const size_t sampleSize = blockLength;
        status = CMSampleBufferCreate(kCFAllocatorDefault,
                                      blockBuffer, true, NULL, NULL,
                                      _formatDesc, 1, 0, NULL, 1,
                                      &sampleSize, &sampleBuffer);

        NSLog(@"\t\t SampleBufferCreate: \t %@", (status == noErr) ? @"successful!" : @"failed...");
    }

    if(status == noErr)
    {
        // set some values of the sample buffer's attachments
        CFArrayRef attachments = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, YES);
        CFMutableDictionaryRef dict = (CFMutableDictionaryRef)CFArrayGetValueAtIndex(attachments, 0);
        CFDictionarySetValue(dict, kCMSampleAttachmentKey_DisplayImmediately, kCFBooleanTrue);

        // either send the samplebuffer to a VTDecompressionSession or to an AVSampleBufferDisplayLayer
        [self render:sampleBuffer];
    }

    // free memory to avoid a memory leak (free(NULL) is a no-op)
    free(sps);
    free(pps);
    free(data);

    // the sample buffer holds its own reference to the block buffer, so release ours
    if (blockBuffer != NULL) CFRelease(blockBuffer);
}

The following method creates your VTD session. Recreate it whenever you receive new parameters. (I'm pretty sure you don't have to recreate it every time you receive parameters, only when they change.)

If you want to set attributes for the destination CVPixelBuffer, read up on CoreVideo PixelBufferAttributes values and put them in NSDictionary *destinationImageBufferAttributes.

-(void) createDecompSession
{
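    // make sure to destroy the old VTD session
    if (_decompressionSession != NULL)
    {
        VTDecompressionSessionInvalidate(_decompressionSession);
        CFRelease(_decompressionSession);
        _decompressionSession = NULL;
    }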
    VTDecompressionOutputCallbackRecord callBackRecord;
    callBackRecord.decompressionOutputCallback = decompressionSessionDecodeFrameCallback;

    // this is necessary if you need to make calls to Objective C "self" from within the callback method.
    callBackRecord.decompressionOutputRefCon = (__bridge void *)self;

    // you can set some desired attributes for the destination pixel buffer.  I didn't use this but you may
    // if you need to set some attributes, be sure to uncomment the dictionary in VTDecompressionSessionCreate
    NSDictionary *destinationImageBufferAttributes = [NSDictionary dictionaryWithObjectsAndKeys:
                                                      [NSNumber numberWithBool:YES],
                                                      (id)kCVPixelBufferOpenGLESCompatibilityKey,
                                                      nil];

    OSStatus status =  VTDecompressionSessionCreate(NULL, _formatDesc, NULL,
                                                    NULL, // (__bridge CFDictionaryRef)(destinationImageBufferAttributes)
                                                    &callBackRecord, &_decompressionSession);
    NSLog(@"Video Decompression Session Create: \t %@", (status == noErr) ? @"successful!" : @"failed...");
    if(status != noErr) NSLog(@"\t\t VTD ERROR type: %d", (int)status);
}

This function gets called every time VTD finishes decompressing a frame you sent to it. It gets called even if there's an error or if the frame is dropped.

void decompressionSessionDecodeFrameCallback(void *decompressionOutputRefCon,
                                             void *sourceFrameRefCon,
                                             OSStatus status,
                                             VTDecodeInfoFlags infoFlags,
                                             CVImageBufferRef imageBuffer,
                                             CMTime presentationTimeStamp,
                                             CMTime presentationDuration)
{
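    THISCLASSNAME *streamManager = (__bridge THISCLASSNAME *)decompressionOutputRefCon;

    // balance the CFBridgingRetain done in render: so the NSDate passed as sourceFrameRefCon is not leaked
    NSDate *sentTime = CFBridgingRelease(sourceFrameRefCon);
    (void)sentTime;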

    if (status != noErr)
    {
        NSError *error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
        NSLog(@"Decompressed error: %@", error);
    }
    else
    {
        NSLog(@"Decompressed successfully");

        // do something with your resulting CVImageBufferRef that is your decompressed frame
        [streamManager displayDecodedFrame:imageBuffer];
    }
}

This is where we actually send the sampleBuffer off to the VTD to be decoded.

- (void) render:(CMSampleBufferRef)sampleBuffer
{
    VTDecodeFrameFlags flags = kVTDecodeFrame_EnableAsynchronousDecompression;
    VTDecodeInfoFlags flagOut;
    NSDate* currentTime = [NSDate date];
    VTDecompressionSessionDecodeFrame(_decompressionSession, sampleBuffer, flags,
                                      (void*)CFBridgingRetain(currentTime), &flagOut);

    // if you're using AVSampleBufferDisplayLayer, you only need to use this line of code
    // (instead of the decode call above, and before the sample buffer is released)
    // [self.videoLayer enqueueSampleBuffer:sampleBuffer];

    CFRelease(sampleBuffer);
}

If you're using AVSampleBufferDisplayLayer, be sure to init the layer like this, in viewDidLoad or inside some other init method.

-(void) viewDidLoad
{
    [super viewDidLoad];

    // create our AVSampleBufferDisplayLayer and add it to the view
    self.videoLayer = [[AVSampleBufferDisplayLayer alloc] init];
    self.videoLayer.frame = self.view.frame;
    self.videoLayer.bounds = self.view.bounds;
    self.videoLayer.videoGravity = AVLayerVideoGravityResizeAspect;

    // set Timebase, you may need this if you need to display frames at specific times
    // I didn't need it so I haven't verified that the timebase is working
    CMTimebaseRef controlTimebase;
    CMTimebaseCreateWithMasterClock(CFAllocatorGetDefault(), CMClockGetHostTimeClock(), &controlTimebase);

    // the layer retains the timebase, so our own reference can be released after assigning it
    self.videoLayer.controlTimebase = controlTimebase;
    CFRelease(controlTimebase);
    CMTimebaseSetTime(self.videoLayer.controlTimebase, kCMTimeZero);
    CMTimebaseSetRate(self.videoLayer.controlTimebase, 1.0);

    [[self.view layer] addSublayer:self.videoLayer];
}

#2


13  

In case you can't find the VTD error codes in the framework, I decided to just include them here. (Again, all these errors and more can be found inside the VideoToolbox.framework itself in the project navigator, in the file VTErrors.h.)

If you did something incorrectly, you will get one of these error codes either in the VTD decode frame callback or when you create your VTD session.

kVTPropertyNotSupportedErr              = -12900,
kVTPropertyReadOnlyErr                  = -12901,
kVTParameterErr                         = -12902,
kVTInvalidSessionErr                    = -12903,
kVTAllocationFailedErr                  = -12904,
kVTPixelTransferNotSupportedErr         = -12905, // c.f. -8961
kVTCouldNotFindVideoDecoderErr          = -12906,
kVTCouldNotCreateInstanceErr            = -12907,
kVTCouldNotFindVideoEncoderErr          = -12908,
kVTVideoDecoderBadDataErr               = -12909, // c.f. -8969
kVTVideoDecoderUnsupportedDataFormatErr = -12910, // c.f. -8970
kVTVideoDecoderMalfunctionErr           = -12911, // c.f. -8960
kVTVideoEncoderMalfunctionErr           = -12912,
kVTVideoDecoderNotAvailableNowErr       = -12913,
kVTImageRotationNotSupportedErr         = -12914,
kVTVideoEncoderNotAvailableNowErr       = -12915,
kVTFormatDescriptionChangeNotSupportedErr   = -12916,
kVTInsufficientSourceColorDataErr       = -12917,
kVTCouldNotCreateColorCorrectionDataErr = -12918,
kVTColorSyncTransformConvertFailedErr   = -12919,
kVTVideoDecoderAuthorizationErr         = -12210,
kVTVideoEncoderAuthorizationErr         = -12211,
kVTColorCorrectionPixelTransferFailedErr    = -12212,
kVTMultiPassStorageIdentifierMismatchErr    = -12213,
kVTMultiPassStorageInvalidErr           = -12214,
kVTFrameSiloInvalidTimeStampErr         = -12215,
kVTFrameSiloInvalidTimeRangeErr         = -12216,
kVTCouldNotFindTemporalFilterErr        = -12217,
kVTPixelTransferNotPermittedErr         = -12218,
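
If you would rather see a name than a bare number in your logs, a small lookup table is enough. This is only a sketch in plain C: vt_error_name is a hypothetical helper, not something VideoToolbox provides, and the table holds just a handful of the codes listed above, so extend it with whichever ones you run into.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup table; the values are copied from the list above. */
typedef struct {
    int32_t code;
    const char *name;
} VTErrorName;

static const VTErrorName vt_error_names[] = {
    { -12902, "kVTParameterErr" },
    { -12903, "kVTInvalidSessionErr" },
    { -12906, "kVTCouldNotFindVideoDecoderErr" },
    { -12909, "kVTVideoDecoderBadDataErr" },
    { -12910, "kVTVideoDecoderUnsupportedDataFormatErr" },
    { -12911, "kVTVideoDecoderMalfunctionErr" },
    { -12916, "kVTFormatDescriptionChangeNotSupportedErr" },
};

/* Returns the constant's name, or NULL when the code is not in the table. */
static const char *vt_error_name(int32_t status)
{
    size_t count = sizeof vt_error_names / sizeof vt_error_names[0];
    for (size_t i = 0; i < count; i++) {
        if (vt_error_names[i].code == status) {
            return vt_error_names[i].name;
        }
    }
    return NULL;
}
```

In the decode callback you could then log vt_error_name((int32_t)status) next to the number, falling back to the number alone when it returns NULL.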

#3


8  

A good Swift example of much of this can be found in Josh Baker's Avios library: https://github.com/tidwall/Avios

Note that Avios currently expects the user to handle chunking data at NAL start codes, but does handle decoding the data from that point forward.

Also worth a look is the Swift-based RTMP library HaishinKit (formerly "LF"), which has its own decoding implementation, including more robust NALU parsing: https://github.com/shogo4405/lf.swift

#4


4  

In addition to the VTErrors above, I thought it worth adding the CMFormatDescription, CMBlockBuffer and CMSampleBuffer errors that you may encounter while trying Livy's example.

kCMFormatDescriptionError_InvalidParameter  = -12710,
kCMFormatDescriptionError_AllocationFailed  = -12711,
kCMFormatDescriptionError_ValueNotAvailable = -12718,

kCMBlockBufferNoErr                             = 0,
kCMBlockBufferStructureAllocationFailedErr      = -12700,
kCMBlockBufferBlockAllocationFailedErr          = -12701,
kCMBlockBufferBadCustomBlockSourceErr           = -12702,
kCMBlockBufferBadOffsetParameterErr             = -12703,
kCMBlockBufferBadLengthParameterErr             = -12704,
kCMBlockBufferBadPointerParameterErr            = -12705,
kCMBlockBufferEmptyBBufErr                      = -12706,
kCMBlockBufferUnallocatedBlockErr               = -12707,
kCMBlockBufferInsufficientSpaceErr              = -12708,

kCMSampleBufferError_AllocationFailed             = -12730,
kCMSampleBufferError_RequiredParameterMissing     = -12731,
kCMSampleBufferError_AlreadyHasDataBuffer         = -12732,
kCMSampleBufferError_BufferNotReady               = -12733,
kCMSampleBufferError_SampleIndexOutOfRange        = -12734,
kCMSampleBufferError_BufferHasNoSampleSizes       = -12735,
kCMSampleBufferError_BufferHasNoSampleTimingInfo  = -12736,
kCMSampleBufferError_ArrayTooSmall                = -12737,
kCMSampleBufferError_InvalidEntryCount            = -12738,
kCMSampleBufferError_CannotSubdivide              = -12739,
kCMSampleBufferError_SampleTimingInfoInvalid      = -12740,
kCMSampleBufferError_InvalidMediaTypeForOperation = -12741,
kCMSampleBufferError_InvalidSampleData            = -12742,
kCMSampleBufferError_InvalidMediaFormat           = -12743,
kCMSampleBufferError_Invalidated                  = -12744,
kCMSampleBufferError_DataFailed                   = -16750,
kCMSampleBufferError_DataCanceled                 = -16751,

#5


1  

@Livy, to remove memory leaks you should add the following before CMVideoFormatDescriptionCreateFromH264ParameterSets:

if (_formatDesc) {
    CFRelease(_formatDesc);
    _formatDesc = NULL;
}

#1


138  

Concepts:

NALUs: NALUs are simply a chunk of data of varying length that has a NALU start code header 0x00 00 00 01 YY where the first 5 bits of YY tells you what type of NALU this is and therefore what type of data follows the header. (Since you only need the first 5 bits, I use YY & 0x1F to just get the relevant bits.) I list what all these types are in the method NSString * const naluTypesStrings[], but you don't need to know what they all are.

NALUs: NALUs只是一个不同长度的数据块,它有一个NALU启动代码头,0x00 00 01 YY,前5位YY告诉你这是什么类型的NALU,因此是什么类型的数据跟随header。(因为您只需要前5位,我使用YY和0x1F来获取相关的比特。)我列出了所有这些类型在方法NSString * const naluTypesStrings[]中,但是您不需要知道它们都是什么。

Parameters: Your decoder needs parameters so it knows how the H.264 video data is stored. The 2 you need to set are Sequence Parameter Set (SPS) and Picture Parameter Set (PPS) and they each have their own NALU type number. You don't need to know what the parameters mean, the decoder knows what to do with them.

参数:您的解码器需要参数,所以它知道H.264视频数据是如何存储的。您需要设置的2是序列参数集(SPS)和图像参数集(PPS),它们各自都有自己的NALU类型编号。你不需要知道参数的含义,解码器知道如何处理它们。

H.264 Stream Format: In most H.264 streams, you will receive with an initial set of PPS and SPS parameters followed by an i frame (aka IDR frame or flush frame) NALU. Then you will receive several P frame NALUs (maybe a few dozen or so), then another set of parameters (which may be the same as the initial parameters) and an i frame, more P frames, etc. i frames are much bigger than P frames. Conceptually you can think of the i frame as an entire image of the video, and the P frames are just the changes that have been made to that i frame, until you receive the next i frame.

H.264流格式:在大多数H.264流中,您将收到一组初始的PPS和SPS参数,然后是i帧(即IDR框架或平帧)NALU。然后你会收到几个P框架NALUs(可能是几十个左右),然后另一组参数(可能与初始参数相同)和i帧,更多的P帧,等等,i帧比P帧大得多。从概念上讲,你可以把i帧看作整个视频的图像,而P帧只是我帧的变化,直到你接收到下一个i帧。

Procedure:

  1. Generate individual NALUs from your H.264 stream. I cannot show code for this step since it depends a lot on what video source you're using. I made this graphic to show what I was working with ("data" in the graphic is "frame" in my following code), but your case may and probably will differ. 如何使用VideoToolbox解压H.264视频流? My method receivedRawVideoFrame: is called every time I receive a frame (uint8_t *frame) which was one of 2 types. In the diagram, those 2 frame types are the 2 big purple boxes.

    从H.264流中生成单独的NALUs。我不能显示这一步的代码,因为它很大程度上取决于您使用的视频源。我做了这个图形来显示我正在处理的东西(图形中的“数据”在我的下面代码中是“frame”),但是您的情况可能并且可能会有所不同。我的方法receivedRawVideoFrame:每当我接收到一个帧(uint8_t *frame),它就是2种类型之一。在图中,这两种框架类型是两个大紫色框。

  2. Create a CMVideoFormatDescriptionRef from your SPS and PPS NALUs with CMVideoFormatDescriptionCreateFromH264ParameterSets( ). You cannot display any frames without doing this first. The SPS and PPS may look like a jumble of numbers, but VTD knows what to do with them. All you need to know is that CMVideoFormatDescriptionRef is a description of video data, such as width/height, format type (kCMPixelFormat_32BGRA, kCMVideoCodecType_H264 etc.), aspect ratio, color space etc. Your decoder will hold onto the parameters until a new set arrives (sometimes parameters are resent regularly even when they haven't changed).

  3. Re-package your IDR and non-IDR frame NALUs according to the "AVCC" format. This means removing the NALU start codes and replacing them with a 4-byte header that states the length of the NALU. You don't need to do this for the SPS and PPS NALUs. (Note that the 4-byte NALU length header is big-endian, so on a little-endian host a UInt32 value must be byte-swapped, for example with CFSwapInt32HostToBig, before it is copied into the CMBlockBuffer. I do this in my code with the htonl function call.)

  4. Package the IDR and non-IDR NALU frames into CMBlockBuffer. Do not do this with the SPS and PPS parameter NALUs. All you need to know about CMBlockBuffers is that they are a way to wrap arbitrary blocks of data in Core Media. (Any compressed video data in a video pipeline is wrapped in this.)

  5. Package the CMBlockBuffer into CMSampleBuffer. All you need to know about CMSampleBuffers is that they wrap up our CMBlockBuffers with other information (here it would be the CMVideoFormatDescription and CMTime, if CMTime is used).

  6. Create a VTDecompressionSessionRef and feed the sample buffers into VTDecompressionSessionDecodeFrame( ). Alternatively, you can use AVSampleBufferDisplayLayer and its enqueueSampleBuffer: method, in which case you won't need VTDecompSession. It's simpler to set up, but unlike VTD it will not report errors if something goes wrong.

  7. In the VTDecompSession callback, use the resultant CVImageBufferRef to display the video frame. If you need to convert your CVImageBuffer to a UIImage, see my Stack Overflow answer on that conversion.
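
Step 3 is the one that tends to trip people up, so here is a small plain-C sketch of it, assuming the NALU begins with a 4-byte start code. It writes the length one byte at a time, which gives big-endian output on any host and sidesteps the byte-swap question. annexb_to_avcc_in_place is a hypothetical helper name, not a Core Media function.

```c
#include <stddef.h>
#include <stdint.h>

/* Overwrites the 4-byte start code at the front of `nalu` with the
 * big-endian length of the payload that follows it (AVCC framing).
 * `total_len` counts the start code plus the payload.
 * Returns 0 on success, -1 if the buffer does not begin with 00 00 00 01. */
int annexb_to_avcc_in_place(uint8_t *nalu, size_t total_len)
{
    if (nalu == NULL || total_len < 5 || total_len - 4 > UINT32_MAX) {
        return -1;
    }
    if (nalu[0] != 0x00 || nalu[1] != 0x00 || nalu[2] != 0x00 || nalu[3] != 0x01) {
        return -1;
    }

    uint32_t payload_len = (uint32_t)(total_len - 4);

    /* most significant byte first, regardless of host byte order */
    nalu[0] = (uint8_t)(payload_len >> 24);
    nalu[1] = (uint8_t)(payload_len >> 16);
    nalu[2] = (uint8_t)(payload_len >> 8);
    nalu[3] = (uint8_t)(payload_len);
    return 0;
}
```

The buffer produced this way is what steps 4 and 5 wrap in a CMBlockBuffer and a CMSampleBuffer.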

Other notes:

  • H.264 streams can vary a lot. From what I learned, NALU start code headers are sometimes 3 bytes (0x00 00 01) and sometimes 4 (0x00 00 00 01). My code works for 4 bytes; you will need to change a few things around if you're working with 3.

  • If you want to know more about NALUs, I found this answer to be very helpful. In my case, I found that I didn't need to ignore the "emulation prevention" bytes as described, so I personally skipped that step but you may need to know about that.

  • If your VTDecompressionSession outputs an error number (like -12909), look up the error code in your Xcode project. Find the VideoToolbox framework in your project navigator, open it and find the header VTErrors.h. If you can't find it, I've also included all the error codes below in another answer.
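
For the first note, here is one way the start code search could cope with both lengths. This is a plain-C sketch rather than part of my original code, and find_start_code is a hypothetical helper name. It reports where the next start code begins and whether it is 3 or 4 bytes long, so the caller knows how many bytes to skip or overwrite.

```c
#include <stddef.h>
#include <stdint.h>

/* Finds the first start code at or after index `from`.
 * On success returns its offset and stores its length (3 or 4) in *code_len.
 * Returns -1 if no start code is found or the arguments are invalid. */
long find_start_code(const uint8_t *buf, size_t len, size_t from, int *code_len)
{
    if (buf == NULL || code_len == NULL) {
        return -1;
    }
    for (size_t i = from; i + 3 <= len; i++) {
        if (buf[i] != 0x00 || buf[i + 1] != 0x00) {
            continue;
        }
        if (buf[i + 2] == 0x01) {              /* 00 00 01 */
            *code_len = 3;
            return (long)i;
        }
        if (i + 4 <= len && buf[i + 2] == 0x00 && buf[i + 3] == 0x01) {
            *code_len = 4;                     /* 00 00 00 01 */
            return (long)i;
        }
    }
    return -1;
}
```

With a 3-byte start code the NALU header byte sits at offset + 3 instead of offset + 4, and the AVCC length header no longer fits in the space the start code occupied, so the data has to be copied into a buffer that is one byte larger.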

Code Example:

So let's start by declaring some properties and importing the VT framework (VT = Video Toolbox).

#import <AVFoundation/AVFoundation.h>   // for AVSampleBufferDisplayLayer
#import <VideoToolbox/VideoToolbox.h>

// THISCLASSNAME is a placeholder for your own class
@interface THISCLASSNAME ()
@property (nonatomic, assign) CMVideoFormatDescriptionRef formatDesc;
@property (nonatomic, assign) VTDecompressionSessionRef decompressionSession;
@property (nonatomic, retain) AVSampleBufferDisplayLayer *videoLayer;
@property (nonatomic, assign) int spsSize;
@property (nonatomic, assign) int ppsSize;
@end

The following array is only used so that you can print out what type of NALU frame you are receiving. If you know what all these types mean, good for you, you know more about H.264 than me :) My code only handles types 1, 5, 7 and 8.

NSString * const naluTypesStrings[] =
{
    @"0: Unspecified (non-VCL)",
    @"1: Coded slice of a non-IDR picture (VCL)",    // P frame
    @"2: Coded slice data partition A (VCL)",
    @"3: Coded slice data partition B (VCL)",
    @"4: Coded slice data partition C (VCL)",
    @"5: Coded slice of an IDR picture (VCL)",      // I frame
    @"6: Supplemental enhancement information (SEI) (non-VCL)",
    @"7: Sequence parameter set (non-VCL)",         // SPS parameter
    @"8: Picture parameter set (non-VCL)",          // PPS parameter
    @"9: Access unit delimiter (non-VCL)",
    @"10: End of sequence (non-VCL)",
    @"11: End of stream (non-VCL)",
    @"12: Filler data (non-VCL)",
    @"13: Sequence parameter set extension (non-VCL)",
    @"14: Prefix NAL unit (non-VCL)",
    @"15: Subset sequence parameter set (non-VCL)",
    @"16: Reserved (non-VCL)",
    @"17: Reserved (non-VCL)",
    @"18: Reserved (non-VCL)",
    @"19: Coded slice of an auxiliary coded picture without partitioning (non-VCL)",
    @"20: Coded slice extension (non-VCL)",
    @"21: Coded slice extension for depth view components (non-VCL)",
    @"22: Reserved (non-VCL)",
    @"23: Reserved (non-VCL)",
    @"24: STAP-A Single-time aggregation packet (non-VCL)",
    @"25: STAP-B Single-time aggregation packet (non-VCL)",
    @"26: MTAP16 Multi-time aggregation packet (non-VCL)",
    @"27: MTAP24 Multi-time aggregation packet (non-VCL)",
    @"28: FU-A Fragmentation unit (non-VCL)",
    @"29: FU-B Fragmentation unit (non-VCL)",
    @"30: Unspecified (non-VCL)",
    @"31: Unspecified (non-VCL)",
};

Now this is where all the magic happens.

-(void) receivedRawVideoFrame:(uint8_t *)frame withSize:(uint32_t)frameSize isIFrame:(int)isIFrame
{
    // start at noErr: the checks further down read status even when no call has set it yet
    OSStatus status = noErr;

    uint8_t *data = NULL;
    uint8_t *pps = NULL;
    uint8_t *sps = NULL;

    // I know what my H.264 data source's NALUs look like so I know start code index is always 0.
    // if you don't know where it starts, you can use a for loop similar to how i find the 2nd and 3rd start codes
    int startCodeIndex = 0;
    int secondStartCodeIndex = 0;
    int thirdStartCodeIndex = 0;

    long blockLength = 0;

    CMSampleBufferRef sampleBuffer = NULL;
    CMBlockBufferRef blockBuffer = NULL;

    int nalu_type = (frame[startCodeIndex + 4] & 0x1F);
    NSLog(@"~~~~~~~ Received NALU Type \"%@\" ~~~~~~~~", naluTypesStrings[nalu_type]);

    // if we haven't already set up our format description with our SPS PPS parameters, we
    // can't process any frames except type 7 that has our parameters
    if (nalu_type != 7 && _formatDesc == NULL)
    {
        NSLog(@"Video error: Frame is not an I Frame and format description is null");
        return;
    }

    // NALU type 7 is the SPS parameter NALU
    if (nalu_type == 7)
    {
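        // find where the second start code (0x00 00 00 01) begins; that is the start of the
        // PPS NALU, and its index also gives us the length of the SPS NALU.
        // the scan is bounded by frameSize so it never reads past the end of the frame
        for (int i = startCodeIndex + 4; i + 3 < (int)frameSize; i++)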
        {
            if (frame[i] == 0x00 && frame[i+1] == 0x00 && frame[i+2] == 0x00 && frame[i+3] == 0x01)
            {
                secondStartCodeIndex = i;
                _spsSize = secondStartCodeIndex;   // includes the header in the size
                break;
            }
        }

        // find what the second NALU type is
        nalu_type = (frame[secondStartCodeIndex + 4] & 0x1F);
        NSLog(@"~~~~~~~ Received NALU Type \"%@\" ~~~~~~~~", naluTypesStrings[nalu_type]);
    }

    // type 8 is the PPS parameter NALU
    if(nalu_type == 8)
    {
        // find where the NALU after this one starts so we know how long the PPS parameter is
        // (bounded by frameSize so it never reads past the end of the frame)
        for (int i = _spsSize + 4; i + 3 < (int)frameSize; i++)
        {
            if (frame[i] == 0x00 && frame[i+1] == 0x00 && frame[i+2] == 0x00 && frame[i+3] == 0x01)
            {
                thirdStartCodeIndex = i;
                _ppsSize = thirdStartCodeIndex - _spsSize;
                break;
            }
        }

        // allocate enough data to fit the SPS and PPS parameters into our data objects.
        // VTD doesn't want you to include the start code header (4 bytes long) so we add the - 4 here
        sps = malloc(_spsSize - 4);
        pps = malloc(_ppsSize - 4);

        // copy in the actual sps and pps values, again ignoring the 4 byte header
        memcpy (sps, &frame[4], _spsSize-4);
        memcpy (pps, &frame[_spsSize+4], _ppsSize-4);

        // now we set our H264 parameters
        uint8_t*  parameterSetPointers[2] = {sps, pps};
        size_t parameterSetSizes[2] = {_spsSize-4, _ppsSize-4};

        status = CMVideoFormatDescriptionCreateFromH264ParameterSets(kCFAllocatorDefault, 2, 
                                                (const uint8_t *const*)parameterSetPointers, 
                                                parameterSetSizes, 4, 
                                                &_formatDesc);

        NSLog(@"\t\t Creation of CMVideoFormatDescription: %@", (status == noErr) ? @"successful!" : @"failed...");
        if(status != noErr) NSLog(@"\t\t Format Description ERROR type: %d", (int)status);

        // See if decomp session can convert from previous format description 
        // to the new one, if not we need to remake the decomp session.
        // This snippet was not necessary for my applications but it could be for yours
        /*BOOL needNewDecompSession = (VTDecompressionSessionCanAcceptFormatDescription(_decompressionSession, _formatDesc) == NO);
         if(needNewDecompSession)
         {
             [self createDecompSession];
         }*/

        // now lets handle the IDR frame that (should) come after the parameter sets
        // I say "should" because that's how I expect my H264 stream to work, YMMV
        nalu_type = (frame[thirdStartCodeIndex + 4] & 0x1F);
        NSLog(@"~~~~~~~ Received NALU Type \"%@\" ~~~~~~~~", naluTypesStrings[nalu_type]);
    }

    // create our VTDecompressionSession.  This isn't necessary if you choose to use AVSampleBufferDisplayLayer
    if((status == noErr) && (_decompressionSession == NULL))
    {
        [self createDecompSession];
    }

    // type 5 is an IDR frame NALU.  The SPS and PPS NALUs should always be followed by an IDR (or IFrame) NALU, as far as I know
    if(nalu_type == 5)
    {
        // find the offset, or where the SPS and PPS NALUs end and the IDR frame NALU begins.
        // thirdStartCodeIndex equals _spsSize + _ppsSize when the parameter sets arrived in this
        // frame, and stays 0 when the frame starts with the IDR NALU itself
        int offset = thirdStartCodeIndex;
        blockLength = frameSize - offset;
        data = malloc(blockLength);
        memcpy(data, &frame[offset], blockLength);

        // replace the start code header on this NALU with its size.
        // AVCC format requires that you do this.
        // htonl (from <arpa/inet.h>) converts the unsigned int from host to network (big-endian) byte order
        uint32_t dataLength32 = htonl ((uint32_t)(blockLength - 4));
        memcpy (data, &dataLength32, sizeof (uint32_t));

        // create a block buffer from the IDR NALU.
        // kCFAllocatorMalloc makes the block buffer the owner of data: it calls free() on it when
        // the buffer is destroyed, so the memory stays valid while the decoder is still using it
        status = CMBlockBufferCreateWithMemoryBlock(NULL, data,  // memoryBlock to hold buffered data
                                                    blockLength,  // block length of the mem block in bytes.
                                                    kCFAllocatorMalloc, NULL,
                                                    0, // offsetToData
                                                    blockLength,   // dataLength of relevant bytes, starting at offsetToData
                                                    0, &blockBuffer);

        NSLog(@"\t\t BlockBufferCreation: \t %@", (status == kCMBlockBufferNoErr) ? @"successful!" : @"failed...");
    }

    // NALU type 1 is non-IDR (or PFrame) picture
    if (nalu_type == 1)
    {
        // non-IDR frames do not have an offset due to SPS and PPS, so the approach
        // is similar to the IDR frames just without the offset
        blockLength = frameSize;
        data = malloc(blockLength);
        memcpy(data, &frame[0], blockLength);

        // again, replace the start header with the size of the NALU
        uint32_t dataLength32 = htonl ((uint32_t)(blockLength - 4));
        memcpy (data, &dataLength32, sizeof (uint32_t));

        // as above, kCFAllocatorMalloc lets the block buffer free data when it is destroyed
        status = CMBlockBufferCreateWithMemoryBlock(NULL, data,  // memoryBlock to hold data. If NULL, block will be alloc when needed
                                                    blockLength,  // overall length of the mem block in bytes
                                                    kCFAllocatorMalloc, NULL,
                                                    0,     // offsetToData
                                                    blockLength,  // dataLength of relevant data bytes, starting at offsetToData
                                                    0, &blockBuffer);

        NSLog(@"\t\t BlockBufferCreation: \t %@", (status == kCMBlockBufferNoErr) ? @"successful!" : @"failed...");
    }

    // now create our sample buffer from the block buffer
    // (skipped when this frame produced no block buffer, e.g. an unhandled NALU type)
    if(status == noErr && blockBuffer != NULL)
    {
        // here I'm not bothering with any timing specifics since in my case we displayed all frames immediately
        const size_t sampleSize = blockLength;
        status = CMSampleBufferCreate(kCFAllocatorDefault,
                                      blockBuffer, true, NULL, NULL,
                                      _formatDesc, 1, 0, NULL, 1,
                                      &sampleSize, &sampleBuffer);

        NSLog(@"\t\t SampleBufferCreate: \t %@", (status == noErr) ? @"successful!" : @"failed...");
    }

    if(status == noErr && sampleBuffer != NULL)
    {
        // set some values of the sample buffer's attachments
        CFArrayRef attachments = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, YES);
        CFMutableDictionaryRef dict = (CFMutableDictionaryRef)CFArrayGetValueAtIndex(attachments, 0);
        CFDictionarySetValue(dict, kCMSampleAttachmentKey_DisplayImmediately, kCFBooleanTrue);

        // either send the samplebuffer to a VTDecompressionSession or to an AVSampleBufferDisplayLayer
        // (render: releases the sample buffer when it is done with it)
        [self render:sampleBuffer];
    }

    // free memory to avoid memory leaks.
    // the parameter set copies are no longer needed once the format description exists
    free (sps);
    free (pps);

    // the sample buffer retains the block buffer, so we drop our own reference here.
    // data is freed by the block buffer; we only free it ourselves if no block buffer took it over
    if (blockBuffer != NULL)
    {
        CFRelease(blockBuffer);
    }
    else if (data != NULL)
    {
        free (data);
    }
}
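
The method above locates the SPS, PPS and IDR NALUs with two separate loops. A more general approach, sketched here in plain C under the assumption that every NALU in the frame uses a 4-byte start code, splits the whole frame at each start code in a single bounds-checked pass. nalu_span and split_frame are hypothetical names used only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    size_t offset;  /* index of the NALU's start code within the frame */
    size_t size;    /* bytes up to the next start code, start code included */
} nalu_span;

/* Splits `frame` at every 4-byte start code (00 00 00 01) and records up to
 * `max_spans` NALUs in `spans`. Returns the number of NALUs recorded.
 * Bytes before the first start code, if any, are ignored. */
size_t split_frame(const uint8_t *frame, size_t frame_size,
                   nalu_span *spans, size_t max_spans)
{
    size_t count = 0;
    if (frame == NULL || spans == NULL) {
        return 0;
    }
    for (size_t i = 0; i + 4 <= frame_size; i++) {
        if (frame[i] == 0x00 && frame[i + 1] == 0x00 &&
            frame[i + 2] == 0x00 && frame[i + 3] == 0x01) {
            if (count > 0) {
                /* the previous NALU ends where this start code begins */
                spans[count - 1].size = i - spans[count - 1].offset;
            }
            if (count == max_spans) {
                return count;
            }
            spans[count].offset = i;
            spans[count].size = frame_size - i;  /* provisional: runs to end of frame */
            count++;
            i += 3;  /* skip the rest of the start code */
        }
    }
    return count;
}
```

For a frame that carries SPS, PPS and IDR together, spans[0].size plays the role of _spsSize, spans[1].size the role of _ppsSize, and spans[2].offset is the offset at which the IDR NALU begins.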

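The following method creates your VTD session. Recreate it whenever you receive new parameters that the existing session cannot accept. (I'm pretty sure you don't have to recreate it every time you receive parameters; VTDecompressionSessionCanAcceptFormatDescription, shown commented out in the method above, tells you whether the current session can handle the new format description.)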

If you want to set attributes for the destination CVPixelBuffer, read up on CoreVideo PixelBufferAttributes values and put them in NSDictionary *destinationImageBufferAttributes.

-(void) createDecompSession
{
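    // make sure to destroy the old VTD session (just setting it to NULL would leak it)
    if (_decompressionSession != NULL)
    {
        VTDecompressionSessionInvalidate(_decompressionSession);
        CFRelease(_decompressionSession);
        _decompressionSession = NULL;
    }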
    VTDecompressionOutputCallbackRecord callBackRecord;
    callBackRecord.decompressionOutputCallback = decompressionSessionDecodeFrameCallback;

    // this is necessary if you need to make calls to Objective C "self" from within in the callback method.
    callBackRecord.decompressionOutputRefCon = (__bridge void *)self;

    // you can set some desired attributes for the destination pixel buffer.  I didn't use this but you may
    // if you need to set some attributes, be sure to uncomment the dictionary in VTDecompressionSessionCreate
    NSDictionary *destinationImageBufferAttributes = [NSDictionary dictionaryWithObjectsAndKeys:
                                                      [NSNumber numberWithBool:YES],
                                                      (id)kCVPixelBufferOpenGLESCompatibilityKey,
                                                      nil];

    OSStatus status =  VTDecompressionSessionCreate(NULL, _formatDesc, NULL,
                                                    NULL, // (__bridge CFDictionaryRef)(destinationImageBufferAttributes)
                                                    &callBackRecord, &_decompressionSession);
    NSLog(@"Video Decompression Session Create: \t %@", (status == noErr) ? @"successful!" : @"failed...");
    if(status != noErr) NSLog(@"\t\t VTD ERROR type: %d", (int)status);
}

Now this method gets called every time VTD is done decompressing any frame you sent to it. This method gets called even if there's an error or if the frame is dropped.

void decompressionSessionDecodeFrameCallback(void *decompressionOutputRefCon,
                                             void *sourceFrameRefCon,
                                             OSStatus status,
                                             VTDecodeInfoFlags infoFlags,
                                             CVImageBufferRef imageBuffer,
                                             CMTime presentationTimeStamp,
                                             CMTime presentationDuration)
{
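    THISCLASSNAME *streamManager = (__bridge THISCLASSNAME *)decompressionOutputRefCon;

    // balance the CFBridgingRetain done in render: on the NSDate passed as sourceFrameRefCon
    if (sourceFrameRefCon != NULL) CFRelease(sourceFrameRefCon);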

    if (status != noErr)
    {
        NSError *error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
        NSLog(@"Decompressed error: %@", error);
    }
    else
    {
        NSLog(@"Decompressed successfully");

        // do something with your resulting CVImageBufferRef that is your decompressed frame
        [streamManager displayDecodedFrame:imageBuffer];
    }
}

This is where we actually send the sampleBuffer off to the VTD to be decoded.

- (void) render:(CMSampleBufferRef)sampleBuffer
{
    VTDecodeFrameFlags flags = kVTDecodeFrame_EnableAsynchronousDecompression;
    VTDecodeInfoFlags flagOut;
    NSDate* currentTime = [NSDate date];
    OSStatus status = VTDecompressionSessionDecodeFrame(_decompressionSession, sampleBuffer, flags,
                                                        (void*)CFBridgingRetain(currentTime), &flagOut);
    if(status != noErr) NSLog(@"\t\t Decode frame ERROR type: %d", (int)status);

    CFRelease(sampleBuffer);

    // if you're using AVSampleBufferDisplayLayer, you only need to use this line of code
    // (call it before the CFRelease above)
    // [self.videoLayer enqueueSampleBuffer:sampleBuffer];
}

If you're using AVSampleBufferDisplayLayer, be sure to init the layer like this, in viewDidLoad or inside some other init method.

-(void) viewDidLoad
{
    [super viewDidLoad];

    // create our AVSampleBufferDisplayLayer and add it to the view
    self.videoLayer = [[AVSampleBufferDisplayLayer alloc] init];
    self.videoLayer.frame = self.view.frame;
    self.videoLayer.bounds = self.view.bounds;
    self.videoLayer.videoGravity = AVLayerVideoGravityResizeAspect;

    // set Timebase, you may need this if you need to display frames at specific times
    // I didn't need it so I haven't verified that the timebase is working
    CMTimebaseRef controlTimebase = NULL;
    CMTimebaseCreateWithMasterClock(CFAllocatorGetDefault(), CMClockGetHostTimeClock(), &controlTimebase);
    if (controlTimebase != NULL)
    {
        // uncomment the next line to hand the timebase to the layer (the layer retains it)
        //self.videoLayer.controlTimebase = controlTimebase;
        CMTimebaseSetTime(controlTimebase, kCMTimeZero);
        CMTimebaseSetRate(controlTimebase, 1.0);
        CFRelease(controlTimebase);
    }

    [[self.view layer] addSublayer:self.videoLayer];
}

#2


13  

In case you can't find the VTD error codes in the framework, I've included them here. (Again, all these errors and more can be found inside the VideoToolbox.framework itself in the project navigator, in the file VTErrors.h.)

If you did something incorrectly, you will get one of these error codes either in the VTD decode frame callback or when you create your VTD session.

kVTPropertyNotSupportedErr              = -12900,
kVTPropertyReadOnlyErr                  = -12901,
kVTParameterErr                         = -12902,
kVTInvalidSessionErr                    = -12903,
kVTAllocationFailedErr                  = -12904,
kVTPixelTransferNotSupportedErr         = -12905, // c.f. -8961
kVTCouldNotFindVideoDecoderErr          = -12906,
kVTCouldNotCreateInstanceErr            = -12907,
kVTCouldNotFindVideoEncoderErr          = -12908,
kVTVideoDecoderBadDataErr               = -12909, // c.f. -8969
kVTVideoDecoderUnsupportedDataFormatErr = -12910, // c.f. -8970
kVTVideoDecoderMalfunctionErr           = -12911, // c.f. -8960
kVTVideoEncoderMalfunctionErr           = -12912,
kVTVideoDecoderNotAvailableNowErr       = -12913,
kVTImageRotationNotSupportedErr         = -12914,
kVTVideoEncoderNotAvailableNowErr       = -12915,
kVTFormatDescriptionChangeNotSupportedErr   = -12916,
kVTInsufficientSourceColorDataErr       = -12917,
kVTCouldNotCreateColorCorrectionDataErr = -12918,
kVTColorSyncTransformConvertFailedErr   = -12919,
kVTVideoDecoderAuthorizationErr         = -12210,
kVTVideoEncoderAuthorizationErr         = -12211,
kVTColorCorrectionPixelTransferFailedErr    = -12212,
kVTMultiPassStorageIdentifierMismatchErr    = -12213,
kVTMultiPassStorageInvalidErr           = -12214,
kVTFrameSiloInvalidTimeStampErr         = -12215,
kVTFrameSiloInvalidTimeRangeErr         = -12216,
kVTCouldNotFindTemporalFilterErr        = -12217,
kVTPixelTransferNotPermittedErr         = -12218,
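
If you log OSStatus values a lot, a small lookup saves trips to the header. The plain-C sketch below maps the codes listed above to their names. vt_error_name is a hypothetical helper, not something VideoToolbox provides, and the table is only as complete as the list in this answer.

```c
#include <stdint.h>

/* Returns the VTErrors.h constant name for a status code from the list
 * above, "noErr" for 0, or "unknown" for anything else. */
const char *vt_error_name(int32_t status)
{
    switch (status) {
        case 0:      return "noErr";
        case -12900: return "kVTPropertyNotSupportedErr";
        case -12901: return "kVTPropertyReadOnlyErr";
        case -12902: return "kVTParameterErr";
        case -12903: return "kVTInvalidSessionErr";
        case -12904: return "kVTAllocationFailedErr";
        case -12905: return "kVTPixelTransferNotSupportedErr";
        case -12906: return "kVTCouldNotFindVideoDecoderErr";
        case -12907: return "kVTCouldNotCreateInstanceErr";
        case -12908: return "kVTCouldNotFindVideoEncoderErr";
        case -12909: return "kVTVideoDecoderBadDataErr";
        case -12910: return "kVTVideoDecoderUnsupportedDataFormatErr";
        case -12911: return "kVTVideoDecoderMalfunctionErr";
        case -12912: return "kVTVideoEncoderMalfunctionErr";
        case -12913: return "kVTVideoDecoderNotAvailableNowErr";
        case -12914: return "kVTImageRotationNotSupportedErr";
        case -12915: return "kVTVideoEncoderNotAvailableNowErr";
        case -12916: return "kVTFormatDescriptionChangeNotSupportedErr";
        case -12917: return "kVTInsufficientSourceColorDataErr";
        case -12918: return "kVTCouldNotCreateColorCorrectionDataErr";
        case -12919: return "kVTColorSyncTransformConvertFailedErr";
        case -12210: return "kVTVideoDecoderAuthorizationErr";
        case -12211: return "kVTVideoEncoderAuthorizationErr";
        case -12212: return "kVTColorCorrectionPixelTransferFailedErr";
        case -12213: return "kVTMultiPassStorageIdentifierMismatchErr";
        case -12214: return "kVTMultiPassStorageInvalidErr";
        case -12215: return "kVTFrameSiloInvalidTimeStampErr";
        case -12216: return "kVTFrameSiloInvalidTimeRangeErr";
        case -12217: return "kVTCouldNotFindTemporalFilterErr";
        case -12218: return "kVTPixelTransferNotPermittedErr";
        default:     return "unknown";
    }
}
```

In an NSLog call the result can be printed with %s next to the numeric value, which makes a code like -12909 readable at a glance.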

#3


8  

A good Swift example of much of this can be found in Josh Baker's Avios library: https://github.com/tidwall/Avios

Note that Avios currently expects the user to handle chunking data at NAL start codes, but does handle decoding the data from that point forward.

Also worth a look is the Swift based RTMP library HaishinKit (formerly "LF"), which has its own decoding implementation, including more robust NALU parsing: https://github.com/shogo4405/lf.swift

#4


4  

In addition to VTErrors above, I thought it's worth adding CMFormatDescription, CMBlockBuffer, CMSampleBuffer errors that you may encounter while trying Livy's example.

kCMFormatDescriptionError_InvalidParameter  = -12710,
kCMFormatDescriptionError_AllocationFailed  = -12711,
kCMFormatDescriptionError_ValueNotAvailable = -12718,

kCMBlockBufferNoErr                             = 0,
kCMBlockBufferStructureAllocationFailedErr      = -12700,
kCMBlockBufferBlockAllocationFailedErr          = -12701,
kCMBlockBufferBadCustomBlockSourceErr           = -12702,
kCMBlockBufferBadOffsetParameterErr             = -12703,
kCMBlockBufferBadLengthParameterErr             = -12704,
kCMBlockBufferBadPointerParameterErr            = -12705,
kCMBlockBufferEmptyBBufErr                      = -12706,
kCMBlockBufferUnallocatedBlockErr               = -12707,
kCMBlockBufferInsufficientSpaceErr              = -12708,

kCMSampleBufferError_AllocationFailed             = -12730,
kCMSampleBufferError_RequiredParameterMissing     = -12731,
kCMSampleBufferError_AlreadyHasDataBuffer         = -12732,
kCMSampleBufferError_BufferNotReady               = -12733,
kCMSampleBufferError_SampleIndexOutOfRange        = -12734,
kCMSampleBufferError_BufferHasNoSampleSizes       = -12735,
kCMSampleBufferError_BufferHasNoSampleTimingInfo  = -12736,
kCMSampleBufferError_ArrayTooSmall                = -12737,
kCMSampleBufferError_InvalidEntryCount            = -12738,
kCMSampleBufferError_CannotSubdivide              = -12739,
kCMSampleBufferError_SampleTimingInfoInvalid      = -12740,
kCMSampleBufferError_InvalidMediaTypeForOperation = -12741,
kCMSampleBufferError_InvalidSampleData            = -12742,
kCMSampleBufferError_InvalidMediaFormat           = -12743,
kCMSampleBufferError_Invalidated                  = -12744,
kCMSampleBufferError_DataFailed                   = -16750,
kCMSampleBufferError_DataCanceled                 = -16751,
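
When a call fails, the first question is usually which layer produced the error. As a rough plain-C sketch, the values listed in this thread can be used to classify a status code by origin. error_source is a hypothetical helper, and it only recognizes the codes quoted in these answers; other Core Media codes exist and fall through to "unknown".

```c
#include <stdint.h>

/* Names the API family a status code belongs to, based only on the
 * codes quoted in this thread. Anything else is reported as "unknown". */
const char *error_source(int32_t status)
{
    if (status == 0) {
        return "no error";
    }
    if (status <= -12700 && status >= -12708) {
        return "CMBlockBuffer";
    }
    if (status == -12710 || status == -12711 || status == -12718) {
        return "CMFormatDescription";
    }
    if ((status <= -12730 && status >= -12744) ||
        status == -16750 || status == -16751) {
        return "CMSampleBuffer";
    }
    if ((status <= -12900 && status >= -12919) ||
        (status <= -12210 && status >= -12218)) {
        return "VideoToolbox";
    }
    return "unknown";
}
```

A CMBlockBuffer or CMSampleBuffer result points at the packaging steps, a CMFormatDescription result at the SPS and PPS handling, and a VideoToolbox result at the decompression session itself.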

#5


1  

@Livy, to avoid leaking the previous format description, you should add the following before the call to CMVideoFormatDescriptionCreateFromH264ParameterSets:

if (_formatDesc) {
    CFRelease(_formatDesc);
    _formatDesc = NULL;
}