How to directly rotate a CVImageBuffer image in iOS 4 without converting to UIImage?

I am using OpenCV 2.2 on the iPhone to detect faces. I'm using iOS 4's AVCaptureSession to get access to the camera stream, as shown in the code below.

My challenge is that the video frames come in as CVBufferRef (pointers to CVImageBuffer) objects, and they arrive in landscape orientation, 480px wide by 300px high. This is fine if you are holding the phone sideways, but when the phone is held upright I want to rotate these frames 90 degrees clockwise so that OpenCV can find the faces correctly.

I could convert the CVBufferRef to a CGImage, then to a UIImage, and then rotate it, as described in the question "Rotate CGImage taken from video frame".

However, that wastes a lot of CPU. I'm looking for a faster way to rotate the incoming frames, ideally using the GPU for this processing if possible.

Any ideas?

Ian

Code Sample:

 -(void) startCameraCapture {
  // Start up the face detector

  faceDetector = [[FaceDetector alloc] initWithCascade:@"haarcascade_frontalface_alt2" withFileExtension:@"xml"];

  // Create the AVCapture Session
  session = [[AVCaptureSession alloc] init];

  // create a preview layer to show the output from the camera
  AVCaptureVideoPreviewLayer *previewLayer = [AVCaptureVideoPreviewLayer layerWithSession:session];
  previewLayer.frame = previewView.frame;
  previewLayer.videoGravity = AVLayerVideoGravityResizeAspectFill;

  [previewView.layer addSublayer:previewLayer];

  // Get the default camera device
  AVCaptureDevice* camera = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];

  // Create a AVCaptureInput with the camera device
  NSError *error=nil;
  AVCaptureInput* cameraInput = [[AVCaptureDeviceInput alloc] initWithDevice:camera error:&error];
  if (cameraInput == nil) {
   // Without an input there is nothing to capture, so log and bail out
   NSLog(@"Failed to create camera input: %@", error);
   return;
  }

  // Set the output
  AVCaptureVideoDataOutput* videoOutput = [[AVCaptureVideoDataOutput alloc] init];
  videoOutput.alwaysDiscardsLateVideoFrames = YES;

  // create a queue besides the main thread queue to run the capture on
  dispatch_queue_t captureQueue = dispatch_queue_create("captureQueue", NULL);

  // setup our delegate
  [videoOutput setSampleBufferDelegate:self queue:captureQueue];

  // Release our reference to the queue. The video output retains the queue
  // when the delegate is set, so it stays alive for as long as it is needed.
  dispatch_release(captureQueue);

  // configure the pixel format
  videoOutput.videoSettings = [NSDictionary dictionaryWithObjectsAndKeys:
          [NSNumber numberWithUnsignedInt:kCVPixelFormatType_32BGRA], 
          (id)kCVPixelBufferPixelFormatTypeKey,
          nil];

  // and the size of the frames we want
  // try AVCaptureSessionPresetLow if this is too slow...
  [session setSessionPreset:AVCaptureSessionPresetMedium];

  // If you wish to cap the frame rate to a known value, such as 10 fps, set 
  // minFrameDuration.
  videoOutput.minFrameDuration = CMTimeMake(1, 10);

  // Add the input and output
  [session addInput:cameraInput];
  [session addOutput:videoOutput];

  // Start the session
  [session startRunning];  
 }

 - (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection {
  // only run if we're not already processing an image
  if (!faceDetector.imageNeedsProcessing) {

   // Get CVImage from sample buffer
   CVImageBufferRef cvImage = CMSampleBufferGetImageBuffer(sampleBuffer);

   // Send the CVImage to the FaceDetector for later processing
   [faceDetector setImageFromCVPixelBufferRef:cvImage];

   // Trigger the image processing on the main thread
   [self performSelectorOnMainThread:@selector(processImage) withObject:nil waitUntilDone:NO];
  }
 }

4 Solutions

#1

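vImage is a pretty fast way to do it. It requires iOS 5, though. The function name says ARGB, but it works just as well for the BGRA data you get from the buffer, because the rotation only moves whole 4-byte pixels and never looks at the channel order.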

This also has the advantage that you can cut out a part of the buffer and rotate that. See my answer here

- (unsigned char*) rotateBuffer: (CMSampleBufferRef) sampleBuffer
{
 CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
 CVPixelBufferLockBaseAddress(imageBuffer,0);

 size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
 size_t width = CVPixelBufferGetWidth(imageBuffer);
 size_t height = CVPixelBufferGetHeight(imageBuffer);
 size_t bytesPerRowOut = 4*height*sizeof(unsigned char); // rotated rows are `height` pixels wide
 size_t outSize = bytesPerRowOut*width;                  // rotated image has `width` rows

 void *srcBuff = CVPixelBufferGetBaseAddress(imageBuffer); 
 unsigned char *outBuff = (unsigned char*)malloc(outSize); // the caller must free() this
 if (outBuff == NULL) {
  CVPixelBufferUnlockBaseAddress(imageBuffer,0);
  return NULL;
 }

 // vImage_Buffer fields are { data, height, width, rowBytes }
 vImage_Buffer ibuff = { srcBuff, height, width, bytesPerRow};
 vImage_Buffer ubuff = { outBuff, width, height, bytesPerRowOut};

 // 0, 1, 2, 3 is equal to 0, 90, 180, 270 degrees counter-clockwise rotation;
 // use 3 for the 90 degrees clockwise rotation asked about in the question
 uint8_t rotConst = 1;

 vImage_Error err= vImageRotate90_ARGB8888 (&ibuff, &ubuff, NULL, rotConst, NULL,0);
 if (err != kvImageNoError) NSLog(@"%ld", (long)err);

 // Balance the lock taken above before handing the copy back
 CVPixelBufferUnlockBaseAddress(imageBuffer,0);
 return outBuff;
}
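
The cropping mentioned above works because a vImage_Buffer is only a base pointer plus a height, a width, and a row stride, so a sub-rectangle can be described without copying any pixels. A minimal sketch in plain C (callable from Objective-C), assuming 4 bytes per pixel; `PixelView` and `crop_view_32` are hypothetical names, and the struct merely mirrors the four fields of vImage_Buffer:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in with the same four fields as vImage_Buffer. */
typedef struct {
    void  *data;      /* first pixel of the region */
    size_t height;    /* rows in the region */
    size_t width;     /* pixels per row in the region */
    size_t rowBytes;  /* bytes from one row to the next */
} PixelView;

/* Describe the w x h rectangle at (x, y) of a 32-bit-per-pixel image.
 * No pixels are copied; only the start address and the extents change. */
static PixelView crop_view_32(void *base, size_t rowBytes,
                              size_t x, size_t y, size_t w, size_t h)
{
    PixelView v;
    v.data = (uint8_t *)base + y * rowBytes + x * 4;
    v.height = h;
    v.width = w;
    v.rowBytes = rowBytes;  /* the stride is still that of the full image */
    return v;
}
```

The resulting values could be used to fill the source vImage_Buffer in place of the full-frame ones; the output buffer then only needs room for the cropped region.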

#2

Maybe easier to just set the video orientation the way you want:

connection.videoOrientation = AVCaptureVideoOrientationPortrait;

This way you don't need to do the rotation trick at all.

#3

If you rotate in multiples of 90 degrees, you can just do it in memory. Here is example code that simply copies the data to a new pixel buffer. Doing a brute-force rotation from there should be straightforward.

- (CVPixelBufferRef) rotateBuffer: (CMSampleBufferRef) sampleBuffer
{
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CVPixelBufferLockBaseAddress(imageBuffer,0);

    size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
    size_t width = CVPixelBufferGetWidth(imageBuffer);
    size_t height = CVPixelBufferGetHeight(imageBuffer);

    void *src_buff = CVPixelBufferGetBaseAddress(imageBuffer);

    NSDictionary *options = [NSDictionary dictionaryWithObjectsAndKeys:
                             [NSNumber numberWithBool:YES], kCVPixelBufferCGImageCompatibilityKey,
                             [NSNumber numberWithBool:YES], kCVPixelBufferCGBitmapContextCompatibilityKey,
                             nil];

    CVPixelBufferRef pxbuffer = NULL;
    //CVReturn status = CVPixelBufferPoolCreatePixelBuffer (NULL, _pixelWriter.pixelBufferPool, &pxbuffer);
    CVReturn status = CVPixelBufferCreate(kCFAllocatorDefault, width,
                                          height, kCVPixelFormatType_32BGRA, (CFDictionaryRef) options, 
                                          &pxbuffer);

    NSParameterAssert(status == kCVReturnSuccess && pxbuffer != NULL);

    CVPixelBufferLockBaseAddress(pxbuffer, 0);
    void *dest_buff = CVPixelBufferGetBaseAddress(pxbuffer);
    NSParameterAssert(dest_buff != NULL);

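    // Copy 4 bytes (one BGRA pixel) at a time. This assumes the destination
    // buffer has the same bytes-per-row as the source buffer.
    int *src = (int*) src_buff ;
    int *dest= (int*) dest_buff ;
    size_t count = (bytesPerRow * height) / 4 ;
    while (count--) {
        *dest++ = *src++;
    }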

    //Test straight copy.
    //memcpy(pxdata, baseAddress, width * height * 4) ;
    CVPixelBufferUnlockBaseAddress(pxbuffer, 0);
    CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
    return pxbuffer;
}

You can then use AVAssetWriterInputPixelBufferAdaptor if you are writing this back out to an AVAssetWriterInput.

The above is not optimized. You may want to look for a more efficient copy algorithm. A good place to start is In-place Matrix Transpose. You would also want to use a pixel buffer pool rather than creating a new buffer each time.

Edit: You could use the GPU to do this, though it sounds like a lot of data being pushed around. CVPixelBufferRef has the key kCVPixelBufferOpenGLCompatibilityKey. I assume you could create an OpenGL-compatible image from the CVImageBufferRef (which is just a pixel buffer ref) and push it through a shader. Again, overkill IMO. You might check whether BLAS or LAPACK has 'out of place' transpose methods. If they do, you can be assured they are highly optimized.

90 CW where new_width = width ... This will get you a portrait oriented image.

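// i walks the source columns (one per destination row), starting at column 0;
// j walks the source rows bottom-up, which yields a 90 degree clockwise rotation.
// width is the source row length in pixels.
for (int i = 0; i < new_height; i++) {
    for (int j = new_width - 1; j > -1; j--) {
        *dest++ = *(src + (j * width) + i) ;
    }
}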
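
For iOS 4, where the vImage call from answer #1 is not available, the brute-force loop above can also be written so that it respects row padding (bytes per row is not always exactly 4 * width). This is a minimal sketch in plain C, callable from Objective-C, assuming 32-bit pixels such as BGRA and 4-byte-aligned rows; `rotate90CW_32` and its parameter names are hypothetical, not part of any Apple API:

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit-per-pixel image 90 degrees clockwise.
 * src holds srcWidth x srcHeight pixels with srcRowBytes bytes per row.
 * dst must hold srcHeight x srcWidth pixels with dstRowBytes bytes per row. */
static void rotate90CW_32(const uint8_t *src, size_t srcWidth, size_t srcHeight,
                          size_t srcRowBytes, uint8_t *dst, size_t dstRowBytes)
{
    /* Each destination row is one source column. */
    for (size_t dy = 0; dy < srcWidth; dy++) {
        uint32_t *dstRow = (uint32_t *)(dst + dy * dstRowBytes);
        for (size_t dx = 0; dx < srcHeight; dx++) {
            /* Read the source column bottom-up so the image turns clockwise. */
            size_t sy = srcHeight - 1 - dx;
            const uint32_t *srcRow = (const uint32_t *)(src + sy * srcRowBytes);
            dstRow[dx] = srcRow[dy];
        }
    }
}
```

With a locked CVPixelBuffer, the base address, width, height, and bytes per row of the source and destination buffers would be passed in directly. Writing whole destination rows keeps the stores sequential, although the reads still stride through memory, so a blocked variant may be faster on large frames.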

#4

I know this is quite an old question, but I've been solving a similar problem recently and maybe someone will find my solution useful.

I needed to extract raw image data from an image buffer in the YCbCr format delivered by the iPhone camera (obtained from [AVCaptureVideoDataOutput.availableVideoCVPixelFormatTypes firstObject]), dropping information such as headers and metadata, so that I could pass it on for further processing.

Also, I needed to extract only a small area in the center of the captured video frame, so some cropping was needed.

My conditions allowed capturing video only in the two landscape orientations, but when the device is in the landscape left orientation the image is delivered upside down, so I needed to flip it along both axes. When the image is flipped, my idea was to copy rows from the source image buffer in reverse order and to reverse the bytes in each copied row, which flips the image along both axes. That idea really works, and since I needed to copy data from the source buffer anyway, there seems to be little performance penalty whether I read from the start or from the end. (Of course, a bigger image means longer processing, but I deal with really small sizes.)

I'd like to know what others think about this solution, and of course I'd welcome hints on how to improve the code:

/// Lock pixel buffer
CVPixelBufferLockBaseAddress(imageBuffer, 0);

/// Address where image buffer starts
uint8_t *baseAddress = (uint8_t *)CVPixelBufferGetBaseAddress(imageBuffer);

/// Read image parameters
size_t width = CVPixelBufferGetWidth(imageBuffer);
size_t height = CVPixelBufferGetHeight(imageBuffer);

/// See whether image is flipped upside down
BOOL isFlipped = (_previewLayer.connection.videoOrientation == AVCaptureVideoOrientationLandscapeLeft);

/// Calculate cropping frame. Crop to scanAreaSize (defined as CGSize constant elsewhere) from the center of an image
CGRect cropFrame = CGRectZero;
cropFrame.size = scanAreaSize;
cropFrame.origin.x = (width / 2.0f) - (scanAreaSize.width / 2.0f);
cropFrame.origin.y = (height / 2.0f) - (scanAreaSize.height / 2.0f);

/// Update proportions to cropped size
width = (size_t)cropFrame.size.width;
height = (size_t)cropFrame.size.height;

/// Allocate memory for output image data. W*H for Y component, W*H/2 for CbCr component
size_t bytes = width * height + (width * height / 2);

uint8_t *outputDataBaseAddress = (uint8_t *)malloc(bytes);

if(outputDataBaseAddress == NULL) {

    /// Memory allocation failed, unlock buffer and give up
    CVPixelBufferUnlockBaseAddress(imageBuffer, 0);

    return NULL;
}

/// Get parameters of YCbCr pixel format
CVPlanarPixelBufferInfo_YCbCrBiPlanar *bufferInfo = (CVPlanarPixelBufferInfo_YCbCrBiPlanar *)baseAddress;

NSUInteger bytesPerRowY = EndianU32_BtoN(bufferInfo->componentInfoY.rowBytes);
NSUInteger offsetY = EndianU32_BtoN(bufferInfo->componentInfoY.offset);

NSUInteger bytesPerRowCbCr = EndianU32_BtoN(bufferInfo->componentInfoCbCr.rowBytes);
NSUInteger offsetCbCr = EndianU32_BtoN(bufferInfo->componentInfoCbCr.offset);

/// Copy image data only, skipping headers and metadata. Create single buffer which will contain Y component data
/// followed by CbCr component data.

/// Process Y component
/// Pointer to the source buffer
uint8_t *src;

/// Pointer to the destination buffer
uint8_t *destAddress;

/// Calculate crop rect offset. Crop offset is number of rows (y * bytesPerRow) + x offset.
/// If image is flipped, then read buffer from the end to flip image vertically. End address is height-1!
int flipOffset = (isFlipped) ? (int)((height - 1) * bytesPerRowY) : 0;

int cropOffset = (int)((cropFrame.origin.y * bytesPerRowY) + flipOffset + cropFrame.origin.x);

/// Set source pointer to Y component buffer start address plus crop rect offset
src = baseAddress + offsetY + cropOffset;

for(int y = 0; y < height; y++) {

    /// Copy one row of pixel data from source into the output buffer.
    destAddress = (outputDataBaseAddress + y * width);

    memcpy(destAddress, src, width);

    if(isFlipped) {

        /// Reverse bytes in row to flip image horizontally
        [self reverseBytes:destAddress bytesSize:(int)width];

        /// Move one row up
        src -= bytesPerRowY;
    }
    else {

        /// Move to the next row
        src += bytesPerRowY;
    }
}

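/// Calculate crop offset for CbCr component. The CbCr plane has half as many rows as the Y plane,
/// so the crop origin row is halved; x is rounded down to an even byte so Cb/Cr pairs stay aligned.
flipOffset = (isFlipped) ? (int)(((height / 2) - 1) * bytesPerRowCbCr) : 0;
cropOffset = (int)((((size_t)cropFrame.origin.y / 2) * bytesPerRowCbCr) + flipOffset + ((size_t)cropFrame.origin.x & ~(size_t)1));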

/// Set source pointer to the CbCr component offset + crop offset
src = (baseAddress + offsetCbCr + cropOffset);

for(int y = 0; y < (height / 2); y++) {

    /// Copy one row of pixel data from source into the output buffer.
    destAddress = (outputDataBaseAddress + (width * height) + y * width);

    memcpy(destAddress, src, width);

    if(isFlipped) {

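        /// Reverse bytes in row to flip image horizontally.
        /// Note: a plain byte reversal also swaps the Cb and Cr byte of each pair;
        /// reverse in 2-byte units instead if the chroma order matters downstream.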
        [self reverseBytes:destAddress bytesSize:(int)width];

        /// Move one row up
        src -= bytesPerRowCbCr;
    }
    else {

        src += bytesPerRowCbCr;
    }
}

/// Unlock pixel buffer
CVPixelBufferUnlockBaseAddress(imageBuffer, 0);

/// Continue with image data in outputDataBaseAddress;
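
The reverseBytes:bytesSize: helper is not shown above. One way it might look is sketched below in plain C (callable from Objective-C), together with a pair-wise variant for the interleaved CbCr rows, where reversing single bytes would swap the Cb and Cr of every pair. Both function names are hypothetical, and the sketch assumes the bi-planar layout described above (one byte per Y sample, two bytes per CbCr pair):

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse a row of single-byte samples (a Y row) in place. */
static void reverse_bytes(uint8_t *row, size_t count)
{
    if (count < 2) return;
    for (size_t i = 0, j = count - 1; i < j; i++, j--) {
        uint8_t t = row[i];
        row[i] = row[j];
        row[j] = t;
    }
}

/* Reverse a row of interleaved 2-byte samples (CbCr pairs) in place.
 * The pairs change position, but each pair keeps its Cb-then-Cr order. */
static void reverse_byte_pairs(uint8_t *row, size_t byteCount)
{
    size_t pairs = byteCount / 2;
    if (pairs < 2) return;
    for (size_t i = 0, j = pairs - 1; i < j; i++, j--) {
        uint8_t cb = row[2 * i];
        uint8_t cr = row[2 * i + 1];
        row[2 * i]     = row[2 * j];
        row[2 * i + 1] = row[2 * j + 1];
        row[2 * j]     = cb;
        row[2 * j + 1] = cr;
    }
}
```

In the loops above, the first function would apply to the Y rows and the second to the CbCr rows, each called on the freshly copied destination row.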
