Comparing an AVFoundation buffer with a saved image

Time: 2021-11-01 19:48:24

I am a long time reader, first time poster on *, and must say it has been a great source of knowledge for me.

I am trying to get to know the AVFoundation framework.

What I want to do is save what the camera sees and then detect when something changes.

Here is the part where I save the image to a UIImage:

if (shouldSetBackgroundImage) {
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();

    // Create a bitmap graphics context with the sample buffer data
    CGContextRef context = CGBitmapContextCreate(rowBase, bufferWidth,
        bufferHeight, 8, bytesPerRow,
        colorSpace, kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst); 
    // Create a Quartz image from the pixel data in the bitmap graphics context
    CGImageRef quartzImage = CGBitmapContextCreateImage(context); 

    // Free up the context and color space
    CGContextRelease(context); 
    CGColorSpaceRelease(colorSpace);

    // Create an image object from the Quartz image
    UIImage * image = [UIImage imageWithCGImage:quartzImage];
    [self setBackgroundImage:image];
    NSLog(@"reference image actually set");

    // Release the Quartz image
    CGImageRelease(quartzImage);

    //Signal that the image has been saved
    shouldSetBackgroundImage = NO;

}

And here is the part where I check if there is any change in the image seen by the camera:

else {

    CGImageRef cgImage = [backgroundImage CGImage];
    CGDataProviderRef provider = CGImageGetDataProvider(cgImage);
    CFDataRef bitmapData = CGDataProviderCopyData(provider);
    // CFDataGetBytePtr returns a const pointer to the copied bitmap bytes
    const unsigned char *data = CFDataGetBytePtr(bitmapData);

    if (data != NULL)
    {
        int64_t numDiffer = 0, pixelCount = 0;
        NSMutableArray * pointsMutable = [NSMutableArray array];

        // Sample every 8th row and column rather than every pixel
        for( int row = 0; row < bufferHeight; row += 8 ) {
            for( int column = 0; column < bufferWidth; column += 8 ) {

                // we get one pixel from each source (buffer and saved image)
                unsigned char *pixel = rowBase + (row * bytesPerRow) + (column * BYTES_PER_PIXEL);
                const unsigned char *referencePixel = data + (row * bytesPerRow) + (column * BYTES_PER_PIXEL);

                pixelCount++;

                if ( !match(pixel, referencePixel, matchThreshold) ) {
                    numDiffer++;
                    [pointsMutable addObject:[NSValue valueWithCGPoint:
                        CGPointMake(SCREEN_WIDTH - (column / (float) bufferHeight) * SCREEN_WIDTH - 4.0,
                                    (row / (float) bufferWidth) * SCREEN_HEIGHT - 4.0)]];
                }
            }
        }
        numberOfPixelsThatDiffer = numDiffer;
        points = [pointsMutable copy];
    }

    // The copied bitmap data is owned by this code and must be released
    CFRelease(bitmapData);
}

For some reason, this doesn't work, meaning that the iPhone detects almost everything as being different from the saved image, even though I set a very low threshold for detection in the match function...

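The match function is not shown here; for context, a per-channel comparison consistent with that call site might look like the following (a hypothetical sketch, not necessarily the code actually used):

static BOOL match(const unsigned char *pixel, const unsigned char *referencePixel, int threshold)
{
    // Compare the B, G and R channels of the BGRA pixel; ignore alpha
    for (int channel = 0; channel < 3; channel++) {
        if (abs((int)pixel[channel] - (int)referencePixel[channel]) > threshold) {
            return NO;
        }
    }
    return YES;
}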

Do you have any idea of what I am doing wrong?

2 Answers

#1

There are three possibilities I can think of for why you might be seeing nearly every pixel be different: colorspace conversions, incorrect mapping of pixel locations, or your thresholding being too sensitive for the actual movement of the iPhone camera. The first two aren't very likely, so I think it might be the third, but they're worth checking.

There might be some color correction going on when you place your pixels within a UIImage, then extract them later. You could try simply storing them in their native state from the buffer, then using that original buffer as the point of comparison, not the UIImage's backing data.

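A minimal sketch of that idea, assuming rowBase, bytesPerRow and bufferHeight come from the locked pixel buffer as in the question, and a hypothetical referenceFrame NSData property to hold the copy:

if (shouldSetBackgroundImage) {
    // Keep the raw BGRA bytes instead of round-tripping through a UIImage
    self.referenceFrame = [NSData dataWithBytes:rowBase
                                         length:bytesPerRow * bufferHeight];
    shouldSetBackgroundImage = NO;
} else {
    // Compare against the untouched copy, so no color management is involved
    const unsigned char *referenceBase = [self.referenceFrame bytes];
    // ... same sampling loop as in the question, but reading referenceBase
    //     instead of the UIImage's backing data ...
}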

Also, check to make sure that your row / column arithmetic works out for the actual pixel locations in both images. Perhaps generate a difference image from the absolute difference of the two images, then use a simple black / white divided area as a test image for the camera.

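For example, a visual difference image can be built directly from the two BGRA buffers, and any row / column mapping mistake tends to show up immediately when you look at it (diffBase and referenceBase are assumed here to be buffers with the same size and stride as the camera frame):

for (int row = 0; row < bufferHeight; row++) {
    for (int column = 0; column < bufferWidth; column++) {
        size_t offset = row * bytesPerRow + column * BYTES_PER_PIXEL;
        for (int channel = 0; channel < 3; channel++) {   // B, G, R
            int delta = (int)rowBase[offset + channel] - (int)referenceBase[offset + channel];
            diffBase[offset + channel] = (unsigned char)abs(delta);
        }
        diffBase[offset + 3] = 255;   // opaque alpha so the result is viewable
    }
}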

The most likely case is that the overall image is shifting by more than one pixel simply through the act of a human hand holding it. These whole-frame image shifts could cause almost every pixel to be different in a simple comparison. You may need to adjust your thresholding or do more intelligent motion estimation, like that used in video compression routines.

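One simple way to make the per-pixel threshold less fragile (a sketch of one option, not the answer's code) is to decide at the frame level rather than the pixel level, and only report motion when a sizeable fraction of the sampled pixels differ:

// numDiffer and pixelCount are the counters from the sampling loop in the question
float changedFraction = (pixelCount > 0) ? (float)numDiffer / (float)pixelCount : 0.0f;
BOOL sceneChanged = (changedFraction > 0.10f);   // 10% is an arbitrary starting point to tune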

Finally, when it comes to the comparison operation, I'd recommend taking a look at OpenGL ES 2.0 shaders for performing this. You should see a huge speedup (14-28X in my benchmarks) over doing this pixel-by-pixel comparison on the CPU. I show how to do color-based thresholding using the GPU in this article, which has this iPhone sample application that tracks colored objects in real time using GLSL shaders.
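As an illustration of the kind of fragment shader involved (a sketch only; the uniform and varying names are assumptions, not code from the article), the per-pixel threshold test maps naturally onto GLSL:

// Outputs white where the color distance between the live and reference frames
// exceeds the threshold, black elsewhere
static NSString *const kThresholdDifferenceFragmentShader =
    @"varying highp vec2 textureCoordinate;\n"
    @"uniform sampler2D liveFrame;\n"
    @"uniform sampler2D referenceFrame;\n"
    @"uniform highp float threshold;\n"
    @"void main()\n"
    @"{\n"
    @"    highp vec4 live = texture2D(liveFrame, textureCoordinate);\n"
    @"    highp vec4 reference = texture2D(referenceFrame, textureCoordinate);\n"
    @"    highp float difference = distance(live.rgb, reference.rgb);\n"
    @"    gl_FragColor = vec4(vec3(step(threshold, difference)), 1.0);\n"
    @"}\n";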

#2

Human eyes are very different from a camera (even a very expensive one) in that we don't perceive minimal light changes or small motion changes. Cameras DO; they are very sensitive but not smart at all!

With your current approach (it seems you are comparing each pixel): what would happen if the frame were shifted only 1 pixel to the right?! You can imagine the result of your algorithm, right? Humans would perceive nothing, or almost nothing.

There is also the camera shutter problem: That means that every frame might not have the same amount of light. Hence, a pixel-by-pixel comparison method is too prone to fail.

You want to at least pre-process your image and extract some basic features. Maybe edges, corners, etc. OpenCV is easy for that, but I am not sure that doing such processing will be fast on the iPhone. (It depends on your image size.)

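For example, comparing edge maps instead of raw pixels is fairly tolerant of small lighting changes. A rough Objective-C++ sketch using OpenCV's Canny detector (buffer names and thresholds are assumptions):

// Requires an .mm source file so Objective-C and OpenCV's C++ API can be mixed
#import <opencv2/imgproc/imgproc.hpp>

// Wrap the BGRA camera buffer without copying, then extract an edge map
cv::Mat frame((int)bufferHeight, (int)bufferWidth, CV_8UC4, rowBase, bytesPerRow);
cv::Mat gray, edges;
cv::cvtColor(frame, gray, cv::COLOR_BGRA2GRAY);
cv::Canny(gray, edges, 50, 150);   // thresholds are starting points to tune
// edges can now be compared against a stored reference edge map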

Alternatively, you can try the naive template matching algorithm with a template size that is a little smaller than your whole view size.

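A naive version of that idea (a sketch of the general technique, not code from the answer) scores a candidate offset by summing absolute differences between the template and the frame; the offset with the smallest score is the best match, and a large minimum suggests the scene has genuinely changed rather than merely shifted:

// Sum of absolute differences between a template and the frame at offset (dx, dy).
// Assumes the template is a little smaller than the frame and shares its stride,
// and that the caller keeps (dx, dy) within bounds.
static int64_t sumOfAbsoluteDifferences(const unsigned char *frame, const unsigned char *tmpl,
                                        int dx, int dy,
                                        int templateWidth, int templateHeight,
                                        size_t bytesPerRow, int bytesPerPixel)
{
    int64_t sum = 0;
    for (int row = 0; row < templateHeight; row++) {
        for (int column = 0; column < templateWidth; column++) {
            const unsigned char *p = frame + (row + dy) * bytesPerRow + (column + dx) * bytesPerPixel;
            const unsigned char *t = tmpl + row * bytesPerRow + column * bytesPerPixel;
            for (int channel = 0; channel < 3; channel++) {   // ignore alpha
                int delta = (int)p[channel] - (int)t[channel];
                sum += (delta < 0) ? -delta : delta;
            }
        }
    }
    return sum;
}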

Image processing is computationally expensive, so don't expect it to be fast on the first try, especially on a mobile device, and even more so if you don't have experience with image processing / computer vision.

Hope it helps ;)
