
Time: 2021-11-01 19:48:24

I am a long-time reader, first-time poster on *, and I must say it has been a great source of knowledge for me.


I am trying to get to know the AVFoundation framework.


What I want to do is save what the camera sees and then detect when something changes.


Here is the part where I save the camera frame to a UIImage:


if (shouldSetBackgroundImage) {
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();

    // Create a bitmap graphics context with the sample buffer data
    CGContextRef context = CGBitmapContextCreate(rowBase, bufferWidth,
        bufferHeight, 8, bytesPerRow,
        colorSpace, kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
    // Create a Quartz image from the pixel data in the bitmap graphics context
    CGImageRef quartzImage = CGBitmapContextCreateImage(context);

    // Free up the context and color space
    CGContextRelease(context);
    CGColorSpaceRelease(colorSpace);

    // Create an image object from the Quartz image
    UIImage *image = [UIImage imageWithCGImage:quartzImage];
    [self setBackgroundImage:image];
    NSLog(@"reference image actually set");

    // Release the Quartz image
    CGImageRelease(quartzImage);

    // Signal that the image has been saved
    shouldSetBackgroundImage = NO;
}


And here is the part where I check whether the image seen by the camera has changed:


else {

    CGImageRef cgImage = [backgroundImage CGImage];
    CGDataProviderRef provider = CGImageGetDataProvider(cgImage);
    CFDataRef bitmapData = CGDataProviderCopyData(provider);
    const UInt8 *data = (bitmapData != NULL) ? CFDataGetBytePtr(bitmapData) : NULL;
    // The saved image may not share the live buffer's row stride, so ask it
    size_t referenceBytesPerRow = CGImageGetBytesPerRow(cgImage);

    if (data != NULL) {
        int64_t numDiffer = 0, pixelCount = 0;
        NSMutableArray *pointsMutable = [NSMutableArray array];

        for (int row = 0; row < bufferHeight; row += 8) {
            for (int column = 0; column < bufferWidth; column += 8) {

                // We get one pixel from each source (buffer and saved image)
                const unsigned char *pixel = rowBase + (row * bytesPerRow) + (column * BYTES_PER_PIXEL);
                const unsigned char *referencePixel = data + (row * referenceBytesPerRow) + (column * BYTES_PER_PIXEL);
                pixelCount++;

                if (!match(pixel, referencePixel, matchThreshold)) {
                    numDiffer++;
                    [pointsMutable addObject:[NSValue valueWithCGPoint:CGPointMake(SCREEN_WIDTH - (column / (float)bufferHeight) * SCREEN_WIDTH - 4.0, (row / (float)bufferWidth) * SCREEN_HEIGHT - 4.0)]];
                }
            }
        }
        numberOfPixelsThatDiffer = numDiffer;
        points = [pointsMutable copy];
    }

    // CGDataProviderCopyData returns an owned copy, so release it
    if (bitmapData != NULL) {
        CFRelease(bitmapData);
    }
}

For some reason, this doesn't work, meaning that the iPhone detects almost everything as being different from the saved image, even though I set a very low threshold for detection in the match function...


Do you have any idea of what I am doing wrong?


2 Answers



There are three possibilities I can think of for why you might be seeing nearly every pixel be different: colorspace conversions, incorrect mapping of pixel locations, or your thresholding being too sensitive for the actual movement of the iPhone camera. The first two aren't very likely, so I think it might be the third, but they're worth checking.


There might be some color correction going on when you place your pixels within a UIImage, then extract them later. You could try simply storing them in their native state from the buffer, then using that original buffer as the point of comparison, not the UIImage's backing data.
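
To illustrate storing the pixels in their native state, here is a minimal sketch in plain C (it also compiles inside Objective-C). It assumes 4-byte BGRA pixels, as in the question's bitmap context. The ReferenceFrame type and the function name are made up for this example and are not part of AVFoundation.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for the saved reference frame (not an Apple type).
   Initialize it to all zeros before the first call. */
typedef struct {
    uint8_t *pixels;      /* tightly packed copy of the frame */
    size_t   width;
    size_t   height;
    size_t   bytesPerRow; /* always width * 4 in the copy */
} ReferenceFrame;

/* Copy a 4-bytes-per-pixel frame row by row, dropping any row padding
   the source buffer may have. Returns 0 on success, -1 on failure. */
int reference_frame_store(ReferenceFrame *ref, const uint8_t *rowBase,
                          size_t width, size_t height, size_t srcBytesPerRow)
{
    const size_t packedRow = width * 4;
    uint8_t *copy = malloc(packedRow * height);
    if (copy == NULL) {
        return -1;
    }
    for (size_t row = 0; row < height; row++) {
        memcpy(copy + row * packedRow, rowBase + row * srcBytesPerRow, packedRow);
    }
    free(ref->pixels); /* discard any previously stored frame */
    ref->pixels = copy;
    ref->width = width;
    ref->height = height;
    ref->bytesPerRow = packedRow;
    return 0;
}
```

The comparison loop would then read the reference pixel from ref.pixels using ref.bytesPerRow, so no UIImage round trip is involved.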


Also, check to make sure that your row / column arithmetic works out for the actual pixel locations in both images. Perhaps generate a difference image (the absolute difference of the two images), then use a simple area divided into black and white halves as a test image for the camera.
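
A difference image could be produced along these lines. This is only a sketch in plain C: it assumes both frames are BGRA with 4 bytes per pixel, takes a separate row stride for each, and writes one grayscale byte per pixel. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* For every pixel, store the largest absolute difference over the B, G and R
   channels (alpha, the 4th byte, is ignored) into a width * height
   grayscale output buffer. */
void difference_image(const uint8_t *a, size_t aBytesPerRow,
                      const uint8_t *b, size_t bBytesPerRow,
                      size_t width, size_t height, uint8_t *out)
{
    for (size_t row = 0; row < height; row++) {
        for (size_t col = 0; col < width; col++) {
            const uint8_t *pa = a + row * aBytesPerRow + col * 4;
            const uint8_t *pb = b + row * bBytesPerRow + col * 4;
            uint8_t maxDiff = 0;
            for (int c = 0; c < 3; c++) {
                int d = abs((int)pa[c] - (int)pb[c]);
                if (d > maxDiff) {
                    maxDiff = (uint8_t)d;
                }
            }
            out[row * width + col] = maxDiff;
        }
    }
}
```

With a static black / white test target, the output should be near zero everywhere except along the dividing edge. Stripes or a skewed pattern usually point to a stride or indexing mismatch.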


The most likely case is that the overall image is shifting by more than one pixel simply because a human hand is holding the phone. These whole-frame shifts could cause almost every pixel to differ in a simple comparison. You may need to adjust your thresholding or do more intelligent motion estimation, like that used in video compression routines.
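
Short of real motion estimation, one cheap way to make the comparison less sensitive to small shifts is to compare averages over coarse cells instead of single pixels. The sketch below is my own simplification, not what video codecs do. It assumes both frames have already been reduced to tightly packed 8-bit grayscale, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Count the cellSize x cellSize cells whose mean intensity differs by more
   than `threshold` between the two frames. Partial cells at the right and
   bottom edges are skipped. */
size_t count_changed_cells(const uint8_t *ref, const uint8_t *cur,
                           size_t width, size_t height,
                           size_t cellSize, int threshold)
{
    size_t changed = 0;
    if (cellSize == 0) {
        return 0;
    }
    for (size_t cy = 0; cy + cellSize <= height; cy += cellSize) {
        for (size_t cx = 0; cx + cellSize <= width; cx += cellSize) {
            long sumRef = 0;
            long sumCur = 0;
            for (size_t y = cy; y < cy + cellSize; y++) {
                for (size_t x = cx; x < cx + cellSize; x++) {
                    sumRef += ref[y * width + x];
                    sumCur += cur[y * width + x];
                }
            }
            /* Compare sums against threshold * pixel count to avoid a division. */
            long pixelsPerCell = (long)(cellSize * cellSize);
            if (labs(sumRef - sumCur) > (long)threshold * pixelsPerCell) {
                changed++;
            }
        }
    }
    return changed;
}
```

A one-pixel jitter only changes the pixels along a cell's border, so the cell mean moves far less than individual pixel values do.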


Finally, when it comes to the comparison operation, I'd recommend taking a look at OpenGL ES 2.0 shaders for performing this. You should see a huge speedup (14-28X in my benchmarks) over doing this pixel-by-pixel comparison on the CPU. I show how to do color-based thresholding using the GPU in this article, which has this iPhone sample application that tracks colored objects in real time using GLSL shaders.




Human eyes are very different from a camera (even a very expensive one) in that we don't perceive minimal changes in light or small changes in motion. Cameras do: they are very sensitive, but not smart at all!


With your current approach (it seems you are comparing each pixel), what would happen if the frame shifted only 1 pixel to the right? You can imagine the result of your algorithm, right? A human would perceive nothing, or almost nothing.


There is also the camera shutter problem: not every frame receives the same amount of light. Hence, a pixel-by-pixel comparison is too prone to fail.
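
One simple way to reduce the effect of exposure changes is to subtract the difference in mean brightness before comparing. The following is only a sketch under that assumption, written in plain C for tightly packed 8-bit grayscale frames. Both function names are hypothetical, and a global offset will not correct local lighting changes.

```c
#include <stddef.h>
#include <stdint.h>

/* Mean intensity of a grayscale buffer; returns 0 for an empty buffer. */
double mean_intensity(const uint8_t *img, size_t count)
{
    if (count == 0) {
        return 0.0;
    }
    uint64_t sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += img[i];
    }
    return (double)sum / (double)count;
}

/* Count pixels that still differ by more than `threshold` after removing
   the global brightness offset between the two frames. */
size_t count_changed_exposure_compensated(const uint8_t *ref, const uint8_t *cur,
                                          size_t count, double threshold)
{
    const double offset = mean_intensity(cur, count) - mean_intensity(ref, count);
    size_t changed = 0;
    for (size_t i = 0; i < count; i++) {
        double d = ((double)cur[i] - offset) - (double)ref[i];
        if (d < 0.0) {
            d = -d;
        }
        if (d > threshold) {
            changed++;
        }
    }
    return changed;
}
```

A frame that is uniformly brighter than the reference then reports no changed pixels, while a localized change still stands out.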


You want to at least pre-process your image and extract some basic features, such as edges or corners. OpenCV makes that easy, but I am not sure that such processing will be fast on the iPhone (it depends on your image size).


Alternatively, you can try the naive template matching algorithm with a template that is a little smaller than your whole view.
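
Naive template matching can be sketched as an exhaustive search that minimizes the sum of absolute differences (SAD). This version is plain C and assumes tightly packed 8-bit grayscale buffers. The MatchResult type and the function name are made up for the example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical result type: top-left corner of the best match and its SAD. */
typedef struct {
    int      x;
    int      y;
    uint64_t sad;
} MatchResult;

/* Slide the template over every position where it fits entirely inside the
   image and keep the position with the smallest SAD. If the template does
   not fit anywhere, x and y stay at -1. */
MatchResult match_template(const uint8_t *image, int imageWidth, int imageHeight,
                           const uint8_t *tmpl, int tmplWidth, int tmplHeight)
{
    MatchResult best = { -1, -1, UINT64_MAX };
    for (int oy = 0; oy + tmplHeight <= imageHeight; oy++) {
        for (int ox = 0; ox + tmplWidth <= imageWidth; ox++) {
            uint64_t sad = 0;
            for (int ty = 0; ty < tmplHeight; ty++) {
                for (int tx = 0; tx < tmplWidth; tx++) {
                    int p = image[(oy + ty) * imageWidth + (ox + tx)];
                    int t = tmpl[ty * tmplWidth + tx];
                    sad += (uint64_t)abs(p - t);
                }
            }
            if (sad < best.sad) {
                best.x = ox;
                best.y = oy;
                best.sad = sad;
            }
        }
    }
    return best;
}
```

If the template is a centered crop of the reference frame, the offset of the best match in the current frame gives a rough estimate of how far the camera moved, and a large remaining SAD suggests that the scene itself changed. The cost grows with image size times template size, so this is best run on a downscaled frame.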


Image processing is computationally expensive, so don't expect it to be fast on the first try, especially on a mobile device, and even more so if you don't have experience with image processing or computer vision.


Hope it helps ;)




