Tesseract OCR with iOS & Swift returns an error or gibberish

Date: 2022-07-23 09:02:06

I used this tutorial to get Tesseract OCR working with Swift: http://www.piterwilson.com/blog/2014/10/18/minimal-tesseact-ocr-setup-in-swift/


It works fine if I upload the demo image and call


 tesseract.image = UIImage(named: "image_sample.jpg");

But if I use my camera code and take a picture of that same image and call


 tesseract.image = self.image.blackAndWhite();

the result is either gibberish like


s I 5E251 :Ec ‘-. —7.//:E*髧 a g :_{:7 IC‘ J 7 iii—1553‘ : fizzle —‘;-—:


; ~:~./: -:-‘-


‘- :~£:': _-'~‘:


: 37%; §:‘—_


: ::::E 7,;. 1f:,:~ ——,


Or it returns an EXC_BAD_ACCESS error. I haven't been able to work out why it gives the error or the gibberish. This is the code of my camera capture (photoTaken()) and the processing step (nextStepTapped()):


 @IBAction func photoTaken(sender: UIButton) {

    var videoConnection = stillImageOutput.connectionWithMediaType(AVMediaTypeVideo)

    if videoConnection != nil {

        // Show next step button
        self.view.bringSubviewToFront(self.nextStep)
        self.nextStep.hidden = false

        // Secure image
        stillImageOutput.captureStillImageAsynchronouslyFromConnection(videoConnection) {
            (imageDataSampleBuffer, error) -> Void in
                var imageData = AVCaptureStillImageOutput.jpegStillImageNSDataRepresentation(imageDataSampleBuffer)

                self.image = UIImage(data: imageData)

                //var dataProvider = CGDataProviderCreateWithCFData(imageData)
                //var cgImageRef = CGImageCreateWithJPEGDataProvider(dataProvider, nil, true, kCGRenderingIntentDefault)
                //self.image = UIImage(CGImage: cgImageRef, scale: 1.0, orientation: UIImageOrientation.Right)

        }

        // Freeze camera preview
        captureSession.stopRunning()

    }

}

@IBAction func nextStepTapped(sender: UIButton) {

    // Save to camera roll & proceeed
    //UIImageWriteToSavedPhotosAlbum(self.image.blackAndWhite(), nil, nil, nil)
    //UIImageWriteToSavedPhotosAlbum(self.image, nil, nil, nil)

    // OCR

    var tesseract:Tesseract = Tesseract();
    tesseract.language = "eng";
    tesseract.delegate = self;
    tesseract.image = self.image.blackAndWhite();
    tesseract.recognize();

    NSLog("%@", tesseract.recognizedText);

}

The image saves to the Camera Roll and is completely legible if I uncomment the commented lines. Not sure why it won't work. It has no problem reading the text on the image if it's uploaded directly into Xcode as a supporting file, but if I take a picture of the exact same image on my screen then it can't read it.


1 Answer

#1



Stumbled upon this tutorial: http://www.raywenderlich.com/93276/implementing-tesseract-ocr-ios


It happened to mention scaling the image; they chose 640 as the maximum dimension. I was taking my pictures at 640x480, so I figured I didn't need to scale them, but I think this code essentially redraws the image. For some reason my photos now OCR fairly well. I still need to work on image processing for smaller text, but it works perfectly for large text. Run my image through this scaling function and I'm good to go.


  func scaleImage(image: UIImage, maxDimension: CGFloat) -> UIImage {

   var scaledSize = CGSize(width: maxDimension, height: maxDimension)
   var scaleFactor: CGFloat

   if image.size.width > image.size.height {
      scaleFactor = image.size.height / image.size.width
      scaledSize.width = maxDimension
      scaledSize.height = scaledSize.width * scaleFactor
   } else {
      scaleFactor = image.size.width / image.size.height
      scaledSize.height = maxDimension
      scaledSize.width = scaledSize.height * scaleFactor
   }

   UIGraphicsBeginImageContext(scaledSize)
   image.drawInRect(CGRectMake(0, 0, scaledSize.width, scaledSize.height))
   let scaledImage = UIGraphicsGetImageFromCurrentImageContext()
   UIGraphicsEndImageContext()

   return scaledImage
}
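Tying it together with the OCR call from the question, the recognition step would look something like this (a sketch in the same Swift 1.x style as the rest of the post; `Tesseract`, `blackAndWhite()`, and `self.image` are the names from the question's code, and 640 matches the max dimension the tutorial uses):

```swift
// Redraw the captured photo at a max dimension of 640 before
// handing it to Tesseract, using the scaleImage function above.
let scaledImage = scaleImage(self.image.blackAndWhite(), maxDimension: 640)

var tesseract: Tesseract = Tesseract()
tesseract.language = "eng"
tesseract.delegate = self
tesseract.image = scaledImage
tesseract.recognize()

NSLog("%@", tesseract.recognizedText)
```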
