Reading driver's license data with Tesseract OCR on iPhone

Date: 2021-01-02 08:54:20

I am trying to read information from a USA driving license, but I am not able to get the correct text from the image.

I am trying to read an image like the one above, but I am getting strange results, something like the following:

7 WISCONSIN **i_.* 4' L. _-
DRIVER LICENSE Regular
' Q555-5555-2555-00 35533
I5 .4 ClassDMXxX Enduslmmls TPXMXX J
Sex r mnBLQ EyesBl-U 0000.501" 0.00.100
X Restrictions 0n Back MM 08484005
X E0". 00-20-2010
It JANE QUINCY
' * 1' 3913' ECIJ-SWILEKgSJVEEQIJNSRIEMREKBVAY
jilfccgbwm suns 20s
BLACK RIVER FALLS w: 54015-0000

Very few of the words are correct. What should I do to get more accurate results?
My code:

// Load the English language data from the "tessdata" folder.
Tesseract* tesseract4 = [[Tesseract alloc] initWithDataPath:@"tessdata" language:@"eng"];
// Restrict recognition to the characters listed in the whitelist.
[tesseract4 setVariableValue:@"*'\"-_:.0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" forKey:@"tessedit_char_whitelist"];
[tesseract4 setImage:[UIImage imageNamed:@"dlWI.jpg"]];
[tesseract4 recognize];

// Print the recognized text.
NSLog(@"%@", [tesseract4 recognizedText]);

1 Answer

#1


Have a look at the question linked below. It explains how to convert the image to grayscale and preprocess it a bit in order to improve the quality of the results from Tesseract.

iOS Tesseract OCR Image Preperation
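
As a rough illustration of the kind of preprocessing meant here, the following is a minimal sketch in plain C (which compiles unchanged inside an Objective-C file). It is not the code from the linked question. It assumes you have already extracted the image into a tightly packed RGBA buffer with 8 bits per channel, and the function names rgba_to_gray and threshold_gray, the integer luma weights, and the fixed threshold are all illustrative choices rather than anything Tesseract requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert a tightly packed RGBA8888 buffer to 8-bit grayscale.
   Uses integer approximations of the Rec. 601 luma weights
   (0.299, 0.587, 0.114). Alpha is ignored. */
void rgba_to_gray(const uint8_t *rgba, uint8_t *gray, size_t pixel_count) {
    assert(rgba != NULL && gray != NULL);
    for (size_t i = 0; i < pixel_count; i++) {
        const uint8_t *p = rgba + 4 * i;
        /* 77 + 150 + 29 = 256, so shifting right by 8 normalizes the sum. */
        gray[i] = (uint8_t)((77u * p[0] + 150u * p[1] + 29u * p[2]) >> 8);
    }
}

/* Binarize in place: pixels at or above `threshold` become 255, others 0.
   A fixed global threshold is an assumption; unevenly lit photos may
   need something adaptive. */
void threshold_gray(uint8_t *gray, size_t pixel_count, uint8_t threshold) {
    assert(gray != NULL);
    for (size_t i = 0; i < pixel_count; i++) {
        gray[i] = (gray[i] >= threshold) ? 255 : 0;
    }
}
```

The resulting buffer would then have to be wrapped back into an image before being handed to Tesseract; that step depends on your image pipeline and is left out here.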

It is also worth ensuring that your whitelist only includes characters that you actually want to recognize. If you don't need : or _ or *, leave them out of the whitelist; this should clean up the results a bit.
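
One way to trim the whitelist without retyping it is sketched below in plain C (usable as-is from Objective-C). The helper remove_from_whitelist is hypothetical, not part of Tesseract or its iOS wrapper, and which characters count as unwanted is your decision; the idea is only to produce a shorter string that you would then pass for the tessedit_char_whitelist key.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `whitelist` into `out`, skipping every character that appears in
   `unwanted`. Returns the number of characters written (excluding the
   terminator), or (size_t)-1 if `out` is too small. */
size_t remove_from_whitelist(const char *whitelist, const char *unwanted,
                             char *out, size_t out_size) {
    assert(whitelist != NULL && unwanted != NULL && out != NULL);
    size_t n = 0;
    for (const char *c = whitelist; *c != '\0'; c++) {
        if (strchr(unwanted, *c) != NULL) {
            continue; /* character is not wanted, drop it */
        }
        if (n + 1 >= out_size) {
            return (size_t)-1; /* no room for this char plus terminator */
        }
        out[n++] = *c;
    }
    if (out_size == 0) {
        return (size_t)-1;
    }
    out[n] = '\0';
    return n;
}
```

For example, calling it with the whitelist from the question and the unwanted set "*_:" yields a whitelist that no longer contains those three characters.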
