Can digit recognition on the iPhone be done in real time?

Time: 2022-12-23 03:53:44

I need to recognise numbers from the camera image on iPhone, in real-time. I know there will be no more than 5 digits on the image.

Is this problem realistic to solve given the computational specifications of the iPhone? Does anyone have any experience using the Tesseract OCR library, and do you think it could be solved by using it?

5 solutions

#1


11  

That depends on your definition of "real-time", but yes, it should be possible to do relatively fast recognition of just the digits 0-9 on an iPhone 4, particularly if you can constrain the fonts, lighting conditions, etc. that they will appear in.

I highly recommend reading the article on how Sudoku Grab does its recognition of puzzles using the iPhone camera. In their case, a trained neural network was used to identify the digits, which should be reasonably simple and fast on modern iOS hardware.
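To give a rough sense of why a small digit classifier is cheap, here is a back-of-the-envelope sketch in Python. The layer sizes and the 30 fps frame rate are assumptions chosen for illustration; they are not Sudoku Grab's actual architecture. Only the limit of 5 digits comes from the question.

```python
# Hypothetical layer sizes; NOT Sudoku Grab's actual architecture.
INPUT_PIXELS = 16 * 16   # assumed: each digit crop downsampled to 16x16 grayscale
HIDDEN_UNITS = 32        # assumed: one small hidden layer
OUTPUT_CLASSES = 10      # digits 0-9
MAX_DIGITS = 5           # from the question: at most 5 digits per image
FRAMES_PER_SECOND = 30   # assumed camera frame rate


def macs_per_digit(inputs: int, hidden: int, outputs: int) -> int:
    """Multiply-accumulate operations for one forward pass of a
    fully connected network with a single hidden layer."""
    return inputs * hidden + hidden * outputs


def macs_per_second(per_digit: int, digits: int, fps: int) -> int:
    """Worst-case classifier workload when every frame holds `digits` digits."""
    return per_digit * digits * fps


per_digit = macs_per_digit(INPUT_PIXELS, HIDDEN_UNITS, OUTPUT_CLASSES)
per_second = macs_per_second(per_digit, MAX_DIGITS, FRAMES_PER_SECOND)
print(f"{per_digit} MACs per digit, {per_second} MACs per second")
# prints "8512 MACs per digit, 1276800 MACs per second"
```

Under these assumptions the classifier itself needs only about 1.3 million multiply-accumulates per second. This covers classification only; finding and cropping the digits in each frame is a separate cost that this estimate does not include.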

The current recognition libraries out there, like OpenCV, will use the iPhone's CPU to do the processing. I've heard that they can do even more complex tasks like facial recognition fast enough to use with video sources while showing a minimal amount of stutter.

For even better performance, I believe that there's a lot of potential in the programmable GPUs on the newer iOS devices. In my benchmarks, I saw a 14X - 28X speedup when using the iPhone 4's GPU for simple image processing. While few people are looking at this right now, something like Sudoku Grab's neural network should be a parallel enough process to benefit from running on the GPU.

#2


1  

It should be computationally possible. There are apps that can read a bar code in real time, and also an app that does real-time translation (Word Lens). I'm not sure what libraries they use, however.

#3


1  

Yes, it is possible using the Tesseract engine.

Here is some sample code if you would like to check it out:

https://github.com/nolanbrown/Tesseract-iPhone-Demo
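Whichever OCR engine is used, the constraint from the question (digits only, never more than 5) can be applied to the raw text it returns. The following is a minimal Python sketch of such a filter; `clean_reading` is a hypothetical helper written for illustration and is not part of Tesseract or the linked demo. Tesseract also has a `tessedit_char_whitelist` variable intended to restrict recognition to given characters, though how well it is honored may depend on the Tesseract version.

```python
import re
from typing import Optional

MAX_DIGITS = 5  # from the question: never more than 5 digits in the image


def clean_reading(raw_text: str, max_digits: int = MAX_DIGITS) -> Optional[str]:
    """Return the digits found in raw OCR text, or None if the reading
    is empty or longer than the known maximum."""
    # Drop everything that is not an ASCII digit (whitespace, stray letters).
    digits = re.sub(r"[^0-9]", "", raw_text)
    if not digits or len(digits) > max_digits:
        return None
    return digits


# Example readings as an OCR engine might return them (made-up strings).
for raw in [" 4 2\n", "123456", "abc"]:
    print(repr(raw), "->", clean_reading(raw))
```

Rejecting over-long readings instead of truncating them is a design choice: a result with 6 digits is known to be wrong, so it is probably safer to wait for the next frame than to guess which digit to drop.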

#4


1  

There is a free SDK for that: http://rtrsdk.com/ It supports both iOS and Android, works in real time, and helps you capture any text, so numbers should not be a problem.

Disclaimer: I work for ABBYY

#5


0  

Yes. Bender can help you with that. It lets you build and run neural nets on iOS. As it uses Metal under the hood, it runs fast and smoothly. It also supports running TensorFlow models directly.

So you can run an existing TensorFlow model trained for digit recognition in Bender. If you need help, see "Handwritten Digit Recognition using Convolutional Neural Networks in Python with Keras".

Disclaimer: I worked on this project.
