Tesseract with version 4.0 traineddata does not work in a Swift 3.0 project

Time: 2020-12-10 08:54:48

I'm attempting to use Tesseract-OCR-iOS in a new Swift 3.0 project. I'm using Xcode Version 8.1 (8B62). CocoaPods is version 1.1.1.

When I attempt to use tesseract.recognize(), my app crashes and I get the following output in the console:

actual_tessdata_num_entries_ <= TESSDATA_NUM_ENTRIES:Error:Assert failed:in file tessdatamanager.cpp, line 53

I found this post, which suggests I'm using the wrong version of traineddata. I downloaded tessdata from the tesseract-ocr/tessdata repo, so I'm baffled as to why the version numbers would be mismatched.

Any suggestions on how to get Tesseract working would be greatly appreciated. Below is additional information about my setup.

Here's what my Podfile looks like:

# Uncomment the next line to define a global platform for your project
platform :ios, '9.0'

target 'TesseractDemo' do
  # Comment the next line if you're not using Swift and don't want to use dynamic frameworks
  use_frameworks!

  # Pods for TesseractDemo
  pod 'TesseractOCRiOS', '4.0.0'

end

In Finder, I dragged a tessdata folder containing eng.traineddata into the root directory of my project, then dragged a reference to that folder from Finder into Xcode's Project Navigator.

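Before calling into Tesseract, it can help to confirm that the folder reference really ended up in the built app. The following is a minimal sketch, not part of the original setup: `trainedDataURL(in:language:)` is a hypothetical helper, and it assumes the library looks for a `tessdata` folder at the top level of the bundle's resources.

```swift
import Foundation

/// Hypothetical helper: returns the URL of `<language>.traineddata` inside a
/// `tessdata` folder under `resourceDirectory`, or nil if the file is missing.
/// Assumption: tessdata sits directly under the given directory.
func trainedDataURL(in resourceDirectory: URL, language: String) -> URL? {
    let url = resourceDirectory
        .appendingPathComponent("tessdata", isDirectory: true)
        .appendingPathComponent("\(language).traineddata")
    // Report the file only if it actually exists on disk.
    return FileManager.default.fileExists(atPath: url.path) ? url : nil
}
```

In an app this could be called with `Bundle.main.resourceURL` as the directory and "eng" as the language. A nil result would point to a missing folder reference rather than a version mismatch.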

Everything works fine up to this point: no compiler errors, no linker complaints, etc. In a UIViewController I'm importing TesseractOCR and calling it like so:

// MARK: - OCR Methods
func scanImage(image: UIImage) {
    if let tesseract = G8Tesseract(language: "eng") {
        tesseract.delegate = self
        // Use the image passed in as the parameter, converted to black and white
        tesseract.image = image.g8_blackAndWhite()
        tesseract.recognize()

        textView.text = tesseract.recognizedText
    }
}

Update: I found a link to a repo of traineddata files for version 4.0. I deleted my old eng.traineddata file and replaced it with the one from the 4.0 repo. I get the same error, referencing the same line.

1 Solution

#1

The current version of eng.traineddata linked above on GitHub will not work with the current version of Tesseract-OCR-iOS.

The installation instructions posted on GitHub work perfectly if you've got the right <language>.traineddata file.

I discovered this after dragging in the eng.traineddata file from Lyndsey Scott's brilliant Tesseract tutorial on Ray Wenderlich.

This repo contains the eng.traineddata file I needed to get Tesseract working. I'm not sure if that applies to all languages.

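To see why a given file trips the assertion, one option is to inspect its header. As far as I know, a traineddata file starts with a 32-bit entry count, and this is the value the failing check compares against the TESSDATA_NUM_ENTRIES constant compiled into the library. The sketch below is only an illustration under those assumptions: `trainedDataEntryCount(at:)` is a hypothetical helper, and it assumes the header is stored little-endian, as it would be for files built on typical desktop machines.

```swift
import Foundation

/// Hypothetical helper: reads the entry count stored in the first four bytes
/// of a traineddata file. Assumes a little-endian 32-bit integer header.
/// Returns nil if the file cannot be opened or is shorter than four bytes.
func trainedDataEntryCount(at url: URL) -> Int32? {
    guard let handle = try? FileHandle(forReadingFrom: url) else { return nil }
    defer { handle.closeFile() }

    let header = handle.readData(ofLength: 4)
    guard header.count == 4 else { return nil }

    // Assemble the four bytes by hand, least significant byte first.
    var value: UInt32 = 0
    for (index, byte) in header.enumerated() {
        value |= UInt32(byte) << UInt32(8 * index)
    }
    return Int32(bitPattern: value)
}
```

If the count printed for a file is larger than the limit in the Tesseract version bundled with the pod, that file was probably produced for a newer Tesseract release and will hit the same assert.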