How to force tesseract not to use TESSDATA_PREFIX

Date: 2021-01-02 08:54:32

I had tesseract installed on my PC, and it defined the TESSDATA_PREFIX environment variable. After completely uninstalling tesseract, I tried to use the tesseract API this way:

if (myOCR->Init("C:/Projects/project/Release/tessdata/", "rus")) {
    fprintf(stderr, "Could not initialize tesseract.\n");
    exit(1);
}

and I receive:

Error opening data file C:\Program Files (x86)\Tesseract-OCR\tessdata/rus.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language 'rus'
Tesseract couldn't load any languages!
Could not initialize tesseract.

Typing set TESSDATA_PREFIX in cmd tells me that there is no such variable, yet tesseract still remembers the old location (I don't know how). So how can I force tesseract to search for the trained data in a specific folder? Thanks.

3 solutions

#1


3  

This seems helpful: "Tesseract - change language file location"

From the answer in that thread, it appears that tesseract looks for the environment variable and, if it is not set, assumes a fixed location.

The easiest way to fix this is to run "cmd" and then do:

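c:\Users\alex> rem No quotes (cmd keeps them in the value); per the error message, point at the parent of "tessdata"
c:\Users\alex> set TESSDATA_PREFIX=C:/Projects/project/Release/
c:\Users\alex> cd MyOCRProgDir
c:\Users\alex\MyOCRProgDir> MyProg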
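
The same idea can probably be applied from inside the program, so that only the current process is affected. The following is a minimal sketch, not tesseract's own API: set_tessdata_prefix and get_tessdata_prefix are hypothetical helper names, and it assumes tesseract reads the variable with getenv() when Init() is called, as the linked thread suggests.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: set TESSDATA_PREFIX for the current process only.
// Assumption: tesseract reads the variable via getenv() during Init().
bool set_tessdata_prefix(const std::string& parent_of_tessdata) {
#ifdef _WIN32
    // Microsoft CRT: returns 0 on success.
    return _putenv_s("TESSDATA_PREFIX", parent_of_tessdata.c_str()) == 0;
#else
    // POSIX: the final argument 1 overwrites an existing value.
    return setenv("TESSDATA_PREFIX", parent_of_tessdata.c_str(), 1) == 0;
#endif
}

// Read the value back; returns an empty string when the variable is unset.
std::string get_tessdata_prefix() {
    const char* value = std::getenv("TESSDATA_PREFIX");
    return value ? std::string(value) : std::string();
}
```

It would be called before myOCR->Init(...), for example set_tessdata_prefix("C:/Projects/project/Release/"). Whether the library sees the change may depend on how it was built, so treat this as something to try rather than a guaranteed fix.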

Hope that helps!

#2


1  

I've been through the same problem. All I did was copy the tessdata folder to the directory where my application runs.

Note: after doing so, make sure to set the "Copy to Output Directory" property of the tessdata files to "Copy always". This solves the problem.

Refer to this YouTube video for a better demonstration. Hope it helps :)

http://www.youtube.com/watch?v=RqvvXJXuRYY
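
If tessdata sits next to the executable as described above, the directory to hand to tesseract could be derived from the program's own path instead of being hard-coded. This is only a sketch under that assumption; exe_dir is a hypothetical helper, not part of tesseract.

```cpp
#include <string>

// Hypothetical helper: return the directory part of a program path,
// including the trailing separator, or "./" when the path has none.
std::string exe_dir(const std::string& program_path) {
    // Accept both Windows and POSIX separators.
    const std::string::size_type pos = program_path.find_last_of("/\\");
    if (pos == std::string::npos) {
        return "./";
    }
    return program_path.substr(0, pos + 1);
}
```

For example, exe_dir(argv[0]) could be passed as the first argument of Init(), since the error message in the question expects the parent directory of "tessdata". Note that argv[0] is not guaranteed to hold a full path on every platform.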

#3


0  

I had the same problem with the training data. Instead of forcing tesseract not to use TESSDATA_PREFIX, I found a workaround that worked for me.

My machine is 64-bit and I'm building a 32-bit binary with VS2012.

Set the environment variable TESSDATA_PREFIX to C:\Program Files (x86)\Tesseract-OCR

Here "Tesseract-OCR" is the parent directory of the "tessdata" folder.

Edit the PATH variable and add C:\tess\lib\lib;

Here "C:\tess\lib\lib" is where the lib and dll files are located: liblept168.dll, liblept168.lib, etc.

Start a new Win32 console application and apply the following settings.

C/C++ >> General >> Additional Include Directories: C:\tess\include\include

Here "C:\tess\include\include" is the parent directory of the "tesseract" and "leptonica" folders, where the include files are located.

Linker >> General >> Additional Library Directories: C:\tess\lib\lib

Linker >> Input >> Additional Dependencies: liblept168.lib libtesseract302.lib (add these to the list)

C/C++ >> Preprocessor >> Preprocessor Definitions: _CRT_SECURE_NO_WARNINGS (add this to the list)

Copy the two DLLs (the ones corresponding to the library files above) to the Debug and Release folders (not the ones inside the root).

Copy the tessdata folder (from inside the Tesseract installation) to the locations mentioned above.
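
Before running the program, the resulting layout can be sanity-checked. The sketch below rebuilds the path shape shown in the error message from the question (prefix, then "tessdata/", then the language name plus ".traineddata") and tests whether that file can be opened. Both function names are hypothetical helpers, and the path layout is an assumption taken from that error message, not from tesseract's documentation.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: build "<prefix>/tessdata/<lang>.traineddata",
// the path shape that appears in the error message.
std::string traineddata_path(std::string prefix, const std::string& lang) {
    // Add a separator only when the prefix does not already end in one.
    if (!prefix.empty() && prefix.back() != '/' && prefix.back() != '\\') {
        prefix += '/';
    }
    return prefix + "tessdata/" + lang + ".traineddata";
}

// Return true when the file exists and can be opened for reading.
bool file_readable(const std::string& path) {
    std::FILE* f = std::fopen(path.c_str(), "rb");
    if (f == nullptr) {
        return false;
    }
    std::fclose(f);
    return true;
}
```

For instance, file_readable(traineddata_path("C:/Program Files (x86)/Tesseract-OCR", "rus")) returning false would point to a misplaced tessdata folder or a missing language file.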

Hopefully, you will be good to go.
