Tesseract-OCR Training

1. Open a CMD command prompt, change to the directory that contains the image, and run:

tesseract.exe testcode.tif testcode batch.nochop makebox

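Note: the name testcode must be the same in both places (the image file name and the output base name), and the files must be in the same folder.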

The program automatically generates the file testcode.box in the image directory.
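
To make the generated file less opaque: each line of a Tesseract box file holds one symbol followed by the left, bottom, right and top pixel coordinates of its box (origin at the bottom-left corner of the image) and a page number. The following is a minimal sketch for reading such lines, not part of the training tools; the names Box, parse_box_line and parse_box_text are hypothetical, and the sample data is made up for illustration.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """One line of a .box file: a symbol and its bounding box."""
    symbol: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_box_line(line: str) -> Box:
    # Split from the right so the five numeric fields are always taken
    # from the end of the line, whatever the symbol looks like.
    parts = line.rstrip("\r\n").rsplit(" ", 5)
    if len(parts) != 6:
        raise ValueError(f"not a valid box line: {line!r}")
    symbol = parts[0]
    left, bottom, right, top, page = (int(value) for value in parts[1:])
    return Box(symbol, left, bottom, right, top, page)


def parse_box_text(text: str) -> List[Box]:
    # Blank lines are ignored.
    return [parse_box_line(line) for line in text.splitlines() if line.strip()]


# Illustrative sample only; real values come from testcode.box.
sample = "T 10 20 30 60 0\ne 32 20 50 45 0\n"
for box in parse_box_text(sample):
    print(box.symbol, box.right - box.left, box.top - box.bottom)
```

Reading the file this way is only useful for inspection; the corrections themselves are easier to make in jTessBoxEditor, as described in the next steps.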

2. Open the tif file with jTessBoxEditor.jar.

Interface: [screenshot of the jTessBoxEditor window]

3. Correct the character segmentation produced by the program.

(1) Explanation of commonly used menus: [screenshot]

4. When the corrections are finished, cd into the image directory in cmd and run:

tesseract.exe testcode.tif testcode nobatch box.train

Then run:

unicharset_extractor.exe testcode.box

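5. Create a txt file in the image directory, open it, and enter one line consisting of the font name followed by five 0/1 flags (italic, bold, fixed, serif, fraktur):

testcode 0 0 0 0 0

Then rename the txt file to: font_properties (no extension)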
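
The file can also be written from a script, which avoids a stray .txt extension. This is a minimal sketch, assuming the Tesseract 3.x font_properties layout of a font name followed by five 0/1 flags (italic, bold, fixed, serif, fraktur); the function names are hypothetical.

```python
from pathlib import Path


def font_properties_line(font_name: str, italic: bool = False, bold: bool = False,
                         fixed: bool = False, serif: bool = False,
                         fraktur: bool = False) -> str:
    # Assumed layout: <fontname> <italic> <bold> <fixed> <serif> <fraktur>
    flags = (italic, bold, fixed, serif, fraktur)
    return " ".join([font_name] + [str(int(flag)) for flag in flags])


def write_font_properties(directory: str, font_name: str) -> Path:
    # The file must be named exactly "font_properties", with no extension.
    path = Path(directory) / "font_properties"
    with open(path, "w", encoding="ascii", newline="\n") as handle:
        handle.write(font_properties_line(font_name) + "\n")
    return path


print(font_properties_line("testcode"))
```

Calling write_font_properties(".", "testcode") from the image directory would create the file in place.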

6. Run:

cntraining.exe testcode.tr

7. Run:

mftraining.exe -F font_properties -U unicharset testcode.tr
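
The commands from steps 4 to 7 can be collected in a small script so they always run in the same order. This is a sketch under assumptions: the training tools are on PATH, font_properties already exists in the image directory, and the helper names training_commands and run_training are hypothetical.

```python
import subprocess
from typing import List


def training_commands(name: str) -> List[List[str]]:
    # The same commands as in steps 4, 6 and 7, in the order given there.
    return [
        ["tesseract", f"{name}.tif", name, "nobatch", "box.train"],
        ["unicharset_extractor", f"{name}.box"],
        ["cntraining", f"{name}.tr"],
        ["mftraining", "-F", "font_properties", "-U", "unicharset", f"{name}.tr"],
    ]


def run_training(name: str, work_dir: str) -> None:
    # Run each tool inside the image directory and stop at the first failure.
    for command in training_commands(name):
        print("running:", " ".join(command))
        subprocess.run(command, cwd=work_dir, check=True)


# Example (requires the tools to be installed):
# run_training("testcode", r"C:\path\to\images")
```

Passing check=True makes a failing tool raise an exception, so a later step never runs on incomplete output.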

8. In the image directory, find the files unicharset, inttemp, normproto and pffmtable, and add the training name prefix testcode. to each of them (for example, testcode.unicharset).
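
Renaming the files by hand is easy to get wrong, so here is a minimal sketch of the same step in Python. It assumes the four output files are named unicharset, inttemp, normproto and pffmtable (the name mftraining writes); the function name add_prefix is hypothetical.

```python
from pathlib import Path
from typing import List, Sequence

TRAINING_OUTPUTS = ("unicharset", "inttemp", "normproto", "pffmtable")


def add_prefix(directory: str, prefix: str,
               names: Sequence[str] = TRAINING_OUTPUTS) -> List[Path]:
    """Rename each existing output file to '<prefix>.<name>'."""
    folder = Path(directory)
    renamed = []
    for name in names:
        source = folder / name
        if not source.exists():
            # Missing files are skipped rather than treated as an error.
            continue
        target = folder / f"{prefix}.{name}"
        source.replace(target)  # overwrites an older file of the same name
        renamed.append(target)
    return renamed


# Example: add_prefix(".", "testcode") run from the image directory.
```

The returned list shows which files were actually renamed, which makes a missing output from an earlier step easy to spot.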

9. Run:

combine_tessdata testcode.

10. Copy testcode.traineddata into the tessdata directory under the Tesseract-OCR installation directory.
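
The copy in step 10 can be scripted as well. A minimal sketch, assuming the caller knows where the tessdata directory is; the function name install_traineddata and the path in the example are hypothetical.

```python
import shutil
from pathlib import Path


def install_traineddata(work_dir: str, name: str, tessdata_dir: str) -> Path:
    """Copy '<name>.traineddata' from the image directory into tessdata."""
    source = Path(work_dir) / f"{name}.traineddata"
    target_dir = Path(tessdata_dir)
    if not source.is_file():
        raise FileNotFoundError(f"missing training result: {source}")
    if not target_dir.is_dir():
        raise NotADirectoryError(f"tessdata directory not found: {target_dir}")
    # shutil.copy2 returns the path of the copied file.
    return Path(shutil.copy2(source, target_dir))


# Example (hypothetical install path):
# install_traineddata(".", "testcode", r"C:\Program Files\Tesseract-OCR\tessdata")
```

Once the file is in tessdata, the new data should be selectable with the language option, for example: tesseract.exe image.tif output -l testcode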