Contents
1. Getting the source
2. Building
3. Testing

1. Getting the source
1.1 Get the tesseract-ocr source
Download: https://github.com/tesseract-ocr/tesseract/tree/3.02.02
Other versions can be selected on GitHub as needed.
![Tesseract-OCR入门使用(3)-VS2010编译源码 Tesseract-OCR入门使用(3)-VS2010编译源码](https://image.shishitao.com:8440/aHR0cHM6Ly93d3cuaXRkYWFuLmNvbS9nby9abWxzWlRvdkx5OURPaTlWYzJWeWN5OXRZWGd2UVhCd1JHRjBZUzlNYjJOaGJDOVpUbTkwWlM5a1lYUmhMM0Z4TlRBd01UbEZRamd4TmtFMk5qSTVORUZHUVRnd05URkRORUU1UWtOQlFUY3ZPV001TWpVMU5UWXlOR00wTkRNd09XRXdZelk0T1Rkak5HSTBaak5tTURFdlkyeHBjR0p2WVhKa0xuQnVadz09.jpg?w=700&webp=1)
![Tesseract-OCR入门使用(3)-VS2010编译源码 Tesseract-OCR入门使用(3)-VS2010编译源码](https://image.shishitao.com:8440/aHR0cHM6Ly93d3cuaXRkYWFuLmNvbS9nby9hSFIwY0RvdkwybHRaeTVpYkc5bkxtTnpaRzR1Ym1WMEx6SXdNVGN3TVRBMk1UZ3lNREUxTWpFMlAzZGhkR1Z5YldGeWF5OHlMM1JsZUhRdllVaFNNR05FYjNaTU1rcHpZakpqZFZrelRtdGlhVFYxV2xoUmRtUlVRWGhOYWxVeVRtcGpNVTFSUFQwdlptOXVkQzgxWVRaTU5Vd3lWQzltYjI1MGMybDZaUzgwTURBdlptbHNiQzlKTUVwQ1VXdEdRMDFCUFQwdlpHbHpjMjlzZG1Vdk56QXZaM0poZG1sMGVTOURaVzUwWlhJPQ%3D%3D.jpg?w=700&webp=1)
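1.2 Tesseract depends on the Leptonica library, so Leptonica has to be built as well.
Source: leptonica-1.68.tar.gz
VS project: vs2008-1.68.zip
Headers and libraries: leptonica-1.68-win32-lib-include-dirs.zip

2. Building
2.1 Building Leptonica
Step 1: Extract the archives and arrange the folders as shown below.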
![Tesseract-OCR入门使用(3)-VS2010编译源码 Tesseract-OCR入门使用(3)-VS2010编译源码](https://image.shishitao.com:8440/aHR0cHM6Ly93d3cuaXRkYWFuLmNvbS9nby9hSFIwY0RvdkwybHRaeTVpYkc5bkxtTnpaRzR1Ym1WMEx6SXdNVGN3TVRBMk1UZ3lNRE00TVRNNFAzZGhkR1Z5YldGeWF5OHlMM1JsZUhRdllVaFNNR05FYjNaTU1rcHpZakpqZFZrelRtdGlhVFYxV2xoUmRtUlVRWGhOYWxVeVRtcGpNVTFSUFQwdlptOXVkQzgxWVRaTU5Vd3lWQzltYjI1MGMybDZaUzgwTURBdlptbHNiQzlKTUVwQ1VXdEdRMDFCUFQwdlpHbHpjMjlzZG1Vdk56QXZaM0poZG1sMGVTOURaVzUwWlhJPQ%3D%3D.jpg?w=700&webp=1)
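Step 2: Locate the project in the vs2008 folder and open it with VS2010.
Step 3: Build. Build the Release and Debug configurations separately; both succeeded on the first attempt.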
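The two builds in Step 3 can also be started from a script instead of the IDE. The following is a minimal sketch, assuming it runs from a Visual Studio command prompt where `devenv.com` is on the PATH and that the project has already been converted to VS2010 format. The solution file name passed in is hypothetical; use whichever `.sln` sits in the vs2008 folder.

```python
import subprocess

# The two configurations named in Step 3.
CONFIGURATIONS = ("Release", "Debug")


def build_commands(solution, configurations=CONFIGURATIONS, devenv="devenv.com"):
    """Return one `devenv <solution> /Build <config>` command per configuration."""
    return [[devenv, solution, "/Build", config] for config in configurations]


def build_all(solution):
    """Run each build in turn and stop at the first one that fails."""
    for command in build_commands(solution):
        subprocess.run(command, check=True)
```

Calling `build_all("leptonica.sln")` (hypothetical file name) would build Release first and then Debug, raising `CalledProcessError` if either build fails.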
2.2 Building tesseract-ocr
Step 1: Locate the project in the vs2008 folder.
![Tesseract-OCR入门使用(3)-VS2010编译源码 Tesseract-OCR入门使用(3)-VS2010编译源码](https://image.shishitao.com:8440/aHR0cHM6Ly93d3cuaXRkYWFuLmNvbS9nby9abWxzWlRvdkx5OURPaTlWYzJWeWN5OXRZWGd2UVhCd1JHRjBZUzlNYjJOaGJDOVpUbTkwWlM5a1lYUmhMM0Z4TlRBd01UbEZRamd4TmtFMk5qSTVORUZHUVRnd05URkRORUU1UWtOQlFUY3ZNRFpoTlRCbU9XSXdNelJsTkdVNE5qaGxORE5oTXpZMU5EUXpZamN6T1dVdlkyeHBjR0p2WVhKa0xuQnVadz09.jpg?w=700&webp=1)
![Tesseract-OCR入门使用(3)-VS2010编译源码 Tesseract-OCR入门使用(3)-VS2010编译源码](https://image.shishitao.com:8440/aHR0cHM6Ly93d3cuaXRkYWFuLmNvbS9nby9hSFIwY0RvdkwybHRaeTVpYkc5bkxtTnpaRzR1Ym1WMEx6SXdNVGN3TVRBMk1UZ3lNVEkxTnpBeVAzZGhkR1Z5YldGeWF5OHlMM1JsZUhRdllVaFNNR05FYjNaTU1rcHpZakpqZFZrelRtdGlhVFYxV2xoUmRtUlVRWGhOYWxVeVRtcGpNVTFSUFQwdlptOXVkQzgxWVRaTU5Vd3lWQzltYjI1MGMybDZaUzgwTURBdlptbHNiQzlKTUVwQ1VXdEdRMDFCUFQwdlpHbHpjMjlzZG1Vdk56QXZaM0poZG1sMGVTOURaVzUwWlhJPQ%3D%3D.jpg?w=700&webp=1)
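Step 2: After VS2010 converts the project, the build fails with:
Error 1 error C1083: Cannot open include file: 'allheaders.h': No such file or directory
The cause is that allheaders.h belongs to Leptonica, and the two project directories are not laid out so that the Tesseract project can find it.
Fix: rearrange the directories as shown below.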
![Tesseract-OCR入门使用(3)-VS2010编译源码 Tesseract-OCR入门使用(3)-VS2010编译源码](https://image.shishitao.com:8440/aHR0cHM6Ly93d3cuaXRkYWFuLmNvbS9nby9hSFIwY0RvdkwybHRaeTVpYkc5bkxtTnpaRzR1Ym1WMEx6SXdNVGN3TVRBMk1UZ3lNVEF3TURZNFAzZGhkR1Z5YldGeWF5OHlMM1JsZUhRdllVaFNNR05FYjNaTU1rcHpZakpqZFZrelRtdGlhVFYxV2xoUmRtUlVRWGhOYWxVeVRtcGpNVTFSUFQwdlptOXVkQzgxWVRaTU5Vd3lWQzltYjI1MGMybDZaUzgwTURBdlptbHNiQzlKTUVwQ1VXdEdRMDFCUFQwdlpHbHpjMjlzZG1Vdk56QXZaM0poZG1sMGVTOURaVzUwWlhJPQ%3D%3D.jpg?w=700&webp=1)
![Tesseract-OCR入门使用(3)-VS2010编译源码 Tesseract-OCR入门使用(3)-VS2010编译源码](https://image.shishitao.com:8440/aHR0cHM6Ly93d3cuaXRkYWFuLmNvbS9nby9abWxzWlRvdkx5OURPaTlWYzJWeWN5OXRZWGd2UVhCd1JHRjBZUzlNYjJOaGJDOVpUbTkwWlM5a1lYUmhMM0Z4TlRBd01UbEZRamd4TmtFMk5qSTVORUZHUVRnd05URkRORUU1UWtOQlFUY3ZOMlEwTnpnNE16TXpPR0l4TkdKak4yRXhOak00TkRneFpUVTJOek5rWkRRdlkyeHBjR0p2WVhKa0xuQnVadz09.jpg?w=700&webp=1)
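Before rebuilding, it can help to confirm where allheaders.h actually ended up after moving the folders. Here is a small sketch that walks a directory tree and lists every folder containing the header. The helper name `find_header` and the root path you pass to it are illustrative choices, not part of either project.

```python
import os


def find_header(root, name="allheaders.h"):
    """Return the sorted list of directories under `root` that contain `name`."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if name in filenames:
            hits.append(dirpath)
    return sorted(hits)
```

For example, `find_header(".")` run from the folder that holds both source trees prints nothing useful if it returns an empty list, which means the header is still outside the searched tree. Otherwise each returned directory is a candidate for the project's include path.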
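Step 3: After rearranging the directories, rebuilding fails with:
Error 2 error C2146: syntax error: missing '}' before identifier '銆'
This error is caused by the encoding of the source file.
Fix:
With the file named in the error open, choose "File > Advanced Save Options" in VS2010, select "Chinese Simplified (GB2312) - Codepage 936" in the dialog, save, and rebuild. The build then succeeds.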
![Tesseract-OCR入门使用(3)-VS2010编译源码 Tesseract-OCR入门使用(3)-VS2010编译源码](https://image.shishitao.com:8440/aHR0cHM6Ly93d3cuaXRkYWFuLmNvbS9nby9hSFIwY0RvdkwybHRaeTVpYkc5bkxtTnpaRzR1Ym1WMEx6SXdNVGN3TVRBMk1UZ3lNakEzTWpVMlAzZGhkR1Z5YldGeWF5OHlMM1JsZUhRdllVaFNNR05FYjNaTU1rcHpZakpqZFZrelRtdGlhVFYxV2xoUmRtUlVRWGhOYWxVeVRtcGpNVTFSUFQwdlptOXVkQzgxWVRaTU5Vd3lWQzltYjI1MGMybDZaUzgwTURBdlptbHNiQzlKTUVwQ1VXdEdRMDFCUFQwdlpHbHpjMjlzZG1Vdk56QXZaM0poZG1sMGVTOURaVzUwWlhJPQ%3D%3D.jpg?w=700&webp=1)
![Tesseract-OCR入门使用(3)-VS2010编译源码 Tesseract-OCR入门使用(3)-VS2010编译源码](https://image.shishitao.com:8440/aHR0cHM6Ly93d3cuaXRkYWFuLmNvbS9nby9abWxzWlRvdkx5OURPaTlWYzJWeWN5OXRZWGd2UVhCd1JHRjBZUzlNYjJOaGJDOVpUbTkwWlM5a1lYUmhMM0Z4TlRBd01UbEZRamd4TmtFMk5qSTVORUZHUVRnd05URkRORUU1UWtOQlFUY3ZaVGM0TVdJMVl6RXdZemd6TkRGaE9XRmpOMkUwT0RNNE5ESmtNekZpTUdNdlkyeHBjR0p2WVhKa0xuQnVadz09.jpg?w=700&webp=1)
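The odd identifier in error C2146 is consistent with UTF-8 bytes being read as code page 936. The sketch below illustrates that effect and offers a scripted stand-in for the re-save step. It assumes the affected file is stored as UTF-8 (with or without a BOM) and that all of its characters exist in GBK. Both function names are made up for this example.

```python
def utf8_seen_as_gbk(text):
    """Encode `text` as UTF-8, then decode those bytes as GBK (code page 936)."""
    return text.encode("utf-8").decode("gbk", errors="replace")


def resave_as_gbk(path, source_encoding="utf-8-sig"):
    """Rewrite the file at `path` in GBK, assuming it is currently UTF-8.

    "utf-8-sig" accepts files with or without a byte order mark.
    newline="" keeps the original line endings untouched.
    Raises UnicodeEncodeError if a character has no GBK representation.
    """
    with open(path, "r", encoding=source_encoding, newline="") as handle:
        content = handle.read()
    with open(path, "w", encoding="gbk", newline="") as handle:
        handle.write(content)
```

The full stop "。" is the byte sequence E3 80 82 in UTF-8, and the pair E3 80 is a valid GBK character, so `utf8_seen_as_gbk("。")` yields text that begins with "銆". Keep a backup before calling `resave_as_gbk`, since it overwrites the file in place.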
3. Testing
![Tesseract-OCR入门使用(3)-VS2010编译源码 Tesseract-OCR入门使用(3)-VS2010编译源码](https://image.shishitao.com:8440/aHR0cHM6Ly93d3cuaXRkYWFuLmNvbS9nby9abWxzWlRvdkx5OURPaTlWYzJWeWN5OXRZWGd2UVhCd1JHRjBZUzlNYjJOaGJDOVpUbTkwWlM5a1lYUmhMM0Z4TlRBd01UbEZRamd4TmtFMk5qSTVORUZHUVRnd05URkRORUU1UWtOQlFUY3ZPREl3WTJabE16SmpZMlkzTkRFM1pUa3lNRE00TnpZeE1XUmlNVEV6WWpVdlkyeHBjR0p2WVhKa0xuQnVadz09.jpg?w=700&webp=1)
![Tesseract-OCR入门使用(3)-VS2010编译源码 Tesseract-OCR入门使用(3)-VS2010编译源码](https://image.shishitao.com:8440/aHR0cHM6Ly93d3cuaXRkYWFuLmNvbS9nby9hSFIwY0RvdkwybHRaeTVpYkc5bkxtTnpaRzR1Ym1WMEx6SXdNVGN3TVRBMk1UZ3lNakUzTkRNM1AzZGhkR1Z5YldGeWF5OHlMM1JsZUhRdllVaFNNR05FYjNaTU1rcHpZakpqZFZrelRtdGlhVFYxV2xoUmRtUlVRWGhOYWxVeVRtcGpNVTFSUFQwdlptOXVkQzgxWVRaTU5Vd3lWQzltYjI1MGMybDZaUzgwTURBdlptbHNiQzlKTUVwQ1VXdEdRMDFCUFQwdlpHbHpjMjlzZG1Vdk56QXZaM0poZG1sMGVTOURaVzUwWlhJPQ%3D%3D.jpg?w=700&webp=1)
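One common way to check a fresh build, offered here as an assumption rather than as the exact test shown in the screenshots, is to run the built executable on a sample image and read back the text file it writes. The sketch assumes the executable is reachable as `tesseract` and that the English language data is installed where Tesseract can find it. The image and output names in the usage note are hypothetical.

```python
import subprocess


def ocr_command(image, output_base, lang="eng", exe="tesseract"):
    """Build the command `tesseract <image> <output_base> -l <lang>`."""
    return [exe, image, output_base, "-l", lang]


def run_ocr(image, output_base, lang="eng", exe="tesseract"):
    """Run Tesseract and return the recognized text from <output_base>.txt."""
    subprocess.run(ocr_command(image, output_base, lang, exe), check=True)
    with open(output_base + ".txt", encoding="utf-8") as handle:
        return handle.read()
```

For example, `print(run_ocr("sample.png", "result"))` would write result.txt next to the script and print its contents. Pass the full path of the newly built executable as `exe` if it is not on the PATH.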