I'm trying to convert an image file to text using the tess4j Maven dependency.
Dependencies in pom.xml:
<!-- OCR dependency -->
<dependency>
    <groupId>net.sourceforge.tess4j</groupId>
    <artifactId>tess4j</artifactId>
    <version>3.4.0</version>
    <exclusions>
        <exclusion>
            <groupId>net.java.dev.jna</groupId>
            <artifactId>jna</artifactId>
        </exclusion>
        <exclusion>
            <groupId>net.sourceforge.lept4j</groupId>
            <artifactId>lept4j</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>net.java.dev.jna</groupId>
    <artifactId>jna</artifactId>
    <version>4.4.0</version>
</dependency>
<dependency>
    <groupId>net.sourceforge.lept4j</groupId>
    <artifactId>lept4j</artifactId>
    <version>1.5.0</version>
</dependency>
My code:
public String convertImageToText(String imageFilePath) throws TesseractException {
    File imageFile = new File("imageFilePath");
    ITesseract iTesseract = new Tesseract();
    ImageIO.scanForPlugins();
    String result = iTesseract.doOCR(imageFile);
    System.out.println("Converted text is: " + result);
    return result;
}
However, when I run my program, I always get the following exception:
Exception in thread "main" net.sourceforge.tess4j.TesseractException: java.lang.RuntimeException: Unsupported image format. May need to install JAI Image I/O package.
https://java.net/projects/jai-imageio/
at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:215)
at utilities.HelperMethods.convertImageToText(HelperMethods.java:218)
at net.sourceforge.tess4j.util.ImageIOHelper.getIIOImageList(ImageIOHelper.java:408)
at utilities.HelperMethods.main(HelperMethods.java:250)
at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:212)
at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:196)
Caused by: java.lang.RuntimeException: Unsupported image format. May need to install JAI Image I/O package.
https://java.net/projects/jai-imageio/
at utilities.HelperMethods.convertImageToText(HelperMethods.java:218)
at net.sourceforge.tess4j.util.ImageIOHelper.getIIOImageList(ImageIOHelper.java:408)
at utilities.HelperMethods.main(HelperMethods.java:250)
at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:212)
All required dependencies, such as JAI and lept4j, are present in my repository. I have also tried all the solutions suggested on this forum, but I'm unable to resolve this error.
Any help would be appreciated.
Thanks
Update: Attaching the file here - Jpg file
1 solution
#1
0
It cannot determine an appropriate ImageReader for the given file format. So it's probably that 1) the file format cannot be determined properly (weird file extension?) or 2) there is no image reader registered for the format you're trying to use.
See ImageIO.getImageReadersByFormatName.
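Both possibilities can be checked with plain ImageIO, without Tess4J on the classpath. The sketch below assumes that Tess4J picks the format name from the text after the last '.' in the file name; that is an assumption about the library's behavior, not something shown in the stack trace. The class name ImageReaderCheck, its helper methods, and the fallback path sample.jpg are hypothetical. Note that the question's code calls new File("imageFilePath") with a string literal instead of the imageFilePath parameter, so the file name it passes on has no extension at all.

```java
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import java.io.File;
import java.util.Iterator;

// Hypothetical helper class for diagnosing the "Unsupported image format" error.
final class ImageReaderCheck {

    // Returns the text after the last '.' in the file name.
    // If the name contains no '.', the whole name is returned.
    static String formatNameOf(File file) {
        String name = file.getName();
        return name.substring(name.lastIndexOf('.') + 1);
    }

    // True if at least one ImageReader is registered for this format name.
    static boolean hasReaderFor(String formatName) {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        return readers.hasNext();
    }

    public static void main(String[] args) {
        // "sample.jpg" is a placeholder path used only when no argument is given.
        String imageFilePath = args.length > 0 ? args[0] : "sample.jpg";

        File fromLiteral = new File("imageFilePath"); // what the question's code does
        File fromVariable = new File(imageFilePath);  // what was probably intended

        for (File f : new File[] {fromLiteral, fromVariable}) {
            String format = formatNameOf(f);
            System.out.println(f.getName() + " -> format '" + format
                    + "', reader registered: " + hasReaderFor(format));
        }
    }
}
```

On a standard JDK, which ships a JPEG reader, the literal "imageFilePath" should report no registered reader while a .jpg path should report one. If that matches what you see, passing the parameter (new File(imageFilePath)) is the first thing to try before installing extra JAI plugins.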