Apache POI HSSF XLS read error

Date: 2021-07-28 20:20:55

I'm using the following code to read a .xls file, where s is the file path:

InputStream input = new FileInputStream(s);
Workbook wbs = new HSSFWorkbook(input);

I get the following error message:

Exception in thread "main" java.io.IOException: Invalid header signature; read 0x0010000000060809, expected 0xE11AB1A1E011CFD0

I need a program that can read either XLSX or XLS files. The exact same code, adjusted only to use XSSF, reads the XLSX file with no problem at all.

3 solutions

#1

If the file is in xlsx format rather than xls, you might get this error. I would try using the generic Workbook object (also called the SS Usermodel).

Check out the Workbook interface and the WorkbookFactory class. The factory should be able to create a generic Workbook for you from either an xlsx or an xls file.
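
To illustrate how a single entry point can tell the two formats apart, here is a minimal sketch that inspects the first bytes of a file. It is not POI's implementation; the class and method names (SpreadsheetSniffer, Kind, sniff, sniffFile) are made up for this example. It only assumes that .xls files live in an OLE2 container and .xlsx files in a ZIP archive, so a match means "probably", not "definitely".

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper, not part of Apache POI.
final class SpreadsheetSniffer {
    enum Kind { OLE2_XLS, ZIP_XLSX, UNKNOWN }

    // Leading bytes of an OLE2 compound document (the container used by .xls).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // Leading bytes of a ZIP archive (the container used by .xlsx).
    private static final byte[] ZIP_MAGIC = {'P', 'K', 0x03, 0x04};

    /** Classifies a file by the bytes found at its very start. */
    static Kind sniff(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) return Kind.OLE2_XLS;
        if (startsWith(head, ZIP_MAGIC)) return Kind.ZIP_XLSX;
        return Kind.UNKNOWN;
    }

    /** Reads up to 8 bytes from the file at the given path and classifies them. */
    static Kind sniffFile(String path) throws IOException {
        byte[] head = new byte[8];
        int filled = 0;
        try (InputStream in = new FileInputStream(path)) {
            int n;
            while (filled < head.length
                    && (n = in.read(head, filled, head.length - filled)) > 0) {
                filled += n;
            }
        }
        return sniff(Arrays.copyOf(head, filled));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Pass the path of the file to inspect as the first argument.
        System.out.println(sniffFile(args[0]));
    }
}
```

In real code you would let WorkbookFactory do this detection and build the Workbook for you; the sketch is only meant to show why the file extension alone does not decide which reader works.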

I thought I had a good tutorial on this, but I can't seem to find it. I'll keep looking though.

Edit

I found this little tiny snippet from Apache's site about reading and rewriting using the SS Usermodel.

I hope this helps!

#2

Invalid header signature; read 0x342E312D46445025, expected 0xE11AB1A1E011CFD0

I got this error when I uploaded a corrupted xls/xlsx file (to produce a corrupt file, I renamed sample.pdf to sample.xls). Add validation like this:

Workbook wbs = null;
// try-with-resources closes the stream even if parsing fails
try (InputStream input = new FileInputStream(s)) {
    wbs = new HSSFWorkbook(input);
} catch (IOException e) {
    // log "file is corrupted", show error message to user
}
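
The signature quoted in this answer can be decoded by hand, which shows what the renamed file really was. The sketch below assumes the value in the message is the first 8 bytes of the file read as a little-endian long; that assumption is consistent with the expected value 0xE11AB1A1E011CFD0, whose bytes in file order are the OLE2 magic D0 CF 11 E0 A1 B1 1A E1. The class and method names are made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of Apache POI.
final class SignatureDecoder {
    /** Returns the 8 signature bytes in the order they appear in the file. */
    static byte[] toFileBytes(long signature) {
        return ByteBuffer.allocate(8)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(signature)
                .array();
    }

    /** Renders the signature as ASCII, replacing non-printable bytes with '.'. */
    static String toAscii(long signature) {
        StringBuilder sb = new StringBuilder();
        for (byte b : toFileBytes(signature)) {
            sb.append(b >= 0x20 && b < 0x7F ? (char) b : '.');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The value from the error message in this answer.
        System.out.println(toAscii(0x342E312D46445025L)); // prints "%PDF-1.4"
    }
}
```

So the "read" value is simply the start of a PDF file, which matches the renamed sample.pdf described above.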

#3

The exception you're getting tells you that the file you're supplying isn't a valid Excel binary file, at least not a valid Excel file produced since about 1990. It reports what POI expected and that it found something else instead, which wasn't a valid .xls file and wasn't any other format POI can detect.

One thing to be aware of is that Excel opens a wide variety of file formats, including .csv and .html. It's also not very picky about the file extension, so it will happily open a CSV file that has been renamed to .xls. However, since renaming a .csv to .xls doesn't magically change the format, POI still can't open it!

From the exception, I can tell what's happening, and I can also tell you're using an ancient version of Apache POI! A header signature of 0x0010000000060809 corresponds to the Excel 4 file format, from about 25 years ago! If you use a more recent version of Apache POI, it'll give you a helpful error message telling you that the file supplied is an old and largely unsupported Excel file. New versions of POI do include the OldExcelExtractor tool which can pull out some information from those ancient formats.

Otherwise, as with all exceptions of this type, try opening the file in Excel and doing a save-as. That will give you an idea of what the file currently is (e.g. .html saved as .xls, .csv saved as .xls, etc.), and will also let you re-save it as a proper .xls file for POI to load and work with.
