Is there a way to read both .xls and .xlsx files using Apache POI?

Date: 2023-01-15 12:36:47

I need to write a method that can read both .xls and .xlsx files. From what I have found, HSSF reads .xls and XSSF reads .xlsx. Is there a part of Apache POI that can read both? I also came across ss.usermodel, but could not find enough sample code that handles both .xls and .xlsx.


7 answers



I don't have much experience with Apache POI, but as far as I know, if you refer to a workbook through the Workbook interface, you can read and write both .xls and .xlsx.


All you have to do is create the matching implementation (pass an InputStream to the constructor to read an existing file).

For .xls:

Workbook wb = new HSSFWorkbook();

For .xlsx:

Workbook wb = new XSSFWorkbook();

You can pass the file type as a parameter and create the Workbook object accordingly with an if statement.




Yes, POI provides a common set of interfaces that work with both types.


Use the WorkbookFactory.create() method to get a Workbook: http://poi.apache.org/apidocs/org/apache/poi/ss/usermodel/WorkbookFactory.html


You can check for Excel files without relying on file extensions (which are unreliable: for example, many CSV files have an .xls extension but cannot be parsed by POI) using the following:


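// Simple way to check for both types of Excel files.
// The stream must support mark/reset. These helpers come from older POI releases;
// newer releases offer FileMagic.valueOf(InputStream) for the same check.
public boolean isExcel(InputStream i) throws IOException {
    return POIFSFileSystem.hasPOIFSHeader(i) || POIXMLDocument.hasOOXMLHeader(i);
}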
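
To make the idea behind the header check concrete, here is a minimal sketch that does the same kind of sniffing with only the JDK. It is not POI's implementation; the class name ExcelSniffer and its Kind enum are hypothetical. It assumes the usual signatures: an OLE2 compound document (the container used by .xls) starts with D0 CF 11 E0 A1 B1 1A E1, and a ZIP archive (the container used by .xlsx) starts with 50 4B 03 04. A match only identifies the container, so a .doc or an arbitrary ZIP would pass as well.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: classifies a stream by its leading bytes, without POI. */
final class ExcelSniffer {
    enum Kind { OLE2, OOXML_ZIP, UNKNOWN }

    // OLE2 compound document signature (container used by .xls)
    private static final int[] OLE2 = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    // ZIP local file header signature (container used by .xlsx)
    private static final int[] ZIP = {0x50, 0x4B, 0x03, 0x04};

    /** Peeks at up to 8 bytes, then rewinds; the stream must support mark/reset. */
    static Kind sniff(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(8);
        byte[] head = new byte[8];
        int n = 0;
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r < 0) break; // end of stream before 8 bytes were available
            n += r;
        }
        in.reset();
        if (startsWith(head, n, OLE2)) return Kind.OLE2;
        if (startsWith(head, n, ZIP)) return Kind.OOXML_ZIP;
        return Kind.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, int len, int[] signature) {
        if (len < signature.length) return false;
        for (int i = 0; i < signature.length; i++) {
            if ((data[i] & 0xFF) != signature[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // A CSV file that was merely renamed to .xls has neither signature
        byte[] csv = "a,b,c\n1,2,3\n".getBytes(StandardCharsets.US_ASCII);
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(csv));
        System.out.println(sniff(in));
    }
}
```

Because the stream is rewound after the peek, it can still be handed to a parser afterwards.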



You can read both using the poi-ooxml and poi-ooxml-schemas jars provided by Apache, with the code below:


// fileName and sheetName are supplied by the caller
InputStream excelFileToRead = new FileInputStream(fileName);
Workbook wb = WorkbookFactory.create(excelFileToRead);
Sheet sheet = wb.getSheet(sheetName);

The code above reads both .xls and .xlsx files.
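
A slightly fuller sketch of the same approach, assuming the POI jars are on the classpath. The class name SpreadsheetReader and the method readFirstSheet are made up for illustration; the helper declares throws Exception because the checked exceptions declared by WorkbookFactory.create differ between POI versions. It opens whichever format it is given, walks the first sheet, and collects each cell as text via DataFormatter.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

/** Hypothetical helper: reads .xls or .xlsx through the shared ss.usermodel interfaces. */
final class SpreadsheetReader {

    /** Returns the formatted text of every cell on the first sheet, row by row. */
    static List<String> readFirstSheet(InputStream in) throws Exception {
        List<String> values = new ArrayList<>();
        DataFormatter formatter = new DataFormatter();
        // WorkbookFactory picks the HSSF or XSSF implementation from the file content
        try (Workbook wb = WorkbookFactory.create(in)) {
            Sheet sheet = wb.getSheetAt(0);
            for (Row row : sheet) {
                for (Cell cell : row) {
                    values.add(formatter.formatCellValue(cell));
                }
            }
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Build a tiny .xlsx in memory so the example needs no file on disk
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Workbook wb = new XSSFWorkbook()) {
            wb.createSheet("demo").createRow(0).createCell(0).setCellValue("hello");
            wb.write(out);
        }
        System.out.println(readFirstSheet(new ByteArrayInputStream(out.toByteArray())));
    }
}
```

The calling code only ever sees Workbook, Sheet, Row and Cell, so nothing in it depends on which format was opened.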




Thanks to Tom's answer. Just to add: use the following code to get the InputStream, otherwise you may hit Exception in thread "main" java.io.IOException: mark/reset not supported


     InputStream inputStream = new FileInputStream(new File("C:\\myFile.xls"));

     // Wrap the stream so the header bytes can be pushed back after the type check
     if (!inputStream.markSupported()) {
         inputStream = new PushbackInputStream(inputStream, 8);
     }
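
For reference, a JDK-only sketch of the general way to obtain mark/reset support: wrap the stream in a BufferedInputStream. This is a swapped-in alternative to the PushbackInputStream wrapper above, not a statement about what any particular POI version requires. The class name MarkableStreams is hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper: guarantees that a stream can be marked and rewound. */
final class MarkableStreams {

    /** Returns the stream itself if it supports mark/reset, otherwise a buffered wrapper. */
    static InputStream ensureMarkSupported(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a FileInputStream, which reports markSupported() == false
        InputStream raw = new InputStream() {
            private final InputStream data = new ByteArrayInputStream(new byte[] {1, 2, 3});

            @Override
            public int read() throws IOException {
                return data.read();
            }
        };

        InputStream in = ensureMarkSupported(raw);
        in.mark(8);             // remember the current position
        int first = in.read();  // consume one byte, as a header check would
        in.reset();             // rewind to the marked position
        System.out.println(first == in.read()); // prints true
    }
}
```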



One option is to find the last '.' in the file name with lastIndexOf, check whether the extension is .xls or .xlsx, and branch with an if condition. It has been a long time since I worked with POI, but I think the classes are HSSF for .xls and XSSF for .xlsx. See the http://poi.apache.org/ site, last line under the topic "Why should I use Apache POI?".
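
A minimal sketch of that extension check using only the JDK. The class name ExcelExtension and its Format enum are hypothetical; the caller would then branch on the result to pick HSSF or XSSF. The comparison is case-insensitive, and a name with no extension or a trailing dot is reported as OTHER.

```java
import java.util.Locale;

/** Hypothetical helper: maps a file name to a spreadsheet format by its extension. */
final class ExcelExtension {
    enum Format { XLS, XLSX, OTHER }

    static Format fromFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        // No dot at all, or a dot as the last character: there is no extension
        if (dot < 0 || dot == fileName.length() - 1) {
            return Format.OTHER;
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        switch (ext) {
            case "xls":
                return Format.XLS;
            case "xlsx":
                return Format.XLSX;
            default:
                return Format.OTHER;
        }
    }

    public static void main(String[] args) {
        System.out.println(fromFileName("report.final.XLSX")); // XLSX
        System.out.println(fromFileName("notes.txt"));         // OTHER
    }
}
```

Keep in mind that the extension says nothing about the actual content of the file.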




You can use


Workbook wb = WorkbookFactory.create(inputStream);



It appears you are looking for a way to abstract the read process: it shouldn't matter whether the file is XLS or XLSX, and you want your code to work without modification.


I'd recommend looking at Apache Tika. It's an awesome library that abstracts file reading and content parsing; it uses POI and many other libraries and provides a nice abstraction over all of them.


Reading a PDF/XLS/XLSX file is similar to reading a text file; all the work is done behind the scenes.


Read this for more: http://www.searchworkings.org/blog/-/blogs/introduction-to-apache-tika



