Searching in xlsx and xls files using Java

Date: 2021-08-18 13:58:57

I have a large xlsx file with a huge amount of data, on which I have to implement a search option. I have used the Apache POI jar as well as the jxl jar to search across rows and columns, but traversing that much data takes a very long time. Can someone tell me whether there is any jar file or other approach that makes searching Excel files faster?

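    String searchValue = "my_value_to_search";
    for (int i = 0; i < sheet.getColumns(); i++) {
        for (int j = 0; j < sheet.getRows(); j++) {
            // Types assume the jxl API, where getCell takes (column, row).
            Cell value = sheet.getCell(i, j);
            CellType valueType = value.getType();
            // getCellType is the asker's own helper (not shown here).
            String val = getCellType(valueType, value);
            // Compare contents with equals(); == only compares references.
            if (searchValue.equals(val)) {
                // To do manipulation.
            }
        }
    }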

1 Answer

#1


6  

The bottleneck is usually the huge amount of memory required to hold a large XLSX file in memory all at once. (XLS files can't get that big by design, so this is usually not a problem for them.) To search a really huge XLSX file without the memory problems, you could do this:

  • An xlsx file is in fact a ZIP archive; you can open it and read its contents like any other ZIP file.
  • Inside the ZIP there is a folder "xl/worksheets" containing the files sheet1.xml, sheet2.xml, and so on.
  • You can parse these XML files with a normal streaming XML reader (a callback-based one, such as SAX in Java, gives maximum performance and the least memory consumption).
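
The steps above could be sketched roughly as follows, using only the JDK (java.util.zip plus the built-in SAX parser). This is a minimal sketch under some assumptions, not a complete reader: the class name XlsxSearch and its methods are made up for illustration, element names are assumed to be unprefixed (as most writers produce them), and phonetic runs inside rich text are not filtered out. One detail the list does not mention: string cells (t="s") normally store only an index into "xl/sharedStrings.xml", so the sketch scans that part first and remembers which indexes match.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
public class XlsxSearch {

    /** Streams one XML part of the archive through a SAX handler; a missing part is skipped. */
    static void parse(ZipFile zip, String entryName, DefaultHandler handler) throws Exception {
        ZipEntry entry = zip.getEntry(entryName);
        if (entry == null) return;
        try (InputStream in = zip.getInputStream(entry)) {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        }
    }

    /** Returns the references (such as "B2") of all cells in sheetEntry whose text equals searchValue. */
    public static List<String> search(String xlsxPath, String sheetEntry, String searchValue)
            throws Exception {
        Set<Integer> sharedHits = new HashSet<>(); // shared-string indexes that match
        List<String> cellHits = new ArrayList<>();
        try (ZipFile zip = new ZipFile(xlsxPath)) {
            // Pass 1: scan the shared string table, keeping only the matching indexes.
            parse(zip, "xl/sharedStrings.xml", new DefaultHandler() {
                int index = -1;
                boolean inText;
                final StringBuilder text = new StringBuilder();
                @Override public void startElement(String uri, String local, String qName, Attributes a) {
                    if (qName.equals("si")) { index++; text.setLength(0); }
                    else if (qName.equals("t")) inText = true;
                }
                @Override public void characters(char[] ch, int start, int len) {
                    if (inText) text.append(ch, start, len);
                }
                @Override public void endElement(String uri, String local, String qName) {
                    if (qName.equals("t")) inText = false;
                    else if (qName.equals("si") && searchValue.equals(text.toString())) sharedHits.add(index);
                }
            });
            // Pass 2: scan the sheet cell by cell; nothing but the current cell is held in memory.
            parse(zip, sheetEntry, new DefaultHandler() {
                String ref, type;
                boolean inValue;
                final StringBuilder value = new StringBuilder();
                @Override public void startElement(String uri, String local, String qName, Attributes a) {
                    if (qName.equals("c")) { ref = a.getValue("r"); type = a.getValue("t"); value.setLength(0); }
                    else if (qName.equals("v") || qName.equals("t")) inValue = true;
                }
                @Override public void characters(char[] ch, int start, int len) {
                    if (inValue) value.append(ch, start, len);
                }
                @Override public void endElement(String uri, String local, String qName) {
                    if (qName.equals("v") || qName.equals("t")) inValue = false;
                    else if (qName.equals("c")) {
                        String v = value.toString().trim();
                        boolean match = "s".equals(type)
                                ? !v.isEmpty() && sharedHits.contains(Integer.parseInt(v)) // shared string index
                                : searchValue.equals(v);                                   // number or inline string
                        if (match) cellHits.add(ref);
                    }
                }
            });
        }
        return cellHits;
    }
}
```

A call such as XlsxSearch.search("big.xlsx", "xl/worksheets/sheet1.xml", "my_value_to_search") (the file name is a placeholder) would then return the references of the matching cells. Because both passes are streaming, memory use depends on the number of matches rather than on the size of the workbook.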

Hope that helps.
