Getting a column from an Excel file using Apache POI?

Time: 2021-01-02 20:22:10

In order to do some statistical analysis I need to extract the values in a column of an Excel sheet. I have been using the Apache POI package to read from Excel files, and it works fine when one needs to iterate over rows. However, I couldn't find anything about getting columns, either in the API documentation or through Google searching.


I need to get the max and min values of different columns and generate random numbers using these values. Without a way to pick out individual columns, the only other option is to iterate over rows and columns, read the values and compare them one by one, which doesn't sound all that time-efficient.


Any ideas on how to tackle this problem?


Thanks,


3 Solutions

#1


17  

Excel files are row based rather than column based, so the only way to get all the values in a column is to look at each row in turn. There's no quicker way to get at the columns, because cells in a column aren't stored together.


Your code probably wants to be something like:


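import java.util.ArrayList;
import java.util.List;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;

// Assumes sheet (a Sheet) and columnNumber (a zero-based int) are already defined.
// Newer POI releases replace the int constants with the CellType enum (e.g. CellType.NUMERIC).
List<Double> values = new ArrayList<Double>();
for(Row r : sheet) {
   Cell c = r.getCell(columnNumber);
   if(c != null) {
      if(c.getCellType() == Cell.CELL_TYPE_NUMERIC) {
         values.add(c.getNumericCellValue());
      } else if(c.getCellType() == Cell.CELL_TYPE_FORMULA && c.getCachedFormulaResultType() == Cell.CELL_TYPE_NUMERIC) {
         // Formula cell whose cached result is numeric
         values.add(c.getNumericCellValue());
      }
   }
}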

That'll then give you all the numeric cell values in that column.
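
Once the values list is filled, the max, min and random numbers the question asks about need only the standard library. The following is a minimal sketch, not part of the original answer: the class name ColumnStats, the method names and the sample data are made up for illustration, and it assumes a uniform random number between the column's min and max is what is wanted.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Random;

public class ColumnStats {

    /** Computes count, min, max and average of the column values in a single pass. */
    public static DoubleSummaryStatistics summarize(List<Double> values) {
        return values.stream().mapToDouble(Double::doubleValue).summaryStatistics();
    }

    /** Draws a uniform random number between min and max; returns min when both are equal. */
    public static double randomInRange(double min, double max, Random rng) {
        // nextDouble() is in [0, 1), so the result is scaled into the column's range
        return min + rng.nextDouble() * (max - min);
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the values read from one column
        List<Double> values = Arrays.asList(3.5, -1.0, 12.25, 7.0);
        DoubleSummaryStatistics stats = summarize(values);
        double sample = randomInRange(stats.getMin(), stats.getMax(), new Random());
        System.out.println("min=" + stats.getMin() + " max=" + stats.getMax() + " sample=" + sample);
    }
}
```

If the column has no numeric cells, stats.getCount() is 0 and getMin()/getMax() return infinities, so it is worth checking the count before drawing a random number.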


#2


0  

I know it's an old question, but I had the same problem and had to solve it differently.


My code could not be easily adapted and would have gained a lot of unnecessary complexity. So I decided to change the Excel sheet instead, by transposing columns and rows as explained here: (http://www.howtogeek.com/howto/12366/)


You can also transpose it with VBA, as shown here:


Convert row with columns of data into column with multiple rows in Excel 2007


Hope it helps somebody out there.


#3


0  

Just wanted to add: in case you have headers in your file and you are not sure about the column index, but want to pick columns under specific headers (column names), you can try something like this:


    // Assumes datatypeSheet (a Sheet) is already defined; "column1" is a placeholder header name.
    int column1Index = -1;                                // -1 until the header is found
    List<Double> column1Data = new ArrayList<Double>();

    for(Row r : datatypeSheet)
    {
        // table header row
        if(r.getRowNum() == 0)
        {
            // getting specific column's index
            Iterator<Cell> headerIterator = r.cellIterator();
            while(headerIterator.hasNext())
            {
                Cell header = headerIterator.next();
                // only text cells can be compared against the header name
                if(header.getCellType() == Cell.CELL_TYPE_STRING
                        && header.getStringCellValue().equalsIgnoreCase("column1"))
                {
                    column1Index = header.getColumnIndex();
                }
            }
        }
        else if(column1Index >= 0)
        {
            Cell column1Cell = r.getCell(column1Index);

            if(column1Cell != null)
            {
                if(column1Cell.getCellType() == Cell.CELL_TYPE_NUMERIC)
                {
                    // adding to a list
                    column1Data.add(column1Cell.getNumericCellValue());
                }
                else if(column1Cell.getCellType() == Cell.CELL_TYPE_FORMULA && column1Cell.getCachedFormulaResultType() == Cell.CELL_TYPE_NUMERIC)
                {
                    // adding the cached numeric result of a formula cell
                    column1Data.add(column1Cell.getNumericCellValue());
                }
            }
        }
    }
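
If several columns need to be picked by header, one option is to build a name-to-index lookup from the header row once, rather than comparing every header against each name. This is a rough sketch using only the standard library: the class HeaderIndex, its methods and the sample header names are hypothetical, and the list positions stand in for what Cell.getColumnIndex() would return when reading a real header row with POI.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class HeaderIndex {

    /** Maps each trimmed, lower-cased header name to its zero-based position in the list. */
    public static Map<String, Integer> build(List<String> headers) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < headers.size(); i++) {
            String name = headers.get(i);
            if (name != null) {
                // putIfAbsent keeps the first column when a header name repeats
                index.putIfAbsent(name.trim().toLowerCase(Locale.ROOT), i);
            }
        }
        return index;
    }

    /** Returns the column position for a header name, or -1 if it is not present. */
    public static int columnOf(Map<String, Integer> index, String header) {
        return index.getOrDefault(header.trim().toLowerCase(Locale.ROOT), -1);
    }

    public static void main(String[] args) {
        // Made-up header names standing in for the text of the cells in row 0
        Map<String, Integer> index = build(Arrays.asList("Id", "Height", "Weight"));
        System.out.println("weight is column " + columnOf(index, "WEIGHT"));
    }
}
```

The lookup is case-insensitive, matching the equalsIgnoreCase comparison in the answer, and a result of -1 signals that the header was not found.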
