I have a huge excel file with tons of columns which looks like this :-
我有一个巨大的excel文件,有很多列,看起来像这样:
Column1 Column2 Column3 Column4 Column5
abc def ghi
mno pqr
......
This is the code that I wrote to print these values:
这是我编写的用于打印这些值的代码:
try {
FileInputStream inputStr = new FileInputStream(fileName);
XSSFWorkbook xssfWork = new XSSFWorkbook(inputStr) ;
XSSFSheet sheet1 = xssfWork.getSheetAt(0);
Iterator rowItr = sheet1.rowIterator();
while ( rowItr.hasNext() ) {
XSSFRow row = (XSSFRow) rowItr.next();
System.out.println("ROW:-->");
Iterator cellItr = row.cellIterator();
while ( cellItr.hasNext() ) {
XSSFCell cell = (XSSFCell) cellItr.next();
System.out.println("CELL:-->"+cell.toString());
}
}
} catch (Exception e) {
e.printStackTrace();
}
The output generated by this code is :-
此代码生成的输出是: -
ROW:-->
CELL:-->Column1
CELL:-->Column2
CELL:-->Column3
CELL:-->Column4
CELL:-->Column5
ROW:-->
CELL:-->abc
CELL:-->def
CELL:-->ghi
ROW:-->
CELL:-->mno
CELL:-->pqr
So, If we look at the output above we can note that the cells where I left blank values was not picked up by the POI library , is there a way in which I can get these values as null. or a way to recognize that the values presented skipped blank cells?
所以,如果我们查看上面的输出,我们可以注意到POI库没有拾取我留下空白值的单元格,有没有办法可以将这些值作为null。或者一种方法来识别所呈现的值跳过空白单元格?
Thanks.
8 个解决方案
#1
53
If you want to get all cells, no matter if they exist or not, then the iterator isn't for you. Instead, you need to manually fetch the appropriate cells, likely with a missing cell policy
如果你想获得所有单元格,无论它们是否存在,那么迭代器不适合你。相反,您需要手动获取相应的单元格,可能需要丢失单元格策略
for(Row row : sheet) {
for(int cn=0; cn<row.getLastCellNum(); cn++) {
// If the cell is missing from the file, generate a blank one
// (Works by specifying a MissingCellPolicy)
Cell cell = row.getCell(cn, Row.CREATE_NULL_AS_BLANK);
// Print the cell for debugging
System.out.println("CELL: " + cn + " --> " + cell.toString());
}
}
There's more details on all of this in the Apache POI documentation on iterating over cells
迭代单元格的Apache POI文档中有关于所有这些的更多细节
#2
9
I have been frustrated by this same problem. Here is what I found with poi-3.7-20101029 and poi-3.8.
我对同样的问题感到沮丧。这是我在poi-3.7-20101029和poi-3.8中找到的。
RowIterator and CellIterator do not support iterating over NULL cells or rows -- only physically defined cells (which can be BLANK).
RowIterator和CellIterator不支持迭代NULL单元格或行 - 仅物理定义的单元格(可以是BLANK)。
The solution that returns what I expect requires using the 0-based Row.getCell([int], Row.CREATE_NULL_AS_BLANK)
, much like Chavira's answer alludes to (assuming 8 cell rows). Or you can use the Cell.columnIndex
value while iterating to check for jumping numbers...
返回我期望的解决方案需要使用基于0的Row.getCell([int],Row.CREATE_NULL_AS_BLANK),就像Chavira的回答暗示(假设8个单元格行)。或者您可以在迭代时使用Cell.columnIndex值来检查跳数...
Annoyingly, after creating blank cells using method #1, the iterators will return the now created BLANK cells. I consider it a bug that MissingCellPolicy is ignored by CellIterator.
令人讨厌的是,在使用方法#1创建空白单元格之后,迭代器将返回现在创建的BLANK单元格。我认为这是一个错误,CellIterator会忽略MissingCellPolicy。
#3
3
The reason is quite simple: Excel files can contain as many rows and as many columns as possibles, so returning all available blank rows and columns will render the cells huge and memory intensive.
原因很简单:Excel文件可以包含尽可能多的行和尽可能多的列,因此返回所有可用的空白行和列将使单元格变得庞大且占用大量内存。
Assuming you have a 10x10 sheet, in Excel, it's not "exactly" 10x10 since you can add 11x10 very easily with blank cell, so should POI return the 11th column?
假设你有10x10的表,在Excel中,它不是“完全”10x10,因为你可以很容易地用空白单元格添加11x10,那么POI应该返回第11列吗?
One way to do what you're requesting is to use HSSFCell.getColumnIndex()
.
您要求的一种方法是使用HSSFCell.getColumnIndex()。
Example:
//Assuming your have a 2 dimensional array.
String[][] values = ......;// It is assigned
POIFSFileSystem fileSystem = new POIFSFileSystem(new FileInputStream(fileName));
HSSFWorkbook workbook = new HSSFWorkbook(fileSystem);
//Going through every worksheet.
for (int sheetPos = 0; sheetPos < workbook.getNumberOfSheets(); sheetPos++) {
HSSFSheet sheet = workbook.getSheetAt(sheetPos);
int rowPos = 0;
Iterator<Row> rows = sheet.rowIterator();
while (rows.hasNext()) {
HSSFRow row = (HSSFRow) rows.next();
Iterator<Cell> cells = row.cellIterator();
while (cells.hasNext()) {
HSSFCell cell = (HSSFCell) cells.next();
String value = "";
switch (cell.getCellType()) {
case HSSFCell.CELL_TYPE_NUMERIC:
value = BigDecimal.valueOf(cell.getNumericCellValue()).toPlainString();
break;
case HSSFCell.CELL_TYPE_STRING:
value = cell.getStringCellValue();
break;
case HSSFCell.CELL_TYPE_BLANK:
value = "";
break;
case HSSFCell.CELL_TYPE_FORMULA:
value = cell.getCellFormula();
break;
default:
break;
}
values[rowPos][cell.getColumnIndex()] = value;
}
rowPos++;
}
}
#4
1
for(org.apache.poi.ss.usermodel.Row tmp : hssfSheet){
for(int i = 0; i<8;i++){
System.out.println(tmp.getCell(i));
}
}
#5
1
Below is what worked for me. The "row.CREATE_NULL_AS_BLANK" did not appear to be valid but that could be lack of NPOI knowledge.
以下是对我有用的。 “row.CREATE_NULL_AS_BLANK”似乎没有效果,但可能缺乏NPOI知识。
HSSFCell dataCell= (HSSFCell)row.GetCell(column, NPOI.SS.UserModel.MissingCellPolicy.CREATE_NULL_AS_BLANK);
#6
0
This worked for me....
这对我有用....
int rowNumber;
int previousCell;
int currentCell;
int currentRowNumber;
HSSFCell cell;
while (rows.hasNext()) {
previousCell = -1;
currentCell = 0;
while (cellIterator.hasNext()) {
cell = (HSSFCell) cellIterator.next();
currentCell = cell.getColumnIndex();
if (previousCell == currentCell-1) {
//...
}
else {
System.out.println("Blank cell found");
}
previousCell = currentCell;
}
}
#7
0
List cellDataList = new ArrayList();
int lineNumber = 0;
while (rowIterator.hasNext()) {
HSSFRow hssfRow = (HSSFRow) rowIterator.next();
//System.out.println("Befor If");
lineNumber++;
if(lineNumber==1){continue;}
//System.out.println("Out side if ");
Iterator<Cell> iterator = hssfRow.cellIterator();
List<Cell> cellTempList = new ArrayList();
int current = 0, next = 1;
while (iterator.hasNext()) {
Cell hssfCell = iterator.next();
current = hssfCell.getColumnIndex();
if(current<next){
System.out.println("Condition Satisfied");
}
else{
int loop = current-next;
System.out.println("inside else Loop value : "+(loop));
for(int k=0;k<loop+1;k++){
System.out.println("Adding nulls");
cellTempList.add(null);
next = next + 1;
}
}
cellTempList.add(hssfCell);
next = next + 1;
System.out.println("At End next value is : "+next);
}
cellDataList.add(cellTempList);
}
#8
0
public String[] rowToString(Row row)
{
Iterator<Cell> cells = row.cellIterator() ;
String[] data = new String[row.getLastCellNum()] ;
int previousCell = 0 ;
Cell cell = cells.next() ;
int currentCell = cell.getColumnIndex();
while (true)
{
if (previousCell == currentCell) {
switch (cell.getCellType()) {
case Cell.CELL_TYPE_NUMERIC:
data[previousCell] = cell.getNumericCellValue()+"" ;
break;
case Cell.CELL_TYPE_STRING:
data[previousCell] = cell.getStringCellValue() ;
break;
/* // there could be other cases here.
case Cell.CELL_TYPE_FORMULA:
data[previousCell] =eval.evaluateFormulaCell(cell);
break;
case Cell.CELL_TYPE_BOOLEAN:
data[previousCell] = cell.getBooleanCellValue();
break;
case Cell.CELL_TYPE_BLANK:
data[previousCell] = "";
break;
case Cell.CELL_TYPE_ERROR:
data[previousCell] = "ERROR";
break;
*/
}
if(cells.hasNext()){
cell = cells.next() ;
currentCell = cell.getColumnIndex();
} else {
break ;
}
} else {
data[previousCell] = "";
}
previousCell++ ;
}
return data ;
}
#1
53
If you want to get all cells, no matter if they exist or not, then the iterator isn't for you. Instead, you need to manually fetch the appropriate cells, likely with a missing cell policy
如果你想获得所有单元格,无论它们是否存在,那么迭代器不适合你。相反,您需要手动获取相应的单元格,可能需要丢失单元格策略
for(Row row : sheet) {
for(int cn=0; cn<row.getLastCellNum(); cn++) {
// If the cell is missing from the file, generate a blank one
// (Works by specifying a MissingCellPolicy)
Cell cell = row.getCell(cn, Row.CREATE_NULL_AS_BLANK);
// Print the cell for debugging
System.out.println("CELL: " + cn + " --> " + cell.toString());
}
}
There's more details on all of this in the Apache POI documentation on iterating over cells
迭代单元格的Apache POI文档中有关于所有这些的更多细节
#2
9
I have been frustrated by this same problem. Here is what I found with poi-3.7-20101029 and poi-3.8.
我对同样的问题感到沮丧。这是我在poi-3.7-20101029和poi-3.8中找到的。
RowIterator and CellIterator do not support iterating over NULL cells or rows -- only physically defined cells (which can be BLANK).
RowIterator和CellIterator不支持迭代NULL单元格或行 - 仅物理定义的单元格(可以是BLANK)。
The solution that returns what I expect requires using the 0-based Row.getCell([int], Row.CREATE_NULL_AS_BLANK)
, much like Chavira's answer alludes to (assuming 8 cell rows). Or you can use the Cell.columnIndex
value while iterating to check for jumping numbers...
返回我期望的解决方案需要使用基于0的Row.getCell([int],Row.CREATE_NULL_AS_BLANK),就像Chavira的回答暗示(假设8个单元格行)。或者您可以在迭代时使用Cell.columnIndex值来检查跳数...
Annoyingly, after creating blank cells using method #1, the iterators will return the now created BLANK cells. I consider it a bug that MissingCellPolicy is ignored by CellIterator.
令人讨厌的是,在使用方法#1创建空白单元格之后,迭代器将返回现在创建的BLANK单元格。我认为这是一个错误,CellIterator会忽略MissingCellPolicy。
#3
3
The reason is quite simple: Excel files can contain as many rows and as many columns as possibles, so returning all available blank rows and columns will render the cells huge and memory intensive.
原因很简单:Excel文件可以包含尽可能多的行和尽可能多的列,因此返回所有可用的空白行和列将使单元格变得庞大且占用大量内存。
Assuming you have a 10x10 sheet, in Excel, it's not "exactly" 10x10 since you can add 11x10 very easily with blank cell, so should POI return the 11th column?
假设你有10x10的表,在Excel中,它不是“完全”10x10,因为你可以很容易地用空白单元格添加11x10,那么POI应该返回第11列吗?
One way to do what you're requesting is to use HSSFCell.getColumnIndex()
.
您要求的一种方法是使用HSSFCell.getColumnIndex()。
Example:
//Assuming your have a 2 dimensional array.
String[][] values = ......;// It is assigned
POIFSFileSystem fileSystem = new POIFSFileSystem(new FileInputStream(fileName));
HSSFWorkbook workbook = new HSSFWorkbook(fileSystem);
//Going through every worksheet.
for (int sheetPos = 0; sheetPos < workbook.getNumberOfSheets(); sheetPos++) {
HSSFSheet sheet = workbook.getSheetAt(sheetPos);
int rowPos = 0;
Iterator<Row> rows = sheet.rowIterator();
while (rows.hasNext()) {
HSSFRow row = (HSSFRow) rows.next();
Iterator<Cell> cells = row.cellIterator();
while (cells.hasNext()) {
HSSFCell cell = (HSSFCell) cells.next();
String value = "";
switch (cell.getCellType()) {
case HSSFCell.CELL_TYPE_NUMERIC:
value = BigDecimal.valueOf(cell.getNumericCellValue()).toPlainString();
break;
case HSSFCell.CELL_TYPE_STRING:
value = cell.getStringCellValue();
break;
case HSSFCell.CELL_TYPE_BLANK:
value = "";
break;
case HSSFCell.CELL_TYPE_FORMULA:
value = cell.getCellFormula();
break;
default:
break;
}
values[rowPos][cell.getColumnIndex()] = value;
}
rowPos++;
}
}
#4
1
for(org.apache.poi.ss.usermodel.Row tmp : hssfSheet){
for(int i = 0; i<8;i++){
System.out.println(tmp.getCell(i));
}
}
#5
1
Below is what worked for me. The "row.CREATE_NULL_AS_BLANK" did not appear to be valid but that could be lack of NPOI knowledge.
以下是对我有用的。 “row.CREATE_NULL_AS_BLANK”似乎没有效果,但可能缺乏NPOI知识。
HSSFCell dataCell= (HSSFCell)row.GetCell(column, NPOI.SS.UserModel.MissingCellPolicy.CREATE_NULL_AS_BLANK);
#6
0
This worked for me....
这对我有用....
int rowNumber;
int previousCell;
int currentCell;
int currentRowNumber;
HSSFCell cell;
while (rows.hasNext()) {
previousCell = -1;
currentCell = 0;
while (cellIterator.hasNext()) {
cell = (HSSFCell) cellIterator.next();
currentCell = cell.getColumnIndex();
if (previousCell == currentCell-1) {
//...
}
else {
System.out.println("Blank cell found");
}
previousCell = currentCell;
}
}
#7
0
List cellDataList = new ArrayList();
int lineNumber = 0;
while (rowIterator.hasNext()) {
HSSFRow hssfRow = (HSSFRow) rowIterator.next();
//System.out.println("Befor If");
lineNumber++;
if(lineNumber==1){continue;}
//System.out.println("Out side if ");
Iterator<Cell> iterator = hssfRow.cellIterator();
List<Cell> cellTempList = new ArrayList();
int current = 0, next = 1;
while (iterator.hasNext()) {
Cell hssfCell = iterator.next();
current = hssfCell.getColumnIndex();
if(current<next){
System.out.println("Condition Satisfied");
}
else{
int loop = current-next;
System.out.println("inside else Loop value : "+(loop));
for(int k=0;k<loop+1;k++){
System.out.println("Adding nulls");
cellTempList.add(null);
next = next + 1;
}
}
cellTempList.add(hssfCell);
next = next + 1;
System.out.println("At End next value is : "+next);
}
cellDataList.add(cellTempList);
}
#8
0
public String[] rowToString(Row row)
{
Iterator<Cell> cells = row.cellIterator() ;
String[] data = new String[row.getLastCellNum()] ;
int previousCell = 0 ;
Cell cell = cells.next() ;
int currentCell = cell.getColumnIndex();
while (true)
{
if (previousCell == currentCell) {
switch (cell.getCellType()) {
case Cell.CELL_TYPE_NUMERIC:
data[previousCell] = cell.getNumericCellValue()+"" ;
break;
case Cell.CELL_TYPE_STRING:
data[previousCell] = cell.getStringCellValue() ;
break;
/* // there could be other cases here.
case Cell.CELL_TYPE_FORMULA:
data[previousCell] =eval.evaluateFormulaCell(cell);
break;
case Cell.CELL_TYPE_BOOLEAN:
data[previousCell] = cell.getBooleanCellValue();
break;
case Cell.CELL_TYPE_BLANK:
data[previousCell] = "";
break;
case Cell.CELL_TYPE_ERROR:
data[previousCell] = "ERROR";
break;
*/
}
if(cells.hasNext()){
cell = cells.next() ;
currentCell = cell.getColumnIndex();
} else {
break ;
}
} else {
data[previousCell] = "";
}
previousCell++ ;
}
return data ;
}