I am using Apache POI to read an Excel document, and so far it has served my purpose. The one thing I am stuck on is extracting the value of a cell as HTML.
I have one cell in which the user enters some text and applies formatting (bullets, numbering, bold, italic, etc.).
When I read that cell, the content should come back as HTML, not as the plain string that POI returns.
I have gone through almost the entire POI API but could not find anything that does this. I want to retain the formatting of just one particular column, not the entire workbook. By column I mean the text entered in that column; I want that text as HTML.
I also explored and tried Apache Tika. However, as far as I understand, it can only give me the text, not its formatting.
Could someone please guide me? I am running out of options.
Suppose I write "My name is Angel and Demon" in Excel, with "Angel" in bold and "Demon" in italics.
The output I should get in Java is: My name is <b>Angel</b> and <i>Demon</i>
1 solution
#1
I pasted this as Unicode text into cell A1 of an .xlsx file:
<html><p>This is a test. Will this text be <b>bold</b> or <i>italic</i></p></html>
This HTML line produces the following cell content, with "bold" rendered in bold and "italic" in italics:
This is a test. Will this text be bold or italic
My code:
import java.io.FileInputStream;
import java.io.IOException;

import org.apache.poi.ss.util.CellReference;
import org.apache.poi.xssf.usermodel.XSSFCell;
import org.apache.poi.xssf.usermodel.XSSFFont;
import org.apache.poi.xssf.usermodel.XSSFRichTextString;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ExcelWithHtml {
    // Cell A1 was filled by pasting:
    // <html><p>This is a test. Will this text be <b>bold</b> or
    // <i>italic</i></p></html>

    public static void main(String[] args) throws IOException {
        new ExcelWithHtml()
                .readFirstCellOfXSSF("/Users/rcacheira/testeHtml.xlsx");
    }

    // Which tags are currently open while walking the formatting runs.
    boolean inBold = false;
    boolean inItalic = false;

    public void readFirstCellOfXSSF(String filePathName) throws IOException {
        // try-with-resources closes the stream and workbook even on failure.
        try (FileInputStream fis = new FileInputStream(filePathName);
                XSSFWorkbook wb = new XSSFWorkbook(fis)) {
            XSSFSheet sheet = wb.getSheetAt(0);
            String cellHtml = getHtmlFormatedCellValueFromSheet(sheet, "A1");
            System.out.println(cellHtml);
        }
    }

    public String getHtmlFormatedCellValueFromSheet(XSSFSheet sheet,
            String cellName) {
        CellReference cellReference = new CellReference(cellName);
        XSSFRow row = sheet.getRow(cellReference.getRow());
        XSSFCell cell = (row == null) ? null : row.getCell(cellReference.getCol());
        if (cell == null) {
            return ""; // the row or cell was never created
        }
        XSSFRichTextString cellText = cell.getRichStringCellValue();
        String text = cellText.getString();
        if (cellText.numFormattingRuns() == 0) {
            return text; // plain text, no runs to walk
        }
        StringBuilder htmlCode = new StringBuilder();
        for (int i = 0; i < cellText.numFormattingRuns(); i++) {
            // getFontOfFormattingRun takes a run index and returns null for
            // an unformatted run. (getFontAtIndex expects a character index,
            // so it must not be called with the run index.)
            htmlCode.append(getFormatFromFont(cellText.getFontOfFormattingRun(i)));
            int indexStart = cellText.getIndexOfFormattingRun(i);
            int indexEnd = indexStart + cellText.getLengthOfFormattingRun(i);
            htmlCode.append(text, indexStart, indexEnd);
        }
        htmlCode.append(closeOpenTags());
        return htmlCode.toString();
    }

    // Returns the tags needed to switch from the current style to the style
    // of the given font; a null font means "no bold, no italic".
    private String getFormatFromFont(XSSFFont font) {
        boolean italic = font != null && font.getItalic();
        boolean bold = font != null && font.getBold();
        if (italic == inItalic && bold == inBold) {
            return "";
        }
        StringBuilder formatHtmlCode = new StringBuilder(closeOpenTags());
        if (italic) {
            formatHtmlCode.append("<i>");
        }
        if (bold) {
            formatHtmlCode.append("<b>");
        }
        inItalic = italic;
        inBold = bold;
        return formatHtmlCode.toString();
    }

    // Closes open tags in reverse order of opening so the HTML stays nested.
    private String closeOpenTags() {
        String closing = (inBold ? "</b>" : "") + (inItalic ? "</i>" : "");
        inBold = false;
        inItalic = false;
        return closing;
    }
}
My output:
This is a test. Will this text be <b>bold</b> or <i>italic</i>
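The tag bookkeeping can be tried without a workbook. Below is a minimal sketch, not part of POI: the `Run` class and `RunHtmlSketch` are hypothetical stand-ins for a formatting run (its text plus bold and italic flags). It also escapes `&`, `<` and `>`, which the code above does not do, on the assumption that cell text may contain those characters.

```java
import java.util.ArrayList;
import java.util.List;

final class RunHtmlSketch {

    /** Hypothetical stand-in for one formatting run: text plus two font flags. */
    static final class Run {
        final String text;
        final boolean bold;
        final boolean italic;

        Run(String text, boolean bold, boolean italic) {
            this.text = text;
            this.bold = bold;
            this.italic = italic;
        }
    }

    /** Escapes the three characters that would otherwise be read as markup. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Emits i/b tags only where the style changes, always properly nested. */
    static String toHtml(List<Run> runs) {
        StringBuilder html = new StringBuilder();
        boolean inBold = false;
        boolean inItalic = false;
        for (Run run : runs) {
            if (run.bold != inBold || run.italic != inItalic) {
                // Close in reverse order of opening (<i> is always opened first).
                if (inBold) html.append("</b>");
                if (inItalic) html.append("</i>");
                if (run.italic) html.append("<i>");
                if (run.bold) html.append("<b>");
                inBold = run.bold;
                inItalic = run.italic;
            }
            html.append(escape(run.text));
        }
        if (inBold) html.append("</b>");
        if (inItalic) html.append("</i>");
        return html.toString();
    }

    public static void main(String[] args) {
        List<Run> runs = new ArrayList<>();
        runs.add(new Run("My name is ", false, false));
        runs.add(new Run("Angel", true, false));
        runs.add(new Run(" and ", false, false));
        runs.add(new Run("Demon", false, true));
        System.out.println(toHtml(runs));
        // prints: My name is <b>Angel</b> and <i>Demon</i>
    }
}
```

With POI, each `Run` would be filled from one formatting run of the cell's rich text string, treating a missing font as neither bold nor italic.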
I think this is what you want. I'm only showing what is possible here; the code was written quickly to produce an output and does not follow best coding practices.
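The question also mentions bullets. Excel cells have no list structure of their own, so the sketch below assumes bullets arrive as literal "•" characters at the start of lines separated by in-cell line breaks. `BulletLinesSketch` is a hypothetical helper working on plain text only; it does not escape HTML or keep bold and italic runs, which would have to be combined with the run handling above.

```java
final class BulletLinesSketch {

    /** Wraps consecutive lines starting with a bullet character in ul/li tags. */
    static String toHtml(String cellText) {
        StringBuilder html = new StringBuilder();
        boolean inList = false;
        for (String line : cellText.split("\r?\n")) {
            String trimmed = line.trim();
            boolean bullet = trimmed.startsWith("\u2022"); // assumed bullet marker
            if (bullet && !inList) {
                html.append("<ul>");
                inList = true;
            }
            if (!bullet && inList) {
                html.append("</ul>");
                inList = false;
            }
            if (bullet) {
                // Drop the bullet character and surrounding spaces.
                html.append("<li>").append(trimmed.substring(1).trim()).append("</li>");
            } else if (!trimmed.isEmpty()) {
                html.append("<p>").append(trimmed).append("</p>");
            }
        }
        if (inList) {
            html.append("</ul>");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHtml("Items:\n\u2022 one\n\u2022 two"));
        // prints: <p>Items:</p><ul><li>one</li><li>two</li></ul>
    }
}
```

Numbered lines such as "1." could be handled the same way with a different prefix check and `<ol>`.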