PDF files: inserting HTML content with itextpdf, and a fix for Chinese text

Date: 2021-02-05 14:38:17

Overview

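There are already plenty of posts online about converting a whole HTML file directly to PDF, but very few cover inserting an HTML fragment into a PDF as a paragraph, and none of them handles the display of Chinese text well.

So I spent this morning looking into the problem, and I am sharing the solution here.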

For the basics of itextpdf, see: http://www.cnblogs.com/mvilplss/p/5640598.html

Thanks to: http://gridmix.blog.51cto.com/4764051/1229585

Approach

To insert an HTML fragment, we use the static method XMLWorkerHelper.parseToElementList:

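    // Assumes an open com.itextpdf.text.Document named "document"
    String html = "<div style='color:green;font-size:20px;'>你好世界!hello world !</div>";
    Paragraph context = new Paragraph();
    // Parse the fragment into iText elements (second argument: optional CSS, null for none)
    ElementList elementList = XMLWorkerHelper.parseToElementList(html, null);
    for (Element element : elementList) {
        context.add(element);
    }
    document.add(context);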

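However, you will find that the Chinese characters are not displayed. Many fixes for this are suggested online, but none of them worked well.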

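Looking at the source of XMLWorkerHelper.parseToElementList(htmlString, null), we find that the font provider is passed in on the following line, which is where the font can be swapped:

    CssAppliers cssAppliers = new CssAppliersImpl(FontFactory.getFontImp());

The full method reads:
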
    public static ElementList parseToElementList(String html, String css) throws IOException {
        // CSS
        CSSResolver cssResolver = new StyleAttrCSSResolver();
        if (css != null) {
            CssFile cssFile = XMLWorkerHelper.getCSS(new ByteArrayInputStream(css.getBytes()));
            cssResolver.addCss(cssFile);
        }

        // HTML
        CssAppliers cssAppliers = new CssAppliersImpl(FontFactory.getFontImp()); // this is where the font can be changed
        HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
        htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
        htmlContext.autoBookmark(false);

        // Pipelines
        ElementList elements = new ElementList();
        ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
        HtmlPipeline htmlPipeline = new HtmlPipeline(htmlContext, end);
        CssResolverPipeline cssPipeline = new CssResolverPipeline(cssResolver, htmlPipeline);

        // XML Worker
        XMLWorker worker = new XMLWorker(cssPipeline, true);
        XMLParser p = new XMLParser(worker);
        p.parse(new ByteArrayInputStream(html.getBytes()));
        return elements;
    }

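This suggests overriding the getFont(...) method of XMLWorkerFontProvider: when the CSS does not explicitly declare a font, the font name arrives undefined (null), and we substitute a default font that can render Chinese.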

    import java.io.ByteArrayInputStream;
    import java.io.IOException;
    import com.itextpdf.text.Font;
    import com.itextpdf.tool.xml.*;
    import com.itextpdf.tool.xml.css.*;
    import com.itextpdf.tool.xml.html.*;
    import com.itextpdf.tool.xml.parser.*;
    import com.itextpdf.tool.xml.pipeline.css.*;
    import com.itextpdf.tool.xml.pipeline.end.*;
    import com.itextpdf.tool.xml.pipeline.html.*;

    public class MyXMLWorkerHelper {
        // Font provider that falls back to a Chinese font when the CSS names none
        public static class MyFontsProvider extends XMLWorkerFontProvider {
            public MyFontsProvider() {
                super(null, null);
            }

            @Override
            public Font getFont(final String fontname, String encoding, float size, final int style) {
                String fntname = fontname;
                if (fntname == null) {
                    fntname = "宋体"; // SimSun; the font must be available on the machine
                }
                return super.getFont(fntname, encoding, size, style);
            }
        }

        // Same as XMLWorkerHelper.parseToElementList, but using MyFontsProvider
        public static ElementList parseToElementList(String html, String css) throws IOException {
            // CSS
            CSSResolver cssResolver = new StyleAttrCSSResolver();
            if (css != null) {
                CssFile cssFile = XMLWorkerHelper.getCSS(new ByteArrayInputStream(css.getBytes()));
                cssResolver.addCss(cssFile);
            }

            // HTML
            MyFontsProvider fontProvider = new MyFontsProvider();
            CssAppliers cssAppliers = new CssAppliersImpl(fontProvider);
            HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
            htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
            htmlContext.autoBookmark(false);

            // Pipelines
            ElementList elements = new ElementList();
            ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
            HtmlPipeline htmlPipeline = new HtmlPipeline(htmlContext, end);
            CssResolverPipeline cssPipeline = new CssResolverPipeline(cssResolver, htmlPipeline);

            // XML Worker
            XMLWorker worker = new XMLWorker(cssPipeline, true);
            XMLParser p = new XMLParser(worker);
            // Strip unclosed single tags; these literal replacements only match the bare
            // forms shown (e.g. <img>), not tags that carry attributes
            html = html.replace("<br>", "").replace("<hr>", "").replace("<img>", "").replace("<param>", "")
                    .replace("<link>", "");
            p.parse(new ByteArrayInputStream(html.getBytes()));
            return elements;
        }
    }

Because XMLWorker does not support unclosed single HTML tags, these tags have to be filtered out first. Otherwise it fails with the error: Invalid nested tag div found, expected closing tag br
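
Removing the tags also removes their effect (line breaks, rules, images), and a literal replace only matches the bare form of a tag. As an alternative, here is a minimal sketch that rewrites unclosed single tags into self-closed form (for example <br/>) before parsing, so the markup stays well-formed. The class and method names (VoidTagNormalizer, selfCloseVoidTags) are hypothetical and not part of iText, the tag list is an assumption, and the regex is a simplification that does not handle a '>' inside an attribute value. Whether each self-closed tag renders as expected should be checked against your iText version.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of iText): rewrites unclosed single tags
// such as <br> or <img src="..."> into self-closed form (<br/>).
final class VoidTagNormalizer {

    // Assumed list of single tags to normalize; extend as needed.
    private static final Pattern VOID_TAG = Pattern.compile(
            "<(br|hr|img|input|link|meta|param)\\b([^>]*?)\\s*/?>",
            Pattern.CASE_INSENSITIVE);

    private VoidTagNormalizer() {
    }

    static String selfCloseVoidTags(String html) {
        // $1 is the tag name, $2 its attributes (possibly empty);
        // tags that are already self-closed come out unchanged.
        return VOID_TAG.matcher(html).replaceAll("<$1$2/>");
    }
}
```

To try it, call html = VoidTagNormalizer.selfCloseVoidTags(html); right before p.parse(...), in place of the chain of replace calls.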