I am using iText (Java) to split PDFs at the bookmark level. Does anybody know how to split a PDF at bookmarks that sit at level 2 or 3, or have an example of doing so? For example, I have bookmarks at the following levels:
Father
|-Son
|-Son
|-Daughter
|-|-Grand son
|-|-Grand daughter
Right now I have the code below, which reads only the top-level bookmarks (Father). Basically, the SimpleBookmark.getBookmark(reader) line does all the work.
But I also want to read the level 2 and level 3 bookmarks, so that I can split out the content that lies between those inner-level bookmarks.
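// Imports assume iText 5 (com.itextpdf.*); iText 2.x uses com.lowagie.* instead.
import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfImportedPage;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.text.pdf.SimpleBookmark;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.List;

public static void splitPDFByBookmarks(String pdf, String outputFolder) {
    try {
        PdfReader reader = new PdfReader(pdf);
        // List of top-level bookmarks: each bookmark is a map with values for title, page, etc.
        List<HashMap<String, Object>> bookmarks = SimpleBookmark.getBookmark(reader);
        for (int i = 0; i < bookmarks.size(); i++) {
            HashMap<String, Object> bm = bookmarks.get(i);
            HashMap<String, Object> nextBM = i == bookmarks.size() - 1 ? null : bookmarks.get(i + 1);
            // In my case I needed to split the title string
            String title = ((String) bm.get("Title")).split(" ")[2];
            log.debug("Title: " + title);
            // The first token of "Page" is the page number
            String startPage = ((String) bm.get("Page")).split(" ")[0];
            // The last bookmark runs to the end of the document
            String startPageNextBM = nextBM == null ? "" + (reader.getNumberOfPages() + 1) : ((String) nextBM.get("Page")).split(" ")[0];
            log.debug("Page: " + startPage);
            log.debug("------------------");
            extractBookmarkToPDF(reader, Integer.valueOf(startPage), Integer.valueOf(startPageNextBM), title + ".pdf", outputFolder);
        }
        reader.close();
    } catch (IOException e) {
        log.error(e.getMessage());
    }
}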
private static void extractBookmarkToPDF(PdfReader reader, int pageFrom, int pageTo, String outputName, String outputFolder) {
    Document document = new Document();
    OutputStream os = null;
    try {
        // outputFolder must end with a path separator
        os = new FileOutputStream(outputFolder + outputName);
        // Create a writer for the output stream
        PdfWriter writer = PdfWriter.getInstance(document, os);
        document.open();
        PdfContentByte cb = writer.getDirectContent(); // Holds the PDF data
        // Copy pages pageFrom (inclusive) to pageTo (exclusive)
        while (pageFrom < pageTo) {
            document.newPage();
            PdfImportedPage page = writer.getImportedPage(reader, pageFrom);
            cb.addTemplate(page, 0, 0);
            pageFrom++;
        }
        // Closing the document finishes the PDF and flushes it to the stream
        document.close();
    } catch (Exception ex) {
        log.error(ex.getMessage());
    } finally {
        if (document.isOpen())
            document.close();
        try {
            if (os != null)
                os.close();
        } catch (IOException ioe) {
            log.error(ioe.getMessage());
        }
    }
}
Your help is much appreciated. Thanks in advance! :)
1 answer
#1
Calling SimpleBookmark.getBookmark(reader) gives you an ArrayList<HashMap> (cast it if you need to). Iterate through that ArrayList and inspect its structure. If a bookmark has children ("sons", as you call them), its map contains another list with the same structure, stored under the "Kids" key.
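To see that structure for one level, something like the following might help. It is only a sketch: it works on plain maps shaped like the ones SimpleBookmark returns (keys "Title", "Page" and "Kids"), and the class and method names are made up for illustration.

```java
import java.util.HashMap;
import java.util.List;

class BookmarkInspector {

    /** Summarises each entry of one bookmark level: title, raw "Page" value, child count. */
    static String describe(List<HashMap<String, Object>> level) {
        StringBuilder sb = new StringBuilder();
        for (HashMap<String, Object> bm : level) {
            // "Kids" is present only when the bookmark has children
            Object kids = bm.get("Kids");
            int childCount = (kids instanceof List) ? ((List<?>) kids).size() : 0;
            sb.append(bm.get("Title"))
              .append(" | Page=").append(bm.get("Page"))
              .append(" | children=").append(childCount)
              .append('\n');
        }
        return sb.toString();
    }
}
```

Passing the list from SimpleBookmark.getBookmark(reader) should show which top-level entries carry a nested list.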
A recursive method could be the solution.
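Here is a rough sketch of what such a recursive method could look like. It assumes the bookmark maps use the "Title", "Page" and "Kids" keys, that the first token of "Page" is the page number, and that bookmarks appear in page order. BookmarkRanges, Range, collect and ranges are hypothetical names, not part of iText.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

class BookmarkRanges {

    /** Pages [from, to) that belong to one bookmark; 'to' is exclusive. */
    static final class Range {
        final String title;
        final int from;
        final int to;

        Range(String title, int from, int to) {
            this.title = title;
            this.from = from;
            this.to = to;
        }
    }

    /** Depth-first walk: appends every bookmark down to maxLevel (1 = top level). */
    static void collect(List<HashMap<String, Object>> level, int depth, int maxLevel,
                        List<HashMap<String, Object>> out) {
        if (level == null || depth > maxLevel) {
            return;
        }
        for (HashMap<String, Object> bm : level) {
            if (bm.get("Page") instanceof String) { // skip entries without a page destination
                out.add(bm);
            }
            @SuppressWarnings("unchecked")
            List<HashMap<String, Object>> kids = (List<HashMap<String, Object>>) bm.get("Kids");
            collect(kids, depth + 1, maxLevel, out);
        }
    }

    /** First token of "Page" (for example "7 Fit") is the 1-based page number. */
    static int pageOf(HashMap<String, Object> bm) {
        return Integer.parseInt(((String) bm.get("Page")).split(" ")[0]);
    }

    /** Each bookmark runs up to the next collected bookmark; the last one runs to the end. */
    static List<Range> ranges(List<HashMap<String, Object>> bookmarks, int maxLevel, int pageCount) {
        List<HashMap<String, Object>> flat = new ArrayList<>();
        collect(bookmarks, 1, maxLevel, flat);
        List<Range> result = new ArrayList<>();
        for (int i = 0; i < flat.size(); i++) {
            int from = pageOf(flat.get(i));
            int to = (i + 1 < flat.size()) ? pageOf(flat.get(i + 1)) : pageCount + 1;
            if (to > from) { // an empty range would produce a PDF with no pages
                result.add(new Range((String) flat.get(i).get("Title"), from, to));
            }
        }
        return result;
    }
}
```

With iText you would presumably call BookmarkRanges.ranges(SimpleBookmark.getBookmark(reader), 3, reader.getNumberOfPages()) and pass each range's from, to and title to your extractBookmarkToPDF method. Note the design choice: a parent bookmark only gets the pages before its first child, so Father and Daughter become short files of their own rather than containers for their children.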