Merge an existing PDF into a new ReportLab PDF via a flowable

Date: 2021-07-27 21:11:35

I have a reportlab SimpleDocTemplate that I return as a dynamic PDF. I generate its content based on some Django model metadata. Here's my template setup:

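from io import BytesIO
from reportlab.lib.pagesizes import letter
from reportlab.platypus import SimpleDocTemplate

buff = BytesIO()  # PDF output is binary, so use BytesIO rather than StringIO
doc = SimpleDocTemplate(buff, pagesize=letter,
                        rightMargin=72, leftMargin=72,
                        topMargin=72, bottomMargin=18)
Story = []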

I can easily add textual metadata from the Entry model into the Story list to be built later:

    ptext = '<font size=20>%s</font>' % entry.title.title()
    paragraph = Paragraph(ptext, custom_styles["Custom"])
    Story.append(paragraph)

And then generate the PDF to be returned in the response by calling build on the SimpleDocTemplate:

doc.build(Story, onFirstPage=entry_page_template, onLaterPages=entry_page_template)

pdf = buff.getvalue()
resp = HttpResponse(content_type='application/x-download')  # 'mimetype' was removed in Django 1.7
resp['Content-Disposition'] = 'attachment;filename=logbook.pdf'
resp.write(pdf)
return resp

One metadata field on the model is a file attachment. When those attachments are PDFs, I'd like to merge them into the Story that I am generating, i.e. add each PDF as a reportlab "flowable".

I'm attempting to do so using pdfrw, but haven't had any luck. Ideally I'd love to just call:

from pdfrw import PdfReader
pdf = PdfReader(entry.document.file.path)
Story.append(pdf)

and append the pdf to the existing Story list to be included in the generation of the final document, as noted above.

Anyone have any ideas? I tried something similar using pagexobj to create the pdf, trying to follow this example:

http://code.google.com/p/pdfrw/source/browse/trunk/examples/rl1/subset.py

from pdfrw.buildxobj import pagexobj
from pdfrw.toreportlab import makerl

pdf = pagexobj(PdfReader(entry.document.file.path))

But didn't have any luck either. Can someone explain to me the best way to merge an existing PDF file into a reportlab flowable? I'm no good with this stuff and have been banging my head on pdf-generation for days now. :) Any direction greatly appreciated!

2 answers

#1


I just had a similar task in a project. I used reportlab (open source version) to generate pdf files and pyPDF to facilitate the merge. My requirements were slightly different in that I just needed one page from each attachment, but I'm sure this is probably close enough for you to get the general idea.

from datetime import datetime

from django.conf import settings
from pyPdf import PdfFileReader, PdfFileWriter

def create_merged_pdf(user):
    basepath = settings.MEDIA_ROOT + "/"
    # the following block calls the function (not shown) that uses reportlab to generate a pdf
    coversheet_path = basepath + "%s_%s_cover_%s.pdf" % (user.first_name, user.last_name, datetime.now().strftime("%f"))
    create_cover_sheet(coversheet_path, user, user.performancereview_set.all())

    # now use the cover sheet and all of the performance reviews to create a merged pdf
    merged_path = basepath + "%s_%s_merged_%s.pdf" % (user.first_name, user.last_name, datetime.now().strftime("%f"))

    # for merged file result
    output = PdfFileWriter()

    # for each pdf file to add, open it in a PdfFileReader object and add its page to output
    # (open() replaces the Python 2-only file() builtin; the inputs must stay open until output is written)
    cover_pdf = PdfFileReader(open(coversheet_path, "rb"))
    output.addPage(cover_pdf.getPage(0))

    # iterate through attached files and merge.  I only needed the first page, YMMV
    for review in user.performancereview_set.all():
        review_pdf = PdfFileReader(open(review.pdf_file.file.name, "rb"))
        output.addPage(review_pdf.getPage(0))  # only first page of attachment

    # write out the merged file
    outputStream = open(merged_path, "wb")
    output.write(outputStream)
    outputStream.close()

#2


I used the following class to solve my issue. It inserts the PDFs as vector PDF images. This suited me because I needed a table of contents: since each PDF becomes a flowable object, reportlab's built-in TOC functionality worked like a charm.

See the question "Is there a matplotlib flowable for ReportLab?"

Note: If you have multiple pages in the file, you have to modify the class slightly. The sample class is designed to just read the first page of the PDF.
