How can I use Python to pull information from Excel into PowerPoint while preserving formatting?

Time: 2021-01-07 13:02:27

I've written a script using Python's xlrd and python-pptx that reads each workbook in a directory and pulls the contents of each sheet into a table on a PowerPoint slide. It works fine when the Excel table is small, but I don't know in advance what these Excel files will contain, and the table becomes illegible when there are too many rows and columns. My main problem arose when an Excel file contained graphs instead of cells and the script couldn't read it. I tried using pyscreenshot to open the document and take a screenshot, but this seems slow and unnecessary. I'd like each PowerPoint slide to look exactly as the sheet does in Excel, while still being able to add and change things.

# import libraries and modules
import xlrd
from pptx import Presentation
from pptx.util import Inches, Pt
import time
import glob
import os

start = time.time()

prs = Presentation()
title_slide_layout = prs.slide_layouts[0]
slide = prs.slides.add_slide(title_slide_layout)
shapes = slide.shapes
title = slide.shapes.title
subtitle = slide.placeholders[1]

title.text = "Dashboard Generator"
subtitle.text = "made with Python-pptx and xlrd"

for filename in glob.glob(os.path.join("C:/Users/penelope/Desktop/PMO/myfiles/", '*.xlsx')):
    print(filename)
    file_location = filename
    try: 
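        # Note: xlrd 2.0 and later read only .xls files; reading .xlsx
        # needs an older xlrd or another reader such as openpyxl.
        workbook = xlrd.open_workbook(file_location)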
        nsheets = workbook.nsheets
        for n in range(0, nsheets):
            sheet = workbook.sheet_by_index(n)
            print("sheet:", sheet)
            rows = sheet.nrows
            cols = sheet.ncols
            c = cols
            r = rows
            if c > 0:
                print(c, r)
                slide = prs.slides.add_slide(prs.slide_layouts[5])
                shapes = slide.shapes
                title = slide.shapes.title
                title.text = "Table testing"
                left = Inches(0.0)
                top = Inches(2.0)
                width = Inches(6.0)
                height = Inches(4.0)
                num = 10.0/c
                table = shapes.add_table(rows, cols, left, top, width, height).table
                for i in range(0, c):
                    table.columns[i].width = Inches(num)
                for i in range(0, r):
                    for e in range(0, c):
                        # Copy the cell value as text, then shrink the font
                        cell = table.cell(i, e)
                        cell.text = str(sheet.cell_value(i, e))
                        paragraph = cell.text_frame.paragraphs[0]
                        paragraph.font.size = Pt(11)
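    except Exception as exc:
        # Report which workbook failed and why instead of hiding the cause
        print("Error reading", filename, "-", exc)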

prs.save('powerpointfile1.pptx')
end = time.time()
print(end - start)

And this is my screenshot script:

import os
import time
import pyscreenshot as ImageGrab
from PIL import Image

if __name__ == "__main__":
    os.system('start excel.exe "C:/Users/penelope/Desktop/PMO/TestCase.xlsx"')
    time.sleep(3)
    im=ImageGrab.grab(bbox=(24,210,1800,990))
    im.save("image7.png")
    img = Image.open('image7.png')
    img.show()

1 answer

#1

Well, you've chosen a hard problem. Certainly all the times I've attempted this sort of thing I've ended up abandoning the effort.

The fundamental explanation I formed was that Excel (and Word) are "flowed" document environments. That is, when you run out of room on one page, it flows to the next. PowerPoint, on the other hand, is a page-by-page exhibit layout environment. Each slide is independent of the rest (evidenced by the ability to reorder slides freely), each meant to be shown all at once, and not scrolled. This leads to each slide being self-contained, which means constrained to a single "page".
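
Because a slide does not flow onto the next one, a long sheet would have to be split by hand before it is written out. Here is a minimal sketch of that idea, not something python-pptx does for you. The helper name `paginate_rows` and the limit of 12 data rows per slide are assumptions made for illustration, and the first row is assumed to be a header worth repeating.

```python
from typing import Iterator, List, Sequence


def paginate_rows(
    rows: Sequence[Sequence[object]],
    max_body_rows: int = 12,
) -> Iterator[List[Sequence[object]]]:
    """Yield one list of rows per slide, repeating the header row on each.

    rows[0] is assumed to be the header. max_body_rows is an assumed
    legibility limit, not a PowerPoint constant.
    """
    if max_body_rows < 1:
        raise ValueError("max_body_rows must be at least 1")
    if not rows:
        return
    header, body = rows[0], rows[1:]
    if not body:
        # Header only: still emit a single page
        yield [header]
        return
    for start in range(0, len(body), max_body_rows):
        yield [header] + list(body[start:start + max_body_rows])


# Example: a header plus 30 data rows, at most 12 data rows per slide.
sheet_rows = [["id", "value"]] + [[i, i * i] for i in range(30)]
pages = list(paginate_rows(sheet_rows, max_body_rows=12))
print(len(pages))               # 3
print([len(p) for p in pages])  # [13, 13, 7]
```

Each page could then get its own slide and its own `shapes.add_table(len(page), ncols, ...)` call, as in the question's script. Whether the result communicates anything is a separate matter.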

There's a limit to how much information one can place on a slide and still have it communicate. Generally less is better. So, perhaps it's not a surprise all my early efforts there ended in frustration :) I also concluded that an effective "dashboard" slide would require very skillful layout, and extreme restraint on content length, probably requiring specific (human) summarization effort (not just copying from a "database").

Regarding the charts bit, those theoretically can be moved to PowerPoint and I've even seen it done, but it's technically quite challenging. There is no API support for it in python-pptx. This historical issue on the GitHub repo may give some idea what was involved. Not for the faint of heart I expect :)
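
This does not move a chart anywhere, but a script could at least notice which workbooks contain charts and handle those separately, for example by falling back to the screenshot route. An .xlsx file is a ZIP package, and Excel conventionally stores chart parts under `xl/charts/`. The sketch below relies on that convention. The helper name `chart_parts` is hypothetical, and a file written by another tool might lay things out differently.

```python
import io
import zipfile
from typing import List


def chart_parts(xlsx_file) -> List[str]:
    """Return the names of chart parts inside an .xlsx package.

    Accepts a path or a binary file-like object. Assumes the
    conventional xl/charts/chartN.xml naming that Excel uses.
    """
    with zipfile.ZipFile(xlsx_file) as package:
        return sorted(
            name for name in package.namelist()
            if name.startswith("xl/charts/chart") and name.endswith(".xml")
        )


# Build a tiny stand-in package in memory (not a valid workbook,
# just enough structure to exercise the helper).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as z:
    z.writestr("xl/workbook.xml", "<workbook/>")
    z.writestr("xl/charts/chart1.xml", "<chartSpace/>")
    z.writestr("xl/charts/style1.xml", "<chartStyle/>")
buffer.seek(0)

found = chart_parts(buffer)
print(found)  # ['xl/charts/chart1.xml']
```

An empty list would suggest the sheet data can be read as cells in the usual way. A non-empty one flags a workbook that needs the harder treatment.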
