From a password-protected Excel file to a pandas DataFrame

Time: 2021-12-17 00:36:31

I can open a password-protected Excel file with this:

import sys
import win32com.client

xlApp = win32com.client.Dispatch("Excel.Application")
print("Excel library version:", xlApp.Version)
filename, password = sys.argv[1:3]
xlwb = xlApp.Workbooks.Open(filename, Password=password)
# xlwb = xlApp.Workbooks.Open(filename)
xlws = xlwb.Sheets(1)  # counts from 1, not from 0
print(xlws.Name)
print(xlws.Cells(1, 1))  # that's A1

I'm not sure, though, how to transfer the information to a pandas DataFrame. Do I need to read the cells one by one, or is there a more convenient way to do this?

3 answers

#1


3  

Assuming the starting cell is given as (StartRow, StartCol) and the ending cell is given as (EndRow, EndCol), I found the following worked for me:

import pandas

# Get the content of the rectangular selection region;
# content is a tuple of tuples (one inner tuple per row)
content = xlws.Range(xlws.Cells(StartRow, StartCol), xlws.Cells(EndRow, EndCol)).Value

# Transfer the content to a pandas DataFrame
dataframe = pandas.DataFrame(list(content))

Note: win32com addresses cells as (row, column), so Excel cell B5 is Cells(5, 2). Also, I needed list(...) to convert the tuple of tuples into a list of tuples, because the pandas.DataFrame constructor did not accept a tuple of tuples when I tried it.
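The conversion described in this answer can be tried out without Excel. The following is a minimal sketch, assuming that `content` has the tuple-of-tuples shape that a multi-cell `Range(...).Value` returns; the helper name `range_to_dataframe`, its `header` parameter, and the sample data are hypothetical and not part of the original answer.

```python
import pandas as pd


def range_to_dataframe(content, header=True):
    """Build a DataFrame from a tuple of row tuples.

    Hypothetical helper: `content` is assumed to look like the value
    of a multi-cell Excel range read through win32com.
    """
    rows = [list(row) for row in content]
    if header and rows:
        # Treat the first row as column names and the rest as data
        return pd.DataFrame(rows[1:], columns=rows[0])
    return pd.DataFrame(rows)


# Made-up stand-in for xlws.Range(...).Value
sample = (("name", "qty"), ("apple", 3.0), ("pear", 5.0))
df = range_to_dataframe(sample)
print(df)
```

With `header=False` the first row stays in the data and pandas assigns integer column labels, which matches the behaviour of the snippet in the answer.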

#2


1  

Assuming that you can use the win32com API to save the file back to disk without the password (which I realize might defeat the purpose), you could then immediately call the top-level pandas function read_excel. You'll need to install a reader library first, though: xlrd for .xls files (Excel 2003 and earlier) or openpyxl for .xlsx files (Excel 2007 and later). xlwt is only needed for writing .xls files. Here is the documentation for reading in Excel files. Currently pandas does not provide support for using the win32com API to read Excel files. You're welcome to open up a GitHub issue if you'd like.
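As a rough illustration of the reader-library choice mentioned in this answer, the sketch below picks a `read_excel` engine from the file extension. It assumes the file has already been saved without a password; the helper names `pick_engine` and `read_unprotected_copy` are hypothetical, and pandas can usually infer the engine on its own when `engine` is omitted.

```python
import os

import pandas as pd


def pick_engine(path):
    """Return the read_excel engine name for a given file extension."""
    ext = os.path.splitext(path)[1].lower()
    if ext == ".xls":
        return "xlrd"  # legacy binary format (Excel 2003 and earlier)
    if ext in (".xlsx", ".xlsm"):
        return "openpyxl"  # Office Open XML format (Excel 2007 and later)
    raise ValueError("unsupported Excel extension: " + ext)


def read_unprotected_copy(path, sheet_name=0):
    """Read a workbook that was saved without password protection."""
    return pd.read_excel(path, sheet_name=sheet_name, engine=pick_engine(path))
```

Calling `read_unprotected_copy("decrypted.xlsx")` would then return the first sheet as a DataFrame, provided the matching library is installed; the file name here is only a placeholder.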

#3


0  

From David Hamann's site (all credit goes to him): https://davidhamann.de/2018/02/21/read-password-protected-excel-files-into-pandas-dataframe/

Use xlwings. Opening the file first launches the Excel application, so you can enter the password there.

import pandas as pd
import xlwings as xw

PATH = '/Users/me/Desktop/xlwings_sample.xlsx'
wb = xw.Book(PATH)
sheet = wb.sheets['sample']

df = sheet['A1:C4'].options(pd.DataFrame, index=False, header=True).value
df
