Read all worksheets in an Excel workbook into an R list with data.frames

Time: 2022-11-16 18:34:59

I understand that XLConnect can be used to read an Excel worksheet into R. For example, this would read the first worksheet in a workbook called test.xls into R.

library(XLConnect)
readWorksheetFromFile('test.xls', sheet = 1)

I have an Excel Workbook with multiple worksheets.

How can all worksheets in a workbook be imported into a list in R where each element of the list is a data.frame for a given sheet, and where the name of each element corresponds to the name of the worksheet in Excel?

9 answers

#1


66  

Updated answer using readxl (22nd June 2015)

Since posting this question the readxl package has been released. It supports both xls and xlsx format. Importantly, in contrast to other excel import packages, it works on Windows, Mac, and Linux without requiring installation of additional software.

So a function for importing all sheets in an Excel workbook would be:

library(readxl)    
read_excel_allsheets <- function(filename, tibble = FALSE) {
    # I prefer straight data.frames
    # but if you like tidyverse tibbles (the default with read_excel)
    # then just pass tibble = TRUE
    sheets <- readxl::excel_sheets(filename)
    x <- lapply(sheets, function(X) readxl::read_excel(filename, sheet = X))
    if(!tibble) x <- lapply(x, as.data.frame)
    names(x) <- sheets
    x
}

This could be called with:

mysheets <- read_excel_allsheets("foo.xls")
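
Once the list is built, individual sheets can be pulled out by name or by position. The following is a minimal sketch rather than part of the original answer: it assumes the example workbook that ships with readxl (located with readxl_example()) as a stand-in for foo.xls, and it uses sapply() with simplify = FALSE as a compact variant that names the list elements after the sheets automatically.

```r
library(readxl)

# Assumption: the workbook bundled with readxl stands in for your own file.
path <- readxl_example("datasets.xlsx")

# sapply() over a character vector uses that vector as names;
# simplify = FALSE keeps the result as a plain list.
mysheets <- sapply(excel_sheets(path), function(sheet_name) {
  as.data.frame(read_excel(path, sheet = sheet_name))
}, simplify = FALSE)

names(mysheets)        # one element per worksheet, named after it
head(mysheets[[1]])    # first sheet, as a data.frame
```

Access by name, for example mysheets[["some sheet"]], works the same way and is usually more robust than a position if the sheet order in the workbook can change.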

Old Answer

Building on the answer provided by @mnel, here is a simple function that takes an Excel file as an argument and returns each sheet as a data.frame in a named list.

library(XLConnect)

importWorksheets <- function(filename) {
    # filename: name of Excel file
    workbook <- loadWorkbook(filename)
    sheet_names <- getSheets(workbook)
    # name the vector so that lapply() returns a list named by sheet
    names(sheet_names) <- sheet_names
    # lapply() is the last expression, so the list is returned visibly
    lapply(sheet_names, function(.sheet) {
        readWorksheet(object = workbook, sheet = .sheet)
    })
}

Thus, it could be called with:

importWorksheets('test.xls')

#2


41  

Note that most of XLConnect's functions are already vectorized. This means that you can read in all worksheets with one function call without having to do explicit vectorization:

library(XLConnect)
wb <- loadWorkbook(system.file("demoFiles/mtcars.xlsx", package = "XLConnect"))
lst <- readWorksheet(wb, sheet = getSheets(wb))

With XLConnect 0.2-0 lst will already be a named list.

#3


7  

You can load the workbook and then use lapply, getSheets and readWorksheet, like this:

library(XLConnect)

wb.mtcars <- loadWorkbook(system.file("demoFiles/mtcars.xlsx",
                          package = "XLConnect"))
sheet_names <- getSheets(wb.mtcars)
names(sheet_names) <- sheet_names

sheet_list <- lapply(sheet_names, function(.sheet){
    readWorksheet(object=wb.mtcars, .sheet)})

#4


4  

Since this question is the number one search hit for "read multi sheet Excel to list", here is the openxlsx solution:

filename <- "myFilePath"

sheets <- openxlsx::getSheetNames(filename)
SheetList <- lapply(sheets, openxlsx::read.xlsx, xlsxFile = filename)
names(SheetList) <- sheets
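
To try this pattern without an existing file, the sketch below first writes a small two-sheet workbook to a temporary file and then reads every sheet back. The sheet names (cars, flowers) and the data are illustrative assumptions, not from the original answer. Note that openxlsx works with .xlsx files only, not the legacy .xls format.

```r
library(openxlsx)

# Assumption: a throwaway workbook with two illustrative sheets.
path <- tempfile(fileext = ".xlsx")
write.xlsx(list(cars = head(mtcars), flowers = head(iris)), file = path)

# Same pattern as above: list the sheets, read each one, name the result.
sheets <- getSheetNames(path)
sheet_list <- lapply(sheets, read.xlsx, xlsxFile = path)
names(sheet_list) <- sheets

str(sheet_list, max.level = 1)
```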

#5


3  

excel.link will do the job.

I actually found it easier to use compared to XLConnect (not that either package is that difficult to use). Learning curve for both was about 5 minutes.

As an aside, you can easily find all R packages that mention the word "Excel" by browsing to http://cran.r-project.org/web/packages/available_packages_by_name.html

#6


2  

I stumbled across this old question and I think the easiest approach is still missing.

You can use rio to import all excel sheets with just one line of code.

library(rio)
data_list <- import_list("test.xls")

If you're a fan of the tidyverse, you can easily import them as tibbles by adding the setclass argument to the function call.

data_list <- import_list("test.xls", setclass = "tbl")

#7


2  

From the official readxl (tidyverse) documentation (only the first line is changed):

library(readxl)
library(purrr)  # provides map(), set_names() and %>%

path <- "data/datasets.xlsx"

path %>% 
  excel_sheets() %>% 
  set_names() %>% 
  map(read_excel, path = path)

Details at: http://readxl.tidyverse.org/articles/articles/readxl-workflows.html#iterate-over-multiple-worksheets-in-a-workbook

#8


1  

I tried the approaches above, but they could not handle the amount of data in the 20MB Excel file I needed to convert, so they did not work for me.

After more research I came across openxlsx, which finally did the trick, and fast (see the related question "Importing a big xlsx file into R?").

https://cran.r-project.org/web/packages/openxlsx/openxlsx.pdf

#9


0  

To read multiple sheets from a workbook, use the readxl package as follows:

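library(readxl)
library(dplyr)
library(purrr)  # map() and set_names()

path_to_workbook <- "dir/of/the/data/workbook"

final_dataFrame <- path_to_workbook %>%
  excel_sheets() %>%
  set_names() %>%
  map(read_excel, path = path_to_workbook) %>%
  bind_rows()

Here, bind_rows (dplyr) puts all data rows from all sheets into one data frame, and path_to_workbook is "dir/of/the/data/workbook". Note that this returns a single data frame rather than the named list asked for in the question.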
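
When sheets are stacked this way, the information about which sheet each row came from is lost unless it is kept explicitly. As a minimal sketch, assuming a named list of data frames like the one the map() call produces (the list below is made-up illustrative data, so no workbook is needed), the .id argument of dplyr's bind_rows() stores the list names in a new column:

```r
library(dplyr)

# Assumption: stand-in for the named list returned by map(read_excel, ...).
sheet_list <- list(
  first  = data.frame(id = 1:2, value = c("a", "b")),
  second = data.frame(id = 3L,  value = "c")
)

# .id = "sheet" adds a column holding the name of the originating element.
combined <- bind_rows(sheet_list, .id = "sheet")
combined
```

This only works cleanly when the sheets have compatible column types; columns missing from a sheet are filled with NA.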
