How to import Excel data into R using column names and row names

Date: 2022-07-29 00:27:26

I am an R novice and was wondering how to import Excel data into R using row names and column names. Specifically, I require a subset of the data in a number of worksheets within one Excel file. Can I use row names and column names to identify and extract certain cells of data into R?

Worksheet 1
----------
* X Y Z 
A 1 2 2
B 1 1 1
C 1 3 4
D 4 2 2
E 2 2 2 
----------
Worksheet 2
----------
*  X Y1 Z1 
A 1  2  2
B 1  2  3
C 1  3  4
D 4  1  1
E 2  1  1 

For example, in the above spreadsheet, how could I extract the data (2,2,2,2) using the row and column names (D,Y) (D,Z) (E,Y) (E,Z) in worksheet 1?

How could I extract the data (1,1,1,1) using the row and column names (D,Y1) (D,Z1) (E,Y1) (E,Z1) in worksheet 2?

Thanks for any help provided

Barry

2 Answers

#1


8  

@Andrie mentioned the XLConnect package. It is a very useful package for I/O between R and Excel, and it can read a selected region of an Excel sheet.

I created an Excel file like yours in my Dropbox public folder; you can download the example.xls file here.

require(XLConnect)

## A5:C5 corresponds to (D,Y) (D,Z) (E,Y) (E,Z) in your example
selectworksheet1 <- readWorksheetFromFile("/home/ahmadou/Dropbox/Public/example.xls",
                                          sheet = "Worksheet1",
                                          region = "A5:C5", header = FALSE)

selectworksheet1
##  Col0 Col1 Col2
## 1    2    2    2


## B4:C5 corresponds to (D,Y1) (D,Z1) (E,Y1) (E,Z1) in the second example
selectworksheet2 <- readWorksheetFromFile("/home/ahmadou/Dropbox/Public/example.xls",
                                          sheet = "Worksheet2",
                                          region = "B4:C5", header = FALSE)

selectworksheet2
##   Col0 Col1
## 1    1    1
## 2    1    1

unlist(selectworksheet2)
## Col01 Col02 Col11 Col12 
##    1     1     1     1 
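
The region argument above still addresses cells by position. If selecting by name is the goal, one option is to read the whole sheet into a data frame and subset it by row and column names in R. The sketch below is only an illustration: it rebuilds Worksheet 1 by hand (the `raw` data frame and its `label` column are assumptions standing in for whatever a full-sheet read returns) so that it runs without an Excel file.

```r
# Stand-in for a full-sheet read: first column holds the row labels (assumed layout).
raw <- data.frame(
  label = c("A", "B", "C", "D", "E"),
  X = c(1, 1, 1, 4, 2),
  Y = c(2, 1, 3, 2, 2),
  Z = c(2, 1, 4, 2, 2),
  stringsAsFactors = FALSE
)

# Promote the label column to row names and drop it from the data.
sheet1 <- raw[, -1]
rownames(sheet1) <- raw$label

# Select cells by row and column names rather than by position.
block <- sheet1[c("D", "E"), c("Y", "Z")]
print(block)

# Flatten row by row, i.e. in the order (D,Y) (D,Z) (E,Y) (E,Z).
values <- as.vector(t(as.matrix(block)))
print(values)
```

With a real workbook, `raw` would come from reading the full sheet (for instance with readWorksheetFromFile and no region), after which the same subsetting applies.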

#2


2  

There are several packages which provide functions to import Excel data to R; see the R data import/export documentation.

I've found the xlsx package to be useful (it will read both .xls and .xlsx files). I don't believe that it will accept row/column names as input, but it will accept their numerical value (row 1, column 4 for example). In your case, something like this should work, assuming that X, Y and Z correspond to columns 1-3:

library(xlsx)
# first example subset; call it ss1
# read.xlsx defaults to header = TRUE, which would turn the first selected row
# into column names; header = FALSE keeps both rows as data
ss1 <- read.xlsx("myfile.xlsx", sheetIndex = 1, rowIndex = 4:5, colIndex = 2:3,
                 header = FALSE)

# second example subset; call it ss2
# just the same except worksheet index = 2
ss2 <- read.xlsx("myfile.xlsx", sheetIndex = 2, rowIndex = 4:5, colIndex = 2:3,
                 header = FALSE)

However, you will need to experiment with your own file until things work as expected. You can also specify sheetName but I find sheetIndex normally works more reliably, once you figure out the correct index for each sheet. And take care if the first row is a header.
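
Since the indices have to be numeric, one way to avoid counting cells by hand is to look the names up with match(). The following is a rough sketch, not part of the xlsx package: `label_to_index` is a hypothetical helper, and the `offset` of 1 assumes the sheet has a header in row 1 and row labels in column 1. With `offset = 0` it reproduces the 4:5 and 2:3 used above.

```r
# Labels as they appear in the sheet (typed by hand here for illustration).
row_labels <- c("A", "B", "C", "D", "E")
col_labels <- c("X", "Y", "Z")

# Hypothetical helper: convert labels into 1-based sheet positions.
# offset accounts for a header row or a label column preceding the data.
label_to_index <- function(wanted, labels, offset = 1L) {
  pos <- match(wanted, labels)
  if (anyNA(pos)) {
    stop("Unknown label(s): ", paste(wanted[is.na(pos)], collapse = ", "))
  }
  pos + offset
}

row_index <- label_to_index(c("D", "E"), row_labels)
col_index <- label_to_index(c("Y", "Z"), col_labels)
print(row_index)
print(col_index)
```

The resulting vectors could then be passed as rowIndex and colIndex; whether the offset is right still has to be checked against the actual file.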

Having said all that: my preferred option would be to export the sheet to a text format such as CSV, use shell tools (cut, head, tail etc.) to get the required rows/columns and import that to R.
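
As a variation on the CSV route, the subsetting can also be done in R itself instead of with shell tools, which allows selection by name. This is a minimal sketch under assumptions: the CSV text is typed inline to mimic an export of Worksheet 2, and the header cell for the label column is called `label` here purely for illustration.

```r
# Inline CSV standing in for an exported Worksheet 2 (assumed export layout).
csv_text <- "label,X,Y1,Z1
A,1,2,2
B,1,2,3
C,1,3,4
D,4,1,1
E,2,1,1"

# row.names = 1 uses the first column as row names.
sheet2 <- read.csv(text = csv_text, row.names = 1)

# Pick (D,Y1) (D,Z1) (E,Y1) (E,Z1) by name.
picked <- sheet2[c("D", "E"), c("Y1", "Z1")]
print(picked)
```

For a file on disk, the `text` argument would be replaced by the path to the exported CSV.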
