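R has a rich set of data structures, but when the data are complex you usually end up using a list. A list is hard to export as text, so you need a quick way to flatten a complex list into a data frame. The function introduced here for that job is do.call.
Here is the official documentation for do.call: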
do.call {base}    R Documentation
Execute a Function Call
Description
do.call constructs and executes a function call from a name or a function and a list of arguments to be passed to it.
Usage
do.call(what, args, quote = FALSE, envir = parent.frame())
Arguments
what: either a function or a non-empty character string naming the function to be called.
args: a list of arguments to the function call. The names attribute of args gives the argument names.
quote: a logical value indicating whether to quote the arguments.
envir: an environment within which to evaluate the call. This will be most useful if what is a character string and the arguments are symbols or quoted expressions.
Details
If quote is FALSE, the default, then the arguments are evaluated (in the calling environment, not in envir). If quote is TRUE then each argument is quoted (see quote) so that the effect of argument evaluation is to remove the quotes – leaving the original arguments unevaluated when the call is constructed.
The behavior of some functions, such as substitute, will not be the same for functions evaluated using do.call as if they were evaluated from the interpreter. The precise semantics are currently undefined and subject to change.
Value
The result of the (evaluated) function call.
Warning
This should not be used to attempt to evade restrictions on the use of .Internal and other non-API calls.
References
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Put simply, do.call calls a function whose arguments are stored in a list: each element of the list is passed to the function as one argument.
An example:
> tmp <- data.frame('letter' = letters[1:10], 'number' = 1:10, 'value' = c('+','-'))
> tmp
   letter number value
1       a      1     +
2       b      2     -
3       c      3     +
4       d      4     -
5       e      5     +
6       f      6     -
7       g      7     +
8       h      8     -
9       i      9     +
10      j     10     -
> tmp[[1]]
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
> tmp[[2]]
[1] 1 2 3 4 5 6 7 8 9 10
> tmp[[3]]
[1] "+" "-" "+" "-" "+" "-" "+" "-" "+" "-"
> do.call("paste", c(tmp, sep = ""))
[1] "a1+" "b2-" "c3+" "d4-" "e5+" "f6-" "g7+" "h8-" "i9+" "j10-"
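Here tmp was created with data.frame(), but underneath it is still a list; the [[ ]] operator is used above to show its three elements. As you can see, do.call passes the three elements of tmp (three vectors) to paste as its arguments. The same call can also be written like this: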
> paste(tmp[[1]],tmp[[2]],tmp[[3]], sep = "")
[1] "a1+" "b2-" "c3+" "d4-" "e5+" "f6-" "g7+" "h8-" "i9+" "j10-"
The two results are identical.
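To make the argument matching more explicit, here is a minimal sketch with made-up values (arg_list, small and combined_args are illustrative names, not from the example above). It shows that unnamed list elements are passed positionally, that a named element such as sep is passed as the argument of that name, and that c() applied to a data frame plus an extra item gives back a plain list.

```r
# Illustrative values only; none of these objects appear in the original example.
arg_list <- list("a", "b", sep = "-")

by_function <- do.call(paste, arg_list)    # `what` given as a function
by_string   <- do.call("paste", arg_list)  # `what` given as a character string
direct      <- paste("a", "b", sep = "-")  # the equivalent ordinary call

print(by_function)
print(identical(by_function, direct))
print(identical(by_string, direct))

# c() on a data frame plus a named item yields a plain named list,
# which is why c(tmp, sep = "") can serve as the argument list.
small <- data.frame(letter = letters[1:3], number = 1:3)
combined_args <- c(small, sep = "")
print(class(combined_args))
print(names(combined_args))
print(do.call(paste, combined_args))
```

The column names letter and number end up in paste's `...` and are ignored there, while sep is matched by name.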
Another example:
> number_add <- list(101:110, 1:10)
> number_add
[[1]]
[1] 101 102 103 104 105 106 107 108 109 110
[[2]]
[1] 1 2 3 4 5 6 7 8 9 10
> add <- function(x,y) {x + y}
> add
function(x,y) {x + y}
> do.call(add, number_add)
[1] 102 104 106 108 110 112 114 116 118 120
> add(number_add[[1]], number_add[[2]])
[1] 102 104 106 108 110 112 114 116 118 120
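Because addition gives the same result in either order, the add example cannot show how list elements are matched to parameters. The sketch below uses a hypothetical helper, subtract, to make the order visible; it assumes nothing beyond base R. Named list elements are matched to parameters by name, and a list with more elements than the function accepts raises an error.

```r
# `subtract` is a hypothetical helper; subtraction makes argument order visible.
subtract <- function(x, y) x - y

positional <- do.call(subtract, list(10, 1))          # x = 10, y = 1
by_name    <- do.call(subtract, list(y = 10, x = 1))  # x = 1,  y = 10
via_string <- do.call("subtract", list(10, 1))        # function looked up by name

print(positional)
print(by_name)
print(via_string)

# A list with more elements than the function has parameters is an error.
too_many <- tryCatch(
  do.call(subtract, list(1, 2, 3)),
  error = function(e) e
)
print(inherits(too_many, "error"))
```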
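Finally, back to the problem from the beginning. Suppose we have a list whose elements are data frames with the same format, and we want to combine them into a single data frame and write it out as a text file. We can do it like this: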
> list1
[[1]]
  up down number
1  A    a      1
2  B    b      2
3  C    c      3
4  D    d      4
5  E    e      5

[[2]]
  up down number
1  A    a      1
2  B    b      2
3  C    c      3
4  D    d      4
5  E    e      5

[[3]]
  up down number
1  A    a      1
2  B    b      2
3  C    c      3
4  D    d      4
5  E    e      5

> do.call("rbind",list1)
   up down number
1   A    a      1
2   B    b      2
3   C    c      3
4   D    d      4
5   E    e      5
6   A    a      1
7   B    b      2
8   C    c      3
9   D    d      4
10  E    e      5
11  A    a      1
12  B    b      2
13  C    c      3
14  D    d      4
15  E    e      5
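The transcript above does not show how list1 was built or how the result is written to disk. The following is a minimal sketch under stated assumptions: make_piece is a hypothetical helper that recreates one of the printed data frames, and the output goes to a temporary tab-separated file chosen only for illustration.

```r
# Hypothetical stand-in for list1: three data frames with identical columns.
make_piece <- function() {
  data.frame(up = LETTERS[1:5], down = letters[1:5], number = 1:5)
}
list1 <- list(make_piece(), make_piece(), make_piece())

# Equivalent to rbind(list1[[1]], list1[[2]], list1[[3]]).
combined <- do.call(rbind, list1)
print(dim(combined))

# Write the combined data frame as tab-separated text.
# out_file is a temporary path; replace it with a real file name as needed.
out_file <- tempfile(fileext = ".txt")
write.table(combined, file = out_file, sep = "\t",
            quote = FALSE, row.names = FALSE)

# Read the file back to confirm that it round-trips.
read_back <- read.delim(out_file)
print(nrow(read_back))
```

do.call is convenient here because the number of data frames in the list does not have to be known in advance.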
One more recommendation: the apply family of functions is also very practical, and interested readers can look up the relevant material.