Suppose I have a data frame with columns c1, ..., cn, and a function f that takes in the columns of this data frame as arguments. How can I apply f to each row of the data frame to get a new data frame?
For example,
x = data.frame(letter=c('a','b','c'), number=c(1,2,3))
# x is
# letter | number
# a | 1
# b | 2
# c | 3
f = function(letter, number) { paste(letter, number, sep='') }
# desired output is
# a1
# b2
# c3
How do I do this? I'm guessing it's something along the lines of {s,l,t}apply(x, f), but I can't figure it out.
3 solutions
#1
11
As @greg points out, paste() can do this. I suspect your example is a simplification of a more general problem. After struggling with this in the past, as illustrated in a previous question, I ended up using the plyr package for this type of thing. plyr does a LOT more, but for tasks like this it's easy:
> require(plyr)
> adply(x, 1, function(x) f(x$letter, x$number))
X1 V1
1 1 a1
2 2 b2
3 3 c3
You'll probably want to rename the output columns.
While I was typing this, @joshua showed an alternative method using ddply. The difference in my example is that adply treats the input data frame as an array, so it does not need the "group by" variable row that @joshua created. His approach is exactly how I was doing it until Hadley tipped me off to the adply() approach in the aforementioned question.
#2
7
paste(x$letter, x$number, sep = "")
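This works because paste() is vectorized over its arguments. If f were not vectorized, a base R sketch along the following lines could call it once per row, passing each column as a named argument. The stringsAsFactors = FALSE setting and the output column name result are assumptions made for illustration; they do not come from the answers above.

```r
# Example data from the question; stringsAsFactors = FALSE keeps `letter` as character
x <- data.frame(letter = c("a", "b", "c"), number = c(1, 2, 3),
                stringsAsFactors = FALSE)
f <- function(letter, number) paste(letter, number, sep = "")

# mapply() calls f once per row; each column of x is matched to the
# parameter of f that has the same name
res <- do.call(mapply, c(list(FUN = f, USE.NAMES = FALSE), x))

# Wrap the result in a data frame; the column name `result` is an arbitrary choice
out <- data.frame(result = res, stringsAsFactors = FALSE)
```

Because the columns are matched by name, this only works as written when the column names of x are the same as the parameter names of f.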
#3
1
I think you were thinking of something like this, but note that the apply family of functions does not return data frames. apply() will also coerce your data.frame to a matrix before applying the function.
apply(x,1,function(x) paste(x,collapse=""))
So you may be more interested in ddply from the plyr package.
> require(plyr)
> x$row <- 1:NROW(x)
> ddply(x, "row", function(df) paste(df[[1]],df[[2]],sep=""))
row V1
1 1 a1
2 2 b2
3 3 c3
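The matrix coercion mentioned in this answer can be checked directly in base R. The sketch below rebuilds the example data (stringsAsFactors = FALSE is an assumption for illustration) and shows that every value reaches the function as a character string. One known side effect is that numeric columns are converted with format(), so numbers of different widths may be padded with spaces.

```r
x <- data.frame(letter = c("a", "b", "c"), number = c(1, 2, 3),
                stringsAsFactors = FALSE)

# apply() first converts the data frame with as.matrix(); because one
# column is character, the whole matrix becomes character
m <- as.matrix(x)
is.character(m)  # TRUE

# Each row therefore arrives as a character vector, so any arithmetic on
# `number` inside the function would need an explicit as.numeric()
res <- apply(x, 1, function(row) paste(row, collapse = ""))
```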