Mapping over the rows of a data frame

Time: 2022-06-12 17:03:05

Suppose I have a data frame with columns c1, ..., cn, and a function f that takes in the columns of this data frame as arguments. How can I apply f to each row of the data frame to get a new data frame?

For example,

x = data.frame(letter=c('a','b','c'), number=c(1,2,3))
# x is
# letter | number
#      a | 1
#      b | 2
#      c | 3

f = function(letter, number) { paste(letter, number, sep='') }

# desired output is
# a1
# b2
# c3

How do I do this? I'm guessing it's something along the lines of {s,l,t}apply(x, f), but I can't figure it out.

3 Solutions

#1


11  

As @greg points out, paste() can do this. I suspect your example is a simplification of a more general problem. After struggling with this in the past, as illustrated in this previous question, I ended up using the plyr package for this type of thing. plyr does a LOT more, but for row-wise jobs like this it's easy:

> require(plyr)
> adply(x, 1, function(x) f(x$letter, x$number))
  X1 V1
1  1 a1
2  2 b2
3  3 c3

You'll want to rename the output columns, I'm sure.

While I was typing this, @joshua showed an alternative method using ddply. The difference in my example is that adply treats the input data frame as an array, so it does not need the "group by" variable row that @joshua created. His approach is exactly how I was doing it until Hadley tipped me to the adply() approach in the aforementioned question.

#2


7  

paste(x$letter, x$number, sep = "")
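
To get a data frame back rather than a bare vector, the result can be wrapped in data.frame(). Here is a minimal sketch in base R, assuming f is the function from the question; the output column name `value` and the `stringsAsFactors = FALSE` setting are my own choices for illustration. The second form is for an f that is not vectorized, and it assumes the column names match f's argument names.

```r
x <- data.frame(letter = c("a", "b", "c"),
                number = c(1, 2, 3),
                stringsAsFactors = FALSE)
f <- function(letter, number) { paste(letter, number, sep = "") }

# f is vectorized here, so it can take whole columns at once.
result <- data.frame(value = f(x$letter, x$number),
                     stringsAsFactors = FALSE)

# For an f that only handles one row at a time, mapply() calls it once
# per row. The columns are passed by name, so they must match f's arguments.
per_row <- do.call(mapply, c(list(FUN = f), x, list(USE.NAMES = FALSE)))
result2 <- data.frame(value = per_row, stringsAsFactors = FALSE)
```

Both `result` and `result2` hold the a1, b2, c3 values from the question in a single column.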

#3


1  

I think you were thinking of something like this, but note that the apply family of functions does not return data.frames. apply() will also coerce your data.frame to a matrix before applying the function.

apply(x,1,function(x) paste(x,collapse=""))
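
The matrix coercion matters once the numbers have different widths. A small sketch, using a made-up data frame `x2` (not from the question) with `stringsAsFactors = FALSE` as an assumption: because one column is character, as.matrix() produces a character matrix and runs the numeric column through format(), so the row-wise result need not match a plain paste() over the columns.

```r
x2 <- data.frame(letter = c("a", "b", "c"),
                 number = c(1, 20, 300),
                 stringsAsFactors = FALSE)

# This is the matrix that apply() actually iterates over:
# every cell is a string, and the numbers were formatted as a column.
m <- as.matrix(x2)

via_apply <- apply(x2, 1, function(row) paste(row, collapse = ""))
via_paste <- paste(x2$letter, x2$number, sep = "")

# Compare the two; the apply() version carries the formatting of m.
print(via_apply)
print(via_paste)
```

If f needs the original column types, working on the columns directly (or row by row on the data frame itself) avoids the conversion.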

So you may be more interested in ddply from the plyr package.

> x$row <- 1:NROW(x)
> ddply(x, "row", function(df) paste(df[[1]],df[[2]],sep=""))
  row V1
1   1 a1
2   2 b2
3   3 c3
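
For comparison, the same "one group per row" idea can be sketched without plyr. This is a base-R stand-in, not what ddply does internally; the output column name `value` and `stringsAsFactors = FALSE` are assumptions for illustration.

```r
x <- data.frame(letter = c("a", "b", "c"),
                number = c(1, 2, 3),
                stringsAsFactors = FALSE)

# Split into a list of one-row data frames, keyed by row index.
rows <- split(x, seq_len(nrow(x)))

# Apply the function to each one-row data frame, returning a data frame.
pieces <- lapply(rows, function(df) {
  data.frame(value = paste(df[[1]], df[[2]], sep = ""),
             stringsAsFactors = FALSE)
})

# Stack the pieces back into a single data frame.
out <- do.call(rbind, pieces)
```

Because each piece is still a data frame, the columns keep their types inside the function, unlike with apply().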
