What is the preferred way to send all columns within a current group to a function as a tibble or data.frame when calling an arbitrary function in a dplyr pipe?
In the example below, mean_B is a simple case where I know which column is needed before I make the function call. mean_B_fun gives the wrong answer (I want the within-group mean, but because . refers to the whole data frame rather than the current group, it returns the overall mean), and mean_B_fun_ugly gives what I want, but it seems like an inefficient (and ugly) way to get that effect.
The reason I want to operate on arbitrary columns is that, in practice, my_fun in the example below is supplied by the user, and I don't know a priori which columns the user will need to operate on.
library(dplyr)

my_fun <- function(x) mean(x$B)

my_data <-
  expand.grid(A=1:3, B=1:2) %>%
  mutate(B=A*B) %>%
  group_by(A) %>%
  mutate(mean_B=mean(B),
         mean_B_fun=my_fun(.),
         mean_B_fun_ugly=my_fun(as.data.frame(.)[.$A == unique(A),,drop=FALSE]))
1 solution
#1
Here's my answer, which does not assume you know in advance which columns to average: nest each group into its own data frame, then apply the function to each nested data frame.
library(dplyr)
library(tidyr)
library(purrr)
expand.grid(A=1:3, B=1:2) %>%
  mutate(B=A*B) %>% nest(data = -A) %>%
  mutate(means = map(data, colMeans))
  A data means
1 1 1, 2   1.5
2 2 2, 4   3
3 3 3, 6   4.5
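The same nesting idea can be applied to the my_fun from the question, so that each call receives one group's rows as a data frame. This is only a sketch under a few assumptions: my_fun returns a single number per group (which is why map_dbl is used), the user function does not need the grouping column A (nest(data = -A) leaves it out of the nested data frames), and the name result is made up for illustration.

```r
library(dplyr)
library(tidyr)
library(purrr)

# User-supplied function from the question: takes a data frame, returns one number.
my_fun <- function(x) mean(x$B)

result <-
  expand.grid(A = 1:3, B = 1:2) %>%
  mutate(B = A * B) %>%
  # One row per value of A; the remaining columns go into the list-column `data`.
  nest(data = -A) %>%
  # Call my_fun once per group, on that group's data frame only.
  mutate(mean_B_fun = map_dbl(data, my_fun)) %>%
  # Expand back to one row per original observation.
  unnest(data)
```

After unnest, every row carries the mean of B for its own group, which is the effect mean_B_fun_ugly was producing in the question.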