I have the following lists:
group1<-c("A", "B", "D")
group2<-c("C", "E")
group3<-c("F")
and a dataframe with values and corresponding names:
df <- data.frame (name=c("A","B","C","D","E","F"),value=c(1,2,3,4,5,6))
df
name value
1 A 1
2 B 2
3 C 3
4 D 4
5 E 5
6 F 6
I'd like to group the data based on the lists, using the name column;
df
name value group
1 A 1 group1
2 B 2 group1
3 C 3 group2
4 D 4 group1
5 E 5 group2
6 F 6 group3
and sum the values for each group.
df
group sum
1 group1 7
2 group2 8
3 group3 6
I've searched for similar posts, but couldn't adapt them to my problem.
3 solutions
#1
Here's an approach. First, use ifelse to assign a group to each name, then use aggregate to get the sum for each group.
> df$group <- with(df, ifelse(name %in% group1, "group1",
ifelse(name %in% group2, "group2", "group3" )))
> aggregate(value ~ group, sum, data=df)
group value
1 group1 7
2 group2 8
3 group3 6
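The nested ifelse above gets unwieldy as the number of groups grows, and it silently labels any unlisted name as "group3". A minimal sketch of one alternative, assuming the group vectors are first collected into a named list: build a named lookup vector and index it by name. The names groups, lookup and result are made up for this illustration and are not part of the original answer.

```r
# Sample data from the question.
group1 <- c("A", "B", "D")
group2 <- c("C", "E")
group3 <- c("F")
df <- data.frame(name = c("A", "B", "C", "D", "E", "F"),
                 value = c(1, 2, 3, 4, 5, 6))

# Hypothetical helper objects (not in the original answer).
groups <- list(group1 = group1, group2 = group2, group3 = group3)

# Named character vector: names are the member names, values are the group labels.
lookup <- setNames(rep(names(groups), lengths(groups)),
                   unlist(groups, use.names = FALSE))

# Index by name; a name that belongs to no group becomes NA rather than "group3".
df$group <- unname(lookup[as.character(df$name)])

# Sum the values within each group.
result <- aggregate(value ~ group, data = df, FUN = sum)
print(result)
```

With this layout, adding a fourth group only means adding one element to the list; the lookup and the aggregate call stay unchanged.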
#2
I would suggest storing your grouping as a data.frame, something along these lines:
# groupno follows the lists in the question: A, B, D -> 1; C, E -> 2; F -> 3
grouping <- data.frame(name=c("A","B","C","D","E","F"),groupno=c(1,1,2,1,2,3))
df2 <- merge(df,grouping, by = 'name')
aggregate(value ~ groupno, sum, data=df2)
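If the groups already exist as separate vectors, the grouping data.frame does not have to be typed by hand. A rough sketch, assuming the vectors are put into a named list: stack() from base R turns such a list into a two-column data frame that can be merged as above. The column names name and group and the object sums are chosen here for illustration only.

```r
# Sample data from the question.
group1 <- c("A", "B", "D")
group2 <- c("C", "E")
group3 <- c("F")
df <- data.frame(name = c("A", "B", "C", "D", "E", "F"),
                 value = c(1, 2, 3, 4, 5, 6))

# stack() returns a data frame with columns `values` (members) and `ind` (list names).
grouping <- stack(list(group1 = group1, group2 = group2, group3 = group3))
names(grouping) <- c("name", "group")

# Attach the group label to every row, then sum the values per group.
df2 <- merge(df, grouping, by = "name")
sums <- aggregate(value ~ group, data = df2, FUN = sum)
print(sums)
```

Note that merge() keeps only the names present in both tables by default, so a name missing from every group would be dropped from df2 rather than flagged.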
#3
Another idea:
df$X <- factor(df$name)
levels(df$X) <- list(group1 = group1, group2 = group2, group3 = group3)
aggregate(df$value, list(group = df$X), sum)
# group x
#1 group1 7
#2 group2 8
#3 group3 6
EDIT
As noted by @thelatemail in the comments below, you can use mget to collect into a list all the objects in your workspace whose names contain "group" followed by digits, like this:
mget(ls(pattern="group\\d+"))
If, however, you have also loaded, say, a function called "group4", ls() will pick it up as well. One way to avoid this is to use something like:
.ls <- ls(pattern="group\\d+")
mget(.ls[!.ls %in% apropos("group", mode = "function")]) #`mget` only non-functions.
#You can, of course, avoid any
#other `mode`, besides "function".
The list returned by mget can then be assigned to levels(df$X).
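Putting the pieces of this answer together, here is a hedged end-to-end sketch. It uses a scratch environment (env, an assumption made so the example does not depend on whatever is in your workspace) and filters with is.character instead of apropos; that is a swapped-in way of dropping non-vector objects, not the method shown above. The pattern is also anchored with ^ and $ so that only names of the exact form group<digits> match.

```r
# A scratch environment stands in for the workspace (assumption for this sketch).
env <- new.env()
assign("group1", c("A", "B", "D"), envir = env)
assign("group2", c("C", "E"), envir = env)
assign("group3", c("F"), envir = env)
assign("group4", function() NULL, envir = env)  # a function that should be ignored

df <- data.frame(name = c("A", "B", "C", "D", "E", "F"),
                 value = c(1, 2, 3, 4, 5, 6))

# Fetch every object whose name is exactly "group" followed by digits.
candidates <- mget(ls(envir = env, pattern = "^group\\d+$"), envir = env)

# Keep only character vectors; this drops the function group4.
groups <- Filter(is.character, candidates)

# Use the named list to merge the factor levels into groups, then sum per group.
df$X <- factor(df$name)
levels(df$X) <- groups
totals <- aggregate(df$value, list(group = df$X), sum)
print(totals)
```

In an interactive session you would drop the envir arguments and let ls() and mget() work on the global environment, as in the snippets above.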