How to aggregate some columns while keeping other columns in R?

Time: 2022-01-05 23:01:04

I have a data frame like this:

     id  no  age
1    1   7   23
2    1   2   23
3    2   1   25
4    2   4   25
5    3   6   23
6    3   1   23

I would like to aggregate the data frame by id into the form below: sum the no values that share the same id, but keep the age column.

    id  no  age
1    1   9   23
2    2   5   25
3    3   7   23

How to achieve this using R?

3 solutions

#1


14  

Assuming that your data frame is named df.

aggregate(no~id+age, df, sum)
#   id age no
# 1  1  23  9
# 2  3  23  7
# 3  2  25  5
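
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the data frame from the question and reorders the result to match the layout asked for. The reordering step and the name res are my own additions, not part of the answer above.

```r
# Rebuild the example data frame from the question
df <- data.frame(
  id  = c(1, 1, 2, 2, 3, 3),
  no  = c(7, 2, 1, 4, 6, 1),
  age = c(23, 23, 25, 25, 23, 23)
)

# Sum `no` within each (id, age) combination
res <- aggregate(no ~ id + age, data = df, FUN = sum)

# aggregate() puts the grouping columns first and sorts by them,
# so restore the row and column order used in the question
res <- res[order(res$id), c("id", "no", "age")]
rownames(res) <- NULL
res
#   id no age
# 1  1  9  23
# 2  2  5  25
# 3  3  7  23
```

Putting age on the right-hand side of the formula works here because age is constant within each id; if it were not, each id would produce one row per distinct age.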

#2


5  

Even better, data.table:

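library(data.table)
# assumes the data frame is named DF;
# setDT() converts it to a data.table by reference to unlock data.table syntax
setDT(DF)
# unique(age) returns a single value per id only if age is constant within each id
DF[, .(sum_no = sum(no), unq_age = unique(age)), by = id]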
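
As a variation on the same idea, the sketch below groups by both id and age instead of calling unique(age), which keeps the original column names. It assumes the data.table package is installed; the names dt and res are only for illustration.

```r
library(data.table)

# Example data from the question, created directly as a data.table
dt <- data.table(
  id  = c(1, 1, 2, 2, 3, 3),
  no  = c(7, 2, 1, 4, 6, 1),
  age = c(23, 23, 25, 25, 23, 23)
)

# Group by both columns so that age is carried along as a grouping key
res <- dt[, .(no = sum(no)), by = .(id, age)]

# Grouping columns come first in the result; move them into the requested order
setcolorder(res, c("id", "no", "age"))
res
```

Groups are returned in order of first appearance, so the rows come out as id 1, 2, 3 for this data.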

#3


2  

Alternatively, you could use ddply from the plyr package:

library(plyr)
# split df by (id, age) and sum no within each group
ddply(df, .(id, age), summarise, no = sum(no))

In this particular example the results are identical. However, this is not always the case; the difference between the two functions is outlined here. Both functions have their uses and are worth exploring, which is why I felt this alternative should be mentioned.
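
One difference I am sure of, though it may not be the only one the linked comparison covers, is missing-value handling. The formula interface of aggregate drops rows containing NA before summing (its default is na.action = na.omit), whereas ddply hands each group's rows to sum unchanged, so a group containing an NA sums to NA. The sketch below shows both behaviours using base R only; df_na is a made-up variant of the question's data.

```r
# Hypothetical variant of the example data with one missing `no` value
df_na <- data.frame(
  id  = c(1, 1, 2, 2),
  no  = c(7, NA, 1, 4),
  age = c(23, 23, 25, 25)
)

# Default: the row with NA is dropped before grouping, so id 1 sums to 7
dropped <- aggregate(no ~ id + age, data = df_na, FUN = sum)
dropped$no
# [1] 7 5

# Keeping NA rows gives the result ddply would give: id 1 sums to NA
kept <- aggregate(no ~ id + age, data = df_na, FUN = sum, na.action = na.pass)
kept$no
# [1] NA  5
```

If NA should be ignored in either approach, sum(no, na.rm = TRUE) makes that explicit.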
