This question already has an answer here:
- How to sum a variable by group? 10 answers
I have a relational dataset in which I'm looking for dyadic information.
I have 4 columns: sender, receiver, attribute, edge.
I want to collapse the repeated sender-receiver pairs and convert their counts into additional edges.
df <- data.frame(sender = c(1,1,1,1,3,5), receiver = c(1,2,2,2,4,5),
attribute = c(12,12,12,12,13,13), edge = c(0,1,1,1,1,0))
  sender receiver attribute edge
1      1        1        12    0
2      1        2        12    1
3      1        2        12    1
4      1        2        12    1
5      3        4        13    1
6      5        5        13    0
I want the end result to look like this:
  sender receiver attribute edge
1      1        1        12    0
2      1        2        12    3
3      3        4        13    1
Here the duplicate sender-receiver relationships have been combined, and the number of duplicates is reflected in the number of edges.
Any input would be really appreciated.
Thanks!
2 solutions
#1
6
plyr is your friend, although I think your end result is not quite correct given the input data: the input also contains a sender 5, receiver 5 row.
library(plyr)
ddply(df, .(sender, receiver, attribute), summarize, edge = sum(edge))
Returns
  sender receiver attribute edge
1      1        1        12    0
2      1        2        12    3
3      3        4        13    1
4      5        5        13    0
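To make the remark about the expected result concrete, here is a minimal base R sketch that compares the grouped sums with the table the question asked for. The names `summed`, `wanted`, `key`, and `extra` are made up for this illustration, and `wanted` is simply the question's desired output typed in by hand.

```r
# Input data exactly as defined in the question.
df <- data.frame(sender = c(1,1,1,1,3,5), receiver = c(1,2,2,2,4,5),
                 attribute = c(12,12,12,12,13,13), edge = c(0,1,1,1,1,0))

# Sum edge within each sender/receiver/attribute group (base R, no packages).
summed <- aggregate(edge ~ sender + receiver + attribute, data = df, FUN = sum)

# The result the question asked for, typed in by hand (hypothetical name).
wanted <- data.frame(sender = c(1,1,3), receiver = c(1,2,4),
                     attribute = c(12,12,13), edge = c(0,3,1))

# Build a text key per row, then keep the summary rows that `wanted` lacks.
key <- function(d) paste(d$sender, d$receiver, d$attribute, d$edge)
extra <- summed[!key(summed) %in% key(wanted), ]
print(extra)
```

The only row left in `extra` should be the sender 5, receiver 5 group, which is the row missing from the expected result in the question.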
#2
20
For fun, here are two other options: the first uses the base function aggregate() and the second uses the data.table package:
> aggregate(edge ~ sender + receiver + attribute, FUN = "sum", data = df)
  sender receiver attribute edge
1      1        1        12    0
2      1        2        12    3
3      3        4        13    1
4      5        5        13    0
> library(data.table)
> dt <- as.data.table(df)
> dt[, list(sumedge = sum(edge)), by = list(sender, receiver, attribute)]
     sender receiver attribute sumedge
[1,]      1        1        12       0
[2,]      1        2        12       3
[3,]      3        4        13       1
[4,]      5        5        13       0
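One point the question leaves open is whether "number of duplicates" means the sum of the edge column or the number of repeated rows. The two agree for the 1-2 dyad here but not in general, so the following base R sketch computes both side by side. The column name `n_rows` and the object names are assumptions introduced for this example only.

```r
# Input data exactly as defined in the question.
df <- data.frame(sender = c(1,1,1,1,3,5), receiver = c(1,2,2,2,4,5),
                 attribute = c(12,12,12,12,13,13), edge = c(0,1,1,1,1,0))

# Sum of the edge column per group (what the answers compute).
edge_sum <- aggregate(edge ~ sender + receiver + attribute, data = df, FUN = sum)

# Number of rows per group, regardless of the edge value.
row_count <- aggregate(edge ~ sender + receiver + attribute, data = df, FUN = length)
names(row_count)[names(row_count) == "edge"] <- "n_rows"  # hypothetical column name

# Join the two summaries on the shared grouping columns.
both <- merge(edge_sum, row_count)
print(both)
```

For the self-loops (1-1 and 5-5) the summed edge is 0 while the row count is 1, so which summary is appropriate depends on whether a row with edge 0 should count as a tie.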
For the record, this question has been asked many, many times; perusing my own answers yields several that would point you down the right path.