Combining rows within a data frame [duplicate]

This question already has an answer here:

How to sum a variable by group? 10 answers

I have a relational dataset, where I'm looking for dyadic information.

I have 4 columns: sender, receiver, attribute, and edge.

I'm looking to collapse the repeated sender-receiver pairs into a single row each, adding up their edge values so that the duplicates count as additional edges.

df <- data.frame(sender = c(1,1,1,1,3,5), receiver = c(1,2,2,2,4,5), 
                attribute = c(12,12,12,12,13,13), edge = c(0,1,1,1,1,0))

   sender receiver attribute edge
1       1        1        12    0
2       1        2        12    1
3       1        2        12    1
4       1        2        12    1
5       3        4        13    1
6       5        5        13    0

I want the end result to look like this:

  sender receiver attribute edge
1      1        1        12    0
2      1        2        12    3
3      3        4        13    1

Here the rows for duplicate sender-receiver pairs have been combined, and the number of duplicates is reflected in the edge count.

Any input would be really appreciated.

Thanks!

2 Solutions

#1

plyr is your friend - although I think your end result is not quite correct given the input data.

library(plyr)

ddply(df, .(sender, receiver, attribute), summarize, edge = sum(edge))

Returns

  sender receiver attribute edge
1      1        1        12    0
2      1        2        12    3
3      3        4        13    1
4      5        5        13    0
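
As a side note, plyr has since been retired in favour of dplyr, which this answer does not mention. Below is a minimal sketch of the same grouped sum, assuming dplyr 1.0 or later is installed (the `.groups` argument needs it) and using the `df` from the question. The name `res` is only illustrative.

```r
library(dplyr)

# Data frame from the question
df <- data.frame(sender = c(1,1,1,1,3,5), receiver = c(1,2,2,2,4,5),
                 attribute = c(12,12,12,12,13,13), edge = c(0,1,1,1,1,0))

# One row per (sender, receiver, attribute) combination, with edge summed
res <- df %>%
  group_by(sender, receiver, attribute) %>%
  summarise(edge = sum(edge), .groups = "drop")

res
```

This should give the same four groups as the ddply() call above, returned as a tibble rather than a plain data frame.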

#2

For fun, here are two other options, the first using the base function aggregate() and the second using the data.table package:

> aggregate(edge ~ sender + receiver + attribute, FUN = "sum", data = df)
  sender receiver attribute edge
1      1        1        12    0
2      1        2        12    3
3      3        4        13    1
4      5        5        13    0
> require(data.table)
> dt <- data.table(df)
> dt[, list(sumedge = sum(edge)), by = "sender, receiver, attribute"]
     sender receiver attribute sumedge
[1,]      1        1        12       0
[2,]      1        2        12       3
[3,]      3        4        13       1
[4,]      5        5        13       0
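
The data.table call above passes the grouping columns as a single comma-separated string. More recent data.table code tends to use the `.()` shorthand (an alias for `list()`) in both `j` and `by`. A sketch of that spelling, assuming data.table is installed; `res` is just an illustrative name:

```r
library(data.table)

# Data frame from the question
df <- data.frame(sender = c(1,1,1,1,3,5), receiver = c(1,2,2,2,4,5),
                 attribute = c(12,12,12,12,13,13), edge = c(0,1,1,1,1,0))

# Convert to a data.table without modifying df
dt <- as.data.table(df)

# Sum edge within each (sender, receiver, attribute) group
res <- dt[, .(sumedge = sum(edge)), by = .(sender, receiver, attribute)]

res
```

The grouping is the same as in the string form, so the summed values should match the output shown above. Only the printed row labels differ between data.table versions.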

For the record, this question has been asked many, many times; perusing my own answers yields several that would point you down the right path.
