I have a very large data set that I am trying to make smaller. For the purpose of this question I will simplify it by focusing on only a few of the variables. I have samples taken from many subjects once every 5 minutes for three hours, and I would like to add together every two consecutive time segments, so that the data uses 10-minute intervals instead of 5-minute intervals.
Data:
ID Time Measurement
A1 5 2
A1 10 3
A1 15 2
A1 20 4
A2 5 0
A2 10 3
A2 15 3
A2 20 0
I would like to turn this into:
ID Time Measurement
A1 10 5
A1 20 6
A2 10 3
A2 20 3
How would I make this happen in R?
1 solution
Maybe you can use findInterval and aggregate in some way, something like the following, perhaps:
mydf$newTime <- findInterval(mydf$Time, seq(1, 180, 10)) * 10
## Or, as suggested by G. Grothendieck
mydf$newTime <- 10 * ((mydf$Time - 5) %/% 10) + 10
"mydf" now looks like this:
mydf
# ID Time Measurement newTime
# 1 A1 5 2 10
# 2 A1 10 3 10
# 3 A1 15 2 20
# 4 A1 20 4 20
# 5 A2 5 0 10
# 6 A2 10 3 10
# 7 A2 15 3 20
# 8 A2 20 0 20
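As a quick sanity check, the two binning expressions above can be compared over the whole three-hour range. This is only a sketch: the vector `times` and the names `bin_interval` and `bin_arith` are made up for illustration, and it assumes every sample falls exactly on the 5-minute grid (5, 10, ..., 180). For times off that grid (for example, anything below 5) the two expressions can disagree.

```r
# Sampling times assumed from the question: every 5 minutes for three hours.
times <- seq(5, 180, by = 5)

# Bin each time with both expressions from the answer.
bin_interval <- findInterval(times, seq(1, 180, 10)) * 10
bin_arith    <- 10 * ((times - 5) %/% 10) + 10

# TRUE when both approaches put every time in the same 10-minute bin.
all(bin_interval == bin_arith)

# Each 10-minute bin should contain exactly two of the 5-minute samples.
table(bin_arith)
```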
From here, we can easily use aggregate:
aggregate(Measurement ~ ID + newTime, mydf, sum)
# ID newTime Measurement
# 1 A1 10 5
# 2 A2 10 3
# 3 A1 20 6
# 4 A2 20 3
I haven't tested this on anything but your sample data, though.
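For anyone who wants to try this from scratch, here is a self-contained sketch that rebuilds the sample data from the question and runs the same steps end to end. The final reordering and renaming are my own additions, only there to match the layout asked for in the question (sorted by ID, with the column called Time); `out` is just an illustrative name.

```r
# Sample data copied from the question.
mydf <- data.frame(
  ID = rep(c("A1", "A2"), each = 4),
  Time = rep(c(5, 10, 15, 20), times = 2),
  Measurement = c(2, 3, 2, 4, 0, 3, 3, 0)
)

# Assign each 5-minute sample to the 10-minute bin it falls in
# (assumes times lie on the 5-minute grid: 5, 10, 15, ...).
mydf$newTime <- 10 * ((mydf$Time - 5) %/% 10) + 10

# Sum the measurements within each subject and 10-minute bin.
out <- aggregate(Measurement ~ ID + newTime, data = mydf, FUN = sum)

# Sort by subject, then time, and rename the bin column to "Time".
out <- out[order(out$ID, out$newTime), ]
names(out)[names(out) == "newTime"] <- "Time"
rownames(out) <- NULL
out
#   ID Time Measurement
# 1 A1   10           5
# 2 A1   20           6
# 3 A2   10           3
# 4 A2   20           3
```

On the real data, the same code should work unchanged as long as the columns are named ID, Time and Measurement; if a subject is missing a sample, that bin will simply sum whatever samples are present.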