I have a dataset with two columns. I need to calculate the total time, in seconds, for which the value was greater than 1 during the period from 00:00 to 06:00. What is the most efficient way to do this in R? Can it be done with the dplyr package? I need a generic approach that can also be applied to other periods (6 to 9, 9 to 12). Below is some sample data:
+--------------------------------------+
| Timestamp Value |
+--------------------------------------+
| 2015-10-01 00:00:00 300 |
| 2015-10-01 00:00:55 200 |
| 2015-10-01 00:25:10 0 |
| 2015-10-01 01:05:40 876 |
| 2015-10-01 02:05:40 989 |
| 2015-10-01 04:05:40 0 |
| 2015-10-01 05:00:00 600 |
| 2015-10-01 06:00:00 300 |
+--------------------------------------+
So the expected output for the period from 00:00 to 06:00 is 15910 seconds.
1 Solution
#1
First I would parse the date/time:
dat$Timestamp <- strptime(dat$Timestamp, format="%Y-%m-%d %H:%M:%S")
Then I would grab the number of seconds between consecutive observations using difftime:
secs <- as.numeric(difftime(tail(dat$Timestamp, -1), head(dat$Timestamp, -1),
units="secs"))
Finally, I would sum up the number of seconds in each interval that has value greater than 1:
sum(secs[head(dat$Value, -1) > 1])
# [1] 15910
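To see where the 15910 comes from, the intervals can be laid out one per row. This is only a sketch of the same steps: it uses as.POSIXct with tz = "UTC" instead of strptime (the time zone is an assumption, the question does not state one), and the names ts and breakdown are illustrative. Each interval is attributed to the value at its start, which is what head(dat$Value, -1) does in the line above.

```r
# Sample data from the question
dat <- data.frame(
  Timestamp = c("2015-10-01 00:00:00", "2015-10-01 00:00:55",
                "2015-10-01 00:25:10", "2015-10-01 01:05:40",
                "2015-10-01 02:05:40", "2015-10-01 04:05:40",
                "2015-10-01 05:00:00", "2015-10-01 06:00:00"),
  Value = c(300, 200, 0, 876, 989, 0, 600, 300)
)

# Parse the timestamps (tz = "UTC" is an assumption, chosen to avoid DST effects)
ts <- as.POSIXct(dat$Timestamp, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Seconds from each observation to the next one
secs <- as.numeric(difftime(tail(ts, -1), head(ts, -1), units = "secs"))

# One row per interval; an interval is counted if the value at its start is > 1
breakdown <- data.frame(
  start   = head(ts, -1),
  value   = head(dat$Value, -1),
  secs    = secs,
  counted = head(dat$Value, -1) > 1
)
print(breakdown)

# 55 + 1455 + 3600 + 7200 + 3600 = 15910
total <- sum(breakdown$secs[breakdown$counted])
```

The two intervals that start with a value of 0 (2430 and 3260 seconds) are the ones left out of the sum.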
Assuming the boundaries of the time window you are interested in appear in the Timestamp field, you can restrict the data to that window (starting at begin.time and ending at end.time) before computing the differences, with something like:
dat.subset <- dat[dat$Timestamp >= begin.time & dat$Timestamp <= end.time,]
Data:
dat <- data.frame(Timestamp = c("2015-10-01 00:00:00", "2015-10-01 00:00:55", "2015-10-01 00:25:10", "2015-10-01 01:05:40", "2015-10-01 02:05:40", "2015-10-01 04:05:40", "2015-10-01 05:00:00", "2015-10-01 06:00:00"), Value = c(300, 200, 0, 876, 989, 0, 600, 300))
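Since the question asks for something reusable across windows (00 to 06, 06 to 09, and so on), the steps above could be wrapped in a small function. This is a minimal sketch, not part of the original answer: the name seconds_above and its threshold argument are hypothetical, the timestamps are parsed with as.POSIXct and tz = "UTC" (an assumption), and, as noted above, it only gives the right answer when both window boundaries appear as rows in the data.

```r
# Sample data from the question, with Timestamp parsed up front
dat <- data.frame(
  Timestamp = c("2015-10-01 00:00:00", "2015-10-01 00:00:55",
                "2015-10-01 00:25:10", "2015-10-01 01:05:40",
                "2015-10-01 02:05:40", "2015-10-01 04:05:40",
                "2015-10-01 05:00:00", "2015-10-01 06:00:00"),
  Value = c(300, 200, 0, 876, 989, 0, 600, 300)
)
dat$Timestamp <- as.POSIXct(dat$Timestamp, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Hypothetical helper: total seconds within [begin.time, end.time] during which
# the value at the start of each interval exceeds `threshold`.
# Assumes rows are sorted by time and both boundaries exist in dat$Timestamp.
seconds_above <- function(dat, begin.time, end.time, threshold = 1) {
  sub <- dat[dat$Timestamp >= begin.time & dat$Timestamp <= end.time, ]
  if (nrow(sub) < 2) return(0)  # no complete interval inside the window
  secs <- as.numeric(difftime(tail(sub$Timestamp, -1),
                              head(sub$Timestamp, -1), units = "secs"))
  sum(secs[head(sub$Value, -1) > threshold])
}

# Full window from the question
full <- seconds_above(dat,
                      as.POSIXct("2015-10-01 00:00:00", tz = "UTC"),
                      as.POSIXct("2015-10-01 06:00:00", tz = "UTC"))

# A narrower window whose boundaries are also present in the data
part <- seconds_above(dat,
                      as.POSIXct("2015-10-01 01:05:40", tz = "UTC"),
                      as.POSIXct("2015-10-01 05:00:00", tz = "UTC"))
```

Here full is 15910 and part is 10800 (3600 + 7200, with the 3260-second interval at value 0 excluded). If a boundary such as 09:00:00 is not itself an observation, the interval that straddles it is dropped by the subset, so boundary rows would have to be added first.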