I have a vector of POSIXct values and I would like to round them to the nearest quarter hour. I don't care about the day. How do I convert the values to hours and minutes?
For example, I would like the value
"2012-05-30 20:41:21 UTC"
to be
"20:45"
7 Answers
#1
7
Indeed, an old question with some helpful answers so far. The last one by giraffhere seems to be the most elegant. However, it is not floor_date but round_date that does the trick:
lubridate::round_date(x, "15 minutes")
#2
19
You can use round. The trick is to divide by 900 seconds (15 minutes * 60 seconds) before rounding and multiply by 900 afterwards:
a <- as.POSIXlt("2012-05-30 20:41:21", tz = "UTC")
b <- as.POSIXlt(round(as.double(a) / (15 * 60)) * (15 * 60), origin = "1970-01-01", tz = "UTC")
b
[1] "2012-05-30 20:45:00 UTC"
To get only the hour and minute, just use format:
format(b,"%H:%M")
[1] "20:45"
as.character(format(b,"%H:%M"))
[1] "20:45"
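The divide-round-multiply trick above works on whole vectors, so it can be wrapped in a small helper. The following is only a sketch: the name round_to_minutes and its minutes and tz parameters are made up for illustration and are not part of the answer. Note that base R's round() sends exact halves to the even neighbour, so a time that falls exactly 7.5 minutes between two quarter hours may go either way.

```r
# Hypothetical helper (not from the answer): round date-times to the
# nearest multiple of `minutes` and return "HH:MM" strings.
round_to_minutes <- function(x, minutes = 15, tz = "UTC") {
  step <- minutes * 60                          # interval length in seconds
  secs <- as.numeric(as.POSIXct(x, tz = tz))    # seconds since 1970-01-01 UTC
  rounded <- as.POSIXct(round(secs / step) * step,
                        origin = "1970-01-01", tz = tz)
  format(rounded, "%H:%M")                      # drop the date part
}

times <- as.POSIXct(c("2012-05-30 20:41:21",
                      "2012-05-30 20:55:21",
                      "2012-05-30 23:58:00"), tz = "UTC")
round_to_minutes(times)
# [1] "20:45" "21:00" "00:00"
```

The last element shows that a time close to midnight rolls over to "00:00", which is fine when the day does not matter.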
#3
14
Something like
format(strptime("1970-01-01", "%Y-%m-%d", tz="UTC") + round(as.numeric(your.time)/900)*900,"%H:%M")
would work.
#4
9
Old question, but I would like to note that the lubridate package now handles this easily with floor_date. To cut a vector of POSIXct objects to 15-minute intervals, use it like this:
x <- lubridate::floor_date(x, "15 minutes")
EDIT: As noted by user @user3297928, use lubridate::round_date(x, "15 minutes") to round to the nearest 15 minutes. The code above floors it.
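To make the difference between flooring and rounding to the nearest interval concrete, here is a base R sketch that does not use lubridate and is not lubridate's implementation. It only applies floor, round and ceiling to the number of seconds since the epoch; the helper name to_time is made up for illustration.

```r
x <- as.POSIXct(c("2012-05-30 20:41:21", "2012-05-30 20:35:00"), tz = "UTC")
step <- 15 * 60            # 15 minutes in seconds
secs <- as.numeric(x)      # seconds since 1970-01-01 UTC

# Hypothetical helper: turn epoch seconds back into "HH:MM" strings.
to_time <- function(s) {
  format(as.POSIXct(s, origin = "1970-01-01", tz = "UTC"), "%H:%M")
}

to_time(floor(secs / step) * step)     # always down
# [1] "20:30" "20:30"
to_time(round(secs / step) * step)     # nearest quarter hour
# [1] "20:45" "20:30"
to_time(ceiling(secs / step) * step)   # always up
# [1] "20:45" "20:45"
```

Only the middle call matches what the question asks for; the other two agree with it for some inputs and differ for others.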
#5
4
You can use the align.time function in the xts package to handle the rounding, then format to return a string of "HH:MM":
R> library(xts)
R> p <- as.POSIXct("2012-05-30 20:41:21", tz="UTC")
R> a <- align.time(p, n=60*15) # n is in seconds
R> format(a, "%H:%M")
[1] "20:45"
#6
3
Try this, which combines both requests and is based on looking at what round.POSIXt() and trunc.POSIXt() do.
myRound <- function (x, convert = TRUE) {
    x <- as.POSIXlt(x)
    mins <- x$min
    mult <- mins %/% 15
    remain <- mins %% 15
    if(remain > 7L || (remain == 7L && x$sec > 29))
        mult <- mult + 1
    if(mult > 3) {
        x$min <- 0
        x <- x + 3600
    } else {
        x$min <- 15 * mult
    }
    x <- trunc.POSIXt(x, units = "mins")
    if(convert) {
        x <- format(x, format = "%H:%M")
    }
    x
}
This gives:
> tmp <- as.POSIXct("2012-05-30 20:41:21 UTC")
> myRound(tmp)
[1] "20:45"
> myRound(tmp, convert = FALSE)
[1] "2012-05-30 20:45:00 BST"
> tmp2 <- as.POSIXct("2012-05-30 20:55:21 UTC")
> myRound(tmp2)
[1] "21:00"
> myRound(tmp2, convert = FALSE)
[1] "2012-05-30 21:00:00 BST"
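The function above handles one value at a time. As a rough sketch of the same field-based idea for a whole vector, the following works on the clock fields of a POSIXlt object and builds the "HH:MM" string directly. The name round_clock and its minutes parameter are made up for illustration; ties at exactly half an interval round up, which may differ from myRound in edge cases with fractional seconds.

```r
# Hypothetical vectorised variant (not the answer's code): round the
# wall-clock time of day to the nearest `minutes`, ties rounding up.
round_clock <- function(x, minutes = 15) {
  lt <- as.POSIXlt(x)                             # keeps x's own time zone
  secs <- lt$hour * 3600 + lt$min * 60 + lt$sec   # seconds since local midnight
  step <- minutes * 60
  # Add half a step and floor, then wrap 24:00 back to 00:00.
  rounded <- (floor((secs + step / 2) / step) * step) %% 86400
  sprintf("%02d:%02d",
          as.integer(rounded %/% 3600),
          as.integer((rounded %% 3600) %/% 60))
}

tmp <- as.POSIXct(c("2012-05-30 20:41:21",
                    "2012-05-30 20:55:21",
                    "2012-05-30 23:53:00"), tz = "UTC")
round_clock(tmp)
# [1] "20:45" "21:00" "00:00"
```

Because the date is discarded, this variant only suits the case in the question where the day does not matter.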
#7
2
Using the IDate and ITime classes from data.table and an IPeriod class (just developed), I was able to get a more scalable solution.
Only shhhhimhuntingrabbits and PLapointe answer the question in terms of rounding to the nearest interval. The xts solution only rounds using ceiling; my IPeriod solution lets you specify ceiling or floor.
To get top performance you need to keep your data in the IDate and ITime classes. As the benchmark shows, it is cheap to produce POSIXct from IDate/ITime/IPeriod. Below is a benchmark on some 22M timestamps:
# install only if you don't have
install.packages(c("microbenchmarkCore","data.table"),
                 repos = c("https://olafmersmann.github.io/drat",
                           "https://jangorecki.github.io/drat/iperiod"))
library(microbenchmarkCore)
library(data.table) # iunit branch
library(xts)
Sys.setenv(TZ="UTC")
## some source data: download and unzip csv
# "http://api.bitcoincharts.com/v1/csv/btceUSD.csv.gz"
# below benchmark on btceUSD.csv.gz 11-Oct-2015 11:35 133664801
system.nanotime(dt <- fread(".btceUSD.csv"))
# Read 21931266 rows and 3 (of 3) columns from 0.878 GB file in 00:00:10
# user system elapsed
# NA NA 9.048991
# take the timestamp only
x = as.POSIXct(dt[[1L]], tz="UTC", origin="1970-01-01")
# functions
shhhhi <- function(your.time){
    strptime("1970-01-01", "%Y-%m-%d", tz="UTC") + round(as.numeric(your.time)/900)*900
}
PLapointe <- function(a){
    as.POSIXlt(round(as.double(a)/(15*60))*(15*60),origin=(as.POSIXlt('1970-01-01')))
}
# myRound - not vectorized
# compare results
all.equal(
    format(shhhhi(x),"%H:%M"),
    format(PLapointe(x),"%H:%M")
)
# [1] TRUE
all.equal(
    format(align.time(x, n = 60*15),"%H:%M"),
    format(periodize(x, "mins", 15),"%H:%M")
)
# [1] TRUE
# IPeriod native input are IDate and ITime - will be tested too
idt <- IDateTime(x)
idate <- idt$idate
itime <- idt$itime
microbenchmark(times = 10L,
               shhhhi(x),
               PLapointe(x),
               xts = align.time(x, 15*60),
               posix_ip_posix = as.POSIXct(periodize(x, "mins", 15), tz="UTC"),
               posix_ip = periodize(x, "mins", 15),
               ip_posix = as.POSIXct(periodize(idate, itime, "mins", 15), tz="UTC"),
               ip = periodize(idate, itime, "mins", 15))
# Unit: microseconds
#           expr         min          lq         mean       median          uq         max neval
#      shhhhi(x)  960819.810  984970.363 1127272.6812 1167512.2765 1201770.895 1243706.235    10
#   PLapointe(x) 2322929.313 2440263.122 2617210.4264 2597772.9825 2792936.774 2981499.356    10
#            xts  453409.222  525738.163  581139.6768  546300.9395  677077.650  767609.155    10
# posix_ip_posix 3314609.993 3499220.920 3641219.0876 3586822.9150 3654548.885 4457614.174    10
#       posix_ip 3010316.462 3066736.299 3157777.2361 3133693.0655 3234307.549 3401388.800    10
#       ip_posix     335.741     380.696     513.7420     543.3425     630.020     663.385    10
#             ip      98.031     151.471     207.7404     231.8200     262.037     278.789    10
IDate and ITime scale well, and not only in this particular task. Both types, like IPeriod, are integer based. I would assume they will also scale nicely on joins or grouping by datetime fields.
Online manual: https://jangorecki.github.io/drat/iperiod/