I have data in two columns:
DateTime Profit
20130319T01 5
20130319T02 135
20130319T03 245
20130320T01 10
20130320T02 115
I want to create a column holding the difference in Profit from one hour to the next. The problem is that Profit resets to zero each day, so the difference has to restart at the first hour of every date. I want to get the following:
DateTime Diff
20130319T01 5
20130319T02 130
20130319T03 110
20130320T01 10
20130320T02 105
1 solution
Assuming the first eight characters of your DateTime character vector are always the date in "YYYYMMDD" form, you can use the ddply function from plyr to get what you want:
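library(plyr)
# Extract the date part (first 8 characters) to group by
df$Date <- substr(df$DateTime, 1, 8)
# Prepend 0 so each day's first difference is that hour's own profit
ddply(df, .(Date), summarise, Diff = diff(c(0, Profit)))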
# Date Diff
#1 20130319 5
#2 20130319 130
#3 20130319 110
#4 20130320 10
#5 20130320 105
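Note that summarise keeps only Date and Diff, whereas the desired output in the question is keyed by DateTime. A minimal sketch of one way to keep the original columns is to pass base R's transform to ddply instead. The data frame below is rebuilt from the sample in the question, and the name res is just an illustrative choice:

```r
library(plyr)

# Sample data copied from the question
df <- data.frame(
  DateTime = c("20130319T01", "20130319T02", "20130319T03",
               "20130320T01", "20130320T02"),
  Profit   = c(5, 135, 245, 10, 115),
  stringsAsFactors = FALSE
)

# Group key: the date part of DateTime (first 8 characters)
df$Date <- substr(df$DateTime, 1, 8)

# transform() adds Diff to each per-day piece instead of collapsing it,
# so DateTime and Profit survive in the result
res <- ddply(df, .(Date), transform, Diff = diff(c(0, Profit)))
res[, c("DateTime", "Diff")]
```

This relies on the rows already being in chronological order within each day, as they are in the sample.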
Another way, using base R's ave:
within(df, {
  Profit_diff <- ave(Profit, list(gsub("T.*$", "", DateTime)),
                     FUN = function(x) c(x[1], diff(x)))
})
# DateTime Profit Profit_diff
# 1 20130319T01 5 5
# 2 20130319T02 135 130
# 3 20130319T03 245 110
# 4 20130320T01 10 10
# 5 20130320T02 115 105
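Both approaches take differences in row order, so they implicitly assume the rows are sorted by time within each day. As a hedged sketch, here is a self-contained base R version that sorts first. It assumes every DateTime string has the fixed-width "YYYYMMDDTHH" layout with a zero-padded hour (so that plain text sorting is chronological). The sample rows are the ones from the question, deliberately shuffled, and the helper variable day is an illustrative name:

```r
# Sample data from the question, shuffled on purpose to show the sort step
df <- data.frame(
  DateTime = c("20130320T02", "20130319T01", "20130319T03",
               "20130320T01", "20130319T02"),
  Profit   = c(115, 5, 245, 10, 135),
  stringsAsFactors = FALSE
)

# Fixed-width "YYYYMMDDTHH" strings sort chronologically as plain text
df <- df[order(df$DateTime), ]
rownames(df) <- NULL

# Group by the date part and take per-day differences, starting each day from 0
day <- substr(df$DateTime, 1, 8)
df$Diff <- ave(df$Profit, day, FUN = function(x) diff(c(0, x)))
df
```

Here diff(c(0, x)) and c(x[1], diff(x)) give the same result. The first form just makes the "resets to zero each day" assumption explicit.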