I'm working with a set of panel data which looks like this:
> head(data)
date id value
1998-12-31 AB89 120.3
1998-12-31 BC12 89.3
1998-12-31 SU16 56.3
.
.
.
1999-06-31 SU16 526.3
1999-06-31 AB89 80
1999-06-31 ZP32 15
.
.
.
And so on. I would like to append a column to the data so that it gives the quintile that row belongs to on that date. E.g.:
> head(data)
date id value quintile
1998-12-31 AB89 120.3 1
1998-12-31 BC12 89.3 2
1998-12-31 SU16 56.3 5
.
.
.
1999-06-31 SU16 526.3 1
1999-06-31 AB89 80 4
1999-06-31 ZP32 15 5
.
.
.
To clarify, AB89's value of 120.3 would put it in the first quintile out of all the possible values on 1998-12-31.
I've looked at plyr and tried playing around with the function ddply, but it's taking me a long time to figure it out.
1 solution
If you use dplyr, you could do something like this. Please note that I modified your data. In my data, date is a character column, but you should get the same result if date is a Date object.
library(dplyr)

# Within each date, rank rows by descending value and split them into 5 buckets
foo %>%
  group_by(date) %>%
  mutate(quintile = ntile(desc(value), 5))
# date id value quintile
#1 1998-12-31 AB89 120.3 1
#2 1998-12-31 BC12 89.3 2
#3 1998-12-31 SU16 56.3 3
#4 1998-12-31 SU16 20.3 4
#5 1998-12-31 SU18 9.3 5
#6 1999-06-31 SU16 526.3 1
#7 1999-06-31 AB89 80.0 3
#8 1999-06-31 ZP32 15.0 5
#9 1999-06-31 AB99 40.0 4
#10 1999-06-31 AS33 130.0 2
#11 1999-06-31 ZP32 200.0 1
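For readers without dplyr, the same per-date bucketing can be sketched in base R. This is only an illustrative alternative, not part of the original answer: quintile_desc is a hypothetical helper name, and its bucket rule (descending rank, ties broken by row order) is an assumption chosen so that it reproduces the quintile column shown above for this sample data. For other group sizes the bucket boundaries may not match ntile exactly.

```r
# Same sample rows as the answer's data, rebuilt so the block runs on its own
foo <- data.frame(
  date  = rep(c("1998-12-31", "1999-06-31"), times = c(5, 6)),
  id    = c("AB89", "BC12", "SU16", "SU16", "SU18",
            "SU16", "AB89", "ZP32", "AB99", "AS33", "ZP32"),
  value = c(120.3, 89.3, 56.3, 20.3, 9.3, 526.3, 80, 15, 40, 130, 200),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from the original answer): rank-based bucket,
# where bucket 1 holds the largest values
quintile_desc <- function(v, n = 5L) {
  r <- rank(-v, ties.method = "first")  # 1 = largest value in the group
  (n * (r - 1L)) %/% length(v) + 1L
}

# ave() applies the helper separately to the values of each date
foo$quintile <- ave(foo$value, foo$date, FUN = quintile_desc)
foo
```

ave() returns its result in the original row order, so the new column lines up with the existing rows without any merging or re-sorting.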
DATA
foo <- structure(list(date = c("1998-12-31", "1998-12-31", "1998-12-31",
"1998-12-31", "1998-12-31", "1999-06-31", "1999-06-31", "1999-06-31",
"1999-06-31", "1999-06-31", "1999-06-31"), id = c("AB89", "BC12",
"SU16", "SU16", "SU18", "SU16", "AB89", "ZP32", "AB99", "AS33",
"ZP32"), value = c(120.3, 89.3, 56.3, 20.3, 9.3, 526.3, 80, 15,
40, 130, 200)), .Names = c("date", "id", "value"), class = "data.frame", row.names = c(NA,
-11L))
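The remark that a Date column should give the same result can be checked with a small sketch. One assumption to flag: June has only 30 days, so the sample date 1999-06-31 is not a valid calendar date and is not expected to convert cleanly with as.Date. The example below substitutes 1999-06-30, and the names foo_chr, foo_date, and add_quintile are made up for illustration.

```r
library(dplyr)

# Values from the sample data; "1999-06-30" is an assumed stand-in for the
# original "1999-06-31", which is not a valid calendar date
foo_chr <- data.frame(
  date  = rep(c("1998-12-31", "1999-06-30"), times = c(5, 6)),
  value = c(120.3, 89.3, 56.3, 20.3, 9.3, 526.3, 80, 15, 40, 130, 200),
  stringsAsFactors = FALSE
)

# Same rows, with date converted from character to Date
foo_date <- foo_chr %>% mutate(date = as.Date(date))

# Hypothetical wrapper around the grouping step used in the answer
add_quintile <- function(df) {
  df %>%
    group_by(date) %>%
    mutate(quintile = ntile(desc(value), 5)) %>%
    ungroup()
}

q_chr  <- add_quintile(foo_chr)$quintile
q_date <- add_quintile(foo_date)$quintile

# TRUE when grouping by character dates and by Date objects agree
identical(q_chr, q_date)
```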