Appending a column of aggregated data to panel data in R

Time: 2022-03-16 16:12:22

I'm working with a set of panel data which looks like this:

> head(data)
      date  id    value
1998-12-31  AB89  120.3
1998-12-31  BC12  89.3
1998-12-31  SU16  56.3
.
.
.
1999-06-31  SU16  526.3
1999-06-31  AB89  80
1999-06-31  ZP32  15
.
.
.

And so on. I would like to append a column that gives the quintile each row falls into among all values on that row's date. For example:

> head(data)
      date  id    value   quintile
1998-12-31  AB89  120.3   1
1998-12-31  BC12  89.3    2
1998-12-31  SU16  56.3    5
.
.
.
1999-06-31  SU16  526.3   1
1999-06-31  AB89  80      4
1999-06-31  ZP32  15      5
.
.
.

To clarify, AB89's value of 120.3 would put it in the first quintile out of all the possible values on 1998-12-31.

I've looked at plyr and tried playing around with the function ddply but it's taking me a long time to figure it out.

1 solution

#1


If you use dplyr, you could do something like this. Please note that I modified your data. In this data, date is stored as character, but you should get the same result if it is a Date object.

library(dplyr)

foo %>%
    group_by(date) %>%                          # rank rows separately within each date
    mutate(quintile = ntile(desc(value), 5))    # desc() puts the largest values in quintile 1

#         date   id value quintile
#1  1998-12-31 AB89 120.3        1
#2  1998-12-31 BC12  89.3        2
#3  1998-12-31 SU16  56.3        3
#4  1998-12-31 SU16  20.3        4
#5  1998-12-31 SU18   9.3        5
#6  1999-06-31 SU16 526.3        1
#7  1999-06-31 AB89  80.0        3
#8  1999-06-31 ZP32  15.0        5
#9  1999-06-31 AB99  40.0        4
#10 1999-06-31 AS33 130.0        2
#11 1999-06-31 ZP32 200.0        1
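
For readers who would rather not add a dependency, roughly the same ranking can be sketched in base R with ave(). This is an illustration, not the method used in the answer: the data frame df and the helper quintile_desc are hypothetical, the dates are chosen to be valid calendar dates, and each date has exactly five rows. When a group's size is not a multiple of 5, this formula may split the buckets differently from ntile().

```r
# Hypothetical toy data: two dates, five rows each
df <- data.frame(
  date  = rep(c("1998-12-31", "1999-06-30"), each = 5),
  id    = c("AB89", "BC12", "SU16", "SU17", "SU18",
            "SU16", "AB89", "ZP32", "AB99", "AS33"),
  value = c(120.3, 89.3, 56.3, 20.3, 9.3,
            526.3, 80, 15, 40, 130),
  stringsAsFactors = FALSE
)

# Hypothetical helper: rank values in descending order (largest = rank 1),
# then map the ranks onto five buckets
quintile_desc <- function(v) {
  r <- rank(-v, ties.method = "first")
  ceiling(r * 5 / length(v))
}

# ave() applies the helper within each date and returns the results
# in the original row order
df$quintile <- ave(df$value, df$date, FUN = quintile_desc)

print(df)
```

With five rows per date, each row lands in its own quintile, so the largest value on each date gets 1 and the smallest gets 5.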

DATA

foo <- structure(list(date = c("1998-12-31", "1998-12-31", "1998-12-31", 
"1998-12-31", "1998-12-31", "1999-06-31", "1999-06-31", "1999-06-31", 
"1999-06-31", "1999-06-31", "1999-06-31"), id = c("AB89", "BC12", 
"SU16", "SU16", "SU18", "SU16", "AB89", "ZP32", "AB99", "AS33", 
"ZP32"), value = c(120.3, 89.3, 56.3, 20.3, 9.3, 526.3, 80, 15, 
40, 130, 200)), .Names = c("date", "id", "value"), class = "data.frame", row.names = c(NA, 
-11L))
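
To check the remark that a Date column should behave the same as a character column, here is a minimal sketch. The data frame bar and the helper add_quintile are hypothetical names introduced for illustration. The sketch uses "1999-06-30" rather than "1999-06-31", because June has only 30 days and the latter is not a real calendar date.

```r
library(dplyr)

# Hypothetical data with valid dates stored as character
bar <- data.frame(
  date  = rep(c("1998-12-31", "1999-06-30"), each = 5),
  value = c(120.3, 89.3, 56.3, 20.3, 9.3,
            526.3, 80, 15, 40, 130),
  stringsAsFactors = FALSE
)

# Hypothetical helper wrapping the pipeline from the answer
add_quintile <- function(d) {
  d %>%
    group_by(date) %>%
    mutate(quintile = ntile(desc(value), 5)) %>%
    ungroup()
}

res_chr  <- add_quintile(bar)                                # date as character
res_date <- add_quintile(mutate(bar, date = as.Date(date)))  # date as Date

# Compare the quintile columns from the two runs
same <- identical(res_chr$quintile, res_date$quintile)
print(same)
```

Grouping only needs to tell the dates apart, so the comparison should come out TRUE for this data regardless of the column's class.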
