I have the following data frame in R:
truck_no  start_time        end_time          ctr_no  time  type
ABC123    20-05-2016 06:53  20-05-2016 08:53  ERT09   1.67  D
ABC123    20-05-2016 06:53  20-05-2016 08:53  TRT12   1.67  R
ABC123    20-05-2016 06:53  20-05-2016 08:53  ERT34   1.67  R
ABC123    20-05-2016 06:53  20-05-2016 08:53  ERT33   1.67  D
ERT123    21-05-2016 06:53  21-05-2016 08:53  QRT34   2.67  R
ERT123    21-05-2016 06:53  21-05-2016 08:53  PRT33   2.67  D
My desired data frame is:
truck_no  start_time        end_time          ctr_no  time  type
ABC123    20-05-2016 06:53  20-05-2016 08:53  ERT09   1.67  2D2R
ABC123    20-05-2016 06:53  20-05-2016 08:53  TRT12   2.67  1R1D
I want to count the D's and R's and paste the counts together as shown above, with time taken as the average. How can I do this in dplyr?
1 solution
#1
Here's a dplyr approach:
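library(dplyr)

# Count each value of x and paste counts and labels together, e.g. "2D2R"
foo <- function(x) {y <- table(x); paste(rbind(y, names(y)), collapse = "")}

df %>%
  group_by(truck_no) %>%
  mutate(type = foo(type)) %>%   # same summary string on every row of the group
  summarise_all(first)           # keep the first row per truck_no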
## A tibble: 2 x 5
# truck_no start_time end_time ctr_no type
# <fctr> <fctr> <fctr> <fctr> <chr>
#1 ABC123 20-05-2016 06:53 20-05-2016 08:53 ERT09 2D2R
#2 ERT123 21-05-2016 06:53 21-05-2016 08:53 QRT34 1D1R
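To see why `foo` produces strings such as `2D2R`, it may help to run it on a plain vector outside the pipeline. The snippet below is a small base R illustration; the vector `type` is made up here to mirror the four ABC123 rows from the question.

```r
# Same helper as in the answer, spread over several lines.
foo <- function(x) {
  y <- table(x)                             # named counts, sorted by label
  paste(rbind(y, names(y)), collapse = "")  # interleave counts and labels
}

type <- c("D", "R", "R", "D")  # assumed input: the ABC123 rows of the question
y <- table(type)
m <- rbind(y, names(y))        # character matrix: counts in row 1, labels in row 2
print(m)

# paste() walks the matrix column by column: "2", "D", "2", "R"
res1 <- foo(type)              # "2D2R"
res2 <- foo(c("R", "D"))       # "1D1R": labels are sorted, so D comes before R
```

Because `table()` sorts its labels, the result is always in `D`, `R` order regardless of the order of the rows.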
In case you want the start and end times to be the mean per truck_no, you could use the following extension:
df %>%
  group_by(truck_no) %>%
  # parse the *_time columns as date-times and replace them with the group mean
  mutate_at(vars(ends_with("_time")),
            ~mean(as.POSIXct(as.character(.), format = "%d-%m-%Y %H:%M"))) %>%
  mutate(type = foo(type)) %>%
  summarise_all(first)
## A tibble: 2 x 5
# truck_no start_time end_time ctr_no type
# <fctr> <dttm> <dttm> <fctr> <chr>
#1 ABC123 2016-05-20 06:53:00 2016-05-20 08:23:00 ERT09 2D2R
#2 ERT123 2016-05-21 06:53:00 2016-05-21 08:53:00 QRT34 1D1R
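The question also asks for `time` to be averaged, and that column does not appear in the outputs above. One way this might be handled is to call `summarise()` directly and name each aggregate. This is a sketch, not the answerer's method. The data frame is rebuilt from the question's table with character columns (`stringsAsFactors = FALSE` is an assumption; the answer's output shows factors), and `res` is a name introduced here.

```r
library(dplyr)

# Data frame rebuilt from the table in the question (assumed column types).
df <- data.frame(
  truck_no   = c(rep("ABC123", 4), rep("ERT123", 2)),
  start_time = c(rep("20-05-2016 06:53", 4), rep("21-05-2016 06:53", 2)),
  end_time   = c(rep("20-05-2016 08:53", 4), rep("21-05-2016 08:53", 2)),
  ctr_no     = c("ERT09", "TRT12", "ERT34", "ERT33", "QRT34", "PRT33"),
  time       = c(1.67, 1.67, 1.67, 1.67, 2.67, 2.67),
  type       = c("D", "R", "R", "D", "R", "D"),
  stringsAsFactors = FALSE
)

# Helper from the answer: counts pasted together with their labels.
foo <- function(x) {y <- table(x); paste(rbind(y, names(y)), collapse = "")}

res <- df %>%
  group_by(truck_no) %>%
  summarise(
    start_time = first(start_time),  # first value per truck
    end_time   = first(end_time),
    ctr_no     = first(ctr_no),
    time       = mean(time),         # average time per truck
    type       = foo(type)           # e.g. "2D2R"
  )

print(res)
```

Naming each aggregate makes it explicit which columns keep their first value and which are averaged, at the cost of listing every column by hand.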