I need to merge two datasets, but the rows have to merge if the date of the one dataset is between two dates of the other one. The first dataset data
looks like this:
我需要合并两个数据集,但如果一个数据集的日期在另一个数据集的两个日期之间,则必须合并行。第一个数据集数据如下所示:
Date Weight diff Loc.nr
2013-01-24 1040 7 2
2013-01-31 1000 7 2
2013-02-07 1185 7 2
2013-02-14 915 7 2
2013-02-21 1090 7 2
2013-03-01 1065 9 2
2013-01-19 500 4 9
2013-01-23 1040 3 9
2013-01-28 415 5 9
2013-01-31 650 3 9
2013-02-04 725 4 9
2013-02-07 450 3 9
2013-02-11 550 4 9
The other data set matches
looks like this:
其他数据集匹配如下所示:
Date winning
2013-01-20 1
2013-01-27 0
2013-02-03 1
2013-02-10 0
2013-02-17 1
2013-02-24 0
I wrote a code to connect the winning column from matches
to the data set "data":
我编写了一个代码,用于将匹配的获胜列连接到数据集“data”:
data$winning <- NA
for(i in 1:nrow(data)) {
for(j in 1:nrow(matches)) {
if((data$Date[i]-data$diff[i]) < matches$Date[j] & data$Date[i] > matches$Date[j]) {
data$winning[i] <- matches$winning[j]
}
}
}
This code takes 3 days to run, is there a faster way to do this?
此代码需要3天才能运行,有更快的方法吗?
My expected output is:
我的预期输出是:
Date Weight diff Loc.nr winning
2013-01-24 1040 7 2 1
2013-01-31 1000 7 2 0
2013-02-07 1185 7 2 1
2013-02-14 915 7 2 0
2013-02-21 1090 7 2 1
2013-03-01 1065 9 2 0
2013-01-19 500 4 9 NA
2013-01-23 1040 3 9 NA
2013-01-28 415 5 9 0
2013-01-31 650 3 9 NA
2013-02-04 725 4 9 1
2013-02-07 450 3 9 NA
2013-02-11 550 4 9 0
1 个解决方案
#1
0
With non-equi join as suggested by Gregor you can try something along
根据Gregor的建议,你可以尝试一些东西
library(data.table)
setDT(data)[, winning := setDT(matches)[data[, .(upper = Date, lower = Date - diff)],
on = .(Date < upper, Date > lower)]$winning][]
Date Weight diff Loc.nr winning 1: 2013-01-24 1040 7 2 1 2: 2013-01-31 1000 7 2 0 3: 2013-02-07 1185 7 2 1 4: 2013-02-14 915 7 2 0 5: 2013-02-21 1090 7 2 1 6: 2013-03-01 1065 9 2 0 7: 2013-01-19 500 4 9 NA 8: 2013-01-23 1040 3 9 NA 9: 2013-01-28 415 5 9 0 10: 2013-01-31 650 3 9 NA 11: 2013-02-04 725 4 9 1 12: 2013-02-07 450 3 9 NA 13: 2013-02-11 550 4 9 0
#1
0
With non-equi join as suggested by Gregor you can try something along
根据Gregor的建议,你可以尝试一些东西
library(data.table)
setDT(data)[, winning := setDT(matches)[data[, .(upper = Date, lower = Date - diff)],
on = .(Date < upper, Date > lower)]$winning][]
Date Weight diff Loc.nr winning 1: 2013-01-24 1040 7 2 1 2: 2013-01-31 1000 7 2 0 3: 2013-02-07 1185 7 2 1 4: 2013-02-14 915 7 2 0 5: 2013-02-21 1090 7 2 1 6: 2013-03-01 1065 9 2 0 7: 2013-01-19 500 4 9 NA 8: 2013-01-23 1040 3 9 NA 9: 2013-01-28 415 5 9 0 10: 2013-01-31 650 3 9 NA 11: 2013-02-04 725 4 9 1 12: 2013-02-07 450 3 9 NA 13: 2013-02-11 550 4 9 0