I need to merge two datasets, where a row should be matched only if the date in one dataset falls between two dates from the other. The first dataset, data,
looks like this:
Date Weight diff Loc.nr
2013-01-24 1040 7 2
2013-01-31 1000 7 2
2013-02-07 1185 7 2
2013-02-14 915 7 2
2013-02-21 1090 7 2
2013-03-01 1065 9 2
2013-01-19 500 4 9
2013-01-23 1040 3 9
2013-01-28 415 5 9
2013-01-31 650 3 9
2013-02-04 725 4 9
2013-02-07 450 3 9
2013-02-11 550 4 9
The other dataset, matches,
looks like this:
Date winning
2013-01-20 1
2013-01-27 0
2013-02-03 1
2013-02-10 0
2013-02-17 1
2013-02-24 0
I wrote some code to add the winning column from matches
to the dataset data:
data$winning <- NA
for (i in 1:nrow(data)) {
  for (j in 1:nrow(matches)) {
    # the match date must lie strictly between Date - diff and Date
    if ((data$Date[i] - data$diff[i]) < matches$Date[j] && data$Date[i] > matches$Date[j]) {
      data$winning[i] <- matches$winning[j]
    }
  }
}
This code takes 3 days to run. Is there a faster way to do this?
My expected output is:
Date Weight diff Loc.nr winning
2013-01-24 1040 7 2 1
2013-01-31 1000 7 2 0
2013-02-07 1185 7 2 1
2013-02-14 915 7 2 0
2013-02-21 1090 7 2 1
2013-03-01 1065 9 2 0
2013-01-19 500 4 9 NA
2013-01-23 1040 3 9 NA
2013-01-28 415 5 9 0
2013-01-31 650 3 9 NA
2013-02-04 725 4 9 1
2013-02-07 450 3 9 NA
2013-02-11 550 4 9 0
1 Solution
With a non-equi join from data.table, as suggested by Gregor, you can try something along these lines:
library(data.table)
setDT(data)[, winning := setDT(matches)[data[, .(upper = Date, lower = Date - diff)],
                                        on = .(Date < upper, Date > lower)]$winning][]
          Date Weight diff Loc.nr winning
 1: 2013-01-24   1040    7      2       1
 2: 2013-01-31   1000    7      2       0
 3: 2013-02-07   1185    7      2       1
 4: 2013-02-14    915    7      2       0
 5: 2013-02-21   1090    7      2       1
 6: 2013-03-01   1065    9      2       0
 7: 2013-01-19    500    4      9      NA
 8: 2013-01-23   1040    3      9      NA
 9: 2013-01-28    415    5      9       0
10: 2013-01-31    650    3      9      NA
11: 2013-02-04    725    4      9       1
12: 2013-02-07    450    3      9      NA
13: 2013-02-11    550    4      9       0
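For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same non-equi join, split into steps and run on the sample data from the question. The names windows and joined are only illustrative. It assumes each window (Date - diff, Date) contains at most one match date, which holds for the sample data; if a window could contain several, the join would return extra rows and the assignment would no longer line up, so the sketch checks the row count first.

```r
library(data.table)

# Sample data copied from the question
data <- data.table(
  Date   = as.Date(c("2013-01-24", "2013-01-31", "2013-02-07", "2013-02-14",
                     "2013-02-21", "2013-03-01", "2013-01-19", "2013-01-23",
                     "2013-01-28", "2013-01-31", "2013-02-04", "2013-02-07",
                     "2013-02-11")),
  Weight = c(1040, 1000, 1185, 915, 1090, 1065, 500, 1040, 415, 650, 725, 450, 550),
  diff   = c(7, 7, 7, 7, 7, 9, 4, 3, 5, 3, 4, 3, 4),
  Loc.nr = c(2, 2, 2, 2, 2, 2, 9, 9, 9, 9, 9, 9, 9)
)
matches <- data.table(
  Date    = as.Date(c("2013-01-20", "2013-01-27", "2013-02-03",
                      "2013-02-10", "2013-02-17", "2013-02-24")),
  winning = c(1, 0, 1, 0, 1, 0)
)

# One window per row of data: (Date - diff, Date), both ends exclusive
windows <- data[, .(upper = Date, lower = Date - diff)]

# Non-equi join: for each window, look up the match date strictly inside it.
# Windows without a match produce a row with winning = NA.
joined <- matches[windows, on = .(Date < upper, Date > lower)]

# Assumption: at most one match per window, so the rows line up with data
stopifnot(nrow(joined) == nrow(data))

data[, winning := joined$winning]
print(data)
```

On the sample data this gives the same winning column as the expected output in the question.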