当日期在其他2个日期之间R时,如何将数据集连接到另一个数据集

时间:2022-09-03 22:52:28

I need to merge two datasets, but the rows have to merge if the date of the one dataset is between two dates of the other one. The first dataset data looks like this:

我需要合并两个数据集,但如果一个数据集的日期在另一个数据集的两个日期之间,则必须合并行。第一个数据集数据如下所示:

      Date   Weight   diff    Loc.nr
2013-01-24     1040      7         2
2013-01-31     1000      7         2
2013-02-07     1185      7         2
2013-02-14      915      7         2
2013-02-21     1090      7         2
2013-03-01     1065      9         2
2013-01-19      500      4         9
2013-01-23     1040      3         9
2013-01-28      415      5         9
2013-01-31      650      3         9
2013-02-04      725      4         9
2013-02-07      450      3         9
2013-02-11      550      4         9

The other data set matches looks like this:

其他数据集匹配如下所示:

      Date   winning
2013-01-20         1
2013-01-27         0
2013-02-03         1
2013-02-10         0
2013-02-17         1
2013-02-24         0

I wrote a code to connect the winning column from matches to the data set "data":

我编写了一个代码,用于将匹配的获胜列连接到数据集“data”:

data$winning <- NA
for(i in 1:nrow(data)) {
  for(j in 1:nrow(matches)) {
    if((data$Date[i]-data$diff[i]) < matches$Date[j] & data$Date[i] > matches$Date[j]) {
      data$winning[i] <- matches$winning[j]
    }
  }
}

This code takes 3 days to run, is there a faster way to do this?

此代码需要3天才能运行,有更快的方法吗?

My expected output is:

我的预期输出是:

      Date   Weight   diff    Loc.nr    winning
2013-01-24     1040      7         2          1
2013-01-31     1000      7         2          0
2013-02-07     1185      7         2          1
2013-02-14      915      7         2          0
2013-02-21     1090      7         2          1
2013-03-01     1065      9         2          0
2013-01-19      500      4         9         NA
2013-01-23     1040      3         9         NA
2013-01-28      415      5         9          0
2013-01-31      650      3         9         NA
2013-02-04      725      4         9          1
2013-02-07      450      3         9         NA
2013-02-11      550      4         9          0

1 个解决方案

#1


0  

With non-equi join as suggested by Gregor you can try something along

根据Gregor的建议,你可以尝试一些东西

library(data.table)
setDT(data)[, winning := setDT(matches)[data[, .(upper = Date, lower = Date - diff)], 
                                        on = .(Date < upper, Date > lower)]$winning][]
          Date Weight diff Loc.nr winning
 1: 2013-01-24   1040    7      2       1
 2: 2013-01-31   1000    7      2       0
 3: 2013-02-07   1185    7      2       1
 4: 2013-02-14    915    7      2       0
 5: 2013-02-21   1090    7      2       1
 6: 2013-03-01   1065    9      2       0
 7: 2013-01-19    500    4      9      NA
 8: 2013-01-23   1040    3      9      NA
 9: 2013-01-28    415    5      9       0
10: 2013-01-31    650    3      9      NA
11: 2013-02-04    725    4      9       1
12: 2013-02-07    450    3      9      NA
13: 2013-02-11    550    4      9       0

#1


0  

With non-equi join as suggested by Gregor you can try something along

根据Gregor的建议,你可以尝试一些东西

library(data.table)
setDT(data)[, winning := setDT(matches)[data[, .(upper = Date, lower = Date - diff)], 
                                        on = .(Date < upper, Date > lower)]$winning][]
          Date Weight diff Loc.nr winning
 1: 2013-01-24   1040    7      2       1
 2: 2013-01-31   1000    7      2       0
 3: 2013-02-07   1185    7      2       1
 4: 2013-02-14    915    7      2       0
 5: 2013-02-21   1090    7      2       1
 6: 2013-03-01   1065    9      2       0
 7: 2013-01-19    500    4      9      NA
 8: 2013-01-23   1040    3      9      NA
 9: 2013-01-28    415    5      9       0
10: 2013-01-31    650    3      9      NA
11: 2013-02-04    725    4      9       1
12: 2013-02-07    450    3      9      NA
13: 2013-02-11    550    4      9       0