Say I have two tables:
说我有两张桌子:
library(data.table)
set.seed(1)
tab1 <- data.table(
let = rep(letters[1:2], each = 3),
num = rep(1:3, 2),
val = rnorm(6),
key = c("let", "num")
)
tab2 <- data.table(
let = rep(letters[1:2], each = 2),
num = rep(1:2, 2),
val = rnorm(4),
key = c("let", "num")
)
Table 1:
> tab1
let num val
1: a 1 -0.6264538
2: a 2 0.1836433
3: a 3 -0.8356286
4: b 1 1.5952808
5: b 2 0.3295078
6: b 3 -0.8204684
Table 2:
> tab2
let num
1: a 1
2: a 2
3: b 1
4: b 2
Is there a way to "merge" these tables such that I get all the results in tab1
that are not in tab2
?:
有没有办法“合并”这些表,以便我得到tab1中不在tab2中的所有结果?:
let num val
1: a 3 -0.8356286
2: b 3 -0.8204684
2 个解决方案
#1
12
In this case, it's equivalent to an anti join:
在这种情况下,它相当于反连接:
tab1[!tab2, on=c("let", "num")]
But setdiff()
would only the first row for every let,num
. This is marked for v1.9.8, FR #547.
但是setdiff()只会是每个let的第一行,num。这标记为v1.9.8,FR#547。
#2
0
One solution would be to do a merge and remove the rows where there are values from tab2
一种解决方案是进行合并并删除tab2中有值的行
d<-as.data.frame(merge(tab1,tab2,all=T))
t<-is.na(d[,4])
d[t,][,-4]
let num val.x
3 a 3 -0.8356286
6 b 3 -0.8204684
Using data.table
:
使用data.table:
merge(tab1,tab2,all=T)[is.na(val.y),1:3,with=F]
let num val.x
1: a 3 -0.8356286
2: b 3 -0.8204684
#1
12
In this case, it's equivalent to an anti join:
在这种情况下,它相当于反连接:
tab1[!tab2, on=c("let", "num")]
But setdiff()
would only the first row for every let,num
. This is marked for v1.9.8, FR #547.
但是setdiff()只会是每个let的第一行,num。这标记为v1.9.8,FR#547。
#2
0
One solution would be to do a merge and remove the rows where there are values from tab2
一种解决方案是进行合并并删除tab2中有值的行
d<-as.data.frame(merge(tab1,tab2,all=T))
t<-is.na(d[,4])
d[t,][,-4]
let num val.x
3 a 3 -0.8356286
6 b 3 -0.8204684
Using data.table
:
使用data.table:
merge(tab1,tab2,all=T)[is.na(val.y),1:3,with=F]
let num val.x
1: a 3 -0.8356286
2: b 3 -0.8204684