How do I select rows from a data frame that do not match?

Time: 2022-06-26 07:21:44

I'm trying to identify the values in a data frame that do not match, but can't figure out how to do this.

# make data frame 
a <- data.frame( x =  c(1,2,3,4)) 
b <- data.frame( y =  c(1,2,3,4,5,6))

# select only values from b that are not in 'a'
# attempt 1: 
results1 <- b$y[ !a$x ]

# attempt 2:  
results2 <- b[b$y != a$x,]

If a = c(1,2,3) this works, because the length of b is then a multiple of the length of a. However, I just want to select all the values from b$y that are not in a$x, and I don't understand what function to use.
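
For readers wondering why neither attempt does the job, here is a minimal sketch of what each expression computes in base R. The reordered vector `shuffled` is a hypothetical addition for illustration and is not part of the original question.

```r
a <- data.frame(x = c(1, 2, 3, 4))
b <- data.frame(y = c(1, 2, 3, 4, 5, 6))

# Attempt 1: `!` coerces numbers to logical (non-zero becomes TRUE),
# so !a$x is all FALSE and the index selects nothing.
attempt1 <- b$y[!a$x]
attempt1   # numeric(0)

# Attempt 2: `!=` compares element by element and recycles the shorter
# vector; it never asks "is this value anywhere in a$x?".
# `shuffled` (hypothetical) holds the same values as b$y in another order.
shuffled <- c(5, 6, 1, 2, 3, 4)
mismatch <- suppressWarnings(shuffled != a$x)   # a$x recycled to 1 2 3 4 1 2
shuffled[mismatch]   # every value is returned, including 1 to 4
```

Because the comparison is positional, the result depends on the order and lengths of the two vectors rather than on membership.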

2 Answers

#1


39  

If I understand correctly, you need the negation of the %in% operator. Something like this should work:

subset(b, !(y %in% a$x))

> subset(b, !(y %in% a$x))
  y
5 5
6 6
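
The same selection can also be written with bracket indexing instead of subset(). This is only a sketch of an equivalent form; the names `not_in_a` and `unmatched` are illustrative and do not come from the answer.

```r
a <- data.frame(x = c(1, 2, 3, 4))
b <- data.frame(y = c(1, 2, 3, 4, 5, 6))

# TRUE for each value of b$y that appears nowhere in a$x
not_in_a <- !(b$y %in% a$x)

# drop = FALSE keeps the result a data frame even though b has one column
unmatched <- b[not_in_a, , drop = FALSE]
unmatched
```

Without drop = FALSE, indexing a single-column data frame this way returns a plain vector, which may or may not be what you want.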

#2


19  

Try the set difference function setdiff. You would have:

results1 = setdiff(a$x, b$y)   # elements in a$x NOT in b$y
results2 = setdiff(b$y, a$x)   # elements in b$y NOT in a$x
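
One difference worth keeping in mind: setdiff works on vectors and treats its inputs as sets, so it returns a plain vector with duplicates removed. The sketch below assumes a variant of b in which 5 appears twice (not part of the original data) to show how this differs from %in% indexing.

```r
a <- data.frame(x = c(1, 2, 3, 4))
b <- data.frame(y = c(1, 2, 3, 4, 5, 5, 6))   # hypothetical: 5 appears twice

# setdiff keeps each unmatched value once
by_setdiff <- setdiff(b$y, a$x)
by_setdiff   # 5 6

# %in% indexing keeps every unmatched element, duplicates included
by_in <- b$y[!(b$y %in% a$x)]
by_in        # 5 5 6
```

If you need the unmatched rows of the data frame rather than the distinct values, the %in% form is likely the better fit.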
