Efficient way to select all values from one column that are not in another column

Date: 2021-05-20 07:54:59

I need to return all values from colA that are not in colB from mytable. I am using:

SELECT DISTINCT(colA) FROM mytable WHERE colA NOT IN (SELECT colB FROM mytable)

It works, but the query takes an excessively long time to complete.

Is there a more efficient way to do this?

3 solutions

#1


13  

In standard SQL there are no parentheses in DISTINCT colA. DISTINCT is not a function.

SELECT DISTINCT colA
FROM   mytable
WHERE  colA NOT IN (SELECT DISTINCT colB FROM mytable);

Added DISTINCT to the sub-select as well. If you have many duplicates it could speed up the query.

A CTE might be faster, depending on your DBMS. I additionally demonstrate LEFT JOIN as an alternative way to exclude the values in colB, and GROUP BY as an alternative way to get distinct values:

WITH x AS (SELECT colB FROM mytable GROUP BY colB)
SELECT m.colA
FROM   mytable m
LEFT   JOIN x ON x.colB = m.colA
WHERE  x.colB IS NULL
GROUP  BY m.colA;

Or, simplified further, without the CTE and joining the table to itself directly (probably fastest):

SELECT DISTINCT m.colA
FROM   mytable m
LEFT   JOIN mytable x ON x.colB = m.colA
WHERE  x.colB IS NULL;
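
To check that the CTE form and the plain self-join return the same rows, here is a minimal sketch using Python's built-in sqlite3 module as a throwaway database. The sample rows and the `run` helper are made up for illustration, and an `ORDER BY` is added to both queries only to make the output stable. Timings on SQLite say nothing about your DBMS, so this only confirms the results, not the speed.

```python
import sqlite3

# Hypothetical sample rows (colA, colB); only the value 4 never appears in colB.
ROWS = [(1, 2), (2, 3), (4, 2), (4, 1), (5, 5)]

CTE_QUERY = """
WITH x AS (SELECT colB FROM mytable GROUP BY colB)
SELECT m.colA
FROM   mytable m
LEFT   JOIN x ON x.colB = m.colA
WHERE  x.colB IS NULL
GROUP  BY m.colA
ORDER  BY m.colA
"""

JOIN_QUERY = """
SELECT DISTINCT m.colA
FROM   mytable m
LEFT   JOIN mytable x ON x.colB = m.colA
WHERE  x.colB IS NULL
ORDER  BY m.colA
"""

def run(query, rows):
    """Load rows into an in-memory table and return the first result column."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE mytable (colA INTEGER, colB INTEGER)")
        conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
        return [r[0] for r in conn.execute(query)]
    finally:
        conn.close()

cte_result = run(CTE_QUERY, ROWS)
join_result = run(JOIN_QUERY, ROWS)
print(cte_result, join_result)  # [4] [4]
```

The value 4 occurs twice in colA but is reported once, because GROUP BY in the first query and DISTINCT in the second collapse the duplicates.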

There are basically 4 techniques to exclude rows with keys present in another (or the same) table: NOT IN, NOT EXISTS, LEFT JOIN ... IS NULL, and EXCEPT.

The deciding factor for speed will be indexes. You need to have indexes on colA and colB for this query to be fast.
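
As a rough illustration of the indexing advice, the sketch below creates one index per column and prints the query plan, again using Python's sqlite3 module as a stand-in database. The index names and the generated data are assumptions made for the example. The plan text differs between SQLite versions and between database systems, so treat the printed plan as something to inspect rather than a fixed result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (colA INTEGER, colB INTEGER)")

# Made-up data: colA runs 0..999 and colB runs 1..1000,
# so 0 is the only colA value missing from colB.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(i, i + 1) for i in range(1000)],
)

# Hypothetical index names; one plain index per column.
conn.execute("CREATE INDEX mytable_cola_idx ON mytable (colA)")
conn.execute("CREATE INDEX mytable_colb_idx ON mytable (colB)")

QUERY = """
SELECT DISTINCT m.colA
FROM   mytable m
LEFT   JOIN mytable x ON x.colB = m.colA
WHERE  x.colB IS NULL
"""

# The last field of each plan row is the human-readable description.
for plan_row in conn.execute("EXPLAIN QUERY PLAN " + QUERY):
    print(plan_row[-1])

missing = [r[0] for r in conn.execute(QUERY)]
index_names = sorted(
    r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)
print(missing)  # [0]
conn.close()
```

If the plan shows a full scan of the inner table instead of an index lookup on colB, that is usually the first thing to investigate.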

#2


6  

You can use exists:

select distinct
    colA
from
    mytable m1
where
    not exists (select 1 from mytable m2 where m2.colB = m1.colA)

EXISTS does a semi-join to quickly match the values. NOT IN completes the entire subquery result set and then does an OR comparison across it. EXISTS is typically faster when the values come from a table.
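
Apart from speed, the two forms can return different rows when colB contains NULL: under SQL's three-valued logic, NOT IN against a set that includes NULL never evaluates to true, while NOT EXISTS is unaffected. The sketch below shows this with Python's sqlite3 module; the sample rows and the `run` helper are made up for illustration.

```python
import sqlite3

NOT_EXISTS_QUERY = """
SELECT DISTINCT colA
FROM   mytable m1
WHERE  NOT EXISTS (SELECT 1 FROM mytable m2 WHERE m2.colB = m1.colA)
ORDER  BY colA
"""

NOT_IN_QUERY = """
SELECT DISTINCT colA
FROM   mytable
WHERE  colA NOT IN (SELECT colB FROM mytable)
ORDER  BY colA
"""

def run(query, rows):
    """Load rows into an in-memory table and return the first result column."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE mytable (colA INTEGER, colB INTEGER)")
        conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
        return [r[0] for r in conn.execute(query)]
    finally:
        conn.close()

# Hypothetical rows: colB holds 2, NULL and 1, so only colA = 3 is missing.
rows_with_null = [(1, 2), (2, None), (3, 1)]

not_exists_result = run(NOT_EXISTS_QUERY, rows_with_null)
not_in_result = run(NOT_IN_QUERY, rows_with_null)
print(not_exists_result)  # [3]
print(not_in_result)      # [] because 3 NOT IN (2, NULL, 1) evaluates to NULL
```

If colB is declared NOT NULL, or the subquery filters with `WHERE colB IS NOT NULL`, both queries return the same rows.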

#3


0  

You can use the EXCEPT operator, which effectively diffs two SELECT queries. EXCEPT DISTINCT returns only unique values. Oracle's MINUS operator is equivalent to EXCEPT DISTINCT.
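
Applied to the question's table, the EXCEPT approach might look like the sketch below, run through Python's sqlite3 module. SQLite only accepts the plain `EXCEPT` spelling, which already removes duplicates; the sample rows are made up for illustration.

```python
import sqlite3

EXCEPT_QUERY = """
SELECT colA FROM mytable
EXCEPT
SELECT colB FROM mytable
ORDER BY 1
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (colA INTEGER, colB INTEGER)")

# Hypothetical rows: colA holds 1, 4, 4, 7, 9 and colB holds 2, 2, 1, 7, 1.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 2), (4, 2), (4, 1), (7, 7), (9, 1)],
)

except_result = [r[0] for r in conn.execute(EXCEPT_QUERY)]
print(except_result)  # [4, 9]
conn.close()
```

The value 4 appears twice in colA but only once in the output, so no separate DISTINCT is needed on the left-hand SELECT.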
