How does the DISTINCT clause work in SQL (SQL Server 2008, if that makes a difference) on a query that joins multiple tables?
Specifically, how does the SQL engine handle a query with a DISTINCT clause?
The reason I'm asking is that a far more experienced colleague told me that SQL applies DISTINCT to every field of every table. That seems unlikely to me, but I want to make sure.
For example, given these three tables (two entity tables and a link table between them):
CREATE TABLE users
(
    u_id INT PRIMARY KEY,
    u_name VARCHAR(30),
    u_password VARCHAR(30)
);

CREATE TABLE roles
(
    r_id INT PRIMARY KEY,
    r_name VARCHAR(30)
);

CREATE TABLE users_l_roles
(
    u_id INT FOREIGN KEY REFERENCES users(u_id),
    r_id INT FOREIGN KEY REFERENCES roles(r_id)
);
And then this query:
SELECT u_name
FROM users
INNER JOIN users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN roles ON users_l_roles.r_id = roles.r_id
Assuming there is a user with two roles, the above query will return two records with the same user name.
But this query with DISTINCT:
SELECT DISTINCT u_name
FROM users
INNER JOIN users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN roles ON users_l_roles.r_id = roles.r_id
will return only one user name.
The question is whether SQL compares all the fields from all the joined tables (u_id, u_name, u_password, r_id, r_name), or compares only the fields named in the query (u_name) and de-duplicates the results on those.
3 answers
#1
16
DISTINCT filters out duplicate combinations of the fields you return; columns that are not in the SELECT list are not compared.
A really simplified way to look at it is:
- It builds your overall result set (including duplicates) based on your FROM and WHERE clauses
- It sorts that result set based on the fields you want to return
- It removes any duplicate values in those fields
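The three steps above can be sketched in plain Python. This is only an illustration of the simplified mental model, not how SQL Server actually executes the query (an optimizer may well choose a hash-based strategy instead of a sort). The sample rows ("alice" with two roles, "bob" with one) are hypothetical and not part of the question.

```python
from itertools import groupby

# Hypothetical sample data mirroring the question's tables.
users = [(1, "alice", "pw1"), (2, "bob", "pw2")]   # (u_id, u_name, u_password)
roles = [(10, "admin"), (20, "editor")]            # (r_id, r_name)
users_l_roles = [(1, 10), (1, 20), (2, 20)]        # (u_id, r_id)

# Step 1: build the joined result set, duplicates included.
joined = [
    (user, role)
    for user in users
    for (link_u_id, link_r_id) in users_l_roles
    if link_u_id == user[0]
    for role in roles
    if role[0] == link_r_id
]

# Keep only the returned field (u_name); the other columns are dropped
# before any comparison happens.
projected = [(user[1],) for (user, role) in joined]

# Step 2: sort on the returned fields so equal rows become adjacent.
projected.sort()

# Step 3: collapse each run of equal rows into a single row.
distinct_rows = [row for row, _ in groupby(projected)]

print(projected)      # three rows, "alice" appears twice
print(distinct_rows)  # one row per distinct u_name
```

Note that u_id, u_password, r_id and r_name never take part in the comparison: only the projected u_name tuples are sorted and de-duplicated.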
It's semantically equivalent to a GROUP BY where all returned fields are in the GROUP BY clause.
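A quick way to check the GROUP BY equivalence is to run both forms against the same data. The sketch below uses Python's built-in sqlite3 module as a stand-in, since SQL Server 2008 is not available here; the table definitions are adapted slightly for SQLite, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (u_id INTEGER PRIMARY KEY, u_name VARCHAR(30), u_password VARCHAR(30));
CREATE TABLE roles (r_id INTEGER PRIMARY KEY, r_name VARCHAR(30));
CREATE TABLE users_l_roles (
    u_id INT REFERENCES users(u_id),
    r_id INT REFERENCES roles(r_id)
);
-- Hypothetical sample data: alice has two roles, bob has one.
INSERT INTO users VALUES (1, 'alice', 'pw1'), (2, 'bob', 'pw2');
INSERT INTO roles VALUES (10, 'admin'), (20, 'editor');
INSERT INTO users_l_roles VALUES (1, 10), (1, 20), (2, 20);
""")

# The join from the question, shared by both queries.
join_clause = """
FROM users
INNER JOIN users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN roles ON users_l_roles.r_id = roles.r_id
"""

distinct_rows = conn.execute("SELECT DISTINCT u_name" + join_clause).fetchall()
grouped_rows = conn.execute(
    "SELECT u_name" + join_clause + "GROUP BY u_name"
).fetchall()

# Row order is not guaranteed without ORDER BY, so compare sorted copies.
print(sorted(distinct_rows) == sorted(grouped_rows))
```

Both queries return one row per user name. The comparison is done on sorted copies because neither DISTINCT nor GROUP BY promises any particular output order.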
#2
3
DISTINCT simply de-duplicates the resultant recordset after all other query operations have been performed. This article has more detail.
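To see that the de-duplication applies to the final recordset, and therefore only to the columns in the SELECT list, it helps to compare row counts for different select lists over the same join. This is a minimal sketch using Python's sqlite3 module as a stand-in for SQL Server 2008, with hypothetical sample rows; the behaviour shown is standard SQL, but the exact DDL is adapted for SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (u_id INTEGER PRIMARY KEY, u_name VARCHAR(30), u_password VARCHAR(30));
CREATE TABLE roles (r_id INTEGER PRIMARY KEY, r_name VARCHAR(30));
CREATE TABLE users_l_roles (
    u_id INT REFERENCES users(u_id),
    r_id INT REFERENCES roles(r_id)
);
-- Hypothetical sample data: alice has two roles, bob has one.
INSERT INTO users VALUES (1, 'alice', 'pw1'), (2, 'bob', 'pw2');
INSERT INTO roles VALUES (10, 'admin'), (20, 'editor');
INSERT INTO users_l_roles VALUES (1, 10), (1, 20), (2, 20);
""")

join_clause = """
FROM users
INNER JOIN users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN roles ON users_l_roles.r_id = roles.r_id
"""

# No DISTINCT: one row per (user, role) pair produced by the join.
plain = conn.execute("SELECT u_name" + join_clause).fetchall()

# DISTINCT on u_name only: rows are compared on u_name alone.
by_name = conn.execute("SELECT DISTINCT u_name" + join_clause).fetchall()

# DISTINCT on (u_name, r_name): both selected columns are compared.
by_name_and_role = conn.execute(
    "SELECT DISTINCT u_name, r_name" + join_clause
).fetchall()

print(len(plain), len(by_name), len(by_name_and_role))
```

If DISTINCT compared every field of every joined table, the second query would still return alice twice, because her two joined rows differ in r_id and r_name. It returns her once, and she only reappears when r_name is added to the select list.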
#3
0
It first selects all the available records, then removes the duplicate records among them and returns the result.