How does the SQL DISTINCT clause work?

I'm looking for an explanation of how the DISTINCT clause works in SQL (SQL Server 2008, if that makes a difference) on a query that joins multiple tables.

I mean, how does the SQL engine handle a query with a DISTINCT clause?

The reason I'm asking is that a far more experienced colleague told me that SQL applies DISTINCT to every field of every table. That seems unlikely to me, but I want to make sure.

For example, given these three tables:

CREATE TABLE users
(
    u_id INT PRIMARY KEY,
    u_name VARCHAR(30),
    u_password VARCHAR(30)
);

CREATE TABLE roles
(
    r_id INT PRIMARY KEY,
    r_name VARCHAR(30)
);

CREATE TABLE users_l_roles
(
    u_id INT FOREIGN KEY REFERENCES users(u_id),
    r_id INT FOREIGN KEY REFERENCES roles(r_id)
);

And then this query:

SELECT          u_name
FROM            users 
INNER JOIN      users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN      roles ON users_l_roles.r_id = roles.r_id

Assuming there is a user with two roles, the above query will return two records with the same user name.

But this query with DISTINCT:

SELECT DISTINCT u_name
FROM            users 
INNER JOIN      users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN      roles ON users_l_roles.r_id = roles.r_id

will return only one user name.

The question is whether SQL compares all the fields from all the joined tables (u_id, u_name, u_password, r_id, r_name), or compares only the fields named in the query (u_name) and de-duplicates the results on those.

3 Answers

#1

DISTINCT filters out duplicate values of your returned fields.

A really simplified way to look at it is:

1. It builds your overall result set (including duplicates) based on your FROM and WHERE clauses
2. It sorts that result set based on the fields you want to return
3. It removes any duplicate values in those fields

It's semantically equivalent to a GROUP BY where all returned fields are in the GROUP BY clause.
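
To check this without a SQL Server instance, here is a minimal sketch that runs the question's schema in SQLite through Python's built-in sqlite3 module. SQLite stands in for SQL Server 2008 here, the column types are simplified, and the sample rows (two different users who share the name 'alice') are made up for illustration. If DISTINCT compared every field of every table, the two users would stay separate; they collapse into one row, and the GROUP BY form gives the same result.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption of this sketch)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (u_id INTEGER PRIMARY KEY, u_name TEXT, u_password TEXT);
    CREATE TABLE roles (r_id INTEGER PRIMARY KEY, r_name TEXT);
    CREATE TABLE users_l_roles (
        u_id INTEGER REFERENCES users(u_id),
        r_id INTEGER REFERENCES roles(r_id)
    );
    -- Hypothetical sample data: two distinct users with the same name
    INSERT INTO users VALUES (1, 'alice', 'pw1'), (2, 'alice', 'pw2');
    INSERT INTO roles VALUES (10, 'admin'), (20, 'editor');
    INSERT INTO users_l_roles VALUES (1, 10), (1, 20), (2, 10);
""")

# The join from the question, shared by all three queries
JOIN = """
    FROM users
    INNER JOIN users_l_roles ON users.u_id = users_l_roles.u_id
    INNER JOIN roles ON users_l_roles.r_id = roles.r_id
"""

plain = conn.execute("SELECT u_name" + JOIN).fetchall()
distinct_rows = conn.execute("SELECT DISTINCT u_name" + JOIN).fetchall()
grouped_rows = conn.execute("SELECT u_name" + JOIN + "GROUP BY u_name").fetchall()

print(len(plain))      # one row per (user, role) pair produced by the join
print(distinct_rows)   # de-duplicated on u_name only
print(grouped_rows)    # same rows as the DISTINCT query
```

Only u_name is compared: the differing u_id, u_password, and r_id values never enter into it because they are not in the SELECT list.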

#2

DISTINCT simply de-duplicates the resulting record set after the joins, filters, and SELECT list have been evaluated (logically, just before ORDER BY and TOP are applied). This article has more detail.
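
A quick way to see that the de-duplication works on the final SELECT list is to change that list and watch the row count change. The sketch below again uses SQLite via Python's sqlite3 as a stand-in for SQL Server; the helper name run and the sample data are assumptions made for this example only.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption of this sketch)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (u_id INTEGER PRIMARY KEY, u_name TEXT, u_password TEXT);
    CREATE TABLE roles (r_id INTEGER PRIMARY KEY, r_name TEXT);
    CREATE TABLE users_l_roles (
        u_id INTEGER REFERENCES users(u_id),
        r_id INTEGER REFERENCES roles(r_id)
    );
    -- Hypothetical sample data: one user holding two roles
    INSERT INTO users VALUES (1, 'alice', 'secret');
    INSERT INTO roles VALUES (10, 'admin'), (20, 'editor');
    INSERT INTO users_l_roles VALUES (1, 10), (1, 20);
""")

def run(select_list):
    """Run the question's join with DISTINCT over the given SELECT list."""
    sql = f"""
        SELECT DISTINCT {select_list}
        FROM users
        INNER JOIN users_l_roles ON users.u_id = users_l_roles.u_id
        INNER JOIN roles ON users_l_roles.r_id = roles.r_id
        ORDER BY {select_list}
    """
    return conn.execute(sql).fetchall()

names_only = run("u_name")
names_and_roles = run("u_name, r_name")

print(names_only)        # the two joined rows collapse into one
print(names_and_roles)   # the rows differ in r_name, so both survive
```

The same join yields one row or two depending only on which columns are selected, which is what you would expect if DISTINCT looks at the projected rows and nothing else.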

#3

It first selects all the matching records, then removes the duplicates among them and returns what is left.
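
As a rough mental model only (the real engine may sort or hash, and the optimizer picks the plan), the "select, then remove duplicates" description can be sketched in plain Python. The function name select_distinct and the sample rows are hypothetical.

```python
# Joined rows as they might look before the SELECT list is applied (made-up data)
joined_rows = [
    {"u_id": 1, "u_name": "alice", "u_password": "secret", "r_id": 10, "r_name": "admin"},
    {"u_id": 1, "u_name": "alice", "u_password": "secret", "r_id": 20, "r_name": "editor"},
]

def select_distinct(rows, columns):
    """Project each row onto the requested columns, then drop repeated tuples."""
    projected = [tuple(row[c] for c in columns) for row in rows]
    # dict.fromkeys keeps the first occurrence of each tuple, in order
    return list(dict.fromkeys(projected))

result = select_distinct(joined_rows, ["u_name"])
print(result)
```

The columns that were not requested are discarded during the projection step, so they cannot influence which rows count as duplicates.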
