为什么“SELECT DISTINCT a, b FROM…”返回的记录要比“SELECT DISTINCT a + '|' + b FROM……”

时间:2022-10-13 17:01:53

I have a query that's selecting a bunch of fields related to names and addresses of customers but it boils down to:

我有一个查询,它选择了一些与客户名称和地址相关的字段,但归结为:

SELECT DISTINCT a, b, c, ... FROM big_dumb_flat_table

it returns a bunch of records (10986590). When I replace the commas in the select-list to format it as a pipe-separated concatenated string:

它返回一串记录(10986590)。当我在选择列表中替换逗号时,将其格式化为由管道分隔的连接字符串:

SELECT DISTINCT a + '|' + b + '|' + c + '|' + ... FROM big_dumb_flat_table

it's returning 248 more records. I've reassured myself that there are no pipes in any of the fields that could be screwing the fidelity of the returned set. What's going on here?

还有248条记录。我向自己保证,在任何一个可能会使返回的集合的保真度发生变化的领域都没有管道。这里发生了什么?

2 个解决方案

#1


10  

Trailing spaces could cause this. For string comparisons these are ignored.

尾随空格可能导致这个。对于字符串比较,这些都被忽略。

CREATE TABLE #T
(
a varchar(10),
b varchar(10),
c varchar(10)
)

INSERT INTO #T
SELECT 'a ' as a, 'b' as b, 'c ' as c union all
SELECT 'a' as a, 'b' as b, 'c ' as c

SELECT DISTINCT a, b, c  
FROM #T /*1 result*/

SELECT DISTINCT a + '|' + b + '|' + c + '|'   
FROM #T /*2 results*/


SELECT DISTINCT LTRIM(RTRIM(a)) + '|' + LTRIM(RTRIM(b)) + '|' +
                LTRIM(RTRIM(c)) + '|'   
FROM #T /*1 result*/

#2


2  

The really aren't any scenarios that I can think of that would get you MORE records, only fewer. I would simplify the query by only selecting a + '|', then add more columns as you go.

我能想到的并不是那些能给你带来更多记录的场景,只有更少的。我只选择一个+ '|'来简化查询,然后再添加更多的列。

#1


10  

Trailing spaces could cause this. For string comparisons these are ignored.

尾随空格可能导致这个。对于字符串比较,这些都被忽略。

CREATE TABLE #T
(
a varchar(10),
b varchar(10),
c varchar(10)
)

INSERT INTO #T
SELECT 'a ' as a, 'b' as b, 'c ' as c union all
SELECT 'a' as a, 'b' as b, 'c ' as c

SELECT DISTINCT a, b, c  
FROM #T /*1 result*/

SELECT DISTINCT a + '|' + b + '|' + c + '|'   
FROM #T /*2 results*/


SELECT DISTINCT LTRIM(RTRIM(a)) + '|' + LTRIM(RTRIM(b)) + '|' +
                LTRIM(RTRIM(c)) + '|'   
FROM #T /*1 result*/

#2


2  

The really aren't any scenarios that I can think of that would get you MORE records, only fewer. I would simplify the query by only selecting a + '|', then add more columns as you go.

我能想到的并不是那些能给你带来更多记录的场景,只有更少的。我只选择一个+ '|'来简化查询,然后再添加更多的列。