Left outer join query returns duplicates in SQL Server

I have table 1 (MID, SSN, ...), where MID is the primary key, and table 2 (ID, SSN, StateCode, ...), where ID and SSN together make up the primary key. I'm trying to display all columns from table 1 along with StateCode from table 2, matched on SSN. Table 1 has 50 rows, and some of them share the same SSN value.

If no SSN match is found in table 2, displaying NULL in StateCode is acceptable, so I chose a left join. Here is my query:

Select 
    tbl1.*, tbl2.StateCode
from 
    tbl1
left outer join 
    tbl2 on tbl1.SSN = tbl2.SSN

I expect to retrieve 50 records, but I get 70. Rows that contain the same SSN value in tbl1 end up duplicated in the final output. What is going wrong?

6 Solutions

#1

I'd suggest reading up on the Cartesian product.

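If you have 50 rows in the first table and 70 in the second, the full Cartesian product has 3,500 rows. The join condition tbl1.SSN = tbl2.SSN filters most of those out, but every tbl1 row is still returned once for each tbl2 row with the same SSN, so you may well end up with more than 50 rows.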
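
To make the row multiplication concrete, here is a minimal sketch with made-up data. The table variables @tbl1 and @tbl2 and all values in them are hypothetical stand-ins for the real tables; only the column names come from the question.

```sql
-- Hypothetical sample data; table variables stand in for tbl1 and tbl2.
DECLARE @tbl1 TABLE (MID int PRIMARY KEY, SSN char(9));
DECLARE @tbl2 TABLE (ID int, SSN char(9), StateCode char(2), PRIMARY KEY (ID, SSN));

INSERT INTO @tbl1 (MID, SSN)
VALUES (1, '111111111'), (2, '111111111'), (3, '222222222');

-- Two tbl2 rows share one SSN; nothing matches '222222222'.
INSERT INTO @tbl2 (ID, SSN, StateCode)
VALUES (10, '111111111', 'NY'), (11, '111111111', 'NJ');

-- MID 1 and MID 2 each match both tbl2 rows (2 x 2 = 4 rows),
-- and MID 3 is kept once with a NULL StateCode: 5 rows from 3.
SELECT t1.MID, t1.SSN, t2.StateCode
FROM @tbl1 AS t1
LEFT OUTER JOIN @tbl2 AS t2 ON t1.SSN = t2.SSN
ORDER BY t1.MID, t2.ID;
```

The extra rows come from tbl2 holding more than one row per SSN, which its composite (ID, SSN) key allows.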

Back to your problem: you can see what is happening by running the following:

SELECT 
  tbl1.*,
  (SELECT COUNT(*) FROM tbl2 WHERE tbl1.SSN = tbl2.SSN) AS NbResultTbl2
FROM 
  tbl1

This shows which rows of tbl1 have multiple matches in tbl2. If the NbResultTbl2 column holds a number higher than 1 for a row, that row will be duplicated in the join output.

To eliminate those duplicates, you can try this:

SELECT 
  tbl1.*,
  -- Alias added so the computed column has a name in the result set
  (SELECT TOP 1 StateCode FROM tbl2 WHERE tbl1.SSN = tbl2.SSN) AS StateCode
FROM 
  tbl1

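This returns a single StateCode from tbl2 for each matching SSN. Note that without an ORDER BY in the subquery, which of several matching rows counts as "first" is not guaranteed.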
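
If it matters which StateCode is kept, the TOP 1 lookup can be given an explicit order. The following is a sketch of the same idea written with OUTER APPLY. It assumes the tbl2 row with the lowest ID should win; that rule is an assumption for illustration, not something stated in the question.

```sql
-- Returns exactly one row per tbl1 row; StateCode is NULL when no SSN matches.
SELECT t1.*, s.StateCode
FROM tbl1 AS t1
OUTER APPLY (
    SELECT TOP (1) t2.StateCode
    FROM tbl2 AS t2
    WHERE t2.SSN = t1.SSN
    ORDER BY t2.ID   -- assumption: prefer the row with the lowest ID
) AS s;
```

OUTER APPLY keeps tbl1 rows that have no match, so it behaves like the left join but can never add rows.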

#2

Try using SELECT DISTINCT rather than a plain SELECT. SELECT DISTINCT removes rows that are identical in every selected column.

#3

Both of your tables also have an ID-style key column, so try matching on that column instead:

Select tbl1.MID,tbl1.SSN, tbl2.StateCode
from tbl1
left outer join tbl2 
on tbl1.MID= tbl2.ID
Group by tbl1.MID,tbl1.SSN, tbl2.StateCode

#4

This is too long for a comment.

"ID and SSN are both primary keys" . . . This statement indicates a lack of understanding of what a primary key is. A table can have only one primary key. A primary key can be composite (composed of more than one column), but there is only one.

If MID is the primary key for table1, then presumably multiple rows can have the same SSN.

Your query is:

Select *
from tbl1, tbl2.StateCode
from tbl1, tbl2 left outer join tbl2 on tbl1.SSN = tbl2.SSN

This is not even valid SQL. You might try this version:

Select distinct tbl1.*, tbl2.StateCode
from tbl1 left outer join
     tbl2
     on tbl1.SSN = tbl2.SSN;

This is valid and would appear to be what you want.

#5

Try grouping by SSN within Table2 and taking the MAX StateCode:

SELECT Table1.*, DT.StateCode
FROM Table1
LEFT OUTER JOIN (
    SELECT SSN, MAX(StateCode) AS StateCode FROM Table2 GROUP BY SSN
) DT ON Table1.SSN = DT.SSN
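
MAX simply keeps the StateCode that sorts last, so it is worth checking first whether any SSN actually carries more than one state. A small sketch of such a check, using the tbl2 name from the question (this answer calls the same table Table2):

```sql
-- Lists every SSN that is stored with more than one distinct StateCode.
SELECT SSN, COUNT(DISTINCT StateCode) AS DistinctStateCodes
FROM tbl2
GROUP BY SSN
HAVING COUNT(DISTINCT StateCode) > 1;
```

If this returns no rows, every SSN maps to a single StateCode and MAX does not discard any information.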

#6

To add to the comment by pixelbits: you can first select the distinct rows of tbl1 in a subquery (leaving out MID):

 select distinct ssn,* from tbl1

Then left join that to tbl2.

That should give you your 50 rows, unless columns other than MID also differ between the rows that share an SSN.
