SQL query help with non-unique duplicates

Posted: 2021-11-26 22:56:34

I can't think through this one. I have this query:

SELECT 
    p.person_id,
    p.first_nm,
    p.last_nm, 
    pu.purchase_dt,
    pr.sku, 
    pr.description,
    a.address_type_id,
    a.city_cd, 
    a.state_cd, 
    a.postal_cd
FROM 
    person p 
    INNER JOIN address a ON p.person_id = a.person_id
    INNER JOIN purchase pu ON pu.person_id = p.person_id
    INNER JOIN product pr ON pr.product_id = pu.product_id

Simple enough: I just need to get the information for customers that we've shipped returns to. However, because of the AddressType table

AddressType

address_type_id    address_type_desc
------------------------------------
1                  Home
2                  Shipping

some customers have multiple addresses in the address table, creating non-unique duplicate entries like this.

1,Smith, John, 12/01/2009, A12345, Purple Widget, 1, Anywhere, CA, 12345
1,Smith, John, 12/01/2009, A12345, Purple Widget, 2, Somewhere, ID, 54321

I'd like the query to return just one row per person: the home address if available, otherwise the shipping address.

This seems simple enough, and maybe it's just my cold, but this is causing me to scratch my head somewhat.

3 solutions

#1


5  

SELECT 
    p.person_id,
    p.first_nm,
    p.last_nm, 
    pu.purchase_dt,
    pr.sku, 
    pr.description,
    COALESCE(ha.address_type_id, sa.address_type_id) AS address_type_id,
    CASE WHEN ha.address_type_id IS NOT NULL THEN ha.city_cd ELSE sa.city_cd END AS city_cd, 
    CASE WHEN ha.address_type_id IS NOT NULL THEN ha.state_cd ELSE sa.state_cd END AS state_cd, 
    CASE WHEN ha.address_type_id IS NOT NULL THEN ha.postal_cd ELSE sa.postal_cd END AS postal_cd
FROM 
    person p 
    LEFT JOIN address ha ON p.person_id = ha.person_id AND ha.address_type_id = 1
    LEFT JOIN address sa ON p.person_id = sa.person_id AND sa.address_type_id = 2
    INNER JOIN purchase pu ON pu.person_id = p.person_id
    INNER JOIN product pr ON pr.product_id = pu.product_id
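
To check how the two LEFT JOINs behave, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. The sample rows (including the second customer, who has only a shipping address) are made up for illustration, and SQLite is only a stand-in for whatever database the question is actually using.

```python
import sqlite3

# Hypothetical sample data modeled on the rows shown in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person   (person_id INTEGER PRIMARY KEY, first_nm TEXT, last_nm TEXT);
CREATE TABLE address  (person_id INTEGER, address_type_id INTEGER,
                       city_cd TEXT, state_cd TEXT, postal_cd TEXT);
CREATE TABLE product  (product_id INTEGER PRIMARY KEY, sku TEXT, description TEXT);
CREATE TABLE purchase (person_id INTEGER, product_id INTEGER, purchase_dt TEXT);

INSERT INTO person   VALUES (1, 'John', 'Smith'), (2, 'Jane', 'Doe');
INSERT INTO address  VALUES (1, 1, 'Anywhere',  'CA', '12345'),
                            (1, 2, 'Somewhere', 'ID', '54321'),
                            (2, 2, 'Elsewhere', 'OR', '97000');
INSERT INTO product  VALUES (10, 'A12345', 'Purple Widget');
INSERT INTO purchase VALUES (1, 10, '2009-12-01'), (2, 10, '2009-12-02');
""")

# One alias per address type: ha = home (1), sa = shipping (2).
# The home columns win whenever a home row exists.
home_first_rows = conn.execute("""
SELECT p.person_id,
       p.last_nm,
       COALESCE(ha.address_type_id, sa.address_type_id) AS address_type_id,
       CASE WHEN ha.address_type_id IS NOT NULL THEN ha.city_cd   ELSE sa.city_cd   END AS city_cd,
       CASE WHEN ha.address_type_id IS NOT NULL THEN ha.state_cd  ELSE sa.state_cd  END AS state_cd,
       CASE WHEN ha.address_type_id IS NOT NULL THEN ha.postal_cd ELSE sa.postal_cd END AS postal_cd
FROM person p
     LEFT JOIN address ha ON p.person_id = ha.person_id AND ha.address_type_id = 1
     LEFT JOIN address sa ON p.person_id = sa.person_id AND sa.address_type_id = 2
     INNER JOIN purchase pu ON pu.person_id = p.person_id
     INNER JOIN product pr ON pr.product_id = pu.product_id
ORDER BY p.person_id
""").fetchall()

for row in home_first_rows:
    print(row)
```

Note that this still produces one row per purchase, and it assumes each person has at most one address of each type; a person with two home addresses would be duplicated again.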

#2


6  

You want to change your join so it returns only the address with the min(address_type_id) for each person instead of all of them:

    INNER JOIN address a ON p.person_id = a.person_id
    INNER JOIN (SELECT person_id, MIN(address_type_id) AS min_addr
                FROM address
                GROUP BY person_id) a_min
        ON a.person_id = a_min.person_id AND a.address_type_id = a_min.min_addr
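
Dropped into the full query, the MIN() join might look like the sketch below, again run against an in-memory SQLite database with made-up sample rows. It relies on Home (1) having a lower address_type_id than Shipping (2), as in the AddressType table from the question; if the ids were ordered differently, MIN() would pick the wrong address.

```python
import sqlite3

# Hypothetical sample data: person 1 has both addresses, person 2 only shipping.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person   (person_id INTEGER PRIMARY KEY, first_nm TEXT, last_nm TEXT);
CREATE TABLE address  (person_id INTEGER, address_type_id INTEGER,
                       city_cd TEXT, state_cd TEXT, postal_cd TEXT);
CREATE TABLE product  (product_id INTEGER PRIMARY KEY, sku TEXT, description TEXT);
CREATE TABLE purchase (person_id INTEGER, product_id INTEGER, purchase_dt TEXT);

INSERT INTO person   VALUES (1, 'John', 'Smith'), (2, 'Jane', 'Doe');
INSERT INTO address  VALUES (1, 1, 'Anywhere',  'CA', '12345'),
                            (1, 2, 'Somewhere', 'ID', '54321'),
                            (2, 2, 'Elsewhere', 'OR', '97000');
INSERT INTO product  VALUES (10, 'A12345', 'Purple Widget');
INSERT INTO purchase VALUES (1, 10, '2009-12-01'), (2, 10, '2009-12-02');
""")

# a_min holds the lowest address_type_id per person; joining back on it
# keeps exactly that address row and drops the others.
min_type_rows = conn.execute("""
SELECT p.person_id, p.last_nm, a.address_type_id, a.city_cd
FROM person p
     INNER JOIN address a ON p.person_id = a.person_id
     INNER JOIN (SELECT person_id, MIN(address_type_id) AS min_addr
                 FROM address
                 GROUP BY person_id) a_min
         ON a.person_id = a_min.person_id AND a.address_type_id = a_min.min_addr
     INNER JOIN purchase pu ON pu.person_id = p.person_id
     INNER JOIN product pr ON pr.product_id = pu.product_id
ORDER BY p.person_id
""").fetchall()

for row in min_type_rows:
    print(row)
```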

#3


1  

If you are using SQL Server, or another database that supports common table expressions (CTEs), you could do the following. The CTE adds a row-number column that is partitioned by person and ordered by address_type_id. The main query is altered to return only row number 1 for each person from the CTE.

WITH cte AS
    (
    SELECT
         a.person_id,
         a.address_type_id,
         a.city_cd, 
         a.state_cd, 
         a.postal_cd,
         ROW_NUMBER() OVER (PARTITION BY a.person_id ORDER BY a.address_type_id) AS sequence
    FROM address a
    INNER JOIN AddressType at ON a.address_type_id = at.address_type_id
    )

    SELECT 
        p.person_id,
        p.first_nm,
        p.last_nm, 
        pu.purchase_dt,
        pr.sku, 
        pr.description,
        a.address_type_id,
        a.city_cd, 
        a.state_cd, 
        a.postal_cd
    FROM 
        person p 
        INNER JOIN cte a ON p.person_id = a.person_id
        INNER JOIN purchase pu ON pu.person_id = p.person_id
        INNER JOIN product pr ON pr.product_id = pu.product_id
    WHERE
        a.sequence = 1

By the way, if you have person records that have no addresses, you might want to change the INNER JOIN to an OUTER JOIN on the addresses table (cte in my answer). This may also be appropriate for joins to purchase and product if your requirements indicate so.
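
One thing to watch when switching to an outer join: a WHERE a.sequence = 1 filter would discard the NULL rows that the outer join produces, so the row-number condition has to move into the ON clause. The sketch below shows this with an in-memory SQLite database (window functions need SQLite 3.25 or later). The sample data, the person with no address, and the column name seq are assumptions for illustration, and the join to AddressType is left out because it isn't needed for the numbering.

```python
import sqlite3

# Hypothetical sample data: person 1 has two addresses, person 3 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person   (person_id INTEGER PRIMARY KEY, first_nm TEXT, last_nm TEXT);
CREATE TABLE address  (person_id INTEGER, address_type_id INTEGER,
                       city_cd TEXT, state_cd TEXT, postal_cd TEXT);
CREATE TABLE product  (product_id INTEGER PRIMARY KEY, sku TEXT, description TEXT);
CREATE TABLE purchase (person_id INTEGER, product_id INTEGER, purchase_dt TEXT);

INSERT INTO person   VALUES (1, 'John', 'Smith'), (3, 'Richard', 'Roe');
INSERT INTO address  VALUES (1, 1, 'Anywhere',  'CA', '12345'),
                            (1, 2, 'Somewhere', 'ID', '54321');
INSERT INTO product  VALUES (10, 'A12345', 'Purple Widget');
INSERT INTO purchase VALUES (1, 10, '2009-12-01'), (3, 10, '2009-12-03');
""")

# The seq = 1 test sits in the ON clause, not in WHERE, so a person
# without any address still comes back with NULL address columns.
outer_join_rows = conn.execute("""
WITH cte AS (
    SELECT a.person_id, a.address_type_id, a.city_cd, a.state_cd, a.postal_cd,
           ROW_NUMBER() OVER (PARTITION BY a.person_id
                              ORDER BY a.address_type_id) AS seq
    FROM address a
)
SELECT p.person_id, p.last_nm, a.address_type_id, a.city_cd
FROM person p
     LEFT JOIN cte a ON p.person_id = a.person_id AND a.seq = 1
     INNER JOIN purchase pu ON pu.person_id = p.person_id
     INNER JOIN product pr ON pr.product_id = pu.product_id
ORDER BY p.person_id
""").fetchall()

for row in outer_join_rows:
    print(row)
```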
