Removing duplicates from the results of multiple joins on tables with different columns

Time: 2020-12-08 00:48:53

I am trying to write one statement that pulls data from 3 related tables (related in that they all share a common string index). I am having trouble preventing MySQL from returning the product of two of the tables, which makes the result set much larger than I want. Each table has a different number of columns, and I would prefer not to use UNION anyway, because the data in each table is separate.

Here is an example:

Table X is the main table and has fields A, B.

Table Y has fields A, C, D.

Table Z has fields A, E, F, G.

- - - - - -

My ideal result would have the form:

A1 B1 C1 D1 E1 F1 G1
A1 B2 C2 D2 00 00 00
A2 B3 C3 D3 E2 F2 G2
A2 B4 00 00 E3 F3 G3
etc...

- - - - - -

Here is the simplest SQL I have tried that shows my problem (that is, it returns the product of Y * Z indexed by data from A):

SELECT DISTINCT *
FROM X
LEFT JOIN Y USING (A)
LEFT JOIN Z USING (A)
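To see the multiplication concretely, here is a minimal sketch that runs the query above through Python's built-in sqlite3 module rather than MySQL. The sample rows are made up for illustration (one row in X, two in Y, three in Z, all keyed A1); only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE X (A TEXT, B TEXT);
    CREATE TABLE Y (A TEXT, C TEXT, D TEXT);
    CREATE TABLE Z (A TEXT, E TEXT, F TEXT, G TEXT);
    -- hypothetical sample data: every row shares the key A1
    INSERT INTO X VALUES ('A1', 'B1');
    INSERT INTO Y VALUES ('A1', 'C1', 'D1'), ('A1', 'C2', 'D2');
    INSERT INTO Z VALUES ('A1', 'E1', 'F1', 'G1'),
                         ('A1', 'E2', 'F2', 'G2'),
                         ('A1', 'E3', 'F3', 'G3');
""")

# The query from the question: each Y row is paired with each Z row.
product_rows = conn.execute(
    "SELECT DISTINCT * FROM X LEFT JOIN Y USING (A) LEFT JOIN Z USING (A)"
).fetchall()

for row in product_rows:
    print(row)
print(len(product_rows))  # 2 rows in Y times 3 rows in Z
```

DISTINCT does not help here because every combination of a Y row and a Z row is a different row.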

- - - - - -

I have tried adding a GROUP BY clause on fields of Y and Z. But if I group by only one column, it returns just the first result matched with each unique value in that column (i.e. A1 C1 E1, A1 C2 E1, A1 C3 E1). And if I group by two columns, it returns the product of the two tables again.

I've also tried doing multiple select statements in the query, then joining the resulting tables, but I received the product of the tables as output again.

Basically I want to merge the results of three select statements into a single result, without it giving me all combinations of the data. If I need to, I can resort to doing multiple queries. However, since they all contain a common index, I feel there should be a way to do it in one query that I am missing.

Thanks for any help.

5 answers

#1


2  

I don't know if I understand your problem, but why are you using a LEFT JOIN? The story sounds more like an INNER JOIN. Nothing here calls for a UNION.

[Edit] OK, I think I see what you want now. I've never tried what I am about to suggest, and what's more, some DBs don't support it (yet), but I think you want a windowing function.

WITH Y2 AS (SELECT Y.*, ROW_NUMBER() OVER (PARTITION BY A) AS YROW FROM Y),
     Z2 AS (SELECT Z.*, ROW_NUMBER() OVER (PARTITION BY A) AS ZROW FROM Z)
SELECT COALESCE(Y2.A,Z2.A) AS A, Y2.C, Y2.D, Z2.E, Z2.F, Z2.G
FROM Y2 FULL OUTER JOIN Z2 ON Y2.A=Z2.A AND YROW=ZROW;

The idea is to print the list in as few rows as possible, right? So if A1 has 10 entries in Y and 7 in Z, then we get 10 rows with 3 having NULLs for the Z fields. This works in Postgres. I do not believe this syntax is available in MySQL.

Y:

 a | d | c  
---+---+----
 1 | 1 | -1
 1 | 2 | -1
 2 | 0 | -1

Z:

 a | f | g | e 
---+---+---+---
 1 | 9 | 9 | 0
 2 | 1 | 1 | 0
 3 | 0 | 1 | 0

Output of statement above:

 a | c  | d | e | f | g 
---+----+---+---+---+---
 1 | -1 | 1 | 0 | 9 | 9
 1 | -1 | 2 |   |   |  
 2 | -1 | 0 | 0 | 1 | 1
 3 |    |   | 0 | 0 | 1
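For engines that lack FULL OUTER JOIN, the same pairing can be sketched with two LEFT JOINs glued together by UNION ALL. MySQL added CTEs and window functions in version 8.0 but still has no FULL OUTER JOIN, so this shape may be the more portable one. The sketch below runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 module, using the sample rows above. The ORDER BY inside each OVER clause is an addition, assumed here only to make the row numbering deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Y (a INT, d INT, c INT);
    CREATE TABLE Z (a INT, f INT, g INT, e INT);
    INSERT INTO Y VALUES (1, 1, -1), (1, 2, -1), (2, 0, -1);
    INSERT INTO Z VALUES (1, 9, 9, 0), (2, 1, 1, 0), (3, 0, 1, 0);
""")

query = """
WITH Y2 AS (SELECT Y.*, ROW_NUMBER() OVER (PARTITION BY a ORDER BY d) AS yrow FROM Y),
     Z2 AS (SELECT Z.*, ROW_NUMBER() OVER (PARTITION BY a ORDER BY f) AS zrow FROM Z)
-- every Y row, paired with the Z row that has the same key and row number
SELECT Y2.a AS a, Y2.c, Y2.d, Z2.e, Z2.f, Z2.g
FROM Y2 LEFT JOIN Z2 ON Y2.a = Z2.a AND Y2.yrow = Z2.zrow
UNION ALL
-- Z rows that found no partner in Y
SELECT Z2.a, NULL, NULL, Z2.e, Z2.f, Z2.g
FROM Z2 LEFT JOIN Y2 ON Y2.a = Z2.a AND Y2.yrow = Z2.zrow
WHERE Y2.a IS NULL
ORDER BY 1, 3
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The rows come back in the same shape as the Postgres output above, with None where that output shows blanks.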

#2


0  

Yep, UNION is not the answer.

I'm thinking you want:

SELECT *
FROM x
    JOIN y ON x.a = y.a
    JOIN z ON x.a = z.a
GROUP BY x.a;

#3


0  

I found a new way while editing this post; it can be used to merge two tables according to unique ids.
Try this:

create table y
(
a int,
d int,
c int
)

create table z
(
a int,
f int,
g int,
e int
)

go

insert into y values(1,1,-1)
insert into y values(1,2,-1)
insert into y values(2,0,-1)

insert into z values(1,9,9,0)
insert into z values(2,1,1,0)
insert into z values(3,0,1,0)

go

select * from y;
select * from z;

WITH Y2 AS (SELECT Y.*, ROW_NUMBER()  OVER (ORDER BY A) AS YROW FROM Y where A = 3),
     Z2 AS (SELECT Z.*, ROW_NUMBER()  OVER (ORDER BY A) AS ZROW FROM Z where A = 3)
SELECT COALESCE(Y2.A,Z2.A) AS A, Y2.C, Y2.D, Z2.E, Z2.F, Z2.G
FROM Y2 FULL OUTER JOIN Z2 ON Y2.A=Z2.A AND YROW=ZROW;

#4


0  

PostgreSQL is the right answer to most MySQL issues, but your problem could have been solved this way:

The issue you experienced was that you had two left joins, i.e. A LEFT JOIN X LEFT JOIN Y, which inevitably gives you A x X x Y where you wanted (A x X) x (A x Y).

A simple solution could be:

SELECT x.A, x.B, x.C, x.D, y.E, y.F, y.G
FROM (SELECT A.A, A.B, X.C, X.D FROM A LEFT JOIN X ON A.A = X.A) x
INNER JOIN (SELECT A.A, Y.E, Y.F, Y.G FROM A LEFT JOIN Y ON A.A = Y.A) y
    ON x.A = y.A;

Test details:

CREATE TABLE A (A varchar(3),B varchar(3));
CREATE TABLE X (A varchar(3),C varchar(3), D varchar(3));
CREATE TABLE Y (A varchar(3),E varchar(3), F varchar(3), G varchar(3));
INSERT INTO A(A,B) VALUES ('A1','B1'), ('A2','B2'), ('A3','B3'), ('A4','B4');
INSERT INTO X(A,C,D) VALUES ('A1','C1','D1'), ('A3','C3','D3'), ('A4','C4','D4');
INSERT INTO Y(A,E,F,G) VALUES ('A1','E1','F1','G1'), ('A2','E2','F2','G2'), ('A4','E4','F4','G4');
SELECT x.A, x.B, x.C, x.D, y.E, y.F, y.G
FROM (SELECT A.A, A.B, X.C, X.D FROM A LEFT JOIN X ON A.A = X.A) x
INNER JOIN (SELECT A.A, Y.E, Y.F, Y.G FROM A LEFT JOIN Y ON A.A = Y.A) y
    ON x.A = y.A;
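A quick check of the query above, sketched with Python's sqlite3 module instead of MySQL. The derived tables are aliased ax and ay here (names chosen for this sketch, to keep them apart from the base tables X and Y), and the two extra A1 rows inserted at the end are hypothetical. With the test data as given, each key has at most one row in X and one in Y, so the result has one row per key. Once a key has several rows on both sides, the derived tables still multiply for that key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (A varchar(3), B varchar(3));
    CREATE TABLE X (A varchar(3), C varchar(3), D varchar(3));
    CREATE TABLE Y (A varchar(3), E varchar(3), F varchar(3), G varchar(3));
    INSERT INTO A(A,B) VALUES ('A1','B1'), ('A2','B2'), ('A3','B3'), ('A4','B4');
    INSERT INTO X(A,C,D) VALUES ('A1','C1','D1'), ('A3','C3','D3'), ('A4','C4','D4');
    INSERT INTO Y(A,E,F,G) VALUES ('A1','E1','F1','G1'), ('A2','E2','F2','G2'), ('A4','E4','F4','G4');
""")

QUERY = """
SELECT ax.A, ax.B, ax.C, ax.D, ay.E, ay.F, ay.G
FROM (SELECT A.A, A.B, X.C, X.D FROM A LEFT JOIN X ON A.A = X.A) ax
INNER JOIN (SELECT A.A, Y.E, Y.F, Y.G FROM A LEFT JOIN Y ON A.A = Y.A) ay
    ON ax.A = ay.A
"""

# At most one X row and one Y row per key: one result row per key.
one_per_key = conn.execute(QUERY).fetchall()
print(len(one_per_key))

# Hypothetical extra rows: A1 now has two rows in X and two in Y.
conn.execute("INSERT INTO X VALUES ('A1', 'C5', 'D5')")
conn.execute("INSERT INTO Y VALUES ('A1', 'E5', 'F5', 'G5')")
several_per_key = conn.execute(QUERY).fetchall()
a1_rows = [row for row in several_per_key if row[0] == "A1"]
print(len(a1_rows))  # 2 X rows times 2 Y rows for A1
```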

In summary, yes, MySQL has many, many issues, but this is not one of them; most of its issues concern more advanced features.

#5


0  

If I understand correctly, table X has a 1:n relationship with both tables Y and Z. So, the behaviour you see is expected. The result you get is a kind of Cross Product.

If X has Person data, Y has Address data for those persons and Z has Phone data for those persons, then it's natural for your query to show all combinations of addresses and phones for every person. If someone has 3 addresses and 4 phones in your tables, then the query shows 12 rows in the result.

You could avoid it by either using a UNION query or issuing two queries:

SELECT X.*
     , Y.*

FROM X
  LEFT JOIN Y 
    ON Y.A = X.A

and:

SELECT X.*
     , Z.*

FROM X 
  LEFT JOIN Z 
    ON Z.A = X.A
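As a rough illustration of the two-query route, the sketch below uses Python's sqlite3 module with made-up rows: one person in X, 3 addresses in Y and 4 phones in Z. The trimmed column lists and the sample values are assumptions, not the asker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE X (A TEXT, B TEXT);
    CREATE TABLE Y (A TEXT, C TEXT);
    CREATE TABLE Z (A TEXT, E TEXT);
    -- hypothetical data: one person, 3 addresses, 4 phones
    INSERT INTO X VALUES ('A1', 'person');
    INSERT INTO Y VALUES ('A1', 'addr1'), ('A1', 'addr2'), ('A1', 'addr3');
    INSERT INTO Z VALUES ('A1', 'phone1'), ('A1', 'phone2'),
                         ('A1', 'phone3'), ('A1', 'phone4');
""")

# Single statement with both joins: every address meets every phone.
combined = conn.execute(
    "SELECT X.*, Y.*, Z.* FROM X LEFT JOIN Y ON Y.A = X.A LEFT JOIN Z ON Z.A = X.A"
).fetchall()

# Two separate statements: each child table is listed once.
with_y = conn.execute("SELECT X.*, Y.* FROM X LEFT JOIN Y ON Y.A = X.A").fetchall()
with_z = conn.execute("SELECT X.*, Z.* FROM X LEFT JOIN Z ON Z.A = X.A").fetchall()

print(len(combined), len(with_y), len(with_z))
```

The two result sets have different column lists, so the application has to stitch them together by A itself.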
