Hello, I've been working on this all morning. I thought it was a simple self-join, but the self-join actually returns too many rows.
Essentially, I'm trying to find rows in a table where certain column values match from row to row.
So if rows one and three have the same values in three specific columns, then those two rows are returned.
So far I've tried a self-join and a semi-join in a couple of different ways.
SELECT *
FROM ATable a, ATable b
WHERE a.colValue = b.colValue
  AND a.colValue2 = b.colValue2
This returns too many rows. Is this query even a join? Am I on the wrong track here? What am I missing about self-joins that makes this return more rows than the table itself contains?
ATable contains 20 rows, but the query above returns 36.
As always, thanks very much for any answers or hints. I learn a lot just by formulating the question.
3 solutions
#1
2
As it stands, the query returns every row at least once, because every row matches itself. You need to restrict the join so that the two sides must be different rows.
I'm assuming you have some sort of primary key ID column.
SELECT *
FROM ATable a, ATable b
WHERE a.colValue = b.colValue
  AND a.colValue2 = b.colValue2
  AND a.Id != b.Id
Another thing to consider is what happens if you have these rows:
ID  ColValue  ColValue2  ColValue3
1   A         B          C
2   A         B          D
You'd see:
a.id  a.ColValue  a.ColValue2  a.ColValue3  b.id  b.ColValue  b.ColValue2  b.ColValue3
1     A           B            C            2     A           B            D
2     A           B            D            1     A           B            C
Because row 1 matches row 2 on the compared columns, and row 2 likewise matches row 1, each matching pair shows up twice.
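If you only want each matching pair once, one common approach is to compare the IDs with < instead of !=. Here is a minimal sketch using Python's built-in sqlite3 module so it can be run as-is. The table layout, the Id column, and the third unrelated row are assumptions made for illustration, and the helper name matching_pairs is hypothetical.

```python
import sqlite3

# Hypothetical sample data; an Id primary key is assumed, as in the answer above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ATable ("
    "Id INTEGER PRIMARY KEY, colValue TEXT, colValue2 TEXT, colValue3 TEXT)"
)
conn.executemany(
    "INSERT INTO ATable VALUES (?, ?, ?, ?)",
    [(1, "A", "B", "C"), (2, "A", "B", "D"), (3, "X", "Y", "Z")],
)


def matching_pairs(comparison):
    """Self-join on the two compared columns; `comparison` is the
    operator placed between a.Id and b.Id."""
    sql = f"""
        SELECT a.Id, b.Id
        FROM ATable a
        JOIN ATable b
          ON a.colValue = b.colValue
         AND a.colValue2 = b.colValue2
         AND a.Id {comparison} b.Id
        ORDER BY a.Id, b.Id
    """
    return conn.execute(sql).fetchall()


both_directions = matching_pairs("!=")  # each pair appears in both orders
each_pair_once = matching_pairs("<")    # each pair appears a single time

print(both_directions)  # [(1, 2), (2, 1)]
print(each_pair_once)   # [(1, 2)]
```

With < the lower Id always ends up on the a side, so the mirrored copy of each pair is dropped. Whether you want one row per pair or one row per member depends on what you do with the result.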
#2
1
You are doing it right. For each row you get the row itself plus every other row that matches on the columns you specify, so the result should contain at least as many rows as the table, and probably more.
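To make the count concrete: a group of n rows that share the compared values contributes n * n rows to the unrestricted self-join, so the total is the sum of the squared group sizes (assuming none of the compared columns is NULL, since NULL never equals NULL). The sketch below uses made-up data, not the asker's table: eight matching pairs plus four unmatched rows is one distribution that turns 20 rows into 36, and other distributions would give the same total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ATable ("
    "Id INTEGER PRIMARY KEY, colValue TEXT, colValue2 TEXT)"
)

# Hypothetical data: 8 pairs of rows sharing (colValue, colValue2),
# followed by 4 rows whose values are unique.
rows = []
for g in range(8):
    rows.append((f"v{g}", "x"))
    rows.append((f"v{g}", "x"))
for g in range(8, 12):
    rows.append((f"v{g}", "x"))
conn.executemany(
    "INSERT INTO ATable (colValue, colValue2) VALUES (?, ?)", rows
)

table_rows = conn.execute("SELECT COUNT(*) FROM ATable").fetchone()[0]

# The self-join from the question, with no restriction on Id.
join_rows = conn.execute("""
    SELECT COUNT(*)
    FROM ATable a, ATable b
    WHERE a.colValue = b.colValue
      AND a.colValue2 = b.colValue2
""").fetchone()[0]

# The sum of squared group sizes predicts the same number.
predicted = conn.execute("""
    SELECT SUM(n * n)
    FROM (
        SELECT COUNT(*) AS n
        FROM ATable
        GROUP BY colValue, colValue2
    ) AS g
""").fetchone()[0]

print(table_rows, join_rows, predicted)  # 20 36 36
```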
#3
0
Are you trying to find duplicate rows?
SELECT a.colValue, COUNT(a.id) AS cnt
FROM ATable a
GROUP BY a.colValue
HAVING COUNT(a.id) > 1
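A grouped query like this tells you which values are duplicated, not which rows carry them. If you need the rows themselves, one option is to join the grouped result back to the table. This is only a sketch, run through Python's sqlite3 module: the sample rows are invented, and grouping on both colValue and colValue2 is an assumption based on the columns compared in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ATable ("
    "id INTEGER PRIMARY KEY, colValue TEXT, colValue2 TEXT, colValue3 TEXT)"
)
# Hypothetical rows: 1 and 2 agree on both compared columns,
# 3 agrees on colValue only, 4 matches nothing.
conn.executemany(
    "INSERT INTO ATable VALUES (?, ?, ?, ?)",
    [
        (1, "A", "B", "C"),
        (2, "A", "B", "D"),
        (3, "A", "Q", "C"),
        (4, "X", "Y", "Z"),
    ],
)

# Find value combinations that occur more than once, then join back
# to the table to list every row carrying one of those combinations.
duplicated_rows = conn.execute("""
    SELECT t.id, t.colValue, t.colValue2, t.colValue3
    FROM ATable t
    JOIN (
        SELECT colValue, colValue2
        FROM ATable
        GROUP BY colValue, colValue2
        HAVING COUNT(*) > 1
    ) AS dup
      ON t.colValue = dup.colValue
     AND t.colValue2 = dup.colValue2
    ORDER BY t.id
""").fetchall()

print(duplicated_rows)  # [(1, 'A', 'B', 'C'), (2, 'A', 'B', 'D')]
```

Each duplicated row comes back once here, however many partners it has, which is different from the self-join where the output grows with the number of pairs.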