I work on a dataset with three different columns: pile, position and info.
There are no duplicate rows in the table, but for one combination of pile and position there can be one or two different texts in the info column. Those are the entries I am trying to find.
I tried the following:
SELECT COUNT(DISTINCT(`pile`, `position`)) FROM db;
But I received an error message:
ERROR 1241 (21000): Operand should contain 1 column(s)
Is there a way to find distinct combinations of values in two columns?
3 solutions
#1
13
This works even without subselects.
SELECT
`pile`,
`position`,
COUNT(*) AS c
FROM
db
GROUP BY
`pile`,
`position`
HAVING c > 1;
The command above shows all combinations of pile and position that occur more than once in the table db.
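Since the question is really about combinations that carry more than one distinct info text, the same GROUP BY idea can be narrowed with COUNT(DISTINCT info). The following is only a sketch under assumptions: the sample rows are made up, and Python's built-in sqlite3 module stands in for MySQL so the query can be tried without a server (the two SQL dialects differ in places).

```python
import sqlite3

# Hypothetical sample data: (pile, position, info)
rows = [
    (1, "A", "foo"),
    (1, "A", "bar"),  # same pile/position as the row above, different info
    (1, "B", "baz"),
    (2, "A", "qux"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE db (pile INTEGER, position TEXT, info TEXT)")
conn.executemany("INSERT INTO db VALUES (?, ?, ?)", rows)

# Combinations of pile and position that have more than one distinct info text
repeated = conn.execute(
    """
    SELECT pile, position, COUNT(DISTINCT info) AS c
    FROM db
    GROUP BY pile, position
    HAVING COUNT(DISTINCT info) > 1
    """
).fetchall()

print(repeated)
```

When the table has no fully duplicated rows, as stated in the question, this returns the same combinations as the COUNT(*) version.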
#2
0
To get the count of distinct combinations (GROUP BY is used here in preference to DISTINCT):
select count(*)
from (
select pile, position
from db
group by pile, position
) x
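As a side note on the original error: MySQL documents COUNT(DISTINCT expr, expr, ...) with a plain list of expressions, so the extra parentheses around the two columns are the likely cause of ERROR 1241, and COUNT(DISTINCT pile, position) should be accepted. The sketch below checks the subquery form on made-up rows; it assumes Python's built-in sqlite3 as a stand-in, which does not support the multi-column COUNT(DISTINCT ...) form, so only the subquery is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE db (pile INTEGER, position TEXT, info TEXT)")
# Hypothetical sample data: three distinct pile/position combinations
conn.executemany(
    "INSERT INTO db VALUES (?, ?, ?)",
    [(1, "A", "foo"), (1, "A", "bar"), (1, "B", "baz"), (2, "A", "qux")],
)

# Count the distinct (pile, position) pairs via a derived table
combo_count = conn.execute(
    """
    SELECT COUNT(*)
    FROM (SELECT DISTINCT pile, position FROM db) AS x
    """
).fetchone()[0]

print(combo_count)
```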
To find the actual duplicate records:
select db.*
from (
select pile, position
from db
group by pile, position
having count(*) > 1
) x
join db on db.pile = x.pile and db.position = x.position
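To see what the join returns, here is a small sketch on made-up rows. It assumes Python's built-in sqlite3 in place of MySQL, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE db (pile INTEGER, position TEXT, info TEXT)")
# Hypothetical sample data: only (1, "A") appears more than once
conn.executemany(
    "INSERT INTO db VALUES (?, ?, ?)",
    [(1, "A", "foo"), (1, "A", "bar"), (1, "B", "baz"), (2, "A", "qux")],
)

# Join the repeated combinations back to the table to get the full rows
dup_rows = conn.execute(
    """
    SELECT db.*
    FROM (
        SELECT pile, position
        FROM db
        GROUP BY pile, position
        HAVING COUNT(*) > 1
    ) AS x
    JOIN db ON db.pile = x.pile AND db.position = x.position
    ORDER BY db.pile, db.position, db.info
    """
).fetchall()

print(dup_rows)
```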
#3
0
SELECT *
FROM db x
WHERE EXISTS (
SELECT 1 FROM db y
WHERE y.pile = x.pile
AND y.position = x.position
AND y.other_field <> x.other_field
);
Now, for other_field you can use some unique id column, or any combination of fields (except for {pile, position} of course).
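For the table in the question, info is a natural choice for other_field. A minimal sketch on made-up rows, again assuming Python's built-in sqlite3 as a stand-in for MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE db (pile INTEGER, position TEXT, info TEXT)")
# Hypothetical sample data: (1, "A") has two different info texts
conn.executemany(
    "INSERT INTO db VALUES (?, ?, ?)",
    [(1, "A", "foo"), (1, "A", "bar"), (1, "B", "baz"), (2, "A", "qux")],
)

# Rows for which another row shares pile and position but differs in info
conflicting = conn.execute(
    """
    SELECT x.pile, x.position, x.info
    FROM db AS x
    WHERE EXISTS (
        SELECT 1
        FROM db AS y
        WHERE y.pile = x.pile
          AND y.position = x.position
          AND y.info <> x.info
    )
    ORDER BY x.info
    """
).fetchall()

print(conflicting)
```

Note that a comparison with <> never matches when either side is NULL, so rows with a NULL info would not be reported by this form.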