I have a table author_data:
author_id | author_name
----------+----------------
9 | ernest jordan
14 | k moribe
15 | ernest jordan
25 | william h nailon
79 | howard jason
36 | k moribe
Now I need the result as:
author_id | author_name
----------+----------------
9 | ernest jordan
15 | ernest jordan
14 | k moribe
36 | k moribe
That is, I need the author_id of every row whose author_name appears more than once. I have tried this statement:
select author_id,count(author_name)
from author_data
group by author_name
having count(author_name)>1
But it's not working. How can I get this?
3 answers
#1
9
I suggest a window function in a subquery:
SELECT author_id, author_name -- omit the name here, if you just need ids
FROM (
SELECT author_id, author_name
, count(*) OVER (PARTITION BY author_name) AS ct
FROM author_data
) sub
WHERE ct > 1;
You will recognize the basic aggregate function count(). It can be turned into a window function by appending an OVER clause, just like any other aggregate function.
This way it counts the rows per partition. Voilà.
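To make the per-partition count visible, here is a minimal sketch that runs the inner query on the sample data from the question. It uses Python's built-in sqlite3 module as a stand-in for the real database, which is an assumption of this example and not part of the original answer; it also assumes the bundled SQLite is version 3.25 or newer, since older builds lack window functions. The function name per_row_counts is hypothetical.

```python
import sqlite3

# Sample rows copied from the question.
SAMPLE_ROWS = [
    (9, "ernest jordan"),
    (14, "k moribe"),
    (15, "ernest jordan"),
    (25, "william h nailon"),
    (79, "howard jason"),
    (36, "k moribe"),
]


def per_row_counts(rows=SAMPLE_ROWS):
    """Return (author_id, author_name, ct) where ct is the size of the name's partition."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute(
        "CREATE TABLE author_data (author_id INTEGER PRIMARY KEY, author_name TEXT)"
    )
    conn.executemany("INSERT INTO author_data VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT author_id, author_name,
               count(*) OVER (PARTITION BY author_name) AS ct
        FROM author_data
        ORDER BY author_name, author_id
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in per_row_counts():
        print(row)
```

Every row keeps its own author_id and gains the count for its name, so the outer query only has to filter on ct > 1.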
In older versions without window functions (PostgreSQL 8.3 or older), or in general, this alternative performs well:
SELECT author_id, author_name -- omit name, if you just need ids
FROM author_data a
WHERE EXISTS (
SELECT 1
FROM author_data a2
WHERE a2.author_name = a.author_name
AND a2.author_id <> a.author_id
);
If you are concerned with performance, add an index on author_name.
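As a rough check of the EXISTS variant together with the suggested index, the sketch below loads the sample data into an in-memory SQLite database and runs the query. SQLite is used only so the example is self-contained; the index name author_data_name_idx and the function name duplicate_authors are hypothetical, and the ORDER BY is added purely to make the output deterministic.

```python
import sqlite3

# Sample rows copied from the question.
SAMPLE = [
    (9, "ernest jordan"),
    (14, "k moribe"),
    (15, "ernest jordan"),
    (25, "william h nailon"),
    (79, "howard jason"),
    (36, "k moribe"),
]


def duplicate_authors(rows):
    """Return (author_id, author_name) for every row whose name occurs more than once."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE author_data (author_id INTEGER PRIMARY KEY, author_name TEXT)"
    )
    conn.executemany("INSERT INTO author_data VALUES (?, ?)", rows)
    # Hypothetical index name; lets the EXISTS probe look names up without a full scan.
    conn.execute("CREATE INDEX author_data_name_idx ON author_data (author_name)")
    result = conn.execute(
        """
        SELECT a.author_id, a.author_name
        FROM author_data a
        WHERE EXISTS (
            SELECT 1
            FROM author_data a2
            WHERE a2.author_name = a.author_name
              AND a2.author_id <> a.author_id
        )
        ORDER BY a.author_name, a.author_id
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in duplicate_authors(SAMPLE):
        print(row)
```

Because EXISTS only asks whether at least one other row shares the name, each qualifying row is returned exactly once, even when a name occurs three or more times.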
#2
1
You are halfway there already. Your query finds the duplicated names; you just need to use them to fetch the rest of the data. The subquery has to return author_name rather than author_id, because author_id is neither grouped nor aggregated.
Try this:
SELECT author_id, author_name
FROM author_data
WHERE author_name IN (SELECT author_name
FROM author_data
GROUP BY author_name
HAVING count(*) > 1);
#3
1
You could join the table to itself, which is achievable with either of the following queries. If a name can occur three or more times, use SELECT DISTINCT, because each row then matches more than one partner.
SELECT a1.author_id, a1.author_name
FROM author_data a1
CROSS JOIN author_data a2
WHERE a1.author_id <> a2.author_id
AND a1.author_name = a2.author_name;
-- 9 |ernest jordan
-- 15|ernest jordan
-- 14|k moribe
-- 36|k moribe
--OR
SELECT a1.author_id, a1.author_name
FROM author_data a1
INNER JOIN author_data a2
ON a1.author_id <> a2.author_id
AND a1.author_name = a2.author_name;
-- 9 |ernest jordan
-- 15|ernest jordan
-- 14|k moribe
-- 36|k moribe
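One thing to watch with the self-join is a name that occurs three or more times: every row then pairs with several partners and shows up repeatedly. The sketch below illustrates this with made-up rows (not the question's data) in an in-memory SQLite database, again only as a self-contained stand-in; the function name self_join_ids is hypothetical.

```python
import sqlite3


def self_join_ids(rows, distinct):
    """Return author_ids found by the self-join, optionally de-duplicated."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE author_data (author_id INTEGER PRIMARY KEY, author_name TEXT)"
    )
    conn.executemany("INSERT INTO author_data VALUES (?, ?)", rows)
    select = "SELECT DISTINCT" if distinct else "SELECT"
    query = f"""
        {select} a1.author_id
        FROM author_data a1
        JOIN author_data a2
          ON a1.author_name = a2.author_name
         AND a1.author_id <> a2.author_id
        ORDER BY a1.author_id
    """
    ids = [row[0] for row in conn.execute(query)]
    conn.close()
    return ids


if __name__ == "__main__":
    # Illustrative rows: the name "a" occurs three times, "b" once.
    rows = [(1, "a"), (2, "a"), (3, "a"), (4, "b")]
    print(self_join_ids(rows, distinct=False))  # each id of "a" is listed twice
    print(self_join_ids(rows, distinct=True))   # each id of "a" is listed once
```

With the question's data, where each duplicated name occurs exactly twice, both variants return the same four ids.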