Delete all but one of the duplicate records

Date: 2021-04-12 09:18:59

I have a table that is supposed to keep a trace of visitors to a given profile (user id to user id pair). It turns out my SQL query was a bit off and is producing multiple pairs instead of single ones as intended. With hindsight I should have enforced a unique constraint on each id+id pair.

Now, how could I go about cleaning up the table? What I want to do is delete all duplicate pairs and leave just one.

So for example change this:

23515 -> 52525 date_visited
23515 -> 52525 date_visited
23515 -> 52525 date_visited
12345 -> 54321 date_visited
12345 -> 54321 date_visited
12345 -> 54321 date_visited
12345 -> 54321 date_visited
23515 -> 52525 date_visited
...

Into this:

23515 -> 52525 date_visited
12345 -> 54321 date_visited

Update: Here is the table structure as requested:

Column        Type       Attributes  Null  Default            Extra
id            int(10)    UNSIGNED    No    None               AUTO_INCREMENT
profile_id    int(10)    UNSIGNED    No    0
visitor_id    int(10)    UNSIGNED    No    0
date_visited  timestamp              No    CURRENT_TIMESTAMP
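
For anyone who wants to try the answers on a scratch database, the structure above corresponds roughly to the following definition. This is only a sketch: the table name my_tab is borrowed from the answers, the primary key on id is an assumption (the listing does not show keys), and the sample rows are made up to mirror the example pairs.

```sql
-- Sketch of the table described above (MySQL).
-- Assumptions: table name my_tab, primary key on id.
CREATE TABLE my_tab (
    id           INT UNSIGNED NOT NULL AUTO_INCREMENT,
    profile_id   INT UNSIGNED NOT NULL DEFAULT 0,
    visitor_id   INT UNSIGNED NOT NULL DEFAULT 0,
    date_visited TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id)
);

-- Made-up sample rows that reproduce the duplicated pairs.
INSERT INTO my_tab (profile_id, visitor_id) VALUES
    (23515, 52525),
    (23515, 52525),
    (12345, 54321),
    (12345, 54321),
    (23515, 52525);
```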

4 answers

#1


39  

Use group by in a subquery:

delete from my_tab where id not in 
(select min(id) from my_tab group by profile_id, visitor_id);

You need some kind of unique identifier (here, I'm using id).

UPDATE

As pointed out by @JamesPoulson, this raises an error in MySQL, which does not allow the subquery of a DELETE to select directly from the table being deleted from. Wrapping the subquery in a derived table works around this (as shown in James' answer):

delete from `my_tab` where id not in
( SELECT * FROM 
    (select min(id) from `my_tab` group by profile_id, visitor_id) AS temp_tab
);
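
Before running the delete, it may be worth checking which pairs are actually duplicated, and afterwards adding the unique constraint the question mentions so the problem cannot come back. A minimal sketch, assuming MySQL and the table name my_tab used above; the index name uq_profile_visitor is made up for illustration:

```sql
-- Before deleting: list each duplicated pair, how many copies exist,
-- and the id that the delete above would keep.
SELECT profile_id, visitor_id, COUNT(*) AS copies, MIN(id) AS kept_id
FROM my_tab
GROUP BY profile_id, visitor_id
HAVING COUNT(*) > 1;

-- After deleting: enforce one row per pair.
-- uq_profile_visitor is a hypothetical index name.
ALTER TABLE my_tab
    ADD UNIQUE KEY uq_profile_visitor (profile_id, visitor_id);
```

The ALTER TABLE fails if any duplicate pair is still present, so it also serves as a check that the cleanup was complete.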

#2


12  

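Here's Frank Schmitt's solution with a small workaround: the subquery is wrapped in a derived table (temp_tab), because MySQL will not let the subquery read directly from the table being deleted from: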

delete from `my_tab` where id not in
( SELECT * FROM 
    (select min(id) from `my_tab` group by profile_id, visitor_id) AS temp_tab
);

#3


2  

1. Select all unique rows
2. Copy them to a new temp table
3. Truncate original table
4. Copy temp table data to original table

That's what I'd do. I'm not sure whether a single query could do all of this for you.
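
The steps above could look something like this in MySQL. It is a sketch rather than a tested migration: my_tab is the table name used in the other answers, my_tab_unique is a made-up name for the temp table, and keeping the lowest id and earliest date_visited per pair is one possible choice of which row counts as "the" unique one.

```sql
-- Steps 1 and 2: copy one row per (profile_id, visitor_id) pair
-- into a temporary table. my_tab_unique is a hypothetical name.
-- Note: MIN(id) and MIN(date_visited) may come from different rows.
CREATE TEMPORARY TABLE my_tab_unique AS
SELECT MIN(id) AS id,
       profile_id,
       visitor_id,
       MIN(date_visited) AS date_visited
FROM my_tab
GROUP BY profile_id, visitor_id;

-- Step 3: empty the original table.
TRUNCATE TABLE my_tab;

-- Step 4: copy the de-duplicated rows back.
INSERT INTO my_tab (id, profile_id, visitor_id, date_visited)
SELECT id, profile_id, visitor_id, date_visited
FROM my_tab_unique;

DROP TEMPORARY TABLE my_tab_unique;
```

Keep in mind that TRUNCATE cannot be rolled back in MySQL, and the temporary table only lives for the current session, so take a backup before trying this on real data.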

#4


1  

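This works on SQL Server, which allows deleting through a CTE. Partition by the columns that define a duplicate (partitioning by the unique ID would never produce a row number above 1):

With NewCTE
AS
(
-- Number the rows within each (profile_id, visitor_id) pair, lowest id first
Select *, Row_number() over(partition by profile_id, visitor_id order by id) as RowNumber from 
table_name
)
-- Keep row 1 of each pair and delete the rest
Delete from NewCTE where RowNumber > 1;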
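
Deleting through a CTE is SQL Server syntax. As far as I know MySQL does not accept it, but MySQL 8.0 and later do have window functions, so the same row-numbering idea can be sketched as an id lookup instead. This assumes MySQL 8.0+, the table name my_tab from the other answers, and that a duplicate is defined by the (profile_id, visitor_id) pair:

```sql
-- Delete every row that is not the first (lowest id) in its pair.
-- The inner derived table (ranked) keeps MySQL from rejecting a
-- subquery that reads from the table being deleted from.
DELETE FROM my_tab
WHERE id IN (
    SELECT id
    FROM (
        SELECT id,
               ROW_NUMBER() OVER (
                   PARTITION BY profile_id, visitor_id
                   ORDER BY id
               ) AS row_num
        FROM my_tab
    ) AS ranked
    WHERE row_num > 1
);
```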
