I have a table that is supposed to keep a trace of visitors to a given profile (user id to user id pair). It turns out my SQL query was a bit off and is producing multiple pairs instead of single ones as intended. With hindsight I should have enforced a unique constraint on each id+id pair.
Now, how could I go about cleaning up the table? What I want to do is delete all duplicate pairs and leave just one.
So for example change this:
23515 -> 52525 date_visited
23515 -> 52525 date_visited
23515 -> 52525 date_visited
12345 -> 54321 date_visited
12345 -> 54321 date_visited
12345 -> 54321 date_visited
12345 -> 54321 date_visited
23515 -> 52525 date_visited
...
Into this:
23515 -> 52525 date_visited
12345 -> 54321 date_visited
Update: Here is the table structure as requested:
Column        Type       Attributes  Null  Default            Extra
id            int(10)    UNSIGNED    No    None               AUTO_INCREMENT
profile_id    int(10)    UNSIGNED    No    0
visitor_id    int(10)    UNSIGNED    No    0
date_visited  timestamp              No    CURRENT_TIMESTAMP
4 answers
#1
39
Use group by in a subquery:
delete from my_tab where id not in
(select min(id) from my_tab group by profile_id, visitor_id);
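You need some kind of unique identifier (here, I'm using id). The query keeps the row with the lowest id for each (profile_id, visitor_id) pair and deletes the rest.
UPDATE
As pointed out by @JamesPoulson, this fails in MySQL, which does not let a subquery read directly from the table being deleted from (error 1093, "You can't specify target table ... for update in FROM clause"). The working version wraps the subquery in a derived table (as shown in James' answer):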
delete from `my_tab` where id not in
( SELECT * FROM
(select min(id) from `my_tab` group by profile_id, visitor_id) AS temp_tab
);
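Before running the delete, it may be worth checking which pairs are actually duplicated, and afterwards adding the unique constraint mentioned in the question so the duplicates cannot come back. A minimal sketch, assuming the table is named my_tab as above; the index name uniq_profile_visitor is made up for illustration:

```sql
-- Preview: list each (profile_id, visitor_id) pair that occurs more than once.
SELECT profile_id, visitor_id, COUNT(*) AS copies
FROM my_tab
GROUP BY profile_id, visitor_id
HAVING COUNT(*) > 1;

-- After the duplicates are gone, enforce uniqueness on the pair.
-- uniq_profile_visitor is a hypothetical index name.
ALTER TABLE my_tab
    ADD UNIQUE KEY uniq_profile_visitor (profile_id, visitor_id);
```

The ALTER TABLE is rejected with a duplicate-entry error if any duplicate pair remains, so it also serves as a check that the cleanup worked.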
#2
12
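Here's Frank Schmitt's solution with a small workaround for MySQL: the inner query is wrapped in a derived table (temp_tab), so its result is built first and the delete no longer reads directly from the table it is modifying: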
delete from `my_tab` where id not in
( SELECT * FROM
(select min(id) from `my_tab` group by profile_id, visitor_id) AS temp_tab
)
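A different technique from the one above, which avoids the subquery altogether, is MySQL's multi-table DELETE with a self-join. This is only a sketch under the same assumptions (table my_tab, id unique per row); the aliases newer and older are illustrative:

```sql
-- Delete every row for which another row with the same pair and a lower id exists.
-- The row with the lowest id in each (profile_id, visitor_id) group has no such
-- partner, so it is the one that survives.
DELETE newer
FROM my_tab AS newer
JOIN my_tab AS older
  ON  newer.profile_id = older.profile_id
  AND newer.visitor_id = older.visitor_id
  AND newer.id > older.id;
```

Like the min(id) version, this keeps the oldest row of each pair. On a large table the join is likely to benefit from an index on (profile_id, visitor_id).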
#3
2
1. Select all unique rows.
2. Copy them to a new temp table.
3. Truncate the original table.
4. Copy the temp table data back to the original table.
That's what I'd do. I'm not sure whether a single query could do all of this for you.
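The steps above could be sketched in MySQL roughly as follows. The temporary table name my_tab_dedup is hypothetical, and keeping the lowest id and earliest date_visited for each pair is an assumption about which copy should survive:

```sql
-- Steps 1-2: copy one row per (profile_id, visitor_id) pair into a temporary table.
-- my_tab_dedup is a hypothetical name.
CREATE TEMPORARY TABLE my_tab_dedup AS
SELECT MIN(id)           AS id,
       profile_id,
       visitor_id,
       MIN(date_visited) AS date_visited
FROM my_tab
GROUP BY profile_id, visitor_id;

-- Step 3: empty the original table.
-- TRUNCATE commits implicitly and resets the AUTO_INCREMENT counter.
TRUNCATE TABLE my_tab;

-- Step 4: copy the de-duplicated rows back, preserving their ids.
INSERT INTO my_tab (id, profile_id, visitor_id, date_visited)
SELECT id, profile_id, visitor_id, date_visited
FROM my_tab_dedup;

DROP TEMPORARY TABLE my_tab_dedup;
```

Because TRUNCATE cannot be rolled back, it is safer to verify the contents of the temporary table (or take a backup) before step 3.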
#4
1
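This will work on SQL Server, which allows deleting through a CTE (as far as I know, MySQL does not). Note that the partition must be on the pair of columns, not on the unique id, otherwise every row gets RowNumber 1 and nothing is deleted:
WITH NewCTE
AS
(
    SELECT *, ROW_NUMBER() OVER (PARTITION BY profile_id, visitor_id ORDER BY id) AS RowNumber
    FROM table_name
)
DELETE FROM NewCTE WHERE RowNumber > 1;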
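For MySQL, a comparable approach might look like the sketch below. It assumes MySQL 8.0 or later (where window functions are available) and the my_tab table from the other answers; the alias ranked and the column name rn are illustrative:

```sql
-- Number the rows inside each (profile_id, visitor_id) group, oldest id first,
-- then delete every row that is not the first of its group.
DELETE t
FROM my_tab AS t
JOIN (
    SELECT id,
           ROW_NUMBER() OVER (PARTITION BY profile_id, visitor_id
                              ORDER BY id) AS rn
    FROM my_tab
) AS ranked
  ON ranked.id = t.id
WHERE ranked.rn > 1;
```

The derived table is there for the same reason as in the accepted answer: it lets MySQL compute the row numbers first instead of reading my_tab directly while deleting from it.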