Write a SQL query to delete all duplicate email entries in a table named Person, keeping for each email only the row with the smallest Id.
+----+------------------+
| Id | Email            |
+----+------------------+
| 1  | john@example.com |
| 2  | bob@example.com  |
| 3  | john@example.com |
+----+------------------+
Id is the primary key column for this table.
For example, after running your query, the above Person table should have the following rows:
+----+------------------+
| Id | Email            |
+----+------------------+
| 1  | john@example.com |
| 2  | bob@example.com  |
+----+------------------+
DELETE p1 FROM Person p1 INNER JOIN Person p2 WHERE p1.Email = p2.Email AND p1.Id > p2.Id;
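Here p1 and p2 are both aliases for Person, and INNER JOIN is the same as JOIN (the DELETE ... JOIN form is MySQL's multi-table DELETE syntax). With the condition p1.Email = p2.Email, joining p1 to p2 pairs every row with every row that has the same email, including itself:
+-------+-------+------------------+
| p1.Id | p2.Id | Email            |
+-------+-------+------------------+
| 1     | 1     | john@example.com |
| 1     | 3     | john@example.com |
| 2     | 2     | bob@example.com  |
| 3     | 1     | john@example.com |
| 3     | 3     | john@example.com |
+-------+-------+------------------+
The condition p1.Id > p2.Id then keeps only the pair (3, 1), so the p1 row with Id 3 is deleted. Because p1 and p2 both refer to Person, there is really only one Id column; it is written as two columns here to make the join easier to follow.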
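To check the self-join reasoning by hand, here is a minimal sketch that runs the same join in an in-memory SQLite database from Python. SQLite is only an assumption for the sake of a runnable example; it does not accept MySQL's DELETE ... JOIN syntax, so the delete step is rewritten as an equivalent DELETE ... WHERE Id IN (subquery) using the same join condition.

```python
import sqlite3

# In-memory database holding the sample Person table from the problem.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (Id INTEGER PRIMARY KEY, Email TEXT)")
conn.executemany(
    "INSERT INTO Person (Id, Email) VALUES (?, ?)",
    [(1, "john@example.com"), (2, "bob@example.com"), (3, "john@example.com")],
)

# Every (p1.Id, p2.Id) pair that shares an email, including self-pairs.
all_pairs = conn.execute(
    "SELECT p1.Id, p2.Id FROM Person p1 JOIN Person p2 "
    "ON p1.Email = p2.Email ORDER BY p1.Id, p2.Id"
).fetchall()

# Adding p1.Id > p2.Id leaves only rows that have a smaller-Id twin.
dup_pairs = conn.execute(
    "SELECT p1.Id, p2.Id FROM Person p1 JOIN Person p2 "
    "ON p1.Email = p2.Email AND p1.Id > p2.Id"
).fetchall()

# SQLite stand-in for the MySQL DELETE ... JOIN: delete the p1 side of those pairs.
conn.execute(
    "DELETE FROM Person WHERE Id IN ("
    "  SELECT p1.Id FROM Person p1 JOIN Person p2 "
    "  ON p1.Email = p2.Email AND p1.Id > p2.Id)"
)
remaining = conn.execute("SELECT Id, Email FROM Person ORDER BY Id").fetchall()

print(all_pairs)
print(dup_pairs)
print(remaining)
```

Only the pair with p1.Id = 3 and p2.Id = 1 survives the Id comparison, which is why the row with Id 3 is the one that gets removed.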
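Some additional notes on SQL statements:
- Table aliases in SQL: AS
select * from kettleoutputtable a
where a.os = 2 and storename = 'anzhi'
and
select * from kettleoutputtable as a
where a.os = 2 and storename = 'anzhi'
are equivalent. In other words, the AS before a table alias can be omitted: just put the short name directly after the table name.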
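The two alias forms can be compared directly. The sketch below again assumes an in-memory SQLite database driven from Python; the column types and the three sample rows for kettleoutputtable are made up for illustration, since the original note does not show the table's contents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows: only the column names os and storename come from the note.
conn.execute("CREATE TABLE kettleoutputtable (os INTEGER, storename TEXT)")
conn.executemany(
    "INSERT INTO kettleoutputtable (os, storename) VALUES (?, ?)",
    [(2, "anzhi"), (1, "anzhi"), (2, "other")],
)

# Alias written without AS.
without_as = conn.execute(
    "select * from kettleoutputtable a "
    "where a.os = 2 and storename = 'anzhi'"
).fetchall()

# Same query with the alias introduced by AS.
with_as = conn.execute(
    "select * from kettleoutputtable as a "
    "where a.os = 2 and storename = 'anzhi'"
).fetchall()

print(without_as)
print(with_as)
```

Both queries return the same single row here, so the choice between the two spellings is purely a matter of style in this setup.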