今天在LeetCode上面看到一道题,题目不难,而且考的点也很基础,但是感觉可以归纳出一些内容来,所以就自己归纳了一下。题目如下
题目并不难,我写的sql如下:
select distinct p1.Email from Person as p1 where p1.Email in (select p2.Email from Person as p2 group by p2.Email having count(p2.Email)>1)
其实从sql来看我就是把这条sql分成了两部分,in里面的部分是取出Email里面count>1的数据,然后去取p1里面合适的数据即可,但是p2里面取出来的会有很多重复的数据,所以我在p1的select 部分加了distinct来去重。一般来说应用到具体业务是不会单独取个Name的值的,所以上方的写法仅仅只是为了完成这道题。下面整理集中SQL的重复记录的查询方法,以及删除多余数据的方法。
1.查找表中的多余重复记录,用单字段来判断
select * from Person as p1 where p1.Email in (select Email from Person as p2 group by p2.Email having count(p2.Email)>1)
2.删除表的多余重复记录,保留id最小的记录,用单字段进行判断
delete from Person where Email in (select p2.Email from Person as p2 group by p2.Email having count(p2.Email)>1) and id not in (select min(id) from Person as p3 group by p3.Email having count(p3.Email)>1)
3.查询表中的多余重复记录,用多字段进行判断
select * from Person as p1 where (p1.Email,p1.Name) in (select p2.Email,p2.Name from Person as p2 group by p2.Email,p2.Name having count(*)>1)
4.删除表的多余重复近路,保留id最小的记录,多字段判断
delete from Person where Email in (select p2.Email,p2.Name from Person as p2 group by p2.Email,p2.Name having count(*)>1) and id not in (select min(id) from Person as p3 group by p3.Email,p3.Name having count(*)>1)
5.查询表中的重复记录,用多字段判断,不包括id最小的那个记录
select * from Person as p1 where (p1.Email,p1.Name) in (select p2.Email,p2.Name from Person as p2 group by p2.Email,p2.Name having count(*)>1) and id not in(select min(id) from Person as p3 group by p3.Email,p3.Name having count(*)>1)