一、使用 Row_Number() Over(Partition By ... Order By ...) 为每个分组生成一个内部序号
第一步,根据需求查询重复数据,并为每个分组生成一个内部行号
Select t.*, Row_Number() Over(Partition By Visit_Id, Presc_Detail_Id, Rule_Code Order By t.Visit_Id) Su From Test t;
第二步,筛选出行号大于1的,即重复的多余的数据
Select * From (Select t.*, Row_Number() Over(Partition By Visit_Id, Presc_Detail_Id, Rule_Code Order By t.Visit_Id) Su From Test t) Where Su > 1;
第三步,删除重复数据,只保留1条
Delete From Test Where Id In (Select Id From (Select t.*, Row_Number() Over(Partition By Visit_Id, Presc_Detail_Id, Rule_Code Order By t.Visit_Id) Su From Test t) Where Su > 1);
这种方式有个弊端,如果数据本身并没有每行的唯一标识id,删除重复数据还是非常困难的。
二、使用 ROWID
第一步,查找重复的记录
Select * From Test t Where (Visit_Id, Presc_Detail_Id, Rule_Code) In (Select Visit_Id, Presc_Detail_Id, Rule_Code From Test Group By Visit_Id, Presc_Detail_Id, Rule_Code Having Count(*) > 1) Order By Visit_Id, Presc_Detail_Id, Rule_Code;
第二步,查找表中多余的重复记录,不包含rowid最小的记录
Select * From Test t Where (Visit_Id, Presc_Detail_Id, Rule_Code) In (Select Visit_Id, Presc_Detail_Id, Rule_Code From Test Group By Visit_Id, Presc_Detail_Id, Rule_Code Having Count(*) > 1) And Rowid Not In (Select Min(Rowid) From Test Group By Visit_Id, Presc_Detail_Id, Rule_Code Having Count(*) > 1);
第三步,删除表中多余的重复记录,只保留rowid最小的记录
Delete From Test t Where (Visit_Id, Presc_Detail_Id, Rule_Code) In (Select Visit_Id, Presc_Detail_Id, Rule_Code From Test Group By Visit_Id, Presc_Detail_Id, Rule_Code Having Count(*) > 1) And Rowid Not In (Select Min(Rowid) From Test Group By Visit_Id, Presc_Detail_Id, Rule_Code Having Count(*) > 1);