Oracle查询和过滤重复数据

时间:2023-03-09 15:54:36
Oracle查询和过滤重复数据

对数据库某些意外情况,引起的重复数据,如何处理呢?

----------------查重复:

select *
from satisfaction_survey s
where s.as_side = 0
and s.project_no in (select ss.project_no
from satisfaction_survey ss
where ss.as_side = 0
group by ss.project_no
having count(*) > 1)
order by s.project_no, s.submit_time

思路:从预先选出的数据中(s.as_side = 0),根据特定字段(s.project_no),判断是否存在(count(*) > 1,符合条件的数据不止一条)重复数据

当然,也可以根据多个字段(in 前面的字段,用个括号括起来,in select子句的返回字段对应改变)判断是否存在重复数据

----------------------去重复:

delete from satisfaction_survey s
where s.as_side = 0
and(s.project_no/*, to_char(s.submit_time, 'yyyy-mm-dd')*/) in
(select ss.project_no/*, to_char(ss.submit_time, 'yyyy-mm-dd') */
from satisfaction_survey ss
where ss.as_side=0//根据自己实际情况,确定子句(清楚每个sql的结果集很重要)
group by ss.project_no/*, to_char(ss.submit_time, 'yyyy-mm-dd')*/
having count(*) > 1)
and rowid not in
(select max(rowid) from satisfaction_survey where as_side=0 group by project_no/*, to_char(submit_time, 'yyyy-mm-dd')*/ having count(*) > 1)

思路:在删除全部重复数据时,保留一条(and rowid not in)符合自己需求的数据

这里为了保证程序的健壮性,可以在查询数据时防止查出多条数据,在返回结果外面加一层,只取一条数据:

select *
from (SELECT ROW_NUMBER() OVER(ORDER BY s.submit_time) 插入时间, s.*
FROM satisfaction_survey s
where s.project_no = 'WW-PM-2017-RP-449'
and s.employee_no = 'BF1019589'
and s.as_side = 0
--and s.mark is not null)
and s.mark is null)
where 插入时间 = 1

这里根据时间排序,取符合条件的第一条