SQL优化-Filter正名

时间:2024-04-11 12:17:03

之前我写过一篇博客分析Filter也有性能好的一面,最近看了一下,感觉说的不够浅显易懂,这里再写一篇。

之前博客链接:

https://blog.csdn.net/songjian1104/article/details/98325658

1 数据准备

drop table emp1;

drop table dept1;

create table emp1 as select * from emp;

insert into emp1 select * from emp1;

insert into emp1 select * from emp1;

insert into emp1 select * from emp1;

insert into emp1 select * from emp1;

insert into emp1 select * from emp1;

insert into emp1 select * from emp1;

insert into emp1 select * from emp1;

insert into emp1 select * from emp1;

create table dept1 as select * from dept;

insert into dept1 select * from dept1;

insert into dept1 select * from dept1;

insert into dept1 select * from dept1;

insert into dept1 select * from dept1;

insert into dept1 select * from dept1;

insert into dept1 select * from dept1;

insert into dept1 select * from dept1;

insert into dept1 select * from dept1;

commit;

2 需求分析

需求很简单,大家直接看SQL理解吧。

3 SQL改写

select count(distinct e.EMPNO)

from emp1 e

where e.empno=7788

   or e.deptno in (select deptno

                   from dept1 d

                               where d.dname='SALES');

 

虽然有Filter,但是结果还是秒出了。

SQL优化-Filter正名

select count(empno) from

(select *

from emp1 e

where e.empno=7788

union

select *

from emp1 e

where e.deptno in (select deptno

                   from dept1 d

                               where d.dname='SALES'));

SQL优化-Filter正名

我们发现,Filter消除后,执行时间大幅度增加。

原因很简单,我们回想Filter连接的算法,第一次连接后,连接的结果会被缓存,内存中会维护一个Table,结构如下

SQL优化-Filter正名

这样,emp1表向dept1表传值进行判定时,先去缓存的表里检索,如果发现结构已经缓存,那么直接从缓存里取值,避免了dept1表的再检索。

但是,消除Filter后,第一,emp1要多扫描一次,第二,看到执行计划中,内存使用也要相应多很多,在连接计算上,成本也要高出很多。

所以,表里面连接列重复值很多时,Filter有时候性能也很好。具体问题具体分析。