整理之前的优化案例,觉得下面这个应该是开发很难发现也会很困惑的一个低效SQL。
看下面这个SQL。你看到这个SQL会不会感觉很正常。其实我刚看到也觉得正常得不得了。但是测试后它确实效率很低。
select test1.object_id, test1.object_name, test1.owner, test1.object_type
from test1
inner join (select test1.object_id as object_id1,
test2.object_id as object_id2,
test3.object_id as object_id3,
test1.object_name as object_name1
from test1, test2, test3
where test1.object_id = test2.object_id
and test1.object_id = test3.object_id
and test2.object_id = test3.object_id) temp
on test1.object_id = temp.object_id1;
测试数据:
create table test1 as select object_id,object_name,owner,object_type from dba_objects;--13433行
create table test2 as select * from test1;--13433行
create table test3 as select * from test1;--13433行
create index i_test1_id on test1(object_id);
analyze table test1 compute statistics for table for all indexes for all indexed columns;
analyze table test2 compute statistics for table;
analyze table test3 compute statistics for table;
原始SQL:
SQL> select test1.object_id, test1.object_name, test1.owner, test1.object_type
2 from test1
3 inner join (select test1.object_id as object_id1,
4 test2.object_id as object_id2,
5 test3.object_id as object_id3,
6 test1.object_name as object_name1
7 from test1, test2, test3
8 where test1.object_id = test2.object_id
9 and test1.object_id = test3.object_id
10 and test2.object_id = test3.object_id) temp
11 on test1.object_id = temp.object_id1;
已选择13433行。
已用时间: 00: 00: 00.41
执行计划
----------------------------------------------------------
Plan hash value: 3055531907
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 68 | 62 (4)| 00:00:01 |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 1 | 68 | 62 (4)| 00:00:01 |
|* 3 | HASH JOIN | | 1 | 30 | 60 (4)| 00:00:01 |
| 4 | TABLE ACCESS FULL | TEST3 | 13433 | 170K| 24 (0)| 00:00:01 |
|* 5 | HASH JOIN | | 13433 | 223K| 35 (3)| 00:00:01 |
| 6 | INDEX FAST FULL SCAN | I_TEST1_ID | 13433 | 53732 | 10 (0)| 00:00:01 |
| 7 | TABLE ACCESS FULL | TEST2 | 13433 | 170K| 24 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | I_TEST1_ID | 1 | | 1 (0)| 00:00:01 |
| 9 | TABLE ACCESS BY INDEX ROWID| TEST1 | 1 | 38 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("TEST1"."OBJECT_ID"="TEST3"."OBJECT_ID" AND
"TEST2"."OBJECT_ID"="TEST3"."OBJECT_ID")
5 - access("TEST1"."OBJECT_ID"="TEST2"."OBJECT_ID")
8 - access("TEST1"."OBJECT_ID"="TEST1"."OBJECT_ID")
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
4086 consistent gets
0 physical reads
0 redo size
549707 bytes sent via SQL*Net to client
10260 bytes received via SQL*Net from client
897 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
13433 rows processed
是不是开始困惑了?子查询我三个表数据一模一样,都是用的inner join,按我们造的测试数据都是从dba_objects的,object_id应该也是唯一的。
test1、test2和test3关联结果应该跟三个表的结果集是一样的才对,怎么三个表hash join的结果竟然是1.最后再跟test1进行一次join,优化器很容易就选择了nested loop。
其实这个我困惑,猜想,优化器由test1.object_id = test2.object_id and test1.object_id = test3.object_id,可以推导出test2.object_id=test3.object_id,
但是限制条件中又写了一遍,它倒评估错了。至于啥原理,我也没搞懂。
去掉test2.object_id=test3.object_id限制,再看下执行计划:
SQL> select test1.object_id, test1.object_name, test1.owner, test1.object_type
2 from test1
3 inner join (select test1.object_id as object_id1,
4 test2.object_id as object_id2,
5 test3.object_id as object_id3,
6 test1.object_name as object_name1
7 from test1, test2, test3
8 where test1.object_id = test2.object_id
9 and test1.object_id = test3.object_id
10 /*and test2.object_id = test3.object_id*/) temp
11 on test1.object_id = temp.object_id1;
已选择13433行。
已用时间: 00: 00: 00.36
执行计划
----------------------------------------------------------
Plan hash value: 144638806
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 13433 | 892K| 84 (3)| 00:00:02 |
|* 1 | HASH JOIN | | 13433 | 892K| 84 (3)| 00:00:02 |
| 2 | TABLE ACCESS FULL | TEST3 | 13433 | 170K| 24 (0)| 00:00:01 |
|* 3 | HASH JOIN | | 13433 | 721K| 60 (4)| 00:00:01 |
| 4 | TABLE ACCESS FULL | TEST2 | 13433 | 170K| 24 (0)| 00:00:01 |
|* 5 | HASH JOIN | | 13433 | 550K| 35 (3)| 00:00:01 |
| 6 | INDEX FAST FULL SCAN| I_TEST1_ID | 13433 | 53732 | 10 (0)| 00:00:01 |
| 7 | TABLE ACCESS FULL | TEST1 | 13433 | 498K| 24 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("TEST1"."OBJECT_ID"="TEST3"."OBJECT_ID")
3 - access("TEST1"."OBJECT_ID"="TEST2"."OBJECT_ID")
5 - access("TEST1"."OBJECT_ID"="TEST1"."OBJECT_ID")
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
1161 consistent gets
0 physical reads
0 redo size
549707 bytes sent via SQL*Net to client
10260 bytes received via SQL*Net from client
897 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
13433 rows processed
这次评估是没错的,外部的test1和三个表关联的结果是一样的,再走hash join也是理所当然的。
这个案例,不是条件写得越多越好。开发可能觉得,把能写的条件都写进去,可是优化器的评估却出乎意料。
至于优化器评估原理,还没搞懂:(