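If values are inserted into the indexed column in strictly ordered fashion, the B-Tree index grows into an asymmetric, lopsided tree, as shown in Figure 5:
If the values are inserted in random order instead, the index tree tends toward a symmetric shape, as shown in Figure 6:
Comparing Figure 5 with Figure 6: locating block A takes 5 I/O operations in Figure 5, but only 3 in Figure 6.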
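Because the indexed column takes its values from a sequence, their ordering cannot be avoided. When creating the index, however, Oracle lets you reverse the indexed values, that is, the bits of each value are reversed before it is stored. For example, 1000, 1001, 1011, 1100 become 0001, 1001, 1101, 0011 after reversal. (Strictly speaking, Oracle reverses the bytes of each key value; the bit-level example only illustrates the idea.) The reversed form of ordered data is clearly much more scattered, so the resulting index tree is more symmetric, which improves query performance on the table.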
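A minimal sketch of declaring such an index follows. The table demo_orders, the sequence demo_orders_seq, and the index demo_orders_id_rix are hypothetical names chosen for illustration and do not appear in the original; only the REVERSE keyword is the point.

```sql
-- Hypothetical objects for illustration; not part of the original experiment.
CREATE SEQUENCE demo_orders_seq;

CREATE TABLE demo_orders (
    order_id      NUMBER        NOT NULL,   -- populated from demo_orders_seq
    customer_name VARCHAR2(100)
);

-- REVERSE stores each key with its bytes reversed, so consecutive
-- sequence values tend to land in different leaf blocks.
CREATE INDEX demo_orders_id_rix ON demo_orders (order_id) REVERSE;

INSERT INTO demo_orders (order_id, customer_name)
VALUES (demo_orders_seq.NEXTVAL, 'first customer');

-- DUMP prints the internal bytes of a NUMBER. Consecutive values
-- typically share their leading bytes and differ only near the end,
-- which is why reversing the bytes scatters them.
SELECT order_id, DUMP(order_id) AS internal_bytes
FROM   demo_orders;
```

Running the DUMP query after a few more inserts makes it easier to see why reversed keys no longer sort next to each other.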
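Reverse key indexes have a limitation, though: if the WHERE clause performs a range search on the indexed column, such as BETWEEN, <, or >, the reverse key index cannot be used and Oracle performs a full table scan instead. The reverse key index is used only for <> and = comparisons on the indexed column.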
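1. When to use a reverse index
1) When index leaf blocks are found to be hot blocks
Data access (typically bulk inserts) usually concentrates on one contiguous range of values, so with a normal index the leaf blocks covering that range easily become overheated. In severe cases this degrades system performance.
2) In a RAC environment
When several nodes in a RAC environment access data in a concentrated, intensive pattern, hot index blocks are very likely. If the system does not depend heavily on range retrieval, a reverse index is worth considering to improve performance. This is why the technique is most often seen in RAC environments, where it can significantly reduce contention for index blocks.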
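One way to check whether index leaf blocks really are hot is to look at segment-level wait statistics. The query below is a sketch, assuming the session may read V$SEGMENT_STATISTICS; the two statistic names are the ones commonly listed in V$SEGSTAT_NAME, and the exact set may vary by Oracle version.

```sql
-- Index segments ranked by block-level contention since instance startup.
-- Requires SELECT privilege on V$SEGMENT_STATISTICS.
-- Assumption: both statistic names exist in this version (see V$SEGSTAT_NAME).
SELECT owner,
       object_name,
       statistic_name,
       value
FROM   v$segment_statistics
WHERE  object_type = 'INDEX'
AND    statistic_name IN ('buffer busy waits', 'gc buffer busy')
AND    value > 0
ORDER  BY value DESC;
```

An index on a sequence-populated column that sits at the top of this list is a candidate for the technique described here.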
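2. Advantages of reverse indexes
The biggest advantage is reduced contention for index leaf blocks: fewer hot blocks and better system performance.
3. Disadvantages of reverse indexes
Because of how a reverse index is structured, it is unsuitable for systems that frequently read data through range scans (for example, "between ... and" or the comparison operators ">" and "<" in the where clause). Such queries fall back to full table scans in large numbers, which lowers system performance instead of raising it.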
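A range scan can sometimes be avoided by rewriting the SQL statement. For example, where id between 12345 and 12347 can be rewritten as where id in (12345, 12346, 12347), provided id holds only integers. The CBO transforms such a query into where id=12345 or id=12346 or id=12347, which a reverse index can serve.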
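The rewrite can be checked by comparing execution plans. The sketch below assumes the t2 table from the experiment in section 4, whose primary key PK_T2 is a reverse key index; the plans actually produced depend on statistics and Oracle version, so no output is shown.

```sql
-- Range predicate: a reverse key index cannot serve this as a range scan.
EXPLAIN PLAN FOR
SELECT object_name FROM t2 WHERE id BETWEEN 12345 AND 12347;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Equivalent IN-list; only valid because id holds integers.
EXPLAIN PLAN FOR
SELECT object_name FROM t2 WHERE id IN (12345, 12346, 12347);

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

The rewrite is only practical for short, enumerable ranges; a wide range would produce an unmanageably long IN-list.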
4. A small experiment demonstrating how to create and modify a reverse index
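The original does not show how the three tables were built. The following is an assumed setup that would produce the two primary key indexes used in the session: it presumes t3 already exists with a unique, non-null id column, and it copies only the column layout of t3.

```sql
-- Assumed setup; the original does not show how t1, t2 and t3 were created.
-- Both targets copy the column layout of t3 without copying any rows.
CREATE TABLE t1 AS SELECT * FROM t3 WHERE 1 = 0;
CREATE TABLE t2 AS SELECT * FROM t3 WHERE 1 = 0;

-- Normal primary key index.
ALTER TABLE t1 ADD CONSTRAINT pk_t1 PRIMARY KEY (id);

-- Primary key enforced by a reverse key index.
ALTER TABLE t2 ADD CONSTRAINT pk_t2 PRIMARY KEY (id)
    USING INDEX (CREATE UNIQUE INDEX pk_t2 ON t2 (id) REVERSE);
```

With this setup, USER_INDEXES reports PK_T1 as NORMAL and PK_T2 as NORMAL/REV.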
SQL> select count(*) from t1;

  COUNT(*)
----------
         0

SQL> select count(*) from t2;

  COUNT(*)
----------
         0

SQL> select count(*) from t3;

  COUNT(*)
----------
   2000000

SQL> select INDEX_NAME,INDEX_TYPE,TABLE_NAME from user_indexes;

INDEX_NAME                     INDEX_TYPE                  TABLE_NAME
------------------------------ --------------------------- ------------------------------
PK_T2                          NORMAL/REV                  T2
PK_T1                          NORMAL                      T1

Table t1 has a normal primary key index, while the primary key index of table t2 is a reverse key index. Next, the data in table t3 is inserted into t1 and then into t2.
SQL> set timing on;
SQL> set autotrace on;
SQL> insert /* +append */ into t1 select * from t3;

2000000 rows created.

Elapsed: 00:01:42.83

Execution Plan
----------------------------------------------------------
Plan hash value: 4161002650

---------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |      |  2316K|   485M| 19014   (1)| 00:03:49 |
|   1 |  LOAD TABLE CONVENTIONAL | T1   |       |       |            |          |
|   2 |   TABLE ACCESS FULL      | T3   |  2316K|   485M| 19014   (1)| 00:03:49 |
---------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
      12305  recursive calls
     538835  db block gets
     203937  consistent gets
      83057  physical reads
  428323528  redo size
        688  bytes sent via SQL*Net to client
        614  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
    2000000  rows processed

SQL> commit;

Commit complete.

Elapsed: 00:00:00.04

SQL> insert /* +append */ into t2 select * from t3;

2000000 rows created.

Elapsed: 00:02:02.63

Execution Plan
----------------------------------------------------------
Plan hash value: 4161002650

---------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |      |  2316K|   485M| 19014   (1)| 00:03:49 |
|   1 |  LOAD TABLE CONVENTIONAL | T2   |       |       |            |          |
|   2 |   TABLE ACCESS FULL      | T3   |  2316K|   485M| 19014   (1)| 00:03:49 |
---------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
       7936  recursive calls
    6059147  db block gets
     158053  consistent gets
      56613  physical reads
  790167468  redo size
        689  bytes sent via SQL*Net to client
        614  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
    2000000  rows processed

SQL> commit;

Commit complete.

Elapsed: 00:00:00.01
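As the statistics show, the reverse index spreads its entries across many more blocks, so db block gets rose sharply, from 538,835 to 6,059,147 (roughly 11 times). Hot block contention eased somewhat: consistent gets fell from 203,937 to 158,053, a reduction of 45,884. Redo size also grew, from 428,323,528 to 790,167,468 bytes, and the insert took longer (00:02:02.63 versus 00:01:42.83). Note that /* +append */ is treated as an ordinary comment rather than a hint, because a hint must begin with /*+ with no space before the plus sign; this is why both plans show LOAD TABLE CONVENTIONAL instead of a direct-path load. Next, let's run some queries and see how the two tables differ.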
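To see how the two indexes ended up physically, their structure can be compared through INDEX_STATS. This is a sketch that the original session does not include; note that ANALYZE ... VALIDATE STRUCTURE blocks DML on the table while it runs, so it is best kept to a test system.

```sql
-- INDEX_STATS holds a single row: the index most recently validated
-- in this session, so query it right after each ANALYZE.
ANALYZE INDEX pk_t1 VALIDATE STRUCTURE;

SELECT name, height, lf_blks, br_blks, pct_used
FROM   index_stats;

ANALYZE INDEX pk_t2 VALIDATE STRUCTURE;

SELECT name, height, lf_blks, br_blks, pct_used
FROM   index_stats;
```

Comparing height, lf_blks (leaf blocks), and pct_used for PK_T1 and PK_T2 gives a concrete picture of the difference between ordered and scattered inserts.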
SQL> set autotrace traceonly;
SQL> select OBJECT_NAME from t1 where id = 100;

Elapsed: 00:00:00.06

Execution Plan
----------------------------------------------------------
Plan hash value: 1141790563

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    79 |     0   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |    79 |     0   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | PK_T1 |     1 |       |     0   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ID"=100)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          3  physical reads
          0  redo size
        434  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> select OBJECT_NAME from t1 where id > 100 and id < 200;

99 rows selected.

Elapsed: 00:00:01.10

Execution Plan
----------------------------------------------------------
Plan hash value: 1249713949

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    99 |  7821 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |    99 |  7821 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | PK_T1 |    99 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ID">100 AND "ID"<200)

Note
-----
   - dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
          9  recursive calls
          0  db block gets
        140  consistent gets
        189  physical reads
       2356  redo size
       2656  bytes sent via SQL*Net to client
        482  bytes received via SQL*Net from client
          8  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         99  rows processed

SQL> select OBJECT_NAME from t2 where id = 100;

Elapsed: 00:00:00.05

Execution Plan
----------------------------------------------------------
Plan hash value: 1480579010

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    79 |     0   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T2    |     1 |    79 |     0   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | PK_T2 |     1 |       |     0   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ID"=100)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          4  consistent gets
          1  physical reads
          0  redo size
        434  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> select OBJECT_NAME from t2 where id > 100 and id < 200;

99 rows selected.

Elapsed: 00:00:04.39

Execution Plan
----------------------------------------------------------
Plan hash value: 1513984157

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   336 | 26544 |  8282   (1)| 00:01:40 |
|*  1 |  TABLE ACCESS FULL| T2   |   336 | 26544 |  8282   (1)| 00:01:40 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("ID">100 AND "ID"<200)

Note
-----
   - dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
         29  recursive calls
          1  db block gets
      60187  consistent gets
      30335  physical reads
       5144  redo size
       2656  bytes sent via SQL*Net to client
        482  bytes received via SQL*Net from client
          8  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         99  rows processed
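As you can see, for single-value lookups t1 and t2 behave the same (both use INDEX UNIQUE SCAN with 4 consistent gets). For the range query, however, t1 uses INDEX RANGE SCAN while t2 falls back to TABLE ACCESS FULL, at 60,187 consistent gets versus 140. In database tuning you will often find that nothing is absolutely good and nothing is absolutely bad.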
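Before turning to a reverse index, in most cases it is worth considering hash partitioning the index first to reduce contention on index leaf blocks.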
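A hash-partitioned global index could look like the sketch below. The table demo_events and index demo_events_id_hix are hypothetical names, the partition count of 8 is an arbitrary choice, and the Partitioning option must be licensed and enabled.

```sql
-- Hypothetical table and index names; requires the Partitioning option.
CREATE TABLE demo_events (
    event_id NUMBER NOT NULL,
    payload  VARCHAR2(200)
);

-- Entries are distributed over 8 index partitions by a hash of event_id,
-- so concurrent inserts of adjacent values tend to hit different partitions.
CREATE INDEX demo_events_id_hix ON demo_events (event_id)
    GLOBAL PARTITION BY HASH (event_id)
    PARTITIONS 8;
```

Unlike a reverse key index, keys stay in order inside each partition, so a range predicate can still be answered from the index, at the cost of probing every partition.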
Rebuilding an existing index as a reverse index, or back to a normal one:
alter index id_inx rebuild reverse online;
alter index id_inx rebuild online reverse;
alter index name_inx rebuild online noreverse;
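After a rebuild, the data dictionary can confirm the result. This check assumes the indexes id_inx and name_inx from the statements above belong to the current user.

```sql
-- Index names are stored in upper case in the data dictionary.
-- INDEX_TYPE reads NORMAL/REV for a reverse key index, NORMAL otherwise.
SELECT index_name, index_type, status
FROM   user_indexes
WHERE  index_name IN ('ID_INX', 'NAME_INX');
```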