--查看表数据的内容和分布
SQL> select count(*) from members;
COUNT(*)
----------
345567
SQL> CREATE INDEX ct_ind ON members(member_name) INDEXTYPE IS CTXSYS.CONTEXT;
索引已创建。
实验1:全文索引性能差于模糊查询
SQL> set autotrace traceonly
SQL> SELECT * FROM members WHERE CONTAINS(member_name, '亚') > 0;
已选择2629行。
执行计划
----------------------------------------------------------
Plan hash value: 420797271
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 790 | 47400 | 289 (0)| 00:00:04 |
| 1 | TABLE ACCESS BY INDEX ROWID| MEMBERS | 790 | 47400 | 289 (0)| 00:00:04 |
|* 2 | DOMAIN INDEX | CT_IND | | | 139 (0)| 00:00:02 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("CTXSYS"."CONTAINS"("MEMBER_NAME",'亚')>0)
统计信息
----------------------------------------------------------
15 recursive calls
0 db block gets
3904 consistent gets
0 physical reads
0 redo size
206863 bytes sent via SQL*Net to client
2445 bytes received via SQL*Net from client
177 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2629 rows processed
SQL> select * from members where member_name like '%亚%';
已选择2629行。
执行计划
----------------------------------------------------------
Plan hash value: 3682585452
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 17252 | 1010K| 877 (1)| 00:00:11 |
|* 1 | TABLE ACCESS FULL| MEMBERS | 17252 | 1010K| 877 (1)| 00:00:11 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("MEMBER_NAME" LIKE '%亚%' AND "MEMBER_NAME" IS NOT NULL)
统计信息
----------------------------------------------------------
1 recursive calls
0 db block gets
3329 consistent gets
0 physical reads
0 redo size
182298 bytes sent via SQL*Net to client
2445 bytes received via SQL*Net from client
177 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2629 rows processed
结论:
全文索引逻辑读:3904
模糊查询逻辑读:3329
经过测试可以看到在查询结果集比较多时,全文索引的性能优势并不明显,甚至比模糊查询的全表扫描还要低。
实验二:全文索引性能优于模糊查询的情况
--同样的SQL只改变查询条件
SQL> SELECT * FROM members WHERE CONTAINS(member_name, '亚南')>0;
已选择15行。
执行计划
----------------------------------------------------------
Plan hash value: 420797271
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20 | 1200 | 16 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| MEMBERS | 20 | 1200 | 16 (0)| 00:00:01 |
|* 2 | DOMAIN INDEX | CT_IND | | | 12 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("CTXSYS"."CONTAINS"("MEMBER_NAME",'亚南')>0)
统计信息
----------------------------------------------------------
144 recursive calls
0 db block gets
1172 consistent gets
0 physical reads
0 redo size
2075 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
15 rows processed
SQL> select * from members where member_name like '%亚南%';
已选择15行 。
执行计划
----------------------------------------------------------
Plan hash value: 3682585452
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 17252 | 1010K| 877 (1)| 00:00:11 |
|* 1 | TABLE ACCESS FULL| MEMBERS | 17252 | 1010K| 877 (1)| 00:00:11 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("MEMBER_NAME" LIKE '%亚南%' AND "MEMBER_NAME" IS NOT NULL)
统计信息
----------------------------------------------------------
1 recursive calls
0 db block gets
3155 consistent gets
0 physical reads
0 redo size
1965 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
15 rows processed
结论:
全文索引逻辑读:1172
模糊查询逻辑读:3155
经过测试可以看到在查询结果集比较少时,全文索引的性能优于传统的全表扫描的模糊查询。
总结:
1.全文索引在查询结果比较多和跟其他索引的配合使用时,性能并不会一定会高,而且结果不一定会完全正确,这取决于使用何种分词方法。
2.由于全文索引是以空间换时间的思想,创建时占用大量的存储空间,适用于在长度比较小的字段上创建,减少分词量占用的空间,如name,title类似的字段。
3由于全文索引维护成本比较高,需要定期进行同步刷新索引数据,频繁刷新数据会对生产环境带来压力,这点物化视图有点类似,所以同步频率根据数据量和需求进行权衡,在OLTP生产环境要慎用。
4.由于全文索引bug较多,使用前一定做好测试,确保对其它表和其它应用环境不生产影响。笔者就遇到过创建全文索引导致logic dataguard同步失效的情况。