一个实际工作中的sql的模拟

时间:2021-08-09 21:24:17

 

create table t as select * from dba_objects;

update t set LAST_DDL_TIME=to_date('20040101','yyyymmdd')+trunc(dbms_random.value(1,360))

SQL> desc x
 Name                          Null?    Type
 ----------------------------- -------- --------------------
 OBJECT_ID                     NOT NULL NUMBER
 TYPE                                   CHAR(2)
 NAME                                   VARCHAR2(128)

alter table t add constraint ind_t_pk primary key(object_id);
alter table x add constraint ind_x_pk primary key(object_id);
create index ind_t on t(LAST_DDL_TIME);
 
insert into x  select * from  (select object_id,'11',object_name from t order by dbms_random.value)  where rownum<100;
update x set type='22' where rownum<40;
commit;
execute dbms_stats.gather_schema_stats('QDL');
到此就利用dba_objects构造了模拟环境。
t在现实中有4900万纪录,x现实中有500万纪录。
程序要求找出x里面类型为'22'的并且在t里面时间在某个范围的所有x的记录
其sql为:
select x.object_id, x.name, x.type
  from x, t
 where x.object_id = t.object_id
   and x.type = '22'
   and t.LAST_DDL_TIME > to_date('20041101', 'yyyymmdd');
不加任何hint,计划为下面,应该是由于t内的数据量不够,优化器自动走了全表
-----------------------------------------------------------------------------------------
| Id  | Operation                    | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |          |    49 |  1862 |    34   (3)| 00:00:01 |
|   1 |  NESTED LOOPS                |          |    49 |  1862 |    34   (3)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL          | T        |  1635 | 21255 |    33   (4)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID| X        |     1 |    25 |     1   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | IND_X_PK |     1 |       |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------


首先想到的就是object_id为2个表各自的主健,当然要用
  
select /*+ INDEX(T,IND_T_PK) */
x.object_id,x.name,x.type
from x,t
where x.object_id =t.object_id
and t.LAST_DDL_TIME >to_date('20040801','yyyymmdd');

-----------------------------------------------------------------------------------------
| Id  | Operation                    | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |          |    99 |  3762 |   101   (0)| 00:00:02 |
|   1 |  NESTED LOOPS                |          |    99 |  3762 |   101   (0)| 00:00:02 |
|   2 |   TABLE ACCESS FULL          | X        |    99 |  2475 |     2   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID| T        |     1 |    13 |     1   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | IND_T_PK |     1 |       |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        208  consistent gets
          0  physical reads
          0  redo size
       1877  bytes sent via SQL*Net to client
        407  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         38  rows processed
随后反复调整时间范围,consistent gets没有任何变化,由于现时中表记录很多,响应时间在20多秒,无法接受

调整SQL使用基于时间的索引
 select /*+ INDEX(T,IND_T) */
  x.object_id, x.name, x.type
   from x, t
  where x.object_id = t.object_id
    and x.type = '22'
    and t.LAST_DDL_TIME > to_date('20040101', 'yyyymmdd');

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |    50 |  1900 |  9777   (1)| 00:01:58 |
|*  1 |  HASH JOIN                   |       |    50 |  1900 |  9777   (1)| 00:01:58 |
|*  2 |   TABLE ACCESS FULL          | X     |    50 |  1250 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS BY INDEX ROWID| T     | 10838 |   137K|  9774   (1)| 00:01:58 |
|*  4 |    INDEX RANGE SCAN          | IND_T | 10838 |       |    30   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       9767  consistent gets
         14  physical reads
          0  redo size
       1439  bytes sent via SQL*Net to client
        396  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         29  rows processed
当时间条件为20040101时(包含全部记录),得到了一个恐怖的consistent gets
但是现实中的时间窗口跨度不大

 select /*+ INDEX(T,IND_T) */
  x.object_id, x.name, x.type
   from x, t
  where x.object_id = t.object_id
    and x.type = '22'
    and t.LAST_DDL_TIME > to_date('20041220', 'yyyymmdd');
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        149  consistent gets
          0  physical reads
          0  redo size
        386  bytes sent via SQL*Net to client
        374  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed
在取出的记录数少的时候,用时间的索引可以控制consistent gets达到更少的值

由此想到建立一个基于时间和object_id的联合唯一索引,可能会获得更好的性能
create unique index ind_t_unique on t(LAST_DDL_TIME,object_id);

 select /*+ INDEX(T,IND_T_UNIQUE) */
  x.object_id, x.name, x.type
   from x, t
  where x.object_id = t.object_id
    and x.type = '22'
-----------------------------------------------------------------------------------
| Id  | Operation          | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |              |    50 |  1500 |    38   (3)| 00:00:01 |
|*  1 |  HASH JOIN         |              |    50 |  1500 |    38   (3)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| X            |    50 |  1250 |     2   (0)| 00:00:01 |
|   3 |   INDEX FULL SCAN  | IND_T_UNIQUE | 10838 | 54190 |    35   (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         40  consistent gets
         13  physical reads
          0  redo size
       1439  bytes sent via SQL*Net to client
        396  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         29  rows processed
不加任何时间条件,索引使用了INDEX FULL SCAN,而这时的consistent gets也很小。加上时间条件就会变成ind_t_unique的INDEX UNIQUE SCAN,结果有望更加惊人。
 select /*+ INDEX(T,IND_T_UNIQUE) */
  x.object_id, x.name, x.type
   from x, t
  where x.object_id = t.object_id
    and x.type = '22'
    and t.LAST_DDL_TIME > to_date('20041220', 'yyyymmdd');
---------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |    49 |  1862 |     3   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |              |    49 |  1862 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN           | IND_T_UNIQUE |   151 |  1963 |     2   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID| X            |     1 |    25 |     1   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | IND_X_PK     |     1 |       |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------   
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
        386  bytes sent via SQL*Net to client
        374  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

经过这样的测试,就可以在运行环境建立对应索引测试性能了,现实环境比这个测试复杂得多,表的个数8个,而且大量的嵌套。而且也武斗正在执行,很难随便建立索引进行各种测试。
所以根据现实的sql搭建一个模拟的环境进行测试是sql调整的一个有效方法。