在使用Oracle数据库进行分页查询算法设计时,分页查询的SQL语句基本上可以按照下面给出的模板来进行套用:
分页查询格式:
SELECT * FROM
(
SELECT A.*, ROWNUM RN
FROM (SELECT * FROM TABLE_NAME) A
WHERE ROWNUM <= 20
)
WHERE RN >= 11
其中最内层的查询SELECT * FROM TABLE_NAME表示不进行翻页的原始查询语句。ROWNUM <= 20和RN >= 11控制分页查询的每页的范围。
上面给出的这个分页查询语句,在大多数情况拥有较高的效率。分页的目的就是控制输出结果集大小,将结果尽快的返回。在上面的分页查询语句
中,这种考虑主要体现在WHERE ROWNUM <= 20这句上。
选择第11到20条记录存在两种方法,一种是上面例子中展示的在查询的第二层通过ROWNUM <= 20来控制最大值,在查询的最外层控制最小值。而另
一种方式是去掉查询第二层的WHERE ROWNUM <= 20语句,在查询的最外层控制分页的最小值和最大值。这是,查询语句如下:
SELECT * FROM
(
SELECT A.*, ROWNUM RN
FROM (SELECT * FROM TABLE_NAME) A
)
WHERE RN BETWEEN 11 AND 20
对比这两种写法,绝大多数的情况下,第一个查询的效率比第二个高得多。
这是由于CBO优化模式下,Oracle可以将外层的查询条件推到内层查询中,以提高内层查询的执行效率。对于第一个查询语句,第二层的查询条件
WHERE ROWNUM <= 20就可以被Oracle推入到内层查询中,这样Oracle查询的结果一旦超过了ROWNUM限制条件,就终止查询将结果返回了。
而第二个查询语句,由于查询条件BETWEEN 11 AND 20是存在于查询的第三层,而Oracle无法将第三层的查询条件推到最内层(即使推到最内层也
没有意义,因为最内层查询不知道RN代表什么)。因此,对于第二个查询语句,Oracle最内层返回给中间层的是所有满足条件的数据,而中间层返
回给最外层的也是所有数据。数据的过滤在最外层完成,显然这个效率要比第一个查询低得多。
上面分析的查询不仅仅是针对单表的简单查询,对于最内层查询是复杂的多表联合查询或最内层查询包含排序的情况一样有效。
下面简单讨论一下多表联合的情况。对于最常见的等值表连接查询,
CBO一般可能会采用两种连接方式NESTED LOOP和HASH JOIN(MERGE JOIN效率比HASH JOIN效率低,一般CBO不会考虑)。在这里,由于使用了分页
,因此指定了一个返回的最大记录数,NESTED LOOP在返回记录数超过最大值时可以马上停止并将结果返回给中间层,而HASH JOIN必须处理完所有
结果集(MERGE JOIN也是)。那么在大部分的情况下,对于分页查询选择NESTED LOOP作为查询的连接方法具有较高的效率(分页查询的时候绝大
部分的情况是查询前几页的数据,越靠后面的页数访问几率越小)。
因此,如果不介意在系统中使用HINT的话,可以将分页的查询语句改写为:
SELECT /*+ FIRST_ROWS */ * FROM
(
SELECT A.*, ROWNUM RN
FROM (SELECT * FROM TABLE_NAME) A
WHERE ROWNUM <= 20
)
WHERE RN >= 11
下面用几个例子来说明分页查询的效率。首先选择一个比较大的表作为测试表:
可以采用Oracle 10g R2数据库自带的示例方案中的OE用户的数据表 customers表做测试:
下面是CUSTOMERS表的DDL定义语句,主要是为了让大家理解表结构
CREATE TABLE "SH"."CUSTOMERS"
( "CUST_ID" NUMBER NOT NULL ENABLE,
"CUST_FIRST_NAME" VARCHAR2(20) NOT NULL ENABLE,
"CUST_LAST_NAME" VARCHAR2(40) NOT NULL ENABLE,
"CUST_GENDER" CHAR(1) NOT NULL ENABLE,
"CUST_YEAR_OF_BIRTH" NUMBER(4,0) NOT NULL ENABLE,
"CUST_MARITAL_STATUS" VARCHAR2(20),
"CUST_STREET_ADDRESS" VARCHAR2(40) NOT NULL ENABLE,
"CUST_POSTAL_CODE" VARCHAR2(10) NOT NULL ENABLE,
"CUST_CITY" VARCHAR2(30) NOT NULL ENABLE,
"CUST_CITY_ID" NUMBER NOT NULL ENABLE,
"CUST_STATE_PROVINCE" VARCHAR2(40) NOT NULL ENABLE,
"CUST_STATE_PROVINCE_ID" NUMBER NOT NULL ENABLE,
"COUNTRY_ID" NUMBER NOT NULL ENABLE,
"CUST_MAIN_PHONE_NUMBER" VARCHAR2(25) NOT NULL ENABLE,
"CUST_INCOME_LEVEL" VARCHAR2(30),
"CUST_CREDIT_LIMIT" NUMBER,
"CUST_EMAIL" VARCHAR2(30),
"CUST_TOTAL" VARCHAR2(14) NOT NULL ENABLE,
"CUST_TOTAL_ID" NUMBER NOT NULL ENABLE,
"CUST_SRC_ID" NUMBER,
"CUST_EFF_FROM" DATE,
"CUST_EFF_TO" DATE,
"CUST_VALID" VARCHAR2(1),
CONSTRAINT "CUSTOMERS_PK" PRIMARY KEY ("CUST_ID")
USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 NOLOGGING COMPUTE STATISTICS
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "EXAMPLE" ENABLE NOVALIDATE,
CONSTRAINT "CUSTOMERS_COUNTRY_FK" FOREIGN KEY ("COUNTRY_ID")
REFERENCES "SH"."COUNTRIES" ("COUNTRY_ID") ENABLE NOVALIDATE
) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS NOLOGGING
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "EXAMPLE" ;
CREATE BITMAP INDEX "SH"."CUSTOMERS_GENDER_BIX" ON "SH"."CUSTOMERS" ("CUST_GENDER")
PCTFREE 10 INITRANS 2 MAXTRANS 255 NOLOGGING COMPUTE STATISTICS
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "EXAMPLE" ;
CREATE BITMAP INDEX "SH"."CUSTOMERS_MARITAL_BIX" ON "SH"."CUSTOMERS" ("CUST_MARITAL_STATUS")
PCTFREE 10 INITRANS 2 MAXTRANS 255 NOLOGGING COMPUTE STATISTICS
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "EXAMPLE" ;
CREATE UNIQUE INDEX "SH"."CUSTOMERS_PK" ON "SH"."CUSTOMERS" ("CUST_ID")
PCTFREE 10 INITRANS 2 MAXTRANS 255 NOLOGGING COMPUTE STATISTICS
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "EXAMPLE" ;
CREATE BITMAP INDEX "SH"."CUSTOMERS_YOB_BIX" ON "SH"."CUSTOMERS" ("CUST_YEAR_OF_BIRTH")
PCTFREE 10 INITRANS 2 MAXTRANS 255 NOLOGGING COMPUTE STATISTICS
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "EXAMPLE" ;
ALTER TABLE "SH"."CUSTOMERS" ADD CONSTRAINT "CUSTOMERS_PK" PRIMARY KEY ("CUST_ID")
USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 NOLOGGING COMPUTE STATISTICS
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "EXAMPLE" ENABLE NOVALIDATE;
ALTER TABLE "SH"."CUSTOMERS" MODIFY ("CUST_ID" NOT NULL ENABLE);
ALTER TABLE "SH"."CUSTOMERS" MODIFY ("CUST_FIRST_NAME" NOT NULL ENABLE);
ALTER TABLE "SH"."CUSTOMERS" MODIFY ("CUST_LAST_NAME" NOT NULL ENABLE);
ALTER TABLE "SH"."CUSTOMERS" MODIFY ("CUST_GENDER" NOT NULL ENABLE);
ALTER TABLE "SH"."CUSTOMERS" MODIFY ("CUST_YEAR_OF_BIRTH" NOT NULL ENABLE);
ALTER TABLE "SH"."CUSTOMERS" MODIFY ("CUST_STREET_ADDRESS" NOT NULL ENABLE);
ALTER TABLE "SH"."CUSTOMERS" MODIFY ("CUST_POSTAL_CODE" NOT NULL ENABLE);
ALTER TABLE "SH"."CUSTOMERS" MODIFY ("CUST_CITY" NOT NULL ENABLE);
ALTER TABLE "SH"."CUSTOMERS" MODIFY ("CUST_CITY_ID" NOT NULL ENABLE);
ALTER TABLE "SH"."CUSTOMERS" MODIFY ("CUST_STATE_PROVINCE" NOT NULL ENABLE);
ALTER TABLE "SH"."CUSTOMERS" MODIFY ("CUST_STATE_PROVINCE_ID" NOT NULL ENABLE);
ALTER TABLE "SH"."CUSTOMERS" MODIFY ("COUNTRY_ID" NOT NULL ENABLE);
ALTER TABLE "SH"."CUSTOMERS" MODIFY ("CUST_MAIN_PHONE_NUMBER" NOT NULL ENABLE);
ALTER TABLE "SH"."CUSTOMERS" MODIFY ("CUST_TOTAL" NOT NULL ENABLE);
ALTER TABLE "SH"."CUSTOMERS" MODIFY ("CUST_TOTAL_ID" NOT NULL ENABLE);
COMMENT ON COLUMN "SH"."CUSTOMERS"."CUST_ID" IS 'primary key';
COMMENT ON COLUMN "SH"."CUSTOMERS"."CUST_FIRST_NAME" IS 'first name of the customer';
COMMENT ON COLUMN "SH"."CUSTOMERS"."CUST_LAST_NAME" IS 'last name of the customer';
COMMENT ON COLUMN "SH"."CUSTOMERS"."CUST_GENDER" IS 'gender; low cardinality attribute';
COMMENT ON COLUMN "SH"."CUSTOMERS"."CUST_YEAR_OF_BIRTH" IS 'customer year of birth';
COMMENT ON COLUMN "SH"."CUSTOMERS"."CUST_MARITAL_STATUS" IS 'customer marital status; low cardinality attribute';
COMMENT ON COLUMN "SH"."CUSTOMERS"."CUST_STREET_ADDRESS" IS 'customer street address';
COMMENT ON COLUMN "SH"."CUSTOMERS"."CUST_POSTAL_CODE" IS 'postal code of the customer';
COMMENT ON COLUMN "SH"."CUSTOMERS"."CUST_CITY" IS 'city where the customer lives';
COMMENT ON COLUMN "SH"."CUSTOMERS"."CUST_STATE_PROVINCE" IS 'customer geography: state or province';
COMMENT ON COLUMN "SH"."CUSTOMERS"."COUNTRY_ID" IS 'foreign key to the countries table (snowflake)';
COMMENT ON COLUMN "SH"."CUSTOMERS"."CUST_MAIN_PHONE_NUMBER" IS 'customer main phone number';
COMMENT ON COLUMN "SH"."CUSTOMERS"."CUST_INCOME_LEVEL" IS 'customer income level';
COMMENT ON COLUMN "SH"."CUSTOMERS"."CUST_CREDIT_LIMIT" IS 'customer credit limit';
COMMENT ON COLUMN "SH"."CUSTOMERS"."CUST_EMAIL" IS 'customer email id';
COMMENT ON TABLE "SH"."CUSTOMERS" IS 'dimension table';
GRANT SELECT ON "SH"."CUSTOMERS" TO "BI";
下面统计一下这张表中的记录数:
SQL> select count(*) from customers;
COUNT(*)
----------
55500
已用时间: 00: 00: 00.01
执行计划
----------------------------------------------------------------------------------------------
Plan hash value: 237477902
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | BITMAP CONVERSION COUNT | | 55500 | 3 (0)| 00:00:01 |
| 3 | BITMAP INDEX FAST FULL SCAN| CUSTOMERS_GENDER_BIX | | | |
----------------------------------------------------------------------------------------------
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
6 consistent gets
0 physical reads
0 redo size
413 bytes sent via SQL*Net to client
400 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
首先比较两种分页方法的区别:
SQL> SET AUTOT ON
SQL> set timing on
SQL> col CUST_LAST_NAME format a16;
SQL> col CUST_FIRST_NAME format a16;
以SYS用户执行下面的SQL Plus命令(注意不要在任何重要的生产环境中执行这两条命令):
SQL> show user;
USER 为 "SYS"
SQL> alter system flush shared_pool;
系统已更改。
SQL> alter system flush buffer_cache;
系统已更改。
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'CUSTOMERS')
PL/SQL 过程已成功完成。
下面比较两种分页算法的效率:
SQL> select CUST_ID,
2 CUST_FIRST_NAME,
3 CUST_LAST_NAME,
4 CUST_GENDER,
5 CUST_YEAR_OF_BIRTH,
6 CUST_CITY,
7 CUST_CITY_ID,
8 COUNTRY_ID
9 from (select rownum rn,
10 CUST_ID,
11 CUST_FIRST_NAME,
12 CUST_LAST_NAME,
13 CUST_GENDER,
14 CUST_YEAR_OF_BIRTH,
15 CUST_CITY,
16 CUST_CITY_ID,
17 COUNTRY_ID
18 from (select CUST_ID,
19 CUST_FIRST_NAME,
20 CUST_LAST_NAME,
21 CUST_GENDER,
22 CUST_YEAR_OF_BIRTH,
23 CUST_CITY,
24 CUST_CITY_ID,
25 COUNTRY_ID
26 from customers))
27 where rn between 11 and 20;
CUST_ID CUST_FIRST_NAME CUST_LAST_NAME C CUST_YEAR_OF_BIRTH CUST_CITY CUST_CITY_ID COUNTRY_ID
---------- ---------------- ---------------- - ------------------ ------------------------------ ------------ ----------
35227 Abigail Ruddy M 1948 Lelystad 51793 52770
36117 Abner Everett M 1954 Wolverhampton 52514 52789
39672 Abner Everett M 1975 Murnau 51934 52776
43228 Abner Everett M 1957 Los Angeles 51806 52790
25470 Abner Everett M 1966 Stuttgart 52331 52776
47006 Abner Everett M 1985 Montara 51919 52790
50561 Abner Everett M 1960 Neuss 51975 52776
4117 Abner Everett M 1972 Clermont-l'Herault 51329 52779
7673 Abner Everett M 1988 Schwaebisch Gmuend 52300 52776
11228 Abner Everett M 1956 Ingolstadt 51680 52776
已选择10行。
已用时间: 00: 00: 00.22
执行计划
----------------------------------------------------------
Plan hash value: 4059343527
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 55500 | 6449K| 333 (2)| 00:00:04 |
|* 1 | VIEW | | 55500 | 6449K| 333 (2)| 00:00:04 |
| 2 | COUNT | | | | | |
| 3 | TABLE ACCESS FULL| CUSTOMERS | 55500 | 2493K| 333 (2)| 00:00:04 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RN"<=20 AND "RN">=11)
统计信息
----------------------------------------------------------
1 recursive calls
0 db block gets
1460 consistent gets
0 physical reads
0 redo size
1250 bytes sent via SQL*Net to client
400 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
10 rows processed
SQL>
SQL> select
2 CUST_ID,
3 CUST_FIRST_NAME,
4 CUST_LAST_NAME,
5 CUST_GENDER,
6 CUST_YEAR_OF_BIRTH,
7 CUST_CITY,
8 CUST_CITY_ID,
9 COUNTRY_ID
10 from (select rownum rn,
11 CUST_ID,
12 CUST_FIRST_NAME,
13 CUST_LAST_NAME,
14 CUST_GENDER,
15 CUST_YEAR_OF_BIRTH,
16 CUST_CITY,
17 CUST_CITY_ID,
18 COUNTRY_ID
19 from (select CUST_ID,
20 CUST_FIRST_NAME,
21 CUST_LAST_NAME,
22 CUST_GENDER,
23 CUST_YEAR_OF_BIRTH,
24 CUST_CITY,
25 CUST_CITY_ID,
26 COUNTRY_ID
27 from customers)
28 where rownum <= 20)
29 where rn >= 11;
CUST_ID CUST_FIRST_NAME CUST_LAST_NAME C CUST_YEAR_OF_BIRTH CUST_CITY CUST_CITY_ID COUNTRY_I
---------- ---------------- ---------------- - ------------------ ------------------------------ ------------ ---------
35227 Abigail Ruddy M 1948 Lelystad 51793 5277
36117 Abner Everett M 1954 Wolverhampton 52514 5278
39672 Abner Everett M 1975 Murnau 51934 5277
43228 Abner Everett M 1957 Los Angeles 51806 5279
25470 Abner Everett M 1966 Stuttgart 52331 5277
47006 Abner Everett M 1985 Montara 51919 5279
50561 Abner Everett M 1960 Neuss 51975 5277
4117 Abner Everett M 1972 Clermont-l'Herault 51329 5277
7673 Abner Everett M 1988 Schwaebisch Gmuend 52300 5277
11228 Abner Everett M 1956 Ingolstadt 51680 5277
已选择10行。
已用时间: 00: 00: 00.10
执行计划
----------------------------------------------------------
Plan hash value: 4287989439
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20 | 2380 | 2 (0)| 00:00:01 |
|* 1 | VIEW | | 20 | 2380 | 2 (0)| 00:00:01 |
|* 2 | COUNT STOPKEY | | | | | |
| 3 | TABLE ACCESS FULL| CUSTOMERS | 20 | 920 | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RN">=11)
2 - filter(ROWNUM<=20)
统计信息
----------------------------------------------------------
373 recursive calls
0 db block gets
141 consistent gets
9 physical reads
0 redo size
1250 bytes sent via SQL*Net to client
400 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
10 rows processed
二者执行效率相差很大,一个需要1460多逻辑读,而另一个只需要141个逻辑读。观察二者的执行计划可以发现,两个执行计划唯一的区别就是第二
个查询在COUNT这步使用了STOPKEY,也就是说,Oracle将ROWNUM <= 20推入到查询内层,当符合查询的条件的记录达到STOPKEY的值,则Oracle结
束查询。
因此,可以预见,采用第二种方式,在翻页的开始部分查询速度很快,越到后面,效率越低,当翻到最后一页,效率应该和第一种方式接近。
SQL> select
2 CUST_ID,
3 CUST_FIRST_NAME,
4 CUST_LAST_NAME,
5 CUST_GENDER,
6 CUST_YEAR_OF_BIRTH,
7 CUST_CITY,
8 CUST_CITY_ID,
9 COUNTRY_ID
10 from (select rownum rn,
11 CUST_ID,
12 CUST_FIRST_NAME,
13 CUST_LAST_NAME,
14 CUST_GENDER,
15 CUST_YEAR_OF_BIRTH,
16 CUST_CITY,
17 CUST_CITY_ID,
18 COUNTRY_ID
19 from (select CUST_ID,
20 CUST_FIRST_NAME,
21 CUST_LAST_NAME,
22 CUST_GENDER,
23 CUST_YEAR_OF_BIRTH,
24 CUST_CITY,
25 CUST_CITY_ID,
26 COUNTRY_ID
27 from customers)
28 where rownum <= 55500)
29 where rn >= 55490;
CUST_ID CUST_FIRST_NAME CUST_LAST_NAME C CUST_YEAR_OF_BIRTH CUST_CITY CUST_CITY_ID COUNTR
---------- ---------------- ---------------- - ------------------ ------------------------------ ------------ ------
101789 Pavani Krishnan M 1947 Lowndesville 51807 5
100115 Grace Carbery M 1981 Frederick 51519 5
104365 Payton Stengard F 1981 Plains 52111 5
101876 Margaret Dadashev F 1977 Hiseville 51663 5
103405 Abbie Anderson M 1975 Chieflake 51325 5
100364 Jordan Zwolinsky F 1970 Molino 51917 5
104496 Sydney Tang F 1947 Evinston 51459 5
102169 Boriana Cay M 1944 Orlinda 52038 5
104252 Jade Taft M 1976 Norman 51986 5
100680 Joshua Prabu M 1954 Cleveland 51330 5
100055 Andrew Clark F 1978 Duncan 51402 5
已选择11行。
已用时间: 00: 00: 00.28
执行计划
----------------------------------------------------------
Plan hash value: 4287989439
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 55500 | 6449K| 333 (2)| 00:00:04 |
|* 1 | VIEW | | 55500 | 6449K| 333 (2)| 00:00:04 |
|* 2 | COUNT STOPKEY | | | | | |
| 3 | TABLE ACCESS FULL| CUSTOMERS | 55500 | 2493K| 333 (2)| 00:00:04 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RN">=55490)
2 - filter(ROWNUM<=55500)
统计信息
----------------------------------------------------------
1 recursive calls
0 db block gets
1460 consistent gets
1449 physical reads
0 redo size
1349 bytes sent via SQL*Net to client
400 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
11 rows processed
SQL>
注意在执行计划的统计信息中出现了 1460 consistent gets
未完待续