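Oracle's analytic functions are a great help for reporting queries, because they let us avoid subqueries and similar workarounds. The windowing clause is the part most of us use least, so this post focuses on how to use it.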
http://www.itpub.net/thread-1241311-1-1.html
http://www.oracle-base.com/articles/misc/analytic-functions.php#windowing_clause
http://blog.sina.com.cn/s/blog_70cea94b0100xi46.html
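First, the semantics of the windowing clause.
There are two kinds of window, RANGE and ROWS. If an ORDER BY is given but the windowing clause is omitted, the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which can be abbreviated to RANGE UNBOUNDED PRECEDING.
Range window (RANGE), e.g. RANGE N PRECEDING: the offset form is valid only when the sort key is a number or a date. After sorting, the window contains every row whose sort-key value lies between (current row's value - N) and the current row's value for an ascending sort, or between the current row's value and (current row's value + N) for a descending sort. The window is therefore defined by the values of the ORDER BY column, not by row positions.
Row window (ROWS), e.g. ROWS N PRECEDING: the window is the current row plus the N rows before it. Both kinds also accept the BETWEEN ... AND form; for example, ROWS BETWEEN m PRECEDING AND n FOLLOWING makes the window of each row run from the m rows before it to the n rows after it.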
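The difference between the two window types is easiest to see on data with a duplicated sort value. The following is a minimal sketch; the inline table t and its four values are made up for illustration and do not come from the referenced articles.

```sql
-- Hypothetical sample data: the value 20 appears twice on purpose.
WITH t AS (
  SELECT 10 n FROM dual UNION ALL
  SELECT 20 n FROM dual UNION ALL
  SELECT 20 n FROM dual UNION ALL
  SELECT 40 n FROM dual
)
SELECT n,
       -- row window: the current row and the single row before it
       SUM(n) OVER (ORDER BY n ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS rows_sum,
       -- value window: every row whose n lies in [current n - 10, current n]
       SUM(n) OVER (ORDER BY n RANGE BETWEEN 10 PRECEDING AND CURRENT ROW) AS range_sum
FROM t
ORDER BY n;
```

Evaluated by hand, range_sum is 50 for both rows with n = 20 (10 + 20 + 20, because rows with equal sort values always share a RANGE window), while rows_sum is 30 for one of them and 40 for the other, since a ROWS window depends on physical position.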
SELECT empno,
       sal,
       mgr,
       deptno,
       -- sum of sal over the rows of the same deptno with sal in [current sal, current sal + 100]
       SUM(sal) OVER (PARTITION BY deptno ORDER BY sal
                      RANGE BETWEEN 0 PRECEDING AND 100 FOLLOWING) dd
FROM emp;
Here the rows are partitioned by deptno and sorted by sal in ascending order; for each row, dd is the sum of sal over the rows of the same department whose sal lies between the current sal and the current sal + 100, inclusive.
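For comparison, this is roughly what the same result would look like without an analytic function. It is only a sketch of the equivalent correlated subquery, assuming the classic emp demo table used above and assuming that deptno and sal are never NULL (NULLs are handled differently by PARTITION BY and by an equality predicate).

```sql
-- Sketch: the windowed SUM rewritten as a correlated scalar subquery.
-- Assumes emp.deptno and emp.sal contain no NULLs.
SELECT e.empno,
       e.sal,
       e.mgr,
       e.deptno,
       (SELECT SUM(x.sal)
          FROM emp x
         WHERE x.deptno = e.deptno                    -- same partition
           AND x.sal BETWEEN e.sal AND e.sal + 100    -- same value window
       ) AS dd
FROM emp e;
```

The analytic version states the window once in the OVER clause, which is the kind of subquery the introduction says these functions let us avoid.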
Analytic functions are commonly used to compute cumulative, moving, centered, and reporting aggregates.
The exact syntax is given by the syntax diagrams analytic_function, analytic_clause, query_partition_clause, order_by_clause, and windowing_clause in the Oracle documentation (the images are not reproduced here); refer to them when writing a window specification.
Windowing by value (the sort value must be a date or a number)
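I have the following requirement:
1. The query result is sorted by value, e.g. select value from t;
Sample result:
50
70
90
130
160
190
2. Group the data. Starting from the first value, every value within +50 of it belongs to the same group; the first value beyond that limit starts a new group, and the limit is then measured from that value. For example (value, first value of its group):
50 50
70 50
90 50
130 130
160 130
190 190
3. The final result is the number of values in each group. Sample result (first value of the group, count):
50 3
130 2
190 1
Original thread: http://www.itpub.net/thread-985707-1-1.html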
WITH T AS (
  SELECT 50 N FROM DUAL UNION ALL
  SELECT 70 N FROM DUAL UNION ALL
  SELECT 90 N FROM DUAL UNION ALL
  SELECT 130 N FROM DUAL UNION ALL
  SELECT 160 N FROM DUAL UNION ALL
  SELECT 190 N FROM DUAL
)
SELECT *
FROM (SELECT n,
             -- position of the row in ascending order of n
             row_number() OVER(ORDER BY n) rn,
             -- number of rows whose n lies in [current n, current n + 50]
             COUNT(*) OVER(ORDER BY n RANGE BETWEEN CURRENT ROW AND 50 FOLLOWING) cn
      FROM t)
START WITH rn = 1
-- jump from the first row of a group to the first row of the next group
CONNECT BY RN = PRIOR CN + PRIOR RN;
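Here the value window counts, for every row, how many rows fall in the range [n, n + 50] (cn), and row_number() gives each row's position (rn). For the first row of a group, rn + cn is therefore the position of the first row beyond the group, which is the first row of the next group. Using RN = PRIOR CN + PRIOR RN as the CONNECT BY condition walks from one group start to the next, so the query returns only the first row of each group together with its count cn.
Adding cn and rn to build the CONNECT BY condition is a neat idea and well worth studying.
In reporting queries, the real work is often just constructing the right scenario and condition.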
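To see why adding rn and cn works, it can help to run the inline view on its own, before CONNECT BY is applied. The sketch below reuses the sample data from the thread; the result shown in the comments was evaluated by hand.

```sql
WITH t AS (
  SELECT 50 n FROM dual UNION ALL
  SELECT 70 n FROM dual UNION ALL
  SELECT 90 n FROM dual UNION ALL
  SELECT 130 n FROM dual UNION ALL
  SELECT 160 n FROM dual UNION ALL
  SELECT 190 n FROM dual
)
SELECT n,
       ROW_NUMBER() OVER (ORDER BY n) AS rn,
       COUNT(*) OVER (ORDER BY n RANGE BETWEEN CURRENT ROW AND 50 FOLLOWING) AS cn
FROM t
ORDER BY n;
--   N  RN  CN
--  50   1   3   <- group start, next start is row 1 + 3 = 4
--  70   2   2
--  90   3   2
-- 130   4   2   <- group start, next start is row 4 + 2 = 6
-- 160   5   2
-- 190   6   1   <- group start, row 6 + 1 = 7 does not exist, so the walk ends
```

Only the rows reached by the walk (rn = 1, 4, 6) survive the CONNECT BY, and their cn values are exactly the group sizes 3, 2, and 1.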
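The following SQL solves the same kind of problem on a different data set, with a group width of 4 (RANGE ... 4 FOLLOWING), and labels every row with its group number: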
WITH T AS (
  SELECT 1 N FROM DUAL UNION ALL
  SELECT 3 N FROM DUAL UNION ALL
  SELECT 4 N FROM DUAL UNION ALL
  SELECT 7 N FROM DUAL UNION ALL
  SELECT 10 N FROM DUAL UNION ALL
  SELECT 11 N FROM DUAL UNION ALL
  SELECT 12 N FROM DUAL UNION ALL
  SELECT 12 N FROM DUAL UNION ALL
  SELECT 19 N FROM DUAL UNION ALL
  SELECT 20 N FROM DUAL
)
SELECT T2.N
      ,DENSE_RANK() OVER(ORDER BY T2.G) G   -- turn group-start values into group numbers 1, 2, 3, ...
FROM (
  SELECT T.N
        ,MAX(T1.N) OVER(ORDER BY T.N) G     -- running max carries each group start down to the rows after it
  FROM (
    -- T1: the first value of each group
    SELECT N
    FROM (
      SELECT N
            ,COUNT(*) OVER(ORDER BY N RANGE BETWEEN CURRENT ROW AND 4 FOLLOWING) CNT
            ,ROW_NUMBER() OVER(ORDER BY N) RN
      FROM T
    )
    CONNECT BY RN = PRIOR RN + PRIOR CNT
    START WITH RN = 1
  ) T1, T
  WHERE T1.N(+) = T.N   -- outer join keeps every row of T
) T2;
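The points to note here are the CONNECT BY, the DENSE_RANK function, and the expression MAX(T1.N) OVER(ORDER BY T.N) G. After the outer join, T1.N is non-NULL only on the rows that start a group; because the rows are ordered by N, the running MAX carries that start value down to the rest of the group, and DENSE_RANK then converts the start values into consecutive group numbers.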
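The fill-down step can be isolated in a small sketch. The inline table j below is hypothetical: it stands in for the outer-join result, with start_n set only on the rows that open a group. LAST_VALUE with IGNORE NULLS is shown next to the running MAX as an alternative that is not used in the thread; it gives the same answer here and does not rely on the start values being increasing.

```sql
-- Hypothetical stand-in for the outer-join result:
-- start_n is filled only on the first row of each group.
WITH j AS (
  SELECT 1 n, 1 start_n FROM dual UNION ALL
  SELECT 3, NULL FROM dual UNION ALL
  SELECT 4, NULL FROM dual UNION ALL
  SELECT 7, 7 FROM dual UNION ALL
  SELECT 10, NULL FROM dual
)
SELECT n,
       -- running maximum over all rows up to the current one (NULLs are ignored by MAX)
       MAX(start_n) OVER (ORDER BY n) AS g_max,
       -- most recent non-NULL start value up to the current row
       LAST_VALUE(start_n IGNORE NULLS) OVER (ORDER BY n) AS g_last
FROM j
ORDER BY n;
--  N  G_MAX  G_LAST
--  1      1       1
--  3      1       1
--  4      1       1
--  7      7       7
-- 10      7       7
```

Both columns rely on the default window, RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which is what makes them running aggregates.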
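Below is an expert's solution that uses a recursive WITH clause; of course, the problem can also be solved with the CONNECT BY we are more familiar with.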
WITH T AS
 (SELECT 1 N FROM DUAL UNION ALL
  SELECT 4 N FROM DUAL UNION ALL
  SELECT 3 N FROM DUAL UNION ALL
  SELECT 7 N FROM DUAL UNION ALL
  SELECT 10 N FROM DUAL UNION ALL
  SELECT 11 N FROM DUAL UNION ALL
  SELECT 12 N FROM DUAL UNION ALL
  SELECT 12 N FROM DUAL UNION ALL
  SELECT 19 N FROM DUAL UNION ALL
  SELECT 20 N FROM DUAL),
-- v: number the rows in ascending order of n
v AS
 (SELECT n, row_number() over(ORDER BY n) rn FROM t),
-- v1: walk the rows one by one; flag holds the first value of the current group
v1(flag, n, rn) AS
 (SELECT n, n, rn
    FROM v
   WHERE rn = 1
  UNION ALL
  SELECT CASE
           -- 5 or more above the group start: this row opens a new group
           WHEN v.n - v1.flag >= 5 THEN v.n
           ELSE v1.flag
         END,
         v.n,
         v.rn
    FROM v, v1
   WHERE v.rn = v1.rn + 1)
SELECT * FROM v1;
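The same recursive pattern can also produce the per-group counts asked for in the original requirement. The following is a sketch on the 50 to 190 sample data with a group width of 50; the column aliases group_start and cnt are my own, and it assumes Oracle 11g Release 2 or later, where recursive subquery factoring is available.

```sql
WITH t AS (
  SELECT 50 n FROM dual UNION ALL
  SELECT 70 n FROM dual UNION ALL
  SELECT 90 n FROM dual UNION ALL
  SELECT 130 n FROM dual UNION ALL
  SELECT 160 n FROM dual UNION ALL
  SELECT 190 n FROM dual
),
v AS (
  SELECT n, ROW_NUMBER() OVER (ORDER BY n) rn FROM t
),
v1 (flag, n, rn) AS (
  -- anchor: the smallest value opens the first group
  SELECT n, n, rn FROM v WHERE rn = 1
  UNION ALL
  -- step: a value more than 50 above the group start opens a new group
  SELECT CASE WHEN v.n - v1.flag > 50 THEN v.n ELSE v1.flag END,
         v.n,
         v.rn
  FROM v, v1
  WHERE v.rn = v1.rn + 1
)
SELECT flag AS group_start, COUNT(*) AS cnt
FROM v1
GROUP BY flag
ORDER BY flag;
-- Evaluated by hand:
-- GROUP_START  CNT
--          50    3
--         130    2
--         190    1
```

Because the recursion carries the group start along in flag, a plain GROUP BY is enough at the end and no fill-down step is needed.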
Other experts have also solved this with the MODEL clause; see the original thread for details.