如何用Pivot实现行列转换

时间:2020-12-22 09:37:34

在Oracle中,如果要实现行列转换,较为常见的是用DECODE和CASE语句。对于简单的行列转行,DECODE和CASE语句尚能应付。在逻辑比较复杂,分组聚合较多的场景中,DECODE和CASE语句则力有不逮。而pivot则可完美解决这一切。

首先,我们来看看Oracle对于其的解释:如何用Pivot实现行列转换

可见,pivot是数据仓库中的关键技术,它利用交叉查询(crosstabulation query)将行转换为列。

基本语法如下:

SELECT ....
FROM <table-expr>
PIVOT
(
aggregate-function(<column>)
FOR <pivot-column> IN (<value1>, <value2>,..., <valuen>)
) AS <alias>
WHERE .....

下面我们来通过具体的案例对其进行阐述。

首先,构造案例所需的数据,

1> 创建视图,以EMP表的数据作为源数据。

CREATE VIEW emp_view AS
SELECT
deptno,job,to_char(hiredate,'yyyy') hiredate,
count(*) cnt,sum(sal) sum_sal
FROM emp
GROUP BY deptno,job,to_char(hiredate,'yyyy');

其中,deptno为部门号,job为工作的类型(即工种),hiredate为雇佣的日期,cnt为特定部门,特定工种在特定年份雇佣的员工的总数,sum_sal为特定部门,特定工种,特定年份雇佣的员工的工资的总和。

2> 视图的数据如下:

SQL> select * from emp_view;

    DEPTNO JOB       HIRE        CNT    SUM_SAL
---------- --------- ---- ---------- ----------
20 CLERK 1980 1 800
20 ANALYST 1981 1 3000
20 ANALYST 1987 1 3000
30 CLERK 1981 1 950
30 MANAGER 1981 1 2850
10 MANAGER 1981 1 2450
30 SALESMAN 1981 4 5600
20 MANAGER 1981 1 2975
10 PRESIDENT 1981 1 5000
10 CLERK 1982 1 1300
20 CLERK 1987 1 1100 11 rows selected.

应用场景一:

基本的Pivot转换

例1:

SELECT * FROM
( SELECT deptno,hiredate,cnt
FROM emp_view
) PIVOT (SUM(cnt)
FOR hiredate IN ('' AS "1980",'' AS "1981",
'' AS "1982",'' AS "1987"))
ORDER BY deptno; DEPTNO 1980 1981 1982 1987
---------- ---------- ---------- ---------- ----------
10 2 1
20 1 2 2
30 6 3 rows selected.

例2:

SELECT * FROM
( SELECT deptno,job,cnt
FROM emp_view
) PIVOT (SUM(cnt)
FOR job IN ('CLERK','ANALYST','MANAGER','SALESMAN','PRESIDENT'))
ORDER BY deptno; DEPTNO 'CLERK' 'ANALYST' 'MANAGER' 'SALESMAN' 'PRESIDENT'
---------- ---------- ---------- ---------- ---------- -----------
10 1 1 1
20 2 2 1
30 1 1 4 3 rows selected.

两例以不同的列进行统计,前者是hiredate,后者是job。

除此之外,前者用了别名,后面没有用别名,两者的显示效果也是不一样的。

应用场景二:

对多列进行Pivot转换

SELECT * FROM
( SELECT deptno,job,hiredate,cnt
FROM emp_view
) PIVOT (SUM(cnt)
FOR (job,hiredate) IN
(('CLERK','') AS clerk_1980,
('CLERK','') AS clerk_1981,
('ANALYST','') AS analyst_1987,
('MANAGER','') AS manager_1981
)
)
ORDER by deptno; DEPTNO CLERK_1980 CLERK_1981 ANALYST_1987 MANAGER_1981
---------- ---------- ---------- ------------ ------------
10 1
20 1 1 1
30 1 1 3 rows selected.

限于篇幅,FOR (job,hiredate) IN语句中没有列出更多组合,只列出了四组,当然,我们可以根据实际场景需要罗列更多的组合。

从本例中可以看出,对两个列进行Pivot转换可从三个维度呈现统计结果。

应用场景三:

用Pivot实现多个聚合

SELECT * FROM
( SELECT deptno,hiredate,cnt,sum_sal
FROM emp_view
) PIVOT ( SUM(cnt) AS cnt,
SUM(sum_sal) AS sum_sal
FOR hiredate IN ('','','',''))
ORDER BY deptno; DEPTNO ''_CNT ''_SUM_SAL ''_CNT ''_SUM_SAL ''_CNT ''_SUM_SAL ''_CNT ''_SUM_SAL
---------- ---------- -------------- ---------- -------------- ---------- -------------- ---------- --------------
10 2 7450 1 1300
20 1 800 2 5975 2 4100
30 6 9400 3 rows selected.

'1981'_CNT指的是1981年雇佣的员工的总数,'1981'_SUM_SAL指的是1981年雇佣员工所开出的工资。

具体到本例中,即1981年10号部门招聘了2位员工,开出的工资合计为7450元,20号部门招聘了2位员工,开出的工资合计为5975元,30号部门招聘了6名员工,开出的工资合计为9400元,依次类推。

既然有pivot将行转换为列,同样也有unpivot操作将聚合后的列转换为行。

UNPIVOT

以上述应用场景三的结果作为源数据进行操作

CREATE TABLE T1 AS
SELECT * FROM
( SELECT deptno,hiredate,cnt,sum_sal
FROM emp_view
) PIVOT ( SUM(cnt) AS cnt,
SUM(sum_sal) AS sum_sal
FOR hiredate IN ('' AS "1980",'' AS "1981",
'' AS "1982",'' AS "1987"))
ORDER BY deptno

表T1的结果为:

SQL> select * from t1;

    DEPTNO   1980_CNT 1980_SUM_SAL   1981_CNT 1981_SUM_SAL   1982_CNT 1982_SUM_SAL   1987_CNT 1987_SUM_SAL
---------- ---------- ------------ ---------- ------------ ---------- ------------ ---------- ------------
10 2 7450 1 1300
20 1 800 2 5975 2 4100
30 6 9400 3 rows selected.

首先进行一维unpivot

SELECT deptno,DECODE(hiredate,'1980_CNT','','1981_CNT','','1982_CNT','','1987_CNT','') AS hiredate,cnt
FROM T1
UNPIVOT INCLUDE NULLS
( cnt
FOR hiredate IN ("1980_CNT","1981_CNT","1982_CNT","1987_CNT")); DEPTNO HIRE CNT
---------- ---- ----------
10 1980
10 1981 2
10 1982 1
10 1987
20 1980 1
20 1981 2
20 1982
20 1987 2
30 1980
30 1981 6
30 1982
30 1987 12 rows selected.

输出的结果为不同部门在不同年份的雇佣人数,

注意:上述SQL语句中UNPIVOT后加了INCLUDE NULLS,当然也可以指定为EXCLUDE NULLS,即排除cnt为空的值,如果不指定,则默认为EXCLUDE NULLS。

UNPIVOT后不指定INCLUDE NULLS的输入结果为:

    DEPTNO HIRE        CNT
---------- ---- ----------
10 1981 2
10 1982 1
20 1980 1
20 1981 2
20 1987 2
30 1981 6 6 rows selected.

下面,我们再进行二维unpivot

SELECT deptno,hiredate,cnt,sum_sal
FROM T1
UNPIVOT
( (cnt,sum_sal)
FOR hiredate IN (("1980_CNT","1980_SUM_SAL") AS 1980,
("1981_CNT","1981_SUM_SAL") AS 1981,
("1982_CNT","1982_SUM_SAL") AS 1982,
("1987_CNT","1987_SUM_SAL") AS 1987)); DEPTNO HIREDATE CNT SUM_SAL
---------- ---------- ---------- ----------
10 1981 2 7450
10 1982 1 1300
20 1980 1 800
20 1981 2 5975
20 1987 2 4100
30 1981 6 9400 6 rows selected.

输入结果为T1表列转行的结果。

参考文档:

SQL for Analysis and Reporting