Oracle的行列转换

时间:2022-06-17 21:59:15

首先准备如下表格

tony@ORA11GR2> select empno,ename,job,sal,deptno from emp
2 order by deptno,job;

EMPNO ENAME JOB SAL DEPTNO
---------- -------------------- ------------------ ---------- ----------
7934 MILLER CLERK 1300 10
7782 CLARK MANAGER 2450 10
7839 KING PRESIDENT 5000 10
7788 SCOTT ANALYST 3000 20
7902 FORD ANALYST 3000 20
7876 ADAMS CLERK 1100 20
7369 SMITH CLERK 800 20
7566 JONES MANAGER 2975 20
7900 JAMES CLERK 950 30
7698 BLAKE MANAGER 2850 30
7654 MARTIN SALESMAN 1250 30
7521 WARD SALESMAN 1250 30
7499 ALLEN SALESMAN 1600 30
7844 TURNER SALESMAN 1500 30


现在查询各部门各工种的总薪水,

tony@ORA11GR2> select deptno, job, sum(sal) total_sal from emp
2 group by deptno, job order by 1, 2;

DEPTNO JOB TOTAL_SAL
---------- ------------------ ----------
10 CLERK 1300
10 MANAGER 2450
10 PRESIDENT 5000
20 ANALYST 6000
20 CLERK 1900
20 MANAGER 2975
30 CLERK 950
30 MANAGER 2850
30 SALESMAN 5600


但是这样不直观,如果能够把每个工种作为1列显示就会更一目了然.
这就是需要行转列。
在11g之前,需要一点技巧,利用decode函数才能完成这个目标。

select deptno,
sum(decode(job, 'PRESIDENT', sal, 0)) as PRESIDENT_SAL,
sum(decode(job, 'MANAGER', sal, 0)) as MANAGER_SAL,
sum(decode(job, 'ANALYST', sal, 0)) as ANALYST_SAL,
sum(decode(job, 'CLERK', sal, 0)) as CLERK_SAL,
sum(decode(job, 'SALESMAN', sal, 0)) as SALESMAN_SAL
from emp group by deptno order by 1;

得到结果:

    DEPTNO PRESIDENT_SAL MANAGER_SAL ANALYST_SAL  CLERK_SAL SALESMAN_SAL
---------- ------------- ----------- ----------- ---------- ------------
10 5000 2450 0 1300 0
20 0 2975 6000 1900 0
30 0 2850 0 950 5600


如果要在变回前面的结果,需要用到笛卡尔乘积,一行变五行,然后利用decode。例如:

with t as (
select deptno,
sum(decode(job, 'PRESIDENT', sal, 0)) as PRESIDENT_SAL,
sum(decode(job, 'MANAGER', sal, 0)) as MANAGER_SAL,
sum(decode(job, 'ANALYST', sal, 0)) as ANALYST_SAL,
sum(decode(job, 'CLERK', sal, 0)) as CLERK_SAL,
sum(decode(job, 'SALESMAN', sal, 0)) as SALESMAN_SAL
from emp group by deptno
)
select deptno,
decode(lvl, 1, 'PRESIDENT', 2, 'MANAGER', 3, 'ANALYST',
4, 'CLERK', 5, 'SALESMAN') as JOB,
decode(lvl, 1, PRESIDENT_SAL, 2, MANAGER_SAL, 3, ANALYST_SAL,
4, CLERK_SAL, 5, SALESMAN_SAL) as TOTAL_SAL
from t, (select level lvl from dual connect by level <= 5)
order by 1, 2;

得到结果:

    DEPTNO JOB                 TOTAL_SAL
---------- ------------------ ----------
10 ANALYST 0
10 CLERK 1300
10 MANAGER 2450
10 PRESIDENT 5000
10 SALESMAN 0
20 ANALYST 6000
20 CLERK 1900
20 MANAGER 2975
20 PRESIDENT 0
20 SALESMAN 0
30 ANALYST 0
30 CLERK 950
30 MANAGER 2850
30 PRESIDENT 0
30 SALESMAN 5600



11g之后,oracle增加了pivotunpivot语句,可以很方便的完成这个转换。
pivot
先来看看pivot的语法

SELECT ....
FROM <table-expr>
PIVOT
(
aggregate-function(<column>)
FOR <pivot-column> IN (<value1>, <value2>,..., <valuen>)
) AS <alias>
WHERE .....


来看个例子:

select * from
(select deptno, job, sal from emp)
pivot(
sum(sal) for job in (
'PRESIDENT' as PRESIDENT_SAL,
'MANAGER' as MANAGER_SAL,
'ANALYST' as ANALYST_SAL,
'CLERK' as CLERK_SAL,
'SALESMAN' as SALESMAN_SAL
)
) order by 1;

得到结果:

    DEPTNO PRESIDENT_SAL MANAGER_SAL ANALYST_SAL  CLERK_SAL SALESMAN_SAL
---------- ------------- ----------- ----------- ---------- ------------
10 5000 2450 1300
20 2975 6000 1900
30 2850 950 5600

实际上,oracle对pivot子句中出现的列以外的列做了一个隐式的group by.
现在,如果想要再结果中增加1列,显示部门的薪水总合,可以这么做,

tony@ORA11GR2> select * from
2 (select deptno, sum(sal) over (partition by deptno) SAL_TOTAL, job, sal from emp)
3 pivot(
4 sum(sal) as SAL_TOTAL for job in (
5 'PRESIDENT' as PRESIDENT,
6 'MANAGER' as MANAGER,
7 'ANALYST' as ANALYST,
8 'CLERK' as CLERK,
9 'SALESMAN' as SALESMAN
10 )
11 ) order by 1;

DEPTNO SAL_TOTAL PRESIDENT_SAL_TOTAL MANAGER_SAL_TOTAL ANALYST_SAL_TOTAL CLERK_SAL_TOTAL SALESMAN_SAL_TOTAL
---------- ---------- ------------------- ----------------- ----------------- --------------- ------------------
10 8750 5000 2450 1300
20 10875 2975 6000 1900
30 9400 2850 950 5600

2点说明,
1)oracle对pivot子句中出现的列以外的列,也就是deptno和SAL_TOTAL做了隐式的group by.
      这里用了分析函数,对于每个deptno,SAL_TOTAL是唯一的,所以group by的结果还是3行。
2)oracle会拼接列名 = for字句中别名+聚合函数别名,比如'PRESIDENT'+'_'+'SAL_TOTAL'。

可以指定多个聚合函数,例如统计薪水总合和人数总合:

tony@ORA11GR2> select * from
2 (select deptno, job, sal from emp)
3 pivot(
4 sum(sal) as SAL_TOTAL, count(sal) as EMP_TOTAL for job in (
5 'CLERK' as CLERK,
6 'SALESMAN' as SALESMAN
7 )
8 ) order by 1;

DEPTNO CLERK_SAL_TOTAL CLERK_EMP_TOTAL SALESMAN_SAL_TOTAL SALESMAN_EMP_TOTAL
---------- --------------- --------------- ------------------ ------------------
10 1300 1 0
20 1900 2 0
30 950 1 5600 4


for子句可以指定多列
为此,先给emp表追加1列rank,取值为'A','B',

tony@ORA11GR2> alter table emp add (rank varchar2(1) default('A'));

表已更改。

tony@ORA11GR2> update emp set rank=decode(mod(rownum, 2), 0, 'B', rank);

已更新14行。

tony@ORA11GR2> select deptno, job, rank, count(sal) as EMP_TOTAL from emp
2 group by deptno, job, rank order by 1, 2;

DEPTNO JOB RA EMP_TOTAL
---------- ------------------ -- ----------
10 CLERK B 1
10 MANAGER A 1
10 PRESIDENT A 1
20 ANALYST A 1
20 ANALYST B 1
20 CLERK A 2
20 MANAGER B 1
30 CLERK B 1
30 MANAGER B 1
30 SALESMAN A 2
30 SALESMAN B 2

 

现在,想统计SALESMAN和CLERK的员工中,rank A和rank B各自的人数。

tony@ORA11GR2> select * from
2 (select deptno, job, rank from emp)
3 pivot(
4 count(rank) as EMP_TOTAL for (job, rank) in (
5 ('SALESMAN', 'A') as SALESMAN_A,
6 ('SALESMAN', 'B') as SALESMAN_B,
7 ('CLERK', 'A') as CLERK_A,
8 ('CLERK', 'B') as CLERK_B
9 )
10 ) order by 1;

DEPTNO SALESMAN_A_EMP_TOTAL SALESMAN_B_EMP_TOTAL CLERK_A_EMP_TOTAL CLERK_B_EMP_TOTAL
---------- -------------------- -------------------- ----------------- -----------------
10 0 0 0 1
20 0 0 2 0
30 2 2 0 1


unpivot
现在来看看unpivot的用法,
unpivot的语法

SELECT ....
FROM <table-expr>
UNPIVOT [include nulls|exclude nulls]
(
(<column>)
FOR <pivot-column> IN (<value1>, <value2>,..., <valuen>)
) AS <alias>
WHERE .....

例如,有下面的的表格,

    DEPTNO  CLERK_SAL SALESMAN_SAL
---------- ---------- ------------
10 1300
20 1900
30 950 5600


用unpivot语句来做列到行的转换,

tony@ORA11GR2> with t as (
2 select * from
3 (select deptno, job, sal from emp)
4 pivot(
5 sum(sal) for job in (
6 'CLERK' as CLERK_SAL,
7 'SALESMAN' as SALESMAN_SAL
8 )
9 )
10 )
11 select * from t
12 unpivot(
13 SAL_TOTAL for JOB in (
14 CLERK_SAL as 'CLERK',
15 SALESMAN_SAL as 'SALESMAN'
16 )
17 ) order by 1,2;

DEPTNO JOB SAL_TOTAL
---------- ---------------- ----------
10 CLERK 1300
20 CLERK 1900
30 CLERK 950
30 SALESMAN 5600


如果加上include nulls子句

tony@ORA11GR2> with t as (
2 select * from
3 (select deptno, job, sal from emp)
4 pivot(
5 sum(sal) for job in (
6 'CLERK' as CLERK_SAL,
7 'SALESMAN' as SALESMAN_SAL
8 )
9 )
10 )
11 select * from t
12 unpivot include nulls(
13 SAL_TOTAL for JOB in (
14 CLERK_SAL as 'CLERK',
15 SALESMAN_SAL as 'SALESMAN'
16 )
17 ) order by 1,2;

DEPTNO JOB SAL_TOTAL
---------- ---------------- ----------
10 CLERK 1300
10 SALESMAN
20 CLERK 1900
20 SALESMAN
30 CLERK 950
30 SALESMAN 5600




可以指定多个pivot-column,例如对于下面的表格

    DEPTNO CLERK_SAL_TOTAL CLERK_EMP_TOTAL SALESMAN_SAL_TOTAL SALESMAN_EMP_TOTAL
---------- --------------- --------------- ------------------ ------------------
30 950 1 5600 4
20 1900 2 0
10 1300 1 0

转换为行以后,有2列数据分别显示薪水总合和人数总和。也就是需要对2列同时进行转换。

tony@ORA11GR2> with t as (
2 select * from
3 (select deptno, job, sal from emp)
4 pivot(
5 sum(sal) as SAL_TOTAL, count(sal) as EMP_TOTAL for job in (
6 'CLERK' as CLERK,
7 'SALESMAN' as SALESMAN
8 )
9 )
10 )
11 select * from t
12 unpivot include nulls(
13 (SAL_TOTAL, EMP_TOTAL) for JOB in (
14 (CLERK_SAL_TOTAL, CLERK_EMP_TOTAL) as 'CLERK',
15 (SALESMAN_SAL_TOTAL, SALESMAN_EMP_TOTAL) as 'SALESMAN'
16 )
17 ) order by 1,2;

DEPTNO JOB SAL_TOTAL EMP_TOTAL
---------- ---------------- ---------- ----------
10 CLERK 1300 1
10 SALESMAN 0
20 CLERK 1900 2
20 SALESMAN 0
30 CLERK 950 1
30 SALESMAN 5600 4


关于ANY子句

pivot的一个不便之处是需要在IN子句中指定所有取值,能不能自动完成呢?
比如for job in (select distinct(job) from emp),可惜oracle不支持这么做。
不过oracle支持通过这种方式返回XML格式数据,例如,

tony@ORA11GR2> select * from
2 (select deptno, job, sal from emp)
3 pivot XML(
4 sum(sal) for job in (ANY)
5 ) order by 1;

DEPTNO JOB_XML
---------- --------------------------------------------------------------------------------
10 <PivotSet><item><column name = "JOB">CLERK</column><column name = "SUM(SAL)">130
0</column></item><item><column name = "JOB">MANAGER</column><column name = "SUM(
SAL)">2450</column></item><item><column name = "JOB">PRESIDENT</column><column n
ame = "SUM(SAL)">5000</column></item></PivotSet>

20 <PivotSet><item><column name = "JOB">ANALYST</column><column name = "SUM(SAL)">6
000</column></item><item><column name = "JOB">CLERK</column><column name = "SUM(
SAL)">1900</column></item><item><column name = "JOB">MANAGER</column><column nam
e = "SUM(SAL)">2975</column></item></PivotSet>

30 <PivotSet><item><column name = "JOB">CLERK</column><column name = "SUM(SAL)">950
</column></item><item><column name = "JOB">MANAGER</column><column name = "SUM(S
AL)">2850</column></item><item><column name = "JOB">SALESMAN</column><column nam
e = "SUM(SAL)">5600</column></item></PivotSet>

*可以把上面的ANY换成select distinct(job) from emp

利用逗号分割进行行列转换
行->列,利用wm_concat函数,
列->行,例如regexp_substr函数,
例如

scott@ORA11GR2> select deptno, wm_concat(ename) enames from emp group by deptno;
    DEPTNO ENAMES
---------- ----------------------------------------
        10 CLARK,MILLER,KING
        20 SMITH,FORD,ADAMS,SCOTT,JONES
        30 ALLEN,JAMES,TURNER,BLAKE,MARTIN,WARD
--先找到所有记录中,最大的分割后记录数,例如上面deptno=30,最多可以得到6条子记录;
--然后对于每一行,一行变6行,利用regexp_substr函数找到每个子记录,对于deptno=10和20的记录,分别会有3条和1条null的记录;
--最后利用is not null,去掉为null的子记录;
--order by lvl,可以保证子记录按照分割前的顺序排列
scott@ORA11GR2> with t as (select deptno, wm_concat(ename) enames from emp group by deptno)
  2  select deptno, ename from(
  3    select deptno, regexp_substr(enames, '[^,]+',1,lvl) ename, lvl from t,
  4    (select level lvl from dual connect by
  5     level < =(select max(length(regexp_replace(enames,'[^,]','')))+1 from t))
  6  ) where ename is not null order by deptno, lvl;
  DEPTNO ENAME
---------- --------------------
        10 CLARK
        10 MILLER
        10 KING
        20 SMITH
        20 FORD
        20 ADAMS
        20 SCOTT
        20 JONES
        30 ALLEN
        30 JAMES
        30 TURNER
        30 BLAKE
        30 MARTIN
        30 WARD
14 rows selected.

 

*11gr2之后,可以利用LISTAGG函数替代wm_concat函数。
The LISTAGG analytic function was introduced in Oracle 11g Release 2, making it very easy to aggregate strings. The nice thing about this function is it also allows us to order the elements in the concatenated list. If you are using 11g Release 2 you should use this function for string aggregation.
和wm_concat相比,listagg可以执行排序。例如
select deptno, listagg(ename,';') within group(order by ename) enames from emp group by deptno;