在MySQL中,如何实现Row Number OVER Partiton或者是Rank OVER Partiton呢?

时间:2022-12-07 23:12:07


原文链接http://www.tbk.ren/article/255.html?source=csdn


有的同学在处理自己的业务逻辑的时候,需要用到Row Number OVER Partition的方法,那么,什么是Row Number OVER Partition呢?


例如我们有下面的数据,第一列是用户的ID,第二列是用户的购买日期,现在如果我们需要判断用户是否重复购买,并且,每一次的购买,下一次的购买时间间隔是多少呢?


id,date
1,20150601
1,20150603
2,20150601
2,20150605
2,20150610
3,20150503
3,20150603
4,20150601


如果我们可以处理称为下面的数据,那么这个问题就完美解决了:


  id    date.x      date.y
0   1  20150601    20150603
1   1  20150603         NaN
2   2  20150601    20150605
3   2  20150605    20150610
4   2  20150610         NaN
5   3  20150503    20150603
6   3  20150603         NaN
7   4  20150601         NaN

date_y就是下一次的购买时间,date_y-date_x就是前后两次的购买间隔。


把数据处理成这种格式,使用Row Number OVER Partition或者是说Rank OVER Partition是最简单的方法。

我们先来生成数据:

CREATE TABLE t (
ID INT,
DATE INT
) ENGINE=MYISAM DEFAULT CHARSET=utf8;


INSERT INTO t VALUES
(1,20150601),
(1,20150603),
(2,20150601),
(2,20150605),
(2,20150610),
(3,20150503),
(3,20150603),
(4,20150601);


接着,我们来在MySQL中,实现Row Number OVER Partition或者是说Rank OVER Partition的方法:


SET @id_rank:=0;
SELECT id, DATE, id_rank FROM (
SELECT id, DATE, 
  @id_rank := IF(@id = id, @id_rank + 1, 1) AS id_rank,
  @id := id 
 FROM t
 ORDER BY id, DATE
) a;


该查询,返回的结果如下:

"id" "DATE""id_rank"
"1" "20150601""1"
"1" "20150603""2"
"2" "20150601""1"
"2" "20150605""2"
"2" "20150610""3"
"3" "20150503""1"
"3" "20150603""2"
"4" "20150601""1"

使用这个结果,我们就可以使用Left Join的逻辑,计算出我们的目标表格:

SET @id_rank:=0;

CREATE TABLE temp ENGINE=MYISAM DEFAULT CHARSET=utf8 AS
SELECT id, DATE, id_rank FROM (
SELECT id, DATE, 
  @id_rank := IF(@id = id, @id_rank + 1, 1) AS id_rank,
  @id := id 
 FROM t
 ORDER BY id, DATE
) a;

SELECT a.id, a.date, b.date FROM (
SELECT id, DATE, id_rank FROM temp
) a LEFT JOIN (
SELECT id, DATE, (id_rank-1) AS id_rank_1 FROM temp
) b ON a.id=b.id AND a.id_rank=b.id_rank_1;

得到的结果如下所示:

"id" "date""date"
"1" "20150601""20150603"
"1" "20150603"\N
"2" "20150601""20150605"
"2" "20150605""20150610"
"2" "20150610"\N
"3" "20150503""20150603"
"3" "20150603"\N
"4" "20150601"\N

如果是大量的数据,我们记得一定要在CREATE TABLE temp的时候,创建索引,这样子执行起来,才会杠杠滴。

最后,千万不要忘记,删掉temp表

DROP TABLE temp;