I have the following data set:
Date Category
2014-01-01 A
2014-01-02 A
2014-01-03 A
2014-01-04 B
2014-01-05 B
...
2014-01-10 B
2014-01-11 A
...
2014-01-20 A
The result I want is the local min/max date for each consecutive run of A and B, as follows:
MinDate MaxDate Category
2014-01-01 2014-01-03 A
2014-01-04 2014-01-10 B
2014-01-11 2014-01-20 A
Note: using
Select min(date), max(date), category from TABLE Group by category
produces this result:
MinDate MaxDate Category
2014-01-01 2014-01-20 A
2014-01-04 2014-01-10 B
This is not what I want to achieve.
1 solution
#1
5
Assuming you have a DBMS that supports window functions, you can do this:
select category, grp, min(date) as min_date, max(date) as max_date
from (
    select category, date
         -- overall position minus position within the category:
         -- constant within each unbroken run of one category
         , row_number() over (order by date)
         - row_number() over (partition by category order by date) as grp
    from T
) as X
group by category, grp
order by min(date)
The idea is to number all rows by date, and separately to number the rows within each category by date. Within an unbroken run of one category both numbers increase by one per row, so their difference stays constant. When the difference changes, the chain of consecutive rows for that category has been broken by another category. Grouping by category and this difference (grp) therefore gives one row per run.
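To see why the difference stays constant inside a run, here is a minimal Python sketch of the same bookkeeping. It is only an illustration, not part of the SQL answer: the sample rows are a shortened, hypothetical version of the table in the question, and the function name islands is made up for this example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical, shortened sample in the shape of the question's table.
rows = [
    (date(2014, 1, 1), "A"),
    (date(2014, 1, 2), "A"),
    (date(2014, 1, 3), "A"),
    (date(2014, 1, 4), "B"),
    (date(2014, 1, 5), "B"),
    (date(2014, 1, 6), "A"),
    (date(2014, 1, 7), "A"),
]


def islands(rows):
    """Return (min_date, max_date, category) for each consecutive run."""
    per_category = defaultdict(int)  # plays the role of row_number() per category
    groups = {}                      # (category, grp) -> [min_date, max_date]
    # enumerate(...) plays the role of row_number() over (order by date)
    for rn_all, (day, category) in enumerate(sorted(rows), start=1):
        per_category[category] += 1
        grp = rn_all - per_category[category]
        span = groups.setdefault((category, grp), [day, day])
        span[1] = day  # rows arrive in date order, so the latest seen is the max
    return sorted((lo, hi, cat) for (cat, _), (lo, hi) in groups.items())


result = islands(rows)
for lo, hi, cat in result:
    print(lo, hi, cat)
```

For this sample the differences are 0, 0, 0 for the first A rows, 3, 3 for the B rows and 2, 2 for the last A rows, so the two A runs end up in different groups. The grp value on its own is not guaranteed to be unique across categories, which is why both the sketch and the query group by category and grp together.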