This is an SQL problem I can't wrap my head around in a simple query. Is it possible?
The data set is (letters added for ease of understanding):
Start End
10:01 10:12 (A)
10:03 10:06 (B)
10:05 10:25 (C)
10:14 10:42 (D)
10:32 10:36 (E)
The desired output is:
PeriodStart  New      ActiveAtEnd  MinActive  MaxActive
09:50        0        0            0          0
10:00        3 (ABC)  2 (AC)       0          3 (ABC)
10:10        1 (D)    2 (CD)       1 (C)      2 (AC or CD)
10:20        0        1 (D)        1 (C)      2 (CD)
10:30        1 (E)    1 (D)        1 (D)      2 (DE)
10:40        0        0            0          1 (D)
10:50        0        0            0          0
So the query needed is a summary of the first table: for each 10-minute period, it calculates the minimum and the maximum number of overlapping time periods (Start to End) from the first table.
'New' is the number of rows with a Start in the summary period. 'ActiveAtEnd' is the number of rows active at the end of the summary period.
I'm using Oracle, but I'm sure a solution can be adjusted. Stored procedures are not allowed, just plain SELECT/INSERT (views are allowed). It's also OK to run one SQL command per 10-minute output (once populated, that is how it will be kept up to date).
Thanks for any ideas, including 'not possible' ;-)
4 solutions
#1
3
Assuming you also have (or create) a table named @Times with one record for each ten-minute start time, how about...
Select T.Start,
    (Select Count(*) From testTab
     Where Start Between T.Start
                 And DateAdd(minute, 10, T.Start)) New,
    (Select Count(*) From testTab
     Where Start < DateAdd(minute, 10, T.Start)
       And EndDt > DateAdd(minute, 10, T.Start)) ActiveAtEnd,
    (Select Max(Cnt) From
        (Select Count(Distinct T.Which) Cnt
         From (Select Distinct Start
               From testTab
               Where Start Between T.Start
                           And DateAdd(minute, 10, T.Start)
               Union Select T.Start
               Union Select DateAdd(minute, 10, T.Start)) Z
         Left Join testTab T
           On Z.Start Between T.Start And T.EndDt
         Group By Z.Start) ZZ ) MaxActive,
    (Select Min(Cnt) From
        (Select Count(Distinct T.Which) Cnt
         From (Select Distinct Start
               From testTab
               Where Start Between T.Start
                           And DateAdd(minute, 10, T.Start)
               Union Select T.Start
               Union Select DateAdd(minute, 10, T.Start)) Z
         Left Join testTab T
           On Z.Start Between T.Start And T.EndDt
         Group By Z.Start) ZZ ) MinActive
From @Times T
I created this table in SQL Server as a table variable, using:
Declare @Times Table (Start datetime Primary key Not Null)
Declare @Start DateTime
Set @Start = '1 Nov 2008 10:00'
While @Start < '1 Nov 2008 11:00' begin
Insert @Times(Start) values(@Start)
Set @Start = DateAdd(minute, 10, @Start)
End
If you are using another product, use a temp table instead, but this approach does need a table with one record for each ten-minute "period" as a hook to run against.
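The WHILE loop above is procedural, which the question rules out. As a rough sketch of a set-based alternative, a recursive common table expression can generate the same ten-minute rows in a single statement. The example below runs against SQLite through Python's sqlite3 module purely so that it is self-contained; the CTE name times is an assumption, the date range is copied from the loop, and the syntax would need adjusting for other products (Oracle, for instance, has CONNECT BY LEVEL against dual for row generation).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Recursive CTE: start at 10:00 and keep adding ten minutes.
# The WHERE clause stops the recursion once 10:50 has been produced,
# giving the same six rows as the WHILE loop.
periods = [row[0] for row in conn.execute("""
    WITH RECURSIVE times(start) AS (
        SELECT '2008-11-01 10:00:00'
        UNION ALL
        SELECT datetime(start, '+10 minutes')
        FROM times
        WHERE start < '2008-11-01 10:50:00'
    )
    SELECT start FROM times ORDER BY start
""")]

for p in periods:
    print(p)
```

The CTE could be wrapped in a view, or used to populate a permanent periods table once with INSERT ... SELECT.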
With the following data, this query generates the output below:
start endDt Which
----------------------- ----------------------- -----
2008-11-01 10:01:00.000 2008-11-01 10:12:00.000 A
2008-11-01 10:03:00.000 2008-11-01 10:06:00.000 B
2008-11-01 10:05:00.000 2008-11-01 10:25:00.000 C
2008-11-01 10:14:00.000 2008-11-01 10:42:00.000 D
2008-11-01 10:32:00.000 2008-11-01 10:36:00.000 E
2008-11-01 10:22:00.000 2008-11-01 10:51:00.000 F
2008-11-01 10:22:00.000 2008-11-01 10:23:00.000 G
Start New ActiveAtEnd MaxActive MinActive
----------------------- ----------- ----------- ----------- -----------
2008-11-01 10:00:00.000 3 2 3 0
2008-11-01 10:10:00.000 1 2 2 2
2008-11-01 10:20:00.000 2 2 4 2
2008-11-01 10:30:00.000 1 2 3 2
2008-11-01 10:40:00.000 0 1 2 1
2008-11-01 10:50:00.000 0 0 1 0
Warning: Null value is eliminated by an aggregate or other SET operation.
#2
1
I'm struggling with the ActiveAtEnd value, but the others are OK.
This is for MySQL:
set @active:=0;
select
    period,
    sum( if( score=1, 1, 0)) New,
    if( max(ab) > max(aa), max(ab), max(aa)) MaxActive,
    if( min( ab ) < min( aa ), min(ab), min(aa)) MinActive
from (
    select
        period,
        etime,
        score,
        @active ab,
        @active:=@active+score aa
    from (
        select
            from_unixtime( floor( unix_timestamp(start)/600) * 600) period,
            start etime,
            +1 score
        from ev
        union all
        select
            from_unixtime( floor( unix_timestamp(end)/600) * 600) period,
            end etime,
            -1 score
        from ev
    ) event order by etime
) as temp
group by period;
The innermost selection breaks the original table into a set of events, with a score of +1 for a start event and -1 for an end event. union all is used so that duplicate events are kept.
The next inner selection runs a variable across the score values: @active holds a count of the number of active intervals at each time point. The value of @active both before and after the current score is added is selected; I do not know how portable this is.
The outermost selection accumulates the results for each period. 'New' is the sum of the '+1' scores; MaxActive and MinActive must both take the value of active before (ab) and active after (aa) into account.
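The event-based idea described above can also be sketched outside SQL. The following Python is only an illustration under assumptions, not the answer's query translated verbatim: the function name summarize, its parameters and the helper t are hypothetical, periods are treated as half-open [start, start + 10 minutes), and an end event sorts before a start event at the same instant. Unlike the query, it carries the running count into each period, so periods with no events still get a row and ActiveAtEnd falls out of the same pass.

```python
from datetime import datetime, timedelta

def summarize(intervals, first_period, n_periods, width=timedelta(minutes=10)):
    """Return (period_start, new, active_at_end, min_active, max_active) rows."""
    # One +1 event per start and one -1 event per end, in time order.
    # Tuples sort by time, then delta, so ends precede starts on ties.
    events = sorted(
        [(start, +1) for start, _ in intervals]
        + [(end, -1) for _, end in intervals]
    )
    rows, active, i = [], 0, 0
    # Apply anything that happened before the first reported period.
    while i < len(events) and events[i][0] < first_period:
        active += events[i][1]
        i += 1
    for k in range(n_periods):
        p_start = first_period + k * width
        p_end = p_start + width
        new = 0
        lo = hi = active  # count carried in from the previous period
        while i < len(events) and events[i][0] < p_end:
            delta = events[i][1]
            active += delta
            if delta > 0:
                new += 1
            lo, hi = min(lo, active), max(hi, active)
            i += 1
        rows.append((p_start, new, active, lo, hi))
    return rows

def t(hhmm):
    # Hypothetical helper: the question gives only times, so a date is assumed.
    return datetime.strptime("2008-11-01 " + hhmm, "%Y-%m-%d %H:%M")

# Intervals A to E from the question.
data = [(t("10:01"), t("10:12")), (t("10:03"), t("10:06")),
        (t("10:05"), t("10:25")), (t("10:14"), t("10:42")),
        (t("10:32"), t("10:36"))]

result = summarize(data, t("09:50"), 7)
for p_start, new, at_end, lo, hi in result:
    print(p_start.strftime("%H:%M"), new, at_end, lo, hi)
```

On the question's five intervals this reproduces the numeric part of the desired output, including the all-zero rows for 09:50 and 10:50.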
Here are sample results:
+---------------------+------+-----------+-----------+
| period | New | MaxActive | MinActive |
+---------------------+------+-----------+-----------+
| 2008-11-19 10:00:00 | 3 | 3 | 0 |
| 2008-11-19 10:10:00 | 1 | 2 | 1 |
| 2008-11-19 10:20:00 | 0 | 2 | 1 |
| 2008-11-19 10:30:00 | 1 | 2 | 1 |
| 2008-11-19 10:40:00 | 0 | 1 | 0 |
+---------------------+------+-----------+-----------+
#3
0
New and ActiveAtEnd are fairly straightforward (assuming the period's start and end are stored in temporary variables):
select @periodStart PeriodStart
     , @periodEnd PeriodEnd
     , n.[new]
     , ae.ActiveAtEnd
from (
    select count(*) [new]
    from @times
    where [start] >= @periodStart
      and [start] < @PeriodEnd
) n
cross join
(
    select count(*) [ActiveAtEnd]
    from @times
    where [start] < @PeriodEnd
      and [end] >= @PeriodEnd
) ae
MaxActive and MinActive are harder. You can presume a minute's granularity, so you would need to explode each active period out at that granularity to be able to probe each slice.
I'm not sure that's possible in a single query.
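For what it's worth, the minute-by-minute probing described above can be written as one statement where recursive common table expressions are available. The sketch below is an illustration under assumptions, not a tested Oracle solution: it uses SQLite through Python's sqlite3 module so it is self-contained, the table ev and the columns start_ts and end_ts are hypothetical names, the date is invented, and an interval counts as active at minute m when start_ts <= m < end_ts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ev (start_ts TEXT, end_ts TEXT)")
conn.executemany("INSERT INTO ev VALUES (?, ?)", [
    ("2008-11-01 10:01:00", "2008-11-01 10:12:00"),  # A
    ("2008-11-01 10:03:00", "2008-11-01 10:06:00"),  # B
    ("2008-11-01 10:05:00", "2008-11-01 10:25:00"),  # C
    ("2008-11-01 10:14:00", "2008-11-01 10:42:00"),  # D
    ("2008-11-01 10:32:00", "2008-11-01 10:36:00"),  # E
])

rows = conn.execute("""
    WITH RECURSIVE minutes(m) AS (
        -- one row per minute from 09:50 to 10:59
        SELECT '2008-11-01 09:50:00'
        UNION ALL
        SELECT datetime(m, '+1 minutes') FROM minutes
        WHERE m < '2008-11-01 10:59:00'
    ),
    per_minute AS (
        -- number of intervals covering each sampled minute
        SELECT m, COUNT(ev.start_ts) AS active
        FROM minutes
        LEFT JOIN ev ON ev.start_ts <= m AND m < ev.end_ts
        GROUP BY m
    )
    -- 'HH:M' plus '0' gives the ten-minute bucket, e.g. 10:05 -> 10:00
    SELECT substr(m, 12, 4) || '0' AS period,
           MIN(active) AS MinActive,
           MAX(active) AS MaxActive
    FROM per_minute
    GROUP BY period
    ORDER BY period
""").fetchall()

for row in rows:
    print(row)
```

Sampling once per minute would miss any interval that starts and ends between two samples, so this only suits data that really is minute-aligned.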
#4
0
The only way I have ever been able to solve this sort of problem is to get the count of 'start' for each one-minute period, and then take the maximum (or minimum) for the 10-minute group. I have not been able to apply a set-based approach.