I have some data. I want to group them based on the value of data
column. If there are 3 or more consecutive rows that have data bigger than 10, then those rows are what I want.
我有一些数据。我想根据数据列的值对它们进行分组。如果有3个或更多的连续行数据大于10,那么这些行就是我想要的。
So for this data:
所以对于这些数据:
use tempdb;
go
set nocount on;
if object_id('t', 'U') is not null
drop table t;
go
create table t
(
id int primary key identity,
[when] datetime,
data int
)
go
insert into t([when], data) values ('20130801', 1);
insert into t([when], data) values ('20130802', 121);
insert into t([when], data) values ('20130803', 132);
insert into t([when], data) values ('20130804', 15);
insert into t([when], data) values ('20130805', 9);
insert into t([when], data) values ('20130806', 1435);
insert into t([when], data) values ('20130807', 143);
insert into t([when], data) values ('20130808', 18);
insert into t([when], data) values ('20130809', 19);
insert into t([when], data) values ('20130810', 1);
insert into t([when], data) values ('20130811', 1234);
insert into t([when], data) values ('20130812', 124);
insert into t([when], data) values ('20130813', 6);
select * from t;
What I want is:
我想要的是:
id when data
----------- ----------------------- -----------
2 2013-08-02 00:00:00.000 121
3 2013-08-03 00:00:00.000 132
4 2013-08-04 00:00:00.000 15
6 2013-08-06 00:00:00.000 1435
7 2013-08-07 00:00:00.000 143
8 2013-08-08 00:00:00.000 18
9 2013-08-09 00:00:00.000 19
How to do that?
如何做呢?
2 个解决方案
#1
15
Try this
试试这个
WITH cte
AS
(
SELECT *,COUNT(1) OVER(PARTITION BY cnt) pt FROM
(
SELECT tt.*
,(SELECT COUNT(id) FROM t WHERE data <= 10 AND ID < tt.ID) AS cnt
FROM t tt
WHERE data > 10
) t1
)
SELECT id, [when], data FROM cte WHERE pt >= 3
SQL小提琴演示
OUTPUT
输出
id when data
2 2013-08-02 00:00:00.000 121
3 2013-08-03 00:00:00.000 132
4 2013-08-04 00:00:00.000 15
6 2013-08-06 00:00:00.000 1435
7 2013-08-07 00:00:00.000 143
8 2013-08-08 00:00:00.000 18
9 2013-08-09 00:00:00.000 19
EDIT
编辑
First the inner query counts the no of records where data <= 10
首先,内部查询计数数据<= 10的记录的no
SELECT tt.*
,(SELECT COUNT(id) FROM t WHERE data <= 10 AND ID < tt.ID) AS cnt
FROM t tt
output
输出
id when data cnt
1 2013-08-01 00:00:00.000 1 1
2 2013-08-02 00:00:00.000 121 1
3 2013-08-03 00:00:00.000 132 1
4 2013-08-04 00:00:00.000 15 1
5 2013-08-05 00:00:00.000 9 2
6 2013-08-06 00:00:00.000 1435 2
7 2013-08-07 00:00:00.000 143 2
8 2013-08-08 00:00:00.000 18 2
9 2013-08-09 00:00:00.000 19 2
10 2013-08-10 00:00:00.000 1 3
11 2013-08-11 00:00:00.000 1234 3
12 2013-08-12 00:00:00.000 124 3
13 2013-08-13 00:00:00.000 6 4
Then we filter the records with data > 10
然后用>10数据过滤记录
WHERE data > 10
Now we count the records by partitoning cnt column
现在我们通过分割cnt列来计算记录
SELECT *,COUNT(1) OVER(PARTITION BY cnt) pt FROM
(
SELECT tt.*
,(SELECT COUNT(id) FROM t WHERE data <= 10 AND ID < tt.ID) AS cnt
FROM t tt
WHERE data > 10
) t1
Output
输出
id when data cnt pt
2 2013-08-02 00:00:00.000 121 1 3
3 2013-08-03 00:00:00.000 132 1 3
4 2013-08-04 00:00:00.000 15 1 3
6 2013-08-06 00:00:00.000 1435 2 4
7 2013-08-07 00:00:00.000 143 2 4
8 2013-08-08 00:00:00.000 18 2 4
9 2013-08-09 00:00:00.000 19 2 4
11 2013-08-11 00:00:00.000 1234 3 2
12 2013-08-12 00:00:00.000 124 3 2
The above query is put in cte just like temp table
上面的查询就像临时表一样放在cte中。
Now select the records that are having the consecutive count >= 3
现在选择具有连续计数>= 3的记录。
SELECT id, [when], data FROM cte WHERE pt >= 3
ANOTHER SOLUTION
另一个解决方案
;WITH partitioned AS (
SELECT *, id - ROW_NUMBER() OVER (ORDER BY id) AS grp
FROM t
WHERE data > 10
),
counted AS (
SELECT *, COUNT(*) OVER (PARTITION BY grp) AS cnt
FROM partitioned
)
SELECT id, [when], data
FROM counted
WHERE cnt >= 3
参考网址
SQL小提琴演示
#2
3
First, we discount any row that has a value of 10 or less:
首先,我们对任何值小于或等于10的行进行贴现:
WITH t10 AS (SELECT * FROM t WHERE data > 10),
Next, get the rows whose immediate predecessor is also more than 10:
接下来,获取其前任也大于10的行:
okleft AS (SELECT t10.*, pred.id AS predid FROM
t10
INNER JOIN t pred ON
pred.[when] < t10.[when]
AND pred.[when] >= ALL (SELECT [when] FROM t t2 WHERE t2.[when] < t10.[when])
WHERE pred.data > 10
),
Also get the rows whose immediate successor is also more than 10:
同时,也要得到直接继承人也超过10的行:
okright as (SELECT t10.*, succ.id AS succid FROM
t10
INNER JOIN t succ ON
succ.[when] > t10.[when]
AND succ.[when] <= ALL (SELECT [when] FROM t t2 WHERE t2.[when] > t10.[when])
WHERE succ.data > 10
),
Finally, select any row where it either starts a sequence of 3, is in the middle of one, or ends one:
最后,选择任意一行,它开始一个3的序列,在一个序列的中间,或者结束一个:
A row whose valid right side also has a valid right side starts a sequence of at least 3:
一个有效右边也有有效右边的行,其序列至少为3:
starts3 AS (SELECT id, [when], data FROM okright r1 WHERE EXISTS(
SELECT NULL FROM okright r2 WHERE r2.id = r1.succid)),
A row whose predecessor and successor are both valid is in the middle of at least 3:
其前任和继任者都有效的一排至少在3人中间:
mid3 AS (SELECT id, [when], data FROM okleft l WHERE EXISTS(
SELECT NULL FROM okright r WHERE r.id = l.id)),
A row whose valid left side also has a valid left side ends a sequence of at least 3:
一个有效的左边也有有效的左边的行结束一个至少3的序列:
ends3 AS (SELECT id, [when], data FROM okleft l1 WHERE EXISTS(
SELECT NULL FROM okleft l2 WHERE l2.id = l1.predid))
Join them all up, with UNION to remove duplicates:
将它们全部加入,与UNION一起删除副本:
SELECT * FROM starts3
UNION SELECT * FROM mid3
UNION SELECT * FROM ends3
SQL Fiddler: http://sqlfiddle.com/#!3/12f3a/9
SQL提琴手:http://sqlfiddle.com/ ! 3/12f3a / 9
Edit: I like BVR's answer, much more elegant than mine.
编辑:我喜欢BVR的答案,比我的要优雅得多。
#1
15
Try this
试试这个
WITH cte
AS
(
SELECT *,COUNT(1) OVER(PARTITION BY cnt) pt FROM
(
SELECT tt.*
,(SELECT COUNT(id) FROM t WHERE data <= 10 AND ID < tt.ID) AS cnt
FROM t tt
WHERE data > 10
) t1
)
SELECT id, [when], data FROM cte WHERE pt >= 3
SQL小提琴演示
OUTPUT
输出
id when data
2 2013-08-02 00:00:00.000 121
3 2013-08-03 00:00:00.000 132
4 2013-08-04 00:00:00.000 15
6 2013-08-06 00:00:00.000 1435
7 2013-08-07 00:00:00.000 143
8 2013-08-08 00:00:00.000 18
9 2013-08-09 00:00:00.000 19
EDIT
编辑
First the inner query counts the no of records where data <= 10
首先,内部查询计数数据<= 10的记录的no
SELECT tt.*
,(SELECT COUNT(id) FROM t WHERE data <= 10 AND ID < tt.ID) AS cnt
FROM t tt
output
输出
id when data cnt
1 2013-08-01 00:00:00.000 1 1
2 2013-08-02 00:00:00.000 121 1
3 2013-08-03 00:00:00.000 132 1
4 2013-08-04 00:00:00.000 15 1
5 2013-08-05 00:00:00.000 9 2
6 2013-08-06 00:00:00.000 1435 2
7 2013-08-07 00:00:00.000 143 2
8 2013-08-08 00:00:00.000 18 2
9 2013-08-09 00:00:00.000 19 2
10 2013-08-10 00:00:00.000 1 3
11 2013-08-11 00:00:00.000 1234 3
12 2013-08-12 00:00:00.000 124 3
13 2013-08-13 00:00:00.000 6 4
Then we filter the records with data > 10
然后用>10数据过滤记录
WHERE data > 10
Now we count the records by partitoning cnt column
现在我们通过分割cnt列来计算记录
SELECT *,COUNT(1) OVER(PARTITION BY cnt) pt FROM
(
SELECT tt.*
,(SELECT COUNT(id) FROM t WHERE data <= 10 AND ID < tt.ID) AS cnt
FROM t tt
WHERE data > 10
) t1
Output
输出
id when data cnt pt
2 2013-08-02 00:00:00.000 121 1 3
3 2013-08-03 00:00:00.000 132 1 3
4 2013-08-04 00:00:00.000 15 1 3
6 2013-08-06 00:00:00.000 1435 2 4
7 2013-08-07 00:00:00.000 143 2 4
8 2013-08-08 00:00:00.000 18 2 4
9 2013-08-09 00:00:00.000 19 2 4
11 2013-08-11 00:00:00.000 1234 3 2
12 2013-08-12 00:00:00.000 124 3 2
The above query is put in cte just like temp table
上面的查询就像临时表一样放在cte中。
Now select the records that are having the consecutive count >= 3
现在选择具有连续计数>= 3的记录。
SELECT id, [when], data FROM cte WHERE pt >= 3
ANOTHER SOLUTION
另一个解决方案
;WITH partitioned AS (
SELECT *, id - ROW_NUMBER() OVER (ORDER BY id) AS grp
FROM t
WHERE data > 10
),
counted AS (
SELECT *, COUNT(*) OVER (PARTITION BY grp) AS cnt
FROM partitioned
)
SELECT id, [when], data
FROM counted
WHERE cnt >= 3
参考网址
SQL小提琴演示
#2
3
First, we discount any row that has a value of 10 or less:
首先,我们对任何值小于或等于10的行进行贴现:
WITH t10 AS (SELECT * FROM t WHERE data > 10),
Next, get the rows whose immediate predecessor is also more than 10:
接下来,获取其前任也大于10的行:
okleft AS (SELECT t10.*, pred.id AS predid FROM
t10
INNER JOIN t pred ON
pred.[when] < t10.[when]
AND pred.[when] >= ALL (SELECT [when] FROM t t2 WHERE t2.[when] < t10.[when])
WHERE pred.data > 10
),
Also get the rows whose immediate successor is also more than 10:
同时,也要得到直接继承人也超过10的行:
okright as (SELECT t10.*, succ.id AS succid FROM
t10
INNER JOIN t succ ON
succ.[when] > t10.[when]
AND succ.[when] <= ALL (SELECT [when] FROM t t2 WHERE t2.[when] > t10.[when])
WHERE succ.data > 10
),
Finally, select any row where it either starts a sequence of 3, is in the middle of one, or ends one:
最后,选择任意一行,它开始一个3的序列,在一个序列的中间,或者结束一个:
A row whose valid right side also has a valid right side starts a sequence of at least 3:
一个有效右边也有有效右边的行,其序列至少为3:
starts3 AS (SELECT id, [when], data FROM okright r1 WHERE EXISTS(
SELECT NULL FROM okright r2 WHERE r2.id = r1.succid)),
A row whose predecessor and successor are both valid is in the middle of at least 3:
其前任和继任者都有效的一排至少在3人中间:
mid3 AS (SELECT id, [when], data FROM okleft l WHERE EXISTS(
SELECT NULL FROM okright r WHERE r.id = l.id)),
A row whose valid left side also has a valid left side ends a sequence of at least 3:
一个有效的左边也有有效的左边的行结束一个至少3的序列:
ends3 AS (SELECT id, [when], data FROM okleft l1 WHERE EXISTS(
SELECT NULL FROM okleft l2 WHERE l2.id = l1.predid))
Join them all up, with UNION to remove duplicates:
将它们全部加入,与UNION一起删除副本:
SELECT * FROM starts3
UNION SELECT * FROM mid3
UNION SELECT * FROM ends3
SQL Fiddler: http://sqlfiddle.com/#!3/12f3a/9
SQL提琴手:http://sqlfiddle.com/ ! 3/12f3a / 9
Edit: I like BVR's answer, much more elegant than mine.
编辑:我喜欢BVR的答案,比我的要优雅得多。