GROUP BY用于SQL中的连续行

时间:2021-02-20 07:39:02

Given the following table:

鉴于下表:

ID   State  Date
12   1      2009-07-16 10:00
45   2      2009-07-16 13:00
67   2      2009-07-16 14:40
77   1      2009-07-16 15:00
89   1      2009-07-16 15:30
99   1      2009-07-16 16:00

Question:
How can i GROUP by the field "State", while still maintaining the borders between the state changes?

问题:我如何通过“国家”字段进行GROUP,同时仍保持状态变化之间的边界?

SELECT MIN(ID) AS ID, State, MIN(Date) AS Date, COUNT(ID) AS Count
FROM table GROUP BY State

results in the following:

结果如下:

ID   State  Date              Count
12   1      2009-07-16 10:00  4
45   2      2009-07-16 13:00  2


but this is expected:

但这是预期的:

ID   State  Date              Count
12   1      2009-07-16 10:00  1
45   2      2009-07-16 13:00  2
77   1      2009-07-16 15:00  3


Is this possible in SQL? I didn't find a solution so far...

这在SQL中可行吗?到目前为止我没有找到解决方案......

4 个解决方案

#1


SELECT  MIN(id) AS id, MIN(ts) AS ts, MIN(state) AS state, COUNT(*) cnt
FROM    (
        SELECT  @r := @r + (@state != state) AS gn,
                @state := state AS sn,
                s.*
        FROM    (
                SELECT  @r := 0,
                        @state := 0
                ) vars,
                t_state s
        ORDER BY
                ts
        ) q
GROUP BY
        gn

Table creation scripts for testing:

用于测试的表创建脚本:

CREATE TABLE t_state (id INT NOT NULL PRIMARY KEY, state INT NOT NULL, ts DATETIME NOT NULL);

INSERT
INTO  t_state
VALUES
(12,   1,      '2009-07-16 10:00'),
(45,   2,      '2009-07-16 13:00'),
(67,   2,      '2009-07-16 14:40'),
(77,   1,      '2009-07-16 15:00'),
(89,   1,      '2009-07-16 15:30'),
(99,   1,      '2009-07-16 16:00');

#2


This is how to do it with CTEs on MSSQL server

这是如何在MSSQL服务器上使用CTE

-- DROP TABLE MyLog
CREATE TABLE MyLog(
        ID          INT PRIMARY KEY
        , State     INT
        , Date      DATETIME
        )
INSERT MyLog
SELECT 12, 1, '2009-07-16 10:00' UNION ALL
SELECT 45, 2, '2009-07-16 13:00' UNION ALL
SELECT 67, 2, '2009-07-16 14:40' UNION ALL
SELECT 77, 1, '2009-07-16 15:00' UNION ALL
SELECT 89, 1, '2009-07-16 15:30' UNION ALL
SELECT 99, 1, '2009-07-16 16:00'

;WITH   CTE
AS      (
        SELECT  ROW_NUMBER() OVER(ORDER BY ID) AS RowNo
                , *
        FROM    MyLog
        )
, MyLogGroup
AS      (
        SELECT  l.*
                , ( SELECT  MAX(ID)
                    FROM    CTE c
                    WHERE   NOT EXISTS (SELECT * FROM CTE
                                        WHERE RowNo = c.RowNo-1 AND State = c.State)
                            AND c.ID <= l.ID) AS GroupID
        FROM    MyLog l
        )
SELECT  *
FROM    MyLogGroup

#3


Here is a lengthier description of how solutions like the one offered by Quassnoi work

以下是Quassnoi提供的解决方案如何工作的更长的描述

#4


I might be stating the obvious here, but if you're willing to make use of Transact-SQL, you can iterate through the rows of the table and build your own result set, which probably seems like a hassle, but it will definitely work. The iteration can be done without the use of cursors.

我可能会在这里说明显而易见的,但是如果你愿意使用Transact-SQL,你可以遍历表的行并构建自己的结果集,这可能看起来很麻烦,但它肯定会起作用。迭代可以在不使用游标的情况下完成。

#1


SELECT  MIN(id) AS id, MIN(ts) AS ts, MIN(state) AS state, COUNT(*) cnt
FROM    (
        SELECT  @r := @r + (@state != state) AS gn,
                @state := state AS sn,
                s.*
        FROM    (
                SELECT  @r := 0,
                        @state := 0
                ) vars,
                t_state s
        ORDER BY
                ts
        ) q
GROUP BY
        gn

Table creation scripts for testing:

用于测试的表创建脚本:

CREATE TABLE t_state (id INT NOT NULL PRIMARY KEY, state INT NOT NULL, ts DATETIME NOT NULL);

INSERT
INTO  t_state
VALUES
(12,   1,      '2009-07-16 10:00'),
(45,   2,      '2009-07-16 13:00'),
(67,   2,      '2009-07-16 14:40'),
(77,   1,      '2009-07-16 15:00'),
(89,   1,      '2009-07-16 15:30'),
(99,   1,      '2009-07-16 16:00');

#2


This is how to do it with CTEs on MSSQL server

这是如何在MSSQL服务器上使用CTE

-- DROP TABLE MyLog
CREATE TABLE MyLog(
        ID          INT PRIMARY KEY
        , State     INT
        , Date      DATETIME
        )
INSERT MyLog
SELECT 12, 1, '2009-07-16 10:00' UNION ALL
SELECT 45, 2, '2009-07-16 13:00' UNION ALL
SELECT 67, 2, '2009-07-16 14:40' UNION ALL
SELECT 77, 1, '2009-07-16 15:00' UNION ALL
SELECT 89, 1, '2009-07-16 15:30' UNION ALL
SELECT 99, 1, '2009-07-16 16:00'

;WITH   CTE
AS      (
        SELECT  ROW_NUMBER() OVER(ORDER BY ID) AS RowNo
                , *
        FROM    MyLog
        )
, MyLogGroup
AS      (
        SELECT  l.*
                , ( SELECT  MAX(ID)
                    FROM    CTE c
                    WHERE   NOT EXISTS (SELECT * FROM CTE
                                        WHERE RowNo = c.RowNo-1 AND State = c.State)
                            AND c.ID <= l.ID) AS GroupID
        FROM    MyLog l
        )
SELECT  *
FROM    MyLogGroup

#3


Here is a lengthier description of how solutions like the one offered by Quassnoi work

以下是Quassnoi提供的解决方案如何工作的更长的描述

#4


I might be stating the obvious here, but if you're willing to make use of Transact-SQL, you can iterate through the rows of the table and build your own result set, which probably seems like a hassle, but it will definitely work. The iteration can be done without the use of cursors.

我可能会在这里说明显而易见的,但是如果你愿意使用Transact-SQL,你可以遍历表的行并构建自己的结果集,这可能看起来很麻烦,但它肯定会起作用。迭代可以在不使用游标的情况下完成。