How do I GROUP entries by an uninterrupted sequence?

Time: 2021-05-09 15:34:48
CREATE TABLE entries (
  id serial NOT NULL,
  title character varying,
  load_sequence integer
);

and this data:

INSERT INTO entries(title, load_sequence) VALUES ('A', 1);
INSERT INTO entries(title, load_sequence) VALUES ('A', 2);
INSERT INTO entries(title, load_sequence) VALUES ('A', 3);

INSERT INTO entries(title, load_sequence) VALUES ('A', 6);

INSERT INTO entries(title, load_sequence) VALUES ('B', 4);
INSERT INTO entries(title, load_sequence) VALUES ('B', 5);

INSERT INTO entries(title, load_sequence) VALUES ('B', 7);
INSERT INTO entries(title, load_sequence) VALUES ('B', 8);

Is there a way in PostgreSQL to write SQL that, after ordering the rows by load_sequence, groups them into segments of adjacent rows with the same title? I mean:

=# SELECT id, title, load_sequence FROM entries ORDER BY load_sequence;
 id | title | load_sequence 
----+-------+---------------
  9 | A     |             1
 10 | A     |             2
 11 | A     |             3
 13 | B     |             4
 14 | B     |             5
 12 | A     |             6
 15 | B     |             7
 16 | B     |             8

And I want these groups:

=# SELECT title, string_agg(id::text, ',' ORDER BY id) FROM entries ???????????;

so that the result would be:

 title | string_agg  
-------+-------------
 A     | 9,10,11
 B     | 13,14
 A     | 12
 B     | 15,16

2 answers

#1


4  

You can use the following query:

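SELECT title, string_agg(id::text, ',' ORDER BY id)
FROM (
  SELECT id, title, load_sequence,
         -- position among all rows minus position within the same title:
         -- constant inside each unbroken run of a title
         ROW_NUMBER() OVER (ORDER BY load_sequence) -
         ROW_NUMBER() OVER (PARTITION BY title
                            ORDER BY load_sequence) AS grp
  FROM entries ) AS t
GROUP BY title, grp
ORDER BY min(load_sequence);  -- list the runs in load order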

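The calculated grp column identifies runs of rows that share a title and are adjacent when ordered by load_sequence. Within such a run both row numbers increase by 1 per row, so their difference stays constant; as soon as a row with a different title intervenes, the difference changes. The same grp value can occur for different titles, which is why the GROUP BY clause uses title together with grp to achieve the required aggregation over id values.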
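
To see why the difference of the two row numbers is constant within a run, the arithmetic can be traced outside the database. The following is only an illustrative Python sketch, not part of the answer: ROWS copies the sample data from the question, row_number_groups is a hypothetical helper name, and it assumes load_sequence values are unique, as they are in the sample.

```python
from collections import defaultdict

# Sample rows from the question as (id, title, load_sequence).
ROWS = [
    (9, "A", 1), (10, "A", 2), (11, "A", 3), (12, "A", 6),
    (13, "B", 4), (14, "B", 5), (15, "B", 7), (16, "B", 8),
]


def row_number_groups(rows):
    """Return (title, grp, ids) for each run, in load_sequence order."""
    per_title = defaultdict(int)  # mimics ROW_NUMBER() OVER (PARTITION BY title ...)
    groups = {}                   # (title, grp) -> ids; dicts keep insertion order
    ordered = sorted(rows, key=lambda r: r[2])
    # 'overall' mimics ROW_NUMBER() OVER (ORDER BY load_sequence)
    for overall, (id_, title, _seq) in enumerate(ordered, start=1):
        per_title[title] += 1
        grp = overall - per_title[title]
        groups.setdefault((title, grp), []).append(id_)
    return [(title, grp, sorted(ids)) for (title, grp), ids in groups.items()]


for title, grp, ids in row_number_groups(ROWS):
    print(title, grp, ",".join(map(str, ids)))
# prints:
# A 0 9,10,11
# B 3 13,14
# A 2 12
# B 4 15,16
```

The grp values themselves (0, 3, 2, 4) carry no meaning beyond being equal inside a run, so they should not be relied on for ordering the result.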


#2


0  

There's a trick you can use for this: sum, used as a window function, running over a flag that is computed with the lag window function.

The idea is that when you hit an edge/discontinuity you return 1, otherwise you return 0. You detect the discontinuities using the lag window function, which gives you the title of the previous row. The running sum of these flags then increases by 1 at every edge, so it numbers the runs.

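SELECT title, string_agg(id::text, ',' ORDER BY id)
FROM (
  SELECT
    id, title, load_sequence,
    -- running total of the change flags = number of the current run
    sum(title_changed) OVER (ORDER BY load_sequence) AS partition_no
  FROM (
    SELECT
      id, title, load_sequence,
      -- 1 on the first row and wherever the title differs from the previous row
      CASE WHEN title = lag(title, 1) OVER (ORDER BY load_sequence)
           THEN 0 ELSE 1 END AS title_changed
    FROM entries
  ) x
) y
GROUP BY partition_no, title
ORDER BY partition_no;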
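
The flag-and-running-sum idea can also be traced by hand. This is a rough Python sketch for illustration only, not the answer's code: ROWS copies the sample data from the question, running_sum_groups is a hypothetical helper name, and None stands in for the NULL that lag returns on the first row.

```python
from itertools import accumulate

# Sample rows from the question as (id, title, load_sequence).
ROWS = [
    (9, "A", 1), (10, "A", 2), (11, "A", 3), (12, "A", 6),
    (13, "B", 4), (14, "B", 5), (15, "B", 7), (16, "B", 8),
]


def running_sum_groups(rows):
    """Return (flags, run numbers, grouped ids) in load_sequence order."""
    ordered = sorted(rows, key=lambda r: r[2])
    titles = [title for _id, title, _seq in ordered]
    previous = [None] + titles[:-1]  # mimics lag(title, 1)
    # 1 where the title differs from the previous row (or there is none), else 0
    changed = [0 if prev == cur else 1 for prev, cur in zip(previous, titles)]
    # mimics sum(title_changed) OVER (ORDER BY load_sequence)
    partition_no = list(accumulate(changed))
    groups = {}  # (partition_no, title) -> ids; dicts keep insertion order
    for (id_, title, _seq), no in zip(ordered, partition_no):
        groups.setdefault((no, title), []).append(id_)
    grouped = [(no, title, sorted(ids)) for (no, title), ids in groups.items()]
    return changed, partition_no, grouped


changed, partition_no, grouped = running_sum_groups(ROWS)
print(changed)       # [1, 0, 0, 1, 0, 1, 1, 0]
print(partition_no)  # [1, 1, 1, 2, 2, 3, 4, 4]
for no, title, ids in grouped:
    print(no, title, ",".join(map(str, ids)))
# prints:
# 1 A 9,10,11
# 2 B 13,14
# 3 A 12
# 4 B 15,16
```

Unlike the row-number difference, the run number here grows with load_sequence, so it can double as a sort key for the result.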
