Array intersection as an aggregate function for groups

Posted: 2020-12-27 22:46:36

I have the following table:

CREATE TABLE person
AS
  SELECT name, preferences
  FROM ( VALUES
    ( 'John', ARRAY['pizza', 'meat'] ),
    ( 'John', ARRAY['pizza', 'spaghetti'] ),
    ( 'Bill', ARRAY['lettuce', 'pizza'] ),
    ( 'Bill', ARRAY['tomatoes'] )
  ) AS t(name, preferences);

I want to group by person, using the intersection of the preferences arrays as the aggregate function, so that I get the following output:

person | preferences
-------------------------------
John   | ['pizza']
Bill   | []

How should this be done in SQL? I guess I need to do something like the following, but what does the X function look like?

SELECT    person.name, array_agg(X)
FROM      person
LEFT JOIN unnest(preferences) preferences
ON        true
GROUP BY  name

3 answers

#1


2  

Using FILTER with ARRAY_AGG

SELECT name, array_agg(pref) FILTER (WHERE namepref = total)
FROM (
  SELECT name, pref, t1.count AS total, count(*) AS namepref
  FROM (
    SELECT name, preferences, count(*) OVER (PARTITION BY name)
    FROM person
  ) AS t1
  CROSS JOIN LATERAL unnest(preferences) AS pref
  GROUP BY name, total, pref
) AS t2
GROUP BY name;
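
Two caveats with the query above may be worth handling. array_agg with FILTER returns NULL rather than an empty array when no element qualifies (as for Bill), and count(*) counts an element more than once if a single array repeats it. The following is only a sketch of one possible variant, assuming the same person table: it deduplicates each array before counting and wraps the result in COALESCE. The aliases u, d, and total are names chosen here for illustration.

```sql
SELECT name,
       -- COALESCE turns the NULL produced when nothing qualifies into an empty array
       COALESCE(array_agg(pref) FILTER (WHERE namepref = total), '{}') AS preferences
FROM (
  SELECT t1.name, d.pref, t1.total, count(*) AS namepref
  FROM (
    -- total = number of rows (arrays) recorded for each name
    SELECT name, preferences, count(*) OVER (PARTITION BY name) AS total
    FROM person
  ) AS t1
  -- DISTINCT ensures each array contributes an element at most once
  CROSS JOIN LATERAL (
    SELECT DISTINCT pref FROM unnest(t1.preferences) AS u(pref)
  ) AS d
  GROUP BY t1.name, t1.total, d.pref
) AS t2
GROUP BY name;
```

As in the original query, a name whose arrays are all empty produces no rows from unnest and therefore does not appear in the result at all.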

Here is one way to do it using the ARRAY constructor and DISTINCT.

WITH t AS (
  SELECT name, pref, t1.count AS total, count(*) AS namepref
  FROM (
    SELECT name, preferences, count(*) OVER (PARTITION BY name)
    FROM person
  ) AS t1
  CROSS JOIN LATERAL unnest(preferences) AS pref
  GROUP BY name, total, pref
)
SELECT DISTINCT
  name,
  ARRAY(SELECT pref FROM t AS t2 WHERE total=namepref AND t.name = t2.name)
FROM t;

#2


2  

You could create your own aggregate function:

CREATE OR REPLACE FUNCTION arr_sec_agg_f(anyarray, anyarray) RETURNS anyarray
   LANGUAGE sql IMMUTABLE AS
   'SELECT CASE
              WHEN $1 IS NULL
              THEN $2
              WHEN $2 IS NULL
              THEN $1
              ELSE array_agg(x)
           END
    FROM (SELECT x FROM unnest($1) a(x)
          INTERSECT
          SELECT x FROM unnest($2) a(x)
         ) q';

CREATE AGGREGATE arr_sec_agg(anyarray) (
   SFUNC = arr_sec_agg_f(anyarray, anyarray),
   STYPE = anyarray
);

SELECT name, arr_sec_agg(preferences)
FROM person
GROUP BY name;

┌──────┬─────────────┐
│ name │ arr_sec_agg │
├──────┼─────────────┤
│ John │ {pizza}     │
│ Bill │             │
└──────┴─────────────┘
(2 rows)
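
Note that array_agg over an empty set yields NULL, which is why Bill's result is empty above. Because the transition function also treats a NULL state as "no rows seen yet", a group whose intersection becomes empty and then receives a further row would restart from that row. A hedged alternative is sketched below. It relies on PostgreSQL's documented handling of STRICT transition functions: with a NULL initial state, the first non-NULL input seeds the state and NULL inputs are skipped. The names arr_intersect_f and arr_intersect_agg are hypothetical.

```sql
-- STRICT: never called with NULL arguments, so no CASE handling is needed.
-- ARRAY(subquery) returns an empty array, not NULL, when the subquery has no rows.
CREATE OR REPLACE FUNCTION arr_intersect_f(anyarray, anyarray) RETURNS anyarray
   LANGUAGE sql IMMUTABLE STRICT AS
   'SELECT ARRAY(SELECT x FROM unnest($1) a(x)
                 INTERSECT
                 SELECT x FROM unnest($2) b(x))';

CREATE AGGREGATE arr_intersect_agg(anyarray) (
   SFUNC = arr_intersect_f,
   STYPE = anyarray
);

SELECT name, arr_intersect_agg(preferences)
FROM person
GROUP BY name;
```

With the sample data this should give {pizza} for John and an empty array {} for Bill, and an empty intermediate result stays empty no matter how many rows follow.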

#3


1  

If writing a custom aggregate (like the one @LaurenzAlbe provided) is not an option for you, you can usually unroll the same logic into a recursive CTE:

with recursive cte(name, pref_intersect, pref_prev, iteration) as (
    select   name,
             min(preferences),
             min(preferences),
             0
    from     your_table
    group by name
  union all
    select   name,
             array(select e from unnest(pref_intersect) e
                   intersect
                   select e from unnest(pref_next) e),
             pref_next,
             iteration + 1
    from     cte,
    lateral  (select   your_table.preferences pref_next
              from     your_table
              where    your_table.name        = cte.name
              and      your_table.preferences > cte.pref_prev
              order by your_table.preferences
              limit    1) n
)
select   distinct on (name) name, pref_intersect
from     cte
order by name, iteration desc

http://rextester.com/ZQMGW66052

The main idea here is to find an ordering in which you can "walk" through your rows. I used the natural ordering of the preferences array (because not many of your columns are shown). Ideally, this ordering should be on one or more unique fields (preferably the primary key), but here, because duplicates in the preferences column do not influence the result of the intersection, it is good enough.
