I have the following table:
我有下表:
CREATE TABLE person
AS
SELECT name, preferences
FROM ( VALUES
( 'John', ARRAY['pizza', 'meat'] ),
( 'John', ARRAY['pizza', 'spaghetti'] ),
( 'Bill', ARRAY['lettuce', 'pizza'] ),
( 'Bill', ARRAY['tomatoes'] )
) AS t(name, preferences);
I want to group by person
with intersect(preferences)
as aggregate function. So I want the following output:
我想将具有intersect(preferences)的人分组为聚合函数。所以我想要以下输出:
person | preferences
-------------------------------
John | ['pizza']
Bill | []
How should this be done in SQL? I guess I need to do something like the following, but what does the X
function look like?
如何在SQL中实现这一点?我想我需要做如下的事情,但是X函数是什么样子的呢?
SELECT person.name, array_agg(X)
FROM person
LEFT JOIN unnest(preferences) preferences
ON true
GROUP BY name
3 个解决方案
#1
2
Using FILTER
with ARRAY_AGG
使用过滤器和ARRAY_AGG
SELECT name, array_agg(pref) FILTER (WHERE namepref = total)
FROM (
SELECT name, pref, t1.count AS total, count(*) AS namepref
FROM (
SELECT name, preferences, count(*) OVER (PARTITION BY name)
FROM person
) AS t1
CROSS JOIN LATERAL unnest(preferences) AS pref
GROUP BY name, total, pref
) AS t2
GROUP BY name;
Here is one way to do it using the ARRAY
constructor and DISTINCT
.
这里有一种使用数组构造函数和不同的方法。
WITH t AS (
SELECT name, pref, t1.count AS total, count(*) AS namepref
FROM (
SELECT name, preferences, count(*) OVER (PARTITION BY name)
FROM person
) AS t1
CROSS JOIN LATERAL unnest(preferences) AS pref
GROUP BY name, total, pref
)
SELECT DISTINCT
name,
ARRAY(SELECT pref FROM t AS t2 WHERE total=namepref AND t.name = t2.name)
FROM t;
#2
2
You could create your own aggregate function:
您可以创建自己的聚合函数:
CREATE OR REPLACE FUNCTION arr_sec_agg_f(anyarray, anyarray) RETURNS anyarray
LANGUAGE sql IMMUTABLE AS
'SELECT CASE
WHEN $1 IS NULL
THEN $2
WHEN $2 IS NULL
THEN $1
ELSE array_agg(x)
END
FROM (SELECT x FROM unnest($1) a(x)
INTERSECT
SELECT x FROM unnest($2) a(x)
) q';
CREATE AGGREGATE arr_sec_agg(anyarray) (
SFUNC = arr_sec_agg_f(anyarray, anyarray),
STYPE = anyarray
);
SELECT name, arr_sec_agg(preferences)
FROM person
GROUP BY name;
┌──────┬─────────────┐
│ name │ arr_sec_agg │
├──────┼─────────────┤
│ John │ {pizza} │
│ Bill │ │
└──────┴─────────────┘
(2 rows)
#3
1
If writing a custom aggregate (like @LaurenzAlbe provided) is not an option for you, you can usually enroll the same logic in a recursive CTE:
如果编写自定义聚合(如提供的@LaurenzAlbe)不是您的选项,您通常可以在递归CTE中注册相同的逻辑:
with recursive cte(name, pref_intersect, pref_prev, iteration) as (
select name,
min(preferences),
min(preferences),
0
from your_table
group by name
union all
select name,
array(select e from unnest(pref_intersect) e
intersect
select e from unnest(pref_next) e),
pref_next,
iteration + 1
from cte,
lateral (select your_table.preferences pref_next
from your_table
where your_table.name = cte.name
and your_table.preferences > cte.pref_prev
order by your_table.preferences
limit 1) n
)
select distinct on (name) name, pref_intersect
from cte
order by name, iteration desc
http://rextester.com/ZQMGW66052
http://rextester.com/ZQMGW66052
The main idea here is to find an ordering in which you can "walk" through your rows. I used the natural ordering of the preferences
array (because not much of your columns are being showed). Ideally, this ordering should happen on (a) unique field(s) (preferably on the primary key), but here, because duplications in the preferences
column does not influence the result of the intersection, it is good enough.
这里的主要思想是找到一个排序,在其中您可以“遍历”您的行。我使用了首选项数组的自然排序(因为显示的列不多)。理想情况下,这种排序应该发生在(a)唯一字段(最好是在主键上),但是在这里,因为preferences列中的重复不会影响交集的结果,所以这样就足够了。
#1
2
Using FILTER
with ARRAY_AGG
使用过滤器和ARRAY_AGG
SELECT name, array_agg(pref) FILTER (WHERE namepref = total)
FROM (
SELECT name, pref, t1.count AS total, count(*) AS namepref
FROM (
SELECT name, preferences, count(*) OVER (PARTITION BY name)
FROM person
) AS t1
CROSS JOIN LATERAL unnest(preferences) AS pref
GROUP BY name, total, pref
) AS t2
GROUP BY name;
Here is one way to do it using the ARRAY
constructor and DISTINCT
.
这里有一种使用数组构造函数和不同的方法。
WITH t AS (
SELECT name, pref, t1.count AS total, count(*) AS namepref
FROM (
SELECT name, preferences, count(*) OVER (PARTITION BY name)
FROM person
) AS t1
CROSS JOIN LATERAL unnest(preferences) AS pref
GROUP BY name, total, pref
)
SELECT DISTINCT
name,
ARRAY(SELECT pref FROM t AS t2 WHERE total=namepref AND t.name = t2.name)
FROM t;
#2
2
You could create your own aggregate function:
您可以创建自己的聚合函数:
CREATE OR REPLACE FUNCTION arr_sec_agg_f(anyarray, anyarray) RETURNS anyarray
LANGUAGE sql IMMUTABLE AS
'SELECT CASE
WHEN $1 IS NULL
THEN $2
WHEN $2 IS NULL
THEN $1
ELSE array_agg(x)
END
FROM (SELECT x FROM unnest($1) a(x)
INTERSECT
SELECT x FROM unnest($2) a(x)
) q';
CREATE AGGREGATE arr_sec_agg(anyarray) (
SFUNC = arr_sec_agg_f(anyarray, anyarray),
STYPE = anyarray
);
SELECT name, arr_sec_agg(preferences)
FROM person
GROUP BY name;
┌──────┬─────────────┐
│ name │ arr_sec_agg │
├──────┼─────────────┤
│ John │ {pizza} │
│ Bill │ │
└──────┴─────────────┘
(2 rows)
#3
1
If writing a custom aggregate (like @LaurenzAlbe provided) is not an option for you, you can usually enroll the same logic in a recursive CTE:
如果编写自定义聚合(如提供的@LaurenzAlbe)不是您的选项,您通常可以在递归CTE中注册相同的逻辑:
with recursive cte(name, pref_intersect, pref_prev, iteration) as (
select name,
min(preferences),
min(preferences),
0
from your_table
group by name
union all
select name,
array(select e from unnest(pref_intersect) e
intersect
select e from unnest(pref_next) e),
pref_next,
iteration + 1
from cte,
lateral (select your_table.preferences pref_next
from your_table
where your_table.name = cte.name
and your_table.preferences > cte.pref_prev
order by your_table.preferences
limit 1) n
)
select distinct on (name) name, pref_intersect
from cte
order by name, iteration desc
http://rextester.com/ZQMGW66052
http://rextester.com/ZQMGW66052
The main idea here is to find an ordering in which you can "walk" through your rows. I used the natural ordering of the preferences
array (because not much of your columns are being showed). Ideally, this ordering should happen on (a) unique field(s) (preferably on the primary key), but here, because duplications in the preferences
column does not influence the result of the intersection, it is good enough.
这里的主要思想是找到一个排序,在其中您可以“遍历”您的行。我使用了首选项数组的自然排序(因为显示的列不多)。理想情况下,这种排序应该发生在(a)唯一字段(最好是在主键上),但是在这里,因为preferences列中的重复不会影响交集的结果,所以这样就足够了。