Is there something like a zip() function in PostgreSQL that combines two arrays?

Time: 2021-09-07 21:17:18

I have two array values of the same length in PostgreSQL:

{a,b,c} and {d,e,f}

and I'd like to combine them into

{{a,d},{b,e},{c,f}}

Is there a way to do that?

2 solutions

#1


34  

Postgres 9.3 or older

Simple zip()

Consider the following demo for Postgres 9.3 or earlier:

SELECT ARRAY[a,b] AS ab
FROM  (
   SELECT unnest('{a,b,c}'::text[]) AS a
         ,unnest('{d,e,f}'::text[]) AS b
    ) x;

Result:

  ab
-------
 {a,d}
 {b,e}
 {c,f}

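Note that both arrays must have the same number of elements to unnest in parallel, or you get a cross join instead (more precisely, in these versions the number of rows is the least common multiple of the two array lengths).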
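
To make the caveat concrete, here is a minimal sketch with two arrays of different length (3 and 2 elements). The literals are made up for illustration, and the row counts in the comments describe the documented behavior of set-returning functions in the SELECT list, which changed in Postgres 10:

```sql
-- Two arrays of different length, unnested side by side in the SELECT list.
SELECT unnest('{a,b,c}'::text[]) AS a
     , unnest('{d,e}'::text[])   AS b;
-- Postgres 9.6 or older: 6 rows (least common multiple of 3 and 2),
--   so every element of the first array meets every element of the second.
-- Postgres 10 or later: 3 rows, the shorter set is padded with NULL.
```

So on the old versions this section is about, check the lengths first if they are not guaranteed to match.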

You can wrap this into a function, if you want to:

CREATE OR REPLACE FUNCTION zip(anyarray, anyarray)
  RETURNS SETOF anyarray LANGUAGE SQL AS
$func$
SELECT ARRAY[a,b] FROM (SELECT unnest($1) AS a, unnest($2) AS b) x;
$func$;

Call:

SELECT zip('{a,b,c}'::text[],'{d,e,f}'::text[]);

Same result.

zip() to multi-dimensional array:

Now, if you want to aggregate that new set of arrays into one 2-dimensional array, it gets more complicated.

SELECT ARRAY (SELECT ...)

or:

SELECT array_agg(ARRAY[a,b]) AS ab
FROM  (
   SELECT unnest('{a,b,c}'::text[]) AS a
         ,unnest('{d,e,f}'::text[]) AS b
    ) x

or:

SELECT array_agg(ARRAY[ARRAY[a,b]]) AS ab FROM ...

will all result in the same error message (tested with pg 9.1.5):

ERROR: could not find array type for data type text[]

But there is a way around this, as we worked out under this closely related question.
Create a custom aggregate function:

CREATE AGGREGATE array_agg_mult (anyarray) (
    SFUNC    = array_cat
   ,STYPE    = anyarray
   ,INITCOND = '{}'
);

And use it like this:

SELECT array_agg_mult(ARRAY[ARRAY[a,b]]) AS ab
FROM  (
   SELECT unnest('{a,b,c}'::text[]) AS a
         ,unnest('{d,e,f}'::text[]) AS b
    ) x

Result:

{{a,d},{b,e},{c,f}}

Note the additional ARRAY[] layer! Without it and just:

SELECT array_agg_mult(ARRAY[a,b]) AS ab
FROM ...

You get:

{a,d,b,e,c,f}

Which may be useful for other purposes.

Roll another function:

CREATE OR REPLACE FUNCTION zip2(anyarray, anyarray)
  RETURNS SETOF anyarray LANGUAGE SQL AS
$func$
SELECT array_agg_mult(ARRAY[ARRAY[a,b]])
FROM (SELECT unnest($1) AS a, unnest($2) AS b) x;
$func$;

Call:

SELECT zip2('{a,b,c}'::text[],'{d,e,f}'::text[]); -- or any other array type

Result:

{{a,d},{b,e},{c,f}}

Postgres 9.4+

Use the ROWS FROM construct or the updated unnest() which takes multiple arrays to unnest in parallel. Each can have a different length. You get (per documentation):

[...] the number of result rows in this case is that of the largest function result, with smaller results padded with null values to match.

Use this cleaner and simpler variant:

SELECT ARRAY[a,b] AS ab
FROM   unnest('{a,b,c}'::text[] 
            , '{d,e,f}'::text[]) x(a,b);
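
For the ROWS FROM spelling and the null padding quoted above, a small sketch may help. The second array is deliberately one element shorter; the literals and the alias x(a, b) are only for illustration:

```sql
-- ROWS FROM pairs up the n-th row of each function call (Postgres 9.4+).
SELECT a, b
FROM   ROWS FROM (
          unnest('{a,b,c}'::text[])
        , unnest('{d,e}'::text[])
       ) AS x(a, b);
-- Three rows; in the last row b is NULL because the shorter result is padded.
```

The multi-argument unnest() shown before is shorthand for exactly this construct.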

Postgres 9.5+

ships array_agg(array expression):

Function                Argument Type(s)   Return Type
array_agg(expression)   any array type     same as argument data type  

Description
input arrays concatenated into array of one higher dimension
(inputs must all have same dimensionality, and cannot be empty or NULL)

This is a drop-in replacement for my custom aggregate function array_agg_mult(). It is implemented in C, which is considerably faster. Use it.
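
Putting the 9.4 and 9.5 features together, the original question can be answered with built-in functions only. This is a sketch, not part of the original answer: the column name ord and the ORDER BY are additions that make the element order inside the aggregate explicit.

```sql
-- Postgres 9.5+: zip two arrays into one 2-dimensional array.
-- WITH ORDINALITY appends a row counter as the last column (named ord here).
SELECT array_agg(ARRAY[a, b] ORDER BY ord) AS ab
FROM   unnest('{a,b,c}'::text[]
            , '{d,e,f}'::text[]) WITH ORDINALITY AS x(a, b, ord);
-- Result: {{a,d},{b,e},{c,f}}
```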

#2


6  

Here's another approach that's safe for arrays of differing lengths, using the array multi-aggregation mentioned by Erwin:

CREATE OR REPLACE FUNCTION zip(array1 anyarray, array2 anyarray) RETURNS text[]
AS $$
SELECT array_agg_mult(ARRAY[ARRAY[array1[i],array2[i]]])
FROM generate_subscripts(
  CASE WHEN array_length(array1,1) >= array_length(array2,1) THEN array1 ELSE array2 END,
  1
) AS subscripts(i)
$$ LANGUAGE sql;

regress=> SELECT zip('{a,b,c}'::text[],'{d,e,f}'::text[]);
         zip         
---------------------
 {{a,d},{b,e},{c,f}}
(1 row)


regress=> SELECT zip('{a,b,c}'::text[],'{d,e,f,g}'::text[]);
             zip              
------------------------------
 {{a,d},{b,e},{c,f},{NULL,g}}
(1 row)

regress=> SELECT zip('{a,b,c,z}'::text[],'{d,e,f}'::text[]);
             zip              
------------------------------
 {{a,d},{b,e},{c,f},{z,NULL}}
(1 row)

If you want to chop off the excess rather than null-padding, just change the >= length test to <= instead.
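
As a sketch of that change, here is the same function with the comparison flipped. The name zip_truncate is made up for this example, and it still relies on the custom aggregate array_agg_mult from answer #1:

```sql
-- Variant that drives the subscripts from the shorter array,
-- so surplus elements of the longer array are dropped instead of null-padded.
CREATE OR REPLACE FUNCTION zip_truncate(array1 anyarray, array2 anyarray) RETURNS text[]
AS $$
SELECT array_agg_mult(ARRAY[ARRAY[array1[i],array2[i]]])
FROM generate_subscripts(
  CASE WHEN array_length(array1,1) <= array_length(array2,1) THEN array1 ELSE array2 END,
  1
) AS subscripts(i)
$$ LANGUAGE sql;

-- Should return {{a,d},{b,e},{c,f}}; the surplus g is dropped.
SELECT zip_truncate('{a,b,c}'::text[],'{d,e,f,g}'::text[]);
```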

This function does not handle the rather bizarre PostgreSQL feature that arrays may have a starting index other than 1, but in practice nobody actually uses that feature. E.g. with a zero-indexed 3-element array:

regress=> SELECT zip('{a,b,c}'::text[], array_fill('z'::text, ARRAY[3], ARRAY[0]));
          zip           
------------------------
 {{a,z},{b,z},{c,NULL}}
(1 row)

whereas Erwin's code does work with such arrays, and even with multi-dimensional arrays (by flattening them), but does not work with arrays of differing length.

Arrays are a bit special in PostgreSQL: they're a little too flexible, with multi-dimensional arrays, configurable origin index, etc.
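
To illustrate the configurable origin index, here is a small sketch using an array literal with explicit subscript bounds. The column aliases are only for illustration:

```sql
-- Three elements indexed 0..2 instead of the default 1..3.
SELECT arr
     , array_lower(arr, 1) AS lower_bound   -- 0
     , array_upper(arr, 1) AS upper_bound   -- 2
     , arr[0]              AS first_element -- z
     , arr[3]              AS out_of_range  -- NULL, not an error
FROM  (SELECT '[0:2]={z,z,z}'::text[] AS arr) s;
```

Code that assumes subscripts start at 1 reads NULL past the upper bound, which is what happens to the last pair in the zero-indexed example above.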

In 9.4 you'll be able to write:

SELECT array_agg_mult(ARRAY[ARRAY[a,b]])
FROM unnest(array1) WITH ORDINALITY AS x(a,o)
NATURAL FULL OUTER JOIN
unnest(array2) WITH ORDINALITY AS y(b,o);

which will be a lot nicer, especially if an optimisation to scan the functions together rather than doing a sort and join goes in.
