How to access an array's internal index with PostgreSQL?

Time: 2022-02-13 01:48:50

This is my non-optimized solution (perhaps a familiar one to you):

Workaround for the PG problem, with a non-optimized internal function:

CREATE FUNCTION unnest_with_idx(anyarray)
RETURNS TABLE(idx integer, val anyelement) AS
$$ 
   SELECT generate_series(1,array_upper($1,1)) as idx, unnest($1) as val;
$$ LANGUAGE SQL IMMUTABLE;

Test:

SELECT idx,val from unnest_with_idx(array[1,20,3,5]) as t;

But, as I said, it is non-optimized. I can't believe (!!) that PostgreSQL doesn't have an internal index for arrays. If it does, the question is how to access this index directly: where is the GIN-like internal counter?

NOTE1: the solution above and the question are not the same as "How do you create an index by each element of an array?". They are also not the same as "Can PostgreSQL index array columns?", because the function is for an isolated array, not for a table index on array fields.


NOTE2 (edited after answers): "array indexes" (the more popular term), "array subscripts" and "array counter" are terms we can use, semantically, to refer to the "internal counter", the accumulator that advances to the next array item. I see that no PostgreSQL command offers direct access to this counter. Like the generate_series() function, the generate_subscripts() function is a sequence generator, and its performance is better but nearly the same. On the other hand, the row_number() function offers direct access to an "internal counter of rows", but it is about rows, not arrays, and unfortunately its performance is worse.

2 solutions

#1


6  

PostgreSQL does provide dedicated functions to generate array subscripts:

WITH   x(a) AS ( VALUES ('{1,20,3,5}'::int[]) )
SELECT generate_subscripts(a, 1) AS idx
      ,unnest(a) AS val
FROM   x;

Effectively, it does almost the same as @Frank's query, just without a subquery.
Plus, it works with subscripts that do not start with 1.

Either solution works for 1-dimensional arrays only! (Both can easily be expanded to multiple dimensions.)
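
The expansion to more dimensions is not spelled out above. A minimal sketch for a two-dimensional array, assuming Postgres 9.3+ for LATERAL (the sample array and the aliases x, i and j are illustrative, not taken from the answer):

```sql
-- One generate_subscripts() call per dimension; each call is a
-- set-returning function in the FROM list, so the result is the
-- cross product of both subscript ranges.
SELECT i, j, a[i][j] AS val
FROM  (SELECT '{{1,2,3},{4,5,6}}'::int[] AS a) x
CROSS  JOIN LATERAL generate_subscripts(a, 1) AS i  -- first dimension
CROSS  JOIN LATERAL generate_subscripts(a, 2) AS j  -- second dimension
ORDER  BY i, j;
-- returns 6 rows: (1,1,1), (1,2,2), (1,3,3), (2,1,4), (2,2,5), (2,3,6)
```

Unlike unnest(), which flattens all dimensions into a single column, this keeps one subscript column per dimension.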

Function:

CREATE OR REPLACE FUNCTION unnest_with_idx(anyarray) 
RETURNS TABLE(idx integer, val anyelement) LANGUAGE SQL IMMUTABLE AS
$func$
  SELECT generate_subscripts($1, 1), unnest($1);
$func$;

Call:

SELECT * FROM unnest_with_idx('{1,20,3,5}'::int[]);

Also consider:

SELECT * FROM unnest_with_idx('[4:7]={1,20,3,5}'::int[]);

More about array subscripts in this related question.

If you actually want normalized subscripts (starting with 1), I'd use:

SELECT generate_series(1, array_length($1,1)) ...

That's almost the query you had already, just with array_length() instead of array_upper(), which would fail with non-standard subscripts.
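
To make the difference concrete, here is a small sketch using the same '[4:7]' array as above (the aliases x and a are illustrative). For this array the upper bound is 7 while the length is 4, so a series built from array_upper() no longer matches the number of elements:

```sql
-- Bounds versus length for an array with non-standard subscripts.
SELECT array_lower(a, 1)  AS lo   -- 4
     , array_upper(a, 1)  AS hi   -- 7
     , array_length(a, 1) AS len  -- 4
FROM  (SELECT '[4:7]={1,20,3,5}'::int[] AS a) x;

-- Normalized subscripts: always 1 .. length, whatever the real bounds are.
SELECT generate_series(1, array_length(a, 1)) AS idx
     , unnest(a) AS val
FROM  (SELECT '[4:7]={1,20,3,5}'::int[] AS a) x;
-- returns (1,1), (2,20), (3,3), (4,5)
```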

Performance

I ran a quick test on an array of 1000 int with all queries presented here so far. They all perform about the same (~ 3.5 ms), except for row_number() on a subquery (~ 7.5 ms), as expected, because of the subquery.
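
The exact test setup is not given above. One way such a comparison could be reproduced, assuming a 1000-element array built with generate_series() and timing taken from EXPLAIN ANALYZE (absolute numbers will differ by machine and Postgres version):

```sql
-- Variant 1: generate_subscripts() next to unnest(), no extra subquery level.
EXPLAIN ANALYZE
SELECT generate_subscripts(a, 1) AS idx, unnest(a) AS val
FROM  (SELECT ARRAY(SELECT generate_series(1, 1000)) AS a) x;

-- Variant 2: row_number() over a subquery that unnests the same array.
EXPLAIN ANALYZE
SELECT row_number() OVER () AS idx, val
FROM  (SELECT unnest(ARRAY(SELECT generate_series(1, 1000)))) t(val);
```

Compare the reported execution times over several runs rather than a single one, since the first run includes caching effects.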

Update: Postgres 9.4+

Unless you operate with non-standard subscripts, use the new WITH ORDINALITY instead:
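
The example for this update appears to be missing here. A minimal sketch of the syntax, with t, val and idx as illustrative aliases:

```sql
-- WITH ORDINALITY appends a bigint column that numbers the rows produced
-- by the set-returning function, starting at 1.
SELECT t.idx, t.val
FROM   unnest('{1,20,3,5}'::int[]) WITH ORDINALITY AS t(val, idx);
-- returns (1,1), (2,20), (3,3), (4,5)
```

The ordinality column counts output rows, so it always starts at 1 and ignores the array's actual lower bound. That is why generate_subscripts() remains the choice for arrays with non-standard subscripts.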

#2


1  

row_number() works:

SELECT 
    row_number() over(), 
    value
FROM (SELECT unnest(array[1,20,3,5])) a(value);

Then, the optimized function will be:

CREATE OR REPLACE FUNCTION unnest_with_idx(anyarray) 
RETURNS table(idx integer, val anyelement) AS $$ 
  SELECT (row_number() over())::integer as idx, val
  FROM (SELECT unnest($1)) a(val);
$$ LANGUAGE SQL IMMUTABLE;
