I have been trying fruitlessly for several hours to write a function that filters array subscripts based on a criterion applied to the array the subscripts come from, and then returns an array of those subscripts.
The data structure I am dealing with is similar to the following sample (except with many more columns to compare and more complicated rules and mixed data types):
id | hierarchy | abbreviation1 | abbreviation2
---+-----------+---------------+--------------
1 | {1} | SB | GL
2 | {2,1} | NULL | NULL
3 | {3,2,1} | NULL | TC
4 | {4,2,1} | NULL | NULL
I need to run a query that, for each record, finds the nearest non-null value of abbreviation1 and of abbreviation2 along the record's hierarchy and compares them by hierarchical distance from the current record, in order to get a single value for the abbreviation. So, for example, if the first non-null values of abbreviation1 and abbreviation2 are both on the same record level, abbreviation1 takes priority; on the other hand, if the first non-null abbreviation2 is closer to the current record than the corresponding non-null value for abbreviation1, then abbreviation2 is used.
Thus the described query on the above sample table would yield:
id | abbreviation
---+-------------
1 | SB
2 | SB
3 | TC
4 | SB
To accomplish this task I need to generate a filtered array of array subscripts (after doing an array_agg() on the abbreviation columns) which contains only the subscripts where the value in an abbreviation column is not null.
The following function, by all the logic in my tired mind, should work but does not:
CREATE OR REPLACE FUNCTION filter_array_subscripts(rawarray anyarray,criteria anynonarray,dimension integer, reverse boolean DEFAULT False)
RETURNS integer[] as
$$
DECLARE
    outarray integer[] := ARRAY[]::integer[];
    x integer;
BEGIN
    for i in array_lower(rawarray,dimension)..array_upper(rawarray,dimension) LOOP
        IF NOT criteria IS NULL THEN
            IF NOT rawarray[i] IS NULL THEN
                IF NOT rawarray[i] = criteria THEN
                    IF reverse = False THEN
                        outarray := array_append(outarray,i);
                    ELSE
                        outarray := array_prepend(i,outarray);
                    END IF;
                ELSE
                    IF reverse = False THEN
                        outarray := array_append(outarray,i);
                    ELSE
                        outarray := array_prepend(i,outarray);
                    END IF;
                END IF;
            END IF;
        ELSE
            IF NOT rawarray[i] is NULL THEN
                IF reverse = False THEN
                    outarray := array_append(outarray,i);
                ELSE
                    outarray := array_prepend(i,outarray);
                END IF;
            END IF;
        END IF;
    END LOOP;
    RETURN outarray;
END;
$$ LANGUAGE plpgsql;
For example, the query below returns {5,3,1} when it should return {5,4,2,1}:
select filter_array_subscripts(array['This',NULL,'is',NULL,'insane!']::text[]
,'is',1,True);
I have no idea why this does not work. I have tried using the FOREACH array iteration syntax, but I cannot figure out how to cast the iteration value to the scalar type contained within the anyarray.
What can be done to fix this?
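For reference, one reading of the intended behaviour can be sketched as follows. This is only a sketch under assumptions: it takes the expected output {5,4,2,1} to mean "every subscript whose element differs from criteria, NULL elements included", and it treats a NULL criteria as "keep the non-NULL elements". The function name filter_array_subscripts_sketch is hypothetical, and only one-dimensional arrays are considered. The NULL-safe comparison IS DISTINCT FROM replaces the nested IF blocks.

```sql
-- Hypothetical sketch, not the original function: returns the subscripts of
-- all elements that are distinct from criteria (NULL-safe comparison).
-- Assumes a one-dimensional array.
CREATE OR REPLACE FUNCTION filter_array_subscripts_sketch(
     rawarray  anyarray
   , criteria  anynonarray
   , dimension integer
   , reverse   boolean DEFAULT false)
  RETURNS integer[] AS
$$
BEGIN
   RETURN ARRAY(
      SELECT i
      FROM   generate_subscripts(rawarray, dimension) AS i
      -- true when exactly one side is NULL or the values differ
      WHERE  rawarray[i] IS DISTINCT FROM criteria
      -- descending subscripts when reverse is true, ascending otherwise
      ORDER  BY CASE WHEN reverse THEN -i ELSE i END
   );
END
$$ LANGUAGE plpgsql;

-- Under the assumptions above this yields {5,4,2,1}
SELECT filter_array_subscripts_sketch(
          array['This',NULL,'is',NULL,'insane!']::text[], 'is', 1, true);

-- With a NULL criteria only the non-NULL positions remain: {1,3,5}
SELECT filter_array_subscripts_sketch(
          array['This',NULL,'is',NULL,'insane!']::text[], NULL::text, 1);
```

If the intent is different (for instance, NULL elements should never be returned), the WHERE clause is the only part that would need to change.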
1 Answer
You can largely simplify this whole endeavor with the use of a RECURSIVE CTE, available in PostgreSQL 8.4 or later:
Test table (makes it easier for everyone to provide test data in a form like this):
CREATE TEMP TABLE tbl (
id int
, hierarchy int[]
, abbreviation1 text
, abbreviation2 text
);
INSERT INTO tbl VALUES
(1, '{1}', 'SB', 'GL')
,(2, '{2,1}', NULL, NULL)
,(3, '{3,2,1}', NULL, 'TC')
,(4, '{4,2,1}', NULL, NULL);
Query:
WITH RECURSIVE x AS (
SELECT id
, COALESCE(abbreviation1, abbreviation2) AS abbr
, hierarchy[2] AS parent_id
FROM tbl
UNION ALL
SELECT x.id
, COALESCE(parent.abbreviation1, parent.abbreviation2) AS abbr
, parent.hierarchy[2] AS parent_id
FROM x
JOIN tbl AS parent ON parent.id = x.parent_id
WHERE x.abbr IS NULL -- stop at non-NULL value
)
SELECT id, abbr
FROM x
WHERE abbr IS NOT NULL -- discard intermediary NULLs
ORDER BY id
Returns:
id | abbr
---+-----
1 | SB
2 | SB
3 | TC
4 | SB
This presumes that there is a non-null value on every path; otherwise such rows will be dropped from the result.
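If rows without any non-null abbreviation on their path should still show up, one possible variation is to LEFT JOIN the base table to the CTE result. The sketch below assumes a hypothetical table tbl_demo with the same shape as the test table plus an extra row (id 5) that has no abbreviation anywhere on its path. Since the recursion stops at the first non-null value, there is at most one non-null row per id in x, so the join does not multiply rows.

```sql
-- Hypothetical demo table: same shape as the test table, plus id 5,
-- which has no abbreviation on its path.
CREATE TEMP TABLE tbl_demo (
   id int
 , hierarchy int[]
 , abbreviation1 text
 , abbreviation2 text
);
INSERT INTO tbl_demo VALUES
  (1, '{1}', 'SB', 'GL')
, (2, '{2,1}', NULL, NULL)
, (3, '{3,2,1}', NULL, 'TC')
, (4, '{4,2,1}', NULL, NULL)
, (5, '{5}', NULL, NULL);

WITH RECURSIVE x AS (
   SELECT id
        , COALESCE(abbreviation1, abbreviation2) AS abbr
        , hierarchy[2] AS parent_id
   FROM   tbl_demo
   UNION ALL
   SELECT x.id
        , COALESCE(parent.abbreviation1, parent.abbreviation2) AS abbr
        , parent.hierarchy[2] AS parent_id
   FROM   x
   JOIN   tbl_demo AS parent ON parent.id = x.parent_id
   WHERE  x.abbr IS NULL          -- stop at non-NULL value
)
SELECT t.id, x.abbr
FROM   tbl_demo t
LEFT   JOIN x ON x.id = t.id
             AND x.abbr IS NOT NULL   -- keep only the resolved row, if any
ORDER  BY t.id;
```

With this data, ids 1 to 4 resolve as before and id 5 is returned with a NULL abbr instead of being dropped.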