I have a Presto table with an array column that contains, for example:
- id1,[1,2,3,4]
- id2,[3,4,5,6]
- id3,[3,4,7,8]
- id4,[5,4,3,6]
I need to find which rows contain the array [3,4,5] with its elements in that order. For instance, the result should return only id2, not id4.
I can use array_intersect in combination with cardinality to find id2 and id4, but I don't know how to verify that the elements in id2 or id4 appear in the correct order.
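For reference, the array_intersect approach described above might look like the following sketch. The inline VALUES table and the column names id and x are stand-ins for the real table, not names taken from it.

```sql
-- Sketch: rows whose array contains all of 3, 4 and 5, in any order.
-- t(id, x) is a hypothetical inline table standing in for the real one.
SELECT id
FROM (VALUES
    ('id1', ARRAY[1,2,3,4]),
    ('id2', ARRAY[3,4,5,6]),
    ('id3', ARRAY[3,4,7,8]),
    ('id4', ARRAY[5,4,3,6])
) AS t(id, x)
-- array_intersect keeps only the shared elements, so a full match
-- has as many elements as the array being searched for.
WHERE cardinality(array_intersect(x, ARRAY[3,4,5])) = cardinality(ARRAY[3,4,5]);
```

This matches both id2 and id4, because the intersection ignores element order.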
The only (ugly) solution I can think of is to convert both arrays to strings and then do a string LIKE comparison.
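A minimal sketch of that string-based fallback, assuming the same hypothetical inline table t(id, x) and a comma as the delimiter:

```sql
-- Sketch: join each array into a delimited string and search it with LIKE.
-- Wrapping both sides in commas keeps an element such as 13 from
-- matching the leading 3 of the pattern.
SELECT id
FROM (VALUES
    ('id1', ARRAY[1,2,3,4]),
    ('id2', ARRAY[3,4,5,6]),
    ('id3', ARRAY[3,4,7,8]),
    ('id4', ARRAY[5,4,3,6])
) AS t(id, x)
WHERE ',' || array_join(x, ',') || ',' LIKE '%,3,4,5,%';
```

Note that this only matches rows where 3, 4 and 5 are adjacent, which is stricter than merely appearing in order.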
Any better ideas?
Following the suggestion below and using AWS Athena:
WITH dataset AS (
    (values array[1,2,3,4],
            array[3,4,5,6],
            array[3,4,7,8],
            array[5,4,3,6])
)
SELECT ngrams FROM dataset t(ngrams) where reduce(
    transform(array[3,4,5], a -> array_position(ngrams, a)),
    0,
    (s, n) -> if( s < 0, -1, if ( n > s, n, -1)),
    s -> s >= 0) ;
The error I get is:
SYNTAX_ERROR: line 7:44: Unexpected parameters (array(bigint), integer, com.facebook.presto.sql.analyzer.TypeSignatureProvider@1d8b3792, com.facebook.presto.sql.analyzer.TypeSignatureProvider@563900c2) for function reduce. Expected: reduce(array(T), S, function(S,T,S), function(S,R)) T, S, R
1 solution
#1
Here comes the magic for you:
select x
from (values
    array[1,2,3,4],
    array[3,4,5,6],
    array[3,4,7,8],
    array[5,4,3,6]) t(x)
where reduce(
    transform(array[3,4,5], a -> array_position(x, a)),
    0,
    (s, n) -> if( s < 0, -1, if ( n > s, n, -1)),
    s -> s >= 0)
The query above looks up the position of each element of the queried array in x and returns true if those positions are strictly increasing. There are still corner cases to solve (such as duplicates, or gaps between the matched elements), but I hope this is something you can start working with.
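If gaps should not count as a match (that is, 3, 4 and 5 must be adjacent), one possible sketch uses the ngrams and contains array functions. This assumes your engine version provides ngrams, which older Presto or Athena versions may not. The extra row ARRAY[3,9,4,5] is a hypothetical test case added here to show the difference.

```sql
-- Sketch: require the searched array to appear as a contiguous run.
-- ngrams(x, 3) lists every run of 3 adjacent elements of x;
-- contains then checks whether one of those runs equals [3,4,5].
SELECT x
FROM (VALUES
    ARRAY[1,2,3,4],
    ARRAY[3,4,5,6],
    ARRAY[3,4,7,8],
    ARRAY[5,4,3,6],
    ARRAY[3,9,4,5]   -- hypothetical row: in order, but with a gap
) AS t(x)
WHERE contains(ngrams(x, 3), ARRAY[3,4,5]);
```

Because it compares whole runs rather than first positions, this variant is also not affected by duplicate elements earlier in the array.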
See https://prestodb.io/docs/current/functions/array.html for more details
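If the query fails on Athena with the "Unexpected parameters ... for function reduce" error, one plausible cause is a type mismatch: array_position returns bigint, while the literal 0 is an integer, so the state type and the lambda's return type disagree. This is an assumption about how the engine version infers types, not something confirmed in this thread. A sketch that makes the state type explicit:

```sql
-- Sketch: same logic as the query above, with the reduce state cast to
-- bigint so it matches the bigint positions returned by array_position.
SELECT x
FROM (VALUES
    ARRAY[1,2,3,4],
    ARRAY[3,4,5,6],
    ARRAY[3,4,7,8],
    ARRAY[5,4,3,6]
) AS t(x)
WHERE reduce(
    transform(ARRAY[3,4,5], a -> array_position(x, a)),
    CAST(0 AS bigint),                      -- initial state: no position seen yet
    (s, n) -> IF(s < 0, CAST(-1 AS bigint), -- already failed: stay failed
              IF(n > s, n,                  -- position increased: keep going
                 CAST(-1 AS bigint))),      -- missing (0) or out of order: fail
    s -> s >= 0
);
```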