I'm converting Teradata queries to Hive (version 0.10.0).
Teradata query:
QUALIFY ROW_NUMBER() OVER (PARTITION BY ADJSTMNT,SRC_CMN , TYPE_CMD,IOD_TYPE_CD,ROE_PST ,ORDR_SYC,SOR_CD,PROS_ED ORDER BY ADJSTMNT )=1
I searched and found a UDF for Row_Sequence in Hive. I also replaced OVER (PARTITION BY ...) with DISTRIBUTE BY and SORT BY. But I am stuck on QUALIFY.
Any ideas on how to convert the above to Hive would be much appreciated and would help us a lot.
1 solution
#1
5
A QUALIFY on an analytic function (ROW_NUMBER(), SUM(), COUNT(), ... OVER (PARTITION BY ...)) is just a WHERE applied to a subquery that computes the analytic value.
For example:
select A,B,C
from X
QUALIFY ROW_NUMBER() over (...) = 1
is equivalent to:
select A,B,C
from (
select A,B,C, ROW_NUMBER() over (...) as RNUM
from X
) t
where RNUM = 1
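Applied to the query in the question, the same rewrite might look like the sketch below. The table name `src_table` and the alias `rnum` are placeholders, since the question does not show the FROM clause or the select list; substitute the real ones. It also assumes a Hive version that supports windowing functions, which 0.10.0 does not.

```sql
-- Sketch only: src_table and rnum are hypothetical names, not from the question.
-- Requires a Hive release with windowing functions (not 0.10.0).
SELECT t.*                      -- replace with the original select list; t.* also returns rnum
FROM (
  SELECT s.*,
         ROW_NUMBER() OVER (
           PARTITION BY ADJSTMNT, SRC_CMN, TYPE_CMD, IOD_TYPE_CD,
                        ROE_PST, ORDR_SYC, SOR_CD, PROS_ED
           ORDER BY ADJSTMNT
         ) AS rnum
  FROM src_table s
) t
WHERE t.rnum = 1;               -- plays the role of QUALIFY ... = 1
```

Because ADJSTMNT is also one of the partition columns, every row in a partition has the same ORDER BY value, so which row gets number 1 is arbitrary. If a specific row should be kept, order by a column that distinguishes rows within the partition.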
NB: analytic functions are available in Hive 0.12 (windowing support was added in 0.11), so this rewrite will not run on 0.10.0 without upgrading.