hive多表关联

时间:2025-04-16 08:18:13
多表join使用说明


select * from (select userId from table_a where dt=20160731) a join (select userId from table_b where dt=20160731) b  on = join (select userId from table_c where dt=20160731) c on =
等价于
select * from (select userId from table_a where dt=20160731) a join (select userId from table_b where dt=20160731) b  on = join (select userId from table_c where dt=20160731) c on =
等价于
select * from (select userId from table_a where dt=20160731) a join (select userId from table_b where dt=20160731) b  on = join (select userId from table_c where dt=20160731) c on = and =




总结:
可以把a与b表join关联看成一个新的表table_j,table_j表有两列,与c表进行关联






如果是left outer join 效果相同,只不过是将对最后一个on链接条件来说,是与table_j的第一列相连还是与第二列相连而已


等价于table_j的第二列userId()与c进行关联,不去管b表是否有能关联上

select * from 
(select userId from table_a where dt=20160731) a 
left outer join (select userId from table_b where dt=20160731) b  on = 
left outer join (select userId from table_c where dt=20160731) c on =




等价于table_j的第二列userId()与c进行关联,不去管a表是否有能关联上
select * from 
(select userId from table_a where dt=20160731) a 
left outer join (select userId from table_b where dt=20160731) b  on = 
left outer join (select userId from table_c where dt=20160731) c on =




等价于table_j的第一列与第二列userId()与c进行关联,要求同时能关联上
select * from 
(select userId from table_a where dt=20160731) a 
left outer join (select userId from table_b where dt=20160731) b  on =  
left outer join (select userId from table_c where dt=20160731) c on = and =