![[HIve - LanguageManual] Union [HIve - LanguageManual] Union](https://image.shishitao.com:8440/aHR0cHM6Ly9ia3FzaW1nLmlrYWZhbi5jb20vdXBsb2FkL2NoYXRncHQtcy5wbmc%2FIQ%3D%3D.png?!?w=700&webp=1)
Union Syntax
select_statement UNION ALL select_statement UNION ALL select_statement ... |
UNION is used to combine the result from multiple SELECT statements into a single result set. Hive currently only supports UNION ALL (bag union), in which duplicates are not eliminated. The number and names of columns returned by each select_statement have to be the same. Otherwise, a schema error is thrown.
If some additional processing has to be done on the result of the UNION, the entire statement expression can be embedded in a FROM clause like below:
SELECT * FROM ( select_statement
UNION ALL
select_statement
) unionResult |
For example, if we suppose there are two different tables that track which user has published a video and which user has published a comment, the following query joins the results of a UNION ALL with the user table to create a single annotated stream for all the video publishing and comment publishing events:
SELECT u.id, actions.date FROM ( SELECT av.uid AS uid
FROM action_video av
WHERE av.date = '2008-06-03'
UNION ALL
SELECT ac.uid AS uid
FROM action_comment ac
WHERE ac.date = '2008-06-03'
) actions JOIN users u ON (u.id = actions.uid)
|
Unions can be used in views, inserts, and CTAS (create table as select) statements. A query can contain multiple UNION ALL clauses, as shown in the syntax above.
Version information
Icon
In Hive 0.12.0 and earlier releases, unions can only be used within a subquery such as "SELECT * FROM (select_statement UNION ALLselect_statement UNION ALL ...) unionResult".
As of Hive 0.13.0, unions can also be used in a top-level query: "select_statement UNION ALL select_statement UNION ALL ...". (See HIVE-6189.)