[Hive 基础]-- 使用 Map 和 Array 数据结构

Hive支持的数据类型分为基础数据类型和复杂类型，这是关系性数据库不具备的特性。

基础类型主要包括：tinyint，smalint，int，bigint，boolean，float，double，string，timestamp 等
复杂类型主要包括：array，map，struct等

参考：https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types

1、map

1.1 可以存储如下数据

"geo": {
"country": "中国",
"subdivision": "四川",
"city": "成都",
"latitude": "31.261",
"longitude": "127.22562",
"isp": "电信"
}

说明：geo 代表hive 表字段名称，冒号后的是对应的值。

1.2 创建 hive 表

create table fdp.map_test(

id string,

geo map<string,string>

);

1.3 准备数据文件 map_test

1 country:中国,subdivision:四川,city:德阳,latitude:31.86141,longitude:117.27562,isp:移动
2 country:中国,subdivision:四川,city:绵阳,latitude:31.86141,longitude:117.27562,isp:电信
3 country:中国,subdivision:安徽,city:淮南,latitude:31.86141,longitude:117.27562,isp:电信

1.4 load 到hive 表中，然后查询

load data local inpath "/data/map_test" into table fdp.map_test;

select * from fdp.map_test where  geo["country"] ='中国' ;

2、array

2.1 可以存储如下数据

"business": ["it","os","abc"]

说明：business 代表hive 表里的字段名称，冒号后的是对应的值。

2.2 创建hive 表

create table fdp.array_test(

id string,

business array<string>

);

2.3 准备数据文件 array_test

1  iqwt,st,ows,dds

2 idw,iss,dcf

3 idw,swd,dv

2.4 load 到 hive 表中，然后查询

注意： sql 中需要使用视图，spark sql 和 hive sql 都支持视图。

#load data local inpath "/data/array_test" into table fdp.array_test;

#select * from fdp.array_test  LATERAL VIEW explode(business) rTable AS realid  where realid='it' limit 2;

秒客网

[Hive 基础]-- 使用 Map 和 Array 数据结构

Hive支持的数据类型分为基础数据类型和复杂类型，这是关系性数据库不具备的特性。

1、map

1.1 可以存储如下数据

1.2 创建 hive 表

1.3 准备数据文件 map_test

1.4 load 到hive 表中，然后查询

2、array

2.1 可以存储如下数据

2.2 创建hive 表

2.3 准备数据文件 array_test

2.4 load 到 hive 表中，然后查询

相关文章