Hive Data Import / Inserting Data (Key Topic)

Date: 2025-04-05 09:32:58

Loading Data into a Table (load)

Hands-on example:
(1) Create a table:

create table student(id string, name string) 
row format delimited fields terminated by '\t';

(2) Load a local file into Hive. When the file comes from the local Linux filesystem, add the local keyword.

load data local inpath '/opt/module/datas/' 
into table ;
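
As a minimal sketch of a complete local load, assuming a hypothetical table student_demo and a hypothetical tab-separated file /tmp/student_demo.txt (neither name comes from the example above):

```sql
-- Hypothetical table and file names, used only for illustration.
create table if not exists student_demo(id string, name string)
row format delimited fields terminated by '\t';

-- local: the path refers to the filesystem of the machine running the Hive client.
-- The file is copied into the table's directory; the local original stays in place.
load data local inpath '/tmp/student_demo.txt'
into table student_demo;

-- Check the loaded rows.
select * from student_demo;
```

If a column shows up as NULL, the usual cause is that the file's separator does not match the '\t' declared in the table definition.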

(3) Load a file from HDFS into Hive
First upload the data to HDFS:

hive (default)> dfs -put /opt/module/datas/ /user/zhang/hive;

Then load the data from HDFS into Hive:

load data inpath '/user/zhang/hive/' into table ;
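
One detail worth checking here: without the local keyword, load moves the file inside HDFS rather than copying it. A rough sketch of how to observe this, assuming a hypothetical table student_demo and a hypothetical file student_demo.txt (only the directory /user/zhang/hive comes from the steps above):

```sql
-- Hypothetical table and file names, used only for illustration.
create table if not exists student_demo(id string, name string)
row format delimited fields terminated by '\t';

dfs -put /tmp/student_demo.txt /user/zhang/hive;
-- The uploaded file is listed in the upload directory.
dfs -ls /user/zhang/hive;

load data inpath '/user/zhang/hive/student_demo.txt' into table student_demo;

-- After the load, the file is no longer in the upload directory:
-- it has been moved into the table's directory.
dfs -ls /user/zhang/hive;
```

This is why step (4) uploads the file again before running another load from HDFS.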

(4) Load data and overwrite the data already in the table:

Upload the file to HDFS:

dfs -put /opt/module/datas/ /user/zhang/hive;

Load the data from HDFS into Hive:

load data inpath '/user/zhang/hive/' overwrite into table ;
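
The difference between appending and overwriting can be checked with a row count. The sketch below assumes a hypothetical table student_demo and a hypothetical local file /tmp/student_demo.txt, and uses local loads so that the same file can be loaded repeatedly:

```sql
-- Hypothetical table and file names, used only for illustration.
create table if not exists student_demo(id string, name string)
row format delimited fields terminated by '\t';

-- Without overwrite, every load appends a new file to the table.
load data local inpath '/tmp/student_demo.txt' into table student_demo;
load data local inpath '/tmp/student_demo.txt' into table student_demo;
select count(*) from student_demo;

-- With overwrite, the existing files are removed before the new one is added.
load data local inpath '/tmp/student_demo.txt' overwrite into table student_demo;
select count(*) from student_demo;
```

If the table was empty at the start, the first count should be twice the number of rows in the file and the second count should equal it.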

Inserting Data into a Table (insert)

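(1) Create a partitioned table (the name student is reused from the load example, so drop that table first if it still exists):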

create table student(id int, name string) 
partitioned by (month string) 
row format delimited fields terminated by '\t';

(2) Basic insert:

insert into table student partition(month='201709') values(1,'wangwu');
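
To confirm that the insert created the partition, something like the following can be run. The dfs path is an assumption: it presumes the default warehouse directory /user/hive/warehouse and the default database.

```sql
-- List the partitions of the table; month=201709 should appear.
show partitions student;

-- The partition column can be selected like an ordinary column.
select id, name, month from student where month='201709';

-- Assumed default warehouse path: each partition is a subdirectory named month=<value>.
dfs -ls /user/hive/warehouse/student;
```

Note that insert ... values is only available from Hive 0.14 onward.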

(3) Basic-mode insert (from the query result of a single table)

 insert overwrite table student 
 partition(month='201708')
 select id, name from student where month='201709';
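
For comparison, a short sketch of how insert into and insert overwrite differ on the same partition, reusing the student table and the partitions from the example above:

```sql
-- insert into appends the query result to whatever month='201708' already holds.
insert into table student partition(month='201708')
select id, name from student where month='201709';

-- insert overwrite first clears month='201708', then writes the query result.
insert overwrite table student partition(month='201708')
select id, name from student where month='201709';

-- After the overwrite, the count matches the source partition's row count.
select count(*) from student where month='201708';
```

Only the named partition is replaced; the other partitions of student are left untouched.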

(4) Multi-insert mode (several inserts driven by a single scan of the source table)

from student
insert overwrite table student partition(month='201707')
select id, name where month='201709'
insert overwrite table student partition(month='201706')
select id, name where month='201709';
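
For contrast, the same result written as two independent statements is sketched below. Each statement reads student on its own, whereas the from-first form above reads it once and feeds both inserts.

```sql
-- Two separate statements: the source table is scanned once per statement.
insert overwrite table student partition(month='201707')
select id, name from student where month='201709';

insert overwrite table student partition(month='201706')
select id, name from student where month='201709';

-- Check the row count per partition after either variant.
select month, count(*) as cnt from student group by month;
```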

Creating a Table and Loading Data from a Query (as select)

create table if not exists student3
as select id, name from student;
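
A table created this way takes its column names and types from the query, but not the storage settings of the source table. With the default text format, the new table uses Hive's default field delimiter rather than '\t'. If a tab-delimited result is wanted, a sketch along these lines should work (student3_tab is a hypothetical name):

```sql
-- Hypothetical table name; the row format clause goes before "as select".
create table if not exists student3_tab
row format delimited fields terminated by '\t'
as select id, name from student;

-- Inspect the columns, storage format and location of the new table.
desc formatted student3_tab;
```

Because the query selects only id and name, the new table has no month column and is not partitioned.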

Specifying the Data Path with Location When Creating a Table

Create the table and specify its location on HDFS:

create table if not exists student5(
id int, name string
)
row format delimited fields terminated by '\t'
location '/user/hive/warehouse/student5';

Upload the data to HDFS:

dfs -put /opt/module/datas/ /user/hive/warehouse/student5;

Query the data:

select * from student5;
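
No load statement is needed in this approach: for a non-partitioned table, Hive reads whatever files sit in the table's directory at query time. A small sketch for checking that the table and the uploaded file really point at the same place:

```sql
-- The Location field in the output should match the path given in create table.
desc formatted student5;

-- The uploaded file should be listed directly under the table's directory.
dfs -ls /user/hive/warehouse/student5;
```

If the query returns NULL for every column, the file's separator most likely differs from the '\t' declared for the table.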