Hive的数据导入——Load语句

时间:2024-05-19 12:46:21
一 语法
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE]
INTO TABLE tablename [PARTITION (partcol1=val1,partcol2=val2 ...)]
二 表
hive> desc t2;
OK
tid int
tname string
age int
Time taken: 2.176 seconds, Fetched: 3 row(s)
hive> desc t3;
OK
tid int
tname string
age int
Time taken: 0.525 seconds, Fetched: 3 row(s)

三 要导入的数据
[[email protected] ~]# more student01.txt
1,Tom,23
2,Mary,20
[[email protected] ~]# more student02.txt
3,Mike,25
[[email protected] ~]# more student03.txt
4,Scott,21
5,King,20

四 将student01.txt导入t2
hive> load data local inpath '/root/student01.txt' into table t2;
Copying data from file:/root/student01.txt
Copying file: file:/root/student01.txt
Loading data to table default.t2
Table default.t2 stats: [numFiles=0, numRows=0, totalSize=0, rawDataSize=0]
OK
Time taken: 2.765 seconds
hive> select * from t2;
OK
NULL NULL NULL
NULL NULL NULL
Time taken: 1.042 seconds, Fetched: 2 row(s)

四 将student01.txt导入t3
hive> load data local inpath '/root/student01.txt' into table t3;
Copying data from file:/root/student01.txt
Copying file: file:/root/student01.txt
Loading data to table default.t3
Table default.t3 stats: [numFiles=1, numRows=0, totalSize=19, rawDataSize=0]
OK
Time taken: 2.776 seconds
hive> select * from t3;
OK
1 Tom 23
2 Mary 20
Time taken: 0.33 seconds, Fetched: 2 row(s)

五 将/root/data下的所有数据文件导入到t3表,并覆盖原来的数据。
hive> load data local inpath '/root/data/' overwrite into table t3;
Copying data from file:/root/data
Copying file: file:/root/data/student01.txt
Copying file: file:/root/data/student02.txt
Copying file: file:/root/data/student03.txt
Loading data to table default.t3
Deleted hdfs://localhost:9000/user/hive/warehouse/t3
Table default.t3 stats: [numFiles=3, numRows=0, totalSize=50, rawDataSize=0]
OK
Time taken: 1.246 seconds
hive> select * from t3;
OK
1 Tom 23
2 Mary 20
3 Mike 25
4 Scott 21
5 King 20
Time taken: 0.343 seconds, Fetched: 5 row(s)

六 将HDFS中 、/user/root/input/student01.txt 导入到t3.
hive> load data inpath '/user/root/input/student01.txt' overwrite into table t3;
Loading data to table default.t3
Deleted hdfs://localhost:9000/user/hive/warehouse/t3
Table default.t3 stats: [numFiles=1, numRows=0, totalSize=19, rawDataSize=0]
OK
Time taken: 1.614 seconds
hive> select * from t3;
OK
1 Tom 23
2 Mary 20
Time taken: 0.292 seconds, Fetched: 2 row(s)

七 将数据导入到分区表
1、准备数据
[[email protected] data]# more data1.txt
1,Tom,M
3,Mike,M
[[email protected] data]# more data2.txt
2,Mary,F
2、准备分区表
hive> desc patition_table;
OK
sid int
sname string
gender string
# Partition Information
# col_name data_type comment
gender string
Time taken: 0.328 seconds, Fetched: 8 row(s)
3、导入数据到分区表
hive> load data local inpath '/root/data/data1.txt' into table patition_table partition (gender='M');
Copying data from file:/root/data/data1.txt
Copying file: file:/root/data/data1.txt
Loading data to table default.patition_table partition (gender=M)
Partition default.patition_table{gender=M} stats: [numFiles=1, numRows=0, totalSize=17, rawDataSize=0]
OK
Time taken: 2.772 seconds
hive> load data local inpath '/root/data/data2.txt' into table patition_table partition (gender='F');
Copying data from file:/root/data/data2.txt
Copying file: file:/root/data/data2.txt
Loading data to table default.patition_table partition (gender=F)
Partition default.patition_table{gender=F} stats: [numFiles=1, numRows=0, totalSize=9, rawDataSize=0]
OK
Time taken: 1.615 seconds
4、效果确认
Hive的数据导入——Load语句