Getting Started with Sqoop2: Importing Relational Database Data into HDFS

Date: 2021-08-11 16:08:27

Goal: import the TBLS table from the MySQL database named hive into HDFS. Start the Sqoop2 client and point it at the server:

 $SQOOP2_HOME/bin/sqoop.sh client 
sqoop:> set server --host hadoop000 --port 12000 --webapp sqoop
Server is set successfully
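
The server setup above can also be kept in a script file and run non-interactively. The following is a minimal sketch, assuming the Sqoop 1.99.x client accepts a script path as an argument (batch mode); write_server_script and the /tmp path are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: generate a Sqoop2 client script that repeats the server setup.
# write_server_script is a hypothetical helper, not a Sqoop command.

write_server_script() {
  local host="$1" port="$2" webapp="$3" out="$4"
  {
    # Same command as typed at the interactive prompt.
    echo "set server --host ${host} --port ${port} --webapp ${webapp}"
    # Prints client and server versions; a quick reachability check.
    echo "show version --all"
  } > "${out}"
}

# Usage (assumes SQOOP2_HOME is set and the server is running):
#   write_server_script hadoop000 12000 sqoop /tmp/server.sqoop
#   "$SQOOP2_HOME/bin/sqoop.sh" client /tmp/server.sqoop
```

Keeping the host, port and webapp name as parameters makes it easy to reuse the same script generator for another server.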

Create a connection:

sqoop:> create connection --cid 1
Creating connection for connector with id
Please fill following values to create new connection object
Name: TBLS_IMPORT_DEMO
Connection configuration
JDBC Driver Class: com.mysql.jdbc.Driver
JDBC Connection String: jdbc:mysql://hadoop000:3306/hive
Username: root
Password: ****
JDBC Connection Properties:
There are currently values in the map:
entry#
Security related configuration options
Max connections:
New connection was successfully created with validation status FINE and persistent id 10
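
It can help to confirm that the source table is reachable with the same host, port, database and user that were entered for the connection. The following is a rough sketch, assuming the mysql command-line client is installed on the machine where it runs; jdbc_url and count_source_rows are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check the JDBC source outside of Sqoop2.
# Both functions are illustrative helpers, not part of Sqoop or MySQL.

# Compose a MySQL JDBC connection string from its parts.
jdbc_url() {
  local host="$1" port="$2" db="$3"
  printf 'jdbc:mysql://%s:%s/%s\n' "$host" "$port" "$db"
}

# Count the rows of a table through the mysql CLI.
# -p prompts for the password, -N drops the column header,
# -B selects batch (plain, tab-separated) output.
count_source_rows() {
  local host="$1" port="$2" db="$3" user="$4" table="$5"
  mysql -h "$host" -P "$port" -u "$user" -p -N -B \
    -e "SELECT COUNT(*) FROM ${table}" "$db"
}

# Usage:
#   jdbc_url hadoop000 3306 hive
#   count_source_rows hadoop000 3306 hive root TBLS
```

The row count printed here is a useful reference value to compare against the imported data later.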

Create a job (--xid takes the persistent id of the connection created above):

sqoop:> create job --xid  --type import
Creating job for connection with id
Please fill following values to create new job object
Name: tbls_import
Database configuration
Schema name: hive
Table name: TBLS
Table SQL statement:
Table column names:
Partition column name:
Nulls in partition column:
Boundary query:
Output configuration
Storage type:
  0 : HDFS
Choose: 0
Output format:
  0 : TEXT_FILE
  1 : SEQUENCE_FILE
Choose:
Compression format:
  0 : NONE
  1 : DEFAULT
  2 : DEFLATE
  3 : GZIP
  4 : BZIP2
  5 : LZO
  6 : LZ4
  7 : SNAPPY
Choose:
Output directory: hdfs://hadoop000:8020/sqoop2/tbls_import_demo
Throttling resources
Extractors:
Loaders:
New job was successfully created with validation status FINE and persistent id

Submit the job:

start job --jid 6 

Check the job's execution status:

status job --jid 
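
Because an import runs asynchronously, the status command usually has to be repeated until the job finishes. The following is a minimal polling sketch under several assumptions: the client can run a script file in batch mode, its output contains one of the Sqoop2 submission status names (BOOTING, RUNNING, SUCCEEDED, FAILED and so on), and SQOOP2_HOME is set. The function names, the 10-second interval and the limit of 60 attempts are arbitrary illustrative choices.

```shell
#!/usr/bin/env bash
# Sketch: poll a Sqoop2 job until it reaches a final state.
# All function names here are hypothetical helpers.

# Read client output on stdin and print the first status keyword found.
# Assumption: the status name appears verbatim somewhere in the output.
extract_status() {
  grep -oE 'FAILURE_ON_SUBMIT|NEVER_EXECUTED|BOOTING|RUNNING|SUCCEEDED|FAILED|UNKNOWN' \
    | head -n 1 || true
}

# Return 0 only for states in which the job will not change any more.
is_finished() {
  case "$1" in
    SUCCEEDED|FAILED|FAILURE_ON_SUBMIT) return 0 ;;
    *) return 1 ;;
  esac
}

# Poll the given job id every 10 seconds, at most 60 times.
wait_for_job() {
  local jid="$1" script status attempt
  script="$(mktemp)"
  printf 'set server --host hadoop000 --port 12000 --webapp sqoop\nstatus job --jid %s\n' \
    "$jid" > "$script"
  for attempt in $(seq 1 60); do
    status="$("$SQOOP2_HOME/bin/sqoop.sh" client "$script" | extract_status)"
    if is_finished "$status"; then
      rm -f "$script"
      echo "$status"
      return 0
    fi
    sleep 10
  done
  rm -f "$script"
  return 1
}

# Usage: wait_for_job 6
```

If the client output format differs from what is assumed here, only extract_status needs to be adapted.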

After the job completes successfully, list the files on HDFS:

hadoop fs -ls hdfs://hadoop000:8020/sqoop2/tbls_import_demo
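
Beyond listing the files, a quick sanity check is to count the imported records and compare the number with the row count of the source table. The following is a rough sketch, assuming the job wrote uncompressed TEXT_FILE output with one record per line and that the output directory contains only files; count_imported_rows is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: count the records Sqoop2 wrote to an HDFS output directory.
# count_imported_rows is an illustrative helper, not a Hadoop command.

count_imported_rows() {
  local dir="$1"
  # The glob is quoted so that hadoop, not the local shell, expands it.
  # tr strips the padding some wc implementations add around the number.
  hadoop fs -cat "${dir}/*" | wc -l | tr -d ' '
}

# Usage:
#   count_imported_rows hdfs://hadoop000:8020/sqoop2/tbls_import_demo
```

The count will be off if a text column contains embedded newlines or if a compression format was selected for the job, so treat it as a first check rather than a strict validation.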