Sqoop2: Importing Relational Database Data into HDFS

Time: 2021-07-05 07:51:01

Start the Sqoop2 1.99.4 client and point it at the server:

$SQOOP2_HOME/bin/sqoop.sh client
set server --host hadoop000 --port 12000 --webapp sqoop
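
The three values passed to set server describe where the Sqoop2 server listens. As a rough sketch, assuming the server exposes its REST interface over plain HTTP at http://host:port/webapp with a version resource, and that curl is installed, a shell helper like the one below can confirm the server is reachable before opening the client. The function names are hypothetical and not part of Sqoop.

```shell
# Build the base URL of a Sqoop2 server from the same values
# that are passed to `set server` (host, port, webapp).
sqoop_server_url() {
  local host="$1" port="$2" webapp="$3"
  printf 'http://%s:%s/%s\n' "$host" "$port" "$webapp"
}

# Ask the server for its version (assumed REST resource: <base>/version).
# A non-zero exit status usually means the server is down or the
# host/port/webapp values are wrong.
check_sqoop_server() {
  curl --silent --fail "$(sqoop_server_url "$@")/version"
}

# Usage (needs a running server):
#   check_sqoop_server hadoop000 12000 sqoop
```

If the check fails, the later CLIENT_0001 style errors are more likely a connectivity problem than a job problem.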

 

List all connectors:

show connector --all
2 connector(s) to show:
Connector with id 1:
  Name: hdfs-connector
  Class: org.apache.sqoop.connector.hdfs.HdfsConnector
  Version: 1.99.4-cdh5.3.0

Connector with id 2:
  Name: generic-jdbc-connector
  Class: org.apache.sqoop.connector.jdbc.GenericJdbcConnector
  Version: 1.99.4-cdh5.3.0

List all links:

show link

Delete a specific link (x is the link id):

delete link --lid x

 

List all jobs:

show job

 

Delete a specific job:

delete job --jid 1

  

Create a link that uses the generic-jdbc-connector (connector id 2):

create link --cid 2
Name: First Link
JDBC Driver Class: oracle.jdbc.driver.OracleDriver
JDBC Connection String: jdbc:oracle:thin:@192.168.24.150:1521:dt1
Username: root
Password: ****
JDBC Connection Properties:
There are currently 0 values in the map:
entry# protocol=tcp
There are currently 1 values in the map:
protocol = tcp
entry#
New link was successfully created with validation status OK and persistent id 3
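
The JDBC connection string above follows the Oracle thin-driver SID form, jdbc:oracle:thin:@host:port:SID. A small sketch of how the string is assembled, with hypothetical helper names; the second function shows the service-name form, which is an alternative and not what this walkthrough uses:

```shell
# SID form, as used for the link above: jdbc:oracle:thin:@host:port:SID
oracle_sid_url() {
  local host="$1" port="$2" sid="$3"
  printf 'jdbc:oracle:thin:@%s:%s:%s\n' "$host" "$port" "$sid"
}

# Service-name form (alternative): jdbc:oracle:thin:@//host:port/service
oracle_service_url() {
  local host="$1" port="$2" service="$3"
  printf 'jdbc:oracle:thin:@//%s:%s/%s\n' "$host" "$port" "$service"
}

# Usage:
#   oracle_sid_url 192.168.24.150 1521 dt1
```

Whichever form is used, the Oracle JDBC driver jar has to be available to the Sqoop2 server, otherwise the driver class cannot be loaded.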

show link
+----+-------------+-----------+---------+
| Id | Name | Connector | Enabled |
+----+-------------+-----------+---------+
| 3 | First Link | 2 | true |
+----+-------------+-----------+---------+

Create a link that uses the hdfs-connector (connector id 1):

create link --cid 1
Name: Second Link
HDFS URI: hdfs://dtbigdata1:9000
New link was successfully created with validation status OK and persistent id 4

 

show link
+----+-------------+-----------+---------+
| Id | Name | Connector | Enabled |
+----+-------------+-----------+---------+
| 3 | First Link | 2 | true |
| 4 | Second Link | 1 | true |
+----+-------------+-----------+---------+

show link --all
2 link(s) to show:
link with id 3 and name First Link (Enabled: true, Created by null at 15-2-2 ??11:28, Updated by null at 15-2-2 ??11:28)
Using Connector id 2
  Link configuration
    JDBC Driver Class: oracle.jdbc.driver.OracleDriver
    JDBC Connection String: jdbc:oracle:thin:@192.168.24.150:1521:dt1
    Username: root
    Password:
    JDBC Connection Properties:
      protocol = tcp
link with id 4 and name Second Link (Enabled: true, Created by null at 15-2-2 ??11:32, Updated by null at 15-2-2 ??11:32)
Using Connector id 1
  Link configuration
    HDFS URI: hdfs://dtbigdata1:9000

Create a job from the link ids (-f is the source link, -t is the target link):

create job -f 3 -t 4
Creating job for links with from id 3 and to id 4
Please fill following values to create new job object
Name: Sqoopy

From database configuration

Schema name: hive -- the user name
Table name: TBLS
Table SQL statement:
Table column names:
Partition column name:
Null value allowed for the partition column:
Boundary query:

ToJob configuration

Output format:
  0 : TEXT_FILE
  1 : SEQUENCE_FILE
Choose: 0
Compression format:
  0 : NONE
  1 : DEFAULT
  2 : DEFLATE
  3 : GZIP
  4 : BZIP2
  5 : LZO
  6 : LZ4
  7 : SNAPPY
  8 : CUSTOM
Choose: 0
Custom compression format:
Output directory: hdfs://dtbigdata1:9000/tmp/data

Throttling resources

Extractors: 1 -- number of map tasks
Loaders: 1 -- number of reduce tasks
New job was successfully created with validation status OK and persistent id 2

List all jobs:

show job
+----+--------+----------------+--------------+---------+
| Id | Name | From Connector | To Connector | Enabled |
+----+--------+----------------+--------------+---------+
| 2 | Sqoopy | 2 | 1 | true |
+----+--------+----------------+--------------+---------+

Start a specific job. Once the job has finished, check the files on HDFS (hdfs dfs -ls hdfs://dtbigdata1:9000/tmp/data/):

start job --jid 2

 

Check the execution status of a specific job:

status job --jid 2

 

Stop a specific job:

stop job --jid 2
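
The interactive commands above can also be collected into a script file. The sketch below assumes the 1.99.4 client accepts a script path as an extra argument (batch mode, as described in the Sqoop2 shell documentation); the function name and the /tmp path are hypothetical, and the server values and job id are the ones used in this walkthrough.

```shell
# Write a Sqoop2 client script that connects to the server,
# lists links and jobs, then starts the given job and reports its status.
write_job_script() {
  local script="$1" jid="$2"
  cat > "$script" <<EOF
set server --host hadoop000 --port 12000 --webapp sqoop
show link
show job
start job --jid $jid
status job --jid $jid
EOF
}

# Usage (assumes batch mode is available in this client version):
#   write_job_script /tmp/run_job.sqoop 2
#   "$SQOOP2_HOME"/bin/sqoop.sh client /tmp/run_job.sqoop
```

Keeping the commands in a file makes a run repeatable and leaves a record of exactly which job id was started.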

 

A common error when starting a job (for example, start job --jid 2):

Exception has occurred during processing command 
Exception: org.apache.sqoop.common.SqoopException Message: CLIENT_0001:Server has returned exception

Turn on verbose output in the Sqoop client, then view the job details:

set option --name verbose --value true
show job --jid 2