[sqoop1.99.6] 基于1.99.6版本的一个小例子

时间:2024-08-04 23:33:26

1、创建mysql数据库、表、以及测试数据
mysql> desc test;
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | varchar(45) | YES | | NULL | |
| age | int(11) | YES | | NULL | |
+-------+-------------+------+-----+---------+----------------+

2、赋予权限
grant all on test.* to 'root@localhost' identified by 'root';
flush privileges;

grant all on saiku_base.* to 'root@localhost' identified by 'password';
flush privileges;

3、启动sqoop2-server
./bin/sqoop2-server start

4、启动sqoop2-shell
./bin/sqoop2-shell

5、查看连接信息
sqoop:000> show connector
+----+------------------------+---------+------------------------------------------------------+----------------------+
| Id | Name | Version | Class | Supported Directions |
+----+------------------------+---------+------------------------------------------------------+----------------------+
| 1 | generic-jdbc-connector | 1.99.6 | org.apache.sqoop.connector.jdbc.GenericJdbcConnector | FROM/TO |
| 2 | kite-connector | 1.99.6 | org.apache.sqoop.connector.kite.KiteConnector | FROM/TO |
| 3 | hdfs-connector | 1.99.6 | org.apache.sqoop.connector.hdfs.HdfsConnector | FROM/TO |
| 4 | kafka-connector | 1.99.6 | org.apache.sqoop.connector.kafka.KafkaConnector | TO |
+----+------------------------+---------+------------------------------------------------------+----------------------+

第1种 传统jdbc连接
第3种 hdfs连接

6、创建一个传统的jdbc连接
sqoop:000> create link -c 1 (这里的数字1是connector的类型 也就是代指传统jdbc连接)
Name:192.168.200.70-mysql
JDBC Driver Class: com.mysql.jdbc.Driver
JDBC Connection String: jdbc:mysql://192.168.200.70:3306/saiku_base?useUnicode=true&characterEncoding=utf-8
Username: root
Password: password
JDBC Connection Properties:
There are currently 0 values in the map:
entry#
New link was successfully created with validation status OK and persistent id 1

7、查看创建的jdbc连接
sqoop:000> show link
+----+---------------------+--------------+------------------------+---------+
| Id | Name | Connector Id | Connector Name | Enabled |
+----+---------------------+--------------+------------------------+---------+
| 1 | 192.168.10.52-mysql | 1 | generic-jdbc-connector | true |
+----+---------------------+--------------+------------------------+---------+

修改link:update link -l 1
删除link:delete link -l 1

8、创建一个HDFS连接
语法:http://sqoop.apache.org/docs/1.99.6/CommandLineClient.html#delete-link-function
HDFS URI 例子:
hdfs://example.com:8020/
hdfs://example.com/
file:///
file:///tmp
file://localhost/tmp

sqoop:000> create link -c 3
Creating link for connector with id 3
Please fill following values to create new link object
Name: 1921.68.200.70-hdfs
Link configuration
HDFS URI: hdfs://namenode:9000
Hadoop conf directory: /home/hadoop/hadoop/etc/hadoop
New link was successfully created with validation status OK and persistent id 3

9、查看创建的hdfs连接
sqoop:000> show link
+----+---------------------+--------------+------------------------+---------+
| Id | Name | Connector Id | Connector Name | Enabled |
+----+---------------------+--------------+------------------------+---------+
| 1 | 192.168.10.52-mysql | 1 | generic-jdbc-connector | true |
| 3 | 1921.68.200.70-hdfs | 3 | hdfs-connector | true |
+----+---------------------+--------------+------------------------+---------+

这时候,mysql和hdfs的连接就创建完毕了
接下来要执行数据同步操作

10、创建job -f【from】 -t【to】 即从哪一个数据库导入到另一个数据库
sqoop:000> show link
+----+---------------------+--------------+------------------------+---------+
| Id | Name | Connector Id | Connector Name | Enabled |
+----+---------------------+--------------+------------------------+---------+
| 1 | 192.168.10.52-mysql | 1 | generic-jdbc-connector | true |
| 3 | 1921.68.200.70-hdfs | 3 | hdfs-connector | true |
+----+---------------------+--------------+------------------------+---------+
sqoop:000> create job -f 1 -t 3 【将数据从mysql同步到hdfs】
Creating job for links with from id 1 and to id 3
Please fill following values to create new job object
Name: sync_mysql_2_hdfs

From database configuration

Schema name: test
Table name: test
Table SQL statement:
Table column names:
Partition column name:
Null value allowed for the partition column:
Boundary query:

Incremental read

Check column:
Last value:

To HDFS configuration

Override null value:
Null value:
Output format:
0 : TEXT_FILE
1 : SEQUENCE_FILE
Choose: 0
Compression format:
0 : NONE
1 : DEFAULT
2 : DEFLATE
3 : GZIP
4 : BZIP2
5 : LZO
6 : LZ4
7 : SNAPPY
8 : CUSTOM
Choose: 0
Custom compression format:
Output directory: hdfs://master:9000/sqoop2/tbl_test_from_mysql_2_hdfs
Append mode:

Throttling resources

Extractors:
Loaders:
New job was successfully created with validation status OK and persistent id 1

11、查看job
sqoop:000> show job
+----+-------------------+----------------+--------------+---------+
| Id | Name | From Connector | To Connector | Enabled |
+----+-------------------+----------------+--------------+---------+
| 1 | sync_mysql_2_hdfs | 1 | 3 | true |
+----+-------------------+----------------+--------------+---------+

12、启动指定的job: 该job执行完后查看HDFS上的文件 hdfs://master:9000/sqoop2/tbl_test_from_mysql_2_hdfs
sqoop:000> start job --jid 1
Exception has occurred during processing command
Exception: org.apache.sqoop.common.SqoopException Message: CLIENT_0001:Server has returned exception

在start job(如:start job --jid 2)时常见错误:
Exception has occurred during processing command
Exception: org.apache.sqoop.common.SqoopException Message: CLIENT_0001:Server has returned exception

在sqoop客户端设置查看job详情:
set option --name verbose --value true
show job --jid 2

例如:
sqoop:000> set option --name verbose --value true
Verbose option was changed to true
sqoop:000> show job -jid 1
1 job(s) to show:
Job with id 1 and name sync_mysql_2_hdfs (Enabled: true, Created by hadoop at 16-2-19 下午3:47, Updated by hadoop at 16-2-19 下午3:47)
Throttling resources
Extractors:
Loaders:
From link: 1
From database configuration
Schema name: test
Table name: test
Table SQL statement:
Table column names:
Partition column name:
Null value allowed for the partition column:
Boundary query:
Incremental read
Check column:
Last value:
To link: 3
To HDFS configuration
Override null value:
Null value:
Output format: TEXT_FILE
Compression format: NONE
Custom compression format:
Output directory: hdfs://master:9000/sqoop2/tbl_test_from_mysql_2_hdfs
Append mode:

然后再次启动任务就可以显示出有用的出错信息
sqoop:000> start job --jid 1
Exception has occurred during processing command
Exception: org.apache.sqoop.common.SqoopException Message: CLIENT_0001:Server has returned exception
Stack trace:
at org.apache.sqoop.client.request.ResourceRequest (ResourceRequest.java:129)
at org.apache.sqoop.client.request.ResourceRequest (ResourceRequest.java:179)
at org.apache.sqoop.client.request.JobResourceRequest (JobResourceRequest.java:112)
at org.apache.sqoop.client.request.SqoopResourceRequests (SqoopResourceRequests.java:157)
at org.apache.sqoop.client.SqoopClient (SqoopClient.java:452)
at org.apache.sqoop.shell.StartJobFunction (StartJobFunction.java:80)
at org.apache.sqoop.shell.SqoopFunction (SqoopFunction.java:51)
at org.apache.sqoop.shell.SqoopCommand (SqoopCommand.java:135)
at org.apache.sqoop.shell.SqoopCommand (SqoopCommand.java:111)
at org.codehaus.groovy.tools.shell.Command$execute (null:-1)
at org.codehaus.groovy.tools.shell.Shell (Shell.groovy:101)
at org.codehaus.groovy.tools.shell.Groovysh (Groovysh.groovy:-1)
at sun.reflect.NativeMethodAccessorImpl (NativeMethodAccessorImpl.java:-2)
at sun.reflect.NativeMethodAccessorImpl (NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl (DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method (Method.java:497)
at org.codehaus.groovy.reflection.CachedMethod (CachedMethod.java:90)
at groovy.lang.MetaMethod (MetaMethod.java:233)
at groovy.lang.MetaClassImpl (MetaClassImpl.java:1054)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter (ScriptBytecodeAdapter.java:128)
at org.codehaus.groovy.tools.shell.Groovysh (Groovysh.groovy:173)
at sun.reflect.NativeMethodAccessorImpl (NativeMethodAccessorImpl.java:-2)
at sun.reflect.NativeMethodAccessorImpl (NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl (DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method (Method.java:497)
at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce (PogoMetaMethodSite.java:267)
at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite (PogoMetaMethodSite.java:52)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite (AbstractCallSite.java:141)
at org.codehaus.groovy.tools.shell.Groovysh (Groovysh.groovy:121)
at org.codehaus.groovy.tools.shell.Shell (Shell.groovy:114)
at org.codehaus.groovy.tools.shell.Shell$leftShift$0 (null:-1)
at org.codehaus.groovy.tools.shell.ShellRunner (ShellRunner.groovy:88)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner (InteractiveShellRunner.groovy:-1)
at sun.reflect.NativeMethodAccessorImpl (NativeMethodAccessorImpl.java:-2)
at sun.reflect.NativeMethodAccessorImpl (NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl (DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method (Method.java:497)
at org.codehaus.groovy.reflection.CachedMethod (CachedMethod.java:90)
at groovy.lang.MetaMethod (MetaMethod.java:233)
at groovy.lang.MetaClassImpl (MetaClassImpl.java:1054)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter (ScriptBytecodeAdapter.java:128)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter (ScriptBytecodeAdapter.java:148)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner (InteractiveShellRunner.groovy:100)
at sun.reflect.NativeMethodAccessorImpl (NativeMethodAccessorImpl.java:-2)
at sun.reflect.NativeMethodAccessorImpl (NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl (DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method (Method.java:497)
at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce (PogoMetaMethodSite.java:267)
at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite (PogoMetaMethodSite.java:52)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite (AbstractCallSite.java:137)
at org.codehaus.groovy.tools.shell.ShellRunner (ShellRunner.groovy:57)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner (InteractiveShellRunner.groovy:-1)
at sun.reflect.NativeMethodAccessorImpl (NativeMethodAccessorImpl.java:-2)
at sun.reflect.NativeMethodAccessorImpl (NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl (DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method (Method.java:497)
at org.codehaus.groovy.reflection.CachedMethod (CachedMethod.java:90)
at groovy.lang.MetaMethod (MetaMethod.java:233)
at groovy.lang.MetaClassImpl (MetaClassImpl.java:1054)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter (ScriptBytecodeAdapter.java:128)
at org.codehaus.groovy.runtime.ScriptBytecodeAdapter (ScriptBytecodeAdapter.java:148)
at org.codehaus.groovy.tools.shell.InteractiveShellRunner (InteractiveShellRunner.groovy:66)
at java_lang_Runnable$run (null:-1)
at org.codehaus.groovy.runtime.callsite.CallSiteArray (CallSiteArray.java:42)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite (AbstractCallSite.java:108)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite (AbstractCallSite.java:112)
at org.codehaus.groovy.tools.shell.Groovysh (Groovysh.groovy:463)
at org.codehaus.groovy.tools.shell.Groovysh (Groovysh.groovy:402)
at org.apache.sqoop.shell.SqoopShell (SqoopShell.java:130)
Caused by: Exception: java.lang.IllegalArgumentException Message: Wrong FS: hdfs://master:9000/sqoop2/tbl_test_from_mysql_2_hdfs, expected: hdfs://namenode:9000
Stack trace:
at org.apache.hadoop.fs.FileSystem (FileSystem.java:646)
at org.apache.hadoop.hdfs.DistributedFileSystem (DistributedFileSystem.java:194)
at org.apache.hadoop.hdfs.DistributedFileSystem (DistributedFileSystem.java:106)
at org.apache.hadoop.hdfs.DistributedFileSystem$22 (DistributedFileSystem.java:1305)
at org.apache.hadoop.hdfs.DistributedFileSystem$22 (DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver (FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem (DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystem (FileSystem.java:1424)
at org.apache.sqoop.connector.hdfs.HdfsToInitializer (HdfsToInitializer.java:58)
at org.apache.sqoop.connector.hdfs.HdfsToInitializer (HdfsToInitializer.java:35)
at org.apache.sqoop.driver.JobManager (JobManager.java:449)
at org.apache.sqoop.driver.JobManager (JobManager.java:373)
at org.apache.sqoop.driver.JobManager (JobManager.java:276)
at org.apache.sqoop.handler.JobRequestHandler (JobRequestHandler.java:380)
at org.apache.sqoop.handler.JobRequestHandler (JobRequestHandler.java:116)
at org.apache.sqoop.server.v1.JobServlet (JobServlet.java:96)
at org.apache.sqoop.server.SqoopProtocolServlet (SqoopProtocolServlet.java:79)
at javax.servlet.http.HttpServlet (HttpServlet.java:646)
at javax.servlet.http.HttpServlet (HttpServlet.java:723)
at org.apache.catalina.core.ApplicationFilterChain (ApplicationFilterChain.java:290)
at org.apache.catalina.core.ApplicationFilterChain (ApplicationFilterChain.java:206)
at org.apache.hadoop.security.authentication.server.AuthenticationFilter (AuthenticationFilter.java:595)
at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter (DelegationTokenAuthenticationFilter.java:291)
at org.apache.hadoop.security.authentication.server.AuthenticationFilter (AuthenticationFilter.java:554)
at org.apache.catalina.core.ApplicationFilterChain (ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain (ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve (StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve (StandardContextValve.java:191)
at org.apache.catalina.core.StandardHostValve (StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve (ErrorReportValve.java:103)
at org.apache.catalina.core.StandardEngineValve (StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter (CoyoteAdapter.java:293)
at org.apache.coyote.http11.Http11Processor (Http11Processor.java:861)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler (Http11Protocol.java:606)
at org.apache.tomcat.util.net.JIoEndpoint$Worker (JIoEndpoint.java:489)
at java.lang.Thread (Thread.java:745)
sqoop:000>

出错原因:
Caused by: Exception: java.lang.IllegalArgumentException Message: Wrong FS: hdfs://master:9000/sqoop2/tbl_test_from_mysql_2_hdfs, expected: hdfs://namenode:9000

创建 link 的时候 HDFS URL错了 不是 hdfs://namenode:9000 是hdfs://master:9000

修改
update job -jid 1
Output directory: hdfs://namenode:9000/sqoop2/tbl_test_from_mysql_2_hdfs

日志:
sqoop:000> start job --jid 1
Submission details
Job ID: 1
Server URL: http://localhost:12000/sqoop/
Created by: hadoop
Creation date: 2016-02-19 16:04:59 CST
Lastly updated by: hadoop
External ID: job_1455853781605_0001
http://namenode:8088/proxy/application_1455853781605_0001/
Source Connector schema: Schema{name=test.test,columns=[
FixedPoint{name=id,nullable=true,type=FIXED_POINT,byteSize=4,signed=true},
Text{name=name,nullable=true,type=TEXT,charSize=null},
FixedPoint{name=age,nullable=true,type=FIXED_POINT,byteSize=4,signed=true}]}
2016-02-19 16:04:59 CST: BOOTING - Progress is not available

#查看任务执行状态
sqoop:000> status job --jid 1
Submission details
Job ID: 1
Server URL: http://localhost:12000/sqoop/
Created by: hadoop
Creation date: 2016-02-19 16:04:59 CST
Lastly updated by: hadoop
External ID: job_1455853781605_0001
http://namenode:8088/proxy/application_1455853781605_0001/
2016-02-19 16:07:26 CST: BOOTING - 0.00 %

解释:http://sqoop.apache.org/docs/1.99.2/ClientAPI.html
Job submission requires a job id. On successful submission, getStatus() method returns “BOOTING” or “RUNNING”.

sqoop:000> show submission
+--------+------------------------+---------+------------------------------+
| Job Id | External Id | Status | Last Update Date |
+--------+------------------------+---------+------------------------------+
| 1 | job_1455853781605_0001 | BOOTING | Fri Feb 19 16:15:22 CST 2016 |
+--------+------------------------+---------+------------------------------+