hadoop 2.6.0
sqoop:sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
然后把sqoop 解压。这里解压发生的问题是,
tar: /home/luis:归档中找不到
tar: 由于前次错误,将以上次的错误状态退出
tar -xzvf XXXXX -C ~/ 大写-C 不要忘记加上。
配置环境变量:sudo nano /etc/bash.bashrc
加上SQOOP_HOME=/home/hadoop/sqoop-1.4.6
PATH=$PATH:$SQOOP_HOME
source/etc/profile
修改配置文件
在$SQOOP_HOME/conf 目录下拷贝sqoop-env-template.sh改名为sqoop-env.sh.
cp sqoop-env-template.sh sqoop-env.sh
#Setpath to where bin/hadoop is available
export HADOOP_COMMON_HOME=/home/luis/hadoop-2.6.0
#Setpath to where hadoop-*-core.jar is available
#export HADOOP_MAPRED_HOME=/home/luis/hadoop-2.6.0
#setthe path to where bin/hbase is available
export HBASE_HOME=/home/luis/hbase-1.0.1.1
#Setthe path to where bin/hive is available
#export HIVE_HOME=
#Setthe path for where zookeper config dir is
export ZOOCFGDIR=/home/hadoop/zookeeper-3.4.6
修改$SQOOP_HOME/bin/configure-sqoop
注释掉HCatalog,Accumulo检查(除非你准备使用HCatalog,Accumulo等HADOOP上的组件)
##Moved to be a runtime check in sqoop.
#if[ ! -d "${HCAT_HOME}" ]; then
# echo "Warning: $HCAT_HOME does notexist! HCatalog jobs will fail."
# echo 'Please set $HCAT_HOME to the root ofyour HCatalog installation.'
#fi
#if[ ! -d "${ACCUMULO_HOME}" ]; then
# echo "Warning: $ACCUMULO_HOME does notexist! Accumulo imports will fail."
# echo 'Please set $ACCUMULO_HOME to the rootof your Accumulo installation.'
#fi
#Add HCatalog to dependency list
#if[ -e "${HCAT_HOME}/bin/hcat" ]; then
# TMP_SQOOP_CLASSPATH=${SQOOP_CLASSPATH}:`${HCAT_HOME}/bin/hcat-classpath`
# if [ -z "${HIVE_CONF_DIR}" ]; then
# TMP_SQOOP_CLASSPATH=${TMP_SQOOP_CLASSPATH}:${HIVE_CONF_DIR}
# fi
# SQOOP_CLASSPATH=${TMP_SQOOP_CLASSPATH}
#fi
#Add Accumulo to dependency list
#if[ -e "$ACCUMULO_HOME/bin/accumulo" ]; then
# for jn in `$ACCUMULO_HOME/bin/accumuloclasspath | grep file:.*accumulo.*jar |cut -d':' -f2`; do
# SQOOP_CLASSPATH=$SQOOP_CLASSPATH:$jn
# done
# for jn in `$ACCUMULO_HOME/bin/accumuloclasspath | grep file:.*zookeeper.*jar |cut -d':' -f2`; do
# SQOOP_CLASSPATH=$SQOOP_CLASSPATH:$jn
# done
#fi
同时我还把zookeeper 的内容给注释掉了。因为老是卡在那里报错。
#fi
#if [ ! -d "${ZOOKEEPER_HOME}" ]; then
# echo "Warning: $ZOOKEEPER_HOME does not exist! Accumulo imports will fail."
# echo 'Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.'
#fi
就配置完了,然后运行命令,还是会有常见异常:
Streaming result set com.mysql.jdbc.RowDataDynamic@12e9d40f is still active
执行sqoop脚本时,出现这个异常是因为mysql的驱动的原因,使用最新的mysql驱动包。
但是即使我更新到最新的java链接mysql的包,放入lib 目录里还是会出错。
原因是,不止一个地方我放入了 java连接mysql 的包,要把所有共享基础包的地方的连接方式的包都换成最新的。比如黄底位置
$ sudo find -name mysql-connector-java*
./home/luis/下载/mysql-connector-java-5.1.32
./home/luis/下载/mysql-connector-java-5.1.32/mysql-connector-java-5.1.32-bin.jar
./home/luis/weka/weka-3-6-13/lib/mysql-connector-java-5.1.6-bin.jar
./home/luis/sqoop-1.4.6/lib/mysql-connector-java-5.1.32-bin.jar
./home/luis/.eclipse/org.eclipse.platform_3.8_155965261/plugins/org.python.pydev.jython_4.2.0.201507041133/cachedir/packages/mysql-connector-java-5.1.6-bin.pkc
./usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext/mysql-connector-java-5.1.32-bin.jar
./usr/share/java/mysql-connector-java-5.1.32-bin.jar
然后可以使用 命令把mysql 导入 hdfs了
导入全部 表 sqoop import-all-tables --connect jdbc:mysql://192.168.1.113:3306/weibocatch
1、每个表必须都只有一个列作为主键;
2、必须将每个表中所有的数据导入,而不是部分;
3、必须使用默认分隔列,且WHERE子句无任何强加的条件
导入单张表:
./sqoop import --connect jdbc:mysql://192.168.1.113:3306/weibocatch --username root --password xxxxx --table w_transfer -m1 这里的m 1 表示的是启动map 的个数
测试连接
sqoop list-databases --connect jdbc:mysql://172.16.247.140:3306/ --username xxx --password xxx
列出表
sqoop list-tables --connect jdbc:mysql://172.16.247.140:3306/sqoop --username hive --password 123456
然后就可以进到hadoop的hdfs里头看看内容了。
hdfs 的命令 ls 和 ls -al ,rm , rmr