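I packaged a Spark program with sbt without bundling all of its dependencies into the jar. When the Spark application was run on the cluster, it failed with this exception:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration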
at SparkHbase
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
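... 11 more
The exception occurs because the HBase dependencies are missing from the Spark application's classpath. My fix was to add the following line to spark/conf/spark-env.sh on the cluster: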
export SPARK_CLASSPATH=/home/hadoop/SW/hbase/lib/hbase-client-0.98.12-hadoop2.jar:/home/hadoop/SW/hbase/lib/hbase-server-0.98.12-hadoop2.jar:/home/hadoop/SW/hbase/lib/hbase-common-0.98.12-hadoop2.jar:/home/hadoop/SW/hbase/lib/hbase-protocol-0.98.12-hadoop2.jar:/home/hadoop/SW/hbase/lib/htrace-core-2.04.jar:/home/hadoop/SW/hbase/lib/hbase-hadoop2-compat-0.98.12-hadoop2.jar:/home/hadoop/SW/hbase/lib/hbase-it-0.98.12-hadoop2.jar:/home/hadoop/SW/hbase/lib/guava-12.0.1.jar
Remember that the jar paths must be separated by colons! Then run:
source spark-env.sh
and restart the Spark services. After that, the application runs fine.
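As a rough sketch, the long colon-separated value could also be assembled from a list of jars, which makes a missing or doubled colon less likely. `join_classpath` and `HBASE_LIB` are names made up here for illustration; the jar paths are the ones used in this post.

```shell
# Hypothetical helper: joins its arguments with ":" to form a classpath string.
# The body runs in a subshell, so the IFS change does not leak to the caller.
join_classpath() (
  IFS=:
  printf '%s\n' "$*"
)

# Assumed variable; the directory is the HBase lib path from this post.
HBASE_LIB=/home/hadoop/SW/hbase/lib

SPARK_CLASSPATH=$(join_classpath \
  "$HBASE_LIB/hbase-client-0.98.12-hadoop2.jar" \
  "$HBASE_LIB/hbase-server-0.98.12-hadoop2.jar" \
  "$HBASE_LIB/hbase-common-0.98.12-hadoop2.jar" \
  "$HBASE_LIB/hbase-protocol-0.98.12-hadoop2.jar" \
  "$HBASE_LIB/htrace-core-2.04.jar" \
  "$HBASE_LIB/hbase-hadoop2-compat-0.98.12-hadoop2.jar" \
  "$HBASE_LIB/hbase-it-0.98.12-hadoop2.jar" \
  "$HBASE_LIB/guava-12.0.1.jar")
export SPARK_CLASSPATH
```

Placed in spark-env.sh instead of the single long export line, this should produce the same value, assuming the jars live in the directory shown.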
There is also another way: when submitting the application, add the --driver-class-path option to set the driver's classpath:
./spark-submit --driver-class-path /home/hadoop/SW/hbase/lib/hbase-client-0.98.12-hadoop2.jar:/home/hadoop/SW/hbase/lib/hbase-server-0.98.12-hadoop2.jar:/home/hadoop/SW/hbase/lib/hbase-common-0.98.12-hadoop2.jar:/home/hadoop/SW/hbase/lib/hbase-protocol-0.98.12-hadoop2.jar:/home/hadoop/SW/hbase/lib/htrace-core-2.04.jar:/home/hadoop/SW/hbase/lib/hbase-hadoop2-compat-0.98.12-hadoop2.jar:/home/hadoop/SW/hbase/lib/hbase-it-0.98.12-hadoop2.jar:/home/hadoop/SW/hbase/lib/guava-12.0.1.jar --class com.dtxy.data.SqlTest ../lib/bigdata-1.0-SNAPSHOT.jar
Note: do not set SPARK_CLASSPATH in spark/conf/spark-env.sh and also pass --driver-class-path when submitting the job. Use one or the other; otherwise the following exception is thrown:
15/08/14 09:22:23 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Found both spark.driver.extraClassPath and SPARK_CLASSPATH. Use only the former.
at org.apache.spark.SparkConf
at org.apache.spark.SparkConf
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.SparkConf
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.SparkConf.validateSettings(SparkConf.scala:430)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:365)
at com.dtxy.data.SqlTest
at org.apache.spark.deploy.SparkSubmit
at org.apache.spark.deploy.SparkSubmit
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
15/08/14 09:22:23 INFO Utils: Shutdown hook called
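To avoid running into this conflict, one could check before submitting whether spark-env.sh already sets SPARK_CLASSPATH. This is only a minimal sketch: `sets_spark_classpath` is a hypothetical helper, and the `$SPARK_HOME/conf/spark-env.sh` location is an assumption about the installation.

```shell
# Hypothetical helper: succeeds if the given file contains an uncommented
# SPARK_CLASSPATH assignment (with or without a leading "export").
sets_spark_classpath() {
  [ -f "$1" ] && grep -Eq '^[[:space:]]*(export[[:space:]]+)?SPARK_CLASSPATH=' "$1"
}

# Assumed location of spark-env.sh; adjust to the actual installation.
SPARK_ENV="${SPARK_HOME:-.}/conf/spark-env.sh"

if sets_spark_classpath "$SPARK_ENV" || [ -n "${SPARK_CLASSPATH:-}" ]; then
  echo "SPARK_CLASSPATH is configured; do not also pass --driver-class-path" >&2
else
  echo "SPARK_CLASSPATH not found; --driver-class-path can be used"
fi
```

The check only looks at one file and the current environment, so it would miss a SPARK_CLASSPATH set elsewhere.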
With that, the problem is solved!
Reference: http://www.abcn.net/2014/07/lighting-spark-with-hbase-full-edition.html