java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration

Date: 2021-02-24 08:26:08

I want to create my first Scala program using the example HBaseTest2.scala provided with Spark 1.4.1. The goal is to connect to HBase and do some basic things, such as counting or scanning rows. However, when I tried to execute the program, I got an error: it seems that Spark couldn't find the class HBaseConfiguration. Assume we're located at the root path of my project HBaseTest2: /usr/local/Cellar/spark/programs/HBaseTest2. Here are some details of the exception:

./src/main/scala/com/orange/spark/examples/HBaseTest2.scala

package com.orange.spark.examples

import org.apache.hadoop.hbase.client.HBaseAdmin
import org.apache.hadoop.hbase.{HBaseConfiguration, HTableDescriptor, TableName}
import org.apache.hadoop.hbase.mapreduce.TableInputFormat

import org.apache.spark._


object HBaseTest2 {
  def main(args: Array[String]) {
    val sparkConf = new SparkConf().setAppName("HBaseTest2")
    val sc = new SparkContext(sparkConf)
    val tableName = "personal-cloud-test"

    // please ensure HBASE_CONF_DIR is on classpath of spark driver
    // e.g: set it through spark.driver.extraClassPath property
    // in spark-defaults.conf or through --driver-class-path
    // command line option of spark-submit

    val conf = HBaseConfiguration.create()

    // Other options for configuring scan behavior are available. More information available at
    // http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableInputFormat.html
    conf.set(TableInputFormat.INPUT_TABLE, tableName)

    // Initialize hBase table if necessary
    val admin = new HBaseAdmin(conf)
    if (!admin.isTableAvailable(tableName)) {
      val tableDesc = new HTableDescriptor(TableName.valueOf(tableName))
      admin.createTable(tableDesc)
    }

    val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
      classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
      classOf[org.apache.hadoop.hbase.client.Result])

    println("hbaseRDD.count()")
    println(hBaseRDD.count())

    sc.stop()
    admin.close()
  }
}

./build.sbt
I've added dependencies in this file to ensure all the classes the program calls are included in the jar file (see the note after the file).

name := "HBaseTest2"

version := "1.0"

scalaVersion := "2.11.7"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.1"

libraryDependencies ++= Seq(
    "org.apache.hadoop" % "hadoop-core" % "1.2.1",
    "org.apache.hbase" % "hbase" % "1.0.1.1",
    "org.apache.hbase" % "hbase-client" % "1.0.1.1",
    "org.apache.hbase" % "hbase-common" % "1.0.1.1",
    "org.apache.hbase" % "hbase-server" % "1.0.1.1"
)
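
One thing I'm not sure about: as far as I understand, a plain sbt package does not bundle these library dependencies into the jar, only the project's own classes. A sketch of how they could be bundled with the sbt-assembly plugin (the plugin version and the "provided" scope for spark-core are assumptions on my part, not part of the setup above):

// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.1")

// build.sbt: keep Spark itself out of the fat jar, since spark-submit provides it
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.1" % "provided"

Running sbt assembly should then produce something like target/scala-2.11/HBaseTest2-assembly-1.0.jar, which would be submitted instead of the plain jar.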

Run application

MacBook-Pro-de-Mincong:spark-1.4.1 minconghuang$ bin/spark-submit \
  --class "com.orange.spark.examples.HBaseTest2" \
  --master local[4] \
  ../programs/HBaseTest2/target/scala-2.11/hbasetest2_2.11-1.0.jar

Exception

15/08/18 12:06:17 INFO storage.BlockManagerMaster: Registered BlockManager
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
    at com.orange.spark.examples.HBaseTest2$.main(HBaseTest2.scala:21)
    at com.orange.spark.examples.HBaseTest2.main(HBaseTest2.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 11 more
15/08/18 12:06:17 INFO spark.SparkContext: Invoking stop() from shutdown hook

The problem might come from the HBase configuration, as mentioned in HBaseTest2.scala line 16:

// please ensure HBASE_CONF_DIR is on classpath of spark driver
// e.g: set it through spark.driver.extraClassPath property
// in spark-defaults.conf or through --driver-class-path
// command line option of spark-submit

But I don't know how to configure it... I've added HBASE_CONF_DIR to the CLASSPATH in my command line. The CLASSPATH is now /usr/local/Cellar/hadoop/hbase-1.0.1.1/conf. Nothing happened... T_T So what should I do to get this fixed? I can add/delete details if needed. Thanks a lot!!
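
To make that concrete, here is what I understand those two options to mean for my setup (just a sketch, using the conf directory above; I haven't gotten either variant to work yet):

bin/spark-submit \
  --class "com.orange.spark.examples.HBaseTest2" \
  --master local[4] \
  --driver-class-path /usr/local/Cellar/hadoop/hbase-1.0.1.1/conf \
  ../programs/HBaseTest2/target/scala-2.11/hbasetest2_2.11-1.0.jar

or the equivalent line in conf/spark-defaults.conf:

spark.driver.extraClassPath    /usr/local/Cellar/hadoop/hbase-1.0.1.1/conf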

2 Solutions

#1


Have you tried

sparkConf.set("spark.driver.extraClassPath", "/usr/local/Cellar/hadoop/hbase-1.0.1.1/conf")
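
If setting it in code has no effect, that may be because in client mode the driver JVM has already started by the time that line runs, so spark.driver.extraClassPath can no longer change its classpath. A sketch of the same idea passed on the command line instead, reusing the spark-submit command from the question:

bin/spark-submit \
  --class "com.orange.spark.examples.HBaseTest2" \
  --master local[4] \
  --conf spark.driver.extraClassPath=/usr/local/Cellar/hadoop/hbase-1.0.1.1/conf \
  ../programs/HBaseTest2/target/scala-2.11/hbasetest2_2.11-1.0.jar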

#2


The problem came from the classpath setting, as mentioned in HBaseTest2.scala line 33:

// please ensure HBASE_CONF_DIR is on classpath of spark driver
// e.g: set it through spark.driver.extraClassPath property
// in spark-defaults.conf or through --driver-class-path
// command line option of spark-submit

As I'm using Mac OS X, the setup is different from Linux. When I tried echo $CLASSPATH, it returned empty; it seems the Mac setup doesn't use CLASSPATH for the driver at all. So I needed to add all the jar files through spark.driver.extraClassPath in the spark-defaults.conf file. My colleague did the same thing on Linux. I think there's a more elegant way to handle this, but we didn't find it. Please share if you know the answer. Thanks.


Mac / Linux
add all external jars in conf/spark-defaults.conf

spark.driver.extraClassPath    /path/to/a.jar:/path/to/b.jar:/path/to/c.jar
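
For the setup in the question, that line could look something like the one below. The lib directory and jar names are an assumption about a standard HBase 1.0.1.1 layout, not paths confirmed above; adjust them to wherever the HBase jars actually live:

spark.driver.extraClassPath    /usr/local/Cellar/hadoop/hbase-1.0.1.1/conf:/usr/local/Cellar/hadoop/hbase-1.0.1.1/lib/hbase-common-1.0.1.1.jar:/usr/local/Cellar/hadoop/hbase-1.0.1.1/lib/hbase-client-1.0.1.1.jar:/usr/local/Cellar/hadoop/hbase-1.0.1.1/lib/hbase-server-1.0.1.1.jar:/usr/local/Cellar/hadoop/hbase-1.0.1.1/lib/hbase-protocol-1.0.1.1.jar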
