Unable to connect to Hive metastore from Spark SQL

Hive 0.14, Spark 1.6. I am trying to connect to a Hive table from Spark programmatically. I have already put my hive-site.xml in the Spark conf folder, but every time I run this code it connects to the default Derby metastore instead. I have searched a lot, and everywhere the suggestion is to put hive-site.xml in the Spark configuration folder, which I have already done. Can someone please suggest a solution? Below is my code.

FYI: My existing Hive installation uses MySQL as its metastore.

I am running this code directly from Eclipse, not through the spark-submit utility.

package org.scala.spark

import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.sql.hive.HiveContext

object HiveToHdfs {

  def main(args: Array[String]): Unit = {
    // Run Spark in local mode inside the IDE
    val conf = new SparkConf().setAppName("HDFS to Local").setMaster("local")
    val sc = new SparkContext(conf)
    // HiveContext reads the Hive configuration that is visible to the application
    val hiveContext = new HiveContext(sc)
    import hiveContext.implicits._
    // Load a local text file into the existing Hive table "employee"
    hiveContext.sql("load data local inpath '/home/cloudera/Documents/emp_table.txt' into table employee")
    sc.stop()
  }
}

Below is my Eclipse error log:

16/11/18 22:09:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/11/18 22:09:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/11/18 22:09:06 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/11/18 22:09:06 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
**16/11/18 22:09:06 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY**
16/11/18 22:09:06 INFO ObjectStore: Initialized ObjectStore
16/11/18 22:09:06 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/11/18 22:09:06 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
16/11/18 22:09:07 INFO HiveMetaStore: Added admin role in metastore
16/11/18 22:09:07 INFO HiveMetaStore: Added public role in metastore
16/11/18 22:09:07 INFO HiveMetaStore: No user is added in admin role, since config is empty
16/11/18 22:09:07 INFO HiveMetaStore: 0: get_all_databases
16/11/18 22:09:07 INFO audit: ugi=cloudera  ip=unknown-ip-addr  cmd=get_all_databases   
16/11/18 22:09:07 INFO HiveMetaStore: 0: get_functions: db=default pat=*
16/11/18 22:09:07 INFO audit: ugi=cloudera  ip=unknown-ip-addr  cmd=get_functions: db=default pat=* 
16/11/18 22:09:07 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx------
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
    at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:194)
    at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:238)
    at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:218)
    at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:208)
    at org.apache.spark.sql.hive.HiveContext.functionRegistry$lzycompute(HiveContext.scala:462)
    at org.apache.spark.sql.hive.HiveContext.functionRegistry(HiveContext.scala:461)
    at org.apache.spark.sql.UDFRegistration.<init>(UDFRegistration.scala:40)
    at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:330)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:90)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101)
    at org.scala.spark.HiveToHdfs$.main(HiveToHdfs.scala:15)
    at org.scala.spark.HiveToHdfs.main(HiveToHdfs.scala)
Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx------
    at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:612)
    at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
    ... 12 more
16/11/18 22:09:07 INFO SparkContext: Invoking stop() from shutdown hook

Please let me know if any other information is needed to resolve this.

1 Solution

#1

Check this link: https://issues.apache.org/jira/browse/SPARK-15118. Your metastore might be using a MySQL DB.
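
Since the log reports "underlying DB is DERBY", it may be that hive-site.xml is not visible to the application when it is launched from Eclipse. Hive looks the file up on the classpath, and the Spark conf folder is not necessarily on the classpath of an IDE run. A minimal sketch for checking this, where the object name HiveSiteCheck and the helper locate are made up for illustration:

```scala
import java.net.URL

object HiveSiteCheck {
  // Hypothetical helper: returns the URL of a classpath resource,
  // or None when the resource is not visible to the current class loader.
  def locate(resource: String): Option[URL] =
    Option(Thread.currentThread().getContextClassLoader)
      .flatMap(loader => Option(loader.getResource(resource)))

  def main(args: Array[String]): Unit =
    locate("hive-site.xml") match {
      case Some(url) => println(s"hive-site.xml found at $url")
      case None      => println("hive-site.xml is not on the classpath")
    }
}
```

If it prints the second message, one option is to copy hive-site.xml into a folder that Eclipse puts on the project classpath (for example a resources folder) and run the job again.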

The scratch dir error in your log comes from this Hive property:

  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive</value>
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
  </property>

Give write permission on /tmp/hive.
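
One way to do that from code is through the Hadoop FileSystem API. This is only a sketch under a few assumptions: the object name FixScratchDirPermission is made up, the 733 mode is taken from the property description above, and the user running it must be allowed to change permissions on the directory. The filesystem it touches is whatever the Hadoop configuration on the classpath names as the default, so in a local IDE run it may be the local /tmp/hive rather than HDFS.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.fs.permission.FsPermission

object FixScratchDirPermission {
  // Mode from the hive.exec.scratchdir description ("write all (733)")
  val scratchPermission = new FsPermission("733")

  def main(args: Array[String]): Unit = {
    // Default filesystem as defined by the Hadoop config on the classpath
    val fs = FileSystem.get(new Configuration())
    val scratchDir = new Path("/tmp/hive")

    if (fs.exists(scratchDir)) {
      println(s"Before: ${fs.getFileStatus(scratchDir).getPermission}")
      fs.setPermission(scratchDir, scratchPermission)
      println(s"After:  ${fs.getFileStatus(scratchDir).getPermission}")
    } else {
      println(s"$scratchDir does not exist on ${fs.getUri}")
    }
  }
}
```

After the permission change, rerun the Spark job and check whether the scratch dir exception is gone.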
