I had clearly configured the MySQL data source connection in spark-defaults.conf.
Then I started spark-shell and ran the following test code:
import org.apache.spark.{SparkContext, SparkConf}
import org.apache.spark.sql.{SaveMode, DataFrame}
import org.apache.spark.sql.hive.HiveContext

val mySQLUrl = "jdbc:mysql://localhost:3306/yangsy?user=root&password=yangsiyi"

val people_DDL = s"""
CREATE TEMPORARY TABLE PEOPLE
USING org.apache.spark.sql.jdbc
OPTIONS (
  url '${mySQLUrl}',
  dbtable 'person'
)""".stripMargin

sqlContext.sql(people_DDL)

val person = sqlContext.sql("SELECT * FROM PEOPLE").cache()
val name = "name"
// Note: the string value must be quoted in the filter expression,
// otherwise "name = name" compares the column with itself.
val targets = person.filter("name = '" + name + "'").collect()
The collect() call failed with a "driver not found" error.
This struck me as really odd. The data source connection itself can't be wrong; after all, the Hive metastore uses the very same one. In the end the only thing that worked was passing the jar in when starting spark-shell:
./spark-shell --jars /usr/local/spark-1.4.0-bin-2.5.0-cdh5.2.1/lib/mysql-connector-java-5.1.30-bin.jar
After that, the same code ran fine. Strange.
Alternatively, adding the MySQL jar before calling collect() also works:
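One likely explanation: the JDBC URL in spark-defaults.conf only tells Spark where the database lives; it does not put the MySQL connector jar on the driver or executor classpath, which is what collect() actually needs. A sketch of what could be added to spark-defaults.conf instead of passing --jars every time (the jar path is the one from the command above; adjust to your install):

spark.driver.extraClassPath   /usr/local/spark-1.4.0-bin-2.5.0-cdh5.2.1/lib/mysql-connector-java-5.1.30-bin.jar
spark.executor.extraClassPath /usr/local/spark-1.4.0-bin-2.5.0-cdh5.2.1/lib/mysql-connector-java-5.1.30-bin.jar

With these set, spark-shell should pick up the driver on startup without any extra flags.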
sqlContext.sql("add jar /usr/local/spark-1.4.0-bin-2.5.0-cdh5.2.1/lib/mysql-connector-java-5.1.30-bin.jar")
Still, neither workaround feels right. If anyone knows the proper fix, please advise!
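Another workaround worth trying is a sketch that skips the temporary-table DDL and loads the table through the DataFrameReader API (available since Spark 1.4), passing the driver class explicitly via the "driver" option so the class gets registered with DriverManager on the driver side:

```scala
// Sketch, assuming the same mySQLUrl as above and the MySQL connector
// jar visible on the classpath. The explicit "driver" option forces
// com.mysql.jdbc.Driver to be loaded, which can avoid the
// "driver not found" error at collect() time.
val person = sqlContext.read
  .format("jdbc")
  .option("url", mySQLUrl)
  .option("dbtable", "person")
  .option("driver", "com.mysql.jdbc.Driver")
  .load()
  .cache()

val targets = person.filter("name = 'name'").collect()
```

This keeps the driver-class requirement explicit in the code rather than relying on what happens to be on the shell's classpath.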