I have Spark 1.4.1 (spark-1.4.1-bin-hadoop2.6) deployed in local mode, and I'm reading an input JSON file from HDFS. But SparkR's read.df method cannot load the data from HDFS.
1) "read.df" error message
data <- read.df("/data/sample.json") # input from HDFS
15/09/01 18:19:38 ERROR r.RBackendHandler: loadDF on org.apache.spark.sql.api.r.SQLUtils failed
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:142)
at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:74)
at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:36)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.NoSuchElementException: key not found: path
at scala.collection.MapLike$class.default(MapLike.scala:228)
at org.apache.spark.sql.sources.CaseInsensitiveMap.default(ddl.scala:467)
at scala.collection.MapLike$class.apply(MapLike.scala:141)
at org.apache.spark.sql.sources.CaseInsensitiveMap.apply(ddl.scala:467)
at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:273)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:114)
at org.apache.spark.sql.api.r.SQLUtils$.loadDF(SQLUtils.scala:147)
at org.apache.spark.sql.api.r.SQLUtils.loadDF(SQLUtils.scala)
... 25 more
Error: returnStatus == 0 is not TRUE
Thanks in advance.
2 Answers
#1
data <- read.json("/data/sample.json")
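Note that the context-free read.json(path) form is the newer SparkR API and is not available on Spark 1.4.1, the version in the question. A minimal sketch of that style, assuming Spark 2.0+ in local mode (the master and app name here are assumptions, the path is from the question):

library(SparkR)

# Spark 2.x style (assumption): start an implicit SparkSession first.
sparkR.session(master = "local[*]", appName = "read-json-example")

# Once the session exists, read.json takes only the path.
data <- read.json("/data/sample.json")
printSchema(data)
head(data)

sparkR.session.stop()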
#2
You need to pass the sqlContext as the first argument. In Spark 1.4.x, read.df expects the sqlContext, then the path, then the source format, so read.df("/data/sample.json") binds the path string to the sqlContext parameter and no path option ever reaches the data source, which is exactly the java.util.NoSuchElementException: key not found: path in your stack trace.
Example:
data <- read.df(sqlContext, "/data/sample.json", "json")
It worked for me with the dataframe.R example using people.json.
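For completeness, a minimal end-to-end sketch for the asker's setup (Spark 1.4.1, local mode); the HDFS path is from the question, while the master and app name are assumptions:

library(SparkR)

# Spark 1.4.x: create the Spark and SQL contexts explicitly.
sc <- sparkR.init(master = "local[*]", appName = "read-json-example")
sqlContext <- sparkRSQL.init(sc)

# read.df needs the sqlContext first, then the path and the source format.
data <- read.df(sqlContext, "/data/sample.json", "json")

printSchema(data)
head(data)

sparkR.stop()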