Error running a job on Spark 1.4.0 with the Jackson module using ScalaObjectMapper

Time: 2021-09-28 18:03:14

I'm running a Spark job written in Scala 2.10.4 on a Spark 1.4.0 cluster (based on HDFS and managed with YARN), using Jackson modules version 2.6.1 from the Maven repository.
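
For reference, the dependency would be declared along these lines (a hedged sbt sketch; the actual build file is not shown, and the artifact name is inferred from the package names in the stack trace below):

// build.sbt (hypothetical): pulls in DefaultScalaModule and ScalaObjectMapper
libraryDependencies += "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.6.1"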

When running the code locally from my IDE (IntelliJ IDEA 14), everything works on the in-memory cluster, but when running the job on my remote cluster (an EMR cluster in an AWS VPC) I get the following exception:

java.lang.AbstractMethodError: com.company.scala.framework.utils.JsonParser$$anon$1.com$fasterxml$jackson$module$scala$experimental$ScalaObjectMapper$_setter_$com$fasterxml$jackson$module$scala$experimental$ScalaObjectMapper$$typeCache_$eq(Lorg/spark-project/guava/cache/LoadingCache;)V
    at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper$class.$init$(ScalaObjectMapper.scala:50)
    at com.company.scala.framework.utils.JsonParser$$anon$1.<init>(JsonParser.scala:14)
    at com.company.scala.framework.utils.JsonParser$.<init>(JsonParser.scala:14)
    at com.company.scala.framework.utils.JsonParser$.<clinit>(JsonParser.scala)
    at com.company.migration.Migration$.printAllKeys(Migration.scala:21)
    at com.company.migration.Main$$anonfun$main$1.apply(Main.scala:22)
    at com.company.migration.Main$$anonfun$main$1.apply(Main.scala:22)
    at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:199)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:56)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:70)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
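
For context, the JsonParser$$anon$1 class in the trace suggests the mapper is built by mixing the ScalaObjectMapper trait into ObjectMapper, roughly like the following (a hypothetical reconstruction; the actual JsonParser source is not shown):

import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper

object JsonParser {
  // "new ObjectMapper with ScalaObjectMapper" compiles to an anonymous
  // subclass (the JsonParser$$anon$1 in the stack trace). The trait's
  // initializer typically throws AbstractMethodError when the
  // jackson-module-scala classes loaded on the cluster don't match the
  // ones the job was compiled against.
  val mapper = new ObjectMapper() with ScalaObjectMapper
  mapper.registerModule(DefaultScalaModule)
}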

I searched the web for this exception with no luck. I also looked for a similar question here and found just one thread with no accepted answer, and none of the answers there helped me.

Hope to find help here,

Thanks.

1 solution

#1

I'm answering my own question here for the benefit of other users who run into this.

I stopped using the ScalaObjectMapper and started working with the regular ObjectMapper.

import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule
val jacksonMapper = new ObjectMapper()
jacksonMapper.registerModule(DefaultScalaModule)

And it works fine for the time being. Attaching piggybox's helpful comment:

The only difference in the code is using classOf[...] to specify the target type for readValue as the second parameter.

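For illustration, a hedged usage sketch with the plain ObjectMapper above (Person and the JSON string are hypothetical, not from the original code):

case class Person(name: String, age: Int)  // hypothetical payload type

val json = """{"name":"Ada","age":36}"""
// ScalaObjectMapper's mapper.readValue[Person](json) becomes:
val person = jacksonMapper.readValue(json, classOf[Person])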
