I have a problem running a Spark application on a standalone cluster (I use Spark version 1.1.0). I successfully start the master with:
bash start-master.sh
Then I start one worker with:
bash spark-class org.apache.spark.deploy.worker.Worker spark://fujitsu11:7077
At the master's web UI:
http://localhost:8080
I can see that the master and the worker are running.
Then I run my application from Eclipse Luna. I successfully connect to the cluster with
JavaSparkContext sc = new JavaSparkContext("spark://fujitsu11:7077", "myapplication");
After that the application works, but when the program reaches the following code:
JavaRDD<Document> collectionRdd = sc.parallelize(list);
it crashes with the following error message:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 in stage 0.0 failed 4 times, most recent failure: Lost task 7.3 in stage 0.0 (TID 11, fujitsu11.inevm.ru):java.lang.ClassNotFoundException: maven.maven1.Document
java.net.URLClassLoader$1.run(URLClassLoader.java:366)
java.net.URLClassLoader$1.run(URLClassLoader.java:355)
java.security.AccessController.doPrivileged(Native Method)
java.net.URLClassLoader.findClass(URLClassLoader.java:354)
java.lang.ClassLoader.loadClass(ClassLoader.java:425)
java.lang.ClassLoader.loadClass(ClassLoader.java:358)
java.lang.Class.forName0(Native Method)
java.lang.Class.forName(Class.java:270)
org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:59)
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:500)
org.apache.spark.rdd.ParallelCollectionPartition.readObject(ParallelCollectionRDD.scala:74)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:606)
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:159)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:744)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
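The trace above bottoms out in plain Java serialization: the worker deserializes the partitions of the parallelized collection and must be able to load maven.maven1.Document from its own classpath. A minimal, JDK-only sketch of the same round-trip (Doc is a hypothetical stand-in for maven.maven1.Document, not a class from the original application):

```java
import java.io.*;

class SerDemo {
    // Hypothetical stand-in for maven.maven1.Document: anything put into an
    // RDD via parallelize() must implement Serializable, and its class file
    // must also be available on the executor's classpath.
    static class Doc implements Serializable {
        private static final long serialVersionUID = 1L;
        final String title;
        Doc(String title) { this.title = title; }
    }

    // Serialize an object to bytes, as Spark's JavaSerializer does for
    // the partitions of a parallelized collection.
    static byte[] serialize(Object o) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(o);
            }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Deserialize the bytes again. On the worker this is the step that
    // throws ClassNotFoundException when the application jar (containing
    // the class) was never shipped to the executors.
    static Doc deserialize(byte[] bytes) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Doc) ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }
}
```

In the failing setup the driver (Eclipse) serializes the list fine; it is only the worker-side deserialization step that has no maven.maven1.Document on its classpath, which is why shipping the application jar to the cluster fixes it.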
In the shell I found:
14/11/12 18:46:06 INFO ExecutorRunner: Launch command: "C:\PROGRA~1\Java\jdk1.7.0_51/bin/java" "-cp" ";;D:\spark\bin\..\conf;D:\spark\bin\..\lib\spark-assembly-1.1.0-hadoop1.0.4.jar;;D:\spark\bin\..\lib\datanucleus-api-jdo-3.2.1.jar;D:\spark\bin\..\lib\datanucleus-core-3.2.2.jar;D:\spark\bin\..\lib\datanucleus-rdbms-3.2.1.jar" "-XX:MaxPermSize=128m" "-Dspark.driver.port=50913" "-Xms512M" "-Xmx512M" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913/user/CoarseGrainedScheduler" "0" "fujitsu11.inevm.ru" "8" "akka.tcp://sparkWorker@fujitsu11.inevm.ru:50892/user/Worker" "app-20141112184605-0000"
14/11/12 18:46:40 INFO Worker: Asked to kill executor app-20141112184605-0000/0
14/11/12 18:46:40 INFO ExecutorRunner: Runner thread for executor app-20141112184605-0000/0 interrupted
14/11/12 18:46:40 INFO ExecutorRunner: Killing process!
14/11/12 18:46:40 INFO Worker: Executor app-20141112184605-0000/0 finished with state KILLED exitStatus 1
14/11/12 18:46:40 INFO LocalActorRef: Message [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying] from Actor[akka://sparkWorker/deadLetters] to Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkWorker%40192.168.3.5%3A50955-2#1066511138] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
14/11/12 18:46:40 INFO LocalActorRef: Message [akka.remote.transport.AssociationHandle$Disassociated] from Actor[akka://sparkWorker/deadLetters] to Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkWorker%40192.168.3.5%3A50955-2#1066511138] was not delivered. [2] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
14/11/12 18:46:41 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@fujitsu11.inevm.ru:50892] -> [akka.tcp://sparkExecutor@fujitsu11.inevm.ru:50954]: Error [Association failed with [akka.tcp://sparkExecutor@fujitsu11.inevm.ru:50954]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@fujitsu11.inevm.ru:50954]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: no further information: fujitsu11.inevm.ru/192.168.3.5:50954
]
14/11/12 18:46:42 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@fujitsu11.inevm.ru:50892] -> [akka.tcp://sparkExecutor@fujitsu11.inevm.ru:50954]: Error [Association failed with [akka.tcp://sparkExecutor@fujitsu11.inevm.ru:50954]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@fujitsu11.inevm.ru:50954]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: no further information: fujitsu11.inevm.ru/192.168.3.5:50954
]
14/11/12 18:46:43 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@fujitsu11.inevm.ru:50892] -> [akka.tcp://sparkExecutor@fujitsu11.inevm.ru:50954]: Error [Association failed with [akka.tcp://sparkExecutor@fujitsu11.inevm.ru:50954]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@fujitsu11.inevm.ru:50954]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: no further information: fujitsu11.inevm.ru/192.168.3.5:50954
]
In the master's logs:
14/11/12 18:46:41 ERROR EndpointWriter: AssociationError [akka.tcp://sparkMaster@fujitsu11:7077] -> [akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913]: Error [Association failed with [akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: no further information: fujitsu11.inevm.ru/192.168.3.5:50913
]
14/11/12 18:46:42 INFO Master: akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913 got disassociated, removing it.
14/11/12 18:46:42 ERROR EndpointWriter: AssociationError [akka.tcp://sparkMaster@fujitsu11:7077] -> [akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913]: Error [Association failed with [akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: no further information: fujitsu11.inevm.ru/192.168.3.5:50913
]
14/11/12 18:46:43 ERROR EndpointWriter: AssociationError [akka.tcp://sparkMaster@fujitsu11:7077] -> [akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913]: Error [Association failed with [akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: no further information: fujitsu11.inevm.ru/192.168.3.5:50913
]
I googled a lot but I have no idea what's wrong... I found a somewhat similar discussion here:
https://github.com/datastax/spark-cassandra-connector/issues/187
But it doesn't solve my problem...
Does anybody know what's wrong?
Thank you.
2 Answers
#1
2
Found a way to run it using the IDE / Maven:
- Create a fat jar (one which includes all dependencies). Use the Shade plugin for this. Example pom:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.2</version>
  <configuration>
    <filters>
      <filter>
        <artifact>*:*</artifact>
        <excludes>
          <exclude>META-INF/*.SF</exclude>
          <exclude>META-INF/*.DSA</exclude>
          <exclude>META-INF/*.RSA</exclude>
        </excludes>
      </filter>
    </filters>
  </configuration>
  <executions>
    <execution>
      <id>job-driver-jar</id>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <shadedArtifactAttached>true</shadedArtifactAttached>
        <shadedClassifierName>driver</shadedClassifierName>
        <transformers>
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
          <!-- Some care is required: http://doc.akka.io/docs/akka/snapshot/general/configuration.html -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
            <resource>reference.conf</resource>
          </transformer>
          <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
            <mainClass>mainClass</mainClass>
          </transformer>
        </transformers>
      </configuration>
    </execution>
    <execution>
      <id>worker-library-jar</id>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <shadedArtifactAttached>true</shadedArtifactAttached>
        <shadedClassifierName>worker</shadedClassifierName>
        <transformers>
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>
- Now we have to ship the compiled jar file to the cluster. For this, specify the jar file in the Spark config like this:
SparkConf conf = new SparkConf().setAppName("appName").setMaster("spark://machineName:7077").setJars(new String[] {"target/appName-1.0-SNAPSHOT-driver.jar"});
- Run mvn clean package to create the jar file. It will be created in your target folder.
- Run using your IDE, or with the Maven command:
mvn exec:java -Dexec.mainClass="className"
This does not require spark-submit. Just remember to package the file before running.
If you don't want to hardcode the jar path, you can do this:
- In the config, write:
SparkConf conf = new SparkConf() .setAppName("appName") .setMaster("spark://machineName:7077") .setJars(JavaSparkContext.jarOfClass(this.getClass()));
- Create the fat jar (as above) and, after running the package command, run it with:
java -jar target/application-1.0-SNAPSHOT-driver.jar
This will use the jar that the class was loaded from.
#2
4
For the benefit of others running into this problem:
I faced an identical issue due to a mismatch between the Spark connector and the Spark version being used. Spark was 1.3.1 and the connector was 1.3.0, and an identical error message appeared:
org.apache.spark.SparkException: Job aborted due to stage failure:
Task 2 in stage 0.0 failed 4 times, most recent failure: Lost
task 2.3 in stage 0.0
Updating the dependency in SBT solved the problem.
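As a hedged illustration (the exact artifact coordinates are assumed here, not taken from the original answer), the fix in build.sbt amounts to keeping the connector's version line in step with the Spark version:

```scala
// build.sbt -- hypothetical coordinates; the point is only that the
// connector version must match the Spark version in use.
libraryDependencies ++= Seq(
  "org.apache.spark"   %% "spark-core"                % "1.3.1",
  // was 1.3.0, which produced the same stage-failure error as above
  "com.datastax.spark" %% "spark-cassandra-connector" % "1.3.1"
)
```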