Spark: Association failed with [akka.tcp://sparkMaster@ip:7077]

Date: 2024-09-17 11:05:50

Today I set up a Spark cluster. conf/spark-env.sh is configured as follows:

export SPARK_MASTER_IP=master
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_MEMORY=1024m

export SCALA_HOME=/usr/local/scala-2.11.2
export JAVA_HOME=/usr/local/jdk1.7.0_60

sbin/start-all.sh

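After startup, jps showed a Worker process on every slave machine, but the Workers list at http://master:8080 was empty, so I checked the logs.

They reported the following error (worker registration failed):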

15/01/08 07:46:54 INFO Worker: Connecting to master spark://ip:7077...
15/01/08 07:46:54 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkMaster@ip:7077] has failed, address is now gated for [5000] ms. Reason is: [Association failed with [akka.tcp://sparkMaster@ip:7077]].
15/01/08 07:46:54 INFO RemoteActorRefProvider$RemoteDeadLetterActorRef: Message [org.apache.spark.deploy.DeployMessages$RegisterWorker] from Actor[akka://sparkWorker/user/Worker#1215447184] to Actor[akka://sparkWorker/deadLetters] was not delivered. [12] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
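When a worker log is long, it can help to pull out just the master addresses that the worker failed to associate with. The following is a minimal sketch in Python, not part of Spark itself; the function name failed_associations is made up for illustration, and the pattern is based only on the WARN line shown above, so other Spark or Akka versions may word the message differently.

```python
import re

# Pattern taken from the WARN line in the log above (assumption: other
# Spark/Akka versions may phrase this message differently).
ASSOCIATION_FAILED = re.compile(
    r"Association with remote system \[(akka\.tcp://[^\]]+)\] has failed"
)


def failed_associations(lines):
    """Return the remote addresses named in 'Association ... has failed'
    warnings, without duplicates, in order of first appearance."""
    seen = []
    for line in lines:
        match = ASSOCIATION_FAILED.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

Passing an open worker log file to failed_associations yields each address the worker could not reach, which shows at a glance which host and port to test.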

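I looked around and found advice saying that both SPARK_MASTER_IP=ip and MASTER=spark://ip:7077 must be set, otherwise the slaves cannot register with the master. It still did not work after I set them. After reading many more resources and trying many more fixes, the cause turned out to be the firewall. That was careless of me, and it cost a lot of time for nothing. The lesson: for a concrete problem, read your own logs first.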
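A quick way to rule the network in or out before touching the Spark configuration is to test, from each slave, whether a plain TCP connection to the master port can be opened at all. The following is a minimal sketch in Python; the helper name can_connect is made up for illustration, and the host name master and port 7077 are the values from the spark-env.sh above.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Running can_connect("master", 7077) on a slave while the master process is up should return True. If it returns False, the likely suspects are a firewall, routing, or host name resolution rather than the SPARK_MASTER_IP or MASTER settings.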