Configuring HA for the Spark cluster master node. A standalone-mode Spark cluster has a master-slave architecture; like HDFS 2.0's NameNode, the master can be given ZooKeeper-based HA with automatic active/standby failover.
1. Cluster environment: Spark 1.5.2 + ZooKeeper 3.4.5
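The ZooKeeper ensemble runs on the three cluster nodes spark1, spark2, and spark3 (see the spark-env.sh below). A minimal zoo.cfg sketch for such an ensemble, assuming default ports and an illustrative dataDir:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/software/zookeeper/data
clientPort=2181
server.1=spark1:2888:3888
server.2=spark2:2888:3888
server.3=spark3:2888:3888
Each node also needs a myid file under dataDir containing its own server number (1, 2, or 3).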
Start the ZooKeeper service on each node:
zkServer.sh start
Then check the status; the ensemble elects a leader automatically:
zkServer.sh status
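On a healthy three-node ensemble, exactly one node reports itself as the leader; expect output along these lines (the surrounding banner text varies by ZooKeeper version):
Mode: leader      (on exactly one node)
Mode: follower    (on the other two nodes)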
2. Configure spark-env.sh
Add the following line:
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=spark1:2181,spark2:2181,spark3:2181 -Dspark.deploy.zookeeper.dir=/spark"
This sets the recovery mode to ZOOKEEPER, lists the ZooKeeper quorum nodes, and names the znode directory under which Spark keeps its recovery state.
The complete spark-env.sh configuration:
export JAVA_HOME=/opt/software/spark/jdk1.7.0_71
#export SPARK_MASTER_IP=spark1
#export SPARK_MASTER_PORT=7077
export SPARK_WORKER_CORES=5
#export SPARK_WORKER_INSTANCES=1
export SPARK_EXECUTOR_INSTANCES=1
export SPARK_WORKER_MEMORY=3g
# ZooKeeper-based recovery: quorum nodes and the znode directory for Spark's state
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=spark1:2181,spark2:2181,spark3:2181 -Dspark.deploy.zookeeper.dir=/spark"
# MySQL JDBC driver on the daemon classpath (presumably for a MySQL-backed Hive metastore)
export SPARK_CLASSPATH=$SPARK_CLASSPATH:/opt/software/spark/myspark/spark-1.5.2-bin-2.6.0/lib/mysql-connector-java-5.1.38-bin.jar
Note that the lines that previously pinned the master (SPARK_MASTER_IP and SPARK_MASTER_PORT) must be commented out; with HA enabled, the active master is determined through ZooKeeper.
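The same spark-env.sh must be present on every node. A minimal sketch for pushing it out from spark1, using the installation directory from the classpath entry above:
scp /opt/software/spark/myspark/spark-1.5.2-bin-2.6.0/conf/spark-env.sh spark2:/opt/software/spark/myspark/spark-1.5.2-bin-2.6.0/conf/
scp /opt/software/spark/myspark/spark-1.5.2-bin-2.6.0/conf/spark-env.sh spark3:/opt/software/spark/myspark/spark-1.5.2-bin-2.6.0/conf/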
3. Start the cluster
On the spark1 node, start the whole cluster:
sbin/start-all.sh
spark1 is now the master in the ALIVE state.
On the spark2 node, start a standby master:
sbin/start-master.sh
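To confirm which master is active, check each master's web UI (default port 8080): spark1 should show Status: ALIVE and spark2 Status: STANDBY. A quick shell check, assuming the default UI port and that the status word appears in the page HTML:
curl -s http://spark1:8080 | grep -oE 'ALIVE|STANDBY'
curl -s http://spark2:8080 | grep -oE 'ALIVE|STANDBY'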
4. Test automatic active/standby failover
On the spark1 node, find and kill the master process:
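One way to do this, using the standard JDK jps tool (the standalone master runs as a JVM process named Master):
jps | grep Master
kill -9 <pid>    # <pid> is the process id printed by jps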
spark2 then takes over automatically as the ALIVE master.
Jobs that were already running are unaffected by the switchover: running applications automatically re-register with the new ALIVE master.
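You can watch the takeover in the standby master's log on spark2; when it wins the ZooKeeper leader election it logs that it has been elected leader and moves to the ALIVE state (the log file name below follows the usual pattern but depends on user and hostname):
tail -f /opt/software/spark/myspark/spark-1.5.2-bin-2.6.0/logs/spark-*-org.apache.spark.deploy.master.Master-*.out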
5. When submitting a job to the cluster, list both the active and standby master addresses in the master URL, for example:
/opt/modules/spark/bin/spark-sql --master spark://bdc20.hexun.com:7077,bdc220.hexun.com:7077 \
  --executor-memory 3g --total-executor-cores 24 --conf spark.ui.port=54689 --driver-memory 5g -e "
INSERT OVERWRITE TABLE st.stock_realtime_analysis PARTITION (DTYPE='02')
SELECT t1.stockId stockId,
       t1.url url,
       t1.clickcnt clickcnt,
       0,
       round((t1.clickcnt / (CASE WHEN t2.clickcnt IS NULL THEN 0 ELSE t2.clickcnt END) - 1) * 100, 2) LPcnt,
       '02' TYPE,
       '${today2}' analysis_date,
       '${nowtime}' analysis_time
FROM
  (SELECT stock_code stockId,
          concat('http://stockdata.stock.hexun.com/', stock_code, '.shtml') url,
          count(*) clickcnt
   FROM dms.tracklog_5min
   WHERE DAY = '${today}'
     AND stock_type = 'STOCK'
     AND datetime >= '${onehourage}'
   GROUP BY stock_code
   ORDER BY clickcnt DESC LIMIT 20) t1
LEFT JOIN
  (SELECT stock_code stockId,
          count(*) clickcnt
   FROM dms.tracklog_5min
   WHERE DAY = '${today}'
     AND stock_type = 'STOCK'
     AND datetime <= '${onehourage}'
     AND datetime >= '${twohourage}'
   GROUP BY stock_code) t2 ON t1.stockId = t2.stockId
ORDER BY clickcnt DESC LIMIT 20;"
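The same comma-separated master list works with any submission command; the driver tries the listed masters until it reaches the one that is ALIVE. A minimal spark-submit sketch against the spark1/spark2 masters configured earlier (the example jar name is illustrative and depends on your build):
bin/spark-submit --master spark://spark1:7077,spark2:7077 \
  --class org.apache.spark.examples.SparkPi \
  lib/spark-examples-1.5.2-hadoop2.6.0.jar 100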
6. Summary
The Spark master node achieves HA through ZooKeeper, and more and more distributed systems rely on ZooKeeper for master HA in the same way, e.g. HDFS 2.0 and HBase.
If you run Spark in YARN mode, you naturally rely on YARN's own high-availability mechanism instead.