Spark Cluster Installation and Deployment

Date: 2021-05-18 08:11:09

Installation and deployment environment:

First, configure passwordless SSH login between the nodes:

  ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

  cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Append the contents of this machine's id_dsa.pub to the authorized_keys file on every other machine, for example:
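
ssh-copy-id does the append in one step. A minimal sketch, assuming the trs user implied by the /home/trs paths in this guide; note that recent OpenSSH releases disable DSA keys by default, in which case generate an RSA key with -t rsa instead:

# push the public key to each of the other nodes (repeat per hostname)
ssh-copy-id -i ~/.ssh/id_dsa.pub trs@mesos-slave-5
ssh-copy-id -i ~/.ssh/id_dsa.pub trs@mesos-slave-7

# verify: this should log in without a password prompt
ssh trs@mesos-slave-5 hostname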

 

I. Installing Hadoop

1. Unpack the Hadoop tarball: tar -zxvf hadoop-2.5.0-cdh5.3.6.tar.gz

2. Configure environment variables with vi /etc/profile:

export JAVA_HOME=/usr/java/default

export JRE_HOME=/usr/java/default/jre

export HADOOP_HOME=/home/trs/hadoop

export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

3. Apply the changes: source /etc/profile
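
A quick sanity check that the variables took effect (expected values follow the settings above):

echo $HADOOP_HOME    # should print /home/trs/hadoop
java -version        # should report the JDK under /usr/java/default
hadoop version       # should print 2.5.0-cdh5.3.6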

4. Edit hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>mesos-slave-7:50090</value>
  </property>
  <!-- dfs.name.dir and dfs.data.dir are the legacy key names; on Hadoop 2.x
       the preferred keys are dfs.namenode.name.dir and dfs.datanode.data.dir,
       but the legacy names still work and only log a deprecation warning -->
  <property>
    <name>dfs.name.dir</name>
    <value>/home/trs/hadoop/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/trs/hadoop/hdfs/data</value>
  </property>
</configuration>
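
The name and data directories referenced above must exist and be writable by the Hadoop user on every node before HDFS is started:

mkdir -p /home/trs/hadoop/hdfs/name /home/trs/hadoop/hdfs/data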

5. Edit mapred-site.xml:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>mesos-slave-5:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>mesos-slave-5:19888</value>
  </property>
</configuration>
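
The two jobhistory addresses only take effect once the JobHistory server is actually running on mesos-slave-5; in Hadoop 2.x it is started from $HADOOP_HOME with:

sbin/mr-jobhistory-daemon.sh start historyserver
# web UI: http://mesos-slave-5:19888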

6. Edit yarn-site.xml:

<configuration>

<!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value><!-- custom shuffle service on the NodeManager -->
  </property>
  <property>
    <!-- note: this property is normally placed in mapred-site.xml -->
    <name>yarn.app.mapreduce.am.env</name>
    <value>LD_LIBRARY_PATH=$HADOOP_HOME/lib/native</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value><!-- implementation of the shuffle service -->
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>mesos-slave-5:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>mesos-slave-5:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>mesos-slave-5:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>mesos-slave-5:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>mesos-slave-5:8088</value>
  </property>
</configuration>
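
YARN is started separately from HDFS (HDFS itself is started in step 10 below); once the configuration has been distributed, run from $HADOOP_HOME:

sbin/start-yarn.sh
# ResourceManager web UI: http://mesos-slave-5:8088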

7. Edit the slaves file, listing the hostnames of all worker machines, one per line:

mesos-slave-7

mesos-slave-5
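
Every mesos-slave-* name used in these files must resolve on every node; without DNS, add entries to /etc/hosts on each machine (the IP addresses below are placeholders, substitute your own):

192.168.1.5    mesos-slave-5
192.168.1.6    mesos-slave-6
192.168.1.7    mesos-slave-7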

8. Configure hadoop-env.sh:

export JAVA_HOME=/usr/java/default    # path to the JDK

9. Copy the configured files to the other machines, for example:
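
A minimal sketch; the trs user and /home/trs/hadoop path follow the settings above:

# push the whole Hadoop directory (including etc/hadoop configs) to each node
scp -r /home/trs/hadoop trs@mesos-slave-5:/home/trs/
scp -r /home/trs/hadoop trs@mesos-slave-7:/home/trs/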

10. Format the NameNode, then start HDFS:

bin/hdfs namenode -format

sbin/start-dfs.sh

11. Open http://ip:50070 in a browser to reach the NameNode web UI.
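
Besides the web UI, running jps on each node is a quick way to confirm the daemons came up (the expected processes below follow the layout configured in this guide):

jps
# on the NameNode host, expect: NameNode
# on mesos-slave-7, additionally: SecondaryNameNode
# on each host listed in slaves, expect: DataNode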

II. Installing and Deploying Spark

1. Install Scala and configure its environment variables.

2. Go to the Spark conf directory and create spark-env.sh:

cp spark-env.sh.template spark-env.sh    # copy from the template

vim spark-env.sh    # add the following settings

export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

export SPARK_MASTER_IP=mesos-slave-6

export SPARK_EXECUTOR_MEMORY=8g

export SPARK_WORKER_MEMORY=16g

export SPARK_DRIVER_MEMORY=8g

export SPARK_WORKER_CORES=4

export JAVA_HOME=/usr/java/default

export JRE_HOME=/usr/java/default/jre

export HADOOP_HOME=/home/trs/hadoop

export SCALA_HOME=/usr/scala

3. Edit the slaves file (vi slaves) to list the worker nodes, then distribute Spark to them as sketched after the list:

mesos-slave-5

mesos-slave-7
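
Every worker needs the same Spark installation and configuration. A minimal sketch using rsync; the /home/trs/spark path and trs user are assumptions, adjust to your layout:

rsync -a /home/trs/spark/ trs@mesos-slave-5:/home/trs/spark/
rsync -a /home/trs/spark/ trs@mesos-slave-7:/home/trs/spark/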

4. Start Spark with sbin/start-all.sh from the Spark home directory (use the sbin/ prefix so it is not confused with Hadoop's start-all.sh).

5. Open http://ip:8080 to reach the Spark master web UI.
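
As a final end-to-end check, submit the bundled SparkPi example to the standalone master. The spark:// URL follows SPARK_MASTER_IP above; the location of the examples jar varies between Spark releases, so the lib/ path below is an assumption:

bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://mesos-slave-6:7077 \
  lib/spark-examples-*.jar 100
# a line such as "Pi is roughly 3.14..." in the output means the cluster is healthy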