Cluster setup with hadoop-2.6.0.tar.gz + spark-1.5.2-bin-hadoop2.6.tgz (works for both 3-node and 5-node clusters)

Date: 2022-09-13 14:34:49

  I put a great deal of effort into this post, and it has been repeatedly polished and revised over quite some time. Thanks to the bloggers whose posts I referenced! Readers who find it useful are welcome to bookmark and repost it, with a link back to this page: http://www.cnblogs.com/zlslch/p/5846091.html

  Referenced links:

    http://my.oschina.net/amui/blog/610288

    http://my.oschina.net/amui/blog/610329

    http://blog.csdn.net/u010270403/article/details/51446674

    http://blog.csdn.net/stark_summer/article/details/42424279

    http://blog.csdn.net/dai451954706/article/details/46966165

A few questions and a few tips!

a. NAT, bridged, or host-only mode?

Answer: host-only, bridged, or NAT

b. A static IP, or DHCP?

Answer: static

c. Don't think snapshots and clones are unimportant. They are small tricks, but used more flexibly than most people do they save a lot of time and greatly cut down on mistakes.

d. Make good use of scripting languages such as Python or shell.

 For example, use the scp -r command, or deploy.conf (a configuration file), deploy.sh (a shell script that copies files out to the nodes), and runRemoteCmd.sh (a shell script that runs commands on the remote nodes); a rough sketch follows.
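
 The scripts named above are the author's own; as a rough illustration of the idea only, a minimal deploy.sh that copies a path to every host listed in deploy.conf might look like this (the one-hostname-per-line format of deploy.conf is an assumption):

#!/bin/bash
# deploy.sh <local_path> <remote_path>: copy a file or directory to every host in deploy.conf
src=$1
dest=$2
while read -r host; do
  [ -z "$host" ] && continue            # skip blank lines
  scp -r "$src" root@"$host":"$dest"
done < deploy.conf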

e. The VMware Tools enhancements matter; alternatively, use rz to upload and sz to download.
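
 For example (rz and sz come from the lrzsz package, which this guide also installs later; they need an SSH client that speaks ZMODEM, such as Xshell or SecureCRT):

sudo apt-get install lrzsz        # provides the rz and sz commands
rz                                # pick a local file in the client's dialog and upload it
sz jdk-8u60-linux-x64.tar.gz      # send a file from the server back to the local machine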

f. What most people commonly use:


Xmanager Enterprise * installation steps

  What you will need:

  1、VMware-workstation-full-11.1.2.61471.1437365244.exe

  2、ubuntukylin-14.04-desktop-amd64.iso

  3、jdk-8u60-linux-x64.tar.gz

  4、hadoop-2.6.0.tar.gz

  5、scala-2.10.4.tgz

  6、spark-1.5.2-bin-hadoop2.6.tgz

  

  Machine plan:

  192.168.80.31   ----------------  SparkMaster (about the hostname: do not use Spark_Master, or the final format/startup will fail; I have tried it myself!)

  192.168.80.32   ----------------  SparkWorker1

  192.168.80.33   ----------------  SparkWorker2

  

  Directory plan:

  1. Download directory

   /root/Downloads/Spark_Cluster_Software  ----------------    holds all the installation packages


  2. New directories

  

  3. Installation directories

  jdk-8u60-linux-x64.tar.gz  --------------------------------------------------  /usr/local/jdk/jdk1.8.0_60

  hadoop-2.6.0.tar.gz ----------------------------------------------------------  /usr/local/hadoop/hadoop-2.6.0

  scala-2.10.4.tgz --------------------------------------------------------------- /usr/local/scala/scala-2.10.4

  spark-1.5.2-bin-hadoop2.6.tgz ---------------------------------------------- /usr/local/spark/spark-1.5.2-bin-hadoop2.6

  4. Snapshot steps

  Snapshot 1:

    Freshly installed, and able to reach the network

Snapshot 2:

    Enable the root user, install the vim editor, install ssh, set a static IP, configure /etc/hostname and /etc/hosts, and permanently disable the firewall

     The passwordless SSH configuration comes later, after SSH is installed
       The static IP is 192.168.80.31
          /etc/hostname is SparkMaster
          /etc/hosts is
          192.168.80.31 SparkMaster
          192.168.80.32 SparkWorker1
          192.168.80.33 SparkWorker2

  Snapshot 3:

       Install the JDK, install Scala, configure passwordless SSH login

          1. Passwordless SSH for each node to itself (SparkMaster, SparkWorker1, SparkWorker2)
          2. Passwordless SSH between SparkWorker1 and SparkMaster, and between SparkWorker2 and SparkMaster
          3. Passwordless SSH between SparkWorker1 and SparkWorker2

Snapshot 4:

Install hadoop (not yet formatted), install lrzsz, replace the default configuration files with the ones written in advance, and create the needed directories

  

  Snapshot 5:

    hadoop formatted successfully and all processes starting normally

  Snapshot 6:

    Spark installation and configuration finished

  Snapshot 7:

    hadoop and spark clusters started successfully; check the 50070, 8088, 8080 and 4040 web pages

 Step 1:

    Install VMware Workstation; I use VMware Workstation 11 here.

       For details, see ->

Downloading VMware Workstation 11

Installing VMware Workstation 11

Some configuration after installing VMware Workstation 11

 Step 2:

    Install the ubuntukylin-14.04-desktop system (preferably install the English-language system)

    For details, see ->

The release history of the various Ubuntu versions

Ubuntukylin-14.04-desktop installation explained step by step (with partitioning)

Ubuntukylin-14.04-desktop installation explained step by step (without partitioning)


Step 3: Installing the VMware Tools enhancements

    For details, see ->

Illustrated guide to installing VMware Tools for Ubuntukylin-14.04-desktop under VMware

  Step 4: Small preparatory changes (learn to use snapshots and clones, and take snapshots at sensible points according to your own needs)

    For details, see ->

CentOS common commands, snapshots and clones explained

Removing OpenJDK and installing the Sun JDK and its environment variables on CentOS 6.5

Fixing: E: Package 'Vim' has no installation candidate

Fixing resolv.conf being emptied after every reboot of an Ubuntu system

Creating and deleting user groups, users and passwords (applies to CentOS and Ubuntu)

    1. Enable the root user (on Ubuntu there is no usable root account by default after installation)

    2. Install the vim editor

    3. Install ssh (the passwordless configuration comes later, after SSH is installed)

    4. Set a static IP (a sketch follows below, after these items)

Some configuration after installing Ubuntu 14.04

    5. /etc/hostname and /etc/hosts

root@zhouls-virtual-machine:~# hostname
zhouls-virtual-machine
root@zhouls-virtual-machine:~# sudo vim /etc/hostname
root@zhouls-virtual-machine:~# sudo cat /etc/hostname
SparkMaster
root@zhouls-virtual-machine:~# sudo vim /etc/hosts
root@zhouls-virtual-machine:~# sudo cat /etc/hosts
127.0.0.1 localhost
127.0.1.1 zhouls-virtual-machine

192.168.80.31 SparkMaster
192.168.80.32 SparkWorker1
192.168.80.33 SparkWorker2

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
root@zhouls-virtual-machine:~#


    6. Permanently disable the firewall

        Generally, when building a hadoop/spark cluster it is best to disable the firewall permanently, because otherwise it can block communication between the cluster's processes.

root@SparkMaster:~# sudo ufw status
Status: inactive
root@SparkMaster:~#


        This shows that Ubuntu 14.04 does not enable the firewall by default.
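
        If it were enabled, a sketch of turning it off permanently with ufw would be:

sudo ufw disable    # disables the firewall and keeps it disabled across reboots
sudo ufw status     # should now report: Status: inactive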


    Do the same on all three machines!
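
  For item 4 above (the static IP), one common sketch on Ubuntu 14.04 is to edit /etc/network/interfaces; the interface name eth0 and the gateway/DNS values below are assumptions, and a desktop install managed by NetworkManager may need the same settings made through the GUI instead. Shown for SparkMaster; adjust the address per machine.

# /etc/network/interfaces  (then restart networking, or reboot)
auto eth0
iface eth0 inet static
    address 192.168.80.31
    netmask 255.255.255.0
    gateway 192.168.80.2          # assumption: use your own VMware NAT/bridge gateway
    dns-nameservers 192.168.80.2  # assumption: use your own DNS server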

Pre-installation plan:

  ***********************************************************************

  *              Programming languages  ->  hadoop cluster  ->  spark cluster

  *  1. Install the JDK

  *  2. Install Scala

  *  3. Configure passwordless SSH (SparkMaster, SparkWorker1 and SparkWorker2, each to itself)

  *     Configure passwordless SSH (SparkWorker1 with SparkMaster, SparkWorker2 with SparkMaster)

  *     Configure passwordless SSH (SparkWorker1 with SparkWorker2)

  *  4. Install hadoop

  *  5. Install spark

  *  6. Start the cluster

  *  7. Check the web pages

  *  8. Done (remember to take a snapshot)

  ***********************************************************************

  

  Get into the habit of downloading the packages online with wget and putting them under /root/Downloads/Spark_Cluster_Software/; or, with the VMware Tools enhancements installed, you can also simply drag the files in. Either way works.
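
  A sketch of the wget route (the Hadoop and Scala URLs are the download pages cited later in this post; the Spark URL is an assumption based on the usual Apache archive layout):

mkdir -p /root/Downloads/Spark_Cluster_Software
cd /root/Downloads/Spark_Cluster_Software
wget http://archive.apache.org/dist/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
wget http://www.scala-lang.org/files/archive/scala-2.10.4.tgz
# assumed location; adjust if the mirror layout differs
wget http://archive.apache.org/dist/spark/spark-1.5.2/spark-1.5.2-bin-hadoop2.6.tgz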


Part One: Installing the JDK

  jdk-8u60-linux-x64.tar.gz  --------------------------------------------------  /usr/local/jdk/jdk1.8.0_60

    1. Downloading jdk-8u60-linux-x64.tar.gz

    Download: http://download.csdn.net/download/aqtata/9022063


    2. Uploading jdk-8u60-linux-x64.tar.gz


  Do the same on all three machines!

  

    3. First, check whether the Ubuntu system has a bundled OpenJDK


root@SparkMaster:~# java -version
The program 'java' can be found in the following packages:
* default-jre
* gcj-4.8-jre-headless
* openjdk-7-jre-headless
* gcj-4.6-jre-headless
* openjdk-6-jre-headless
Try: apt-get install <selected package>
root@SparkMaster:~# sudo apt-get purge openjdk*
Reading package lists... Done
Building dependency tree
Reading state information... Done
Note, selecting 'openjdk-jre' for regex 'openjdk*'

    As you can see, this Ubuntu system does not come with OpenJDK installed.

  4. Now create the jdk directory under /usr/local/


  mkdir -p /usr/local/jdk

  5. Copy the downloaded JDK file into the newly created /usr/local/jdk


  sudo cp  /root/Downloads/Spark_Cluster_Software/jdk-8u60-linux-x64.tar.gz  /usr/local/jdk

  Prefer cp here; don't use mv casually.


root@SparkMaster:~/Downloads/Spark_Cluster_Software# cd /usr/local/jdk
root@SparkMaster:/usr/local/jdk# ls

  6. Unpack the JDK file


  tar -zxvf jdk-8u60-linux-x64.tar.gz

  7. Delete the tarball and keep only the unpacked directory

  rm -rf jdk-8u60-linux-x64.tar.gz

  Change the owner and group (the unpacked files show up as belonging to uucp):

  chown -R root:root  jdk1.8.0_60/


  8. Edit the environment variables

  vim ~/.bash_profile   or   vim /etc/profile

  You can put the settings in ~/.bash_profile, or in the global file /etc/profile; either works.

  Here I use vim /etc/profile


  #java

  export JAVA_HOME=/usr/local/jdk/jdk1.8.0_60

  export JRE_HOME=$JAVA_HOME/jre

  export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib

  export PATH=$PATH:$JAVA_HOME/bin


  root@SparkMaster:/usr/local/jdk# vim /etc/profile
  root@SparkMaster:/usr/local/jdk# source /etc/profile
  root@SparkMaster:/usr/local/jdk# java -version
  java version "1.8.0_60"
  Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
  Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)
  root@SparkMaster:/usr/local/jdk#

  At this point, the Java installation is complete.

  Do the same on the other two machines!
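
  If you would rather not open an editor on each machine, one option (a sketch, using exactly the lines shown above) is to append them and reload the profile:

cat >> /etc/profile <<'EOF'
#java
export JAVA_HOME=/usr/local/jdk/jdk1.8.0_60
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin
EOF
source /etc/profile
java -version    # should report 1.8.0_60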

  

Part Two: Installing Scala

    scala-2.10.4.tgz --------------------------------------------------------------- /usr/local/scala/scala-2.10.4

  1. Downloading Scala

  http://www.scala-lang.org/files/archive/


  2. Uploading scala-2.10.4.tgz


  Do the same on the other two machines!

  3. Now create the scala directory under /usr/local/


    mkdir -p /usr/local/scala

    4. Copy the downloaded Scala file into the newly created /usr/local/scala


  sudo cp /root/Downloads/Spark_Cluster_Software/scala-2.10.4.tgz  /usr/local/scala


  Prefer cp here; don't use mv casually.

    5. Unpack the Scala file


  root@SparkMaster:/usr/local/scala# tar -zxvf scala-2.10.4.tgz

    6. Delete the tarball and keep only the unpacked directory

root@SparkMaster:/usr/local/scala# ls
scala-2.10.4 scala-2.10.4.tgz
root@SparkMaster:/usr/local/scala# rm -rf scala-2.10.4.tgz
root@SparkMaster:/usr/local/scala# ls
scala-2.10.4
root@SparkMaster:/usr/local/scala# ll
total 12
drwxr-xr-x 3 root root 4096 9月 7 08:49 ./
drwxr-xr-x 12 root root 4096 9月 7 08:36 ../
drwxrwxr-x 9 2000 2000 4096 3月 18 2014 scala-2.10.4/
root@SparkMaster:/usr/local/scala# chown -R root:root scala-2.10.4/
root@SparkMaster:/usr/local/scala# ll
total 12
drwxr-xr-x 3 root root 4096 9月 7 08:49 ./
drwxr-xr-x 12 root root 4096 9月 7 08:36 ../
drwxrwxr-x 9 root root 4096 3月 18 2014 scala-2.10.4/
root@SparkMaster:/usr/local/scala#


  7. Edit the environment variables

  vim ~/.bash_profile   or   vim /etc/profile

  You can put the settings in ~/.bash_profile, or in the global file /etc/profile; either works.

  Here I use vim /etc/profile


  #scala
  export SCALA_HOME=/usr/local/scala/scala-2.10.4
  export PATH=$PATH:$SCALA_HOME/bin


  root@SparkMaster:/usr/local/scala# vim /etc/profile
  root@SparkMaster:/usr/local/scala# source /etc/profile
  root@SparkMaster:/usr/local/scala# scala -version
  Scala code runner version 2.10.4 -- Copyright 2002-2013, LAMP/EPFL
  root@SparkMaster:/usr/local/scala#

At this point, the Scala installation is complete.

  Do the same on the other two machines!

  8. Type the scala command to go straight into Scala's interactive shell.


root@SparkMaster:/usr/local/scala# scala
Welcome to Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_60).
Type in expressions to have them evaluated.
Type :help for more information.

scala> 9*9
res0: Int = 81

scala> exit;
warning: there were 1 deprecation warning(s); re-run with -deprecation for details
root@SparkMaster:/usr/local/scala#

  

Part Three: Configuring passwordless SSH login

  

root@SparkMaster:~# mkdir .ssh
root@SparkMaster:~# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): Enter
Enter passphrase (empty for no passphrase): Enter
Enter same passphrase again: Enter
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
a5:e3:51:95:a4:0b:d8:a1:a8:5c:12:1e:ba:b0:ef:6a root@Spark_Master
The key's randomart image is:
+--[ RSA 2048]----+
| o . .o. |
| o o . + . o. |
|o o o o o + |
|.+ + = . |
|o o S . |
| . . o |
| . . |
| E |
|o.. |
+-----------------+
root@SparkMaster:~#


  Do the same on the other two machines!

    Switch into the .ssh directory and look at the public and private keys

root@SparkMaster:~# cd .ssh
root@SparkMaster:~/.ssh# ls
id_rsa id_rsa.pub
root@SparkMaster:~/.ssh#


  Do the same on the other two machines!

    Append the public key to the authorized_keys file, then check that it was appended successfully

root@SparkMaster:~/.ssh# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
root@SparkMaster:~/.ssh# ls
authorized_keys id_rsa id_rsa.pub
root@SparkMaster:~/.ssh# pwd
/root/.ssh
root@SparkMaster:~/.ssh# cd ..
root@SparkMaster:~# ll


Do the same on the other two machines!

  Set the permissions

root@SparkMaster:~# chmod 700 .ssh
root@SparkMaster:~# chmod 600 .ssh/*
root@SparkMaster:~#


Do the same on the other two machines!
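
  As an aside, the append-to-authorized_keys and chmod steps above can also be done in one shot with ssh-copy-id (a sketch; it asks for the target machine's password once per host):

# run on each node, once for every host it should later reach without a password
ssh-copy-id root@SparkMaster
ssh-copy-id root@SparkWorker1
ssh-copy-id root@SparkWorker2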

  1. Passwordless SSH for each node to itself (SparkMaster, SparkWorker1, SparkWorker2)

      SparkMaster itself

root@SparkMaster:~# ssh SparkMaster
The authenticity of host 'sparkmaster (192.168.80.31)' can't be established.
ECDSA key fingerprint is 26:3d:99:40:70:43:33:b9:0b:16:57:c3:63:37:8f:ac.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'sparkmaster,192.168.80.31' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

* Documentation: https://help.ubuntu.com/

731 packages can be updated.
345 updates are security updates.

New release '16.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Wed Sep 7 08:43:27 2016 from 192.168.80.1
root@SparkMaster:~# exit;
logout
Connection to sparkmaster closed.
root@SparkMaster:~# ssh SparkMaster
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

* Documentation: https://help.ubuntu.com/

731 packages can be updated.
345 updates are security updates.

New release '16.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Wed Sep 7 09:09:57 2016 from sparkmaster
root@SparkMaster:~#


root@SparkMaster:~# cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDMfmh16fsyxwHH1ZcMyZhGC9wBcYMUwWaxeL1y4cI8ylpwATKkJAv0G+MkWhQQ+eVqsyxL4a5n3JrmoDuUJ2XUVXneBN+qPYQF6iu07MhTXoyTBFAlnVGCpgafgbdgNcdzKT0lvkoZMdr0eXImIlO+x4YQ7ZmFVmVYESvFKT4RWf9w7VAj/Oq8PxMV60RjsVB2MAlVeq9IzAzOPH6mhw53yeOkn7h6LoZfYjGjXpBaHnRLOohjz1cgajYoSYP5mUHTmI1CZvwg/oIgDI7qB7agfGMJAQFkBXxkwLTgrs0xQnhLmrARHK5KVewEUPLN7XKO3/6T3H7MU/7+hgK7vLMV root@SparkMaster
root@SparkMaster:~#   


    SparkWorker1 itself

root@SparkWorker1:~# ssh SparkWorker1
The authenticity of host 'sparkworker1 (192.168.80.32)' can't be established.
ECDSA key fingerprint is 7e:9b:83:ed:c1:ad:55:2a:e1:f3:49:cd:89:c0:4a:ec.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'sparkworker1,192.168.80.32' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

* Documentation: https://help.ubuntu.com/

708 packages can be updated.
329 updates are security updates.

New release '16.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Wed Sep 7 08:44:24 2016 from 192.168.80.1
root@SparkWorker1:~# exit;
logout
Connection to sparkworker1 closed.
root@SparkWorker1:~# ssh SparkWorker1
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

* Documentation: https://help.ubuntu.com/

708 packages can be updated.
329 updates are security updates.

New release '16.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Wed Sep 7 09:12:20 2016 from sparkworker1
root@SparkWorker1:~#


root@SparkWorker1:~# cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCzbQ6Uj69dLshMuYYVm4OndmagCbeTPnSg82y+1Sp13MGvMAOlyE13PElDlyYiaaCG+LIdBjV97BzZcDVm+hnMD56zFwkY56eIuzFMYJNOaAsfrcbD5aJEOhMTkSziqAGR7BQTtNO3COGbaWUoJR92LFFhIiJ8PSd/MJlsu3OLSldk2kM31NwoUWbd6nFt4WsBvTClEs/4qz4HoboM80ayNxrlyektGmv2tmF6Y00JTQMLRaFSAdFgX2Bh2iA6VHqD5Q2m+M1VU5IP0KFHLmrGiVCS8LL4igNz3U1qhkJ9B3ek++Y+0d2DjI80XLIkahKhuhKWQBkPb4zpHV4CQZZ5 root@SparkWorker1
root@SparkWorker1:~#


  SparkWorker2 itself

root@SparkWorker2:~# ssh SparkWorker2
The authenticity of host 'sparkworker2 (192.168.80.33)' can't be established.
ECDSA key fingerprint is 97:59:ec:7b:be:3c:4f:04:71:64:7d:d7:b2:e9:c3:67.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'spark_worker2,192.168.80.33' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

* Documentation: https://help.ubuntu.com/

New release '16.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Wed Sep 7 07:59:57 2016 from 192.168.80.1
root@SparkWorker2:~# exit;
logout
Connection to sparkworker2 closed.
root@SparkWorker2:~# ssh SparkWorker2
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

* Documentation: https://help.ubuntu.com/

713 packages can be updated.
329 updates are security updates.

New release '16.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Wed Sep 7 09:14:23 2016 from sparkworker2
root@SparkWorker2:~#


root@SparkWorker2:~# cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7L9KxXwRf8IeGMjBWLgnOJUVQSU7nggVAuIo3vpCu6YVwpEdd9Z29PQEy/njQR32IBvQJkdV1BhsT7eRfe14D+Jn7bL8BcJ9NpEw3IyQSdR17wFVofyzZkP1xC5AsejduZIDVpiQ6jRzinHR+tEZJSvI7jWCkR0HVZU+IPuh4DsYhke78xNu3FdRqY4J77H+VxcD2k/Ls1owjbbzUrgsWtbmiH+TmGGHVbwPNvaPT7rnXARWsLDooSWt5FjLhH3F4plkfWKxJ/sjWsyTLFbVhTHrMmFU4ZIjs6lFNu4m7EN212d4ZAoozIh0Sh2RaQS0sqq68CGG+67BRnH1GMxUd root@SparkWorker2
root@SparkWorker2:~#


   2. Passwordless SSH between SparkWorker1 and SparkMaster, and between SparkWorker2 and SparkMaster

    SparkWorker1 and SparkMaster

root@SparkWorker1:~# cat ~/.ssh/id_rsa.pub | ssh root@SparkMaster 'cat >> ~/.ssh/authorized_keys'
The authenticity of host 'sparkmaster (192.168.80.31)' can't be established.
ECDSA key fingerprint is 26:3d:99:40:70:43:33:b9:0b:16:57:c3:63:37:8f:ac.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'spark_master,192.168.80.31' (ECDSA) to the list of known hosts.
root@sparkmaster's password:
root@SparkWorker1:~# cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCzbQ6Uj69dLshMuYYVm4OndmagCbeTPnSg82y+1Sp13MGvMAOlyE13PElDlyYiaaCG+LIdBjV97BzZcDVm+hnMD56zFwkY56eIuzFMYJNOaAsfrcbD5aJEOhMTkSziqAGR7BQTtNO3COGbaWUoJR92LFFhIiJ8PSd/MJlsu3OLSldk2kM31NwoUWbd6nFt4WsBvTClEs/4qz4HoboM80ayNxrlyektGmv2tmF6Y00JTQMLRaFSAdFgX2Bh2iA6VHqD5Q2m+M1VU5IP0KFHLmrGiVCS8LL4igNz3U1qhkJ9B3ek++Y+0d2DjI80XLIkahKhuhKWQBkPb4zpHV4CQZZ5 root@SparkWorker1
root@SparkWorker1:~#


root@SparkMaster:~# cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDMfmh16fsyxwHH1ZcMyZhGC9wBcYMUwWaxeL1y4cI8ylpwATKkJAv0G+MkWhQQ+eVqsyxL4a5n3JrmoDuUJ2XUVXneBN+qPYQF6iu07MhTXoyTBFAlnVGCpgafgbdgNcdzKT0lvkoZMdr0eXImIlO+x4YQ7ZmFVmVYESvFKT4RWf9w7VAj/Oq8PxMV60RjsVB2MAlVeq9IzAzOPH6mhw53yeOkn7h6LoZfYjGjXpBaHnRLOohjz1cgajYoSYP5mUHTmI1CZvwg/oIgDI7qB7agfGMJAQFkBXxkwLTgrs0xQnhLmrARHK5KVewEUPLN7XKO3/6T3H7MU/7+hgK7vLMV root@SparkMaster
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCzbQ6Uj69dLshMuYYVm4OndmagCbeTPnSg82y+1Sp13MGvMAOlyE13PElDlyYiaaCG+LIdBjV97BzZcDVm+hnMD56zFwkY56eIuzFMYJNOaAsfrcbD5aJEOhMTkSziqAGR7BQTtNO3COGbaWUoJR92LFFhIiJ8PSd/MJlsu3OLSldk2kM31NwoUWbd6nFt4WsBvTClEs/4qz4HoboM80ayNxrlyektGmv2tmF6Y00JTQMLRaFSAdFgX2Bh2iA6VHqD5Q2m+M1VU5IP0KFHLmrGiVCS8LL4igNz3U1qhkJ9B3ek++Y+0d2DjI80XLIkahKhuhKWQBkPb4zpHV4CQZZ5 root@SparkWorker1
root@SparkMaster:~#


  

    SparkWorker2 and SparkMaster

root@SparkWorker2:~# cat ~/.ssh/id_rsa.pub | ssh root@SparkMaster 'cat >> ~/.ssh/authorized_keys'
The authenticity of host 'sparkmaster (192.168.80.31)' can't be established.
ECDSA key fingerprint is 26:3d:99:40:70:43:33:b9:0b:16:57:c3:63:37:8f:ac.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'sparkmaster,192.168.80.31' (ECDSA) to the list of known hosts.
root@sparkmaster's password:
root@SparkWorker2:~# cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7L9KxXwRf8IeGMjBWLgnOJUVQSU7nggVAuIo3vpCu6YVwpEdd9Z29PQEy/njQR32IBvQJkdV1BhsT7eRfe14D+Jn7bL8BcJ9NpEw3IyQSdR17wFVofyzZkP1xC5AsejduZIDVpiQ6jRzinHR+tEZJSvI7jWCkR0HVZU+IPuh4DsYhke78xNu3FdRqY4J77H+VxcD2k/Ls1owjbbzUrgsWtbmiH+TmGGHVbwPNvaPT7rnXARWsLDooSWt5FjLhH3F4plkfWKxJ/sjWsyTLFbVhTHrMmFU4ZIjs6lFNu4m7EN212d4ZAoozIh0Sh2RaQS0sqq68CGG+67BRnH1GMxUd root@SparkWorker2
root@SparkWorker2:~#


root@SparkMaster:~# cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDMfmh16fsyxwHH1ZcMyZhGC9wBcYMUwWaxeL1y4cI8ylpwATKkJAv0G+MkWhQQ+eVqsyxL4a5n3JrmoDuUJ2XUVXneBN+qPYQF6iu07MhTXoyTBFAlnVGCpgafgbdgNcdzKT0lvkoZMdr0eXImIlO+x4YQ7ZmFVmVYESvFKT4RWf9w7VAj/Oq8PxMV60RjsVB2MAlVeq9IzAzOPH6mhw53yeOkn7h6LoZfYjGjXpBaHnRLOohjz1cgajYoSYP5mUHTmI1CZvwg/oIgDI7qB7agfGMJAQFkBXxkwLTgrs0xQnhLmrARHK5KVewEUPLN7XKO3/6T3H7MU/7+hgK7vLMV root@SparkMaster
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCzbQ6Uj69dLshMuYYVm4OndmagCbeTPnSg82y+1Sp13MGvMAOlyE13PElDlyYiaaCG+LIdBjV97BzZcDVm+hnMD56zFwkY56eIuzFMYJNOaAsfrcbD5aJEOhMTkSziqAGR7BQTtNO3COGbaWUoJR92LFFhIiJ8PSd/MJlsu3OLSldk2kM31NwoUWbd6nFt4WsBvTClEs/4qz4HoboM80ayNxrlyektGmv2tmF6Y00JTQMLRaFSAdFgX2Bh2iA6VHqD5Q2m+M1VU5IP0KFHLmrGiVCS8LL4igNz3U1qhkJ9B3ek++Y+0d2DjI80XLIkahKhuhKWQBkPb4zpHV4CQZZ5 root@SparkWorker1
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7L9KxXwRf8IeGMjBWLgnOJUVQSU7nggVAuIo3vpCu6YVwpEdd9Z29PQEy/njQR32IBvQJkdV1BhsT7eRfe14D+Jn7bL8BcJ9NpEw3IyQSdR17wFVofyzZkP1xC5AsejduZIDVpiQ6jRzinHR+tEZJSvI7jWCkR0HVZU+IPuh4DsYhke78xNu3FdRqY4J77H+VxcD2k/Ls1owjbbzUrgsWtbmiH+TmGGHVbwPNvaPT7rnXARWsLDooSWt5FjLhH3F4plkfWKxJ/sjWsyTLFbVhTHrMmFU4ZIjs6lFNu4m7EN212d4ZAoozIh0Sh2RaQS0sqq68CGG+67BRnH1GMxUd root@SparkWorker2
root@SparkMaster:~#


  

    Distribute SparkMaster's ~/.ssh/authorized_keys to SparkWorker1

      Tip: you can use a script you have written yourself, or simply the scp command

root@SparkMaster:~# scp -r ~/.ssh/authorized_keys root@SparkWorker1:~/.ssh/
The authenticity of host 'sparkworker1 (192.168.80.32)' can't be established.
ECDSA key fingerprint is 7e:9b:83:ed:c1:ad:55:2a:e1:f3:49:cd:89:c0:4a:ec.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'sparkworker1,192.168.80.32' (ECDSA) to the list of known hosts.
root@sparkworker1's password:
authorized_keys 100% 1199 1.2KB/s 00:00
root@SparkMaster:~#


root@SparkWorker1:~# cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDMfmh16fsyxwHH1ZcMyZhGC9wBcYMUwWaxeL1y4cI8ylpwATKkJAv0G+MkWhQQ+eVqsyxL4a5n3JrmoDuUJ2XUVXneBN+qPYQF6iu07MhTXoyTBFAlnVGCpgafgbdgNcdzKT0lvkoZMdr0eXImIlO+x4YQ7ZmFVmVYESvFKT4RWf9w7VAj/Oq8PxMV60RjsVB2MAlVeq9IzAzOPH6mhw53yeOkn7h6LoZfYjGjXpBaHnRLOohjz1cgajYoSYP5mUHTmI1CZvwg/oIgDI7qB7agfGMJAQFkBXxkwLTgrs0xQnhLmrARHK5KVewEUPLN7XKO3/6T3H7MU/7+hgK7vLMV root@SparkMaster
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCzbQ6Uj69dLshMuYYVm4OndmagCbeTPnSg82y+1Sp13MGvMAOlyE13PElDlyYiaaCG+LIdBjV97BzZcDVm+hnMD56zFwkY56eIuzFMYJNOaAsfrcbD5aJEOhMTkSziqAGR7BQTtNO3COGbaWUoJR92LFFhIiJ8PSd/MJlsu3OLSldk2kM31NwoUWbd6nFt4WsBvTClEs/4qz4HoboM80ayNxrlyektGmv2tmF6Y00JTQMLRaFSAdFgX2Bh2iA6VHqD5Q2m+M1VU5IP0KFHLmrGiVCS8LL4igNz3U1qhkJ9B3ek++Y+0d2DjI80XLIkahKhuhKWQBkPb4zpHV4CQZZ5 root@SparkWorker1
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7L9KxXwRf8IeGMjBWLgnOJUVQSU7nggVAuIo3vpCu6YVwpEdd9Z29PQEy/njQR32IBvQJkdV1BhsT7eRfe14D+Jn7bL8BcJ9NpEw3IyQSdR17wFVofyzZkP1xC5AsejduZIDVpiQ6jRzinHR+tEZJSvI7jWCkR0HVZU+IPuh4DsYhke78xNu3FdRqY4J77H+VxcD2k/Ls1owjbbzUrgsWtbmiH+TmGGHVbwPNvaPT7rnXARWsLDooSWt5FjLhH3F4plkfWKxJ/sjWsyTLFbVhTHrMmFU4ZIjs6lFNu4m7EN212d4ZAoozIh0Sh2RaQS0sqq68CGG+67BRnH1GMxUd root@SparkWorker2
root@SparkWorker1:~#


  

    Distribute SparkMaster's ~/.ssh/authorized_keys to SparkWorker2

  

root@SparkMaster:~# scp -r ~/.ssh/authorized_keys root@SparkWorker2:~/.ssh/
The authenticity of host 'spark_worker2 (192.168.80.33)' can't be established.
ECDSA key fingerprint is 97:59:ec:7b:be:3c:4f:04:71:64:7d:d7:b2:e9:c3:67.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'sparkworker2,192.168.80.33' (ECDSA) to the list of known hosts.
root@sparkworker2's password:
authorized_keys 100% 1199 1.2KB/s 00:00
root@SparkMaster:~#


root@SparkWorker2:~# cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDMfmh16fsyxwHH1ZcMyZhGC9wBcYMUwWaxeL1y4cI8ylpwATKkJAv0G+MkWhQQ+eVqsyxL4a5n3JrmoDuUJ2XUVXneBN+qPYQF6iu07MhTXoyTBFAlnVGCpgafgbdgNcdzKT0lvkoZMdr0eXImIlO+x4YQ7ZmFVmVYESvFKT4RWf9w7VAj/Oq8PxMV60RjsVB2MAlVeq9IzAzOPH6mhw53yeOkn7h6LoZfYjGjXpBaHnRLOohjz1cgajYoSYP5mUHTmI1CZvwg/oIgDI7qB7agfGMJAQFkBXxkwLTgrs0xQnhLmrARHK5KVewEUPLN7XKO3/6T3H7MU/7+hgK7vLMV root@SparkMaster
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCzbQ6Uj69dLshMuYYVm4OndmagCbeTPnSg82y+1Sp13MGvMAOlyE13PElDlyYiaaCG+LIdBjV97BzZcDVm+hnMD56zFwkY56eIuzFMYJNOaAsfrcbD5aJEOhMTkSziqAGR7BQTtNO3COGbaWUoJR92LFFhIiJ8PSd/MJlsu3OLSldk2kM31NwoUWbd6nFt4WsBvTClEs/4qz4HoboM80ayNxrlyektGmv2tmF6Y00JTQMLRaFSAdFgX2Bh2iA6VHqD5Q2m+M1VU5IP0KFHLmrGiVCS8LL4igNz3U1qhkJ9B3ek++Y+0d2DjI80XLIkahKhuhKWQBkPb4zpHV4CQZZ5 root@SparkWorker1
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7L9KxXwRf8IeGMjBWLgnOJUVQSU7nggVAuIo3vpCu6YVwpEdd9Z29PQEy/njQR32IBvQJkdV1BhsT7eRfe14D+Jn7bL8BcJ9NpEw3IyQSdR17wFVofyzZkP1xC5AsejduZIDVpiQ6jRzinHR+tEZJSvI7jWCkR0HVZU+IPuh4DsYhke78xNu3FdRqY4J77H+VxcD2k/Ls1owjbbzUrgsWtbmiH+TmGGHVbwPNvaPT7rnXARWsLDooSWt5FjLhH3F4plkfWKxJ/sjWsyTLFbVhTHrMmFU4ZIjs6lFNu4m7EN212d4ZAoozIh0Sh2RaQS0sqq68CGG+67BRnH1GMxUd root@SparkWorker2
root@SparkWorker2:~#


    3. Passwordless SSH between SparkWorker1 and SparkWorker2

root@SparkMaster:~# ssh SparkWorker1
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

* Documentation: https://help.ubuntu.com/

708 packages can be updated.
329 updates are security updates.

New release '16.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Wed Sep 7 09:12:30 2016 from sparkworker1
root@SparkWorker1:~# exit;
logout
Connection to spark_worker1 closed.
root@SparkMaster:~# ssh SparkWorker2
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

* Documentation: https://help.ubuntu.com/

713 packages can be updated.
329 updates are security updates.

New release '16.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Wed Sep 7 09:14:33 2016 from spark_worker2
root@SparkWorker2:~# exit;
logout
Connection to spark_worker2 closed.
root@SparkMaster:~#


root@SparkWorker1:~# ssh SparkMaster
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

* Documentation: https://help.ubuntu.com/

731 packages can be updated.
345 updates are security updates.

New release '16.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Wed Sep 7 09:10:14 2016 from spark_master
root@SparkMaster:~# exit;
logout
Connection to spark_master closed.
root@SparkWorker1:~# ssh SparkWorker2
The authenticity of host 'spark_worker2 (192.168.80.33)' can't be established.
ECDSA key fingerprint is 97:59:ec:7b:be:3c:4f:04:71:64:7d:d7:b2:e9:c3:67.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'spark_worker2,192.168.80.33' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

* Documentation: https://help.ubuntu.com/

713 packages can be updated.
329 updates are security updates.

New release '16.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Wed Sep 7 09:42:32 2016 from spark_master

root@SparkWorker2:~# exit;
logout
Connection to sparkworker2 closed.
root@SparkWorker1:~#


    

root@SparkWorker2:~# ssh SparkMaster
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

* Documentation: https://help.ubuntu.com/

731 packages can be updated.
345 updates are security updates.

New release '16.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Wed Sep 7 09:44:48 2016 from spark_worker1
root@SparkMaster:~# exit;
logout
Connection to spark_master closed.
root@SparkWorker2:~# ssh SparkWorker1
The authenticity of host 'spark_worker1 (192.168.80.32)' can't be established.
ECDSA key fingerprint is 7e:9b:83:ed:c1:ad:55:2a:e1:f3:49:cd:89:c0:4a:ec.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'spark_worker1,192.168.80.32' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

* Documentation: https://help.ubuntu.com/

708 packages can be updated.
329 updates are security updates.

New release '16.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Wed Sep 7 09:42:01 2016 from spark_master

root@SparkWorker1:~# exit;
logout
Connection to sparkworker1 closed.
root@SparkWorker2:~#


  

 Part Four: Installing hadoop

  hadoop-2.6.0.tar.gz ----------------------------------------------------------  /usr/local/hadoop/hadoop-2.6.0

  1. Downloading hadoop

http://archive.apache.org/dist/hadoop/common/hadoop-2.6.0/


  2. Uploading hadoop-2.6.0.tar.gz


  Do the same on the other two machines!

  3. Now create the hadoop directory under /usr/local/


    mkdir -p /usr/local/hadoop

  4. Copy the downloaded hadoop file into the newly created /usr/local/hadoop


  sudo cp /root/Downloads/Spark_Cluster_Software/hadoop-2.6.0.tar.gz  /usr/local/hadoop


  Prefer cp here; don't use mv casually.

  5. Unpack the hadoop file


  root@SparkMaster:/usr/local/hadoop# tar -zxvf hadoop-2.6.0.tar.gz

  6. Delete the tarball and keep only the unpacked directory,

        and change its owning user and group

  

root@SparkMaster:/usr/local/hadoop# ls
hadoop-2.6.0 hadoop-2.6.0.tar.gz
root@SparkMaster:/usr/local/hadoop# rm -rf hadoop-2.6.0.tar.gz
root@SparkMaster:/usr/local/hadoop# ls
hadoop-2.6.0
root@SparkMaster:/usr/local/hadoop# ll
total 12
drwxr-xr-x 3 root root 4096 9月 7 11:28 ./
drwxr-xr-x 13 root root 4096 9月 7 11:21 ../
drwxr-xr-x 9 20000 20000 4096 11月 14 2014 hadoop-2.6.0/
root@SparkMaster:/usr/local/hadoop# chown -R root:root hadoop-2.6.0/
root@SparkMaster:/usr/local/hadoop# ll
total 12
drwxr-xr-x 3 root root 4096 9月 7 11:28 ./
drwxr-xr-x 13 root root 4096 9月 7 11:21 ../
drwxr-xr-x 9 root root 4096 11月 14 2014 hadoop-2.6.0/
root@SparkMaster:/usr/local/hadoop#


  7. Edit the environment variables

    vim ~/.bash_profile   or   vim /etc/profile

    You can put the settings in ~/.bash_profile, or in the global file /etc/profile; either works.

    Here I use vim /etc/profile


#hadoop
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.6.0
export HADOOP_MAPREDUCE_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin


root@SparkMaster:/usr/local/hadoop# vim /etc/profile
root@SparkMaster:/usr/local/hadoop# source /etc/profile
root@SparkMaster:/usr/local/hadoop# hadoop version
Hadoop 2.6.0
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1
Compiled by jenkins on 2014-11-13T21:10Z
Compiled with protoc 2.5.0
From source with checksum 18e43357c8f927c0695f1e9522859d6a
This command was run using /usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/hadoop-common-2.6.0.jar
root@SparkMaster:/usr/local/hadoop#

Or


#hadoop
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.6.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

  At this point, the hadoop installation is complete.

 Do the same on the other two machines!

  

  Editing hadoop's configuration files

    From experience, it is usually easiest to prepare the files in Notepad++ and then drop them onto the machines.


  Unpack the tarball on Windows too, open its configuration files there, and write them out.

Core (core-site.xml)


For a 5-node cluster!

  Especially when doing Eclipse development, the following must be configured:

  /usr/local/hadoop/hadoop-2.6.0/tmp

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://SparkMaster:9000</value>
</property>
<property>
  <name>io.file.buffer.size</name>
  <value>131072</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoop/hadoop-2.6.0/tmp</value>
</property>
<property>
  <name>hadoop.proxyuser.hadoop.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hadoop.groups</name>
  <value>*</value>
</property>
</configuration>

Or


For a 3-node cluster!

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://SparkMaster:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/hadoop-2.6.0/tmp</value>
</property>
</configuration>

Storage (hdfs-site.xml)


For a 5-node cluster!

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>SparkMaster:9001</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/usr/local/hadoop/hadoop-2.6.0/dfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/usr/local/hadoop/hadoop-2.6.0/dfs/data</value>
</property>
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>
</configuration>

Or


For a 3-node cluster!

<configuration>
<property>
<name>dfs.namenode.rpc-address</name>
<value>SparkMaster:9000</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/hadoop/hadoop-2.6.0/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/hadoop/hadoop-2.6.0/dfs/data</value>
</property>
</configuration>

Compute (mapred-site.xml)


For a 5-node cluster!

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>SparkMaster:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>SparkMaster:19888</value>
</property>
</configuration>

Or


For a 3-node cluster!

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>SparkMaster:9001</value>
</property>
</configuration>

Resource management (yarn-site.xml)


For a 5-node cluster!

<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>

<!-- Site specific YARN configuration properties -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>SparkMaster:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>SparkMaster:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>SparkMaster:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address</name>
  <value>SparkMaster:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>SparkMaster:8088</value>
</property>
</configuration>

Or


For a 3-node cluster!

<configuration>

<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>

Environment (hadoop-env.sh)


For a 5-node cluster!

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME. All others are
# optional. When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.
export JAVA_HOME=/usr/local/jdk/jdk1.8.0_60

# The jsvc implementation to use. Jsvc is required to run secure datanodes
# that bind to privileged ports to provide authentication of data transfer
# protocol. Jsvc is not required if SASL is configured for authentication of
# data transfer protocol using non-privileged ports.
#export JSVC_HOME=${JSVC_HOME}

export HADOOP_HOME=/usr/local/hadoop/hadoop-2.6.0
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}

# Extra Java CLASSPATH elements. Automatically insert capacity-scheduler.
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
if [ "$HADOOP_CLASSPATH" ]; then
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
else
export HADOOP_CLASSPATH=$f
fi
done

# The maximum amount of heap to use, in MB. Default is 1000.
#export HADOOP_HEAPSIZE=
#export HADOOP_NAMENODE_INIT_HEAPSIZE=""

# Extra Java runtime options. Empty by default.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"

export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"

export HADOOP_NFS3_OPTS="$HADOOP_NFS3_OPTS"
export HADOOP_PORTMAP_OPTS="-Xmx512m $HADOOP_PORTMAP_OPTS"

# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
#HADOOP_JAVA_PLATFORM_OPTS="-XX:-UsePerfData $HADOOP_JAVA_PLATFORM_OPTS"

# On secure datanodes, user to run the datanode as after dropping privileges.
# This **MUST** be uncommented to enable secure HDFS if using privileged ports
# to provide authentication of data transfer protocol. This **MUST NOT** be
# defined if SASL is configured for authentication of data transfer protocol
# using non-privileged ports.
export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER}

# Where log files are stored. $HADOOP_HOME/logs by default.
#export HADOOP_LOG_DIR=${HADOOP_LOG_DIR}/$USER

# Where log files are stored in the secure data environment.
export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER}

###
# HDFS Mover specific parameters
###
# Specify the JVM options to be used when starting the HDFS Mover.
# These options will be appended to the options specified as HADOOP_OPTS
# and therefore may override any similar flags set in HADOOP_OPTS
#
# export HADOOP_MOVER_OPTS=""

###
# Advanced Users Only!
###

# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by
# the user that will run the hadoop daemons. Otherwise there is the
# potential for a symlink attack.
export HADOOP_PID_DIR=${HADOOP_PID_DIR}
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}

# A string representing this instance of hadoop. $USER by default.
export HADOOP_IDENT_STRING=$USER

Or


For a 3-node cluster!

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME. All others are
# optional. When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.
export JAVA_HOME=/usr/local/jdk/jdk1.8.0_60

# The jsvc implementation to use. Jsvc is required to run secure datanodes
# that bind to privileged ports to provide authentication of data transfer
# protocol. Jsvc is not required if SASL is configured for authentication of
# data transfer protocol using non-privileged ports.
#export JSVC_HOME=${JSVC_HOME}

export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}

# Extra Java CLASSPATH elements. Automatically insert capacity-scheduler.
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
if [ "$HADOOP_CLASSPATH" ]; then
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
else
export HADOOP_CLASSPATH=$f
fi
done

# The maximum amount of heap to use, in MB. Default is 1000.
#export HADOOP_HEAPSIZE=
#export HADOOP_NAMENODE_INIT_HEAPSIZE=""

# Extra Java runtime options. Empty by default.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"

export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"

export HADOOP_NFS3_OPTS="$HADOOP_NFS3_OPTS"
export HADOOP_PORTMAP_OPTS="-Xmx512m $HADOOP_PORTMAP_OPTS"

# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
#HADOOP_JAVA_PLATFORM_OPTS="-XX:-UsePerfData $HADOOP_JAVA_PLATFORM_OPTS"

# On secure datanodes, user to run the datanode as after dropping privileges.
# This **MUST** be uncommented to enable secure HDFS if using privileged ports
# to provide authentication of data transfer protocol. This **MUST NOT** be
# defined if SASL is configured for authentication of data transfer protocol
# using non-privileged ports.
export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER}

# Where log files are stored. $HADOOP_HOME/logs by default.
#export HADOOP_LOG_DIR=${HADOOP_LOG_DIR}/$USER

# Where log files are stored in the secure data environment.
export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER}

###
# HDFS Mover specific parameters
###
# Specify the JVM options to be used when starting the HDFS Mover.
# These options will be appended to the options specified as HADOOP_OPTS
# and therefore may override any similar flags set in HADOOP_OPTS
#
# export HADOOP_MOVER_OPTS=""

###
# Advanced Users Only!
###

# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by
# the user that will run the hadoop daemons. Otherwise there is the
# potential for a symlink attack.
export HADOOP_PID_DIR=${HADOOP_PID_DIR}
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}

# A string representing this instance of hadoop. $USER by default.
export HADOOP_IDENT_STRING=$USER

Master node


SparkMaster

Slave nodes (the slaves file)


SparkWorker1
SparkWorker2

  On SparkMaster, SparkWorker1 and SparkWorker2, delete each machine's original copies of these configuration files.


root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# pwd
/usr/local/hadoop/hadoop-2.6.0/etc/hadoop
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# ls
capacity-scheduler.xml hadoop-env.cmd hadoop-policy.xml httpfs-signature.secret kms-log4j.properties mapred-env.sh ssl-client.xml.example yarn-site.xml
configuration.xsl hadoop-env.sh hdfs-site.xml httpfs-site.xml kms-site.xml mapred-queues.xml.template ssl-server.xml.example
container-executor.cfg hadoop-metrics2.properties httpfs-env.sh kms-acls.xml log4j.properties mapred-site.xml.template yarn-env.cmd
core-site.xml hadoop-metrics.properties httpfs-log4j.properties kms-env.sh mapred-env.cmd slaves yarn-env.sh
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# rm -rf yarn-site.xml
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# rm -rf hadoop-env.sh
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# rm -rf hdfs-site.xml
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# rm -rf mapred-site.xml.template
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# rm -rf core-site.xml
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# rm -rf slaves
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# ls
capacity-scheduler.xml hadoop-env.cmd hadoop-policy.xml httpfs-signature.secret kms-env.sh log4j.properties mapred-queues.xml.template yarn-env.cmd
configuration.xsl hadoop-metrics2.properties httpfs-env.sh httpfs-site.xml kms-log4j.properties mapred-env.cmd ssl-client.xml.example yarn-env.sh
container-executor.cfg hadoop-metrics.properties httpfs-log4j.properties kms-acls.xml kms-site.xml mapred-env.sh ssl-server.xml.example
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop#

Upload the files you prepared.


root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# pwd
/usr/local/hadoop/hadoop-2.6.0/etc/hadoop
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# rz
The program 'rz' is currently not installed. You can install it by typing:
apt-get install lrzsz
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# sudo apt-get install lrzsz


  Do the same on the other two machines!
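
  Instead of uploading the files separately on each machine, one option (a sketch, relying on the passwordless SSH set up earlier) is to push the finished files from SparkMaster to the workers with scp:

cd /usr/local/hadoop/hadoop-2.6.0/etc/hadoop
for h in SparkWorker1 SparkWorker2; do
  scp core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml hadoop-env.sh slaves root@$h:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop/
done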

Create the new directories

  root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0# mkdir -p /usr/local/hadoop/hadoop-2.6.0/dfs/name

  root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0# mkdir -p /usr/local/hadoop/hadoop-2.6.0/dfs/data

  root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0# mkdir -p /usr/local/hadoop/hadoop-2.6.0/tmp

  This way of creating them is recommended, but it can only be done once hadoop-2.6.0.tar.gz has been fully unpacked and the hadoop-2.6.0 directory exists!

  root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0# mkdir dfs

  root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0/dfs# mkdir name

  root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0/dfs# mkdir data

The result is the same.
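
  A sketch of creating the same three directories on the workers from SparkMaster (again relying on the passwordless SSH configured earlier):

for h in SparkWorker1 SparkWorker2; do
  ssh root@$h "mkdir -p /usr/local/hadoop/hadoop-2.6.0/dfs/name /usr/local/hadoop/hadoop-2.6.0/dfs/data /usr/local/hadoop/hadoop-2.6.0/tmp"
done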


 At this point, hadoop's configuration work is complete!


   Formatting hadoop (the HDFS NameNode)

   In the hadoop installation directory on the master node (SparkMaster), run the following command:

  ./bin/hadoop namenode  -format
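
  As the log below also points out, this script form is deprecated in Hadoop 2.x; the equivalent command is:

./bin/hdfs namenode -format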


root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0# pwd
/usr/local/hadoop/hadoop-2.6.0
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0# ./bin/hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

16/09/07 15:09:58 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = Spark_Master/192.168.80.31
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.6.0
STARTUP_MSG: classpath = /usr/local/hadoop/hadoop-2.6.0/etc/hadoop:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/xz-1.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/jsch-0.1.42.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/netty-3.6.2.Final.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/api-util-1.0.0-M20.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/commons-lang-2.6.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/mockito-all-1.8.5.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/commons-codec-1.4.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/xmlenc-0.52.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/jasper-compiler-5.5.23.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/avro-1.7.4.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/asm-3.2.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/httpcore-4.2.5.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/commons-net-3.1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/jetty-6.1.26.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/commons-configuration-1.6.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/commons-compress-1.4.1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/jersey-json-1.9.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/jasper-runtime-5.5.23.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/curator-recipes-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/stax-api-1.0-2.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/curator-client-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/commons-httpclient-3.1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/commons-math3-3.1.1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/jetty-util-6.1.26.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/hamcrest-core-1.3.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/paranamer-2.3.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/commons-logging-1.1.3.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/commons-el-1.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/commons-digester-1.8.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/junit-4.11.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/apacheds-i18n-2.0.0-M15.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/hadoop-annotations-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/hadoop-auth-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/zookeeper-3.4.6.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/activation-1.1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/commons-cli-1.2.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/commons-collections-3.2.1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/commons-beanutils
-1.7.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/jettison-1.1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/servlet-api-2.5.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/jsp-api-2.1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j
12-1.7.5.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/guava-11.0.2.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/log4j-1.2.17.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/jersey-server-1.9.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/commons-io-2.4.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/jsr305-1.3.9.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/slf4j-api-1.7.5.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/httpclient-4.2.5.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/api-asn1-api-1.0.0-M20.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/jets3t-0.9.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/htrace-core-3.0.4.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/gson-2.2.4.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/curator-framework-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/jersey-core-1.9.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/hadoop-nfs-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/hadoop-common-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/common/hadoop-common-2.6.0-tests.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/commons-lang-2.6.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/xercesImpl-2.9.1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/xmlenc-0.52.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/asm-3.2.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/jetty-6.1.26.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/xml-apis-1.3.04.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/jasper-runtime-5.5.23.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/jackson-core-asl-1.9.13.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/commons-el-1.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/jackson-mapper-asl-1.9.13.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/jsp-api-2.1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/guava-11.0.2.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/jersey-server-1.9.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/commons-io-2.4.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/jsr305-1.3.9.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/htrace-core-3.0.4.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/lib/jersey-core-1.9.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/hadoop-hdfs-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/hadoop-hdfs-nfs-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/hdfs/hadoop-hdfs-2.6.0
-tests.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/xz-1.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/netty-3.6.2.Final.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/jersey-client-1.9.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/jackson-jaxrs-1.9.13.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/commons-lang-2.6.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/commons-codec-1.4.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/protobuf-java-2.5.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/leveldbjni-all-1.8.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/asm-3.2.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/jetty-6.1.26.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/jline-0.9.94.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/jersey-guice-1.9.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/commons-compress-1.4.1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/jersey-json-1.9.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/jackson-core-asl-1.9.13.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/stax-api-1.0-2.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/commons-httpclient-3.1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/jetty-util-6.1.26.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/jaxb-impl-2.2.3-1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/commons-logging-1.1.3.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/guice-3.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/zookeeper-3.4.6.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/activation-1.1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/javax.inject-1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/commons-cli-1.2.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/commons-collections-3.2.1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/jettison-1.1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/jaxb-api-2.2.2.jar:/usr/local/hadoop/h
adoop-2.6.0/share/hadoop/yarn/lib/servlet-api-2.5.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/jackson-mapper-asl-1.9.13.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/guava-11.0.2.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/log4j-1.2.17.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/jersey-server-1.9.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/commons-io-2.4.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/jackson-xc-1.9.13.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/jsr305-1.3.9.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/aopalliance-1.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/lib/jersey-core-1.9.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/hadoop-yarn-common-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/hadoop-yarn-client-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/hadoop-yarn-registry-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/hadoop-yarn-server-tests-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/hadoop-yarn-api-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/yarn/hadoop-yarn-server-common-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/xz-1.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/protobuf-java-2.5.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/leveldbjni-all-1.8.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/avro-1.7.4.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/asm-3.2.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/commons-compress-1.4.1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/jackson-core-asl-1.9.13.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/hamcrest-core-1.3.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/junit-4.11.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/hadoop-annotations-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/guice-3.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/javax.inject-1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/snappy-java-1.0.4.1.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.9.13.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/jersey-server-1.9.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/commons-io-2.4.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/map
reduce/lib/jersey-core-1.9.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.6.0-tests.jar:/usr/local/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.6.0.jar:/usr/local/hadoop/hadoop-2.6.0/contrib/capacity-scheduler/*.jar:/usr/local/hadoop/hadoop-2.6.0/contrib/capacity-scheduler/*.jar
STARTUP_MSG: build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1; compiled by 'jenkins' on 2014-11-13T21:10Z
STARTUP_MSG: java = 1.8.0_60
************************************************************/
16/09/07 15:09:58 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
16/09/07 15:09:58 INFO namenode.NameNode: createNameNode [-format]
16/09/07 15:10:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

16/09/07 15:10:05 WARN common.Util: Path /usr/local/hadoop/hadoop-2.6.0/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
16/09/07 15:10:05 WARN common.Util: Path /usr/local/hadoop/hadoop-2.6.0/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
Formatting using clusterid: CID-505e4c90-5200-400f-82ec-8009d0e3283f
16/09/07 15:10:05 INFO namenode.FSNamesystem: No KeyProvider found.
16/09/07 15:10:05 INFO namenode.FSNamesystem: fsLock is fair:true
16/09/07 15:10:06 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
16/09/07 15:10:06 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
16/09/07 15:10:06 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
16/09/07 15:10:06 INFO blockmanagement.BlockManager: The block deletion will start around 2016 Sep 07 15:10:06
16/09/07 15:10:06 INFO util.GSet: Computing capacity for map BlocksMap
16/09/07 15:10:06 INFO util.GSet: VM type = 64-bit
16/09/07 15:10:06 INFO util.GSet: 2.0% max memory 966.7 MB = 19.3 MB
16/09/07 15:10:06 INFO util.GSet: capacity = 2^21 = 2097152 entries
16/09/07 15:10:06 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
16/09/07 15:10:06 INFO blockmanagement.BlockManager: defaultReplication = 3
16/09/07 15:10:06 INFO blockmanagement.BlockManager: maxReplication = 512
16/09/07 15:10:06 INFO blockmanagement.BlockManager: minReplication = 1
16/09/07 15:10:06 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
16/09/07 15:10:06 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks = false
16/09/07 15:10:06 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
16/09/07 15:10:06 INFO blockmanagement.BlockManager: encryptDataTransfer = false
16/09/07 15:10:06 INFO blockmanagement.BlockManager: maxNumBlocksToLog = 1000
16/09/07 15:10:07 INFO namenode.FSNamesystem: fsOwner = root (auth:SIMPLE)
16/09/07 15:10:07 INFO namenode.FSNamesystem: supergroup = supergroup
16/09/07 15:10:07 INFO namenode.FSNamesystem: isPermissionEnabled = true
16/09/07 15:10:07 INFO namenode.FSNamesystem: HA Enabled: false
16/09/07 15:10:07 INFO namenode.FSNamesystem: Append Enabled: true
16/09/07 15:10:09 INFO util.GSet: Computing capacity for map INodeMap
16/09/07 15:10:09 INFO util.GSet: VM type = 64-bit
16/09/07 15:10:09 INFO util.GSet: 1.0% max memory 966.7 MB = 9.7 MB

16/09/07 15:10:09 INFO util.GSet: capacity = 2^20 = 1048576 entries
16/09/07 15:10:09 INFO namenode.NameNode: Caching file names occuring more than 10 times
16/09/07 15:10:09 INFO util.GSet: Computing capacity for map cachedBlocks
16/09/07 15:10:09 INFO util.GSet: VM type = 64-bit
16/09/07 15:10:09 INFO util.GSet: 0.25% max memory 966.7 MB = 2.4 MB
16/09/07 15:10:09 INFO util.GSet: capacity = 2^18 = 262144 entries
16/09/07 15:10:09 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
16/09/07 15:10:09 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
16/09/07 15:10:09 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
16/09/07 15:10:09 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
16/09/07 15:10:09 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
16/09/07 15:10:09 INFO util.GSet: Computing capacity for map NameNodeRetryCache
16/09/07 15:10:09 INFO util.GSet: VM type = 64-bit
16/09/07 15:10:09 INFO util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB
16/09/07 15:10:09 INFO util.GSet: capacity = 2^15 = 32768 entries
16/09/07 15:10:09 INFO namenode.NNConf: ACLs enabled? false
16/09/07 15:10:09 INFO namenode.NNConf: XAttrs enabled? true
16/09/07 15:10:09 INFO namenode.NNConf: Maximum size of an xattr: 16384

16/09/07 15:10:09 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1306820420-192.168.80.31-1473232209632
16/09/07 15:10:09 INFO common.Storage: Storage directory /usr/local/hadoop/hadoop-2.6.0/dfs/name has been successfully formatted.
16/09/07 15:10:10 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
16/09/07 15:10:10 INFO util.ExitUtil: Exiting with status 0
16/09/07 15:10:10 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at SparkMaster/192.168.80.31
************************************************************/
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0#


  

   Start Hadoop

  ./sbin/start-all.sh

  

root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0# pwd
/usr/local/hadoop/hadoop-2.6.0
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0# ./sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [SparkMaster]
SparkMaster: starting namenode, logging to /usr/local/hadoop/hadoop-2.6.0/logs/hadoop-root-namenode-SparkMaster.out
SparkWorker2: starting datanode, logging to /usr/local/hadoop/hadoop-2.6.0/logs/hadoop-root-datanode-SparkWorker2.out
SparkWorker1: starting datanode, logging to /usr/local/hadoop/hadoop-2.6.0/logs/hadoop-root-datanode-SparkWorker1.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is f2:e9:89:a2:74:d0:69:02:60:a5:18:7b:d4:f0:02:bc.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/hadoop-2.6.0/logs/hadoop-root-secondarynamenode-SparkMaster.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/hadoop-2.6.0/logs/yarn-root-resourcemanager-SparkMaster.out
SparkWorker1: starting nodemanager, logging to /usr/local/hadoop/hadoop-2.6.0/logs/yarn-root-nodemanager-SparkWorker1.out
SparkWorker2: starting nodemanager, logging to /usr/local/hadoop/hadoop-2.6.0/logs/yarn-root-nodemanager-SparkWorker2.out
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0# jps
5026 Jps
4679 SecondaryNameNode
4839 ResourceManager
4507 NameNode
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0#


  

root@SparkWorker1:/usr/local/hadoop/hadoop-2.6.0# pwd
/usr/local/hadoop/hadoop-2.6.0
root@SparkWorker1:/usr/local/hadoop/hadoop-2.6.0# jps
4549 Jps
4329 DataNode
4426 NodeManager
root@SparkWorker1:/usr/local/hadoop/hadoop-2.6.0#


  

root@SparkWorker2:/usr/local/hadoop/hadoop-2.6.0# pwd
/usr/local/hadoop/hadoop-2.6.0
root@SparkWorker2:/usr/local/hadoop/hadoop-2.6.0# jps
3386 NodeManager
3292 DataNode
3518 Jps
root@SparkWorker2:/usr/local/hadoop/hadoop-2.6.0#
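  Instead of logging in to each worker one by one, the same jps check can be run from the master in a single loop. This is just a convenience sketch (not part of the original steps); the full path to jps is used because /etc/profile is not sourced for non-interactive ssh commands:

for h in SparkWorker1 SparkWorker2; do
  echo "==== $h ===="
  # jps ships with the JDK installed under /usr/local/jdk in this guide
  ssh root@$h /usr/local/jdk/jdk1.8.0_60/bin/jps
done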


  Then have a look at the web UI to confirm everything is up.


V. Install Spark

    spark-1.5.2-bin-hadoop2.6.tgz ---------------------------------------------- /usr/local/spark/spark-1.5.2-bin-hadoop2.6

    1. Download Spark

http://mirror.bit.edu.cn/apache/spark/spark-1.5.2/
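    If you prefer to fetch it from the command line, a wget against that mirror would look roughly like this (a sketch; the mirror layout may have changed, so adjust the URL if needed):

# download straight into the shared software directory used in this guide
wget http://mirror.bit.edu.cn/apache/spark/spark-1.5.2/spark-1.5.2-bin-hadoop2.6.tgz -P /root/Downloads/Spark_Cluster_Software/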


    2. Upload spark-1.5.2-bin-hadoop2.6.tgz to the master


    Do the same on the other two machines!
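    One way to push the archive to the other two machines is a quick scp from the master (a sketch, assuming root SSH access to the workers and the same download directory on each of them):

scp /root/Downloads/Spark_Cluster_Software/spark-1.5.2-bin-hadoop2.6.tgz root@SparkWorker1:/root/Downloads/Spark_Cluster_Software/
scp /root/Downloads/Spark_Cluster_Software/spark-1.5.2-bin-hadoop2.6.tgz root@SparkWorker2:/root/Downloads/Spark_Cluster_Software/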

    3. Now create the spark directory under /usr/local


    mkdir -p /usr/local/spark
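    The same directory is needed on all three machines; a small ssh loop saves repeating it by hand (a sketch, assuming root SSH access to the workers):

for h in SparkWorker1 SparkWorker2; do
  # create the Spark install directory on each worker
  ssh root@$h "mkdir -p /usr/local/spark"
done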

  

  4. Copy the downloaded Spark archive into the newly created /usr/local/spark


  sudo cp /root/Downloads/Spark_Cluster_Software/spark-1.5.2-bin-hadoop2.6.tgz  /usr/local/spark


   Prefer cp here; don't casually mv the original file.

  5. Unpack the Spark archive


  tar -zxvf spark-1.5.2-bin-hadoop2.6.tgz

  6. Delete the archive, keeping only the extracted directory, and change its owning user and group.

  

root@SparkMaster:/usr/local/spark# ls
spark-1.5.2-bin-hadoop2.6 spark-1.5.2-bin-hadoop2.6.tgz
root@SparkMaster:/usr/local/spark# rm -rf spark-1.5.2-bin-hadoop2.6.tgz
root@SparkMaster:/usr/local/spark# ls
spark-1.5.2-bin-hadoop2.6
root@SparkMaster:/usr/local/spark# ll
total 12
drwxr-xr-x 3 root root 4096 9月 8 10:27 ./
drwxr-xr-x 14 root root 4096 9月 8 10:21 ../
drwxr-xr-x 12 500 500 4096 11月 4 2015 spark-1.5.2-bin-hadoop2.6/
root@SparkMaster:/usr/local/spark# chown -R root:root spark-1.5.2-bin-hadoop2.6/
root@SparkMaster:/usr/local/spark# ll
total 12
drwxr-xr-x 3 root root 4096 9月 8 10:27 ./
drwxr-xr-x 14 root root 4096 9月 8 10:21 ../
drwxr-xr-x 12 root root 4096 11月 4 2015 spark-1.5.2-bin-hadoop2.6/
root@SparkMaster:/usr/local/spark#


  7. Update the environment variables

    vim ~/.bash_profile   or   vim /etc/profile

    You can put the settings in ~/.bash_profile, or in the global file /etc/profile; either works.

    Here I edit /etc/profile with vim.


#spark
export SPARK_HOME=/usr/local/spark/spark-1.5.2-bin-hadoop2.6
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin


root@SparkMaster:/usr/local/spark# vim /etc/profile
root@SparkMaster:/usr/local/spark# source /etc/profile
root@SparkMaster:/usr/local/spark#
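  To double-check that the new variables took effect in the current shell, something like the following helps (a quick sketch; the expected path is the install directory used above):

echo $SPARK_HOME        # should print /usr/local/spark/spark-1.5.2-bin-hadoop2.6
which spark-submit      # should resolve to a path under $SPARK_HOME/bin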

  At this point the Spark installation itself is done.

Do the same on the other two machines!

  Configure Spark's configuration files

    From experience, it's easiest to prepare these files in Notepad++ and then upload them.

  spark-env.sh.template becomes spark-env.sh

  

#!/usr/bin/env bash

# This file is sourced when running various Spark programs.
# Copy it as spark-env.sh and edit that to configure Spark for your site.

# Options read when launching programs locally with
# ./bin/run-example or ./bin/spark-submit
# - HADOOP_CONF_DIR, to point Spark towards Hadoop configuration files
# - SPARK_LOCAL_IP, to set the IP address Spark binds to on this node
# - SPARK_PUBLIC_DNS, to set the public dns name of the driver program
# - SPARK_CLASSPATH, default classpath entries to append

# Options read by executors and drivers running inside the cluster
# - SPARK_LOCAL_IP, to set the IP address Spark binds to on this node
# - SPARK_PUBLIC_DNS, to set the public DNS name of the driver program
# - SPARK_CLASSPATH, default classpath entries to append
# - SPARK_LOCAL_DIRS, storage directories to use on this node for shuffle and RDD data
# - MESOS_NATIVE_JAVA_LIBRARY, to point to your libmesos.so if you use Mesos

# Options read in YARN client mode
# - HADOOP_CONF_DIR, to point Spark towards Hadoop configuration files
# - SPARK_EXECUTOR_INSTANCES, Number of workers to start (Default: 2)
# - SPARK_EXECUTOR_CORES, Number of cores for the workers (Default: 1).
# - SPARK_EXECUTOR_MEMORY, Memory per Worker (e.g. 1000M, 2G) (Default: 1G)
# - SPARK_DRIVER_MEMORY, Memory for Master (e.g. 1000M, 2G) (Default: 1G)
# - SPARK_YARN_APP_NAME, The name of your application (Default: Spark)
# - SPARK_YARN_QUEUE, The hadoop queue to use for allocation requests (Default: 'default')
# - SPARK_YARN_DIST_FILES, Comma separated list of files to be distributed with the job.
# - SPARK_YARN_DIST_ARCHIVES, Comma separated list of archives to be distributed with the job.

# Options for the daemons used in the standalone deploy mode
# - SPARK_MASTER_IP, to bind the master to a different IP address or hostname
# - SPARK_MASTER_PORT / SPARK_MASTER_WEBUI_PORT, to use non-default ports for the master
# - SPARK_MASTER_OPTS, to set config properties only for the master (e.g. "-Dx=y")
# - SPARK_WORKER_CORES, to set the number of cores to use on this machine
# - SPARK_WORKER_MEMORY, to set how much total memory workers have to give executors (e.g. 1000m, 2g)
# - SPARK_WORKER_PORT / SPARK_WORKER_WEBUI_PORT, to use non-default ports for the worker
# - SPARK_WORKER_INSTANCES, to set the number of worker processes per node
# - SPARK_WORKER_DIR, to set the working directory of worker processes
# - SPARK_WORKER_OPTS, to set config properties only for the worker (e.g. "-Dx=y")
# - SPARK_DAEMON_MEMORY, to allocate to the master, worker and history server themselves (default: 1g).
# - SPARK_HISTORY_OPTS, to set config properties only for the history server (e.g. "-Dx=y")
# - SPARK_SHUFFLE_OPTS, to set config properties only for the external shuffle service (e.g. "-Dx=y")
# - SPARK_DAEMON_JAVA_OPTS, to set config properties for all daemons (e.g. "-Dx=y")
# - SPARK_PUBLIC_DNS, to set the public dns name of the master or workers

# Generic options for the daemons used in the standalone deploy mode
# - SPARK_CONF_DIR Alternate conf dir. (Default: ${SPARK_HOME}/conf)
# - SPARK_LOG_DIR Where log files are stored. (Default: ${SPARK_HOME}/logs)
# - SPARK_PID_DIR Where the pid file is stored. (Default: /tmp)
# - SPARK_IDENT_STRING A string representing this instance of spark. (Default: $USER)
# - SPARK_NICENESS The scheduling priority for daemons. (Default: 0)

export JAVA_HOME=/usr/local/jdk/jdk1.8.0_60
export SCALA_HOME=/usr/local/scala/scala-2.10.4
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.6.0
export HADOOP_CONF_DIR=/usr/local/hadoop/hadoop-2.6.0/etc/hadoop
export SPARK_MASTER_IP=SparkMaster
export SPARK_WORKER_MEMORY=1G
export SPARK_MASTER_PORT=7077

With a 3-node setup we generally skip this configuration file; with 5 nodes it needs to be configured.

spark-defaults.conf.template becomes spark-defaults.conf

# Default system properties included when running spark-submit.
# This is useful for setting default environmental settings.

# Example:
# spark.master spark://master:7077
# spark.eventLog.enabled true
# spark.eventLog.dir hdfs://namenode:8021/directory
# spark.serializer org.apache.spark.serializer.KryoSerializer
# spark.driver.memory 5g
# spark.executor.extraJavaOptions -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"

spark.master spark://SparkMaster:7077
spark.eventLog.enabled true
spark.eventLog.dir hdfs://SparkMaster:9000/sparkHistoryLogs
spark.eventLog.compress true
spark.history.updateInterval 5
spark.history.ui.port 7777
spark.history.fs.logDirectory hdfs://SparkMaster:9000/sparkHistoryLogs
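If you do use these history settings, two extra steps are implied but not shown here: the HDFS directory they point at must exist before the first application logs events to it, and the history server has its own start script. A sketch of both:

# create the event-log directory in HDFS (run once, after HDFS is up)
hdfs dfs -mkdir -p /sparkHistoryLogs
# start the Spark history server on the master
$SPARK_HOME/sbin/start-history-server.sh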

Slave nodes (the slaves file)

slaves.template becomes slaves

SparkMaster
SparkWorker1
SparkWorker2

  Delete the original versions of these files on SparkMaster, SparkWorker1 and SparkWorker2.


  Here it's best to copy the templates (cp) rather than create brand-new files, because of the file permissions.

  

root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6# pwd
/usr/local/spark/spark-1.5.2-bin-hadoop2.6
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6# ls
bin CHANGES.txt conf data ec2 examples lib LICENSE licenses NOTICE python R README.md RELEASE sbin
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6# cd conf
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf# ls
docker.properties.template fairscheduler.xml.template log4j.properties.template metrics.properties.template slaves.template spark-defaults.conf.template spark-env.sh.template
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf# ll
total 40
drwxr-xr-x 2 root root 4096 11月 4 2015 ./
drwxr-xr-x 12 root root 4096 11月 4 2015 ../
-rw-r--r-- 1 root root 202 11月 4 2015 docker.properties.template
-rw-r--r-- 1 root root 303 11月 4 2015 fairscheduler.xml.template
-rw-r--r-- 1 root root 949 11月 4 2015 log4j.properties.template
-rw-r--r-- 1 root root 5886 11月 4 2015 metrics.properties.template
-rw-r--r-- 1 root root 80 11月 4 2015 slaves.template
-rw-r--r-- 1 root root 507 11月 4 2015 spark-defaults.conf.template
-rwxr-xr-x 1 root root 3418 11月 4 2015 spark-env.sh.template*
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf#


cp spark-env.sh.template spark-env.sh


root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf# ls
docker.properties.template fairscheduler.xml.template log4j.properties.template metrics.properties.template slaves.template spark-defaults.conf.template spark-env.sh.template
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf# cp spark-env.sh.template spark-env.sh
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf# ll
total 44
drwxr-xr-x 2 root root 4096 9月 8 11:48 ./
drwxr-xr-x 12 root root 4096 11月 4 2015 ../
-rw-r--r-- 1 root root 202 11月 4 2015 docker.properties.template
-rw-r--r-- 1 root root 303 11月 4 2015 fairscheduler.xml.template
-rw-r--r-- 1 root root 949 11月 4 2015 log4j.properties.template
-rw-r--r-- 1 root root 5886 11月 4 2015 metrics.properties.template
-rw-r--r-- 1 root root 80 11月 4 2015 slaves.template
-rw-r--r-- 1 root root 507 11月 4 2015 spark-defaults.conf.template
-rwxr-xr-x 1 root root 3418 9月 8 11:48 spark-env.sh*
-rwxr-xr-x 1 root root 3418 11月 4 2015 spark-env.sh.template*
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf# ls
docker.properties.template log4j.properties.template slaves.template spark-env.sh
fairscheduler.xml.template metrics.properties.template spark-defaults.conf.template spark-env.sh.template
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf# rm -rf spark-env.sh.template
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf# ls
docker.properties.template fairscheduler.xml.template log4j.properties.template metrics.properties.template slaves.template spark-defaults.conf.template spark-env.sh
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf# vim spark-env.sh
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf#


export SCALA_HOME=/usr/local/scala/scala-2.10.4
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.6.0
export HADOOP_CONF_DIR=/usr/local/hadoop/hadoop-2.6.0/etc/hadoop
export SPARK_MASTER_IP=SparkMaster
export SPARK_WORKER_MEMORY=512M


cp slaves.template slaves


SparkMaster
SparkWorker1
SparkWorker2


root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf# ls
docker.properties.template fairscheduler.xml.template log4j.properties.template metrics.properties.template slaves.template spark-defaults.conf.template spark-env.sh
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf# cp slaves.template slaves
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf# ls
docker.properties.template fairscheduler.xml.template log4j.properties.template metrics.properties.template slaves slaves.template spark-defaults.conf.template spark-env.sh
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf# rm -rf slaves.template
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf# ls
docker.properties.template fairscheduler.xml.template log4j.properties.template metrics.properties.template slaves spark-defaults.conf.template spark-env.sh
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf# vim slaves
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf# vim masters
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf# ls
docker.properties.template fairscheduler.xml.template log4j.properties.template masters metrics.properties.template slaves spark-defaults.conf.template spark-env.sh
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf#
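  The workers need the same spark-env.sh, slaves and masters files, so push them out from the master once they look right (a sketch, assuming the identical install path on every node):

scp /usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf/{spark-env.sh,slaves,masters} root@SparkWorker1:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf/
scp /usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf/{spark-env.sh,slaves,masters} root@SparkWorker2:/usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf/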

At this point the Spark configuration work is complete!


    VI. Start the clusters

    1. From the Hadoop installation directory, start the Hadoop cluster.

      Under /usr/local/hadoop/hadoop-2.6.0, run ./sbin/start-all.sh

    Or, from any directory, run $HADOOP_HOME/sbin/start-all.sh


root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0# pwd
/usr/local/hadoop/hadoop-2.6.0
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0# jps
4803 Jps
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0# ./sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [SparkMaster]
SparkMaster: starting namenode, logging to /usr/local/hadoop/hadoop-2.6.0/logs/hadoop-root-namenode-SparkMaster.out
SparkWorker2: starting datanode, logging to /usr/local/hadoop/hadoop-2.6.0/logs/hadoop-root-datanode-SparkWorker2.out
SparkWorker1: starting datanode, logging to /usr/local/hadoop/hadoop-2.6.0/logs/hadoop-root-datanode-SparkWorker1.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/hadoop-2.6.0/logs/hadoop-root-secondarynamenode-SparkMaster.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/hadoop-2.6.0/logs/yarn-root-resourcemanager-SparkMaster.out
SparkWorker1: starting nodemanager, logging to /usr/local/hadoop/hadoop-2.6.0/logs/yarn-root-nodemanager-SparkWorker1.out
SparkWorker2: starting nodemanager, logging to /usr/local/hadoop/hadoop-2.6.0/logs/yarn-root-nodemanager-SparkWorker2.out
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0# jps
5344 ResourceManager
5028 NameNode
5204 SecondaryNameNode
5437 Jps
root@SparkMaster:/usr/local/hadoop/hadoop-2.6.0#


root@SparkWorker1:/usr/local/hadoop/hadoop-2.6.0# pwd
/usr/local/hadoop/hadoop-2.6.0
root@SparkWorker1:/usr/local/hadoop/hadoop-2.6.0# jps
4399 Jps
root@SparkWorker1:/usr/local/hadoop/hadoop-2.6.0# jps
4801 Jps
4545 DataNode
4644 NodeManager
root@SparkWorker1:/usr/local/hadoop/hadoop-2.6.0#


root@SparkWorker2:/usr/local/hadoop/hadoop-2.6.0# pwd
/usr/local/hadoop/hadoop-2.6.0
root@SparkWorker2:/usr/local/hadoop/hadoop-2.6.0# jps
4510 Jps
root@SparkWorker2:/usr/local/hadoop/hadoop-2.6.0# jps
4656 DataNode
4912 Jps
4754 NodeManager
root@SparkWorker2:/usr/local/hadoop/hadoop-2.6.0#

    2. From the Spark installation directory, start the Spark cluster.

    Under /usr/local/spark/spark-1.5.2-bin-hadoop2.6, run ./sbin/start-all.sh

   Or, from any directory, run $SPARK_HOME/sbin/start-all.sh


root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6# pwd
/usr/local/spark/spark-1.5.2-bin-hadoop2.6
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6# jps
4534 ResourceManager
4761 Jps
4395 SecondaryNameNode
4221 NameNode
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6# ./sbin/start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /usr/local/spark/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-SparkMaster.out
failed to launch org.apache.spark.deploy.master.Master:
full log in /usr/local/spark/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-SparkMaster.out
SparkWorker2: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/spark/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-SparkWorker2.out
SparkWorker2: failed to launch org.apache.spark.deploy.worker.Worker:
SparkWorker1: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/spark/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-SparkWorker1.out
SparkWorker2: full log in /usr/local/spark/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-SparkWorker2.out
SparkWorker1: failed to launch org.apache.spark.deploy.worker.Worker:
SparkWorker1: full log in /usr/local/spark/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-SparkWorker1.out
SparkMaster: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/spark/spark-1.5.2-bin-hadoop2.6/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-SparkMaster.out
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6# jps
5109 Jps
4901 Master
4534 ResourceManager
4395 SecondaryNameNode
4221 NameNode
5071 Worker
root@SparkMaster:/usr/local/spark/spark-1.5.2-bin-hadoop2.6#


root@SparkWorker1:/usr/local/spark/spark-1.5.2-bin-hadoop2.6# pwd
/usr/local/spark/spark-1.5.2-bin-hadoop2.6
root@SparkWorker1:/usr/local/spark/spark-1.5.2-bin-hadoop2.6# jps
5728 DataNode
5827 NodeManager
5981 Jps
root@SparkWorker1:/usr/local/spark/spark-1.5.2-bin-hadoop2.6# jps
5728 DataNode
5827 NodeManager
6117 Jps
6078 Worker
root@SparkWorker1:/usr/local/spark/spark-1.5.2-bin-hadoop2.6#


root@SparkWorker2:/usr/local/spark/spark-1.5.2-bin-hadoop2.6# pwd
/usr/local/spark/spark-1.5.2-bin-hadoop2.6
root@SparkWorker2:/usr/local/spark/spark-1.5.2-bin-hadoop2.6# jps
4901 Jps
4663 DataNode
4761 NodeManager
root@SparkWorker2:/usr/local/spark/spark-1.5.2-bin-hadoop2.6# jps
5058 Jps
4663 DataNode
5015 Worker
4761 NodeManager
root@SparkWorker2:/usr/local/spark/spark-1.5.2-bin-hadoop2.6#

  As the jps output on each node shows, both the Hadoop startup and the Spark startup are normal: despite the "failed to launch" lines printed by start-all.sh, the Master and Worker processes are all up.

  VII. Check the web pages

  Hadoop HDFS web UI: visit http://SparkMaster:50070 (visible immediately after installation)

  Hadoop YARN web UI: visit http://SparkMaster:8088 (visible immediately after installation)

  Spark web UI: visit http://SparkMaster:8080 (visible immediately after installation)

  Spark shell web UI: visit http://SparkMaster:4040 (requires a running spark-shell)
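  If no desktop browser is handy, a quick reachability check from the command line is to curl each port and look for an HTTP 200 (a sketch):

for url in http://SparkMaster:50070 http://SparkMaster:8088 http://SparkMaster:8080; do
  echo -n "$url -> "
  curl -s -o /dev/null -w "%{http_code}\n" "$url"
done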


 

  We can also drop into Spark's Scala shell.

  From the Spark installation directory, run ./bin/spark-shell (note: no space in spark-shell).

  Or, from any directory, run $SPARK_HOME/bin/spark-shell --master spark://SparkMaster:7077 (again, no space after $SPARK_HOME).


   

  Launching ./bin/spark-shell directly here threw an error.

  I killed it with Ctrl+C and looked through a few blog posts, listed below:

  http://blog.csdn.net/ggz631047367/article/details/50185181

    http://bbs.csdn.net/topics/391860835

  

  The suggested fix was:

    Change SPARK_MASTER_IP=SparkMaster to SPARK_MASTER_IP=192.168.80.31 in spark-env.sh.
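  If you do apply that change, it is a one-line edit of spark-env.sh (a sketch; adjust the path if yours differs):

sed -i 's/^export SPARK_MASTER_IP=.*/export SPARK_MASTER_IP=192.168.80.31/' /usr/local/spark/spark-1.5.2-bin-hadoop2.6/conf/spark-env.sh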

  In my case, however, after a reboot and restarting the Hadoop and Spark clusters, everything worked fine without that change!

   From the Spark installation directory, run ./bin/spark-shell again (no space).


16/09/08 17:08:17 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
16/09/08 17:08:17 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
16/09/08 17:08:18 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/09/08 17:08:19 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/09/08 17:08:26 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
16/09/08 17:08:31 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/09/08 17:08:31 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/09/08 17:08:40 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/09/08 17:08:40 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/09/08 17:08:41 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
16/09/08 17:08:41 INFO metastore.ObjectStore: Initialized ObjectStore
16/09/08 17:08:41 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/09/08 17:08:42 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
16/09/08 17:08:44 INFO metastore.HiveMetaStore: Added admin role in metastore
16/09/08 17:08:44 INFO metastore.HiveMetaStore: Added public role in metastore
16/09/08 17:08:44 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
16/09/08 17:08:44 INFO metastore.HiveMetaStore: 0: get_all_databases
16/09/08 17:08:44 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_all_databases
16/09/08 17:08:44 INFO metastore.HiveMetaStore: 0: get_functions: db=default pat=*
16/09/08 17:08:44 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr cmd=get_functions: db=default pat=*
16/09/08 17:08:44 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
16/09/08 17:08:46 INFO session.SessionState: Created local directory: /tmp/48b17646-4ba9-4462-a48e-7f6a977f8849_resources
16/09/08 17:08:47 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/48b17646-4ba9-4462-a48e-7f6a977f8849
16/09/08 17:08:47 INFO session.SessionState: Created local directory: /tmp/root/48b17646-4ba9-4462-a48e-7f6a977f8849
16/09/08 17:08:47 INFO session.SessionState: Created HDFS directory: /tmp/hive/root/48b17646-4ba9-4462-a48e-7f6a977f8849/_tmp_space.db
16/09/08 17:08:47 INFO repl.SparkILoop: Created sql context (with Hive support)..
SQL context available as sqlContext.

scala> 9*9
res0: Int = 81

scala>

  Note for later: sc is an instance of SparkContext, generated automatically for us when the Spark shell starts. SparkContext is the channel through which code is submitted to the cluster (or run locally); whenever you write Spark code, whether it runs locally or on a cluster, you must have a SparkContext instance.
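  As a tiny sanity check that sc can really drive the cluster, a one-line job can be typed at the scala> prompt, or piped into spark-shell non-interactively from bash (a sketch; it should print 5050):

echo 'println(sc.parallelize(1 to 100).reduce(_ + _))' | $SPARK_HOME/bin/spark-shell --master spark://SparkMaster:7077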

  Now that we're inside the Spark shell, let's take a look at the Spark UI from the web side.

  VIII. Recap

  Hadoop HDFS web UI: visit http://SparkMaster:50070 (visible immediately after installation)

  Hadoop YARN web UI: visit http://SparkMaster:8088 (visible immediately after installation)

  Spark web UI: visit http://SparkMaster:8080 (visible immediately after installation)

  Spark shell web UI: visit http://SparkMaster:4040 (requires a running spark-shell)

  


Of course, these pages expose plenty of other information worth clicking through.



  

   That's it!

  Summary:

    This is really still just a beginner's walkthrough; for a deeper setup there is also Spark Standalone HA deployment to cover later.

    A few blogs I've bookmarked and recommend:

http://mmicky.blog.163.com/blog/#m=0

http://mmicky.blog.163.com/blog/static/15029015420143191440337/

http://mmicky.blog.163.com/blog/static/150290154201451233350614/

http://blog.csdn.net/book_mmicky/article/details/25714295

  

  
