Setting up a Hadoop Pseudo-Distributed Cluster on CentOS 6.5

Setting up the pseudo-distributed cluster:
  1. Passwordless SSH login
    a. Create the .ssh directory in the home directory and set its permissions to 700
      $>mkdir ~/.ssh
      $>chmod 700 ~/.ssh
    b. Generate a key pair
      $>ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
    c. Append the public key to the authorized keys file and set authorized_keys to 600
      $>cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
      $>chmod 600 ~/.ssh/authorized_keys
    d. Verify that passwordless login to the local machine works
      $>ssh localhost
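      A quick way to confirm the login is truly passwordless (a suggested check, not part of the original steps; BatchMode makes ssh fail instead of prompting for a password):
      $>ssh -o BatchMode=yes localhost 'echo passwordless login OK'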
  2. Install the JDK
    a. Preparation: put the JDK tarball (jdk**) in the ~/soft directory.
      Make sure no JDK is already installed on the system (check with rpm -qa|grep jdk);
      if one is found, remove it first (rpm -e --nodeps jdk**).
    b. Unpack the JDK (the tarball name follows the jdk** placeholder above)
      $>tar -zxf jdk*.tar.gz
    c. Create a symlink to the JDK
      $>ln -s jdk1.8.0_171/ jdk
    d. Configure environment variables in ~/.bash_profile
      $>vim ~/.bash_profile
      Add:
# jdk install
export JAVA_HOME=/home/hadoop/soft/jdk
export PATH=$PATH:$JAVA_HOME/bin
    e. Reload the profile so the changes take effect
      $>source ~/.bash_profile
    f. Verify that Java is configured correctly
      $>java -version
      $>javac -version
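      If the configuration took effect, both commands print the installed version; for this JDK the first lines of output should look roughly like (exact build strings may vary):
      java version "1.8.0_171"
      javac 1.8.0_171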
  3. Install Hadoop
    a. Preparation: put the Hadoop tarball in the ~/soft directory
    b. Unpack Hadoop and create a symlink
      $>tar -zxf hadoop-2.7.3.tar.gz
      $>ln -s hadoop-2.7.3/ hadoop
    c. Configure the Hadoop environment variables
      $>vim ~/.bash_profile
      Add:
# hadoop install
export HADOOP_HOME=/home/hadoop/soft/hadoop
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
    d. Reload the profile so the changes take effect
      $>source ~/.bash_profile
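      As a quick sanity check (a suggestion, not in the original steps), hadoop should now resolve from PATH and report its version:
      $>hadoop version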
    e. Configure the files under hadoop/etc/hadoop
      1) hadoop-env.sh
        $>vim hadoop/etc/hadoop/hadoop-env.sh
        Comment out the existing JAVA_HOME line and set JAVA_HOME explicitly:
        #export JAVA_HOME=${JAVA_HOME}
        export JAVA_HOME=/home/hadoop/soft/jdk
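      Equivalently, the same edit can be scripted instead of done in vim (a sketch assuming the default hadoop-env.sh line and the paths above):
      $>sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/home/hadoop/soft/jdk|' hadoop/etc/hadoop/hadoop-env.sh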
      2) core-site.xml
        $>vim hadoop/etc/hadoop/core-site.xml
        Add:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:9000</value>
  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hadoop/tmp/hadoop</value>
  <description>A base for other temporary directories.</description>
</property>
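      Hadoop creates hadoop.tmp.dir on demand, but creating it up front (a suggestion, not in the original steps) makes permission problems easier to catch:
      $>mkdir -p /home/hadoop/tmp/hadoop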
      3) hdfs-site.xml
        $>vim hadoop/etc/hadoop/hdfs-site.xml
        Add:
<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///home/hadoop/tmp/hadoop/dfs/name,file:///home/hadoop/tmp/hadoop/dfs/name1</value>
  <description>Determines where on the local filesystem the DFS name node
      should store the name table(fsimage).  If this is a comma-delimited list
      of directories then the name table is replicated in all of the
      directories, for redundancy. </description>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///home/hadoop/tmp/hadoop/dfs/data,file:///home/hadoop/tmp/hadoop/dfs/data1</value>
  <description>Determines where on the local filesystem a DFS data node
  should store its blocks.  If this is a comma-delimited
  list of directories, then data will be stored in all named
  directories, typically on different devices. The directories should be tagged
  with corresponding storage types ([SSD]/[DISK]/[ARCHIVE]/[RAM_DISK]) for HDFS
  storage policies. The default storage type will be DISK if the directory does
  not have a storage type tagged explicitly. Directories that do not exist will
  be created if local filesystem permission allows.
  </description>
</property>
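      Once hdfs-site.xml is saved, the effective values can be double-checked from the command line (a suggested check; getconf is part of the stock hdfs CLI):
      $>hdfs getconf -confKey dfs.replication
      $>hdfs getconf -confKey dfs.namenode.name.dir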
      4) mapred-site.xml
        $>cp hadoop/etc/hadoop/mapred-site.xml.template hadoop/etc/hadoop/mapred-site.xml
        $>vim hadoop/etc/hadoop/mapred-site.xml
        Add:
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>
      5) yarn-site.xml
        $>vim hadoop/etc/hadoop/yarn-site.xml
        Add:
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>
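      Before moving on, it is worth confirming that all edited config files are well-formed XML (a suggested check that assumes xmllint from libxml2 is installed):
      $>xmllint --noout hadoop/etc/hadoop/*-site.xml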
    f. Format the NameNode (only needed the first time; do not reformat on later startups)
      $>hdfs namenode -format
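      On success the format log typically reports the name directory as successfully formatted, and a VERSION file appears under the path configured in dfs.namenode.name.dir above:
      $>cat /home/hadoop/tmp/hadoop/dfs/name/current/VERSION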
    g. Verify
      Start Hadoop; start-dfs.sh brings up the HDFS daemons checked below (start-yarn.sh additionally starts the ResourceManager and NodeManager):
      $>start-dfs.sh
      Check the running processes:
      $>jps
      If the output includes NameNode, DataNode, and SecondaryNameNode, the pseudo-distributed Hadoop setup succeeded.
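      As a final smoke test (a suggestion, not in the original steps), create a directory in HDFS and list it; the Hadoop 2.x NameNode web UI is also reachable at http://localhost:50070:
      $>hdfs dfs -mkdir -p /user/hadoop
      $>hdfs dfs -ls /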