[Big Data System Architect] 1.2 Big Data Fundamentals: Hadoop 2.X

Date: 2021-01-13 10:46:02

1. Hadoop environment setup

1.1 Pseudo-distributed environment

1.1.1 Pseudo-distributed setup

1.1.2 Pseudo-distributed setup results

HDFS web UI: http://od001:50070/dfshealth.html#tab-overview

YARN web UI: http://od001:8088/cluster

JobHistoryServer web UI: http://od001:19888/

SecondaryNameNode web UI: http://od001:50090/status.html

1.1.3 Batch start script

#!/bin/bash
echo "Starting namenode"
hadoop-daemon.sh start namenode
echo "Starting datanode"
hadoop-daemon.sh start datanode
echo "Starting resourcemanager"
yarn-daemon.sh start resourcemanager
echo "Starting nodemanager"
yarn-daemon.sh start nodemanager
echo "Starting historyserver"
mr-jobhistory-daemon.sh start historyserver
echo "Starting secondarynamenode"
hadoop-daemon.sh start secondarynamenode

1.1.4 Batch stop script

#!/bin/bash
echo "Stopping namenode"
hadoop-daemon.sh stop namenode
echo "Stopping datanode"
hadoop-daemon.sh stop datanode
echo "Stopping resourcemanager"
yarn-daemon.sh stop resourcemanager
echo "Stopping nodemanager"
yarn-daemon.sh stop nodemanager
echo "Stopping historyserver"
mr-jobhistory-daemon.sh stop historyserver
echo "Stopping secondarynamenode"
hadoop-daemon.sh stop secondarynamenode
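
The start and stop scripts differ only in the action passed to each launcher, so they could be folded into a single script. The following is a minimal sketch rather than part of the original setup: the function name run_daemons is hypothetical, and the daemon order is simply the one used in the two scripts for both actions.

```shell
#!/bin/bash
# Sketch: one script for both actions (run_daemons is a hypothetical name).

run_daemons() {
    action="$1"
    case "$action" in
        start|stop) ;;
        *) echo "usage: run_daemons start|stop" >&2; return 1 ;;
    esac

    # Each line of the list is "<launcher script> <daemon>", in the same
    # order as the separate start and stop scripts.
    while read -r launcher daemon; do
        echo "${action} ${daemon}"
        # </dev/null keeps the launcher from consuming the list on stdin.
        "$launcher" "$action" "$daemon" </dev/null
    done <<EOF
hadoop-daemon.sh namenode
hadoop-daemon.sh datanode
yarn-daemon.sh resourcemanager
yarn-daemon.sh nodemanager
mr-jobhistory-daemon.sh historyserver
hadoop-daemon.sh secondarynamenode
EOF
}

# Run only when an action is given on the command line.
if [ "$#" -gt 0 ]; then
    run_daemons "$1"
fi
```

Saved under a name of your choice, it would be invoked with start or stop as its only argument.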

1.2 Cluster environment

1.2.1 Cluster environment setup

1) Clone the virtual machine.

2) As root, update the network interface configuration:

vi /etc/udev/rules.d/70-persistent-net.rules

vi /etc/sysconfig/network-scripts/ifcfg-eth0
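
A cloned VM usually gets a new MAC address, so the interface settings have to be brought in line with the clone. As a rough sketch, assuming ifcfg-eth0 already contains HWADDR= and IPADDR= lines, that part of the edit could be scripted. The function name update_ifcfg and the MAC and IP values in the usage comment are hypothetical placeholders.

```shell
#!/bin/bash
# Sketch: rewrite the HWADDR and IPADDR lines of an ifcfg file.
# update_ifcfg is a hypothetical helper; it assumes both keys already exist.

update_ifcfg() {
    file="$1"   # e.g. /etc/sysconfig/network-scripts/ifcfg-eth0
    mac="$2"    # MAC address of the cloned VM's network card
    ip="$3"     # static IP address for this node

    tmp=$(mktemp) || return 1
    # Edit into a temporary file first, then copy the result back so the
    # original file keeps its owner and permissions.
    if sed -e "s/^HWADDR=.*/HWADDR=${mac}/" \
           -e "s/^IPADDR=.*/IPADDR=${ip}/" \
           "$file" > "$tmp"; then
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}

# Example (placeholder values; run as root):
# update_ifcfg /etc/sysconfig/network-scripts/ifcfg-eth0 00:0C:29:AA:BB:CC 192.168.1.103
```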

1.2.2 Resource planning

            od002              od003              od004
HDFS
            NameNode
            DataNode           DataNode           DataNode
                                                  SecondaryNameNode
YARN                           ResourceManager
            NodeManager        NodeManager        NodeManager
MapReduce
            JobHistoryServer
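
Once the cluster is running, this plan can double as a checklist. The sketch below assumes passwordless ssh to each host and that jps (shipped with the JDK) is on the remote PATH; expected_daemons and check_host are hypothetical helper names.

```shell
#!/bin/bash
# Sketch: compare the daemons reported by jps with the resource plan.

# Print the daemons the plan assigns to a host, one per line.
expected_daemons() {
    case "$1" in
        od002) printf '%s\n' NameNode DataNode NodeManager JobHistoryServer ;;
        od003) printf '%s\n' DataNode ResourceManager NodeManager ;;
        od004) printf '%s\n' DataNode SecondaryNameNode NodeManager ;;
        *) return 1 ;;
    esac
}

# Report every planned daemon that jps does not list on the host.
check_host() {
    host="$1"
    running=$(ssh "$host" jps </dev/null) || return 1
    missing=0
    for daemon in $(expected_daemons "$host"); do
        # jps prints "<pid> <main class>", so match the second field exactly;
        # this keeps NameNode from matching SecondaryNameNode.
        if ! echo "$running" | awk '{print $2}' | grep -qx "$daemon"; then
            echo "${host}: ${daemon} is not running"
            missing=1
        fi
    done
    return "$missing"
}

# Example:
# for h in od002 od003 od004; do check_host "$h"; done
```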
       

1.3.3 Configuration

HDFS

  hadoop-env.sh

  core-site.xml

  hdfs-site.xml

  slaves

YARN

  yarn-env.sh

  yarn-site.xml

  slaves

MapReduce

  mapred-env.sh

  mapred-site.xml
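
The role placement from the resource plan ends up in a handful of properties in these files. The sketch below prints property fragments for them and writes a slaves file. hadoop_property, write_slaves and print_plan_properties are hypothetical helper names; the property names are standard Hadoop 2.x settings; the NameNode RPC port 8020 is an assumption; ports 50090 and 19888 are the ones used by the web UI addresses in this document.

```shell
#!/bin/bash
# Sketch: emit configuration fragments matching the resource plan.

# Print one <property> element for a Hadoop *-site.xml file.
hadoop_property() {
    printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

# Write the worker hostnames, one per line, to <conf dir>/slaves.
write_slaves() {
    conf_dir="$1"; shift
    printf '%s\n' "$@" > "${conf_dir}/slaves"
}

# Print the properties that pin each master role to its planned host.
print_plan_properties() {
    echo "<!-- core-site.xml: NameNode on od002 (port 8020 is an assumption) -->"
    hadoop_property fs.defaultFS hdfs://od002:8020
    echo "<!-- hdfs-site.xml: SecondaryNameNode on od004 -->"
    hadoop_property dfs.namenode.secondary.http-address od004:50090
    echo "<!-- yarn-site.xml: ResourceManager on od003 -->"
    hadoop_property yarn.resourcemanager.hostname od003
    echo "<!-- mapred-site.xml: JobHistoryServer web UI on od002 -->"
    hadoop_property mapreduce.jobhistory.webapp.address od002:19888
}

# Example:
# print_plan_properties
# write_slaves /opt/modules/hadoop-2.5.-cdh5.3.6/etc/hadoop od002 od003 od004
```

Each printed fragment belongs inside the configuration element of the file named in the comment above it.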

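1.3.4 Configuration sync

1) On od002, od003, and od004, run ssh-keygen -t rsa on each host to generate a public/private key pair.

2) On each server, from the .ssh directory, run

ssh-copy-id od002

ssh-copy-id od003

ssh-copy-id od004

to set up passwordless SSH login.

3) Use scp to sync the configuration files: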

scp -r ./hadoop-2.5.-cdh5.3.6/etc/ od003:/opt/modules/hadoop-2.5.-cdh5.3.6

scp -r ./hadoop-2.5.-cdh5.3.6/etc/ od004:/opt/modules/hadoop-2.5.-cdh5.3.6
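
Steps 2) and 3) repeat one command per host, so they can also be written as loops. This is only a sketch: copy_keys and sync_conf are hypothetical names, and the paths in the usage comment are the ones from the scp commands in step 3).

```shell
#!/bin/bash
# Sketch: loop over the cluster hosts instead of repeating each command.

# Install this machine's public key on every host given as an argument.
copy_keys() {
    for host in "$@"; do
        ssh-copy-id "$host"
    done
}

# Copy a local directory to the same destination path on each host.
sync_conf() {
    src="$1"; dest="$2"; shift 2
    for host in "$@"; do
        scp -r "$src" "${host}:${dest}"
    done
}

# Example, mirroring the commands in steps 2) and 3):
# copy_keys od002 od003 od004
# sync_conf ./hadoop-2.5.-cdh5.3.6/etc/ /opt/modules/hadoop-2.5.-cdh5.3.6 od003 od004
```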

1.3.5 Startup commands

1) On od002, run start-dfs.sh:

Starting namenodes on [od002]
od002: starting namenode, logging to /opt/modules/hadoop-2.5.-cdh5.3.6/logs/hadoop-od-namenode-od002.out
od004: starting datanode, logging to /opt/modules/hadoop-2.5.-cdh5.3.6/logs/hadoop-od-datanode-od004.out
od003: starting datanode, logging to /opt/modules/hadoop-2.5.-cdh5.3.6/logs/hadoop-od-datanode-od003.out
od002: starting datanode, logging to /opt/modules/hadoop-2.5.-cdh5.3.6/logs/hadoop-od-datanode-od002.out
Starting secondary namenodes [od004]
od004: starting secondarynamenode, logging to /opt/modules/hadoop-2.5.-cdh5.3.6/logs/hadoop-od-secondarynamenode-od004.out

2) On od003, run start-yarn.sh:

starting yarn daemons
starting resourcemanager, logging to /opt/modules/hadoop-2.5.-cdh5.3.6/logs/yarn-od-resourcemanager-od003.out
od004: starting nodemanager, logging to /opt/modules/hadoop-2.5.-cdh5.3.6/logs/yarn-od-nodemanager-od004.out
od002: starting nodemanager, logging to /opt/modules/hadoop-2.5.-cdh5.3.6/logs/yarn-od-nodemanager-od002.out
od003: starting nodemanager, logging to /opt/modules/hadoop-2.5.-cdh5.3.6/logs/yarn-od-nodemanager-od003.out

3) Verify the environment by running the bundled wordcount example:

yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.-cdh5.3.6.jar wordcount input output001
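
The wordcount job reads from an input directory in HDFS and fails if the output directory already exists. A minimal sketch of the whole check follows, assuming it is run from the Hadoop installation directory. Using the bundled XML configuration files as sample input is an assumption, as is the function name run_wordcount; part-r-00000 is the usual name of the first reducer output file.

```shell
#!/bin/bash
# Sketch: stage sample input, run wordcount, and show the first results.

run_wordcount() {
    output="${1:-output001}"   # must not exist in HDFS yet

    # Create the input directory and upload some text to count.
    hdfs dfs -mkdir -p input &&
    hdfs dfs -put etc/hadoop/*.xml input &&
    # Same job as the yarn jar command in step 3), with a configurable output.
    yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.-cdh5.3.6.jar \
        wordcount input "$output" &&
    # Print the first lines of the reducer output.
    hdfs dfs -cat "${output}/part-r-00000" | head -n 20
}

# Example:
# run_wordcount output001
```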

1.3.6 Benchmark testing

1) Basic tests: the services start, are reachable, and can run a simple application.

HDFS read/write operations

2)
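
The HDFS read/write check from step 1) could be sketched as a simple round trip: write a file, read it back, and compare. This is an illustration under assumptions, not a benchmark; hdfs_roundtrip and the default file name are hypothetical, and the path is relative, so it resolves against the current user's HDFS home directory.

```shell
#!/bin/bash
# Sketch: write a file to HDFS, read it back, and compare with the original.

hdfs_roundtrip() {
    remote="${1:-roundtrip-check-$$.txt}"   # hypothetical HDFS test file
    local_src=$(mktemp) || return 1
    local_back=$(mktemp) || return 1
    echo "hdfs read/write check $(date)" > "$local_src"

    status=1
    if hdfs dfs -put "$local_src" "$remote" &&
       hdfs dfs -cat "$remote" > "$local_back" &&
       cmp -s "$local_src" "$local_back"; then
        echo "HDFS read/write OK"
        status=0
    else
        echo "HDFS read/write FAILED" >&2
    fi

    # Clean up the remote and local test files.
    hdfs dfs -rm "$remote" >/dev/null 2>&1 || true
    rm -f "$local_src" "$local_back"
    return "$status"
}

# Example:
# hdfs_roundtrip
```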

1.3.9 Cluster setup results

HDFS web UI: http://od002:50070/dfshealth.html#tab-overview

YARN web UI: http://od003:8088/cluster

JobHistoryServer web UI: http://od002:19888/

SecondaryNameNode web UI: http://od004:50090/status.html
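
The four cluster addresses can also be probed from the command line. The sketch below assumes curl is installed and treats any final status other than 200 as a failure; check_ui and check_cluster_uis are hypothetical names.

```shell
#!/bin/bash
# Sketch: report the HTTP status of each cluster web UI.

check_ui() {
    name="$1"; url="$2"
    # -s: quiet, -L: follow redirects, -o /dev/null: drop the body,
    # -w: print only the final status code.
    code=$(curl -s -L -o /dev/null -w '%{http_code}' --max-time 5 "$url")
    echo "${name}: HTTP ${code}"
    [ "$code" = "200" ]
}

check_cluster_uis() {
    failed=0
    check_ui "HDFS" "http://od002:50070/dfshealth.html" || failed=1
    check_ui "YARN" "http://od003:8088/cluster" || failed=1
    check_ui "JobHistoryServer" "http://od002:19888/" || failed=1
    check_ui "SecondaryNameNode" "http://od004:50090/status.html" || failed=1
    return "$failed"
}

# Example:
# check_cluster_uis
```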