Learn Hadoop with Me (3): A Minimal Hadoop Cluster Deployment

Posted: 2022-08-10 04:51:20

In Learn Hadoop with Me (2): Building Hadoop from Source, we produced a hadoop-xxx.tar.gz package built for our local Linux environment.
Unpack it into the installation directory and configure it as follows.

Environment variables in /etc/profile (HADOOP_HOME below is an example path; point it at wherever you unpacked the tarball):

export HADOOP_HOME=/opt/hadoop   # example path; set to your actual unpack location
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/lib
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native/"
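
After editing /etc/profile, reload it and confirm that the hadoop command resolves; a quick sanity check (assuming HADOOP_HOME points at your unpacked directory) might look like this:

source /etc/profile
hadoop version   # should print the version built in part (2)
which hadoop     # should resolve to $HADOOP_HOME/bin/hadoop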
Configure JAVA_HOME for Hadoop

Setting JAVA_HOME in /etc/profile alone is not enough.
Edit $HADOOP_HOME/etc/hadoop/hadoop-env.sh and add the JAVA_HOME variable near the top of the script:

export JAVA_HOME=[your JAVA_HOME]
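
If you are not sure what value to use, one common way to discover the JDK root on Linux (assuming java is on the PATH) is:

dirname $(dirname $(readlink -f $(which java)))   # prints a candidate JAVA_HOME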

1. etc/hadoop/core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop01:9000</value>
    <description>NameNode URI, in the form hdfs://host:port/</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/root/hdp/tmp</value>
    <description>Base directory for Hadoop's local temporary files.</description>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>4096</value>
    <description>Size of read/write buffer used in SequenceFiles (default: 131072).</description>
  </property>
</configuration>

2. etc/hadoop/hdfs-site.xml

<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/root/data/hdp/nameNodeDir</value>
    <description>Path on the local filesystem where the NameNode stores the namespace and transaction logs persistently.</description>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>268435456</value>
    <description>HDFS block size of 256MB for large file systems.</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/root/data/hdp/dataNodeDir</value>
    <description>Comma-separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
  </property>
</configuration>
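
The three local paths referenced above (hadoop.tmp.dir, dfs.namenode.name.dir and dfs.datanode.data.dir) can be created up front to avoid permission surprises later; this assumes the daemons run as root, as the /root paths imply:

mkdir -p /root/hdp/tmp /root/data/hdp/nameNodeDir /root/data/hdp/dataNodeDir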

3. etc/hadoop/yarn-site.xml

<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop01</value>
    <description>ResourceManager host.</description>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    <description>Auxiliary service for the MapReduce shuffle; a valid service name may only contain a-zA-Z0-9_ and cannot start with a number.</description>
  </property>
</configuration>

4. etc/hadoop/mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <description>Execution framework set to Hadoop YARN.</description>
  </property>
</configuration>

5. Start Hadoop
5.1 Format the NameNode (only required before the first start)

$HADOOP_HOME/bin/hdfs namenode -format
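
If the format succeeds, the log ends with a "successfully formatted" message and the configured name directory now holds namespace metadata; a quick check:

ls /root/data/hdp/nameNodeDir/current/VERSION   # should exist after a successful format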

5.2 Start the NameNode

hadoop-daemon.sh  --script hdfs start namenode

5.3 Start a DataNode on every slave node

hadoop-daemon.sh  --script hdfs start datanode

5.4 Start the YARN ResourceManager

yarn-daemon.sh start resourcemanager

5.5 Start a YARN NodeManager on every slave node

yarn-daemon.sh start nodemanager

5.6 Start a standalone WebAppProxy server. If multiple servers are used with load balancing, run this on each of them. (If yarn.web-proxy.address is not configured, the proxy runs embedded in the ResourceManager and this step can be skipped.)

yarn-daemon.sh start proxyserver

5.7 Start the MapReduce JobHistory Server

mr-jobhistory-daemon.sh start historyserver
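
At this point jps should list one JVM per daemon started above; on a single-node deployment that means NameNode, DataNode, ResourceManager, NodeManager and JobHistoryServer (plus WebAppProxyServer if it was started standalone):

jps   # process IDs will differ; check that each expected daemon name appears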

6. Stop Hadoop
Stop the NameNode with the following command, run on the designated NameNode:

$ hadoop-daemon.sh --script hdfs stop namenode

Run a script to stop DataNodes on all slaves:

$ hadoop-daemon.sh --script hdfs stop datanode

Stop the ResourceManager with the following command, run on the designated ResourceManager:

$ yarn-daemon.sh  stop resourcemanager

Run a script to stop NodeManagers on all slaves:

$ yarn-daemon.sh  stop nodemanager

Stop the WebAppProxy server. If multiple servers are used with load balancing it should be run on each of them:

$ yarn-daemon.sh stop proxyserver 

Stop the MapReduce JobHistory Server with the following command, run on the designated server:

$ mr-jobhistory-daemon.sh stop historyserver 

====

All-in-one scripts: start-all.sh and stop-all.sh (in $HADOOP_HOME/sbin) start or stop everything at once,
but they are deprecated; start-dfs.sh and start-yarn.sh (with the matching stop-dfs.sh and stop-yarn.sh) are recommended instead.
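
A typical round trip with the recommended scripts looks like the sketch below; they read the slaves (or workers) file in $HADOOP_CONF_DIR and require passwordless SSH to those nodes:

$HADOOP_HOME/sbin/start-dfs.sh    # NameNode plus all DataNodes listed in the slaves file
$HADOOP_HOME/sbin/start-yarn.sh   # ResourceManager plus all NodeManagers
$HADOOP_HOME/sbin/stop-yarn.sh    # shut down in the reverse order
$HADOOP_HOME/sbin/stop-dfs.sh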

Web UI addresses ====
Once the Hadoop cluster is up and running, check the web UI of each component as listed below:

Daemon                         Web Interface           Notes
NameNode                       http://nn_host:port/    Default HTTP port is 50070.
ResourceManager                http://rm_host:port/    Default HTTP port is 8088.
MapReduce JobHistory Server    http://jhs_host:port/   Default HTTP port is 19888.
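
Once the web UIs respond, a quick smoke test confirms that HDFS and YARN work end to end; this sketch assumes the examples jar bundled with your build under share/hadoop/mapreduce:

hdfs dfs -mkdir -p /tmp/smoke                   # create a test directory in HDFS
hdfs dfs -put /etc/hosts /tmp/smoke/            # upload a small file
hdfs dfs -cat /tmp/smoke/hosts                  # read it back
yarn jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 2 10   # run a trivial MapReduce job on YARN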