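In "Learn Hadoop with Me (2): Building Hadoop from Source" we built a hadoop-xxx.tar.gz package suited to the local Linux machine.
Unpack it into the installation directory.
Then configure it as follows.
Environment variables in /etc/profile (HADOOP_HOME must already point to the unpacked installation directory):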
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/lib
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native/"
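After reloading /etc/profile it can help to confirm that the variables are actually visible in the current shell. The following is a minimal sketch; the helper name `missing_vars` is made up for this illustration and is not part of Hadoop.

```shell
# Hypothetical helper: print the name of every listed variable
# that is unset or empty in the current shell.
missing_vars() {
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}

missing=$(missing_vars HADOOP_HOME HADOOP_CONF_DIR HADOOP_COMMON_LIB_NATIVE_DIR)
if [ -n "$missing" ]; then
  echo "not set:" $missing
else
  echo "Hadoop environment variables are set"
fi
```

If anything is reported as not set, run `source /etc/profile` or open a new login shell before continuing.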
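Configure JAVA_HOME for Hadoop
Setting JAVA_HOME in /etc/profile alone is not enough, because the Hadoop scripts may not pick it up when daemons are launched over ssh.
Edit $HADOOP_HOME/etc/hadoop/hadoop-env.sh and add the JAVA_HOME variable at the top of the script:
export JAVA_HOME=[your JAVA_HOME]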
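The edit can also be scripted. This is only a sketch under the assumption that a plain prepend is acceptable; `prepend_java_home` is a hypothetical helper, and the path passed to it is whatever your own JAVA_HOME is.

```shell
# Hypothetical helper: put an "export JAVA_HOME=..." line at the very top
# of an env script while keeping the rest of the file unchanged.
prepend_java_home() {
  env_file=$1
  java_home=$2
  tmp_file="$env_file.tmp"
  {
    printf 'export JAVA_HOME=%s\n' "$java_home"
    cat "$env_file"
  } > "$tmp_file" &&
    # Write back into the original file so its permissions are preserved.
    cat "$tmp_file" > "$env_file" &&
    rm -f "$tmp_file"
}

# Example call (not executed here):
# prepend_java_home "$HADOOP_HOME/etc/hadoop/hadoop-env.sh" "$JAVA_HOME"
```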
1. etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop01:9000</value>
<description>NameNode URI. value: hdfs://host:port/</description>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/root/hdp/tmp</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>4096</value>
<description>Size of read/write buffer used in SequenceFiles. Suggested value for clusters: 131072.</description>
</property>
</configuration>
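A typo in a site file is easy to miss, so it can be useful to read a property back after editing. The sketch below uses a hypothetical `get_prop` helper built on awk; it assumes the simple layout used in this article, where each `<name>` and `<value>` element sits on one line. Once Hadoop is installed, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves.

```shell
# Hypothetical helper: print the <value> that follows a given <name>
# in a Hadoop *-site.xml file. Usage: get_prop FILE KEY
get_prop() {
  awk -v key="$2" '
    /<name>/ {
      n = $0
      sub(/.*<name>/, "", n)
      sub(/<\/name>.*/, "", n)
      found = (n == key)          # remember whether this is the wanted property
    }
    found && /<value>/ {
      v = $0
      sub(/.*<value>/, "", v)
      sub(/<\/value>.*/, "", v)
      print v
      exit
    }
  ' "$1"
}

# Example call (not executed here):
# get_prop "$HADOOP_CONF_DIR/core-site.xml" fs.defaultFS
```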
2. etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/root/data/hdp/nameNodeDir</value>
<description>Path on the local filesystem where the NameNode stores the namespace and transactions logs persistently.</description>
</property>
<property>
<name>dfs.blocksize</name>
<value>268435456</value>
<description>HDFS blocksize of 256MB for large file-systems.</description>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/root/data/hdp/dataNodeDir</value>
<description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>
</configuration>
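The dfs.blocksize value is given in bytes. As a quick check of the arithmetic, the sketch below recomputes 256 MB and shows how many blocks a file of a given size would occupy; `blocks_for` is a hypothetical helper for illustration only.

```shell
# 256 MB expressed in bytes, matching the dfs.blocksize value above.
block_size=$((256 * 1024 * 1024))
echo "dfs.blocksize = $block_size"

# Hypothetical helper: number of HDFS blocks needed for a file,
# i.e. file size divided by block size, rounded up.
# Usage: blocks_for FILE_SIZE_BYTES BLOCK_SIZE_BYTES
blocks_for() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# A 1 GiB file with 256 MB blocks.
blocks_for $((1024 * 1024 * 1024)) "$block_size"
```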
3. etc/hadoop/yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop01</value>
<description>ResourceManager host.</description>
</property>
<property>
<description>A valid service name may only contain a-zA-Z0-9_ and cannot start with a number.</description>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
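The naming rule in the description above can be checked mechanically. This is a small sketch of that rule as a regular expression; `valid_aux_service` is a hypothetical helper, not a Hadoop command.

```shell
# Hypothetical helper: succeed only if the name contains nothing but
# a-zA-Z0-9_ and does not start with a digit.
valid_aux_service() {
  printf '%s\n' "$1" | grep -Eq '^[A-Za-z_][A-Za-z0-9_]*$'
}

for name in mapreduce_shuffle mapreduce.shuffle 2shuffle; do
  if valid_aux_service "$name"; then
    echo "$name: ok"
  else
    echo "$name: invalid"
  fi
done
```

Here mapreduce.shuffle is rejected because of the dot, and 2shuffle because it starts with a digit.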
4. etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>Execution framework set to Hadoop YARN.</description>
</property>
</configuration>
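In Hadoop 2.x builds this file often exists only as mapred-site.xml.template, so it may need to be created before it can be edited. The sketch below assumes that layout; `ensure_mapred_site` is a hypothetical helper, and it does nothing if the file already exists or no template is present.

```shell
# Hypothetical helper: create mapred-site.xml from the shipped template
# when the file is missing. Usage: ensure_mapred_site CONF_DIR
ensure_mapred_site() {
  conf_dir=$1
  if [ ! -f "$conf_dir/mapred-site.xml" ] &&
     [ -f "$conf_dir/mapred-site.xml.template" ]; then
    cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
  fi
}

# Example call (not executed here):
# ensure_mapred_site "$HADOOP_HOME/etc/hadoop"
```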
5. Start Hadoop
5.1 Format the NameNode (required only before the first use)
$HADOOP_HOME/bin/hdfs namenode -format
5.2 Start the NameNode
hadoop-daemon.sh --script hdfs start namenode
5.3 Start a DataNode on every slave node
hadoop-daemon.sh --script hdfs start datanode
5.4 Start the YARN ResourceManager
yarn-daemon.sh start resourcemanager
5.5 Start a YARN NodeManager on every slave node
yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager
5.6 Start a standalone WebAppProxy server. If multiple servers are used with load balancing, run it on each of them:
yarn-daemon.sh start proxyserver
5.7 Start the MapReduce JobHistory Server
mr-jobhistory-daemon.sh start historyserver
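For a single machine that plays every role, the steps above can be collected into one script. The following is a sketch, not a replacement for the official scripts: `run`, `needs_format` and `start_single_node` are hypothetical helpers, the optional WebAppProxy step is left out, and by default the commands are only printed (DRY_RUN=1). The format guard relies on a formatted NameNode directory containing current/VERSION.

```shell
# Hypothetical helper: print the command when DRY_RUN=1 (the default),
# otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: succeed when the NameNode directory has not been
# formatted yet (no current/VERSION file inside it).
needs_format() {
  [ ! -f "$1/current/VERSION" ]
}

# Usage: start_single_node NAME_DIR
# NAME_DIR is the dfs.namenode.name.dir value from hdfs-site.xml.
start_single_node() {
  name_dir=$1
  if needs_format "$name_dir"; then
    run hdfs namenode -format
  fi
  run hadoop-daemon.sh --script hdfs start namenode
  run hadoop-daemon.sh --script hdfs start datanode
  run yarn-daemon.sh start resourcemanager
  run yarn-daemon.sh start nodemanager
  run mr-jobhistory-daemon.sh start historyserver
}

# Example call (not executed here):
# start_single_node /root/data/hdp/nameNodeDir
```

Guarding the format step matters because reformatting an existing NameNode directory discards the filesystem metadata stored there.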
6. Stop Hadoop
Stop the NameNode with the following command, run on the designated NameNode:
$ hadoop-daemon.sh --script hdfs stop namenode
Run a script to stop DataNodes on all slaves:
$ hadoop-daemon.sh --script hdfs stop datanode
Stop the ResourceManager with the following command, run on the designated ResourceManager:
$ yarn-daemon.sh stop resourcemanager
Run a script to stop NodeManagers on all slaves:
$ yarn-daemon.sh stop nodemanager
Stop the WebAppProxy server. If multiple servers are used with load balancing it should be run on each of them:
$ yarn-daemon.sh stop proxyserver
Stop the MapReduce JobHistory Server with the following command, run on the designated server:
$ mr-jobhistory-daemon.sh stop historyserver
====
Fully automated scripts: $HADOOP_HOME/sbin/start-all.sh and stop-all.sh
These are now deprecated; use start-dfs.sh and start-yarn.sh instead (with stop-dfs.sh and stop-yarn.sh to shut down).
Web access addresses====
Once the Hadoop cluster is up and running check the web-ui of the components as described below:
Daemon                        Web Interface          Notes
NameNode                      http://nn_host:port/   Default HTTP port is 50070.
ResourceManager               http://rm_host:port/   Default HTTP port is 8088.
MapReduce JobHistory Server   http://jhs_host:port/  Default HTTP port is 19888.
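To make the table concrete, the sketch below builds the three URLs for a given host using the default ports listed above. `web_ui_url` is a hypothetical helper, and the ports will differ if they were changed in the site files.

```shell
# Hypothetical helper: print the web UI address of a daemon on a host,
# using the default HTTP ports from the table above.
# Usage: web_ui_url DAEMON HOST
web_ui_url() {
  case $1 in
    namenode)        port=50070 ;;
    resourcemanager) port=8088 ;;
    jobhistory)      port=19888 ;;
    *) echo "unknown daemon: $1" >&2; return 1 ;;
  esac
  echo "http://$2:$port/"
}

# hadoop01 is the host name used in the configuration files above.
for daemon in namenode resourcemanager jobhistory; do
  web_ui_url "$daemon" hadoop01
done
```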