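Let's get straight to the point.
My cluster consists of bigdatamaster (192.168.80.10), bigdataslave1 (192.168.80.11), and bigdataslave2 (192.168.80.12).
Everything is installed under /home/hadoop/app.
The official recommendation is to install Hue on the master machine, and I do the same here: Hue is installed on bigdatamaster.
Hue version: hue-3.9.0-cdh5.5.4
It must be compiled before use (Internet access is required).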
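A word to readers: if your machine is powerful enough, do install Cloudera Manager. After all, Hue and Cloudera Manager come from the same family.
That said, from my own experience, some component versions cause problems during installation that can take most of a day to troubleshoot, so be prepared. I am currently a graduate student and my laptop tops out at 8 GB of RAM, so I can only practice with a manual installation.
This post is written purely for students around me who have limited means and no high-end hardware. I have, however, already set up both Cloudera Manager and Ambari on the lab's machine cluster.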
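The two most mainstream cluster management tools in big data: Ambari and Cloudera Manager
Installing and deploying a big data cluster with Cloudera (illustrated, in five major steps) (strongly recommended by the author)
Installing and deploying a big data cluster with Ambari (illustrated, in five major steps) (strongly recommended by the author)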
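1. The default configuration file
2. The configuration file matching my cluster (how to configure Hue's yarn_clusters section for a non-HA cluster)
My final non-HA configuration is as follows: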
# Configuration for YARN (MR2)
# ------------------------------------------------------------------------
[[yarn_clusters]]
[[[default]]]
# Enter the host on which you are running the ResourceManager
resourcemanager_host=bigdatamaster
# The port where the ResourceManager IPC listens on
resourcemanager_port=8032
# Whether to submit jobs to this cluster
submit_to=True
# Resource Manager logical name (required for HA)
## logical_name=
# Change this if your YARN cluster is Kerberos-secured
## security_enabled=false
# URL of the ResourceManager API
resourcemanager_api_url=http://bigdatamaster:8088
# URL of the ProxyServer API
proxy_api_url=http://bigdatamaster:8088
# URL of the HistoryServer API
history_server_api_url=http://bigdatamaster:19888
# In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
# have to be verified against certificate authority
## ssl_cert_ca_verify=True
# HA support by specifying multiple clusters
# e.g.
# [[[ha]]]
# Resource Manager logical name (required for HA)
## logical_name=my-rm-name
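In the non-HA case every URL above is derived from a single hostname plus the ports used in this post. A minimal sketch of that derivation follows. The function names `non_ha_yarn_settings` and `render` are hypothetical helpers for illustration, not part of Hue, and the ports 8032, 8088, and 19888 are simply the values shown in the configuration above, so adjust them if your cluster differs.

```python
def non_ha_yarn_settings(rm_host, history_host=None,
                         ipc_port=8032, web_port=8088, history_port=19888):
    """Build the [[[default]]] key/value pairs for a single-ResourceManager cluster.

    Assumes the proxy is served from the ResourceManager web address and,
    unless history_host is given, that the JobHistory server runs on rm_host.
    """
    history_host = history_host or rm_host
    rm_web = "http://{}:{}".format(rm_host, web_port)
    return [
        ("resourcemanager_host", rm_host),
        ("resourcemanager_port", str(ipc_port)),
        ("submit_to", "True"),
        ("resourcemanager_api_url", rm_web),
        ("proxy_api_url", rm_web),
        ("history_server_api_url",
         "http://{}:{}".format(history_host, history_port)),
    ]


def render(pairs):
    """Render the pairs as the yarn_clusters fragment of hue.ini."""
    lines = ["[[yarn_clusters]]", "[[[default]]]"]
    lines += ["{}={}".format(key, value) for key, value in pairs]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(non_ha_yarn_settings("bigdatamaster")))
```

Calling `non_ha_yarn_settings("bigdatamaster")` reproduces the values used in my configuration above.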
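3. The configuration file matching my cluster (how to configure Hue's yarn_clusters section for an HA cluster)
Setting up a hadoop-2.6.0.tar.gz cluster (5 nodes)
Note that [[[default]]] and [[[ha]]] each configure one RM.
logical_name is one of the IDs configured in your cluster's yarn-site.xml: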
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
URL of the ResourceManager API: set this to the ResourceManager web address and port, which correspond to the following in yarn-site.xml:
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>djt11:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>djt12:8088</value>
</property>
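So each section gets the URL of one ResourceManager, including the http:// scheme:
# URL of the ResourceManager API, in [[[default]]]
resourcemanager_api_url=http://djt11:8088
# URL of the ResourceManager API, in [[[ha]]]
resourcemanager_api_url=http://djt12:8088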
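The mapping from yarn-site.xml to one API URL per ResourceManager can be sketched with the standard library. This is only an illustration under assumptions: the XML is inlined as the string `YARN_SITE` (in practice you would read your own yarn-site.xml), the `<configuration>` wrapper is the usual Hadoop root element, and `read_properties` and `rm_api_urls` are hypothetical helper names, not Hue or Hadoop APIs.

```python
import xml.etree.ElementTree as ET

# Inlined copy of the properties shown above (assumption: normally read from disk).
YARN_SITE = """<configuration>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>djt11:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>djt12:8088</value>
  </property>
</configuration>"""


def read_properties(xml_text):
    """Return a {name: value} dict for every <property> element."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name", "").strip(): prop.findtext("value", "").strip()
        for prop in root.iter("property")
    }


def rm_api_urls(props, scheme="http"):
    """Pair each RM id with the URL built from its webapp address."""
    rm_ids = [i.strip()
              for i in props["yarn.resourcemanager.ha.rm-ids"].split(",")
              if i.strip()]
    return [
        (rm_id,
         "{}://{}".format(scheme,
                          props["yarn.resourcemanager.webapp.address." + rm_id]))
        for rm_id in rm_ids
    ]


RM_URLS = rm_api_urls(read_properties(YARN_SITE))
for rm_id, url in RM_URLS:
    # One logical_name / resourcemanager_api_url pair per Hue section.
    print("logical_name={}  resourcemanager_api_url={}".format(rm_id, url))
```

The first pair belongs in [[[default]]] and the second in [[[ha]]].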
URL of the HistoryServer API: set this to the JobHistory server web address and port, which correspond to the following in mapred-site.xml:
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>djt13:19888</value>
</property>
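The same lookup works for the history server. Here is a small sketch, assuming the property above is wrapped in the usual `<configuration>` root and inlined as `MAPRED_SITE`. The function name `history_server_api_url` is a hypothetical helper chosen to match the Hue key.

```python
import xml.etree.ElementTree as ET

# Inlined copy of the property shown above (assumption: normally read from disk).
MAPRED_SITE = """<configuration>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>djt13:19888</value>
  </property>
</configuration>"""


def history_server_api_url(xml_text, scheme="http"):
    """Build Hue's history_server_api_url from mapred-site.xml content."""
    wanted = "mapreduce.jobhistory.webapp.address"
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == wanted:
            return "{}://{}".format(scheme, prop.findtext("value", "").strip())
    raise KeyError(wanted + " is not set in mapred-site.xml")


HISTORY_URL = history_server_api_url(MAPRED_SITE)
print("history_server_api_url=" + HISTORY_URL)
```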
So my final HA configuration is as follows:
# Configuration for YARN (MR2)
# ------------------------------------------------------------------------
[[yarn_clusters]]
[[[default]]]
# Enter the host on which you are running the ResourceManager
resourcemanager_host=cluster1
# The port where the ResourceManager IPC listens on
resourcemanager_port=8032
# Whether to submit jobs to this cluster
submit_to=True
# Resource Manager logical name (required for HA)
logical_name=rm1
# Change this if your YARN cluster is Kerberos-secured
## security_enabled=false
# URL of the ResourceManager API
resourcemanager_api_url=http://djt11:8088
# URL of the ProxyServer API
proxy_api_url=http://djt13:8088
# URL of the HistoryServer API
history_server_api_url=http://djt13:19888
# In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
# have to be verified against certificate authority
## ssl_cert_ca_verify=True
# HA support by specifying multiple clusters
[[[ha]]]
# Resource Manager logical name (required for HA)
logical_name=rm2
resourcemanager_api_url=http://djt12:8088
history_server_api_url=http://djt13:19888
submit_to=True
Success!
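One thing worth checking in an HA configuration: hue.ini nests sections by bracket depth, so if the `[[[ha]]]` header is left commented out (`# [[[ha]]]`), the keys after it still belong to `[[[default]]]` and appear there twice. Below is a minimal sketch of a duplicate-key check using only the standard library. It is a simplified line scanner written for this post, not Hue's own parser, and `duplicate_keys`, `BROKEN`, and `FIXED` are hypothetical names.

```python
import re
from collections import Counter

# Matches section headers such as [hadoop], [[yarn_clusters]], [[[default]]].
SECTION = re.compile(r"^(\[+)\s*([^\[\]]+?)\s*(\]+)$")


def duplicate_keys(ini_text):
    """Return {section_path: [keys that occur more than once in that section]}."""
    path = []   # current nesting, e.g. ["yarn_clusters", "default"]
    seen = {}   # section path -> Counter of keys
    for raw in ini_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments, including "# [[[ha]]]"
        match = SECTION.match(line)
        if match and len(match.group(1)) == len(match.group(3)):
            depth = len(match.group(1))
            path = path[:depth - 1] + [match.group(2)]
            continue
        if "=" in line:
            key = line.split("=", 1)[0].strip()
            seen.setdefault("/".join(path), Counter())[key] += 1
    return {
        section: sorted(key for key, count in counts.items() if count > 1)
        for section, counts in seen.items()
        if any(count > 1 for count in counts.values())
    }


BROKEN = """[[yarn_clusters]]
[[[default]]]
logical_name=rm1
resourcemanager_api_url=http://djt11:8088
# [[[ha]]]
logical_name=rm2
resourcemanager_api_url=http://djt12:8088
"""
FIXED = BROKEN.replace("# [[[ha]]]", "[[[ha]]]")

print(duplicate_keys(BROKEN))
print(duplicate_keys(FIXED))
```

With the header commented out, both keys are reported under `yarn_clusters/default`. With a real `[[[ha]]]` header the result is empty.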
References
http://gethue.com/hadoop-tutorial-yarn-resource-manager-high-availability-ha-in-mr2/
http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_hue_config.html
http://cloudera.github.io/hue/docs-3.8.0/manual.html#_hadoop_configuration