Big Data Platform Setup (4)

Date: 2022-04-27 14:18:29

12. Installing HBase
a. Upload: copy the hbase-1.3.1-bin.tar.gz archive to /usr/local
b. Extract: unpack the HBase archive in the current directory (tar -zxvf hbase-1.3.1-bin.tar.gz)
c. Edit hbase-env.sh
export JAVA_HOME=/usr/java/jdk1.8
export HBASE_HOME=/usr/local/hbase-1.3.1
export HADOOP_HOME=/usr/local/hadoop-2.7.3
export PATH=$PATH:/usr/local/hbase-1.3.1/bin
# use the external ZooKeeper ensemble installed earlier, not the one bundled with HBase
export HBASE_MANAGES_ZK=false

d. Edit hbase-site.xml

<configuration>
  <!-- Because the cluster has multiple masters, hbase.rootdir must use the same
       nameservice ID as dfs.nameservices in Hadoop's hdfs-site.xml -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://mycluster/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>namenode01,namenode02,datanode01,datanode02,datanode03</value>
  </property>
  <!-- With multiple masters we only supply the port; a single-master setup
       would use the hbase.master property instead -->
  <property>
    <name>hbase.master.port</name>
    <value>60000</value>
  </property>
  <property>
    <name>zookeeper.session.timeout</name>
    <value>60000</value>
  </property>
  <!-- must match the clientPort set in the ZooKeeper configuration -->
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <!-- must match the dataDir set in the ZooKeeper configuration -->
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/usr/local/zookeeper-3.4.10/data</value>
  </property>
</configuration>
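For reference, hdfs://mycluster has to match the nameservice ID defined for the HA NameNodes in an earlier part of this series; the corresponding hdfs-site.xml entry would look like this (a sketch, assuming the earlier setup used mycluster as configured above):

<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>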

e. Edit regionservers
datanode01
datanode02
datanode03
f. Start HBase (./bin/start-hbase.sh; stop it later with ./bin/stop-hbase.sh)
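Before starting, the same configured HBase directory needs to exist on every node; a minimal sketch, assuming passwordless SSH between the nodes (set up earlier in this series):

scp -r /usr/local/hbase-1.3.1 namenode02:/usr/local/
scp -r /usr/local/hbase-1.3.1 datanode01:/usr/local/
scp -r /usr/local/hbase-1.3.1 datanode02:/usr/local/
scp -r /usr/local/hbase-1.3.1 datanode03:/usr/local/

For the second master, listing namenode02 in conf/backup-masters makes start-hbase.sh launch it automatically (alternatively, run ./bin/hbase-daemon.sh start master on that node). A quick sanity check after startup:

jps                   # HMaster should appear on the masters, HRegionServer on the datanodes
./bin/hbase shell     # then run: status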

13. Installing Kafka
a. Upload: copy the kafka_2.11-0.11.0.2.tgz archive to /usr/local
b. Extract: unpack kafka_2.11-0.11.0.2.tgz in the current directory (tar -zxvf kafka_2.11-0.11.0.2.tgz)
c. Edit server.properties
broker.id=0 (like ZooKeeper's myid: an integer ID that must be unique across the cluster, usually starting from 0)
listeners=PLAINTEXT://datanode01:9092 (protocol, plus this broker's host and port)
port=9092 (the broker's port)
host.name=datanode01 (the broker's host)
log.dirs=/usr/local/kafka_2.11-0.11.0.2/kafka-logs (directory where data is stored)
zookeeper.connect=namenode01:2181,namenode02:2181,datanode01:2181,datanode02:2181,datanode03:2181 (the ZooKeeper ensemble list)
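Note that server.properties is per-broker: broker.id, listeners, and host.name must differ on each machine. As a sketch, assuming the remaining brokers run on datanode02 and datanode03, the second broker's file would contain:

broker.id=1
listeners=PLAINTEXT://datanode02:9092
host.name=datanode02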
d. Start Kafka (./bin/kafka-server-start.sh -daemon config/server.properties)

Basic Kafka commands
(1). Create a topic named test: ./bin/kafka-topics.sh --create --zookeeper namenode01:2181,datanode01:2181 --replication-factor 1 --partitions 3 --topic test
(2). Show a topic's details: ./bin/kafka-topics.sh --describe --zookeeper namenode01:2181 --topic test
(3). Start a console producer: ./bin/kafka-console-producer.sh --broker-list datanode01:9092 --topic test
(4). Start a console consumer: ./bin/kafka-console-consumer.sh --zookeeper datanode01:2181 --topic test --from-beginning
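For a quick end-to-end check, list the known topics, then type a few lines into the producer console and confirm they appear in the consumer console:

./bin/kafka-topics.sh --list --zookeeper namenode01:2181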

14. Installing Scala
a. Upload: copy the scala-2.11.8.tgz archive to /usr/local
b. Extract: unpack scala-2.11.8.tgz in the current directory (tar -zxvf scala-2.11.8.tgz)
c. Configure environment variables (vim /etc/profile)
d. Append at the end:
export SCALA_HOME=/usr/local/scala-2.11.8
export PATH=$SCALA_HOME/bin:$PATH
e. Apply the changes: source /etc/profile
f. Verify the installation by launching the REPL (scala)
g. Check the installed Scala version (scala -version)
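If everything is on the PATH, scala -version should print something close to:

Scala code runner version 2.11.8 -- Copyright 2002-2016, LAMP/EPFL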
15. Installing Spark
a. Upload: copy the spark-2.2.0-bin-hadoop2.7.tgz archive to /usr/local
b. Extract: unpack spark-2.2.0-bin-hadoop2.7.tgz in the current directory (tar -zxvf spark-2.2.0-bin-hadoop2.7.tgz)
c. Configure environment variables (vim /etc/profile)
d. Append at the end:
export SPARK_HOME=/usr/local/spark-2.2.0-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
e. Apply the changes: source /etc/profile
f. Edit spark-env.sh
export JAVA_HOME=/usr/java/jdk1.8
export SCALA_HOME=/usr/local/scala-2.11.8
export HADOOP_HOME=/usr/local/hadoop-2.7.3
export HADOOP_CONF_DIR=/usr/local/hadoop-2.7.3/etc/hadoop
# SPARK_MASTER_IP is the older name for SPARK_MASTER_HOST; setting both is harmless
export SPARK_MASTER_IP=datanode01
export SPARK_MASTER_HOST=datanode01
# SPARK_LOCAL_IP is per-node: on the other nodes it should be that node's own hostname
export SPARK_LOCAL_IP=datanode01
export SPARK_WORKER_MEMORY=1g
export SPARK_WORKER_CORES=2
export SPARK_HOME=/usr/local/spark-2.2.0-bin-hadoop2.7
# pull in Hadoop's classpath so Spark can find the HDFS client libraries
export SPARK_DIST_CLASSPATH=$(/usr/local/hadoop-2.7.3/bin/hadoop classpath)

g. Edit slaves
datanode01
datanode02
datanode03
h. Start the Spark cluster
spark-2.2.0-bin-hadoop2.7/sbin/start-all.sh
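Assuming the Spark directory has also been copied to every worker listed in slaves (start-all.sh reaches them over passwordless SSH), a simple smoke test is to submit the SparkPi example that ships with this distribution, then check the master web UI at http://datanode01:8080:

cd /usr/local/spark-2.2.0-bin-hadoop2.7
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://datanode01:7077 \
  examples/jars/spark-examples_2.11-2.2.0.jar 100

Near the end of its output the job should print a line like "Pi is roughly 3.14...".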