1. Environment
| Hostname | IP | Cluster State | ZooKeeper Role |
| --- | --- | --- | --- |
| hadoop01 | 176.129.8.111 | active | follower |
| hadoop02 | 176.129.8.112 | standby | leader |
| hadoop03 | 176.129.8.113 |  | observer |
CentOS6.5
JDK1.8.0
Hadoop2.7.1
Zookeeper3.7.1
Scala-2.13.0
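For the hostnames above to resolve on every node, each machine's /etc/hosts should map them to the IPs from the table (a minimal sketch, assuming no DNS is in place):
176.129.8.111 hadoop01
176.129.8.112 hadoop02
176.129.8.113 hadoop03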
2. Download and Extract
Machine: hadoop01
Download page: http://spark.apache.org/releases/spark-release-2-2-2.html
Extract it to /home/software.
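For example, with wget (the exact package name and archive URL are assumptions; the release page above links the prebuilt-for-Hadoop-2.7 tarball, commonly spark-2.2.2-bin-hadoop2.7.tgz):
cd /home/software
# download the prebuilt package (copy the real link from the release page if this differs)
wget https://archive.apache.org/dist/spark/spark-2.2.2/spark-2.2.2-bin-hadoop2.7.tgz
# extract, then rename to match the directory name used in the scp step later
tar -zxvf spark-2.2.2-bin-hadoop2.7.tgz
mv spark-2.2.2-bin-hadoop2.7 spark-2.2.2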
3. Configuration
Machine: hadoop01
Enter the conf directory.
1. Copy spark-env.sh
cp spark-env.sh.template spark-env.sh
This copies the shipped template spark-env.sh.template to spark-env.sh, the filename Spark actually reads. Append the following at the end of the file:
# JDK and Scala installation paths
export JAVA_HOME=/home/software/jdk1.8.0
export SCALA_HOME=/home/software/scala-2.13.0
# Hadoop home and its configuration directory, so Spark can find the HDFS/YARN settings
export HADOOP_HOME=/home/hadoop-2.7.1
export HADOOP_CONF_DIR=/home/hadoop-2.7.1/etc/hadoop
# host of the standalone master (Spark 2.x also accepts the newer name SPARK_MASTER_HOST)
export SPARK_MASTER_IP=hadoop01
# per-node worker resources: 2 GB of memory, 2 cores, one worker instance
export SPARK_WORKER_MEMORY=2g
export SPARK_WORKER_CORES=2
export SPARK_WORKER_INSTANCES=1
2. Copy slaves
cp slaves.template slaves
Edit it to contain the two worker hostnames:
hadoop02
hadoop03
3. Copy the spark-2.2.2 directory to hadoop02 and hadoop03
[root@hadoop01 software]# scp -r spark-2.2.2 hadoop02:/home/software/
[root@hadoop01 software]# scp -r spark-2.2.2 hadoop03:/home/software/
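Both scp and the later start-all.sh log in to the workers over SSH. If passwordless login from hadoop01 is not yet set up, a minimal sketch (assuming root logins are allowed):
ssh-keygen -t rsa                # accept the defaults to create ~/.ssh/id_rsa and id_rsa.pub
ssh-copy-id root@hadoop02        # append the public key to hadoop02's authorized_keys
ssh-copy-id root@hadoop03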
4. Startup
1. Enter spark/sbin and run:
sh start-all.sh
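To confirm the daemons started, jps should list a Master process on hadoop01 and a Worker process on each slave node:
jps                  # on hadoop01: expect Master among the existing Hadoop/ZooKeeper daemons
ssh hadoop02 jps     # expect Worker
ssh hadoop03 jps     # expect Worker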
2. Access the web UI in a browser:
http://hadoop01:8080
| Worker Id | Address | State | Cores | Memory |
| --- | --- | --- | --- | --- |
| worker-20180919193657-176.129.8.113-46491 | 176.129.8.113:46491 | ALIVE | 2 (0 Used) | 2.0 GB (0.0 B Used) |
| worker-20180919193658-176.129.8.112-38025 | 176.129.8.112:38025 | ALIVE | 2 (0 Used) | 2.0 GB (0.0 B Used) |
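The same status is also served as JSON by the standalone master's web UI, which is handy for scripted checks (a sketch, assuming the default port above):
curl http://hadoop01:8080/json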
3. Enter bin and run spark-shell
a. Start the shell and load the MySQL connector jar so the Hive metastore can be accessed
[root@hadoop01 bin]# sh spark-shell --master local[3] --jars /home/software/hive-1.2.0/lib/mysql-connector-java-5.1.38-bin.jar
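Here --master local[3] runs Spark in-process with 3 threads and ignores the cluster; to run against the standalone master started above instead, point --master at its default port 7077 (same jar option):
sh spark-shell --master spark://hadoop01:7077 --jars /home/software/hive-1.2.0/lib/mysql-connector-java-5.1.38-bin.jar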
b. List databases
scala> spark.sql("show databases").show;
c. Create a database
scala> spark.sql("create database test").show;
d. Use the database
scala> spark.sql("use test").show;
e. Create a table
scala> spark.sql("create table student(id int, name string,age int) row format delimited fields terminated by ','").show;
f. Insert data
scala> spark.sql("insert into student values(1,'Zenghaoxuan',10)").show;
g. Query the data
scala> spark.sql("select * from student").show;
+---+-----------+---+
| id|       name|age|
+---+-----------+---+
|  1|Zenghaoxuan| 10|
+---+-----------+---+
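The same table can also be read through the DataFrame API rather than SQL strings; a quick equivalent in the same session:
scala> spark.table("student").select("name", "age").where("age > 5").show
Note that where accepts a SQL-style condition string, so no extra imports are needed.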