Installing Spark 2.2.0 on Windows

Date: 2022-10-19 20:50:44

Prerequisite: install scala-2.11.8.msi

1. Set up a working Hadoop environment as described in the previous post.

2. Download Spark from http://spark.apache.org/downloads.html

3. Extract the archive to D:\BigDataApp\spark-2.2.0-bin-hadoop2.7

4. Configure the system environment variables: create a system variable SPARK_HOME set to D:\BigDataApp\spark-2.2.0-bin-hadoop2.7, then add %SPARK_HOME%\bin and %SPARK_HOME%\sbin to the PATH system variable.
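A quick sanity check of the step-4 variables can be sketched in plain Scala. The helper below is illustrative (the `EnvCheck` object and its logic are mine, not from the original post); on a real machine you would call it with `sys.env`:

```scala
// Sketch: checks whether the step-4 variables look configured.
// Takes the environment as a plain Map so it can also be tested
// with sample values; pass sys.env for the real environment.
object EnvCheck {
  def sparkConfigured(env: Map[String, String]): Boolean =
    env.contains("SPARK_HOME") &&
      env.getOrElse("PATH", "").toLowerCase.contains("spark")
}
```

For example, `EnvCheck.sparkConfigured(sys.env)` should return `true` after step 4 is done and a new cmd window is opened (environment variable changes only apply to newly started processes).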

5. Run spark-shell from cmd.

Verification: if the shell prints "Spark context available as 'sc'", the startup succeeded.

Verify with Scala (a word count, sorted by descending frequency):

// Read a local text file into an RDD
val textFile = sc.textFile("C:\\logs\\1.txt")
// Split each line into words
val tokenizedFileData = textFile.flatMap(line => line.split(" "))
// Pair each word with a count of 1
val countPrep = tokenizedFileData.map(word => (word, 1))
// Sum the counts per word
val counts = countPrep.reduceByKey((accumValue, newValue) => accumValue + newValue)
// false = sort in descending order of count
val sortedCounts = counts.sortBy(kvPair => kvPair._2, false)
//sortedCounts.saveAsTextFile("file:///SparkOutputData/ReadMeWordCount")
// Note: saveAsTextFile fails if the output directory already exists
sortedCounts.saveAsTextFile("C:\\logs\\test")
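The same pipeline can be exercised without a Spark installation using plain Scala collections. The sketch below (the `WordCountSketch` object and `wordCount` helper are mine) mirrors each RDD step, which is handy for checking the transformation chain before running it in spark-shell:

```scala
object WordCountSketch {
  // Mirrors the spark-shell pipeline on Seq[String]:
  // flatMap -> map -> reduceByKey -> sortBy descending.
  def wordCount(lines: Seq[String]): Seq[(String, Int)] =
    lines
      .flatMap(_.split(" "))                              // tokenize each line
      .map(word => (word, 1))                             // pair each word with 1
      .groupBy(_._1)                                      // collections analogue of reduceByKey
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }
      .toSeq
      .sortBy(-_._2)                                      // descending by count
}
```

For example, `WordCountSketch.wordCount(Seq("spark spark hadoop"))` puts `("spark", 2)` first. The RDD version distributes the same steps across partitions, but the per-element logic is identical.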