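We choose to install online.
This is the Scala installer for Windows; just double-click it to install.
Once it is installed, you can verify the installation.
This is my local JDK 1.8 installer; double-click it to install.
Once the installation finishes, you can verify it.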
https://archive.apache.org/dist/maven/maven-3/3.3.9/binaries/
Extract the archive.
My local machine runs Windows 10.
After configuring the environment variables, we can verify the setup.
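The verification steps above can also be sketched in code. The snippet below is only an illustration: the helper `EnvCheck.report` is hypothetical, and the variable names `JAVA_HOME`, `SCALA_HOME`, and `MAVEN_HOME` are assumptions, so substitute whichever names you actually configured.

```scala
object EnvCheck {
  // Hypothetical helper: for each variable name, report its value or flag it as missing.
  def report(env: Map[String, String], names: Seq[String]): Seq[String] =
    names.map { name =>
      env.get(name) match {
        case Some(value) if value.trim.nonEmpty => s"$name = $value"
        case _                                  => s"$name is NOT set"
      }
    }

  def main(args: Array[String]): Unit = {
    // Assumed variable names; adjust them to match your own configuration.
    report(sys.env, Seq("JAVA_HOME", "SCALA_HOME", "MAVEN_HOME")).foreach(println)
    // Versions of the JVM and Scala library that are running this check.
    println("Java version:  " + System.getProperty("java.version"))
    println("Scala version: " + scala.util.Properties.versionNumberString)
  }
}
```

Running `EnvCheck` from a freshly opened terminal (so that it picks up the new variables) prints one line per variable.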
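Modify this file.
This is the default.
Change it to look like this.
Configure the local Maven installation here.
Then wait for the required JARs to be downloaded automatically.
Scala has now been added.
Next, we create the directories.
Create a package under the scala directory.
Create a Scala class in this package.
Enter the following code.
Configure Maven's pom.xml file: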
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.spark</groupId>
  <artifactId>sparkStu</artifactId>
  <packaging>war</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>sparkStu Maven Webapp</name>
  <url>http://maven.apache.org</url>

  <properties>
    <hadoop.version>2.6.0</hadoop.version>
    <scala.binary.version>2.11</scala.binary.version>
    <spark.version>2.2.0</spark.version>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_${scala.binary.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_${scala.binary.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <!--
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.11</artifactId>
      <version>2.2.0</version>
    </dependency>
    -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_${scala.binary.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-hive_${scala.binary.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka-0-10_${scala.binary.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql-kafka-0-10_${scala.binary.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>${hadoop.version}</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
  </dependencies>

  <build>
    <finalName>sparkStu</finalName>
  </build>
</project>
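The pom.xml relies on `${...}` placeholders, so each Spark artifact ID ends up carrying the Scala binary version as a suffix (for example `spark-core_2.11`). The sketch below only illustrates that substitution; `PomProps.resolve` is a hypothetical helper, not how Maven itself is implemented.

```scala
import scala.util.matching.Regex

object PomProps {
  // Matches placeholders of the form ${some.property.name}.
  private val Placeholder: Regex = """\$\{([^}]+)\}""".r

  // Replace each known placeholder with its value; unknown placeholders are left untouched.
  def resolve(template: String, props: Map[String, String]): String =
    Placeholder.replaceAllIn(template, m =>
      Regex.quoteReplacement(props.getOrElse(m.group(1), m.matched)))

  def main(args: Array[String]): Unit = {
    // Values copied from the <properties> section of the pom.xml.
    val props = Map(
      "hadoop.version"       -> "2.6.0",
      "scala.binary.version" -> "2.11",
      "spark.version"        -> "2.2.0"
    )
    println(resolve("spark-core_${scala.binary.version}", props)) // spark-core_2.11
    println(resolve("${spark.version}", props))                   // 2.2.0
  }
}
```

Keeping the versions in `<properties>` means a later upgrade only needs to touch one place, as long as the Scala binary version matches the one the chosen Spark release was built for.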
Add the following content to Test.scala.
We write a simple program:
package com.spark.test

import org.apache.spark.sql.SparkSession

object Test {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder
      .appName("HdfsTest")
      .getOrCreate()
    val filePart = "E://Mycode/datas/stu.txt"
    val rdd = spark.sparkContext.textFile(filePart)
    val lines = rdd.flatMap(x => x.split(" ")).collect().toList
    println(lines)
  }
}
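To see what the `flatMap` step in this program does without involving Spark at all, the same transformation can be applied to an ordinary local collection. This is only a sketch: `SplitDemo` and the sample lines are made up for illustration, and the real content of stu.txt is not shown in this post.

```scala
object SplitDemo {
  // Same per-line logic as rdd.flatMap(x => x.split(" ")), applied to a local Seq.
  def words(lines: Seq[String]): List[String] =
    lines.flatMap(x => x.split(" ")).toList

  def main(args: Array[String]): Unit = {
    // Hypothetical sample input standing in for the lines of the text file.
    val sample = Seq("hello spark", "hello scala")
    println(words(sample)) // List(hello, spark, hello, scala)
  }
}
```

Each line is split on single spaces and the pieces are flattened into one list. Note that consecutive spaces produce empty strings in the result, because `split(" ")` splits on every single space.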
Run it.
The run fails with an error:
E:\software\jdk1.\bin\java "-javaagent:E:\software\IDEA\IntelliJ IDEA 2017.2.6\lib\idea_rt.jar=59010:E:\software\IDEA\IntelliJ IDEA 2017.2.6\bin" -Dfile.encoding=UTF- -classpath E:\software\jdk1.\jre\lib\charsets.jar;E:\software\jdk1.\jre\lib\deploy.jar;E:\software\jdk1.\jre\lib\ext\access-bridge-.jar;E:\software\jdk1.\jre\lib\ext\cldrdata.jar;E:\software\jdk1.\jre\lib\ext\dnsns.jar;E:\software\jdk1.\jre\lib\ext\jaccess.jar;E:\software\jdk1.\jre\lib\ext\jfxrt.jar;E:\software\jdk1.\jre\lib\ext\localedata.jar;E:\software\jdk1.\jre\lib\ext\nashorn.jar;E:\software\jdk1.\jre\lib\ext\sunec.jar;E:\software\jdk1.\jre\lib\ext\sunjce_provider.jar;E:\software\jdk1.\jre\lib\ext\sunmscapi.jar;E:\software\jdk1.\jre\lib\ext\sunpkcs11.jar;E:\software\jdk1.\jre\lib\ext\zipfs.jar;E:\software\jdk1.\jre\lib\javaws.jar;E:\software\jdk1.\jre\lib\jce.jar;E:\software\jdk1.\jre\lib\jfr.jar;E:\software\jdk1.\jre\lib\jfxswt.jar;E:\software\jdk1.\jre\lib\jsse.jar;E:\software\jdk1.\jre\lib\management-agent.jar;E:\software\jdk1.\jre\lib\plugin.jar;E:\software\jdk1.\jre\lib\resources.jar;E:\software\jdk1.\jre\lib\rt.jar;E:\Mycode\SparkStu\target\classes;E:\software\Scala\lib\scala-actors-2.11..jar;E:\software\Scala\lib\scala-actors-migration_2.-1.1..jar;E:\software\Scala\lib\scala-library.jar;E:\software\Scala\lib\scala-parser-combinators_2.-1.0..jar;E:\software\Scala\lib\scala-reflect.jar;E:\software\Scala\lib\scala-swing_2.-1.0..jar;E:\software\Scala\lib\scala-xml_2.-1.0..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-core_2.\2.2.\spark-core_2.-2.2..jar;E:\software\maven3.3.9\repository\org\apache\avro\avro\1.7.\avro-1.7..jar;E:\software\maven3.3.9\repository\org\codehaus\jackson\jackson-core-asl\1.9.\jackson-core-asl-1.9..jar;E:\software\maven3.3.9\repository\com\thoughtworks\paranamer\paranamer\2.3\paranamer-2.3.jar;E:\software\maven3.3.9\repository\org\apache\commons\commons-compress\1.4.\commons-compress-1.4..jar;E:\software\maven3.3.9\repository\org\tukaani\xz\1.0\xz-1.0.jar;
E:\software\maven3.3.9\repository\org\apache\avro\avro-mapred\1.7.\avro-mapred-1.7.-hadoop2.jar;E:\software\maven3.3.9\repository\org\apache\avro\avro-ipc\1.7.\avro-ipc-1.7..jar;E:\software\maven3.3.9\repository\org\apache\avro\avro-ipc\1.7.\avro-ipc-1.7.-tests.jar;E:\software\maven3.3.9\repository\com\twitter\chill_2.\0.8.\chill_2.-0.8..jar;E:\software\maven3.3.9\repository\com\esotericsoftware\kryo-shaded\3.0.\kryo-shaded-3.0..jar;E:\software\maven3.3.9\repository\com\esotericsoftware\minlog\1.3.\minlog-1.3..jar;E:\software\maven3.3.9\repository\org\objenesis\objenesis\2.1\objenesis-2.1.jar;E:\software\maven3.3.9\repository\com\twitter\chill-java\0.8.\chill-java-0.8..jar;E:\software\maven3.3.9\repository\org\apache\xbean\xbean-asm5-shaded\4.4\xbean-asm5-shaded-4.4.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-launcher_2.\2.2.\spark-launcher_2.-2.2..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-network-common_2.\2.2.\spark-network-common_2.-2.2..jar;E:\software\maven3.3.9\repository\org\fusesource\leveldbjni\leveldbjni-all\1.8\leveldbjni-all-1.8.jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\core\jackson-annotations\2.6.\jackson-annotations-2.6..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-network-shuffle_2.\2.2.\spark-network-shuffle_2.-2.2..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-unsafe_2.\2.2.\spark-unsafe_2.-2.2..jar;E:\software\maven3.3.9\repository\net\java\dev\jets3t\jets3t\0.9.\jets3t-0.9..jar;E:\software\maven3.3.9\repository\org\apache\httpcomponents\httpcore\4.3.\httpcore-4.3..jar;E:\software\maven3.3.9\repository\javax\activation\activation\1.1.\activation-1.1..jar;E:\software\maven3.3.9\repository\mx4j\mx4j\3.0.\mx4j-3.0..jar;E:\software\maven3.3.9\repository\javax\mail\mail\1.4.\mail-1.4..jar;E:\software\maven3.3.9\repository\org\bouncycastle\bcprov-jdk15on\1.51\bcprov-jdk15on-1.51.jar;E:\software\maven3.3.9\repository\com\jamesmurty\utils\java-xmlbuilder\1.0\java-xmlbuild
er-1.0.jar;E:\software\maven3.3.9\repository\net\iharder\base64\2.3.\base64-2.3..jar;E:\software\maven3.3.9\repository\org\apache\curator\curator-recipes\2.6.\curator-recipes-2.6..jar;E:\software\maven3.3.9\repository\org\apache\curator\curator-framework\2.6.\curator-framework-2.6..jar;E:\software\maven3.3.9\repository\org\apache\zookeeper\zookeeper\3.4.\zookeeper-3.4..jar;E:\software\maven3.3.9\repository\com\google\guava\guava\16.0.\guava-16.0..jar;E:\software\maven3.3.9\repository\javax\servlet\javax.servlet-api\3.1.\javax.servlet-api-3.1..jar;E:\software\maven3.3.9\repository\org\apache\commons\commons-lang3\3.5\commons-lang3-3.5.jar;E:\software\maven3.3.9\repository\org\apache\commons\commons-math3\3.4.\commons-math3-3.4..jar;E:\software\maven3.3.9\repository\com\google\code\findbugs\jsr305\1.3.\jsr305-1.3..jar;E:\software\maven3.3.9\repository\org\slf4j\slf4j-api\1.7.\slf4j-api-1.7..jar;E:\software\maven3.3.9\repository\org\slf4j\jul-to-slf4j\1.7.\jul-to-slf4j-1.7..jar;E:\software\maven3.3.9\repository\org\slf4j\jcl-over-slf4j\1.7.\jcl-over-slf4j-1.7..jar;E:\software\maven3.3.9\repository\log4j\log4j\1.2.\log4j-1.2..jar;E:\software\maven3.3.9\repository\org\slf4j\slf4j-log4j12\1.7.\slf4j-log4j12-1.7..jar;E:\software\maven3.3.9\repository\com\ning\compress-lzf\1.0.\compress-lzf-1.0..jar;E:\software\maven3.3.9\repository\org\xerial\snappy\snappy-java\1.1.2.6\snappy-java-1.1.2.6.jar;E:\software\maven3.3.9\repository\net\jpountz\lz4\lz4\1.3.\lz4-1.3..jar;E:\software\maven3.3.9\repository\org\roaringbitmap\RoaringBitmap\0.5.\RoaringBitmap-0.5..jar;E:\software\maven3.3.9\repository\commons-net\commons-net\2.2\commons-net-2.2.jar;E:\software\maven3.3.9\repository\org\scala-lang\scala-library\2.11.\scala-library-2.11..jar;E:\software\maven3.3.9\repository\org\json4s\json4s-jackson_2.\3.2.\json4s-jackson_2.-3.2..jar;E:\software\maven3.3.9\repository\org\json4s\json4s-core_2.\3.2.\json4s-core_2.-3.2..jar;E:\software\maven3.3.9\repository\org\json4s\json4s-ast_2.\3.2.\js
on4s-ast_2.-3.2..jar;E:\software\maven3.3.9\repository\org\scala-lang\scalap\2.11.\scalap-2.11..jar;E:\software\maven3.3.9\repository\org\scala-lang\scala-compiler\2.11.\scala-compiler-2.11..jar;E:\software\maven3.3.9\repository\org\scala-lang\modules\scala-xml_2.\1.0.\scala-xml_2.-1.0..jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\core\jersey-client\2.22.\jersey-client-2.22..jar;E:\software\maven3.3.9\repository\javax\ws\rs\javax.ws.rs-api\2.0.\javax.ws.rs-api-2.0..jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\hk2-api\2.4.-b34\hk2-api-2.4.-b34.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\hk2-utils\2.4.-b34\hk2-utils-2.4.-b34.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\external\aopalliance-repackaged\2.4.-b34\aopalliance-repackaged-2.4.-b34.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\external\javax.inject\2.4.-b34\javax.inject-2.4.-b34.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\hk2-locator\2.4.-b34\hk2-locator-2.4.-b34.jar;E:\software\maven3.3.9\repository\org\javassist\javassist\3.18.-GA\javassist-3.18.-GA.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\core\jersey-common\2.22.\jersey-common-2.22..jar;E:\software\maven3.3.9\repository\javax\annotation\javax.annotation-api\1.2\javax.annotation-api-1.2.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\bundles\repackaged\jersey-guava\2.22.\jersey-guava-2.22..jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\osgi-resource-locator\1.0.\osgi-resource-locator-1.0..jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\core\jersey-server\2.22.\jersey-server-2.22..jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\media\jersey-media-jaxb\2.22.\jersey-media-jaxb-2.22..jar;E:\software\maven3.3.9\repository\javax\validation\validation-api\1.1..Final\validation-api-1.1..Final.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\containers\jersey-container-servlet\2.22.\jersey-container-servlet-2.22..jar;E:\software
\maven3.3.9\repository\org\glassfish\jersey\containers\jersey-container-servlet-core\2.22.\jersey-container-servlet-core-2.22..jar;E:\software\maven3.3.9\repository\io\netty\netty-all\4.0..Final\netty-all-4.0..Final.jar;E:\software\maven3.3.9\repository\io\netty\netty\3.9..Final\netty-3.9..Final.jar;E:\software\maven3.3.9\repository\com\clearspring\analytics\stream\2.7.\stream-2.7..jar;E:\software\maven3.3.9\repository\io\dropwizard\metrics\metrics-core\3.1.\metrics-core-3.1..jar;E:\software\maven3.3.9\repository\io\dropwizard\metrics\metrics-jvm\3.1.\metrics-jvm-3.1..jar;E:\software\maven3.3.9\repository\io\dropwizard\metrics\metrics-json\3.1.\metrics-json-3.1..jar;E:\software\maven3.3.9\repository\io\dropwizard\metrics\metrics-graphite\3.1.\metrics-graphite-3.1..jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\core\jackson-databind\2.6.\jackson-databind-2.6..jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\core\jackson-core\2.6.\jackson-core-2.6..jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\module\jackson-module-scala_2.\2.6.\jackson-module-scala_2.-2.6..jar;E:\software\maven3.3.9\repository\org\scala-lang\scala-reflect\2.11.\scala-reflect-2.11..jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\module\jackson-module-paranamer\2.6.\jackson-module-paranamer-2.6..jar;E:\software\maven3.3.9\repository\org\apache\ivy\ivy\2.4.\ivy-2.4..jar;E:\software\maven3.3.9\repository\oro\oro\2.0.\oro-2.0..jar;E:\software\maven3.3.9\repository\net\razorvine\pyrolite\4.13\pyrolite-4.13.jar;E:\software\maven3.3.9\repository\net\sf\py4j\py4j\0.10.\py4j-0.10..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-tags_2.\2.2.\spark-tags_2.-2.2..jar;E:\software\maven3.3.9\repository\org\apache\commons\commons-crypto\1.0.\commons-crypto-1.0..jar;E:\software\maven3.3.9\repository\org\spark-project\spark\unused\1.0.\unused-1.0..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-sql_2.\2.2.\spark-sql_2.-2.2..jar;E:\software\mave
n3.3.9\repository\com\univocity\univocity-parsers\2.2.\univocity-parsers-2.2..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-sketch_2.\2.2.\spark-sketch_2.-2.2..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-catalyst_2.\2.2.\spark-catalyst_2.-2.2..jar;E:\software\maven3.3.9\repository\org\codehaus\janino\janino\3.0.\janino-3.0..jar;E:\software\maven3.3.9\repository\org\codehaus\janino\commons-compiler\3.0.\commons-compiler-3.0..jar;E:\software\maven3.3.9\repository\org\antlr\antlr4-runtime\4.5.\antlr4-runtime-4.5..jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-column\1.8.\parquet-column-1.8..jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-common\1.8.\parquet-common-1.8..jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-encoding\1.8.\parquet-encoding-1.8..jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-hadoop\1.8.\parquet-hadoop-1.8..jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-format\2.3.\parquet-format-2.3..jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-jackson\1.8.\parquet-jackson-1.8..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-streaming_2.\2.2.\spark-streaming_2.-2.2..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-hive_2.\2.2.\spark-hive_2.-2.2..jar;E:\software\maven3.3.9\repository\com\twitter\parquet-hadoop-bundle\1.6.\parquet-hadoop-bundle-1.6..jar;E:\software\maven3.3.9\repository\org\spark-project\hive\hive-exec\1.2..spark2\hive-exec-1.2..spark2.jar;E:\software\maven3.3.9\repository\commons-io\commons-io\2.4\commons-io-2.4.jar;E:\software\maven3.3.9\repository\commons-lang\commons-lang\2.6\commons-lang-2.6.jar;E:\software\maven3.3.9\repository\javolution\javolution\5.5.\javolution-5.5..jar;E:\software\maven3.3.9\repository\log4j\apache-log4j-extras\1.2.\apache-log4j-extras-1.2..jar;E:\software\maven3.3.9\repository\org\antlr\antlr-runtime\3.4\antlr-runtime-3.4.jar;E:\software\maven3.3.9\repositor
y\org\antlr\stringtemplate\3.2.\stringtemplate-3.2..jar;E:\software\maven3.3.9\repository\antlr\antlr\2.7.\antlr-2.7..jar;E:\software\maven3.3.9\repository\org\antlr\ST4\4.0.\ST4-4.0..jar;E:\software\maven3.3.9\repository\com\googlecode\javaewah\JavaEWAH\0.3.\JavaEWAH-0.3..jar;E:\software\maven3.3.9\repository\org\iq80\snappy\snappy\0.2\snappy-0.2.jar;E:\software\maven3.3.9\repository\stax\stax-api\1.0.\stax-api-1.0..jar;E:\software\maven3.3.9\repository\net\sf\opencsv\opencsv\2.3\opencsv-2.3.jar;E:\software\maven3.3.9\repository\org\spark-project\hive\hive-metastore\1.2..spark2\hive-metastore-1.2..spark2.jar;E:\software\maven3.3.9\repository\com\jolbox\bonecp\0.8..RELEASE\bonecp-0.8..RELEASE.jar;E:\software\maven3.3.9\repository\commons-cli\commons-cli\1.2\commons-cli-1.2.jar;E:\software\maven3.3.9\repository\commons-logging\commons-logging\1.1.\commons-logging-1.1..jar;E:\software\maven3.3.9\repository\org\apache\derby\derby\10.10.2.0\derby-10.10.2.0.jar;E:\software\maven3.3.9\repository\org\datanucleus\datanucleus-api-jdo\3.2.\datanucleus-api-jdo-3.2..jar;E:\software\maven3.3.9\repository\org\datanucleus\datanucleus-rdbms\3.2.\datanucleus-rdbms-3.2..jar;E:\software\maven3.3.9\repository\commons-pool\commons-pool\1.5.\commons-pool-1.5..jar;E:\software\maven3.3.9\repository\commons-dbcp\commons-dbcp\1.4\commons-dbcp-1.4.jar;E:\software\maven3.3.9\repository\javax\jdo\jdo-api\3.0.\jdo-api-3.0..jar;E:\software\maven3.3.9\repository\javax\transaction\jta\1.1\jta-1.1.jar;E:\software\maven3.3.9\repository\commons-httpclient\commons-httpclient\3.1\commons-httpclient-3.1.jar;E:\software\maven3.3.9\repository\org\apache\calcite\calcite-avatica\1.2.-incubating\calcite-avatica-1.2.-incubating.jar;E:\software\maven3.3.9\repository\org\apache\calcite\calcite-core\1.2.-incubating\calcite-core-1.2.-incubating.jar;E:\software\maven3.3.9\repository\org\apache\calcite\calcite-linq4j\1.2.-incubating\calcite-linq4j-1.2.-incubating.jar;E:\software\maven3.3.9\repository\net\hydromatic\
eigenbase-properties\1.1.\eigenbase-properties-1.1..jar;E:\software\maven3.3.9\repository\org\apache\httpcomponents\httpclient\4.5.\httpclient-4.5..jar;E:\software\maven3.3.9\repository\org\codehaus\jackson\jackson-mapper-asl\1.9.\jackson-mapper-asl-1.9..jar;E:\software\maven3.3.9\repository\commons-codec\commons-codec\1.10\commons-codec-1.10.jar;E:\software\maven3.3.9\repository\joda-time\joda-time\2.9.\joda-time-2.9..jar;E:\software\maven3.3.9\repository\org\jodd\jodd-core\3.5.\jodd-core-3.5..jar;E:\software\maven3.3.9\repository\org\datanucleus\datanucleus-core\3.2.\datanucleus-core-3.2..jar;E:\software\maven3.3.9\repository\org\apache\thrift\libthrift\0.9.\libthrift-0.9..jar;E:\software\maven3.3.9\repository\org\apache\thrift\libfb303\0.9.\libfb303-0.9..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-streaming-kafka--10_2.\2.2.\spark-streaming-kafka--10_2.-2.2..jar;E:\software\maven3.3.9\repository\org\apache\kafka\kafka_2.\0.10.0.1\kafka_2.-0.10.0.1.jar;E:\software\maven3.3.9\repository\com\101tec\zkclient\0.8\zkclient-0.8.jar;E:\software\maven3.3.9\repository\com\yammer\metrics\metrics-core\2.2.\metrics-core-2.2..jar;E:\software\maven3.3.9\repository\org\scala-lang\modules\scala-parser-combinators_2.\1.0.\scala-parser-combinators_2.-1.0..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-sql-kafka--10_2.\2.2.\spark-sql-kafka--10_2.-2.2..jar;E:\software\maven3.3.9\repository\org\apache\kafka\kafka-clients\0.10.0.1\kafka-clients-0.10.0.1.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-client\2.6.\hadoop-client-2.6..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-common\2.6.\hadoop-common-2.6..jar;E:\software\maven3.3.9\repository\xmlenc\xmlenc\0.52\xmlenc-0.52.jar;E:\software\maven3.3.9\repository\commons-collections\commons-collections\3.2.\commons-collections-3.2..jar;E:\software\maven3.3.9\repository\commons-configuration\commons-configuration\1.6\commons-configuration-1.6.jar;E:\software\maven3.3.9\reposit
ory\commons-digester\commons-digester\1.8\commons-digester-1.8.jar;E:\software\maven3.3.9\repository\commons-beanutils\commons-beanutils\1.7.\commons-beanutils-1.7..jar;E:\software\maven3.3.9\repository\commons-beanutils\commons-beanutils-core\1.8.\commons-beanutils-core-1.8..jar;E:\software\maven3.3.9\repository\com\google\protobuf\protobuf-java\2.5.\protobuf-java-2.5..jar;E:\software\maven3.3.9\repository\com\google\code\gson\gson\2.2.\gson-2.2..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-auth\2.6.\hadoop-auth-2.6..jar;E:\software\maven3.3.9\repository\org\apache\directory\server\apacheds-kerberos-codec\2.0.-M15\apacheds-kerberos-codec-2.0.-M15.jar;E:\software\maven3.3.9\repository\org\apache\directory\server\apacheds-i18n\2.0.-M15\apacheds-i18n-2.0.-M15.jar;E:\software\maven3.3.9\repository\org\apache\directory\api\api-asn1-api\1.0.-M20\api-asn1-api-1.0.-M20.jar;E:\software\maven3.3.9\repository\org\apache\directory\api\api-util\1.0.-M20\api-util-1.0.-M20.jar;E:\software\maven3.3.9\repository\org\apache\curator\curator-client\2.6.\curator-client-2.6..jar;E:\software\maven3.3.9\repository\org\htrace\htrace-core\3.0.\htrace-core-3.0..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-hdfs\2.6.\hadoop-hdfs-2.6..jar;E:\software\maven3.3.9\repository\org\mortbay\jetty\jetty-util\6.1.\jetty-util-6.1..jar;E:\software\maven3.3.9\repository\xerces\xercesImpl\2.9.\xercesImpl-2.9..jar;E:\software\maven3.3.9\repository\xml-apis\xml-apis\1.3.\xml-apis-1.3..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-mapreduce-client-app\2.6.\hadoop-mapreduce-client-app-2.6..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-mapreduce-client-common\2.6.\hadoop-mapreduce-client-common-2.6..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-yarn-client\2.6.\hadoop-yarn-client-2.6..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-yarn-server-common\2.6.\hadoop-yarn-server-common-2.6..jar;E:\software\maven3.3.9
\repository\org\apache\hadoop\hadoop-mapreduce-client-shuffle\2.6.\hadoop-mapreduce-client-shuffle-2.6..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-yarn-api\2.6.\hadoop-yarn-api-2.6..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-mapreduce-client-core\2.6.\hadoop-mapreduce-client-core-2.6..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-yarn-common\2.6.\hadoop-yarn-common-2.6..jar;E:\software\maven3.3.9\repository\javax\xml\bind\jaxb-api\2.2.\jaxb-api-2.2..jar;E:\software\maven3.3.9\repository\javax\xml\stream\stax-api\1.0-\stax-api-1.0-.jar;E:\software\maven3.3.9\repository\javax\servlet\servlet-api\2.5\servlet-api-2.5.jar;E:\software\maven3.3.9\repository\com\sun\jersey\jersey-core\1.9\jersey-core-1.9.jar;E:\software\maven3.3.9\repository\com\sun\jersey\jersey-client\1.9\jersey-client-1.9.jar;E:\software\maven3.3.9\repository\org\codehaus\jackson\jackson-jaxrs\1.9.\jackson-jaxrs-1.9..jar;E:\software\maven3.3.9\repository\org\codehaus\jackson\jackson-xc\1.9.\jackson-xc-1.9..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-mapreduce-client-jobclient\2.6.\hadoop-mapreduce-client-jobclient-2.6..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-annotations\2.6.\hadoop-annotations-2.6..jar com.spark.test.Test
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
// :: INFO SparkContext: Running Spark version 2.2.
// :: WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
// :: ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:)
at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:)
at org.apache.hadoop.security.Groups.<init>(Groups.java:)
at org.apache.hadoop.security.Groups.<init>(Groups.java:)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:)
at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$.apply(Utils.scala:)
at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$.apply(Utils.scala:)
at scala.Option.getOrElse(Option.scala:)
at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$.apply(SparkSession.scala:)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$.apply(SparkSession.scala:)
at scala.Option.getOrElse(Option.scala:)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:)
at com.spark.test.Test$.main(Test.scala:)
at com.spark.test.Test.main(Test.scala)
// :: ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: A master URL must be set in your configuration
at org.apache.spark.SparkContext.<init>(SparkContext.scala:)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$.apply(SparkSession.scala:)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$.apply(SparkSession.scala:)
at scala.Option.getOrElse(Option.scala:)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:)
at com.spark.test.Test$.main(Test.scala:)
at com.spark.test.Test.main(Test.scala)
// :: INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: A master URL must be set in your configuration
at org.apache.spark.SparkContext.<init>(SparkContext.scala:)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$.apply(SparkSession.scala:)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$.apply(SparkSession.scala:)
at scala.Option.getOrElse(Option.scala:)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:)
at com.spark.test.Test$.main(Test.scala:)
at com.spark.test.Test.main(Test.scala)
Process finished with exit code
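The winutils error appears because Hadoop is not configured on my local machine (HADOOP_HOME is unset, which is why the path reads null\bin\winutils.exe), so let's set it up now.
This is my local hadoop/bin directory.
Next, configure the environment variables on the local Windows 10 machine.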
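Once the variable is set, a quick check along the following lines can confirm that winutils.exe is where Hadoop will look for it. This is a minimal sketch under two assumptions: that the variable you configured is `HADOOP_HOME`, and that winutils.exe sits in its `bin` subdirectory. `WinutilsCheck` is a hypothetical helper, not part of Hadoop or Spark.

```scala
import java.io.File

object WinutilsCheck {
  // Look for <HADOOP_HOME>\bin\winutils.exe in the given environment map.
  def locate(env: Map[String, String]): Either[String, File] =
    env.get("HADOOP_HOME") match {
      case None => Left("HADOOP_HOME is not set")
      case Some(home) =>
        val exe = new File(new File(home, "bin"), "winutils.exe")
        if (exe.isFile) Right(exe) else Left(s"winutils.exe not found at $exe")
    }

  def main(args: Array[String]): Unit =
    println(locate(sys.env).fold(msg => "Problem: " + msg, exe => "Found " + exe))
}
```

A process only sees environment variables that existed when it started, which is why the IDE has to be restarted before the new value takes effect.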
Restart IDEA, then run the program again.
A different error is reported this time, but we can be sure that the previous error has been resolved.
E:\software\jdk1.\bin\java "-javaagent:E:\software\IDEA\IntelliJ IDEA 2017.2.6\lib\idea_rt.jar=60011:E:\software\IDEA\IntelliJ IDEA 2017.2.6\bin" -Dfile.encoding=UTF- -classpath E:\software\jdk1.\jre\lib\charsets.jar;E:\software\jdk1.\jre\lib\deploy.jar;E:\software\jdk1.\jre\lib\ext\access-bridge-.jar;E:\software\jdk1.\jre\lib\ext\cldrdata.jar;E:\software\jdk1.\jre\lib\ext\dnsns.jar;E:\software\jdk1.\jre\lib\ext\jaccess.jar;E:\software\jdk1.\jre\lib\ext\jfxrt.jar;E:\software\jdk1.\jre\lib\ext\localedata.jar;E:\software\jdk1.\jre\lib\ext\nashorn.jar;E:\software\jdk1.\jre\lib\ext\sunec.jar;E:\software\jdk1.\jre\lib\ext\sunjce_provider.jar;E:\software\jdk1.\jre\lib\ext\sunmscapi.jar;E:\software\jdk1.\jre\lib\ext\sunpkcs11.jar;E:\software\jdk1.\jre\lib\ext\zipfs.jar;E:\software\jdk1.\jre\lib\javaws.jar;E:\software\jdk1.\jre\lib\jce.jar;E:\software\jdk1.\jre\lib\jfr.jar;E:\software\jdk1.\jre\lib\jfxswt.jar;E:\software\jdk1.\jre\lib\jsse.jar;E:\software\jdk1.\jre\lib\management-agent.jar;E:\software\jdk1.\jre\lib\plugin.jar;E:\software\jdk1.\jre\lib\resources.jar;E:\software\jdk1.\jre\lib\rt.jar;E:\Mycode\SparkStu\target\classes;E:\software\Scala\lib\scala-actors-2.11..jar;E:\software\Scala\lib\scala-actors-migration_2.-1.1..jar;E:\software\Scala\lib\scala-library.jar;E:\software\Scala\lib\scala-parser-combinators_2.-1.0..jar;E:\software\Scala\lib\scala-reflect.jar;E:\software\Scala\lib\scala-swing_2.-1.0..jar;E:\software\Scala\lib\scala-xml_2.-1.0..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-core_2.\2.2.\spark-core_2.-2.2..jar;E:\software\maven3.3.9\repository\org\apache\avro\avro\1.7.\avro-1.7..jar;E:\software\maven3.3.9\repository\org\codehaus\jackson\jackson-core-asl\1.9.\jackson-core-asl-1.9..jar;E:\software\maven3.3.9\repository\com\thoughtworks\paranamer\paranamer\2.3\paranamer-2.3.jar;E:\software\maven3.3.9\repository\org\apache\commons\commons-compress\1.4.\commons-compress-1.4..jar;E:\software\maven3.3.9\repository\org\tukaani\xz\1.0\xz-1.0.jar;
E:\software\maven3.3.9\repository\org\apache\avro\avro-mapred\1.7.\avro-mapred-1.7.-hadoop2.jar;E:\software\maven3.3.9\repository\org\apache\avro\avro-ipc\1.7.\avro-ipc-1.7..jar;E:\software\maven3.3.9\repository\org\apache\avro\avro-ipc\1.7.\avro-ipc-1.7.-tests.jar;E:\software\maven3.3.9\repository\com\twitter\chill_2.\0.8.\chill_2.-0.8..jar;E:\software\maven3.3.9\repository\com\esotericsoftware\kryo-shaded\3.0.\kryo-shaded-3.0..jar;E:\software\maven3.3.9\repository\com\esotericsoftware\minlog\1.3.\minlog-1.3..jar;E:\software\maven3.3.9\repository\org\objenesis\objenesis\2.1\objenesis-2.1.jar;E:\software\maven3.3.9\repository\com\twitter\chill-java\0.8.\chill-java-0.8..jar;E:\software\maven3.3.9\repository\org\apache\xbean\xbean-asm5-shaded\4.4\xbean-asm5-shaded-4.4.jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-launcher_2.\2.2.\spark-launcher_2.-2.2..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-network-common_2.\2.2.\spark-network-common_2.-2.2..jar;E:\software\maven3.3.9\repository\org\fusesource\leveldbjni\leveldbjni-all\1.8\leveldbjni-all-1.8.jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\core\jackson-annotations\2.6.\jackson-annotations-2.6..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-network-shuffle_2.\2.2.\spark-network-shuffle_2.-2.2..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-unsafe_2.\2.2.\spark-unsafe_2.-2.2..jar;E:\software\maven3.3.9\repository\net\java\dev\jets3t\jets3t\0.9.\jets3t-0.9..jar;E:\software\maven3.3.9\repository\org\apache\httpcomponents\httpcore\4.3.\httpcore-4.3..jar;E:\software\maven3.3.9\repository\javax\activation\activation\1.1.\activation-1.1..jar;E:\software\maven3.3.9\repository\mx4j\mx4j\3.0.\mx4j-3.0..jar;E:\software\maven3.3.9\repository\javax\mail\mail\1.4.\mail-1.4..jar;E:\software\maven3.3.9\repository\org\bouncycastle\bcprov-jdk15on\1.51\bcprov-jdk15on-1.51.jar;E:\software\maven3.3.9\repository\com\jamesmurty\utils\java-xmlbuilder\1.0\java-xmlbuild
er-1.0.jar;E:\software\maven3.3.9\repository\net\iharder\base64\2.3.\base64-2.3..jar;E:\software\maven3.3.9\repository\org\apache\curator\curator-recipes\2.6.\curator-recipes-2.6..jar;E:\software\maven3.3.9\repository\org\apache\curator\curator-framework\2.6.\curator-framework-2.6..jar;E:\software\maven3.3.9\repository\org\apache\zookeeper\zookeeper\3.4.\zookeeper-3.4..jar;E:\software\maven3.3.9\repository\com\google\guava\guava\16.0.\guava-16.0..jar;E:\software\maven3.3.9\repository\javax\servlet\javax.servlet-api\3.1.\javax.servlet-api-3.1..jar;E:\software\maven3.3.9\repository\org\apache\commons\commons-lang3\3.5\commons-lang3-3.5.jar;E:\software\maven3.3.9\repository\org\apache\commons\commons-math3\3.4.\commons-math3-3.4..jar;E:\software\maven3.3.9\repository\com\google\code\findbugs\jsr305\1.3.\jsr305-1.3..jar;E:\software\maven3.3.9\repository\org\slf4j\slf4j-api\1.7.\slf4j-api-1.7..jar;E:\software\maven3.3.9\repository\org\slf4j\jul-to-slf4j\1.7.\jul-to-slf4j-1.7..jar;E:\software\maven3.3.9\repository\org\slf4j\jcl-over-slf4j\1.7.\jcl-over-slf4j-1.7..jar;E:\software\maven3.3.9\repository\log4j\log4j\1.2.\log4j-1.2..jar;E:\software\maven3.3.9\repository\org\slf4j\slf4j-log4j12\1.7.\slf4j-log4j12-1.7..jar;E:\software\maven3.3.9\repository\com\ning\compress-lzf\1.0.\compress-lzf-1.0..jar;E:\software\maven3.3.9\repository\org\xerial\snappy\snappy-java\1.1.2.6\snappy-java-1.1.2.6.jar;E:\software\maven3.3.9\repository\net\jpountz\lz4\lz4\1.3.\lz4-1.3..jar;E:\software\maven3.3.9\repository\org\roaringbitmap\RoaringBitmap\0.5.\RoaringBitmap-0.5..jar;E:\software\maven3.3.9\repository\commons-net\commons-net\2.2\commons-net-2.2.jar;E:\software\maven3.3.9\repository\org\scala-lang\scala-library\2.11.\scala-library-2.11..jar;E:\software\maven3.3.9\repository\org\json4s\json4s-jackson_2.\3.2.\json4s-jackson_2.-3.2..jar;E:\software\maven3.3.9\repository\org\json4s\json4s-core_2.\3.2.\json4s-core_2.-3.2..jar;E:\software\maven3.3.9\repository\org\json4s\json4s-ast_2.\3.2.\js
on4s-ast_2.-3.2..jar;E:\software\maven3.3.9\repository\org\scala-lang\scalap\2.11.\scalap-2.11..jar;E:\software\maven3.3.9\repository\org\scala-lang\scala-compiler\2.11.\scala-compiler-2.11..jar;E:\software\maven3.3.9\repository\org\scala-lang\modules\scala-xml_2.\1.0.\scala-xml_2.-1.0..jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\core\jersey-client\2.22.\jersey-client-2.22..jar;E:\software\maven3.3.9\repository\javax\ws\rs\javax.ws.rs-api\2.0.\javax.ws.rs-api-2.0..jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\hk2-api\2.4.-b34\hk2-api-2.4.-b34.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\hk2-utils\2.4.-b34\hk2-utils-2.4.-b34.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\external\aopalliance-repackaged\2.4.-b34\aopalliance-repackaged-2.4.-b34.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\external\javax.inject\2.4.-b34\javax.inject-2.4.-b34.jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\hk2-locator\2.4.-b34\hk2-locator-2.4.-b34.jar;E:\software\maven3.3.9\repository\org\javassist\javassist\3.18.-GA\javassist-3.18.-GA.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\core\jersey-common\2.22.\jersey-common-2.22..jar;E:\software\maven3.3.9\repository\javax\annotation\javax.annotation-api\1.2\javax.annotation-api-1.2.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\bundles\repackaged\jersey-guava\2.22.\jersey-guava-2.22..jar;E:\software\maven3.3.9\repository\org\glassfish\hk2\osgi-resource-locator\1.0.\osgi-resource-locator-1.0..jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\core\jersey-server\2.22.\jersey-server-2.22..jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\media\jersey-media-jaxb\2.22.\jersey-media-jaxb-2.22..jar;E:\software\maven3.3.9\repository\javax\validation\validation-api\1.1..Final\validation-api-1.1..Final.jar;E:\software\maven3.3.9\repository\org\glassfish\jersey\containers\jersey-container-servlet\2.22.\jersey-container-servlet-2.22..jar;E:\software
\maven3.3.9\repository\org\glassfish\jersey\containers\jersey-container-servlet-core\2.22.\jersey-container-servlet-core-2.22..jar;E:\software\maven3.3.9\repository\io\netty\netty-all\4.0..Final\netty-all-4.0..Final.jar;E:\software\maven3.3.9\repository\io\netty\netty\3.9..Final\netty-3.9..Final.jar;E:\software\maven3.3.9\repository\com\clearspring\analytics\stream\2.7.\stream-2.7..jar;E:\software\maven3.3.9\repository\io\dropwizard\metrics\metrics-core\3.1.\metrics-core-3.1..jar;E:\software\maven3.3.9\repository\io\dropwizard\metrics\metrics-jvm\3.1.\metrics-jvm-3.1..jar;E:\software\maven3.3.9\repository\io\dropwizard\metrics\metrics-json\3.1.\metrics-json-3.1..jar;E:\software\maven3.3.9\repository\io\dropwizard\metrics\metrics-graphite\3.1.\metrics-graphite-3.1..jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\core\jackson-databind\2.6.\jackson-databind-2.6..jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\core\jackson-core\2.6.\jackson-core-2.6..jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\module\jackson-module-scala_2.\2.6.\jackson-module-scala_2.-2.6..jar;E:\software\maven3.3.9\repository\org\scala-lang\scala-reflect\2.11.\scala-reflect-2.11..jar;E:\software\maven3.3.9\repository\com\fasterxml\jackson\module\jackson-module-paranamer\2.6.\jackson-module-paranamer-2.6..jar;E:\software\maven3.3.9\repository\org\apache\ivy\ivy\2.4.\ivy-2.4..jar;E:\software\maven3.3.9\repository\oro\oro\2.0.\oro-2.0..jar;E:\software\maven3.3.9\repository\net\razorvine\pyrolite\4.13\pyrolite-4.13.jar;E:\software\maven3.3.9\repository\net\sf\py4j\py4j\0.10.\py4j-0.10..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-tags_2.\2.2.\spark-tags_2.-2.2..jar;E:\software\maven3.3.9\repository\org\apache\commons\commons-crypto\1.0.\commons-crypto-1.0..jar;E:\software\maven3.3.9\repository\org\spark-project\spark\unused\1.0.\unused-1.0..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-sql_2.\2.2.\spark-sql_2.-2.2..jar;E:\software\mave
n3.3.9\repository\com\univocity\univocity-parsers\2.2.\univocity-parsers-2.2..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-sketch_2.\2.2.\spark-sketch_2.-2.2..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-catalyst_2.\2.2.\spark-catalyst_2.-2.2..jar;E:\software\maven3.3.9\repository\org\codehaus\janino\janino\3.0.\janino-3.0..jar;E:\software\maven3.3.9\repository\org\codehaus\janino\commons-compiler\3.0.\commons-compiler-3.0..jar;E:\software\maven3.3.9\repository\org\antlr\antlr4-runtime\4.5.\antlr4-runtime-4.5..jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-column\1.8.\parquet-column-1.8..jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-common\1.8.\parquet-common-1.8..jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-encoding\1.8.\parquet-encoding-1.8..jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-hadoop\1.8.\parquet-hadoop-1.8..jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-format\2.3.\parquet-format-2.3..jar;E:\software\maven3.3.9\repository\org\apache\parquet\parquet-jackson\1.8.\parquet-jackson-1.8..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-streaming_2.\2.2.\spark-streaming_2.-2.2..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-hive_2.\2.2.\spark-hive_2.-2.2..jar;E:\software\maven3.3.9\repository\com\twitter\parquet-hadoop-bundle\1.6.\parquet-hadoop-bundle-1.6..jar;E:\software\maven3.3.9\repository\org\spark-project\hive\hive-exec\1.2..spark2\hive-exec-1.2..spark2.jar;E:\software\maven3.3.9\repository\commons-io\commons-io\2.4\commons-io-2.4.jar;E:\software\maven3.3.9\repository\commons-lang\commons-lang\2.6\commons-lang-2.6.jar;E:\software\maven3.3.9\repository\javolution\javolution\5.5.\javolution-5.5..jar;E:\software\maven3.3.9\repository\log4j\apache-log4j-extras\1.2.\apache-log4j-extras-1.2..jar;E:\software\maven3.3.9\repository\org\antlr\antlr-runtime\3.4\antlr-runtime-3.4.jar;E:\software\maven3.3.9\repositor
y\org\antlr\stringtemplate\3.2.\stringtemplate-3.2..jar;E:\software\maven3.3.9\repository\antlr\antlr\2.7.\antlr-2.7..jar;E:\software\maven3.3.9\repository\org\antlr\ST4\4.0.\ST4-4.0..jar;E:\software\maven3.3.9\repository\com\googlecode\javaewah\JavaEWAH\0.3.\JavaEWAH-0.3..jar;E:\software\maven3.3.9\repository\org\iq80\snappy\snappy\0.2\snappy-0.2.jar;E:\software\maven3.3.9\repository\stax\stax-api\1.0.\stax-api-1.0..jar;E:\software\maven3.3.9\repository\net\sf\opencsv\opencsv\2.3\opencsv-2.3.jar;E:\software\maven3.3.9\repository\org\spark-project\hive\hive-metastore\1.2..spark2\hive-metastore-1.2..spark2.jar;E:\software\maven3.3.9\repository\com\jolbox\bonecp\0.8..RELEASE\bonecp-0.8..RELEASE.jar;E:\software\maven3.3.9\repository\commons-cli\commons-cli\1.2\commons-cli-1.2.jar;E:\software\maven3.3.9\repository\commons-logging\commons-logging\1.1.\commons-logging-1.1..jar;E:\software\maven3.3.9\repository\org\apache\derby\derby\10.10.2.0\derby-10.10.2.0.jar;E:\software\maven3.3.9\repository\org\datanucleus\datanucleus-api-jdo\3.2.\datanucleus-api-jdo-3.2..jar;E:\software\maven3.3.9\repository\org\datanucleus\datanucleus-rdbms\3.2.\datanucleus-rdbms-3.2..jar;E:\software\maven3.3.9\repository\commons-pool\commons-pool\1.5.\commons-pool-1.5..jar;E:\software\maven3.3.9\repository\commons-dbcp\commons-dbcp\1.4\commons-dbcp-1.4.jar;E:\software\maven3.3.9\repository\javax\jdo\jdo-api\3.0.\jdo-api-3.0..jar;E:\software\maven3.3.9\repository\javax\transaction\jta\1.1\jta-1.1.jar;E:\software\maven3.3.9\repository\commons-httpclient\commons-httpclient\3.1\commons-httpclient-3.1.jar;E:\software\maven3.3.9\repository\org\apache\calcite\calcite-avatica\1.2.-incubating\calcite-avatica-1.2.-incubating.jar;E:\software\maven3.3.9\repository\org\apache\calcite\calcite-core\1.2.-incubating\calcite-core-1.2.-incubating.jar;E:\software\maven3.3.9\repository\org\apache\calcite\calcite-linq4j\1.2.-incubating\calcite-linq4j-1.2.-incubating.jar;E:\software\maven3.3.9\repository\net\hydromatic\
eigenbase-properties\1.1.\eigenbase-properties-1.1..jar;E:\software\maven3.3.9\repository\org\apache\httpcomponents\httpclient\4.5.\httpclient-4.5..jar;E:\software\maven3.3.9\repository\org\codehaus\jackson\jackson-mapper-asl\1.9.\jackson-mapper-asl-1.9..jar;E:\software\maven3.3.9\repository\commons-codec\commons-codec\1.10\commons-codec-1.10.jar;E:\software\maven3.3.9\repository\joda-time\joda-time\2.9.\joda-time-2.9..jar;E:\software\maven3.3.9\repository\org\jodd\jodd-core\3.5.\jodd-core-3.5..jar;E:\software\maven3.3.9\repository\org\datanucleus\datanucleus-core\3.2.\datanucleus-core-3.2..jar;E:\software\maven3.3.9\repository\org\apache\thrift\libthrift\0.9.\libthrift-0.9..jar;E:\software\maven3.3.9\repository\org\apache\thrift\libfb303\0.9.\libfb303-0.9..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-streaming-kafka--10_2.\2.2.\spark-streaming-kafka--10_2.-2.2..jar;E:\software\maven3.3.9\repository\org\apache\kafka\kafka_2.\0.10.0.1\kafka_2.-0.10.0.1.jar;E:\software\maven3.3.9\repository\com\101tec\zkclient\0.8\zkclient-0.8.jar;E:\software\maven3.3.9\repository\com\yammer\metrics\metrics-core\2.2.\metrics-core-2.2..jar;E:\software\maven3.3.9\repository\org\scala-lang\modules\scala-parser-combinators_2.\1.0.\scala-parser-combinators_2.-1.0..jar;E:\software\maven3.3.9\repository\org\apache\spark\spark-sql-kafka--10_2.\2.2.\spark-sql-kafka--10_2.-2.2..jar;E:\software\maven3.3.9\repository\org\apache\kafka\kafka-clients\0.10.0.1\kafka-clients-0.10.0.1.jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-client\2.6.\hadoop-client-2.6..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-common\2.6.\hadoop-common-2.6..jar;E:\software\maven3.3.9\repository\xmlenc\xmlenc\0.52\xmlenc-0.52.jar;E:\software\maven3.3.9\repository\commons-collections\commons-collections\3.2.\commons-collections-3.2..jar;E:\software\maven3.3.9\repository\commons-configuration\commons-configuration\1.6\commons-configuration-1.6.jar;E:\software\maven3.3.9\reposit
ory\commons-digester\commons-digester\1.8\commons-digester-1.8.jar;E:\software\maven3.3.9\repository\commons-beanutils\commons-beanutils\1.7.\commons-beanutils-1.7..jar;E:\software\maven3.3.9\repository\commons-beanutils\commons-beanutils-core\1.8.\commons-beanutils-core-1.8..jar;E:\software\maven3.3.9\repository\com\google\protobuf\protobuf-java\2.5.\protobuf-java-2.5..jar;E:\software\maven3.3.9\repository\com\google\code\gson\gson\2.2.\gson-2.2..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-auth\2.6.\hadoop-auth-2.6..jar;E:\software\maven3.3.9\repository\org\apache\directory\server\apacheds-kerberos-codec\2.0.-M15\apacheds-kerberos-codec-2.0.-M15.jar;E:\software\maven3.3.9\repository\org\apache\directory\server\apacheds-i18n\2.0.-M15\apacheds-i18n-2.0.-M15.jar;E:\software\maven3.3.9\repository\org\apache\directory\api\api-asn1-api\1.0.-M20\api-asn1-api-1.0.-M20.jar;E:\software\maven3.3.9\repository\org\apache\directory\api\api-util\1.0.-M20\api-util-1.0.-M20.jar;E:\software\maven3.3.9\repository\org\apache\curator\curator-client\2.6.\curator-client-2.6..jar;E:\software\maven3.3.9\repository\org\htrace\htrace-core\3.0.\htrace-core-3.0..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-hdfs\2.6.\hadoop-hdfs-2.6..jar;E:\software\maven3.3.9\repository\org\mortbay\jetty\jetty-util\6.1.\jetty-util-6.1..jar;E:\software\maven3.3.9\repository\xerces\xercesImpl\2.9.\xercesImpl-2.9..jar;E:\software\maven3.3.9\repository\xml-apis\xml-apis\1.3.\xml-apis-1.3..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-mapreduce-client-app\2.6.\hadoop-mapreduce-client-app-2.6..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-mapreduce-client-common\2.6.\hadoop-mapreduce-client-common-2.6..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-yarn-client\2.6.\hadoop-yarn-client-2.6..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-yarn-server-common\2.6.\hadoop-yarn-server-common-2.6..jar;E:\software\maven3.3.9
\repository\org\apache\hadoop\hadoop-mapreduce-client-shuffle\2.6.\hadoop-mapreduce-client-shuffle-2.6..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-yarn-api\2.6.\hadoop-yarn-api-2.6..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-mapreduce-client-core\2.6.\hadoop-mapreduce-client-core-2.6..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-yarn-common\2.6.\hadoop-yarn-common-2.6..jar;E:\software\maven3.3.9\repository\javax\xml\bind\jaxb-api\2.2.\jaxb-api-2.2..jar;E:\software\maven3.3.9\repository\javax\xml\stream\stax-api\1.0-\stax-api-1.0-.jar;E:\software\maven3.3.9\repository\javax\servlet\servlet-api\2.5\servlet-api-2.5.jar;E:\software\maven3.3.9\repository\com\sun\jersey\jersey-core\1.9\jersey-core-1.9.jar;E:\software\maven3.3.9\repository\com\sun\jersey\jersey-client\1.9\jersey-client-1.9.jar;E:\software\maven3.3.9\repository\org\codehaus\jackson\jackson-jaxrs\1.9.\jackson-jaxrs-1.9..jar;E:\software\maven3.3.9\repository\org\codehaus\jackson\jackson-xc\1.9.\jackson-xc-1.9..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-mapreduce-client-jobclient\2.6.\hadoop-mapreduce-client-jobclient-2.6..jar;E:\software\maven3.3.9\repository\org\apache\hadoop\hadoop-annotations\2.6.\hadoop-annotations-2.6..jar com.spark.test.Test
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
// :: INFO SparkContext: Running Spark version 2.2.
// :: ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: A master URL must be set in your configuration
at org.apache.spark.SparkContext.<init>(SparkContext.scala:)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$.apply(SparkSession.scala:)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$.apply(SparkSession.scala:)
at scala.Option.getOrElse(Option.scala:)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:)
at com.spark.test.Test$.main(Test.scala:)
at com.spark.test.Test.main(Test.scala)
// :: INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: A master URL must be set in your configuration
at org.apache.spark.SparkContext.<init>(SparkContext.scala:)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$.apply(SparkSession.scala:)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$.apply(SparkSession.scala:)
at scala.Option.getOrElse(Option.scala:)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:)
at com.spark.test.Test$.main(Test.scala:)
at com.spark.test.Test.main(Test.scala)
Process finished with exit code
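This error means Spark has not been told where the program should run: no master URL is set in the configuration.
Add this line to the program to specify that, for now, we are running locally.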
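A minimal sketch of setting the master in code, assuming the program builds its session through `SparkSession.builder` (the stack trace above goes through `SparkSession$Builder.getOrCreate`). The app name `HdfsTest` is taken from the run log; the object name `LocalSession` and the master value `local[*]` are assumptions for illustration, and plain `local` (a single thread) would work as well.

```scala
import org.apache.spark.sql.SparkSession

object LocalSession {
  // Build a SparkSession that runs inside the current JVM instead of on a cluster.
  // "local[*]" (assumed here) uses one worker thread per available core.
  def create(): SparkSession =
    SparkSession.builder()
      .appName("HdfsTest")   // app name as reported in the run log
      .master("local[*]")    // without a master, Spark fails with "A master URL must be set"
      .getOrCreate()

  def main(args: Array[String]): Unit = {
    val spark = create()
    println(s"master = ${spark.sparkContext.master}")
    spark.stop()
  }
}
```

Hard-coding the master is convenient while developing in the IDE. When the same jar is later submitted to a cluster, the master is usually passed on the `spark-submit` command line instead, so the `.master(...)` call is typically removed at that point.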
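Run it again, and you can see that the error is gone.
Next, we keep modifying the program and add this line.
Run it again and look at the result.
Then sum up the counts of identical words.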
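The word count with the RDD API can be sketched roughly as follows. This is not the original program: the object name `RddWordCount`, the `count` helper, and the in-memory sample lines are hypothetical, chosen so the sketch runs without an input file.

```scala
import org.apache.spark.sql.SparkSession

object RddWordCount {
  // Count how often each word occurs, using the RDD API.
  def count(spark: SparkSession, lines: Seq[String]): Map[String, Int] =
    spark.sparkContext
      .parallelize(lines)            // hypothetical in-memory input instead of a file
      .flatMap(line => line.split(" "))
      .map(word => (word, 1))        // pair every word with an initial count of 1
      .reduceByKey(_ + _)            // add up the counts of identical words
      .collect()
      .toMap

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("HdfsTest")
      .master("local[*]")
      .getOrCreate()
    count(spark, Seq("hello spark", "hello scala")).foreach(println)
    spark.stop()
  }
}
```

`reduceByKey` merges the values of each key pairwise, so the partial sums are combined on each partition before any data is shuffled.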
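Let's look at the result of the run.
So far we have used the RDD approach; next we use the Dataset approach.
A Dataset can be thought of, roughly, as a table in a database.
The result of our run:
Split each line into words on spaces.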
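A sketch of the same word count with the Dataset API, under the same assumptions as before: the object name `DatasetWordCount`, the `count` helper, and the sample lines are hypothetical and stand in for the original program and its input.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

object DatasetWordCount {
  // Count how often each word occurs, using the typed Dataset API.
  def count(spark: SparkSession, lines: Seq[String]): Map[String, Long] = {
    import spark.implicits._                 // encoders for String and tuples, plus toDS()
    val ds: Dataset[String] = lines.toDS()   // a one-column "table" of lines
    ds.flatMap(line => line.split(" "))      // split each line into words on spaces
      .groupByKey(identity)                  // group identical words together
      .count()                               // Dataset[(String, Long)]
      .collect()
      .toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("HdfsTest")
      .master("local[*]")
      .getOrCreate()
    count(spark, Seq("hello spark", "hello scala")).foreach(println)
    spark.stop()
  }
}
```

Unlike the RDD version, the Dataset carries a schema, so intermediate results can also be inspected as tables with `show()`.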
Run result:
E:\software\jdk1.\bin\java "-javaagent:E:\software\IDEA\IntelliJ IDEA 2017.2.6\lib\idea_rt.jar=62232:E:\software\IDEA\IntelliJ IDEA 2017.2.6\bin" -Dfile.encoding=UTF- -classpath [JDK, Scala, and Maven repository jars omitted] com.spark.test.Test
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
// :: INFO SparkContext: Running Spark version 2.2.
// :: INFO SparkContext: Submitted application: HdfsTest
// :: INFO SecurityManager: Changing view acls to: Brave
// :: INFO SecurityManager: Changing modify acls to: Brave
// :: INFO SecurityManager: Changing view acls groups to:
// :: INFO SecurityManager: Changing modify acls groups to:
// :: INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(Brave); groups with view permissions: Set(); users with modify permissions: Set(Brave); groups with modify permissions: Set()
// :: INFO Utils: Successfully started service 'sparkDriver' on port .
// :: INFO SparkEnv: Registering MapOutputTracker
// :: INFO SparkEnv: Registering BlockManagerMaster
// :: INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
// :: INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
// :: INFO DiskBlockManager: Created local directory at C:\Users\Brave\AppData\Local\Temp\blockmgr-2ad95228--4a24-b6b6-b09973c4a4ff
// :: INFO MemoryStore: MemoryStore started with capacity 1998.3 MB
// :: INFO SparkEnv: Registering OutputCommitCoordinator
// :: INFO Utils: Successfully started service 'SparkUI' on port .
// :: INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.56.1:4040
// :: INFO Executor: Starting executor ID driver on host localhost
// :: INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port .
// :: INFO NettyBlockTransferService: Server created on 192.168.56.1:
// :: INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
// :: INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.56.1, , None)
// :: INFO BlockManagerMasterEndpoint: Registering block manager 192.168.56.1: with 1998.3 MB RAM, BlockManagerId(driver, 192.168.56.1, , None)
// :: INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.56.1, , None)
// :: INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.56.1, , None)
// :: INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/E:/Mycode/SparkStu/spark-warehouse/').
// :: INFO SharedState: Warehouse path is 'file:/E:/Mycode/SparkStu/spark-warehouse/'.
// :: INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
// :: INFO FileSourceStrategy: Pruning directories with:
// :: INFO FileSourceStrategy: Post-Scan Filters:
// :: INFO FileSourceStrategy: Output Data Schema: struct<value: string>
// :: INFO FileSourceScanExec: Pushed Filters:
// :: INFO CodeGenerator: Code generated in 321.911944 ms
// :: INFO CodeGenerator: Code generated in 9.798824 ms
// :: INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 213.6 KB, free 1998.1 MB)
// :: INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 20.2 KB, free 1998.1 MB)
// :: INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.56.1: (size: 20.2 KB, free: 1998.3 MB)
// :: INFO SparkContext: Created broadcast from show at Test.scala:
// :: INFO FileSourceScanExec: Planning scan with bin packing, max size: bytes, open cost is considered as scanning bytes.
// :: INFO SparkContext: Starting job: show at Test.scala:
// :: INFO DAGScheduler: Got job (show at Test.scala:) with output partitions
// :: INFO DAGScheduler: Final stage: ResultStage (show at Test.scala:)
// :: INFO DAGScheduler: Parents of final stage: List()
// :: INFO DAGScheduler: Missing parents: List()
// :: INFO DAGScheduler: Submitting ResultStage (MapPartitionsRDD[] at show at Test.scala:), which has no missing parents
// :: INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 13.0 KB, free 1998.1 MB)
// :: INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 6.1 KB, free 1998.1 MB)
// :: INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.56.1: (size: 6.1 KB, free: 1998.3 MB)
// :: INFO SparkContext: Created broadcast from broadcast at DAGScheduler.scala:
// :: INFO DAGScheduler: Submitting missing tasks from ResultStage (MapPartitionsRDD[] at show at Test.scala:) (first tasks are for partitions Vector())
// :: INFO TaskSchedulerImpl: Adding task set 0.0 with tasks
// :: INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID , localhost, executor driver, partition , PROCESS_LOCAL, bytes)
// :: INFO Executor: Running task 0.0 in stage 0.0 (TID )
// :: INFO CodeGenerator: Code generated in 13.617205 ms
// :: INFO FileScanRDD: Reading File path: file:///E:/Mycode/datas/stu.txt, range: 0-113, partition values: [empty row]
// :: INFO CodeGenerator: Code generated in 11.971125 ms
// :: INFO Executor: Finished task 0.0 in stage 0.0 (TID ). bytes result sent to driver
// :: INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID ) in ms on localhost (executor driver) (/)
// :: INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
// :: INFO DAGScheduler: ResultStage (show at Test.scala:) finished in 0.284 s
// :: INFO DAGScheduler: Job finished: show at Test.scala:, took 0.483521 s
// :: INFO CodeGenerator: Code generated in 23.334109 ms
+------+
| value|
+------+
|hadoop|
|hadoop|
|  java|
|  java|
| spark|
| spark|
|  hive|
| hbase|
| sqoop|
| sqoop|
| mysql|
| redit|
| flume|
| flume|
|  join|
|   hue|
| scala|
|python|
+------+
// :: INFO SparkContext: Invoking stop() from shutdown hook
// :: INFO SparkUI: Stopped Spark web UI at http://192.168.56.1:4040
// :: INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
// :: INFO MemoryStore: MemoryStore cleared
// :: INFO BlockManager: BlockManager stopped
// :: INFO BlockManagerMaster: BlockManagerMaster stopped
// :: INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
// :: INFO SparkContext: Successfully stopped SparkContext
// :: INFO ShutdownHookManager: Shutdown hook called
// :: INFO ShutdownHookManager: Deleting directory C:\Users\Brave\AppData\Local\Temp\spark-2c920b38-6a7f--a2ef-9ee345492414
Process finished with exit code
Add this field
Run result
Group and count
Run result
Here is the final code
package com.spark.test

import org.apache.spark.sql.SparkSession
import org.apache.spark.{SparkConf, SparkContext}

object Test {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder
      .master("local")
      .appName("HdfsTest")
      .getOrCreate()

    val filePart = "E://Mycode/datas/stu.txt"

    // Earlier RDD-based version of the same word count:
    // val rdd = spark.sparkContext.textFile(filePart)
    // val lines = rdd.flatMap(x => x.split(" ")).map(x => (x, 1)).reduceByKey((a, b) => (a + b)).collect().toList
    // println(lines)

    import spark.implicits._
    // Split each line into words, pair each word with 1, then count the rows per word.
    spark.read.textFile(filePart)
      .flatMap(x => x.split(" "))
      .map(x => (x, 1))
      .groupBy("_1")
      .count()
      .show()
  }
}
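To check the grouping logic without starting Spark, the same split, pair, group and count steps can be sketched with plain Scala collections. This is only an illustration: `WordCountSketch`, `countWords` and the sample lines are made-up names and data, not part of the project, and the real contents of stu.txt are not reproduced here.

```scala
// Plain-Scala sketch of the split -> pair -> group -> count pipeline.
// WordCountSketch and countWords are illustrative names, not part of the project.
object WordCountSketch {

  // Split every line on single spaces, pair each word with 1,
  // then group the pairs by word and sum the ones in each group.
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(line => line.split(" "))
      .map(word => (word, 1))
      .groupBy(_._1)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit = {
    // Sample lines invented for illustration only.
    val sample = Seq("hadoop hadoop java", "spark java hive")
    countWords(sample).toSeq.sortBy(_._1).foreach { case (word, n) =>
      println(s"$word\t$n")
    }
  }
}
```

The Spark version does the same thing per word, except that `groupBy("_1").count()` counts rows in each group instead of summing the second tuple element; the two agree here because every pair carries a 1.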
Now let's package the program
First we change the code slightly so that the input path comes from the command line
package com.spark.test

import org.apache.spark.sql.SparkSession
import org.apache.spark.{SparkConf, SparkContext}

object Test {
  def main(args: Array[String]): Unit = {
    // Note: a master set in code takes precedence over the --master flag of spark-submit.
    val spark = SparkSession
      .builder
      .master("local")
      .appName("HdfsTest")
      .getOrCreate()

    // The input path is now passed as the first command-line argument.
    val filePart = args(0)
    // val filePart = "E://Mycode/datas/stu.txt"

    // Earlier RDD-based version of the same word count:
    // val rdd = spark.sparkContext.textFile(filePart)
    // val lines = rdd.flatMap(x => x.split(" ")).map(x => (x, 1)).reduceByKey((a, b) => (a + b)).collect().toList
    // println(lines)

    import spark.implicits._
    // Split each line into words, pair each word with 1, then count the rows per word.
    spark.read.textFile(filePart)
      .flatMap(x => x.split(" "))
      .map(x => (x, 1))
      .groupBy("_1")
      .count()
      .show()
  }
}
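Reading the path with `args(0)` throws an `ArrayIndexOutOfBoundsException` if the jar is submitted without an argument. One possible guard, sketched below in plain Scala, returns a usage message instead. `ArgsSketch`, `inputPath` and the usage text are hypothetical names chosen for this example; the original program does not contain them.

```scala
// Sketch of validating the first command-line argument before using it.
// ArgsSketch, inputPath and the usage text are hypothetical, for illustration only.
object ArgsSketch {

  // Right(path) when a non-blank first argument is present,
  // Left(usage message) otherwise.
  def inputPath(args: Array[String]): Either[String, String] =
    args.headOption.map(_.trim).filter(_.nonEmpty) match {
      case Some(path) => Right(path)
      case None       => Left("Usage: Test <input-path>")
    }

  def main(args: Array[String]): Unit =
    inputPath(args) match {
      case Right(path) => println(s"Reading from $path")
      case Left(usage) =>
        System.err.println(usage)
        sys.exit(1)
    }
}
```

In the `Test` object this check would run before the `SparkSession` is built, so a missing argument fails fast without starting Spark.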
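Remove all of these
so that only these two remain
Packaging is complete
Upload the jar to our cluster
This is our data file
Next we upload the data file to HDFS. Start HDFS first
Remember to start ZooKeeper as well, otherwise things will go wrong
First create a directory on HDFS
Then upload the local file
Now let's run the job on the cluster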
bin/spark-submit --master local[] /opt/jars/sparkStu.jar hdfs://bigdata-pro01.kfk.com:9000/user/datas/stu.txt
You can see the result of the run