Spark Streaming之旅

时间:2023-03-09 19:28:07
Spark Streaming之旅
1. 打开spark-shell

2. 建立StreamingContext
import org.apache.spark.streaming._
import org.apache.spark.streaming.StreamingContext._
import org.apache.spark.api.java.function._
import org.apache.spark.streaming._
import org.apache.spark.streaming.api._
// Create a StreamingContext with a local master
val ssc = new StreamingContext(sc, Seconds())

3.用StreamingContext建立一个DStream

val lines = ssc.socketTextStream("localhost", )

4.运算

// Split each line into words
val words = lines.flatMap(_.split(" ")) import org.apache.spark.streaming.StreamingContext._
// Count each word in each batch
val pairs = words.map(word => (word, ))
val wordCounts = pairs.reduceByKey(_ + _) // Print a few of the counts to the console
wordCounts.print()

5.启动DStream

ssc.start()             // Start the computation
ssc.awaitTermination() // Wait for the computation to terminate

6.启动一个新控制台,并执行如下命令

$ nc -lk 

7.在新控制台上输入任意语句,并在spark-shell窗口观察输出

hello world how are you hello are you there

you are the son of a bitch