# 启动SparkContext
When starting a SparkContext, you may get: ValueError: Cannot run multiple SparkContexts at once; existing SparkContext(...)
The most basic way to create a SparkContext takes just two parameters:

- Cluster URL: tells Spark how to connect to a cluster; `local` runs Spark on a single machine with a single thread.
- Application name: here "My App"; when you connect to a cluster, this value identifies your application in the cluster manager's UI.
>>> from pyspark import SparkConf, SparkContext
>>> conf = SparkConf().setMaster("local").setAppName("My App")
>>> sc = SparkContext(conf = conf)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/xl/spark/spark-2.1.0-bin-hadoop2.7/python/pyspark/context.py", line 115, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "/home/xl/spark/spark-2.1.0-bin-hadoop2.7/python/pyspark/context.py", line 272, in _ensure_initialized
    callsite.function, callsite.file, callsite.linenum))
ValueError: Cannot run multiple SparkContexts at once; existing SparkContext(app=PySparkShell, master=local[*]) created by <module> at /home/xl/spark/spark-2.1.0-bin-hadoop2.7/python/pyspark/shell.py:43
This error appears because a SparkContext is already running: the PySpark shell creates one, bound to the name sc, when it starts. Stop the existing context first and then create a new one.
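Before stopping anything, you can confirm what the shell has already created; the quick check below assumes the standard PySpark shell where sc is predefined, and the expected values are the ones reported in the error message above.

>>> sc.master     # expected: local[*], as shown in the error message
>>> sc.appName    # expected: PySparkShell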
>>> sc.stop()  # stop the existing SparkContext
>>> sc = SparkContext(conf = conf)
>>> conf
<pyspark.conf.SparkConf object at 0x7fedfc83bf10>
>>> sc
<pyspark.context.SparkContext object at 0x7fedfc943210>
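Stopping and re-creating the context works, but it is easy to forget. As an alternative, SparkContext.getOrCreate(conf) returns the already-running context instead of raising the error, and only creates a new one if none exists. The session below is a minimal sketch of that approach, reusing the same conf as above.

>>> from pyspark import SparkConf, SparkContext
>>> conf = SparkConf().setMaster("local").setAppName("My App")
>>> sc = SparkContext.getOrCreate(conf)  # no ValueError even if the shell's context is still running

Note that when a context already exists, getOrCreate simply returns it and the conf you pass is ignored, so settings such as the app name only take effect if you stop the old context first.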