A warning about Spark Streaming 2.0

Time: 2022-07-31 20:51:59

When I try to use the latest Spark Streaming with checkpointing:

from pyspark import SparkConf, SparkContext
from pyspark.streaming import StreamingContext

cfg = SparkConf().setAppName('MyApp').setMaster('local[3]')
sc = SparkContext(conf=cfg)
ssc = StreamingContext(sparkContext=sc, batchDuration=1)  # 1-second batches
ssc.checkpoint('checkpoint')  # checkpoint directory on the default filesystem

Then I repeatedly got this WARN:

-------------------------------------------
Time: 2016-10-11 10:08:02
-------------------------------------------
('world', 1)
('hello', 1)

16/10/11 10:08:06 WARN DFSClient: Caught exception 
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1281)
    at java.lang.Thread.join(Thread.java:1355)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:609)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:370)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:546)
-------------------------------------------
Time: 2016-10-11 10:08:03
-------------------------------------------
('world', 1)
('hello', 1)

What is that? It looks like a WARN from HDFS.

Is this important information?

I'm sure this WARN does not appear with Spark 2.0.0.
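
For reference, here is a minimal runnable sketch of the kind of job that produces the batch output above. The socket source on localhost:9999 and the word-count pipeline are assumptions; the original post only shows the context setup and the printed batches:

from pyspark import SparkConf, SparkContext
from pyspark.streaming import StreamingContext

cfg = SparkConf().setAppName('MyApp').setMaster('local[3]')
sc = SparkContext(conf=cfg)
ssc = StreamingContext(sparkContext=sc, batchDuration=1)
ssc.checkpoint('checkpoint')

# Assumed input: text lines from a local socket (e.g. fed by `nc -lk 9999`)
lines = ssc.socketTextStream('localhost', 9999)

# Count each word per 1-second batch; pprint() produces the
# "Time: ... / ('hello', 1)" blocks shown above
counts = (lines.flatMap(lambda line: line.split())
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))
counts.pprint()

ssc.start()
ssc.awaitTermination()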

1 Solution

#1

For completeness, I moved my comment to an answer.

I think the problem is that hadoop-hdfs.jar was upgraded from v2.7.2 to v2.7.3: Spark 2.0.0 uses Hadoop 2.7.2, whereas Spark 2.0.1 uses 2.7.3.
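
As a quick way to confirm which Hadoop client a given Spark build actually loaded, you can ask the JVM from PySpark. This is a diagnostic sketch: sc._jvm is an internal Py4J handle, but org.apache.hadoop.util.VersionInfo is Hadoop's standard version API:

from pyspark import SparkContext

sc = SparkContext.getOrCreate()
# Prints the version of the Hadoop jars on the classpath, e.g. '2.7.2' or '2.7.3'
print(sc._jvm.org.apache.hadoop.util.VersionInfo.getVersion())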
