Printing only the key output from a Spark program in local mode

Date: 2022-02-16 18:10:59

When a job is submitted in local mode with spark-submit, the console fills up with INFO-level log messages:

-------------------------------------------
Time: ms
-------------------------------------------
// :: INFO scheduler.JobScheduler: Finished job streaming job ms. from job set of time ms
// :: INFO scheduler.JobScheduler: Total delay: 0.054 s for time ms (execution: 0.046 s)
// :: INFO rdd.MapPartitionsRDD: Removing RDD from persistence list
// :: INFO storage.BlockManager: Removing RDD
// :: INFO rdd.MapPartitionsRDD: Removing RDD from persistence list
// :: INFO storage.BlockManager: Removing RDD
// :: INFO rdd.BlockRDD: Removing RDD from persistence list
// :: INFO dstream.SocketInputDStream: Removing blocks of RDD BlockRDD[] at socketTextStream at CoGroupTest.scala: of time ms
// :: INFO storage.BlockManager: Removing RDD
// :: INFO rdd.MapPartitionsRDD: Removing RDD from persistence list
// :: INFO storage.BlockManager: Removing RDD
// :: INFO rdd.MapPartitionsRDD: Removing RDD from persistence list
// :: INFO storage.BlockManager: Removing RDD
// :: INFO scheduler.ReceivedBlockTracker: Deleting batches: ms
// :: INFO scheduler.InputInfoTracker: remove old batch metadata: ms
// :: INFO scheduler.JobScheduler: Added jobs for time ms
// :: INFO scheduler.JobScheduler: Starting job streaming job ms. from job set of time ms

Changing the log4j log level makes Spark print only the key information:

1. Edit $SPARK_HOME/conf/log4j.properties

If the directory contains only log4j.properties.template, copy it to log4j.properties first (run from $SPARK_HOME/conf):

cp log4j.properties.template log4j.properties

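2. Change the first configuration line, which sets the root logger level, from

log4j.rootCategory=INFO, console

to

log4j.rootCategory=ERROR, console

With ERROR, warnings are suppressed as well; use WARN instead if you still want to see them.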
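
Steps 1 and 2 can also be scripted. The following is a minimal sketch, not part of Spark: set_spark_root_log_level is a hypothetical helper name, and it assumes the conf directory uses the log4j 1.x properties format shown above, with a single log4j.rootCategory line. It leaves a log4j.properties.bak backup next to the edited file.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Spark): set the root log level
# in the log4j.properties file of a given Spark conf directory.
#   $1 = conf directory, e.g. "$SPARK_HOME/conf"
#   $2 = log level (defaults to ERROR)
set_spark_root_log_level() {
  local conf_dir="$1"
  local level="${2:-ERROR}"
  local props="$conf_dir/log4j.properties"

  # Step 1: create log4j.properties from the template if it is missing.
  if [ ! -f "$props" ]; then
    cp "$conf_dir/log4j.properties.template" "$props" || return 1
  fi

  # Step 2: replace only the level on the rootCategory line;
  # the appender list after the comma (", console") is left untouched.
  sed -i.bak -E 's/^(log4j\.rootCategory=)[A-Z]+/\1'"$level"'/' "$props"
}
```

After sourcing the script, a call such as set_spark_root_log_level "$SPARK_HOME/conf" ERROR applies the change. Passing WARN instead keeps warnings visible.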

3. Submit the job again. Only the key output is printed:

-------------------------------------------
Time: ms
-------------------------------------------
(helloworld,helloworld_one)
(hello,hello_one)
(join,join_one)