整个思路：spark streaming 接受Kafka数据（KafkaUtils.createDirectStream）然后累计值（updateStateByKey）把值发给Kafka。

整个过程出现两个问题，第一个问题是启动脚本的问题，第二个问题是添加性能参数的问题，第三个问题是认证过期问题。

问题一：Exception in thread "dag-scheduler-event-loop" java.lang.OutOfMemoryError: Java heap space

Spark Streaming 对接Kafka实现实时统计的问题定位和解决

问题描述：

最后观察Spark 界面，发现spark.executor.memory才1G左右。而确实配置过参数spark-submit --master yarn-cluster --class 类名 --jar包路径参数列表 --driver-memory 4G --executor-memory 8G --executor-cores 2 --num-executors 1 。确实设置了

问题原因:

spark-submit --master yarn-cluster --class 类名 --jar包路径参数列表 --driver-memory 4G --executor-memory 8G --executor-cores 2 --num-executors 1 与Java 顺序有关，所以 --jar包路径直到最后，会当做是不是当做java的参数了（我也没有太明白）。

问题解决:

spark-submit --master yarn-client --driver-memory 4G --executor-memory 8G --executor-cores 2 --num-executors 1 --class 类名 --jar包路径参数列表（尴尬了！）

可是解决那个问题之后，申请的资源也大了，但是又报出另外一个问题。

问题二：is running beyond physical memory limits. Current usage: 4.5 GB of 4.5 GB physical memory used; 6.4 GB of 22.5 GB virtual memory used. Killing container.

问题描述：

Spark 存储在hdfs上面的log，不再报oom和其他错误。

问题原因:

日志中显示“Killing container”，直接原因是物理内存使用超过了限定值，YARN的NodeManager监控到内存使用超过阈值，强制终止该container进程。

问题解决:

spark.yarn.driver.memoryOverhead：设置堆外内存大小（cluster模式使用）。

碰到两个问题其实都是与资源有关：最后Spark命令就修改为

spark-submit --master yarn-cluster --conf spark.yarn.driver.memoryOverhead=1024 --driver-memory 4G --executor-memory 8G --executor-cores 2 --num-executors 1 --jar 。。。。--class 。。。。

问题三：User class threw exception: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 139891 for **) is expired

问题描述：

hdfs用户过期了

问题解决:

安全模式下，HDFS中用户可以对token的最大存活时间和token renew的时间间隔进行灵活地设置，根据集群的具体需求合理地配置。

需要更改集群hdfs上的配置文件

dfs.namenode.delegation.token.max-lifetime

万分感谢帮助定位和解决的同事们！！

秒客网

Spark Streaming 对接Kafka实现实时统计的问题定位和解决

问题一：Exception in thread "dag-scheduler-event-loop" java.lang.OutOfMemoryError: Java heap space

问题描述：

问题原因:

问题解决:

问题二：is running beyond physical memory limits. Current usage: 4.5 GB of 4.5 GB physical memory used; 6.4 GB of 22.5 GB virtual memory used. Killing container.

问题描述：

问题原因:

问题解决:

问题三：User class threw exception: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 139891 for **) is expired

问题描述：

问题解决:

相关文章