filebeat+kafka+SparkStreaming程序报错及解决办法

时间:2023-03-09 18:47:07
filebeat+kafka+SparkStreaming程序报错及解决办法
// :: WARN RandomBlockReplicationPolicy: Expecting  replicas with only  peer/s.
// :: WARN BlockManager: Block input-- replicated to only peer(s) instead of peers
// :: ERROR Executor: Exception in task 0.0 in stage 113711.0 (TID )
java.lang.AssertionError: assertion failed
at scala.Predef$.assert(Predef.scala:)
at org.apache.spark.storage.BlockInfo.checkInvariants(BlockInfoManager.scala:)
at org.apache.spark.storage.BlockInfo.readerCount_$eq(BlockInfoManager.scala:)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$$$anonfun$apply$.apply(BlockInfoManager.scala:)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$$$anonfun$apply$.apply(BlockInfoManager.scala:)
at scala.Option.foreach(Option.scala:)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$.apply(BlockInfoManager.scala:)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$.apply(BlockInfoManager.scala:)
at scala.collection.Iterator$class.foreach(Iterator.scala:)
at scala.collection.AbstractIterator.foreach(Iterator.scala:)
at org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:)
at org.apache.spark.storage.BlockManager.releaseAllLocksForTask(BlockManager.scala:)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:)
at java.lang.Thread.run(Thread.java:)
// :: WARN TaskSetManager: Lost task 0.0 in stage 113711.0 (TID , localhost, executor driver): java.lang.AssertionError: assertion failed
at scala.Predef$.assert(Predef.scala:)
at org.apache.spark.storage.BlockInfo.checkInvariants(BlockInfoManager.scala:)
at org.apache.spark.storage.BlockInfo.readerCount_$eq(BlockInfoManager.scala:)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$$$anonfun$apply$.apply(BlockInfoManager.scala:)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$$$anonfun$apply$.apply(BlockInfoManager.scala:)
at scala.Option.foreach(Option.scala:)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$.apply(BlockInfoManager.scala:)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$.apply(BlockInfoManager.scala:)
at scala.collection.Iterator$class.foreach(Iterator.scala:)
at scala.collection.AbstractIterator.foreach(Iterator.scala:)
at org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:)
at org.apache.spark.storage.BlockManager.releaseAllLocksForTask(BlockManager.scala:)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:)
at java.lang.Thread.run(Thread.java:) // :: ERROR TaskSetManager: Task in stage 113711.0 failed times; aborting job
// :: ERROR JobScheduler: Error running job streaming job ms.
org.apache.spark.SparkException: An exception was raised by Python:
Traceback (most recent call last):
File "/home/admin/agent/spark/python/lib/pyspark.zip/pyspark/streaming/util.py", line , in call
r = self.func(t, *rdds)
File "/home/admin/agent/spark/python/lib/pyspark.zip/pyspark/streaming/dstream.py", line , in takeAndPrint
taken = rdd.take(num + )
File "/home/admin/agent/spark/python/lib/pyspark.zip/pyspark/rdd.py", line , in take
res = self.context.runJob(self, takeUpToNumLeft, p)
File "/home/admin/agent/spark/python/lib/pyspark.zip/pyspark/context.py", line , in runJob
port = self._jvm.PythonRDD.runJob(self._jsc.sc(), mappedRDD._jrdd, partitions)
File "/home/admin/agent/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line , in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/home/admin/agent/spark/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line , in get_return_value
format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task in stage 113711.0 failed times, most recent failure: Lost task 0.0 in stage 113711.0 (TID , localhost, executor driver): java.lang.AssertionError: assertion failed
at scala.Predef$.assert(Predef.scala:)
at org.apache.spark.storage.BlockInfo.checkInvariants(BlockInfoManager.scala:)
at org.apache.spark.storage.BlockInfo.readerCount_$eq(BlockInfoManager.scala:)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$$$anonfun$apply$.apply(BlockInfoManager.scala:)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$$$anonfun$apply$.apply(BlockInfoManager.scala:)
at scala.Option.foreach(Option.scala:)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$.apply(BlockInfoManager.scala:)
at org.apache.spark.storage.BlockInfoManager$$anonfun$releaseAllLocksForTask$.apply(BlockInfoManager.scala:)
at scala.collection.Iterator$class.foreach(Iterator.scala:)
at scala.collection.AbstractIterator.foreach(Iterator.scala:)
at org.apache.spark.storage.BlockInfoManager.releaseAllLocksForTask(BlockInfoManager.scala:)
at org.apache.spark.storage.BlockManager.releaseAllLocksForTask(BlockManager.scala:)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:)
at java.lang.Thread.run(Thread.java:)
  
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$.apply(DAGScheduler.scala:)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$.apply(DAGScheduler.scala:)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$.apply(DAGScheduler.scala:)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$.apply(DAGScheduler.scala:)
at scala.Option.foreach(Option.scala:)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:)
at org.apache.spark.util.EventLoop$$anon$.run(EventLoop.scala:)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:)
at org.apache.spark.api.python.PythonRDD$.runJob(PythonRDD.scala:)
at org.apache.spark.api.python.PythonRDD.runJob(PythonRDD.scala)
at sun.reflect.GeneratedMethodAccessor55.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:)
at java.lang.reflect.Method.invoke(Method.java:)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:)
at py4j.Gateway.invoke(Gateway.java:)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:)
at py4j.commands.CallCommand.execute(CallCommand.java:)
at py4j.GatewayConnection.run(GatewayConnection.java:)
at java.lang.Thread.run(Thread.java:)

排查原因1:

  1. 【不是】由于代码中checkpoint目录为本地导致,搭建了hdfs,将checkpoint移到hdfs,发现还是运行一天左右就挂掉,报错如上。

  2. 待续

请大虾们指点。