Block corruption and DataNode loss caused by insufficient disk space, insufficient memory, power failures, removed nodes, and similar problems can push HDFS into safe mode automatically, and the job then fails with exceptions like the following:
2018-12-07 10:10:52 WARN Client:87 - Failed to cleanup staging dir hdfs://master:9000/user/root/.sparkStaging/application_1544148219961_0001
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot delete /user/root/.sparkStaging/application_1544148219961_0001. Name node is in safe mode.
The reported blocks 38 needs additional 26 blocks to reach the threshold 0.9990 of total blocks 64.
The number of live datanodes 4 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.
	...
Exception in thread "main" org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot create directory /user/root/.sparkStaging/application_1544148219961_0001. Name node is in safe mode.
The reported blocks 38 needs additional 26 blocks to reach the threshold 0.9990 of total blocks 64.
The number of live datanodes 4 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.
	...
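
The two numbers in the message come from the NameNode's safe-mode settings: the 0.9990 threshold corresponds to dfs.namenode.safemode.threshold-pct, and the "minimum number 0" of live DataNodes to dfs.namenode.safemode.min.datanodes; the NameNode will not leave safe mode on its own until the ratio of reported blocks reaches that threshold. Below is a minimal diagnostic sketch, assuming the hdfs client is available on a cluster node; the commands are standard HDFS CLI, but the exact output will differ per cluster:

# Confirm the NameNode is still in safe mode
hdfs dfsadmin -safemode get

# Show the thresholds the NameNode is waiting on (these print config values, not live state)
hdfs getconf -confKey dfs.namenode.safemode.threshold-pct
hdfs getconf -confKey dfs.namenode.safemode.min.datanodes

# List the files whose blocks are missing or corrupt
hdfs fsck / -list-corruptfileblocks
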
Solution:
1. Leave safe mode manually: hadoop dfsadmin -safemode leave
2. Run a filesystem health check and delete the corrupted blocks: hdfs fsck / -delete
Note: this approach loses data, because the corrupted blocks are simply removed.
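
Putting the two steps together, a possible recovery sequence looks like the sketch below (run on a node with the HDFS client; hdfs dfsadmin is the current spelling of the deprecated hadoop dfsadmin):

# 1. Force the NameNode out of safe mode so the filesystem is writable again
hdfs dfsadmin -safemode leave

# 2. Delete the files with corrupted or missing blocks (this is where the data loss happens)
hdfs fsck / -delete

# 3. Re-run the health check; the filesystem should now be reported as HEALTHY
hdfs fsck /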