Problem 1: When starting YARN on Hadoop-2.2.0, the resourcemanager process never comes up.
Check the log file with tail -n 50
The following exception appears:
2016-09-09 14:41:09,341 INFO : Service ResourceManager failed in state STARTED; cause: : Error starting http server
: Error starting http server
at $(:262)
at (:623)
at (:655)
at (:193)
at (:872)
Caused by: : Port in use: 192.168.1.120:8088
at (:742)
at (:686)
at $(:257)
... 4 more
Caused by: : Address already in use
at .bind0(Native Method)
at (:444)
at (:436)
at (:214)
at (:74)
at (:216)
at (:738)
... 6 more
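The innermost cause in the trace is "Address already in use": something is already bound to the address reported in the "Port in use" line. As a minimal sketch, the helper below pulls that address out of log text. The function name port_in_use_from_log is made up for illustration and is not part of Hadoop.

```shell
#!/usr/bin/env bash
# port_in_use_from_log (hypothetical helper): read log text on stdin and print
# the first "host:port" that follows "Port in use:".
port_in_use_from_log() {
  sed -n 's/.*Port in use: \([^[:space:]]*\).*/\1/p' | head -n 1
}

# Example with the line from the trace:
#   echo 'Caused by: : Port in use: 192.168.1.120:8088' | port_in_use_from_log
#
# To see which process is listening on that port (assumes lsof is installed;
# it may need root to show processes owned by other users):
#   lsof -nP -iTCP:8088 -sTCP:LISTEN
```

If the listener turns out to be an old resourcemanager, the port will be free again once that process is stopped.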
Solution:
1. Run ps aux | grep -i resourcemanager to see how many resourcemanager processes are running on the master host.
2. Then use kill -9 <RESOURCE_MANAGER_PID> to kill those processes.
3. Restart YARN from the sbin directory:
./ ./
The resourcemanager process should then appear on the master node.
Problem 2: Sometimes hregionserver dies again shortly after starting. Check the HBase startup log:
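The first two steps above can be sketched as a pair of shell functions. This is only an illustration: list_rm_pids and stop_pids are hypothetical names, and the 5-second wait is an arbitrary choice. The sketch sends a plain kill (SIGTERM) first and falls back to kill -9 only for processes that are still alive.

```shell
#!/usr/bin/env bash
# list_rm_pids (hypothetical helper): read `ps aux` output on stdin and print
# the PID (column 2) of every line mentioning "resourcemanager", skipping the
# grep process itself.
list_rm_pids() {
  grep -i 'resourcemanager' | grep -v 'grep' | awk '{print $2}'
}

# stop_pids (hypothetical helper): ask each PID to exit with SIGTERM, wait a
# few seconds, then send SIGKILL (kill -9) to any that are still running.
stop_pids() {
  [ "$#" -gt 0 ] || return 0
  local pid
  for pid in "$@"; do
    kill "$pid" 2>/dev/null || true
  done
  sleep 5   # arbitrary grace period
  for pid in "$@"; do
    if kill -0 "$pid" 2>/dev/null; then
      kill -9 "$pid"
    fi
  done
}

# Usage on the master host:
#   stop_pids $(ps aux | list_rm_pids)
```

After that, restarting YARN from the sbin directory proceeds as in step 3.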
dell@master1:/usr/local/hbase-0.98.7-hadoop2/logs$ tail -n 100
at (:1286)
at (:862)
at (:745)
2017-01-12 10:02:23,347 FATAL [regionserver60020] : ABORTING region server master1,60020,1484186540447: Unhandled: Cannot create directory /hbase/WALs/master1,60020,1484186540447. Name node is in safe mode.
Resources are low on NN. Please add or free up more resources then turn off safe mode manually. NOTE: If you turn off safe mode before adding resources, the NN will immediately return to safe mode. Use "hdfs dfsadmin -safemode leave" to turn safe mode off.
at (:3355)
at (:3330)
at (:724)
at (:502)
at $ClientNamenodeProtocol$(:59598)
at $Server$(:585)
at $(:928)
at $Handler$(:2048)
at $Handler$(:2044)
at (Native Method)
at (:415)
at (:1491)
at $(:2042)
(): Cannot create directory /hbase/WALs/master1,60020,1484186540447. Name node is in safe mode.
Resources are low on NN. Please add or free up more resources then turn off safe mode manually. NOTE: If you turn off safe mode before adding resources, the NN will immediately return to safe mode. Use "hdfs dfsadmin -safemode leave" to turn safe mode off.
at (:3355)
at (:3330)
at (:724)
at (:502)
at $ClientNamenodeProtocol$(:59598)
at $Server$(:585)
at $(:928)
at $Handler$(:2048)
at $Handler$(:2044)
at (Native Method)
at (:415)
at (:1491)
at $(:2042)
at (:1347)
at (:1300)
at $(:206)
at .$(Unknown Source)
at .invoke0(Native Method)
at (:57)
at (:43)
at (:606)
at (:186)
at (:102)
at .$(Unknown Source)
at (:467)
at .invoke0(Native Method)
at (:57)
at (:43)
at (:606)
at $(:294)
at .$(Unknown Source)
at (:2394)
at (:2365)
at $(:817)
at $(:813)
at (:81)
at (:813)
at (:806)
at (:1933)
at .<init>(:408)
at .<init>(:334)
at (:58)
at (:1552)
at (:1531)
at (:1286)
at (:862)
at (:745)
2017-01-12 10:02:23,350 FATAL [regionserver60020] : RegionServer abort: loaded coprocessors are: []
2017-01-12 10:02:23,367 INFO [regionserver60020] : Stopping server on 60020
2017-01-12 10:02:23,368 INFO [regionserver60020] : Stopping infoServer
2017-01-12 10:02:23,373 INFO [regionserver60020] : Stopped SelectChannelConnector@0.0.0.0:60030
2017-01-12 10:02:23,475 INFO [regionserver60020] : Stopping RegionServerSnapshotManager abruptly.
2017-01-12 10:02:23,475 INFO [regionserver60020] : aborting server master1,60020,1484186540447
2017-01-12 10:02:23,475 DEBUG [regionserver60020] : Stopping catalog tracker @58465d50
2017-01-12 10:02:23,475 INFO [regionserver60020] $HConnectionImplementation: Closing zookeeper sessionid=0x358d3e5582442fb
2017-01-12 10:02:23,485 INFO [regionserver60020] : Session: 0x358d3e5582442fb closed
2017-01-12 10:02:23,485 INFO [regionserver60020-EventThread] : EventThread shut down
2017-01-12 10:02:23,488 INFO [regionserver60020] : stopping server master1,60020,1484186540447; all regions closed.
2017-01-12 10:02:23,588 INFO [regionserver60020] : regionserver60020 closing leases
2017-01-12 10:02:23,588 INFO [regionserver60020] : regionserver60020 closed leases
2017-01-12 10:02:23,589 INFO [regionserver60020] : Waiting for Split Thread to finish...
2017-01-12 10:02:23,589 INFO [regionserver60020] : Waiting for Merge Thread to finish...
2017-01-12 10:02:23,589 INFO [regionserver60020] : Waiting for Large Compaction Thread to finish...
2017-01-12 10:02:23,589 INFO [regionserver60020] : Waiting for Small Compaction Thread to finish...
2017-01-12 10:02:23,636 INFO [regionserver60020] : Session: 0x558d3e6026242f9 closed
2017-01-12 10:02:23,636 INFO [regionserver60020-EventThread] : EventThread shut down
2017-01-12 10:02:23,636 INFO [regionserver60020] : stopping server master1,60020,1484186540447; zookeeper connection closed.
2017-01-12 10:02:23,636 INFO [regionserver60020] : regionserver60020 exiting
2017-01-12 10:02:23,636 ERROR [main] : Region server exiting
: HRegionServer Aborted
at (:66)
at (:85)
at (:70)
at (:126)
at (:2489)
2017-01-12 10:02:23,639 INFO [Thread-10] : Shutdown hook starting; =true; fsShutdownHook=$Cache$ClientFinalizer@68ee3eb2
2017-01-12 10:02:23,640 INFO [Thread-10] : Starting fs shutdown hook thread.
2017-01-12 10:02:23,641 INFO [Thread-10] : Shutdown hook finished.
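With a log this long, it helps to filter for the FATAL entries and to check for the safe mode message directly. A minimal sketch follows; the function names are made up for illustration, and the log file path is left to the reader because it is not shown in the command above.

```shell
#!/usr/bin/env bash
# fatal_lines (hypothetical helper): print only the FATAL entries from log
# text on stdin.
fatal_lines() {
  grep ' FATAL '
}

# aborted_by_safe_mode (hypothetical helper): exit 0 if the log text on stdin
# contains the NameNode safe mode message seen in the trace.
aborted_by_safe_mode() {
  grep -q 'Name node is in safe mode'
}

# Usage, with LOG set to the regionserver log file:
#   tail -n 100 "$LOG" | fatal_lines
#   tail -n 100 "$LOG" | aborted_by_safe_mode && echo "NameNode is in safe mode"
```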
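Solution:
1. Run hdfs dfsadmin -safemode leave to take the NameNode out of safe mode. As the log message warns, free up resources on the NameNode first, otherwise it will immediately return to safe mode.
2. Then start all of the regionservers in the cluster, or start a single regionserver: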
./ start regionserver
3. Check the HBase web UI
http://192.168.1.120:60010/master-status
to see how many Region Servers are alive.
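Before forcing the NameNode out of safe mode as in step 1, it is worth confirming that it really is in safe mode and looking at free disk space, since the log says resources are low. The sketch below assumes that hdfs dfsadmin -safemode get prints a line containing "Safe mode is ON" or "Safe mode is OFF". The function names are hypothetical.

```shell
#!/usr/bin/env bash
# safemode_state (hypothetical helper): read the text printed by
# `hdfs dfsadmin -safemode get` on stdin and print ON, OFF, or UNKNOWN.
safemode_state() {
  local text
  text="$(cat)"
  case "$text" in
    *"Safe mode is ON"*)  echo ON ;;
    *"Safe mode is OFF"*) echo OFF ;;
    *)                    echo UNKNOWN ;;
  esac
}

# leave_safemode_if_on (hypothetical wrapper): show disk usage, then leave
# safe mode only if the NameNode reports that it is currently ON.
leave_safemode_if_on() {
  if [ "$(hdfs dfsadmin -safemode get | safemode_state)" = "ON" ]; then
    df -h                          # inspect free space before leaving safe mode
    hdfs dfsadmin -safemode leave
  fi
}
```

If df -h shows that the disk holding the NameNode's metadata is nearly full, free space there first. Otherwise, as the log warns, the NameNode will drop straight back into safe mode.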
Reference: /questions/26704763/yarn-resourcetrackerservice-failed-in-state-started