Today I ran the following commands in the HBase shell:
disable 'iw:test06'
alter 'iw:test06',NAME=>'i',COMPRESSION=>'SNAPPY'
count 'iw:test06'
The following exception then showed up in the ZooKeeper log:
2018-01-15 10:49:20,660 [myid:2] - INFO [:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1008] - Closed socket connection for client /192.168.5.172:29409 (no session established for client)
2018-01-15 10:49:22,559 [myid:2] - INFO [:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /192.168.5.172:29444
2018-01-15 10:49:22,560 [myid:2] - WARN [:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
at (:230)
at (:203)
at (:745)
I ran netstat -antp | grep 2181 to look at the connections on port 2181.
See the article "Hadoop 常用端口及查看方法" for how to check port usage.
The article "Hadoop及HBase使用过程中的一些问题集" describes a case where all of HBase's RegionServers went offline; the cause there turned out to be HDFS, not ZooKeeper.
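The article "Unable to read additional data from client sessionid 0x0, likely client has closed socket" explains that a setting needs to be configured. However, in ZooKeeper minSessionTimeout and maxSessionTimeout default to 2 times and 20 times tickTime, so the problem does not seem to lie there.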
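As a quick worked example of the 2x and 20x relationship, the sketch below computes the two bounds from tickTime. The value 2000 ms is an assumption (it is the value in ZooKeeper's sample zoo.cfg); substitute the tickTime from your own configuration.

```shell
#!/bin/sh
# tickTime in milliseconds. 2000 is an assumed value, not taken from this cluster.
tick_time=2000

# Defaults when minSessionTimeout / maxSessionTimeout are not set explicitly.
min_session_timeout=$((tick_time * 2))
max_session_timeout=$((tick_time * 20))

echo "minSessionTimeout=${min_session_timeout}ms"   # minSessionTimeout=4000ms
echo "maxSessionTimeout=${max_session_timeout}ms"   # maxSessionTimeout=40000ms
```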
On http://192.168.5.174:16010/master-status, the regions that were listed under Online Regions before compression was enabled now show up as Failed Regions.
Next I followed up in the HBase log:
2018-01-16 10:05:55,864 INFO [RS_OPEN_REGION-dashuju172:16020-1] : Opening of region {ENCODED => adf4db74cf8fee5f59a72536bbe5c866, NAME => 'iw:test09,,1516065176992.adf4db74cf8fee5f59a72536bbe5c866.', STARTKEY => '', ENDKEY => '18243bd709d84eaa87a52224c2a5f584_420321800520311_20170417'} failed, transitioning from OPENING to FAILED_OPEN in ZK, expecting version 22
2018-01-16 10:05:55,969 INFO [=12,queue=0,port=16020] : Open iw:test09,18243bd709d84eaa87a52224c2a5f584_420321800520311_20170417,1516065176992.63fbdfec70bdd534cf309fd3194685d0.
2018-01-16 10:05:55,985 INFO [=10,queue=0,port=16020] : Open iw:test09,,1516065176992.adf4db74cf8fee5f59a72536bbe5c866.
2018-01-16 10:05:55,993 ERROR [RS_OPEN_REGION-dashuju172:16020-0] : Failed open of region=iw:test09,18243bd709d84eaa87a52224c2a5f584_420321800520311_20170417,1516065176992.63fbdfec70bdd534cf309fd3194685d0., starting to roll back the global memstore size.
: Compression algorithm 'snappy' previously failed test.
at (:91)
at (:6300)
at (:6251)
at (:6218)
at (:6189)
at (:6145)
at (:6096)
at (:362)
at (:129)
at (:129)
at (:1145)
at $(:615)
at (:745)
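The error above says the snappy codec failed its test on this RegionServer, so one way to narrow things down is to check the codec directly on that host. The following is only a sketch: it assumes hadoop and hbase are on the PATH, and the scratch path /tmp/compression-test is a hypothetical location chosen for illustration. It uses Hadoop's checknative command and HBase's CompressionTest utility.

```shell
#!/bin/sh
# Codec named in the RegionServer error above.
codec=snappy
# Hypothetical scratch location; CompressionTest writes a small test file here.
test_file=file:///tmp/compression-test

if command -v hadoop >/dev/null 2>&1; then
    # Lists which native libraries (snappy among them) Hadoop can load on this host.
    hadoop checknative -a
else
    echo "hadoop not found on PATH; skipping native library check"
fi

if command -v hbase >/dev/null 2>&1; then
    # Writes and reads back a small file using the given codec.
    hbase org.apache.hadoop.hbase.util.CompressionTest "$test_file" "$codec"
else
    echo "hbase not found on PATH; skipping CompressionTest"
fi
```

If either check reports that snappy is unavailable, the native library setup on that RegionServer host would be the first thing to look at; this would need to be repeated on every RegionServer.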
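A web search shows that many things can cause this problem. I list them here, even though none of them resolved my case:
The HBase Master node changed
ZooKeeper or data corruption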
So I followed the steps below, based on "shut down ZM and HBase":
cp /home/hadoop/platform/zookeeper/data/myid /home/hadoop/bak/
stop
rm -Rf /home/hadoop/platform/zookeeper/data/*
cp /home/hadoop/bak/myid /home/hadoop/platform/zookeeper/data/
start
After that, the ZooKeeper exception was gone.
The following command prints the ZooKeeper log in a readable format:
java -classpath .:/home/hadoop/application/zookeeper/lib/slf4j-api-1.6.:/home/hadoop/application/zookeeper/zookeeper-3.4. /home/hadoop/platform/zookeeper/log/version-2/log.2f00000001