EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket

Date: 2024-11-12 12:15:37

Today I ran the following commands in HBase:

disable 'iw:test06'
alter 'iw:test06',NAME=>'i',COMPRESSION=>'SNAPPY'
count 'iw:test06'

The following exception then appeared in the ZooKeeper log:

2018-01-15 10:49:20,660 [myid:2] - INFO  [:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1008] - Closed socket connection for client /192.168.5.172:29409 (no session established for client)
2018-01-15 10:49:22,559 [myid:2] - INFO  [:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /192.168.5.172:29444
2018-01-15 10:49:22,560 [myid:2] - WARN  [:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
	at (:230)
	at (:203)
	at (:745)

Running netstat -antp | grep 2181 shows the connections on port 2181.
See "Hadoop common ports and how to check them" for inspecting port usage.
The post "Assorted problems from using Hadoop and HBase" describes a case where all of HBase's regionservers went offline; the cause turned out to be HDFS, not ZooKeeper.
The post "Unable to read additional data from client sessionid 0x0, likely client has closed socket" suggests a configuration fix, but ZooKeeper's minSessionTimeout and maxSessionTimeout already default to 2x and 20x of tickTime, so the problem does not lie there.
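For reference, ZooKeeper derives those session-timeout bounds directly from tickTime, so with the stock tickTime of 2000 ms the negotiated session timeout is clamped between 4 s and 40 s:

```shell
# ZooKeeper defaults: minSessionTimeout = 2 * tickTime, maxSessionTimeout = 20 * tickTime
TICK=2000   # ms; the stock tickTime value
echo "minSessionTimeout=$((TICK * 2))ms"
echo "maxSessionTimeout=$((TICK * 20))ms"
```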
On http://192.168.5.174:16010/master-status I could see that, after the compression change, regions that used to be listed under Online Regions were now showing as Failed Regions.
Digging further into the HBase logs:

2018-01-16 10:05:55,864 INFO  [RS_OPEN_REGION-dashuju172:16020-1] : Opening of region {ENCODED => adf4db74cf8fee5f59a72536bbe5c866, NAME => 'iw:test09,,1516065176992.adf4db74cf8fee5f59a72536bbe5c866.', STARTKEY => '', ENDKEY => '18243bd709d84eaa87a52224c2a5f584_420321800520311_20170417'} failed, transitioning from OPENING to FAILED_OPEN in ZK, expecting version 22
2018-01-16 10:05:55,969 INFO  [=12,queue=0,port=16020] : Open iw:test09,18243bd709d84eaa87a52224c2a5f584_420321800520311_20170417,1516065176992.63fbdfec70bdd534cf309fd3194685d0.
2018-01-16 10:05:55,985 INFO  [=10,queue=0,port=16020] : Open iw:test09,,1516065176992.adf4db74cf8fee5f59a72536bbe5c866.
2018-01-16 10:05:55,993 ERROR [RS_OPEN_REGION-dashuju172:16020-0] : Failed open of region=iw:test09,18243bd709d84eaa87a52224c2a5f584_420321800520311_20170417,1516065176992.63fbdfec70bdd534cf309fd3194685d0., starting to roll back the global memstore size.
: Compression algorithm 'snappy' previously failed test.
	at (:91)
	at (:6300)
	at (:6251)
	at (:6218)
	at (:6189)
	at (:6145)
	at (:6096)
	at (:362)
	at (:129)
	at (:129)
	at (:1145)
	at $(:615)
	at (:745)
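The "Compression algorithm 'snappy' previously failed test" error means the RegionServer's own compression self-test failed on that host, which points at missing or broken snappy native libraries rather than at ZooKeeper. Two standard commands for checking this (the test file path below is a placeholder):

```shell
# Does Hadoop see a working native snappy library on this RegionServer host?
hadoop checknative -a

# Can HBase actually write and read a snappy-compressed file?
hbase org.apache.hadoop.hbase.util.CompressionTest file:///tmp/snappy-probe snappy
```

If checknative reports "snappy: false", installing the snappy native libraries on every RegionServer (and restarting them) is the usual fix before re-enabling SNAPPY on the column family.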

A web search turns up many possible causes for this error. I'm collecting them here, although none of them solved my case:
- the HBase Master node changed
- ZooKeeper data corruption
So I followed the steps below, per "shut down ZK and HBase":

cp /home/hadoop/platform/zookeeper/data/myid /home/hadoop/bak/

 stop

rm -Rf /home/hadoop/platform/zookeeper/data/*
cp /home/hadoop/bak/myid /home/hadoop/platform/zookeeper/data/

 start
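The steps above can be sketched as a small script. DATA and BAK default to the paths used above but can be overridden, which also makes the sketch easy to dry-run; the actual stop/start commands are left as comments since they were truncated in my notes:

```shell
DATA="${DATA:-/home/hadoop/platform/zookeeper/data}"
BAK="${BAK:-/home/hadoop/bak}"

cp "$DATA/myid" "$BAK/"     # back up the server id first
# stop ZooKeeper on this node here
rm -rf "${DATA:?}"/*        # ${DATA:?} aborts if the variable is empty
cp "$BAK/myid" "$DATA/"     # restore the id before restarting
# start ZooKeeper again here
```

Backing up myid before wiping the data directory matters: without it, the server cannot rejoin the ensemble under its old id.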

The ZooKeeper exception was gone.
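To confirm each ZooKeeper server is healthy after the restart, the four-letter-word commands are handy (the host below is one of my ensemble nodes; substitute your own):

```shell
# "ruok" should answer "imok"; "stat" shows the server mode (leader/follower)
# and the currently connected clients.
echo ruok | nc 192.168.5.172 2181
echo stat | nc 192.168.5.172 2181
```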
Run the following command to format and inspect ZooKeeper's transaction log:

java -classpath .:/home/hadoop/application/zookeeper/lib/slf4j-api-1.6.:/home/hadoop/application/zookeeper/zookeeper-3.4.  /home/hadoop/platform/zookeeper/log/version-2/log.2f00000001
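The invocation above got truncated in my notes; the tool in question is ZooKeeper 3.4's transaction-log formatter, org.apache.zookeeper.server.LogFormatter. A complete version looks like this (the jar file names and versions are examples and must match what is actually installed):

```shell
ZK_HOME=/home/hadoop/application/zookeeper
java -cp ".:$ZK_HOME/lib/slf4j-api-1.6.1.jar:$ZK_HOME/zookeeper-3.4.6.jar" \
  org.apache.zookeeper.server.LogFormatter \
  /home/hadoop/platform/zookeeper/log/version-2/log.2f00000001
```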