Monitoring HDFS block migration with a script, plus a detailed look at the metasave output

Posted: 2023-03-10 16:24:02

Put the following script into crontab and run it every 10 minutes; each run records a data point so you can track the progress of block replication.

For ways to make a node decommission (go offline) faster, see my post at https://www.cnblogs.com/jiangxiaoxian/p/9665588.html.

#!/bin/bash
source /etc/profile
# metasave writes its dump into the NameNode's log directory (hadoop.log.dir),
# so this script is expected to run on the active NameNode host.
hdfs dfsadmin -metasave 1.meta
# Append a timestamp plus the pending-replication counter to the record log.
date +%Y%m%d-%H:%M:%S >> /opt/hadoop/hadoop-2.6.0/logs/meta_recods.log
grep 'Metasave: Blocks waiting' /opt/hadoop/hadoop-2.6.0/logs/1.meta >> /opt/hadoop/hadoop-2.6.0/logs/meta_recods.log
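One way to schedule it, assuming the script above is saved as /opt/hadoop/monitor_metasave.sh (a hypothetical path) and the crontab belongs to a user with HDFS superuser rights (dfsadmin -metasave requires them):

# run the monitoring script every 10 minutes
*/10 * * * * /bin/bash /opt/hadoop/monitor_metasave.sh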

Contents of the meta_recods.log produced by the script:

20180925-14:00:18
Metasave: Blocks waiting for replication: 3814058
20180925-14:10:16
Metasave: Blocks waiting for replication: 3801943
20180925-14:20:16
Metasave: Blocks waiting for replication: 3789778
20180925-14:30:18
Metasave: Blocks waiting for replication: 3777659
20180925-14:40:22
Metasave: Blocks waiting for replication: 3765676
20180925-14:50:18
Metasave: Blocks waiting for replication: 3753631
20180925-15:00:17
Metasave: Blocks waiting for replication: 3741611
......
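The per-hour rate quoted below can be reproduced from this log. A minimal sketch, assuming the log keeps the alternating timestamp / counter format shown above, with samples 10 minutes apart:

#!/bin/bash
LOG=/opt/hadoop/hadoop-2.6.0/logs/meta_recods.log
first=$(grep 'waiting for replication' "$LOG" | head -1 | awk '{print $NF}')
last=$(grep 'waiting for replication' "$LOG" | tail -1 | awk '{print $NF}')
samples=$(grep -c 'waiting for replication' "$LOG")
# samples are 10 minutes apart, so elapsed hours = (samples - 1) / 6
echo "blocks replicated per hour: $(( (first - last) * 6 / (samples - 1) ))"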

Production settings are listed below (according to the recorded data points, about 70,000 blocks were copied per hour).

The node being decommissioned held about 4 million blocks. At roughly 70,000 blocks per hour that is about 1.7 million blocks per day in theory (although HDFS runs a large number of daily batch jobs at night and is fairly busy then), so the whole decommission should take about 2.5 days.

<!-- speed up decommission -->
<property>
    <name>dfs.namenode.replication.max-streams</name>
    <value>10</value>
</property>
<property>
    <name>dfs.namenode.replication.max-streams-hard-limit</name>
    <value>20</value>
</property>
<property>
    <name>dfs.namenode.replication.work.multiplier.per.iteration</name>
    <value>5</value>
</property>
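These properties live in hdfs-site.xml on the NameNode and take effect after a NameNode restart. To double-check what the configuration files actually contain, you can query individual keys on the NameNode host, for example:

hdfs getconf -confKey dfs.namenode.replication.max-streams
hdfs getconf -confKey dfs.namenode.replication.work.multiplier.per.iteration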

What the metasave output looks like

In the source code this output is produced by BlockManager's metaSave(PrintWriter out) method.
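To generate a dump like the one below by hand (the path assumes the same Hadoop 2.6.0 layout used in the script above; the file name is arbitrary):

hdfs dfsadmin -metasave blocks.meta
head -n 5 /opt/hadoop/hadoop-2.6.0/logs/blocks.meta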

19349608 files and directories, 14515745 blocks = 33865353 total  # ~33.9 million objects held in the NameNode's memory
Live Datanodes: 15  # 15 DataNodes in service
Dead Datanodes: 0  # number of dead DataNodes
Metasave: Blocks waiting for replication: 2380757  # ~2.38 million blocks waiting to be replicated
# l: == live (in service), d: == decommissioned, c: == corrupt, e: == excess
# Each of the following lines: file path --- block waiting for replication --- replica counts (l: 2 d: 1 c: 0 e: 0) --- DataNodes holding the block
/user/hive/warehouse/ods.db/xxx/order_datekey=20170213/000003_1: blk_1286491219_213142207 (replicas: l: 2 d: 1 c: 0 e: 0) xx.x.x.30:50010 : xx.x.x.36:50010 : xx.x.x.34:50010(decommissioned) :
/user/hive/warehouse/analytics.db/xxx/datekey=20180810/part=5/part-0001: blk_1529882603_456688042 (replicas: l: 2 d: 1 c: 0 e: 0) xx.x.x.11:50010 : xx.x.x.34:50010(decommissioned) : xx.x.x.41:50010 :
...
Mis-replicated blocks that have been postponed:  # during a NameNode failover, pending replication work is postponed because the new active NameNode does not know which replicas it can act on until the DataNodes send their next block reports; with no HA failover this section stays empty
Metasave: Blocks being replicated: 102  # 102 blocks currently being replicated
blk_1308167564_234830077 StartTime: 10:39:58 NumReplicaInProgress: 1
blk_1475963863_402729834 StartTime: 10:39:58 NumReplicaInProgress: 1
blk_1090092594_16396468 StartTime: 10:39:58 NumReplicaInProgress: 1
...
Metasave: Blocks 333 waiting deletion from 4 datanodes.  # blocks waiting to be deleted (invalidated), grouped by DataNode
xx.x.x.11:50010
LightWeightHashSet(size=81, modification=81, entries.length=128)
xx.x.x.21:50010
LightWeightHashSet(size=90, modification=90, entries.length=128)
xx.x.x.36:50010
LightWeightHashSet(size=75, modification=75, entries.length=128)
xx.x.x.41:50010
LightWeightHashSet(size=87, modification=87, entries.length=128)
Metasave: Number of datanodes: 15  # each line: DataNode ip:port --- rack --- state (IN = in service, DP = Decommission In Progress) --- storage info (capacity, used, used%, remaining) --- time of the metasave dump --- blocks to be replicated / invalidated on that node
xxx.xxx.xxx.xx:50010 default_rack IN 35719153532928(32.49 TB) 19426144912589(17.67 TB) 0.54% 14196798403289(12.91 TB) 0(0 B) 0(0 B) NaN% 0(0 B) Wed Sep 26 10:40:04 CST 2018 1 blocks to be replicated;
xxx.xxx.xxx.xx:50010 default_rack IN 54168542494720(49.27 TB) 18217462271284(16.57 TB) 0.34% 35222156404459(32.03 TB) 0(0 B) 0(0 B) NaN% 0(0 B) Wed Sep 26 10:40:04 CST 2018 6 blocks to be replicated;
xxx.xxx.xxx.xx:50010 default_rack IN 35719153532928(32.49 TB) 19584613210523(17.81 TB) 0.55% 13989217942458(12.72 TB) 0(0 B) 0(0 B) NaN% 0(0 B) Wed Sep 26 10:40:04 CST 2018
xxx.xxx.xxx.34:50010 default_rack DP 43039268741120(39.14 TB) 19133599044243(17.40 TB) 0.44% 16143418817257(14.68 TB) 0(0 B) 0(0 B) NaN% 0(0 B) Wed Sep 26 10:40:04 CST 2018 2 blocks to be replicated;
xxx.xxx.xxx.xx:50010 default_rack IN 35719153532928(32.49 TB) 19639941377872(17.86 TB) 0.55% 13921822275178(12.66 TB) 0(0 B) 0(0 B) NaN% 0(0 B) Wed Sep 26 10:40:04 CST 2018 64 blocks to be invalidated;
xxx.xxx.xxx.xx:50010 default_rack IN 43044577198080(39.15 TB) 22257842728998(20.24 TB) 0.52% 8827718379586(8.03 TB) 0(0 B) 0(0 B) NaN% 0(0 B) Wed Sep 26 10:40:04 CST 2018
xxx.xxx.xxx.xx:50010 default_rack IN 54168542494720(49.27 TB) 19201190599407(17.46 TB) 0.35% 33335553166140(30.32 TB) 0(0 B) 0(0 B) NaN% 0(0 B) Wed Sep 26 10:40:04 CST 2018
xxx.xxx.xxx.xx:50010 default_rack IN 54168542494720(49.27 TB) 19669312248147(17.89 TB) 0.36% 32865881808839(29.89 TB) 0(0 B) 0(0 B) NaN% 0(0 B) Wed Sep 26 10:40:04 CST 2018
xxx.xxx.xxx.xx:50010 default_rack IN 37762378219520(34.34 TB) 3246159556995(2.95 TB) 0.09% 33678460006824(30.63 TB) 0(0 B) 0(0 B) NaN% 0(0 B) Wed Sep 26 10:40:04 CST 2018 63 blocks to be invalidated;
xxx.xxx.xxx.xx:50010 default_rack IN 54168542494720(49.27 TB) 20323539696784(18.48 TB) 0.38% 32271527069120(29.35 TB) 0(0 B) 0(0 B) NaN% 0(0 B) Wed Sep 26 10:40:04 CST 2018 7 blocks to be replicated; 73 blocks to be invalidated;
xxx.xxx.xxx.xx:50010 default_rack IN 22880316538880(20.81 TB) 4524463118033(4.11 TB) 0.20% 16963981545595(15.43 TB) 0(0 B) 0(0 B) NaN% 0(0 B) Wed Sep 26 10:40:04 CST 2018 80 blocks to be invalidated;
xxx.xxx.xxx.xx:50010 default_rack IN 37723723513860(34.31 TB) 2242584812391(2.04 TB) 0.06% 35079540988600(31.90 TB) 0(0 B) 0(0 B) NaN% 0(0 B) Wed Sep 26 10:40:04 CST 2018
xxx.xxx.xxx.xx:50010 default_rack IN 37723723513860(34.31 TB) 864073122598(804.73 GB) 0.02% 36507695599616(33.20 TB) 0(0 B) 0(0 B) NaN% 0(0 B) Wed Sep 26 10:40:04 CST 2018
xxx.xxx.xxx.xx:50010 default_rack IN 32044152463360(29.14 TB) 20191502077098(18.36 TB) 0.63% 4276374611465(3.89 TB) 0(0 B) 0(0 B) NaN% 0(0 B) Wed Sep 26 10:40:04 CST 2018 5 blocks to be replicated;
xxx.xxx.xxx.xx:50010 default_rack IN 35719153532928(32.49 TB) 19310209465238(17.56 TB) 0.54% 14261995815199(12.97 TB) 0(0 B) 0(0 B) NaN% 0(0 B) Wed Sep 26 10:40:04 CST 2018