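Today I had to remove two DataNodes from a Hadoop cluster. To avoid disrupting running jobs, the nodes had to be removed dynamically (decommissioned). The steps are recorded below: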
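1, Before a node is taken out of the cluster, the data it holds must be backed up, that is, re-replicated to the other nodes:
Add the following to the core-site.xml configuration file on the master node (dfs.hosts.exclude is an HDFS property, so it is more commonly placed in hdfs-site.xml):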
<property>
<name>dfs.hosts.exclude</name>
<value>/home/hadoop/hadoop/conf/excludes</value>
</property>
Notes
dfs.hosts.exclude: names the file that lists the nodes to be removed
/home/hadoop/hadoop/conf/excludes: the path and name of that file; here it is called excludes
2, In the directory configured in step 1, create the file (touch excludes) and list the nodes to remove, one per line:
cloud4
cloud5
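Step 2 can also be scripted. The following is a minimal sketch, not part of the original procedure: write_excludes is a hypothetical helper name, and the path in the usage comment is the one configured in step 1.

```shell
# Hypothetical helper: write one hostname per line into the excludes file.
# $1 is the file path, the remaining arguments are the hosts to remove.
write_excludes() {
  file="$1"
  shift
  # printf repeats the format once per argument, so each host gets its own line.
  printf '%s\n' "$@" > "$file"
}

# Usage (path taken from step 1):
# write_excludes /home/hadoop/hadoop/conf/excludes cloud4 cloud5
```

Note that the helper overwrites the file, so hosts that are still being decommissioned must be passed again on every call.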
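3, Go to the Hadoop directory and run: hadoop dfsadmin -refreshNodes (I installed Hadoop with yum; the Hadoop directory is in a different place depending on how it was installed). This command reloads the dfs.hosts and dfs.hosts.exclude settings on the fly, so the NameNode does not need to be restarted.
Once it has finished, the DataNode on each removed node disappears, but the TaskTracker keeps running and has to be stopped by hand.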
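Stopping the leftover TaskTrackers can be sketched as a loop over the excludes file. This is only an illustration under assumptions: stop_tasktrackers and the DRY_RUN variable are hypothetical names, passwordless ssh to each host is assumed, and hadoop-daemon.sh (the per-daemon control script shipped with tarball installs of this Hadoop generation) is assumed to be on the remote PATH. A yum/package install may use a service script instead.

```shell
# Hypothetical helper: stop the TaskTracker on every host listed in the excludes file.
# With DRY_RUN=1 (the default here) it only prints the commands it would run.
stop_tasktrackers() {
  file="$1"
  while IFS= read -r host; do
    # Skip empty lines in the excludes file.
    [ -n "$host" ] || continue
    if [ "${DRY_RUN:-1}" = 1 ]; then
      echo "ssh $host hadoop-daemon.sh stop tasktracker"
    else
      # -n keeps ssh from swallowing the rest of the host list on stdin.
      ssh -n "$host" hadoop-daemon.sh stop tasktracker
    fi
  done < "$file"
}

# Usage:
# stop_tasktrackers /home/hadoop/hadoop/conf/excludes            # print only
# DRY_RUN=0 stop_tasktrackers /home/hadoop/hadoop/conf/excludes  # actually run
```

Defaulting to a dry run is a deliberate choice, since stopping a daemon on the wrong host is harder to undo than re-running the command.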
4, Then check the result with bin/hadoop dfsadmin -report. The output looks like this:
Configured Capacity: 17721082527744 (16.12 TB)
Present Capacity: 16806607028262 (15.29 TB)
DFS Remaining: 14996775104512 (13.64 TB)
DFS Used: 1809831923750 (1.65 TB)
DFS Used%: 10.77%
Under replicated blocks: 6788
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 6 (6 total, 0 dead)
Name: 192.168.1.5:50010
Decommission Status : Normal
Configured Capacity: 2953511657472 (2.69 TB)
DFS Used: 265079108972 (246.87 GB)
Non DFS Used: 150286670484 (139.97 GB)
DFS Remaining: 2538145878016(2.31 TB)
DFS Used%: 8.98%
DFS Remaining%: 85.94%
Last contact: Thu Sep 08 10:12:45 CST 2011
Name: 192.168.1.8:50010
Decommission Status : Decommission in progress
Configured Capacity: 2953511657472 (2.69 TB)
DFS Used: 228590288896 (212.89 GB)
Non DFS Used: 150240718848 (139.92 GB)
DFS Remaining: 2574680649728(2.34 TB)
DFS Used%: 7.74%
DFS Remaining%: 87.17%
Last contact: Thu Sep 08 10:12:45 CST 2011
Name: 192.168.1.7:50010
Decommission Status : Normal
Configured Capacity: 2953511657472 (2.69 TB)
DFS Used: 266826599821 (248.5 GB)
Non DFS Used: 150259458675 (139.94 GB)
DFS Remaining: 2536425598976(2.31 TB)
DFS Used%: 9.03%
DFS Remaining%: 85.88%
Last contact: Thu Sep 08 10:12:46 CST 2011
Name: 192.168.1.9:50010
Decommission Status : Decommission in progress
Configured Capacity: 2953511657472 (2.69 TB)
DFS Used: 226060701696 (210.54 GB)
Non DFS Used: 150240718848 (139.92 GB)
DFS Remaining: 2577210236928(2.34 TB)
DFS Used%: 7.65%
DFS Remaining%: 87.26%
Last contact: Thu Sep 08 10:12:45 CST 2011
Name: 192.168.1.4:50010
Decommission Status : Normal
Configured Capacity: 2953524240384 (2.69 TB)
DFS Used: 553202110857 (515.21 GB)
Non DFS Used: 163197603447 (151.99 GB)
DFS Remaining: 2237124526080(2.03 TB)
DFS Used%: 18.73%
DFS Remaining%: 75.74%
Last contact: Thu Sep 08 10:12:46 CST 2011
Name: 192.168.1.6:50010
Decommission Status : Normal
Configured Capacity: 2953511657472 (2.69 TB)
DFS Used: 270073113508 (251.53 GB)
Non DFS Used: 150250329180 (139.93 GB)
DFS Remaining: 2533188214784(2.3 TB)
DFS Used%: 9.14%
DFS Remaining%: 85.77%
Last contact: Thu Sep 08 10:12:44 CST 2011
5, The command from step 4 shows the status of the nodes being removed, for example 192.168.1.9:
Decommission Status : Decommissioned
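This means the data on 192.168.1.9 has finished being copied to the other nodes. If the status is still Decommission Status : Decommission in progress, the copy is still running.
At this point the node removal is complete.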
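Rather than re-reading the whole report by hand, the status lines can be filtered out with a small script. The sketch below assumes the report format shown in step 4 (a Name: line followed by a Decommission Status line for each node); decommission_status and decommission_done are hypothetical helper names.

```shell
# Read `hadoop dfsadmin -report` output on stdin and print "<node> <status>" per DataNode.
decommission_status() {
  awk '
    /^Name:/ { node = $2 }
    /^Decommission Status/ {
      # Strip the label so only the status text remains.
      sub(/^Decommission Status *: */, "")
      print node, $0
    }
  '
}

# Exit 0 when no node in the report on stdin is still "Decommission in progress".
decommission_done() {
  ! decommission_status | grep -q 'Decommission in progress'
}

# Usage: poll once a minute until every excluded node has finished.
# until hadoop dfsadmin -report | decommission_done; do sleep 60; done
```

For the report in step 4 this would list 192.168.1.8 and 192.168.1.9 as Decommission in progress and the other four nodes as Normal.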
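Lessons learned
Before pulling the nodes, stop the programs that load data into Hadoop first. Otherwise they keep writing data to the nodes being removed, and the decommission keeps running without ever finishing.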