Hadoop 2.7.4 HDFS+YARN HA: Adding a DataNode and NodeManager

Current cluster

Hostname	IP address	Roles	Common install directory	Common install user
sht-sgmhadoopnn-01	172.16.101.55	namenode,resourcemanager	/usr/local/hadoop (symlink) /usr/local/hadoop-2.7.4 /usr/local/zookeeper (symlink) /usr/local/zookeeper-3.4.9	root
sht-sgmhadoopnn-02	172.16.101.56	namenode,resourcemanager
sht-sgmhadoopdn-01	172.16.101.58	datanode,nodemanager,journalnode,zookeeper
sht-sgmhadoopdn-02	172.16.101.59	datanode,nodemanager,journalnode,zookeeper
sht-sgmhadoopdn-03	172.16.101.60	datanode,nodemanager,journalnode,zookeeper

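With the cluster already deployed, this post adds a new datanode, sht-sgmhadoopdn-04 (it appears as 172.16.101.66 in the commands and balancer log below).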

Deployment reference: https://www.cnblogs.com/ilifeilong/p/10610993.html

1. Configure the new datanode as for a fresh installation: passwordless SSH login, environment variables, hostname resolution, and so on

2. On the active namenode, sht-sgmhadoopnn-01, edit the configuration files

1) slaves

Add the hostname sht-sgmhadoopdn-04 to the slaves file.

2) hdfs-site.xml

Change the value of the dfs.replication parameter to 4.
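The slaves edit in item 1) can be made idempotent, so that rerunning the step does not add a duplicate line. This is a minimal sketch assuming a POSIX shell. The `add_slave` helper is hypothetical, and the demo works on a scratch file rather than the real /usr/local/hadoop/etc/hadoop/slaves.

```shell
# Append a hostname to a slaves file only if it is not already listed.
# $1 = hostname, $2 = path to the slaves file
add_slave() {
  # -x matches the whole line, -F treats the hostname as a literal string
  grep -qxF "$1" "$2" 2>/dev/null || echo "$1" >> "$2"
}

# Demo on a scratch copy of the existing three-datanode list.
demo=$(mktemp)
printf '%s\n' sht-sgmhadoopdn-01 sht-sgmhadoopdn-02 sht-sgmhadoopdn-03 > "$demo"
add_slave sht-sgmhadoopdn-04 "$demo"
add_slave sht-sgmhadoopdn-04 "$demo"   # second call changes nothing
cat "$demo"
```

On the cluster the call would be `add_slave sht-sgmhadoopdn-04 /usr/local/hadoop/etc/hadoop/slaves`.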

3. On the active namenode, sht-sgmhadoopnn-01, rsync the two modified files to the other nodes in the cluster

# rsync -az --progress hdfs-site.xml root@172.16.101.56:/usr/local/hadoop/etc/hadoop/
# rsync -az --progress hdfs-site.xml root@172.16.101.58:/usr/local/hadoop/etc/hadoop/
# rsync -az --progress hdfs-site.xml root@172.16.101.59:/usr/local/hadoop/etc/hadoop/
# rsync -az --progress hdfs-site.xml root@172.16.101.60:/usr/local/hadoop/etc/hadoop/
# rsync -az --progress hdfs-site.xml root@172.16.101.66:/usr/local/hadoop/etc/hadoop/
# rsync -az --progress slaves root@172.16.101.56:/usr/local/hadoop/etc/hadoop/
# rsync -az --progress slaves root@172.16.101.58:/usr/local/hadoop/etc/hadoop/
# rsync -az --progress slaves root@172.16.101.59:/usr/local/hadoop/etc/hadoop/
# rsync -az --progress slaves root@172.16.101.60:/usr/local/hadoop/etc/hadoop/
# rsync -az --progress slaves root@172.16.101.66:/usr/local/hadoop/etc/hadoop/
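The ten commands above differ only in the target host and the file name, so they can be folded into a loop. This is a sketch, not the original procedure: `push_configs` and `DRY_RUN` are hypothetical names, the host list is copied from the commands above, and the default dry run only prints the rsync commands instead of running them.

```shell
# Paths and hosts are taken from the commands above.
CONF_DIR=/usr/local/hadoop/etc/hadoop
HOSTS="172.16.101.56 172.16.101.58 172.16.101.59 172.16.101.60 172.16.101.66"
# Hypothetical switch: 1 = print the commands only, 0 = run them.
DRY_RUN="${DRY_RUN:-1}"

push_configs() {
  for host in $HOSTS; do
    # A single rsync call can carry both files to the same directory.
    set -- rsync -az --progress \
      "$CONF_DIR/hdfs-site.xml" "$CONF_DIR/slaves" \
      "root@$host:$CONF_DIR/"
    if [ "$DRY_RUN" = "1" ]; then
      echo "$@"
    else
      "$@" || echo "rsync to $host failed" >&2
    fi
  done
}

push_configs
```

Setting `DRY_RUN=0` before the call runs the transfers for real. A failed host is reported on stderr and the loop continues with the next one.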

4. On the active namenode, sht-sgmhadoopnn-01, sync the Hadoop directory to the new node

# rsync -az --progress --exclude=data --exclude=logs  /usr/local/hadoop-2.7.4 root@sht-sgmhadoopdn-04:/usr/local/
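The cluster table lists /usr/local/hadoop as a symlink, while the command above copies only /usr/local/hadoop-2.7.4. If the link was not already created on the new node during step 1, something like the following could create it. `link_hadoop` is a hypothetical helper, and the assumption that the link should point at /usr/local/hadoop-2.7.4 comes from the table, not from a command in the original post. The demo runs against a scratch directory.

```shell
# Create <prefix>/hadoop -> <prefix>/hadoop-2.7.4 if the versioned directory exists.
# $1 = install prefix (would be /usr/local on the new node)
link_hadoop() {
  prefix="$1"
  if [ ! -d "$prefix/hadoop-2.7.4" ]; then
    echo "missing $prefix/hadoop-2.7.4" >&2
    return 1
  fi
  # -n replaces an existing link instead of creating a link inside its target
  ln -sfn "$prefix/hadoop-2.7.4" "$prefix/hadoop"
}

# Demo in a scratch directory so nothing under /usr/local is touched.
scratch=$(mktemp -d)
mkdir "$scratch/hadoop-2.7.4"
link_hadoop "$scratch"
ls -l "$scratch/hadoop"
```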

5. Start the datanode and nodemanager roles on the new node

# hadoop-daemon.sh start datanode
# yarn-daemon.sh start nodemanager
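A quick way to confirm that both daemons came up is `jps`, the JDK tool that lists running Java processes by main class. The sketch below assumes `jps` is on the PATH of the new node. `missing_daemons` is a hypothetical helper that reads jps output on stdin and prints the name of each expected daemon it does not find. The PIDs in the demo input are made up.

```shell
# Print each expected daemon that is absent from jps output read on stdin.
# jps prints one process per line as "<pid> <MainClass>".
missing_daemons() {
  awk '
    $2 == "DataNode"    { dn = 1 }
    $2 == "NodeManager" { nm = 1 }
    END {
      if (!dn) print "DataNode"
      if (!nm) print "NodeManager"
    }
  '
}

# On the new node:  jps | missing_daemons
# Demo with sample input in which the nodemanager is not running:
printf '%s\n' '2101 DataNode' '2390 Jps' | missing_daemons   # prints NodeManager
```

Empty output means both daemons were found.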

6. Verify in the web UI of the namenode and resourcemanager (on either the active or the standby node)

http://172.16.101.55:50070/dfshealth.html#tab-datanode

http://172.16.101.55:8088/cluster/nodes

7. Rebalance datanode data across the cluster (running this on the standby namenode is recommended)

# hdfs balancer -threshold 1
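Every block transfer in the balancer output ends with either a "Successfully moved" or a "Failed to move" line, so a saved copy of the output can be tallied to see how many moves had to be retried. This is a minimal sketch: `summarize_balancer_log` is a hypothetical helper, and the two demo lines are abridged from the output in this post.

```shell
# Count completed and failed block moves in balancer output read on stdin.
summarize_balancer_log() {
  awk '
    /Successfully moved blk_/ { ok++ }
    /Failed to move blk_/     { failed++ }
    END { printf "moved=%d failed=%d\n", ok, failed }
  '
}

# Possible usage on the cluster (the log file name is an assumption):
#   hdfs balancer -threshold 1 2>&1 | tee balancer.log
#   summarize_balancer_log < balancer.log
# Demo with two abridged lines:
summarize_balancer_log <<'EOF'
19/03/29 23:59:52 INFO balancer.Dispatcher: Successfully moved blk_1073741838_1014 with size=134217728
19/03/30 00:01:44 WARN balancer.Dispatcher: Failed to move blk_1073741837_1013 with size=134217728
EOF
```

A failed move is not necessarily lost: in the output in this post, blk_1073741837_1013 fails in the first iteration and is moved successfully in the next one.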

Log output:

# hdfs balancer -threshold 1
19/03/29 23:59:21 INFO balancer.Balancer: Using a threshold of 1.0
19/03/29 23:59:21 INFO balancer.Balancer: namenodes  = [hdfs://mycluster]
19/03/29 23:59:21 INFO balancer.Balancer: parameters = Balancer.Parameters [BalancingPolicy.Node, threshold = 1.0, max idle iteration = 5, number of nodes to be excluded = 0, number of nodes to be included = 0, run during upgrade = false]
Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
19/03/29 23:59:24 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)
19/03/29 23:59:24 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)
19/03/29 23:59:24 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)
19/03/29 23:59:24 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 5 (default=5)
19/03/29 23:59:24 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)
19/03/29 23:59:24 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)
19/03/29 23:59:24 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)
19/03/29 23:59:24 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.66:50010
19/03/29 23:59:24 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.58:50010
19/03/29 23:59:24 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.60:50010
19/03/29 23:59:24 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.59:50010
19/03/29 23:59:24 INFO balancer.Balancer: 0 over-utilized: []
19/03/29 23:59:24 INFO balancer.Balancer: 1 underutilized: [172.16.101.66:50010:DISK]
19/03/29 23:59:24 INFO balancer.Balancer: Need to move 1.10 GB to make the cluster balanced.
19/03/29 23:59:24 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => underUtilized
19/03/29 23:59:24 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => belowAvgUtilized
19/03/29 23:59:24 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: underUtilized => aboveAvgUtilized
19/03/29 23:59:24 INFO balancer.Balancer: Decided to move 635.63 MB bytes from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK
19/03/29 23:59:24 INFO balancer.Balancer: Decided to move 147.43 MB bytes from 172.16.101.60:50010:DISK to 172.16.101.66:50010:DISK
19/03/29 23:59:24 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => underUtilized
19/03/29 23:59:24 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => belowAvgUtilized
19/03/29 23:59:24 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: underUtilized => aboveAvgUtilized
19/03/29 23:59:24 INFO balancer.Balancer: Will move 783.06 MB in this iteration
19/03/29 23:59:24 INFO balancer.Dispatcher: Limiting threads per target to the specified max.
19/03/29 23:59:24 INFO balancer.Dispatcher: Allocating 5 threads per target.
19/03/29 23:59:24 INFO balancer.Dispatcher: Start moving blk_1073741839_1015 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/29 23:59:24 INFO balancer.Dispatcher: Start moving blk_1073741846_1022 with size=134217728 from 172.16.101.60:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/29 23:59:24 INFO balancer.Dispatcher: Start moving blk_1073741838_1014 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/29 23:59:24 INFO balancer.Dispatcher: Start moving blk_1073741845_1021 with size=134217728 from 172.16.101.60:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/29 23:59:24 INFO balancer.Dispatcher: Start moving blk_1073741837_1013 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/29 23:59:52 INFO balancer.Dispatcher: Successfully moved blk_1073741838_1014 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/29 23:59:52 INFO balancer.Dispatcher: Start moving blk_1073741836_1012 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:00:14 INFO balancer.Dispatcher: Successfully moved blk_1073741836_1012 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:00:14 INFO balancer.Dispatcher: Start moving blk_1073741835_1011 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:00:38 INFO balancer.Dispatcher: Successfully moved blk_1073741835_1011 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:01:44 WARN balancer.Dispatcher: Failed to move blk_1073741837_1013 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010: Got error, status message opReplaceBlock BP-698223843-172.16.101.55-1553701973789:blk_1073741837_1013 received exception java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.101.66:22240 remote=/172.16.101.58:50010], block move is failed
19/03/30 00:01:44 INFO balancer.Dispatcher: DDatanode:172.16.101.58:50010 activateDelay 10.0 seconds
19/03/30 00:01:44 INFO balancer.Dispatcher: DDatanode:172.16.101.66:50010 activateDelay 10.0 seconds
19/03/30 00:02:07 WARN balancer.Dispatcher: Failed to move blk_1073741845_1021 with size=134217728 from 172.16.101.60:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010: Got error, status message opReplaceBlock BP-698223843-172.16.101.55-1553701973789:blk_1073741845_1021 received exception java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.101.66:22238 remote=/172.16.101.58:50010], block move is failed
19/03/30 00:02:07 INFO balancer.Dispatcher: DDatanode:172.16.101.58:50010 activateDelay 10.0 seconds
19/03/30 00:02:07 INFO balancer.Dispatcher: DDatanode:172.16.101.66:50010 activateDelay 10.0 seconds
19/03/30 00:02:11 WARN balancer.Dispatcher: Failed to move blk_1073741839_1015 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010: Got error, status message opReplaceBlock BP-698223843-172.16.101.55-1553701973789:blk_1073741839_1015 received exception java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.101.66:22232 remote=/172.16.101.58:50010], block move is failed
19/03/30 00:02:11 INFO balancer.Dispatcher: DDatanode:172.16.101.58:50010 activateDelay 10.0 seconds
19/03/30 00:02:11 INFO balancer.Dispatcher: DDatanode:172.16.101.66:50010 activateDelay 10.0 seconds
19/03/30 00:02:35 WARN balancer.Dispatcher: Failed to move blk_1073741846_1022 with size=134217728 from 172.16.101.60:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010: Got error, status message opReplaceBlock BP-698223843-172.16.101.55-1553701973789:blk_1073741846_1022 received exception java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.101.66:22234 remote=/172.16.101.58:50010], block move is failed
19/03/30 00:02:35 INFO balancer.Dispatcher: DDatanode:172.16.101.58:50010 activateDelay 10.0 seconds
19/03/30 00:02:35 INFO balancer.Dispatcher: DDatanode:172.16.101.66:50010 activateDelay 10.0 seconds
Mar 30, 2019 12:02:36 AM          0               384 MB             1.10 GB          783.06 MB
19/03/30 00:02:41 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)
19/03/30 00:02:41 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)
19/03/30 00:02:41 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)
19/03/30 00:02:41 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 5 (default=5)
19/03/30 00:02:41 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)
19/03/30 00:02:41 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)
19/03/30 00:02:41 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)
19/03/30 00:02:41 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.66:50010
19/03/30 00:02:41 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.59:50010
19/03/30 00:02:41 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.58:50010
19/03/30 00:02:41 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.60:50010
19/03/30 00:02:41 INFO balancer.Balancer: 0 over-utilized: []
19/03/30 00:02:41 INFO balancer.Balancer: 1 underutilized: [172.16.101.66:50010:DISK]
19/03/30 00:02:41 INFO balancer.Balancer: Need to move 833.58 MB to make the cluster balanced.
19/03/30 00:02:41 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => underUtilized
19/03/30 00:02:41 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => belowAvgUtilized
19/03/30 00:02:41 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: underUtilized => aboveAvgUtilized
19/03/30 00:02:41 INFO balancer.Balancer: Decided to move 538.88 MB bytes from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK
19/03/30 00:02:41 INFO balancer.Balancer: Decided to move 244.18 MB bytes from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK
19/03/30 00:02:41 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => underUtilized
19/03/30 00:02:41 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => belowAvgUtilized
19/03/30 00:02:41 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: underUtilized => aboveAvgUtilized
19/03/30 00:02:41 INFO balancer.Balancer: Will move 783.06 MB in this iteration
19/03/30 00:02:41 INFO balancer.Dispatcher: Limiting threads per target to the specified max.
19/03/30 00:02:41 INFO balancer.Dispatcher: Allocating 5 threads per target.
19/03/30 00:02:41 INFO balancer.Dispatcher: Start moving blk_1073741837_1013 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:02:41 INFO balancer.Dispatcher: Start moving blk_1073741834_1010 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:02:41 INFO balancer.Dispatcher: Start moving blk_1073741842_1018 with size=134217728 from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:02:41 INFO balancer.Dispatcher: Start moving blk_1073741841_1017 with size=134217728 from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:02:41 INFO balancer.Dispatcher: Start moving blk_1073741840_1016 with size=134217728 from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:02:41 WARN balancer.Dispatcher: Failed to move blk_1073741834_1010 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010: Got error, status message opReplaceBlock BP-698223843-172.16.101.55-1553701973789:blk_1073741834_1010 received exception java.io.IOException: Got error, status message Not able to copy block 1073741834 to /172.16.101.66:22256 because threads quota is exceeded., copy block BP-698223843-172.16.101.55-1553701973789:blk_1073741834_1010 from /172.16.101.58:50010, block move is failed
19/03/30 00:02:41 INFO balancer.Dispatcher: DDatanode:172.16.101.58:50010 activateDelay 10.0 seconds
19/03/30 00:02:41 INFO balancer.Dispatcher: DDatanode:172.16.101.66:50010 activateDelay 10.0 seconds
19/03/30 00:02:41 INFO balancer.Dispatcher: Start moving blk_1073741839_1015 with size=134217728 from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:02:41 WARN balancer.Dispatcher: Failed to move blk_1073741841_1017 with size=134217728 from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010: Got error, status message opReplaceBlock BP-698223843-172.16.101.55-1553701973789:blk_1073741841_1017 received exception java.io.IOException: Got error, status message Not able to copy block 1073741841 to /172.16.101.66:22258 because threads quota is exceeded., copy block BP-698223843-172.16.101.55-1553701973789:blk_1073741841_1017 from /172.16.101.58:50010, block move is failed
19/03/30 00:02:41 INFO balancer.Dispatcher: DDatanode:172.16.101.58:50010 activateDelay 10.0 seconds
19/03/30 00:02:41 INFO balancer.Dispatcher: DDatanode:172.16.101.66:50010 activateDelay 10.0 seconds
19/03/30 00:02:41 WARN balancer.Dispatcher: Failed to move blk_1073741840_1016 with size=134217728 from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010: Got error, status message opReplaceBlock BP-698223843-172.16.101.55-1553701973789:blk_1073741840_1016 received exception java.io.IOException: Got error, status message Not able to copy block 1073741840 to /172.16.101.66:22260 because threads quota is exceeded., copy block BP-698223843-172.16.101.55-1553701973789:blk_1073741840_1016 from /172.16.101.58:50010, block move is failed
19/03/30 00:02:41 INFO balancer.Dispatcher: DDatanode:172.16.101.58:50010 activateDelay 10.0 seconds
19/03/30 00:02:41 INFO balancer.Dispatcher: DDatanode:172.16.101.66:50010 activateDelay 10.0 seconds
19/03/30 00:02:41 WARN balancer.Dispatcher: Failed to move blk_1073741839_1015 with size=134217728 from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010: Got error, status message opReplaceBlock BP-698223843-172.16.101.55-1553701973789:blk_1073741839_1015 received exception java.io.IOException: Got error, status message Not able to copy block 1073741839 to /172.16.101.66:22262 because threads quota is exceeded., copy block BP-698223843-172.16.101.55-1553701973789:blk_1073741839_1015 from /172.16.101.58:50010, block move is failed
19/03/30 00:02:41 INFO balancer.Dispatcher: DDatanode:172.16.101.58:50010 activateDelay 10.0 seconds
19/03/30 00:02:41 INFO balancer.Dispatcher: DDatanode:172.16.101.66:50010 activateDelay 10.0 seconds
19/03/30 00:02:58 INFO balancer.Dispatcher: Successfully moved blk_1073741842_1018 with size=134217728 from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:02:58 INFO balancer.Dispatcher: Successfully moved blk_1073741837_1013 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
Mar 30, 2019 12:02:58 AM          1               640 MB           833.58 MB          783.06 MB
19/03/30 00:03:03 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)
19/03/30 00:03:03 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)
19/03/30 00:03:03 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)
19/03/30 00:03:03 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 5 (default=5)
19/03/30 00:03:03 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)
19/03/30 00:03:03 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)
19/03/30 00:03:03 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)
19/03/30 00:03:03 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.58:50010
19/03/30 00:03:03 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.66:50010
19/03/30 00:03:03 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.59:50010
19/03/30 00:03:03 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.60:50010
19/03/30 00:03:03 INFO balancer.Balancer: 0 over-utilized: []
19/03/30 00:03:03 INFO balancer.Balancer: 1 underutilized: [172.16.101.66:50010:DISK]
19/03/30 00:03:03 INFO balancer.Balancer: Need to move 640.08 MB to make the cluster balanced.
19/03/30 00:03:03 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => underUtilized
19/03/30 00:03:03 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => belowAvgUtilized
19/03/30 00:03:03 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: underUtilized => aboveAvgUtilized
19/03/30 00:03:03 INFO balancer.Balancer: Decided to move 474.38 MB bytes from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK
19/03/30 00:03:03 INFO balancer.Balancer: Decided to move 308.67 MB bytes from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK
19/03/30 00:03:03 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => underUtilized
19/03/30 00:03:03 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => belowAvgUtilized
19/03/30 00:03:03 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: underUtilized => aboveAvgUtilized
19/03/30 00:03:03 INFO balancer.Balancer: Will move 783.06 MB in this iteration
19/03/30 00:03:03 INFO balancer.Dispatcher: Limiting threads per target to the specified max.
19/03/30 00:03:03 INFO balancer.Dispatcher: Allocating 5 threads per target.
19/03/30 00:03:03 INFO balancer.Dispatcher: Start moving blk_1073741834_1010 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:03:03 INFO balancer.Dispatcher: Start moving blk_1073741833_1009 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:03:03 INFO balancer.Dispatcher: Start moving blk_1073741832_1008 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:03:03 INFO balancer.Dispatcher: Start moving blk_1073741828_1004 with size=21901927 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:03:03 INFO balancer.Dispatcher: Start moving blk_1073741827_1003 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:03:03 WARN balancer.Dispatcher: Failed to move blk_1073741828_1004 with size=21901927 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010: Got error, status message opReplaceBlock BP-698223843-172.16.101.55-1553701973789:blk_1073741828_1004 received exception java.io.IOException: Got error, status message Not able to copy block 1073741828 to /172.16.101.66:22272 because threads quota is exceeded., copy block BP-698223843-172.16.101.55-1553701973789:blk_1073741828_1004 from /172.16.101.58:50010, block move is failed
19/03/30 00:03:03 INFO balancer.Dispatcher: DDatanode:172.16.101.58:50010 activateDelay 10.0 seconds
19/03/30 00:03:03 INFO balancer.Dispatcher: DDatanode:172.16.101.66:50010 activateDelay 10.0 seconds
19/03/30 00:03:03 INFO balancer.Dispatcher: Start moving blk_1073741826_1002 with size=134217728 from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:03:03 WARN balancer.Dispatcher: Failed to move blk_1073741826_1002 with size=134217728 from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010: Got error, status message opReplaceBlock BP-698223843-172.16.101.55-1553701973789:blk_1073741826_1002 received exception java.io.IOException: Got error, status message Not able to copy block 1073741826 to /172.16.101.66:22274 because threads quota is exceeded., copy block BP-698223843-172.16.101.55-1553701973789:blk_1073741826_1002 from /172.16.101.58:50010, block move is failed
19/03/30 00:03:03 INFO balancer.Dispatcher: DDatanode:172.16.101.58:50010 activateDelay 10.0 seconds
19/03/30 00:03:03 INFO balancer.Dispatcher: DDatanode:172.16.101.66:50010 activateDelay 10.0 seconds
19/03/30 00:03:47 INFO balancer.Dispatcher: Successfully moved blk_1073741833_1009 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:05:12 WARN balancer.Dispatcher: Failed to move blk_1073741834_1010 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010: Got error, status message opReplaceBlock BP-698223843-172.16.101.55-1553701973789:blk_1073741834_1010 received exception java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.101.66:22266 remote=/172.16.101.58:50010], block move is failed
19/03/30 00:05:12 INFO balancer.Dispatcher: DDatanode:172.16.101.58:50010 activateDelay 10.0 seconds
19/03/30 00:05:12 INFO balancer.Dispatcher: DDatanode:172.16.101.66:50010 activateDelay 10.0 seconds
19/03/30 00:05:36 WARN balancer.Dispatcher: Failed to move blk_1073741827_1003 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010: Got error, status message opReplaceBlock BP-698223843-172.16.101.55-1553701973789:blk_1073741827_1003 received exception java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.101.66:22270 remote=/172.16.101.58:50010], block move is failed
19/03/30 00:05:36 INFO balancer.Dispatcher: DDatanode:172.16.101.58:50010 activateDelay 10.0 seconds
19/03/30 00:05:36 INFO balancer.Dispatcher: DDatanode:172.16.101.66:50010 activateDelay 10.0 seconds
19/03/30 00:06:11 WARN balancer.Dispatcher: Failed to move blk_1073741832_1008 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010: Got error, status message opReplaceBlock BP-698223843-172.16.101.55-1553701973789:blk_1073741832_1008 received exception java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.101.66:22268 remote=/172.16.101.58:50010], block move is failed
19/03/30 00:06:11 INFO balancer.Dispatcher: DDatanode:172.16.101.58:50010 activateDelay 10.0 seconds
19/03/30 00:06:11 INFO balancer.Dispatcher: DDatanode:172.16.101.66:50010 activateDelay 10.0 seconds
Mar 30, 2019 12:06:11 AM          2               768 MB           640.08 MB          783.06 MB
19/03/30 00:06:16 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)
19/03/30 00:06:16 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)
19/03/30 00:06:16 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)
19/03/30 00:06:16 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 5 (default=5)
19/03/30 00:06:16 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)
19/03/30 00:06:16 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)
19/03/30 00:06:16 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)
19/03/30 00:06:16 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.59:50010
19/03/30 00:06:16 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.58:50010
19/03/30 00:06:16 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.66:50010
19/03/30 00:06:16 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.60:50010
19/03/30 00:06:16 INFO balancer.Balancer: 0 over-utilized: []
19/03/30 00:06:16 INFO balancer.Balancer: 1 underutilized: [172.16.101.66:50010:DISK]
19/03/30 00:06:16 INFO balancer.Balancer: Need to move 458.28 MB to make the cluster balanced.
19/03/30 00:06:16 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => underUtilized
19/03/30 00:06:16 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => belowAvgUtilized
19/03/30 00:06:16 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: underUtilized => aboveAvgUtilized
19/03/30 00:06:16 INFO balancer.Balancer: Decided to move 413.78 MB bytes from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK
19/03/30 00:06:16 INFO balancer.Balancer: Decided to move 369.28 MB bytes from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK
19/03/30 00:06:16 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => underUtilized
19/03/30 00:06:16 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => belowAvgUtilized
19/03/30 00:06:16 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: underUtilized => aboveAvgUtilized
19/03/30 00:06:16 INFO balancer.Balancer: Will move 783.06 MB in this iteration
19/03/30 00:06:16 INFO balancer.Dispatcher: Limiting threads per target to the specified max.
19/03/30 00:06:16 INFO balancer.Dispatcher: Allocating 5 threads per target.
19/03/30 00:06:16 INFO balancer.Dispatcher: Start moving blk_1073741832_1008 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:06:16 INFO balancer.Dispatcher: Start moving blk_1073741828_1004 with size=21901927 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:06:16 INFO balancer.Dispatcher: Start moving blk_1073741826_1002 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:06:16 INFO balancer.Dispatcher: Start moving blk_1073741827_1003 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:06:16 INFO balancer.Dispatcher: Start moving blk_1073741834_1010 with size=134217728 from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:06:16 WARN balancer.Dispatcher: Failed to move blk_1073741834_1010 with size=134217728 from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010: Got error, status message opReplaceBlock BP-698223843-172.16.101.55-1553701973789:blk_1073741834_1010 received exception java.io.IOException: Got error, status message Not able to copy block 1073741834 to /172.16.101.66:22284 because threads quota is exceeded., copy block BP-698223843-172.16.101.55-1553701973789:blk_1073741834_1010 from /172.16.101.58:50010, block move is failed
19/03/30 00:06:16 INFO balancer.Dispatcher: DDatanode:172.16.101.58:50010 activateDelay 10.0 seconds
19/03/30 00:06:16 INFO balancer.Dispatcher: DDatanode:172.16.101.66:50010 activateDelay 10.0 seconds
19/03/30 00:06:16 INFO balancer.Dispatcher: Start moving blk_1073741825_1001 with size=134217728 from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:06:16 WARN balancer.Dispatcher: Failed to move blk_1073741825_1001 with size=134217728 from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010: Got error, status message opReplaceBlock BP-698223843-172.16.101.55-1553701973789:blk_1073741825_1001 received exception java.io.IOException: Got error, status message Not able to copy block 1073741825 to /172.16.101.66:22286 because threads quota is exceeded., copy block BP-698223843-172.16.101.55-1553701973789:blk_1073741825_1001 from /172.16.101.58:50010, block move is failed
19/03/30 00:06:16 INFO balancer.Dispatcher: DDatanode:172.16.101.58:50010 activateDelay 10.0 seconds
19/03/30 00:06:16 INFO balancer.Dispatcher: DDatanode:172.16.101.66:50010 activateDelay 10.0 seconds
19/03/30 00:06:19 INFO balancer.Dispatcher: Successfully moved blk_1073741828_1004 with size=21901927 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:06:49 INFO balancer.Dispatcher: Successfully moved blk_1073741832_1008 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:06:53 INFO balancer.Dispatcher: Successfully moved blk_1073741827_1003 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:08:36 WARN balancer.Dispatcher: Failed to move blk_1073741826_1002 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010: Got error, status message opReplaceBlock BP-698223843-172.16.101.55-1553701973789:blk_1073741826_1002 received exception java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.101.66:22280 remote=/172.16.101.58:50010], block move is failed
19/03/30 00:08:36 INFO balancer.Dispatcher: DDatanode:172.16.101.58:50010 activateDelay 10.0 seconds
19/03/30 00:08:36 INFO balancer.Dispatcher: DDatanode:172.16.101.66:50010 activateDelay 10.0 seconds
Mar 30, 2019 12:08:36 AM          3              1.02 GB           458.28 MB          783.06 MB
19/03/30 00:08:41 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)
19/03/30 00:08:41 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)
19/03/30 00:08:41 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)
19/03/30 00:08:41 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 5 (default=5)
19/03/30 00:08:41 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)
19/03/30 00:08:41 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)
19/03/30 00:08:41 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)
19/03/30 00:08:41 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.59:50010
19/03/30 00:08:41 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.58:50010
19/03/30 00:08:41 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.66:50010
19/03/30 00:08:41 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.60:50010
19/03/30 00:08:41 INFO balancer.Balancer: 0 over-utilized: []
19/03/30 00:08:41 INFO balancer.Balancer: 1 underutilized: [172.16.101.66:50010:DISK]
19/03/30 00:08:41 INFO balancer.Balancer: Need to move 248.99 MB to make the cluster balanced.
19/03/30 00:08:41 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => underUtilized
19/03/30 00:08:41 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => belowAvgUtilized
19/03/30 00:08:41 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: underUtilized => aboveAvgUtilized
19/03/30 00:08:41 INFO balancer.Balancer: Decided to move 344.02 MB bytes from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK
19/03/30 00:08:41 INFO balancer.Balancer: Decided to move 344.02 MB bytes from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK
19/03/30 00:08:41 INFO balancer.Balancer: Decided to move 95.03 MB bytes from 172.16.101.60:50010:DISK to 172.16.101.66:50010:DISK
19/03/30 00:08:41 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => underUtilized
19/03/30 00:08:41 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => belowAvgUtilized
19/03/30 00:08:41 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: underUtilized => aboveAvgUtilized
19/03/30 00:08:41 INFO balancer.Balancer: Will move 783.06 MB in this iteration
19/03/30 00:08:41 INFO balancer.Dispatcher: Limiting threads per target to the specified max.
19/03/30 00:08:41 INFO balancer.Dispatcher: Allocating 5 threads per target.
19/03/30 00:08:41 INFO balancer.Dispatcher: Start moving blk_1073741839_1015 with size=134217728 from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:08:41 INFO balancer.Dispatcher: Start moving blk_1073741834_1010 with size=134217728 from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:08:41 INFO balancer.Dispatcher: Start moving blk_1073741825_1001 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:08:41 INFO balancer.Dispatcher: Start moving blk_1073741826_1002 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:08:41 INFO balancer.Dispatcher: Start moving blk_1073741848_1024 with size=73209856 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:09:35 INFO balancer.Dispatcher: Successfully moved blk_1073741848_1024 with size=73209856 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:09:35 INFO balancer.Dispatcher: Start moving blk_1073741847_1023 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:09:40 INFO balancer.Dispatcher: Successfully moved blk_1073741839_1015 with size=134217728 from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:09:41 INFO balancer.Dispatcher: Successfully moved blk_1073741826_1002 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:09:57 INFO balancer.Dispatcher: Successfully moved blk_1073741825_1001 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:09:57 INFO balancer.Dispatcher: Successfully moved blk_1073741834_1010 with size=134217728 from 172.16.101.59:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010
19/03/30 00:12:28 WARN balancer.Dispatcher: Failed to move blk_1073741847_1023 with size=134217728 from 172.16.101.58:50010:DISK to 172.16.101.66:50010:DISK through 172.16.101.58:50010: Got error, status message opReplaceBlock BP-698223843-172.16.101.55-1553701973789:blk_1073741847_1023 received exception java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.101.66:22298 remote=/172.16.101.58:50010], block move is failed
19/03/30 00:12:28 INFO balancer.Dispatcher: DDatanode:172.16.101.58:50010 activateDelay 10.0 seconds
19/03/30 00:12:28 INFO balancer.Dispatcher: DDatanode:172.16.101.66:50010 activateDelay 10.0 seconds
Mar 30, 2019 12:12:28 AM          4              1.59 GB           248.99 MB          783.06 MB
19/03/30 00:12:33 INFO balancer.Balancer: dfs.balancer.movedWinWidth = 5400000 (default=5400000)
19/03/30 00:12:33 INFO balancer.Balancer: dfs.balancer.moverThreads = 1000 (default=1000)
19/03/30 00:12:33 INFO balancer.Balancer: dfs.balancer.dispatcherThreads = 200 (default=200)
19/03/30 00:12:33 INFO balancer.Balancer: dfs.datanode.balance.max.concurrent.moves = 5 (default=5)
19/03/30 00:12:33 INFO balancer.Balancer: dfs.balancer.getBlocks.size = 2147483648 (default=2147483648)
19/03/30 00:12:33 INFO balancer.Balancer: dfs.balancer.getBlocks.min-block-size = 10485760 (default=10485760)
19/03/30 00:12:33 INFO balancer.Balancer: dfs.balancer.max-size-to-move = 10737418240 (default=10737418240)
19/03/30 00:12:33 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.59:50010
19/03/30 00:12:33 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.60:50010
19/03/30 00:12:33 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.58:50010
19/03/30 00:12:33 INFO net.NetworkTopology: Adding a new node: /default-rack/172.16.101.66:50010
19/03/30 00:12:33 INFO balancer.Balancer: 0 over-utilized: []
19/03/30 00:12:33 INFO balancer.Balancer: 0 underutilized: []
The cluster is balanced. Exiting...
Mar 30, 2019 12:12:33 AM          5              1.59 GB                 0 B               -1 B
Mar 30, 2019 12:12:34 AM Balancing took 13.216533333333333 minutes
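
The log above mixes successful moves with moves that timed out (SocketTimeoutException after 60000 ms). A block that fails in one iteration can be picked up again in a later one, as happened with blk_1073741826_1002. A minimal sketch for tallying both kinds of message, assuming the balancer output was saved to a file (the name balancer.log is hypothetical):

```shell
# Count successful and failed block moves in a balancer log read from stdin.
# Matches the "Successfully moved" / "Failed to move" Dispatcher messages shown above.
balancer_move_summary() {
  awk '
    /balancer\.Dispatcher: Successfully moved/ { ok++ }
    /balancer\.Dispatcher: Failed to move/     { failed++ }
    END { printf "moved=%d failed=%d\n", ok, failed }
  '
}

# Usage (balancer.log is a hypothetical file holding the balancer output):
#   balancer_move_summary < balancer.log
```

A non-zero failed count does not by itself mean the run was unsuccessful; the final "The cluster is balanced. Exiting..." line is what confirms the result.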


Check the HDFS cluster load again

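8. Change the replication factor of existing files/directories in the HDFS cluster

Existing files keep their original replication factor. Hadoop does not automatically adjust them to the new value, so this has to be done manually.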

hdfs dfs -setrep -R -w 4 /

Output log

Replication 4 set: /CentOS-6.8-x86_64-bin-DVD2.iso
Replication 4 set: /hadoop-2.8.1.tar.gz
Replication 4 set: /slaves
Waiting for /CentOS-6.8-x86_64-bin-DVD2.iso ..................... done
Waiting for /hadoop-2.8.1.tar.gz ... done
Waiting for /slaves ... done
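
To spot-check individual files after the change, the %r format of hdfs dfs -stat prints a file's replication factor. A small sketch, where the helper name show_replication is made up for illustration:

```shell
# Print "<replication factor><TAB><path>" for each HDFS path given.
# Relies on the %r format of `hdfs dfs -stat`, which prints the replication factor.
show_replication() {
  for path in "$@"; do
    printf '%s\t%s\n' "$(hdfs dfs -stat %r "$path")" "$path"
  done
}

# Usage, with two of the files from the output above:
#   show_replication /slaves /hadoop-2.8.1.tar.gz
```

Each listed file should report 4 once setrep -w has returned.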


Verify with the following command

# hdfs fsck /
Connecting to namenode via http://sht-sgmhadoopnn-01:50070/fsck?ugi=root&path=%2F
FSCK started by root (auth:SIMPLE) from /172.16.101.55 for path / at Sat Mar 30 00:22:54 CST 2019
...Status: HEALTHY
 Total size:    2645248691 B
 Total dirs:    2
 Total files:    3
 Total symlinks:        0
 Total blocks (validated):    22 (avg. block size 120238576 B)
 Minimally replicated blocks:    22 (100.0 %)
 Over-replicated blocks:    0 (0.0 %)
 Under-replicated blocks:    0 (0.0 %)
 Mis-replicated blocks:        0 (0.0 %)
 Default replication factor:    3
 Average block replication:    4.0
 Corrupt blocks:        0
 Missing replicas:        0 (0.0 %)
 Number of data-nodes:        4
 Number of racks:        1
FSCK ended at Sat Mar 30 00:22:54 CST 2019 in 2 milliseconds


The filesystem under path '/' is HEALTHY
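
The two things to look for in this report are the final HEALTHY line and an under-replicated block count of 0. A rough sketch of that check as a script, assuming the fsck output format shown above (the function name fsck_ok is hypothetical):

```shell
# Read `hdfs fsck` output on stdin and succeed only if the filesystem is
# reported HEALTHY and the "Under-replicated blocks" count is 0.
fsck_ok() {
  awk '
    /Under-replicated blocks:/ { under = $3; seen = 1 }
    /^The filesystem under path .* is HEALTHY/ { healthy = 1 }
    END { exit !(healthy && seen && under == 0) }
  '
}

# Usage:
#   hdfs fsck / | fsck_ok && echo "replication OK"
```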

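The steps above add a datanode dynamically without restarting the HDFS cluster, but restarting the cluster at a suitable time is still recommended. Note that fsck above still reports "Default replication factor: 3", presumably because the running namenode has not reloaded the modified hdfs-site.xml.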

Original article: https://www.cnblogs.com/ilifeilong/p/10618069.html