• (Repost) Adding or removing DataNode nodes


    Reposted from: https://www.cnblogs.com/marility/p/9362168.html

    1. Test environment

    IP             Hostname  Role
    10.124.147.22  hadoop1   namenode
    10.124.147.23  hadoop2   namenode
    10.124.147.32  hadoop3   resourcemanager
    10.124.147.33  hadoop4   resourcemanager
    10.110.92.161  hadoop5   datanode/journalnode
    10.110.92.162  hadoop6   datanode
    10.122.147.37  hadoop7   datanode

    2. Required configuration parameters

    2.1 hdfs-site.xml parameters

    [hadoop@10-124-147-22 hadoop]$ grep dfs.host -A10 /usr/local/hadoop/etc/hadoop/hdfs-site.xml
    <!-- file listing the DataNode hosts to exclude (decommission) -->
    <property>
    <name>dfs.hosts.exclude</name>
    <value>/usr/local/hadoop/etc/hadoop/dfs_exclude</value>
    </property>
    

    <!-- file listing the DataNode hosts allowed to join -->
    <property>
    <name>dfs.hosts</name>
    <value>/usr/local/hadoop/etc/hadoop/slaves</value>
    </property>

    2.2 yarn-site.xml parameters

    [hadoop@10-124-147-22 hadoop]$ grep exclude-path -A10 /usr/local/hadoop/etc/hadoop/yarn-site.xml
    <!-- file listing the hosts to exclude (same file as the DataNode exclude list) -->
    <property>
    <name>yarn.resourcemanager.nodes.exclude-path</name>
    <value>/usr/local/hadoop/etc/hadoop/dfs_exclude</value>
    </property>
    

    <!-- file listing the hosts allowed to join (same file as the DataNode include list) -->
    <property>
    <name>yarn.resourcemanager.nodes.include-path</name>
    <value>/usr/local/hadoop/etc/hadoop/slaves</value>
    </property>

    3. Removing an existing host

    1. On the NameNode host, add the IP of the host to be removed to dfs_exclude, the file named by the dfs.hosts.exclude parameter in hdfs-site.xml.

    [hadoop@10-124-147-22 hadoop]$ cat /usr/local/hadoop/etc/hadoop/dfs_exclude 
    10.122.147.37

    2. Copy the file to the other Hadoop hosts.

    [hadoop@10-124-147-22 hadoop]$ for i in {2,3,4,5,6,7};do scp etc/hadoop/dfs_exclude hadoop$i:/usr/local/hadoop/etc/hadoop/;done
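
    Steps 1 and 2 edit the exclude file by hand. As a minimal sketch, the edit can be made idempotent so that running it twice does not leave duplicate entries. The function name add_host is hypothetical and is not part of Hadoop; it only touches a plain text file.

```shell
#!/usr/bin/env bash
# add_host FILE HOST (hypothetical helper, not a Hadoop command):
# append HOST to FILE unless an identical line is already present.
add_host() {
    local file="$1" host="$2"
    touch "$file"
    # -x: match the whole line, -F: fixed string (dots are literal), -q: quiet
    if ! grep -qxF -- "$host" "$file"; then
        printf '%s\n' "$host" >> "$file"
    fi
}
```

    For example, add_host /usr/local/hadoop/etc/hadoop/dfs_exclude 10.122.147.37 would add the host from this walkthrough; the file still has to be copied to the other hosts and the NameNode refreshed afterwards.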

    3. Refresh the NameNode.

    [hadoop@10-124-147-22 hadoop]$ hdfs dfsadmin -refreshNodes
    Refresh nodes successful for hadoop1/10.124.147.22:9000
    Refresh nodes successful for hadoop2/10.124.147.23:9000

    4. Check the NameNode status.

    [hadoop@10-124-147-22 hadoop]$ hdfs dfsadmin -report
    Configured Capacity: 1100228980736 (1.00 TB)
    Present Capacity: 1087754866688 (1013.05 GB)
    DFS Remaining: 1087752667136 (1013.05 GB)
    DFS Used: 2199552 (2.10 MB)
    DFS Used%: 0.00%
    Under replicated blocks: 11
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    Missing blocks (with replication factor 1): 0
    

    Live datanodes (3):

    Name: 10.122.147.37:50010 (hadoop7)
    Hostname: hadoop7
    Decommission Status : Decommission in progress
    Configured Capacity: 250831044608 (233.60 GB)
    DFS Used: 733184 (716 KB)
    Non DFS Used: 1235771392 (1.15 GB)
    DFS Remaining: 249594540032 (232.45 GB)
    DFS Used%: 0.00%
    DFS Remaining%: 99.51%
    Configured Cache Capacity: 0 (0 B)
    Cache Used: 0 (0 B)
    Cache Remaining: 0 (0 B)
    Cache Used%: 100.00%
    Cache Remaining%: 0.00%
    Xceivers: 1
    Last contact: Tue Jul 24 10:25:17 CST 2018

    Name: 10.110.92.161:50010 (hadoop5)
    Hostname: hadoop5
    Decommission Status : Normal
    (remaining output omitted)

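    The status of the removed host 10.122.147.37 has changed to Decommission in progress, which means the cluster is moving the block replicas stored on that node to other nodes. Once the status becomes Decommissioned, the process has finished and the node is effectively out of the cluster.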

    The same status can also be seen on the HDFS web UI.
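
    To watch the status from a script rather than by eye, the report text can be filtered for one host. The sketch below assumes the report layout shown above, where each Hostname line is followed by that node's Decommission Status line; decom_status is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# decom_status HOST (hypothetical helper): read `hdfs dfsadmin -report`
# output on stdin and print the "Decommission Status" value of the
# datanode whose Hostname is HOST. Prints nothing if HOST is not found.
decom_status() {
    awk -v host="$1" '
        /^[ \t]*Hostname:/ { current = $2 }
        /^[ \t]*Decommission Status/ && current == host {
            sub(/^[^:]*:[ \t]*/, "")   # drop the "Decommission Status : " label
            print
            exit
        }
    '
}
```

    A polling loop could then be written as: until [ "$(hdfs dfsadmin -report | decom_status hadoop7)" = "Decommissioned" ]; do sleep 30; done. The 30-second interval is an arbitrary choice.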

    5. Refresh the ResourceManager.

    [hadoop@10-124-147-32 hadoop]$ yarn rmadmin -refreshNodes

    After the refresh, the Active Nodes information can be checked on the ResourceManager web UI,

    or from the command line:

    [hadoop@10-124-147-32 hadoop]$ yarn node -list
    Total Nodes:2
             Node-Id         Node-State Node-Http-Address   Number-of-Running-Containers
       hadoop5:37438            RUNNING      hadoop5:8042                              0
        hadoop6:9001            RUNNING      hadoop6:8042                              0

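    4. Adding a new host to the cluster

    1. Copy the existing Hadoop configuration files to the new host and install a Java environment on it.
    2. On the NameNode, add the new host's IP to the file named by the dfs.hosts parameter.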

    [hadoop@10-124-147-22 hadoop]$ cat /usr/local/hadoop/etc/hadoop/slaves 
    hadoop5
    hadoop6
    10.122.147.37

    3. Sync the slaves file to the other hosts.

    [hadoop@10-124-147-22 hadoop]$ for i in {2,3,4,5,6,7};do scp etc/hadoop/slaves hadoop$i:/usr/local/hadoop/etc/hadoop/;done

    4. Start the DataNode and NodeManager processes on the new host.

    [hadoop@10-122-147-37 hadoop]$ sbin/hadoop-daemon.sh start datanode
    starting datanode, logging to /letv/hadoop-2.7.6/logs/hadoop-hadoop-datanode-10-122-147-37.out
    [hadoop@10-122-147-37 hadoop]$ jps
    3068 DataNode
    6143 Jps
    [hadoop@10-122-147-37 hadoop]$ sbin/yarn-daemon.sh start nodemanager
    starting nodemanager, logging to /letv/hadoop-2.7.6/logs/yarn-hadoop-nodemanager-10-122-147-37.out
    [hadoop@10-122-147-37 hadoop]$ jps
    6211 NodeManager
    6403 Jps
    3068 DataNode

    5. Refresh the NameNode.

    [hadoop@10-124-147-22 hadoop]$ hdfs dfsadmin -refreshNodes
    Refresh nodes successful for hadoop1/10.124.147.22:9000
    Refresh nodes successful for hadoop2/10.124.147.23:9000

    6. Check the HDFS status.

    [hadoop@10-124-147-22 hadoop]$ hdfs dfsadmin -refreshNodes
    Refresh nodes successful for hadoop1/10.124.147.22:9000
    Refresh nodes successful for hadoop2/10.124.147.23:9000
    [hadoop@10-124-147-22 hadoop]$ hdfs dfsadmin -report
    Configured Capacity: 1351059292160 (1.23 TB)
    Present Capacity: 1337331367936 (1.22 TB)
    DFS Remaining: 1337329156096 (1.22 TB)
    DFS Used: 2211840 (2.11 MB)
    DFS Used%: 0.00%
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    Missing blocks (with replication factor 1): 0
    

    Live datanodes (3):

    Name: 10.122.147.37:50010 (hadoop7)
    Hostname: hadoop7
    Decommission Status : Normal
    Configured Capacity: 250831044608 (233.60 GB)
    DFS Used: 737280 (720 KB)
    Non DFS Used: 1240752128 (1.16 GB)
    DFS Remaining: 249589555200 (232.45 GB)
    DFS Used%: 0.00%
    DFS Remaining%: 99.51%
    Configured Cache Capacity: 0 (0 B)
    Cache Used: 0 (0 B)
    Cache Remaining: 0 (0 B)
    Cache Used%: 100.00%
    Cache Remaining%: 0.00%
    Xceivers: 1
    Last contact: Tue Jul 24 17:15:09 CST 2018

    Name: 10.110.92.161:50010 (hadoop5)
    Hostname: hadoop5
    Decommission Status : Normal
    Configured Capacity: 550114123776 (512.33 GB)
    DFS Used: 737280 (720 KB)
    Non DFS Used: 11195953152 (10.43 GB)
    DFS Remaining: 538917433344 (501.91 GB)
    DFS Used%: 0.00%
    DFS Remaining%: 97.96%
    Configured Cache Capacity: 0 (0 B)
    Cache Used: 0 (0 B)
    Cache Remaining: 0 (0 B)
    Cache Used%: 100.00%
    Cache Remaining%: 0.00%
    Xceivers: 1
    Last contact: Tue Jul 24 17:15:10 CST 2018

    Name: 10.110.92.162:50010 (hadoop6)
    Hostname: hadoop6
    Decommission Status : Normal
    Configured Capacity: 550114123776 (512.33 GB)
    DFS Used: 737280 (720 KB)
    Non DFS Used: 1291218944 (1.20 GB)
    DFS Remaining: 548822167552 (511.13 GB)
    DFS Used%: 0.00%
    DFS Remaining%: 99.77%
    Configured Cache Capacity: 0 (0 B)
    Cache Used: 0 (0 B)
    Cache Remaining: 0 (0 B)
    Cache Used%: 100.00%
    Cache Remaining%: 0.00%
    Xceivers: 1
    Last contact: Tue Jul 24 17:15:10 CST 2018

    7. Refresh the ResourceManager.

    [hadoop@10-124-147-32 hadoop]$ yarn rmadmin -refreshNodes
    [hadoop@10-124-147-32 hadoop]$ yarn node -list
    18/07/24 18:11:23 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
    Total Nodes:3
             Node-Id         Node-State Node-Http-Address   Number-of-Running-Containers
        hadoop7:3296            RUNNING      hadoop7:8042   
        hadoop5:37438           RUNNING      hadoop5:8042                              0
        hadoop6:9001            RUNNING      hadoop6:8042                              0
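
    To confirm from a script that a host has joined (or left) YARN, the listing can be reduced to the hosts in the RUNNING state. This is a sketch that assumes the column layout shown above, with Node-Id first and Node-State second; running_nodes is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# running_nodes (hypothetical helper): read `yarn node -list` output on
# stdin and print the host part of every Node-Id whose state is RUNNING.
# Log lines, the "Total Nodes" line and the header are skipped because
# their second field is not the word RUNNING.
running_nodes() {
    awk '$2 == "RUNNING" { split($1, id, ":"); print id[1] }'
}
```

    For example, yarn node -list | running_nodes | grep -qx hadoop7 succeeds only when hadoop7 is listed as RUNNING.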

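    8. How include and exclude affect YARN and HDFS

    A NodeManager can connect to the ResourceManager only if it appears in the include file and does not appear in the exclude file.

    The HDFS rules differ slightly from YARN's (in HDFS the include file is simply the file named by dfs.hosts); they are listed in the table below.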

    In include   In exclude   Result
    no           no           cannot connect
    no           yes          cannot connect
    yes          no           can connect
    yes          yes          can connect, will be decommissioned

    If no include file is specified, or the include file is empty, every node is treated as being in the include file.
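
    The two rule sets can be written down as small lookup functions, which makes the one difference easy to see: a host in both files is rejected by YARN but admitted and then decommissioned by HDFS. This sketch assumes the four table rows correspond to (no, no), (no, yes), (yes, no), (yes, yes) in that order; both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Both helpers take IN_INCLUDE and IN_EXCLUDE, each "yes" or "no".
# They are illustrations of the rules in the text, not Hadoop commands.

# hdfs_node_admission: what happens to a datanode under the HDFS rules.
hdfs_node_admission() {
    case "$1:$2" in
        no:no|no:yes) echo "cannot connect" ;;
        yes:no)       echo "can connect" ;;
        yes:yes)      echo "can connect, will be decommissioned" ;;
        *)            echo "usage: hdfs_node_admission yes|no yes|no" >&2; return 2 ;;
    esac
}

# yarn_node_admission: a nodemanager connects only if included and not excluded.
yarn_node_admission() {
    case "$1:$2" in
        yes:no)               echo "can connect" ;;
        no:no|no:yes|yes:yes) echo "cannot connect" ;;
        *)                    echo "usage: yarn_node_admission yes|no yes|no" >&2; return 2 ;;
    esac
}
```

    When the include file is unset or empty, pass yes as the first argument, since every node then counts as included.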

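    5. Problems encountered

    While removing a DataNode, the node can stay in the Decommission in progress state indefinitely. The test environment does not set a replication factor, so Hadoop's default of 3 applies, but the environment has only 3 DataNodes in total. With one of them being removed, the 2 remaining nodes cannot hold 3 replicas of each block, so the decommission never completes.

    Setting the replication factor to a value smaller than the number of DataNodes resolves the problem.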

    [hadoop@10-124-147-22 hadoop]$ grep dfs.replication -C3 /usr/local/hadoop/etc/hadoop/hdfs-site.xml
    

    <!-- replication factor -->
    <property>
    <name>dfs.replication</name>
    <value>1</value>
    </property>
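
    The condition behind this problem can be checked before starting a removal. The sketch below is a simplification under the assumption that every file uses the same replication factor; decommission_can_finish is a hypothetical helper name, and the three numbers are supplied by the caller rather than read from the cluster.

```shell
#!/usr/bin/env bash
# decommission_can_finish REPLICATION LIVE_DATANODES REMOVING
# (hypothetical helper): exit status 0 when the datanodes left after the
# removal can still hold REPLICATION replicas of a block, non-zero otherwise.
decommission_can_finish() {
    local replication="$1" live="$2" removing="$3"
    [ "$replication" -le $((live - removing)) ]
}
```

    In this test environment, decommission_can_finish 3 3 1 fails (3 replicas do not fit on 2 nodes), while decommission_can_finish 1 3 1 succeeds. Note that dfs.replication is the default for newly written files; files that already exist keep the factor they were written with, so they may need hdfs dfs -setrep as well.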

  • Original URL: https://www.cnblogs.com/yjt1993/p/10495243.html