• Recovering a crashed namenode from the secondary namenode


    To simulate a namenode crash, delete everything in the name directory, then restore the namenode from the secondary namenode's checkpoint.

    Environment: OS: CentOS 6.5 x64; software: Hadoop 1.2.1

    1. Change into the name directory and delete its contents.

    [huser@master name]$ pwd
    /home/huser/hadoop/tmp/dfs/name
    
    [huser@master name]$ ll
    drwxrwxr-x 2 huser huser 4096 4月 16 20:16 current
    drwxrwxr-x 2 huser huser 4096 4月 16 17:24 image
    -rw-rw-r-- 1 huser huser 0 4月 16 20:10 in_use.lock
    drwxrwxr-x 2 huser huser 4096 4月 16 18:55 previous.checkpoint
    
    [huser@master name]$ rm -R *
    [huser@master name]$ ls

    2. Stop the cluster and start it again. The NameNode fails to start: it is missing from the jps output.

    [huser@master hadoop-1.2.1]$ bin/stop-all.sh
    
    [huser@master hadoop-1.2.1]$ bin/start-all.sh 
    [huser@master hadoop-1.2.1]$ jps
    7160 SecondaryNameNode
    7229 JobTracker
    7369 Jps
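
Because `SecondaryNameNode` also contains the string `NameNode`, a plain `grep NameNode` on the `jps` output would report a false positive here. The following is a minimal sketch of an exact-name check, not part of the original procedure; the helper name `has_namenode` is hypothetical, and the sample text is the `jps` output shown above.

```shell
#!/bin/sh
# has_namenode (hypothetical helper): reads jps output on stdin and
# succeeds only if a process named exactly "NameNode" is listed.
has_namenode() {
    awk '$2 == "NameNode" { found = 1 } END { exit !found }'
}

# Sample input: the jps output from step 2, where the NameNode did not come up.
sample='7160 SecondaryNameNode
7229 JobTracker
7369 Jps'

if printf '%s\n' "$sample" | has_namenode; then
    echo "NameNode is running"
else
    echo "NameNode is NOT running"
fi
```

On a live node the same check would be `jps | has_namenode`.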

    3. Stop the cluster and format the namenode.

    [huser@master hadoop-1.2.1]$ bin/stop-all.sh
    
    [huser@master hadoop-1.2.1]$ bin/hadoop namenode -format
    14/04/16 21:17:39 INFO namenode.NameNode: STARTUP_MSG: 
    /************************************************************
    STARTUP_MSG: Starting NameNode
    STARTUP_MSG: host = master/192.168.1.115
    STARTUP_MSG: args = [-format]
    STARTUP_MSG: version = 1.2.1
    STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
    STARTUP_MSG: java = 1.7.0_51
    ************************************************************/
    Re-format filesystem in /home/huser/hadoop/tmp/dfs/name ? (Y or N) Y
    14/04/16 21:17:42 INFO util.GSet: Computing capacity for map BlocksMap
    14/04/16 21:17:42 INFO util.GSet: VM type = 64-bit
    14/04/16 21:17:42 INFO util.GSet: 2.0% max memory = 1013645312
    14/04/16 21:17:42 INFO util.GSet: capacity = 2^21 = 2097152 entries
    14/04/16 21:17:42 INFO util.GSet: recommended=2097152, actual=2097152
    14/04/16 21:17:43 INFO namenode.FSNamesystem: fsOwner=huser
    14/04/16 21:17:43 INFO namenode.FSNamesystem: supergroup=supergroup
    14/04/16 21:17:43 INFO namenode.FSNamesystem: isPermissionEnabled=true
    14/04/16 21:17:43 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
    14/04/16 21:17:43 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
    14/04/16 21:17:43 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
    14/04/16 21:17:43 INFO namenode.NameNode: Caching file names occuring more than 10 times 
    14/04/16 21:17:43 INFO common.Storage: Image file /home/huser/hadoop/tmp/dfs/name/current/fsimage of size 111 bytes saved in 0 seconds.
    14/04/16 21:17:43 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/home/huser/hadoop/tmp/dfs/name/current/edits
    14/04/16 21:17:43 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/home/huser/hadoop/tmp/dfs/name/current/edits
    14/04/16 21:17:44 INFO common.Storage: Storage directory /home/huser/hadoop/tmp/dfs/name has been successfully formatted.
    14/04/16 21:17:44 INFO namenode.NameNode: SHUTDOWN_MSG: 
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at master/192.168.1.115
    ************************************************************/

    4. Get the namespaceID from a datanode.

    [huser@master hadoop-1.2.1]$ ssh slave1
    
    [huser@slave1 current]$ pwd
    /home/huser/hadoop/tmp/dfs/data/current
    
    [huser@slave1 current]$ ll
    -rw-rw-r-- 1 huser huser 49184 4月 16 18:43 blk_-1800088935645150399
    -rw-rw-r-- 1 huser huser 395 4月 16 18:43 blk_-1800088935645150399_1013.meta
    -rw-rw-r-- 1 huser huser 25 4月 16 18:43 blk_269963827714855400
    -rw-rw-r-- 1 huser huser 11 4月 16 18:43 blk_269963827714855400_1014.meta
    -rw-rw-r-- 1 huser huser 16353 4月 16 18:43 blk_4611281727215307463
    -rw-rw-r-- 1 huser huser 135 4月 16 18:43 blk_4611281727215307463_1015.meta
    -rw-rw-r-- 1 huser huser 769 4月 16 19:32 dncp_block_verification.log.curr
    -rw-rw-r-- 1 huser huser 158 4月 16 19:51 VERSION
    
    [huser@slave1 current]$ cat VERSION 
    #Wed Apr 16 19:51:23 CST 2014
    namespaceID=589801292
    storageID=DS-1065963269-192.168.1.111-50010-1397640950581
    cTime=0
    storageType=DATA_NODE
    layoutVersion=-41
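
Reading the value by eye works for one node; a scripted lookup can help when it has to be copied exactly. A minimal sketch, assuming the `key=value` layout of the VERSION file shown above; the helper name `read_namespace_id` is hypothetical, and the demo runs against a temporary copy rather than the real datanode directory.

```shell
#!/bin/sh
# read_namespace_id (hypothetical helper): print the namespaceID value
# from the VERSION file given as $1.
read_namespace_id() {
    sed -n 's/^namespaceID=//p' "$1"
}

# Demo on a temporary copy of the datanode VERSION file from step 4.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
#Wed Apr 16 19:51:23 CST 2014
namespaceID=589801292
storageID=DS-1065963269-192.168.1.111-50010-1397640950581
cTime=0
storageType=DATA_NODE
layoutVersion=-41
EOF

read_namespace_id "$tmp"   # prints 589801292
rm -f "$tmp"
```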

    5. Edit the namenode's VERSION file so that its namespaceID matches the datanode's value.

    [huser@slave1 current]$ exit
    logout
    
    [huser@master current]$ pwd
    /home/huser/hadoop/tmp/dfs/name/current
    
    [huser@master current]$ vi VERSION 
    #Wed Apr 16 21:17:43 CST 2014
    namespaceID=589801292
    cTime=0
    storageType=NAME_NODE
    layoutVersion=-41
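
The same edit can be made without opening an editor. A minimal sketch, not the author's method: the helper name `set_namespace_id` is hypothetical, it keeps a `.bak` copy of the file, and the starting namespaceID in the demo (`123456789`) is a made-up placeholder because the original post does not show the value written by the format.

```shell
#!/bin/sh
# set_namespace_id (hypothetical helper): rewrite the namespaceID line of
# the VERSION file $1 to the value $2, keeping the original as $1.bak.
set_namespace_id() {
    cp "$1" "$1.bak" &&
    sed "s/^namespaceID=.*/namespaceID=$2/" "$1.bak" > "$1"
}

# Demo on a temporary file standing in for name/current/VERSION.
# 123456789 is a placeholder, not a value from the original post.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
#Wed Apr 16 21:17:43 CST 2014
namespaceID=123456789
cTime=0
storageType=NAME_NODE
layoutVersion=-41
EOF

set_namespace_id "$tmp" 589801292
cat "$tmp"
rm -f "$tmp" "$tmp.bak"
```

Only the namespaceID line changes; the other fields stay as the format wrote them.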

    6. Delete the fsimage file on the namenode.

    [huser@master current]$ rm fsimage 
    [huser@master current]$ ll
    -rw-rw-r-- 1 huser huser 4 4月 16 21:17 edits
    -rw-rw-r-- 1 huser huser 8 4月 16 21:17 fstime
    -rw-rw-r-- 1 huser huser 100 4月 16 21:30 VERSION

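    7. Copy the fsimage file from the secondary namenode's checkpoint directory to the namenode. The checkpoint only reflects the namespace as of the last checkpoint (20:16 here), so changes made after that point are not recovered.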

    [huser@master current]$ pwd
    /home/huser/hadoop/tmp/dfs/namesecondary/current
    [huser@master current]$ ll
    -rw-rw-r-- 1 huser huser 4 4月 16 20:16 edits
    -rw-rw-r-- 1 huser huser 2259 4月 16 20:16 fsimage
    -rw-rw-r-- 1 huser huser 8 4月 16 20:16 fstime
    -rw-rw-r-- 1 huser huser 100 4月 16 20:16 VERSION
    
    [huser@master current]$ cp fsimage /home/huser/hadoop/tmp/dfs/name/current/
    
    [huser@master current]$ cd /home/huser/hadoop/tmp/dfs/name/current/
    [huser@master current]$ ll
    -rw-rw-r-- 1 huser huser 4 4月 16 21:17 edits
    -rw-rw-r-- 1 huser huser 2259 4月 16 21:37 fsimage
    -rw-rw-r-- 1 huser huser 8 4月 16 21:17 fstime
    -rw-rw-r-- 1 huser huser 100 4月 16 21:30 VERSION
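
Steps 6 and 7 compare the files by size in the `ll` listing. A byte-for-byte comparison is a slightly stronger check. The sketch below combines both steps under that assumption; the helper name `restore_fsimage` is hypothetical, and the demo uses throwaway directories in place of `namesecondary/current` and `name/current`.

```shell
#!/bin/sh
# restore_fsimage (hypothetical helper): replace the fsimage in the
# namenode directory $2 with the one from the checkpoint directory $1,
# then confirm the two files are byte-identical.
restore_fsimage() {
    rm -f "$2/fsimage" &&
    cp "$1/fsimage" "$2/fsimage" &&
    cmp -s "$1/fsimage" "$2/fsimage"
}

# Demo with temporary directories and dummy file contents.
src=$(mktemp -d)
dst=$(mktemp -d)
printf 'checkpoint image' > "$src/fsimage"
printf 'empty image' > "$dst/fsimage"

if restore_fsimage "$src" "$dst"; then
    echo "fsimage restored"
else
    echo "restore failed"
fi
rm -rf "$src" "$dst"
```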

    8. Restart the cluster and check that the NameNode is running.

    [huser@master hadoop-1.2.1]$ jps
    7927 SecondaryNameNode
    7773 NameNode
    8017 JobTracker
    8123 Jps
  • Original article: https://www.cnblogs.com/guarder/p/3703808.html