                How the NameNode and SecondaryNameNode Work

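                                         Author: Yin Zhengjie (尹正杰)

    Copyright notice: this is an original work. Reproduction is not permitted; violators will be held legally liable.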

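    I. HDFS startup process

    1>. Start the HDFS cluster

      I won't repeat how to deploy the cluster here; newcomers can refer to my earlier notes.
      Recommended reading: https://www.cnblogs.com/yinzhengjie2020/p/12424192.html

      Start the HDFS cluster as shown below. Once the cluster is up, the NameNode automatically generates a new edit log and a new image file.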
    [root@hadoop101.yinzhengjie.org.cn ~]# start-dfs.sh
    Starting namenodes on [hadoop101.yinzhengjie.org.cn]
    hadoop101.yinzhengjie.org.cn: starting namenode, logging to /yinzhengjie/softwares/hadoop-2.10.0/logs/hadoop-root-namenode-hadoop101.yinzhengjie.org.cn.out
    hadoop102.yinzhengjie.org.cn: starting datanode, logging to /yinzhengjie/softwares/hadoop-2.10.0/logs/hadoop-root-datanode-hadoop102.yinzhengjie.org.cn.out
    hadoop104.yinzhengjie.org.cn: starting datanode, logging to /yinzhengjie/softwares/hadoop-2.10.0/logs/hadoop-root-datanode-hadoop104.yinzhengjie.org.cn.out
    hadoop103.yinzhengjie.org.cn: starting datanode, logging to /yinzhengjie/softwares/hadoop-2.10.0/logs/hadoop-root-datanode-hadoop103.yinzhengjie.org.cn.out
    Starting secondary namenodes [hadoop105.yinzhengjie.org.cn]
    hadoop105.yinzhengjie.org.cn: starting secondarynamenode, logging to /yinzhengjie/softwares/hadoop-2.10.0/logs/hadoop-root-secondarynamenode-hadoop105.yinzhengjie.org.cn.out
    [root@hadoop101.yinzhengjie.org.cn ~]# 

    2>. Access the NameNode WebUI

    [root@hadoop101.yinzhengjie.org.cn ~]# ll /yinzhengjie/softwares/hadoop-2.10.0/data/tmp/dfs/name/current/
    total 9364
    -rw-r--r-- 1 root root      42 Mar 12 03:27 edits_0000000000000000001-0000000000000000002
    -rw-r--r-- 1 root root      42 Mar 12 03:27 edits_0000000000000000003-0000000000000000004
    -rw-r--r-- 1 root root 1048576 Mar 12 03:27 edits_0000000000000000005-0000000000000000005
    -rw-r--r-- 1 root root 1048576 Mar 12 03:27 edits_0000000000000000006-0000000000000000006
    -rw-r--r-- 1 root root      42 Mar 12 03:29 edits_0000000000000000007-0000000000000000008
    -rw-r--r-- 1 root root 1048576 Mar 12 03:29 edits_0000000000000000009-0000000000000000009
    -rw-r--r-- 1 root root      42 Mar 12 03:32 edits_0000000000000000010-0000000000000000011
    -rw-r--r-- 1 root root      42 Mar 12 04:32 edits_0000000000000000012-0000000000000000013
    -rw-r--r-- 1 root root 1048576 Mar 12 04:32 edits_0000000000000000014-0000000000000000014
    -rw-r--r-- 1 root root      42 Mar 12 04:57 edits_0000000000000000015-0000000000000000016
    -rw-r--r-- 1 root root 1048576 Mar 12 04:57 edits_0000000000000000017-0000000000000000017
    -rw-r--r-- 1 root root      42 Mar 12 05:03 edits_0000000000000000018-0000000000000000019
    -rw-r--r-- 1 root root 1048576 Mar 12 05:03 edits_0000000000000000020-0000000000000000020
    -rw-r--r-- 1 root root      42 Mar 12 07:46 edits_0000000000000000021-0000000000000000022
    -rw-r--r-- 1 root root 1048576 Mar 12 07:46 edits_0000000000000000023-0000000000000000023
    -rw-r--r-- 1 root root      42 Mar 12 08:41 edits_0000000000000000024-0000000000000000025
    -rw-r--r-- 1 root root      42 Mar 12 09:41 edits_0000000000000000026-0000000000000000027
    -rw-r--r-- 1 root root      42 Mar 12 10:41 edits_0000000000000000028-0000000000000000029
    -rw-r--r-- 1 root root      42 Mar 12 11:41 edits_0000000000000000030-0000000000000000031
    -rw-r--r-- 1 root root      42 Mar 12 12:41 edits_0000000000000000032-0000000000000000033
    -rw-r--r-- 1 root root      42 Mar 12 13:41 edits_0000000000000000034-0000000000000000035
    -rw-r--r-- 1 root root      42 Mar 12 14:41 edits_0000000000000000036-0000000000000000037
    -rw-r--r-- 1 root root     672 Mar 12 15:41 edits_0000000000000000038-0000000000000000046
    -rw-r--r-- 1 root root 1048576 Mar 12 15:41 edits_0000000000000000047-0000000000000000047
    -rw-r--r-- 1 root root      42 Mar 12 15:54 edits_0000000000000000048-0000000000000000049
    -rw-r--r-- 1 root root   23130 Mar 12 16:54 edits_0000000000000000050-0000000000000000237
    -rw-r--r-- 1 root root    1140 Mar 12 17:54 edits_0000000000000000238-0000000000000000246
    -rw-r--r-- 1 root root      42 Mar 12 18:54 edits_0000000000000000247-0000000000000000248
    -rw-r--r-- 1 root root   24270 Mar 12 19:54 edits_0000000000000000249-0000000000000000448
    -rw-r--r-- 1 root root 1048576 Mar 12 19:54 edits_inprogress_0000000000000000449
    -rw-r--r-- 1 root root    2603 Mar 12 18:54 fsimage_0000000000000000248
    -rw-r--r-- 1 root root      62 Mar 12 18:54 fsimage_0000000000000000248.md5
    -rw-r--r-- 1 root root    3595 Mar 12 19:54 fsimage_0000000000000000448
    -rw-r--r-- 1 root root      62 Mar 12 19:54 fsimage_0000000000000000448.md5
    -rw-r--r-- 1 root root       4 Mar 12 19:54 seen_txid
    -rw-r--r-- 1 root root     217 Mar 12 08:40 VERSION
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# jps
    5329 Jps
    [root@hadoop101.yinzhengjie.org.cn ~]# 

    [root@hadoop101.yinzhengjie.org.cn ~]# ll /yinzhengjie/softwares/hadoop-2.10.0/data/tmp/dfs/name/current/
    total 10388
    -rw-r--r-- 1 root root      42 Mar 12 03:27 edits_0000000000000000001-0000000000000000002
    -rw-r--r-- 1 root root      42 Mar 12 03:27 edits_0000000000000000003-0000000000000000004
    -rw-r--r-- 1 root root 1048576 Mar 12 03:27 edits_0000000000000000005-0000000000000000005
    -rw-r--r-- 1 root root 1048576 Mar 12 03:27 edits_0000000000000000006-0000000000000000006
    -rw-r--r-- 1 root root      42 Mar 12 03:29 edits_0000000000000000007-0000000000000000008
    -rw-r--r-- 1 root root 1048576 Mar 12 03:29 edits_0000000000000000009-0000000000000000009
    -rw-r--r-- 1 root root      42 Mar 12 03:32 edits_0000000000000000010-0000000000000000011
    -rw-r--r-- 1 root root      42 Mar 12 04:32 edits_0000000000000000012-0000000000000000013
    -rw-r--r-- 1 root root 1048576 Mar 12 04:32 edits_0000000000000000014-0000000000000000014
    -rw-r--r-- 1 root root      42 Mar 12 04:57 edits_0000000000000000015-0000000000000000016
    -rw-r--r-- 1 root root 1048576 Mar 12 04:57 edits_0000000000000000017-0000000000000000017
    -rw-r--r-- 1 root root      42 Mar 12 05:03 edits_0000000000000000018-0000000000000000019
    -rw-r--r-- 1 root root 1048576 Mar 12 05:03 edits_0000000000000000020-0000000000000000020
    -rw-r--r-- 1 root root      42 Mar 12 07:46 edits_0000000000000000021-0000000000000000022
    -rw-r--r-- 1 root root 1048576 Mar 12 07:46 edits_0000000000000000023-0000000000000000023
    -rw-r--r-- 1 root root      42 Mar 12 08:41 edits_0000000000000000024-0000000000000000025
    -rw-r--r-- 1 root root      42 Mar 12 09:41 edits_0000000000000000026-0000000000000000027
    -rw-r--r-- 1 root root      42 Mar 12 10:41 edits_0000000000000000028-0000000000000000029
    -rw-r--r-- 1 root root      42 Mar 12 11:41 edits_0000000000000000030-0000000000000000031
    -rw-r--r-- 1 root root      42 Mar 12 12:41 edits_0000000000000000032-0000000000000000033
    -rw-r--r-- 1 root root      42 Mar 12 13:41 edits_0000000000000000034-0000000000000000035
    -rw-r--r-- 1 root root      42 Mar 12 14:41 edits_0000000000000000036-0000000000000000037
    -rw-r--r-- 1 root root     672 Mar 12 15:41 edits_0000000000000000038-0000000000000000046
    -rw-r--r-- 1 root root 1048576 Mar 12 15:41 edits_0000000000000000047-0000000000000000047
    -rw-r--r-- 1 root root      42 Mar 12 15:54 edits_0000000000000000048-0000000000000000049
    -rw-r--r-- 1 root root   23130 Mar 12 16:54 edits_0000000000000000050-0000000000000000237
    -rw-r--r-- 1 root root    1140 Mar 12 17:54 edits_0000000000000000238-0000000000000000246
    -rw-r--r-- 1 root root      42 Mar 12 18:54 edits_0000000000000000247-0000000000000000248
    -rw-r--r-- 1 root root   24270 Mar 12 19:54 edits_0000000000000000249-0000000000000000448
    -rw-r--r-- 1 root root 1048576 Mar 12 19:54 edits_0000000000000000449-0000000000000000449
    -rw-r--r-- 1 root root 1048576 Mar 17 12:27 edits_inprogress_0000000000000000450
    -rw-r--r-- 1 root root    3595 Mar 12 19:54 fsimage_0000000000000000448
    -rw-r--r-- 1 root root      62 Mar 12 19:54 fsimage_0000000000000000448.md5
    -rw-r--r-- 1 root root    3595 Mar 17 12:27 fsimage_0000000000000000449
    -rw-r--r-- 1 root root      62 Mar 17 12:27 fsimage_0000000000000000449.md5
    -rw-r--r-- 1 root root       4 Mar 17 12:27 seen_txid
    -rw-r--r-- 1 root root     217 Mar 17 12:27 VERSION
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# jps
    5752 Jps
    5449 NameNode
    [root@hadoop101.yinzhengjie.org.cn ~]# 

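    3>. NameNode startup process analysis

       From the steps above, you can probably tell that NameNode startup breaks down into roughly four steps:
        (1) Load the image file (named "fsimage_*"; the one with the highest numeric suffix) into memory;
        (2) Load the edit logs (named "edits_*"; the number of the segment currently being written is recorded in the "seen_txid" file in the same directory) into memory;
        (3) Save a checkpoint;
        (4) Enter safe mode.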
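Steps (1) and (2) can be sketched in a few lines of Python. This is only an illustration based on the file names in the listings above: the helper `plan_startup` is hypothetical and not part of Hadoop, and the real NameNode performs far more validation.

```python
import re

def plan_startup(filenames):
    """Return (newest image txid, edit files that start after that txid)."""
    image_txids = []
    for name in filenames:
        m = re.fullmatch(r"fsimage_(\d+)", name)  # skips the *.md5 companions
        if m:
            image_txids.append(int(m.group(1)))
    latest = max(image_txids)  # step (1): the image with the highest suffix

    to_replay = []
    for name in filenames:
        m = re.fullmatch(r"edits_(?:inprogress_)?(\d+)(?:-\d+)?", name)
        if m and int(m.group(1)) > latest:
            to_replay.append(name)  # step (2): only edits newer than the image
    return latest, sorted(to_replay)
```

Passing the output of `os.listdir()` on the `name/current/` directory to `plan_startup` gives the image that would be loaded and the edit segments that would be replayed on top of it.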
    
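      I raise a few questions about the startup process below and answer them myself. My knowledge is limited, so anyone who can add to these answers is welcome to leave a comment.
        Q1: What is an image file?
          A: An image file is a persistent checkpoint of the HDFS file system metadata. It contains the serialized inodes of every directory and file in the file system. As the listings above show, the NameNode loads the newest image file at startup, performs a checkpoint to produce a fresh image file, and deletes older ones. By default only 2 image files are retained.
    
        Q2: What is the edit log?
          A: It records every update operation on the HDFS file system. The listings above show two kinds of edit log files: those named "edits_*" and the one named "edits_inprogress_*". Every write operation from a file system client is first recorded in the "edits_inprogress_*" file, and only then is the in-memory metadata updated. This ordering guards against data loss.
    
        Q3: What is the seen_txid file for?
          A: It holds a single number: the ID of the edit log that can currently be written to. In other words, it stores the numeric suffix of the "edits_inprogress_*" file.
    
        Q4: Files named "edits_*" and the file named "edits_inprogress_*" are both edit logs. What is the difference?
          A: Files named "edits_*" are finalized and normally receive no further writes, while the "edits_inprogress_*" file is where all new update operations are recorded. If a NameNode runs for more than a year without a restart, its data directory will contain many "edits_*" files, but there is always exactly one "edits_inprogress_*" file.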
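The distinction between finalized and in-progress segments can be checked mechanically. The following Python sketch assumes only the file-name pattern seen in the listings; `summarize_edits` is a hypothetical helper, not a Hadoop API.

```python
import re

def summarize_edits(filenames):
    """Split edit log names into finalized ranges and in-progress start txids."""
    finalized, in_progress = [], []
    for name in filenames:
        m = re.fullmatch(r"edits_(\d+)-(\d+)", name)
        if m:
            finalized.append((int(m.group(1)), int(m.group(2))))
            continue
        m = re.fullmatch(r"edits_inprogress_(\d+)", name)
        if m:
            in_progress.append(int(m.group(1)))
    finalized.sort()
    # A gap exists when a segment does not start right after the previous one ends.
    gaps = [(prev[1], nxt[0])
            for prev, nxt in zip(finalized, finalized[1:])
            if nxt[0] != prev[1] + 1]
    return finalized, in_progress, gaps
```

On a healthy NameNode directory one would expect `in_progress` to contain exactly one entry and `gaps` to be empty.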
    
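        Q5: What is the VERSION file in the NameNode data directory for?
          A: It records the version information of the cluster in several fields: namespaceID, clusterID, cTime, storageType, blockpoolID, and layoutVersion. These fields were introduced earlier when we deployed the cluster.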
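The VERSION file is a plain key=value text file, so it is easy to read programmatically. A minimal Python sketch follows; the field names come from the answer above, but every value in `SAMPLE` is a made-up placeholder, not taken from this cluster, and `parse_version` is a hypothetical helper.

```python
def parse_version(text):
    """Parse key=value lines, ignoring blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

# Placeholder content for illustration only; real values differ per cluster.
SAMPLE = """\
#placeholder timestamp comment
namespaceID=123456789
clusterID=CID-00000000-0000-0000-0000-000000000000
cTime=0
storageType=NAME_NODE
blockpoolID=BP-000000000-127.0.0.1-0000000000000
layoutVersion=-63
"""

if __name__ == "__main__":
    for key, value in parse_version(SAMPLE).items():
        print(key, "=", value)
```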
    
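        Q6: What does saving a checkpoint mean?
          A: The third step of NameNode startup is "Saving checkpoint". In effect, it writes the in-memory metadata to local disk, producing a new persistent image file. The file name "fsimage.ckpt_0000000000000000449" is briefly visible during this step; the step is very fast, and once it completes the result is the "fsimage_0000000000000000449" file we see in the listing.
    
        Q7: What is safe mode?
          A: Safe mode is a Hadoop protection mechanism that safeguards the data blocks in the cluster. While the NameNode is in safe mode, the HDFS file system is read-only for clients, so write operations are not possible. When a freshly formatted HDFS cluster starts, there are no blocks in the system yet, so the NameNode does not enter safe mode. Likewise, a cluster holding relatively little data leaves safe mode quickly.
    
        Q8: When does the NameNode leave safe mode?
          A: Once the minimum replication condition is met, the NameNode leaves safe mode 30 seconds later. The minimum replication condition means that 99.9% of the blocks in the whole file system (configurable via "dfs.namenode.safemode.threshold-pct") meet the minimum replication level (default: dfs.namenode.replication.min=1).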
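The exit condition can be written as a small predicate. This Python sketch is a simplification under stated assumptions: `can_leave_safemode` is a hypothetical name, the 99.9% threshold is the default mentioned above, and the 30-second delay is modeled only as a constant.

```python
SAFEMODE_EXTENSION_SECONDS = 30  # delay after the condition is first met

def can_leave_safemode(total_blocks, safe_blocks, threshold_pct=0.999):
    """True when enough blocks have reached the minimum replication level.

    safe_blocks counts blocks that already have at least
    dfs.namenode.replication.min replicas reported.
    """
    if total_blocks == 0:
        # A freshly formatted file system has no blocks to wait for.
        return True
    return safe_blocks >= threshold_pct * total_blocks
```

For example, with 10,000 blocks the condition holds once roughly 9,990 of them have been reported with at least one replica.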
    
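        Q9: Where exactly is the NameNode metadata stored?
          A: In memory. But if it lived only in memory, a power failure would lose the metadata and the whole cluster would stop working, so the in-memory data must also be persisted to disk.
      
        Q10: Since we have the image file, why do we also need the edit log?
          A: For efficiency. If the image file were rewritten every time the in-memory metadata changed, performance would suffer badly, because memory is far faster than disk. But if the image were never updated, memory and disk would become inconsistent, and a NameNode crash would lose data. The edit log solves this: it is append-only, which is very cheap. Whenever metadata is added or changed, the NameNode updates the in-memory metadata and appends the change to the edit log. After a power failure, the metadata can be rebuilt by merging the image file with the edit logs.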
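The "snapshot plus append-only log" idea can be shown with a toy model. In this Python sketch the image is a plain dict and the operation names `create` and `delete` are invented for illustration; they are not real HDFS opcodes.

```python
def replay(image, edits):
    """Rebuild the namespace by applying logged edits on top of a snapshot."""
    namespace = dict(image)  # never mutate the on-disk snapshot itself
    for op, path, size in edits:
        if op == "create":
            namespace[path] = size
        elif op == "delete":
            namespace.pop(path, None)
        else:
            raise ValueError("unknown operation: %s" % op)
    return namespace

if __name__ == "__main__":
    image = {"/a.txt": 10}                      # last saved snapshot
    edits = [("create", "/b.txt", 20),          # changes appended since then
             ("delete", "/a.txt", None)]
    print(replay(image, edits))
```

Appending a tuple to `edits` is cheap, while producing a new `image` means serializing the whole namespace, which is the trade-off the answer describes.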

    II. How the SecondaryNameNode works

    1>. After starting the NameNode, compare the data on the NameNode and the SecondaryNameNode

    [root@hadoop101.yinzhengjie.org.cn ~]# ll /yinzhengjie/softwares/hadoop-2.10.0/data/tmp/dfs/name/current/
    total 10388
    -rw-r--r-- 1 root root      42 Mar 12 03:27 edits_0000000000000000001-0000000000000000002
    -rw-r--r-- 1 root root      42 Mar 12 03:27 edits_0000000000000000003-0000000000000000004
    -rw-r--r-- 1 root root 1048576 Mar 12 03:27 edits_0000000000000000005-0000000000000000005
    -rw-r--r-- 1 root root 1048576 Mar 12 03:27 edits_0000000000000000006-0000000000000000006
    -rw-r--r-- 1 root root      42 Mar 12 03:29 edits_0000000000000000007-0000000000000000008
    -rw-r--r-- 1 root root 1048576 Mar 12 03:29 edits_0000000000000000009-0000000000000000009
    -rw-r--r-- 1 root root      42 Mar 12 03:32 edits_0000000000000000010-0000000000000000011
    -rw-r--r-- 1 root root      42 Mar 12 04:32 edits_0000000000000000012-0000000000000000013
    -rw-r--r-- 1 root root 1048576 Mar 12 04:32 edits_0000000000000000014-0000000000000000014
    -rw-r--r-- 1 root root      42 Mar 12 04:57 edits_0000000000000000015-0000000000000000016
    -rw-r--r-- 1 root root 1048576 Mar 12 04:57 edits_0000000000000000017-0000000000000000017
    -rw-r--r-- 1 root root      42 Mar 12 05:03 edits_0000000000000000018-0000000000000000019
    -rw-r--r-- 1 root root 1048576 Mar 12 05:03 edits_0000000000000000020-0000000000000000020
    -rw-r--r-- 1 root root      42 Mar 12 07:46 edits_0000000000000000021-0000000000000000022
    -rw-r--r-- 1 root root 1048576 Mar 12 07:46 edits_0000000000000000023-0000000000000000023
    -rw-r--r-- 1 root root      42 Mar 12 08:41 edits_0000000000000000024-0000000000000000025
    -rw-r--r-- 1 root root      42 Mar 12 09:41 edits_0000000000000000026-0000000000000000027
    -rw-r--r-- 1 root root      42 Mar 12 10:41 edits_0000000000000000028-0000000000000000029
    -rw-r--r-- 1 root root      42 Mar 12 11:41 edits_0000000000000000030-0000000000000000031
    -rw-r--r-- 1 root root      42 Mar 12 12:41 edits_0000000000000000032-0000000000000000033
    -rw-r--r-- 1 root root      42 Mar 12 13:41 edits_0000000000000000034-0000000000000000035
    -rw-r--r-- 1 root root      42 Mar 12 14:41 edits_0000000000000000036-0000000000000000037
    -rw-r--r-- 1 root root     672 Mar 12 15:41 edits_0000000000000000038-0000000000000000046
    -rw-r--r-- 1 root root 1048576 Mar 12 15:41 edits_0000000000000000047-0000000000000000047
    -rw-r--r-- 1 root root      42 Mar 12 15:54 edits_0000000000000000048-0000000000000000049
    -rw-r--r-- 1 root root   23130 Mar 12 16:54 edits_0000000000000000050-0000000000000000237
    -rw-r--r-- 1 root root    1140 Mar 12 17:54 edits_0000000000000000238-0000000000000000246
    -rw-r--r-- 1 root root      42 Mar 12 18:54 edits_0000000000000000247-0000000000000000248
    -rw-r--r-- 1 root root   24270 Mar 12 19:54 edits_0000000000000000249-0000000000000000448
    -rw-r--r-- 1 root root 1048576 Mar 12 19:54 edits_0000000000000000449-0000000000000000449
    -rw-r--r-- 1 root root 1048576 Mar 17 14:08 edits_inprogress_0000000000000000450
    -rw-r--r-- 1 root root    3595 Mar 12 19:54 fsimage_0000000000000000448
    -rw-r--r-- 1 root root      62 Mar 12 19:54 fsimage_0000000000000000448.md5
    -rw-r--r-- 1 root root    3595 Mar 17 14:08 fsimage_0000000000000000449
    -rw-r--r-- 1 root root      62 Mar 17 14:08 fsimage_0000000000000000449.md5
    -rw-r--r-- 1 root root       4 Mar 17 14:08 seen_txid
    -rw-r--r-- 1 root root     217 Mar 17 14:08 VERSION
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# jps
    5669 Jps
    5406 NameNode
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop105.yinzhengjie.org.cn ~]# ll /yinzhengjie/softwares/hadoop-2.10.0/data/tmp/dfs/namesecondary/current/
    total 6288
    -rw-r--r-- 1 root root      42 Mar 12 01:26 edits_0000000000000000001-0000000000000000002
    -rw-r--r-- 1 root root      42 Mar 12 02:26 edits_0000000000000000003-0000000000000000004
    -rw-r--r-- 1 root root 1048576 Mar 12 03:29 edits_0000000000000000005-0000000000000000005
    -rw-r--r-- 1 root root 1048576 Mar 12 03:29 edits_0000000000000000006-0000000000000000006
    -rw-r--r-- 1 root root      42 Mar 12 03:29 edits_0000000000000000007-0000000000000000008
    -rw-r--r-- 1 root root 1048576 Mar 12 03:32 edits_0000000000000000009-0000000000000000009
    -rw-r--r-- 1 root root      42 Mar 12 03:32 edits_0000000000000000010-0000000000000000011
    -rw-r--r-- 1 root root      42 Mar 12 04:32 edits_0000000000000000012-0000000000000000013
    -rw-r--r-- 1 root root 1048576 Mar 12 04:57 edits_0000000000000000014-0000000000000000014
    -rw-r--r-- 1 root root      42 Mar 12 04:57 edits_0000000000000000015-0000000000000000016
    -rw-r--r-- 1 root root 1048576 Mar 12 05:03 edits_0000000000000000017-0000000000000000017
    -rw-r--r-- 1 root root      42 Mar 12 05:03 edits_0000000000000000018-0000000000000000019
    -rw-r--r-- 1 root root      42 Mar 12 07:46 edits_0000000000000000021-0000000000000000022
    -rw-r--r-- 1 root root      42 Mar 12 08:41 edits_0000000000000000024-0000000000000000025
    -rw-r--r-- 1 root root      42 Mar 12 09:41 edits_0000000000000000026-0000000000000000027
    -rw-r--r-- 1 root root      42 Mar 12 10:41 edits_0000000000000000028-0000000000000000029
    -rw-r--r-- 1 root root      42 Mar 12 11:41 edits_0000000000000000030-0000000000000000031
    -rw-r--r-- 1 root root      42 Mar 12 12:41 edits_0000000000000000032-0000000000000000033
    -rw-r--r-- 1 root root      42 Mar 12 13:41 edits_0000000000000000034-0000000000000000035
    -rw-r--r-- 1 root root      42 Mar 12 14:41 edits_0000000000000000036-0000000000000000037
    -rw-r--r-- 1 root root     672 Mar 12 15:41 edits_0000000000000000038-0000000000000000046
    -rw-r--r-- 1 root root 1048576 Mar 12 15:54 edits_0000000000000000047-0000000000000000047
    -rw-r--r-- 1 root root      42 Mar 12 15:54 edits_0000000000000000048-0000000000000000049
    -rw-r--r-- 1 root root   23130 Mar 12 16:54 edits_0000000000000000050-0000000000000000237
    -rw-r--r-- 1 root root    1140 Mar 12 17:54 edits_0000000000000000238-0000000000000000246
    -rw-r--r-- 1 root root      42 Mar 12 18:54 edits_0000000000000000247-0000000000000000248
    -rw-r--r-- 1 root root   24270 Mar 12 19:54 edits_0000000000000000249-0000000000000000448
    -rw-r--r-- 1 root root    2603 Mar 12 18:54 fsimage_0000000000000000248
    -rw-r--r-- 1 root root      62 Mar 12 18:54 fsimage_0000000000000000248.md5
    -rw-r--r-- 1 root root    3595 Mar 12 19:54 fsimage_0000000000000000448
    -rw-r--r-- 1 root root      62 Mar 12 19:54 fsimage_0000000000000000448.md5
    -rw-r--r-- 1 root root     217 Mar 12 19:54 VERSION
    [root@hadoop105.yinzhengjie.org.cn ~]# 
    [root@hadoop105.yinzhengjie.org.cn ~]# jps
    5301 SecondaryNameNode
    5381 Jps
    [root@hadoop105.yinzhengjie.org.cn ~]# 

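      The listings above lead to one conclusion: the SecondaryNameNode cannot take the place of the NameNode. Its copy lags behind (here it lacks the newest edit logs and fsimage_0000000000000000449), so forcing it into that role risks data loss.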
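The lag between the two directories can be made explicit by comparing the file names. This Python sketch is a rough illustration only; `edits_missing_on_secondary` is a hypothetical helper that compares names and does not inspect file contents.

```python
def edits_missing_on_secondary(nn_files, snn_files):
    """Edit log files present on the NameNode but absent on the SecondaryNameNode."""
    on_secondary = set(snn_files)
    return sorted(name for name in nn_files
                  if name.startswith("edits_") and name not in on_secondary)
```

Feeding it the two `ll` listings above would report, among others, the in-progress segment, which only ever exists on the NameNode.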
    
      If the SecondaryNameNode is not a backup of the NameNode, what does it actually do?
    
      Let's go through how the SecondaryNameNode works in detail.

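    2>. SecondaryNameNode working principle analysis

       

      Now that we understand NameNode startup, let's look at how the SecondaryNameNode works. Its logic is roughly as follows:
        1>. Periodically ask the NameNode whether a checkpoint is due;
        2>. Once a checkpoint is triggered, the SecondaryNameNode carries out the checkpoint;
        3>. The NameNode rolls the edit log it is currently writing (the "edits_inprogress_*" file);
        4>. The rolled edit logs and the newest image file are copied to the SecondaryNameNode;
        5>. The SecondaryNameNode loads the copied image file and edit logs into memory, merges them, writes the result as a new image file named "fsimage.ckpt_*", and sends that file to the NameNode;
        6>. Once the "fsimage.ckpt_*" file has been copied from the SecondaryNameNode to the NameNode, it is renamed to "fsimage_*". The next time the NameNode starts, it loads this newest image file and merges it with the edit logs.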
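Steps 3 to 6 can be walked through with a toy model. In this Python sketch a dict stands in for the NameNode, the operations `add` and `remove` are invented for illustration, and `run_checkpoint` is a hypothetical name; only the file-name patterns come from the text above.

```python
def run_checkpoint(nn):
    """Toy walk-through of steps 3-6 on a dict that stands in for the NameNode."""
    # Step 3: roll the in-progress edit log so it becomes a finalized segment.
    rolled = nn["in_progress"]
    nn["in_progress"] = []

    # Step 4: the SecondaryNameNode fetches the image and the rolled edits.
    merged = set(nn["image"])

    # Step 5: merge in memory and write the result under a temporary name.
    for op, path in rolled:
        if op == "add":
            merged.add(path)
        else:
            merged.discard(path)
    last_txid = nn["last_txid"] + len(rolled)
    ckpt_name = "fsimage.ckpt_%019d" % last_txid

    # Step 6: the file is shipped back to the NameNode and renamed.
    final_name = ckpt_name.replace(".ckpt", "")
    nn["image"] = merged
    nn["last_txid"] = last_txid
    return ckpt_name, final_name
```

The 19-digit zero padding matches the width of the numeric suffixes in the directory listings.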

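      With this basic picture of the SecondaryNameNode in mind, here are a few more questions with my own answers. My knowledge is limited, so anyone who can add to these answers is welcome to leave a comment.
        Q1: We know the NameNode saves a checkpoint automatically when the HDFS cluster restarts. Why do we still need the SecondaryNameNode to save checkpoints?
          A: True, the NameNode does save a checkpoint when it restarts, but in production we rarely restart the NameNode, because developers across the company depend on the HDFS cluster. I honestly can't remember the last time I restarted one. Over time, as operations keep being appended, the edit log grows large, which hurts efficiency, and recovering the metadata after a power failure takes too long. So the image file and edit logs must be merged regularly. Doing that on the NameNode itself would be too costly, so Apache Hadoop introduced a separate node, the SecondaryNameNode, dedicated to merging the image file and edit logs.

        Q2: How often does the SecondaryNameNode ask the NameNode whether a checkpoint is due?
          A: By default the SecondaryNameNode checks every 60 seconds (configurable via "dfs.namenode.checkpoint.check.period") how many operations have accumulated in the edit log. Once the count reaches 1,000,000 (the default), it saves a checkpoint.

        Q3: What triggers the SecondaryNameNode to save a checkpoint?
          A: As mentioned in the previous answer, a checkpoint is triggered once 1,000,000 operations have accumulated (the default, configurable via "dfs.namenode.checkpoint.txns"). A checkpoint is also created every hour by default (configurable via "dfs.namenode.checkpoint.period"). Whichever condition is met first triggers the checkpoint.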
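The two trigger conditions combine with a logical OR. A minimal Python sketch, with the defaults quoted above as parameter defaults; `should_checkpoint` is a hypothetical helper, not a Hadoop function.

```python
def should_checkpoint(seconds_since_last, txns_since_last,
                      period_seconds=3600, txns_limit=1_000_000):
    """Either condition alone is enough to trigger a checkpoint.

    period_seconds mirrors dfs.namenode.checkpoint.period and
    txns_limit mirrors dfs.namenode.checkpoint.txns.
    """
    return (seconds_since_last >= period_seconds
            or txns_since_last >= txns_limit)
```

In this model the SecondaryNameNode would evaluate the predicate once per check period (60 seconds by default).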

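        Q4: Does uploading one file correspond to one operation?
          A: Not necessarily. Suppose a client uploads a 372 MB file. With the default block size of 128 MB, the file is split into 3 blocks, and each block upload corresponds to an operation, so this single upload accounts for at least 3 operations. A file smaller than the 128 MB block size occupies only one block. In practice the edit log also records steps such as creating and closing the file, so the exact transaction count is best checked with "hdfs oev".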
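The block count in this example is a ceiling division. A short Python sketch, assuming sizes are given in MB; `block_count` is a hypothetical helper.

```python
import math

def block_count(file_size_mb, block_size_mb=128):
    """Number of HDFS blocks needed to hold a file of the given size."""
    return math.ceil(file_size_mb / block_size_mb)

if __name__ == "__main__":
    print(block_count(372))  # the 372 MB example from the answer above
```

The last block is usually only partly filled: for the 372 MB file, the first two blocks hold 128 MB each and the third holds the remaining 116 MB.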

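      As the listing below shows, more than an hour after starting the cluster, the NameNode does have a new image file. That is the SecondaryNameNode's doing.

    III. Viewing the image file and the edit log

    1>. Export the image file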

    [root@hadoop101.yinzhengjie.org.cn ~]# hdfs oiv -h
    Usage: bin/hdfs oiv [OPTIONS] -i INPUTFILE -o OUTPUTFILE
    Offline Image Viewer
    View a Hadoop fsimage INPUTFILE using the specified PROCESSOR,
    saving the results in OUTPUTFILE.
    
    The oiv utility will attempt to parse correctly formed image files
    and will abort fail with mal-formed image files.
    
    The tool works offline and does not require a running cluster in
    order to process an image file.
    
    The following image processors are available:
      * XML: This processor creates an XML document with all elements of
        the fsimage enumerated, suitable for further analysis by XML
        tools.
      * ReverseXML: This processor takes an XML file and creates a
        binary fsimage containing the same elements.
      * FileDistribution: This processor analyzes the file size
        distribution in the image.
        -maxSize specifies the range [0, maxSize] of file sizes to be
         analyzed (128GB by default).
        -step defines the granularity of the distribution. (2MB by default)
        -format formats the output result in a human-readable fashion
         rather than a number of bytes. (false by default)
      * Web: Run a viewer to expose read-only WebHDFS API.
        -addr specifies the address to listen. (localhost:5978 by default)
      * Delimited (experimental): Generate a text file with all of the elements common
        to both inodes and inodes-under-construction, separated by a
        delimiter. The default delimiter is 	, though this may be
        changed via the -delimiter argument.
    
    Required command line arguments:
    -i,--inputFile <arg>   FSImage or XML file to process.
    
    Optional command line arguments:
    -o,--outputFile <arg>  Name of output file. If the specified
                           file exists, it will be overwritten.
                           (output to stdout by default)
                           If the input file was an XML file, we
                           will also create an <outputFile>.md5 file.
    -p,--processor <arg>   Select which type of processor to apply
                           against image file. (XML|FileDistribution|
                           ReverseXML|Web|Delimited)
                           The default is Web.
    -delimiter <arg>       Delimiting string to use with Delimited processor.  
    -t,--temp <arg>        Use temporary dir to cache intermediate result to generate
                           Delimited outputs. If not set, Delimited processor constructs
                           the namespace in memory before outputting text.
    -h,--help              Display usage information and exit
    
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# ll /yinzhengjie/softwares/hadoop-2.10.0/data/tmp/dfs/name/current/
    total 10396
    -rw-r--r-- 1 root root      42 Mar 12 03:27 edits_0000000000000000001-0000000000000000002
    -rw-r--r-- 1 root root      42 Mar 12 03:27 edits_0000000000000000003-0000000000000000004
    -rw-r--r-- 1 root root 1048576 Mar 12 03:27 edits_0000000000000000005-0000000000000000005
    -rw-r--r-- 1 root root 1048576 Mar 12 03:27 edits_0000000000000000006-0000000000000000006
    -rw-r--r-- 1 root root      42 Mar 12 03:29 edits_0000000000000000007-0000000000000000008
    -rw-r--r-- 1 root root 1048576 Mar 12 03:29 edits_0000000000000000009-0000000000000000009
    -rw-r--r-- 1 root root      42 Mar 12 03:32 edits_0000000000000000010-0000000000000000011
    -rw-r--r-- 1 root root      42 Mar 12 04:32 edits_0000000000000000012-0000000000000000013
    -rw-r--r-- 1 root root 1048576 Mar 12 04:32 edits_0000000000000000014-0000000000000000014
    -rw-r--r-- 1 root root      42 Mar 12 04:57 edits_0000000000000000015-0000000000000000016
    -rw-r--r-- 1 root root 1048576 Mar 12 04:57 edits_0000000000000000017-0000000000000000017
    -rw-r--r-- 1 root root      42 Mar 12 05:03 edits_0000000000000000018-0000000000000000019
    -rw-r--r-- 1 root root 1048576 Mar 12 05:03 edits_0000000000000000020-0000000000000000020
    -rw-r--r-- 1 root root      42 Mar 12 07:46 edits_0000000000000000021-0000000000000000022
    -rw-r--r-- 1 root root 1048576 Mar 12 07:46 edits_0000000000000000023-0000000000000000023
    -rw-r--r-- 1 root root      42 Mar 12 08:41 edits_0000000000000000024-0000000000000000025
    -rw-r--r-- 1 root root      42 Mar 12 09:41 edits_0000000000000000026-0000000000000000027
    -rw-r--r-- 1 root root      42 Mar 12 10:41 edits_0000000000000000028-0000000000000000029
    -rw-r--r-- 1 root root      42 Mar 12 11:41 edits_0000000000000000030-0000000000000000031
    -rw-r--r-- 1 root root      42 Mar 12 12:41 edits_0000000000000000032-0000000000000000033
    -rw-r--r-- 1 root root      42 Mar 12 13:41 edits_0000000000000000034-0000000000000000035
    -rw-r--r-- 1 root root      42 Mar 12 14:41 edits_0000000000000000036-0000000000000000037
    -rw-r--r-- 1 root root     672 Mar 12 15:41 edits_0000000000000000038-0000000000000000046
    -rw-r--r-- 1 root root 1048576 Mar 12 15:41 edits_0000000000000000047-0000000000000000047
    -rw-r--r-- 1 root root      42 Mar 12 15:54 edits_0000000000000000048-0000000000000000049
    -rw-r--r-- 1 root root   23130 Mar 12 16:54 edits_0000000000000000050-0000000000000000237
    -rw-r--r-- 1 root root    1140 Mar 12 17:54 edits_0000000000000000238-0000000000000000246
    -rw-r--r-- 1 root root      42 Mar 12 18:54 edits_0000000000000000247-0000000000000000248
    -rw-r--r-- 1 root root   24270 Mar 12 19:54 edits_0000000000000000249-0000000000000000448
    -rw-r--r-- 1 root root 1048576 Mar 12 19:54 edits_0000000000000000449-0000000000000000449
    -rw-r--r-- 1 root root      42 Mar 17 15:07 edits_0000000000000000450-0000000000000000451
    -rw-r--r-- 1 root root      42 Mar 17 16:07 edits_0000000000000000452-0000000000000000453
    -rw-r--r-- 1 root root 1048576 Mar 17 16:07 edits_inprogress_0000000000000000454
    -rw-r--r-- 1 root root    3595 Mar 17 14:08 fsimage_0000000000000000449
    -rw-r--r-- 1 root root      62 Mar 17 14:08 fsimage_0000000000000000449.md5
    -rw-r--r-- 1 root root    3595 Mar 17 16:07 fsimage_0000000000000000453
    -rw-r--r-- 1 root root      62 Mar 17 16:07 fsimage_0000000000000000453.md5
    -rw-r--r-- 1 root root       4 Mar 17 16:07 seen_txid
    -rw-r--r-- 1 root root     217 Mar 17 14:08 VERSION
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# ll
    total 0
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# hdfs oiv -p XML -i /yinzhengjie/softwares/hadoop-2.10.0/data/tmp/dfs/name/current/fsimage_0000000000000000453 -o ./fsimage.xml
    20/03/17 16:47:56 INFO offlineImageViewer.FSImageHandler: Loading 3 strings
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# ll
    total 16
    -rw-r--r-- 1 root root 13956 Mar 17 16:47 fsimage.xml
    [root@hadoop101.yinzhengjie.org.cn ~]# 

    2>. View the image file
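One way to inspect the exported fsimage.xml is a short Python script. This is a sketch under assumptions: it expects `<inode>` elements with `<type>` and `<name>` children, which is how the XML processor's output is commonly laid out, and the embedded `SAMPLE` is made up rather than copied from the file exported above.

```python
import xml.etree.ElementTree as ET

# Made-up sample for illustration; a real export contains many more fields.
SAMPLE = """<?xml version="1.0"?>
<fsimage>
  <INodeSection>
    <inode><id>16385</id><type>DIRECTORY</type><name></name></inode>
    <inode><id>16386</id><type>FILE</type><name>hosts</name></inode>
  </INodeSection>
</fsimage>"""

def list_inodes(xml_text):
    """Return (type, name) for every inode; the root directory has an empty name."""
    root = ET.fromstring(xml_text)
    return [(inode.findtext("type"), inode.findtext("name") or "/")
            for inode in root.iter("inode")]

if __name__ == "__main__":
    for inode_type, name in list_inodes(SAMPLE):
        print(inode_type, name)
```

To run it against the real export, read the file first, for example `list_inodes(open("fsimage.xml").read())`.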

     

    3>. Export the edit log

    [root@hadoop101.yinzhengjie.org.cn ~]# hdfs oev -h
    Usage: bin/hdfs oev [OPTIONS] -i INPUT_FILE -o OUTPUT_FILE
    Offline edits viewer
    Parse a Hadoop edits log file INPUT_FILE and save results
    in OUTPUT_FILE.
    Required command line arguments:
    -i,--inputFile <arg>   edits file to process, xml (case
                           insensitive) extension means XML format,
                           any other filename means binary format.
                           XML/Binary format input file is not allowed
                           to be processed by the same type processor.
    -o,--outputFile <arg>  Name of output file. If the specified
                           file exists, it will be overwritten,
                           format of the file is determined
                           by -p option
    
    Optional command line arguments:
    -p,--processor <arg>   Select which type of processor to apply
                           against image file, currently supported
                           processors are: binary (native binary format
                           that Hadoop uses), xml (default, XML
                           format), stats (prints statistics about
                           edits file)
    -h,--help              Display usage information and exit
    -f,--fix-txids         Renumber the transaction IDs in the input,
                           so that there are no gaps or invalid
                           transaction IDs.
    -r,--recover           When reading binary edit logs, use recovery 
                           mode.  This will give you the chance to skip 
                           corrupt parts of the edit log.
    -v,--verbose           More verbose output, prints the input and
                           output filenames, for processors that write
                           to a file, also output to screen. On large
                           image files this will dramatically increase
                           processing time (default is false).
    
    
    Generic options supported are:
    -conf <configuration file>        specify an application configuration file
    -D <property=value>               define a value for a given property
    -fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
    -jt <local|resourcemanager:port>  specify a ResourceManager
    -files <file1,...>                specify a comma-separated list of files to be copied to the map reduce cluster
    -libjars <jar1,...>               specify a comma-separated list of jar files to be included in the classpath
    -archives <archive1,...>          specify a comma-separated list of archives to be unarchived on the compute machines
    
    The general command line syntax is:
    command [genericOptions] [commandOptions]
    
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# ll /yinzhengjie/softwares/hadoop-2.10.0/data/tmp/dfs/name/current/
    total 10396
    -rw-r--r-- 1 root root      42 Mar 12 03:27 edits_0000000000000000001-0000000000000000002
    -rw-r--r-- 1 root root      42 Mar 12 03:27 edits_0000000000000000003-0000000000000000004
    -rw-r--r-- 1 root root 1048576 Mar 12 03:27 edits_0000000000000000005-0000000000000000005
    -rw-r--r-- 1 root root 1048576 Mar 12 03:27 edits_0000000000000000006-0000000000000000006
    -rw-r--r-- 1 root root      42 Mar 12 03:29 edits_0000000000000000007-0000000000000000008
    -rw-r--r-- 1 root root 1048576 Mar 12 03:29 edits_0000000000000000009-0000000000000000009
    -rw-r--r-- 1 root root      42 Mar 12 03:32 edits_0000000000000000010-0000000000000000011
    -rw-r--r-- 1 root root      42 Mar 12 04:32 edits_0000000000000000012-0000000000000000013
    -rw-r--r-- 1 root root 1048576 Mar 12 04:32 edits_0000000000000000014-0000000000000000014
    -rw-r--r-- 1 root root      42 Mar 12 04:57 edits_0000000000000000015-0000000000000000016
    -rw-r--r-- 1 root root 1048576 Mar 12 04:57 edits_0000000000000000017-0000000000000000017
    -rw-r--r-- 1 root root      42 Mar 12 05:03 edits_0000000000000000018-0000000000000000019
    -rw-r--r-- 1 root root 1048576 Mar 12 05:03 edits_0000000000000000020-0000000000000000020
    -rw-r--r-- 1 root root      42 Mar 12 07:46 edits_0000000000000000021-0000000000000000022
    -rw-r--r-- 1 root root 1048576 Mar 12 07:46 edits_0000000000000000023-0000000000000000023
    -rw-r--r-- 1 root root      42 Mar 12 08:41 edits_0000000000000000024-0000000000000000025
    -rw-r--r-- 1 root root      42 Mar 12 09:41 edits_0000000000000000026-0000000000000000027
    -rw-r--r-- 1 root root      42 Mar 12 10:41 edits_0000000000000000028-0000000000000000029
    -rw-r--r-- 1 root root      42 Mar 12 11:41 edits_0000000000000000030-0000000000000000031
    -rw-r--r-- 1 root root      42 Mar 12 12:41 edits_0000000000000000032-0000000000000000033
    -rw-r--r-- 1 root root      42 Mar 12 13:41 edits_0000000000000000034-0000000000000000035
    -rw-r--r-- 1 root root      42 Mar 12 14:41 edits_0000000000000000036-0000000000000000037
    -rw-r--r-- 1 root root     672 Mar 12 15:41 edits_0000000000000000038-0000000000000000046
    -rw-r--r-- 1 root root 1048576 Mar 12 15:41 edits_0000000000000000047-0000000000000000047
    -rw-r--r-- 1 root root      42 Mar 12 15:54 edits_0000000000000000048-0000000000000000049
    -rw-r--r-- 1 root root   23130 Mar 12 16:54 edits_0000000000000000050-0000000000000000237
    -rw-r--r-- 1 root root    1140 Mar 12 17:54 edits_0000000000000000238-0000000000000000246
    -rw-r--r-- 1 root root      42 Mar 12 18:54 edits_0000000000000000247-0000000000000000248
    -rw-r--r-- 1 root root   24270 Mar 12 19:54 edits_0000000000000000249-0000000000000000448
    -rw-r--r-- 1 root root 1048576 Mar 12 19:54 edits_0000000000000000449-0000000000000000449
    -rw-r--r-- 1 root root      42 Mar 17 15:07 edits_0000000000000000450-0000000000000000451
    -rw-r--r-- 1 root root      42 Mar 17 16:07 edits_0000000000000000452-0000000000000000453
    -rw-r--r-- 1 root root 1048576 Mar 17 16:07 edits_inprogress_0000000000000000454
    -rw-r--r-- 1 root root    3595 Mar 17 14:08 fsimage_0000000000000000449
    -rw-r--r-- 1 root root      62 Mar 17 14:08 fsimage_0000000000000000449.md5
    -rw-r--r-- 1 root root    3595 Mar 17 16:07 fsimage_0000000000000000453
    -rw-r--r-- 1 root root      62 Mar 17 16:07 fsimage_0000000000000000453.md5
    -rw-r--r-- 1 root root       4 Mar 17 16:07 seen_txid
    -rw-r--r-- 1 root root     217 Mar 17 14:08 VERSION
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# ll
    total 16
    -rw-r--r-- 1 root root 13956 Mar 17 16:47 fsimage.xml
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# hdfs oev -p XML -i /yinzhengjie/softwares/hadoop-2.10.0/data/tmp/dfs/name/current/edits_inprogress_0000000000000000454 -o ./edits.xml
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# ll
    total 20
    -rw-r--r-- 1 root root   204 Mar 17 16:54 edits.xml
    -rw-r--r-- 1 root root 13956 Mar 17 16:47 fsimage.xml
    [root@hadoop101.yinzhengjie.org.cn ~]# 

    4>. View the edit log
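The exported edits.xml can be summarized by counting opcodes. This Python sketch assumes the usual layout of `<RECORD>` elements with an `<OPCODE>` child; the embedded `SAMPLE` is a made-up stand-in for a segment that has only just been opened, not the content of the file exported above.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up sample for illustration: a segment containing only its start marker.
SAMPLE = """<?xml version="1.0"?>
<EDITS>
  <EDITS_VERSION>-63</EDITS_VERSION>
  <RECORD>
    <OPCODE>OP_START_LOG_SEGMENT</OPCODE>
    <DATA><TXID>454</TXID></DATA>
  </RECORD>
</EDITS>"""

def count_opcodes(xml_text):
    """Count how many times each opcode appears in an exported edit log."""
    root = ET.fromstring(xml_text)
    return Counter(record.findtext("OPCODE") for record in root.iter("RECORD"))

if __name__ == "__main__":
    for opcode, count in count_opcodes(SAMPLE).items():
        print(opcode, count)
```

Running the same count before and after uploading a file is a simple way to see how many transactions one upload really produces.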

     

  • Original article: https://www.cnblogs.com/yinzhengjie2020/p/12466763.html