• NameNode & DataNode


    NameNode类位于org.apache.hadoop.hdfs.server.namenode包下。

    NameNode serves as both directory namespace manager and "inode table" for the Hadoop DFS. There is a single NameNode running in any DFS deployment. (Well, except when there is a second backup/failover NameNode.)

    The NameNode controls two critical tables:
    1) filename->blocksequence (namespace)
    2) block->machinelist ("inodes")

    The first table is stored on disk and is very precious. The second table is rebuilt every time the NameNode comes up.

    'NameNode' refers to both this class as well as the 'NameNode server'. The 'FSNamesystem' class actually performs most of the filesystem management. The majority of the 'NameNode' class itself is concerned with exposing the IPC interface and the http server to the outside world, plus some configuration management.

    NameNode implements the ClientProtocol interface, which allows clients to ask for DFS services. ClientProtocol is not designed for direct use by authors of DFS client code. End-users should instead use the org.apache.nutch.hadoop.fs.FileSystem class.

    NameNode also implements the DatanodeProtocol interface, used by DataNode programs that actually store DFS data blocks. These methods are invoked repeatedly and automatically by all the DataNodes in a DFS deployment.

    NameNode also implements the NamenodeProtocol interface, used by secondary namenodes or rebalancing processes to get partial namenode's state, for example partial blocksMap etc.

    DataNode 类位于org.apache.hadoop.hdfs.server.datanode包下。
    DataNode is a class (and program) that stores a set of blocks for a DFS deployment. A single deployment can have one or many DataNodes. Each DataNode communicates regularly with a single NameNode. It also communicates with client code and other DataNodes from time to time.

    DataNodes store a series of named blocks. The DataNode allows client code to read these blocks, or to write new block data. The DataNode may also, in response to instructions from its NameNode, delete blocks or copy blocks to/from other DataNodes.

    The DataNode maintains just one critical table:
    block-> stream of bytes (of BLOCK_SIZE or less)

    This info is stored on a local disk. The DataNode reports the table's contents to the NameNode upon startup and every so often afterwards.

    DataNodes spend their lives in an endless loop of asking the NameNode for something to do. A NameNode cannot connect to a DataNode directly; a NameNode simply returns values from functions invoked by a DataNode.

    DataNodes maintain an open server socket so that client code or other DataNodes can read/write data. The host/port for this server is reported to the NameNode, which then sends that information to clients or other DataNodes that might be interested.

    查找工程里的类或者是资源文件:Ctrl + Shift + R。
    查找jar包里的类:Ctrl + Shift + T。

  • 相关阅读:
    MySql 应用语句
    MySql 存储过程 退出
    MySql 存储过程 光标只循环一次
    MySql获取两个日期间的时间差
    VM VirtualBox 全屏模式 && 自动缩放模式 相互切换
    区分不同操作系统、编译器不同版本的宏
    debian下配置网络 安装无线网卡驱动 Broadcom BCMXX系列
    YII 主题设置
    Unix线程概念、控制原语、属性
    远程IPC种植木马
  • 原文地址:https://www.cnblogs.com/tsiangleo/p/4200789.html
Copyright © 2020-2023  润新知