Hadoop Source Code Analysis (2): The DataXceiverServer Class in the hdfs.server.datanode Package [Original]

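I. Preparation

Hadoop version: 1.0.3

Approach: while studying the DataNode class I found that it references DataXceiverServer, so I study DataXceiverServer alongside what I already understand about DataNode.

Date: 2013-01-27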

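II. What DataXceiverServer Does

DataXceiverServer is a helper class of DataNode. Its main job is to let clients and other DataNodes communicate with the current node, and it is responsible for receiving and sending data blocks. The class exists to listen for requests from clients or other DataNodes. It does not use Hadoop IPC for this; it uses the ServerSocket class that ships with the JDK.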

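III. How DataXceiverServer Implements It

1. Preparing to communicate

Intuitively, receiving and sending data blocks requires knowing the block size, the transport mechanism, the two communicating parties, how many threads may be used, and the available bandwidth. This class covers all of these: estimateBlockSize, the ServerSocket, maxXceiverCount, and the bandwidth limit of BlockBalanceThrottler.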
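As a rough illustration of this preparation step, here is a minimal sketch of reading the three settings with their defaults. The configuration keys and default values are the ones that appear in the DataXceiverServer constructor; the class name XceiverSettings is hypothetical, and java.util.Properties is only a stand-in for Hadoop's Configuration so that the example runs without Hadoop on the classpath.

```java
import java.util.Properties;

/**
 * Hypothetical holder for the three settings DataXceiverServer reads.
 * java.util.Properties stands in for Hadoop's Configuration (assumption).
 */
final class XceiverSettings {
    final int maxXceiverCount;     // upper bound on concurrent xceivers
    final long estimateBlockSize;  // block size used to check free disk space
    final long balanceBandwidth;   // bytes per second available to balancing

    private XceiverSettings(int maxXceiverCount, long estimateBlockSize,
                            long balanceBandwidth) {
        this.maxXceiverCount = maxXceiverCount;
        this.estimateBlockSize = estimateBlockSize;
        this.balanceBandwidth = balanceBandwidth;
    }

    /** Reads each key, falling back to the defaults seen in the constructor. */
    static XceiverSettings from(Properties conf) {
        int max = Integer.parseInt(
            conf.getProperty("dfs.datanode.max.xcievers", "256"));
        long block = Long.parseLong(
            conf.getProperty("dfs.block.size", String.valueOf(64L * 1024 * 1024)));
        long bandwidth = Long.parseLong(
            conf.getProperty("dfs.balance.bandwidthPerSec", String.valueOf(1024L * 1024)));
        return new XceiverSettings(max, block, bandwidth);
    }
}
```

With an empty Properties object every field takes its default; setting a key such as dfs.datanode.max.xcievers overrides only that value.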

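2. Communication

Communication goes through a ServerSocket and runs on its own thread, so the class implements the Runnable interface. The most important method is run(), shown in the code below.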


  /**
   * Runs the data block transfer server; the loop continues while the DataNode should keep running.
   */
  public void run() {
    while (datanode.shouldRun) {
      try {
        /**
         * Listen for and accept a connection request from a client or another DataNode; ss is this DataNode's ServerSocket.
         */
        Socket s = ss.accept();
        s.setTcpNoDelay(true); // disable Nagle's algorithm so data is sent without delay
        new Daemon(datanode.threadGroup, 
            new DataXceiver(s, datanode, this)).start();
      } catch (SocketTimeoutException ignored) {
        // wake up to see if should continue to run
      } catch (AsynchronousCloseException ace) {
          LOG.warn(datanode.dnRegistration + ":DataXceiveServer:"
                  + StringUtils.stringifyException(ace));
          datanode.shouldRun = false;
      } catch (IOException ie) {
        LOG.warn(datanode.dnRegistration + ":DataXceiveServer: IOException due to:"
                                 + StringUtils.stringifyException(ie));
      } catch (Throwable te) {
        LOG.error(datanode.dnRegistration + ":DataXceiveServer: Exiting due to:" 
                                 + StringUtils.stringifyException(te));
        datanode.shouldRun = false;
      }
    }
    try {
      ss.close();
    } catch (IOException ie) {
      LOG.warn(datanode.dnRegistration + ":DataXceiveServer: Close exception due to: "
                               + StringUtils.stringifyException(ie));
    }
    LOG.info("Exiting DataXceiveServer");
  }

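The loop also handles several exceptions. The server shuts down only after the DataNode has stopped running (datanode.shouldRun is false), at which point the ServerSocket is closed inside its own try/catch. Each accepted connection is served on its own daemon thread, started with new Daemon(datanode.threadGroup, new DataXceiver(s, datanode, this)).start(). The socket is passed to the DataXceiver constructor, which registers it with dataXceiverServer.childSockets.put(s, s); the thread then executes DataXceiver.run(), which reads and writes data over that socket. The DataXceiver class will be analyzed in more depth later.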
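To make the accept-and-dispatch pattern easier to see in isolation, here is a minimal sketch that runs without Hadoop. Everything in it is hypothetical: MiniAcceptServer stands in for DataXceiverServer, a plain daemon Thread stands in for Hadoop's Daemon wrapper, handle() stands in for DataXceiver.run() and merely echoes one byte, and the 200 ms accept timeout is an assumption of the sketch. Unlike the real run(), the sketch stops the loop on any IOException, which is a simplification.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Hypothetical miniature of the accept loop; echoes one byte per connection. */
final class MiniAcceptServer implements Runnable {
    private final ServerSocket ss;
    volatile boolean shouldRun = true;  // plays the role of datanode.shouldRun

    MiniAcceptServer(ServerSocket ss) throws IOException {
        this.ss = ss;
        ss.setSoTimeout(200);  // accept() wakes up periodically to re-check shouldRun
    }

    @Override
    public void run() {
        while (shouldRun) {
            try {
                Socket s = ss.accept();
                s.setTcpNoDelay(true);
                Thread worker = new Thread(() -> handle(s));
                worker.setDaemon(true);  // a daemon thread does not block JVM exit
                worker.start();
            } catch (SocketTimeoutException ignored) {
                // timed out waiting for a connection; loop to re-check shouldRun
            } catch (IOException e) {
                shouldRun = false;  // simplification: any other I/O error stops the loop
            }
        }
        try {
            ss.close();
        } catch (IOException ignored) {
            // nothing more to do during shutdown
        }
    }

    /** Stand-in for DataXceiver.run(): read one byte and send it back. */
    private static void handle(Socket s) {
        try (Socket sock = s) {
            InputStream in = sock.getInputStream();
            OutputStream out = sock.getOutputStream();
            int b = in.read();
            if (b >= 0) {
                out.write(b);
                out.flush();
            }
        } catch (IOException ignored) {
            // connection dropped; the worker simply ends
        }
    }
}
```

The design point is the same as in the original: the accepting thread never does per-connection work itself, so a slow peer cannot stall new connections, and setting the flag to false lets the loop exit and close the ServerSocket.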

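3. Handling multiple sockets

DataXceiverServer records every socket opened for data transfer in a synchronized map (Collections.synchronizedMap wrapping a HashMap), and closes those sockets in a thread-safe way: kill() iterates over the map inside a synchronized block on the map itself.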
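The reason kill() still needs a synchronized block even though the map is already a synchronized wrapper can be sketched as follows. The wrapper makes single calls such as put and remove atomic, but iterating over its views is not, so the Collections.synchronizedMap documentation requires the caller to lock the map while traversing it. The class SocketRegistry and its method names are hypothetical; only the childSockets idea comes from the source.

```java
import java.io.IOException;
import java.net.Socket;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry mirroring the childSockets idea. */
final class SocketRegistry {
    // Single put/remove calls are thread-safe thanks to the wrapper.
    private final Map<Socket, Socket> childSockets =
        Collections.synchronizedMap(new HashMap<Socket, Socket>());

    void register(Socket s) {
        childSockets.put(s, s);
    }

    void unregister(Socket s) {
        childSockets.remove(s);
    }

    int size() {
        return childSockets.size();
    }

    /** Closes every registered socket and returns how many were closed. */
    int closeAll() {
        int closed = 0;
        // Iteration is not atomic, so hold the map's own lock while traversing.
        synchronized (childSockets) {
            for (Socket s : childSockets.values()) {
                try {
                    s.close();
                    closed++;
                } catch (IOException ignored) {
                    // keep closing the remaining sockets
                }
            }
        }
        return closed;
    }
}
```

Without the synchronized block, another thread registering or removing a socket during the loop could cause a ConcurrentModificationException.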

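IV. Main Methods and Fields of DataXceiverServer

Below is the DataXceiverServer source. I have added explanatory comments to the main methods and fields. The class is relatively simple.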


package org.apache.hadoop.hdfs.server.datanode;

import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.channels.AsynchronousCloseException;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import org.apache.commons.logging.Log;
import org.apache.hadoop.hdfs.protocol.FSConstants;
import org.apache.hadoop.hdfs.server.balancer.Balancer;
import org.apache.hadoop.util.Daemon;
import org.apache.hadoop.util.StringUtils;

import org.apache.hadoop.conf.Configuration;

/**
 * DataXceiverServer is used for receiving/sending blocks of data. It is created
 * to listen for requests from clients or other DataNodes.
 * This small server does not use Hadoop IPC.
 */
class DataXceiverServer implements Runnable, FSConstants {
  public static final Log LOG = DataNode.LOG;

  ServerSocket ss;
  DataNode datanode;
  // Record all sockets opened for data transfer
  /**
   * A synchronized map (Collections.synchronizedMap backed by a HashMap)
   * that records every socket opened for data transfer.
   */
  Map<Socket, Socket> childSockets = Collections.synchronizedMap(
                                       new HashMap<Socket, Socket>());

  /**
   * Maximal number of concurrent xceivers per node.
   * Enforcing the limit is required in order to avoid data-node
   * running out of memory.
   */
  static final int MAX_XCEIVER_COUNT = 256;
  int maxXceiverCount = MAX_XCEIVER_COUNT;

  /** A manager to make sure that cluster balancing does not
   * take too much resources.
   * 
   * It limits the number of block moves for balancing and
   * the total amount of bandwidth they can use.
   */
  static class BlockBalanceThrottler extends BlockTransferThrottler {
   private int numThreads;

   /**Constructor
    * 
    * @param bandwidth Total amount of bandwidth can be used for balancing 
    */
   private BlockBalanceThrottler(long bandwidth) {
     super(bandwidth);
     LOG.info("Balancing bandwith is "+ bandwidth + " bytes/s");
   }

   /** Check if the block move can start. 
    * 
    * Return true if the thread quota is not exceeded and 
    * the counter is incremented; False otherwise.
    * At most 5 block moves may run at the same time
    * (Balancer.MAX_NUM_CONCURRENT_MOVES). Once that many are running this
    * returns false; otherwise the thread counter is incremented.
    */
   synchronized boolean acquire() {
     if (numThreads >= Balancer.MAX_NUM_CONCURRENT_MOVES) {
       return false;
     }
     numThreads++;
     return true;
   }

   /** Mark that the move is completed. The thread counter is decremented. */
   synchronized void release() {
     numThreads--;
   }
  }

  BlockBalanceThrottler balanceThrottler;

  /**
   * We need an estimate for block size to check if the disk partition has
   * enough space. For now we set it to be the default block size set
   * in the server side configuration, which is not ideal because the
   * default block size should be a client-size configuration. 
   * A better solution is to include in the header the estimated block size,
   * i.e. either the actual block size or the default block size.
   */
  long estimateBlockSize;

  /**
   * Constructor of the data transfer server.
   * @param ss the server socket
   * @param conf the configuration
   * @param datanode the DataNode this server belongs to
   */
  DataXceiverServer(ServerSocket ss, Configuration conf, 
      DataNode datanode) {

    this.ss = ss;
    this.datanode = datanode;
    /**
     * Read the configured maximum number of xceivers from
     * dfs.datanode.max.xcievers, falling back to MAX_XCEIVER_COUNT.
     */
    this.maxXceiverCount = conf.getInt("dfs.datanode.max.xcievers",
        MAX_XCEIVER_COUNT);
    // Read the block size from dfs.block.size; the default is 64 MB.
    this.estimateBlockSize = conf.getLong("dfs.block.size", DEFAULT_BLOCK_SIZE);

    //set up parameter for cluster balancing; the default is 1 MB/s (1024*1024 bytes/s).
    this.balanceThrottler = new BlockBalanceThrottler(
      conf.getLong("dfs.balance.bandwidthPerSec", 1024L*1024));
  }

  /**
   * Runs the data block transfer server; the loop continues while the
   * DataNode should keep running.
   */
  public void run() {
    while (datanode.shouldRun) {
      try {
        /**
         * Listen for and accept a connection request from a client or
         * another DataNode; ss is this DataNode's ServerSocket.
         */
        Socket s = ss.accept();
        s.setTcpNoDelay(true);
        new Daemon(datanode.threadGroup, 
            new DataXceiver(s, datanode, this)).start();
      } catch (SocketTimeoutException ignored) {
        // wake up to see if should continue to run
      } catch (AsynchronousCloseException ace) {
          LOG.warn(datanode.dnRegistration + ":DataXceiveServer:"
                  + StringUtils.stringifyException(ace));
          datanode.shouldRun = false;
      } catch (IOException ie) {
        LOG.warn(datanode.dnRegistration + ":DataXceiveServer: IOException due to:"
                                 + StringUtils.stringifyException(ie));
      } catch (Throwable te) {
        LOG.error(datanode.dnRegistration + ":DataXceiveServer: Exiting due to:" 
                                 + StringUtils.stringifyException(te));
        datanode.shouldRun = false;
      }
    }
    try {
      ss.close();
    } catch (IOException ie) {
      LOG.warn(datanode.dnRegistration + ":DataXceiveServer: Close exception due to: "
                               + StringUtils.stringifyException(ie));
    }
    LOG.info("Exiting DataXceiveServer");
  }
  /**
   * Stops this server. The DataNode must already have been told to stop
   * (shouldRun == false); the ServerSocket is then closed.
   */
  void kill() {
    assert datanode.shouldRun == false :
      "shoudRun should be set to false before killing";
    try {
      this.ss.close();
    } catch (IOException ie) {
      LOG.warn(datanode.dnRegistration + ":DataXceiveServer.kill(): " 
                              + StringUtils.stringifyException(ie));
    }

    // close all the sockets that were accepted earlier
    /**
     * Close each socket while holding the map's lock, so that the
     * iteration is thread-safe.
     */
    synchronized (childSockets) {
      for (Iterator<Socket> it = childSockets.values().iterator();
           it.hasNext();) {
        Socket thissock = it.next();
        try {
          thissock.close();
        } catch (IOException e) {
        }
      }
    }
  }
}

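V. Related Classes and Interfaces

DataXceiverServer implements the Runnable and FSConstants interfaces, uses org.apache.hadoop.hdfs.server.balancer.Balancer, and has one inner class, BlockBalanceThrottler. Runnable: a class implementing this interface can be executed by a thread and must implement void run(); starting a thread constructed with a DataXceiverServer causes its run() method to be called in that separately executing thread.
FSConstants: defines file system constants. The one used in DataXceiverServer is DEFAULT_BLOCK_SIZE, the default block size (64 MB).
Balancer: the cluster balancer. DataXceiverServer uses it to cap the number of concurrent block-move threads at Balancer.MAX_NUM_CONCURRENT_MOVES (value 5).
BlockBalanceThrottler: a manager that makes sure cluster balancing does not consume too many resources. It limits the number of concurrent block moves (at most 5) and the total bandwidth they may use (1 MB/s by default, configurable through dfs.balance.bandwidthPerSec).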
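The thread quota part of BlockBalanceThrottler can be tried out on its own. The sketch below copies only the acquire/release counter logic; the class name MoveQuota and the constructor parameter are hypothetical, and the bandwidth throttling inherited from BlockTransferThrottler is deliberately left out.

```java
/** Hypothetical stand-alone version of the thread quota in BlockBalanceThrottler. */
final class MoveQuota {
    private final int maxConcurrentMoves;  // 5 in Hadoop, via Balancer.MAX_NUM_CONCURRENT_MOVES
    private int numThreads;                // block moves currently in progress

    MoveQuota(int maxConcurrentMoves) {
        this.maxConcurrentMoves = maxConcurrentMoves;
    }

    /** Returns true and counts the move if the quota is not yet used up. */
    synchronized boolean acquire() {
        if (numThreads >= maxConcurrentMoves) {
            return false;
        }
        numThreads++;
        return true;
    }

    /** Marks one move as finished, freeing a slot. */
    synchronized void release() {
        numThreads--;
    }
}
```

With new MoveQuota(5), the first five acquire() calls succeed and the sixth returns false until some move calls release(). Both methods are synchronized, so the check and the increment happen as one atomic step.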

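VI. Closing Notes

1. Hadoop's source code feels deeply nested yet rigorously logical, which makes it hard going. Studying this class deepened my understanding of socket communication, thread usage, thread safety, and daemon threads.

2. To do: study the DataXceiver class.

Original source: http://www.cnblogs.com/caoyuanzhanlang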
