• Hadoop RPC: Analysis of the Server-Side Flow


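               Two days ago I walked through the Client-side flow, which was fairly straightforward. The RPC server is clearly not that simple: in a client-server design the load and the complexity sit on the Server side. There is too much detail to untangle all of it, so this post gives an overview and follows the source code through the main flow. Readers who want to go further can read the source themselves.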

               Server and Client share some definitions. The server also has Call and Connection, which makes sense because both calls and connections have two ends. Start with the definition of the Server class:

    public abstract class Server {
      private final boolean authorize;
      private boolean isSecurityEnabled;
      
      /**
       * The first four bytes of Hadoop RPC connections
       * The connection magic string 'hrpc' of Hadoop RPC
       */
      public static final ByteBuffer HEADER = ByteBuffer.wrap("hrpc".getBytes());
      
      // 1 : Introduce ping and server does not throw away RPCs
      // 3 : Introduce the protocol into the RPC connection header
      // 4 : Introduced SASL security layer
      public static final byte CURRENT_VERSION = 4;
      ....
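The magic bytes and the version number are what the server checks first on every new connection (the check itself appears later in readAndProcess). As a minimal sketch, assuming a 6-byte preamble laid out as the four magic bytes, one version byte, and one authentication-method byte, the validation could look like this. The class PreambleCheck and its method are hypothetical names for illustration, not part of Hadoop.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Sketch: validate a connection preamble of "hrpc", a version byte, and an auth-method byte. */
final class PreambleCheck {
    static final byte[] MAGIC = "hrpc".getBytes(StandardCharsets.US_ASCII);
    static final byte CURRENT_VERSION = 4;

    /** Returns true when the preamble carries the magic bytes and the expected version. */
    static boolean isValid(ByteBuffer preamble) {
        if (preamble.remaining() < MAGIC.length + 2) {
            return false;                        // not enough bytes for a full preamble
        }
        ByteBuffer view = preamble.duplicate();  // leave the caller's position untouched
        for (byte expected : MAGIC) {
            if (view.get() != expected) {
                return false;                    // wrong magic: not a Hadoop RPC client
            }
        }
        byte version = view.get();
        view.get();                              // auth-method byte, not interpreted in this sketch
        return version == CURRENT_VERSION;
    }

    public static void main(String[] args) {
        // The last byte is an arbitrary stand-in for the auth-method code.
        ByteBuffer current = ByteBuffer.wrap(new byte[] {'h', 'r', 'p', 'c', 4, 80});
        ByteBuffer outdated = ByteBuffer.wrap(new byte[] {'h', 'r', 'p', 'c', 3, 80});
        System.out.println(isValid(current));
        System.out.println(isValid(outdated));
    }
}
```

Rejecting a connection this early is cheap: the server decides after six bytes whether the peer speaks the same protocol version.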
    The class starts with the basics: a version number, the magic bytes used for validation, and so on. Next come its two key inner classes, Connection and Call.

      /** A call queued for handling. */
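      /** The server-side Call queued for handling; it differs from the client-side Call */
      private static class Call {
        //the client's call id, sent up by the client
        private int id;                               // the client's call id
        //the parameter of the call
        private Writable param;                       // the parameter passed
        //the connection to the client is kept here too
        private Connection connection;                // connection to client
        
        //time received while there is no response yet; time served once there is one
        private long timestamp;     // the time received when response is null
                                       // the time served when response is not null
        //the response for this call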
        private ByteBuffer response;                      // the response for this call
        ......
    The fields differ slightly from the client's. The server replies by writing the response into the Call. The other inner class is Connection:

      /** Reads calls from a connection and queues them for handling. */
      public class Connection {
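        //whether the initial RPC header has been read
        private boolean rpcHeaderRead = false; // if initial rpc header is read
        //whether the connection header that follows the version has been read
        private boolean headerRead = false;  //if the connection header that
                                             //follows version is read.
    
        private SocketChannel channel;
        //byte buffers that receive the data read from the channel
        private ByteBuffer data;
        private ByteBuffer dataLengthBuffer;
        //queue of Calls whose responses are waiting to be sent
        private LinkedList<Call> responseQueue;
        //number of RPC requests outstanding on this connection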
        private volatile int rpcCount = 0; // number of outstanding rpcs
        private long lastContact;
        private int dataLength;
        private Socket socket;
        // Cache the remote host & port info so that even if the socket is 
        // disconnected, we can say where it used to connect to.
        private String hostAddress;
        private int remotePort;
        private InetAddress addr;
        .....
    The fields above are self-explanatory. The Server class itself adds a few key fields:

    .....
      volatile private boolean running = true;         // true while server runs
      //blocking queue of Calls waiting to be handled
      private BlockingQueue<Call> callQueue; // queued calls
    
      //list of connections to clients
      private List<Connection> connectionList = 
        Collections.synchronizedList(new LinkedList<Connection>());
      //maintain a list
      //of client connections
      //the server's listener thread
      private Listener listener = null;
      //the thread that sends responses
      private Responder responder = null;
      private int numConnections = 0;
      //the group of threads that handle requests
      private Handler[] handlers = null;
      .....
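    callQueue holds the calls waiting to be handled and connectionList holds the connections. Then there are the three kinds of threads: the Listener, the Handlers, and the Responder. The pending calls sit in a BlockingQueue: an insert has to wait when the queue is full, and a take has to wait when the queue is empty, which is a deliberate choice by the authors. That gives the overall structure of the Server class (drawn as a block diagram in the original post).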
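The waiting behavior of a BlockingQueue is easy to observe in isolation. The sketch below is not Hadoop code. It uses a capacity-1 ArrayBlockingQueue and the timed variants offer and poll, so that the wait is bounded instead of indefinite as it would be with put and take. BlockingQueueDemo and its helper names are made up for this example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Sketch: a full queue makes producers wait, an empty queue makes consumers wait. */
final class BlockingQueueDemo {
    /** Waits up to waitMillis for free space; returns whether the item was enqueued. */
    static boolean tryPut(BlockingQueue<String> queue, String item, long waitMillis) {
        try {
            return queue.offer(item, waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Waits up to waitMillis for an item; returns null when none arrived in time. */
    static String tryTake(BlockingQueue<String> queue, long waitMillis) {
        try {
            return queue.poll(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> callQueue = new ArrayBlockingQueue<>(1);
        System.out.println(tryPut(callQueue, "call-1", 50));  // there is room
        System.out.println(tryPut(callQueue, "call-2", 50));  // full: waits, then gives up
        System.out.println(tryTake(callQueue, 50));           // hands back call-1
        System.out.println(tryTake(callQueue, 50));           // empty: waits, then gives up
    }
}
```

With put and take the second and fourth calls would simply block until another thread made progress, which is the back-pressure the server relies on between readers and handlers.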


    The rest of the analysis follows what the three kinds of threads do. They begin working once the Server's start() method has been called:

      /** Starts the service.  Must be called before any calls will be handled. */
      /** Startup method of the server */
      public synchronized void start() {
        //start the three kinds of threads: responder, listener, and the handler threads
        responder.start();
        listener.start();
        handlers = new Handler[handlerCount];
        
        for (int i = 0; i < handlerCount; i++) {
          handlers[i] = new Handler(i);
          handlers[i].start();
        }
      }
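    Initialization has already happened in the constructor, so start() simply starts the threads. In the natural order of events the Listener acts first: it listens for incoming requests.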

        public Listener() throws IOException {
        	.....
          bind(acceptChannel.socket(), address, backlogLength);
          port = acceptChannel.socket().getLocalPort(); //Could be an ephemeral port
          // create a selector;
          selector= Selector.open();
          readers = new Reader[readThreads];
          readPool = Executors.newFixedThreadPool(readThreads);
          for (int i = 0; i < readThreads; i++) {
            Selector readSelector = Selector.open();
            Reader reader = new Reader(readSelector);
            readers[i] = reader;
            //hand the Reader Runnable to the thread pool for execution
            readPool.execute(reader);
          }
    
          // Register accepts on the server socket with the selector.
          //Java NIO: register interest in accept events on the selector
          acceptChannel.register(selector, SelectionKey.OP_ACCEPT);
          this.setName("IPC Server listener on " + port);
          this.setDaemon(true);
        }
    In its constructor the Listener sets up the Reader thread pool and registers for accept events. Its main loop is:

        @Override
        public void run() {
          LOG.info(getName() + ": starting");
          SERVER.set(Server.this);
          while (running) {
            SelectionKey key = null;
            try {
              selector.select();
              Iterator<SelectionKey> iter = selector.selectedKeys().iterator();
              while (iter.hasNext()) {
                key = iter.next();
                iter.remove();
                try {
                  if (key.isValid()) {
                    if (key.isAcceptable())
                      //the Listener's job is to watch for client connection events
                      doAccept(key);
                  }
                } catch (IOException e) {
                }
                key = null;
                .....
    Before anything can be read, a connection has to be accepted, which leads to doAccept():

        void doAccept(SelectionKey key) throws IOException,  OutOfMemoryError {
          Connection c = null;
          ServerSocketChannel server = (ServerSocketChannel) key.channel();
          SocketChannel channel;
          while ((channel = server.accept()) != null) {
            channel.configureBlocking(false);
            channel.socket().setTcpNoDelay(tcpNoDelay);
            Reader reader = getReader();
            try {
              //once the connection is accepted, register it for read events with NIO
              reader.startAdd();
              SelectionKey readKey = reader.registerChannel(channel);
              c = new Connection(readKey, channel, System.currentTimeMillis());
              readKey.attach(c);
              synchronized (connectionList) {
                connectionList.add(numConnections, c);
                numConnections++;
              }
              ....
    After the accept, the channel is registered with the Reader's selector for read events:
          public synchronized SelectionKey registerChannel(SocketChannel channel)
                                                              throws IOException {
              return channel.register(readSelector, SelectionKey.OP_READ);
          }
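The accept-then-register pattern can be tried end to end on the loopback interface. The following is a simplified sketch, not the Hadoop implementation: it uses one selector for both accept and read events, whereas the server above hands each accepted channel to a separate Reader selector. AcceptThenReadDemo and roundTrip are hypothetical names, and the message is assumed to be non-empty.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

/** Sketch: accept one loopback connection, register it for reads, and read one message. */
final class AcceptThenReadDemo {
    static String roundTrip(String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0)); // ephemeral port
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                client.write(ByteBuffer.wrap(payload));
                for (int round = 0; round < 100; round++) {      // bounded so the sketch cannot hang
                    selector.select(100);
                    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        SelectionKey key = it.next();
                        it.remove();
                        if (key.isAcceptable()) {
                            SocketChannel accepted = server.accept();
                            if (accepted == null) continue;
                            accepted.configureBlocking(false);
                            SelectionKey readKey = accepted.register(selector, SelectionKey.OP_READ);
                            // Per-connection state travels with the key, like the Connection attachment.
                            readKey.attach(ByteBuffer.allocate(payload.length));
                        } else if (key.isReadable()) {
                            ByteBuffer buf = (ByteBuffer) key.attachment();
                            SocketChannel ch = (SocketChannel) key.channel();
                            int n = ch.read(buf);
                            if (n < 0 || !buf.hasRemaining()) {  // peer closed or message complete
                                ch.close();
                                return new String(buf.array(), 0, buf.position(), StandardCharsets.UTF_8);
                            }
                        }
                    }
                }
                throw new IllegalStateException("no data arrived in time");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The attachment on the read key plays the same role as readKey.attach(c) in doAccept(): when a read event fires, the handler recovers the connection state directly from the key.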
    From there the work moves to the Reader's main loop:

          public void run() {
            LOG.info("Starting SocketReader");
            synchronized (this) {
              while (running) {
                SelectionKey key = null;
                try {
                  readSelector.select();
                  while (adding) {
                    this.wait(1000);
                  }              
    
                  Iterator<SelectionKey> iter = readSelector.selectedKeys().iterator();
                  while (iter.hasNext()) {
                    key = iter.next();
                    iter.remove();
                    if (key.isValid()) {
                      if (key.isReadable()) {
                        //the Reader's job is to watch for read events
                        doRead(key);
                      }
                    }
                    key = null;
                  }
                  .....
    This closely mirrors the accept loop. The work happens in doRead():

        void doRead(SelectionKey key) throws InterruptedException {
          int count = 0;
          Connection c = (Connection)key.attachment();
          if (c == null) {
            return;  
          }
          c.setLastContact(System.currentTimeMillis());
          
          try {
            //on a read event for an RPC request, this method is called first
            count = c.readAndProcess();
            ....
        public int readAndProcess() throws IOException, InterruptedException {
          while (true) {
            /* Read at most one RPC. If the header is not read completely yet
             * then iterate until we read first RPC or until there is no data left.
             */    
            int count = -1;
            //first fill dataLengthBuffer (on a new connection it is compared with HEADER below)
            if (dataLengthBuffer.remaining() > 0) {
              count = channelRead(channel, dataLengthBuffer);       
              if (count < 0 || dataLengthBuffer.remaining() > 0) 
                return count;
            }
          
            if (!rpcHeaderRead) {
              //Every connection is expected to send the header.
              if (rpcHeaderBuffer == null) {
                rpcHeaderBuffer = ByteBuffer.allocate(2);
              }
              count = channelRead(channel, rpcHeaderBuffer);
              if (count < 0 || rpcHeaderBuffer.remaining() > 0) {
                return count;
              }
              
              //take the version and the authentication method from the header
              int version = rpcHeaderBuffer.get(0);
              byte[] method = new byte[] {rpcHeaderBuffer.get(1)};
              authMethod = AuthMethod.read(new DataInputStream(
                  new ByteArrayInputStream(method)));
              dataLengthBuffer.flip();    
              //validate here; a connection that fails the check is rejected right away
              if (!HEADER.equals(dataLengthBuffer) || version != CURRENT_VERSION) {
                //Warning is ok since this is not supposed to happen.
                LOG.warn("Incorrect header or version mismatch from " + 
                         hostAddress + ":" + remotePort +
                         " got version " + version + 
                         " expected version " + CURRENT_VERSION);
                return -1;
              }
             ....
            
            //continue reading from the channel into data
            count = channelRead(channel, data);
            
            if (data.remaining() == 0) {
              dataLengthBuffer.clear();
              data.flip();
              if (skipInitialSaslHandshake) {
                data = null;
                skipInitialSaslHandshake = false;
                continue;
              }
              boolean isHeaderRead = headerRead;
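              //handle the data differently depending on whether SASL is in use
              //SASL extends the authentication capabilities of client-server protocols; for simplicity, follow the non-SASL path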
              if (useSasl) {
                saslReadAndProcess(data.array());
              } else {
                processOneRpc(data.array());
              }
              .....
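The dataLengthBuffer and data pair suggests length-prefixed framing: read a fixed-size length first, then read exactly that many payload bytes, however the bytes happen to be split across reads. The part of readAndProcess that sizes data is elided above, so the sketch below is an assumption about the general technique rather than a copy of Hadoop's code. FrameDecoder and feed are hypothetical names, and the prefix is assumed to be a 4-byte big-endian integer.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Sketch of length-prefixed framing: a 4-byte big-endian length, then that many payload bytes. */
final class FrameDecoder {
    private final ByteBuffer lengthBuf = ByteBuffer.allocate(4);
    private ByteBuffer data;                     // allocated once the length is known

    /** Feeds one chunk, as a single read might deliver it, and returns the frames it completes. */
    List<byte[]> feed(byte[] chunk) {
        List<byte[]> frames = new ArrayList<>();
        ByteBuffer in = ByteBuffer.wrap(chunk);
        while (in.hasRemaining()) {
            if (data == null) {
                copy(in, lengthBuf);
                if (lengthBuf.hasRemaining()) {
                    break;                       // length prefix still incomplete
                }
                lengthBuf.flip();
                data = ByteBuffer.allocate(lengthBuf.getInt());
            }
            copy(in, data);
            if (!data.hasRemaining()) {          // payload complete: emit it and reset
                frames.add(data.array());
                data = null;
                lengthBuf.clear();
            }
        }
        return frames;
    }

    /** Moves as many bytes as both buffers allow. */
    private static void copy(ByteBuffer from, ByteBuffer to) {
        while (from.hasRemaining() && to.hasRemaining()) {
            to.put(from.get());
        }
    }
}
```

Because the decoder keeps its position between calls, a request that arrives in several TCP segments is reassembled without blocking the reading thread, which is why readAndProcess can return early and resume on the next read event.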
    That leads to the following method:

        private void processOneRpc(byte[] buf) throws IOException,
            InterruptedException {
          if (headerRead) {
            //the header has been verified, so process the request data
            processData(buf);
          } else {
            //otherwise finish verifying the header: protocol and user information
            processHeader(buf);
            headerRead = true;
            if (!authorizeConnection()) {
              throw new AccessControlException("Connection from " + this
                  + " for protocol " + header.getProtocol()
                  + " is unauthorized for user " + user);
            }
          }
        }
    processData is where the request finally gets processed, after quite a chain of methods.
        private void processData(byte[] buf) throws  IOException, InterruptedException {
          DataInputStream dis =
            new DataInputStream(new ByteArrayInputStream(buf));
          int id = dis.readInt();                    // try to read an id
            
          if (LOG.isDebugEnabled())
            LOG.debug(" got #" + id);
          
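          //instantiate the parameter class (paramClass) via reflection
          Writable param = ReflectionUtils.newInstance(paramClass, conf);//read param
          //read the data into this instance
          param.readFields(dis);        
          
          //build the server-side Call from the id and the parameter
          Call call = new Call(id, param, this);
          //put it on the blocking call queue
          callQueue.put(call);              // queue the call; maybe blocked here
          //increment this connection's count of outstanding RPCs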
          incRpcCount();  // Increment the rpc count
        }
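    The method ends here. Its core job is to turn the parameters of an incoming request into a Call and put it on the blocking queue. That is the work of the Listener and its Reader threads. Next, another kind of thread has something to do, the Handler: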

      /** Handles the queued Calls */
      private class Handler extends Thread {
        public Handler(int instanceNumber) {
          this.setDaemon(true);
          this.setName("IPC Server handler "+ instanceNumber + " on " + port);
        }
    
        @Override
        public void run() {
          LOG.info(getName() + ": starting");
          SERVER.set(Server.this);
          ByteArrayOutputStream buf = 
            new ByteArrayOutputStream(INITIAL_RESP_BUF_SIZE);
          //keep looping for as long as the server is running
          while (running) {
            try {
              //take a call off the queue
              final Call call = callQueue.take(); // pop the queue; maybe blocked here
    
              if (LOG.isDebugEnabled())
                LOG.debug(getName() + ": has #" + call.id + " from " +
                          call.connection);
              
              String errorClass = null;
              String error = null;
              Writable value = null;
    
              //mark this call as the one currently being handled
              CurCall.set(call);
              ....
              CurCall.set(null);
              synchronized (call.connection.responseQueue) {
                // setupResponse() needs to be sync'ed together with 
                // responder.doResponse() since setupResponse may use
                // SASL to encrypt response data and SASL enforces
                // its own message ordering.
                //prepare the response
                setupResponse(buf, call, 
                            (error == null) ? Status.SUCCESS : Status.ERROR, 
                            value, errorClass, error);
              // Discard the large buf and reset it back to 
              // smaller size to freeup heap
              if (buf.size() > maxRespSize) {
                LOG.warn("Large response size " + buf.size() + " for call " + 
                    call.toString());
                  buf = new ByteArrayOutputStream(INITIAL_RESP_BUF_SIZE);
                }
                //hand the call to the responder to write the reply
                responder.doRespond(call);
                ....
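The Handler loop is a classic worker pool around a shared blocking queue. As a rough sketch of that pattern (not the Hadoop code, which also deals with security contexts and error reporting), the example below starts a few daemon threads that take requests from a queue and record a reply for each. HandlerPoolDemo, serve, and the "reply to" format are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Sketch: several handler threads drain one shared call queue. */
final class HandlerPoolDemo {
    /** Handles every request with handlerCount threads and returns the replies, sorted. */
    static List<String> serve(List<String> requests, int handlerCount) {
        BlockingQueue<String> callQueue = new LinkedBlockingQueue<>(requests);
        List<String> responses = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch allHandled = new CountDownLatch(requests.size());

        Thread[] handlers = new Thread[handlerCount];
        for (int i = 0; i < handlerCount; i++) {
            handlers[i] = new Thread(() -> {
                try {
                    while (true) {
                        String call = callQueue.take();   // blocks while the queue is empty
                        responses.add("reply to " + call);
                        allHandled.countDown();
                    }
                } catch (InterruptedException e) {
                    // interrupted at shutdown: leave the loop
                }
            }, "demo handler " + i);
            handlers[i].setDaemon(true);
            handlers[i].start();
        }

        try {
            allHandled.await(5, TimeUnit.SECONDS);        // wait until every call has a reply
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        for (Thread handler : handlers) {
            handler.interrupt();                          // wake handlers parked in take()
        }

        List<String> sorted;
        synchronized (responses) {
            sorted = new ArrayList<>(responses);
        }
        Collections.sort(sorted);                         // completion order is not deterministic
        return sorted;
    }
}
```

The replies are sorted before returning because nothing orders the handlers relative to each other, which is also why the real server tracks responses per connection rather than globally.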
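    The Handler's job is fairly direct: it takes a Call from callQueue, processes it (that part is elided above), and passes it on to the Responder, which writes the reply, so in this excerpt it acts as a relay. One property of the blocking queue is that when callQueue is empty, the thread blocks at callQueue.take() and nothing after that line runs. The rest of the work is handed to the Responder: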

        void doRespond(Call call) throws IOException {
          synchronized (call.connection.responseQueue) {
            call.connection.responseQueue.addLast(call);
            if (call.connection.responseQueue.size() == 1) {
              processResponse(call.connection.responseQueue, true);
            }
          }
        }
    Next, the processResponse method:

        private boolean processResponse(LinkedList<Call> responseQueue,
                                        boolean inHandler) throws IOException {
          boolean error = true;
          boolean done = false;       // there is more data for this channel.
          int numElements = 0;
          Call call = null;
          try {
            synchronized (responseQueue) {
              //
              // If there are no items for this channel, then we are done
              //
              numElements = responseQueue.size();
              if (numElements == 0) {
                error = false;
                return true;              // no more data for this channel.
              }
              //
              // Extract the first call
              //take one call off the list and reply to it
              call = responseQueue.removeFirst();
              SocketChannel channel = call.connection.channel;
              if (LOG.isDebugEnabled()) {
                LOG.debug(getName() + ": responding to #" + call.id + " from " +
                          call.connection);
              }
              //
              // Send as much data as we can in the non-blocking fashion
              //write call.response to the channel
              int numBytes = channelWrite(channel, call.response);
              if (numBytes < 0) {
                return true;
              }
              if (!call.response.hasRemaining()) {
                call.connection.decRpcCount();
                if (numElements == 1) {    // last call fully processes.
                  done = true;             // no more data for this channel.
                } else {
                  done = false;            // more calls pending to be sent.
                }
                if (LOG.isDebugEnabled()) {
                  LOG.debug(getName() + ": responding to #" + call.id + " from " +
                            call.connection + " Wrote " + numBytes + " bytes.");
                }
              } else {
                //
                // If we were unable to write the entire response out, then 
                // insert in Selector queue. 
                //put this call back at the head of the queue
                call.connection.responseQueue.addFirst(call);
                
                if (inHandler) {
                  //inHandler means the reply will be sent later, so the time must be updated
                  // set the serve time when the response has to be sent later
                  //update the timestamp stored in the Call
                  call.timestamp = System.currentTimeMillis();
                  
                  incPending();
                  try {
                    // Wakeup the thread blocked on select, only then can the call 
                    // to channel.register() complete.
                    writeSelector.wakeup();
                    channel.register(writeSelector, SelectionKey.OP_WRITE, call);
                    .....
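The key idea in processResponse is that a non-blocking write may send only part of a response, in which case the call goes back to the head of the queue and waits for the channel to become writable again. The sketch below isolates that logic with a fake channel that accepts a limited number of bytes per write. PartialWriteDemo, ThrottledChannel, and writeHead are hypothetical names, and the selector registration is left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import java.util.LinkedList;

/** Sketch: write the head of a response queue, requeueing it when the write is partial. */
final class PartialWriteDemo {
    /** Test double: accepts at most maxPerWrite bytes per write(), like a nearly full socket buffer. */
    static final class ThrottledChannel implements WritableByteChannel {
        private final int maxPerWrite;
        final ByteArrayOutputStream sink = new ByteArrayOutputStream();

        ThrottledChannel(int maxPerWrite) {
            this.maxPerWrite = maxPerWrite;
        }

        @Override
        public int write(ByteBuffer src) {
            int n = Math.min(maxPerWrite, src.remaining());
            for (int i = 0; i < n; i++) {
                sink.write(src.get());
            }
            return n;
        }

        @Override
        public boolean isOpen() {
            return true;
        }

        @Override
        public void close() {
        }
    }

    /**
     * Attempts one write of the first queued response. Returns true when the queue
     * has nothing left to send, false when more writes are needed.
     */
    static boolean writeHead(LinkedList<ByteBuffer> responseQueue, WritableByteChannel channel) {
        if (responseQueue.isEmpty()) {
            return true;
        }
        int pending = responseQueue.size();
        ByteBuffer response = responseQueue.removeFirst();
        try {
            channel.write(response);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (response.hasRemaining()) {
            responseQueue.addFirst(response);   // partial write: keep it first in line
            return false;
        }
        return pending == 1;                    // done only if that was the last response
    }
}
```

Putting the partly written buffer back at the head, not the tail, matters: the remaining bytes of one response must reach the client before the first byte of the next.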
    The function that actually writes the reply:

      private int channelWrite(WritableByteChannel channel, 
                               ByteBuffer buffer) throws IOException {
        
        //write the data in the buffer (call.response) to the channel
        int count =  (buffer.remaining() <= NIO_BUFFER_LIMIT) ?
                     channel.write(buffer) : channelIO(null, channel, buffer);
        if (count > 0) {
          rpcMetrics.incrSentBytes(count);
        }
        return count;
      }
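    Here buffer is the call.response argument. Replies still to be written are kept in the response queue of the Connection class, since one connection may have many replies to send:
    //queue of Calls awaiting a reply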
        private LinkedList<Call> responseQueue;
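    That is the Responder's work. The three kinds of threads cooperate around the central callQueue, which the original post illustrated with a protocol diagram.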


    The original post also included a sequence diagram of the call flow between the functions.


    Hadoop RPC also has a helper class, RPC, which is used to obtain server and client instances:

      /** Construct a server for a protocol implementation instance listening on a
       * port and address, with a secret manager. */
      /** Obtains a server instance */
      public static Server getServer(final Object instance, final String bindAddress, final int port,
                                     final int numHandlers,
                                     final boolean verbose, Configuration conf,
                                     SecretManager<? extends TokenIdentifier> secretManager) 
        throws IOException {
        return new Server(instance, conf, bindAddress, port, numHandlers, verbose, secretManager);
      }
    On the client side there is a cache:

        /**
         * Construct & cache an IPC client with the user-provided SocketFactory 
         * if no cached client exists.
         *  Takes the client from the cache, creating one if none exists
         * @param conf Configuration
         * @return an IPC client
         */
        private synchronized Client getClient(Configuration conf,
            SocketFactory factory) {
          // Construct & cache client.  The configuration is only used for timeout,
          // and Clients have connection pools.  So we can either (a) lose some
          // connection pooling and leak sockets, or (b) use the same timeout for all
          // configurations.  Since the IPC is usually intended globally, not
          // per-job, we choose (a).
          Client client = clients.get(factory);
          if (client == null) {
            client = new Client(ObjectWritable.class, conf, factory);
            clients.put(factory, client);
          } else {
            client.incCount();
          }
          return client;
        }
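getClient combines a cache with a reference count: a miss creates and stores the client, a hit increments its count. A generic sketch of that idea follows. The release side is not shown in the excerpt above, so the release method here is an assumption about how such a cache is typically completed, and RefCountedCache is a hypothetical class, not a Hadoop type.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Sketch: a cache whose entries are shared and evicted when the last user releases them. */
final class RefCountedCache<K, V> {
    private static final class Entry<V> {
        final V value;
        int refCount = 1;                 // the creator holds the first reference

        Entry(V value) {
            this.value = value;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();

    /** Returns the cached value for key, creating it on a miss and counting the new user on a hit. */
    synchronized V acquire(K key, Supplier<V> creator) {
        Entry<V> entry = entries.get(key);
        if (entry == null) {
            entry = new Entry<>(creator.get());
            entries.put(key, entry);
        } else {
            entry.refCount++;
        }
        return entry.value;
    }

    /** Drops one reference; returns true when it was the last one and the entry was evicted. */
    synchronized boolean release(K key) {
        Entry<V> entry = entries.get(key);
        if (entry == null) {
            return false;
        }
        if (--entry.refCount > 0) {
            return false;
        }
        entries.remove(key);
        return true;
    }

    /** Current number of users for key, or 0 when nothing is cached. */
    synchronized int refCount(K key) {
        Entry<V> entry = entries.get(key);
        return entry == null ? 0 : entry.refCount;
    }
}
```

Keying the cache by SocketFactory, as getClient does, means callers that share a factory also share connections, while the count tells the cache when the shared client can finally be shut down.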

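    That covers the main flow of the Hadoop RPC server; plenty of detail has admittedly been left out. The overall Hadoop RPC structure is quite complex: it is built on Java NIO and makes heavy use of dynamic proxies and reflection.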
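For readers unfamiliar with dynamic proxies, here is a small illustration of the mechanism using java.lang.reflect.Proxy. It shows only the general technique: every call on the proxy is routed to one InvocationHandler, which in an RPC client would serialize the method name and arguments and send them to the server. EchoProtocol and ProxyDemo are invented for this sketch, and the handler just returns a description of the call instead of touching the network.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Arrays;

/** Sketch: a dynamic proxy that intercepts interface calls, as an RPC client stub would. */
final class ProxyDemo {
    /** Hypothetical protocol interface standing in for a real RPC protocol. */
    interface EchoProtocol {
        String echo(String text);
    }

    /** Builds a proxy whose calls are described as text instead of being sent over the network. */
    static EchoProtocol newProxy() {
        InvocationHandler handler = (proxy, method, args) ->
            "invoke " + method.getName() + Arrays.toString(args);
        return (EchoProtocol) Proxy.newProxyInstance(
            EchoProtocol.class.getClassLoader(),
            new Class<?>[] {EchoProtocol.class},
            handler);
    }

    public static void main(String[] args) {
        System.out.println(newProxy().echo("hi"));   // prints "invoke echo[hi]"
    }
}
```

The caller only ever sees the EchoProtocol interface, which is what lets an RPC framework make a remote call look like a local method call.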

  • Original article: https://www.cnblogs.com/bianqi/p/12184131.html