Spark Source Code Analysis, Part 10: Spark RPC Internals: TransportResponseHandler, TransportRequestHandler, and TransportChannelHandler
Analysis of TransportResponseHandler
First, the class description:
Handler that processes server responses, in response to requests issued from a [[TransportClient]]. It works by tracking the list of outstanding requests (and their callbacks). Concurrency: thread safe and can be called from multiple threads.
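Its key member fields are:
1. channel: the SocketChannel object bound to this handler.
2. outstandingFetches: a ConcurrentHashMap that maps each StreamChunkId to its ChunkReceivedCallback.
3. outstandingRpcs: a ConcurrentHashMap that maps each request id to its RpcResponseCallback.
4. streamCallbacks: a ConcurrentLinkedQueue of Pair<String, StreamCallback>, where the String is the stream id.
5. timeOfLastRequestNs: the system time of the most recent RPC request or chunk fetch, in nanoseconds.
Its key method is handle, which looks up the callback registered for each incoming response and invokes it.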
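The bookkeeping described above can be illustrated with a minimal sketch. This is not Spark's code: the Callback interface and the Success and Failure message types are hypothetical stand-ins for RpcResponseCallback, RpcResponse, and RpcFailure, and only the outstandingRpcs map and timeOfLastRequestNs are modeled.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Simplified, hypothetical stand-in for the RPC bookkeeping in TransportResponseHandler.
class ResponseDispatchSketch {
  // Hypothetical callback shape, loosely modeled on RpcResponseCallback.
  interface Callback {
    void onSuccess(String body);
    void onFailure(Throwable cause);
  }

  // Hypothetical message types standing in for RpcResponse and RpcFailure.
  static final class Success {
    final long requestId;
    final String body;
    Success(long requestId, String body) { this.requestId = requestId; this.body = body; }
  }

  static final class Failure {
    final long requestId;
    final String error;
    Failure(long requestId, String error) { this.requestId = requestId; this.error = error; }
  }

  // request id -> callback, safe to use from multiple threads.
  private final Map<Long, Callback> outstandingRpcs = new ConcurrentHashMap<>();
  private final AtomicLong timeOfLastRequestNs = new AtomicLong(0);

  // Called on the sending side before the request goes out.
  void addRpcRequest(long requestId, Callback callback) {
    timeOfLastRequestNs.set(System.nanoTime());
    outstandingRpcs.put(requestId, callback);
  }

  int numOutstandingRequests() {
    return outstandingRpcs.size();
  }

  // Dispatches a response to its callback; returns false if no callback was waiting.
  boolean handle(Object message) {
    if (message instanceof Success) {
      Success s = (Success) message;
      Callback cb = outstandingRpcs.remove(s.requestId);
      if (cb == null) {
        return false;
      }
      cb.onSuccess(s.body);
      return true;
    } else if (message instanceof Failure) {
      Failure f = (Failure) message;
      Callback cb = outstandingRpcs.remove(f.requestId);
      if (cb == null) {
        return false;
      }
      cb.onFailure(new RuntimeException(f.error));
      return true;
    }
    return false;
  }

  public static void main(String[] args) {
    ResponseDispatchSketch handler = new ResponseDispatchSketch();
    handler.addRpcRequest(1L, new Callback() {
      @Override public void onSuccess(String body) { System.out.println("success: " + body); }
      @Override public void onFailure(Throwable cause) { System.out.println("failure: " + cause); }
    });
    handler.handle(new Success(1L, "pong"));
  }
}
```

Removing the entry before invoking the callback means a duplicate response for the same request id finds nothing to call, which is one simple way to keep each callback single-shot.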
Analysis of TransportRequestHandler
The class description is as follows:
A handler that processes requests from clients and writes chunk data back. Each handler is attached to a single Netty channel, and keeps track of which streams have been fetched via this channel, in order to clean them up if the channel is terminated (see #channelUnregistered). The messages should have been processed by the pipeline setup by TransportServer.
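In other words, it is a handler that processes requests from clients and returns chunks to them. Each handler is associated with one Netty channel and tracks which streams have been fetched over that channel. Incoming messages should already have been processed by the pipeline that TransportServer sets up.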
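Its member variables are:
1. channel: a Channel object, the SocketChannel associated with this handler.
2. reverseClient: a TransportClient on the same channel, which makes it possible to communicate back to the requester of a message.
3. rpcHandler: an RpcHandler object that handles all RPC messages.
4. streamManager: a StreamManager object that returns the individual chunks of a stream.
5. maxChunksBeingTransferred: the maximum number of chunks that may be in transfer, and not yet finished, at the same time.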
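Its key method is handle, which dispatches on the type of the incoming request.
We look at only one branch as an example, the one that handles an RPC request.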
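It calls the receive method of rpcHandler. When processing finishes, the result is an RpcResponse object on success and an RpcFailure object otherwise. Because this result may have to travel across the network, the sending step is further wrapped in a respond method.
That is, the respond method sends the server-side result of the request back to the client.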
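The success and failure paths of this branch can be sketched as follows. This is a simplified illustration rather than Spark's implementation: the Callback and RpcHandler interfaces are reduced, hypothetical versions, messages are plain strings instead of byte buffers, and the written list is an assumed stand-in for writing to the Netty channel.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical stand-in for the RPC branch of TransportRequestHandler.
class RpcBranchSketch {
  // Hypothetical, reduced versions of the callback and handler interfaces.
  interface Callback {
    void onSuccess(String response);
    void onFailure(Throwable e);
  }

  interface RpcHandler {
    void receive(String message, Callback callback) throws Exception;
  }

  // Hypothetical reply types standing in for RpcResponse and RpcFailure.
  static final class RpcResponse {
    final long requestId;
    final String body;
    RpcResponse(long requestId, String body) { this.requestId = requestId; this.body = body; }
  }

  static final class RpcFailure {
    final long requestId;
    final String errorString;
    RpcFailure(long requestId, String errorString) {
      this.requestId = requestId;
      this.errorString = errorString;
    }
  }

  private final RpcHandler rpcHandler;

  // Assumption: collects replies in memory where the real handler would write to the channel.
  final List<Object> written = new ArrayList<>();

  RpcBranchSketch(RpcHandler rpcHandler) {
    this.rpcHandler = rpcHandler;
  }

  void processRpcRequest(long requestId, String message) {
    try {
      rpcHandler.receive(message, new Callback() {
        @Override public void onSuccess(String response) {
          respond(new RpcResponse(requestId, response));
        }
        @Override public void onFailure(Throwable e) {
          respond(new RpcFailure(requestId, String.valueOf(e.getMessage())));
        }
      });
    } catch (Exception e) {
      // A handler that throws is reported to the client as a failure as well.
      respond(new RpcFailure(requestId, String.valueOf(e.getMessage())));
    }
  }

  // Single exit point for everything sent back to the requester.
  private void respond(Object result) {
    written.add(result);
  }

  public static void main(String[] args) {
    RpcBranchSketch handler = new RpcBranchSketch((msg, cb) -> cb.onSuccess("echo " + msg));
    handler.processRpcRequest(1L, "hello");
    System.out.println(handler.written.size() + " reply written");
  }
}
```

Routing both outcomes through one respond method keeps the network write, and any error handling around it, in a single place.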
Analysis of TransportChannelHandler
The class description is as follows:
The single Transport-level Channel handler which is used for delegating requests to the TransportRequestHandler and responses to the TransportResponseHandler. All channels created in the transport layer are bidirectional. When the Client initiates a Netty Channel with a RequestMessage (which gets handled by the Server's RequestHandler), the Server will produce a ResponseMessage (handled by the Client's ResponseHandler). However, the Server also gets a handle on the same Channel, so it may then begin to send RequestMessages to the Client. This means that the Client also needs a RequestHandler and the Server needs a ResponseHandler, for the Client's responses to the Server's requests. This class also handles timeouts from a io.netty.handler.timeout.IdleStateHandler. We consider a connection timed out if there are outstanding fetch or RPC requests but no traffic on the channel for at least `requestTimeoutMs`. Note that this is duplex traffic; we will not timeout if the client is continuously sending but getting no responses, for simplicity.
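In other words, this is the transport-layer handler that delegates requests to TransportRequestHandler and responses to TransportResponseHandler.
Because every channel created in the transport layer is bidirectional, the server can use the same Channel to send RequestMessages of its own once the client has opened it, so both ends need a RequestHandler and a ResponseHandler. The class also handles idle events from io.netty.handler.timeout.IdleStateHandler: a connection counts as timed out when fetch or RPC requests are outstanding and the channel has carried no traffic for at least requestTimeoutMs. Because traffic in either direction counts, a client that keeps sending without receiving any response is not timed out.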
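The timeout rule in the class description can be written as a small predicate. The sketch below models only what that description states; the method name shouldClose and its parameters are hypothetical, and the channelIdle flag is an assumed stand-in for the idle event that IdleStateHandler would deliver.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the timeout rule from the TransportChannelHandler class description.
class IdleTimeoutSketch {
  /**
   * Decides whether a connection should be treated as timed out.
   *
   * @param nowNs                  current time in nanoseconds
   * @param timeOfLastRequestNs    time of the last RPC request or chunk fetch, in nanoseconds
   * @param requestTimeoutMs       configured timeout in milliseconds
   * @param numOutstandingRequests fetch or RPC requests still waiting for a response
   * @param channelIdle            true if no traffic was seen in either direction
   */
  static boolean shouldClose(long nowNs,
                             long timeOfLastRequestNs,
                             long requestTimeoutMs,
                             int numOutstandingRequests,
                             boolean channelIdle) {
    long requestTimeoutNs = TimeUnit.MILLISECONDS.toNanos(requestTimeoutMs);
    boolean overdue = nowNs - timeOfLastRequestNs > requestTimeoutNs;
    // Time out only when the channel is idle, the last request is overdue,
    // and something is actually still waiting for a reply.
    return channelIdle && overdue && numOutstandingRequests > 0;
  }

  public static void main(String[] args) {
    long now = System.nanoTime();
    long threeSecondsAgo = now - TimeUnit.SECONDS.toNanos(3);
    System.out.println(shouldClose(now, threeSecondsAgo, 2000, 1, true));
  }
}
```

Comparing against the time of the last request, rather than relying on the idle event alone, avoids closing a connection whose request was issued just before the idle event fired.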
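The key method is channelRead.
It delegates requests to TransportRequestHandler and responses to TransportResponseHandler.
Because this handler is ultimately added to the channel's pipeline, every message that arrives on the channel, whether a request or a response, triggers this method, which in turn calls the corresponding handler.
In short, Spark RPC sends requests and receives responses over Netty channels.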
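The delegation performed by channelRead can be sketched without Netty as a dispatch on message type. Everything here is a hypothetical stand-in: the RequestMessage and ResponseMessage marker interfaces mirror the names in the class description, Ping and Pong are invented message types, and the log list replaces the two real handlers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, Netty-free sketch of the delegation done in channelRead.
class ChannelReadSketch {
  // Marker interfaces named after the message kinds in the class description.
  interface RequestMessage {}
  interface ResponseMessage {}

  // Invented example messages.
  static final class Ping implements RequestMessage {}
  static final class Pong implements ResponseMessage {}

  // Assumption: records which handler would have been called.
  final List<String> log = new ArrayList<>();

  void channelRead(Object msg) {
    if (msg instanceof RequestMessage) {
      // The peer is asking us for something: the request handler takes over.
      log.add("requestHandler");
    } else if (msg instanceof ResponseMessage) {
      // The peer is answering one of our requests: the response handler takes over.
      log.add("responseHandler");
    } else {
      // A real Netty handler would pass the message on with ctx.fireChannelRead(msg).
      log.add("passedOn");
    }
  }

  public static void main(String[] args) {
    ChannelReadSketch handler = new ChannelReadSketch();
    handler.channelRead(new Ping());
    handler.channelRead(new Pong());
    System.out.println(handler.log);
  }
}
```

Since both ends of a bidirectional channel run the same dispatch, one handler class serves the client and the server alike.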