• Reading the Spark source code: network (2)


In the previous part we saw that Spark's network code makes heavy use of Netty's buffer APIs. This part looks at some of Netty's core APIs, starting with Channel.

In Netty, a Channel is the carrier of communication (a network socket, or a component capable of I/O), and ChannelHandlers implement the logic that runs on the Channel. A Channel supports reads, writes, binding a local port, connecting to a remote address, and so on. Every operation on a Channel is asynchronous: an I/O operation immediately returns a ChannelFuture, on which you can register what to do when the operation succeeds, fails, or is cancelled. With that background we can read the client-side source code.
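Before that, for readers less familiar with Netty, here is a minimal, self-contained sketch of this asynchronous model (the host, port, and class names are placeholders, not Spark code): connect() returns a ChannelFuture right away, and a listener observes success, failure, or cancellation.

    import io.netty.bootstrap.Bootstrap;
    import io.netty.channel.ChannelFuture;
    import io.netty.channel.ChannelFutureListener;
    import io.netty.channel.ChannelInitializer;
    import io.netty.channel.EventLoopGroup;
    import io.netty.channel.nio.NioEventLoopGroup;
    import io.netty.channel.socket.SocketChannel;
    import io.netty.channel.socket.nio.NioSocketChannel;

    public class NettyAsyncSketch {
      public static void main(String[] args) {
        final EventLoopGroup group = new NioEventLoopGroup();
        Bootstrap b = new Bootstrap()
            .group(group)
            .channel(NioSocketChannel.class)
            .handler(new ChannelInitializer<SocketChannel>() {
              @Override
              protected void initChannel(SocketChannel ch) {
                // ChannelHandlers for this channel's pipeline would be added here.
              }
            });

        ChannelFuture cf = b.connect("example.com", 8080);  // returns immediately
        cf.addListener(new ChannelFutureListener() {
          @Override
          public void operationComplete(ChannelFuture future) {
            if (future.isSuccess()) {
              System.out.println("connected: " + future.channel());
            } else if (future.isCancelled()) {
              System.out.println("connect cancelled");
            } else {
              System.out.println("connect failed: " + future.cause());
            }
            group.shutdownGracefully();
          }
        });
      }
    }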
     
TransportClientFactory is a factory that creates TransportClient instances. It maintains a connection pool for every remote address, so requests for the same host reuse the same TransportClients, and all TransportClients share a single EventLoopGroup that processes the events on their channels.
    private static class ClientPool {
      TransportClient[] clients;
      Object[] locks;

      public ClientPool(int size) {
        clients = new TransportClient[size];
        locks = new Object[size];
        for (int i = 0; i < size; i++) {
          locks[i] = new Object();
        }
      }
    }
ClientPool is a connection pool; each remote address has its own, and its size is set by spark.shuffle.io.numConnectionsPerPeer. How is it used? The caller passes in an address, which serves as the key into connectionPool to find (or lazily create) the pool for that address; a random slot of that pool is then picked, and that is where the per-slot locks come into play. createClient below implements this lookup:
    public TransportClient createClient(String remoteHost, int remotePort) throws IOException {
      // Get connection from the connection pool first.
      // If it is not found or not active, create a new one.
      final InetSocketAddress address = new InetSocketAddress(remoteHost, remotePort);

      // Create the ClientPool if we don't have it yet.
      ClientPool clientPool = connectionPool.get(address);
      if (clientPool == null) {
        connectionPool.putIfAbsent(address, new ClientPool(numConnectionsPerPeer));
        clientPool = connectionPool.get(address);
      }

      int clientIndex = rand.nextInt(numConnectionsPerPeer);
      TransportClient cachedClient = clientPool.clients[clientIndex];

      if (cachedClient != null && cachedClient.isActive()) {
        logger.trace("Returning cached connection to {}: {}", address, cachedClient);
        return cachedClient;
      }

      // If we reach here, we don't have an existing connection open. Let's create a new one.
      // Multiple threads might race here to create new connections. Keep only one of them active.
      synchronized (clientPool.locks[clientIndex]) {
        cachedClient = clientPool.clients[clientIndex];

        if (cachedClient != null) {
          if (cachedClient.isActive()) {
            logger.trace("Returning cached connection to {}: {}", address, cachedClient);
            return cachedClient;
          } else {
            logger.info("Found inactive connection to {}, creating a new one.", address);
          }
        }
        clientPool.clients[clientIndex] = createClient(address);
        return clientPool.clients[clientIndex];
      }
    }
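As an aside, the pool size used above comes from spark.shuffle.io.numConnectionsPerPeer, which defaults to 1. A minimal, purely illustrative sketch of raising it through a standard SparkConf:

    import org.apache.spark.SparkConf;

    public class PoolSizeTuning {
      public static void main(String[] args) {
        // Hypothetical tuning: allow two pooled connections per remote peer instead of one.
        SparkConf conf = new SparkConf()
            .set("spark.shuffle.io.numConnectionsPerPeer", "2");
        System.out.println(conf.get("spark.shuffle.io.numConnectionsPerPeer"));
      }
    }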
How is a brand-new TransportClient created? This is where Netty's Bootstrap class comes in. Most of the work is configuring the bootstrap; its buffer allocator is a pooled allocator. Calling bootstrap.handler(...) installs a ChannelInitializer: once the connection is established, its initChannel callback is invoked with the connected SocketChannel, and TransportContext.initializePipeline is used to initialize the channel, i.e. to add the handlers to its pipeline. Inside that callback we obtain both the TransportClient and the Channel.

Why are they kept in AtomicReferences? My understanding: an anonymous inner class can only reference final local variables of the enclosing method, and a final local can only be assigned once, so the callback cannot assign to plain locals; without a mutable holder such as AtomicReference the inner class would have no way to hand those objects back to the enclosing method.
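A tiny self-contained illustration of that rule (the names here are made up, not Spark's): the captured local must be final and so cannot be reassigned inside the callback, but the callback may write into the final AtomicReference it holds.

    import java.util.concurrent.atomic.AtomicReference;

    public class AtomicReferenceSketch {
      public static void main(String[] args) {
        final AtomicReference<String> resultRef = new AtomicReference<String>();

        Runnable callback = new Runnable() {
          @Override
          public void run() {
            // resultRef itself is never reassigned; only its contents change.
            resultRef.set("value produced inside the callback");
          }
        };

        callback.run();
        System.out.println(resultRef.get());
      }
    }

Back in the factory, here is how the connection is actually created: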
    public TransportClient createUnmanagedClient(String remoteHost, int remotePort)
        throws IOException {
      final InetSocketAddress address = new InetSocketAddress(remoteHost, remotePort);
      return createClient(address);
    }

    /** Create a completely new {@link TransportClient} to the remote address. */
    private TransportClient createClient(InetSocketAddress address) throws IOException {
      logger.debug("Creating new connection to " + address);

      Bootstrap bootstrap = new Bootstrap();
      bootstrap.group(workerGroup)
        .channel(socketChannelClass)
        // Disable Nagle's Algorithm since we don't want packets to wait
        .option(ChannelOption.TCP_NODELAY, true)
        .option(ChannelOption.SO_KEEPALIVE, true)
        .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, conf.connectionTimeoutMs())
        .option(ChannelOption.ALLOCATOR, pooledAllocator);

      final AtomicReference<TransportClient> clientRef = new AtomicReference<TransportClient>();
      final AtomicReference<Channel> channelRef = new AtomicReference<Channel>();

      bootstrap.handler(new ChannelInitializer<SocketChannel>() {
        @Override
        public void initChannel(SocketChannel ch) {
          TransportChannelHandler clientHandler = context.initializePipeline(ch);
          clientRef.set(clientHandler.getClient());
          channelRef.set(ch);
        }
      });

      // Connect to the remote server
      long preConnect = System.nanoTime();
      ChannelFuture cf = bootstrap.connect(address);
      if (!cf.awaitUninterruptibly(conf.connectionTimeoutMs())) {
        throw new IOException(
          String.format("Connecting to %s timed out (%s ms)", address, conf.connectionTimeoutMs()));
      } else if (cf.cause() != null) {
        throw new IOException(String.format("Failed to connect to %s", address), cf.cause());
      }

      TransportClient client = clientRef.get();
      Channel channel = channelRef.get();
      assert client != null : "Channel future completed successfully with null client";

      // Execute any client bootstraps synchronously before marking the Client as successful.
      long preBootstrap = System.nanoTime();
      logger.debug("Connection to {} successful, running bootstraps...", address);
      try {
        for (TransportClientBootstrap clientBootstrap : clientBootstraps) {
          clientBootstrap.doBootstrap(client, channel);
        }
      } catch (Exception e) { // catch non-RuntimeExceptions too as bootstrap may be written in Scala
        long bootstrapTimeMs = (System.nanoTime() - preBootstrap) / 1000000;
        logger.error("Exception while bootstrapping client after " + bootstrapTimeMs + " ms", e);
        client.close();
        throw Throwables.propagate(e);
      }
      long postBootstrap = System.nanoTime();

      logger.debug("Successfully created connection to {} after {} ms ({} ms spent in bootstraps)",
        address, (postBootstrap - preConnect) / 1000000, (postBootstrap - preBootstrap) / 1000000);

      return client;
    }
     
Now look at TransportClient. It serves two purposes: fetching data and sending requests. Fetching retrieves consecutive chunks of a pre-negotiated stream, with the data broken into chunks (hundreds of KB to a few MB each) for efficient transfer. Setting up the stream itself is not done by the transport layer; the client negotiates it out of band by sending an RPC via sendRPC. A typical flow looks like this:

    client.sendRPC(new OpenFile("/foo"))            // returns StreamId = 100
    client.fetchChunk(streamId = 100, chunkIndex = 0, callback)
    client.fetchChunk(streamId = 100, chunkIndex = 1, callback)
    client.sendRPC(new CloseStream(100))

A single TransportClient may be used for multiple streams, but any given stream must be bound to a single client so that responses do not arrive out of order.

A client has three member fields: the channel, used for writes, i.e. sending requests to the server; a TransportResponseHandler, which handles the server's responses; and a clientId that identifies the client.
    private final Channel channel;
    private final TransportResponseHandler handler;
    @Nullable private String clientId;
The client has three request methods: one fetches a single chunk of a stream, one requests an entire stream (both carry data), and one sends a control RPC. It is a bit like FTP: one path for control, another for data.
    public void fetchChunk(
        long streamId,
        final int chunkIndex,
        final ChunkReceivedCallback callback) {
      final String serverAddr = NettyUtils.getRemoteAddress(channel);
      final long startTime = System.currentTimeMillis();
      logger.debug("Sending fetch chunk request {} to {}", chunkIndex, serverAddr);

      final StreamChunkId streamChunkId = new StreamChunkId(streamId, chunkIndex);
      handler.addFetchRequest(streamChunkId, callback);

      channel.writeAndFlush(new ChunkFetchRequest(streamChunkId)).addListener(
        new ChannelFutureListener() {
          @Override
          public void operationComplete(ChannelFuture future) throws Exception {
            if (future.isSuccess()) {
              long timeTaken = System.currentTimeMillis() - startTime;
              logger.trace("Sending request {} to {} took {} ms", streamChunkId, serverAddr,
                timeTaken);
            } else {
              String errorMsg = String.format("Failed to send request %s to %s: %s", streamChunkId,
                serverAddr, future.cause());
              logger.error(errorMsg, future.cause());
              handler.removeFetchRequest(streamChunkId);
              channel.close();
              try {
                callback.onFailure(chunkIndex, new IOException(errorMsg, future.cause()));
              } catch (Exception e) {
                logger.error("Uncaught exception in RPC response callback handler!", e);
              }
            }
          }
        });
    }
The callback has two methods, and it is worth explaining how each gets invoked. onFailure is called when the channel's I/O operation fails, i.e. when the ChannelFuture (the result of the I/O operation) completes unsuccessfully. onSuccess is called from the channel's event-processing path: context.initializePipeline(ch) registered a TransportChannelHandler on the channel, which contains a TransportResponseHandler and forwards server responses to it; handler.addFetchRequest(streamChunkId, callback) records which callback belongs to which outstanding request, so when the corresponding response arrives, the matching callback is invoked.

The object passed to channel.writeAndFlush must implement the Encodable interface; its methods are used by MessageEncoder when serializing outbound messages, with MessageDecoder performing the inverse on the receiving side.
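For illustration, a minimal sketch of the Encodable side of a message (the Ping type and its field are made up; the two interface methods are the real ones): MessageEncoder uses encodedLength() to size the outgoing frame and encode() to write the body into the ByteBuf.

    import io.netty.buffer.ByteBuf;
    import org.apache.spark.network.protocol.Encodable;

    public class Ping implements Encodable {
      private final long nonce;

      public Ping(long nonce) {
        this.nonce = nonce;
      }

      @Override
      public int encodedLength() {
        return 8;              // one long
      }

      @Override
      public void encode(ByteBuf buf) {
        buf.writeLong(nonce);
      }
    }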
     
sendRpc works the same way as the method above, so it is not covered here; let's look at the stream method instead.
    public void stream(final String streamId, final StreamCallback callback) {
      final String serverAddr = NettyUtils.getRemoteAddress(channel);
      final long startTime = System.currentTimeMillis();
      logger.debug("Sending stream request for {} to {}", streamId, serverAddr);

      // Need to synchronize here so that the callback is added to the queue and the RPC is
      // written to the socket atomically, so that callbacks are called in the right order
      // when responses arrive.
      synchronized (this) {
        handler.addStreamCallback(callback);
        channel.writeAndFlush(new StreamRequest(streamId)).addListener(
          new ChannelFutureListener() {
            @Override
            public void operationComplete(ChannelFuture future) throws Exception {
              if (future.isSuccess()) {
                long timeTaken = System.currentTimeMillis() - startTime;
                logger.trace("Sending request for {} to {} took {} ms", streamId, serverAddr,
                  timeTaken);
              } else {
                String errorMsg = String.format("Failed to send request for %s to %s: %s", streamId,
                  serverAddr, future.cause());
                logger.error(errorMsg, future.cause());
                channel.close();
                try {
                  callback.onFailure(streamId, new IOException(errorMsg, future.cause()));
                } catch (Exception e) {
                  logger.error("Uncaught exception in RPC response callback handler!", e);
                }
              }
            }
          });
      }
    }
A synchronized block is used here so that adding the callback to the queue and writing the request happen atomically, keeping the callback order identical to the request order. One thing that puzzled me: if two requests are sent back to back, couldn't the second response come back before the first, breaking the ordering? A likely answer is that both requests go out over the same TCP channel and the server processes a channel's requests in order, so the responses also come back in request order, which is what makes first-in, first-out matching of callbacks safe.
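A minimal sketch of that FIFO idea (this is not the actual TransportResponseHandler, just an illustration under the assumption that the server answers a channel's stream requests in arrival order):

    import java.nio.ByteBuffer;
    import java.util.Queue;
    import java.util.concurrent.ConcurrentLinkedQueue;

    public class FifoStreamCallbacks {

      /** Made-up callback type standing in for Spark's StreamCallback. */
      interface Callback {
        void onData(String streamId, ByteBuffer buf);
        void onComplete(String streamId);
      }

      private final Queue<Callback> pending = new ConcurrentLinkedQueue<Callback>();

      /** Called (under the same lock as the write) when a stream request is sent. */
      public void requestSent(Callback cb) {
        pending.add(cb);
      }

      /** Called when a stream response arrives: it must belong to the oldest request. */
      public void responseReceived(String streamId, ByteBuffer data) {
        Callback cb = pending.poll();
        if (cb != null) {
          cb.onData(streamId, data);
          cb.onComplete(streamId);
        }
      }
    }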
     
sendRpcSync is an interesting method; it is a nice example of using a Future to turn the asynchronous API into a blocking call.
    public byte[] sendRpcSync(byte[] message, long timeoutMs) {
      final SettableFuture<byte[]> result = SettableFuture.create();

      sendRpc(message, new RpcResponseCallback() {
        @Override
        public void onSuccess(byte[] response) {
          result.set(response);
        }

        @Override
        public void onFailure(Throwable e) {
          result.setException(e);
        }
      });

      try {
        return result.get(timeoutMs, TimeUnit.MILLISECONDS);
      } catch (ExecutionException e) {
        throw Throwables.propagate(e.getCause());
      } catch (Exception e) {
        throw Throwables.propagate(e);
      }
    }
An anonymous inner class can only access final variables of the enclosing method, so to get the result out of the callback a SettableFuture is used: the callback fills it in asynchronously, and the caller blocks on get() with a timeout.
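The same pattern works for any callback-style API. A generic, self-contained sketch (the AsyncApi and Callback interfaces below are made up for illustration; only Guava's SettableFuture is real):

    import java.util.concurrent.TimeUnit;
    import com.google.common.util.concurrent.SettableFuture;

    public class CallbackToBlocking {

      interface AsyncApi {
        void call(String request, Callback cb);
      }

      interface Callback {
        void onSuccess(String response);
        void onFailure(Throwable t);
      }

      static String callSync(AsyncApi api, String request, long timeoutMs) throws Exception {
        final SettableFuture<String> result = SettableFuture.create();
        api.call(request, new Callback() {
          @Override public void onSuccess(String response) { result.set(response); }
          @Override public void onFailure(Throwable t) { result.setException(t); }
        });
        // Blocks the caller until the callback fires or the timeout expires.
        return result.get(timeoutMs, TimeUnit.MILLISECONDS);
      }
    }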
     
• Original article: https://www.cnblogs.com/gaoxing/p/4985559.html