This article is published on the NetEase Cloud Community with the authorization of its author, Zhao Jigang.
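Dubbo's heartbeat mechanism:
Purpose: detect whether the connection between provider and consumer is still alive, and react accordingly once it has been broken.
How it works:
provider: if no message is received within heartbeat (60s by default), Dubbo sends a heartbeat message; if no heartbeat response arrives for 3 consecutive periods (180s), the provider closes the channel.
consumer: if no message is received within 60s (the default), Dubbo sends a heartbeat message; if no heartbeat response arrives for 3 consecutive periods (180s), the consumer reconnects.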
Let's follow the call chain in the source code, starting with the provider side.
1. Heartbeat mechanism on the provider side
    -->openServer(URL url)
       url: dubbo://10.10.10.10:20880/com.alibaba.dubbo.demo.DemoService?anyhost=true&application=demo-provider&bind.ip=10.10.10.10&bind.port=20880&default.server=netty4&dubbo=2.0.0&generic=false&interface=com.alibaba.dubbo.demo.DemoService&methods=sayHello&pid=21999&qos.port=22222&side=provider&timestamp=1520660491836
      -->createServer(URL url)
        -->HeaderExchanger.bind(URL url, ExchangeHandler handler)
           url: dubbo://10.10.10.10:20880/com.alibaba.dubbo.demo.DemoService?anyhost=true&application=demo-provider&bind.ip=10.10.10.10&bind.port=20880&channel.readonly.sent=true&codec=dubbo&default.server=netty4&dubbo=2.0.0&generic=false&heartbeat=60000&interface=com.alibaba.dubbo.demo.DemoService&methods=sayHello&pid=21999&qos.port=22222&side=provider&timestamp=1520660491836
           handler: DubboProtocol.requestHandler
          -->new DecodeHandler(new HeaderExchangeHandler(handler))
          -->NettyTransporter.bind(URL url, ChannelHandler listener)
             listener: the DecodeHandler instance created above
            -->new NettyServer(URL url, ChannelHandler handler)
              -->ChannelHandlers.wrapInternal(ChannelHandler handler, URL url)
                 handler: the DecodeHandler instance created above
              -->doOpen() // start the Netty server
          -->new HeaderExchangeServer(Server server)
             server: the NettyServer created above
            -->startHeatbeatTimer()
When the server starts the Netty service, createServer reads the heartbeat setting from the URL's parameters map. The code is as follows:

    private ExchangeServer createServer(URL url) {

        ...

        // Apply the default heartbeat interval only if the user has not configured one.
        url = url.addParameterIfAbsent(Constants.HEARTBEAT_KEY, String.valueOf(Constants.DEFAULT_HEARTBEAT));

        ...

        ExchangeServer server;
        try {
            server = Exchangers.bind(url, requestHandler);
        } catch (RemotingException e) {
            throw new RpcException("Fail to start server(url: " + url + ") " + e.getMessage(), e);
        }

        ...

        return server;
    }
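Here int DEFAULT_HEARTBEAT = 60 * 1000, so when the user has not configured heartbeat (the heartbeat interval), it defaults to 60s: if no request is received within 60s, a heartbeat message is sent. So how do you configure heartbeat yourself? The value is in milliseconds.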
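The "default only when absent" behaviour described above can be illustrated without Dubbo on the classpath. The following is a minimal sketch, not Dubbo's code: a plain Map stands in for the URL's parameters map, and the helper addParameterIfAbsent and the class name HeartbeatDefaultSketch are hypothetical names chosen for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class HeartbeatDefaultSketch {
    static final String HEARTBEAT_KEY = "heartbeat";
    static final int DEFAULT_HEARTBEAT = 60 * 1000;

    // Hypothetical stand-in for URL.addParameterIfAbsent: returns a copy of the
    // parameters in which the key is set only if it was missing before.
    static Map<String, String> addParameterIfAbsent(Map<String, String> params, String key, String value) {
        Map<String, String> copy = new LinkedHashMap<>(params);
        copy.putIfAbsent(key, value);
        return copy;
    }

    // Resolves the effective heartbeat interval in milliseconds.
    static int resolveHeartbeat(Map<String, String> params) {
        Map<String, String> withDefault =
                addParameterIfAbsent(params, HEARTBEAT_KEY, String.valueOf(DEFAULT_HEARTBEAT));
        return Integer.parseInt(withDefault.get(HEARTBEAT_KEY));
    }

    public static void main(String[] args) {
        Map<String, String> notConfigured = new LinkedHashMap<>();
        Map<String, String> configured = new LinkedHashMap<>();
        configured.put(HEARTBEAT_KEY, "3000");

        System.out.println(resolveHeartbeat(notConfigured)); // 60000
        System.out.println(resolveHeartbeat(configured));    // 3000
    }
}
```

A user-supplied value always wins; the 60s default is only a fallback.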
Provider side:

    <dubbo:service ...>
        <dubbo:parameter key="heartbeat" value="3000"/>
    </dubbo:service>

Consumer side:

    <dubbo:reference ...>
        <dubbo:parameter key="heartbeat" value="3000"/>
    </dubbo:reference>
Back to the call chain. When execution reaches this statement:

    ChannelHandlers.wrapInternal(ChannelHandler handler, URL url)

a chain of handlers is built, as follows:

    MultiMessageHandler
    -->handler: HeartbeatHandler
       -->handler: AllChannelHandler
             -->url: providerUrl
             -->executor: FixedExecutor
             -->handler: DecodeHandler
                -->handler: HeaderExchangeHandler
                   -->handler: ExchangeHandlerAdapter(DubboProtocol.requestHandler)

This is also the path a request takes after Netty receives it. Note that the chain contains a HeartbeatHandler.
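To make the role of a heartbeat handler inside such a chain concrete, here is a simplified decorator sketch. It is not Dubbo's implementation: the Handler interface, the HEARTBEAT marker, and the HeartbeatFilter and BusinessHandler classes are all hypothetical. It only shows the general idea that a handler near the front of the chain can record the time of the last read and consume heartbeat messages before they reach business handlers. Replying to a heartbeat request is left out.

```java
import java.util.ArrayList;
import java.util.List;

final class HandlerChainSketch {
    /** Hypothetical, reduced handler contract: only the receive path. */
    interface Handler {
        void received(Object message);
    }

    /** Hypothetical marker object standing in for a heartbeat event. */
    static final Object HEARTBEAT = new Object();

    /** Decorator that tracks read activity and filters out heartbeats. */
    static final class HeartbeatFilter implements Handler {
        private final Handler next;
        long lastReadMillis = -1;

        HeartbeatFilter(Handler next) {
            this.next = next;
        }

        @Override
        public void received(Object message) {
            // Any inbound message, heartbeat or not, proves the connection is alive.
            lastReadMillis = System.currentTimeMillis();
            if (message == HEARTBEAT) {
                return; // consumed here, never reaches the business handler
            }
            next.received(message);
        }
    }

    /** Last handler in the chain: collects the messages it was asked to process. */
    static final class BusinessHandler implements Handler {
        final List<Object> handled = new ArrayList<>();

        @Override
        public void received(Object message) {
            handled.add(message);
        }
    }

    public static void main(String[] args) {
        BusinessHandler business = new BusinessHandler();
        HeartbeatFilter chain = new HeartbeatFilter(business);

        chain.received(HEARTBEAT);
        chain.received("sayHello");

        System.out.println(business.handled); // [sayHello]
    }
}
```

Placing this kind of handler ahead of the thread-pool dispatch keeps heartbeat traffic from competing with business requests for worker threads, which is presumably why HeartbeatHandler sits before AllChannelHandler in the chain above.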
Finally, new HeaderExchangeServer(Server server) is executed. Here is the source:

    // Excerpt: the logger field, getChannels() and other members are omitted.
    public class HeaderExchangeServer implements ExchangeServer {
        /** Heartbeat scheduler */
        private final ScheduledExecutorService scheduled = Executors.newScheduledThreadPool(1,
                new NamedThreadFactory(
                        "dubbo-remoting-server-heartbeat",
                        true));
        /** NettyServer */
        private final Server server;
        // heartbeat timer
        private ScheduledFuture<?> heatbeatTimer;
        // heartbeat timeout (ms), default value is 0 , won't execute a heartbeat.
        private int heartbeat;
        private int heartbeatTimeout;
        private AtomicBoolean closed = new AtomicBoolean(false);

        public HeaderExchangeServer(Server server) {
            if (server == null) {
                throw new IllegalArgumentException("server == null");
            }
            this.server = server;
            this.heartbeat = server.getUrl().getParameter(Constants.HEARTBEAT_KEY, 0);
            this.heartbeatTimeout = server.getUrl().getParameter(Constants.HEARTBEAT_TIMEOUT_KEY, heartbeat * 3);
            if (heartbeatTimeout < heartbeat * 2) {
                throw new IllegalStateException("heartbeatTimeout < heartbeatInterval * 2");
            }
            startHeatbeatTimer();
        }

        private void startHeatbeatTimer() {
            stopHeartbeatTimer();
            if (heartbeat > 0) {
                heatbeatTimer = scheduled.scheduleWithFixedDelay(
                        new HeartBeatTask(new HeartBeatTask.ChannelProvider() {
                            public Collection<Channel> getChannels() {
                                return Collections.unmodifiableCollection(
                                        HeaderExchangeServer.this.getChannels());
                            }
                        }, heartbeat, heartbeatTimeout),
                        heartbeat, heartbeat, TimeUnit.MILLISECONDS);
            }
        }

        private void stopHeartbeatTimer() {
            try {
                ScheduledFuture<?> timer = heatbeatTimer;
                if (timer != null && !timer.isCancelled()) {
                    timer.cancel(true);
                }
            } catch (Throwable t) {
                logger.warn(t.getMessage(), t);
            } finally {
                heatbeatTimer = null;
            }
        }
    }
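When the HeaderExchangeServer is created, it initializes heartbeat (the heartbeat interval) and heartbeatTimeout (the heartbeat response timeout: if nothing has come back within this time, the corresponding action is taken).
heartbeat: the fallback value is 0, and startHeatbeatTimer() shows that heartbeats are only sent when heartbeat > 0. The value is 0 only if it cannot be found in the URL's parameter map, but as we saw earlier, Dubbo puts a default of heartbeat=60s into the parameter map, so heartbeat is 60s here.
heartbeatTimeout: defaults to heartbeat * 3. The reason: suppose one end sends a heartbeat request and the other end returns nothing within heartbeat, neither a normal response nor a heartbeat response. The connection cannot yet be considered broken, because the silence may simply be caused by network jitter, for example a TCP retransmission timeout.
scheduled: a scheduled executor with a single thread, whose thread is named "dubbo-remoting-server-heartbeat-thread-*".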
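The relationship between the two values can be reproduced in isolation. This is a minimal sketch under the assumption that an unset timeout is represented by null. The class HeartbeatTimeoutSketch and the method resolveTimeout are hypothetical names, while the default (heartbeat * 3) and the lower bound (heartbeat * 2) are taken from the constructor shown above.

```java
final class HeartbeatTimeoutSketch {
    // Returns the effective timeout in milliseconds.
    // configuredTimeout == null models "heartbeat timeout not configured".
    static int resolveTimeout(int heartbeat, Integer configuredTimeout) {
        int timeout = configuredTimeout != null ? configuredTimeout : heartbeat * 3;
        if (timeout < heartbeat * 2) {
            // Same guard as in the constructor: a timeout shorter than two
            // intervals would leave no room for even one missed heartbeat.
            throw new IllegalStateException("heartbeatTimeout < heartbeatInterval * 2");
        }
        return timeout;
    }

    public static void main(String[] args) {
        System.out.println(resolveTimeout(60000, null)); // 180000
        System.out.println(resolveTimeout(3000, null));  // 9000
        try {
            resolveTimeout(3000, 5000); // 5000 < 2 * 3000, rejected
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```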
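Then the heartbeat timer task is started:
First, if a heartbeat timer task already exists, it is cancelled.
Then the task is scheduled on the executor: starting from the moment it is scheduled, a HeartBeatTask runs with a delay of heartbeat between runs (the first run happens heartbeat after scheduling).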
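The timing of scheduleWithFixedDelay can be observed with the JDK alone. The sketch below is only an illustration: it uses a short interval instead of 60s, and the class FixedDelaySketch and the method recordTicks are hypothetical names. It records how many milliseconds have passed at each run, showing that the first run does not happen immediately but only after the initial delay.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class FixedDelaySketch {
    // Schedules a task the same way startHeatbeatTimer() does (initial delay and
    // delay both equal to the heartbeat interval) and returns the elapsed
    // milliseconds observed at each of the first `ticks` runs.
    static List<Long> recordTicks(long heartbeatMillis, int ticks) {
        ScheduledExecutorService scheduled = Executors.newScheduledThreadPool(1);
        List<Long> elapsed = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(ticks);
        long start = System.nanoTime();
        scheduled.scheduleWithFixedDelay(() -> {
            if (done.getCount() > 0) {
                elapsed.add(TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start));
                done.countDown();
            }
        }, heartbeatMillis, heartbeatMillis, TimeUnit.MILLISECONDS);
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for ticks", e);
        } finally {
            scheduled.shutdownNow();
        }
        return new ArrayList<>(elapsed);
    }

    public static void main(String[] args) {
        // Exact values vary from run to run; each is at least a multiple of 50.
        System.out.println(recordTicks(50, 3));
    }
}
```

With a fixed delay, the next run is measured from the end of the previous one, so a slow heartbeat pass over many channels pushes later runs back instead of letting them pile up.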
Now let's look at the source of HeartBeatTask:

    // Excerpt: the logger field is omitted.
    final class HeartBeatTask implements Runnable {
        // Channel provider: supplies all channels that need heartbeat checking
        private ChannelProvider channelProvider;
        private int heartbeat;
        private int heartbeatTimeout;

        HeartBeatTask(ChannelProvider provider, int heartbeat, int heartbeatTimeout) {
            this.channelProvider = provider;
            this.heartbeat = heartbeat;
            this.heartbeatTimeout = heartbeatTimeout;
        }

        public void run() {
            try {
                long now = System.currentTimeMillis();
                for (Channel channel : channelProvider.getChannels()) {
                    if (channel.isClosed()) {
                        continue;
                    }
                    try {
                        // Time of the last read
                        Long lastRead = (Long) channel.getAttribute(
                                HeaderExchangeHandler.KEY_READ_TIMESTAMP);
                        // Time of the last write
                        Long lastWrite = (Long) channel.getAttribute(
                                HeaderExchangeHandler.KEY_WRITE_TIMESTAMP);
                        // If the last read or the last write is older than heartbeat, send a heartbeat request
                        if ((lastRead != null && now - lastRead > heartbeat)
                                || (lastWrite != null && now - lastWrite > heartbeat)) {
                            Request req = new Request();
                            req.setVersion("2.0.0");
                            req.setTwoWay(true);
                            req.setEvent(Request.HEARTBEAT_EVENT);
                            channel.send(req);
                            if (logger.isDebugEnabled()) {
                                logger.debug("Send heartbeat to remote channel " + channel.getRemoteAddress()
                                        + ", cause: The channel has no data-transmission exceeds a heartbeat period: " + heartbeat + "ms");
                            }
                        }
                        // Neither a normal message nor a heartbeat has been received within heartbeatTimeout
                        if (lastRead != null && now - lastRead > heartbeatTimeout) {
                            logger.warn("Close channel " + channel
                                    + ", because heartbeat read idle time out: " + heartbeatTimeout + "ms");
                            // Consumer side: reconnect
                            if (channel instanceof Client) {
                                try {
                                    ((Client) channel).reconnect();
                                } catch (Exception e) {
                                    // do nothing
                                }
                            } else { // Provider side: close the connection
                                channel.close();
                            }
                        }
                    } catch (Throwable t) {
                        logger.warn("Exception when heartbeat to remote channel " + channel.getRemoteAddress(), t);
                    }
                }
            } catch (Throwable t) {
                logger.warn("Unhandled exception when heartbeat, cause: " + t.getMessage(), t);
            }
        }

        interface ChannelProvider {
            Collection<Channel> getChannels();
        }
    }
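The two checks that run() performs for each channel can be pulled out into pure functions, which makes the boundary conditions easier to see. This is a sketch that mirrors the conditions in the source above. The class HeartbeatDecisionSketch and its method names are hypothetical, and the timestamps in main are made-up sample values.

```java
final class HeartbeatDecisionSketch {
    // True when either the last read or the last write is older than the
    // heartbeat interval. A null timestamp means "never recorded" and is ignored.
    static boolean shouldSendHeartbeat(Long lastRead, Long lastWrite, long now, int heartbeat) {
        return (lastRead != null && now - lastRead > heartbeat)
                || (lastWrite != null && now - lastWrite > heartbeat);
    }

    // True when nothing has been read for longer than heartbeatTimeout.
    // Only the read timestamp counts: writing proves nothing about the peer.
    static boolean isReadTimedOut(Long lastRead, long now, int heartbeatTimeout) {
        return lastRead != null && now - lastRead > heartbeatTimeout;
    }

    public static void main(String[] args) {
        long now = 200_000L;
        int heartbeat = 60_000;
        int heartbeatTimeout = 180_000;

        // Read and written 50s ago: still within the interval.
        System.out.println(shouldSendHeartbeat(150_000L, 150_000L, now, heartbeat)); // false
        // Last read 70s ago, last write 10s ago: read idle, so send a heartbeat.
        System.out.println(shouldSendHeartbeat(130_000L, 190_000L, now, heartbeat)); // true
        System.out.println(isReadTimedOut(130_000L, now, heartbeatTimeout));         // false
        // Last read 190s ago: past the timeout, so close (provider) or reconnect (consumer).
        System.out.println(isReadTimedOut(10_000L, now, heartbeatTimeout));          // true
    }
}
```

Because heartbeatTimeout is at least twice heartbeat, a channel whose read has timed out also satisfies the first condition, so one more heartbeat request is sent in the same pass before the channel is closed or reconnected.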
HeartBeatTask first calls channelProvider#getChannels to obtain all channels that need heartbeat checking. The channelProvider instance is the anonymous inner class created in HeaderExchangeServer when the heartbeat task is scheduled:

    new HeartBeatTask.ChannelProvider() {
        public Collection<Channel> getChannels() {
            return Collections.unmodifiableCollection(
                    HeaderExchangeServer.this.getChannels());
        }
    }