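I've long had a question. The production JVM settings for Fengchao (丰巢) business services disable System.gc(), that is, they set -XX:+DisableExplicitGC, yet production has never hit an off-heap memory overflow. Some background: Fengchao uses Alibaba's open-source dubbo, and dubbo's underlying transport uses netty 3.2.5.Final by default. Our conventional understanding of netty was that it must use off-heap memory, and that with System.gc() disabled and the service never actively reclaiming the off-heap memory it allocated, an off-heap memory leak was bound to happen. With this question in mind, I happened to have some time the night before last and studied the netty 3.2.5 source. Once again it was while waiting for Mantou's mom at Kexing Science Park that I found where the secret lies. All I can say is that Kexing Science Park really is my lucky place.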
netty classes involved: NioWorker, HeapChannelBufferFactory, BigEndianHeapChannelBuffer, SocketReceiveBufferPool
The core secret is in SocketReceiveBufferPool
    final class SocketReceiveBufferPool {

        private static final int POOL_SIZE = 8;

        @SuppressWarnings("unchecked")
        private final SoftReference<ByteBuffer>[] pool = new SoftReference[POOL_SIZE];

        SocketReceiveBufferPool() {
            super();
        }

        final ByteBuffer acquire(int size) {
            final SoftReference<ByteBuffer>[] pool = this.pool;
            for (int i = 0; i < POOL_SIZE; i ++) {
                SoftReference<ByteBuffer> ref = pool[i];
                if (ref == null) {
                    continue;
                }

                ByteBuffer buf = ref.get();
                if (buf == null) {
                    pool[i] = null;
                    continue;
                }

                if (buf.capacity() < size) {
                    continue;
                }

                pool[i] = null;

                buf.clear();
                return buf;
            }

            ByteBuffer buf = ByteBuffer.allocateDirect(normalizeCapacity(size));
            buf.clear();
            return buf;
        }

        final void release(ByteBuffer buffer) {
            final SoftReference<ByteBuffer>[] pool = this.pool;
            for (int i = 0; i < POOL_SIZE; i ++) {
                SoftReference<ByteBuffer> ref = pool[i];
                if (ref == null || ref.get() == null) {
                    pool[i] = new SoftReference<ByteBuffer>(buffer);
                    return;
                }
            }

            // pool is full - replace one
            final int capacity = buffer.capacity();
            for (int i = 0; i < POOL_SIZE; i ++) {
                SoftReference<ByteBuffer> ref = pool[i];
                ByteBuffer pooled = ref.get();
                if (pooled == null) {
                    pool[i] = null;
                    continue;
                }

                if (pooled.capacity() < capacity) {
                    pool[i] = new SoftReference<ByteBuffer>(buffer);
                    return;
                }
            }
        }

        private static final int normalizeCapacity(int capacity) {
            // Normalize to multiple of 1024
            int q = capacity >>> 10;
            int r = capacity & 1023;
            if (r != 0) {
                q ++;
            }
            return q << 10;
        }
    }
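To see the acquire/release cycle in isolation, here is a minimal sketch of my own. It is not netty's code: SocketReceiveBufferPool is package-private in netty, so the class name SoftBufferPoolSketch and its simplified release (which just drops the buffer when the pool is full, instead of replacing a smaller one) are assumptions made purely for illustration.

```java
import java.lang.ref.SoftReference;
import java.nio.ByteBuffer;

// Hypothetical stand-in for netty's pool, reduced to the essentials.
final class SoftBufferPoolSketch {
    private static final int POOL_SIZE = 8;

    @SuppressWarnings("unchecked")
    private final SoftReference<ByteBuffer>[] pool = new SoftReference[POOL_SIZE];

    // Return a pooled direct buffer that is large enough, or allocate a new one.
    ByteBuffer acquire(int size) {
        for (int i = 0; i < POOL_SIZE; i++) {
            SoftReference<ByteBuffer> ref = pool[i];
            ByteBuffer buf = (ref == null) ? null : ref.get();
            if (buf != null && buf.capacity() >= size) {
                pool[i] = null;   // take it out of the pool while in use
                buf.clear();
                return buf;
            }
        }
        // Round up to a multiple of 1024, as normalizeCapacity does.
        int rounded = (size + 1023) & ~1023;
        return ByteBuffer.allocateDirect(rounded);
    }

    // Put the buffer into the first empty (or already cleared) slot.
    void release(ByteBuffer buffer) {
        for (int i = 0; i < POOL_SIZE; i++) {
            if (pool[i] == null || pool[i].get() == null) {
                pool[i] = new SoftReference<>(buffer);
                return;
            }
        }
        // Simplification: when the pool is full, this sketch drops the buffer.
    }

    public static void main(String[] args) {
        SoftBufferPoolSketch pool = new SoftBufferPoolSketch();
        ByteBuffer first = pool.acquire(1000);    // allocated as a 1024-byte direct buffer
        pool.release(first);
        ByteBuffer second = pool.acquire(512);    // fits into the pooled buffer
        System.out.println("capacity=" + first.capacity()
                + ", reused=" + (second == first));
    }
}
```

The point of the sketch is that a released buffer is handed out again on the next acquire as long as it is large enough, so a steady stream of reads does not keep allocating new direct memory.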
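SocketReceiveBufferPool maintains an array of SoftReference<ByteBuffer>; for details on Java's SoftReference, feel free to look it up. In effect, this class maintains a pool of direct buffers, and the memory in that pool can be reused. So here is the question: if netty handed the direct buffer it uses for receiving network data straight to dubbo's business code, what would this pool be for, and how would the memory ever be released back into it? With that question in mind, let's continue with NioWorker, the code that calls SocketReceiveBufferPool.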
    private boolean read(SelectionKey k) {
        final SocketChannel ch = (SocketChannel) k.channel();
        final NioSocketChannel channel = (NioSocketChannel) k.attachment();

        final ReceiveBufferSizePredictor predictor =
            channel.getConfig().getReceiveBufferSizePredictor();
        final int predictedRecvBufSize = predictor.nextReceiveBufferSize();

        int ret = 0;
        int readBytes = 0;
        boolean failure = true;

        ByteBuffer bb = recvBufferPool.acquire(predictedRecvBufSize);

        try {
            while ((ret = ch.read(bb)) > 0) {
                readBytes += ret;
                if (!bb.hasRemaining()) {
                    break;
                }
            }
            failure = false;
        } catch (ClosedChannelException e) {
            // Can happen, and does not need a user attention.
        } catch (Throwable t) {
            fireExceptionCaught(channel, t);
        }

        if (readBytes > 0) {
            bb.flip();

            final ChannelBufferFactory bufferFactory =
                channel.getConfig().getBufferFactory();
            final ChannelBuffer buffer = bufferFactory.getBuffer(readBytes);
            buffer.setBytes(0, bb);
            buffer.writerIndex(readBytes);
            //if(buffer instanceof BigEndianHeapChannelBuffer){
            //    logger2.info("buffer instanceof BigEndianHeapChannelBuffer.");
            //}
            recvBufferPool.release(bb);

            // Update the predictor.
            predictor.previousReceiveBufferSize(readBytes);

            // Fire the event.
            fireMessageReceived(channel, buffer);
        } else {
            recvBufferPool.release(bb);
        }

        if (ret < 0 || failure) {
            k.cancel(); // Some JDK implementations run into an infinite loop without this.
            close(channel, succeededFuture(channel));
            return false;
        }

        return true;
    }
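In this code I found that netty creates a separate ChannelBuffer object and copies the contents of the direct buffer into it, and that this ChannelBuffer is actually heap memory (a BigEndianHeapChannelBuffer obtained from HeapChannelBufferFactory). Right after the copy, the direct buffer goes back into the pool via recvBufferPool.release(bb), so it is reused for later reads. netty then decodes the heap copy and passes it up to the calling service. In other words, the direct buffer is never handed to the dubbo service itself. That explains why, in the JVMs that provide our dubbo services, we have never seen an off-heap memory leak even with System.gc() disabled. Later, when I find the time, I'll analyze in detail how netty4 and kafka use direct buffers.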
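The copy step can be sketched with plain java.nio, without any netty dependency. This is only an illustration under my own assumptions: the class HeapCopySketch and the method copyToHeap are hypothetical names, and a heap ByteBuffer stands in for netty's heap ChannelBuffer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of "copy to heap, then reuse the direct buffer".
final class HeapCopySketch {

    // Copy the readable bytes of the receive buffer into a new heap buffer.
    static ByteBuffer copyToHeap(ByteBuffer received) {
        ByteBuffer heap = ByteBuffer.allocate(received.remaining());
        heap.put(received);   // bulk copy from direct memory into the heap array
        heap.flip();          // make the copied bytes readable
        return heap;
    }

    public static void main(String[] args) {
        // Stands in for the pooled direct buffer filled by ch.read(bb).
        ByteBuffer direct = ByteBuffer.allocateDirect(1024);
        direct.put("hello".getBytes(StandardCharsets.UTF_8));
        direct.flip();

        ByteBuffer heap = copyToHeap(direct);

        // The direct buffer can be cleared and reused right away;
        // upper layers only ever see the heap copy.
        direct.clear();

        System.out.println("heap copy is direct: " + heap.isDirect()
                + ", bytes: " + heap.remaining());
    }
}
```

The trade-off is one extra memory copy per read, in exchange for the direct buffer never leaving the I/O layer, so its lifetime does not depend on what the business code does with the data.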