摘要: 对于Redis服务,通常我们推荐用户使用长连接来访问Redis,但是由于某些用户在连接池失效的时候还是会建立大量的短连接或者用户由于客户端限制还是只能使用短连接来访问Redis,而原生的Redis在频繁建立短连接的时候有一定性能损耗,本文从源码角度对Redis短连接的性能进行了优化。
1. 问题
通过历史监控我们可以发现用户在频繁使用短连接的时候Redis的cpu使用率有显著的上升
2. 排查
通过扁鹊查看但是Redis的cpu运行情况如下
从扁鹊我们可以看到Redis在freeClient的时候会频繁调用listSearchKey,并且该函数占用了百分30左右的调用量,如果我们可以优化降低该调用,短连接性能将得到具体提升。
3. 优化
通过以上分析我们可以知道Redis在释放链接的时候频繁调用了listSearchKey,通过查看Redis关闭客户端源码如下:
void freeClient(redisClient *c) { listNode *ln; /* If this is marked as current client unset it */ if (server.current_client == c) server.current_client = NULL; /* If it is our master that's beging disconnected we should make sure * to cache the state to try a partial resynchronization later. * * Note that before doing this we make sure that the client is not in * some unexpected state, by checking its flags. */ if (server.master && c->flags & REDIS_MASTER) { redisLog(REDIS_WARNING,"Connection with master lost."); if (!(c->flags & (REDIS_CLOSE_AFTER_REPLY| REDIS_CLOSE_ASAP| REDIS_BLOCKED| REDIS_UNBLOCKED))) { replicationCacheMaster(c); return; } } /* Log link disconnection with slave */ if ((c->flags & REDIS_SLAVE) && !(c->flags & REDIS_MONITOR)) { redisLog(REDIS_WARNING,"Connection with slave %s lost.", replicationGetSlaveName(c)); } /* Free the query buffer */ sdsfree(c->querybuf); c->querybuf = NULL; /* Deallocate structures used to block on blocking ops. */ if (c->flags & REDIS_BLOCKED) unblockClientWaitingData(c); dictRelease(c->bpop.keys); freeClientArgv(c); /* Remove from the list of clients */ if (c->fd != -1) { ln = listSearchKey(server.clients,c); redisAssert(ln != NULL); listDelNode(server.clients,ln); } /* When client was just unblocked because of a blocking operation, * remove it from the list of unblocked clients. */ if (c->flags & REDIS_UNBLOCKED) { ln = listSearchKey(server.unblocked_clients,c); redisAssert(ln != NULL); listDelNode(server.unblocked_clients,ln); } ... ... ... /* Release other dynamically allocated client structure fields, * and finally release the client structure itself. */ if (c->name) decrRefCount(c->name); zfree(c->argv); freeClientMultiState(c); sdsfree(c->peerid); if (c->pause_event > 0) aeDeleteTimeEvent(server.el, c->pause_event); zfree(c); }
从源码我们可以看到Redis在释放链接的时候遍历server.clients查找到对应的redisClient对象然后调用listDelNode把该redisClient对象从server.clients删除,代码如下:
/* Remove from the list of clients */ if (c->fd != -1) { ln = listSearchKey(server.clients,c); redisAssert(ln != NULL); listDelNode(server.clients,ln); }
查看server.clients为List结构,而redis定义的List为双端链表,我们可以在createClient的时候将redisClient的指针地址保留再freeClient的时候直接删除对应的listNode即可,无需再次遍历server.clients,代码优化如下:
3.1 createClient修改
redisClient *createClient(int fd) { redisClient *c = zmalloc(sizeof(redisClient)); /* passing -1 as fd it is possible to create a non connected client. * This is useful since all the Redis commands needs to be executed * in the context of a client. When commands are executed in other * contexts (for instance a Lua script) we need a non connected client. */ if (fd != -1) { anetNonBlock(NULL,fd); anetEnableTcpNoDelay(NULL,fd); if (server.tcpkeepalive) anetKeepAlive(NULL,fd,server.tcpkeepalive); if (aeCreateFileEvent(server.el,fd,AE_READABLE, readQueryFromClient, c) == AE_ERR) { close(fd); zfree(c); return NULL; } } ... if (fd != -1) { c->client_list_node = listAddNodeTailReturnNode(server.clients,c); } return c; }
3.2 freeClient修改
freeClient修改如下:
/* Remove from the list of clients */ if (c->fd != -1) { if (c->client_list_node != NULL) listDelNode(server.clients,c->client_list_node); }
3.3 优化结果
在同一台物理机上启动优化前后的Redis,分别进行压测,压测命令如下:
redis-benchmark -h host -p port -k 0 -t get -n 100000 -c 8000
其中-k 代表使用短连接进行测试
原生Redis-server结果:
99.74% <= 963 milliseconds 99.78% <= 964 milliseconds 99.84% <= 965 milliseconds 99.90% <= 966 milliseconds 99.92% <= 967 milliseconds 99.94% <= 968 milliseconds 99.99% <= 969 milliseconds 100.00% <= 969 milliseconds 10065.42 requests per second
优化后Redis-server结果
99.69% <= 422 milliseconds 99.72% <= 423 milliseconds 99.80% <= 424 milliseconds 99.82% <= 425 milliseconds 99.86% <= 426 milliseconds 99.89% <= 427 milliseconds 99.94% <= 428 milliseconds 99.96% <= 429 milliseconds 99.97% <= 430 milliseconds 100.00% <= 431 milliseconds 13823.61 requests per second
我们可以看到优化之后的Redis-server性能在短连接多的场景下提升了百分30%以上。