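We recently started taking traffic pushed to us by the Qiancheng (浅橙) loan-supermarket API. A few of the requests carry large bodies, up to 30 MB, and our nginx error log started reporting: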
2018/11/19 22:33:52 [error] 9791#0: *639124 readv() failed (104: Connection reset by peer) while reading upstream, client: 116.62.210.85, server: axdapi.adpanshi.com, request: "POST /qianchengApi/doCall HTTP/1.1", upstream: "http://127.0.0.1:10030/qianchengApi/doCall", host: "axdapi.adpanshi.com"
There was also a string of warnings:
2018/11/22 11:03:40 [warn] 18674#0: *41074 a client request body is buffered to a temporary file /var/lib/nginx/tmp/client_body/0000002187, client: 116.62.210.85, server: axdapi.adpanshi.com, request: "POST /qianchengApi/doCall HTTP/1.1", host: "axdapi.adpanshi.com"
The nginx access.log also showed many 500 errors:
- 120.55.62.188 - - [19/Nov/2018:19:47:52 +0800] "POST /qianchengApi/doCall HTTP/1.1" 200 138 "-" "SH-XJ360""request_time:1.092" "upstream_response_time:1.092"
- 116.62.210.85 - - [19/Nov/2018:19:47:57 +0800] "POST /qianchengApi/doCall HTTP/1.1" 500 25 "-" "SH-XJ360""request_time:3.913" "upstream_response_time:0.015"
- 120.55.62.188 - - [19/Nov/2018:19:48:05 +0800] "POST /qianchengApi/doCall HTTP/1.1" 200 87 "-" "SH-XJ360""request_time:0.857" "upstream_response_time:0.857"
- 116.62.210.85 - - [19/Nov/2018:19:48:07 +0800] "POST /qianchengApi/doCall HTTP/1.1" 500 25 "-" "SH-XJ360""request_time:2.095" "upstream_response_time:0.012"
- 116.62.210.85 - - [19/Nov/2018:19:48:31 +0800] "POST /qianchengApi/doCall HTTP/1.1" 500 25 "-" "SH-XJ360""request_time:1.392" "upstream_response_time:0.013"
- 116.62.210.85 - - [19/Nov/2018:19:48:36 +0800] "POST /qianchengApi/doCall HTTP/1.1" 500 25 "-" "SH-XJ360""request_time:1.649" "upstream_response_time:0.012"
- 120.55.62.188 - - [19/Nov/2018:19:49:05 +0800] "POST /qianchengApi/doCall HTTP/1.1" 200 211 "-" "SH-XJ360""request_time:0.120" "upstream_response_time:0.120"
- 116.62.210.85 - - [19/Nov/2018:19:49:07 +0800] "POST /qianchengApi/doCall HTTP/1.1" 500 25 "-" "SH-XJ360""request_time:1.270" "upstream_response_time:0.012"
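We called in ops, who raised client_max_body_size in nginx.conf to 32m and also changed a few connection-timeout parameters (I no longer remember which), but the errors persisted. At that point it occurred to us that the 500s might be coming from the application itself. So in the Spring Boot configuration file of the core service we added: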
server.tomcat.max-http-post-size = 33554432
As I recall, what we added in the middle layer was:
server.tomcat.max-http-post-size = -1
That still did not help.
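For reference, the nginx side of the change described above would look roughly like the fragment below. Putting the directive in the http block is an assumption (it is also valid in server and location blocks), and the timeout parameters that were changed are left out because they were not recorded. Note that when a body exceeds this limit nginx itself answers 413, not 500, which fits the suspicion that the 500s came from further upstream.

```nginx
http {
    # Value from the post: accept request bodies of up to 32 MB.
    # Placement in the http block is an assumption of this sketch.
    client_max_body_size 32m;
}
```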
Then we added the code below to Application.java to set the maximum number of threads and connections, but the errors kept coming:
@Override
protected void customizeConnector(Connector connector) {
    super.customizeConnector(connector);
    Http11NioProtocol protocol = (Http11NioProtocol) connector.getProtocolHandler();
    // Set the maximum number of connections
    protocol.setMaxConnections(2000);
    // Set the maximum number of threads
    protocol.setMaxThreads(2000);
    // Connection timeout, in milliseconds
    protocol.setConnectionTimeout(30000);
}
Finally, while going over everything again, we changed server.tomcat.max-http-post-size = -1 in the middle layer's application.properties to
server.tomcat.max-http-post-size = 33554432
and the errors seemed to stop. Was that really the cause?
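As a quick sanity check on the numbers, 33554432 is exactly 32 MiB, so the Tomcat limit matches the 32m set in nginx and leaves room for the 30 MB bodies. The sketch below only does that arithmetic. The class and method names are made up for illustration, and it does not call Tomcat or Spring.

```java
public class BodyLimitCheck {
    // Hypothetical helper: number of bytes in n mebibytes.
    static long mib(long n) {
        return n * 1024L * 1024L;
    }

    // True when a body of bodyBytes fits under a limit of limitBytes.
    static boolean fits(long bodyBytes, long limitBytes) {
        return bodyBytes <= limitBytes;
    }

    public static void main(String[] args) {
        long limit = 33554432L; // value from application.properties in the post
        System.out.println("limit equals 32 MiB: " + (limit == mib(32)));
        System.out.println("30 MiB body fits: " + fits(mib(30), limit));
    }
}
```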
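The warning in the nginx log was still there, though, and our partner complained that our endpoint was slow: requests were piling up, and the slowest one took more than 11 s to respond. We went to ops and looked at the Zabbix monitoring, which showed that CPU, memory, network, and disk load were all low, so the likely culprit was still misconfiguration on our side. The warning says that the request body is being buffered to a temporary file. Searching for that log message turned up a fix: reading the body back from a file hurts performance, so the buffer should be enlarged. We raised client_body_buffer_size from 512k to 2048k, after which things improved markedly and response times were almost all under 2 s.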
# client_body_buffer_size 512k;
client_body_buffer_size 4096k;
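A rough sketch of why a larger buffer helps, assuming the simplified rule that nginx keeps a request body in memory only when it fits in client_body_buffer_size and otherwise writes it to a temporary file. The class name, method names, and sample body sizes are hypothetical; only the 30 MB maximum and the buffer values come from the post.

```java
public class BodyBufferSketch {
    static long kib(long n) {
        return n * 1024L;
    }

    // Simplified model: a body larger than the buffer goes to a temp file.
    static boolean spillsToTempFile(long bodyBytes, long bufferBytes) {
        return bodyBytes > bufferBytes;
    }

    // Counts how many of the given bodies would be written to disk.
    static int countSpilled(long[] bodies, long bufferBytes) {
        int spilled = 0;
        for (long body : bodies) {
            if (spillsToTempFile(body, bufferBytes)) {
                spilled++;
            }
        }
        return spilled;
    }

    public static void main(String[] args) {
        // Hypothetical body sizes, except the 30 MB maximum from the post.
        long[] bodies = { kib(300), kib(1500), kib(3000), kib(30 * 1024) };
        long[] buffers = { kib(512), kib(2048), kib(4096) };
        for (long buffer : buffers) {
            System.out.println((buffer / 1024) + "k buffer: "
                    + countSpilled(bodies, buffer) + " of " + bodies.length
                    + " sample bodies go to a temp file");
        }
    }
}
```

Under this model the mid-sized bodies stop hitting the disk as the buffer grows, while a 30 MB body would still be expected to spill at any of these settings.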