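I recently started looking into SkyWalking with a view to using it in our system. While testing in the test environment I ran into the following behavior: right after the OAP server starts, trace data is uploaded normally, but after a few minutes no more SegmentTrace data is uploaded.
Elasticsearch: a single instance with 8 GB of memory
Service instances: 30+
Looking at the OAP server log:
Right after the OAP server started, it reported the error below. After reading the code, my guess is that it happens because Elasticsearch was not connected yet.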
2019-07-25 15:34:51,516 - org.apache.skywalking.oap.server.core.register.worker.RegisterRemoteWorker - 49 [DataCarrier.REGISTER_L1.BulkConsumePool.0.Thread] ERROR [] - Index: 0, Size: 0
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.LinkedList.checkElementIndex(LinkedList.java:555) ~[?:1.8.0_111]
at java.util.LinkedList.get(LinkedList.java:476) ~[?:1.8.0_111]
at org.apache.skywalking.oap.server.core.remote.selector.ForeverFirstSelector.select(ForeverFirstSelector.java:37) ~[server-core-6.2.0.jar:6.2.0]
at org.apache.skywalking.oap.server.core.remote.RemoteSenderService.send(RemoteSenderService.java:58) ~[server-core-6.2.0.jar:6.2.0]
at org.apache.skywalking.oap.server.core.register.worker.RegisterRemoteWorker.in(RegisterRemoteWorker.java:47) ~[server-core-6.2.0.jar:6.2.0]
at org.apache.skywalking.oap.server.core.register.worker.RegisterRemoteWorker.in(RegisterRemoteWorker.java:32) ~[server-core-6.2.0.jar:6.2.0]
at java.util.HashMap$Values.forEach(HashMap.java:980) [?:1.8.0_111]
at org.apache.skywalking.oap.server.core.register.worker.RegisterDistinctWorker.onWork(RegisterDistinctWorker.java:77) [server-core-6.2.0.jar:6.2.0]
at org.apache.skywalking.oap.server.core.register.worker.RegisterDistinctWorker.access$100(RegisterDistinctWorker.java:34) [server-core-6.2.0.jar:6.2.0]
at org.apache.skywalking.oap.server.core.register.worker.RegisterDistinctWorker$AggregatorConsumer.consume(RegisterDistinctWorker.java:104) [server-core-6.2.0.jar:6.2.0]
at org.apache.skywalking.apm.commons.datacarrier.consumer.MultipleChannelsConsumer.consume(MultipleChannelsConsumer.java:80) [apm-datacarrier-6.2.0.jar:6.2.0]
at org.apache.skywalking.apm.commons.datacarrier.consumer.MultipleChannelsConsumer.run(MultipleChannelsConsumer.java:49) [apm-datacarrier-6.2.0.jar:6.2.0]
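The trace itself only confirms that `ForeverFirstSelector.select` called `LinkedList.get(0)` on a list that was empty at that moment. It does not show whether the missing Elasticsearch connection was the reason. A minimal sketch of that failure pattern, plus a guarded variant, might look like the following. The class `EmptySelectorDemo`, its method names, and the address string are hypothetical and used only for illustration; this is not SkyWalking's actual code.

```java
import java.util.LinkedList;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a "first client wins" selector; not SkyWalking's actual class.
final class EmptySelectorDemo {

    // Mirrors the failing pattern in the trace: unconditionally take element 0.
    static String selectFirst(List<String> clients) {
        return clients.get(0); // throws IndexOutOfBoundsException when the list is empty
    }

    // Defensive variant: report "nothing to select yet" instead of throwing.
    static Optional<String> selectFirstOrEmpty(List<String> clients) {
        return clients.isEmpty() ? Optional.empty() : Optional.of(clients.get(0));
    }

    public static void main(String[] args) {
        List<String> clients = new LinkedList<>(); // no remote clients registered yet
        try {
            selectFirst(clients);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unguarded select failed: " + e.getMessage());
        }
        System.out.println("guarded select present: " + selectFirstOrEmpty(clients).isPresent());

        clients.add("oap-node-1:11800"); // hypothetical address, for illustration only
        System.out.println("after registration: " + selectFirst(clients));
    }
}
```

If the list is simply filled a little later during startup, an error like this would be expected to stop once the first entry is registered, which would fit it appearing only right after the OAP server starts.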
Before the SegmentTrace uploads stopped, the log showed several errors from calls to ES, as follows:
2019-07-25 15:37:48,084 - org.apache.skywalking.oap.server.core.register.worker.RegisterPersistentWorker - 113 [DataCarrier.REGISTER_L2.BulkConsumePool.0.Thread] ERROR [] - Elasticsearch exception [type=es_rejected_execution_exception, reason=rejected execution of processing of [356918][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[lpt_skywalking__service_instance_inventory][0]] containing [index {[lpt_skywalking__service_instance_inventory][type][5_e6372b0d8acc461886f43d36e9f10dbf_0_0], source[{"sequence":12,"heartbeat_time":1564040275951,"service_id":5,"address_id":0,"name":"enrollment-service-pid:27836@spring-boot-servicei","is_address":0,"instance_uuid":"e6372b0d8acc461886f43d36e9f10dbf","register_time":1564040123195,"properties":"{"os_name":"Linux","host_name":"spring-boot-servicei","process_no":"27836","language":"java","ipv4s":"[\"10.10.10.111\"]"}"}]}] and a refresh, target allocation id: uNxjird6TG6JolbwhVpYHQ, primary term: 1 on EsThreadPoolExecutor[name = LgVAr7j/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@76ce58ba[Running, pool size = 4, active threads = 4, queued tasks = 200, completed tasks = 19571]]]
org.elasticsearch.ElasticsearchStatusException: Elasticsearch exception [type=es_rejected_execution_exception, reason=rejected execution of processing of [356918][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[lpt_skywalking__service_instance_inventory][0]] containing [index {[lpt_skywalking__service_instance_inventory][type][5_e6372b0d8acc461886f43d36e9f10dbf_0_0], source[{"sequence":12,"heartbeat_time":1564040275951,"service_id":5,"address_id":0,"name":"enrollment-service-pid:27836@spring-boot-servicei","is_address":0,"instance_uuid":"e6372b0d8acc461886f43d36e9f10dbf","register_time":1564040123195,"properties":"{"os_name":"Linux","host_name":"spring-boot-servicei","process_no":"27836","language":"java","ipv4s":"[\"10.10.10.111\"]"}"}]}] and a refresh, target allocation id: uNxjird6TG6JolbwhVpYHQ, primary term: 1 on EsThreadPoolExecutor[name = LgVAr7j/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@76ce58ba[Running, pool size = 4, active threads = 4, queued tasks = 200, completed tasks = 19571]]]
at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:177) ~[elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:653) ~[elasticsearch-rest-high-level-client-6.3.2.jar:6.3.2]
at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:628) ~[elasticsearch-rest-high-level-client-6.3.2.jar:6.3.2]
at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:535) ~[elasticsearch-rest-high-level-client-6.3.2.jar:6.3.2]
at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:508) ~[elasticsearch-rest-high-level-client-6.3.2.jar:6.3.2]
at org.elasticsearch.client.RestHighLevelClient.update(RestHighLevelClient.java:366) ~[elasticsearch-rest-high-level-client-6.3.2.jar:6.3.2]
at org.apache.skywalking.oap.server.library.client.elasticsearch.ElasticSearchClient.forceUpdate(ElasticSearchClient.java:262) ~[library-client-6.2.0.jar:6.2.0]
at org.apache.skywalking.oap.server.storage.plugin.elasticsearch.base.RegisterEsDAO.forceUpdate(RegisterEsDAO.java:56) ~[storage-elasticsearch-plugin-6.2.0.jar:6.2.0]
at org.apache.skywalking.oap.server.core.register.worker.RegisterPersistentWorker.lambda$onWork$0(RegisterPersistentWorker.java:90) ~[server-core-6.2.0.jar:6.2.0]
at java.util.HashMap$Values.forEach(HashMap.java:980) [?:1.8.0_111]
at org.apache.skywalking.oap.server.core.register.worker.RegisterPersistentWorker.onWork(RegisterPersistentWorker.java:85) [server-core-6.2.0.jar:6.2.0]
at org.apache.skywalking.oap.server.core.register.worker.RegisterPersistentWorker.access$100(RegisterPersistentWorker.java:36) [server-core-6.2.0.jar:6.2.0]
at org.apache.skywalking.oap.server.core.register.worker.RegisterPersistentWorker$PersistentConsumer.consume(RegisterPersistentWorker.java:142) [server-core-6.2.0.jar:6.2.0]
at org.apache.skywalking.apm.commons.datacarrier.consumer.MultipleChannelsConsumer.consume(MultipleChannelsConsumer.java:80) [apm-datacarrier-6.2.0.jar:6.2.0]
at org.apache.skywalking.apm.commons.datacarrier.consumer.MultipleChannelsConsumer.run(MultipleChannelsConsumer.java:49) [apm-datacarrier-6.2.0.jar:6.2.0]
Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [http://10.10.10.251:3012], URI [/lpt_skywalking__service_instance_inventory/type/5_e6372b0d8acc461886f43d36e9f10dbf_0_0/_update?refresh=true&timeout=1m], status line [HTTP/1.1 429 Too Many Requests]
{"error":{"root_cause":[{"type":"remote_transport_exception","reason":"[LgVAr7j][10.0.23.34:9300][indices:data/write/update[s]]"}],"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [356918][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[lpt_skywalking__service_instance_inventory][0]] containing [index {[lpt_skywalking__service_instance_inventory][type][5_e6372b0d8acc461886f43d36e9f10dbf_0_0], source[{"sequence":12,"heartbeat_time":1564040275951,"service_id":5,"address_id":0,"name":"enrollment-service-pid:27836@spring-boot-servicei","is_address":0,"instance_uuid":"e6372b0d8acc461886f43d36e9f10dbf","register_time":1564040123195,"properties":"{\"os_name\":\"Linux\",\"host_name\":\"spring-boot-servicei\",\"process_no\":\"27836\",\"language\":\"java\",\"ipv4s\":\"[\\\"10.10.10.111\\\"]\"}"}]}] and a refresh, target allocation id: uNxjird6TG6JolbwhVpYHQ, primary term: 1 on EsThreadPoolExecutor[name = LgVAr7j/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@76ce58ba[Running, pool size = 4, active threads = 4, queued tasks = 200, completed tasks = 19571]]"},"status":429}
at org.elasticsearch.client.RestClient$SyncResponseListener.get(RestClient.java:705) ~[elasticsearch-rest-client-6.3.2.jar:6.3.2]
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:235) ~[elasticsearch-rest-client-6.3.2.jar:6.3.2]
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:198) ~[elasticsearch-rest-client-6.3.2.jar:6.3.2]
at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:522) ~[elasticsearch-rest-high-level-client-6.3.2.jar:6.3.2]
at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:508) ~[elasticsearch-rest-high-level-client-6.3.2.jar:6.3.2]
at org.elasticsearch.client.RestHighLevelClient.update(RestHighLevelClient.java:366) ~[elasticsearch-rest-high-level-client-6.3.2.jar:6.3.2]
at org.apache.skywalking.oap.server.library.client.elasticsearch.ElasticSearchClient.forceUpdate(ElasticSearchClient.java:262) ~[library-client-6.2.0.jar:6.2.0]
at org.apache.skywalking.oap.server.storage.plugin.elasticsearch.base.RegisterEsDAO.forceUpdate