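1. Background
MongoDB's appetite for memory seems to be taken for granted by now, and WiredTiger is now the default built-in storage engine.
Recently I ran a test that inserted a little over 1,000 documents per second, with the application writing through a pool of 500 threads. MongoDB was deployed in Docker, and only WiredTiger's memory usage was capped. After running for a few hours, mongod would run out of memory and get killed. The end of its log looked like this: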
2020-11-10T13:11:57.937050930Z {"t":{"$date":"2020-11-10T13:11:57.935+00:00"},"s":"E", "c":"STORAGE", "id":22435, "ctx":"thread1010","msg":"WiredTiger error","attr":{"error":12,"message":"[1605013917:935682][1:0x7fe214b9b700], file:index-145-6498808884659112531.wt, eviction-server: __posix_file_write, 615: /data/db/index-145-6498808884659112531.wt: handle-write: pwrite: failed to write 12288 bytes at offset 90112: Cannot allocate memory"}}
2020-11-10T13:11:57.937126955Z {"t":{"$date":"2020-11-10T13:11:57.935+00:00"},"s":"E", "c":"STORAGE", "id":22435, "ctx":"thread1010","msg":"WiredTiger error","attr":{"error":12,"message":"[1605013917:935887][1:0x7fe214b9b700], eviction-server: __wt_evict_thread_run, 327: cache eviction thread error: Cannot allocate memory"}}
2020-11-10T13:11:57.937140085Z {"t":{"$date":"2020-11-10T13:11:57.935+00:00"},"s":"E", "c":"STORAGE", "id":22435, "ctx":"thread1010","msg":"WiredTiger error","attr":{"error":-31804,"message":"[1605013917:935926][1:0x7fe214b9b700], eviction-server: __wt_evict_thread_run, 327: the process must exit and restart: WT_PANIC: WiredTiger library panic"}}
2020-11-10T13:11:57.937147862Z {"t":{"$date":"2020-11-10T13:11:57.935+00:00"},"s":"F", "c":"-", "id":23089, "ctx":"thread1010","msg":"Fatal assertion","attr":{"msgid":50853,"file":"src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp","line":446}}
2020-11-10T13:11:57.937154249Z {"t":{"$date":"2020-11-10T13:11:57.936+00:00"},"s":"F", "c":"-", "id":23090, "ctx":"thread1010","msg":" ***aborting after fassert() failure "}
2020-11-10T13:11:57.937160477Z {"t":{"$date":"2020-11-10T13:11:57.936+00:00"},"s":"F", "c":"CONTROL", "id":4757800, "ctx":"thread1010","msg":"Writing fatal message","attr":{"message":"Got signal: 6 (Aborted). "}}
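In short, memory could not be allocated.
That made the general direction of the problem clear. I had not thought about this area deeply enough to understand why memory would climb slowly until it ran out, so I searched Baidu for a long time, and only got a rough grasp of the underlying mechanism after switching to Google.
This post briefly describes what I concluded, after that reading, about balancing memory and CPU in MongoDB.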
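2. Basics
MongoDB prefers to work in memory. The WiredTiger engine uses tokens (MongoDB calls them tickets) to limit concurrency, and uses eviction threads to clean up the cache once it reaches a threshold.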
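3. Analysis
Looking at it purely from WiredTiger's side, think of memory usage as a pool of water: the level can only rise when inflow exceeds outflow. The inflow is simple: my 1,000-plus inserts per second. The outflow, going by the basics above, can fall short for two reasons: too few tokens, so that many insert operations pile up, or too few eviction threads, so that memory is not cleaned up quickly enough.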
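The pool picture can be made concrete with a toy model. This is only a sketch in plain JavaScript (runnable with Node.js), not MongoDB code: `simulatePool` is a hypothetical helper, and the drain rates are made-up numbers chosen to show that even a small, steady shortfall accumulates over an hour.

```javascript
// Toy model of the "pool" analogy. All rates are illustrative assumptions,
// not measurements from the test described in this post.
function simulatePool(inflowPerSec, outflowPerSec, seconds) {
  let level = 0; // items currently held in memory
  for (let t = 0; t < seconds; t++) {
    level += inflowPerSec;                   // new inserts arrive
    level -= Math.min(level, outflowPerSec); // drain at most `outflowPerSec`
  }
  return level;
}

// Drain keeps up with 1000 inserts/s: the pool stays empty.
const keepingUp = simulatePool(1000, 1000, 3600);

// Drain is 10% short: the backlog grows by 100 items every second.
const fallingBehind = simulatePool(1000, 900, 3600);

console.log({ keepingUp, fallingBehind });
```

In this model the only ways to stop the level from rising are to lower the inflow or to raise the drain rate, which is what the tuning below tries to do.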
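The cause is fairly simple; the hard part was really understanding the problem. The reference I found through Google is linked at the end of this post, and that article has further reference links of its own that are well worth reading.
In the end, the tuning comes down to three things: cache, tokens (concurrency), and eviction (number of eviction threads). The net effect is trading CPU for memory.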
One thing to note when setting the token concurrency: in my case the number had to be wrapped in double quotes,
db.adminCommand({setParameter: 1, wiredTigerConcurrentWriteTransactions: "512"})
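As for the eviction policy, I have not yet found the related settings in the official documentation. Perhaps the defaults are already good enough that no option is offered?
So for now I only adjusted the tokens and the cache size.
Here is a docker run command for reference: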
docker run --name mongodb --cpus 1 -m 4G -v /alidata/MongoData:/data/db -p 27017:27017 -d mongo:4.4.1 --wiredTigerCacheSizeGB 2.4 --setParameter wiredTigerConcurrentWriteTransactions=1500
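The arithmetic behind those flags can be sketched as follows. The default-cache rule used here (the larger of 50% of (RAM - 1 GB) or 256 MB) is the one given in the MongoDB documentation for the WiredTiger internal cache. The helper names are hypothetical, and treating the container limit as "RAM" is a simplifying assumption of this sketch.

```javascript
// Default WiredTiger internal cache size, per the documented rule:
// max(50% of (RAM - 1 GB), 256 MB). Input and output are in GB.
function defaultCacheGB(ramGB) {
  return Math.max(0.5 * (ramGB - 1), 0.25);
}

// Memory left in the container for everything outside the WiredTiger cache
// (connections, in-flight operations, allocator overhead, and so on).
function headroomGB(containerGB, cacheGB) {
  return containerGB - cacheGB;
}

const containerGB = 4;   // from `-m 4G` in the command above
const cacheGB = 2.4;     // from `--wiredTigerCacheSizeGB 2.4`

const defaultForContainer = defaultCacheGB(containerGB);
const headroom = headroomGB(containerGB, cacheGB);

console.log("default cache for a 4 GB host:", defaultForContainer.toFixed(1), "GB");
console.log("headroom with a 2.4 GB cache:", headroom.toFixed(1), "GB");
```

Under these assumptions, the 2.4 GB setting is larger than what the default rule would pick for a 4 GB machine, so the remaining headroom is the number to keep an eye on when the container is memory-limited.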
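4. Results
Whether the result is good can be judged with the mongostat command.
Mainly watch the qrw column to see whether operations are queuing up. Memory usage will certainly rise; how far depends mainly on the WiredTiger cache size that was configured, plus the memory mongod itself needs for its other operations (that part is a hard requirement and basically cannot be avoided).
For the meaning of each column in the mongostat output, see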
https://docs.mongodb.com/v4.2/reference/program/mongostat/
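Besides mongostat, roughly the same numbers can be read from `db.serverStatus()`. The sketch below is a minimal example, assuming the field names reported by MongoDB 4.4 (`wiredTiger.cache` and `wiredTiger.concurrentTransactions.write`). `summarize` is a hypothetical helper and the sample object contains made-up values. Depending on the shell version, some counters may come back as long-integer wrappers, which is why the values are passed through `Number()`.

```javascript
// Summarize cache fill and write-ticket usage from a serverStatus()-shaped object.
function summarize(status) {
  const cache = status.wiredTiger.cache;
  const write = status.wiredTiger.concurrentTransactions.write;
  const used = Number(cache["bytes currently in the cache"]);
  const max = Number(cache["maximum bytes configured"]);
  return {
    cacheUsedPct: Math.round((used / max) * 1000) / 10, // one decimal place
    writeTicketsOut: Number(write.out),                 // tickets in use right now
    writeTicketsTotal: Number(write.totalTickets),      // configured ticket count
  };
}

// Made-up sample so the function can be tried without a server.
const sample = {
  wiredTiger: {
    cache: {
      "bytes currently in the cache": 800000000,
      "maximum bytes configured": 1000000000,
    },
    concurrentTransactions: {
      write: { out: 12, available: 116, totalTickets: 128 },
    },
  },
};
const sampleSummary = summarize(sample);

// Inside the mongo shell, where `db` exists, print the live values instead.
if (typeof db !== "undefined") {
  printjson(summarize(db.serverStatus()));
}
```

If `writeTicketsOut` stays close to `writeTicketsTotal` while the qrw column shows a queue, writers are waiting on tickets; if the cache percentage stays well above the eviction threshold, eviction is probably not keeping up.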
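The final result is shown in the figure below.
The used column hovered around 80% the whole time (by default MongoDB starts cleaning the cache once usage reaches 80%), and the OOM problem did not occur again. CPU usage did go up, though.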
5. References
https://caosiyang.github.io/posts/2016/08/23/mongodb-wiredtiger-performance-tuning/