Why does MIGRATE_PCPTYPES exist
While analyzing per-cpu pages, I noticed that MIGRATE_PCPTYPES and MIGRATE_HIGHATOMIC have the same value, which struck me as odd, so I spent some time looking into it.
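For reference, the migratetype enum in include/linux/mmzone.h (quoted from a kernel of roughly the same vintage as the code below; the CMA and isolation entries depend on config options) shows the aliasing directly:

enum migratetype {
	MIGRATE_UNMOVABLE,
	MIGRATE_MOVABLE,
	MIGRATE_RECLAIMABLE,
	MIGRATE_PCPTYPES,	/* the number of types on the pcp lists */
	MIGRATE_HIGHATOMIC = MIGRATE_PCPTYPES,
#ifdef CONFIG_CMA
	MIGRATE_CMA,
#endif
#ifdef CONFIG_MEMORY_ISOLATION
	MIGRATE_ISOLATE,	/* can't allocate from here */
#endif
	MIGRATE_TYPES
};

Only the first three types have per-cpu free lists, so MIGRATE_PCPTYPES counts them, and MIGRATE_HIGHATOMIC simply reuses the next enum slot: the two values are equal by construction, not by accident.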
When digging into features like this, I prefer to trace them back to their source. After going through the patch discussion in the community, I got a general picture of what this page type was designed for.
https://lwn.net/Articles/658081/
There is still value in reserving blocks of memory for high-order allocations, though; fragmentation is still a concern in current kernels. So another part of Mel's patch set creates a new MIGRATE_HIGHATOMIC reserve that serves this purpose, but in a different way. Initially, this reserve contains no page blocks at all. If a high-order allocation cannot be satisfied without breaking up a previously whole page block, that block will be marked as being part of the high-order atomic reserve; thereafter, only higher-order allocations (and only high-priority ones at that) can be satisfied from that page block.
As the introduction explains, fragmentation caused by high-order page allocations has long been a concern, so Mel created a new page type, MIGRATE_HIGHATOMIC, to keep that fragmentation in check: the reserve starts out empty, and once a pageblock is pulled into it, only high-order allocations of matching (high) priority may be satisfied from that pageblock.
Code implementation
Under the mm directory there is only a small amount of code touching MIGRATE_HIGHATOMIC:
static void reserve_highatomic_pageblock(struct page *page, struct zone *zone,
				unsigned int alloc_order)
{
	int mt;
	unsigned long max_managed, flags;

	/*
	 * Limit the number reserved to 1 pageblock or roughly 1% of a zone.
	 * Check is race-prone but harmless.
	 */
	max_managed = (zone_managed_pages(zone) / 100) + pageblock_nr_pages;
	if (zone->nr_reserved_highatomic >= max_managed)
		return;

	spin_lock_irqsave(&zone->lock, flags);

	/* Recheck the nr_reserved_highatomic limit under the lock */
	if (zone->nr_reserved_highatomic >= max_managed)
		goto out_unlock;

	/* Yoink! */
	mt = get_pageblock_migratetype(page);
	if (!is_migrate_highatomic(mt) && !is_migrate_isolate(mt)
	    && !is_migrate_cma(mt)) {
		zone->nr_reserved_highatomic += pageblock_nr_pages;
		set_pageblock_migratetype(page, MIGRATE_HIGHATOMIC);
		move_freepages_block(zone, page, MIGRATE_HIGHATOMIC, NULL);
	}

out_unlock:
	spin_unlock_irqrestore(&zone->lock, flags);
}
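Note the guard before the conversion: pageblocks that are already highatomic, or that belong to CMA or memory isolation, are left alone, since those types have their own ownership rules; only an ordinary pageblock gets "yoinked" into the reserve. The is_migrate_highatomic() check is just a trivial predicate in mm/internal.h (quoted from memory; the exact form may vary by kernel version):

#define is_migrate_highatomic(migratetype)				\
	unlikely((migratetype) == MIGRATE_HIGHATOMIC)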
static struct page *
get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
						const struct alloc_context *ac)
{
	...
		page = rmqueue(ac->preferred_zoneref->zone, zone, order,
				gfp_mask, alloc_flags, ac->migratetype);
		if (page) {
			prep_new_page(page, order, gfp_mask, alloc_flags);

			/*
			 * If this is a high-order atomic allocation then check
			 * if the pageblock should be reserved for the future
			 */
			if (unlikely(order && (alloc_flags & ALLOC_HARDER)))
				reserve_highatomic_pageblock(page, zone, order);
	...
struct page *rmqueue(struct zone *preferred_zone,
			struct zone *zone, unsigned int order,
			gfp_t gfp_flags, unsigned int alloc_flags,
			int migratetype)
{
	...
	do {
		page = NULL;
		/*
		 * order-0 request can reach here when the pcplist is skipped
		 * due to non-CMA allocation context. HIGHATOMIC area is
		 * reserved for high-order atomic allocation, so order-0
		 * request should skip it.
		 */
		if (order > 0 && alloc_flags & ALLOC_HARDER) {
			page = __rmqueue_smallest(zone, order, MIGRATE_HIGHATOMIC);
			if (page)
				trace_mm_page_alloc_zone_locked(page, order, migratetype);
		}
		if (!page)
			page = __rmqueue(zone, order, migratetype, alloc_flags);
	} while (page && check_new_pages(page, order));
	...
}
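For context, here is a minimal sketch of the kind of request that takes the MIGRATE_HIGHATOMIC branch above. This is hypothetical module code, not from mm/; grab_dma_buffer() is an invented name:

#include <linux/gfp.h>
#include <linux/mm.h>

/*
 * GFP_ATOMIC cannot sleep, so in the slow path gfp_to_alloc_flags()
 * sets ALLOC_HARDER for it (as long as __GFP_NOMEMALLOC is not set),
 * and order = 2 makes this a high-order atomic request: it may be
 * served from, and on success may reserve, a MIGRATE_HIGHATOMIC block.
 */
static struct page *grab_dma_buffer(void)
{
	return alloc_pages(GFP_ATOMIC, 2);	/* 4 contiguous pages */
}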
As the code above shows, when a page is successfully allocated, the allocation's flags are examined: ALLOC_HARDER together with a non-zero order marks the request as a high-order atomic allocation, so the pageblock the page was carved from is added to the highatomic reserve. Such pageblocks cannot be allowed to multiply without limit, though, or there would be nothing left for ordinary allocations when memory later gets tight.
The reserve is therefore capped at roughly 1/100 of the zone's managed pages (plus one pageblock):
max_managed = (zone_managed_pages(zone) / 100) + pageblock_nr_pages;
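As a worked example, assume an x86-64 machine with 4 KiB pages and pageblock_nr_pages = 512: a zone managing 4 GiB has 1,048,576 pages, so max_managed = 1,048,576 / 100 + 512 = 10,997 pages, i.e. at most about 21 pageblocks end up reserved.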
In the end, move_freepages_block() is called to move the block's free pages over to the MIGRATE_HIGHATOMIC free list.
cat /proc/pagetypeinfo | grep HighAtomic
Node    0, zone    DMA32, type     HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone   Normal, type     HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Number of blocks type     Unmovable      Movable  Reclaimable   HighAtomic          CMA      Isolate
Node    1, zone   Normal, type     HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Number of blocks type     Unmovable      Movable  Reclaimable   HighAtomic          CMA      Isolate
Node    2, zone   Normal, type     HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Number of blocks type     Unmovable      Movable  Reclaimable   HighAtomic          CMA      Isolate
Node    3, zone   Normal, type     HighAtomic      0      0      0      0      0      0      0      0      0      0      0
Number of blocks type     Unmovable      Movable  Reclaimable   HighAtomic          CMA      Isolate
The amount of MIGRATE_HIGHATOMIC memory can be checked via /proc/pagetypeinfo; on my machine all the HighAtomic counts are 0, which matches the design: the reserve stays empty until a high-order atomic allocation actually triggers it.
Releasing MIGRATE_HIGHATOMIC pageblocks
Kernel functions tend to come in symmetric pairs: reserve_highatomic_pageblock() puts a pageblock into the highatomic reserve, and unreserve_highatomic_pageblock() takes it back out. Without going into too much detail, the gist is that when the slow allocation path still cannot get a page after direct reclaim, the highatomic reserve is released back to the regular free lists and the allocation is retried:
static struct page *
__alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
		unsigned int alloc_flags, const struct alloc_context *ac,
		unsigned long *did_some_progress)
{
	struct page *page = NULL;
	bool drained = false;

	*did_some_progress = __perform_reclaim(gfp_mask, order, ac);
	if (unlikely(!(*did_some_progress)))
		return NULL;

retry:
	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);

	/*
	 * If an allocation failed after direct reclaim, it could be because
	 * pages are pinned on the per-cpu lists or in high alloc reserves.
	 * Shrink them and try again
	 */
	if (!page && !drained) {
		unreserve_highatomic_pageblock(ac, false);
		drain_all_pages(NULL);
		drained = true;
		goto retry;
	}

	return page;
}
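Note the second argument: here the reserve is drained gently (force == false), which leaves at least one reserved pageblock per zone in place. The forced counterpart lives in should_reclaim_retry(); quoted from memory from a kernel of this vintage (wording may differ slightly by version), it hands back the whole reserve as a last resort before falling into the OOM killer:

	/*
	 * Make sure we converge to OOM if we cannot make any progress
	 * several times in a row.
	 */
	if (*no_progress_loops > MAX_RECLAIM_RETRIES) {
		/* Before OOM, exhaust highatomic_reserve */
		return unreserve_highatomic_pageblock(ac, true);
	}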