linux-stable/mm
Barry Song cc864ebba5 madvise:madvise_cold_or_pageout_pte_range(): allow split while folio_estimated_sharers = 0
The purpose of the check is to avoid splitting large folios whose
mapcount is 2 or above.  Folios whose estimated_sharers = 0 are still
valid candidates, and even better ones than estimated_sharers = 1.

Consider a pte-mapped large folio with 16 subpages.  If we unmap any of
subpages 1-15, the current code will split the folio and reclaim it when
madvise reaches this folio; but if we unmap subpage 0, we keep the folio
and break.  This is inconsistent.
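
The asymmetry described above can be illustrated with a small userspace
model.  This is only a sketch, not the kernel code: struct model_folio and
all model_*/old_*/new_* names are hypothetical, and it assumes the sharing
estimate is taken from the mapcount of the first subpage, with "= 1" as
the old condition and "0 or 1" as the new one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SUBPAGES 16

/* Hypothetical model: one mapcount per subpage of a pte-mapped large folio. */
struct model_folio {
	int mapcount[NR_SUBPAGES];
};

/* Assumption: the sharing estimate looks at the first subpage only. */
static int model_estimated_sharers(const struct model_folio *f)
{
	return f->mapcount[0];
}

/* Old condition: go on to split only when the estimate is exactly 1. */
static bool old_allows_split(const struct model_folio *f)
{
	return model_estimated_sharers(f) == 1;
}

/* New condition: skip only folios whose estimate is 2 or above. */
static bool new_allows_split(const struct model_folio *f)
{
	return model_estimated_sharers(f) <= 1;
}

/* Build a folio with every subpage mapped once, then unmap subpage idx. */
static struct model_folio model_unmap_one(size_t idx)
{
	struct model_folio f;
	size_t i;

	assert(idx < NR_SUBPAGES);
	for (i = 0; i < NR_SUBPAGES; i++)
		f.mapcount[i] = 1;
	f.mapcount[idx] = 0;
	return f;
}
```

In this model, model_unmap_one(0) is rejected by old_allows_split() but
accepted by new_allows_split(), while unmapping any other subpage is
accepted by both.  A folio whose first subpage has a mapcount of 2 is
rejected by both.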

For pmd-mapped large folios, we can still use "= 1" as the condition,
since the entire folio is mapped anyway.  So this patch doesn't change
the condition for pmd-mapped large folios.  This also explains why "= 1"
was used for both pmd-mapped and pte-mapped large folios before commit
07e8c82b5e ("madvise: convert madvise_cold_or_pageout_pte_range() to use
folios"): back then we used the mapcount of the specific subpage, and
since that subpage had a present pte, its mapcount could not be 0.

The problem can be reproduced quite easily by writing a small program
that unmaps the first subpage of a pte-mapped large folio, and comparing
it with unmapping any subpage other than the first.

Link: https://lkml.kernel.org/r/20240221085036.105621-1-21cnbao@gmail.com
Fixes: 2f406263e3 ("madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against large folio for sharing check")
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Yin Fengwei <fengwei.yin@intel.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-23 17:48:34 -08:00
damon mm/damon/reclaim: implement memory PSI-driven quota self-tuning 2024-02-23 17:48:30 -08:00
kasan merge mm-hotfixes-stable into mm-nonmm-stable to pick up stackdepot changes 2024-02-23 17:28:43 -08:00
kfence KFENCE: cleanup kfence_guarded_alloc() after CONFIG_SLAB removal 2023-12-05 11:17:58 +01:00
kmsan mm: kmsan: remove runtime checks from kmsan_unpoison_memory() 2024-02-22 10:24:41 -08:00
backing-dev.c blk-wbt: Fix detection of dirty-throttled tasks 2024-02-06 09:44:03 -07:00
balloon_compaction.c
bootmem_info.c bootmem: use kmemleak_free_part_phys in put_page_bootmem 2023-10-25 16:47:13 -07:00
cma.c mm/cma: add sysfs file 'release_pages_success' 2024-02-22 10:24:57 -08:00
cma.h mm/cma: add sysfs file 'release_pages_success' 2024-02-22 10:24:57 -08:00
cma_debug.c
cma_sysfs.c mm/cma: add sysfs file 'release_pages_success' 2024-02-22 10:24:57 -08:00
compaction.c mm/compaction: optimize >0 order folio compaction with free page split. 2024-02-23 17:48:33 -08:00
debug.c
debug_page_alloc.c mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER 2024-01-08 15:27:15 -08:00
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable: fix BUG_ON with pud advanced test 2024-02-23 17:27:13 -08:00
dmapool.c mm/mempool/dmapool: remove CONFIG_DEBUG_SLAB ifdefs 2023-12-05 11:17:58 +01:00
dmapool_test.c
early_ioremap.c
fadvise.c
fail_page_alloc.c
failslab.c
filemap.c merge mm-hotfixes-stable into mm-nonmm-stable to pick up stackdepot changes 2024-02-23 17:28:43 -08:00
folio-compat.c mm: remove page_add_new_anon_rmap and lru_cache_add_inactive_or_unevictable 2023-12-29 11:58:27 -08:00
gup.c mm: convert page_try_share_anon_rmap() to folio_try_share_anon_rmap_[pte|pmd]() 2023-12-29 11:58:56 -08:00
gup_test.c
gup_test.h
highmem.c x86/kexec: use pr_err() instead of kexec_dprintk() when an error occurs 2023-12-29 12:22:28 -08:00
hmm.c
huge_memory.c userfaultfd: use per-vma locks in userfaultfd operations 2024-02-22 15:27:20 -08:00
hugetlb.c mm/hugetlb: move page order check inside hugetlb_cma_reserve() 2024-02-22 10:24:59 -08:00
hugetlb_cgroup.c mm, hugetlb: remove HUGETLB_CGROUP_MIN_ORDER 2023-10-18 14:34:17 -07:00
hugetlb_vmemmap.c mm: hugetlb_vmemmap: move mmap lock to vmemmap_remap_range() 2023-12-12 10:57:08 -08:00
hugetlb_vmemmap.h mm: hugetlb_vmemmap: fix reference to nonexistent file 2023-10-25 16:47:14 -07:00
hwpoison-inject.c
init-mm.c mm: Deprecate pasid field 2023-12-12 10:11:32 +01:00
internal.h mm/compaction: add support for >0 order folio memory compaction. 2024-02-23 17:48:33 -08:00
interval_tree.c
io-mapping.c
ioremap.c
Kconfig Introduce cpu_dcache_is_aliasing() across all architectures 2024-02-22 15:27:19 -08:00
Kconfig.debug mm/slab: remove CONFIG_SLAB from all Kconfig and Makefile 2023-12-05 11:14:40 +01:00
khugepaged.c mm/khugepaged: bypassing unnecessary scans with MMF_DISABLE_THP check 2024-02-23 17:48:25 -08:00
kmemleak.c kmemleak: avoid RCU stalls when freeing metadata for per-CPU pointers 2023-12-12 10:57:07 -08:00
ksm.c mm: convert page_try_share_anon_rmap() to folio_try_share_anon_rmap_[pte|pmd]() 2023-12-29 11:58:56 -08:00
list_lru.c mm/zswap: stop lru list shrinking when encounter warm region 2024-02-22 10:24:54 -08:00
maccess.c
madvise.c madvise:madvise_cold_or_pageout_pte_range(): allow split while folio_estimated_sharers = 0 2024-02-23 17:48:34 -08:00
Makefile mm/slab: remove CONFIG_SLAB from all Kconfig and Makefile 2023-12-05 11:14:40 +01:00
mapping_dirty_helpers.c
memblock.c mm/memblock: add MEMBLOCK_RSRV_NOINIT into flagname[] array 2024-02-20 14:20:49 -08:00
memcontrol.c mm: memcg: use larger batches for proactive reclaim 2024-02-22 10:24:52 -08:00
memfd.c memfd: drop warning for missing exec-related flags 2023-10-04 10:32:22 -07:00
memory-failure.c mm/memory-failure: fix crash in split_huge_page_to_list from soft_offline_page 2024-02-07 21:20:34 -08:00
memory-tiers.c mm/demotion: print demotion targets 2024-02-22 10:24:55 -08:00
memory.c mm: add pte_batch_hint() to reduce scanning in folio_pte_batch() 2024-02-22 15:27:18 -08:00
memory_hotplug.c mm/memory_hotplug: export mhp_supports_memmap_on_memory() 2024-02-22 10:24:40 -08:00
mempolicy.c mm/mempolicy: protect task interleave functions with tsk->mems_allowed_seq 2024-02-22 10:24:47 -08:00
mempool.c Many singleton patches against the MM code. The patch series which 2024-01-09 11:18:47 -08:00
memremap.c mm: remove stale example from comment 2023-12-29 11:58:26 -08:00
memtest.c
migrate.c merge mm-hotfixes-stable into mm-nonmm-stable to pick up stackdepot changes 2024-02-23 17:28:43 -08:00
migrate_device.c mm: convert page_try_share_anon_rmap() to folio_try_share_anon_rmap_[pte|pmd]() 2023-12-29 11:58:56 -08:00
mincore.c
mlock.c mm: mlock: avoid folio_within_range() on KSM pages 2023-10-25 16:47:14 -07:00
mm_init.c efi: disable mirror feature during crashkernel 2024-01-12 15:20:47 -08:00
mm_slot.h
mmap.c mm/mmap: pass vma to vma_merge() 2024-02-22 10:24:52 -08:00
mmap_lock.c
mmu_gather.c mm/mmu_gather: improve cond_resched() handling with large folios and expensive page freeing 2024-02-22 15:27:17 -08:00
mmu_notifier.c
mmzone.c zswap: shrink zswap pool based on memory pressure 2023-12-12 10:57:02 -08:00
mprotect.c mprotect: use pfn_swap_entry_folio 2024-02-21 16:00:03 -08:00
mremap.c mm: abstract VMA merge and extend into vma_merge_extend() helper 2023-10-18 14:34:18 -07:00
msync.c
nommu.c mm/vmalloc: remove vmap_area_list 2024-02-23 17:48:19 -08:00
oom_kill.c mm, oom:dump_tasks add rss detailed information printing 2023-12-10 16:51:53 -08:00
page-writeback.c block-6.8-2024-02-10 2024-02-10 08:02:48 -08:00
page_alloc.c mm/compaction: add support for >0 order folio memory compaction. 2024-02-23 17:48:33 -08:00
page_counter.c
page_ext.c
page_idle.c
page_io.c zswap: memcontrol: implement zswap writeback disabling 2023-12-29 20:22:11 -08:00
page_isolation.c mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER 2024-01-08 15:27:15 -08:00
page_owner.c mm,page_owner: filter out stacks by a threshold 2024-02-23 17:48:17 -08:00
page_poison.c mm/page_poison: replace kmap_atomic() with kmap_local_page() 2023-12-10 16:51:50 -08:00
page_reporting.c mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER 2024-01-08 15:27:15 -08:00
page_reporting.h
page_table_check.c
page_vma_mapped.c mm: thp: introduce multi-size THP sysfs interface 2023-12-20 14:48:12 -08:00
pagewalk.c mm: pagewalk: assert write mmap lock only for walking the user page tables 2023-12-10 16:51:53 -08:00
percpu-internal.h
percpu-km.c
percpu-stats.c
percpu-vm.c
percpu.c mm: Introduce flush_cache_vmap_early() 2023-12-14 00:23:17 -08:00
pgalloc-track.h
pgtable-generic.c
process_vm_access.c mm: fix process_vm_rw page counts 2023-12-10 16:51:39 -08:00
ptdump.c mm: ptdump: add check_wx_pages debugfs attribute 2024-02-22 10:24:47 -08:00
readahead.c readahead: use ilog2 instead of a while loop in page_cache_ra_order() 2024-02-22 10:24:38 -08:00
rmap.c rmap: replace two calls to compound_order with folio_order 2024-02-22 15:27:20 -08:00
rodata_test.c
secretmem.c
shmem.c shmem: properly report quota mount options 2024-02-23 17:48:34 -08:00
shmem_quota.c
show_mem.c mm, treewide: introduce NR_PAGE_ORDERS 2024-01-08 15:27:15 -08:00
shrinker.c mm: shrinker: use kvzalloc_node() from expand_one_shrinker_info() 2024-01-05 09:58:32 -08:00
shrinker_debug.c mm: shrinker: convert shrinker_rwsem to mutex 2023-10-04 10:32:26 -07:00
shuffle.c
shuffle.h mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER 2024-01-08 15:27:15 -08:00
slab.h mm/slab: move kmalloc() functions from slab_common.c to slub.c 2023-12-06 11:57:21 +01:00
slab_common.c slub: use a folio in __kmalloc_large_node 2024-01-05 10:17:46 -08:00
slub.c Many singleton patches against the MM code. The patch series which 2024-01-09 11:18:47 -08:00
sparse-vmemmap.c
sparse.c mm/memory_hotplug: introduce MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE notifiers 2024-02-21 16:00:01 -08:00
swap.c mm/mmu_gather: add __tlb_remove_folio_pages() 2024-02-22 15:27:17 -08:00
swap.h mm/swap: fix race when skipping swapcache 2024-02-20 14:20:48 -08:00
swap_cgroup.c
swap_slots.c mm/zswap: invalidate zswap entry when swap entry free 2024-02-22 10:24:54 -08:00
swap_state.c mm/mmu_gather: add __tlb_remove_folio_pages() 2024-02-22 15:27:17 -08:00
swapfile.c mm/swapfile:__swap_duplicate: drop redundant WRITE_ONCE on swap_map for err cases 2024-02-23 17:48:34 -08:00
truncate.c fs: convert error_remove_page to error_remove_folio 2023-12-10 16:51:42 -08:00
usercopy.c
userfaultfd.c userfaultfd: use per-vma locks in userfaultfd operations 2024-02-22 15:27:20 -08:00
util.c mm/util: use kmap_local_page() in memcmp_pages() 2023-12-10 16:51:49 -08:00
vmalloc.c mm: vmalloc: refactor vmalloc_dump_obj() function 2024-02-23 17:48:21 -08:00
vmpressure.c eventfd: simplify eventfd_signal() 2023-11-28 14:08:38 +01:00
vmscan.c mm/mglru: improve swappiness handling 2024-02-22 10:24:58 -08:00
vmstat.c mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER 2024-01-08 15:27:15 -08:00
workingset.c mm: ratelimit stat flush from workingset shrinker 2024-01-05 10:17:45 -08:00
z3fold.c mm/z3fold: fix the comment for __encode_handle() 2024-02-23 17:48:31 -08:00
zbud.c
zpool.c
zsmalloc.c mm/zsmalloc: remove get_zspage_mapping() 2024-02-23 17:48:32 -08:00
zswap.c mm: zswap: increase reject_compress_poor but not reject_compress_fail if compression returns ENOSPC 2024-02-23 17:48:31 -08:00