linux-stable/mm
Johannes Weiner 493614da0d mm: compaction: fix endless looping over same migrate block
During stress testing, the following situation was observed:

     70 root      39  19       0      0      0 R 100.0   0.0 959:29.92 khugepaged
 310936 root      20   0   84416  25620    512 R  99.7   1.5 642:37.22 hugealloc

Tracing shows isolate_migratepages_block() endlessly looping over the
first block in the DMA zone:

       hugealloc-310936  [001] ..... 237297.415718: mm_compaction_finished: node=0 zone=DMA      order=9 ret=no_suitable_page
       hugealloc-310936  [001] ..... 237297.415718: mm_compaction_isolate_migratepages: range=(0x1 ~ 0x400) nr_scanned=513 nr_taken=0
       hugealloc-310936  [001] ..... 237297.415718: mm_compaction_finished: node=0 zone=DMA      order=9 ret=no_suitable_page
       hugealloc-310936  [001] ..... 237297.415718: mm_compaction_isolate_migratepages: range=(0x1 ~ 0x400) nr_scanned=513 nr_taken=0
       hugealloc-310936  [001] ..... 237297.415718: mm_compaction_finished: node=0 zone=DMA      order=9 ret=no_suitable_page
       hugealloc-310936  [001] ..... 237297.415718: mm_compaction_isolate_migratepages: range=(0x1 ~ 0x400) nr_scanned=513 nr_taken=0
       hugealloc-310936  [001] ..... 237297.415718: mm_compaction_finished: node=0 zone=DMA      order=9 ret=no_suitable_page
       hugealloc-310936  [001] ..... 237297.415718: mm_compaction_isolate_migratepages: range=(0x1 ~ 0x400) nr_scanned=513 nr_taken=0

The problem is that the function tries to test and set the skip bit once
on the block, to avoid skipping on its own skip-set, using
pageblock_aligned() on the pfn as a test.  But because this is the DMA
zone which starts at pfn 1, this is never true for the first block, and
the skip bit isn't set or tested at all.  As a result,
fast_find_migrateblock() returns the same pageblock over and over.

If the pfn isn't pageblock-aligned, also check if it's the start of the
zone to ensure test-and-set-exactly-once on unaligned ranges.
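
The condition described above can be sketched in isolation as follows. This is a minimal userspace illustration, not the patch itself: the helper names (`pfn_pageblock_aligned`, `checks_skip_bit_old`, `checks_skip_bit_new`) are hypothetical, and the 512-page pageblock size is an assumption chosen for the example.

```c
#include <stdbool.h>

/* Assumption for illustration: order-9 pageblocks, i.e. 512 pages each. */
#define PAGEBLOCK_ORDER    9UL
#define PAGEBLOCK_NR_PAGES (1UL << PAGEBLOCK_ORDER)

/* Hypothetical stand-in for the kernel's pageblock_aligned() test. */
static bool pfn_pageblock_aligned(unsigned long pfn)
{
    return (pfn & (PAGEBLOCK_NR_PAGES - 1)) == 0;
}

/*
 * Check as described for the buggy behaviour: the skip bit is only
 * tested and set when the scan starts on a pageblock-aligned pfn.
 */
static bool checks_skip_bit_old(unsigned long low_pfn)
{
    return pfn_pageblock_aligned(low_pfn);
}

/*
 * Check as described for the fix: an unaligned pfn also qualifies
 * when it is the first pfn of the zone, so the first (partial) block
 * of a zone is handled exactly once as well.
 */
static bool checks_skip_bit_new(unsigned long low_pfn,
                                unsigned long zone_start_pfn)
{
    return pfn_pageblock_aligned(low_pfn) || low_pfn == zone_start_pfn;
}
```

With a zone that starts at pfn 1, as in the trace range `0x1 ~ 0x400`, `checks_skip_bit_old(0x1)` is false, so the first block never gets its skip bit tested or set, while `checks_skip_bit_new(0x1, 0x1)` is true. Later pfns inside the same block, such as `0x2`, still fail both checks, which preserves the "exactly once per block" behaviour.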

Thanks to Vlastimil Babka for the help in debugging this.

Link: https://lkml.kernel.org/r/20230731172450.1632195-1-hannes@cmpxchg.org
Fixes: 90ed667c03 ("Revert "Revert "mm/compaction: fix set skip in fast_find_migrateblock""")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-04 13:03:42 -07:00
damon mm/damon/core-test: initialise context before test in damon_test_set_attrs() 2023-07-27 13:07:03 -07:00
kasan kasan, slub: fix HW_TAGS zeroing with slub_debug 2023-07-08 09:29:32 -07:00
kfence mm/slab: introduce kmem_cache flag SLAB_NO_MERGE 2023-06-02 10:24:33 +02:00
kmsan kasan,kmsan: remove __GFP_KSWAPD_RECLAIM usage from kasan/kmsan 2023-06-23 16:59:26 -07:00
backing-dev.c mm: backing-dev: make bdi_class a static const structure 2023-06-23 16:59:27 -07:00
balloon_compaction.c
bootmem_info.c
cma.c mm/page_owner/cma: show pfn in cma/page_owner with hex format 2023-06-19 16:19:32 -07:00
cma.h
cma_debug.c
cma_sysfs.c mm: cma: make kobj_type structure constant 2023-03-28 16:20:06 -07:00
compaction.c mm: compaction: fix endless looping over same migrate block 2023-08-04 13:03:42 -07:00
debug.c mm: update validate_mm() to use vma iterator 2023-06-09 16:25:31 -07:00
debug_page_alloc.c mm: page_alloc: split out DEBUG_PAGEALLOC 2023-06-09 16:25:23 -07:00
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable,page_table_check: warn pte map fails 2023-06-19 16:19:15 -07:00
dmapool.c dmapool: create/destroy cleanup 2023-06-09 16:25:17 -07:00
dmapool_test.c dmapool: add alloc/free performance test 2023-04-05 19:42:38 -07:00
early_ioremap.c mm/early_ioremap.c: improve the execution efficiency of early_ioremap_setup() 2023-06-09 16:25:56 -07:00
fadvise.c mm: remove unnecessary pagevec includes 2023-06-23 16:59:31 -07:00
fail_page_alloc.c mm: page_alloc: split out FAIL_PAGE_ALLOC 2023-06-09 16:25:23 -07:00
failslab.c
filemap.c - Yosry Ahmed brought back some cgroup v1 stats in OOM logs. 2023-06-28 10:28:11 -07:00
folio-compat.c - Nick Piggin's "shoot lazy tlbs" series, to improve the performance of 2023-04-27 19:42:02 -07:00
frontswap.c mm: zswap: support exclusive loads 2023-06-19 16:19:05 -07:00
gup.c gup: make the stack expansion warning a bit more targeted 2023-07-05 09:33:31 -07:00
gup_test.c Merge mm-hotfixes-stable into mm-stable to pick up depended-upon changes. 2023-06-23 16:58:19 -07:00
gup_test.h
highmem.c mm: ptep_get() conversion 2023-06-19 16:19:25 -07:00
hmm.c mm: ptep_get() conversion 2023-06-19 16:19:25 -07:00
huge_memory.c mm: remove references to pagevec 2023-06-23 16:59:30 -07:00
hugetlb.c hugetlb: do not clear hugetlb dtor until allocating vmemmap 2023-08-04 13:03:41 -07:00
hugetlb_cgroup.c mm/hugetlb: increase use of folios in alloc_huge_page() 2023-02-13 15:54:27 -08:00
hugetlb_vmemmap.c mm: ptep_get() conversion 2023-06-19 16:19:25 -07:00
hugetlb_vmemmap.h
hwpoison-inject.c
init-mm.c IOMMU Updates for Linux 6.4 2023-04-30 13:00:38 -07:00
internal.h - Yosry Ahmed brought back some cgroup v1 stats in OOM logs. 2023-06-28 10:28:11 -07:00
interval_tree.c
io-mapping.c
ioremap.c
Kconfig mm: disable CONFIG_PER_VMA_LOCK until its fixed 2023-07-08 09:29:29 -07:00
Kconfig.debug mm: page_table_check: Make it dependent on EXCLUSIVE_SYSTEM_RAM 2023-05-29 16:14:28 +01:00
khugepaged.c mm/khugepaged: fix regression in collapse_file() 2023-06-29 09:41:45 -07:00
kmemleak.c lib/stackdepot, mm: rename stack_depot_want_early_init 2023-02-16 20:43:49 -08:00
ksm.c mm/swapfile: fix wrong swap entry type for hwpoisoned swapcache page 2023-08-04 13:03:40 -07:00
list_lru.c
maccess.c mm: Fix copy_from_user_nofault(). 2023-04-12 17:36:23 -07:00
madvise.c mm: ptep_get() conversion 2023-06-19 16:19:25 -07:00
Makefile mm: page_alloc: split out DEBUG_PAGEALLOC 2023-06-09 16:25:23 -07:00
mapping_dirty_helpers.c mm: ptep_get() conversion 2023-06-19 16:19:25 -07:00
memblock.c Revert "mm,memblock: reset memblock.reserved to system init state to prevent UAF" 2023-07-28 09:47:06 -07:00
memcontrol.c mm/memcontrol: do not tweak node in mem_cgroup_init() 2023-06-23 16:59:26 -07:00
memfd.c memfd: check for non-NULL file_seals in memfd_create() syscall 2023-06-19 13:19:31 -07:00
memory-failure.c mm: memory-failure: avoid false hwpoison page mapped error info 2023-08-04 13:03:41 -07:00
memory-tiers.c memory tier: remove unneeded !IS_ENABLED(CONFIG_MIGRATION) check 2023-06-19 16:19:29 -07:00
memory.c mm: lock_vma_under_rcu() must check vma->anon_vma under vma lock 2023-07-27 11:13:22 -07:00
memory_hotplug.c mm: remove unnecessary pagevec includes 2023-06-23 16:59:31 -07:00
mempolicy.c mm/mempolicy: Take VMA lock before replacing policy 2023-07-28 09:44:06 -07:00
mempool.c
memremap.c mm/memremap.c: fix outdated comment in devm_memremap_pages 2023-02-09 16:51:46 -08:00
memtest.c mm/memtest: add results of early memtest to /proc/meminfo 2023-04-05 19:42:55 -07:00
migrate.c mm: remove unnecessary pagevec includes 2023-06-23 16:59:31 -07:00
migrate_device.c mm: remove references to pagevec 2023-06-23 16:59:30 -07:00
mincore.c mm: ptep_get() conversion 2023-06-19 16:19:25 -07:00
mlock.c mm/mlock: fix vma iterator conversion of apply_vma_lock_flags() 2023-07-17 12:53:21 -07:00
mm_init.c - Yosry Ahmed brought back some cgroup v1 stats in OOM logs. 2023-06-28 10:28:11 -07:00
mm_slot.h
mmap.c mm: lock VMA in dup_anon_vma() before setting ->anon_vma 2023-07-27 13:07:04 -07:00
mmap_lock.c
mmu_gather.c mm: prefer xxx_page() alloc/free functions for order-0 pages 2023-03-28 16:20:16 -07:00
mmu_notifier.c mm/mmu_notifier: remove unused mmu_notifier_range_update_to_read_only export 2023-02-02 22:32:54 -08:00
mmzone.c
mprotect.c Merge mm-hotfixes-stable into mm-stable to pick up depended-upon changes. 2023-06-23 16:58:19 -07:00
mremap.c mm: Update do_vmi_align_munmap() return semantics 2023-07-01 08:10:56 -07:00
msync.c
nommu.c xtensa: fix lock_mm_and_find_vma in case VMA not found 2023-07-01 08:00:05 -07:00
oom_kill.c mm, oom: do not check 0 mask in out_of_memory() 2023-06-09 16:25:20 -07:00
page-writeback.c writeback: account the number of pages written back 2023-07-08 09:29:30 -07:00
page_alloc.c - Yosry Ahmed brought back some cgroup v1 stats in OOM logs. 2023-06-28 10:28:11 -07:00
page_counter.c
page_ext.c mm/page_ext: init page_ext early if there are no deferred struct pages 2023-02-02 22:33:22 -08:00
page_idle.c
page_io.c swap: use __bio_add_page to add page to bio 2023-05-31 09:50:02 -06:00
page_isolation.c mm: page_isolation: write proper kerneldoc 2023-06-19 16:18:59 -07:00
page_owner.c mm/page_owner/cma: show pfn in cma/page_owner with hex format 2023-06-19 16:19:32 -07:00
page_poison.c
page_reporting.c mm, treewide: redefine MAX_ORDER sanely 2023-04-05 19:42:46 -07:00
page_reporting.h
page_table_check.c - Yosry Ahmed brought back some cgroup v1 stats in OOM logs. 2023-06-28 10:28:11 -07:00
page_vma_mapped.c mm: ptep_get() conversion 2023-06-19 16:19:25 -07:00
pagewalk.c mm/pagewalk: fix EFI_PGT_DUMP of espfix area 2023-07-27 13:07:04 -07:00
percpu-internal.h percpu-internal/pcpu_chunk: re-layout pcpu_chunk structure to reduce false sharing 2023-06-19 16:19:29 -07:00
percpu-km.c
percpu-stats.c
percpu-vm.c
percpu.c mm: memcontrol: rename memcg_kmem_enabled() 2023-02-16 20:43:56 -08:00
pgalloc-track.h
pgtable-generic.c mm: ptep_get() conversion 2023-06-19 16:19:25 -07:00
process_vm_access.c mm/gup: remove unused vmas parameter from pin_user_pages_remote() 2023-06-09 16:25:25 -07:00
ptdump.c mm: ptdump should use ptep_get_lockless() 2023-06-19 16:19:24 -07:00
readahead.c mm: remove unnecessary pagevec includes 2023-06-23 16:59:31 -07:00
rmap.c mm: ptep_get() conversion 2023-06-19 16:19:25 -07:00
rodata_test.c
secretmem.c mm/mlock: rename mlock_future_check() to mlock_future_ok() 2023-06-09 16:25:38 -07:00
shmem.c shmem: minor fixes to splice-read implementation 2023-07-27 13:07:03 -07:00
show_mem.c mm: page_alloc: collect mem statistic into show_mem.c 2023-06-09 16:25:22 -07:00
shrinker_debug.c Revert "mm: shrinkers: make count and scan in shrinker debugfs lockless" 2023-06-19 13:19:34 -07:00
shuffle.c
shuffle.h mm, treewide: redefine MAX_ORDER sanely 2023-04-05 19:42:46 -07:00
slab.c slab updates for 6.5 2023-06-29 16:34:12 -07:00
slab.h kasan, slub: fix HW_TAGS zeroing with slub_debug 2023-07-08 09:29:32 -07:00
slab_common.c slab updates for 6.5 2023-06-29 16:34:12 -07:00
slub.c slab updates for 6.5 2023-06-29 16:34:12 -07:00
sparse-vmemmap.c mm: ptep_get() conversion 2023-06-19 16:19:25 -07:00
sparse.c - Arnd Bergmann has fixed a bunch of -Wmissing-prototypes in 2023-06-28 10:59:38 -07:00
swap.c mm: remove references to pagevec 2023-06-23 16:59:30 -07:00
swap.h mm: remove the __swap_writepage return value 2023-02-02 22:33:33 -08:00
swap_cgroup.c
swap_slots.c
swap_state.c mm: remove unnecessary pagevec includes 2023-06-23 16:59:31 -07:00
swapfile.c mm/swapfile: fix wrong swap entry type for hwpoisoned swapcache page 2023-08-04 13:03:40 -07:00
truncate.c mm: remove references to pagevec 2023-06-23 16:59:30 -07:00
usercopy.c mm: Fix copy_from_user_nofault(). 2023-04-12 17:36:23 -07:00
userfaultfd.c mm: ptep_get() conversion 2023-06-19 16:19:25 -07:00
util.c mm: uninline kstrdup() 2023-04-08 13:45:37 -07:00
vmalloc.c Merge mm-hotfixes-stable into mm-stable to pick up depended-upon changes. 2023-06-23 16:58:19 -07:00
vmpressure.c
vmscan.c mm/vmscan: fix root proactive reclaim unthrottling unbalanced node 2023-06-23 16:59:32 -07:00
vmstat.c - Yosry Ahmed brought back some cgroup v1 stats in OOM logs. 2023-06-28 10:28:11 -07:00
workingset.c Multi-gen LRU: fix workingset accounting 2023-06-09 16:25:46 -07:00
z3fold.c mm: zswap: remove shrink from zpool interface 2023-06-19 16:19:27 -07:00
zbud.c mm: zswap: remove shrink from zpool interface 2023-06-19 16:19:27 -07:00
zpool.c mm: zswap: remove shrink from zpool interface 2023-06-19 16:19:27 -07:00
zsmalloc.c zsmalloc: fix races between modifications of fullness and isolated 2023-08-04 13:03:40 -07:00
zswap.c mm: zswap: fix double invalidate with exclusive loads 2023-06-23 16:59:31 -07:00