linux-stable/mm
Hugh Dickins 1cb5d11a37 mempolicy: fix migrate_pages(2) syscall return nr_failed
"man 2 migrate_pages" says "On success migrate_pages() returns the number
of pages that could not be moved".  Although 5.3 and 5.4 commits fixed
mbind(MPOL_MF_STRICT|MPOL_MF_MOVE*) to fail with EIO when not all pages
could be moved (because some could not be isolated for migration),
migrate_pages(2) was left still reporting only those pages failing at the
migration stage, forgetting those failing at the earlier isolation stage.

Fix that by accumulating a long nr_failed count in struct queue_pages,
returned by queue_pages_range() when it's not returning an error, for
adding on to the nr_failed count from migrate_pages() in mm/migrate.c.  A
count of pages?  It's more a count of folios, but changing it to pages
would entail more work (also in mm/migrate.c): does not seem justified.

queue_pages_range() itself should only return -EIO in the "strictly
unmovable" case (STRICT without any MOVEs): in that case it's best to
break out as soon as nr_failed gets set; but otherwise it should continue
to isolate pages for MOVing even when nr_failed - as the mbind(2) manpage
promises.

There's a case when nr_failed should be incremented when it was missed:
queue_folios_pte_range() and queue_folios_hugetlb() count the transient
migration entries, like queue_folios_pmd() already did.  And there's a
case when nr_failed should not be incremented when it would have been: in
meeting later PTEs of the same large folio, which can only be isolated
once: fixed by recording the current large folio in struct queue_pages.

Clean up the affected functions, fixing or updating many comments.  Bool
migrate_folio_add(), without -EIO: true if adding, or if skipping shared
(but its arguable folio_estimated_sharers() heuristic left unchanged). 
Use MPOL_MF_WRLOCK flag to queue_pages_range(), instead of bool lock_vma. 
Use explicit STRICT|MOVE* flags where queue_pages_test_walk() checks for
skipping, instead of hiding them behind MPOL_MF_VALID.

Link: https://lkml.kernel.org/r/9a6b0b9-3bb-dbef-8adf-efab4397b8d@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Tejun heo <tj@kernel.org>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-25 16:47:15 -07:00
..
damon mm/damon/sysfs-test: add a unit test for damon_sysfs_set_targets() 2023-10-25 16:47:15 -07:00
kasan mm: delete checks for xor_unlock_is_negative_byte() 2023-10-18 14:34:17 -07:00
kfence LoongArch changes for v6.6 2023-09-08 12:16:52 -07:00
kmsan mm: kmsan: panic on failure to allocate early boot metadata 2023-10-25 16:47:10 -07:00
backing-dev.c writeback: remove redundant checks for root memcg 2023-08-21 13:37:48 -07:00
balloon_compaction.c
bootmem_info.c bootmem: use kmemleak_free_part_phys in put_page_bootmem 2023-10-25 16:47:13 -07:00
cma.c mm/cma: use nth_page() in place of direct struct page manipulation 2023-10-04 10:32:29 -07:00
cma.h
cma_debug.c
cma_sysfs.c
compaction.c mm/compaction: factor out code to test if we should run compaction for target order 2023-10-04 10:32:19 -07:00
debug.c
debug_page_alloc.c
debug_page_ref.c
debug_vm_pgtable.c mm: fix multiple typos in multiple files 2023-10-25 16:47:14 -07:00
dmapool.c
dmapool_test.c
early_ioremap.c
fadvise.c mm: remove unnecessary pagevec includes 2023-06-23 16:59:31 -07:00
fail_page_alloc.c
failslab.c
filemap.c mm: drop the assumption that VM_SHARED always implies writable 2023-10-18 14:34:19 -07:00
folio-compat.c filemap: Add fgf_t typedef 2023-07-24 18:04:30 -04:00
gup.c mm/gup: make failure to pin an error if FOLL_NOWAIT not specified 2023-10-18 14:34:15 -07:00
gup_test.c Merge mm-hotfixes-stable into mm-stable to pick up depended-upon changes. 2023-06-23 16:58:19 -07:00
gup_test.h
highmem.c mm: ptep_get() conversion 2023-06-19 16:19:25 -07:00
hmm.c mm: enable page walking API to lock vmas during the walk 2023-08-21 13:07:20 -07:00
huge_memory.c mm: huge_memory: use folio_xchg_last_cpupid() in __split_huge_page_tail() 2023-10-25 16:47:12 -07:00
hugetlb.c hugetlb_vmemmap: use folio argument for hugetlb_vmemmap_* functions 2023-10-25 16:47:08 -07:00
hugetlb_cgroup.c mm, hugetlb: remove HUGETLB_CGROUP_MIN_ORDER 2023-10-18 14:34:17 -07:00
hugetlb_vmemmap.c hugetlb_vmemmap: use folio argument for hugetlb_vmemmap_* functions 2023-10-25 16:47:08 -07:00
hugetlb_vmemmap.h mm: hugetlb_vmemmap: fix reference to nonexistent file 2023-10-25 16:47:14 -07:00
hwpoison-inject.c
init-mm.c mm: move dummy_vm_ops out of a header 2023-08-21 13:37:46 -07:00
internal.h mm: fix multiple typos in multiple files 2023-10-25 16:47:14 -07:00
interval_tree.c
io-mapping.c
ioremap.c mm: ioremap: remove unneeded ioremap_allowed and iounmap_allowed 2023-08-18 10:12:36 -07:00
Kconfig mm: restrict the pcp batch scale factor to avoid too long latency 2023-10-25 16:47:10 -07:00
Kconfig.debug
khugepaged.c mm/khugepaged: convert collapse_pte_mapped_thp() to use folios 2023-10-25 16:47:14 -07:00
kmemleak.c mm/kmemleak: move the initialisation of object to __link_object 2023-10-25 16:47:13 -07:00
ksm.c mm/ksm: add pages_skipped metric 2023-10-16 15:44:39 -07:00
list_lru.c
maccess.c
madvise.c mm: drop the assumption that VM_SHARED always implies writable 2023-10-18 14:34:19 -07:00
Makefile mm: vmscan: move shrinker-related code into a separate file 2023-10-04 10:32:23 -07:00
mapping_dirty_helpers.c mm: fix clean_record_shared_mapping_range kernel-doc 2023-08-24 16:20:30 -07:00
memblock.c memblock: introduce MEMBLOCK_RSRV_NOINIT flag 2023-10-04 10:32:30 -07:00
memcontrol.c mm: fix multiple typos in multiple files 2023-10-25 16:47:14 -07:00
memfd.c memfd: drop warning for missing exec-related flags 2023-10-04 10:32:22 -07:00
memory-failure.c mm: convert DAX lock/unlock page to lock/unlock folio 2023-10-04 10:32:20 -07:00
memory-tiers.c dax, kmem: calculate abstract distance with general interface 2023-10-16 15:44:39 -07:00
memory.c mm: use folio_xchg_last_cpupid() in wp_page_reuse() 2023-10-25 16:47:13 -07:00
memory_hotplug.c mm: memory_hotplug: drop memoryless node from fallback lists 2023-10-25 16:47:14 -07:00
mempolicy.c mempolicy: fix migrate_pages(2) syscall return nr_failed 2023-10-25 16:47:15 -07:00
mempool.c
memremap.c
memtest.c mm: memtest: convert to memtest_report_meminfo() 2023-08-21 13:37:47 -07:00
migrate.c mm: migrate: record the mlocked page status to remove unnecessary lru drain 2023-10-25 16:47:14 -07:00
migrate_device.c Add x86 shadow stack support 2023-08-31 12:20:12 -07:00
mincore.c mm: enable page walking API to lock vmas during the walk 2023-08-21 13:07:20 -07:00
mlock.c mm: mlock: avoid folio_within_range() on KSM pages 2023-10-25 16:47:14 -07:00
mm_init.c mm: hugetlb: skip initialization of gigantic tail struct pages if freed by HVO 2023-10-04 10:32:30 -07:00
mm_slot.h
mmap.c mm: fix multiple typos in multiple files 2023-10-25 16:47:14 -07:00
mmap_lock.c
mmu_gather.c mm: fix kernel-doc warning from tlb_flush_rmaps() 2023-08-24 16:20:30 -07:00
mmu_notifier.c mmu_notifiers: rename invalidate_range notifier 2023-08-18 10:12:41 -07:00
mmzone.c mm: remove page_cpupid_xchg_last() 2023-10-25 16:47:13 -07:00
mprotect.c mm: mprotect: use a folio in change_pte_range() 2023-10-25 16:47:12 -07:00
mremap.c mm: abstract VMA merge and extend into vma_merge_extend() helper 2023-10-18 14:34:18 -07:00
msync.c
nommu.c mm: make vma_merge() and split_vma() internal 2023-10-18 14:34:18 -07:00
oom_kill.c mm/oom_killer: simplify OOM killer info dump helper 2023-10-25 16:47:10 -07:00
page-writeback.c mm: use folio_xor_flags_has_waiters() in folio_end_writeback() 2023-10-18 14:34:17 -07:00
page_alloc.c mm: page_alloc: check the order of compound page even when the order is zero 2023-10-25 16:47:14 -07:00
page_counter.c
page_ext.c mm/page_ext: move functions around for minor cleanups to page_ext 2023-08-18 10:12:31 -07:00
page_idle.c
page_io.c mm: memcg: add THP swap out info for anonymous reclaim 2023-10-04 10:32:27 -07:00
page_isolation.c mm/hugetlb: get rid of page_hstate() 2023-08-18 10:12:39 -07:00
page_owner.c mm/page_owner: remove free_ts from page_owner output 2023-10-18 14:34:19 -07:00
page_poison.c mm/page_poison: remove unused page_ext.h from page_poison 2023-08-21 13:37:30 -07:00
page_reporting.c
page_reporting.h
page_table_check.c mm: convert page_table_check_pte_set() to page_table_check_ptes_set() 2023-08-24 16:20:18 -07:00
page_vma_mapped.c mm: correct stale comment of function check_pte 2023-08-18 10:12:13 -07:00
pagewalk.c mm/pagewalk: fix bootstopping regression from extra pte_unmap() 2023-09-02 08:39:21 -07:00
percpu-internal.h percpu-internal/pcpu_chunk: re-layout pcpu_chunk structure to reduce false sharing 2023-06-19 16:19:29 -07:00
percpu-km.c
percpu-stats.c
percpu-vm.c
percpu.c percpu: scoped objcg protection 2023-10-25 16:47:11 -07:00
pgalloc-track.h
pgtable-generic.c mm/pgtable: notes on pte_offset_map[_lock]() 2023-08-18 10:12:25 -07:00
process_vm_access.c
ptdump.c mm: ptdump should use ptep_get_lockless() 2023-06-19 16:19:24 -07:00
readahead.c filemap: Allow __filemap_get_folio to allocate large folios 2023-07-24 18:04:30 -04:00
rmap.c mm/rmap: convert page_move_anon_rmap() to folio_move_anon_rmap() 2023-10-18 14:34:14 -07:00
rodata_test.c
secretmem.c mm/secretmem: use a folio in secretmem_fault() 2023-08-21 13:38:02 -07:00
shmem.c mm: update memfd seal write check to include F_SEAL_WRITE 2023-10-18 14:34:19 -07:00
shmem_quota.c shmem: Add default quota limit mount options 2023-08-09 09:15:40 +02:00
show_mem.c mm: refactor si_mem_available() 2023-10-04 10:32:19 -07:00
shrinker.c mm: shrinker: convert shrinker_rwsem to mutex 2023-10-04 10:32:26 -07:00
shrinker_debug.c mm: shrinker: convert shrinker_rwsem to mutex 2023-10-04 10:32:26 -07:00
shuffle.c
shuffle.h
slab.c Randomized slab caches for kmalloc() 2023-07-18 10:07:47 +02:00
slab.h mm: kmem: scoped objcg protection 2023-10-25 16:47:11 -07:00
slab_common.c slab fixes for 6.6-rc4 2023-09-29 12:10:12 -07:00
slub.c mm/slub: remove freelist_dereference() 2023-07-14 09:57:21 +02:00
sparse-vmemmap.c mm/vmemmap: allow architectures to override how vmemmap optimization works 2023-08-18 10:12:53 -07:00
sparse.c mm/sparse: remove redundant judgments from macro for_each_present_section_nr 2023-08-18 10:12:14 -07:00
swap.c mm: remove references to pagevec 2023-06-23 16:59:30 -07:00
swap.h swap: remove remnants of polling from read_swap_cache_async 2023-08-24 16:20:16 -07:00
swap_cgroup.c
swap_slots.c
swap_state.c mm/swap: avoid a xa load for swapout path 2023-10-25 16:47:11 -07:00
swapfile.c mm/swap: inline folio_set_swap_entry() and folio_swap_entry() 2023-08-24 16:20:28 -07:00
truncate.c - Some swap cleanups from Ma Wupeng ("fix WARN_ON in add_to_avail_list") 2023-08-29 14:25:26 -07:00
usercopy.c
userfaultfd.c Add x86 shadow stack support 2023-08-31 12:20:12 -07:00
util.c nilfs2: convert nilfs_copy_page() to nilfs_copy_folio() 2023-10-25 16:47:09 -07:00
vmalloc.c mm: hugetlb: add huge page size param to set_huge_pte_at() 2023-09-29 17:20:47 -07:00
vmpressure.c net-memcg: Fix scope of sockmem pressure indicators 2023-08-16 12:21:32 +01:00
vmscan.c mm: multi-gen LRU: reuse some legacy trace events 2023-10-18 14:34:14 -07:00
vmstat.c mm: tune PCP high automatically 2023-10-25 16:47:10 -07:00
workingset.c mm: workingset: dynamically allocate the mm-shadow shrinker 2023-10-04 10:32:24 -07:00
z3fold.c mm/z3fold: remove obsolete comment for struct z3fold_pool 2023-08-21 13:37:51 -07:00
zbud.c mm: zswap: remove shrink from zpool interface 2023-06-19 16:19:27 -07:00
zpool.c mm: zswap: remove shrink from zpool interface 2023-06-19 16:19:27 -07:00
zsmalloc.c zsmalloc: use copy_page for full page copy 2023-10-18 14:34:16 -07:00
zswap.c mm: zswap: fix pool refcount bug around shrink_worker() 2023-10-18 12:12:40 -07:00