linux-stable/mm
David Hildenbrand a7f2266041 mm/gup: trigger FAULT_FLAG_UNSHARE when R/O-pinning a possibly shared anonymous page
Whenever GUP currently ends up taking a R/O pin on an anonymous page that
might be shared -- mapped R/O and !PageAnonExclusive() -- any write fault
on the page table entry will end up replacing the mapped anonymous page
due to COW, resulting in the GUP pin no longer being consistent with the
page actually mapped into the page table.

The possible ways to deal with this situation are:
 (1) Ignore and pin -- what we do right now.
 (2) Fail to pin -- which would be rather surprising to callers and
     could break user space.
 (3) Trigger unsharing and pin the now exclusive page -- reliable R/O
     pins.

Let's implement 3) because it provides the clearest semantics and allows
for checking in unpin_user_pages() and friends for possible BUGs: when
trying to unpin a page that's no longer exclusive, clearly something went
very wrong and might result in memory corruptions that might be hard to
debug.  So we better have a nice way to spot such issues.

This change implies that whenever user space *wrote* to a private mapping
(IOW, we have an anonymous page mapped), that GUP pins will always remain
consistent: reliable R/O GUP pins of anonymous pages.

As a side note, this commit fixes the COW security issue for hugetlb with
FOLL_PIN as documented in:
  https://lore.kernel.org/r/3ae33b08-d9ef-f846-56fb-645e3b9b4c66@redhat.com
The vmsplice reproducer still applies, because vmsplice uses FOLL_GET
instead of FOLL_PIN.

Note that follow_huge_pmd() doesn't apply because we cannot end up in
there with FOLL_PIN.

This commit is heavily based on prototype patches by Andrea.

Link: https://lkml.kernel.org/r/20220428083441.37290-17-david@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Co-developed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Don Dutile <ddutile@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Liang Zhang <zhangliang5@huawei.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pedro Demarchi Gomes <pedrodemargomes@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-09 18:20:45 -07:00
..
damon mm/damon/reclaim: fix the timer always stays active 2022-04-29 14:37:00 -07:00
kasan kasan: fix sleeping function called from invalid context on RT kernel 2022-04-29 14:36:58 -07:00
kfence mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-15 14:49:55 -07:00
backing-dev.c remove congestion tracking framework 2022-03-22 15:57:01 -07:00
balloon_compaction.c mm/balloon_compaction: make balloon page compaction callbacks static 2022-03-28 16:52:57 -04:00
bootmem_info.c bootmem: Use page->index instead of page->freelist 2022-01-06 12:27:03 +01:00
cma.c mm/cma: provide option to opt out from exposing pages on activation failure 2022-03-22 15:57:09 -07:00
cma.h mm/cma: provide option to opt out from exposing pages on activation failure 2022-03-22 15:57:09 -07:00
cma_debug.c
cma_sysfs.c
compaction.c mm: compaction: make sure highest is above the min_pfn 2022-04-28 23:16:19 -07:00
debug.c mm: unexport page_init_poison 2022-03-24 19:06:45 -07:00
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable: drop protection_map[] usage 2022-04-28 23:16:12 -07:00
dmapool.c mm/dmapool.c: revert "make dma pool to use kmalloc_node" 2022-01-15 16:30:28 +02:00
early_ioremap.c mm/early_ioremap: declare early_memremap_pgprot_adjust() 2022-03-22 15:57:11 -07:00
fadvise.c remove inode_congested() 2022-03-22 15:57:01 -07:00
failslab.c
filemap.c tmpfs: fix regressions from wider use of ZERO_PAGE 2022-04-15 14:49:54 -07:00
folio-compat.c mm/rmap: Convert rmap_walk() to take a folio 2022-03-21 13:01:35 -04:00
frontswap.c frontswap: remove support for multiple ops 2022-01-22 08:33:38 +02:00
gup.c mm/gup: trigger FAULT_FLAG_UNSHARE when R/O-pinning a possibly shared anonymous page 2022-05-09 18:20:45 -07:00
gup_test.c
gup_test.h
highmem.c highmem: fix checks in __kmap_local_sched_{in,out} 2022-04-08 14:20:36 -10:00
hmm.c mm/hmm.c: remove unneeded local variable ret 2022-03-22 15:57:12 -07:00
huge_memory.c mm/gup: trigger FAULT_FLAG_UNSHARE when R/O-pinning a possibly shared anonymous page 2022-05-09 18:20:45 -07:00
hugetlb.c mm/gup: trigger FAULT_FLAG_UNSHARE when R/O-pinning a possibly shared anonymous page 2022-05-09 18:20:45 -07:00
hugetlb_cgroup.c hugetlb: add hugetlb.*.numa_stat file 2022-01-15 16:30:29 +02:00
hugetlb_vmemmap.c mm/hugetlb_vmemmap: move comment block to Documentation/vm 2022-04-28 23:16:15 -07:00
hugetlb_vmemmap.h mm: hugetlb_vmemmap: cleanup CONFIG_HUGETLB_PAGE_FREE_VMEMMAP* 2022-04-28 23:16:15 -07:00
hwpoison-inject.c mm/hwpoison: avoid the impact of hwpoison_filter() return value on mce handler 2022-03-22 15:57:07 -07:00
init-mm.c kernel/fork: Initialize mm's PASID 2022-02-14 19:51:47 +01:00
internal.h mm: compaction: clean up comment for sched contention 2022-04-28 23:16:17 -07:00
interval_tree.c
io-mapping.c
ioremap.c mm: move ioremap_page_range to vmalloc.c 2021-09-08 11:50:24 -07:00
Kconfig mm/mmap: drop arch_filter_pgprot() 2022-04-28 23:16:13 -07:00
Kconfig.debug mm: page table check 2022-01-15 16:30:28 +02:00
khugepaged.c mm/rmap: drop "compound" parameter from page_add_new_anon_rmap() 2022-05-09 18:20:43 -07:00
kmemleak.c mm: kmemleak: take a full lowmem check in kmemleak_*_phys() 2022-04-15 14:49:56 -07:00
ksm.c mm: remember exclusively mapped anonymous pages with PG_anon_exclusive 2022-05-09 18:20:44 -07:00
list_lru.c mm/list_lru.c: revert "mm/list_lru: optimize memcg_reparent_list_lru_node()" 2022-04-08 14:20:36 -10:00
maccess.c asm-generic updates for 5.18 2022-03-23 18:03:08 -07:00
madvise.c mm/madvise: fix potential pte_unmap_unlock pte error 2022-04-28 23:16:09 -07:00
Makefile mm: hugetlb_vmemmap: cleanup CONFIG_HUGETLB_PAGE_FREE_VMEMMAP* 2022-04-28 23:16:15 -07:00
mapping_dirty_helpers.c mm: move tlb_flush_pending inline helpers to mm_inline.h 2022-01-15 16:30:27 +02:00
memblock.c memblock: test suite and a small cleanup 2022-03-27 13:36:06 -07:00
memcontrol.c memcg: introduce per-memcg reclaim interface 2022-04-29 14:36:59 -07:00
memfd.c memfd: fix F_SEAL_WRITE after shmem huge page allocated 2022-03-05 11:08:32 -08:00
memory-failure.c mm, hugetlb, hwpoison: separate branch for free and in-use hugepage 2022-04-28 23:16:02 -07:00
memory.c mm: support GUP-triggered unsharing of anonymous pages 2022-05-09 18:20:45 -07:00
memory_hotplug.c mm/sparse-vmemmap: add a pgmap argument to section activation 2022-04-28 23:16:15 -07:00
mempolicy.c mm/mempolicy: clean up the code logic in queue_pages_pte_range 2022-04-28 23:16:06 -07:00
mempool.c mm: remove spurious blkdev.h includes 2021-10-18 06:17:01 -06:00
memremap.c mm/page-flags: reuse PG_mappedtodisk as PG_anon_exclusive for PageAnon() pages 2022-05-09 18:20:44 -07:00
memtest.c
migrate.c mm: remember exclusively mapped anonymous pages with PG_anon_exclusive 2022-05-09 18:20:44 -07:00
migrate_device.c mm: remember exclusively mapped anonymous pages with PG_anon_exclusive 2022-05-09 18:20:44 -07:00
mincore.c
mlock.c mm/munlock: protect the per-CPU pagevec by a local_lock_t 2022-04-01 11:46:09 -07:00
mm_init.c
mmap.c mm/mmap: drop arch_vm_get_page_pgprot() 2022-04-28 23:16:14 -07:00
mmap_lock.c mm: mmap_lock: fix disabling preemption directly 2021-07-23 17:43:28 -07:00
mmu_gather.c mm/mmu_gather: limit free batch count and add schedule point in tlb_batch_pages_flush 2022-04-28 23:16:12 -07:00
mmu_notifier.c mm/mmu_notifier.c: fix race in mmu_interval_notifier_remove() 2022-04-21 20:01:10 -07:00
mmzone.c Folio changes for 5.18 2022-03-22 17:03:12 -07:00
mprotect.c mm: remember exclusively mapped anonymous pages with PG_anon_exclusive 2022-05-09 18:20:44 -07:00
mremap.c mm/mremap: avoid unneeded do_munmap call 2022-04-28 23:16:14 -07:00
msync.c
nommu.c no-MMU: expose vmalloc_huge() for alloc_large_system_hash() 2022-04-25 10:11:49 -07:00
oom_kill.c oom_kill.c: futex: delay the OOM reaper to allow time for proper futex cleanup 2022-04-21 20:01:10 -07:00
page-writeback.c mm: rework calculation of bdi_min_ratio in bdi_set_min_ratio 2022-04-28 23:15:57 -07:00
page_alloc.c mm/page_alloc: simplify update of pgdat in wake_all_kswapds 2022-04-29 14:36:59 -07:00
page_counter.c mm/page_counter: remove an incorrect call to propagate_protected_usage() 2022-01-15 16:30:27 +02:00
page_ext.c mm: use for_each_online_node and node_online instead of open coding 2022-04-29 14:36:58 -07:00
page_idle.c mm/rmap: Constify the rmap_walk_control argument 2022-03-21 13:01:35 -04:00
page_io.c mm: fix unexpected zeroed page mapping with zram swap 2022-04-15 14:49:55 -07:00
page_isolation.c mm: wrap __find_buddy_pfn() with a necessary buddy page validation 2022-04-28 23:16:01 -07:00
page_owner.c mm/page_owner.c: record tgid 2022-03-24 19:06:44 -07:00
page_poison.c
page_reporting.c
page_reporting.h
page_table_check.c mm/page_table_check.c: use strtobool for param parsing 2022-03-22 15:57:11 -07:00
page_vma_mapped.c mm: pvmw: add support for walking devmap pages 2022-04-28 23:16:10 -07:00
pagewalk.c
percpu-internal.h mm: memcg/percpu: account extra objcg space to memory cgroups 2022-01-15 16:30:31 +02:00
percpu-km.c percpu: flush tlb in pcpu_reclaim_populated() 2021-07-04 18:30:17 +00:00
percpu-stats.c mm: use vmalloc_array and vcalloc for array allocations 2022-03-08 09:30:46 -05:00
percpu-vm.c percpu: flush tlb in pcpu_reclaim_populated() 2021-07-04 18:30:17 +00:00
percpu.c bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
pgalloc-track.h
pgtable-generic.c mm: move tlb_flush_pending inline helpers to mm_inline.h 2022-01-15 16:30:27 +02:00
process_vm_access.c
ptdump.c mm: sparsemem: use page table lock to protect kernel pmd operations 2022-03-22 15:57:08 -07:00
readahead.c readahead: Update comments 2022-04-01 14:40:42 -04:00
rmap.c mm/rmap: fail try_to_migrate() early when setting a PMD migration entry fails 2022-05-09 18:20:44 -07:00
rodata_test.c
secretmem.c mm/secretmem: fix panic when growing a memfd_secret 2022-04-15 14:49:54 -07:00
shmem.c mm: shmem: make shmem_init return void 2022-04-28 23:15:58 -07:00
shuffle.c
shuffle.h
slab.c mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-15 14:49:55 -07:00
slab.h mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-15 14:49:55 -07:00
slab_common.c mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-15 14:49:55 -07:00
slob.c mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-15 14:49:55 -07:00
slub.c mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-15 14:49:55 -07:00
sparse-vmemmap.c mm/sparse-vmemmap: improve memory savings for compound devmaps 2022-04-28 23:16:16 -07:00
sparse.c mm/sparse-vmemmap: add a pgmap argument to section activation 2022-04-28 23:16:15 -07:00
swap.c mm/munlock: protect the per-CPU pagevec by a local_lock_t 2022-04-01 11:46:09 -07:00
swap_cgroup.c mm: use vmalloc_array and vcalloc for array allocations 2022-03-08 09:30:46 -05:00
swap_slots.c treewide: Add missing includes masked by cgroup -> bpf dependency 2021-12-03 10:58:13 -08:00
swap_state.c Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
swapfile.c mm/page-flags: reuse PG_mappedtodisk as PG_anon_exclusive for PageAnon() pages 2022-05-09 18:20:44 -07:00
truncate.c Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
usercopy.c Merge branch 'akpm' (patches from Andrew) 2022-03-22 16:11:53 -07:00
userfaultfd.c mm/rmap: drop "compound" parameter from page_add_new_anon_rmap() 2022-05-09 18:20:43 -07:00
util.c kvmalloc: use vmalloc_huge for vmalloc allocations 2022-04-24 10:05:38 -07:00
vmacache.c
vmalloc.c vmap(): don't allow invalid pages 2022-04-28 23:16:00 -07:00
vmpressure.c mm/vmpressure: fix data-race with memcg->socket_pressure 2021-11-06 13:30:40 -07:00
vmscan.c mm/vmscan: fix comment for isolate_lru_pages 2022-04-28 23:16:04 -07:00
vmstat.c mm/vmstat: add events for ksm cow 2022-04-28 23:16:16 -07:00
workingset.c memcg: sync flush only if periodic flush is delayed 2022-04-21 20:01:09 -07:00
z3fold.c mm/z3fold: remove unneeded PAGE_HEADLESS check in free_handle() 2022-04-28 23:16:06 -07:00
zbud.c mm/zbud: add kerneldoc fields for zbud_pool 2021-07-01 11:06:03 -07:00
zpool.c zpool: remove the list of pools_head 2022-01-15 16:30:31 +02:00
zsmalloc.c zsmalloc: replace get_cpu_var with local_lock 2022-01-22 08:33:37 +02:00
zswap.c mm/zswap.c: allow handling just same-value filled pages 2022-03-22 15:57:11 -07:00