linux-stable/mm
Carlos Llamas 6757330b1b mm/mmap: undo ->mmap() when arch_validate_flags() fails
commit deb0f65628 upstream.

Commit c462ac288f ("mm: Introduce arch_validate_flags()") added a late
check in mmap_region() to let architectures validate vm_flags.  The check
needs to happen after calling ->mmap() as the flags can potentially be
modified during this callback.

If arch_validate_flags() check fails we unmap and free the vma.  However,
the error path fails to undo the ->mmap() call that previously succeeded
and depending on the specific ->mmap() implementation this translates to
reference increments, memory allocations and other operations what will
not be cleaned up.

There are several places (mainly device drivers) where this is an issue.
However, one specific example is bpf_map_mmap() which keeps count of the
mappings in map->writecnt.  The count is incremented on ->mmap() and then
decremented on vm_ops->close().  When arch_validate_flags() fails this
count is off since bpf_map_mmap_close() is never called.

One can reproduce this issue in arm64 devices with MTE support.  Here the
vm_flags are checked to only allow VM_MTE if VM_MTE_ALLOWED has been set
previously.  From userspace then is enough to pass the PROT_MTE flag to
mmap() syscall to trigger the arch_validate_flags() failure.

The following program reproduces this issue:

  #include <stdio.h>
  #include <unistd.h>
  #include <linux/unistd.h>
  #include <linux/bpf.h>
  #include <sys/mman.h>

  int main(void)
  {
	union bpf_attr attr = {
		.map_type = BPF_MAP_TYPE_ARRAY,
		.key_size = sizeof(int),
		.value_size = sizeof(long long),
		.max_entries = 256,
		.map_flags = BPF_F_MMAPABLE,
	};
	int fd;

	fd = syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
	mmap(NULL, 4096, PROT_WRITE | PROT_MTE, MAP_SHARED, fd, 0);

	return 0;
  }

By manually adding some log statements to the vm_ops callbacks we can
confirm that when passing PROT_MTE to mmap() the map->writecnt is off upon
->release():

With PROT_MTE flag:
  root@debian:~# ./bpf-test
  [  111.263874] bpf_map_write_active_inc: map=9 writecnt=1
  [  111.288763] bpf_map_release: map=9 writecnt=1

Without PROT_MTE flag:
  root@debian:~# ./bpf-test
  [  157.816912] bpf_map_write_active_inc: map=10 writecnt=1
  [  157.830442] bpf_map_write_active_dec: map=10 writecnt=0
  [  157.832396] bpf_map_release: map=10 writecnt=0

This patch fixes the above issue by calling vm_ops->close() when the
arch_validate_flags() check fails, after this we can proceed to unmap and
free the vma on the error path.

Link: https://lkml.kernel.org/r/20220930003844.1210987-1-cmllamas@google.com
Fixes: c462ac288f ("mm: Introduce arch_validate_flags()")
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Liam Howlett <liam.howlett@oracle.com>
Cc: Christian Brauner (Microsoft) <brauner@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>	[5.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-10-24 09:56:48 +02:00
..
damon mm/damon: validate if the pmd entry is present before accessing 2022-10-24 09:56:48 +02:00
kasan kasan: fix zeroing vmalloc memory with HW_TAGS 2022-08-17 15:15:26 +02:00
kfence Revert "mm: kfence: apply kmemleak_ignore_phys on early allocated pool" 2022-08-21 15:20:08 +02:00
backing-dev.c writeback: avoid use-after-free after removing device 2022-08-31 17:18:14 +02:00
balloon_compaction.c mm/balloon_compaction: make balloon page compaction callbacks static 2022-03-28 16:52:57 -04:00
bootmem_info.c bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem 2022-08-31 17:18:15 +02:00
cma.c Revert "mm/cma.c: remove redundant cma_mutex lock" 2022-05-13 15:11:26 -07:00
cma.h mm/cma: provide option to opt out from exposing pages on activation failure 2022-03-22 15:57:09 -07:00
cma_debug.c
cma_sysfs.c
compaction.c mm, compaction: fast_find_migrateblock() should return pfn in the target zone 2022-05-13 16:48:57 -07:00
debug.c mm: unexport page_init_poison 2022-03-24 19:06:45 -07:00
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable: add tests for __HAVE_ARCH_PTE_SWP_EXCLUSIVE 2022-05-09 18:20:45 -07:00
dmapool.c mm/dmapool.c: revert "make dma pool to use kmalloc_node" 2022-01-15 16:30:28 +02:00
early_ioremap.c mm/early_ioremap: declare early_memremap_pgprot_adjust() 2022-03-22 15:57:11 -07:00
fadvise.c riscv: compat: syscall: Add compat_sys_call_table implementation 2022-04-26 13:36:25 -07:00
failslab.c mm: fix missing handler for __GFP_NOWARN 2022-05-19 14:08:55 -07:00
filemap.c filemap: Handle sibling entries in filemap_get_read_batch() 2022-06-20 16:37:45 -04:00
folio-compat.c fs: Remove aop flags parameter from grab_cache_page_write_begin() 2022-05-08 14:28:19 -04:00
frontswap.c frontswap: don't call ->init if no ops are registered 2022-10-05 10:40:43 +02:00
gup.c mm/hugetlb: fix races when looking up a CONT-PTE/PMD size hugetlb page 2022-10-24 09:56:48 +02:00
gup_test.c
gup_test.h
highmem.c highmem: fix checks in __kmap_local_sched_{in,out} 2022-04-08 14:20:36 -10:00
hmm.c mm/hmm: fault non-owner device private entries 2022-07-29 11:33:37 -07:00
huge_memory.c mm/gup: fix FOLL_FORCE COW security issue and remove FOLL_COW 2022-08-31 17:17:59 +02:00
hugetlb.c mm/uffd: fix warning without PTE_MARKER_UFFD_WP compiled in 2022-10-24 09:56:48 +02:00
hugetlb_cgroup.c hugetlb_cgroup: fix wrong hugetlb cgroup numa stat 2022-08-17 15:16:15 +02:00
hugetlb_vmemmap.c mm: hugetlb_vmemmap: fix CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON 2022-06-01 15:57:16 -07:00
hugetlb_vmemmap.h mm: hugetlb_vmemmap: cleanup CONFIG_HUGETLB_PAGE_FREE_VMEMMAP* 2022-04-28 23:16:15 -07:00
hwpoison-inject.c mm/memory-failure: disable unpoison once hw error happens 2022-06-16 19:11:32 -07:00
init-mm.c kernel/fork: Initialize mm's PASID 2022-02-14 19:51:47 +01:00
internal.h mm: split free page with properly free memory accounting and without race 2022-05-27 09:33:43 -07:00
interval_tree.c
io-mapping.c
ioremap.c
Kconfig mm: Kconfig: reorganize misplaced mm options 2022-05-27 09:33:47 -07:00
Kconfig.debug Two followon fixes for the post-5.19 series "Use pageblock_order for cma 2022-05-27 11:40:49 -07:00
khugepaged.c mm: gup: fix the fast GUP race against THP collapse 2022-10-05 10:40:46 +02:00
kmemleak.c Revert "mm: kmemleak: take a full lowmem check in kmemleak_*_phys()" 2022-09-15 10:47:07 +02:00
ksm.c ksm: fix typo in comment 2022-05-25 10:47:48 -07:00
list_lru.c mm/list_lru.c: revert "mm/list_lru: optimize memcg_reparent_list_lru_node()" 2022-04-08 14:20:36 -10:00
maccess.c asm-generic updates for 5.18 2022-03-23 18:03:08 -07:00
madvise.c mm: fix madivse_pageout mishandling on non-LRU page 2022-10-05 10:40:47 +02:00
Makefile mm: hugetlb_vmemmap: cleanup CONFIG_HUGETLB_PAGE_FREE_VMEMMAP* 2022-04-28 23:16:15 -07:00
mapping_dirty_helpers.c mm: move tlb_flush_pending inline helpers to mm_inline.h 2022-01-15 16:30:27 +02:00
memblock.c memblock: test suite and a small cleanup 2022-03-27 13:36:06 -07:00
memcontrol.c mm: memcontrol: reference to tools/cgroup/memcg_slabinfo.py 2022-06-16 19:11:32 -07:00
memfd.c memfd: fix F_SEAL_WRITE after shmem huge page allocated 2022-03-05 11:08:32 -08:00
memory-failure.c mm,hwpoison: check mm when killing accessing process 2022-10-05 10:40:47 +02:00
memory.c mm/uffd: fix warning without PTE_MARKER_UFFD_WP compiled in 2022-10-24 09:56:48 +02:00
memory_hotplug.c mm/migration: return errno when isolate_huge_page failed 2022-08-17 15:15:26 +02:00
mempolicy.c mm/mempolicy: fix get_nodes out of bound access 2022-08-17 15:15:26 +02:00
mempool.c mm: remove spurious blkdev.h includes 2021-10-18 06:17:01 -06:00
memremap.c mm/memremap: fix memunmap_pages() race with get_dev_pagemap() 2022-08-17 15:15:21 +02:00
memtest.c
migrate.c mm/migration: fix potential pte_unmap on an not mapped pte 2022-08-17 15:15:26 +02:00
migrate_device.c mm/migrate_device.c: copy pte dirty bit to page 2022-10-05 10:40:47 +02:00
mincore.c mm: teach core mm about pte markers 2022-05-13 07:20:09 -07:00
mlock.c mm/munlock: protect the per-CPU pagevec by a local_lock_t 2022-04-01 11:46:09 -07:00
mm_init.c
mmap.c mm/mmap: undo ->mmap() when arch_validate_flags() fails 2022-10-24 09:56:48 +02:00
mmap_lock.c
mmu_gather.c mm/mmu_gather: limit free batch count and add schedule point in tlb_batch_pages_flush 2022-04-28 23:16:12 -07:00
mmu_notifier.c mm/mmu_notifier.c: fix race in mmu_interval_notifier_remove() 2022-04-21 20:01:10 -07:00
mmzone.c Folio changes for 5.18 2022-03-22 17:03:12 -07:00
mprotect.c mm/uffd: fix warning without PTE_MARKER_UFFD_WP compiled in 2022-10-24 09:56:48 +02:00
mremap.c Yang Shi has improved the behaviour of khugepaged collapsing of readonly 2022-05-26 12:32:41 -07:00
msync.c
nommu.c no-MMU: expose vmalloc_huge() for alloc_large_system_hash() 2022-04-25 10:11:49 -07:00
oom_kill.c mm/oom_kill.c: fix vm_oom_kill_table[] ifdeffery 2022-06-01 15:57:16 -07:00
page-writeback.c writeback: avoid use-after-free after removing device 2022-08-31 17:18:14 +02:00
page_alloc.c mm: prevent page_frag_alloc() from corrupting the memory 2022-10-05 10:40:46 +02:00
page_counter.c mm/page_counter: remove an incorrect call to propagate_protected_usage() 2022-01-15 16:30:27 +02:00
page_ext.c mm: use for_each_online_node and node_online instead of open coding 2022-04-29 14:36:58 -07:00
page_idle.c mm: don't be stuck to rmap lock on reclaim path 2022-05-19 14:08:54 -07:00
page_io.c Yang Shi has improved the behaviour of khugepaged collapsing of readonly 2022-05-26 12:32:41 -07:00
page_isolation.c mm/page_isolation: fix isolate_single_pageblock() isolation behavior 2022-10-05 10:40:46 +02:00
page_owner.c Yang Shi has improved the behaviour of khugepaged collapsing of readonly 2022-05-26 12:32:41 -07:00
page_poison.c
page_reporting.c
page_reporting.h
page_table_check.c Six hotfixes. One from Miaohe Lin is considered a minor thing so it isn't 2022-05-27 11:29:35 -07:00
page_vma_mapped.c mm: pvmw: add support for walking devmap pages 2022-04-28 23:16:10 -07:00
pagewalk.c mm: pagewalk: Fix race between unmap and page walker 2022-09-08 11:24:04 +02:00
percpu-internal.h percpu: improve percpu_alloc_percpu event trace 2022-05-13 07:20:18 -07:00
percpu-km.c
percpu-stats.c mm: use vmalloc_array and vcalloc for array allocations 2022-03-08 09:30:46 -05:00
percpu-vm.c
percpu.c mm: percpu: use kmemleak_ignore_phys() instead of kmemleak_free() 2022-08-17 15:15:34 +02:00
pgalloc-track.h
pgtable-generic.c mm: avoid unnecessary flush on change_huge_pmd() 2022-05-13 07:20:05 -07:00
process_vm_access.c
ptdump.c mm: pagewalk: Fix race between unmap and page walker 2022-09-08 11:24:04 +02:00
readahead.c filemap: Fix serialization adding transparent huge pages to page cache 2022-06-23 12:22:00 -04:00
rmap.c mm/rmap: Fix anon_vma->degree ambiguity leading to double-reuse 2022-09-05 10:31:28 +02:00
rodata_test.c
secretmem.c mm: fix dereferencing possible ERR_PTR 2022-10-05 10:40:46 +02:00
shmem.c shmem: update folio if shmem_replace_page() updates the page 2022-08-31 17:18:16 +02:00
shuffle.c
shuffle.h
slab.c Yang Shi has improved the behaviour of khugepaged collapsing of readonly 2022-05-26 12:32:41 -07:00
slab.h slab changes for 5.19 2022-05-25 10:24:04 -07:00
slab_common.c mm/slab_common: fix possible double free of kmem_cache 2022-09-28 11:32:14 +02:00
slob.c mm: make minimum slab alignment a runtime property 2022-05-13 07:20:07 -07:00
slub.c mm: slub: fix flush_cpu_slab()/__free_slab() invocations in task context. 2022-09-28 11:32:07 +02:00
sparse-vmemmap.c mm: sparsemem: fix missing higher order allocation splitting 2022-07-03 15:42:32 -07:00
sparse.c mm/memory-failure.c: move clear_hwpoisoned_pages 2022-05-13 07:20:19 -07:00
swap.c mm: lru_cache_disable: use synchronize_rcu_expedited 2022-06-16 19:11:30 -07:00
swap.h swap: convert add_to_swap() to take a folio 2022-05-13 07:20:15 -07:00
swap_cgroup.c mm: use vmalloc_array and vcalloc for array allocations 2022-03-08 09:30:46 -05:00
swap_slots.c mm/swap: remove buggy cache->nr check in refill_swap_slots_cache 2022-05-19 14:08:51 -07:00
swap_state.c mm: filter out swapin error entry in shmem mapping 2022-05-27 09:33:46 -07:00
swapfile.c Two followon fixes for the post-5.19 series "Use pageblock_order for cma 2022-05-27 11:40:49 -07:00
truncate.c Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
usercopy.c usercopy: Make usercopy resilient against ridiculously large copies 2022-06-13 09:54:52 -07:00
userfaultfd.c mm/uffd: reset write protection when unregister with wp-mode 2022-08-31 17:18:00 +02:00
util.c mm: fix BUG splat with kvmalloc + GFP_ATOMIC 2022-10-05 10:40:45 +02:00
vmacache.c
vmalloc.c kasan: fix zeroing vmalloc memory with HW_TAGS 2022-08-17 15:15:26 +02:00
vmpressure.c mm/vmpressure: fix data-race with memcg->socket_pressure 2021-11-06 13:30:40 -07:00
vmscan.c Yang Shi has improved the behaviour of khugepaged collapsing of readonly 2022-05-26 12:32:41 -07:00
vmstat.c Bitmap patches for 5.19-rc1 2022-06-04 14:04:27 -07:00
workingset.c memcg: sync flush only if periodic flush is delayed 2022-04-21 20:01:09 -07:00
z3fold.c mm/z3fold: fix z3fold_page_migrate races with z3fold_map 2022-05-27 09:33:44 -07:00
zbud.c
zpool.c zpool: remove the list of pools_head 2022-01-15 16:30:31 +02:00
zsmalloc.c zsmalloc: fix races between asynchronous zspage free and page migration 2022-05-13 15:11:26 -07:00
zswap.c zswap: memcg accounting 2022-05-19 14:08:53 -07:00