linux-stable/mm
Jan Stancek 09417dd35e mm/memory.c: do_fault: avoid usage of stale vm_area_struct
commit fc8efd2ddf upstream.

LTP testcase mtest06 [1] can trigger a crash on s390x running 5.0.0-rc8.
This is a stress test, where one thread mmaps/writes/munmaps memory area
and other thread is trying to read from it:

  CPU: 0 PID: 2611 Comm: mmap1 Not tainted 5.0.0-rc8+ #51
  Hardware name: IBM 2964 N63 400 (z/VM 6.4.0)
  Krnl PSW : 0404e00180000000 00000000001ac8d8 (__lock_acquire+0x7/0x7a8)
  Call Trace:
  ([<0000000000000000>]           (null))
   [<00000000001adae4>] lock_acquire+0xec/0x258
   [<000000000080d1ac>] _raw_spin_lock_bh+0x5c/0x98
   [<000000000012a780>] page_table_free+0x48/0x1a8
   [<00000000002f6e54>] do_fault+0xdc/0x670
   [<00000000002fadae>] __handle_mm_fault+0x416/0x5f0
   [<00000000002fb138>] handle_mm_fault+0x1b0/0x320
   [<00000000001248cc>] do_dat_exception+0x19c/0x2c8
   [<000000000080e5ee>] pgm_check_handler+0x19e/0x200

page_table_free() is called with NULL mm parameter, but because "0" is a
valid address on s390 (see S390_lowcore), it keeps going until it
eventually crashes in lockdep's lock_acquire.  This crash is
reproducible at least since 4.14.

Problem is that "vmf->vma" used in do_fault() can become stale.  Because
mmap_sem may be released, other threads can come in, call munmap() and
cause "vma" be returned to kmem cache, and get zeroed/re-initialized and
re-used:

handle_mm_fault                           |
  __handle_mm_fault                       |
    do_fault                              |
      vma = vmf->vma                      |
      do_read_fault                       |
        __do_fault                        |
          vma->vm_ops->fault(vmf);        |
            mmap_sem is released          |
                                          |
                                          | do_munmap()
                                          |   remove_vma_list()
                                          |     remove_vma()
                                          |       vm_area_free()
                                          |         # vma is released
                                          | ...
                                          | # same vma is allocated
                                          | # from kmem cache
                                          | do_mmap()
                                          |   vm_area_alloc()
                                          |     memset(vma, 0, ...)
                                          |
      pte_free(vma->vm_mm, ...);          |
        page_table_free                   |
          spin_lock_bh(&mm->context.lock);|
            <crash>                       |

Cache mm_struct to avoid using potentially stale "vma".

[1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/mtest06/mmap1.c

Link: http://lkml.kernel.org/r/5b3fdf19e2a5be460a384b936f5b56e13733f1b8.1551595137.git.jstancek@redhat.com
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Matthew Wilcox <willy@infradead.org>
Acked-by: Rafael Aquini <aquini@redhat.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-23 20:10:04 +01:00
..
kasan kernel/memremap, kasan: make ZONE_DEVICE with work with KASAN 2018-08-17 16:20:30 -07:00
backing-dev.c writeback: synchronize sync(2) against cgroup writeback membership switches 2019-03-05 17:58:50 +01:00
balloon_compaction.c virtio_balloon: fix deadlock on OOM 2017-11-14 23:57:38 +02:00
bootmem.c docs/mm: bootmem: add overview documentation 2018-08-02 12:17:27 -06:00
cleancache.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
cma.c mm/cma: remove unsupported gfp_mask parameter from cma_alloc() 2018-08-17 16:20:32 -07:00
cma.h
cma_debug.c mm/cma: remove unsupported gfp_mask parameter from cma_alloc() 2018-08-17 16:20:32 -07:00
compaction.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
debug.c mm: get rid of vmacache_flush_all() entirely 2018-09-13 15:18:04 -10:00
debug_page_ref.c
dmapool.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
early_ioremap.c mm/early_ioremap: Fix boot hang with earlyprintk=efi,keep 2017-12-11 14:54:44 +01:00
fadvise.c vfs: implement readahead(2) using POSIX_FADV_WILLNEED 2018-08-30 20:01:32 +02:00
failslab.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
filemap.c mm: use new return type vm_fault_t 2018-06-07 17:34:36 -07:00
frame_vector.c mm/frame_vector.c: release a semaphore in 'get_vaddr_frames()' 2017-12-14 16:00:48 -08:00
frontswap.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
gup.c mm/gup: fix gup_pmd_range() for dax 2019-03-23 20:09:46 +01:00
gup_benchmark.c mm/gup_benchmark: fix unsigned comparison to zero in __gup_benchmark_ioctl 2018-10-05 16:32:04 -07:00
highmem.c
hmm.c mm, hmm: mark hmm_devmem_{add, add_resource} EXPORT_SYMBOL_GPL 2019-01-13 09:51:04 +01:00
huge_memory.c mm: thp: fix flags for pmd migration when split 2018-12-29 13:37:58 +01:00
hugetlb.c hugetlbfs: fix races and page leaks during migration 2019-03-05 17:58:53 +01:00
hugetlb_cgroup.c mm: rename page_counter's count/limit into usage/max 2018-06-07 17:34:35 -07:00
hwpoison-inject.c mm/memory_failure: Remove unused trapno from memory_failure 2018-01-23 12:17:42 -06:00
init-mm.c mm: Allocate the mm_cpumask (mm->cpu_bitmap[]) dynamically based on nr_cpu_ids 2018-07-17 09:35:30 +02:00
internal.h mm: Change return type int to vm_fault_t for fault handlers 2018-08-23 18:48:44 -07:00
interval_tree.c mm/interval_tree.c: use vma_pages() helper 2018-01-31 17:18:37 -08:00
Kconfig mm: disable deferred struct page for 32-bit arches 2018-09-20 22:01:11 +02:00
Kconfig.debug mm: clarify CONFIG_PAGE_POISONING and usage 2018-08-22 10:52:44 -07:00
khugepaged.c mm/khugepaged: collapse_shmem() do not crash on Compound 2018-12-05 19:31:57 +01:00
kmemleak-test.c
kmemleak.c kmemleak: always register debugfs file 2018-09-04 16:45:02 -07:00
ksm.c include/linux/compiler*.h: make compiler-*.h mutually exclusive 2018-08-22 17:31:34 -07:00
list_lru.c mm/list_lru: introduce list_lru_shrink_walk_irq() 2018-08-17 16:20:32 -07:00
maccess.c mm: docs: fix parameter names mismatch 2018-02-06 18:32:48 -08:00
madvise.c mm: madvise(MADV_DODUMP): allow hugetlbfs pages 2018-10-05 16:32:05 -07:00
Makefile vfs: implement readahead(2) using POSIX_FADV_WILLNEED 2018-08-30 20:01:32 +02:00
memblock.c mm/memblock.c: replace u64 with phys_addr_t where appropriate 2018-08-17 16:20:30 -07:00
memcontrol.c memcg, oom: notify on oom killer invocation from the charge path 2019-01-13 09:51:04 +01:00
memfd.c alloc_file(): switch to passing O_... flags instead of FMODE_... mode 2018-07-12 10:02:57 -04:00
memory-failure.c mm: hwpoison: fix thp split handing in soft_offline_in_use_page() 2019-03-23 20:10:04 +01:00
memory.c mm/memory.c: do_fault: avoid usage of stale vm_area_struct 2019-03-23 20:10:04 +01:00
memory_hotplug.c mm, memory_hotplug: fix off-by-one in is_pageblock_removable 2019-03-13 14:02:32 -07:00
mempolicy.c numa: change get_mempolicy() to use nr_node_ids instead of MAX_NUMNODES 2019-02-27 10:08:50 +01:00
mempool.c mm/mempool.c: add missing parameter description 2018-08-22 10:52:44 -07:00
memtest.c
migrate.c hugetlbfs: fix races and page leaks during migration 2019-03-05 17:58:53 +01:00
mincore.c
mlock.c dax: remove VM_MIXEDMAP for fsdax and device dax 2018-08-17 16:20:27 -07:00
mm_init.c mm: access zone->node via zone_to_nid() and zone_set_nid() 2018-08-22 10:52:45 -07:00
mmap.c mm: enforce min addr even if capable() in expand_downwards() 2019-03-05 17:58:53 +01:00
mmu_context.c
mmu_notifier.c mm, oom: distinguish blockable mode for mmu notifiers 2018-08-22 10:52:44 -07:00
mmzone.c
mprotect.c x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings 2018-06-20 19:10:01 +02:00
mremap.c mremap: properly flush TLB before releasing the page 2018-10-18 11:30:52 +02:00
msync.c
nobootmem.c mm/memblock: add a name for memblock flags enumeration 2018-08-02 12:17:27 -06:00
nommu.c mm: provide a fallback for PAGE_KERNEL_EXEC for architectures 2018-08-17 16:20:29 -07:00
oom_kill.c mm, oom: fix use-after-free in oom_kill_process 2019-02-06 17:30:15 +01:00
page-writeback.c mm/page-writeback.c: don't break integrity writeback on ->writepage() error 2019-01-26 09:32:43 +01:00
page_alloc.c mm: page_alloc: fix ref bias in page_frag_alloc() for 1-byte allocs 2019-03-23 20:09:46 +01:00
page_counter.c memcg: introduce memory.min 2018-06-07 17:34:36 -07:00
page_ext.c Revert "mm: use early_pfn_to_nid in page_ext_init" 2019-03-23 20:09:46 +01:00
page_idle.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
page_io.c swap,blkcg: issue swap io with the appropriate context 2018-07-09 09:07:54 -06:00
page_isolation.c mm, migrate: remove reason argument from new_page_t 2018-04-11 10:28:32 -07:00
page_owner.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
page_poison.c mm/page_poison.c: make early_page_poison_param() __init 2018-04-05 21:36:26 -07:00
page_vma_mapped.c mm/rmap: map_pte() was not handling private ZONE_DEVICE page properly 2018-11-13 11:08:46 -08:00
pagewalk.c mm: kernel-doc: add missing parameter descriptions 2018-04-05 21:36:27 -07:00
percpu-internal.h
percpu-km.c percpu: convert spin_lock_irq to spin_lock_irqsave. 2019-02-12 19:47:12 +01:00
percpu-stats.c treewide: Use array_size() in vmalloc() 2018-06-12 16:19:22 -07:00
percpu-vm.c percpu: allow select gfp to be passed to underlying allocators 2018-02-18 05:33:01 -08:00
percpu.c percpu: stop leaking bitmap metadata blocks 2018-10-07 14:50:12 -07:00
pgtable-generic.c mm: do not lose dirty and accessed bits in pmdp_invalidate() 2018-01-31 17:18:38 -08:00
process_vm_access.c mm: docs: add blank lines to silence sphinx "Unexpected indentation" errors 2018-02-06 18:32:48 -08:00
quicklist.c
readahead.c vfs: implement readahead(2) using POSIX_FADV_WILLNEED 2018-08-30 20:01:32 +02:00
rmap.c mm/huge_memory: rename freeze_page() to unmap_page() 2018-12-05 19:31:57 +01:00
rodata_test.c
shmem.c tmpfs: fix uninitialized return value in shmem_link 2019-03-23 20:09:52 +01:00
slab.c slab: alien caches must not be initialized if the allocation of the alien cache failed 2019-01-16 22:04:33 +01:00
slab.h mm: introduce CONFIG_MEMCG_KMEM as combination of CONFIG_MEMCG && !CONFIG_SLOB 2018-08-17 16:20:30 -07:00
slab_common.c mm: don't warn about large allocations for slab 2018-12-01 09:37:28 +01:00
slob.c slab: __GFP_ZERO is incompatible with a constructor 2018-06-07 17:34:34 -07:00
slub.c notifier: Remove notifier header file wherever not used 2018-08-30 12:56:40 +02:00
sparse-vmemmap.c mm/sparse: delete old sparse_init and enable new one 2018-08-17 16:20:32 -07:00
sparse.c mm/sparse: delete old sparse_init and enable new one 2018-08-17 16:20:32 -07:00
swap.c mm: handle lru_add_drain_all for UP properly 2019-03-23 20:09:50 +01:00
swap_cgroup.c
swap_slots.c mm, swap, get_swap_pages: use entry_size instead of cluster in parameter 2018-08-22 10:52:44 -07:00
swap_state.c treewide: kvzalloc() -> kvcalloc() 2018-06-12 16:19:22 -07:00
swapfile.c mm/swap: use nr_node_ids for avail_lists in swap_info_struct 2019-01-26 09:32:43 +01:00
truncate.c mm: cleancache: fix corruption on missed inode invalidation 2018-12-05 19:32:13 +01:00
usercopy.c mm/usercopy.c: no check page span for stack objects 2019-01-16 22:04:33 +01:00
userfaultfd.c userfaultfd: shmem: add i_size checks 2018-12-08 12:59:08 +01:00
util.c mm: page_mapped: don't assume compound page is huge or THP 2019-01-16 22:04:36 +01:00
vmacache.c mm: get rid of vmacache_flush_all() entirely 2018-09-13 15:18:04 -10:00
vmalloc.c mm/vmalloc: fix size check for remap_vmalloc_range_partial() 2019-03-23 20:10:04 +01:00
vmpressure.c mm/vmpressure.c: convert to use match_string() helper 2018-06-07 17:34:36 -07:00
vmscan.c Revert "mm: slowly shrink slabs with a relatively small number of objects" 2019-02-20 10:25:47 +01:00
vmstat.c mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly 2018-10-05 16:32:05 -07:00
workingset.c mm/list_lru: introduce list_lru_shrink_walk_irq() 2018-08-17 16:20:32 -07:00
z3fold.c z3fold: fix possible reclaim races 2018-12-01 09:37:33 +01:00
zbud.c mm: docs: fix parameter names mismatch 2018-02-06 18:32:48 -08:00
zpool.c mm/zpool.c: zpool_evictable: fix mismatch in parameter name and kernel-doc 2018-02-21 15:35:43 -08:00
zsmalloc.c mm/zsmalloc.c: make several functions and a struct static 2018-08-17 16:20:30 -07:00
zswap.c zswap: re-check zswap_is_full() after do zswap_shrink() 2018-07-26 19:38:03 -07:00