linux-stable/arch/powerpc/mm
Michael Ellerman 9f7853d760 powerpc/mm: Fix set_memory_*() against concurrent accesses
Laurent reported that STRICT_MODULE_RWX was causing intermittent crashes
on one of his systems:

  kernel tried to execute exec-protected page (c008000004073278) - exploit attempt? (uid: 0)
  BUG: Unable to handle kernel instruction fetch
  Faulting instruction address: 0xc008000004073278
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: drm virtio_console fuse drm_panel_orientation_quirks ...
  CPU: 3 PID: 44 Comm: kworker/3:1 Not tainted 5.14.0-rc4+ #12
  Workqueue: events control_work_handler [virtio_console]
  NIP:  c008000004073278 LR: c008000004073278 CTR: c0000000001e9de0
  REGS: c00000002e4ef7e0 TRAP: 0400   Not tainted  (5.14.0-rc4+)
  MSR:  800000004280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24002822 XER: 200400cf
  ...
  NIP fill_queue+0xf0/0x210 [virtio_console]
  LR  fill_queue+0xf0/0x210 [virtio_console]
  Call Trace:
    fill_queue+0xb4/0x210 [virtio_console] (unreliable)
    add_port+0x1a8/0x470 [virtio_console]
    control_work_handler+0xbc/0x1e8 [virtio_console]
    process_one_work+0x290/0x590
    worker_thread+0x88/0x620
    kthread+0x194/0x1a0
    ret_from_kernel_thread+0x5c/0x64

Jordan, Fabiano & Murilo were able to reproduce and identify that the
problem is caused by the call to module_enable_ro() in do_init_module(),
which happens after the module's init function has already been called.

Our current implementation of change_page_attr() is not safe against
concurrent accesses, because it invalidates the PTE before flushing the
TLB and then installing the new PTE. That leaves a window in time where
there is no valid PTE for the page, if another CPU tries to access the
page at that time we see something like the fault above.

We can't simply switch to set_pte_at()/flush TLB, because our hash MMU
code doesn't handle a set_pte_at() of a valid PTE. See [1].

But we do have pte_update(), which replaces the old PTE with the new,
meaning there's no window where the PTE is invalid. And the hash MMU
version hash__pte_update() deals with synchronising the hash page table
correctly.

[1]: https://lore.kernel.org/linuxppc-dev/87y318wp9r.fsf@linux.ibm.com/

Fixes: 1f9ad21c3b ("powerpc/mm: Implement set_memory() routines")
Reported-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Murilo Opsfelder Araújo <muriloo@linux.ibm.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210818120518.3603172-1-mpe@ellerman.id.au
2021-08-19 09:41:54 +10:00
..
book3s32 powerpc/32s: Fix setup_{kuap/kuep}() on SMP 2021-06-30 22:20:39 +10:00
book3s64 powerpc/book3s64/mm: update flush_tlb_range to flush page walk cache 2021-07-08 11:48:23 -07:00
kasan powerpc updates for 5.10 2020-10-16 12:21:15 -07:00
nohash powerpc/4xx: Fix setup_kuep() on SMP 2021-06-30 22:21:02 +10:00
ptdump powerpc/mm: Properly coalesce pages in ptdump 2021-06-25 00:07:10 +10:00
cacheflush.c powerpc/mem: Use kmap_local_page() in flushing functions 2021-04-14 23:04:19 +10:00
copro_fault.c mm: clean up the last pieces of page fault accountings 2020-08-12 10:58:04 -07:00
dma-noncoherent.c dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h> 2020-10-06 07:07:06 +02:00
drmem.c pseries/drmem: don't cache node id in drmem_lmb struct 2020-09-02 11:00:21 +10:00
fault.c powerpc/mm: Fix lockup on kernel exec fault 2021-07-05 22:23:24 +10:00
hugetlbpage.c hugetlb: pass vma into huge_pte_alloc() and huge_pmd_share() 2021-05-05 11:27:20 -07:00
init-common.c powerpc: Inline setup_kup() 2020-12-15 13:13:49 +11:00
init_32.c powerpc: Enable KFENCE for PPC32 2021-03-24 14:09:30 +11:00
init_64.c Merge branch 'fixes' into next 2020-09-14 22:57:18 +10:00
ioremap.c mm/vmalloc: remove unmap_kernel_range 2021-04-30 11:20:40 -07:00
ioremap_32.c powerpc/mm: Leave a gap between early allocated IO areas 2021-06-25 00:07:10 +10:00
ioremap_64.c powerpc/mm: Leave a gap between early allocated IO areas 2021-06-25 00:07:10 +10:00
maccess.c powerpc: Don't use 'struct ppc_inst' to reference instruction location 2021-06-17 00:09:00 +10:00
Makefile powerpc updates for 5.14 2021-07-02 12:54:34 -07:00
mem.c powerpc updates for 5.14 2021-07-02 12:54:34 -07:00
mmap.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
mmu_context.c KVM: PPC: Book3S HV: Implement radix prefetch workaround by disabling MMU 2021-06-10 22:12:14 +10:00
mmu_decl.h powerpc: Enable KFENCE for PPC32 2021-03-24 14:09:30 +11:00
numa.c powerpc/numa: Fix a regression on memoryless node 0 2020-11-27 22:06:21 +11:00
pageattr.c powerpc/mm: Fix set_memory_*() against concurrent accesses 2021-08-19 09:41:54 +10:00
pgtable-frag.c powerpc/mm/radix: Fix PTE/PMD fragment count for early page table mappings 2020-07-20 22:57:56 +10:00
pgtable.c powerpc/64s: Fix boot failure with 4K Radix 2021-06-25 00:06:54 +10:00
pgtable_32.c powerpc/32: use set_memory_attr() 2021-06-21 21:13:21 +10:00
pgtable_64.c mm: rename p4d_page_vaddr to p4d_pgtable and make it return pud_t * 2021-07-08 11:48:22 -07:00
slice.c powerpc: Replace _ALIGN_UP() by ALIGN() 2020-05-11 23:15:15 +10:00