linux-stable/include/asm-generic
Mathieu Desnoyers fe90f3967b sched: Add missing memory barrier in switch_mm_cid
Many architectures' switch_mm() (e.g. arm64) do not have an smp_mb()
which the core scheduler code has depended upon since commit:

    commit 223baf9d17 ("sched: Fix performance regression introduced by mm_cid")

If switch_mm() doesn't call smp_mb(), sched_mm_cid_remote_clear() can
unset the actively used cid when it fails to observe active task after it
sets lazy_put.

There *is* a memory barrier between storing to rq->curr and _return to
userspace_ (as required by membarrier), but the rseq mm_cid has stricter
requirements: the barrier needs to be issued between store to rq->curr
and switch_mm_cid(), which happens earlier than:

  - spin_unlock(),
  - switch_to().

So it's fine when the architecture switch_mm() happens to have that
barrier already, but less so when the architecture only provides the
full barrier in switch_to() or spin_unlock().

It is a bug in the rseq switch_mm_cid() implementation. All architectures
that don't have memory barriers in switch_mm(), but rather have the full
barrier either in finish_lock_switch() or switch_to() have them too late
for the needs of switch_mm_cid().

Introduce a new smp_mb__after_switch_mm(), defined as smp_mb() in the
generic barrier.h header, and use it in switch_mm_cid() for scheduler
transitions where switch_mm() is expected to provide a memory barrier.

Architectures can override smp_mb__after_switch_mm() if their
switch_mm() implementation provides an implicit memory barrier.
Override it with a no-op on x86 which implicitly provide this memory
barrier by writing to CR3.

Fixes: 223baf9d17 ("sched: Fix performance regression introduced by mm_cid")
Reported-by: levi.yun <yeoreum.yun@arm.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> # for arm64
Acked-by: Dave Hansen <dave.hansen@linux.intel.com> # for x86
Cc: <stable@vger.kernel.org> # 6.4.x
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20240415152114.59122-2-mathieu.desnoyers@efficios.com
2024-04-16 13:59:45 +02:00
..
bitops riscv: Avoid code duplication with generic bitops implementation 2024-01-24 17:25:36 -08:00
vdso
Kbuild cfi: Flip headers 2023-12-15 16:25:55 -08:00
access_ok.h
agp.h
archrandom.h
asm-offsets.h
asm-prototypes.h
atomic.h
atomic64.h
audit_change_attr.h
audit_dir_write.h
audit_read.h
audit_signal.h
audit_write.h
barrier.h sched: Add missing memory barrier in switch_mm_cid 2024-04-16 13:59:45 +02:00
bitops.h
bitsperlong.h
bug.h bug: Fix no-return-statement warning with !CONFIG_BUG 2024-04-10 22:01:35 +02:00
cache.h
cacheflush.h mm: Introduce flush_cache_vmap_early() 2023-12-14 00:23:17 -08:00
cfi.h cfi: Flip headers 2023-12-15 16:25:55 -08:00
checksum.h asm-generic: Improve csum_fold 2024-01-17 17:52:29 -08:00
cmpxchg-local.h asm-generic: Fix 32 bit __generic_cmpxchg_local 2024-01-05 23:19:14 +01:00
cmpxchg.h
compat.h
current.h
delay.h
device.h
div64.h
dma-mapping.h
dma.h
early_ioremap.h
emergency-restart.h
error-injection.h
exec.h
extable.h
fb.h
fixmap.h
flat.h
ftrace.h
futex.h
getorder.h
hardirq.h
hugetlb.h
hw_irq.h
hyperv-tlfs.h hyperv-fixes for v6.9-rc4 2024-04-11 16:23:56 -07:00
int-ll64.h
io.h
ioctl.h
iomap.h
irq.h
irq_regs.h
irq_work.h
irqflags.h
kdebug.h
kmap_size.h
kprobes.h
kvm_para.h
kvm_types.h
linkage.h
local.h
local64.h
logic_io.h
mcs_spinlock.h
memory_model.h
mm_hooks.h
mmiowb.h
mmiowb_types.h
mmu.h
mmu_context.h
module.h
module.lds.h
mshyperv.h hyperv-fixes for v6.9-rc4 2024-04-11 16:23:56 -07:00
msi.h
nommu_context.h
numa.h arm64: irq: set the correct node for VMAP stack 2023-12-05 14:26:50 +00:00
page.h
param.h
parport.h
pci.h
pci_iomap.h
percpu.h
pgalloc.h
pgtable-nop4d.h
pgtable-nopmd.h mm: recover pud_leaf() definitions in nopmd case 2024-03-13 12:12:21 -07:00
pgtable-nopud.h
pgtable_uffd.h
preempt.h
qrwlock.h
qrwlock_types.h
qspinlock.h
qspinlock_types.h
resource.h
rwonce.h
seccomp.h
sections.h
serial.h
set_memory.h
shmparam.h
signal.h
simd.h
softirq_stack.h
spinlock.h
spinlock_types.h
statfs.h
string.h
switch_to.h
syscall.h
syscalls.h
timex.h
tlb.h mm/mmu_gather: add __tlb_remove_folio_pages() 2024-02-22 15:27:17 -08:00
tlbflush.h
topology.h
trace_clock.h
uaccess.h
unaligned.h asm-generic: make sparse happy with odd-sized put_unaligned_*() 2024-01-08 09:48:35 -08:00
user.h
vermagic.h
vga.h
vmlinux.lds.h treewide: update LLVM Bugzilla links 2024-02-22 15:38:51 -08:00
vtime.h
word-at-a-time.h kernel.h: removed REPEAT_BYTE from kernel.h 2024-02-01 09:47:59 -08:00
xor.h