linux-stable/kernel
Jiri Olsa 063c9ce8e7 bpf: Disable preemption in bpf_event_output
commit d62cc390c2 upstream.

We received report [1] of kernel crash, which is caused by
using nesting protection without disabled preemption.

The bpf_event_output can be called by programs executed by
bpf_prog_run_array_cg function that disabled migration but
keeps preemption enabled.

This can cause task to be preempted by another one inside the
nesting protection and lead eventually to two tasks using same
perf_sample_data buffer and cause crashes like:

  BUG: kernel NULL pointer dereference, address: 0000000000000001
  #PF: supervisor instruction fetch in kernel mode
  #PF: error_code(0x0010) - not-present page
  ...
  ? perf_output_sample+0x12a/0x9a0
  ? finish_task_switch.isra.0+0x81/0x280
  ? perf_event_output+0x66/0xa0
  ? bpf_event_output+0x13a/0x190
  ? bpf_event_output_data+0x22/0x40
  ? bpf_prog_dfc84bbde731b257_cil_sock4_connect+0x40a/0xacb
  ? xa_load+0x87/0xe0
  ? __cgroup_bpf_run_filter_sock_addr+0xc1/0x1a0
  ? release_sock+0x3e/0x90
  ? sk_setsockopt+0x1a1/0x12f0
  ? udp_pre_connect+0x36/0x50
  ? inet_dgram_connect+0x93/0xa0
  ? __sys_connect+0xb4/0xe0
  ? udp_setsockopt+0x27/0x40
  ? __pfx_udp_push_pending_frames+0x10/0x10
  ? __sys_setsockopt+0xdf/0x1a0
  ? __x64_sys_connect+0xf/0x20
  ? do_syscall_64+0x3a/0x90
  ? entry_SYSCALL_64_after_hwframe+0x72/0xdc

Fixing this by disabling preemption in bpf_event_output.

[1] https://github.com/cilium/cilium/issues/26756
Cc: stable@vger.kernel.org
Reported-by: Oleg "livelace" Popov <o.popov@livelace.ru>
Closes: https://github.com/cilium/cilium/issues/26756
Fixes: 2a916f2f54 ("bpf: Use migrate_disable/enable in array macros and cgroup/lirc code.")
Acked-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230725084206.580930-3-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-11 12:14:22 +02:00
..
bpf bpf, cpumap: Handle skb as well when clean up ptr_ring 2023-08-11 12:14:15 +02:00
cgroup sched/psi: use kernfs polling functions for PSI trigger polling 2023-07-27 08:56:53 +02:00
configs Char/Misc drivers for 6.4-rc1 2023-04-27 12:07:50 -07:00
debug kdb: use srcu console list iterator 2022-12-02 11:25:00 +01:00
dma swiotlb: reduce the number of areas to match actual memory pool size 2023-07-23 13:53:36 +02:00
entry ptrace: Provide set/get interface for syscall user dispatch 2023-04-16 14:23:07 +02:00
events perf/core: Fix perf_sample_data not properly initialized for different swevents in perf_tp_event() 2023-05-08 10:58:26 +02:00
futex - Prevent the leaking of a debug timer in futex_waitv() 2023-01-01 11:15:05 -08:00
gcov gcov: add support for checksum field 2022-12-21 14:31:52 -08:00
irq xen: branch for v6.4-rc4 2023-05-27 09:42:56 -07:00
kcsan kcsan: Don't expect 64 bits atomic builtins from 32 bits architectures 2023-07-19 16:36:13 +02:00
livepatch Scheduler changes for v6.4: 2023-04-28 14:53:30 -07:00
locking locking/rtmutex: Fix task->pi_waiters integrity 2023-08-03 10:26:09 +02:00
module module/decompress: Fix error checking on zstd decompression 2023-06-01 14:36:46 -07:00
power PM: QoS: Restore support for default value on frequency QoS 2023-07-23 13:54:11 +02:00
printk - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
rcu rcu: Mark additional concurrent load from ->cpu_no_qs.b.exp 2023-07-27 08:56:48 +02:00
sched sched/psi: use kernfs polling functions for PSI trigger polling 2023-07-27 08:56:53 +02:00
time posix-timers: Ensure timer ID search-loop limit is valid 2023-07-27 08:56:45 +02:00
trace bpf: Disable preemption in bpf_event_output 2023-08-11 12:14:22 +02:00
.gitignore
acct.c
async.c
audit.c
audit.h
audit_fsnotify.c
audit_tree.c
audit_watch.c
auditfilter.c
auditsc.c capability: just use a 'u64' instead of a 'u32[2]' array 2023-03-01 10:01:22 -08:00
backtracetest.c
bounds.c
capability.c capability: just use a 'u64' instead of a 'u32[2]' array 2023-03-01 10:01:22 -08:00
cfi.c
compat.c sched_getaffinity: don't assume 'cpumask_size()' is fully initialized 2023-03-14 19:32:38 -07:00
configs.c
context_tracking.c context_tracking: Fix noinstr vs KASAN 2023-01-13 11:48:18 +01:00
cpu.c - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
cpu_pm.c cpuidle, cpu_pm: Remove RCU fiddling from cpu_pm_{enter,exit}() 2023-01-13 11:48:15 +01:00
crash_core.c mm, treewide: redefine MAX_ORDER sanely 2023-04-05 19:42:46 -07:00
crash_dump.c
cred.c
delayacct.c delayacct: track delays from IRQ/SOFTIRQ 2023-04-18 16:39:34 -07:00
dma.c
exec_domain.c
exit.c fork, vhost: Use CLONE_THREAD to fix freezer/ps regression 2023-06-01 17:15:33 -04:00
extable.c
fail_function.c kernel/fail_function: fix memory leak with using debugfs_lookup() 2023-02-08 13:36:22 +01:00
fork.c fork: lock VMAs of the parent process when forking 2023-07-11 06:31:05 +02:00
freezer.c
gen_kheaders.sh kheaders: use standard naming for the temporary directory 2023-01-22 23:43:34 +09:00
groups.c
hung_task.c kernel/hung_task.c: set some hung_task.c variables storage-class-specifier to static 2023-04-08 13:45:37 -07:00
iomem.c
irq_work.c trace: Add trace_ipi_send_cpu() 2023-03-24 11:01:29 +01:00
jump_label.c jump_label: Prevent key->enabled int overflow 2022-12-01 15:53:05 -08:00
kallsyms.c kallsyms: strip LTO-only suffixes from promoted global functions 2023-07-27 08:56:54 +02:00
kallsyms_internal.h
kallsyms_selftest.c kallsyms: Delete an unused parameter related to {module_}kallsyms_on_each_symbol() 2023-03-19 13:27:19 -07:00
kallsyms_selftest.h
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kcov.c mm: replace vma->vm_flags direct modifications with modifier calls 2023-02-09 16:51:39 -08:00
kexec.c kexec: introduce sysctl parameters kexec_load_limit_* 2023-02-02 22:50:05 -08:00
kexec_core.c kexec: fix a memory leak in crash_shrink_memory() 2023-07-19 16:35:28 +02:00
kexec_elf.c
kexec_file.c kexec: support purgatories with .text.hot sections 2023-06-12 11:31:50 -07:00
kexec_internal.h
kheaders.c kheaders: Use array declaration instead of char 2023-03-24 20:10:59 -07:00
kprobes.c x86/kprobes: Fix arch_check_optimized_kprobe check within optimized_kprobe range 2023-02-21 08:49:16 +09:00
ksysfs.c kernel/ksysfs.c: use sysfs_emit for sysfs show handlers 2023-03-24 17:09:14 +01:00
kthread.c - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
latencytop.c
Makefile modules-6.4-rc1 2023-04-27 16:36:55 -07:00
module_signature.c
notifier.c notifiers: add tracepoints to the notifiers infrastructure 2023-04-08 13:45:38 -07:00
nsproxy.c convert setns(2) to fdget()/fdput() 2023-04-20 22:55:35 -04:00
padata.c padata: use alignment when calculating the number of worker threads 2023-03-14 17:06:44 +08:00
panic.c cpu: Mark nmi_panic_self_stop() __noreturn 2023-04-14 17:31:26 +02:00
params.c module: make module_ktype structure constant 2023-03-09 12:55:15 -08:00
pid.c pid: add pidfd_prepare() 2023-04-03 11:16:56 +02:00
pid_namespace.c kernel: pid_namespace: simplify sysctls with register_sysctl() 2023-05-02 19:23:29 -07:00
pid_sysctl.h kernel: pid_namespace: simplify sysctls with register_sysctl() 2023-05-02 19:23:29 -07:00
profile.c
ptrace.c ptrace: Provide set/get interface for syscall user dispatch 2023-04-16 14:23:07 +02:00
range.c
reboot.c
regset.c
relay.c relayfs: fix out-of-bounds access in relay_file_read 2023-05-02 17:23:27 -07:00
resource.c dax/kmem: Fix leak of memory-hotplug resources 2023-02-17 14:58:01 -08:00
resource_kunit.c
rseq.c rseq: Extend struct rseq with per-memory-map concurrency ID 2022-12-27 12:52:12 +01:00
scftorture.c
scs.c
seccomp.c seccomp: simplify sysctls with register_sysctl_init() 2023-04-13 11:49:20 -07:00
signal.c mm: suppress mm fault logging if fatal signal already pending 2023-08-03 10:25:53 +02:00
smp.c trace,smp: Trace all smp_function_call*() invocations 2023-03-24 11:01:30 +01:00
smpboot.c
smpboot.h
softirq.c softirq: Add trace points for tasklet entry/exit 2023-04-15 10:17:16 +02:00
stackleak.c stackleak: allow to specify arch specific stackleak poison function 2023-04-20 11:36:35 +02:00
stacktrace.c
static_call.c
static_call_inline.c
stop_machine.c
sys.c prctl: move PR_GET_AUXV out of PR_MCE_KILL 2023-07-27 08:56:33 +02:00
sys_ni.c
sysctl-test.c
sysctl.c mm: compaction: move compaction sysctl to its own file 2023-04-13 11:49:35 -07:00
task_work.c
taskstats.c
torture.c torture: Fix hang during kthread shutdown phase 2023-01-05 12:10:35 -08:00
tracepoint.c tracepoint: Allow livepatch module add trace event 2023-02-18 14:34:36 -05:00
tsacct.c
ucount.c
uid16.c
uid16.h
umh.c umh: simplify the capability pointer logic 2023-03-03 16:18:19 -08:00
up.c
user-return-notifier.c
user.c
user_namespace.c userns: fix a struct's kernel-doc notation 2023-02-02 22:50:04 -08:00
usermode_driver.c
utsname.c
utsname_sysctl.c utsname: simplify one-level sysctl registration for uts_kern_table 2023-04-13 11:49:35 -07:00
vhost_task.c vhost: Fix worker hangs due to missed wake up calls 2023-06-08 15:43:09 -04:00
watch_queue.c watch_queue: prevent dangling pipe pointer 2023-07-19 16:36:53 +02:00
watchdog.c watchdog/hardlockup: keep kernel.nmi_watchdog sysctl as 0444 if probe fails 2023-07-19 16:35:33 +02:00
watchdog_hld.c watchdog/hardlockup: rename some "NMI watchdog" constants/function 2023-07-19 16:35:32 +02:00
workqueue.c workqueue: clean up WORK_* constant types, clarify masking 2023-06-23 12:08:14 -07:00
workqueue_internal.h