linux-stable/kernel/trace
Zheng Yejian 89c89da92a tracing: Fix race issue between cpu buffer write and swap
[ Upstream commit 3163f635b2 ]

Warning happened in rb_end_commit() at code:
	if (RB_WARN_ON(cpu_buffer, !local_read(&cpu_buffer->committing)))

  WARNING: CPU: 0 PID: 139 at kernel/trace/ring_buffer.c:3142
	rb_commit+0x402/0x4a0
  Call Trace:
   ring_buffer_unlock_commit+0x42/0x250
   trace_buffer_unlock_commit_regs+0x3b/0x250
   trace_event_buffer_commit+0xe5/0x440
   trace_event_buffer_reserve+0x11c/0x150
   trace_event_raw_event_sched_switch+0x23c/0x2c0
   __traceiter_sched_switch+0x59/0x80
   __schedule+0x72b/0x1580
   schedule+0x92/0x120
   worker_thread+0xa0/0x6f0

It is because the race between writing event into cpu buffer and swapping
cpu buffer through file per_cpu/cpu0/snapshot:

  Write on CPU 0             Swap buffer by per_cpu/cpu0/snapshot on CPU 1
  --------                   --------
                             tracing_snapshot_write()
                               [...]

  ring_buffer_lock_reserve()
    cpu_buffer = buffer->buffers[cpu]; // 1. Suppose find 'cpu_buffer_a';
    [...]
    rb_reserve_next_event()
      [...]

                               ring_buffer_swap_cpu()
                                 if (local_read(&cpu_buffer_a->committing))
                                     goto out_dec;
                                 if (local_read(&cpu_buffer_b->committing))
                                     goto out_dec;
                                 buffer_a->buffers[cpu] = cpu_buffer_b;
                                 buffer_b->buffers[cpu] = cpu_buffer_a;
                                 // 2. cpu_buffer has swapped here.

      rb_start_commit(cpu_buffer);
      if (unlikely(READ_ONCE(cpu_buffer->buffer)
          != buffer)) { // 3. This check passed due to 'cpu_buffer->buffer'
        [...]           //    has not changed here.
        return NULL;
      }
                                 cpu_buffer_b->buffer = buffer_a;
                                 cpu_buffer_a->buffer = buffer_b;
                                 [...]

      // 4. Reserve event from 'cpu_buffer_a'.

  ring_buffer_unlock_commit()
    [...]
    cpu_buffer = buffer->buffers[cpu]; // 5. Now find 'cpu_buffer_b' !!!
    rb_commit(cpu_buffer)
      rb_end_commit()  // 6. WARN for the wrong 'committing' state !!!

Based on above analysis, we can easily reproduce by following testcase:
  ``` bash
  #!/bin/bash

  dmesg -n 7
  sysctl -w kernel.panic_on_warn=1
  TR=/sys/kernel/tracing
  echo 7 > ${TR}/buffer_size_kb
  echo "sched:sched_switch" > ${TR}/set_event
  while [ true ]; do
          echo 1 > ${TR}/per_cpu/cpu0/snapshot
  done &
  while [ true ]; do
          echo 1 > ${TR}/per_cpu/cpu0/snapshot
  done &
  while [ true ]; do
          echo 1 > ${TR}/per_cpu/cpu0/snapshot
  done &
  ```

To fix it, IIUC, we can use smp_call_function_single() to do the swap on
the target cpu where the buffer is located, so that above race would be
avoided.

Link: https://lore.kernel.org/linux-trace-kernel/20230831132739.4070878-1-zhengyejian1@huawei.com

Cc: <mhiramat@kernel.org>
Fixes: f1affcaaa8 ("tracing: Add snapshot in the per_cpu trace directories")
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13 09:48:34 +02:00
..
rv tracing/tools: Updates for 6.4 2023-04-28 16:11:26 -07:00
Kconfig tracing: Unbreak user events 2023-03-29 06:52:09 -04:00
Makefile rv: Add Runtime Verification (RV) interface 2022-07-30 14:01:28 -04:00
blktrace.c block: remove more NULL checks after bdev_get_queue() 2023-02-21 09:23:22 -07:00
bpf_trace.c bpf: Clear the probe_addr for uprobe 2023-09-13 09:48:00 +02:00
bpf_trace.h
error_report-traces.c
fgraph.c arm64 fixes for 5.19-rc1: 2022-06-03 14:05:34 -07:00
fprobe.c fprobe: Ensure running fprobe_exit_handler() finished before calling rethook_free() 2023-07-23 13:54:12 +02:00
ftrace.c ftrace: Fix possible warning on checking all pages used in ftrace_process_locs() 2023-07-23 13:54:10 +02:00
ftrace_internal.h
kprobe_event_gen_test.c tracing: Fix wrong return in kprobe_event_gen_test.c 2023-03-19 12:20:48 -04:00
pid_list.c tracing: Cleanup double word in comment 2022-04-26 17:58:50 -04:00
pid_list.h
power-traces.c
preemptirq_delay_test.c
rethook.c fprobe: Ensure running fprobe_exit_handler() finished before calling rethook_free() 2023-07-23 13:54:12 +02:00
ring_buffer.c ring-buffer: Do not swap cpu_buffer during resize process 2023-08-23 17:32:35 +02:00
ring_buffer_benchmark.c ring_buffer: Remove unused "event" parameter 2022-11-23 19:08:30 -05:00
rpm-traces.c
synth_event_gen_test.c tracing: Always use canonical ftrace path 2023-02-18 14:34:09 -05:00
trace.c tracing: Fix race issue between cpu buffer write and swap 2023-09-13 09:48:34 +02:00
trace.h tracing: Introduce pipe_cpumask to avoid race on trace_pipes 2023-09-13 09:47:55 +02:00
trace_benchmark.c tracing: Add numeric delta time to the trace event benchmark 2022-09-26 13:01:09 -04:00
trace_benchmark.h tracing: Add numeric delta time to the trace event benchmark 2022-09-26 13:01:09 -04:00
trace_boot.c tracing: Initialize integer variable to prevent garbage return value 2022-05-26 21:13:00 -04:00
trace_branch.c
trace_clock.c
trace_dynevent.c tracing: Free buffers when a used dynamic event is removed 2022-11-23 19:07:12 -05:00
trace_dynevent.h
trace_entries.h
trace_eprobe.c kernel/trace: Fix cleanup logic of enable_trace_eprobe 2023-07-23 13:53:45 +02:00
trace_event_perf.c tracing/perf: Use strndup_user instead of kzalloc/strncpy_from_user 2022-11-23 19:08:31 -05:00
trace_events.c tracing: Fix warning in trace_buffered_event_disable() 2023-08-03 10:26:00 +02:00
trace_events_filter.c tracing: Add a way to filter function addresses to function names 2023-01-25 10:31:11 -05:00
trace_events_filter_test.h
trace_events_hist.c tracing/histograms: Return an error if we fail to add histogram to hist_vars list 2023-07-27 08:57:06 +02:00
trace_events_inject.c
trace_events_synth.c tracing/synthetic: Allocate one additional element for size 2023-08-30 14:52:29 +02:00
trace_events_trigger.c tracing: Do not synchronize freeing of trigger filter on boot up 2022-12-14 08:50:56 -05:00
trace_events_user.c tracing/user_events: Fix struct arg size match check 2023-07-23 13:54:14 +02:00
trace_export.c tracing: Fix TASK_COMM_LEN in trace event format file 2023-02-12 10:23:39 -05:00
trace_functions.c
trace_functions_graph.c
trace_hwlat.c tracing: Remove extra space at the end of hwlat_detector/mode 2023-09-13 09:48:34 +02:00
trace_irqsoff.c tracing: Fix memleak due to race between current_tracer and trace 2023-08-30 14:52:29 +02:00
trace_kdb.c
trace_kprobe.c kernel/trace: extract common part in process_fetch_insn 2023-02-24 10:04:13 +09:00
trace_kprobe_selftest.c
trace_kprobe_selftest.h
trace_mmiotrace.c
trace_nop.c
trace_osnoise.c tracing/timerlat: Always wakeup the timerlat thread 2023-05-23 11:54:31 -04:00
trace_output.c tracing: Modify print_fields() for fields output order 2023-06-14 12:41:52 -04:00
trace_output.h tracing: Add "fields" option to show raw trace event fields 2023-03-29 06:52:08 -04:00
trace_preemptirq.c cpuidle: tracing, preempt: Squash _rcuidle tracing 2023-01-31 15:01:46 +01:00
trace_printk.c
trace_probe.c Revert "tracing: Add "(fault)" name injection to kernel probes" 2023-07-23 13:54:14 +02:00
trace_probe.h tracing/probe: trace_probe_primary_from_call(): checked list_first_entry 2023-05-31 18:47:10 +09:00
trace_probe_kernel.h tracing/probes: Fix to record 0-length data_loc in fetch_store_string*() if fails 2023-07-23 13:54:14 +02:00
trace_probe_tmpl.h tracing/probes: Fix to record 0-length data_loc in fetch_store_string*() if fails 2023-07-23 13:54:14 +02:00
trace_recursion_record.c tracing: Use trace_create_file() to simplify creation of tracefs entries 2022-05-26 21:12:52 -04:00
trace_sched_switch.c sched/tracing: Append prev_state to tp args instead 2022-05-12 00:37:11 +02:00
trace_sched_wakeup.c tracing: Fix memleak due to race between current_tracer and trace 2023-08-30 14:52:29 +02:00
trace_selftest.c tracing: Have function_graph selftest call cond_resched() 2023-05-28 21:15:46 -04:00
trace_selftest_dynamic.c
trace_seq.c tracing: Acquire buffer from temparary trace sequence 2023-02-07 12:42:54 -05:00
trace_stack.c
trace_stat.c
trace_stat.h
trace_synth.h tracing: Allow synthetic events to pass around stacktraces 2023-01-25 10:31:24 -05:00
trace_syscalls.c tracing: Remove unused __bad_type_size() method 2022-11-17 20:21:06 -05:00
trace_uprobe.c bpf: Clear the probe_addr for uprobe 2023-09-13 09:48:00 +02:00
tracing_map.c tracing: Remove unused variable 'dups' 2022-10-03 12:20:31 -04:00
tracing_map.h