Petr Mladek 8b46ff6938 ring_buffer: Do not complete benchmark reader too early
It seems that complete(&read_done) might be called too early
in some situations.

1st scenario:
-------------

CPU0					CPU1

ring_buffer_producer_thread()
  wake_up_process(consumer);
  wait_for_completion(&read_start);

					ring_buffer_consumer_thread()
					  complete(&read_start);

  ring_buffer_producer()
    # producing data in
    # the do-while cycle

					  ring_buffer_consumer();
					    # reading data
					    # got error
					    # set kill_test = 1;
					    set_current_state(
						TASK_INTERRUPTIBLE);
					    if (reader_finish)  # false
					    schedule();

    # producer still in the middle of
    # do-while cycle
    if (consumer && !(cnt % wakeup_interval))
      wake_up_process(consumer);

					    # spurious wakeup
					    while (!reader_finish &&
						   !kill_test)
					    # leaving because
					    # kill_test == 1
					    reader_finish = 0;
					    complete(&read_done);

1st BANG: We might access an uninitialized "read_done" if this is
          the first round.

    # producer finally leaving
    # the do-while cycle because kill_test == 1;

    if (consumer) {
      reader_finish = 1;
      wake_up_process(consumer);
      wait_for_completion(&read_done);

2nd BANG: This will never complete because the consumer already did
          the completion.

2nd scenario:
-------------

CPU0					CPU1

ring_buffer_producer_thread()
  wake_up_process(consumer);
  wait_for_completion(&read_start);

					ring_buffer_consumer_thread()
					  complete(&read_start);

  ring_buffer_producer()
    # CPU3 removes the module	  <--- difference from
    # and stops producer          <--- the 1st scenario
    if (kthread_should_stop())
      kill_test = 1;

					  ring_buffer_consumer();
					    while (!reader_finish &&
						   !kill_test)
					    # kill_test == 1 => we never go
					    # into the top level while()
					    reader_finish = 0;
					    complete(&read_done);

    # producer still in the middle of
    # do-while cycle
    if (consumer && !(cnt % wakeup_interval))
      wake_up_process(consumer);

					    # spurious wakeup
					    while (!reader_finish &&
						   !kill_test)
					    # leaving because kill_test == 1
					    reader_finish = 0;
					    complete(&read_done);

BANG: We are in the same "bang" situations as in the 1st scenario.

Root of the problem:
--------------------

ring_buffer_consumer() must complete "read_done" only when the
"reader_finish" variable is set. The completion must not be skipped
due to other conditions.

Note that we still must keep the check for "reader_finish" in a loop
because there might be spurious wakeups, as described in the
scenarios above.
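
For reference, here is a minimal sketch of the problematic loop shape,
reconstructed from the traces above (not the verbatim pre-patch source;
the actual reading of events is elided):

    static void ring_buffer_consumer(void)
    {
            /* BUG: kill_test lets us leave the cycle even though the
             * producer has not set reader_finish yet.
             */
            while (!reader_finish && !kill_test) {
                    /* ... read data; set kill_test = 1 on error ... */

                    set_current_state(TASK_INTERRUPTIBLE);
                    if (reader_finish)
                            break;
                    schedule();
            }
            reader_finish = 0;
            /* BUG: runs even when the producer does not wait for it,
             * which is exactly what triggers both BANGs above.
             */
            complete(&read_done);
    }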

Solution:
---------

The top-level cycle in ring_buffer_consumer() will finish only when
"reader_finish" is set. The data will be read in a "while-do" cycle
so that it is not read after an error (kill_test == 1)
or a spurious wakeup.

In addition, "reader_finish" is manipulated by the producer thread.
Therefore we add READ_ONCE() to make sure that a fresh value is
read in each iteration. Also we add the corresponding memory barrier
to synchronize the sleep check.

Next we set the state back to TASK_RUNNING for the case where we
did not sleep.
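
Putting these pieces together, a minimal sketch of the resulting
consumer loop (simplified; details of consuming events from
ring_buffer_benchmark.c are elided):

    static void ring_buffer_consumer(void)
    {
            /* The only way out of the top-level cycle is an explicit
             * request from the producer. READ_ONCE() makes sure that
             * a fresh value is read in each iteration.
             */
            while (!READ_ONCE(reader_finish)) {
                    int found = 1;

                    /* "while-do" shape: check first, so that no data
                     * is read after an error (kill_test == 1) or after
                     * a spurious wakeup.
                     */
                    while (found && !kill_test) {
                            found = 0;
                            /* ... consume events; set found = 1 on
                             * success, kill_test = 1 on error ...
                             */
                    }

                    /* Sleep check; set_current_state() implies the
                     * memory barrier that orders us against the
                     * producer setting reader_finish and waking us.
                     */
                    set_current_state(TASK_INTERRUPTIBLE);
                    if (reader_finish)
                            break;
                    schedule();
            }
            /* We might get here without having slept. */
            __set_current_state(TASK_RUNNING);

            reader_finish = 0;
            /* Reached only on an explicit request, so it is now safe
             * to complete.
             */
            complete(&read_done);
    }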

Just out of paranoia, we initialize both completions statically.
This is safer in case there are other races that we are unaware of.
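
Static initialization here means declaring the completions with the
DECLARE_COMPLETION() macro instead of calling init_completion() at the
start of each benchmark round, so that "read_done" is valid even before
the first round, e.g.:

    static DECLARE_COMPLETION(read_start);
    static DECLARE_COMPLETION(read_done);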

As a side effect, we can remove the memory barrier from
ring_buffer_producer_thread(). IMHO, this race was the reason for
the barrier in the first place. ring_buffer_reset() uses spin locks
that should provide the needed memory barrier for using the buffer.

Link: http://lkml.kernel.org/r/1441629518-32712-2-git-send-email-pmladek@suse.com

Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-11-03 16:03:24 -05:00
blktrace.c tracing: Move trace_flags from global to a trace_array field 2015-09-30 15:22:55 -04:00
bpf_trace.c bpf: add support for %s specifier to bpf_trace_printk() 2015-08-28 16:27:27 -07:00
ftrace.c ftrace: add module globbing 2015-10-20 20:02:03 -04:00
Kconfig tracing: gpio: Add Kconfig option for enabling/disabling trace events 2015-10-20 21:56:10 -04:00
Makefile bpf: Fix the build on BPF_SYSCALL=y && !CONFIG_TRACING kernels, make it more configurable 2015-04-02 16:28:06 +02:00
power-traces.c PM / sleep: export suspend_resume trace event 2015-01-30 02:10:41 +01:00
ring_buffer.c ring-buffer: rb_event_is_commit() can return boolean 2015-11-02 14:25:29 -05:00
ring_buffer_benchmark.c ring_buffer: Do not complete benchmark reader too early 2015-11-03 16:03:24 -05:00
rpm-traces.c PM / Runtime: Introduce trace points for tracing rpm_* functions 2011-09-27 22:53:27 +02:00
trace.c tracing: Update instance_rmdir() to use tracefs_remove_recursive 2015-11-02 13:59:06 -05:00
trace.h tracing: Remove redundant TP_ARGS redefining 2015-11-03 15:07:07 -05:00
trace_benchmark.c tracing: Only benchmark the time tracepoints take if tracing is on 2015-11-02 13:34:58 -05:00
trace_benchmark.h tracing: Add tracepoint benchmark tracepoint 2014-05-29 22:49:54 -04:00
trace_branch.c tracing: Remove {start,stop}_branch_trace 2015-10-21 10:10:09 -04:00
trace_clock.c tracing: Export tracing clock functions 2015-05-12 15:56:57 -04:00
trace_entries.h tracing: %pF is only for function pointers 2015-03-25 08:57:22 -04:00
trace_event_perf.c tracing: Rename ftrace_event_{call,class} to trace_event_{call,class} 2015-05-13 14:06:10 -04:00
trace_events.c tracing: Call on_each_cpu() when adding or removing single pids from set_event_pid 2015-11-02 13:08:26 -05:00
trace_events_filter.c tracing: is_legal_op() can return boolean 2015-11-02 14:26:51 -05:00
trace_events_filter_test.h tracing/filter: Add startup tests for events filter 2011-08-19 14:35:59 -04:00
trace_events_trigger.c tracing: Rename ftrace_raw_##call event structures to trace_event_raw_##call 2015-05-13 21:48:40 -04:00
trace_export.c tracing: ftrace_event_is_function() can return boolean 2015-11-02 14:28:05 -05:00
trace_functions.c tracing/trivial: Fix typos and make an int into a bool 2014-11-20 10:05:36 -05:00
trace_functions_graph.c tracing: Move trace_flags from global to a trace_array field 2015-09-30 15:22:55 -04:00
trace_irqsoff.c tracing: report_latency() in trace_irqsoff.c can return boolean 2015-11-02 14:20:19 -05:00
trace_kdb.c tracing: Move trace_flags from global to a trace_array field 2015-09-30 15:22:55 -04:00
trace_kprobe.c lib: introduce strncpy_from_unsafe() 2015-08-28 16:27:27 -07:00
trace_mmiotrace.c tracing: Pass trace_array into trace_buffer_unlock_commit() 2015-09-25 17:38:44 -04:00
trace_nop.c tracing: Remove unneeded includes of debugfs.h and fs.h 2015-01-22 11:19:48 -05:00
trace_output.c tracing: Move trace_flags from global to a trace_array field 2015-09-30 15:22:55 -04:00
trace_output.h tracing: Turn seq_print_user_ip() into a static function 2015-09-28 10:16:12 -04:00
trace_printk.c tracing: Remove access to trace_flags in trace_printk.c 2015-09-30 04:35:18 -04:00
trace_probe.c trace: Don't use __weak in header files 2015-03-25 08:57:23 -04:00
trace_probe.h kernel/trace_probe: is_good_name can be boolean 2015-09-22 13:11:30 -04:00
trace_sched_switch.c sched: Introduce the 'trace_sched_waking' tracepoint 2015-08-03 12:21:22 +02:00
trace_sched_wakeup.c tracing: report_latency() in trace_sched_wakeup.c can return boolean 2015-11-02 14:20:06 -05:00
trace_selftest.c Seems that Peter Zijlstra added a new check that is making old 2014-10-12 07:28:55 -04:00
trace_selftest_dynamic.c
trace_seq.c tracing: use %*pb[l] to print bitmaps including cpumasks and nodemasks 2015-02-13 21:21:37 -08:00
trace_stack.c tracing: Rename max_stack_lock to stack_trace_max_lock 2015-11-03 14:50:15 -05:00
trace_stat.c tracing: Convert the tracing facility over to use tracefs 2015-02-03 12:48:41 -05:00
trace_stat.h
trace_syscalls.c tracing: Move trace_flags from global to a trace_array field 2015-09-30 15:22:55 -04:00
trace_uprobe.c tracing/uprobes: Do not print '0x (null)' when offset is 0 2015-08-26 10:43:01 -03:00