linux-stable/tools/perf
Jiri Olsa b57334b945 perf machine: Use last_match threads cache only in single thread mode
There's an issue with using threads::last_match in multithread mode,
which is enabled during perf top's synthesize phase. It might crash with
the following assertion:

  perf: ...include/linux/refcount.h:109: refcount_inc:
        Assertion `!(!refcount_inc_not_zero(r))' failed.

The gdb backtrace looks like this:

  0x00007ffff50839fb in raise () from /lib64/libc.so.6
  (gdb)
  #0  0x00007ffff50839fb in raise () from /lib64/libc.so.6
  #1  0x00007ffff5085800 in abort () from /lib64/libc.so.6
  #2  0x00007ffff507c0da in __assert_fail_base () from /lib64/libc.so.6
  #3  0x00007ffff507c152 in __assert_fail () from /lib64/libc.so.6
  #4  0x0000000000535ff9 in refcount_inc (r=0x7fffe8009a70)
      at ...include/linux/refcount.h:109
  #5  0x0000000000536771 in thread__get (thread=0x7fffe8009a40)
      at util/thread.c:115
  #6  0x0000000000523cd0 in ____machine__findnew_thread (machine=0xbfde38,
      threads=0xbfdf28, pid=2, tid=2, create=true) at util/machine.c:432
  #7  0x0000000000523eb4 in __machine__findnew_thread (machine=0xbfde38,
      pid=2, tid=2) at util/machine.c:489
  #8  0x0000000000523f24 in machine__findnew_thread (machine=0xbfde38,
      pid=2, tid=2) at util/machine.c:499
  #9  0x0000000000526fbe in machine__process_fork_event (machine=0xbfde38,
  ...

The failing assertion is this one:

  REFCOUNT_WARN(!refcount_inc_not_zero(r), ...

The problem is that we don't serialize access to threads::last_match.
We serialize access to the threads tree, but we don't control how
threads::last_match is accessed. Both locked/unlocked paths read
that field and can set it. In multithreaded mode we can end up with
an invalid object in the thread__get call, as in the following race:

  thread 1
    ...
    machine__findnew_thread
      down_write(&threads->lock);
      __machine__findnew_thread
        ____machine__findnew_thread
          th = threads->last_match;
          if (th->tid == tid) {
            thread__get

  thread 2
    ...
    machine__find_thread
      down_read(&threads->lock);
      __machine__findnew_thread
        ____machine__findnew_thread
          th = threads->last_match;
          if (th->tid == tid) {
            thread__get

  thread 3
    ...
    machine__process_fork_event
      machine__remove_thread
        __machine__remove_thread
          threads->last_match = NULL
          thread__put
      thread__put

Threads 1 and 2 might get a stale last_match before thread 3 clears
it. Threads 1 and 2 then race with thread 3's thread__put and
might trigger the refcnt == 0 assertion above.
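
The interleaving above can be replayed on a single OS thread to make the outcome deterministic. The following is a minimal sketch, not perf's code: `struct toy_thread` and every `toy_*` helper are hypothetical stand-ins, and the object lives on the stack, so the stale pointer stays readable here, whereas in perf it could already have been freed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for perf's struct thread; only the fields needed here. */
struct toy_thread {
	int tid;
	atomic_int refcnt;
};

/* Take a reference only if the object is still alive (count > 0). */
static bool toy_refcount_inc_not_zero(atomic_int *r)
{
	int old = atomic_load(r);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(r, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the last reference went away. */
static bool toy_refcount_dec_and_test(atomic_int *r)
{
	return atomic_fetch_sub(r, 1) == 1;
}

/*
 * Replays the race in a fixed order: a reader copies last_match, the
 * remover clears it and drops both references, then the reader tries to
 * take a reference through its stale copy.
 * Returns true if the reader's "get" would have succeeded.
 */
static bool toy_replay_stale_get(void)
{
	struct toy_thread th = { .tid = 2 };
	struct toy_thread *last_match;
	struct toy_thread *stale;

	/* Assumed starting state: one reference for the tree, one for the caller. */
	atomic_init(&th.refcnt, 2);
	last_match = &th;

	stale = last_match;			/* thread 1/2: th = threads->last_match */
	last_match = NULL;			/* thread 3: threads->last_match = NULL */
	toy_refcount_dec_and_test(&th.refcnt);	/* thread 3: first thread__put */
	toy_refcount_dec_and_test(&th.refcnt);	/* thread 3: second thread__put */

	/* thread 1/2: the tid still matches, but the count is already zero. */
	return stale->tid == 2 && toy_refcount_inc_not_zero(&stale->refcnt);
}
```

In this model `toy_replay_stale_get()` reports a failed get, which corresponds to the condition that the REFCOUNT_WARN in refcount_inc turns into an assertion failure.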

The patch disables the last_match cache in multithread mode.
The cache was originally meant for single-thread scenarios, where
it's common to have multiple sequential searches for the same
thread.

In multithread mode this does not make sense, because top's threads
process different /proc entries, so the 'struct threads' object
is queried for various threads. Moreover, we'd need to add more locks
to make it work.
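
One way to picture gating the cache on the threading mode is sketched below. This is an illustration under stated assumptions, not the patch itself: the `demo_*` types, the `demo_singlethreaded` flag and both helpers are hypothetical names, and the plain `int` refcount stands in for perf's refcount_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified types; not perf's real definitions. */
struct demo_thread {
	int tid;
	int refcnt;
};

struct demo_threads {
	struct demo_thread *last_match;	/* one-entry lookup cache */
};

/* Assumed global switch: true unless the tool starts worker threads. */
static bool demo_singlethreaded = true;

static struct demo_thread *demo_thread_get(struct demo_thread *th)
{
	th->refcnt++;
	return th;
}

/* Consult the cache only in single-thread mode; otherwise always miss. */
static struct demo_thread *demo_get_last_match(struct demo_threads *threads,
					       int tid)
{
	struct demo_thread *th;

	if (!demo_singlethreaded)
		return NULL;

	th = threads->last_match;
	if (th != NULL && th->tid == tid)
		return demo_thread_get(th);

	return NULL;
}

/* Update the cache only in single-thread mode, so it stays NULL otherwise. */
static void demo_set_last_match(struct demo_threads *threads,
				struct demo_thread *th)
{
	if (demo_singlethreaded)
		threads->last_match = th;
}
```

With the flag cleared, every lookup falls through to the tree search, which is already serialized by threads->lock, so no extra locking around last_match is needed.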

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Lukasz Odzioba <lukasz.odzioba@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20180719143345.12963-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-24 14:53:52 -03:00
arch perf arm64: Generate system call table from asm/unistd.h 2018-07-24 14:52:48 -03:00
bench perf bench: Fix numa report output code 2018-06-25 11:59:37 -03:00
Documentation perf list: Add missing documentation for --desc and --debug options 2018-07-24 14:49:57 -03:00
examples/bpf perf bpf: Add probe() helper to reduce kprobes boilerplate 2018-05-15 14:31:24 -03:00
include/bpf perf bpf: Add probe() helper to reduce kprobes boilerplate 2018-05-15 14:31:24 -03:00
jvmti perf tools: Fix compilation errors on gcc8 2018-07-11 09:39:57 -04:00
pmu-events perf json: Add s390 transaction counter definition 2018-07-24 14:49:30 -03:00
python perf python: Make twatch.py work with both python2 and python3 2018-02-19 12:28:08 -03:00
scripts perf scripts python: Add Python 3 support to EventClass.py 2018-07-11 10:01:50 -03:00
tests perf tests: Fix record+probe_libc_inet_pton.sh when event exists 2018-07-24 14:52:19 -03:00
trace perf trace beauty prctl: Default header_dir to cwd to work without parms 2018-06-01 16:13:06 -03:00
ui perf hists: Clarify callchain disabling when available 2018-07-24 14:37:33 -03:00
util perf machine: Use last_match threads cache only in single thread mode 2018-07-24 14:53:52 -03:00
.gitignore perf tools: Add trace/beauty/generated/ into .gitignore 2018-02-05 13:58:02 -03:00
Build perf trace: Remove audit-libs dependency if syscall tables are present 2018-01-23 09:51:38 -03:00
builtin-annotate.c perf tools: Fix crash caused by accessing feat_ops[HEADER_LAST_FEATURE] 2018-06-25 11:59:37 -03:00
builtin-bench.c
builtin-buildid-cache.c perf buildid-cache: Warn --purge-all failures 2018-05-15 10:32:16 -03:00
builtin-buildid-list.c
builtin-c2c.c perf hists: Clarify callchain disabling when available 2018-07-24 14:37:33 -03:00
builtin-config.c
builtin-data.c
builtin-diff.c perf hists: Clarify callchain disabling when available 2018-07-24 14:37:33 -03:00
builtin-evlist.c
builtin-ftrace.c perf ftrace: Append an EOL when write tracing files 2018-02-19 09:49:12 -03:00
builtin-help.c perf tools: Rename HAVE_SYSCALL_TABLE to HAVE_SYSCALL_TABLE_SUPPORT 2018-04-12 10:33:31 -03:00
builtin-inject.c perf thread: Make thread__find_map() return the map 2018-04-26 13:47:08 -03:00
builtin-kallsyms.c perf machine: Ditch find_kernel_function variants 2018-04-30 12:20:54 -03:00
builtin-kmem.c perf machine: Ditch find_kernel_function variants 2018-04-30 12:20:54 -03:00
builtin-kvm.c perf tools: Ditch the symbol_conf.nr_events global 2018-06-04 10:28:52 -03:00
builtin-list.c
builtin-lock.c
builtin-mem.c perf mem: Allow all record/report options 2018-04-18 15:35:48 -03:00
builtin-probe.c perf tools: No need to check if the argument to __get() function is NULL 2018-06-04 10:28:50 -03:00
builtin-record.c perf record: Synthesize features before events in pipe mode 2018-03-16 13:56:50 -03:00
builtin-report.c perf hists: Clarify callchain disabling when available 2018-07-24 14:37:33 -03:00
builtin-sched.c perf sched: Use sched->show_callchain where appropriate 2018-06-05 10:09:54 -03:00
builtin-script.c perf tools: Fix crash caused by accessing feat_ops[HEADER_LAST_FEATURE] 2018-06-25 11:59:37 -03:00
builtin-stat.c perf stat: Add transaction flag (-T) support for s390 2018-07-24 14:49:37 -03:00
builtin-timechart.c perf thread: Make thread__find_symbol() return the symbol searched 2018-04-26 13:47:09 -03:00
builtin-top.c perf hists: Clarify callchain disabling when available 2018-07-24 14:37:33 -03:00
builtin-trace.c perf evsel: Add has_callchain() helper to make code more compact/clear 2018-06-05 10:09:54 -03:00
builtin-version.c perf version: Print status for syscall_table 2018-04-12 10:33:34 -03:00
builtin.h
check-headers.sh tools include: Grab copies of arm64 dependent unistd.h files 2018-07-24 14:52:39 -03:00
command-list.txt
CREDITS
design.txt
Makefile
Makefile.config perf trace arm64: Use generated syscall table 2018-07-24 14:53:01 -03:00
Makefile.perf perf bpf: Fixup include and examples install messages 2018-05-19 06:42:50 -03:00
MANIFEST
perf-archive.sh
perf-completion.sh perf tools: Auto-complete for events with ':' 2017-12-27 12:16:00 -03:00
perf-read-vdso.c
perf-sys.h Drop a bunch of metag references 2018-02-23 14:29:59 +00:00
perf-with-kcore.sh
perf.c perf tools: Remove dead quote.[ch] code 2018-06-04 10:28:50 -03:00
perf.h perf tools: Add 'perf -vv' as an alias to 'perf version --build-options' 2018-04-02 13:50:35 -03:00