linux-stable/tools/perf
Namhyung Kim 3477f079fe perf lock contention: Add -o/--lock-owner option
When there're many lock contentions in the system, people sometimes want
to know who caused the contention, IOW who's the owner of the locks.

The -o/--lock-owner option tries to follow the lock owners for the
contended mutexes and rwsems from BPF, and then attributes the
contention time to the owner instead of the waiter.  It's a best effort
approach to get the owner info at the time of the contention and doesn't
guarantee to have the precise tracking of owners if it's changing over
time.

Currently it only handles mutex and rwsem that have owner field in their
struct and it basically points to a task_struct that owns the lock at
the moment.

Technically its type is atomic_long_t and it comes with some LSB bits
used for other meanings.  So it needs to clear them when casting it to a
pointer to task_struct.

Also the atomic_long_t is a typedef of the atomic 32 or 64 bit types
depending on arch which is a wrapper struct for the counter value.  I'm
not aware of proper ways to access those kernel atomic types from BPF so
I just read the internal counter value directly.  Please let me know if
there's a better way.

When -o/--lock-owner option is used, it goes to the task aggregation
mode like -t/--threads option does.  However it cannot get the owner for
other lock types like spinlock and sometimes even for mutex.

  $ sudo ./perf lock con -abo -- ./perf bench sched pipe
  # Running 'sched/pipe' benchmark:
  # Executed 1000000 pipe operations between two processes

       Total time: 4.766 [sec]

         4.766540 usecs/op
           209795 ops/sec
   contended   total wait     max wait     avg wait          pid   owner

         403    565.32 us     26.81 us      1.40 us           -1   Unknown
           4     27.99 us      8.57 us      7.00 us      1583145   sched-pipe
           1      8.25 us      8.25 us      8.25 us      1583144   sched-pipe
           1      2.03 us      2.03 us      2.03 us         5068   chrome

As you can see, the owner is unknown for the most cases.  But if we
filter only for the mutex locks, it'd more likely get the onwers.

  $ sudo ./perf lock con -abo -Y mutex -- ./perf bench sched pipe
  # Running 'sched/pipe' benchmark:
  # Executed 1000000 pipe operations between two processes

       Total time: 4.910 [sec]

         4.910435 usecs/op
           203647 ops/sec
   contended   total wait     max wait     avg wait          pid   owner

           2     15.50 us      8.29 us      7.75 us      1582852   sched-pipe
           7      7.20 us      2.47 us      1.03 us           -1   Unknown
           1      6.74 us      6.74 us      6.74 us      1582851   sched-pipe

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230207002403.63590-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-08 10:33:32 -03:00
..
arch perf event x86: Add retire_lat when synthesizing PERF_SAMPLE_WEIGHT_STRUCT 2023-02-06 14:56:22 -03:00
bench perf bench syscall: Add execve syscall benchmark 2023-02-02 16:32:19 -03:00
dlfilters perf tools: Fix usage of the verbose variable 2022-12-20 15:16:33 -03:00
Documentation perf lock contention: Add -o/--lock-owner option 2023-02-08 10:33:32 -03:00
examples/bpf perf trace: Remove unused bpf map 'syscalls' 2022-11-23 10:30:00 -03:00
include/perf perf bpf: Remove now unused BPF headers 2022-11-04 11:41:48 -03:00
jvmti
pmu-events perf jevents: Run metric_test.py at compile-time 2023-02-03 17:11:39 -03:00
python
scripts perf script flamegraph: Avoid d3-flame-graph package dependency 2023-01-19 09:29:58 -03:00
tests perf test bpf: Skip test if kernel-debuginfo is not present 2023-02-06 15:01:23 -03:00
trace perf beauty: Update copy of linux/socket.h with the kernel sources 2023-01-18 10:12:23 -03:00
ui perf tools: Fix "kernel lock contention analysis" test by not printing warnings in quiet mode 2022-10-27 16:37:26 -03:00
util perf lock contention: Add -o/--lock-owner option 2023-02-08 10:33:32 -03:00
.gitignore perf jevents: Run metric_test.py at compile-time 2023-02-03 17:11:39 -03:00
Build perf build: Use libtraceevent from the system 2022-12-14 11:16:12 -03:00
builtin-annotate.c perf build: Use libtraceevent from the system 2022-12-14 11:16:12 -03:00
builtin-bench.c perf bench syscall: Add execve syscall benchmark 2023-02-02 16:32:19 -03:00
builtin-buildid-cache.c
builtin-buildid-list.c
builtin-c2c.c perf tools: Use dedicated non-atomic clear/set bit helpers 2022-12-05 09:29:06 -03:00
builtin-config.c
builtin-daemon.c perf daemon: Use sig_atomic_t to avoid UB 2022-11-03 09:35:44 -03:00
builtin-data.c perf build: Use libtraceevent from the system 2022-12-14 11:16:12 -03:00
builtin-diff.c perf tools: Make quiet mode consistent between tools 2022-10-27 16:37:26 -03:00
builtin-evlist.c
builtin-ftrace.c perf ftrace: Use sig_atomic_t to avoid UB 2022-11-03 09:36:09 -03:00
builtin-help.c
builtin-inject.c perf inject: Use perf_data__read() for auxtrace 2023-02-01 19:22:07 -03:00
builtin-kallsyms.c
builtin-kmem.c perf kmem: Support field "node" in evsel__process_alloc_event() coping with recent tracepoint restructuring 2023-01-10 10:52:49 -03:00
builtin-kvm.c perf build: Use libtraceevent from the system 2022-12-14 11:16:12 -03:00
builtin-kwork.c perf build: Use libtraceevent from the system 2022-12-14 11:16:12 -03:00
builtin-list.c perf pmu-events: Remove now unused event and metric variables 2023-02-03 13:54:21 -03:00
builtin-lock.c perf lock contention: Add -o/--lock-owner option 2023-02-08 10:33:32 -03:00
builtin-mem.c perf tools: Move 'struct perf_sample' to a separate header file to disentangle headers 2022-10-31 11:06:41 -03:00
builtin-probe.c perf probe: Fix usage when libtraceevent is missing 2023-02-02 16:32:19 -03:00
builtin-record.c perf tools: Fix usage of the verbose variable 2022-12-20 15:16:33 -03:00
builtin-report.c perf build: Use libtraceevent from the system 2022-12-14 11:16:12 -03:00
builtin-sched.c perf tools: Use dedicated non-atomic clear/set bit helpers 2022-12-05 09:29:06 -03:00
builtin-script.c perf script: Support Retire Latency 2023-02-03 17:26:40 -03:00
builtin-stat.c perf stat: Remove evsel metric_name/expr 2023-02-03 13:54:21 -03:00
builtin-timechart.c perf build: Use libtraceevent from the system 2022-12-14 11:16:12 -03:00
builtin-top.c perf evlist: Remove group option. 2022-12-14 15:28:18 -03:00
builtin-trace.c perf trace: Reduce #ifdefs for TEP_FIELD_IS_RELATIVE 2023-01-19 13:26:28 -03:00
builtin-version.c perf build: Use libtraceevent from the system 2022-12-14 11:16:12 -03:00
builtin.h
check-headers.sh
command-list.txt perf help: Use HAVE_LIBTRACEEVENT to filter out unsupported commands 2023-01-02 11:51:53 -03:00
CREDITS
design.txt
Makefile perf tools: Use "grep -E" instead of "egrep" 2022-12-14 15:28:19 -03:00
Makefile.config perf tools: Remove HAVE_LIBTRACEEVENT_TEP_FIELD_IS_RELATIVE 2023-01-19 13:24:56 -03:00
Makefile.perf perf jevents: Run metric_test.py at compile-time 2023-02-03 17:11:39 -03:00
MANIFEST tools lib traceevent: Remove libtraceevent 2022-12-14 11:16:12 -03:00
perf-archive.sh
perf-completion.sh
perf-iostat.sh
perf-read-vdso.c
perf-sys.h
perf.c perf build: Use libtraceevent from the system 2022-12-14 11:16:12 -03:00
perf.h