perf-lock(1)
============
NAME
----
perf-lock - Analyze lock events
SYNOPSIS
--------
[verse]
'perf lock' {record|report|script|info|contention}
DESCRIPTION
-----------
The 'perf lock' command analyzes lock behaviours
and statistics.
'perf lock record <command>' records lock events
while <command> runs, and writes the tracing results
of those lock events to the file "perf.data".
'perf lock report' reports statistical data.
'perf lock script' shows raw lock events.
'perf lock info' shows metadata like threads or addresses
of lock instances.
'perf lock contention' shows contention statistics.
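The record-then-report workflow described above can be sketched as a small shell helper. The function name `lock_profile` is hypothetical and not part of perf; the sketch assumes perf is installed with lock tracing support and that the caller has sufficient privileges to record.

```shell
# Hypothetical helper (not part of perf): record lock events for a
# command, then print the statistical report.
# Usage: lock_profile <command> [args...]
lock_profile() {
    # Record lock events while the command runs; writes perf.data
    # in the current directory.
    perf lock record "$@" || return 1
    # Summarize the recorded events with the default sort key (acquired).
    perf lock report
}
```

For example, `lock_profile sleep 3` would record lock events for three seconds of `sleep` and then print the report.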
COMMON OPTIONS
--------------
-i::
--input=<file>::
Input file name. (default: perf.data unless stdin is a fifo)
-v::
--verbose::
Be more verbose (show symbol address, etc).
-q::
--quiet::
Do not show any warnings or messages. (Suppress -v)
-D::
--dump-raw-trace::
Dump raw trace in ASCII.
-f::
--force::
Don't complain, do it.
--vmlinux=<file>::
vmlinux pathname
--kallsyms=<file>::
kallsyms pathname
REPORT OPTIONS
--------------
-k::
--key=<value>::
Sorting key. Possible values: acquired (default), contended,
avg_wait, wait_total, wait_max, wait_min.
-F::
--field=<value>::
Output fields. All fields are shown by default; use this option to
select a subset. Possible values: acquired, contended,
avg_wait, wait_total, wait_max, wait_min.
-c::
--combine-locks::
Merge lock instances in the same class (based on name).
-t::
--threads::
Show per-thread lock statistics, for example:

  $ perf lock report -t -F acquired,contended,avg_wait

                  Name   acquired  contended   avg wait (ns)
                  perf     240569          9            5784
               swapper     106610         19             543
                :15789      17370          2           14538
          ContainerMgr       8981          6             874
                 sleep       5275          1           11281
       ContainerThread       4416          4             944
       RootPressureThr       3215          5            1215
           rcu_preempt       2954          0               0
          ContainerMgr       2560          0               0
               unnamed       1873          0               0
       EventManager_De       1845          1             636
       futex-default-S       1609          0               0
-E::
--entries=<value>::
Display this many entries.
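The report options can be combined. As a minimal sketch, the hypothetical helper below (the name `lock_top_waiters` is an assumption, not part of perf) merges lock instances by class, sorts by average wait time, limits the output fields, and caps the number of entries. It assumes a perf.data file from an earlier 'perf lock record' run exists in the current directory.

```shell
# Hypothetical helper (not part of perf): list the N lock classes with
# the longest average wait time.
# Usage: lock_top_waiters [N]    (N defaults to 10)
lock_top_waiters() {
    # -c merges instances of the same class, -k picks the sort key,
    # -F selects the output fields, -E limits the number of entries.
    perf lock report -c -k avg_wait -F contended,avg_wait,wait_max -E "${1:-10}"
}
```

Calling `lock_top_waiters 5` would show at most five entries.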
INFO OPTIONS
------------
-t::
--threads::
Dump the thread list in perf.data.
-m::
--map::
Dump the map of lock instances (address:name table).
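As a sketch of how the two info options might be used together, the hypothetical helper below (`lock_info_dump` is an assumed name, not part of perf) prints both the thread list and the lock instance map from one recording, using the common -i option to select the input file.

```shell
# Hypothetical helper (not part of perf): dump the thread list and the
# address:name map of lock instances from a recorded file.
# Usage: lock_info_dump [file]    (file defaults to perf.data)
lock_info_dump() {
    f="${1:-perf.data}"
    # Thread list stored in the recording.
    perf lock info -i "$f" -t
    # Map of lock instances (address:name table).
    perf lock info -i "$f" -m
}
```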
CONTENTION OPTIONS
------------------
-k::
--key=<value>::
Sorting key. Possible values: contended, wait_total (default),
wait_max, wait_min, avg_wait.
-F::
--field=<value>::
Output fields. All fields except wait_min are shown by default;
use this option to select a different set. Possible values:
contended, wait_total, wait_max, wait_min, avg_wait.
-t::
--threads::
Show per-thread lock contention statistics.
-b::
--use-bpf::
Use a BPF program to collect lock contention statistics instead of
using the input data.
-a::
--all-cpus::
System-wide collection from all CPUs.
-C::
--cpu::
Collect samples only on the list of CPUs provided. Multiple CPUs can be
provided as a comma-separated list with no space: 0,1. Ranges of CPUs
are specified with -: 0-2. Default is to monitor all CPUs.
-p::
--pid=::
Record events on existing process ID (comma separated list).
--tid=::
Record events on existing thread ID (comma separated list).
--map-nr-entries::
Maximum number of BPF map entries (default: 10240).
--max-stack::
Maximum stack depth when collecting lock contention (default: 8).
--stack-skip::
Number of stack entries to skip when finding a lock caller (default: 3).
-E::
--entries=<value>::
Display this many entries.
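The BPF-based options above can be combined into a system-wide measurement. The following is a minimal sketch; the helper name `lock_contention_live` is hypothetical, and it assumes a perf build with BPF support and sufficient privileges. If the reported callers are tracing or lock-internal functions rather than the real callers, a larger --stack-skip value than the default of 3 may give more meaningful names.

```shell
# Hypothetical helper (not part of perf): collect system-wide lock
# contention with BPF while a command runs.
# Usage: lock_contention_live <stack-skip> <command> [args...]
lock_contention_live() {
    skip="$1"
    shift
    # -a collects from all CPUs, -b uses BPF instead of the input data,
    # -F selects the output fields, --stack-skip sets how many stack
    # entries to skip when looking for the lock caller.
    perf lock contention -a -b -F contended,wait_total --stack-skip "$skip" "$@"
}
```

For example, `lock_contention_live 5 sleep 3` would measure contention for three seconds while skipping five stack entries.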
SEE ALSO
--------
linkperf:perf[1]