linux-stable/tools/perf
Namhyung Kim edc41a1099 perf record: Enable off-cpu analysis with BPF
Add --off-cpu option to enable the off-cpu profiling with BPF.  It'd
use a bpf_output event and rename it to "offcpu-time".  Samples will
be synthesized at the end of the record session using data from a BPF
map which contains the aggregated off-cpu time at context switches.
So it needs root privilege to get the off-cpu profiling.

Each sample will have a separate user stacktrace so it will skip
kernel threads.  The sample ip will be set from the stacktrace and
other sample data will be updated accordingly.  Currently it only
handles some basic sample types.

The sample timestamp is set to a dummy value just not to bother with
other events during the sorting.  So it has a very big initial value
and increase it on processing each samples.

Good thing is that it can be used together with regular profiling like
cpu cycles.  If you don't want to that, you can use a dummy event to
enable off-cpu profiling only.

Example output:
  $ sudo perf record --off-cpu perf bench sched messaging -l 1000

  $ sudo perf report --stdio --call-graph=no
  # Total Lost Samples: 0
  #
  # Samples: 41K of event 'cycles'
  # Event count (approx.): 42137343851
  ...

  # Samples: 1K of event 'offcpu-time'
  # Event count (approx.): 587990831640
  #
  # Children      Self  Command          Shared Object       Symbol
  # ........  ........  ...............  ..................  .........................
  #
      81.66%     0.00%  sched-messaging  libc-2.33.so        [.] __libc_start_main
      81.66%     0.00%  sched-messaging  perf                [.] cmd_bench
      81.66%     0.00%  sched-messaging  perf                [.] main
      81.66%     0.00%  sched-messaging  perf                [.] run_builtin
      81.43%     0.00%  sched-messaging  perf                [.] bench_sched_messaging
      40.86%    40.86%  sched-messaging  libpthread-2.33.so  [.] __read
      37.66%    37.66%  sched-messaging  libpthread-2.33.so  [.] __write
       2.91%     2.91%  sched-messaging  libc-2.33.so        [.] __poll
  ...

As you can see it spent most of off-cpu time in read and write in
bench_sched_messaging().  The --call-graph=no was added just to make
the output concise here.

It uses perf hooks facility to control BPF program during the record
session rather than adding new BPF/off-cpu specific calls.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220518224725.742882-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-26 12:36:57 -03:00
..
arch perf intel-pt: Track sideband system-wide when needed 2022-05-26 12:36:57 -03:00
bench Merge remote-tracking branch 'torvalds/master' into perf/core 2022-05-23 09:32:49 -03:00
dlfilters perf dlfilter: Drop unused variable 2021-12-16 12:18:11 -03:00
Documentation perf record: Enable off-cpu analysis with BPF 2022-05-26 12:36:57 -03:00
examples/bpf
include
jvmti
pmu-events perf vendors events arm64: Update Cortex A57/A72 2022-05-23 10:16:41 -03:00
python perf python: Convert tracepoint.py example to python3 2022-04-01 16:19:35 -03:00
scripts perf scripts python: intel-pt-events.py: Print ptwrite value as a string if it is ASCII 2022-05-17 11:56:15 -03:00
tests perf test: Add checking for perf stat CSV output. 2022-05-26 12:36:57 -03:00
trace perf beauty: Update copy of linux/socket.h with the kernel sources 2022-04-01 16:19:34 -03:00
ui perf annotate: Avoid TUI crash when navigating in the annotation of recursive functions 2022-01-10 15:47:30 -03:00
util perf record: Enable off-cpu analysis with BPF 2022-05-26 12:36:57 -03:00
.gitignore perf tools: Delete perf-with-kcore.sh script 2022-04-27 20:11:26 -03:00
Build
builtin-annotate.c perf annotate: Add --percent-limit option 2022-05-10 14:37:55 -03:00
builtin-bench.c perf bench: Add breakpoint benchmarks 2022-05-13 11:00:38 -03:00
builtin-buildid-cache.c perf record: Disable debuginfod by default 2022-01-15 17:41:25 -03:00
builtin-buildid-list.c
builtin-c2c.c perf c2c: Add dimensions for 'N/A' metrics of store operation 2022-05-23 09:36:34 -03:00
builtin-config.c
builtin-daemon.c perf daemon: Remove duplicate sys/file.h include 2021-10-08 15:14:50 -03:00
builtin-data.c perf data: Don't mention --to-ctf if it's not supported 2022-02-22 21:23:08 -03:00
builtin-diff.c
builtin-evlist.c
builtin-ftrace.c perf evlist: Rename cpus to user_requested_cpus 2022-04-01 16:19:35 -03:00
builtin-help.c
builtin-inject.c perf inject: Keep a copy of kcore_dir 2022-05-23 10:11:49 -03:00
builtin-kallsyms.c
builtin-kmem.c perf tools: Enhance the matching of sub-commands abbreviations 2022-03-26 10:55:57 -03:00
builtin-kvm.c perf kvm report: Add guest_code support 2022-05-23 10:19:15 -03:00
builtin-list.c perf list: Display hybrid PMU events with cpu type 2021-10-25 13:47:42 -03:00
builtin-lock.c perf lock: Add -t/--thread option for report 2022-05-23 09:49:35 -03:00
builtin-mem.c perf tools: Enhance the matching of sub-commands abbreviations 2022-03-26 10:55:57 -03:00
builtin-probe.c perf namespaces: Add functions to access nsinfo 2022-02-11 14:31:22 -03:00
builtin-record.c perf record: Enable off-cpu analysis with BPF 2022-05-26 12:36:57 -03:00
builtin-report.c perf report: Set PERF_SAMPLE_DATA_SRC bit for Arm SPE event 2022-04-22 18:39:34 -03:00
builtin-sched.c perf tools: Enhance the matching of sub-commands abbreviations 2022-03-26 10:55:57 -03:00
builtin-script.c perf script: Add guest_code support 2022-05-23 10:19:04 -03:00
builtin-stat.c perf stat: Add requires_cpu flag for uncore 2022-05-26 12:36:57 -03:00
builtin-timechart.c perf tools: Enhance the matching of sub-commands abbreviations 2022-03-26 10:55:57 -03:00
builtin-top.c perf evlist: Rename cpus to user_requested_cpus 2022-04-01 16:19:35 -03:00
builtin-trace.c Merge remote-tracking branch 'torvalds/master' into perf/core 2022-02-17 18:40:54 -03:00
builtin-version.c perf version: Add HAVE_DEBUGINFOD_SUPPORT to built-in features 2022-04-20 13:32:09 -03:00
builtin.h
check-headers.sh tools arm64: Import cputype.h 2022-03-26 10:53:45 -03:00
command-list.txt
CREDITS
design.txt perf design.txt: Synchronize the definition of enum perf_hw_id with code 2021-11-13 18:11:50 -03:00
Makefile
Makefile.config perf build: Stop using __weak bpf_map_create() to handle older libbpf versions 2022-05-26 12:36:56 -03:00
Makefile.perf perf record: Enable off-cpu analysis with BPF 2022-05-26 12:36:57 -03:00
MANIFEST perf MANIFEST: Add bpftool files to allow building with BUILD_BPF_SKEL=1 2021-11-07 15:39:28 -03:00
perf-archive.sh
perf-completion.sh
perf-iostat.sh
perf-read-vdso.c
perf-sys.h
perf.c perf tools: Add external commands to list-cmds 2022-04-09 14:21:00 -03:00
perf.h