Commit graph

10862 commits

Author SHA1 Message Date
Ian Rogers
c5baf90892 perf annotate: Fix objdump --no-show-raw-insn flag
Remove redirection of objdump's stderr to /dev/null to help diagnose
failures.

Fix the '--no-show-raw' flag to be '--no-show-raw-insn' which binutils
is permissive and allows, but fails with LLVM objdump.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20191010183649.23768-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:39:42 -03:00
Ian Rogers
b34b45eef1 perf annotate: Don't pipe objdump output through 'expand' command
Avoiding a pipe allows objdump command failures to surface.  Move to the
caller of symbol__parse_objdump_line the call to strim that removes
leading and trailing tabs.  Add a new expand_tabs function that if a tab
is present allocate a new line in which tabs are expanded.  In
symbol__parse_objdump_line the line had no leading spaces, so simplify
the line_ip processing.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20191010183649.23768-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:39:42 -03:00
Ian Rogers
7a675de428 perf annotate: Don't pipe objdump output through 'grep' command
Simplify the objdump command by not piping the output of objdump through
grep. Instead, drop lines that match the grep pattern during the reading
loop.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20191010183649.23768-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:39:42 -03:00
Ian Rogers
4235949944 perf annotate: Use libsubcmd's run-command.h to fork objdump
Reduce duplicated logic by using the subcmd library. Ensure when errors
occur they are reported to the caller. Before this patch, if no lines
are read the error status is 0.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20191010183649.23768-3-irogers@google.com
Link: http://lore.kernel.org/lkml/20191015003418.62563-1-irogers@google.com
[ merged follow up fix for NULL termination as in the 2nd link above ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:39:01 -03:00
Ian Rogers
353dcaa2f9 perf annotate: Avoid reallocation in objdump parsing
Objdump output is parsed using getline which allocates memory for the
read. Getline will realloc if the memory is too small, but currently the
line is always freed after the call.

Simplify parse_objdump_line by performing the reading in symbol__disassemble.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20191010183649.23768-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:36:22 -03:00
Jin Yao
800d3f5616 perf report: Add warning when libunwind not compiled in
We received a user report that call-graph DWARF mode was enabled in
'perf record' but 'perf report' didn't unwind the callstack correctly.
The reason was, libunwind was not compiled in.

We can use 'perf -vv' to check the compiled libraries but it would be
valuable to report a warning to user directly (especially valuable for
a perf newbie).

The warning is:

Warning:
Please install libunwind development packages during the perf build.

Both TUI and stdio are supported.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191011022122.26369-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:36:22 -03:00
Leo Yan
791ce9c48c perf test: Avoid infinite loop for task exit case
When executing the task exit testing case, perf gets stuck in an endless
loop this case and doesn't return back on Arm64 Juno board.

After digging into this issue, since Juno board has Arm's big.LITTLE
CPUs, thus the PMUs are not compatible between the big CPUs and little
CPUs.  This leads to a PMU event that cannot be enabled properly when
the traced task is migrated from one variant's CPU to another variant.
Finally, the test case runs into infinite loop for cannot read out any
event data after return from polling.

Eventually, we need to work out formal solution to allow PMU events can
be freely migrated from one CPU variant to another, but this is a
difficult task and a different topic.  This patch tries to fix the Perf
test case to avoid infinite loop, when the testing detects 1000 times
retrying for reading empty events, it will directly bail out and return
failure.  This allows the Perf tool can continue its other test cases.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20191011091942.29841-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:36:22 -03:00
Leo Yan
6add129c5d perf test: Report failure for mmap events
When fail to mmap events in task exit case, it misses to set 'err' to
-1; thus the testing will not report failure for it.

This patch sets 'err' to -1 when fails to mmap events, thus Perf tool
can report correct result.

Fixes: d723a55096 ("perf test: Add test case for checking number of EXIT events")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20191011091942.29841-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:36:22 -03:00
Andi Kleen
5a40e19948 perf evlist: Fix fix for freed id arrays
In the earlier fix for the memory overrun of id arrays I managed to typo
the wrong event in the fix.

Of course we need to close the current event in the loop, not the
original failing event.

The same test case as in the original patch still passes.

Fixes: 7834fa948b ("perf evlist: Fix access of freed id arrays")
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20191011182140.8353-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:36:22 -03:00
Andi Kleen
b3509b6ed7 perf script: Fix --reltime with --time
My earlier patch to just enable --reltime with --time was a little too
optimistic.  The --time parsing would accept absolute time, which is
very confusing to the user.

Support relative time in --time parsing too. This only works with recent
perf record that records the first sample time. Otherwise we error out.

Fixes: 3714437d3f ("perf script: Allow --time with --reltime")
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20191011182140.8353-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:36:22 -03:00
Jiri Olsa
bb91a073ed perf tools: Allow to build with -ltcmalloc
By using "make TCMALLOC=1" you can enable perf to be build for usage
with libtcmalloc.so (gperftools).

Get heap profile (tools/perf directory):

  $ <install gperftools>
  $ make TCMALLOC=1 DEBUG=1
  $ HEAPPROFILE=/tmp/heapprof ./perf ...
  $ pprof ./perf /tmp/heapprof.000*
  (pprof) top
  Total: 2335.5 MB
    1735.1  74.3%  74.3%   1735.1  74.3% memdup
     402.0  17.2%  91.5%    402.0  17.2% zalloc
     140.2   6.0%  97.5%    145.8   6.2% map__new
      33.6   1.4%  98.9%     33.6   1.4% symbol__new
      12.4   0.5%  99.5%     12.4   0.5% alloc_event
       6.2   0.3%  99.7%      6.2   0.3% nsinfo__new
       5.5   0.2% 100.0%      5.5   0.2% nsinfo__copy
       0.3   0.0% 100.0%      0.3   0.0% dso__new
       0.1   0.0% 100.0%      0.1   0.0% do_read_string
       0.0   0.0% 100.0%      0.0   0.0% __GI__IO_file_doallocate

See callstack:
  $ pprof --pdf ./perf /tmp/heapprof.00* > callstack.pdf
  $ pprof --web ./perf /tmp/heapprof.00*

Committer testing:

Install gperftools, on fedora:

  # dnf install gperftools-devel

Then build:

 $ make TCMALLOC=1 DEBUG=1 -C tools/perf O=/tmp/build/perf install-bin

Verify that it linked against the right library:

  $ ldd ~/bin/perf | grep tcma
	libtcmalloc.so.4 => /lib64/libtcmalloc.so.4 (0x00007fb2953a7000)
  $

Run 'perf trace' system wide for 1 minute:

  # HEAPPROFILE=/tmp/heapprof perf trace -a sleep 1m
  <SNIP>
   59985.524 ( 0.006 ms): Web Content/20354 recvmsg(fd: 9<socket:[1762817]>, msg: 0x7ffee5fdafb0) = -1 EAGAIN (Resource temporarily unavailable)
   59985.536 ( 0.005 ms): Web Content/20354 recvmsg(fd: 9<socket:[1762817]>, msg: 0x7ffee5fdafc0) = -1 EAGAIN (Resource temporarily unavailable)
   59981.956 (10.143 ms): SCTP timer/21716  ... [continued]: select())                            = 0 (Timeout)
   59985.549 (         ): Web Content/20354 poll(ufds: 0x7f1df38af180, nfds: 3, timeout_msecs: 4294967295) ...
       0.926 (59999.481 ms): sleep/29764  ... [continued]: nanosleep())                           = 0
   59992.133 (         ): SCTP timer/21716 select(tvp: 0x7ff5bf7fee80)                            ...
   60000.477 ( 0.009 ms): sleep/29764 close(fd: 1)                                                = 0
   60000.493 ( 0.005 ms): sleep/29764 close(fd: 2)                                                = 0
   60000.514 (         ): sleep/29764 exit_group()                                                = ?
  Dumping heap profile to /tmp/heapprof.0001.heap (Exiting, 3 MB in use)
[root@quaco ~]#

Install pprof:

  # dnf install pprof

And run it:

  # pprof ~/bin/perf /tmp/heapprof.0001.heap
  Using local file /root/bin/perf.
  Using local file /tmp/heapprof.0001.heap.
  Welcome to pprof!  For help, type 'help'.
  (pprof) top
  Total: 4.0 MB
       1.7  42.0%  42.0%      2.2  54.1% map__new
       0.9  23.3%  65.3%      0.9  23.3% zalloc
       0.5  11.4%  76.7%      0.5  11.4% dso__new
       0.2   5.6%  82.3%      0.3   8.5% trace__sys_enter
       0.2   4.9%  87.2%      0.2   4.9% __GI___strdup
       0.2   3.8%  91.0%      0.2   3.8% new_term
       0.1   2.2%  93.2%      0.4  10.1% __perf_pmu__new_alias
       0.0   1.0%  94.3%      0.0   1.2% event_read_fields
       0.0   0.8%  95.1%      0.0   0.8% nsinfo__new
       0.0   0.7%  95.8%      0.1   3.2% trace__read_syscall_info
  (pprof)

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191013151427.11941-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-15 08:36:22 -03:00
Jin Yao
cebf7d51a6 perf diff: Report noisy for cycles diff
This patch prints the stddev and hist for the cycles diff of program
block. It can help us to understand if the cycles is noisy or not.

This patch is inspired by Andi Kleen's patch:

  https://lwn.net/Articles/600471/

We create new option '--cycles-hist'.

Example:

  perf record -b ./div
  perf record -b ./div
  perf diff -c cycles

  # Baseline                                [Program Block Range] Cycles Diff  Shared Object      Symbol
  # ........  .......................................................... ....  .................  ............................
  #
      46.72%                                      [div.c:40 -> div.c:40]    0  div                [.] main
      46.72%                                      [div.c:42 -> div.c:44]    0  div                [.] main
      46.72%                                      [div.c:42 -> div.c:39]    0  div                [.] main
      20.54%                          [random_r.c:357 -> random_r.c:394]    1  libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:357 -> random_r.c:380]    0  libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:388 -> random_r.c:388]    0  libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:388 -> random_r.c:391]    0  libc-2.27.so       [.] __random_r
      17.04%                              [random.c:288 -> random.c:291]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:291 -> random.c:291]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:293 -> random.c:293]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:295 -> random.c:295]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:295 -> random.c:295]    0  libc-2.27.so       [.] __random
      17.04%                              [random.c:298 -> random.c:298]    0  libc-2.27.so       [.] __random
       8.40%                                      [div.c:22 -> div.c:25]    0  div                [.] compute_flag
       8.40%                                      [div.c:27 -> div.c:28]    0  div                [.] compute_flag
       5.14%                                    [rand.c:26 -> rand.c:27]    0  libc-2.27.so       [.] rand
       5.14%                                    [rand.c:28 -> rand.c:28]    0  libc-2.27.so       [.] rand
       2.15%                                  [rand@plt+0 -> rand@plt+0]    0  div                [.] rand@plt
       0.00%                                                                   [kernel.kallsyms]  [k] __x86_indirect_thunk_rax
       0.00%                                [do_mmap+714 -> do_mmap+732]  -10  [kernel.kallsyms]  [k] do_mmap
       0.00%                                [do_mmap+737 -> do_mmap+765]    1  [kernel.kallsyms]  [k] do_mmap
       0.00%                                [do_mmap+262 -> do_mmap+299]    0  [kernel.kallsyms]  [k] do_mmap
       0.00%  [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0]    7  [kernel.kallsyms]  [k] __x86_indirect_thunk_r15
       0.00%            [native_sched_clock+0 -> native_sched_clock+119]   -1  [kernel.kallsyms]  [k] native_sched_clock
       0.00%                 [native_write_msr+0 -> native_write_msr+16]  -13  [kernel.kallsyms]  [k] native_write_msr

When we enable the option '--cycles-hist', the output is

  perf diff -c cycles --cycles-hist

  # Baseline                                [Program Block Range] Cycles Diff        stddev/Hist  Shared Object      Symbol
  # ........  .......................................................... ....  .................  .................  ............................
  #
      46.72%                                      [div.c:40 -> div.c:40]    0  ± 37.8% ▁█▁▁██▁█   div                [.] main
      46.72%                                      [div.c:42 -> div.c:44]    0  ± 49.4% ▁▁▂█▂▂▂▂   div                [.] main
      46.72%                                      [div.c:42 -> div.c:39]    0  ± 24.1% ▃█▂▄▁▃▂▁   div                [.] main
      20.54%                          [random_r.c:357 -> random_r.c:394]    1  ± 33.5% ▅▂▁█▃▁▂▁   libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:357 -> random_r.c:380]    0  ± 39.4% ▁▁█▁██▅▁   libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:388 -> random_r.c:388]    0                     libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:388 -> random_r.c:391]    0  ± 41.2% ▁▃▁▂█▄▃▁   libc-2.27.so       [.] __random_r
      17.04%                              [random.c:288 -> random.c:291]    0  ± 48.8% ▁▁▁▁███▁   libc-2.27.so       [.] __random
      17.04%                              [random.c:291 -> random.c:291]    0  ±100.0% ▁█▁▁▁▁▁▁   libc-2.27.so       [.] __random
      17.04%                              [random.c:293 -> random.c:293]    0  ±100.0% ▁█▁▁▁▁▁▁   libc-2.27.so       [.] __random
      17.04%                              [random.c:295 -> random.c:295]    0  ±100.0% ▁█▁▁▁▁▁▁   libc-2.27.so       [.] __random
      17.04%                              [random.c:295 -> random.c:295]    0                     libc-2.27.so       [.] __random
      17.04%                              [random.c:298 -> random.c:298]    0  ± 75.6% ▃█▁▁▁▁▁▁   libc-2.27.so       [.] __random
       8.40%                                      [div.c:22 -> div.c:25]    0  ± 42.1% ▁▃▁▁███▁   div                [.] compute_flag
       8.40%                                      [div.c:27 -> div.c:28]    0  ± 41.8% ██▁▁▄▁▁▄   div                [.] compute_flag
       5.14%                                    [rand.c:26 -> rand.c:27]    0  ± 37.8% ▁▁▁████▁   libc-2.27.so       [.] rand
       5.14%                                    [rand.c:28 -> rand.c:28]    0                     libc-2.27.so       [.] rand
       2.15%                                  [rand@plt+0 -> rand@plt+0]    0                     div                [.] rand@plt
       0.00%                                                                                      [kernel.kallsyms]  [k] __x86_indirect_thunk_rax
       0.00%                                [do_mmap+714 -> do_mmap+732]  -10                     [kernel.kallsyms]  [k] do_mmap
       0.00%                                [do_mmap+737 -> do_mmap+765]    1                     [kernel.kallsyms]  [k] do_mmap
       0.00%                                [do_mmap+262 -> do_mmap+299]    0                     [kernel.kallsyms]  [k] do_mmap
       0.00%  [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0]    7                     [kernel.kallsyms]  [k] __x86_indirect_thunk_r15
       0.00%            [native_sched_clock+0 -> native_sched_clock+119]   -1  ± 38.5% ▄█▁        [kernel.kallsyms]  [k] native_sched_clock
       0.00%                 [native_write_msr+0 -> native_write_msr+16]  -13  ± 47.1% ▁█▇▃▁▁     [kernel.kallsyms]  [k] native_write_msr

 v8:
 ---
 Rebase to perf/core branch

 v7:
 ---
 1. v6 got Jiri's ACK.
 2. Rebase to latest perf/core branch.

 v6:
 ---
 1. Jiri provides better code for using data__hpp_register() in ui_init().
    Use this code in v6.

 v5:
 ---
 1. Refine the use of data__hpp_register() in ui_init() according to
    Jiri's suggestion.

 v4:
 ---
 1. Rename the new option from '--noisy' to '--cycles-hist'
 2. Remove the option '-n'.
 3. Only update the spark value and stats when '--cycles-hist' is enabled.
 4. Remove the code of printing '..'.

 v3:
 ---
 1. Move the histogram to a separate column
 2. Move the svals[] out of struct stats

 v2:
 ---
 Jiri got a compile error,

  CC       builtin-diff.o
  builtin-diff.c: In function ‘compute_cycles_diff’:
  builtin-diff.c:712:10: error: taking the absolute value of unsigned type ‘u64’ {aka ‘long unsigned int’} has no effect [-Werror=absolute-value]
  712 |          labs(pair->block_info->cycles_spark[i] -
      |          ^~~~

 Because the result of u64 - u64 is still u64. Now we change the type of
 cycles_spark[] to s64.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20190925011446.30678-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-11 10:57:00 -03:00
Jiri Olsa
55542113c6 perf tools: Propagate CFLAGS to libperf
Andi reported that 'make DEBUG=1' does not propagate to the libbperf
code. It's true also for the other flags. Changing the code to propagate
the global build flags to libperf compilation.

Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191011122155.15738-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-11 10:55:22 -03:00
Jiri Olsa
84227cb11f libperf: Adopt perf_evlist__filter_pollfd() from tools/perf
Introduce the perf_evlist__filter_pollfd function and export it in the
perf/evlist.h header, so that libperf users can check if the descriptor
is still alive.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-27-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:58:45 -03:00
Jiri Olsa
696f27c994 libperf: Introduce perf_evlist__purge()
Add a static perf_evlist__purge() function to purge evsels from a evlist.

Add also perf_evlist__for_each_entry_safe() which is used by
perf_evlist__purge().

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-26-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:57:22 -03:00
Jiri Olsa
93dd6e2831 libperf: Introduce perf_evlist__exit()
Add the perf_evlist__exit() function, so far it's not exported and added
only for internal use for perf and libperf.

USe it to release cpus/threads and pollfd array.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-25-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:56:01 -03:00
Jiri Olsa
230662e15e libperf: Move the pollfd allocation from tools/perf to libperf
It's needed in libperf only, so move it to the perf_evlist__mmap_ops()
function.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-24-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:54:35 -03:00
Jiri Olsa
285aaeac8c libperf: Centralize map refcnt setting
Currently when a new map is mmapped we set its refcnt to 2 in the
perf_evlist_mmap_ops::mmap callback.

Every mmap gets its refcnt set to 2 when it's first mmaped:

  - 1 for the current user, which will be taken out by a call to
    perf_evlist__munmap_filtered(), where we find out there's
    no more data comming from kernel to this mmap.

  - 1 for the drain code where in perf_mmap__consume() the mmap
    is released if it is empty.

Move this common setup into libperf's generic code before the mmap
callback is called.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-23-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:52:41 -03:00
Jiri Olsa
923d0f1868 perf evlist: Switch to libperf's mmap interface
Switch to the libperf mmap interface by calling directly
perf_evlist__mmap_ops() and removing perf's evlist__mmap_per_*
functions.

By switching to libperf perf_evlist__mmap() we need to operate over
'struct perf_mmap' in evlist__add_pollfd, so make the related changes
there.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-22-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:46:04 -03:00
Jiri Olsa
b80132b12a perf evlist: Introduce perf_evlist__mmap_cb_mmap()
Add the perf_evlist__mmap_cb_mmap() function to call perf specific
mmap__mmap() function during perf_evlist__mmap_ops() call.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-21-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:44:59 -03:00
Jiri Olsa
bb1b1885e2 perf evlist: Introduce perf_evlist__mmap_cb_get()
Add the perf_evlist__mmap_cb_get() function to return 'struct perf_mmap'
object during perf_evlist__mmap_ops() call.

The array of 'struct mmap' is allocated via evlist__alloc_mmap(), in
this callback we simply returns pointer to the base object.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-20-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:30:21 -03:00
Jiri Olsa
9abd2ab237 perf tools: Introduce perf_evlist__mmap_cb_idx()
Add perf_evlist__mmap_cb_idx function to call auxtrace_mmap_params__set_idx()
on each new index during perf_evlist__mmap_ops call.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-19-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:23:52 -03:00
Jiri Olsa
b5911e7ac2 libperf: Introduce perf_evlist_mmap_ops::mmap callback
Add the perf_evlist_mmap_ops::mmap callback to be called in
mmap_per_evsel() to actually mmap the map.

Add libperf's perf_evlist__mmap_cb_mmap() function as libperf's mmap
callback.

New mmaped map gets refcount set to 2 in mmap__mmap(), we follow that in
mmap callback. We will move this to common place after we switch to
perf_evlist__mmap().

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-18-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:22:21 -03:00
Jiri Olsa
3a8bb58121 libperf: Add perf_evlist_mmap_ops::get callback
Add the perf_evlist_mmap_ops::get callback to be called in
mmap_per_evsel() to get/allocate the 'struct perf_mmap' object.

Add the libperf's perf_evlist__mmap_cb_get() function as libperf's get
callback.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-17-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:21:11 -03:00
Jiri Olsa
1fcbb75cc5 libperf: Introduce perf_evlist_mmap_ops::idx callback
Add the perf_evlist_mmap_ops::idx callback to be called in
mmap_per_cpu() and mmap_per_thread() with current cpu and thread
indexes.

It's used by current aux code, so perf will use this callback to set the
aux index.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-16-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:20:08 -03:00
Jiri Olsa
0b5ea10d4c libperf: Introduce perf_evlist__mmap_ops()
To be able to pass specific callbacks to evlist's mmap.

There will be a specific call to this function from perf's
evlist__mmap() and libperf's perf_evlist__mmap() functions in following
changes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-15-jolsa@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2019-10-10 12:18:00 -03:00
Jiri Olsa
d1a177595b libperf: Adopt perf_evlist__mmap()/munmap() from tools/perf
Add libperf's version of perf_evlist__mmap()/munmap() functions and
exporting them in the perf/evlist.h header.

It's the backbone of what we have in perf code. The following changes
will add needed callbacks and then we'll finally switch the perf code to
use libperf's version.

Add mmap/mmap_ovw 'struct perf_mmap' object arrays to hold maps for
libperf's evlist.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-14-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 12:15:58 -03:00
Jiri Olsa
151ed5d70d libperf: Adopt perf_mmap__read_event() from tools/perf
Move perf_mmap__read_event() from tools/perf to libperf and export it in
the perf/mmap.h header.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-13-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 11:49:46 -03:00
Jiri Olsa
32fdc2ca7e libperf: Adopt perf_mmap__read_done() from tools/perf
Move perf_mmap__read_init() from tools/perf to libperf and export it in
the perf/mmap.h header.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-12-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 11:45:32 -03:00
Jiri Olsa
7c4d41824f libperf: Adopt perf_mmap__read_init() from tools/perf
Move perf_mmap__read_init() from tools/perf to libperf and export it in
perf/mmap.h header.

And add pr_debug2()/pr_debug3() macros support, because the code is
using them.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-11-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 11:45:21 -03:00
Jiri Olsa
7728fa0cfa libperf: Adopt perf_mmap__consume() function from tools/perf
Move perf_mmap__consume() vrom tools/perf to libperf and export it in
the perf/mmap.h header.

Move also the needed helpers perf_mmap__write_tail(),
perf_mmap__read_head() and perf_mmap__empty().

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 11:43:49 -03:00
Jiri Olsa
1d40ae4e17 perf tools: Use perf_mmap way to detect aux mmap
We will move this code to libperf shortly, so we need to free it of
'struct auxtrace_mmap' usage, because it won't be available in libperf
(for now).

The perf_event_mmap_page::aux_size is set when the aux mmap is mapped,
so the check is equivalent.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 10:11:54 -03:00
Jiri Olsa
80e53d1148 libperf: Adopt perf_mmap__put() function from tools/perf
Move perf_mmap__put() from tools/perf to libperf.

Once perf_mmap__put() is moved, we need a way to call application
related unmap code (AIO and aux related code for eprf), when the map
goes away.

Add the perf_mmap::unmap callback to do that.

The unmap path from perf is:

  perf_mmap__put                           (libperf)
    perf_mmap__munmap                      (libperf)
      map->unmap_cb -> perf_mmap__unmap_cb (perf)
        mmap__munmap                       (perf)

Committer notes:

Add missing linux/kernel.h to tools/perf/lib/mmap.c to get the BUG_ON
definition.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 10:09:25 -03:00
Jiri Olsa
59d7ea620b libperf: Adopt perf_mmap__unmap() function from tools/perf
Move perf_mmap__unmap() from tools/perf to libperf, to internal header
internal/mmap.h. It will be used in the following patches. And rename
the existing perf's function to mmap__munmap().

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 10:05:57 -03:00
Jiri Olsa
e75710f063 libperf: Adopt perf_mmap__get() function from tools/perf
Move perf_mmap__get() from tools/perf to libperf in the internal header
internal/mmap.h.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 09:53:27 -03:00
Jiri Olsa
32c261c070 libperf: Adopt perf_mmap__mmap() function from tools/perf
Move perf_mmap__mmap() from tools/perf to libperf, it will be used in
the following patches. And rename the existing perf's function to
mmap__mmap().

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 09:42:59 -03:00
Jiri Olsa
bf59b3053e libperf: Adopt perf_mmap__mmap_len() function from tools/perf
Move perf_mmap__mmap_len() from tools/perf wto libperf, it will be used
in the following patches. And rename the existing perf's function to
mmap__mmap_len().

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 09:41:38 -03:00
Jiri Olsa
e440979faf libperf: Add 'struct perf_mmap_param'
Add libperf's version of mmap params 'struct perf_mmap_param' object
with the basics: 'prot' and 'mask'.  Encapsulate it in the current
'struct mmap_params' object.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 09:40:00 -03:00
Jiri Olsa
353120b48d libperf: Add perf_mmap__init() function
Add perf_mmap__init() function to initialize 'struct perf_mmap' objects.

Add it to a new mmap.c source file, that will carry all the mmap related
functions.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 09:37:25 -03:00
Ian Rogers
42466b9f29 perf tools: Avoid 'sample_reg_masks' being const + weak
Being const + weak breaks with some compilers that constant-propagate
from the weak symbol. This behavior is outside of the specification, but
in LLVM is chosen to match GCC's behavior.

LLVM's implementation was set in this patch:

  f49573d1ee

A const + weak symbol is set to be weak_odr:

  https://llvm.org/docs/LangRef.html

ODR is one definition rule, and given there is one constant definition
constant-propagation is possible. It is possible to get this code to
miscompile with LLVM when applying link time optimization. As compilers
become more aggressive, this is likely to break in more instances.

Move the definition of sample_reg_masks to the conditional part of
perf_regs.h and guard usage with HAVE_PERF_REGS_SUPPORT. This avoids the
weak symbol.

Fix an issue when HAVE_PERF_REGS_SUPPORT isn't defined from patch v1.
In v3, add perf_regs.c for architectures that HAVE_PERF_REGS_SUPPORT but
don't declare sample_regs_masks.

Further notes:

Jiri asked:

  "Is this just a precaution or you actualy saw some breakage?"

Ian answered:

  "We saw a breakage with clang with thinlto enabled for linking. Our
   compiler team had recently seen, and were surprised by, a similar issue
   and were able to dig out the weak ODR issue."

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: clang-built-linux@googlegroups.com
Cc: Guo Ren <guoren@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: linux-riscv@lists.infradead.org
Cc: Mao Han <han_mao@c-sky.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20191001003623.255186-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-10 09:29:33 -03:00
Arnaldo Carvalho de Melo
728db19886 perf beauty: Introduce strtoul() for x86 MSRs
Continuing from the previous cset comment, now that filter expression
works:

  # perf trace -e msr:* --filter="msr!=FS_BASE && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL" --filter-pids 3750
     0.000 Timer/5033 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
     0.009 Timer/5033 msr:write_msr(msr: LSTAR, val: -1398800368)
     0.010 Timer/5033 msr:write_msr(msr: TSC_AUX, val: 4)
     0.050 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
    45.661 gnome-terminal/12595 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
    45.672 gnome-terminal/12595 msr:write_msr(msr: LSTAR, val: -1398800368)
    45.675 gnome-terminal/12595 msr:write_msr(msr: TSC_AUX, val: 3)
    54.852 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
   130.508 Timer/4050 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
   130.527 Timer/4050 msr:write_msr(msr: LSTAR, val: -1398800368)
   130.531 Timer/4050 msr:write_msr(msr: TSC_AUX, val: 3)
   140.924 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
   164.738 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
   603.578 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
   620.809 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
   690.115 JS Watchdog/4259 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
   690.136 JS Watchdog/4259 msr:write_msr(msr: LSTAR, val: -1398800368)
   690.141 JS Watchdog/4259 msr:write_msr(msr: TSC_AUX, val: 3)
   690.186 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
   759.016 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
^C[root@quaco ~]#

Or look at the first 3 write_msr events for that IA32_TSC_DEADLINE to learn why
it happens so often:

  # perf trace --max-events=3 --max-stack=8 -e msr:* --filter="msr==IA32_TSC_DEADLINE" --filter-pids 3750
     0.000 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 19296732550862)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       lapic_next_deadline ([kernel.kallsyms])
                                       clockevents_program_event ([kernel.kallsyms])
                                       hrtimer_interrupt ([kernel.kallsyms])
                                       smp_apic_timer_interrupt ([kernel.kallsyms])
                                       apic_timer_interrupt ([kernel.kallsyms])
                                       cpuidle_enter_state ([kernel.kallsyms])
    32.646 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 19296800134158)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       lapic_next_deadline ([kernel.kallsyms])
                                       clockevents_program_event ([kernel.kallsyms])
                                       hrtimer_start_range_ns ([kernel.kallsyms])
                                       tick_nohz_restart_sched_tick ([kernel.kallsyms])
                                       tick_nohz_idle_exit ([kernel.kallsyms])
                                       do_idle ([kernel.kallsyms])
    32.802 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 19297507436922)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       lapic_next_deadline ([kernel.kallsyms])
                                       clockevents_program_event ([kernel.kallsyms])
                                       hrtimer_try_to_cancel ([kernel.kallsyms])
                                       hrtimer_cancel ([kernel.kallsyms])
                                       tick_nohz_restart_sched_tick ([kernel.kallsyms])
                                       tick_nohz_idle_exit ([kernel.kallsyms])
  #

And if some of the strings can't be found:

  # trace -e msr:* --filter="msr!=SPECULATIVE_EXECUTION_PROBLEMS_SOLUTION && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL" --filter-pids 3750
  "SPECULATIVE_EXECUTION_PROBLEMS_SOLUTION" not found for "msr" in "msr:read_msr", can't set filter "(msr!=SPECULATIVE_EXECUTION_PROBLEMS_SOLUTION && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL) && (common_pid != 28131 && common_pid != 3750)"
  #

Next step is to automatically wire up the pre-existing strarrays, which there
are quite a few.

The strtoul() methods will be further enhanced to allow for looking at other
arguments in a syscall/tracepoint, just like going from integer to string
(scnprintf methods), so that those "val" lines for the msr tracepoints can be
properly formatted or even resolved into some string.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-4qaai5iqjgefd11k4ddm7qg8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-09 16:25:02 -03:00
Arnaldo Carvalho de Melo
90df0249c2 perf trace: Expand strings in filters to integers
So that one can try things like:

  # perf trace -e msr:* --filter="msr!=FS_BASE && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL" --filter-pids 3750

That, at this point in the patchset, without any strtoul in place for
tracepoint arguments, will result in:

  No resolver (strtoul) for "msr" in "msr:read_msr", can't set filter "(msr!=FS_BASE && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL) && (common_pid != 25407 && common_pid != 3750)"
  #

See you in the next cset!

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-dx5j70fv2rgkeezd1cb3hv2p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-09 16:22:16 -03:00
Arnaldo Carvalho de Melo
d0a3a10410 perf trace: Introduce a strtoul() method for 'struct strarrays'
And also for 'struct strarray', since its needed to implement
strarrays__strtoul(). This just traverses the entries and when finding a
match, returns (offset + index), i.e. the value associated with the
searched string.

E.g. "EFER" (MSR_EFER) returns:

  # grep -w EFER -B2 /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c
  #define x86_64_specific_MSRs_offset 0xc0000080
  static const char *x86_64_specific_MSRs[] = {
	[0xc0000080 - x86_64_specific_MSRs_offset] = "EFER",
  #

  0xc0000080

This will be auto-attached to 'struct syscall_arg_fmt' entries
associated with strarrays as soon as we add a ->strarray and ->strarrays
to 'struct syscall_arg_fmt'.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-r2hpaahf8lishyb1owko9vs1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-09 16:11:36 -03:00
Arnaldo Carvalho de Melo
3f41b77843 perf trace: Add a strtoul() method to 'struct syscall_arg_fmt'
This will go from a string to a number, so that filter expressions can
be constructed with strings and then, before applying the tracepoint
filters (or eBPF, in the future) we can map those strings to numbers.

The first one will be for 'msr' tracepoint arguments, but real quickly
we will be able to reuse all strarrays for that.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-wgqq48agcgr95b8dmn6fygtr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-09 16:06:43 -03:00
Arnaldo Carvalho de Melo
d4097f1937 perf trace: Introduce --filter for tracepoint events
Similar to what is in 'perf record', works just like there:

  # perf trace -e msr:*
   328.297 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
   328.302 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
   328.306 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
   328.317 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
   328.322 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
   328.327 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
   328.331 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
   328.336 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
   328.340 :0/0 ^Cmsr:write_msr(msr: FS_BASE, val: 140240388381888)
  #

So, for a system wide trace session looking at the write_msr tracepoint
we see a flood of MSR_FS_BASE, we need to get the number for that:

  # grep FS_BASE /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c
	[0xc0000100 - x86_64_specific_MSRs_offset] = "FS_BASE",
  #

And then use it in a filter:

  # perf trace -e msr:* --filter="msr!=0xc0000100"
  <SNIP>
   942.177 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056931068232)
   942.199 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3057135655252)
   942.203 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056931068222)
   942.231 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056998373022)
   942.241 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056931068236)
  <SNIP>
  #

Ok, lets filter that too, too noisy:

  # grep TSC_DEADLINE /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c
	[0x000006E0] = "IA32_TSC_DEADLINE",
  #

  # perf trace -e msr:* --filter="msr!=0xc0000100 && msr!=0x6e0" -a sleep 0.1
     0.000 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
     0.066 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
     0.070 CPU 0/KVM/4895 msr:write_msr(msr: 0x830, val: 34359740667)
     0.099 CPU 0/KVM/4895 msr:read_msr(msr: IA32_SYSENTER_ESP, val: -2199021993472)
     0.100 CPU 0/KVM/4895 msr:read_msr(msr: IA32_APICBASE, val: 4276096000)
     0.101 CPU 0/KVM/4895 msr:read_msr(msr: IA32_DEBUGCTLMSR)
     0.109 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL)
     1.000 :0/0 msr:write_msr(msr: 0x830, val: 17179871485)
    18.893 :0/0 msr:write_msr(msr: 0x83f, val: 246)
    28.810 :0/0 msr:write_msr(msr: 0x830, val: 68719479037)
    40.117 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
    40.127 CPU 0/KVM/4895 msr:read_msr(msr: IA32_DEBUGCTLMSR)
    40.139 CPU 0/KVM/4895 msr:write_msr(msr: LSTAR, val: -2130661312)
    40.141 CPU 0/KVM/4895 msr:write_msr(msr: SYSCALL_MASK, val: 14080)
    40.142 CPU 0/KVM/4895 msr:write_msr(msr: TSC_AUX)
    40.144 CPU 0/KVM/4895 msr:write_msr(msr: KERNEL_GS_BASE)
    40.147 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL)
    40.148 CPU 0/KVM/4895 msr:write_msr(msr: IA32_FLUSH_CMD, val: 1)
    40.151 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
  ^C
  #

One can combine that with filtering pids as well:

  # perf trace -e msr:* --filter="msr!=0xc0000100 && msr!=0x6e0" --filter-pids 4895 -a sleep 0.09
     0.000 :0/0 msr:write_msr(msr: 0x830, val: 4294969597)
     0.291 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
     0.294 gnome-terminal/2790 msr:write_msr(msr: LSTAR, val: -1935671280)
     0.295 gnome-terminal/2790 msr:write_msr(msr: TSC_AUX, val: 6)
    10.940 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597)
    15.943 gnome-shell/2096 msr:write_msr(msr: 0x830, val: 4294969597)
    16.975 :0/0 msr:write_msr(msr: 0x830, val: 4294969597)
    19.560 :0/0 msr:write_msr(msr: 0x83f, val: 246)
    25.162 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
    25.807 JS Watchdog/3635 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
    25.820 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL)
    25.941 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597)
    26.941 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597)
    29.942 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597)
    45.313 :0/0 msr:write_msr(msr: 0x83f, val: 246)
    56.945 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597)
    60.946 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597)
    74.096 JS Watchdog/8971 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
    74.130 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL)
    79.673 :0/0 msr:write_msr(msr: 0x83f, val: 246)
    79.947 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 17179871485)
  #

Or for just a pid, with callchains:

  # grep SYSCALL_MAS /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c
	[0xc0000084 - x86_64_specific_MSRs_offset] = "SYSCALL_MASK",
  # perf trace -e msr:* --filter="msr==0xc0000084" --pid 2790 --call-graph=dwarf

     0.000 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       kvm_on_user_return ([kvm])
                                       fire_user_return_notifiers ([kernel.kallsyms])
                                       exit_to_usermode_loop ([kernel.kallsyms])
                                       do_syscall_64 ([kernel.kallsyms])
                                       entry_SYSCALL_64 ([kernel.kallsyms])
                                       __GI___poll (inlined)
  9299.073 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       kvm_on_user_return ([kvm])
                                       fire_user_return_notifiers ([kernel.kallsyms])
                                       exit_to_usermode_loop ([kernel.kallsyms])
                                       do_syscall_64 ([kernel.kallsyms])
                                       entry_SYSCALL_64 ([kernel.kallsyms])
                                       __GI___poll (inlined)
  9348.374 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       kvm_on_user_return ([kvm])
                                       fire_user_return_notifiers ([kernel.kallsyms])
                                       exit_to_usermode_loop ([kernel.kallsyms])
                                       do_syscall_64 ([kernel.kallsyms])
                                       entry_SYSCALL_64 ([kernel.kallsyms])
                                       __GI___poll (inlined)
  <SNIP>
  #

Ok, just another form of KVM to emit MSRs :-)

Next step: elliminate those greps by getting the filter expression,
looking for arg names, then for the arrays associated with it to do a
reverse lookup.

Also allow those filters to be associated with strace-like syscall
names.

After that: augment the 'val' arg for 'msr:write_msr' based on the first
arg, 'msr'.

Then, do that with eBPF too, not just with tracepoint filters.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-95bfe5d4tzy5f66bx49d05rj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-09 11:23:52 -03:00
Arnaldo Carvalho de Melo
1827ab5ba8 perf evlist: Introduce append_tp_filter_pid() and append_tp_filter_pids()
We'll need this to support 'perf trace e tracepoint --filter=expr', as
the command line tracepoint filter is attache to the preceding evsel,
just like in 'perf record' and when we go to set pid filters, which we
do at the minimum to filter 'perf trace' own syscalls, we need to
append, not set the tp filter.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-daynpknni44ywuzi8iua57nn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-09 11:23:52 -03:00
Arnaldo Carvalho de Melo
53c92f7338 perf evlist: Introduce append_tp_filter() method
Will be used by 'perf trace' to support 'perf trace --filter', we need
to append to any pre-existing filter.

When parse_filter() gets invoked to process --filter, it'll set the
filter to that specified on the command line, later on, when we filter
out 'perf trace' own pid to avoid an event feedback loop, we need to
preserve the command line filter put in place by parse_filter().

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-h9rot08qmxlnfmte0holt68x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-09 11:23:52 -03:00
Arnaldo Carvalho de Melo
05cea4492c perf evlist: Factor out asprintf routine to build a tracepoint pid filter
Will be used to append such lists to existing filters.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-798vlyqfqw938ehoe8etivx1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-09 11:23:52 -03:00
Arnaldo Carvalho de Melo
c330ef2847 perf trace: Associate the "msr" tracepoint arg name with x86_MSR__scnprintf()
So that we can go from:

  # perf trace -e msr:write_msr --max-stack=16 sleep 1
       0.000 sleep/6740 msr:write_msr(msr: 3221225728, val: 139636317451648)
                                         do_trace_write_msr ([kernel.kallsyms])
                                         do_trace_write_msr ([kernel.kallsyms])
                                         do_arch_prctl_64 ([kernel.kallsyms])
                                         __x64_sys_arch_prctl ([kernel.kallsyms])
                                         do_syscall_64 ([kernel.kallsyms])
                                         entry_SYSCALL_64 ([kernel.kallsyms])
                                         init_tls (/usr/lib64/ld-2.29.so)
                                         dl_main (/usr/lib64/ld-2.29.so)
                                         _dl_sysdep_start (/usr/lib64/ld-2.29.so)
                                         _dl_start (/usr/lib64/ld-2.29.so)
  #

To:

  # perf trace -e msr:write_msr --max-stack=16 sleep 1
     0.000 sleep/8519 msr:write_msr(msr: FS_BASE, val: 139878031705472)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_arch_prctl_64 ([kernel.kallsyms])
                                       __x64_sys_arch_prctl ([kernel.kallsyms])
                                       do_syscall_64 ([kernel.kallsyms])
                                       entry_SYSCALL_64 ([kernel.kallsyms])
                                       init_tls (/usr/lib64/ld-2.29.so)
                                       dl_main (/usr/lib64/ld-2.29.so)
                                       _dl_sysdep_start (/usr/lib64/ld-2.29.so)
                                       _dl_start (/usr/lib64/ld-2.29.so)
  #

This, in reverse, will allow for symbolic system call/tracepoint
filtering.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-q1q4unmqja5ex7dy0kb5cjaa@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-09 11:23:52 -03:00
Arnaldo Carvalho de Melo
646b3e2cfb perf trace beauty: Add the glue for the autogenerated MSR arrays
We need to wrap those autogenerated string arrays with the
strarrays__scnprintf() formatter, do it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-wqjz4kwi4a0ot6lsis3kc65j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-09 11:23:52 -03:00
Arnaldo Carvalho de Melo
5d88099bc0 perf trace: Allow associating scnprintf routines with well known arg names
For instance 'msr' appears in several tracepoints, so we can associate
it with a single scnprintf() routine auto-generated from kernel headers,
as will be done in followup patches.

Start with an empty array of associations.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-89ptht6s5fez82lykuwq1eyb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-09 11:23:52 -03:00
Arnaldo Carvalho de Melo
fd21834704 perf beauty: Hook up the x86 MSR table generator
This way we generate the source with the table for later use by plugins,
etc.

I.e. after running:

  $ make -C tools/perf O=/tmp/build/perf

We end up with:

  $ head /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c
  static const char *x86_MSRs[] = {
  	[0x00000000] = "IA32_P5_MC_ADDR",
  	[0x00000001] = "IA32_P5_MC_TYPE",
  	[0x00000010] = "IA32_TSC",
  	[0x00000017] = "IA32_PLATFORM_ID",
  	[0x0000001b] = "IA32_APICBASE",
  	[0x00000020] = "KNC_PERFCTR0",
  	[0x00000021] = "KNC_PERFCTR1",
  	[0x00000028] = "KNC_EVNTSEL0",
  	[0x00000029] = "KNC_EVNTSEL1",
  $

Now its just a matter of using it, first in a libtracevent plugin.

At some point we should move tools/perf/trace/beauty to tools/beauty/,
so that it can be used more generally and even made available externally
like libbpf, libperf, libtraevent, etc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-b3rmutg4igcohx6kpo67qh4j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-09 11:23:52 -03:00
Arnaldo Carvalho de Melo
693d345818 perf trace beauty: Add a x86 MSR cmd id->str table generator
Without parameters it'll parse tools/arch/x86/include/asm/msr-index.h
and output a table usable by tools, that will be wired up later to a
libtraceevent plugin registered from perf's glue code:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh
  static const char *x86_MSRs[] = {
 <SNIP>
  	[0x00000034] = "SMI_COUNT",
  	[0x0000003a] = "IA32_FEATURE_CONTROL",
  	[0x0000003b] = "IA32_TSC_ADJUST",
  	[0x00000040] = "LBR_CORE_FROM",
  	[0x00000048] = "IA32_SPEC_CTRL",
  	[0x00000049] = "IA32_PRED_CMD",
 <SNIP>
  	[0x0000010b] = "IA32_FLUSH_CMD",
  	[0x0000010F] = "TSX_FORCE_ABORT",
 <SNIP>
  	[0x00000198] = "IA32_PERF_STATUS",
  	[0x00000199] = "IA32_PERF_CTL",
  <SNIP>
  	[0x00000da0] = "IA32_XSS",
  	[0x00000dc0] = "LBR_INFO_0",
  	[0x00000ffc] = "IA32_BNDCFGS_RSVD",
  };

  #define x86_64_specific_MSRs_offset 0xc0000080
  static const char *x86_64_specific_MSRs[] = {
  	[0xc0000080 - x86_64_specific_MSRs_offset] = "EFER",
  	[0xc0000081 - x86_64_specific_MSRs_offset] = "STAR",
  	[0xc0000082 - x86_64_specific_MSRs_offset] = "LSTAR",
  	[0xc0000083 - x86_64_specific_MSRs_offset] = "CSTAR",
  	[0xc0000084 - x86_64_specific_MSRs_offset] = "SYSCALL_MASK",
  <SNIP>
  	[0xc0000103 - x86_64_specific_MSRs_offset] = "TSC_AUX",
  	[0xc0000104 - x86_64_specific_MSRs_offset] = "AMD64_TSC_RATIO",
  };

  #define x86_AMD_V_KVM_MSRs_offset 0xc0010000
  static const char *x86_AMD_V_KVM_MSRs[] = {
  	[0xc0010000 - x86_AMD_V_KVM_MSRs_offset] = "K7_EVNTSEL0",
  <SNIP>
  	[0xc0010114 - x86_AMD_V_KVM_MSRs_offset] = "VM_CR",
  	[0xc0010115 - x86_AMD_V_KVM_MSRs_offset] = "VM_IGNNE",
  	[0xc0010117 - x86_AMD_V_KVM_MSRs_offset] = "VM_HSAVE_PA",
  <SNIP>
  	[0xc0010240 - x86_AMD_V_KVM_MSRs_offset] = "F15H_NB_PERF_CTL",
  	[0xc0010241 - x86_AMD_V_KVM_MSRs_offset] = "F15H_NB_PERF_CTR",
  	[0xc0010280 - x86_AMD_V_KVM_MSRs_offset] = "F15H_PTSC",
  };

Then these will in turn be hooked up in a follow up patch to be used by
strarrays__scnprintf().

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ja080xawx08kedez855usnon@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-09 11:23:52 -03:00
Arnaldo Carvalho de Melo
8d6505bae3 perf beauty: Make strarray's offset be u64
We need it for things like MSRs that are sparse and go over MAXINT.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-g8t2d0jr0mg3yimg2qrjkvlt@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-09 11:23:52 -03:00
Arnaldo Carvalho de Melo
444e2ff34d tools arch x86: Grab a copy of the file containing the MSR numbers
We'll use it to generate a table and then convert the
msr:{read,write}_msr 'msr' option in things like perf trace, script,
etc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-y1f4s0y1s43d4drh7pd2huzn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:18 -03:00
Arnaldo Carvalho de Melo
f11b2803bb perf trace: Allow choosing how to augment the tracepoint arguments
So far we used the libtraceevent printing routines when showing
tracepoint arguments, but since 'perf trace' has a lot of beautifiers
for syscall arguments, and since some of those can be used to augment
tracepoint arguments, add a routine to make use of those beautifiers
and allow the user to choose which one to use.

The default now is to use the same beautifiers used for the strace-like
sys_enter+sys_exit lines, but the user can choose the libtraceevent ones
by either using the:

    perf trace --libtraceevent_print

command line option, or by setting:

  # cat ~/.perfconfig
  [trace]
	tracepoint_beautifiers = libtraceevent

For instance, here are some examples:

  # perf trace -e sched:*switch,*sleep,sched:*wakeup,exit*,sched:*exit sleep 1
       0.000 sched:sched_wakeup(comm: "perf", pid: 5273 (perf), prio: 120, success: 1, target_cpu: 6)
       0.621 nanosleep(rqtp: 0x7ffdd06d1140, rmtp: NULL) ...
       0.628 sched:sched_switch(prev_comm: "sleep", prev_pid: 5273 (sleep), prev_prio: 120, prev_state: 1, next_comm: "swapper/6", next_pid: 0, next_prio: 120)
    1000.879 sched:sched_wakeup(comm: "sleep", pid: 5273 (sleep), prio: 120, success: 1, target_cpu: 6)
       0.621  ... [continued]: nanosleep())          = 0
    1001.026 exit_group(error_code: 0)               = ?
    1001.216 sched:sched_process_exit(comm: "sleep", pid: 5273 (sleep), prio: 120)
  #

And then using libtraceevent, as before:

  # perf trace --libtraceevent_print -e sched:*switch,*sleep,sched:*wakeup,exit*,sched:*exit sleep 1
       0.000 sched:sched_wakeup(comm=perf pid=5288 prio=120 target_cpu=001)
       0.739 nanosleep(rqtp: 0x7ffeba6c2f40, rmtp: NULL) ...
       0.747 sched:sched_switch(prev_comm=sleep prev_pid=5288 prev_prio=120 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120)
    1000.902 sched:sched_wakeup(comm=sleep pid=5288 prio=120 target_cpu=001)
       0.739  ... [continued]: nanosleep())          = 0
    1001.012 exit_group(error_code: 0)               = ?
  #

The new default allocates an array of 'struct syscall_arg_fmt' for the
tracepoint arguments and, just like with syscall arguments, tries to
find suitable syscall_arg__scnprintf_NAME() routines to augment those
tracepoint arguments based on their type (as in the tracefs "format"
file), or even in their name + type, for instance arguntents with names
ending in "fd" with type "int" get the fd scnprintf beautifier attached,
etc.

Soon this will take advantage of the kernel BTF information to augment
enumerations based on the tracefs "format" type info.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-o8qdluotkcb3b1x2gjqrejcl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:18 -03:00
Arnaldo Carvalho de Melo
311baaf93c perf trace: Enclose all events argument lists with ()
So that they look a bit like normal strace-like syscall enter+exit
lines.

They will look even more when we switch from using libtraceevent's
tep_print_event() routine in favour of using all the perf beautifiers
used by the strace-like syscall enter+exit lines.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-y4fcej6v6u1m644nbxd2r4pg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:18 -03:00
Arnaldo Carvalho de Melo
9597945d7f perf trace: Add array of chars scnprintf beautifier
Needed for sched's traceoints prev/next comm, where, unlike with
syscalls, we are not dealing with an integer or pointer, but an array
straight out from the ring buffer.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-rlll7tmcqe1g4odtaifil5re@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:18 -03:00
Arnaldo Carvalho de Melo
888ca854e2 perf trace: Add the syscall_arg_fmt pointer to syscall_arg
So that the scnprintf beautifiers can access it, as will be the case
with the char array one in the following csets, that needs to know
the number of elements in an array.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-01qmjqv6cb1nj1qy4khdexce@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:18 -03:00
Arnaldo Carvalho de Melo
3e0c9b2cfa perf trace: Move some scnprintf methods from syscall to syscall_arg_fmt
Since all they operate on is on a syscall_arg_fmt instance, so move them
to allow use it from the upcoming tracepoint fprintf routine.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ynttrs1l75f0x9tk67spd7jd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:18 -03:00
Arnaldo Carvalho de Melo
947b843cf5 perf trace: Allocate an array of beautifiers for tracepoint args
This will work similar to the syscall args, we'll allocate an array
of 'struct syscall_arg_fmt' for the tracepoint args and then init them
using the same algorithm used for the defaults for syscall args, i.e.
using its types and sometimes names as hints to find the right scnprintf
routine to beautify them from numbers into strings.

Next step is to stop using libtracevent to printf tracepoints, as we'll
have more beautifiers than int provides, modulo perhaps some plugins.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-dcl135relxvf6ljisjg13aqg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:18 -03:00
Arnaldo Carvalho de Melo
8d1d4ff5e2 perf trace: Factor out the initialization of syscal_arg_fmt->scnprintf
We set the default scnprint routines for the syscall args based on its
type or on heuristics based on its names, now we'll use this for
tracepoints as well, so move it out of syscall__set_arg_fmts() and into
a routine that receive just an array of syscall_arg_fmt entries + the
tracepoint format fields list.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-xs3x0zzyes06c7scdsjn01ty@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:18 -03:00
Andi Kleen
3714437d3f perf script: Allow --time with --reltime
The original --reltime patch forbid --time with --reltime.

But it turns out --time doesn't really care about --reltime, because the
relative time is only used at final output, while the time filtering
always works earlier on absolute time.

So just remove the check and allow combining the two options.

Fixes: 90b10f47c0 ("perf script: Support relative time")
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20191002164642.1719-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:18 -03:00
Björn Töpel
06f84d1989 perf tools: Make usage of test_attr__* optional for perf-sys.h
For users of perf-sys.h outside perf, e.g. samples/bpf/bpf_load.c, it's
convenient not to depend on test_attr__*.

After commit 91854f9a07 ("perf tools: Move everything related to
sys_perf_event_open() to perf-sys.h"), all users of perf-sys.h will
depend on test_attr__enabled and test_attr__open.

This commit enables a user to define HAVE_ATTR_TEST to zero in order
to omit the test dependency.

Fixes: 91854f9a07 ("perf tools: Move everything related to sys_perf_event_open() to perf-sys.h")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20191001113307.27796-2-bjorn.topel@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:17 -03:00
Adrian Hunter
b3700f21c2 perf scripts python: exported-sql-viewer.py: Add Time chart by CPU
Add a time chart based on context switch information.

Context switch information was added to the database export fairly
recently, so the chart menu option will only appear if context switch
information is in the database.

Refer to the Exported SQL Viewer Help option for more information about
the chart.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20190821083216.1340-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:17 -03:00
Adrian Hunter
e69d5df75d perf scripts python: exported-sql-viewer.py: Add ability for Call tree to open at a specified task and time
Add ability for Call tree to open at a specified task and time.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20190821083216.1340-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:17 -03:00
Adrian Hunter
da4264f5cf perf scripts python: exported-sql-viewer.py: Tidy up Call tree call_time
Record call_time on tree nodes and re-name the misnamed "count" parameter.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20190821083216.1340-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:17 -03:00
Adrian Hunter
9a9dae3655 perf scripts python: exported-sql-viewer.py: Add global time range calculations
Add calculations to determine a time range that encompasses all data.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20190821083216.1340-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:17 -03:00
Adrian Hunter
42c303ff9a perf scripts python: exported-sql-viewer.py: Add HBoxLayout and VBoxLayout
Add layout classes HBoxLayout and VBoxLayout.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20190821083216.1340-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:17 -03:00
Adrian Hunter
181ea40a24 perf scripts python: exported-sql-viewer.py: Add LookupModel()
Add LookupModel() to find a model in the model cache without creating it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20190821083216.1340-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:17 -03:00
Arnaldo Carvalho de Melo
8bd436b006 perf trace augmented_syscalls: Do not show syscalls when none was asked for
When not using augmented syscalls, i.e. not passing thru the command
line a eBPF source or object file event that provides the
__augmented_syscalls__ BPF_MAP_TYPE_PERF_EVENT_ARRAY, etc, as with:

   perf trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c

or passing that augmented eBPF source/object via the trace.add_events in
.perfconfig file, we were assuming that syscalls were asked for,
differing from when not using augmented syscalls at all.

This is confusing when using .perfconfig to hide the fact we're using
the augmenter, i.e. using:

 # perf trace -e sched:* sleep 1

Will show both the scheduler tracepoints and the syscalls, where what we
want is to show just the scheduler tracepoints.

To see the scheduler tracepoints and some specific syscall strace-like
formatting, one has to use:

  # perf trace -e sched:*,nanosleep sleep 1

Or, if wanting all the syscalls:

  # perf trace -e sched:* --syscalls sleep 1

This way 'perf trace' can be used to trace just a set of tracepoints
while allowing for mixing with strace-like when desired, by simply
adding to the mix the name of the syscalls to show in addition to the
tracepoints.

Fix it so that the behaviour using the eBPF based syscall augmenter is
the same as when not using one.

Testing:

Before this patch, with this ~/.perfconfig:

  # egrep -B1 ^[[:space:]]+add_events ~/.perfconfig
  [trace]
  	add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  #

That points to this pre-compiled eBPF syscall augmenter:

  # file /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o: ELF 64-bit LSB relocatable, eBPF, version 1 (SYSV), with debug_info, not stripped

And when asking for _only_ sched:sched_switch and sched:sched_wakeup we
were unconditionally getting all the syscalls formatted strace-like:

  # perf trace -e sched:*switch,sched:*wakeup sleep 1 |& tail
     0.633 fstat(3, 0x7fe11d030ac0)                = 0
     0.635 mmap(NULL, 217750512, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fe10fec5000
     0.643 close(3)                                = 0
     0.668 nanosleep(0x7fff649a3a90, NULL)      ...
     0.672 sched:sched_switch:prev_comm=sleep prev_pid=4417 prev_prio=120 prev_state=S ==> next_comm=swapper/6 next_pid=0 next_prio=120
  1000.822 sched:sched_wakeup:comm=sleep pid=4417 prio=120 target_cpu=006
     0.668  ... [continued]: nanosleep())          = 0
  1000.923 close(1)                                = 0
  1000.941 close(2)                                = 0
  1000.974 exit_group(0)                           = ?
  #

After the patch:

  # perf trace -e sched:*switch,sched:*wakeup sleep 1
     0.000 sched:sched_wakeup:comm=perf pid=5529 prio=120 target_cpu=005
     1.186 sched:sched_switch:prev_comm=sleep prev_pid=5529 prev_prio=120 prev_state=S ==> next_comm=swapper/5 next_pid=0 next_prio=120
  1001.573 sched:sched_wakeup:comm=sleep pid=5529 prio=120 target_cpu=005
  #

If we add the "open*" syscalls to the mix then the eBPF augmented _will_
be used and these syscalls will be traced together with the specified
sched tracepoints:

  # cd /sys/kernel/debug/tracing/events/syscalls/
  # ls -1d sys_enter_open*
  sys_enter_open
  sys_enter_openat
  sys_enter_open_by_handle_at
  sys_enter_open_tree
  #

  # perf trace -e open*,sched:*switch,sched:*wakeup sleep 1
       0.000 sched:sched_wakeup:comm=perf pid=5580 prio=120 target_cpu=005
       0.590 openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
       0.616 openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
       0.846 openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
       0.891 sched:sched_switch:prev_comm=sleep prev_pid=5580 prev_prio=120 prev_state=S ==> next_comm=swapper/5 next_pid=0 next_prio=120
    1001.005 sched:sched_wakeup:comm=sleep pid=5580 prio=120 target_cpu=005
  #

And as we can see, the pathnames were collected via the eBPF augmenters.

If we don't specify anything it'll trace all syscalls:

  # perf trace sleep 1 |& tail
       0.299 brk(0x5597543a3000)                     = 0x5597543a3000
       0.302 brk(NULL)                               = 0x5597543a3000
       0.307 openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
       0.313 fstat(3, 0x7feece50cac0)                = 0
       0.315 mmap(NULL, 217750512, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7feec13a1000
       0.323 close(3)                                = 0
       0.354 nanosleep(0x7ffe338856e0, NULL)         = 0
    1000.641 close(1)                                = 0
    1000.655 close(2)                                = 0
    1000.673 exit_group(0)                           = ?
  #

Ditto if we don't use .perfconfig's trace.add_events but instead pass
just the augmenter as a command line event:

  # vim ~/.perfconfig
  # egrep -B1 ^[[:space:]]+add_events ~/.perfconfig
  # perf trace -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o sleep 1 |& tail
       0.294 brk(0x55ae08ec3000)                     = 0x55ae08ec3000
       0.297 brk(NULL)                               = 0x55ae08ec3000
       0.302 openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
       0.309 fstat(3, 0x7f726488fac0)                = 0
       0.311 mmap(NULL, 217750512, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7257724000
       0.319 close(3)                                = 0
       0.347 nanosleep(0x7ffe23643a70, NULL)         = 0
    1000.560 close(1)                                = 0
    1000.575 close(2)                                = 0
    1000.593 exit_group(0)                           = ?
  #

As well as that + some syscall names for strace-like formatting:

  # perf trace -e socket,connect,/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o ssh localhost
       0.000 socket(PF_LOCAL, SOCK_STREAM|CLOEXEC|NONBLOCK, 0) = 3
       0.021 connect(3, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory)
       0.034 socket(PF_LOCAL, SOCK_STREAM|CLOEXEC|NONBLOCK, 0) = 3
       0.041 connect(3, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory)
       0.163 socket(PF_LOCAL, SOCK_STREAM, 0)        = 4
       0.185 connect(4, { .family: PF_LOCAL, path: /var/lib/sss/pipes/nss }, 110) = 0
       0.670 socket(PF_LOCAL, SOCK_STREAM|CLOEXEC|NONBLOCK, 0) = 7
       0.684 connect(7, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory)
       0.694 socket(PF_LOCAL, SOCK_STREAM|CLOEXEC|NONBLOCK, 0) = 7
       0.701 connect(7, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory)
       0.994 socket(PF_LOCAL, SOCK_STREAM|CLOEXEC|NONBLOCK, 0) = 5
       1.006 connect(5, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory)
       1.014 socket(PF_LOCAL, SOCK_STREAM|CLOEXEC|NONBLOCK, 0) = 5
       1.022 connect(5, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory)
       1.068 socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 5
       1.087 connect(5, { .family: PF_INET, port: 22, addr: 127.0.0.1 }, 16) = 0
      24.299 socket(PF_LOCAL, SOCK_STREAM, 0)        = 6
      24.337 connect(6, { .family: PF_LOCAL, path: /var/run/.heim_org.h5l.kcm-socket }, 110) = 0
      28.441 socket(PF_LOCAL, SOCK_STREAM, 0)        = 6
      28.516 connect(6, { .family: PF_LOCAL, path: /var/run/.heim_org.h5l.kcm-socket }, 110) = 0
  root@localhost's password:^C
  #

Everything works without augmenters:

  # egrep -B1 ^[[:space:]]+add_events ~/.perfconfig
  # perf trace sleep 1 |& tail
       0.261 brk(0x5635068ac000)                     = 0x5635068ac000
       0.264 brk(NULL)                               = 0x5635068ac000
       0.268 openat(AT_FDCWD, 0xdce642a0, O_RDONLY|O_CLOEXEC) = 3
       0.275 fstat(3, 0x7f3fdce97ac0)                = 0
       0.277 mmap(NULL, 217750512, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f3fcfd2c000
       0.284 close(3)                                = 0
       0.310 nanosleep(0x7ffdaea6ecd0, NULL)         = 0
    1000.552 close(1)                                = 0
    1000.565 close(2)                                = 0
    1000.580 exit_group(0)                           = ?
  #

  # perf trace -e connect ssh localhost
       0.000 connect(3, 0x58266930, 110)             = -1 ENOENT (No such file or directory)
       0.022 connect(3, 0x58266af0, 110)             = -1 ENOENT (No such file or directory)
       0.150 connect(4, 0x58266b00, 110)             = 0
       0.490 connect(7, 0x58264150, 110)             = -1 ENOENT (No such file or directory)
       0.505 connect(7, 0x58264300, 110)             = -1 ENOENT (No such file or directory)
       0.832 connect(5, 0x58266220, 110)             = -1 ENOENT (No such file or directory)
       0.847 connect(5, 0x582663e0, 110)             = -1 ENOENT (No such file or directory)
       0.899 connect(5, 0x95ba0630, 16)              = 0
      25.619 connect(6, 0x58266360, 110)             = 0
      40.564 connect(6, 0x58266330, 110)             = 0
  root@localhost's password: ^C
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-624f6jxic04031tnt40va4dd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:17 -03:00
Arnaldo Carvalho de Melo
7e035929f3 perf trace: Postpone parsing .perfconfig trace.add_events to after --verbose is processed
When we add events via the '[trace]' section in perfconfig the command
line options are not yet processed, so when something goes wrong with
parsing those events and using --verbose is advised, we end up not
getting any more verbosity by doing so.

So just copy the trace.add_events string for later processing, after we
processed --verbose and the other command line options.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-d6wbnz85ftqljdll6ynjyjd8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:17 -03:00
Arnaldo Carvalho de Melo
bcddbfc5c8 perf trace: Generalize the syscall_fmt find routines
To allow them to be used with other stuff, such as tracepoints.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-od3gzg77ppqgnnrxqv40fvgx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:17 -03:00
Arnaldo Carvalho de Melo
9b2036cd32 perf trace: Separate 'struct syscall_fmt' definition from syscall_fmts variable
As this has all the things needed to format tracepoints events, not just
syscalls, that, after all, are just tracepoints with a set in stone ABI,
i.e. order and number of parameters.

For tracepoints we'll create a

  static struct syscall_fmt tracepoint_fmts[]

array and will fill the ->arg[] entries with the beautifier for each
positional argument and record the name, then, when we need it, we'll
just check that the position has the same name, maybe even type, so that
we can do some check that the tracepoint hasn't changed, if it has, we
can even reorder things.

Keep calling it syscall_fmt but use it as well for tracepoints, do it
this way to minimize changes and reuse what is in place for syscalls,
we'll see.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-2x1jgiev13zt4njaanlnne0d@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:17 -03:00
Arnaldo Carvalho de Melo
206d635aa5 perf trace: Make evlist__set_evsel_handler() affect just entries without a handler
Renaming it to evlist__set_default_evsel_handler(), to better reflect
what we want to do, which is to set a default handler for events we
still haven't set a custom handler, like the ones for "msr:write_msr",
etc that are coming soon.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-e1bit7upnpmtsayh8039kfuw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:17 -03:00
Arnaldo Carvalho de Melo
c0e53476ab perf evlist: Adopt __set_tracepoint_handlers method from perf_session
It all operates on the evsels in the session's evlist, so move it to the
evlist layer to make it useful to tools not using perf_session, just
evlists, like 'perf trace' in live mode.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-9oc53gnfi53vg82fvolkm85g@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:17 -03:00
Arnaldo Carvalho de Melo
608127f737 perf top: Initialize perf_env->cpuid, needed by the per arch annotation init routine
Just read it so that later on the per arch init routine can use it,
e.g. x86__annotate_init().

When using a perf.data file this is obtained from a header that was put
there by 'perf record', and then it may be for another machine, another
arch.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-4t4n3o8l8s0tc2b1pq53hyr4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:17 -03:00
Arnaldo Carvalho de Melo
f1cedfb828 perf env: Add routine to read the env->cpuid from the running machine
In 'perf top' we use that cpuid when initializing the per arch
annotation init routines (e.g. x86__annotate_init()) and in that case
(live mode, 'perf top') we need to obtain it from the running machine,
not from a perf.data file header.

Provide a means to do that. Will be used by 'perf top' in a followup
patch.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-h2wb3sx7u7znx6lqfezrh7ca@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-10-07 12:22:17 -03:00
Arnaldo Carvalho de Melo
11aad897f6 perf annotate: Don't return -1 for error when doing BPF disassembly
Return errno when open_memstream() fails and add two new speciall error
codes for when an invalid, non BPF file or one without BTF is passed to
symbol__disassemble_bpf(), so that its callers can rely on
symbol__strerror_disassemble() to convert that to a human readable error
message that can help figure out what is wrong, with hints even.

Cc: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Song Liu <songliubraving@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/n/tip-usevw9r2gcipfcrbpaueurw0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-30 17:30:06 -03:00
Arnaldo Carvalho de Melo
16ed3c1e91 perf annotate: Return appropriate error code for allocation failures
We should return errno or the annotation extra range understood by
symbol__strerror_disassemble() instead of -1, fix it, returning ENOMEM
instead.

Reported-by: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/n/tip-8of1cmj3rz0mppfcshc9bbqq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-30 17:30:04 -03:00
Arnaldo Carvalho de Melo
42d7a9107d perf annotate: Fix arch specific ->init() failure errors
They are called from symbol__annotate() and to propagate errors that can
help understand the problem make them return what
symbol__strerror_disassemble() known, i.e. errno codes and other
annotation specific errors in a special, out of errnos, range.

Reported-by: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/n/tip-pqx7srcv7tixgid251aeboj6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-30 17:30:03 -03:00
Arnaldo Carvalho de Melo
211f493b61 perf annotate: Propagate the symbol__annotate() error return
We were just returning -1 in symbol__annotate() when symbol__annotate()
failed, propagate its error as it is used later to pass to
symbol__strerror_disassemble() to present a error message to the user,
that in some cases were getting:

  "Invalid -1 error code"

Fix it to propagate the error.

Reported-by: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/n/tip-0tj89rs9g7nbcyd5skadlvuu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-30 17:30:01 -03:00
Arnaldo Carvalho de Melo
28f4417c33 perf annotate: Fix the signedness of failure returns
Callers of symbol__annotate() expect a errno value or some other
extended error value range in symbol__strerror_disassemble() to
convert to a proper error string, fix it when propagating a failure to
find the arch specific annotation routines via arch__find(arch_name).

Reported-by: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/n/tip-o0k6dw7cas0vvmjjvgsyvu1i@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-30 17:30:00 -03:00
Arnaldo Carvalho de Melo
a66fa0619a perf annotate: Propagate perf_env__arch() error
The callers of symbol__annotate2() use symbol__strerror_disassemble() to
convert its failure returns into a human readable string, so
propagate error values from functions it calls, starting with
perf_env__arch() that when fails the right thing to do is to look at
'errno' to see why its possible call to uname() failed.

Reported-by: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/n/tip-it5d83kyusfhb1q1b0l4pxzs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-30 17:29:58 -03:00
Arnaldo Carvalho de Melo
9db0e3635f perf evsel: Fall back to global 'perf_env' in perf_evsel__env()
I.e. if evsel->evlist or evsel->evlist->env isn't set, return the
environment for the running machine, as that would be set if reading
from a perf.data file.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-uqq4grmhbi12rwb0lfpo6lfu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-30 17:29:57 -03:00
Arnaldo Carvalho de Melo
f67001a4a0 perf tools: Propagate get_cpuid() error
For consistency, propagate the exact cause for get_cpuid() to have
failed.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-9ig269f7ktnhh99g4l15vpu2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-30 17:29:54 -03:00
Andi Kleen
6bdfd9f118 perf jevents: Fix period for Intel fixed counters
The Intel fixed counters use a special table to override the JSON
information.

During this override the period information from the JSON file got
dropped, which results in inst_retired.any and similar running with
frequency mode instead of a period.

Just specify the expected period in the table.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20190927233546.11533-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-30 17:29:53 -03:00
Andi Kleen
e98df280bc perf script brstackinsn: Fix recovery from LBR/binary mismatch
When the LBR data and the instructions in a binary do not match the loop
printing instructions could get confused and print a long stream of
bogus <bad> instructions.

The problem was that if the instruction decoder cannot decode an
instruction it ilen wasn't initialized, so the loop going through the
basic block would continue with the previous value.

Harden the code to avoid such problems:

- Make sure ilen is always freshly initialized and is 0 for bad
  instructions.

- Do not overrun the code buffer while printing instructions

- Print a warning message if the final jump is not on an instruction
  boundary.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20190927233546.11533-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-30 17:29:52 -03:00
Steve MacLean
2657983b4c perf docs: Correct and clarify jitdump spec
Specification claims latest version of jitdump file format is 2. Current
jit dump reading code treats 1 as the latest version.

Correct spec to match code.

The original language made it unclear the value to be written in the
magic field.

Revise language that the writer always writes the same value. Specify
that the reader uses the value to detect endian mismatches.

Signed-off-by: Steve MacLean <Steve.MacLean@Microsoft.com>
Acked-by: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Brian Robbins <brianrob@microsoft.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Keeping <john@metanate.com>
Cc: John Salem <josalem@microsoft.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Tom McDonald <thomas.mcdonald@microsoft.com>
Link: http://lore.kernel.org/lkml/BN8PR21MB1362F63CDE7AC69736FC7F9EF7800@BN8PR21MB1362.namprd21.prod.outlook.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-30 17:29:51 -03:00
Steve MacLean
b59711e9b0 perf inject jit: Fix JIT_CODE_MOVE filename
During perf inject --jit, JIT_CODE_MOVE records were injecting MMAP records
with an incorrect filename. Specifically it was missing the ".so" suffix.

Further the JIT_CODE_LOAD record were silently truncating the
jr->load.code_index field to 32 bits before generating the filename.

Make both records emit the same filename based on the full 64 bit
code_index field.

Fixes: 9b07e27f88 ("perf inject: Add jitdump mmap injection support")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Steve MacLean <Steve.MacLean@Microsoft.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Brian Robbins <brianrob@microsoft.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
Cc: John Keeping <john@metanate.com>
Cc: John Salem <josalem@microsoft.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom McDonald <thomas.mcdonald@microsoft.com>
Link: http://lore.kernel.org/lkml/BN8PR21MB1362FF8F127B31DBF4121528F7800@BN8PR21MB1362.namprd21.prod.outlook.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-30 17:29:49 -03:00
Steve MacLean
ee212d6ea2 perf map: Fix overlapped map handling
Whenever an mmap/mmap2 event occurs, the map tree must be updated to add a new
entry. If a new map overlaps a previous map, the overlapped section of the
previous map is effectively unmapped, but the non-overlapping sections are
still valid.

maps__fixup_overlappings() is responsible for creating any new map entries from
the previously overlapped map. It optionally creates a before and an after map.

When creating the after map the existing code failed to adjust the map.pgoff.
This meant the new after map would incorrectly calculate the file offset
for the ip. This results in incorrect symbol name resolution for any ip in the
after region.

Make maps__fixup_overlappings() correctly populate map.pgoff.

Add an assert that new mapping matches old mapping at the beginning of
the after map.

Committer-testing:

Validated correct parsing of libcoreclr.so symbols from .NET Core 3.0 preview9
(which didn't strip symbols).

Preparation:

  ~/dotnet3.0-preview9/dotnet new webapi -o perfSymbol
  cd perfSymbol
  ~/dotnet3.0-preview9/dotnet publish
  perf record ~/dotnet3.0-preview9/dotnet \
      bin/Debug/netcoreapp3.0/publish/perfSymbol.dll
  ^C

Before:

  perf script --show-mmap-events 2>&1 | grep -e MMAP -e unknown |\
     grep libcoreclr.so | head -n 4
        dotnet  1907 373352.698780: PERF_RECORD_MMAP2 1907/1907: \
            [0x7fe615726000(0x768000) @ 0 08:02 5510620 765057155]: \
            r-xp .../3.0.0-preview9-19423-09/libcoreclr.so
        dotnet  1907 373352.701091: PERF_RECORD_MMAP2 1907/1907: \
            [0x7fe615974000(0x1000) @ 0x24e000 08:02 5510620 765057155]: \
            rwxp .../3.0.0-preview9-19423-09/libcoreclr.so
        dotnet  1907 373352.701241: PERF_RECORD_MMAP2 1907/1907: \
            [0x7fe615c42000(0x1000) @ 0x51c000 08:02 5510620 765057155]: \
            rwxp .../3.0.0-preview9-19423-09/libcoreclr.so
        dotnet  1907 373352.705249:     250000 cpu-clock: \
             7fe6159a1f99 [unknown] \
             (.../3.0.0-preview9-19423-09/libcoreclr.so)

After:

  perf script --show-mmap-events 2>&1 | grep -e MMAP -e unknown |\
     grep libcoreclr.so | head -n 4
        dotnet  1907 373352.698780: PERF_RECORD_MMAP2 1907/1907: \
            [0x7fe615726000(0x768000) @ 0 08:02 5510620 765057155]: \
            r-xp .../3.0.0-preview9-19423-09/libcoreclr.so
        dotnet  1907 373352.701091: PERF_RECORD_MMAP2 1907/1907: \
            [0x7fe615974000(0x1000) @ 0x24e000 08:02 5510620 765057155]: \
            rwxp .../3.0.0-preview9-19423-09/libcoreclr.so
        dotnet  1907 373352.701241: PERF_RECORD_MMAP2 1907/1907: \
            [0x7fe615c42000(0x1000) @ 0x51c000 08:02 5510620 765057155]: \
            rwxp .../3.0.0-preview9-19423-09/libcoreclr.so

All the [unknown] symbols were resolved.

Signed-off-by: Steve MacLean <Steve.MacLean@Microsoft.com>
Tested-by: Brian Robbins <brianrob@microsoft.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
Cc: John Keeping <john@metanate.com>
Cc: John Salem <josalem@microsoft.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom McDonald <thomas.mcdonald@microsoft.com>
Link: http://lore.kernel.org/lkml/BN8PR21MB136270949F22A6A02335C238F7800@BN8PR21MB1362.namprd21.prod.outlook.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-30 17:29:46 -03:00
Thomas Richter
0d0e5ecec6 perf vendor events s390: Use s390 machine name instead of type 8561
In the pmu-events directory for JSON file definitions use the
official machine name IBM z15 instead of machine type number
8561. This is consistent with previous machines.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20190927081147.18345-2-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-30 17:29:45 -03:00
Thomas Richter
02d0847922 perf vendor events s390: Add JSON transaction for machine type 8561
Add s390 transaction counter definition for machine 8561. This is the
same file as for the predecessor machine.

Fixes: 6e67d77d67 ("perf vendor events s390: Add JSON files for machine type 8561")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20190927081147.18345-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-30 17:29:42 -03:00
Ian Rogers
7d4c85b703 perf llvm: Don't access out-of-scope array
The 'test_dir' variable is assigned to the 'release' array which is
out-of-scope 3 lines later.

Extend the scope of the 'release' array so that an out-of-scope array
isn't accessed.

Bug detected by clang's address sanitizer.

Fixes: 07bc5c699a ("perf tools: Make fetch_kernel_version() publicly available")
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/20190926220018.25402-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-30 17:29:35 -03:00
Arnaldo Carvalho de Melo
0ae4061223 tools headers uapi: Sync linux/fs.h with the kernel sources
To pick the changes from:

  78a1b96bcf ("fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS ioctl")
  23c688b540 ("fscrypt: allow unprivileged users to add/remove keys for v2 policies")
  5dae460c22 ("fscrypt: v2 encryption policy support")
  5a7e29924d ("fscrypt: add FS_IOC_GET_ENCRYPTION_KEY_STATUS ioctl")
  b1c0ec3599 ("fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY ioctl")
  22d94f493b ("fscrypt: add FS_IOC_ADD_ENCRYPTION_KEY ioctl")
  3b6df59bc4 ("fscrypt: use FSCRYPT_* definitions, not FS_*")
  2336d0deb2 ("fscrypt: use FSCRYPT_ prefix for uapi constants")
  7af0ab0d3a ("fs, fscrypt: move uapi definitions to new header <linux/fscrypt.h>")

That don't trigger any changes in tooling, as it so far is used only
for:

  $ grep -l 'fs\.h' tools/perf/trace/beauty/*.sh | xargs grep regex=
  tools/perf/trace/beauty/rename_flags.sh:regex='^[[:space:]]*#[[:space:]]*define[[:space:]]+RENAME_([[:alnum:]_]+)[[:space:]]+\(1[[:space:]]*<<[[:space:]]*([[:xdigit:]]+)[[:space:]]*\)[[:space:]]*.*'
  tools/perf/trace/beauty/sync_file_range.sh:regex='^[[:space:]]*#[[:space:]]*define[[:space:]]+SYNC_FILE_RANGE_([[:alnum:]_]+)[[:space:]]+([[:xdigit:]]+)[[:space:]]*.*'
  tools/perf/trace/beauty/usbdevfs_ioctl.sh:regex="^#[[:space:]]*define[[:space:]]+USBDEVFS_(\w+)(\(\w+\))?[[:space:]]+_IO[CWR]{0,2}\([[:space:]]*(_IOC_\w+,[[:space:]]*)?'U'[[:space:]]*,[[:space:]]*([[:digit:]]+).*"
  tools/perf/trace/beauty/usbdevfs_ioctl.sh:regex="^#[[:space:]]*define[[:space:]]+USBDEVFS_(\w+)[[:space:]]+_IO[WR]{0,2}\([[:space:]]*'U'[[:space:]]*,[[:space:]]*([[:digit:]]+).*"
  $

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/fs.h' differs from latest version at 'include/uapi/linux/fs.h'
  diff -u tools/include/uapi/linux/fs.h include/uapi/linux/fs.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-44g48exl9br9ba0t64chqb4i@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-30 17:29:22 -03:00
Ian Rogers
d586ac10ce perf docs: Allow man page date to be specified
With this change if a perf_date parameter is provided to asciidoc then
it will override the default date written to the man page metadata.

Without this change, or if the perf_date isn't specified, then the
current date is written to the metadata.

Having this parameter allows the metadata to be constant if builds
happen on different dates.

The name of the parameter is intended to be consistent with the existing
perf_version parameter.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20190921041327.155054-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-27 09:26:14 -03:00
Ian Rogers
e3e2cf3d5b perf tests: Avoid raising SEGV using an obvious NULL dereference
An optimized build such as:

  make -C tools/perf CLANG=1 CC=clang EXTRA_CFLAGS="-O3

will turn the dereference operation into a ud2 instruction, raising a
SIGILL rather than a SIGSEGV. Use raise(..) for correctness and clarity.

Similar issues were addressed in Numfor Mbiziwo-Tiapo's patch:

  https://lkml.org/lkml/2019/7/8/1234

Committer testing:

Before:

  [root@quaco ~]# perf test hooks
  55: perf hooks                                            : Ok
  [root@quaco ~]# perf test -v hooks
  55: perf hooks                                            :
  --- start ---
  test child forked, pid 17092
  SIGSEGV is observed as expected, try to recover.
  Fatal error (SEGFAULT) in perf hook 'test'
  test child finished with 0
  ---- end ----
  perf hooks: Ok
  [root@quaco ~]#

After:

  [root@quaco ~]# perf test hooks
  55: perf hooks                                            : Ok
  [root@quaco ~]# perf test -v hooks
  55: perf hooks                                            :
  --- start ---
  test child forked, pid 17909
  SIGSEGV is observed as expected, try to recover.
  Fatal error (SEGFAULT) in perf hook 'test'
  test child finished with 0
  ---- end ----
  perf hooks: Ok
  [root@quaco ~]#

Fixes: a074865e60 ("perf tools: Introduce perf hooks")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/20190925195924.152834-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-27 09:26:14 -03:00
Arnaldo Carvalho de Melo
26acf400d2 perf unwind: Fix libunwind build failure on i386 systems
Naresh Kamboju reported, that on the i386 build pr_err()
doesn't get defined properly due to header ordering:

  perf-in.o: In function `libunwind__x86_reg_id':
  tools/perf/util/libunwind/../../arch/x86/util/unwind-libunwind.c:109:
  undefined reference to `pr_err'

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-26 21:59:38 +02:00
Arnaldo Carvalho de Melo
d6840d87b2 perf parser: Remove needless include directives
They go on accumulating there like the debug.h one, that was introduced
here:

  f23610245c ("perf list: Add debug support for outputing alias string")

But then, when that need is removed via:

  2073ad3326 ("perf tools: Factor out PMU matching in parser")

The thing stays there, so continue the house cleaning spree...

list.h not needed, no macros from there are used, and 'struct
list_head' is in linux/types.h, ditto for util.h, no need for that as
well.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-zkxr3mf6inun8m5mbnil4u0d@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 16:26:41 -03:00
Thomas Richter
815c1560bf perf build: Add detection of java-11-openjdk-devel package
With Java 11 there is no seperate JRE anymore.

Details:

  https://coderanch.com/t/701603/java/JRE-JDK

Therefore the detection of the JRE needs to be adapted.

This change works for s390 and x86.  I have not tested other platforms.

Committer testing:

Continues to work with the OpenJDK 8:

  $ rm -f ~acme/lib64/libperf-jvmti.so
  $ rpm -qa | grep jdk-devel
  java-1.8.0-openjdk-devel-1.8.0.222.b10-0.fc30.x86_64
  $ git log --oneline -1
  a51937170f33 (HEAD -> perf/core) perf build: Add detection of java-11-openjdk-devel package
  $ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ; make -C tools/perf O=/tmp/build/perf install > /dev/null 2>1
  $ ls -la ~acme/lib64/libperf-jvmti.so
  -rwxr-xr-x. 1 acme acme 230744 Sep 24 16:46 /home/acme/lib64/libperf-jvmti.so
  $

Suggested-by: Andreas Krebbel <krebbel@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20190909114116.50469-4-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 16:26:41 -03:00
Thomas Richter
61bf4ee29d perf jvmti: Include JVMTI support for s390
Enable JVMTI support for s390 perf tool chain.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20190909114116.50469-3-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 16:26:41 -03:00
Mamatha Inamdar
28b951760c perf vendor events: Remove P8 HW events which are not supported
This patch is to remove following hardware events
from JSON file which are not supported on POWER8.

pm_l3_p0_grp_pump
pm_l3_p0_lco_data
pm_l3_p0_lco_no_data
pm_l3_p0_lco_rty

  Note: Unfortunately power8 event list is not publicly available.

Fixes: c3b4d5c4af ("perf vendor events: Remove P8 HW events which are not supported")
Signed-off-by: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20190909065624.11956.3992.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 16:26:40 -03:00
Andi Kleen
7834fa948b perf evlist: Fix access of freed id arrays
I'm not fully sure if this is the correct fix, but without this I get
crashes on more complex perf stat metric usages. The problem is that
part of the state gets freed when a weak group fails, but then is later
still used. Just don't free the ids, we're going to reuse them anyways
on the weak group retry.

For example:

  % perf stat -M IpB,IpCall,IpTB,IPC,Retiring_SMT,Frontend_Bound_SMT,Kernel_Utilization,CPU_Utilization --metric-only -a -I 1000 sleep 2

  crashes and gives in valgrind:

  =21527== Invalid write of size 8
  ==21527==    at 0x4EE582: hlist_add_head (list.h:644)
  ==21527==    by 0x4EFD3C: perf_evlist__id_hash (evlist.c:477)
  ==21527==    by 0x4EFD99: perf_evlist__id_add (evlist.c:483)
  ==21527==    by 0x4EFF15: perf_evlist__id_add_fd (evlist.c:524)
  ==21527==    by 0x4FC693: store_evsel_ids (evsel.c:2969)
  ==21527==    by 0x4FC76C: perf_evsel__store_ids (evsel.c:2986)
  ==21527==    by 0x450DA7: __run_perf_stat (builtin-stat.c:519)
  ==21527==    by 0x451285: run_perf_stat (builtin-stat.c:636)
  ==21527==    by 0x454619: cmd_stat (builtin-stat.c:1966)
  ==21527==    by 0x4D557D: run_builtin (perf.c:310)
  ==21527==    by 0x4D57EA: handle_internal_command (perf.c:362)
  ==21527==    by 0x4D5931: run_argv (perf.c:406)
  ==21527==  Address 0x12e3f008 is 104 bytes inside a block of size 2,056 free'd
  ==21527==    at 0x4839A0C: free (vg_replace_malloc.c:540)
  ==21527==    by 0x627139: xyarray__delete (xyarray.c:32)
  ==21527==    by 0x4F6BE4: perf_evsel__free_id (evsel.c:1253)
  ==21527==    by 0x4FA11F: evsel__close (evsel.c:1994)
  ==21527==    by 0x4F30A3: perf_evlist__reset_weak_group (evlist.c:1783)
  ==21527==    by 0x450B47: __run_perf_stat (builtin-stat.c:466)
  ==21527==    by 0x451285: run_perf_stat (builtin-stat.c:636)
  ==21527==    by 0x454619: cmd_stat (builtin-stat.c:1966)
  ==21527==    by 0x4D557D: run_builtin (perf.c:310)
  ==21527==    by 0x4D57EA: handle_internal_command (perf.c:362)
  ==21527==    by 0x4D5931: run_argv (perf.c:406)
  ==21527==    by 0x4D5CAE: main (perf.c:531)
  ==21527==  Block was alloc'd at
  ==21527==    at 0x483AB1A: calloc (vg_replace_malloc.c:762)
  ==21527==    by 0x627024: zalloc (zalloc.c:8)
  ==21527==    by 0x627088: xyarray__new (xyarray.c:10)
  ==21527==    by 0x4F6B20: perf_evsel__alloc_id (evsel.c:1237)
  ==21527==    by 0x4FC74E: perf_evsel__store_ids (evsel.c:2983)
  ==21527==    by 0x450DA7: __run_perf_stat (builtin-stat.c:519)
  ==21527==    by 0x451285: run_perf_stat (builtin-stat.c:636)
  ==21527==    by 0x454619: cmd_stat (builtin-stat.c:1966)
  ==21527==    by 0x4D557D: run_builtin (perf.c:310)
  ==21527==    by 0x4D57EA: handle_internal_command (perf.c:362)
  ==21527==    by 0x4D5931: run_argv (perf.c:406)
  ==21527==    by 0x4D5CAE: main (perf.c:531)

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lore.kernel.org/lkml/20190923233339.25326-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 16:26:40 -03:00
Andi Kleen
6f6473c37d perf stat: Fix free memory access / memory leaks in metrics
Make sure to not free the name passed in by the caller, but free all the
allocated ids when parsing expressions.

The loop at the end knows that the first entry shouldn't be freed, so
make sure the caller name is the first entry.

Fixes

  % perf stat -M IpB,IpCall,IpTB,IPC,Retiring_SMT,Frontend_Bound_SMT,Kernel_Utilization,CPU_Utilization --metric-only -a -I 1000 sleep 2

  valgrind:
       1.009943231 ==21527== Invalid read of size 1
  ==21527==    at 0x483CB74: strcmp (vg_replace_strmem.c:849)
  ==21527==    by 0x582CF8: collect_all_aliases (stat-display.c:554)
  ==21527==    by 0x582EB3: collect_data (stat-display.c:577)
  ==21527==    by 0x583A32: print_counter_aggr (stat-display.c:806)
  ==21527==    by 0x584FAD: perf_evlist__print_counters (stat-display.c:1200)
  ==21527==    by 0x45133A: print_counters (builtin-stat.c:655)
  ==21527==    by 0x450629: process_interval (builtin-stat.c:353)
  ==21527==    by 0x450FBD: __run_perf_stat (builtin-stat.c:564)
  ==21527==    by 0x451285: run_perf_stat (builtin-stat.c:636)
  ==21527==    by 0x454619: cmd_stat (builtin-stat.c:1966)
  ==21527==    by 0x4D557D: run_builtin (perf.c:310)
  ==21527==    by 0x4D57EA: handle_internal_command (perf.c:362)
  ==21527==  Address 0x12826cd0 is 0 bytes inside a block of size 25 free'd
  ==21527==    at 0x4839A0C: free (vg_replace_malloc.c:540)
  ==21527==    by 0x627041: __zfree (zalloc.c:13)
  ==21527==    by 0x57F66A: generic_metric (stat-shadow.c:814)
  ==21527==    by 0x580B21: perf_stat__print_shadow_stats (stat-shadow.c:1057)
  ==21527==    by 0x58418E: print_metric_headers (stat-display.c:943)
  ==21527==    by 0x5844BC: print_interval (stat-display.c:1004)
  ==21527==    by 0x584DEB: perf_evlist__print_counters (stat-display.c:1172)
  ==21527==    by 0x45133A: print_counters (builtin-stat.c:655)
  ==21527==    by 0x450629: process_interval (builtin-stat.c:353)
  ==21527==    by 0x450FBD: __run_perf_stat (builtin-stat.c:564)
  ==21527==    by 0x451285: run_perf_stat (builtin-stat.c:636)
  ==21527==    by 0x454619: cmd_stat (builtin-stat.c:1966)
  ==21527==  Block was alloc'd at
  ==21527==    at 0x483880B: malloc (vg_replace_malloc.c:309)
  ==21527==    by 0x51677DE: strdup (in /usr/lib64/libc-2.29.so)
  ==21527==    by 0x506457: parse_events_name (parse-events.c:1754)
  ==21527==    by 0x5550BB: parse_events_parse (parse-events.y:214)
  ==21527==    by 0x50694D: parse_events__scanner (parse-events.c:1887)
  ==21527==    by 0x506AEF: parse_events (parse-events.c:1927)
  ==21527==    by 0x521D8B: metricgroup__parse_groups (metricgroup.c:527)
  ==21527==    by 0x45156F: parse_metric_groups (builtin-stat.c:721)
  ==21527==    by 0x6228A9: get_value (parse-options.c:243)
  ==21527==    by 0x62363F: parse_short_opt (parse-options.c:348)
  ==21527==    by 0x62363F: parse_options_step (parse-options.c:536)
  ==21527==    by 0x62363F: parse_options_subcommand (parse-options.c:651)
  ==21527==    by 0x453C1D: cmd_stat (builtin-stat.c:1718)
  ==21527==    by 0x4D557D: run_builtin (perf.c:310)

and also a leak report.

Committer testing:

Before:

  # perf stat -M IpB,IpCall,IpTB,IPC,Retiring_SMT,Frontend_Bound_SMT,Kernel_Utilization,CPU_Utilization --metric-only -a -I 1000 sleep 2
  #           time      CPU_Utilization
       1.000470810                      free(): double free detected in tcache 2
  Aborted (core dumped)
  #

After:

  # perf stat -M IpB,IpCall,IpTB,IPC,Retiring_SMT,Frontend_Bound_SMT,Kernel_Utilization,CPU_Utilization --metric-only -a -I 1000 sleep 2
  #           time      CPU_Utilization
       1.000494752                  0.1
       2.001105112                  0.1
  #

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lore.kernel.org/lkml/20190923233339.25326-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 16:26:40 -03:00
Arnaldo Carvalho de Melo
252a2fdc74 perf tools: Replace needless mmap.h with what is needed, event.h
The perf_sample struct definition and the event_attr_init() are in
util/event.h, but some places were getting it thru an otherwise needless
util/mmap.h header, fix it by including util/event.h directly.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-p1anwyjdbbvghrkl9dlxv7h5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 16:26:40 -03:00
Arnaldo Carvalho de Melo
95be9d197d perf evsel: Move config terms to a separate header
Further reducing the size of util/evsel.h.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-20zr7di9eynm0272mtjfdhfc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 16:26:40 -03:00
Arnaldo Carvalho de Melo
bd70462062 perf evlist: Remove unused perf_evlist__fprintf() method
Ditch it, noone is using it, one more stdio.h include in a hot header.

Fix the fallout in parse-events.y, where we end up using a FILE pointer,
I think due to YYDEBUG being set and in some places, like Amazon Linux 1
we don't get stdio.h included by luck, like in most other places, add a
explicit stdio.h include directive.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-37k5q0lhdbo2hvvfbnnzn7og@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 16:26:40 -03:00
Arnaldo Carvalho de Melo
ca1252779f perf evsel: Introduce evsel_fprintf.h
We already had evsel_fprintf.c, add its counterpart, so that we can
reduce evsel.h a bit more.

We needed a new perf_event_attr_fprintf.c file so as to have a separate
object to link with the python binding in tools/perf/util/python-ext-sources
and not drag symbol_conf, etc into the python binding.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-06bdmt1062d9unzgqmxwlv88@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 16:26:34 -03:00
Arnaldo Carvalho de Melo
9620bc361a perf evsel: Remove need for symbol_conf in evsel_fprintf.c
So that we an later link it to the python binding without having to
drag the symbol object files.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-8823tveyasocnuoelq4qopwf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 15:06:59 -03:00
Arnaldo Carvalho de Melo
32ff3fec07 perf copyfile: Move copyfile routines to separate files
Further reducing the util.c hodgepodge files.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-0i62zh7ok25znibyebgq0qs4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:49 -03:00
Jiri Olsa
80ab2987a0 libperf: Add perf_evlist__poll() function
Move perf_evlist__poll() from tools/perf to libperf, it will be used in
the following patches.

And rename the existing perf's function to evlist__poll().

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-39-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:49 -03:00
Jiri Olsa
f4009e7bf7 libperf: Add perf_evlist__add_pollfd() function
Move perf_evlist__add_pollfd() from tools/perf to libperf, it will be
used in the following patches.

Also rename perf's perf_evlist__add_pollfd()/perf_evlist__filter_pollfd()
to evlist__add_pollfd()/evlist__filter_pollfd().

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-38-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:49 -03:00
Jiri Olsa
31f67fc462 libperf: Add perf_evlist__alloc_pollfd() function
Move perf_evlist__alloc_pollfd() from tools/perf to libperf, it will be
used in the following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-37-jolsa@kernel.org
[ Added api/fd/array.h include to the lib/evlist.c file ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:49 -03:00
Jiri Olsa
379dd98c3d libperf: Add libperf_init() call to the tests
Add libperf_init() call to the automated tests.

Committer notes:

Added missing stdarg.h and/or stdio.h to places using vfprintf.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-34-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:49 -03:00
Jiri Olsa
428dab813a libperf: Merge libperf_set_print() into libperf_init()
The libperf_set_print() function needs to be called in any case so let's
merge it with libperf_init(), so we have just one init function.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-34-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:49 -03:00
Jiri Olsa
fb4bf51fcc libperf: Add libperf dependency for tests targets
Add libperf dependency for tests targets.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-36-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:49 -03:00
Arnaldo Carvalho de Melo
7634d5336a libperf: Use sys/types.h to get ssize_t, not unistd.h
The sys/types.h header looks more sensible, from its name we can gather
it should be there because of some needed typedef, and it is much
smaller than unistd.h, so use it and fix up the fallout in places where
it was being used for something else entirely but being obtained by
sheer luck, indirectly.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-49bn251httu22ymwgipeavmy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:49 -03:00
Arnaldo Carvalho de Melo
26049111c3 perf tools: No need to include internal/lib.h from util/util.h
That was done just to have users of writen() and readn(), that before
had their prototypes in util/util.h to get it without having to add an
include for internal/lib.h, but the right way is to add it and by now
all places already do it.

Fix a fallout were readlink() was used but unistd.h was being obtained
by luck thru util.h -> internal/lib.h, now to check why unistd.h is
being included there...

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-lcnytgrtafey3kwlfog2rzzj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:48 -03:00
Jiri Olsa
20f2be1d48 libperf: Move 'page_size' global variable to libperf
We need the 'page_size' variable in libperf, so move it there.

Add a libperf_init() as a global libperf init function to obtain this
value via sysconf() at tool start.

Committer notes:

Add internal/lib.h to tools/perf/ files using 'page_size', sometimes
replacing util.h with it if that was the only reason for having util.h
included.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-33-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:48 -03:00
Jiri Olsa
d5a99483de libperf: Add perf_evlist__id_add_fd() function
Add the perf_evlist__id_add_fd() function to libperf as an internal
function.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-32-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:48 -03:00
Jiri Olsa
b0031c2281 libperf: Add perf_evlist__id_add() function
Add the perf_evlist__id_add() function to libperf as an internal
function.  We already have the 'heads' member in 'struct perf_evlist'.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-31-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:48 -03:00
Jiri Olsa
ff47d86a0d libperf: Add perf_evlist__read_format() function
Add the perf_evlist__read_format() function to libperf as internal
function.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-30-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:48 -03:00
Jiri Olsa
515dbe48f6 libperf: Add perf_evlist__first()/last() functions
Add perf_evlist__first()/last() functions to libperf, as internal
functions and rename perf's origins to evlist__first/last.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-29-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:48 -03:00
Jiri Olsa
70c20369ee libperf: Add perf_evsel__alloc_id/perf_evsel__free_id functions
Add perf_evsel__alloc_id()/perf_evsel__free_id() functions to libperf as
internal functions.

Move 'struct perf_sample_id' to internal/evsel.h header and change
'struct perf_sample_id::evsel' to 'struct perf_evsel' and the related
code that touches it.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-28-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:48 -03:00
Jiri Olsa
1d5af02d7a libperf: Move 'heads' from 'struct evlist' to 'struct perf_evlist'
Move 'heads' hash table from 'struct evlist' to 'struct perf_evlist'.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-27-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:48 -03:00
Jiri Olsa
e7eb9002d4 libperf: Move 'ids' from 'struct evsel' to 'struct perf_evsel'
Move 'ids' from 'struct evsel' to libperf's 'struct perf_evsel'.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-26-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:47 -03:00
Jiri Olsa
deaf321913 libperf: Move 'id' from 'struct evsel' to 'struct perf_evsel'
Move the 'id' array from 'struct evsel' to libperf's 'struct perf_evsel'.

Committer note:

Fix the tools/perf/util/cs-etm.c build, i.e. aarch64's CoreSight.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-25-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:47 -03:00
Jiri Olsa
8cd36f3ef4 libperf: Move 'sample_id' from 'struct evsel' to 'struct perf_evsel'
Move 'sample_id' array from 'struct evsel' to libperf's 'struct perf_evsel'.

Committer notes:

Removed the 'struct xyarray' from util/evsel.h, not needed anymore
there.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-24-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:47 -03:00
Arnaldo Carvalho de Melo
fee92b4442 libperf: Add missing 'struct xyarray' forward declaration
We were getting it by luck, from files included before internal/evsel.h
where it is being included.

Fixes: 9dfcb75990 ("libperf: Move fd array from perf's evsel to lobperf's perf_evsel class")
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lkml.kernel.org/n/tip-r8ukhxprpkflbd2k9vcc42v1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:47 -03:00
Jiri Olsa
40cb2d5141 libperf: Move 'pollfd' from 'struct evlist' to 'struct perf_evlist'
Moving 'pollfd' from 'struct evlist' to 'struct perf_evlist' it will be
used in following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-23-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:47 -03:00
Jiri Olsa
f6fa437577 libperf: Move 'mmap_len' from 'struct evlist' to 'struct perf_evlist'
Moving 'mmap_len' from 'struct evlist' to 'struct perf_evlist' it will
be used in following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-22-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:47 -03:00
Jiri Olsa
c976ee11a0 libperf: Move 'nr_mmaps' from 'struct evlist' to 'struct perf_evlist'
Moving 'nr_mmaps' from 'struct evlist' to 'struct perf_evlist', it will
be used in following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-21-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:47 -03:00
Jiri Olsa
648b5af3f3 libperf: Move 'system_wide' from 'struct evsel' to 'struct perf_evsel'
Move the 'system_wide 'member from perf's evsel to libperf's perf_evsel.

Committer notes:

Added stdbool.h as we now use bool here.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-20-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:46 -03:00
Jiri Olsa
65aa2e6bae libperf: Add 'flush' to 'struct perf_mmap'
Move 'flush' from tools/perf's mmap to libperf's perf_mmap struct.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-19-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:46 -03:00
Jiri Olsa
4443e6d770 libperf: Add 'event_copy' to 'struct perf_mmap'
Move 'event_copy' from tools/perf's mmap to libperf's perf_mmap struct.

Committer notes:

Add linux/compiler.h as we need it for '__aligned'.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-18-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:46 -03:00
Jiri Olsa
8df7a86981 libperf: Add 'overwrite' to 'struct perf_mmap'
Move 'overwrite' from tools/perf's mmap to libperf's perf_mmap struct.

Committer notes:

Add stdbool.h as we start using 'bool'.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-17-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:46 -03:00
Jiri Olsa
ebe4d72bba libperf: Add prev/start/end to struct perf_mmap
Move prev/start/end from tools/perf's mmap to libperf's perf_mmap struct.

Committer notes:

Add linux/types.h as we use u64.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-16-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:46 -03:00
Jiri Olsa
e03edfeac0 libperf: Add 'refcnt' to struct perf_mmap
Move 'refcnt' from tools/perf's mmap to libperf's perf_mmap struct.

Committer notes:

Add the refcount.h include directive here, now it is needed.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-15-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:46 -03:00
Jiri Olsa
56a94706cd libperf: Add 'cpu' to struct perf_mmap
Move 'cpu' from tools/perf's mmap to libperf's perf_mmap struct.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-14-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:45 -03:00
Jiri Olsa
2cf07b294a libperf: Add 'fd' to struct perf_mmap
Move 'fd' from tools/perf's mmap to libperf's perf_mmap struct.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-13-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:45 -03:00
Jiri Olsa
4fd0cef2c7 libperf: Add 'mask' to struct perf_mmap
Move 'mask' from tools/perf's mmap to libperf's perf_mmap struct.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-12-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:45 -03:00
Jiri Olsa
547740f7b3 libperf: Add perf_mmap struct
Add the perf_mmap struct to libperf.

The definition is added into:

  include/internal/mmap.h

which is not to be included by users, but shared within perf and
libperf.

Committer notes:

Remove unnecessary includes from tools/perf/lib/include/internal/mmap.h,
those will be readded as they become necessary, later in the series.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-11-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:45 -03:00
Arnaldo Carvalho de Melo
e0fcfb086f perf evlist: Adopt backwards ring buffer state enum
As this isn't used at all in mmap.h but in evlist.h, so to cut down the
header dependency tree, move it to where it is used.

Also add mmap.h to the places using it but previously getting it
indirectly via evlist.h.

Add missing pthread.h to evlist.h, as it has a pthread_t struct member
and was getting the header via mmap.h.

Noticed while processing a Jiri's libperf batch touching mmap.h, where
almost everything gets rebuilt because evlist.h is so popular, so cut
down't this rebuild the world party.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lkml.kernel.org/n/tip-he0uljeftl0xfveh3d6vtode@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:45 -03:00
Jiri Olsa
d80a5540bc libperf: Link libapi.a in libperf.so
Linking libapi.a in libperf.so, because we are about to use some of the
API functions in it.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:45 -03:00
Jiri Olsa
e6b1878d4e perf tools: Rename perf_evlist__purge() to evlist__purge()
Rename (perf_evlist__purge) to evlist__purge(), so we don't have a
name clash when we add (perf_evlist__purge) in libperf.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:45 -03:00
Jiri Olsa
470579b021 perf tools: Rename perf_evlist__exit() to evlist__exit()
Rename perf_evlist__exit() to evlist__exit(), so we don't have a name
clash when we add perf_evlist__exit() to libperf.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:44 -03:00
Jiri Olsa
d50cf36115 perf tools: Rename perf_evlist__alloc_mmap() to evlist__alloc_mmap()
Rename perf_evlist__alloc_mmap() to evlist__alloc_mmap(), so we don't
have a name clash when we add perf_evlist__alloc_mmap() to libperf.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:44 -03:00
Jiri Olsa
db6b7b1385 perf tools: Rename perf_evlist__munmap() to evlist__munmap()
Rename perf_evlist__munmap() to evlist__munmap(), so we don't have a
name clash when we add perf_evlist__munmap() in libperf.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:44 -03:00
Jiri Olsa
9521b5f2d9 perf tools: Rename perf_evlist__mmap() to evlist__mmap()
Rename perf_evlist__mmap() to evlist__mmap(), so we don't have a name
clash when we add perf_evlist__mmap() in libperf.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:44 -03:00
Jiri Olsa
a583053299 perf tools: Rename 'struct perf_mmap' to 'struct mmap'
Rename 'struct perf_evlist' to 'struct evlist', so we don't have a name
clash when we add 'struct perf_mmap' to libperf.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:44 -03:00