Commit graph

12762 commits

Author SHA1 Message Date
Namhyung Kim
5f534a8181 perf test: Do not compare overheads in the zstd comp test
The overhead can vary on each run so it'd make the test failed
sometimes.  Also order of hist entry can change.

Use perf report -F option to omit the overhead field and sort the
result alphabetically.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20210812235738.1684583-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-13 10:45:36 -03:00
Jin Yao
1d3351e631 perf tools: Enable on a list of CPUs for hybrid
The 'perf record' and 'perf stat' commands have supported the option
'-C/--cpus' to count or collect only on the list of CPUs provided. This
option needs to be supported for hybrid as well.

For hybrid support, it needs to check that the cpu list are available
on hybrid PMU. One example for AlderLake, cpu0-7 is 'cpu_core', cpu8-11
is 'cpu_atom'.

Before:

  # perf stat -e cpu_core/cycles/ -C11 -- sleep 1

   Performance counter stats for 'CPU(s) 11':

     <not supported>      cpu_core/cycles/

         1.006179431 seconds time elapsed

The 'perf stat' command silently returned "<not supported>" without any
helpful information. It should error out pointing out that that cpu11
was not 'cpu_core'.

After:

  # perf stat -e cpu_core/cycles/ -C11 -- sleep 1
  WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7)
  failed to use cpu list 11

We also need to support the events without pmu prefix specified.

  # perf stat -e cycles -C11 -- sleep 1
  WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7)

   Performance counter stats for 'CPU(s) 11':

           1,067,373      cpu_atom/cycles/

         1.005544738 seconds time elapsed

The perf tool creates two cycles events automatically, cpu_core/cycles/ and
cpu_atom/cycles/. It checks that cpu11 is not 'cpu_core', then shows a warning
for cpu_core/cycles/ and only count the cpu_atom/cycles/.

If part of cpus are 'cpu_core' and part of cpus are 'cpu_atom', for example,

  # perf stat -e cycles -C0,11 -- sleep 1
  WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.

   Performance counter stats for 'CPU(s) 0,11':

           1,914,704      cpu_core/cycles/
           2,036,983      cpu_atom/cycles/

         1.005815641 seconds time elapsed

It now automatically selects cpu0 for cpu_core/cycles/, selects cpu11 for
cpu_atom/cycles/, and output with some warnings.

Some more complex examples,

  # perf stat -e cycles,instructions -C0,11 -- sleep 1
  WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
  WARNING: use 0 in 'cpu_core' for 'instructions', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'instructions', skip other cpus in list.

   Performance counter stats for 'CPU(s) 0,11':

           2,780,387      cpu_core/cycles/
           1,583,432      cpu_atom/cycles/
           3,957,277      cpu_core/instructions/
           1,167,089      cpu_atom/instructions/

         1.006005124 seconds time elapsed

  # perf stat -e cycles,cpu_atom/instructions/ -C0,11 -- sleep 1
  WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'cpu_atom/instructions/', skip other cpus in list.

   Performance counter stats for 'CPU(s) 0,11':

           3,290,301      cpu_core/cycles/
           1,953,073      cpu_atom/cycles/
           1,407,869      cpu_atom/instructions/

         1.006260912 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https //lore.kernel.org/r/20210723063433.7318-4-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 16:07:32 -03:00
Jin Yao
b726e3634e perf tools: Create hybrid flag in target
The user may count or collect only on a cpu list via '-C/--cpus' option.

Previously cpus for an evsel were retrieved from PMU's sysfs. But if the
target cpu list is defined, the retrieved cpus are not kept and the
target cpu list is used instead.

But for hybrid system, we can't directly use target cpu list. The cpu
list may not be available on hybrid pmu (e.g. cpu_core or cpu_atom).  So
we should not set the 'has_user_cpus' flag for hybrid system.

The difficulity is that we can't call perf_pmu__has_hybrid() in evlist.c
to check hybrid system otherwise 'perf test python' would be failed
(undefined symbol for perf_pmu__has_hybrid). If we add pmu.c to
python-ext-sources, too many symbol dependencies are hard to resolve.

We use an alternative method by using a new 'hybrid' flag in target
for hybrid system checking.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https //lore.kernel.org/r/20210723063433.7318-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 16:04:33 -03:00
Riccardo Mancini
ebdf90a4a1 perf test: Make --skip work on shell tests
perf-test has the option --skip to provide a list of tests to skip.
However, this option does not work with shell scripts.

This patch passes the skiplist to run_shell_tests, so that also shell
scripts could be skipped using --skip.

Committer tests:

Tests 79 onwards are shell tests:

Before:

  # perf test --skip 1,2,81,82,84,88,90
   1: vmlinux symtab matches kallsyms                                 : Skip (user override)
   2: Detect openat syscall event                                     : Skip (user override)
   3: Detect openat syscall event on all cpus                         : Ok
   4: Read samples using the mmap interface                           : Ok
   5: Test data source output                                         : Ok
  <SNIP>
  78: x86 Sample parsing                                              : Ok
  79: build id cache operations                                       : Ok
  80: daemon operations                                               : Ok
  81: perf pipe recording and injection test                          : Ok
  82: Add vfs_getname probe to get syscall args filenames             : FAILED!
  83: probe libc's inet_pton & backtrace it with ping                 : Ok
  84: Use vfs_getname probe to get syscall args filenames             : FAILED!
  85: Zstd perf.data compression/decompression                        : Ok
  86: perf stat csv summary test                                      : Ok
  87: perf stat metrics (shadow stat) test                            : Ok
  88: perf stat --bpf-counters test                                   : Ok
  89: Check Arm CoreSight trace data recording and synthesized samples: Skip
  90: Check open filename arg using perf trace + vfs_getname          : FAILED!
  #

After:

  # perf test --skip 1,2,81,82,84,88,90
   1: vmlinux symtab matches kallsyms                                 : Skip (user override)
   2: Detect openat syscall event                                     : Skip (user override)
   3: Detect openat syscall event on all cpus                         : Ok
   4: Read samples using the mmap interface                           : Ok
   5: Test data source output                                         : Ok
  <SNIP>
  78: x86 Sample parsing                                              : Ok
  79: build id cache operations                                       : Ok
  80: daemon operations                                               : Ok
  81: perf pipe recording and injection test                          : Skip (user override)
  82: Add vfs_getname probe to get syscall args filenames             : Skip (user override)
  83: probe libc's inet_pton & backtrace it with ping                 : Ok
  84: Use vfs_getname probe to get syscall args filenames             : Skip (user override)
  85: Zstd perf.data compression/decompression                        : Ok
  86: perf stat csv summary test                                      : Ok
  87: perf stat metrics (shadow stat) test                            : Ok
  88: perf stat --bpf-counters test                                   : Skip (user override)
  89: Check Arm CoreSight trace data recording and synthesized samples: Skip
  90: Check open filename arg using perf trace + vfs_getname          : Skip (user override)
  #

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210811180625.160944-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 15:45:08 -03:00
Adrian Hunter
9f9c9a8de2 perf tests: Add dlfilter test
Add a perf test to test the dlfilter C API.

A perf.data file is synthesized and then processed by perf script with a
dlfilter named dlfilter-test-api-v0.so. Also a C file is compiled to
provide a dso to match the synthesized perf.data file.

Committer testing:

  [root@five ~]# perf test dlfilter
  72: dlfilter C API                                                  : Ok
  [root@five ~]# perf test -v dlfilter
  72: dlfilter C API                                                  :
  --- start ---
  test child forked, pid 3387712
  Checking for gcc
  Command: gcc --version
  gcc (GCC) 11.1.1 20210531 (Red Hat 11.1.1-3)
  Copyright (C) 2021 Free Software Foundation, Inc.
  This is free software; see the source for copying conditions.  There is NO
  warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

  dlfilters path: /var/home/acme/libexec/perf-core/dlfilters
  Command: gcc -g -o /tmp/dlfilter-test-3387712-prog /tmp/dlfilter-test-3387712-prog.c
  Creating new host machine structure
  Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 0 --dlarg last
  start API
  filter_event_early API
  filter_event API
  stop API
  Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 1 --dlarg last
  start API
  filter_event_early API
  filter_event API
  stop API
  Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 2 --dlarg last
  start API
  filter_event_early API
  stop API
  test child finished with 0
  ---- end ----
  dlfilter C API: Ok
  [root@five ~]#

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https //lore.kernel.org/r/20210811101036.17986-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 09:35:44 -03:00
Adrian Hunter
3af1dfdd51 perf build: Move perf_dlfilters.h in the source tree
Move perf_dlfilters.h in the source tree so that it will be found when
building dlfilters as part of the perf build.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https //lore.kernel.org/r/20210811101036.17986-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 09:35:24 -03:00
Adrian Hunter
b29edf35ef perf dlfilter: Amend documentation wrt library dependencies
Like all locally-built programs, dlfilters may need to be re-built if
shared libraries they use change. Also there may be unexpected results
if the dfilter uses different versions of the shared libraries that perf
uses.

Note those things in the documentation.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https //lore.kernel.org/r/20210811101036.17986-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 09:34:31 -03:00
Adrian Hunter
3e8e226307 perf script: Fix --list-dlfilters documentation
The option --list-dlfilters does use a string value.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 638e2b9984 ("perf script Add option to list dlfilters")
Link: https //lore.kernel.org/r/20210811101036.17986-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 09:34:07 -03:00
Adrian Hunter
29159727aa perf script: Fix unnecessary machine_resolve()
machine_resolve() may have already been called. Test for that to avoid
calling it again unnecessarily.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https //lore.kernel.org/r/20210811101036.17986-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 09:33:59 -03:00
Adrian Hunter
988db17932 perf script: Fix documented const'ness of perf_dlfilter_fns
perf_dlfilter_fns must not be const, because it is not.

Declaring it const can result in it being mapped read-only, causing a
segfaullt when it is written. Update documentation accordingly.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 8defa7147d5572 ("perf script Add API for filtering via dynamically loaded shared object")
Link: https //lore.kernel.org/r/20210811101036.17986-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 09:33:15 -03:00
Jin Yao
c4ad8fabd0 perf vendor events: Update metrics for SkyLake Server
Update JSON metrics for SkyLake Server.

Based on TMA metrics 4.21 at 01.org.
https://download.01.org/perfmon/

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linux-kernel@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https //lore.kernel.org/r/20210810020508.31261-7-yao.jin@linux.intel.com
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
2021-08-10 15:19:49 -03:00
Jin Yao
d5c0a8d554 perf vendor events intel: Update uncore event list for SkyLake Server
Update JSON uncore events for SkyLake Server.

Based on JSON list v1.24:

https://download.01.org/perfmon/SKX/

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linux-kernel@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https //lore.kernel.org/r/20210810020508.31261-6-yao.jin@linux.intel.com
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
2021-08-10 15:19:34 -03:00
Jin Yao
2c72404e95 perf vendor events intel: Update core event list for SkyLake Server
Update JSON core events for SkyLake Server.

Based on JSON list v1.24:

https://download.01.org/perfmon/SKX/

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linux-kernel@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https //lore.kernel.org/r/20210810020508.31261-5-yao.jin@linux.intel.com
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
2021-08-10 15:19:12 -03:00
Jin Yao
ed97cc6cbb perf vendor events: Update metrics for CascadeLake Server
Update JSON metrics for CascadeLake Server.

Based on TMA metrics 4.21 at 01.org.
https://download.01.org/perfmon/

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linux-kernel@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https //lore.kernel.org/r/20210810020508.31261-4-yao.jin@linux.intel.com
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
2021-08-10 15:18:36 -03:00
Jin Yao
96fe584f99 perf vendor events intel: Update uncore event list for CascadeLake Server
Update JSON uncore events for CascadeLake Server.

Based on JSON list v1.11:

https://download.01.org/perfmon/CLX/

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linux-kernel@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https //lore.kernel.org/r/20210810020508.31261-3-yao.jin@linux.intel.com
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
2021-08-10 15:12:04 -03:00
Jin Yao
e0ddfd8d50 perf vendor events intel: Update core event list for CascadeLake Server
Update JSON core events for CascadeLake Server.

Based on JSON list v1.11:

https://download.01.org/perfmon/CLX/

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linux-kernel@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https //lore.kernel.org/r/20210810020508.31261-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 15:08:21 -03:00
John Garry
8ee465a181 perf test: Add pmu-events sys event support
Add support for system events, along with core and uncore events.

Support for a sample PMU is also added.

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-12-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 14:47:52 -03:00
John Garry
5abd3988b0 perf jevents: Print SoC name per system event table
Print the SoC name per system event table, which will allow the test SoC be
identified by the pmu-events test.

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-11-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 14:47:07 -03:00
John Garry
e199f47f15 perf pmu: Make pmu_add_sys_aliases() public
Function pmu_add_sys_aliases() will be required for the PMU events test
for system events aliases, so make it public.

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-10-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 14:46:39 -03:00
John Garry
6a86657fbc perf test: Add more pmu-events uncore aliases
Add more events to cover the scenarios fixed and also inadvertently
broken by commit c47a5599ed ("perf tools: Fix pattern matching for
same substring in different PMU type")

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-9-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 14:45:53 -03:00
John Garry
5a65c0c8f6 perf test: Re-add pmu-event uncore PMU alias test
Add support to match aliases for uncore PMUs.

Since we cannot rely on the PMUs being present on the host system, use
fake PMUs.

The following conditions in the test are ensures:

- Expected count of aliases created

- All aliases can be matched to an expected alias in
  perf_pmu_test_pmu.aliases

This will catch the condition fixed in commit c47a5599ed ("perf tools:
Fix pattern matching for same substring in different PMU type"), where
excess events were created for a PMU. It will also fix the scenario
inadvertently broken there, where no aliases were created for aliases
with multiple tokens.

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-8-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 14:44:51 -03:00
John Garry
5806099a2e perf pmu: Check .is_uncore field in pmu_add_cpu_aliases_map()
Calling pmu_is_uncore() for fake PMUs does not work, as it checks sysfs
for the PMU details (which won't exist).

Check .is_uncore field instead, which makes sense anyway.

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-7-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 14:44:00 -03:00
John Garry
3bc4526b30 perf test: Test pmu-events core aliases separately
The current method to test uncore event aliasing is limited, as it
relies on the uncore PMU being present in the host system to test.

As such, breakages of uncore PMU aliases goes unnoticed. To make this
more robust, a new method of testing uncore PMUs with fake PMUs will be
used in future. This will be separate to testing core PMU aliases.

So make the current test function core PMU only. Uncore PMU alias
support will be re-added later.

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-6-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 14:42:35 -03:00
John Garry
e386acd790 perf test: Factor out pmu-events alias comparison
Factor out alias test which will be used in multiple places.

Also test missing fields.

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-5-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 14:35:24 -03:00
John Garry
c81e823ff8 perf test: Declare pmu-events test events separately
Currently all test events are put into arrays of test events.

Create pointer arrays of test events instead, so the test events may be
referenced later for tighter alias verification.

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-4-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 14:34:35 -03:00
John Garry
35267cea90 perf jevents: Relocate test events to cpu folder
In future to add support for sys events, relocate the core and uncore
events to a cpu folder.

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 11:47:09 -03:00
John Garry
19ac3df32f perf test: Factor out pmu-events event comparison
Factor out event comparison which will be used in multiple places.

Also test "pmu" and "compat" fields.

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 11:46:43 -03:00
John Garry
517db3b595 perf jevents: Make build dependency on test JSONs
Currently all JSONs and the mapfile for an arch are dependencies for
building pmu-events.c

The test JSONs are missing as a dependency, so add them.

Signed-off-by: John Garry <john.garry@huawei.com>
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/90094733-741c-50e5-ac7d-f5640b5f0bdd@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 11:39:36 -03:00
Riccardo Mancini
4241eabf59 perf bench: Add benchmark for evlist open/close operations
This new benchmark finds the total time that is taken to open, mmap,
enable, disable, munmap, close an evlist (time taken for new,
create_maps, config, delete is not counted in).

The evlist can be configured as in perf-record using the
-a,-C,-e,-u,--per-thread,-t,-p options.

The events can be duplicated in the evlist to quickly test performance
with many events using the -n options.

Furthermore, also the number of iterations used to calculate the
statistics is customizable.

Examples:
- Open one dummy event system-wide:

  $ sudo ./perf bench internals evlist-open-close
    Number of cpus:       4
    Number of threads:    1
    Number of events:     1 (4 fds)
    Number of iterations: 100
    Average open-close took: 613.870 usec (+- 32.852 usec)

- Open the group '{cs,cycles}' on CPU 0

  $ sudo ./perf bench internals evlist-open-close -e '{cs,cycles}' -C 0
    Number of cpus:       1
    Number of threads:    1
    Number of events:     2 (2 fds)
    Number of iterations: 100
    Average open-close took: 8503.220 usec (+- 252.652 usec)

- Open 10 'cycles' events for user 0, calculate average over 100 runs

  $ sudo ./perf bench internals evlist-open-close -e cycles -n 10 -u 0 -i 100
    Number of cpus:       4
    Number of threads:    328
    Number of events:     10 (13120 fds)
    Number of iterations: 100
    Average open-close took: 180043.140 usec (+- 2295.889 usec)

Committer notes:

Replaced a deprecated bzero() call with designated initialized zeroing.

Added some missing evlist allocation checks, one noted by Riccardo on
the mailing list.

Minor cosmetic changes (sent in private).

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210809201101.277594-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 11:32:37 -03:00
Alyssa Ross
f2c24ebadd perf docs: Fix accidental em-dashes
" -- " is an em dash (—) in asciidoc, so all these examples that were
supposed to be producing a literal two dashes were being misrendered.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210809153226.332545-1-hi@alyssa.is
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 11:05:21 -03:00
Leo Yan
7c0223e1dd perf env: Track kernel 64-bit mode in environment
It's useful to know that the kernel is running in 32-bit or 64-bit mode.
E.g. We can decide if perf tool is running in compat mode based on the
info.

This patch adds an item "kernel_is_64_bit" into session's environment
structure perf_env, its value is initialized based on the architecture
string.

Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: russell king <linux@armlinux.org.uk>
Link: http://lore.kernel.org/lkml/20210809112727.596876-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 17:11:18 -03:00
Leo Yan
65c45afb14 perf: Cleanup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT
Since the __sync functions have been dropped, This patch removes unused
build and checking for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT in perf tool.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Daniel Díaz <daniel.diaz@linaro.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: coresight@lists.linaro.org
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20210809111407.596077-9-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 17:01:51 -03:00
Leo Yan
9d64503308 perf auxtrace: Remove auxtrace_mmap__read_snapshot_head()
Since the function auxtrace_mmap__read_snapshot_head() is exactly same
with auxtrace_mmap__read_head(), whether the session is in snapshot mode
or not, it's unified to use function auxtrace_mmap__read_head() for
reading AUX buffer head.

And the function auxtrace_mmap__read_snapshot_head() is unused so this
patch removes it.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Daniel Díaz <daniel.diaz@linaro.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: coresight@lists.linaro.org
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20210809111407.596077-8-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 17:01:23 -03:00
Leo Yan
1fc7e593e2 perf auxtrace: Drop legacy __sync functions
The main purpose for using __sync built-in functions is to support
compat mode for 32-bit perf with 64-bit kernel.  But using these
built-in functions might cause potential issues.

__sync functions originally support Intel Itanium processoer [1] but it
cannot promise to support all 32-bit archs.  Now these functions have
become the legacy functions.

Considering __sync functions cannot really fix the 64-bit value
atomicity on 32-bit archs, thus this patch drops __sync functions.

Credits to Peter for detailed analysis.

[1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html#g_t_005f_005fsync-Builtins

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Daniel Díaz <daniel.diaz@linaro.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: coresight@lists.linaro.org
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20210809111407.596077-7-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 17:00:06 -03:00
Leo Yan
1ea3cb159e perf auxtrace: Use WRITE_ONCE() for updating aux_tail
Use WRITE_ONCE() for updating aux_tail, so can avoid unexpected memory
behaviour.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Daniel Díaz <daniel.diaz@linaro.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: coresight@lists.linaro.org
Cc: x86@kernel.org
Link: http //lore.kernel.org/lkml/20210809111407.596077-6-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 16:58:29 -03:00
Stephen Brennan
b7ae6d4378 perf script python: Fix unintended underline
The text ranging from "subsystem__event_name" to "raw_syscalls__sys_enter()"
is interpreted by asciidoc as a pair of unconstrained text formatting markers.

The result is that the manual page displayed this text as underlined,
and the HTML pages displayed this text as italicized. Escape the first
double-underscore to prevent this.

https://docs.asciidoctor.org/asciidoc/latest/syntax-quick-reference/

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210806204502.110305-1-stephen.s.brennan@oracle.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 16:54:20 -03:00
James Clark
9c38b671eb perf cs-etm: Add warnings for missing DSOs
Currently decode will silently fail if no binary data is available for
the decode. This is made worse if only partial data is available because
the decode will appear to work, but any trace from that missing DSO will
silently not be generated.

Add a UI popup once if there is any data missing, and then warn in the
bottom left for each individual DSO that's missing.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http //lore.kernel.org/lkml/20210805130354.878120-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 16:30:24 -03:00
Jin Yao
b6ac16eed3 perf vendor events: Add metrics for Icelake Server
Add JSON metrics for Icelake Server to perf.

Based on TMA metrics 4.21 at 01.org.
https://download.01.org/perfmon/

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210806075404.31209-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 16:30:24 -03:00
Davidlohr Bueso
46f815323b perf bench futex, requeue: Add --pi parameter
This extends the program to measure WAIT_REQUEUE_PI+CMP_REQUEUE_PI
pairs, which are the underlying machinery behind priority-inheritance
aware condition variables. The defaults are the same as with the regular
non-pi version, requeueing one task at a time, with the exception that
PI will always wakeup the first waiter.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20210809043301.66002-8-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 12:00:27 -03:00
Davidlohr Bueso
6f9661b25b perf bench futex, requeue: Robustify futex_wait() handling
Do not assume success and account for EAGAIN or any other return value,
however unlikely.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20210809043301.66002-7-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 12:00:22 -03:00
Davidlohr Bueso
d262e6a93b perf bench futex, requeue: Add --broadcast option
Such that all threads are requeued to uaddr2 in a single
futex_cmp_requeue(), unlike the default, which is 1.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20210809043301.66002-6-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 12:00:10 -03:00
Davidlohr Bueso
9f9a3ffe94 perf bench futex: Add --mlockall parameter
This adds, across all futex benchmarks, the -m/--mlockall option
which is a common operation for realtime workloads by not incurring
in page faults in paths that want determinism. As such, threads
started after a call to mlockall(2) will generate page faults
immediately since the new stack is immediately forced to memory,
due to the MCL_FUTURE flag.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20210809043301.66002-5-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 12:00:04 -03:00
Davidlohr Bueso
b2105a7570 perf bench futex: Remove bogus backslash from comment
It obviously doesn't belong there.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20210809043301.66002-3-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 11:59:49 -03:00
Davidlohr Bueso
0959046303 perf bench futex: Group test parameters cleanup
Do this across all futex-bench tests such that all program parameters
neatly share a common structure, which is nicer than how we have them
now. No changes in program behavior are expected.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20210809043301.66002-2-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 11:59:46 -03:00
Jakub Kicinski
0ca8d3ca45 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Build failure in drivers/net/wwan/mhi_wwan_mbim.c:
add missing parameter (0, assuming we don't want buffer pre-alloc).

Conflict in drivers/net/dsa/sja1105/sja1105_main.c between:
  589918df93 ("net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too")
  0fac6aa098 ("net: dsa: sja1105: delete the best_effort_vlan_filtering mode")

Follow the instructions from the commit message of the former commit
- removed the if conditions. When looking at commit 589918df93 ("net:
dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too")
note that the mask_iotag fields get removed by the following patch.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-05 15:08:47 -07:00
James Clark
f3c33cbd92 perf cs-etm: Improve Coresight zero timestamp warning
Only show the warning if the user hasn't already set timeless mode and
improve the text because there was ambiguity around the meaning of '...'

Change the warning to a UI warning instead of printing straight to
stderr because this corrupts the UI when perf report TUI is used. The UI
warning function also handles printing to stderr when in perf script
mode.

Suggested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210729155805.2830-6-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-03 17:04:19 -03:00
James Clark
1155204950 perf tools: Add flag for tracking warnings of missing DSOs
Auxtrace support may need DSOs for decoding (for example Arm Coresight).
If one of these is missing it would make sense to warn once for each one
that's missing, but not flood the output with every address as there
could be thousands of lookups.

This flag will allow tracking whether a warning was shown for each DSO.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210729155805.2830-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-03 17:04:08 -03:00
James Clark
243c3a3eb4 perf annotate: Add disassembly warnings for annotate --stdio
Currently 'perf annotate --stdio' (and --stdio2) will exit without
printing anything if there are disassembly errors. Apply the same error
handler that's used for TUI and GTK modes. This makes comparing
disassembly across the different modes more consistent.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210729155805.2830-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-03 17:04:06 -03:00
James Clark
3d8b92472a perf annotate: Re-add annotate_warned functionality
Setting annotate_warned to true on errors was removed in
commit ee51d85139 ("perf annotate: Introduce strerror for handling
symbol__disassemble() errors") which means when 'perf annotate
--skip-missing' is used warnings are shown multiple times for the same
DSO.

Setting this again restores the original behavior of only one warning
each.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210729155805.2830-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-03 17:04:04 -03:00
James Clark
1094795eb9 perf tools: Add WARN_ONCE equivalent for UI warnings
Currently WARN_ONCE prints to stderr and corrupts the TUI. Add
equivalent methods for UI warnings.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210729155805.2830-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-03 17:03:18 -03:00
Namhyung Kim
ec02f2b134 perf tools: Add pipe_test.sh to verify pipe operations
It builds a test program and use it to verify pipe behavior with perf
record, inject and report.

  $ perf test pipe -v
  80: perf pipe recording and injection test                          :
  --- start ---
  test child forked, pid 1109301
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.000 MB - ]
     1109315  1109315       -1 |test.file.MGNff
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.000 MB - ]
      99.99%  test.file.MGNff  test.file.MGNffM  [.] noploop
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.000 MB - ]
      99.99%  test.file.MGNff  test.file.MGNffM  [.] noploop
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.153 MB /tmp/perf.data.dmsnlx (3995 samples) ]
      99.99%  test.file.MGNff  test.file.MGNffM  [.] noploop
  test child finished with 0
  ---- end ----
  perf pipe recording and injection test: Ok

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210719223153.1618812-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 10:14:58 -03:00
Namhyung Kim
c3a057dc3a perf inject: Fix output from a file to a pipe
When the input is a regular file but the output is a pipe, it should
write a pipe header.  But just repiping would write a portion of the
existing header which is different in 'size' value.  So we need to
prevent it and write a new pipe header along with other information
like event attributes and features.

This can handle something like this:

  # perf record -a -B sleep 1

  # perf inject -b -i perf.data | perf report -i -

Factor out perf_event__synthesize_for_pipe() to be shared between perf
record and inject.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210719223153.1618812-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 10:14:34 -03:00
Namhyung Kim
fea20d66f9 perf inject: Fix output from a pipe to a file
Sometimes it needs to save the perf inject data to a file for debugging.
But normally it assumes the same format for input and output, so the end
result cannot be used due to a broken format.

  # perf record -a -o - sleep 1 | perf inject -b -o my.data

  # perf report -i my.data --stdio
  0x208 [0]: failed to process type: 0 [Invalid argument]
  Error:
  failed to process sample
  # To display the perf.data header info, please use --header/--header-only options.
  #

In this case, it thought the data has a regular file header since the
output is not a pipe.  But actually it doesn't have one and has a pipe
file header.  At the end of the session, it tries to rewrite the regular
file header with updated features and it overwrites the data just
follows the pipe header.

Fix it by checking either the input and the output is a pipe.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210719223153.1618812-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 10:12:55 -03:00
Namhyung Kim
0ae0389362 perf tools: Pass a fd to perf_file_header__read_pipe()
Currently it unconditionally writes to stdout for repipe.  But perf
inject can direct its output to a regular file.  Then it needs to
write the header to the file as well.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210719223153.1618812-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 10:09:05 -03:00
Namhyung Kim
2681bd85a4 perf tools: Remove repipe argument from perf_session__new()
The repipe argument is only used by perf inject and the all others
passes 'false'.  Let's remove it from the function signature and add
__perf_session__new() to be called from perf inject directly.

This is a preparation of the change the pipe input/output.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210719223153.1618812-2-namhyung@kernel.org
[ Fixed up some trivial conflicts as this patchset fell thru the cracks ;-( ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 10:06:51 -03:00
Eirik Fuller
880569296f perf test: Handle fd gaps in test__dso_data_reopen
https://github.com/beaker-project/restraint/issues/215 describes a file
descriptor leak which revealed the test failure described here.

The 'DSO data reopen' perf test assumes that RLIMIT_NOFILE limits the
number of open file descriptors, but it actually limits newly opened
file descriptors. When the file descriptor limit is reduced, file
descriptors already open remain open regardless of the new limit. This
test failure does not occur if open file descriptors are contiguous,
beginning at zero.

The following command triggers this perf test failure.

perf test 'DSO data reopen' 3>/dev/null 8>/dev/null

This patch determines the file descriptor limit by opening four files
and then closing them. The limit is set to the fourth file descriptor,
leaving only the first three available because any newly opened file
descriptor must be less than the limit.

Signed-off-by: Eirik Fuller <efuller@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Michael Petlan <mpetlan@redhat.com>
LPU-Reference: 20210626023825.1398547-1-efuller@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 10:01:54 -03:00
Jin Yao
43c117d809 perf vendor events intel: Add basic metrics for Elkhartlake
Add JSON metrics for Elkhartlake to perf.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210802053440.21035-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:57:46 -03:00
Jin Yao
aa1bd89235 perf vendor events intel: Add core event list for Elkhartlake
Add JSON core events for Elkhartlake to perf.

Based on JSON list v1.02:

https://download.01.org/perfmon/EHL/

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210802053440.21035-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:57:23 -03:00
Jin Yao
b9efd75b6e perf vendor events: Add metrics for Tigerlake
Add JSON metrics for Tigerlake to perf.

Based on TMA metrics 4.21 at 01.org.
https://download.01.org/perfmon/

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210719070058.4159-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:18 -03:00
Jin Yao
4babba5572 perf vendor events intel: Add core event list for Tigerlake
Add JSON core events for Tigerlake to perf.

Based on JSON list v1.03:

https://download.01.org/perfmon/TGL/

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210719070058.4159-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:18 -03:00
Li Huafei
c4db54be9b perf annotate: Add error log in symbol__annotate()
When users use 'perf annotate' on unsupported machines, error logs
should be printed for user feedback.

Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dengcheng Zhu <dzhu@wavecomp.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Zhang Jinhao <zhangjinhao2@huawei.com>
Link: http://lore.kernel.org/lkml/20210726123854.13463-2-lihuafei1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:18 -03:00
Li Huafei
4502da0efb perf env: Normalize aarch64.* and arm64.* to arm64 in normalize_arch()
On my aarch64 big endian machine, the perf annotate does not work.

 # perf annotate
  Percent |      Source code & Disassembly of [kernel.kallsyms] for cycles (253 samples, percent: local period)
 --------------------------------------------------------------------------------------------------------------
  Percent |      Source code & Disassembly of [kernel.kallsyms] for cycles (1 samples, percent: local period)
 ------------------------------------------------------------------------------------------------------------
  Percent |      Source code & Disassembly of [kernel.kallsyms] for cycles (47 samples, percent: local period)
 -------------------------------------------------------------------------------------------------------------
 ...

This is because the arch_find() function uses the normalized architecture
name provided by normalize_arch(), and my machine's architecture name
aarch64_be is not normalized to arm64.  Like other architectures such as
arm and powerpc, we can fuzzy match the architecture names associated with
aarch64.* and normalize them.

It seems that there is also arm64_be architecture name, which we also
normalize to arm64.

Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dengcheng Zhu <dzhu@wavecomp.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Zhang Jinhao <zhangjinhao2@huawei.com>
Link: http //lore.kernel.org/lkml/20210726123854.13463-1-lihuafei1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:18 -03:00
Ian Rogers
f463ad7f41 perf beauty: Reuse the generic arch errno switch
Previously the code would see if, for example,
tools/perf/arch/arm/include/uapi/asm/errno.h exists and if not generate
a "generic" switch statement using the asm-generic/errno.h.

This creates multiple identical "generic" switch statements before the
default generic switch statement for an unknown architecture.

By simplifying the archlist to be only for architectures that are not
"generic" the amount of generated code can be reduced from 14 down to 6
functions.

Remove the special case of x86, instead reverse the architecture names
so that it comes first.

Committer testing:

  $ tools/perf/trace/beauty/arch_errno_names.sh gcc tools > before

Apply this patch and:

  $ tools/perf/trace/beauty/arch_errno_names.sh gcc tools > after

14 arches down to 6, that are the ones with an explicit errno.h file:

  $ ls -1 tools/arch/*/include/uapi/asm/errno.h
  tools/arch/alpha/include/uapi/asm/errno.h
  tools/arch/mips/include/uapi/asm/errno.h
  tools/arch/parisc/include/uapi/asm/errno.h
  tools/arch/powerpc/include/uapi/asm/errno.h
  tools/arch/sparc/include/uapi/asm/errno.h
  tools/arch/x86/include/uapi/asm/errno.h
  $

  $ diff -u4 before after
  @@ -2099,32 +987,16 @@
   const char *arch_syscalls__strerrno(const char *arch, int err)
   {
   	if (!strcmp(arch, "x86"))
   		return errno_to_name__x86(err);
  -	if (!strcmp(arch, "alpha"))
  -		return errno_to_name__alpha(err);
  -	if (!strcmp(arch, "arc"))
  -		return errno_to_name__arc(err);
  -	if (!strcmp(arch, "arm"))
  -		return errno_to_name__arm(err);
  -	if (!strcmp(arch, "arm64"))
  -		return errno_to_name__arm64(err);
  -	if (!strcmp(arch, "csky"))
  -		return errno_to_name__csky(err);
  -	if (!strcmp(arch, "mips"))
  -		return errno_to_name__mips(err);
  -	if (!strcmp(arch, "parisc"))
  -		return errno_to_name__parisc(err);
  -	if (!strcmp(arch, "powerpc"))
  -		return errno_to_name__powerpc(err);
  -	if (!strcmp(arch, "riscv"))
  -		return errno_to_name__riscv(err);
  -	if (!strcmp(arch, "s390"))
  -		return errno_to_name__s390(err);
  -	if (!strcmp(arch, "sh"))
  -		return errno_to_name__sh(err);
   	if (!strcmp(arch, "sparc"))
   		return errno_to_name__sparc(err);
  -	if (!strcmp(arch, "xtensa"))
  -		return errno_to_name__xtensa(err);
  +	if (!strcmp(arch, "powerpc"))
  +		return errno_to_name__powerpc(err);
  +	if (!strcmp(arch, "parisc"))
  +		return errno_to_name__parisc(err);
  +	if (!strcmp(arch, "mips"))
  +		return errno_to_name__mips(err);
  +	if (!strcmp(arch, "alpha"))
  +		return errno_to_name__alpha(err);
   	return errno_to_name__generic(err);
   }

The rest of the patch is the removal of the errno_to_name__generic()
unneeded clones.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210513060441.408507-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:18 -03:00
Ian Rogers
c44fc5af3c perf doc: Reorganize ARTICLES variables.
Place early, as they are in the git Makefile. Remove references to a
'technical` directory that doesn't exist in perf.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210715013343.2286699-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:18 -03:00
Ian Rogers
17ef1f14f6 perf doc: Remove howto-index.sh related references.
howto-index.sh exists in git but not in perf, as such targets that
depend upon it fail. Remove such failing targets.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210715013343.2286699-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:18 -03:00
Ian Rogers
e30b992f08 perf doc: Remove cmd-list.perl references
cmd-list.perl exists in git but not in perf. As such these targets fail
with missing dependencies. Remove them.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210715013343.2286699-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:18 -03:00
Ian Rogers
361ac7b462 perf doc: Add info pages to all target.
Enabled to ensure that info pages build.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210715013343.2286699-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:18 -03:00
Ian Rogers
33e536103f perf doc: Remove references to user-manual
Perf doesn't have a user-manual.txt, but git does and this explains why
there are references here. Having these references breaks 'make info' as
user-manual.info can't be created given the missing dependency. Remove
all references to user-manual so that 'make info' can succeed.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210715013343.2286699-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:18 -03:00
Ian Rogers
a81df63a5d perf doc: Fix doc.dep
The doc.dep dependencies for the Makefile fail to build as
build-docdep.perl is missing. Add this file from git.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210715013343.2286699-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:17 -03:00
Ian Rogers
6f6e7f065c perf doc: Fix perfman.info build
Before this change 'make perfman.info' fails as cat-texi.perl is
missing. It also fails as the makeinfo output isn't written into the
appropriate file. Add cat-texi.perl from git. Add missing output file
flag for makeinfo.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210715013343.2286699-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:17 -03:00
James Clark
9182f04a85 perf cs-etm: Pass unformatted flag to decoder
The TRBE (Trace Buffer Extension) feature allows a separate trace buffer
for each trace source, therefore the trace wouldn't need to be
formatted. The driver was introduced in commit 3fbf7f011f
("coresight: sink: Add TRBE driver").

The formatted/unformatted mode is encoded in one of the flags of the
AUX record. The first AUX record encountered for each event is used to
determine the mode, and this will persist for the remaining trace that
is either decoded or dumped.

Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https //lore.kernel.org/r/20210721150202.32065-7-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:17 -03:00
James Clark
04aaad262c perf cs-etm: Use existing decoder instead of resetting it
When dumping trace, the decoder is continually deleted and recreated to
decode each buffer. To support both formatted and unformatted trace in
a later commit, the decoder will be configured in advance.

This commit removes the deletion of the decoder and allows the
formatted/unformatted setting to persist.

Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https //lore.kernel.org/r/20210721150202.32065-6-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:17 -03:00
James Clark
b8324f490b perf cs-etm: Suppress printing when resetting decoder
The decoder is quite noisy when being reset. In a future commit,
dump-raw-trace will use a code path that resets the decoder rather than
creating a new one, so printing has to be suppressed to not flood the
output.

Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https //lore.kernel.org/r/20210721150202.32065-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:17 -03:00
James Clark
ca50db5917 perf cs-etm: Only setup queues when they are modified
Continually creating queues in cs_etm__process_event() is unnecessary.
They only need to be created when a buffer for a new CPU or thread is
encountered. This can be in two places, when building the queues in
advance in cs_etm__process_auxtrace_info(), or in
cs_etm__process_auxtrace_event() when data_queued is false and the
index wasn't available (pipe mode).

This change will allow the 'formatted' decoder setting to applied when
iterating over aux records in a later commit.

Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https //lore.kernel.org/r/20210721150202.32065-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:17 -03:00
James Clark
9ac8afd500 perf cs-etm: Split setup and timestamp search functions
This refactoring has some benefits:

 * Decoding is done to find the timestamp. If we want to print errors
   when maps aren't available, then doing it from cs_etm__setup_queue()
   may cause warnings to be printed.

 * The cs_etm__setup_queue() flow is shared between timed and timeless
   modes, so it needs to be guarded by an if statement which can now
   be removed.

 * Allows moving the setup queues function earlier.

 * If data was piped in, then not all queues would be filled so it
   wouldn't have worked properly anyway. Now it waits for flush so
   data in all queues will be available.

The motivation for this is to decouple setup functions with ones that
involve decoding. That way we can move the setup function earlier when
the formatted/unformatted trace information is available.

Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https //lore.kernel.org/r/20210721150202.32065-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:17 -03:00
James Clark
6f38e1158b perf cs-etm: Refactor initialisation of kernel start address
The kernel start address is already cached in the machine struct once it
is initialised, so storing it in the cs_etm struct is unnecessary.

It also depends on kernel maps being available to be initialised.
Therefore cs_etm__setup_queues() isn't an appropriate place to call it
because it could be called before processing starts. It would be better
to initialise it at the point when it is needed, then we can be sure
that all the necessary maps are available. Also by calling
machine__kernel_start() multiple times it can be initialised at some
point, even if it failed to initialise previously due to missing maps.

In a later commit cs_etm__setup_queues() will be moved which is the
motivation for this change.

Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20210721150202.32065-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:17 -03:00
Wei Li
ea0056f09a perf trace: Update cmd string table to decode sys_bpf first arg
As 'enum bpf_cmd' has been extended a lot, update the cmd string table to
decode sys_bpf first arg clearly in perf-trace.

Signed-off-by: Wei Li <liwei391@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Li Bin <huawei.libin@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210714015000.2844867-1-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:17 -03:00
Jakub Kicinski
d39e8b92c3 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Andrii Nakryiko says:

====================
bpf-next 2021-07-30

We've added 64 non-merge commits during the last 15 day(s) which contain
a total of 83 files changed, 5027 insertions(+), 1808 deletions(-).

The main changes are:

1) BTF-guided binary data dumping libbpf API, from Alan.

2) Internal factoring out of libbpf CO-RE relocation logic, from Alexei.

3) Ambient BPF run context and cgroup storage cleanup, from Andrii.

4) Few small API additions for libbpf 1.0 effort, from Evgeniy and Hengqi.

5) bpf_program__attach_kprobe_opts() fixes in libbpf, from Jiri.

6) bpf_{get,set}sockopt() support in BPF iterators, from Martin.

7) BPF map pinning improvements in libbpf, from Martynas.

8) Improved module BTF support in libbpf and bpftool, from Quentin.

9) Bpftool cleanups and documentation improvements, from Quentin.

10) Libbpf improvements for supporting CO-RE on old kernels, from Shuyi.

11) Increased maximum cgroup storage size, from Stanislav.

12) Small fixes and improvements to BPF tests and samples, from various folks.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (64 commits)
  tools: bpftool: Complete metrics list in "bpftool prog profile" doc
  tools: bpftool: Document and add bash completion for -L, -B options
  selftests/bpf: Update bpftool's consistency script for checking options
  tools: bpftool: Update and synchronise option list in doc and help msg
  tools: bpftool: Complete and synchronise attach or map types
  selftests/bpf: Check consistency between bpftool source, doc, completion
  tools: bpftool: Slightly ease bash completion updates
  unix_bpf: Fix a potential deadlock in unix_dgram_bpf_recvmsg()
  libbpf: Add btf__load_vmlinux_btf/btf__load_module_btf
  tools: bpftool: Support dumping split BTF by id
  libbpf: Add split BTF support for btf__load_from_kernel_by_id()
  tools: Replace btf__get_from_id() with btf__load_from_kernel_by_id()
  tools: Free BTF objects at various locations
  libbpf: Rename btf__get_from_id() as btf__load_from_kernel_by_id()
  libbpf: Rename btf__load() as btf__load_into_kernel()
  libbpf: Return non-null error on failures in libbpf_find_prog_btf_id()
  bpf: Emit better log message if bpf_iter ctx arg btf_id == 0
  tools/resolve_btfids: Emit warnings and patch zero id for missing symbols
  bpf: Increase supported cgroup storage value size
  libbpf: Fix race when pinning maps in parallel
  ...
====================

Link: https://lore.kernel.org/r/20210730225606.1897330-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-07-31 11:23:26 -07:00
Arnaldo Carvalho de Melo
9bac1bd6e6 Revert "perf map: Fix dso->nsinfo refcounting"
This makes 'perf top' abort in some cases, and the right fix will
involve surgery that is too much to do at this stage, so revert for now
and fix it in the next merge window.

This reverts commit 2d6b74baa7.

Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Krister Johansen <kjlx@templeofstupid.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-30 18:26:22 -03:00
Quentin Monnet
86f4b7f257 tools: Replace btf__get_from_id() with btf__load_from_kernel_by_id()
Replace the calls to function btf__get_from_id(), which we plan to
deprecate before the library reaches v1.0, with calls to
btf__load_from_kernel_by_id() in tools/ (bpftool, perf, selftests).
Update the surrounding code accordingly (instead of passing a pointer to
the btf struct, get it as a return value from the function).

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210729162028.29512-6-quentin@isovalent.com
2021-07-29 17:23:52 -07:00
Quentin Monnet
369e955b3d tools: Free BTF objects at various locations
Make sure to call btf__free() (and not simply free(), which does not
free all pointers stored in the struct) on pointers to struct btf
objects retrieved at various locations.

These were found while updating the calls to btf__get_from_id().

Fixes: 999d82cbc0 ("tools/bpf: enhance test_btf file testing to test func info")
Fixes: 254471e57a ("tools/bpf: bpftool: add support for func types")
Fixes: 7b612e291a ("perf tools: Synthesize PERF_RECORD_* for loaded BPF programs")
Fixes: d56354dc49 ("perf tools: Save bpf_prog_info and BTF of new BPF programs")
Fixes: 47c09d6a9f ("bpftool: Introduce "prog profile" command")
Fixes: fa853c4b83 ("perf stat: Enable counting events for BPF programs")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210729162028.29512-5-quentin@isovalent.com
2021-07-29 17:09:28 -07:00
John Garry
c07d5c9226 perf pmu: Fix alias matching
Commit c47a5599ed ("perf tools: Fix pattern matching for same
substring in different PMU type"), may have fixed some alias matching,
but has broken some others.

Firstly it cannot handle the simple scenario of PMU name in form
pmu_name{digits} - it can only handle pmu_name_{digits}.

Secondly it cannot handle more complex matching in the case where we
have multiple tokens. In this scenario, the code failed to realise that
we may examine multiple substrings in the PMU name.

Fix in two ways:

- Change perf_pmu__valid_suffix() to accept a PMU name without '_' in the
  suffix

- Only pay attention to perf_pmu__valid_suffix() for the final token

Also add const qualifiers as necessary to avoid casting.

Fixes: c47a5599ed ("perf tools: Fix pattern matching for same substring in different PMU type")
Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1626793819-79090-1-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-27 13:21:46 -03:00
James Clark
48e8a7b5a5 perf cs-etm: Split --dump-raw-trace by AUX records
Currently --dump-raw-trace skips queueing and splitting buffers because
of an early exit condition in cs_etm__process_auxtrace_info(). Once
that is removed we can print the split data by using the queues
and searching for split buffers with the same reference as the
one that is currently being processed.

This keeps the same behaviour of dumping in file order when an AUXTRACE
event appears, rather than moving trace dump to where AUX records are in
the file.

There will be a newline and size printout for each fragment. For example
this buffer is comprised of two AUX records, but was printed as one:

  0 0 0x8098 [0x30]: PERF_RECORD_AUXTRACE size: 0xa0  offset: 0  ref: 0x491a4dfc52fc0e6e  idx: 0  t

  . ... CoreSight ETM Trace data: size 160 bytes
          Idx:0; ID:10;   I_ASYNC : Alignment Synchronisation.
          Idx:12; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
          Idx:17; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;
          Idx:80; ID:10;  I_ASYNC : Alignment Synchronisation.
          Idx:92; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
          Idx:97; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0xFFFFDE2AD3FD76D4;

But is now printed as two fragments:

  0 0 0x8098 [0x30]: PERF_RECORD_AUXTRACE size: 0xa0  offset: 0  ref: 0x491a4dfc52fc0e6e  idx: 0  t

  . ... CoreSight ETM Trace data: size 80 bytes
          Idx:0; ID:10;   I_ASYNC : Alignment Synchronisation.
          Idx:12; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
          Idx:17; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;

  . ... CoreSight ETM Trace data: size 80 bytes
          Idx:80; ID:10;  I_ASYNC : Alignment Synchronisation.
          Idx:92; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
          Idx:97; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0xFFFFDE2AD3FD76D4;

Decoding errors that appeared in problematic files are now not present,
for example:

        Idx:808; ID:1c; I_BAD_SEQUENCE : Invalid Sequence in packet.[I_ASYNC]
        ...
        PKTP_ETMV4I_0016 : 0x0014 (OCSD_ERR_INVALID_PCKT_HDR) [Invalid packet header]; TrcIdx=822

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Branislav Rankov <branislav.rankov@arm.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210624164303.28632-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-27 12:50:56 -03:00
Yang Jihong
b0f008551f perf sched: Fix record failure when CONFIG_SCHEDSTATS is not set
The tracepoints trace_sched_stat_{wait, sleep, iowait} are not exposed to user
if CONFIG_SCHEDSTATS is not set, "perf sched record" records the three events.
As a result, the command fails.

Before:

  #perf sched record sleep 1
  event syntax error: 'sched:sched_stat_wait'
                       \___ unknown tracepoint

  Error:  File /sys/kernel/tracing/events/sched/sched_stat_wait not found.
  Hint:   Perhaps this kernel misses some CONFIG_ setting to enable this feature?.

  Run 'perf list' for a list of valid events

   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

      -e, --event <event>   event selector. use 'perf list' to list available events

Solution:
  Check whether schedstat tracepoints are exposed. If no, these events are not recorded.

After:
  # perf sched record sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.163 MB perf.data (1091 samples) ]
  # perf sched report
  run measurement overhead: 4736 nsecs
  sleep measurement overhead: 9059979 nsecs
  the run test took 999854 nsecs
  the sleep test took 8945271 nsecs
  nr_run_events:        716
  nr_sleep_events:      785
  nr_wakeup_events:     0
  ...
  ------------------------------------------------------------

Fixes: 2a09b5de23 ("sched/fair: do not expose some tracepoints to user if CONFIG_SCHEDSTATS is not set")
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Yafang Shao <laoar.shao@gmail.com>
Link: http://lore.kernel.org/lkml/20210713112358.194693-1-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-18 09:36:37 -03:00
Yang Jihong
22a665513b perf probe: Fix add event failure when running 32-bit perf in a 64-bit kernel
The "address" member of "struct probe_trace_point" uses long data type.
If kernel is 64-bit and perf program is 32-bit, size of "address"
variable is 32 bits.

As a result, upper 32 bits of address read from kernel are truncated, an
error occurs during address comparison in kprobe_warn_out_range().

Before:

  # perf probe -a schedule
  schedule is out of .text, skip it.
    Error: Failed to add events.

Solution:
  Change data type of "address" variable to u64 and change corresponding
address printing and value assignment.

After:

  # perf.new.new probe -a schedule
  Added new event:
    probe:schedule       (on schedule)

  You can now use it in all perf tools, such as:

          perf record -e probe:schedule -aR sleep 1

  # perf probe -l
    probe:schedule       (on schedule@kernel/sched/core.c)
  # perf record -e probe:schedule -aR sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.156 MB perf.data (1366 samples) ]
  # perf report --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 1K of event 'probe:schedule'
  # Event count (approx.): 1366
  #
  # Overhead  Command          Shared Object      Symbol
  # ........  ...............  .................  ............
  #
       6.22%  migration/0      [kernel.kallsyms]  [k] schedule
       6.22%  migration/1      [kernel.kallsyms]  [k] schedule
       6.22%  migration/2      [kernel.kallsyms]  [k] schedule
       6.22%  migration/3      [kernel.kallsyms]  [k] schedule
       6.15%  migration/10     [kernel.kallsyms]  [k] schedule
       6.15%  migration/11     [kernel.kallsyms]  [k] schedule
       6.15%  migration/12     [kernel.kallsyms]  [k] schedule
       6.15%  migration/13     [kernel.kallsyms]  [k] schedule
       6.15%  migration/14     [kernel.kallsyms]  [k] schedule
       6.15%  migration/15     [kernel.kallsyms]  [k] schedule
       6.15%  migration/4      [kernel.kallsyms]  [k] schedule
       6.15%  migration/5      [kernel.kallsyms]  [k] schedule
       6.15%  migration/6      [kernel.kallsyms]  [k] schedule
       6.15%  migration/7      [kernel.kallsyms]  [k] schedule
       6.15%  migration/8      [kernel.kallsyms]  [k] schedule
       6.15%  migration/9      [kernel.kallsyms]  [k] schedule
       0.22%  rcu_sched        [kernel.kallsyms]  [k] schedule
  ...
  #
  # (Cannot load tips.txt file, please install perf!)
  #

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jianlin Lv <jianlin.lv@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Link: http://lore.kernel.org/lkml/20210715063723.11926-1-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-18 09:31:15 -03:00
Riccardo Mancini
d4b3eedce1 perf data: Close all files in close_dir()
When using 'perf report' in directory mode, the first file is not closed
on exit, causing a memory leak.

The problem is caused by the iterating variable never reaching 0.

Fixes: 1455206311 ("perf data: Add perf_data__(create_dir|close_dir) functions")
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: http://lore.kernel.org/lkml/20210716141122.858082-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-18 09:27:49 -03:00
Riccardo Mancini
e0fa7ab422 perf probe-file: Delete namelist in del_events() on the error path
ASan reports some memory leaks when running:

  # perf test "42: BPF filter"

This second leak is caused by a strlist not being dellocated on error
inside probe_file__del_events.

This patch adds a goto label before the deallocation and makes the error
path jump to it.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: e7895e422e ("perf probe: Split del_perf_probe_events()")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/174963c587ae77fa108af794669998e4ae558338.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-18 09:27:37 -03:00
Riccardo Mancini
937654ce49 perf test bpf: Free obj_buf
ASan reports some memory leaks when running:

  # perf test "42: BPF filter"

The first of these leaks is caused by obj_buf never being deallocated in
__test__bpf.

This patch adds the missing free.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: ba1fae431e ("perf test: Add 'perf test BPF'")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/60f3ca935fe6672e7e866276ce6264c9e26e4c87.1626343282.git.rickyman7@gmail.com
[ Added missing stdlib.h include ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-16 13:50:30 -03:00
Riccardo Mancini
659ede7d13 perf trace: Free strings in trace__parse_events_option()
ASan reports several memory leaks running:

  # perf test "88: Check open filename arg using perf trace + vfs_getname"

The fourth of these leaks is related to some strings never being freed
in trace__parse_events_option.

This patch adds the missing frees.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/34d08535b11124106b859790549991abff5a7de8.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:35:57 -03:00
Riccardo Mancini
3cb4d5e00e perf trace: Free syscall tp fields in evsel->priv
ASan reports several memory leaks running:

  # perf test "88: Check open filename arg using perf trace + vfs_getname"

The third of these leaks is related to evsel->priv fields of sycalls
never being deallocated.

This patch adds the function evlist__free_syscall_tp_fields which
iterates over all evsels in evlist, matching syscalls, and calling the
missing frees.

This new function is called at the end of trace__run, right before
calling evlist__delete.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/46526611904ec5ff2768b59014e3afce8e0197d1.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:35:18 -03:00
Riccardo Mancini
f2ebf8ffe7 perf trace: Free syscall->arg_fmt
ASan reports several memory leaks running:

  # perf test "88: Check open filename arg using perf trace + vfs_getname"

The second of these leaks is caused by the arg_fmt field of syscall not
being deallocated.

This patch adds a new function syscall__exit which is called on all
syscalls.table entries in trace__exit, which will free the arg_fmt
field.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/d68f25c043d30464ac9fa79c3399e18f429bca82.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:34:39 -03:00
Riccardo Mancini
6c7f0ab047 perf trace: Free malloc'd trace fields on exit
ASan reports several memory leaks running:

  # perf test "88: Check open filename arg using perf trace + vfs_getname"

The first of these leaks is related to struct trace fields never being
deallocated.

This patch adds the function trace__exit, which is called at the end of
cmd_trace, replacing the existing deallocation, which is now moved
inside the new function.

This function deallocates:

 - ev_qualifier
 - ev_qualifier_ids.entries
 - syscalls.table
 - sctbl
 - perfconfig_events

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/de5945ed5c0cb882cbfa3268567d0bff460ff016.1626343282.git.rickyman7@gmail.com
[ Removed needless initialization to zero, missing named initializers are zeroed by the compiler ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:34:07 -03:00
Riccardo Mancini
f8cbb0f926 perf lzma: Close lzma stream on exit
ASan reports memory leaks when running:

  # perf test "88: Check open filename arg using perf trace + vfs_getname"

One of these is caused by the lzma stream never being closed inside
lzma_decompress_to_file().

This patch adds the missing lzma_end().

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: 80a32e5b49 ("perf tools: Add lzma decompression support for kernel module")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/aaf50bdce7afe996cfc06e1bbb36e4a2a9b9db93.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:30:22 -03:00
Riccardo Mancini
faf3ac305d perf script: Fix memory 'threads' and 'cpus' leaks on exit
ASan reports several memory leaks while running:

  # perf test "82: Use vfs_getname probe to get syscall args filenames"

Two of these are caused by some refcounts not being decreased on
perf-script exit, namely script.threads and script.cpus.

This patch adds the missing __put calls in a new perf_script__exit
function, which is called at the end of cmd_script.

This patch concludes the fixes of all remaining memory leaks in perf
test "82: Use vfs_getname probe to get syscall args filenames".

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: cfc8874a48 ("perf script: Process cpu/threads maps")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/5ee73b19791c6fa9d24c4d57f4ac1a23609400d7.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:28:14 -03:00
Riccardo Mancini
1b1f57cf9e perf script: Release zstd data
ASan reports several memory leak while running:

  # perf test "82: Use vfs_getname probe to get syscall args filenames"

One of the leaks is caused by zstd data not being released on exit in
perf-script.

This patch adds the missing zstd_fini().

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: b13b04d938 ("perf script: Initialize zstd_data")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/39388e8cc2f85ca219ea18697a17b7bd8f74b693.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:27:52 -03:00
Riccardo Mancini
423b9174f5 perf session: Cleanup trace_event
ASan reports several memory leaks when running:

  # perf test "82: Use vfs_getname probe to get syscall args filenames"

many of which are related to session->tevent.

This patch will solve this problem, then next patch will fix the
remaining memory leaks in 'perf script'.

This bug is due to a missing deallocation of the trace_event data
strutures.

This patch adds the missing trace_event__cleanup() in
perf_session__delete().

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/fa2a3f221d90e47ce4e5b7e2d6e64c3509ddc96a.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:27:52 -03:00
Riccardo Mancini
02e6246f53 perf inject: Close inject.output on exit
ASan reports a memory leak when running:

  # perf test "83: Zstd perf.data compression/decompression"

which happens inside 'perf inject'.

The bug is caused by inject.output never being closed.

This patch adds the missing perf_data__close().

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: 6ef81c55a2 ("perf session: Return error code for perf_session__new() function on failure")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/c06f682afa964687367cf6e92a64ceb49aec76a5.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:27:52 -03:00
Riccardo Mancini
a37338aad8 perf report: Free generated help strings for sort option
ASan reports the memory leak of the strings allocated by sort_help() when
running perf report.

This patch changes the returned pointer to char* (instead of const
char*), saves it in a temporary variable, and finally deallocates it at
function exit.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: 702fb9b415 ("perf report: Show all sort keys in help output")
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/a38b13f02812a8a6759200b9063c6191337f44d4.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:27:52 -03:00
Riccardo Mancini
da6b7c6c06 perf env: Fix memory leak of cpu_pmu_caps
ASan reports memory leaks while running:

 # perf test "83: Zstd perf.data compression/decompression"

The first of the leaks is caused by env->cpu_pmu_caps not being freed.

This patch adds the missing (z)free inside perf_env__exit.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: 6f91ea283a ("perf header: Support CPU PMU capabilities")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/6ba036a8220156ec1f3d6be3e5d25920f6145028.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:27:52 -03:00
Riccardo Mancini
244d1797c8 perf test maps__merge_in: Fix memory leak of maps
ASan reports a memory leak when running:

  # perf test "65: maps__merge_in"

This is the second and final patch addressing these memory leaks.

This time, the problem is simply that the maps object is never
destructed.

This patch adds the missing maps__exit call.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: 79b6bb73f8 ("perf maps: Merge 'struct maps' with 'struct map_groups'")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/a1a29b97a58738987d150e94d4ebfad0282fb038.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:27:52 -03:00
Riccardo Mancini
581e295a0f perf dso: Fix memory leak in dso__new_map()
ASan reports a memory leak when running:

  # perf test "65: maps__merge_in".

The causes of the leaks are two, this patch addresses only the first
one, which is related to dso__new_map().

The bug is that dso__new_map() creates a new dso but never decreases the
refcount it gets from creating it.

This patch adds the missing dso__put().

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: d3a7c489c7 ("perf tools: Reference count struct dso")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/60bfe0cd06e89e2ca33646eb8468d7f5de2ee597.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:27:52 -03:00
Riccardo Mancini
dccfca926c perf test event_update: Fix memory leak of unit
ASan reports a memory leak while running:

  # perf test "49: Synthesize attr update"

Caused by a string being duplicated but never freed.

This patch adds the missing free().

Note that evsel->unit is not deallocated together with evsel since it is
supposed to be a constant string.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: a6e5281780 ("perf tools: Add event_update event unit type")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1fbc8158663fb0d4d5392e36bae564f6ad60be3c.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:27:52 -03:00
Riccardo Mancini
fc56f54f6f perf test event_update: Fix memory leak of evlist
ASan reports a memory leak when running:

  # perf test "49: Synthesize attr update"

Caused by evlist not being deleted.

This patch adds the missing evlist__delete and removes the
perf_cpu_map__put since it's already being deleted by evlist__delete.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: a6e5281780 ("perf tools: Add event_update event unit type")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/f7994ad63d248f7645f901132d208fadf9f2b7e4.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:27:51 -03:00
Riccardo Mancini
233f2dc1c2 perf test session_topology: Delete session->evlist
ASan reports a memory leak related to session->evlist while running:

  # perf test "41: Session topology".

When perf_data is in write mode, session->evlist is owned by the caller,
which should also take care of deleting it.

This patch adds the missing evlist__delete().

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: c84974ed9f ("perf test: Add entry to test cpu topology")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/822f741f06eb25250fb60686cf30a35f447e9e91.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:27:51 -03:00
Riccardo Mancini
42db3d9ded perf env: Fix sibling_dies memory leak
ASan reports a memory leak in perf_env while running:

  # perf test "41: Session topology"

Caused by sibling_dies not being freed.

This patch adds the required free.

Fixes: acae8b36cd ("perf header: Add die information in CPU topology")
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/2140d0b57656e4eb9021ca9772250c24c032924b.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:27:49 -03:00
Riccardo Mancini
dedeb4be20 perf probe: Fix dso->nsinfo refcounting
ASan reports a memory leak of nsinfo during the execution of:

 # perf test "31: Lookup mmap thread".

The leak is caused by a refcounted variable being replaced without
dropping the refcount.

This patch makes sure that the refcnt of nsinfo is decreased whenever
a refcounted variable is replaced with a new value.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: 544abd44c7 ("perf probe: Allow placing uprobes in alternate namespaces.")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Krister Johansen <kjlx@templeofstupid.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/55223bc8821b34ccb01f92ef1401c02b6a32e61f.1626343282.git.rickyman7@gmail.com
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:25:28 -03:00
Riccardo Mancini
2d6b74baa7 perf map: Fix dso->nsinfo refcounting
ASan reports a memory leak of nsinfo during the execution of

  # perf test "31: Lookup mmap thread"

The leak is caused by a refcounted variable being replaced without
dropping the refcount.

This patch makes sure that the refcnt of nsinfo is decreased whenever a
refcounted variable is replaced with a new value.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: bf2e710b3c ("perf maps: Lookup maps in both intitial mountns and inner mountns.")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Krister Johansen <kjlx@templeofstupid.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/55223bc8821b34ccb01f92ef1401c02b6a32e61f.1626343282.git.rickyman7@gmail.com
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:25:27 -03:00
Riccardo Mancini
0967ebffe0 perf inject: Fix dso->nsinfo refcounting
ASan reports a memory leak of nsinfo during the execution of:

  # perf test "31: Lookup mmap thread"

The leak is caused by a refcounted variable being replaced without
dropping the refcount.

This patch makes sure that the refcnt of nsinfo is decreased when a
refcounted variable is replaced with a new value.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: 27c9c3424f ("perf inject: Add --buildid-all option")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/55223bc8821b34ccb01f92ef1401c02b6a32e61f.1626343282.git.rickyman7@gmail.com
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:25:24 -03:00
James Clark
83d1fc92d4 perf cs-etm: Split Coresight decode by aux records
Populate the auxtrace queues using AUX records rather than whole
auxtrace buffers so that the decoder is reset between each aux record.

This is similar to the auxtrace_queues__process_index() ->
auxtrace_queues__add_indexed_event() flow where
perf_session__peek_event() is used to read AUXTRACE events out of random
positions in the file based on the auxtrace index.

But now we loop over all PERF_RECORD_AUX events instead of AUXTRACE
buffers. For each PERF_RECORD_AUX event, we find the corresponding
AUXTRACE buffer using the index, and add a fragment of that buffer to
the auxtrace queues.

No other changes to decoding were made, apart from populating the
auxtrace queues. The result of decoding is identical to before, except
in cases where decoding failed completely, due to not resetting the
decoder.

The reason for this change is because AUX records are emitted any time
tracing is disabled, for example when the process is scheduled out.
Because ETM was disabled and enabled again, the decoder also needs to be
reset to force the search for a sync packet. Otherwise there would be
fatal decoding errors.

Testing
=======

Testing was done with the following script, to diff the decoding results
between the patched and un-patched versions of perf:

	#!/bin/bash
	set -ex

	$1 script -i $3 $4 > split.script
	$2 script -i $3 $4 > default.script

	diff split.script default.script | head -n 20

And it was run like this, with various itrace options depending on the
quantity of synthesised events:

	compare.sh ./perf-patched ./perf-default perf-per-cpu-2-threads.data --itrace=i100000ns

No changes in output were observed in the following scenarios:

* Simple per-cpu
	perf record -e cs_etm/@tmc_etr0/u top

* Per-thread, single thread
	perf record -e cs_etm/@tmc_etr0/u --per-thread ./threads_C

* Per-thread multiple threads (but only one thread collected data):
	perf record -e cs_etm/@tmc_etr0/u --per-thread --pid 4596,4597

* Per-thread multiple threads (both threads collected data):
	perf record -e cs_etm/@tmc_etr0/u --per-thread --pid 4596,4597

* Per-cpu explicit threads:
	perf record -e cs_etm/@tmc_etr0/u --pid 853,854

* System-wide (per-cpu):
    perf record -e cs_etm/@tmc_etr0/u -a

* No data collected (no aux buffers)
	Can happen with any command when run for a short period

* Containing truncated records
	Can happen with any command

* Containing aux records with 0 size
	Can happen with any command

* Snapshot mode (various files with and without buffer wrap)
	perf record -e cs_etm/@tmc_etr0/u -a --snapshot

Some differences were observed in the following scenario:

* Snapshot mode (with duplicate buffers)
	perf record -e cs_etm/@tmc_etr0/u -a --snapshot

Fewer samples are generated in snapshot mode if duplicate buffers
were gathered because buffers with the same offset are now only added
once. This gives different, but more correct results and no duplicate
data is decoded any more.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Branislav Rankov <branislav.rankov@arm.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210624164303.28632-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-14 14:42:36 -03:00
Arnaldo Carvalho de Melo
d08c84e01a perf sched: Cast PTHREAD_STACK_MIN to int as it may turn into sysconf(__SC_THREAD_STACK_MIN_VALUE)
In fedora rawhide the PTHREAD_STACK_MIN define may end up expanded to a
sysconf() call, and that will return 'long int', breaking the build:

    45 fedora:rawhide                : FAIL gcc version 11.1.1 20210623 (Red Hat 11.1.1-6) (GCC)
      builtin-sched.c: In function 'create_tasks':
      /git/perf-5.14.0-rc1/tools/include/linux/kernel.h:43:24: error: comparison of distinct pointer types lacks a cast [-Werror]
         43 |         (void) (&_max1 == &_max2);              \
            |                        ^~
      builtin-sched.c:673:34: note: in expansion of macro 'max'
        673 |                         (size_t) max(16 * 1024, PTHREAD_STACK_MIN));
            |                                  ^~~
      cc1: all warnings being treated as errors

  $ grep __sysconf /usr/include/*/*.h
  /usr/include/bits/pthread_stack_min-dynamic.h:extern long int __sysconf (int __name) __THROW;
  /usr/include/bits/pthread_stack_min-dynamic.h:#   define PTHREAD_STACK_MIN __sysconf (__SC_THREAD_STACK_MIN_VALUE)
  /usr/include/bits/time.h:extern long int __sysconf (int);
  /usr/include/bits/time.h:# define CLK_TCK ((__clock_t) __sysconf (2))	/* 2 is _SC_CLK_TCK */
  $

So cast it to int to cope with that.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-14 13:06:38 -03:00
Heiko Carstens
50e98924d7 libperf: Fix build error with LIBPFM4=1
Fix build error with LIBPFM4=1:

    CC      util/pfm.o
  util/pfm.c: In function ‘parse_libpfm_events_option’:
  util/pfm.c:102:30: error: ‘struct evsel’ has no member named ‘leader’
    102 |                         evsel->leader = grp_leader;
        |                              ^~

Committer notes:

There is this entry in 'make -C tools/perf build-test' to test the build
with libpfm:

  $ grep libpfm tools/perf/tests/make
  make_with_libpfm4   := LIBPFM4=1
  run += make_with_libpfm4
  $

But the test machine lacked libpfm-devel, now its installed and further
cases like this shouldn't happen.

Committer testing:

Before this patch this fails, after applying it:

  $ make -C tools/perf build-test
  make: Entering directory '/var/home/acme/git/perf/tools/perf'
  - tarpkg: ./tests/perf-targz-src-pkg .
                   make_static: make LDFLAGS=-static NO_PERF_READ_VDSO32=1 NO_PERF_READ_VDSOX32=1 NO_JVMTI=1 -j24  DESTDIR=/tmp/tmp.KzFSfvGRQa
  <SNIP>
             make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
           make_with_libpfm4_O: make LIBPFM4=1
         make_install_prefix_O: make install prefix=/tmp/krava
            make_no_auxtrace_O: make NO_AUXTRACE=1
  <SNIP>
  $ rpm -q libpfm-devel
  libpfm-devel-4.11.0-4.fc34.x86_64
  $

FIXME:

This shows a need for 'build-test' to bail out when a build option is
specified that has no required library devel files installed.

Fixes: fba7c86601 ("libperf: Move 'leader' from tools/perf to perf_evsel::leader")
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210713091907.1555560-1-hca@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-14 10:05:35 -03:00
Arnaldo Carvalho de Melo
376a947653 tools headers UAPI: Sync files changed by the memfd_secret new syscall
To pick the changes in this cset:

  7bb7f2ac24 ("arch, mm: wire up memfd_secret system call where relevant")

That silences these perf build warnings and add support for those new
syscalls in tools such as 'perf trace'.

For instance, this is now possible:

  # perf trace -v -e memfd_secret
  event qualifier tracepoint filter: (common_pid != 13375 && common_pid != 3713) && (id == 447)
  ^C#

That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.

  $ grep memfd_secret tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
  447    common  memfd_secret            sys_memfd_secret
  $

This addresses these perf build warnings:

  Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/unistd.h' differs from latest version at 'arch/arm64/include/uapi/asm/unistd.h'
  diff -u tools/arch/arm64/include/uapi/asm/unistd.h arch/arm64/include/uapi/asm/unistd.h
  Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
  diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
  Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
  diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-14 10:05:35 -03:00
Jin Yao
e0a7ef2a62 perf stat: Merge uncore events by default for hybrid platform
On a hybrid platform, by default 'perf stat' aggregates and reports the
event counts per PMU. For example,

  # perf stat -e cycles -a true

   Performance counter stats for 'system wide':

           1,400,445      cpu_core/cycles/
             680,881      cpu_atom/cycles/

         0.001770773 seconds time elapsed

But for uncore events that's not a suitable method. Uncore has nothing
to do with hybrid. So for uncore events, we aggregate event counts from
all PMUs and report the counts without PMUs.

Before:

  # perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ -a true

   Performance counter stats for 'system wide':

               2,058      uncore_arb_0/event=0x81,umask=0x1/
               2,028      uncore_arb_1/event=0x81,umask=0x1/
                   0      uncore_arb_0/event=0x84,umask=0x1/
                   0      uncore_arb_1/event=0x84,umask=0x1/

         0.000614498 seconds time elapsed

After:

  # perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ -a true

   Performance counter stats for 'system wide':

               3,996      arb/event=0x81,umask=0x1/
                   0      arb/event=0x84,umask=0x1/

         0.000630046 seconds time elapsed

Of course, we also keep the '--no-merge' working for uncore events.

  # perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ --no-merge true

   Performance counter stats for 'system wide':

               1,952      uncore_arb_0/event=0x81,umask=0x1/
               1,921      uncore_arb_1/event=0x81,umask=0x1/
                   0      uncore_arb_0/event=0x84,umask=0x1/
                   0      uncore_arb_1/event=0x84,umask=0x1/

         0.000575536 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210707055652.962-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-14 10:05:35 -03:00
Jin Yao
de3d5fd83c perf tests: Fix 'Convert perf time to TSC' on core-only system
If the atom CPUs are offlined, the 'cpu_atom' is not valid.
We don't need the test case for 'cpu_atom'.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210708013701.20347-5-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-14 10:05:35 -03:00
Jin Yao
212f3d97ab perf tests: Fix 'Roundtrip evsel->name' on core-only system
If the atom CPUs are offlined, the 'cpu_atom' is not valid.
Perf will not create two events for one hw event, so the
evsel->idx doesn't need to be divided by 2 before comparing.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210708013701.20347-4-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-14 10:05:35 -03:00
Jin Yao
490e9a8fb4 perf tests: Fix 'Parse event definition strings' on core-only system
If the atom CPUs are offlined, the 'cpu_atom' is not valid.
We don't need the test case for 'cpu_atom'.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210708013701.20347-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-14 10:05:35 -03:00
Jin Yao
49afa7f6c7 perf pmu: Skip invalid hybrid pmu
On hybrid platform, such as Alderlake, if atom CPUs are offlined,
the kernel still exports the sysfs path '/sys/devices/cpu_atom/' for
'cpu_atom' pmu but the file '/sys/devices/cpu_atom/cpus' is empty,
which indicates this is an invalid pmu.

Need to check and skip the invalid hybrid pmu.

Before:

  # perf list
  ...
  branch-instructions OR cpu_atom/branch-instructions/ [Kernel PMU event]
  branch-instructions OR cpu_core/branch-instructions/ [Kernel PMU event]
  branch-misses OR cpu_atom/branch-misses/           [Kernel PMU event]
  branch-misses OR cpu_core/branch-misses/           [Kernel PMU event]
  bus-cycles OR cpu_atom/bus-cycles/                 [Kernel PMU event]
  bus-cycles OR cpu_core/bus-cycles/                 [Kernel PMU event]
  ...

The cpu_atom events are still displayed even if atom CPUs are offlined.

After:

  # perf list
  ...
  branch-instructions OR cpu_core/branch-instructions/ [Kernel PMU event]
  branch-misses OR cpu_core/branch-misses/           [Kernel PMU event]
  bus-cycles OR cpu_core/bus-cycles/                 [Kernel PMU event]
  ...

Now only cpu_core events are displayed.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210708013701.20347-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-14 10:05:35 -03:00
Eric W. Biederman
b48c7236b1 exit/bdflush: Remove the deprecated bdflush system call
The bdflush system call has been deprecated for a very long time.
Recently Michael Schmitz tested[1] and found that the last known
caller of of the bdflush system call is unaffected by it's removal.

Since the code is not needed delete it.

[1] https://lkml.kernel.org/r/36123b5d-daa0-6c2b-f2d4-a942f069fd54@gmail.com
Link: https://lkml.kernel.org/r/87sg10quue.fsf_-_@disp2133
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Cyril Hrubis <chrubis@suse.cz>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-07-12 15:17:47 -05:00
Riccardo Mancini
eb7261f14e perf test: Add free() calls for scandir() returned dirent entries
ASan reported a memory leak for items of the entlist returned from scandir().

In fact, scandir() returns a malloc'd array of malloc'd dirents.

This patch adds the missing (z)frees.

Fixes: da963834fe ("perf test: Iterate over shell tests in alphabetical order")
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Link: http://lore.kernel.org/lkml/20210709163454.672082-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 15:34:41 -03:00
Jin Yao
c47a5599ed perf tools: Fix pattern matching for same substring in different PMU type
Some different PMU types may have the same substring. For example, on
Icelake server we have PMU types "uncore_imc" and
"uncore_imc_free_running". Both PMU types have the substring
"uncore_imc".  But the parser wrongly thinks they are the same PMU type.

We enable an imc event,
perf stat -e uncore_imc/event=0xe3/ -a -- sleep 1

Perf actually expands the event to:

  uncore_imc_0/event=0xe3/
  uncore_imc_1/event=0xe3/
  uncore_imc_2/event=0xe3/
  uncore_imc_3/event=0xe3/
  uncore_imc_4/event=0xe3/
  uncore_imc_5/event=0xe3/
  uncore_imc_6/event=0xe3/
  uncore_imc_7/event=0xe3/
  uncore_imc_free_running_0/event=0xe3/
  uncore_imc_free_running_1/event=0xe3/
  uncore_imc_free_running_3/event=0xe3/
  uncore_imc_free_running_4/event=0xe3/

That's because the "uncore_imc_free_running" matches the
pattern "uncore_imc*".

Now we check that the last characters of PMU name is '_<digit>'.

For example, for pattern "uncore_imc*", "uncore_imc_0" is parsed ok, but
"uncore_imc_free_running_0" fails.

Fixes: b2b9d3a3f0 ("perf pmu: Support wildcards on pmu name in dynamic pmu events")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210701064253.1175-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:32 -03:00
Kan Liang
b91e5492f9 perf record: Add a dummy event on hybrid systems to collect metadata records
Some symbols may not be resolved if a user only monitors one type of
PMU.

  $ sudo perf record -e cpu_atom/branch-instructions/ ./big_small_workload
  $ sudo perf report –stdio
  # Overhead  Command    Shared Object      Symbol
  # ........  .........  .................  .....................
  #
     28.02%  perf-exec  [unknown]          [.] 0x0000000000401cf6
     11.32%  perf-exec  [unknown]          [.] 0x0000000000401d04
     10.90%  perf-exec  [unknown]          [.] 0x0000000000401d11
     10.61%  perf-exec  [unknown]          [.] 0x0000000000401cfc

To parse symbols the metadata records, e.g., PERF_RECORD_COMM, which are
generated by the kernel, are required.

To decide whether to generate the metadata records, the kernel relies on
the event_filter_match() to filter the unrelated events.

On a hybrid system, event_filter_match() further checks the CPU mask of
the current enabled PMU. If an event is collected on the CPU which
doesn't have an enabled PMU, it's treated as an unrelated event.

The "big_small_workload" is created in a big core, but runs on a small
core. The metadata records are filtered, because the user only monitors
the PMU of the small core. The big core PMU is not enabled.

For a hybrid system, a dummy event is required to generate the complete
side-band events.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/1625760212-18441-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:32 -03:00
Kan Liang
5f148e7c6a perf stat: Add Topdown metrics L2 events as default events
The Topdown Microarchitecture Analysis (TMA) Method is a structured
analysis methodology to identify critical performance bottlenecks in
out-of-order processors.

The Topdown metrics L1 event was added as default in 42641d6f4d
("perf stat: Add Topdown metrics events as default events")

From the Sapphire Rapids server and later platforms, the same dedicated
"metrics" register is extended to support both L1 and L2 events.

Add both L1 and L2 Topdown metrics events as default to enrich the
default measuring information if the new measurement register is
available.

On legacy systems there is no change to avoid extra multiplexing.

The topdown_level indicates the max metrics level for the top-down
statistics. Set it to 2 to display all L1 and L2 Topdown metrics events.

With the patch:

  $ perf stat sleep 1

  Performance counter stats for 'sleep 1':

           0.59 msec task-clock             #   0.001 CPUs utilized
              1      context-switches       #   1.687 K/sec
              0      cpu-migrations         #   0.000 /sec
             76      page-faults            # 128.198 K/sec
      1,405,318      cycles                 #   2.371 GHz
      1,471,136      instructions           #   1.05  insn per cycle
        310,132      branches               # 523.136 M/sec
         10,435      branch-misses          #   3.36% of all branches
      8,431,908      slots                  #  14.223 G/sec
      1,554,116      topdown-retiring       #    18.4% retiring
      1,289,585      topdown-bad-spec       #    15.2% bad speculation
      2,810,636      topdown-fe-bound       #    33.2% frontend bound
      2,810,636      topdown-be-bound       #    33.2% backend bound
        231,464      topdown-heavy-ops      #     2.7% heavy operations   #  15.6% light operations
      1,223,453      topdown-br-mispredict  #    14.5% branch mispredict  #   0.8% machine clears
      1,884,779      topdown-fetch-lat      #    22.3% fetch latency      #  10.9% fetch bandwidth
      1,454,917      topdown-mem-bound      #    17.2% memory bound       #  16.0% Core bound

    1.001179699 seconds time elapsed

    0.000000000 seconds user
    0.001238000 seconds sys

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/1625760169-18396-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:32 -03:00
Jiri Olsa
2e6263ab54 libperf: Adopt evlist__set_leader() from tools/perf as perf_evlist__set_leader()
Move the implementation of evlist__set_leader() to a new libperf
perf_evlist__set_leader() function with the same functionality make it a
libperf exported API.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:32 -03:00
Jiri Olsa
3a683120d8 libperf: Move 'nr_groups' from tools/perf to evlist::nr_groups
Move evsel::nr_groups to perf_evsel::nr_groups, so we can move the group
interface to libperf.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:32 -03:00
Jiri Olsa
fba7c86601 libperf: Move 'leader' from tools/perf to perf_evsel::leader
Move evsel::leader to perf_evsel::leader, so we can move the group
interface to libperf.

Also add several evsel helpers to ease up the transition:

  struct evsel *evsel__leader(struct evsel *evsel);
  - get leader evsel

  bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
  - true if evsel has leader as leader

  bool evsel__is_leader(struct evsel *evsel);
  - true if evsel is itw own leader

  void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
  - set leader for evsel

Committer notes:

Fix this when building with 'make BUILD_BPF_SKEL=1'

  tools/perf/util/bpf_counter.c

  -       if (evsel->leader->core.nr_members > 1) {
  +       if (evsel->core.leader->nr_members > 1) {

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:31 -03:00
Jiri Olsa
38fe0e0156 libperf: Move 'idx' from tools/perf to perf_evsel::idx
Move evsel::idx to perf_evsel::idx, so we can move the group interface
to libperf.

Committer notes:

Fixup evsel->idx usage in tools/perf/util/bpf_counter_cgroup.c, that
appeared in my tree in my local tree.

Also fixed up these:

$ find tools/perf/ -name "*.[ch]" | xargs grep 'evsel->idx'
tools/perf/ui/gtk/annotate.c:                      evsel->idx + i);
tools/perf/ui/gtk/annotate.c:                   evsel->idx);
$

That running 'make -C tools/perf build-test' caught.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:28 -03:00
Adrian Hunter
b4b046ff9e perf intel-pt: Add a config for max loops without consuming a packet
The Intel PT decoder limits the number of unconditional branches (e.g.
jmps) decoded without consuming any trace packets. Generally, a loop
needs a conditional branch which generates a TNT packet, whereas a "ret"
instruction will generate a TIP or TNT packet. So exceeding the limit is
assumed to be a never-ending loop, which can happen if there has been a
decoding error putting the decoder at the wrong place in the code.

Up until now, the limit of 10000 has been enough but some analytic
purposes have been reported to exceed that.

Increase the limit to 100000, and make it configurable via perf config
intel-pt.max-loops. Also amend the "Never-ending loop" message to
mention the configuration entry.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210701175132.3977-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-07 11:40:56 -03:00
Jin Yao
493be70ac3 perf stat: Disable the NMI watchdog message on hybrid
If we run a single workload that only runs on big core, there is always
a ugly message about disabling the NMI watchdog because the atom is not
counted.

Before:

  # ./perf stat true

   Performance counter stats for 'true':

                0.43 msec task-clock                #    0.396 CPUs utilized
                   0      context-switches          #    0.000 /sec
                   0      cpu-migrations            #    0.000 /sec
                  45      page-faults               #  103.918 K/sec
             639,634      cpu_core/cycles/          #    1.477 G/sec
       <not counted>      cpu_atom/cycles/                                              (0.00%)
             643,498      cpu_core/instructions/    #    1.486 G/sec
       <not counted>      cpu_atom/instructions/                                        (0.00%)
             123,715      cpu_core/branches/        #  285.694 M/sec
       <not counted>      cpu_atom/branches/                                            (0.00%)
               4,094      cpu_core/branch-misses/   #    9.454 M/sec
       <not counted>      cpu_atom/branch-misses/                                       (0.00%)

         0.001092407 seconds time elapsed

         0.001144000 seconds user
         0.000000000 seconds sys

  Some events weren't counted. Try disabling the NMI watchdog:
          echo 0 > /proc/sys/kernel/nmi_watchdog
          perf stat ...
          echo 1 > /proc/sys/kernel/nmi_watchdog

  # ./perf stat -e '{cpu_atom/cycles/,msr/tsc/}' true

   Performance counter stats for 'true':

       <not counted>      cpu_atom/cycles/                                              (0.00%)
       <not counted>      msr/tsc/                                                      (0.00%)

         0.001904106 seconds time elapsed

         0.001947000 seconds user
         0.000000000 seconds sys

  Some events weren't counted. Try disabling the NMI watchdog:
          echo 0 > /proc/sys/kernel/nmi_watchdog
          perf stat ...
          echo 1 > /proc/sys/kernel/nmi_watchdog
  The events in group usually have to be from the same PMU. Try reorganizing the group.

Now we disable the NMI watchdog message on hybrid, otherwise there
are too many false positives.

After:

  # ./perf stat true

   Performance counter stats for 'true':

                0.79 msec task-clock                #    0.419 CPUs utilized
                   0      context-switches          #    0.000 /sec
                   0      cpu-migrations            #    0.000 /sec
                  48      page-faults               #   60.889 K/sec
             777,692      cpu_core/cycles/          #  986.519 M/sec
       <not counted>      cpu_atom/cycles/                                              (0.00%)
             669,147      cpu_core/instructions/    #  848.828 M/sec
       <not counted>      cpu_atom/instructions/                                        (0.00%)
             128,635      cpu_core/branches/        #  163.176 M/sec
       <not counted>      cpu_atom/branches/                                            (0.00%)
               4,089      cpu_core/branch-misses/   #    5.187 M/sec
       <not counted>      cpu_atom/branch-misses/                                       (0.00%)

         0.001880649 seconds time elapsed

         0.001935000 seconds user
         0.000000000 seconds sys

  # ./perf stat -e '{cpu_atom/cycles/,msr/tsc/}' true

   Performance counter stats for 'true':

       <not counted>      cpu_atom/cycles/                                              (0.00%)
       <not counted>      msr/tsc/                                                      (0.00%)

         0.000963319 seconds time elapsed

         0.000999000 seconds user
         0.000000000 seconds sys

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210610034557.29766-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-07 11:37:23 -03:00
Kajol Jain
a3cbcadfdf perf vendor events power10: Adds 24x7 nest metric events for power10 platform
Patch adds 24x7 nest metric events for POWER10.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/lkml/20210628064935.163465-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-07 11:35:49 -03:00
Kajol Jain
dea8cfcc33 perf script python: Fix buffer size to report iregs in perf script
Commit 48a1f56526 ("perf script python: Add more PMU fields to
event handler dict") added functionality to report fields like weight,
iregs, uregs etc via perf report.  That commit predefined buffer size to
512 bytes to print those fields.

But in PowerPC, since we added extended regs support in:

  068aeea377 ("perf powerpc: Support exposing Performance Monitor Counter SPRs as part of extended regs")
  d735599a06 ("powerpc/perf: Add extended regs support for power10 platform")

Now iregs can carry more bytes of data and this predefined buffer size
can result to data loss in perf script output.

This patch resolves this issue by making the buffer size dynamic, based
on the number of registers needed to print. It also changes the
regs_map() return type from int to void, as it is not being used by the
set_regs_in_dict(), its only caller.

Fixes: 068aeea377 ("perf powerpc: Support exposing Performance Monitor Counter SPRs as part of extended regs")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/lkml/20210628062341.155839-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-07 11:15:44 -03:00
Justin M. Forbes
e63cbfa3be perf trace: Fix the perf trace link location
The install perf_dlfilter.h patch included what seems to be a typo in
the Makefile.perf, which changed the location of the trace link from
'$(DESTDIR_SQ)$(bindir_SQ)/trace' to '$(DESTDIR_SQ)$(dir_SQ)/trace'.

This reverts it back to the correct location.

Fixes: 0beb218315 ("perf build: Install perf_dlfilter.h")
Signed-off-by: Justin M. Forbes <jforbes@fedoraproject.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Justin M. Forbes <jmforbes@linuxtx.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706185952.116121-1-jforbes@fedoraproject.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-07 10:28:10 -03:00
Riccardo Mancini
83952286f2 perf top: Fix overflow in elf_sec__is_text()
ASan reports a heap-buffer-overflow in elf_sec__is_text when using perf-top.

The bug is caused by the fact that secstrs is built from runtime_ss, while
shdr is built from syms_ss if shdr.sh_type != SHT_NOBITS. Therefore, they
point to two different ELF files.

This patch renames secstrs to secstrs_run and adds secstrs_sym, so that
the correct secstrs is chosen depending on shdr.sh_type.

  $ ASAN_OPTIONS=abort_on_error=1:disable_coredump=0:unmap_shadow_on_exit=1 ./perf top
  =================================================================
  ==363148==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61300009add6 at pc 0x00000049875c bp 0x7f4f56446440 sp 0x7f4f56445bf0
  READ of size 1 at 0x61300009add6 thread T6
    #0 0x49875b in StrstrCheck(void*, char*, char const*, char const*) (/home/user/linux/tools/perf/perf+0x49875b)
    #1 0x4d13a2 in strstr (/home/user/linux/tools/perf/perf+0x4d13a2)
    #2 0xacae36 in elf_sec__is_text /home/user/linux/tools/perf/util/symbol-elf.c:176:9
    #3 0xac3ec9 in elf_sec__filter /home/user/linux/tools/perf/util/symbol-elf.c:187:9
    #4 0xac2c3d in dso__load_sym /home/user/linux/tools/perf/util/symbol-elf.c:1254:20
    #5 0x883981 in dso__load /home/user/linux/tools/perf/util/symbol.c:1897:9
    #6 0x8e6248 in map__load /home/user/linux/tools/perf/util/map.c:332:7
    #7 0x8e66e5 in map__find_symbol /home/user/linux/tools/perf/util/map.c:366:6
    #8 0x7f8278 in machine__resolve /home/user/linux/tools/perf/util/event.c:707:13
    #9 0x5f3d1a in perf_event__process_sample /home/user/linux/tools/perf/builtin-top.c:773:6
    #10 0x5f30e4 in deliver_event /home/user/linux/tools/perf/builtin-top.c:1197:3
    #11 0x908a72 in do_flush /home/user/linux/tools/perf/util/ordered-events.c:244:9
    #12 0x905fae in __ordered_events__flush /home/user/linux/tools/perf/util/ordered-events.c:323:8
    #13 0x9058db in ordered_events__flush /home/user/linux/tools/perf/util/ordered-events.c:341:9
    #14 0x5f19b1 in process_thread /home/user/linux/tools/perf/builtin-top.c:1109:7
    #15 0x7f4f6a21a298 in start_thread /usr/src/debug/glibc-2.33-16.fc34.x86_64/nptl/pthread_create.c:481:8
    #16 0x7f4f697d0352 in clone ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

0x61300009add6 is located 10 bytes to the right of 332-byte region [0x61300009ac80,0x61300009adcc)
allocated by thread T6 here:

    #0 0x4f3f7f in malloc (/home/user/linux/tools/perf/perf+0x4f3f7f)
    #1 0x7f4f6a0a88d9  (/lib64/libelf.so.1+0xa8d9)

Thread T6 created by T0 here:

    #0 0x464856 in pthread_create (/home/user/linux/tools/perf/perf+0x464856)
    #1 0x5f06e0 in __cmd_top /home/user/linux/tools/perf/builtin-top.c:1309:6
    #2 0x5ef19f in cmd_top /home/user/linux/tools/perf/builtin-top.c:1762:11
    #3 0x7b28c0 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
    #4 0x7b119f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
    #5 0x7b2423 in run_argv /home/user/linux/tools/perf/perf.c:409:2
    #6 0x7b0c19 in main /home/user/linux/tools/perf/perf.c:539:3
    #7 0x7f4f696f7b74 in __libc_start_main /usr/src/debug/glibc-2.33-16.fc34.x86_64/csu/../csu/libc-start.c:332:16

  SUMMARY: AddressSanitizer: heap-buffer-overflow (/home/user/linux/tools/perf/perf+0x49875b) in StrstrCheck(void*, char*, char const*, char const*)
  Shadow bytes around the buggy address:
    0x0c268000b560: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    0x0c268000b570: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    0x0c268000b580: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    0x0c268000b590: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0c268000b5a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  =>0x0c268000b5b0: 00 00 00 00 00 00 00 00 00 04[fa]fa fa fa fa fa
    0x0c268000b5c0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
    0x0c268000b5d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0c268000b5e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0c268000b5f0: 07 fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    0x0c268000b600: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  Shadow byte legend (one shadow byte represents 8 application bytes):
    Addressable:           00
    Partially addressable: 01 02 03 04 05 06 07
    Heap left redzone:       fa
    Freed heap region:       fd
    Stack left redzone:      f1
    Stack mid redzone:       f2
    Stack right redzone:     f3
    Stack after return:      f5
    Stack use after scope:   f8
    Global redzone:          f9
    Global init order:       f6
    Poisoned by user:        f7
    Container overflow:      fc
    Array cookie:            ac
    Intra object redzone:    bb
    ASan internal:           fe
    Left alloca redzone:     ca
    Right alloca redzone:    cb
    Shadow gap:              cc
  ==363148==ABORTING

Suggested-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Link: http://lore.kernel.org/lkml/20210621222108.196219-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-07 10:28:10 -03:00
Riccardo Mancini
5a4451e4d5 perf annotate: Fix 's' on source line when disasm is empty
If the disasm is empty, 's' should fail. Instead it seemingly works,
hiding the empty lines and causing an assertion error on the next time
annotate is called (from within perf report).

The problem is caused by a buffer overflow, caused by a wrong exit
condition in annotate_browser__find_next_asm_line, which checks
browser->b.top instead of browser->b.entries.

This patch fixes the issue, making annotate_browser__toggle_source
fail if the disasm is empty (nothing happens to the user).

Fixes: 6de249d66d ("perf annotate: Allow 's' on source code lines")
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210705161524.72953-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-07 10:28:10 -03:00
Masami Hiramatsu
d5882a92ea perf probe: Do not show @plt function by default
Fix the perf-probe --functions option do not show the PLT
stub symbols (*@plt) by default.

  -----
  $ ./perf probe -x /usr/lib64/libc-2.33.so -F | head
  a64l
  abort
  abs
  accept
  accept4
  access
  acct
  addmntent
  addseverity
  adjtime
  -----

Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Masami Hiramatsu <mhriamat@kernel.org>
Acked-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Stefan Liebler <stli@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Link: http://lore.kernel.org/lkml/162532653450.393143.12621329879630677469.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-07 10:28:10 -03:00
Masami Hiramatsu
87704345cc perf symbol-elf: Decode dynsym even if symtab exists
In Fedora34, libc-2.33.so has both .dynsym and .symtab sections and
most of (not all) symbols moved to .dynsym. In this case, perf only
decode the symbols in .symtab, and perf probe can not list up the
functions in the library.

To fix this issue, decode both .symtab and .dynsym sections.

Without this fix,
  -----
  $ ./perf probe -x /usr/lib64/libc-2.33.so -F
  @plt
  @plt
  calloc@plt
  free@plt
  malloc@plt
  memalign@plt
  realloc@plt
  -----

With this fix.

  -----
  $ ./perf probe -x /usr/lib64/libc-2.33.so -F
  @plt
  @plt
  a64l
  abort
  abs
  accept
  accept4
  access
  acct
  addmntent
  -----

Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Stefan Liebler <stli@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Link: http://lore.kernel.org/lkml/162532652681.393143.10163733179955267999.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-07 10:28:10 -03:00
Masami Hiramatsu
eb4717f733 perf probe: Fix debuginfo__new() to enable build-id based debuginfo
Fix debuginfo__new() to set the build-id to dso before
dso__read_binary_type_filename() so that it can find
DSO_BINARY_TYPE__BUILDID_DEBUGINFO debuginfo correctly.

However, this may not change the result, because elfutils (libdwfl) has
its own debuginfo finder. With/without this patch, the perf probe
correctly find the debuginfo file.

This is just a failsafe and keep code's sanity (if you use
dso__read_binary_type_filename(), you must set the build-id to the dso.)

Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Masami Hiramatsu <mhriamat@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Stefan Liebler <stli@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Link: http://lore.kernel.org/lkml/162532651863.393143.11692691321219235810.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-07 10:28:07 -03:00
Arnaldo Carvalho de Melo
44c2cd80f2 tools headers UAPI: Sync files changed by the quotactl_fd new syscall
To pick the changes in these csets:

  64c2c2c62f ("quota: Change quotactl_path() systcall to an fd-based one")
  65ffb3d69e ("quota: Wire up quotactl_fd syscall")

That silences these perf build warnings and add support for those new
syscalls in tools such as 'perf trace'.

For instance, this is now possible:

  # perf trace -v -e quota*
  event qualifier tracepoint filter: (common_pid != 158365 && common_pid != 2512) && (id == 179 || id == 443)
  ^C#

That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.

  $ grep quota tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
  179	common	quotactl		sys_quotactl
  443	common	quotactl_fd		sys_quotactl_fd
  $

This addresses these perf build warnings:

  Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
  diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
  Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
  diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
  Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
  diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
  Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
  diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
  Warning: Kernel ABI header at 'tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl' differs from latest version at 'arch/mips/kernel/syscalls/syscall_n64.tbl'
  diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl

Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-05 14:36:22 -03:00
Namhyung Kim
944138f048 perf stat: Enable BPF counter with --for-each-cgroup
Recently bperf was added to use BPF to count perf events for various
purposes.  This is an extension for the approach and targetting to
cgroup usages.

Unlike the other bperf, it doesn't share the events with other
processes but it'd reduce unnecessary events (and the overhead of
multiplexing) for each monitored cgroup within the perf session.

When --for-each-cgroup is used with --bpf-counters, it will open
cgroup-switches event per cpu internally and attach the new BPF
program to read given perf_events and to aggregate the results for
cgroups.  It's only called when task is switched to a task in a
different cgroup.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210701211227.1403788-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-05 14:16:57 -03:00
Namhyung Kim
892ba7f186 perf report: Fix --task and --stat with pipe input
Current 'perf report' fails to process a pipe input when --task or
--stat options are used.  This is because they reset all the tool
callbacks and fails to find a matching event for a sample.

When pipe input is used, the event info is passed via ATTR records so it
needs to handle that operation.  Otherwise the following error occurs.
Note, -14 (= -EFAULT) comes from evlist__parse_sample():

  # perf record -a -o- sleep 1 | perf report -i- --stat
  Can't parse sample, err = -14
  0x271044 [0x38]: failed to process type: 9
  Error:
  failed to process sample
  #

Committer testing:

Before:

  $ perf record -o- sleep 1 | perf report -i- --stat
  Can't parse sample, err = -14
  [ perf record: Woken up 1 times to write data ]
  0x1350 [0x30]: failed to process type: 9
  Error:
  failed to process sample
  [ perf record: Captured and wrote 0.000 MB - ]
  $

After:

  $ perf record -o- sleep 1 | perf report -i- --stat
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.000 MB - ]

  Aggregated stats:
             TOTAL events:         41
              COMM events:          2  ( 4.9%)
              EXIT events:          1  ( 2.4%)
            SAMPLE events:          9  (22.0%)
             MMAP2 events:          4  ( 9.8%)
              ATTR events:          1  ( 2.4%)
    FINISHED_ROUND events:          1  ( 2.4%)
        THREAD_MAP events:          1  ( 2.4%)
           CPU_MAP events:          1  ( 2.4%)
      EVENT_UPDATE events:          1  ( 2.4%)
         TIME_CONV events:          1  ( 2.4%)
           FEATURE events:         19  (46.3%)
  cycles:uhH stats:
            SAMPLE events:          9
  $

Fixes: a4a4d0a7a2 ("perf report: Add --stats option to display quick data statistics")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210630043058.1131295-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-05 14:16:57 -03:00
Riccardo Mancini
cf96b8e45a perf session: Add missing evlist__delete when deleting a session
ASan reports a memory leak caused by evlist not being deleted on exit in
perf-report, perf-script and perf-data.
The problem is caused by evlist->session not being deleted, which is
allocated in perf_session__read_header, called in perf_session__new if
perf_data is in read mode.
In case of write mode, the session->evlist is filled by the caller.
This patch solves the problem by calling evlist__delete in
perf_session__delete if perf_data is in read mode.

Changes in v2:
 - call evlist__delete from within perf_session__delete

v1: https://lore.kernel.org/lkml/20210621234317.235545-1-rickyman7@gmail.com/

ASan report follows:

$ ./perf script report flamegraph
=================================================================
==227640==ERROR: LeakSanitizer: detected memory leaks

<SNIP unrelated>

Indirect leak of 2704 byte(s) in 1 object(s) allocated from:
    #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
    #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
    #2 0x7f999e in evlist__new /home/user/linux/tools/perf/util/evlist.c:77:26
    #3 0x8ad938 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3797:20
    #4 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
    #5 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
    #6 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
    #7 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
    #8 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
    #9 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
    #10 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
    #11 0x7f5260654b74  (/lib64/libc.so.6+0x27b74)

Indirect leak of 568 byte(s) in 1 object(s) allocated from:
    #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
    #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
    #2 0x80ce88 in evsel__new_idx /home/user/linux/tools/perf/util/evsel.c:268:24
    #3 0x8aed93 in evsel__new /home/user/linux/tools/perf/util/evsel.h:210:9
    #4 0x8ae07e in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3853:11
    #5 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
    #6 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
    #7 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
    #8 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
    #9 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
    #10 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
    #11 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
    #12 0x7f5260654b74  (/lib64/libc.so.6+0x27b74)

Indirect leak of 264 byte(s) in 1 object(s) allocated from:
    #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
    #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
    #2 0xbe3e70 in xyarray__new /home/user/linux/tools/lib/perf/xyarray.c:10:23
    #3 0xbd7754 in perf_evsel__alloc_id /home/user/linux/tools/lib/perf/evsel.c:361:21
    #4 0x8ae201 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3871:7
    #5 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
    #6 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
    #7 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
    #8 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
    #9 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
    #10 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
    #11 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
    #12 0x7f5260654b74  (/lib64/libc.so.6+0x27b74)

Indirect leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
    #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
    #2 0xbd77e0 in perf_evsel__alloc_id /home/user/linux/tools/lib/perf/evsel.c:365:14
    #3 0x8ae201 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3871:7
    #4 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
    #5 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
    #6 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
    #7 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
    #8 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
    #9 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
    #10 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
    #11 0x7f5260654b74  (/lib64/libc.so.6+0x27b74)

Indirect leak of 7 byte(s) in 1 object(s) allocated from:
    #0 0x4b8207 in strdup (/home/user/linux/tools/perf/perf+0x4b8207)
    #1 0x8b4459 in evlist__set_event_name /home/user/linux/tools/perf/util/header.c:2292:16
    #2 0x89d862 in process_event_desc /home/user/linux/tools/perf/util/header.c:2313:3
    #3 0x8af319 in perf_file_section__process /home/user/linux/tools/perf/util/header.c:3651:9
    #4 0x8aa6e9 in perf_header__process_sections /home/user/linux/tools/perf/util/header.c:3427:9
    #5 0x8ae3e7 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3886:2
    #6 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
    #7 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
    #8 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
    #9 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
    #10 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
    #11 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
    #12 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
    #13 0x7f5260654b74  (/lib64/libc.so.6+0x27b74)

SUMMARY: AddressSanitizer: 3728 byte(s) leaked in 7 allocation(s).

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210624231926.212208-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:38 -03:00
Riccardo Mancini
6de249d66d perf annotate: Allow 's' on source code lines
In perf annotate, when 's' is pressed on a line containing source code,
it shows the message "Only available for assembly lines".

This patch gets rid of the error, moving the cursr to the next available
asm line (or the closest previous one if no asm line is found moving
forwards), before hiding source code lines.

Changes in v2:
 - handle case of no asm line found in
   annotate_browser__find_next_asm_line by returning NULL and
   handling error in caller.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210624223423.189550-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:38 -03:00
Adrian Hunter
ec4c00fedb perf dlfilter: Add object_code() to perf_dlfilter_fns
Add a function, for use by dlfilters, to read object code.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:38 -03:00
Adrian Hunter
6495e76252 perf dlfilter: Add attr() to perf_dlfilter_fns
Add a function, for use by dlfilters, to return the perf_event_attr
structure.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:38 -03:00
Adrian Hunter
244afc0c93 perf dlfilter: Add srcline() to perf_dlfilter_fns
Add a function, for use by dlfilters, to return source code file name and
line number.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:38 -03:00
Adrian Hunter
e35995effd perf dlfilter: Add insn() to perf_dlfilter_fns
Add a function, for use by dlfilters, to return instruction bytes.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:38 -03:00
Adrian Hunter
f645744c50 perf dlfilter: Add resolve_address() to perf_dlfilter_fns
Add a function, for use by dlfilters, to resolve addresses from branch
stacks or callchains.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:38 -03:00
Adrian Hunter
0beb218315 perf build: Install perf_dlfilter.h
Users of the --dlfilter option need to include perf_dlfilter.h
in their filters. Install it to the include path.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:38 -03:00
Adrian Hunter
3d032a2516 perf script: Add option to pass arguments to dlfilters
Add option --dlarg to pass arguments to dlfilters. The --dlarg option can
be repeated to pass more than 1 argument.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:37 -03:00
Adrian Hunter
638e2b9984 perf script: Add option to list dlfilters
Add option --list-dlfilters to list dlfilters in the current directory or
the exec-path e.g. ~/libexec/perf-core/dlfilters. Use with option -v (must
come before option --list-dlfilters) to show long descriptions.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:37 -03:00
Adrian Hunter
9bde93a79a perf script: Add dlfilter__filter_event_early()
filter_event_early() can be more than 30% faster than filter_event()
because it is called before internal filtering. In other respects it
is the same as filter_event(), except that it will be passed events
that have yet to be filtered out.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:37 -03:00
Adrian Hunter
291961fc3c perf script: Add API for filtering via dynamically loaded shared object
In some cases, users want to filter very large amounts of data (e.g.
from AUX area tracing like Intel PT) looking for something specific.
While scripting such as Python can be used, Python is 10 to 20 times
slower than C. So define a C API so that custom filters can be written
and loaded.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:37 -03:00
Arnaldo Carvalho de Melo
c435c166dc perf llvm: Return -ENOMEM when asprintf() fails
Zhihao sent a patch but it made llvm__compile_bpf() return what
asprintf() returns on error, which is just -1, but since this function
returns -errno, fix it by returning -ENOMEM for this case instead.

Fixes: cb76371441 ("perf llvm: Allow passing options to llc ...")
Fixes: 5eab5a7ee0 ("perf llvm: Display eBPF compiling command ...")
Reported-by: Hulk Robot <hulkci@huawei.com>
Reported-by: Zhihao Cheng <chengzhihao1@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yu Kuai <yukuai3@huawei.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20210609115945.2193194-1-chengzhihao1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:37 -03:00
James Clark
0323dea318 perf cs-etm: Delay decode of non-timeless data until cs_etm__flush_events()
Currently, timeless mode starts the decode on PERF_RECORD_EXIT, and
non-timeless mode starts decoding on the fist PERF_RECORD_AUX record.

This can cause the "data has no samples!" error if the first
PERF_RECORD_AUX record comes before the first (or any relevant)
PERF_RECORD_MMAP2 record because the mmaps are required by the decoder
to access the binary data.

This change pushes the start of non-timeless decoding to the very end of
parsing the file. The PERF_RECORD_EXIT event can't be used because it
might not exist in system-wide or snapshot modes.

I have not been able to find the exact cause for the events to be
intermittently in the wrong order in the basic scenario:

	perf record -e cs_etm/@tmc_etr0/u top

But it can be made to happen every time with the --delay option. This is
because "enable_on_exec" is disabled, which causes tracing to start
before the process to be launched is exec'd. For example:

	perf record -e cs_etm/@tmc_etr0/u --delay=1 top
	perf report -D | grep 'AUX\|MAP'

	0 16714475632740 0x520 [0x40]: PERF_RECORD_AUX offset: 0 size: 0x30 flags: 0 []
	0 16714476494960 0x5d0 [0x40]: PERF_RECORD_AUX offset: 0x30 size: 0x30 flags: 0 []
	0 16714478208900 0x660 [0x40]: PERF_RECORD_AUX offset: 0x60 size: 0x30 flags: 0 []
	4294967295 16714478293340 0x700 [0x70]: PERF_RECORD_MMAP2 8712/8712: [0x557a460000(0x54000) @ 0 00:17 5329258 0]: r-xp /usr/bin/top
	4294967295 16714478353020 0x770 [0x88]: PERF_RECORD_MMAP2 8712/8712: [0x7f86f72000(0x34000) @ 0 00:17 5214354 0]: r-xp /usr/lib/aarch64-linux-gnu/ld-2.31.so

Another scenario in which decoding from the first aux record fails is a
workload that forks. Although the aux record comes after 'bash', it
comes before 'top', which is what we are interested in. For example:

	perf record -e cs_etm/@tmc_etr0/u -- bash -c top
	perf report -D | grep 'AUX\|MAP'

	4294967295 16853946421300 0x510 [0x70]: PERF_RECORD_MMAP2 8723/8723: [0x558f280000(0x142000) @ 0 00:17 5213953 0]: r-xp /usr/bin/bash
	4294967295 16853946543560 0x580 [0x88]: PERF_RECORD_MMAP2 8723/8723: [0x7fbba6e000(0x34000) @ 0 00:17 5214354 0]: r-xp /usr/lib/aarch64-linux-gnu/ld-2.31.so
	4294967295 16853946628420 0x608 [0x68]: PERF_RECORD_MMAP2 8723/8723: [0x7fbba9e000(0x1000) @ 0 00:00 0 0]: r-xp [vdso]
	0 16853947067300 0x690 [0x40]: PERF_RECORD_AUX offset: 0 size: 0x3a60 flags: 0 []
	...
	0 16853966602580 0x1758 [0x40]: PERF_RECORD_AUX offset: 0xc2470 size: 0x30 flags: 0 []
	4294967295 16853967119860 0x1818 [0x70]: PERF_RECORD_MMAP2 8723/8723: [0x5559e70000(0x54000) @ 0 00:17 5329258 0]: r-xp /usr/bin/top
	4294967295 16853967181620 0x1888 [0x88]: PERF_RECORD_MMAP2 8723/8723: [0x7f9ed06000(0x34000) @ 0 00:17 5214354 0]: r-xp /usr/lib/aarch64-linux-gnu/ld-2.31.so
	4294967295 16853967237180 0x1910 [0x68]: PERF_RECORD_MMAP2 8723/8723: [0x7f9ed36000(0x1000) @ 0 00:00 0 0]: r-xp [vdso]

A third scenario is when the majority of time is spent in a shared
library that is not loaded at startup. For example a dynamically loaded
plugin.

Testing
=======

Testing was done by checking if any samples that are present in the
old output are missing from the new output. Timestamps must be
stripped out with awk because now they are set to the last AUX sample,
rather than the first:

	./perf script $4 | awk '!($4="")' > new.script
	./perf-default script $4 | awk '!($4="")' > default.script
	comm -13 <(sort -u new.script) <(sort -u default.script)

Testing showed that the new output is a superset of the old. When lines
appear in the comm output, it is not because they are missing but
because [unknown] is now resolved to sensible locations. For example
last putp branch here now resolves to libtinfo, so it's not missing
from the output, but is actually improved:

Old:
	top 305 [001]  1 branches:uH: 402830 _init+0x30 (/usr/bin/top.procps) => 404a1c [unknown] (/usr/bin/top.procps)
	top 305 [001]  1 branches:uH: 404a20 [unknown] (/usr/bin/top.procps) => 402970 putp@plt+0x0 (/usr/bin/top.procps)
	top 305 [001]  1 branches:uH: 40297c putp@plt+0xc (/usr/bin/top.procps) => 0 [unknown] ([unknown])
New:
	top 305 [001]  1 branches:uH: 402830 _init+0x30 (/usr/bin/top.procps) => 404a1c [unknown] (/usr/bin/top.procps)
	top 305 [001]  1 branches:uH: 404a20 [unknown] (/usr/bin/top.procps) => 402970 putp@plt+0x0 (/usr/bin/top.procps)
	top 305 [001]  1 branches:uH: 40297c putp@plt+0xc (/usr/bin/top.procps) => 7f8ab39208 putp+0x0 (/lib/libtinfo.so.5.9)

In the following two modes, decoding now works and the "data has no
samples!" error is not displayed any more:

	perf record -e cs_etm/@tmc_etr0/u -- bash -c top
	perf record -e cs_etm/@tmc_etr0/u --delay=1 top

In snapshot mode, there is also an improvement to decoding. Previously
samples for the 'kill' process that was used to send SIGUSR2 were
completely missing, because the process hadn't started yet. But now
there are additional samples present:

	perf record -e cs_etm/@tmc_etr0/u --snapshot -a
	perf script

		stress 19380 [003] 161627.938153:    1000000    instructions:uH:      aaaabb612fb4 [unknown] (/usr/bin/stress)
		  kill 19644 [000] 161627.938153:    1000000    instructions:uH:      ffffae0ef210 [unknown] (/lib/aarch64-linux-gnu/ld-2.27.so)
		stress 19380 [003] 161627.938153:    1000000    instructions:uH:      ffff9e754d40 random_r+0x20 (/lib/aarch64-linux-gnu/libc-2.27.so)

Also tested was the round trip of 'perf inject' followed by 'perf
report' which has the same differences and improvements.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Branislav Rankov <branislav.rankov@arm.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210609130421.13934-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:37 -03:00
Leo Yan
8941ba502f perf arm-spe: Don't wait for PERF_RECORD_EXIT event
When decode Arm SPE trace, it waits for PERF_RECORD_EXIT event (the last
perf event) for processing trace data, which is needless and even might
cause logic error, e.g. it might fail to correlate perf events with Arm
SPE events correctly.

So this patch removes the condition checking for PERF_RECORD_EXIT event.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210519071939.1598923-6-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:36 -03:00
Leo Yan
afb5e9e47f perf arm-spe: Bail out if the trace is later than perf event
It's possible that record in Arm SPE trace is later than perf event and
vice versa.  This asks to correlate the perf events and Arm SPE
synthesized events to be processed in the manner of correct timing.

To achieve the time ordering, this patch reverses the flow, it firstly
calls arm_spe_sample() and then calls arm_spe_decode().  By comparing
the timestamp value and detect the perf event is coming earlier than Arm
SPE trace data, it bails out from the decoding loop, the last record is
pushed into auxtrace stack and is deferred to generate sample.  To track
the timestamp, everytime it updates timestamp for the latest record.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519071939.1598923-5-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:36 -03:00
Leo Yan
85498f756f perf arm-spe: Assign kernel time to synthesized event
In current code, it assigns the arch timer counter to the synthesized
samples Arm SPE trace, thus the samples don't contain the kernel time
but only contain the raw counter value.

To fix the issue, this patch converts the timer counter to kernel time
and assigns it to sample timestamp.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519071939.1598923-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:36 -03:00
Leo Yan
630519014c perf arm-spe: Convert event kernel time to counter value
When handle a perf event, Arm SPE decoder needs to decide if this perf
event is earlier or later than the samples from Arm SPE trace data; to
do comparision, it needs to use the same unit for the time.

This patch converts the event kernel time to arch timer's counter value,
thus it can be used to compare with counter value contained in Arm SPE
Timestamp packet.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519071939.1598923-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:36 -03:00
Leo Yan
c210c30696 perf arm-spe: Save clock parameters from TIME_CONV event
During the recording phase, "perf record" tool synthesizes event
PERF_RECORD_TIME_CONV for the hardware clock parameters and saves the
event into the data file.

Afterwards, when processing the data file, the event TIME_CONV will be
processed at the very early time and is stored into session context.

This patch extracts these parameters from the session context and saves
into the structure "spe->tc" with the type perf_tsc_conversion, so that
the parameters are ready for conversion between clock counter and time
stamp.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519071939.1598923-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:36 -03:00
Leo Yan
2f01c200d4 perf cs-etm: Remove callback cs_etm_find_snapshot()
The callback cs_etm_find_snapshot() is invoked for snapshot mode, its
main purpose is to find the correct AUX trace data and returns "head"
and "old" (we can call "old" as "old head") to the caller, the caller
__auxtrace_mmap__read() uses these two pointers to decide the AUX trace
data size.

This patch removes cs_etm_find_snapshot() with below reasons:

- The first thing in cs_etm_find_snapshot() is to check if the head has
  wrapped around, if it is not, directly bails out.  The checking is
  pointless, this is because the "head" and "old" pointers both are
  monotonical increasing so they never wrap around.

- cs_etm_find_snapshot() adjusts the "head" and "old" pointers and
  assumes the AUX ring buffer is fully filled with the hardware trace
  data, so it always subtracts the difference "mm->len" from "head" to
  get "old".  Let's imagine the snapshot is taken in very short
  interval, the tracers only fill a small chunk of the trace data into
  the AUX ring buffer, in this case, it's wrongly to copy the whole the
  AUX ring buffer to perf file.

- As the "head" and "old" pointers are monotonically increased, the
  function __auxtrace_mmap__read() handles these two pointers properly.
  It calculates the reminders for these two pointers, and the size is
  clamped to be never more than "snapshot_size".  We can simply reply on
  the function __auxtrace_mmap__read() to calculate the correct result
  for data copying, it's not necessary to add Arm CoreSight specific
  callback.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Daniel Kiss <daniel.kiss@arm.com>
Cc: Denis Nikitin <denik@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Link: http://lore.kernel.org/lkml/20210701093537.90759-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:36 -03:00
Namhyung Kim
d6a735ef32 perf bpf_counter: Move common functions to bpf_counter.h
Some helper functions will be used for cgroup counting too.  Move them
to a header file for sharing.

Committer notes:

Fix the build on older systems with:

  -       struct bpf_map_info map_info = {0};
  +       struct bpf_map_info map_info = { .id = 0, };

This wasn't breaking the build in such systems as bpf_counter.c isn't
built due to:

tools/perf/util/Build:

  perf-$(CONFIG_PERF_BPF_SKEL) += bpf_counter.o

The bpf_counter.h file on the other hand is included from places that
are built everywhere.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210625071826.608504-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:19 -03:00
Namhyung Kim
21bcc72661 perf tools: Add cgroup_is_v2() helper
The cgroup_is_v2() is to check if the given subsystem is mounted on
cgroup v2 or not.  It'll be used by BPF cgroup code later.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210625071826.608504-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 15:00:33 -03:00
Namhyung Kim
69e874db4d perf tools: Add read_cgroup_id() function
The read_cgroup_id() is to read a cgroup id from a file handle using
name_to_handle_at(2) for the given cgroup.  It'll be used by bperf
cgroup stat later.

Committer notes:

  -int read_cgroup_id(struct cgroup *cgrp)
  +static inline int read_cgroup_id(struct cgroup *cgrp __maybe_unused)

To fix the build when HAVE_FILE_HANDLE is not defined.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210625071826.608504-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 15:00:03 -03:00
Joshua Martinez
51f382428c perf top: Add cgroup support for perf top (-G)
Added callback option (-G) to support cgroups for 'perf top'.

Added condition to make sure -cgroup and --all-cgroups aren't both enabled.

Example:

  $perf top -e cycles -G system.slice/docker-6b95a5eb649c0d671eba3835f0d93973d05a088f3ae8602246bde37affb1ba3e.scope -a --stdio

   PerfTop:    3330 irqs/sec  kernel:68.2%  exact:  0.0% lost: 0/0 drop: 0/11075 [4000Hz cpu-clock],  (all, 4 CPUs)
   -------------------------------------------------------------------------------------------------------------------------------------------------------

    27.32%  [unknown]         [.] 0x00007f8ab7b69352
    11.44%  [kernel]          [k] 0xffffffff968cd657
     3.12%  [kernel]          [k] 0xffffffff96160e96
     2.63%  [kernel]          [k] 0xffffffff96160eb0
     1.96%  [kernel]          [k] 0xffffffff9615fcf6
     1.42%  [kernel]          [k] 0xffffffff964ddfc7
     1.09%  [kernel]          [k] 0xffffffff96160e90
     0.81%  [kernel]          [k] 0xffffffff96160eb3
     0.67%  [kernel]          [k] 0xffffffff9615fec1
     0.57%  [kernel]          [k] 0xffffffff961ee1d0
     0.53%  [unknown]         [.] 0x00007f8ab7b6666c
     0.53%  [kernel]          [k] 0xffffffff96160e64
     0.52%  [kernel]          [k] 0xffffffff9616c303
     0.51%  [kernel]          [k] 0xffffffffc08e7d50
     ...

Signed-off-by: Joshua Martinez <joshuamart@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: joshua martinez <joshuamart@google.com>
Link: http://lore.kernel.org/lkml/20210616231829.3735671-1-joshuamart@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-24 15:33:35 -03:00
Adrian Hunter
b743b86ce6 perf script: Share addr_al between functions
Share the addr_location of 'addr' so that it need not be resolved more than
once.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210621150514.32159-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-22 15:21:54 -03:00
Adrian Hunter
4371fbc0c9 perf script: Move filtering before scripting
To make it possible to use filtering with scripts, move filtering before
scripting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210621150514.32159-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-22 15:18:15 -03:00
Adrian Hunter
9300041c66 perf script: Move filter_cpu() earlier
Generally, it should be more efficient if filter_cpu() comes before
machine__resolve() because filter_cpu() is much less code than
machine__resolve().

Example:

 $ perf record --sample-cpu -- make -C tools/perf >/dev/null

Before:

 $ perf stat -- perf script -C 0 >/dev/null

  Performance counter stats for 'perf script -C 0':

            116.94 msec task-clock                #    0.992 CPUs utilized
                 2      context-switches          #   17.103 /sec
                 0      cpu-migrations            #    0.000 /sec
             8,187      page-faults               #   70.011 K/sec
       478,351,812      cycles                    #    4.091 GHz
       564,785,464      instructions              #    1.18  insn per cycle
       114,341,105      branches                  #  977.789 M/sec
         2,615,495      branch-misses             #    2.29% of all branches

       0.117840576 seconds time elapsed

       0.085040000 seconds user
       0.032396000 seconds sys

After:

 $ perf stat -- perf script -C 0 >/dev/null

  Performance counter stats for 'perf script -C 0':

            107.45 msec task-clock                #    0.992 CPUs utilized
                 3      context-switches          #   27.919 /sec
                 0      cpu-migrations            #    0.000 /sec
             7,964      page-faults               #   74.117 K/sec
       438,417,260      cycles                    #    4.080 GHz
       522,571,855      instructions              #    1.19  insn per cycle
       105,187,488      branches                  #  978.921 M/sec
         2,356,261      branch-misses             #    2.24% of all branches

       0.108282546 seconds time elapsed

       0.095935000 seconds user
       0.011991000 seconds sys

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210621150514.32159-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-22 15:15:42 -03:00
Ian Rogers
e3c9cfd07d perf test: Pass the verbose option to shell tests
Having a verbose option will allow shell tests to provide extra failure
details when the fail or skip.

Committer notes:

Keep the 'script' variable at PATH_MAX, as its just something we'll pass
to system(), not really a "path", so being arbitrary, reduce the patch
size by not adding the three extra bytes to the 'script' variable.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210621215648.2991319-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-22 14:52:09 -03:00
Arnaldo Carvalho de Melo
ce09673636 Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes, since perf/urgent is already upstream.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-22 13:56:50 -03:00
Arnaldo Carvalho de Melo
ef83f9efe8 perf beauty: Update copy of linux/socket.h with the kernel sources
To pick the changes in:

  ea6932d70e ("net: make get_net_ns return error if NET_NS is disabled")

That don't result in any changes in the tables generated from that
header.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h'
  diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h

Cc: Changbin Du <changbin.du@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-19 10:09:08 -03:00
Ian Rogers
482698c2f8 perf test: Fix non-bash issue with stat bpf counters
$(( .. )) is a bash feature but the test's interpreter is !/bin/sh,
switch the code to use expr.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210617184216.2075588-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-19 10:06:46 -03:00
Riccardo Mancini
c087e9480c perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL
ASan reported a memory leak of BPF-related ksymbols map and dso. The
leak is caused by refount never reaching 0, due to missing __put calls
in the function machine__process_ksymbol_register.

Once the dso is inserted in the map, dso__put() should be called
(map__new2() increases the refcount to 2).

The same thing applies for the map when it's inserted into maps
(maps__insert() increases the refcount to 2).

  $ sudo ./perf record -- sleep 5
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.025 MB perf.data (8 samples) ]

  =================================================================
  ==297735==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 6992 byte(s) in 19 object(s) allocated from:
      #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
      #1 0x8e4e53 in map__new2 /home/user/linux/tools/perf/util/map.c:216:20
      #2 0x8cf68c in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:778:10
      [...]

  Indirect leak of 8702 byte(s) in 19 object(s) allocated from:
      #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
      #1 0x8728d7 in dso__new_id /home/user/linux/tools/perf/util/dso.c:1256:20
      #2 0x872015 in dso__new /home/user/linux/tools/perf/util/dso.c:1295:9
      #3 0x8cf623 in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:774:21
      [...]

  Indirect leak of 1520 byte(s) in 19 object(s) allocated from:
      #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
      #1 0x87b3da in symbol__new /home/user/linux/tools/perf/util/symbol.c:269:23
      #2 0x888954 in map__process_kallsym_symbol /home/user/linux/tools/perf/util/symbol.c:710:8
      [...]

  Indirect leak of 1406 byte(s) in 19 object(s) allocated from:
      #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
      #1 0x87b3da in symbol__new /home/user/linux/tools/perf/util/symbol.c:269:23
      #2 0x8cfbd8 in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:803:8
      [...]

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tommi Rantala <tommi.t.rantala@nokia.com>
Link: http://lore.kernel.org/lkml/20210612173751.188582-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-19 10:06:46 -03:00
John Garry
fe7a98b9d9 perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter()
The error code is not set at all in the sys event iter function.

This may lead to an uninitialized value of "ret" in
metricgroup__add_metric() when no CPU metric is added.

Fix by properly setting the error code.

It is not necessary to init "ret" to 0 in metricgroup__add_metric(), as
if we have no CPU or sys event metric matching, then "has_match" should
be 0 and "ret" is set to -EINVAL.

However gcc cannot detect that it may not have been set after the
map_for_each_metric() loop for CPU metrics, which is strange.

Fixes: be335ec28e ("perf metricgroup: Support adding metrics for system PMUs")
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1623335580-187317-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-19 10:06:46 -03:00
John Garry
fc96ec4d5d perf metricgroup: Fix find_evsel_group() event selector
The following command segfaults on my x86 broadwell:

  $ ./perf stat  -M frontend_bound,retiring,backend_bound,bad_speculation sleep 1
  WARNING: grouped events cpus do not match, disabling group:
    anon group { raw 0x10e }
    anon group { raw 0x10e }
  perf: util/evsel.c:1596: get_group_fd: Assertion `!(!leader->core.fd)' failed.
  Aborted (core dumped)

The issue shows itself as a use-after-free in evlist__check_cpu_maps(),
whereby the leader of an event selector (evsel) has been deleted (yet we
still attempt to verify for an evsel).

Fundamentally the problem comes from metricgroup__setup_events() ->
find_evsel_group(), and has developed from the previous fix attempt in
commit 9c880c24cb ("perf metricgroup: Fix for metrics containing
duration_time").

The problem now is that the logic in checking if an evsel is in the same
group is subtly broken for the "cycles" event. For the "cycles" event,
the pmu_name is NULL; however the logic in find_evsel_group() may set an
event matched against "cycles" as used, when it should not be.

This leads to a condition where an evsel is set, yet its leader is not.

Fix the check for evsel pmu_name by not matching evsels when either has a
NULL pmu_name.

There is still a pre-existing metric issue whereby the ordering of the
metrics may break the 'stat' function, as discussed at:
https://lore.kernel.org/lkml/49c6fccb-b716-1bf0-18a6-cace1cdb66b9@huawei.com/

Fixes: 9c880c24cb ("perf metricgroup: Fix for metrics containing duration_time")
Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> # On a Thinkpad T450S
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1623335580-187317-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-19 10:06:46 -03:00
Masami Hiramatsu
45237f9898 perf probe: Add --bootconfig to output definition in bootconfig format
Now the boot-time tracing supports kprobes events and that must be
written in bootconfig file in the following format.

  ftrace.event.kprobes.<EVENT_NAME>.probes = <PROBE-DEF>

'perf probe' already supports --definition (-D) action to show probe
definitions, but the format is for tracefs:

  [p|r][:EVENT_NAME] <PROBE-DEF>

This patch adds the --bootconfig option for -D action so that it outputs
the probe definitions in bootconfig format. E.g.

  $ perf probe --bootconfig -D "path_lookupat:7 err:s32 s:string"
  ftrace.event.kprobes.path_lookupat_L7.probe = 'path_lookupat.isra.0+309 err_s32=%ax:s32 s_string=+0(%r13):string'

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/162282412351.452340.14871995440005640114.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-18 13:50:05 -03:00
Masami Hiramatsu
d26ea48144 perf probe: Cleanup synthesize_probe_trace_command()
Cleanup synthesize_probe_trace_command() to simplify the code path.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/162282411361.452340.16886399333622147122.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-18 13:50:05 -03:00
Masami Hiramatsu
f338de2219 perf probe: Support probes on init functions for offline kernel
'perf probe' internally checks the probe target is in the text area in
post-process (after analyzing debuginfo). But it fails if the probe
target is in the "inittext".

This is a good limitation for the online kernel because such functions
have gone after booting. However, for using it for boot-time tracing,
user may want to put a probe on init functions.

This skips the post checking process if the target is offline kenrel so
that user can get the probe definition on the init functions.

Without this patch:

  $ perf probe -k ./build-x86_64/vmlinux -D do_mount_root:10
  Probe point 'do_mount_root:10' not found.
    Error: Failed to add events.

With this patch:

  $ perf probe -k ./build-x86_64/vmlinux -D do_mount_root:10
  p:probe/do_mount_root_L10 mount_block_root+300

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/162282410293.452340.13347006295826431632.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-18 13:50:05 -03:00
Ian Rogers
a49ed2b4e2 perf test: Make stat bpf counters test more robust
If the test is run on a hypervisor then the cycles event may not be
counted, skip the test in this situation. Fail the test if cycles are
not counted in the subsequent bpf counter run.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210617184216.2075588-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-18 13:50:05 -03:00
Ian Rogers
2638fbd351 perf test: Add verbose skip output for bpf counters
Provide additional context for when the stat bpf counters test skips.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210617184216.2075588-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-18 13:50:05 -03:00
Yang Jihong
4bcbe438b3 perf annotate: Add itrace options support
The "auxtrace_info" and "auxtrace" functions are not set in "tool" member of
"annotate". As a result, perf annotate does not support parsing itrace data.

Before:

  # perf record -e arm_spe_0/branch_filter=1/ -a sleep 1
  [ perf record: Woken up 9 times to write data ]
  [ perf record: Captured and wrote 20.874 MB perf.data ]
  # perf annotate --stdio
  Error:
  The perf.data data has no samples!

Solution:

1. Add itrace options in help,
2. Set hook functions of "id_index", "auxtrace_info" and "auxtrace" in perf_tool.

After:

  # perf record --all-user -e arm_spe_0/branch_filter=1/ ls
  Couldn't synthesize bpf events.
  perf.data
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.010 MB perf.data ]
  # perf annotate --stdio
   Percent |      Source code & Disassembly of libc-2.28.so for branch-miss (1 samples, percent: local period)
  ------------------------------------------------------------------------------------------------------------
           :
           :
           :
           :           Disassembly of section .text:
           :
           :           0000000000066180 <__getdelim@@GLIBC_2.17>:
      0.00 :   66180:  stp     x29, x30, [sp, #-96]!
      0.00 :   66184:  cmp     x0, #0x0
      0.00 :   66188:  ccmp    x1, #0x0, #0x4, ne  // ne = any
      0.00 :   6618c:  mov     x29, sp
      0.00 :   66190:  stp     x24, x25, [sp, #56]
      0.00 :   66194:  stp     x26, x27, [sp, #72]
      0.00 :   66198:  str     x28, [sp, #88]
      0.00 :   6619c:  b.eq    66450 <__getdelim@@GLIBC_2.17+0x2d0>  // b.none
      0.00 :   661a0:  stp     x22, x23, [x29, #40]
      0.00 :   661a4:  mov     x22, x1
      0.00 :   661a8:  ldr     w1, [x3]
      0.00 :   661ac:  mov     w23, w2
      0.00 :   661b0:  stp     x20, x21, [x29, #24]
      0.00 :   661b4:  mov     x20, x3
      0.00 :   661b8:  mov     x21, x0
      0.00 :   661bc:  tbnz    w1, #15, 66360 <__getdelim@@GLIBC_2.17+0x1e0>
      0.00 :   661c0:  ldr     x0, [x3, #136]
      0.00 :   661c4:  ldr     x2, [x0, #8]
      0.00 :   661c8:  str     x19, [x29, #16]
      0.00 :   661cc:  mrs     x19, tpidr_el0
      0.00 :   661d0:  sub     x19, x19, #0x700
      0.00 :   661d4:  cmp     x2, x19
      0.00 :   661d8:  b.eq    663f0 <__getdelim@@GLIBC_2.17+0x270>  // b.none
      0.00 :   661dc:  mov     w1, #0x1                        // #1
      0.00 :   661e0:  ldaxr   w2, [x0]
      0.00 :   661e4:  cmp     w2, #0x0
      0.00 :   661e8:  b.ne    661f4 <__getdelim@@GLIBC_2.17+0x74>  // b.any
      0.00 :   661ec:  stxr    w3, w1, [x0]
      0.00 :   661f0:  cbnz    w3, 661e0 <__getdelim@@GLIBC_2.17+0x60>
      0.00 :   661f4:  b.ne    66448 <__getdelim@@GLIBC_2.17+0x2c8>  // b.any
      0.00 :   661f8:  ldr     x0, [x20, #136]
      0.00 :   661fc:  ldr     w1, [x20]
      0.00 :   66200:  ldr     w2, [x0, #4]
      0.00 :   66204:  str     x19, [x0, #8]
      0.00 :   66208:  add     w2, w2, #0x1
      0.00 :   6620c:  str     w2, [x0, #4]
      0.00 :   66210:  tbnz    w1, #5, 66388 <__getdelim@@GLIBC_2.17+0x208>
      0.00 :   66214:  ldr     x19, [x29, #16]
      0.00 :   66218:  ldr     x0, [x21]
      0.00 :   6621c:  cbz     x0, 66228 <__getdelim@@GLIBC_2.17+0xa8>
      0.00 :   66220:  ldr     x0, [x22]
      0.00 :   66224:  cbnz    x0, 6623c <__getdelim@@GLIBC_2.17+0xbc>
      0.00 :   66228:  mov     x0, #0x78                       // #120
      0.00 :   6622c:  str     x0, [x22]
      0.00 :   66230:  bl      20710 <malloc@plt>
      0.00 :   66234:  str     x0, [x21]
      0.00 :   66238:  cbz     x0, 66428 <__getdelim@@GLIBC_2.17+0x2a8>
      0.00 :   6623c:  ldr     x27, [x20, #8]
      0.00 :   66240:  str     x19, [x29, #16]
      0.00 :   66244:  ldr     x19, [x20, #16]
      0.00 :   66248:  sub     x19, x19, x27
      0.00 :   6624c:  cmp     x19, #0x0
      0.00 :   66250:  b.le    66398 <__getdelim@@GLIBC_2.17+0x218>
      0.00 :   66254:  mov     x25, #0x0                       // #0
      0.00 :   66258:  b       662d8 <__getdelim@@GLIBC_2.17+0x158>
      0.00 :   6625c:  nop
      0.00 :   66260:  add     x24, x19, x25
      0.00 :   66264:  ldr     x3, [x22]
      0.00 :   66268:  add     x26, x24, #0x1
      0.00 :   6626c:  ldr     x0, [x21]
      0.00 :   66270:  cmp     x3, x26
      0.00 :   66274:  b.cs    6629c <__getdelim@@GLIBC_2.17+0x11c>  // b.hs, b.nlast
      0.00 :   66278:  lsl     x3, x3, #1
      0.00 :   6627c:  cmp     x3, x26
      0.00 :   66280:  csel    x26, x3, x26, cs  // cs = hs, nlast
      0.00 :   66284:  mov     x1, x26
      0.00 :   66288:  bl      206f0 <realloc@plt>
      0.00 :   6628c:  cbz     x0, 66438 <__getdelim@@GLIBC_2.17+0x2b8>
      0.00 :   66290:  str     x0, [x21]
      0.00 :   66294:  ldr     x27, [x20, #8]
      0.00 :   66298:  str     x26, [x22]
      0.00 :   6629c:  mov     x2, x19
      0.00 :   662a0:  mov     x1, x27
      0.00 :   662a4:  add     x0, x0, x25
      0.00 :   662a8:  bl      87390 <explicit_bzero@@GLIBC_2.25+0x50>
      0.00 :   662ac:  ldr     x0, [x20, #8]
      0.00 :   662b0:  add     x19, x0, x19
      0.00 :   662b4:  str     x19, [x20, #8]
      0.00 :   662b8:  cbnz    x28, 66410 <__getdelim@@GLIBC_2.17+0x290>
      0.00 :   662bc:  mov     x0, x20
      0.00 :   662c0:  bl      73b80 <__underflow@@GLIBC_2.17>
      0.00 :   662c4:  cmn     w0, #0x1
      0.00 :   662c8:  b.eq    66410 <__getdelim@@GLIBC_2.17+0x290>  // b.none
      0.00 :   662cc:  ldp     x27, x19, [x20, #8]
      0.00 :   662d0:  mov     x25, x24
      0.00 :   662d4:  sub     x19, x19, x27
      0.00 :   662d8:  mov     x2, x19
      0.00 :   662dc:  mov     w1, w23
      0.00 :   662e0:  mov     x0, x27
      0.00 :   662e4:  bl      807b0 <memchr@@GLIBC_2.17>
      0.00 :   662e8:  cmp     x0, #0x0
      0.00 :   662ec:  mov     x28, x0
      0.00 :   662f0:  sub     x0, x0, x27
      0.00 :   662f4:  csinc   x19, x19, x0, eq  // eq = none
      0.00 :   662f8:  mov     x0, #0x7fffffffffffffff         // #9223372036854775807
      0.00 :   662fc:  sub     x0, x0, x25
      0.00 :   66300:  cmp     x19, x0
      0.00 :   66304:  b.lt    66260 <__getdelim@@GLIBC_2.17+0xe0>  // b.tstop
      0.00 :   66308:  adrp    x0, 17f000 <sys_sigabbrev@@GLIBC_2.17+0x320>
      0.00 :   6630c:  ldr     x0, [x0, #3624]
      0.00 :   66310:  mrs     x2, tpidr_el0
      0.00 :   66314:  ldr     x19, [x29, #16]
      0.00 :   66318:  mov     w3, #0x4b                       // #75
      0.00 :   6631c:  ldr     w1, [x20]
      0.00 :   66320:  mov     x24, #0xffffffffffffffff        // #-1
      0.00 :   66324:  str     w3, [x2, x0]
      0.00 :   66328:  tbnz    w1, #15, 66340 <__getdelim@@GLIBC_2.17+0x1c0>
      0.00 :   6632c:  ldr     x0, [x20, #136]
      0.00 :   66330:  ldr     w1, [x0, #4]
      0.00 :   66334:  sub     w1, w1, #0x1
      0.00 :   66338:  str     w1, [x0, #4]
      0.00 :   6633c:  cbz     w1, 663b8 <__getdelim@@GLIBC_2.17+0x238>
      0.00 :   66340:  mov     x0, x24
      0.00 :   66344:  ldr     x28, [sp, #88]
      0.00 :   66348:  ldp     x20, x21, [x29, #24]
      0.00 :   6634c:  ldp     x22, x23, [x29, #40]
      0.00 :   66350:  ldp     x24, x25, [sp, #56]
      0.00 :   66354:  ldp     x26, x27, [sp, #72]
      0.00 :   66358:  ldp     x29, x30, [sp], #96
      0.00 :   6635c:  ret
    100.00 :   66360:  tbz     w1, #5, 66218 <__getdelim@@GLIBC_2.17+0x98>
      0.00 :   66364:  ldp     x20, x21, [x29, #24]
      0.00 :   66368:  mov     x24, #0xffffffffffffffff        // #-1
      0.00 :   6636c:  ldp     x22, x23, [x29, #40]
      0.00 :   66370:  mov     x0, x24
      0.00 :   66374:  ldp     x24, x25, [sp, #56]
      0.00 :   66378:  ldp     x26, x27, [sp, #72]
      0.00 :   6637c:  ldr     x28, [sp, #88]
      0.00 :   66380:  ldp     x29, x30, [sp], #96
      0.00 :   66384:  ret
      0.00 :   66388:  mov     x24, #0xffffffffffffffff        // #-1
      0.00 :   6638c:  ldr     x19, [x29, #16]
      0.00 :   66390:  b       66328 <__getdelim@@GLIBC_2.17+0x1a8>
      0.00 :   66394:  nop
      0.00 :   66398:  mov     x0, x20
      0.00 :   6639c:  bl      73b80 <__underflow@@GLIBC_2.17>
      0.00 :   663a0:  cmn     w0, #0x1
      0.00 :   663a4:  b.eq    66438 <__getdelim@@GLIBC_2.17+0x2b8>  // b.none
      0.00 :   663a8:  ldp     x27, x19, [x20, #8]
      0.00 :   663ac:  sub     x19, x19, x27
      0.00 :   663b0:  b       66254 <__getdelim@@GLIBC_2.17+0xd4>
      0.00 :   663b4:  nop
      0.00 :   663b8:  str     xzr, [x0, #8]
      0.00 :   663bc:  ldxr    w2, [x0]
      0.00 :   663c0:  stlxr   w3, w1, [x0]
      0.00 :   663c4:  cbnz    w3, 663bc <__getdelim@@GLIBC_2.17+0x23c>
      0.00 :   663c8:  cmp     w2, #0x1
      0.00 :   663cc:  b.le    66340 <__getdelim@@GLIBC_2.17+0x1c0>
      0.00 :   663d0:  mov     x1, #0x81                       // #129
      0.00 :   663d4:  mov     x2, #0x1                        // #1
      0.00 :   663d8:  mov     x3, #0x0                        // #0
      0.00 :   663dc:  mov     x8, #0x62                       // #98
      0.00 :   663e0:  svc     #0x0
      0.00 :   663e4:  ldp     x20, x21, [x29, #24]
      0.00 :   663e8:  ldp     x22, x23, [x29, #40]
      0.00 :   663ec:  b       66370 <__getdelim@@GLIBC_2.17+0x1f0>
      0.00 :   663f0:  ldr     w2, [x0, #4]
      0.00 :   663f4:  add     w2, w2, #0x1
      0.00 :   663f8:  str     w2, [x0, #4]
      0.00 :   663fc:  tbz     w1, #5, 66214 <__getdelim@@GLIBC_2.17+0x94>
      0.00 :   66400:  mov     x24, #0xffffffffffffffff        // #-1
      0.00 :   66404:  ldr     x19, [x29, #16]
      0.00 :   66408:  b       66330 <__getdelim@@GLIBC_2.17+0x1b0>
      0.00 :   6640c:  nop
      0.00 :   66410:  ldr     x0, [x21]
      0.00 :   66414:  strb    wzr, [x0, x24]
      0.00 :   66418:  ldr     w1, [x20]
      0.00 :   6641c:  ldr     x19, [x29, #16]
      0.00 :   66420:  b       66328 <__getdelim@@GLIBC_2.17+0x1a8>
      0.00 :   66424:  nop
      0.00 :   66428:  mov     x24, #0xffffffffffffffff        // #-1
      0.00 :   6642c:  ldr     w1, [x20]
      0.00 :   66430:  b       66328 <__getdelim@@GLIBC_2.17+0x1a8>
      0.00 :   66434:  nop
      0.00 :   66438:  mov     x24, #0xffffffffffffffff        // #-1
      0.00 :   6643c:  ldr     w1, [x20]
      0.00 :   66440:  ldr     x19, [x29, #16]
      0.00 :   66444:  b       66328 <__getdelim@@GLIBC_2.17+0x1a8>
      0.00 :   66448:  bl      e3ba0 <pthread_setcanceltype@@GLIBC_2.17+0x30>
      0.00 :   6644c:  b       661f8 <__getdelim@@GLIBC_2.17+0x78>
      0.00 :   66450:  adrp    x0, 17f000 <sys_sigabbrev@@GLIBC_2.17+0x320>
      0.00 :   66454:  ldr     x0, [x0, #3624]
      0.00 :   66458:  mrs     x1, tpidr_el0
      0.00 :   6645c:  mov     w2, #0x16                       // #22
      0.00 :   66460:  mov     x24, #0xffffffffffffffff        // #-1
      0.00 :   66464:  str     w2, [x1, x0]
      0.00 :   66468:  b       66370 <__getdelim@@GLIBC_2.17+0x1f0>
      0.00 :   6646c:  ldr     w1, [x20]
      0.00 :   66470:  mov     x4, x0
      0.00 :   66474:  tbnz    w1, #15, 6648c <__getdelim@@GLIBC_2.17+0x30c>
      0.00 :   66478:  ldr     x0, [x20, #136]
      0.00 :   6647c:  ldr     w1, [x0, #4]
      0.00 :   66480:  sub     w1, w1, #0x1
      0.00 :   66484:  str     w1, [x0, #4]
      0.00 :   66488:  cbz     w1, 66494 <__getdelim@@GLIBC_2.17+0x314>
      0.00 :   6648c:  mov     x0, x4
      0.00 :   66490:  bl      20e40 <gnu_get_libc_version@@GLIBC_2.17+0x130>
      0.00 :   66494:  str     xzr, [x0, #8]
      0.00 :   66498:  ldxr    w2, [x0]
      0.00 :   6649c:  stlxr   w3, w1, [x0]
      0.00 :   664a0:  cbnz    w3, 66498 <__getdelim@@GLIBC_2.17+0x318>
      0.00 :   664a4:  cmp     w2, #0x1
      0.00 :   664a8:  b.le    6648c <__getdelim@@GLIBC_2.17+0x30c>
      0.00 :   664ac:  mov     x1, #0x81                       // #129
      0.00 :   664b0:  mov     x2, #0x1                        // #1
      0.00 :   664b4:  mov     x3, #0x0                        // #0
      0.00 :   664b8:  mov     x8, #0x62                       // #98
      0.00 :   664bc:  svc     #0x0
      0.00 :   664c0:  b       6648c <__getdelim@@GLIBC_2.17+0x30c>

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210615091704.259202-1-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-16 15:07:42 -03:00
Li Huafei
28b8e87abf perf mem-events: Remove duplicate #undef
Remove duplicate '#undef E'.

Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zhang Jinhao <zhangjinhao2@huawei.com>
Link: http://lore.kernel.org/lkml/20210616120339.219807-1-lihuafei1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-16 15:05:24 -03:00
Leo Yan
197eecb6ec perf session: Correct buffer copying when peeking events
When peeking an event, it has a short path and a long path.  The short
path uses the session pointer "one_mmap_addr" to directly fetch the
event; and the long path needs to read out the event header and the
following event data from file and fill into the buffer pointer passed
through the argument "buf".

The issue is in the long path that it copies the event header and event
data into the same destination address which pointer "buf", this means
the event header is overwritten.  We are just lucky to run into the
short path in most cases, so we don't hit the issue in the long path.

This patch adds the offset "hdr_sz" to the pointer "buf" when copying
the event data, so that it can reserve the event header which can be
used properly by its caller.

Fixes: 5a52f33adf ("perf session: Add perf_session__peek_event()")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210605052957.1070720-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-11 12:54:24 -03:00
Jin Yao
1fcc57b7e5 perf evsel: Adjust hybrid event and global event mixed group
A group mixed with hybrid event and global event is allowed. For
example, group leader is 'intel_pt//' and the group member is
'cpu_atom/cycles/'.

e.g.:

  # perf record --aux-sample -e '{intel_pt//,cpu_atom/cycles/}:u'

The challenge is that their available cpus are not fully matched. For
example, 'intel_pt//' is available on CPU0-CPU23, but 'cpu_atom/cycles/'
is available on CPU16-CPU23.

When getting the group id for group member, we must be very careful.
Because the cpu for 'intel_pt//' is not equal to the cpu for
'cpu_atom/cycles/'. Actually the cpu here is the index of evsel->core.cpus,
not the real CPU ID.

e.g. cpu0 for 'intel_pt//' is CPU0, but cpu0 for 'cpu_atom/cycles/' is CPU16.

Before:

  # perf record --aux-sample -e '{intel_pt//,cpu_atom/cycles/}:u' -vv uname
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             10
    size                             128
    config                           0xe601
    { sample_period, sample_freq }   1
    sample_type                      IP|TID|TIME|CPU|IDENTIFIER
    read_format                      ID
    disabled                         1
    inherit                          1
    exclude_kernel                   1
    exclude_hv                       1
    enable_on_exec                   1
    sample_id_all                    1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid 4084  cpu 0  group_fd -1  flags 0x8 = 5
  sys_perf_event_open: pid 4084  cpu 1  group_fd -1  flags 0x8 = 6
  sys_perf_event_open: pid 4084  cpu 2  group_fd -1  flags 0x8 = 7
  sys_perf_event_open: pid 4084  cpu 3  group_fd -1  flags 0x8 = 9
  sys_perf_event_open: pid 4084  cpu 4  group_fd -1  flags 0x8 = 10
  sys_perf_event_open: pid 4084  cpu 5  group_fd -1  flags 0x8 = 11
  sys_perf_event_open: pid 4084  cpu 6  group_fd -1  flags 0x8 = 12
  sys_perf_event_open: pid 4084  cpu 7  group_fd -1  flags 0x8 = 13
  sys_perf_event_open: pid 4084  cpu 8  group_fd -1  flags 0x8 = 14
  sys_perf_event_open: pid 4084  cpu 9  group_fd -1  flags 0x8 = 15
  sys_perf_event_open: pid 4084  cpu 10  group_fd -1  flags 0x8 = 16
  sys_perf_event_open: pid 4084  cpu 11  group_fd -1  flags 0x8 = 17
  sys_perf_event_open: pid 4084  cpu 12  group_fd -1  flags 0x8 = 18
  sys_perf_event_open: pid 4084  cpu 13  group_fd -1  flags 0x8 = 19
  sys_perf_event_open: pid 4084  cpu 14  group_fd -1  flags 0x8 = 20
  sys_perf_event_open: pid 4084  cpu 15  group_fd -1  flags 0x8 = 21
  sys_perf_event_open: pid 4084  cpu 16  group_fd -1  flags 0x8 = 22
  sys_perf_event_open: pid 4084  cpu 17  group_fd -1  flags 0x8 = 23
  sys_perf_event_open: pid 4084  cpu 18  group_fd -1  flags 0x8 = 24
  sys_perf_event_open: pid 4084  cpu 19  group_fd -1  flags 0x8 = 25
  sys_perf_event_open: pid 4084  cpu 20  group_fd -1  flags 0x8 = 26
  sys_perf_event_open: pid 4084  cpu 21  group_fd -1  flags 0x8 = 27
  sys_perf_event_open: pid 4084  cpu 22  group_fd -1  flags 0x8 = 28
  sys_perf_event_open: pid 4084  cpu 23  group_fd -1  flags 0x8 = 29
  ------------------------------------------------------------
  perf_event_attr:
    size                             128
    config                           0x800000000
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD|IDENTIFIER|AUX
    read_format                      ID
    inherit                          1
    exclude_kernel                   1
    exclude_hv                       1
    freq                             1
    sample_id_all                    1
    exclude_guest                    1
    aux_sample_size                  4096
  ------------------------------------------------------------
  sys_perf_event_open: pid 4084  cpu 16  group_fd 5  flags 0x8
  sys_perf_event_open failed, error -22

The group_fd 5 is not correct. It should be 22 (the fd of
'intel_pt' on CPU16).

After:

  # perf record --aux-sample -e '{intel_pt//,cpu_atom/cycles/}:u' -vv uname
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             10
    size                             128
    config                           0xe601
    { sample_period, sample_freq }   1
    sample_type                      IP|TID|TIME|CPU|IDENTIFIER
    read_format                      ID
    disabled                         1
    inherit                          1
    exclude_kernel                   1
    exclude_hv                       1
    enable_on_exec                   1
    sample_id_all                    1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid 5162  cpu 0  group_fd -1  flags 0x8 = 5
  sys_perf_event_open: pid 5162  cpu 1  group_fd -1  flags 0x8 = 6
  sys_perf_event_open: pid 5162  cpu 2  group_fd -1  flags 0x8 = 7
  sys_perf_event_open: pid 5162  cpu 3  group_fd -1  flags 0x8 = 9
  sys_perf_event_open: pid 5162  cpu 4  group_fd -1  flags 0x8 = 10
  sys_perf_event_open: pid 5162  cpu 5  group_fd -1  flags 0x8 = 11
  sys_perf_event_open: pid 5162  cpu 6  group_fd -1  flags 0x8 = 12
  sys_perf_event_open: pid 5162  cpu 7  group_fd -1  flags 0x8 = 13
  sys_perf_event_open: pid 5162  cpu 8  group_fd -1  flags 0x8 = 14
  sys_perf_event_open: pid 5162  cpu 9  group_fd -1  flags 0x8 = 15
  sys_perf_event_open: pid 5162  cpu 10  group_fd -1  flags 0x8 = 16
  sys_perf_event_open: pid 5162  cpu 11  group_fd -1  flags 0x8 = 17
  sys_perf_event_open: pid 5162  cpu 12  group_fd -1  flags 0x8 = 18
  sys_perf_event_open: pid 5162  cpu 13  group_fd -1  flags 0x8 = 19
  sys_perf_event_open: pid 5162  cpu 14  group_fd -1  flags 0x8 = 20
  sys_perf_event_open: pid 5162  cpu 15  group_fd -1  flags 0x8 = 21
  sys_perf_event_open: pid 5162  cpu 16  group_fd -1  flags 0x8 = 22
  sys_perf_event_open: pid 5162  cpu 17  group_fd -1  flags 0x8 = 23
  sys_perf_event_open: pid 5162  cpu 18  group_fd -1  flags 0x8 = 24
  sys_perf_event_open: pid 5162  cpu 19  group_fd -1  flags 0x8 = 25
  sys_perf_event_open: pid 5162  cpu 20  group_fd -1  flags 0x8 = 26
  sys_perf_event_open: pid 5162  cpu 21  group_fd -1  flags 0x8 = 27
  sys_perf_event_open: pid 5162  cpu 22  group_fd -1  flags 0x8 = 28
  sys_perf_event_open: pid 5162  cpu 23  group_fd -1  flags 0x8 = 29
  ------------------------------------------------------------
  perf_event_attr:
    size                             128
    config                           0x800000000
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD|IDENTIFIER|AUX
    read_format                      ID
    inherit                          1
    exclude_kernel                   1
    exclude_hv                       1
    freq                             1
    sample_id_all                    1
    exclude_guest                    1
    aux_sample_size                  4096
  ------------------------------------------------------------
  sys_perf_event_open: pid 5162  cpu 16  group_fd 22  flags 0x8 = 30
  sys_perf_event_open: pid 5162  cpu 17  group_fd 23  flags 0x8 = 31
  sys_perf_event_open: pid 5162  cpu 18  group_fd 24  flags 0x8 = 32
  sys_perf_event_open: pid 5162  cpu 19  group_fd 25  flags 0x8 = 33
  sys_perf_event_open: pid 5162  cpu 20  group_fd 26  flags 0x8 = 34
  sys_perf_event_open: pid 5162  cpu 21  group_fd 27  flags 0x8 = 35
  sys_perf_event_open: pid 5162  cpu 22  group_fd 28  flags 0x8 = 36
  sys_perf_event_open: pid 5162  cpu 23  group_fd 29  flags 0x8 = 37
  ------------------------------------------------------------
  ...

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210609044555.27180-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-10 13:41:50 -03:00
Masami Hiramatsu
0808b3d5b7 perf probe: Provide clearer message permission error for tracefs access
Report permission error for the tracefs open and rewrite whole the error
message code around it.

You'll see a hint according to what you want to do with perf probe as
below.

  $ perf probe -l
  No permission to read tracefs.
  Please try 'sudo mount -o remount,mode=755 /sys/kernel/tracing/'
    Error: Failed to show event list.

  $ perf probe -d \*
  No permission to write tracefs.
  Please run this command again with sudo.
    Error: Failed to delete events.

This also fixes -ENOTSUP checking for mounting tracefs/debugfs.
Actually open returns -ENOENT in that case and we have to check it with
current mount point list. If we unmount debugfs and tracefs perf probe
shows correct message as below.

  $ perf probe -l
  Debugfs or tracefs is not mounted
  Please try 'sudo mount -t tracefs nodev /sys/kernel/tracing/'
    Error: Failed to show event list.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/162299456839.503471.13863002017089255222.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-08 14:12:14 -03:00
Leo Yan
bde1e7d934 perf auxtrace: Change to use SMP memory barriers
The kernel and the userspace tool can access the AUX ring buffer head
and tail from different CPUs, thus SMP class of barriers are required
on SMP system.

This patch changes to use SMP barriers to replace mb() and rmb()
barriers.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20210602103007.184993-6-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-08 13:45:04 -03:00
Zou Wei
f54cad25a1 perf srccode: Use list_move() instead of equivalent list_del() + list_add() sequence
Using list_move() instead of list_del() + list_add(), shorter,
equivalent.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1623113566-49455-1-git-send-email-zou_wei@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-08 09:36:36 -03:00
Masami Hiramatsu
f4f1c42953 perf probe: Report possible permission error for map__load() failure
Report possible permission error including kptr_restrict setting
for map__load() failure. This can happen when non-superuser runs
perf probe.

With this patch, perf probe shows the following message.

 $ perf probe vfs_read
 Failed to load symbols from /proc/kallsyms
 Please ensure you can read the /proc/kallsyms symbol addresses.
 If the /proc/sys/kernel/kptr_restrict is '2', you can not read
 kernel symbol address even if you are a superuser. Please change
 it to '1'. If kptr_restrict is '1', the superuser can read the
 symbol addresses.
 In that case, please run this command again with sudo.
   Error: Failed to add events.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/162282065877.448336.10047912688119745151.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 15:43:37 -03:00
Riccardo Mancini
67069a1f0f perf env: Fix memory leak of bpf_prog_info_linear member
ASan reported a memory leak caused by info_linear not being deallocated.

The info_linear was allocated during in perf_event__synthesize_one_bpf_prog().

This patch adds the corresponding free() when bpf_prog_info_node
is freed in perf_env__purge_bpf().

  $ sudo ./perf record -- sleep 5
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.025 MB perf.data (8 samples) ]

  =================================================================
  ==297735==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 7688 byte(s) in 19 object(s) allocated from:
      #0 0x4f420f in malloc (/home/user/linux/tools/perf/perf+0x4f420f)
      #1 0xc06a74 in bpf_program__get_prog_info_linear /home/user/linux/tools/lib/bpf/libbpf.c:11113:16
      #2 0xb426fe in perf_event__synthesize_one_bpf_prog /home/user/linux/tools/perf/util/bpf-event.c:191:16
      #3 0xb42008 in perf_event__synthesize_bpf_events /home/user/linux/tools/perf/util/bpf-event.c:410:9
      #4 0x594596 in record__synthesize /home/user/linux/tools/perf/builtin-record.c:1490:8
      #5 0x58c9ac in __cmd_record /home/user/linux/tools/perf/builtin-record.c:1798:8
      #6 0x58990b in cmd_record /home/user/linux/tools/perf/builtin-record.c:2901:8
      #7 0x7b2a20 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
      #8 0x7b12ff in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
      #9 0x7b2583 in run_argv /home/user/linux/tools/perf/perf.c:409:2
      #10 0x7b0d79 in main /home/user/linux/tools/perf/perf.c:539:3
      #11 0x7fa357ef6b74 in __libc_start_main /usr/src/debug/glibc-2.33-8.fc34.x86_64/csu/../csu/libc-start.c:332:16

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lore.kernel.org/lkml/20210602224024.300485-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 10:26:20 -03:00
Masami Hiramatsu
fe4f3eb1fd perf probe: Add permission and sysctl notice to man page
Add a section to notify the permission and sysctl setting for perf
probe. And fix some indentations.

Reported-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/162204068898.388434.16842705842611255787.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 10:24:38 -03:00
Riccardo Mancini
69c9ffed6c perf symbol-elf: Fix memory leak by freeing sdt_note.args
Reported by ASan.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Jiri Slaby <jirislaby@kernel.org>
Link: http://lore.kernel.org/lkml/20210602220833.285226-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 10:06:27 -03:00
Namhyung Kim
3cc84399e9 perf stat: Honor event config name on --no-merge
If user gave an event name explicitly, it should be displayed in the
output as is.  But with --no-merge option it adds a pmu name at the
end so might confuse users.

Actually this is true for hybrid pmus, I think we should do the same
for others.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210602212241.2175005-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 10:05:23 -03:00
Namhyung Kim
2dc065eae5 perf evsel: Add missing cloning of evsel->use_config_name
The evsel__clone() should copy all fields in the evsel which are set
during the event parsing.  But it missed the use_config_name field.

Fixes: 12279429d8 ("perf stat: Uniquify hybrid event name")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210602212241.2175005-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 10:04:20 -03:00
Arnaldo Carvalho de Melo
67e446eb4d Revert "perf vendor events intel: Add metrics for Icelake Server"
It is making 'perf test 10' fail:

  ⬢[acme@toolbox perf]$ perf test 10
  10: PMU events                                                      :
  10.1: PMU event table sanity                                        : Ok
  10.2: PMU event map aliases                                         : Ok
  10.3: Parsing of PMU event table metrics                            : Ok
  10.4: Parsing of PMU event table metrics with fake PMUs             : FAILED!
  ⬢[acme@toolbox perf]

This reverts commit d89bf9cab1.
2021-06-02 08:27:00 -03:00
Arnaldo Carvalho de Melo
0ab8009b3e Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes from perf/urgent to allow perf/core to be used for new
development.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 14:58:44 -03:00
Jin Yao
79e157b008 perf c2c: Support record for hybrid platform
Support 'perf c2c record' for hybrid platform. On hybrid platform,
such as Alderlake, when executing 'perf c2c record', it actually calls:

record -W -d --phys-data --sample-cpu
-e {cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}:P
-e cpu_atom/mem-loads,ldlat=30/P
-e cpu_core/mem-stores/P
-e cpu_atom/mem-stores/P

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-9-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 11:06:35 -03:00
Jin Yao
d5a8bd0fcd perf mem: Disable 'mem-loads-aux' group before reporting
For some platforms, such as Alderlake, the 'mem-loads' event is required
to use together with 'mem-loads-aux' within a group and 'mem-loads-aux'
must be the group leader. Now we disable this group before reporting
because 'mem-loads-aux' is just an auxiliary event. It doesn't carry
any valid memory load result. If we show the 'mem-loads-aux' +
'mem-loads' as a group in report, it needs many of changes but they
are totally unnecessary.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-8-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 11:06:01 -03:00
Jin Yao
a6d9de8427 perf mem: Fix wrong verbose output for recording events
Current code:

for (j = 0; j < argc; j++, i++)
        rec_argv[i] = argv[j];

if (verbose > 0) {
        pr_debug("calling: record ");

        while (rec_argv[j]) {
                pr_debug("%s ", rec_argv[j]);
                j++;
        }
        pr_debug("\n");
}

The entries of argv[] are copied to the end of rec_argv[], not
copied to the beginning of rec_argv[]. So the index j at
rec_argv[] doesn't point to the first event.

Now we record the start index and end index for events in rec_argv[],
and print them if verbose is enabled.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-7-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 11:05:30 -03:00
Jin Yao
4a9086adc3 perf mem: Support record for hybrid platform
Support 'perf mem record' for hybrid platform. On hybrid platform,
such as Alderlake, when executing 'perf mem record', it actually calls:

record -e {cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}:P
       -e cpu_atom/mem-loads,ldlat=30/P
       -e cpu_core/mem-stores/P
       -e cpu_atom/mem-stores/P

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-6-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 11:04:59 -03:00
Jin Yao
e7ce8d11bf perf tools: Check if mem_events is supported for hybrid platform
Check if the mem_events ('mem-loads' and 'mem-stores') exist
in the sysfs path.

For Alderlake, the hybrid cpu pmu are "cpu_core" and "cpu_atom".
Check the existing of following paths:

/sys/devices/cpu_atom/events/mem-loads
/sys/devices/cpu_atom/events/mem-stores
/sys/devices/cpu_core/events/mem-loads
/sys/devices/cpu_core/events/mem-stores

If the patch exists, the mem_event is supported.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-5-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 11:04:33 -03:00
Jin Yao
a91ffcf30e perf tools: Support pmu prefix for mem-store event
For enabling mem-store event, it doesn't need an auxiliary event.
So just build an event name string with the pmu prefix.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-4-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 11:04:05 -03:00
Jin Yao
d2f327acc6 perf tools: Support pmu prefix for mem-load event
The perf_mem_events__name() can generate the mem-load event name.
It uses a variable 'mem_loads_name__init' to avoid generating the
event name every time (because perf_pmu__scan takes some time).

The perf_mem_events__name() assumes the pmu is "cpu" but it's not
correct for hybrid platform. For Alderlake, the pmu is "cpu_core" or
"cpu_atom"

Introduce a new parameter 'pmu_name' in perf_mem_events__name
to let the caller specify a pmu name.

Considering such event name is x86 specific, so move
perf_mem_events[] to arch/x86/util/mem-events.c.

We still keep the variable 'mem_loads_name__init' but it's only
used when pmu_name is NULL (compatible for original behavior). When
pmu_name is not NULL (e.g. "cpu_core"), this patch doesn't have
optimization. That can be implemented in follow up patch.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 11:03:35 -03:00