Commit graph

12739 commits

Author SHA1 Message Date
Like Xu
a827c007c7 perf config: Refine error message to eliminate confusion
If there is no configuration file at first, the user can write any pair
of "key.subkey=value" to the newly created configuration file, while
value validation against a valid configurable key is *deferred* until
the next execution or the implied execution of "perf config ... ".

For example:

  $ rm ~/.perfconfig
  $ perf config call-graph.dump-size=65529
  $ cat ~/.perfconfig
  # this file is auto-generated.
  [call-graph]
 	dump-size = 65529
  $ perf config call-graph.dump-size=2048
  callchain: Incorrect stack dump size (max 65528): 65529
  Error: wrong config key-value pair call-graph.dump-size=65529

The user might expect that the second value 2048 is valid and can be
updated to the configuration file, but the error message is very
confusing because the first value 65529 is not reported as an error
during the last configuration.

It is recommended not to change the current behavior of delayed
validation (as more effort is needed), but to refine the original error
message to *clearly indicate* that the cause of the error is the
configuration file.

Signed-off-by: Like Xu <likexu@tencent.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210924115817.58689-1-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Like Xu
4da6552c5d perf doc: Fix typos all over the place
Considering that perf and its subcommands have so many parameters, the
documentation is always the first stop for perf beginners. Fixing some
spelling errors will relax the eyes of some readers a little bit.

 s/specicfication/specification/
 s/caheline/cacheline/
 s/tranasaction/transaction/
 s/complan/complain/
 s/sched_wakep/sched_wakeup/
 s/possble/possible/
 s/methology/methodology/

Signed-off-by: Like Xu <likexu@tencent.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210924081942.38368-1-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Ian Rogers
c6613bd4a5 perf arm: Fix off-by-one directory paths.
Relative path include works in the regular build due to -I paths but may
fail in other situations.

v2. Rebase. Comments on v1 were that we should handle include paths
    differently and it is agreed that can be a sensible refactor but
    beyond the scope of this change.
https://lore.kernel.org/lkml/20210504191227.793712-1-irogers@google.com/

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210923154254.737657-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Colin Ian King
774f2c0890 perf vendor events powerpc: Fix spelling mistake "icach" -> "icache"
There is a spelling mistake in the description text, fix it.

Signed-off-by: Colin King <colin.king@canonical.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210916081314.41751-1-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
James Clark
0f892fd1bd perf tests: Fix flaky test 'Object code reading'
This test occasionally fails on aarch64 when a sample is taken in
free@plt and it fails with "Bytes read differ from those read by
objdump".

This is because that symbol is near a section boundary in the elf file.
Despite the -z option to always output zeros, objdump uses
bfd_map_over_sections() to iterate through the elf file so it doesn't
see outside of the sections where these zeros are and can't print them.

For example this boundary proceeds free@plt in libc with a gap of 48
bytes between .plt and .text:

  objdump -d -z --start-address=0x23cc8 --stop-address=0x23d08 libc-2.30.so

  libc-2.30.so:     file format elf64-littleaarch64

  Disassembly of section .plt:

  0000000000023cc8 <*ABS*+0x7fd00@plt+0x8>:
     23cc8:	91018210 	add	x16, x16, #0x60
     23ccc:	d61f0220 	br	x17

  Disassembly of section .text:

  0000000000023d00 <abort@@GLIBC_2.17-0x98>:
     23d00:	a9bf7bfd 	stp	x29, x30, [sp, #-16]!
     23d04:	910003fd 	mov	x29, sp

Taking a sample in free@plt is very rare because it is so small, but the
test can be forced to fail almost every time on any platform by linking
the test with a shared library that has a single empty function and
calling it in a loop.

The fix is to zero the buffers so that when there is a jump in the
addresses output by objdump, zeros are already filled in between.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20210906152238.3415467-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Ian Rogers
5c34aea341 perf test: Fix DWARF unwind for optimized builds.
To ensure the stack frames are on the stack tail calls optimizations
need to be inhibited. If your compiler supports an attribute use it,
otherwise use an asm volatile barrier.

The barrier fix was suggested here:
https://lore.kernel.org/lkml/20201028081123.GT2628@hirez.programming.kicks-ass.net/
Tested with an optimized clang build and by forcing the asm barrier
route with an optimized clang build.

A GCC bug tracking a proper disable_tail_calls is:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97831

Fixes: 9ae1e990f1 ("perf tools: Remove broken __no_tail_call
       attribute")

v2. is a rebase. The original fix patch generated quite a lot of
    discussion over the right place for the fix:
    https://lore.kernel.org/lkml/20201114000803.909530-1-irogers@google.com/
    The patch reflects my preference of it being near the use, so that
    future code cleanups don't break this somewhat special usage.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20210922173812.456348-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Andrii Nakryiko
219d720e6d perf bpf: Ignore deprecation warning when using libbpf's btf__get_from_id()
Perf code re-implements libbpf's btf__load_from_kernel_by_id() API as
a weak function, presumably to dynamically link against old version of
libbpf shared library. Unfortunately this causes compilation warning
when perf is compiled against libbpf v0.6+.

For now, just ignore deprecation warning, but there might be a better
solution, depending on perf's needs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: kernel-team@fb.com
LPU-Reference: 20210914170004.4185659-1-andrii@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:47:02 -03:00
Michael Petlan
57f0ff059e perf machine: Initialize srcline string member in add_location struct
It's later supposed to be either a correct address or NULL. Without the
initialization, it may contain an undefined value which results in the
following segmentation fault:

  # perf top --sort comm -g --ignore-callees=do_idle

terminates with:

  #0  0x00007ffff56b7685 in __strlen_avx2 () from /lib64/libc.so.6
  #1  0x00007ffff55e3802 in strdup () from /lib64/libc.so.6
  #2  0x00005555558cb139 in hist_entry__init (callchain_size=<optimized out>, sample_self=true, template=0x7fffde7fb110, he=0x7fffd801c250) at util/hist.c:489
  #3  hist_entry__new (template=template@entry=0x7fffde7fb110, sample_self=sample_self@entry=true) at util/hist.c:564
  #4  0x00005555558cb4ba in hists__findnew_entry (hists=hists@entry=0x5555561d9e38, entry=entry@entry=0x7fffde7fb110, al=al@entry=0x7fffde7fb420,
      sample_self=sample_self@entry=true) at util/hist.c:657
  #5  0x00005555558cba1b in __hists__add_entry (hists=hists@entry=0x5555561d9e38, al=0x7fffde7fb420, sym_parent=<optimized out>, bi=bi@entry=0x0, mi=mi@entry=0x0,
      sample=sample@entry=0x7fffde7fb4b0, sample_self=true, ops=0x0, block_info=0x0) at util/hist.c:288
  #6  0x00005555558cbb70 in hists__add_entry (sample_self=true, sample=0x7fffde7fb4b0, mi=0x0, bi=0x0, sym_parent=<optimized out>, al=<optimized out>, hists=0x5555561d9e38)
      at util/hist.c:1056
  #7  iter_add_single_cumulative_entry (iter=0x7fffde7fb460, al=<optimized out>) at util/hist.c:1056
  #8  0x00005555558cc8a4 in hist_entry_iter__add (iter=iter@entry=0x7fffde7fb460, al=al@entry=0x7fffde7fb420, max_stack_depth=<optimized out>, arg=arg@entry=0x7fffffff7db0)
      at util/hist.c:1231
  #9  0x00005555557cdc9a in perf_event__process_sample (machine=<optimized out>, sample=0x7fffde7fb4b0, evsel=<optimized out>, event=<optimized out>, tool=0x7fffffff7db0)
      at builtin-top.c:842
  #10 deliver_event (qe=<optimized out>, qevent=<optimized out>) at builtin-top.c:1202
  #11 0x00005555558a9318 in do_flush (show_progress=false, oe=0x7fffffff80e0) at util/ordered-events.c:244
  #12 __ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP, timestamp=timestamp@entry=0) at util/ordered-events.c:323
  #13 0x00005555558a9789 in __ordered_events__flush (timestamp=<optimized out>, how=<optimized out>, oe=<optimized out>) at util/ordered-events.c:339
  #14 ordered_events__flush (how=OE_FLUSH__TOP, oe=0x7fffffff80e0) at util/ordered-events.c:341
  #15 ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP) at util/ordered-events.c:339
  #16 0x00005555557cd631 in process_thread (arg=0x7fffffff7db0) at builtin-top.c:1114
  #17 0x00007ffff7bb817a in start_thread () from /lib64/libpthread.so.0
  #18 0x00007ffff5656dc3 in clone () from /lib64/libc.so.6

If you look at the frame #2, the code is:

488	 if (he->srcline) {
489          he->srcline = strdup(he->srcline);
490          if (he->srcline == NULL)
491              goto err_rawdata;
492	 }

If he->srcline is not NULL (it is not NULL if it is uninitialized rubbish),
it gets strdupped and strdupping a rubbish random string causes the problem.

Also, if you look at the commit 1fb7d06a50, it adds the srcline property
into the struct, but not initializing it everywhere needed.

Committer notes:

Now I see, when using --ignore-callees=do_idle we end up here at line
2189 in add_callchain_ip():

2181         if (al.sym != NULL) {
2182                 if (perf_hpp_list.parent && !*parent &&
2183                     symbol__match_regex(al.sym, &parent_regex))
2184                         *parent = al.sym;
2185                 else if (have_ignore_callees && root_al &&
2186                   symbol__match_regex(al.sym, &ignore_callees_regex)) {
2187                         /* Treat this symbol as the root,
2188                            forgetting its callees. */
2189                         *root_al = al;
2190                         callchain_cursor_reset(cursor);
2191                 }
2192         }

And the al that doesn't have the ->srcline field initialized will be
copied to the root_al, so then, back to:

1211 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
1212                          int max_stack_depth, void *arg)
1213 {
1214         int err, err2;
1215         struct map *alm = NULL;
1216
1217         if (al)
1218                 alm = map__get(al->map);
1219
1220         err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent,
1221                                         iter->evsel, al, max_stack_depth);
1222         if (err) {
1223                 map__put(alm);
1224                 return err;
1225         }
1226
1227         err = iter->ops->prepare_entry(iter, al);
1228         if (err)
1229                 goto out;
1230
1231         err = iter->ops->add_single_entry(iter, al);
1232         if (err)
1233                 goto out;
1234

That al at line 1221 is what hist_entry_iter__add() (called from
sample__resolve_callchain()) saw as 'root_al', and then:

        iter->ops->add_single_entry(iter, al);

will go on with al->srcline with a bogus value, I'll add the above
sequence to the cset and apply, thanks!

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
CC: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 1fb7d06a50 ("perf report Use srcline from callchain for hist entries")
Link: https //lore.kernel.org/r/20210719145332.29747-1-mpetlan@redhat.com
Reported-by: Juri Lelli <jlelli@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:43:05 -03:00
Adrian Hunter
ff6f41fbce perf script: Fix ip display when type != attr->type
set_print_ip_opts() was not being called when type != attr->type
because there is not a one-to-one relationship between output types
and attr->type. That resulted in ip not printing.

The attr_type() function is removed, and the match of attr->type to
output type is corrected.

Example on ADL using taskset to select an atom cpu:

 # perf record -e cpu_atom/cpu-cycles/ taskset 0x1000 uname
 Linux
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.003 MB perf.data (7 samples) ]

 Before:

  # perf script | head
         taskset   428 [-01] 10394.179041:          1 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179043:          1 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179044:         11 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179045:        407 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179046:      16789 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179052:     676300 cpu_atom/cpu-cycles/:
           uname   428 [-01] 10394.179278:    4079859 cpu_atom/cpu-cycles/:

 After:

  # perf script | head
         taskset   428 10394.179041:          1 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179043:          1 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179044:         11 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179045:        407 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179046:      16789 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179052:     676300 cpu_atom/cpu-cycles/:      7f829ef73800 cfree+0x0 (/lib/libc-2.32.so)
           uname   428 10394.179278:    4079859 cpu_atom/cpu-cycles/:  ffffffff95bae912 vma_interval_tree_remove+0x1f2 ([kernel.kallsyms])

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210911133053.15682-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:43:05 -03:00
Ravi Bangoria
7efbcc8c07 perf annotate: Fix fused instr logic for assembly functions
Some x86 microarchitectures fuse a subset of cmp/test/ALU instructions
with branch instructions, and thus perf annotate highlight such valid
pairs as fused.

When annotated with source, perf uses struct disasm_line to contain
either source or instruction line from objdump output. Usually, a C
statement generates multiple instructions which include such
cmp/test/ALU + branch instruction pairs. But in case of assembly
function, each individual assembly source line generate one
instruction.

The 'perf annotate' instruction fusion logic assumes the previous
disasm_line as the previous instruction line, which is wrong because,
for assembly function, previous disasm_line contains source line.  And
thus perf fails to highlight valid fused instruction pairs for assembly
functions.

Fix it by searching backward until we find an instruction line and
consider that disasm_line as fused with current branch instruction.

Before:
         │    cmpq    %rcx, RIP+8(%rsp)
    0.00 │      cmp    %rcx,0x88(%rsp)
         │    je      .Lerror_bad_iret      <--- Source line
    0.14 │   ┌──je     b4                   <--- Instruction line
         │   │movl    %ecx, %eax

After:
         │    cmpq    %rcx, RIP+8(%rsp)
    0.00 │   ┌──cmp    %rcx,0x88(%rsp)
         │   │je      .Lerror_bad_iret
    0.14 │   ├──je     b4
         │   │movl    %ecx, %eax

Reviewed-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https //lore.kernel.org/r/20210911043854.8373-1-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:43:05 -03:00
Ian Rogers
0d1c50ac48 perf tools: Add an option to build without libbfd
Some distributions, like debian, don't link perf with libbfd. Add a
build flag to make this configuration buildable and testable.

This was inspired by:

  https://lore.kernel.org/linux-perf-users/20210910102307.2055484-1-tonyg@leastfixedpoint.com/T/#u

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: tony garnock-jones <tonyg@leastfixedpoint.com>
Link: http://lore.kernel.org/lkml/20210910225756.729087-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:06:52 -03:00
Namhyung Kim
4a86d41404 perf tools: Allow build-id with trailing zeros
Currently perf saves a build-id with size but old versions assumes the
size of 20.  In case the build-id is less than 20 (like for MD5), it'd
fill the rest with 0s.

I saw a problem when old version of perf record saved a binary in the
build-id cache and new version of perf reads the data.  The symbols
should be read from the build-id cache (as the path no longer has the
same binary) but it failed due to mismatch in the build-id.

  symsrc__init: build id mismatch for /home/namhyung/.debug/.build-id/53/e4c2f42a4c61a2d632d92a72afa08f00000000/elf.

The build-id event in the data has 20 byte build-ids, but it saw a
different size (16) when it reads the build-id of the elf file in the
build-id cache.

  $ readelf -n ~/.debug/.build-id/53/e4c2f42a4c61a2d632d92a72afa08f00000000/elf

  Displaying notes found in: .note.gnu.build-id
    Owner                Data size 	Description
    GNU                  0x00000010	NT_GNU_BUILD_ID (unique build ID bitstring)
      Build ID: 53e4c2f42a4c61a2d632d92a72afa08f

Let's fix this by allowing trailing zeros if the size is different.

Fixes: 39be8d0115 ("perf tools: Pass build_id object to dso__build_id_equal()")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210910224630.1084877-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:04:47 -03:00
Adrian Hunter
99fc5941b8 perf tools: Fix hybrid config terms list corruption
A config terms list was spliced twice, resulting in a never-ending loop
when the list was traversed. Fix by using list_splice_init() and copying
and freeing the lists as necessary.

This patch also depends on patch "perf tools: Factor out
copy_config_terms() and free_config_terms()"

Example on ADL:

 Before:

  # perf record -e '{intel_pt//,cycles/aux-sample-size=4096/pp}' uname &
  # jobs
  [1]+  Running                    perf record -e "{intel_pt//,cycles/aux-sample-size=4096/pp}" uname
  # perf top -E 10
    PerfTop:    4071 irqs/sec  kernel: 6.9%  exact: 100.0% lost: 0/0 drop: 0/0 [4000Hz cycles],  (all, 24 CPUs)
  ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    97.60%  perf           [.] __evsel__get_config_term
     0.25%  [kernel]       [k] kallsyms_expand_symbol.constprop.13
     0.24%  perf           [.] kallsyms__parse
     0.15%  [kernel]       [k] _raw_spin_lock
     0.14%  [kernel]       [k] number
     0.13%  [kernel]       [k] advance_transaction
     0.08%  [kernel]       [k] format_decode
     0.08%  perf           [.] map__process_kallsym_symbol
     0.08%  perf           [.] rb_insert_color
     0.08%  [kernel]       [k] vsnprintf
  exiting.
  # kill %1

After:

  # perf record -e '{intel_pt//,cycles/aux-sample-size=4096/pp}' uname &
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.060 MB perf.data ]
  # perf script | head
       perf-exec   604 [001]  1827.312293:                            psb:  psb offs: 0                       ffffffffb8415e87 pt_config_start+0x37 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856a3bd event_sched_in.isra.133+0xfd ([kernel.kallsyms]) => ffffffffb856a9a0 perf_pmu_nop_void+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856b10e merge_sched_in+0x26e ([kernel.kallsyms]) => ffffffffb856a2c0 event_sched_in.isra.133+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856a45d event_sched_in.isra.133+0x19d ([kernel.kallsyms]) => ffffffffb8568b80 perf_event_set_state.part.61+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb8568b86 perf_event_set_state.part.61+0x6 ([kernel.kallsyms]) => ffffffffb85662a0 perf_event_update_time+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856a35c event_sched_in.isra.133+0x9c ([kernel.kallsyms]) => ffffffffb8567610 perf_log_itrace_start+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb856a377 event_sched_in.isra.133+0xb7 ([kernel.kallsyms]) => ffffffffb8403b40 x86_pmu_add+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb8403b86 x86_pmu_add+0x46 ([kernel.kallsyms]) => ffffffffb8403940 collect_events+0x0 ([kernel.kallsyms])
       perf-exec   604  1827.312293:          1                       branches:  ffffffffb8403a7b collect_events+0x13b ([kernel.kallsyms]) => ffffffffb8402cd0 collect_event+0x0 ([kernel.kallsyms])

Fixes: 30def61f64 ("perf parse-events Create two hybrid cache events")
Fixes: 94da591b1c ("perf parse-events Create two hybrid raw events")
Fixes: 9cbfa2f64c ("perf parse-events Create two hybrid hardware events")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https //lore.kernel.org/r/20210909125508.28693-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:00:34 -03:00
Adrian Hunter
a7d212fc6c perf tools: Factor out copy_config_terms() and free_config_terms()
Factor out copy_config_terms() and free_config_terms() so that they can
be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https //lore.kernel.org/r/20210909125508.28693-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 16:00:13 -03:00
Adrian Hunter
eb34363ae1 perf tools: Fix perf_event_attr__fprintf() missing/dupl. fields
Some fields are missing and text_poke is duplicated. Fix that up.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210911120550.12203-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 15:58:36 -03:00
Ian Rogers
da4572d62d perf tools: Ignore Documentation dependency file
When building directly on the checked out repository the build process
produces a file that should be ignored, so add it to .gitignore.

Fixes: a81df63a5d ("perf doc: Fix doc.dep")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210910232249.739661-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-11 15:36:16 -03:00
Arnaldo Carvalho de Melo
218e7b775d perf bpf: Provide a weak btf__load_from_kernel_by_id() for older libbpf versions
The btf__get_from_id() function was deprecated in favour of
btf__load_from_kernel_by_id(), but it is still avaiable, so use it to
provide a weak function btf__load_from_kernel_by_id() for older libbpf
when building perf with LIBBPF_DYNAMIC=1, i.e. using the system's libbpf
package.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:32:04 -03:00
Arnaldo Carvalho de Melo
155ed9f1b5 perf beauty: Cover more flags in the move_mount syscall argument beautifier
Previously the regext expected MOVE_MOUNT_[FT]_*, but in the next patch
a flag that doesn't match that expression will be added, MOVE_MOUNT_SET_GROUP

To make this more future proof, take advantage of the fact that the only
one we don't need to cover is MOVE_MOUNT__MASK and use MOVE_MOUNT_[^_]+_*_.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:15:22 -03:00
Kim Phillips
291dcb98d7 perf report: Add support to print a textual representation of IBS raw sample data
Perf records IBS (Instruction Based Sampling) extra sample data when
'perf record --raw-samples' is used with an IBS-compatible event, on a
machine that supports IBS.  IBS support is indicated in
CPUID_Fn80000001_ECX bit #10.

Up until now, users have been able to see the extra sample data solely
in raw hex format using 'perf report --dump-raw-trace'.  From there,
users could decode the data either manually, or by using an external
script.

Enable the built-in 'perf report --dump-raw-trace' to do the decoding of
the extra sample data bits, so manual or external script decoding isn't
necessary.

Example usage:

  $ sudo perf record -c 10000001 -a --raw-samples -e ibs_fetch/rand_en=1/,ibs_op/cnt_ctl=1/ -C 0,1 taskset -c 0,1 7za b -mmt2 | perf report --dump-raw-trace

Stdout contains IBS Fetch samples, e.g.:

  ibs_fetch_ctl:	02170007ffffffff MaxCnt 1048560 Cnt 1048560 Lat     7 En 1 Val 1 Comp 1 IcMiss 0 PhyAddrValid 1 L1TlbPgSz 4KB L1TlbMiss 0 L2TlbMiss 0 RandEn 1 L2Miss 0
  IbsFetchLinAd:	000056016b2ead40
  IbsFetchPhysAd:	000000115cedfd40
  c_ibs_ext_ctl:	0000000000000000 IbsItlbRefillLat   0

..and IBS Op samples, e.g.:

  ibs_op_ctl:	0000009e009e8968 MaxCnt  10000000 En 1 Val 1 CntCtl 1=uOps CurCnt       158
  IbsOpRip:	000056016b2ea73d
  ibs_op_data:	00000000000b0002 CompToRetCtr     2 TagToRetCtr    11 BrnRet 0  RipInvalid 0 BrnFuse 0 Microcode 0
  ibs_op_data2:	0000000000000002 CacheHitSt 0=M-state RmtNode 0 DataSrc 2=Local node cache
  ibs_op_data3:	0000000000c60002 LdOp 0 StOp 1 DcL1TlbMiss 0 DcL2TlbMiss 0 DcL1TlbHit2M 0 DcL1TlbHit1G 0 DcL2TlbHit2M 0 DcMiss 0 DcMisAcc 0 DcWcMemAcc 0 DcUcMemAcc 0 DcLockedOp 0 DcMissNoMabAlloc 0 DcLinAddrValid 1 DcPhyAddrValid 1 DcL2TlbHit1G 0 L2Miss 0 SwPf 0 OpMemWidth  4 bytes OpDcMissOpenMemReqs  0 DcMissLat     0 TlbRefillLat     0
  IbsDCLinAd:	00007f133c319ce0
  IbsDCPhysAd:	0000000270485ce0

Committer notes:

Fixed up this:

  util/amd-sample-raw.c: In function ‘evlist__amd_sample_raw’:
  util/amd-sample-raw.c:125:42: error: ‘ bytes’ directive output may be truncated writing 6 bytes into a region of size between 4 and 7 [-Werror=format-truncation=]
    125 |                          " OpMemWidth %2d bytes", 1 << (reg.op_mem_width - 1));
        |                                          ^~~~~~
  In file included from /usr/include/stdio.h:866,
                   from util/amd-sample-raw.c:7:
  /usr/include/bits/stdio2.h:71:10: note: ‘__builtin___snprintf_chk’ output between 21 and 24 bytes into a destination of size 21
     71 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
        |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     72 |                                    __glibc_objsize (__s), __fmt,
        |                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     73 |                                    __va_arg_pack ());
        |                                    ~~~~~~~~~~~~~~~~~
  cc1: all warnings being treated as errors

As that %2d won't limit the number of chars to 2, just state that 2 is
the minimal width:

  $ cat printf.c
  #include <stdio.h>
  #include <stdlib.h>

  int main(int argc, char *argv[])
  {
  	char bf[64];
  	int len = snprintf(bf, sizeof(bf), "%2d", atoi(argv[1]));

  	printf("strlen(%s): %u\n", bf, len);

  	return 0;
  }
  $ ./printf 1
  strlen( 1): 2
  $ ./printf 12
  strlen(12): 2
  $ ./printf 123
  strlen(123): 3
  $ ./printf 1234
  strlen(1234): 4
  $ ./printf 12345
  strlen(12345): 5
  $ ./printf 123456
  strlen(123456): 6
  $

And since we probably don't want that output to be truncated, just
assume the worst case, as the compiler did, and add a few more chars to
that buffer.

Also use sizeof(var) instead of sizeof(dup-of-wanted-format-string) to
avoid bugs when changing one but not the other.

I also had to change this:

  -#include <asm/amd-ibs.h>
  +#include "../../arch/x86/include/asm/amd-ibs.h"

To make it build on other architectures, just like intel-pt does.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210817221509.88391-4-kim.phillips@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:15:21 -03:00
Kim Phillips
dde994dd54 perf report: Add tools/arch/x86/include/asm/amd-ibs.h
This is a tools/-side patch for the patch that adds the original copy
of the IBS header file, in arch/x86/include/asm/.

We also add an entry to check-headers.sh, so future changes continue
to be copied.

Committer notes:

Had to add this

  -#include <asm/msr-index.h>
  +#include "msr-index.h"

And change the check-headers.sh entry to ignore this line when diffing
with the original kernel header.

This is needed so that we can use 'perf report' on a perf.data with IBS
data on a !x86 system, i.e. building on ARM fails without this as there
is no asm/msr-index.h there.

This was done on the next patch in this series and is done for things
like Intel PT and ARM CoreSight.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210817221509.88391-3-kim.phillips@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 18:13:20 -03:00
Kim Phillips
9fe8895a27 perf env: Add perf_env__cpuid, perf_env__{nr_}pmu_mappings
To be used by IBS raw data display: It needs the recorder's cpuid in
order to determine which errata workarounds to apply to the data, and
the pmu_mappings are needed in order to figure out which PMU sample
type is IBS Fetch vs. IBS Op.

When not available from perf.data, we assume local operation, and
retrieve cpuid and pmu mappings directly from the running system.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https //lore.kernel.org/r/20210817221509.88391-2-kim.phillips@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 11:45:19 -03:00
Remi Bernon
d2930ede52 perf symbol: Look for ImageBase in PE file to compute .text offset
Instead of using the file offset in the debug file.

This fixes a regression from 00a3423492 ("perf symbols: Make
dso__load_bfd_symbols() load PE files from debug cache only"), causing
incorrect symbol resolution when debug file have been stripped from
non-debug sections (in which case its .text section is empty and doesn't
have any file position).

The debug files could also be created with a different file alignment,
and have different file positions from the mmap-ed binary, or have the
section reordered.

This instead looks for the file image base, using the corresponding bfd
*ABS* symbols. As PE symbols only have 4 bytes, it also needs to keep
.text section vma high bits.

Signed-off-by: Remi Bernon <rbernon@codeweavers.com>
Fixes: 00a3423492 ("perf symbols: Make dso__load_bfd_symbols() load PE files from debug cache only")
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210909192637.4139125-1-rbernon@codeweavers.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 11:45:19 -03:00
Michael Petlan
51ae7fa62d perf scripts python: Fix passing arguments to stackcollapse report
The '--' prevented arguments from being passed to the script, such as:

  $ perf script report stackcollapse -i my_perf.data

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
LPU-Reference: 20200427142327.21172-1-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 11:45:19 -03:00
Michael Petlan
3e11300cdf perf test: Fix bpf test sample mismatch reporting
When the expected sample count in the condition changed, the message
needs to be changed too, otherwise we'll get:

  0x1001f2091d8: mmap mask[0]:
  BPF filter result incorrect, expected 56, got 56 samples

Fixes: 4b04e0decd ("perf test: Fix basic bpf filtering test")
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Link: https //lore.kernel.org/r/20210805160611.5542-1-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 11:45:19 -03:00
Arnaldo Carvalho de Melo
64f4535166 tools headers UAPI: Sync files changed by new process_mrelease syscall and the removal of some compat entry points
To pick the changes in these csets:

  59ab844eed ("compat: remove some compat entry points")
  dce4910396 ("mm: wire up syscall process_mrelease")
  b48c7236b1 ("exit/bdflush: Remove the deprecated bdflush system call")

That add support for this new syscall in tools such as 'perf trace'.

For instance, this is now possible:

  # perf trace -v -e process_mrelease
  event qualifier tracepoint filter: (common_pid != 19351 && common_pid != 9112) && (id == 448)
  ^C#

That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.

  $ grep process_mrelease tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
  448    common  process_mrelease            sys_process_mrelease
  $

This addresses these perf build warnings:

  Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
  diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
  Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
  diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
  Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
  diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
  Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
  diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
  Warning: Kernel ABI header at 'tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl' differs from latest version at 'arch/mips/kernel/syscalls/syscall_n64.tbl'
  diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 11:45:07 -03:00
Arnaldo Carvalho de Melo
bb91de4469 perf beauty: Update copy of linux/socket.h with the kernel sources
To pick the changes in:

Fixes: d32f89da7f ("net: add accept helper not installing fd")
Fixes: bc49d8169a ("mctp: Add MCTP base")

This automagically adds support for the AF_MCTP protocol domain:

  $ tools/perf/trace/beauty/socket.sh > before
  $ cp include/linux/socket.h tools/perf/trace/beauty/include/linux/socket.h
  $ tools/perf/trace/beauty/socket.sh > after
  $ diff -u before after
  --- before	2021-09-06 11:57:14.972747200 -0300
  +++ after	2021-09-06 11:57:30.541920222 -0300
  @@ -44,4 +44,5 @@
   	[42] = "QIPCRTR",
   	[43] = "SMC",
   	[44] = "XDP",
  +	[45] = "MCTP",
   };
  $

This will allow 'perf trace' to translate 45 into "MCTP" as is done with
the other domains:

  # perf trace -e socket*
     0.000 chronyd/1029 socket(family: INET, type: DGRAM|CLOEXEC|NONBLOCK, protocol: IP) = 4
  ^C#

This addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h'
  diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h

Cc: David S. Miller <davem@davemloft.net>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jeremy Kerr <jk@codeconstruct.com.au>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-10 10:42:49 -03:00
Linus Torvalds
2d338201d5 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "147 patches, based on 7d2a07b769.

  Subsystems affected by this patch series: mm (memory-hotplug, rmap,
  ioremap, highmem, cleanups, secretmem, kfence, damon, and vmscan),
  alpha, percpu, procfs, misc, core-kernel, MAINTAINERS, lib,
  checkpatch, epoll, init, nilfs2, coredump, fork, pids, criu, kconfig,
  selftests, ipc, and scripts"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (94 commits)
  scripts: check_extable: fix typo in user error message
  mm/workingset: correct kernel-doc notations
  ipc: replace costly bailout check in sysvipc_find_ipc()
  selftests/memfd: remove unused variable
  Kconfig.debug: drop selecting non-existing HARDLOCKUP_DETECTOR_ARCH
  configs: remove the obsolete CONFIG_INPUT_POLLDEV
  prctl: allow to setup brk for et_dyn executables
  pid: cleanup the stale comment mentioning pidmap_init().
  kernel/fork.c: unexport get_{mm,task}_exe_file
  coredump: fix memleak in dump_vma_snapshot()
  fs/coredump.c: log if a core dump is aborted due to changed file permissions
  nilfs2: use refcount_dec_and_lock() to fix potential UAF
  nilfs2: fix memory leak in nilfs_sysfs_delete_snapshot_group
  nilfs2: fix memory leak in nilfs_sysfs_create_snapshot_group
  nilfs2: fix memory leak in nilfs_sysfs_delete_##name##_group
  nilfs2: fix memory leak in nilfs_sysfs_create_##name##_group
  nilfs2: fix NULL pointer in nilfs_##name##_attr_release
  nilfs2: fix memory leak in nilfs_sysfs_create_device_group
  trap: cleanup trap_init()
  init: move usermodehelper_enable() to populate_rootfs()
  ...
2021-09-08 12:55:35 -07:00
Andy Shevchenko
7fc5b57132 tools: rename bitmap_alloc() to bitmap_zalloc()
Rename bitmap_alloc() to bitmap_zalloc() in tools to follow the bitmap API
in the kernel.

No functional changes intended.

Link: https://lkml.kernel.org/r/20210814211713.180533-14-yury.norov@gmail.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Suggested-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Yury Norov <yury.norov@gmail.com>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Lobakin <alobakin@pm.me>
Cc: Alexey Klimov <aklimov@redhat.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-08 11:50:27 -07:00
Linus Torvalds
27151f1778 perf tools changes for v5.15:
New features:
 
 - Improvements for the flamegraph python script, including:
 
   - Display perf.data header
   - Display PIDs of user stacks
   - Added option to change color scheme
   - Default to blue/green color scheme to improve accessibility
   - Correctly identify kernel stacks when debuginfo is available
 
 - Improvements for 'perf bench futex':
   - Add --mlockall parameter
   - Add --broadcast and --pi to the 'requeue' sub benchmark
 
 - Add support for PMU aliases.
 
 - Introduce an ARM Coresight ETE decoder.
 
 - Add a 'perf bench' entry for evlist open/close operations, to help quantify
   improvements with multithreading 'perf record'.
 
 - Allow reporting the [un]throttle PERF_RECORD_ meta event in 'perf script's
   python scripting.
 
 - Add a 'perf test' entry for PMU aliases.
 
 - Add a 'perf test' entry for 'perf record/perf report/perf script' pipe mode.
 
 Fixes:
 
 - perf script dlfilter (API for filtering via dynamically loaded shared object
   introduced in v5.14) fixes and a 'perf test' entry for it.
 
 - Fix get_current_dir_name() compilation on Android.
 
 - Fix issues with asciidoc and double dashes uses.
 
 - Fix memory leaks in the BTF handling code.
 
 - Fix leftover problems in the Documentation from the infrastructure originally
   lifted from the git codebase.
 
 - Fix *probe_vfs_getname.sh 'perf test' failures.
 
 - Handle fd gaps in 'perf test's test__dso_data_reopen().
 
 - Make sure to show disasembly warnings for 'perf annotate --stdio'.
 
 - Fix output from pipe to file and vice-versa in 'perf record/report/script'.
 
 - Correct 'perf data -h' output.
 
 - Fix wrong comm in system-wide mode with 'perf record --delay'.
 
 - Do not allow --for-each-cgroup without cpu in 'perf stat'
 
 - Make 'perf test --skip' work on shell tests.
 
 - Fix libperf's verbose printing.
 
 Misc improvements:
 
 - Preparatory patches for multithreading varios 'perf record' phases
   (synthesizing, opening, recording, etc).
 
 - Add sparse context/locking annotations in compiler-types.h, also to help with
   the multithreading effort.
 
 - Optimize the generation of the arch specific erno tables used in 'perf trace'.
 
 - Optimize libperf's perf_cpu_map__max().
 
 - Improve ARM's CoreSight warnings.
 
 - Report collisions in AUX records.
 
 - Improve warnings for the LLVM 'perf test' entry.
 
 - Improve the PMU events 'perf test' codebase.
 
 - perf test: Do not compare overheads in the zstd comp test
 
 - Better support annotation on ARM.
 
 - Update 'perf trace's cmd string table to decode sys_bpf() first arg.
 
 Vendor events:
 
 - Add JSON events and metrics for Intel's Ice Lake, Tiger Lake and Elhart Lake.
 
 - Update JSON eventsand metrics for Intel's Cascade Lake and Sky Lake servers.
 
 Hardware tracing:
 
 - Improvements for the ARM hardware tracing auxtrace support.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYTO8OAAKCRCyPKLppCJ+
 J/N0AQCTAfxKsVyh9vjIY5vwHvNkYMDM4Wvs7TmlfXbVgKLP3AEA0wYlTyar/teI
 NC6jQpipxpWASFUJbLSmU8qn/XFXwQo=
 =4r0w
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v5.15-2021-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tool updates from Arnaldo Carvalho de Melo:
 "New features:

   - Improvements for the flamegraph python script, including:
       - Display perf.data header
       - Display PIDs of user stacks
       - Added option to change color scheme
       - Default to blue/green color scheme to improve accessibility
       - Correctly identify kernel stacks when debuginfo is available

   - Improvements for 'perf bench futex':
       - Add --mlockall parameter
       - Add --broadcast and --pi to the 'requeue' sub benchmark

   - Add support for PMU aliases.

   - Introduce an ARM Coresight ETE decoder.

   - Add a 'perf bench' entry for evlist open/close operations, to help
     quantify improvements with multithreading 'perf record'.

   - Allow reporting the [un]throttle PERF_RECORD_ meta event in 'perf
     script's python scripting.

   - Add a 'perf test' entry for PMU aliases.

   - Add a 'perf test' entry for 'perf record/perf report/perf script'
     pipe mode.

  Fixes:

   - perf script dlfilter (API for filtering via dynamically loaded
     shared object introduced in v5.14) fixes and a 'perf test' entry
     for it.

   - Fix get_current_dir_name() compilation on Android.

   - Fix issues with asciidoc and double dashes uses.

   - Fix memory leaks in the BTF handling code.

   - Fix leftover problems in the Documentation from the infrastructure
     originally lifted from the git codebase.

   - Fix *probe_vfs_getname.sh 'perf test' failures.

   - Handle fd gaps in 'perf test's test__dso_data_reopen().

   - Make sure to show disasembly warnings for 'perf annotate --stdio'.

   - Fix output from pipe to file and vice-versa in 'perf
     record/report/script'.

   - Correct 'perf data -h' output.

   - Fix wrong comm in system-wide mode with 'perf record --delay'.

   - Do not allow --for-each-cgroup without cpu in 'perf stat'

   - Make 'perf test --skip' work on shell tests.

   - Fix libperf's verbose printing.

  Misc improvements:

   - Preparatory patches for multithreading various 'perf record' phases
     (synthesizing, opening, recording, etc).

   - Add sparse context/locking annotations in compiler-types.h, also to
     help with the multithreading effort.

   - Optimize the generation of the arch specific erno tables used in
     'perf trace'.

   - Optimize libperf's perf_cpu_map__max().

   - Improve ARM's CoreSight warnings.

   - Report collisions in AUX records.

   - Improve warnings for the LLVM 'perf test' entry.

   - Improve the PMU events 'perf test' codebase.

   - perf test: Do not compare overheads in the zstd comp test

   - Better support annotation on ARM.

   - Update 'perf trace's cmd string table to decode sys_bpf() first
     arg.

  Vendor events:

   - Add JSON events and metrics for Intel's Ice Lake, Tiger Lake and
     Elhart Lake.

   - Update JSON eventsand metrics for Intel's Cascade Lake and Sky Lake
     servers.

  Hardware tracing:

   - Improvements for the ARM hardware tracing auxtrace support"

* tag 'perf-tools-for-v5.15-2021-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (130 commits)
  perf tests: Add test for PMU aliases
  perf pmu: Add PMU alias support
  perf session: Report collisions in AUX records
  perf script python: Allow reporting the [un]throttle PERF_RECORD_ meta event
  perf build: Report failure for testing feature libopencsd
  perf cs-etm: Show a warning for an unknown magic number
  perf cs-etm: Print the decoder name
  perf cs-etm: Create ETE decoder
  perf cs-etm: Update OpenCSD decoder for ETE
  perf cs-etm: Fix typo
  perf cs-etm: Save TRCDEVARCH register
  perf cs-etm: Refactor out ETMv4 header saving
  perf cs-etm: Initialise architecture based on TRCIDR1
  perf cs-etm: Refactor initialisation of decoder params.
  tools build: Fix feature detect clean for out of source builds
  perf evlist: Add evlist__for_each_entry_from() macro
  perf evsel: Handle precise_ip fallback in evsel__open_cpu()
  perf evsel: Move bpf_counter__install_pe() to success path in evsel__open_cpu()
  perf evsel: Move test_attr__open() to success path in evsel__open_cpu()
  perf evsel: Move ignore_missing_thread() to fallback code
  ...
2021-09-05 11:56:18 -07:00
Jin Yao
c7a3828d98 perf tests: Add test for PMU aliases
A perf uncore PMU may have two PMU names, a real name and an alias.

Add one test case to verify that the real and alias names have the same
effect.

Iterate sysfs to get one event which has an alias and create an evlist
by adding two evsels. Evsel1 is created by event and evsel2 is created
by alias.

Test asserts:

  evsel1->core.attr.type == evsel2->core.attr.type
  evsel1->core.attr.config == evsel2->core.attr.config

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Link: http://lore.kernel.org/lkml/20210902065955.1299-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-03 08:38:49 -03:00
Kan Liang
13d60ba073 perf pmu: Add PMU alias support
A perf uncore PMU may have two PMU names, a real name and an alias. The
alias is exported at /sys/bus/event_source/devices/uncore_*/alias.
The perf tool should support the alias as well.

Add alias_name in the struct perf_pmu to store the alias. For the PMU
which doesn't have an alias. It's NULL.

Introduce two X86 specific functions to retrieve the real name and the
alias separately.

Only go through the sysfs to retrieve the mapping between the real name
and the alias once. The result is cached in a list, uncore_pmu_list.

Nothing changed for the other ARCHs.

With the patch, the perf tool can monitor the PMU with either the real
name or the alias.

Use the real name,
 $ perf stat -e uncore_cha_2/event=1/ -x,
   4044879584,,uncore_cha_2/event=1/,2528059205,100.00,,

Use the alias,
 $ perf stat -e uncore_type_0_2/event=1/ -x,
   3659675336,,uncore_type_0_2/event=1/,2287306455,100.00,,

Committer notes:

Rename 'struct perf_pmu_alias_name' to 'pmu_alias', the 'perf_' prefix
should be used for libperf, things inside just tools/perf/ are being
moved away from that prefix.

Also 'pmu_alias' is shorter and reflects the abstraction.

Also don't use 'pmu' as the name for variables for that type, we should
use that for the 'struct perf_pmu' variables, avoiding confusion. Use
'pmu_alias' for 'struct pmu_alias' variables.

Co-developed-by: Jin Yao <yao.jin@linux.intel.com>
Co-developed-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Link: http://lore.kernel.org/lkml/20210902065955.1299-2-yao.jin@linux.intel.com
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-03 08:33:26 -03:00
Suzuki K Poulose
c68b421d8e perf session: Report collisions in AUX records
Just like the other flags in the AUX records, report a summary of the
Collisions if there were any.

Signed-off-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
LPU-Reference: 20210728091219.527886-1-suzuki.poulose@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-03 08:29:55 -03:00
Stephen Brennan
538d9c1829 perf script python: Allow reporting the [un]throttle PERF_RECORD_ meta event
perf_events may sometimes throttle an event due to creating too many
samples during a given timer tick.

As of now, the perf tool will not report on throttling, which means this
is a silent error.

Implement a callback for the throttle and unthrottle events within the
Python scripting engine, which can allow scripts to detect and report
when events may have been lost due to throttling.

The simplest script to report throttle events is:

  def throttle(*args):
      print("throttle" + repr(args))

  def unthrottle(*args):
      print("unthrottle" + repr(args))

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210901210815.133251-1-stephen.s.brennan@oracle.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-03 08:18:25 -03:00
Leo Yan
71f7f897c3 perf build: Report failure for testing feature libopencsd
When build perf tool with passing option 'CORESIGHT=1' explicitly, if
the feature test fails for library libopencsd, the build doesn't
complain the feature failure and continue to build the tool with
disabling the CoreSight feature insteadly.

This patch changes the building behaviour, when build perf tool with the
option 'CORESIGHT=1' and detect the failure for testing feature
libopencsd, the build process will be aborted and it shows the complaint
info.

Committer testing:

First make sure there is no opencsd library installed:

  $ rpm -qa | grep -i csd
  $ sudo rm -rf `find /usr/local -name "*csd*"`
  $ find /usr/local -name "*csd*"
  $

Then cleanup the perf build output directory:

  $ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ;
  $

And try to build explicitely asking for coresight:

  $ make O=/tmp/build/perf CORESIGHT=1 O=/tmp/build/perf -C tools/perf install-bin
  make: Entering directory '/var/home/acme/git/perf/tools/perf'
    BUILD:   Doing 'make -j24' parallel build
    HOSTCC  /tmp/build/perf/fixdep.o
    HOSTLD  /tmp/build/perf/fixdep-in.o
    LINK    /tmp/build/perf/fixdep
  Makefile.config:493: *** Error: No libopencsd library found or the version is not up-to-date. Please install recent libopencsd to build with CORESIGHT=1.  Stop.
  make[1]: *** [Makefile.perf:238: sub-make] Error 2
  make: *** [Makefile:113: install-bin] Error 2
  make: Leaving directory '/var/home/acme/git/perf/tools/perf'
  $

Now install the opencsd library present in Fedora 34:

  $ sudo dnf install opencsd-devel
  <SNIP>
  Installed:
    opencsd-1.0.0-1.fc34.x86_64 opencsd-devel-1.0.0-1.fc34.x86_64
  Complete!
  $

Try again building with coresight:

  $ make O=/tmp/build/perf CORESIGHT=1 O=/tmp/build/perf -C tools/perf install-bin
  make: Entering directory '/var/home/acme/git/perf/tools/perf'
    BUILD:   Doing 'make -j24' parallel build
  Makefile.config:493: *** Error: No libopencsd library found or the version is not up-to-date. Please install recent libopencsd to build with CORESIGHT=1.  Stop.
  make[1]: *** [Makefile.perf:238: sub-make] Error 2
  make: *** [Makefile:113: install-bin] Error 2
  make: Leaving directory '/var/home/acme/git/perf/tools/perf'
  $

Since Fedora 34 is pretty recent, one assumes we need to get it from its
upstream git repository, use rpm to find where that is:

  $ rpm -q --qf "%{URL}\n" opencsd
  https://github.com/Linaro/OpenCSD
  $

Go there, clone the repo, build it and install into /usr/local, then try
again:

  $ cd ~acme/git/perf
  $ make O=/tmp/build/perf VF=1 CORESIGHT=1 O=/tmp/build/perf -C tools/perf install-bin | grep -i opencsd
  ...                    libopencsd: [ on  ]
    PERF_VERSION = 5.14.g454719f67a3d
  $ export LD_LIBRARY_PATH=/usr/local/lib
  $ ldd ~/bin/perf | grep opencsd
  	libopencsd_c_api.so.1 => /usr/local/lib/libopencsd_c_api.so.1 (0x00007f28f78a4000)
  	libopencsd.so.1 => /usr/local/lib/libopencsd.so.1 (0x00007f28f6a2e000)
  $

Now it works.

Requested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Link: http://lore.kernel.org/lkml/20210902081800.550016-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-03 08:18:24 -03:00
James Clark
a80aea64aa perf cs-etm: Show a warning for an unknown magic number
Currently perf reports "Cannot allocate memory" which isn't very helpful
for a potentially user facing issue. If we add a new magic number in
the future, perf will be able to report unrecognised magic numbers.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https //lore.kernel.org/r/20210806134109.1182235-10-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-03 08:18:24 -03:00
James Clark
56c62f52b6 perf cs-etm: Print the decoder name
Use the real name of the decoder instead of hard-coding "ETM" to avoid
confusion when the trace is ETE. This also now distinguishes between
ETMv3 and ETMv4.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https //lore.kernel.org/r/20210806134109.1182235-9-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-03 08:17:25 -03:00
James Clark
779f414a48 perf cs-etm: Create ETE decoder
If the magic number indicates ETE instantiate a OCSD_BUILTIN_DCD_ETE
decoder instead of OCSD_BUILTIN_DCD_ETMV4I. ETE is the new trace feature
for Armv9.

Testing performed
=================

* Old files with v0 and v1 headers for ETMv4 still open correctly
* New files with new magic number open on new versions of perf
* New files with new magic number fail to open on old versions of perf
* Decoding with the ETE decoder results in the same output as the ETMv4
  decoder as long as there are no new ETE packet types

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https //lore.kernel.org/r/20210806134109.1182235-8-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-03 08:16:36 -03:00
James Clark
212095f7ca perf cs-etm: Update OpenCSD decoder for ETE
OpenCSD v1.1.1 has a bug fix for the installation of the ETE decoder
headers. This also means that including headers separately for each
decoder is unnecessary so remove these.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https //lore.kernel.org/r/20210806134109.1182235-7-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-03 08:16:00 -03:00
James Clark
050a0fc4ed perf cs-etm: Fix typo
TRCIRD2 should be TRCIDR2

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https //lore.kernel.org/r/20210806134109.1182235-6-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-03 08:15:45 -03:00
James Clark
51ba881131 perf cs-etm: Save TRCDEVARCH register
When ETE is present save the TRCDEVARCH register and set a new magic
number. It will be used to configure the decoder in a later commit.

Old versions of perf will not be able to open files with this new magic
number, but old files will still work with newer versions of perf.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https //lore.kernel.org/r/20210806134109.1182235-5-james.clark@arm.com
[ Addressed some cosmetic suggestions by Suzuki Poulouse ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-03 08:15:10 -03:00
James Clark
c9ccc96bf6 perf cs-etm: Refactor out ETMv4 header saving
Extract a function for saving the ETMv4 header because this will be used
for ETE in a later commit.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https //lore.kernel.org/r/20210806134109.1182235-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-03 08:07:42 -03:00
James Clark
f4aef1ea26 perf cs-etm: Initialise architecture based on TRCIDR1
Currently the architecture is hard coded as ARCH_V8, but from ETMv4.4
onwards this should be ARCH_AA64.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https //lore.kernel.org/r/20210806134109.1182235-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-03 08:07:38 -03:00
James Clark
991f69e9e0 perf cs-etm: Refactor initialisation of decoder params.
The initialisation of the decoder params is duplicated between
creation of the packet printer and packet decoder. Put them both
into one function so that future changes only need to be made in one
place.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https //lore.kernel.org/r/20210806134109.1182235-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-03 08:04:38 -03:00
Linus Torvalds
bcfeebbff3 Merge branch 'exit-cleanups-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull exit cleanups from Eric Biederman:
 "In preparation of doing something about PTRACE_EVENT_EXIT I have
  started cleaning up various pieces of code related to do_exit. Most of
  that code I did not manage to get tested and reviewed before the merge
  window opened but a handful of very useful cleanups are ready to be
  merged.

  The first change is simply the removal of the bdflush system call. The
  code has now been disabled long enough that even the oldest userspace
  working userspace setups anyone can find to test are fine with the
  bdflush system call being removed.

  Changing m68k fsp040_die to use force_sigsegv(SIGSEGV) instead of
  calling do_exit directly is interesting only in that it is nearly the
  most difficult of the incorrect uses of do_exit to remove.

  The change to the seccomp code to simply send a signal instead of
  calling do_coredump directly is a very nice little cleanup made
  possible by realizing the existing signal sending helpers were missing
  a little bit of functionality that is easy to provide"

* 'exit-cleanups-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  signal/seccomp: Dump core when there is only one live thread
  signal/seccomp: Refactor seccomp signal and coredump generation
  signal/m68k: Use force_sigsegv(SIGSEGV) in fpsp040_die
  exit/bdflush: Remove the deprecated bdflush system call
2021-09-01 14:52:05 -07:00
Riccardo Mancini
79e7ed56d7 perf evlist: Add evlist__for_each_entry_from() macro
This patch adds a new iteration macro for evlist that resumes iteration
from a given evsel in the evlist.

This macro will be used in the workqueue series.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/2386505f8b598adf0dbcd04ec21804c6bcf00826.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-01 11:24:43 -03:00
Riccardo Mancini
28667a5269 perf evsel: Handle precise_ip fallback in evsel__open_cpu()
This is another patch in the effort to separate the fallback mechanisms
from the open itself.

In case of precise_ip fallback, the original precise_ip will be stored
in the evsel (it was stored in a local variable) and the open will be
retried. Since the precise_ip fallback will be the first in the chain of
fallbacks, there should be no functional change with this patch.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/74208c433d2024a6c4af9c0b140b54ed6b5ea810.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 16:52:27 -03:00
Riccardo Mancini
91233d003b perf evsel: Move bpf_counter__install_pe() to success path in evsel__open_cpu()
I don't see why bpf_counter__install_pe() should get called even if
fd = -1, so I'm moving it to the success path.

This will be useful in following patches to separate the actual open and
the related operations from the fallback mechanisms.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: http://lore.kernel.org/lkml/64f8a1b0a838a6e6049cd43c1beafd432999ae57.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 16:50:34 -03:00
Riccardo Mancini
ebfb045a41 perf evsel: Move test_attr__open() to success path in evsel__open_cpu()
test_attr__open() ignores the fd if -1, therefore it is safe to move it to
the success path (fd >= 0).

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/b3baf11360ca96541c9631730614fd7d217496fc.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 16:45:30 -03:00
Riccardo Mancini
da7c3b4622 perf evsel: Move ignore_missing_thread() to fallback code
This patch moves ignore_missing_thread outside the perf_event_open loop.

Doing so, we need to move the retry_open flag a few places higher, with
minimal impact. Furthermore, thread need not be decreased since it won't
get increased by the for loop (since we're jumping back inside), but we
need to check that the nthreads decrease didn't put thread out of range.

The goal is to have fallbacks handled in one place only, since in the
future parallel code, these would be handled separately.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/4eca51443c786baaf6811b7cd8e73aafd97f7606.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 16:44:30 -03:00
Riccardo Mancini
71efc48a4c perf evsel: Separate rlimit increase from evsel__open_cpu()
This is a preparatory patch for the workqueue patches with the goal to
separate from evlist__open_cpu() the actual opening (which could be
performed in parallel), from the existing fallback mechanisms, which
should be handled sequentially.

This patch separates the rlimit increase from evsel__open_cpu().

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/2f256de8ec37b9809a5cef73c2fa7bce416af5d3.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 16:42:10 -03:00
Riccardo Mancini
d21fc5f077 perf evsel: Separate missing feature detection from evsel__open_cpu()
This is a preparatory patch for the workqueue patches with the goal to
separate in evlist__open_cpu() the actual opening, which could be
performed in parallel, from the existing fallback mechanisms, which
should be handled sequentially.

This patch separates the missing feature detection in evsel__open_cpu()
into a new evsel__detect_missing_features() function.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/cba0b7d939862473662adeedb0f9c9b69566ee9a.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 16:38:18 -03:00
Riccardo Mancini
6efd06e374 perf evsel: Add evsel__prepare_open()
This function will prepare the evsel and disable the missing features.
It will be used in one of the following patches.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/fa5e78bbb92c848226f044278fdcf777b3ce4583.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 16:36:54 -03:00
Riccardo Mancini
588f4ac763 perf evsel: Separate missing feature disabling from evsel__open_cpu
This is a preparatory patch for the patches in the workqueue series with
the goal to separate in evlist__open_cpu() the actual opening, which
could be performed in parallel, from the existing fallback mechanisms,
which should be handled sequentially.

This patch separates the disabling of missing features from
evlist__open_cpu() into a new function evsel__disable_missing_features(().

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/48138bd2932646dde315505da733c2ca635ad2ee.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 16:34:03 -03:00
Riccardo Mancini
46def08f5d perf evsel: Save open flags in evsel in prepare_open()
This patch caches the flags used in perf_event_open() inside evsel, so
that they can be set in __evsel__prepare_open() (this will be useful in
patches in the workqueue series, when the fallback mechanisms will be
handled outside the open itself).

This also optimizes the code, by not having to recompute them everytime.

Since flags are now saved in evsel, the flags argument in
perf_event_open() is removed.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/d9f63159098e56fa518eecf25171d72e6f74df37.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 16:29:12 -03:00
Riccardo Mancini
d45ce03434 perf evsel: Separate open preparation from open itself
This is a preparatory patch for the following patches with the goal to
separate in evlist__open_cpu the actual perf_event_open, which could be
performed in parallel, from the existing fallback mechanisms, which
should be handled sequentially.

This patch separates the first lines of evsel__open_cpu into a new
__evsel__prepare_open function.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/e14118b934c338dbbf68b8677f20d0d7dbf9359a.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 16:26:55 -03:00
Riccardo Mancini
bc0496043e perf evsel: Remove retry_sample_id goto label
As far as I can tell, there is no good reason, apart from optimization
to have the retry_sample_id separate from fallback_missing_features.

Probably, this label was added to avoid reapplying patches for missing
features that had already been applied.

However, missing features that have been added later have not used this
optimization, always jumping to fallback_missing_features and reapplying
all missing features.

This patch removes that label, replacing it with
fallback_missing_features.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/340af0d03408d6621fd9c742e311db18b3585b3b.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 16:24:35 -03:00
Riccardo Mancini
5d4da30f76 perf mmap: Add missing bitops.h header
MMAP_CPU_MASK_BYTES uses the BITS_TO_LONGS macro, which is defined in
linux/bitops.h.

However, this header is not included directly, but gets imported
indirectly in files using the macro.

This patch adds the missing include.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/c5b91ee432a2e28e7f16337c740b43b4d0b0e86c.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 16:21:17 -03:00
James Clark
40a72c6472 perf tools: Fix LLVM download hint link
http://llvm.org/apt returns 404, it has moved to https://apt.llvm.org/

Signed-off-by: James Clark <james.clark@arm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20210831145501.2135754-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 15:18:16 -03:00
James Clark
792adb1aa9 perf tools: Fix LLVM test failure when running in verbose mode
A CI system might want to run all tests in verbose mode so that there is
enough information to diagnose issues. This LLVM test is the only test
that uses "-v" to signify to not skip the test if the preconditions
aren't met (LLVM isn't installed). This means that running the test in
verbose mode without LLVM installed causes a test failure.

For consistency with the other tests, remove this verbose/skip check. An
alternate solution would be to make _all_ tests not skip when run in
verbose mode, but I don't think that would be intuitive.

Also change the search_program() call to search_program_and_warn().
Previously the hint about installing LLVM was only printed by the actual
test because this check was skipped in verbose mode. To maintain the old
behaviour, the precondition check must also print the full warning.

Previous output:

  $ ./perf test llvm
  40: LLVM search and compile                                     :
  40.1: Basic BPF llvm compile                                    : Skip

  $ ./perf test -v llvm
  40: LLVM search and compile                                     :
  40.1: Basic BPF llvm compile                                    :
  --- start ---
  test child forked, pid 2085835
  ERROR:	unable to find clang.
  Hint:	Try to install latest clang/llvm to support BPF. Check your $PATH
  ...
  test child finished with -1
  ---- end ----
  LLVM search and compile subtest 1: FAILED!

New output (non verbose mode is identical, verbose changes from fail to
skip):

  $ ./perf test llvm
  40: LLVM search and compile                                     :
  40.1: Basic BPF llvm compile                                    : Skip

  $ ./perf test -v llvm
  40: LLVM search and compile                                     :
  40.1: Basic BPF llvm compile                                    :
  --- start ---
  test child forked, pid 2087680
  ERROR:	unable to find clang.
  Hint:	Try to install latest clang/llvm to support BPF. Check your $PATH
  ...
  No clang, skip this test
  test child finished with -2
  ---- end ----
  LLVM search and compile subtest 1: Skip

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20210831145501.2135754-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 15:17:45 -03:00
James Clark
a8a2d5c0b3 perf tools: Refactor LLVM test warning for missing binary
The same warning is duplicated in two places so refactor it into a
single function "search_program_and_warn". This will be used a third
time in a later commit.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20210831145501.2135754-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 15:17:07 -03:00
Leo Yan
474b3f2882 perf auxtrace arm: Support compat_auxtrace_mmap__{read_head|write_tail}
When the tool runs with compat mode on Arm platform, the kernel is in
64-bit mode and user space is in 32-bit mode; the user space can use
instructions "ldrd" and "strd" for 64-bit value atomicity.

This patch adds compat_auxtrace_mmap__{read_head|write_tail} for arm
building, it uses "ldrd" and "strd" instructions to ensure accessing
atomicity for aux head and tail.  The file arch/arm/util/auxtrace.c is
built for arm and arm64 building, these two functions are not needed for
arm64, so check the compiler macro "__arm__" to only include them for
arm building.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Russell King (oracle)" <linux@armlinux.org.uk>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210829102238.19693-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 15:12:00 -03:00
Leo Yan
bbc49f1202 perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail}
When perf runs in compat mode (kernel in 64-bit mode and the perf is in
32-bit mode), the 64-bit value atomicity in the user space cannot be
assured, E.g. on some architectures, the 64-bit value accessing is split
into two instructions, one is for the low 32-bit word accessing and
another is for the high 32-bit word.

This patch introduces weak functions compat_auxtrace_mmap__read_head()
and compat_auxtrace_mmap__write_tail(), as their naming indicates, when
perf tool works in compat mode, it uses these two functions to access
the AUX head and tail.  These two functions can allow the perf tool to
work properly in certain conditions, e.g. when perf tool works in
snapshot mode with only using AUX head pointer, or perf tool uses the
AUX buffer and the incremented tail is not bigger than 4GB.

When perf tool cannot handle the case when the AUX tail is bigger than
4GB, the function compat_auxtrace_mmap__write_tail() returns -1 and
tells the caller to bail out for the error.

These two functions are declared as weak attribute, this allows to
implement arch specific functions if any arch can support the 64-bit
value atomicity in compat mode.

Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Russell King (oracle)" <linux@armlinux.org.uk>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210829102238.19693-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 15:12:00 -03:00
Ian Rogers
298105b78b perf bpf: Fix memory leaks relating to BTF.
BTF needs to be freed with btf__free().

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210826184833.408563-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 15:12:00 -03:00
Joshua Martinez
760f5e77e6 perf data: Correct -h output
There is currently only 1 'perf data' command, but supporting extra
commands was breaking the help output. Simplify for now so that the help
output is correct.

Before:
$ perf data -h

 Usage: perf data [<common options>] <command> [<options>]

$ perf data
Usage:
        perf data [<common options>] <command> [<options>]

        Available commands:
         convert        - converts data file between formats

After:
$ perf data

 Usage: perf data convert [<options>]

    -f, --force           don't complain, do it
    -i, --input <file>    input file name
    -v, --verbose         be more verbose
        --all             Convert all events
        --to-ctf ...      Convert to CTF format
        --to-json ...     Convert to JSON format
        --tod             Convert time to wall clock time

$ perf data -h

 Usage: perf data convert [<options>]

    -f, --force           don't complain, do it
    -i, --input <file>    input file name
    -v, --verbose         be more verbose
        --all             Convert all events
        --to-ctf ...      Convert to CTF format
        --to-json ...     Convert to JSON format
        --tod             Convert time to wall clock time

Signed-off-by: Joshua Martinez <joshuamart@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210824205829.52822-1-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 15:12:00 -03:00
Colin Ian King
cb5a2ebbf1 perf header: Fix spelling mistake "cant'" -> "can't"
There is a spelling mistake in a warning message. Fix it.

Signed-off-by: Colin King <colin.king@canonical.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210826121801.13281-1-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 15:12:00 -03:00
Arnaldo Carvalho de Melo
e807ffe669 perf dlfilters: Fix build on environments with a --sysroot gcc arg
Such as cross building on Android, so just add EXTRA_CFLAGS to the
dlfilters rules as it is where --sysroot= has been specified.

Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/YS1JwIMTNNWcbGdT@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-31 15:11:55 -03:00
Andreas Gerstmayr
c611e4f24c perf flamegraph: flamegraph.py script improvements
* display perf.data header
* display PIDs of user stacks
* added option to change color scheme
* default to blue/green color scheme to improve accessibility
* correctly identify kernel stacks when kernel-debuginfo is installed

Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210830164729.116049-1-agerstmayr@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-30 18:22:23 -03:00
Namhyung Kim
bb07d62e03 perf record: Fix wrong comm in system-wide mode with delay
Stephane found that the name of the forked process in a system-wide
mode is wrong when --delay option is used.  For example,

  # perf record -a --delay=1000  noploop 3

The noploop process will run a busy loop for 3 second.  And on an idle
machine it should show up at the top in the perf report.  It works
well without the --delay option.  But if I add the option, it showed
'perf' not 'noploop'.

  # perf report -s comm -q | head -3
      52.94%  perf
      16.65%  swapper
      12.04%  chrome

It turned out that the dummy event didn't work at all and it missed
COMM and MMAP events for the noploop process (and others too).  We
should enable the dummy event immediately in system-wide mode, as the
enable-on-exec would work only for task events.

With this change,

  # perf report -s comm -q | head -3
      52.75%  noploop
      17.03%  swapper
      12.83%  chrome

Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210827233212.3121037-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-30 18:22:23 -03:00
Namhyung Kim
1c02f6c904 perf stat: Do not allow --for-each-cgroup without cpu
The cgroup mode should work with cpu events.  Warn if --for-each-cgroup
option is used with a task target like existing -G option.

  # perf stat --for-each-cgroup . sleep 1
  both cgroup and no-aggregation modes only available in system-wide mode

   Usage: perf stat [<options>] [<command>]

      -G, --cgroup <name>   monitor event in cgroup name only
      -A, --no-aggr         disable CPU count aggregation
      -a, --all-cpus        system-wide collection from all CPUs
          --for-each-cgroup <name>
                            expand events for each cgroup

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210830170200.55652-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-30 18:22:23 -03:00
Arnaldo Carvalho de Melo
a32762b864 perf bench evlist-open-close: Use PRIu64 with u64 to fix build on 32-bit architectures
73     9.00 ubuntu:18.04-x-powerpc        : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
    bench/evlist-open-close.c: In function 'bench_evlist_open_close__run':
    bench/evlist-open-close.c:173:12: error: format '%ld' expects argument of type 'long int', but argument 5 has type 'u64 {aka long long unsigned int}' [-Werror=format=]
       pr_debug("Iteration %d took:\t%ldus\n", i, runtime_us);
                ^
    bench/../util/debug.h:18:21: note: in definition of macro 'pr_fmt'
     #define pr_fmt(fmt) fmt
                         ^~~
    bench/evlist-open-close.c:173:3: note: in expansion of macro 'pr_debug'
       pr_debug("Iteration %d took:\t%ldus\n", i, runtime_us);
       ^~~~~~~~
    cc1: all warnings being treated as errors
    /git/perf-5.14.0/tools/build/Makefile.build:139: recipe for target 'bench' failed

Cc: Riccardo Mancini <rickyman7@gmail.com>
Fixes: 4241eabf59 ("perf bench: Add benchmark for evlist open/close operations")
Link: http://lore.kernel.org/lkml/YS0oTcA9Zuy8Wjm9@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-30 18:22:02 -03:00
James Clark
a05b42702d perf tests: Fix *probe_vfs_getname.sh test failures
The commit 4d6101f5fd ("perf probe: Clarify error message about
not finding kernel modules debuginfo") changed the error message "Failed
to find the path for kernel" to "Failed to find the path for the
kernel".

Update the regex so that the tests still skip rather than fail when
kernel debug symbols aren't present.

Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: http://lore.kernel.org/lkml/20210825164259.833222-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-30 10:06:43 -03:00
Arnaldo Carvalho de Melo
edf7b4a2d8 perf bench inject-buildid: Handle writen() errors
The build on fedora:35 and fedora:rawhide with clang is failing with:

  49    41.00 fedora:35                     : FAIL clang version 13.0.0 (Fedora 13.0.0~rc1-1.fc35)
    bench/inject-buildid.c:351:6: error: variable 'len' set but not used [-Werror,-Wunused-but-set-variable]
            u64 len = 0;
                ^
    1 error generated.
    make[3]: *** [/git/perf-5.14.0-rc7/tools/build/Makefile.build:139: bench] Error 2
  50    41.11 fedora:rawhide                : FAIL clang version 13.0.0 (Fedora 13.0.0~rc1-1.fc35)
    bench/inject-buildid.c:351:6: error: variable 'len' set but not used [-Werror,-Wunused-but-set-variable]
            u64 len = 0;
                ^
    1 error generated.
    make[3]: *** [/git/perf-5.14.0-rc7/tools/build/Makefile.build:139: bench] Error 2

That 'len' variable is not used at all, so just make sure all the
synthesize_RECORD() routines return ssize_t to propagate the writen()
return, as it may fail, ditch the 'ret' var and bail out if those
routines fail.

Fixes: 0bf02a0d80 ("perf bench: Add build-id injection benchmark")
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/CAM9d7cgEZNSor+B+7Y2C+QYGme_v5aH0Zn0RLfxoQ+Fy83EHrg@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-30 10:06:37 -03:00
Li Huafei
cdf32b4467 perf unwind: Do not overwrite FEATURE_CHECK_LDFLAGS-libunwind-{x86,aarch64}
When setting LIBUNWIND_DIR, we first set

 FEATURE_CHECK_LDFLAGS-libunwind-{aarch64,x86} = -L$(LIBUNWIND_DIR)/lib.

<committer note>
This happens a bit before, the overwritting, in:

  libunwind_arch_set_flags = $(eval $(libunwind_arch_set_flags_code))
  define libunwind_arch_set_flags_code
    FEATURE_CHECK_CFLAGS-libunwind-$(1)  = -I$(LIBUNWIND_DIR)/include
    FEATURE_CHECK_LDFLAGS-libunwind-$(1) = -L$(LIBUNWIND_DIR)/lib
  endef

  ifdef LIBUNWIND_DIR
    LIBUNWIND_CFLAGS  = -I$(LIBUNWIND_DIR)/include
    LIBUNWIND_LDFLAGS = -L$(LIBUNWIND_DIR)/lib
    LIBUNWIND_ARCHS = x86 x86_64 arm aarch64 debug-frame-arm debug-frame-aarch64
    $(foreach libunwind_arch,$(LIBUNWIND_ARCHS),$(call libunwind_arch_set_flags,$(libunwind_arch)))
  endif

Look at that 'foreach' on all the LIBUNWIND_ARCHS.
</>

After commit 5c4d7c82c0 ("perf unwind: Do not put libunwind-{x86,aarch64}
in FEATURE_TESTS_BASIC"), FEATURE_CHECK_LDFLAGS-libunwind-{x86,aarch64} is
overwritten. As a result, the remote libunwind libraries cannot be searched
from $(LIBUNWIND_DIR)/lib directory during feature check tests. Fix it with
variable appending.

Before this patch:

  perf$ make VF=1 LIBUNWIND_DIR=/opt/libunwind_aarch64
   BUILD:   Doing 'make -j16' parallel build
  <SNIP>
  ...
  ...                    libopencsd: [ OFF ]
  ...                 libunwind-x86: [ OFF ]
  ...              libunwind-x86_64: [ OFF ]
  ...                 libunwind-arm: [ OFF ]
  ...             libunwind-aarch64: [ OFF ]
  ...         libunwind-debug-frame: [ OFF ]
  ...     libunwind-debug-frame-arm: [ OFF ]
  ... libunwind-debug-frame-aarch64: [ OFF ]
  ...                           cxx: [ OFF ]
  <SNIP>

  perf$ cat ../build/feature/test-libunwind-aarch64.make.output
  /usr/bin/ld: cannot find -lunwind-aarch64
  /usr/bin/ld: cannot find -lunwind-aarch64
  collect2: error: ld returned 1 exit status

After this patch:

  perf$ make VF=1 LIBUNWIND_DIR=/opt/libunwind_aarch64
   BUILD:   Doing 'make -j16' parallel build
  <SNIP>
  ...                    libopencsd: [ OFF ]
  ...                 libunwind-x86: [ OFF ]
  ...              libunwind-x86_64: [ OFF ]
  ...                 libunwind-arm: [ OFF ]
  ...             libunwind-aarch64: [ on  ]
  ...         libunwind-debug-frame: [ OFF ]
  ...     libunwind-debug-frame-arm: [ OFF ]
  ... libunwind-debug-frame-aarch64: [ OFF ]
  ...                           cxx: [ OFF ]
  <SNIP>

  perf$ cat ../build/feature/test-libunwind-aarch64.make.output

  perf$ ldd ./perf
        linux-vdso.so.1 (0x00007ffdf07da000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f30953dc000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f30951d4000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f3094e36000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f3094c32000)
        libelf.so.1 => /usr/lib/x86_64-linux-gnu/libelf.so.1 (0x00007f3094a18000)
        libdw.so.1 => /usr/lib/x86_64-linux-gnu/libdw.so.1 (0x00007f30947cc000)
        libunwind-x86_64.so.8 => /usr/lib/x86_64-linux-gnu/libunwind-x86_64.so.8 (0x00007f30945ad000)
        libunwind.so.8 => /usr/lib/x86_64-linux-gnu/libunwind.so.8 (0x00007f3094392000)
        liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f309416c000)
        libunwind-aarch64.so.8 => not found
        libslang.so.2 => /lib/x86_64-linux-gnu/libslang.so.2 (0x00007f3093c8a000)
        libpython2.7.so.1.0 => /usr/local/lib/libpython2.7.so.1.0 (0x00007f309386b000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f309364e000)
        libnuma.so.1 => /usr/lib/x86_64-linux-gnu/libnuma.so.1 (0x00007f3093443000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3093052000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f3096097000)
        libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007f3092e42000)
        libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f3092c3f000)

Fixes: 5c4d7c82c0 ("perf unwind: Do not put libunwind-{x86,aarch64} in FEATURE_TESTS_BASIC")
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zhang Jinhao <zhangjinhao2@huawei.com>
Link: http://lore.kernel.org/lkml/20210823134340.60955-1-lihuafei1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-30 10:06:29 -03:00
Arnaldo Carvalho de Melo
261f491133 perf config: Fix caching and memory leak in perf_home_perfconfig()
Acaict, perf_home_perfconfig() is supposed to cache the result of
home_perfconfig, which returns the default location of perfconfig for
the user, given the HOME environment variable.

However, the current implementation calls home_perfconfig every time
perf_home_perfconfig() is called (so no caching is actually performed),
replacing the previous pointer, thus also causing a memory leak.

This patch adds a check of whether either config or failed is set and,
in that case, directly returns config without calling home_perfconfig at
each invocation.

Fixes: f5f03e19ce ("perf config: Add perf_home_perfconfig function")
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: http://lore.kernel.org/lkml/20210820130817.740536-1-rickyman7@gmail.com
[ Removed needless double check for the 'failed' variable ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-30 10:06:22 -03:00
Alexey Dobriyan
128dbd78bd perf tools: Fixup get_current_dir_name() compilation
strdup() prototype doesn't live in stdlib.h .

Add limits.h for PATH_MAX definition as well.

This fixes the build on Android.

Signed-off-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/YRukaQbrgDWhiwGr@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-30 10:06:16 -03:00
Nghia Le
ce73af8087 perf tools: Add missing newline at the end of header file
Add missing newline at the end of file parse-sublevel-options.h.

Thus removing relevant warning reported by checkpatch.

Signed-off-by: Nghia Le <nghialm78@gmail.com>
Reviewed-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http //lore.kernel.org/lkml/20210824085947.224062-1-nghialm78@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-24 15:01:31 -03:00
Riccardo Mancini
6ca822e576 perf tests dlfilter: Free desc and long_desc in check_filter_desc
In dlfilter-test.c, check_filter_desc() calls get_filter_desc() which
allocates 'desc' and 'long_desc'.  However, these variables are never
deallocated.

This patch adds the missing free() calls.

Fixes: 9f9c9a8de2 ("perf tests: Add dlfilter test")
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210820113132.724034-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 11:16:27 -03:00
Namhyung Kim
5f534a8181 perf test: Do not compare overheads in the zstd comp test
The overhead can vary on each run so it'd make the test failed
sometimes.  Also order of hist entry can change.

Use perf report -F option to omit the overhead field and sort the
result alphabetically.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20210812235738.1684583-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-13 10:45:36 -03:00
Jin Yao
1d3351e631 perf tools: Enable on a list of CPUs for hybrid
The 'perf record' and 'perf stat' commands have supported the option
'-C/--cpus' to count or collect only on the list of CPUs provided. This
option needs to be supported for hybrid as well.

For hybrid support, it needs to check that the cpu list are available
on hybrid PMU. One example for AlderLake, cpu0-7 is 'cpu_core', cpu8-11
is 'cpu_atom'.

Before:

  # perf stat -e cpu_core/cycles/ -C11 -- sleep 1

   Performance counter stats for 'CPU(s) 11':

     <not supported>      cpu_core/cycles/

         1.006179431 seconds time elapsed

The 'perf stat' command silently returned "<not supported>" without any
helpful information. It should error out pointing out that that cpu11
was not 'cpu_core'.

After:

  # perf stat -e cpu_core/cycles/ -C11 -- sleep 1
  WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7)
  failed to use cpu list 11

We also need to support the events without pmu prefix specified.

  # perf stat -e cycles -C11 -- sleep 1
  WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7)

   Performance counter stats for 'CPU(s) 11':

           1,067,373      cpu_atom/cycles/

         1.005544738 seconds time elapsed

The perf tool creates two cycles events automatically, cpu_core/cycles/ and
cpu_atom/cycles/. It checks that cpu11 is not 'cpu_core', then shows a warning
for cpu_core/cycles/ and only count the cpu_atom/cycles/.

If part of cpus are 'cpu_core' and part of cpus are 'cpu_atom', for example,

  # perf stat -e cycles -C0,11 -- sleep 1
  WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.

   Performance counter stats for 'CPU(s) 0,11':

           1,914,704      cpu_core/cycles/
           2,036,983      cpu_atom/cycles/

         1.005815641 seconds time elapsed

It now automatically selects cpu0 for cpu_core/cycles/, selects cpu11 for
cpu_atom/cycles/, and output with some warnings.

Some more complex examples,

  # perf stat -e cycles,instructions -C0,11 -- sleep 1
  WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
  WARNING: use 0 in 'cpu_core' for 'instructions', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'instructions', skip other cpus in list.

   Performance counter stats for 'CPU(s) 0,11':

           2,780,387      cpu_core/cycles/
           1,583,432      cpu_atom/cycles/
           3,957,277      cpu_core/instructions/
           1,167,089      cpu_atom/instructions/

         1.006005124 seconds time elapsed

  # perf stat -e cycles,cpu_atom/instructions/ -C0,11 -- sleep 1
  WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
  WARNING: use 11 in 'cpu_atom' for 'cpu_atom/instructions/', skip other cpus in list.

   Performance counter stats for 'CPU(s) 0,11':

           3,290,301      cpu_core/cycles/
           1,953,073      cpu_atom/cycles/
           1,407,869      cpu_atom/instructions/

         1.006260912 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https //lore.kernel.org/r/20210723063433.7318-4-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 16:07:32 -03:00
Jin Yao
b726e3634e perf tools: Create hybrid flag in target
The user may count or collect only on a cpu list via '-C/--cpus' option.

Previously cpus for an evsel were retrieved from PMU's sysfs. But if the
target cpu list is defined, the retrieved cpus are not kept and the
target cpu list is used instead.

But for hybrid system, we can't directly use target cpu list. The cpu
list may not be available on hybrid pmu (e.g. cpu_core or cpu_atom).  So
we should not set the 'has_user_cpus' flag for hybrid system.

The difficulity is that we can't call perf_pmu__has_hybrid() in evlist.c
to check hybrid system otherwise 'perf test python' would be failed
(undefined symbol for perf_pmu__has_hybrid). If we add pmu.c to
python-ext-sources, too many symbol dependencies are hard to resolve.

We use an alternative method by using a new 'hybrid' flag in target
for hybrid system checking.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https //lore.kernel.org/r/20210723063433.7318-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 16:04:33 -03:00
Riccardo Mancini
ebdf90a4a1 perf test: Make --skip work on shell tests
perf-test has the option --skip to provide a list of tests to skip.
However, this option does not work with shell scripts.

This patch passes the skiplist to run_shell_tests, so that also shell
scripts could be skipped using --skip.

Committer tests:

Tests 79 onwards are shell tests:

Before:

  # perf test --skip 1,2,81,82,84,88,90
   1: vmlinux symtab matches kallsyms                                 : Skip (user override)
   2: Detect openat syscall event                                     : Skip (user override)
   3: Detect openat syscall event on all cpus                         : Ok
   4: Read samples using the mmap interface                           : Ok
   5: Test data source output                                         : Ok
  <SNIP>
  78: x86 Sample parsing                                              : Ok
  79: build id cache operations                                       : Ok
  80: daemon operations                                               : Ok
  81: perf pipe recording and injection test                          : Ok
  82: Add vfs_getname probe to get syscall args filenames             : FAILED!
  83: probe libc's inet_pton & backtrace it with ping                 : Ok
  84: Use vfs_getname probe to get syscall args filenames             : FAILED!
  85: Zstd perf.data compression/decompression                        : Ok
  86: perf stat csv summary test                                      : Ok
  87: perf stat metrics (shadow stat) test                            : Ok
  88: perf stat --bpf-counters test                                   : Ok
  89: Check Arm CoreSight trace data recording and synthesized samples: Skip
  90: Check open filename arg using perf trace + vfs_getname          : FAILED!
  #

After:

  # perf test --skip 1,2,81,82,84,88,90
   1: vmlinux symtab matches kallsyms                                 : Skip (user override)
   2: Detect openat syscall event                                     : Skip (user override)
   3: Detect openat syscall event on all cpus                         : Ok
   4: Read samples using the mmap interface                           : Ok
   5: Test data source output                                         : Ok
  <SNIP>
  78: x86 Sample parsing                                              : Ok
  79: build id cache operations                                       : Ok
  80: daemon operations                                               : Ok
  81: perf pipe recording and injection test                          : Skip (user override)
  82: Add vfs_getname probe to get syscall args filenames             : Skip (user override)
  83: probe libc's inet_pton & backtrace it with ping                 : Ok
  84: Use vfs_getname probe to get syscall args filenames             : Skip (user override)
  85: Zstd perf.data compression/decompression                        : Ok
  86: perf stat csv summary test                                      : Ok
  87: perf stat metrics (shadow stat) test                            : Ok
  88: perf stat --bpf-counters test                                   : Skip (user override)
  89: Check Arm CoreSight trace data recording and synthesized samples: Skip
  90: Check open filename arg using perf trace + vfs_getname          : Skip (user override)
  #

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210811180625.160944-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 15:45:08 -03:00
Adrian Hunter
9f9c9a8de2 perf tests: Add dlfilter test
Add a perf test to test the dlfilter C API.

A perf.data file is synthesized and then processed by perf script with a
dlfilter named dlfilter-test-api-v0.so. Also a C file is compiled to
provide a dso to match the synthesized perf.data file.

Committer testing:

  [root@five ~]# perf test dlfilter
  72: dlfilter C API                                                  : Ok
  [root@five ~]# perf test -v dlfilter
  72: dlfilter C API                                                  :
  --- start ---
  test child forked, pid 3387712
  Checking for gcc
  Command: gcc --version
  gcc (GCC) 11.1.1 20210531 (Red Hat 11.1.1-3)
  Copyright (C) 2021 Free Software Foundation, Inc.
  This is free software; see the source for copying conditions.  There is NO
  warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

  dlfilters path: /var/home/acme/libexec/perf-core/dlfilters
  Command: gcc -g -o /tmp/dlfilter-test-3387712-prog /tmp/dlfilter-test-3387712-prog.c
  Creating new host machine structure
  Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 0 --dlarg last
  start API
  filter_event_early API
  filter_event API
  stop API
  Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 1 --dlarg last
  start API
  filter_event_early API
  filter_event API
  stop API
  Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 2 --dlarg last
  start API
  filter_event_early API
  stop API
  test child finished with 0
  ---- end ----
  dlfilter C API: Ok
  [root@five ~]#

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https //lore.kernel.org/r/20210811101036.17986-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 09:35:44 -03:00
Adrian Hunter
3af1dfdd51 perf build: Move perf_dlfilters.h in the source tree
Move perf_dlfilters.h in the source tree so that it will be found when
building dlfilters as part of the perf build.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https //lore.kernel.org/r/20210811101036.17986-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 09:35:24 -03:00
Adrian Hunter
b29edf35ef perf dlfilter: Amend documentation wrt library dependencies
Like all locally-built programs, dlfilters may need to be re-built if
shared libraries they use change. Also there may be unexpected results
if the dfilter uses different versions of the shared libraries that perf
uses.

Note those things in the documentation.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https //lore.kernel.org/r/20210811101036.17986-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 09:34:31 -03:00
Adrian Hunter
3e8e226307 perf script: Fix --list-dlfilters documentation
The option --list-dlfilters does use a string value.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 638e2b9984 ("perf script Add option to list dlfilters")
Link: https //lore.kernel.org/r/20210811101036.17986-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 09:34:07 -03:00
Adrian Hunter
29159727aa perf script: Fix unnecessary machine_resolve()
machine_resolve() may have already been called. Test for that to avoid
calling it again unnecessarily.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https //lore.kernel.org/r/20210811101036.17986-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 09:33:59 -03:00
Adrian Hunter
988db17932 perf script: Fix documented const'ness of perf_dlfilter_fns
perf_dlfilter_fns must not be const, because it is not.

Declaring it const can result in it being mapped read-only, causing a
segfaullt when it is written. Update documentation accordingly.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 8defa7147d5572 ("perf script Add API for filtering via dynamically loaded shared object")
Link: https //lore.kernel.org/r/20210811101036.17986-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 09:33:15 -03:00
Jin Yao
c4ad8fabd0 perf vendor events: Update metrics for SkyLake Server
Update JSON metrics for SkyLake Server.

Based on TMA metrics 4.21 at 01.org.
https://download.01.org/perfmon/

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linux-kernel@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https //lore.kernel.org/r/20210810020508.31261-7-yao.jin@linux.intel.com
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
2021-08-10 15:19:49 -03:00
Jin Yao
d5c0a8d554 perf vendor events intel: Update uncore event list for SkyLake Server
Update JSON uncore events for SkyLake Server.

Based on JSON list v1.24:

https://download.01.org/perfmon/SKX/

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linux-kernel@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https //lore.kernel.org/r/20210810020508.31261-6-yao.jin@linux.intel.com
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
2021-08-10 15:19:34 -03:00
Jin Yao
2c72404e95 perf vendor events intel: Update core event list for SkyLake Server
Update JSON core events for SkyLake Server.

Based on JSON list v1.24:

https://download.01.org/perfmon/SKX/

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linux-kernel@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https //lore.kernel.org/r/20210810020508.31261-5-yao.jin@linux.intel.com
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
2021-08-10 15:19:12 -03:00
Jin Yao
ed97cc6cbb perf vendor events: Update metrics for CascadeLake Server
Update JSON metrics for CascadeLake Server.

Based on TMA metrics 4.21 at 01.org.
https://download.01.org/perfmon/

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linux-kernel@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https //lore.kernel.org/r/20210810020508.31261-4-yao.jin@linux.intel.com
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
2021-08-10 15:18:36 -03:00
Jin Yao
96fe584f99 perf vendor events intel: Update uncore event list for CascadeLake Server
Update JSON uncore events for CascadeLake Server.

Based on JSON list v1.11:

https://download.01.org/perfmon/CLX/

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linux-kernel@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https //lore.kernel.org/r/20210810020508.31261-3-yao.jin@linux.intel.com
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
2021-08-10 15:12:04 -03:00
Jin Yao
e0ddfd8d50 perf vendor events intel: Update core event list for CascadeLake Server
Update JSON core events for CascadeLake Server.

Based on JSON list v1.11:

https://download.01.org/perfmon/CLX/

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linux-kernel@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https //lore.kernel.org/r/20210810020508.31261-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 15:08:21 -03:00
John Garry
8ee465a181 perf test: Add pmu-events sys event support
Add support for system events, along with core and uncore events.

Support for a sample PMU is also added.

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-12-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 14:47:52 -03:00
John Garry
5abd3988b0 perf jevents: Print SoC name per system event table
Print the SoC name per system event table, which will allow the test SoC be
identified by the pmu-events test.

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-11-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 14:47:07 -03:00
John Garry
e199f47f15 perf pmu: Make pmu_add_sys_aliases() public
Function pmu_add_sys_aliases() will be required for the PMU events test
for system events aliases, so make it public.

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-10-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 14:46:39 -03:00
John Garry
6a86657fbc perf test: Add more pmu-events uncore aliases
Add more events to cover the scenarios fixed and also inadvertently
broken by commit c47a5599ed ("perf tools: Fix pattern matching for
same substring in different PMU type")

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-9-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 14:45:53 -03:00
John Garry
5a65c0c8f6 perf test: Re-add pmu-event uncore PMU alias test
Add support to match aliases for uncore PMUs.

Since we cannot rely on the PMUs being present on the host system, use
fake PMUs.

The following conditions in the test are ensures:

- Expected count of aliases created

- All aliases can be matched to an expected alias in
  perf_pmu_test_pmu.aliases

This will catch the condition fixed in commit c47a5599ed ("perf tools:
Fix pattern matching for same substring in different PMU type"), where
excess events were created for a PMU. It will also fix the scenario
inadvertently broken there, where no aliases were created for aliases
with multiple tokens.

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-8-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 14:44:51 -03:00
John Garry
5806099a2e perf pmu: Check .is_uncore field in pmu_add_cpu_aliases_map()
Calling pmu_is_uncore() for fake PMUs does not work, as it checks sysfs
for the PMU details (which won't exist).

Check .is_uncore field instead, which makes sense anyway.

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-7-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 14:44:00 -03:00
John Garry
3bc4526b30 perf test: Test pmu-events core aliases separately
The current method to test uncore event aliasing is limited, as it
relies on the uncore PMU being present in the host system to test.

As such, breakages of uncore PMU aliases goes unnoticed. To make this
more robust, a new method of testing uncore PMUs with fake PMUs will be
used in future. This will be separate to testing core PMU aliases.

So make the current test function core PMU only. Uncore PMU alias
support will be re-added later.

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-6-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 14:42:35 -03:00