Commit graph

7156 commits

Author SHA1 Message Date
Masami Hiramatsu
0808b3d5b7 perf probe: Provide clearer message permission error for tracefs access
Report permission error for the tracefs open and rewrite whole the error
message code around it.

You'll see a hint according to what you want to do with perf probe as
below.

  $ perf probe -l
  No permission to read tracefs.
  Please try 'sudo mount -o remount,mode=755 /sys/kernel/tracing/'
    Error: Failed to show event list.

  $ perf probe -d \*
  No permission to write tracefs.
  Please run this command again with sudo.
    Error: Failed to delete events.

This also fixes -ENOTSUP checking for mounting tracefs/debugfs.
Actually open returns -ENOENT in that case and we have to check it with
current mount point list. If we unmount debugfs and tracefs perf probe
shows correct message as below.

  $ perf probe -l
  Debugfs or tracefs is not mounted
  Please try 'sudo mount -t tracefs nodev /sys/kernel/tracing/'
    Error: Failed to show event list.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/162299456839.503471.13863002017089255222.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-08 14:12:14 -03:00
Leo Yan
bde1e7d934 perf auxtrace: Change to use SMP memory barriers
The kernel and the userspace tool can access the AUX ring buffer head
and tail from different CPUs, thus SMP class of barriers are required
on SMP system.

This patch changes to use SMP barriers to replace mb() and rmb()
barriers.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20210602103007.184993-6-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-08 13:45:04 -03:00
Zou Wei
f54cad25a1 perf srccode: Use list_move() instead of equivalent list_del() + list_add() sequence
Using list_move() instead of list_del() + list_add(), shorter,
equivalent.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1623113566-49455-1-git-send-email-zou_wei@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-08 09:36:36 -03:00
Masami Hiramatsu
f4f1c42953 perf probe: Report possible permission error for map__load() failure
Report possible permission error including kptr_restrict setting
for map__load() failure. This can happen when non-superuser runs
perf probe.

With this patch, perf probe shows the following message.

 $ perf probe vfs_read
 Failed to load symbols from /proc/kallsyms
 Please ensure you can read the /proc/kallsyms symbol addresses.
 If the /proc/sys/kernel/kptr_restrict is '2', you can not read
 kernel symbol address even if you are a superuser. Please change
 it to '1'. If kptr_restrict is '1', the superuser can read the
 symbol addresses.
 In that case, please run this command again with sudo.
   Error: Failed to add events.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/162282065877.448336.10047912688119745151.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 15:43:37 -03:00
Arnaldo Carvalho de Melo
0ab8009b3e Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes from perf/urgent to allow perf/core to be used for new
development.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 14:58:44 -03:00
Jin Yao
d5a8bd0fcd perf mem: Disable 'mem-loads-aux' group before reporting
For some platforms, such as Alderlake, the 'mem-loads' event is required
to use together with 'mem-loads-aux' within a group and 'mem-loads-aux'
must be the group leader. Now we disable this group before reporting
because 'mem-loads-aux' is just an auxiliary event. It doesn't carry
any valid memory load result. If we show the 'mem-loads-aux' +
'mem-loads' as a group in report, it needs many of changes but they
are totally unnecessary.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-8-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 11:06:01 -03:00
Jin Yao
4a9086adc3 perf mem: Support record for hybrid platform
Support 'perf mem record' for hybrid platform. On hybrid platform,
such as Alderlake, when executing 'perf mem record', it actually calls:

record -e {cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}:P
       -e cpu_atom/mem-loads,ldlat=30/P
       -e cpu_core/mem-stores/P
       -e cpu_atom/mem-stores/P

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-6-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 11:04:59 -03:00
Jin Yao
e7ce8d11bf perf tools: Check if mem_events is supported for hybrid platform
Check if the mem_events ('mem-loads' and 'mem-stores') exist
in the sysfs path.

For Alderlake, the hybrid cpu pmu are "cpu_core" and "cpu_atom".
Check the existing of following paths:

/sys/devices/cpu_atom/events/mem-loads
/sys/devices/cpu_atom/events/mem-stores
/sys/devices/cpu_core/events/mem-loads
/sys/devices/cpu_core/events/mem-stores

If the patch exists, the mem_event is supported.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-5-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 11:04:33 -03:00
Jin Yao
d2f327acc6 perf tools: Support pmu prefix for mem-load event
The perf_mem_events__name() can generate the mem-load event name.
It uses a variable 'mem_loads_name__init' to avoid generating the
event name every time (because perf_pmu__scan takes some time).

The perf_mem_events__name() assumes the pmu is "cpu" but it's not
correct for hybrid platform. For Alderlake, the pmu is "cpu_core" or
"cpu_atom"

Introduce a new parameter 'pmu_name' in perf_mem_events__name
to let the caller specify a pmu name.

Considering such event name is x86 specific, so move
perf_mem_events[] to arch/x86/util/mem-events.c.

We still keep the variable 'mem_loads_name__init' but it's only
used when pmu_name is NULL (compatible for original behavior). When
pmu_name is not NULL (e.g. "cpu_core"), this patch doesn't have
optimization. That can be implemented in follow up patch.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 11:03:35 -03:00
Adrian Hunter
e621b8ffec perf auxtrace: Factor out itrace_do_parse_synth_opts()
Factor out itrace_do_parse_synth_opts() so that it can be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210530192308.7382-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 10:04:10 -03:00
Adrian Hunter
d9ae9c9776 perf script: Factor out script_fetch_insn()
Factor out script_fetch_insn() so it can be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210530192308.7382-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 10:03:46 -03:00
Adrian Hunter
cf9bfa6c15 perf scripting python: Assign perf_script_context
The scripting_context pointer itself does not change and nor does it need
to. Put it directly into the script as a variable at the start so it does
not have to be passed on each call into the script.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210530192308.7382-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 10:03:33 -03:00
Adrian Hunter
67e50ce0e3 perf scripting: Add perf_session to scripting_context
This is preparation for allowing a script to set the itrace options
for the session if they have not already been set.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210530192308.7382-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 10:03:17 -03:00
Adrian Hunter
cac30400a6 perf scripting: Add scripting_context__update()
Move scripting_context update to a separate function and add
the arguments of ->process_event() to it.

This prepares the way for adding more methods to the perf_trace_context
module, by providing the context information that they will need.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210530192308.7382-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 10:03:02 -03:00
Namhyung Kim
c673b7f59e perf stat: Fix error check for bpf_program__attach
It seems the bpf_program__attach() returns a negative error code instead
of a NULL pointer in case of error.

Fixes: 7fac83aaf2 ("perf stat: Introduce 'bperf' to share hardware PMCs with BPF")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: http://lore.kernel.org/lkml/20210527220052.1657578-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-27 21:51:21 -03:00
Ravi Bangoria
41ca1d1e88 perf probe: Provide more detail with relocation warning
When run as normal user with default sysctl kernel.kptr_restrict=0
and kernel.perf_event_paranoid=2, perf probe fails with:

  $ ./perf probe move_page_tables
  Relocated base symbol is not found!

The warning message is not much informative. The reason perf fails
is because /proc/kallsyms is restricted by perf_event_paranoid=2
for normal user and thus perf fails to read relocated address of
the base symbol.

Tweaking kptr_restrict and perf_event_paranoid can change the
behavior of perf probe. Also, running as root or privileged user
works too. Add these details in the warning message.

Plus, kmap->ref_reloc_sym might not be always set even if
host_machine is initialized. Above is the example of the same.
Remove that comment.

Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210525043744.193297-1-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-27 13:55:28 -03:00
Denys Zagorui
6793672acc perf parse-events: Add bison --file-prefix-map option
During a perf build with O= bison stores full paths in generated files
and those paths are stored in resulting perf binary.

Starting from bison v3.7.1 those paths can be remapped by using the
--file-prefix-map option.  Use this option if possible to make perf
binary more reproducible.

Signed-off-by: Denys Zagorui <dzagorui@cisco.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210524111514.65713-3-dzagorui@cisco.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-27 13:55:28 -03:00
Adrian Hunter
2ede92173f perf scripting python: Add auxtrace error
Add auxtrace_error to general python scripting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:17 -03:00
Adrian Hunter
0db2134069 perf scripting python: Add context switch
Add context_switch to general python scripting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:17 -03:00
Adrian Hunter
22cc2f74bb perf scripting python: Add cpumode
Add cpumode to python scripting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:17 -03:00
Adrian Hunter
142b05182e perf scripting python: Add IPC
Add IPC to python scripting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:17 -03:00
Adrian Hunter
bee272af78 perf scripting python: Add sample flags
Add sample flags to python scripting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:17 -03:00
Adrian Hunter
54cd8b0324 perf script: Factor out perf_sample__sprintf_flags()
Factor out perf_sample__sprintf_flags() so it can be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:17 -03:00
Adrian Hunter
3f8e009e01 perf scripting python: Add 'addr_location' for 'addr'
If sample addr correlates to a symbol, add  "addr_dso", "addr_symbol", and
"addr_symoff" to python scripting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:17 -03:00
Adrian Hunter
8271b50958 perf scripting python: Factor out set_sym_in_dict()
Factor out set_sym_in_dict() so it can be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:17 -03:00
Adrian Hunter
d04c1ff0b3 perf scripting python: Fix tuple_set_u64()
tuple_set_u64() produces a signed value instead of an unsigned value.
That works for database export but not other cases. Rename to
tuple_set_d64() for database export and fix tuple_set_u64().

Fixes: df919b400a ("perf scripting python: Extend interface to export data in a database-friendly way")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-2-adrian.hunter@intel.com
2021-05-25 10:07:16 -03:00
Arnaldo Carvalho de Melo
0461296878 perf auxtrace: Make perf_event__process_auxtrace*() callable
As we'll use it in the upcoming python interfaces and when built with:

                make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
  +NO_LIBZSTD=1 NO_LIBCAP=1 NO_SYSCALL_TABLE=1
  make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1 NO_LIBZSTD=1 NO_LIBCAP=1
  +NO_SYSCALL_TABLE=1
    BUILD:   Doing 'make -j24' parallel build
  <SNIP>
    CC      /tmp/tmp.rGrdpQlTCr/builtin-daemon.o
  In file included from util/events_stats.h:8,
                   from util/evlist.h:12,
                   from builtin-script.c:18:
  builtin-script.c: In function ‘process_auxtrace_error’:
  util/auxtrace.h:708:57: error: called object is not a function or function pointer
    708 | #define perf_event__process_auxtrace_error              0
        |                                                         ^
  builtin-script.c:2443:16: note: in expansion of macro ‘perf_event__process_auxtrace_error’
   2443 |         return perf_event__process_auxtrace_error(session, event);
        |                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    MKDIR   /tmp/tmp.rGrdpQlTCr/tests/
    MKDIR   /tmp/tmp.rGrdpQlTCr/bench/
    CC      /tmp/tmp.rGrdpQlTCr/tests/builtin-test.o
    CC      /tmp/tmp.rGrdpQlTCr/bench/sched-messaging.o
  builtin-script.c:2444:1: error: control reaches end of non-void function [-Werror=return-type]
   2444 | }
        | ^

To: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 10:07:16 -03:00
Adrian Hunter
6ea4b5dbe0 perf script: Find script file relative to exec path
Allow perf script to find a script in the exec path.

Example:

Before:

 $ perf record -a -e intel_pt/branch=0/ sleep 0.1
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.954 MB perf.data ]
 $ perf script intel-pt-events.py 2>&1 | head -3
   Error: Couldn't find script `intel-pt-events.py'
   See perf script -l for available scripts.
 $ perf script -s intel-pt-events.py 2>&1 | head -3
 Can't open python script "intel-pt-events.py": No such file or directory
 $ perf script ~/libexec/perf-core/scripts/python/intel-pt-events.py 2>&1 | head -3
   Error: Couldn't find script `/home/ahunter/libexec/perf-core/scripts/python/intel-pt-events.py'
   See perf script -l for available scripts.
 $

After:

 $ perf script intel-pt-events.py 2>&1 | head -3
 Intel PT Power Events and PTWRITE
            perf  8123/8123  [000]       551.230753986     cbr:  42  freq: 4219 MHz  (156%)                0 [unknown] ([unknown])
            perf  8123/8123  [001]       551.230808216     cbr:  42  freq: 4219 MHz  (156%)                0 [unknown] ([unknown])
 $ perf script -s intel-pt-events.py 2>&1 | head -3
 Intel PT Power Events and PTWRITE
            perf  8123/8123  [000]       551.230753986     cbr:  42  freq: 4219 MHz  (156%)                0 [unknown] ([unknown])
            perf  8123/8123  [001]       551.230808216     cbr:  42  freq: 4219 MHz  (156%)                0 [unknown] ([unknown])
 $ perf script ~/libexec/perf-core/scripts/python/intel-pt-events.py 2>&1 | head -3
 Intel PT Power Events and PTWRITE
            perf  8123/8123  [000]       551.230753986     cbr:  42  freq: 4219 MHz  (156%)                0 [unknown] ([unknown])
            perf  8123/8123  [001]       551.230808216     cbr:  42  freq: 4219 MHz  (156%)                0 [unknown] ([unknown])
 $

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210524065718.11421-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 09:51:44 -03:00
Arnaldo Carvalho de Melo
100475f83b Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes from perf/urgent.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-25 09:13:52 -03:00
Song Liu
f8b61bd204 perf stat: Skip evlist__[enable|disable] when all events uses BPF
When all events of a perf-stat session use BPF, it is not necessary to
call evlist__enable() and evlist__disable(). Skip them when
all_counters_use_bpf is true.

Signed-off-by: Song Liu <song@kernel.org>
Reported-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-21 16:50:17 -03:00
Adrian Hunter
f42907e8a4 perf script: Add missing PERF_IP_FLAG_CHARS for VM-Entry and VM-Exit
Add 'g' (guest) for VM-Entry and 'h' (host) for VM-Exit.

Fixes: c025d46cd9 ("perf script: Add branch types for VM-Entry and VM-Exit")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210521175127.27264-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-21 16:41:37 -03:00
Arnaldo Carvalho de Melo
3b2f17ad17 perf parse-events: Check if the software events array slots are populated
To avoid a NULL pointer dereference when the kernel supports the new
feature but the tooling still hasn't an entry for it.

This happened with the recently added PERF_COUNT_SW_CGROUP_SWITCHES
software event.

Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Link: https://lore.kernel.org/linux-perf-users/YKVESEKRjKtILhog@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-21 07:47:56 -03:00
Namhyung Kim
fb6c79d726 perf tools: Add 'cgroup-switches' software event
It counts how often cgroups are changed actually during the context
switches.

  # perf stat -a -e context-switches,cgroup-switches -a sleep 1

   Performance counter stats for 'system wide':

              11,267      context-switches
              10,950      cgroup-switches

         1.015634369 seconds time elapsed

Committer notes:

The kernel patches landed in v5.13, but this entry wasn't filled in
perf's parse-events tables, which was leading to a segfault when running
'perf list' on a kernel with that feature, as reported by Thomas
Richter.

Also removed the part touching tools/include/uapi/linux/perf_event.h as
it was updated in the usual sync with the kernel UAPI headers, in a
previous, already upstream, patch.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20210210083327.22726-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-19 14:23:23 -03:00
Adrian Hunter
0a0c597245 perf intel-pt: Remove redundant setting of ptq->insn_len
Remove redundant "ptq->insn_len = 0" statement.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210519074515.9262-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-19 10:35:31 -03:00
Adrian Hunter
c954eb72b3 perf intel-pt: Fix sample instruction bytes
The decoder reports the current instruction if it was decoded. In some
cases the current instruction is not decoded, in which case the instruction
bytes length must be set to zero. Ensure that is always done.

Note perf script can anyway get the instruction bytes for any samples where
they are not present.

Also note, that there is a redundant "ptq->insn_len = 0" statement which is
not removed until a subsequent patch in order to make this patch apply
cleanly to stable branches.

Example:

A machne that supports TSX is required. It will have flag "rtm". Kernel
parameter tsx=on may be required.

 # for w in `cat /proc/cpuinfo | grep -m1 flags `;do echo $w | grep rtm ; done
 rtm

Test program:

 #include <stdio.h>
 #include <immintrin.h>

 int main()
 {
        int x = 0;

        if (_xbegin() == _XBEGIN_STARTED) {
                x = 1;
                _xabort(1);
        } else {
                printf("x = %d\n", x);
        }
        return 0;
 }

Compile with -mrtm i.e.

 gcc -Wall -Wextra -mrtm xabort.c -o xabort

Record:

 perf record -e intel_pt/cyc/u --filter 'filter main @ ./xabort' ./xabort

Before:

 # perf script --itrace=xe -F+flags,+insn,-period --xed --ns
          xabort  1478 [007] 92161.431348581:   transactions:   x                              400b81 main+0x14 (/root/xabort)          mov $0xffffffff, %eax
          xabort  1478 [007] 92161.431348624:   transactions:   tx abrt                        400b93 main+0x26 (/root/xabort)          mov $0xffffffff, %eax

After:

 # perf script --itrace=xe -F+flags,+insn,-period --xed --ns
          xabort  1478 [007] 92161.431348581:   transactions:   x                              400b81 main+0x14 (/root/xabort)          xbegin 0x6
          xabort  1478 [007] 92161.431348624:   transactions:   tx abrt                        400b93 main+0x26 (/root/xabort)          xabort $0x1

Fixes: faaa87680b ("perf intel-pt/bts: Report instruction bytes and length in sample")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210519074515.9262-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-19 10:33:43 -03:00
Adrian Hunter
cb7987837c perf intel-pt: Fix transaction abort handling
When adding support for power events, some handling of FUP packets was
unified. That resulted in breaking reporting of TSX aborts, by not
considering the associated TIP packet. Fix that.

Example:

A machine that supports TSX is required. It will have flag "rtm". Kernel
parameter tsx=on may be required.

 # for w in `cat /proc/cpuinfo | grep -m1 flags `;do echo $w | grep rtm ; done
 rtm

Test program:

 #include <stdio.h>
 #include <immintrin.h>

 int main()
 {
        int x = 0;

        if (_xbegin() == _XBEGIN_STARTED) {
                x = 1;
                _xabort(1);
        } else {
                printf("x = %d\n", x);
        }
        return 0;
 }

Compile with -mrtm i.e.

 gcc -Wall -Wextra -mrtm xabort.c -o xabort

Record:

 perf record -e intel_pt/cyc/u --filter 'filter main @ ./xabort' ./xabort

Before:

 # perf script --itrace=be -F+flags,+addr,-period,-event --ns
          xabort  1478 [007] 92161.431348552:   tr strt                             0 [unknown] ([unknown]) =>           400b6d main+0x0 (/root/xabort)
          xabort  1478 [007] 92161.431348624:   jmp                            400b96 main+0x29 (/root/xabort) =>           400bae main+0x41 (/root/xabort)
          xabort  1478 [007] 92161.431348624:   return                         400bb4 main+0x47 (/root/xabort) =>           400b87 main+0x1a (/root/xabort)
          xabort  1478 [007] 92161.431348637:   jcc                            400b8a main+0x1d (/root/xabort) =>           400b98 main+0x2b (/root/xabort)
          xabort  1478 [007] 92161.431348644:   tr end  call                   400ba9 main+0x3c (/root/xabort) =>           40f690 printf+0x0 (/root/xabort)
          xabort  1478 [007] 92161.431360859:   tr strt                             0 [unknown] ([unknown]) =>           400bae main+0x41 (/root/xabort)
          xabort  1478 [007] 92161.431360882:   tr end  return                 400bb4 main+0x47 (/root/xabort) =>           401139 __libc_start_main+0x309 (/root/xabort)

After:

 # perf script --itrace=be -F+flags,+addr,-period,-event --ns
          xabort  1478 [007] 92161.431348552:   tr strt                             0 [unknown] ([unknown]) =>           400b6d main+0x0 (/root/xabort)
          xabort  1478 [007] 92161.431348624:   tx abrt                        400b93 main+0x26 (/root/xabort) =>           400b87 main+0x1a (/root/xabort)
          xabort  1478 [007] 92161.431348637:   jcc                            400b8a main+0x1d (/root/xabort) =>           400b98 main+0x2b (/root/xabort)
          xabort  1478 [007] 92161.431348644:   tr end  call                   400ba9 main+0x3c (/root/xabort) =>           40f690 printf+0x0 (/root/xabort)
          xabort  1478 [007] 92161.431360859:   tr strt                             0 [unknown] ([unknown]) =>           400bae main+0x41 (/root/xabort)
          xabort  1478 [007] 92161.431360882:   tr end  return                 400bb4 main+0x47 (/root/xabort) =>           401139 __libc_start_main+0x309 (/root/xabort)

Fixes: a472e65fc4 ("perf intel-pt: Add decoder support for ptwrite and power event packets")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210519074515.9262-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-19 10:31:04 -03:00
Thomas Richter
316a76a58c perf test: Fix libpfm4 support (63) test error for nested event groups
Compiling perf with make LIBPFM4=1 includes libpfm support and
enables test case 63 'Test libpfm4 support'. This test reports an error
on all platforms for subtest 63.2 'test groups of --pfm-events'.
The reported error message is 'nested event groups not supported'

 # ./perf test -F 63
 63: Test libpfm4 support                                            :
 63.1: test of individual --pfm-events                               :
 Error:
 failed to parse event stereolab : event not found
 Error:
 failed to parse event stereolab,instructions : event not found
 Error:
 failed to parse event instructions,stereolab : event not found
  Ok
 63.2: test groups of --pfm-events                                   :
 Error:
 nested event groups not supported    <------ Error message here
 Error:
 failed to parse event {stereolab} : event not found
 Error:
 failed to parse event {instructions,cycles},{instructions,stereolab} :\
	 event not found
 Ok
 #

This patch addresses the error message 'nested event groups not supported'.
The root cause is function parse_libpfm_events_option() which parses the
event string '{},{instructions}' and can not handle a leading empty
group notation '{},...'.

The code detects the first (empty) group indicator '{' but does not
terminate group processing on the following group closing character '}'.
So when the second group indicator '{' is detected, the code assumes
a nested group and returns an error.

With the error message fixed, also change the expected event number to
one for the test case to succeed.

While at it also fix a memory leak. In good case the function does not
free the duplicated string given as first parameter.

Output after:
 # ./perf test -F 63
 63: Test libpfm4 support                                            :
 63.1: test of individual --pfm-events                               :
 Error:
 failed to parse event stereolab : event not found
 Error:
 failed to parse event stereolab,instructions : event not found
 Error:
 failed to parse event instructions,stereolab : event not found
  Ok
 63.2: test groups of --pfm-events                                   :
 Error:
 failed to parse event {stereolab} : event not found
 Error:
 failed to parse event {instructions,cycles},{instructions,stereolab} : \
	 event not found
  Ok
 #
Error message 'nested event groups not supported' is gone.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-By: Ian Rogers <irogers@google.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20210517140931.2559364-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-19 10:30:37 -03:00
James Clark
c1a6165a63 perf cs-etm: Prevent and warn on underflows during timestamp calculation.
When a zero timestamp is encountered, warn once. This is to make
hardware or configuration issues visible. Also suggest that the issue
can be worked around with the --itrace=Z option.

When an underflow with a non-zero timestamp occurs, warn every time.
This is an unexpected scenario, and with increasing timestamps, it's
unlikely that it would occur more than once, therefore it should be
ok to warn every time.

Only try to calculate the timestamp by subtracting the instruction
count if neither of the above cases are true. This makes attempting
to decode files with zero timestamps in non-timeless mode
more consistent. Currently it can half work if the timestamp wraps
around and becomes non-zero, although the behavior is undefined and
unpredictable.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Branislav Rankov <branislav.rankov@arm.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210517131741.3027-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-17 11:06:56 -03:00
James Clark
c36c1ef6f6 perf cs-etm: Start reading 'Z' --itrace option
Recently the 'Z' --itrace option was added to override detection
of timeless decoding. This is also useful in Coresight to work around
issues with invalid timestamps on some hardware.

When the 'Z' option is provided, the existing timeless decoding mode
will be used, even if timestamps were recorded.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Branislav Rankov <branislav.rankov@arm.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210517131741.3027-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-17 11:06:14 -03:00
James Clark
cac3141867 perf cs-etm: Move synth_opts initialisation
Move initialisation of synth_opts earlier in the function
so that synth_opts can be used at an earlier stage in a
later commit.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Branislav Rankov <branislav.rankov@arm.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210517131741.3027-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-17 11:02:50 -03:00
Jin Yao
e119083bab perf header: Support HYBRID_CPU_PMU_CAPS feature
Perf has supported the CPU_PMU_CAPS feature to display a list of CPU PMU
capabilities. But on a hybrid platform, it may have several CPU PMUs (such
as "cpu_core" and "cpu_atom"). The CPU_PMU_CAPS feature is hard to extend
to support multiple CPU PMUs well if it needs to be compatible for the case
of old perf data file + new perf tool.

So for better compatibility we now create a new feature HYBRID_CPU_PMU_CAPS
in the header.

For the perf.data generated on hybrid platform,

  root@otcpl-adl-s-2:~# perf report --header-only -I

  # cpu_core pmu capabilities: branches=32, max_precise=3, pmu_name=alderlake_hybrid
  # cpu_atom pmu capabilities: branches=32, max_precise=3, pmu_name=alderlake_hybrid
  # missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT COMPRESSED CPU_PMU_CAPS CLOCK_DATA

For the perf.data generated on non-hybrid platform

  root@kbl-ppc:~# perf report --header-only -I

  # cpu pmu capabilities: branches=32, max_precise=3, pmu_name=skylake
  # missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT COMPRESSED CLOCK_DATA HYBRID_TOPOLOGY HYBRID_CPU_PMU_CAPS

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210514122948.9472-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-17 10:58:10 -03:00
Jin Yao
f7d74ce32f perf header: Support HYBRID_TOPOLOGY feature
It is useful to let the user know about the hybrid topology.

Add the HYBRID_TOPOLOGY feature in header to indicate the core CPUs
and the atom CPUs.

With this patch a perf.data generated on a hybrid platform reports
the hybrid CPU list:

  root@otcpl-adl-s-2:~# perf report --header-only -I
  ...
  # hybrid cpu system:
  # cpu_core cpu list : 0-15
  # cpu_atom cpu list : 16-23

For a perf.data generated on a non-hybrid platform, reports a message
that HYBRID_TOPOLOGY is missing:

  root@kbl-ppc:~# perf report --header-only -I
  ...
  # missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT COMPRESSED CLOCK_DATA HYBRID_TOPOLOGY

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210514122948.9472-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-17 10:55:10 -03:00
James Clark
1ac9e0b573 perf cs-etm: Set time on synthesised samples to preserve ordering
The following attribute is set when synthesising samples in
timed decoding mode:

    attr.sample_type |= PERF_SAMPLE_TIME;

This results in new samples that appear to have timestamps but
because we don't assign any timestamps to the samples, when the
resulting inject file is opened again, the synthesised samples
will be on the wrong side of the MMAP or COMM events.

For example, this results in the samples being associated with
the perf binary, rather than the target of the record:

    perf record -e cs_etm/@tmc_etr0/u top
    perf inject -i perf.data -o perf.inject --itrace=i100il
    perf report -i perf.inject

Where 'Command' == perf should show as 'top':

    # Overhead  Command  Source Shared Object  Source Symbol           Target Symbol           Basic Block Cycles
    # ........  .......  ....................  ......................  ......................  ..................
    #
        31.08%  perf     [unknown]             [.] 0x000000000040c3f8  [.] 0x000000000040c3e8  -

If the perf.data file is opened directly with perf, without the
inject step, then this already works correctly because the
events are synthesised after the COMM and MMAP events and
no second sorting happens. Re-sorting only happens when opening
the perf.inject file for the second time so timestamps are
needed.

Using the timestamp from the AUX record mirrors the current
behaviour when opening directly with perf, because the events
are generated on the call to cs_etm__process_queues().

The ETM trace could optionally contain time stamps, but there is
no way to correlate this with the kernel time. So, the best available
time value is that of the AUX_RECORD header. This patch uses
the timestamp from the header for all the samples. The ordering of the
samples are implicit in the trace and thus is fine with respect to
relative ordering.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Co-developed-by: Al Grant <al.grant@arm.com>
Signed-off-by: Al Grant <al.grant@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki K Poulos <suzuki.poulose@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Branislav Rankov <branislav.rankov@arm.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20210510143248.27423-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-12 15:47:09 -03:00
James Clark
aadd6ba409 perf cs-etm: Refactor timestamp variable names
Remove ambiguity in variable names relating to timestamps.

A later commit will save the sample kernel timestamp in one of the etm
structs, so name all elements appropriately to avoid confusion.

This is also removes some ambiguity arising from the fact that the
--timestamp argument to perf record refers to sample kernel timestamps,
and the /timestamp/ event modifier refers to CS timestamps, so the term
is overloaded.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Branislav Rankov <branislav.rankov@arm.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20210510143248.27423-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-12 15:47:04 -03:00
Namhyung Kim
07b747f99a perf stat: Use aggregated counts directly
The ps->res_stats is for repeated runs, so the interval code should
not touch it.  Actually the aggregated counts are available in the
counter->counts->aggr, so we can (and should) use it directly IMHO.

No functional change intended.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210423023833.1430520-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-12 12:43:11 -03:00
Adrian Hunter
e3ff42bdeb perf intel-pt: Parse VM Time Correlation options and set up decoding
Add parsing and validation of VM Time Correlation options, and pass
parameters to the decoder. Also update the Intel PT documentation
accordingly.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210430070309.17624-13-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-12 12:43:11 -03:00
Adrian Hunter
fa8f949d16 perf intel-pt: Add VM Time Correlation to decoder
VM Time Correlation means determining if each TSC packet belongs to a VM
Guest or the Host. When the trace is "in context" that is indicated by
the NR flag in the PIP packet. However, when tracing kernel-only,
userspace only, or using address filters, the trace can be "out of context"
in which case timing packets are produced but not PIP packets.

Nevertheless, it is very unlikely the VM Guest timestamps will be in
the same range as the Host timestamps. Host time ranges are established
by a starting side-band event timestamp, and subsequently by the buffer
timestamp, written when the buffer is copied to the perf.data file.

This patch supports updating the VM Guest timestamp packets, assuming an
unchanging (during perf record) VMX TSC Offset and no VMX TSC scaling.

Furthermore, it is possible to determine what the VMX TSC Offset is,
although not necessarily at the start. The dry-run option lets that
information be determined so that the user can pass it to a subsequent
run. For more detail, refer to the example in the Intel PT documentation
in a subsequent patch.

VM Time Correlation is also performed on the TSC value in PEBs-via-PT
records.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210430070309.17624-12-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-12 12:43:11 -03:00
Adrian Hunter
31c7e27dae perf intel-pt: Better 7-byte timestamp wraparound logic
A timestamp should not go backwards. If it does it is assumed that the
 7-byte TSC packet value has wrapped. Improve that logic so that it will
not allow the timestamp to go past the buffer timestamp (which is recorded
when the buffer is copied out)

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210430070309.17624-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-12 12:43:11 -03:00
Adrian Hunter
5ac35d778a perf intel-pt: Pass the first timestamp to the decoder
VM Time Correlation will use time ranges to determine whether a TSC packet
belongs to the Host or Guest. To start, the first non-zero timestamp is
needed. Pass that to the decoder.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210430070309.17624-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-12 12:43:11 -03:00
Adrian Hunter
0fc9d33894 perf intel-pt: Add a tree for VMCS information
Even when VMX TSC Offset is not changing (during perf record), different
virtual machines can have different TSC Offsets. There is a Virtual Machine
Control Structure (VMCS) for each virtual CPU, the address of which is
reported to Intel PT in the VMCS packet. We do not know which VMCS belongs
to which virtual machine, so use a tree to keep track of VMCS information.
Then the decoder will be able to use the current VMCS value to look up the
current TSC Offset.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210430070309.17624-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-12 12:43:11 -03:00