Commit Graph

1201384 Commits

Author SHA1 Message Date
Ian Rogers 478c3f5dcd perf list: Don't print Unit for "default_core"
"default_core" was added as a way to demark JSON events whose PMU should
be whatever the default core PMU is, previously this had been assumed to
be "cpu" but that fails on s390 and ARM.

'perf list' displays the PMU in the event description to save storing it
in JSON, but was still comparing against "cpu" and not "default_core",
so update this.

Fixes: d2045f8715 ("perf jevents: Use "default_core" for events with no Unit")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230831071421.2201358-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-31 16:24:47 -03:00
Ian Rogers bdc6012991 perf vendor events intel: Fix modifier in tma_info_system_mem_parallel_reads for skylake
The metric is using the wrong format encoding. This fix is in the
converter script PR: https://github.com/intel/perfmon/pull/101

Committer testing:

Tested on a Lenovo t480s, before 'perf test 100' was failing with:

 # perf test 100
  100: perf all metrics test                                           : FAILED!

With 'perf test -vv 100' we can see:

<SNIP>
  Testing MemoryBW
  Not grouping metric tma_fb_full's events.
  Try disabling the NMI watchdog to comply NO_NMI_WATCHDOG metric constraint:
      echo 0 > /proc/sys/kernel/nmi_watchdog
      perf stat ...
      echo 1 > /proc/sys/kernel/nmi_watchdog
  event syntax error: '...DATA_READ/thresh=1,metric-id=UNC_ARB_TRK_OCCUPANCY.DATA_READ!3thresh!21!3/,UNC_ARB_TRK_OCCUPANCY.DATA_READ/metric-id=UNC_ARB_TRK_OCCUPANCY.DATA_READ/}:W,duration_time'
                                    \___ Bad event or PMU

  Unable to find PMU or event on a PMU of 'UNC_ARB_TRK_OCCUPANCY.DATA_READ'
<SNIP>

With the patch this problem is gone.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20230830175543.1911892-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-30 23:03:03 -03:00
Adrian Hunter 45210e1ada perf dlfilter: Avoid leak in v0 API test use of resolve_address()
The introduction of reference counting causes the v0 API
perf_dlfilter_fns.resolve_address() to leak.

v2 API introduced perf_dlfilter_fns.al_cleanup() to prevent that.

For the v0 API, avoid the leak by exiting the addr_location immediately,
since the documentation makes it clear that pointers obtained via
perf_dlfilter_fns are not necessarily valid (dereferenceable) after
'filter_event' and 'filter_event_early' return.

Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Closes: https://lore.kernel.org/oe-lkp/202308232146.94d82cb4-oliver.sang@intel.com
Link: http://lore.kernel.org/lkml/20230830090539.68206-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-30 23:03:03 -03:00
Ian Rogers f0005f1732 perf metric: Add #num_cpus_online literal
Returns the number of CPUs online, unlike #num_cpus that returns the
number present.

Add a test of the property.

This will be used in future Intel metrics.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230830073026.1829912-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-30 23:03:03 -03:00
Ian Rogers 30f0b435bb perf pmu: Remove str from perf_pmu_alias
Currently the value is only used in perf list.

Compute the value just when needed to avoid unnecessary overhead.

Recycle the strbuf to avoid memory allocation overhead.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230830070753.1821629-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-30 23:03:03 -03:00
Ian Rogers 7a6e916447 perf parse-events: Make common term list to strbuf helper
A term list is turned into a string for debug output and for the str
value in the alias.

Add a helper to do this based on existing code, but then fix for
situations like events being identified.

Use strbuf to manage the dynamic memory allocation and remove the 256
byte limit.

Use in various places the string of the term list is required.

Before:

  $ sudo perf stat -vv -e inst_retired.any true
  Using CPUID GenuineIntel-6-8D-1
  intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch
  Attempting to add event pmu 'cpu' with 'inst_retired.any,' that may result in non-fatal errors
  After aliases, add event pmu 'cpu' with 'event,period,' that may result in non-fatal errors
  inst_retired.any -> cpu/inst_retired.any/
  ...

After:

$ sudo perf stat -vv -e inst_retired.any true

  Using CPUID GenuineIntel-6-8D-1
  intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch
  Attempt to add: cpu/inst_retired.any/
  ..after resolving event: cpu/event=0xc0,period=0x1e8483/
  inst_retired.any -> cpu/event=0xc0,period=0x1e8483/
  ...

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230830070753.1821629-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-30 23:03:03 -03:00
Ian Rogers 6beb6cfddf perf parse-events: Minor help message improvements
Be more specific and fix a typo.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230830070753.1821629-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-30 23:03:03 -03:00
Ian Rogers 196e355877 perf pmu: Avoid uninitialized use of alias->str
alias is allocated with malloc allowing uninitialized memory to be
accessed.

The initialization of str was moved late after it could have been
updated by a JSON event, however, this create a potential for an
uninitialized use.

Fix this by assigning str to NULL early.

Testing on ARM (Raspberry Pi) showed a memory leak in the same code so
add a zfree.

Fixes: f63a536f03 ("perf pmu: Merge JSON events with sysfs at load time")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20230830000545.1638964-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-30 23:03:02 -03:00
Ian Rogers d2045f8715 perf jevents: Use "default_core" for events with no Unit
The JSON Unit field encodes the name of the PMU to match the events
to. When no name is given it has meant the "cpu" core PMU except for
tests.

On ARM, Intel hybrid and s390 the core PMU is named differently which
means that using "cpu" for this case causes the events not to get
matched to the PMU.

Introduce a new "default_core" string for this case and in the
pmu__name_match force all core PMUs to match this name.

Fixes: 2e255b4f9f ("perf jevents: Group events by PMU")
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230826062203.1058041-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:16:15 -03:00
Namhyung Kim a84260e314 perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test
It has system-wide test and cpu-list test but the cpu-list test fails
sometimes.  It runs sleep command on CPU1 and measure both user.slice
and system.slice cgroups by default (on systemd-based systems).

But if the system was idle enough, sometime the system.slice gets no
count and it makes the test failing.  Maybe that's because it only looks
at the CPU1, let's add CPU0 to increase the chance it finds some tasks.

Fixes: 7901086014 ("perf test: Add a new test for perf stat cgroup BPF counter")
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230825164152.165610-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:16:15 -03:00
Namhyung Kim 68ca249c96 perf test shell stat_bpf_counters: Fix test on Intel
As of now, bpf counters (bperf) don't support event groups.  But the
default perf stat includes topdown metrics if supported (on recent Intel
machines) which require groups.  That makes perf stat exiting.

  $ sudo perf stat --bpf-counter true
  bpf managed perf events do not yet support groups.

Actually the test explicitly uses cycles event only, but it missed to
pass the option when it checks the availability of the command.

Fixes: 2c0cb9f560 ("perf test: Add a shell test for 'perf stat --bpf-counters' new option")
Reviewed-by: Song Liu <song@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230825164152.165610-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:16:14 -03:00
Namhyung Kim 11f5710d96 perf test shell record_bpf_filter: Skip 6.2 kernel
The BPF sample filtering requires two kernel changes below:

 * bpf_cast_to_kernel_ctx() kfunc (added in v6.2)

 * setting perf_sample_data->sample_flags (finished in v6.3)

The perf tools can check bpf_cast_to_kernel_ctx() easily so it can
refuse BPF filters on those old kernels (v6.1 and earlier).

But checking sample_flags appears to be difficult so current code won't
work on v6.2 kernel.  That's unfortunate but I don't know what's the
correct way to handle it.

For now, let's skip v6.2 kernels explicitly (if failed) in the test.

Fixes: 9575ecdd19 ("perf test: Add perf record sample filtering test")
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230825164152.165610-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:16:14 -03:00
Namhyung Kim c091c78b73 libperf: Get rid of attr.id field
Now there's no in-tree user of the field.  To remove the possible bug
later, let's get rid of the 'id' field and add a comment for that.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230825152552.112913-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:16:14 -03:00
Namhyung Kim f174341d0d perf tools: Convert to perf_record_header_attr_id()
Instead of accessing the attr.id directly, use the
perf_record_header_attr_id() helper to handle old versions.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230825152552.112913-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:16:14 -03:00
Namhyung Kim baec60800d libperf: Add perf_record_header_attr_id()
The HEADER_ATTR record has an event attr followed by the id array.  But
perf data from a different version could have different size of attr.

So it cannot just use event->attr.id to access the array.  Let's add the
perf_record_header_attr_id() macro to calculate the start of the array.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230825152552.112913-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:16:14 -03:00
Namhyung Kim 9bf63282ea perf tools: Handle old data in PERF_RECORD_ATTR
The PERF_RECORD_ATTR is used for a pipe mode to describe an event with
attribute and IDs.  The ID table comes after the attr and it calculate
size of the table using the total record size and the attr size.

  n_ids = (total_record_size - end_of_the_attr_field) / sizeof(u64)

This is fine for most use cases, but sometimes it saves the pipe output
in a file and then process it later.  And it becomes a problem if there
is a change in attr size between the record and report.

  $ perf record -o- > perf-pipe.data  # old version
  $ perf report -i- < perf-pipe.data  # new version

For example, if the attr size is 128 and it has 4 IDs, then it would
save them in 168 byte like below:

   8 byte: perf event header { .type = PERF_RECORD_ATTR, .size = 168 },
 128 byte: perf event attr { .size = 128, ... },
  32 byte: event IDs [] = { 1234, 1235, 1236, 1237 },

But when report later, it thinks the attr size is 136 then it only read
the last 3 entries as ID.

   8 byte: perf event header { .type = PERF_RECORD_ATTR, .size = 168 },
 136 byte: perf event attr { .size = 136, ... },
  24 byte: event IDs [] = { 1235, 1236, 1237 },  // 1234 is missing

So it should use the recorded version of the attr.  The attr has the
size field already then it should honor the size when reading data.

Fixes: 2c46dbb517 ("perf: Convert perf header attrs into attr events")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230825152552.112913-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:16:14 -03:00
Ian Rogers cd4e1efbbc perf pmus: Skip duplicate PMUs and don't print list suffix by default
Add a PMUs scan that ignores duplicates. When there are multiple PMUs
that differ only by suffix, by default just list the first one and
skip all others. The scan routine checks that the PMU names match but
doesn't enforce that the numbers are consecutive as for some PMUs
there are gaps. If "-v" is passed to "perf list" then list all PMUs.

With the previous change duplicate PMUs are no longer printed but the
suffix of the first is printed. When duplicate PMUs are being skipped
avoid printing the suffix.

Before:

  $ perf list
  ...
    uncore_imc_free_running_0/data_read/               [Kernel PMU event]
    uncore_imc_free_running_0/data_total/              [Kernel PMU event]
    uncore_imc_free_running_0/data_write/              [Kernel PMU event]
    uncore_imc_free_running_1/data_read/               [Kernel PMU event]
    uncore_imc_free_running_1/data_total/              [Kernel PMU event]
    uncore_imc_free_running_1/data_write/              [Kernel PMU event]

After:

  $ perf list
  ...
    uncore_imc_free_running/data_read/                 [Kernel PMU event]
    uncore_imc_free_running/data_total/                [Kernel PMU event]
    uncore_imc_free_running/data_write/                [Kernel PMU event]
  ...
  $ perf list -v
    uncore_imc_free_running_0/data_read/               [Kernel PMU event]
    uncore_imc_free_running_0/data_total/              [Kernel PMU event]
    uncore_imc_free_running_0/data_write/              [Kernel PMU event]
    uncore_imc_free_running_1/data_read/               [Kernel PMU event]
    uncore_imc_free_running_1/data_total/              [Kernel PMU event]
    uncore_imc_free_running_1/data_write/              [Kernel PMU event]
  ...

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20230825135237.921058-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:16:14 -03:00
Ian Rogers 8d9f5146f5 perf pmus: Sort pmus by name then suffix
Sort PMUs by name. If two PMUs have the same name but differ by
suffix, sort the suffixes numerically.

For example, "breakpoint" comes before "cpu",
"uncore_imc_free_running_0" comes before "uncore_imc_free_running_1".

Suffixes need to be treated specially as otherwise they will be ordered
like 0, 1, 10, 11, .., 2, 20, 21, .., etc. Only PMUs starting 'uncore_'
are considered to have a potential suffix.

Sorting of PMUs is done so that later patches can skip duplicate uncore
PMUs that differ only by there suffix.

Committer notes:

Used the more compact, intention revealing strstarts() function we got
from the kernel sources:

-       if (strncmp(str, "uncore_", 7))
+       if (!strstarts(str, "uncore_"))

Also in pmus_cmp() the lhs_num and rhs_num variables may end up not
being set for non "uncore_" prefixed PMUs in pmu_name_len_no_suffix(),
or at least gcc 7.5 in some distros (opensuse 15.5, to be EOLed in
Dec/2024) thins so, so initialize both to zero.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20230825135237.921058-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:16:14 -03:00
Yanteng Si f703073eff perf beauty mmap_flags: Use "test -f" instead of "[-f FILE]"
"[" is part of the shell builtin test (and a synonym for it),
 not a link to the external command /usr/bin/test.

Using the "test" is simpler because it avoids a lot of "[]".

Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Acked-by: Huacai Chen <chenhuacai@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: loongarch@lists.linux.dev
Cc: loongson-kernel@lists.loongnix.cn
Link: https://lore.kernel.org/r/c50bc0a92dce0ff0fa6504c1a52fb53e2ac007bf.1692962043.git.siyanteng@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:15:21 -03:00
Yanteng Si 49cf0bf637 perf beauty mmap_flags: Fix script for archs that use the generic mman.h
To address this error:

  grep: /root/linux-next/tools/arch/xxxxx/include/uapi/asm//mman.h:
  No such file or directory

Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Acked-by: Huacai Chen <chenhuacai@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: loongarch@lists.linux.dev
Cc: loongson-kernel@lists.loongnix.cn
Link: https://lore.kernel.org/r/42e8e3565d6035302907426c1e65483b2a4007f5.1692962043.git.siyanteng@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:14:43 -03:00
Yanteng Si c56f286f24 perf tools: Allow to use cpuinfo on LoongArch
Define these macros so that the CPU name can be displayed when running
'perf report' and 'perf timechart'.

Committer notes:

No need to have:

	if (strcasestr(buf, "Model Name")) {
		strlcpy(cpu_m, &buf[13], 255);
		break;
	} else if (strcasestr(buf, "model name")) {
		strlcpy(cpu_m, &buf[13], 255);
		break;
	}

As the point of strcasestr() is to be case insensitive to both the
haystack and the needle, so simplify the above to just:

	if (strcasestr(buf, "model name")) {
		strlcpy(cpu_m, &buf[13], 255);
		break;
	}

Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Acked-by: Huacai Chen <chenhuacai@loongson.cn>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: loongarch@lists.linux.dev
Cc: loongson-kernel@lists.loongnix.cn
Link: https://lore.kernel.org/r/db968a186a10e4629fe10c26a1210f7126ad41ec.1692962043.git.siyanteng@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:13:48 -03:00
Kajol Jain 0f2418fddb perf lock contention: Fix typo in max-stack option description
Fix typo in max-stack option description by changing lopck contention
to lock contention.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230825104700.440809-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-25 10:27:07 -03:00
Ian Rogers 520da457f9 perf tui slang: Tidy casts
Casts were necessary for older versions of libslang, however, these
are now 15 years old and so we no longer need to care about supporting
them. Tidy the casts and remove unnecessary logic.

Move the ENABLE_SLFUTURE_CONST to the libslang.h common include file,
and also enable ENABLE_SLFUTURE_VOID.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230825024002.801955-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-25 10:24:55 -03:00
Ian Rogers 7512e96957 perf build-id: Simplify build_id_cache__cachedir()
Initialize realname to NULL, rather than name.

This avoids a cast and as realpath is either NULL or an allocated
string, free can be called unconditionally.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230825024002.801955-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-25 10:24:10 -03:00
Ian Rogers b7823045ec perf pmu: Make id const and add missing free
The struct pmu id is initialized from pmu_id that is read into allocated
memory from a file, as such it needs free-ing in pmu__delete().

Make the id value const so that we can remove casts in tests.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230825024002.801955-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-25 10:23:34 -03:00
Ian Rogers 970ef02e98 perf parse-events: Make term's config const
This avoids casts in tests. Use zfree in a few places to avoid
warnings about a freeing a const pointer.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230825024002.801955-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-25 10:22:34 -03:00
Ian Rogers c091ee9089 perf pmu: Remove logic for PMU name being NULL
The PMU name could be NULL in the case of the fake_pmu. Initialize the
name for the fake_pmu to "fake" so that all other logic can assume it
is initialized. Add a const to the type of name so that a literal can
be used to avoid additional initialization code. Propagate the cost
through related routines and remove now unnecessary "(char *)"
casts. Doing this located a bug in builtin-list for the pmu_glob that
was missing a strdup.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20230825024002.801955-3-irogers@google.com
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Ming Wang <wangming01@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-25 10:22:16 -03:00
Ian Rogers 9897009eec perf header: Fix missing PMU caps
PMU caps are written as HEADER_PMU_CAPS or for the special case of the
PMU "cpu" as HEADER_CPU_PMU_CAPS. As the PMU "cpu" is special, and not
any "core" PMU, the logic had become broken and core PMUs not called
"cpu" were not having their caps written.

This affects ARM and s390 non-hybrid PMUs.

Simplify the PMU caps writing logic to scan one fewer time and to be
more explicit in its behavior.

Fixes: 178ddf3bad ("perf header: Avoid hybrid PMU list in write_pmu_caps")
Reported-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230825024002.801955-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-25 10:21:33 -03:00
Ian Rogers eeb6b12992 perf jevents: Don't append Unit to desc
Unit with the PMU name is appended to desc in jevents.py, but on
hybrid platforms it causes the desc to differ from the regular
non-hybrid system with a PMU of 'cpu'. Having differing descs means
the events don't deduplicate. To make the perf list output not differ,
append the Unit on again in the perf list printing code.

On x86 reduces the binary size by 409,600 bytes or about 4%. Update
pmu-events test expectations to match the differently generated
pmu-events.c code.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824183212.374787-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 19:21:27 -03:00
Anup Sharma f208b2c6f9 perf scripts python gecko: Launch the profiler UI on the default browser with the appropriate URL
All required libraries have been imported and make sure that none of
them are external dependencies. To achieve this, created a virt env and
verified.

Modified usage information and added combined command.

Modified the main() function to read the --save-only command-line option
and set the output_file variable accordingly.

Modified the trace_end() function to check for the output_file variable.
If it is set, the profiler data is saved to a local file in Gecko
Profile format, or the profiler.firefox.com is opened on the default
browser.

Included trace_begin() to initialize the Firefox Profiler and launch the
default browser to display the profiler.firefox.com.

Added a new function launchFirefox() to start a local server and launch
the profiler UI on the default browser with the appropriate URL.

Created the "CORSRequestHandler" class to enable Cross-Origin Resource
Sharing.

Summary:

This integration now includes a exiting feature to conveniently host the
Gecko Profile data on a local server and open it directly in the default
web browser.

This means that users can now effortlessly visualize and analyze the
profiler results with just a single click.

The addition of the --save-only command-line option allows users to save
the profiler output to a local file in Gecko Profile format, but the
real highlight lies in the capability to seamlessly launch a local
server, making the data accessible to Firefox Profiler via a web
browser.

In addition, it's important to highlight that all data are hosted
locally, eliminating any concerns about data privacy rules and
regulations.

Signed-off-by: Anup Sharma <anupnewsmail@gmail.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/ZNOS0vo58DnVLpD8@yoga
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 14:41:49 -03:00
Anup Sharma 43803cb16f perf scripts python: Add support for input args in gecko script
Refines the argument handling mechanism in the "gecko-report" script to
enable better compatibility and improved user experience.

The script now differentiates between scenarios where arguments are
provided for record and report cases where gecko.py arguments are
passed.

Signed-off-by: Anup Sharma <anupnewsmail@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/ZNf7W+EIrrCSHZN0@yoga
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 14:39:43 -03:00
Ian Rogers f85d120c46 perf jevents: Sort strings in the big C string to reduce faults
Sort the strings within the big C string based on whether they were
for a metric and then by when they were added. This helps group
related strings and reduce minor faults by approximately 10 in 1740,
about 0.57%.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-18-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:11:09 -03:00
Ian Rogers 8d4b6d37ea perf pmu: Lazily load sysfs aliases
Don't load sysfs aliases for a PMU when the PMU is first created, defer
until an alias needs to be found. For the pmu-scan benchmark, average
core PMU scanning is reduced by 30.8%, and average PMU scanning by
12.6%.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-17-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:10:27 -03:00
Ian Rogers 7b723dbb96 perf pmu: Be lazy about loading event info files from sysfs
Event info is only needed when an event is parsed or when merging data
from an JSON and sysfs event. Be lazy in its loading to reduce file
accesses.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-16-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:10:01 -03:00
Ian Rogers 88ed91848d perf pmu: Scan type early to fail an invalid PMU quickly
Scan sysfs PMU's type early so that format and aliases aren't
attempted to be loaded if the PMU name is invalid.

This is the case for event_pmu tokens in parse-events.y where a wildcard
name is first assumed to be a PMU name.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:09:15 -03:00
Ian Rogers e6ff1eed35 perf pmu: Lazily add JSON events
Rather than scanning all JSON events and adding them when a PMU is
created, add the alias when the JSON event is needed.

Average core PMU scanning run time reduced by 60.2%. Average PMU
scanning run time reduced by 15%. Page faults with no events reduced by
74 page faults, 4% of total.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:08:47 -03:00
Ian Rogers 7c52f10c0d perf pmu: Cache JSON events table
Cache the JSON events table so that finding it isn't done per
event/alias.

Change the events table find so that when the PMU is given, if the PMU
has no JSON events return null.

Update usage to always use the PMU variable.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:07:39 -03:00
Ian Rogers f63a536f03 perf pmu: Merge JSON events with sysfs at load time
Rather than load all sysfs events then parsing all JSON events and
merging with ones that already exist. When a sysfs event is loaded, look
for a corresponding JSON event and merge immediately.

To simplify the logic, early exit the perf_pmu__new_alias function if an
alias is attempted to be added twice - as merging has already been
explicitly handled.

Fix the copying of terms to a merged alias and some ENOMEM paths.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:05:09 -03:00
Ian Rogers f26d22f1ba perf pmu: Prefer passing pmu to aliases list
The aliases list is part of the PMU. Rather than pass the aliases
list, pass the full PMU simplifying some callbacks.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:04:27 -03:00
Ian Rogers edb217ff14 perf pmu: Parse sysfs events directly from a file
Rather than read a sysfs events file into a 256 byte char buffer, pass
the FILE* directly to the lex/yacc parser.

This avoids there being a maximum events file size.

While changing the API, constify some arguments to remove unnecessary
casts.

Allocating the read buffer decreases the performance of pmu-scan by
around 3%.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:02:59 -03:00
Ian Rogers 3d5045492a perf pmu-events: Add pmu_events_table__find_event()
jevents stores events sorted by name. Add a find function that will
binary search event names avoiding the need to linearly search through
events.

Add a test in tests/pmu-events.c. If the PMU or event aren't found -1000
is returned. If the event is found but no callback function given, 0 is
returned.

This allows the find function also act as a test for existence.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:02:22 -03:00
Ian Rogers e3edd6cf63 perf pmu-events: Reduce processed events by passing PMU
Pass the PMU to pmu_events_table__for_each_event so that entries that
don't match don't need to be processed by callback.

If a NULL PMU is passed then all PMUs are processed.

'perf bench internals pmu-scan's "Average PMU scanning" performance is
reduced by about 5% on an Intel tigerlake.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 11:00:09 -03:00
Ian Rogers c4ac7f7542 perf s390 s390_cpumcfdg_dump: Don't scan all PMUs
Rather than scanning all PMUs for a counter name, scan the PMU
associated with the evsel of the sample. This is done to remove a
dependence on pmu-events.h.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 10:57:05 -03:00
Ian Rogers 9d31cb9395 perf parse-events: Improve error message for double setting
Double setting information for an event would produce an error message
associated with the PMU rather than the term that was double setting.
Improve the error message to be on the term.

Before:

  $ perf stat -e 'cpu/inst_retired.any,inst_retired.any/' true
  event syntax error: 'cpu/inst_retired.any,inst_retired.any/'
                       \___ Bad event or PMU

  Unabled to find PMU or event on a PMU of 'cpu'
  Run 'perf list' for a list of valid events
  $

After:

  $ perf stat -e 'cpu/inst_retired.any,inst_retired.any/' true
  event syntax error: '..etired.any,inst_retired.any/'
                                    \___ Bad event or PMU

  Unabled to find PMU or event on a PMU of 'cpu'

  Initial error:

  event syntax error: '..etired.any,inst_retired.any/'
                                    \___ Attempt to set event's scale twice
  Run 'perf list' for a list of valid events

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 10:52:35 -03:00
Ian Rogers 2e255b4f9f perf jevents: Group events by PMU
Prior to this change a cpuid would map to a list of events where the PMU
would be encoded alongside the event information. This change breaks
apart each group of events so that there is a group per PMU. A new table
is added with the PMU's name and the list of events, the original table
now holding an array of these per PMU tables.

These changes are to make it easier to get per PMU information about
events, rather than the current approach of scanning all events. The
perf binary size with BPF skeletons on x86 is reduced by about 1%. The
unidentified PMU is now always expanded to "cpu".

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 10:51:03 -03:00
Ian Rogers 4000519eb0 perf pmu-events: Add extra underscore to function names
Add extra underscore before "for" of pmu_events_table_for_each_event
and pmu_metrics_table_for_each_metric.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 10:43:19 -03:00
Ian Rogers c3245d2093 perf pmu: Abstract alias/event struct
In order to be able to lazily compute aliases/events for a PMU, move
the struct perf_pmu_alias into pmu.c.

Add perf_pmu__find_event and perf_pmu__for_each_event that take a
callback that is called for the found event or for each event.

The layout of struct pmu and the event/alias list is unchanged but the
API is altered so that aliases are no longer directly accessed, allowing
for later changes.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 10:42:46 -03:00
Ian Rogers 5040264121 perf pmu: Make the loading of formats lazy
The sysfs format files are loaded eagerly in a PMU. Add a flag so that
we create the format but only load the contents when necessary.

Reduce the size of the value in struct perf_pmu_format and avoid holes
so there is no additional space requirement.

For "perf stat -e cycles true" this reduces the number of openat calls
from 648 to 573 (about 12%). The benchmark pmu scan speed is improved
by roughly 5%.

Before:

  $ perf bench internals pmu-scan
  Computing performance of sysfs PMU event scan for 100 times
    Average core PMU scanning took: 1061.100 usec (+- 9.965 usec)
    Average PMU scanning took: 4725.300 usec (+- 260.599 usec)

After:

  $ perf bench internals pmu-scan
  Computing performance of sysfs PMU event scan for 100 times
    Average core PMU scanning took: 989.170 usec (+- 6.873 usec)
    Average PMU scanning took: 4520.960 usec (+- 251.272 usec)

Committer testing:

On a AMD Ryzen 5950x:

Before:

  $ perf bench internals pmu-scan -i1000
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 563.466 usec (+- 1.008 usec)
    Average PMU scanning took: 1619.174 usec (+- 23.627 usec)
  $ perf stat -r5 perf bench internals pmu-scan -i1000
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 583.401 usec (+- 2.098 usec)
    Average PMU scanning took: 1677.352 usec (+- 24.636 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 553.254 usec (+- 0.825 usec)
    Average PMU scanning took: 1635.655 usec (+- 24.312 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 557.733 usec (+- 0.980 usec)
    Average PMU scanning took: 1600.659 usec (+- 23.344 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 554.906 usec (+- 0.774 usec)
    Average PMU scanning took: 1595.338 usec (+- 23.288 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 551.798 usec (+- 0.967 usec)
    Average PMU scanning took: 1623.213 usec (+- 23.998 usec)

   Performance counter stats for 'perf bench internals pmu-scan -i1000' (5 runs):

             3276.82 msec task-clock:u                     #    0.990 CPUs utilized               ( +-  0.82% )
                   0      context-switches:u               #    0.000 /sec
                   0      cpu-migrations:u                 #    0.000 /sec
                1008      page-faults:u                    #  307.615 /sec                        ( +-  0.04% )
         12049614778      cycles:u                         #    3.677 GHz                         ( +-  0.07% )  (83.34%)
           117507478      stalled-cycles-frontend:u        #    0.98% frontend cycles idle        ( +-  0.33% )  (83.32%)
            27106761      stalled-cycles-backend:u         #    0.22% backend cycles idle         ( +-  9.55% )  (83.36%)
         33294953848      instructions:u                   #    2.76  insn per cycle
                                                           #    0.00  stalled cycles per insn     ( +-  0.03% )  (83.31%)
          6849825049      branches:u                       #    2.090 G/sec                       ( +-  0.03% )  (83.37%)
            71533903      branch-misses:u                  #    1.04% of all branches             ( +-  0.20% )  (83.30%)

              3.3088 +- 0.0302 seconds time elapsed  ( +-  0.91% )

  $

After:

  $ perf stat -r5 perf bench internals pmu-scan -i1000
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 550.702 usec (+- 0.958 usec)
    Average PMU scanning took: 1566.577 usec (+- 22.747 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 548.315 usec (+- 0.555 usec)
    Average PMU scanning took: 1565.499 usec (+- 22.760 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 548.073 usec (+- 0.555 usec)
    Average PMU scanning took: 1586.097 usec (+- 23.299 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 561.184 usec (+- 2.709 usec)
    Average PMU scanning took: 1567.153 usec (+- 22.548 usec)
  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 1000 times
    Average core PMU scanning took: 546.987 usec (+- 0.553 usec)
    Average PMU scanning took: 1562.814 usec (+- 22.729 usec)

   Performance counter stats for 'perf bench internals pmu-scan -i1000' (5 runs):

             3170.86 msec task-clock:u                     #    0.992 CPUs utilized               ( +-  0.22% )
                   0      context-switches:u               #    0.000 /sec
                   0      cpu-migrations:u                 #    0.000 /sec
                1010      page-faults:u                    #  318.526 /sec                        ( +-  0.04% )
         11890047674      cycles:u                         #    3.750 GHz                         ( +-  0.14% )  (83.27%)
           119090499      stalled-cycles-frontend:u        #    1.00% frontend cycles idle        ( +-  0.46% )  (83.40%)
            32502449      stalled-cycles-backend:u         #    0.27% backend cycles idle         ( +-  8.32% )  (83.30%)
         33119141261      instructions:u                   #    2.79  insn per cycle
                                                    #    0.00  stalled cycles per insn     ( +-  0.01% )  (83.37%)
          6812816561      branches:u                       #    2.149 G/sec                       ( +-  0.01% )  (83.29%)
            70157855      branch-misses:u                  #    1.03% of all branches             ( +-  0.28% )  (83.38%)

             3.19710 +- 0.00826 seconds time elapsed  ( +-  0.26% )

  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230824041330.266337-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-24 10:38:04 -03:00
Guilherme Amadio 9e1f16939b perf build: Allow customization of clang options for BPF target
This also puts an unconditional -Werror under control of WERROR. The
clang includes added during the build can lead to a warning that may be
turned into an error. In addition, hardened clang produces a warning
about lack of support for -fstack-protector* options for the BPF target:

  clang -g -O2 -target bpf -Wall -Werror -Ilinux/tools/perf/util/bpf_skel/.tmp/.. \
    -I -idirafter /usr/lib/llvm/16/bin/../../../../lib/clang/16/include -idirafter /usr/local/include \
    -idirafter /usr/include  -Ilinux/tools/include/uapi -c util/bpf_skel/bperf_follower.bpf.c \
    -o linux/tools/perf/util/bpf_skel/.tmp/bperf_follower.bpf.o && llvm-strip -g linux/tools/perf/util/bpf_skel/.tmp/bperf_follower.bpf.o
  clang-16: error: /usr/lib/llvm/16/bin/../../../../lib/clang/16/include: 'linker' input unused [-Werror,-Wunused-command-line-argument]
  clang-16: error: ignoring '-fstack-protector-strong' option as it is not currently supported for target 'bpf' [-Werror,-Woption-ignored]
  make[1]: *** [Makefile.perf:1082: linux/tools/perf/util/bpf_skel/.tmp/bpf_prog_profiler.bpf.o] Error 1

Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZOZQ2LDA+3Wg8x2T@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-23 15:33:18 -03:00
Ian Rogers 838a8c5f40 perf pmu: Pass PMU rather than aliases and format
Pass the pmu so the aliases and format list can be better abstracted
and later lazily loaded.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230823080828.1460376-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-23 14:27:58 -03:00