Commit graph

38 commits

Author SHA1 Message Date
Ian Rogers
7a16183316 perf stat: Remove dead code: no need to set os.evsel twice
No need to set os.evsel twice.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200910032632.511566-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-10 08:13:04 -03:00
Thomas Richter
313146a844 perf stat: Fix out of bounds array access in the print_counters() evlist method
Fix a compile error on F32 and gcc version 10.1 on s390 in file
util/stat-display.c.  The error does not show up with make DEBUG=y.  In
fact the issue shows up when using both compiler options -O6 and
-D_FORTIFY_SOURCE=2 (which are omitted with DEBUG=y).

This is the offending call chain:

print_counter_aggr()
  printout(config, -1, 0, ...)  with 2nd parm id set to -1
    aggr_printout(config, x, id --> -1, ...) which leads to this code:
                case AGGR_NONE:
                if (evsel->percore && !config->percore_show_thread) {
                        ....
                } else {
                        fprintf(config->output, "CPU%*d%s",
                                config->csv_output ? 0 : -7,
                                evsel__cpus(evsel)->map[id],
                                                        ^^ id is -1 !!!!
                                config->csv_sep);
                }

This is a compiler inlining issue which is detected on s390 but not on
other platforms.
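As a rough illustration of the problem class, the sketch below guards an
array lookup against an out-of-range index such as the -1 in the call chain
above. The struct and function names are hypothetical stand-ins, not the
perf data structures, and this is not necessarily how the patch itself
resolves the warning.

```c
#include <stdio.h>

/* Hypothetical, simplified stand-in for the CPU map behind evsel__cpus(). */
struct cpu_map_sketch {
	int nr;          /* number of valid entries in map */
	const int *map;  /* CPU numbers, indexed by position */
};

/*
 * Write the "CPU<n>" column into buf only when idx is a valid index.
 * Returns the number of characters written, or 0 when idx is out of
 * range (for example the -1 passed down by an aggregated caller).
 */
static int format_cpu_column(char *buf, size_t len,
			     const struct cpu_map_sketch *cpus,
			     int idx, int csv_output)
{
	if (idx < 0 || idx >= cpus->nr) {
		if (len > 0)
			buf[0] = '\0';
		return 0;
	}
	/* Width 0 for CSV, left-justified width 7 otherwise, as in the quoted code. */
	return snprintf(buf, len, "CPU%*d", csv_output ? 0 : -7, cpus->map[idx]);
}
```

With the check in place, no path reaches map[-1], so neither the compiler
nor a reader has to reason about inlining to see that the access is safe.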

Output before:

 # make util/stat-display.o
    .....

  util/stat-display.c: In function ‘perf_evlist__print_counters’:
  util/stat-display.c:121:4: error: array subscript -1 is below array
      bounds of ‘int[]’ [-Werror=array-bounds]
  121 |    fprintf(config->output, "CPU%*d%s",
      |    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  122 |     config->csv_output ? 0 : -7,
      |     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  123 |     evsel__cpus(evsel)->map[id],
      |     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  124 |     config->csv_sep);
      |     ~~~~~~~~~~~~~~~~
  In file included from util/evsel.h:13,
                 from util/evlist.h:13,
                 from util/stat-display.c:9:
  /root/linux/tools/lib/perf/include/internal/cpumap.h:10:7:
  note: while referencing ‘map’
   10 |  int  map[];
      |       ^~~
  cc1: all warnings being treated as errors
  mv: cannot stat 'util/.stat-display.o.tmp': No such file or directory
  make[3]: *** [/root/linux/tools/build/Makefile.build:97: util/stat-display.o]
  Error 1
  make[2]: *** [Makefile.perf:716: util/stat-display.o] Error 2
  make[1]: *** [Makefile.perf:231: sub-make] Error 2
  make: *** [Makefile:110: util/stat-display.o] Error 2
  [root@t35lp46 perf]#

Output after:

  # make util/stat-display.o
    .....
  CC       util/stat-display.o
  [root@t35lp46 perf]#

Committer notes:

Kept the {} enclosing the multiline else block instead of removing it,
as pointed out by Jiri Olsa.

Suggested-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200825063304.77733-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-01 12:15:52 -03:00
Hongbo Yao
c0c652fc70 perf stat: Fix NULL pointer dereference
If config->aggr_map is NULL and config->aggr_get_id is not NULL,
the function print_aggr() will still call aggr_update_shadow(),
which can result in accessing an invalid pointer.
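A minimal sketch of the kind of guard this describes: proceed only when both
the aggregation map and the id callback are set. The types and the helper
name are hypothetical simplifications, not the perf definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the aggregation map and the config struct. */
struct aggr_map_sketch {
	int nr;
};

typedef int (*aggr_get_id_fn)(int cpu);

struct stat_config_sketch {
	struct aggr_map_sketch *aggr_map;
	aggr_get_id_fn aggr_get_id;
};

/* Example id callback used only for illustration. */
static int cpu_as_id(int cpu)
{
	return cpu;
}

/* True only when it is safe to walk the map and call the id callback. */
static bool can_print_aggr(const struct stat_config_sketch *config)
{
	return config->aggr_map != NULL && config->aggr_get_id != NULL;
}
```

A caller would return early when can_print_aggr() is false, so the map is
never dereferenced while it is NULL.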

Fixes: 088519f318 ("perf stat: Move the display functions to stat-display.c")
Signed-off-by: Hongbo Yao <yaohongbo@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wei Li <liwei391@huawei.com>
Link: https://lore.kernel.org/lkml/20200608163625.GC3073@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-09 12:40:04 -03:00
Arnaldo Carvalho de Melo
c754c382c9 perf evsel: Rename perf_evsel__is_*() to evsel__is*()
As those are 'struct evsel' methods, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-05 16:35:31 -03:00
Arnaldo Carvalho de Melo
8ab2e96d8f perf evsel: Rename *perf_evsel__*name() to *evsel__*name()
As they are 'struct evsel' methods or related routines, not part of
tools/lib/perf/, aka libperf, to whom the perf_ prefix belongs.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-05 16:35:30 -03:00
Arnaldo Carvalho de Melo
5eb88f0476 perf evsel: Rename perf_evsel__nr_cpus() to evsel__nr_cpus()
As it is a 'struct evsel' method, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-05 16:35:30 -03:00
Kajol Jain
3351c6da89 perf tools: Enable Hz/hz printing for --metric-only option
Commit 54b5091606 ("perf stat: Implement --metric-only mode") added
function 'valid_only_metric()' which drops "Hz" or "hz", if it is part
of "ScaleUnit". This patch enables it, since hv_24x7 supports a couple of
frequency events.
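A hedged sketch of what a unit filter looks like once "Hz"/"hz" are no
longer rejected. The function name carries a _sketch suffix on purpose, and
the remaining filtered substrings are assumptions for illustration rather
than a copy of the perf source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Decide whether a metric with the given unit may be shown in
 * --metric-only mode. Frequency units ("Hz", "hz") are accepted;
 * the rejected substrings below are illustrative assumptions.
 */
static bool valid_only_metric_sketch(const char *unit)
{
	if (!unit)
		return false;
	if (strstr(unit, "/sec") || strstr(unit, "CPUs utilized"))
		return false;
	return true;
}
```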

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/lkml/20200401203340.31402-7-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-30 10:48:33 -03:00
Jin Yao
d13e9e413e perf stat: Align the output for interval aggregation mode
There is a slight misalignment in -A -I output.

For example:

 # perf stat -e cpu/event=cpu-cycles/ -a -A -I 1000

 #           time CPU                    counts unit events
      1.000440863 CPU0               1,068,388      cpu/event=cpu-cycles/
      1.000440863 CPU1                 875,954      cpu/event=cpu-cycles/
      1.000440863 CPU2               3,072,538      cpu/event=cpu-cycles/
      1.000440863 CPU3               4,026,870      cpu/event=cpu-cycles/
      1.000440863 CPU4               5,919,630      cpu/event=cpu-cycles/
      1.000440863 CPU5               2,714,260      cpu/event=cpu-cycles/
      1.000440863 CPU6               2,219,240      cpu/event=cpu-cycles/
      1.000440863 CPU7               1,299,232      cpu/event=cpu-cycles/

The value of counts is not aligned with the column "counts" and
the event name is not aligned with the column "events".

With this patch, the output is,

 # perf stat -e cpu/event=cpu-cycles/ -a -A -I 1000

 #           time CPU                    counts unit events
      1.000423009 CPU0                  997,421      cpu/event=cpu-cycles/
      1.000423009 CPU1                1,422,042      cpu/event=cpu-cycles/
      1.000423009 CPU2                  484,651      cpu/event=cpu-cycles/
      1.000423009 CPU3                  525,791      cpu/event=cpu-cycles/
      1.000423009 CPU4                1,370,100      cpu/event=cpu-cycles/
      1.000423009 CPU5                  442,072      cpu/event=cpu-cycles/
      1.000423009 CPU6                  205,643      cpu/event=cpu-cycles/
      1.000423009 CPU7                1,302,250      cpu/event=cpu-cycles/

Now output is aligned.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200218071614.25736-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-03-24 09:37:27 -03:00
Kan Liang
2a14c1bf01 perf util: Factor out sysctl__nmi_watchdog_enabled()
The NMI watchdog status is required for metric group constraint
examination.  Factor out sysctl__nmi_watchdog_enabled() to retrieve the
NMI watchdog status.

Users may count more than one metric group each time. If so, the NMI
watchdog status may be retrieved several times. To reduce the overhead,
cache the NMI watchdog status.

Replace the NMI watchdog status checking in print_footer() by
sysctl__nmi_watchdog_enabled().
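The caching idea can be sketched as follows. The reader callback, the cache
struct and the fake reader are hypothetical and exist only so the example is
self-contained; the real helper reads the sysctl value itself.

```c
#include <stdbool.h>

/* Hypothetical reader: stores the sysctl value and returns <0 on error. */
typedef int (*read_int_fn)(int *value);

struct watchdog_cache {
	bool cached;   /* has a successful read happened yet? */
	bool enabled;  /* result of that read */
};

/* Return the watchdog state, reading it at most once on success. */
static bool nmi_watchdog_enabled_sketch(struct watchdog_cache *cache,
					read_int_fn read_int)
{
	int value;

	if (cache->cached)
		return cache->enabled;

	if (read_int(&value) < 0)
		return false;  /* read failed: report disabled, do not cache */

	cache->enabled = value > 0;
	cache->cached = true;
	return cache->enabled;
}

/* Fake reader for illustration: counts calls and reports "enabled". */
static int fake_reads;

static int fake_read_enabled(int *value)
{
	fake_reads++;
	*value = 1;
	return 0;
}
```

Calling nmi_watchdog_enabled_sketch() for each metric group then costs one
read in total rather than one read per group.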

Suggested-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/1582581564-184429-4-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-03-10 14:46:19 -03:00
Jin Yao
1af62ce61c perf stat: Show percore counts in per CPU output
We already support the event modifier "percore", which sums up the event
counts for all hardware threads in a core and shows the counts per core.

For example,

 # perf stat -e cpu/event=cpu-cycles,percore/ -a -A -- sleep 1

  Performance counter stats for 'system wide':

 S0-D0-C0                395,072      cpu/event=cpu-cycles,percore/
 S0-D0-C1                851,248      cpu/event=cpu-cycles,percore/
 S0-D0-C2                954,226      cpu/event=cpu-cycles,percore/
 S0-D0-C3              1,233,659      cpu/event=cpu-cycles,percore/

This patch provides a new option "--percore-show-thread". It is used
together with the event modifier "percore" to sum up the event counts for
all hardware threads in a core but show the counts per hardware thread.

This is essentially a replacement for the any bit (which is gone in
Icelake). Per core counts are useful for some formulas, e.g. CoreIPC.
The original percore version was inconvenient to post process. This
variant matches the output of the any bit.

With this patch, for example,

 # perf stat -e cpu/event=cpu-cycles,percore/ -a -A --percore-show-thread  -- sleep 1

  Performance counter stats for 'system wide':

 CPU0               2,453,061      cpu/event=cpu-cycles,percore/
 CPU1               1,823,921      cpu/event=cpu-cycles,percore/
 CPU2               1,383,166      cpu/event=cpu-cycles,percore/
 CPU3               1,102,652      cpu/event=cpu-cycles,percore/
 CPU4               2,453,061      cpu/event=cpu-cycles,percore/
 CPU5               1,823,921      cpu/event=cpu-cycles,percore/
 CPU6               1,383,166      cpu/event=cpu-cycles,percore/
 CPU7               1,102,652      cpu/event=cpu-cycles,percore/

We can see counts are duplicated in CPU pairs (CPU0/CPU4, CPU1/CPU5,
CPU2/CPU6, CPU3/CPU7).
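The duplication across sibling threads follows directly from the
aggregation rule: each thread is shown with the sum over its whole core. A
minimal sketch, assuming a hypothetical core_of[] array that maps each
hardware thread to its core:

```c
/*
 * counts[i]  : raw count measured on hardware thread i
 * core_of[i] : id of the core that thread i belongs to (hypothetical input)
 * out[i]     : per-core sum, reported once for every thread of that core
 */
static void percore_show_thread_sketch(const long *counts, const int *core_of,
				       int nr_cpus, long *out)
{
	for (int i = 0; i < nr_cpus; i++) {
		long sum = 0;

		/* Add up every thread that shares a core with thread i. */
		for (int j = 0; j < nr_cpus; j++) {
			if (core_of[j] == core_of[i])
				sum += counts[j];
		}
		out[i] = sum;
	}
}
```

Two threads on the same core therefore always get the same value, which is
the pairing visible in the output above.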

The interval mode also works. For example,

 # perf stat -e cpu/event=cpu-cycles,percore/ -a -A --percore-show-thread  -I 1000
 #           time CPU                    counts unit events
      1.000425421 CPU0                 925,032      cpu/event=cpu-cycles,percore/
      1.000425421 CPU1                 430,202      cpu/event=cpu-cycles,percore/
      1.000425421 CPU2                 436,843      cpu/event=cpu-cycles,percore/
      1.000425421 CPU3               1,192,504      cpu/event=cpu-cycles,percore/
      1.000425421 CPU4                 925,032      cpu/event=cpu-cycles,percore/
      1.000425421 CPU5                 430,202      cpu/event=cpu-cycles,percore/
      1.000425421 CPU6                 436,843      cpu/event=cpu-cycles,percore/
      1.000425421 CPU7               1,192,504      cpu/event=cpu-cycles,percore/

If we offline CPU5, the result is:

 # perf stat -e cpu/event=cpu-cycles,percore/ -a -A --percore-show-thread -- sleep 1

  Performance counter stats for 'system wide':

 CPU0               2,752,148      cpu/event=cpu-cycles,percore/
 CPU1               1,009,312      cpu/event=cpu-cycles,percore/
 CPU2               2,784,072      cpu/event=cpu-cycles,percore/
 CPU3               2,427,922      cpu/event=cpu-cycles,percore/
 CPU4               2,752,148      cpu/event=cpu-cycles,percore/
 CPU6               2,784,072      cpu/event=cpu-cycles,percore/
 CPU7               2,427,922      cpu/event=cpu-cycles,percore/

        1.001416041 seconds time elapsed

 v4:
 ---
 Ravi Bangoria reports an issue in v3. Once we offline a CPU,
 the output is not correct. The issue is we should use the cpu
 idx in print_percore_thread rather than using the cpu value.

 v3:
 ---
 1. Fix the interval mode output error
 2. Use cpu value (not cpu index) in config->aggr_get_id().
 3. Refine the code according to Jiri's comments.

 v2:
 ---
 Add the explanation in change log. This is essentially a replacement
 for the any bit. No code change.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200214080452.26402-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-03-04 10:34:09 -03:00
Jiri Olsa
86895b480a perf stat: Add --per-node aggregation support
Add a new --per-node option to aggregate counts per NUMA node
for system-wide mode measurements.

You can specify --per-node in live mode:

  # perf stat  -a -I 1000 -e cycles --per-node
  #           time node   cpus             counts unit events
       1.000542550 N0       20          6,202,097      cycles
       1.000542550 N1       20            639,559      cycles
       2.002040063 N0       20          7,412,495      cycles
       2.002040063 N1       20          2,185,577      cycles
       3.003451699 N0       20          6,508,917      cycles
       3.003451699 N1       20            765,607      cycles
  ...

Or in the record/report stat session:

  # perf stat record -a -I 1000 -e cycles
  #           time             counts unit events
       1.000536937         10,008,468      cycles
       2.002090152          9,578,539      cycles
       3.003625233          7,647,869      cycles
       4.005135036          7,032,086      cycles
  ^C     4.340902364          3,923,893      cycles

  # perf stat report --per-node
  #           time node   cpus             counts unit events
       1.000536937 N0       20          9,355,086      cycles
       1.000536937 N1       20            653,382      cycles
       2.002090152 N0       20          7,712,838      cycles
       2.002090152 N1       20          1,865,701      cycles
       3.003625233 N0       20          6,604,441      cycles
       3.003625233 N1       20          1,043,428      cycles
       4.005135036 N0       20          6,350,522      cycles
       4.005135036 N1       20            681,564      cycles
       4.340902364 N0       20          3,403,188      cycles
       4.340902364 N1       20            520,705      cycles

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20190904073415.723-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-06 15:49:39 -03:00
Arnaldo Carvalho de Melo
f2a39fe849 perf auxtrace: Uninline functions that touch perf_session
So that we don't carry the session.h include directive in auxtrace.h,
which in turn opens a can of worms of files that were getting all sorts
of things via that include, fix them all.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-d2d83aovpgri2z75wlitquni@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-31 22:24:10 -03:00
Souptick Joarder
b4de344b25 perf tools: Remove duplicate headers
Removed headers which are included twice.

Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1566663319-4283-1-git-send-email-jrdr.linux@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-26 11:58:29 -03:00
Jiri Olsa
a2f354e3ab libperf: Add perf_thread_map__nr/perf_thread_map__pid functions
So it's part of libperf library as basic functions operating on
perf_thread_map objects.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190822111141.25823-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 17:16:57 -03:00
Arnaldo Carvalho de Melo
bfc49182c6 perf stat: Add missing counts.h
It is getting this via evsel.h, which doesn't strictly need counts.h, just
forward declarations for some structs, so add it here before we remove
it from there.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-jwcbm9gv9llloe3he5qkdefs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 17:16:57 -03:00
Jiri Olsa
6549cd8f2c perf tools: Use perf_cpu_map__nr instead of cpu_map__nr
Switch the rest of the perf code to use libperf's perf_cpu_map__nr(),
which is the same as the current cpu_map__nr(), and remove the
cpu_map__nr() function.

Link: http://lkml.kernel.org/n/tip-6e0guy75clis7nm0xpuz9fga@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190822111141.25823-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 11:14:54 -03:00
Jiri Olsa
5643b1a59e libperf: Move nr_members from perf's evsel to libperf's perf_evsel
Move the nr_members member from perf's evsel to libperf's perf_evsel.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-60-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29 18:34:46 -03:00
Jiri Olsa
f72f901d90 libperf: Add cpus to struct perf_evlist
Move cpus from tools/perf's evlist to libperf's perf_evlist struct.

Committer notes:

Fixed up this one:

  tools/perf/arch/arm/util/cs-etm.c

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-55-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29 18:34:45 -03:00
Jiri Olsa
af663bd01b libperf: Add threads to struct perf_evsel
Move 'threads' from tools/perf's evsel to libperf's perf_evsel struct.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-53-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29 18:34:45 -03:00
Jiri Olsa
d400bd3abf libperf: Add cpus to struct perf_evsel
Move the 'cpus' field from tools/perf's evsel to libperf's perf_evsel.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-51-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29 18:34:45 -03:00
Jiri Olsa
1fc632cef4 libperf: Move perf_event_attr field from perf's evsel to libperf's perf_evsel
Move the perf_event_attr struct from 'struct evsel' to 'struct perf_evsel'.

Committer notes:

Fixed up these:

 tools/perf/arch/arm/util/auxtrace.c
 tools/perf/arch/arm/util/cs-etm.c
 tools/perf/arch/arm64/util/arm-spe.c
 tools/perf/arch/s390/util/auxtrace.c
 tools/perf/util/cs-etm.c

Also

  cc1: warnings being treated as errors
  tests/sample-parsing.c: In function 'do_test':
  tests/sample-parsing.c:162: error: missing initializer
  tests/sample-parsing.c:162: error: (near initialization for 'evsel.core.cpus')

   	struct evsel evsel = {
   		.needs_swap = false,
  -		.core.attr = {
  -			.sample_type = sample_type,
  -			.read_format = read_format,
  +		.core = {
  +			. attr = {
  +				.sample_type = sample_type,
  +				.read_format = read_format,
  +			},

  [perfbuilder@a70e4eeb5549 /]$ gcc --version |& head -1
  gcc (GCC) 4.4.7

Also we don't need to include perf_event.h in
tools/perf/lib/include/perf/evsel.h, forward declaring 'struct
perf_event_attr' is enough. And this even fixes the build in some
systems where things are used somewhere down the include path from
perf_event.h without defining __always_inline.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-43-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29 18:34:45 -03:00
Jiri Olsa
ce9036a6e3 libperf: Include perf_evlist in evlist object
Include perf_evlist in the evlist object. Later patches will continue to
move other generic things into libperf's perf_evlist.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-37-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29 18:34:44 -03:00
Jiri Olsa
b27c4ece72 libperf: Include perf_evsel in evsel object
Include perf_evsel in the evsel object. Later patches will continue to
move other generic things into libperf's perf_evsel struct.
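The embedding pattern can be sketched as below: the tool-side struct holds
the library-side struct as a member, and a helper recovers the outer object
from a pointer to the embedded one. The _sketch struct names, the name field
and the to_evsel() helper are hypothetical; only the member name core and
the nr_members field come from this history.

```c
#include <stddef.h>

/* Library-side part: generic fields that are meant to live in libperf. */
struct perf_evsel_sketch {
	int nr_members;
};

/* Tool-side object: embeds the library part as its 'core' member. */
struct evsel_sketch {
	struct perf_evsel_sketch core;
	const char *name;  /* example of a tool-only field */
};

/* Recover the tool-side object from a pointer to its embedded core. */
static struct evsel_sketch *to_evsel(struct perf_evsel_sketch *core)
{
	return (struct evsel_sketch *)((char *)core -
				       offsetof(struct evsel_sketch, core));
}
```

Generic code can then work on the embedded part alone, while tool code keeps
access to its extra fields through the outer struct.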

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-36-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29 18:34:44 -03:00
Jiri Olsa
7836e52e51 libperf: Add perf_thread_map__get()/perf_thread_map__put()
Move the following functions:

  thread_map__get()
  thread_map__put()
  thread_map__comm()

to libperf with the following names:

  perf_thread_map__get()
  perf_thread_map__put()
  perf_thread_map__comm()

Add the perf_thread_map__comm() function for it to work/compile.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-34-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29 18:34:44 -03:00
Jiri Olsa
b49aca3e9c perf evsel: Rename perf_evsel__cpus() to evsel__cpus()
Rename perf_evsel__cpus() to evsel__cpus(), so we don't have a name
clash when we add perf_evsel__cpus() in libperf.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-19-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29 18:34:43 -03:00
Jiri Olsa
63503dba87 perf evlist: Rename struct perf_evlist to struct evlist
Rename struct perf_evlist to struct evlist, so we don't have a name
clash when we add struct perf_evlist in libperf.

Committer notes:

Added fixes to build on arm64, from Jiri and from me
(tools/perf/util/cs-etm.c)

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29 18:34:42 -03:00
Jiri Olsa
32dcd021d0 perf evsel: Rename struct perf_evsel to struct evsel
Rename struct perf_evsel to struct evsel, so we don't have a name clash
when we add struct perf_evsel in libperf.

Committer notes:

Added fixes for arm64, provided by Jiri.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29 18:34:42 -03:00
Andi Kleen
6c5f4e5cb3 perf stat: Don't merge events in the same PMU
Event merging is mainly to collapse similar events in lots of different
duplicated PMUs.

It can break metric displaying. It's possible for two metrics to have
the same event, and when the two events happen in a row the second
wouldn't be displayed.  This would also not show the second metric.

To avoid this, don't merge events in the same PMU. This makes sense: if
we have multiple events in the same PMU there is likely some reason for
it (e.g. using multiple groups) and we had better not merge them.

While in theory it would be possible to construct metrics that have
events with the same name in different PMUs, no current metrics have this
problem.
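The merge rule described here can be sketched as a predicate: two events are
merge candidates only if they share a name and come from different PMUs. The
struct and function names are hypothetical, and a real check would likely
compare more fields than these two.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, reduced view of an event: its name and its PMU. */
struct event_sketch {
	const char *name;
	const char *pmu_name;
};

/* Merge only identically named events that live in different PMUs. */
static bool can_merge_sketch(const struct event_sketch *a,
			     const struct event_sketch *b)
{
	if (strcmp(a->name, b->name) != 0)
		return false;  /* different events are never merged */
	if (strcmp(a->pmu_name, b->pmu_name) == 0)
		return false;  /* same PMU: keep both, they may feed two metrics */
	return true;
}
```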

This is the fix for perf stat -M UPI,IPC (needs also another bug fix to
completely work)

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Fixes: 430daf2dc7 ("perf stat: Collapse identically named events")
Link: http://lkml.kernel.org/r/20190624193711.35241-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-01 22:50:41 -03:00
Arnaldo Carvalho de Melo
328584804e perf tools: Ditch rtrim(), use skip_spaces() to get closer to the kernel
No change in behaviour, just using the same kernel idiom for such
operation.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: André Goddard Rosa <andre.goddard@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-a85lkptkt0ru40irpga8yf54@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-06-26 11:42:03 -03:00
Arnaldo Carvalho de Melo
810826acd1 perf stat: Use recently introduced skip_spaces()
No change in behaviour.

Cc: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ncpvp4eelf8fqhuy29uv56z9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-06-25 21:28:49 -03:00
Arnaldo Carvalho de Melo
3052ba56bc tools perf: Move from sane_ctype.h obtained from git to the Linux's original
We got the sane_ctype.h headers from git and kept using it so far, but
since that code originally came from the kernel sources to the git
sources, perhaps it's better to just use the one in the kernel, so that
we can leverage tools/perf/check_headers.sh to be notified when our copy
gets out of sync, i.e. when fixes or goodies are added to the code we've
copied.

This will help with things like tools/lib/string.c where we want to have
more things in common with the kernel, such as strim(), skip_spaces(),
etc so as to go on removing the things that we have in tools/perf/util/
and instead using the code in the kernel, indirectly and removing things
like EXPORT_SYMBOL(), etc, getting notified when fixes and improvements
are made to the original code.

Hopefully this also should help with reducing the difference of code
hosted in tools/ to the one in the kernel proper.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-7k9868l713wqtgo01xxygn12@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-06-25 21:02:47 -03:00
Kan Liang
db5742b684 perf stat: Support per-die aggregation
It is useful to aggregate counts per die. E.g. Uncore becomes die-scope
on Xeon Cascade Lake-AP.

Introduce a new option "--per-die" to support per-die aggregation.

The global id for each core has been changed to socket + die id + core
id. The global id for each die is socket + die id.

Add die information for per-core aggregation. The output of per-core
aggregation will be changed from "S0-C0" to "S0-D0-C0". Any scripts
which rely on the output format of per-core aggregation will probably
break.

For 'perf stat record/report', there is no die information when
processing an old perf.data file, so the per-die result will be the
same as per-socket.

Committer notes:

Renamed 'die' variable to 'die_id' to fix the build in some systems:

    CC       /tmp/build/perf/builtin-script.o
  cc1: warnings being treated as errors
  builtin-stat.c: In function 'perf_env__get_die':
  builtin-stat.c:963: error: declaration of 'die' shadows a global declaration
  util/util.h:19: error: shadowed declaration is here
  mv: cannot stat `/tmp/build/perf/.builtin-stat.o.tmp': No such file or directory

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/n/tip-bsnhx7vgsuu6ei307mw60mbj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-06-10 16:19:59 -03:00
Jin Yao
4fc4d8dfa0 perf stat: Support 'percore' event qualifier
With this patch, we can use the 'percore' event qualifier in perf-stat.

  root@skl:/tmp# perf stat -e cpu/event=0,umask=0x3,percore=1/,cpu/event=0,umask=0x3/ -a -A -I1000
    1.000773050 S0-C0   98,352,832 cpu/event=0,umask=0x3,percore=1/  (50.01%)
    1.000773050 S0-C1  103,763,057 cpu/event=0,umask=0x3,percore=1/  (50.02%)
    1.000773050 S0-C2  196,776,995 cpu/event=0,umask=0x3,percore=1/  (50.02%)
    1.000773050 S0-C3  176,493,779 cpu/event=0,umask=0x3,percore=1/  (50.02%)
    1.000773050 CPU0    47,699,641 cpu/event=0,umask=0x3/            (50.02%)
    1.000773050 CPU1    49,052,451 cpu/event=0,umask=0x3/            (49.98%)
    1.000773050 CPU2   102,771,422 cpu/event=0,umask=0x3/            (49.98%)
    1.000773050 CPU3   100,784,662 cpu/event=0,umask=0x3/            (49.98%)
    1.000773050 CPU4    43,171,342 cpu/event=0,umask=0x3/            (49.98%)
    1.000773050 CPU5    54,152,158 cpu/event=0,umask=0x3/            (49.98%)
    1.000773050 CPU6    93,618,410 cpu/event=0,umask=0x3/            (49.98%)
    1.000773050 CPU7    74,477,589 cpu/event=0,umask=0x3/            (49.99%)

In this example, we count the event 'ref-cycles' per-core and per-CPU in
one perf stat command-line. From the output, we can see:

  S0-C0 = CPU0 + CPU4
  S0-C1 = CPU1 + CPU5
  S0-C2 = CPU2 + CPU6
  S0-C3 = CPU3 + CPU7

So the result is as expected (tiny differences are ignored).

Note that the 'percore' event qualifier needs to be used with option '-A'.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1555077590-27664-4-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-16 14:17:24 -03:00
Jin Yao
40480a8136 perf stat: Factor out aggregate counts printing
Move the aggregate counts printing to a new function
print_counter_aggrdata, which will be used in following patches.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1555077590-27664-3-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-16 14:17:24 -03:00
Andi Kleen
c2b3c170db perf stat: Revert checks for duration_time
This reverts e864c5ca14 ("perf stat: Hide internal duration_time
counter"), but does so manually since the code has now moved to a
different file.

The next patch will properly implement duration_time as a full event, so
there is no need to hide it anymore.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190326221823.11518-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-04-01 14:49:24 -03:00
Arnaldo Carvalho de Melo
8a249c73a5 perf annotate: Remove lots of headers from annotate.h
To reduce the chances that changes trigger tons of rebuilds. More to come.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ytbykaku63862guk7muflcy4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-25 15:12:08 +01:00
Stephane Eranian
bc4da38a47 perf stat: Fix CSV mode column output for non-cgroup events
When using the -x option, perf stat prints CSV-style output with one
event per line.  For each event, it prints the count, the unit, the
event name, the cgroup, and a bunch of other event specific fields (such
as insn per cycles).

When you use CSV-style mode, you expect a normalized output where each
event is printed with the same number of fields regardless of what it is
so it can easily be imported into a spreadsheet or parsed.

For instance, if an event does not have a unit, then print an empty
field for it.

Although this approach was implemented for the unit, it was not for the
cgroup.

When mixing cgroup and non-cgroup events, non-cgroup events would not
show an empty field; instead the next field was printed, so the columns
did not line up correctly.

This patch fixes the cgroup output issue by forcing an empty field
for non-cgroup events as soon as one event has a cgroup.

Before:

  <not counted> @ @cycles @foo    @ 0    @100.00@@
  2531614       @ @cycles @6420922@100.00@    @

foo cgroup lines up with time_running!

After:

  <not counted> @ @cycles @foo @0       @100.00@@
  2594834       @ @cycles @    @5287372 @100.00@@

Fields line up.

Signed-off-by: Stephane Eranian <eranian@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1541587845-9150-1-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:53:41 -03:00
Jiri Olsa
088519f318 perf stat: Move the display functions to stat-display.c
Move perf_evlist__print_counters() with all its dependency functions to
the stat-display.c object.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-44-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-30 15:52:25 -03:00