Commit graph

33379 commits

Author SHA1 Message Date
Ian Rogers
8749311045 perf vendor events intel: Refresh haswell metrics and events
Update the haswell metrics and events using the new tooling from:

  https://github.com/intel/perfmon

The metrics are unchanged but the formulas differ due to parentheses,
use of exponents and removal of redundant operations like "* 1".  The
events are unchanged but unused json values are removed. The
formatting changes increase consistency across the json files.
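
As a rough illustration of why a reformatted formula leaves the metric unchanged, the sketch below evaluates two spellings of the same expression. The event names and counts are hypothetical placeholders, and it assumes "use of exponents" refers to scientific notation such as 1e6.

```python
import math

# Hypothetical event counts; EVENT_A and EVENT_B are placeholders, not real
# haswell event names.
COUNTS = {"EVENT_A": 1_500_000.0, "EVENT_B": 3_000_000.0}


def evaluate(formula, values):
    """Evaluate an arithmetic formula string against a dict of event counts."""
    # eval() with no builtins is enough for a sketch; never use it on
    # untrusted input.
    return eval(formula, {"__builtins__": {}}, dict(values))


# Older style: explicit parentheses, a long literal and a redundant "* 1".
old_value = evaluate("1000000 * (EVENT_A / EVENT_B) * 1", COUNTS)
# Newer style: exponent notation, no redundant operations.
new_value = evaluate("1e6 * EVENT_A / EVENT_B", COUNTS)

print(old_value, new_value, math.isclose(old_value, new_value))
```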

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215065510.1621979-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:41 -03:00
Ian Rogers
a335420d32 perf vendor events intel: Refresh goldmontplus events
Update the goldmontplus events using the new tooling from:

  https://github.com/intel/perfmon

The events are unchanged but unused json values are removed. This
increases consistency across the json files.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215065510.1621979-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:41 -03:00
Ian Rogers
387bc79f83 perf vendor events intel: Refresh goldmont events
Update the goldmont events using the new tooling from:

  https://github.com/intel/perfmon

The events are unchanged but unused json values are removed. This
increases consistency across the json files.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215065510.1621979-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:41 -03:00
Ian Rogers
5cebe49ce8 perf vendor events intel: Refresh elkhartlake events
Update the elkhartlake events using the new tooling from:

  https://github.com/intel/perfmon

The events are unchanged but unused json values are removed. This
increases consistency across the json files.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215065510.1621979-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:41 -03:00
Ian Rogers
8358b12227 perf vendor events intel: Refresh cascadelakex metrics and events
Update the cascadelakex metrics and events using the new tooling from:

  https://github.com/intel/perfmon

The metrics are unchanged but the formulas differ due to parentheses,
use of exponents and removal of redundant operations like "* 1". The
order of metrics varies as TMA metrics are first converted and then
removed if perfmon versions are found. The events are updated with
fixes to uncore events and improved descriptions. The formatting
changes increase consistency across the json files.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215065017.1621020-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:40 -03:00
Ian Rogers
5e241aad62 perf vendor events intel: Refresh broadwellx metrics and events
Update the broadwellx metrics and events using the new tooling from:

  https://github.com/intel/perfmon

The metrics are unchanged but the formulas differ due to parentheses,
use of exponents and removal of redundant operations like "* 1". The
order of metrics varies as TMA metrics are first converted and then
removed if perfmon versions are found. The events are updated with
fixes to uncore events and improved descriptions. The formatting
changes increase consistency across the json files.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215065017.1621020-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:40 -03:00
Ian Rogers
f6ee944ce4 perf vendor events intel: Refresh broadwellde metrics and events
Update the broadwellde metrics and events using the new tooling from:

  https://github.com/intel/perfmon

The metrics vary as tma_false_sharing, MEM_Parallel_Requests and
MEM_Request_Latency are explicitly dropped from having missing events:
https://github.com/captain5050/perfmon/blob/main/scripts/create_perf_json.py#L934
The formulas also differ due to parentheses, use of exponents and
removal of redundant operations like "* 1".  The events are unchanged
but unused json values are removed and implicit umasks of 0 are
dropped. This increases consistency across the json files.

mapfile.csv's version number is set to match that in the perfmon
repository.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215065017.1621020-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:40 -03:00
Ian Rogers
fec57a8e4a perf vendor events intel: Refresh broadwell metrics and events
Update the broadwell metrics and events using the new tooling from:

  https://github.com/intel/perfmon

The metrics are unchanged but the formulas differ due to parentheses,
use of exponents and removal of redundant operations like "* 1".  The
events are unchanged but unused json values are removed, implicit
umasks of 0 are dropped and duplicate short and long descriptions have
the long one dropped. This increases consistency across the json
files.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215064755.1620246-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:40 -03:00
Ian Rogers
6fa91f645f perf vendor events intel: Refresh bonnell events
Update the bonnell events using the new tooling from:

  https://github.com/intel/perfmon

The events are unchanged but unused json values are removed and
implicit umasks of 0 are dropped. This increases consistency across
the json files.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215064755.1620246-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:40 -03:00
Ian Rogers
a5abef626f perf vendor events intel: Refresh alderlake-n metrics
Update the alderlake-n metrics using the new tooling from:

  https://github.com/intel/perfmon

The metrics are unchanged but the formulas differ due to parentheses,
use of exponents and removal of redundant operations like "* 1".

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215064755.1620246-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:40 -03:00
Ian Rogers
266b2ca727 perf vendor events intel: Refresh alderlake metrics
Update the alderlake metrics using the new tooling from:

  https://github.com/intel/perfmon

The metrics are unchanged but the formulas differ due to parentheses,
use of exponents and removal of redundant operations like "* 1".

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221215064755.1620246-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:40 -03:00
Ian Rogers
ed4c1778cc perf test pmu-events: Fake PMU metric workaround
We test metrics with fake events with fake values. The fake values may
yield division by zero and so we count both up and down to try to
avoid this. Unfortunately this isn't sufficient for some metrics and
so don't fail the test for them.

Add the metric name to debug output.
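
The idea can be sketched in Python as follows. This is not the perf test code: the metrics are hypothetical callables, and the fake values are simply 1..n and n..1.

```python
def check_metric(name, metric, event_names):
    """Evaluate metric with fake counts; return "ok" or "tolerated".

    metric is a callable taking a dict of event name -> fake value.
    """
    n = len(event_names)
    counting_up = {ev: float(i + 1) for i, ev in enumerate(event_names)}
    counting_down = {ev: float(n - i) for i, ev in enumerate(event_names)}
    for fake_values in (counting_up, counting_down):
        try:
            metric(fake_values)
            return "ok"
        except ZeroDivisionError:
            # Include the metric name so the debug output says which one failed.
            print(f"metric {name}: division by zero with {fake_values}")
    # Neither direction worked; do not fail the test for this metric.
    return "tolerated"


# Hypothetical metrics, written as Python callables for the sketch.
rescued = check_metric(
    "example_a", lambda v: v["a"] / (v["b"] - 2 * v["a"]), ["a", "b"])
always_zero = check_metric(
    "example_b", lambda v: v["a"] / (v["a"] - v["a"]), ["a"])
print(rescued, always_zero)
```

The first metric divides by zero when counting up but evaluates when counting down; the second divides by zero for any values, which is the case the test now tolerates.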

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20221215064755.1620246-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:40 -03:00
Namhyung Kim
ad9ef9eb64 perf hist: Improve srcline_{from,to} sort key performance
Likewise, modify ->cmp() callback to compare sample address and map
address.  And add ->collapse() and ->sort() to check the actual
srcfile string.  Also add ->init() to make sure it has the srcfile.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221215192817.2734573-10-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:40 -03:00
Namhyung Kim
f0cdde28fe perf hist: Improve srcfile sort key performance
Likewise, modify ->cmp() callback to compare sample address and map
address.  And add ->collapse() and ->sort() to check the actual
srcfile string.  Also add ->init() to make sure it has the srcfile.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221215192817.2734573-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:40 -03:00
Namhyung Kim
ec222d7e7c perf hist: Improve srcline sort key performance
The sort_entry->cmp() will be called for every sample to find a
matching entry.  When it has the 'srcline' sort key, that means it needs
to call addr2line or libbfd every time.

This is not optimal because many samples share the same address, and
addr2line only needs to be called once for each.  So postpone the actual
srcline check to sort_entry->collapse() and compare addresses in ->cmp().

It also needs an ->init() callback to make sure the entry has srcline
info.  If a sample has unique data, chances are the entry can be sorted
out by other (previous) keys and the callbacks in sort_srcline are never
called.
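
A minimal Python sketch of the two-phase idea, not the perf implementation: samples are first grouped by address with a cheap comparison, and the expensive lookup runs once per unique address when groups are merged. The addresses, source lines and the fake_resolver helper are hypothetical.

```python
from collections import defaultdict


def build_histogram(sample_addrs, resolve_srcline):
    """Group samples by address first, then merge the groups by source line.

    resolve_srcline stands in for an expensive addr2line-style lookup.
    """
    # Phase 1 (the ->cmp() role): cheap integer comparison, no lookup.
    by_addr = defaultdict(int)
    for addr in sample_addrs:
        by_addr[addr] += 1

    # Phase 2 (the ->collapse() role): one lookup per unique address.
    by_srcline = defaultdict(int)
    for addr, count in by_addr.items():
        by_srcline[resolve_srcline(addr)] += count
    return dict(by_srcline)


# Hypothetical address -> source line table.
SRCLINES = {0x1000: "a.c:10", 0x1004: "a.c:10", 0x2000: "b.c:5"}
lookups = []


def fake_resolver(addr):
    """Record each lookup so the number of expensive calls can be counted."""
    lookups.append(addr)
    return SRCLINES.get(addr, "??:0")


samples = [0x1000] * 3 + [0x1004] * 2 + [0x2000]
histogram = build_histogram(samples, fake_resolver)
print(histogram, len(lookups))
```

Six samples need only three lookups here, one per unique address, while the two addresses on the same source line still end up in one entry.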

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221215192817.2734573-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:40 -03:00
Namhyung Kim
cb6e92c764 perf hist: Add perf_hpp_fmt->init() callback
In __hists__insert_output_entry(), it calls fmt->sort() for dynamic
entries with NULL to update column width for tracepoint fields.
But it's a hacky abuse of the sort callback, better to have a proper
callback for that.  I'll add more use cases later.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221215192817.2734573-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:40 -03:00
Namhyung Kim
d5e33ce06b perf srcline: Conditionally suppress addr2line warnings
It has symbol_conf.disable_add2line_warn to suppress some warnings.  Let's
make it consistent with others.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221215192817.2734573-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:40 -03:00
Namhyung Kim
3b27222dd6 perf srcline: Skip srcline if .debug_line is missing
The srcline info comes from the .debug_line section.  There is no need
to set up the addr2line subprocess if the section is missing.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20221215192817.2734573-5-namhyung@kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:40 -03:00
Namhyung Kim
06ea72a42d perf symbol: Add filename__has_section()
filename__has_section() checks whether the given section name is present
in the binary.  It will be used to check for debug info for srcline.

Committer notes:

Added missing __maybe_unused to the unused filename__has_section()
arguments in tools/perf/util/symbol-minimal.c.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221215192817.2734573-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:40 -03:00
Namhyung Kim
ea335ef3dd perf srcline: Do not return NULL for srcline
The code always assumes a non-NULL srcline value, so return the usual
SRCLINE_UNKNOWN ("??:0") string instead.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221215192817.2734573-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:40 -03:00
Michael Petlan
b50d691e50 perf test: Fix "all PMU test" to skip parametrized events
Parametrized events are not specific to powerpc. They occur on other
platforms too (e.g. aarch64). They should be ignored in this testcase,
since proper setup of the parameters is out of scope of this script.

Let's not filter them out by PMU name, but rather based on the fact that
they expect a parameter.

Fixes: 451ed8058c ("perf test: Fix "all PMU test" to skip hv_24x7/hv_gpci tests on powerpc")
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Link: https://lore.kernel.org/r/20221219163008.9691-1-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:40 -03:00
Changbin Du
0c0a0db87e perf tools: Add .DELETE_ON_ERROR special Makefile target to clean up partially updated files on error.
Like kbuild, this adds the .DELETE_ON_ERROR special target to clean up
partially updated files on error. A known issue is the empty vmlinux.h
generated by bpftool when it fails to dump BTF info.
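
The effect of .DELETE_ON_ERROR can be approximated in Python as below. This is an analogy for what make does with a failed target, not part of the perf build; build_target and the two recipes are hypothetical helpers.

```python
import os
import tempfile


def build_target(path, recipe):
    """Run recipe(file) to produce path; remove path if the recipe fails."""
    try:
        with open(path, "w") as out:
            recipe(out)
    except Exception:
        # Mirror .DELETE_ON_ERROR: do not leave a partial target behind.
        if os.path.exists(path):
            os.remove(path)
        raise


def failing_recipe(out):
    """Fail before writing anything, leaving an empty file behind."""
    raise RuntimeError("dump failed")


def working_recipe(out):
    out.write("/* generated */\n")


with tempfile.TemporaryDirectory() as tmp:
    bad = os.path.join(tmp, "bad.h")
    good = os.path.join(tmp, "good.h")
    try:
        build_target(bad, failing_recipe)
    except RuntimeError:
        pass
    build_target(good, working_recipe)
    bad_exists = os.path.exists(bad)
    good_exists = os.path.exists(good)

print(bad_exists, good_exists)
```

Without the cleanup, the empty file would look up to date on the next run and the failed step would not be retried.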

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221217225151.90387-1-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:39 -03:00
Namhyung Kim
cb459c89b7 perf test: Update 'perf lock contention' test
Add more tests for the new filters.

  $ sudo perf test contention -v
   87: kernel lock contention analysis test                            :
  --- start ---
  test child forked, pid 412379
  Testing perf lock record and perf lock contention
  Testing perf lock contention --use-bpf
  Testing perf lock record and perf lock contention at the same time
  Testing perf lock contention --threads
  Testing perf lock contention --lock-addr
  Testing perf lock contention --type-filter
  Testing perf lock contention --lock-filter
  test child finished with 0
  ---- end ----
  kernel lock contention analysis test: Ok

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221219201732.460111-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:39 -03:00
Namhyung Kim
5e3febe7b7 perf lock contention: Support lock addr/name filtering for BPF
Likewise, add an addr_filter BPF hash map and check it against the lock
address.

  $ sudo ./perf lock con -ab -L tasklist_lock -- ./perf bench sched messaging
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 10 groups == 400 processes run

       Total time: 0.169 [sec]
   contended  total wait  max wait  avg wait      type  caller

          18   174.09 us  25.31 us   9.67 us  rwlock:W  do_exit+0x36d
           5    32.34 us  10.87 us   6.47 us  rwlock:R  do_wait+0x8b
           4    15.41 us   4.73 us   3.85 us  rwlock:W  release_task+0x6e

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221219201732.460111-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:39 -03:00
Namhyung Kim
511e19b9e2 perf lock contention: Add -L/--lock-filter option
The -L/--lock-filter option restricts the output to the given locks.
Locks can be specified by address or by name (if one exists).
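
A sketch of how such an argument might be split into addresses, assuming hex strings are tried first and anything else is looked up by name. The symbol table reuses the two named locks from the example output in this message; the function names are hypothetical and this is not the builtin-lock.c code.

```python
def parse_lock_filter(spec, symbols):
    """Turn a comma-separated -L style argument into a set of addresses.

    symbols maps lock names to addresses. Returns (addresses, unknown_names).
    """
    addrs, unknown = set(), []
    for item in filter(None, spec.split(",")):
        try:
            addrs.add(int(item, 16))  # plain hex string: take it as an address
            continue
        except ValueError:
            pass
        if item in symbols:
            addrs.add(symbols[item])
        else:
            unknown.append(item)  # would be reported as an unknown symbol
    return addrs, unknown


def matches(lock_addr, addrs):
    """With an empty filter every lock is kept."""
    return not addrs or lock_addr in addrs


SYMBOLS = {"jiffies_lock": 0xFFFFFFFF9D007A40, "rcu_state": 0xFFFFFFFF9D0D50C0}
by_name = parse_lock_filter("jiffies_lock,rcu_state", SYMBOLS)
by_addr = parse_lock_filter("ffff9f4140059000", SYMBOLS)
bad_name = parse_lock_filter("call", SYMBOLS)
print(by_name, by_addr, bad_name)
```

An unknown name leaves the filter empty, so nothing is filtered out, which is consistent with the committer testing output for "-L call".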

  $ sudo ./perf lock record -a  sleep 1

  $ sudo ./perf lock con -l
   contended  total wait  max wait  avg wait           address  symbol

          57     1.11 ms  42.83 us  19.54 us  ffff9f4140059000
          15   280.88 us  23.51 us  18.73 us  ffffffff9d007a40  jiffies_lock
           1    20.49 us  20.49 us  20.49 us  ffffffff9d0d50c0  rcu_state
           1     9.02 us   9.02 us   9.02 us  ffff9f41759e9ba0

  $ sudo ./perf lock con -L jiffies_lock,rcu_state
   contended  total wait  max wait  avg wait      type  caller

          15   280.88 us  23.51 us  18.73 us  spinlock  tick_sched_do_timer+0x93
           1    20.49 us  20.49 us  20.49 us  spinlock  __softirqentry_text_start+0xeb

  $ sudo ./perf lock con -L ffff9f4140059000
   contended  total wait  max wait  avg wait      type  caller

          38   779.40 us  42.83 us  20.51 us  spinlock  worker_thread+0x50
          11   216.30 us  39.87 us  19.66 us  spinlock  queue_work_on+0x39
           8   118.13 us  20.51 us  14.77 us  spinlock  kthread+0xe5

Committer testing:

  # uname -a
  Linux quaco 6.0.12-200.fc36.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Dec 8 17:15:53 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
  # perf lock record
  ^C[ perf record: Woken up 1 times to write data ]
  # perf lock con -L jiffies_lock,rcu_state
   contended   total wait     max wait     avg wait         type   caller

  # perf lock con
   contended   total wait     max wait     avg wait         type   caller

           1      9.06 us      9.06 us      9.06 us     spinlock   call_timer_fn+0x24
  # perf lock con -L call
  ignore unknown symbol: call
   contended   total wait     max wait     avg wait         type   caller

           1      9.06 us      9.06 us      9.06 us     spinlock   call_timer_fn+0x24
  #

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221219201732.460111-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:39 -03:00
Namhyung Kim
529772c4df perf lock contention: Support lock type filtering for BPF
Likewise, add a type_filter BPF hash map and check it when the user
gives a lock type filter.

  $ sudo ./perf lock con -ab -Y rwlock -- ./perf bench sched messaging
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 10 groups == 400 processes run

       Total time: 0.203 [sec]
   contended  total wait  max wait  avg wait       type  caller

          15   156.19 us  19.45 us  10.41 us   rwlock:W  do_exit+0x36d
           1    11.12 us  11.12 us  11.12 us   rwlock:R  do_wait+0x8b
           1     5.09 us   5.09 us   5.09 us   rwlock:W  release_task+0x6e

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221219201732.460111-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:39 -03:00
Namhyung Kim
b4a7eff93c perf lock contention: Add -Y/--type-filter option
The -Y/--type-filter option filters the result for specific lock types
only.  It accepts comma-separated values.  Type names are written as
they appear in the output: spinlock, mutex, rwsem:R and so on.

For RW-variant lock types, it converts the name to both variants.
In other words, "rwsem" is the same as "rwsem:R,rwsem:W".  Also note that
"mutex" has two different encodings - one for sleeping wait, another for
optimistic spinning.  Add a "mutex-spin" entry to the lock_type_table so
that it can be added for "mutex" behind the scenes.
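
The name expansion described above can be sketched in Python. The list of RW lock types is an illustrative subset and the function is hypothetical; the real code maps names to flag values through lock_type_table, which this sketch does not reproduce.

```python
# Illustrative subset of lock types that have :R and :W variants.
RW_TYPES = ("rwsem", "rwlock")


def expand_type_filter(spec):
    """Expand a comma-separated -Y style argument into concrete type names."""
    expanded = []
    for name in filter(None, spec.split(",")):
        if name in RW_TYPES:
            # A bare RW type selects both the reader and writer variants.
            expanded += [name + ":R", name + ":W"]
        elif name == "mutex":
            # "mutex" covers both the sleeping and the spinning encoding.
            expanded += ["mutex", "mutex-spin"]
        else:
            expanded.append(name)
    return expanded


print(expand_type_filter("rwsem"))
print(expand_type_filter("spinlock,mutex"))
print(expand_type_filter("rwsem:R"))
```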

  $ sudo ./perf lock record -a -- ./perf bench sched messaging

  $ sudo ./perf lock con -E 5 -Y spinlock
   contended  total wait   max wait  avg wait      type  caller

         802     1.26 ms   11.73 us   1.58 us  spinlock  __wake_up_common_lock+0x62
          13   787.16 us  105.44 us  60.55 us  spinlock  remove_wait_queue+0x14
          12   612.96 us   78.70 us  51.08 us  spinlock  prepare_to_wait+0x27
         114   340.68 us   12.61 us   2.99 us  spinlock  try_to_wake_up+0x1f5
          83   226.38 us    9.15 us   2.73 us  spinlock  folio_lruvec_lock_irqsave+0x5e

Committer notes:

Make get_type_flag() return UINT_MAX on error instead of -1UL, as that
function returns 'unsigned int' and we store the value in an 'unsigned
int' 'flags' variable, which makes clang unhappy:

  35    98.23 fedora:37                     : FAIL clang version 15.0.6 (Fedora 15.0.6-1.fc37)
    builtin-lock.c:2012:14: error: result of comparison of constant 18446744073709551615 with expression of type 'unsigned int' is always true [-Werror,-Wtautological-constant-out-of-range-compare]
                            if (flags != -1UL) {
                                ~~~~~ ^  ~~~~
    builtin-lock.c:2021:14: error: result of comparison of constant 18446744073709551615 with expression of type 'unsigned int' is always true [-Werror,-Wtautological-constant-out-of-range-compare]
                            if (flags != -1UL) {
                                ~~~~~ ^  ~~~~
    builtin-lock.c:2037:14: error: result of comparison of constant 18446744073709551615 with expression of type 'unsigned int' is always true [-Werror,-Wtautological-constant-out-of-range-compare]
                            if (flags != -1UL) {
                                ~~~~~ ^  ~~~~
    3 errors generated.
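
The arithmetic behind the warning can be checked with a short Python sketch that models fixed-width unsigned values by masking. It assumes a target where 'unsigned long' is 64 bits and 'unsigned int' is 32 bits, as on x86_64 Linux; the helper and variable names are hypothetical.

```python
ULONG_BITS = 64  # assumption: LP64 target such as x86_64 Linux
UINT_BITS = 32


def to_unsigned(value, bits):
    """Wrap an integer to an unsigned value of the given width."""
    return value & ((1 << bits) - 1)


MINUS_ONE_UL = to_unsigned(-1, ULONG_BITS)  # the constant clang reports
UINT_MAX = to_unsigned(-1, UINT_BITS)

# Returning -1UL from a function declared 'unsigned int' truncates it.
returned_error = to_unsigned(MINUS_ONE_UL, UINT_BITS)

# What each check concludes for an error return (True means "no error").
old_check_says_ok = returned_error != MINUS_ONE_UL
new_check_says_ok = returned_error != UINT_MAX

print(MINUS_ONE_UL, UINT_MAX, old_check_says_ok, new_check_says_ok)
```

Because no 32-bit value can equal the 64-bit constant, the old comparison is always true and an error return is never detected; comparing against UINT_MAX restores the check.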

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221219201732.460111-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:51:04 -03:00
Linus Torvalds
609d3bc623 Including fixes from bpf, netfilter and can.
Current release - regressions:
 
  - bpf: synchronize dispatcher update with bpf_dispatcher_xdp_func
 
  - rxrpc:
    - fix security setting propagation
    - fix null-deref in rxrpc_unuse_local()
    - fix switched parameters in peer tracing
 
 Current release - new code bugs:
 
  - rxrpc:
    - fix I/O thread startup getting skipped
    - fix locking issues in rxrpc_put_peer_locked()
    - fix I/O thread stop
    - fix uninitialised variable in rxperf server
    - fix the return value of rxrpc_new_incoming_call()
 
  - microchip: vcap: fix initialization of value and mask
 
  - nfp: fix unaligned io read of capabilities word
 
 Previous releases - regressions:
 
  - stop in-kernel socket users from corrupting socket's task_frag
 
  - stream: purge sk_error_queue in sk_stream_kill_queues()
 
  - openvswitch: fix flow lookup to use unmasked key
 
  - dsa: mv88e6xxx: avoid reg_lock deadlock in mv88e6xxx_setup_port()
 
  - devlink:
    - hold region lock when flushing snapshots
    - protect devlink dump by the instance lock
 
 Previous releases - always broken:
 
  - bpf:
    - prevent leak of lsm program after failed attach
    - resolve fext program type when checking map compatibility
 
  - skbuff: account for tail adjustment during pull operations
 
  - macsec: fix net device access prior to holding a lock
 
  - bonding: switch back when high prio link up
 
  - netfilter: flowtable: really fix NAT IPv6 offload
 
  - enetc: avoid buffer leaks on xdp_do_redirect() failure
 
  - unix: fix race in SOCK_SEQPACKET's unix_dgram_sendmsg()
 
  - dsa: microchip: remove IRQF_TRIGGER_FALLING in request_threaded_irq
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmOiGa4ACgkQMUZtbf5S
 IrvetBAAg/AjgG51gboLsuGjgRSwAi5T6ijgVR+pW+kMuoOdaamOF+h/zC1ox/H9
 QrWvTBipy+EqSD8bM4Xz0FNgidch8X4iWYhKGZuBht/4NP5FOzPUG2mNlUy5ANGq
 QZcCw6CUsir8HTb+IJpFEIq0JMwzKCm3WyAkYjEj4iuft0Y93cAgjkMVwoX0RERO
 o/pslC5dsozCLJxEglpw1aJq7aoroNuRSGSXl95nv8fU3UxmUXajnA3HNscXImdV
 6uqSIuyPIaGocpCBPRKUQd0sctkTY4cm8wmxxMCDVsBRVusoaq5eg1VRvxJm9Rxj
 gvDvHvfhnEuSigFF5A+paBp4c+i3C8g/UTBJTtptdAC+Y2tt4UT3Q5aaazYUOAqd
 W4TSJ3bk5zhkhpRF9clb0fNQaM1HOT4rkDEEGTfVN62dtHfPKpNwYufQKaYHdVj1
 RJ3ooH6c7TMVaRs6ZgEWNYToKZj94SIfPhfEhuqWXdNMDBkUMp2BXFFOp9fZDWju
 PsMQrRD7n6+XXpNvScYtnJDORqfIL9yHGZE9kxZA5QSDl9cnPA3SUbNruQPlXHrl
 w0yQlYuG3gcciua4dXaLfz1iN4rPdenuYhVBHhztEwDKl+b61CVQYlOHGkXPVURp
 oft74qCCFbva+Hf/7jENQotjT1tLfxAGdUARuFeDBueJgDRAPsw=
 =goV5
 -----END PGP SIGNATURE-----

Merge tag 'net-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf, netfilter and can.

  Current release - regressions:

   - bpf: synchronize dispatcher update with bpf_dispatcher_xdp_func

   - rxrpc:
      - fix security setting propagation
      - fix null-deref in rxrpc_unuse_local()
      - fix switched parameters in peer tracing

  Current release - new code bugs:

   - rxrpc:
      - fix I/O thread startup getting skipped
      - fix locking issues in rxrpc_put_peer_locked()
      - fix I/O thread stop
      - fix uninitialised variable in rxperf server
      - fix the return value of rxrpc_new_incoming_call()

   - microchip: vcap: fix initialization of value and mask

   - nfp: fix unaligned io read of capabilities word

  Previous releases - regressions:

   - stop in-kernel socket users from corrupting socket's task_frag

   - stream: purge sk_error_queue in sk_stream_kill_queues()

   - openvswitch: fix flow lookup to use unmasked key

   - dsa: mv88e6xxx: avoid reg_lock deadlock in mv88e6xxx_setup_port()

   - devlink:
      - hold region lock when flushing snapshots
      - protect devlink dump by the instance lock

  Previous releases - always broken:

   - bpf:
      - prevent leak of lsm program after failed attach
      - resolve fext program type when checking map compatibility

   - skbuff: account for tail adjustment during pull operations

   - macsec: fix net device access prior to holding a lock

   - bonding: switch back when high prio link up

   - netfilter: flowtable: really fix NAT IPv6 offload

   - enetc: avoid buffer leaks on xdp_do_redirect() failure

   - unix: fix race in SOCK_SEQPACKET's unix_dgram_sendmsg()

   - dsa: microchip: remove IRQF_TRIGGER_FALLING in
     request_threaded_irq"

* tag 'net-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (64 commits)
  net: fec: check the return value of build_skb()
  net: simplify sk_page_frag
  Treewide: Stop corrupting socket's task_frag
  net: Introduce sk_use_task_frag in struct sock.
  mctp: Remove device type check at unregister
  net: dsa: microchip: remove IRQF_TRIGGER_FALLING in request_threaded_irq
  can: kvaser_usb: hydra: help gcc-13 to figure out cmd_len
  can: flexcan: avoid unbalanced pm_runtime_enable warning
  Documentation: devlink: add missing toc entry for etas_es58x devlink doc
  mctp: serial: Fix starting value for frame check sequence
  nfp: fix unaligned io read of capabilities word
  net: stream: purge sk_error_queue in sk_stream_kill_queues()
  myri10ge: Fix an error handling path in myri10ge_probe()
  net: microchip: vcap: Fix initialization of value and mask
  rxrpc: Fix the return value of rxrpc_new_incoming_call()
  rxrpc: rxperf: Fix uninitialised variable
  rxrpc: Fix I/O thread stop
  rxrpc: Fix switched parameters in peer tracing
  rxrpc: Fix locking issues in rxrpc_put_peer_locked()
  rxrpc: Fix I/O thread startup getting skipped
  ...
2022-12-21 08:41:32 -08:00
Namhyung Kim
59119c09ae perf lock contention: Factor out lock_type_table
Move it out of get_type_str() so that we can reuse the table for others
later.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221219201732.460111-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-20 15:18:29 -03:00
Yang Jihong
8b269b7555 perf probe: Check -v and -q options in the right place
Check the -q and -v options first to return earlier on error.

Before:

  # perf probe -q -v test
  probe-definition(0): test
  symbol:test file:(null) line:0 offset:0 return:0 lazy:(null)
  0 arguments
    Error: -v and -q are exclusive.

After:

  # perf probe -q -v test
    Error: -v and -q are exclusive.
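
The reordering can be pictured with a small sketch. It is written in Python
rather than perf's C, and parse_probe_args() with its return values is a
hypothetical stand-in for the real option handling:

```python
def parse_probe_args(argv):
    """Return (error, log); reject -q together with -v before other work."""
    log = []
    quiet = "-q" in argv
    verbose = "-v" in argv
    # Validate the mutually exclusive options first, so nothing is parsed
    # or printed before the error is reported.
    if quiet and verbose:
        return "Error: -v and -q are exclusive.", log
    for arg in argv:
        if verbose and not arg.startswith("-"):
            log.append(f"probe-definition({len(log)}): {arg}")
    return None, log


error, log = parse_probe_args(["-q", "-v", "test"])
print(error)
print(log)
```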

Fixes: 5e17b28f1e ("perf probe: Add --quiet option to suppress output result message")
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20221220035702.188413-4-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-20 15:16:33 -03:00
Yang Jihong
7c0a6144f9 perf tools: Fix usage of the verbose variable
The verbose variable is an integer and can be negative, so replace the
improperly used cases in a unified manner:
 1. if (verbose)        => if (verbose > 0)
 2. if (!verbose)       => if (verbose <= 0)
 3. if (XX && verbose)  => if (XX && verbose > 0)
 4. if (XX && !verbose) => if (XX && verbose <= 0)
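
A plain truth test is wrong here because any nonzero value, including a
negative one, counts as true. Python integers behave the same way as C in
this respect, so the point can be shown with a short sketch. Using -1 as
the quiet-mode value is an assumption made for illustration:

```python
def is_verbose_buggy(verbose):
    # Mirrors 'if (verbose)': true for every nonzero value, even -1.
    return bool(verbose)


def is_verbose_fixed(verbose):
    # Mirrors 'if (verbose > 0)': true only for real verbosity levels.
    return verbose > 0


for level in (-1, 0, 1, 2):
    print(level, is_verbose_buggy(level), is_verbose_fixed(level))
```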

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20221220035702.188413-3-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-20 15:16:33 -03:00
Yang Jihong
188ac720d3 perf debug: Set debug_peo_args and redirect_to_stderr variable to correct values in perf_quiet_option()
When perf uses quiet mode, perf_quiet_option() sets the 'debug_peo_args'
variable to -1, and display_attr() incorrectly determines the value of
'debug_peo_args'.  As a result, unexpected information is displayed.

Before:

  # perf record --quiet -- ls > /dev/null
  ------------------------------------------------------------
  perf_event_attr:
    size                             128
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD
    read_format                      ID|LOST
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    enable_on_exec                   1
    task                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
    bpf_event                        1
  ------------------------------------------------------------
  ...

After:
  # perf record --quiet -- ls > /dev/null
  #

redirect_to_stderr has a similar problem.

Fixes: f78eaef0e0 ("perf tools: Allow to force redirect pr_debug to stderr.")
Fixes: ccd26741f5 ("perf tool: Provide an option to print perf_event_open args and return value")
Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: martin.lau@kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: https://lore.kernel.org/r/20221220035702.188413-2-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-20 15:16:33 -03:00
Arnaldo Carvalho de Melo
b235e5b51f tools headers UAPI: Sync linux/kvm.h with the kernel sources
To pick the changes in:

  86bdf3ebcf ("KVM: Support dirty ring in conjunction with bitmap")

That just rebuilds perf, as these patches don't add any new KVM ioctl to
be harvested for the 'perf trace' ioctl syscall argument
beautifiers.

This is also by now used by tools/testing/selftests/kvm/, a simple test
build didn't succeed, but for another reason:

  lib/kvm_util.c: In function ‘vm_enable_dirty_ring’:
  lib/kvm_util.c:125:30: error: ‘KVM_CAP_DIRTY_LOG_RING_ACQ_REL’ undeclared (first use in this function); did you mean ‘KVM_CAP_DIRTY_LOG_RING’?
    125 |         if (vm_check_cap(vm, KVM_CAP_DIRTY_LOG_RING_ACQ_REL))
        |                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        |                              KVM_CAP_DIRTY_LOG_RING

I'll send a separate patch for that.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
  diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lore.kernel.org/lkml/Y6H3b1Q4Msjy5Yz3@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-20 15:15:57 -03:00
Arnaldo Carvalho de Melo
6d5edd15c9 tools headers UAPI: Sync powerpc syscall table with the kernel sources
To pick the changes in these csets:

  ce883a2ba3 ("powerpc/32: fix syscall wrappers with 64-bit arguments")

That doesn't cause any changes in the perf tools.

This table is used in tools perf to allow features as described in the
last update to this file.

This addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
  diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Y6H0C5plZ4V4aiPm@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-20 14:42:49 -03:00
Arnaldo Carvalho de Melo
a66558dcb1 tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes in:

  97fa21f65c ("x86/resctrl: Move MSR defines into msr-index.h")
  7420ae3bb9 ("x86/intel_epb: Set Alder Lake N and Raptor Lake P normal EPB")

Addressing these tools/perf build warnings:

    diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
    Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'

That makes the beautification scripts pick up some new entries:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  --- before	2022-12-20 14:28:40.893794072 -0300
  +++ after	2022-12-20 14:28:54.831993914 -0300
  @@ -266,6 +266,7 @@
   	[0xc0000104 - x86_64_specific_MSRs_offset] = "AMD64_TSC_RATIO",
   	[0xc000010e - x86_64_specific_MSRs_offset] = "AMD64_LBR_SELECT",
   	[0xc000010f - x86_64_specific_MSRs_offset] = "AMD_DBG_EXTN_CFG",
  +	[0xc0000200 - x86_64_specific_MSRs_offset] = "IA32_MBA_BW_BASE",
   	[0xc0000300 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_STATUS",
   	[0xc0000301 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_CTL",
   	[0xc0000302 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_STATUS_CLR",
  $

Now one can trace systemwide asking to see backtraces to where that MSR
is being read/written, see this example with a previous update:

  # perf trace -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
  ^C#

If we use -v (verbose mode) we can see what it does behind the scenes:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
  Using CPUID AuthenticAMD-25-21-0
  0x6a0
  0x6a8
  New filter for msr:read_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
  0x6a0
  0x6a8
  New filter for msr:write_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
  mmap size 528384B
  ^C#

Example with a frequent msr:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2
  Using CPUID AuthenticAMD-25-21-0
  0x48
  New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  0x48
  New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  mmap size 528384B
  Looking at the vmlinux_path (8 entries long)
  symsrc__init: build id mismatch for vmlinux.
  Using /proc/kcore for kernel data
  Using /proc/kallsyms for symbols
     0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       __switch_to_xtra ([kernel.kallsyms])
                                       __switch_to ([kernel.kallsyms])
                                       __schedule ([kernel.kallsyms])
                                       schedule ([kernel.kallsyms])
                                       futex_wait_queue_me ([kernel.kallsyms])
                                       futex_wait ([kernel.kallsyms])
                                       do_futex ([kernel.kallsyms])
                                       __x64_sys_futex ([kernel.kallsyms])
                                       do_syscall_64 ([kernel.kallsyms])
                                       entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
                                       __futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so)
     0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       __switch_to_xtra ([kernel.kallsyms])
                                       __switch_to ([kernel.kallsyms])
                                       __schedule ([kernel.kallsyms])
                                       schedule_idle ([kernel.kallsyms])
                                       do_idle ([kernel.kallsyms])
                                       cpu_startup_entry ([kernel.kallsyms])
                                       secondary_startup_64_no_verify ([kernel.kallsyms])
  #
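
The verbose output above suggests that symbolic MSR names in the filter are
replaced by their numeric values before the filter is installed. A rough
Python sketch of that substitution follows. The table holds only the three
values visible in the outputs above, and resolve_msr_names() is a
hypothetical helper, not perf's implementation:

```python
import re

# Only the values that appear in the outputs above; not a complete table.
MSR_VALUES = {
    "IA32_SPEC_CTRL": 0x48,
    "IA32_U_CET": 0x6A0,
    "IA32_INT_SSP_TAB": 0x6A8,
}


def resolve_msr_names(filter_expr):
    """Replace known MSR names in a filter string with hex constants."""
    def substitute(match):
        name = match.group(0)
        # Leave unknown identifiers (such as the field name 'msr') untouched.
        return hex(MSR_VALUES[name]) if name in MSR_VALUES else name

    return re.sub(r"[A-Za-z_][A-Za-z0-9_]*", substitute, filter_expr)


print(resolve_msr_names("msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"))
print(resolve_msr_names("msr==IA32_SPEC_CTRL"))
```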

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/lkml/Y6HyTOGRNvKfCVe4@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-20 14:36:47 -03:00
Arnaldo Carvalho de Melo
eeac18e2bf tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
To pick up the changes in:

  bc7ed4d308 ("drm/i915/perf: Apply Wa_18013179988")
  81d5f7d914 ("drm/i915/perf: Add 32-bit OAG and OAR formats for DG2")
  8133a6daad ("drm/i915: enable PS64 support for DG2")
  b76c14c8fb ("drm/i915/huc: better define HuC status getparam possible return values.")
  94dfc73e7c ("treewide: uapi: Replace zero-length arrays with flexible-array members")

That doesn't add any ioctl, so no changes in tooling.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
  diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://lore.kernel.org/lkml/Y6HukoRaZh2R4j5U@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-20 14:28:13 -03:00
Alessandro Carminati
bfa87ac86c rv/monitors: Move monitor structure in rodata
It makes sense to move the important monitor structure into rodata to
prevent accidental structure modification.

Link: https://lkml.kernel.org/r/20221122173648.4732-1-acarmina@redhat.com

Signed-off-by: Alessandro Carminati <acarmina@redhat.com>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-12-20 11:46:40 -05:00
Linus Torvalds
32d528c4b8 SPDX/License additions for 6.2-rc1
Here are 2 small updates for LICENSES and some kernel files that add the
 Copyleft-next license and use it in a SPDX tag as a dual-license for
 some kernel files.
 
 These have been discussed thoroughly in public on the linux-spdx mailing
 list, and have the needed acks on them, as well as having been in
 linux-next with no reported issues for quite some time.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY6F1Qg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynGWwCfVJ+Z1CVWSFC8KaaGNiFu/gXmgNUAoKy11gWJ
 8igpSNEkOiGiaGA+AvN+
 =j8iu
 -----END PGP SIGNATURE-----

Merge tag 'spdx-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx

Pull SPDX/License additions from Greg KH:
 "Here are two small updates for LICENSES and some kernel files that add
  the Copyleft-next license and use it in a SPDX tag as a dual-license
  for some kernel files.

  These have been discussed thoroughly in public on the linux-spdx
  mailing list, and have the needed acks on them, as well as having been
  in linux-next with no reported issues for quite some time"

* tag 'spdx-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
  testing: use the copyleft-next-0.3.1 SPDX tag
  LICENSES: Add the copyleft-next-0.3.1 license
2022-12-20 08:53:16 -06:00
Linus Torvalds
35f79d0e2c parisc architecture fixes for kernel v6.2-rc1:
Fixes:
 - Fix potential null-ptr-deref in start_task()
 - Fix kgdb console on serial port
 - Add missing FORCE prerequisites in Makefile
 - Drop PMD_SHIFT from calculation in pgtable.h
 
 Enhancements:
 - Implement a wrapper to align madvise() MADV_* constants with other
   architectures
 - If machine supports running MPE/XL, show the MPE model string
 
 Cleanups:
 - Drop duplicate kgdb console code
 - Indenting fixes in setup_cmdline()
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCY6B/cgAKCRD3ErUQojoP
 X85pAQCC6YpSYON3KZRfABeiDTRCKcGm72p7JQRnyj88XCq6ZAEA40T2qpRpjoYi
 NaXr28mxHFYh4Z0c5Y7K5EuFTT7gAA4=
 =e2Jd
 -----END PGP SIGNATURE-----

Merge tag 'parisc-for-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc updates from Helge Deller:
 "There is one notable patch, which allows the parisc kernel to use the
  same MADV_xxx constants as the other architectures going forward. With
  that change only alpha has one entry left (MADV_DONTNEED is 6 vs 4 on
  others) which is different. To prevent an ABI breakage, a wrapper is
  included which translates old MADV values to the new ones, so existing
  userspace isn't affected. Reason for that patch is, that some
  applications wrongly used the standard MADV_xxx values even on some
  non-x86 platforms and as such those programs failed to run correctly
  on parisc (examples are qemu-user, tor browser and boringssl).

  Then the kgdb console and the LED code received some fixes, and some
  0-day warnings are now gone. Finally, the very last compile warning
  which was visible during a kernel build is now fixed too (in the vDSO
  code).

  The majority of the patches are tagged for stable series and in
  summary this patchset is quite small and drops more code than it adds:

  Fixes:
   - Fix potential null-ptr-deref in start_task()
   - Fix kgdb console on serial port
   - Add missing FORCE prerequisites in Makefile
   - Drop PMD_SHIFT from calculation in pgtable.h

  Enhancements:
   - Implement a wrapper to align madvise() MADV_* constants with other
     architectures
   - If machine supports running MPE/XL, show the MPE model string

  Cleanups:
   - Drop duplicate kgdb console code
   - Indenting fixes in setup_cmdline()"
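
A wrapper of the kind described above can be pictured as a lookup that
rewrites legacy values before the generic code sees them. The Python sketch
below is only a model: the new values are the generic Linux MADV_* numbers,
while the legacy parisc numbers are assumptions that should be checked
against the parisc headers, and translate_madv() is a hypothetical name:

```python
# Generic Linux values for a few MADV_* constants.
MADV_MERGEABLE = 12
MADV_UNMERGEABLE = 13
MADV_HUGEPAGE = 14
MADV_NOHUGEPAGE = 15

# Legacy parisc numbers: assumed for illustration, verify before relying on them.
LEGACY_TO_GENERIC = {
    65: MADV_MERGEABLE,
    66: MADV_UNMERGEABLE,
    67: MADV_HUGEPAGE,
    68: MADV_NOHUGEPAGE,
}


def translate_madv(behavior):
    """Map a legacy value to its generic equivalent, pass others through."""
    return LEGACY_TO_GENERIC.get(behavior, behavior)


print(translate_madv(65))               # old binary asking for MADV_MERGEABLE
print(translate_madv(MADV_MERGEABLE))   # new binary, value left unchanged
```

Passing unknown values through unchanged is what lets old and new userspace
share one entry point.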

* tag 'parisc-for-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Show MPE/iX model string at bootup
  parisc: Add missing FORCE prerequisites in Makefile
  parisc: Move pdc_result struct to firmware.c
  parisc: Drop locking in pdc console code
  parisc: Drop duplicate kgdb_pdc console
  parisc: Fix locking in pdc_iodc_print() firmware call
  parisc: Drop PMD_SHIFT from calculation in pgtable.h
  parisc: Align parisc MADV_XXX constants with all other architectures
  parisc: led: Fix potential null-ptr-deref in start_task()
  parisc: Fix inconsistent indenting in setup_cmdline()
2022-12-20 08:43:53 -06:00
Arnaldo Carvalho de Melo
43a3ce77ae tools headers UAPI: Sync linux/fscrypt.h with the kernel sources
To pick the changes from:

  f8b435f93b ("fscrypt: remove unused Speck definitions")
  e0cefada13 ("fscrypt: Add SM4 XTS/CTS symmetric algorithm support")

That doesn't result in any changes in tooling, it just causes this to be
rebuilt:

  CC      /tmp/build/perf-urgent/trace/beauty/sync_file_range.o
  LD      /tmp/build/perf-urgent/trace/beauty/perf-in.o

addressing this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/fscrypt.h' differs from latest version at 'include/uapi/linux/fscrypt.h'
  diff -u tools/include/uapi/linux/fscrypt.h include/uapi/linux/fscrypt.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Link: https://lore.kernel.org/lkml/Y6CHSS6Rn9YOqpAd@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-19 12:46:36 -03:00
Arnaldo Carvalho de Melo
51c4f2bf53 tools headers cpufeatures: Sync with the kernel sources
To pick the changes from:

  5e85c4ebf2 ("x86: KVM: Advertise AVX-IFMA CPUID to user space")
  af2872f622 ("x86: KVM: Advertise AMX-FP16 CPUID to user space")
  6a19d7aa58 ("x86: KVM: Advertise CMPccXADD CPUID to user space")
  aaa65d17ee ("x86/tsx: Add a feature bit for TSX control MSR support")
  b1599915f0 ("x86/cpufeatures: Move X86_FEATURE_CALL_DEPTH from bit 18 to bit 19 of word 11, to leave space for WIP X86_FEATURE_SGX_EDECCSSA bit")
  16a7fe3728 ("KVM/VMX: Allow exposing EDECCSSA user leaf function to KVM guest")
  80e4c1cd42 ("x86/retbleed: Add X86_FEATURE_CALL_DEPTH")
  7df548840c ("x86/bugs: Add "unknown" reporting for MMIO Stale Data")

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiaxi Chen <jiaxi.chen@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kai Huang <kai.huang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/Y6CD%2FIcEbDW5X%2FpN@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-19 12:38:33 -03:00
Arnaldo Carvalho de Melo
0bc1d0e2c1 tools headers disabled-cpufeatures: Sync with the kernel sources
To pick the changes from:

  15e15d64bd ("x86/cpufeatures: Add X86_FEATURE_XENPV to disabled-features.h")

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/disabled-features.h' differs from latest version at 'arch/x86/include/asm/disabled-features.h'
  diff -u tools/arch/x86/include/asm/disabled-features.h arch/x86/include/asm/disabled-features.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Y6B2w3WqifB%2FV70T@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-19 12:27:48 -03:00
Arnaldo Carvalho de Melo
30d647f5ba Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-19 12:27:25 -03:00
Arnaldo Carvalho de Melo
66dfc517e8 perf python: Don't stop building if python setuptools isn't installed
The python3-setuptools package is needed to build the python binding, so
that one can use things like:

  # ~acme/git/perf/tools/perf/python/twatch.py
  cpu: 6, pid: 4573, tid: 2184618 { type: exit, pid: 4573, ppid: 4172, tid: 2184618, ptid: 4172, time: 12563190090107}
  cpu: 24, pid: 4573, tid: 4573 { type: fork, pid: 4573, ppid: 4573, tid: 2190991, ptid: 4573, time: 12563415289331}
  cpu: 29, pid: 4573, tid: 2190991 { type: comm, pid: 4573, tid: 2190991, comm: StreamT~ns #401 }
  cpu: 29, pid: 4573, tid: 2190991 { type: comm, pid: 4573, tid: 2190991, comm: StreamT~ns #401 }
  ^CTraceback (most recent call last):
    File "/var/home/acme/git/perf/tools/perf/python/twatch.py", line 61, in <module>
      main()
    File "/var/home/acme/git/perf/tools/perf/python/twatch.py", line 33, in main
      evlist.poll(timeout = -1)
  KeyboardInterrupt

  #

Such scripts have 'import perf;'.

But distros don't always have that python3-setuptools (or equivalent)
installed, which was breaking the build. Just check if it is installed
and emit a warning that such binding isn't being built and continue the
build without it:

With it:

  $ rpm -q python3-setuptools
  python3-setuptools-59.6.0-3.fc36.noarch
  $ rm -rf /tmp/build/perf; mkdir -p /tmp/build/perf
  $ make O=/tmp/build/perf -C tools/perf install-bin
  make: Entering directory '/var/home/acme/git/perf/tools/perf'
  <SNIP>
  ...                               libpython: [ on  ]
  <SNIP>
    GEN     /tmp/build/perf/python/perf.cpython-310-x86_64-linux-gnu.so
  <SNIP>
  $ ls -la /tmp/build/perf/python/perf.cpython-310-x86_64-linux-gnu.so
  -rwxr-xr-x. 1 acme acme 1609112 Dec 17 11:39 /tmp/build/perf/python/perf.cpython-310-x86_64-linux-gnu.so
  $

Without it:

  $ sudo rpm -e python3-setuptools
  $ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf
  $ make O=/tmp/build/perf -C tools/perf install-bin
  make: Entering directory '/var/home/acme/git/perf/tools/perf'
  <SNIP>
  ...                               libpython: [ on  ]
  <SNIP>
  $ ls -la /tmp/build/perf/python/perf.cpython-310-x86_64-linux-gnu.so
  ls: cannot access '/tmp/build/perf/python/perf.cpython-310-x86_64-linux-gnu.so': No such file or directory
  $
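
The check described above amounts to probing for the module before
attempting to build the binding. A minimal Python sketch of such a probe
follows. perf's build does this from its Makefiles, so the function name
and the warning text here are assumptions made for illustration:

```python
import importlib.util


def python_binding_enabled(module="setuptools"):
    """Return True if the module needed to build the binding is importable."""
    if importlib.util.find_spec(module) is None:
        # Warn and carry on instead of failing the whole build.
        print(f"Warning: {module} not found, skipping the python binding")
        return False
    return True


if python_binding_enabled():
    print("building python binding")
else:
    print("continuing without python binding")
```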

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/Y53XHw3rlsaaUgOs@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-19 12:26:58 -03:00
Linus Torvalds
5f6e430f93 powerpc updates for 6.2

Merge tag 'powerpc-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Add powerpc qspinlock implementation optimised for large system
   scalability and paravirt. See the merge message for more details

 - Enable objtool to be built on powerpc to generate mcount locations

 - Use a temporary mm for code patching with the Radix MMU, so the
   writable mapping is restricted to the patching CPU

 - Add an option to build the 64-bit big-endian kernel with the ELFv2
   ABI

 - Sanitise user registers on interrupt entry on 64-bit Book3S

 - Many other small features and fixes

Thanks to Aboorva Devarajan, Angel Iglesias, Benjamin Gray, Bjorn
Helgaas, Bo Liu, Chen Lifu, Christoph Hellwig, Christophe JAILLET,
Christophe Leroy, Christopher M. Riedl, Colin Ian King, Deming Wang,
Disha Goel, Dmitry Torokhov, Finn Thain, Geert Uytterhoeven, Gustavo A.
R. Silva, Haowen Bai, Joel Stanley, Jordan Niethe, Julia Lawall, Kajol
Jain, Laurent Dufour, Li zeming, Miaoqian Lin, Michael Jeanson, Nathan
Lynch, Naveen N. Rao, Nayna Jain, Nicholas Miehlbradt, Nicholas Piggin,
Pali Rohár, Randy Dunlap, Rohan McLure, Russell Currey, Sathvika
Vasireddy, Shaomin Deng, Stephen Kitt, Stephen Rothwell, Thomas
Weißschuh, Tiezhu Yang, Uwe Kleine-König, Xie Shaowen, Xiu Jianfeng,
XueBing Chen, Yang Yingliang, Zhang Jiaming, ruanjinjie, Jessica Yu,
and Wolfram Sang.

* tag 'powerpc-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (181 commits)
  powerpc/code-patching: Fix oops with DEBUG_VM enabled
  powerpc/qspinlock: Fix 32-bit build
  powerpc/prom: Fix 32-bit build
  powerpc/rtas: mandate RTAS syscall filtering
  powerpc/rtas: define pr_fmt and convert printk call sites
  powerpc/rtas: clean up includes
  powerpc/rtas: clean up rtas_error_log_max initialization
  powerpc/pseries/eeh: use correct API for error log size
  powerpc/rtas: avoid scheduling in rtas_os_term()
  powerpc/rtas: avoid device tree lookups in rtas_os_term()
  powerpc/rtasd: use correct OF API for event scan rate
  powerpc/rtas: document rtas_call()
  powerpc/pseries: unregister VPA when hot unplugging a CPU
  powerpc/pseries: reset the RCU watchdogs after a LPM
  powerpc: Take in account addition CPU node when building kexec FDT
  powerpc: export the CPU node count
  powerpc/cpuidle: Set CPUIDLE_FLAG_POLLING for snooze state
  powerpc/dts/fsl: Fix pca954x i2c-mux node names
  cxl: Remove unnecessary cxl_pci_window_alignment()
  selftests/powerpc: Fix resource leaks
  ...
2022-12-19 07:13:33 -06:00
Linus Torvalds
1ea9d333ba - A few late-breaking minor fixups

Merge tag 'mm-stable-2022-12-17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more mm updates from Andrew Morton:

 - A few late-breaking minor fixups

 - Two minor feature patches which were awkwardly dependent on mm-nonmm.
   I need to set up a new branch to handle such things.

* tag 'mm-stable-2022-12-17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  MAINTAINERS: zram: zsmalloc: Add an additional co-maintainer
  mm/kmemleak: use %pK to display kernel pointers in backtrace
  mm: use stack_depot for recording kmemleak's backtrace
  maple_tree: update copyright dates for test code
  maple_tree: fix mas_find_rev() comment
  mm/gup_test: free memory allocated via kvcalloc() using kvfree()
2022-12-19 06:58:57 -06:00
Helge Deller
71bdea6f79 parisc: Align parisc MADV_XXX constants with all other architectures
Adjust some MADV_XXX constants to be in sync with their values on
all other platforms. There is currently no reason for parisc to have
its own numbering, and it requires workarounds in many userspace
sources (e.g. glibc, qemu, ...), which are often forgotten and thus
introduce bugs and different behaviour on parisc.

A wrapper avoids an ABI breakage for existing userspace applications by
translating any old values to the new ones, so this change allows us to
move over all programs to the new ABI over time.

Signed-off-by: Helge Deller <deller@gmx.de>
2022-12-17 23:19:39 +01:00
Linus Torvalds
aa4800e31c perf tools changes for v6.2: 1st batch

Merge tag 'perf-tools-for-v6.2-1-2022-12-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools updates from Arnaldo Carvalho de Melo:
 "Libraries:

   - Drop the old copy of libtraceevent in tools/lib/traceevent/ now
     that all major distros ship it from its external repository.

     This is now just another feature detection, emitting a warning when
     the libtraceevent-dev[el] package isn't installed, disabling the
     build of perf features and tools that strictly require parsing
     things from tracefs while keeping the core functionality present
     and working with a subset of the events, the most used ones like
     CPU cycles, hardware cache and also vendor events, etc.

     This was tested with lots of containers for Fedora, Debian,
     OpenSUSE, Alpine Linux, Ubuntu, with cross builds, etc.

  Build:

   - Update the C standard to gnu11, as was done for the kernel.

   - Install the tools/lib/ libraries locally instead of having headers
     searched directly from the source code directories, to help the
     cases where we can build either from in-kernel source libraries or
     from the same library shipped as a distro package, as is the case
     with libbpf and was the case with libtraceevent.

  perf stat:

   - Do not delay the workload with --delay; the delay only postpones
     the start of event counting, to skip noise at workload startup.

   - When we have events for each cgroup, the metric should be printed
     for each cgroup separately.

        $ perf stat -a --for-each-cgroup system.slice,user.slice --metric-only sleep 1

        Performance counter stats for 'system wide':

                        GHz  insn per cycle  branch-misses of all branches
        system.slice  3.792      0.61                  3.24%
        user.slice    3.661      2.32                  0.37%

   - Fix printing field separator in CSV metrics output.

   - Fix --metric-only --json output.

   - Fix summary output in CSV with --metric-only.

   - Update event group check for support of uncore event.

  perf test:

   - Stop requiring a C toolchain in shell tests. Instead, the former C
     snippets are built into perf as workloads, run via 'perf test -w',
     and the 'perf test' shell scripts use those.

   - Add event group test for events in multiple PMUs.

   - The "kernel lock contention analysis" test should not print
     warnings in quiet mode.

   - Add attr tests for ARM64's new VG register.

   - Fix record test on KVM guests: using the precise flag with the
     br_inst_retired.near_call event causes the test to fail there,
     even when the guests have PMU forwarding enabled and the event
     itself is supported, so just remove the precise flag from the
     event.

   - Add mechanism for skipping attr tests on specific kernel versions
     where it is known that these checks will fail.

   - Skip watchpoint tests if no watchpoints available.

   - Add more Intel PT 'perf test' entries: hybrid CPUs, split the
     packet decoder into a suite of subtests.

  perf script:

   - Introduce task analyzer Python script. One first records some
     events, which can be done in two ways:

        $ perf script record task-analyzer -- sleep 10
        $ perf record -e sched:sched_switch -a -- sleep 10

     The script can parse any perf.data file, as long as it has
     sched:sched_switch events; other events will be ignored.

     The simplest report use case is to just call the script without
     arguments.

     Runtime is the time the task was running on the CPU. Time Out-In is
     the time between the process being scheduled *out* and scheduled
     back *in*, i.e. the last time span between two executions:

        $ perf script report task-analyzer
            Switched-In     Switched-Out CPU    PID    TID             Comm  Runtime  Time Out-In
        15576.658891407  15576.659156086   4   2412   2428            gdbus      265         1949
        15576.659111320  15576.659455410   0   2412   2412      gnome-shell      344         2267
        15576.659491326  15576.659506173   2     74     74      kworker/2:1       15        13145
        15576.659506173  15576.659825748   2   2858   2858  gnome-terminal-      320        63263
        15576.659871270  15576.659902872   6  20932  20932    kworker/u16:0       32      2314582
        15576.659909951  15576.659945501   3  27264  27264               sh       36           -1
        15576.659853285  15576.659971052   7  27265  27265             perf      118      5050741
        [...]

  perf lock:

   - Allow concurrent record and report to support live monitoring of
     kernel lock contention without BPF:

        # perf lock record -a -o- sleep 1 | perf lock contention -i-
         contended   total wait     max wait     avg wait         type   caller

                 2     10.27 us      6.17 us      5.13 us     spinlock   load_balance+0xc03
                 1      5.29 us      5.29 us      5.29 us     rwlock:W   ep_scan_ready_list+0x54
                 1      4.12 us      4.12 us      4.12 us     spinlock   smpboot_thread_fn+0x116
                 1      3.28 us      3.28 us      3.28 us        mutex   pipe_read+0x50

   - Implement -t/--threads option when using BPF:

        $ sudo ./perf lock contention -abt -E 5 sleep 1
         contended  total wait   max wait   avg wait      pid  comm

                 1   740.66 ms  740.66 ms  740.66 ms     1950  nv_queue
                 3   305.50 ms  298.19 ms  101.83 ms     1884  nvidia-modeset/
                 1    25.14 us   25.14 us   25.14 us  2725038  EventManager_De
                12    23.09 us    9.30 us    1.92 us        0  swapper
                 1    20.18 us   20.18 us   20.18 us  2725033  EventManager_De

   - Add -l/--lock-addr to aggregate per-lock-instance contention:

        $ sudo ./perf lock contention -abl sleep 1
         contended  total wait  max wait  avg wait           address  symbol

                 1    36.28 us  36.28 us  36.28 us  ffff92615d6448b8
                 9    10.91 us   1.84 us   1.21 us  ffffffffbaed50c0  rcu_state
                 1    10.49 us  10.49 us  10.49 us  ffff9262ac4f0c80
                 8     4.68 us   1.67 us    585 ns  ffffffffbae07a40  jiffies_lock
                 3     3.03 us   1.45 us   1.01 us  ffff9262277861e0
                 1      924 ns    924 ns    924 ns  ffff926095ba9d20
                 1      436 ns    436 ns    436 ns  ffff9260bfda4f60

  perf record:

   - Add remaining branch filters: "no_cycles", "no_flags" & "hw_index",
     to be used with hardware such as Intel's LBR that allows things
     like stitching stacks of two samples to overcome the limits of the
     number of LBR registers.

  Symbol resolution:

   - Handle .debug files created with 'objcopy --only-keep-debug', where
     program headers are zeroed and thus can't be used for adjustments,
     use the info in the runtime_ss (runtime ELF) instead.

  perf trace:

   - Add BPF based augmenter for the 'perf_event_open's 'struct
     perf_event_attr' argument.

   - Add BPF based augmenter for the 'clock_gettime's 'struct timespec'
     argument.

   - In both cases the syscall tracepoint has just the pointer value, so
     we need to hook a BPF program to collect the pointer contents and
     then, in userspace, pretty print it in 'perf trace'.

  perf list:

   - Introduce JSON output of events.

   - Streamline how the expression specifying what events should be
     shown is handled, fixing several corner cases, such as the metric
     filter that is specified as a glob but was using strstr().

  perf probe:

   - Fix to avoid crashing if DW_AT_decl_file is NULL, coping with clang
     generating DWARF5 like that.

   - Use dwarf_attr_integrate() as generic DWARF attr accessor as it
     supersedes dwarf_attr(), supporting abstract origin DIEs.

  perf inject:

   - Set PERF_RECORD_MISC_BUILD_ID_SIZE in the PERF_RECORD_HEADER_BUILD_ID
     so that perf.data readers can get the real build-id size and avoid
     trailing zeroes.

  perf data:

   - Add tracepoint fields when converting a perf.data file to JSON.

  arm64:

   - Fix mksyscalltbl, don't lose syscalls due to sort -nu.

   - Add Arm Neoverse V2 PMU events.
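
As a rough illustration of how 'sort -nu' can drop lines, as in the
mksyscalltbl item above, here is a sketch with made-up entries. It does
not use the real mksyscalltbl input and may not match the exact failure
mode that was fixed; the table contents are assumptions for the example.

```shell
#!/bin/sh
# Made-up table entries: name first, number second.
table='read 63
write 64
openat 56'

# -n compares leading numbers. These lines have none, so every key is
# treated as 0 and -u keeps a single line for the whole "equal" run.
lossy=$(printf '%s\n' "$table" | sort -nu | wc -l | tr -d ' ')

# Plain -u compares whole lines, so distinct entries survive.
kept=$(printf '%s\n' "$table" | sort -u | wc -l | tr -d ' ')

echo "sort -nu kept $lossy line(s), sort -u kept $kept line(s)"
```

The point of the sketch is that -u decides uniqueness with the same
comparison used for sorting, so a numeric sort can merge lines that are
textually different.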

  riscv:

   - Add riscv sbi firmware std event files.

   - Add Sifive U74 vendor events (JSON) file.

   - Add some more events and metrics for Alderlake/Alderlake-N.

  Documentation:

   - Add data documentation for the PMU structs in the C source code.

  Miscellaneous:

   - Periodic sanitization of headers, adding missing includes, removing
     needless ones, creating new ones, etc.

   - Use sig_atomic_t for signal handlers to avoid undefined behaviour
     in all perf tools.

   - Fixes for libbpf 1.0+ compatibility (maps, etc) on 'perf trace' BPF
     examples.

   - Remove some old perf bpf examples, leave the best ones that
     demonstrate how to associate BPF functions to points in the kernel.

   - Make quiet mode consistent between tools.

   - Use dedicated non-atomic clear/set bit helpers.

   - Use "grep -E" instead of "egrep" as recommended by warning emitted
     by GNU grep since at least version 3.8.

   - Complete list of supported subcommands in the 'perf daemon' help
     message.

   - Update John Garry's email address for arm64 perf tooling on the
     MAINTAINERS file, he moved from Huawei to Oracle"
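
The "grep -E" item in the Miscellaneous list above is a mechanical
substitution. A minimal sketch, with a made-up input list and pattern
chosen only for illustration:

```shell
#!/bin/sh
# grep -E enables extended regular expressions, which is what the
# obsolescent "egrep" spelling used to provide.
matches=$(printf 'cycles\ninstructions\nbranches\n' |
          grep -E '^(cycles|branches)$')

printf '%s\n' "$matches"
```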

* tag 'perf-tools-for-v6.2-1-2022-12-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (239 commits)
  libperf: Fix install_pkgconfig target
  perf tools: Use "grep -E" instead of "egrep"
  perf stat: Do not delay the workload with --delay
  perf evlist: Remove group option.
  perf build: Fix python/perf.so library's name
  perf test arm64: Add attr tests for new VG register
  perf test: Add mechanism for skipping attr tests on kernel versions
  perf test: Add mechanism for skipping attr tests on auxiliary vector values
  perf test: Add ability to test exit code for attr tests
  perf test: add new task-analyzer tests
  perf script: task-analyzer add csv support
  perf script: Introduce task analyzer python script
  perf cs-etm: Print auxtrace info even if OpenCSD isn't linked
  perf cs-etm: Cleanup cs_etm__process_auxtrace_info()
  perf cs-etm: Tidy up auxtrace info header printing
  perf cs-etm: Remove unused stub methods
  perf cs-etm: Print unknown header version as an error
  perf test: Update perf lock contention test
  perf lock contention: Add -l/--lock-addr option
  perf lock contention: Implement -t/--threads option for BPF
  ...
2022-12-16 13:21:20 -06:00
Jakub Kicinski
13e3c7793e bpf-for-netdev

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2022-12-16

We've added 7 non-merge commits during the last 2 day(s) which contain
a total of 9 files changed, 119 insertions(+), 36 deletions(-).

1) Fix for recent syzkaller XDP dispatcher update splat, from Jiri Olsa.

2) Fix BPF program refcount leak in LSM attachment failure path,
   from Milan Landaverde.

3) Fix BPF program type in map compatibility check for fext,
   from Toke Høiland-Jørgensen.

4) Fix a BPF selftest compilation error under !CONFIG_SMP config,
   from Yonghong Song.

5) Fix CI to enable CONFIG_FUNCTION_ERROR_INJECTION after it got changed
   to a prompt, from Song Liu.

6) Various BPF documentation fixes for socket local storage,
   from Donald Hunter.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Add a test for using a cpumap from an freplace-to-XDP program
  bpf: Resolve fext program type when checking map compatibility
  bpf: Synchronize dispatcher update with bpf_dispatcher_xdp_func
  bpf: prevent leak of lsm program after failed attach
  selftests/bpf: Select CONFIG_FUNCTION_ERROR_INJECTION
  selftests/bpf: Fix a selftest compilation error with CONFIG_SMP=n
  docs/bpf: Reword docs for BPF_MAP_TYPE_SK_STORAGE
====================

Link: https://lore.kernel.org/r/20221216174540.16598-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-16 10:41:56 -08:00
Alexander Gordeev
4ff17c448a libperf: Fix install_pkgconfig target
Commit 47e02b94a4 ("tools lib perf: Add dependency test to install_headers")
does not honor $(DESTDIR_SQ) in the install_pkgconfig target, which leads to
this error:

  install: cannot create regular file '/usr/lib64/pkgconfig/libperf.pc': Permission denied
  make: *** [Makefile:210: install_pkgconfig] Error 1
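
The error above is what an install step produces when it ignores the
staging prefix and writes straight into the system directory. A minimal
shell sketch of the idea follows. It is not the libperf Makefile rule:
the install_pc helper, its arguments and the temporary paths are
assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper: copy a .pc file under an optional staging prefix.
# $1: source file, $2: libdir (e.g. /usr/lib64), $3: staging prefix (may be empty)
install_pc() {
    dest="$3$2/pkgconfig"
    mkdir -p "$dest" && cp "$1" "$dest/"
}

stage=$(mktemp -d)
printf 'Name: libperf\n' > "$stage/libperf.pc"

# With a staging prefix the file lands under $stage/root, so no root
# permissions are needed. With an empty prefix the same call would try
# to write to /usr/lib64/pkgconfig directly.
install_pc "$stage/libperf.pc" /usr/lib64 "$stage/root"
ls "$stage/root/usr/lib64/pkgconfig"
```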

Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Rix <trix@redhat.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: http://lore.kernel.org/lkml/Y5w/cWKyb8vpNMfA@li-4a3a4a4c-28e5-11b2-a85c-a8d192c6f089.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-16 10:04:06 -03:00
Arnaldo Carvalho de Melo
1a931707ad Merge remote-tracking branch 'torvalds/master' into perf/core
To resolve a trivial merge conflict with c302378bc1 ("libbpf:
Hashmap interface update to allow both long and void* keys/values"),
where a function present upstream was removed in the perf tools
development tree.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-16 09:53:53 -03:00
Linus Torvalds
58bcac11fd USB/Thunderbolt driver changes for 6.2-rc1

Merge tag 'usb-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB and Thunderbolt driver updates from Greg KH:
 "Here is the large set of USB and Thunderbolt driver changes for
  6.2-rc1. Overall, thanks to the removal of a driver, more lines were
  removed than added, a nice change. Highlights include:

   - removal of the sisusbvga driver that was not used by anyone anymore

   - minor thunderbolt driver changes and tweaks

   - chipidea driver updates

   - usual set of typec driver features and hardware support added

   - musb minor driver fixes

   - fotg210 driver fixes, bringing that hardware back from the "dead"

   - minor dwc3 driver updates

   - addition, and then removal, of a list.h helper function for many
     USB and other subsystem drivers, that ended up breaking the build.
     That will come back for 6.3-rc1, it missed this merge window.

   - usual xhci updates and enhancements

   - usb-serial driver updates and support for new devices

   - other minor USB driver updates

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'usb-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (153 commits)
  usb: gadget: uvc: Rename bmInterfaceFlags -> bmInterlaceFlags
  usb: dwc2: power on/off phy for peripheral mode in dual-role mode
  usb: dwc2: disable lpm feature on Rockchip SoCs
  dt-bindings: usb: mtk-xhci: add support for mt7986
  usb: dwc3: core: defer probe on ulpi_read_id timeout
  usb: ulpi: defer ulpi_register on ulpi_read_id timeout
  usb: misc: onboard_usb_hub: add Genesys Logic GL850G hub support
  dt-bindings: usb: Add binding for Genesys Logic GL850G hub controller
  dt-bindings: vendor-prefixes: add Genesys Logic
  usb: fotg210-udc: fix potential memory leak in fotg210_udc_probe()
  usb: typec: tipd: Set mode of operation for USB Type-C connector
  usb: gadget: udc: drop obsolete dependencies on COMPILE_TEST
  usb: musb: remove extra check in musb_gadget_vbus_draw
  usb: gadget: uvc: Prevent buffer overflow in setup handler
  usb: dwc3: qcom: Fix memory leak in dwc3_qcom_interconnect_init
  usb: typec: wusb3801: fix fwnode refcount leak in wusb3801_probe()
  usb: storage: Add check for kcalloc
  USB: sisusbvga: use module_usb_driver()
  USB: sisusbvga: rename sisusb.c to sisusbvga.c
  USB: sisusbvga: remove console support
  ...
2022-12-16 03:22:53 -08:00
Jakub Kicinski
d1c4a3469e selftests: devlink: add a warning for interfaces coming up
NetworkManager (and other daemons) may bring the interface up
and cause failures in quiescence checks. Print a helpful warning,
and take the interface down again.

I seem to forget about this every time I run these tests on a new VM.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-16 10:21:37 +00:00
Jakub Kicinski
2fc60e2ff9 selftests: devlink: fix the fd redirect in dummy_reporter_test
In bash, $number immediately followed by > redirects FD $number, e.g.
the commonly used 2> redirects stderr (fd 2). The test uses 8192> in an
attempt to write the number 8192 to a file, which results in:

  ./devlink.sh: line 499: 8192: Bad file descriptor

Oddly the test also papers over this issue by checking
for failure (expecting an error rather than success)
so it passes, anyway.
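
The parsing rule described above can be sketched as a toy model. This is
an illustration in Python, not bash's actual parser: the parse() helper,
its regular expression and the sample command lines are all hypothetical,
and the words are assumed to be already split on whitespace.

```python
import re

# Digits that directly precede a redirection operator name the file
# descriptor to redirect; they are not an ordinary argument.
REDIRECT = re.compile(r"^(\d*)(>>|>|<)(.*)$")


def parse(words):
    """Split pre-tokenized shell words into arguments and redirections."""
    args, redirects = [], []
    it = iter(words)
    for word in it:
        match = REDIRECT.match(word)
        if match is None:
            args.append(word)
            continue
        digits, op, target = match.groups()
        # Without digits the default descriptor applies: 0 for <, 1 for >.
        fd = int(digits) if digits else (0 if op == "<" else 1)
        if not target:
            target = next(it, "")  # the target is the following word
        redirects.append((fd, op, target))
    return args, redirects


# Hypothetical command lines modelled on the description above.
print(parse(["echo", "8192>", "file"]))       # digits glued to '>'
print(parse(["echo", "8192", ">", "file"]))   # digits as a separate word
```

With no space before the operator, the digits become the descriptor being
redirected; with a space, 8192 is an ordinary argument and stdout (fd 1)
is redirected instead.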

Fixes: ff18176ad8 ("selftests: Add a test of large binary to devlink health test")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-16 10:21:36 +00:00
Liam Howlett
9102b78b6f maple_tree: update copyright dates for test code
Extend the copyright year to cover the full span of development.

Link: https://lkml.kernel.org/r/20221025173709.2718725-1-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-15 16:37:49 -08:00
Linus Torvalds
8fa590bf34 ARM64:
* Enable the per-vcpu dirty-ring tracking mechanism, together with an
   option to keep the good old dirty log around for pages that are
   dirtied by something other than a vcpu.
 
 * Switch to the relaxed parallel fault handling, using RCU to delay
   page table reclaim and giving better performance under load.
 
 * Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option,
   which multi-process VMMs such as crosvm rely on (see merge commit 382b5b87a9:
   "Fix a number of issues with MTE, such as races on the tags being
   initialised vs the PG_mte_tagged flag as well as the lack of support
   for VM_SHARED when KVM is involved.  Patches from Catalin Marinas and
   Peter Collingbourne").
 
 * Merge the pKVM shadow vcpu state tracking that allows the hypervisor
   to have its own view of a vcpu, keeping that state private.
 
 * Add support for the PMUv3p5 architecture revision, bringing support
   for 64bit counters on systems that support it, and fix the
   not-quite-compliant CHAIN-ed counter support for the machines that
   actually exist out there.
 
 * Fix a handful of minor issues around 52bit VA/PA support (64kB pages
   only) as a prefix of the oncoming support for 4kB and 16kB pages.
 
 * Pick a small set of documentation and spelling fixes, because no
   good merge window would be complete without those.
 
 s390:
 
 * Second batch of the lazy destroy patches
 
 * First batch of KVM changes for kernel virtual != physical address support
 
 * Removal of an unused function
 
 x86:
 
 * Allow compiling out SMM support
 
 * Cleanup and documentation of SMM state save area format
 
 * Preserve interrupt shadow in SMM state save area
 
 * Respond to generic signals during slow page faults
 
 * Fixes and optimizations for the non-executable huge page errata fix.
 
 * Reprogram all performance counters on PMU filter change
 
 * Cleanups to Hyper-V emulation and tests
 
 * Process Hyper-V TLB flushes from a nested guest (i.e. from an L2 guest
   running on top of an L1 Hyper-V hypervisor)
 
 * Advertise several new Intel features
 
 * x86 Xen-for-KVM:
 
 ** Allow the Xen runstate information to cross a page boundary
 
 ** Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured
 
 ** Add support for 32-bit guests in SCHEDOP_poll
 
 * Notable x86 fixes and cleanups:
 
 ** One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).
 
 ** Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few
    years back when eliminating unnecessary barriers when switching between
    vmcs01 and vmcs02.
 
 ** Clean up vmread_error_trampoline() to make it more obvious that params
    must be passed on the stack, even for x86-64.
 
 ** Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective
    of the current guest CPUID.
 
 ** Fudge around a race with TSC refinement that results in KVM incorrectly
    thinking a guest needs TSC scaling when running on a CPU with a
    constant TSC, but no hardware-enumerated TSC frequency.
 
 ** Advertise (on AMD) that the SMM_CTL MSR is not supported
 
 ** Remove unnecessary exports
 
 Generic:
 
 * Support for responding to signals during page faults; introduces
   new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks
 
 Selftests:
 
 * Fix an inverted check in the access tracking perf test, and restore
   support for asserting that there aren't too many idle pages when
   running on bare metal.
 
 * Fix build errors that occur in certain setups (unsure exactly what is
   unique about the problematic setup) due to glibc overriding
   static_assert() to a variant that requires a custom message.
 
 * Introduce actual atomics for clear/set_bit() in selftests
 
 * Add support for pinning vCPUs in dirty_log_perf_test.
 
 * Rename the so called "perf_util" framework to "memstress".
 
 * Add a lightweight pseudo RNG for guest use, and use it to randomize
   the access pattern and write vs. read percentage in the memstress tests.
 
 * Add a common ucall implementation; code dedup and pre-work for running
   SEV (and beyond) guests in selftests.
 
 * Provide a common constructor and arch hook, which will eventually be
   used by x86 to automatically select the right hypercall (AMD vs. Intel).
 
 * A bunch of added/enabled/fixed selftests for ARM64, covering memslots,
   breakpoints, stage-2 faults and access tracking.
 
 * x86-specific selftest changes:
 
 ** Clean up x86's page table management.
 
 ** Clean up and enhance the "smaller maxphyaddr" test, and add a related
    test to cover generic emulation failure.
 
 ** Clean up the nEPT support checks.
 
 ** Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.
 
 ** Fix an ordering issue in the AMX test introduced by recent conversions
    to use kvm_cpu_has(), and harden the code to guard against similar bugs
   in the future.  Anything that triggers caching of KVM's supported CPUID,
    kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if
    the caching occurs before the test opts in via prctl().
 
 Documentation:
 
 * Remove deleted ioctls from documentation
 
 * Clean up the docs for the x86 MSR filter.
 
 * Various fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmOaFrcUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPemQgAq49excg2Cc+EsHnZw3vu/QWdA0Rt
 KhL3OgKxuHNjCbD2O9n2t5di7eJOTQ7F7T0eDm3xPTr4FS8LQ2327/mQePU/H2CF
 mWOpq9RBWLzFsSTeVA2Mz9TUTkYSnDHYuRsBvHyw/n9cL76BWVzjImldFtjYjjex
 yAwl8c5itKH6bc7KO+5ydswbvBzODkeYKUSBNdbn6m0JGQST7XppNwIAJvpiHsii
 Qgpk0e4Xx9q4PXG/r5DedI6BlufBsLhv0aE9SHPzyKH3JbbUFhJYI8ZD5OhBQuYW
 MwxK2KlM5Jm5ud2NZDDlsMmmvd1lnYCFDyqNozaKEWC1Y5rq1AbMa51fXA==
 =QAYX
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM64:

   - Enable the per-vcpu dirty-ring tracking mechanism, together with an
     option to keep the good old dirty log around for pages that are
     dirtied by something other than a vcpu.

   - Switch to the relaxed parallel fault handling, using RCU to delay
     page table reclaim and giving better performance under load.

   - Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping
     option, which multi-process VMMs such as crosvm rely on (see merge
     commit 382b5b87a9: "Fix a number of issues with MTE, such as
     races on the tags being initialised vs the PG_mte_tagged flag as
     well as the lack of support for VM_SHARED when KVM is involved.
     Patches from Catalin Marinas and Peter Collingbourne").

   - Merge the pKVM shadow vcpu state tracking that allows the
     hypervisor to have its own view of a vcpu, keeping that state
     private.

   - Add support for the PMUv3p5 architecture revision, bringing support
     for 64bit counters on systems that support it, and fix the
     not-quite-compliant CHAIN-ed counter support for the machines that
     actually exist out there.

   - Fix a handful of minor issues around 52bit VA/PA support (64kB
     pages only) as a prefix of the oncoming support for 4kB and 16kB
     pages.

   - Pick a small set of documentation and spelling fixes, because no
     good merge window would be complete without those.

  s390:

   - Second batch of the lazy destroy patches

   - First batch of KVM changes for kernel virtual != physical address
     support

   - Removal of an unused function

  x86:

   - Allow compiling out SMM support

   - Cleanup and documentation of SMM state save area format

   - Preserve interrupt shadow in SMM state save area

   - Respond to generic signals during slow page faults

   - Fixes and optimizations for the non-executable huge page errata
     fix.

   - Reprogram all performance counters on PMU filter change

   - Cleanups to Hyper-V emulation and tests

   - Process Hyper-V TLB flushes from a nested guest (i.e. from an L2
     guest running on top of an L1 Hyper-V hypervisor)

   - Advertise several new Intel features

   - x86 Xen-for-KVM:

      - Allow the Xen runstate information to cross a page boundary

      - Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured

      - Add support for 32-bit guests in SCHEDOP_poll

   - Notable x86 fixes and cleanups:

      - One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).

      - Reinstate IBPB on emulated VM-Exit that was incorrectly dropped
        a few years back when eliminating unnecessary barriers when
        switching between vmcs01 and vmcs02.

      - Clean up vmread_error_trampoline() to make it more obvious that
        params must be passed on the stack, even for x86-64.

      - Let userspace set all supported bits in MSR_IA32_FEAT_CTL
        irrespective of the current guest CPUID.

      - Fudge around a race with TSC refinement that results in KVM
        incorrectly thinking a guest needs TSC scaling when running on a
        CPU with a constant TSC, but no hardware-enumerated TSC
        frequency.

      - Advertise (on AMD) that the SMM_CTL MSR is not supported

      - Remove unnecessary exports

  Generic:

   - Support for responding to signals during page faults; introduces
     new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks

  Selftests:

   - Fix an inverted check in the access tracking perf test, and restore
     support for asserting that there aren't too many idle pages when
     running on bare metal.

   - Fix build errors that occur in certain setups (unsure exactly what
     is unique about the problematic setup) due to glibc overriding
     static_assert() to a variant that requires a custom message.

   - Introduce actual atomics for clear/set_bit() in selftests

   - Add support for pinning vCPUs in dirty_log_perf_test.

   - Rename the so called "perf_util" framework to "memstress".

   - Add a lightweight pseudo RNG for guest use, and use it to randomize
     the access pattern and write vs. read percentage in the memstress
     tests.

   - Add a common ucall implementation; code dedup and pre-work for
     running SEV (and beyond) guests in selftests.

   - Provide a common constructor and arch hook, which will eventually
     be used by x86 to automatically select the right hypercall (AMD vs.
     Intel).

   - A bunch of added/enabled/fixed selftests for ARM64, covering
     memslots, breakpoints, stage-2 faults and access tracking.

   - x86-specific selftest changes:

      - Clean up x86's page table management.

      - Clean up and enhance the "smaller maxphyaddr" test, and add a
        related test to cover generic emulation failure.

      - Clean up the nEPT support checks.

      - Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.

      - Fix an ordering issue in the AMX test introduced by recent
        conversions to use kvm_cpu_has(), and harden the code to guard
        against similar bugs in the future. Anything that triggers
        caching of KVM's supported CPUID, kvm_cpu_has() in this case,
        effectively hides opt-in XSAVE features if the caching occurs
        before the test opts in via prctl().

  Documentation:

   - Remove deleted ioctls from documentation

   - Clean up the docs for the x86 MSR filter.

   - Various fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (361 commits)
  KVM: x86: Add proper ReST tables for userspace MSR exits/flags
  KVM: selftests: Allocate ucall pool from MEM_REGION_DATA
  KVM: arm64: selftests: Align VA space allocator with TTBR0
  KVM: arm64: Fix benign bug with incorrect use of VA_BITS
  KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow
  KVM: x86: Advertise that the SMM_CTL MSR is not supported
  KVM: x86: remove unnecessary exports
  KVM: selftests: Fix spelling mistake "probabalistic" -> "probabilistic"
  tools: KVM: selftests: Convert clear/set_bit() to actual atomics
  tools: Drop "atomic_" prefix from atomic test_and_set_bit()
  tools: Drop conflicting non-atomic test_and_{clear,set}_bit() helpers
  KVM: selftests: Use non-atomic clear/set bit helpers in KVM tests
  perf tools: Use dedicated non-atomic clear/set bit helpers
  tools: Take @bit as an "unsigned long" in {clear,set}_bit() helpers
  KVM: arm64: selftests: Enable single-step without a "full" ucall()
  KVM: x86: fix APICv/x2AVIC disabled when vm reboot by itself
  KVM: Remove stale comment about KVM_REQ_UNHALT
  KVM: Add missing arch for KVM_CREATE_DEVICE and KVM_{SET,GET}_DEVICE_ATTR
  KVM: Reference to kvm_userspace_memory_region in doc and comments
  KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl
  ...
2022-12-15 11:12:21 -08:00
Toke Høiland-Jørgensen
f506439ec3 selftests/bpf: Add a test for using a cpumap from an freplace-to-XDP program
This adds a simple test for inserting an XDP program into a cpumap that is
"owned" by an XDP program that was loaded as PROG_TYPE_EXT (as libxdp
does). Prior to the kernel fix this would fail because the map type
ownership would be set to PROG_TYPE_EXT instead of being resolved to
PROG_TYPE_XDP.

v5:
- Fix a few nits from Andrii, add his ACK
v4:
- Use skeletons for selftest
v3:
- Update comment to better explain the cause
- Add Yonghong's ACK

Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20221214230254.790066-2-toke@redhat.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-12-14 21:30:40 -08:00
Masami Hiramatsu (Google)
d4505aa6af tracing/probes: Reject symbol/symstr type for uprobe
Since a uprobe's argument contains user-space data, it should not be
converted to kernel symbols. Reject these types if the user specifies
them on uprobe events, e.g.:

 /sys/kernel/debug/tracing # echo 'p /bin/sh:10 %ax:symbol' >> uprobe_events
 sh: write error: Invalid argument
 /sys/kernel/debug/tracing # echo 'p /bin/sh:10 %ax:symstr' >> uprobe_events
 sh: write error: Invalid argument
 /sys/kernel/debug/tracing # cat error_log
 [ 1783.134883] trace_uprobe: error: Unknown type is specified
   Command: p /bin/sh:10 %ax:symbol
                             ^
 [ 1792.201120] trace_uprobe: error: Unknown type is specified
   Command: p /bin/sh:10 %ax:symstr
                             ^
Link: https://lore.kernel.org/all/166679931679.1528100.15540755370726009882.stgit@devnote3/

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2022-12-15 09:00:20 +09:00
Linus Torvalds
94a855111e - Add the call depth tracking mitigation for Retbleed which has
been long in the making. It is a lighter-weight software-only fix for
 Skylake-based cores where enabling IBRS is a big hammer and causes a
 significant performance impact.
 
 In essence, it aligns all kernel functions to a 16-byte boundary and
 adds 16 bytes of padding before each function. objtool collects all
 functions' locations and, when the mitigation gets applied, it patches
 in a call accounting thunk which is used to track the call depth of the
 stack at any time.
 
 When that call depth reaches a magical, microarchitecture-specific value
 for the Return Stack Buffer, the code stuffs that RSB and avoids its
 underflow which could otherwise lead to the Intel variant of Retbleed.
 
 This software-only solution brings a lot of the lost performance back,
 as benchmarks suggest:
 
   https://lore.kernel.org/all/20220915111039.092790446@infradead.org/
 
 The page above also contains a much more detailed explanation of the
 whole mechanism.
 
 - Implement a new control flow integrity scheme called FineIBT which is
 based on the software kCFI implementation and uses hardware IBT support
 where present to annotate and track indirect branches using a hash to
 validate them
 
 - Other misc fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmOZp5EACgkQEsHwGGHe
 VUrZFxAAvi/+8L0IYSK4mKJvixGbTFjxN/Swo2JVOfs34LqGUT6JaBc+VUMwZxdb
 VMTFIZ3ttkKEodjhxGI7oGev6V8UfhI37SmO2lYKXpQVjXXnMlv/M+Vw3teE38CN
 gopi+xtGnT1IeWQ3tc/Tv18pleJ0mh5HKWiW+9KoqgXj0wgF9x4eRYDz1TDCDA/A
 iaBzs56j8m/FSykZHnrWZ/MvjKNPdGlfJASUCPeTM2dcrXQGJ93+X2hJctzDte0y
 Nuiw6Y0htfFBE7xoJn+sqm5Okr+McoUM18/CCprbgSKYk18iMYm3ZtAi6FUQZS1A
 ua4wQCf49loGp15PO61AS5d3OBf5D3q/WihQRbCaJvTVgPp9sWYnWwtcVUuhMllh
 ZQtBU9REcVJ/22bH09Q9CjBW0VpKpXHveqQdqRDViLJ6v/iI6EFGmD24SW/VxyRd
 73k9MBGrL/dOf1SbEzdsnvcSB3LGzp0Om8o/KzJWOomrVKjBCJy16bwTEsCZEJmP
 i406m92GPXeaN1GhTko7vmF0GnkEdJs1GVCZPluCAxxbhHukyxHnrjlQjI4vC80n
 Ylc0B3Kvitw7LGJsPqu+/jfNHADC/zhx1qz/30wb5cFmFbN1aRdp3pm8JYUkn+l/
 zri2Y6+O89gvE/9/xUhMohzHsWUO7xITiBavewKeTP9GSWybWUs=
 =cRy1
 -----END PGP SIGNATURE-----

Merge tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 core updates from Borislav Petkov:

 - Add the call depth tracking mitigation for Retbleed which has been
   long in the making. It is a lighter-weight software-only fix for
   Skylake-based cores where enabling IBRS is a big hammer and causes a
   significant performance impact.

   In essence, it aligns all kernel functions to a 16-byte boundary and
   adds 16 bytes of padding before each function. objtool collects all
   functions' locations and, when the mitigation gets applied, it patches
   in a call accounting thunk which is used to track the call depth of
   the stack at any time.

   When that call depth reaches a magical, microarchitecture-specific
   value for the Return Stack Buffer, the code stuffs that RSB and
   avoids its underflow which could otherwise lead to the Intel variant
   of Retbleed.

   This software-only solution brings a lot of the lost performance
   back, as benchmarks suggest:

       https://lore.kernel.org/all/20220915111039.092790446@infradead.org/

   The page above also contains a much more detailed explanation of the
   whole mechanism.

 - Implement a new control flow integrity scheme called FineIBT which is
   based on the software kCFI implementation and uses hardware IBT
   support where present to annotate and track indirect branches using a
   hash to validate them

 - Other misc fixes and cleanups
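
The call depth tracking idea in the first item can be illustrated with a
toy simulation. This is a simplified Python model, not the kernel's thunk
code: the ToyCpu class, the 16-entry RSB capacity and the saturating
"credit" counter with its reset policy are assumptions chosen for
illustration only.

```python
RSB_ENTRIES = 16  # assumed Return Stack Buffer capacity for this model


class ToyCpu:
    """Counts RSB underflows with and without software depth tracking."""

    def __init__(self, track_depth):
        self.track_depth = track_depth
        self.rsb = 0         # valid return predictions held in the RSB
        self.credit = 0      # software estimate of entries still valid
        self.underflows = 0
        self.stuffs = 0

    def call(self):
        # Hardware pushes a prediction; the oldest one is lost when full.
        self.rsb = min(self.rsb + 1, RSB_ENTRIES)
        if self.track_depth:
            # Accounting step: a saturating count of tracked calls.
            self.credit = min(self.credit + 1, RSB_ENTRIES)

    def ret(self):
        if self.track_depth:
            if self.credit == 0:
                # The RSB may be empty: refill it before returning.
                self.rsb = RSB_ENTRIES
                self.credit = RSB_ENTRIES
                self.stuffs += 1
            self.credit -= 1
        if self.rsb == 0:
            self.underflows += 1  # the case the mitigation must avoid
        else:
            self.rsb -= 1


def run(track_depth, depth=20):
    """Make `depth` nested calls, then unwind them all."""
    cpu = ToyCpu(track_depth)
    for _ in range(depth):
        cpu.call()
    for _ in range(depth):
        cpu.ret()
    return cpu.underflows, cpu.stuffs


print("untracked:", run(False))
print("tracked:  ", run(True))
```

In this model a call chain deeper than the RSB underflows on the way back
up unless the tracked counter triggers a refill first.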

* tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (80 commits)
  x86/paravirt: Use common macro for creating simple asm paravirt functions
  x86/paravirt: Remove clobber bitmask from .parainstructions
  x86/debug: Include percpu.h in debugreg.h to get DECLARE_PER_CPU() et al
  x86/cpufeatures: Move X86_FEATURE_CALL_DEPTH from bit 18 to bit 19 of word 11, to leave space for WIP X86_FEATURE_SGX_EDECCSSA bit
  x86/Kconfig: Enable kernel IBT by default
  x86,pm: Force out-of-line memcpy()
  objtool: Fix weak hole vs prefix symbol
  objtool: Optimize elf_dirty_reloc_sym()
  x86/cfi: Add boot time hash randomization
  x86/cfi: Boot time selection of CFI scheme
  x86/ibt: Implement FineIBT
  objtool: Add --cfi to generate the .cfi_sites section
  x86: Add prefix symbols for function padding
  objtool: Add option to generate prefix symbols
  objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf
  objtool: Slice up elf_create_section_symbol()
  kallsyms: Revert "Take callthunks into account"
  x86: Unconfuse CONFIG_ and X86_FEATURE_ namespaces
  x86/retpoline: Fix crash printing warning
  x86/paravirt: Fix a !PARAVIRT build warning
  ...
2022-12-14 15:03:00 -08:00
Linus Torvalds
ad76bf1ff1 memblock: extend test coverage
* add tests that trigger reallocation of memblock structures from
   memblock itself via memblock_double_array()
 * add tests for memblock_alloc_exact_nid_raw() that verify that requested
   node and memory range constraints are respected.
 -----BEGIN PGP SIGNATURE-----
 
 iQFMBAABCAA2FiEEeOVYVaWZL5900a/pOQOGJssO/ZEFAmOYL14YHG1pa2UucmFw
 b3BvcnRAZ21haWwuY29tAAoJEDkDhibLDv2RZdcH/2AE447oXzVO2lzOgkqQH1EX
 xJdaa7hu00h2Euzv2lgcOHroHGXDP8wYjUV2cEyNZMP0WOMiO8i6rwIKmrzWufcm
 R+ZoKPQV/Nc+7rIycpW455yLxcgsVIpUILK2BQEkDCGYugSHKb7IYdcA9KDJwtmR
 xIG9j8nsuwWJtmtAuQqNOBmsc5FzKNYFa/RtDiJoMFmQNK3UqB8G8VCASdP0DYvH
 7MXPcyRmlwpmOsKoNKi2/wQBsiag8/PLgcZv5vYg+E6no1tMG6u7pgDS12Sn6ZvA
 I8gThJ8HNAo0d1O2SnbkicMx2CqrPFSub3QXaEFjCZF5mdBcirxHc/VBKj50TXU=
 =iXEA
 -----END PGP SIGNATURE-----

Merge tag 'memblock-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock

Pull memblock updates from Mike Rapoport:
 "Extend test coverage:

   - add tests that trigger reallocation of memblock structures from
     memblock itself via memblock_double_array()

   - add tests for memblock_alloc_exact_nid_raw() that verify that
     requested node and memory range constraints are respected"

* tag 'memblock-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  memblock tests: remove completed TODO item
  memblock tests: add generic NUMA tests for memblock_alloc_exact_nid_raw
  memblock tests: add bottom-up NUMA tests for memblock_alloc_exact_nid_raw
  memblock tests: add top-down NUMA tests for memblock_alloc_exact_nid_raw
  memblock tests: introduce range tests for memblock_alloc_exact_nid_raw
  memblock test: Update TODO list
  memblock test: Add test to memblock_reserve() 129th region
  memblock test: Add test to memblock_add() 129th region
2022-12-14 12:17:57 -08:00
Tiezhu Yang
818448e9cf perf tools: Use "grep -E" instead of "egrep"
The latest version of grep reports that egrep is obsolescent, so the build
now emits warnings that look like:

	egrep: warning: egrep is obsolescent; using grep -E

Fix this by switching the affected files to "grep -E" instead.

  sed -i "s/egrep/grep -E/g" `grep egrep -rwl tools/perf`
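
The same substitution can be sketched in Python. This is only an
illustration and not the command used for the commit: the modernize()
helper is hypothetical, and unlike the plain sed expression above it adds
word boundaries so that longer names containing "egrep" are left alone.

```python
import re


def modernize(line):
    """Replace the standalone word egrep with 'grep -E' in one line."""
    return re.sub(r"\begrep\b", "grep -E", line)


# Hypothetical script lines used only to exercise the helper.
for sample in ["ls | egrep 'a|b'", "fgrep needle haystack"]:
    print(modernize(sample))
```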

Here are the steps to install the latest grep:

  wget http://ftp.gnu.org/gnu/grep/grep-3.8.tar.gz
  tar xf grep-3.8.tar.gz
  cd grep-3.8 && ./configure && make
  sudo make install
  export PATH=/usr/local/bin:$PATH

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1668762999-9297-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 15:28:19 -03:00
Namhyung Kim
c587e77e10 perf stat: Do not delay the workload with --delay
The -D/--delay option delays the measurement until after the program has
started. But the current code goes to sleep before starting the program,
so the program is delayed too.  This is not the intention, so fix it.

Before:

  $ time sudo ./perf stat -a -e cycles -D 3000 sleep 4
  Events disabled
  Events enabled

   Performance counter stats for 'system wide':

       4,326,949,337      cycles

         4.007494118 seconds time elapsed

  real	0m7.474s
  user	0m0.356s
  sys	0m0.120s

It ran the workload for 4 seconds with a 3-second delay, so it should
skip the first 3 seconds and measure only the last 1 second.  But as you
can see, it delayed for 3 seconds and then ran the workload for 4
seconds, so the total time (real) was 7 seconds.

After:

  $ time sudo ./perf stat -a -e cycles -D 3000 sleep 4
  Events disabled
  Events enabled

   Performance counter stats for 'system wide':

       1,063,551,013      cycles

         1.002769510 seconds time elapsed

  real	0m4.484s
  user	0m0.385s
  sys	0m0.086s

The bug was introduced by the change to how system-wide events are
enabled with a command line workload, which should have considered the
initial delay case.  The code has been reworked since then (in
bb8bc52e75), so I'm afraid this fix won't apply cleanly.
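
The timing difference between the two orderings can be worked through
with a small model. This is a sketch of the arithmetic only, not perf's
code: both helper functions are hypothetical, times are idealized seconds
without startup overhead, and the delay is assumed to be shorter than the
workload.

```python
def buggy_order(delay_s, workload_s):
    """Sleep for the delay first, then start and measure the workload."""
    measured = workload_s           # the whole workload gets counted
    wall = delay_s + workload_s     # the workload start is pushed back
    return measured, wall


def fixed_order(delay_s, workload_s):
    """Start the workload, then enable counters once the delay expires."""
    measured = max(workload_s - delay_s, 0)  # only the tail is counted
    wall = workload_s               # assumes delay_s <= workload_s
    return measured, wall


# Values from the example above: -D 3000 (3 seconds) with "sleep 4".
print("before:", buggy_order(3, 4))
print("after: ", fixed_order(3, 4))
```

The two results line up with the elapsed and real times in the Before and
After outputs above, apart from process startup overhead.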

Fixes: d0a0a51149 ("perf stat: Fix forked applications enablement of counters")
Reported-by: Kevin Nomura <nomurak@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Link: https://lore.kernel.org/r/20221212230820.901382-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 15:28:19 -03:00
Ian Rogers
5f8f95673f perf evlist: Remove group option.
The group option predates grouping events using curly braces added in
commit 89efb02950 ("perf tools: Add support to parse event group
syntax").

The --group option was retained for legacy support (in August
2012) but keeping it adds complexity.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Eelco Chaudron <echaudro@redhat.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Shaomin Deng <dengshaomin@cdjrlc.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221213232651.1269909-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 15:28:18 -03:00
Song Liu
a8dfde09c9 selftests/bpf: Select CONFIG_FUNCTION_ERROR_INJECTION
BPF selftests require CONFIG_FUNCTION_ERROR_INJECTION to work. However,
CONFIG_FUNCTION_ERROR_INJECTION is no longer 'y' by default after recent
changes. As a result, we are seeing errors like the following from BPF CI:

   bpf_testmod_test_read() is not modifiable
   __x64_sys_setdomainname is not sleepable
   __x64_sys_getpgid is not sleepable

Fix this by explicitly selecting CONFIG_FUNCTION_ERROR_INJECTION in the
selftest config.

Fixes: a4412fdd49 ("error-injection: Add prompt for function error injection")
Reported-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Müller <deso@posteo.net>
Link: https://lore.kernel.org/bpf/20221213220500.3427947-1-song@kernel.org
2022-12-14 18:35:41 +01:00
Yonghong Song
ec9230b18b selftests/bpf: Fix a selftest compilation error with CONFIG_SMP=n
The kernel test robot reported a bpf selftest build failure when CONFIG_SMP
is not set. The error message looks like this:

  >> progs/rcu_read_lock.c:256:34: error: no member named 'last_wakee' in 'struct task_struct'
             last_wakee = task->real_parent->last_wakee;
                          ~~~~~~~~~~~~~~~~~  ^
     1 error generated.

When CONFIG_SMP is not set, the field 'last_wakee' is not available in struct
'task_struct', hence the above compilation failure. To fix the issue, use
another field, 'group_leader', which is available regardless of whether
CONFIG_SMP is set.

Fixes: fe147956fc ("bpf/selftests: Add selftests for new task kfuncs")
Fixes: 48671232fc ("selftests/bpf: Add tests for bpf_rcu_read_lock()")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20221213012224.379581-1-yhs@fb.com
2022-12-14 18:35:41 +01:00
Linus Torvalds
08cdc21579 iommufd for 6.2
iommufd is the user API to control the IOMMU subsystem as it relates to
 managing IO page tables that point at user space memory.
 
 It takes over from drivers/vfio/vfio_iommu_type1.c (aka the VFIO
 container) which is the VFIO specific interface for a similar idea.
 
 We see a broad need for extended features, some being highly IOMMU device
 specific:
  - Binding iommu_domain's to PASID/SSID
  - Userspace IO page tables, for ARM, x86 and S390
  - Kernel bypassed invalidation of user page tables
  - Re-use of the KVM page table in the IOMMU
  - Dirty page tracking in the IOMMU
  - Runtime Increase/Decrease of IOPTE size
  - PRI support with faults resolved in userspace
 
 Many of these HW features exist to support VM use cases - for instance the
 combination of PASID, PRI and Userspace IO Page Tables allows an
 implementation of DMA Shared Virtual Addressing (vSVA) within a
 guest. Dirty tracking enables VM live migration with SRIOV devices and
 PASID support allows creating "scalable IOV" devices, among other things.
 
 As these features are fundamental to a VM platform they need to be
 uniformly exposed to all the driver families that do DMA into VMs, which
 is currently VFIO and VDPA.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCY5ct7wAKCRCFwuHvBreF
 YZZ5AQDciXfcgXLt0UBEmWupNb0f/asT6tk717pdsKm8kAZMNAEAsIyLiKT5HqGl
 s7fAu+CQ1pr9+9NKGevD+frw8Solsw4=
 =jJkd
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd

Pull iommufd implementation from Jason Gunthorpe:
 "iommufd is the user API to control the IOMMU subsystem as it relates
  to managing IO page tables that point at user space memory.

  It takes over from drivers/vfio/vfio_iommu_type1.c (aka the VFIO
  container) which is the VFIO specific interface for a similar idea.

  We see a broad need for extended features, some being highly IOMMU
  device specific:
   - Binding iommu_domain's to PASID/SSID
   - Userspace IO page tables, for ARM, x86 and S390
   - Kernel bypassed invalidation of user page tables
   - Re-use of the KVM page table in the IOMMU
   - Dirty page tracking in the IOMMU
   - Runtime Increase/Decrease of IOPTE size
   - PRI support with faults resolved in userspace

  Many of these HW features exist to support VM use cases - for instance
  the combination of PASID, PRI and Userspace IO Page Tables allows an
  implementation of DMA Shared Virtual Addressing (vSVA) within a guest.
  Dirty tracking enables VM live migration with SRIOV devices and PASID
  support allows creating "scalable IOV" devices, among other things.

  As these features are fundamental to a VM platform they need to be
  uniformly exposed to all the driver families that do DMA into VMs,
  which is currently VFIO and VDPA"

For more background, see the extended explanations in Jason's pull request:

  https://lore.kernel.org/lkml/Y5dzTU8dlmXTbzoJ@nvidia.com/

* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (62 commits)
  iommufd: Change the order of MSI setup
  iommufd: Improve a few unclear bits of code
  iommufd: Fix comment typos
  vfio: Move vfio group specific code into group.c
  vfio: Refactor dma APIs for emulated devices
  vfio: Wrap vfio group module init/clean code into helpers
  vfio: Refactor vfio_device open and close
  vfio: Make vfio_device_open() truly device specific
  vfio: Swap order of vfio_device_container_register() and open_device()
  vfio: Set device->group in helper function
  vfio: Create wrappers for group register/unregister
  vfio: Move the sanity check of the group to vfio_create_group()
  vfio: Simplify vfio_create_group()
  iommufd: Allow iommufd to supply /dev/vfio/vfio
  vfio: Make vfio_container optionally compiled
  vfio: Move container related MODULE_ALIAS statements into container.c
  vfio-iommufd: Support iommufd for emulated VFIO devices
  vfio-iommufd: Support iommufd for physical VFIO devices
  vfio-iommufd: Allow iommufd to be used in place of a container fd
  vfio: Use IOMMU_CAP_ENFORCE_CACHE_COHERENCY for vfio_file_enforced_coherent()
  ...
2022-12-14 09:15:43 -08:00
Ian Rogers
caec54705a perf build: Fix python/perf.so library's name
Since Python 3.3 extensions have a suffix encoding platform and
version information. For example, the perf extension was previously
perf.so but may now be perf.cpython-310-x86_64-linux-gnu.so. Compute
the extension using Python and then use this in the target name. Doing
this avoids the "perf.so" target always being rebuilt.
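
As a rough illustration of "compute the extension using Python", the
platform suffix can be queried from the standard sysconfig module. This
is a minimal sketch, not the Makefile logic from the patch; the helper
name extension_filename is hypothetical.

```python
import sysconfig


def extension_filename(module_name: str) -> str:
    """Return the file name CPython expects for a C extension module."""
    # EXT_SUFFIX encodes implementation, version and platform, e.g.
    # ".cpython-310-x86_64-linux-gnu.so" on CPython 3.10 for x86_64 Linux.
    # Fall back to ".so" if the variable is unset (assumption for this sketch).
    suffix = sysconfig.get_config_var("EXT_SUFFIX") or ".so"
    return module_name + suffix


print(extension_filename("perf"))
```

Using the computed name as the build target lets make see that the
output file already exists instead of rebuilding a never-created
"perf.so".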

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Eelco Chaudron <echaudro@redhat.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Shaomin Deng <dengshaomin@cdjrlc.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221213232651.1269909-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:32 -03:00
James Clark
9440ebdc33 perf test arm64: Add attr tests for new VG register
Ensure that the availability of the VG register behaves as expected
depending on the kernel version and SVE support.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221213114739.2312862-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:32 -03:00
James Clark
ee26adf627 perf test: Add mechanism for skipping attr tests on kernel versions
The first two version numbers are used since that is where the ABI
changes happen, so seems to be the most useful for now.

'Until' is exclusive and 'since' is inclusive so that the same version
number can be used to mark a point where the change comes into effect.

This allows keeping the tests in a state where new tests will also pass
on older kernels if the existence of a new feature isn't explicitly
broadcast by the kernel. For example extended user regs are currently
discovered by trial and error calls to perf_event_open.
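
The since/until semantics described above can be sketched as follows.
This is an illustration under assumptions, not the code from the patch:
the function names major_minor and should_run are hypothetical.

```python
import re


def major_minor(release):
    """Extract the first two version numbers, e.g. "6.2.0-rc1" -> (6, 2)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unparsable kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def should_run(release, since=None, until=None):
    """'since' is inclusive and 'until' is exclusive."""
    current = major_minor(release)
    if since is not None and current < major_minor(since):
        return False
    if until is not None and current >= major_minor(until):
        return False
    return True


# The same version number marks the switch-over point for both variants.
print(should_run("6.2.0", since="6.2"))  # new-behaviour test runs
print(should_run("6.2.0", until="6.2"))  # old-behaviour test is skipped
```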

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221213114739.2312862-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:32 -03:00
James Clark
c3a8f85351 perf test: Add mechanism for skipping attr tests on auxiliary vector values
This can be used to skip tests or provide different test values on
different platforms. For example to run a test only where Arm SVE is
present add this to the config section:

  auxv    = auxv["AT_HWCAP"] & 0x200000 == 0x200000

The value is a freeform Python expression that is evaluated in the context
of a map called "auxv" that contains the decoded auxiliary vector.
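
A minimal sketch of how such an expression could be evaluated, assuming
a 64-bit auxiliary vector made of (key, value) pairs of unsigned 64-bit
integers. The helpers decode_auxv and condition_holds are hypothetical
and the vector here is synthetic rather than read from /proc/self/auxv.

```python
import struct

AT_NULL = 0    # terminator entry
AT_HWCAP = 16  # key of the hardware capability bitmask


def decode_auxv(raw):
    """Decode raw auxv bytes into a dict keyed by name (or number)."""
    names = {AT_HWCAP: "AT_HWCAP"}
    auxv = {}
    for key, value in struct.iter_unpack("=QQ", raw):
        if key == AT_NULL:
            break
        auxv[names.get(key, key)] = value
    return auxv


def condition_holds(expression, auxv):
    """Evaluate a config expression with only 'auxv' in scope."""
    # eval() is tolerable here only because test configs are trusted input.
    return bool(eval(expression, {"__builtins__": {}}, {"auxv": auxv}))


raw = struct.pack("=QQQQ", AT_HWCAP, 0x200000, AT_NULL, 0)
print(condition_holds('auxv["AT_HWCAP"] & 0x200000 == 0x200000',
                      decode_auxv(raw)))
```

Note that in Python "&" binds more tightly than "==", so the expression
needs no extra parentheses.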

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221213114739.2312862-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:32 -03:00
James Clark
a8f26192ca perf test: Add ability to test exit code for attr tests
Currently the return value is used to skip the test, but sometimes it
can be useful to test if a certain command should return a certain exit
code.
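
The idea can be sketched with the standard subprocess module. This is an
illustration only; the helper name exit_code_matches is hypothetical and
the command run here is a stand-in for a perf invocation.

```python
import subprocess
import sys


def exit_code_matches(argv, expected):
    """Run a command and report whether it exited with the expected code."""
    result = subprocess.run(argv, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == expected


# Stand-in command that deliberately exits with status 3.
command = [sys.executable, "-c", "import sys; sys.exit(3)"]
print(exit_code_matches(command, 3))
```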

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221213114739.2312862-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:32 -03:00
Petar Gligoric
e8478b84d6 perf test: add new task-analyzer tests
Provide task-analyzer test cases for all possible arguments and a subset of possible
combinations.

12 Tests in total.

test_basic:
 - cmd:"perf script report task-analyzer"
 - Fundamental test of script without arguments.
 - Check for standard output.

test_ns_rename:
 - cmd:"perf script report task-analyzer --ns --rename-comms-by-tids 0:random"
 - Standard task with timestamps in nanoseconds and comm renamed.
 - Check for standard output.

test_ms_filtertasks_highlight:
 - cmd:"perf script report task-analyzer --ms --filter-tasks perf --highlight-tasks perf"
 - Standard task with timestamps in milliseconds, task filtered out and highlighted.
 - Check for standard output.

test_extended_times_timelimit_limittasks:
 - cmd:"perf script report task-analyzer --extended-times --time-limit :99999"
 - Standard task with additional schedule out/in info and time limit active at 99999.
 - Check for extended table output.

test_summary:
 - cmd:"perf script report task-analyzer --summary"
 - Standard task with additional summary output.
 - Check for summary print.

test_summary_extended:
 - cmd:"perf script report task-analyzer --summary-extended"
 - Standard task with summary and additional schedule in/out info.
 - Check for extended table print.

test_summaryonly:
 - cmd:"perf script report task-analyzer --summary-only"
 - Only summary should be printed.
 - Check for summary print.

test_extended_times_summary_ns:
 - cmd:"perf script report task-analyzer --extended-times --summary --ns"
 - Standard task with extended schedule in/out information and summary in ns.
 - Check for extended table and summary.

test_csv:
 - cmd:"perf script report task-analyzer --csv csv"
 - Print standard task to csv file in csv format.
 - Check for csv format.

test_csv_extended_times:
 - cmd:"perf script report task-analyzer --csv csv --extended-times"
 - Print standard task to csv file in csv format with additional schedule in/out
   information.
 - Check for additional information and csv format.

test_csvsummary:
 - cmd:"perf script report task-analyzer --csv-summary csvsummary"
 - Print summary to csvsummary file in csv format.
 - Check for csv format.

test_csvsummary_extended:
 - cmd:"perf script report task-analyzer --csv-summary csvsummary --summary-extended"
 - Print summary to csvsummary file in csv format with additional schedule in/out
   information.
 - Check for additional information and csv format.

Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: Petar Gligoric <petar.gligoric@rohde-schwarz.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221206154406.41941-4-petar.gligor@gmail.com
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:31 -03:00
Petar Gligoric
fdd0f81f05 perf script: task-analyzer add csv support
This patch adds the possibility to write the trace and the summary as csv
to a user-specified file. Such a format simplifies further data processing.
This is achieved by using ";" as the separator instead of spaces and only
one header per file.

Additional parameters are honored, as in normal usage of the script. Colors
are turned off for csv output, so the highlight option is ignored.

Usage:

Write standard task to csv file:

  $ perf script report task-analyzer --csv <file>

Write limited output to csv file in nanoseconds:

  $ perf script report task-analyzer --csv <file> --ns --limit-to-tasks 1337

Write summary to a csv file:

  $ perf script report task-analyzer --csv-summary <file>

Write summary to csv file with additional schedule information:

  $ perf script report task-analyzer --csv-summary <file> --summary-extended

Write both summary and standard task to a csv file:

  $ perf script report task-analyzer --csv --csv-summary

The following examples illustrate what is possible with the CSV output. The
first command sequence records all scheduler switch events for 10 seconds;
the task-analyzer then writes task information such as runtimes as CSV. A
small python snippet using pandas and matplotlib visualizes the runtimes of
the most frequent task (e.g. kworker/1:1), each runtime as a bar in a bar
chart:

  $ perf record -e sched:sched_switch -a -- sleep 10
  $ perf script report task-analyzer --ns --csv tasks.csv
  $ cat << EOF > /tmp/freq-comm-runtimes-bar.py
    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.read_csv("tasks.csv", sep=';')
    most_freq_comm = df["COMM"].value_counts().idxmax()
    most_freq_runtimes = df[df["COMM"]==most_freq_comm]["Runtime"]
    plt.title(f"Runtimes for Task {most_freq_comm} in Nanoseconds")
    plt.bar(range(len(most_freq_runtimes)), most_freq_runtimes)
    plt.show()
  EOF
  $ python3 /tmp/freq-comm-runtimes-bar.py

As a second example, the subsequent script generates a pie chart of the
accumulated runtimes of all tasks for 10 seconds of system recordings:

  $ perf record -e sched:sched_switch -a -- sleep 10
  $ perf script report task-analyzer --csv-summary task-summary.csv
  $ cat << EOF > /tmp/accumulated-task-pie.py
    import pandas as pd
    from matplotlib.pyplot import pie, axis, show

    df = pd.read_csv("task-summary.csv", sep=';')
    sums = df.groupby(df["Comm"])["Accumulated"].sum()
    axis("equal")
    pie(sums, labels=sums.index)
    show()
  EOF
  $ python3 /tmp/accumulated-task-pie.py

A variety of other visualizations are possible in matplotlib and other
environments. Of course, pandas, numpy and co. also allow easy
statistical analysis of the data!
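
For a quick statistical look without pandas, the ";"-separated output
can also be read with the standard csv module. The sketch below assumes
the "COMM" and "Runtime" column names used in the pandas example above;
the sample rows are made up for illustration and the function name
runtime_stats is hypothetical.

```python
import csv
import io
import statistics
from collections import defaultdict

# Made-up sample in the same shape as the task csv (";" separated).
SAMPLE = "COMM;Runtime\nkworker/1:1;15\nkworker/1:1;25\nperf;118\n"


def runtime_stats(csv_text):
    """Return {comm: (mean runtime, max runtime)} from csv text."""
    by_comm = defaultdict(list)
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
    for row in reader:
        by_comm[row["COMM"]].append(int(row["Runtime"]))
    return {comm: (statistics.mean(values), max(values))
            for comm, values in by_comm.items()}


for comm, (mean, maximum) in runtime_stats(SAMPLE).items():
    print(comm, mean, maximum)
```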

Signed-off-by: Petar Gligoric <petar.gligoric@rohde-schwarz.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221206154406.41941-3-petar.gligor@gmail.com
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:31 -03:00
Hagen Paul Pfeifer
e76aff0523 perf script: Introduce task analyzer python script
Introduce a new 'perf script' to analyze task scheduling behavior.

During task analysis, some data is always needed that goes beyond the
simple times at which a task (process/thread) is switched in and out.
This concerns, for example, the runtime of a process or the frequency
with which the process was run. This script simplifies this recurring
analysis. It immediately provides the user with helpful information
about task runtimes.

Usage:

Recording can be done in two ways:

  $ perf script record task-analyzer -- sleep 10
  $ perf record -e sched:sched_switch -a -- sleep 10

The script can parse any perf.data file. Only sched:sched_switch events
are mandatory; other events are ignored.

The simplest report use case is to call the script without arguments:

  $ perf script report task-analyzer
      Switched-In      Switched-Out CPU      PID      TID             Comm    Runtime     Time Out-In
  15576.658891407   15576.659156086   4     2412     2428            gdbus        265            1949
  15576.659111320   15576.659455410   0     2412     2412      gnome-shell        344            2267
  15576.659491326   15576.659506173   2       74       74      kworker/2:1         15           13145
  15576.659506173   15576.659825748   2     2858     2858  gnome-terminal-        320           63263
  15576.659871270   15576.659902872   6    20932    20932    kworker/u16:0         32         2314582
  15576.659909951   15576.659945501   3    27264    27264               sh         36              -1
  15576.659853285   15576.659971052   7    27265    27265             perf        118         5050741
  [...]

What is not shown here are the ANSI color sequences. For example, if
the task consists of only one thread, the TID is grayed out.

Runtime is the time the task was running on the CPU. Time Out-In is the
time between the process being scheduled *out* and scheduled back *in*,
i.e. the time span between its last two executions. If -1 is printed,
the task ran for the first time in the measurement, so an Out-In delta
could not be calculated.
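
The two columns can be derived from the switch timestamps roughly as
follows. This is a simplified sketch, not the script's implementation:
the function name analyze and the integer-nanosecond input format are
assumptions made for illustration.

```python
def analyze(switches):
    """Compute Runtime and Time Out-In per scheduling interval.

    switches: list of (tid, switched_in_ns, switched_out_ns) tuples in
    chronological order. Returns a list of (tid, runtime_ns, out_in_ns),
    where out_in_ns is -1 the first time a task is seen.
    """
    last_out = {}  # tid -> timestamp of the previous switch-out
    rows = []
    for tid, t_in, t_out in switches:
        runtime = t_out - t_in
        out_in = t_in - last_out[tid] if tid in last_out else -1
        last_out[tid] = t_out
        rows.append((tid, runtime, out_in))
    return rows


for row in analyze([(1, 100, 150), (2, 150, 170), (1, 200, 260)]):
    print(row)
```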

In addition to the chronological representation, there is a summary on
task level. This output can be additionally switched on via the
--summary option and provides information such as max, min & average
runtime per process. The maximum runtime is often important for
debugging. The call looks like this:

  $ perf script report task-analyzer --summary
  Summary
       Task Information                       Runtime Information
    PID   TID            Comm Runs Accumulated    Mean  Median  Min   Max          Max At
     14    14     ksoftirqd/0   13         334      26      15    9   127 15571.621211956
     15    15     rcu_preempt  133        1778      13      13    2    33 15572.581176024
     16    16     migration/0    3          49      16      13   12    24 15571.608915425
     20    20     migration/1    3          34      11      13    8    13 15571.639101555
     25    25     migration/2    3          32      11      12    9    12 15575.639239896
  [...]
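
The per-task summary columns can be sketched with the standard
statistics module. This is an illustration under assumptions, not the
script's code; the function name summarize and the input tuple layout
are hypothetical.

```python
import statistics
from collections import defaultdict


def summarize(runs):
    """Aggregate runtimes per task.

    runs: iterable of (tid, runtime, switched_in_timestamp).
    Returns {tid: dict with runs, accumulated, mean, median, min, max,
    max_at}, where max_at is the switch-in time of the longest run.
    """
    per_task = defaultdict(list)
    for tid, runtime, switched_in in runs:
        per_task[tid].append((runtime, switched_in))

    summary = {}
    for tid, entries in per_task.items():
        runtimes = [runtime for runtime, _ in entries]
        max_runtime, max_at = max(entries, key=lambda entry: entry[0])
        summary[tid] = {
            "runs": len(runtimes),
            "accumulated": sum(runtimes),
            "mean": statistics.mean(runtimes),
            "median": statistics.median(runtimes),
            "min": min(runtimes),
            "max": max_runtime,
            "max_at": max_at,
        }
    return summary


print(summarize([(14, 10, 1.0), (14, 30, 2.0), (14, 20, 3.0)]))
```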

Besides these two options, there are a number of other options that change the
output and behavior. They can be listed via --help. Options worth mentioning include:

- filter-tasks         - filter out unneeded tasks, --filter-tasks 1337,/sbin/init
- highlight-tasks      - more pleasant focusing, --highlight-tasks 1:red,mutt:yellow
- extended-times       - show combinations of elapsed times between schedule in/schedule out
- summary-extended     - summary with additional information, like maximum delta time statistics
- rename-comms-by-tids - handy for inexpressive process names like python, --rename-comms-by-tids 1337:my-python-app
- ms                   - show timestamps in milliseconds, nanoseconds is also possible (--ns)
- time-limit           - limit the analyzer to a time range, --time-limit 15576.0:15576.1

The script is tested and ready for use with both python2 and python3:

- make PYTHON=python3 prefix=/usr/local install
- make PYTHON=python2 prefix=/usr/local install

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221206154406.41941-2-petar.gligor@gmail.com
Signed-off-by: Petar Gligoric <petar.gligoric@rohde-schwarz.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:31 -03:00
James Clark
55c1de9973 perf cs-etm: Print auxtrace info even if OpenCSD isn't linked
Printing the info doesn't have any dependency on OpenCSD, and neither
does recording Coresight data. Because it's sometimes useful to look at
the info for debugging, it makes sense to be able to see it on the same
platform that the recording was made on.

So pull the auxtrace info printing parts into a new file that is always
compiled into Perf.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20221212155513.2259623-6-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:31 -03:00
James Clark
fd63091f2a perf cs-etm: Cleanup cs_etm__process_auxtrace_info()
hdr is a copy of 3 values of ptr and doesn't need to be long lived. So
just use ptr instead which means the malloc and the extra error path can
be removed to simplify things.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20221212155513.2259623-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:31 -03:00
James Clark
b00204f5c2 perf cs-etm: Tidy up auxtrace info header printing
cs_etm__print_auxtrace_info() is called twice in case there is an error
somewhere in cs_etm__process_auxtrace_info(), but all the info is
already available at the beginning so just print it there instead.

Also use u64 and the already cast ptr variable to make it more
consistent with the rest of the etm code.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20221212155513.2259623-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:31 -03:00
James Clark
fe55ba1832 perf cs-etm: Remove unused stub methods
These aren't used outside of cs-etm so don't need stubs. Leave
cs_etm__process_auxtrace_info() which is used externally, and add an
error message so that it's obvious to users why it causes errors.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20221212155513.2259623-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:31 -03:00
James Clark
ab6bd55e99 perf cs-etm: Print unknown header version as an error
An unknown header version is an error, not just information for the raw
trace dump, so always print it as an error. Also remove the duplicate
header version check.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20221212155513.2259623-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:31 -03:00
Namhyung Kim
22ddcb6b4a perf test: Update perf lock contention test
Add test cases for the task and addr aggregation modes.

  $ sudo ./perf test -v contention
   86: kernel lock contention analysis test                            :
  --- start ---
  test child forked, pid 680006
  Testing perf lock record and perf lock contention
  Testing perf lock contention --use-bpf
  Testing perf lock record and perf lock contention at the same time
  Testing perf lock contention --threads
  Testing perf lock contention --lock-addr
  test child finished with 0
  ---- end ----
  kernel lock contention analysis test: Ok

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221209190727.759804-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:31 -03:00
Namhyung Kim
688d2e8de2 perf lock contention: Add -l/--lock-addr option
The -l/--lock-addr option implements per-lock-instance contention
stats using LOCK_AGGR_ADDR.  It displays the lock address and, if one
exists, the symbol name.

  $ sudo ./perf lock con -abl sleep 1
   contended   total wait     max wait     avg wait            address   symbol

           1     36.28 us     36.28 us     36.28 us   ffff92615d6448b8
           9     10.91 us      1.84 us      1.21 us   ffffffffbaed50c0   rcu_state
           1     10.49 us     10.49 us     10.49 us   ffff9262ac4f0c80
           8      4.68 us      1.67 us       585 ns   ffffffffbae07a40   jiffies_lock
           3      3.03 us      1.45 us      1.01 us   ffff9262277861e0
           1       924 ns       924 ns       924 ns   ffff926095ba9d20
           1       436 ns       436 ns       436 ns   ffff9260bfda4f60
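
Conceptually, aggregating by lock address means keying the statistics
on the address rather than on the call stack. The sketch below shows
that idea in plain Python; it is not the BPF implementation, and the
function name aggregate_by_address and the (address, wait) event format
are assumptions.

```python
from collections import defaultdict


def aggregate_by_address(events):
    """Aggregate contention events per lock address.

    events: iterable of (lock_address, wait_ns).
    Returns rows of (address, contended, total, max, avg), sorted by
    total wait time in descending order as in the table above.
    """
    stats = defaultdict(lambda: {"contended": 0, "total": 0, "max": 0})
    for address, wait_ns in events:
        entry = stats[address]
        entry["contended"] += 1
        entry["total"] += wait_ns
        entry["max"] = max(entry["max"], wait_ns)

    rows = [(address, e["contended"], e["total"], e["max"],
             e["total"] // e["contended"])
            for address, e in stats.items()]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


for row in aggregate_by_address([(0xA, 100), (0xB, 30), (0xA, 50)]):
    print(row)
```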

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221209190727.759804-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:31 -03:00
Namhyung Kim
eca949b2b4 perf lock contention: Implement -t/--threads option for BPF
The BPF code didn't show the per-thread stat properly.  Use the task's
thread id (PID) as a key instead of stack_id and add a task_data map to
save task comm names.

  $ sudo ./perf lock con -abt -E 5 sleep 1
   contended   total wait     max wait     avg wait          pid   comm

           1    740.66 ms    740.66 ms    740.66 ms         1950   nv_queue
           3    305.50 ms    298.19 ms    101.83 ms         1884   nvidia-modeset/
           1     25.14 us     25.14 us     25.14 us      2725038   EventManager_De
          12     23.09 us      9.30 us      1.92 us            0   swapper
           1     20.18 us     20.18 us     20.18 us      2725033   EventManager_De

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221209190727.759804-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:31 -03:00
Namhyung Kim
fd507d3e35 perf lock contention: Add lock_data.h for common data
Accessing BPF maps should use the same data types.  Add bpf_skel/lock_data.h
to define the common data structures.  No functional changes.

Committer notes:

Fixed contention_key.stack_id missing rename to contention_key.stack_or_task_id.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221209190727.759804-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:31 -03:00
Khem Raj
3cad53a6f9 perf python: Account for multiple words in CC
Sometimes build systems append options, e.g. --sysroot, to the CC
variable, especially in cross-compile environments like the Yocto Project,
where the CC variable is composed of the cross-compiler name and the
options it needs to work in a relocatable environment.

Therefore separate the compiler name from the rest of the options in CC,
then add the options via the second argument to the Popen() API.
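
The splitting step might look like the sketch below. It only shows the
general idea and is not the code from the patch: the helper names
split_cc and build_command are hypothetical, as is the example CC value,
and shlex is used here as one reasonable way to tokenize the variable.

```python
import shlex


def split_cc(cc):
    """Split a CC value into the compiler name and its extra options."""
    parts = shlex.split(cc)
    return parts[0], parts[1:]


def build_command(cc, extra_args):
    """Build an argv list suitable for subprocess.Popen()."""
    compiler, options = split_cc(cc)
    return [compiler] + options + list(extra_args)


# Hypothetical cross-compile CC value with options appended.
cc = "arm-poky-linux-gcc --sysroot=/opt/sysroot -O2"
print(build_command(cc, ["-v"]))
```

Passing the whole CC string as the program name would fail because no
executable is called "arm-poky-linux-gcc --sysroot=/opt/sysroot -O2".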

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Khem Raj <raj.khem@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Keeping <john@metanate.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Link: https://lore.kernel.org/r/20221205025534.150006-1-raj.khem@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:31 -03:00
Namhyung Kim
167b266bf6 perf off_cpu: Fix a typo in BTF tracepoint name, it should be 'btf_trace_sched_switch'
In BTF, tracepoint definitions have the "btf_trace_" prefix.  The
off-cpu profiler needs to check the signature of the sched_switch event
using that definition.  But there was a typo (s/bpf/btf/), so the check
always failed.

Fixes: b36888f71c ("perf record: Handle argument change in sched_switch")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: bpf@vger.kernel.org
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20221208182636.524139-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:31 -03:00
Athira Rajeev
232b82d201 perf test: Update event group check for support of uncore event
The event group test checks group creation for combinations of hw, sw
and uncore PMU events. Some of the uncore pmus may require additional
permission to access the counters.

For example, in the case of hv_24x7, the partition needs to have
permissions to access the hv_24x7 pmu counters. If not, event_open will
fail. Hence add a sanity check to see if event_open succeeds before
proceeding with the test.

Fixes: 9d9b22beda ("perf test: Add event group test for events in multiple PMUs")
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20221207165815.774-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:23:36 -03:00
Arnaldo Carvalho de Melo
b9a49f8cb0 perf tools: Check if libtraceevent has TEP_FIELD_IS_RELATIVE
Some distros have older versions of libtraceevent where
TEP_FIELD_IS_RELATIVE and its associated semantics are not present, so
we need to check whether the installed version has it; it was introduced
in libtraceevent 1.5.0.

Reported-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:16:12 -03:00
Ian Rogers
4171925aa9 tools lib traceevent: Remove libtraceevent
libtraceevent is now out-of-date and it is better to depend on the
system version. Remove this code that is no longer depended upon by
any builds.

Committer notes:

Dropped the now-removed tools/lib/traceevent/ from tools/perf/MANIFEST, so
that 'make perf-tar-src-pkg' works.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20221130062935.2219247-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:16:12 -03:00
Ian Rogers
378ef0f5d9 perf build: Use libtraceevent from the system
Remove the LIBTRACEEVENT_DYNAMIC and LIBTRACEFS_DYNAMIC make command
line variables.

If libtraceevent isn't installed or NO_LIBTRACEEVENT=1 is passed to the
build, don't compile in libtraceevent and libtracefs support.

This also disables CONFIG_TRACE that controls "perf trace".

CONFIG_LIBTRACEEVENT is used to control enablement in Build/Makefiles,
HAVE_LIBTRACEEVENT is used in C code.

Without HAVE_LIBTRACEEVENT tracepoints are disabled and as such the
commands kmem, kwork, lock, sched and timechart are removed.  The
majority of commands continue to work including "perf test".

Committer notes:

Fixed up a tools/perf/util/Build reject and added:

  #include <traceevent/event-parse.h>

to tools/perf/util/scripting-engines/trace-event-perl.c.

Committer testing:

  $ rpm -qi libtraceevent-devel
  Name        : libtraceevent-devel
  Version     : 1.5.3
  Release     : 2.fc36
  Architecture: x86_64
  Install Date: Mon 25 Jul 2022 03:20:19 PM -03
  Group       : Unspecified
  Size        : 27728
  License     : LGPLv2+ and GPLv2+
  Signature   : RSA/SHA256, Fri 15 Apr 2022 02:11:58 PM -03, Key ID 999f7cbf38ab71f4
  Source RPM  : libtraceevent-1.5.3-2.fc36.src.rpm
  Build Date  : Fri 15 Apr 2022 10:57:01 AM -03
  Build Host  : buildvm-x86-05.iad2.fedoraproject.org
  Packager    : Fedora Project
  Vendor      : Fedora Project
  URL         : https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/
  Bug URL     : https://bugz.fedoraproject.org/libtraceevent
  Summary     : Development headers of libtraceevent
  Description :
  Development headers of libtraceevent-libs
  $

Default build:

  $ ldd ~/bin/perf | grep tracee
  	libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1dcaf8f000)
  $

  # perf trace -e sched:* --max-events 10
       0.000 migration/0/17 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, dest_cpu: 1)
       0.005 migration/0/17 sched:sched_wake_idle_without_ipi(cpu: 1)
       0.011 migration/0/17 sched:sched_switch(prev_comm: "", prev_pid: 17 (migration/0), prev_state: 1, next_comm: "", next_prio: 120)
       1.173 :0/0 sched:sched_wakeup(comm: "", pid: 3138 (gnome-terminal-), prio: 120)
       1.180 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 3138 (gnome-terminal-), next_prio: 120)
       0.156 migration/1/21 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, orig_cpu: 1, dest_cpu: 2)
       0.160 migration/1/21 sched:sched_wake_idle_without_ipi(cpu: 2)
       0.166 migration/1/21 sched:sched_switch(prev_comm: "", prev_pid: 21 (migration/1), prev_state: 1, next_comm: "", next_prio: 120)
       1.183 :0/0 sched:sched_wakeup(comm: "", pid: 1602985 (kworker/u16:0-f), prio: 120, target_cpu: 1)
       1.186 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 1602985 (kworker/u16:0-f), next_prio: 120)
  #

Had to tweak tools/perf/util/setup.py to make sure the python binding
shared object links with libtraceevent if -DHAVE_LIBTRACEEVENT is
present in CFLAGS.

Building with NO_LIBTRACEEVENT=1 uncovered some more build failures:

- Make building of data-convert-bt.c dependent on CONFIG_LIBTRACEEVENT=y

- perf-$(CONFIG_LIBTRACEEVENT) += scripts/

- bpf_kwork.o also needs to be dependent on CONFIG_LIBTRACEEVENT=y

- The python binding needed some fixups and util/trace-event.c can't be
  built and linked with the python binding shared object, so remove it
  in tools/perf/util/setup.py and exclude it from the list of
  dependencies in the python/perf.so Makefile.perf target.

Building without libtraceevent-devel installed uncovered more build
failures:

- The python binding tools/perf/util/python.c was assuming that
  traceevent/event-parse.h was always available, which was the case
  when we defaulted to using the in-kernel tools/lib/traceevent/ files.
  Now we need to enclose it under ifdef HAVE_LIBTRACEEVENT, just like
  the other parts of it that deal with tracepoints.

- We have to ifdef the rules in the Build files with
  CONFIG_LIBTRACEEVENT=y to build builtin-trace.c and
  tools/perf/trace/beauty/, as setting CONFIG_TRACE=y is only made
  conditional on NO_LIBTRACEEVENT=1 being passed on the make command
  line, not on libtraceevent-devel being detected in the system. The
  clean way is to simplify this: avoid having two ways of disabling
  builtin-trace.c and don't set CONFIG_TRACE=y when libtraceevent-devel
  isn't installed.

From Athira:

<quote>
tools/perf/arch/powerpc/util/Build
-perf-y += kvm-stat.o
+perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o
</quote>

Then, ditto for arm64 and s390, detected by container cross build tests.

- s/390 uses test__checkevent_tracepoint() that is now only available if
  HAVE_LIBTRACEEVENT is defined, enclose the callsite with ifdef HAVE_LIBTRACEEVENT.

Also from Athira:

<quote>
With this change, I could successfully compile in these environments:
- Without libtraceevent-devel installed
- With libtraceevent-devel installed
- With “make NO_LIBTRACEEVENT=1”
</quote>

Then, finally rename CONFIG_TRACEEVENT to CONFIG_LIBTRACEEVENT for
consistency with other libraries detected in tools/perf/.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20221205225940.3079667-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:16:12 -03:00
Ian Rogers
40769665b6 perf jevents: Parse metrics during conversion
Currently the 'MetricExpr' json value is passed from the json
file to pmu-events.c. This change introduces an expression
tree that the value is parsed into. The parsing is done largely by
using operator overloading and python's 'eval' function. Two advantages
in doing this are:

1) Broken metrics fail at compile time rather than relying on
   `perf test` to detect. `perf test` remains relevant for checking
   event encoding and actual metric use.

2) The conversion to a string from the tree can minimize the metric's
   string size, for example, preferring 1e6 over 1000000, avoiding
   multiplication by 1 and removing unnecessary whitespace. On x86
   this reduces the string size by 2,930 bytes (0.07%).
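
The two advantages above can be illustrated with a minimal sketch. This
is not the code added by this change: the names Expr, Const, Event,
BinOp, wrap, simplify and parse_metric are hypothetical, and only the
four basic arithmetic operators are handled, so real metrics may use
syntax the sketch does not cover.

```python
import re


def wrap(value):
    """Turn plain Python numbers into tree nodes (hypothetical helper)."""
    return value if isinstance(value, Expr) else Const(value)


class Expr:
    """Base node: the overloaded operators build a tree, they do not compute."""

    def __add__(self, other):
        return BinOp("+", self, wrap(other))

    def __radd__(self, other):
        return BinOp("+", wrap(other), self)

    def __sub__(self, other):
        return BinOp("-", self, wrap(other))

    def __rsub__(self, other):
        return BinOp("-", wrap(other), self)

    def __mul__(self, other):
        return BinOp("*", self, wrap(other))

    def __rmul__(self, other):
        return BinOp("*", wrap(other), self)

    def __truediv__(self, other):
        return BinOp("/", self, wrap(other))

    def __rtruediv__(self, other):
        return BinOp("/", wrap(other), self)


class Const(Expr):
    def __init__(self, value):
        self.value = float(value)

    def __str__(self):
        if not self.value.is_integer():
            return repr(self.value)
        plain = str(int(self.value))
        mantissa = plain.rstrip("0")
        zeros = len(plain) - len(mantissa)
        sci = f"{mantissa}e{zeros}"
        # Prefer the exponent form only when it is strictly shorter.
        return sci if mantissa and zeros and len(sci) < len(plain) else plain


class Event(Expr):
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name


class BinOp(Expr):
    def __init__(self, op, lhs, rhs):
        self.op, self.lhs, self.rhs = op, lhs, rhs

    def __str__(self):
        def paren(child):
            # Always bracket nested operations: safe, though not minimal.
            return f"({child})" if isinstance(child, BinOp) else str(child)

        return f"{paren(self.lhs)}{self.op}{paren(self.rhs)}"


def simplify(node):
    """Drop multiplications by 1; everything else is kept as-is."""
    if not isinstance(node, BinOp):
        return node
    lhs, rhs = simplify(node.lhs), simplify(node.rhs)
    if node.op == "*":
        if isinstance(lhs, Const) and lhs.value == 1:
            return rhs
        if isinstance(rhs, Const) and rhs.value == 1:
            return lhs
    return BinOp(node.op, lhs, rhs)


def parse_metric(text):
    """Rewrite event names into Event(...) calls, then let eval build the tree."""
    py = re.sub(r"\b[A-Za-z_][\w.:]*",
                lambda m: f'Event("{m.group(0)}")', text)
    # A malformed expression raises SyntaxError here, i.e. at conversion time.
    tree = eval(py, {"__builtins__": {}}, {"Event": Event})
    return simplify(wrap(tree))


if __name__ == "__main__":
    # prints INST_RETIRED.ANY/(CPU_CLK_UNHALTED.THREAD*1e6)
    print(parse_metric(
        "INST_RETIRED.ANY * 1 / (CPU_CLK_UNHALTED.THREAD * 1000000)"))
```

In this sketch a metric with unbalanced parentheses makes eval raise
SyntaxError during conversion, which mirrors advantage 1), and the
string produced from the tree drops "* 1", whitespace and long runs of
zeros, which mirrors advantage 2).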

In future changes it would be possible to programmatically
generate the json expressions (a single line of text and so a
pain to write manually) for an architecture using the expression
tree. This could avoid copy-pasting metrics for all architecture
variants.

v4. Doesn't simplify "0*SLOTS" to 0, as the pattern is used to fix
    Intel metrics with topdown events.
v3. Avoids generic types on standard types like set that aren't
    supported until Python 3.9, fixing an issue with Python 3.6
    reported-by John Garry. v3 also fixes minor pylint issues and adds
    a call to Simplify on the read expression tree.
v2. Improvements to type information.

Committer notes:

Added one-line fixer from Ian, see first Link: tag below.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/CAP-5=fWa=zNK_ecpWGoGggHCQx7z-oW0eGMQf19Maywg0QK=4g@mail.gmail.com
Link: https://lore.kernel.org/r/20221207055908.1385448-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:16:12 -03:00
Namhyung Kim
b897613510 perf stat: Update event skip condition for system-wide per-thread mode and merged uncore and hybrid events
print_counter_aggrdata() skips some events that have no aggregate
count.  That is actually meant for system-wide per-thread mode and
merged uncore and hybrid events.

Let's update the condition to check them explicitly.

Fixes: 91f85f98da ("perf stat: Display event stats using aggr counts")
Reported-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221206175804.391387-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:16:12 -03:00
Ian Rogers
616aa32d6f perf build: Fixes for LIBTRACEEVENT_DYNAMIC
If LIBTRACEEVENT_DYNAMIC is enabled then avoid the install step for
the plugins. If disabled correct DESTDIR so that the plugins are
installed under <lib>/traceevent/plugins.

Fixes: ef019df01e ("perf build: Install libtraceevent locally when building")
Reported-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20221205225940.3079667-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:16:12 -03:00
Arnaldo Carvalho de Melo
cc2367eebb machine: Adopt is_lock_function() from builtin-lock.c
It is used in bpf_lock_contention.c and builtin-lock.c will be made
CONFIG_LIBTRACEEVENT=y conditional, so move it to machine.c, that is
always available.

This also moves the 4 global variables for the sched and lock text start
and end into 'struct machine', as conceivably we can have that info for
several machine instances, say in some 'perf diff' like tool.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:16:12 -03:00
Ravi Bangoria
9d9b22beda perf test: Add event group test for events in multiple PMUs
Multiple events in a group can belong to one or more PMUs, however
there are some limitations.

One of the limitations is that perf doesn't allow creating a group of
events from different hw PMUs.

Write a simple test to create various combinations of hw, sw and uncore
PMU events and verify group creation succeeds or fails as expected.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20221206043237.12159-3-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:16:12 -03:00
Ravi Bangoria
336b92da1a perf tool: Move pmus list variable to a new file
The 'pmus' list variable is defined as a static variable in pmu.c.

Introduce a new pmus.c file and migrate this variable to it. Also make
it non-static so that it can be accessed from outside.

Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: carsten.haitzler@arm.com
Link: https://lore.kernel.org/r/20221206043237.12159-2-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:16:12 -03:00
Ian Rogers
5b7a29fb0b perf util: Add host_is_bigendian to util.h
Avoid libtraceevent dependency for tep_is_bigendian or trace-event.h
dependency for bigendian. Add a new host_is_bigendian to util.h, using
the compiler defined __BYTE_ORDER__ when available.

Committer notes:

Added:

 #else  /* !__BYTE_ORDER__ */

On that nested #ifdef block, as per Namhyung's suggestion.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20221130062935.2219247-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:16:12 -03:00
Ian Rogers
fce9a61914 perf util: Make header guard consistent with tool
Remove git reference by changing GIT_COMPAT_UTIL_H to __PERF_UTIL_H.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20221130062935.2219247-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:16:12 -03:00
James Clark
3f81f72d30 perf stat: Fix invalid output handle
In this context, 'os' is already a pointer so the extra dereference
isn't required. This fixes the following test failure on aarch64:

  $ ./perf test "json output" -vvv
  92: perf stat JSON output linter                                    :
  --- start ---
  Checking json output: no args Test failed for input:
  ...
  Fatal error: glibc detected an invalid stdio handle
  ---- end ----
  perf stat JSON output linter: FAILED!

Fixes: e7f4da3122 ("perf stat: Pass struct outstate to printout()")
Signed-off-by: James Clark <james.clark@arm.com>
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221130111521.334152-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:16:12 -03:00
Namhyung Kim
117195d9f8 perf stat: Fix multi-line metric output in JSON
When a metric produces more than one value, the opening bracket was
not printed.

Fixes: ab6baaae27 ("perf stat: Fix JSON output in metric-only mode")
Reported-by: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221202190447.1588680-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:16:12 -03:00
Ian Rogers
113bb39642 tools lib symbol: Add dependency test to install_headers
Compute the headers to be installed from their source headers and make
each have its own build target to install it. Using dependencies
avoids headers being reinstalled and getting a new timestamp which
then causes files that depend on the header to be rebuilt.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Rix <trix@redhat.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20221202045743.2639466-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:16:12 -03:00