linux-stable/tools/perf
Linus Torvalds 57d17378a4 perf tools changes for v5.17: 1st batch
New features:
 
 - Add 'trace' subcommand for 'perf ftrace', setting the stage for more
   'perf ftrace' subcommands. Not using a subcommand yields the previous
   behaviour of 'perf ftrace'.
 
 - Add 'latency' subcommand to 'perf ftrace', that can use the function
   graph tracer or a BPF optimized one, via the -b/--use-bpf option.
 
   E.g.:
 
   $ sudo perf ftrace latency -a -T mutex_lock sleep 1
   #   DURATION     |      COUNT | GRAPH                          |
        0 - 1    us |       4596 | ########################       |
        1 - 2    us |       1680 | #########                      |
        2 - 4    us |       1106 | #####                          |
        4 - 8    us |        546 | ##                             |
        8 - 16   us |        562 | ###                            |
       16 - 32   us |          1 |                                |
       32 - 64   us |          0 |                                |
       64 - 128  us |          0 |                                |
      128 - 256  us |          0 |                                |
      256 - 512  us |          0 |                                |
      512 - 1024 us |          0 |                                |
        1 - 2    ms |          0 |                                |
        2 - 4    ms |          0 |                                |
        4 - 8    ms |          0 |                                |
        8 - 16   ms |          0 |                                |
       16 - 32   ms |          0 |                                |
       32 - 64   ms |          0 |                                |
       64 - 128  ms |          0 |                                |
      128 - 256  ms |          0 |                                |
      256 - 512  ms |          0 |                                |
      512 - 1024 ms |          0 |                                |
        1 - ...   s |          0 |                                |
 
   The original implementation of this command was in the bcc tool.
 
 - Support --cputype option for hybrid events in 'perf stat'.
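
The 'perf ftrace latency' table above groups durations into power-of-two buckets. A minimal sketch of that kind of bucketing, assuming boundaries read off the row labels (22 rows, with millisecond rows treated as multiples of 1024 us); the names `bucket_index` and `histogram` are hypothetical and this is not the perf implementation:

```python
NUM_BUCKETS = 22  # rows in the sample output: "0 - 1 us" up to "1 - ... s"


def bucket_index(duration_ns: int) -> int:
    """Map a duration in nanoseconds to a power-of-two bucket index."""
    usecs = duration_ns // 1000  # the table is expressed in microseconds
    # int.bit_length() is floor(log2(usecs)) + 1 for usecs >= 1 and 0 for 0,
    # so 0 us -> bucket 0, 1 us -> bucket 1, 2-3 us -> bucket 2, and so on.
    # Anything beyond the last boundary lands in the final open-ended bucket.
    return min(usecs.bit_length(), NUM_BUCKETS - 1)


def histogram(durations_ns):
    """Count how many durations fall into each bucket."""
    counts = [0] * NUM_BUCKETS
    for duration in durations_ns:
        counts[bucket_index(duration)] += 1
    return counts
```

For instance, `histogram([500, 1500, 1500, 5000])` puts one sample in the first bucket, two in the second and one in the fourth.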
 
 Improvements:
 
 - Call chain improvements for ARM64.
 
 - No need to do any affinity setup when profiling pids.
 
 - Reduce multiplexing with duration_time in 'perf stat' metrics.
 
 - Improve error message for uncore events, stating that some event groups
   can only be used in system wide (-a) mode.
 
 - perf stat metric group leader fixes/improvements, including arch specific
   changes to better support Intel topdown events.
 
 - Probe non-deprecated sysfs path first, i.e. try /sys/devices/system/cpu/cpuN/topology/core_cpus
   first, then the old /sys/devices/system/cpu/cpuN/topology/thread_siblings.
 
 - Disable debuginfod by default in 'perf record', to avoid stalls on distros
   such as Fedora 35.
 
 - Use unbuffered output in 'perf bench' when pipe/tee'ing to a file.
 
 - Enable ignore_missing_thread in 'perf trace'.
 
 Fixes:
 
 - Avoid TUI crash when navigating in the annotation of recursive functions.
 
 - Fix hex dump character output in 'perf script'.
 
 - Fix JSON indentation to 4 spaces standard in the ARM vendor event files.
 
 - Fix use after free in metric__new().
 
 - Fix IS_ERR_OR_NULL() usage in the perf BPF loader.
 
 - Fix up cross-arch register support, i.e. when printing register names take
   into account the architecture where the perf.data file was collected.
 
 - Fix SMT fallback with large core counts.
 
 - Don't lower case MetricExpr when parsing JSON files so as not to lose info
   such as the ":G" event modifier in metrics.
 
 perf test:
 
 - Add basic stress test for sigtrap handling to 'perf test'.
 
 - Fix 'perf test' failures on s/390.
 
 - Enable system wide for metricgroups test in 'perf test'.
 
 - Use 3 digits for test numbering now we can have more tests.
 
 Arch specific:
 
 - Add events for Arm Neoverse N2 in the ARM JSON vendor event files.
 
 - Support PERF_MEM_LVLNUM encodings in powerpc. This came from a single
   patch series whose kernel bits I incorrectly merged; those were then
   reverted after coordination with Michael Ellerman and Stephen Rothwell.
 
 - Add ARM SPE total latency as PERF_SAMPLE_WEIGHT.
 
 - Update AMD documentation, with info on raw event encoding.
 
 - Add support for global and local variants of the "p_stage_cyc" sort key,
   applicable to perf.data files collected on powerpc.
 
 - Remove duplicate and incorrect aux size checks in the ARM CoreSight ETM code.
 
 Refactorings:
 
 - Add a perf_cpu abstraction to disambiguate CPUs and CPU map indexes, fixing
   problems along the way.
 
 - Document CPU map methods.
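
The idea behind giving CPUs their own type can be illustrated with a small sketch: a CPU number is wrapped in a distinct type, while positions in a CPU map stay plain integers, so mixing the two up is caught early. This is a Python analogue under assumptions; `PerfCpu`, `CpuMap` and their methods are hypothetical names, not the perf API:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PerfCpu:
    """A CPU number, wrapped so it cannot be mistaken for a map index."""
    cpu: int


class CpuMap:
    """A sorted set of CPUs; positions in the map are plain int indexes."""

    def __init__(self, cpus: List[int]):
        self._cpus = [PerfCpu(c) for c in sorted(set(cpus))]

    def cpu(self, idx: int) -> PerfCpu:
        """Return the CPU stored at map index idx."""
        return self._cpus[idx]

    def idx(self, cpu: PerfCpu) -> int:
        """Return the map index of cpu, or -1 if it is not in the map."""
        if not isinstance(cpu, PerfCpu):
            raise TypeError("expected a PerfCpu, not a bare integer")
        try:
            return self._cpus.index(cpu)
        except ValueError:
            return -1
```

With a map built from CPUs 4, 6 and 2, index 0 holds CPU 2 and CPU 6 sits at index 2; passing a bare integer where a CPU is expected raises an error instead of silently being treated as a CPU number.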
 
 UAPI sync:
 
 - Update arch/x86/lib/mem{cpy,set}_64.S copies used in 'perf bench mem memcpy'.
 
 - Sync UAPI files with the kernel sources: drm, msr-index, cpufeatures.
 
 Build system:
 
 - Enable warnings through HOSTCFLAGS.
 
 - Drop requirement for libstdc++.so for libopencsd check.
 
 libperf:
 
 - Make libperf adopt perf_counts_values__scale() from tools/perf/util/.
 
 - Add a stat multiplexing test to libperf.
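
When events are multiplexed, a counter only runs for part of the time it is enabled, and the raw value is usually extrapolated by the enabled/running ratio. A minimal sketch of that scaling, assuming the conventional val/ena/run triple; `CountsValues` and `scale` are hypothetical names, and the exact behaviour of perf_counts_values__scale() should be checked against the source:

```python
from dataclasses import dataclass


@dataclass
class CountsValues:
    val: int  # raw counter value
    ena: int  # time the event was enabled
    run: int  # time the event was actually counting


def scale(count: CountsValues) -> int:
    """Extrapolate count.val in place for time the event was not running.

    Returns -1 if the event never ran, 1 if the value was scaled,
    and 0 if no scaling was needed.
    """
    if count.run == 0:
        count.val = 0
        return -1
    if count.run < count.ena:
        count.val = int(count.val * count.ena / count.run)
        return 1
    return 0
```

An event that counted 1000 while running for half of its enabled time is reported as 2000 after scaling.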
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Merge tag 'perf-tools-for-v5.17-2022-01-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tool updates from Arnaldo Carvalho de Melo:
 "New features:

   - Add 'trace' subcommand for 'perf ftrace', setting the stage for
     more 'perf ftrace' subcommands. Not using a subcommand yields the
     previous behaviour of 'perf ftrace'.

   - Add 'latency' subcommand to 'perf ftrace', that can use the
     function graph tracer or a BPF optimized one, via the -b/--use-bpf
     option.

     E.g.:

	$ sudo perf ftrace latency -a -T mutex_lock sleep 1
	#   DURATION     |      COUNT | GRAPH                          |
	     0 - 1    us |       4596 | ########################       |
	     1 - 2    us |       1680 | #########                      |
	     2 - 4    us |       1106 | #####                          |
	     4 - 8    us |        546 | ##                             |
	     8 - 16   us |        562 | ###                            |
	    16 - 32   us |          1 |                                |
	    32 - 64   us |          0 |                                |
	    64 - 128  us |          0 |                                |
	   128 - 256  us |          0 |                                |
	   256 - 512  us |          0 |                                |
	   512 - 1024 us |          0 |                                |
	     1 - 2    ms |          0 |                                |
	     2 - 4    ms |          0 |                                |
	     4 - 8    ms |          0 |                                |
	     8 - 16   ms |          0 |                                |
	    16 - 32   ms |          0 |                                |
	    32 - 64   ms |          0 |                                |
	    64 - 128  ms |          0 |                                |
	   128 - 256  ms |          0 |                                |
	   256 - 512  ms |          0 |                                |
	   512 - 1024 ms |          0 |                                |
	     1 - ...   s |          0 |                                |

     The original implementation of this command was in the bcc tool.

   - Support --cputype option for hybrid events in 'perf stat'.

  Improvements:

   - Call chain improvements for ARM64.

   - No need to do any affinity setup when profiling pids.

   - Reduce multiplexing with duration_time in 'perf stat' metrics.

   - Improve error message for uncore events, stating that some event
     groups can only be used in system wide (-a) mode.

   - perf stat metric group leader fixes/improvements, including arch
     specific changes to better support Intel topdown events.

   - Probe non-deprecated sysfs path first, i.e. try the path
     /sys/devices/system/cpu/cpuN/topology/core_cpus first, then
     the old /sys/devices/system/cpu/cpuN/topology/thread_siblings.

   - Disable debuginfod by default in 'perf record', to avoid stalls on
     distros such as Fedora 35.

   - Use unbuffered output in 'perf bench' when pipe/tee'ing to a file.

   - Enable ignore_missing_thread in 'perf trace'.

  Fixes:

   - Avoid TUI crash when navigating in the annotation of recursive
     functions.

   - Fix hex dump character output in 'perf script'.

   - Fix JSON indentation to 4 spaces standard in the ARM vendor event
     files.

   - Fix use after free in metric__new().

   - Fix IS_ERR_OR_NULL() usage in the perf BPF loader.

   - Fix up cross-arch register support, i.e. when printing register
     names take into account the architecture where the perf.data file
     was collected.

   - Fix SMT fallback with large core counts.

   - Don't lower case MetricExpr when parsing JSON files so as not to
     lose info such as the ":G" event modifier in metrics.

  perf test:

   - Add basic stress test for sigtrap handling to 'perf test'.

   - Fix 'perf test' failures on s/390.

   - Enable system wide for metricgroups test in 'perf test'.

   - Use 3 digits for test numbering now we can have more tests.

  Arch specific:

   - Add events for Arm Neoverse N2 in the ARM JSON vendor event files.

   - Support PERF_MEM_LVLNUM encodings in powerpc. This came from a
     single patch series whose kernel bits I incorrectly merged; those
     were then reverted after coordination with Michael Ellerman and
     Stephen Rothwell.

   - Add ARM SPE total latency as PERF_SAMPLE_WEIGHT.

   - Update AMD documentation, with info on raw event encoding.

   - Add support for global and local variants of the "p_stage_cyc" sort
     key, applicable to perf.data files collected on powerpc.

   - Remove duplicate and incorrect aux size checks in the ARM CoreSight
     ETM code.

  Refactorings:

   - Add a perf_cpu abstraction to disambiguate CPUs and CPU map
     indexes, fixing problems along the way.

   - Document CPU map methods.

  UAPI sync:

   - Update arch/x86/lib/mem{cpy,set}_64.S copies used in 'perf bench
     mem memcpy'.

   - Sync UAPI files with the kernel sources: drm, msr-index,
     cpufeatures.

  Build system:

   - Enable warnings through HOSTCFLAGS.

   - Drop requirement for libstdc++.so for libopencsd check.

  libperf:

   - Make libperf adopt perf_counts_values__scale() from tools/perf/util/.

   - Add a stat multiplexing test to libperf"

* tag 'perf-tools-for-v5.17-2022-01-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (115 commits)
  perf record: Disable debuginfod by default
  perf evlist: No need to do any affinity setup when profiling pids
  perf cpumap: Add is_dummy() method
  perf metric: Fix metric_leader
  perf cputopo: Fix CPU topology reading on s/390
  perf metricgroup: Fix use after free in metric__new()
  libperf tests: Update a use of the new cpumap API
  perf arm: Fix off-by-one directory path
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  tools headers cpufeatures: Sync with the kernel sources
  tools headers UAPI: Update tools's copy of drm.h header
  tools arch: Update arch/x86/lib/mem{cpy,set}_64.S copies used in 'perf bench mem memcpy'
  perf pmu-events: Don't lower case MetricExpr
  perf expr: Add debug logging for literals
  perf tools: Probe non-deprecated sysfs path 1st
  perf tools: Fix SMT fallback with large core counts
  perf cpumap: Give CPUs their own type
  perf stat: Correct first_shadow_cpu to return index
  perf script: Fix flipped index and cpu
  perf c2c: Use more intention revealing iterator
  ...
2022-01-18 06:32:11 +02:00
arch perf cpumap: Give CPUs their own type 2022-01-12 14:28:23 -03:00
bench perf cpumap: Give CPUs their own type 2022-01-12 14:28:23 -03:00
dlfilters perf dlfilter: Drop unused variable 2021-12-16 12:18:11 -03:00
Documentation perf record: Disable debuginfod by default 2022-01-15 17:41:25 -03:00
examples/bpf
include perf build: Move perf_dlfilters.h in the source tree 2021-08-11 09:35:24 -03:00
jvmti
pmu-events perf pmu-events: Don't lower case MetricExpr 2022-01-12 15:02:48 -03:00
python
scripts perf scripts python: intel-pt-events.py: Fix printing of switch events 2021-12-28 17:26:25 -03:00
tests Merge remote-tracking branch 'torvalds/master' into perf/core 2022-01-13 10:20:59 -03:00
trace perf beauty: Add socket level scnprintf that handles ARCH specific SOL_SOCKET 2021-11-12 10:40:34 -03:00
ui perf annotate: Avoid TUI crash when navigating in the annotation of recursive functions 2022-01-10 15:47:30 -03:00
util perf tools changes for v5.17: 1st batch 2022-01-18 06:32:11 +02:00
.gitignore Add 'tools/perf/libbpf/' to ignored files 2021-11-08 11:33:35 -08:00
Build
builtin-annotate.c perf tools: Check vmlinux/kallsyms arguments in all tools 2021-11-07 12:27:38 -03:00
builtin-bench.c perf bench: Use unbuffered output when pipe/tee'ing to a file 2021-12-16 12:18:11 -03:00
builtin-buildid-cache.c perf record: Disable debuginfod by default 2022-01-15 17:41:25 -03:00
builtin-buildid-list.c
builtin-c2c.c perf cpumap: Give CPUs their own type 2022-01-12 14:28:23 -03:00
builtin-config.c
builtin-daemon.c perf daemon: Remove duplicate sys/file.h include 2021-10-08 15:14:50 -03:00
builtin-data.c perf data: Correct -h output 2021-08-31 15:12:00 -03:00
builtin-diff.c
builtin-evlist.c
builtin-ftrace.c perf cpumap: Give CPUs their own type 2022-01-12 14:28:23 -03:00
builtin-help.c
builtin-inject.c perf inject: Fix segfault due to perf_data__fd() without open 2021-12-18 08:31:14 -03:00
builtin-kallsyms.c
builtin-kmem.c perf cpumap: Give CPUs their own type 2022-01-12 14:28:23 -03:00
builtin-kvm.c perf tools: Allow controlling synthesizing PERF_RECORD_ metadata events during record 2021-09-17 08:44:19 -03:00
builtin-list.c perf list: Display hybrid PMU events with cpu type 2021-10-25 13:47:42 -03:00
builtin-lock.c
builtin-mem.c
builtin-probe.c perf tools: Check vmlinux/kallsyms arguments in all tools 2021-11-07 12:27:38 -03:00
builtin-record.c perf record: Disable debuginfod by default 2022-01-15 17:41:25 -03:00
builtin-report.c perf callchain: Enable dwarf_callchain_users on arm64 2021-12-21 18:35:44 -03:00
builtin-sched.c perf cpumap: Give CPUs their own type 2022-01-12 14:28:23 -03:00
builtin-script.c perf cpumap: Give CPUs their own type 2022-01-12 14:28:23 -03:00
builtin-stat.c perf cpumap: Give CPUs their own type 2022-01-12 14:28:23 -03:00
builtin-timechart.c
builtin-top.c perf tools: Check vmlinux/kallsyms arguments in all tools 2021-11-07 12:27:38 -03:00
builtin-trace.c perf tools changes for v5.17: 1st batch 2022-01-18 06:32:11 +02:00
builtin-version.c
builtin.h
check-headers.sh tools lib: Adopt list_sort() from the kernel sources 2021-10-20 10:30:59 -03:00
command-list.txt
CREDITS
design.txt perf design.txt: Synchronize the definition of enum perf_hw_id with code 2021-11-13 18:11:50 -03:00
Makefile
Makefile.config perf tools: Drop requirement for libstdc++.so for libopencsd check 2021-12-07 22:18:24 -03:00
Makefile.perf perf ftrace: Add -b/--use-bpf option for latency subcommand 2021-12-16 12:18:12 -03:00
MANIFEST perf MANIFEST: Add bpftool files to allow building with BUILD_BPF_SKEL=1 2021-11-07 15:39:28 -03:00
perf-archive.sh
perf-completion.sh
perf-iostat.sh
perf-read-vdso.c
perf-sys.h
perf-with-kcore.sh
perf.c
perf.h