Disable building BPF based features by default for v6.4.

We need to better polish building with BPF skels, so revert to
 making it an experimental feature that has to be explicitly enabled
 using BUILD_BPF_SKEL=1.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZFbCXwAKCRCyPKLppCJ+
 J7cHAP97erKY4hBXArjpfzcvpFmboh/oqhbTLntyIpS6TEnOyQEAyervAPGIjQYC
 DCo4foyXmOWn3dhNtK9M+YiRl3o2SgQ=
 =7G78
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v6.4-3-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tool updates from Arnaldo Carvalho de Melo:
 "Third version of perf tool updates, with the build problems with with
  using a 'vmlinux.h' generated from the main build fixed, and the bpf
  skeleton build disabled by default.

  Build:

   - Require libtraceevent to build, one can disable it using
     NO_LIBTRACEEVENT=1.

     It is required for tools like 'perf sched', 'perf kvm', 'perf
     trace', etc.

     libtraceevent is available in most distros so installing
     'libtraceevent-devel' should be a one-time event to continue
     building perf as usual.

     Using NO_LIBTRACEEVENT=1 produces tooling that is functional and
     sufficient for lots of users not interested in those libtraceevent
     dependent features.
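
     For example, building without those features is just:

       $ make -C tools/perf NO_LIBTRACEEVENT=1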

   - Allow Python support in 'perf script' when libtraceevent isn't
     linked, as not all features require it; for instance Intel PT does
     not use tracepoints.

   - Error out if the python interpreter needed for jevents to work
     isn't available and NO_JEVENTS=1 isn't set, instead of silently
     producing a build without support for JSON vendor events, which is
     a rare but possible condition. The two check error messages:

        $(error ERROR: No python interpreter needed for jevents generation. Install python or build with NO_JEVENTS=1.)
        $(error ERROR: Python interpreter needed for jevents generation too old (older than 3.6). Install a newer python or build with NO_JEVENTS=1.)

   - Make libbpf 1.0 the minimum required when building with an
     out-of-tree, distro-provided libbpf.
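
     The minimum is enforced with a version guard, the same one added to
     the libbpf feature test in this series:

       #include <bpf/libbpf.h>
       #if !defined(LIBBPF_MAJOR_VERSION) || (LIBBPF_MAJOR_VERSION < 1)
       #error At least libbpf 1.0 is required for Linux tools.
       #endif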

   - Use libstdc++'s and LLVM libcxx's __cxa_demangle, a portable C++
     demangler, and add a 'perf test' entry for it.

   - Make the binutils libraries opt-in, as distros disable building
     with them due to licensing; they were used for C++ demangling, for
     instance.

   - Switch libpfm4 to opt-out rather than opt-in; if libpfm-devel (or
     equivalent) isn't installed, we'll just get a build warning:

       Makefile.config:1144: libpfm4 not found, disables libpfm4 support. Please install libpfm4-dev

   - Add a feature test for scandirat(), which is not implemented so far
     in musl and uclibc, disabling features that need it, such as
     scanning for tracepoints in /sys/kernel/tracing/events.
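
     For reference, a minimal sketch of the probed glibc API (not perf
     code, just an illustration of scandirat()):

       #define _GNU_SOURCE
       #include <dirent.h>
       #include <fcntl.h>
       #include <stdio.h>
       #include <stdlib.h>

       int main(void)
       {
               int dirfd = open("/sys/kernel/tracing/events",
                                O_RDONLY | O_DIRECTORY);
               struct dirent **entries;
               int n, i;

               if (dirfd < 0)
                       return 1;
               /* scan relative to an already-open directory fd */
               n = scandirat(dirfd, ".", &entries, NULL, alphasort);
               for (i = 0; i < n; i++) {
                       printf("%s\n", entries[i]->d_name);
                       free(entries[i]);
               }
               if (n >= 0)
                       free(entries);
               return 0;
       }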

  perf BPF filters:

   - New feature where BPF can be used to filter samples, for instance:

      $ sudo ./perf record -e cycles --filter 'period > 1000' true
      $ sudo ./perf script
           perf-exec 2273949 546850.708501:       5029 cycles:  ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms])
           perf-exec 2273949 546850.708508:      32409 cycles:  ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms])
           perf-exec 2273949 546850.708526:     143369 cycles:  ffffffff82b4cdbf xas_start+0x5f ([kernel.kallsyms])
           perf-exec 2273949 546850.708600:     372650 cycles:  ffffffff8286b8f7 __pagevec_lru_add+0x117 ([kernel.kallsyms])
           perf-exec 2273949 546850.708791:     482953 cycles:  ffffffff829190de __mod_memcg_lruvec_state+0x4e ([kernel.kallsyms])
                true 2273949 546850.709036:     501985 cycles:  ffffffff828add7c tlb_gather_mmu+0x4c ([kernel.kallsyms])
                true 2273949 546850.709292:     503065 cycles:      7f2446d97c03 _dl_map_object_deps+0x973 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)

   - In addition to 'period' (PERF_SAMPLE_PERIOD), the other
     PERF_SAMPLE_ fields can be used for filtering, as well as some
     other sample-accessible values; from
     tools/perf/Documentation/perf-record.txt:

        Essentially the BPF filter expression is:

        <term> <operator> <value> (("," | "||") <term> <operator> <value>)*

     The <term> can be one of:
        ip, id, tid, pid, cpu, time, addr, period, txn, weight, phys_addr,
        code_pgsz, data_pgsz, weight1, weight2, weight3, ins_lat, retire_lat,
        p_stage_cyc, mem_op, mem_lvl, mem_snoop, mem_remote, mem_lock,
        mem_dtlb, mem_blk, mem_hops

     The <operator> can be one of:
        ==, !=, >, >=, <, <=, &

     The <value> can be one of:
        <number> (for any term)
        na, load, store, pfetch, exec (for mem_op)
        l1, l2, l3, l4, cxl, io, any_cache, lfb, ram, pmem (for mem_lvl)
        na, none, hit, miss, hitm, fwd, peer (for mem_snoop)
        remote (for mem_remote)
        na, locked (for mem_locked)
        na, l1_hit, l1_miss, l2_hit, l2_miss, any_hit, any_miss, walk, fault (for mem_dtlb)
        na, by_data, by_addr (for mem_blk)
        hops0, hops1, hops2, hops3 (for mem_hops)
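
     Multiple filters are combined with ',' (AND) or '||' (OR). The
     mem_* terms come from the sample data source, so they need 'perf
     record -d'. A hypothetical combined invocation ('./workload' is a
     placeholder):

       $ sudo perf record -e cycles -d --filter 'mem_op == load || mem_op == store, mem_lvl > l1' -- ./workload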

  perf lock contention:

   - Show lock type with address.

   - Track and show mmap_lock, siglock and per-cpu rq_lock with address.
     This is done for mmap_lock by following the current->mm pointer:

      $ sudo ./perf lock con -abl -- sleep 10
       contended   total wait     max wait     avg wait            address   symbol
       ...
           16344    312.30 ms      2.22 ms     19.11 us   ffff8cc702595640
           17686    310.08 ms      1.49 ms     17.53 us   ffff8cc7025952c0
               3     84.14 ms     45.79 ms     28.05 ms   ffff8cc78114c478   mmap_lock
            3557     76.80 ms     68.75 us     21.59 us   ffff8cc77ca3af58
               1     68.27 ms     68.27 ms     68.27 ms   ffff8cda745dfd70
               9     54.53 ms      7.96 ms      6.06 ms   ffff8cc7642a48b8   mmap_lock
           14629     44.01 ms     60.00 us      3.01 us   ffff8cc7625f9ca0
            3481     42.63 ms    140.71 us     12.24 us   ffffffff937906ac   vmap_area_lock
           16194     38.73 ms     42.15 us      2.39 us   ffff8cd397cbc560
              11     38.44 ms     10.39 ms      3.49 ms   ffff8ccd6d12fbb8   mmap_lock
               1      5.43 ms      5.43 ms      5.43 ms   ffff8cd70018f0d8
            1674      5.38 ms    422.93 us      3.21 us   ffffffff92e06080   tasklist_lock
             581      4.51 ms    130.68 us      7.75 us   ffff8cc9b1259058
               5      3.52 ms      1.27 ms    703.23 us   ffff8cc754510070
             112      3.47 ms     56.47 us     31.02 us   ffff8ccee38b3120
             381      3.31 ms     73.44 us      8.69 us   ffffffff93790690   purge_vmap_area_lock
             255      3.19 ms     36.35 us     12.49 us   ffff8d053ce30c80

   - Update default map size to 16384.

   - Allocate the single letter option -M for --map-nr-entries, as it is
     proving to be frequently used.
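
     For example (a hypothetical invocation bumping the map size):

       # perf lock con -abl -M 32768 -- sleep 10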

   - Fix struct rq lock access for older kernels with BPF's CO-RE
     (Compile once, run everywhere).

   - Fix problems found with MSan (MemorySanitizer).

  perf report/top:

   - Add inline information when using --call-graph=fp or lbr, as was
     already done for the --call-graph=dwarf callchain mode.

   - Improve the 'srcfile' sort key performance by really using an
     optimization introduced in 6.2 for the 'srcline' sort key that
     avoids calling addr2line for comparison with each sample.

  perf sched:

   - Make 'perf sched latency/map/replay' use "sched:sched_waking"
     instead of "sched:sched_wakeup", consistent with 'perf record'
     since d566a9c2d4 ("perf sched: Prefer sched_waking event when it
     exists").

  perf ftrace:

   - Make system-wide the default target for the latency subcommand; run
     the following command, then generate some network traffic and press
     control+C:

       # perf ftrace latency -T __kfree_skb
     ^C
         DURATION     |      COUNT | GRAPH                                          |
          0 - 1    us |         27 | #############                                  |
          1 - 2    us |         22 | ###########                                    |
          2 - 4    us |          8 | ####                                           |
          4 - 8    us |          5 | ##                                             |
          8 - 16   us |         24 | ############                                   |
         16 - 32   us |          2 | #                                              |
         32 - 64   us |          1 |                                                |
         64 - 128  us |          0 |                                                |
        128 - 256  us |          0 |                                                |
        256 - 512  us |          0 |                                                |
        512 - 1024 us |          0 |                                                |
          1 - 2    ms |          0 |                                                |
          2 - 4    ms |          0 |                                                |
          4 - 8    ms |          0 |                                                |
          8 - 16   ms |          0 |                                                |
         16 - 32   ms |          0 |                                                |
         32 - 64   ms |          0 |                                                |
         64 - 128  ms |          0 |                                                |
        128 - 256  ms |          0 |                                                |
        256 - 512  ms |          0 |                                                |
        512 - 1024 ms |          0 |                                                |
          1 - ...   s |          0 |                                                |
       #

  perf top:

   - Add the --branch-history (LBR: Last Branch Record) option, like the
     one already available for 'perf record'.

   - Fix segfault in thread__comm_len() where thread->comm was being
     used outside thread->comm_lock.

  perf annotate:

   - Allow configuring objdump and addr2line in ~/.perfconfig, so that
     you can use alternative binaries, such as LLVM's.
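
     For instance (a sketch; the llvm tool paths are placeholders, while
     the config keys are the ones documented in this merge):

       [annotate]
           objdump = /usr/bin/llvm-objdump
           addr2line = /usr/bin/llvm-addr2line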

  perf kvm:

   - Add TUI mode for 'perf kvm stat report'.

  Reference counting:

   - Add reference count checking infrastructure to check for use after
     free, done to the 'cpumap', 'namespaces', 'maps' and 'map' structs,
     with more to come.

     To build with it, use -DREFCNT_CHECKING=1 on the make command line
     when building tools/perf, as in the sketch below. Documented at:

       https://perf.wiki.kernel.org/index.php/Reference_Count_Checking
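
     A minimal sketch of such a build, passing the define via
     EXTRA_CFLAGS:

       $ make -C tools/perf EXTRA_CFLAGS="-DREFCNT_CHECKING=1"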

   - The above caught, for instance, this fix, present in this series:

        - Fix maps use after put in 'perf test "Share thread maps"':

          'maps' is copied from the leader, but the leader is put on
          line 79 and then 'maps' is used to read the reference count
          below - so a use after put, with the put of 'maps' happening
          within thread__put.

     Fixed by reversing the order of puts so that the leader is put
     last.

   - Also several fixes were made to places where reference counts were
     not being held.

   - Make this one of the tests in 'make -C tools/perf build-test' to
     regularly build-test it and to make sure no direct accesses to the
     reference counted structs are made, doing that instead via
     accessors that check the validity of the struct pointer.

  ARM64:

   - Fix 'perf report' segfault when filtering coresight traces by
     sparse lists of CPUs.

   - Add support for 'simd' as a sort field for 'perf report', to show
     ARM's NEON SIMD's predicate flags: "partial" and "empty".
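
     For example, adding it to the default sort keys:

       # perf report --sort comm,dso,symbol,simd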

  arm64 vendor events:

   - Add N1 metrics.

  Intel vendor events:

   - Add graniterapids, grandridge and sierraforrest events.

   - Refresh events for: alderlake, alderlaken, broadwell, broadwellde,
     broadwellx, cascadelakex, haswell, haswellx, icelake, icelakex,
     jaketown, meteorlake, knightslanding, sandybridge, sapphirerapids,
     silvermont, skylake, tigerlake and westmereep-dp.

   - Refresh metrics for alderlake-n, broadwell, broadwellde,
     broadwellx, haswell, haswellx, icelakex, ivybridge, ivytown and
     skylakex.

  perf stat:

   - Implement --topdown using JSON metrics.
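
     Sample usage, as in the updated topdown.txt documentation:

       % perf stat -a --topdown -I1000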

   - Add the TopdownL1 JSON metric as a default if present, but disable
     it for now for some Intel hybrid architectures; a series of patches
     addressing this is being reviewed and will be submitted for v6.5.

   - Use metrics for --smi-cost.

   - Update topdown documentation.

  Vendor events (JSON) infrastructure:

   - Add support for computing and printing metric threshold values. For
     instance, here is one found in the sapphirerapids JSON file:

       {
           "BriefDescription": "Percentage of cycles spent in System Management Interrupts.",
           "MetricExpr": "((msr@aperf@ - cycles) / msr@aperf@ if msr@smi@ > 0 else 0)",
           "MetricGroup": "smi",
           "MetricName": "smi_cycles",
           "MetricThreshold": "smi_cycles > 0.1",
           "ScaleUnit": "100%"
       },

   - Test parsing metric thresholds with the fake PMU in 'perf test
     pmu-events'.

   - Support for printing metric thresholds in 'perf list'.

   - Add --metric-no-threshold option to 'perf stat'.
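
     A hypothetical invocation, disabling thresholds for the smi_cycles
     metric shown above:

       $ perf stat --metric-no-threshold -M smi_cycles -a -- sleep 1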

   - Add rand (reverse and) and has_pmem (Optane memory) support to
     metrics.

   - Sort the list of input files to avoid depending on the order from
     readdir(), helping to obtain reproducible builds.

  S/390:

   - Add common metrics: CPI (cycles per instruction), prbstate (ratio
     of instructions executed in problem state compared to the total
     number of instructions) and l1mp (Level 1 instruction and data
     cache misses per 100 instructions).

   - Add cache metrics for z13, z14, z15 and z16.

   - Add metric for TLB and cache.

  ARM:

   - Add raw decoding for SPE (Statistical Profiling Extension) v1.3 MTE
     (Memory Tagging Extension) and MOPS (Memory Operations) load/store.

  Intel PT hardware tracing:

   - Add event type names UINTR (User interrupt delivered) and UIRET
     (Exiting from user interrupt routine), documented in table 32-50
     "CFE Packet Type and Vector Fields Details" in the Intel Processor
     Trace chapter of The Intel SDM Volume 3 version 078.

   - Add support for new branch instructions ERETS and ERETU.

   - Fix CYC timestamps after standalone CBR.

  ARM CoreSight hardware tracing:

   - Allow user to override timestamp and contextid settings.

   - Fix segfault in dso lookup.

   - Fix timeless decode mode detection.

   - Add separate decode paths for timeless and per-thread modes.

  auxtrace:

   - Fix address filter entire kernel size.

  Miscellaneous:

   - Fix use-after-free and unaligned bugs in the PLT handling routines.

   - Use zfree() to reduce chances of use after free.
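
     zfree() frees and NULLs the pointer in one step; a sketch of the
     idiom (the field is illustrative):

       free(evsel->name);         /* pointer left dangling */
       zfree(&evsel->name);       /* freed and set to NULL */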

   - Add missing 0x prefix for addresses printed in hexadecimal in 'perf
     probe'.

   - Suppress massive unsupported target platform errors in the unwind
     code.

   - Fix elf_read_build_id() returning an incorrect build_id size.

   - Fix 'perf scripts intel-pt-events.py' IPC output for Python 2.

   - Add missing new parameter in kfree_skb tracepoint to the python
     scripts using it.

   - Add 'perf bench syscall fork' benchmark.
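
     Runnable as:

       $ perf bench syscall fork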

   - Add support for printing PERF_MEM_LVLNUM_UNC (Uncached access) in
     'perf mem'.

   - Fix wrong size expectation for perf test 'Setup struct
     perf_event_attr' caused by the patch adding
     perf_event_attr::config3.

   - Fix some spelling mistakes"

* tag 'perf-tools-for-v6.4-3-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (365 commits)
  Revert "perf build: Make BUILD_BPF_SKEL default, rename to NO_BPF_SKEL"
  Revert "perf build: Warn for BPF skeletons if endian mismatches"
  perf metrics: Fix SEGV with --for-each-cgroup
  perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE
  perf stat: Separate bperf from bpf_profiler
  perf test record+probe_libc_inet_pton: Fix call chain match on x86_64
  perf test record+probe_libc_inet_pton: Fix call chain match on s390
  perf tracepoint: Fix memory leak in is_valid_tracepoint()
  perf cs-etm: Add fix for coresight trace for any range of CPUs
  perf build: Fix unescaped # in perf build-test
  perf unwind: Suppress massive unsupported target platform errors
  perf script: Add new parameter in kfree_skb tracepoint to the python scripts using it
  perf script: Print raw ip instead of binary offset for callchain
  perf symbols: Fix return incorrect build_id size in elf_read_build_id()
  perf list: Modify the warning message about scandirat(3)
  perf list: Fix memory leaks in print_tracepoint_events()
  perf lock contention: Rework offset calculation with BPF CO-RE
  perf lock contention: Fix struct rq lock access
  perf stat: Disable TopdownL1 on hybrid
  perf stat: Avoid SEGV on counter->name
  ...
Merged by Linus Torvalds, 2023-05-07 11:32:18 -07:00, commit f085df1be6.
514 changed files with 173765 additions and 150652 deletions.


@ -1,6 +1,6 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __NR_execve
#define __NR_execve 11
#ifndef __NR_fork
#define __NR_fork 2
#endif
#ifndef __NR_getppid
#define __NR_getppid 64


@ -1,4 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __NR_fork
#define __NR_fork 57
#endif
#ifndef __NR_execve
#define __NR_execve 59
#endif


@ -64,6 +64,7 @@ FEATURE_TESTS_BASIC := \
lzma \
get_cpuid \
bpf \
scandirat \
sched_getcpu \
sdt \
setns \
@ -80,6 +81,7 @@ FEATURE_TESTS_EXTRA := \
compile-32 \
compile-x32 \
cplus-demangle \
cxa-demangle \
gtk2 \
gtk2-infobar \
hello \


@ -23,6 +23,7 @@ FILES= \
test-libbfd-liberty.bin \
test-libbfd-liberty-z.bin \
test-cplus-demangle.bin \
test-cxa-demangle.bin \
test-libcap.bin \
test-libelf.bin \
test-libelf-getphdrnum.bin \
@ -58,19 +59,13 @@ FILES= \
test-lzma.bin \
test-bpf.bin \
test-libbpf.bin \
test-libbpf-btf__load_from_kernel_by_id.bin \
test-libbpf-bpf_prog_load.bin \
test-libbpf-bpf_map_create.bin \
test-libbpf-bpf_object__next_program.bin \
test-libbpf-bpf_object__next_map.bin \
test-libbpf-bpf_program__set_insns.bin \
test-libbpf-btf__raw_data.bin \
test-get_cpuid.bin \
test-sdt.bin \
test-cxx.bin \
test-gettid.bin \
test-jvmti.bin \
test-jvmti-cmlr.bin \
test-scandirat.bin \
test-sched_getcpu.bin \
test-setns.bin \
test-libopencsd.bin \
@ -135,6 +130,9 @@ $(OUTPUT)test-get_current_dir_name.bin:
$(OUTPUT)test-glibc.bin:
$(BUILD)
$(OUTPUT)test-scandirat.bin:
$(BUILD)
$(OUTPUT)test-sched_getcpu.bin:
$(BUILD)
@ -269,6 +267,9 @@ $(OUTPUT)test-libbfd-liberty-z.bin:
$(OUTPUT)test-cplus-demangle.bin:
$(BUILD) -liberty
$(OUTPUT)test-cxa-demangle.bin:
$(BUILDXX)
$(OUTPUT)test-backtrace.bin:
$(BUILD)


@ -114,6 +114,10 @@
# include "test-pthread-barrier.c"
#undef main
#define main main_test_scandirat
# include "test-scandirat.c"
#undef main
#define main main_test_sched_getcpu
# include "test-sched_getcpu.c"
#undef main
@ -206,6 +210,7 @@ int main(int argc, char *argv[])
main_test_get_cpuid();
main_test_bpf();
main_test_libcrypto();
main_test_scandirat();
main_test_sched_getcpu();
main_test_sdt();
main_test_setns();


@ -0,0 +1,17 @@
// SPDX-License-Identifier: GPL-2.0
#include <stdio.h>
#include <stdlib.h>
#include <cxxabi.h>
int main(void)
{
size_t len = 256;
char *output = (char*)malloc(len);
int status;
output = abi::__cxa_demangle("FieldName__9ClassNameFd", output, &len, &status);
printf("demangled symbol: {%s}\n", output);
return 0;
}


@ -1,8 +0,0 @@
// SPDX-License-Identifier: GPL-2.0
#include <bpf/bpf.h>
int main(void)
{
return bpf_map_create(0 /* map_type */, NULL /* map_name */, 0, /* key_size */,
0 /* value_size */, 0 /* max_entries */, NULL /* opts */);
}


@ -1,8 +0,0 @@
// SPDX-License-Identifier: GPL-2.0
#include <bpf/libbpf.h>
int main(void)
{
bpf_object__next_map(NULL /* obj */, NULL /* prev */);
return 0;
}


@ -1,8 +0,0 @@
// SPDX-License-Identifier: GPL-2.0
#include <bpf/libbpf.h>
int main(void)
{
bpf_object__next_program(NULL /* obj */, NULL /* prev */);
return 0;
}


@ -1,9 +0,0 @@
// SPDX-License-Identifier: GPL-2.0
#include <bpf/bpf.h>
int main(void)
{
return bpf_prog_load(0 /* prog_type */, NULL /* prog_name */,
NULL /* license */, NULL /* insns */,
0 /* insn_cnt */, NULL /* opts */);
}


@ -1,8 +0,0 @@
// SPDX-License-Identifier: GPL-2.0
#include <bpf/libbpf.h>
int main(void)
{
bpf_program__set_insns(NULL /* prog */, NULL /* new_insns */, 0 /* new_insn_cnt */);
return 0;
}


@ -1,8 +0,0 @@
// SPDX-License-Identifier: GPL-2.0
#include <bpf/btf.h>
int main(void)
{
btf__load_from_kernel_by_id(20151128);
return 0;
}


@ -1,8 +0,0 @@
// SPDX-License-Identifier: GPL-2.0
#include <bpf/btf.h>
int main(void)
{
btf__raw_data(NULL /* btf_ro */, NULL /* size */);
return 0;
}


@ -1,6 +1,10 @@
// SPDX-License-Identifier: GPL-2.0
#include <bpf/libbpf.h>
#if !defined(LIBBPF_MAJOR_VERSION) || (LIBBPF_MAJOR_VERSION < 1)
#error At least libbpf 1.0 is required for Linux tools.
#endif
int main(void)
{
return bpf_object__open("test") ? 0 : -1;


@ -0,0 +1,13 @@
// SPDX-License-Identifier: GPL-2.0
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <dirent.h>
int main(void)
{
// expects non-NULL, arg3 is 'restrict' so "pointers" have to be different
return scandirat(/*dirfd=*/ 0, /*dirp=*/ (void *)1, /*namelist=*/ (void *)2, /*filter=*/ (void *)3, /*compar=*/ (void *)4);
}
#undef _GNU_SOURCE


@ -12,8 +12,10 @@
+ __GNUC_PATCHLEVEL__)
#endif
#if GCC_VERSION >= 70000 && !defined(__CHECKER__)
# define __fallthrough __attribute__ ((fallthrough))
#if __has_attribute(__fallthrough__)
# define fallthrough __attribute__((__fallthrough__))
#else
# define fallthrough do {} while (0) /* fallthrough */
#endif
#if __has_attribute(__error__)


@ -186,10 +186,6 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
})
#ifndef __fallthrough
# define __fallthrough
#endif
/* Indirect macros required for expanded argument pasting, eg. __LINE__. */
#define ___PASTE(a, b) a##b
#define __PASTE(a, b) ___PASTE(a, b)


@ -7,8 +7,32 @@
#ifndef _LINUX_CORESIGHT_PMU_H
#define _LINUX_CORESIGHT_PMU_H
#include <linux/bits.h>
#define CORESIGHT_ETM_PMU_NAME "cs_etm"
#define CORESIGHT_ETM_PMU_SEED 0x10
/*
* The legacy Trace ID system based on fixed calculation from the cpu
* number. This has been replaced by drivers using a dynamic allocation
* system - but need to retain the legacy algorithm for backward comparibility
* in certain situations:-
* a) new perf running on older systems that generate the legacy mapping
* b) older tools that may not update at the same time as the kernel.
*/
#define CORESIGHT_LEGACY_CPU_TRACE_ID(cpu) (0x10 + (cpu * 2))
/* CoreSight trace ID is currently the bottom 7 bits of the value */
#define CORESIGHT_TRACE_ID_VAL_MASK GENMASK(6, 0)
/*
* perf record will set the legacy meta data values as unused initially.
* This allows perf report to manage the decoders created when dynamic
* allocation in operation.
*/
#define CORESIGHT_TRACE_ID_UNUSED_FLAG BIT(31)
/* Value to set for unused trace ID values */
#define CORESIGHT_TRACE_ID_UNUSED_VAL 0x7F
/*
* Below are the definition of bit offsets for perf option, and works as
@ -34,15 +58,16 @@
#define ETM4_CFG_BIT_RETSTK 12
#define ETM4_CFG_BIT_VMID_OPT 15
static inline int coresight_get_trace_id(int cpu)
{
/*
* A trace ID of value 0 is invalid, so let's start at some
* random value that fits in 7 bits and go from there. Since
* the common convention is to have data trace IDs be I(N) + 1,
* set instruction trace IDs as a function of the CPU number.
*/
return (CORESIGHT_ETM_PMU_SEED + (cpu * 2));
}
/*
* Interpretation of the PERF_RECORD_AUX_OUTPUT_HW_ID payload.
* Used to associate a CPU with the CoreSight Trace ID.
* [07:00] - Trace ID - uses 8 bits to make value easy to read in file.
* [59:08] - Unused (SBZ)
* [63:60] - Version
*/
#define CS_AUX_HW_ID_TRACE_ID_MASK GENMASK_ULL(7, 0)
#define CS_AUX_HW_ID_VERSION_MASK GENMASK_ULL(63, 60)
#define CS_AUX_HW_ID_CURR_VERSION 0
#endif


@ -1339,7 +1339,8 @@ union perf_mem_data_src {
#define PERF_MEM_LVLNUM_L2 0x02 /* L2 */
#define PERF_MEM_LVLNUM_L3 0x03 /* L3 */
#define PERF_MEM_LVLNUM_L4 0x04 /* L4 */
/* 5-0x8 available */
/* 5-0x7 available */
#define PERF_MEM_LVLNUM_UNC 0x08 /* Uncached */
#define PERF_MEM_LVLNUM_CXL 0x09 /* CXL */
#define PERF_MEM_LVLNUM_IO 0x0a /* I/O */
#define PERF_MEM_LVLNUM_ANY_CACHE 0x0b /* Any cache */


@ -7,7 +7,9 @@
#ifndef __API_IO__
#define __API_IO__
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
struct io {
@ -112,4 +114,47 @@ static inline int io__get_dec(struct io *io, __u64 *dec)
}
}
/* Read up to and including the first newline following the pattern of getline. */
static inline ssize_t io__getline(struct io *io, char **line_out, size_t *line_len_out)
{
char buf[128];
int buf_pos = 0;
char *line = NULL, *temp;
size_t line_len = 0;
int ch = 0;
/* TODO: reuse previously allocated memory. */
free(*line_out);
while (ch != '\n') {
ch = io__get_char(io);
if (ch < 0)
break;
if (buf_pos == sizeof(buf)) {
temp = realloc(line, line_len + sizeof(buf));
if (!temp)
goto err_out;
line = temp;
memcpy(&line[line_len], buf, sizeof(buf));
line_len += sizeof(buf);
buf_pos = 0;
}
buf[buf_pos++] = (char)ch;
}
temp = realloc(line, line_len + buf_pos + 1);
if (!temp)
goto err_out;
line = temp;
memcpy(&line[line_len], buf, buf_pos);
line[line_len + buf_pos] = '\0';
line_len += buf_pos;
*line_out = line;
*line_len_out = line_len;
return line_len;
err_out:
free(line);
return -ENOMEM;
}
#endif /* __API_IO__ */


@ -188,7 +188,7 @@ install_lib: libs
cp -fpR $(LIBPERF_ALL) $(DESTDIR)$(libdir_SQ)
HDRS := bpf_perf.h core.h cpumap.h threadmap.h evlist.h evsel.h event.h mmap.h
INTERNAL_HDRS := cpumap.h evlist.h evsel.h lib.h mmap.h threadmap.h xyarray.h
INTERNAL_HDRS := cpumap.h evlist.h evsel.h lib.h mmap.h rc_check.h threadmap.h xyarray.h
INSTALL_HDRS_PFX := $(DESTDIR)$(prefix)/include/perf
INSTALL_HDRS := $(addprefix $(INSTALL_HDRS_PFX)/, $(HDRS))


@ -10,16 +10,21 @@
#include <ctype.h>
#include <limits.h>
static struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
void perf_cpu_map__set_nr(struct perf_cpu_map *map, int nr_cpus)
{
struct perf_cpu_map *cpus = malloc(sizeof(*cpus) + sizeof(struct perf_cpu) * nr_cpus);
RC_CHK_ACCESS(map)->nr = nr_cpus;
}
if (cpus != NULL) {
struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
{
RC_STRUCT(perf_cpu_map) *cpus = malloc(sizeof(*cpus) + sizeof(struct perf_cpu) * nr_cpus);
struct perf_cpu_map *result;
if (ADD_RC_CHK(result, cpus)) {
cpus->nr = nr_cpus;
refcount_set(&cpus->refcnt, 1);
}
return cpus;
return result;
}
struct perf_cpu_map *perf_cpu_map__dummy_new(void)
@ -27,7 +32,7 @@ struct perf_cpu_map *perf_cpu_map__dummy_new(void)
struct perf_cpu_map *cpus = perf_cpu_map__alloc(1);
if (cpus)
cpus->map[0].cpu = -1;
RC_CHK_ACCESS(cpus)->map[0].cpu = -1;
return cpus;
}
@ -35,23 +40,30 @@ struct perf_cpu_map *perf_cpu_map__dummy_new(void)
static void cpu_map__delete(struct perf_cpu_map *map)
{
if (map) {
WARN_ONCE(refcount_read(&map->refcnt) != 0,
WARN_ONCE(refcount_read(perf_cpu_map__refcnt(map)) != 0,
"cpu_map refcnt unbalanced\n");
free(map);
RC_CHK_FREE(map);
}
}
struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map)
{
if (map)
refcount_inc(&map->refcnt);
return map;
struct perf_cpu_map *result;
if (RC_CHK_GET(result, map))
refcount_inc(perf_cpu_map__refcnt(map));
return result;
}
void perf_cpu_map__put(struct perf_cpu_map *map)
{
if (map && refcount_dec_and_test(&map->refcnt))
cpu_map__delete(map);
if (map) {
if (refcount_dec_and_test(perf_cpu_map__refcnt(map)))
cpu_map__delete(map);
else
RC_CHK_PUT(map);
}
}
static struct perf_cpu_map *cpu_map__default_new(void)
@ -68,7 +80,7 @@ static struct perf_cpu_map *cpu_map__default_new(void)
int i;
for (i = 0; i < nr_cpus; ++i)
cpus->map[i].cpu = i;
RC_CHK_ACCESS(cpus)->map[i].cpu = i;
}
return cpus;
@ -94,15 +106,15 @@ static struct perf_cpu_map *cpu_map__trim_new(int nr_cpus, const struct perf_cpu
int i, j;
if (cpus != NULL) {
memcpy(cpus->map, tmp_cpus, payload_size);
qsort(cpus->map, nr_cpus, sizeof(struct perf_cpu), cmp_cpu);
memcpy(RC_CHK_ACCESS(cpus)->map, tmp_cpus, payload_size);
qsort(RC_CHK_ACCESS(cpus)->map, nr_cpus, sizeof(struct perf_cpu), cmp_cpu);
/* Remove dups */
j = 0;
for (i = 0; i < nr_cpus; i++) {
if (i == 0 || cpus->map[i].cpu != cpus->map[i - 1].cpu)
cpus->map[j++].cpu = cpus->map[i].cpu;
if (i == 0 || RC_CHK_ACCESS(cpus)->map[i].cpu != RC_CHK_ACCESS(cpus)->map[i - 1].cpu)
RC_CHK_ACCESS(cpus)->map[j++].cpu = RC_CHK_ACCESS(cpus)->map[i].cpu;
}
cpus->nr = j;
perf_cpu_map__set_nr(cpus, j);
assert(j <= nr_cpus);
}
return cpus;
@ -263,20 +275,20 @@ struct perf_cpu perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx)
.cpu = -1
};
if (cpus && idx < cpus->nr)
return cpus->map[idx];
if (cpus && idx < RC_CHK_ACCESS(cpus)->nr)
return RC_CHK_ACCESS(cpus)->map[idx];
return result;
}
int perf_cpu_map__nr(const struct perf_cpu_map *cpus)
{
return cpus ? cpus->nr : 1;
return cpus ? RC_CHK_ACCESS(cpus)->nr : 1;
}
bool perf_cpu_map__empty(const struct perf_cpu_map *map)
{
return map ? map->map[0].cpu == -1 : true;
return map ? RC_CHK_ACCESS(map)->map[0].cpu == -1 : true;
}
int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu)
@ -287,10 +299,10 @@ int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu)
return -1;
low = 0;
high = cpus->nr;
high = RC_CHK_ACCESS(cpus)->nr;
while (low < high) {
int idx = (low + high) / 2;
struct perf_cpu cpu_at_idx = cpus->map[idx];
struct perf_cpu cpu_at_idx = RC_CHK_ACCESS(cpus)->map[idx];
if (cpu_at_idx.cpu == cpu.cpu)
return idx;
@ -316,7 +328,7 @@ struct perf_cpu perf_cpu_map__max(const struct perf_cpu_map *map)
};
// cpu_map__trim_new() qsort()s it, cpu_map__default_new() sorts it as well.
return map->nr > 0 ? map->map[map->nr - 1] : result;
return RC_CHK_ACCESS(map)->nr > 0 ? RC_CHK_ACCESS(map)->map[RC_CHK_ACCESS(map)->nr - 1] : result;
}
/** Is 'b' a subset of 'a'. */
@ -324,15 +336,15 @@ bool perf_cpu_map__is_subset(const struct perf_cpu_map *a, const struct perf_cpu
{
if (a == b || !b)
return true;
if (!a || b->nr > a->nr)
if (!a || RC_CHK_ACCESS(b)->nr > RC_CHK_ACCESS(a)->nr)
return false;
for (int i = 0, j = 0; i < a->nr; i++) {
if (a->map[i].cpu > b->map[j].cpu)
for (int i = 0, j = 0; i < RC_CHK_ACCESS(a)->nr; i++) {
if (RC_CHK_ACCESS(a)->map[i].cpu > RC_CHK_ACCESS(b)->map[j].cpu)
return false;
if (a->map[i].cpu == b->map[j].cpu) {
if (RC_CHK_ACCESS(a)->map[i].cpu == RC_CHK_ACCESS(b)->map[j].cpu) {
j++;
if (j == b->nr)
if (j == RC_CHK_ACCESS(b)->nr)
return true;
}
}
@ -362,27 +374,27 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
return perf_cpu_map__get(other);
}
tmp_len = orig->nr + other->nr;
tmp_len = RC_CHK_ACCESS(orig)->nr + RC_CHK_ACCESS(other)->nr;
tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
if (!tmp_cpus)
return NULL;
/* Standard merge algorithm from wikipedia */
i = j = k = 0;
while (i < orig->nr && j < other->nr) {
if (orig->map[i].cpu <= other->map[j].cpu) {
if (orig->map[i].cpu == other->map[j].cpu)
while (i < RC_CHK_ACCESS(orig)->nr && j < RC_CHK_ACCESS(other)->nr) {
if (RC_CHK_ACCESS(orig)->map[i].cpu <= RC_CHK_ACCESS(other)->map[j].cpu) {
if (RC_CHK_ACCESS(orig)->map[i].cpu == RC_CHK_ACCESS(other)->map[j].cpu)
j++;
tmp_cpus[k++] = orig->map[i++];
tmp_cpus[k++] = RC_CHK_ACCESS(orig)->map[i++];
} else
tmp_cpus[k++] = other->map[j++];
tmp_cpus[k++] = RC_CHK_ACCESS(other)->map[j++];
}
while (i < orig->nr)
tmp_cpus[k++] = orig->map[i++];
while (i < RC_CHK_ACCESS(orig)->nr)
tmp_cpus[k++] = RC_CHK_ACCESS(orig)->map[i++];
while (j < other->nr)
tmp_cpus[k++] = other->map[j++];
while (j < RC_CHK_ACCESS(other)->nr)
tmp_cpus[k++] = RC_CHK_ACCESS(other)->map[j++];
assert(k <= tmp_len);
merged = cpu_map__trim_new(k, tmp_cpus);


@ -687,15 +687,14 @@ perf_evlist__next_mmap(struct perf_evlist *evlist, struct perf_mmap *map,
void __perf_evlist__set_leader(struct list_head *list, struct perf_evsel *leader)
{
struct perf_evsel *first, *last, *evsel;
struct perf_evsel *evsel;
int n = 0;
first = list_first_entry(list, struct perf_evsel, node);
last = list_last_entry(list, struct perf_evsel, node);
leader->nr_members = last->idx - first->idx + 1;
__perf_evlist__for_each_entry(list, evsel)
__perf_evlist__for_each_entry(list, evsel) {
evsel->leader = leader;
n++;
}
leader->nr_members = n;
}
void perf_evlist__set_leader(struct perf_evlist *evlist)
@ -704,7 +703,23 @@ void perf_evlist__set_leader(struct perf_evlist *evlist)
struct perf_evsel *first = list_entry(evlist->entries.next,
struct perf_evsel, node);
evlist->nr_groups = evlist->nr_entries > 1 ? 1 : 0;
__perf_evlist__set_leader(&evlist->entries, first);
}
}
int perf_evlist__nr_groups(struct perf_evlist *evlist)
{
struct perf_evsel *evsel;
int nr_groups = 0;
perf_evlist__for_each_evsel(evlist, evsel) {
/*
* evsels by default have a nr_members of 1, and they are their
* own leader. If the nr_members is >1 then this is an
* indication of a group.
*/
if (evsel->leader == evsel && evsel->nr_members > 1)
nr_groups++;
}
return nr_groups;
}


@ -4,6 +4,7 @@
#include <linux/refcount.h>
#include <perf/cpumap.h>
#include <internal/rc_check.h>
/**
* A sized, reference counted, sorted array of integers representing CPU
@ -12,7 +13,7 @@
* gaps if CPU numbers were used. For events associated with a pid, rather than
* a CPU, a single dummy map with an entry of -1 is used.
*/
struct perf_cpu_map {
DECLARE_RC_STRUCT(perf_cpu_map) {
refcount_t refcnt;
/** Length of the map array. */
int nr;
@ -24,7 +25,14 @@ struct perf_cpu_map {
#define MAX_NR_CPUS 2048
#endif
struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus);
int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu);
bool perf_cpu_map__is_subset(const struct perf_cpu_map *a, const struct perf_cpu_map *b);
void perf_cpu_map__set_nr(struct perf_cpu_map *map, int nr_cpus);
static inline refcount_t *perf_cpu_map__refcnt(struct perf_cpu_map *map)
{
return &RC_CHK_ACCESS(map)->refcnt;
}
#endif /* __LIBPERF_INTERNAL_CPUMAP_H */


@ -17,7 +17,6 @@ struct perf_mmap_param;
struct perf_evlist {
struct list_head entries;
int nr_entries;
int nr_groups;
bool has_user_cpus;
bool needs_map_propagation;
/**


@ -0,0 +1,102 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
#ifndef __LIBPERF_INTERNAL_RC_CHECK_H
#define __LIBPERF_INTERNAL_RC_CHECK_H
#include <stdlib.h>
#include <linux/zalloc.h>
/*
* Enable reference count checking implicitly with leak checking, which is
* integrated into address sanitizer.
*/
#if defined(LEAK_SANITIZER) || defined(ADDRESS_SANITIZER)
#define REFCNT_CHECKING 1
#endif
/*
* Shared reference count checking macros.
*
* Reference count checking is an approach to sanitizing the use of reference
* counted structs. It leverages address and leak sanitizers to make sure gets
* are paired with a put. Reference count checking adds a malloc-ed layer of
* indirection on a get, and frees it on a put. A missed put will be reported as
* a memory leak. A double put will be reported as a double free. Accessing
* after a put will cause a use-after-free and/or a segfault.
*/
#ifndef REFCNT_CHECKING
/* Replaces "struct foo" so that the pointer may be interposed. */
#define DECLARE_RC_STRUCT(struct_name) \
struct struct_name
/* Declare a reference counted struct variable. */
#define RC_STRUCT(struct_name) struct struct_name
/*
* Interpose the indirection. Result will hold the indirection and object is the
* reference counted struct.
*/
#define ADD_RC_CHK(result, object) (result = object, object)
/* Strip the indirection layer. */
#define RC_CHK_ACCESS(object) object
/* Frees the object and the indirection layer. */
#define RC_CHK_FREE(object) free(object)
/* A get operation adding the indirection layer. */
#define RC_CHK_GET(result, object) ADD_RC_CHK(result, object)
/* A put operation removing the indirection layer. */
#define RC_CHK_PUT(object) {}
#else
/* Replaces "struct foo" so that the pointer may be interposed. */
#define DECLARE_RC_STRUCT(struct_name) \
struct original_##struct_name; \
struct struct_name { \
struct original_##struct_name *orig; \
}; \
struct original_##struct_name
/* Declare a reference counted struct variable. */
#define RC_STRUCT(struct_name) struct original_##struct_name
/*
* Interpose the indirection. Result will hold the indirection and object is the
* reference counted struct.
*/
#define ADD_RC_CHK(result, object) \
( \
object ? (result = malloc(sizeof(*result)), \
result ? (result->orig = object, result) \
: (result = NULL, NULL)) \
: (result = NULL, NULL) \
)
/* Strip the indirection layer. */
#define RC_CHK_ACCESS(object) object->orig
/* Frees the object and the indirection layer. */
#define RC_CHK_FREE(object) \
do { \
zfree(&object->orig); \
free(object); \
} while(0)
/* A get operation adding the indirection layer. */
#define RC_CHK_GET(result, object) ADD_RC_CHK(result, (object ? object->orig : NULL))
/* A put operation removing the indirection layer. */
#define RC_CHK_PUT(object) \
do { \
if (object) { \
object->orig = NULL; \
free(object); \
} \
} while(0)
#endif
#endif /* __LIBPERF_INTERNAL_RC_CHECK_H */


@ -70,6 +70,8 @@ struct perf_record_lost {
__u64 lost;
};
#define PERF_RECORD_MISC_LOST_SAMPLES_BPF (1 << 15)
struct perf_record_lost_samples {
struct perf_event_header header;
__u64 lost;


@ -47,4 +47,5 @@ LIBPERF_API struct perf_mmap *perf_evlist__next_mmap(struct perf_evlist *evlist,
(pos) = perf_evlist__next_mmap((evlist), (pos), overwrite))
LIBPERF_API void perf_evlist__set_leader(struct perf_evlist *evlist);
LIBPERF_API int perf_evlist__nr_groups(struct perf_evlist *evlist);
#endif /* __LIBPERF_EVLIST_H */


@ -56,6 +56,6 @@ CFLAGS_builtin-report.o += -DDOCDIR="BUILD_STR($(srcdir_SQ)/Documentation)"
perf-y += util/
perf-y += arch/
perf-y += ui/
perf-$(CONFIG_LIBTRACEEVENT) += scripts/
perf-y += scripts/
gtk-y += ui/gtk/


@ -116,6 +116,9 @@ include::itrace.txt[]
-M::
--disassembler-style=:: Set disassembler style for objdump.
--addr2line=<path>::
Path to addr2line binary.
--objdump=<path>::
Path to objdump binary.


@ -250,7 +250,13 @@ annotate.*::
These are in control of addresses, jump function, source code
in lines of assembly code from a specific program.
annotate.disassembler_style:
annotate.addr2line::
addr2line binary to use for file names and line numbers.
annotate.objdump::
objdump binary to use for disassembly and annotations.
annotate.disassembler_style::
Use this to change the default disassembler style to some other value
supported by binutils, such as "intel", see the '-M' option help in the
'objdump' man page.


@ -58,7 +58,7 @@ There are a couple of variants of perf kvm:
events.
'perf kvm stat report' reports statistical data which includes events
handled time, samples, and so on.
handled sample, percent_sample, time, percent_time, max_t, min_t, mean_t.
'perf kvm stat live' reports statistical data in a live mode (similar to
record + report but with statistical data updated live at a given display
@ -82,6 +82,8 @@ OPTIONS
:GMEXAMPLESUBCMD: top
include::guest-files.txt[]
--stdio:: Use the stdio interface.
-v::
--verbose::
Be more verbose (show counter open errors, etc).
@ -97,7 +99,10 @@ STAT REPORT OPTIONS
-k::
--key=<value>::
Sorting key. Possible values: sample (default, sort by samples
number), time (sort by average time).
number), percent_sample (sort by sample percentage), time
(sort by average time), precent_time (sort by time percentage),
max_t (sort by maximum time), min_t (sort by minimum time), mean_t
(sort by mean time).
-p::
--pid=::
Analyze events only for given process ID(s) (comma separated list).


@ -155,8 +155,10 @@ CONTENTION OPTIONS
--tid=<value>::
Record events on existing thread ID (comma separated list).
-M::
--map-nr-entries=<value>::
Maximum number of BPF map entries (default: 10240).
Maximum number of BPF map entries (default: 16384).
This will be aligned to a power of 2.
--max-stack=<value>::
Maximum stack depth when collecting lock contention (default: 8).


@ -119,9 +119,12 @@ OPTIONS
"perf report" to view group events together.
--filter=<filter>::
Event filter. This option should follow an event selector (-e) which
selects either tracepoint event(s) or a hardware trace PMU
(e.g. Intel PT or CoreSight).
Event filter. This option should follow an event selector (-e).
If the event is a tracepoint, the filter string will be parsed by
the kernel. If the event is a hardware trace PMU (e.g. Intel PT
or CoreSight), it'll be processed as an address filter. Otherwise
it means a general filter using BPF which can be applied for any
kind of event.
- tracepoint filters
@ -176,6 +179,57 @@ OPTIONS
Multiple filters can be separated with space or comma.
- bpf filters
A BPF filter can access the sample data and make a decision based on the
data. Users need to set an appropriate sample type to use the BPF
filter. BPF filters need root privilege.
The sample data field can be specified in lower case letter. Multiple
filters can be separated with comma. For example,
--filter 'period > 1000, cpu == 1'
or
--filter 'mem_op == load || mem_op == store, mem_lvl > l1'
The former filter only accept samples with period greater than 1000 AND
CPU number is 1. The latter one accepts either load and store memory
operations but it should have memory level above the L1. Since the
mem_op and mem_lvl fields come from the (memory) data_source, it'd only
work with some events which set the data_source field.
Also user should request to collect that information (with -d option in
the above case). Otherwise, the following message will be shown.
$ sudo perf record -e cycles --filter 'mem_op == load'
Error: cycles event does not have PERF_SAMPLE_DATA_SRC
Hint: please add -d option to perf record.
failed to set filter "BPF" on event cycles with 22 (Invalid argument)
Essentially the BPF filter expression is:
<term> <operator> <value> (("," | "||") <term> <operator> <value>)*
The <term> can be one of:
ip, id, tid, pid, cpu, time, addr, period, txn, weight, phys_addr,
code_pgsz, data_pgsz, weight1, weight2, weight3, ins_lat, retire_lat,
p_stage_cyc, mem_op, mem_lvl, mem_snoop, mem_remote, mem_lock,
mem_dtlb, mem_blk, mem_hops
The <operator> can be one of:
==, !=, >, >=, <, <=, &
The <value> can be one of:
<number> (for any term)
na, load, store, pfetch, exec (for mem_op)
l1, l2, l3, l4, cxl, io, any_cache, lfb, ram, pmem (for mem_lvl)
na, none, hit, miss, hitm, fwd, peer (for mem_snoop)
remote (for mem_remote)
na, locked (for mem_locked)
na, l1_hit, l1_miss, l2_hit, l2_miss, any_hit, any_miss, walk, fault (for mem_dtlb)
na, by_data, by_addr (for mem_blk)
hops0, hops1, hops2, hops3 (for mem_hops)
--exclude-perf::
Don't record events issued by perf itself. This option should follow
an event selector (-e) which selects tracepoint event(s). It adds a


@ -117,6 +117,7 @@ OPTIONS
- addr: (Full) virtual address of the sampled instruction
- retire_lat: On X86, this reports pipeline stall of this instruction compared
to the previous instruction in cycles. And currently supported only on X86
- simd: Flags describing a SIMD operation. "e" for empty Arm SVE predicate. "p" for partial Arm SVE predicate
By default, comm, dso and symbol keys are used.
(i.e. --sort comm,dso,symbol)
@ -380,6 +381,9 @@ OPTIONS
This allows to examine the path the program took to each sample.
The data collection must have used -b (or -j) and -g.
--addr2line=<path>::
Path to addr2line binary.
--objdump=<path>::
Path to objdump binary.


@ -394,10 +394,10 @@ See perf list output for the possible metrics and metricgroups.
Do not aggregate counts across all monitored CPUs.
--topdown::
Print complete top-down metrics supported by the CPU. This allows to
determine bottle necks in the CPU pipeline for CPU bound workloads,
by breaking the cycles consumed down into frontend bound, backend bound,
bad speculation and retiring.
Print top-down metrics supported by the CPU. This allows to determine
bottle necks in the CPU pipeline for CPU bound workloads, by breaking
the cycles consumed down into frontend bound, backend bound, bad
speculation and retiring.
Frontend bound means that the CPU cannot fetch and decode instructions fast
enough. Backend bound means that computation or memory access is the bottle
@ -430,15 +430,18 @@ CPUs the workload runs on. If needed the CPUs can be forced using
taskset.
--td-level::
Print the top-down statistics that equal to or lower than the input level.
It allows users to print the interested top-down metrics level instead of
the complete top-down metrics.
Print the top-down statistics that equal the input level. It allows
users to print the interested top-down metrics level instead of the
level 1 top-down metrics.
The availability of the top-down metrics level depends on the hardware. For
example, Ice Lake only supports L1 top-down metrics. The Sapphire Rapids
supports both L1 and L2 top-down metrics.
As the higher levels gather more metrics and use more counters they
will be less accurate. By convention a metric can be examined by
appending '_group' to it and this will increase accuracy compared to
gathering all metrics for a level. For example, level 1 analysis may
highlight 'tma_frontend_bound'. This metric may be drilled into with
'tma_frontend_bound_group' with
'perf stat -M tma_frontend_bound_group...'.
Default: 0 means the max level that the current hardware support.
Error out if the input is higher than the supported max level.
--no-merge::


@ -161,6 +161,12 @@ Default is to monitor all CPUS.
-M::
--disassembler-style=:: Set disassembler style for objdump.
--addr2line=<path>::
Path to addr2line binary.
--objdump=<path>::
Path to objdump binary.
--prefix=PREFIX::
--prefix-strip=N::
Remove first N entries from source file path names in executables
@ -248,6 +254,10 @@ Default is to monitor all CPUS.
The various filters must be specified as a comma separated list: --branch-filter any_ret,u,k
Note that this feature may not be available on all processors.
--branch-history::
Add the addresses of sampled taken branches to the callstack.
This allows to examine the path the program took to each sample.
--raw-trace::
When displaying traceevent output, do not use print fmt or plugins.


@ -1,46 +1,35 @@
Using TopDown metrics in user space
-----------------------------------
Using TopDown metrics
---------------------
Intel CPUs (since Sandy Bridge and Silvermont) support a TopDown
methodology to break down CPU pipeline execution into 4 bottlenecks:
frontend bound, backend bound, bad speculation, retiring.
TopDown metrics break apart performance bottlenecks. Starting at level
1 it is typical to get metrics on retiring, bad speculation, frontend
bound, and backend bound. Higher levels provide more detail in to the
level 1 bottlenecks, such as at level 2: core bound, memory bound,
heavy operations, light operations, branch mispredicts, machine
clears, fetch latency and fetch bandwidth. For more details see [1][2][3].
For more details on Topdown see [1][5]
perf stat --topdown implements this using available metrics that vary
per architecture.
Traditionally this was implemented by events in generic counters
and specific formulas to compute the bottlenecks.
% perf stat -a --topdown -I1000
# time % tma_retiring % tma_backend_bound % tma_frontend_bound % tma_bad_speculation
1.001141351 11.5 34.9 46.9 6.7
2.006141972 13.4 28.1 50.4 8.1
3.010162040 12.9 28.1 51.1 8.0
4.014009311 12.5 28.6 51.8 7.2
5.017838554 11.8 33.0 48.0 7.2
5.704818971 14.0 27.5 51.3 7.3
...
perf stat --topdown implements this.
Full Top Down includes more levels that can break down the
bottlenecks further. This is not directly implemented in perf,
but available in other tools that can run on top of perf,
such as toplev[2] or vtune[3]
New Topdown features in Ice Lake
===============================
New Topdown features in Intel Ice Lake
======================================
With Ice Lake CPUs the TopDown metrics are directly available as
fixed counters and do not require generic counters. This allows
to collect TopDown always in addition to other events.
% perf stat -a --topdown -I1000
# time retiring bad speculation frontend bound backend bound
1.001281330 23.0% 15.3% 29.6% 32.1%
2.003009005 5.0% 6.8% 46.6% 41.6%
3.004646182 6.7% 6.7% 46.0% 40.6%
4.006326375 5.0% 6.4% 47.6% 41.0%
5.007991804 5.1% 6.3% 46.3% 42.3%
6.009626773 6.2% 7.1% 47.3% 39.3%
7.011296356 4.7% 6.7% 46.2% 42.4%
8.012951831 4.7% 6.7% 47.5% 41.1%
...
This also enables measuring TopDown per thread/process instead
of only per core.
Using TopDown through RDPMC in applications on Ice Lake
======================================================
Using TopDown through RDPMC in applications on Intel Ice Lake
=============================================================
For more fine grained measurements it can be useful to
access the new directly from user space. This is more complicated,
@ -301,8 +290,8 @@ This "opens" a new measurement period.
A program using RDPMC for TopDown should schedule such a reset
regularly, as in every few seconds.
Limits on Ice Lake
==================
Limits on Intel Ice Lake
========================
Four pseudo TopDown metric events are exposed for the end-users,
topdown-retiring, topdown-bad-spec, topdown-fe-bound and topdown-be-bound.
@ -318,8 +307,8 @@ a sampling read group. Since the SLOTS event must be the leader of a TopDown
group, the second event of the group is the sampling event.
For example, perf record -e '{slots, $sampling_event, topdown-retiring}:S'
Extension on Sapphire Rapids Server
===================================
Extension on Intel Sapphire Rapids Server
=========================================
The metrics counter is extended to support TMA method level 2 metrics.
The lower half of the register is the TMA level 1 metrics (legacy).
The upper half is also divided into four 8-bit fields for the new level 2
@ -338,7 +327,6 @@ other four level 2 metrics by subtracting corresponding metrics as below.
[1] https://software.intel.com/en-us/top-down-microarchitecture-analysis-method-win
[2] https://github.com/andikleen/pmu-tools/wiki/toplev-manual
[3] https://software.intel.com/en-us/intel-vtune-amplifier-xe
[2] https://sites.google.com/site/analysismethods/yasin-pubs
[3] https://perf.wiki.kernel.org/index.php/Top-Down_Analysis
[4] https://github.com/andikleen/pmu-tools/tree/master/jevents
[5] https://sites.google.com/site/analysismethods/yasin-pubs


@ -234,6 +234,7 @@ ifndef DEBUG
endif
ifeq ($(DEBUG),0)
CORE_CFLAGS += -DNDEBUG=1
ifeq ($(CC_NO_CLANG), 0)
CORE_CFLAGS += -O3
else
@ -417,7 +418,6 @@ endif
ifdef NO_LIBELF
NO_DWARF := 1
NO_DEMANGLE := 1
NO_LIBUNWIND := 1
NO_LIBDW_DWARF_UNWIND := 1
NO_LIBBPF := 1
@ -431,15 +431,7 @@ else
LIBC_SUPPORT := 1
endif
ifeq ($(LIBC_SUPPORT),1)
msg := $(warning No libelf found. Disables 'probe' tool, jvmti and BPF support in 'perf record'. Please install libelf-dev, libelf-devel or elfutils-libelf-devel);
NO_LIBELF := 1
NO_DWARF := 1
NO_DEMANGLE := 1
NO_LIBUNWIND := 1
NO_LIBDW_DWARF_UNWIND := 1
NO_LIBBPF := 1
NO_JVMTI := 1
msg := $(error ERROR: No libelf found. Disables 'probe' tool, jvmti and BPF support. Please install libelf-dev, libelf-devel, elfutils-libelf-devel or build with NO_LIBELF=1.)
else
ifneq ($(filter s% -fsanitize=address%,$(EXTRA_CFLAGS),),)
ifneq ($(shell ldconfig -p | grep libasan >/dev/null 2>&1; echo $$?), 0)
@ -481,10 +473,6 @@ else
endif # libelf support
endif # NO_LIBELF
ifeq ($(feature-glibc), 1)
CFLAGS += -DHAVE_GLIBC_SUPPORT
endif
ifeq ($(feature-libaio), 1)
ifndef NO_AIO
CFLAGS += -DHAVE_AIO_SUPPORT
@ -495,6 +483,10 @@ ifdef NO_DWARF
NO_LIBDW_DWARF_UNWIND := 1
endif
ifeq ($(feature-scandirat), 1)
CFLAGS += -DHAVE_SCANDIRAT_SUPPORT
endif
ifeq ($(feature-sched_getcpu), 1)
CFLAGS += -DHAVE_SCHED_GETCPU_SUPPORT
endif
@ -571,54 +563,17 @@ ifndef NO_LIBELF
# detecting libbpf without LIBBPF_DYNAMIC, so make VF=1 shows libbpf detection status
$(call feature_check,libbpf)
ifdef LIBBPF_DYNAMIC
ifeq ($(feature-libbpf), 1)
EXTLIBS += -lbpf
$(call detected,CONFIG_LIBBPF_DYNAMIC)
$(call feature_check,libbpf-btf__load_from_kernel_by_id)
ifeq ($(feature-libbpf-btf__load_from_kernel_by_id), 1)
CFLAGS += -DHAVE_LIBBPF_BTF__LOAD_FROM_KERNEL_BY_ID
endif
$(call feature_check,libbpf-bpf_prog_load)
ifeq ($(feature-libbpf-bpf_prog_load), 1)
CFLAGS += -DHAVE_LIBBPF_BPF_PROG_LOAD
endif
$(call feature_check,libbpf-bpf_object__next_program)
ifeq ($(feature-libbpf-bpf_object__next_program), 1)
CFLAGS += -DHAVE_LIBBPF_BPF_OBJECT__NEXT_PROGRAM
endif
$(call feature_check,libbpf-bpf_object__next_map)
ifeq ($(feature-libbpf-bpf_object__next_map), 1)
CFLAGS += -DHAVE_LIBBPF_BPF_OBJECT__NEXT_MAP
endif
$(call feature_check,libbpf-bpf_program__set_insns)
ifeq ($(feature-libbpf-bpf_program__set_insns), 1)
CFLAGS += -DHAVE_LIBBPF_BPF_PROGRAM__SET_INSNS
else
dummy := $(error Error: libbpf devel library needs to be >= 0.8.0 to build with LIBBPF_DYNAMIC, update or build statically with the version that comes with the kernel sources);
endif
$(call feature_check,libbpf-btf__raw_data)
ifeq ($(feature-libbpf-btf__raw_data), 1)
CFLAGS += -DHAVE_LIBBPF_BTF__RAW_DATA
endif
$(call feature_check,libbpf-bpf_map_create)
ifeq ($(feature-libbpf-bpf_map_create), 1)
CFLAGS += -DHAVE_LIBBPF_BPF_MAP_CREATE
endif
else
dummy := $(error Error: No libbpf devel library found, please install libbpf-devel);
dummy := $(error Error: No libbpf devel library found or older than v1.0, please install/update libbpf-devel);
endif
else
# Libbpf will be built as a static library from tools/lib/bpf.
LIBBPF_STATIC := 1
CFLAGS += -DHAVE_LIBBPF_BTF__LOAD_FROM_KERNEL_BY_ID
CFLAGS += -DHAVE_LIBBPF_BPF_PROG_LOAD
CFLAGS += -DHAVE_LIBBPF_BPF_OBJECT__NEXT_PROGRAM
CFLAGS += -DHAVE_LIBBPF_BPF_OBJECT__NEXT_MAP
CFLAGS += -DHAVE_LIBBPF_BPF_PROGRAM__SET_INSNS
CFLAGS += -DHAVE_LIBBPF_BTF__RAW_DATA
CFLAGS += -DHAVE_LIBBPF_BPF_MAP_CREATE
endif
endif
@ -802,10 +757,6 @@ ifndef NO_LIBCRYPTO
endif
endif
ifdef NO_NEWT
NO_SLANG=1
endif
ifndef NO_SLANG
ifneq ($(feature-libslang), 1)
ifneq ($(feature-libslang-include-subdir), 1)
@ -922,19 +873,17 @@ endif
ifneq ($(NO_JEVENTS),1)
NO_JEVENTS := 0
ifndef PYTHON
$(warning No python interpreter disabling jevent generation)
NO_JEVENTS := 1
$(error ERROR: No python interpreter needed for jevents generation. Install python or build with NO_JEVENTS=1.)
else
# jevents.py uses f-strings present in Python 3.6 released in Dec. 2016.
JEVENTS_PYTHON_GOOD := $(shell $(PYTHON) -c 'import sys;print("1" if(sys.version_info.major >= 3 and sys.version_info.minor >= 6) else "0")' 2> /dev/null)
ifneq ($(JEVENTS_PYTHON_GOOD), 1)
$(warning Python interpreter too old (older than 3.6) disabling jevent generation)
NO_JEVENTS := 1
$(error ERROR: Python interpreter needed for jevents generation too old (older than 3.6). Install a newer python or build with NO_JEVENTS=1.)
endif
endif
endif
ifndef NO_LIBBFD
ifdef BUILD_NONDISTRO
ifeq ($(feature-libbfd), 1)
EXTLIBS += -lbfd -lopcodes
else
@ -957,6 +906,8 @@ ifndef NO_LIBBFD
$(call feature_check,disassembler-init-styled)
endif
CFLAGS += -DHAVE_LIBBFD_SUPPORT
CXXFLAGS += -DHAVE_LIBBFD_SUPPORT
ifeq ($(feature-libbfd-buildid), 1)
CFLAGS += -DHAVE_LIBBFD_BUILDID_SUPPORT
else
@ -964,33 +915,25 @@ ifndef NO_LIBBFD
endif
endif
ifdef NO_DEMANGLE
CFLAGS += -DNO_DEMANGLE
else
ifdef HAVE_CPLUS_DEMANGLE_SUPPORT
EXTLIBS += -liberty
else
ifndef NO_DEMANGLE
$(call feature_check,cxa-demangle)
ifeq ($(feature-cxa-demangle), 1)
EXTLIBS += -lstdc++
CFLAGS += -DHAVE_CXA_DEMANGLE_SUPPORT
CXXFLAGS += -DHAVE_CXA_DEMANGLE_SUPPORT
endif
ifdef BUILD_NONDISTRO
ifeq ($(filter -liberty,$(EXTLIBS)),)
$(call feature_check,cplus-demangle)
# we dont have neither HAVE_CPLUS_DEMANGLE_SUPPORT
# or any of 'bfd iberty z' trinity
ifeq ($(feature-cplus-demangle), 1)
EXTLIBS += -liberty
else
msg := $(warning No bfd.h/libbfd found, please install binutils-dev[el]/zlib-static/libiberty-dev to gain symbol demangling)
CFLAGS += -DNO_DEMANGLE
endif
endif
ifneq ($(filter -liberty,$(EXTLIBS)),)
CFLAGS += -DHAVE_CPLUS_DEMANGLE_SUPPORT
CXXFLAGS += -DHAVE_CPLUS_DEMANGLE_SUPPORT
endif
endif
ifneq ($(filter -liberty,$(EXTLIBS)),)
CFLAGS += -DHAVE_CPLUS_DEMANGLE_SUPPORT
endif
endif
ifneq ($(filter -lbfd,$(EXTLIBS)),)
CFLAGS += -DHAVE_LIBBFD_SUPPORT
endif
ifndef NO_ZLIB
@ -1188,7 +1131,7 @@ ifdef LIBCLANGLLVM
endif
endif
ifdef LIBPFM4
ifndef NO_LIBPFM4
$(call feature_check,libpfm4)
ifeq ($(feature-libpfm4), 1)
CFLAGS += -DHAVE_LIBPFM
@ -1197,7 +1140,6 @@ ifdef LIBPFM4
$(call detected,CONFIG_LIBPFM4)
else
msg := $(warning libpfm4 not found, disables libpfm4 support. Please install libpfm4-dev);
NO_LIBPFM4 := 1
endif
endif
@ -1215,7 +1157,7 @@ ifneq ($(NO_LIBTRACEEVENT),1)
CFLAGS += -DLIBTRACEEVENT_VERSION=$(LIBTRACEEVENT_VERSION_CPP)
$(call detected,CONFIG_LIBTRACEEVENT)
else
dummy := $(warning Warning: libtraceevent is missing limiting functionality, please install libtraceevent-dev/libtraceevent-devel)
dummy := $(error ERROR: libtraceevent is missing. Please install libtraceevent-dev/libtraceevent-devel or build with NO_LIBTRACEEVENT=1)
endif
$(call feature_check,libtracefs)


@ -44,8 +44,6 @@ include ../scripts/utilities.mak
#
# Define WERROR=0 to disable treating any warnings as errors.
#
# Define NO_NEWT if you do not want TUI support. (deprecated)
#
# Define NO_SLANG if you do not want TUI support.
#
# Define GTK2 if you want GTK+ GUI support.
@ -122,12 +120,14 @@ include ../scripts/utilities.mak
# generated from the kernel .tbl or unistd.h files and use, if available, libaudit
# for doing the conversions to/from strings/id.
#
# Define LIBPFM4 to enable libpfm4 events extension.
# Define NO_LIBPFM4 to disable libpfm4 events extension.
#
# Define NO_LIBDEBUGINFOD if you do not want debuginfod support
#
# Define BUILD_BPF_SKEL to enable BPF skeletons
#
# Define BUILD_NONDISTRO to enable building and linking against libbfd and
# libiberty, libraries whose licenses are incompatible for distribution.
# As per kernel Makefile, avoid funny character set dependencies
unexport LC_ALL
@ -647,13 +647,16 @@ all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) $(OTHER_PROGRAMS)
# Create python binding output directory if not already present
_dummy := $(shell [ -d '$(OUTPUT)python' ] || mkdir -p '$(OUTPUT)python')
$(OUTPUT)python/perf$(PYTHON_EXTENSION_SUFFIX): $(PYTHON_EXT_SRCS) $(PYTHON_EXT_DEPS) $(LIBPERF)
$(OUTPUT)python/perf$(PYTHON_EXTENSION_SUFFIX): $(PYTHON_EXT_SRCS) $(PYTHON_EXT_DEPS) $(LIBPERF) $(LIBSUBCMD)
$(QUIET_GEN)LDSHARED="$(CC) -pthread -shared" \
CFLAGS='$(CFLAGS)' LDFLAGS='$(LDFLAGS)' \
$(PYTHON_WORD) util/setup.py \
--quiet build_ext; \
cp $(PYTHON_EXTBUILD_LIB)perf*.so $(OUTPUT)python/
python_perf_target:
@echo "Target is: $(OUTPUT)python/perf$(PYTHON_EXTENSION_SUFFIX)"
please_set_SHELL_PATH_to_a_more_modern_shell:
$(Q)$$(:)
@ -1047,7 +1050,7 @@ SKELETONS := $(SKEL_OUT)/bpf_prog_profiler.skel.h
SKELETONS += $(SKEL_OUT)/bperf_leader.skel.h $(SKEL_OUT)/bperf_follower.skel.h
SKELETONS += $(SKEL_OUT)/bperf_cgroup.skel.h $(SKEL_OUT)/func_latency.skel.h
SKELETONS += $(SKEL_OUT)/off_cpu.skel.h $(SKEL_OUT)/lock_contention.skel.h
SKELETONS += $(SKEL_OUT)/kwork_trace.skel.h
SKELETONS += $(SKEL_OUT)/kwork_trace.skel.h $(SKEL_OUT)/sample_filter.skel.h
$(SKEL_TMP_OUT) $(LIBAPI_OUTPUT) $(LIBBPF_OUTPUT) $(LIBPERF_OUTPUT) $(LIBSUBCMD_OUTPUT) $(LIBSYMBOL_OUTPUT):
$(Q)$(MKDIR) -p $@
@ -1060,21 +1063,7 @@ $(BPFTOOL): | $(SKEL_TMP_OUT)
$(Q)CFLAGS= $(MAKE) -C ../bpf/bpftool \
OUTPUT=$(SKEL_TMP_OUT)/ bootstrap
VMLINUX_BTF_PATHS ?= $(if $(O),$(O)/vmlinux) \
$(if $(KBUILD_OUTPUT),$(KBUILD_OUTPUT)/vmlinux) \
../../vmlinux \
/sys/kernel/btf/vmlinux \
/boot/vmlinux-$(shell uname -r)
VMLINUX_BTF ?= $(abspath $(firstword $(wildcard $(VMLINUX_BTF_PATHS))))
$(SKEL_OUT)/vmlinux.h: $(VMLINUX_BTF) $(BPFTOOL)
ifeq ($(VMLINUX_H),)
$(QUIET_GEN)$(BPFTOOL) btf dump file $< format c > $@
else
$(Q)cp "$(VMLINUX_H)" $@
endif
$(SKEL_TMP_OUT)/%.bpf.o: util/bpf_skel/%.bpf.c $(LIBBPF) $(SKEL_OUT)/vmlinux.h | $(SKEL_TMP_OUT)
$(SKEL_TMP_OUT)/%.bpf.o: util/bpf_skel/%.bpf.c $(LIBBPF) | $(SKEL_TMP_OUT)
$(QUIET_CLANG)$(CLANG) -g -O2 -target bpf -Wall -Werror $(BPF_INCLUDE) \
-c $(filter util/bpf_skel/%.bpf.c,$^) -o $@ && $(LLVM_STRIP) -g $@
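
The generated *.skel.h headers are then consumed from perf's C code through libbpf's skeleton API. A hedged sketch of the lifecycle for the new sample_filter skeleton, assuming bpftool's default naming (sample_filter_bpf); note that perf wires this particular program to the event as a perf BPF filter via ioctl, so the generic attach shown here is only illustrative:

    #include "sample_filter.skel.h"     /* generated by bpftool gen skeleton */

    static int load_sample_filter(void)
    {
            struct sample_filter_bpf *skel;
            int err;

            skel = sample_filter_bpf__open_and_load();
            if (!skel)
                    return -1;

            err = sample_filter_bpf__attach(skel);
            if (err) {
                    sample_filter_bpf__destroy(skel);
                    return err;
            }
            /* ... use skel->maps / skel->bss while recording ... */
            sample_filter_bpf__destroy(skel);
            return 0;
    }
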
@ -1152,7 +1141,7 @@ FORCE:
.PHONY: all install clean config-clean strip install-gtk
.PHONY: shell_compatibility_test please_set_SHELL_PATH_to_a_more_modern_shell
.PHONY: .FORCE-PERF-VERSION-FILE TAGS tags cscope FORCE prepare
.PHONY: archheaders
.PHONY: archheaders python_perf_target
endif # force_fixdep


@ -33,7 +33,7 @@ static int sample_ustack(struct perf_sample *sample,
return -1;
}
stack_size = map->end - sp;
stack_size = map__end(map) - sp;
stack_size = stack_size > STACK_SIZE ? STACK_SIZE : stack_size;
memcpy(buf, (void *) sp, stack_size);


@ -69,21 +69,29 @@ static const char * const metadata_ete_ro[] = {
static bool cs_etm_is_etmv4(struct auxtrace_record *itr, int cpu);
static bool cs_etm_is_ete(struct auxtrace_record *itr, int cpu);
static int cs_etm_set_context_id(struct auxtrace_record *itr,
struct evsel *evsel, int cpu)
static int cs_etm_validate_context_id(struct auxtrace_record *itr,
struct evsel *evsel, int cpu)
{
struct cs_etm_recording *ptr;
struct perf_pmu *cs_etm_pmu;
struct cs_etm_recording *ptr =
container_of(itr, struct cs_etm_recording, itr);
struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
char path[PATH_MAX];
int err = -EINVAL;
int err;
u32 val;
u64 contextid;
u64 contextid =
evsel->core.attr.config &
(perf_pmu__format_bits(&cs_etm_pmu->format, "contextid1") |
perf_pmu__format_bits(&cs_etm_pmu->format, "contextid2"));
ptr = container_of(itr, struct cs_etm_recording, itr);
cs_etm_pmu = ptr->cs_etm_pmu;
if (!contextid)
return 0;
if (!cs_etm_is_etmv4(itr, cpu))
goto out;
/* Not supported in etmv3 */
if (!cs_etm_is_etmv4(itr, cpu)) {
pr_err("%s: contextid not supported in ETMv3, disable with %s/contextid=0/\n",
CORESIGHT_ETM_PMU_NAME, CORESIGHT_ETM_PMU_NAME);
return -EINVAL;
}
/* Get a handle on TRCIDR2 */
snprintf(path, PATH_MAX, "cpu%d/%s",
@ -92,27 +100,13 @@ static int cs_etm_set_context_id(struct auxtrace_record *itr,
/* There was a problem reading the file, bailing out */
if (err != 1) {
pr_err("%s: can't read file %s\n",
CORESIGHT_ETM_PMU_NAME, path);
goto out;
pr_err("%s: can't read file %s\n", CORESIGHT_ETM_PMU_NAME,
path);
return err;
}
/* User has configured for PID tracing, respects it. */
contextid = evsel->core.attr.config &
(BIT(ETM_OPT_CTXTID) | BIT(ETM_OPT_CTXTID2));
/*
* If user doesn't configure the contextid format, parse PMU format and
* enable PID tracing according to the "contextid" format bits:
*
* If bit ETM_OPT_CTXTID is set, trace CONTEXTIDR_EL1;
* If bit ETM_OPT_CTXTID2 is set, trace CONTEXTIDR_EL2.
*/
if (!contextid)
contextid = perf_pmu__format_bits(&cs_etm_pmu->format,
"contextid");
if (contextid & BIT(ETM_OPT_CTXTID)) {
if (contextid &
perf_pmu__format_bits(&cs_etm_pmu->format, "contextid1")) {
/*
* TRCIDR2.CIDSIZE, bit [9-5], indicates whether contextID
* tracing is supported:
@ -122,14 +116,14 @@ static int cs_etm_set_context_id(struct auxtrace_record *itr,
*/
val = BMVAL(val, 5, 9);
if (!val || val != 0x4) {
pr_err("%s: CONTEXTIDR_EL1 isn't supported\n",
CORESIGHT_ETM_PMU_NAME);
err = -EINVAL;
goto out;
pr_err("%s: CONTEXTIDR_EL1 isn't supported, disable with %s/contextid1=0/\n",
CORESIGHT_ETM_PMU_NAME, CORESIGHT_ETM_PMU_NAME);
return -EINVAL;
}
}
if (contextid & BIT(ETM_OPT_CTXTID2)) {
if (contextid &
perf_pmu__format_bits(&cs_etm_pmu->format, "contextid2")) {
/*
* TRCIDR2.VMIDOPT[30:29] != 0 and
* TRCIDR2.VMIDSIZE[14:10] == 0b00100 (32bit virtual contextid)
@ -138,35 +132,34 @@ static int cs_etm_set_context_id(struct auxtrace_record *itr,
* Any value of VMIDSIZE >= 4 (i.e, > 32bit) is fine for us.
*/
if (!BMVAL(val, 29, 30) || BMVAL(val, 10, 14) < 4) {
pr_err("%s: CONTEXTIDR_EL2 isn't supported\n",
CORESIGHT_ETM_PMU_NAME);
err = -EINVAL;
goto out;
pr_err("%s: CONTEXTIDR_EL2 isn't supported, disable with %s/contextid2=0/\n",
CORESIGHT_ETM_PMU_NAME, CORESIGHT_ETM_PMU_NAME);
return -EINVAL;
}
}
/* All good, let the kernel know */
evsel->core.attr.config |= contextid;
err = 0;
out:
return err;
return 0;
}
static int cs_etm_set_timestamp(struct auxtrace_record *itr,
struct evsel *evsel, int cpu)
static int cs_etm_validate_timestamp(struct auxtrace_record *itr,
struct evsel *evsel, int cpu)
{
struct cs_etm_recording *ptr;
struct perf_pmu *cs_etm_pmu;
struct cs_etm_recording *ptr =
container_of(itr, struct cs_etm_recording, itr);
struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
char path[PATH_MAX];
int err = -EINVAL;
int err;
u32 val;
ptr = container_of(itr, struct cs_etm_recording, itr);
cs_etm_pmu = ptr->cs_etm_pmu;
if (!(evsel->core.attr.config &
perf_pmu__format_bits(&cs_etm_pmu->format, "timestamp")))
return 0;
if (!cs_etm_is_etmv4(itr, cpu))
goto out;
if (!cs_etm_is_etmv4(itr, cpu)) {
pr_err("%s: timestamp not supported in ETMv3, disable with %s/timestamp=0/\n",
CORESIGHT_ETM_PMU_NAME, CORESIGHT_ETM_PMU_NAME);
return -EINVAL;
}
/* Get a handle on TRCIRD0 */
snprintf(path, PATH_MAX, "cpu%d/%s",
@ -177,7 +170,7 @@ static int cs_etm_set_timestamp(struct auxtrace_record *itr,
if (err != 1) {
pr_err("%s: can't read file %s\n",
CORESIGHT_ETM_PMU_NAME, path);
goto out;
return err;
}
/*
@ -189,24 +182,21 @@ static int cs_etm_set_timestamp(struct auxtrace_record *itr,
*/
val &= GENMASK(28, 24);
if (!val) {
err = -EINVAL;
goto out;
return -EINVAL;
}
/* All good, let the kernel know */
evsel->core.attr.config |= (1 << ETM_OPT_TS);
err = 0;
out:
return err;
return 0;
}
#define ETM_SET_OPT_CTXTID (1 << 0)
#define ETM_SET_OPT_TS (1 << 1)
#define ETM_SET_OPT_MASK (ETM_SET_OPT_CTXTID | ETM_SET_OPT_TS)
static int cs_etm_set_option(struct auxtrace_record *itr,
struct evsel *evsel, u32 option)
/*
* Check whether the requested timestamp and contextid options should be
* available on all requested CPUs and if not, tell the user how to override.
* The kernel will silently disable any unavailable options so a warning here
* first is better. In theory the kernel could still disable the option for
* some other reason so this is best effort only.
*/
static int cs_etm_validate_config(struct auxtrace_record *itr,
struct evsel *evsel)
{
int i, err = -EINVAL;
struct perf_cpu_map *event_cpus = evsel->evlist->core.user_requested_cpus;
@ -220,18 +210,11 @@ static int cs_etm_set_option(struct auxtrace_record *itr,
!perf_cpu_map__has(online_cpus, cpu))
continue;
if (option & BIT(ETM_OPT_CTXTID)) {
err = cs_etm_set_context_id(itr, evsel, i);
if (err)
goto out;
}
if (option & BIT(ETM_OPT_TS)) {
err = cs_etm_set_timestamp(itr, evsel, i);
if (err)
goto out;
}
if (option & ~(BIT(ETM_OPT_CTXTID) | BIT(ETM_OPT_TS)))
/* Nothing else is currently supported */
err = cs_etm_validate_context_id(itr, evsel, i);
if (err)
goto out;
err = cs_etm_validate_timestamp(itr, evsel, i);
if (err)
goto out;
}
@ -319,13 +302,6 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
bool privileged = perf_event_paranoid_check(-1);
int err = 0;
ptr->evlist = evlist;
ptr->snapshot_mode = opts->auxtrace_snapshot_mode;
if (!record_opts__no_switch_events(opts) &&
perf_can_record_switch_events())
opts->record_switch_events = true;
evlist__for_each_entry(evlist, evsel) {
if (evsel->core.attr.type == cs_etm_pmu->type) {
if (cs_etm_evsel) {
@ -333,11 +309,7 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
CORESIGHT_ETM_PMU_NAME);
return -EINVAL;
}
evsel->core.attr.freq = 0;
evsel->core.attr.sample_period = 1;
evsel->needs_auxtrace_mmap = true;
cs_etm_evsel = evsel;
opts->full_auxtrace = true;
}
}
@ -345,6 +317,16 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
if (!cs_etm_evsel)
return 0;
ptr->evlist = evlist;
ptr->snapshot_mode = opts->auxtrace_snapshot_mode;
if (!record_opts__no_switch_events(opts) &&
perf_can_record_switch_events())
opts->record_switch_events = true;
cs_etm_evsel->needs_auxtrace_mmap = true;
opts->full_auxtrace = true;
ret = cs_etm_set_sink_attr(cs_etm_pmu, cs_etm_evsel);
if (ret)
return ret;
@ -414,8 +396,8 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
}
}
/* We are in full trace mode but '-m,xyz' wasn't specified */
if (opts->full_auxtrace && !opts->auxtrace_mmap_pages) {
/* Buffer sizes weren't specified with '-m,xyz' so give some defaults */
if (!opts->auxtrace_mmap_pages) {
if (privileged) {
opts->auxtrace_mmap_pages = MiB(4) / page_size;
} else {
@ -423,7 +405,6 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
if (opts->mmap_pages == UINT_MAX)
opts->mmap_pages = KiB(256) / page_size;
}
}
if (opts->auxtrace_snapshot_mode)
@ -437,38 +418,36 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
evlist__to_front(evlist, cs_etm_evsel);
/*
 * In the case of per-cpu mmaps, we need the CPU on the
 * AUX event. We also need the contextID in order to be notified
 * when a context switch happened.
 */
/*
 * get the CPU on the sample - need it to associate trace ID in the
 * AUX_OUTPUT_HW_ID event, and the AUX event for per-cpu mmaps.
 */
evsel__set_sample_bit(cs_etm_evsel, CPU);
/*
 * Also the case of per-cpu mmaps, need the contextID in order to be notified
 * when a context switch happened.
 */
if (!perf_cpu_map__empty(cpus)) {
evsel__set_sample_bit(cs_etm_evsel, CPU);
err = cs_etm_set_option(itr, cs_etm_evsel,
BIT(ETM_OPT_CTXTID) | BIT(ETM_OPT_TS));
if (err)
goto out;
evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel,
"timestamp", 1);
evsel__set_config_if_unset(cs_etm_pmu, cs_etm_evsel,
"contextid", 1);
}
/* Add dummy event to keep tracking */
if (opts->full_auxtrace) {
struct evsel *tracking_evsel;
err = parse_event(evlist, "dummy:u");
if (err)
goto out;
evsel = evlist__last(evlist);
evlist__set_tracking_event(evlist, evsel);
evsel->core.attr.freq = 0;
evsel->core.attr.sample_period = 1;
err = parse_event(evlist, "dummy:u");
if (err)
goto out;
tracking_evsel = evlist__last(evlist);
evlist__set_tracking_event(evlist, tracking_evsel);
tracking_evsel->core.attr.freq = 0;
tracking_evsel->core.attr.sample_period = 1;
/* In per-cpu case, always need the time of mmap events etc */
if (!perf_cpu_map__empty(cpus))
evsel__set_sample_bit(tracking_evsel, TIME);
}
/* In per-cpu case, always need the time of mmap events etc */
if (!perf_cpu_map__empty(cpus))
evsel__set_sample_bit(evsel, TIME);
err = cs_etm_validate_config(itr, cs_etm_evsel);
out:
return err;
}
@ -659,8 +638,12 @@ static bool cs_etm_is_ete(struct auxtrace_record *itr, int cpu)
{
struct cs_etm_recording *ptr = container_of(itr, struct cs_etm_recording, itr);
struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
int trcdevarch = cs_etm_get_ro(cs_etm_pmu, cpu, metadata_ete_ro[CS_ETE_TRCDEVARCH]);
int trcdevarch;
if (!cs_etm_pmu_path_exists(cs_etm_pmu, cpu, metadata_ete_ro[CS_ETE_TRCDEVARCH]))
return false;
trcdevarch = cs_etm_get_ro(cs_etm_pmu, cpu, metadata_ete_ro[CS_ETE_TRCDEVARCH]);
/*
* ETE if ARCHVER is 5 (ARCHVER is 4 for ETM) and ARCHPART is 0xA13.
* See ETM_DEVARCH_ETE_ARCH in coresight-etm4x.h
@ -675,8 +658,10 @@ static void cs_etm_save_etmv4_header(__u64 data[], struct auxtrace_record *itr,
/* Get trace configuration register */
data[CS_ETMV4_TRCCONFIGR] = cs_etmv4_get_config(itr);
/* Get traceID from the framework */
data[CS_ETMV4_TRCTRACEIDR] = coresight_get_trace_id(cpu);
/* traceID set to legacy version, in case new perf is running on an older system */
data[CS_ETMV4_TRCTRACEIDR] =
CORESIGHT_LEGACY_CPU_TRACE_ID(cpu) | CORESIGHT_TRACE_ID_UNUSED_FLAG;
/* Get read-only information from sysFS */
data[CS_ETMV4_TRCIDR0] = cs_etm_get_ro(cs_etm_pmu, cpu,
metadata_etmv4_ro[CS_ETMV4_TRCIDR0]);
@ -694,8 +679,8 @@ static void cs_etm_save_etmv4_header(__u64 data[], struct auxtrace_record *itr,
data[CS_ETMV4_TS_SOURCE] = (__u64) cs_etm_get_ro_signed(cs_etm_pmu, cpu,
metadata_etmv4_ro[CS_ETMV4_TS_SOURCE]);
else {
pr_warning("[%03d] pmu file 'ts_source' not found. Fallback to safe value (-1)\n",
cpu);
pr_debug3("[%03d] pmu file 'ts_source' not found. Fallback to safe value (-1)\n",
cpu);
data[CS_ETMV4_TS_SOURCE] = (__u64) -1;
}
}
@ -707,8 +692,10 @@ static void cs_etm_save_ete_header(__u64 data[], struct auxtrace_record *itr, in
/* Get trace configuration register */
data[CS_ETE_TRCCONFIGR] = cs_etmv4_get_config(itr);
/* Get traceID from the framework */
data[CS_ETE_TRCTRACEIDR] = coresight_get_trace_id(cpu);
/* traceID set to legacy version, in case new perf is running on an older system */
data[CS_ETE_TRCTRACEIDR] =
CORESIGHT_LEGACY_CPU_TRACE_ID(cpu) | CORESIGHT_TRACE_ID_UNUSED_FLAG;
/* Get read-only information from sysFS */
data[CS_ETE_TRCIDR0] = cs_etm_get_ro(cs_etm_pmu, cpu,
metadata_ete_ro[CS_ETE_TRCIDR0]);
@ -729,8 +716,8 @@ static void cs_etm_save_ete_header(__u64 data[], struct auxtrace_record *itr, in
data[CS_ETE_TS_SOURCE] = (__u64) cs_etm_get_ro_signed(cs_etm_pmu, cpu,
metadata_ete_ro[CS_ETE_TS_SOURCE]);
else {
pr_warning("[%03d] pmu file 'ts_source' not found. Fallback to safe value (-1)\n",
cpu);
pr_debug3("[%03d] pmu file 'ts_source' not found. Fallback to safe value (-1)\n",
cpu);
data[CS_ETE_TS_SOURCE] = (__u64) -1;
}
}
@ -764,9 +751,9 @@ static void cs_etm_get_metadata(int cpu, u32 *offset,
magic = __perf_cs_etmv3_magic;
/* Get configuration register */
info->priv[*offset + CS_ETM_ETMCR] = cs_etm_get_config(itr);
/* Get traceID from the framework */
/* traceID set to legacy value in case new perf is running on an old system */
info->priv[*offset + CS_ETM_ETMTRACEIDR] =
coresight_get_trace_id(cpu);
CORESIGHT_LEGACY_CPU_TRACE_ID(cpu) | CORESIGHT_TRACE_ID_UNUSED_FLAG;
/* Get read-only information from sysFS */
info->priv[*offset + CS_ETM_ETMCCER] =
cs_etm_get_ro(cs_etm_pmu, cpu,
@ -925,3 +912,22 @@ struct auxtrace_record *cs_etm_record_init(int *err)
out:
return NULL;
}
/*
* Set a default config to enable the user changed config tracking mechanism
* (CFG_CHG and evsel__set_config_if_unset()). If no default is set then user
* changes aren't tracked.
*/
struct perf_event_attr *
cs_etm_get_default_config(struct perf_pmu *pmu __maybe_unused)
{
struct perf_event_attr *attr;
attr = zalloc(sizeof(struct perf_event_attr));
if (!attr)
return NULL;
attr->sample_period = 1;
return attr;
}
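
A rough illustration of the mechanism the comment above describes: with a default attr installed, the parser records which config terms the user actually wrote (CFG_CHG), and evsel__set_config_if_unset() is assumed to apply a default only when the user's bits don't overlap the term's format bits. Simplified sketch, not the real helper:

    static void apply_default_if_unset(struct perf_event_attr *attr,
                                       u64 term_bits, u64 user_changed_bits)
    {
            if (term_bits & user_changed_bits)
                    return;         /* an explicit user value wins, even 0 */
            attr->config |= term_bits;      /* install the default (all-ones here) */
    }
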


@ -12,6 +12,7 @@
#include "arm-spe.h"
#include "hisi-ptt.h"
#include "../../../util/pmu.h"
#include "../cs-etm.h"
struct perf_event_attr
*perf_pmu__get_default_config(struct perf_pmu *pmu __maybe_unused)
@ -20,6 +21,7 @@ struct perf_event_attr
if (!strcmp(pmu->name, CORESIGHT_ETM_PMU_NAME)) {
/* add ETM default config here */
pmu->selectable = true;
return cs_etm_get_default_config(pmu);
#if defined(__aarch64__)
} else if (strstarts(pmu->name, ARM_SPE_PMU_NAME)) {
return arm_spe_pmu_default_config(pmu);


@ -33,7 +33,7 @@ static int sample_ustack(struct perf_sample *sample,
return -1;
}
stack_size = map->end - sp;
stack_size = map__end(map) - sp;
stack_size = stack_size > STACK_SIZE ? STACK_SIZE : stack_size;
memcpy(buf, (void *) sp, stack_size);


@ -36,29 +36,6 @@ struct arm_spe_recording {
bool *wrapped;
};
static void arm_spe_set_timestamp(struct auxtrace_record *itr,
struct evsel *evsel)
{
struct arm_spe_recording *ptr;
struct perf_pmu *arm_spe_pmu;
struct evsel_config_term *term = evsel__get_config_term(evsel, CFG_CHG);
u64 user_bits = 0, bit;
ptr = container_of(itr, struct arm_spe_recording, itr);
arm_spe_pmu = ptr->arm_spe_pmu;
if (term)
user_bits = term->val.cfg_chg;
bit = perf_pmu__format_bits(&arm_spe_pmu->format, "ts_enable");
/* Skip if user has set it */
if (bit & user_bits)
return;
evsel->core.attr.config |= bit;
}
static size_t
arm_spe_info_priv_size(struct auxtrace_record *itr __maybe_unused,
struct evlist *evlist __maybe_unused)
@ -238,7 +215,8 @@ static int arm_spe_recording_options(struct auxtrace_record *itr,
*/
if (!perf_cpu_map__empty(cpus)) {
evsel__set_sample_bit(arm_spe_evsel, CPU);
arm_spe_set_timestamp(itr, arm_spe_evsel);
evsel__set_config_if_unset(arm_spe_pmu, arm_spe_evsel,
"ts_enable", 1);
}
/*
@ -479,7 +457,7 @@ static void arm_spe_recording_free(struct auxtrace_record *itr)
struct arm_spe_recording *sper =
container_of(itr, struct arm_spe_recording, itr);
free(sper->wrapped);
zfree(&sper->wrapped);
free(sper);
}
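
The free(sper->wrapped) → zfree(&sper->wrapped) change here, like the many free()→zfree() conversions in later hunks (c2c, daemon, iostat), uses the tools/lib zalloc helper, whose semantics are roughly (sketch, not the exact implementation):

    #include <stdlib.h>

    /* Free *pp and poison the pointer: a stale reference then fails fast
     * as a NULL dereference instead of a silent use-after-free. */
    #define zfree(pp)               \
            do {                    \
                    free(*(pp));    \
                    *(pp) = NULL;   \
            } while (0)

Hence the call sites pass the address of the member, as in zfree(&sper->wrapped).
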


@ -11,7 +11,6 @@ define_exit_reasons_table(arm64_trap_exit_reasons, kvm_arm_exception_class);
const char *kvm_trap_exit_reason = "esr_ec";
const char *vcpu_id_str = "id";
const int decode_str_len = 20;
const char *kvm_exit_reason = "ret";
const char *kvm_entry_trace = "kvm:kvm_entry";
const char *kvm_exit_trace = "kvm:kvm_exit";
@ -45,14 +44,14 @@ static bool event_begin(struct evsel *evsel,
struct perf_sample *sample __maybe_unused,
struct event_key *key __maybe_unused)
{
return !strcmp(evsel->name, kvm_entry_trace);
return evsel__name_is(evsel, kvm_entry_trace);
}
static bool event_end(struct evsel *evsel,
struct perf_sample *sample,
struct event_key *key)
{
if (!strcmp(evsel->name, kvm_exit_trace)) {
if (evsel__name_is(evsel, kvm_exit_trace)) {
event_get_key(evsel, sample, key);
return true;
}
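
evsel__name_is(), used here and in the other arch kvm-stat files in place of open-coded strcmp() on evsel->name, is assumed to be a thin helper along these lines; the benefit is that it goes through evsel__name(), which can synthesize a name when none was set:

    static inline bool evsel__name_is(struct evsel *evsel, const char *name)
    {
            return !strcmp(evsel__name(evsel), name);
    }
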


@ -128,7 +128,7 @@ static int lookup_triplets(const char *const *triplets, const char *name)
}
static int perf_env__lookup_binutils_path(struct perf_env *env,
const char *name, const char **path)
const char *name, char **path)
{
int idx;
const char *arch = perf_env__arch(env), *cross_env;
@ -200,7 +200,7 @@ out_error:
return -1;
}
int perf_env__lookup_objdump(struct perf_env *env, const char **path)
int perf_env__lookup_objdump(struct perf_env *env, char **path)
{
/*
* For live mode, env->arch will be NULL and we can use


@ -6,7 +6,7 @@
struct perf_env;
int perf_env__lookup_objdump(struct perf_env *env, const char **path);
int perf_env__lookup_objdump(struct perf_env *env, char **path);
bool perf_env__single_address_space(struct perf_env *env);
#endif /* ARCH_PERF_COMMON_H */


@ -33,7 +33,7 @@ static int sample_ustack(struct perf_sample *sample,
return -1;
}
stack_size = map->end - sp;
stack_size = map__end(map) - sp;
stack_size = stack_size > STACK_SIZE ? STACK_SIZE : stack_size;
memcpy(buf, (void *) sp, stack_size);


@ -45,6 +45,6 @@ int arch_get_runtimeparam(const struct pmu_metric *pm)
int count;
char path[PATH_MAX] = "/devices/hv_24x7/interface/";
atoi(pm->aggr_mode) == PerChip ? strcat(path, "sockets") : strcat(path, "coresperchip");
strcat(path, pm->aggr_mode == PerChip ? "sockets" : "coresperchip");
return sysfs__read_int(path, &count) < 0 ? 1 : count;
}


@ -14,7 +14,6 @@
#define NR_TPS 4
const char *vcpu_id_str = "vcpu_id";
const int decode_str_len = 40;
const char *kvm_entry_trace = "kvm_hv:kvm_guest_enter";
const char *kvm_exit_trace = "kvm_hv:kvm_guest_exit";
@ -61,13 +60,13 @@ static bool hcall_event_end(struct evsel *evsel,
struct perf_sample *sample __maybe_unused,
struct event_key *key __maybe_unused)
{
return (!strcmp(evsel->name, kvm_events_tp[3]));
return (evsel__name_is(evsel, kvm_events_tp[3]));
}
static bool hcall_event_begin(struct evsel *evsel,
struct perf_sample *sample, struct event_key *key)
{
if (!strcmp(evsel->name, kvm_events_tp[2])) {
if (evsel__name_is(evsel, kvm_events_tp[2])) {
hcall_event_get_key(evsel, sample, key);
return true;
}
@ -80,7 +79,7 @@ static void hcall_event_decode_key(struct perf_kvm_stat *kvm __maybe_unused,
{
const char *hcall_reason = get_hcall_exit_reason(key->key);
scnprintf(decode, decode_str_len, "%s", hcall_reason);
scnprintf(decode, KVM_EVENT_NAME_LEN, "%s", hcall_reason);
}
static struct kvm_events_ops hcall_events = {


@ -255,14 +255,14 @@ int arch_skip_callchain_idx(struct thread *thread, struct ip_callchain *chain)
thread__find_symbol(thread, PERF_RECORD_MISC_USER, ip, &al);
if (al.map)
dso = al.map->dso;
dso = map__dso(al.map);
if (!dso) {
pr_debug("%" PRIx64 " dso is NULL\n", ip);
return skip_slot;
}
rc = check_return_addr(dso, al.map->start, ip);
rc = check_return_addr(dso, map__start(al.map), ip);
pr_debug("[DSO %s, sym %s, ip 0x%" PRIx64 "] rc %d\n",
dso->long_name, al.sym->name, ip, rc);
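
The direct map->dso / al.map->start accesses are converted to accessor calls here and in the following files. Assuming the usual pattern, the accessors start life as trivial inline wrappers, which lets reference-count checking be layered underneath later without touching call sites again (sketch):

    static inline struct dso *map__dso(const struct map *map)
    {
            return map->dso;
    }

    static inline u64 map__start(const struct map *map)
    {
            return map->start;
    }

    static inline u64 map__end(const struct map *map)
    {
            return map->end;
    }

    static inline u64 map__size(const struct map *map)
    {
            return map__end(map) - map__start(map);
    }
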


@ -104,7 +104,7 @@ void arch__fix_tev_from_maps(struct perf_probe_event *pev,
lep_offset = PPC64_LOCAL_ENTRY_OFFSET(sym->arch_sym);
if (map->dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS)
if (map__dso(map)->symtab_type == DSO_BINARY_TYPE__KALLSYMS)
tev->point.offset += PPC64LE_LEP_OFFSET;
else if (lep_offset) {
if (pev->uprobes)
@ -131,7 +131,7 @@ void arch__post_process_probe_trace_events(struct perf_probe_event *pev,
for (i = 0; i < ntevs; i++) {
tev = &pev->tevs[i];
map__for_each_symbol(map, sym, tmp) {
if (map->unmap_ip(map, sym->start) == tev->point.address) {
if (map__unmap_ip(map, sym->start) == tev->point.address) {
arch__fix_tev_from_maps(pev, tev, map, sym);
break;
}


@ -39,7 +39,7 @@ static int s390_call__parse(struct arch *arch, struct ins_operands *ops,
target.addr = map__objdump_2mem(map, ops->target.addr);
if (maps__find_ams(ms->maps, &target) == 0 &&
map__rip_2objdump(target.ms.map, map->map_ip(target.ms.map, target.addr)) == ops->target.addr)
map__rip_2objdump(target.ms.map, map__map_ip(target.ms.map, target.addr)) == ops->target.addr)
ops->target.sym = target.ms.sym;
return 0;


@ -6,5 +6,6 @@ perf-$(CONFIG_DWARF) += dwarf-regs.o
perf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
perf-y += machine.o
perf-y += pmu.o
perf-$(CONFIG_AUXTRACE) += auxtrace.o


@ -19,7 +19,6 @@ define_exit_reasons_table(sie_diagnose_codes, diagnose_codes);
define_exit_reasons_table(sie_icpt_prog_codes, icpt_prog_codes);
const char *vcpu_id_str = "id";
const int decode_str_len = 40;
const char *kvm_exit_reason = "icptcode";
const char *kvm_entry_trace = "kvm:kvm_s390_sie_enter";
const char *kvm_exit_trace = "kvm:kvm_s390_sie_exit";


@ -0,0 +1,23 @@
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright IBM Corp. 2023
* Author(s): Thomas Richter <tmricht@linux.ibm.com>
*/
#include <string.h>
#include "../../../util/pmu.h"
#define S390_PMUPAI_CRYPTO "pai_crypto"
#define S390_PMUPAI_EXT "pai_ext"
#define S390_PMUCPUM_CF "cpum_cf"
struct perf_event_attr *perf_pmu__get_default_config(struct perf_pmu *pmu)
{
if (!strcmp(pmu->name, S390_PMUPAI_CRYPTO) ||
!strcmp(pmu->name, S390_PMUPAI_EXT) ||
!strcmp(pmu->name, S390_PMUCPUM_CF))
pmu->selectable = true;
return NULL;
}


@ -33,7 +33,7 @@ static int sample_ustack(struct perf_sample *sample,
return -1;
}
stack_size = map->end - sp;
stack_size = map__end(map) - sp;
stack_size = stack_size > STACK_SIZE ? STACK_SIZE : stack_size;
memcpy(buf, (void *) sp, stack_size);


@ -29,6 +29,8 @@ struct test_data test_data_64[] = {
#include "insn-x86-dat-64.c"
{{0x0f, 0x01, 0xee}, 3, 0, NULL, NULL, "0f 01 ee \trdpkru"},
{{0x0f, 0x01, 0xef}, 3, 0, NULL, NULL, "0f 01 ef \twrpkru"},
{{0xf2, 0x0f, 0x01, 0xca}, 4, 0, "erets", "indirect", "f2 0f 01 ca \terets"},
{{0xf3, 0x0f, 0x01, 0xca}, 4, 0, "eretu", "indirect", "f3 0f 01 ca \teretu"},
{{0}, 0, 0, NULL, NULL, NULL},
};
@ -49,6 +51,8 @@ static int get_op(const char *op_str)
{"syscall", INTEL_PT_OP_SYSCALL},
{"sysret", INTEL_PT_OP_SYSRET},
{"vmentry", INTEL_PT_OP_VMENTRY},
{"erets", INTEL_PT_OP_ERETS},
{"eretu", INTEL_PT_OP_ERETU},
{NULL, 0},
};
struct val_data *val;


@ -26,11 +26,7 @@ struct auxtrace_record *auxtrace_record__init_intel(struct evlist *evlist,
bool found_bts = false;
intel_pt_pmu = perf_pmu__find(INTEL_PT_PMU_NAME);
if (intel_pt_pmu)
intel_pt_pmu->auxtrace = true;
intel_bts_pmu = perf_pmu__find(INTEL_BTS_PMU_NAME);
if (intel_bts_pmu)
intel_bts_pmu->auxtrace = true;
evlist__for_each_entry(evlist, evsel) {
if (intel_pt_pmu && evsel->core.attr.type == intel_pt_pmu->type)


@ -19,7 +19,7 @@ int perf_event__synthesize_extra_kmaps(struct perf_tool *tool,
struct machine *machine)
{
int rc = 0;
struct map *pos;
struct map_rb_node *pos;
struct maps *kmaps = machine__kernel_maps(machine);
union perf_event *event = zalloc(sizeof(event->mmap) +
machine->id_hdr_size);
@ -33,11 +33,12 @@ int perf_event__synthesize_extra_kmaps(struct perf_tool *tool,
maps__for_each_entry(kmaps, pos) {
struct kmap *kmap;
size_t size;
struct map *map = pos->map;
if (!__map__is_extra_kernel_map(pos))
if (!__map__is_extra_kernel_map(map))
continue;
kmap = map__kmap(pos);
kmap = map__kmap(map);
size = sizeof(event->mmap) - sizeof(event->mmap.filename) +
PERF_ALIGN(strlen(kmap->name) + 1, sizeof(u64)) +
@ -58,9 +59,9 @@ int perf_event__synthesize_extra_kmaps(struct perf_tool *tool,
event->mmap.header.size = size;
event->mmap.start = pos->start;
event->mmap.len = pos->end - pos->start;
event->mmap.pgoff = pos->pgoff;
event->mmap.start = map__start(map);
event->mmap.len = map__size(map);
event->mmap.pgoff = map__pgoff(map);
event->mmap.pid = machine->pid;
strlcpy(event->mmap.filename, kmap->name, PATH_MAX);


@ -59,35 +59,28 @@ int arch_evlist__add_default_attrs(struct evlist *evlist,
struct perf_event_attr *attrs,
size_t nr_attrs)
{
if (nr_attrs)
return ___evlist__add_default_attrs(evlist, attrs, nr_attrs);
if (!nr_attrs)
return 0;
return topdown_parse_events(evlist);
return ___evlist__add_default_attrs(evlist, attrs, nr_attrs);
}
struct evsel *arch_evlist__leader(struct list_head *list)
int arch_evlist__cmp(const struct evsel *lhs, const struct evsel *rhs)
{
struct evsel *evsel, *first, *slots = NULL;
bool has_topdown = false;
first = list_first_entry(list, struct evsel, core.node);
if (!topdown_sys_has_perf_metrics())
return first;
/* If there is a slots event and a topdown event then the slots event comes first. */
__evlist__for_each_entry(list, evsel) {
if (evsel->pmu_name && !strncmp(evsel->pmu_name, "cpu", 3) && evsel->name) {
if (strcasestr(evsel->name, "slots")) {
slots = evsel;
if (slots == first)
return first;
}
if (strcasestr(evsel->name, "topdown"))
has_topdown = true;
if (slots && has_topdown)
return slots;
}
if (topdown_sys_has_perf_metrics() &&
(!lhs->pmu_name || !strncmp(lhs->pmu_name, "cpu", 3))) {
/* Ensure the topdown slots comes first. */
if (strcasestr(lhs->name, "slots"))
return -1;
if (strcasestr(rhs->name, "slots"))
return 1;
/* Followed by topdown events. */
if (strcasestr(lhs->name, "topdown") && !strcasestr(rhs->name, "topdown"))
return -1;
if (!strcasestr(lhs->name, "topdown") && strcasestr(rhs->name, "topdown"))
return 1;
}
return first;
/* Default ordering by insertion index. */
return lhs->core.idx - rhs->core.idx;
}
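
A toy demonstration of the contract arch_evlist__cmp() implements: negative sorts lhs first, and the insertion index keeps everything else stable. The names and the strstr() shortcut are illustrative only (the real code matches case-insensitively and checks the PMU first):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct toy_evsel { const char *name; int idx; };

    static int toy_cmp(const void *a, const void *b)
    {
            const struct toy_evsel *lhs = a, *rhs = b;

            /* assumes at most one "slots" event, as in practice */
            if (strstr(lhs->name, "slots"))
                    return -1;
            if (strstr(rhs->name, "slots"))
                    return 1;
            if (strstr(lhs->name, "topdown") && !strstr(rhs->name, "topdown"))
                    return -1;
            if (!strstr(lhs->name, "topdown") && strstr(rhs->name, "topdown"))
                    return 1;
            return lhs->idx - rhs->idx;     /* stable: insertion order */
    }

    int main(void)
    {
            struct toy_evsel evs[] = {
                    { "instructions", 0 }, { "topdown-retiring", 1 }, { "slots", 2 },
            };

            qsort(evs, 3, sizeof(evs[0]), toy_cmp);
            for (int i = 0; i < 3; i++)
                    printf("%s\n", evs[i].name); /* slots, topdown-retiring, instructions */
            return 0;
    }
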


@ -194,16 +194,19 @@ static u64 intel_pt_default_config(struct perf_pmu *intel_pt_pmu)
int pos = 0;
u64 config;
char c;
int dirfd;
dirfd = perf_pmu__event_source_devices_fd();
pos += scnprintf(buf + pos, sizeof(buf) - pos, "tsc");
if (perf_pmu__scan_file(intel_pt_pmu, "caps/mtc", "%d",
&mtc) != 1)
if (perf_pmu__scan_file_at(intel_pt_pmu, dirfd, "caps/mtc", "%d",
&mtc) != 1)
mtc = 1;
if (mtc) {
if (perf_pmu__scan_file(intel_pt_pmu, "caps/mtc_periods", "%x",
&mtc_periods) != 1)
if (perf_pmu__scan_file_at(intel_pt_pmu, dirfd, "caps/mtc_periods", "%x",
&mtc_periods) != 1)
mtc_periods = 0;
if (mtc_periods) {
mtc_period = intel_pt_pick_bit(mtc_periods, 3);
@ -212,13 +215,13 @@ static u64 intel_pt_default_config(struct perf_pmu *intel_pt_pmu)
}
}
if (perf_pmu__scan_file(intel_pt_pmu, "caps/psb_cyc", "%d",
&psb_cyc) != 1)
if (perf_pmu__scan_file_at(intel_pt_pmu, dirfd, "caps/psb_cyc", "%d",
&psb_cyc) != 1)
psb_cyc = 1;
if (psb_cyc && mtc_periods) {
if (perf_pmu__scan_file(intel_pt_pmu, "caps/psb_periods", "%x",
&psb_periods) != 1)
if (perf_pmu__scan_file_at(intel_pt_pmu, dirfd, "caps/psb_periods", "%x",
&psb_periods) != 1)
psb_periods = 0;
if (psb_periods) {
psb_period = intel_pt_pick_bit(psb_periods, 3);
@ -227,8 +230,8 @@ static u64 intel_pt_default_config(struct perf_pmu *intel_pt_pmu)
}
}
if (perf_pmu__scan_file(intel_pt_pmu, "format/pt", "%c", &c) == 1 &&
perf_pmu__scan_file(intel_pt_pmu, "format/branch", "%c", &c) == 1)
if (perf_pmu__scan_file_at(intel_pt_pmu, dirfd, "format/pt", "%c", &c) == 1 &&
perf_pmu__scan_file_at(intel_pt_pmu, dirfd, "format/branch", "%c", &c) == 1)
pos += scnprintf(buf + pos, sizeof(buf) - pos, ",pt,branch");
pr_debug2("%s default config: %s\n", intel_pt_pmu->name, buf);
@ -236,6 +239,7 @@ static u64 intel_pt_default_config(struct perf_pmu *intel_pt_pmu)
intel_pt_parse_terms(intel_pt_pmu->name, &intel_pt_pmu->format, buf,
&config);
close(dirfd);
return config;
}
@ -488,7 +492,7 @@ static void intel_pt_valid_str(char *str, size_t len, u64 valid)
}
}
static int intel_pt_val_config_term(struct perf_pmu *intel_pt_pmu,
static int intel_pt_val_config_term(struct perf_pmu *intel_pt_pmu, int dirfd,
const char *caps, const char *name,
const char *supported, u64 config)
{
@ -498,11 +502,11 @@ static int intel_pt_val_config_term(struct perf_pmu *intel_pt_pmu,
u64 bits;
int ok;
if (perf_pmu__scan_file(intel_pt_pmu, caps, "%llx", &valid) != 1)
if (perf_pmu__scan_file_at(intel_pt_pmu, dirfd, caps, "%llx", &valid) != 1)
valid = 0;
if (supported &&
perf_pmu__scan_file(intel_pt_pmu, supported, "%d", &ok) == 1 && !ok)
perf_pmu__scan_file_at(intel_pt_pmu, dirfd, supported, "%d", &ok) == 1 && !ok)
valid = 0;
valid |= 1;
@ -531,56 +535,45 @@ out_err:
static int intel_pt_validate_config(struct perf_pmu *intel_pt_pmu,
struct evsel *evsel)
{
int err;
int err, dirfd;
char c;
if (!evsel)
return 0;
dirfd = perf_pmu__event_source_devices_fd();
if (dirfd < 0)
return dirfd;
/*
* If supported, force pass-through config term (pt=1) even if user
* sets pt=0, which avoids senseless kernel errors.
*/
if (perf_pmu__scan_file(intel_pt_pmu, "format/pt", "%c", &c) == 1 &&
if (perf_pmu__scan_file_at(intel_pt_pmu, dirfd, "format/pt", "%c", &c) == 1 &&
!(evsel->core.attr.config & 1)) {
pr_warning("pt=0 doesn't make sense, forcing pt=1\n");
evsel->core.attr.config |= 1;
}
err = intel_pt_val_config_term(intel_pt_pmu, "caps/cycle_thresholds",
err = intel_pt_val_config_term(intel_pt_pmu, dirfd, "caps/cycle_thresholds",
"cyc_thresh", "caps/psb_cyc",
evsel->core.attr.config);
if (err)
return err;
goto out;
err = intel_pt_val_config_term(intel_pt_pmu, "caps/mtc_periods",
err = intel_pt_val_config_term(intel_pt_pmu, dirfd, "caps/mtc_periods",
"mtc_period", "caps/mtc",
evsel->core.attr.config);
if (err)
return err;
goto out;
return intel_pt_val_config_term(intel_pt_pmu, "caps/psb_periods",
err = intel_pt_val_config_term(intel_pt_pmu, dirfd, "caps/psb_periods",
"psb_period", "caps/psb_cyc",
evsel->core.attr.config);
}
static void intel_pt_config_sample_mode(struct perf_pmu *intel_pt_pmu,
struct evsel *evsel)
{
u64 user_bits = 0, bits;
struct evsel_config_term *term = evsel__get_config_term(evsel, CFG_CHG);
if (term)
user_bits = term->val.cfg_chg;
bits = perf_pmu__format_bits(&intel_pt_pmu->format, "psb_period");
/* Did user change psb_period */
if (bits & user_bits)
return;
/* Set psb_period to 0 */
evsel->core.attr.config &= ~bits;
out:
close(dirfd);
return err;
}
static void intel_pt_min_max_sample_sz(struct evlist *evlist,
@ -674,7 +667,8 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
return 0;
if (opts->auxtrace_sample_mode)
intel_pt_config_sample_mode(intel_pt_pmu, intel_pt_evsel);
evsel__set_config_if_unset(intel_pt_pmu, intel_pt_evsel,
"psb_period", 0);
err = intel_pt_validate_config(intel_pt_pmu, intel_pt_evsel);
if (err)
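
All of the perf_pmu__scan_file_at() conversions in this file share one directory fd for the sysfs event_source tree, so each capability lookup becomes a relative openat() instead of a fresh path walk. A standalone POSIX sketch of the same pattern (the attribute path is illustrative):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
            char buf[64];
            int dirfd = open("/sys/bus/event_source/devices",
                             O_DIRECTORY | O_RDONLY);
            if (dirfd < 0)
                    return 1;

            /* One open of the parent, then cheap relative opens per file. */
            int fd = openat(dirfd, "intel_pt/caps/mtc", O_RDONLY);
            if (fd >= 0) {
                    ssize_t n = read(fd, buf, sizeof(buf) - 1);
                    if (n > 0)
                            printf("caps/mtc = %.*s", (int)n, buf);
                    close(fd);
            }
            close(dirfd);
            return 0;
    }
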


@ -10,6 +10,7 @@
#include <api/fs/fs.h>
#include <linux/kernel.h>
#include <linux/err.h>
#include <linux/zalloc.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
@ -100,8 +101,8 @@ static void iio_root_ports_list_free(struct iio_root_ports_list *list)
if (list) {
for (idx = 0; idx < list->nr_entries; idx++)
free(list->rps[idx]);
free(list->rps);
zfree(&list->rps[idx]);
zfree(&list->rps);
free(list);
}
}
@ -390,7 +391,7 @@ void iostat_release(struct evlist *evlist)
evlist__for_each_entry(evlist, evsel) {
if (rp != evsel->priv) {
rp = evsel->priv;
free(evsel->priv);
zfree(&evsel->priv);
}
}
}


@ -18,7 +18,6 @@ static struct kvm_events_ops exit_events = {
};
const char *vcpu_id_str = "vcpu_id";
const int decode_str_len = 20;
const char *kvm_exit_reason = "exit_reason";
const char *kvm_entry_trace = "kvm:kvm_entry";
const char *kvm_exit_trace = "kvm:kvm_exit";
@ -47,7 +46,7 @@ static bool mmio_event_begin(struct evsel *evsel,
return true;
/* MMIO write begin event in kernel. */
if (!strcmp(evsel->name, "kvm:kvm_mmio") &&
if (evsel__name_is(evsel, "kvm:kvm_mmio") &&
evsel__intval(evsel, sample, "type") == KVM_TRACE_MMIO_WRITE) {
mmio_event_get_key(evsel, sample, key);
return true;
@ -64,7 +63,7 @@ static bool mmio_event_end(struct evsel *evsel, struct perf_sample *sample,
return true;
/* MMIO read end event in kernel.*/
if (!strcmp(evsel->name, "kvm:kvm_mmio") &&
if (evsel__name_is(evsel, "kvm:kvm_mmio") &&
evsel__intval(evsel, sample, "type") == KVM_TRACE_MMIO_READ) {
mmio_event_get_key(evsel, sample, key);
return true;
@ -77,7 +76,7 @@ static void mmio_event_decode_key(struct perf_kvm_stat *kvm __maybe_unused,
struct event_key *key,
char *decode)
{
scnprintf(decode, decode_str_len, "%#lx:%s",
scnprintf(decode, KVM_EVENT_NAME_LEN, "%#lx:%s",
(unsigned long)key->key,
key->info == KVM_TRACE_MMIO_WRITE ? "W" : "R");
}
@ -102,7 +101,7 @@ static bool ioport_event_begin(struct evsel *evsel,
struct perf_sample *sample,
struct event_key *key)
{
if (!strcmp(evsel->name, "kvm:kvm_pio")) {
if (evsel__name_is(evsel, "kvm:kvm_pio")) {
ioport_event_get_key(evsel, sample, key);
return true;
}
@ -121,7 +120,7 @@ static void ioport_event_decode_key(struct perf_kvm_stat *kvm __maybe_unused,
struct event_key *key,
char *decode)
{
scnprintf(decode, decode_str_len, "%#llx:%s",
scnprintf(decode, KVM_EVENT_NAME_LEN, "%#llx:%s",
(unsigned long long)key->key,
key->info ? "POUT" : "PIN");
}
@ -146,7 +145,7 @@ static bool msr_event_begin(struct evsel *evsel,
struct perf_sample *sample,
struct event_key *key)
{
if (!strcmp(evsel->name, "kvm:kvm_msr")) {
if (evsel__name_is(evsel, "kvm:kvm_msr")) {
msr_event_get_key(evsel, sample, key);
return true;
}
@ -165,7 +164,7 @@ static void msr_event_decode_key(struct perf_kvm_stat *kvm __maybe_unused,
struct event_key *key,
char *decode)
{
scnprintf(decode, decode_str_len, "%#llx:%s",
scnprintf(decode, KVM_EVENT_NAME_LEN, "%#llx:%s",
(unsigned long long)key->key,
key->info ? "W" : "R");
}


@ -27,10 +27,14 @@ static bool cached_list;
struct perf_event_attr *perf_pmu__get_default_config(struct perf_pmu *pmu __maybe_unused)
{
#ifdef HAVE_AUXTRACE_SUPPORT
if (!strcmp(pmu->name, INTEL_PT_PMU_NAME))
if (!strcmp(pmu->name, INTEL_PT_PMU_NAME)) {
pmu->auxtrace = true;
return intel_pt_pmu_default_config(pmu);
if (!strcmp(pmu->name, INTEL_BTS_PMU_NAME))
}
if (!strcmp(pmu->name, INTEL_BTS_PMU_NAME)) {
pmu->auxtrace = true;
pmu->selectable = true;
}
#endif
return NULL;
}
@ -67,7 +71,7 @@ out_delete:
static int setup_pmu_alias_list(void)
{
char path[PATH_MAX];
int fd, dirfd;
DIR *dir;
struct dirent *dent;
struct pmu_alias *pmu_alias;
@ -75,10 +79,11 @@ static int setup_pmu_alias_list(void)
FILE *file;
int ret = -ENOMEM;
if (!perf_pmu__event_source_devices_scnprintf(path, sizeof(path)))
dirfd = perf_pmu__event_source_devices_fd();
if (dirfd < 0)
return -1;
dir = opendir(path);
dir = fdopendir(dirfd);
if (!dir)
return -errno;
@ -87,11 +92,11 @@ static int setup_pmu_alias_list(void)
!strcmp(dent->d_name, ".."))
continue;
perf_pmu__pathname_scnprintf(path, sizeof(path), dent->d_name, "alias");
if (!file_available(path))
fd = perf_pmu__pathname_fd(dirfd, dent->d_name, "alias", O_RDONLY);
if (fd < 0)
continue;
file = fopen(path, "r");
file = fdopen(fd, "r");
if (!file)
continue;


@ -1,19 +1,11 @@
// SPDX-License-Identifier: GPL-2.0
#include <stdio.h>
#include "api/fs/fs.h"
#include "util/evsel.h"
#include "util/pmu.h"
#include "util/topdown.h"
#include "util/evlist.h"
#include "util/debug.h"
#include "util/pmu-hybrid.h"
#include "topdown.h"
#include "evsel.h"
#define TOPDOWN_L1_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound}"
#define TOPDOWN_L1_EVENTS_CORE "{slots,cpu_core/topdown-retiring/,cpu_core/topdown-bad-spec/,cpu_core/topdown-fe-bound/,cpu_core/topdown-be-bound/}"
#define TOPDOWN_L2_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}"
#define TOPDOWN_L2_EVENTS_CORE "{slots,cpu_core/topdown-retiring/,cpu_core/topdown-bad-spec/,cpu_core/topdown-fe-bound/,cpu_core/topdown-be-bound/,cpu_core/topdown-heavy-ops/,cpu_core/topdown-br-mispredict/,cpu_core/topdown-fetch-lat/,cpu_core/topdown-mem-bound/}"
/* Check whether there is a PMU which supports the perf metrics. */
bool topdown_sys_has_perf_metrics(void)
{
@ -38,30 +30,6 @@ bool topdown_sys_has_perf_metrics(void)
return has_perf_metrics;
}
/*
* Check whether we can use a group for top down.
* Without a group may get bad results due to multiplexing.
*/
bool arch_topdown_check_group(bool *warn)
{
int n;
if (sysctl__read_int("kernel/nmi_watchdog", &n) < 0)
return false;
if (n > 0) {
*warn = true;
return false;
}
return true;
}
void arch_topdown_group_warn(void)
{
fprintf(stderr,
"nmi_watchdog enabled with topdown. May give wrong results.\n"
"Disable with echo 0 > /proc/sys/kernel/nmi_watchdog\n");
}
#define TOPDOWN_SLOTS 0x0400
/*
@ -70,7 +38,6 @@ void arch_topdown_group_warn(void)
* Only Topdown metric supports sample-read. The slots
* event must be the leader of the topdown group.
*/
bool arch_topdown_sample_read(struct evsel *leader)
{
if (!evsel__sys_has_perf_metrics(leader))
@ -81,46 +48,3 @@ bool arch_topdown_sample_read(struct evsel *leader)
return false;
}
const char *arch_get_topdown_pmu_name(struct evlist *evlist, bool warn)
{
const char *pmu_name;
if (!perf_pmu__has_hybrid())
return "cpu";
if (!evlist->hybrid_pmu_name) {
if (warn)
pr_warning("WARNING: default to use cpu_core topdown events\n");
evlist->hybrid_pmu_name = perf_pmu__hybrid_type_to_pmu("core");
}
pmu_name = evlist->hybrid_pmu_name;
return pmu_name;
}
int topdown_parse_events(struct evlist *evlist)
{
const char *topdown_events;
const char *pmu_name;
if (!topdown_sys_has_perf_metrics())
return 0;
pmu_name = arch_get_topdown_pmu_name(evlist, false);
if (pmu_have_event(pmu_name, "topdown-heavy-ops")) {
if (!strcmp(pmu_name, "cpu_core"))
topdown_events = TOPDOWN_L2_EVENTS_CORE;
else
topdown_events = TOPDOWN_L2_EVENTS;
} else {
if (!strcmp(pmu_name, "cpu_core"))
topdown_events = TOPDOWN_L1_EVENTS_CORE;
else
topdown_events = TOPDOWN_L1_EVENTS;
}
return parse_event(evlist, topdown_events);
}


@ -3,6 +3,5 @@
#define _TOPDOWN_H 1
bool topdown_sys_has_perf_metrics(void);
int topdown_parse_events(struct evlist *evlist);
#endif


@ -15,6 +15,7 @@ perf-y += find-bit-bench.o
perf-y += inject-buildid.o
perf-y += evlist-open-close.o
perf-y += breakpoint.o
perf-y += pmu-scan.o
perf-$(CONFIG_X86_64) += mem-memcpy-x86-64-asm.o
perf-$(CONFIG_X86_64) += mem-memset-x86-64-asm.o


@ -23,6 +23,7 @@ int bench_sched_messaging(int argc, const char **argv);
int bench_sched_pipe(int argc, const char **argv);
int bench_syscall_basic(int argc, const char **argv);
int bench_syscall_getpgid(int argc, const char **argv);
int bench_syscall_fork(int argc, const char **argv);
int bench_syscall_execve(int argc, const char **argv);
int bench_mem_memcpy(int argc, const char **argv);
int bench_mem_memset(int argc, const char **argv);
@ -41,6 +42,7 @@ int bench_inject_build_id(int argc, const char **argv);
int bench_evlist_open_close(int argc, const char **argv);
int bench_breakpoint_thread(int argc, const char **argv);
int bench_breakpoint_enable(int argc, const char **argv);
int bench_pmu_scan(int argc, const char **argv);
#define BENCH_FORMAT_DEFAULT_STR "default"
#define BENCH_FORMAT_DEFAULT 0


@ -61,7 +61,6 @@ static int do_for_each_set_bit(unsigned int num_bits)
double time_average, time_stddev;
unsigned int bit, i, j;
unsigned int set_bits, skip;
unsigned int old;
init_stats(&fb_time_stats);
init_stats(&tb_time_stats);
@ -73,7 +72,10 @@ static int do_for_each_set_bit(unsigned int num_bits)
__set_bit(i, to_test);
for (i = 0; i < outer_iterations; i++) {
old = accumulator;
#ifndef NDEBUG
unsigned int old = accumulator;
#endif
gettimeofday(&start, NULL);
for (j = 0; j < inner_iterations; j++) {
for_each_set_bit(bit, to_test, num_bits)
@ -85,7 +87,9 @@ static int do_for_each_set_bit(unsigned int num_bits)
runtime_us = diff.tv_sec * USEC_PER_SEC + diff.tv_usec;
update_stats(&fb_time_stats, runtime_us);
#ifndef NDEBUG
old = accumulator;
#endif
gettimeofday(&start, NULL);
for (j = 0; j < inner_iterations; j++) {
for (bit = 0; bit < num_bits; bit++) {


@ -12,6 +12,7 @@
#include <linux/time64.h>
#include <linux/list.h>
#include <linux/err.h>
#include <linux/zalloc.h>
#include <internal/lib.h>
#include <subcmd/parse-options.h>
@ -122,7 +123,7 @@ static void release_dso(void)
for (i = 0; i < nr_dsos; i++) {
struct bench_dso *dso = &dsos[i];
free(dso->name);
zfree(&dso->name);
}
free(dsos);
}


@ -847,7 +847,7 @@ static u64 do_work(u8 *__data, long bytes, int nr, int nr_max, int loop, u64 val
if (g->p.data_rand_walk) {
u32 lfsr = nr + loop + val;
int j;
long j;
for (i = 0; i < words/1024; i++) {
long start, end;

tools/perf/bench/pmu-scan.c (new file, 184 lines)

@ -0,0 +1,184 @@
// SPDX-License-Identifier: GPL-2.0
/*
* Benchmark scanning sysfs files for PMU information.
*
* Copyright 2023 Google LLC.
*/
#include <stdio.h>
#include "bench.h"
#include "util/debug.h"
#include "util/pmu.h"
#include "util/pmus.h"
#include "util/stat.h"
#include <linux/atomic.h>
#include <linux/err.h>
#include <linux/time64.h>
#include <subcmd/parse-options.h>
static unsigned int iterations = 100;
struct pmu_scan_result {
char *name;
int nr_aliases;
int nr_formats;
int nr_caps;
};
static const struct option options[] = {
OPT_UINTEGER('i', "iterations", &iterations,
"Number of iterations used to compute average"),
OPT_END()
};
static const char *const bench_usage[] = {
"perf bench internals pmu-scan <options>",
NULL
};
static int nr_pmus;
static struct pmu_scan_result *results;
static int save_result(void)
{
struct perf_pmu *pmu;
struct list_head *list;
struct pmu_scan_result *r;
perf_pmu__scan(NULL);
perf_pmus__for_each_pmu(pmu) {
r = realloc(results, (nr_pmus + 1) * sizeof(*r));
if (r == NULL)
return -ENOMEM;
results = r;
r = results + nr_pmus;
r->name = strdup(pmu->name);
r->nr_caps = pmu->nr_caps;
r->nr_aliases = 0;
list_for_each(list, &pmu->aliases)
r->nr_aliases++;
r->nr_formats = 0;
list_for_each(list, &pmu->format)
r->nr_formats++;
pr_debug("pmu[%d] name=%s, nr_caps=%d, nr_aliases=%d, nr_formats=%d\n",
nr_pmus, r->name, r->nr_caps, r->nr_aliases, r->nr_formats);
nr_pmus++;
}
perf_pmu__destroy();
return 0;
}
static int check_result(void)
{
struct pmu_scan_result *r;
struct perf_pmu *pmu;
struct list_head *list;
int nr;
for (int i = 0; i < nr_pmus; i++) {
r = &results[i];
pmu = perf_pmu__find(r->name);
if (pmu == NULL) {
pr_err("Cannot find PMU %s\n", r->name);
return -1;
}
if (pmu->nr_caps != (u32)r->nr_caps) {
pr_err("Unmatched number of event caps in %s: expect %d vs got %d\n",
pmu->name, r->nr_caps, pmu->nr_caps);
return -1;
}
nr = 0;
list_for_each(list, &pmu->aliases)
nr++;
if (nr != r->nr_aliases) {
pr_err("Unmatched number of event aliases in %s: expect %d vs got %d\n",
pmu->name, r->nr_aliases, nr);
return -1;
}
nr = 0;
list_for_each(list, &pmu->format)
nr++;
if (nr != r->nr_formats) {
pr_err("Unmatched number of event formats in %s: expect %d vs got %d\n",
pmu->name, r->nr_formats, nr);
return -1;
}
}
return 0;
}
static void delete_result(void)
{
for (int i = 0; i < nr_pmus; i++)
free(results[i].name);
free(results);
results = NULL;
nr_pmus = 0;
}
static int run_pmu_scan(void)
{
struct stats stats;
struct timeval start, end, diff;
double time_average, time_stddev;
u64 runtime_us;
unsigned int i;
int ret;
init_stats(&stats);
pr_info("Computing performance of sysfs PMU event scan for %u times\n",
iterations);
if (save_result() < 0) {
pr_err("Failed to initialize PMU scan result\n");
return -1;
}
for (i = 0; i < iterations; i++) {
gettimeofday(&start, NULL);
perf_pmu__scan(NULL);
gettimeofday(&end, NULL);
timersub(&end, &start, &diff);
runtime_us = diff.tv_sec * USEC_PER_SEC + diff.tv_usec;
update_stats(&stats, runtime_us);
ret = check_result();
perf_pmu__destroy();
if (ret < 0)
break;
}
time_average = avg_stats(&stats);
time_stddev = stddev_stats(&stats);
pr_info(" Average PMU scanning took: %.3f usec (+- %.3f usec)\n",
time_average, time_stddev);
delete_result();
return 0;
}
int bench_pmu_scan(int argc, const char **argv)
{
int err = 0;
argc = parse_options(argc, argv, options, bench_usage, 0);
if (argc) {
usage_with_options(bench_usage, options);
exit(EXIT_FAILURE);
}
err = run_pmu_scan();
return err;
}


@ -18,6 +18,10 @@
#include <unistd.h>
#include <stdlib.h>
#ifndef __NR_fork
#define __NR_fork -1
#endif
#define LOOPS_DEFAULT 10000000
static int loops = LOOPS_DEFAULT;
@ -31,6 +35,23 @@ static const char * const bench_syscall_usage[] = {
NULL
};
static void test_fork(void)
{
pid_t pid = fork();
if (pid < 0) {
fprintf(stderr, "fork failed\n");
exit(1);
} else if (pid == 0) {
exit(0);
} else {
if (waitpid(pid, NULL, 0) < 0) {
fprintf(stderr, "waitpid failed\n");
exit(1);
}
}
}
static void test_execve(void)
{
const char *pathname = "/bin/true";
@ -71,6 +92,12 @@ static int bench_syscall_common(int argc, const char **argv, int syscall)
case __NR_getpgid:
getpgid(0);
break;
case __NR_fork:
test_fork();
/* Only loop 10000 times to save time */
if (i == 10000)
loops = 10000;
break;
case __NR_execve:
test_execve();
/* Only loop 10000 times to save time */
@ -92,6 +119,9 @@ static int bench_syscall_common(int argc, const char **argv, int syscall)
case __NR_getpgid:
name = "getpgid()";
break;
case __NR_fork:
name = "fork()";
break;
case __NR_execve:
name = "execve()";
break;
@ -143,6 +173,11 @@ int bench_syscall_getpgid(int argc, const char **argv)
return bench_syscall_common(argc, argv, __NR_getpgid);
}
int bench_syscall_fork(int argc, const char **argv)
{
return bench_syscall_common(argc, argv, __NR_fork);
}
int bench_syscall_execve(int argc, const char **argv)
{
return bench_syscall_common(argc, argv, __NR_execve);


@ -15,7 +15,6 @@
#include <linux/zalloc.h>
#include "util/symbol.h"
#include "perf.h"
#include "util/debug.h"
#include "util/evlist.h"
@ -36,6 +35,7 @@
#include "util/block-range.h"
#include "util/map_symbol.h"
#include "util/branch.h"
#include "util/util.h"
#include <dlfcn.h>
#include <errno.h>
@ -205,7 +205,7 @@ static int process_branch_callback(struct evsel *evsel,
return 0;
if (a.map != NULL)
a.map->dso->hit = 1;
map__dso(a.map)->hit = 1;
hist__account_cycles(sample->branch_stack, al, sample, false, NULL);
@ -235,10 +235,11 @@ static int evsel__add_sample(struct evsel *evsel, struct perf_sample *sample,
* the DSO?
*/
if (al->sym != NULL) {
rb_erase_cached(&al->sym->rb_node,
&al->map->dso->symbols);
struct dso *dso = map__dso(al->map);
rb_erase_cached(&al->sym->rb_node, &dso->symbols);
symbol__delete(al->sym);
dso__reset_find_symbol_cache(al->map->dso);
dso__reset_find_symbol_cache(dso);
}
return 0;
}
@ -252,7 +253,7 @@ static int evsel__add_sample(struct evsel *evsel, struct perf_sample *sample,
if (ann->has_br_stack && has_annotation(ann))
return process_branch_callback(evsel, sample, al, ann, machine);
he = hists__add_entry(hists, al, NULL, NULL, NULL, sample, true);
he = hists__add_entry(hists, al, NULL, NULL, NULL, NULL, sample, true);
if (he == NULL)
return -ENOMEM;
@ -320,7 +321,7 @@ static void hists__find_annotations(struct hists *hists,
struct hist_entry *he = rb_entry(nd, struct hist_entry, rb_node);
struct annotation *notes;
if (he->ms.sym == NULL || he->ms.map->dso->annotate_warned)
if (he->ms.sym == NULL || map__dso(he->ms.map)->annotate_warned)
goto find_next;
if (ann->sym_hist_filter &&
@ -352,6 +353,7 @@ find_next:
int ret;
int (*annotate)(struct hist_entry *he,
struct evsel *evsel,
struct annotation_options *options,
struct hist_browser_timer *hbt);
annotate = dlsym(perf_gtk_handle,
@ -361,7 +363,7 @@ find_next:
return;
}
ret = annotate(he, evsel, NULL);
ret = annotate(he, evsel, &ann->opts, NULL);
if (!ret || !ann->skip_missing)
return;
@ -509,7 +511,6 @@ int cmd_annotate(int argc, const char **argv)
.ordered_events = true,
.ordering_requires_timestamps = true,
},
.opts = annotation__default_options,
};
struct perf_data data = {
.mode = PERF_DATA_MODE_READ,
@ -517,6 +518,7 @@ int cmd_annotate(int argc, const char **argv)
struct itrace_synth_opts itrace_synth_opts = {
.set = 0,
};
const char *disassembler_style = NULL, *objdump_path = NULL, *addr2line_path = NULL;
struct option options[] = {
OPT_STRING('i', "input", &input_name, "file",
"input file name"),
@ -561,14 +563,16 @@ int cmd_annotate(int argc, const char **argv)
"Interleave source code with assembly code (default)"),
OPT_BOOLEAN(0, "asm-raw", &annotate.opts.show_asm_raw,
"Display raw encoding of assembly instructions (default)"),
OPT_STRING('M', "disassembler-style", &annotate.opts.disassembler_style, "disassembler style",
OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style",
"Specify disassembler style (e.g. -M intel for intel syntax)"),
OPT_STRING(0, "prefix", &annotate.opts.prefix, "prefix",
"Add prefix to source file path names in programs (with --prefix-strip)"),
OPT_STRING(0, "prefix-strip", &annotate.opts.prefix_strip, "N",
"Strip first N entries of source file path name in programs (with --prefix)"),
OPT_STRING(0, "objdump", &annotate.opts.objdump_path, "path",
OPT_STRING(0, "objdump", &objdump_path, "path",
"objdump binary to use for disassembly and annotations"),
OPT_STRING(0, "addr2line", &addr2line_path, "path",
"addr2line binary to use for line numbers"),
OPT_BOOLEAN(0, "demangle", &symbol_conf.demangle,
"Enable symbol demangling"),
OPT_BOOLEAN(0, "demangle-kernel", &symbol_conf.demangle_kernel,
@ -598,6 +602,7 @@ int cmd_annotate(int argc, const char **argv)
set_option_flag(options, 0, "show-total-period", PARSE_OPT_EXCLUSIVE);
set_option_flag(options, 0, "show-nr-samples", PARSE_OPT_EXCLUSIVE);
annotation_options__init(&annotate.opts);
ret = hists__init();
if (ret < 0)
@ -617,6 +622,22 @@ int cmd_annotate(int argc, const char **argv)
annotate.sym_hist_filter = argv[0];
}
if (disassembler_style) {
annotate.opts.disassembler_style = strdup(disassembler_style);
if (!annotate.opts.disassembler_style)
return -ENOMEM;
}
if (objdump_path) {
annotate.opts.objdump_path = strdup(objdump_path);
if (!annotate.opts.objdump_path)
return -ENOMEM;
}
if (addr2line_path) {
symbol_conf.addr2line_path = strdup(addr2line_path);
if (!symbol_conf.addr2line_path)
return -ENOMEM;
}
if (annotate_check_args(&annotate.opts) < 0)
return -EINVAL;
@ -692,16 +713,13 @@ int cmd_annotate(int argc, const char **argv)
out_delete:
/*
* Speed up the exit process, for large files this can
* take quite a while.
*
* XXX Enable this when using valgrind or if we ever
* librarize this command.
*
* Also experiment with obstacks to see how much speed
* up we'll get here.
*
* perf_session__delete(session);
* Speed up the exit process by only deleting for debug builds. For
* large files this can save time.
*/
#ifndef NDEBUG
perf_session__delete(annotate.session);
#endif
annotation_options__exit(&annotate.opts);
return ret;
}


@ -53,6 +53,7 @@ static struct bench sched_benchmarks[] = {
static struct bench syscall_benchmarks[] = {
{ "basic", "Benchmark for basic getppid(2) calls", bench_syscall_basic },
{ "getpgid", "Benchmark for getpgid(2) calls", bench_syscall_getpgid },
{ "fork", "Benchmark for fork(2) calls", bench_syscall_fork },
{ "execve", "Benchmark for execve(2) calls", bench_syscall_execve },
{ "all", "Run all syscall benchmarks", NULL },
{ NULL, NULL, NULL },
@ -91,6 +92,7 @@ static struct bench internals_benchmarks[] = {
{ "kallsyms-parse", "Benchmark kallsyms parsing", bench_kallsyms_parse },
{ "inject-build-id", "Benchmark build-id injection", bench_inject_build_id },
{ "evlist-open-close", "Benchmark evlist open and close", bench_evlist_open_close },
{ "pmu-scan", "Benchmark sysfs PMU info scanning", bench_pmu_scan },
{ NULL, NULL, NULL }
};


@ -8,7 +8,6 @@
* Copyright (C) 2009, Arnaldo Carvalho de Melo <acme@redhat.com>
*/
#include "builtin.h"
#include "perf.h"
#include "util/build-id.h"
#include "util/debug.h"
#include "util/dso.h"
@ -18,19 +17,20 @@
#include "util/session.h"
#include "util/symbol.h"
#include "util/data.h"
#include "util/util.h"
#include <errno.h>
#include <inttypes.h>
#include <linux/err.h>
static int buildid__map_cb(struct map *map, void *arg __maybe_unused)
{
const struct dso *dso = map->dso;
const struct dso *dso = map__dso(map);
char bid_buf[SBUILD_ID_SIZE];
memset(bid_buf, 0, sizeof(bid_buf));
if (dso->has_build_id)
build_id__sprintf(&dso->bid, bid_buf);
printf("%s %16" PRIx64 " %16" PRIx64, bid_buf, map->start, map->end);
printf("%s %16" PRIx64 " %16" PRIx64, bid_buf, map__start(map), map__end(map));
if (dso->long_name != NULL) {
printf(" %s", dso->long_name);
} else if (dso->short_name != NULL) {


@ -41,10 +41,10 @@
#include "symbol.h"
#include "ui/ui.h"
#include "ui/progress.h"
#include "../perf.h"
#include "pmu.h"
#include "pmu-hybrid.h"
#include "string2.h"
#include "util/util.h"
struct c2c_hists {
struct hists hists;
@ -165,8 +165,8 @@ static void *c2c_he_zalloc(size_t size)
return &c2c_he->he;
out_free:
free(c2c_he->nodeset);
free(c2c_he->cpuset);
zfree(&c2c_he->nodeset);
zfree(&c2c_he->cpuset);
free(c2c_he);
return NULL;
}
@ -178,13 +178,13 @@ static void c2c_he_free(void *he)
c2c_he = container_of(he, struct c2c_hist_entry, he);
if (c2c_he->hists) {
hists__delete_entries(&c2c_he->hists->hists);
free(c2c_he->hists);
zfree(&c2c_he->hists);
}
free(c2c_he->cpuset);
free(c2c_he->nodeset);
free(c2c_he->nodestr);
free(c2c_he->node_stats);
zfree(&c2c_he->cpuset);
zfree(&c2c_he->nodeset);
zfree(&c2c_he->nodestr);
zfree(&c2c_he->node_stats);
free(c2c_he);
}
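
These free() → zfree() conversions (also in the daemon, inject and list hunks below) free a struct member and NULL the pointer in one step, so a stray double free degrades to free(NULL) and a use-after-free crashes loudly instead of corrupting the heap. A minimal equivalent of the helper (the real one lives under tools/lib; exact form assumed):

    #include <stdlib.h>

    /* free the pointee, poison the pointer */
    #define zfree(pptr) do { free(*(pptr)); *(pptr) = NULL; } while (0)

    int main(void)
    {
            char *name = malloc(16);

            zfree(&name);   /* name == NULL from here on */
            zfree(&name);   /* free(NULL) is a no-op, not a double free */
            return 0;
    }
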
@ -315,7 +315,7 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
c2c_decode_stats(&stats, mi);
he = hists__add_entry_ops(&c2c_hists->hists, &c2c_entry_ops,
&al, NULL, NULL, mi,
&al, NULL, NULL, mi, NULL,
sample, true);
if (he == NULL)
goto free_mi;
@ -349,7 +349,7 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
goto free_mi;
he = hists__add_entry_ops(&c2c_hists->hists, &c2c_entry_ops,
&al, NULL, NULL, mi,
&al, NULL, NULL, mi, NULL,
sample, true);
if (he == NULL)
goto free_mi;

View File

@ -193,7 +193,7 @@ static int session_config(struct daemon *daemon, const char *var, const char *va
if (!same) {
if (session->run) {
free(session->run);
zfree(&session->run);
pr_debug("reconfig: session %s is changed\n", name);
}
@ -924,9 +924,9 @@ static void daemon__signal(struct daemon *daemon, int sig)
static void daemon_session__delete(struct daemon_session *session)
{
free(session->base);
free(session->name);
free(session->run);
zfree(&session->base);
zfree(&session->name);
zfree(&session->run);
free(session);
}
@ -975,9 +975,9 @@ static void daemon__exit(struct daemon *daemon)
list_for_each_entry_safe(session, h, &daemon->sessions, list)
daemon_session__remove(session);
free(daemon->config_real);
free(daemon->config_base);
free(daemon->base);
zfree(&daemon->config_real);
zfree(&daemon->config_base);
zfree(&daemon->base);
}
static int daemon__reconfig(struct daemon *daemon)

View File

@ -3,10 +3,10 @@
#include <stdio.h>
#include <string.h>
#include "builtin.h"
#include "perf.h"
#include "debug.h"
#include <subcmd/parse-options.h>
#include "data-convert.h"
#include "util/util.h"
typedef int (*data_cmd_fn_t)(int argc, const char **argv);

View File

@ -6,7 +6,6 @@
* DSOs and symbol information, sort them and produce a diff.
*/
#include "builtin.h"
#include "perf.h"
#include "util/debug.h"
#include "util/event.h"
@ -26,6 +25,7 @@
#include "util/spark.h"
#include "util/block-info.h"
#include "util/stream.h"
#include "util/util.h"
#include <linux/err.h>
#include <linux/zalloc.h>
#include <subcmd/pager.h>
@ -423,7 +423,7 @@ static int diff__process_sample_event(struct perf_tool *tool,
switch (compute) {
case COMPUTE_CYCLES:
if (!hists__add_entry_ops(hists, &block_hist_ops, &al, NULL,
NULL, NULL, sample, true)) {
NULL, NULL, NULL, sample, true)) {
pr_warning("problem incrementing symbol period, "
"skipping event\n");
goto out_put;
@ -442,7 +442,7 @@ static int diff__process_sample_event(struct perf_tool *tool,
break;
default:
if (!hists__add_entry(hists, &al, NULL, NULL, NULL, sample,
if (!hists__add_entry(hists, &al, NULL, NULL, NULL, NULL, sample,
true)) {
pr_warning("problem incrementing symbol period, "
"skipping event\n");

View File

@ -7,7 +7,6 @@
#include <linux/list.h>
#include "perf.h"
#include "util/evlist.h"
#include "util/evsel.h"
#include "util/evsel_fprintf.h"
@ -18,6 +17,7 @@
#include "util/debug.h"
#include <linux/err.h>
#include "util/tool.h"
#include "util/util.h"
static int process_header_feature(struct perf_session *session __maybe_unused,
union perf_event *event __maybe_unused)

View File

@ -623,7 +623,7 @@ static int __cmd_ftrace(struct perf_ftrace *ftrace)
/* display column headers */
read_tracing_file_to_stdout("trace");
if (!ftrace->initial_delay) {
if (!ftrace->target.initial_delay) {
if (write_tracing_file("tracing_on", "1") < 0) {
pr_err("can't enable tracing\n");
goto out_close_fd;
@ -632,8 +632,8 @@ static int __cmd_ftrace(struct perf_ftrace *ftrace)
evlist__start_workload(ftrace->evlist);
if (ftrace->initial_delay) {
usleep(ftrace->initial_delay * 1000);
if (ftrace->target.initial_delay > 0) {
usleep(ftrace->target.initial_delay * 1000);
if (write_tracing_file("tracing_on", "1") < 0) {
pr_err("can't enable tracing\n");
goto out_close_fd;
@ -1164,8 +1164,8 @@ int cmd_ftrace(int argc, const char **argv)
"Size of per cpu buffer, needs to use a B, K, M or G suffix.", parse_buffer_size),
OPT_BOOLEAN(0, "inherit", &ftrace.inherit,
"Trace children processes"),
OPT_UINTEGER('D', "delay", &ftrace.initial_delay,
"Number of milliseconds to wait before starting tracing after program start"),
OPT_INTEGER('D', "delay", &ftrace.target.initial_delay,
"Number of milliseconds to wait before starting tracing after program start"),
OPT_PARENT(common_options),
};
const struct option latency_options[] = {
@ -1228,10 +1228,12 @@ int cmd_ftrace(int argc, const char **argv)
goto out_delete_filters;
}
/* Make system wide (-a) the default target. */
if (!argc && target__none(&ftrace.target))
ftrace.target.system_wide = true;
switch (subcmd) {
case PERF_FTRACE_TRACE:
if (!argc && target__none(&ftrace.target))
ftrace.target.system_wide = true;
cmd_func = __cmd_ftrace;
break;
case PERF_FTRACE_LATENCY:
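
Two things meet in the hunks above: the delay moves from struct perf_ftrace into the shared target, so 'perf ftrace' and 'perf record' keep it in one place, and the option turns signed (OPT_INTEGER) with the enable test tightened to '> 0'. A sketch of the resulting gate; treating non-positive values as "enable immediately" is my reading of the diff, not a documented contract:

    #include <unistd.h>

    struct target { int initial_delay; };   /* milliseconds, shared field */

    static void enable_tracing_after_delay(const struct target *target)
    {
            if (target->initial_delay > 0)
                    usleep(target->initial_delay * 1000);
            /* <= 0: no sleep; the signed type leaves room for negative
             * values to carry other semantics later (assumption) */
            /* write_tracing_file("tracing_on", "1") would follow here */
    }

    int main(void)
    {
            struct target t = { .initial_delay = 10 };

            enable_tracing_after_delay(&t);
            return 0;
    }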

View File

@ -14,6 +14,7 @@
#include <subcmd/run-command.h>
#include <subcmd/help.h>
#include "util/debug.h"
#include "util/util.h"
#include <linux/kernel.h>
#include <linux/string.h>
#include <linux/zalloc.h>

View File

@ -630,10 +630,8 @@ static int dso__read_build_id(struct dso *dso)
if (filename__read_build_id(dso->long_name, &dso->bid) > 0)
dso->has_build_id = true;
else if (dso->nsinfo) {
char *new_name;
char *new_name = dso__filename_with_chroot(dso, dso->long_name);
new_name = filename_with_chroot(dso->nsinfo->pid,
dso->long_name);
if (new_name && filename__read_build_id(new_name, &dso->bid) > 0)
dso->has_build_id = true;
free(new_name);
@ -753,10 +751,12 @@ int perf_event__inject_buildid(struct perf_tool *tool, union perf_event *event,
}
if (thread__find_map(thread, sample->cpumode, sample->ip, &al)) {
if (!al.map->dso->hit) {
al.map->dso->hit = 1;
dso__inject_build_id(al.map->dso, tool, machine,
sample->cpumode, al.map->flags);
struct dso *dso = map__dso(al.map);
if (!dso->hit) {
dso->hit = 1;
dso__inject_build_id(dso, tool, machine,
sample->cpumode, map__flags(al.map));
}
}
@ -1309,10 +1309,10 @@ static void guest_session__exit(struct guest_session *gs)
if (gs->tmp_fd >= 0)
close(gs->tmp_fd);
unlink(gs->tmp_file_name);
free(gs->tmp_file_name);
zfree(&gs->tmp_file_name);
}
free(gs->vcpu);
free(gs->perf_data_file);
zfree(&gs->vcpu);
zfree(&gs->perf_data_file);
}
static void get_tsc_conv(struct perf_tsc_conversion *tc, struct perf_record_time_conv *time_conv)

View File

@ -28,6 +28,7 @@ static int __cmd_kallsyms(int argc, const char **argv)
for (i = 0; i < argc; ++i) {
struct map *map;
const struct dso *dso;
struct symbol *symbol = machine__find_kernel_symbol_by_name(machine, argv[i], &map);
if (symbol == NULL) {
@ -35,9 +36,10 @@ static int __cmd_kallsyms(int argc, const char **argv)
continue;
}
dso = map__dso(map);
printf("%s: %s %s %#" PRIx64 "-%#" PRIx64 " (%#" PRIx64 "-%#" PRIx64")\n",
symbol->name, map->dso->short_name, map->dso->long_name,
map->unmap_ip(map, symbol->start), map->unmap_ip(map, symbol->end),
symbol->name, dso->short_name, dso->long_name,
map__unmap_ip(map, symbol->start), map__unmap_ip(map, symbol->end),
symbol->start, symbol->end);
}

View File

@ -1,6 +1,5 @@
// SPDX-License-Identifier: GPL-2.0
#include "builtin.h"
#include "perf.h"
#include "util/dso.h"
#include "util/evlist.h"
@ -24,6 +23,7 @@
#include "util/debug.h"
#include "util/string2.h"
#include "util/util.h"
#include <linux/kernel.h>
#include <linux/numa.h>
@ -423,7 +423,7 @@ static u64 find_callsite(struct evsel *evsel, struct perf_sample *sample)
if (!caller) {
/* found */
if (node->ms.map)
addr = map__unmap_ip(node->ms.map, node->ip);
addr = map__dso_unmap_ip(node->ms.map, node->ip);
else
addr = node->ip;
@ -1024,7 +1024,7 @@ static void __print_slab_result(struct rb_root *root,
if (sym != NULL)
snprintf(buf, sizeof(buf), "%s+%" PRIx64 "", sym->name,
addr - map->unmap_ip(map, sym->start));
addr - map__unmap_ip(map, sym->start));
else
snprintf(buf, sizeof(buf), "%#" PRIx64 "", addr);
printf(" %-34s |", buf);

File diff suppressed because it is too large

View File

@ -6,7 +6,6 @@
*/
#include "builtin.h"
#include "perf.h"
#include "util/data.h"
#include "util/evlist.h"
@ -20,6 +19,7 @@
#include "util/string2.h"
#include "util/callchain.h"
#include "util/evsel_fprintf.h"
#include "util/util.h"
#include <subcmd/pager.h>
#include <subcmd/parse-options.h>

View File

@ -127,7 +127,7 @@ static void default_print_event(void *ps, const char *pmu_name, const char *topi
if (strcmp(print_state->last_topic, topic ?: "")) {
if (topic)
printf("\n%s:\n", topic);
free(print_state->last_topic);
zfree(&print_state->last_topic);
print_state->last_topic = strdup(topic ?: "");
}
@ -168,6 +168,7 @@ static void default_print_metric(void *ps,
const char *desc,
const char *long_desc,
const char *expr,
const char *threshold,
const char *unit __maybe_unused)
{
struct print_state *print_state = ps;
@ -196,7 +197,7 @@ static void default_print_metric(void *ps,
else
printf("%s\n", group);
}
free(print_state->last_metricgroups);
zfree(&print_state->last_metricgroups);
print_state->last_metricgroups = strdup(group ?: "");
}
if (!print_state->metrics)
@ -227,6 +228,11 @@ static void default_print_metric(void *ps,
wordwrap(expr, 8, pager_get_columns(), 0);
printf("]\n");
}
if (threshold && print_state->detailed) {
printf("%*s", 8, "[");
wordwrap(threshold, 8, pager_get_columns(), 0);
printf("]\n");
}
}
struct json_print_state {
@ -272,10 +278,10 @@ static void fix_escape_printf(struct strbuf *buf, const char *fmt, ...)
strbuf_addstr(buf, "\\n");
break;
case '\\':
__fallthrough;
fallthrough;
case '\"':
strbuf_addch(buf, '\\');
__fallthrough;
fallthrough;
default:
strbuf_addch(buf, s[s_pos]);
break;
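
The __fallthrough → fallthrough renames across this series switch to the annotation from the kernel's compiler_attributes.h, which expands to __attribute__((__fallthrough__)) where the compiler supports it and tells -Wimplicit-fallthrough that the missing break is intentional. In isolation:

    #if defined(__GNUC__) && __GNUC__ >= 7
    # define fallthrough __attribute__((__fallthrough__))
    #else
    # define fallthrough do {} while (0)    /* older compilers: plain no-op */
    #endif

    static int needs_escape(char c)
    {
            switch (c) {
            case '\\':
                    fallthrough;    /* backslash and quote take the same path */
            case '"':
                    return 1;
            default:
                    return 0;
            }
    }

    int main(void)
    {
            return needs_escape('"') ? 0 : 1;
    }
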
@ -367,7 +373,7 @@ static void json_print_event(void *ps, const char *pmu_name, const char *topic,
static void json_print_metric(void *ps __maybe_unused, const char *group,
const char *name, const char *desc,
const char *long_desc, const char *expr,
const char *unit)
const char *threshold, const char *unit)
{
struct json_print_state *print_state = ps;
bool need_sep = false;
@ -388,6 +394,11 @@ static void json_print_metric(void *ps __maybe_unused, const char *group,
fix_escape_printf(&buf, "%s\t\"MetricExpr\": \"%S\"", need_sep ? ",\n" : "", expr);
need_sep = true;
}
if (threshold) {
fix_escape_printf(&buf, "%s\t\"MetricThreshold\": \"%S\"", need_sep ? ",\n" : "",
threshold);
need_sep = true;
}
if (unit) {
fix_escape_printf(&buf, "%s\t\"ScaleUnit\": \"%S\"", need_sep ? ",\n" : "", unit);
need_sep = true;
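
With this, the JSON output of 'perf list' carries a MetricThreshold key next to MetricExpr whenever a metric defines one. The emitted object ends up looking roughly like the following (metric name and expressions invented for illustration):

    {
            "MetricGroup": "TopdownL1",
            "MetricName": "tma_frontend_bound",
            "MetricExpr": "IDQ_UOPS_NOT_DELIVERED.CORE / (4 * SLOTS)",
            "MetricThreshold": "tma_frontend_bound > 0.15",
            "ScaleUnit": "100%"
    }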

View File

@ -60,7 +60,7 @@ static bool show_thread_stats;
static bool show_lock_addrs;
static bool show_lock_owner;
static bool use_bpf;
static unsigned long bpf_map_entries = 10240;
static unsigned long bpf_map_entries = MAX_ENTRIES;
static int max_stack_depth = CONTENTION_STACK_DEPTH;
static int stack_skip = CONTENTION_STACK_SKIP;
static int print_nr_entries = INT_MAX / 2;
@ -77,7 +77,7 @@ static enum lock_aggr_mode aggr_mode = LOCK_AGGR_ADDR;
static bool needs_callstack(void)
{
return verbose > 0 || !list_empty(&callstack_filters);
return !list_empty(&callstack_filters);
}
static struct thread_stat *thread_stat_find(u32 tid)
@ -900,7 +900,7 @@ static int get_symbol_name_offset(struct map *map, struct symbol *sym, u64 ip,
return 0;
}
offset = map->map_ip(map, ip) - sym->start;
offset = map__map_ip(map, ip) - sym->start;
if (offset)
return scnprintf(buf, size, "%s+%#lx", sym->name, offset);
@ -1070,7 +1070,7 @@ static int report_lock_contention_begin_event(struct evsel *evsel,
return -ENOMEM;
}
addrs[filters.nr_addrs++] = kmap->unmap_ip(kmap, sym->start);
addrs[filters.nr_addrs++] = map__unmap_ip(kmap, sym->start);
filters.addrs = addrs;
}
}
@ -1323,10 +1323,10 @@ static void print_bad_events(int bad, int total)
for (i = 0; i < BROKEN_MAX; i++)
broken += bad_hist[i];
if (quiet || (broken == 0 && verbose <= 0))
if (quiet || total == 0 || (broken == 0 && verbose <= 0))
return;
pr_info("\n=== output for debug===\n\n");
pr_info("\n=== output for debug ===\n\n");
pr_info("bad: %d, total: %d\n", bad, total);
pr_info("bad rate: %.2f %%\n", (double)bad / (double)total * 100);
pr_info("histogram of events caused bad sequence\n");
@ -1548,27 +1548,41 @@ static void sort_result(void)
static const struct {
unsigned int flags;
const char *str;
const char *name;
} lock_type_table[] = {
{ 0, "semaphore" },
{ LCB_F_SPIN, "spinlock" },
{ LCB_F_SPIN | LCB_F_READ, "rwlock:R" },
{ LCB_F_SPIN | LCB_F_WRITE, "rwlock:W"},
{ LCB_F_READ, "rwsem:R" },
{ LCB_F_WRITE, "rwsem:W" },
{ LCB_F_RT, "rtmutex" },
{ LCB_F_RT | LCB_F_READ, "rwlock-rt:R" },
{ LCB_F_RT | LCB_F_WRITE, "rwlock-rt:W"},
{ LCB_F_PERCPU | LCB_F_READ, "pcpu-sem:R" },
{ LCB_F_PERCPU | LCB_F_WRITE, "pcpu-sem:W" },
{ LCB_F_MUTEX, "mutex" },
{ LCB_F_MUTEX | LCB_F_SPIN, "mutex" },
{ 0, "semaphore", "semaphore" },
{ LCB_F_SPIN, "spinlock", "spinlock" },
{ LCB_F_SPIN | LCB_F_READ, "rwlock:R", "rwlock" },
{ LCB_F_SPIN | LCB_F_WRITE, "rwlock:W", "rwlock" },
{ LCB_F_READ, "rwsem:R", "rwsem" },
{ LCB_F_WRITE, "rwsem:W", "rwsem" },
{ LCB_F_RT, "rt-mutex", "rt-mutex" },
{ LCB_F_RT | LCB_F_READ, "rwlock-rt:R", "rwlock-rt" },
{ LCB_F_RT | LCB_F_WRITE, "rwlock-rt:W", "rwlock-rt" },
{ LCB_F_PERCPU | LCB_F_READ, "pcpu-sem:R", "percpu-rwsem" },
{ LCB_F_PERCPU | LCB_F_WRITE, "pcpu-sem:W", "percpu-rwsem" },
{ LCB_F_MUTEX, "mutex", "mutex" },
{ LCB_F_MUTEX | LCB_F_SPIN, "mutex", "mutex" },
/* alias for get_type_flag() */
{ LCB_F_MUTEX | LCB_F_SPIN, "mutex-spin" },
{ LCB_F_MUTEX | LCB_F_SPIN, "mutex-spin", "mutex" },
};
static const char *get_type_str(unsigned int flags)
{
flags &= LCB_F_MAX_FLAGS - 1;
for (unsigned int i = 0; i < ARRAY_SIZE(lock_type_table); i++) {
if (lock_type_table[i].flags == flags)
return lock_type_table[i].str;
}
return "unknown";
}
static const char *get_type_name(unsigned int flags)
{
flags &= LCB_F_MAX_FLAGS - 1;
for (unsigned int i = 0; i < ARRAY_SIZE(lock_type_table); i++) {
if (lock_type_table[i].flags == flags)
return lock_type_table[i].name;
@ -1582,6 +1596,10 @@ static unsigned int get_type_flag(const char *str)
if (!strcmp(lock_type_table[i].name, str))
return lock_type_table[i].flags;
}
for (unsigned int i = 0; i < ARRAY_SIZE(lock_type_table); i++) {
if (!strcmp(lock_type_table[i].str, str))
return lock_type_table[i].flags;
}
return UINT_MAX;
}
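
The table now carries two spellings per entry: get_type_str() returns the flag-qualified label ("rwlock:R") for per-direction output, get_type_name() the bare family name ("rwlock") printed next to lock addresses, and get_type_flag() accepts either form when parsing the type filter. A usage sketch against the helpers above (flag values from the diff):

    unsigned int flags = LCB_F_SPIN | LCB_F_READ;

    printf("%s (%s)\n", get_type_str(flags), get_type_name(flags));
    /* -> "rwlock:R (rwlock)" */

    /* both spellings now resolve via get_type_flag() */
    assert(get_type_flag("rwlock:R") == (LCB_F_SPIN | LCB_F_READ));
    assert(get_type_flag("rwlock") != UINT_MAX);
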
@ -1605,6 +1623,26 @@ static void sort_contention_result(void)
sort_result();
}
static void print_bpf_events(int total, struct lock_contention_fails *fails)
{
/* Output for debug, this has to be removed */
int broken = fails->task + fails->stack + fails->time + fails->data;
if (quiet || total == 0 || (broken == 0 && verbose <= 0))
return;
total += broken;
pr_info("\n=== output for debug ===\n\n");
pr_info("bad: %d, total: %d\n", broken, total);
pr_info("bad rate: %.2f %%\n", (double)broken / (double)total * 100);
pr_info("histogram of failure reasons\n");
pr_info(" %10s: %d\n", "task", fails->task);
pr_info(" %10s: %d\n", "stack", fails->stack);
pr_info(" %10s: %d\n", "time", fails->time);
pr_info(" %10s: %d\n", "data", fails->data);
}
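
Since the BPF path no longer abuses the bad-hist counters, its losses get their own debug report, broken down by failure reason. With failures present (or under -v) the output comes out like this, per the pr_info() calls above (numbers invented):

    === output for debug ===

    bad: 4, total: 1004
    bad rate: 0.40 %
    histogram of failure reasons
           task: 1
          stack: 3
           time: 0
           data: 0
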
static void print_contention_result(struct lock_contention *con)
{
struct lock_stat *st;
@ -1632,8 +1670,6 @@ static void print_contention_result(struct lock_contention *con)
}
bad = total = printed = 0;
if (use_bpf)
bad = bad_hist[BROKEN_CONTENDED];
while ((st = pop_from_result())) {
struct thread *t;
@ -1662,8 +1698,8 @@ static void print_contention_result(struct lock_contention *con)
pid, pid == -1 ? "Unknown" : thread__comm_str(t));
break;
case LOCK_AGGR_ADDR:
pr_info(" %016llx %s\n", (unsigned long long)st->addr,
st->name ? : "");
pr_info(" %016llx %s (%s)\n", (unsigned long long)st->addr,
st->name, get_type_name(st->flags));
break;
default:
break;
@ -1690,7 +1726,21 @@ static void print_contention_result(struct lock_contention *con)
break;
}
print_bad_events(bad, total);
if (print_nr_entries) {
/* update the total/bad stats */
while ((st = pop_from_result())) {
total += use_bpf ? st->nr_contended : 1;
if (st->broken)
bad++;
}
}
/* some entries are collected but hidden by the callstack filter */
total += con->nr_filtered;
if (use_bpf)
print_bpf_events(total, &con->fails);
else
print_bad_events(bad, total);
}
static bool force;
@ -1917,9 +1967,6 @@ static int __cmd_contention(int argc, const char **argv)
lock_contention_stop();
lock_contention_read(&con);
/* abuse bad hist stats for lost entries */
bad_hist[BROKEN_CONTENDED] = con.lost;
} else {
err = perf_session__process_events(session);
if (err)
@ -2091,46 +2138,15 @@ static int parse_lock_type(const struct option *opt __maybe_unused, const char *
unsigned int flags = get_type_flag(tok);
if (flags == -1U) {
char buf[32];
if (strchr(tok, ':'))
continue;
/* try :R and :W suffixes for rwlock, rwsem, ... */
scnprintf(buf, sizeof(buf), "%s:R", tok);
flags = get_type_flag(buf);
if (flags != UINT_MAX) {
if (!add_lock_type(flags)) {
ret = -1;
break;
}
}
scnprintf(buf, sizeof(buf), "%s:W", tok);
flags = get_type_flag(buf);
if (flags != UINT_MAX) {
if (!add_lock_type(flags)) {
ret = -1;
break;
}
}
continue;
pr_err("Unknown lock flags: %s\n", tok);
ret = -1;
break;
}
if (!add_lock_type(flags)) {
ret = -1;
break;
}
if (!strcmp(tok, "mutex")) {
flags = get_type_flag("mutex-spin");
if (flags != UINT_MAX) {
if (!add_lock_type(flags)) {
ret = -1;
break;
}
}
}
}
free(s);
@ -2291,7 +2307,7 @@ int cmd_lock(int argc, const char **argv)
"Trace on existing process id"),
OPT_STRING(0, "tid", &target.tid, "tid",
"Trace on existing thread id (exclusive to --pid)"),
OPT_CALLBACK(0, "map-nr-entries", &bpf_map_entries, "num",
OPT_CALLBACK('M', "map-nr-entries", &bpf_map_entries, "num",
"Max number of BPF map entries", parse_map_entry),
OPT_CALLBACK(0, "max-stack", &max_stack_depth, "num",
"Set the maximum stack depth when collecting lopck contention, "

View File

@ -4,7 +4,6 @@
#include <sys/stat.h>
#include <unistd.h>
#include "builtin.h"
#include "perf.h"
#include <subcmd/parse-options.h>
#include "util/auxtrace.h"
@ -22,6 +21,7 @@
#include "util/pmu-hybrid.h"
#include "util/sample.h"
#include "util/string2.h"
#include "util/util.h"
#include <linux/err.h>
#define MEM_OPERATION_LOAD 0x1
@ -200,6 +200,7 @@ dump_raw_samples(struct perf_tool *tool,
struct addr_location al;
const char *fmt, *field_sep;
char str[PAGE_SIZE_NAME_LEN];
struct dso *dso = NULL;
if (machine__resolve(machine, &al, sample) < 0) {
fprintf(stderr, "problem processing %d event, skipping it.\n",
@ -210,8 +211,11 @@ dump_raw_samples(struct perf_tool *tool,
if (al.filtered || (mem->hide_unresolved && al.sym == NULL))
goto out_put;
if (al.map != NULL)
al.map->dso->hit = 1;
if (al.map != NULL) {
dso = map__dso(al.map);
if (dso)
dso->hit = 1;
}
field_sep = symbol_conf.field_sep;
if (field_sep) {
@ -252,7 +256,7 @@ dump_raw_samples(struct perf_tool *tool,
symbol_conf.field_sep,
sample->data_src,
symbol_conf.field_sep,
al.map ? (al.map->dso ? al.map->dso->long_name : "???") : "???",
dso ? dso->long_name : "???",
al.sym ? al.sym->name : "???");
out_put:
addr_location__put(&al);

View File

@ -715,7 +715,7 @@ __cmd_probe(int argc, const char **argv)
pr_err(" Error: --bootconfig doesn't support uprobes.\n");
return -EINVAL;
}
__fallthrough;
fallthrough;
case 'a':
/* Ensure the last given target is used */

View File

@ -52,6 +52,7 @@
#include "util/pmu-hybrid.h"
#include "util/evlist-hybrid.h"
#include "util/off_cpu.h"
#include "util/bpf-filter.h"
#include "asm/bug.h"
#include "perf.h"
#include "cputopo.h"
@ -1292,7 +1293,7 @@ static int record__open(struct record *rec)
* dummy event so that we can track PERF_RECORD_MMAP to cover the delay
* of waiting or event synthesis.
*/
if (opts->initial_delay || target__has_cpu(&opts->target) ||
if (opts->target.initial_delay || target__has_cpu(&opts->target) ||
perf_pmu__has_hybrid()) {
pos = evlist__get_tracking_event(evlist);
if (!evsel__is_dummy_event(pos)) {
@ -1307,7 +1308,7 @@ static int record__open(struct record *rec)
* Enable the dummy event when the process is forked for
* initial_delay, immediately for system wide.
*/
if (opts->initial_delay && !pos->immediate &&
if (opts->target.initial_delay && !pos->immediate &&
!target__has_cpu(&opts->target))
pos->core.attr.enable_on_exec = 1;
else
@ -1352,7 +1353,7 @@ try_again:
if (evlist__apply_filters(evlist, &pos)) {
pr_err("failed to set filter \"%s\" on event %s with %d (%s)\n",
pos->filter, evsel__name(pos), errno,
pos->filter ?: "BPF", evsel__name(pos), errno,
str_error_r(errno, msg, sizeof(msg)));
rc = -1;
goto out;
@ -1856,24 +1857,16 @@ record__switch_output(struct record *rec, bool at_exit)
return fd;
}
static void __record__read_lost_samples(struct record *rec, struct evsel *evsel,
static void __record__save_lost_samples(struct record *rec, struct evsel *evsel,
struct perf_record_lost_samples *lost,
int cpu_idx, int thread_idx)
int cpu_idx, int thread_idx, u64 lost_count,
u16 misc_flag)
{
struct perf_counts_values count;
struct perf_sample_id *sid;
struct perf_sample sample = {};
int id_hdr_size;
if (perf_evsel__read(&evsel->core, cpu_idx, thread_idx, &count) < 0) {
pr_err("read LOST count failed\n");
return;
}
if (count.lost == 0)
return;
lost->lost = count.lost;
lost->lost = lost_count;
if (evsel->core.ids) {
sid = xyarray__entry(evsel->core.sample_id, cpu_idx, thread_idx);
sample.id = sid->id;
@ -1882,6 +1875,7 @@ static void __record__read_lost_samples(struct record *rec, struct evsel *evsel,
id_hdr_size = perf_event__synthesize_id_sample((void *)(lost + 1),
evsel->core.attr.sample_type, &sample);
lost->header.size = sizeof(*lost) + id_hdr_size;
lost->header.misc = misc_flag;
record__write(rec, NULL, lost, lost->header.size);
}
@ -1905,6 +1899,7 @@ static void record__read_lost_samples(struct record *rec)
evlist__for_each_entry(session->evlist, evsel) {
struct xyarray *xy = evsel->core.sample_id;
u64 lost_count;
if (xy == NULL || evsel->core.fd == NULL)
continue;
@ -1916,12 +1911,27 @@ static void record__read_lost_samples(struct record *rec)
for (int x = 0; x < xyarray__max_x(xy); x++) {
for (int y = 0; y < xyarray__max_y(xy); y++) {
__record__read_lost_samples(rec, evsel, lost, x, y);
struct perf_counts_values count;
if (perf_evsel__read(&evsel->core, x, y, &count) < 0) {
pr_debug("read LOST count failed\n");
goto out;
}
if (count.lost) {
__record__save_lost_samples(rec, evsel, lost,
x, y, count.lost, 0);
}
}
}
}
free(lost);
lost_count = perf_bpf_filter__lost_count(evsel);
if (lost_count)
__record__save_lost_samples(rec, evsel, lost, 0, 0, lost_count,
PERF_RECORD_MISC_LOST_SAMPLES_BPF);
}
out:
free(lost);
}
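
The refactor lets one writer emit two flavours of PERF_RECORD_LOST_SAMPLES: kernel ring-buffer losses (misc flag 0) and samples deliberately dropped by the new BPF sample filter, tagged PERF_RECORD_MISC_LOST_SAMPLES_BPF so consumers can tell intentional filtering from real loss. A consumer-side sketch of telling them apart (types abridged, counter names hypothetical):

    /* when processing a PERF_RECORD_LOST_SAMPLES event */
    if (event->header.misc & PERF_RECORD_MISC_LOST_SAMPLES_BPF)
            stats->dropped_by_filter += event->lost_samples.lost;  /* intentional */
    else
            stats->lost_by_kernel += event->lost_samples.lost;     /* real loss */
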
static volatile sig_atomic_t workload_exec_errno;
@ -2474,7 +2484,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
rec->tool.ordered_events = false;
}
if (!rec->evlist->core.nr_groups)
if (evlist__nr_groups(rec->evlist) == 0)
perf_header__clear_feat(&session->header, HEADER_GROUP_DESC);
if (data->is_pipe) {
@ -2522,7 +2532,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
* (apart from group members) have enable_on_exec=1 set,
* so don't spoil it by prematurely enabling them.
*/
if (!target__none(&opts->target) && !opts->initial_delay)
if (!target__none(&opts->target) && !opts->target.initial_delay)
evlist__enable(rec->evlist);
/*
@ -2574,10 +2584,10 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
evlist__start_workload(rec->evlist);
}
if (opts->initial_delay) {
if (opts->target.initial_delay) {
pr_info(EVLIST_DISABLED_MSG);
if (opts->initial_delay > 0) {
usleep(opts->initial_delay * USEC_PER_MSEC);
if (opts->target.initial_delay > 0) {
usleep(opts->target.initial_delay * USEC_PER_MSEC);
evlist__enable(rec->evlist);
pr_info(EVLIST_ENABLED_MSG);
}

View File

@ -143,6 +143,10 @@ static int report__config(const char *var, const char *value, void *cb)
if (!strcmp(var, "report.sort_order")) {
default_sort_order = strdup(value);
if (!default_sort_order) {
pr_err("Not enough memory for report.sort_order\n");
return -1;
}
return 0;
}
@ -151,6 +155,7 @@ static int report__config(const char *var, const char *value, void *cb)
return 0;
}
pr_debug("%s variable unknown, ignoring...", var);
return 0;
}
@ -314,7 +319,7 @@ static int process_sample_event(struct perf_tool *tool,
}
if (al.map != NULL)
al.map->dso->hit = 1;
map__dso(al.map)->hit = 1;
if (ui__has_annotation() || rep->symbol_ipc || rep->total_cycles_mode) {
hist__account_cycles(sample->branch_stack, &al, sample,
@ -603,7 +608,7 @@ static void report__warn_kptr_restrict(const struct report *rep)
return;
if (kernel_map == NULL ||
(kernel_map->dso->hit &&
(map__dso(kernel_map)->hit &&
(kernel_kmap->ref_reloc_sym == NULL ||
kernel_kmap->ref_reloc_sym->addr == 0))) {
const char *desc =
@ -723,8 +728,7 @@ static int hists__resort_cb(struct hist_entry *he, void *arg)
if (rep->symbol_ipc && sym && !sym->annotate2) {
struct evsel *evsel = hists_to_evsel(he->hists);
symbol__annotate2(&he->ms, evsel,
&annotation__default_options, NULL);
symbol__annotate2(&he->ms, evsel, &rep->annotation_opts, NULL);
}
return 0;
@ -840,17 +844,21 @@ static struct task *tasks_list(struct task *task, struct machine *machine)
static size_t maps__fprintf_task(struct maps *maps, int indent, FILE *fp)
{
size_t printed = 0;
struct map *map;
struct map_rb_node *rb_node;
maps__for_each_entry(maps, rb_node) {
struct map *map = rb_node->map;
const struct dso *dso = map__dso(map);
u32 prot = map__prot(map);
maps__for_each_entry(maps, map) {
printed += fprintf(fp, "%*s %" PRIx64 "-%" PRIx64 " %c%c%c%c %08" PRIx64 " %" PRIu64 " %s\n",
indent, "", map->start, map->end,
map->prot & PROT_READ ? 'r' : '-',
map->prot & PROT_WRITE ? 'w' : '-',
map->prot & PROT_EXEC ? 'x' : '-',
map->flags & MAP_SHARED ? 's' : 'p',
map->pgoff,
map->dso->id.ino, map->dso->name);
indent, "", map__start(map), map__end(map),
prot & PROT_READ ? 'r' : '-',
prot & PROT_WRITE ? 'w' : '-',
prot & PROT_EXEC ? 'x' : '-',
map__flags(map) ? 's' : 'p',
map__pgoff(map),
dso->id.ino, dso->name);
}
return printed;
@ -1218,11 +1226,11 @@ int cmd_report(int argc, const char **argv)
.max_stack = PERF_MAX_STACK_DEPTH,
.pretty_printing_style = "normal",
.socket_filter = -1,
.annotation_opts = annotation__default_options,
.skip_empty = true,
};
char *sort_order_help = sort_help("sort by key(s):");
char *field_order_help = sort_help("output field(s): overhead period sample ");
const char *disassembler_style = NULL, *objdump_path = NULL, *addr2line_path = NULL;
const struct option options[] = {
OPT_STRING('i', "input", &input_name, "file",
"input file name"),
@ -1319,7 +1327,7 @@ int cmd_report(int argc, const char **argv)
"Interleave source code with assembly code (default)"),
OPT_BOOLEAN(0, "asm-raw", &report.annotation_opts.show_asm_raw,
"Display raw encoding of assembly instructions (default)"),
OPT_STRING('M', "disassembler-style", &report.annotation_opts.disassembler_style, "disassembler style",
OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style",
"Specify disassembler style (e.g. -M intel for intel syntax)"),
OPT_STRING(0, "prefix", &report.annotation_opts.prefix, "prefix",
"Add prefix to source file path names in programs (with --prefix-strip)"),
@ -1338,8 +1346,10 @@ int cmd_report(int argc, const char **argv)
parse_branch_mode),
OPT_BOOLEAN(0, "branch-history", &branch_call_mode,
"add last branch records to call history"),
OPT_STRING(0, "objdump", &report.annotation_opts.objdump_path, "path",
OPT_STRING(0, "objdump", &objdump_path, "path",
"objdump binary to use for disassembly and annotations"),
OPT_STRING(0, "addr2line", &addr2line_path, "path",
"addr2line binary to use for line numbers"),
OPT_BOOLEAN(0, "demangle", &symbol_conf.demangle,
"Disable symbol demangling"),
OPT_BOOLEAN(0, "demangle-kernel", &symbol_conf.demangle_kernel,
@ -1398,6 +1408,8 @@ int cmd_report(int argc, const char **argv)
if (ret < 0)
goto exit;
annotation_options__init(&report.annotation_opts);
ret = perf_config(report__config, &report);
if (ret)
goto exit;
@ -1414,6 +1426,22 @@ int cmd_report(int argc, const char **argv)
report.symbol_filter_str = argv[0];
}
if (disassembler_style) {
report.annotation_opts.disassembler_style = strdup(disassembler_style);
if (!report.annotation_opts.disassembler_style)
return -ENOMEM;
}
if (objdump_path) {
report.annotation_opts.objdump_path = strdup(objdump_path);
if (!report.annotation_opts.objdump_path)
return -ENOMEM;
}
if (addr2line_path) {
symbol_conf.addr2line_path = strdup(addr2line_path);
if (!symbol_conf.addr2line_path)
return -ENOMEM;
}
if (annotate_check_args(&report.annotation_opts) < 0) {
ret = -EINVAL;
goto exit;
@ -1481,7 +1509,7 @@ repeat:
setup_forced_leader(&report, session->evlist);
if (symbol_conf.group_sort_idx && !session->evlist->core.nr_groups) {
if (symbol_conf.group_sort_idx && evlist__nr_groups(session->evlist) == 0) {
parse_options_usage(NULL, options, "group-sort-idx", 0);
ret = -EINVAL;
goto error;
@ -1701,6 +1729,7 @@ error:
zstd_fini(&(session->zstd_data));
perf_session__delete(session);
exit:
annotation_options__exit(&report.annotation_opts);
free(sort_order_help);
free(field_order_help);
return ret;

View File

@ -1,6 +1,5 @@
// SPDX-License-Identifier: GPL-2.0
#include "builtin.h"
#include "perf.h"
#include "perf-sys.h"
#include "util/cpumap.h"
@ -27,6 +26,7 @@
#include "util/debug.h"
#include "util/event.h"
#include "util/util.h"
#include <linux/kernel.h>
#include <linux/log2.h>
@ -1516,6 +1516,14 @@ static int process_sched_wakeup_event(struct perf_tool *tool,
return 0;
}
static int process_sched_wakeup_ignore(struct perf_tool *tool __maybe_unused,
struct evsel *evsel __maybe_unused,
struct perf_sample *sample __maybe_unused,
struct machine *machine __maybe_unused)
{
return 0;
}
union map_priv {
void *ptr;
bool color;
@ -1816,10 +1824,11 @@ static int perf_sched__process_comm(struct perf_tool *tool __maybe_unused,
static int perf_sched__read_events(struct perf_sched *sched)
{
const struct evsel_str_handler handlers[] = {
struct evsel_str_handler handlers[] = {
{ "sched:sched_switch", process_sched_switch_event, },
{ "sched:sched_stat_runtime", process_sched_runtime_event, },
{ "sched:sched_wakeup", process_sched_wakeup_event, },
{ "sched:sched_waking", process_sched_wakeup_event, },
{ "sched:sched_wakeup_new", process_sched_wakeup_event, },
{ "sched:sched_migrate_task", process_sched_migrate_task_event, },
};
@ -1839,6 +1848,10 @@ static int perf_sched__read_events(struct perf_sched *sched)
symbol__init(&session->header.env);
/* prefer sched_waking if it is captured */
if (evlist__find_tracepoint_by_name(session->evlist, "sched:sched_waking"))
handlers[2].handler = process_sched_wakeup_ignore;
if (perf_session__set_tracepoints_handlers(session, handlers))
goto out_delete;
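
The handler table loses its const so one slot can be patched at runtime: when the data contains sched:sched_waking, the sched:sched_wakeup slot is pointed at the new no-op handler, because otherwise every wakeup recorded by both tracepoints would be counted twice. The idiom, in short (index shown symbolically):

    /* prefer sched_waking when both tracepoints were captured */
    if (evlist__find_tracepoint_by_name(session->evlist, "sched:sched_waking"))
            handlers[SCHED_WAKEUP_IDX].handler = process_sched_wakeup_ignore;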

View File

@ -1011,12 +1011,12 @@ static int perf_sample__fprintf_brstackoff(struct perf_sample *sample,
to = entries[i].to;
if (thread__find_map_fb(thread, sample->cpumode, from, &alf) &&
!alf.map->dso->adjust_symbols)
from = map__map_ip(alf.map, from);
!map__dso(alf.map)->adjust_symbols)
from = map__dso_map_ip(alf.map, from);
if (thread__find_map_fb(thread, sample->cpumode, to, &alt) &&
!alt.map->dso->adjust_symbols)
to = map__map_ip(alt.map, to);
!map__dso(alt.map)->adjust_symbols)
to = map__dso_map_ip(alt.map, to);
printed += fprintf(fp, " 0x%"PRIx64, from);
if (PRINT_FIELD(DSO)) {
@ -1044,6 +1044,7 @@ static int grab_bb(u8 *buffer, u64 start, u64 end,
long offset, len;
struct addr_location al;
bool kernel;
struct dso *dso;
if (!start || !end)
return 0;
@ -1074,11 +1075,11 @@ static int grab_bb(u8 *buffer, u64 start, u64 end,
return 0;
}
if (!thread__find_map(thread, *cpumode, start, &al) || !al.map->dso) {
if (!thread__find_map(thread, *cpumode, start, &al) || (dso = map__dso(al.map)) == NULL) {
pr_debug("\tcannot resolve %" PRIx64 "-%" PRIx64 "\n", start, end);
return 0;
}
if (al.map->dso->data.status == DSO_DATA_STATUS_ERROR) {
if (dso->data.status == DSO_DATA_STATUS_ERROR) {
pr_debug("\tcannot resolve %" PRIx64 "-%" PRIx64 "\n", start, end);
return 0;
}
@ -1086,11 +1087,11 @@ static int grab_bb(u8 *buffer, u64 start, u64 end,
/* Load maps to ensure dso->is_64_bit has been updated */
map__load(al.map);
offset = al.map->map_ip(al.map, start);
len = dso__data_read_offset(al.map->dso, machine, offset, (u8 *)buffer,
offset = map__map_ip(al.map, start);
len = dso__data_read_offset(dso, machine, offset, (u8 *)buffer,
end - start + MAXINSN);
*is64bit = al.map->dso->is_64_bit;
*is64bit = dso->is_64_bit;
if (len <= 0)
pr_debug("\tcannot fetch code for block at %" PRIx64 "-%" PRIx64 "\n",
start, end);
@ -1104,10 +1105,11 @@ static int map__fprintf_srccode(struct map *map, u64 addr, FILE *fp, struct srcc
unsigned line;
int len;
char *srccode;
struct dso *dso;
if (!map || !map->dso)
if (!map || (dso = map__dso(map)) == NULL)
return 0;
srcfile = get_srcline_split(map->dso,
srcfile = get_srcline_split(dso,
map__rip_2objdump(map, addr),
&line);
if (!srcfile)
@ -1206,7 +1208,7 @@ static int ip__fprintf_sym(uint64_t addr, struct thread *thread,
if (al.addr < al.sym->end)
off = al.addr - al.sym->start;
else
off = al.addr - al.map->start - al.sym->start;
off = al.addr - map__start(al.map) - al.sym->start;
printed += fprintf(fp, "\t%s", al.sym->name);
if (off)
printed += fprintf(fp, "%+d", off);
@ -1906,7 +1908,7 @@ static int perf_sample__fprintf_synth_evt(struct perf_sample *sample, FILE *fp)
struct perf_synth_intel_evt *data = perf_sample__synth_ptr(sample);
const char *cfe[32] = {NULL, "INTR", "IRET", "SMI", "RSM", "SIPI",
"INIT", "VMENTRY", "VMEXIT", "VMEXIT_INTR",
"SHUTDOWN"};
"SHUTDOWN", NULL, "UINTR", "UIRET"};
const char *evd[64] = {"PFA", "VMXQ", "VMXR"};
const char *s;
int len, i;
@ -2072,10 +2074,6 @@ static void perf_sample__fprint_metric(struct perf_script *script,
if (evsel_script(leader)->gnum++ == 0)
perf_stat__reset_shadow_stats();
val = sample->period * evsel->scale;
perf_stat__update_shadow_stats(evsel,
val,
sample->cpu,
&rt_stat);
evsel_script(evsel)->val = val;
if (evsel_script(leader)->gnum == leader->core.nr_members) {
for_each_group_member (ev2, leader) {
@ -2083,8 +2081,7 @@ static void perf_sample__fprint_metric(struct perf_script *script,
evsel_script(ev2)->val,
sample->cpu,
&ctx,
NULL,
&rt_stat);
NULL);
}
evsel_script(leader)->gnum = 0;
}
@ -2318,8 +2315,8 @@ static void setup_scripting(void)
{
#ifdef HAVE_LIBTRACEEVENT
setup_perl_scripting();
setup_python_scripting();
#endif
setup_python_scripting();
}
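
Moving setup_python_scripting() below the #ifdef is what decouples python scripting from libtraceevent; only the perl binding keeps the hard dependency. Annotated:

    static void setup_scripting(void)
    {
    #ifdef HAVE_LIBTRACEEVENT
            setup_perl_scripting();    /* perl binding still needs libtraceevent */
    #endif
            setup_python_scripting();  /* python now initializes unconditionally */
    }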
static int flush_scripting(void)
@ -2794,8 +2791,6 @@ static int __cmd_script(struct perf_script *script)
signal(SIGINT, sig_handler);
perf_stat__init_shadow_stats();
/* override event processing functions */
if (script->show_task_events) {
script->tool.comm = process_comm_event;

View File

@ -41,7 +41,6 @@
*/
#include "builtin.h"
#include "perf.h"
#include "util/cgroup.h"
#include <subcmd/parse-options.h>
#include "util/parse-events.h"
@ -71,6 +70,7 @@
#include "util/bpf_counter.h"
#include "util/iostat.h"
#include "util/pmu-hybrid.h"
#include "util/util.h"
#include "asm/bug.h"
#include <linux/time64.h>
@ -100,71 +100,6 @@
static void print_counters(struct timespec *ts, int argc, const char **argv);
/* Default events used for perf stat -T */
static const char *transaction_attrs = {
"task-clock,"
"{"
"instructions,"
"cycles,"
"cpu/cycles-t/,"
"cpu/tx-start/,"
"cpu/el-start/,"
"cpu/cycles-ct/"
"}"
};
/* More limited version when the CPU does not have all events. */
static const char * transaction_limited_attrs = {
"task-clock,"
"{"
"instructions,"
"cycles,"
"cpu/cycles-t/,"
"cpu/tx-start/"
"}"
};
static const char * topdown_attrs[] = {
"topdown-total-slots",
"topdown-slots-retired",
"topdown-recovery-bubbles",
"topdown-fetch-bubbles",
"topdown-slots-issued",
NULL,
};
static const char *topdown_metric_attrs[] = {
"slots",
"topdown-retiring",
"topdown-bad-spec",
"topdown-fe-bound",
"topdown-be-bound",
NULL,
};
static const char *topdown_metric_L2_attrs[] = {
"slots",
"topdown-retiring",
"topdown-bad-spec",
"topdown-fe-bound",
"topdown-be-bound",
"topdown-heavy-ops",
"topdown-br-mispredict",
"topdown-fetch-lat",
"topdown-mem-bound",
NULL,
};
#define TOPDOWN_MAX_LEVEL 2
static const char *smi_cost_attrs = {
"{"
"msr/aperf/,"
"msr/smi/,"
"cycles"
"}"
};
static struct evlist *evsel_list;
static bool all_counters_use_bpf = true;
@ -246,14 +181,13 @@ static bool cpus_map_matched(struct evsel *a, struct evsel *b)
static void evlist__check_cpu_maps(struct evlist *evlist)
{
struct evsel *evsel, *pos, *leader;
char buf[1024];
struct evsel *evsel, *warned_leader = NULL;
if (evlist__has_hybrid(evlist))
evlist__warn_hybrid_group(evlist);
evlist__for_each_entry(evlist, evsel) {
leader = evsel__leader(evsel);
struct evsel *leader = evsel__leader(evsel);
/* Check that leader matches cpus with each member. */
if (leader == evsel)
@ -262,19 +196,26 @@ static void evlist__check_cpu_maps(struct evlist *evlist)
continue;
/* If there's mismatch disable the group and warn user. */
WARN_ONCE(1, "WARNING: grouped events cpus do not match, disabling group:\n");
evsel__group_desc(leader, buf, sizeof(buf));
pr_warning(" %s\n", buf);
if (warned_leader != leader) {
char buf[200];
pr_warning("WARNING: grouped events cpus do not match.\n"
"Events with CPUs not matching the leader will "
"be removed from the group.\n");
evsel__group_desc(leader, buf, sizeof(buf));
pr_warning(" %s\n", buf);
warned_leader = leader;
}
if (verbose > 0) {
char buf[200];
cpu_map__snprint(leader->core.cpus, buf, sizeof(buf));
pr_warning(" %s: %s\n", leader->name, buf);
cpu_map__snprint(evsel->core.cpus, buf, sizeof(buf));
pr_warning(" %s: %s\n", evsel->name, buf);
}
for_each_group_evsel(pos, leader)
evsel__remove_from_group(pos, leader);
evsel__remove_from_group(evsel, leader);
}
}
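
The rework warns once per offending leader instead of once per run, and evicts only the members whose CPU map disagrees with the leader instead of dissolving the whole group. On a mismatch the warning now reads along these lines (group description illustrative):

    WARNING: grouped events cpus do not match.
    Events with CPUs not matching the leader will be removed from the group.
      anon group { cpu_core/cycles/, cpu_atom/instructions/ }
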
@ -489,7 +430,6 @@ static void process_counters(void)
perf_stat_merge_counters(&stat_config, evsel_list);
perf_stat_process_percore(&stat_config, evsel_list);
perf_stat_process_shadow_stats(&stat_config, evsel_list);
}
static void process_interval(void)
@ -499,7 +439,6 @@ static void process_interval(void)
clock_gettime(CLOCK_MONOTONIC, &ts);
diff_timespec(&rs, &ts, &ref_time);
perf_stat__reset_shadow_per_stat(&rt_stat);
evlist__reset_aggr_stats(evsel_list);
if (read_counters(&rs) == 0)
@ -610,7 +549,7 @@ static void process_evlist(struct evlist *evlist, unsigned int interval)
if (evlist__ctlfd_process(evlist, &cmd) > 0) {
switch (cmd) {
case EVLIST_CTL_CMD_ENABLE:
__fallthrough;
fallthrough;
case EVLIST_CTL_CMD_DISABLE:
if (interval)
process_interval();
@ -773,7 +712,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
counter->reset_group = false;
if (bpf_counter__load(counter, &target))
return -1;
if (!evsel__is_bpf(counter))
if (!(evsel__is_bperf(counter)))
all_counters_use_bpf = false;
}
@ -789,7 +728,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
if (counter->reset_group || counter->errored)
continue;
if (evsel__is_bpf(counter))
if (evsel__is_bperf(counter))
continue;
try_again:
if (create_perf_stat_counter(counter, &stat_config, &target,
@ -970,7 +909,6 @@ try_again_reset:
evlist__copy_prev_raw_counts(evsel_list);
evlist__reset_prev_raw_counts(evsel_list);
evlist__reset_aggr_stats(evsel_list);
perf_stat__reset_shadow_per_stat(&rt_stat);
} else {
update_stats(&walltime_nsecs_stats, t1 - t0);
update_rusage_stats(&ru_stats, &stat_config.ru_data);
@ -1251,6 +1189,8 @@ static struct option stat_options[] = {
"don't group metric events, impacts multiplexing"),
OPT_BOOLEAN(0, "metric-no-merge", &stat_config.metric_no_merge,
"don't try to share events between metrics in a group"),
OPT_BOOLEAN(0, "metric-no-threshold", &stat_config.metric_no_threshold,
"don't try to share events between metrics in a group "),
OPT_BOOLEAN(0, "topdown", &topdown_run,
"measure top-down statistics"),
OPT_UINTEGER(0, "td-level", &stat_config.topdown_level,
@ -1716,7 +1656,6 @@ static int perf_stat_init_aggr_mode_file(struct perf_stat *st)
*/
static int add_default_attributes(void)
{
int err;
struct perf_event_attr default_attrs0[] = {
{ .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK },
@ -1837,44 +1776,29 @@ static int add_default_attributes(void)
return 0;
if (transaction_run) {
struct parse_events_error errinfo;
/* Handle -T as -M transaction. Once platform specific metrics
* support has been added to the json files, all architectures
* will use this approach. To determine transaction support
* on an architecture test for such a metric name.
*/
if (metricgroup__has_metric("transaction")) {
return metricgroup__parse_groups(evsel_list, "transaction",
stat_config.metric_no_group,
stat_config.metric_no_merge,
stat_config.user_requested_cpu_list,
stat_config.system_wide,
&stat_config.metric_events);
if (!metricgroup__has_metric("transaction")) {
pr_err("Missing transaction metrics");
return -1;
}
parse_events_error__init(&errinfo);
if (pmu_have_event("cpu", "cycles-ct") &&
pmu_have_event("cpu", "el-start"))
err = parse_events(evsel_list, transaction_attrs,
&errinfo);
else
err = parse_events(evsel_list,
transaction_limited_attrs,
&errinfo);
if (err) {
fprintf(stderr, "Cannot set up transaction events\n");
parse_events_error__print(&errinfo, transaction_attrs);
}
parse_events_error__exit(&errinfo);
return err ? -1 : 0;
return metricgroup__parse_groups(evsel_list, "transaction",
stat_config.metric_no_group,
stat_config.metric_no_merge,
stat_config.metric_no_threshold,
stat_config.user_requested_cpu_list,
stat_config.system_wide,
&stat_config.metric_events);
}
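
With the hard-coded transaction event strings deleted, -T is now purely an alias for the 'transaction' metric group from the json metrics, and fails up front on platforms whose metrics lack it. So these two invocations should behave identically:

    $ perf stat -T -a sleep 1
    $ perf stat -M transaction -a sleep 1
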
if (smi_cost) {
struct parse_events_error errinfo;
int smi;
if (sysfs__read_int(FREEZE_ON_SMI_PATH, &smi) < 0) {
fprintf(stderr, "freeze_on_smi is not supported.\n");
pr_err("freeze_on_smi is not supported.");
return -1;
}
@ -1886,108 +1810,62 @@ static int add_default_attributes(void)
smi_reset = true;
}
if (!pmu_have_event("msr", "aperf") ||
!pmu_have_event("msr", "smi")) {
fprintf(stderr, "To measure SMI cost, it needs "
"msr/aperf/, msr/smi/ and cpu/cycles/ support\n");
if (!metricgroup__has_metric("smi")) {
pr_err("Missing smi metrics");
return -1;
}
if (!force_metric_only)
stat_config.metric_only = true;
parse_events_error__init(&errinfo);
err = parse_events(evsel_list, smi_cost_attrs, &errinfo);
if (err) {
parse_events_error__print(&errinfo, smi_cost_attrs);
fprintf(stderr, "Cannot set up SMI cost events\n");
}
parse_events_error__exit(&errinfo);
return err ? -1 : 0;
return metricgroup__parse_groups(evsel_list, "smi",
stat_config.metric_no_group,
stat_config.metric_no_merge,
stat_config.metric_no_threshold,
stat_config.user_requested_cpu_list,
stat_config.system_wide,
&stat_config.metric_events);
}
if (topdown_run) {
const char **metric_attrs = topdown_metric_attrs;
unsigned int max_level = 1;
char *str = NULL;
bool warn = false;
const char *pmu_name = arch_get_topdown_pmu_name(evsel_list, true);
unsigned int max_level = metricgroups__topdown_max_level();
char str[] = "TopdownL1";
if (!force_metric_only)
stat_config.metric_only = true;
if (pmu_have_event(pmu_name, topdown_metric_L2_attrs[5])) {
metric_attrs = topdown_metric_L2_attrs;
max_level = 2;
if (!max_level) {
pr_err("Topdown requested but the topdown metric groups aren't present.\n"
"(See perf list the metric groups have names like TopdownL1)");
return -1;
}
if (stat_config.topdown_level > max_level) {
pr_err("Invalid top-down metrics level. The max level is %u.\n", max_level);
return -1;
} else if (!stat_config.topdown_level)
stat_config.topdown_level = max_level;
stat_config.topdown_level = 1;
if (topdown_filter_events(metric_attrs, &str, 1, pmu_name) < 0) {
pr_err("Out of memory\n");
if (!stat_config.interval && !stat_config.metric_only) {
fprintf(stat_config.output,
"Topdown accuracy may decrease when measuring long periods.\n"
"Please print the result regularly, e.g. -I1000\n");
}
str[8] = stat_config.topdown_level + '0';
if (metricgroup__parse_groups(evsel_list, str,
/*metric_no_group=*/false,
/*metric_no_merge=*/false,
/*metric_no_threshold=*/true,
stat_config.user_requested_cpu_list,
stat_config.system_wide,
&stat_config.metric_events) < 0)
return -1;
}
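
The fixed "TopdownL1" buffer is patched in place instead of being built with an allocation: index 8 holds the level digit, and the max_level check above keeps the level a single digit in practice. In isolation:

    #include <assert.h>
    #include <stdio.h>

    int main(void)
    {
            unsigned int topdown_level = 2;    /* already validated <= max_level */
            char str[] = "TopdownL1";          /* str[8] is the level digit */

            assert(topdown_level > 0 && topdown_level < 10);
            str[8] = topdown_level + '0';
            printf("%s\n", str);               /* -> "TopdownL2" */
            return 0;
    }
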
if (metric_attrs[0] && str) {
if (!stat_config.interval && !stat_config.metric_only) {
fprintf(stat_config.output,
"Topdown accuracy may decrease when measuring long periods.\n"
"Please print the result regularly, e.g. -I1000\n");
}
goto setup_metrics;
}
zfree(&str);
if (stat_config.aggr_mode != AGGR_GLOBAL &&
stat_config.aggr_mode != AGGR_CORE) {
pr_err("top down event configuration requires --per-core mode\n");
return -1;
}
stat_config.aggr_mode = AGGR_CORE;
if (nr_cgroups || !target__has_cpu(&target)) {
pr_err("top down event configuration requires system-wide mode (-a)\n");
return -1;
}
if (topdown_filter_events(topdown_attrs, &str,
arch_topdown_check_group(&warn),
pmu_name) < 0) {
pr_err("Out of memory\n");
return -1;
}
if (topdown_attrs[0] && str) {
struct parse_events_error errinfo;
if (warn)
arch_topdown_group_warn();
setup_metrics:
parse_events_error__init(&errinfo);
err = parse_events(evsel_list, str, &errinfo);
if (err) {
fprintf(stderr,
"Cannot set up top down events %s: %d\n",
str, err);
parse_events_error__print(&errinfo, str);
parse_events_error__exit(&errinfo);
free(str);
return -1;
}
parse_events_error__exit(&errinfo);
} else {
fprintf(stderr, "System does not support topdown\n");
return -1;
}
free(str);
}
if (!stat_config.topdown_level)
stat_config.topdown_level = TOPDOWN_MAX_LEVEL;
stat_config.topdown_level = 1;
if (!evsel_list->core.nr_entries) {
/* No events so add defaults. */
if (target__has_cpu(&target))
default_attrs0[0].config = PERF_COUNT_SW_CPU_CLOCK;
@ -2003,6 +1881,25 @@ setup_metrics:
}
if (evlist__add_default_attrs(evsel_list, default_attrs1) < 0)
return -1;
/*
* Add TopdownL1 metrics if they exist. To minimize
* multiplexing, don't request threshold computation.
*/
/*
* TODO: TopdownL1 is disabled on hybrid CPUs to avoid crashes
* caused by exposing latent bugs. This is fixed properly in:
* https://lore.kernel.org/lkml/bff481ba-e60a-763f-0aa0-3ee53302c480@linux.intel.com/
*/
if (metricgroup__has_metric("TopdownL1") && !perf_pmu__has_hybrid() &&
metricgroup__parse_groups(evsel_list, "TopdownL1",
/*metric_no_group=*/false,
/*metric_no_merge=*/false,
/*metric_no_threshold=*/true,
stat_config.user_requested_cpu_list,
stat_config.system_wide,
&stat_config.metric_events) < 0)
return -1;
/* Platform specific attrs */
if (evlist__add_default_attrs(evsel_list, default_null_attrs) < 0)
return -1;
@ -2239,8 +2136,6 @@ static int __cmd_report(int argc, const char **argv)
input_name = "perf.data";
}
perf_stat__init_shadow_stats();
perf_stat.data.path = input_name;
perf_stat.data.mode = PERF_DATA_MODE_READ;
@ -2281,7 +2176,7 @@ static void setup_system_wide(int forks)
evlist__for_each_entry(evsel_list, counter) {
if (!counter->core.requires_cpu &&
strcmp(counter->name, "duration_time")) {
!evsel__name_is(counter, "duration_time")) {
return;
}
}
@ -2383,8 +2278,10 @@ int cmd_stat(int argc, const char **argv)
perror("failed to create output file");
return -1;
}
clock_gettime(CLOCK_REALTIME, &tm);
fprintf(output, "# started on %s\n", ctime(&tm.tv_sec));
if (!stat_config.json_output) {
clock_gettime(CLOCK_REALTIME, &tm);
fprintf(output, "# started on %s\n", ctime(&tm.tv_sec));
}
} else if (output_fd > 0) {
mode = append_file ? "a" : "w";
output = fdopen(output_fd, mode);
@ -2514,12 +2411,12 @@ int cmd_stat(int argc, const char **argv)
metricgroup__parse_groups(evsel_list, metrics,
stat_config.metric_no_group,
stat_config.metric_no_merge,
stat_config.metric_no_threshold,
stat_config.user_requested_cpu_list,
stat_config.system_wide,
&stat_config.metric_events);
zfree(&metrics);
}
perf_stat__init_shadow_stats();
if (add_default_attributes())
goto out;

View File

@ -24,7 +24,6 @@
#include "util/thread.h"
#include "util/callchain.h"
#include "perf.h"
#include "util/header.h"
#include <subcmd/pager.h>
#include <subcmd/parse-options.h>
@ -37,6 +36,7 @@
#include "util/debug.h"
#include "util/string2.h"
#include "util/tracepoint.h"
#include "util/util.h"
#include <linux/err.h>
#include <traceevent/event-parse.h>

Some files were not shown because too many files have changed in this diff