linux-stable

Commit Graph

Author	SHA1	Message	Date
Linus Torvalds	7ab89417ed	perf tools changes for v6.7 Build ----- * Compile BPF programs by default if clang (>= 12.0.1) is available to enable more features like kernel lock contention, off-cpu profiling, kwork, sample filtering and so on. It can be disabled by passing BUILD_BPF_SKEL=0 to make. * Produce better error messages for bison on debug build (make DEBUG=1) by defining YYDEBUG symbol internally. perf record ----------- * Track sideband events (like FORK/MMAP) from all CPUs even if perf record targets a subset of CPUs only (using -C option). Otherwise it may lose some information happened on a CPU out of the target list. * Fix checking raw sched_switch tracepoint argument using system BTF. This affects off-cpu profiling which attaches a BPF program to the raw tracepoint. perf lock contention -------------------- * Add --lock-cgroup option to see contention by cgroups. This should be used with BPF only (using -b option). $ sudo perf lock con -ab --lock-cgroup -- sleep 1 contended total wait max wait avg wait cgroup 835 14.06 ms 41.19 us 16.83 us /system.slice/led.service 25 122.38 us 13.77 us 4.89 us / 44 23.73 us 3.87 us 539 ns /user.slice/user-657345.slice/session-c4.scope 1 491 ns 491 ns 491 ns /system.slice/connectd.service * Add -G/--cgroup-filter option to see contention only for given cgroups. This can be useful when you identified a cgroup in the above command and want to investigate more on it. It also works with other output options like -t/--threads and -l/--lock-addr. $ sudo perf lock con -ab -G /user.slice/user-657345.slice/session-c4.scope -- sleep 1 contended total wait max wait avg wait type caller 8 77.11 us 17.98 us 9.64 us spinlock futex_wake+0xc8 2 24.56 us 14.66 us 12.28 us spinlock tick_do_update_jiffies64+0x25 1 4.97 us 4.97 us 4.97 us spinlock futex_q_lock+0x2a * Use per-cpu array for better spinlock tracking. This is to improve performance of the BPF program and to avoid nested contention on a lock in the BPF hash map. * Update callstack check for PowerPC. To find a representative caller of a lock, it needs to look up the call stacks. It ends the lookup when it sees 0 in the call stack buffer. However, PowerPC call stacks can have 0 values in the beginning so skip them when it expects valid call stacks after. perf kwork ---------- * Support 'sched' class (for -k option) so that it can see task scheduling event (using sched_switch tracepoint) as well as irq and workqueue items. * Add perf kwork top subcommand to show more accurate cpu utilization with sched class above. It works both with a recorded data (using perf kwork record command) and BPF (using -b option). Unlike perf top command, it does not support interactive mode (yet). $ sudo perf kwork top -b -k sched Starting trace, Hit <Ctrl+C> to stop and report ^C Total : 160702.425 ms, 8 cpus %Cpu(s): 36.00% id, 0.00% hi, 0.00% si %Cpu0 [\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\| 61.66%] %Cpu1 [\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\| 61.27%] %Cpu2 [\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\| 66.40%] %Cpu3 [\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\| 61.28%] %Cpu4 [\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\| 61.82%] %Cpu5 [\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\| 77.41%] %Cpu6 [\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\| 61.73%] %Cpu7 [\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\| 63.25%] PID SPID %CPU RUNTIME COMMMAND ------------------------------------------------------------- 0 0 38.72 8089.463 ms [swapper/1] 0 0 38.71 8084.547 ms [swapper/3] 0 0 38.33 8007.532 ms [swapper/0] 0 0 38.26 7992.985 ms [swapper/6] 0 0 38.17 7971.865 ms [swapper/4] 0 0 36.74 7447.765 ms [swapper/7] 0 0 33.59 6486.942 ms [swapper/2] 0 0 22.58 3771.268 ms [swapper/5] 9545 9351 2.48 447.136 ms sched-messaging 9574 9351 2.09 418.583 ms sched-messaging 9724 9351 2.05 372.407 ms sched-messaging 9531 9351 2.01 368.804 ms sched-messaging 9512 9351 2.00 362.250 ms sched-messaging 9514 9351 1.95 357.767 ms sched-messaging 9538 9351 1.86 384.476 ms sched-messaging 9712 9351 1.84 386.490 ms sched-messaging 9723 9351 1.83 380.021 ms sched-messaging 9722 9351 1.82 382.738 ms sched-messaging 9517 9351 1.81 354.794 ms sched-messaging 9559 9351 1.79 344.305 ms sched-messaging 9725 9351 1.77 365.315 ms sched-messaging <SNIP> * Add hard/soft-irq statistics to perf kwork top. This will show the total CPU utilization with IRQ stats like below: $ sudo perf kwork top -b -k sched,irq,softirq Starting trace, Hit <Ctrl+C> to stop and report ^C Total : 12554.889 ms, 8 cpus %Cpu(s): 96.23% id, 0.10% hi, 0.19% si <---- here %Cpu0 [\| 4.60%] %Cpu1 [\| 4.59%] %Cpu2 [ 2.73%] %Cpu3 [\| 3.81%] <SNIP> perf bench ---------- * Add -G/--cgroups option to perf bench sched pipe. The pipe bench is good to measure context switch overhead. With this option, it puts the reader and writer tasks in separate cgroups to enforce context switch between two different cgroups. Also it needs to set CPU affinity of the tasks in a CPU to accurately measure the impact of cgroup context switches. $ sudo perf stat -e context-switches,cgroup-switches -- \ > taskset -c 0 perf bench sched pipe -l 100000 # Running 'sched/pipe' benchmark: # Executed 100000 pipe operations between two processes Total time: 0.307 [sec] 3.078180 usecs/op 324867 ops/sec Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 100000': 200,026 context-switches 63 cgroup-switches 0.321637922 seconds time elapsed You can see small number of cgroup-switches because both write and read tasks are in the same cgroup. $ sudo mkdir /sys/fs/cgroup/{AAA,BBB} $ sudo perf stat -e context-switches,cgroup-switches -- \ > taskset -c 0 perf bench sched pipe -l 100000 -G AAA,BBB # Running 'sched/pipe' benchmark: # Executed 100000 pipe operations between two processes Total time: 0.351 [sec] 3.512990 usecs/op 284657 ops/sec Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 100000 -G AAA,BBB': 200,020 context-switches 200,019 cgroup-switches 0.365034567 seconds time elapsed Now context-switches and cgroup-switches are almost same. And you can see the pipe operation took little more. * Kill child processes when perf bench sched messaging exited abnormally. Otherwise it'd leave the child doing unnecessary work. perf test --------- * Fix various shellcheck issues on the tests written in shell script. * Skip tests when condition is not satisfied: - object code reading test for non-text section addresses. - CoreSight test if cs_etm// event is not available. - lock contention test if not enough CPUs. Event parsing ------------- * Make PMU alias name loading lazy to reduce the startup time in the event parsing code for perf record, stat and others in the general case. * Lazily compute PMU default config. In the same sense, delay PMU initialization until it's really needed to reduce the startup cost. * Fix event term values that are raw events. The event specification can have several terms including event name. But sometimes it clashes with raw event encoding which starts with 'r' and has hex-digits. For example, an event named 'read' should be processed as a normal event but it was mis-treated as a raw encoding and caused a failure. $ perf stat -e 'uncore_imc_free_running/event=read/' -a sleep 1 event syntax error: '..nning/event=read/' \___ parser error Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events Event metrics ------------- * Add "Compat" regex to match event with multiple identifiers. * Usual updates for Intel, Power10, Arm telemetry/CMN and AmpereOne. Misc ---- * Assorted memory leak fixes and footprint reduction. * Add "bpf_skeletons" to perf version --build-options so that users can check whether their perf tools have BPF support easily. * Fix unaligned access in Intel-PT packet decoder found by undefined-behavior sanitizer. * Avoid frequency mode for the dummy event. Surprisingly it'd impact kernel timer tick handler performance by force iterating all PMU events. * Update bash shell completion for events and metrics. Signed-off-by: Namhyung Kim <namhyung@kernel.org> -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQSo2x5BnqMqsoHtzsmMstVUGiXMgwUCZUMg7wAKCRCMstVUGiXM g8FvAQC9KED6H8rlH7UTvxE6fM947EJbldwGrNA1zGx++Ucd3gD/ewA2A6SUcIh6 Tua/XovmYOQbuDYOwlRHe+sdDag0sgg= =GrCE -----END PGP SIGNATURE----- Merge tag 'perf-tools-for-v6.7-1-2023-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools updates from Namhyung Kim: "Build: - Compile BPF programs by default if clang (>= 12.0.1) is available to enable more features like kernel lock contention, off-cpu profiling, kwork, sample filtering and so on. This can be disabled by passing BUILD_BPF_SKEL=0 to make. - Produce better error messages for bison on debug build (make DEBUG=1) by defining YYDEBUG symbol internally. perf record: - Track sideband events (like FORK/MMAP) from all CPUs even if perf record targets a subset of CPUs only (using -C option). Otherwise it may lose some information happened on a CPU out of the target list. - Fix checking raw sched_switch tracepoint argument using system BTF. This affects off-cpu profiling which attaches a BPF program to the raw tracepoint. perf lock contention: - Add --lock-cgroup option to see contention by cgroups. This should be used with BPF only (using -b option). $ sudo perf lock con -ab --lock-cgroup -- sleep 1 contended total wait max wait avg wait cgroup 835 14.06 ms 41.19 us 16.83 us /system.slice/led.service 25 122.38 us 13.77 us 4.89 us / 44 23.73 us 3.87 us 539 ns /user.slice/user-657345.slice/session-c4.scope 1 491 ns 491 ns 491 ns /system.slice/connectd.service - Add -G/--cgroup-filter option to see contention only for given cgroups. This can be useful when you identified a cgroup in the above command and want to investigate more on it. It also works with other output options like -t/--threads and -l/--lock-addr. $ sudo perf lock con -ab -G /user.slice/user-657345.slice/session-c4.scope -- sleep 1 contended total wait max wait avg wait type caller 8 77.11 us 17.98 us 9.64 us spinlock futex_wake+0xc8 2 24.56 us 14.66 us 12.28 us spinlock tick_do_update_jiffies64+0x25 1 4.97 us 4.97 us 4.97 us spinlock futex_q_lock+0x2a - Use per-cpu array for better spinlock tracking. This is to improve performance of the BPF program and to avoid nested contention on a lock in the BPF hash map. - Update callstack check for PowerPC. To find a representative caller of a lock, it needs to look up the call stacks. It ends the lookup when it sees 0 in the call stack buffer. However, PowerPC call stacks can have 0 values in the beginning so skip them when it expects valid call stacks after. perf kwork: - Support 'sched' class (for -k option) so that it can see task scheduling event (using sched_switch tracepoint) as well as irq and workqueue items. - Add perf kwork top subcommand to show more accurate cpu utilization with sched class above. It works both with a recorded data (using perf kwork record command) and BPF (using -b option). Unlike perf top command, it does not support interactive mode (yet). $ sudo perf kwork top -b -k sched Starting trace, Hit <Ctrl+C> to stop and report ^C Total : 160702.425 ms, 8 cpus %Cpu(s): 36.00% id, 0.00% hi, 0.00% si %Cpu0 [\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\| 61.66%] %Cpu1 [\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\| 61.27%] %Cpu2 [\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\| 66.40%] %Cpu3 [\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\| 61.28%] %Cpu4 [\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\| 61.82%] %Cpu5 [\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\| 77.41%] %Cpu6 [\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\| 61.73%] %Cpu7 [\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\|\| 63.25%] PID SPID %CPU RUNTIME COMMMAND ------------------------------------------------------------- 0 0 38.72 8089.463 ms [swapper/1] 0 0 38.71 8084.547 ms [swapper/3] 0 0 38.33 8007.532 ms [swapper/0] 0 0 38.26 7992.985 ms [swapper/6] 0 0 38.17 7971.865 ms [swapper/4] 0 0 36.74 7447.765 ms [swapper/7] 0 0 33.59 6486.942 ms [swapper/2] 0 0 22.58 3771.268 ms [swapper/5] 9545 9351 2.48 447.136 ms sched-messaging 9574 9351 2.09 418.583 ms sched-messaging 9724 9351 2.05 372.407 ms sched-messaging 9531 9351 2.01 368.804 ms sched-messaging 9512 9351 2.00 362.250 ms sched-messaging 9514 9351 1.95 357.767 ms sched-messaging 9538 9351 1.86 384.476 ms sched-messaging 9712 9351 1.84 386.490 ms sched-messaging 9723 9351 1.83 380.021 ms sched-messaging 9722 9351 1.82 382.738 ms sched-messaging 9517 9351 1.81 354.794 ms sched-messaging 9559 9351 1.79 344.305 ms sched-messaging 9725 9351 1.77 365.315 ms sched-messaging <SNIP> - Add hard/soft-irq statistics to perf kwork top. This will show the total CPU utilization with IRQ stats like below: $ sudo perf kwork top -b -k sched,irq,softirq Starting trace, Hit <Ctrl+C> to stop and report ^C Total : 12554.889 ms, 8 cpus %Cpu(s): 96.23% id, 0.10% hi, 0.19% si <---- here %Cpu0 [\| 4.60%] %Cpu1 [\| 4.59%] %Cpu2 [ 2.73%] %Cpu3 [\| 3.81%] <SNIP> perf bench: - Add -G/--cgroups option to perf bench sched pipe. The pipe bench is good to measure context switch overhead. With this option, it puts the reader and writer tasks in separate cgroups to enforce context switch between two different cgroups. Also it needs to set CPU affinity of the tasks in a CPU to accurately measure the impact of cgroup context switches. $ sudo perf stat -e context-switches,cgroup-switches -- \ > taskset -c 0 perf bench sched pipe -l 100000 # Running 'sched/pipe' benchmark: # Executed 100000 pipe operations between two processes Total time: 0.307 [sec] 3.078180 usecs/op 324867 ops/sec Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 100000': 200,026 context-switches 63 cgroup-switches 0.321637922 seconds time elapsed You can see small number of cgroup-switches because both write and read tasks are in the same cgroup. $ sudo mkdir /sys/fs/cgroup/{AAA,BBB} $ sudo perf stat -e context-switches,cgroup-switches -- \ > taskset -c 0 perf bench sched pipe -l 100000 -G AAA,BBB # Running 'sched/pipe' benchmark: # Executed 100000 pipe operations between two processes Total time: 0.351 [sec] 3.512990 usecs/op 284657 ops/sec Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 100000 -G AAA,BBB': 200,020 context-switches 200,019 cgroup-switches 0.365034567 seconds time elapsed Now context-switches and cgroup-switches are almost same. And you can see the pipe operation took little more. - Kill child processes when perf bench sched messaging exited abnormally. Otherwise it'd leave the child doing unnecessary work. perf test: - Fix various shellcheck issues on the tests written in shell script. - Skip tests when condition is not satisfied: - object code reading test for non-text section addresses. - CoreSight test if cs_etm// event is not available. - lock contention test if not enough CPUs. Event parsing: - Make PMU alias name loading lazy to reduce the startup time in the event parsing code for perf record, stat and others in the general case. - Lazily compute PMU default config. In the same sense, delay PMU initialization until it's really needed to reduce the startup cost. - Fix event term values that are raw events. The event specification can have several terms including event name. But sometimes it clashes with raw event encoding which starts with 'r' and has hex-digits. For example, an event named 'read' should be processed as a normal event but it was mis-treated as a raw encoding and caused a failure. $ perf stat -e 'uncore_imc_free_running/event=read/' -a sleep 1 event syntax error: '..nning/event=read/' \___ parser error Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events Event metrics: - Add "Compat" regex to match event with multiple identifiers. - Usual updates for Intel, Power10, Arm telemetry/CMN and AmpereOne. Misc: - Assorted memory leak fixes and footprint reduction. - Add "bpf_skeletons" to perf version --build-options so that users can check whether their perf tools have BPF support easily. - Fix unaligned access in Intel-PT packet decoder found by undefined-behavior sanitizer. - Avoid frequency mode for the dummy event. Surprisingly it'd impact kernel timer tick handler performance by force iterating all PMU events. - Update bash shell completion for events and metrics" * tag 'perf-tools-for-v6.7-1-2023-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (187 commits) perf vendor events intel: Update tsx_cycles_per_elision metrics perf vendor events intel: Update bonnell version number to v5 perf vendor events intel: Update westmereex events to v4 perf vendor events intel: Update meteorlake events to v1.06 perf vendor events intel: Update knightslanding events to v16 perf vendor events intel: Add typo fix for ivybridge FP perf vendor events intel: Update a spelling in haswell/haswellx perf vendor events intel: Update emeraldrapids to v1.01 perf vendor events intel: Update alderlake/alderlake events to v1.23 perf build: Disable BPF skeletons if clang version is < 12.0.1 perf callchain: Fix spelling mistake "statisitcs" -> "statistics" perf report: Fix spelling mistake "heirachy" -> "hierarchy" perf python: Fix binding linkage due to rename and move of evsel__increase_rlimit() perf tests: test_arm_coresight: Simplify source iteration perf vendor events intel: Add tigerlake two metrics perf vendor events intel: Add broadwellde two metrics perf vendor events intel: Fix broadwellde tma_info_system_dram_bw_use metric perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit perf callchain: Minor layout changes to callchain_list perf callchain: Make brtype_stat in callchain_list optional ...	2023-11-03 08:17:38 -10:00
Paolo Bonzini	45b890f768	KVM/arm64 updates for 6.7 - Generalized infrastructure for 'writable' ID registers, effectively allowing userspace to opt-out of certain vCPU features for its guest - Optimization for vSGI injection, opportunistically compressing MPIDR to vCPU mapping into a table - Improvements to KVM's PMU emulation, allowing userspace to select the number of PMCs available to a VM - Guest support for memory operation instructions (FEAT_MOPS) - Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing bugs and getting rid of useless code - Changes to the way the SMCCC filter is constructed, avoiding wasted memory allocations when not in use - Load the stage-2 MMU context at vcpu_load() for VHE systems, reducing the overhead of errata mitigations - Miscellaneous kernel and selftest fixes -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQSNXHjWXuzMZutrKNKivnWIJHzdFgUCZUFJRgAKCRCivnWIJHzd FtgYAP9cMsc1Mhlw3jNQnTc6q0cbTulD/SoEDPUat1dXMqjs+gEAnskwQTrTX834 fgGQeCAyp7Gmar+KeP64H0xm8kPSpAw= =R4M7 -----END PGP SIGNATURE----- Merge tag 'kvmarm-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 6.7 - Generalized infrastructure for 'writable' ID registers, effectively allowing userspace to opt-out of certain vCPU features for its guest - Optimization for vSGI injection, opportunistically compressing MPIDR to vCPU mapping into a table - Improvements to KVM's PMU emulation, allowing userspace to select the number of PMCs available to a VM - Guest support for memory operation instructions (FEAT_MOPS) - Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing bugs and getting rid of useless code - Changes to the way the SMCCC filter is constructed, avoiding wasted memory allocations when not in use - Load the stage-2 MMU context at vcpu_load() for VHE systems, reducing the overhead of errata mitigations - Miscellaneous kernel and selftest fixes	2023-10-31 16:37:07 -04:00
Namhyung Kim	fed3a1be64	perf tools fixes for v6.6: 2nd batch - Fix regression in reading scale and unit files from sysfs for PMU events, so that we can use that info to pretty print instead of printing raw numbers: # perf stat -e power/energy-ram/,power/energy-gpu/ sleep 2 Performance counter stats for 'system wide': 1.64 Joules power/energy-ram/ 0.20 Joules power/energy-gpu/ 2.001228914 seconds time elapsed # # grep -m1 "model name" /proc/cpuinfo model name : Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz # - The small llvm.cpp file used to check if the llvm devel files are present was incorrectly deleted when removing the BPF event in 'perf trace', put it back as it is also used by tools/bpf/bpftool, that uses llvm routines to do disassembly of BPF object files. - Fix use of addr_location__exit() in dlfilter__object_code(), making sure that it is only used to pair a previous addr_location__init() call. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZTKh5AAKCRCyPKLppCJ+ J/g/AP0f6SNyHJz21JzDTzyjXAeSdMzKwic0LXv+kATQy31HJAD+Kf7UKQieUeZB fxvp60aKyFN8IVIgpYiAjZMS3k9XPAY= =N7Gv -----END PGP SIGNATURE----- Merge tag 'perf-tools-fixes-for-v6.6-2-2023-10-20' into perf-tools-next To get the latest fixes in the perf tools including perf stat output, dlfilter and LLVM feature detection. Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-30 13:46:27 -07:00
Colin Ian King	ee40490dd7	perf callchain: Fix spelling mistake "statisitcs" -> "statistics" There are a couple of spelling mistakes in perror messages. Fix them. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Cc: kernel-janitors@vger.kernel.org Link: https://lore.kernel.org/r/20231027084633.1167530-1-colin.i.king@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-27 19:26:13 -07:00
Arnaldo Carvalho de Melo	93c65d6143	perf python: Fix binding linkage due to rename and move of evsel__increase_rlimit() The changes in ("perf evsel: Rename evsel__increase_rlimit to rlimit__increase_nofile") ended up breaking the python binding that now references the rlimit__increase_nofile function, add the util/rlimit.o to the tools/perf/util/python-ext-sources to cure that. This was detected by the 'perf test python' regression test: $ perf test python 14: 'import perf' in python : FAILED! $ perf test -v python Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc 14: 'import perf' in python : --- start --- test child forked, pid 2912462 python usage test: "echo "import sys ; sys.path.insert(0, '/tmp/build/perf-tools-next/python'); import perf" \| '/usr/bin/python3' " Traceback (most recent call last): File "<stdin>", line 1, in <module> ImportError: /tmp/build/perf-tools-next/python/perf.cpython-311-x86_64-linux-gnu.so: undefined symbol: rlimit__increase_nofile test child finished with -1 ---- end ---- 'import perf' in python: FAILED! $ Fixes: `e093a222d7` ("perf evsel: Rename evsel__increase_rlimit to rlimit__increase_nofile") Acked-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/lkml/ZTrCS5Z3PZAmfPdV@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-27 19:23:52 -07:00
Ian Rogers	56e144fe98	perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit Fix leak where mem_info__put wouldn't release the maps/map as used by perf mem. Add exit functions and use elsewhere that the maps and map are released. Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-12-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-25 13:39:58 -07:00
Ian Rogers	dec07fe5d4	perf callchain: Minor layout changes to callchain_list Avoid 6 byte hole for padding. Place more frequently used fields first in an attempt to use just 1 cacheline in the common case. Before: ``` struct callchain_list { u64 ip; /* 0 8 / struct map_symbol ms; / 8 24 / struct { _Bool unfolded; / 32 1 / _Bool has_children; / 33 1 / }; / 32 2 / / XXX 6 bytes hole, try to pack / u64 branch_count; / 40 8 / u64 from_count; / 48 8 / u64 predicted_count; / 56 8 / / --- cacheline 1 boundary (64 bytes) --- / u64 abort_count; / 64 8 / u64 cycles_count; / 72 8 / u64 iter_count; / 80 8 / u64 iter_cycles; / 88 8 / struct branch_type_stat brtype_stat; /* 96 8 / const char srcline; /* 104 8 / struct list_head list; / 112 16 / / size: 128, cachelines: 2, members: 13 / / sum members: 122, holes: 1, sum holes: 6 / }; ``` After: ``` struct callchain_list { struct list_head list; / 0 16 / u64 ip; / 16 8 / struct map_symbol ms; / 24 24 / const char srcline; /* 48 8 / u64 branch_count; / 56 8 / / --- cacheline 1 boundary (64 bytes) --- / u64 from_count; / 64 8 / u64 cycles_count; / 72 8 / u64 iter_count; / 80 8 / u64 iter_cycles; / 88 8 / struct branch_type_stat brtype_stat; /* 96 8 / u64 predicted_count; / 104 8 / u64 abort_count; / 112 8 / struct { _Bool unfolded; / 120 1 / _Bool has_children; / 121 1 / }; / 120 2 / / size: 128, cachelines: 2, members: 13 / / padding: 6 */ }; ``` Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-11-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-25 13:39:32 -07:00
Ian Rogers	6ba29fbb0b	perf callchain: Make brtype_stat in callchain_list optional struct callchain_list is 352bytes in size, 232 of which are brtype_stat. brtype_stat is only used for certain callchain_list items so make it optional, allocating when necessary. So that printing doesn't need to deal with an optional brtype_stat, pass an empty/zero version. Before: ``` struct callchain_list { u64 ip; /* 0 8 / struct map_symbol ms; / 8 24 / struct { _Bool unfolded; / 32 1 / _Bool has_children; / 33 1 / }; / 32 2 / / XXX 6 bytes hole, try to pack / u64 branch_count; / 40 8 / u64 from_count; / 48 8 / u64 predicted_count; / 56 8 / / --- cacheline 1 boundary (64 bytes) --- / u64 abort_count; / 64 8 / u64 cycles_count; / 72 8 / u64 iter_count; / 80 8 / u64 iter_cycles; / 88 8 / struct branch_type_stat brtype_stat; / 96 232 / / --- cacheline 5 boundary (320 bytes) was 8 bytes ago --- / const char srcline; /* 328 8 / struct list_head list; / 336 16 / / size: 352, cachelines: 6, members: 13 / / sum members: 346, holes: 1, sum holes: 6 / / last cacheline: 32 bytes / }; ``` After: ``` struct callchain_list { u64 ip; / 0 8 / struct map_symbol ms; / 8 24 / struct { _Bool unfolded; / 32 1 / _Bool has_children; / 33 1 / }; / 32 2 / / XXX 6 bytes hole, try to pack / u64 branch_count; / 40 8 / u64 from_count; / 48 8 / u64 predicted_count; / 56 8 / / --- cacheline 1 boundary (64 bytes) --- / u64 abort_count; / 64 8 / u64 cycles_count; / 72 8 / u64 iter_count; / 80 8 / u64 iter_cycles; / 88 8 / struct branch_type_stat brtype_stat; /* 96 8 / const char srcline; /* 104 8 / struct list_head list; / 112 16 / / size: 128, cachelines: 2, members: 13 / / sum members: 122, holes: 1, sum holes: 6 */ }; ``` Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-10-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-25 13:39:08 -07:00
Ian Rogers	d47d876d72	perf callchain: Make display use of branch_type_stat const Display code doesn't modify the branch_type_stat so switch uses to const. This is done to aid refactoring struct callchain_list where current the branch_type_stat is embedded even if not used. Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-9-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-25 13:38:50 -07:00
Ian Rogers	67a3ebf1c3	perf offcpu: Add missed btf_free Caught by address/leak sanitizer. Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-25 13:38:33 -07:00
Ian Rogers	7b2e444b76	perf threads: Remove unused dead thread list Commit `40826c45eb` ("perf thread: Remove notion of dead threads") removed dead threads but the list head wasn't removed. Remove it here. Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-7-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-25 13:38:09 -07:00
Ian Rogers	c1149037f6	perf hist: Add missing puts to hist__account_cycles Caught using reference count checking on perf top with "--call-graph=lbr". After this no memory leaks were detected. Fixes: `57849998e2` ("perf report: Add processing for cycle histograms") Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-25 13:37:48 -07:00
Ian Rogers	78c32f4cb1	libperf rc_check: Add RC_CHK_EQUAL Comparing pointers with reference count checking is tricky to avoid a SEGV. Add a convenience macro to simplify and use. Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-25 13:37:22 -07:00
Ian Rogers	ab8ce15078	perf machine: Avoid out of bounds LBR memory read Running perf top with address sanitizer and "--call-graph=lbr" fails due to reading sample 0 when no samples exist. Add a guard to prevent this. Fixes: `e2b23483eb` ("perf machine: Factor out lbr_callchain_add_lbr_ip()") Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-25 13:36:20 -07:00
Ian Rogers	7a8f349e9d	perf rwsem: Add debug mode that uses a mutex Mutex error check will capture trying to take the lock recursively and other problems that rwlock won't. At the expense of concurrency, adda debug mode that uses a mutex in place of a rwsem. Signed-off-by: Ian Rogers <irogers@google.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: liuwenyu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20231024222353.3024098-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-25 13:35:35 -07:00
Namhyung Kim	b5711042a1	perf lock contention: Use per-cpu array map for spinlocks Currently lock contention timestamp is maintained in a hash map keyed by pid. That means it needs to get and release a map element (which is proctected by spinlock!) on each contention begin and end pair. This can impact on performance if there are a lot of contention (usually from spinlocks). It used to go with task local storage but it had an issue on memory allocation in some critical paths. Although it's addressed in recent kernels IIUC, the tool should support old kernels too. So it cannot simply switch to the task local storage at least for now. As spinlocks create lots of contention and they disabled preemption during the spinning, it can use per-cpu array to keep the timestamp to avoid overhead in hashmap update and delete. In contention_begin, it's easy to check the lock types since it can see the flags. But contention_end cannot see it. So let's try to per-cpu array first (unconditionally) if it has an active element (lock != 0). Then it should be used and per-task tstamp map should not be used until the per-cpu array element is cleared which means nested spinlock contention (if any) was finished and it nows see (the outer) lock. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Song Liu <song@kernel.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20231020204741.1869520-3-namhyung@kernel.org	2023-10-25 10:02:55 -07:00
Namhyung Kim	6a070573f2	perf lock contention: Check race in tstamp elem creation When pelem is NULL, it'd create a new entry with zero data. But it might be preempted by IRQ/NMI just before calling bpf_map_update_elem() then there's a chance to call it twice for the same pid. So it'd be better to use BPF_NOEXIST flag and check the return value to prevent the race. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Song Liu <song@kernel.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20231020204741.1869520-2-namhyung@kernel.org	2023-10-25 10:02:47 -07:00
Namhyung Kim	d99317f214	perf lock contention: Clear lock addr after use It checks the current lock to calculated the delta of contention time. The address is saved in the tstamp map which is allocated at begining of contention and released at end of contention. But it's possible for bpf_map_delete_elem() to fail. In that case, the element in the tstamp map kept for the current lock and it makes the next contention for the same lock tracked incorrectly. Specificially the next contention begin will see the existing element for the task and it'd just return. Then the next contention end will see the element and calculate the time using the timestamp for the previous begin. This can result in a large value for two small contentions happened from time to time. Let's clear the lock address so that it can be updated next time even if the bpf_map_delete_elem() failed. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Song Liu <song@kernel.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20231020204741.1869520-1-namhyung@kernel.org	2023-10-25 10:02:34 -07:00
Yang Jihong	e093a222d7	perf evsel: Rename evsel__increase_rlimit to rlimit__increase_nofile evsel__increase_rlimit() helper does nothing with evsel, and description of the functionality is inaccurate, rename it and move to util/rlimit.c. By the way, fix a checkppatch warning about misplaced license tag: WARNING: Misplaced SPDX-License-Identifier tag - use line 1 instead #160: FILE: tools/perf/util/rlimit.h:3: /* SPDX-License-Identifier: LGPL-2.1 */ No functional change. Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231023033144.1011896-1-yangjihong1@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-25 10:02:11 -07:00
Yang Jihong	c4a852635e	perf data: Increase RLIMIT_NOFILE limit when open too many files in perf_data__create_dir() If using parallel threads to collect data, perf record needs at least 6 fds per CPU. (one for sys_perf_event_open, four for pipe msg and ack of the pipe, see record__thread_data_open_pipes(), and one for open perf.data.XXX) For an environment with more than 100 cores, if perf record uses both `-a` and `--threads` options, it is easy to exceed the upper limit of the file descriptor number, when we run out of them try to increase the limits. Before: $ ulimit -n 1024 $ lscpu \| grep 'On-line CPU(s)' On-line CPU(s) list: 0-159 $ perf record --threads -a sleep 1 Failed to create data directory: Too many open files After: $ ulimit -n 1024 $ lscpu \| grep 'On-line CPU(s)' On-line CPU(s) list: 0-159 $ perf record --threads -a sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.394 MB perf.data (1576 samples) ] Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231013075945.698874-1-yangjihong1@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-19 23:38:27 -07:00
Thomas Richter	5069211e2f	perf trace: Use the right bpf_probe_read(_str) variant for reading user data Perf test case 111 Check open filename arg using perf trace + vfs_getname fails on s390. This is caused by a failing function bpf_probe_read() in file util/bpf_skel/augmented_raw_syscalls.bpf.c. The root cause is the lookup by address. Function bpf_probe_read() is used. This function works only for architectures with ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE. On s390 is not possible to determine from the address to which address space the address belongs to (user or kernel space). Replace bpf_probe_read() by bpf_probe_read_kernel() and bpf_probe_read_str() by bpf_probe_read_user_str() to explicity specify the address space the address refers to. Output before: # ./perf trace -eopen,openat -- touch /tmp/111 libbpf: prog 'sys_enter': BPF program load failed: Invalid argument libbpf: prog 'sys_enter': -- BEGIN PROG LOAD LOG -- reg type unsupported for arg#0 function sys_enter#75 0: R1=ctx(off=0,imm=0) R10=fp0 ; int sys_enter(struct syscall_enter_args args) 0: (bf) r6 = r1 ; R1=ctx(off=0,imm=0) R6_w=ctx(off=0,imm=0) ; return bpf_get_current_pid_tgid(); 1: (85) call bpf_get_current_pid_tgid#14 ; R0_w=scalar() 2: (63) (u32 )(r10 -8) = r0 ; R0_w=scalar() R10=fp0 fp-8=????mmmm 3: (bf) r2 = r10 ; R2_w=fp0 R10=fp0 ; ..... lines deleted here ..... 23: (bf) r3 = r6 ; R3_w=ctx(off=0,imm=0) R6=ctx(off=0,imm=0) 24: (85) call bpf_probe_read#4 unknown func bpf_probe_read#4 processed 23 insns (limit 1000000) max_states_per_insn 0 \ total_states 2 peak_states 2 mark_read 2 -- END PROG LOAD LOG -- libbpf: prog 'sys_enter': failed to load: -22 libbpf: failed to load object 'augmented_raw_syscalls_bpf' libbpf: failed to load BPF skeleton 'augmented_raw_syscalls_bpf': -22 .... Output after: # ./perf test -Fv 111 111: Check open filename arg using perf trace + vfs_getname : --- start --- 1.085 ( 0.011 ms): touch/320753 openat(dfd: CWD, filename: \ "/tmp/temporary_file.SWH85", \ flags: CREAT\|NOCTTY\|NONBLOCK\|WRONLY, mode: IRUGO\|IWUGO) = 3 ---- end ---- Check open filename arg using perf trace + vfs_getname: Ok # Test with the sleep command shows: Output before: # ./perf trace -e sleep sleep 1.234567890 0.000 (1234.681 ms): sleep/63114 clock_nanosleep(rqtp: \ { .tv_sec: 0, .tv_nsec: 0 }, rmtp: 0x3ffe0979720) = 0 # Output after: # ./perf trace -e *sleep sleep 1.234567890 0.000 (1234.686 ms): sleep/64277 clock_nanosleep(rqtp: \ { .tv_sec: 1, .tv_nsec: 234567890 }, rmtp: 0x3fff3df9ea0) = 0 # Fixes: `14e4b9f428` ("perf trace: Raw augmented syscalls fix libbpf 1.0+ compatibility") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Co-developed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: gor@linux.ibm.com Cc: hca@linux.ibm.com Cc: sumanthk@linux.ibm.com Cc: svens@linux.ibm.com Link: https://lore.kernel.org/r/20231019082642.3286650-1-tmricht@linux.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-19 22:41:46 -07:00
Oliver Upton	e2bdd172e6	perf build: Generate arm64's sysreg-defs.h and add to include path Start generating sysreg-defs.h in anticipation of updating sysreg.h to a version that needs the generated output. Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20231011195740.3349631-3-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-10-18 23:36:25 +00:00
Namhyung Kim	1f36b190ad	perf tools: Do not ignore the default vmlinux.h The recent change made it possible to generate vmlinux.h from BTF and to ignore the file. But we also have a minimal vmlinux.h that will be used by default. It should not be ignored by GIT. Fixes: `b7a2d774c9` ("perf build: Add ability to build with a generated vmlinux.h") Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Ian Rogers <irogers@google.com> Closes: https://lore.kernel.org/oe-kbuild-all/202310110451.rvdUZJEY-lkp@intel.com/ Cc: oe-kbuild-all@lists.linux.dev Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-18 15:35:20 -07:00
Ian Rogers	0197da7aff	perf pmu: Lazily compute default config The default config is computed during creation of the PMU and may do things like scanning sysfs, when the PMU may just be used as part of scanning. Change default_config to perf_event_attr_init_default, a callback that is used when a default config needs initializing. This avoids holding onto the memory for a perf_event_attr and copying. On a tigerlake laptop running the pmu-scan benchmark: Before: Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 100 times Average core PMU scanning took: 28.780 usec (+- 0.503 usec) Average PMU scanning took: 283.480 usec (+- 18.471 usec) Number of openat syscalls: 30,227 After: Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 100 times Average core PMU scanning took: 27.880 usec (+- 0.169 usec) Average PMU scanning took: 245.260 usec (+- 15.758 usec) Number of openat syscalls: 28,914 Over 3 runs it is a nearly 12% reduction in execution time and a 4.3% of openat calls. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231012175645.1849503-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-17 12:40:50 -07:00
Ian Rogers	63883cb063	perf pmu: Const-ify perf_pmu__config_terms Add const to related APIs, this is so they can be used to default initialize a perf_event_attr from a const pmu. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231012175645.1849503-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-17 12:40:50 -07:00
Ian Rogers	3a42f4c796	perf pmu: Const-ify file APIs File APIs don't alter the struct pmu so allow const ones to be passed. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231012175645.1849503-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-17 12:40:50 -07:00
Ian Rogers	aa61360155	perf pmu: Rename perf_pmu__get_default_config to perf_pmu__arch_init Assign default_config as part of the init. perf_pmu__get_default_config was doing more than just getting the default config and so this is intended to better align with the code. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231012175645.1849503-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-17 12:40:50 -07:00
Adrian Hunter	661ce78105	perf intel-pt: Prefer get_unaligned_le64 to memcpy_le64 Use get_unaligned_le64() instead of memcpy_le64(..., 8) because it produces simpler code. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20231005190451.175568-6-adrian.hunter@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-17 12:40:50 -07:00
Adrian Hunter	3b4fa67fc6	perf intel-pt: Use get_unaligned_le16() etc Avoid unaligned access by using get_unaligned_le16(), get_unaligned_le32() and get_unaligned_le64(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20231005190451.175568-5-adrian.hunter@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-17 12:40:49 -07:00
Adrian Hunter	f058fa5b07	perf intel-pt: Use existing definitions of le16_to_cpu() etc Use definitions from tools/include/linux/kernel.h Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20231005190451.175568-4-adrian.hunter@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-17 12:40:49 -07:00
Adrian Hunter	1d2dbce9bb	perf intel-pt: Simplify intel_pt_get_vmcs() Simplify and remove unnecessary constant expressions. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20231005190451.175568-3-adrian.hunter@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-17 12:40:49 -07:00
Besar Wicaksono	a16afcc58a	perf cs-etm: Fix incorrect or missing decoder for raw trace The decoder creation for raw trace uses metadata from the first CPU. On per-cpu mode, this metadata is incorrectly used for every decoder. On per-process/per-thread traces, the first CPU is CPU0. If CPU0 trace is not enabled, its metadata will be marked unused and the decoder is not created. Perf report dump skips the decoding part because the decoder is missing. To fix this, use metadata of the CPU associated with sample object. Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Reviewed-by: James Clark <james.clark@arm.com> Cc: suzuki.poulose@arm.com Cc: mike.leach@linaro.org Cc: jonathanh@nvidia.com Cc: rwiley@nvidia.com Cc: treding@nvidia.com Cc: vsethi@nvidia.com Cc: ywan@nvidia.com Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Cc: linux-tegra@vger.kernel.org Link: https://lore.kernel.org/r/20231010234803.5419-1-bwicaksono@nvidia.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-12 10:01:57 -07:00
Ian Rogers	b84b3f4792	perf bpf_counter: Fix a few memory leaks Memory leaks were detected by clang-tidy. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: llvm@lists.linux.dev Cc: Ming Wang <wangming01@loongson.cn> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20231009183920.200859-20-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-12 10:01:57 -07:00
Ian Rogers	1052545017	perf header: Fix various error path memory leaks Memory leaks were detected by clang-tidy. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: llvm@lists.linux.dev Cc: Ming Wang <wangming01@loongson.cn> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20231009183920.200859-19-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-12 10:01:57 -07:00
Ian Rogers	97fe038374	perf trace-event-info: Avoid passing NULL value to closedir If opendir failed then closedir was passed NULL which is erroneous. Caught by clang-tidy. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: llvm@lists.linux.dev Cc: Ming Wang <wangming01@loongson.cn> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20231009183920.200859-18-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-12 10:01:57 -07:00
Ian Rogers	7875c72c8b	perf parse-events: Fix unlikely memory leak when cloning terms Add missing free on an error path as detected by clang-tidy. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: llvm@lists.linux.dev Cc: Ming Wang <wangming01@loongson.cn> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20231009183920.200859-16-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-12 10:01:57 -07:00
Ian Rogers	63d471979e	perf svghelper: Avoid memory leak On success path the sib_core and sib_thr values weren't being freed. Detected by clang-tidy. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: llvm@lists.linux.dev Cc: Ming Wang <wangming01@loongson.cn> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20231009183920.200859-14-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-12 10:01:56 -07:00
Ian Rogers	52a5ad12f2	perf dlfilter: Be defensive against potential NULL dereference In the unlikely case of having a symbol without a mapping, avoid a NULL dereference that clang-tidy warns about. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: llvm@lists.linux.dev Cc: Ming Wang <wangming01@loongson.cn> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20231009183920.200859-11-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-12 10:01:56 -07:00
Ian Rogers	85f73c377b	perf mem-events: Avoid uninitialized read pmu should be initialized to NULL before perf_pmus__scan loop. Fix and shrink the scope of pmu at the same time. Issue detected by clang-tidy. Fixes: `5752c20f37` ("perf mem: Scan all PMUs instead of just core ones") Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: llvm@lists.linux.dev Cc: Ming Wang <wangming01@loongson.cn> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20231009183920.200859-10-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-12 10:01:56 -07:00
Ian Rogers	b3aa09ee78	perf jitdump: Avoid memory leak jit_repipe_unwinding_info is called in a loop by jit_process_dump, avoid leaking unwinding_data by free-ing before overwriting. Error detected by clang-tidy. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: llvm@lists.linux.dev Cc: Ming Wang <wangming01@loongson.cn> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20231009183920.200859-9-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-12 10:01:56 -07:00
Ian Rogers	e237213670	perf env: Remove unnecessary NULL tests clang-tidy was warning: ``` util/env.c:334:23: warning: Access to field 'nr_pmu_mappings' results in a dereference of a null pointer (loaded from variable 'env') [clang-analyzer-core.NullDereference] env->nr_pmu_mappings = pmu_num; ``` As functions are called potentially when !env was true. This condition could never be true as it would produce a segv, so remove the unnecessary NULL tests and silence clang-tidy. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: llvm@lists.linux.dev Cc: Ming Wang <wangming01@loongson.cn> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20231009183920.200859-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-12 10:01:56 -07:00
Ian Rogers	b20576fd7f	perf parse-events: Fix for term values that are raw events Raw events can be strings like 'r0xead' but the 0x is optional so they can also be 'read'. On IcelakeX uncore_imc_free_running has an event called 'read' which may be programmed as: ``` $ perf stat -e 'uncore_imc_free_running/event=read/' -a sleep 1 ``` However, the PE_RAW type isn't allowed on the right of a term, even though in this case we just want to interpret it as a string. This leads to the following error on IcelakeX: ``` $ perf stat -e 'uncore_imc_free_running/event=read/' -a sleep 1 event syntax error: '..nning/event=read/' \___ parser error Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events ``` Fix this by allowing raw types on the right of terms and treat them as strings, just as is already done for PE_LEGACY_CACHE. Make this consistent by just entirely removing name_or_legacy and always using name_or_raw that covers all three cases. Fixes: `6fd1e51915` ("perf parse-events: Support PMUs for legacy cache events") Signed-off-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20230928004431.1926969-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-12 10:01:55 -07:00
Arnaldo Carvalho de Melo	29a2fd7c72	perf symbols: Add 'intel_idle_ibrs' to the list of idle symbols This is a longstanding to do list entry: we need a way to see that a sample took place while in idle state, as the current way to do it is to infer that by the name of the functions that in such state have more samples, IOW: a hack. Maybe we can do flip a bit in samples that take place inside the enter/exit idle section in do_idle()? But till then, add one more :-\ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/ZR66Qgbcltt+zG7F@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-12 10:01:55 -07:00
Ian Rogers	03ff4c6b3e	perf parse-events: Avoid erange from hex numbers We specify that a "num_hex" comprises 1 or more digits, however, that allows strtoull to fail with ERANGE. Limit the number of hex digits to being between 1 and 16. Before: ``` $ perf stat -e 'cpu/rE7574c47490475745/' true perf: util/parse-events.c:215: fix_raw: Assertion `errno == 0' failed. Aborted (core dumped) ``` After: ``` $ perf stat -e 'cpu/rE7574c47490475745/' true event syntax error: 'cpu/rE7574c47490475745/' \___ Bad event or PMU Unable to find PMU or event on a PMU of 'cpu' Initial error: event syntax error: 'cpu/rE7574c47490475745/' \___ unknown term 'rE7574c47490475745' for pmu 'cpu' valid terms: event,pc,edge,offcore_rsp,ldlat,inv,umask,frontend,cmask,config,config1,config2,config3,name,period,percore,metric-id Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events ``` Issue found through fuzz testing. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20230907210533.3712979-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-12 10:01:55 -07:00
Arnaldo Carvalho de Melo	87cd3d4819	perf tools fixes for v6.6: 1st batch Build: - Update header files in the tools/**/include directory to sync with the kernel sources as usual. - Remove unused bpf-prologue files. While it's not strictly a fix, but the functionality was removed in this cycle so better to get rid of the code together. - Other minor build fixes. Misc: - Fix uninitialized memory access in PMU parsing code - Fix segfaults on software event Signed-off-by: Namhyung Kim <namhyung@kernel.org> -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQSo2x5BnqMqsoHtzsmMstVUGiXMgwUCZRIFKAAKCRCMstVUGiXM g/pXAP9HLB2s+beBTK5iQU4/NfqmAVSl303QCoR9xLByo38vfAEAlLiRIh061pTi PRlXVuY9bUQPyCSYsiBHv/fmLqdQdwU= =ti6G -----END PGP SIGNATURE----- Merge tag 'perf-tools-fixes-for-v6.6-1-2023-09-25' into perf-tools-next To pick up the 'perf bench sched-seccomp-notify' changes to allow us to continue build testing perf-tools-next with the set of distro containers, where some older ones don't have a recent enough seccomp.h UAPI header that contains defines needed by this new 'perf bench' workload. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-10-10 17:36:36 -03:00
Athira Rajeev	6be5d82862	tools/perf: Add "is_kmod" to struct dso to check if it is kernel module Update "struct dso" to include new member "is_kmod". This new field will determine if the file is a kernel module or not. To resolve the address from a sample, perf looks at the DSO maps. In case of address from a kernel module, there were some address found to be not resolved. This was observed while running perf test for "Object code reading". Though the ip falls beteen the start address of the loaded module (perf map->start ) and end address ( perf map->end), it was unresolved. This was happening because in some cases for kernel modules, address from sample points to stub instructions. To identify if the DSO is a kernel module, the new field "is_kmod" is added to "struct dso". Reported-by: Disha Goel <disgoel@linux.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: kjain@linux.ibm.com Cc: maddy@linux.ibm.com Cc: disgoel@linux.vnet.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230928075213.84392-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-04 22:28:07 -07:00
Athira Rajeev	26a5262d30	tools/perf: Add text_end to "struct dso" to save .text section size Update "struct dso" to include new member "text_end". This new field will represent the offset for end of text section for a dso. For elf, this value is derived as: sh_size (Size of section in byes) + sh_offset (Section file offst) of the elf header for text. For bfd, this value is derived as: 1. For PE file, section->size + ( section->vma - dso->text_offset) 2. Other cases: section->filepos (file position) + section->size (size of section) To resolve the address from a sample, perf looks at the DSO maps. In case of address from a kernel module, there were some address found to be not resolved. This was observed while running perf test for "Object code reading". Though the ip falls beteen the start address of the loaded module (perf map->start ) and end address ( perf map->end), it was unresolved. Example: Reading object code for memory address: 0xc008000007f0142c File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko On file address is: 0x1114cc Objdump command is: objdump -z -d --start-address=0x11142c --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko objdump read too few bytes: 128 test child finished with -1 Here, module is loaded at: # cat /proc/modules \| grep xfs xfs 2228224 3 - Live 0xc008000007d00000 From objdump for xfs module, text section is: text 0010f7bc 0000000000000000 0000000000000000 000000a0 2**4 Here the offset for 0xc008000007f0142c ie 0x112074 falls out .text section which is up to 0x10f7bc. In this case for module, the address 0xc008000007e11fd4 is pointing to stub instructions. This address range represents the module stubs which is allocated on module load and hence is not part of DSO offset. To identify such address, which falls out of text section and within module end, added the new field "text_end" to "struct dso". Reported-by: Disha Goel <disgoel@linux.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: maddy@linux.ibm.com Cc: disgoel@linux.vnet.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230928075213.84392-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-04 22:28:07 -07:00
Kuan-Wei Chiu	be7a4caa7c	perf hisi-ptt: Fix memory leak in lseek failure handling In the previous code, there was a memory leak issue where the previously allocated memory was not freed upon a failed lseek operation. This patch addresses the problem by releasing the old memory before returning -errno in case of a lseek failure. This ensures that memory is properly managed and avoids potential memory leaks. Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: yangyicong@hisilicon.com Cc: jonathan.cameron@huawei.com Link: https://lore.kernel.org/r/20230930072719.1267784-1-visitorckw@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-10-04 22:28:07 -07:00
Adrian Hunter	f2d87895cb	perf intel-pt: Fix async branch flags Ensure PERF_IP_FLAG_ASYNC is set always for asynchronous branches (i.e. interrupts etc). Fixes: `90e457f7be` ("perf tools: Add Intel PT support") Cc: stable@vger.kernel.org Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20230928072953.19369-1-adrian.hunter@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-09-29 23:59:08 -07:00
Adrian Hunter	7a48b58eb5	perf dlfilter: Fix use of addr_location__exit() in dlfilter__object_code() Stop calling addr_location__exit() when addr_location__init() was not called. Fixes: `0dd5041c9a` ("perf addr_location: Add init/exit/copy functions") Cc: stable@vger.kernel.org Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20230928071605.17624-1-adrian.hunter@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-09-29 23:55:05 -07:00

1 2 3 4 5 ...

8739 Commits