linux-stable/kernel/events
Peter Zijlstra b0ebeb5956 perf: Optimize perf_cgroup_switch()
[ Upstream commit f06cc667f7 ]

Namhyung reported that bd27568117 ("perf: Rewrite core context handling")
regresses context switch overhead when perf-cgroup is in use together
with 'slow' PMUs like uncore.

Specifically, perf_cgroup_switch()'s perf_ctx_disable() /
ctx_sched_out() etc.. all iterate the full list of active PMUs for
that CPU, even if they don't have cgroup events.

Previously there was cgrp_cpuctx_list which linked the relevant PMUs
together, but that got lost in the rework. Instead of re-instruducing
a similar list, let the perf_event_pmu_context iteration skip those
that do not have cgroup events. This avoids growing multiple versions
of the perf_event_pmu_context iteration.

Measured performance (on a slightly different patch):

Before)

  $ taskset -c 0 ./perf bench sched pipe -l 10000 -G AAA,BBB
  # Running 'sched/pipe' benchmark:
  # Executed 10000 pipe operations between two processes

       Total time: 0.901 [sec]

        90.128700 usecs/op
            11095 ops/sec

After)

  $ taskset -c 0 ./perf bench sched pipe -l 10000 -G AAA,BBB
  # Running 'sched/pipe' benchmark:
  # Executed 10000 pipe operations between two processes

       Total time: 0.065 [sec]

         6.560100 usecs/op
           152436 ops/sec

Fixes: bd27568117 ("perf: Rewrite core context handling")
Reported-by: Namhyung Kim <namhyung@kernel.org>
Debugged-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20231009210425.GC6307@noisy.programming.kicks-ass.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:58:53 +01:00
..
callchain.c uaccess: remove CONFIG_SET_FS 2022-02-25 09:36:06 +01:00
core.c perf: Optimize perf_cgroup_switch() 2023-11-20 11:58:53 +01:00
hw_breakpoint.c perf/hw_breakpoint: Remove arch breakpoint hooks 2023-08-16 23:54:50 +10:00
hw_breakpoint_test.c perf/hw_breakpoint: remove MODULE_LICENSE in non-modules 2023-04-13 13:13:52 -07:00
internal.h perf/core: Fix perf_mmap fail when CONFIG_PERF_USE_VMALLOC enabled 2022-04-19 21:15:42 +02:00
Makefile perf/hw_breakpoint: Add KUnit test for constraints accounting 2022-08-30 10:56:20 +02:00
ring_buffer.c perf/ring_buffer: Use local_try_cmpxchg in __perf_output_begin 2023-07-10 09:52:36 +02:00
uprobes.c mmu_notifiers: don't invalidate secondary TLBs as part of mmu_notifier_invalidate_range_end() 2023-08-18 10:12:41 -07:00