From 40999312c703f80e8d31bc77cf00e6e84d36e091 Mon Sep 17 00:00:00 2001 From: Kan Liang Date: Wed, 18 Jan 2017 08:21:01 -0500 Subject: [PATCH] perf/core: Try parent PMU first when initializing a child event perf has additional overhead when monitoring the task which frequently generates child tasks. perf_init_event() is one of the hotspots for the additional overhead: Currently, to get the PMU, it tries to search the type in pmu_idr at first. But it is not always successful, especially for the widely used PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE events. So it has to go to the slow path which go through the whole PMUs list. It will be a big performance issue, if the PMUs list is long (e.g. server with many uncore boxes) and the task frequently generates child tasks. The child event inherits its parent event. So the child event should try its parent PMU first. Here is some data from the overhead test on Broadwell server: perf record -e $TEST_EVENTS -- ./loop.sh 50000 loop.sh start=$(date +%s%N) i=0 while [ "$i" -le "$1" ] do date > /dev/null i=`expr $i + 1` done end=$(date +%s%N) elapsed=`expr $end - $start` Event# Original elapsed time Elapsed time with patch delta 1 196,573,192,397 189,162,029,998 -3.77% 2 257,567,753,013 241,620,788,683 -6.19% 4 398,730,726,971 370,518,938,714 -7.08% 8 824,983,761,120 740,702,489,329 -10.22% 16 1,883,411,923,498 1,672,027,508,355 -11.22% ... which shows a nice performance improvement. Signed-off-by: Kan Liang Signed-off-by: Peter Zijlstra (Intel) Cc: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Jiri Olsa Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Vince Weaver Link: http://lkml.kernel.org/r/1484745662-15928-2-git-send-email-kan.liang@intel.com [ Tidied up the changelog and the code comment. ] Signed-off-by: Ingo Molnar --- kernel/events/core.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/kernel/events/core.c b/kernel/events/core.c index cbcee23d05f0..88676ff98c0f 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -9022,6 +9022,14 @@ static struct pmu *perf_init_event(struct perf_event *event) idx = srcu_read_lock(&pmus_srcu); + /* Try parent's PMU first: */ + if (event->parent && event->parent->pmu) { + pmu = event->parent->pmu; + ret = perf_try_init_event(pmu, event); + if (!ret) + goto unlock; + } + rcu_read_lock(); pmu = idr_find(&pmu_idr, event->attr.type); rcu_read_unlock();