linux-stable/kernel/rcu
Neeraj Upadhyay 4f2bfd9494 srcu: Make expedited RCU grace periods block even less frequently
The purpose of commit 282d8998e9 ("srcu: Prevent expedited GPs
and blocking readers from consuming CPU") was to prevent a long
series of never-blocking expedited SRCU grace periods from blocking
kernel-live-patching (KLP) progress.  Although it was successful, it also
resulted in excessive boot times on certain embedded workloads running
under qemu with the "-bios QEMU_EFI.fd" command line.  Here "excessive"
means increasing the boot time up into the three-to-four minute range.
This increase in boot time was due to the more than 6000 back-to-back
invocations of synchronize_rcu_expedited() within the KVM host OS, which
in turn resulted from qemu's emulation of a long series of MMIO accesses.

Commit 640a7d37c3f4 ("srcu: Block less aggressively for expedited grace
periods") did not significantly help this particular use case.

Zhangfei Gao and Shameerali Kolothum Thodi did experiments varying the
value of SRCU_MAX_NODELAY_PHASE with HZ=250 and with various values
of non-sleeping per phase counts on a system with preemption enabled,
and observed the following boot times:

+──────────────────────────+────────────────+
| SRCU_MAX_NODELAY_PHASE   | Boot time (s)  |
+──────────────────────────+────────────────+
| 100                      | 30.053         |
| 150                      | 25.151         |
| 200                      | 20.704         |
| 250                      | 15.748         |
| 500                      | 11.401         |
| 1000                     | 11.443         |
| 10000                    | 11.258         |
| 1000000                  | 11.154         |
+──────────────────────────+────────────────+

Analysis on the experiment results show additional improvements with
CPU-bound delays approaching one jiffy in duration. This improvement was
also seen when number of per-phase iterations were scaled to one jiffy.

This commit therefore scales per-grace-period phase number of non-sleeping
polls so that non-sleeping polls extend for about one jiffy. In addition,
the delay-calculation call to srcu_get_delay() in srcu_gp_end() is
replaced with a simple check for an expedited grace period.  This change
schedules callback invocation immediately after expedited grace periods
complete, which results in greatly improved boot times.  Testing done
by Marc and Zhangfei confirms that this change recovers most of the
performance degradation in boottime; for CONFIG_HZ_250 configuration,
specifically, boot times improve from 3m50s to 41s on Marc's setup;
and from 2m40s to ~9.7s on Zhangfei's setup.

In addition to the changes to default per phase delays, this
change adds 3 new kernel parameters - srcutree.srcu_max_nodelay,
srcutree.srcu_max_nodelay_phase, and srcutree.srcu_retry_check_delay.
This allows users to configure the srcu grace period scanning delays in
order to more quickly react to additional use cases.

Fixes: 640a7d37c3f4 ("srcu: Block less aggressively for expedited grace periods")
Fixes: 282d8998e9 ("srcu: Prevent expedited GPs and blocking readers from consuming CPU")
Reported-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Reported-by: yueluck <yueluck@163.com>
Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Tested-by: Marc Zyngier <maz@kernel.org>
Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Link: https://lore.kernel.org/all/20615615-0013-5adc-584f-2b1d5c03ebfc@linaro.org/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-07-19 11:39:59 -07:00
..
Kconfig Merge branch 'exp.2022.05.11a' into HEAD 2022-05-11 11:49:35 -07:00
Kconfig.debug Merge branch 'exp.2022.05.11a' into HEAD 2022-05-11 11:49:35 -07:00
Makefile rcuperf: Change rcuperf to rcuscale 2020-08-24 18:39:24 -07:00
rcu.h sysctl changes for v5.19-rc1 2022-05-26 16:57:20 -07:00
rcu_segcblist.c rcu: Clarify fill-the-gap comment in rcu_segcblist_advance() 2022-04-11 17:28:48 -07:00
rcu_segcblist.h rcu: Mark writes to the rcu_segcblist structure's ->flags field 2022-02-14 10:36:58 -08:00
rcuscale.c rcuscale: Allow rcuscale without RCU Tasks Rude/Trace 2022-04-20 16:53:19 -07:00
rcutorture.c Merge branches 'docs.2022.04.20a', 'fixes.2022.04.20a', 'nocb.2022.04.11b', 'rcu-tasks.2022.04.11b', 'srcu.2022.05.03a', 'torture.2022.04.11b', 'torture-tasks.2022.04.20a' and 'torturescript.2022.04.20a' into HEAD 2022-05-03 10:21:40 -07:00
refscale.c refscale: Allow refscale without RCU Tasks Rude/Trace 2022-04-20 16:53:19 -07:00
srcutiny.c srcu: Prevent redundant __srcu_read_unlock() wakeup 2021-11-30 17:28:16 -08:00
srcutree.c srcu: Make expedited RCU grace periods block even less frequently 2022-07-19 11:39:59 -07:00
sync.c rcu_sync: Fix comment to properly reflect rcu_sync_exit() behavior 2022-04-20 16:51:11 -07:00
tasks.h rcu-tasks: Handle sparse cpu_possible_mask in rcu_tasks_invoke_cbs() 2022-04-11 17:06:43 -07:00
tiny.c srcu: Initialize SRCU after timers 2021-05-10 16:03:35 -07:00
tree.c Merge branch 'exp.2022.05.11a' into HEAD 2022-05-11 11:49:35 -07:00
tree.h Merge branch 'exp.2022.05.11a' into HEAD 2022-05-11 11:49:35 -07:00
tree_exp.h rcu: Move expedited grace period (GP) work to RT kthread_worker 2022-05-11 11:47:10 -07:00
tree_nocb.h rcu/nocb: Initialize nocb kthreads only for boot CPU prior SMP initialization 2022-04-11 17:05:58 -07:00
tree_plugin.h Merge branches 'docs.2022.04.20a', 'fixes.2022.04.20a', 'nocb.2022.04.11b', 'rcu-tasks.2022.04.11b', 'srcu.2022.05.03a', 'torture.2022.04.11b', 'torture-tasks.2022.04.20a' and 'torturescript.2022.04.20a' into HEAD 2022-05-03 10:21:40 -07:00
tree_stall.h printk changes for 5.19 2022-05-25 10:32:08 -07:00
update.c rcu: Introduce CONFIG_RCU_EXP_CPU_STALL_TIMEOUT 2022-05-11 11:38:50 -07:00