linux-stable/kernel/time
Thomas Gleixner a84b08314f tick/broadcast: Make broadcast device replacement work correctly
[ Upstream commit f9d36cf445 ]

When a tick broadcast clockevent device is initialized for one shot mode
then tick_broadcast_setup_oneshot() OR's the periodic broadcast mode
cpumask into the oneshot broadcast cpumask.

This is required when switching from periodic broadcast mode to oneshot
broadcast mode to ensure that CPUs which are waiting for periodic
broadcast are woken up on the next tick.

But it is subtly broken, when an active broadcast device is replaced and
the system is already in oneshot (NOHZ/HIGHRES) mode. Victor observed
this and debugged the issue.

Then the OR of the periodic broadcast CPU mask is wrong as the periodic
cpumask bits are sticky after tick_broadcast_enable() set it for a CPU
unless explicitly cleared via tick_broadcast_disable().

That means that this sets all other CPUs which have tick broadcasting
enabled at that point unconditionally in the oneshot broadcast mask.

If the affected CPUs were already idle and had their bits set in the
oneshot broadcast mask then this does no harm. But for non idle CPUs
which were not set this corrupts their state.

On their next invocation of tick_broadcast_enable() they observe the bit
set, which indicates that the broadcast for the CPU is already set up.
As a consequence they fail to update the broadcast event even if their
earliest expiring timer is before the actually programmed broadcast
event.

If the programmed broadcast event is far in the future, then this can
cause stalls or trigger the hung task detector.

Avoid this by telling tick_broadcast_setup_oneshot() explicitly whether
this is the initial switch over from periodic to oneshot broadcast which
must take the periodic broadcast mask into account. In the case of
initialization of a replacement device this prevents that the broadcast
oneshot mask is modified.

There is a second problem with broadcast device replacement in this
function. The broadcast device is only armed when the previous state of
the device was periodic.

That is correct for the switch from periodic broadcast mode to oneshot
broadcast mode as the underlying broadcast device could operate in
oneshot state already due to lack of periodic state in hardware. In that
case it is already armed to expire at the next tick.

For the replacement case this is wrong as the device is in shutdown
state. That means that any already pending broadcast event will not be
armed.

This went unnoticed because any CPU which goes idle will observe that
the broadcast device has an expiry time of KTIME_MAX and therefore any
CPUs next timer event will be earlier and cause a reprogramming of the
broadcast device. But that does not guarantee that the events of the
CPUs which were already in idle are delivered on time.

Fix this by arming the newly installed device for an immediate event
which will reevaluate the per CPU expiry times and reprogram the
broadcast device accordingly. This is simpler than caching the last
expiry time in yet another place or saving it before the device exchange
and handing it down to the setup function. Replacement of broadcast
devices is not a frequent operation and usually happens once somewhere
late in the boot process.

Fixes: 9c336c9935 ("tick/broadcast: Allow late registered device to enter oneshot mode")
Reported-by: Victor Hassan <victor@allwinnertech.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/87pm7d2z1i.ffs@tglx
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-24 17:32:31 +01:00
..
alarmtimer.c alarmtimer: Prevent starvation by small intervals and SIG_IGN 2023-02-22 12:59:55 +01:00
clockevents.c
clocksource-wdtest.c
clocksource.c clocksource: Suspend the watchdog temporarily when high read latency detected 2023-03-10 09:33:50 +01:00
hrtimer.c timers: Prevent union confusion from unexpected restart_syscall() 2023-03-10 09:33:49 +01:00
itimer.c
jiffies.c
Kconfig context_tracking: Take idle eqs entrypoints over RCU 2022-07-05 13:32:16 -07:00
Makefile
namespace.c
ntp.c
ntp_internal.h
posix-clock.c
posix-cpu-timers.c posix-cpu-timers: Implement the missing timer_wait_running callback 2023-05-11 23:03:00 +09:00
posix-stubs.c timers: Prevent union confusion from unexpected restart_syscall() 2023-03-10 09:33:49 +01:00
posix-timers.c posix-cpu-timers: Implement the missing timer_wait_running callback 2023-05-11 23:03:00 +09:00
posix-timers.h
sched_clock.c
test_udelay.c time/debug: Fix memory leak with using debugfs_lookup() 2023-03-10 09:33:52 +01:00
tick-broadcast-hrtimer.c
tick-broadcast.c tick/broadcast: Make broadcast device replacement work correctly 2023-05-24 17:32:31 +01:00
tick-common.c tick/common: Align tick period with the HZ tick. 2023-05-11 23:03:16 +09:00
tick-internal.h
tick-legacy.c
tick-oneshot.c
tick-sched.c rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency check 2023-05-11 23:03:06 +09:00
tick-sched.h
time.c time: Correct the prototype of ns_to_kernel_old_timeval and ns_to_timespec64 2022-08-09 20:02:13 +02:00
time_test.c
timeconst.bc
timeconv.c
timecounter.c
timekeeping.c timekeeping: Fix references to nonexistent ktime_get_fast_ns() 2023-05-11 23:03:35 +09:00
timekeeping.h
timekeeping_debug.c
timekeeping_internal.h
timer.c Random number generator updates for Linux 5.19-rc1. 2022-05-24 11:58:10 -07:00
timer_list.c
vsyscall.c