linux-stable/kernel/time
Thomas Gleixner c9e6189fb0 ntp: Make the RTC synchronization more reliable
Miroslav reported that the periodic RTC synchronization in the NTP code
fails more often than not to hit the specified update window.

The reason is that the code uses delayed_work to schedule the update which
needs to be in thread context as the underlying RTC might be connected via
a slow bus, e.g. I2C. In the update function it verifies whether the
current time is correct vs. the requirements of the underlying RTC.

But delayed_work is using the timer wheel for scheduling which is
inaccurate by design. Depending on the distance to the expiry the wheel
gets less granular to allow batching and to avoid the cascading of the
original timer wheel. See 500462a9de ("timers: Switch to a non-cascading
wheel") and the code for further details.

The code already deals with this by splitting the 660 seconds period into a
long 659 seconds timer and then retrying with a smaller delta.

But looking at the actual granularities of the timer wheel (which depend on
the HZ configuration) the 659 seconds timer ends up in an outer wheel level
and is affected by a worst case granularity of:

HZ          Granularity
1000        32s
 250        16s
 100        40s

So the initial timer can already be off by max 12.5% which is not a big
issue as the period of the sync is defined as ~11 minutes.

The fine grained second attempt schedules to the desired update point with
a timer expiring less than a second from now. Depending on the actual delta
and the HZ setting even the second attempt can end up in outer wheel levels
which have a large enough granularity to make the correctness check fail.
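
The granularities quoted above can be reproduced with a small userspace
sketch. It is not the kernel code. The constants (64 buckets per level, each
level 8 times coarser than the previous one) are assumed from the wheel
layout in kernel/time/timer.c, and the helper names wheel_level() and
wheel_granularity_ms() are hypothetical.

```c
/* Assumed from kernel/time/timer.c: 64 buckets per level, 8x coarser per level. */
#define LVL_CLK_SHIFT 3
#define LVL_BITS      6
#define LVL_SIZE      (1UL << LVL_BITS)
/* Simplification: the kernel uses 9 levels for HZ > 100 and 8 otherwise. */
#define LVL_DEPTH     9

/* First delta (in jiffies) that no longer fits into level n - 1. */
#define LVL_START(n)  ((LVL_SIZE - 1) << (((n) - 1) * LVL_CLK_SHIFT))

/* Hypothetical helper: wheel level a timer with this delta is queued in. */
static unsigned int wheel_level(unsigned long delta_jiffies)
{
	unsigned int lvl;

	for (lvl = 0; lvl < LVL_DEPTH - 1; lvl++) {
		if (delta_jiffies < LVL_START(lvl + 1))
			break;
	}
	return lvl;
}

/*
 * Hypothetical helper: granularity in milliseconds of the wheel level that
 * a timer expiring delta_ms from now ends up in, for a given HZ.
 */
static unsigned long wheel_granularity_ms(unsigned int hz, unsigned long delta_ms)
{
	unsigned long long delta_jiffies = (unsigned long long)delta_ms * hz / 1000;
	unsigned int lvl = wheel_level((unsigned long)delta_jiffies);
	unsigned long long gran_jiffies = 1ULL << (lvl * LVL_CLK_SHIFT);

	return (unsigned long)(gran_jiffies * 1000 / hz);
}
```

Under these assumptions wheel_granularity_ms(1000, 659000) evaluates to
32768, wheel_granularity_ms(250, 659000) to 16384 and
wheel_granularity_ms(100, 659000) to 40960, which matches the table. For a
sub-second delta of 900ms the same model gives 64ms at HZ=1000, 32ms at
HZ=250 and 80ms at HZ=100, which illustrates how the second attempt can
still land in a level coarser than one jiffy.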

As this is a fundamental property of the timer wheel there is no way to
make this more accurate short of iterating in one-jiffy steps towards the
update point.

Switch it to an hrtimer instead which schedules the actual update work. The
hrtimer will expire precisely (max 1 jiffy delay when high resolution
timers are not available). The actual scheduling delay of the work is the
same as before.

The update is triggered from do_adjtimex() which is a bit racy but not much
more racy than it was before:

     if (ntp_synced())
     	queue_delayed_work(system_power_efficient_wq, &sync_work, 0);

which is racy when the work is currently executed and has not managed to
reschedule itself.

This now becomes:

     if (ntp_synced() && !hrtimer_is_queued(&sync_hrtimer))
     	queue_work(system_power_efficient_wq, &sync_work);

which is racy when the hrtimer has expired and the work is currently
executed and has not yet managed to rearm the hrtimer.

Not a big problem as it just schedules work for nothing.

The new implementation has a safeguard in place to catch the case where
the hrtimer is queued on entry to the work function and avoids an extra
update attempt of the RTC that way.

Reported-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Miroslav Lichvar <mlichvar@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201206220542.062910520@linutronix.de
2020-12-11 10:40:52 +01:00
..
alarmtimer.c alarmtimer: Convert comma to semicolon 2020-08-25 12:45:53 +02:00
clockevents.c
clocksource.c clocksource: Remove obsolete ifdef 2020-06-09 16:36:47 +02:00
hrtimer.c hrtimer: Fix kernel-doc markups 2020-11-16 15:20:01 +01:00
itimer.c y2038: rename itimerval to __kernel_old_itimerval 2019-12-18 18:07:33 +01:00
jiffies.c timekeeping: Convert jiffies_seq to seqcount_raw_spinlock_t 2020-10-26 11:04:14 +01:00
Kconfig posix-cpu-timers: Provide mechanisms to defer timer handling to task_work 2020-08-06 16:50:59 +02:00
Makefile ns: Introduce Time Namespace 2020-01-14 12:20:48 +01:00
namespace.c nsproxy: support CLONE_NEWTIME with setns() 2020-07-08 11:14:22 +02:00
ntp.c ntp: Make the RTC synchronization more reliable 2020-12-11 10:40:52 +01:00
ntp_internal.h ntp: Make the RTC synchronization more reliable 2020-12-11 10:40:52 +01:00
posix-clock.c posix-clocks: Rename the clock_get() callback to clock_get_timespec() 2020-01-14 12:20:49 +01:00
posix-cpu-timers.c posix-cpu-timers: Provide mechanisms to defer timer handling to task_work 2020-08-06 16:50:59 +02:00
posix-stubs.c posix-timers: Make clock_nanosleep() time namespace aware 2020-01-14 12:20:55 +01:00
posix-timers.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
posix-timers.h posix-clocks: Introduce clock_get_ktime() callback 2020-01-14 12:20:51 +01:00
sched_clock.c time/sched_clock: Use seqcount_latch_t 2020-09-10 11:19:29 +02:00
test_udelay.c
tick-broadcast-hrtimer.c tick: broadcast-hrtimer: Fix a race in bc_set_next 2019-09-27 14:45:55 +02:00
tick-broadcast.c tick: Get rid of tick_period 2020-11-19 10:48:29 +01:00
tick-common.c tick: Get rid of tick_period 2020-11-19 10:48:29 +01:00
tick-internal.h tick: Get rid of tick_period 2020-11-19 10:48:29 +01:00
tick-oneshot.c
tick-sched.c tick: Get rid of tick_period 2020-11-19 10:48:29 +01:00
tick-sched.h tick/sched: Update tick_sched struct documentation 2019-03-24 20:29:32 +01:00
time.c y2038: remove unused time32 interfaces 2020-02-21 11:22:15 -08:00
timeconst.bc
timeconv.c time: Add missing colons for parameter documentation of time64_to_tm() 2020-11-15 23:47:23 +01:00
timecounter.c
timekeeping.c timekeeping: Address parameter documentation issues for various functions 2020-11-15 23:47:24 +01:00
timekeeping.h timekeeping: Convert jiffies_seq to seqcount_raw_spinlock_t 2020-10-26 11:04:14 +01:00
timekeeping_debug.c
timekeeping_internal.h timekeeping/vsyscall: Provide vdso_update_begin/end() 2020-08-06 10:57:30 +02:00
timer.c timers: Make run_local_timers() static 2020-11-16 15:20:01 +01:00
timer_list.c timer_list: Use printk format instead of open-coded symbol lookup 2020-11-15 20:47:14 +01:00
vsyscall.c timekeeping/vsyscall: Provide vdso_update_begin/end() 2020-08-06 10:57:30 +02:00