linux-stable/kernel
Anna-Maria Behnsen 3c89a068bf PM: s2idle: Make sure CPUs will wakeup directly on resume
s2idle works like a regular suspend with freezing processes and freezing
devices. All CPUs except the control CPU go into idle. Once this is
completed the control CPU kicks all other CPUs out of idle, so that they
reenter the idle loop and then enter s2idle state. The control CPU then
issues an swait() on the suspend state and therefore enters the idle loop
as well.

Due to being kicked out of idle, the other CPUs leave their NOHZ states,
which means the tick is active and the corresponding hrtimer is programmed
to the next jiffy.

On entering s2idle the CPUs shut down their local clockevent device to
prevent wakeups. The last CPU which enters s2idle shuts down its local
clockevent and freezes timekeeping.

On resume, one of the CPUs receives the wakeup interrupt, unfreezes
timekeeping and its local clockevent and starts the resume process. At that
point all other CPUs are still in s2idle with their clockevents switched
off. They only resume when they are kicked by another CPU or after resuming
devices and then receiving a device interrupt.

That means there is no guarantee that all CPUs will wake up directly on
resume. As a consequence there is no guarantee that timers which are queued
on those CPUs and should expire directly after resume are handled. Also,
timer list timers which are remotely queued to one of those CPUs after
resume will not result in a reprogramming IPI as the tick is
active. Queueing an hrtimer will also not result in a reprogramming IPI
because the first hrtimer event is already in the past.

The recent introduction of the timer pull model (7ee9887703 ("timers:
Implement the hierarchical pull model")) amplifies this problem if the
current migrator is one of the CPUs that have not been woken up. When a
non-pinned timer list timer is queued and the queuing CPU goes idle, it
relies on the still suspended migrator CPU to expire the timer, which
happens only by chance.

The problem has existed since commit 8d89835b04 ("PM: suspend: Do not pause
cpuidle in the suspend-to-idle path"). There, the cpuidle_pause() call,
which in turn invoked a wakeup for all idle CPUs, was moved to a later point
in the resume process. That point might not be reached, or might be reached
very late, because it waits on a timer of a still suspended CPU.

Address this by kicking all CPUs out of idle after the control CPU returns
from swait() so that they resume their timers and restore consistent system
state.
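
The effect of the kick can be illustrated with a small userspace model. This
is only a sketch and not the kernel change itself, since the diff is not
shown here. Every name in it (struct toy_cpu, toy_suspend(),
toy_wakeup_irq(), toy_kick_all_idle_cpus(), toy_stranded_timers(),
toy_resume_cycle()) is hypothetical, and the model assumes that each CPU has
one timer due right after resume and that CPU 0 takes the wakeup interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_TOY_CPUS 4

/* Hypothetical per-CPU model; these are not kernel data structures. */
struct toy_cpu {
	bool in_s2idle;     /* clockevent shut down, CPU parked in s2idle */
	bool timer_pending; /* a timer queued here is due right after resume */
};

/* All CPUs enter s2idle; each one has a timer that is due after resume. */
static void toy_suspend(struct toy_cpu cpus[NR_TOY_CPUS])
{
	for (size_t i = 0; i < NR_TOY_CPUS; i++) {
		cpus[i].in_s2idle = true;
		cpus[i].timer_pending = true;
	}
}

/* The CPU that receives the wakeup interrupt resumes and runs its timers. */
static void toy_wakeup_irq(struct toy_cpu cpus[NR_TOY_CPUS], size_t cpu)
{
	assert(cpu < NR_TOY_CPUS);
	cpus[cpu].in_s2idle = false;
	cpus[cpu].timer_pending = false;
}

/* Models the kick: every CPU still in s2idle resumes and runs its timers. */
static void toy_kick_all_idle_cpus(struct toy_cpu cpus[NR_TOY_CPUS])
{
	for (size_t i = 0; i < NR_TOY_CPUS; i++) {
		if (cpus[i].in_s2idle) {
			cpus[i].in_s2idle = false;
			cpus[i].timer_pending = false;
		}
	}
}

/* Counts due timers that sit on a CPU which is still in s2idle. */
static size_t toy_stranded_timers(const struct toy_cpu cpus[NR_TOY_CPUS])
{
	size_t stranded = 0;

	for (size_t i = 0; i < NR_TOY_CPUS; i++) {
		if (cpus[i].in_s2idle && cpus[i].timer_pending)
			stranded++;
	}
	return stranded;
}

/* Runs one suspend/resume cycle and returns the number of stranded timers. */
static size_t toy_resume_cycle(bool kick_after_wait)
{
	struct toy_cpu cpus[NR_TOY_CPUS];

	toy_suspend(cpus);
	toy_wakeup_irq(cpus, 0); /* assumption: CPU 0 takes the wakeup IRQ */
	if (kick_after_wait)
		toy_kick_all_idle_cpus(cpus);
	return toy_stranded_timers(cpus);
}
```

Calling toy_resume_cycle(false) models the behaviour described above, where
the timers on the other CPUs stay unhandled until something else happens to
wake those CPUs. Calling toy_resume_cycle(true) models the kick after the
control CPU returns from its wait, after which no timer is left on a CPU
that is still in s2idle.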

Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218641
Fixes: 8d89835b04 ("PM: suspend: Do not pause cpuidle in the suspend-to-idle path")
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Mario Limonciello <mario.limonciello@amd.com>
Cc: 5.16+ <stable@kernel.org> # 5.16+
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-08 15:36:54 +02:00
bpf Including fixes from netfilter, bluetooth and bpf. 2024-04-04 14:49:10 -07:00
cgroup Networking changes for 6.9. 2024-03-12 17:44:08 -07:00
configs Networking changes for 6.9. 2024-03-12 17:44:08 -07:00
debug kdb: Fix a potential buffer overflow in kdb_local() 2024-01-17 17:19:06 +00:00
dma dma-mapping fixes for Linux 6.9 2024-03-24 10:45:31 -07:00
entry entry: Respect changes to system call number by trace_sys_enter() 2024-03-12 13:23:32 +01:00
events - Sumanth Korikkar has taught s390 to allocate hotplug-time page frames 2024-03-14 17:43:30 -07:00
futex futex: Prevent the reuse of stale pi_state 2024-01-19 12:58:17 +01:00
gcov gcov: annotate struct gcov_iterator with __counted_by 2023-10-18 14:43:22 -07:00
irq genirq: Introduce IRQF_COND_ONESHOT and use it in pinctrl-amd 2024-03-25 23:45:21 +01:00
kcsan mm: delete checks for xor_unlock_is_negative_byte() 2023-10-18 14:34:17 -07:00
livepatch
locking locking/rtmutex: Use try_cmpxchg_relaxed() in mark_rt_mutex_waiters() 2024-03-01 13:02:05 +01:00
module This push fixes a regression that broke iwd as well as a divide by 2024-03-25 10:48:23 -07:00
power PM: s2idle: Make sure CPUs will wakeup directly on resume 2024-04-08 15:36:54 +02:00
printk printk changes for 6.9-rc2 2024-03-26 09:25:57 -07:00
rcu Merge branches 'rcu-doc.2024.02.14a', 'rcu-nocb.2024.02.14a', 'rcu-exp.2024.02.14a', 'rcu-tasks.2024.02.26a' and 'rcu-misc.2024.02.14a' into rcu.2024.02.26a 2024-02-26 17:37:25 -08:00
sched RISC-V Patches for the 6.9 Merge Window 2024-03-22 10:41:13 -07:00
time timers/migration: Return early on deactivation 2024-04-05 11:05:16 +02:00
trace bpf: support deferring bpf_link dealloc to after RCU grace period 2024-03-28 18:47:45 -07:00
.gitignore
Kconfig.freezer
Kconfig.hz
Kconfig.kexec crash: clean up kdump related config items 2024-02-23 17:48:22 -08:00
Kconfig.locks
Kconfig.preempt
Makefile crash: split crash dumping code out from kexec_core.c 2024-02-23 17:48:22 -08:00
acct.c
async.c async: Use a dedicated unbound workqueue with raised min_active 2024-02-09 11:13:59 -10:00
audit.c audit: use KMEM_CACHE() instead of kmem_cache_create() 2024-01-25 10:12:22 -05:00
audit.h
audit_fsnotify.c
audit_tree.c As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
audit_watch.c audit: don't WARN_ON_ONCE(!current->mm) in audit_exe_compare() 2023-11-14 17:34:27 -05:00
auditfilter.c audit: remove unnecessary assignment in audit_dupe_lsm_field() 2024-01-25 09:59:27 -05:00
auditsc.c audit,io_uring: io_uring openat triggers audit reference count underflow 2023-10-13 18:34:46 +02:00
backtracetest.c backtracetest: Convert from tasklet to BH workqueue 2024-02-05 13:22:34 -10:00
bounds.c bounds: support non-power-of-two CONFIG_NR_CPUS 2024-02-22 15:38:51 -08:00
capability.c
cfi.c
compat.c
configs.c
context_tracking.c context_tracking: Fix kerneldoc headers for __ct_user_{enter,exit}() 2024-02-14 07:53:50 -08:00
cpu.c Rework of APIC enumeration and topology evaluation: 2024-03-11 15:45:55 -07:00
cpu_pm.c
crash_core.c crash: split crash dumping code out from kexec_core.c 2024-02-23 17:48:22 -08:00
crash_reserve.c crash: use macro to add crashk_res into iomem early for specific arch 2024-03-26 11:14:12 -07:00
cred.c cred: Use KMEM_CACHE() instead of kmem_cache_create() 2024-02-23 17:33:31 -05:00
delayacct.c
dma.c
elfcorehdr.c crash: remove dependency of FA_DUMP on CRASH_DUMP 2024-02-23 17:48:22 -08:00
exec_domain.c
exit.c vfs-6.9.pidfd 2024-03-11 10:21:06 -07:00
exit.h
extable.c
fail_function.c
fork.c RCU pull request for v6.9 2024-03-11 12:02:50 -07:00
freezer.c Linux 6.7-rc6 2023-12-23 15:52:13 +01:00
gen_kheaders.sh
groups.c
hung_task.c kernel/hung_task.c: export sysctl_hung_task_timeout_secs 2024-03-13 21:22:04 -04:00
iomem.c
irq_work.c
jump_label.c
kallsyms.c
kallsyms_internal.h
kallsyms_selftest.c mm/vmalloc: remove vmap_area_list 2024-02-23 17:48:19 -08:00
kallsyms_selftest.h
kcmp.c file: convert to SLAB_TYPESAFE_BY_RCU 2023-10-19 11:02:48 +02:00
kcov.c
kexec.c crash: split crash dumping code out from kexec_core.c 2024-02-23 17:48:22 -08:00
kexec_core.c - Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min 2024-03-14 18:03:09 -07:00
kexec_elf.c
kexec_file.c arm64, crash: wrap crash dumping code into crash related ifdefs 2024-02-23 17:48:23 -08:00
kexec_internal.h crash: remove dependency of FA_DUMP on CRASH_DUMP 2024-02-23 17:48:22 -08:00
kheaders.c
kprobes.c kprobes: Remove unnecessary initial values of variables 2024-02-08 23:29:29 +09:00
ksyms_common.c
ksysfs.c Driver core changes for 6.9-rc1 2024-03-21 13:34:15 -07:00
kthread.c As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
latencytop.c
module_signature.c
notifier.c
nsproxy.c pidfd: add pidfs 2024-03-01 12:23:37 +01:00
numa.c kernel/numa.c: Move logging out of numa.h 2023-12-20 19:26:30 -05:00
padata.c Author: Gang Li padata: dispatch works on 2024-03-06 13:04:17 -08:00
panic.c - Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min 2024-03-14 18:03:09 -07:00
params.c params: Fix multi-line comment style 2023-12-01 09:51:44 -08:00
pid.c pidfs: remove config option 2024-03-13 12:53:53 -07:00
pid_namespace.c wait: Remove uapi header file from main header file 2023-12-20 19:26:31 -05:00
pid_sysctl.h
profile.c
ptrace.c ptrace_attach: shift send(SIGSTOP) into ptrace_set_stopped() 2024-02-22 15:38:52 -08:00
range.c
reboot.c Thermal control updates for 6.8-rc1 2024-01-09 16:20:17 -08:00
regset.c
relay.c kernel: relay: remove relay_file_splice_read dead code, doesn't work 2023-12-29 12:22:27 -08:00
resource.c Quite a lot of kexec work this time around. Many singleton patches in 2024-01-09 11:46:20 -08:00
resource_kunit.c
rseq.c
scftorture.c
scs.c
seccomp.c file: remove __receive_fd() 2023-12-12 14:24:14 +01:00
signal.c - Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min 2024-03-14 18:03:09 -07:00
smp.c CSD lock commits for v6.7 2023-10-30 17:56:53 -10:00
smpboot.c
smpboot.h
softirq.c workqueue: Drain BH work items on hot-unplugged CPUs 2024-02-29 11:51:24 -10:00
stackleak.c
stacktrace.c stacktrace: fix kernel-doc typo 2023-12-29 12:22:29 -08:00
static_call.c
static_call_inline.c
stop_machine.c
sys.c prctl: generalize PR_SET_MDWE support check to be per-arch 2024-03-26 11:07:22 -07:00
sys_ni.c lsm/stable-6.8 PR 20240105 2024-01-09 12:57:46 -08:00
sysctl-test.c
sysctl.c tracing: Support to dump instance traces by ftrace_dump_on_oops 2024-03-18 10:33:06 -04:00
task_work.c
taskstats.c
torture.c
tracepoint.c
tsacct.c
ucount.c
uid16.c
uid16.h
umh.c
up.c
user-return-notifier.c
user.c binfmt_misc: enable sandboxed mounts 2023-10-11 08:46:01 -07:00
user_namespace.c user_namespace: remove unnecessary NULL values from kbuf 2024-02-22 15:38:52 -08:00
usermode_driver.c
utsname.c
utsname_sysctl.c
vhost_task.c
vmcore_info.c crash_core: export vmemmap when CONFIG_SPARSEMEM_VMEMMAP is enabled 2024-03-04 17:01:27 -08:00
watch_queue.c watch_queue: fix kcalloc() arguments order 2023-12-21 13:17:54 +01:00
watchdog.c watchdog/core: remove sysctl handlers from public header 2024-03-12 13:09:23 -07:00
watchdog_buddy.c
watchdog_perf.c
workqueue.c Driver core changes for 6.9-rc1 2024-03-21 13:34:15 -07:00
workqueue_internal.h