linux-stable/kernel/sched
Mel Gorman 748d2e9585 sched/core: Do not requeue task on CPU excluded from cpus_mask
[ Upstream commit 751d4cbc43 ]

The following warning was triggered on a large machine early in boot on
a distribution kernel but the same problem should also affect mainline.

   WARNING: CPU: 439 PID: 10 at ../kernel/workqueue.c:2231 process_one_work+0x4d/0x440
   Call Trace:
    <TASK>
    rescuer_thread+0x1f6/0x360
    kthread+0x156/0x180
    ret_from_fork+0x22/0x30
    </TASK>

Commit c6e7bd7afa ("sched/core: Optimize ttwu() spinning on p->on_cpu")
optimises ttwu by queueing a task that is descheduling on the wakelist,
but does not check if the task descheduling is still allowed to run on that CPU.

In this warning, the problematic task is a workqueue rescue thread which
checks if the rescue is for a per-cpu workqueue and running on the wrong CPU.
While this is early in boot and it should be possible to create workers,
the rescue thread may still be used if MAYDAY_INITIAL_TIMEOUT or
MAYDAY_INTERVAL is reached, and on a sufficiently large machine the rescue
thread is used frequently.
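
The condition behind the warning can be sketched in userspace C. This is a
minimal model under stated assumptions, not the workqueue code itself: the
struct, its fields, and the function name are hypothetical and only mirror
the description above (a per-cpu workqueue being serviced on the wrong CPU).

```c
#include <stdbool.h>

/* Hypothetical model of a worker pool; not the kernel's worker_pool. */
struct pool_model {
    bool per_cpu; /* true if the pool serves a per-cpu workqueue */
    int cpu;      /* CPU the pool is bound to when per_cpu is true */
};

/*
 * Returns true when the situation described in the commit message holds:
 * the work belongs to a per-cpu workqueue, yet the thread processing it
 * is running on a different CPU than the one the pool is bound to.
 */
static bool rescuer_on_wrong_cpu(const struct pool_model *pool, int current_cpu)
{
    return pool->per_cpu && current_cpu != pool->cpu;
}
```

In this model a rescuer that was requeued on its old CPU after its affinity
changed would make rescuer_on_wrong_cpu() return true, which corresponds to
the warning shown in the trace.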

Tracing confirmed that the task should have migrated properly using the
stopper thread to handle the migration. However, a parallel wakeup from udev
running on another CPU that does not share CPU cache observes p->on_cpu,
uses task_cpu(p) to queue the task on the old CPU, and triggers the warning.

Check that the wakee task that is descheduling is still allowed to run
on its current CPU and if not, wait for the descheduling to complete
and select an allowed CPU.
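
The added check can be illustrated with a small userspace model. This is a
sketch under stated assumptions rather than the patch itself: task_model,
cpu_allowed() and the two may_queue_* helpers are hypothetical names, and a
64-bit integer stands in for the task's cpus_mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_CPUS 64

/* Hypothetical stand-in for the per-task scheduler state. */
struct task_model {
    uint64_t cpus_mask; /* bit n set: the task may run on CPU n */
    int cpu;            /* CPU the task last ran on, like task_cpu(p) */
    bool on_cpu;        /* the task is still descheduling on that CPU */
};

/* True if the model's affinity mask permits the given CPU. */
static bool cpu_allowed(const struct task_model *p, int cpu)
{
    assert(cpu >= 0 && cpu < MAX_CPUS);
    return ((p->cpus_mask >> cpu) & UINT64_C(1)) != 0;
}

/* Behaviour described as the bug: on_cpu alone permits queueing on the old CPU. */
static bool may_queue_on_wakelist_old(const struct task_model *p)
{
    return p->on_cpu;
}

/* Behaviour described as the fix: the old CPU must also still be allowed. */
static bool may_queue_on_wakelist_fixed(const struct task_model *p)
{
    return p->on_cpu && cpu_allowed(p, p->cpu);
}
```

When may_queue_on_wakelist_fixed() returns false in this model, the waker
would fall back to the slower path the commit message describes: wait for
the descheduling to complete, then select an allowed CPU.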

Fixes: c6e7bd7afa ("sched/core: Optimize ttwu() spinning on p->on_cpu")
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20220804092119.20137-1-mgorman@techsingularity.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-17 14:24:15 +02:00
autogroup.c sched/fair: Prevent dead task groups from regaining cfs_rq's 2021-11-25 09:48:32 +01:00
autogroup.h
clock.c
completion.c
core.c sched/core: Do not requeue task on CPU excluded from cpus_mask 2022-08-17 14:24:15 +02:00
core_sched.c
cpuacct.c sched/cpuacct: Fix charge percpu cpuusage 2022-04-08 14:23:11 +02:00
cpudeadline.c
cpudeadline.h
cpufreq.c
cpufreq_schedutil.c sched/uclamp: Fix iowait boost escaping uclamp restriction 2022-04-08 14:23:10 +02:00
cpupri.c
cpupri.h
cputime.c cputime, cpuacct: Include guest time in user time in cpuacct.stat 2022-01-27 11:05:09 +01:00
deadline.c sched/deadline: Merge dl_task_can_attach() and dl_cpu_busy() 2022-08-17 14:24:14 +02:00
debug.c sched/debug: Remove mpol_get/put and task_lock/unlock from sched_show_numa 2022-04-08 14:23:10 +02:00
fair.c sched/fair: Introduce SIS_UTIL to search idle CPU based on sum of util_avg 2022-08-17 14:23:00 +02:00
features.h sched/fair: Introduce SIS_UTIL to search idle CPU based on sum of util_avg 2022-08-17 14:23:00 +02:00
idle.c sched/idle: Make the idle timer expire in hard interrupt context 2021-09-09 10:36:16 +02:00
isolation.c
loadavg.c
Makefile
membarrier.c sched/membarrier: Fix membarrier-rseq fence command missing from query bitmask 2022-02-01 17:27:05 +01:00
pelt.c
pelt.h sched/fair: Fix cfs_rq_clock_pelt() for throttled cfs_rq 2022-06-09 10:22:48 +02:00
psi.c sched/psi: report zeroes for CPU full at the system level 2022-06-09 10:22:48 +02:00
rt.c nohz/full, sched/rt: Fix missed tick-reenabling bug in dequeue_task_rt() 2022-08-17 14:23:14 +02:00
sched-pelt.h
sched.h sched: Remove the limitation of WF_ON_CPU on wakelist if wakee cpu is idle 2022-08-17 14:24:15 +02:00
smp.h
stats.c
stats.h psi: Fix PSI_MEM_FULL state when tasks are in memstall and doing reclaim 2022-01-27 11:04:27 +01:00
stop_task.c
swait.c
topology.c
wait.c wait: add wake_up_pollfree() 2021-12-14 10:57:15 +01:00
wait_bit.c