linux-stable/kernel/sched
Eric W. Biederman 0ff7b2cfba tasks, sched/core: Ensure tasks are available for a grace period after leaving the runqueue
In the ordinary case today the RCU grace period for a task_struct is
triggered when another process wait's for it's zombine and causes the
kernel to call release_task().  As the waiting task has to receive a
signal and then act upon it before this happens, typically this will
occur after the original task as been removed from the runqueue.

Unfortunaty in some cases such as self reaping tasks it can be shown
that release_task() will be called starting the grace period for
task_struct long before the task leaves the runqueue.

Therefore use put_task_struct_rcu_user() in finish_task_switch() to
guarantee that the there is a RCU lifetime after the task
leaves the runqueue.

Besides the change in the start of the RCU grace period for the
task_struct this change may cause perf_event_delayed_put and
trace_sched_process_free.  The function perf_event_delayed_put boils
down to just a WARN_ON for cases that I assume never show happen.  So
I don't see any problem with delaying it.

The function trace_sched_process_free is a trace point and thus
visible to user space.  Occassionally userspace has the strangest
dependencies so this has a miniscule chance of causing a regression.
This change only changes the timing of when the tracepoint is called.
The change in timing arguably gives userspace a more accurate picture
of what is going on.  So I don't expect there to be a regression.

In the case where a task self reaps we are pretty much guaranteed that
the RCU grace period is delayed.  So we should get quite a bit of
coverage in of this worst case for the change in a normal threaded
workload.  So I expect any issues to turn up quickly or not at all.

I have lightly tested this change and everything appears to work
fine.

Inspired-by: Linus Torvalds <torvalds@linux-foundation.org>
Inspired-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Kirill Tkhai <tkhai@yandex.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/87r24jdpl5.fsf_-_@x220.int.ebiederm.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-25 17:42:29 +02:00
..
autogroup.c sched/autogroup: Make autogroup_path() always available 2019-06-24 19:23:40 +02:00
autogroup.h
clock.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
completion.c
core.c tasks, sched/core: Ensure tasks are available for a grace period after leaving the runqueue 2019-09-25 17:42:29 +02:00
cpuacct.c
cpudeadline.c Linux 5.2-rc5 2019-06-17 12:12:27 +02:00
cpudeadline.h
cpufreq.c sched/cpufreq: Annotate cpufreq_update_util_data pointer with __rcu 2019-04-03 12:34:31 +02:00
cpufreq_schedutil.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-09-16 17:25:49 -07:00
cpupri.c Linux 5.2-rc5 2019-06-17 12:12:27 +02:00
cpupri.h
cputime.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
deadline.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-09-16 17:25:49 -07:00
debug.c Linux 5.2-rc6 2019-06-24 19:19:53 +02:00
fair.c sched/fair: Remove unused cfs_rq_clock_task() function 2019-09-17 09:55:02 +02:00
features.h sched/fair: Replace source_load() & target_load() with weighted_cpuload() 2019-06-03 11:49:39 +02:00
idle.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-09-16 17:25:49 -07:00
isolation.c sched/isolation: Prefer housekeeping CPU in local node 2019-07-25 15:51:55 +02:00
loadavg.c sched: loadavg: make calc_load_n() public 2018-10-26 16:26:32 -07:00
Makefile psi: pressure stall information for CPU, memory, and IO 2018-10-26 16:26:32 -07:00
membarrier.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157 2019-05-30 11:26:37 -07:00
pelt.c sched/debug: Add new tracepoint to track PELT at se level 2019-06-24 19:23:42 +02:00
pelt.h sched/topology: Remove unused 'sd' parameter from arch_scale_cpu_capacity() 2019-06-24 19:23:39 +02:00
psi.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-09-16 17:25:49 -07:00
rt.c sched: Rework pick_next_task() slow-path 2019-08-08 09:09:31 +02:00
sched-pelt.h sched/fair: Fix "runnable_avg_yN_inv" not used warnings 2019-06-17 12:15:58 +02:00
sched.h Merge branch 'sched/rt' into sched/core, to pick up -rt changes 2019-09-16 14:05:04 +02:00
stats.c
stats.h sched/stats: Fix unlikely() use of sched_info_on() 2019-07-25 15:51:55 +02:00
stop_task.c sched: Rework pick_next_task() slow-path 2019-08-08 09:09:31 +02:00
swait.c kernel/sched/: remove caller signal_pending branch predictions 2019-01-04 13:13:48 -08:00
topology.c sched/topology: Improve load balancing on AMD EPYC systems 2019-09-03 09:17:37 +02:00
wait.c sched/wait: Deduplicate code with do-while 2019-06-24 19:23:40 +02:00
wait_bit.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00