linux-stable/kernel/sched
Jann Horn 837ffc9723 sched/fair: Don't free p->numa_faults with concurrent readers
commit 16d51a590a upstream.

When going through execve(), zero out the NUMA fault statistics instead of
freeing them.
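
A condensed sketch of the shape of the fix, based on the upstream commit
(the numa_group bookkeeping is elided, and the exact file and header
locations vary across stable branches):

    /* kernel/sched/fair.c, condensed */
    void task_numa_free(struct task_struct *p, bool final)
    {
        unsigned long *numa_faults = p->numa_faults;
        int i;

        if (!numa_faults)
            return;

        if (final) {
            /* from free_task(): nobody can observe the task anymore,
             * so actually free the allocation */
            p->numa_faults = NULL;
            kfree(numa_faults);
        } else {
            /* from the execve() path: concurrent readers may still
             * dereference ->numa_faults, so keep the allocation and
             * zero the counters in place */
            p->total_numa_faults = 0;
            for (i = 0; i < NR_NUMA_HINT_FAULT_STATS * nr_node_ids; i++)
                numa_faults[i] = 0;
        }
    }

free_task() passes final=true; the call on the execve() path becomes
task_numa_free(current, false).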

During execve(), the task remains reachable through procfs and the scheduler, so a
concurrent /proc/*/sched reader can read data from a freed ->numa_faults
allocation (confirmed by KASAN) and write it back to userspace.
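
The reader side, condensed from show_numa_stats() in kernel/sched/debug.c
(group-fault columns elided; exact code differs between kernel versions):

    static void show_numa_stats(struct task_struct *p, struct seq_file *m)
    {
        unsigned long tsf = 0, tpf = 0;
        int node;

        for_each_online_node(node) {
            if (p->numa_faults) {
                /* Nothing stops another CPU from kfree()ing
                 * ->numa_faults between this check and the loads
                 * below; the stale values end up in the seq_file
                 * and are copied back to userspace. */
                tsf = p->numa_faults[task_faults_idx(NUMA_MEM, node, 0)];
                tpf = p->numa_faults[task_faults_idx(NUMA_MEM, node, 1)];
            }
            print_numa_stats(m, node, tsf, tpf, 0, 0);
        }
    }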

I believe that it would also be possible for a use-after-free read to occur
through a race between a NUMA fault and execve(): task_numa_fault() can
lead to task_numa_compare(), which invokes task_weight() on the currently
running task of a different CPU.
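
The call chain for that second race, condensed from kernel/sched/fair.c
(details vary between versions):

    /*
     *   task_numa_fault()                 NUMA hinting fault on CPU A
     *     task_numa_migrate()
     *       task_numa_find_cpu()
     *         task_numa_compare()
     *           cur = rq->curr of CPU B   (RCU-protected task pointer)
     *           task_weight(cur, nid, dist)
     *             reads cur->total_numa_faults and, via task_faults(),
     *             cur->numa_faults[...]
     *
     * If "cur" is concurrently in execve() and has just freed its
     * ->numa_faults array, these loads hit freed memory.
     */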

Another way to fix this would be to make ->numa_faults RCU-managed or add
extra locking, but it seems easier to wipe the NUMA fault statistics on
execve().
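
For contrast, a hypothetical sketch of the RCU variant (NOT what was
merged): every existing ->numa_faults reader would need converting, and
the field would need an __rcu annotation, which is why wiping in place
is the simpler fix:

    /* reader side, e.g. in show_numa_stats() */
    unsigned long *faults;

    rcu_read_lock();
    faults = rcu_dereference(p->numa_faults);
    if (faults)
        tsf = faults[task_faults_idx(NUMA_MEM, node, 0)];
    rcu_read_unlock();

    /* execve() side: detach, then defer the free past a grace period */
    faults = p->numa_faults;
    rcu_assign_pointer(p->numa_faults, NULL);
    synchronize_rcu();  /* the flat array has no rcu_head for kfree_rcu() */
    kfree(faults);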

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Fixes: 82727018b0 ("sched/numa: Call task_numa_free() from do_execve()")
Link: https://lkml.kernel.org/r/20190716152047.14424-1-jannh@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-04 09:33:45 +02:00
auto_group.c sched/autogroup: Fix 64-bit kernel nice level adjustment 2016-11-24 05:45:02 +01:00
auto_group.h sched, timer: Convert usages of ACCESS_ONCE() in the scheduler to READ_ONCE()/WRITE_ONCE() 2015-05-08 12:11:32 +02:00
clock.c sched/clock: Make local_clock()/cpu_clock() inline 2016-04-13 12:25:22 +02:00
completion.c sched/completion: Serialize completion_done() with complete() 2015-02-18 14:27:40 +01:00
core.c sched/core: Handle overflow in cpu_shares_write_u64 2019-05-31 06:48:22 -07:00
cpuacct.c sched/cputime: Convert kcpustat to nsecs 2018-10-20 09:51:32 +02:00
cpuacct.h sched/cpuacct: Simplify the cpuacct code 2016-03-21 11:00:28 +01:00
cpudeadline.c sched/deadline: Split cpudl_set() into cpudl_set() and cpudl_clear() 2016-09-05 13:29:43 +02:00
cpudeadline.h sched/deadline: Split cpudl_set() into cpudl_set() and cpudl_clear() 2016-09-05 13:29:43 +02:00
cpufreq.c cpufreq / sched: Pass flags to cpufreq_update_util() 2016-08-16 22:14:55 +02:00
cpufreq_schedutil.c cpufreq: schedutil: Fix per-CPU structure initialization in sugov_start() 2017-06-14 15:06:05 +02:00
cpupri.c sched/core: Use tsk_cpus_allowed() instead of accessing ->cpus_allowed 2016-05-12 09:55:35 +02:00
cpupri.h sched/cpupri: Remove unnecessary definitions in cpupri.h 2014-11-16 10:58:59 +01:00
cputime.c sched/cputime: Fix ksoftirqd cputime accounting regression 2018-10-20 09:51:32 +02:00
deadline.c sched/deadline: Use the revised wakeup rule for suspending constrained dl tasks 2018-04-13 19:48:23 +02:00
debug.c Merge branch 'for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2016-10-14 12:18:50 -07:00
fair.c sched/fair: Don't free p->numa_faults with concurrent readers 2019-08-04 09:33:45 +02:00
features.h sched/fair: Make select_idle_cpu() more aggressive 2017-12-14 09:28:17 +01:00
idle.c nmi_backtrace: generate one-line reports for idle cpus 2016-10-07 18:46:30 -07:00
idle_task.c sched/core: Rewrite and improve select_idle_siblings() 2016-09-30 11:03:09 +02:00
loadavg.c sched/loadavg: Avoid loadavg spikes caused by delayed NO_HZ accounting 2017-07-05 14:40:28 +02:00
Makefile cpufreq: schedutil: New governor based on scheduler utilization data 2016-04-02 01:09:12 +02:00
rt.c sched/rt: Fix rq->clock_update_flags < RQCF_ACT_SKIP warning 2018-05-30 07:50:41 +02:00
sched.h sched: Add sched_smt_active() 2019-05-14 19:19:36 +02:00
stats.c sched: use %*pb[l] to print bitmaps including cpumasks and nodemasks 2015-02-13 21:21:37 -08:00
stats.h sched/debug: Rename 'schedstat_val()' -> 'schedstat_val_or_zero()' 2016-09-05 13:29:46 +02:00
stop_task.c locking/lockdep, sched/core: Implement a better lock pinning scheme 2016-05-05 09:23:59 +02:00
swait.c sched/wait: Remove the lockless swait_active() check in swake_up*() 2018-08-06 16:23:02 +02:00
wait.c mm: remove per-zone hashtable of bitlock waitqueues 2016-10-27 09:27:57 -07:00