linux-stable/kernel
Paul E. McKenney 26cdfedf6a rcu: Reject memory-order-induced stall-warning false positives
If a system is idle from an RCU perspective for longer than specified
by CONFIG_RCU_CPU_STALL_TIMEOUT, and if one CPU starts a grace period
just as a second checks for CPU stalls, and if this second CPU happens
to see the old value of rsp->jiffies_stall, it will incorrectly report a
CPU stall.  This is quite rare, but apparently occurs deterministically
on systems with about 6TB of memory.

This commit therefore orders accesses to the data used to determine
whether or not a CPU stall is in progress.  Grace-period initialization
and cleanup first increments rsp->completed to mark the end of the
previous grace period, then records the current jiffies in rsp->gp_start,
then records the jiffies at which a stall can be expected to occur in
rsp->jiffies_stall, and finally increments rsp->gpnum to mark the start
of the new grace period.  Now, this ordering by itself does not prevent
false positives.  For example, if grace-period initialization was delayed
between recording rsp->gp_start and rsp->jiffies_stall, the CPU stall
warning code might still see an old value of rsp->jiffies_stall.

Therefore, this commit also orders the CPU stall warning accesses as
well, loading rsp->gpnum and jiffies, then rsp->jiffies_stall, then
rsp->gp_start, and finally rsp->completed.  This ordering means that
the false-positive scenario in the previous paragraph would result
in rsp->completed being greater than or equal to rsp->gpnum, which is
never valid for a CPU stall, allowing the false positive to be rejected.
Furthermore, any fetch that gets an old value of rsp->jiffies_stall
must also get an old value of rsp->gpnum, which will again be rejected
by the comparison of rsp->gpnum and rsp->completed.  Situations where
rsp->gp_start is later than rsp->jiffies_stall are also rejected, as
are situations where jiffies is less than rsp->jiffies_stall.

Although use of unsynchronized accesses means that there are likely
still some false-positive scenarios (synchronization has proven to be
a very bad idea on large systems), this should get rid of a large class
of these scenarios.

Reported-by: Fabian Herschel <fabian.herschel@suse.com>
Reported-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Tested-by: Jochen Striepe <jochen@tolot.escape.de>
2013-09-23 09:15:30 -07:00
..
cpu idle: Enable interrupts in the weak arch_cpu_idle() implementation 2013-06-14 23:01:05 +02:00
debug kgdb/sysrq: fix inconstistent help message of sysrq key 2013-04-30 17:04:10 -07:00
events uprobes: Fix utask->depth accounting in handle_trampoline() 2013-09-12 08:00:55 +02:00
gcov kernel: replace strict_strto*() with kstrto*() 2013-09-12 15:38:03 -07:00
irq Remove GENERIC_HARDIRQ config option 2013-09-13 15:09:52 +02:00
power ACPI and power management fixes for 3.12-rc1 2013-09-12 11:22:45 -07:00
printk TTY/Serial driver patches for 3.12-rc1 2013-09-03 11:38:36 -07:00
sched Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-09-12 10:44:13 -07:00
time Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-09-04 09:36:54 -07:00
trace Not much changes for the 3.12 merge window. The major tracing changes 2013-09-09 14:42:15 -07:00
.gitignore kernel/hz.bc: ignore. 2013-04-22 07:09:06 -07:00
acct.c fs: Fix hang with BSD accounting on frozen filesystem 2013-05-04 14:57:58 -04:00
async.c async: rename and redefine async_func_ptr 2013-03-12 13:59:14 -07:00
audit.c audit: wait_for_auditd() should use TASK_UNINTERRUPTIBLE 2013-06-12 16:29:45 -07:00
audit.h audit: fix mq_open and mq_unlink to add the MQ root as a hidden parent audit_names record 2013-07-09 10:33:19 -07:00
audit_tree.c kernel/audit_tree.c:audit_add_tree_rule(): protect `rule' from kill_rules() 2013-06-12 16:29:46 -07:00
audit_watch.c
auditfilter.c audit: Fix decimal constant description 2013-07-09 10:33:19 -07:00
auditsc.c audit: fix mq_open and mq_unlink to add the MQ root as a hidden parent audit_names record 2013-07-09 10:33:19 -07:00
backtracetest.c
bounds.c
capability.c xfs: update for v3.12-rc1 2013-09-09 11:19:09 -07:00
cgroup.c Kill indirect include of file.h from eventfd.h, use fdget() in cgroup.c 2013-09-07 19:54:57 -04:00
cgroup_freezer.c cgroup: make css_for_each_descendant() and friends include the origin css in the iteration 2013-08-08 20:11:27 -04:00
compat.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal 2013-05-01 07:21:43 -07:00
configs.c proc: Supply PDE attribute setting accessor functions 2013-05-01 17:29:18 -04:00
context_tracking.c context_tracking: User/kernel broundary cross trace events 2013-08-14 17:14:48 +02:00
cpu.c ACPI / processor: Acquire writer lock to update CPU maps 2013-08-13 12:20:16 +02:00
cpu_pm.c
cpuset.c Merge branch 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2013-09-03 18:25:03 -07:00
crash_dump.c
cred.c
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c ptrace: revert "Prepare to fix racy accesses on task breakpoints" 2013-07-09 10:33:26 -07:00
extable.c extable: skip sorting if the table is empty 2013-09-11 15:58:25 -07:00
fork.c Merge git://git.kvack.org/~bcrl/aio-next 2013-09-13 10:55:58 -07:00
freezer.c freezer: set PF_SUSPEND_TASK flag on tasks that call freeze_processes 2013-07-30 14:05:06 +02:00
futex.c futex: Use freezable blocking call 2013-06-25 23:11:19 +02:00
futex_compat.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal 2013-02-23 18:50:11 -08:00
groups.c userns: Kill nsown_capable it makes the wrong thing easy 2013-08-30 23:44:11 -07:00
hrtimer.c kernel: delete __cpuinit usage from all core kernel files 2013-07-14 19:36:59 -04:00
hung_task.c hung_task debugging: Print more info when reporting the problem 2013-08-02 11:02:42 +02:00
irq_work.c
itimer.c
jump_label.c jump_label: Split jumplabel ratelimit 2013-08-09 07:53:54 -07:00
kallsyms.c kernel: kallsyms: memory override issue, need check destination buffer length 2013-04-15 15:17:26 +09:30
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking: Fix copy/paste errors of "ARCH_INLINE_*_UNLOCK_BH" 2013-05-28 08:50:00 +02:00
Kconfig.preempt
kexec.c kexec: remove unnecessary return 2013-09-11 15:59:10 -07:00
kmod.c usermodehelper: kill the sub_info->path[0] check 2013-07-03 16:08:02 -07:00
kprobes.c kprobes: allow to specify custom allocator for insn caches 2013-09-11 15:58:52 -07:00
ksysfs.c kernel: replace strict_strto*() with kstrto*() 2013-09-12 15:38:03 -07:00
kthread.c kthread: implement probe_kthread_data() 2013-04-30 17:04:02 -07:00
latencytop.c
lglock.c lglock: Update lockdep annotations to report recursive local locks 2013-07-12 13:51:19 +02:00
lockdep.c lockdep: remove task argument from debug_check_no_locks_held 2013-05-12 14:16:21 +02:00
lockdep_internals.h
lockdep_proc.c
lockdep_states.h
Makefile Remove GENERIC_HARDIRQ config option 2013-09-13 15:09:52 +02:00
modsign_certificate.S CONFIG_SYMBOL_PREFIX: cleanup. 2013-03-15 15:09:43 +10:30
modsign_pubkey.c kernel/modsign_pubkey.c: fix init const for module signing code 2013-09-11 15:58:21 -07:00
module-internal.h
module.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-09-05 08:50:26 -07:00
module_signing.c
mutex-debug.c
mutex-debug.h
mutex.c Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-09-04 08:18:19 -07:00
mutex.h
notifier.c
nsproxy.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2013-09-07 14:35:32 -07:00
padata.c padata - Register hotcpu notifier after initialization 2013-08-29 14:37:59 +10:00
panic.c panic: call panic handlers before kmsg_dump 2013-09-11 15:59:30 -07:00
params.c kernel: replace strict_strto*() with kstrto*() 2013-09-12 15:38:03 -07:00
pid.c pidns: Fix hang in zap_pid_ns_processes by sending a potentially extra wakeup 2013-08-30 17:30:37 -07:00
pid_namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2013-09-07 14:35:32 -07:00
posix-cpu-timers.c posix_timers: fix racy timer delta caching on task exit 2013-07-03 16:54:42 +02:00
posix-timers.c posix-timers: Remove unused variable 2013-04-18 12:51:19 +02:00
profile.c kernel: delete __cpuinit usage from all core kernel files 2013-07-14 19:36:59 -04:00
ptrace.c __ptrace_may_access() should not deny sub-threads 2013-09-11 15:59:01 -07:00
range.c range: Do not add new blank slot with add_range_with_merge 2013-06-18 11:32:10 -05:00
rcu.h rcu: Make call_rcu() leak callbacks for debug-object errors 2013-08-18 17:40:03 -07:00
rcupdate.c rcu: Convert local functions to static 2013-09-23 09:12:31 -07:00
rcutiny.c rcu: Add const annotation to char * for RCU tracepoints and functions 2013-07-29 17:07:49 -04:00
rcutiny_plugin.h rcu: Add const annotation to char * for RCU tracepoints and functions 2013-07-29 17:07:49 -04:00
rcutorture.c rcu: Make rcutorture emit online failures if verbose 2013-08-20 11:38:45 -07:00
rcutree.c rcu: Reject memory-order-induced stall-warning false positives 2013-09-23 09:15:30 -07:00
rcutree.h nohz_full: Force RCU's grace-period kthreads onto timekeeping CPU 2013-08-31 14:44:02 -07:00
rcutree_plugin.h rcu: Replace __get_cpu_var() uses 2013-09-23 09:15:22 -07:00
rcutree_trace.c rcutrace: single_open() leaks 2013-05-05 00:16:35 -04:00
reboot.c reboot: move arch/x86 reboot= handling to generic kernel 2013-07-09 10:33:29 -07:00
relay.c kernel: delete __cpuinit usage from all core kernel files 2013-07-14 19:36:59 -04:00
res_counter.c memcg: reduce function dereference 2013-09-12 15:38:02 -07:00
resource.c kernel/resource.c: remove the unneeded assignment in function __find_resource 2013-07-03 16:08:06 -07:00
rtmutex-debug.c sched/rt: Move rt specific bits into new header file 2013-02-07 20:51:08 +01:00
rtmutex-debug.h
rtmutex-tester.c locking/rtmutex/tester: Set correct permissions on sysfs files 2013-04-10 14:48:37 +02:00
rtmutex.c rtmutex: Document rt_mutex_adjust_prio_chain() 2013-05-28 09:23:52 +02:00
rtmutex.h
rtmutex_common.h
rwsem.c Revert "rw_semaphore: remove up/down_read_non_owner" 2013-03-23 15:53:52 -07:00
seccomp.c seccomp: allow BPF_XOR based ALU instructions. 2013-03-26 11:07:19 +11:00
semaphore.c semaphore: use `bool' type for semaphore_waiter's up 2013-04-30 17:04:08 -07:00
signal.c kernel-wide: fix missing validations on __get/__put/__copy_to/__copy_from_user() 2013-09-11 15:58:18 -07:00
smp.c kernel/smp.c: quit unconditionally enabling irqs in on_each_cpu_mask(). 2013-09-11 15:58:25 -07:00
smpboot.c kernel: delete __cpuinit usage from all core kernel files 2013-07-14 19:36:59 -04:00
smpboot.h
softirq.c Remove GENERIC_HARDIRQ config option 2013-09-13 15:09:52 +02:00
spinlock.c kernel/spinlock.c: add default arch_*_relax definitions for GENERIC_LOCKBREAK 2013-09-11 15:58:21 -07:00
srcu.c srcu: use ACCESS_ONCE() to access sp->completed in srcu_read_lock() 2013-02-07 15:19:36 -08:00
stacktrace.c
stop_machine.c stop_machine: Mark per cpu stopper enabled early 2013-02-26 22:25:17 +01:00
sys.c userns: Kill nsown_capable it makes the wrong thing easy 2013-08-30 23:44:11 -07:00
sys_ni.c unify compat fanotify_mark(2), switch to COMPAT_SYSCALL_DEFINE 2013-05-09 13:46:38 -04:00
sysctl.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-09-12 15:01:38 -07:00
sysctl_binary.c kernel: remove unnecessary head file 2013-06-26 18:01:46 +09:00
task_work.c task_work: documentation 2013-09-11 15:58:27 -07:00
taskstats.c
test_kprobes.c kernel/: rename random32() to prandom_u32() 2013-04-29 18:28:42 -07:00
time.c sched: Rename sched.c as sched/core.c in comments and Documentation 2013-06-19 12:58:42 +02:00
timeconst.bc kernel: Replace timeconst.pl with a bc script 2013-02-16 23:17:25 +01:00
timer.c kernel: delete __cpuinit usage from all core kernel files 2013-07-14 19:36:59 -04:00
tracepoint.c Tracing updates for Linux 3.10 2013-04-29 13:55:38 -07:00
tsacct.c
uid16.c userns: Kill nsown_capable it makes the wrong thing easy 2013-08-30 23:44:11 -07:00
up.c smp.h: move !SMP version of on_each_cpu() out-of-line 2013-09-11 15:58:25 -07:00
user-return-notifier.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
user.c userns: Better restrictions on when proc and sysfs can be mounted 2013-08-26 19:17:03 -07:00
user_namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2013-09-07 14:35:32 -07:00
utsname.c userns: Kill nsown_capable it makes the wrong thing easy 2013-08-30 23:44:11 -07:00
utsname_sysctl.c kernel/utsname_sysctl.c: put get/get_uts() into CONFIG_PROC_SYSCTL code block 2013-02-27 19:10:22 -08:00
wait.c kernel: fix new kernel-doc warning in wait.c 2013-08-19 09:08:54 -07:00
watchdog.c watchdog: Make it work under full dynticks 2013-07-30 22:29:15 +02:00
workqueue.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2013-09-06 09:36:28 -07:00
workqueue_internal.h sched: Rename sched.c as sched/core.c in comments and Documentation 2013-06-19 12:58:42 +02:00