linux-stable/kernel/locking
Davidlohr Bueso e38513905e locking/rwsem: Rework zeroing reader waiter->task
Readers that are awoken expect a nil ->task, indicating that a
wakeup has occurred. Because of the way readers are implemented,
there is a small chance that the waiter will never block in the
slowpath (rwsem_down_read_failed), and therefore some form of
reference counting is required to avoid the following scenario:

rwsem_down_read_failed()                rwsem_wake()
  get_task_struct();
  spin_lock_irq(&wait_lock);
  list_add_tail(&waiter.list)
  spin_unlock_irq(&wait_lock);
                                        raw_spin_lock_irqsave(&wait_lock)
                                        __rwsem_do_wake()
  while (1) {
    set_task_state(TASK_UNINTERRUPTIBLE);
                                          waiter->task = NULL
    if (!waiter.task) // true
      break;
    schedule() // never reached
  }
  __set_task_state(TASK_RUNNING);
do_exit();
                                          wake_up_process(tsk); // boom

... and therefore race with do_exit() when the caller returns.
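
For reference, a minimal sketch of the reader slowpath under the old
scheme (simplified, using current-task helpers; not a verbatim
excerpt): the explicit get_task_struct() is what keeps the task valid
for the waker, since the reader may observe the nil store and return
without ever reaching schedule():

  /* Old reader slowpath (sketch): pin the current task so the waker
   * can safely wake it even if we have already returned by then. */
  waiter.task = current;
  get_task_struct(current);

  raw_spin_lock_irq(&sem->wait_lock);
  list_add_tail(&waiter.list, &sem->wait_list);
  raw_spin_unlock_irq(&sem->wait_lock);

  /* Wait until the waker hands us the lock and clears ->task. */
  while (1) {
    set_current_state(TASK_UNINTERRUPTIBLE);
    if (!waiter.task)
      break;
    schedule();
  }
  __set_current_state(TASK_RUNNING);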

There is also a mismatch between the smp_mb() and its documentation:
the serialization it provides is between reading the task and the
nil store. Furthermore, beyond the fact that overlapping loads and
stores to waiter->task are already guaranteed to be ordered within
that CPU, both wake_up_process() originally and now wake_q_add()
imply a barrier upon a successful call, which covers what the
comment asks for.
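
Concretely, the old wake side looked roughly like the following (a
sketch based on the description above, not a verbatim excerpt), with
the smp_mb() sitting between the task load and the nil store:

  tsk = waiter->task;
  /*
   * What this barrier actually separates is the load of waiter->task
   * above from the nil store below; the wakeup itself needs no extra
   * ordering, since wake_up_process() (and now wake_q_add()) already
   * implies a barrier when it succeeds.
   */
  smp_mb();
  waiter->task = NULL;
  wake_up_process(tsk);
  put_task_struct(tsk);  /* drop the reference taken by the reader */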

Now, as an alternative to inverting the checks on the blocker side
(which has its own penalty in that schedule() becomes unavoidable),
lockless wakeups address this situation naturally: we can simply
rely on the reference held by wake_q_add() instead of taking one
explicitly. Of course, we must guarantee that the nil store is done
as the _last_ operation, so that by the time the waiter can observe
it, the task is already queued for wakeup with its reference held
and cannot fall into the race above. Spurious wakeups are also
handled transparently, in that the task's reference is only dropped
when wake_up_q() is actually called, _after_ the nil store.
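
Putting it together, the reworked wake path amounts to roughly the
following (a sketch, not the exact patch): wake_q_add() pins the
task, the nil store is issued last as a release, and the reference
is dropped only when wake_up_q() runs after the wait_lock has been
released:

  tsk = waiter->task;
  wake_q_add(wake_q, tsk);  /* takes the task reference */
  /*
   * The nil store must be the last operation: by the time the reader
   * can observe it and return (or even exit), the reference is
   * already held by the wake queue.
   */
  smp_store_release(&waiter->task, NULL);

  /* ... later, once wait_lock has been dropped ... */
  wake_up_q(wake_q);  /* wakes the task and drops the reference */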

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman.Long@hpe.com
Cc: dave@stgolabs.net
Cc: jason.low2@hp.com
Cc: peter@hurleysoftware.com
Link: http://lkml.kernel.org/r/1463165787-25937-3-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-03 09:47:12 +02:00
lglock.c sched/stop_machine: Fix deadlock between multiple stop_two_cpus() 2015-06-19 10:03:12 +02:00
lockdep.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-05-16 14:47:16 -07:00
lockdep_internals.h lockdep: Increase static allocations 2014-04-18 14:20:50 +02:00
lockdep_proc.c lockdep: Fix lock_chain::base size 2016-04-23 13:53:03 +02:00
lockdep_states.h
locktorture.c locking/locktorture: Simplify the torture_runnable computation 2016-04-28 10:57:51 +02:00
Makefile kernel: add kcov code coverage 2016-03-22 15:36:02 -07:00
mcs_spinlock.h locking/mcs: Fix mcs_spin_lock() ordering 2016-02-29 10:02:41 +01:00
mutex-debug.c mutex: Always clear owner field upon mutex_unlock() 2015-01-09 11:20:39 +01:00
mutex-debug.h
mutex.c locking/ww_mutex: Report recursive ww_mutex locking early 2016-06-03 08:37:26 +02:00
mutex.h locking/mutexes: Use MUTEX_SPIN_ON_OWNER when appropriate 2014-08-13 10:32:02 +02:00
osq_lock.c locking/osq: Fix ordering of node initialisation in osq_lock 2015-12-17 11:40:29 -08:00
percpu-rwsem.c ext4: fix races between changing inode journal mode and ext4_writepages 2016-04-25 23:22:35 -04:00
qrwlock.c locking/qrwlock: Rename ->lock to ->wait_lock 2015-09-18 09:27:29 +02:00
qspinlock.c locking/qspinlock: Use smp_cond_acquire() in pending code 2016-02-29 10:02:42 +01:00
qspinlock_paravirt.h locking/pvqspinlock: Enable slowpath locking count tracking 2016-02-29 10:02:42 +01:00
qspinlock_stat.h locking/pvqspinlock: Robustify init_qspinlock_stat() 2016-05-05 09:58:51 +02:00
rtmutex-debug.c rtmutex: Cleanup deadlock detector debug logic 2014-06-21 22:05:30 +02:00
rtmutex-debug.h rtmutex: Cleanup deadlock detector debug logic 2014-06-21 22:05:30 +02:00
rtmutex.c rtmutex: Make wait_lock irq safe 2016-01-26 11:08:35 +01:00
rtmutex.h rtmutex: Cleanup deadlock detector debug logic 2014-06-21 22:05:30 +02:00
rtmutex_common.h rtmutex: Delete scriptable tester 2015-07-20 11:45:45 +02:00
rwsem-spinlock.c locking/rwsem: Introduce basis for down_write_killable() 2016-04-13 10:42:20 +02:00
rwsem-xadd.c locking/rwsem: Rework zeroing reader waiter->task 2016-06-03 09:47:12 +02:00
rwsem.c add down_write_killable_nested() 2016-05-26 00:04:58 -04:00
rwsem.h locking/rwsem: Set lock ownership ASAP 2015-02-18 16:57:13 +01:00
semaphore.c locking/semaphore: Resolve some shadow warnings 2014-09-04 07:17:24 +02:00
spinlock.c spinlock: Add spin_lock_bh_nested() 2015-01-03 14:32:57 -05:00
spinlock_debug.c locking: Move the spinlock code to kernel/locking/ 2013-11-06 07:55:21 +01:00