linux-stable/kernel/locking
Waiman Long 4f23dbc1e6 locking/rwsem: Implement lock handoff to prevent lock starvation
Because of writer lock stealing, it is possible that a constant
stream of incoming writers will cause a waiting writer or reader to
wait indefinitely leading to lock starvation.

This patch implements a lock handoff mechanism to disable lock stealing
and force lock handoff to the first waiter or waiters (for readers)
in the queue after at least a 4ms waiting period unless it is a RT
writer task which doesn't need to wait. The waiting period is used to
avoid discouraging lock stealing too much to affect performance.

The setting and clearing of the handoff bit is serialized by the
wait_lock. So racing is not possible.

A rwsem microbenchmark was run for 5 seconds on a 2-socket 40-core
80-thread Skylake system with a v5.1 based kernel and 240 write_lock
threads with 5us sleep critical section.

Before the patch, the min/mean/max numbers of locking operations for
the locking threads were 1/7,792/173,696. After the patch, the figures
became 5,842/6,542/7,458.  It can be seen that the rwsem became much
more fair, though there was a drop of about 16% in the mean locking
operations done which was a tradeoff of having better fairness.

Making the waiter set the handoff bit right after the first wakeup can
impact performance especially with a mixed reader/writer workload. With
the same microbenchmark with short critical section and equal number of
reader and writer threads (40/40), the reader/writer locking operation
counts with the current patch were:

  40 readers, Iterations Min/Mean/Max = 1,793/1,794/1,796
  40 writers, Iterations Min/Mean/Max = 1,793/34,956/86,081

By making waiter set handoff bit immediately after wakeup:

  40 readers, Iterations Min/Mean/Max = 43/44/46
  40 writers, Iterations Min/Mean/Max = 43/1,263/3,191

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: huang ying <huang.ying.caritas@gmail.com>
Link: https://lkml.kernel.org/r/20190520205918.22251-8-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-17 12:27:59 +02:00
..
lock_events.c locking/lock_events: Don't show pvqspinlock events on bare metal 2019-04-10 10:56:05 +02:00
lock_events.h locking/lock_events: Use raw_cpu_{add,inc}() for stats 2019-06-03 12:32:56 +02:00
lock_events_list.h locking/rwsem: Implement lock handoff to prevent lock starvation 2019-06-17 12:27:59 +02:00
lockdep.c locking/lockdep: Remove unnecessary DEBUG_LOCKS_WARN_ON() 2019-06-17 12:09:37 +02:00
lockdep_internals.h locking/lockdep: Test all incompatible scenarios at once in check_irq_usage() 2019-04-29 08:29:20 +02:00
lockdep_proc.c locking/lockdep: Introduce lockdep_next_lockchain() and lock_chain_count() 2019-02-28 07:55:44 +01:00
lockdep_states.h locking/lockdep: Rework FS_RECLAIM annotation 2017-08-10 12:29:03 +02:00
locktorture.c locktorture: NULL cxt.lwsa and cxt.lrsa to allow bad-arg detection 2019-03-26 14:42:53 -07:00
Makefile locking/rwsem: Merge rwsem.h and rwsem-xadd.c into rwsem.c 2019-06-17 12:27:57 +02:00
mcs_spinlock.h locking/mcs: Use smp_cond_load_acquire() in MCS spin loop 2018-04-27 09:48:49 +02:00
mutex-debug.c locking/mutex: Replace spin_is_locked() with lockdep 2018-11-12 09:06:22 -08:00
mutex-debug.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mutex.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
mutex.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
osq_lock.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
percpu-rwsem.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
qrwlock.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157 2019-05-30 11:26:37 -07:00
qspinlock.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157 2019-05-30 11:26:37 -07:00
qspinlock_paravirt.h locking/qspinlock_stat: Introduce generic lockevent_*() counting APIs 2019-04-10 10:56:03 +02:00
qspinlock_stat.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157 2019-05-30 11:26:37 -07:00
rtmutex-debug.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
rtmutex-debug.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
rtmutex.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
rtmutex.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
rtmutex_common.h locking/rtmutex: Handle non enqueued waiters gracefully in remove_waiter() 2018-03-28 23:01:30 +02:00
rwsem.c locking/rwsem: Implement lock handoff to prevent lock starvation 2019-06-17 12:27:59 +02:00
rwsem.h locking/rwsem: Merge rwsem.h and rwsem-xadd.c into rwsem.c 2019-06-17 12:27:57 +02:00
semaphore.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 436 2019-06-05 17:37:17 +02:00
spinlock.c asm-generic/mmiowb: Add generic implementation of mmiowb() tracking 2019-04-08 11:59:39 +01:00
spinlock_debug.c mmiowb: Hook up mmiowb helpers to spinlocks and generic I/O accessors 2019-04-08 11:59:47 +01:00
test-ww_mutex.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 9 2019-05-21 11:28:40 +02:00