linux-stable/include/linux/sched/wake_q.h
Waiman Long 00f3c5a3df locking/rwsem: Always release wait_lock before waking up tasks
With the use of wake_q, we can do task wakeups without holding the
wait_lock. There is one exception in the rwsem code, though. It is
when the writer in the slowpath detects that there are waiters ahead
but the rwsem is not held by a writer. This can lead to a long wait_lock
hold time especially when a large number of readers are to be woken up.

Remediate this situation by releasing the wait_lock before waking
up tasks and re-acquiring it afterward. The rwsem_try_write_lock()
function is also modified to read the rwsem count directly to avoid
stale count value.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: huang ying <huang.ying.caritas@gmail.com>
Link: https://lkml.kernel.org/r/20190520205918.22251-9-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-17 12:28:00 +02:00

63 lines
2.1 KiB
C

/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _LINUX_SCHED_WAKE_Q_H
#define _LINUX_SCHED_WAKE_Q_H
/*
* Wake-queues are lists of tasks with a pending wakeup, whose
* callers have already marked the task as woken internally,
* and can thus carry on. A common use case is being able to
* do the wakeups once the corresponding user lock as been
* released.
*
* We hold reference to each task in the list across the wakeup,
* thus guaranteeing that the memory is still valid by the time
* the actual wakeups are performed in wake_up_q().
*
* One per task suffices, because there's never a need for a task to be
* in two wake queues simultaneously; it is forbidden to abandon a task
* in a wake queue (a call to wake_up_q() _must_ follow), so if a task is
* already in a wake queue, the wakeup will happen soon and the second
* waker can just skip it.
*
* The DEFINE_WAKE_Q macro declares and initializes the list head.
* wake_up_q() does NOT reinitialize the list; it's expected to be
* called near the end of a function. Otherwise, the list can be
* re-initialized for later re-use by wake_q_init().
*
* NOTE that this can cause spurious wakeups. schedule() callers
* must ensure the call is done inside a loop, confirming that the
* wakeup condition has in fact occurred.
*
* NOTE that there is no guarantee the wakeup will happen any later than the
* wake_q_add() location. Therefore task must be ready to be woken at the
* location of the wake_q_add().
*/
#include <linux/sched.h>
struct wake_q_head {
struct wake_q_node *first;
struct wake_q_node **lastp;
};
#define WAKE_Q_TAIL ((struct wake_q_node *) 0x01)
#define DEFINE_WAKE_Q(name) \
struct wake_q_head name = { WAKE_Q_TAIL, &name.first }
static inline void wake_q_init(struct wake_q_head *head)
{
head->first = WAKE_Q_TAIL;
head->lastp = &head->first;
}
static inline bool wake_q_empty(struct wake_q_head *head)
{
return head->first == WAKE_Q_TAIL;
}
extern void wake_q_add(struct wake_q_head *head, struct task_struct *task);
extern void wake_q_add_safe(struct wake_q_head *head, struct task_struct *task);
extern void wake_up_q(struct wake_q_head *head);
#endif /* _LINUX_SCHED_WAKE_Q_H */