Commit graph

4 commits

Author SHA1 Message Date
Linus Torvalds
bc08b449ee lockref: implement lockless reference count updates using cmpxchg()
Instead of taking the spinlock, the lockless versions atomically check
that the lock is not taken, and do the reference count update using a
cmpxchg() loop.  This is semantically identical to doing the reference
count update protected by the lock, but avoids the "wait for lock"
contention that you get when accesses to the reference count are
contended.

Note that a "lockref" is absolutely _not_ equivalent to an atomic_t.
Even when the lockref reference counts are updated atomically with
cmpxchg, the fact that they also verify the state of the spinlock means
that the lockless updates can never happen while somebody else holds the
spinlock.

So while "lockref_put_or_lock()" looks a lot like just another name for
"atomic_dec_and_lock()", and both optimize to lockless updates, they are
fundamentally different: the decrement done by atomic_dec_and_lock() is
truly independent of any lock (as long as it doesn't decrement to zero),
so a locked region can still see the count change.

The lockref structure, in contrast, really is a *locked* reference
count.  If you hold the spinlock, the reference count will be stable and
you can modify the reference count without using atomics, because even
the lockless updates will see and respect the state of the lock.

In order to enable the cmpxchg lockless code, the architecture needs to
do three things:

 (1) Make sure that the "arch_spinlock_t" and an "unsigned int" can fit
     in an aligned u64, and have a "cmpxchg()" implementation that works
     on such a u64 data type.

 (2) define a helper function to test for a spinlock being unlocked
     ("arch_spin_value_unlocked()")

 (3) select the "ARCH_USE_CMPXCHG_LOCKREF" config variable in its
     Kconfig file.

This enables it for x86-64 (but not 32-bit, we'd need to make sure
cmpxchg() turns into the proper cmpxchg8b in order to enable it for
32-bit mode).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-02 12:12:15 -07:00
Linus Torvalds
2f4f12e571 lockref: uninline lockref helper functions
They aren't very good to inline, since they already call external
functions (the spinlock code), and we're going to create rather more
complicated versions of them that can do the reference count updates
locklessly.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-02 11:58:20 -07:00
Linus Torvalds
b3abd80250 lockref: add 'lockref_get_or_lock() helper
This behaves like "lockref_get_not_zero()", but instead of doing nothing
if the count was zero, it returns with the lock held.

This allows callers to revalidate the lockref-protected data structure
if required even if the count was zero to begin with, and possibly
increment the count if it passes muster.

In particular, the dentry code wants this when it wants to turn an
RCU-protected dentry into a stable refcounted one: if the dentry count
it zero, but the sequence number still validates the dentry, we can take
a reference to it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-02 11:14:19 -07:00
Waiman Long
0f8f2aaaab Add new lockref infrastructure reference implementation
This introduces a new "lockref" structure that supports the concept of
lockless updates of reference counts that still honor an attached
spinlock.

NOTE! This reference implementation is not the optimized lockless
version, rather it is the fallback implementation using standard
spinlocks.  The actual optimized versions will be merged into 3.12, but
I wanted to get the infrastructure in place and document the new
interfaces.

[ Also note that this particular commit is drastically cut-down minimal
  version of the original patch by Waiman.  In order to properly credit
  the original author I'm marking Waiman as the author here, but in the
  end this patch bears little resemblance to the patch by Waiman.  So
  blame any errors on me editing things down to the point where I can
  introduce the infrastructure before the merge window for 3.12 actually
  opens.     - Linus ]

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-28 18:13:27 -07:00