mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git (synced 2024-09-28 21:33:52 +00:00)
Documentation/barriers: Remove references to [smp_]read_barrier_depends()
The [smp_]read_barrier_depends() barrier macros no longer exist as part of the Linux memory model, so remove all references to them from the Documentation/ directory.

Although this is fairly mechanical on the whole, we drop the "CACHE COHERENCY" section entirely from 'memory-barriers.txt' as it doesn't make any sense now that the dependency barriers have been removed.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
parent 93fab07c22
commit 8ca924aeb4
2 changed files with 9 additions and 149 deletions
@@ -463,7 +463,7 @@ again without disrupting RCU readers.
 This guarantee was only partially premeditated. DYNIX/ptx used an
 explicit memory barrier for publication, but had nothing resembling
 ``rcu_dereference()`` for subscription, nor did it have anything
-resembling the ``smp_read_barrier_depends()`` that was later subsumed
+resembling the dependency-ordering barrier that was later subsumed
 into ``rcu_dereference()`` and later still into ``READ_ONCE()``. The
 need for these operations made itself known quite suddenly at a
 late-1990s meeting with the DEC Alpha architects, back in the days when
@@ -553,12 +553,12 @@ There are certain things that the Linux kernel memory barriers do not guarantee:
 DATA DEPENDENCY BARRIERS (HISTORICAL)
 -------------------------------------
 
-As of v4.15 of the Linux kernel, an smp_read_barrier_depends() was
-added to READ_ONCE(), which means that about the only people who
-need to pay attention to this section are those working on DEC Alpha
-architecture-specific code and those working on READ_ONCE() itself.
-For those who need it, and for those who are interested in the history,
-here is the story of data-dependency barriers.
+As of v4.15 of the Linux kernel, an smp_mb() was added to READ_ONCE() for
+DEC Alpha, which means that about the only people who need to pay attention
+to this section are those working on DEC Alpha architecture-specific code
+and those working on READ_ONCE() itself. For those who need it, and for
+those who are interested in the history, here is the story of
+data-dependency barriers.
 
 The usage requirements of data dependency barriers are a little subtle, and
 it's not always obvious that they're needed. To illustrate, consider the
@@ -2708,144 +2708,6 @@ the properties of the memory window through which devices are accessed and/or
 the use of any special device communication instructions the CPU may have.
 
 
-CACHE COHERENCY
----------------
-
-Life isn't quite as simple as it may appear above, however: for while the
-caches are expected to be coherent, there's no guarantee that that coherency
-will be ordered. This means that while changes made on one CPU will
-eventually become visible on all CPUs, there's no guarantee that they will
-become apparent in the same order on those other CPUs.
-
-
-Consider dealing with a system that has a pair of CPUs (1 & 2), each of which
-has a pair of parallel data caches (CPU 1 has A/B, and CPU 2 has C/D):
-
-                   :
-                   :                          +--------+
-                   :      +---------+         |        |
-       +--------+  : +--->| Cache A |<------->|        |
-       |        |  : |    +---------+         |        |
-       |  CPU 1 |<---+                        |        |
-       |        |  : |    +---------+         |        |
-       +--------+  : +--->| Cache B |<------->|        |
-                   :      +---------+         |        |
-                   :                          | Memory |
-                   :      +---------+         | System |
-       +--------+  : +--->| Cache C |<------->|        |
-       |        |  : |    +---------+         |        |
-       |  CPU 2 |<---+                        |        |
-       |        |  : |    +---------+         |        |
-       +--------+  : +--->| Cache D |<------->|        |
-                   :      +---------+         |        |
-                   :                          +--------+
-                   :
-
-Imagine the system has the following properties:
-
- (*) an odd-numbered cache line may be in cache A, cache C or it may still be
-     resident in memory;
-
- (*) an even-numbered cache line may be in cache B, cache D or it may still be
-     resident in memory;
-
- (*) while the CPU core is interrogating one cache, the other cache may be
-     making use of the bus to access the rest of the system - perhaps to
-     displace a dirty cacheline or to do a speculative load;
-
- (*) each cache has a queue of operations that need to be applied to that cache
-     to maintain coherency with the rest of the system;
-
- (*) the coherency queue is not flushed by normal loads to lines already
-     present in the cache, even though the contents of the queue may
-     potentially affect those loads.
-
-Imagine, then, that two writes are made on the first CPU, with a write barrier
-between them to guarantee that they will appear to reach that CPU's caches in
-the requisite order:
-
-       CPU 1           CPU 2           COMMENT
-       =============== =============== =======================================
-                                       u == 0, v == 1 and p == &u, q == &u
-       v = 2;
-       smp_wmb();                      Make sure change to v is visible before
-                                        change to p
-       <A:modify v=2>                  v is now in cache A exclusively
-       p = &v;
-       <B:modify p=&v>                 p is now in cache B exclusively
-
-The write memory barrier forces the other CPUs in the system to perceive that
-the local CPU's caches have apparently been updated in the correct order. But
-now imagine that the second CPU wants to read those values:
-
-       CPU 1           CPU 2           COMMENT
-       =============== =============== =======================================
-       ...
-                       q = p;
-                       x = *q;
-
-The above pair of reads may then fail to happen in the expected order, as the
-cacheline holding p may get updated in one of the second CPU's caches while
-the update to the cacheline holding v is delayed in the other of the second
-CPU's caches by some other cache event:
-
-       CPU 1           CPU 2           COMMENT
-       =============== =============== =======================================
-                                       u == 0, v == 1 and p == &u, q == &u
-       v = 2;
-       smp_wmb();
-       <A:modify v=2>  <C:busy>
-                       <C:queue v=2>
-       p = &v;         q = p;
-                       <D:request p>
-       <B:modify p=&v> <D:commit p=&v>
-                       <D:read p>
-                       x = *q;
-                       <C:read *q>     Reads from v before v updated in cache
-                       <C:unbusy>
-                       <C:commit v=2>
-
-Basically, while both cachelines will be updated on CPU 2 eventually, there's
-no guarantee that, without intervention, the order of update will be the same
-as that committed on CPU 1.
-
-
-To intervene, we need to interpolate a data dependency barrier or a read
-barrier between the loads (which as of v4.15 is supplied unconditionally
-by the READ_ONCE() macro). This will force the cache to commit its
-coherency queue before processing any further requests:
-
-       CPU 1           CPU 2           COMMENT
-       =============== =============== =======================================
-                                       u == 0, v == 1 and p == &u, q == &u
-       v = 2;
-       smp_wmb();
-       <A:modify v=2>  <C:busy>
-                       <C:queue v=2>
-       p = &v;         q = p;
-                       <D:request p>
-       <B:modify p=&v> <D:commit p=&v>
-                       <D:read p>
-                       smp_read_barrier_depends()
-                       <C:unbusy>
-                       <C:commit v=2>
-                       x = *q;
-                       <C:read *q>     Reads from v after v updated in cache
-
-
-This sort of problem can be encountered on DEC Alpha processors as they have a
-split cache that improves performance by making better use of the data bus.
-While most CPUs do imply a data dependency barrier on the read when a memory
-access depends on a read, not all do, so it may not be relied on.
-
-Other CPUs may also have split caches, but must coordinate between the various
-cachelets for normal memory accesses. The semantics of the Alpha removes the
-need for hardware coordination in the absence of memory barriers, which
-permitted Alpha to sport higher CPU clock rates back in the day. However,
-please note that (again, as of v4.15) smp_read_barrier_depends() should not
-be used except in Alpha arch-specific code and within the READ_ONCE() macro.
-
-
 CACHE COHERENCY VS DMA
 ----------------------
 
@@ -3009,10 +2871,8 @@ caches with the memory coherence system, thus making it seem like pointer
 changes vs new data occur in the right order.
 
 The Alpha defines the Linux kernel's memory model, although as of v4.15
-the Linux kernel's addition of smp_read_barrier_depends() to READ_ONCE()
-greatly reduced Alpha's impact on the memory model.
-
-See the subsection on "Cache Coherency" above.
+the Linux kernel's addition of smp_mb() to READ_ONCE() on Alpha greatly
+reduced its impact on the memory model.
 
 
 VIRTUAL MACHINE GUESTS
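
For readers who want to experiment with the publish/subscribe pattern that the removed "CACHE COHERENCY" section walks through (v = 2, a write barrier, then p = &v on one CPU; q = p, then x = *q on the other), here is a minimal user-space sketch. It is not kernel code and is not part of this commit. It assumes C11 atomics and POSIX threads as stand-ins: the release store plays the role of smp_wmb() followed by the store to p, and the acquire load is a stronger substitute for the dependency ordering that READ_ONCE() provides in the kernel. The function names (writer, reader, run_publish_subscribe) are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* u, v and p mirror the variables in the removed example. */
static int u = 0;
static int v = 1;
static _Atomic(int *) p = &u;

/* "CPU 1": update the data, then publish a pointer to it. */
static void *writer(void *arg)
{
	(void)arg;
	v = 2;
	/*
	 * Release store: "v = 2" is ordered before the store to p.
	 * This stands in for smp_wmb() followed by "p = &v".
	 */
	atomic_store_explicit(&p, &v, memory_order_release);
	return NULL;
}

/* "CPU 2": load the pointer, then dereference it. */
static void *reader(void *arg)
{
	int *q;

	/*
	 * Acquire load (an assumption of this sketch; it is stronger than
	 * the address-dependency ordering the kernel relies on). Spin until
	 * the writer has published &v.
	 */
	do {
		q = atomic_load_explicit(&p, memory_order_acquire);
	} while (q != &v);

	*(int *)arg = *q;	/* x = *q; sees v == 2 once q == &v */
	return NULL;
}

/* Run one writer and one reader; return the value the reader saw. */
int run_publish_subscribe(void)
{
	pthread_t w, r;
	int x = -1;

	/* Reset the shared state so the function can be called repeatedly. */
	v = 1;
	atomic_store_explicit(&p, &u, memory_order_relaxed);

	pthread_create(&r, NULL, reader, &x);
	pthread_create(&w, NULL, writer, NULL);
	pthread_join(w, NULL);
	pthread_join(r, NULL);
	return x;
}
```

Because the reader only dereferences q after observing &v through the acquire load, the value it reads is the one the writer stored before the release, so run_publish_subscribe() returns 2. Weakening both orderings to memory_order_relaxed would reintroduce, at the language level, the kind of reordering the removed section described for Alpha's split caches.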