No description
Find a file
Tobias Waldekranz 603be95437 net: bridge: switchdev: Skip MDB replays of deferred events on offload
[ Upstream commit dc489f8625 ]

Before this change, generation of the list of MDB events to replay
would race against the creation of new group memberships, either from
the IGMP/MLD snooping logic or from user configuration.

While new memberships are immediately visible to walkers of
br->mdb_list, the notification of their existence to switchdev event
subscribers is deferred until a later point in time. So if a replay
list was generated during a time that overlapped with such a window,
it would also contain a replay of the not-yet-delivered event.

The driver would thus receive two copies of what the bridge internally
considered to be one single event. On destruction of the bridge, only
a single membership deletion event was therefore sent. As a
consequence of this, drivers which reference count memberships (at
least DSA), would be left with orphan groups in their hardware
database when the bridge was destroyed.

This is only an issue when replaying additions. While deletion events
may still be pending on the deferred queue, they will already have
been removed from br->mdb_list, so no duplicates can be generated in
that scenario.

To a user this meant that old group memberships, from a bridge in
which a port was previously attached, could be reanimated (in
hardware) when the port joined a new bridge, without the new bridge's
knowledge.

For example, on an mv88e6xxx system, create a snooping bridge and
immediately add a port to it:

    root@infix-06-0b-00:~$ ip link add dev br0 up type bridge mcast_snooping 1 && \
    > ip link set dev x3 up master br0

And then destroy the bridge:

    root@infix-06-0b-00:~$ ip link del dev br0
    root@infix-06-0b-00:~$ mvls atu
    ADDRESS             FID  STATE      Q  F  0  1  2  3  4  5  6  7  8  9  a
    DEV:0 Marvell 88E6393X
    33:33:00:00:00:6a     1  static     -  -  0  .  .  .  .  .  .  .  .  .  .
    33:33:ff:87:e4:3f     1  static     -  -  0  .  .  .  .  .  .  .  .  .  .
    ff:ff:ff:ff:ff:ff     1  static     -  -  0  1  2  3  4  5  6  7  8  9  a
    root@infix-06-0b-00:~$

The two IPv6 groups remain in the hardware database because the
port (x3) is notified of the host's membership twice: once via the
original event and once via a replay. Since only a single delete
notification is sent, the count remains at 1 when the bridge is
destroyed.

Then add the same port (or another port belonging to the same hardware
domain) to a new bridge, this time with snooping disabled:

    root@infix-06-0b-00:~$ ip link add dev br1 up type bridge mcast_snooping 0 && \
    > ip link set dev x3 up master br1

All multicast, including the two IPv6 groups from br0, should now be
flooded, according to the policy of br1. But instead the old
memberships are still active in the hardware database, causing the
switch to only forward traffic to those groups towards the CPU (port
0).

Eliminate the race in two steps:

1. Grab the write-side lock of the MDB while generating the replay
   list.

This prevents new memberships from showing up while we are generating
the replay list. But it leaves the scenario in which a deferred event
was already generated, but not delivered, before we grabbed the
lock. Therefore:

2. Make sure that no deferred version of a replay event is already
   enqueued to the switchdev deferred queue, before adding it to the
   replay list, when replaying additions.

Fixes: 4f2673b3a2 ("net: bridge: add helper to replay port and host-joined mdb entries")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-01 13:35:06 +01:00
arch arm64: dts: rockchip: Correct Indiedroid Nova GPIO Names 2024-03-01 13:35:05 +01:00
block block: Fix WARNING in _copy_from_iter 2024-03-01 13:34:49 +01:00
certs certs: Reference revocation list for all keyrings 2023-08-17 20:12:41 +00:00
crypto crypto: algif_hash - Remove bogus SGL free on zero-length error path 2024-02-23 09:25:11 +01:00
Documentation docs: Instruct LaTeX to cope with deeper nesting 2024-03-01 13:34:58 +01:00
drivers scsi: jazz_esp: Only build if SCSI core is builtin 2024-03-01 13:35:06 +01:00
fs smb3: add missing null server pointer check 2024-03-01 13:35:03 +01:00
include net: bridge: switchdev: Skip MDB replays of deferred events on offload 2024-03-01 13:35:06 +01:00
init update workarounds for gcc "asm goto" issue 2024-02-23 09:24:47 +01:00
io_uring io_uring/net: fix multishot accept overflow handling 2024-02-23 09:25:10 +01:00
ipc Add x86 shadow stack support 2023-08-31 12:20:12 -07:00
kernel sched/rt: Disallow writing invalid values to sched_rt_period_us 2024-03-01 13:34:47 +01:00
lib lib/Kconfig.debug: TEST_IOV_ITER depends on MMU 2024-03-01 13:34:59 +01:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
mm mm/damon/reclaim: fix quota stauts loss due to online tunings 2024-03-01 13:35:00 +01:00
net net: bridge: switchdev: Skip MDB replays of deferred events on offload 2024-03-01 13:35:06 +01:00
rust rust: upgrade to Rust 1.73.0 2024-02-16 19:10:43 +01:00
samples work around gcc bugs with 'asm goto' with outputs 2024-02-23 09:24:47 +01:00
scripts bpf, scripts: Correct GPL license name 2024-03-01 13:35:06 +01:00
security lsm: fix the logic in security_inode_getsecctx() 2024-02-23 09:25:02 +01:00
sound ALSA: usb-audio: Ignore clock selector errors for single connection 2024-03-01 13:34:52 +01:00
tools bpf: Derive source IP addr via bpf_*_fib_lookup() 2024-03-01 13:35:04 +01:00
usr initramfs: Encode dependency on KBUILD_BUILD_TIMESTAMP 2023-06-06 17:54:49 +09:00
virt ARM: 2023-09-07 13:52:20 -07:00
.clang-format iommu: Add for_each_group_device() 2023-05-23 08:15:51 +02:00
.cocciconfig
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore kbuild: rpm-pkg: rename binkernel.spec to kernel.spec 2023-07-25 00:59:33 +09:00
.mailmap 20 hotfixes. 12 are cc:stable and the remainder address post-6.5 issues 2023-10-24 09:52:16 -10:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS USB: Remove Wireless USB and UWB documentation 2023-08-09 14:17:32 +02:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS MAINTAINERS: add Catherine as xfs maintainer for 6.6.y 2024-02-16 19:10:43 +01:00
Makefile Linux 6.6.18 2024-02-23 09:25:28 +01:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.