linux-stable/net
Vladimir Oltean b2bfa742b8 net: bridge: switchdev: don't notify FDB entries with "master dynamic"
[ Upstream commit 927cdea5d2 ]

There is a structural problem in switchdev, where the flag bits in
struct switchdev_notifier_fdb_info (added_by_user, is_local etc) only
represent a simplified / denatured view of what's in struct
net_bridge_fdb_entry :: flags (BR_FDB_ADDED_BY_USER, BR_FDB_LOCAL etc).
Each time we want to pass more information about struct
net_bridge_fdb_entry :: flags to struct switchdev_notifier_fdb_info
(here, BR_FDB_STATIC), we find that FDB entries were already notified to
switchdev with no regard to this flag, and thus, switchdev drivers had
no indication whether the notified entries were static or not.

For example, this command:

ip link add br0 type bridge && ip link set swp0 master br0
bridge fdb add dev swp0 00:01:02:03:04:05 master dynamic

has never worked as intended with switchdev. It causes a struct
net_bridge_fdb_entry to be passed to br_switchdev_fdb_notify() which has
a single flag set: BR_FDB_ADDED_BY_USER.

This is further passed to the switchdev notifier chain, where interested
drivers have no choice but to assume this is a static (does not age) and
sticky (does not migrate) FDB entry. So currently, all drivers offload
it to hardware as such, as can be seen below ("offload" is set).

bridge fdb get 00:01:02:03:04:05 dev swp0 master
00:01:02:03:04:05 dev swp0 offload master br0

The software FDB entry expires $ageing_time centiseconds after the
kernel last sees a packet with this MAC SA, and the bridge notifies its
deletion as well, so it eventually disappears from hardware too.

This is a problem, because it is actually desirable to start offloading
"master dynamic" FDB entries correctly - they should expire $ageing_time
centiseconds after the *hardware* port last sees a packet with this
MAC SA - and this is how the current incorrect behavior was discovered.
With an offloaded data plane, it can be expected that software only sees
exception path packets, so an otherwise active dynamic FDB entry would
be aged out by software sooner than it should.

With the change in place, these FDB entries are no longer offloaded:

bridge fdb get 00:01:02:03:04:05 dev swp0 master
00:01:02:03:04:05 dev swp0 master br0

and this also constitutes a better way (assuming a backport to stable
kernels) for user space to determine whether the kernel has the
capability of doing something sane with these or not.

As opposed to "master dynamic" FDB entries, on the current behavior of
which no one currently depends on (which can be deduced from the lack of
kselftests), Ido Schimmel explains that entries with the "extern_learn"
flag (BR_FDB_ADDED_BY_EXT_LEARN) should still be notified to switchdev,
since the spectrum driver listens to them (and this is kind of okay,
because although they are treated identically to "static", they are
expected to not age, and to roam).

Fixes: 6b26b51b1d ("net: bridge: Add support for notifying devices about FDB add/del")
Link: https://lore.kernel.org/netdev/20230327115206.jk5q5l753aoelwus@skbuf/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20230418155902.898627-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-26 14:28:35 +02:00
..
6lowpan
9p
802
8021q
appletalk
atm
ax25
batman-adv
bluetooth Bluetooth: Set ISO Data Path on broadcast sink 2023-04-20 12:35:09 +02:00
bpf
bpfilter
bridge net: bridge: switchdev: don't notify FDB entries with "master dynamic" 2023-04-26 14:28:35 +02:00
caif
can
ceph
core skbuff: Fix a race between coalescing and releasing SKBs 2023-04-20 12:35:10 +02:00
dcb
dccp
dns_resolver
dsa
ethernet
ethtool
hsr
ieee802154
ife
ipv4
ipv6 net: rpl: fix rpl header size calculation 2023-04-26 14:28:34 +02:00
iucv
kcm
key
l2tp
l3mdev
lapb
llc
mac80211
mac802154
mctp
mpls
mptcp mptcp: stricter state check in mptcp_worker 2023-04-20 12:35:13 +02:00
ncsi
netfilter netfilter: nf_tables: tighten netlink attribute requirements for catch-all elements 2023-04-26 14:28:34 +02:00
netlabel
netlink
netrom
nfc
nsh
openvswitch
packet
phonet
psample
qrtr net: qrtr: Fix an uninit variable access bug in qrtr_tx_resume() 2023-04-20 12:35:09 +02:00
rds
rfkill
rose
rxrpc
sched net: sched: sch_qfq: prevent slab-out-of-bounds in qfq_activate_agg 2023-04-26 14:28:32 +02:00
sctp sctp: fix a potential overflow in sctp_ifwdtsn_skip 2023-04-20 12:35:09 +02:00
smc
strparser
sunrpc
switchdev
tipc
tls
unix
vmw_vsock
wireless
x25
xdp
xfrm
compat.c
devres.c
Kconfig
Kconfig.debug
Makefile
socket.c
sysctl_net.c