linux-stable/net/bridge
Florian Westphal 2b1414d5e9 netfilter: bridge: confirm multicast packets before passing them up the stack
[ Upstream commit 62e7151ae3 ]

conntrack nf_confirm logic cannot handle cloned skbs referencing
the same nf_conn entry, which will happen for multicast (broadcast)
frames on bridges.

 Example:
    macvlan0
       |
      br0
     /  \
  ethX    ethY

 ethX (or Y) receives a L2 multicast or broadcast packet containing
 an IP packet, flow is not yet in conntrack table.

 1. skb passes through bridge and fake-ip (br_netfilter)Prerouting.
    -> skb->_nfct now references a unconfirmed entry
 2. skb is broad/mcast packet. bridge now passes clones out on each bridge
    interface.
 3. skb gets passed up the stack.
 4. In macvlan case, macvlan driver retains clone(s) of the mcast skb
    and schedules a work queue to send them out on the lower devices.

    The clone skb->_nfct is not a copy, it is the same entry as the
    original skb.  The macvlan rx handler then returns RX_HANDLER_PASS.
 5. Normal conntrack hooks (in NF_INET_LOCAL_IN) confirm the orig skb.

The Macvlan broadcast worker and normal confirm path will race.

This race will not happen if step 2 already confirmed a clone. In that
case later steps perform skb_clone() with skb->_nfct already confirmed (in
hash table).  This works fine.

But such confirmation won't happen when eb/ip/nftables rules dropped the
packets before they reached the nf_confirm step in postrouting.

Pablo points out that nf_conntrack_bridge doesn't allow use of stateful
nat, so we can safely discard the nf_conn entry and let inet call
conntrack again.

This doesn't work for bridge netfilter: skb could have a nat
transformation. Also bridge nf prevents re-invocation of inet prerouting
via 'sabotage_in' hook.

Work around this problem by explicit confirmation of the entry at LOCAL_IN
time, before upper layer has a chance to clone the unconfirmed entry.

The downside is that this disables NAT and conntrack helpers.

Alternative fix would be to add locking to all code parts that deal with
unconfirmed packets, but even if that could be done in a sane way this
opens up other problems, for example:

-m physdev --physdev-out eth0 -j SNAT --snat-to 1.2.3.4
-m physdev --physdev-out eth1 -j SNAT --snat-to 1.2.3.5

For multicast case, only one of such conflicting mappings will be
created, conntrack only handles 1:1 NAT mappings.

Users should set create a setup that explicitly marks such traffic
NOTRACK (conntrack bypass) to avoid this, but we cannot auto-bypass
them, ruleset might have accept rules for untracked traffic already,
so user-visible behaviour would change.

Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217777
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-06 14:45:08 +00:00
..
netfilter netfilter: bridge: confirm multicast packets before passing them up the stack 2024-03-06 14:45:08 +00:00
br.c net: bridge: mst: Multiple Spanning Tree (MST) mode 2022-03-17 16:49:57 -07:00
br_arp_nd_proxy.c neighbour: annotate lockless accesses to n->nud_state 2023-10-10 22:00:42 +02:00
br_cfm.c
br_cfm_netlink.c bridge: cfm: fix enum typo in br_cc_ccm_tx_parse 2024-02-05 20:12:54 +00:00
br_device.c bridge: move from strlcpy with unused retval to strscpy 2022-08-22 17:57:30 -07:00
br_fdb.c rtnetlink: add extack support in fdb del handlers 2022-05-09 11:58:20 +01:00
br_forward.c net: bridge: use DEV_STATS_INC() 2023-10-06 14:56:41 +02:00
br_if.c net: bridge: keep ports without IFF_UNICAST_FLT in BR_PROMISC mode 2023-07-19 16:22:04 +02:00
br_input.c net: bridge: use DEV_STATS_INC() 2023-10-06 14:56:41 +02:00
br_ioctl.c
br_mdb.c net: bridge: allow add/remove permanent mdb entries on disabled ports 2022-06-15 09:35:21 +01:00
br_mrp.c
br_mrp_netlink.c
br_mrp_switchdev.c
br_mst.c net: bridge: mst: Add helper to query a port's MST state 2022-03-17 16:49:58 -07:00
br_multicast.c bridge: mcast: fix disabled snooping after long uptime 2024-02-05 20:13:01 +00:00
br_multicast_eht.c
br_netfilter_hooks.c netfilter: bridge: confirm multicast packets before passing them up the stack 2024-03-06 14:45:08 +00:00
br_netfilter_ipv6.c netfilter: bridge: replace physindev with physinif in nf_bridge_info 2024-01-25 15:27:51 -08:00
br_netlink.c bridge: Fix flushing of dynamic FDB entries 2022-11-02 20:47:09 -07:00
br_netlink_tunnel.c
br_nf_core.c
br_private.h bridge: mcast: fix disabled snooping after long uptime 2024-02-05 20:13:01 +00:00
br_private_cfm.h
br_private_mcast_eht.h
br_private_mrp.h
br_private_stp.h
br_private_tunnel.h bridge: always declare tunnel functions 2023-05-24 17:32:48 +01:00
br_stp.c net: bridge: mst: Multiple Spanning Tree (MST) mode 2022-03-17 16:49:57 -07:00
br_stp_bpdu.c
br_stp_if.c Revert "bridge: Add extack warning when enabling STP in netns." 2023-09-13 09:42:20 +02:00
br_stp_timer.c
br_switchdev.c net: bridge: switchdev: Ensure deferred event delivery on unoffload 2024-03-01 13:26:36 +01:00
br_sysfs_br.c bridge: Fix flushing of dynamic FDB entries 2022-11-02 20:47:09 -07:00
br_sysfs_if.c bridge: move from strlcpy with unused retval to strscpy 2022-08-22 17:57:30 -07:00
br_vlan.c bridge: switchdev: Fix memory leaks when changing VLAN protocol 2022-11-15 13:38:11 +01:00
br_vlan_options.c net: bridge: mst: Allow changing a VLAN's MSTI 2022-03-17 16:49:57 -07:00
br_vlan_tunnel.c
Kconfig
Makefile net: bridge: mst: Multiple Spanning Tree (MST) mode 2022-03-17 16:49:57 -07:00