linux-stable/net/bridge
Ido Schimmel 7b4858df3b skbuff: bridge: Add layer 2 miss indication
For EVPN non-DF (Designated Forwarder) filtering, we need to be able to
prevent decapsulated traffic from being flooded to a multi-homed host.
Filtering of multicast and broadcast traffic can be achieved using the
following flower filter:

 # tc filter add dev bond0 egress pref 1 proto all flower indev vxlan0 dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 action drop

Unlike broadcast and multicast traffic, it is not currently possible to
filter unknown unicast traffic. The classification into unknown unicast
is performed by the bridge driver, but is not visible to other layers
such as tc.

Solve this by adding a new 'l2_miss' bit to the tc skb extension. Clear
the bit whenever a packet enters the bridge (received from a bridge port
or transmitted via the bridge) and set it if the packet did not match an
FDB or MDB entry. If there is no skb extension and the bit needs to be
cleared, then do not allocate one as no extension is equivalent to the
bit being cleared. The bit is not set for broadcast packets as they
never perform a lookup and therefore never incur a miss.
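
The marking rules above can be sketched as a small userspace model. This
is not the kernel code: 'struct model_skb', 'struct model_ext' and all
'model_*' functions are hypothetical stand-ins for the skb, its tc skb
extension and the bridge helpers, and calloc() stands in for the
extension allocator. It only illustrates clear-on-entry, set-on-miss and
the no-allocation-on-clear behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the tc skb extension carrying the bit. */
struct model_ext {
	bool l2_miss;
};

/* Hypothetical stand-in for an skb; ext == NULL means "no extension". */
struct model_skb {
	struct model_ext *ext;
};

/* Set or clear the bit, never allocating an extension just to clear it. */
static void model_miss_set(struct model_skb *skb, bool miss)
{
	if (skb->ext) {
		skb->ext->l2_miss = miss;
		return;
	}
	if (!miss)
		return;	/* no extension is equivalent to a cleared bit */
	skb->ext = calloc(1, sizeof(*skb->ext));
	if (!skb->ext)
		return;	/* allocation failed: packet stays unmarked */
	skb->ext->l2_miss = true;
}

/* Reader side: a missing extension reads as "no miss". */
static bool model_miss_get(const struct model_skb *skb)
{
	return skb->ext && skb->ext->l2_miss;
}

enum model_dst { DST_BROADCAST, DST_UNICAST, DST_MULTICAST };

/*
 * Model of one pass through the bridge. 'found' is the result of the
 * FDB (unicast) or MDB (multicast) lookup; it is ignored for broadcast
 * because broadcast packets never perform a lookup.
 */
static void model_bridge_rx(struct model_skb *skb, enum model_dst dst,
			    bool found)
{
	assert(skb != NULL);
	model_miss_set(skb, false);	/* clear whenever the bridge is entered */
	if (dst == DST_BROADCAST)
		return;
	if (!found)
		model_miss_set(skb, true);
}
```

In this model a packet that hits an FDB entry never gets an extension
allocated, and a stale bit left over from an earlier pass through a
bridge is cleared on re-entry.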

A bit that is set for every flooded packet would also work for the
current use case, but it does not allow us to differentiate between
registered and unregistered multicast traffic, which might be useful in
the future.

To keep the performance impact to a minimum, the marking of packets is
guarded by the 'tc_skb_ext_tc' static key. When the key is 'false', the
skb is not touched and an skb extension is not allocated. Instead, only
a 5-byte nop is executed, as demonstrated below for the call site in
br_handle_frame().

Before the patch:

```
        memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
  c37b09:       49 c7 44 24 28 00 00    movq   $0x0,0x28(%r12)
  c37b10:       00 00

        p = br_port_get_rcu(skb->dev);
  c37b12:       49 8b 44 24 10          mov    0x10(%r12),%rax
        memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
  c37b17:       49 c7 44 24 30 00 00    movq   $0x0,0x30(%r12)
  c37b1e:       00 00
  c37b20:       49 c7 44 24 38 00 00    movq   $0x0,0x38(%r12)
  c37b27:       00 00
```

After the patch (when the static key is disabled):

```
        memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
  c37c29:       49 c7 44 24 28 00 00    movq   $0x0,0x28(%r12)
  c37c30:       00 00
  c37c32:       49 8d 44 24 28          lea    0x28(%r12),%rax
  c37c37:       48 c7 40 08 00 00 00    movq   $0x0,0x8(%rax)
  c37c3e:       00
  c37c3f:       48 c7 40 10 00 00 00    movq   $0x0,0x10(%rax)
  c37c46:       00

#ifdef CONFIG_HAVE_JUMP_LABEL_HACK

static __always_inline bool arch_static_branch(struct static_key *key, bool branch)
{
        asm_volatile_goto("1:"
  c37c47:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
        br_tc_skb_miss_set(skb, false);

        p = br_port_get_rcu(skb->dev);
  c37c4c:       49 8b 44 24 10          mov    0x10(%r12),%rax
```

Subsequent patches will extend the flower classifier to be able to match
on the new 'l2_miss' bit and enable / disable the static key when
filters that match on it are added / deleted.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-30 23:37:00 -07:00
netfilter netfilter: bridge: introduce broute meta statement 2023-03-08 14:21:18 +01:00
br.c bridge: switchdev: Allow device drivers to install locked FDB entries 2022-11-09 19:06:13 -08:00
br_arp_nd_proxy.c bridge: Add per-{Port, VLAN} neighbor suppression data path support 2023-04-21 08:25:50 +01:00
br_cfm.c
br_cfm_netlink.c
br_device.c skbuff: bridge: Add layer 2 miss indication 2023-05-30 23:37:00 -07:00
br_fdb.c bridge: switchdev: Allow device drivers to install locked FDB entries 2022-11-09 19:06:13 -08:00
br_forward.c skbuff: bridge: Add layer 2 miss indication 2023-05-30 23:37:00 -07:00
br_if.c bridge: Take per-{Port, VLAN} neighbor suppression into account 2023-04-21 08:25:49 +01:00
br_input.c skbuff: bridge: Add layer 2 miss indication 2023-05-30 23:37:00 -07:00
br_ioctl.c
br_mdb.c rtnetlink: bridge: mcast: Relax group address validation in common code 2023-03-17 08:05:49 +00:00
br_mrp.c
br_mrp_netlink.c
br_mrp_switchdev.c
br_mst.c
br_multicast.c net: bridge: Add netlink knobs for number / maximum MDB entries 2023-02-06 08:48:26 +00:00
br_multicast_eht.c treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
br_netfilter_hooks.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-04-20 16:29:51 -07:00
br_netfilter_ipv6.c netfilter: move br_nf_check_hbh_len to utils 2023-03-08 14:25:40 +01:00
br_netlink.c bridge: Allow setting per-{Port, VLAN} neighbor suppression state 2023-04-21 08:25:50 +01:00
br_netlink_tunnel.c net: bridge: Set strict_start_type at two policies 2023-02-06 08:48:25 +00:00
br_nf_core.c net: dst: Switch to rcuref_t reference counting 2023-03-28 18:52:28 -07:00
br_private.h skbuff: bridge: Add layer 2 miss indication 2023-05-30 23:37:00 -07:00
br_private_cfm.h
br_private_mcast_eht.h
br_private_mrp.h
br_private_stp.h
br_private_tunnel.h bridge: always declare tunnel functions 2023-05-17 21:28:58 -07:00
br_stp.c
br_stp_bpdu.c
br_stp_if.c
br_stp_timer.c
br_switchdev.c net: bridge: switchdev: don't notify FDB entries with "master dynamic" 2023-04-20 09:20:14 +02:00
br_sysfs_br.c bridge: Fix flushing of dynamic FDB entries 2022-11-02 20:47:09 -07:00
br_sysfs_if.c bridge: move from strlcpy with unused retval to strscpy 2022-08-22 17:57:30 -07:00
br_vlan.c bridge: vlan: Allow setting VLAN neighbor suppression state 2023-04-21 08:25:50 +01:00
br_vlan_options.c bridge: vlan: Allow setting VLAN neighbor suppression state 2023-04-21 08:25:50 +01:00
br_vlan_tunnel.c
Kconfig
Makefile