No description
Find a file
David S. Miller 356ae88f83 Merge branch 'bridge-tx-fwd'
Vladimir Oltean says:

====================
Allow TX forwarding for the software bridge data path to be offloaded to capable devices

On RX, switchdev drivers have the ability to mark packets for the
software bridge as "already forwarded in hardware" via
skb->offload_fwd_mark. This instructs the nbp_switchdev_allowed_egress()
function to perform software forwarding of that packet only to the bridge
ports that are not in the same hardware domain as the source packet.

This series expands the concept for TX, in the sense that we can trust
the accelerator to:
(a) look up its FDB (which is more or less in sync with the software
    bridge FDB) for selecting the destination ports for a packet
(b) replicate the frame in hardware in case it's a multicast/broadcast,
    instead of the software bridge having to clone it and send the
    clones to each net device one at a time. This reduces the bandwidth
    needed between the CPU and the accelerator, as well as the CPU time
    spent.

This is done by augmenting nbp_switchdev_allowed_egress() to also
exclude the bridge ports which have the tx_fwd_offload capability if the
skb has already been transmitted to one port from their hardware domain.

Even though in reality, the software bridge still technically looks up
the FDB/MDB for every frame, but all skb clones are suppressed, this
offload specifically requires that the switchdev accelerator looks up
its FDB/MDB again. It is intended to be used to inject "data plane
packets" into the hardware as opposed to "control plane packets" which
target a precise destination port.

Towards that goal, the bridge always provides the TX packets with
skb->offload_fwd_mark = true with the VLAN tag always present, so that
the accelerator can forward according to that VLAN broadcast domain.

This work is not intended to cater to switches which can inject control
plane packets to a bit mask of destination ports. I see that as a more
difficult task to accomplish with potentially less benefits (it provides
only replication offload). The reason it is more difficult is that
struct skb_buff would probably need to be extended to contain a list of
struct net_devices that the packet must be replicated to. Sending data
plane packets avoids that issue by keeping the hardware and software FDB
more or less in sync and looking it up twice.

Additionally, the ability for the software bridge to request data plane
packets to be sent brings the opportunity for "dumb switches" to support
traffic termination to/from the bridge. Such switches (DSA or otherwise)
typically only use control packets for link-local traps, and sending or
receiving a control packet is an expensive operation.

For this class of switches, this patch series makes the difference
between supporting and not supporting local IP termination through a
VLAN-aware bridge, bridging with a foreign interface, bridging with
software upper interfaces like LAG, etc. So instead of telling them
"oh, what a dumb switch you are!", we can now tell them "oh, what a
stark contrast you have between the control and data plane!".

Patches 1-3 tested on Turris MOX (3 mv88e6xxx switches in a daisy chain
topology) and a second DSA driver to be added soon. Patches 4-5 tested
only on Turris MOX.

===========================================================

Changes in v5:
- make sure the static key is decremented on bridge port unoffload
- rename functions and variables so that the "tx_fwd_offload" string is
  easy to grep across the git tree
- simplify DSA core bookkeeping of the bridge_num

===========================================================

Changes in v4:

The biggest change compared to the previous series is not present in the
patches, but is rather a lack of them. Previously we were replaying
switchdev objects on the public notifier chain, but that was a mistake
in my reasoning and it was reverted for v4. Therefore, we are now
passing the notifier blocks as arguments to switchdev_bridge_port_offload()
for all drivers. This alone gets rid of 7 patches compared to v3.

Other changes are:
- Take more care for the case where mlxsw leaves a VLAN or LAG upper
  that is a bridge port, make sure that switchdev_bridge_port_unoffload()
  gets called for that case
- A couple of DSA bug fixes
- Add change logs for all patches
- Copy all switchdev driver maintainers on the changes relevant to them

===========================================================

Message for v3:
https://patchwork.kernel.org/project/netdevbpf/cover/20210712152142.800651-1-vladimir.oltean@nxp.com/

In this submission I have introduced a "native switchdev" driver API to
signal whether the TX forwarding offload is supported or not. This comes
after a third person has said that the macvlan offload framework used
for v2 and v1 was simply too convoluted.

This large patch set is submitted for discussion purposes (it is
provided in its entirety so it can be applied & tested on net-next).
It is only minimally tested, and yet I will not copy all switchdev
driver maintainers until we agree on the viability of this approach.

The major changes compared to v2:
- The introduction of switchdev_bridge_port_offload() and
  switchdev_bridge_port_unoffload() as two major API changes from the
  perspective of a switchdev driver. All drivers were converted to call
  these.
- Augment switchdev_bridge_port_{,un}offload to also handle the
  switchdev object replays on port join/leave.
- Augment switchdev_bridge_port_offload to also signal whether the TX
  forwarding offload is supported.

===========================================================

Message for v2:
https://patchwork.kernel.org/project/netdevbpf/cover/20210703115705.1034112-1-vladimir.oltean@nxp.com/

For this series I have taken Tobias' work from here:
https://patchwork.kernel.org/project/netdevbpf/cover/20210426170411.1789186-1-tobias@waldekranz.com/
and made the following changes:
- I collected and integrated (hopefully all of) Nikolay's, Ido's and my
  feedback on the bridge driver changes. Otherwise, the structure of the
  bridge changes is pretty much the same as Tobias left it.
- I basically rewrote the DSA infrastructure for the data plane
  forwarding offload, based on the commonalities with another switch
  driver for which I implemented this feature (not submitted here)
- I adapted mv88e6xxx to use the new infrastructure, hopefully it still
  works but I didn't test that
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 16:32:50 +01:00
arch Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-07-23 16:13:06 +01:00
block block-5.14-2021-07-08 2021-07-09 12:05:33 -07:00
certs Kbuild updates for v5.13 (2nd) 2021-05-08 10:00:11 -07:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2021-07-09 11:00:44 -07:00
Documentation Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-07-23 16:13:06 +01:00
drivers net: dsa: mv88e6xxx: map virtual bridges with forwarding offload in the PVT 2021-07-23 16:32:37 +01:00
fs AFS fixes 2021-07-21 11:51:59 -07:00
include net: dsa: add support for bridge TX forwarding offload 2021-07-23 16:32:37 +01:00
init Revert "mm/slub: use stackdepot to save stack trace in objects" 2021-07-17 13:27:00 -07:00
ipc Merge branch 'akpm' (patches from Andrew) 2021-07-02 12:08:10 -07:00
kernel Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-07-23 16:13:06 +01:00
lib lib/test_hmm: remove set but unused page variable 2021-07-15 10:13:49 -07:00
LICENSES LICENSES/dual/CC-BY-4.0: Git rid of "smart quotes" 2021-07-15 06:31:24 -06:00
mm Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-07-23 16:13:06 +01:00
net net: dsa: tag_dsa: offload the bridge forwarding process 2021-07-23 16:32:37 +01:00
samples Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2021-07-15 22:40:10 -07:00
scripts Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-07-23 16:13:06 +01:00
security asm-generic/unaligned: Unify asm/unaligned.h around struct helper 2021-07-02 12:43:40 -07:00
sound ASoC: Mediatek: MT8183: Fix fall-through warning for Clang 2021-07-13 14:58:18 -05:00
tools Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-07-23 16:13:06 +01:00
usr .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
virt * Allow again loading KVM on 32-bit non-PAE builds 2021-07-15 11:56:07 -07:00
.clang-format clang-format: Update with the latest for_each macro list 2021-05-12 23:32:39 +02:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: ignore only top-level modules.builtin 2021-05-02 00:43:35 +09:00
.mailmap m68k updates for v5.14 2021-06-28 14:01:03 -07:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: move Murali Karicheri to credits 2021-04-29 15:47:30 -07:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-07-23 16:13:06 +01:00
Makefile Linux 5.14-rc2 2021-07-18 14:13:49 -07:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.