Commit graph

500 commits

Author SHA1 Message Date
Vladimir Oltean
b5d6e4f6b9 net: dsa: don't error out when drivers return ETH_DATA_LEN in .port_max_mtu()
[ Upstream commit 636e8adf78 ]

Currently, when dsa_slave_change_mtu() is called on a user port where
dev->max_mtu is 1500 (as returned by ds->ops->port_max_mtu()), the code
will stumble upon this check:

	if (new_master_mtu > mtu_limit)
		return -ERANGE;

because new_master_mtu is adjusted for the tagger overhead but mtu_limit
is not.

But it would be good if the logic went through, for example if the DSA
master really depends on an MTU adjustment to accept DSA-tagged frames.

To make the code pass through the check, we need to adjust mtu_limit for
the overhead as well, if the minimum restriction was caused by the DSA
user port's MTU (dev->max_mtu). A DSA user port MTU and a DSA master MTU
are always offset by the protocol overhead.

Currently no drivers return 1500 .port_max_mtu(), but this is only
temporary and a bug in itself - mv88e6xxx should have done that, but
since commit b9c587fed6 ("dsa: mv88e6xxx: Include tagger overhead when
setting MTU for DSA and CPU ports") it no longer does. This is a
preparation for fixing that.

Fixes: bfcb813203 ("net: dsa: configure the MTU for switch ports")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-22 13:33:49 +01:00
Dan Carpenter
0a6d58a70a net: dsa: uninitialized variable in dsa_slave_netdevice_event()
Return zero if both dsa_slave_dev_check() and netdev_uses_dsa() are false.

Fixes: acc43b7bf5 ("net: dsa: allow masters to join a LAG")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-15 11:15:27 +01:00
Vladimir Oltean
cf5ca4ddc3 net: dsa: don't leave dangling pointers in dp->pl when failing
There is a desire to simplify the dsa_port registration path with
devlink, and this involves reworking a bit how user ports which fail to
connect to their PHY (because it's missing) get reinitialized as UNUSED
devlink ports.

The desire is for the change to look something like this; basically
dsa_port_setup() has failed, we just change dp->type and call
dsa_port_setup() again.

-/* Destroy the current devlink port, and create a new one which has the UNUSED
- * flavour.
- */
-static int dsa_port_reinit_as_unused(struct dsa_port *dp)
+static int dsa_port_setup_as_unused(struct dsa_port *dp)
 {
-	dsa_port_devlink_teardown(dp);
 	dp->type = DSA_PORT_TYPE_UNUSED;
-	return dsa_port_devlink_setup(dp);
+	return dsa_port_setup(dp);
 }

For an UNUSED port, dsa_port_setup() mostly only calls dsa_port_devlink_setup()
anyway, so we could get away with calling just that. But if we call the
full blown dsa_port_setup(dp) (which will be needed to properly set
dp->setup = true), the callee will have the tendency to go through this
code block too, and call dsa_port_disable(dp):

	switch (dp->type) {
	case DSA_PORT_TYPE_UNUSED:
		dsa_port_disable(dp);
		break;

That is not very good, because dsa_port_disable() has this hidden inside
of it:

	if (dp->pl)
		phylink_stop(dp->pl);

Fact is, we are not prepared to handle a call to dsa_port_disable() with
a struct dsa_port that came from a previous (and failed) call to
dsa_port_setup(). We do not clean up dp->pl, and this will make the
second call to dsa_port_setup() call phylink_stop() on a dangling dp->pl
pointer.

Solve this by creating an API for phylink destruction which is symmetric
to the phylink creation, and never leave dp->pl set to anything except
NULL or a valid phylink structure.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 18:17:16 -07:00
Vladimir Oltean
acc43b7bf5 net: dsa: allow masters to join a LAG
There are 2 ways in which a DSA user port may become handled by 2 CPU
ports in a LAG:

(1) its current DSA master joins a LAG

 ip link del bond0 && ip link add bond0 type bond mode 802.3ad
 ip link set eno2 master bond0

When this happens, all user ports with "eno2" as DSA master get
automatically migrated to "bond0" as DSA master.

(2) it is explicitly configured as such by the user

 # Before, the DSA master was eno3
 ip link set swp0 type dsa master bond0

The design of this configuration is that the LAG device dynamically
becomes a DSA master through dsa_master_setup() when the first physical
DSA master becomes a LAG slave, and stops being so through
dsa_master_teardown() when the last physical DSA master leaves.

A LAG interface is considered as a valid DSA master only if it contains
existing DSA masters, and no other lower interfaces. Therefore, we
mainly rely on method (1) to enter this configuration.

Each physical DSA master (LAG slave) retains its dev->dsa_ptr for when
it becomes a standalone DSA master again. But the LAG master also has a
dev->dsa_ptr, and this is actually duplicated from one of the physical
LAG slaves, and therefore needs to be balanced when LAG slaves come and
go.

To the switch driver, putting DSA masters in a LAG is seen as putting
their associated CPU ports in a LAG.

We need to prepare cross-chip host FDB notifiers for CPU ports in a LAG,
by calling the driver's ->lag_fdb_add method rather than ->port_fdb_add.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-20 10:32:36 +02:00
Vladimir Oltean
95f510d0b7 net: dsa: allow the DSA master to be seen and changed through rtnetlink
Some DSA switches have multiple CPU ports, which can be used to improve
CPU termination throughput, but DSA, through dsa_tree_setup_cpu_ports(),
sets up only the first one, leading to suboptimal use of hardware.

The desire is to not change the default configuration but to permit the
user to create a dynamic mapping between individual user ports and the
CPU port that they are served by, configurable through rtnetlink. It is
also intended to permit load balancing between CPU ports, and in that
case, the foreseen model is for the DSA master to be a bonding interface
whose lowers are the physical DSA masters.

To that end, we create a struct rtnl_link_ops for DSA user ports with
the "dsa" kind. We expose the IFLA_DSA_MASTER link attribute that
contains the ifindex of the newly desired DSA master.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-20 10:32:35 +02:00
Vladimir Oltean
8f6a19c031 net: dsa: introduce dsa_port_get_master()
There is a desire to support for DSA masters in a LAG.

That configuration is intended to work by simply enslaving the master to
a bonding/team device. But the physical DSA master (the LAG slave) still
has a dev->dsa_ptr, and that cpu_dp still corresponds to the physical
CPU port.

However, we would like to be able to retrieve the LAG that's the upper
of the physical DSA master. In preparation for that, introduce a helper
called dsa_port_get_master() that replaces all occurrences of the
dp->cpu_dp->master pattern. The distinction between LAG and non-LAG will
be made later within the helper itself.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-20 10:32:35 +02:00
Jakub Kicinski
880b0dd94f Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
  21234e3a84 ("net/mlx5e: Fix use after free in mlx5e_fs_init()")
  c7eafc5ed0 ("net/mlx5e: Convert ethtool_steering member of flow_steering struct to pointer")
https://lore.kernel.org/all/20220825104410.67d4709c@canb.auug.org.au/
https://lore.kernel.org/all/20220823055533.334471-1-saeed@kernel.org/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-25 16:07:42 -07:00
Vladimir Oltean
855a28f9c9 net: dsa: don't dereference NULL extack in dsa_slave_changeupper()
When a driver returns -EOPNOTSUPP in dsa_port_bridge_join() but failed
to provide a reason for it, DSA attempts to set the extack to say that
software fallback will kick in.

The problem is, when we use brctl and the legacy bridge ioctls, the
extack will be NULL, and DSA dereferences it in the process of setting
it.

Sergei Antonov proves this using the following stack trace:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
PC is at dsa_slave_changeupper+0x5c/0x158

 dsa_slave_changeupper from raw_notifier_call_chain+0x38/0x6c
 raw_notifier_call_chain from __netdev_upper_dev_link+0x198/0x3b4
 __netdev_upper_dev_link from netdev_master_upper_dev_link+0x50/0x78
 netdev_master_upper_dev_link from br_add_if+0x430/0x7f4
 br_add_if from br_ioctl_stub+0x170/0x530
 br_ioctl_stub from br_ioctl_call+0x54/0x7c
 br_ioctl_call from dev_ifsioc+0x4e0/0x6bc
 dev_ifsioc from dev_ioctl+0x2f8/0x758
 dev_ioctl from sock_ioctl+0x5f0/0x674
 sock_ioctl from sys_ioctl+0x518/0xe40
 sys_ioctl from ret_fast_syscall+0x0/0x1c

Fix the problem by only overriding the extack if non-NULL.

Fixes: 1c6e8088d9 ("net: dsa: allow port_bridge_join() to override extack message")
Link: https://lore.kernel.org/netdev/CABikg9wx7vB5eRDAYtvAm7fprJ09Ta27a4ZazC=NX5K4wn6pWA@mail.gmail.com/
Reported-by: Sergei Antonov <saproj@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Sergei Antonov <saproj@gmail.com>
Link: https://lore.kernel.org/r/20220819173925.3581871-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-23 07:54:16 -07:00
Vladimir Oltean
7136097e11 net: dsa: only bring down user ports assigned to a given DSA master
This is an adaptation of commit c0a8a9c274 ("net: dsa: automatically
bring user ports down when master goes down") for multiple DSA masters.
When a DSA master goes down, only the user ports under its control
should go down too, the others can still send/receive traffic.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-23 11:39:22 +02:00
Vladimir Oltean
4f03dcc6b9 net: dsa: existing DSA masters cannot join upper interfaces
All the traffic to/from a DSA master is supposed to be distributed among
its DSA switch upper interfaces, so we should not allow other upper
device kinds.

An exception to this is DSA_TAG_PROTO_NONE (switches with no DSA tags),
and in that case it is actually expected to create e.g. VLAN interfaces
on the master. But for those, netdev_uses_dsa(master) returns false, so
the restriction doesn't apply.

The motivation for this change is to allow LAG interfaces of DSA masters
to be DSA masters themselves. We want to restrict the user's degrees of
freedom by 1: the LAG should already have all DSA masters as lowers, and
while lower ports of the LAG can be removed, none can be added after the
fact.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-23 11:39:22 +02:00
Vladimir Oltean
920a33cd72 net: bridge: move DSA master bridging restriction to DSA
When DSA gains support for multiple CPU ports in a LAG, it will become
mandatory to monitor the changeupper events for the DSA master.

In fact, there are already some restrictions to be imposed in that area,
namely that a DSA master cannot be a bridge port except in some special
circumstances.

Centralize the restrictions at the level of the DSA layer as a
preliminary step.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-23 11:39:22 +02:00
Vladimir Oltean
0498277ee1 net: dsa: don't stop at NOTIFY_OK when calling ds->ops->port_prechangeupper
dsa_slave_prechangeupper_sanity_check() is supposed to enforce some
adjacency restrictions, and calls ds->ops->port_prechangeupper if the
driver implements it.

We convert the error code from the port_prechangeupper() call to a
notifier code, and 0 is converted to NOTIFY_OK, but the caller of
dsa_slave_prechangeupper_sanity_check() stops at any notifier code
different from NOTIFY_DONE.

Avoid this by converting back the notifier code to an error code, so
that both NOTIFY_OK and NOTIFY_DONE will be seen as 0. This allows more
parallel sanity check functions to be added.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-23 11:39:22 +02:00
Vladimir Oltean
4c3f80d22b net: dsa: walk through all changeupper notifier functions
Traditionally, DSA has had a single netdev notifier handling function
for each device type.

For the sake of code cleanliness, we would like to introduce more
handling functions which do one thing, but the conditions for entering
these functions start to overlap. Example: a handling function which
tracks whether any bridges contain both DSA and non-DSA interfaces.
Either this is placed before dsa_slave_changeupper(), case in which it
will prevent that function from executing, or we place it after
dsa_slave_changeupper(), case in which we will prevent it from
executing. The other alternative is to ignore errors from the new
handling function (not ideal).

To support this usage, we need to change the pattern. In the new model,
we enter all notifier handling sub-functions, and exit with NOTIFY_DONE
if there is nothing to do. This allows the sub-functions to be
relatively free-form and independent from each other.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-23 11:39:22 +02:00
Wolfram Sang
e4d44b3d27 dsa: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220818210216.8419-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-22 18:06:35 -07:00
Oleksij Rempel
3d410403a5 net: dsa: add get_pause_stats support
Add support for pause stats

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-29 20:17:11 -07:00
Clément Léger
67f38b1c73 net: dsa: add support for ethtool get_rmon_stats()
Add support to allow dsa drivers to specify the .get_rmon_stats()
operation.

Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-27 11:37:55 +01:00
Clément Léger
1c6e8088d9 net: dsa: allow port_bridge_join() to override extack message
Some drivers might report that they are unable to bridge ports by
returning -EOPNOTSUPP, but still wants to override extack message.
In order to do so, in dsa_slave_changeupper(), if port_bridge_join()
returns -EOPNOTSUPP, check if extack message is set and if so, do not
override it.

Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-27 11:37:54 +01:00
Eric Dumazet
9962acefbc net: adopt u64_stats_t in struct pcpu_sw_netstats
As explained in commit 316580b69d ("u64_stats: provide u64_stats_t type")
we should use u64_stats_t and related accessors to avoid load/store tearing.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 21:53:11 -07:00
Vladimir Oltean
72c3b0c735 net: dsa: felix: manage host flooding using a specific driver callback
At the time - commit 7569459a52 ("net: dsa: manage flooding on the CPU
ports") - not introducing a dedicated switch callback for host flooding
made sense, because for the only user, the felix driver, there was
nothing different to do for the CPU port than set the flood flags on the
CPU port just like on any other bridge port.

There are 2 reasons why this approach is not good enough, however.

(1) Other drivers, like sja1105, support configuring flooding as a
    function of {ingress port, egress port}, whereas the DSA
    ->port_bridge_flags() function only operates on an egress port.
    So with that driver we'd have useless host flooding from user ports
    which don't need it.

(2) Even with the felix driver, support for multiple CPU ports makes it
    difficult to piggyback on ->port_bridge_flags(). The way in which
    the felix driver is going to support host-filtered addresses with
    multiple CPU ports is that it will direct these addresses towards
    both CPU ports (in a sort of multicast fashion), then restrict the
    forwarding to only one of the two using the forwarding masks.
    Consequently, flooding will also be enabled towards both CPU ports.
    However, ->port_bridge_flags() gets passed the index of a single CPU
    port, and that leaves the flood settings out of sync between the 2
    CPU ports.

This is to say, it's better to have a specific driver method for host
flooding, which takes the user port as argument. This solves problem (1)
by allowing the driver to do different things for different user ports,
and problem (2) by abstracting the operation and letting the driver do
whatever, rather than explicitly making the DSA core point to the CPU
port it thinks needs to be touched.

This new method also creates a problem, which is that cross-chip setups
are not handled. However I don't have hardware right now where I can
test what is the proper thing to do, and there isn't hardware compatible
with multi-switch trees that supports host flooding. So it remains a
problem to be tackled in the future.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-12 16:38:55 -07:00
Jakub Kicinski
0e55546b18 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
include/linux/netdevice.h
net/core/dev.c
  6510ea973d ("net: Use this_cpu_inc() to increment net->core_stats")
  794c24e992 ("net-core: rx_otherhost_dropped to core_stats")
https://lore.kernel.org/all/20220428111903.5f4304e0@canb.auug.org.au/

drivers/net/wan/cosa.c
  d48fea8401 ("net: cosa: fix error check return value of register_chrdev()")
  89fbca3307 ("net: wan: remove support for COSA and SRP synchronous serial boards")
https://lore.kernel.org/all/20220428112130.1f689e5e@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28 13:02:01 -07:00
Vladimir Oltean
7c762e70c5 net: dsa: flood multicast to CPU when slave has IFF_PROMISC
Certain DSA switches can eliminate flooding to the CPU when none of the
ports have the IFF_ALLMULTI or IFF_PROMISC flags set. This is done by
synthesizing a call to dsa_port_bridge_flags() for the CPU port, a call
which normally comes from the bridge driver via switchdev.

The bridge port flags and IFF_PROMISC|IFF_ALLMULTI have slightly
different semantics, and due to inattention/lack of proper testing, the
IFF_PROMISC flag allows unknown unicast to be flooded to the CPU, but
not unknown multicast.

This must be fixed by setting both BR_FLOOD (unicast) and BR_MCAST_FLOOD
in the synthesized dsa_port_bridge_flags() call, since IFF_PROMISC means
that packets should not be filtered regardless of their MAC DA.

Fixes: 7569459a52 ("net: dsa: manage flooding on the CPU ports")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25 11:46:24 +01:00
Vladimir Oltean
be6ff9665d net: dsa: don't emit targeted cross-chip notifiers for MTU change
A cross-chip notifier with "targeted_match=true" is one that matches
only the local port of the switch that emitted it. In other words,
passing through the cross-chip notifier layer serves no purpose.

Eliminate this concept by calling directly ds->ops->port_change_mtu
instead of emitting a targeted cross-chip notifier. This leaves the
DSA_NOTIFIER_MTU event being emitted only for MTU updates on the CPU
port, which need to be reflected also across all DSA links.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:34:34 +01:00
Vladimir Oltean
4715029fa7 net: dsa: drop dsa_slave_priv from dsa_slave_change_mtu
We can get a hold of the "ds" pointer directly from "dp", no need for
the dsa_slave_priv.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:34:34 +01:00
Vladimir Oltean
cf1c39d3b3 net: dsa: avoid one dsa_to_port() in dsa_slave_change_mtu
We could retrieve the cpu_dp pointer directly from the "dp" we already
have, no need to resort to dsa_to_port(ds, port).

This change also removes the need for an "int port", so that is also
deleted.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:34:34 +01:00
Vladimir Oltean
b2033a05a7 net: dsa: use dsa_tree_for_each_user_port in dsa_slave_change_mtu
Use the more conventional iterator over user ports instead of explicitly
ignoring them, and use the more conventional name "other_dp" instead of
"dp_iter", for readability.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:34:34 +01:00
Vladimir Oltean
5077e2c8cf net: dsa: fix missing host-filtered multicast addresses
DSA ports are stacked devices, so they use dev_mc_add() to sync their
address list to their lower interface (DSA master). But they are also
hardware devices, so they program those addresses to hardware using the
__dev_mc_add() sync and unsync callbacks.

Unfortunately both cannot work at the same time, and it seems that the
multicast addresses which are already present on the DSA master, like
33:33:00:00:00:01 (added by addrconf.c as in6addr_linklocal_allnodes)
are synced to the master via dev_mc_sync(), but not to hardware by
__dev_mc_sync().

This happens because both the dev_mc_sync() -> __hw_addr_sync_one()
code path, as well as __dev_mc_sync() -> __hw_addr_sync_dev(), operate
on the same variable: ha->sync_cnt, in a way that causes the "sync"
method (dsa_slave_sync_mc) to no longer be called.

To fix the issue we need to work with the API in the way in which it was
intended to be used, and therefore, call dev_uc_add() and friends for
each individual hardware address, from the sync and unsync callbacks.

Fixes: 5e8a1e03aa ("net: dsa: install secondary unicast and multicast addresses as host FDB/MDB")
Link: https://lore.kernel.org/netdev/20220321163213.lrn5sk7m6grighbl@skbuf/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220322003701.2056895-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-22 22:19:35 -07:00
Vladimir Oltean
0148bb50b8 net: dsa: pass extack to dsa_switch_ops :: port_mirror_add()
Drivers might have error messages to propagate to user space, most
common being that they support a single mirror port.

Propagate the netlink extack so that they can inform user space in a
verbal way of their limitations.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 17:42:47 -07:00
Tobias Waldekranz
7414af30b7 net: dsa: Handle MST state changes
Add the usual trampoline functionality from the generic DSA layer down
to the drivers for MST state changes.

When a state changes to disabled/blocking/listening, make sure to fast
age any dynamic entries in the affected VLANs (those controlled by the
MSTI in question).

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 16:49:59 -07:00
Tobias Waldekranz
8e6598a7b0 net: dsa: Pass VLAN MSTI migration notifications to driver
Add the usual trampoline functionality from the generic DSA layer down
to the drivers for VLAN MSTI migrations.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 16:49:59 -07:00
Tobias Waldekranz
332afc4c8c net: dsa: Validate hardware support for MST
When joining a bridge where MST is enabled, we validate that the
proper offloading support is in place, otherwise we fallback to
software bridging.

When then mode is changed on a bridge in which we are members, we
refuse the change if offloading is not supported.

At the moment we only check for configurable learning, but this will
be further restricted as we support more MST related switchdev events.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 16:49:58 -07:00
Tobias Waldekranz
a860352e9d net: dsa: Never offload FDB entries on standalone ports
If a port joins a bridge that it can't offload, it will fallback to
standalone mode and software bridging. In this case, we never want to
offload any FDB entries to hardware either.

Previously, for host addresses, we would eventually end up in
dsa_port_bridge_host_fdb_add, which would unconditionally dereference
dp->bridge and cause a segfault.

Fixes: c26933639b ("net: dsa: request drivers to perform FDB isolation")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220315233033.1468071-1-tobias@waldekranz.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-16 19:36:56 -07:00
Vladimir Oltean
47d75f7822 net: dsa: report and change port dscp priority using dcbnl
Similar to the port-based default priority, IEEE 802.1Q-2018 allows the
Application Priority Table to define QoS classes (0 to 7) per IP DSCP
value (0 to 63).

In the absence of an app table entry for a packet with DSCP value X,
QoS classification for that packet falls back to other methods (VLAN PCP
or port-based default). The presence of an app table for DSCP value X
with priority Y makes the hardware classify the packet to QoS class Y.

As opposed to the default-prio where DSA exposes only a "set" in
dsa_switch_ops (because the port-based default is the fallback, it
always exists, either implicitly or explicitly), for DSCP priorities we
expose an "add" and a "del". The addition of a DSCP entry means trusting
that DSCP priority, the deletion means ignoring it.

Drivers that already trust (at least some) DSCP values can describe
their configuration in dsa_switch_ops :: port_get_dscp_prio(), which is
called for each DSCP value from 0 to 63.

Again, there can be more than one dcbnl app table entry for the same
DSCP value, DSA chooses the one with the largest configured priority.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-14 10:36:15 +00:00
Vladimir Oltean
d538eca85c net: dsa: report and change port default priority using dcbnl
The port-based default QoS class is assigned to packets that lack a
VLAN PCP (or the port is configured to not trust the VLAN PCP),
an IP DSCP (or the port is configured to not trust IP DSCP), and packets
on which no tc-skbedit action has matched.

Similar to other drivers, this can be exposed to user space using the
DCB Application Priority Table. IEEE 802.1Q-2018 specifies in Table
D-8 - Sel field values that when the Selector is 1, the Protocol ID
value of 0 denotes the "Default application priority. For use when
application priority is not otherwise specified."

The way in which the dcbnl integration in DSA has been designed has to
do with its requirements. Andrew Lunn explains that SOHO switches are
expected to come with some sort of pre-configured QoS profile, and that
it is desirable for this to come pre-loaded into the DSA slave interfaces'
DCB application priority table.

In the dcbnl design, this is possible because calls to dcb_ieee_setapp()
can be initiated by anyone including being self-initiated by this device
driver.

However, what makes this challenging to implement in DSA is that the DSA
core manages the net_devices (effectively hiding them from drivers),
while drivers manage the hardware. The DSA core has no knowledge of what
individual drivers' QoS policies are. DSA could export to drivers a
wrapper over dcb_ieee_setapp() and these could call that function to
pre-populate the app priority table, however drivers don't have a good
moment in time to do this. The dsa_switch_ops :: setup() method gets
called before the net_devices are created (dsa_slave_create), and so is
dsa_switch_ops :: port_setup(). What remains is dsa_switch_ops ::
port_enable(), but this gets called upon each ndo_open. If we add app
table entries on every open, we'd need to remove them on close, to avoid
duplicate entry errors. But if we delete app priority entries on close,
what we delete may not be the initial, driver pre-populated entries, but
rather user-added entries.

So it is clear that letting drivers choose the timing of the
dcb_ieee_setapp() call is inappropriate. The alternative which was
chosen is to introduce hardware-specific ops in dsa_switch_ops, and
effectively hide dcbnl details from drivers as well. For pre-populating
the application table, dsa_slave_dcbnl_init() will call
ds->ops->port_get_default_prio() which is supposed to read from
hardware. If the operation succeeds, DSA creates a default-prio app
table entry. The method is called as soon as the slave_dev is
registered, but before we release the rtnl_mutex. This is done such that
user space sees the app table entries as soon as it sees the interface
being registered.

The fact that we populate slave_dev->dcbnl_ops with a non-NULL pointer
changes behavior in dcb_doit() from net/dcb/dcbnl.c, which used to
return -EOPNOTSUPP for any dcbnl operation where netdev->dcbnl_ops is
NULL. Because there are still dcbnl-unaware DSA drivers even if they
have dcbnl_ops populated, the way to restore the behavior is to make all
dcbnl_ops return -EOPNOTSUPP on absence of the hardware-specific
dsa_switch_ops method.

The dcbnl framework absurdly allows there to be more than one app table
entry for the same selector and protocol (in other words, more than one
port-based default priority). In the iproute2 dcb program, there is a
"replace" syntactical sugar command which performs an "add" and a "del"
to hide this away. But we choose the largest configured priority when we
call ds->ops->port_set_default_prio(), using __fls(). When there is no
default-prio app table entry left, the port-default priority is restored
to 0.

Link: https://patchwork.kernel.org/project/netdevbpf/patch/20210113154139.1803705-2-olteanv@gmail.com/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-14 10:36:15 +00:00
Vladimir Oltean
e2d0576f0c net: dsa: be mostly no-op in dsa_slave_set_mac_address when down
Since the slave unicast address is synced to hardware and to the DSA
master during dsa_slave_open(), this means that a call to
dsa_slave_set_mac_address() while the slave interface is down will
result to a call to dsa_port_standalone_host_fdb_del() and to
dev_uc_del() for the MAC address while there was no previous
dsa_port_standalone_host_fdb_add() or dev_uc_add().

This is a partial revert of the blamed commit below, which was too
aggressive.

Fixes: 35aae5ab91 ("net: dsa: remove workarounds for changing master promisc/allmulti only while up")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-09 11:12:09 +00:00
Vladimir Oltean
7569459a52 net: dsa: manage flooding on the CPU ports
DSA can treat IFF_PROMISC and IFF_ALLMULTI on standalone user ports as
signifying whether packets with an unknown MAC DA will be received or
not. Since known MAC DAs are handled by FDB/MDB entries, this means that
promiscuity is analogous to including/excluding the CPU port from the
flood domain of those packets.

There are two ways to signal CPU flooding to drivers.

The first (chosen here) is to synthesize a call to
ds->ops->port_bridge_flags() for the CPU port, with a mask of
BR_FLOOD | BR_MCAST_FLOOD. This has the effect of turning on egress
flooding on the CPU port regardless of source.

The alternative would be to create a new ds->ops->port_host_flood()
which is called per user port. Some switches (sja1105) have a flood
domain that is managed per {ingress port, egress port} pair, so it would
make more sense for this kind of switch to not flood the CPU from port A
if just port B requires it. Nonetheless, the sja1105 has other quirks
that prevent it from making use of unicast filtering, and without a
concrete user making use of this feature, I chose not to implement it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03 14:15:31 +00:00
Vladimir Oltean
499aa9e1b3 net: dsa: install the primary unicast MAC address as standalone port host FDB
To be able to safely turn off CPU flooding for standalone ports, we need
to ensure that the dev_addr of each DSA slave interface is installed as
a standalone host FDB entry for compatible switches.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03 14:15:31 +00:00
Vladimir Oltean
5e8a1e03aa net: dsa: install secondary unicast and multicast addresses as host FDB/MDB
In preparation of disabling flooding towards the CPU in standalone ports
mode, identify the addresses requested by upper interfaces and use the
new API for DSA FDB isolation to request the hardware driver to offload
these as FDB or MDB objects. The objects belong to the user port's
database, and are installed pointing towards the CPU port.

Because dev_uc_add()/dev_mc_add() is VLAN-unaware, we offload to the
port standalone database addresses with VID 0 (also VLAN-unaware).
So this excludes switches with global VLAN filtering from supporting
unicast filtering, because there, it is possible for a port of a switch
to join a VLAN-aware bridge, and this changes the VLAN awareness of
standalone ports, requiring VLAN-aware standalone host FDB entries.
For the same reason, hellcreek, which requires VLAN awareness in
standalone mode, is also exempted from unicast filtering.

We create "standalone" variants of dsa_port_host_fdb_add() and
dsa_port_host_mdb_add() (and the _del coresponding functions).

We also create a separate work item type for handling deferred
standalone host FDB/MDB entries compared to the switchdev one.
This is done for the purpose of clarity - the procedure for offloading a
bridge FDB entry is different than offloading a standalone one, and
the switchdev event work handles only FDBs anyway, not MDBs.
Deferral is needed for standalone entries because ndo_set_rx_mode runs
in atomic context. We could probably optimize things a little by first
queuing up all entries that need to be offloaded, and scheduling the
work item just once, but the data structures that we can pass through
__dev_uc_sync() and __dev_mc_sync() are limiting (there is nothing like
a void *priv), so we'd have to keep the list of queued events somewhere
in struct dsa_switch, and possibly a lock for it. Too complicated for
now.

Adding the address to the master is handled by dev_uc_sync(), adding it
to the hardware is handled by __dev_uc_sync(). So this is the reason why
dsa_port_standalone_host_fdb_add() does not call dev_uc_add(). Not that
it had the rtnl_mutex anyway - ndo_set_rx_mode has it, but is atomic.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03 14:15:31 +00:00
Vladimir Oltean
68d6d71eaf net: dsa: rename the host FDB and MDB methods to contain the "bridge" namespace
We are preparing to add API in port.c that adds FDB and MDB entries that
correspond to the port's standalone database. Rename the existing
methods to make it clear that the FDB and MDB entries offloaded come
from the bridge database.

Since the function names lengthen in dsa_slave_switchdev_event_work(),
we place "addr" and "vid" in temporary variables, to shorten those.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03 14:15:31 +00:00
Vladimir Oltean
35aae5ab91 net: dsa: remove workarounds for changing master promisc/allmulti only while up
Lennert Buytenhek explains in commit df02c6ff2e ("dsa: fix master
interface allmulti/promisc handling"), dated Nov 2008, that changing the
promiscuity of interfaces that are down (here the master) is broken.

This fact regarding promisc/allmulti has changed since commit
b6c40d68ff ("net: only invoke dev->change_rx_flags when device is UP")
by Vlad Yasevich, dated Nov 2013.

Therefore, DSA now has unnecessary complexity to handle master state
transitions from down to up. In fact, syncing the unicast and multicast
addresses can happen completely asynchronously to the administrative
state changes.

This change reduces that complexity by effectively fully reverting
commit df02c6ff2e ("dsa: fix master interface allmulti/promisc
handling").

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03 14:15:31 +00:00
Vladimir Oltean
e212fa7c54 net: dsa: support FDB events on offloaded LAG interfaces
This change introduces support for installing static FDB entries towards
a bridge port that is a LAG of multiple DSA switch ports, as well as
support for filtering towards the CPU local FDB entries emitted for LAG
interfaces that are bridge ports.

Conceptually, host addresses on LAG ports are identical to what we do
for plain bridge ports. Whereas FDB entries _towards_ a LAG can't simply
be replicated towards all member ports like we do for multicast, or VLAN.
Instead we need new driver API. Hardware usually considers a LAG to be a
"logical port", and sets the entire LAG as the forwarding destination.
The physical egress port selection within the LAG is made by hashing
policy, as usual.

To represent the logical port corresponding to the LAG, we pass by value
a copy of the dsa_lag structure to all switches in the tree that have at
least one port in that LAG.

To illustrate why a refcounted list of FDB entries is needed in struct
dsa_lag, it is enough to say that:
- a LAG may be a bridge port and may therefore receive FDB events even
  while it isn't yet offloaded by any DSA interface
- DSA interfaces may be removed from a LAG while that is a bridge port;
  we don't want FDB entries lingering around, but we don't want to
  remove entries that are still in use, either

For all the cases below to work, the idea is to always keep an FDB entry
on a LAG with a reference count equal to the DSA member ports. So:
- if a port joins a LAG, it requests the bridge to replay the FDB, and
  the FDB entries get created, or their refcount gets bumped by one
- if a port leaves a LAG, the FDB replay deletes or decrements refcount
  by one
- if an FDB is installed towards a LAG with ports already present, that
  entry is created (if it doesn't exist) and its refcount is bumped by
  the amount of ports already present in the LAG

echo "Adding FDB entry to bond with existing ports"
ip link del bond0
ip link add bond0 type bond mode 802.3ad
ip link set swp1 down && ip link set swp1 master bond0 && ip link set swp1 up
ip link set swp2 down && ip link set swp2 master bond0 && ip link set swp2 up
ip link del br0
ip link add br0 type bridge
ip link set bond0 master br0
bridge fdb add dev bond0 00:01:02:03:04:05 master static

ip link del br0
ip link del bond0

echo "Adding FDB entry to empty bond"
ip link del bond0
ip link add bond0 type bond mode 802.3ad
ip link del br0
ip link add br0 type bridge
ip link set bond0 master br0
bridge fdb add dev bond0 00:01:02:03:04:05 master static
ip link set swp1 down && ip link set swp1 master bond0 && ip link set swp1 up
ip link set swp2 down && ip link set swp2 master bond0 && ip link set swp2 up

ip link del br0
ip link del bond0

echo "Adding FDB entry to empty bond, then removing ports one by one"
ip link del bond0
ip link add bond0 type bond mode 802.3ad
ip link del br0
ip link add br0 type bridge
ip link set bond0 master br0
bridge fdb add dev bond0 00:01:02:03:04:05 master static
ip link set swp1 down && ip link set swp1 master bond0 && ip link set swp1 up
ip link set swp2 down && ip link set swp2 master bond0 && ip link set swp2 up

ip link set swp1 nomaster
ip link set swp2 nomaster
ip link del br0
ip link del bond0

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 21:31:44 -08:00
Vladimir Oltean
93c798230a net: dsa: call SWITCHDEV_FDB_OFFLOADED for the orig_dev
When switchdev_handle_fdb_event_to_device() replicates a FDB event
emitted for the bridge or for a LAG port and DSA offloads that, we
should notify back to switchdev that the FDB entry on the original
device is what was offloaded, not on the DSA slave devices that the
event is replicated on.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 21:31:43 -08:00
Vladimir Oltean
e35f12e993 net: dsa: remove "ds" and "port" from struct dsa_switchdev_event_work
By construction, the struct net_device *dev passed to
dsa_slave_switchdev_event_work() via struct dsa_switchdev_event_work
is always a DSA slave device.

Therefore, it is redundant to pass struct dsa_switch and int port
information in the deferred work structure. This can be retrieved at all
times from the provided struct net_device via dsa_slave_to_port().

For the same reason, we can drop the dsa_is_user_port() check in
dsa_fdb_offload_notify().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 21:31:43 -08:00
Vladimir Oltean
ec638740fc net: switchdev: remove lag_mod_cb from switchdev_handle_fdb_event_to_device
When the switchdev_handle_fdb_event_to_device() event replication helper
was created, my original thought was that FDB events on LAG interfaces
should most likely be special-cased, not just replicated towards all
switchdev ports beneath that LAG. So this replication helper currently
does not recurse through switchdev lower interfaces of LAG bridge ports,
but rather calls the lag_mod_cb() if that was provided.

No switchdev driver uses this helper for FDB events on LAG interfaces
yet, so that was an assumption which was yet to be tested. It is
certainly usable for that purpose, as my RFC series shows:

https://patchwork.kernel.org/project/netdevbpf/cover/20220210125201.2859463-1-vladimir.oltean@nxp.com/

however this approach is slightly convoluted because:

- the switchdev driver gets a "dev" that isn't its own net device, but
  rather the LAG net device. It must call switchdev_lower_dev_find(dev)
  in order to get a handle of any of its own net devices (the ones that
  pass check_cb).

- in order for FDB entries on LAG ports to be correctly refcounted per
  the number of switchdev ports beneath that LAG, we haven't escaped the
  need to iterate through the LAG's lower interfaces. Except that is now
  the responsibility of the switchdev driver, because the replication
  helper just stopped half-way.

So, even though yes, FDB events on LAG bridge ports must be
special-cased, in the end it's simpler to let switchdev_handle_fdb_*
just iterate through the LAG port's switchdev lowers, and let the
switchdev driver figure out that those physical ports are under a LAG.

The switchdev_handle_fdb_event_to_device() helper takes a
"foreign_dev_check" callback so it can figure out whether @dev can
autonomously forward to @foreign_dev. DSA fills this method properly:
if the LAG is offloaded by another port in the same tree as @dev, then
it isn't foreign. If it is a software LAG, it is foreign - forwarding
happens in software.

Whether an interface is foreign or not decides whether the replication
helper will go through the LAG's switchdev lowers or not. Since the
lan966x doesn't properly fill this out, FDB events on software LAG
uppers will get called. By changing lan966x_foreign_dev_check(), we can
suppress them.

Whereas DSA will now start receiving FDB events for its offloaded LAG
uppers, so we need to return -EOPNOTSUPP, since we currently don't do
the right thing for them.

Cc: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 21:31:43 -08:00
Vladimir Oltean
dedd6a009f net: dsa: create a dsa_lag structure
The main purpose of this change is to create a data structure for a LAG
as seen by DSA. This is similar to what we have for bridging - we pass a
copy of this structure by value to ->port_lag_join and ->port_lag_leave.
For now we keep the lag_dev, id and a reference count in it. Future
patches will add a list of FDB entries for the LAG (these also need to
be refcounted to work properly).

The LAG structure is created using dsa_port_lag_create() and destroyed
using dsa_port_lag_destroy(), just like we have for bridging.

Because now, the dsa_lag itself is refcounted, we can simplify
dsa_lag_map() and dsa_lag_unmap(). These functions need to keep a LAG in
the dst->lags array only as long as at least one port uses it. The
refcounting logic inside those functions can be removed now - they are
called only when we should perform the operation.

dsa_lag_dev() is renamed to dsa_lag_by_id() and now returns the dsa_lag
structure instead of the lag_dev net_device.

dsa_lag_foreach_port() now takes the dsa_lag structure as argument.

dst->lags holds an array of dsa_lag structures.

dsa_lag_map() now also saves the dsa_lag->id value, so that linear
walking of dst->lags in drivers using dsa_lag_id() is no longer
necessary. They can just look at lag.id.

dsa_port_lag_id_get() is a helper, similar to dsa_port_bridge_num_get(),
which can be used by drivers to get the LAG ID assigned by DSA to a
given port.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 21:31:43 -08:00
Vladimir Oltean
164f861bd4 net: dsa: offload bridge port VLANs on foreign interfaces
DSA now explicitly handles VLANs installed with the 'self' flag on the
bridge as host VLANs, instead of just replicating every bridge port VLAN
also on the CPU port and never deleting it, which is what it did before.

However, this leaves a corner case uncovered, as explained by
Tobias Waldekranz:
https://patchwork.kernel.org/project/netdevbpf/patch/20220209213044.2353153-6-vladimir.oltean@nxp.com/#24735260

Forwarding towards a bridge port VLAN installed on a bridge port foreign
to DSA (separate NIC, Wi-Fi AP) used to work by virtue of the fact that
DSA itself needed to have at least one port in that VLAN (therefore, it
also had the CPU port in said VLAN). However, now that the CPU port may
not be member of all VLANs that user ports are members of, we need to
ensure this isn't the case if software forwarding to a foreign interface
is required.

The solution is to treat bridge port VLANs on standalone interfaces in
the exact same way as host VLANs. From DSA's perspective, there is no
difference between local termination and software forwarding; packets in
that VLAN must reach the CPU in both cases.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-16 11:21:05 +00:00
Vladimir Oltean
134ef2388e net: dsa: add explicit support for host bridge VLANs
Currently, DSA programs VLANs on shared (DSA and CPU) ports each time it
does so on user ports. This is good for basic functionality but has
several limitations:

- the VLAN group which must reach the CPU may be radically different
  from the VLAN group that must be autonomously forwarded by the switch.
  In other words, the admin may want to isolate noisy stations and avoid
  traffic from them going to the control processor of the switch, where
  it would just waste useless cycles. The bridge already supports
  independent control of VLAN groups on bridge ports and on the bridge
  itself, and when VLAN-aware, it will drop packets in software anyway
  if their VID isn't added as a 'self' entry towards the bridge device.

- Replaying host FDB entries may depend, for some drivers like mv88e6xxx,
  on replaying the host VLANs as well. The 2 VLAN groups are
  approximately the same in most regular cases, but there are corner
  cases when timing matters, and DSA's approximation of replicating
  VLANs on shared ports simply does not work.

- If a user makes the bridge (implicitly the CPU port) join a VLAN by
  accident, there is no way for the CPU port to isolate itself from that
  noisy VLAN except by rebooting the system. This is because for each
  VLAN added on a user port, DSA will add it on shared ports too, but
  for each VLAN deletion on a user port, it will remain installed on
  shared ports, since DSA has no good indication of whether the VLAN is
  still in use or not.

Now that the bridge driver emits well-balanced SWITCHDEV_OBJ_ID_PORT_VLAN
addition and removal events, DSA has a simple and straightforward task
of separating the bridge port VLANs (these have an orig_dev which is a
DSA slave interface, or a LAG interface) from the host VLANs (these have
an orig_dev which is a bridge interface), and to keep a simple reference
count of each VID on each shared port.

Forwarding VLANs must be installed on the bridge ports and on all DSA
ports interconnecting them. We don't have a good view of the exact
topology, so we simply install forwarding VLANs on all DSA ports, which
is what has been done until now.

Host VLANs must be installed primarily on the dedicated CPU port of each
bridge port. More subtly, they must also be installed on upstream-facing
and downstream-facing DSA ports that are connecting the bridge ports and
the CPU. This ensures that the mv88e6xxx's problem (VID of host FDB
entry may be absent from VTU) is still addressed even if that switch is
in a cross-chip setup, and it has no local CPU port.

Therefore:
- user ports contain only bridge port (forwarding) VLANs, and no
  refcounting is necessary
- DSA ports contain both forwarding and host VLANs. Refcounting is
  necessary among these 2 types.
- CPU ports contain only host VLANs. Refcounting is also necessary.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-16 11:21:05 +00:00
Vladimir Oltean
ddb44bdcde net: dsa: remove lockdep class for DSA slave address list
Since commit 2f1e8ea726 ("net: dsa: link interfaces with the DSA
master to get rid of lockdep warnings"), suggested by Cong Wang, the
DSA interfaces and their master have different dev->nested_level, which
makes netif_addr_lock() stop complaining about potentially recursive
locking on the same lock class.

So we no longer need DSA slave interfaces to have their own lockdep
class.

Cc: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-11 11:17:33 +00:00
Vladimir Oltean
45b987d5ed net: dsa: remove ndo_get_phys_port_name and ndo_get_port_parent_id
There are no legacy ports, DSA registers a devlink instance with ports
unconditionally for all switch drivers. Therefore, delete the old-style
ndo operations used for determining bridge forwarding domains.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-11 11:17:32 +00:00
Vladimir Oltean
295ab96f47 net: dsa: provide switch operations for tracking the master state
Certain drivers may need to send management traffic to the switch for
things like register access, FDB dump, etc, to accelerate what their
slow bus (SPI, I2C, MDIO) can already do.

Ethernet is faster (especially in bulk transactions) but is also more
unreliable, since the user may decide to bring the DSA master down (or
not bring it up), therefore severing the link between the host and the
attached switch.

Drivers needing Ethernet-based register access already should have
fallback logic to the slow bus if the Ethernet method fails, but that
fallback may be based on a timeout, and the I/O to the switch may slow
down to a halt if the master is down, because every Ethernet packet will
have to time out. The driver also doesn't have the option to turn off
Ethernet-based I/O momentarily, because it wouldn't know when to turn it
back on.

Which is where this change comes in. By tracking NETDEV_CHANGE,
NETDEV_UP and NETDEV_GOING_DOWN events on the DSA master, we should know
the exact interval of time during which this interface is reliably
available for traffic. Provide this information to switches so they can
use it as they wish.

An helper is added dsa_port_master_is_operational() to check if a master
port is operational.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-02 14:43:59 +00:00
Vladimir Oltean
e31dbd3b6a net: dsa: merge rtnl_lock sections in dsa_slave_create
Currently dsa_slave_create() has two sequences of rtnl_lock/rtnl_unlock
in a row. Remove the rtnl_unlock() and rtnl_lock() in between, such that
the operation can execute slighly faster.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-06 11:59:09 +00:00