Merge branch 'mlxsw-support-cff-flood-mode'

Petr Machata says:

====================
mlxsw: Support CFF flood mode

The registers to configure to initialize a flood table differ between the
controlled and CFF flood modes. In therefore needs to be an op. Add it,
hook up the current init to the existing families, and invoke the op.

PGT is an in-HW table that maps addresses to sets of ports. Then when some
HW process needs a set of ports as an argument, instead of embedding the
actual set in the dynamic configuration, what gets configured is the
address referencing the set. The HW then works with the appropriate PGT
entry.

Among other allocations, the PGT currently contains two large blocks for
bridge flooding: one for 802.1q and one for 802.1d. Within each of these
blocks are three tables, for unknown-unicast, multicast and broadcast
flooding:

      . . . |    802.1q    |    802.1d    | . . .
            | UC | MC | BC | UC | MC | BC |
             \______ _____/ \_____ ______/
                    v             v
                   FID flood vectors

Thus each FID (which corresponds to an 802.1d bridge or one VLAN in an
802.1q bridge) uses three flood vectors spread across a fairly large region
of PGT.

This way of organizing the flood table (called "controlled") is not very
flexible. E.g. to decrease a bridge scale and store more IP MC vectors, one
would need to completely rewrite the bridge PGT blocks, or resort to hacks
such as storing individual MC flood vectors into unused part of the bridge
table.

In order to address these shortcomings, Spectrum-2 and above support what
is called CFF flood mode, for Compressed FID Flooding. In CFF flood mode,
each FID has a little table of its own, with three entries adjacent to each
other, one for unknown-UC, one for MC, one for BC. This allows for a much
more fine-grained approach to PGT management, where bits of it are
allocated on demand.

      . . . | FID | FID | FID | FID | FID | . . .
            |U|M|B|U|M|B|U|M|B|U|M|B|U|M|B|
             \_____________ _____________/
                           v
                   FID flood vectors

Besides the FID table organization, the CFF flood mode also impacts Router
Subport (RSP) table. This table contains flood vectors for rFIDs, which are
FIDs that reference front panel ports or LAGs. The RSP table contains two
entries per front panel port and LAG, one for unknown-UC traffic, and one
for everything else. Currently, the FW allocates and manages the table in
its own part of PGT. rFIDs are marked with flood_rsp bit and managed
specially. In CFF mode, rFIDs are managed as all other FIDs. The driver
therefore has to allocate and maintain the flood vectors. Like with bridge
FIDs, this is more work, but increases flexibility of the system.

The FW currently supports both the controlled and CFF flood modes. To shed
complexity, in the future it should only support CFF flood mode. Hence this
patchset, which adds CFF flood mode support to mlxsw.

Since mlxsw needs to maintain both the controlled mode as well as CFF mode
support, we will keep the layout as compatible as possible. The bridge
tables will stay in the same overall shape, just their inner organization
will change from flood mode -> FID to FID -> flood mode. Likewise will RSP
be kept as a contiguous block of PGT memory, as was the case when the FW
maintained it.

- The way FIDs get configured under the CFF flood mode differs from the
  currently used controlled mode. The simple approach of having several
  globally visible arrays for spectrum.c to statically choose from no
  longer works.

  Patch #1 thus privatizes all FID initialization and finalization logic,
  and exposes it as ops instead.

- Patch #2 renames the ops that are specific to the controlled mode, to
  make room in the namespace for the CFF variants.

  Patch #3 extracts a helper to compute flood table base out of
  mlxsw_sp_fid_flood_table_mid().

- The op fid_setup configured fid_offset, i.e. the number of this FID
  within its family. For rFIDs in CFF mode, to determine this number, the
  driver will need to do fallible queries.

  Thus in patch #4, make the FID setup operation fallible as well.

- Flood mode initialization routine differs between the controlled and CFF
  flood modes. The controlled mode needs to configure flood table layout,
  which the CFF mode does not need to do.

  In patch #5, move mlxsw_sp_fid_flood_table_init() up so that the
  following patch can make use of it.

  In patch #6, add an op to be invoked per table (if defined).

- The current way of determining PGT allocation size depends on the number
  of FIDs and number of flood tables. RFIDs however have PGT footprint
  depending not on number of FIDs, but on number of ports and LAGs, because
  which ports an rFID should flood to does not depend on the FID itself,
  but on the port or LAG that it references.

  Therefore in patch #7, add FID family ops for determining PGT allocation
  size.

- As elaborated above, layout of PGT will differ between controlled and CFF
  flood modes. In CFF mode, it will further differ between rFIDs and other
  FIDs (as described at previous patch). The way to pack the SFMR register
  to configure a FID will likewise differ from controlled to CFF.

  Thus in patches #8 and #9 add FID family ops to determine PGT base
  address for a FID and to pack SFMR.

- Patches #10 and #11 add more bits for RSP support. In patch #10, add a
  new traffic type enumerator, for non-UC traffic. This is a combination of
  BC and MC traffic, but the way that mlxsw maps these mnemonic names to
  actual traffic type configurations requires that we have a new name to
  describe this class of traffic.

  Patch #11 then adds hooks necessary for RSP table maintenance. As ports
  come and go, and join and leave LAGs, it is necessary to update flood
  vectors that the rFIDs use. These new hooks will make that possible.

- Patches #12, #13 and #14 introduce flood profiles. These have been
  implicit so far, but the way that CFF flood mode works with profile IDs
  requires that we make them explicit.

  Thus in patch #12, introduce flood profile objects as a set of flood
  tables that FID families then refer to. The FID code currently only
  uses a single flood profile.

  In patch #13, add a flood profile ID to flood profile objects.

  In patch #14, when in CFF mode, configure SFFP according to the existing
  flood profiles (or the one that exists as of that point).

- Patches #15 and #16 add code to implement, respectively, bridge FIDs and
  RSP FIDs in CFF mode.

- In patch #17, toggle flood_mode_prefer_cff on Spectrum-2 and above, which
  makes the newly-added code live.
====================

Link: https://lore.kernel.org/r/cover.1701183891.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This commit is contained in:
Jakub Kicinski 2023-11-29 20:03:27 -08:00
commit 3d6d754904
3 changed files with 727 additions and 110 deletions

View file

@ -3190,10 +3190,10 @@ static int mlxsw_sp_init(struct mlxsw_core *mlxsw_core,
goto err_lag_init;
}
err = mlxsw_sp_fids_init(mlxsw_sp);
err = mlxsw_sp->fid_core_ops->init(mlxsw_sp);
if (err) {
dev_err(mlxsw_sp->bus_info->dev, "Failed to initialize FIDs\n");
goto err_fids_init;
goto err_fid_core_init;
}
err = mlxsw_sp_policers_init(mlxsw_sp);
@ -3379,8 +3379,8 @@ static int mlxsw_sp_init(struct mlxsw_core *mlxsw_core,
err_traps_init:
mlxsw_sp_policers_fini(mlxsw_sp);
err_policers_init:
mlxsw_sp_fids_fini(mlxsw_sp);
err_fids_init:
mlxsw_sp->fid_core_ops->fini(mlxsw_sp);
err_fid_core_init:
mlxsw_sp_lag_fini(mlxsw_sp);
err_lag_init:
mlxsw_sp_pgt_fini(mlxsw_sp);
@ -3416,7 +3416,7 @@ static int mlxsw_sp1_init(struct mlxsw_core *mlxsw_core,
mlxsw_sp->router_ops = &mlxsw_sp1_router_ops;
mlxsw_sp->listeners = mlxsw_sp1_listener;
mlxsw_sp->listeners_count = ARRAY_SIZE(mlxsw_sp1_listener);
mlxsw_sp->fid_family_arr = mlxsw_sp1_fid_family_arr;
mlxsw_sp->fid_core_ops = &mlxsw_sp1_fid_core_ops;
mlxsw_sp->lowest_shaper_bs = MLXSW_REG_QEEC_LOWEST_SHAPER_BS_SP1;
mlxsw_sp->pgt_smpe_index_valid = true;
@ -3450,7 +3450,7 @@ static int mlxsw_sp2_init(struct mlxsw_core *mlxsw_core,
mlxsw_sp->router_ops = &mlxsw_sp2_router_ops;
mlxsw_sp->listeners = mlxsw_sp2_listener;
mlxsw_sp->listeners_count = ARRAY_SIZE(mlxsw_sp2_listener);
mlxsw_sp->fid_family_arr = mlxsw_sp2_fid_family_arr;
mlxsw_sp->fid_core_ops = &mlxsw_sp2_fid_core_ops;
mlxsw_sp->lowest_shaper_bs = MLXSW_REG_QEEC_LOWEST_SHAPER_BS_SP2;
mlxsw_sp->pgt_smpe_index_valid = false;
@ -3484,7 +3484,7 @@ static int mlxsw_sp3_init(struct mlxsw_core *mlxsw_core,
mlxsw_sp->router_ops = &mlxsw_sp2_router_ops;
mlxsw_sp->listeners = mlxsw_sp2_listener;
mlxsw_sp->listeners_count = ARRAY_SIZE(mlxsw_sp2_listener);
mlxsw_sp->fid_family_arr = mlxsw_sp2_fid_family_arr;
mlxsw_sp->fid_core_ops = &mlxsw_sp2_fid_core_ops;
mlxsw_sp->lowest_shaper_bs = MLXSW_REG_QEEC_LOWEST_SHAPER_BS_SP3;
mlxsw_sp->pgt_smpe_index_valid = false;
@ -3518,7 +3518,7 @@ static int mlxsw_sp4_init(struct mlxsw_core *mlxsw_core,
mlxsw_sp->router_ops = &mlxsw_sp2_router_ops;
mlxsw_sp->listeners = mlxsw_sp2_listener;
mlxsw_sp->listeners_count = ARRAY_SIZE(mlxsw_sp2_listener);
mlxsw_sp->fid_family_arr = mlxsw_sp2_fid_family_arr;
mlxsw_sp->fid_core_ops = &mlxsw_sp2_fid_core_ops;
mlxsw_sp->lowest_shaper_bs = MLXSW_REG_QEEC_LOWEST_SHAPER_BS_SP4;
mlxsw_sp->pgt_smpe_index_valid = false;
@ -3552,7 +3552,7 @@ static void mlxsw_sp_fini(struct mlxsw_core *mlxsw_core)
mlxsw_sp_devlink_traps_fini(mlxsw_sp);
mlxsw_sp_traps_fini(mlxsw_sp);
mlxsw_sp_policers_fini(mlxsw_sp);
mlxsw_sp_fids_fini(mlxsw_sp);
mlxsw_sp->fid_core_ops->fini(mlxsw_sp);
mlxsw_sp_lag_fini(mlxsw_sp);
mlxsw_sp_pgt_fini(mlxsw_sp);
mlxsw_sp_kvdl_fini(mlxsw_sp);
@ -3598,6 +3598,7 @@ static const struct mlxsw_config_profile mlxsw_sp2_config_profile = {
.used_cqe_time_stamp_type = 1,
.cqe_time_stamp_type = MLXSW_CMD_MBOX_CONFIG_PROFILE_CQE_TIME_STAMP_TYPE_UTC,
.lag_mode_prefer_sw = true,
.flood_mode_prefer_cff = true,
};
/* Reduce number of LAGs from full capacity (256) to the maximum supported LAGs
@ -3626,6 +3627,7 @@ static const struct mlxsw_config_profile mlxsw_sp4_config_profile = {
.used_cqe_time_stamp_type = 1,
.cqe_time_stamp_type = MLXSW_CMD_MBOX_CONFIG_PROFILE_CQE_TIME_STAMP_TYPE_UTC,
.lag_mode_prefer_sw = true,
.flood_mode_prefer_cff = true,
};
static void
@ -4515,6 +4517,10 @@ static int mlxsw_sp_port_lag_join(struct mlxsw_sp_port *mlxsw_sp_port,
mlxsw_sp_port->lagged = 1;
lag->ref_count++;
err = mlxsw_sp_fid_port_join_lag(mlxsw_sp_port);
if (err)
goto err_fid_port_join_lag;
/* Port is no longer usable as a router interface */
if (mlxsw_sp_port->default_vlan->fid)
mlxsw_sp_port_vlan_router_leave(mlxsw_sp_port->default_vlan);
@ -4534,6 +4540,8 @@ static int mlxsw_sp_port_lag_join(struct mlxsw_sp_port *mlxsw_sp_port,
err_replay:
mlxsw_sp_router_port_leave_lag(mlxsw_sp_port, lag_dev);
err_router_join:
mlxsw_sp_fid_port_leave_lag(mlxsw_sp_port);
err_fid_port_join_lag:
lag->ref_count--;
mlxsw_sp_port->lagged = 0;
mlxsw_core_lag_mapping_clear(mlxsw_sp->core, lag_id,
@ -4569,6 +4577,8 @@ static void mlxsw_sp_port_lag_leave(struct mlxsw_sp_port *mlxsw_sp_port,
*/
mlxsw_sp_port_lag_uppers_cleanup(mlxsw_sp_port, lag_dev);
mlxsw_sp_fid_port_leave_lag(mlxsw_sp_port);
if (lag->ref_count == 1)
mlxsw_sp_lag_destroy(mlxsw_sp, lag_id);

View file

@ -205,7 +205,7 @@ struct mlxsw_sp {
const struct mlxsw_sp_mall_ops *mall_ops;
const struct mlxsw_sp_router_ops *router_ops;
const struct mlxsw_listener *listeners;
const struct mlxsw_sp_fid_family **fid_family_arr;
const struct mlxsw_sp_fid_core_ops *fid_core_ops;
size_t listeners_count;
u32 lowest_shaper_bs;
struct rhashtable ipv6_addr_ht;
@ -252,6 +252,11 @@ struct mlxsw_sp_ptp_ops {
const struct mlxsw_tx_info *tx_info);
};
struct mlxsw_sp_fid_core_ops {
int (*init)(struct mlxsw_sp *mlxsw_sp);
void (*fini)(struct mlxsw_sp *mlxsw_sp);
};
static inline struct mlxsw_sp_upper *
mlxsw_sp_lag_get(struct mlxsw_sp *mlxsw_sp, u16 lag_id)
{
@ -508,6 +513,8 @@ enum mlxsw_sp_flood_type {
MLXSW_SP_FLOOD_TYPE_UC,
MLXSW_SP_FLOOD_TYPE_BC,
MLXSW_SP_FLOOD_TYPE_MC,
/* For RSP FIDs in CFF mode. */
MLXSW_SP_FLOOD_TYPE_NOT_UC,
};
int mlxsw_sp_port_get_stats_raw(struct net_device *dev, int grp,
@ -1321,11 +1328,11 @@ struct mlxsw_sp_fid *mlxsw_sp_fid_dummy_get(struct mlxsw_sp *mlxsw_sp);
void mlxsw_sp_fid_put(struct mlxsw_sp_fid *fid);
int mlxsw_sp_port_fids_init(struct mlxsw_sp_port *mlxsw_sp_port);
void mlxsw_sp_port_fids_fini(struct mlxsw_sp_port *mlxsw_sp_port);
int mlxsw_sp_fids_init(struct mlxsw_sp *mlxsw_sp);
void mlxsw_sp_fids_fini(struct mlxsw_sp *mlxsw_sp);
int mlxsw_sp_fid_port_join_lag(const struct mlxsw_sp_port *mlxsw_sp_port);
void mlxsw_sp_fid_port_leave_lag(const struct mlxsw_sp_port *mlxsw_sp_port);
extern const struct mlxsw_sp_fid_family *mlxsw_sp1_fid_family_arr[];
extern const struct mlxsw_sp_fid_family *mlxsw_sp2_fid_family_arr[];
extern const struct mlxsw_sp_fid_core_ops mlxsw_sp1_fid_core_ops;
extern const struct mlxsw_sp_fid_core_ops mlxsw_sp2_fid_core_ops;
/* spectrum_mr.c */
enum mlxsw_sp_mr_route_prio {

File diff suppressed because it is too large Load diff