Commit Graph

13 Commits

Author SHA1 Message Date
Carolina Jubran f9ac93b6f3 net/mlx5e: Fix mlx5e_priv_init() cleanup flow
[ Upstream commit ecb829459a ]

When mlx5e_priv_init() fails, the cleanup flow calls mlx5e_selq_cleanup which
calls mlx5e_selq_apply() that assures that the `priv->state_lock` is held using
lockdep_is_held().

Acquire the state_lock in mlx5e_selq_cleanup().

Kernel log:
=============================
WARNING: suspicious RCU usage
6.8.0-rc3_net_next_841a9b5 #1 Not tainted
-----------------------------
drivers/net/ethernet/mellanox/mlx5/core/en/selq.c:124 suspicious rcu_dereference_protected() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
2 locks held by systemd-modules/293:
 #0: ffffffffa05067b0 (devices_rwsem){++++}-{3:3}, at: ib_register_client+0x109/0x1b0 [ib_core]
 #1: ffff8881096c65c0 (&device->client_data_rwsem){++++}-{3:3}, at: add_client_context+0x104/0x1c0 [ib_core]

stack backtrace:
CPU: 4 PID: 293 Comm: systemd-modules Not tainted 6.8.0-rc3_net_next_841a9b5 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x8a/0xa0
 lockdep_rcu_suspicious+0x154/0x1a0
 mlx5e_selq_apply+0x94/0xa0 [mlx5_core]
 mlx5e_selq_cleanup+0x3a/0x60 [mlx5_core]
 mlx5e_priv_init+0x2be/0x2f0 [mlx5_core]
 mlx5_rdma_setup_rn+0x7c/0x1a0 [mlx5_core]
 rdma_init_netdev+0x4e/0x80 [ib_core]
 ? mlx5_rdma_netdev_free+0x70/0x70 [mlx5_core]
 ipoib_intf_init+0x64/0x550 [ib_ipoib]
 ipoib_intf_alloc+0x4e/0xc0 [ib_ipoib]
 ipoib_add_one+0xb0/0x360 [ib_ipoib]
 add_client_context+0x112/0x1c0 [ib_core]
 ib_register_client+0x166/0x1b0 [ib_core]
 ? 0xffffffffa0573000
 ipoib_init_module+0xeb/0x1a0 [ib_ipoib]
 do_one_initcall+0x61/0x250
 do_init_module+0x8a/0x270
 init_module_from_file+0x8b/0xd0
 idempotent_init_module+0x17d/0x230
 __x64_sys_finit_module+0x61/0xb0
 do_syscall_64+0x71/0x140
 entry_SYSCALL_64_after_hwframe+0x46/0x4e
 </TASK>

Fixes: 8bf30be750 ("net/mlx5e: Introduce select queue parameters")
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240409190820.227554-8-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-17 11:19:31 +02:00
Moshe Tal 462b005999 net/mlx5e: HTB, move htb functions to a new file
Move htb related functions and data to a separated file for better
encapsulation.

Signed-off-by: Moshe Tal <moshet@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-19 13:32:53 -07:00
Moshe Tal 3685eed56f net/mlx5e: HTB, change functions name to follow convention
Following the change of the functions to be object like, change also
the names.

Signed-off-by: Moshe Tal <moshet@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-19 13:32:52 -07:00
Moshe Tal 28df4a0117 net/mlx5e: HTB, remove priv from htb function calls
As a step to make htb self-contained replace the passing of priv as a
parameter to htb function calls with members in the htb struct.

Full decoupling the htb from priv will require more work, so for now
leave the priv as one of the members in the htb struct, to be replaced
by channels in a future commit.

Signed-off-by: Moshe Tal <moshet@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-19 13:32:52 -07:00
Moshe Tal 4f8d1d3adc net/mlx5e: HTB, move ids to selq_params struct
HTB id fields are needed for selecting queue. Moving them to the
selq_params struct will simplify synchronization between control flow
and mlx5e_select_queues and will keep the IDs in the hot cacheline of
mlx5e_selq_params.

Replace mlx5e_selq_prepare() with separate functions that change subsets
of parameters, while keeping the rest.

This also will be useful to hide mlx5e_htb structure from the rest of the
driver in a later patch in this series.

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Moshe Tal <moshet@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
2022-07-19 13:32:50 -07:00
Maxim Mikityanskiy 71753b8ec1 net/mlx5e: Optimize the common case condition in mlx5e_select_queue
Check all booleans for special queues at once, when deciding whether to
go to the fast path in mlx5e_select_queue. Pack them into bitfields to
have some room for extensibility.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-14 22:30:52 -08:00
Maxim Mikityanskiy 3a9e5fff2a net/mlx5e: Optimize modulo in mlx5e_select_queue
To improve the performance of the modulo operation (%), it's replaced by
a subtracting the divisor in a loop. The modulo is used to fix up an
out-of-bounds value that might be returned by netdev_pick_tx or to
convert the queue number to the channel number when num_tcs > 1. Both
situations are unlikely, because XPS is configured not to pick higher
queues (qid >= num_channels) by default, so under normal circumstances
the flow won't go inside the loop, and it will be faster than %.

num_tcs == 8 adds at most 7 iterations to the loop. PTP adds at most 1
iteration to the loop. HTB would add at most 256 iterations (when
num_channels == 1), so there is an additional boundary check in the HTB
flow, which falls back to % if more than 7 iterations are expected.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-14 22:30:51 -08:00
Maxim Mikityanskiy 3c87aedd48 net/mlx5e: Optimize mlx5e_select_queue
This commit optimizes mlx5e_select_queue for HTB and PTP cases by
short-cutting some checks, without sacrificing performance of the common
non-HTB non-PTP flow.

1. The HTB flow uses the fact that num_tcs == 1 to drop these checks
(it's not possible to attach both mqprio and htb as the root qdisc).
It's also enough to calculate `txq_ix % num_channels` only once, instead
of twice.

2. The PTP flow drops the check for HTB and the second calculation of
`txq_ix % num_channels`.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-14 22:30:51 -08:00
Maxim Mikityanskiy ed5f9cf06b net/mlx5e: Use READ_ONCE/WRITE_ONCE for DCBX trust state
trust_state can be written while mlx5e_select_queue() is reading it. To
avoid inconsistencies, use READ_ONCE and WRITE_ONCE for access and
updates, and touch the variable only once per operation.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-14 22:30:51 -08:00
Maxim Mikityanskiy 62f7991fea net/mlx5e: Move repeating code that gets TC prio into a function
Both mlx5e_select_queue and mlx5e_select_ptpsq contain the same logic to
get user priority of a packet, according to the current trust state
settings. This commit moves this repeating code to its own function.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-14 22:30:51 -08:00
Maxim Mikityanskiy 3ab45777a2 net/mlx5e: Use select queue parameters to sync with control flow
Start using the select queue parameters introduced in the previous
commit to have proper synchronization with changing the configuration
(such as number of channels and queues). It ensures that the state that
mlx5e_select_queue() sees is always consistent and stays the same while
the function is running. Also it allows mlx5e_select_queue to stop using
data structures that weren't synchronized properly: txq2sq,
channel_tc2realtxq, port_ptp_tc2realtxq. The last two are removed
completely, as they were used only in mlx5e_select_queue.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-14 22:30:50 -08:00
Maxim Mikityanskiy 6b23f6ab86 net/mlx5e: Move mlx5e_select_queue to en/selq.c
This commit moves mlx5e_select_queue and all stuff related to
ndo_select_queue to en/selq.c to put all stuff working with selq into a
separate file.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-14 22:30:50 -08:00
Maxim Mikityanskiy 8bf30be750 net/mlx5e: Introduce select queue parameters
ndo_select_queue can be called at any time, and there is no way to stop
the kernel from calling it to synchronize with configuration changes
(real_num_tx_queues, num_tc). This commit introduces an internal way in
mlx5e to sync mlx5e_select_queue() with these changes. The configuration
needed by this function is stored in a struct mlx5e_selq_params, which
is modified and accessed in an atomic way using RCU methods. The whole
ndo_select_queue is called under an RCU lock, providing the necessary
guarantees.

The parameters stored in the new struct mlx5e_selq_params should only be
used from inside mlx5e_select_queue. It's the minimal set of parameters
needed for mlx5e_select_queue to do its job efficiently, derived from
parameters stored elsewhere. That means that when the configuration
change, mlx5e_selq_params may need to be updated. In such cases, the
mlx5e_selq_prepare/mlx5e_selq_apply API should be used.

struct mlx5e_selq contains two slots for the params: active and standby.
mlx5e_selq_prepare updates the standby slot, and mlx5e_selq_apply swaps
the slots in a safe atomic way using the RCU API. It integrates well
with the open/activate stages of the configuration change flow.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-14 22:30:50 -08:00