Commit graph

45905 commits

Author SHA1 Message Date
Tom Rix
51aaa68222 net: alteon: remove unused len variable
clang with W=1 reports
drivers/net/ethernet/alteon/acenic.c:2438:10: error: variable
  'len' set but not used [-Werror,-Wunused-but-set-variable]
                int i, len = 0;
                       ^
This variable is not used so remove it.
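
For reference, the warning pattern in miniature; a self-contained
illustration (not the driver code) that reproduces the diagnostic:

  /* clang -Wunused-but-set-variable (enabled by the kernel's W=1 build) */
  int sum(const int *v, int n)
  {
          int i, len = 0;         /* 'len' is written but never read */
          int total = 0;

          for (i = 0; i < n; i++) {
                  len += 1;       /* dead store */
                  total += v[i];
          }
          return total;           /* deleting 'len' silences the warning */
  }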

Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-02 13:43:43 +01:00
Ido Schimmel
cc19439f70 mlxsw: core_thermal: Simplify transceiver module get_temp() callback
The get_temp() callback of a thermal zone associated with a transceiver
module no longer needs to read the temperature thresholds of the module.
Therefore, simplify the callback by only reading the temperature.
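
A sketch of the resulting callback shape; the helper and field names are
assumptions for illustration, not the verbatim mlxsw code:

  static int mlxsw_thermal_module_temp_get(struct thermal_zone_device *tzdev,
                                           int *p_temp)
  {
          struct mlxsw_thermal_module *module_tz = tzdev->devdata;

          /* Read only the current temperature; the thresholds are no
           * longer consulted here.
           */
          return mlxsw_env_module_temp_get(module_tz->core,
                                           module_tz->module, p_temp);
  }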

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-02 13:42:30 +01:00
Ido Schimmel
c1536d856e mlxsw: core_thermal: Make mlxsw_thermal_module_init() void
The function can no longer fail so make it void and remove the
associated error path.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-02 13:42:30 +01:00
Ido Schimmel
5601ef91fb mlxsw: core_thermal: Use static trip points for transceiver modules
The driver registers a thermal zone for each transceiver module and
tries to set the trip point temperatures according to the thresholds
read from the transceiver. If a threshold cannot be read or if a
transceiver is unplugged, the trip point temperature is set to zero,
which means that it is disabled as far as the thermal subsystem is
concerned.

A recent change in the thermal core made it so that such trip points are
no longer marked as disabled, which led the thermal subsystem to
incorrectly set the associated cooling devices to their maximum state
[1]. A fix to restore this behavior was merged in commit f1b80a3878
("thermal: core: Restore behavior regarding invalid trip points").
However, the thermal maintainer suggested not relying on this behavior
and instead always registering a valid array of trip points [2].

Therefore, create a static array of trip points with sane defaults
(suggested by Vadim) and register it with the thermal zone of each
transceiver module. User space can choose to override these defaults
through the thermal zone sysfs interface, since these files are writable.
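
A sketch of what such a static trip array can look like, using the default
temperatures from the output below (struct thermal_trip is the real thermal
core type; the trip types chosen here are illustrative):

  static const struct thermal_trip mlxsw_thermal_module_trips[] = {
          { .type = THERMAL_TRIP_ACTIVE, .temperature = 55000 },
          { .type = THERMAL_TRIP_ACTIVE, .temperature = 65000 },
          { .type = THERMAL_TRIP_HOT,    .temperature = 80000 },
  };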

Before:

 $ cat /sys/class/thermal/thermal_zone11/type
 mlxsw-module11
 $ cat /sys/class/thermal/thermal_zone11/trip_point_*_temp
 65000
 75000
 80000

After:

 $ cat /sys/class/thermal/thermal_zone11/type
 mlxsw-module11
 $ cat /sys/class/thermal/thermal_zone11/trip_point_*_temp
 55000
 65000
 80000

Also tested by reverting commit f1b80a3878 ("thermal: core: Restore
behavior regarding invalid trip points") and making sure that the
associated cooling devices are not set to their maximum state.

[1] https://lore.kernel.org/linux-pm/ZA3CFNhU4AbtsP4G@shredder/
[2] https://lore.kernel.org/linux-pm/f78e6b70-a963-c0ca-a4b2-0d4c6aeef1fb@linaro.org/

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-02 13:42:30 +01:00
Sylwester Dziedziuch
ceb29474bb i40e: Add support for VF to specify its primary MAC address
Currently the i40e driver does not handle MAC addresses differently
depending on whether they are legacy or primary. Introduce new checks so
that a VF can specify its primary MAC address based on the
VIRTCHNL_ETHER_ADDR_PRIMARY type.

Primary MAC addresses are treated differently from legacy ones in the
following scenarios (see the sketch below):
1. If a unicast MAC is being added and its type is specified as
VIRTCHNL_ETHER_ADDR_PRIMARY, replace the current
default_lan_addr.addr.
2. If a unicast MAC is being deleted and its type is specified as
VIRTCHNL_ETHER_ADDR_PRIMARY, zero the hw_lan_addr.addr.
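
A sketch of the type check underpinning both cases; the virtchnl constants
are the real API, while the helper itself is a simplified assumption:

  /* Simplified sketch, not the verbatim i40e implementation. */
  static bool i40e_is_vc_addr_primary(struct virtchnl_ether_addr *vc_addr)
  {
          return (vc_addr->type & VIRTCHNL_ETHER_ADDR_TYPE_MASK) ==
                 VIRTCHNL_ETHER_ADDR_PRIMARY;
  }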

Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-02 13:20:52 +01:00
Jakub Kicinski
d74aab2ca1 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2023-03-30 (documentation, ice)

This series contains updates to driver documentation and the ice driver.

Tony removes links and addresses related to the out-of-tree driver from the
Intel ethernet driver documentation.

Jake removes a comment from the ice driver that is no longer valid.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: remove comment about not supporting driver reinit
  Documentation/eth/intel: Remove references to SourceForge
  Documentation/eth/intel: Update address for driver support
====================

Link: https://lore.kernel.org/r/20230330165935.2503604-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-31 21:41:11 -07:00
Simon Horman
709d0b880c octeontx2-af: update type of prof fields in nix_aq_enq_req
Update the type of the prof and prof_mask fields in nix_aq_enq_req
from u64 to struct nix_bandprof_s, which is 128 bits wide.

This addresses string-fortification warnings when compiling with
gcc-12 and W=1.

Although the union of which these fields are members is 128 bits
wide, and thus writing a 128-bit entity is safe, the compiler flags
a problem because the field being written is only 64 bits wide.

  CC [M]  drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.o
scripts/Makefile.build:252: ./drivers/net/ethernet/marvell/octeontx2/nic/Makefile: otx2_dcbnl.o is added to multiple modules: rvu_nicpf rvu_nicvf
  CC [M]  drivers/net/ethernet/marvell/octeontx2/nic/otx2_dcbnl.o
  CC [M]  drivers/net/ethernet/marvell/octeontx2/nic/qos_sq.o
  CC [M]  drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.o
  CC [M]  drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.o
In file included from ./include/linux/string.h:254,
                 from ./include/linux/bitmap.h:11,
                 from ./include/linux/cpumask.h:12,
                 from ./arch/x86/include/asm/paravirt.h:17,
                 from ./arch/x86/include/asm/cpuid.h:62,
                 from ./arch/x86/include/asm/processor.h:19,
                 from ./arch/x86/include/asm/timex.h:5,
                 from ./include/linux/timex.h:67,
                 from ./include/linux/time32.h:13,
                 from ./include/linux/time.h:60,
                 from ./include/linux/stat.h:19,
                 from ./include/linux/module.h:13,
                 from drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c:8:
In function 'fortify_memcpy_chk',
    inlined from 'rvu_nix_blk_aq_enq_inst' at drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c:969:4:
./include/linux/fortify-string.h:529:25: error: call to '__read_overflow2_field' declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Werror=attribute-warning]
  529 |                         __read_overflow2_field(q_size_field, size);
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function 'fortify_memcpy_chk',
    inlined from 'rvu_nix_blk_aq_enq_inst' at drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c:984:4:
./include/linux/fortify-string.h:529:25: error: call to '__read_overflow2_field' declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Werror=attribute-warning]
  529 |                         __read_overflow2_field(q_size_field, size);
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors

Compile tested only!
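
The underlying issue in miniature; a self-contained illustration (not the
driver code) of why widening the member's type resolves the warning:

  #include <string.h>

  struct wide128 { unsigned long long lo, hi; };  /* 128-bit entity */

  union aq_field {
          unsigned long long prof;        /* before: 64-bit member */
          struct wide128 prof_s;          /* after: 128-bit member */
  };

  void enqueue(struct wide128 *hw_ctx, const union aq_field *req)
  {
          /* Reading 16 bytes through the 64-bit 'prof' member trips
           * __read_overflow2_field under FORTIFY_SOURCE:
           *     memcpy(hw_ctx, &req->prof, sizeof(*hw_ctx));
           * Reading through the 128-bit member is clean:
           */
          memcpy(hw_ctx, &req->prof_s, sizeof(*hw_ctx));
  }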

Signed-off-by: Simon Horman <horms@kernel.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230329112356.458072-1-horms@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30 23:27:58 -07:00
Nathan Chancellor
3292004c90 net: ethernet: ti: Fix format specifier in netcp_create_interface()
After commit 3948b05950 ("net: introduce a config option to tweak
MAX_SKB_FRAGS"), clang warns:

  drivers/net/ethernet/ti/netcp_core.c:2085:4: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
                          MAX_SKB_FRAGS);
                          ^~~~~~~~~~~~~
  include/linux/dev_printk.h:144:65: note: expanded from macro 'dev_err'
          dev_printk_index_wrap(_dev_err, KERN_ERR, dev, dev_fmt(fmt), ##__VA_ARGS__)
                                                                 ~~~     ^~~~~~~~~~~
  include/linux/dev_printk.h:110:23: note: expanded from macro 'dev_printk_index_wrap'
                  _p_func(dev, fmt, ##__VA_ARGS__);                       \
                               ~~~    ^~~~~~~~~~~
  include/linux/skbuff.h:352:23: note: expanded from macro 'MAX_SKB_FRAGS'
  #define MAX_SKB_FRAGS CONFIG_MAX_SKB_FRAGS
                        ^~~~~~~~~~~~~~~~~~~~
  ./include/generated/autoconf.h:11789:30: note: expanded from macro 'CONFIG_MAX_SKB_FRAGS'
  #define CONFIG_MAX_SKB_FRAGS 17
                               ^~
  1 warning generated.

Follow the pattern of the rest of the tree by changing the specifier to
'%u' and casting MAX_SKB_FRAGS explicitly to 'unsigned int', which
eliminates the warning.
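
The resulting call looks roughly like this (dev_err() is the real kernel
API; the format string is abridged, not the driver's exact message):

  dev_err(dev, "only %u fragments are supported\n",
          (unsigned int)MAX_SKB_FRAGS);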

Fixes: 3948b05950 ("net: introduce a config option to tweak MAX_SKB_FRAGS")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20230329-net-ethernet-ti-wformat-v1-1-83d0f799b553@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30 23:21:04 -07:00
Tom Rix
9a865a98a3 net: ksz884x: remove unused change variable
clang with W=1 reports
drivers/net/ethernet/micrel/ksz884x.c:3216:6: error: variable
  'change' set but not used [-Werror,-Wunused-but-set-variable]
        int change = 0;
            ^
This variable is not used so remove it.

Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230329125929.1808420-1-trix@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30 23:18:48 -07:00
Jakub Kicinski
79548b7984 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts:

drivers/net/ethernet/mediatek/mtk_ppe.c
  3fbe4d8c0e ("net: ethernet: mtk_eth_soc: ppe: add support for flow accounting")
  924531326e ("net: ethernet: mtk_eth_soc: add missing ppe cache flush when deleting a flow")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30 14:43:03 -07:00
Felix Fietkau
924531326e net: ethernet: mtk_eth_soc: add missing ppe cache flush when deleting a flow
The cache needs to be flushed to ensure that the hardware stops offloading
the flow immediately.

Fixes: 33fc42de33 ("net: ethernet: mtk_eth_soc: support creating mac address based offload entries")
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230330120840.52079-3-nbd@nbd.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30 11:44:59 -07:00
Felix Fietkau
5f36ca1b84 net: ethernet: mtk_eth_soc: fix L2 offloading with DSA untag offload
Check for skb metadata in order to detect the case where the DSA header
is not present.

Fixes: 2d7605a729 ("net: ethernet: mtk_eth_soc: enable hardware DSA untagging")
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230330120840.52079-2-nbd@nbd.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30 11:44:59 -07:00
Felix Fietkau
8c1cb87c2a net: ethernet: mtk_eth_soc: fix flow block refcounting logic
Since we call flow_block_cb_decref on FLOW_BLOCK_UNBIND, we also need to
call flow_block_cb_incref for a newly allocated cb.
Also fix the accidentally inverted refcount check on unbind.
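
A sketch of the corrected pairing; the flow_block_cb_* calls are the real
core helpers, while the surrounding driver logic is simplified:

  struct flow_block_cb *block_cb;

  if (f->command == FLOW_BLOCK_BIND) {
          block_cb = flow_block_cb_lookup(f->block, cb, dev);
          if (block_cb) {
                  flow_block_cb_incref(block_cb);
                  return 0;
          }
          block_cb = flow_block_cb_alloc(cb, dev, dev, NULL);
          if (IS_ERR(block_cb))
                  return PTR_ERR(block_cb);

          flow_block_cb_incref(block_cb); /* pairs with decref on unbind */
          flow_block_cb_add(block_cb, f);
  } else { /* FLOW_BLOCK_UNBIND */
          block_cb = flow_block_cb_lookup(f->block, cb, dev);
          if (block_cb && !flow_block_cb_decref(block_cb))
                  flow_block_cb_remove(block_cb, f);
  }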

Fixes: 502e84e238 ("net: ethernet: mtk_eth_soc: add flow offloading support")
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230330120840.52079-1-nbd@nbd.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30 11:44:59 -07:00
Russell King (Oracle)
2960a2d33b net: mvneta: fix potential double-frees in mvneta_txq_sw_deinit()
As reported on the Turris forum, mvneta provokes kernel warnings in the
architecture DMA mapping code when mvneta_setup_txqs() fails to
allocate memory. This happens because when mvneta_cleanup_txqs() is
called in the mvneta_stop() path, we leave pointers in the structure
that have been freed.

Then on mvneta_open(), we call mvneta_setup_txqs(), which starts
allocating memory. On memory allocation failure, mvneta_cleanup_txqs()
will walk all the queues freeing any non-NULL pointers - which includes
pointers that were previously freed in mvneta_stop().

Fix this by setting these pointers to NULL to prevent double-freeing
of the same memory.
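
The fix pattern in sketch form (field names follow the mvneta txq
structures, but treat the snippet as illustrative rather than verbatim):

  /* In mvneta_txq_sw_deinit(): clear pointers after freeing so that a
   * later cleanup pass sees NULL and skips them.
   */
  kfree(txq->buf);
  txq->buf = NULL;

  if (txq->tso_hdrs) {
          dma_free_coherent(pp->dev->dev.parent,
                            txq->size * TSO_HEADER_SIZE,
                            txq->tso_hdrs, txq->tso_hdrs_phys);
          txq->tso_hdrs = NULL;
  }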

Fixes: 2adb719d74 ("net: mvneta: Implement software TSO")
Link: https://forum.turris.cz/t/random-kernel-exceptions-on-hbl-tos-7-0/18865/8
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1phUe5-00EieL-7q@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30 11:43:39 -07:00
Jacob Keller
503d473c98 ice: remove comment about not supporting driver reinit
Since commit 31c8db2c4f ("ice: implement devlink reinit action"), the ice
driver does support driver re-initialization via devlink reload. Remove the
stale comment indicating that the driver lacks this support from the
ice_devlink_ops structure.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-03-30 09:35:07 -07:00
Wolfram Sang
da617cd8d9 smsc911x: remove superfluous variable init
phydev is assigned a value right away; there is no need to initialize it.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20230329064414.25028-1-wsa+renesas@sang-engineering.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-03-30 15:35:33 +02:00
Jakub Kicinski
7079d5e61a mlx5-updates-2023-03-28

Merge tag 'mlx5-updates-2023-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2023-03-28

Dragos Tatulea says:
====================

net/mlx5e: RX, Drop page_cache and fully use page_pool

For page allocation on the rx path, the mlx5e driver has been using an
internal page cache in tandem with the page pool. The internal page
cache uses a queue for page recycling which has the issue of head of
queue blocking.

This patch series drops the internal page_cache altogether and uses the
page_pool to implement everything that was done by the page_cache
before:
* Let the page_pool handle dma mapping and unmapping.
* Use fragmented pages with fragment counter instead of tracking via
  page ref.
* Enable skb recycling.

The patch series has the following effects on the rx path:

* Improved performance for the cases when there was low page recycling
  due to head of queue blocking in the internal page_cache. The test
  for this was running a single iperf TCP stream to an rx queue
  bound to the same cpu as the application.

  |-------------+--------+--------+------+---------|
  | rq type     | before | after  | unit |   diff  |
  |-------------+--------+--------+------+---------|
  | striding rq |  30.1  |  31.4  | Gbps |  4.14 % |
  | legacy rq   |  30.2  |  33.0  | Gbps |  8.48 % |
  |-------------+--------+--------+------+---------|

* Small XDP performance degradation. The test was an XDP drop
  program running on a single rx queue with small incoming packets:

  |-------------+----------+----------+------+---------|
  | rq type     | before   | after    | unit |   diff  |
  |-------------+----------+----------+------+---------|
  | striding rq | 19725449 | 18544617 | pps  | -6.37 % |
  | legacy rq   | 19879931 | 18631841 | pps  | -6.70 % |
  |-------------+----------+----------+------+---------|

  This will be handled in a different patch series by adding support for
  multi-packet per page.

* For other cases the performance is roughly the same.

The above numbers were obtained on the following system:
  24 core Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz
  32 GB RAM
  ConnectX-7 single port

The breakdown of the patch series is the following:
* Preparations for introducing the mlx5e_frag_page struct.
* Delete the mlx5e_page_cache struct.
* Enable dma mapping from page_pool.
* Enable skb recycling and fragment counting.
* Do deferred release of pages (just before alloc) to ensure better
  page_pool cache utilization.

====================

* tag 'mlx5-updates-2023-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: RX, Remove unnecessary recycle parameter and page_cache stats
  net/mlx5e: RX, Break the wqe bulk refill in smaller chunks
  net/mlx5e: RX, Increase WQE bulk size for legacy rq
  net/mlx5e: RX, Split off release path for xsk buffers for legacy rq
  net/mlx5e: RX, Defer page release in legacy rq for better recycling
  net/mlx5e: RX, Change wqe last_in_page field from bool to bit flags
  net/mlx5e: RX, Defer page release in striding rq for better recycling
  net/mlx5e: RX, Rename xdp_xmit_bitmap to a more generic name
  net/mlx5e: RX, Enable skb page recycling through the page_pool
  net/mlx5e: RX, Enable dma map and sync from page_pool allocator
  net/mlx5e: RX, Remove internal page_cache
  net/mlx5e: RX, Store SHAMPO header pages in array
  net/mlx5e: RX, Remove alloc unit layout constraint for striding rq
  net/mlx5e: RX, Remove alloc unit layout constraint for legacy rq
  net/mlx5e: RX, Remove mlx5e_alloc_unit argument in page allocation
====================

Link: https://lore.kernel.org/r/20230328205623.142075-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29 22:15:24 -07:00
Michael Chan
581bce7bcb bnxt_en: Add missing 200G link speed reporting
bnxt_fw_to_ethtool_speed() is missing the case statement for 200G
link speed reported by firmware.  As a result, ethtool will report
unknown speed when the firmware reports 200G link speed.

Fixes: 532262ba3b ("bnxt_en: ethtool: support PAM4 link speeds up to 200G")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29 21:48:16 -07:00
Kalesh AP
62aad36ed3 bnxt_en: Fix typo in PCI id to device description string mapping
Fix 57502 and 57508 NPAR description string entries.  The typos
caused these devices to not match up with lspci output.

Fixes: 49c98421e6 ("bnxt_en: Add PCI IDs for 57500 series NPAR devices.")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29 21:48:16 -07:00
Kalesh AP
83714dc3db bnxt_en: Fix reporting of test result in ethtool selftest
When the selftest command fails because bnxt_close_nic() fails, the
driver does not report the failure by updating "test->flags".

Fixes: eb51365846 ("bnxt_en: Add basic ethtool -t selftest support.")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29 21:48:16 -07:00
Radoslaw Tyl
c5cff16f46 i40e: fix registers dump after run ethtool adapter self test
Fix an invalid register dump from ethtool -d ethX after an adapter self
test with ethtool -t ethY, which caused invalid data to be displayed.

The problem was caused by overwriting i40e_reg_list[].elements,
which is common to the ethtool self test and dump.

Fixes: 22dd9ae8af ("i40e: Rework register diagnostic")
Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230328172659.3906413-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29 21:47:31 -07:00
Jakub Kicinski
165d35159c Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2023-03-28 (ice)

This series contains updates to ice driver only.

Jesse fixes mismatched header documentation reported when building with
W=1.

Brett restricts setting of VSI context to only applicable fields for the
given ICE_AQ_VSI_PROP_Q_OPT_VALID bit.

Junfeng adds a check when adding Flow Director filters that conflict
with existing filter rules.

Jakob Koschel adds an interim variable for iterating to prevent possible
misuse after looping.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ice: fix invalid check for empty list in ice_sched_assoc_vsi_to_agg()
  ice: add profile conflict check for AVF FDIR
  ice: Fix ice_cfg_rdma_fltr() to only update relevant fields
  ice: fix W=1 headers mismatch
====================

Link: https://lore.kernel.org/r/20230328172035.3904953-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29 21:46:18 -07:00
Simon Horman
c5370374bb net: ena: removed unused tx_bytes variable
clang 16.0.0 with W=1 reports:

drivers/net/ethernet/amazon/ena/ena_netdev.c:1901:6: error: variable 'tx_bytes' set but not used [-Werror,-Wunused-but-set-variable]
        u32 tx_bytes = 0;

The variable is not used so remove it.

Signed-off-by: Simon Horman <horms@kernel.org>
Acked-by: Shay Agroskin <shayagr@amazon.com>
Link: https://lore.kernel.org/r/20230328151958.410687-1-horms@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29 21:39:35 -07:00
Dan Carpenter
765f360464 octeon_ep: unlock the correct lock on error path
The h and the f letters are swapped so it unlocks the wrong lock.

Fixes: 577f0d1b1c ("octeon_ep: add separate mailbox command and response queues")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/251aa2a2-913e-4868-aac9-0a90fc3eeeda@kili.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29 21:39:35 -07:00
Jakub Kicinski
8c49527084 bnx2x: use the right build_skb() helper
build_skb() no longer accepts slab buffers. Since slab use is fairly
uncommon, drivers are expected to call the separate slab_build_skb()
helper where appropriate.

bnx2x uses the old semantics where a size of 0 meant the buffer came
from slab. It sets fp->rx_frag_size to 0 for MTUs which don't fit in a
page, so it needs to call slab_build_skb() in that case.

This fixes the WARN_ONCE() of incorrect API use seen with bnx2x.
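
A sketch of the corrected call site, following the rx_frag_size semantics
described above (not the verbatim bnx2x code):

  /* rx_frag_size == 0 means the buffer came from the slab allocator */
  if (fp->rx_frag_size)
          skb = build_skb(data, fp->rx_frag_size);
  else
          skb = slab_build_skb(data);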

Reported-by: Thomas Voegtle <tv@lio96.de>
Link: https://lore.kernel.org/all/b8f295e4-ba57-8bfb-7d9c-9d62a498a727@lio96.de/
Fixes: ce098da149 ("skbuff: Introduce slab_build_skb()")
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230329000013.2734957-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29 21:29:31 -07:00
Hao Lan
3b064f541b net: hns3: support wake on lan configuration and query
The HNS3 driver supports Wake-on-LAN, which can wake the server from
the power-off state via a magic packet or magic security packet.

ChangeLog:
v1->v2:
Deleted the debugfs function that overlaps with the ethtool function,
as suggested by Andrew Lunn.

v2->v3:
Return the wol configuration stored in the driver,
as suggested by Alexander H Duyck.

v3->v4:
Add a helper to go from netdev to the local struct,
as suggested by Simon Horman and Jakub Kicinski.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Hao Lan <lanhao@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-29 09:07:42 +01:00
Edward Cree
17654d84b4 sfc: add offloading of 'foreign' TC (decap) rules
A 'foreign' rule is one for which the net_dev is not the sfc netdevice
 or any of its representors.  The driver registers indirect flow blocks
 for tunnel netdevs so that it can offload decap rules.  For example:

    tc filter add dev vxlan0 parent ffff: protocol ipv4 flower \
        enc_src_ip 10.1.0.2 enc_dst_ip 10.1.0.1 \
        enc_key_id 1000 enc_dst_port 4789 \
        action tunnel_key unset \
        action mirred egress redirect dev $REPRESENTOR

When notified of a rule like this, register an encap match on the IP
 and dport tuple (creating an Outer Rule table entry) and insert an MAE
 action rule to perform the decapsulation and deliver to the representee.

Moved efx_tc_delete_rule() below efx_tc_flower_release_encap_match() to
 avoid the need for a forward declaration.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-29 09:06:08 +01:00
Edward Cree
746224cdef sfc: add code to register and unregister encap matches
Add a hashtable to detect duplicate and conflicting matches.  If a match
 is not a duplicate, call MAE functions to add/remove it from the OR table.
Calling code is not added yet, so mark the new functions as unused.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-29 09:06:08 +01:00
Edward Cree
2245eb0086 sfc: add functions to insert encap matches into the MAE
An encap match corresponds to an entry in the exact-match Outer Rule
 table; the lookup response includes the encap type (protocol) allowing
 the hardware to continue parsing into the inner headers.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-29 09:06:08 +01:00
Edward Cree
b7f5e17b3b sfc: handle enc keys in efx_tc_flower_parse_match()
Translate the fields from flow dissector into struct efx_tc_match.
In efx_tc_flower_replace(), reject filters that match on them, because
 only 'foreign' filters (i.e. those for which the ingress dev is not
 the sfc netdev or any of its representors, e.g. a tunnel netdev) can
 use them.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-29 09:06:08 +01:00
Edward Cree
b9d5c9b7d8 sfc: add notion of match on enc keys to MAE machinery
Extend the MAE caps check to validate that the hardware supports these
 outer-header matches where used by the driver.
Extend efx_mae_populate_match_criteria() to fill in the outer rule ID
 and VNI match fields.
Nothing yet populates these match fields, nor creates outer rules.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-29 09:06:08 +01:00
Edward Cree
edd025ca08 sfc: document TC-to-EF100-MAE action translation concepts
Includes an explanation of the lifetime of the 'cursor' action-set `act`.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-29 09:06:08 +01:00
Kuniyuki Iwashima
8cdc3223e7 ipv6: Remove in6addr_any alternatives.
Some code defines the IPv6 wildcard address as a local variable and
uses it with memcmp() or ipv6_addr_equal().

Let's use in6addr_any and ipv6_addr_any() instead.
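
The cleanup in miniature; ipv6_addr_any() and in6addr_any are the real
kernel definitions, while the wrapper functions are only for illustration:

  /* before: open-coded wildcard comparison */
  static bool is_wildcard_before(const struct in6_addr *a)
  {
          struct in6_addr zero = {};

          return !memcmp(a, &zero, sizeof(zero));
  }

  /* after: use the shared helper */
  static bool is_wildcard_after(const struct in6_addr *a)
  {
          return ipv6_addr_any(a);
  }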

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-29 08:22:52 +01:00
Jakub Kicinski
de7494524d mlx5-updates-2023-03-20

Merge tag 'mlx5-updates-2023-03-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2023-03-20

mlx5 dynamic msix

This patch series adds support for dynamic msix vectors allocation in mlx5.

Eli Cohen Says:

================

The following series of patches modifies mlx5_core to work with the
dynamic MSIX API. Currently, mlx5_core allocates all the interrupt
vectors it needs and distributes them amongst the consumers. With the
introduction of dynamic MSIX support, which allows for allocation of
interrupts more than once, we now allocate vectors as we need them.
This allows other drivers running on top of mlx5_core to allocate
interrupt vectors for their own use. An example for this is mlx5_vdpa,
which uses these vectors to propagate interrupts directly from the
hardware to the vCPU [1].

As a preparation for using this series, a use after free issue is fixed
in lib/cpu_rmap.c and the allocator for rmap entries has been modified.
A complementary API for irq_cpu_rmap_add() has also been introduced.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/patch/?id=0f2bf1fcae96a83b8c5581854713c9fc3407556e

================

* tag 'mlx5-updates-2023-03-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Provide external API for allocating vectors
  net/mlx5: Use one completion vector if eth is disabled
  net/mlx5: Refactor calculation of required completion vectors
  net/mlx5: Move devlink registration before mlx5_load
  net/mlx5: Use dynamic msix vectors allocation
  net/mlx5: Refactor completion irq request/release code
  net/mlx5: Improve naming of pci function vectors
  net/mlx5: Use newer affinity descriptor
  net/mlx5: Modify struct mlx5_irq to use struct msi_map
  net/mlx5: Fix wrong comment
  net/mlx5e: Coding style fix, add empty line
  lib: cpu_rmap: Add irq_cpu_rmap_remove to complement irq_cpu_rmap_add
  lib: cpu_rmap: Use allocator for rmap entries
  lib: cpu_rmap: Avoid use after free on rmap->obj array entries
====================

Link: https://lore.kernel.org/r/20230324231341.29808-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-28 23:52:12 -07:00
Tom Rix
e48cefb9c8 net: ethernet: 8390: axnet_cs: remove unused xfer_count variable
clang with W=1 reports
drivers/net/ethernet/8390/axnet_cs.c:653:9: error: variable
  'xfer_count' set but not used [-Werror,-Wunused-but-set-variable]
    int xfer_count = count;
        ^
This variable is not used so remove it.

Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230327235423.1777590-1-trix@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-28 23:48:02 -07:00
Felix Fietkau
07b3af42d8 net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links
Using the QDMA tx scheduler to throttle tx to line speed works fine for
switch ports, but apparently caused a regression on non-switch ports.

Based on a number of tests, it seems that this throttling can be safely
dropped without re-introducing the issues on switch ports that the
tx scheduling changes resolved.

Link: https://lore.kernel.org/netdev/trinity-92c3826f-c2c8-40af-8339-bc6d0d3ffea4-1678213958520@3c-app-gmx-bs16/
Fixes: f63959c7ee ("net: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues")
Reported-by: Frank Wunderlich <frank-w@public-files.de>
Reported-by: Daniel Golle <daniel@makrotopia.org>
Tested-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20230324140404.95745-1-nbd@nbd.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-28 23:23:50 -07:00
Wolfram Sang
cdeccd13a0 Revert "sh_eth: remove open coded netif_running()"
This reverts commit ce1fdb0656. It turned
out this actually introduced a race condition; netif_running() is not a
suitable check for get_stats.

Reported-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230327152112.15635-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-28 19:23:32 -07:00
Saeed Mahameed
163c2c7059 net/mlx5e: Fix build break on 32bit
The cited commit caused the following build break in mlx5 due to a change
in the size of MAX_SKB_FRAGS.

error: format '%lu' expects argument of type 'long unsigned int',
       but argument 7 has type 'unsigned int' [-Werror=format=]

Fix this with an explicit cast.

Fixes: 3948b05950 ("net: introduce a config option to tweak MAX_SKB_FRAGS")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230328200723.125122-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-28 16:18:59 -07:00
Dragos Tatulea
3905f8d64c net/mlx5e: RX, Remove unnecessary recycle parameter and page_cache stats
The recycle parameter used during page release is no longer
necessary: the page pool can detect when the page cannot be
recycled to the cache or ring without any outside hint.

The page pool will also take care of cleaning up after itself
once all the inflight pages have been released, so there is no need to
explicitly release pages to the system.

Remove the internal page_cache stats as the mlx5e_page_cache
struct no longer exists.

Delete the documentation entries along with the stats.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-28 13:43:59 -07:00
Dragos Tatulea
cd640b0503 net/mlx5e: RX, Break the wqe bulk refill in smaller chunks
To avoid overflowing the page pool's cache, don't release the
whole bulk, which is usually larger than the cache refill size.
Instead, group release+alloc into cache refill units that allow
releasing to the cache and then allocating from the cache.

A refill_unit variable is added as an iteration unit over the
wqe_bulk when doing release+alloc.
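
A sketch of the chunked release+alloc loop; the helper names are
illustrative, not the mlx5e functions:

  /* Walk the bulk in refill-unit chunks so that releases land in the
   * page pool cache and the allocs that follow can be served from it.
   */
  for (i = 0; i < wqe_bulk; i += refill_unit) {
          u16 n = min_t(u16, refill_unit, wqe_bulk - i);

          release_wqe_pages(rq, ix + i, n);       /* illustrative */
          alloc_wqe_pages(rq, ix + i, n);         /* illustrative */
  }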

For a single ring, single core, default MTU (1500) TCP stream
test the number of pages allocated from the cache directly
(rx_pp_recycle_cached) increases from 0% to 52%:

+---------------------------------------------+
| Page Pool stats (/sec)  |  Before |   After |
+-------------------------+---------+---------+
|rx_pp_alloc_fast         | 2145422 | 2193802 |
|rx_pp_alloc_slow         |       2 |       0 |
|rx_pp_alloc_empty        |       2 |       0 |
|rx_pp_alloc_refill       |   34059 |   16634 |
|rx_pp_alloc_waive        |       0 |       0 |
|rx_pp_recycle_cached     |       0 | 1145818 |
|rx_pp_recycle_cache_full |       0 |       0 |
|rx_pp_recycle_ring       | 2179361 | 1064616 |
|rx_pp_recycle_ring_full  |     121 |       0 |
+---------------------------------------------+

With this patch, the performance for legacy rq for the above test is
back to baseline.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-28 13:43:59 -07:00
Dragos Tatulea
4ba2b4988c net/mlx5e: RX, Increase WQE bulk size for legacy rq
Deferred page release was added to legacy rq, but its desired effect
(the driver releasing the last fragment to the page pool cache) is not
yet visible because the WQE bulks are too small.

This patch increases the WQE bulk size to span 512 KB and clip it to
one quarter of the rx queue size.
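
The sizing rule as arithmetic; an illustrative computation (assuming one
page per WQE), not the verbatim mlx5e code:

  u32 span = DIV_ROUND_UP(512 * 1024, PAGE_SIZE); /* 128 with 4K pages */

  wqe_bulk = min_t(u32, span, rq_size / 4);       /* clip to 1/4 of rq */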

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-28 13:43:59 -07:00
Dragos Tatulea
76238d0fbd net/mlx5e: RX, Split off release path for xsk buffers for legacy rq
Don't mix xsk buffer releases with page releases anymore. This is
needed for handling of deferred page release.

Add a new bulk free function for xsk buffers from wqe frags.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-28 13:43:59 -07:00
Dragos Tatulea
3f93f82988 net/mlx5e: RX, Defer page release in legacy rq for better recycling
Currently, fragmented pages from the page pool can be released
in two ways:

1) In the mlx5e driver when trimming off the unused fragments AND the
associated skb fragments have been released. This path allows
recycling of pages to the page pool cache (allow_direct == true).

2) On the skb release path (last fragment release), which
will always release pages to the page pool ring
(allow_direct == false).

Whichever path releases the last fragment decides where the page
ends up: the cache or the ring. So we obviously want to maximize
releases via path 1.

This patch does that by deferring the release of page fragments
right before requesting new ones from the page pool. A flag is
added to make sure that there's no release before first alloc
and that XDP_TX fragments are not released prematurely.

This is a preparation patch that doesn't unlock the performance
improvements yet. A followup patch will do that.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-28 13:43:59 -07:00
Dragos Tatulea
625dff29df net/mlx5e: RX, Change wqe last_in_page field from bool to bit flags
Change the bool flag to a bitfield as we'll use it in a downstream patch
in the series to add signaling about skipping a fragment release.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-28 13:43:58 -07:00
Dragos Tatulea
4c2a132368 net/mlx5e: RX, Defer page release in striding rq for better recycling
Currently, for striding RQ, fragmented pages from the page pool can
get released in two ways:

1) In the mlx5e driver when trimming off the unused fragments AND the
   associated skb fragments have been released. This path allows
   recycling of pages to the page pool cache (allow_direct == true).

2) On the skb release path (last fragment release), which
   will always release pages to the page pool ring
   (allow_direct == false).

Whichever path releases the last fragment decides where the page
ends up: the cache or the ring. So we obviously want to maximize
releases via path 1.

This patch does that by deferring the release of page fragments
right before requesting new ones from the page pool. Extra care
needs to be taken for the corner cases:

* On first call, make sure that release is not called. The
  skip_release_bitmap is used for this purpose.

* On rq shutdown, make sure that all wqes that were not
  in the linked list are released.

For a single ring, single core, default MTU (1500) TCP stream
test the number of pages allocated from the cache directly
(rx_pp_recycle_cached) increases from 31% to 98%:

+----------------------------------------------+
| Page Pool stats (/sec)  |  Before |   After  |
+-------------------------+---------+----------+
|rx_pp_alloc_fast         | 2137754 |  2261033 |
|rx_pp_alloc_slow         |      47 |        9 |
|rx_pp_alloc_empty        |      47 |        9 |
|rx_pp_alloc_refill       |   23230 |      819 |
|rx_pp_alloc_waive        |       0 |        0 |
|rx_pp_recycle_cached     |  672182 |  2209015 |
|rx_pp_recycle_cache_full |    1789 |        0 |
|rx_pp_recycle_ring       | 1485848 |    52259 |
|rx_pp_recycle_ring_full  |    3003 |      584 |
+----------------------------------------------+

With this patch, the performance in striding rq for the above test is
back to baseline.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-28 13:43:58 -07:00
Dragos Tatulea
38a36efccd net/mlx5e: RX, Rename xdp_xmit_bitmap to a more generic name
The xdp_xmit_bitmap currently serves only one purpose: to avoid
releasing pages that are still in use due to XDP TX.

A following patch will use this bitmap in a slightly different context
but for the same purpose. So rename the bitmap to a more generic name
that reflects the purpose not the context.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-28 13:43:58 -07:00
Dragos Tatulea
6f57428460 net/mlx5e: RX, Enable skb page recycling through the page_pool
Start using the page_pool skb recycling api to recycle all pages back to
the page pool and stop using atomic page reference counting.

The mlx5e driver used to manage in-flight pages using page refcounting:
for each fragment there were two atomic write operations (one when
building the skb and one on skb release).

The page_pool api introduced a method to track page fragments more
optimally:
* The page's pp_fragment_count is set to a large bias on page alloc
  (1 x atomic write operation).
* The driver tracks the actual page fragments in a non atomic variable.
* When the skb is recycled, pp_fragment_count is decremented
  (atomic write operation).
* When page is released in the driver, the unused number of fragments
  (relative to the bias) is deducted from pp_fragment_count (atomic
  write operation).
* Last page defragmentation will only be an atomic read.

So in total there are `number of fragments + 1` atomic write ops, as
opposed to the previous `2 * frags` atomic write ops.
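
A sketch of the bias scheme; page_pool_fragment_page() and
page_pool_defrag_page() are the real page pool primitives, while the bias
value and the bookkeeping around them are illustrative:

  #define FRAG_BIAS (1UL << 15)                   /* illustrative bias */

  /* on page alloc: one atomic write sets the large bias */
  page_pool_fragment_page(page, FRAG_BIAS);
  frag_page->frags = 0;                           /* non-atomic counter */

  /* per fragment handed to an skb: non-atomic */
  frag_page->frags++;

  /* on driver release: one atomic op returns the unused bias */
  page_pool_defrag_page(page, FRAG_BIAS - frag_page->frags);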

Pages are wrapped in a mlx5e_frag_page structure which also contains the
number of fragments. This makes it easy to count the fragments in the
driver.

This change brings performance improvements for the case when the old rx
page_cache had low recycling rates due to head of queue blocking. For an
iperf3 TCP test with a single stream, on a single core (iperf and the
receive queue running on the same core), the following improvements can
be noticed:

* Striding rq:
  - before (net-next baseline): bitrate = 30.1 Gbits/sec
  - after                     : bitrate = 31.4 Gbits/sec (diff: 4.14 %)

* Legacy rq:
  - before (net-next baseline): bitrate = 30.2 Gbits/sec
  - after                     : bitrate = 33.0 Gbits/sec (diff: 8.48 %)

There are 2 temporary performance degradations introduced:

1) TCP streams that had a good recycling rate with the old page_cache
   have a degradation for both striding and linear rq. This is due to
   very low page pool cache recycling: the pages are released during skb
   recycle which will release pages to the page pool ring for safety.
   The following patches in this series will tackle this problem by
   deferring the page release in the driver to increase the
   chance of having pages recycled to the cache.

2) XDP performance is now lower (4-5 %) due to the higher number of
   atomic operations used for fragment management. But this opens the
   door for supporting multiple packets per page in XDP, which will
   bring a big gain.

Otherwise, performance is similar to baseline.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-28 13:43:58 -07:00
Dragos Tatulea
4a5c5e2500 net/mlx5e: RX, Enable dma map and sync from page_pool allocator
Remove driver dma mapping and unmapping of pages. Let the
page_pool api do it.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-28 13:43:58 -07:00
Dragos Tatulea
08c9b61b07 net/mlx5e: RX, Remove internal page_cache
This patch removes the internal rx page_cache and uses the generic
page_pool api only. It used to be that the page_pool couldn't handle all
the mlx5 driver use cases, but with the introduction of skb recycling
and page fragmentation in the page_pool, a full switch can now be made.
Some benefits of this transition:
* Better page recycling in the cases when the page_cache was suffering
  from head of queue blocking. The page_pool doesn't have this issue.
* DMA mapping/unmapping can be managed by the page_pool.
* mlx5e_rq size reduced by more than 50% due to the page_cache array
  being deleted.

This patch only removes the page_cache. Downstream patches will enable
the required page_pool features and will add further fine-tuning.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-28 13:43:57 -07:00
Dragos Tatulea
ca6ef9f031 net/mlx5e: RX, Store SHAMPO header pages in array
Save allocated SHAMPO header pages to an array to which the
mlx5e_dma_info page will point.

This change is a preparation for introducing mlx5e_frag_page structure
in a downstream patch. There's no new functionality introduced.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-28 13:43:57 -07:00