linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-08-27 19:29:37 +00:00

Author	SHA1	Message	Date
Tuong Lien	f779bf7922	tipc: optimize key switching time and logic We reduce the lasting time for a pending TX key to be active as well as for a passive RX key to be freed which generally helps speed up the key switching. It is not expected to be too fast but should not be too slow either. Also the key handling logic is simplified that a pending RX key will be removed automatically if it is found not working after a number of times; the probing for a pending TX key is now carried on a specific message user ('LINK_PROTOCOL' or 'LINK_CONFIG') which is more efficient than using a timer on broadcast messages, the timer is reserved for use later as needed. The kernel logs or 'pr***()' are now made as clear as possible to user. Some prints are added, removed or changed to the debug-level. The 'TIPC_CRYPTO_DEBUG' definition is removed, and the 'pr_debug()' is used instead which will be much helpful in runtime. Besides we also optimize the code in some other places as a preparation for later commits. v2: silent more kernel logs, also use 'info->extack' for a message emitted due to netlink operations instead (- David's comments). Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-18 13:58:37 -07:00
David S. Miller	cb589a55f4	Merge branch 'ionic-add-devlink-dev-flash-support' Shannon Nelson says: ==================== ionic: add devlink dev flash support Add support for using devlink's dev flash facility to update the firmware on an ionic device, and add a new timeout parameter to the devlink flash netlink message. For long-running flash commands, we add a timeout element to the dev flash notify message in order for a userland utility to display a timeout deadline to the user. This allows the userland utility to display a count down to the user when a firmware update action is otherwise going to go for ahile without any updates. An example use is added to the netdevsim module. The ionic driver uses this timeout element in its new flash function. The driver uses a simple model of pushing the firmware file to the NIC, asking the NIC to unpack and install the file into the device, and then selecting it for the next boot. If any of these steps fail, the whole transaction is failed. A couple of the steps can take a long time, so we use the timeout status message rather than faking it with bogus done/total messages. The driver doesn't currently support doing these steps individually. In the future we want to be able to list the FW that is installed and selectable but we don't yet have the API to fully support that. v5: pulled the cmd field back out of the new params struct changed netdevsim example message to "Flash select" v4: Added a new devlink status notify message for showing timeout information, and modified the ionic fw update to use it for its long running firmware commands. v3: Changed long dev_cmd timeout on status check calls to a loop around calls with a normal timeout, which allows for more intermediate log messaging when in a long wait, and for letting other threads run dev_cmds if waiting. v2: Changed "Activate" to "Select" in status messages. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-18 13:54:23 -07:00
Shannon Nelson	30b5191ad1	ionic: add devlink firmware update Add support for firmware update through the devlink interface. This update copies the firmware object into the device, asks the current firmware to install it, then asks the firmware to select the new firmware for the next boot-up. The install and select steps are launched as asynchronous requests, which are then followed up with status request commands. These status request commands will be answered with an EAGAIN return value and will try again until the request has completed or reached the timeout specified. Signed-off-by: Shannon Nelson <snelson@pensando.io> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-18 13:54:23 -07:00
Shannon Nelson	87c905d84f	ionic: update the fw update api Add the rest of the firmware api bits needed to support the driver running a firmware update. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-18 13:54:23 -07:00
Shannon Nelson	b311b001de	netdevsim: devlink flash timeout message Add a simple devlink flash timeout message to exercise the message mechanism. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-18 13:54:23 -07:00
Shannon Nelson	6700acc5f1	devlink: collect flash notify params into a struct The dev flash status notify function parameter lists are getting rather long, so add a struct to be filled and passed rather than continuously changing the function signatures. Signed-off-by: Shannon Nelson <snelson@pensando.io> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-18 13:54:23 -07:00
Shannon Nelson	f92970c694	devlink: add timeout information to status_notify Add a timeout element to the DEVLINK_CMD_FLASH_UPDATE_STATUS netlink message for use by a userland utility to show that a particular firmware flash activity may take a long but bounded time to finish. Also add a handy helper for drivers to make use of the new timeout value. UI usage hints: - if non-zero, add timeout display to the end of the status line [component] status_msg ( Xm Ys : Am Bs ) using the timeout value for Am Bs and updating the Xm Ys every second - if the timeout expires while awaiting the next update, display something like [component] status_msg ( timeout reached : Am Bs ) - if new status notify messages are received, remove the timeout and start over Signed-off-by: Shannon Nelson <snelson@pensando.io> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-18 13:54:22 -07:00
Francesco Ruggeri	0e4be9e57e	net: use exponential backoff in netdev_wait_allrefs The combination of aca_free_rcu, introduced in commit `2384d02520` ("net/ipv6: Add anycast addresses to a global hashtable"), and fib6_info_destroy_rcu, introduced in commit `9b0a8da8c4` ("net/ipv6: respect rcu grace period before freeing fib6_info"), can result in an extra rcu grace period being needed when deleting an interface, with the result that netdev_wait_allrefs ends up hitting the msleep(250), which is considerably longer than the required grace period. This can result in long delays when deleting a large number of interfaces, and it can be observed with this script: ns=dummy-ns NIFS=100 ip netns add $ns ip netns exec $ns ip link set lo up ip netns exec $ns sysctl net.ipv6.conf.default.disable_ipv6=0 ip netns exec $ns sysctl net.ipv6.conf.default.forwarding=1 for ((i=0; i<$NIFS; i++)) do if=eth$i ip netns exec $ns ip link add $if type dummy ip netns exec $ns ip link set $if up ip netns exec $ns ip -6 addr add 2021:$i::1/120 dev $if done for ((i=0; i<$NIFS; i++)) do if=eth$i ip netns exec $ns ip link del $if done ip netns del $ns Instead of using a fixed msleep(250), this patch tries an extra rcu_barrier() followed by an exponential backoff. Time with this patch on a 5.4 kernel: real 0m7.704s user 0m0.385s sys 0m1.230s Time without this patch: real 0m31.522s user 0m0.438s sys 0m1.156s v2: use exponential backoff instead of trying to wake up netdev_wait_allrefs. v3: preserve reverse christmas tree ordering of local variables v4: try an extra rcu_barrier before the backoff, plus some cosmetic changes. Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-18 13:47:31 -07:00
Paolo Abeni	1d39cd8cf7	mptcp: fix integer overflow in mptcp_subflow_discard_data() Christoph reported an infinite loop in the subflow receive path under stress condition. If there are multiple subflows, each of them using a large send buffer, the delta between the sequence number used by MPTCP-level retransmission can and the current msk->ack_seq can be greater than MAX_INT. In the above scenario, when calling mptcp_subflow_discard_data(), such delta will be truncated to int, and could result in a negative number: no bytes will be dropped, and subflow_check_data_avail() will try again to process the same packet, looping forever. This change addresses the issue by expanding the 'limit' size to 64 bits, so that overflows are not possible anymore. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/87 Fixes: `6719331c2f` ("mptcp: trigger msk processing even for OoO data") Reported-and-tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 18:04:48 -07:00
Ursula Braun	ac679364b9	net/smc: fix double kfree in smc_listen_work() If smc_listen_rmda_finish() returns with an error, the storage addressed by 'buf' is freed a second time. Consolidate freeing under a common label and jump to that label. Fixes: `6bb14e48ee` ("net/smc: dynamic allocation of CLC proposal buffer") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 18:03:56 -07:00
Shannon Nelson	86d009f1cb	ionic: add DIMLIB to Kconfig >> ld.lld: error: undefined symbol: net_dim_get_rx_moderation >>> referenced by ionic_lif.c:52 (drivers/net/ethernet/pensando/ionic/ionic_lif.c:52) >>> net/ethernet/pensando/ionic/ionic_lif.o:(ionic_dim_work) in archive drivers/built-in.a >> ld.lld: error: undefined symbol: net_dim >>> referenced by ionic_txrx.c:456 (drivers/net/ethernet/pensando/ionic/ionic_txrx.c:456) >>> net/ethernet/pensando/ionic/ionic_txrx.o:(ionic_dim_update) in archive drivers/built-in.a v2: removed sketchy dashes in commit message Fixes: `04a834592b` ("ionic: dynamic interrupt moderation") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 18:00:30 -07:00
Jakub Kicinski	78a3ea5557	net: remove comments on struct rtnl_link_stats We removed the misleading comments from struct rtnl_link_stats64 when we added proper kdoc. struct rtnl_link_stats has the same inline comments, so remove them, too. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 17:58:13 -07:00
Andrew Lunn	529d1fdf97	net: mdio: octeon: Select MDIO_DEVRES This driver makes use of devm_mdiobus_alloc_size. To ensure this is available select MDIO_DEVRES which provides it. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 17:07:50 -07:00
David Ahern	897217b9a0	selftests: Set default protocol for raw sockets in nettest IPPROTO_IP (0) is not valid for raw sockets. Default the protocol for raw sockets to IPPROTO_RAW if the protocol has not been set via the -P option. Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 17:07:15 -07:00
Qinglang Miao	77646b63ff	dpaa2-eth: Convert to DEFINE_SHOW_ATTRIBUTE Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 17:05:29 -07:00
Qinglang Miao	2170ff0819	net: hsr: Convert to DEFINE_SHOW_ATTRIBUTE Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code. Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 17:05:10 -07:00
David S. Miller	72d61d3009	Merge branch 'mlxsw-Support-dcbnl_setbuffer-dcbnl_getbuffer' Ido Schimmel says: ==================== mlxsw: Support dcbnl_setbuffer, dcbnl_getbuffer Petr says: On Spectrum, port buffers, also called port headroom, is where packets are stored while they are parsed and the forwarding decision is being made. For lossless traffic flows, in case shared buffer admission is not allowed, headroom is also where to put the extra traffic received before the sent PAUSE takes effect. Linux supports two DCB interfaces related to the headroom: dcbnl_setbuffer for configuration, and dcbnl_getbuffer for inspection. This patch set implements them. With dcbnl_setbuffer in place, there will be two sources of authority over the ingress configuration: the DCB ETS hook, because ETS configuration is mirrored to ingress, and the DCB setbuffer hook. mlxsw is in a similar situation on the egress side, where there are two sources of the ETS configuration: the DCB ETS hook, and the TC qdisc hooks. This is a non-intuitive situation, because the way the ASIC ends up being configured depends not only on the actual configured bits, but also on the order in which they were configured. To prevent these issues on the ingress side, two configuration modes will exist: DCB mode and TC mode. DCB ETS will keep getting projected to ingress in the (default) DCB mode. When a qdisc is installed on a port, it will be switched to the TC mode, the ingress configuration will be done through the dcbnl_setbuffer callback. The reason is that the dcbnl_setbuffer hook is not standardized and supported by lldpad. Projecting DCB ETS configuration to ingress is a reasonable heuristic to configure ingress especially when PFC is in effect. In patch #1, the toggle between the DCB and TC modes of headroom configuration, described above, is introduced. Patch #2 implements dcbnl_getbuffer and dcbnl_setbuffer. dcbnl_getbuffer can be always used to determine the current port headroom configuration. dcbnl_setbuffer is only permitted in the TC mode. In patch #3, make the qdisc module toggle the headroom mode from DCB to TC and back, depending on whether there is an offloaded qdisc on the port. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 17:01:38 -07:00
Petr Machata	509f04ca62	mlxsw: spectrum_qdisc: Disable port buffer autoresize with qdiscs There are two interfaces to configure ETS: qdiscs and DCB. Historically, DCB ETS configuration was projected to ingress as well, and configured port buffers. Qdisc was not. Keep qdiscs behaving this way, and if an offloaded qdisc is configured on a port, move this port's headroom to a manual mode, thus allowing configuration of port buffers through dcbnl_setbuffer. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 17:01:38 -07:00
Petr Machata	5ebc6031e6	mlxsw: spectrum_dcb: Implement dcbnl_setbuffer / getbuffer Add dcbnl_setbuffer, which bounces requests if a headroom is in DCB mode. Implement dcbnl_getbuffer such that it can always be used to determine port-buffer configuration, regardless of headroom mode. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 17:01:38 -07:00
Petr Machata	69e408a2cf	mlxsw: spectrum_buffers: Support two headroom modes There are two interfaces to configure ETS: qdiscs and DCB. Historically, DCB ETS configuration was projected to ingress as well, and configured port buffers. Qdisc was not. So as not to break clients that today use DCB ETS and PFC and rely on getting a reasonable ingress buffer priomap, keep the ETS mirroring in effect. Since qdiscs have not done this mirroring historically, it is reasonable not to introduce it, but rather permit manual ingress configuration through dcbnl_setbuffer only in the qdisc mode. This will require a toggle to indicate whether buffer sizes should be autocomputed or taken from dcbnl_setbuffer, and likewise for priomaps. Introduce such and initialize it, and guard port buffer size configuration as appropriate. The toggle is currently left in the DCB position. In a following patch, qdisc code will switch it. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 17:01:38 -07:00
Yang Yingliang	4d11af5d00	netlink: add spaces around '&' in netlink_recv/sendmsg() It's hard to read the code without spaces around '&', for better reading, add spaces around '&'. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:53:47 -07:00
YueHaibing	2492c205d2	netdev: Remove unused functions There is no callers in tree, so can remove it. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:51:33 -07:00
Ye Bin	c2ec6bc010	mptcp: Fix unsigned 'max_seq' compared with zero in mptcp_data_queue_ofo Fixes coccicheck warnig: net/mptcp/protocol.c:164:11-18: WARNING: Unsigned expression compared with zero: max_seq > 0 Fixes: `ab174ad8ef` ("mptcp: move ooo skbs into msk out of order queue") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ye Bin <yebin10@huawei.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:45:05 -07:00
David S. Miller	3ce406bda0	Merge branch 'net-marvell-prestera-Add-Switchdev-driver-for-Prestera-family-ASIC-device-98DX3255-AC3x' Vadym Kochan says: ==================== net: marvell: prestera: Add Switchdev driver for Prestera family ASIC device 98DX3255 (AC3x) Marvell Prestera 98DX3255 integrates up to 24 ports of 1GbE with 8 ports of 10GbE uplinks or 2 ports of 40Gbps stacking for a largely wireless SMB deployment. Prestera Switchdev is a firmware based driver that operates via PCI bus. The current implementation supports only boards designed for the Marvell Switchdev solution and requires special firmware. This driver implementation includes only L1, basic L2 support, and RX/TX. The core Prestera switching logic is implemented in prestera_main.c, there is an intermediate hw layer between core logic and firmware. It is implemented in prestera_hw.c, the purpose of it is to encapsulate hw related logic, in future there is a plan to support more devices with different HW related configurations. The following Switchdev features are supported: - VLAN-aware bridge offloading - VLAN-unaware bridge offloading - FDB offloading (learning, ageing) - Switchport configuration The original firmware image is uploaded to the linux-firmware repository. PATCH v9: 1) Replace read_poll_timeout_atomic() by original 'do {} while()' loop because it works much better than read_poll_timeout_atomic() considering the TX rate. Also it fixes warning reported on v8. 2) Use ENOENT instead of EEXIST when item is not found in few places - prestera_hw.c and prestera_rxtx.c Patches updated: [1] net: marvell: prestera: Add driver for Prestera family ASIC devices PATCH v8: 1) Put license in one line. 2) Sort includes. 3) Add missing comma for last enum member 4) Return original error code from last called func in places where instead other error code was used. 5) Add comma for last member in initialized struct in prestera_hw.c 6) Do not initialize 'int err = 0' where it is not needed. 7) Simplify device-tree "marvell,prestera" node parsing by removing not needed checking on 'np == NULL'. 8) Use u32p_replace_bits() instead of open-coded ((word & ~mask) \| val) 9) Use dev_warn_ratelimited() instead of pr_warn_ratelimited to indicate the device instance in prestera_rxtx.c 10) Simplify circular buffer list creation in prestera_sdma_{rx,tx}_init() by using do { } while (prev != tail) construction. 11) Use MSEC_PER_SEC instead of hard-coded 1000. 12) Use traditional error handling pattern: err = F(); if (err) return err; 13) Use ether_addr_copy() instead of memcpy() for mac FDB copying in prestera_hw.c 14) Drop swdev->ageing_time member which is not used. 15) Fix ageing macro to be in ms instead of seconds. Patches updated: [1] net: marvell: prestera: Add driver for Prestera family ASIC devices [2] net: marvell: prestera: Add PCI interface support [3] net: marvell: prestera: Add basic devlink support [4] net: marvell: prestera: Add ethtool interface support [5] net: marvell: prestera: Add Switchdev driver implementation PATCH v7: 1) Use ether_addr_copy() in prestera_main.c:prestera_port_set_mac_address() instead of memcpy(). 2) Removed not needed device's DMA address range check on dma_pool_alloc() in prestera_rxtx.c:prestera_sdma_buf_init(), this should be handled by dma_xxx() API considerig device's DMA mask. 3) Removed not needed device's DMA address range check on dma_map_single() in prestera_rxtx.c:prestera_sdma_rx_skb_alloc(), this should be handled by dma_xxx() API considerig device's DMA mask. 4) Add comment about port mac address limitation in the code where it is used and checked - prestera_main.c: - prestera_is_valid_mac_addr() - prestera_port_create() 5) Add missing destroy_workqueue(swdev_wq) in prestera_switchdev.c:prestera_switchdev_init() on error path handling. Patches updated: [1] net: marvell: prestera: Add driver for Prestera family ASIC devices [5] net: marvell: prestera: Add Switchdev driver implementation PATCH v6: 1) Use rwlock to protect port list on create/delete stages. The list is mostly readable by fw event handler or packets receiver, but updated only on create/delete port which are performed on switch init/fini stages. 2) Remove not needed variable initialization in prestera_dsa.c:prestera_dsa_parse() 3) Get rid of bounce buffer used by tx handler in prestera_rxtx.c, the bounce buffer should be handled by dma_xxx API via swiotlb. 4) Fix PRESTERA_SDMA_RX_DESC_PKT_LEN macro by using correct GENMASK(13, 0) in prestera_rxtx.c Patches updated: [1] net: marvell: prestera: Add driver for Prestera family ASIC devices PATCH v5: 0) add Co-developed tags for people who was involved in development. 1) Make SPDX license as separate comment 2) Change 'u8 ' -> 'void ', It allows to avoid not-needed u8* casting. 3) Remove "," in terminated enum's. 4) Use GENMASK(end, start) where it is applicable in. 5) Remove not-needed 'u8 *' casting. 6) Apply common error-check pattern 7) Use ether_addr_copy instead of memcpy 8) Use define for maximum MAC address range (255) 9) Simplify prestera_port_state_set() in prestera_main.c by using separate if-blocks for state setting: if (is_up) { ... } else { ... } which makes logic more understandable. 10) Simplify sdma tx wait logic when checking/updating tx_ring->burst. 11) Remove not-needed packed & aligned attributes 12) Use USEC_PER_MSEC as multiplier when converting ms -> usec on calling readl_poll_timeout. 13) Simplified some error path handling by simple return error code in. 14) Remove not-needed err assignment in. 15) Use dev_err() in prestera_devlink_register(...). Patches updated: [1] net: marvell: prestera: Add driver for Prestera family ASIC devices [2] net: marvell: prestera: Add PCI interface support [3] net: marvell: prestera: Add basic devlink support [4] net: marvell: prestera: Add ethtool interface support [5] net: marvell: prestera: Add Switchdev driver implementation PATCH v4: 1) Use prestera_ prefix in netdev_ops variable. 2) Kconfig: use 'default PRESTERA' build type for CONFIG_PRESTERA_PCI to be synced by default with prestera core module. 3) Use memcpy_xxio helpers in prestera_pci.c for IO buffer copying. 4) Generate fw image path via snprintf() instead of macroses. 5) Use pcim_ helpers in prestera_pci.c which simplified the probe/remove logic. 6) Removed not needed initializations of variables which are used in readl_poll_xxx() helpers. 7) Fixed few grammar mistakes in patch[2] description. 8) Export only prestera_ethtool_ops struct instead of each ethtool handler. 9) Add check for prestera_dev_check() in switchdev event handling to make sure there is no wrong topology. Patches updated: [1] net: marvell: prestera: Add driver for Prestera family ASIC devices [2] net: marvell: prestera: Add PCI interface support [4] net: marvell: prestera: Add ethtool interface support [5] net: marvell: prestera: Add Switchdev driver implementation PATCH v3: 1) Simplify __be32 type casting in prestera_dsa.c 2) Added per-patch changelog under "---" line. PATCH v2: 1) Use devlink_port_type_clear() 2) Add _MS prefix to timeout defines. 3) Remove not-needed packed attribute from the firmware ipc structs, also the firmware image needs to be uploaded too (will do it soon). 4) Introduce prestera_hw_switch_fini(), to be mirrored with init and do simple validation if the event handlers are unregistered. 5) Use kfree_rcu() for event handler unregistering. 6) Get rid of rcu-list usage when dealing with ports, not needed for now. 7) Little spelling corrections in the error/info messages. 8) Make pci probe & remove logic mirrored. 9) Get rid of ETH_FCS_LEN in headroom setting, not needed. PATCH: 1) Fixed W=1 warnings 2) Renamed PCI driver name to be more generic "Prestera DX" because there will be more devices supported. 3) Changed firmware image dir path: marvell/ -> mrvl/prestera/ to be aligned with location in linux-firmware.git (if such will be accepted). RFC v3: 1) Fix prestera prefix in prestera_rxtx.c 2) Protect concurrent access from multiple ports on multiple CPU system on tx path by spinlock in prestera_rxtx.c 3) Try to get base mac address from device-tree, otherwise use a random generated one. 4) Move ethtool interface support into separate prestera_ethtool.c file. 5) Add basic devlink support and get rid of physical port naming ops. 6) Add STP support in Switchdev driver. 7) Removed MODULE_AUTHOR 8) Renamed prestera.c -> prestera_main.c, and kernel module to prestera.ko RFC v2: 1) Use "pestera_" prefix in struct's and functions instead of mvsw_pr_ 2) Original series split into additional patches for Switchdev ethtool support. 3) Use major and minor firmware version numbers in the firmware image filename. 4) Removed not needed prints. 5) Use iopoll API for waiting on register's value in prestera_pci.c 6) Use standart approach for describing PCI ID matching section instead of using custom wrappers in prestera_pci.c 7) Add RX/TX support in prestera_rxtx.c. 8) Rewritten prestera_switchdev.c with following changes: - handle netdev events from prestera.c - use struct prestera_bridge for bridge objects, and get rid of struct prestera_bridge_device which may confuse. - use refcount_t 9) Get rid of macro usage for sending fw requests in prestera_hw.c 10) Add base_mac setting as module parameter. base_mac is required for generation default port's mac. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:35:47 -07:00
Vadym Kochan	40acc05271	dt-bindings: marvell,prestera: Add description for device-tree bindings Add brief description how to configure base mac address binding in device-tree. Describe requirement for the PCI port which is connected to the ASIC, to allow access to the firmware related registers. Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:35:47 -07:00
Vadym Kochan	e1189d9a5f	net: marvell: prestera: Add Switchdev driver implementation The following features are supported: - VLAN-aware bridge offloading - VLAN-unaware bridge offloading - FDB offloading (learning, ageing) - Switchport configuration Currently there are some limitations like: - Only 1 VLAN-aware bridge instance supported - FDB ageing timeout parameter is set globally per device Co-developed-by: Serhiy Boiko <serhiy.boiko@plvision.eu> Signed-off-by: Serhiy Boiko <serhiy.boiko@plvision.eu> Co-developed-by: Serhiy Pshyk <serhiy.pshyk@plvision.eu> Signed-off-by: Serhiy Pshyk <serhiy.pshyk@plvision.eu> Co-developed-by: Taras Chornyi <taras.chornyi@plvision.eu> Signed-off-by: Taras Chornyi <taras.chornyi@plvision.eu> Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:35:47 -07:00
Vadym Kochan	a97d3c6939	net: marvell: prestera: Add ethtool interface support The ethtool API provides support for the configuration of the following features: speed and duplex, auto-negotiation, MDI-x, forward error correction, port media type. The API also provides information about the port status, hardware and software statistic. The following limitation exists: - port media type should be configured before speed setting - ethtool -m option is not supported - ethtool -p option is not supported - ethtool -r option is supported for RJ45 port only - the following combination of parameters is not supported: ethtool -s sw1pX port XX autoneg on - forward error correction feature is supported only on SFP ports, 10G speed - auto-negotiation and MDI-x features are not supported on Copper-to-Fiber SFP module Co-developed-by: Andrii Savka <andrii.savka@plvision.eu> Signed-off-by: Andrii Savka <andrii.savka@plvision.eu> Co-developed-by: Serhiy Boiko <serhiy.boiko@plvision.eu> Signed-off-by: Serhiy Boiko <serhiy.boiko@plvision.eu> Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:35:47 -07:00
Vadym Kochan	34dd1710f5	net: marvell: prestera: Add basic devlink support Add very basic support for devlink interface: - driver name - fw version - devlink ports Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:35:47 -07:00
Vadym Kochan	4c2703dfd7	net: marvell: prestera: Add PCI interface support Add PCI interface driver for Prestera Switch ASICs family devices, which provides: - Firmware loading mechanism - Requests & events handling to/from the firmware - Access to the firmware on the bus level The firmware has to be loaded each time the device is reset. The driver is loading it from: /lib/firmware/mrvl/prestera/mvsw_prestera_fw-v{MAJOR}.{MINOR}.img The full firmware image version is located within the internal header and consists of 3 numbers - MAJOR.MINOR.PATCH. Additionally, driver has hard-coded minimum supported firmware version which it can work with: MAJOR - reflects the support on ABI level between driver and loaded firmware, this number should be the same for driver and loaded firmware. MINOR - this is the minimum supported version between driver and the firmware. PATCH - indicates only fixes, firmware ABI is not changed. Firmware image file name contains only MAJOR and MINOR numbers to make driver be compatible with any PATCH version. Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:35:47 -07:00
Vadym Kochan	501ef3066c	net: marvell: prestera: Add driver for Prestera family ASIC devices Marvell Prestera 98DX326x integrates up to 24 ports of 1GbE with 8 ports of 10GbE uplinks or 2 ports of 40Gbps stacking for a largely wireless SMB deployment. The current implementation supports only boards designed for the Marvell Switchdev solution and requires special firmware. The core Prestera switching logic is implemented in prestera_main.c, there is an intermediate hw layer between core logic and firmware. It is implemented in prestera_hw.c, the purpose of it is to encapsulate hw related logic, in future there is a plan to support more devices with different HW related configurations. This patch contains only basic switch initialization and RX/TX support over SDMA mechanism. Currently supported devices have DMA access range <= 32bit and require ZONE_DMA to be enabled, for such cases SDMA driver checks if the skb allocated in proper range supported by the Prestera device. Also meanwhile there is no TX interrupt support in current firmware version so recycling work is scheduled on each xmit. Port's mac address is generated from the switch base mac which may be provided via device-tree (static one or as nvme cell), or randomly generated. This is required by the firmware. Co-developed-by: Andrii Savka <andrii.savka@plvision.eu> Signed-off-by: Andrii Savka <andrii.savka@plvision.eu> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Co-developed-by: Serhiy Boiko <serhiy.boiko@plvision.eu> Signed-off-by: Serhiy Boiko <serhiy.boiko@plvision.eu> Co-developed-by: Serhiy Pshyk <serhiy.pshyk@plvision.eu> Signed-off-by: Serhiy Pshyk <serhiy.pshyk@plvision.eu> Co-developed-by: Taras Chornyi <taras.chornyi@plvision.eu> Signed-off-by: Taras Chornyi <taras.chornyi@plvision.eu> Co-developed-by: Volodymyr Mytnyk <volodymyr.mytnyk@plvision.eu> Signed-off-by: Volodymyr Mytnyk <volodymyr.mytnyk@plvision.eu> Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:35:46 -07:00
YueHaibing	5114b33105	genetlink: Remove unused function genl_err_attr() It is never used, so can remove it. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:28:13 -07:00
YueHaibing	2b7ea122a0	net/sched: Remove unused function qdisc_queue_drop_head() It is not used since commit `a09ceb0e08` ("sched: remove qdisc->drop") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:27:10 -07:00
Matthieu Baerts	8b974778f9	selftests: mptcp: interpret \n as a new line In case of errors, this message was printed: (...) # read: Resource temporarily unavailable # client exit code 0, server 3 # \nnetns ns1-0-BJlt5D socket stat for 10003: (...) Obviously, the idea was to add a new line before the socket stat and not print "\nnetns". Fixes: `b08fbf2410` ("selftests: add test-cases for MPTCP MP_JOIN") Fixes: `048d19d444` ("mptcp: add basic kselftest for mptcp") Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:26:09 -07:00
Xie He	b79a80bd6d	net/packet: Fix a comment about mac_header 1. Change all "dev->hard_header" to "dev->header_ops" 2. On receiving incoming frames when header_ops == NULL: The comment only says what is wrong, but doesn't say what is right. This patch changes the comment to make it clear what is right. 3. On transmitting and receiving outgoing frames when header_ops == NULL: The comment explains that the LL header will be later added by the driver. However, I think it's better to simply say that the LL header is invisible to us. This phrasing is better from a software engineering perspective, because this makes it clear that what happens in the driver should be hidden from us and we should not care about what happens internally in the driver. 4. On resuming the LL header (for RAW frames) when header_ops == NULL: The comment says we are "unlikely" to restore the LL header. However, we should say that we are "unable" to restore it. It's not possible (rather than not likely) to restore it, because: 1) There is no way for us to restore because the LL header internally processed by the driver should be invisible to us. 2) In function packet_rcv and tpacket_rcv, the code only tries to restore the LL header when header_ops != NULL. Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Signed-off-by: Xie He <xie.he.0141@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:24:45 -07:00
David S. Miller	31660a9766	Merge branch 'net-hns3-updates-for-next' Huazhong Tan says: ==================== net: hns3: updates for -next There are some optimizations related to IO path. Change since V1: - fixes a unsuitable handling in hns3_lb_clear_tx_ring() of #6 which pointed out by Saeed Mahameed. previous version: V1: https://patchwork.ozlabs.org/project/netdev/cover/1600085217-26245-1-git-send-email-tanhuazhong@huawei.com/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:14:28 -07:00
Yunsheng Lin	619ae331d1	net: hns3: use napi_consume_skb() when cleaning tx desc Use napi_consume_skb() to batch consuming skb when cleaning tx desc in NAPI polling. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:14:28 -07:00
Yunsheng Lin	48ee56fd0b	net: hns3: use writel() to optimize the barrier operation writel() can be used to order I/O vs memory by default when writing portable drivers. Use writel() to replace wmb() + writel_relaxed(), and writel() is dma_wmb() + writel_relaxed() for ARM64, so there is an optimization here because dma_wmb() is a lighter barrier than wmb(). Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:14:28 -07:00
Yunsheng Lin	8c30e19460	net: hns3: optimize the rx clean process Currently HNS3_RING_RX_RING_FBDNUM_REG register is read to determine how many rx desc can be cleaned. To avoid the register read operation in the critical data path, use the valid bit in the rx desc to determine if a specific rx desc can be cleaned. The hns3 driver clear valid bit in the rx desc before notifying the rx desc to the hw, and hw will only set the valid bit of the rx desc after corresponding buffer is filled with packet data and other field in the rx desc is set accordingly. Add hns3_rx_ring_move_fw() function to clear the valid bit in the rx desc before moving rx ring's next_to_clean forward to avoid double cleaning a rx desc, also add a dma_rmb() barrier in hns3_handle_rx_bd() to make sure valid bit is set before reading other field in the rx desc. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:14:28 -07:00
Yunsheng Lin	20d06ca267	net: hns3: optimize the tx clean process Currently HNS3_RING_TX_RING_HEAD_REG register is read to determine how many tx desc can be cleaned. To avoid the register read operation in the critical data path, use the valid bit in the tx desc to determine if a specific tx desc can be cleaned. The hns3 driver sets valid bit in the tx desc before ringing a doorbell to the hw, and hw will only clear the valid bit of the tx desc after corresponding packet is sent out to the wire. And because next_to_use for tx ring is a changing variable when the driver is filling the tx desc, so reuse the pull_len for rx ring to record the tx desc that has notified to the hw, so that hns3_nic_reclaim_desc() can decide how many tx desc's valid bit need checking when reclaiming tx desc. And io_err_cnt stat is also removed for it is not used anymore. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:14:28 -07:00
Yunsheng Lin	f6061a056c	net: hns3: batch tx doorbell operation Use netdev_xmit_more() to defer the tx doorbell operation when the skb is passed to the driver continuously. By doing this we can improve the overall xmit performance by avoid some doorbell operations. Also, the tx_err_cnt stat is not used, so rename it to tx_more stat. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:14:28 -07:00
Yunsheng Lin	aeda9bf87a	net: hns3: batch the page reference count updates Batch the page reference count updates instead of doing them one at a time. By doing this we can improve the overall receive performance by avoid some atomic increment operations when the rx page is reused. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-17 16:14:28 -07:00
Liu Shixin	b948577b98	cxgb4vf: convert to use DEFINE_SEQ_ATTRIBUTE macro Use DEFINE_SEQ_ATTRIBUTE macro to simplify the code. Signed-off-by: Liu Shixin <liushixin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-16 17:39:00 -07:00
Shannon Nelson	04a834592b	ionic: dynamic interrupt moderation Use the dim library to manage dynamic interrupt moderation in ionic. v3: rebase v2: untangled declarations in ionic_dim_work() Signed-off-by: Shannon Nelson <snelson@pensando.io> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-16 17:35:47 -07:00
Karsten Graul	ddcc9b7feb	net/smc: check variable before dereferencing in smc_close.c smc->clcsock and smc->clcsock->sk are used before the check if they can be dereferenced. Fix this by checking the variables first. Fixes: `a60a2b1e0a` ("net/smc: reduce active tcp_listen workers") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-16 17:30:16 -07:00
Nikolay Aleksandrov	d5bf31ddd8	net: bridge: mcast: don't ignore return value of __grp_src_toex_excl When we're handling TO_EXCLUDE report in EXCLUDE filter mode we should not ignore the return value of __grp_src_toex_excl() as we'll miss sending notifications about group changes. Fixes: `5bf1e00b68` ("net: bridge: mcast: support for IGMPV3/MLDv2 CHANGE_TO_INCLUDE/EXCLUDE report") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-16 17:13:25 -07:00
Song, Yoong Siang	aa042f60e4	net: stmmac: Add support to Ethtool get/set ring parameters This patch add support to --show-ring & --set-ring Ethtool functions: - Adding min, max, power of two check to new ring parameter's value. - Bring down the network interface before changing the value of ring parameters. - Bring up the network interface after changing the value of ring parameters. Signed-off-by: Song, Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-16 15:22:52 -07:00
David S. Miller	18e9a40732	Merge branch 'mlxsw-Refactor-headroom-management' Ido Schimmel says: ==================== mlxsw: Refactor headroom management Petr says: On Spectrum, port buffers, also called port headroom, is where packets are stored while they are parsed and the forwarding decision is being made. For lossless traffic flows, in case shared buffer admission is not allowed, headroom is also where to put the extra traffic received before the sent PAUSE takes effect. Another aspect of the port headroom is the so called internal buffer, which is used for egress mirroring. Linux supports two DCB interfaces related to the headroom: dcbnl_setbuffer for configuration, and dcbnl_getbuffer for inspection. In order to make it possible to implement these interfaces, it is first necessary to clean up headroom handling, which is currently strewn in several places in the driver. The end goal is an architecture whereby it is possible to take a copy of the current configuration, adjust parameters, and then hand the proposed configuration over to the system to implement it. When everything works, the proposed configuration is accepted and saved. First, this centralizes the reconfiguration handling to one function, which takes care of coordinating buffer size changes and priority map changes to avoid introducing drops. Second, the fact that the configuration is all in one place makes it easy to keep a backup and handle error path rollbacks, which were previously hard to understand. Patch #1 introduces struct mlxsw_sp_hdroom, which will keep port headroom configuration. Patch #2 unifies handling of delay provision between PFC and PAUSE. From now on, delay is to be measured in bytes of extra space, and will not include MTU. PFC handler sets the delay directly from the parameter it gets through the DCB interface. For PAUSE, MLXSW_SP_PAUSE_DELAY is converted to have the same meaning. In patches #3-#5, MTU, lossiness and priorities are gradually moved over to struct mlxsw_sp_hdroom. In patches #6-#11, handling of buffer resizing and priority maps is moved from spectrum.c and spectrum_dcb.c to spectrum_buffers.c. The API is gradually adapted so that struct mlxsw_sp_hdroom becomes the main interface through which the various clients express how the headroom should be configured. Patch #12 is a small cleanup that the previous transformation made possible. In patch #13, the port init code becomes a boring client of the headroom code, instead of rolling its own thing. Patches #14 and #15 move handling of internal mirroring buffer to the new headroom code as well. Previously, this code was in the SPAN module. This patchset converts the SPAN module to another boring client of the headroom code. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-16 15:19:30 -07:00
Petr Machata	22881adf85	mlxsw: spectrum_buffers: Manage internal buffer in the hdroom code Traffic mirroring modes that are in-chip implemented on egress need an internal buffer to work. As the only client, the SPAN module was managing the buffer so far. However logically it belongs to the buffers module. E.g. buffer size validation needs to take the size of the internal buffer into account. Therefore move the related code from SPAN to spectrum_buffers. Move over the callbacks that determine the minimum buffer size as a function of maximum speed and MTU. Add a field describing the internal buffer to struct mlxsw_sp_hdroom. Extend mlxsw_sp_hdroom_bufs_reset_sizes() to take care of sizing the internal buffer as well. Change the SPAN module to invoke that function and mlxsw_sp_hdroom_configure() like all the other hdroom clients. Drop the now-unnecessary mlxsw_sp_span_port_buffer_disable(). Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-16 15:19:30 -07:00
Petr Machata	a41b96267c	mlxsw: spectrum_buffers: Introduce shared buffer ops The size of the internal buffer is currently calculated in the SPAN module. Logically it belongs to the spectrum_buffers module, where it should be moved. However, that being a chip-specific operation, it needs dynamic dispatch. There currently is a chip-specific structure for description of shared buffer values, struct mlxsw_sp_sb_vals. However placing ops into this structure would be confusing. Therefore introduce a new per-chip structure, currently empty, and initialize the ops pointer as appropriate. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-16 15:19:30 -07:00
Petr Machata	0cda1a9b45	mlxsw: spectrum_buffers: Convert mlxsw_sp_port_headroom_init() Currently mlxsw_sp_port_headroom_init() configures both priomap and buffers by hand. Additionally, for port buffers, it configures buffer 0 with a size that it will never again have if PFC configuration is touched. Rewrite the init code to become a client of the new hdroom code. The only difference in invocation is that the configuration is forced, so that it is issued even if the desired configuration happens to match what is contained in (hitherto not initialized with meaningful values) mlxsw_sp_port->hdroom. Since now mlxsw_sp_port_headroom_init() initializes all the PG buffers to meaningful values, mlxsw_sp_hdroom_configure_buffers() can avoid querying the current configuration, and can fill the whole PBMC itself. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-09-16 15:19:30 -07:00

1 2 3 4 5 ...

951156 commits