linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-09-12 21:57:43 +00:00

Author	SHA1	Message	Date
Bodong Wang	5ccf2770e8	net/mlx5: Don't handle VF func change if host PF is disabled When ECPF eswitch manager is at offloads mode, it monitors functions changed event from host PF side and acts according to the number of VFs enabled/disabled. As ECPF and host PF work in two independent hosts, it's possible that host PF OS reboots but ECPF system is still kept on and continues monitoring events from host PF. When kernel from host PF side is booting, PCI iov driver does sriov_init and compute_max_vf_buses by iterating over all valid num of VFs. This triggers FLR and generates functions changed events, even though host PF HCA is not enabled at this time. However, ECPF is not aware of this information, and still handles these events as usual. ECPF system will see massive number of reps are created, but destroyed immediately once creation finished. To eliminate this noise, a bit is added to host parameter context to indicate host PF is disabled. ECPF will not handle the VF changed event if this bit is set. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-07-01 16:40:30 -07:00
Parav Pandit	7e26dac281	net/mlx5: Limit scope of mlx5_get_next_phys_dev() to PCI PF devices As mlx5_get_next_phys_dev is used only for PCI PF devices use case, limit it to search only for PCI devices. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Vu Pham <vuhuong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-07-01 16:40:30 -07:00
Parav Pandit	d22663edac	net/mlx5: Move pci status reg access mutex to mlx5_pci_init mlx5_pci_init() performs pci specific initialization of the mlx5_core_dev struct. Hence move pci_status_mutex to pci initialization routine mlx5_pci_init(). This allows reusing mlx5_mdev_init() to non PCI devices. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Vu Pham <vuhuong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-07-01 16:40:30 -07:00
Huy Nguyen	386e75af99	net/mlx5: Rename mlx5_pci_dev_type to mlx5_coredev_type Rename mlx5_pci_dev_type to mlx5_coredev_type to distinguish different mlx5 device types. mlx5_coredev_type represents mlx5_core_dev instance type. Hence keep mlx5_coredev_type in mlx5_core_dev structure. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Vu Pham <vuhuong@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-07-01 16:40:30 -07:00
Bodong Wang	b8ca123860	RDMA/mlx5: Cleanup rep when doing unload When an IB rep is loaded, netdev for the same vport is saved for later reference. However, it's not cleaned up when doing unload. For ECPF, kernel crashes when driver is referring to the already removed netdev. Following steps lead to a shown call trace: 1. Create n VFs from host PF 2. Distroy the VFs 3. Run "rdma link" from ARM Call trace: mlx5_ib_get_netdev+0x9c/0xe8 [mlx5_ib] mlx5_query_port_roce+0x268/0x558 [mlx5_ib] mlx5_ib_rep_query_port+0x14/0x34 [mlx5_ib] ib_query_port+0x9c/0xfc [ib_core] fill_port_info+0x74/0x28c [ib_core] nldev_port_get_doit+0x1a8/0x1e8 [ib_core] rdma_nl_rcv_msg+0x16c/0x1c0 [ib_core] rdma_nl_rcv+0xe8/0x144 [ib_core] netlink_unicast+0x184/0x214 netlink_sendmsg+0x288/0x354 sock_sendmsg+0x18/0x2c __sys_sendto+0xbc/0x138 __arm64_sys_sendto+0x28/0x34 el0_svc_common+0xb0/0x100 el0_svc_handler+0x6c/0x84 el0_svc+0x8/0xc Cleanup the rep and netdev reference when unloading IB rep. Fixes: `26628e2d58` ("RDMA/mlx5: Move to single device multiport ports in switchdev mode") Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-07-01 16:40:30 -07:00
Bodong Wang	2f69e591e4	{IB, net}/mlx5: E-Switch, Use index of rep for vport to IB port mapping In the single IB device mode, the mapping between vport number and rep relies on a counter. However for dynamic vport allocation, it is desired to keep consistent map of eswitch vport and IB port. Hence, simplify code to remove the free running counter and instead use the available vport index during load/unload sequence from the eswitch. Signed-off-by: Bodong Wang <bodong@mellanox.com> Suggested-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-07-01 16:40:30 -07:00
Bodong Wang	d6518db278	net/mlx5: E-Switch, Use vport index when init rep Driver is referring to the array index when doing rep initialization, using vport is confusing as it's normally interpreted as vport number. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-07-01 16:40:30 -07:00
Shay Agroskin	a82e0b5bda	net/mlx5: Added MCQI and MCQS registers' description to ifc Given a fw component index, the MCQI register allows us to query this component's information (e.g. its version and capabilities). Given a fw component index, the MCQS register allows us to query the status of a fw component, including its type and state (e.g. PRESET/IN_USE). It can be used to find the index of a component of a specific type, by sequentially increasing the component index, and querying each time the type of the returned component. If max component index is reached, 'last_index_flag' is set by the HCA. These registers' description was added to query the running and pending fw version of the HCA. Signed-off-by: Shay Agroskin <shayag@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-07-01 16:40:30 -07:00
Parav Pandit	1759d322f4	net/mlx5: Add hardware definitions for sub functions Update mlx5 device interface data structures for: 1. New command definitions for allocating, deallocating SF 2. Query SF partition 3. Eswitch SF fields 4. HCA CAP SF fields 5. Extend Eswitch functions command for SF Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Vu Pham <vuhuong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-07-01 16:40:30 -07:00
Daniel T. Lee	6e32a74a6f	samples: pktgen: allow to specify destination port Currently, kernel pktgen has the feature to specify udp destination port for sending packet. (e.g. pgset "udp_dst_min 9") But on samples, each of the scripts doesn't have any option to achieve this. This commit adds the DST_PORT option to specify the target port(s) in the script. -p : ($DST_PORT) destination PORT range (e.g. 433-444) is also allowed Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-07-01 11:02:20 -07:00
Daniel T. Lee	226b96c25d	samples: pktgen: add some helper functions for port parsing This commit adds port parsing and port validate helper function to parse single or range of port(s) from a given string. (e.g. 1234, 443-444) Helpers will be used in prior to set target port(s) in samples/pktgen. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-07-01 11:02:20 -07:00
Eric Dumazet	a346abe051	ipv6: icmp: allow flowlabel reflection in echo replies Extend flowlabel_reflect bitmask to allow conditional reflection of incoming flowlabels in echo replies. Note this has precedence against auto flowlabels. Add flowlabel_reflect enum to replace hard coded values. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-07-01 10:54:51 -07:00
Florian Westphal	c7b37c769d	xfrm: remove get_mtu indirection from xfrm_type esp4_get_mtu and esp6_get_mtu are exactly the same, the only difference is a single sizeof() (ipv4 vs. ipv6 header). Merge both into xfrm_state_mtu() and remove the indirection. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2019-07-01 06:16:40 +02:00
David S. Miller	954a5a0294	mlx5e-updates-2019-06-28 This series adds some misc updates for mlx5e driver 1) Allow adding the same mac more than once in MPFS table 2) Move to HW checksumming advertising 3) Report netdevice MPLS features 4) Correct physical port name of the PF representor 5) Reduce stack usage in mlx5_eswitch_termtbl_create 6) Refresh TIR improvement for representors 7) Expose same physical switch_id for all representors -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl0WnOAACgkQSD+KveBX +j6xQwgAuarc5NCi8jg9m+yXFXj/kr1cPwZ6Pxadi8hAd3NbEMGtutdFZIXsIXZZ y0uYxzkrCqXUUh2ZnpE09YlBISLyVSt3WGdqgJn1wel//O33gS6GpYXwVGOfgL7n +d9FCYvrun6v6aOHM5ZGxQ+qxHzupz5k4F0r7fz2Gsd+JEgL58nwc8ERSbOTbZMO TLO1pcxlXWwGSqd5uc4AHi8hZTvuzWl/Fm5hOTP9gx/Sl3UaYWa3WiTgj5uOD5Zt 956Xqk0LLwSaiKVAsFjIa7HHWOaDLnVmbmTUzhv82hharqmvPW1CIWlx1gv01KoP wMWoyyoy7cTyyrWNozWEed/14LyEVA== =S4bJ -----END PGP SIGNATURE----- Merge tag 'mlx5e-updates-2019-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5e-updates-2019-06-28 This series adds some misc updates for mlx5e driver 1) Allow adding the same mac more than once in MPFS table 2) Move to HW checksumming advertising 3) Report netdevice MPLS features 4) Correct physical port name of the PF representor 5) Reduce stack usage in mlx5_eswitch_termtbl_create 6) Refresh TIR improvement for representors 7) Expose same physical switch_id for all representors ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-30 18:41:13 -07:00
David S. Miller	11697cfc71	Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2019-06-28 This series contains a smorgasbord of updates to many of the Intel drivers. Gustavo A. R. Silva updates the ice and iavf drivers to use the strcut_size() helper where possible. Miguel increases the pause and refresh time for flow control in the e1000e driver during reset for certain devices. Dann Frazier fixes a potential NULL pointer dereference in ixgbe driver when using non-IPSec enabled devices. Colin Ian King fixes a potential overflow during a shift in the ixgbe driver. Also fixes a potential NULL pointer dereference in the iavf driver by adding a check. Venkatesh Srinivas converts the e1000 driver to use dma_wmb() instead of wmb() for doorbell writes to avoid SFENCEs in the transmit and receive paths. Arjan updates the e1000e driver to improve boot time by over 100 msec by reducing the usleep ranges suring system startup. Artem updates the igb driver register dump in ethtool, first prepares the register dump for future additions of registers in the dump, then secondly, adds the RR2DCDELAY register to the dump. When dealing with time-sensitive networks, this register is helpful in determining your latency from the device to the ring. Alex fixes the ixgbevf driver to use the current cached link state, rather than trying to re-check the value from the PF. Harshitha adds support for MACVLAN offloads in i40e by using channels as MACVLAN interfaces. Detlev Casanova updates the e1000e driver to use delayed work instead of timers to run the watchdog. Vitaly fixes an issue in e1000e, where when disconnecting and reconnecting the physical cable connection, the NIC enters a DMoff state. This state causes a mismatch in link and duplexing, so check the PCIm function state and perform a PHY reset when in this state to resolve the issue. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-30 16:03:35 -07:00
Heiner Kallweit	f072218cca	r8169: remove not needed call to dma_sync_single_for_device DMA_API_HOWTO.txt includes an example explaining when dma_sync_single_for_device() is not needed, and that example matches our use case. The buffer isn't changed by the CPU and direction is DMA_FROM_DEVICE, so we can remove the call to dma_sync_single_for_device(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-29 12:29:39 -07:00
Heiner Kallweit	3c18cbe337	r8169: consider that 32 Bit DMA is the default Documentation/DMA-API-HOWTO.txt states: By default, the kernel assumes that your device can address 32-bits of DMA addressing. For a 64-bit capable device, this needs to be increased, and for a device with limitations, it needs to be decreased. Therefore we don't need the 32 Bit DMA fallback configuration and can remove it. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-29 12:29:39 -07:00
Heiner Kallweit	759d095741	r8169: improve handling VLAN tag The VLAN tag is stored in the descriptor in network byte order. Using swab16 works on little endian host systems only. Better play safe and use ntohs or htons respectively. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-29 12:29:39 -07:00
Florian Westphal	3099c59db0	selftests: rtnetlink: skip ipsec offload tests if netdevsim isn't present running the script on systems without netdevsim now prints: SKIP: ipsec_offload can't load netdevsim instead of error message & failed status. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-29 12:18:52 -07:00
David S. Miller	fc41388564	Merge branch 'em_ipt-add-support-for-addrtype' Nikolay Aleksandrov says: ==================== em_ipt: add support for addrtype We would like to be able to use the addrtype from tc for ACL rules and em_ipt seems the best place to add support for the already existing xt match. The biggest issue is that addrtype revision 1 (with ipv6 support) is NFPROTO_UNSPEC and currently em_ipt can't differentiate between v4/v6 if such xt match is used because it passes the match's family instead of the packet one. The first 3 patches make em_ipt match only on IP traffic (currently both policy and addrtype recognize such traffic only) and make it pass the actual packet's protocol instead of the xt match family when it's unspecified. They also add support for NFPROTO_UNSPEC xt matches. The last patch allows to add addrtype rules via em_ipt. We need to keep the user-specified nfproto for dumping in order to be compatible with libxtables, we cannot dump NFPROTO_UNSPEC as the nfproto or we'll get an error from libxtables, thus the nfproto is limited to ipv4/ipv6 in patch 03 and is recorded. v3: don't use the user nfproto for matching, only for dumping, more information is available in the commit message in patch 03 v2: change patch 02 to set the nfproto only when unspecified and drop patch 04 from v1 (Eyal Birger) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-29 11:15:12 -07:00
Nikolay Aleksandrov	0c4231c784	net: sched: em_ipt: add support for addrtype matching Allow em_ipt to use addrtype for matching. Restrict the use only to revision 1 which has IPv6 support. Since it's a NFPROTO_UNSPEC xt match we use the user-specified nfproto for matching, in case it's unspecified both v4/v6 will be matched by the rule. v2: no changes, was patch 5 in v1 Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-29 11:15:12 -07:00
Nikolay Aleksandrov	ba3d24d48f	net: sched: em_ipt: keep the user-specified nfproto and dump it If we dump NFPROTO_UNSPEC as nfproto user-space libxtables can't handle it and would exit with an error like: "libxtables: unhandled NFPROTO in xtables_set_nfproto" In order to avoid the error return the user-specified nfproto. If we don't record it then the match family is used which can be NFPROTO_UNSPEC. Even if we add support to mask NFPROTO_UNSPEC in iproute2 we have to be compatible with older versions which would be also be allowed to add NFPROTO_UNSPEC matches (e.g. addrtype after the last patch). v3: don't use the user nfproto for matching, only for dumping the rule, also don't allow the nfproto to be unspecified (explained above) v2: adjust changes to missing patch, was patch 04 in v1 Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-29 11:15:12 -07:00
Nikolay Aleksandrov	f4c1c40c35	net: sched: em_ipt: set the family based on the packet if it's unspecified Set the family based on the packet if it's unspecified otherwise protocol-neutral matches will have wrong information (e.g. NFPROTO_UNSPEC). In preparation for using NFPROTO_UNSPEC xt matches. v2: set the nfproto only when unspecified Suggested-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-29 11:15:12 -07:00
Nikolay Aleksandrov	9e10edd7dc	net: sched: em_ipt: match only on ip/ipv6 traffic Restrict matching only to ip/ipv6 traffic and make sure we can use the headers, otherwise matches will be attempted on any protocol which can be unexpected by the xt matches. Currently policy supports only ipv4/6. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-29 11:15:12 -07:00
Xue Chaojing	aebd17b768	hinic: add vlan offload support This patch adds vlan offload support for the HINIC driver. Signed-off-by: Xue Chaojing <xuechaojing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-29 10:28:23 -07:00
Daniel Borkmann	8daed7677a	Merge branch 'bpf-lookup-devmap' Toke Høiland-Jørgensen says: ==================== When using the bpf_redirect_map() helper to redirect packets from XDP, the eBPF program cannot currently know whether the redirect will succeed, which makes it impossible to gracefully handle errors. To properly fix this will probably require deeper changes to the way TX resources are allocated, but one thing that is fairly straight forward to fix is to allow lookups into devmaps, so programs can at least know when a redirect is guaranteed to fail because there is no entry in the map. Currently, programs work around this by keeping a shadow map of another type which indicates whether a map index is valid. This series contains two changes that are complementary ways to fix this issue: - Moving the map lookup into the bpf_redirect_map() helper (and caching the result), so the helper can return an error if no value is found in the map. This includes a refactoring of the devmap and cpumap code to not care about the index on enqueue. - Allowing regular lookups into devmaps from eBPF programs, using the read-only flag to make sure they don't change the values. The performance impact of the series is negligible, in the sense that I cannot measure it because the variance between test runs is higher than the difference pre/post series. Changelog: v6: - Factor out list handling in maps to a helper in list.h (new patch 1) - Rename variables in struct bpf_redirect_info (new patch 3 + patch 4) - Explain why we are clearing out the map in the info struct on lookup failure - Remove unneeded check for forwarding target in tracepoint macro v5: - Rebase on latest bpf-next. - Update documentation for bpf_redirect_map() with the new meaning of flags. v4: - Fix a few nits from Andrii - Lose the #defines in bpf.h and just compare the flags argument directly to XDP_TX in bpf_xdp_redirect_map(). v3: - Adopt Jonathan's idea of using the lower two bits of the flag value as the return code. - Always do the lookup, and cache the result for use in xdp_do_redirect(); to achieve this, refactor the devmap and cpumap code to get rid the bitmap for selecting which devices to flush. v2: - For patch 1, make it clear that the change works for any map type. - For patch 2, just use the new BPF_F_RDONLY_PROG flag to make the return value read-only. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-29 01:31:11 +02:00
Toke Høiland-Jørgensen	0cdbb4b09a	devmap: Allow map lookups from eBPF We don't currently allow lookups into a devmap from eBPF, because the map lookup returns a pointer directly to the dev->ifindex, which shouldn't be modifiable from eBPF. However, being able to do lookups in devmaps is useful to know (e.g.) whether forwarding to a specific interface is enabled. Currently, programs work around this by keeping a shadow map of another type which indicates whether a map index is valid. Since we now have a flag to make maps read-only from the eBPF side, we can simply lift the lookup restriction if we make sure this flag is always set. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-29 01:31:09 +02:00
Toke Høiland-Jørgensen	43e74c0267	bpf_xdp_redirect_map: Perform map lookup in eBPF helper The bpf_redirect_map() helper used by XDP programs doesn't return any indication of whether it can successfully redirect to the map index it was given. Instead, BPF programs have to track this themselves, leading to programs using duplicate maps to track which entries are populated in the devmap. This patch fixes this by moving the map lookup into the bpf_redirect_map() helper, which makes it possible to return failure to the eBPF program. The lower bits of the flags argument is used as the return code, which means that existing users who pass a '0' flag argument will get XDP_ABORTED. With this, a BPF program can check the return code from the helper call and react by, for instance, substituting a different redirect. This works for any type of map used for redirect. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-29 01:31:09 +02:00
Toke Høiland-Jørgensen	4b55cf290d	devmap: Rename ifindex member in bpf_redirect_info The bpf_redirect_info struct has an 'ifindex' member which was named back when the redirects could only target egress interfaces. Now that we can also redirect to sockets and CPUs, this is a bit misleading, so rename the member to tgt_index. Reorder the struct members so we can have 'tgt_index' and 'tgt_value' next to each other in a subsequent patch. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-29 01:31:09 +02:00
Toke Høiland-Jørgensen	d5df2830ca	devmap/cpumap: Use flush list instead of bitmap The socket map uses a linked list instead of a bitmap to keep track of which entries to flush. Do the same for devmap and cpumap, as this means we don't have to care about the map index when enqueueing things into the map (and so we can cache the map lookup). Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-29 01:31:08 +02:00
Toke Høiland-Jørgensen	c8af5cd75e	xskmap: Move non-standard list manipulation to helper Add a helper in list.h for the non-standard way of clearing a list that is used in xskmap. This makes it easier to reuse it in the other map types, and also makes sure this usage is not forgotten in any list refactorings in the future. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-29 01:31:08 +02:00
Stanislav Fomichev	2d6dbb9a65	selftests/bpf: fix -Wstrict-aliasing in test_sockopt_sk.c Let's use union with u8[4] and u32 members for sockopt buffer, that should fix any possible aliasing issues. test_sockopt_sk.c: In function ‘getsetsockopt’: test_sockopt_sk.c:115:2: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing] if ((__u32 )buf != 0x55AA2) { ^~ test_sockopt_sk.c:116:3: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing] log_err("Unexpected getsockopt(SO_SNDBUF) 0x%x != 0x55AA2", ^~~~~~~ Fixes: `8a027dc0d8` ("selftests/bpf: add sockopt test that exercises sk helpers") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-29 01:21:53 +02:00
Paul Blakey	f6dc1264f1	net/mlx5e: Disallow tc redirect offload cases we don't support After changing the parent_id to be the same for both NICs of same the hardware device, netdev_port_same_parent_id now returns true for more cases (all the lower devices in the hierarchy are on the same hardware device). If merged eswitch isn't enabled, these cases aren't supported, so disallow them. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-06-28 16:04:00 -07:00
Paul Blakey	7ff40a46dd	net/mlx5e: Expose same physical switch_id for all representors Report system_image_guid as the E-Switch switch_id, this ensures that when a NIC contains multiple PCI functions and which has merged eswitch capability, all representors from multiple PFs publish same switch_id. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-06-28 16:04:00 -07:00
Gavi Teitz	a90f88fe55	net/mlx5e: Don't refresh TIRs when updating representor SQs Refreshing TIRs is done in order to update the TIRs with the current state of SQs in the transport domain, so that the TIRs can filter out undesired self-loopback packets based on the source SQ of the packet. Representor TIRs will only receive packets that originate from their associated vport, due to dedicated steering, and therefore will never receive self-loopback packets, whose source vport will be the vport of the E-Switch manager, and therefore not the vport associated with the representor. As such, it is not necessary to refresh the representors' TIRs, since self-loopback packets can't reach them. Since representors only exist in switchdev mode, and there is no scenario in which a representor will exist in the transport domain alongside a non-representor, it is not necessary to refresh the transport domain's TIRs upon changing the state of a representor's queues. Therefore, do not refresh TIRs upon such a change. Achieve this by adding an update_rx callback to the mlx5e_profile, which refreshes TIRs for non-representors and does nothing for representors, and replace instances of mlx5e_refresh_tirs() upon changing the state of the queues with update_rx(). Signed-off-by: Gavi Teitz <gavi@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-06-28 16:04:00 -07:00
Arnd Bergmann	5233794b17	net/mlx5e: reduce stack usage in mlx5_eswitch_termtbl_create Putting an empty 'mlx5_flow_spec' structure on the stack is a bit wasteful and causes a warning on 32-bit architectures when building with clang -fsanitize-coverage: drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c: In function 'mlx5_eswitch_termtbl_create': drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c:90:1: error: the frame size of 1032 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] Since the structure is never written to, we can statically allocate it to avoid the stack usage. To be on the safe side, mark all subsequent function arguments that we pass it into as 'const' as well. Fixes: `10caabdaad` ("net/mlx5e: Use termination table for VLAN push actions") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-06-28 16:03:59 -07:00
Parav Pandit	f72e6c3e17	net/mlx5e: Set drvinfo in generic manner Consider PCI and non PCI device types while setting device name in get_drvinfo() callback using existing generic device. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Vu Pham <vuhuong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-06-28 16:03:59 -07:00
Parav Pandit	087067368a	net/mlx5e: Correct phys_port_name for PF port Currently PF phys_port_name is named as pfNvf-1 as vport number for PF vport is 65535. Correct PF's phys_port name as agreed upon name as pfN. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Vu Pham <vuhuong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-06-28 16:03:59 -07:00
Ariel Levkovich	5dc9520bf0	net/mlx5e: Report netdevice MPLS features Set supported device features in the netdevice MPLS features mask. This will enable HW checksumming and TSO for MPLS tagged traffic. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-06-28 16:03:59 -07:00
Ariel Levkovich	e4683f35f8	net/mlx5e: Move to HW checksumming advertising This patch changes the way the driver advertises its checksum offload capabilities within the net device features bit mask. Instead of advertising protocol specific checksumming capabilities which are limited today to IPv4 and IPv6, we move to reporing generic HW checksumming capabilities. This will allow the network stack to let mlx5 device offload checksum for cases where the IP header is encapsulated within another protocol and the skb->protocol doesn't indicate one of the IP versions protocol, specifically in the case of MPLS label encapsulating the IP header and the skb->protocol indiciates MPLS ethertype rather than IP. Moving the HW_CSUM reporting is required in the basic net device hw features mask and also in the extensions (vlan and encpasulation features) since the extensions are always multiplied by the basic features set during the packet's traversal through the stack's tx flow. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-06-28 16:03:59 -07:00
Gavi Teitz	e7e0bee8c5	net/mlx5: MPFS, Allow adding the same MAC more than once Remove the limitation preventing adding a vport's MAC address to the Multi-Physical Function Switch (MPFS) more than once per E-switch, as there is no difference in the MPFS if an address is being used by an E-switch more than once. This allows the E-switch to have multiple vports with the same MAC address, allowing vports to be classified by VLAN id instead of by MAC if desired. Signed-off-by: Gavi Teitz <gavi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-06-28 16:03:58 -07:00
Gavi Teitz	6311f30884	net/mlx5: MPFS, Cleanup add MAC flow Unify and isolate the error handling flow in mlx5_mpfs_add_mac(), removing code duplication. Signed-off-by: Gavi Teitz <gavi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-06-28 16:03:58 -07:00
Saeed Mahameed	4f5d1beadc	Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Misc updates from mlx5-next branch: 1) E-Switch vport metadata support for source vport matching 2) Convert mkey_table to XArray 3) Shared IRQs and to use single IRQ for all async EQs Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-06-28 16:03:54 -07:00
Vitaly Lifshits	def4ec6dce	e1000e: PCIm function state support Due to commit: 5d8682588605 ("[misc] mei: me: allow runtime pm for platform with D0i3") When disconnecting the cable and reconnecting it the NIC enters DMoff state. This caused wrong link indication and duplex mismatch. This bug is described in: https://bugzilla.redhat.com/show_bug.cgi?id=1689436 Checking PCIm function state and performing PHY reset after a timeout in watchdog task solves this issue. Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-28 16:00:28 -07:00
Detlev Casanova	59653e6497	e1000e: Make watchdog use delayed work Use delayed work instead of timers to run the watchdog of the e1000e driver. Simplify the code with one less middle function. Signed-off-by: Detlev Casanova <detlev.casanova@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-28 16:00:24 -07:00
Harshitha Ramamurthy	1d8d80b4e4	i40e: Add macvlan support on i40e This patch enables macvlan offloads for i40e. The idea is to use channels as macvlan interfaces. The channels are VSIs of type VMDQ. When the first macvlan is created, the maximum number of channels possible are created. From then on, as a macvlan interface is created, a macvlan filter is added to these already created channels (VSIs). This patch utilizes subordinate device traffic classes to make queue groups(channels) available for an upper device like a macvlan. Steps to configure macvlan offloads: 1. ethtool -K ethx l2-fwd-offload on 2. ip link add link ethx name macvlan1 type macvlan 3. ip addr add <address> dev macvlan1 4. ip link set macvlan1 up Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-28 16:00:20 -07:00
Alexander Duyck	1e1b0c658d	ixgbevf: Use cached link state instead of re-reading the value for ethtool Change the ethtool link settings call to just read the cached state out of the adapter structure instead of trying to recheck the value from the PF. Doing this should prevent excessive reading of the mailbox. Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Reviewed-by: "Guilherme G. Piccoli" <gpiccoli@canonical.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-28 16:00:13 -07:00
Colin Ian King	9fe06a5128	iavf: fix dereference of null rx_buffer pointer A recent commit `efa14c3985` ("iavf: allow null RX descriptors") added a null pointer sanity check on rx_buffer, however, rx_buffer is being dereferenced before that check, which implies a null pointer dereference bug can potentially occur. Fix this by only dereferencing rx_buffer until after the null pointer check. Addresses-Coverity: ("Dereference before null check") Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-28 16:00:10 -07:00
Artem Bityutskiy	cd502a7f7c	igb: add RR2DCDELAY to ethtool registers dump This patch adds the RR2DCDELAY register to the ethtool registers dump. RR2DCDELAY exists on I210 and I211 Intel Gigabit Ethernet chips and it stands for "Read Request To Data Completion Delay". Here is how this register is described in the I210 datasheet: "This field captures the maximum PCIe split time in 16 ns units, which is the maximum delay between the read request to the first data completion. This is giving an estimation of the PCIe round trip time." In other words, whenever I210 reads from the host memory (e.g., fetches a descriptor from the ring), the chip measures every PCI DMA read transaction and captures the maximum value. So it ends up containing the longest DMA transaction time. This register is very useful for troubleshooting and research purposes. If you are dealing with time-sensitive networks, this register can help you get an idea of your "I210-to-ring" latency. This helps answering questions like "should I have PCIe ASPM enabled?" or "should I enable deep C-states?" on my system. It is safe to read this register at any point, reading it has no effect on the I210 chip functionality. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-28 16:00:06 -07:00
Artem Bityutskiy	9379b39945	igb: minor ethool regdump amendment This patch has no functional impact and it is just a preparation for the following patch. It removes an early return from the 'igb_get_regs()' function by moving the 82576-only registers dump into an "if" block. With this preparation, we can dump more non-82576 registers at the end of this function. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-28 16:00:00 -07:00

... 2 3 4 5 6 ...

843500 commits