Commit graph

796826 commits

Author SHA1 Message Date
Mike Manning
e78190581a net: ensure unbound stream socket to be chosen when not in a VRF
The commit a04a480d43 ("net: Require exact match for TCP socket
lookups if dif is l3mdev") only ensures that the correct socket is
selected for packets in a VRF. However, there is no guarantee that
the unbound socket will be selected for packets when not in a VRF.
By checking for a device match in compute_score() also for the case
when there is no bound device and attaching a score to this, the
unbound socket is selected. And if a failure is returned when there
is no device match, this ensures that bound sockets are never selected,
even if there is no unbound socket.

Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 16:12:38 -08:00
Robert Shearman
3c82a21f43 net: allow binding socket in a VRF when there's an unbound socket
Change the inet socket lookup to avoid packets arriving on a device
enslaved to an l3mdev from matching unbound sockets by removing the
wildcard for non sk_bound_dev_if and instead relying on check against
the secondary device index, which will be 0 when the input device is
not enslaved to an l3mdev and so match against an unbound socket and
not match when the input device is enslaved.

Change the socket binding to take the l3mdev into account to allow an
unbound socket to not conflict sockets bound to an l3mdev given the
datapath isolation now guaranteed.

Signed-off-by: Robert Shearman <rshearma@vyatta.att-mail.com>
Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 16:12:38 -08:00
YueHaibing
f601a85bd7 net: hns3: Remove set but not used variable 'reset_level'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c: In function 'hclge_log_and_clear_ppp_error':
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c:821:24: warning:
 variable 'reset_level' set but not used [-Wunused-but-set-variable]
  enum hnae3_reset_type reset_level = HNAE3_NONE_RESET;

It never used since introduction in commit
01865a50d7 ("net: hns3: Add enable and process hw errors of TM scheduler")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:46:37 -08:00
David S. Miller
75790a7425 Merge branch 'nfp-more-set-actions-and-notifier-refactor'
Jakub Kicinski says:

====================
nfp: more set actions and notifier refactor

This series brings updates to flower offload code.  First Pieter adds
support for setting TTL, ToS, Flow Label and Hop Limit fields in IPv4
and IPv6 headers.

Remaining 5 patches deal with factoring out netdev notifiers from flower
code.  We already have two instances, and more is coming, so it's time
to move to one central notifier which then feeds individual feature
handlers.

I start that part by cleaning up the existing notifiers.  Next a central
notifier is added, and used by flower offloads.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:45:22 -08:00
Jakub Kicinski
0c665e2bf4 nfp: flower: use the common netdev notifier
Use driver's common notifier for LAG and tunnel configuration.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:45:22 -08:00
Jakub Kicinski
3e33359040 nfp: register a notifier handler in a central location for the device
Code interested in networking events registers its own notifier
handlers.  Create one device-wide notifier instance.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:45:22 -08:00
Jakub Kicinski
659bb404eb nfp: flower: make nfp_fl_lag_changels_event() void
nfp_fl_lag_changels_event() never fails, and therefore we would
never return NOTIFY_BAD for NETDEV_CHANGELOWERSTATE.  Make this
clearer by changing nfp_fl_lag_changels_event()'s return type
to void.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:45:22 -08:00
Jakub Kicinski
a558c982a8 nfp: flower: don't try to nack device unregister events
Returning an error from a notifier means we want to veto the change.
We shouldn't veto NETDEV_UNREGISTER just because we couldn't find
the tracking info for given master.

I can't seem to find a way to trigger this unless we have some
other bug, so it's probably not fix-worthy.

While at it move the checking if the netdev really is of interest
into the handling functions, like we do for other events.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:45:22 -08:00
Jakub Kicinski
e50bfdf74d nfp: flower: remove unnecessary iteration over devices
For flower tunnel offloads FW has to be informed about MAC addresses
of tunnel devices.  We use a netdev notifier to keep track of these
addresses.

Remove unnecessary loop over netdevices after notifier is registered.
The intention of the loop was to catch devices which already existed
on the system before nfp driver got loaded, but netdev notifier will
replay NETDEV_REGISTER events.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:45:22 -08:00
Pieter Jansen van Vuuren
4234d62c27 nfp: flower: add ipv6 set flow label and hop limit offload
Add ipv6 set flow label and hop limit action offload. Since pedit sets
headers per 4 byte word, we need to ensure that setting either version,
priority, payload_len or nexthdr does not get offloaded.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:45:21 -08:00
Pieter Jansen van Vuuren
a3c6b063fe nfp: flower: add ipv4 set ttl and tos offload
Add ipv4 set ttl and tos action offload. Since pedit sets headers per 4
byte word, we need to ensure that setting either version, ihl, protocol,
total length or checksum does not get offloaded.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:45:21 -08:00
David S. Miller
6a02d1fa03 Merge branch 'hns3-next'
Huazhong Tan says:

====================
hns3: provide new interfaces & bugfixes & code optimization

This patchset provides some reset interfaces for RAS & RoCE, also
some bugfixes and optimization related to reset.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:42:18 -08:00
Huazhong Tan
8b0195a305 net: hns3: fix for cmd queue memory not freed problem during reset
It is not necessary to reallocate the descriptor and remap the
descriptor memory in reset process, otherwise it may cause memory
not freed problem.

Also, this patch initializes the cmd queue's spinlocks in
hclgevf_alloc_cmd_queue, and take the spinlocks when reinitializing
cmd queue' registers.

Fixes: fedd0c15d2 ("net: hns3: Add HNS3 VF IMP(Integrated Management Proc) cmd interface")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:42:18 -08:00
Huazhong Tan
65e41e7e68 net: hns3: add error handler for hclge_reset()
When hclge_reset() is called, it may fail for several reasons.
For example, an higher-level reset event occurs, memory allocation
failure, hardware reset timeout, etc. Therefore, it is necessary
to add corresponding error handling for these situations.
1. A high-level reset is required due to a high-level reset failure.
2. For memory allocation failure, a high-level reset is initiated by
the timer to recover. The reason for using the timer is to prevent this
new high-level reset to interrupt the reset process of other pf/vf;
3. For the case of hardware reset timeout, reschedule the reset task
to wait for the hardware to complete the reset.
For memory allocation failure and reset timeouts, in order to prevent
an infinite number of scheduled reset tasks, the number of error
recovery needs to be limited.

This patch also add some reset related debug log printing.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:42:18 -08:00
Huazhong Tan
f403a84fb2 net: hns3: call roce's reset notify callback when resetting
While doing resetting, roce should do its uninitailization part
before nic's, and do its initialization part after nic's.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:42:18 -08:00
Huazhong Tan
35d93a3004 net: hns3: adjust the process of PF reset
When doing PF reset, the driver needs to do some preparatory work
before asserting PF reset. Since when hardware is resetting, it
is necessary to stop tx/rx queue, clear hardware table, etc,
otherwise hardware may run into unrecoverable state if there is
still IO running when the hardware is resetting.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:42:18 -08:00
Huazhong Tan
0742ed7c24 net: hns3: move some reset information from hnae3_handle into hclge_dev/hclgevf_dev
Saving reset related information in the hclge_dev/hclgevf_dev
structure is more suitable than the hnae3_handle, since hardware
related information is kept in these two structure.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:42:18 -08:00
Huazhong Tan
7cea834d94 net: hns3: ignore new coming low-level reset while doing high-level reset
When processing a higher level reset, the pending lower level reset
does not have to be processed anymore, because the higher level
reset is the superset of the lower level reset.

Therefore, when processing an higher level reset, the request of
lower level reset needs to be cleared.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:42:17 -08:00
Huazhong Tan
257e4f2994 net: hns3: use HNS3_NIC_STATE_RESETTING to indicate resetting
While hclge is going to reset, it will notify its client with
HNAE3_DOWN_CLIENT, so this client should get into a resetting
status from this moment, other operations from the stack need to
be blocked as well. And when the reset is finished, the client
will be notified with HNAE3_UP_CLIENT, so this is the end of
the resetting status.

This patch uses HNS3_NIC_STATE_RESETTING flag to implement that,
and adds hns3_nic_resetting() to indicate which operation is not
allowed.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:42:17 -08:00
Huazhong Tan
8df0fa9168 net: hns3: enable/disable ring in the enet while doing UP/DOWN
While hardware gets into reset status, the firmware will not respond to
driver's command request, which may cause ring not disabled problem
during reset process.

So this patch uses register instead of command to enable/disable the ring
in the enet while doing UP/DOWN operation.

Also, HNS3_RING_RX_VM_REG is previously unused, so change it to the
correct meaning, and add a wrapper function for readl().

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:42:17 -08:00
Huazhong Tan
7edff5339a net: hns3: adjust the location of clearing the table when doing reset
When doing a function reset, the hardware table should be cleared
before the hardware reset. In current code, this clearing is done
in hns3_reset_notify_uninit_enet, but it is too late, because
the hardware reset is already done, hns3_reset_notify_down_enet
is more suitable to do that.

Fixes: bb6b94a896 ("net: hns3: Add reset interface implementation in client")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:42:17 -08:00
Huazhong Tan
4d60291b6b net: hns3: provide some interface & information for the client
The client needs to know if the hardware is resetting when
loading or unloading itself, because client may abort the loading
process or wait for the reset process to finish when unloading
if hardware is resetting.

So this patch provides these interfaces to do it.
1. get_hw_reset_stat, the reset status of hardware.
2. ae_dev_resetting, whether reset task is scheduling.
3. ae_dev_reset_cnt, how many reset has been done.

Also, the RoCE client needs some field in the hnae3_roce_private_info
to save its state, and process_hw_error interface in the
hnae3_client_ops to process hardware errors.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:42:17 -08:00
Huazhong Tan
720bd5837e net: hns3: add set_default_reset_request in the hnae3_ae_ops
Currently, when reset_event is called because of tx timeout, it will
upgrade the reset level (For PF, HNAE3_FUNC_RESET -> HNAE3_CORE_RESET
-> HNAE3_GLOBAL_RESET) if the time between the new reset and last reset
is within 20 secs, or restore the reset level to HNAE3_FUNC_RESET if
the time between the new reset and last reset is over 20 secs.

There is requirement that the caller needs to decide the reset level
when triggering a reset, for example, RAS recovery. So this patch
adds the set_default_reset_request to meet this requirement.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:42:17 -08:00
Huazhong Tan
814da63c55 net: hns3: use HNS3_NIC_STATE_INITED to indicate the initialization state of enet
Besides of module_init and module_exit, the process of reset will
also uninitialize and initialize the enet client. When reset process
fails with enet client uninitialized, the module_exit does not need
to uninitialize the enet client, otherwise it may cause double
uninitialization problem.

So we need the HNS3_NIC_STATE_INITED flag to indicate whether
the enet client is initialized.

Also HNS3_NIC_STATE_REINITING is previously unused, so change it to
HNS3_NIC_STATE_INITED.

Fixes: bb6b94a896 ("net: hns3: Add reset interface implementation in client")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07 11:42:17 -08:00
Sasha Neftin
920664a8f7 igc: Clean up code
Address few community comments.
Remove unused code, will be added per demand.
Remove blank lines and unneeded includes.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-07 09:47:01 -08:00
Miroslav Lichvar
e1f65b0d70 e1000e: allow non-monotonic SYSTIM readings
It seems with some NICs supported by the e1000e driver a SYSTIM reading
may occasionally be few microseconds before the previous reading and if
enabled also pass e1000e_sanitize_systim() without reaching the maximum
number of rereads, even if the function is modified to check three
consecutive readings (i.e. it doesn't look like a double read error).
This causes an underflow in the timecounter and the PHC time jumps hours
ahead.

This was observed on 82574, I217 and I219. The fastest way to reproduce
it is to run a program that continuously calls the PTP_SYS_OFFSET ioctl
on the PHC.

Modify e1000e_phc_gettime() to use timecounter_cyc2time() instead of
timecounter_read() in order to allow non-monotonic SYSTIM readings and
prevent the PHC from jumping.

Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-07 09:47:01 -08:00
Dan Carpenter
bb9089b668 igc: Tidy up some white space
I just cleaned up a couple small white space issues.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-07 09:47:01 -08:00
Colin Ian King
14b21cec85 igc: fix error return handling from call to netif_set_real_num_tx_queues
The call to netif_set_real_num_tx_queues is not assigning the error
return to variable err even though the next line checks err for an
error.  Fix this by adding the missing err assignment.

Detected by CoverityScan, CID#1474551 ("Logically dead code")

Fixes: 3df25e4c1e ("igc: Add interrupt support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-07 09:47:01 -08:00
YueHaibing
84cfa53740 igc: Remove set but not used variable 'pci_using_dac'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/intel/igc/igc_main.c: In function 'igc_probe':
drivers/net/ethernet/intel/igc/igc_main.c:3535:11: warning:
 variable 'pci_using_dac' set but not used [-Wunused-but-set-variable]

It never used since introduction in commit
d89f88419f ("igc: Add skeletal frame for Intel(R) 2.5G Ethernet Controller support")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-07 09:47:01 -08:00
YueHaibing
dda458d285 igc: Remove set but not used variables 'ctrl_ext, link_mode'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/intel/igc/igc_base.c: In function 'igc_init_phy_params_base':
drivers/net/ethernet/intel/igc/igc_base.c:240:6: warning:
 variable 'ctrl_ext' set but not used [-Wunused-but-set-variable]
  u32 ctrl_ext;

drivers/net/ethernet/intel/igc/igc_base.c: In function 'igc_get_invariants_base':
drivers/net/ethernet/intel/igc/igc_base.c:290:6: warning:
 variable 'link_mode' set but not used [-Wunused-but-set-variable]
  u32 link_mode = 0;

It never used since introduction in
commit c0071c7aa5 ("igc: Add HW initialization code")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-07 09:47:01 -08:00
Todd Fujinaka
540a152da7 i40e/ixgbe/igb: fail on new WoL flag setting WAKE_MAGICSECURE
There's a new flag for setting WoL filters that is only
enabled on one manufacturer's NICs, and it's not ours. Fail
with EOPNOTSUPP.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-07 09:47:01 -08:00
Jacob Keller
a9e510589d intel-ethernet: software timestamp skbs as late as possible
Many of the Intel Ethernet drivers call skb_tx_timestamp() earlier than
necessary. Move the calls to this function to the latest point possible,
just prior to notifying hardware of the new Tx packet when we bump the
tail register.

This affects i40e, iavf, igb, igc, and ixgbe.

The e100, e1000, e1000e, fm10k, and ice drivers already call the
skb_tx_timestamp() function just prior to indicating the Tx packet to
hardware, so they do not need to be changed.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-07 09:47:01 -08:00
Jacob Keller
9fc145fcb5 ixgbevf: add support for software timestamps
Add a call to skb_tx_timestamp in the ixgbevf_tx_map function. This
enables software timestamping for packets sent over this device driver.
The call is placed just prior to when we notify hardware of the new
packet, in order to software timestamp as close as possible to when the
hardware will transmit.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-07 09:47:00 -08:00
Shannon Nelson
7fa57ca443 ixgbe: allow IPsec Tx offload in VEPA mode
When it's possible that the PF might end up trying to send a
packet to one of its own VFs, we have to forbid IPsec offload
because the device drops the packets into a black hole.
See commit 47b6f50077 ("ixgbe: disallow IPsec Tx offload
when in SR-IOV mode") for more info.

This really is only necessary when the device is in the default
VEB mode.  If instead the device is running in VEPA mode,
the packets will go through the encryption engine and out the
MAC/PHY as normal, and get "hairpinned" as needed by the switch.

So let's not block IPsec offload when in VEPA mode.  To get
there with the ixgbe device, use the handy 'bridge' command:
	bridge link set dev eth1 hwmode vepa

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-07 09:47:00 -08:00
Colin Ian King
0db4a47c05 ixgbe: don't clear_bit on xdp_ring->state if xdp_ring is null
There is an earlier check to see if xdp_ring is null when configuring
the tx ring, so assuming that it can still be null, the clearing of
the xdp_ring->state currently could end up with a null pointer
dereference.  Fix this by only clearing the bit if xdp_ring is not null.

Detected by CoverityScan, CID#1473795 ("Dereference after null check")

Fixes: 024aa5800f ("ixgbe: added Rx/Tx ring disable/enable functions")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-07 09:47:00 -08:00
Lance Roy
b86077207d igbvf: Replace spin_is_locked() with lockdep
lockdep_assert_held() is better suited to checking locking requirements,
since it won't get confused when someone else holds the lock. This is
also a step towards possibly removing spin_is_locked().

Signed-off-by: Lance Roy <ldr709@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-07 09:47:00 -08:00
David S. Miller
7c588c7468 Merge branch 'net-systemport-Unmap-queues-upon-DSA-unregister-event'
Florian Fainelli says:

====================
net: systemport: Unmap queues upon DSA unregister event

This patch series fixes the unbinding/binding of the bcm_sf2 switch
driver along with bcmsysport which monitors the switch port queues.
Because the driver was not processing the DSA_PORT_UNREGISTER event, we
would not be unmapping switch port/queues, which could cause incorrect
decisions to be made by the HW (e.g: queue always back-pressured).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06 15:39:48 -08:00
Florian Fainelli
da106a140f net: systemport: Unmap queues upon DSA unregister event
Binding and unbinding the switch driver which creates the DSA slave
network devices for which we set-up inspection would lead to
undesireable effects since we were not clearing the port/queue mapping
to the SYSTEMPORT TX queue.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06 15:39:48 -08:00
Florian Fainelli
25c4407046 net: systemport: Simplify queue mapping logic
The use of a bitmap speeds up the finding of the first available queue
to which we could start establishing the mapping for, but we still have
to loop over all slave network devices to set them up. Simplify the
logic to have a single loop, and use the fact that a correctly
configured ring has inspect set to true. This will make things simpler
to unwind during device unregistration.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06 15:39:48 -08:00
Florian Fainelli
c04a17d2a9 net: dsa: bcm_sf2: Turn on PHY to allow successful registration
We are binding to the PHY using the SF2 slave MDIO bus that we create,
binding involves reading the PHY's MII_PHYSID1/2 which won't be possible
if the PHY is turned off. Temporarily turn it on/off for the bus probing
to succeeed. This fixes unbind/bind problems where the port connecting
to that PHY would be in error since it could not connect to it.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06 15:39:48 -08:00
David S. Miller
5882d526d8 Merge branch 'net-dsa-bcm_sf2-Store-rules-in-lists'
Florian Fainelli says:

====================
net: dsa: bcm_sf2: Store rules in lists

This patch series changes the bcm-sf2 driver to keep a copy of the
inserted rules as opposed to using the HW as a storage area for a number
of reasons:

- this helps us with doing duplicate rule detection in a faster way, it
  would have required a full rule read before

- this helps with Pablo's on-going work to convert ethtool_rx_flow_spec
  to a more generic flow rule structure by having fewer code paths to
  convert to the new structure/helpers

- we need to cache copies to restore them during drive resumption,
  because depending on the low power mode the system has entered, the
  switch may have lost all of its context
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06 15:05:23 -08:00
Florian Fainelli
80f8dea876 net: systemport: Restore Broadcom tag match filters upon resume
Some of the system suspend states that we support wipe out entirely the
HW contents. If we had a Wake-on-LAN filter programmed prior to going
into suspend, but we did not actually wake-up from Wake-on-LAN and
instead used a deeper suspend state, make sure we restore the CID number
that we need to match against.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06 15:05:22 -08:00
Florian Fainelli
1c60c7f900 net: dsa: bcm_sf2: Get rid of unmarshalling functions
Now that we have migrated the CFP rule handling to a list with a
software copy, the delete/get operation just returns what is on the
list, no need to read from the hardware which is both slow and more
error prone.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06 15:05:22 -08:00
Florian Fainelli
1c0130f0b5 net: dsa: bcm_sf2: Restore CFP rules during system resume
The hardware can lose its context during system suspend, and depending
on the switch generation (7445 vs. 7278), while the rules are still
there, they will have their valid bit cleared (because that's the
fastest way for the HW to reset things). Just make sure we re-apply them
coming back from resume. The 7445 switch is an older version of the core
that has some quirky RAM technology requiring a delete then re-inser to
guarantee the RAM entries are properly latched.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06 15:05:22 -08:00
Florian Fainelli
ce24b08a2e net: dsa: bcm_sf2: Split rule handling from HW operation
In preparation for restoring CFP rules during system wide system
suspend/resume where the hardware loses its context, split the rule
validation from its actual insertion as well as the rule removal from
its actual hardware deletion operation.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06 15:05:22 -08:00
Florian Fainelli
ae7a5aff78 net: dsa: bcm_sf2: Keep copy of inserted rules
We tried hard to use the hardware as a storage area, which made things
needlessly complex in that we had to both marshall and unmarshall the
ethtool_rx_flow_spec into what the CFP hardware understands but it did
not require any driver level allocations, so that was nice.

Keep a copy of the ethtool_rx_flow_spec rule we want to insert, and also
make sure we don't have a duplicate rule already. This greatly speeds up
the deletion time since we only need to clear the slice's valid bit and
not perform a full read.

This is a preparatory step for being able to restore rules upon system
resumption where the hardware loses its context partially or entirely.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06 15:05:22 -08:00
David S. Miller
95772ec991 Merge branch 'net-More-extack-messages'
David Ahern says:

====================
net: More extack messages

Add more extack messages for several link create errors (e.g., invalid
number of queues, unknown link kind) and invalid metrics argument.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06 15:00:45 -08:00
David Ahern
68d57f3b1d rtnetlink: Add more extack messages to rtnl_newlink
Add extack arg to the nla_parse_nested calls in rtnl_newlink, and
add messages for unknown device type and link network namespace id.
In particular, it improves the failure message when the wrong link
type is used. From
    $ ip li add bond1 type bonding
    RTNETLINK answers: Operation not supported
to
    $ ip li add bond1 type bonding
    Error: Unknown device type.

(The module name is bonding but the link type is bond.)

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06 15:00:45 -08:00
David Ahern
d7e774f356 net: Add extack argument to ip_fib_metrics_init
Add extack argument to ip_fib_metrics_init and add messages for invalid
metrics.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06 15:00:45 -08:00
David Ahern
d0522f1cd2 net: Add extack argument to rtnl_create_link
Add extack arg to rtnl_create_link and add messages for invalid
number of Tx or Rx queues.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06 15:00:45 -08:00