Commit graph

1042995 commits

Author SHA1 Message Date
Leon Romanovsky
e9310aed8e net/mlx5: Publish and unpublish all devlink parameters at once
The devlink parameters were published in two steps despite being static
and known in advance.

First step was to use devlink_params_publish() which iterated over all
known up to that point parameters and sent notification messages.
In second step, the call was devlink_param_publish() that looped over
same parameters list and sent notification for new parameters.

In order to simplify the API, move devlink_params_publish() to be called
when all parameters were already added and save the need to iterate over
parameters list again.

As a side effect, this change fixes the error unwind flow in which
parameters were not marked as unpublished.

Fixes: 82e6c96f04 ("net/mlx5: Register to devlink ingress VLAN filter trap")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 16:12:55 +01:00
David S. Miller
dc50b930be Merge branch 'qdisc-visibility'
Jakub Kicinski says:

====================
net: sched: update default qdisc visibility after Tx queue cnt changes

Matthew noticed that number of children reported by mq does not match
number of queues on reconfigured interfaces. For example if mq is
instantiated when there is 8 queues it will always show 8 children,
regardless of config being changed:

 # ethtool -L eth0 combined 8
 # tc qdisc replace dev eth0 root handle 100: mq
 # tc qdisc show dev eth0
 qdisc mq 100: root
 qdisc pfifo_fast 0: parent 100:8 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:7 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:6 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:5 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:4 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:3 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:2 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:1 bands 3 priomap 1 2 ...
 # ethtool -L eth0 combined 1
 # tc qdisc show dev eth0
 qdisc mq 100: root
 qdisc pfifo_fast 0: parent 100:8 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:7 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:6 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:5 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:4 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:3 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:2 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:1 bands 3 priomap 1 2 ...
 # ethtool -L eth0 combined 32
 # tc qdisc show dev eth0
 qdisc mq 100: root
 qdisc pfifo_fast 0: parent 100:8 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:7 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:6 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:5 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:4 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:3 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:2 bands 3 priomap 1 2 ...
 qdisc pfifo_fast 0: parent 100:1 bands 3 priomap 1 2 ...

This patchset fixes this by hashing and unhasing the default
child qdiscs as number of queues gets adjusted.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 15:46:02 +01:00
Jakub Kicinski
2d6a58996e selftests: net: test ethtool -L vs mq
Add a selftest for checking mq children are visible after ethtool -L.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 15:46:02 +01:00
Jakub Kicinski
2e367522ce netdevsim: add ability to change channel count
For testing visibility of mq/mqprio default children.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 15:46:02 +01:00
Jakub Kicinski
1e080f1775 net: sched: update default qdisc visibility after Tx queue cnt changes
mq / mqprio make the default child qdiscs visible. They only do
so for the qdiscs which are within real_num_tx_queues when the
device is registered. Depending on order of calls in the driver,
or if user space changes config via ethtool -L the number of
qdiscs visible under tc qdisc show will differ from the number
of queues. This is confusing to users and potentially to system
configuration scripts which try to make sure qdiscs have the
right parameters.

Add a new Qdisc_ops callback and make relevant qdiscs TTRT.

Note that this uncovers the "shortcut" created by
commit 1f27cde313 ("net: sched: use pfifo_fast for non real queues")
The default child qdiscs beyond initial real_num_tx are always
pfifo_fast, no matter what the sysfs setting is. Fixing this
gets a little tricky because we'd need to keep a reference
on whatever the default qdisc was at the time of creation.
In practice this is likely an non-issue the qdiscs likely have
to be configured to non-default settings, so whatever user space
is doing such configuration can replace the pfifos... now that
it will see them.

Reported-by: Matthew Massey <matthewmassey@fb.com>
Reviewed-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 15:46:02 +01:00
David S. Miller
c506cc5bc6 Merge branch 'ibmvnic-next'
Sukadev Bhattiprolu says:

====================
ibmvnic: Reuse ltb, rx, tx pools

It can take a long time to free and reallocate rx and tx pools and long
term buffer (LTB) during each reset of the VNIC. This is specially true
when the partition (LPAR) is heavily loaded and going through a Logical
Partition Migration (LPM). The long drawn reset causes the LPAR to lose
connectivity for extended periods of time and results in "RMC connection"
errors and the LPM failing.

What is worse is that during the LPM we could get a failover because
of the lost connectivity. At that point, the vnic driver releases
even the resources it has already allocated and starts over.

As long as the resources we have already allocated are valid/applicable,
we might as well hold on to them while trying to allocate the remaining
resources. This patch set attempts to reuse the resources previously
allocated as long as they are valid. It seems to vastly improve the
time taken for the vnic reset and signficantly reduces the chances of
getting the RMC connection errors. We do get still them occasionally,
but appears to be for reasons other than memory allocation delays and
those are still being investigated.

If the backing devices for a vnic adapter are not "matched" (see "pool
parameters" in patches 8 and 9) it is possible that we will still free
all the resources and allocate them. If that becomes a common problem,
we have to address it separately.

Thanks to input and extensive testing from Brian King, Cris Forno,
Dany Madden, Rick Lindsley.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:12:24 +01:00
Sukadev Bhattiprolu
bbd809305b ibmvnic: Reuse tx pools when possible
Rather than releasing the tx pools on every close and reallocating
them on open, reuse the tx pools unless the pool parameters (number
of pools, size of each pool or size of each buffer in a pool) have
changed.

If the pool parameters changed, then release the old pools (if
any) and allocate new ones.

Specifically release tx pools, if:
	- adapter is removed,
	- pool parameters change during reset,
	- we encounter an error when opening the adapter in response
	  to a user request (in ibmvnic_open()).

and don't release them:
	- in __ibmvnic_close() or
	- on errors in __ibmvnic_open()

in the hope that we can reuse them during this or next reset.

With these changes reset_tx_pools() can be dropped because its
optimization is now included in init_tx_pools() itself.

cleanup_tx_pools() releases all the skbs associated with the pool and
is called from ibmvnic_cleanup(), which is called on every reset. Since
we want to reuse skbs across resets, move cleanup_tx_pools() out of
ibmvnic_cleanup() and call it only when user closes the adapter.

Add two new adapter fields, ->prev_mtu, ->prev_tx_pool_size to track the
previous values and use them to decide whether to reuse or realloc the
pools.

Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:12:24 +01:00
Sukadev Bhattiprolu
489de956e7 ibmvnic: Reuse rx pools when possible
Rather than releasing the rx pools on and reallocating them on every
reset, reuse the rx pools unless the pool parameters (number of pools,
size of each pool or size of each buffer in a pool) have changed.

If the pool parameters changed, then release the old pools (if any)
and allocate new ones.

Specifically release rx pools, if:
	- adapter is removed,
	- pool parameters change during reset,
	- we encounter an error when opening the adapter in response
	  to a user request (in ibmvnic_open()).

and don't release them:
	- in __ibmvnic_close() or
	- on errors in __ibmvnic_open()

in the hope that we can reuse them on the next reset.

With these, reset_rx_pools() can be dropped because its optimzation is
now included in init_rx_pools() itself.

cleanup_rx_pools() releases all the skbs associated with the pool and
is called from ibmvnic_cleanup(), which is called on every reset. Since
we want to reuse skbs across resets, move cleanup_rx_pools() out of
ibmvnic_cleanup() and call it only when user closes the adapter.

Add two new adapter fields, ->prev_rx_buf_sz, ->prev_rx_pool_size to
keep track of the previous values and use them to decide whether to
reuse or realloc the pools.

Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:12:24 +01:00
Sukadev Bhattiprolu
f8ac0bfa7d ibmvnic: Reuse LTB when possible
Reuse the long term buffer during a reset as long as its size has
not changed. If the size has changed, free it and allocate a new
one of the appropriate size.

When we do this, alloc_long_term_buff() and reset_long_term_buff()
become identical. Drop reset_long_term_buff().

Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:12:24 +01:00
Sukadev Bhattiprolu
129854f061 ibmvnic: Use bitmap for LTB map_ids
In a follow-on patch, we will reuse long term buffers when possible.
When doing so we have to be careful to properly assign map ids. We
can no longer assign them sequentially because a lower map id may be
available and we could wrap at 255 and collide with an in-use map id.

Instead, use a bitmap to track active map ids and to find a free map id.
Don't need to take locks here since the map_id only changes during reset
and at that time only the reset worker thread should be using the adapter.

Noticed this when analyzing an error Dany Madden ran into with the
patch set.

Reported-by: Dany Madden <drt@linux.ibm.com>
Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:12:24 +01:00
Sukadev Bhattiprolu
0d1af4fa71 ibmvnic: init_tx_pools move loop-invariant code
In init_tx_pools() move some loop-invariant code out of the loop.

Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:12:23 +01:00
Sukadev Bhattiprolu
8243c7ed6d ibmvnic: Use/rename local vars in init_tx_pools
Use/rename local variables in init_tx_pools() for consistency with
init_rx_pools() and for readability. Also add some comments

Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:12:23 +01:00
Sukadev Bhattiprolu
0df7b9ad8f ibmvnic: Use/rename local vars in init_rx_pools
To make the code more readable, use/rename some local variables.
Basically we have a set of pools, num_pools. Each pool has a set of
buffers, pool_size and each buffer is of size buff_size.

pool_size is a bit ambiguous (whether size in bytes or buffers). Add
a comment in the header file to make it explicit.

Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:12:23 +01:00
Sukadev Bhattiprolu
0f2bf3188c ibmvnic: Fix up some comments and messages
Add/update some comments/function headers and fix up some messages.

Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:12:23 +01:00
Sukadev Bhattiprolu
38106b2c43 ibmvnic: Consolidate code in replenish_rx_pool()
For better readability, consolidate related code in replenish_rx_pool()
and add some comments.

Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:12:23 +01:00
David S. Miller
923990f643 Merge branch 'ptp-ocp-timecard-v13-fw'
Jonathan Lemon says:

====================
timecard updates for v13 firmware

This update mainly deals with features for the TimeCard v13 firmware.

The signals provided from the external SMA connectors can be steered
to different locations, and the generated SMA signals can be chosen.

Future timecard revisions will allow selectable I/O on any of the
SMA connectors, so name the attributes appropriately, and set up
the ABI in preparation for the new features.

The update also adds support for IRIG-B and DCF formats, as well
as NMEA output.

A ts_window_adjust tunable is also provided to fine-tune the
PHC:SYS time mapping.
--
v1: Earlier reviewed series was for v10 firmware, this is expanded to
    include the v13 features.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:10:01 +01:00
Jonathan Lemon
d7050a2b85 docs: ABI: Add sysfs documentation for timecard
This patch describes the sysfs interface implemented by the
ptp_ocp driver, under /sys/class/timecard.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:10:01 +01:00
Jonathan Lemon
1acffc6e09 ptp: ocp: Add timestamp window adjustment
The following process is used to read the PHC clock and correlate
the reading with the "correct" system time.

- get starting timestamp
- issue PCI write command
- issue PCI read command
- get ending timestamp
- read latched sec/nsec registers

The write command is posted to PCI bus and returns.  When the write
arrives at the FPGA, the PHC time is latched into the sec/nsec registers,
and a flag is set indicating the registers are valid.  The read command
returns this flag, and the time retrieval proceeds.

Below is a non-scaled picture of the timing diagram involved.  The
PHC time corresponds to some SYS time between [start, end].  Userspace
usually uses the midpoint between [start, end] to estimate the PCI
delay and match this with the PHC time.

 [start] |                |
   write |-------+        |
	 |        \       |
    read |----+    +----->|
	 |     \          * PHC time latched into register
	 |      \         |
midpoint |       +------->|
	 |                |
	 |                |
	 |           +----|
	 |          /     |
	 |<--------+      |
   [end] |                |

As the diagram indicates, the PHC time is latched before the midpoint,
so the system clock time is slightly off the real PHC time.  This shows
up as a phase error with an oscilliscope.

The workaround here is to provide a tunable which reduces (shrinks)
the end time in the above diagram.  This in turn moves the calculated
midpoint so the system time and PHC time are in agreemment.

Currently, the adjustment reduces the end time by 3/16th of the entire
window.  E.g.:  [start, end] ==> [start, (end - (3/16 * end)], which
produces reasonably good results.

Also reduce delays by just writing to the clock control register
instead of performing a read/modify/write sequence, as the contents
of the control register are known.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:10:01 +01:00
Jonathan Lemon
6d59d4fa17 ptp: ocp: Have FPGA fold in ns adjustment for adjtime.
The current implementation of adjtime uses gettime/settime to
perform nanosecond adjustments.  This introduces addtional phase
errors due to delays.

Instead, use the FPGA's ability to just apply the nanosecond
adjustment to the clock directly.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:10:01 +01:00
Jonathan Lemon
a62a56d04e ptp: ocp: Enable 4th timestamper / PPS generator
A 4th timestamper is added which timestamps the output of the PHC.

The clock nanosecond offset is not always zero, so when compared
to other timestampers, this provides precise measurements.

Also, the timestamper interrupt from the PHC can be used to generate
a PPS signal for /dev/pps.

Also allow PTP_CLK_REQ_PEROUT requests for a 1PPS output, but do
not actually configure any output pins, this is done via sysfs.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:10:01 +01:00
Jonathan Lemon
71d7e08504 ptp: ocp: Add second GNSS device
Upcoming boards may have a second GNSS receiver, getting information
from a different constellation than the first receiver, which provides
some measure of anti-spoofing.

Expose the sysfs attribute for this device, if detected.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:10:01 +01:00
Jonathan Lemon
e3516bb450 ptp: ocp: Add NMEA output
The timecard can provide a NMEA-1083 ZDA (time and date) output
string on a serial port, which can be used to drive other devices.

Add the NMEA resources, and the serial port as a sysfs attribute.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:10:01 +01:00
Jonathan Lemon
f67bf662d2 ptp: ocp: Add debugfs entry for timecard
Provide a view into the timecard internals for debugging.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:10:01 +01:00
Jonathan Lemon
065efcc5e9 ptp: ocp: Separate the init and info logic
On startup, parts of the FPGA need to be initialized - break these
out into their own functions, separate from the purely informational
blocks.

On startup, distrbute the UTC:TAI offset from the NMEA GNSS parser,
if it is available.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:10:01 +01:00
Jonathan Lemon
89260d8782 ptp: ocp: Add sysfs attribute utc_tai_offset
IRIG and DCF output time in UTC, but the timecard operates
on TAI internally.  Add an attribute node which allows adding
an offset to these modes before output.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:10:01 +01:00
Jonathan Lemon
d14ee2525d ptp: ocp: Add IRIG-B output mode control
IRIG-B has several different output formats, the timecard defaults
to using B007.  Add a control which selects different output modes.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:10:01 +01:00
Jonathan Lemon
6baf292542 ptp: ocp: Add IRIG-B and DCF blocks
IRIG (Inter-range Instrumentation Group) timecode format on
one of the SMA output channels is provided by the IRIG master
FPGA block.  Enable the master when the IRIG output format is
selected on either one of the output channels.

By default, the output is in B007 format.

DCF output format is provided by the DCF master block.

Also enable the IRIG and DCF slaves, which parse an incoming
signal from the external SMA connectors, and may be used to
adjust the PHC.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:10:00 +01:00
Jonathan Lemon
e1daf0ec73 ptp: ocp: Add SMA selector and controls
The latest firmware for the TimeCard adds selectable signals for
the SMA input/outputs.  Add support for SMA selectors, and the
GPIO controls needed for steering signals.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:10:00 +01:00
Jonathan Lemon
dcf614692c ptp: ocp: Add third timestamper
The firmware may provide a third signal timestamper, so make it
available for use.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:10:00 +01:00
Jonathan Lemon
bceff2905e ptp: ocp: Report error if resource registration fails.
If a resource could not be registered, report the name of
the resource and the error code.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:10:00 +01:00
Jonathan Lemon
56ec44033c ptp: ocp: Skip resources with out of range irqs
The TimeCard exposes different resources, which may have their
own irqs.  Space for the irqs is allocated through a MSI or MSI-X
interrupt vector.  On some platforms, the interrupt allocation
fails.

Rather than making this fatal, just skip exposing those resources.

The main timecard functionality (that of a PTP clock) will work
without the additional resources.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:10:00 +01:00
Jonathan Lemon
1447149d65 ptp: ocp: Skip I2C flash read when there is no controller.
If an I2C controller isn't present, don't try and read the I2C flash.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:10:00 +01:00
Jonathan Lemon
498ad3f438 ptp: ocp: Parameterize the TOD information display.
Only display the TOD information if there is a corresponding
TOD resource.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:10:00 +01:00
Jonathan Lemon
1618df6afa ptp: ocp: parameterize the i2c driver used
Move the xilinx i2c driver parameters to the resource block instead
of hardcoding things in the registration functions.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:10:00 +01:00
Aleksander Jan Bajkowski
c688721464 dt-bindings: net: lantiq: Add the burst length properties
The new added properties are used for configuring burst length.

Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:02:01 +01:00
Aleksander Jan Bajkowski
dac0bad937 dt-bindings: net: lantiq,etop-xway: Document Lantiq Xway ETOP bindings
Document the Lantiq Xway SoC series External Bus Unit (ETOP) bindings.

Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:02:01 +01:00
Aleksander Jan Bajkowski
5535bcfa72 dt-bindings: net: lantiq-xrx200-net: convert to the json-schema
Convert the Lantiq PMAC Device Tree binding documentation to json-schema.

Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:02:01 +01:00
Aleksander Jan Bajkowski
14d4e308e0 net: lantiq: configure the burst length in ethernet drivers
Configure the burst length in Ethernet drivers. This improves
Ethernet performance by 58%. According to the vendor BSP,
8W burst length is supported by ar9 and newer SoCs.

The NAT benchmark results on xRX200 (Down/Up):
* 2W: 330 Mb/s
* 4W: 432 Mb/s    372 Mb/s
* 8W: 520 Mb/s    389 Mb/s

Tested on xRX200 and xRX330.

Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:02:01 +01:00
Aleksander Jan Bajkowski
49293bbc50 MIPS: lantiq: dma: make the burst length configurable by the drivers
Make the burst length configurable by the drivers.

Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:02:01 +01:00
Aleksander Jan Bajkowski
5ad74d39c5 MIPS: lantiq: dma: fix burst length for DEU
The current definition of 2W burst length is invalid.
This patch fixes it. Current downstream DEU driver doesn't
use DMA. An incorrect burst length value doesn't cause any
errors. This patch also adds other burst length values.

Fixes: dfec1a827d ("MIPS: Lantiq: Add DMA support")
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:02:01 +01:00
Aleksander Jan Bajkowski
5ca9ce2ba4 MIPS: lantiq: dma: reset correct number of channel
Different SoCs have a different number of channels, e.g .:
* amazon-se has 10 channels,
* danube+ar9 have 20 channels,
* vr9 has 28 channels,
* ar10 has 24 channels.

We can read the ID register and, depending on the reported
number of channels, reset the appropriate number of channels.

Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:02:01 +01:00
Aleksander Jan Bajkowski
c12aa581f6 MIPS: lantiq: dma: add small delay after reset
Reading the DMA registers immediately after the reset causes
Data Bus Error. Adding a small delay fixes this issue.

Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 11:02:01 +01:00
Randy Dunlap
7366c23ff4 ptp: dp83640: don't define PAGE0
Building dp83640.c on arch/parisc/ produces a build warning for
PAGE0 being redefined. Since the macro is not used in the dp83640
driver, just make it a comment for documentation purposes.

In file included from ../drivers/net/phy/dp83640.c:23:
../drivers/net/phy/dp83640_reg.h:8: warning: "PAGE0" redefined
    8 | #define PAGE0                     0x0000
                 from ../drivers/net/phy/dp83640.c:11:
../arch/parisc/include/asm/page.h:187: note: this is the location of the previous definition
  187 | #define PAGE0   ((struct zeropage *)__PAGE_OFFSET)

Fixes: cb646e2b02 ("ptp: Added a clock driver for the National Semiconductor PHYTER.")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Richard Cochran <richard.cochran@omicron.at>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20210913220605.19682-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-14 20:03:24 -07:00
Linus Walleij
339133f6c3 net: dsa: tag_rtl4_a: Drop bit 9 from egress frames
This drops the code setting bit 9 on egress frames on the
Realtek "type A" (RTL8366RB) frames.

This bit was set on ingress frames for unknown reason,
and was set on egress frames as the format of ingress
and egress frames was believed to be the same. As that
assumption turned out to be false, and since this bit
seems to have zero effect on the behaviour of the switch
let's drop this bit entirely.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210913143156.1264570-1-linus.walleij@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-14 19:34:03 -07:00
Adrian Bunk
52ce14c134 bnx2x: Fix enabling network interfaces without VFs
This function is called to enable SR-IOV when available,
not enabling interfaces without VFs was a regression.

Fixes: 65161c3555 ("bnx2x: Fix missing error code in bnx2x_iov_init_one()")
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Reported-by: YunQiang Su <wzssyqa@gmail.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Cc: stable@vger.kernel.org
Acked-by: Shai Malin <smalin@marvell.com>
Link: https://lore.kernel.org/r/20210912190523.27991-1-bunk@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-14 19:26:48 -07:00
Alexei Starovoitov
4c24483e24 Merge branch 'bpf: add support for new btf kind BTF_KIND_TAG'
Yonghong Song says:

====================

LLVM14 added support for a new C attribute ([1])
  __attribute__((btf_tag("arbitrary_str")))
This attribute will be emitted to dwarf ([2]) and pahole
will convert it to BTF. Or for bpf target, this
attribute will be emitted to BTF directly ([3], [4]).
The attribute is intended to provide additional
information for
  - struct/union type or struct/union member
  - static/global variables
  - static/global function or function parameter.

This new attribute can be used to add attributes
to kernel codes, e.g., pre- or post- conditions,
allow/deny info, or any other info in which only
the kernel is interested. Such attributes will
be processed by clang frontend and emitted to
dwarf, converting to BTF by pahole. Ultimiately
the verifier can use these information for
verification purpose.

The new attribute can also be used for bpf
programs, e.g., tagging with __user attributes
for function parameters, specifying global
function preconditions, etc. Such information
may help verifier to detect user program
bugs.

After this series, pahole dwarf->btf converter
will be enhanced to support new llvm tag
for btf_tag attribute. With pahole support,
we will then try to add a few real use case,
e.g., __user/__rcu tagging, allow/deny list,
some kernel function precondition, etc,
in the kernel.

In the rest of the series, Patches 1-2 had
kernel support. Patches 3-4 added
libbpf support. Patch 5 added bpftool
support. Patches 6-10 added various selftests.
Patch 11 added documentation for the new kind.

  [1] https://reviews.llvm.org/D106614
  [2] https://reviews.llvm.org/D106621
  [3] https://reviews.llvm.org/D106622
  [4] https://reviews.llvm.org/D109560

Changelog:
  v2 -> v3:
    - put NR_BTF_KINDS and BTF_KIND_MAX into enum as well
    - check component_idx earlier (check_meta stage) in kernel
    - add more tests
    - fix misc nits
  v1 -> v2:
    - BTF ELF format changed in llvm ([4] above),
      so cross-board change to use the new format.
    - Clarified in commit message that BTF_KIND_TAG
      is not emitted by bpftool btf dump format c.
    - Fix various comments from Andrii.
====================

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-09-14 18:48:37 -07:00
Yonghong Song
48f5a6c416 docs/bpf: Add documentation for BTF_KIND_TAG
Add BTF_KIND_TAG documentation in btf.rst.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223103.249100-1-yhs@fb.com
2021-09-14 18:45:53 -07:00
Yonghong Song
c240ba2878 selftests/bpf: Add a test with a bpf program with btf_tag attributes
Add a bpf program with btf_tag attributes. The program is
loaded successfully with the kernel. With the command
  bpftool btf dump file ./tag.o
the following dump shows that tags are properly encoded:
  [8] STRUCT 'key_t' size=12 vlen=3
          'a' type_id=2 bits_offset=0
          'b' type_id=2 bits_offset=32
          'c' type_id=2 bits_offset=64
  [9] TAG 'tag1' type_id=8 component_id=-1
  [10] TAG 'tag2' type_id=8 component_id=-1
  [11] TAG 'tag1' type_id=8 component_id=1
  [12] TAG 'tag2' type_id=8 component_id=1
  ...
  [21] FUNC_PROTO '(anon)' ret_type_id=2 vlen=1
          'x' type_id=2
  [22] FUNC 'foo' type_id=21 linkage=static
  [23] TAG 'tag1' type_id=22 component_id=0
  [24] TAG 'tag2' type_id=22 component_id=0
  [25] TAG 'tag1' type_id=22 component_id=-1
  [26] TAG 'tag2' type_id=22 component_id=-1
  ...
  [29] VAR 'total' type_id=27, linkage=global
  [30] TAG 'tag1' type_id=29 component_id=-1
  [31] TAG 'tag2' type_id=29 component_id=-1

If an old clang compiler, which does not support btf_tag attribute,
is used, these btf_tag attributes will be silently ignored.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223058.248949-1-yhs@fb.com
2021-09-14 18:45:53 -07:00
Yonghong Song
ad526474ae selftests/bpf: Test BTF_KIND_TAG for deduplication
Add unit tests for BTF_KIND_TAG deduplication for
  - struct and struct member
  - variable
  - func and func argument

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223052.248535-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song
35baba7a83 selftests/bpf: Add BTF_KIND_TAG unit tests
Test good and bad variants of BTF_KIND_TAG encoding.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223047.248223-1-yhs@fb.com
2021-09-14 18:45:52 -07:00