Commit graph

32 commits

Author SHA1 Message Date
Guanjun
0c154698a0 dmaengine: idxd: Fix incorrect descriptions for GRPCFG register
Fix incorrect descriptions for the GRPCFG register which has three
sub-registers (GRPWQCFG, GRPENGCFG and GRPFLGCFG).
No functional changes

Signed-off-by: Guanjun <guanjun@linux.alibaba.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Lijun Pan <lijun.pan@intel.com>
Link: https://lore.kernel.org/r/20231211053704.2725417-3-guanjun@linux.alibaba.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-12-11 11:52:16 +05:30
Dave Jiang
f2dc327131 dmaengine: idxd: add per wq PRS disable
Add sysfs knob for per wq Page Request Service disable. This knob
disables PRS support for the specific wq. When this bit is set,
it also overrides the wq's block on fault enabling.

Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-17-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-04-12 23:18:46 +05:30
Dave Jiang
2442b7473a dmaengine: idxd: process batch descriptor completion record faults
Add event log processing for faulting of user batch descriptor completion
record.

When encountering an event log entry for a page fault on a completion
record, the driver is expected to do the following:
1. If the "first error in batch" bit in event log entry error info is
set, discard any previously recorded errors associated with the
"batch identifier".
2. Fix the page fault according to the fault address in the event log. If
successful, write the completion record to the fault address in user space.
3. If an error is encountered while writing the completion record and it is
associated to a descriptor in the batch, the driver associates the error
with the batch identifier of the event log entry and tracks it until the
event log entry for the corresponding batch desc is encountered.

While processing an event log entry for a batch descriptor with error
indicating that one or more descs in the batch had event log entries,
the driver will do the following before writing the batch completion
record:
1. If the status field of the completion record is 0x1, the driver will
change it to error code 0x5 (one or more operations in batch completed
with status not successful) and changes the result field to 1.
2. If the status is error code 0x6 (page fault on batch descriptor list
address), change the result field to 1.
3. If status is any other value, the completion record is not changed.
4. Clear the recorded error in preparation for next batch with same batch
identifier.

The result field is for user software to determine whether to set the
"Batch Error" flag bit in the descriptor for continuation of partial
batch descriptor completion. See DSA spec 2.0 for additional information.

If no error has been recorded for the batch, the batch completion record is
written to user space as is.

Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-12-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-04-12 23:18:45 +05:30
Dave Jiang
2f431ba908 dmaengine: idxd: add interrupt handling for event log
An event log interrupt is raised in the misc interrupt INTCAUSE register
when an event is written by the hardware. Add basic event log processing
support to the interrupt handler. The event log is a ring where the
hardware owns the tail and the software owns the head. The hardware will
advance the tail index when an additional event has been pushed to memory.
The software will process the log entry and then advances the head. The
log is full when (tail + 1) % log_size = head. The hardware will stop
writing when the log is full. The user is expected to create a log size
large enough to handle all the expected events.

Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-5-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-04-12 23:18:45 +05:30
Dave Jiang
244da66cda dmaengine: idxd: setup event log configuration
Add setup of event log feature for supported device. Event log addresses
error reporting that was lacking in gen 1 DSA devices where a second error
event does not get reported when a first event is pending software
handling. The event log allows a circular buffer that the device can push
error events to. It is up to the user to create a large enough event log
ring in order to capture the expected events. The evl size can be set in
the device sysfs attribute. By default 64 entries are supported as minimal
when event log is enabled.

Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-4-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-04-12 23:18:45 +05:30
Dave Jiang
1649091f91 dmaengine: idxd: add event log size sysfs attribute
Add support for changing of the event log size. Event log is a
feature added to DSA 2.0 hardware to improve error reporting.
It supersedes the SWERROR register on DSA 1.0 hardware and hope
to prevent loss of reported errors.

The error log size determines how many error entries supported for
the device. It can be configured by the user via sysfs attribute.

Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230407203143.2189681-3-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-04-12 23:18:44 +05:30
Dave Jiang
9f0d99b327 dmaengine: idxd: expose IAA CAP register via sysfs knob
Add IAA (IAX) capability mask sysfs attribute to expose to applications.
The mask provides application knowledge of what capabilities this IAA
device supports. This mask is available for IAA 2.0 device or later.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20230303213732.3357494-3-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-03-31 17:26:53 +05:30
Dave Jiang
7ca68fa3c8 dmaengine: idxd: add configuration for concurrent batch descriptor processing
Add sysfs knob to allow control of the number of batch descriptors that can
be concurrently processed by an engine in the group as a fraction of the
Maximum Work Descriptors in Progress value specfied in ENGCAP register.
This control knob is part of toggle for QoS control.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20220917161222.2835172-6-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29 22:46:08 +05:30
Dave Jiang
1f2737521a dmaengine: idxd: add configuration for concurrent work descriptor processing
Add sysfs knob to allow control of the number of work descriptors that can
be concurrently processed by an engine in the group as a fraction of the
Maximum Work Descriptors in Progress value specified in ENGCAP register.
This control knob is part of toggle for QoS control.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20220917161222.2835172-5-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29 22:46:08 +05:30
Dave Jiang
b0325aefd3 dmaengine: idxd: add WQ operation cap restriction support
DSA 2.0 add the capability of configuring DMA ops on a per workqueue basis.
This means that certain ops can be disabled by the system administrator for
certain wq. By default, all ops are available. A bitmap is used to store
the ops due to total op size of 256 bits and it is more convenient to use a
range list to specify which bits are enabled.

One of the usage to support this is for VM migration between different
iteration of devices. The newer ops are disabled in order to allow guest to
migrate to a host that only support older ops. Another usage is to
restrict the WQ to certain operations for QoS of performance.

A sysfs of ops_config attribute is added per wq. It is only usable when the
ops_config bit is set under WQ_CAP register. This means that this attribute
will return -EOPNOTSUPP on DSA 1.x devices. The expected input is a range
list for the bits per operation the WQ supports.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20220917161222.2835172-4-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29 22:46:08 +05:30
Dave Jiang
a8563a33a5 dmanegine: idxd: reformat opcap output to match bitmap_parse() input
To make input and output consistent and prepping for the per WQ operation
configuration support, change the output of opcap display to match the
input that is expected by bitmap_parse() helper function. The output will
be a bitmap with field width as the number of bits using the %*pb format
specifier for printk() family.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20220917161222.2835172-3-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29 22:46:08 +05:30
Dave Jiang
3157dd0a36 dmaengine: idxd: don't load pasid config until needed
The driver currently programs the system pasid to the WQ preemptively when
system pasid is enabled. Given that a dwq will reprogram the pasid and
possibly a different pasid, the programming is not necessary. The pasid_en
bit can be set for swq as it does not need pasid programming but
needs the pasid_en bit. Remove system pasid programming on device config
write. Add pasid programming for kernel wq type on wq driver enable. The
char dev driver already reprograms the dwq on ->open() call so there's no
change.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/164935607115.1660372.6734518676950372366.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-04-11 17:28:58 +05:30
Dave Jiang
7ed6f1b85f dmaengine: idxd: change bandwidth token to read buffers
DSA spec v1.2 has changed the term of "bandwidth tokens" to "read buffers"
in order to make the concept clearer. Deprecate bandwidth token
naming in the driver and convert to read buffers in order to match with
the spec and reduce confusion when reading the spec.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163951338932.2988321.6162640806935567317.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-01-05 13:14:25 +05:30
Dave Jiang
56fc39f5a3 dmaengine: idxd: handle interrupt handle revoked event
"Interrupt handle revoked" is an event that happens when the driver is
running on a guest kernel and the VM is migrated to a new machine.
The device will trigger an interrupt that signals to the guest driver
that the interrupt handles need to be replaced.

The misc irq thread function calls a helper function to handle the
event. The function uses the WQ percpu_ref to quiesce the kernel
submissions. It then replaces the interrupt handles by requesting
interrupt handle command for each I/O MSIX vector. Once the handle is
updated, the driver will unblock the submission path to allow new
submissions.

The submitter will attempt to acquire a percpu_ref before submission. When
the request fails, it will wait on the wq_resurrect 'completion'.

The driver does anticipate the possibility of descriptors being submitted
before the WQ percpu_ref is killed. If a descriptor has already been
submitted, it will return with incorrect interrupt handle status. The
descriptor will be re-submitted with the new interrupt handle on the
completion path. For descriptors with incorrect interrupt handles,
completion interrupt won't be triggered.

At the completion of the interrupt handle refresh, the handling function
will call idxd_int_handle_refresh_drain() to issue drain descriptors to
each of the wq with associated interrupt handle. The drain descriptor will have
interrupt request set but without completion record. This will ensure all
descriptors with incorrect interrupt completion handle get drained and
a completion interrupt is triggered for the guest driver to process them.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Co-Developed-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163528420189.3925689.18212568593220415551.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-11-22 11:21:26 +05:30
Dave Jiang
88d97ea82c dmaengine: idxd: add halt interrupt support
Add halt interrupt support. Given that the misc interrupt handler already
check halt state, the driver just need to run the halt handling code when
receiving the halt interrupt.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163114224352.846654.14334468363464318828.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-10-25 12:09:15 +05:30
Dave Jiang
c5b64b6826 dmaengine: idxd: remove gen cap field per spec 1.2 update
Remove max_descs_per_engine field. The recently released DSA spec 1.2 [1]
has removed this field and made it reserved.

[1]: https://software.intel.com/content/dam/develop/external/us/en/documents-tps/341204-intel-data-streaming-accelerator-spec.pdf

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/163406167978.1303649.1798682437841822837.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-10-18 12:04:51 +05:30
Dave Jiang
ade8a86b51 dmaengine: idxd: Set defaults for GRPCFG traffic class
Set GRPCFG traffic class to value of 1 for best performance on current
generation of accelerators. Also add override option to allow experimentation.
Sysfs knobs are disabled for DSA/IAX gen1 devices.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162681373005.1968485.3761065664382799202.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-07-28 17:55:40 +05:30
Dave Jiang
e753a64bee dmaengine: idxd: Add wq occupancy information to sysfs attribute
Add occupancy information to wq sysfs attribute. Attribute will show
wq occupancy data if "WQ Occupancy Support" field in WQCAP is 1. It
displays the number of entries currently in this WQ. This is provided
as an estimate and should not be relied on to determine whether there
is space in the WQ. The data is to provide information to user apps
for flow control.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162275745546.1857062.8765615879420582018.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-07-14 12:20:56 +05:30
Tom Zanussi
81dd4d4d61 dmaengine: idxd: Add IDXD performance monitor support
Implement the IDXD performance monitor capability (named 'perfmon' in
the DSA (Data Streaming Accelerator) spec [1]), which supports the
collection of information about key events occurring during DSA and
IAX (Intel Analytics Accelerator) device execution, to assist in
performance tuning and debugging.

The idxd perfmon support is implemented as part of the IDXD driver and
interfaces with the Linux perf framework.  It has several features in
common with the existing uncore pmu support:

  - it does not support sampling
  - does not support per-thread counting

However it also has some unique features not present in the core and
uncore support:

  - all general-purpose counters are identical, thus no event constraints
  - operation is always system-wide

While the core perf subsystem assumes that all counters are by default
per-cpu, the uncore pmus are socket-scoped and use a cpu mask to
restrict counting to one cpu from each socket.  IDXD counters use a
similar strategy but expand the scope even further; since IDXD
counters are system-wide and can be read from any cpu, the IDXD perf
driver picks a single cpu to do the work (with cpu hotplug notifiers
to choose a different cpu if the chosen one is taken off-line).

More specifically, the perf userspace tool by default opens a counter
for each cpu for an event.  However, if it finds a cpumask file
associated with the pmu under sysfs, as is the case with the uncore
pmus, it will open counters only on the cpus specified by the cpumask.
Since perfmon only needs to open a single counter per event for a
given IDXD device, the perfmon driver will create a sysfs cpumask file
for the device and insert the first cpu of the system into it.  When a
user uses perf to open an event, perf will open a single counter on
the cpu specified by the cpu mask.  This amounts to the default
system-wide rather than per-cpu counting mentioned previously for
perfmon pmu events.  In order to keep the cpu mask up-to-date, the
driver implements cpu hotplug support for multiple devices, as IDXD
usually enumerates and registers more than one idxd device.

The perfmon driver implements basic perfmon hardware capability
discovery and configuration, and is initialized by the IDXD driver's
probe function.  During initialization, the driver retrieves the total
number of supported performance counters, the pmu ID, and the device
type from idxd device, and registers itself under the Linux perf
framework.

The perf userspace tool can be used to monitor single or multiple
events depending on the given configuration, as well as event groups,
which are also supported by the perfmon driver.  The user configures
events using the perf tool command-line interface by specifying the
event and corresponding event category, along with an optional set of
filters that can be used to restrict counting to specific work queues,
traffic classes, page and transfer sizes, and engines (See [1] for
specifics).

With the configuration specified by the user, the perf tool issues a
system call passing that information to the kernel, which uses it to
initialize the specified event(s).  The event(s) are opened and
started, and following termination of the perf command, they're
stopped.  At that point, the perfmon driver will read the latest count
for the event(s), calculate the difference between the latest counter
values and previously tracked counter values, and display the final
incremental count as the event count for the cycle.  An overflow
handler registered on the IDXD irq path is used to account for counter
overflows, which are signaled by an overflow interrupt.

Below are a couple of examples of perf usage for monitoring DSA events.

The following monitors all events in the 'engine' category.  Becuuse
no filters are specified, this captures all engine events for the
workload, which in this case is 19 iterations of the work generated by
the kernel dmatest module.

Details describing the events can be found in Appendix D of [1],
Performance Monitoring Events, but briefly they are:

  event 0x1:  total input data processed, in 32-byte units
  event 0x2:  total data written, in 32-byte units
  event 0x4:  number of work descriptors that read the source
  event 0x8:  number of work descriptors that write the destination
  event 0x10: number of work descriptors dispatched from batch descriptors
  event 0x20: number of work descriptors dispatched from work queues

 # perf stat -e dsa0/event=0x1,event_category=0x1/,
                dsa0/event=0x2,event_category=0x1/,
		dsa0/event=0x4,event_category=0x1/,
		dsa0/event=0x8,event_category=0x1/,
		dsa0/event=0x10,event_category=0x1/,
		dsa0/event=0x20,event_category=0x1/
		  modprobe dmatest channel=dma0chan0 timeout=2000
		  iterations=19 run=1 wait=1

     Performance counter stats for 'system wide':

                 5,332      dsa0/event=0x1,event_category=0x1/
                 5,327      dsa0/event=0x2,event_category=0x1/
                    19      dsa0/event=0x4,event_category=0x1/
                    19      dsa0/event=0x8,event_category=0x1/
                     0      dsa0/event=0x10,event_category=0x1/
                    19      dsa0/event=0x20,event_category=0x1/

          21.977436186 seconds time elapsed

The command below illustrates filter usage with a simple example.  It
specifies that MEM_MOVE operations should be counted for the DSA
device dsa0 (event 0x8 corresponds to the EV_MEM_MOVE event - Number
of Memory Move Descriptors, which is part of event category 0x3 -
Operations. The detailed category and event IDs are available in
Appendix D, Performance Monitoring Events, of [1]).  In addition to
the event and event category, a number of filters are also specified
(the detailed filter values are available in Chapter 6.4 (Filter
Support) of [1]), which will restrict counting to only those events
that meet all of the filter criteria.  In this case, the filters
specify that only MEM_MOVE operations that are serviced by work queue
wq0 and specifically engine number engine0 and traffic class tc0
having sizes between 0 and 4k and page size of between 0 and 1G result
in a counter hit; anything else will be filtered out and not appear in
the final count.  Note that filters are optional - any filter not
specified is assumed to be all ones and will pass anything.

 # perf stat -e dsa0/filter_wq=0x1,filter_tc=0x1,filter_sz=0x7,
                filter_eng=0x1,event=0x8,event_category=0x3/
		  modprobe dmatest channel=dma0chan0 timeout=2000
		  iterations=19 run=1 wait=1

     Performance counter stats for 'system wide':

       19      dsa0/filter_wq=0x1,filter_tc=0x1,filter_sz=0x7,
               filter_eng=0x1,event=0x8,event_category=0x3/

          21.865914091 seconds time elapsed

The output above reflects that the unspecified workload resulted in
the counting of 19 MEM_MOVE operation events that met the filter
criteria.

[1]: https://software.intel.com/content/www/us/en/develop/download/intel-data-streaming-accelerator-preliminary-architecture-specification.html

[ Based on work originally by Jing Lin. ]

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Link: https://lore.kernel.org/r/0c5080a7d541904c4ad42b848c76a1ce056ddac7.1619276133.git.zanussi@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-04-25 21:46:12 +05:30
Dave Jiang
5b0c68c473 dmaengine: idxd: support reporting of halt interrupt
Unmask the halt error interrupt so it gets reported to the interrupt
handler. When halt state interrupt is received, quiesce the kernel
WQs and unmap the portals to stop submission.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161894441167.3202472.9485946398140619501.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-04-23 23:08:45 +05:30
Dave Jiang
eb15e7154f dmaengine: idxd: add interrupt handle request and release support
DSA spec states that when Request Interrupt Handle and Release Interrupt
Handle command bits are set in the CMDCAP register, these device commands
must be supported by the driver.

The interrupt handle is programmed in a descriptor. When Request Interrupt
Handle is not supported, the interrupt handle is the index of the desired
entry in the MSI-X table. When the command is supported, driver must use
the command to obtain a handle to be programmed in the submitted
descriptor.

A requested handle may be revoked. After the handle is revoked, any use of
the handle will result in Invalid Interrupt Handle error.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161894439422.3202472.17579543737810265471.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-04-23 23:08:45 +05:30
Linus Torvalds
6daa90439e dmaengine updates for v5.11-rc1
New drivers/devices
   - Qualcomm ADM driver
   - Qualcomm GPI driver
   - Allwinner A100 DMA support
   - Microchip Sama7g5 support
   - Mediatek MT8516 apdma
 
 - Updates:
   - more updates to idxd driver and support for IAX config
   - runtime PM support for dw driver
 
 - TI keystone drivers for 5.11 included here due to dependency for TI
   drivers
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAl/bkxEACgkQfBQHDyUj
 g0dxNA//WzpVy9QJnj6OgXIjM+9sBjqls0iPVfy1JeeMmVW8cgCwLyBNZcssKRye
 mJ9+VnTx4JQBj4KD2cFBdpr46GvBFbSbcWNSCdm179NtHI4G6tjtynOcWaI9Clu2
 0KHoa/EHIj8/jD3Hsbm+WZ1zCoY4VKBXsEq6x1Sj2tpp0/ocDhH4XLAsWTHE9OAD
 sc+0OtHr1wU4EdV6TKNTT0jXsdtzxOPPsvRsoaKncnR+Mkrgv0FvMBfBLhOb3a+m
 wUHEkwrEP1pT4Xcew6ZkYs4RYwJI3pllu2gUTs5qtidc723ol/C4kJ27q55N4azb
 j5buA8AhEwqIDH8qNuV07qaIu2VTdTdbYid3xgAeFygwM1npecZvOf8k6rTjxR/6
 XN8jaDuhc2uISY7Gt5c6tOe8rG3ffNhYrmEuGD5HI0hcglpALiE4NcgalaaQS9J1
 suQ6AUtCslReD+6M/lfarn9Zd3UAKGbxU8vCNPq0EcSAGUz9u9VK2VwKiGnAZ8bb
 ED3QDUzZYjTDWpiVodsuJlONgaMLsRCQecZWMDRNpzmf1rCXnkY0eDGiSMz+IiXZ
 87IdD66u3d/Mkm6jVdwp6+tKZ/ohj+dtIWKhMd8cKXv5zTzS+4IokxpkxdjBsHPx
 z+G73IMHjQo8xl/P0IhhwZw+7cBrkntLq8lRSbYxjSTP09QerNE=
 =uNBn
 -----END PGP SIGNATURE-----

Merge tag 'dmaengine-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine

Pull dmaengine updates from Vinod Koul:
 "The last dmaengine updates for this year :)

  This contains couple of new drivers, new device support and updates to
  bunch of drivers.

  New drivers/devices:
   - Qualcomm ADM driver
   - Qualcomm GPI driver
   - Allwinner A100 DMA support
   - Microchip Sama7g5 support
   - Mediatek MT8516 apdma

  Updates:
   - more updates to idxd driver and support for IAX config
   - runtime PM support for dw driver
   - TI drivers"

* tag 'dmaengine-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (75 commits)
  soc: ti: k3-ringacc: Use correct error casting in k3_ringacc_dmarings_init
  dmaengine: ti: k3-udma-glue: Add support for K3 PKTDMA
  dmaengine: ti: k3-udma: Initial support for K3 PKTDMA
  dmaengine: ti: k3-udma: Add support for BCDMA channel TPL handling
  dmaengine: ti: k3-udma: Initial support for K3 BCDMA
  soc: ti: k3-ringacc: add AM64 DMA rings support.
  dmaengine: ti: Add support for k3 event routers
  dmaengine: ti: k3-psil: Add initial map for AM64
  dmaengine: ti: k3-psil: Extend psil_endpoint_config for K3 PKTDMA
  dt-bindings: dma: ti: Add document for K3 PKTDMA
  dt-bindings: dma: ti: Add document for K3 BCDMA
  dmaengine: dmatest: Use dmaengine_get_dma_device
  dmaengine: doc: client: Update for dmaengine_get_dma_device() usage
  dmaengine: Add support for per channel coherency handling
  dmaengine: of-dma: Add support for optional router configuration callback
  dmaengine: ti: k3-udma-glue: Configure the dma_dev for rings
  dmaengine: ti: k3-udma-glue: Get the ringacc from udma_dev
  dmaengine: ti: k3-udma-glue: Add function to get device pointer for DMA API
  dmaengine: ti: k3-udma: Add support for second resource range from sysfw
  dmaengine: ti: k3-udma: Wait for peer teardown completion if supported
  ...
2020-12-17 12:52:23 -08:00
Dave Jiang
f25b463883 dmaengine: idxd: add IAX configuration support in the IDXD driver
Add support to allow configuration of Intel Analytics Accelerator (IAX) in
addition to the Intel Data Streaming Accelerator (DSA). The IAX hardware
has the same configuration interface as DSA. The main difference
is the type of operations it performs. We can support the DSA and
IAX devices on the same driver with some tweaks.

IAX has a 64B completion record that needs to be 64B aligned, as opposed to
a 32B completion record that is 32B aligned for DSA. IAX also does not
support token management.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/160564555488.1834439.4261958859935360473.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-12-11 19:45:53 +05:30
Dave Jiang
92de5fa2dc dmaengine: idxd: add ATS disable knob for work queues
With the DSA spec 1.1 update, a knob to disable ATS for individually is
introduced. Add enabling code to allow a system admin to make the
configuration through sysfs.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/160530810593.1288392.2561048329116529566.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-12-10 12:56:44 +05:30
Dave Jiang
8326be9f1c dmaengine: idxd: fix mapping of portal size
Portal size is 4k. Current code is mapping all 4 portals in a single chunk.
Restrict the mapped portal size to a single portal to ensure that submission
only goes to the intended portal address.

Fixes: c52ca47823 ("dmaengine: idxd: add configuration component of driver")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/160513342642.510187.16450549281618747065.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-11-16 22:38:20 +05:30
Dave Jiang
2f8417a967 dmaengine: idxd: define table offset multiplier
Convert table offset multiplier magic number to a define.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/160407311690.839435.6941865731867828234.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-11-09 17:17:46 +05:30
Dave Jiang
5a71270197 dmaengine: idxd: Update calculation of group offset to be more readable
Create helper macros to make group offset calculation more readable.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/160407294683.839093.10740868559754142070.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-11-09 17:17:46 +05:30
Dave Jiang
8e50d39265 dmaengine: idxd: Add shared workqueue support
Add shared workqueue support that includes the support of Shared Virtual
memory (SVM) or in similar terms On Demand Paging (ODP). The shared
workqueue uses the enqcmds command in kernel and will respond with retry if
the workqueue is full. Shared workqueue only works when there is PASID
support from the IOMMU.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/160382007499.3911367.26043087963708134.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-10-30 14:10:36 +05:30
Dave Jiang
d98793b5d4 dmaengine: idxd: fix wq config registers offset programming
DSA spec v1.1 [1] updated to include a stride size register for WQ
configuration that will specify how much space is reserved for the WQ
configuration register set. This change is expected to be in the final
gen1 DSA hardware. Fix the driver to use WQCFG_OFFSET() for all WQ
offset calculation and fixup WQCFG_OFFSET() to use the new calculated
wq size.

[1]: https://software.intel.com/content/www/us/en/develop/download/intel-data-streaming-accelerator-preliminary-architecture-specification.html

Fixes: bfe1d56091 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/160383444959.48058.14249265538404901781.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-10-30 14:10:27 +05:30
Dave Jiang
484f910e93 dmaengine: idxd: fix wq config registers offset programming
DSA spec v1.1 [1] updated to include a stride size register for WQ
configuration that will specify how much space is reserved for the WQ
configuration register set. This change is expected to be in the final
gen1 DSA hardware. Fix the driver to use WQCFG_OFFSET() for all WQ
offset calculation and fixup WQCFG_OFFSET() to use the new calculated
wq size.

[1]: https://software.intel.com/content/www/us/en/develop/download/intel-data-streaming-accelerator-preliminary-architecture-specification.html

Fixes: bfe1d56091 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/160383444959.48058.14249265538404901781.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-10-28 11:02:41 +05:30
Dave Jiang
c52ca47823 dmaengine: idxd: add configuration component of driver
The device is left unconfigured when the driver is loaded. Various
components are configured via the driver sysfs attributes. Once
configuration is done, the device can be enabled by writing the device name
to the bind attribute of the device driver sysfs. Disabling can be done
similarly. Also the individual work queues can also be enabled and disabled
through the bind/unbind attributes. A constructed hierarchy is created
through the struct device framework in order to provide appropriate
configuration points and device state and status. This hierarchy is
presented off the virtual DSA bus.

i.e. /sys/bus/dsa/...

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965024585.73301.6431413676230150589.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-24 11:18:45 +05:30
Dave Jiang
bfe1d56091 dmaengine: idxd: Init and probe for Intel data accelerators
The idxd driver introduces the Intel Data Stream Accelerator [1] that will
be available on future Intel Xeon CPUs. One of the kernel access
point for the driver is through the dmaengine subsystem. It will initially
provide the DMA copy service to the kernel.

Some of the main functionality introduced with this accelerator
are: shared virtual memory (SVM) support, and descriptor submission using
Intel CPU instructions movdir64b and enqcmds. There will be additional
accelerator devices that share the same driver with variations to
capabilities.

This commit introduces the probe and initialization component of the
driver.

[1]: https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specification

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965023991.73301.6186843973135311580.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-24 11:18:45 +05:30