Merge tag 'drm-intel-gt-next-2022-07-13' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

Driver uAPI changes:
- All related to the Small BAR support: (and all by Matt Auld)
 * add probed_cpu_visible_size
 * expose the available memory region tracking
 * apply ALLOC_GPU_ONLY by default
 * add NEEDS_CPU_ACCESS hint
 * tweak error capture on recoverable contexts

Driver highlights:
- Add Small BAR support (Matt)
- Add MeteorLake support (RK)
- Add support for LMEM PCIe resizable BAR (Akeem)

Driver important fixes:
- ttm related fixes (Matt Auld)
- Fix a performance regression related to waitboost (Chris)
- Fix GT resets (Chris)

Driver others:
- Add GuC SLPC selftest (Vinay)
- Fix ADL-N GuC load (Daniele)
- Add platform workaround (Gustavo, Matt Roper)
- DG2 and ATS-M device ID updates (Matt Roper)
- Add VM_BIND doc rfc with uAPI documentation (Niranjana)
- Fix use-after-free in vma destruction (Thomas)
- Async flush of GuC log regions (Alan)
- Fixes in selftests (Chris, Dan, Andrzej)
- Convert to drm_dbg (Umesh)
- Disable OA sseu config param for newer hardware (Umesh)
- Multi-cast register steering changes (Matt Roper)
- Add lmem_bar_size modparam (Priyanka)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Ys85pcMYLkqF/HtB@intel.com
@ -0,0 +1,189 @@
/**
* struct __drm_i915_memory_region_info - Describes one region as known to the
* driver.
*
* Note this is using both struct drm_i915_query_item and struct drm_i915_query.
* For this new query we are adding the new query id DRM_I915_QUERY_MEMORY_REGIONS
* at &drm_i915_query_item.query_id.
*/
struct __drm_i915_memory_region_info {
/** @region: The class:instance pair encoding */
struct drm_i915_gem_memory_class_instance region;
/** @rsvd0: MBZ */
__u32 rsvd0;
/**
* @probed_size: Memory probed by the driver
*
* Note that it should not be possible to ever encounter a zero value
* here. Also note that no current region type will ever return -1 here,
* although for future region types this might be a possibility. The
* same applies to the other size fields.
*/
__u64 probed_size;
/**
* @unallocated_size: Estimate of memory remaining
*
* Requires CAP_PERFMON or CAP_SYS_ADMIN to get reliable accounting.
* Without this (or if this is an older kernel) the value here will
* always equal the @probed_size. Note this is only currently tracked
* for I915_MEMORY_CLASS_DEVICE regions (for other types the value here
* will always equal the @probed_size).
*/
__u64 unallocated_size;
union {
/** @rsvd1: MBZ */
__u64 rsvd1[8];
struct {
/**
* @probed_cpu_visible_size: Memory probed by the driver
* that is CPU accessible.
*
* This will always be <= @probed_size, and the
* remainder (if there is any) will not be CPU
* accessible.
*
* On systems without small BAR, the @probed_size will
* always equal the @probed_cpu_visible_size, since all
* of it will be CPU accessible.
*
* Note this is only tracked for
* I915_MEMORY_CLASS_DEVICE regions (for other types the
* value here will always equal the @probed_size).
*
* Note that if the value returned here is zero, then
* this must be an old kernel which lacks the relevant
* small-bar uAPI support (including
* I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS), but on
* such systems we should never actually end up with a
* small BAR configuration, assuming we are able to load
* the kernel module. Hence it should be safe to treat
* this the same as when @probed_cpu_visible_size ==
* @probed_size.
*/
__u64 probed_cpu_visible_size;
/**
* @unallocated_cpu_visible_size: Estimate of CPU
* visible memory remaining
*
* Note this is only tracked for
* I915_MEMORY_CLASS_DEVICE regions (for other types the
* value here will always equal the
* @probed_cpu_visible_size).
*
* Requires CAP_PERFMON or CAP_SYS_ADMIN to get reliable
* accounting. Without this the value here will always
* equal the @probed_cpu_visible_size. Note this is only
* currently tracked for I915_MEMORY_CLASS_DEVICE
* regions (for other types the value here will also
* always equal the @probed_cpu_visible_size).
*
* If this is an older kernel the value here will be
* zero, see also @probed_cpu_visible_size.
*/
__u64 unallocated_cpu_visible_size;
};
};
};
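/*
 * Illustrative userspace sketch, not part of this header: the usual
 * two-call pattern for DRM_I915_QUERY_MEMORY_REGIONS, first sizing the
 * blob, then reading it back. Assumes <stdint.h>, <stdlib.h>,
 * <sys/ioctl.h> and the drm/i915 uAPI headers; the RFC struct above
 * corresponds to drm_i915_memory_region_info in the final uAPI.
 */
static struct drm_i915_query_memory_regions *query_regions(int fd)
{
	struct drm_i915_query_item item = {
		.query_id = DRM_I915_QUERY_MEMORY_REGIONS,
	};
	struct drm_i915_query query = {
		.num_items = 1,
		.items_ptr = (uintptr_t)&item,
	};
	struct drm_i915_query_memory_regions *info;

	if (ioctl(fd, DRM_IOCTL_I915_QUERY, &query) || item.length <= 0)
		return NULL;

	info = calloc(1, item.length);
	item.data_ptr = (uintptr_t)info;
	if (ioctl(fd, DRM_IOCTL_I915_QUERY, &query)) {
		free(info);
		return NULL;
	}
	/* regions[i].probed_cpu_visible_size is now valid (zero on old kernels) */
	return info;
}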
/**
* struct __drm_i915_gem_create_ext - Existing gem_create behaviour, with added
* extension support using struct i915_user_extension.
*
* Note that new buffer flags should be added here, at least for the stuff that
* is immutable. Previously we would have two ioctls, one to create the object
* with gem_create, and another to apply various parameters, however this
* creates some ambiguity for the params which are considered immutable. Also in
* general we're phasing out the various SET/GET ioctls.
*/
struct __drm_i915_gem_create_ext {
/**
* @size: Requested size for the object.
*
* The (page-aligned) allocated size for the object will be returned.
*
* Note that for some devices we might have further minimum
* page-size restrictions (larger than 4K), like for device local-memory.
* However in general the final size here should always reflect any
* rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS
* extension to place the object in device local-memory. The kernel will
* always select the largest minimum page-size for the set of possible
* placements as the value to use when rounding up the @size.
*/
__u64 size;
/**
* @handle: Returned handle for the object.
*
* Object handles are nonzero.
*/
__u32 handle;
/**
* @flags: Optional flags.
*
* Supported values:
*
* I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the kernel that
* the object will need to be accessed via the CPU.
*
* Only valid when placing objects in I915_MEMORY_CLASS_DEVICE, and only
* strictly required on configurations where some subset of the device
* memory is directly visible/mappable through the CPU (which we also
* call small BAR), like on some DG2+ systems. Note that this is quite
* undesirable, but due to various factors like the client CPU, BIOS etc
* it's something we can expect to see in the wild. See
* &__drm_i915_memory_region_info.probed_cpu_visible_size for how to
* determine if this system applies.
*
* Note that one of the placements MUST be I915_MEMORY_CLASS_SYSTEM, to
* ensure the kernel can always spill the allocation to system memory,
* if the object can't be allocated in the mappable part of
* I915_MEMORY_CLASS_DEVICE.
*
* Also note that since the kernel only supports flat-CCS on objects
* that can *only* be placed in I915_MEMORY_CLASS_DEVICE, we therefore
* don't support I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS together with
* flat-CCS.
*
* Without this hint, the kernel will assume that non-mappable
* I915_MEMORY_CLASS_DEVICE is preferred for this object. Note that the
* kernel can still migrate the object to the mappable part, as a last
* resort, if userspace ever CPU faults this object, but this might be
* expensive, and so ideally should be avoided.
*
* On older kernels which lack the relevant small-bar uAPI support (see
* also &__drm_i915_memory_region_info.probed_cpu_visible_size),
* usage of the flag will result in an error, but it should NEVER be
* possible to end up with a small BAR configuration, assuming we can
* also successfully load the i915 kernel module. In such cases the
* entire I915_MEMORY_CLASS_DEVICE region will be CPU accessible, and as
* such there are zero restrictions on where the object can be placed.
*/
#define I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS (1 << 0)
__u32 flags;
/**
* @extensions: The chain of extensions to apply to this object.
*
* This will be useful in the future when we need to support several
* different extensions, and we need to apply more than one when
* creating the object. See struct i915_user_extension.
*
* If we don't supply any extensions then we get the same old gem_create
* behaviour.
*
* For I915_GEM_CREATE_EXT_MEMORY_REGIONS usage see
* struct drm_i915_gem_create_ext_memory_regions.
*
* For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see
* struct drm_i915_gem_create_ext_protected_content.
*/
#define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
#define I915_GEM_CREATE_EXT_PROTECTED_CONTENT 1
__u64 extensions;
};
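/*
 * Illustrative sketch, not part of this header: creating an object that
 * needs CPU access, placed in device local-memory with the mandatory
 * system-memory spill placement. Assumes the final uAPI mirrors this
 * RFC struct (the flag name and layout above).
 */
static int create_cpu_visible_bo(int fd, __u64 size, __u32 *handle)
{
	struct drm_i915_gem_memory_class_instance regions[] = {
		{ I915_MEMORY_CLASS_DEVICE, 0 },
		{ I915_MEMORY_CLASS_SYSTEM, 0 }, /* required spill placement */
	};
	struct drm_i915_gem_create_ext_memory_regions ext = {
		.base = { .name = I915_GEM_CREATE_EXT_MEMORY_REGIONS },
		.num_regions = 2,
		.regions = (uintptr_t)regions,
	};
	struct drm_i915_gem_create_ext create = {
		.size = size,
		.flags = I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS,
		.extensions = (uintptr_t)&ext,
	};

	if (ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create))
		return -1;
	*handle = create.handle; /* handles are nonzero */
	return 0;
}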


@ -0,0 +1,47 @@
==========================
I915 Small BAR RFC Section
==========================
Starting from DG2 we will have resizable BAR support for device local-memory (i.e.
I915_MEMORY_CLASS_DEVICE), but in some cases the final BAR size might still be
smaller than the total probed_size. In such cases, only some subset of
I915_MEMORY_CLASS_DEVICE will be CPU accessible (for example the first 256M),
while the remainder is only accessible via the GPU.
I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS flag
----------------------------------------------
New gem_create_ext flag to tell the kernel that a BO will require CPU access.
This becomes important when placing an object in I915_MEMORY_CLASS_DEVICE, where
underneath the device has a small BAR, meaning only some portion of it is CPU
accessible. Without this flag the kernel will assume that CPU access is not
required, and prioritize using the non-CPU visible portion of
I915_MEMORY_CLASS_DEVICE.
.. kernel-doc:: Documentation/gpu/rfc/i915_small_bar.h
:functions: __drm_i915_gem_create_ext
probed_cpu_visible_size attribute
---------------------------------
New struct __drm_i915_memory_region_info attribute which returns the total size of the
CPU accessible portion, for the particular region. This should only be
applicable for I915_MEMORY_CLASS_DEVICE. We also report the
unallocated_cpu_visible_size, alongside the unallocated_size.
Vulkan will need this as part of creating a separate VkMemoryHeap with the
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT set, to represent the CPU visible portion,
where the total size of the heap needs to be known. It also wants to be able to
give a rough estimate of how much memory can potentially be allocated.
.. kernel-doc:: Documentation/gpu/rfc/i915_small_bar.h
:functions: __drm_i915_memory_region_info
Error Capture restrictions
--------------------------
With error capture we have two new restrictions:
1) Error capture is best effort on small BAR systems; if the pages are not
CPU accessible at the time of capture, then the kernel is free to skip
trying to capture them.
2) On discrete and newer integrated platforms we now reject error capture
on recoverable contexts. In the future the kernel may want to blit during
error capture, when for example something is not currently CPU accessible.


@ -0,0 +1,291 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
/**
* DOC: I915_PARAM_VM_BIND_VERSION
*
* VM_BIND feature version supported.
* See typedef drm_i915_getparam_t param.
*
* Specifies the VM_BIND feature version supported.
* The following versions of VM_BIND have been defined:
*
* 0: No VM_BIND support.
*
* 1: In VM_UNBIND calls, the UMD must specify the exact mappings created
* previously with VM_BIND; the ioctl will not support unbinding multiple
* mappings or splitting them. Similarly, VM_BIND calls will not replace
* any existing mappings.
*
* 2: The restrictions on unbinding partial or multiple mappings are
* lifted. Similarly, binding will replace any mappings in the given range.
*
* See struct drm_i915_gem_vm_bind and struct drm_i915_gem_vm_unbind.
*/
#define I915_PARAM_VM_BIND_VERSION 57
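/*
 * Illustrative userspace sketch, not part of this header: probing the
 * supported VM_BIND version with the existing GETPARAM ioctl. On kernels
 * without VM_BIND the ioctl fails, which this sketch maps to version 0.
 */
static int vm_bind_version(int fd)
{
	int value = 0;
	struct drm_i915_getparam gp = {
		.param = I915_PARAM_VM_BIND_VERSION,
		.value = &value,
	};

	if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
		return 0; /* no VM_BIND support */
	return value;
}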
/**
* DOC: I915_VM_CREATE_FLAGS_USE_VM_BIND
*
* Flag to opt-in for VM_BIND mode of binding during VM creation.
* See struct drm_i915_gem_vm_control flags.
*
* The older execbuf2 ioctl will not support VM_BIND mode of operation.
* For VM_BIND mode, we have a new execbuf3 ioctl which will not accept any
* execlist (See struct drm_i915_gem_execbuffer3 for more details).
*/
#define I915_VM_CREATE_FLAGS_USE_VM_BIND (1 << 0)
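/*
 * Illustrative sketch, not part of this header: opting a newly created
 * VM into VM_BIND mode via the existing VM_CREATE ioctl. This sketch
 * treats a zero vm_id as failure.
 */
static __u32 create_vm_bind_vm(int fd)
{
	struct drm_i915_gem_vm_control ctl = {
		.flags = I915_VM_CREATE_FLAGS_USE_VM_BIND,
	};

	if (ioctl(fd, DRM_IOCTL_I915_GEM_VM_CREATE, &ctl))
		return 0;
	return ctl.vm_id;
}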
/* VM_BIND related ioctls */
#define DRM_I915_GEM_VM_BIND 0x3d
#define DRM_I915_GEM_VM_UNBIND 0x3e
#define DRM_I915_GEM_EXECBUFFER3 0x3f
#define DRM_IOCTL_I915_GEM_VM_BIND DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
#define DRM_IOCTL_I915_GEM_VM_UNBIND DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)
#define DRM_IOCTL_I915_GEM_EXECBUFFER3 DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER3, struct drm_i915_gem_execbuffer3)
/**
* struct drm_i915_gem_timeline_fence - An input or output timeline fence.
*
* The operation will wait for the input fence to signal.
*
* The returned output fence will be signaled after the completion of the
* operation.
*/
struct drm_i915_gem_timeline_fence {
/** @handle: User's handle for a drm_syncobj to wait on or signal. */
__u32 handle;
/**
* @flags: Supported flags are:
*
* I915_TIMELINE_FENCE_WAIT:
* Wait for the input fence before the operation.
*
* I915_TIMELINE_FENCE_SIGNAL:
* Return operation completion fence as output.
*/
__u32 flags;
#define I915_TIMELINE_FENCE_WAIT (1 << 0)
#define I915_TIMELINE_FENCE_SIGNAL (1 << 1)
#define __I915_TIMELINE_FENCE_UNKNOWN_FLAGS (-(I915_TIMELINE_FENCE_SIGNAL << 1))
/**
* @value: A point in the timeline.
* Value must be 0 for a binary drm_syncobj. A value of 0 for a
* timeline drm_syncobj is invalid as it turns a drm_syncobj into a
* binary one.
*/
__u64 value;
};
/**
* struct drm_i915_gem_vm_bind - VA to object mapping to bind.
*
* This structure is passed to VM_BIND ioctl and specifies the mapping of GPU
* virtual address (VA) range to the section of an object that should be bound
* in the device page table of the specified address space (VM).
* The VA range specified must be unique (i.e., not currently bound) and can
* be mapped to the whole object or a section of the object (partial binding).
* Multiple VA mappings can be created to the same section of the object
* (aliasing).
*
* The @start, @offset and @length must be 4K page aligned. However, DG2
* and XEHPSDV have a 64K page size for device local memory and a compact
* page table. On those platforms, for binding device local-memory objects, the
* @start, @offset and @length must be 64K aligned. Also, UMDs should not mix
* the local memory 64K page and the system memory 4K page bindings in the same
* 2M range.
*
* Error code -EINVAL will be returned if @start, @offset and @length are not
* properly aligned. In version 1 (See I915_PARAM_VM_BIND_VERSION), error code
* -ENOSPC will be returned if the VA range specified can't be reserved.
*
* VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
* are not ordered. Furthermore, parts of the VM_BIND operation can be done
* asynchronously, if a valid @fence is specified.
*/
struct drm_i915_gem_vm_bind {
/** @vm_id: VM (address space) id to bind */
__u32 vm_id;
/** @handle: Object handle */
__u32 handle;
/** @start: Virtual Address start to bind */
__u64 start;
/** @offset: Offset in object to bind */
__u64 offset;
/** @length: Length of mapping to bind */
__u64 length;
/**
* @flags: Supported flags are:
*
* I915_GEM_VM_BIND_CAPTURE:
* Capture this mapping in the dump upon GPU error.
*
* Note that @fence carries its own flags.
*/
__u64 flags;
#define I915_GEM_VM_BIND_CAPTURE (1 << 0)
/**
* @fence: Timeline fence for bind completion signaling.
*
* Timeline fence is of format struct drm_i915_gem_timeline_fence.
*
* It is an out fence, hence using I915_TIMELINE_FENCE_WAIT flag
* is invalid, and an error will be returned.
*
* If I915_TIMELINE_FENCE_SIGNAL flag is not set, then out fence
* is not requested and binding is completed synchronously.
*/
struct drm_i915_gem_timeline_fence fence;
/**
* @extensions: Zero-terminated chain of extensions.
*
* For future extensions. See struct i915_user_extension.
*/
__u64 extensions;
};
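/*
 * Illustrative sketch, not part of this header: binding a whole BO at a
 * chosen VA, with a binary syncobj signaled once the bind completes.
 * The alignment rules from the comment above apply to @va and @size.
 */
static int vm_bind_bo(int fd, __u32 vm_id, __u32 handle, __u64 va,
		      __u64 size, __u32 syncobj)
{
	struct drm_i915_gem_vm_bind bind = {
		.vm_id = vm_id,
		.handle = handle,
		.start = va,
		.offset = 0,
		.length = size,
		.fence = {
			.handle = syncobj,
			.flags = I915_TIMELINE_FENCE_SIGNAL,
			.value = 0, /* binary syncobj */
		},
	};

	return ioctl(fd, DRM_IOCTL_I915_GEM_VM_BIND, &bind);
}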
/**
* struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
*
* This structure is passed to VM_UNBIND ioctl and specifies the GPU virtual
* address (VA) range that should be unbound from the device page table of the
* specified address space (VM). VM_UNBIND will force unbind the specified
* range from device page table without waiting for any GPU job to complete.
* It is the UMD's responsibility to ensure the mapping is no longer in use before
* calling VM_UNBIND.
*
* If the specified mapping is not found, the ioctl will simply return without
* any error.
*
* VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
* are not ordered. Furthermore, parts of the VM_UNBIND operation can be done
* asynchronously, if a valid @fence is specified.
*/
struct drm_i915_gem_vm_unbind {
/** @vm_id: VM (address space) id to unbind from */
__u32 vm_id;
/** @rsvd: Reserved, MBZ */
__u32 rsvd;
/** @start: Virtual Address start to unbind */
__u64 start;
/** @length: Length of mapping to unbind */
__u64 length;
/**
* @flags: Currently reserved, MBZ.
*
* Note that @fence carries its own flags.
*/
__u64 flags;
/**
* @fence: Timeline fence for unbind completion signaling.
*
* Timeline fence is of format struct drm_i915_gem_timeline_fence.
*
* It is an out fence, hence using I915_TIMELINE_FENCE_WAIT flag
* is invalid, and an error will be returned.
*
* If I915_TIMELINE_FENCE_SIGNAL flag is not set, then out fence
* is not requested and unbinding is completed synchronously.
*/
struct drm_i915_gem_timeline_fence fence;
/**
* @extensions: Zero-terminated chain of extensions.
*
* For future extensions. See struct i915_user_extension.
*/
__u64 extensions;
};
/**
* struct drm_i915_gem_execbuffer3 - Structure for DRM_I915_GEM_EXECBUFFER3
* ioctl.
*
* DRM_I915_GEM_EXECBUFFER3 ioctl only works in VM_BIND mode and VM_BIND mode
* only works with this ioctl for submission.
* See I915_VM_CREATE_FLAGS_USE_VM_BIND.
*/
struct drm_i915_gem_execbuffer3 {
/**
* @ctx_id: Context id
*
* Only contexts with a user engine map are allowed.
*/
__u32 ctx_id;
/**
* @engine_idx: Engine index
*
* An index in the user engine map of the context specified by @ctx_id.
*/
__u32 engine_idx;
/**
* @batch_address: Batch gpu virtual address/es.
*
* For normal submission, it is the gpu virtual address of the batch
* buffer. For parallel submission, it is a pointer to an array of
* batch buffer gpu virtual addresses with array size equal to the
* number of (parallel) engines involved in that submission (See
* struct i915_context_engines_parallel_submit).
*/
__u64 batch_address;
/** @flags: Currently reserved, MBZ */
__u64 flags;
/** @rsvd1: Reserved, MBZ */
__u32 rsvd1;
/** @fence_count: Number of fences in @timeline_fences array. */
__u32 fence_count;
/**
* @timeline_fences: Pointer to an array of timeline fences.
*
* Timeline fences are of format struct drm_i915_gem_timeline_fence.
*/
__u64 timeline_fences;
/** @rsvd2: Reserved, MBZ */
__u64 rsvd2;
/**
* @extensions: Zero-terminated chain of extensions.
*
* For future extensions. See struct i915_user_extension.
*/
__u64 extensions;
};
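/*
 * Illustrative sketch, not part of this header: submitting a batch in
 * VM_BIND mode. Everything the batch touches must already be bound via
 * VM_BIND; only the batch GPU VA is passed, never an object handle.
 */
static int execbuf3(int fd, __u32 ctx_id, __u32 engine_idx, __u64 batch_va,
		    struct drm_i915_gem_timeline_fence *fences, __u32 nfence)
{
	struct drm_i915_gem_execbuffer3 eb = {
		.ctx_id = ctx_id, /* must have a user engine map */
		.engine_idx = engine_idx,
		.batch_address = batch_va,
		.fence_count = nfence,
		.timeline_fences = (uintptr_t)fences,
	};

	return ioctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER3, &eb);
}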
/**
* struct drm_i915_gem_create_ext_vm_private - Extension to make the object
* private to the specified VM.
*
* See struct drm_i915_gem_create_ext.
*/
struct drm_i915_gem_create_ext_vm_private {
#define I915_GEM_CREATE_EXT_VM_PRIVATE 2
/** @base: Extension link. See struct i915_user_extension. */
struct i915_user_extension base;
/** @vm_id: Id of the VM to which the object is private */
__u32 vm_id;
};


@ -0,0 +1,245 @@
==========================================
I915 VM_BIND feature design and use cases
==========================================
VM_BIND feature
================
DRM_I915_GEM_VM_BIND/UNBIND ioctls allow a UMD to bind/unbind GEM buffer
objects (BOs) or sections of a BO at specified GPU virtual addresses on a
specified address space (VM). These mappings (also referred to as persistent
mappings) will be persistent across multiple GPU submissions (execbuf calls)
issued by the UMD, without the user having to provide a list of all required
mappings during each submission (as required by the older execbuf mode).
The VM_BIND/UNBIND calls allow UMDs to request a timeline out fence for
signaling the completion of the bind/unbind operation.
The VM_BIND feature is advertised to the user via I915_PARAM_VM_BIND_VERSION.
The user has to opt in to the VM_BIND mode of binding for an address space (VM)
at VM creation time via the I915_VM_CREATE_FLAGS_USE_VM_BIND extension.
VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently are
not ordered. Furthermore, parts of the VM_BIND/UNBIND operations can be done
asynchronously, when a valid out fence is specified.
VM_BIND features include:
* Multiple Virtual Address (VA) mappings can map to the same physical pages
of an object (aliasing).
* VA mapping can map to a partial section of the BO (partial binding).
* Support capture of persistent mappings in the dump upon GPU error.
* Support for userptr gem objects (no special uapi is required for this).
TLB flush consideration
------------------------
The i915 driver flushes the TLB for each submission and when an object's
pages are released. The VM_BIND/UNBIND operation will not do any additional
TLB flush. Any VM_BIND mapping added will be in the working set for subsequent
submissions on that VM and will not be in the working set for currently running
batches (which would require additional TLB flushes, which is not supported).
Execbuf ioctl in VM_BIND mode
-------------------------------
A VM in VM_BIND mode will not support the older execbuf mode of binding.
The execbuf ioctl handling in VM_BIND mode differs significantly from the
older execbuf2 ioctl (See struct drm_i915_gem_execbuffer2).
Hence, a new execbuf3 ioctl has been added to support VM_BIND mode. (See
struct drm_i915_gem_execbuffer3). The execbuf3 ioctl will not accept any
execlist. Hence, no support for implicit sync. It is expected that the below
work will be able to support requirements of object dependency setting in all
use cases:
"dma-buf: Add an API for exporting sync files"
(https://lwn.net/Articles/859290/)
The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
works with execbuf3 ioctl for submission. All BOs mapped on that VM (through
VM_BIND call) at the time of execbuf3 call are deemed required for that
submission.
The execbuf3 ioctl directly specifies the batch addresses instead of object
handles as in the execbuf2 ioctl. The execbuf3 ioctl will also not
support many of the older features like in/out/submit fences, fence array,
default gem context and many more (See struct drm_i915_gem_execbuffer3).
In VM_BIND mode, VA allocation is completely managed by the user instead of
the i915 driver. Hence, driver-side VA assignment and eviction are not
applicable in VM_BIND mode. Also, for determining object activeness, VM_BIND mode will not
be using the i915_vma active reference tracking. It will instead use dma-resv
object for that (See `VM_BIND dma_resv usage`_).
So, a lot of existing code supporting the execbuf2 ioctl, like relocations, VA
evictions, the vma lookup table, implicit sync, vma active reference tracking
etc., is not applicable to the execbuf3 ioctl. Hence, all execbuf3 specific
handling should be in a separate file and only functionality common to these
ioctls should be shared where possible.
VM_PRIVATE objects
-------------------
By default, BOs can be mapped on multiple VMs and can also be dma-buf
exported. Hence these BOs are referred to as Shared BOs.
During each execbuf submission, the request fence must be added to the
dma-resv fence list of all shared BOs mapped on the VM.
The VM_BIND feature introduces an optimization where the user can create a BO
which is private to a specified VM via the I915_GEM_CREATE_EXT_VM_PRIVATE flag during
BO creation. Unlike Shared BOs, these VM private BOs can only be mapped on
the VM they are private to and can't be dma-buf exported.
All private BOs of a VM share the dma-resv object. Hence during each execbuf
submission, they need only one dma-resv fence list updated. Thus, the fast
path (where required mappings are already bound) submission latency is O(1)
w.r.t the number of VM private BOs.
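A minimal sketch of the intended creation flow, assuming the final uAPI
matches the RFC header (fd, size and vm_id are placeholders)::

    struct drm_i915_gem_create_ext_vm_private priv = {
        .base = { .name = I915_GEM_CREATE_EXT_VM_PRIVATE },
        .vm_id = vm_id, /* the only VM this BO may ever be bound in */
    };
    struct drm_i915_gem_create_ext create = {
        .size = size,
        .extensions = (uintptr_t)&priv,
    };
    int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create);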
VM_BIND locking hierarchy
-------------------------
The locking design here supports the older (execlist based) execbuf mode, the
newer VM_BIND mode, the VM_BIND mode with GPU page faults and possible future
system allocator support (See `Shared Virtual Memory (SVM) support`_).
The older execbuf mode and the newer VM_BIND mode without page faults manage
residency of backing storage using dma_fence. The VM_BIND mode with page faults
and the system allocator support do not use any dma_fence at all.
VM_BIND locking order is as below.
1) Lock-A: A vm_bind mutex will protect vm_bind lists. This lock is taken in
vm_bind/vm_unbind ioctl calls, in the execbuf path and while releasing the
mapping.
In future, when GPU page faults are supported, we can potentially use a
rwsem instead, so that multiple page fault handlers can take the read side
lock to lookup the mapping and hence can run in parallel.
The older execbuf mode of binding does not need this lock.
2) Lock-B: The object's dma-resv lock will protect i915_vma state and needs to
be held while binding/unbinding a vma in the async worker and while updating
dma-resv fence list of an object. Note that private BOs of a VM will all
share a dma-resv object.
The future system allocator support will use the HMM prescribed locking
instead.
3) Lock-C: Spinlock/s to protect some of the VM's lists like the list of
invalidated vmas (due to eviction and userptr invalidation) etc.
When GPU page faults are supported, the execbuf path does not take any of these
locks. There we will simply smash the new batch buffer address into the ring and
then tell the scheduler to run it. The lock taking only happens from the page
fault handler, where we take lock-A in read mode, whichever lock-B we need to
find the backing storage (dma_resv lock for gem objects, and hmm/core mm for
system allocator) and some additional locks (lock-D) for taking care of page
table races. Page fault mode should not need to ever manipulate the vm lists,
so won't ever need lock-C.
VM_BIND LRU handling
---------------------
We need to ensure VM_BIND mapped objects are properly LRU tagged to avoid
performance degradation. We will also need support for bulk LRU movement of
VM_BIND objects to avoid additional latencies in execbuf path.
The page table pages are similar to VM_BIND mapped objects (See
`Evictable page table allocations`_) and are maintained per VM and need to
be pinned in memory when the VM is made active (i.e., upon an execbuf call with
that VM). So, bulk LRU movement of page table pages is also needed.
VM_BIND dma_resv usage
-----------------------
Fences need to be added to all VM_BIND mapped objects. During each execbuf
submission, they are added with DMA_RESV_USAGE_BOOKKEEP usage to prevent
over sync (See enum dma_resv_usage). One can override it with either
DMA_RESV_USAGE_READ or DMA_RESV_USAGE_WRITE usage during explicit object
dependency setting.
Note that DRM_I915_GEM_WAIT and DRM_I915_GEM_BUSY ioctls do not check for
DMA_RESV_USAGE_BOOKKEEP usage and hence should not be used for end of batch
check. Instead, the execbuf3 out fence should be used for end of batch check
(See struct drm_i915_gem_execbuffer3).
Also, in VM_BIND mode, use dma-resv apis for determining object activeness
(See dma_resv_test_signaled() and dma_resv_wait_timeout()) and do not use the
older i915_vma active reference tracking, which is deprecated. This should be
easier to get working with the current TTM backend.
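As a sketch in terms of the existing dma-resv API (the shared resv pointer
name here is hypothetical)::

    /* Fast path: one fence list update covers every VM private BO. */
    int err = dma_resv_reserve_fences(vm_shared_resv, 1);

    if (!err)
        dma_resv_add_fence(vm_shared_resv, &rq->fence,
                           DMA_RESV_USAGE_BOOKKEEP);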
Mesa use case
--------------
VM_BIND can potentially reduce the CPU overhead in Mesa (both Vulkan and Iris),
hence improving performance of CPU-bound applications. It also allows us to
implement Vulkan's Sparse Resources. With increasing GPU hardware performance,
reducing CPU overhead becomes more impactful.
Other VM_BIND use cases
========================
Long running Compute contexts
------------------------------
Usage of dma-fence expects that they complete in a reasonable amount of time.
Compute on the other hand can be long running. Hence it is appropriate for
compute to use user/memory fence (See `User/Memory Fence`_) and dma-fence usage
must be limited to in-kernel consumption only.
Where GPU page faults are not available, the kernel driver, upon buffer
invalidation, will initiate a suspend (preemption) of the long running context,
finish the invalidation, revalidate the BO and then resume the compute context.
This is
done by having a per-context preempt fence which is enabled when someone tries
to wait on it and triggers the context preemption.
User/Memory Fence
~~~~~~~~~~~~~~~~~~
A user/memory fence is an <address, value> pair. To signal the user fence, the
specified value is written at the specified virtual address, waking up the
waiting process. A user fence can be signaled either by the GPU or by a kernel
async worker (for example, upon bind completion). The user can wait on a user
fence with a new user fence wait ioctl; the pair itself is sketched below.
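A purely hypothetical sketch of the pair described above (no such uAPI
exists yet)::

    struct hypothetical_user_fence {
        __u64 addr;  /* GPU-visible virtual address the signaler writes */
        __u64 value; /* value whose appearance at addr wakes the waiter */
    };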
Here is some prior work on this:
https://patchwork.freedesktop.org/patch/349417/
Low Latency Submission
~~~~~~~~~~~~~~~~~~~~~~~
Allows the compute UMD to directly submit GPU jobs instead of going through the
execbuf ioctl. This is made possible by VM_BIND not being synchronized against
execbuf. VM_BIND allows bind/unbind of mappings required for the directly
submitted jobs.
Debugger
---------
With the debug event interface, a user space process (the debugger) is able to
keep track of and act upon resources created by another process (the debuggee)
and attached to the GPU via the vm_bind interface.
GPU page faults
----------------
GPU page faults, when supported (in the future), will only be supported in
VM_BIND mode. While both the older execbuf mode and the newer VM_BIND mode of
binding will require using dma-fence to ensure residency, the GPU page fault
mode, when supported, will not use any dma-fence, as residency is purely managed
by installing and removing/invalidating page table entries.
Page level hints settings
--------------------------
VM_BIND allows hints to be set per mapping instead of per BO. Possible hints
include placement and atomicity. Sub-BO level placement hint will be even more
relevant with upcoming GPU on-demand page fault support.
Page level Cache/CLOS settings
-------------------------------
VM_BIND allows cache/CLOS settings per mapping instead of per BO.
Evictable page table allocations
---------------------------------
Make pagetable allocations evictable and manage them similar to VM_BIND
mapped objects. Page table pages are similar to persistent mappings of a
VM (the differences here are that the page table pages will not have an i915_vma
structure, and after swapping pages back in, the parent page link needs to be
updated).
Shared Virtual Memory (SVM) support
------------------------------------
The VM_BIND interface can be used to map system memory directly (without the gem BO
abstraction) using the HMM interface. SVM is only supported with GPU page
faults enabled.
VM_BIND UAPI
=============
.. kernel-doc:: Documentation/gpu/rfc/i915_vm_bind.h


@ -23,3 +23,11 @@ host such documentation:
.. toctree::
i915_scheduler.rst
.. toctree::
i915_small_bar.rst
.. toctree::
i915_vm_bind.rst


@ -241,6 +241,7 @@ struct create_ext {
struct drm_i915_private *i915;
struct intel_memory_region *placements[INTEL_REGION_UNKNOWN];
unsigned int n_placements;
unsigned int placement_mask;
unsigned long flags;
};
@ -337,6 +338,7 @@ static int set_placements(struct drm_i915_gem_create_ext_memory_regions *args,
for (i = 0; i < args->num_regions; i++)
ext_data->placements[i] = placements[i];
ext_data->placement_mask = mask;
return 0;
out_dump:
@ -411,7 +413,7 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
struct drm_i915_gem_object *obj;
int ret;
if (args->flags)
if (args->flags & ~I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS)
return -EINVAL;
ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
@ -427,6 +429,22 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
ext_data.n_placements = 1;
}
if (args->flags & I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS) {
if (ext_data.n_placements == 1)
return -EINVAL;
/*
* We always need to be able to spill to system memory, if we
* can't place in the mappable part of LMEM.
*/
if (!(ext_data.placement_mask & BIT(INTEL_REGION_SMEM)))
return -EINVAL;
} else {
if (ext_data.n_placements > 1 ||
ext_data.placements[0]->type != INTEL_MEMORY_SYSTEM)
ext_data.flags |= I915_BO_ALLOC_GPU_ONLY;
}
obj = __i915_gem_object_create_user_ext(i915, args->size,
ext_data.placements,
ext_data.n_placements,


@ -1951,7 +1951,7 @@ eb_find_first_request_added(struct i915_execbuffer *eb)
#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
/* Stage with GFP_KERNEL allocations before we enter the signaling critical path */
static void eb_capture_stage(struct i915_execbuffer *eb)
static int eb_capture_stage(struct i915_execbuffer *eb)
{
const unsigned int count = eb->buffer_count;
unsigned int i = count, j;
@ -1964,6 +1964,10 @@ static void eb_capture_stage(struct i915_execbuffer *eb)
if (!(flags & EXEC_OBJECT_CAPTURE))
continue;
if (i915_gem_context_is_recoverable(eb->gem_context) &&
(IS_DGFX(eb->i915) || GRAPHICS_VER_FULL(eb->i915) > IP_VER(12, 0)))
return -EINVAL;
for_each_batch_create_order(eb, j) {
struct i915_capture_list *capture;
@ -1976,6 +1980,8 @@ static void eb_capture_stage(struct i915_execbuffer *eb)
eb->capture_lists[j] = capture;
}
}
return 0;
}
/* Commit once we're in the critical path */
@ -2017,8 +2023,9 @@ static void eb_capture_list_clear(struct i915_execbuffer *eb)
#else
static void eb_capture_stage(struct i915_execbuffer *eb)
static int eb_capture_stage(struct i915_execbuffer *eb)
{
return 0;
}
static void eb_capture_commit(struct i915_execbuffer *eb)
@ -3410,7 +3417,9 @@ i915_gem_do_execbuffer(struct drm_device *dev,
}
ww_acquire_done(&eb.ww.ctx);
eb_capture_stage(&eb);
err = eb_capture_stage(&eb);
if (err)
goto err_vma;
out_fence = eb_requests_create(&eb, in_fence, out_fence_fd);
if (IS_ERR(out_fence)) {


@ -717,6 +717,32 @@ bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,
return false;
}
/**
* i915_gem_object_needs_ccs_pages - Check whether the object requires extra
* pages when placed in system-memory, in order to save and later restore the
* flat-CCS aux state when the object is moved between local-memory and
* system-memory
* @obj: Pointer to the object
*
* Return: True if the object needs extra ccs pages. False otherwise.
*/
bool i915_gem_object_needs_ccs_pages(struct drm_i915_gem_object *obj)
{
bool lmem_placement = false;
int i;
for (i = 0; i < obj->mm.n_placements; i++) {
/* Compression is not allowed for the objects with smem placement */
if (obj->mm.placements[i]->type == INTEL_MEMORY_SYSTEM)
return false;
if (!lmem_placement &&
obj->mm.placements[i]->type == INTEL_MEMORY_LOCAL)
lmem_placement = true;
}
return lmem_placement;
}
void i915_gem_init__objects(struct drm_i915_private *i915)
{
INIT_DELAYED_WORK(&i915->mm.free_work, __i915_gem_free_work);
@ -783,10 +809,31 @@ int i915_gem_object_wait_moving_fence(struct drm_i915_gem_object *obj,
intr, MAX_SCHEDULE_TIMEOUT);
if (!ret)
ret = -ETIME;
else if (ret > 0 && i915_gem_object_has_unknown_state(obj))
ret = -EIO;
return ret < 0 ? ret : 0;
}
/**
* i915_gem_object_has_unknown_state - Return true if the object backing pages are
* in an unknown_state. This means that userspace must NEVER be allowed to touch
* the pages, with either the GPU or CPU.
*
* ONLY valid to be called after ensuring that all kernel fences have signalled
* (in particular the fence for moving/clearing the object).
*/
bool i915_gem_object_has_unknown_state(struct drm_i915_gem_object *obj)
{
/*
* The below barrier pairs with the dma_fence_signal() in
* __memcpy_work(). We should only sample the unknown_state after all
* the kernel fences have signalled.
*/
smp_rmb();
return obj->mm.unknown_state;
}
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
#include "selftests/huge_gem_object.c"
#include "selftests/huge_pages.c"


@ -524,6 +524,7 @@ int i915_gem_object_get_moving_fence(struct drm_i915_gem_object *obj,
struct dma_fence **fence);
int i915_gem_object_wait_moving_fence(struct drm_i915_gem_object *obj,
bool intr);
bool i915_gem_object_has_unknown_state(struct drm_i915_gem_object *obj);
void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
unsigned int cache_level);
@ -617,6 +618,8 @@ int i915_gem_object_wait_migration(struct drm_i915_gem_object *obj,
bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,
enum intel_memory_type type);
bool i915_gem_object_needs_ccs_pages(struct drm_i915_gem_object *obj);
int shmem_sg_alloc_table(struct drm_i915_private *i915, struct sg_table *st,
size_t size, struct intel_memory_region *mr,
struct address_space *mapping,


@ -547,6 +547,24 @@ struct drm_i915_gem_object {
*/
bool ttm_shrinkable;
/**
* @unknown_state: Indicate that the object is effectively
* borked. This is write-once and set if we somehow encounter a
* fatal error when moving/clearing the pages, and we are not
* able to fallback to memcpy/memset, like on small-BAR systems.
* The GPU should also be wedged (or in the process) at this
* point.
*
* Only valid to read this after acquiring the dma-resv lock and
* waiting for all DMA_RESV_USAGE_KERNEL fences to be signalled,
* or if we otherwise know that the moving fence has signalled,
* and we are certain the pages underneath are valid for
* immediate access (under normal operation), like just prior to
* binding the object or when setting up the CPU fault handler.
* See i915_gem_object_has_unknown_state();
*/
bool unknown_state;
/**
* Priority list of potential placements for this object.
*/


@ -60,6 +60,8 @@ __i915_gem_object_create_region(struct intel_memory_region *mem,
if (page_size)
default_page_size = page_size;
/* We should be able to fit a page within an sg entry */
GEM_BUG_ON(overflows_type(default_page_size, u32));
GEM_BUG_ON(!is_power_of_2_u64(default_page_size));
GEM_BUG_ON(default_page_size < PAGE_SIZE);


@ -266,24 +266,6 @@ static const struct i915_refct_sgt_ops tt_rsgt_ops = {
.release = i915_ttm_tt_release
};
static inline bool
i915_gem_object_needs_ccs_pages(struct drm_i915_gem_object *obj)
{
bool lmem_placement = false;
int i;
for (i = 0; i < obj->mm.n_placements; i++) {
/* Compression is not allowed for the objects with smem placement */
if (obj->mm.placements[i]->type == INTEL_MEMORY_SYSTEM)
return false;
if (!lmem_placement &&
obj->mm.placements[i]->type == INTEL_MEMORY_LOCAL)
lmem_placement = true;
}
return lmem_placement;
}
static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
uint32_t page_flags)
{
@ -620,10 +602,15 @@ i915_ttm_resource_get_st(struct drm_i915_gem_object *obj,
struct ttm_resource *res)
{
struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
u32 page_alignment;
if (!i915_ttm_gtt_binds_lmem(res))
return i915_ttm_tt_get_st(bo->ttm);
page_alignment = bo->page_alignment << PAGE_SHIFT;
if (!page_alignment)
page_alignment = obj->mm.region->min_page_size;
/*
* If CPU mapping differs, we need to add the ttm_tt pages to
* the resulting st. Might make sense for GGTT.
@ -634,7 +621,8 @@ i915_ttm_resource_get_st(struct drm_i915_gem_object *obj,
struct i915_refct_sgt *rsgt;
rsgt = intel_region_ttm_resource_to_rsgt(obj->mm.region,
res);
res,
page_alignment);
if (IS_ERR(rsgt))
return rsgt;
@ -643,7 +631,8 @@ i915_ttm_resource_get_st(struct drm_i915_gem_object *obj,
return i915_refct_sgt_get(obj->ttm.cached_io_rsgt);
}
return intel_region_ttm_resource_to_rsgt(obj->mm.region, res);
return intel_region_ttm_resource_to_rsgt(obj->mm.region, res,
page_alignment);
}
static int i915_ttm_truncate(struct drm_i915_gem_object *obj)
@ -675,7 +664,15 @@ static void i915_ttm_swap_notify(struct ttm_buffer_object *bo)
i915_ttm_purge(obj);
}
static bool i915_ttm_resource_mappable(struct ttm_resource *res)
/**
* i915_ttm_resource_mappable - Return true if the ttm resource is CPU
* accessible.
* @res: The TTM resource to check.
*
* This is interesting on small-BAR systems where we may encounter lmem objects
* that can't be accessed via the CPU.
*/
bool i915_ttm_resource_mappable(struct ttm_resource *res)
{
struct i915_ttm_buddy_resource *bman_res = to_ttm_buddy_resource(res);
@ -687,6 +684,22 @@ static bool i915_ttm_resource_mappable(struct ttm_resource *res)
static int i915_ttm_io_mem_reserve(struct ttm_device *bdev, struct ttm_resource *mem)
{
struct drm_i915_gem_object *obj = i915_ttm_to_gem(mem->bo);
bool unknown_state;
if (!obj)
return -EINVAL;
if (!kref_get_unless_zero(&obj->base.refcount))
return -EINVAL;
assert_object_held(obj);
unknown_state = i915_gem_object_has_unknown_state(obj);
i915_gem_object_put(obj);
if (unknown_state)
return -EINVAL;
if (!i915_ttm_cpu_maps_iomem(mem))
return 0;


@ -92,4 +92,7 @@ static inline bool i915_ttm_cpu_maps_iomem(struct ttm_resource *mem)
/* Once / if we support GGTT, this is also false for cached ttm_tts */
return mem->mem_type != I915_PL_SYSTEM;
}
bool i915_ttm_resource_mappable(struct ttm_resource *res);
#endif


@ -33,6 +33,7 @@
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
static bool fail_gpu_migration;
static bool fail_work_allocation;
static bool ban_memcpy;
void i915_ttm_migrate_set_failure_modes(bool gpu_migration,
bool work_allocation)
@ -40,6 +41,11 @@ void i915_ttm_migrate_set_failure_modes(bool gpu_migration,
fail_gpu_migration = gpu_migration;
fail_work_allocation = work_allocation;
}
void i915_ttm_migrate_set_ban_memcpy(bool ban)
{
ban_memcpy = ban;
}
#endif
static enum i915_cache_level
@ -258,15 +264,23 @@ struct i915_ttm_memcpy_arg {
* from the callback for lockdep reasons.
* @cb: Callback for the accelerated migration fence.
* @arg: The argument for the memcpy functionality.
* @i915: The i915 pointer.
* @obj: The GEM object.
* @memcpy_allowed: Instead of processing the @arg, and falling back to memcpy
* or memset, we wedge the device and set the @obj unknown_state, to prevent
* further access to the object with the CPU or GPU. On some devices we might
* only be permitted to use the blitter engine for such operations.
*/
struct i915_ttm_memcpy_work {
struct dma_fence fence;
struct work_struct work;
/* The fence lock */
spinlock_t lock;
struct irq_work irq_work;
struct dma_fence_cb cb;
struct i915_ttm_memcpy_arg arg;
struct drm_i915_private *i915;
struct drm_i915_gem_object *obj;
bool memcpy_allowed;
};
static void i915_ttm_move_memcpy(struct i915_ttm_memcpy_arg *arg)
@ -317,14 +331,42 @@ static void __memcpy_work(struct work_struct *work)
struct i915_ttm_memcpy_work *copy_work =
container_of(work, typeof(*copy_work), work);
struct i915_ttm_memcpy_arg *arg = &copy_work->arg;
bool cookie = dma_fence_begin_signalling();
bool cookie;
/*
* FIXME: We need to take a closer look here. We should be able to plonk
* this into the fence critical section.
*/
if (!copy_work->memcpy_allowed) {
struct intel_gt *gt;
unsigned int id;
for_each_gt(gt, copy_work->i915, id)
intel_gt_set_wedged(gt);
}
cookie = dma_fence_begin_signalling();
if (copy_work->memcpy_allowed) {
i915_ttm_move_memcpy(arg);
} else {
/*
* Prevent further use of the object. Any future GTT binding or
* CPU access is not allowed once we signal the fence. Outside
* of the fence critical section, we then also wedge the gpu
* to indicate the device is not functional.
*
* The below dma_fence_signal() is our write-memory-barrier.
*/
copy_work->obj->mm.unknown_state = true;
}
i915_ttm_move_memcpy(arg);
dma_fence_end_signalling(cookie);
dma_fence_signal(&copy_work->fence);
i915_ttm_memcpy_release(arg);
i915_gem_object_put(copy_work->obj);
dma_fence_put(&copy_work->fence);
}
@ -336,6 +378,7 @@ static void __memcpy_irq_work(struct irq_work *irq_work)
dma_fence_signal(&copy_work->fence);
i915_ttm_memcpy_release(arg);
i915_gem_object_put(copy_work->obj);
dma_fence_put(&copy_work->fence);
}
@ -389,6 +432,19 @@ i915_ttm_memcpy_work_arm(struct i915_ttm_memcpy_work *work,
return &work->fence;
}
static bool i915_ttm_memcpy_allowed(struct ttm_buffer_object *bo,
struct ttm_resource *dst_mem)
{
if (i915_gem_object_needs_ccs_pages(i915_ttm_to_gem(bo)))
return false;
if (!(i915_ttm_resource_mappable(bo->resource) &&
i915_ttm_resource_mappable(dst_mem)))
return false;
return I915_SELFTEST_ONLY(ban_memcpy) ? false : true;
}
static struct dma_fence *
__i915_ttm_move(struct ttm_buffer_object *bo,
const struct ttm_operation_ctx *ctx, bool clear,
@ -396,6 +452,9 @@ __i915_ttm_move(struct ttm_buffer_object *bo,
struct i915_refct_sgt *dst_rsgt, bool allow_accel,
const struct i915_deps *move_deps)
{
const bool memcpy_allowed = i915_ttm_memcpy_allowed(bo, dst_mem);
struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
struct drm_i915_private *i915 = to_i915(bo->base.dev);
struct i915_ttm_memcpy_work *copy_work = NULL;
struct i915_ttm_memcpy_arg _arg, *arg = &_arg;
struct dma_fence *fence = ERR_PTR(-EINVAL);
@ -423,9 +482,14 @@ __i915_ttm_move(struct ttm_buffer_object *bo,
copy_work = kzalloc(sizeof(*copy_work), GFP_KERNEL);
if (copy_work) {
copy_work->i915 = i915;
copy_work->memcpy_allowed = memcpy_allowed;
copy_work->obj = i915_gem_object_get(obj);
arg = &copy_work->arg;
i915_ttm_memcpy_init(arg, bo, clear, dst_mem, dst_ttm,
dst_rsgt);
if (memcpy_allowed)
i915_ttm_memcpy_init(arg, bo, clear, dst_mem,
dst_ttm, dst_rsgt);
fence = i915_ttm_memcpy_work_arm(copy_work, dep);
} else {
dma_fence_wait(dep, false);
@ -450,17 +514,23 @@ __i915_ttm_move(struct ttm_buffer_object *bo,
}
/* Error intercept failed or no accelerated migration to start with */
if (!copy_work)
i915_ttm_memcpy_init(arg, bo, clear, dst_mem, dst_ttm,
dst_rsgt);
i915_ttm_move_memcpy(arg);
i915_ttm_memcpy_release(arg);
if (memcpy_allowed) {
if (!copy_work)
i915_ttm_memcpy_init(arg, bo, clear, dst_mem, dst_ttm,
dst_rsgt);
i915_ttm_move_memcpy(arg);
i915_ttm_memcpy_release(arg);
}
if (copy_work)
i915_gem_object_put(copy_work->obj);
kfree(copy_work);
return NULL;
return memcpy_allowed ? NULL : ERR_PTR(-EIO);
out:
if (!fence && copy_work) {
i915_ttm_memcpy_release(arg);
i915_gem_object_put(copy_work->obj);
kfree(copy_work);
}
@ -539,8 +609,11 @@ int i915_ttm_move(struct ttm_buffer_object *bo, bool evict,
}
if (migration_fence) {
ret = ttm_bo_move_accel_cleanup(bo, migration_fence, evict,
true, dst_mem);
if (I915_SELFTEST_ONLY(evict && fail_gpu_migration))
ret = -EIO; /* never feed non-migrate fences into ttm */
else
ret = ttm_bo_move_accel_cleanup(bo, migration_fence, evict,
true, dst_mem);
if (ret) {
dma_fence_wait(migration_fence, false);
ttm_bo_move_sync_cleanup(bo, dst_mem);


@ -22,6 +22,7 @@ int i915_ttm_move_notify(struct ttm_buffer_object *bo);
I915_SELFTEST_DECLARE(void i915_ttm_migrate_set_failure_modes(bool gpu_migration,
bool work_allocation));
I915_SELFTEST_DECLARE(void i915_ttm_migrate_set_ban_memcpy(bool ban));
int i915_gem_obj_copy_ttm(struct drm_i915_gem_object *dst,
struct drm_i915_gem_object *src,


@ -9,6 +9,7 @@
#include <linux/jiffies.h>
#include "gt/intel_engine.h"
#include "gt/intel_rps.h"
#include "i915_gem_ioctls.h"
#include "i915_gem_object.h"
@ -31,6 +32,37 @@ i915_gem_object_wait_fence(struct dma_fence *fence,
timeout);
}
static void
i915_gem_object_boost(struct dma_resv *resv, unsigned int flags)
{
struct dma_resv_iter cursor;
struct dma_fence *fence;
/*
* Prescan all fences for potential boosting before we begin waiting.
*
* When we wait, we wait on outstanding fences serially. If the
* dma-resv contains a sequence such as 1:1, 1:2 instead of a reduced
* form 1:2, then as we look at each wait in turn we see that each
* request is currently executing and not worthy of boosting. But if
* we only happen to look at the final fence in the sequence (because
* of request coalescing or splitting between read/write arrays by
* the iterator), then we would boost. As such our decision to boost
* or not is delicately balanced on the order we wait on fences.
*
* So instead of looking for boosts sequentially, look for all boosts
* upfront and then wait on the outstanding fences.
*/
dma_resv_iter_begin(&cursor, resv,
dma_resv_usage_rw(flags & I915_WAIT_ALL));
dma_resv_for_each_fence_unlocked(&cursor, fence)
if (dma_fence_is_i915(fence) &&
!i915_request_started(to_request(fence)))
intel_rps_boost(to_request(fence));
dma_resv_iter_end(&cursor);
}
static long
i915_gem_object_wait_reservation(struct dma_resv *resv,
unsigned int flags,
@ -40,6 +72,8 @@ i915_gem_object_wait_reservation(struct dma_resv *resv,
struct dma_fence *fence;
long ret = timeout ?: 1;
i915_gem_object_boost(resv, flags);
dma_resv_iter_begin(&cursor, resv,
dma_resv_usage_rw(flags & I915_WAIT_ALL));
dma_resv_for_each_fence_unlocked(&cursor, fence) {


@ -1623,6 +1623,7 @@ static int igt_shrink_thp(void *arg)
struct file *file;
unsigned int flags = PIN_USER;
unsigned int n;
intel_wakeref_t wf;
bool should_swap;
int err;
@ -1659,9 +1660,11 @@ static int igt_shrink_thp(void *arg)
goto out_put;
}
wf = intel_runtime_pm_get(&i915->runtime_pm); /* active shrink */
err = i915_vma_pin(vma, 0, 0, flags);
if (err)
goto out_put;
goto out_wf;
if (obj->mm.page_sizes.phys < I915_GTT_PAGE_SIZE_2M) {
pr_info("failed to allocate THP, finishing test early\n");
@ -1732,6 +1735,8 @@ static int igt_shrink_thp(void *arg)
out_unpin:
i915_vma_unpin(vma);
out_wf:
intel_runtime_pm_put(&i915->runtime_pm, wf);
out_put:
i915_gem_object_put(obj);
out_vm:


@ -9,6 +9,7 @@
#include "i915_deps.h"
#include "selftests/igt_reset.h"
#include "selftests/igt_spinner.h"
static int igt_fill_check_buffer(struct drm_i915_gem_object *obj,
@ -109,7 +110,8 @@ static int igt_same_create_migrate(void *arg)
static int lmem_pages_migrate_one(struct i915_gem_ww_ctx *ww,
struct drm_i915_gem_object *obj,
struct i915_vma *vma)
struct i915_vma *vma,
bool silent_migrate)
{
int err;
@ -138,7 +140,8 @@ static int lmem_pages_migrate_one(struct i915_gem_ww_ctx *ww,
if (i915_gem_object_is_lmem(obj)) {
err = i915_gem_object_migrate(obj, ww, INTEL_REGION_SMEM);
if (err) {
pr_err("Object failed migration to smem\n");
if (!silent_migrate)
pr_err("Object failed migration to smem\n");
if (err)
return err;
}
@ -156,7 +159,8 @@ static int lmem_pages_migrate_one(struct i915_gem_ww_ctx *ww,
} else {
err = i915_gem_object_migrate(obj, ww, INTEL_REGION_LMEM_0);
if (err) {
pr_err("Object failed migration to lmem\n");
if (!silent_migrate)
pr_err("Object failed migration to lmem\n");
if (err)
return err;
}
@ -179,7 +183,8 @@ static int __igt_lmem_pages_migrate(struct intel_gt *gt,
struct i915_address_space *vm,
struct i915_deps *deps,
struct igt_spinner *spin,
struct dma_fence *spin_fence)
struct dma_fence *spin_fence,
bool borked_migrate)
{
struct drm_i915_private *i915 = gt->i915;
struct drm_i915_gem_object *obj;
@ -242,7 +247,8 @@ static int __igt_lmem_pages_migrate(struct intel_gt *gt,
*/
for (i = 1; i <= 5; ++i) {
for_i915_gem_ww(&ww, err, true)
err = lmem_pages_migrate_one(&ww, obj, vma);
err = lmem_pages_migrate_one(&ww, obj, vma,
borked_migrate);
if (err)
goto out_put;
}
@ -283,23 +289,70 @@ static int __igt_lmem_pages_migrate(struct intel_gt *gt,
static int igt_lmem_pages_failsafe_migrate(void *arg)
{
int fail_gpu, fail_alloc, ret;
int fail_gpu, fail_alloc, ban_memcpy, ret;
struct intel_gt *gt = arg;
for (fail_gpu = 0; fail_gpu < 2; ++fail_gpu) {
for (fail_alloc = 0; fail_alloc < 2; ++fail_alloc) {
pr_info("Simulated failure modes: gpu: %d, alloc: %d\n",
fail_gpu, fail_alloc);
i915_ttm_migrate_set_failure_modes(fail_gpu,
fail_alloc);
ret = __igt_lmem_pages_migrate(gt, NULL, NULL, NULL, NULL);
if (ret)
goto out_err;
for (ban_memcpy = 0; ban_memcpy < 2; ++ban_memcpy) {
pr_info("Simulated failure modes: gpu: %d, alloc:%d, ban_memcpy: %d\n",
fail_gpu, fail_alloc, ban_memcpy);
i915_ttm_migrate_set_ban_memcpy(ban_memcpy);
i915_ttm_migrate_set_failure_modes(fail_gpu,
fail_alloc);
ret = __igt_lmem_pages_migrate(gt, NULL, NULL,
NULL, NULL,
ban_memcpy &&
fail_gpu);
if (ban_memcpy && fail_gpu) {
struct intel_gt *__gt;
unsigned int id;
if (ret != -EIO) {
pr_err("expected -EIO, got (%d)\n", ret);
ret = -EINVAL;
} else {
ret = 0;
}
for_each_gt(__gt, gt->i915, id) {
intel_wakeref_t wakeref;
bool wedged;
mutex_lock(&__gt->reset.mutex);
wedged = test_bit(I915_WEDGED, &__gt->reset.flags);
mutex_unlock(&__gt->reset.mutex);
if (fail_gpu && !fail_alloc) {
if (!wedged) {
pr_err("gt(%u) not wedged\n", id);
ret = -EINVAL;
continue;
}
} else if (wedged) {
pr_err("gt(%u) incorrectly wedged\n", id);
ret = -EINVAL;
} else {
continue;
}
wakeref = intel_runtime_pm_get(__gt->uncore->rpm);
igt_global_reset_lock(__gt);
intel_gt_reset(__gt, ALL_ENGINES, NULL);
igt_global_reset_unlock(__gt);
intel_runtime_pm_put(__gt->uncore->rpm, wakeref);
}
if (ret)
goto out_err;
}
}
}
}
out_err:
i915_ttm_migrate_set_failure_modes(false, false);
i915_ttm_migrate_set_ban_memcpy(false);
return ret;
}
@ -370,7 +423,7 @@ static int igt_async_migrate(struct intel_gt *gt)
goto out_ce;
err = __igt_lmem_pages_migrate(gt, &ppgtt->vm, &deps, &spin,
spin_fence);
spin_fence, false);
i915_deps_fini(&deps);
dma_fence_put(spin_fence);
if (err)
@ -394,23 +447,67 @@ static int igt_async_migrate(struct intel_gt *gt)
#define ASYNC_FAIL_ALLOC 1
static int igt_lmem_async_migrate(void *arg)
{
int fail_gpu, fail_alloc, ret;
int fail_gpu, fail_alloc, ban_memcpy, ret;
struct intel_gt *gt = arg;
for (fail_gpu = 0; fail_gpu < 2; ++fail_gpu) {
for (fail_alloc = 0; fail_alloc < ASYNC_FAIL_ALLOC; ++fail_alloc) {
pr_info("Simulated failure modes: gpu: %d, alloc: %d\n",
fail_gpu, fail_alloc);
i915_ttm_migrate_set_failure_modes(fail_gpu,
fail_alloc);
ret = igt_async_migrate(gt);
if (ret)
goto out_err;
for (ban_memcpy = 0; ban_memcpy < 2; ++ban_memcpy) {
pr_info("Simulated failure modes: gpu: %d, alloc: %d, ban_memcpy: %d\n",
fail_gpu, fail_alloc, ban_memcpy);
i915_ttm_migrate_set_ban_memcpy(ban_memcpy);
i915_ttm_migrate_set_failure_modes(fail_gpu,
fail_alloc);
ret = igt_async_migrate(gt);
if (fail_gpu && ban_memcpy) {
struct intel_gt *__gt;
unsigned int id;
if (ret != -EIO) {
pr_err("expected -EIO, got (%d)\n", ret);
ret = -EINVAL;
} else {
ret = 0;
}
for_each_gt(__gt, gt->i915, id) {
intel_wakeref_t wakeref;
bool wedged;
mutex_lock(&__gt->reset.mutex);
wedged = test_bit(I915_WEDGED, &__gt->reset.flags);
mutex_unlock(&__gt->reset.mutex);
if (fail_gpu && !fail_alloc) {
if (!wedged) {
pr_err("gt(%u) not wedged\n", id);
ret = -EINVAL;
continue;
}
} else if (wedged) {
pr_err("gt(%u) incorrectly wedged\n", id);
ret = -EINVAL;
} else {
continue;
}
wakeref = intel_runtime_pm_get(__gt->uncore->rpm);
igt_global_reset_lock(__gt);
intel_gt_reset(__gt, ALL_ENGINES, NULL);
igt_global_reset_unlock(__gt);
intel_runtime_pm_put(__gt->uncore->rpm, wakeref);
}
}
if (ret)
goto out_err;
}
}
}
out_err:
i915_ttm_migrate_set_failure_modes(false, false);
i915_ttm_migrate_set_ban_memcpy(false);
return ret;
}


@ -10,6 +10,7 @@
#include "gem/i915_gem_internal.h"
#include "gem/i915_gem_region.h"
#include "gem/i915_gem_ttm.h"
#include "gem/i915_gem_ttm_move.h"
#include "gt/intel_engine_pm.h"
#include "gt/intel_gpu_commands.h"
#include "gt/intel_gt.h"
@ -21,6 +22,7 @@
#include "i915_selftest.h"
#include "selftests/i915_random.h"
#include "selftests/igt_flush_test.h"
#include "selftests/igt_reset.h"
#include "selftests/igt_mmap.h"
struct tile {
@ -979,6 +981,9 @@ static int igt_mmap(void *arg)
};
int i;
if (mr->private)
continue;
for (i = 0; i < ARRAY_SIZE(sizes); i++) {
struct drm_i915_gem_object *obj;
int err;
@ -1160,6 +1165,7 @@ static int ___igt_mmap_migrate(struct drm_i915_private *i915,
#define IGT_MMAP_MIGRATE_FILL (1 << 1)
#define IGT_MMAP_MIGRATE_EVICTABLE (1 << 2)
#define IGT_MMAP_MIGRATE_UNFAULTABLE (1 << 3)
#define IGT_MMAP_MIGRATE_FAIL_GPU (1 << 4)
static int __igt_mmap_migrate(struct intel_memory_region **placements,
int n_placements,
struct intel_memory_region *expected_mr,
@ -1221,8 +1227,10 @@ static int __igt_mmap_migrate(struct intel_memory_region **placements,
expand32(POISON_INUSE), &rq);
i915_gem_object_unpin_pages(obj);
if (rq) {
dma_resv_add_fence(obj->base.resv, &rq->fence,
DMA_RESV_USAGE_KERNEL);
err = dma_resv_reserve_fences(obj->base.resv, 1);
if (!err)
dma_resv_add_fence(obj->base.resv, &rq->fence,
DMA_RESV_USAGE_KERNEL);
i915_request_put(rq);
}
i915_gem_object_unlock(obj);
@ -1232,13 +1240,62 @@ static int __igt_mmap_migrate(struct intel_memory_region **placements,
if (flags & IGT_MMAP_MIGRATE_EVICTABLE)
igt_make_evictable(&objects);
if (flags & IGT_MMAP_MIGRATE_FAIL_GPU) {
err = i915_gem_object_lock(obj, NULL);
if (err)
goto out_put;
/*
* Ensure we only simulate the gpu failure when faulting the
* pages.
*/
err = i915_gem_object_wait_moving_fence(obj, true);
i915_gem_object_unlock(obj);
if (err)
goto out_put;
i915_ttm_migrate_set_failure_modes(true, false);
}
err = ___igt_mmap_migrate(i915, obj, addr,
flags & IGT_MMAP_MIGRATE_UNFAULTABLE);
if (!err && obj->mm.region != expected_mr) {
pr_err("%s region mismatch %s\n", __func__, expected_mr->name);
err = -EINVAL;
}
if (flags & IGT_MMAP_MIGRATE_FAIL_GPU) {
struct intel_gt *gt;
unsigned int id;
i915_ttm_migrate_set_failure_modes(false, false);
for_each_gt(gt, i915, id) {
intel_wakeref_t wakeref;
bool wedged;
mutex_lock(&gt->reset.mutex);
wedged = test_bit(I915_WEDGED, &gt->reset.flags);
mutex_unlock(&gt->reset.mutex);
if (!wedged) {
pr_err("gt(%u) not wedged\n", id);
err = -EINVAL;
continue;
}
wakeref = intel_runtime_pm_get(gt->uncore->rpm);
igt_global_reset_lock(gt);
intel_gt_reset(gt, ALL_ENGINES, NULL);
igt_global_reset_unlock(gt);
intel_runtime_pm_put(gt->uncore->rpm, wakeref);
}
if (!i915_gem_object_has_unknown_state(obj)) {
pr_err("object missing unknown_state\n");
err = -EINVAL;
}
}
out_put:
i915_gem_object_put(obj);
igt_close_objects(i915, &objects);
@ -1319,6 +1376,23 @@ static int igt_mmap_migrate(void *arg)
IGT_MMAP_MIGRATE_TOPDOWN |
IGT_MMAP_MIGRATE_FILL |
IGT_MMAP_MIGRATE_UNFAULTABLE);
if (err)
goto out_io_size;
/*
* Allocate in the non-mappable portion, but force migrating to
* the mappable portion on fault (LMEM -> LMEM). We then also
* simulate a gpu error when moving the pages on fault, which
* should result in wedging the gpu and returning SIGBUS in the
* fault handler, since we can't fall back to memcpy.
*/
err = __igt_mmap_migrate(single, ARRAY_SIZE(single), mr,
IGT_MMAP_MIGRATE_TOPDOWN |
IGT_MMAP_MIGRATE_FILL |
IGT_MMAP_MIGRATE_EVICTABLE |
IGT_MMAP_MIGRATE_FAIL_GPU |
IGT_MMAP_MIGRATE_UNFAULTABLE);
out_io_size:
mr->io_size = saved_io_size;
i915_ttm_buddy_man_force_visible_size(man,
@ -1435,6 +1509,9 @@ static int igt_mmap_access(void *arg)
struct drm_i915_gem_object *obj;
int err;
if (mr->private)
continue;
obj = __i915_gem_object_create_user(i915, PAGE_SIZE, &mr, 1);
if (obj == ERR_PTR(-ENODEV))
continue;
@ -1580,6 +1657,9 @@ static int igt_mmap_gpu(void *arg)
struct drm_i915_gem_object *obj;
int err;
if (mr->private)
continue;
obj = __i915_gem_object_create_user(i915, PAGE_SIZE, &mr, 1);
if (obj == ERR_PTR(-ENODEV))
continue;
@ -1727,6 +1807,9 @@ static int igt_mmap_revoke(void *arg)
struct drm_i915_gem_object *obj;
int err;
if (mr->private)
continue;
obj = __i915_gem_object_create_user(i915, PAGE_SIZE, &mr, 1);
if (obj == ERR_PTR(-ENODEV))
continue;

View file

@ -399,7 +399,8 @@ static void insert_breadcrumb(struct i915_request *rq)
* the request as it may have completed and raised the interrupt as
* we were attaching it into the lists.
*/
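/*
 * Only kick the signal worker when it is not already armed, or when
 * the request may already be complete and the interrupt missed;
 * re-queueing the irq_work on every breadcrumb insert is needless work.
 */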
irq_work_queue(&b->irq_work);
if (!b->irq_armed || __i915_request_is_complete(rq))
irq_work_queue(&b->irq_work);
}
bool i915_request_enable_breadcrumb(struct i915_request *rq)

View file

@ -1517,7 +1517,6 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine,
struct intel_instdone *instdone)
{
struct drm_i915_private *i915 = engine->i915;
const struct sseu_dev_info *sseu = &engine->gt->info.sseu;
struct intel_uncore *uncore = engine->uncore;
u32 mmio_base = engine->mmio_base;
int slice;
@ -1542,32 +1541,19 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine,
intel_uncore_read(uncore, GEN12_SC_INSTDONE_EXTRA2);
}
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice) {
instdone->sampler[slice][subslice] =
intel_gt_mcr_read(engine->gt,
GEN7_SAMPLER_INSTDONE,
slice, subslice);
instdone->row[slice][subslice] =
intel_gt_mcr_read(engine->gt,
GEN7_ROW_INSTDONE,
slice, subslice);
}
} else {
for_each_instdone_slice_subslice(i915, sseu, slice, subslice) {
instdone->sampler[slice][subslice] =
intel_gt_mcr_read(engine->gt,
GEN7_SAMPLER_INSTDONE,
slice, subslice);
instdone->row[slice][subslice] =
intel_gt_mcr_read(engine->gt,
GEN7_ROW_INSTDONE,
slice, subslice);
}
for_each_ss_steering(iter, engine->gt, slice, subslice) {
instdone->sampler[slice][subslice] =
intel_gt_mcr_read(engine->gt,
GEN7_SAMPLER_INSTDONE,
slice, subslice);
instdone->row[slice][subslice] =
intel_gt_mcr_read(engine->gt,
GEN7_ROW_INSTDONE,
slice, subslice);
}
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice)
for_each_ss_steering(iter, engine->gt, slice, subslice)
instdone->geom_svg[slice][subslice] =
intel_gt_mcr_read(engine->gt,
XEHPG_INSTDONE_GEOM_SVG,

View file

@ -647,26 +647,4 @@ intel_engine_uses_wa_hold_ccs_switchout(struct intel_engine_cs *engine)
return engine->flags & I915_ENGINE_USES_WA_HOLD_CCS_SWITCHOUT;
}
#define instdone_has_slice(dev_priv___, sseu___, slice___) \
((GRAPHICS_VER(dev_priv___) == 7 ? 1 : ((sseu___)->slice_mask)) & BIT(slice___))
#define instdone_has_subslice(dev_priv__, sseu__, slice__, subslice__) \
(GRAPHICS_VER(dev_priv__) == 7 ? (1 & BIT(subslice__)) : \
intel_sseu_has_subslice(sseu__, 0, subslice__))
#define for_each_instdone_slice_subslice(dev_priv_, sseu_, slice_, subslice_) \
for ((slice_) = 0, (subslice_) = 0; (slice_) < I915_MAX_SLICES; \
(subslice_) = ((subslice_) + 1) % I915_MAX_SUBSLICES, \
(slice_) += ((subslice_) == 0)) \
for_each_if((instdone_has_slice(dev_priv_, sseu_, slice_)) && \
(instdone_has_subslice(dev_priv_, sseu_, slice_, \
subslice_)))
#define for_each_instdone_gslice_dss_xehp(dev_priv_, sseu_, iter_, gslice_, dss_) \
for ((iter_) = 0, (gslice_) = 0, (dss_) = 0; \
(iter_) < GEN_SS_MASK_SIZE; \
(iter_)++, (gslice_) = (iter_) / GEN_DSS_PER_GSLICE, \
(dss_) = (iter_) % GEN_DSS_PER_GSLICE) \
for_each_if(intel_sseu_has_subslice((sseu_), 0, (iter_)))
#endif /* __INTEL_ENGINE_TYPES_H__ */

View file

@ -952,6 +952,20 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
mutex_lock(&gt->tlb_invalidate_lock);
intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);
spin_lock_irq(&uncore->lock); /* serialise invalidate with GT reset */
for_each_engine(engine, gt, id) {
struct reg_and_bit rb;
rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
if (!i915_mmio_reg_offset(rb.reg))
continue;
intel_uncore_write_fw(uncore, rb.reg, rb.bit);
}
spin_unlock_irq(&uncore->lock);
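/*
 * Second pass: every invalidation request was kicked off above under
 * the lock, so the per-engine waits below overlap with the hardware
 * working through them instead of serialising a write+wait per engine.
 */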
for_each_engine(engine, gt, id) {
/*
* HW architecture suggests a typical invalidation time of 40us,
@ -966,7 +980,6 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
if (!i915_mmio_reg_offset(rb.reg))
continue;
intel_uncore_write_fw(uncore, rb.reg, rb.bit);
if (__intel_wait_for_register_fw(uncore,
rb.reg, rb.bit, 0,
timeout_us, timeout_ms,

View file

@ -495,3 +495,28 @@ void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt,
}
}
/**
* intel_gt_mcr_get_ss_steering - returns the group/instance steering for a SS
* @gt: GT structure
* @dss: DSS ID to obtain steering for
* @group: pointer to storage for steering group ID
* @instance: pointer to storage for steering instance ID
*
* Returns the steering IDs (via the @group and @instance parameters) that
* correspond to a specific subslice/DSS ID.
*/
void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss,
unsigned int *group, unsigned int *instance)
{
if (IS_PONTEVECCHIO(gt->i915)) {
*group = dss / GEN_DSS_PER_CSLICE;
*instance = dss % GEN_DSS_PER_CSLICE;
} else if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) {
*group = dss / GEN_DSS_PER_GSLICE;
*instance = dss % GEN_DSS_PER_GSLICE;
} else {
*group = dss / GEN_MAX_SS_PER_HSW_SLICE;
*instance = dss % GEN_MAX_SS_PER_HSW_SLICE;
return;
}
}
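The mapping above is plain div/mod with a per-platform group width. A standalone sketch for intuition; the width of 4 below is an illustrative assumption, not necessarily the driver's GEN_DSS_PER_* constants:

#include <stdio.h>

/* Illustrative only: decompose a DSS ID into (group, instance) steering
 * IDs with an assumed group width of 4 (Xe_HP-style gslices).
 */
#define DSS_PER_GROUP 4

int main(void)
{
	unsigned int dss;

	for (dss = 0; dss < 8; dss++)
		printf("DSS %u -> group %u, instance %u\n",
		       dss, dss / DSS_PER_GROUP, dss % DSS_PER_GROUP);
	return 0;
}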

View file

@ -31,4 +31,28 @@ void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt,
bool dump_table);
void intel_gt_mcr_get_ss_steering(struct intel_gt *gt, unsigned int dss,
unsigned int *group, unsigned int *instance);
/*
* Helper for for_each_ss_steering loop. On pre-Xe_HP platforms, subslice
* presence is determined by using the group/instance as direct lookups in the
* slice/subslice topology. On Xe_HP and beyond, the steering is unrelated to
* the topology, so we look up the DSS ID directly in "slice 0."
*/
#define _HAS_SS(ss_, gt_, group_, instance_) ( \
GRAPHICS_VER_FULL(gt_->i915) >= IP_VER(12, 50) ? \
intel_sseu_has_subslice(&(gt_)->info.sseu, 0, ss_) : \
intel_sseu_has_subslice(&(gt_)->info.sseu, group_, instance_))
/*
* Loop over each subslice/DSS and determine the group and instance IDs that
* should be used to steer MCR accesses toward this DSS.
*/
#define for_each_ss_steering(ss_, gt_, group_, instance_) \
for (ss_ = 0, intel_gt_mcr_get_ss_steering(gt_, 0, &group_, &instance_); \
ss_ < I915_MAX_SS_FUSE_BITS; \
ss_++, intel_gt_mcr_get_ss_steering(gt_, ss_, &group_, &instance_)) \
for_each_if(_HAS_SS(ss_, gt_, group_, instance_))
#endif /* __INTEL_GT_MCR__ */

View file

@ -371,6 +371,9 @@
#define GEN9_WM_CHICKEN3 _MMIO(0x5588)
#define GEN9_FACTOR_IN_CLR_VAL_HIZ (1 << 9)
#define CHICKEN_RASTER_1 _MMIO(0x6204)
#define DIS_SF_ROUND_NEAREST_EVEN REG_BIT(8)
#define VFLSKPD _MMIO(0x62a8)
#define DIS_OVER_FETCH_CACHE REG_BIT(1)
#define DIS_MULT_MISS_RD_SQUASH REG_BIT(0)
@ -918,6 +921,10 @@
#define GEN7_L3CNTLREG1 _MMIO(0xb01c)
#define GEN7_WA_FOR_GEN7_L3_CONTROL 0x3C47FF8C
#define GEN7_L3AGDIS (1 << 19)
#define XEHPC_LNCFMISCCFGREG0 _MMIO(0xb01c)
#define XEHPC_OVRLSCCC REG_BIT(0)
#define GEN7_L3CNTLREG2 _MMIO(0xb020)
/* MOCS (Memory Object Control State) registers */

View file

@ -15,6 +15,103 @@
#include "gt/intel_gt_mcr.h"
#include "gt/intel_gt_regs.h"
static void _release_bars(struct pci_dev *pdev)
{
int resno;
for (resno = PCI_STD_RESOURCES; resno < PCI_STD_RESOURCE_END; resno++) {
if (pci_resource_len(pdev, resno))
pci_release_resource(pdev, resno);
}
}
static void
_resize_bar(struct drm_i915_private *i915, int resno, resource_size_t size)
{
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
int bar_size = pci_rebar_bytes_to_size(size);
int ret;
_release_bars(pdev);
ret = pci_resize_resource(pdev, resno, bar_size);
if (ret) {
drm_info(&i915->drm, "Failed to resize BAR%d to %dM (%pe)\n",
resno, 1 << bar_size, ERR_PTR(ret));
return;
}
drm_info(&i915->drm, "BAR%d resized to %dM\n", resno, 1 << bar_size);
}
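Note the format string prints "1 << bar_size" MiB, i.e. resizable-BAR sizes travel as log2 of the size in MiB. A minimal stand-in for pci_rebar_bytes_to_size(), assuming power-of-two input, shows the encoding:

#include <stdio.h>

/* Stand-in sketch for pci_rebar_bytes_to_size(): resizable-BAR sizes are
 * encoded as log2 of the size in MiB (assumes power-of-two input).
 */
static int rebar_bytes_to_size(unsigned long long bytes)
{
	int order = -1;

	while (bytes) {
		bytes >>= 1;
		order++;
	}
	return order - 20; /* log2(bytes) - log2(1 MiB) */
}

int main(void)
{
	printf("256 MiB -> %d\n", rebar_bytes_to_size(256ULL << 20)); /* 8 */
	printf("8 GiB   -> %d\n", rebar_bytes_to_size(8ULL << 30));   /* 13 */
	return 0;
}

The lmem_bar_size modparam added later in this series takes the size in MiB, so e.g. i915.lmem_bar_size=256 corresponds to encoding 8 here.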
#define LMEM_BAR_NUM 2
static void i915_resize_lmem_bar(struct drm_i915_private *i915, resource_size_t lmem_size)
{
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
struct pci_bus *root = pdev->bus;
struct resource *root_res;
resource_size_t rebar_size;
resource_size_t current_size;
u32 pci_cmd;
int i;
current_size = roundup_pow_of_two(pci_resource_len(pdev, LMEM_BAR_NUM));
if (i915->params.lmem_bar_size) {
u32 bar_sizes;
rebar_size = i915->params.lmem_bar_size *
(resource_size_t)SZ_1M;
bar_sizes = pci_rebar_get_possible_sizes(pdev,
LMEM_BAR_NUM);
if (rebar_size == current_size)
return;
if (!(bar_sizes & BIT(pci_rebar_bytes_to_size(rebar_size))) ||
rebar_size >= roundup_pow_of_two(lmem_size)) {
rebar_size = lmem_size;
drm_info(&i915->drm,
"Given bar size is not within supported size, setting it to default: %llu\n",
(u64)lmem_size >> 20);
}
} else {
rebar_size = current_size;
if (rebar_size != roundup_pow_of_two(lmem_size))
rebar_size = lmem_size;
else
return;
}
/* Find out if root bus contains 64bit memory addressing */
while (root->parent)
root = root->parent;
pci_bus_for_each_resource(root, root_res, i) {
if (root_res && root_res->flags & (IORESOURCE_MEM | IORESOURCE_MEM_64) &&
root_res->start > 0x100000000ull)
break;
}
/* Without a 64-bit root window, pci_resize_resource() will fail anyway */
if (!root_res) {
drm_info(&i915->drm, "Can't resize LMEM BAR - platform support is missing\n");
return;
}
/* First disable PCI memory decoding references */
pci_read_config_dword(pdev, PCI_COMMAND, &pci_cmd);
pci_write_config_dword(pdev, PCI_COMMAND,
pci_cmd & ~PCI_COMMAND_MEMORY);
_resize_bar(i915, LMEM_BAR_NUM, rebar_size);
pci_assign_unassigned_bus_resources(pdev->bus);
pci_write_config_dword(pdev, PCI_COMMAND, pci_cmd);
}
static int
region_lmem_release(struct intel_memory_region *mem)
{
@ -112,12 +209,6 @@ static struct intel_memory_region *setup_lmem(struct intel_gt *gt)
flat_ccs_base = intel_gt_mcr_read_any(gt, XEHP_FLAT_CCS_BASE_ADDR);
flat_ccs_base = (flat_ccs_base >> XEHP_CCS_BASE_SHIFT) * SZ_64K;
/* FIXME: Remove this when we have small-bar enabled */
if (pci_resource_len(pdev, 2) < lmem_size) {
drm_err(&i915->drm, "System requires small-BAR support, which is currently unsupported on this kernel\n");
return ERR_PTR(-EINVAL);
}
if (GEM_WARN_ON(lmem_size < flat_ccs_base))
return ERR_PTR(-EIO);
@ -134,6 +225,8 @@ static struct intel_memory_region *setup_lmem(struct intel_gt *gt)
lmem_size = intel_uncore_read64(&i915->uncore, GEN12_GSMBASE);
}
i915_resize_lmem_bar(i915, lmem_size);
if (i915->params.lmem_size > 0) {
lmem_size = min_t(resource_size_t, lmem_size,
mul_u32_u32(i915->params.lmem_size, SZ_1M));
@ -170,6 +263,10 @@ static struct intel_memory_region *setup_lmem(struct intel_gt *gt)
drm_info(&i915->drm, "Local memory available: %pa\n",
&lmem_size);
if (io_size < lmem_size)
drm_info(&i915->drm, "Using a reduced BAR size of %lluMiB. Consider enabling 'Resizable BAR' or similar, if available in the BIOS.\n",
(u64)io_size >> 20);
return mem;
err_region_put:

View file

@ -300,9 +300,9 @@ static int gen6_hw_domain_reset(struct intel_gt *gt, u32 hw_domain_mask)
return err;
}
static int gen6_reset_engines(struct intel_gt *gt,
intel_engine_mask_t engine_mask,
unsigned int retry)
static int __gen6_reset_engines(struct intel_gt *gt,
intel_engine_mask_t engine_mask,
unsigned int retry)
{
struct intel_engine_cs *engine;
u32 hw_mask;
@ -321,6 +321,20 @@ static int gen6_reset_engines(struct intel_gt *gt,
return gen6_hw_domain_reset(gt, hw_mask);
}
static int gen6_reset_engines(struct intel_gt *gt,
intel_engine_mask_t engine_mask,
unsigned int retry)
{
unsigned long flags;
int ret;
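/*
 * Taking uncore->lock here pairs with intel_gt_invalidate_tlbs(),
 * which holds the same lock around its invalidation writes to
 * serialise them with GT reset.
 */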
spin_lock_irqsave(&gt->uncore->lock, flags);
ret = __gen6_reset_engines(gt, engine_mask, retry);
spin_unlock_irqrestore(&gt->uncore->lock, flags);
return ret;
}
static struct intel_engine_cs *find_sfc_paired_vecs_engine(struct intel_engine_cs *engine)
{
int vecs_id;
@ -487,9 +501,9 @@ static void gen11_unlock_sfc(struct intel_engine_cs *engine)
rmw_clear_fw(uncore, sfc_lock.lock_reg, sfc_lock.lock_bit);
}
static int gen11_reset_engines(struct intel_gt *gt,
intel_engine_mask_t engine_mask,
unsigned int retry)
static int __gen11_reset_engines(struct intel_gt *gt,
intel_engine_mask_t engine_mask,
unsigned int retry)
{
struct intel_engine_cs *engine;
intel_engine_mask_t tmp;
@ -583,8 +597,11 @@ static int gen8_reset_engines(struct intel_gt *gt,
struct intel_engine_cs *engine;
const bool reset_non_ready = retry >= 1;
intel_engine_mask_t tmp;
unsigned long flags;
int ret;
spin_lock_irqsave(&gt->uncore->lock, flags);
for_each_engine_masked(engine, gt, engine_mask, tmp) {
ret = gen8_engine_reset_prepare(engine);
if (ret && !reset_non_ready)
@ -612,17 +629,19 @@ static int gen8_reset_engines(struct intel_gt *gt,
* This is best effort, so ignore any error from the initial reset.
*/
if (IS_DG2(gt->i915) && engine_mask == ALL_ENGINES)
gen11_reset_engines(gt, gt->info.engine_mask, 0);
__gen11_reset_engines(gt, gt->info.engine_mask, 0);
if (GRAPHICS_VER(gt->i915) >= 11)
ret = gen11_reset_engines(gt, engine_mask, retry);
ret = __gen11_reset_engines(gt, engine_mask, retry);
else
ret = gen6_reset_engines(gt, engine_mask, retry);
ret = __gen6_reset_engines(gt, engine_mask, retry);
skip_reset:
for_each_engine_masked(engine, gt, engine_mask, tmp)
gen8_engine_reset_cancel(engine);
spin_unlock_irqrestore(&gt->uncore->lock, flags);
return ret;
}

View file

@ -689,6 +689,9 @@ static void dg2_ctx_workarounds_init(struct intel_engine_cs *engine,
if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_B0, STEP_FOREVER) ||
IS_DG2_G11(engine->i915) || IS_DG2_G12(engine->i915))
wa_masked_field_set(wal, VF_PREEMPTION, PREEMPTION_VERTEX_COUNT, 0x4000);
/* Wa_15010599737:dg2 */
wa_masked_en(wal, CHICKEN_RASTER_1, DIS_SF_ROUND_NEAREST_EVEN);
}
static void fakewa_disable_nestedbb_mode(struct intel_engine_cs *engine,
@ -2687,6 +2690,9 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li
* performance guide section.
*/
wa_write(wal, XEHPC_L3SCRUB, SCRUB_CL_DWNGRADE_SHARED | SCRUB_RATE_4B_PER_CLK);
/* Wa_16016694945 */
wa_masked_en(wal, XEHPC_LNCFMISCCFGREG0, XEHPC_OVRLSCCC);
}
if (IS_XEHPSDV(i915)) {

View file

@ -176,8 +176,8 @@ static int live_lrc_layout(void *arg)
continue;
hw = shmem_pin_map(engine->default_state);
if (IS_ERR(hw)) {
err = PTR_ERR(hw);
if (!hw) {
err = -ENOMEM;
break;
}
hw += LRC_STATE_OFFSET / sizeof(*hw);
@ -365,8 +365,8 @@ static int live_lrc_fixed(void *arg)
continue;
hw = shmem_pin_map(engine->default_state);
if (IS_ERR(hw)) {
err = PTR_ERR(hw);
if (!hw) {
err = -ENOMEM;
break;
}
hw += LRC_STATE_OFFSET / sizeof(*hw);

View file

@ -8,6 +8,11 @@
#define delay_for_h2g() usleep_range(H2G_DELAY, H2G_DELAY + 10000)
#define FREQUENCY_REQ_UNIT DIV_ROUND_CLOSEST(GT_FREQUENCY_MULTIPLIER, \
GEN9_FREQ_SCALER)
enum test_type {
VARY_MIN,
VARY_MAX,
MAX_GRANTED
};
static int slpc_set_min_freq(struct intel_guc_slpc *slpc, u32 freq)
{
@ -36,10 +41,107 @@ static int slpc_set_max_freq(struct intel_guc_slpc *slpc, u32 freq)
return ret;
}
static int live_slpc_clamp_min(void *arg)
static int vary_max_freq(struct intel_guc_slpc *slpc, struct intel_rps *rps,
u32 *max_act_freq)
{
u32 step, max_freq, req_freq;
u32 act_freq;
int err = 0;
/* Go from max to min in 5 steps */
step = (slpc->rp0_freq - slpc->min_freq) / NUM_STEPS;
*max_act_freq = slpc->min_freq;
for (max_freq = slpc->rp0_freq; max_freq > slpc->min_freq;
max_freq -= step) {
err = slpc_set_max_freq(slpc, max_freq);
if (err)
break;
req_freq = intel_rps_read_punit_req_frequency(rps);
/* GuC requests freq in multiples of 50/3 MHz */
if (req_freq > (max_freq + FREQUENCY_REQ_UNIT)) {
pr_err("SWReq is %d, should be at most %d\n", req_freq,
max_freq + FREQUENCY_REQ_UNIT);
err = -EINVAL;
}
act_freq = intel_rps_read_actual_frequency(rps);
if (act_freq > *max_act_freq)
*max_act_freq = act_freq;
if (err)
break;
}
return err;
}
static int vary_min_freq(struct intel_guc_slpc *slpc, struct intel_rps *rps,
u32 *max_act_freq)
{
u32 step, min_freq, req_freq;
u32 act_freq;
int err = 0;
/* Go from min to max in 5 steps */
step = (slpc->rp0_freq - slpc->min_freq) / NUM_STEPS;
*max_act_freq = slpc->min_freq;
for (min_freq = slpc->min_freq; min_freq < slpc->rp0_freq;
min_freq += step) {
err = slpc_set_min_freq(slpc, min_freq);
if (err)
break;
req_freq = intel_rps_read_punit_req_frequency(rps);
/* GuC requests freq in multiples of 50/3 MHz */
if (req_freq < (min_freq - FREQUENCY_REQ_UNIT)) {
pr_err("SWReq is %d, should be at least %d\n", req_freq,
min_freq - FREQUENCY_REQ_UNIT);
err = -EINVAL;
}
act_freq = intel_rps_read_actual_frequency(rps);
if (act_freq > *max_act_freq)
*max_act_freq = act_freq;
if (err)
break;
}
return err;
}
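Both walkers allow the request to differ from the clamp by one FREQUENCY_REQ_UNIT because GuC grants frequencies in multiples of 50/3 MHz (~16.7 MHz). A standalone sketch of the stepping and tolerance, with the fused min/RP0 values assumed purely for illustration:

#include <stdio.h>

#define NUM_STEPS 5
#define GT_FREQUENCY_MULTIPLIER 50
#define GEN9_FREQ_SCALER 3
/* DIV_ROUND_CLOSEST(50, 3) == 17: GuC request granularity in MHz */
#define FREQUENCY_REQ_UNIT \
	((GT_FREQUENCY_MULTIPLIER + GEN9_FREQ_SCALER / 2) / GEN9_FREQ_SCALER)

int main(void)
{
	unsigned int min_freq = 300, rp0_freq = 1300; /* assumed fused values */
	unsigned int step = (rp0_freq - min_freq) / NUM_STEPS;
	unsigned int f;

	for (f = min_freq; f < rp0_freq; f += step)
		printf("clamp min=%u MHz, accept SWReq >= %u MHz\n",
		       f, f - FREQUENCY_REQ_UNIT);
	return 0;
}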
static int max_granted_freq(struct intel_guc_slpc *slpc, struct intel_rps *rps, u32 *max_act_freq)
{
struct intel_gt *gt = rps_to_gt(rps);
u32 perf_limit_reasons;
int err = 0;
err = slpc_set_min_freq(slpc, slpc->rp0_freq);
if (err)
return err;
*max_act_freq = intel_rps_read_actual_frequency(rps);
if (*max_act_freq != slpc->rp0_freq) {
/* Check if there was some throttling by pcode */
perf_limit_reasons = intel_uncore_read(gt->uncore, GT0_PERF_LIMIT_REASONS);
/* If not, this is an error */
if (!(perf_limit_reasons & GT0_PERF_LIMIT_REASONS_MASK)) {
pr_err("Pcode did not grant max freq\n");
err = -EINVAL;
} else {
pr_info("Pcode throttled frequency 0x%x\n", perf_limit_reasons);
}
}
return err;
}
static int run_test(struct intel_gt *gt, int test_type)
{
struct drm_i915_private *i915 = arg;
struct intel_gt *gt = to_gt(i915);
struct intel_guc_slpc *slpc = &gt->uc.guc.slpc;
struct intel_rps *rps = &gt->rps;
struct intel_engine_cs *engine;
@ -64,7 +166,7 @@ static int live_slpc_clamp_min(void *arg)
return -EIO;
}
if (slpc_min_freq == slpc_max_freq) {
if (slpc->min_freq == slpc->rp0_freq) {
pr_err("Min/Max are fused to the same value\n");
return -EINVAL;
}
@ -73,78 +175,71 @@ static int live_slpc_clamp_min(void *arg)
intel_gt_pm_get(gt);
for_each_engine(engine, gt, id) {
struct i915_request *rq;
u32 step, min_freq, req_freq;
u32 act_freq, max_act_freq;
u32 max_act_freq;
if (!intel_engine_can_store_dword(engine))
continue;
/* Go from min to max in 5 steps */
step = (slpc_max_freq - slpc_min_freq) / NUM_STEPS;
max_act_freq = slpc_min_freq;
for (min_freq = slpc_min_freq; min_freq < slpc_max_freq;
min_freq += step) {
err = slpc_set_min_freq(slpc, min_freq);
if (err)
break;
st_engine_heartbeat_disable(engine);
st_engine_heartbeat_disable(engine);
rq = igt_spinner_create_request(&spin,
engine->kernel_context,
MI_NOOP);
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
st_engine_heartbeat_enable(engine);
break;
}
rq = igt_spinner_create_request(&spin,
engine->kernel_context,
MI_NOOP);
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
st_engine_heartbeat_enable(engine);
break;
}
i915_request_add(rq);
if (!igt_wait_for_spinner(&spin, rq)) {
pr_err("%s: Spinner did not start\n",
engine->name);
igt_spinner_end(&spin);
st_engine_heartbeat_enable(engine);
intel_gt_set_wedged(engine->gt);
err = -EIO;
break;
}
/* Wait for GuC to detect busyness and raise
* requested frequency if necessary.
*/
delay_for_h2g();
req_freq = intel_rps_read_punit_req_frequency(rps);
/* GuC requests freq in multiples of 50/3 MHz */
if (req_freq < (min_freq - FREQUENCY_REQ_UNIT)) {
pr_err("SWReq is %d, should be at least %d\n", req_freq,
min_freq - FREQUENCY_REQ_UNIT);
igt_spinner_end(&spin);
st_engine_heartbeat_enable(engine);
err = -EINVAL;
break;
}
act_freq = intel_rps_read_actual_frequency(rps);
if (act_freq > max_act_freq)
max_act_freq = act_freq;
i915_request_add(rq);
if (!igt_wait_for_spinner(&spin, rq)) {
pr_err("%s: Spinner did not start\n",
engine->name);
igt_spinner_end(&spin);
st_engine_heartbeat_enable(engine);
intel_gt_set_wedged(engine->gt);
err = -EIO;
break;
}
switch (test_type) {
case VARY_MIN:
err = vary_min_freq(slpc, rps, &max_act_freq);
break;
case VARY_MAX:
err = vary_max_freq(slpc, rps, &max_act_freq);
break;
case MAX_GRANTED:
/* Media engines have a different RP0 */
if (engine->class == VIDEO_DECODE_CLASS ||
engine->class == VIDEO_ENHANCEMENT_CLASS) {
igt_spinner_end(&spin);
st_engine_heartbeat_enable(engine);
err = 0;
continue;
}
err = max_granted_freq(slpc, rps, &max_act_freq);
break;
}
pr_info("Max actual frequency for %s was %d\n",
engine->name, max_act_freq);
/* Actual frequency should rise above min */
if (max_act_freq == slpc_min_freq) {
if (max_act_freq <= slpc_min_freq) {
pr_err("Actual freq did not rise above min\n");
pr_err("Perf Limit Reasons: 0x%x\n",
intel_uncore_read(gt->uncore, GT0_PERF_LIMIT_REASONS));
err = -EINVAL;
}
igt_spinner_end(&spin);
st_engine_heartbeat_enable(engine);
if (err)
break;
}
@ -163,145 +258,37 @@ static int live_slpc_clamp_min(void *arg)
return err;
}
static int live_slpc_clamp_max(void *arg)
static int live_slpc_vary_min(void *arg)
{
struct drm_i915_private *i915 = arg;
struct intel_gt *gt = to_gt(i915);
struct intel_guc_slpc *slpc;
struct intel_rps *rps;
struct intel_engine_cs *engine;
enum intel_engine_id id;
struct igt_spinner spin;
int err = 0;
u32 slpc_min_freq, slpc_max_freq;
slpc = &gt->uc.guc.slpc;
rps = &gt->rps;
return run_test(gt, VARY_MIN);
}
if (!intel_uc_uses_guc_slpc(&gt->uc))
return 0;
static int live_slpc_vary_max(void *arg)
{
struct drm_i915_private *i915 = arg;
struct intel_gt *gt = to_gt(i915);
if (igt_spinner_init(&spin, gt))
return -ENOMEM;
return run_test(gt, VARY_MAX);
}
if (intel_guc_slpc_get_max_freq(slpc, &slpc_max_freq)) {
pr_err("Could not get SLPC max freq\n");
return -EIO;
}
/* check if pcode can grant RP0 */
static int live_slpc_max_granted(void *arg)
{
struct drm_i915_private *i915 = arg;
struct intel_gt *gt = to_gt(i915);
if (intel_guc_slpc_get_min_freq(slpc, &slpc_min_freq)) {
pr_err("Could not get SLPC min freq\n");
return -EIO;
}
if (slpc_min_freq == slpc_max_freq) {
pr_err("Min/Max are fused to the same value\n");
return -EINVAL;
}
intel_gt_pm_wait_for_idle(gt);
intel_gt_pm_get(gt);
for_each_engine(engine, gt, id) {
struct i915_request *rq;
u32 max_freq, req_freq;
u32 act_freq, max_act_freq;
u32 step;
if (!intel_engine_can_store_dword(engine))
continue;
/* Go from max to min in 5 steps */
step = (slpc_max_freq - slpc_min_freq) / NUM_STEPS;
max_act_freq = slpc_min_freq;
for (max_freq = slpc_max_freq; max_freq > slpc_min_freq;
max_freq -= step) {
err = slpc_set_max_freq(slpc, max_freq);
if (err)
break;
st_engine_heartbeat_disable(engine);
rq = igt_spinner_create_request(&spin,
engine->kernel_context,
MI_NOOP);
if (IS_ERR(rq)) {
st_engine_heartbeat_enable(engine);
err = PTR_ERR(rq);
break;
}
i915_request_add(rq);
if (!igt_wait_for_spinner(&spin, rq)) {
pr_err("%s: SLPC spinner did not start\n",
engine->name);
igt_spinner_end(&spin);
st_engine_heartbeat_enable(engine);
intel_gt_set_wedged(engine->gt);
err = -EIO;
break;
}
delay_for_h2g();
/* Verify that SWREQ indeed was set to specific value */
req_freq = intel_rps_read_punit_req_frequency(rps);
/* GuC requests freq in multiples of 50/3 MHz */
if (req_freq > (max_freq + FREQUENCY_REQ_UNIT)) {
pr_err("SWReq is %d, should be at most %d\n", req_freq,
max_freq + FREQUENCY_REQ_UNIT);
igt_spinner_end(&spin);
st_engine_heartbeat_enable(engine);
err = -EINVAL;
break;
}
act_freq = intel_rps_read_actual_frequency(rps);
if (act_freq > max_act_freq)
max_act_freq = act_freq;
st_engine_heartbeat_enable(engine);
igt_spinner_end(&spin);
if (err)
break;
}
pr_info("Max actual frequency for %s was %d\n",
engine->name, max_act_freq);
/* Actual frequency should rise above min */
if (max_act_freq == slpc_min_freq) {
pr_err("Actual freq did not rise above min\n");
err = -EINVAL;
}
if (igt_flush_test(gt->i915)) {
err = -EIO;
break;
}
if (err)
break;
}
/* Restore min/max freq */
slpc_set_max_freq(slpc, slpc_max_freq);
slpc_set_min_freq(slpc, slpc_min_freq);
intel_gt_pm_put(gt);
igt_spinner_fini(&spin);
intel_gt_pm_wait_for_idle(gt);
return err;
return run_test(gt, MAX_GRANTED);
}
int intel_slpc_live_selftests(struct drm_i915_private *i915)
{
static const struct i915_subtest tests[] = {
SUBTEST(live_slpc_clamp_max),
SUBTEST(live_slpc_clamp_min),
SUBTEST(live_slpc_vary_max),
SUBTEST(live_slpc_vary_min),
SUBTEST(live_slpc_max_granted),
};
if (intel_gt_is_wedged(to_gt(i915)))

View file

@ -9,6 +9,7 @@
#include "gt/intel_engine_regs.h"
#include "gt/intel_gt.h"
#include "gt/intel_gt_mcr.h"
#include "gt/intel_gt_regs.h"
#include "gt/intel_lrc.h"
#include "guc_capture_fwif.h"
@ -281,8 +282,7 @@ guc_capture_alloc_steered_lists_xe_lpd(struct intel_guc *guc,
const struct __guc_mmio_reg_descr_group *lists)
{
struct intel_gt *gt = guc_to_gt(guc);
struct drm_i915_private *i915 = guc_to_gt(guc)->i915;
int slice, subslice, i, num_steer_regs, num_tot_regs = 0;
int slice, subslice, iter, i, num_steer_regs, num_tot_regs = 0;
const struct __guc_mmio_reg_descr_group *list;
struct __guc_mmio_reg_descr_group *extlists;
struct __guc_mmio_reg_descr *extarray;
@ -298,7 +298,7 @@ guc_capture_alloc_steered_lists_xe_lpd(struct intel_guc *guc,
num_steer_regs = ARRAY_SIZE(xe_extregs);
sseu = &gt->info.sseu;
for_each_instdone_slice_subslice(i915, sseu, slice, subslice)
for_each_ss_steering(iter, gt, slice, subslice)
num_tot_regs += num_steer_regs;
if (!num_tot_regs)
@ -315,7 +315,7 @@ guc_capture_alloc_steered_lists_xe_lpd(struct intel_guc *guc,
}
extarray = extlists[0].extlist;
for_each_instdone_slice_subslice(i915, sseu, slice, subslice) {
for_each_ss_steering(iter, gt, slice, subslice) {
for (i = 0; i < num_steer_regs; ++i) {
__fill_ext_reg(extarray, &xe_extregs[i], slice, subslice);
++extarray;
@ -359,9 +359,8 @@ guc_capture_alloc_steered_lists_xe_hpg(struct intel_guc *guc,
num_steer_regs += ARRAY_SIZE(xehpg_extregs);
sseu = &gt->info.sseu;
for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice) {
for_each_ss_steering(iter, gt, slice, subslice)
num_tot_regs += num_steer_regs;
}
if (!num_tot_regs)
return;
@ -377,7 +376,7 @@ guc_capture_alloc_steered_lists_xe_hpg(struct intel_guc *guc,
}
extarray = extlists[0].extlist;
for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice) {
for_each_ss_steering(iter, gt, slice, subslice) {
for (i = 0; i < ARRAY_SIZE(xe_extregs); ++i) {
__fill_ext_reg(extarray, &xe_extregs[i], slice, subslice);
++extarray;
@ -1261,7 +1260,8 @@ static int __guc_capture_flushlog_complete(struct intel_guc *guc)
GUC_CAPTURE_LOG_BUFFER
};
return intel_guc_send(guc, action, ARRAY_SIZE(action));
return intel_guc_send_nb(guc, action, ARRAY_SIZE(action), 0);
}
static void __guc_capture_process_output(struct intel_guc *guc)

View file

@ -31,7 +31,7 @@ static int guc_action_flush_log_complete(struct intel_guc *guc)
GUC_DEBUG_LOG_BUFFER
};
return intel_guc_send(guc, action, ARRAY_SIZE(action));
return intel_guc_send_nb(guc, action, ARRAY_SIZE(action), 0);
}
static int guc_action_flush_log(struct intel_guc *guc)

View file

@ -162,6 +162,15 @@ __uc_fw_auto_select(struct drm_i915_private *i915, struct intel_uc_fw *uc_fw)
u8 rev = INTEL_REVID(i915);
int i;
/*
* The only difference between the ADL GuC FWs is the HWConfig support.
* ADL-N does not support HWConfig, so we should use the same binary as
* ADL-S, otherwise the GuC might attempt to fetch a config table that
* does not exist.
*/
if (IS_ADLP_N(i915))
p = INTEL_ALDERLAKE_S;
GEM_BUG_ON(uc_fw->type >= ARRAY_SIZE(blobs_all));
fw_blobs = blobs_all[uc_fw->type].blobs;
fw_count = blobs_all[uc_fw->type].count;

View file

@ -974,7 +974,7 @@ void i915_active_acquire_barrier(struct i915_active *ref)
GEM_BUG_ON(!intel_engine_pm_is_awake(engine));
llist_add(barrier_to_ll(node), &engine->barrier_tasks);
intel_engine_pm_put_delay(engine, 1);
intel_engine_pm_put_delay(engine, 2);
}
}

View file

@ -1005,7 +1005,12 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
#define IS_XEHPSDV(dev_priv) IS_PLATFORM(dev_priv, INTEL_XEHPSDV)
#define IS_DG2(dev_priv) IS_PLATFORM(dev_priv, INTEL_DG2)
#define IS_PONTEVECCHIO(dev_priv) IS_PLATFORM(dev_priv, INTEL_PONTEVECCHIO)
#define IS_METEORLAKE(dev_priv) IS_PLATFORM(dev_priv, INTEL_METEORLAKE)
#define IS_METEORLAKE_M(dev_priv) \
IS_SUBPLATFORM(dev_priv, INTEL_METEORLAKE, INTEL_SUBPLATFORM_M)
#define IS_METEORLAKE_P(dev_priv) \
IS_SUBPLATFORM(dev_priv, INTEL_METEORLAKE, INTEL_SUBPLATFORM_P)
#define IS_DG2_G10(dev_priv) \
IS_SUBPLATFORM(dev_priv, INTEL_DG2, INTEL_SUBPLATFORM_G10)
#define IS_DG2_G11(dev_priv) \

View file

@ -46,6 +46,7 @@
#include "gem/i915_gem_lmem.h"
#include "gt/intel_engine_regs.h"
#include "gt/intel_gt.h"
#include "gt/intel_gt_mcr.h"
#include "gt/intel_gt_pm.h"
#include "gt/intel_gt_regs.h"
#include "gt/uc/intel_guc_capture.h"
@ -436,7 +437,6 @@ static void err_compression_marker(struct drm_i915_error_state_buf *m)
static void error_print_instdone(struct drm_i915_error_state_buf *m,
const struct intel_engine_coredump *ee)
{
const struct sseu_dev_info *sseu = &ee->engine->gt->info.sseu;
int slice;
int subslice;
int iter;
@ -453,33 +453,21 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m,
if (GRAPHICS_VER(m->i915) <= 6)
return;
if (GRAPHICS_VER_FULL(m->i915) >= IP_VER(12, 50)) {
for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, subslice)
err_printf(m, " SAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
slice, subslice,
ee->instdone.sampler[slice][subslice]);
for_each_ss_steering(iter, ee->engine->gt, slice, subslice)
err_printf(m, " SAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
slice, subslice,
ee->instdone.sampler[slice][subslice]);
for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, subslice)
err_printf(m, " ROW_INSTDONE[%d][%d]: 0x%08x\n",
slice, subslice,
ee->instdone.row[slice][subslice]);
} else {
for_each_instdone_slice_subslice(m->i915, sseu, slice, subslice)
err_printf(m, " SAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
slice, subslice,
ee->instdone.sampler[slice][subslice]);
for_each_instdone_slice_subslice(m->i915, sseu, slice, subslice)
err_printf(m, " ROW_INSTDONE[%d][%d]: 0x%08x\n",
slice, subslice,
ee->instdone.row[slice][subslice]);
}
for_each_ss_steering(iter, ee->engine->gt, slice, subslice)
err_printf(m, " ROW_INSTDONE[%d][%d]: 0x%08x\n",
slice, subslice,
ee->instdone.row[slice][subslice]);
if (GRAPHICS_VER(m->i915) < 12)
return;
if (GRAPHICS_VER_FULL(m->i915) >= IP_VER(12, 55)) {
for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, subslice)
for_each_ss_steering(iter, ee->engine->gt, slice, subslice)
err_printf(m, " GEOM_SVGUNIT_INSTDONE[%d][%d]: 0x%08x\n",
slice, subslice,
ee->instdone.geom_svg[slice][subslice]);
@ -1129,11 +1117,15 @@ i915_vma_coredump_create(const struct intel_gt *gt,
dma_addr_t dma;
for_each_sgt_daddr(dma, iter, vma_res->bi.pages) {
dma_addr_t offset = dma - mem->region.start;
void __iomem *s;
s = io_mapping_map_wc(&mem->iomap,
dma - mem->region.start,
PAGE_SIZE);
if (offset + PAGE_SIZE > mem->io_size) {
ret = -EINVAL;
break;
}
s = io_mapping_map_wc(&mem->iomap, offset, PAGE_SIZE);
ret = compress_page(compress,
(void __force *)s, dst,
true);

View file

@ -204,6 +204,8 @@ i915_param_named_unsafe(request_timeout_ms, uint, 0600,
i915_param_named_unsafe(lmem_size, uint, 0400,
"Set the lmem size(in MiB) for each region. (default: 0, all memory)");
i915_param_named_unsafe(lmem_bar_size, uint, 0400,
"Set the lmem bar size(in MiB).");
static __always_inline void _print_param(struct drm_printer *p,
const char *name,

View file

@ -74,6 +74,7 @@ struct drm_printer;
param(char *, force_probe, CONFIG_DRM_I915_FORCE_PROBE, 0400) \
param(unsigned int, request_timeout_ms, CONFIG_DRM_I915_REQUEST_TIMEOUT, CONFIG_DRM_I915_REQUEST_TIMEOUT ? 0600 : 0) \
param(unsigned int, lmem_size, 0, 0400) \
param(unsigned int, lmem_bar_size, 0, 0400) \
/* leave bools at the end to not create holes */ \
param(bool, enable_hangcheck, true, 0600) \
param(bool, load_detect_test, false, 0600) \

View file

@ -1075,7 +1075,6 @@ static const struct intel_device_info dg2_info = {
.require_force_probe = 1,
};
__maybe_unused
static const struct intel_device_info ats_m_info = {
DG2_FEATURES,
.display = { 0 },
@ -1108,6 +1107,31 @@ static const struct intel_device_info pvc_info = {
.require_force_probe = 1,
};
#define XE_LPDP_FEATURES \
XE_LPD_FEATURES, \
.display.ver = 14, \
.display.has_cdclk_crawl = 1
__maybe_unused
static const struct intel_device_info mtl_info = {
XE_HP_FEATURES,
XE_LPDP_FEATURES,
/*
* Real graphics IP version will be obtained from hardware GMD_ID
* register. Value provided here is just for sanity checking.
*/
.graphics.ver = 12,
.graphics.rel = 70,
.media.ver = 13,
PLATFORM(INTEL_METEORLAKE),
.display.has_modular_fia = 1,
.has_flat_ccs = 0,
.has_snoop = 1,
.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,
.platform_engine_mask = BIT(RCS0) | BIT(BCS0) | BIT(CCS0),
.require_force_probe = 1,
};
#undef PLATFORM
/*
@ -1189,6 +1213,8 @@ static const struct pci_device_id pciidlist[] = {
INTEL_RPLS_IDS(&adl_s_info),
INTEL_RPLP_IDS(&adl_p_info),
INTEL_DG2_IDS(&dg2_info),
INTEL_ATS_M_IDS(&ats_m_info),
INTEL_MTL_IDS(&mtl_info),
{0, 0, 0}
};
MODULE_DEVICE_TABLE(pci, pciidlist);

View file

@ -885,8 +885,9 @@ static int gen8_oa_read(struct i915_perf_stream *stream,
if (ret)
return ret;
DRM_DEBUG("OA buffer overflow (exponent = %d): force restart\n",
stream->period_exponent);
drm_dbg(&stream->perf->i915->drm,
"OA buffer overflow (exponent = %d): force restart\n",
stream->period_exponent);
stream->perf->ops.oa_disable(stream);
stream->perf->ops.oa_enable(stream);
@ -1108,8 +1109,9 @@ static int gen7_oa_read(struct i915_perf_stream *stream,
if (ret)
return ret;
DRM_DEBUG("OA buffer overflow (exponent = %d): force restart\n",
stream->period_exponent);
drm_dbg(&stream->perf->i915->drm,
"OA buffer overflow (exponent = %d): force restart\n",
stream->period_exponent);
stream->perf->ops.oa_disable(stream);
stream->perf->ops.oa_enable(stream);
@ -2863,7 +2865,8 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
int ret;
if (!props->engine) {
DRM_DEBUG("OA engine not specified\n");
drm_dbg(&stream->perf->i915->drm,
"OA engine not specified\n");
return -EINVAL;
}
@ -2873,18 +2876,21 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
* IDs
*/
if (!perf->metrics_kobj) {
DRM_DEBUG("OA metrics weren't advertised via sysfs\n");
drm_dbg(&stream->perf->i915->drm,
"OA metrics weren't advertised via sysfs\n");
return -EINVAL;
}
if (!(props->sample_flags & SAMPLE_OA_REPORT) &&
(GRAPHICS_VER(perf->i915) < 12 || !stream->ctx)) {
DRM_DEBUG("Only OA report sampling supported\n");
drm_dbg(&stream->perf->i915->drm,
"Only OA report sampling supported\n");
return -EINVAL;
}
if (!perf->ops.enable_metric_set) {
DRM_DEBUG("OA unit not supported\n");
drm_dbg(&stream->perf->i915->drm,
"OA unit not supported\n");
return -ENODEV;
}
@ -2894,12 +2900,14 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
* we currently only allow exclusive access
*/
if (perf->exclusive_stream) {
DRM_DEBUG("OA unit already in use\n");
drm_dbg(&stream->perf->i915->drm,
"OA unit already in use\n");
return -EBUSY;
}
if (!props->oa_format) {
DRM_DEBUG("OA report format not specified\n");
drm_dbg(&stream->perf->i915->drm,
"OA report format not specified\n");
return -EINVAL;
}
@ -2929,20 +2937,23 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
if (stream->ctx) {
ret = oa_get_render_ctx_id(stream);
if (ret) {
DRM_DEBUG("Invalid context id to filter with\n");
drm_dbg(&stream->perf->i915->drm,
"Invalid context id to filter with\n");
return ret;
}
}
ret = alloc_noa_wait(stream);
if (ret) {
DRM_DEBUG("Unable to allocate NOA wait batch buffer\n");
drm_dbg(&stream->perf->i915->drm,
"Unable to allocate NOA wait batch buffer\n");
goto err_noa_wait_alloc;
}
stream->oa_config = i915_perf_get_oa_config(perf, props->metrics_set);
if (!stream->oa_config) {
DRM_DEBUG("Invalid OA config id=%i\n", props->metrics_set);
drm_dbg(&stream->perf->i915->drm,
"Invalid OA config id=%i\n", props->metrics_set);
ret = -EINVAL;
goto err_config;
}
@ -2973,11 +2984,13 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
ret = i915_perf_stream_enable_sync(stream);
if (ret) {
DRM_DEBUG("Unable to enable metric set\n");
drm_dbg(&stream->perf->i915->drm,
"Unable to enable metric set\n");
goto err_enable;
}
DRM_DEBUG("opening stream oa config uuid=%s\n",
drm_dbg(&stream->perf->i915->drm,
"opening stream oa config uuid=%s\n",
stream->oa_config->uuid);
hrtimer_init(&stream->poll_check_timer,
@ -3429,7 +3442,8 @@ i915_perf_open_ioctl_locked(struct i915_perf *perf,
specific_ctx = i915_gem_context_lookup(file_priv, ctx_handle);
if (IS_ERR(specific_ctx)) {
DRM_DEBUG("Failed to look up context with ID %u for opening perf stream\n",
drm_dbg(&perf->i915->drm,
"Failed to look up context with ID %u for opening perf stream\n",
ctx_handle);
ret = PTR_ERR(specific_ctx);
goto err;
@ -3463,7 +3477,8 @@ i915_perf_open_ioctl_locked(struct i915_perf *perf,
if (props->hold_preemption) {
if (!props->single_context) {
DRM_DEBUG("preemption disable with no context\n");
drm_dbg(&perf->i915->drm,
"preemption disable with no context\n");
ret = -EINVAL;
goto err;
}
@ -3485,7 +3500,8 @@ i915_perf_open_ioctl_locked(struct i915_perf *perf,
*/
if (privileged_op &&
i915_perf_stream_paranoid && !perfmon_capable()) {
DRM_DEBUG("Insufficient privileges to open i915 perf stream\n");
drm_dbg(&perf->i915->drm,
"Insufficient privileges to open i915 perf stream\n");
ret = -EACCES;
goto err_ctx;
}
@ -3592,7 +3608,8 @@ static int read_properties_unlocked(struct i915_perf *perf,
props->poll_oa_period = DEFAULT_POLL_PERIOD_NS;
if (!n_props) {
DRM_DEBUG("No i915 perf properties given\n");
drm_dbg(&perf->i915->drm,
"No i915 perf properties given\n");
return -EINVAL;
}
@ -3601,7 +3618,8 @@ static int read_properties_unlocked(struct i915_perf *perf,
I915_ENGINE_CLASS_RENDER,
0);
if (!props->engine) {
DRM_DEBUG("No RENDER-capable engines\n");
drm_dbg(&perf->i915->drm,
"No RENDER-capable engines\n");
return -EINVAL;
}
@ -3612,7 +3630,8 @@ static int read_properties_unlocked(struct i915_perf *perf,
* from userspace.
*/
if (n_props >= DRM_I915_PERF_PROP_MAX) {
DRM_DEBUG("More i915 perf properties specified than exist\n");
drm_dbg(&perf->i915->drm,
"More i915 perf properties specified than exist\n");
return -EINVAL;
}
@ -3629,7 +3648,8 @@ static int read_properties_unlocked(struct i915_perf *perf,
return ret;
if (id == 0 || id >= DRM_I915_PERF_PROP_MAX) {
DRM_DEBUG("Unknown i915 perf property ID\n");
drm_dbg(&perf->i915->drm,
"Unknown i915 perf property ID\n");
return -EINVAL;
}
@ -3644,19 +3664,22 @@ static int read_properties_unlocked(struct i915_perf *perf,
break;
case DRM_I915_PERF_PROP_OA_METRICS_SET:
if (value == 0) {
DRM_DEBUG("Unknown OA metric set ID\n");
drm_dbg(&perf->i915->drm,
"Unknown OA metric set ID\n");
return -EINVAL;
}
props->metrics_set = value;
break;
case DRM_I915_PERF_PROP_OA_FORMAT:
if (value == 0 || value >= I915_OA_FORMAT_MAX) {
DRM_DEBUG("Out-of-range OA report format %llu\n",
drm_dbg(&perf->i915->drm,
"Out-of-range OA report format %llu\n",
value);
return -EINVAL;
}
if (!oa_format_valid(perf, value)) {
DRM_DEBUG("Unsupported OA report format %llu\n",
drm_dbg(&perf->i915->drm,
"Unsupported OA report format %llu\n",
value);
return -EINVAL;
}
@ -3664,7 +3687,8 @@ static int read_properties_unlocked(struct i915_perf *perf,
break;
case DRM_I915_PERF_PROP_OA_EXPONENT:
if (value > OA_EXPONENT_MAX) {
DRM_DEBUG("OA timer exponent too high (> %u)\n",
drm_dbg(&perf->i915->drm,
"OA timer exponent too high (> %u)\n",
OA_EXPONENT_MAX);
return -EINVAL;
}
@ -3692,7 +3716,8 @@ static int read_properties_unlocked(struct i915_perf *perf,
oa_freq_hz = 0;
if (oa_freq_hz > i915_oa_max_sample_rate && !perfmon_capable()) {
DRM_DEBUG("OA exponent would exceed the max sampling frequency (sysctl dev.i915.oa_max_sample_rate) %uHz without CAP_PERFMON or CAP_SYS_ADMIN privileges\n",
drm_dbg(&perf->i915->drm,
"OA exponent would exceed the max sampling frequency (sysctl dev.i915.oa_max_sample_rate) %uHz without CAP_PERFMON or CAP_SYS_ADMIN privileges\n",
i915_oa_max_sample_rate);
return -EACCES;
}
@ -3706,16 +3731,25 @@ static int read_properties_unlocked(struct i915_perf *perf,
case DRM_I915_PERF_PROP_GLOBAL_SSEU: {
struct drm_i915_gem_context_param_sseu user_sseu;
if (GRAPHICS_VER_FULL(perf->i915) >= IP_VER(12, 50)) {
drm_dbg(&perf->i915->drm,
"SSEU config not supported on gfx %x\n",
GRAPHICS_VER_FULL(perf->i915));
return -ENODEV;
}
if (copy_from_user(&user_sseu,
u64_to_user_ptr(value),
sizeof(user_sseu))) {
DRM_DEBUG("Unable to copy global sseu parameter\n");
drm_dbg(&perf->i915->drm,
"Unable to copy global sseu parameter\n");
return -EFAULT;
}
ret = get_sseu_config(&props->sseu, props->engine, &user_sseu);
if (ret) {
DRM_DEBUG("Invalid SSEU configuration\n");
drm_dbg(&perf->i915->drm,
"Invalid SSEU configuration\n");
return ret;
}
props->has_sseu = true;
@ -3723,7 +3757,8 @@ static int read_properties_unlocked(struct i915_perf *perf,
}
case DRM_I915_PERF_PROP_POLL_OA_PERIOD:
if (value < 100000 /* 100us */) {
DRM_DEBUG("OA availability timer too small (%lluns < 100us)\n",
drm_dbg(&perf->i915->drm,
"OA availability timer too small (%lluns < 100us)\n",
value);
return -EINVAL;
}
@ -3774,7 +3809,8 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data,
int ret;
if (!perf->i915) {
DRM_DEBUG("i915 perf interface not available for this system\n");
drm_dbg(&perf->i915->drm,
"i915 perf interface not available for this system\n");
return -ENOTSUPP;
}
@ -3782,7 +3818,8 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data,
I915_PERF_FLAG_FD_NONBLOCK |
I915_PERF_FLAG_DISABLED;
if (param->flags & ~known_open_flags) {
DRM_DEBUG("Unknown drm_i915_perf_open_param flag\n");
drm_dbg(&perf->i915->drm,
"Unknown drm_i915_perf_open_param flag\n");
return -EINVAL;
}
@ -4028,7 +4065,8 @@ static struct i915_oa_reg *alloc_oa_regs(struct i915_perf *perf,
goto addr_err;
if (!is_valid(perf, addr)) {
DRM_DEBUG("Invalid oa_reg address: %X\n", addr);
drm_dbg(&perf->i915->drm,
"Invalid oa_reg address: %X\n", addr);
err = -EINVAL;
goto addr_err;
}
@ -4102,30 +4140,35 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
int err, id;
if (!perf->i915) {
DRM_DEBUG("i915 perf interface not available for this system\n");
drm_dbg(&perf->i915->drm,
"i915 perf interface not available for this system\n");
return -ENOTSUPP;
}
if (!perf->metrics_kobj) {
DRM_DEBUG("OA metrics weren't advertised via sysfs\n");
drm_dbg(&perf->i915->drm,
"OA metrics weren't advertised via sysfs\n");
return -EINVAL;
}
if (i915_perf_stream_paranoid && !perfmon_capable()) {
DRM_DEBUG("Insufficient privileges to add i915 OA config\n");
drm_dbg(&perf->i915->drm,
"Insufficient privileges to add i915 OA config\n");
return -EACCES;
}
if ((!args->mux_regs_ptr || !args->n_mux_regs) &&
(!args->boolean_regs_ptr || !args->n_boolean_regs) &&
(!args->flex_regs_ptr || !args->n_flex_regs)) {
DRM_DEBUG("No OA registers given\n");
drm_dbg(&perf->i915->drm,
"No OA registers given\n");
return -EINVAL;
}
oa_config = kzalloc(sizeof(*oa_config), GFP_KERNEL);
if (!oa_config) {
DRM_DEBUG("Failed to allocate memory for the OA config\n");
drm_dbg(&perf->i915->drm,
"Failed to allocate memory for the OA config\n");
return -ENOMEM;
}
@ -4133,7 +4176,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
kref_init(&oa_config->ref);
if (!uuid_is_valid(args->uuid)) {
DRM_DEBUG("Invalid uuid format for OA config\n");
drm_dbg(&perf->i915->drm,
"Invalid uuid format for OA config\n");
err = -EINVAL;
goto reg_err;
}
@ -4150,7 +4194,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
args->n_mux_regs);
if (IS_ERR(regs)) {
DRM_DEBUG("Failed to create OA config for mux_regs\n");
drm_dbg(&perf->i915->drm,
"Failed to create OA config for mux_regs\n");
err = PTR_ERR(regs);
goto reg_err;
}
@ -4163,7 +4208,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
args->n_boolean_regs);
if (IS_ERR(regs)) {
DRM_DEBUG("Failed to create OA config for b_counter_regs\n");
drm_dbg(&perf->i915->drm,
"Failed to create OA config for b_counter_regs\n");
err = PTR_ERR(regs);
goto reg_err;
}
@ -4182,7 +4228,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
args->n_flex_regs);
if (IS_ERR(regs)) {
DRM_DEBUG("Failed to create OA config for flex_regs\n");
drm_dbg(&perf->i915->drm,
"Failed to create OA config for flex_regs\n");
err = PTR_ERR(regs);
goto reg_err;
}
@ -4198,7 +4245,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
*/
idr_for_each_entry(&perf->metrics_idr, tmp, id) {
if (!strcmp(tmp->uuid, oa_config->uuid)) {
DRM_DEBUG("OA config already exists with this uuid\n");
drm_dbg(&perf->i915->drm,
"OA config already exists with this uuid\n");
err = -EADDRINUSE;
goto sysfs_err;
}
@ -4206,7 +4254,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
err = create_dynamic_oa_sysfs_entry(perf, oa_config);
if (err) {
DRM_DEBUG("Failed to create sysfs entry for OA config\n");
drm_dbg(&perf->i915->drm,
"Failed to create sysfs entry for OA config\n");
goto sysfs_err;
}
@ -4215,14 +4264,16 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
oa_config, 2,
0, GFP_KERNEL);
if (oa_config->id < 0) {
DRM_DEBUG("Failed to create sysfs entry for OA config\n");
drm_dbg(&perf->i915->drm,
"Failed to create sysfs entry for OA config\n");
err = oa_config->id;
goto sysfs_err;
}
mutex_unlock(&perf->metrics_lock);
DRM_DEBUG("Added config %s id=%i\n", oa_config->uuid, oa_config->id);
drm_dbg(&perf->i915->drm,
"Added config %s id=%i\n", oa_config->uuid, oa_config->id);
return oa_config->id;
@ -4230,7 +4281,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
mutex_unlock(&perf->metrics_lock);
reg_err:
i915_oa_config_put(oa_config);
DRM_DEBUG("Failed to add new OA config\n");
drm_dbg(&perf->i915->drm,
"Failed to add new OA config\n");
return err;
}
@ -4254,12 +4306,14 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
int ret;
if (!perf->i915) {
DRM_DEBUG("i915 perf interface not available for this system\n");
drm_dbg(&perf->i915->drm,
"i915 perf interface not available for this system\n");
return -ENOTSUPP;
}
if (i915_perf_stream_paranoid && !perfmon_capable()) {
DRM_DEBUG("Insufficient privileges to remove i915 OA config\n");
drm_dbg(&perf->i915->drm,
"Insufficient privileges to remove i915 OA config\n");
return -EACCES;
}
@ -4269,7 +4323,8 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
oa_config = idr_find(&perf->metrics_idr, *arg);
if (!oa_config) {
DRM_DEBUG("Failed to remove unknown OA config\n");
drm_dbg(&perf->i915->drm,
"Failed to remove unknown OA config\n");
ret = -ENOENT;
goto err_unlock;
}
@ -4282,7 +4337,8 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
mutex_unlock(&perf->metrics_lock);
DRM_DEBUG("Removed config %s id=%i\n", oa_config->uuid, oa_config->id);
drm_dbg(&perf->i915->drm,
"Removed config %s id=%i\n", oa_config->uuid, oa_config->id);
i915_oa_config_put(oa_config);

View file

@ -498,7 +498,21 @@ static int query_memregion_info(struct drm_i915_private *i915,
info.region.memory_class = mr->type;
info.region.memory_instance = mr->instance;
info.probed_size = mr->total;
info.unallocated_size = mr->avail;
if (mr->type == INTEL_MEMORY_LOCAL)
info.probed_cpu_visible_size = mr->io_size;
else
info.probed_cpu_visible_size = mr->total;
if (perfmon_capable()) {
intel_memory_region_avail(mr,
&info.unallocated_size,
&info.unallocated_cpu_visible_size);
} else {
info.unallocated_size = info.probed_size;
info.unallocated_cpu_visible_size =
info.probed_cpu_visible_size;
}
if (__copy_to_user(info_ptr, &info, sizeof(info)))
return -EFAULT;
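For reference, a minimal userspace sketch of consuming the new fields via DRM_I915_QUERY_MEMORY_REGIONS; it assumes a kernel and uapi header with this series applied, and an already-opened DRM fd (error details elided):

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

static int dump_regions(int fd)
{
	struct drm_i915_query_item item = {
		.query_id = DRM_I915_QUERY_MEMORY_REGIONS,
	};
	struct drm_i915_query query = {
		.num_items = 1,
		.items_ptr = (uintptr_t)&item,
	};
	struct drm_i915_query_memory_regions *info;
	unsigned int i;

	/* First pass fills item.length with the required buffer size. */
	if (ioctl(fd, DRM_IOCTL_I915_QUERY, &query) || item.length <= 0)
		return -1;

	info = calloc(1, item.length);
	item.data_ptr = (uintptr_t)info;
	if (ioctl(fd, DRM_IOCTL_I915_QUERY, &query)) {
		free(info);
		return -1;
	}

	for (i = 0; i < info->num_regions; i++) {
		struct drm_i915_memory_region_info *r = &info->regions[i];

		/* Without CAP_PERFMON the unallocated sizes mirror the probed ones. */
		printf("class %u, instance %u: probed %llu, cpu-visible %llu\n",
		       r->region.memory_class, r->region.memory_instance,
		       (unsigned long long)r->probed_size,
		       (unsigned long long)r->probed_cpu_visible_size);
	}
	free(info);
	return 0;
}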

View file

@ -68,6 +68,7 @@ void i915_refct_sgt_init(struct i915_refct_sgt *rsgt, size_t size)
* drm_mm_node
* @node: The drm_mm_node.
* @region_start: An offset to add to the dma addresses of the sg list.
* @page_alignment: Required page alignment for each sg entry. Power of two.
*
* Create a struct sg_table, initializing it from a struct drm_mm_node,
* taking a maximum segment length into account, splitting into segments
@ -77,22 +78,25 @@ void i915_refct_sgt_init(struct i915_refct_sgt *rsgt, size_t size)
* error code cast to an error pointer on failure.
*/
struct i915_refct_sgt *i915_rsgt_from_mm_node(const struct drm_mm_node *node,
u64 region_start)
u64 region_start,
u32 page_alignment)
{
const u64 max_segment = SZ_1G; /* Do we have a limit on this? */
u64 segment_pages = max_segment >> PAGE_SHIFT;
const u32 max_segment = round_down(UINT_MAX, page_alignment);
const u32 segment_pages = max_segment >> PAGE_SHIFT;
u64 block_size, offset, prev_end;
struct i915_refct_sgt *rsgt;
struct sg_table *st;
struct scatterlist *sg;
GEM_BUG_ON(!max_segment);
rsgt = kmalloc(sizeof(*rsgt), GFP_KERNEL);
if (!rsgt)
return ERR_PTR(-ENOMEM);
i915_refct_sgt_init(rsgt, node->size << PAGE_SHIFT);
st = &rsgt->table;
if (sg_alloc_table(st, DIV_ROUND_UP(node->size, segment_pages),
if (sg_alloc_table(st, DIV_ROUND_UP_ULL(node->size, segment_pages),
GFP_KERNEL)) {
i915_refct_sgt_put(rsgt);
return ERR_PTR(-ENOMEM);
@ -112,12 +116,14 @@ struct i915_refct_sgt *i915_rsgt_from_mm_node(const struct drm_mm_node *node,
sg = __sg_next(sg);
sg_dma_address(sg) = region_start + offset;
GEM_BUG_ON(!IS_ALIGNED(sg_dma_address(sg),
page_alignment));
sg_dma_len(sg) = 0;
sg->length = 0;
st->nents++;
}
len = min(block_size, max_segment - sg->length);
len = min_t(u64, block_size, max_segment - sg->length);
sg->length += len;
sg_dma_len(sg) += len;
@ -138,6 +144,7 @@ struct i915_refct_sgt *i915_rsgt_from_mm_node(const struct drm_mm_node *node,
* i915_buddy_block list
* @res: The struct i915_ttm_buddy_resource.
* @region_start: An offset to add to the dma addresses of the sg list.
* @page_alignment: Required page alignment for each sg entry. Power of two.
*
* Create a struct sg_table, initializing it from struct i915_buddy_block list,
* taking a maximum segment length into account, splitting into segments
@ -147,11 +154,12 @@ struct i915_refct_sgt *i915_rsgt_from_mm_node(const struct drm_mm_node *node,
* error code cast to an error pointer on failure.
*/
struct i915_refct_sgt *i915_rsgt_from_buddy_resource(struct ttm_resource *res,
u64 region_start)
u64 region_start,
u32 page_alignment)
{
struct i915_ttm_buddy_resource *bman_res = to_ttm_buddy_resource(res);
const u64 size = res->num_pages << PAGE_SHIFT;
const u64 max_segment = rounddown(UINT_MAX, PAGE_SIZE);
const u32 max_segment = round_down(UINT_MAX, page_alignment);
struct drm_buddy *mm = bman_res->mm;
struct list_head *blocks = &bman_res->blocks;
struct drm_buddy_block *block;
@ -161,6 +169,7 @@ struct i915_refct_sgt *i915_rsgt_from_buddy_resource(struct ttm_resource *res,
resource_size_t prev_end;
GEM_BUG_ON(list_empty(blocks));
GEM_BUG_ON(!max_segment);
rsgt = kmalloc(sizeof(*rsgt), GFP_KERNEL);
if (!rsgt)
@ -191,12 +200,14 @@ struct i915_refct_sgt *i915_rsgt_from_buddy_resource(struct ttm_resource *res,
sg = __sg_next(sg);
sg_dma_address(sg) = region_start + offset;
GEM_BUG_ON(!IS_ALIGNED(sg_dma_address(sg),
page_alignment));
sg_dma_len(sg) = 0;
sg->length = 0;
st->nents++;
}
len = min(block_size, max_segment - sg->length);
len = min_t(u64, block_size, max_segment - sg->length);
sg->length += len;
sg_dma_len(sg) += len;

View file

@ -213,9 +213,11 @@ static inline void __i915_refct_sgt_init(struct i915_refct_sgt *rsgt,
void i915_refct_sgt_init(struct i915_refct_sgt *rsgt, size_t size);
struct i915_refct_sgt *i915_rsgt_from_mm_node(const struct drm_mm_node *node,
u64 region_start);
u64 region_start,
u32 page_alignment);
struct i915_refct_sgt *i915_rsgt_from_buddy_resource(struct ttm_resource *res,
u64 region_start);
u64 region_start,
u32 page_alignment);
#endif
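The new page_alignment parameter matters because sg entry lengths are 32-bit: both builders cap segments at round_down(UINT_MAX, page_alignment), so every full segment stays below 4 GiB while remaining a multiple of the region's minimum page size. A standalone sketch of that bound, with 64K as an example alignment:

#include <stdio.h>
#include <limits.h>

int main(void)
{
	/* 64K is only an example; the driver passes the region's min page size. */
	unsigned int page_alignment = 64 * 1024;
	unsigned int max_segment = UINT_MAX - (UINT_MAX % page_alignment);

	printf("max_segment = %u bytes (%u aligned chunks)\n",
	       max_segment, max_segment / page_alignment);
	return 0;
}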

View file

@ -104,18 +104,15 @@ static int i915_ttm_buddy_man_alloc(struct ttm_resource_manager *man,
min_page_size,
&bman_res->blocks,
bman_res->flags);
mutex_unlock(&bman->lock);
if (unlikely(err))
goto err_free_blocks;
if (place->flags & TTM_PL_FLAG_CONTIGUOUS) {
u64 original_size = (u64)bman_res->base.num_pages << PAGE_SHIFT;
mutex_lock(&bman->lock);
drm_buddy_block_trim(mm,
original_size,
&bman_res->blocks);
mutex_unlock(&bman->lock);
}
if (lpfn <= bman->visible_size) {
@ -137,11 +134,10 @@ static int i915_ttm_buddy_man_alloc(struct ttm_resource_manager *man,
}
}
if (bman_res->used_visible_size) {
mutex_lock(&bman->lock);
if (bman_res->used_visible_size)
bman->visible_avail -= bman_res->used_visible_size;
mutex_unlock(&bman->lock);
}
mutex_unlock(&bman->lock);
if (place->lpfn - place->fpfn == n_pages)
bman_res->base.start = place->fpfn;
@ -154,7 +150,6 @@ static int i915_ttm_buddy_man_alloc(struct ttm_resource_manager *man,
return 0;
err_free_blocks:
mutex_lock(&bman->lock);
drm_buddy_free_list(mm, &bman_res->blocks);
mutex_unlock(&bman->lock);
err_free_res:
@ -365,6 +360,26 @@ u64 i915_ttm_buddy_man_visible_size(struct ttm_resource_manager *man)
return bman->visible_size;
}
/**
* i915_ttm_buddy_man_avail - Query the avail tracking for the manager.
*
* @man: The buddy allocator ttm manager
* @avail: The total available memory in pages for the entire manager.
* @visible_avail: The total available memory in pages for the CPU visible
* portion. Note that this will always give the same value as @avail on
* configurations that don't have a small BAR.
*/
void i915_ttm_buddy_man_avail(struct ttm_resource_manager *man,
u64 *avail, u64 *visible_avail)
{
struct i915_ttm_buddy_manager *bman = to_buddy_manager(man);
mutex_lock(&bman->lock);
*avail = bman->mm.avail >> PAGE_SHIFT;
*visible_avail = bman->visible_avail;
mutex_unlock(&bman->lock);
}
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
void i915_ttm_buddy_man_force_visible_size(struct ttm_resource_manager *man,
u64 size)

View file

@ -61,6 +61,9 @@ int i915_ttm_buddy_man_reserve(struct ttm_resource_manager *man,
u64 i915_ttm_buddy_man_visible_size(struct ttm_resource_manager *man);
void i915_ttm_buddy_man_avail(struct ttm_resource_manager *man,
u64 *avail, u64 *avail_visible);
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
void i915_ttm_buddy_man_force_visible_size(struct ttm_resource_manager *man,
u64 size);


@ -310,7 +310,7 @@ struct i915_vma_work {
struct i915_address_space *vm;
struct i915_vm_pt_stash stash;
struct i915_vma_resource *vma_res;
struct drm_i915_gem_object *pinned;
struct drm_i915_gem_object *obj;
struct i915_sw_dma_fence_cb cb;
enum i915_cache_level cache_level;
unsigned int flags;
@ -321,17 +321,25 @@ static void __vma_bind(struct dma_fence_work *work)
struct i915_vma_work *vw = container_of(work, typeof(*vw), base);
struct i915_vma_resource *vma_res = vw->vma_res;
/*
* We are about to bind the object, which must mean we have already
* signaled the work to potentially clear/move the pages underneath. If
* something went wrong at that stage then the object should have
* unknown_state set, in which case we need to skip the bind.
*/
if (i915_gem_object_has_unknown_state(vw->obj))
return;
vma_res->ops->bind_vma(vma_res->vm, &vw->stash,
vma_res, vw->cache_level, vw->flags);
}
static void __vma_release(struct dma_fence_work *work)
{
struct i915_vma_work *vw = container_of(work, typeof(*vw), base);
if (vw->pinned)
i915_gem_object_put(vw->pinned);
if (vw->obj)
i915_gem_object_put(vw->obj);
i915_vm_free_pt_stash(vw->vm, &vw->stash);
if (vw->vma_res)
@ -517,14 +525,7 @@ int i915_vma_bind(struct i915_vma *vma,
}
work->base.dma.error = 0; /* enable the queue_work() */
/*
* If we don't have the refcounted pages list, keep a reference
* on the object to avoid waiting for the async bind to
* complete in the object destruction path.
*/
if (!work->vma_res->bi.pages_rsgt)
work->pinned = i915_gem_object_get(vma->obj);
work->obj = i915_gem_object_get(vma->obj);
} else {
ret = i915_gem_object_wait_moving_fence(vma->obj, true);
if (ret) {
@ -1645,10 +1646,10 @@ static void force_unbind(struct i915_vma *vma)
GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
}
static void release_references(struct i915_vma *vma, bool vm_ddestroy)
static void release_references(struct i915_vma *vma, struct intel_gt *gt,
bool vm_ddestroy)
{
struct drm_i915_gem_object *obj = vma->obj;
struct intel_gt *gt = vma->vm->gt;
GEM_BUG_ON(i915_vma_is_active(vma));
@ -1703,11 +1704,12 @@ void i915_vma_destroy_locked(struct i915_vma *vma)
force_unbind(vma);
list_del_init(&vma->vm_link);
release_references(vma, false);
release_references(vma, vma->vm->gt, false);
}
void i915_vma_destroy(struct i915_vma *vma)
{
struct intel_gt *gt;
bool vm_ddestroy;
mutex_lock(&vma->vm->mutex);
@ -1715,8 +1717,11 @@ void i915_vma_destroy(struct i915_vma *vma)
list_del_init(&vma->vm_link);
vm_ddestroy = vma->vm_ddestroy;
vma->vm_ddestroy = false;
/* vma->vm may be freed when releasing vma->vm->mutex. */
gt = vma->vm->gt;
mutex_unlock(&vma->vm->mutex);
release_references(vma, vm_ddestroy);
release_references(vma, gt, vm_ddestroy);
}
void i915_vma_parked(struct intel_gt *gt)


@ -73,6 +73,7 @@ static const char * const platform_names[] = {
PLATFORM_NAME(XEHPSDV),
PLATFORM_NAME(DG2),
PLATFORM_NAME(PONTEVECCHIO),
PLATFORM_NAME(METEORLAKE),
};
#undef PLATFORM_NAME
@ -189,16 +190,26 @@ static const u16 subplatform_rpl_ids[] = {
static const u16 subplatform_g10_ids[] = {
INTEL_DG2_G10_IDS(0),
INTEL_ATS_M150_IDS(0),
};
static const u16 subplatform_g11_ids[] = {
INTEL_DG2_G11_IDS(0),
INTEL_ATS_M75_IDS(0),
};
static const u16 subplatform_g12_ids[] = {
INTEL_DG2_G12_IDS(0),
};
static const u16 subplatform_m_ids[] = {
INTEL_MTL_M_IDS(0),
};
static const u16 subplatform_p_ids[] = {
INTEL_MTL_P_IDS(0),
};
static bool find_devid(u16 id, const u16 *p, unsigned int num)
{
for (; num; num--, p++) {
@ -253,6 +264,12 @@ void intel_device_info_subplatform_init(struct drm_i915_private *i915)
} else if (find_devid(devid, subplatform_g12_ids,
ARRAY_SIZE(subplatform_g12_ids))) {
mask = BIT(INTEL_SUBPLATFORM_G12);
} else if (find_devid(devid, subplatform_m_ids,
ARRAY_SIZE(subplatform_m_ids))) {
mask = BIT(INTEL_SUBPLATFORM_M);
} else if (find_devid(devid, subplatform_p_ids,
ARRAY_SIZE(subplatform_p_ids))) {
mask = BIT(INTEL_SUBPLATFORM_P);
}
GEM_BUG_ON(mask & ~INTEL_SUBPLATFORM_MASK);


@ -89,6 +89,7 @@ enum intel_platform {
INTEL_XEHPSDV,
INTEL_DG2,
INTEL_PONTEVECCHIO,
INTEL_METEORLAKE,
INTEL_MAX_PLATFORMS
};
@ -126,6 +127,10 @@ enum intel_platform {
*/
#define INTEL_SUBPLATFORM_N 1
/* MTL */
#define INTEL_SUBPLATFORM_M 0
#define INTEL_SUBPLATFORM_P 1
enum intel_ppgtt_type {
INTEL_PPGTT_NONE = I915_GEM_PPGTT_NONE,
INTEL_PPGTT_ALIASING = I915_GEM_PPGTT_ALIASING,


@ -198,8 +198,7 @@ void intel_memory_region_debug(struct intel_memory_region *mr,
if (mr->region_private)
ttm_resource_manager_debug(mr->region_private, printer);
else
drm_printf(printer, "total:%pa, available:%pa bytes\n",
&mr->total, &mr->avail);
drm_printf(printer, "total:%pa bytes\n", &mr->total);
}
static int intel_memory_region_memtest(struct intel_memory_region *mem,
@ -242,7 +241,6 @@ intel_memory_region_create(struct drm_i915_private *i915,
mem->min_page_size = min_page_size;
mem->ops = ops;
mem->total = size;
mem->avail = mem->total;
mem->type = type;
mem->instance = instance;
@ -279,6 +277,20 @@ void intel_memory_region_set_name(struct intel_memory_region *mem,
va_end(ap);
}
void intel_memory_region_avail(struct intel_memory_region *mr,
u64 *avail, u64 *visible_avail)
{
if (mr->type == INTEL_MEMORY_LOCAL) {
i915_ttm_buddy_man_avail(mr->region_private,
avail, visible_avail);
*avail <<= PAGE_SHIFT;
*visible_avail <<= PAGE_SHIFT;
} else {
*avail = mr->total;
*visible_avail = mr->total;
}
}
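
A short sketch of the intended consumer (the surrounding query function is hypothetical): the helper already returns bytes, so the values can be copied straight into the new uAPI fields:

static void example_fill_region_info(struct intel_memory_region *mr,
                                     struct drm_i915_memory_region_info *info)
{
    u64 unallocated, unallocated_visible;

    intel_memory_region_avail(mr, &unallocated, &unallocated_visible);
    info->unallocated_size = unallocated;
    info->unallocated_cpu_visible_size = unallocated_visible;
}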
void intel_memory_region_destroy(struct intel_memory_region *mem)
{
int ret = 0;


@ -75,7 +75,6 @@ struct intel_memory_region {
resource_size_t io_size;
resource_size_t min_page_size;
resource_size_t total;
resource_size_t avail;
u16 type;
u16 instance;
@ -127,6 +126,9 @@ int intel_memory_region_reserve(struct intel_memory_region *mem,
void intel_memory_region_debug(struct intel_memory_region *mr,
struct drm_printer *printer);
void intel_memory_region_avail(struct intel_memory_region *mr,
u64 *avail, u64 *visible_avail);
struct intel_memory_region *
i915_gem_ttm_system_setup(struct drm_i915_private *i915,
u16 type, u16 instance);


@ -152,6 +152,7 @@ int intel_region_ttm_fini(struct intel_memory_region *mem)
* Convert an opaque TTM resource manager resource to a refcounted sg_table.
* @mem: The memory region.
* @res: The resource manager resource obtained from the TTM resource manager.
* @page_alignment: Required page alignment for each sg entry. Power of two.
*
* The gem backends typically use sg-tables for operations on the underlying
* io_memory. So provide a way for the backends to translate the
@ -161,16 +162,19 @@ int intel_region_ttm_fini(struct intel_memory_region *mem)
*/
struct i915_refct_sgt *
intel_region_ttm_resource_to_rsgt(struct intel_memory_region *mem,
struct ttm_resource *res)
struct ttm_resource *res,
u32 page_alignment)
{
if (mem->is_range_manager) {
struct ttm_range_mgr_node *range_node =
to_ttm_range_mgr_node(res);
return i915_rsgt_from_mm_node(&range_node->mm_nodes[0],
mem->region.start);
mem->region.start,
page_alignment);
} else {
return i915_rsgt_from_buddy_resource(res, mem->region.start);
return i915_rsgt_from_buddy_resource(res, mem->region.start,
page_alignment);
}
}


@ -24,7 +24,8 @@ int intel_region_ttm_fini(struct intel_memory_region *mem);
struct i915_refct_sgt *
intel_region_ttm_resource_to_rsgt(struct intel_memory_region *mem,
struct ttm_resource *res);
struct ttm_resource *res,
u32 page_alignment);
void intel_region_ttm_resource_free(struct intel_memory_region *mem,
struct ttm_resource *res);


@ -742,7 +742,7 @@ static int pot_hole(struct i915_address_space *vm,
u64 addr;
for (addr = round_up(hole_start + min_alignment, step) - min_alignment;
addr <= round_down(hole_end - (2 * min_alignment), step) - min_alignment;
hole_end > addr && hole_end - addr >= 2 * min_alignment;
addr += step) {
err = i915_vma_pin(vma, 0, 0, addr | flags);
if (err) {
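
The loop-bound change above guards against unsigned wraparound: subtracting 2 * min_alignment from a small hole_end wraps to a huge value instead of going negative, so the old upper-bound expression could admit holes that are too small. A minimal sketch of the safe form, with a hypothetical helper name:

/* Does [addr, hole_end) still leave room for two alignments? */
static bool example_fits(u64 addr, u64 hole_end, u64 min_alignment)
{
    /*
     * Computing hole_end - 2 * min_alignment directly wraps when
     * hole_end < 2 * min_alignment; compare before subtracting.
     */
    return hole_end > addr && hole_end - addr >= 2 * min_alignment;
}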


@ -451,7 +451,6 @@ static int igt_mock_splintered_region(void *arg)
static int igt_mock_max_segment(void *arg)
{
const unsigned int max_segment = rounddown(UINT_MAX, PAGE_SIZE);
struct intel_memory_region *mem = arg;
struct drm_i915_private *i915 = mem->i915;
struct i915_ttm_buddy_resource *res;
@ -460,7 +459,10 @@ static int igt_mock_max_segment(void *arg)
struct drm_buddy *mm;
struct list_head *blocks;
struct scatterlist *sg;
I915_RND_STATE(prng);
LIST_HEAD(objects);
unsigned int max_segment;
unsigned int ps;
u64 size;
int err = 0;
@ -472,7 +474,13 @@ static int igt_mock_max_segment(void *arg)
*/
size = SZ_8G;
mem = mock_region_create(i915, 0, size, PAGE_SIZE, 0, 0);
ps = PAGE_SIZE;
if (i915_prandom_u64_state(&prng) & 1)
ps = SZ_64K; /* For something like DG2 */
max_segment = round_down(UINT_MAX, ps);
mem = mock_region_create(i915, 0, size, ps, 0, 0);
if (IS_ERR(mem))
return PTR_ERR(mem);
@ -498,12 +506,21 @@ static int igt_mock_max_segment(void *arg)
}
for (sg = obj->mm.pages->sgl; sg; sg = sg_next(sg)) {
dma_addr_t daddr = sg_dma_address(sg);
if (sg->length > max_segment) {
pr_err("%s: Created an oversized scatterlist entry, %u > %u\n",
__func__, sg->length, max_segment);
err = -EINVAL;
goto out_close;
}
if (!IS_ALIGNED(daddr, ps)) {
pr_err("%s: Created an unaligned scatterlist entry, addr=%pa, ps=%u\n",
__func__, &daddr, ps);
err = -EINVAL;
goto out_close;
}
}
out_close:


@ -33,7 +33,8 @@ static int mock_region_get_pages(struct drm_i915_gem_object *obj)
return PTR_ERR(obj->mm.res);
obj->mm.rsgt = intel_region_ttm_resource_to_rsgt(obj->mm.region,
obj->mm.res);
obj->mm.res,
obj->mm.region->min_page_size);
if (IS_ERR(obj->mm.rsgt)) {
err = PTR_ERR(obj->mm.rsgt);
goto err_free_resource;


@ -696,22 +696,55 @@
#define INTEL_DG2_G10_IDS(info) \
INTEL_VGA_DEVICE(0x5690, info), \
INTEL_VGA_DEVICE(0x5691, info), \
INTEL_VGA_DEVICE(0x5692, info)
INTEL_VGA_DEVICE(0x5692, info), \
INTEL_VGA_DEVICE(0x56A0, info), \
INTEL_VGA_DEVICE(0x56A1, info), \
INTEL_VGA_DEVICE(0x56A2, info)
#define INTEL_DG2_G11_IDS(info) \
INTEL_VGA_DEVICE(0x5693, info), \
INTEL_VGA_DEVICE(0x5694, info), \
INTEL_VGA_DEVICE(0x5695, info), \
INTEL_VGA_DEVICE(0x56B0, info)
INTEL_VGA_DEVICE(0x5698, info), \
INTEL_VGA_DEVICE(0x56A5, info), \
INTEL_VGA_DEVICE(0x56A6, info), \
INTEL_VGA_DEVICE(0x56B0, info), \
INTEL_VGA_DEVICE(0x56B1, info)
#define INTEL_DG2_G12_IDS(info) \
INTEL_VGA_DEVICE(0x5696, info), \
INTEL_VGA_DEVICE(0x5697, info), \
INTEL_VGA_DEVICE(0x56B2, info)
INTEL_VGA_DEVICE(0x56A3, info), \
INTEL_VGA_DEVICE(0x56A4, info), \
INTEL_VGA_DEVICE(0x56B2, info), \
INTEL_VGA_DEVICE(0x56B3, info)
#define INTEL_DG2_IDS(info) \
INTEL_DG2_G10_IDS(info), \
INTEL_DG2_G11_IDS(info), \
INTEL_DG2_G12_IDS(info)
#define INTEL_ATS_M150_IDS(info) \
INTEL_VGA_DEVICE(0x56C0, info)
#define INTEL_ATS_M75_IDS(info) \
INTEL_VGA_DEVICE(0x56C1, info)
#define INTEL_ATS_M_IDS(info) \
INTEL_ATS_M150_IDS(info), \
INTEL_ATS_M75_IDS(info)
/* MTL */
#define INTEL_MTL_M_IDS(info) \
INTEL_VGA_DEVICE(0x7D40, info), \
INTEL_VGA_DEVICE(0x7D60, info)
#define INTEL_MTL_P_IDS(info) \
INTEL_VGA_DEVICE(0x7D45, info), \
INTEL_VGA_DEVICE(0x7D55, info), \
INTEL_VGA_DEVICE(0x7DD5, info)
#define INTEL_MTL_IDS(info) \
INTEL_MTL_M_IDS(info), \
INTEL_MTL_P_IDS(info)
#endif /* _I915_PCIIDS_H */


@ -751,14 +751,27 @@ typedef struct drm_i915_irq_wait {
/* Must be kept compact -- no holes and well documented */
typedef struct drm_i915_getparam {
/**
* struct drm_i915_getparam - Driver parameter query structure.
*/
struct drm_i915_getparam {
/** @param: Driver parameter to query. */
__s32 param;
/*
/**
* @value: Address of memory where queried value should be put.
*
* WARNING: Using pointers instead of fixed-size u64 means we need to write
* compat32 code. Don't repeat this mistake.
*/
int __user *value;
} drm_i915_getparam_t;
};
/**
* typedef drm_i915_getparam_t - Driver parameter query structure.
* See struct drm_i915_getparam.
*/
typedef struct drm_i915_getparam drm_i915_getparam_t;
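
A userspace usage sketch for the ioctl documented above (the function name is hypothetical, error handling is elided), using libdrm's drmIoctl():

#include <xf86drm.h>
#include <i915_drm.h>

/* Query one driver parameter through the getparam ioctl. */
static int example_chipset_id(int fd)
{
    int value = 0;
    struct drm_i915_getparam gp = {
        .param = I915_PARAM_CHIPSET_ID,
        .value = &value,
    };

    drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
    return value;
}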
/* Ioctl to set kernel params:
*/
@ -1239,76 +1252,119 @@ struct drm_i915_gem_exec_object2 {
__u64 rsvd2;
};
/**
* struct drm_i915_gem_exec_fence - An input or output fence for the execbuf
* ioctl.
*
* The request will wait for the input fence to signal before submission.
*
* The returned output fence will be signaled after the completion of the
* request.
*/
struct drm_i915_gem_exec_fence {
/**
* User's handle for a drm_syncobj to wait on or signal.
*/
/** @handle: User's handle for a drm_syncobj to wait on or signal. */
__u32 handle;
/**
* @flags: Supported flags are:
*
* I915_EXEC_FENCE_WAIT:
* Wait for the input fence before request submission.
*
* I915_EXEC_FENCE_SIGNAL:
* Return request completion fence as output
*/
__u32 flags;
#define I915_EXEC_FENCE_WAIT (1<<0)
#define I915_EXEC_FENCE_SIGNAL (1<<1)
#define __I915_EXEC_FENCE_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SIGNAL << 1))
__u32 flags;
};
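
A small usage sketch (hypothetical helper; the handles are assumed to be valid drm_syncobj handles): one fence to wait on before submission and one to signal on completion:

/* Fill a two-entry fence array for I915_EXEC_FENCE_ARRAY. */
static void example_fill_fences(struct drm_i915_gem_exec_fence fences[2],
                                __u32 wait_handle, __u32 signal_handle)
{
    fences[0].handle = wait_handle;
    fences[0].flags = I915_EXEC_FENCE_WAIT;
    fences[1].handle = signal_handle;
    fences[1].flags = I915_EXEC_FENCE_SIGNAL;
}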
/*
* See drm_i915_gem_execbuffer_ext_timeline_fences.
*/
#define DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES 0
/*
/**
* struct drm_i915_gem_execbuffer_ext_timeline_fences - Timeline fences
* for execbuf ioctl.
*
* This structure describes an array of drm_syncobj and associated points for
* timeline variants of drm_syncobj. It is invalid to append this structure to
* the execbuf if I915_EXEC_FENCE_ARRAY is set.
*/
struct drm_i915_gem_execbuffer_ext_timeline_fences {
#define DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES 0
/** @base: Extension link. See struct i915_user_extension. */
struct i915_user_extension base;
/**
* Number of element in the handles_ptr & value_ptr arrays.
* @fence_count: Number of elements in the @handles_ptr & @value_ptr
* arrays.
*/
__u64 fence_count;
/**
* Pointer to an array of struct drm_i915_gem_exec_fence of length
* fence_count.
* @handles_ptr: Pointer to an array of struct drm_i915_gem_exec_fence
* of length @fence_count.
*/
__u64 handles_ptr;
/**
* Pointer to an array of u64 values of length fence_count. Values
* must be 0 for a binary drm_syncobj. A Value of 0 for a timeline
* drm_syncobj is invalid as it turns a drm_syncobj into a binary one.
* @values_ptr: Pointer to an array of u64 values of length
* @fence_count.
* Values must be 0 for a binary drm_syncobj. A value of 0 for a
* timeline drm_syncobj is invalid as it turns a drm_syncobj into a
* binary one.
*/
__u64 values_ptr;
};
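
A declaration fragment for the timeline variant, as it might appear inside a submission function (the syncobj handle and timeline point are assumptions; the extension is attached via @cliprects_ptr with I915_EXEC_USE_EXTENSIONS set, never together with I915_EXEC_FENCE_ARRAY):

    struct drm_i915_gem_exec_fence fence = {
        .handle = syncobj,      /* assumed timeline drm_syncobj */
        .flags = I915_EXEC_FENCE_WAIT,
    };
    __u64 point = 3;            /* timeline point to wait on */
    struct drm_i915_gem_execbuffer_ext_timeline_fences ext = {
        .base.name = DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES,
        .fence_count = 1,
        .handles_ptr = (uintptr_t)&fence,
        .values_ptr = (uintptr_t)&point,
    };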
/**
* struct drm_i915_gem_execbuffer2 - Structure for DRM_I915_GEM_EXECBUFFER2
* ioctl.
*/
struct drm_i915_gem_execbuffer2 {
/**
* List of gem_exec_object2 structs
*/
/** @buffers_ptr: Pointer to a list of gem_exec_object2 structs */
__u64 buffers_ptr;
/** @buffer_count: Number of elements in @buffers_ptr array */
__u32 buffer_count;
/** Offset in the batchbuffer to start execution from. */
__u32 batch_start_offset;
/** Bytes used in batchbuffer from batch_start_offset */
__u32 batch_len;
__u32 DR1;
__u32 DR4;
__u32 num_cliprects;
/**
* This is a struct drm_clip_rect *cliprects if I915_EXEC_FENCE_ARRAY
* & I915_EXEC_USE_EXTENSIONS are not set.
* @batch_start_offset: Offset in the batchbuffer to start execution
* from.
*/
__u32 batch_start_offset;
/**
* @batch_len: Length in bytes of the batch buffer, starting from the
* @batch_start_offset. If 0, length is assumed to be the batch buffer
* object size.
*/
__u32 batch_len;
/** @DR1: deprecated */
__u32 DR1;
/** @DR4: deprecated */
__u32 DR4;
/** @num_cliprects: See @cliprects_ptr */
__u32 num_cliprects;
/**
* @cliprects_ptr: Kernel clipping was a DRI1 misfeature.
*
* It is invalid to use this field if I915_EXEC_FENCE_ARRAY or
* I915_EXEC_USE_EXTENSIONS flags are not set.
*
* If I915_EXEC_FENCE_ARRAY is set, then this is a pointer to an array
* of struct drm_i915_gem_exec_fence and num_cliprects is the length
* of the array.
* of &drm_i915_gem_exec_fence and @num_cliprects is the length of the
* array.
*
* If I915_EXEC_USE_EXTENSIONS is set, then this is a pointer to a
* single struct i915_user_extension and num_cliprects is 0.
* single &i915_user_extension and num_cliprects is 0.
*/
__u64 cliprects_ptr;
/** @flags: Execbuf flags */
__u64 flags;
#define I915_EXEC_RING_MASK (0x3f)
#define I915_EXEC_DEFAULT (0<<0)
#define I915_EXEC_RENDER (1<<0)
@ -1326,10 +1382,6 @@ struct drm_i915_gem_execbuffer2 {
#define I915_EXEC_CONSTANTS_REL_GENERAL (0<<6) /* default */
#define I915_EXEC_CONSTANTS_ABSOLUTE (1<<6)
#define I915_EXEC_CONSTANTS_REL_SURFACE (2<<6) /* gen4/5 only */
__u64 flags;
__u64 rsvd1; /* now used for context info */
__u64 rsvd2;
};
/** Resets the SO write offset registers for transform feedback on gen7. */
#define I915_EXEC_GEN7_SOL_RESET (1<<8)
@ -1432,9 +1484,23 @@ struct drm_i915_gem_execbuffer2 {
* drm_i915_gem_execbuffer_ext enum.
*/
#define I915_EXEC_USE_EXTENSIONS (1 << 21)
#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_USE_EXTENSIONS << 1))
/** @rsvd1: Context id */
__u64 rsvd1;
/**
* @rsvd2: in and out sync_file file descriptors.
*
* When I915_EXEC_FENCE_IN or I915_EXEC_FENCE_SUBMIT flag is set, the
* lower 32 bits of this field will have the in sync_file fd (input).
*
* When I915_EXEC_FENCE_OUT flag is set, the upper 32 bits of this
* field will have the out sync_file fd (output).
*/
__u64 rsvd2;
};
#define I915_EXEC_CONTEXT_ID_MASK (0xffffffff)
#define i915_execbuffer2_set_context_id(eb2, context) \
(eb2).rsvd1 = context & I915_EXEC_CONTEXT_ID_MASK
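
Tying it together, a hedged userspace sketch (buffer setup is elided; the helper name and the single-object, two-fence layout are assumptions) submitting with a fence array through @cliprects_ptr:

#include <stdint.h>
#include <xf86drm.h>
#include <i915_drm.h>

static int example_submit(int fd, __u32 ctx_id,
                          struct drm_i915_gem_exec_object2 *objects,
                          struct drm_i915_gem_exec_fence *fences)
{
    struct drm_i915_gem_execbuffer2 execbuf = {
        .buffers_ptr = (uintptr_t)objects,
        .buffer_count = 1,
        .batch_len = 0,  /* 0: use the full batch object size */
        .cliprects_ptr = (uintptr_t)fences,
        .num_cliprects = 2,
        .flags = I915_EXEC_RENDER | I915_EXEC_FENCE_ARRAY,
    };

    i915_execbuffer2_set_context_id(execbuf, ctx_id);
    return drmIoctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, &execbuf);
}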
@ -1814,19 +1880,58 @@ struct drm_i915_gem_context_create {
__u32 pad;
};
/**
* struct drm_i915_gem_context_create_ext - Structure for creating contexts.
*/
struct drm_i915_gem_context_create_ext {
__u32 ctx_id; /* output: id of new context*/
/** @ctx_id: Id of the created context (output) */
__u32 ctx_id;
/**
* @flags: Supported flags are:
*
* I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS:
*
* Extensions may be appended to this structure and the driver must check
* for those. See @extensions.
*
* I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE
*
* Created context will have a single timeline.
*/
__u32 flags;
#define I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS (1u << 0)
#define I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE (1u << 1)
#define I915_CONTEXT_CREATE_FLAGS_UNKNOWN \
(-(I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE << 1))
/**
* @extensions: Zero-terminated chain of extensions.
*
* I915_CONTEXT_CREATE_EXT_SETPARAM:
* Context parameter to set or query during context creation.
* See struct drm_i915_gem_context_create_ext_setparam.
*
* I915_CONTEXT_CREATE_EXT_CLONE:
* This extension has been removed. On the off chance someone somewhere
* has attempted to use it, never re-use this extension number.
*/
__u64 extensions;
#define I915_CONTEXT_CREATE_EXT_SETPARAM 0
#define I915_CONTEXT_CREATE_EXT_CLONE 1
};
/**
* struct drm_i915_gem_context_param - Context parameter to set or query.
*/
struct drm_i915_gem_context_param {
/** @ctx_id: Context id */
__u32 ctx_id;
/** @size: Size of the parameter @value */
__u32 size;
/** @param: Parameter to set or query */
__u64 param;
#define I915_CONTEXT_PARAM_BAN_PERIOD 0x1
/* I915_CONTEXT_PARAM_NO_ZEROMAP has been removed. On the off chance
@ -1973,6 +2078,7 @@ struct drm_i915_gem_context_param {
#define I915_CONTEXT_PARAM_PROTECTED_CONTENT 0xd
/* Must be kept compact -- no holes and well documented */
/** @value: Context parameter value to be set or queried */
__u64 value;
};
@ -2371,23 +2477,29 @@ struct i915_context_param_engines {
struct i915_engine_class_instance engines[N__]; \
} __attribute__((packed)) name__
/**
* struct drm_i915_gem_context_create_ext_setparam - Context parameter
* to set or query during context creation.
*/
struct drm_i915_gem_context_create_ext_setparam {
#define I915_CONTEXT_CREATE_EXT_SETPARAM 0
/** @base: Extension link. See struct i915_user_extension. */
struct i915_user_extension base;
/**
* @param: Context parameter to set or query.
* See struct drm_i915_gem_context_param.
*/
struct drm_i915_gem_context_param param;
};
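
A userspace sketch of the extension chain in practice (hypothetical helper; vm_id is assumed to come from a prior DRM_I915_GEM_VM_CREATE), binding a new context to an existing VM via I915_CONTEXT_CREATE_EXT_SETPARAM:

#include <stdint.h>
#include <xf86drm.h>
#include <i915_drm.h>

static __u32 example_create_ctx(int fd, __u32 vm_id)
{
    struct drm_i915_gem_context_create_ext_setparam p_vm = {
        .base.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
        .param = {
            .param = I915_CONTEXT_PARAM_VM,
            .value = vm_id,
        },
    };
    struct drm_i915_gem_context_create_ext create = {
        .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
        .extensions = (uintptr_t)&p_vm,
    };

    drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, &create);
    return create.ctx_id;
}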
/* This API has been removed. On the off chance someone somewhere has
* attempted to use it, never re-use this extension number.
*/
#define I915_CONTEXT_CREATE_EXT_CLONE 1
struct drm_i915_gem_context_destroy {
__u32 ctx_id;
__u32 pad;
};
/*
/**
* struct drm_i915_gem_vm_control - Structure to create or destroy VM.
*
* DRM_I915_GEM_VM_CREATE -
*
* Create a new virtual memory address space (ppGTT) for use within a context
@ -2397,20 +2509,23 @@ struct drm_i915_gem_context_destroy {
* The id of new VM (bound to the fd) for use with I915_CONTEXT_PARAM_VM is
* returned in the outparam @id.
*
* No flags are defined; all bits are reserved and must be zero.
*
* An extension chain may be provided, starting with @extensions, and terminated
* by the @next_extension being 0. Currently, no extensions are defined.
*
* DRM_I915_GEM_VM_DESTROY -
*
* Destroys a previously created VM id, specified in @id.
* Destroys a previously created VM id, specified in @vm_id.
*
* No extensions or flags are allowed currently, and so must be zero.
*/
struct drm_i915_gem_vm_control {
/** @extensions: Zero-terminated chain of extensions. */
__u64 extensions;
/** @flags: reserved for future usage, currently MBZ */
__u32 flags;
/** @vm_id: Id of the VM created or to be destroyed */
__u32 vm_id;
};
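
A minimal userspace sketch of the create/destroy pair (hypothetical helper, error handling elided); flags and extensions stay zero as documented above:

#include <xf86drm.h>
#include <i915_drm.h>

static void example_vm_lifecycle(int fd)
{
    struct drm_i915_gem_vm_control ctl = {};

    drmIoctl(fd, DRM_IOCTL_I915_GEM_VM_CREATE, &ctl);
    /* ctl.vm_id can now be set on a context via I915_CONTEXT_PARAM_VM. */
    drmIoctl(fd, DRM_IOCTL_I915_GEM_VM_DESTROY, &ctl);
}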
@ -3207,36 +3322,6 @@ struct drm_i915_gem_memory_class_instance {
* struct drm_i915_memory_region_info - Describes one region as known to the
* driver.
*
* Note that we reserve some stuff here for potential future work. As an example
* we might want expose the capabilities for a given region, which could include
* things like if the region is CPU mappable/accessible, what are the supported
* mapping types etc.
*
* Note that to extend struct drm_i915_memory_region_info and struct
* drm_i915_query_memory_regions in the future the plan is to do the following:
*
* .. code-block:: C
*
* struct drm_i915_memory_region_info {
* struct drm_i915_gem_memory_class_instance region;
* union {
* __u32 rsvd0;
* __u32 new_thing1;
* };
* ...
* union {
* __u64 rsvd1[8];
* struct {
* __u64 new_thing2;
* __u64 new_thing3;
* ...
* };
* };
* };
*
* With this things should remain source compatible between versions for
* userspace, even as we add new fields.
*
* Note this is using both struct drm_i915_query_item and struct drm_i915_query.
* For this new query we are adding the new query id DRM_I915_QUERY_MEMORY_REGIONS
* at &drm_i915_query_item.query_id.
@ -3248,14 +3333,81 @@ struct drm_i915_memory_region_info {
/** @rsvd0: MBZ */
__u32 rsvd0;
/** @probed_size: Memory probed by the driver (-1 = unknown) */
/**
* @probed_size: Memory probed by the driver
*
* Note that it should not be possible to ever encounter a zero value
* here. Also note that no current region type will ever return -1 here,
* although for future region types this might be a possibility. The
* same applies to the other size fields.
*/
__u64 probed_size;
/** @unallocated_size: Estimate of memory remaining (-1 = unknown) */
/**
* @unallocated_size: Estimate of memory remaining
*
* Requires CAP_PERFMON or CAP_SYS_ADMIN to get reliable accounting.
* Without this (or if this is an older kernel) the value here will
* always equal the @probed_size. Note this is only currently tracked
* for I915_MEMORY_CLASS_DEVICE regions (for other types the value here
* will always equal the @probed_size).
*/
__u64 unallocated_size;
/** @rsvd1: MBZ */
__u64 rsvd1[8];
union {
/** @rsvd1: MBZ */
__u64 rsvd1[8];
struct {
/**
* @probed_cpu_visible_size: Memory probed by the driver
* that is CPU accessible.
*
* This will always be <= @probed_size, and the
* remainder (if there is any) will not be CPU
* accessible.
*
* On systems without small BAR, the @probed_size will
* always equal the @probed_cpu_visible_size, since all
* of it will be CPU accessible.
*
* Note this is only tracked for
* I915_MEMORY_CLASS_DEVICE regions (for other types the
* value here will always equal the @probed_size).
*
* Note that if the value returned here is zero, then
* this must be an old kernel which lacks the relevant
* small-bar uAPI support (including
* I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS), but on
* such systems we should never actually end up with a
* small BAR configuration, assuming we are able to load
* the kernel module. Hence it should be safe to treat
* this the same as when @probed_cpu_visible_size ==
* @probed_size.
*/
__u64 probed_cpu_visible_size;
/**
* @unallocated_cpu_visible_size: Estimate of CPU
* visible memory remaining.
*
* Note this is only tracked for
* I915_MEMORY_CLASS_DEVICE regions (for other types the
* value here will always equal the
* @probed_cpu_visible_size).
*
* Requires CAP_PERFMON or CAP_SYS_ADMIN to get reliable
* accounting. Without this the value here will always
* equal the @probed_cpu_visible_size.
*
* If this is an older kernel the value here will be
* zero, see also @probed_cpu_visible_size.
*/
__u64 unallocated_cpu_visible_size;
};
};
};
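
A hedged sketch of how userspace can detect a small BAR system from one returned entry (the query itself via DRM_I915_QUERY_MEMORY_REGIONS is elided; the helper name is an assumption):

#include <stdbool.h>
#include <i915_drm.h>

/* Small BAR: only part of device memory is CPU visible. */
static bool example_is_small_bar(const struct drm_i915_memory_region_info *info)
{
    return info->region.memory_class == I915_MEMORY_CLASS_DEVICE &&
           info->probed_cpu_visible_size &&
           info->probed_cpu_visible_size < info->probed_size;
}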
/**
@ -3329,11 +3481,11 @@ struct drm_i915_query_memory_regions {
* struct drm_i915_gem_create_ext - Existing gem_create behaviour, with added
* extension support using struct i915_user_extension.
*
* Note that in the future we want to have our buffer flags here, at least for
* the stuff that is immutable. Previously we would have two ioctls, one to
* create the object with gem_create, and another to apply various parameters,
* however this creates some ambiguity for the params which are considered
* immutable. Also in general we're phasing out the various SET/GET ioctls.
* Note that new buffer flags should be added here, at least for the stuff that
* is immutable. Previously we would have two ioctls, one to create the object
* with gem_create, and another to apply various parameters, however this
* creates some ambiguity for the params which are considered immutable. Also in
* general we're phasing out the various SET/GET ioctls.
*/
struct drm_i915_gem_create_ext {
/**
@ -3341,7 +3493,6 @@ struct drm_i915_gem_create_ext {
*
* The (page-aligned) allocated size for the object will be returned.
*
*
* DG2 64K min page size implications:
*
* On discrete platforms, starting from DG2, we have to contend with GTT
@ -3353,7 +3504,9 @@ struct drm_i915_gem_create_ext {
*
* Note that the returned size here will always reflect any required
* rounding up done by the kernel, i.e 4K will now become 64K on devices
* such as DG2.
* such as DG2. The kernel will always select the largest minimum
* page-size for the set of possible placements as the value to use when
* rounding up the @size.
*
* Special DG2 GTT address alignment requirement:
*
@ -3377,14 +3530,58 @@ struct drm_i915_gem_create_ext {
* is deemed to be a good compromise.
*/
__u64 size;
/**
* @handle: Returned handle for the object.
*
* Object handles are nonzero.
*/
__u32 handle;
/** @flags: MBZ */
/**
* @flags: Optional flags.
*
* Supported values:
*
* I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the kernel that
* the object will need to be accessed via the CPU.
*
* Only valid when placing objects in I915_MEMORY_CLASS_DEVICE, and only
* strictly required on configurations where some subset of the device
* memory is directly visible/mappable through the CPU (which we also
* call small BAR), like on some DG2+ systems. Note that this is quite
* undesirable, but due to various factors like the client CPU, BIOS etc.
* it's something we can expect to see in the wild. See
* &drm_i915_memory_region_info.probed_cpu_visible_size for how to
* determine if this system applies.
*
* Note that one of the placements MUST be I915_MEMORY_CLASS_SYSTEM, to
* ensure the kernel can always spill the allocation to system memory,
* if the object can't be allocated in the mappable part of
* I915_MEMORY_CLASS_DEVICE.
*
* Also note that since the kernel only supports flat-CCS on objects
* that can *only* be placed in I915_MEMORY_CLASS_DEVICE, we therefore
* don't support I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS together with
* flat-CCS.
*
* Without this hint, the kernel will assume that non-mappable
* I915_MEMORY_CLASS_DEVICE is preferred for this object. Note that the
* kernel can still migrate the object to the mappable part, as a last
* resort, if userspace ever CPU faults this object, but this might be
* expensive, and so ideally should be avoided.
*
* On older kernels which lack the relevant small-bar uAPI support (see
* also &drm_i915_memory_region_info.probed_cpu_visible_size),
* usage of the flag will result in an error, but it should NEVER be
* possible to end up with a small BAR configuration, assuming we can
* also successfully load the i915 kernel module. In such cases the
* entire I915_MEMORY_CLASS_DEVICE region will be CPU accessible, and as
* such there are zero restrictions on where the object can be placed.
*/
#define I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS (1 << 0)
__u32 flags;
/**
* @extensions: The chain of extensions to apply to this object.
*