Commit Graph

179 Commits

Author SHA1 Message Date
Maarten Lankhorst 95c3d27580 drm/i915: Remove resv from i915_vma
It's just an alias to vma->obj->base.resv, no need to duplicate it.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211117142024.1043017-5-matthew.auld@intel.com
2021-11-19 17:38:03 +00:00
Wan Jiabing 86752bd613 drm/i915: Use ERR_CAST instead of ERR_PTR(PTR_ERR())
Fix following coccicheck warning:
./drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:3117:15-22: WARNING:
ERR_CAST can be used with eb->requests[i].

Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211025113316.24424-1-wanjiabing@vivo.com
2021-10-25 16:35:55 +01:00
Matthew Auld df94fd05e6 drm/i915: expand on the kernel-doc for cache_dirty
Add some details around non-LLC platforms and cflushing, when dealing
with the flush-on-acquire, which is potentially security sensitive.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018174508.2137279-7-matthew.auld@intel.com
2021-10-20 16:50:24 +01:00
Matthew Brost 7647f0096e drm/i915: Update I915_GEM_BUSY IOCTL to understand composite fences
Parallel submission create composite fences (dma_fence_array) for excl /
shared slots in objects. The I915_GEM_BUSY IOCTL checks these slots to
determine the busyness of the object. Prior to patch it only check if
the fence in the slot was a i915_request. Update the check to understand
composite fences and correctly report the busyness.

v2:
 (Tvrtko)
  - Remove duplicate BUILD_BUG_ON

Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-24-matthew.brost@intel.com
2021-10-15 10:45:51 -07:00
Matthew Brost 544460c338 drm/i915: Multi-BB execbuf
Allow multiple batch buffers to be submitted in a single execbuf IOCTL
after a context has been configured with the 'set_parallel' extension.
The number batches is implicit based on the contexts configuration.

This is implemented with a series of loops. First a loop is used to find
all the batches, a loop to pin all the HW contexts, a loop to create all
the requests, a loop to submit (emit BB start, etc...) all the requests,
a loop to tie the requests to the VMAs they touch, and finally a loop to
commit the requests to the backend.

A composite fence is also created for the generated requests to return
to the user and to stick in dma resv slots.

No behavior from the existing IOCTL should be changed aside from when
throttling because the ring for a context is full. In this situation,
i915 will now wait while holding the object locks. This change was done
because the code is much simpler to wait while holding the locks and we
believe there isn't a huge benefit of dropping these locks. If this
proves false we can restructure the code to drop the locks during the
wait.

IGT: https://patchwork.freedesktop.org/patch/447008/?series=93071&rev=1
media UMD: https://github.com/intel/media-driver/pull/1252

v2:
 (Matthew Brost)
  - Return proper error value if i915_request_create fails
v3:
 (John Harrison)
  - Add comment explaining create / add order loops + locking
  - Update commit message explaining different in IOCTL behavior
  - Line wrap some comments
  - eb_add_request returns void
  - Return -EINVAL rather triggering BUG_ON if cmd parser used
 (Checkpatch)
  - Check eb->batch_len[*current_batch]
v4:
 (CI)
  - Set batch len if passed if via execbuf args
  - Call __i915_request_skip after __i915_request_commit
 (Kernel test robot)
  - Initialize rq to NULL in eb_pin_timeline
v5:
 (John Harrison)
  - Fix typo in comments near bb order loops

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-21-matthew.brost@intel.com
2021-10-15 10:45:50 -07:00
Anshuman Gupta ef6ba31dd3 drm/i915/pxp: Add plane decryption support
Add support to enable/disable PLANE_SURF Decryption Request bit.
It requires only to enable plane decryption support when following
condition met.
1. PXP session is enabled.
2. Buffer object is protected.

v2:
- Used gen fb obj user_flags instead gem_object_metadata. [Krishna]

v3:
- intel_pxp_gem_object_status() API changes.

v4: use intel_pxp_is_active (Daniele)

v5: rebase and use the new protected object status checker (Daniele)

v6: used plane state for plane_decryption to handle async flip
    as suggested by Ville.

v7: check pxp session while plane decrypt state computation. [Ville]
    removed pointless code. [Ville]

v8 (Daniele): update PXP check

v9: move decrypt check after icl_check_nv12_planes() when overlays
    have fb set (Juston)

v10 (Daniele): update PXP check again to match rework in earlier
patches and don't consider protection valid if the object has not
been used in an execbuf beforehand.

Cc: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Cc: Huang Sean Z <sean.z.huang@intel.com>
Cc: Gaurav Kumar <kumar.gaurav@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Juston Li <juston.li@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com> #v9
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210924191452.1539378-14-alan.previn.teres.alexis@intel.com
2021-10-04 13:11:16 -04:00
Daniele Ceraolo Spurio d3ac8d4216 drm/i915/pxp: interfaces for using protected objects
This api allow user mode to create protected buffers and to mark
contexts as making use of such objects. Only when using contexts
marked in such a way is the execution guaranteed to work as expected.

Contexts can only be marked as using protected content at creation time
(i.e. the parameter is immutable) and they must be both bannable and not
recoverable. Given that the protected session gets invalidated on
suspend, contexts created this way hold a runtime pm wakeref until
they're either destroyed or invalidated.

All protected objects and contexts will be considered invalid when the
PXP session is destroyed and all new submissions using them will be
rejected. All intel contexts within the invalidated gem contexts will be
marked banned. Userspace can detect that an invalidation has occurred via
the RESET_STATS ioctl, where we report it the same way as a ban due to a
hang.

v5: squash patches, rebase on proto_ctx, update kerneldoc

v6: rebase on obj create_ext changes

v7: Use session counter to check if an object it valid, hold wakeref in
    context, don't add a new flag to RESET_STATS (Daniel)

v8: don't increase guilty count for contexts banned during pxp
    invalidation (Rodrigo)

v9: better comments, avoid wakeref put race between pxp_inval and
    context_close, add usage examples (Rodrigo)

v10: modify internal set/get-protected-context functions to not
     return -ENODEV when setting PXP param to false or getting param
     when running on pxp-unsupported hw or getting param when i915
     was built with CONFIG_PXP off

Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210924191452.1539378-11-alan.previn.teres.alexis@intel.com
2021-10-04 13:11:00 -04:00
Joonas Lahtinen d5dd580deb Merge drm/drm-next into drm-intel-gt-next
Close the divergence which has caused patches not to apply and
have a solid baseline for the PXP patches that Rodrigo will send
a topic branch PR for.

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2021-09-15 13:23:27 +03:00
Daniel Vetter a82a9979de drm/i915: Add i915_gem_context_is_full_ppgtt
And use it anywhere we have open-coded checks for ctx->vm that really
only check for full ppgtt.

Plus for paranoia add a GEM_BUG_ON that checks it's really only set
when we have full ppgtt, just in case. gem_context->vm is different
since it's NULL in ggtt mode, unlike intel_context->vm or gt->vm,
which is always set.

v2: 0day found a testcase that I missed.

v3: Repaint shed (Jon, Tvrtko)

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-7-daniel.vetter@ffwll.ch
2021-09-06 10:53:04 +02:00
Daniel Vetter e1068a9e80 drm/i915: Drop code to handle set-vm races from execbuf
Changing the vm from a finalized gem ctx is no longer possible, which
means we don't have to check for that anymore.

I was pondering whether to keep the check as a WARN_ON, but things go
boom real bad real fast if the vm of a vma is wrong. Plus we'd need to
also get the ggtt vm for !full-ppgtt platforms. Ditching it all seemed
like a better idea.

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
References: ccbc1b9794 ("drm/i915/gem: Don't allow changing the VM on running contexts (v4)")
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-4-daniel.vetter@ffwll.ch
2021-09-06 10:45:56 +02:00
Dave Airlie 25fed6b324 Merge tag 'drm-intel-gt-next-2021-08-06-1' of ssh://git.freedesktop.org/git/drm/drm-intel into drm-next
UAPI Changes:

- Add I915_MMAP_OFFSET_FIXED

  On devices with local memory `I915_MMAP_OFFSET_FIXED` is the only valid
  type. On devices without local memory, this caching mode is invalid.

  As caching mode when specifying `I915_MMAP_OFFSET_FIXED`, WC or WB will
  be used, depending on the object placement on creation. WB will be used
  when the object can only exist in system memory, WC otherwise.

  Userspace: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11888

- Reinstate the mmap ioctl for (already released) integrated Gen12 platforms

  Rationale: Otherwise media driver breaks eg. for ADL-P. Long term goal is
  still to sunset the IOCTL even for integrated and require using mmap_offset.

- Reject caching/set_domain IOCTLs on discrete

  Expected to become immutable property of the BO

- Disallow changing context parameters after first use on Gen12 and earlier
- Require setting context parameters at creation on platforms after Gen12

  Rationale (for both): Allow less dynamic changes to the context to simplify
  the implementation and avoid user shooting theirselves in the foot.

- Drop I915_CONTEXT_PARAM_RINGSIZE

  Userspace PR for compute-driver has not been merged

- Drop I915_CONTEXT_PARAM_NO_ZEROMAP

  Userspace PR for libdrm / Beignet was never landed

- Drop CONTEXT_CLONE API

  Userspace PR for Mesa was never landed

- Drop getparam support for I915_CONTEXT_PARAM_ENGINES

  Only existed for symmetry wrt. setparam, never used.

- Disallow bonding of virtual engines

  Drop the prep work, no hardware has been released needing it.

- (Implicit) Disable gpu relocations

  Media userspace was the last userspace to still use them. They
  have converted so performance can be regained with an update.

Core Changes:

- Merge topic branch 'topic/i915-ttm-2021-06-11' (from Maarten)
- Merge topic branch 'topic/revid_steppings' (from Matt R)
- Merge topic branch 'topic/xehp-dg2-definitions-2021-07-21' (from Matt R)
- Backmerges drm-next (Rodrigo)

Driver Changes:

- Initial workarounds for ADL-P (Clint)
- Preliminary code for XeHP/DG2 (Stuart, Umesh, Matt R, Prathap, Ram,
  Venkata, Akeem, Tvrtko, John, Lucas)
- Fix ADL-S DMA mask size to 39 bits (Tejas)
- Remove code for CNL (Lucas)
- Add ADL-P GuC/HuC firmwares (John)
- Update HuC to 7.9.3 for TGL/ADL-S/RKL (John)
- Fix -EDEADLK handling regression (Ville)
- Implement Wa_1508744258 for DG1 and Gen12 iGFX (Jose)
- Extend Wa_1406941453 to ADL-S (Jose)
- Drop unnecessary workarounds per stepping for SKL/BXT/ICL (Matt R)
- Use fuse info to enable SFC on Gen12 (Venkata)
- Unconditionally flush the pages on acquire on EHL/JSL (Matt A)
- Probe existence of backing struct pages upon userptr creation (Chris, Matt A)

- Add an intermediate GEM proto-context to delay real context creation (Jason)
- Implement SINGLE_TIMELINE with a syncobj (Jason)
- Set the watchdog timeout directly in intel_context_set_gem (Jason)
- Disallow userspace from creating contexts with too many engines (Jason)
- Revert "drm/i915/gem: Asynchronous cmdparser" (Jason)
- Revert "drm/i915: Propagate errors on awaiting already signaled fences" (Jason)
- Revert "drm/i915: Skip over MI_NOOP when parsing" (Jason)
- Revert "drm/i915: Shrink the GEM kmem_caches upon idling" (Daniel)
- Always let TTM handle object migration (Jason)
- Correct the locking and pin pattern for dma-buf (Thomas H, Michael R, Jason)
- Migrate to system at dma-buf attach time (Thomas, Michael R)

- MAJOR refactoring of the GuC backend code to allow for enabling on Gen11+
  (Matt B, John, Michal Wa., Fernando, Daniele, Vinay)
- Update GuC firmware interface to v62.0.0 (John, Michal Wa., Matt B)
- Add GuCRC feature to hand over the control of HW RC6 to the GuC on
  Gen12+ when GuC submission is enabled (Vinay, Sujaritha, Daniele,
  John, Tvrtko)
- Use the correct IRQ during resume and eliminate DRM IRQ midlayer (Thomas Z)
- Add pipelined page migration and clearing (Chris, Thomas H)
- Use TTM for system memory on discrete (Thomas H)
- Implement object migration for display vs. dma-buf (Thomas H)
- Perform execbuffer object locking as a separate step (Thomas H)
- Add support for explicit L3BANK steering (Matt, Daniele)
- Remove duplicated call to ops->pread (Daniel)
- Fix pagefault disabling in the first execbuf slowpath (Daniel)
- Simplify userptr locking (Thomas H)
- Improvements to the GuC CTB code (Matt B, John)
- Make GT workaround upper bounds exclusive (Matt R)
- Check for nomodeset in i915_init() first (Daniel)
- Delete now unused gpu reloc code (Daniel)

- Document RFC plans for GuC submission, DRM scheduler and new parallel
  submit uAPI (Matt B)
- Reintroduce buddy allocator this time with TTM (Matt A)
- Support forcing page size with LMEM (Matt A)
- Add i915_sched_engine to abstract a submission queue between backends (Matt B)
- Use accelerated move in TTM (Ram)
- Fix memory leaks from TTM backend (Thomas H)
- Introduce WW transaction helper (Thomas H)
- Improve debug Kconfig texts a bit (Daniel)
- Unify user object creation code (Jason)
- Use a table for i915_init/exit (Jason)
- Move slabs to module init/exit (Daniel)
- Remove now unused i915_globals (Daniel)
- Extract i915_module.c (Daniel)

- Consistently use adl-p/adl-s in WA comments (Jose)
- Finish INTEL_GEN and friends conversion (Lucas)
- Correct variable/function namings (Lucas)
- Code checker fixes (Wan, Matt A)
- Tracepoint improvements (Matt B)
- Kerneldoc improvements (Tvrtko, Jason, Matt A, Maarten)
- Selftest improvements (Chris, Matt A, Tejas, Thomas H, John, Matt B,
  Rahul, Vinay)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YQ0JmYiXhGskNcrI@jlahtine-mobl.ger.corp.intel.com
2021-08-12 09:56:04 +10:00
Daniel Vetter 8e02cceb1f drm/i915: delete gpu reloc code
It's already removed, this just garbage collects it all.

v2: Rebase over s/GEN/GRAPHICS_VER/

v3: Also ditch eb.reloc_pool and eb.reloc_context (Maarten)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210803124833.3817354-2-daniel.vetter@ffwll.ch
2021-08-05 00:25:48 +02:00
Daniel Vetter ce13c78fa9 drm/i915: Disable gpu relocations
Media userspace was the last userspace to still use them, and they
converted now too:

144020c377

This means no reason anymore to make relocations faster than they've
been for the first 9 years of gem. This code was added in

commit 7dd4f6729f
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jun 16 15:05:24 2017 +0100

    drm/i915: Async GPU relocation processing

Furthermore there's pretty strong indications it's buggy, since the
code to use it by default as the only option had to be reverted:

commit ad5d95e4d5
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Sep 8 15:41:17 2020 +1000

    Revert "drm/i915/gem: Async GPU relocations only"

This code just disables gpu relocations, leaving the garbage
collection for later patches and more importantly, much less confusing
diff. Also given how much headaches this code has caused in the past,
letting this soak for a bit seems justified.

Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210803124833.3817354-1-daniel.vetter@ffwll.ch
2021-08-05 00:24:52 +02:00
Dave Airlie 35482f9dc5 Linux 5.14-rc3
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmD95yIeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGqp0H/j/xHL20EHaUJOaV
 iJjnfGyjtnkLC5FCoV/q/v9sFuSW2p4W1nyF8/eIgVKObef94Mg4/xxaHQrWIM56
 cbzK9aIcD9InAuImJ6lju4fqjNmFrt2x7mhfzjPKqmhfINfZ5CohpLFN5XdOwzYC
 l+ZgmUUl7GLDAND2M6rtkc7AOk4qTyAySDvvPFELE/uNgV4EKaENSIWofHhEzW5v
 Yk+4agawaFTfa6H9+uMVYZBOcEKwheQ0E2tcOJvHJT8Mwm8MFoC/B7fLY5zxIdN2
 7A7r/7qbSQmSDSjOgwKS4ZOjom0xGSD+V+596SzET6jkbahR2HJ/mrFvmD7GNEoW
 OWJPjzI=
 =vzIM
 -----END PGP SIGNATURE-----

Backmerge tag 'v5.14-rc3' into drm-next

Linux 5.14-rc3

Daniel said we should pull the nouveau fix from fixes in here, probably
a good plan.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-07-26 09:27:59 +10:00
Jason Ekstrand c9d9fdbc10 drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser"
This reverts 686c7c35ab ("drm/i915/gem: Asynchronous cmdparser").  The
justification for this commit in the git history was a vague comment
about getting it out from under the struct_mutex.  While this may
improve perf for some workloads on Gen7 platforms where we rely on the
command parser for features such as indirect rendering, no numbers were
provided to prove such an improvement.  It claims to closed two
gitlab/bugzilla issues but with no explanation whatsoever as to why or
what bug it's fixing.

Meanwhile, by moving command parsing off to an async callback, it leaves
us with a problem of what to do on error.  When things were synchronous,
EXECBUFFER2 would fail with an error code if parsing failed.  When
moving it to async, we needed another way to handle that error and the
solution employed was to set an error on the dma_fence and then trust
that said error gets propagated to the client eventually.  Moving back
to synchronous will help us untangle the fence error propagation mess.

This also reverts most of 0edbb9ba1b ("drm/i915: Move cmd parser
pinning to execbuffer") which is a refactor of some of our allocation
paths for asynchronous parsing.  Now that everything is synchronous, we
don't need it.

v2 (Daniel Vetter):
 - Add stabel Cc and Fixes tag

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: <stable@vger.kernel.org> # v5.6+
Fixes: 9e31c1fe45 ("drm/i915: Propagate errors on awaiting already signaled fences")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210714193419.1459723-2-jason@jlekstrand.net
(cherry picked from commit 93b7133041)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-07-19 09:55:20 -04:00
Jason Ekstrand 93b7133041 drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser"
This reverts 686c7c35ab ("drm/i915/gem: Asynchronous cmdparser").  The
justification for this commit in the git history was a vague comment
about getting it out from under the struct_mutex.  While this may
improve perf for some workloads on Gen7 platforms where we rely on the
command parser for features such as indirect rendering, no numbers were
provided to prove such an improvement.  It claims to closed two
gitlab/bugzilla issues but with no explanation whatsoever as to why or
what bug it's fixing.

Meanwhile, by moving command parsing off to an async callback, it leaves
us with a problem of what to do on error.  When things were synchronous,
EXECBUFFER2 would fail with an error code if parsing failed.  When
moving it to async, we needed another way to handle that error and the
solution employed was to set an error on the dma_fence and then trust
that said error gets propagated to the client eventually.  Moving back
to synchronous will help us untangle the fence error propagation mess.

This also reverts most of 0edbb9ba1b ("drm/i915: Move cmd parser
pinning to execbuffer") which is a refactor of some of our allocation
paths for asynchronous parsing.  Now that everything is synchronous, we
don't need it.

v2 (Daniel Vetter):
 - Add stabel Cc and Fixes tag

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: <stable@vger.kernel.org> # v5.6+
Fixes: 9e31c1fe45 ("drm/i915: Propagate errors on awaiting already signaled fences")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210714193419.1459723-2-jason@jlekstrand.net
2021-07-16 21:46:52 +02:00
Jason Ekstrand 046d1660da drm/i915/gem: Return an error ptr from context_lookup
We're about to start doing lazy context creation which means contexts
get created in i915_gem_context_lookup and we may start having more
errors than -ENOENT.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-23-jason@jlekstrand.net
2021-07-08 19:48:23 +02:00
Jason Ekstrand 5ac545b8b0 drm/i915/request: Remove the hook from await_execution
This was only ever used for FENCE_SUBMIT automatic engine selection
which was removed in the previous commit.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-12-jason@jlekstrand.net
2021-07-08 19:46:14 +02:00
Jason Ekstrand dd4f1bbae8 drm/i915/gem: Remove engine auto-magic with FENCE_SUBMIT (v2)
Even though FENCE_SUBMIT is only documented to wait until the request in
the in-fence starts instead of waiting until it completes, it has a bit
more magic than that.  If FENCE_SUBMIT is used to submit something to a
balanced engine, we would wait to assign engines until the primary
request was ready to start and then attempt to assign it to a different
engine than the primary.  There is an IGT test (the bonded-slice subtest
of gem_exec_balancer) which exercises this by submitting a primary batch
to a specific VCS and then using FENCE_SUBMIT to submit a secondary
which can run on any VCS and have i915 figure out which VCS to run it on
such that they can run in parallel.

However, this functionality has never been used in the real world.  The
media driver (the only user of FENCE_SUBMIT) always picks exactly two
physical engines to bond and never asks us to pick which to use.

v2 (Daniel Vetter):
 - Mention the exact IGT test this breaks

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-11-jason@jlekstrand.net
2021-07-08 19:44:31 +02:00
Jason Ekstrand 00dae4d3d3 drm/i915: Implement SINGLE_TIMELINE with a syncobj (v4)
This API is entirely unnecessary and I'd love to get rid of it.  If
userspace wants a single timeline across multiple contexts, they can
either use implicit synchronization or a syncobj, both of which existed
at the time this feature landed.  The justification given at the time
was that it would help GL drivers which are inherently single-timeline.
However, neither of our GL drivers actually wanted the feature.  i965
was already in maintenance mode at the time and iris uses syncobj for
everything.

Unfortunately, as much as I'd love to get rid of it, it is used by the
media driver so we can't do that.  We can, however, do the next-best
thing which is to embed a syncobj in the context and do exactly what
we'd expect from userspace internally.  This isn't an entirely identical
implementation because it's no longer atomic if userspace races with
itself by calling execbuffer2 twice simultaneously from different
threads.  It won't crash in that case; it just doesn't guarantee any
ordering between those two submits.  It also means that sync files
exported from different engines on a SINGLE_TIMELINE context will have
different fence contexts.  This is visible to userspace if it looks at
the obj_name field of sync_fence_info.

Moving SINGLE_TIMELINE to a syncobj emulation has a couple of technical
advantages beyond mere annoyance.  One is that intel_timeline is no
longer an api-visible object and can remain entirely an implementation
detail.  This may be advantageous as we make scheduler changes going
forward.  Second is that, together with deleting the CLONE_CONTEXT API,
we should now have a 1:1 mapping between intel_context and
intel_timeline which may help us reduce locking.

v2 (Tvrtko Ursulin):
 - Update the comment on i915_gem_context::syncobj to mention that it's
   an emulation and the possible race if userspace calls execbuffer2
   twice on the same context concurrently.
v2 (Jason Ekstrand):
 - Wrap the checks for eb.gem_context->syncobj in unlikely()
 - Drop the dma_fence reference
 - Improved commit message

v3 (Jason Ekstrand):
 - Move the dma_fence_put() to before the error exit

v4 (Tvrtko Ursulin):
 - Add a comment about fence contexts to the commit message

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-8-jason@jlekstrand.net
2021-07-08 19:44:11 +02:00
Jason Ekstrand 6ff6d61dd2 drm/i915: Drop I915_CONTEXT_PARAM_NO_ZEROMAP
The idea behind this param is to support OpenCL drivers with relocations
because OpenCL reserves 0x0 for NULL and, if we placed memory there, it
would confuse CL kernels.  It was originally sent out as part of a patch
series including libdrm [1] and Beignet [2] support.  However, the
libdrm and Beignet patches never landed in their respective upstream
projects so this API has never been used.  It's never been used in Mesa
or any other driver, either.

Dropping this API allows us to delete a small bit of code.

[1]: https://lists.freedesktop.org/archives/intel-gfx/2015-May/067030.html
[2]: https://lists.freedesktop.org/archives/intel-gfx/2015-May/067031.html

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-4-jason@jlekstrand.net
2021-07-08 19:43:43 +02:00
Daniel Vetter ca319ee9ca drm/i915/eb: Fix pagefault disabling in the first slowpath
In

commit ebc0808fa2
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Oct 18 13:02:51 2016 +0100

    drm/i915: Restrict pagefault disabling to just around copy_from_user()

we entirely missed that there's a slow path call to eb_relocate_entry
(or i915_gem_execbuffer_relocate_entry as it was called back then)
which was left fully wrapped by pagefault_disable/enable() calls.
Previously any issues with blocking calls where handled by the
following code:

	/* we can't wait for rendering with pagefaults disabled */
	if (pagefault_disabled() && !object_is_idle(obj))
		return -EFAULT;

Now at this point the prefaulting was still around, which means in
normal applications it was very hard to hit this bug. No idea why the
regressions in igts weren't caught.

Now this all changed big time with 2 patches merged closely together.

First

commit 2889caa923
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jun 16 15:05:19 2017 +0100

    drm/i915: Eliminate lots of iterations over the execobjects array

removes the prefaulting from the first relocation path, pushing it into
the first slowpath (of which this patch added a total of 3 escalation
levels). This would have really quickly uncovered the above bug, were
it not for immediate adding a duct-tape on top with

commit 7dd4f6729f
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jun 16 15:05:24 2017 +0100

    drm/i915: Async GPU relocation processing

by pushing all all the relocation patching to the gpu if the buffer
was busy, which avoided all the possible blocking calls.

The entire slowpath was then furthermore ditched in

commit 7dc8f11437
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Mar 11 16:03:10 2020 +0000

        drm/i915/gem: Drop relocation slowpath

and resurrected in

commit fd1500fcd4
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Wed Aug 19 16:08:43 2020 +0200

        Revert "drm/i915/gem: Drop relocation slowpath".

but this did not further impact what's going on.

Since pagefault_disable/enable is an atomic section, any sleeping in
there is prohibited, and we definitely do that without gpu relocations
since we have to wait for the gpu usage to finish before we can patch
up the relocations.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618214503.1773805-1-daniel.vetter@ffwll.ch
2021-06-21 17:45:37 +02:00
Thomas Hellström 5cd57f676b drm/i915: Perform execbuffer object locking as a separate step
To help avoid evicting already resident buffers from the batch we're
processing, perform locking as a separate step.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210615113600.30660-1-thomas.hellstrom@linux.intel.com
2021-06-17 14:39:58 +01:00
Christian König 440d0f12b5 dma-buf: add dma_fence_chain_alloc/free v3
Add a common allocation helper. Cleaning up the mix of kzalloc/kmalloc
and some unused code in the selftest.

v2: polish kernel doc a bit
v3: polish kernel doc even a bit more

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210611120301.10595-3-christian.koenig@amd.com
2021-06-14 19:38:34 +02:00
Thomas Hellström b4b9731b02 drm/i915: Simplify userptr locking
Use an rwlock instead of spinlock for the global notifier lock
to reduce risk of contention in execbuf.

Protect object state with the object lock whenever possible rather
than with the global notifier lock

Don't take an explicit page_ref in userptr_submit_init() but rather
call get_pages() after obtaining the page list so that
get_pages() holds the page_ref. This means we don't need to call
userptr_submit_fini(), which is needed to avoid awkward locking
in our upcoming VM_BIND code.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210610143525.624677-1-thomas.hellstrom@linux.intel.com
2021-06-14 12:23:49 +01:00
Dave Airlie 2a7005c8a3 Merge tag 'drm-intel-gt-next-2021-06-10' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:

- Disable mmap ioctl for gen12+ (excl. TGL-LP)
- Start enabling HuC loading by default for upcoming Gen12+
  platforms (excludes TGL and RKL)

Core Changes:

- Backmerge of drm-next

Driver Changes:

- Revert "i915: use io_mapping_map_user" (Eero, Matt A)
- Initialize the TTM device and memory managers (Thomas)
- Major rework to the GuC submission backend to prepare
  for enabling on new platforms (Michal Wa., Daniele,
  Matt B, Rodrigo)
- Fix i915_sg_page_sizes to record dma segments rather
  than physical pages (Thomas)

- Locking rework to prep for TTM conversion (Thomas)
- Replace IS_GEN and friends with GRAPHICS_VER (Lucas)
- Use DEVICE_ATTR_RO macro (Yue)
- Static code checker fixes (Zhihao)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YMHeDxg9VLiFtyn3@jlahtine-mobl.ger.corp.intel.com
2021-06-11 13:37:34 +10:00
Christian König d3fae3b3da dma-buf: drop the _rcu postfix on function names v3
The functions can be called both in _rcu context as well
as while holding the lock.

v2: add some kerneldoc as suggested by Daniel
v3: fix indentation

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602111714.212426-7-christian.koenig@amd.com
2021-06-06 11:19:51 +02:00
Lucas De Marchi 40e1956ec5 drm/i915/gem: replace IS_GEN and friends with GRAPHICS_VER
This was done by the following semantic patch:

	@@ expression i915; @@
	- INTEL_GEN(i915)
	+ GRAPHICS_VER(i915)

	@@ expression i915; expression E; @@
	- INTEL_GEN(i915) >= E
	+ GRAPHICS_VER(i915) >= E

	@@ expression dev_priv; expression E; @@
	- !IS_GEN(dev_priv, E)
	+ GRAPHICS_VER(dev_priv) != E

	@@ expression dev_priv; expression E; @@
	- IS_GEN(dev_priv, E)
	+ GRAPHICS_VER(dev_priv) == E

	@@
	expression dev_priv;
	expression from, until;
	@@
	- IS_GEN_RANGE(dev_priv, from, until)
	+ IS_GRAPHICS_VER(dev_priv, from, until)

	@def@
	expression E;
	identifier id =~ "^gen$";
	@@
	- id = GRAPHICS_VER(E)
	+ ver = GRAPHICS_VER(E)

	@@
	identifier def.id;
	@@
	- id
	+ ver

It also takes care of renaming the variable we assign to GRAPHICS_VER()
so to use "ver" rather than "gen".

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210605155356.4183026-4-lucas.demarchi@intel.com
2021-06-05 15:09:43 -07:00
Lucas De Marchi 8802190f64 drm/i915: eliminate remaining uses of intel_device_info->gen
Replace gen with the new graphics_ver value and use GRAPHICS_VER()
in those places.

Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210413051002.92589-10-lucas.demarchi@intel.com
2021-04-14 13:05:05 +03:00
Maarten Lankhorst c93987753a drm/i915: Defer pin calls in buffer pool until first use by caller.
We need to take the obj lock to pin pages, so wait until the callers
have done so, before making the object unshrinkable.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-30-maarten.lankhorst@linux.intel.com
2021-03-24 17:27:20 +01:00
Maarten Lankhorst ed29c26911 drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v7.
Instead of doing what we do currently, which will never work with
PROVE_LOCKING, do the same as AMD does, and something similar to
relocation slowpath. When all locks are dropped, we acquire the
pages for pinning. When the locks are taken, we transfer those
pages in .get_pages() to the bo. As a final check before installing
the fences, we ensure that the mmu notifier was not called; if it is,
we return -EAGAIN to userspace to signal it has to start over.

Changes since v1:
- Unbinding is done in submit_init only. submit_begin() removed.
- MMU_NOTFIER -> MMU_NOTIFIER
Changes since v2:
- Make i915->mm.notifier a spinlock.
Changes since v3:
- Add WARN_ON if there are any page references left, should have been 0.
- Return 0 on success in submit_init(), bug from spinlock conversion.
- Release pvec outside of notifier_lock (Thomas).
Changes since v4:
- Mention why we're clearing eb->[i + 1].vma in the code. (Thomas)
- Actually check all invalidations in eb_move_to_gpu. (Thomas)
- Do not wait when process is exiting to fix gem_ctx_persistence.userptr.
Changes since v5:
- Clarify why check on PF_EXITING is (temporarily) required.
Changes since v6:
- Ensure userptr validity is checked in set_domain through a special path.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
[danvet: s/kfree/kvfree/ in i915_gem_object_userptr_drop_ref in the
previous review round, but which got lost. The other open questions
around page refcount are imo better discussed in a separate series,
with amdgpu folks involved].
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-17-maarten.lankhorst@linux.intel.com
2021-03-24 17:27:19 +01:00
Maarten Lankhorst 20ee27bd84 drm/i915: Make compilation of userptr code depend on MMU_NOTIFIER.
Now that unsynchronized mappings are removed, the only time userptr
works is when the MMU notifier is enabled. Put all of the userptr
code behind a mmu notifier ifdef.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-16-maarten.lankhorst@linux.intel.com
2021-03-24 17:27:19 +01:00
Maarten Lankhorst bfaae47db3 drm/i915: make lockdep slightly happier about execbuf.
As soon as we install fences, we should stop allocating memory
in order to prevent any potential deadlocks.

This is required later on, when we start adding support for
dma-fence annotations.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-11-maarten.lankhorst@linux.intel.com
2021-03-24 11:51:38 +01:00
Maarten Lankhorst 237647f4dd drm/i915: Add missing -EDEADLK handling to execbuf pinning, v2.
i915_vma_pin may fail with -EDEADLK when we start locking page tables,
so ensure we handle this correctly.

Changes since v1:
- Drop -EDEADLK todo, this commit handles it.
- Change eb_pin_vma from sort-of-bool + -EDEADLK to a proper int. (Matt)

Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-5-maarten.lankhorst@linux.intel.com
2021-03-24 11:41:18 +01:00
Maarten Lankhorst 0edbb9ba1b drm/i915: Move cmd parser pinning to execbuffer
We need to get rid of allocations in the cmd parser, because it needs
to be called from a signaling context, first move all pinning to
execbuf, where we already hold all locks.

Allocate jump_whitelist in the execbuffer, and add annotations around
intel_engine_cmd_parser(), to ensure we only call the command parser
without allocating any memory, or taking any locks we're not supposed to.

Because i915_gem_object_get_page() may also allocate memory, add a
path to i915_gem_object_get_sg() that prevents memory allocations,
and walk the sg list manually. It should be similarly fast.

This has the added benefit of being able to catch all memory allocation
errors before the point of no return, and return -ENOMEM safely to the
execbuf submitter.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-4-maarten.lankhorst@linux.intel.com
2021-03-24 11:39:59 +01:00
Jason Ekstrand 2eb8e1a69d drm/i915/gem: Drop relocation support on all new hardware (v6)
The Vulkan driver in Mesa for Intel hardware never uses relocations if
it's running on a version of i915 that supports at least softpin which
all versions of i915 supporting Gen12 do.  On the OpenGL side, Gen12+ is
only supported by iris which never uses relocations.  The older i965
driver in Mesa does use relocations but it only supports Intel hardware
through Gen11 and has been deprecated for all hardware Gen9+.  The
compute driver also never uses relocations.  This only leaves the media
driver which is supposed to be switching to softpin going forward.
Making softpin a requirement for all future hardware seems reasonable.

There is one piece of hardware enabled by default in i915: RKL which was
enabled by e22fa6f0a9 which has not yet landed in drm-next so this
almost but not really a userspace API change for RKL.  If it becomes a
problem, we can always add !IS_ROCKETLAKE(eb->i915) to the condition.

Rejecting relocations starting with newer Gen12 platforms has the
benefit that we don't have to bother supporting it on platforms with
local memory.  Given how much CPU touching of memory is required for
relocations, not having to do so on platforms where not all memory is
directly CPU-accessible carries significant advantages.

v2 (Jason Ekstrand):
 - Allow TGL-LP platforms as they've already shipped

v3 (Jason Ekstrand):
 - WARN_ON platforms with LMEM support in case the check is wrong

v4 (Jason Ekstrand):
 - Call out Rocket Lake in the commit message

v5 (Jason Ekstrand):
 - Drop the HAS_LMEM check as it's already covered by the version check

v6 (Jason Ekstrand):
 - Move the check to eb_validate_vma() with all the other exec_object
   validation checks.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210317234014.2271006-3-jason@jlekstrand.net
2021-03-18 14:26:10 +01:00
Jason Ekstrand b5b6f6a610 drm/i915/gem: Drop legacy execbuffer support (v2)
libdrm has supported the newer execbuffer2 ioctl and using it by default
when it exists since libdrm commit b50964027bef which landed Mar 2, 2010.
The i915 and i965 drivers in Mesa at the time both used libdrm and so
did the Intel X11 back-end.  The SNA back-end for X11 has always used
execbuffer2.

v2 (Jason Ekstrand):
 - Add a comment saying what Linux version it's being removed in.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Keith Packard <keithp@keithp.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210317234014.2271006-2-jason@jlekstrand.net
2021-03-18 14:25:42 +01:00
Matthew Auld 8f47c8c3b0 drm/i915/pool: constrain pool objects by mapping type
In a few places we always end up mapping the pool object with the FORCE
constraint(to prevent hitting -EBUSY) which will destroy the cached
mapping if it has a different type. As a simple first step, make the
mapping type part of the pool interface, where the behaviour is to only
give out pool objects which match the requested mapping type.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119133106.66294-4-matthew.auld@intel.com
2021-01-19 20:36:28 +00:00
Chris Wilson 9b3a8f558d drm/i915/gt: Disable arbitration on no-preempt requests
If a request is submitted and known to require no preemption, disable
arbitration around the batch which prevents the HW from handling a
preemption request during the payload.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210108204026.20682-7-chris@chris-wilson.co.uk
2021-01-08 21:35:57 +00:00
Arnd Bergmann bb80d8784d drm/i915: fix shift warning
Randconfig builds on 32-bit machines show lots of warnings for
the i915 driver for passing a 32-bit value into __const_hweight64():

drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:2584:9: error: shift count >= width of type [-Werror,-Wshift-count-overflow]
        return hweight64(VDBOX_MASK(&i915->gt));
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/asm-generic/bitops/const_hweight.h:29:49: note: expanded from macro 'hweight64'
 #define hweight64(w) (__builtin_constant_p(w) ? __const_hweight64(w) : __arch_hweight64(w))

Change it to hweight_long() to avoid the warning.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210103135158.3591442-1-arnd@kernel.org
2021-01-03 14:12:48 +00:00
Matthew Auld 26ebc511e7 drm/i915: clear the gpu reloc batch
The reloc batch is short lived but can exist in the user visible ppGTT,
and since it's backed by an internal object, which lacks page clearing,
we should take care to clear it upfront.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224151358.401345-2-matthew.auld@intel.com
Cc: stable@vger.kernel.org
2020-12-24 15:25:41 +00:00
Chris Wilson 45233ab2d0 drm/i915/gt: Move gen8 CS emitters into gen8_engine_cs.h
Reduce the pollution of intel_engine.h by moving gen8_emit_pipe_control
and friends to gen8_engine_cs.h

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201216135452.6063-1-chris@chris-wilson.co.uk
2020-12-16 15:55:43 +00:00
Chris Wilson 5f22cc0b13 drm/i915: Fix mismatch between misplaced vma check and vma insert
When inserting a VMA, we restrict the placement to the low 4G unless the
caller opts into using the full range. This was done to allow usersapce
the opportunity to transition slowly from a 32b address space, and to
avoid breaking inherent 32b assumptions of some commands.

However, for insert we limited ourselves to 4G-4K, but on verification
we allowed the full 4G. This causes some attempts to bind a new buffer
to sporadically fail with -ENOSPC, but at other times be bound
successfully.

commit 48ea1e32c3 ("drm/i915/gen9: Set PIN_ZONE_4G end to 4GB - 1
page") suggests that there is a genuine problem with stateless addressing
that cannot utilize the last page in 4G and so we purposefully excluded
it. This means that the quick pin pass may cause us to utilize a buggy
placement.

Reported-by: CQ Tang <cq.tang@intel.com>
Testcase: igt/gem_exec_params/larger-than-life-batch
Fixes: 48ea1e32c3 ("drm/i915/gen9: Set PIN_ZONE_4G end to 4GB - 1 page")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: CQ Tang <cq.tang@intel.com>
Reviewed-by: CQ Tang <cq.tang@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v4.5+
Link: https://patchwork.freedesktop.org/patch/msgid/20201216092951.7124-1-chris@chris-wilson.co.uk
2020-12-16 11:06:38 +00:00
Chris Wilson e9f4829f95 drm/i915/gem: Drop false !i915_vma_is_closed assertion
Closed vma are protected by the GT wakeref held as we lookup the vma, so
we know that the vma will not be freed as we process it for the execbuf.
Instead we expect to catch the closed status of the context, and simply
allow the close-race on an individual vma to be washed away.

Longer term, the GT wakeref protection will be removed by explicit
vma.kref tracking.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2245
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201207193824.18114-1-chris@chris-wilson.co.uk
2020-12-08 14:01:51 +00:00
Chris Wilson ba38b79eae drm/i915/gem: Propagate error from cancelled submit due to context closure
In the course of discovering and closing many races with context closure
and execbuf submission, since commit 61231f6bd0 ("drm/i915/gem: Check
that the context wasn't closed during setup") we started checking that
the context was not closed by another userspace thread during the execbuf
ioctl. In doing so we cancelled the inflight request (by telling it to be
skipped), but kept reporting success since we do submit a request, albeit
one that doesn't execute. As the error is known before we return from the
ioctl, we can report the error we detect immediately, rather than leave
it on the fence status. With the immediate propagation of the error, it
is easier for userspace to handle.

Fixes: 61231f6bd0 ("drm/i915/gem: Check that the context wasn't closed during setup")
Testcase: igt/gem_ctx_exec/basic-close-race
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201203103432.31526-1-chris@chris-wilson.co.uk
2020-12-04 11:27:57 +00:00
Chris Wilson d5e8782129 drm/i915/gem: Support parsing of oversize batches
Matthew Auld noted that on more recent systems (such as the parser for
gen9) we may have objects that are larger than expected by the GEM uAPI
(i.e. greater than u32). These objects would have incorrect implicit
batch lengths, causing the parser to reject them for being incomplete,
or worse.

Based on a patch by Matthew Auld.

Reported-by: Matthew Auld <matthew.auld@intel.com>
Fixes: 435e8fc059 ("drm/i915: Allow parsing of unsized batches")
Testcase: igt/gem_exec_params/larger-than-life-batch
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20201015115954.871-1-chris@chris-wilson.co.uk
(cherry picked from commit 57b2d834bf)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-10-19 13:29:50 -04:00
Chris Wilson c60b93cd48 drm/i915: Avoid mixing integer types during batch copies
Be consistent and use unsigned long throughout the chunk copies to
avoid the inherent clumsiness of mixing integer types of different
widths and signs. Failing to take acount of a wider unsigned type when
using min_t can lead to treating it as a negative, only for it flip back
to a large unsigned value after passing a boundary check.

Fixes: ed13033f02 ("drm/i915/cmdparser: Only cache the dst vmap")
Testcase: igt/gen9_exec_parse/bb-large
Reported-by: "Candelaria, Jared" <jared.candelaria@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: "Candelaria, Jared" <jared.candelaria@intel.com>
Cc: "Bloomfield, Jon" <jon.bloomfield@intel.com>
Cc: <stable@vger.kernel.org> # v4.9+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928215942.31917-1-chris@chris-wilson.co.uk
(cherry picked from commit b7eeb2b413)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30 14:24:54 -04:00
Maarten Lankhorst 166774a2c2 drm/i915: Fix slightly botched merge in __reloc_entry_gpu
This function should be an int, not a bool.

Presumably because we had the same 2 reverts in a slightly different
way, git got confused.

Thanks to Dan for reporting. :)

The conflict is between the 3 reverts in drm-fixes:

4993a8a378 ("Revert "drm/i915: Remove i915_gem_object_get_dirty_page()"")
ad5d95e4d5 ("Revert "drm/i915/gem: Async GPU relocations only"")
20561da3a2 ("Revert "drm/i915/gem: Delete unused code"")

And the slightly different combined revert in drm-intel-gt-next, but
with the same goal:

102a0a9051 ("Revert "drm/i915/gem: Async GPU relocations only"")

In the merge commit 1f4b2aca79 ("Merge tag
'drm-intel-gt-next-2020-09-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next") things
went wrong, but the merge commit view now doesn't show any conflict
anymore (as git tends to do when the resolution picks one or the other
branch).

The need to handle other than just true/false error codes in
__reloc_entry_gpu was added in the dma_resv locking changes in
c43ce12328 ("drm/i915: Use per object locking in execbuf, v12.")

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dave Airlie <airlied@redhat.com>
[danvet: Explain this entire saga a lot better, adding tons of commit
references. Also note that this was merged before full intel-gfx-CI
results, only after BAT, since the breakage at the BAT run is already
severe enough to block all pre-merge testing.]
Fixes: 1f4b2aca79 ("Merge tag 'drm-intel-gt-next-2020-09-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next")
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200910111225.2184193-1-maarten.lankhorst@linux.intel.com
2020-09-10 15:19:10 +02:00
Dave Airlie 1f4b2aca79 Merge tag 'drm-intel-gt-next-2020-09-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
(Same content as drm-intel-gt-next-2020-09-04-3, S-o-b's added)

UAPI Changes:
(- Potential implicit changes from WW locking refactoring)

Cross-subsystem Changes:
(- WW locking changes should align the i915 locking more with others)

Driver Changes:

- MAJOR: Apply WW locking across the driver (Maarten)

- Reverts for 5 commits to make applying WW locking faster (Maarten)
- Disable preparser around invalidations on Tigerlake for non-RCS engines (Chris)
- Add missing dma_fence_put() for error case of syncobj timeline (Chris)
- Parse command buffer earlier in eb_relocate(slow) to facilitate backoff (Maarten)
- Pin engine before pinning all objects (Maarten)
- Rework intel_context pinning to do everything outside of pin_mutex (Maarten)

- Avoid tracking GEM context until registered (Cc: stable, Chris)
- Provide a fastpath for waiting on vma bindings (Chris)
- Fixes to preempt-to-busy mechanism (Chris)
- Distinguish the virtual breadcrumbs from the irq breadcrumbs (Chris)
- Switch to object allocations for page directories (Chris)
- Hold context/request reference while breadcrumbs are active (Chris)
- Make sure execbuffer always passes ww state to i915_vma_pin (Maarten)

- Code refactoring to facilitate use of WW locking (Maarten)
- Locking refactoring to use more granular locking (Maarten, Chris)
- Support for multiple pinned timelines per engine (Chris)
- Move complication of I915_GEM_THROTTLE to the ioctl from general code (Chris)
- Make active tracking/vma page-directory stash work preallocated (Chris)
- Avoid flushing submission tasklet too often (Chris)
- Reduce context termination list iteration guard to RCU (Chris)
- Reductions to locking contention (Chris)
- Fixes for issues found by CI (Chris)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Joonas Lahtinen <jlahtine@jlahtine-mobl.ger.corp.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200907130039.GA27766@jlahtine-mobl.ger.corp.intel.com
2020-09-09 07:55:22 +10:00
Dave Airlie 61d98185b4 Backmerge drm-fixes merge into drm-next
Commit '6f6a73c8b715d595977774d48450a734297ab21f' from Linus' tree

The fixes reverts cause a bit of a conflict pain with intel next,
start fixing it up here.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-09-09 07:46:32 +10:00