Commit Graph

101 Commits

Author SHA1 Message Date
Jason A. Donenfeld 8032bf1233 treewide: use get_random_u32_below() instead of deprecated function
This is a simple mechanical transformation done by:

@@
expression E;
@@
- prandom_u32_max
+ get_random_u32_below
  (E)

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Reviewed-by: SeongJae Park <sj@kernel.org> # for damon
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-11-18 02:15:15 +01:00
Chris Wilson 6ef7d36212 drm/i915/gt: Restrict forced preemption to the active context
When we submit a new pair of contexts to ELSP for execution, we start a
timer by which point we expect the HW to have switched execution to the
pending contexts. If the promotion to the new pair of contexts has not
occurred, we declare the executing context to have hung and force the
preemption to take place by resetting the engine and resubmitting the
new contexts.

This can lead to an unfair situation where almost all of the preemption
timeout is consumed by the first context which just switches into the
second context immediately prior to the timer firing and triggering the
preemption reset (assuming that the timer interrupts before we process
the CS events for the context switch). The second context hasn't yet had
a chance to yield to the incoming ELSP (and send the ACk for the
promotion) and so ends up being blamed for the reset.

If we see that a context switch has occurred since setting the
preemption timeout, but have not yet received the ACK for the ELSP
promotion, rearm the preemption timer and check again. This is
especially significant if the first context was not schedulable and so
we used the shortest timer possible, greatly increasing the chance of
accidentally blaming the second innocent context.

Fixes: 3a7a92aba8 ("drm/i915/execlists: Force preemption")
Fixes: d12acee84f ("drm/i915/execlists: Cancel banned contexts on schedule-out")
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Tested-by: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220921135258.1714873-1-andrzej.hajda@intel.com
(cherry picked from commit 107ba1a2c7)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2022-09-26 10:33:06 -04:00
Dave Airlie c6a3d73592 Merge tag 'drm-intel-gt-next-2022-06-29' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:

- Expose per tile media freq factor in sysfs (Ashutosh Dixit, Dale B Stimson)
- Document memory residency and Flat-CCS capability of obj (Ramalingam C)
- Disable GETPARAM lookups of I915_PARAM_[SUB]SLICE_MASK on Xe_HP+ (Matt Roper)

Cross-subsystem Changes:

- Rename intel-gtt symbols (Lucas De Marchi)

Core Changes:

Driver Changes:

- Support programming the EU priority in the GuC descriptor (DG2) (Matthew Brost)
- DG2 HuC loading support (Daniele Ceraolo Spurio)
- Fix build error without CONFIG_PM (YueHaibing)
- Enable THP on Icelake and beyond (Tvrtko Ursulin)
- Only setup private tmpfs mount when needed and fix logging (Tvrtko Ursulin)
- Make __guc_reset_context aware of guilty engines (Umesh Nerlige Ramappa)
- DG2 small bar memory probing fixes (Nirmoy Das)
- Remove unnecessary GuC err capture noise (Alan Previn)
- Fix i915_gem_object_ggtt_pin_ww regression on old platforms (Maarten Lankhorst)
- Fix undefined behavior in GuC backend due to shift overflowing the constant (Borislav Petkov)
- New DG2 workarounds (Swathi Dhanavanthri, Anshuman Gupta)
- Report no hwconfig support on ADL-N (Balasubramani Vivekanandan)
- Fix error_state_read ptr + offset use (Alan Previn)
- Expose per tile media freq factor in sysfs (Ashutosh Dixit, Dale B Stimson)
- Fix memory leaks in per-gt sysfs (Ashutosh Dixit)
- Fix dma_resv fence handling in multi-batch execbuf (Nirmoy Das)
- Add extra registers to GPU error dump on Gen11+ (Stuart Summers)
- More PVC+DG2 workarounds (Matt Roper)
- Improve user experience and driver robustness under SIGINT or similar (Tvrtko Ursulin)
- Don't show engine classes not present (Tvrtko Ursulin)
- Improve on suspend / resume time with VT-d enabled (Thomas Hellström)
- Add missing else (katrinzhou)
- Don't leak lmem mapping in vma_evict (Juha-Pekka Heikkila)
- Add smem fallback allocation for dpt (Juha-Pekka Heikkila)
- Tweak the ordering in cpu_write_needs_clflush (Matthew Auld)
- Do not access rq->engine without a reference (Niranjana Vishwanathapura)
- Revert "drm/i915: Hold reference to intel_context over life of i915_request" (Niranjana Vishwanathapura)
- Don't update engine busyness stats too frequently (Alan Previn)
- Add additional steps for Wa_22011802037 for execlist backend (Umesh Nerlige Ramappa)
- Fix a lockdep warning at error capture (Nirmoy Das)

- Ponte Vecchio prep work and new blitter engines (Matt Roper, John Harrison, Lucas De Marchi)
- Read correct RP_STATE_CAP register (PVC) (Matt Roper)
- Define MOCS table for PVC (Ayaz A Siddiqui)
- Driver refactor and support Ponte Vecchio forcewake handling (Matt Roper)
- Remove additional 3D flags from PIPE_CONTROL (Ponte Vecchio) (Stuart Summers)
- XEHPSDV and PVC do not use HuC (Daniele Ceraolo Spurio)
- Extract stepping information from PCI revid (Ponte Vecchio) (Matt Roper)
- Add initial PVC workarounds (Stuart Summers)
- SSEU handling driver refactor and Ponte Vecchio support (Matt Roper)
- GuC depriv applies to PVC (Matt Roper)
- Add register steering (Ponte Vecchio) (Matt Roper)
- Add recommended MMIO setting (Ponte Vecchio) (Matt Roper)

- Move multicast register handling to a dedicated file (Matt Roper)
- Cleanup interface for MCR operations (Matt Roper)
- Extend i915_vma_pin_iomap() (CQ Tang)
- Re-do the intel-gtt split (Lucas De Marchi)
- Correct duplicated/misplaced GT register definitions (Matt Roper)
- Prefer "XEHP_" prefix for registers (Matt Roper)

- Don't use DRM_DEBUG_WARN_ON for unexpected l3bank/mslice config (Tvrtko Ursulin)
- Don't use DRM_DEBUG_WARN_ON for ring unexpectedly not idle (Tvrtko Ursulin)
- Make drop_pages() return bool (Lucas De Marchi)
- Fix CFI violation with show_dynamic_id() (Nathan Chancellor)
- Use i915_probe_error instead of drm_error in GuC code (Vinay Belgaumkar)
- Fix use of static in macro mismatch (Andi Shyti)
- Update tiled blits selftest (Bommu Krishnaiah)
- Future-proof platform checks (Matt Roper)
- Only include what's needed (Jani Nikula)
- remove accidental static from a local variable (Jani Nikula)
- Add global forcewake request to drpc (Vinay Belgaumkar)
- Fix spelling typo in comment (pengfuyuan)
- Increase timeout for live_parallel_switch selftest (Akeem G Abodunrin)
- Use non-blocking H2G for waitboost (Vinay Belgaumkar)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YrwtLM081SQUG1Dc@tursulin-desk
2022-07-01 14:14:52 +10:00
Umesh Nerlige Ramappa 0667429ce6 drm/i915/reset: Add additional steps for Wa_22011802037 for execlist backend
For execlists backend, current implementation of Wa_22011802037 is to
stop the CS before doing a reset of the engine. This WA was further
extended to wait for any pending MI FORCE WAKEUPs before issuing a
reset. Add the extended steps in the execlist path of reset.

In addition, extend the WA to gen11.

v2: (Tvrtko)
- Clarify comments, commit message, fix typos
- Use IS_GRAPHICS_VER for gen 11/12 checks

v3: (Daneile)
- Drop changes to intel_ring_submission since WA does not apply to it
- Log an error if MSG IDLE is not defined for an engine

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Fixes: f6aa0d713c ("drm/i915: Add Wa_22011802037 force cs halt")
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220621192105.2100585-1-umesh.nerlige.ramappa@intel.com
2022-06-27 12:21:22 -07:00
Dave Airlie 805ada63ba - General driver clean-up (Jani, Ville, Julia)
- DG2 enabling (Anusha, Vandita)
 - Fix sparse warnings (Imre, Jani)
 - DMC MMIO range checks (Anusha)
 - Audio related fixes (Jani)
 - Runtime PM fixes (Anshuman)
 - PSR fixes (Jouni, Jose)
 - Media freq factor and per-gt enhancements (Ashutosh, Dale)
 - DSI fixes for ICL+ (Jani)
 - Disable DMC flip queue handlers (Imre)
 - ADL_P voltage swing updates (Balasubramani)
 - Use more the VBT for panel information (Ville, Animesh)
 - Fix on Type-C ports with TBT mode (Vivek)
 - Improve fastset and allow seamless M/N changes (Ville)
 - Accept more fixed modes with VRR/DMRRS panels (Ville)
 - FBC fix (Jose)
 - Remove noise logs (Luca)
 - Disable connector polling for a headless SKU (Jouni)
 - Sanitize display underrun reporting (Ville)
 - ADL-S display PLL w/a (Ville)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAmKzcqMACgkQ+mJfZA7r
 E8rZgwf8CugXS7PcnEoMgnXkywvkGixOuXVZQlAvGPVJTVx8+Yaa2YrzRB+J8gG4
 VfcXLQpQytxSWcFzlXvzYVU0Km9ZechI5xfHVBF7MvLKeX0amSIE9y08QtLQa3Lr
 XhbprCrCas+jt0o66HKxOVQ+Se7fli7Ch8uwG+vGAXOfhpO2K7p//OcGap4KP9q7
 mhl45d91FJpSoNA3mVPWkg6Q22Ojep+Mfltwz0weUqlcNS2vh1HgPSoG5bTRhkAF
 epkFB+WBh3Eci9TrZiUkHuKEPCkyMJJgRz875NHe/R6V4CPlZ+UAcljuAFCiZfr3
 mQJxv0dfSh+vmDuprUrNeNaOlYv6qA==
 =10Pk
 -----END PGP SIGNATURE-----

Merge tag 'drm-intel-next-2022-06-22' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

- General driver clean-up (Jani, Ville, Julia)
- DG2 enabling (Anusha, Vandita)
- Fix sparse warnings (Imre, Jani)
- DMC MMIO range checks (Anusha)
- Audio related fixes (Jani)
- Runtime PM fixes (Anshuman)
- PSR fixes (Jouni, Jose)
- Media freq factor and per-gt enhancements (Ashutosh, Dale)
- DSI fixes for ICL+ (Jani)
- Disable DMC flip queue handlers (Imre)
- ADL_P voltage swing updates (Balasubramani)
- Use more the VBT for panel information (Ville, Animesh)
- Fix on Type-C ports with TBT mode (Vivek)
- Improve fastset and allow seamless M/N changes (Ville)
- Accept more fixed modes with VRR/DMRRS panels (Ville)
- FBC fix (Jose)
- Remove noise logs (Luca)
- Disable connector polling for a headless SKU (Jouni)
- Sanitize display underrun reporting (Ville)
- ADL-S display PLL w/a (Ville)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YrNzP2WTf3WBvpvd@intel.com
2022-06-24 12:07:47 +10:00
Tvrtko Ursulin 45c64ecf97 drm/i915: Improve user experience and driver robustness under SIGINT or similar
We have long standing customer complaints that pressing Ctrl-C (or to the
effect of) causes engine resets with otherwise well behaving programs.

Not only is logging engine resets during normal operation not desirable
since it creates support incidents, but more fundamentally we should avoid
going the engine reset path when we can since any engine reset introduces
a chance of harming an innocent context.

Reason for this undesirable behaviour is that the driver currently does
not distinguish between banned contexts and non-persistent contexts which
have been closed.

To fix this we add the distinction between the two reasons for revoking
contexts, which then allows the strict timeout only be applied to banned,
while innocent contexts (well behaving) can preempt cleanly and exit
without triggering the engine reset path.

Note that the added context exiting category applies both to closed non-
persistent context, and any exiting context when hangcheck has been
disabled by the user.

At the same time we rename the backend operation from 'ban' to 'revoke'
which more accurately describes the actual semantics. (There is no ban at
the backend level since banning is a concept driven by the scheduling
frontend. Backends are simply able to revoke a running context so that
is the more appropriate name chosen.)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220527072452.2225610-1-tvrtko.ursulin@linux.intel.com
2022-06-17 09:03:11 +01:00
Julia Lawall 8ae6649079 drm/i915: fix typos in comments
Spelling mistakes (triple letters) in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220521111145.81697-90-Julia.Lawall@inria.fr
2022-05-24 13:48:30 +03:00
Matthew Brost a5c89f7c43 drm/i915/guc: Support programming the EU priority in the GuC descriptor
In GuC submission mode the EU priority must be updated by the GuC rather
than the driver as the GuC owns the programming of the context descriptor.

Given that the GuC code uses the GuC priorities, we can't use a generic
function using i915 priorities for both execlists and GuC submission.
The existing function has therefore been pushed to the execlists
back-end while a new one has been added for GuC.

v2: correctly use the GuC prio.

Cc: John Harrison <john.c.harrison@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220504234636.2119794-1-daniele.ceraolospurio@intel.com
2022-05-05 15:46:18 -07:00
Chris Wilson 166c44e694 drm/i915/gt: Clear SET_PREDICATE_RESULT prior to executing the ring
Userspace may leave predication enabled upon return from the batch
buffer, which has the consequent of preventing all operation from the
ring from being executed, including all the synchronisation, coherency
control, arbitration and user signaling. This is more than just a local
gpu hang in one client, as the user has the ability to prevent the
kernel from applying critical workarounds and can cause a full GT reset.

We could simply execute MI_SET_PREDICATE upon return from the user
batch, but this has the repercussion of modifying the user's context
state. Instead, we opt to execute a fixup batch which by mixing
predicated operations can determine the state of the
SET_PREDICATE_RESULT register and restore it prior to the next userspace
batch. This allows us to protect the kernel's ring without changing the
uABI.

Suggested-by: Zbigniew Kempczynski <zbigniew.kempczynski@intel.com>
Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Cc: Zbigniew Kempczynski <zbigniew.kempczynski@intel.com>
Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220425152317.4275-4-ramalingam.c@intel.com
2022-05-02 15:18:09 +05:30
Rodrigo Vivi e1e1f4e325 Merge drm/drm-next into drm-intel-gt-next
In order to get the GSC Support merged on drm-intel-gt-next
in a clean fashion we needed this ATS-M patch to avoid
conflict in i915_pci.c:

commit 412c942bdf ("drm/i915/ats-m: add ATS-M platform info")

--

Fixing a silent conflict on drivers/gpu/drm/i915/gt/intel_gt_gmch.c:
-       if (!intel_vtd_active(i915))
+       if (!i915_vtd_active(i915))

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2022-04-21 13:48:26 -04:00
Jani Nikula 83970cd63b Merge drm/drm-next into drm-intel-next
Sync up with v5.18-rc1, in particular to get 5e3094cfd9
("drm/i915/xehpsdv: Add has_flat_ccs to device info").

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2022-04-11 16:01:56 +03:00
Tvrtko Ursulin bb6287cb18 drm/i915: Track context current active time
Track context active (on hardware) status together with the start
timestamp.

This will be used to provide better granularity of context
runtime reporting in conjunction with already tracked pphwsp accumulated
runtime.

The latter is only updated on context save so does not give us visibility
to any currently executing work.

As part of the patch the existing runtime tracking data is moved under the
new ce->stats member and updated under the seqlock. This provides the
ability to atomically read out accumulated plus active runtime.

v2:
 * Rename and make __intel_context_get_active_time unlocked.

v3:
 * Use GRAPHICS_VER.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> #  v1
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220401142205.3123159-6-tvrtko.ursulin@linux.intel.com
2022-04-05 08:39:10 +01:00
Alan Previn a6f0f9cf33 drm/i915/guc: Plumb GuC-capture into gpu_coredump
Add a flags parameter through all of the coredump creation
functions. Add a bitmask flag to indicate if the top
level gpu_coredump event is triggered in response to
a GuC context reset notification.

Using that flag, ensure all coredump functions that
read or print mmio-register values related to work submission
or command-streamer engines are skipped and replaced with
a calls guc-capture module equivalent functions to retrieve
or print the register dump.

While here, split out display related register reading
and printing into its own function that is called agnostic
to whether GuC had triggered the reset.

For now, introduce an empty printing function that can
filled in on a subsequent patch just to handle formatting.

Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-13-alan.previn.teres.alexis@intel.com
2022-03-22 10:33:31 -07:00
Michael Cheng 61c5ed946d drm/i915/gt: replace cache_clflush_range
Replace all occurrence of cache_clflush_range with drm_clflush_virt_range.
This will prevent compile errors on non-x86 platforms.

Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321223819.72833-6-michael.cheng@intel.com
2022-03-22 10:10:53 -07:00
Michael Cheng 92b0cba49e drm/i915/gt: Re-work reset_csb
Use drm_clflush_virt_range instead of directly invoking clflush. This
will prevent compiler errors when building for non-x86 architectures.

v2(Michael Cheng): Remove extra clflush

v3(Michael Cheng): Remove memory barrier since drm_clflush_virt_range
		   takes care of it.

v4(Michael Cheng): Get the size of value and not the size of the pointer
		   when passing in execlists->csb_write. Thanks to Matt
		   Roper for pointing this out.

Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321223819.72833-4-michael.cheng@intel.com
2022-03-22 10:10:52 -07:00
Michael Cheng dc0406820e drm/i915/gt: Drop invalidate_csb_entries
Drop invalidate_csb_entries and directly call drm_clflush_virt_range.
This allows for one less function call, and prevent complier errors when
building for non-x86 architectures.

v2(Michael Cheng): Drop invalidate_csb_entries function and directly
		   invoke drm_clflush_virt_range. Thanks to Tvrtko for the
		   sugguestion.

v3(Michael Cheng): Use correct parameters for drm_clflush_virt_range.
		   Thanks to Tvrtko for pointing this out.

v4(Michael Cheng): Simplify &execlists->csb_status[0] to
		   execlists->csb_status. Thanks to Matt Roper for the
		   suggestion.

Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321223819.72833-3-michael.cheng@intel.com
2022-03-22 10:10:51 -07:00
Matt Roper f9576e36c6 drm/i915/xehp: Support platforms with CCS engines but no RCS
In the past we've always assumed that an RCS engine is present on every
platform.  However now that we have compute engines there may be
platforms that have CCS engines but no RCS, or platforms that are
designed to have both, but have the RCS engine fused off.

Various engine-centric initialization that only needs to be done a
single time for the group of RCS+CCS engines can't rely on being setup
with the RCS now; instead we add a I915_ENGINE_FIRST_RENDER_COMPUTE flag
that will be assigned to a single engine in the group; whichever engine
has this flag will be responsible for some of the general setup
(RCU_MODE programming, initialization of certain workarounds, etc.).

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220303223435.2793124-1-matthew.d.roper@intel.com
2022-03-04 08:02:15 -08:00
Lucas De Marchi 01fabda8e3 drm/i915: Use str_yes_no()
Remove the local yesno() implementation and adopt the str_yes_no() from
linux/string_helpers.h.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225234631.3725943-1-lucas.demarchi@intel.com
2022-03-02 08:48:13 -08:00
Matt Roper 87cb6d80f2 drm/i915/xehp: Enable ccs/dual-ctx in RCU_MODE
We have to specify in the Render Control Unit Mode register
when CCS is enabled.

v2:
 - Move RCU_MODE programming to a helper function.  (Tvrtko)
 - Clean up and clarify comments.  (Tvrtko)
 - Add RCU_MODE to the GuC save/restore list.  (Daniele)
v3:
 - Move this patch before the GuC ADS update to enable compute engines;
   the definition of RCU_MODE and its insertion into the save/restore
   list moves to this patch.  (Daniele)
v4:
 - Call xehp_enable_ccs_engines() directly in guc_resume() and
   execlists_resume() rather than adding an extra layer of wrapping to
   the engine->resume() vfunc.  (Umesh)

Bspec: 46034
Original-author: Michel Thierry
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302001554.1836066-1-matthew.d.roper@intel.com
2022-03-02 06:45:21 -08:00
Matt Roper adfadb5638 drm/i915/xehp: Define context scheduling attributes in lrc descriptor
In Dual Context mode the EUs are shared between render and compute
command streamers. The hardware provides a field in the lrc descriptor
to indicate the prioritization of the thread dispatch associated to the
corresponding context.

The context priority is set to 'low' at creation time and relies on the
existing context priority to set it to low/normal/high.

Bspec: 46145, 46260
Original-author: Michel Thierry
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Prasad Nallani <prasad.nallani@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-8-matthew.d.roper@intel.com
2022-03-02 06:45:21 -08:00
Matt Roper c674c5b934 drm/i915/xehp: CCS should use RCS setup functions
The compute engine handles the same commands the render engine can
(except 3D pipeline), so it makes sense that CCS is more similar to RCS
than non-render engines.

The CCS context state (lrc) is also similar to the render one, so reuse
it. Note that the compute engine has its own CTX_R_PWR_CLK_STATE
register.

In order to avoid having multiple RCS && CCS checks, add the following
engine flag:
 - I915_ENGINE_HAS_RCS_REG_STATE - use the render (larger) reg state ctx.

BSpec: 46260
Original-author: Michel Thierry
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-6-matthew.d.roper@intel.com
2022-03-02 06:45:19 -08:00
Rodrigo Vivi 30424ebae8 Merge tag 'drm-intel-gt-next-2022-02-17' of git://anongit.freedesktop.org/drm/drm-intel into drm-intel-next
UAPI Changes:

- Weak parallel submission support for execlists

  Minimal implementation of the parallel submission support for
  execlists backend that was previously only implemented for GuC.
  Support one sibling non-virtual engine.

Core Changes:

- Two backmerges of drm/drm-next for header file renames/changes and
  i915_regs reorganization

Driver Changes:

- Add new DG2 subplatform: DG2-G12 (Matt R)
- Add new DG2 workarounds (Matt R, Ram, Bruce)
- Handle pre-programmed WOPCM registers for DG2+ (Daniele)
- Update guc shim control programming on XeHP SDV+ (Daniele)
- Add RPL-S C0/D0 stepping information (Anusha)
- Improve GuC ADS initialization to work on ARM64 on dGFX (Lucas)

- Fix KMD and GuC race on accessing PMU busyness (Umesh)
- Use PM timestamp instead of RING TIMESTAMP for reference in PMU with GuC (Umesh)
- Report error on invalid reset notification from GuC (John)
- Avoid WARN splat by holding RPM wakelock during PXP unbind (Juston)
- Fixes to parallel submission implementation (Matt B.)
- Improve GuC loading status check/error reports (John)
- Tweak TTM LRU priority hint selection (Matt A.)
- Align the plane_vma to min_page_size of stolen mem (Ram)

- Introduce vma resources and implement async unbinding (Thomas)
- Use struct vma_resource instead of struct vma_snapshot (Thomas)
- Return some TTM accel move errors instead of trying memcpy move (Thomas)
- Fix a race between vma / object destruction and unbinding (Thomas)
- Remove short-term pins from execbuf (Maarten)
- Update to GuC version 69.0.3 (John, Michal Wa.)
- Improvements to GT reset paths in GuC backend (Matt B.)
- Use shrinker_release_pages instead of writeback in shmem object hooks (Matt A., Tvrtko)
- Use trylock instead of blocking lock when freeing GEM objects (Maarten)
- Allocate intel_engine_coredump_alloc with ALLOW_FAIL (Matt B.)
- Fixes to object unmapping and purging (Matt A)
- Check for wedged device in GuC backend (John)
- Avoid lockdep splat by locking dpt_obj around set_cache_level (Maarten)
- Allow dead vm to unbind vma's without lock (Maarten)
- s/engine->i915/i915/ for DG2 engine workarounds (Matt R)

- Use to_gt() helper for GGTT accesses (Michal Wi.)
- Selftest improvements (Matt B., Thomas, Ram)
- Coding style and compiler warning fixes (Matt B., Jasmine, Andi, Colin, Gustavo, Dan)

From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Yg4i2aCZvvee5Eai@jlahtine-mobl.ger.corp.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Fixed conflicts while applying, using the fixups/drm-intel-gt-next.patch
from drm-rerere's 1f2b1742abdd ("2022y-02m-23d-16h-07m-57s UTC: drm-tip
rerere cache update")]
2022-02-23 15:03:51 -05:00
Jani Nikula df62ae6fc9 drm/i915: move intel_hws_csb_write_index() out of i915_drv.h
Underscore prefix the index macros, and place
INTEL_HWS_CSB_WRITE_INDEX() as a macro next to them, to declutter
i915_drv.h.

v2: Don't underscore the index macros (Tvrtko)

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220209131143.3365230-1-jani.nikula@intel.com
2022-02-10 10:58:32 +02:00
Matt Roper 0d6419e9c8 drm/i915: Move GT registers to their own header file
This is a huge, chaotic mass of registers copied over as-is without any
real cleanup.  We'll come back and organize these better, align on
consistent coding style, remove dead code, etc. in separate patches
later that will be easier to review.

v2:
 - Add missing include in intel_pxp_irq.c
v3:
 - Correct a few indentation errors (Lucas)
 - Minor conflict resolution

Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220127234334.4016964-6-matthew.d.roper@intel.com
2022-02-02 07:59:14 -08:00
Rodrigo Vivi 063565aca3 Merge drm/drm-next into drm-intel-next
Catch-up with 5.17-rc2 and trying to align with drm-intel-gt-next
for a possible topic branch for merging the split of i915_regs...

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2022-01-31 13:19:33 -05:00
Matt Roper 202b1f4c12 drm/i915/gt: Move engine registers to their own header
Let's continue breaking up and cleaning up the massive i915_reg.h file
by moving all registers that are defined in relation to an engine base
to their own header.

There are probably a bunch of other "engine registers" that we haven't
moved yet (especially those that belong to the render engine in the
0x2??? range), but this is a relatively straightforward first step.

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220111051600.3429104-8-matthew.d.roper@intel.com
2022-01-11 14:03:25 -08:00
Matthew Brost a88afcfa25 drm/i915/execlists: Weak parallel submission support for execlists
A weak implementation of parallel submission (multi-bb execbuf IOCTL) for
execlists. Doing as little as possible to support this interface for
execlists - basically just passing submit fences between each request
generated and virtual engines are not allowed. This is on par with what
is there for the existing (hopefully soon deprecated) bonding interface.

We perma-pin these execlists contexts to align with GuC implementation.

v2:
 (John Harrison)
  - Drop siblings array as num_siblings must be 1
v3:
 (John Harrison)
  - Drop single submission
v4:
 (John Harrison)
  - Actually drop single submission
  - Use IS_ERR check on return value from intel_context_create
  - Set last request to NULL on unpin

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211222223532.28698-1-matthew.brost@intel.com
2021-12-23 11:14:23 -08:00
Thomas Hellström 8b91cdd4f8 drm/i915: Use __GFP_KSWAPD_RECLAIM in the capture code
The capture code is typically run entirely in the fence signalling
critical path. We're about to add lockdep annotation in an upcoming patch
which reveals a lockdep splat similar to the below one.

Fix the associated potential deadlocks using __GFP_KSWAPD_RECLAIM
(which is the same as GFP_WAIT, but open-coded for clarity) rather than
GFP_KERNEL for memory allocation in the capture path. This has the
potential drawback that capture might fail in situations with memory
pressure.

[  234.842048] WARNING: possible circular locking dependency detected
[  234.842050] 5.15.0-rc7+ #20 Tainted: G     U  W
[  234.842052] ------------------------------------------------------
[  234.842054] gem_exec_captur/1180 is trying to acquire lock:
[  234.842056] ffffffffa3e51c00 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc+0x4d/0x330
[  234.842063]
               but task is already holding lock:
[  234.842064] ffffffffa3f57620 (dma_fence_map){++++}-{0:0}, at: i915_vma_snapshot_resource_pin+0x27/0x30 [i915]
[  234.842138]
               which lock already depends on the new lock.

[  234.842140]
               the existing dependency chain (in reverse order) is:
[  234.842142]
               -> #2 (dma_fence_map){++++}-{0:0}:
[  234.842145]        __dma_fence_might_wait+0x41/0xa0
[  234.842149]        dma_resv_lockdep+0x1dc/0x28f
[  234.842151]        do_one_initcall+0x58/0x2d0
[  234.842154]        kernel_init_freeable+0x273/0x2bf
[  234.842157]        kernel_init+0x16/0x120
[  234.842160]        ret_from_fork+0x1f/0x30
[  234.842163]
               -> #1 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
[  234.842166]        fs_reclaim_acquire+0x6d/0xd0
[  234.842168]        __kmalloc_node+0x51/0x3a0
[  234.842171]        alloc_cpumask_var_node+0x1b/0x30
[  234.842174]        native_smp_prepare_cpus+0xc7/0x292
[  234.842177]        kernel_init_freeable+0x160/0x2bf
[  234.842179]        kernel_init+0x16/0x120
[  234.842181]        ret_from_fork+0x1f/0x30
[  234.842184]
               -> #0 (fs_reclaim){+.+.}-{0:0}:
[  234.842186]        __lock_acquire+0x1161/0x1dc0
[  234.842189]        lock_acquire+0xb5/0x2b0
[  234.842192]        fs_reclaim_acquire+0xa1/0xd0
[  234.842193]        __kmalloc+0x4d/0x330
[  234.842196]        i915_vma_coredump_create+0x78/0x5b0 [i915]
[  234.842253]        intel_engine_coredump_add_vma+0x36/0xe0 [i915]
[  234.842307]        __i915_gpu_coredump+0x290/0x5e0 [i915]
[  234.842365]        i915_capture_error_state+0x57/0xa0 [i915]
[  234.842415]        intel_gt_handle_error+0x348/0x3e0 [i915]
[  234.842462]        intel_gt_debugfs_reset_store+0x3c/0x90 [i915]
[  234.842504]        simple_attr_write+0xc1/0xe0
[  234.842507]        full_proxy_write+0x53/0x80
[  234.842509]        vfs_write+0xbc/0x350
[  234.842513]        ksys_write+0x58/0xd0
[  234.842514]        do_syscall_64+0x38/0x90
[  234.842516]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[  234.842519]
               other info that might help us debug this:

[  234.842521] Chain exists of:
                 fs_reclaim --> mmu_notifier_invalidate_range_start --> dma_fence_map

[  234.842526]  Possible unsafe locking scenario:

[  234.842528]        CPU0                    CPU1
[  234.842529]        ----                    ----
[  234.842531]   lock(dma_fence_map);
[  234.842532]                                lock(mmu_notifier_invalidate_range_start);
[  234.842535]                                lock(dma_fence_map);
[  234.842537]   lock(fs_reclaim);
[  234.842539]
                *** DEADLOCK ***

[  234.842540] 4 locks held by gem_exec_captur/1180:
[  234.842543]  #0: ffff9007812d9460 (sb_writers#17){.+.+}-{0:0}, at: ksys_write+0x58/0xd0
[  234.842547]  #1: ffff900781d9ecb8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_write+0x3a/0xe0
[  234.842552]  #2: ffffffffc11913a8 (capture_mutex){+.+.}-{3:3}, at: i915_capture_error_state+0x1a/0xa0 [i915]
[  234.842602]  #3: ffffffffa3f57620 (dma_fence_map){++++}-{0:0}, at: i915_vma_snapshot_resource_pin+0x27/0x30 [i915]
[  234.842656]
               stack backtrace:
[  234.842658] CPU: 0 PID: 1180 Comm: gem_exec_captur Tainted: G     U  W         5.15.0-rc7+ #20
[  234.842661] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 0403 01/26/2021
[  234.842664] Call Trace:
[  234.842666]  dump_stack_lvl+0x57/0x72
[  234.842669]  check_noncircular+0xde/0x100
[  234.842672]  ? __lock_acquire+0x3bf/0x1dc0
[  234.842675]  __lock_acquire+0x1161/0x1dc0
[  234.842678]  lock_acquire+0xb5/0x2b0
[  234.842680]  ? __kmalloc+0x4d/0x330
[  234.842683]  ? finish_task_switch.isra.0+0xf2/0x360
[  234.842686]  ? i915_vma_coredump_create+0x78/0x5b0 [i915]
[  234.842734]  fs_reclaim_acquire+0xa1/0xd0
[  234.842737]  ? __kmalloc+0x4d/0x330
[  234.842739]  __kmalloc+0x4d/0x330
[  234.842742]  i915_vma_coredump_create+0x78/0x5b0 [i915]
[  234.842793]  ? capture_vma+0xbe/0x110 [i915]
[  234.842844]  intel_engine_coredump_add_vma+0x36/0xe0 [i915]
[  234.842892]  __i915_gpu_coredump+0x290/0x5e0 [i915]
[  234.842939]  i915_capture_error_state+0x57/0xa0 [i915]
[  234.842985]  intel_gt_handle_error+0x348/0x3e0 [i915]
[  234.843032]  ? __mutex_lock+0x81/0x830
[  234.843035]  ? simple_attr_write+0x3a/0xe0
[  234.843038]  ? __lock_acquire+0x3bf/0x1dc0
[  234.843041]  intel_gt_debugfs_reset_store+0x3c/0x90 [i915]
[  234.843083]  ? _copy_from_user+0x45/0x80
[  234.843086]  simple_attr_write+0xc1/0xe0
[  234.843089]  full_proxy_write+0x53/0x80
[  234.843091]  vfs_write+0xbc/0x350
[  234.843094]  ksys_write+0x58/0xd0
[  234.843096]  do_syscall_64+0x38/0x90
[  234.843098]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  234.843101] RIP: 0033:0x7fa467480877
[  234.843103] Code: 75 05 48 83 c4 58 c3 e8 37 4e ff ff 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[  234.843108] RSP: 002b:00007ffd14d79b08 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  234.843112] RAX: ffffffffffffffda RBX: 00007ffd14d79b60 RCX: 00007fa467480877
[  234.843114] RDX: 0000000000000014 RSI: 00007ffd14d79b60 RDI: 0000000000000007
[  234.843116] RBP: 0000000000000007 R08: 0000000000000000 R09: 00007ffd14d79ab0
[  234.843119] R10: ffffffffffffffff R11: 0000000000000246 R12: 0000000000000014
[  234.843121] R13: 0000000000000000 R14: 00007ffd14d79b60 R15: 0000000000000005

v5:
- Use __GFP_KSWAPD_RECLAIM rather than __GFP_NOWAIT for clarity.
  (Daniel Vetter)
v6:
- Include an instance in execlists_capture_work().
- Rework the commit message due to patch reordering.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211108174547.979714-3-thomas.hellstrom@linux.intel.com
2021-11-26 08:26:10 +01:00
Umesh Nerlige Ramappa 77cdd054dd drm/i915/pmu: Connect engine busyness stats from GuC to pmu
With GuC handling scheduling, i915 is not aware of the time that a
context is scheduled in and out of the engine. Since i915 pmu relies on
this info to provide engine busyness to the user, GuC shares this info
with i915 for all engines using shared memory. For each engine, this
info contains:

- total busyness: total time that the context was running (total)
- id: id of the running context (id)
- start timestamp: timestamp when the context started running (start)

At the time (now) of sampling the engine busyness, if the id is valid
(!= ~0), and start is non-zero, then the context is considered to be
active and the engine busyness is calculated using the below equation

	engine busyness = total + (now - start)

All times are obtained from the gt clock base. For inactive contexts,
engine busyness is just equal to the total.

The start and total values provided by GuC are 32 bits and wrap around
in a few minutes. Since perf pmu provides busyness as 64 bit
monotonically increasing values, there is a need for this implementation
to account for overflows and extend the time to 64 bits before returning
busyness to the user. In order to do that, a worker runs periodically at
frequency = 1/8th the time it takes for the timestamp to wrap. As an
example, that would be once in 27 seconds for a gt clock frequency of
19.2 MHz.

Note:
There might be an over-accounting of busyness due to the fact that GuC
may be updating the total and start values while kmd is reading them.
(i.e kmd may read the updated total and the stale start). In such a
case, user may see higher busyness value followed by smaller ones which
would eventually catch up to the higher value.

v2: (Tvrtko)
- Include details in commit message
- Move intel engine busyness function into execlist code
- Use union inside engine->stats
- Use natural type for ping delay jiffies
- Drop active_work condition checks
- Use for_each_engine if iterating all engines
- Drop seq locking, use spinlock at GuC level to update engine stats
- Document worker specific details

v3: (Tvrtko/Umesh)
- Demarcate GuC and execlist stat objects with comments
- Document known over-accounting issue in commit
- Provide a consistent view of GuC state
- Add hooks to gt park/unpark for GuC busyness
- Stop/start worker in gt park/unpark path
- Drop inline
- Move spinlock and worker inits to GuC initialization
- Drop helpers that are called only once

v4: (Tvrtko/Matt/Umesh)
- Drop addressed opens from commit message
- Get runtime pm in ping, remove from the park path
- Use cancel_delayed_work_sync in disable_submission path
- Update stats during reset prepare
- Skip ping if reset in progress
- Explicitly name execlists and GuC stats objects
- Since disable_submission is called from many places, move resetting
  stats to intel_guc_submission_reset_prepare

v5: (Tvrtko)
- Add a trylock helper that does not sleep and synchronize PMU event
  callbacks and worker with gt reset

v6: (CI BAT failures)
- DUTs using execlist submission failed to boot since __gt_unpark is
  called during i915 load. This ends up calling the GuC busyness unpark
  hook and results in kick-starting an uninitialized worker. Let
  park/unpark hooks check if GuC submission has been initialized.
- drop cant_sleep() from trylock helper since rcu_read_lock takes care
  of that.

v7: (CI) Fix igt@i915_selftest@live@gt_engines
- For GuC mode of submission the engine busyness is derived from gt time
  domain. Use gt time elapsed as reference in the selftest.
- Increase busyness calculation to 10ms duration to ensure batch runs
  longer and falls within the busyness tolerances in selftest.

v8:
- Use ktime_get in selftest as before
- intel_reset_trylock_no_wait results in a lockdep splat that is not
  trivial to fix since the PMU callback runs in irq context and the
  reset paths are tightly knit into the driver. The test that uncovers
  this is igt@perf_pmu@faulting-read. Drop intel_reset_trylock_no_wait,
  instead use the reset_count to synchronize with gt reset during pmu
  callback. For the ping, continue to use intel_reset_trylock since ping
  is not run in irq context.

- GuC PM timestamp does not tick when GuC is idle. This can potentially
  result in wrong busyness values when a context is active on the
  engine, but GuC is idle. Use the RING TIMESTAMP as GPU timestamp to
  process the GuC busyness stats. This works since both GuC timestamp and
  RING timestamp are synced with the same clock.

- The busyness stats may get updated after the batch starts running.
  This delay causes the busyness reported for 100us duration to fall
  below 95% in the selftest. The only option at this time is to wait for
  GuC busyness to change from idle to active before we sample busyness
  over a 100us period.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211027004821.66097-2-umesh.nerlige.ramappa@intel.com
2021-10-28 11:04:43 -07:00
Matthew Brost e5e32171a2 drm/i915/guc: Connect UAPI to GuC multi-lrc interface
Introduce 'set parallel submit' extension to connect UAPI to GuC
multi-lrc interface. Kernel doc in new uAPI should explain it all.

IGT: https://patchwork.freedesktop.org/patch/447008/?series=93071&rev=1
media UMD: https://github.com/intel/media-driver/pull/1252

v2:
 (Daniel Vetter)
  - Add IGT link and placeholder for media UMD link
v3:
 (Kernel test robot)
  - Fix warning in unpin engines call
 (John Harrison)
  - Reword a bunch of the kernel doc
v4:
 (John Harrison)
  - Add comment why perma-pin is done after setting gem context
  - Update some comments / docs for proto contexts
v5:
 (John Harrison)
  - Rework perma-pin comment
  - Add BUG_IN if context is pinned when setting gem context

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-17-matthew.brost@intel.com
2021-10-15 10:45:50 -07:00
Matthew Brost 4f3059dc2d drm/i915: Add logical engine mapping
Add logical engine mapping. This is required for split-frame, as
workloads need to be placed on engines in a logically contiguous manner.

v2:
 (Daniel Vetter)
  - Add kernel doc for new fields
v3:
 (Tvrtko)
  - Update comment for new logical_mask field
v4:
 (John Harrison)
  - Update comment for new logical_mask field

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-6-matthew.brost@intel.com
2021-10-15 10:37:29 -07:00
Lucas De Marchi 1a839e016e drm/i915: remove IS_ACTIVE
When trying to bring IS_ACTIVE to linux/kconfig.h I thought it wouldn't
provide much value just encapsulating it in a boolean context. So I also
added the support for handling undefined macros as the IS_ENABLED()
counterpart. However the feedback received from Masahiro Yamada was that
it is too ugly, not providing much value. And just wrapping in a boolean
context is too dumb - we could simply open code it.

As detailed in commit babaab2f47 ("drm/i915: Encapsulate kconfig
constant values inside boolean predicates"), the IS_ACTIVE macro was
added to workaround a compilation warning. However after checking again
our current uses of IS_ACTIVE it turned out there is only
1 case in which it triggers a warning in clang (due
-Wconstant-logical-operand) and 2 in smatch. All the others
can simply use the shorter version, without wrapping it in any macro.

So here I'm dialing all the way back to simply removing the macro. That
single case hit by clang can be changed to make the constant come first,
so it doesn't think it's mask:

	-       if (context && CONFIG_DRM_I915_FENCE_TIMEOUT)
	+       if (CONFIG_DRM_I915_FENCE_TIMEOUT && context)

As talked with Dan Carpenter, that logic will be added in smatch as
well, so it will also stop warning about it.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20211005171728.3147094-1-lucas.demarchi@intel.com
2021-10-07 11:04:05 -07:00
Thomas Hellström 3e42cc6127 drm/i915/gt: Register the migrate contexts with their engines
Pinned contexts, like the migrate contexts need reset after resume
since their context image may have been lost. Also the GuC needs to
register pinned contexts.

Add a list to struct intel_engine_cs where we add all pinned contexts on
creation, and traverse that list at resume time to reset the pinned
contexts.

This fixes the kms_pipe_crc_basic@suspend-read-crc-pipe-a selftest for now,
but proper LMEM backup / restore is needed for full suspend functionality.
However, note that even with full LMEM backup / restore it may be
desirable to keep the reset since backing up the migrate context images
must happen using memcpy() after the migrate context has become inactive,
and for performance- and other reasons we want to avoid memcpy() from
LMEM.

Also traverse the list at guc_init_lrc_mapping() calling
guc_kernel_context_pin() for the pinned contexts, like is already done
for the kernel context.

v2:
- Don't reset the contexts on each __engine_unpark() but rather at
  resume time (Chris Wilson).
v3:
- Reset contexts in the engine sanitize callback. (Chris Wilson)

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Brost Matthew <matthew.brost@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-6-thomas.hellstrom@linux.intel.com
2021-09-24 08:19:13 +02:00
Matthew Brost ac653dd799 Revert "drm/i915/gt: Propagate change in error status to children on unhold"
Propagating errors to dependent fences is broken and can lead to errors
from one client ending up in another. In commit 3761baae90 ("Revert
"drm/i915: Propagate errors on awaiting already signaled fences""), we
attempted to get rid of fence error propagation but missed the case
added in commit 8e9f84cf5c ("drm/i915/gt: Propagate change in error
status to children on unhold"). Revert that one too. This error was
found by an up-and-coming selftest which triggers a reset during
request cancellation and verifies that subsequent requests complete
successfully.

v2:
 (Daniel Vetter)
  - Use revert
v3:
 (Jason)
  - Update commit message

v4 (Daniele):
 - fix checkpatch error in commit message.

References: '3761baae908a ("Revert "drm/i915: Propagate errors on awaiting already signaled fences"")'
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-8-matthew.brost@intel.com
2021-09-13 11:30:31 -07:00
Colin Ian King 058d7d6260 drm/i915: clean up inconsistent indenting
There is a statement that is indented one character too deeply,
clean this up.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210902215737.55570-1-colin.king@canonical.com
2021-09-08 19:13:26 +02:00
Matthew Brost 62eaf0ae21 drm/i915/guc: Support request cancellation
This adds GuC backend support for i915_request_cancel(), which in turn
makes CONFIG_DRM_I915_REQUEST_TIMEOUT work.

This implementation makes use of fence while there are likely simplier
options. A fence was chosen because of another feature coming soon
which requires a user to block on a context until scheduling is
disabled. In that case we return the fence to the user and the user can
wait on that fence.

v2:
 (Daniele)
  - A comment about locking the blocked incr / decr
  - A comments about the use of the fence
  - Update commit message explaining why fence
  - Delete redundant check blocked count in unblock function
  - Ring buffer implementation
  - Comment about blocked in submission path
  - Shorter rpm path
v3:
 (Checkpatch)
  - Fix typos in commit message
 (Daniel)
  - Rework to simplier locking structure in guc_context_block / unblock

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-26-matthew.brost@intel.com
2021-07-27 17:32:14 -07:00
Matthew Brost d1cee2d37a drm/i915: Move active request tracking to a vfunc
Move active request tracking to a backend vfunc rather than assuming all
backends want to do this in the manner. In the of case execlists /
ring submission the tracking is on the physical engine while with GuC
submission it is on the context.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-8-matthew.brost@intel.com
2021-07-27 17:31:39 -07:00
Matthew Brost a95d116098 drm/i915/guc: Direct all breadcrumbs for a class to single breadcrumbs
With GuC virtual engines the physical engine which a request executes
and completes on isn't known to the i915. Therefore we can't attach a
request to a physical engines breadcrumbs. To work around this we create
a single breadcrumbs per engine class when using GuC submission and
direct all physical engine interrupts to this breadcrumbs.

v2:
 (John H)
  - Rework header file structure so intel_engine_mask_t can be in
    intel_engine_types.h

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
CC: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-6-matthew.brost@intel.com
2021-07-27 17:31:35 -07:00
Matthew Brost 556120256e drm/i915/guc: GuC virtual engines
Implement GuC virtual engines. Rather simple implementation, basically
just allocate an engine, setup context enter / exit function to virtual
engine specific functions, set all other variables / functions to guc
versions, and set the engine mask to that of all the siblings.

v2: Update to work with proto-ctx
v3:
 (Daniele)
  - Drop include, add comment to intel_virtual_engine_has_heartbeat

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-2-matthew.brost@intel.com
2021-07-27 17:31:28 -07:00
Matt Roper bfac1e2b6e drm/i915/xehp: Xe_HP forcewake support
Implement Xe_HP forcewake handling.  While we're at it, let's reorder to
the forcewake assignment if/else ladder to match our usual driver
conventions.

Co-authored-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210723174239.1551352-6-matthew.d.roper@intel.com
2021-07-24 07:18:18 -07:00
Stuart Summers 50a9ea0843 drm/i915/xehp: Handle new device context ID format
Xe_HP changes the format of the context ID from past platforms.

Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721223043.834562-9-matthew.d.roper@intel.com
2021-07-22 09:37:26 -07:00
Jason Ekstrand dd4f1bbae8 drm/i915/gem: Remove engine auto-magic with FENCE_SUBMIT (v2)
Even though FENCE_SUBMIT is only documented to wait until the request in
the in-fence starts instead of waiting until it completes, it has a bit
more magic than that.  If FENCE_SUBMIT is used to submit something to a
balanced engine, we would wait to assign engines until the primary
request was ready to start and then attempt to assign it to a different
engine than the primary.  There is an IGT test (the bonded-slice subtest
of gem_exec_balancer) which exercises this by submitting a primary batch
to a specific VCS and then using FENCE_SUBMIT to submit a secondary
which can run on any VCS and have i915 figure out which VCS to run it on
such that they can run in parallel.

However, this functionality has never been used in the real world.  The
media driver (the only user of FENCE_SUBMIT) always picks exactly two
physical engines to bond and never asks us to pick which to use.

v2 (Daniel Vetter):
 - Mention the exact IGT test this breaks

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-11-jason@jlekstrand.net
2021-07-08 19:44:31 +02:00
Jason Ekstrand 521695c6f1 drm/i915/gem: Disallow bonding of virtual engines (v3)
This adds a bunch of complexity which the media driver has never
actually used.  The media driver does technically bond a balanced engine
to another engine but the balanced engine only has one engine in the
sibling set.  This doesn't actually result in a virtual engine.

This functionality was originally added to handle cases where we may
have more than two video engines and media might want to load-balance
their bonded submits by, for instance, submitting to a balanced vcs0-1
as the primary and then vcs2-3 as the secondary.  However, no such
hardware has shipped thus far and, if we ever want to enable such
use-cases in the future, we'll use the up-and-coming parallel submit API
which targets GuC submission.

This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We leave the
validation code in place in case we ever decide we want to do something
interesting with the bonding information.

v2 (Jason Ekstrand):
 - Don't delete quite as much code.

v3 (Tvrtko Ursulin):
 - Add some history to the commit message

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-10-jason@jlekstrand.net
2021-07-08 19:44:25 +02:00
Jason Ekstrand 4a766ae40e drm/i915: Drop the CONTEXT_CLONE API (v2)
This API allows one context to grab bits out of another context upon
creation.  It can be used as a short-cut for setparam(getparam()) for
things like I915_CONTEXT_PARAM_VM.  However, it's never been used by any
real userspace.  It's used by a few IGT tests and that's it.  Since it
doesn't add any real value (most of the stuff you can CLONE you can copy
in other ways), drop it.

There is one thing that this API allows you to clone which you cannot
clone via getparam/setparam: timelines.  However, timelines are an
implementation detail of i915 and not really something that needs to be
exposed to userspace.  Also, sharing timelines between contexts isn't
obviously useful and supporting it has the potential to complicate i915
internally.  It also doesn't add any functionality that the client can't
get in other ways.  If a client really wants a shared timeline, they can
use a syncobj and set it as an in and out fence on every submit.

v2 (Jason Ekstrand):
 - More detailed commit message

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-7-jason@jlekstrand.net
2021-07-08 19:44:04 +02:00
Matthew Brost 22916bad07 drm/i915: Move submission tasklet to i915_sched_engine
The submission tasklet operates on i915_sched_engine, thus it is the
correct place for it.

v3:
 (Jason Ekstrand)
  Change sched_engine->engine to a void* private data pointer
  Add kernel doc
v4:
 (Daniele)
  Update private_data comment
  Set queue_priority_hint in kick_execlists
v5:
 (CI)
  Rebase and fix build error

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-9-matthew.brost@intel.com
2021-06-18 15:14:42 -07:00
Matthew Brost d2a31d0264 drm/i915: Update i915_scheduler to operate on i915_sched_engine
Rather passing around an intel_engine_cs in the scheduling code, pass
around a i915_sched_engine.

v3:
 (Jason Ekstrand)
  Add READ_ONCE around rq->engine in lock_sched_engine

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-8-matthew.brost@intel.com
2021-06-18 15:14:09 -07:00
Matthew Brost 71ed60112d drm/i915: Add kick_backend function to i915_sched_engine
Not all back-ends require a kick after a  scheduling update, so make the
kick a call-back function that the  back-end can opt-in to. Also move
the current kick function from the scheduler to the execlists file as it
is specific to that back-end.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-7-matthew.brost@intel.com
2021-06-18 15:14:09 -07:00
Matthew Brost 3f623e06cd drm/i915: Move engine->schedule to i915_sched_engine
The schedule function should be in the schedule object.

v3:
 (Jason Ekstrand)
  Add kernel doc

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-6-matthew.brost@intel.com
2021-06-18 15:13:34 -07:00
Matthew Brost 349a2bc5aa drm/i915: Move active tracking to i915_sched_engine
Move active request tracking and its lock to i915_sched_engine. This
lock is also the submission lock so having it in the i915_sched_engine
is the correct place.

v3:
 (Jason Ekstrand)
  Add kernel doc
v6:
  Rebase

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.comk>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-5-matthew.brost@intel.com
2021-06-18 15:13:33 -07:00
Matthew Brost c4fd7d8cc3 drm/i915: Reset sched_engine.no_priolist immediately after dequeue
Rather than touching schedule state in the generic PM code, reset the
priolist allocation when empty in the submission code. Add a wrapper
function to do this and update the backends to call it in the correct
place.

v3:
 (Jason Ekstrand)
  Update patch commit message with a better description

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-4-matthew.brost@intel.com
2021-06-18 15:13:20 -07:00