Commit graph

22463 commits

Author SHA1 Message Date
Matt Roper
fe8b7085ca drm/i915: Handle all MCR ranges
The bspec documents multiple MCR ranges; make sure they're all captured
by the driver.

Bspec: 13991, 52079
Fixes: 592a7c5e08 ("drm/i915: Extend non readable mcr range")
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200311162300.1838847-2-matthew.d.roper@intel.com
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
(cherry picked from commit 415d126997)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-03-16 12:17:00 +02:00
Caz Yokoyama
c09f6b4d08 Revert "drm/i915/tgl: Add extra hdc flush workaround"
This reverts commit 36a6b5d964.

The commit takes care Wa_1604544889 which was fixed on a0 stepping based on
a0 replan. So no SW workaround is required on any stepping now.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Caz Yokoyama <caz.yokoyama@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Fixes: 36a6b5d964 ("drm/i915/tgl: Add extra hdc flush workaround")
Link: https://patchwork.freedesktop.org/patch/msgid/1c751032ce79c80c5485cae315f1a9904ce07cac.1583359940.git.caz.yokoyama@intel.com
(cherry picked from commit 175c4d9b3b)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-03-16 12:17:00 +02:00
Chris Wilson
9777d8b2d2 drm/i915/execlists: Track active elements during dequeue
Record the initial active element we use when building the next ELSP
submission, so that we can compare against it latter to see if there's
no change.

Fixes: 44d0a9c05b ("drm/i915/execlists: Skip redundant resubmission")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200311092624.10012-2-chris@chris-wilson.co.uk
(cherry picked from commit 60ef5b7ac6)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-03-16 12:17:00 +02:00
Chris Wilson
14a0d527a4 drm/i915: Defer semaphore priority bumping to a workqueue
Since the semaphore fence may be signaled from inside an interrupt
handler from inside a request holding its request->lock, we cannot then
enter into the engine->active.lock for processing the semaphore priority
bump as we may traverse our call tree and end up on another held
request.

CPU 0:
[ 2243.218864]  _raw_spin_lock_irqsave+0x9a/0xb0
[ 2243.218867]  i915_schedule_bump_priority+0x49/0x80 [i915]
[ 2243.218869]  semaphore_notify+0x6d/0x98 [i915]
[ 2243.218871]  __i915_sw_fence_complete+0x61/0x420 [i915]
[ 2243.218874]  ? kmem_cache_free+0x211/0x290
[ 2243.218876]  i915_sw_fence_complete+0x58/0x80 [i915]
[ 2243.218879]  dma_i915_sw_fence_wake+0x3e/0x80 [i915]
[ 2243.218881]  signal_irq_work+0x571/0x690 [i915]
[ 2243.218883]  irq_work_run_list+0xd7/0x120
[ 2243.218885]  irq_work_run+0x1d/0x50
[ 2243.218887]  smp_irq_work_interrupt+0x21/0x30
[ 2243.218889]  irq_work_interrupt+0xf/0x20

CPU 1:
[ 2242.173107]  _raw_spin_lock+0x8f/0xa0
[ 2242.173110]  __i915_request_submit+0x64/0x4a0 [i915]
[ 2242.173112]  __execlists_submission_tasklet+0x8ee/0x2120 [i915]
[ 2242.173114]  ? i915_sched_lookup_priolist+0x1e3/0x2b0 [i915]
[ 2242.173117]  execlists_submit_request+0x2e8/0x2f0 [i915]
[ 2242.173119]  submit_notify+0x8f/0xc0 [i915]
[ 2242.173121]  __i915_sw_fence_complete+0x61/0x420 [i915]
[ 2242.173124]  ? _raw_spin_unlock_irqrestore+0x39/0x40
[ 2242.173137]  i915_sw_fence_complete+0x58/0x80 [i915]
[ 2242.173140]  i915_sw_fence_commit+0x16/0x20 [i915]

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1318
Fixes: b7404c7ecb ("drm/i915: Bump ready tasks ahead of busywaits")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.2+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200310101720.9944-1-chris@chris-wilson.co.uk
(cherry picked from commit 209df10bb4)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-03-11 23:12:39 +02:00
Chris Wilson
8ea6bb8e4d drm/i915/gt: Close race between cacheline_retire and free
If the cacheline may still be busy, atomically mark it for future
release, and only if we can determine that it will never be used again,
immediately free it.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1392
Fixes: ebece75392 ("drm/i915: Keep timeline HWSP allocated until idle across the system")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.2+
Link: https://patchwork.freedesktop.org/patch/msgid/20200306154647.3528345-1-chris@chris-wilson.co.uk
(cherry picked from commit 2d4bd971f5)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-03-11 23:12:39 +02:00
Chris Wilson
eafc2aa20f drm/i915/execlists: Enable timeslice on partial virtual engine dequeue
If we stop filling the ELSP due to an incompatible virtual engine
request, check if we should enable the timeslice on behalf of the queue.

This fixes the case where we are inspecting the last->next element when
we know that the last element is the last request in the execution queue,
and so decided we did not need to enable timeslicing despite the intent
to do so!

Fixes: 8ee36e048c ("drm/i915/execlists: Minimalistic timeslicing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200306113012.3184606-1-chris@chris-wilson.co.uk
(cherry picked from commit 3df2deed41)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-03-11 23:12:39 +02:00
Matthew Auld
1d61c5d711 drm/i915: be more solid in checking the alignment
The alignment is u64, and yet is_power_of_2() assumes unsigned long,
which might give different results between 32b and 64b kernel.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200305203534.210466-1-matthew.auld@intel.com
Cc: stable@vger.kernel.org
(cherry picked from commit 2920516b2f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-03-11 23:12:39 +02:00
Tina Zhang
259170cb4c drm/i915/gvt: Fix dma-buf display blur issue on CFL
Commit c3b5a8430d ("drm/i915/gvt: Enable gfx virtualiztion for CFL")
added the support on CFL. The vgpu emulation hotplug support on CFL was
supposed to be included in that patch. Without the vgpu emulation
hotplug support, the dma-buf based display gives us a blur face.

So fix this issue by adding the vgpu emulation hotplug support on CFL.

Fixes: c3b5a8430d ("drm/i915/gvt: Enable gfx virtualiztion for CFL")
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200227010041.32248-1-tina.zhang@intel.com
(cherry picked from commit 135dde8853)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-03-11 23:12:39 +02:00
Chris Wilson
c951b0af2d drm/i915: Return early for await_start on same timeline
Requests within a timeline are ordered by that timeline, so awaiting for
the start of a request within the timeline is a no-op. This used to work
by falling out of the mutex_trylock() as the signaler and waiter had the
same timeline and not returning an error.

Fixes: 6a79d84840 ("drm/i915: Lock signaler timeline while navigating")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200305134822.2750496-1-chris@chris-wilson.co.uk
(cherry picked from commit ab7a69020f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-03-11 23:12:38 +02:00
Chris Wilson
c67b35d970 drm/i915: Actually emit the await_start
Fix the inverted test to emit the wait on the end of the previous
request if we /haven't/ already.

Fixes: 6a79d84840 ("drm/i915: Lock signaler timeline while navigating")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200305104210.2619967-1-chris@chris-wilson.co.uk
(cherry picked from commit 07e9c59d63)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-03-11 23:12:38 +02:00
Jani Nikula
b74f241d71 Merge tag 'gvt-fixes-2020-03-10' of https://github.com/intel/gvt-linux into drm-intel-fixes
gvt-fixes-2020-03-10

- Fix vgpu idr destroy causing timer destroy failure (Zhenyu)
- Fix VBT size (Tina)

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200310080933.GE28483@zhen-hp.sh.intel.com
2020-03-10 11:16:43 +02:00
Tina Zhang
2fa7e15c5f drm/i915/gvt: Fix emulated vbt size issue
The emulated vbt doesn't tell its size correctly. According to the
intel_vbt_defs.h, vbt_header.vbt_size should the size of VBT (VBT Header,
BDB Header and data blocks), and bdb_header.bdb_size should be the size
of BDB (BDB Header and data blocks).

This patch fixes the issue and lets vbt provided by GVT-g pass the guest
i915's sanity test.

v2: refine the commit message. (Zhenyu)

Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200305131600.29640-1-tina.zhang@intel.com
2020-03-06 09:35:30 +08:00
Chris Wilson
169c0aa4bc drm/i915/gt: Drop the timeline->mutex as we wait for retirement
As we have pinned the timeline (using tl->active_count), we can safely
drop the tl->mutex as we wait for what we believe to be the final
request on that timeline. This is useful for ensuring that we do not
block the engine heartbeat by hogging the kernel_context's timeline on a
dead GPU.

References: https://gitlab.freedesktop.org/drm/intel/issues/1364
Fixes: 058179e72e ("drm/i915/gt: Replace hangcheck by heartbeats")
Fixes: f33a8a5160 ("drm/i915: Merge wait_for_timelines with retire_request")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200303140009.1494819-1-chris@chris-wilson.co.uk
(cherry picked from commit 82126e596d)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-03-04 13:49:26 +02:00
Chris Wilson
08f56f8f37 drm/i915/perf: Reintroduce wait on OA configuration completion
We still need to wait for the initial OA configuration to happen
before we enable OA report writes to the OA buffer.

Reported-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 15d0ace1f8 ("drm/i915/perf: execute OA configuration from command stream")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/1356
Testcase: igt/perf/stream-open-close
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200302085812.4172450-7-chris@chris-wilson.co.uk
(cherry picked from commit 4b4e973d5e)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-03-04 13:49:26 +02:00
Zhenyu Wang
04d6067f1f drm/i915/gvt: Fix unnecessary schedule timer when no vGPU exits
From commit f25a49ab8a ("drm/i915/gvt: Use vgpu_lock to protect per
vgpu access") the vgpu idr destroy is moved later than vgpu resource
destroy, then it would fail to stop timer for schedule policy clean
which to check vgpu idr for any left vGPU. So this trys to destroy
vgpu idr earlier.

Cc: Colin Xu <colin.xu@intel.com>
Fixes: f25a49ab8a ("drm/i915/gvt: Use vgpu_lock to protect per vgpu access")
Acked-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200229055445.31481-1-zhenyuw@linux.intel.com
2020-03-03 14:09:26 +08:00
Dan Carpenter
f4aaa44e8b drm/i915/selftests: Fix return in assert_mmap_offset()
The assert_mmap_offset() returns type bool so if we return an error
pointer that is "return true;" or success.  If we have an error, then
we should return false.

Fixes: 3d81d589d6 ("drm/i915: Test exhaustion of the mmap space")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200228141413.qfjf4abr323drlo4@kili.mountain
(cherry picked from commit efbf928824)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-03-02 12:11:07 +02:00
Chris Wilson
0b1570b7ff drm/i915: Protect i915_request_await_start from early waits
We need to be extremely careful inside i915_request_await_start() as it
needs to walk the list of requests in the foreign timeline with very
little protection. As we hold our own timeline mutex, we can not nest
inside the signaler's timeline mutex, so all that remains is our RCU
protection. However, to be safe we need to tell the compiler that we may
be traversing the list only under RCU protection, and furthermore we
need to start declaring requests as elements of the timeline from their
construction.

Fixes: 9ddc8ec027 ("drm/i915: Eliminate the trylock for awaiting an earlier request")
Fixes: 6a79d84840 ("drm/i915: Lock signaler timeline while navigating")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200227085723.1961649-11-chris@chris-wilson.co.uk
(cherry picked from commit d22d2d073e)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-03-02 12:10:46 +02:00
Lucas De Marchi
eddf309a8e drm/i915/tgl: Add Wa_1608008084
Wa_1608008084 is an additional WA that applies to writes on FF_MODE2
register. We can't read it back either from CPU or GPU. Since the other
bits should be 0, recommendation to handle Wa_1604555607 is to actually
just write the timer value.

Do a write only and don't try to read it, neither before or after
the WA is applied.

Fixes: ff690b2111 ("drm/i915/tgl: Implement Wa_1604555607")
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200224191258.15668-1-lucas.demarchi@intel.com
(cherry picked from commit e94bda1432)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-03-02 12:10:12 +02:00
Matt Roper
4c116e1ae4 drm/i915/tgl: Add Wa_22010178259:tgl
We need to explicitly set the TLB Request Timer initial value in the
BW_BUDDY registers to 0x8 rather than relying on the hardware default.

v2: Apply missing REG_FIELD_PREP to ensure 0x8 is placed in the correct
    bits during the rmw.  (Jose)

Bspec: 52890
Bspec: 50044
Fixes: 3fa01d642f ("drm/i915/tgl: Program BW_BUDDY registers during display init")
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200219215655.2923650-1-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit 87e04f7592)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200228004320.127142-2-matthew.d.roper@intel.com
2020-03-02 10:48:56 +02:00
Matt Roper
c725161924 drm/i915: Program MBUS with rmw during initialization
It wasn't terribly clear from the bspec's wording, but after discussion
with the hardware folks, it turns out that we need to preserve the
pre-existing contents of the MBUS ABOX control register when
initializing a few specific bits.

Bspec: 49213
Bspec: 50096
Fixes: 4cb4585e5a ("drm/i915/icl: initialize MBus during display init")
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200204011032.582737-1-matthew.d.roper@intel.com
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
(cherry picked from commit 837b63e608)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200228004320.127142-1-matthew.d.roper@intel.com
2020-03-02 10:48:20 +02:00
José Roberto de Souza
33e059a2e4 drm/i915/psr: Force PSR probe only after full initialization
Commit 60c6a14b48 ("drm/i915/display: Force the state compute phase
once to enable PSR") was forcing the state compute too earlier
causing errors because not everything was initialized, so here
moving to the end of i915_driver_modeset_probe() when the display is
all initialized.

Also fixing the place where it disarm the force probe as during the
atomic check phase errors could happen like the ones due locking and
it would cause PSR to never be enabled if that happens.
Leaving the disarm to the atomic commit phase, intel_psr_enable() or
intel_psr_update() will be called even if the current state do not
allow PSR to be enabled.

v2: Check if intel_dp is null in intel_psr_force_mode_changed_set()
v3: Check intel_dp before get dev_priv
v4:
- renamed intel_psr_force_mode_changed_set() to
intel_psr_set_force_mode_changed()
- removed the set parameter from intel_psr_set_force_mode_changed()
- not calling intel_psr_set_force_mode_changed() from
intel_psr_enable/update(), directly setting it after the same checks
that intel_psr_set_force_mode_changed() does
- moved intel_psr_set_force_mode_changed() arm call to
i915_driver_modeset_probe() as it is a better for a PSR call, all the
functions calls happening between the old and the new function call
will cause issue

[backported to v5.6-rc3]

Fixes: 60c6a14b48 ("drm/i915/display: Force the state compute phase once to enable PSR")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/1151
Tested-by: Ross Zwisler <zwisler@google.com>
Reported-by: Ross Zwisler <zwisler@google.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200221212635.11614-1-jose.souza@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20200227205540.126135-1-jose.souza@intel.com
(cherry picked from commit df1a5bfc16)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-03-02 10:46:25 +02:00
Chris Wilson
bb699a7931 drm/i915/gem: Break up long lists of object reclaim
Call cond_resched() between each freed object in case we have a really,
really long list, and we don't want to block normal processes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200221100953.2587176-1-chris@chris-wilson.co.uk
(cherry picked from commit deeee411a9)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-03-02 10:24:34 +02:00
Jani Nikula
8e9a400c70 Merge tag 'gvt-fixes-2020-02-26' of https://github.com/intel/gvt-linux into drm-intel-fixes
gvt-fixes-2020-02-26

- Fix virtual display reset (Tina)
- Fix one use-after-free for dmabuf (Tina)

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200226103016.GC10413@zhen-hp.sh.intel.com
2020-02-26 22:58:25 +02:00
Chris Wilson
2387342621 drm/i915: Avoid recursing onto active vma from the shrinker
We mark the vma as active while binding it in order to protect outselves
from being shrunk under mempressure. This only works if we are strict in
not attempting to shrink active objects.

<6> [472.618968] Workqueue: events_unbound fence_work [i915]
<4> [472.618970] Call Trace:
<4> [472.618974]  ? __schedule+0x2e5/0x810
<4> [472.618978]  schedule+0x37/0xe0
<4> [472.618982]  schedule_preempt_disabled+0xf/0x20
<4> [472.618984]  __mutex_lock+0x281/0x9c0
<4> [472.618987]  ? mark_held_locks+0x49/0x70
<4> [472.618989]  ? _raw_spin_unlock_irqrestore+0x47/0x60
<4> [472.619038]  ? i915_vma_unbind+0xae/0x110 [i915]
<4> [472.619084]  ? i915_vma_unbind+0xae/0x110 [i915]
<4> [472.619122]  i915_vma_unbind+0xae/0x110 [i915]
<4> [472.619165]  i915_gem_object_unbind+0x1dc/0x400 [i915]
<4> [472.619208]  i915_gem_shrink+0x328/0x660 [i915]
<4> [472.619250]  ? i915_gem_shrink_all+0x38/0x60 [i915]
<4> [472.619282]  i915_gem_shrink_all+0x38/0x60 [i915]
<4> [472.619325]  vm_alloc_page.constprop.25+0x1aa/0x240 [i915]
<4> [472.619330]  ? rcu_read_lock_sched_held+0x4d/0x80
<4> [472.619363]  ? __alloc_pd+0xb/0x30 [i915]
<4> [472.619366]  ? module_assert_mutex_or_preempt+0xf/0x30
<4> [472.619368]  ? __module_address+0x23/0xe0
<4> [472.619371]  ? is_module_address+0x26/0x40
<4> [472.619374]  ? static_obj+0x34/0x50
<4> [472.619376]  ? lockdep_init_map+0x4d/0x1e0
<4> [472.619407]  setup_page_dma+0xd/0x90 [i915]
<4> [472.619437]  alloc_pd+0x29/0x50 [i915]
<4> [472.619470]  __gen8_ppgtt_alloc+0x443/0x6b0 [i915]
<4> [472.619503]  gen8_ppgtt_alloc+0xd7/0x300 [i915]
<4> [472.619535]  ppgtt_bind_vma+0x2a/0xe0 [i915]
<4> [472.619577]  __vma_bind+0x26/0x40 [i915]
<4> [472.619611]  fence_work+0x1c/0x90 [i915]
<4> [472.619617]  process_one_work+0x26a/0x620

Fixes: 2850748ef8 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200221221818.2861432-1-chris@chris-wilson.co.uk
(cherry picked from commit 6f24e41022)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-26 14:07:50 +02:00
Michał Winiarski
2de0147d77 drm/i915/pmu: Avoid using globals for PMU events
Attempting to bind / unbind module from devices where we have both
integrated and discreete GPU handled by i915, will cause us to try and
double free the global state, hitting null ptr deref in free_event_attributes.

Let's move it to i915_pmu.

Fixes: 05488673a4 ("drm/i915/pmu: Support multiple GPUs")
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200219161822.24592-2-michal.winiarski@intel.com
(cherry picked from commit 46129dc10f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-26 14:07:50 +02:00
Michał Winiarski
19ee5e8da6 drm/i915/pmu: Avoid using globals for CPU hotplug state
Attempting to bind / unbind module from devices where we have both
integrated and discreete GPU handled by i915 can lead to leaks and
warnings from cpuhp:
Error: Removing state XXX which has instances left.

Let's move the state to i915_pmu.

Fixes: 05488673a4 ("drm/i915/pmu: Support multiple GPUs")
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200219161822.24592-1-michal.winiarski@intel.com
(cherry picked from commit f5a179d468)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-26 14:07:50 +02:00
Chris Wilson
eee18939e5 drm/i915/gtt: Downgrade gen7 (ivb, byt, hsw) back to aliasing-ppgtt
Full-ppgtt on gen7 is proving to be highly unstable and not robust.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/694
Fixes: 3cd6e8860e ("drm/i915/gen7: Re-enable full-ppgtt for ivb & hsw")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200224101120.4024481-1-chris@chris-wilson.co.uk
(cherry picked from commit 4fbe112a56)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-26 14:04:28 +02:00
Jani Nikula
b5dacc8fb5 drm/i915: fix header test with GCOV
$(CC) with $(CFLAGS_GCOV) assumes the output filename with .gcno suffix
appended is writable. This is not the case when the output filename is
/dev/null:

  HDRTEST drivers/gpu/drm/i915/display/intel_frontbuffer.h
/dev/null:1:0: error: cannot open /dev/null.gcno
  HDRTEST drivers/gpu/drm/i915/display/intel_ddi.h
/dev/null:1:0: error: cannot open /dev/null.gcno
make[5]: *** [../drivers/gpu/drm/i915/Makefile:307:
drivers/gpu/drm/i915/display/intel_ddi.hdrtest] Error 1
make[5]: *** Waiting for unfinished jobs....
make[5]: *** [../drivers/gpu/drm/i915/Makefile:307:
drivers/gpu/drm/i915/display/intel_frontbuffer.hdrtest] Error 1

Filter out $(CFLAGS_GVOC) from the header test $(c_flags) as they don't
make sense here anyway.

References: http://lore.kernel.org/r/d8112767-4089-4c58-d7d3-2ce03139858a@infradead.org
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Fixes: c6d4a099a2 ("drm/i915: reimplement header test feature")
Cc: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200221105414.14358-1-jani.nikula@intel.com
(cherry picked from commit 408c1b3253)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-26 14:04:28 +02:00
Tina Zhang
b549c252b1 drm/i915/gvt: Fix orphan vgpu dmabuf_objs' lifetime
Deleting dmabuf item's list head after releasing its container can lead
to KASAN-reported issue:

  BUG: KASAN: use-after-free in __list_del_entry_valid+0x15/0xf0
  Read of size 8 at addr ffff88818a4598a8 by task kworker/u8:3/13119

So fix this issue by puting deleting dmabuf_objs ahead of releasing its
container.

Fixes: dfb6ae4e14 ("drm/i915/gvt: Handle orphan dmabuf_objs")
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200225053527.8336-2-tina.zhang@intel.com
2020-02-25 16:14:20 +08:00
Tina Zhang
3eb55e6f75 drm/i915/gvt: Separate display reset from ALL_ENGINES reset
ALL_ENGINES reset doesn't clobber display with the current gvt-g
supported platforms. Thus ALL_ENGINES reset shouldn't reset the
display engine registers emulated by gvt-g.

This fixes guest warning like

[ 14.622026] [drm] Initialized i915 1.6.0 20200114 for 0000:00:03.0 on minor 0
[ 14.967917] fbcon: i915drmfb (fb0) is primary device
[ 25.100188] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] E RROR [CRTC:51:pipe A] flip_done timed out
[ 25.100860] -----------[ cut here ]-----------
[ 25.100861] pll on state mismatch (expected 0, found 1)
[ 25.101024] WARNING: CPU: 1 PID: 30 at drivers/gpu/drm/i915/display/intel_dis play.c:14382 verify_single_dpll_state.isra.115+0x28f/0x320 [i915]
[ 25.101025] Modules linked in: intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel i915 aesni_intel cr ypto_simd cryptd glue_helper cec rc_core video drm_kms_helper joydev drm input_l eds i2c_algo_bit serio_raw fb_sys_fops syscopyarea sysfillrect sysimgblt mac_hid qemu_fw_cfg sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 e1000 psmouse i2c_piix4 pata_acpi floppy
[ 25.101052] CPU: 1 PID: 30 Comm: kworker/u4:1 Not tainted 5.5.0+ #1
[ 25.101053] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1 .12.1-0-ga5cab58 04/01/2014
[ 25.101055] Workqueue: events_unbound async_run_entry_fn
[ 25.101092] RIP: 0010:verify_single_dpll_state.isra.115+0x28f/0x320 [i915]
[ 25.101093] Code: e0 d9 ff e9 a3 fe ff ff 80 3d e9 c2 11 00 00 44 89 f6 48 c7 c7 c0 9d 88 c0 75 3b e8 eb df d9 ff e9 c7 fe ff ff e8 d1 e0 ae c4 <0f> 0b e9 7a fe ff ff 80 3d c0 c2 11 00 00 8d 71 41 89 c2 48 c7 c7
[ 25.101093] RSP: 0018:ffffb1de80107878 EFLAGS: 00010286
[ 25.101094] RAX: 0000000000000000 RBX: ffffb1de80107884 RCX: 0000000000000007
[ 25.101095] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff94fdfdd19740
[ 25.101095] RBP: ffffb1de80107938 R08: 0000000d6bfdc7b4 R09: 000000000000002b
[ 25.101096] R10: ffff94fdf82dc000 R11: 0000000000000225 R12: 00000000000001f8
[ 25.101096] R13: ffff94fdb3ca6a90 R14: ffff94fdb3ca0000 R15: 0000000000000000
[ 25.101097] FS: 0000000000000000(0000) GS:ffff94fdfdd00000(0000) knlGS:00000 00000000000
[ 25.101098] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 25.101098] CR2: 00007fbc3e2be9c8 CR3: 000000003339a003 CR4: 0000000000360ee0
[ 25.101101] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 25.101101] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 25.101102] Call Trace:
[ 25.101139] intel_atomic_commit_tail+0xde4/0x1520 [i915]
[ 25.101141] ? flush_workqueue_prep_pwqs+0xfa/0x130
[ 25.101142] ? flush_workqueue+0x198/0x3c0
[ 25.101174] intel_atomic_commit+0x2ad/0x320 [i915]
[ 25.101209] drm_atomic_commit+0x4a/0x50 [drm]
[ 25.101220] drm_client_modeset_commit_atomic+0x1c4/0x200 [drm]
[ 25.101231] drm_client_modeset_commit_force+0x47/0x170 [drm]
[ 25.101250] drm_fb_helper_restore_fbdev_mode_unlocked+0x4e/0xa0 [drm_kms_hel per]
[ 25.101255] drm_fb_helper_set_par+0x2d/0x60 [drm_kms_helper]
[ 25.101287] intel_fbdev_set_par+0x1a/0x40 [i915]
[ 25.101289] ? con_is_visible+0x2e/0x60
[ 25.101290] fbcon_init+0x378/0x600
[ 25.101292] visual_init+0xd5/0x130
[ 25.101296] do_bind_con_driver+0x217/0x430
[ 25.101297] do_take_over_console+0x7d/0x1b0
[ 25.101298] do_fbcon_takeover+0x5c/0xb0
[ 25.101299] fbcon_fb_registered+0x199/0x1a0
[ 25.101301] register_framebuffer+0x22c/0x330
[ 25.101306] __drm_fb_helper_initial_config_and_unlock+0x31a/0x520 [drm_kms_h elper]
[ 25.101311] drm_fb_helper_initial_config+0x35/0x40 [drm_kms_helper]
[ 25.101341] intel_fbdev_initial_config+0x18/0x30 [i915]
[ 25.101342] async_run_entry_fn+0x3c/0x150
[ 25.101343] process_one_work+0x1fd/0x3f0
[ 25.101344] worker_thread+0x34/0x410
[ 25.101346] kthread+0x121/0x140
[ 25.101346] ? process_one_work+0x3f0/0x3f0
[ 25.101347] ? kthread_park+0x90/0x90
[ 25.101350] ret_from_fork+0x35/0x40
[ 25.101351] --[ end trace b5b47d44cd998ba1 ]--

Fixes: 6294b61ba7 ("drm/i915/gvt: add missing display part reset for vGPU reset")
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200221023234.28635-1-tina.zhang@intel.com
2020-02-24 17:55:27 +08:00
Chris Wilson
15de9cb5c9 drm/i915/gt: Avoid resetting ring->head outside of its timeline mutex
We manipulate ring->head while active in i915_request_retire underneath
the timeline manipulation. We cannot rely on a stable ring->head outside
of the timeline->mutex, in particular while setting up the context for
resume and reset.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1126
Fixes: 0881954965 ("drm/i915: Introduce intel_context.pin_mutex for pin management")
Fixes: e5dadff4b0 ("drm/i915: Protect request retirement with timeline->mutex")
References: f3c0efc9fe ("drm/i915/execlists: Leave resetting ring to intel_ring")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200211120131.958949-1-chris@chris-wilson.co.uk
(cherry picked from commit 42827350f7)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-18 09:53:18 +02:00
Chris Wilson
b1339ecac6 drm/i915/execlists: Always force a context reload when rewinding RING_TAIL
If we rewind the RING_TAIL on a context, due to a preemption event, we
must force the context restore for the RING_TAIL update to be properly
handled. Rather than note which preemption events may cause us to rewind
the tail, compare the new request's tail with the previously submitted
RING_TAIL, as it turns out that timeslicing was causing unexpected
rewinds.

   <idle>-0       0d.s2 1280851190us : __execlists_submission_tasklet: 0000:00:02.0 rcs0: expired last=130:4698, prio=3, hint=3
   <idle>-0       0d.s2 1280851192us : __i915_request_unsubmit: 0000:00:02.0 rcs0: fence 66:119966, current 119964
   <idle>-0       0d.s2 1280851195us : __i915_request_unsubmit: 0000:00:02.0 rcs0: fence 130:4698, current 4695
   <idle>-0       0d.s2 1280851198us : __i915_request_unsubmit: 0000:00:02.0 rcs0: fence 130:4696, current 4695
^----  Note we unwind 2 requests from the same context

   <idle>-0       0d.s2 1280851208us : __i915_request_submit: 0000:00:02.0 rcs0: fence 130:4696, current 4695
   <idle>-0       0d.s2 1280851213us : __i915_request_submit: 0000:00:02.0 rcs0: fence 134:1508, current 1506
^---- But to apply the new timeslice, we have to replay the first request
      before the new client can start -- the unexpected RING_TAIL rewind

   <idle>-0       0d.s2 1280851219us : trace_ports: 0000:00:02.0 rcs0: submit { 130:4696*, 134:1508 }
 synmark2-5425    2..s. 1280851239us : process_csb: 0000:00:02.0 rcs0: cs-irq head=5, tail=0
 synmark2-5425    2..s. 1280851240us : process_csb: 0000:00:02.0 rcs0: csb[0]: status=0x00008002:0x00000000
^---- Preemption event for the ELSP update; note the lite-restore

 synmark2-5425    2..s. 1280851243us : trace_ports: 0000:00:02.0 rcs0: preempted { 130:4698, 66:119966 }
 synmark2-5425    2..s. 1280851246us : trace_ports: 0000:00:02.0 rcs0: promote { 130:4696*, 134:1508 }
 synmark2-5425    2.... 1280851462us : __i915_request_commit: 0000:00:02.0 rcs0: fence 130:4700, current 4695
 synmark2-5425    2.... 1280852111us : __i915_request_commit: 0000:00:02.0 rcs0: fence 130:4702, current 4695
 synmark2-5425    2.Ns1 1280852296us : process_csb: 0000:00:02.0 rcs0: cs-irq head=0, tail=2
 synmark2-5425    2.Ns1 1280852297us : process_csb: 0000:00:02.0 rcs0: csb[1]: status=0x00000814:0x00000000
 synmark2-5425    2.Ns1 1280852299us : trace_ports: 0000:00:02.0 rcs0: completed { 130:4696!, 134:1508 }
 synmark2-5425    2.Ns1 1280852301us : process_csb: 0000:00:02.0 rcs0: csb[2]: status=0x00000818:0x00000040
 synmark2-5425    2.Ns1 1280852302us : trace_ports: 0000:00:02.0 rcs0: completed { 134:1508, 0:0 }
 synmark2-5425    2.Ns1 1280852313us : process_csb: process_csb:2336 GEM_BUG_ON(!i915_request_completed(*execlists->active) && !reset_in_progress(execlists))

Fixes: 8ee36e048c ("drm/i915/execlists: Minimalistic timeslicing")
Referenecs: 82c69bf586 ("drm/i915/gt: Detect if we miss WaIdleLiteRestore")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Link: https://patchwork.freedesktop.org/patch/msgid/20200207211452.2860634-1-chris@chris-wilson.co.uk
(cherry picked from commit 5ba32c7be8)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-18 09:53:09 +02:00
Chris Wilson
aa3146193a drm/i915: Wean off drm_pci_alloc/drm_pci_free
drm_pci_alloc and drm_pci_free are just very thin wrappers around
dma_alloc_coherent, with a note that we should be removing them.
Furthermore since

commit de09d31dd3
Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Date:   Fri Jan 15 16:51:42 2016 -0800

    page-flags: define PG_reserved behavior on compound pages

    As far as I can see there's no users of PG_reserved on compound pages.
    Let's use PF_NO_COMPOUND here.

drm_pci_alloc has been declared broken since it mixes GFP_COMP and
SetPageReserved. Avoid this conflict by weaning ourselves off using the
abstraction and using the dma functions directly.

Reported-by: Taketo Kabe
Closes: https://gitlab.freedesktop.org/drm/intel/issues/1027
Fixes: de09d31dd3 ("page-flags: define PG_reserved behavior on compound pages")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v4.5+
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200202153934.3899472-1-chris@chris-wilson.co.uk
(cherry picked from commit c6790dc223)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-18 09:52:54 +02:00
Chris Wilson
19b5f3b419 drm/i915/gt: Protect defer_request() from new waiters
Mika spotted

<4>[17436.705441] general protection fault: 0000 [#1] PREEMPT SMP PTI
<4>[17436.705447] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.5.0+ #1
<4>[17436.705449] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 3805 05/16/2018
<4>[17436.705512] RIP: 0010:__execlists_submission_tasklet+0xc4d/0x16e0 [i915]
<4>[17436.705516] Code: c5 4c 8d 60 e0 75 17 e9 8c 07 00 00 49 8b 44 24 20 49 39 c5 4c 8d 60 e0 0f 84 7a 07 00 00 49 8b 5c 24 08 49 8b 87 80 00 00 00 <48> 39 83 d8 fe ff ff 75 d9 48 8b 83 88 fe ff ff a8 01 0f 84 b6 05
<4>[17436.705518] RSP: 0018:ffffc9000012ce80 EFLAGS: 00010083
<4>[17436.705521] RAX: ffff88822ae42000 RBX: 5a5a5a5a5a5a5a5a RCX: dead000000000122
<4>[17436.705523] RDX: ffff88822ae42588 RSI: ffff8881e32a7908 RDI: ffff8881c429fd48
<4>[17436.705525] RBP: ffffc9000012cf00 R08: ffff88822ae42588 R09: 00000000fffffffe
<4>[17436.705527] R10: ffff8881c429fb80 R11: 00000000a677cf08 R12: ffff8881c42a0aa8
<4>[17436.705529] R13: ffff8881c429fd38 R14: ffff88822ae42588 R15: ffff8881c429fb80
<4>[17436.705532] FS:  0000000000000000(0000) GS:ffff88822ed00000(0000) knlGS:0000000000000000
<4>[17436.705534] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[17436.705536] CR2: 00007f858c76d000 CR3: 0000000005610003 CR4: 00000000003606e0
<4>[17436.705538] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[17436.705540] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[17436.705542] Call Trace:
<4>[17436.705545]  <IRQ>
<4>[17436.705603]  execlists_submission_tasklet+0xc0/0x130 [i915]

which is us consuming a partially initialised new waiter in
defer_requests(). We can prevent this by initialising the i915_dependency
prior to making it visible, and since we are using a concurrent
list_add/iterator mark them up to the compiler.

Fixes: 8ee36e048c ("drm/i915/execlists: Minimalistic timeslicing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200206204915.2636606-2-chris@chris-wilson.co.uk
(cherry picked from commit f14f27b166)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-17 21:24:19 +02:00
Chris Wilson
e543e370ec drm/i915/gt: Prevent queuing retire workers on the virtual engine
Virtual engines are fleeting. They carry a reference count and may be freed
when their last request is retired. This makes them unsuitable for the
task of housing engine->retire.work so assert that it is not used.

Tvrtko tracked down an instance where we did indeed violate this rule.
In virtual_submit_request, we flush a completed request directly with
__i915_request_submit and this causes us to queue that request on the
veng's breadcrumb list and signal it. Leading us down a path where we
should not attach the retire.

Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: dc93c9b693 ("drm/i915/gt: Schedule request retirement when signaler idles")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200206204915.2636606-1-chris@chris-wilson.co.uk
(cherry picked from commit f91d8156ab)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-17 21:24:13 +02:00
Jani Nikula
2e0a576511 drm/i915/dsc: force full modeset whenever DSC is enabled at probe
We lack full state readout of DSC config, which may lead to DSC enable
using a config that's all zeros, failing spectacularly. Force full
modeset and thus compute config at probe to get a sane state, until we
implement DSC state readout. Any fastset that did appear to work with
DSC at probe, worked by coincidence. [1] is an example of a change that
triggered the issue on TGL DSI DSC.

[1] http://patchwork.freedesktop.org/patch/msgid/20200212150102.7600-1-ville.syrjala@linux.intel.com

Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Vandita Kulkarni <vandita.kulkarni@intel.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Fixes: fbacb15ea8 ("drm/i915/dsc: add basic hardware state readout support")
Acked-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200213140412.32697-3-stanislav.lisovskiy@intel.com
(cherry picked from commit a4277aa398)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-17 21:16:47 +02:00
Matt Roper
58e9121c32 drm/i915/ehl: Update port clock voltage level requirements
Voltage level depends not only on the cdclk, but also on the DDI clock.
Last time the bspec voltage level table for EHL was updated, we only
updated the cdclk requirements, but forgot to account for the new port
clock criteria.

Bspec: 21809
Fixes: d147483884 ("drm/i915/ehl: Update voltage level checks")
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200207001417.1229251-1-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit 9d5fd37ed7)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-17 21:16:47 +02:00
Jani Nikula
7ddc7005a0 drm/i915: Update drm/i915 bug filing URL
We've moved from bugzilla to gitlab.

Cc: stable@vger.kernel.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200212160434.6437-2-jani.nikula@intel.com
(cherry picked from commit ddae4d7af0)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-17 21:16:45 +02:00
Chris Wilson
c01e8da2cd drm/i915: Initialise basic fence before acquiring seqno
Inside the intel_timeline_get_seqno(), we currently track the retirement
of the old cachelines by listening to the new request. This requires
that the new request is ready to be used and so requires a minimum bit
of initialisation prior to getting the new seqno.

Fixes: b1e3177bd1 ("drm/i915: Coordinate i915_active with its own mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200203094152.4150550-2-chris@chris-wilson.co.uk
(cherry picked from commit 855e39e65c)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-17 19:17:10 +02:00
Chris Wilson
dea8d5ce46 drm/i915/gem: Require per-engine reset support for non-persistent contexts
To enable non-persistent contexts, we require a means of cancelling any
inflight work from that context. This is first done "gracefully" by
using preemption to kick the active context off the engine, and then
forcefully by resetting the engine if it is active. If we are unable to
reset the engine to remove hostile userspace, we should not allow
userspace to opt into using non-persistent contexts.

If the per-engine reset fails, we still do a full GPU reset, but that is
rare and usually indicative of much deeper issues. The damage is already
done. However, the goal of the interface to allow long running compute
jobs without causing collateral damage elsewhere, and if we are unable
to support that we should make that known by not providing the
interface (and falsely pretending we can).

Fixes: a0e047156c ("drm/i915/gem: Make context persistence optional")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200130164553.1937718-1-chris@chris-wilson.co.uk
(cherry picked from commit d1b9b5f127)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-17 19:17:02 +02:00
Dave Airlie
6f4134b30b drm/i915 fixes for v5.6-rc2
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEFWWmW3ewYy4RJOWc05gHnSar7m8FAl5FGuMACgkQ05gHnSar
 7m8vRxAAobELoLLkdf44Cf4f8hgIxLRYiWxSVavh6ucFXBlGEWpGVwf2t8fM6ZBB
 yBNJWcxun6F8wy8aHZxjAHSK/LDL5sKUyMdU+GCUthkgtJxM/SYTJLvFL1y+Eacr
 PuQ50IXQFHRRQI1Cp6kVo9Y91/oU2LuWzrX82ZOIcxglO35A8vm3iT4Ggno3cDli
 1vAl1VIbXQX2GKhm1y4dGK2/lzbeN4byqJNpGQIq+1PDBEVgNsOPXRMhNLBFqIhA
 yVn/t1Z780KSTh8Oa24xkLSFKj4y0Yj7TDdkmIsaxPADqxy6Ptiuysf+scuPEpOS
 epRG3R3Dtajb+ZHzV2A5TmVAlgEvSDBKWKDA9wBzMIEKS8m5eW1UoDuJ4JhRy/IR
 ZNVcPNRAX61owmjEhlncQh9Mx8cUF3ku1Oup17/cm5o9Tcphubl6ilGmC5JAO3zj
 rX6NUyxbp4h9Gv6kY1eQfXtAe8Vo+vwejStew4ajo/r2PdAlRWzDUXyn6K7kXRb2
 3btgaVKulLAQQayP5FPp3LXvyaU4/Zg6QYKaV+5sXDDy/onvwoK4m1z9dxxC511a
 0DnpZOIIX0eVL5p7/FcIkfMan5wKK2QiWYfKe2jVC/9TooxK2g4brp+ImCXZlt8t
 kT/M1sYKkblUy4KN+f/asKLM85jzRjw5lH7LPiCwChj8KVWS7BQ=
 =BZol
 -----END PGP SIGNATURE-----

Merge tag 'drm-intel-next-fixes-2020-02-13' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

drm/i915 fixes for v5.6-rc2

Most of these were aimed at a "next fixes" pull already during the merge
window, but there were issues with the baseline I used, which resulted
in a lot of issues in CI. I've regenerated this stuff piecemeal now,
adding gradually to it, and it seems healthy now.

Due to the issues this is much bigger than I'd like. But it was
obviously necessary to take the time to ensure it's not garbage...

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/878sl6yfrn.fsf@intel.com
2020-02-14 13:04:46 +10:00
Chris Wilson
2aaaa5ee1c drm/i915: Mark the removal of the i915_request from the sched.link
Keep the rq->fence.flags consistent with the status of the
rq->sched.link, and clear the associated bits when decoupling the link
on retirement (as we may wish to inspect those flags independent of
other state).

Fixes: c3f1ed90e6 ("drm/i915/gt: Allow temporary suspension of inflight requests")
References: https://gitlab.freedesktop.org/drm/intel/issues/997
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200122140243.495621-3-chris@chris-wilson.co.uk
(cherry picked from commit b4a9a149f9)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-12 17:04:33 +02:00
Chris Wilson
a2f90f4ff3 drm/i915/execlists: Reclaim the hanging virtual request
If we encounter a hang on a virtual engine, as we process the hang the
request may already have been moved back to the virtual engine (we are
processing the hang on the physical engine). We need to reclaim the
request from the virtual engine so that the locking is consistent and
local to the real engine on which we will hold the request for error
state capturing.

v2: Pull the reclamation into execlists_hold() and assert that cannot be
called from outside of the reset (i.e. with the tasklet disabled).
v3: Added selftest
v4: Drop the reference owned by the virtual engine

Fixes: ad18ba7b5e ("drm/i915/execlists: Offline error capture")
Testcase: igt/gem_exec_balancer/hang
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200122140243.495621-2-chris@chris-wilson.co.uk
(cherry picked from commit 989df3a7bd)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-12 17:03:53 +02:00
Chris Wilson
317e0395cc drm/i915/execlists: Take a reference while capturing the guilty request
Thanks to preempt-to-busy, we leave the request on the HW as we submit
the preemption request. This means that the request may complete at any
moment as we process HW events, and in particular the request may be
retired as we are planning to capture it for a preemption timeout.

Be more careful while obtaining the request to capture after a
preemption timeout, and check to see if it completed before we were able
to put it on the on-hold list. If we do see it did complete just before
we capture the request, proclaim the preemption-timeout a false positive
and pardon the reset as we should hit an arbitration point momentarily
and so be able to process the preemption.

Note that even after we move the request to be on hold it may be retired
(as the reset to stop the HW comes after), so we do require to hold our
own reference as we work on the request for capture (and all of the
peeking at state within the request needs to be carefully protected).

Fixes: c3f1ed90e6 ("drm/i915/gt: Allow temporary suspension of inflight requests")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/997
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200122140243.495621-1-chris@chris-wilson.co.uk
(cherry picked from commit 4ba5c086a1)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-12 17:02:53 +02:00
Chris Wilson
ad18ba7b5e drm/i915/execlists: Offline error capture
Currently, we skip error capture upon forced preemption. We apply forced
preemption when there is a higher priority request that should be
running but is being blocked, and we skip inline error capture so that
the preemption request is not further delayed by a user controlled
capture -- extending the denial of service.

However, preemption reset is also used for heartbeats and regular GPU
hangs. By skipping the error capture, we remove the ability to debug GPU
hangs.

In order to capture the error without delaying the preemption request
further, we can do an out-of-line capture by removing the guilty request
from the execution queue and scheduling a worker to dump that request.
When removing a request, we need to remove the entire context and all
descendants from the execution queue, so that they do not jump past.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/738
Fixes: 3a7a92aba8 ("drm/i915/execlists: Force preemption")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200116184754.2860848-3-chris@chris-wilson.co.uk
(cherry picked from commit 748317386a)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-12 16:55:58 +02:00
Chris Wilson
c3f1ed90e6 drm/i915/gt: Allow temporary suspension of inflight requests
In order to support out-of-line error capture, we need to remove the
active request from HW and put it to one side while a worker compresses
and stores all the details associated with that request. (As that
compression may take an arbitrary user-controlled amount of time, we
want to let the engine continue running on other workloads while the
hanging request is dumped.) Not only do we need to remove the active
request, but we also have to remove its context and all requests that
were dependent on it (both in flight, queued and future submission).

Finally once the capture is complete, we need to be able to resubmit the
request and its dependents and allow them to execute.

v2: Replace stack recursion with a simple list.
v3: Check all the parents, not just the first, when searching for a
stuck ancestor!

References: https://gitlab.freedesktop.org/drm/intel/issues/738
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200116184754.2860848-2-chris@chris-wilson.co.uk
(cherry picked from commit 32ff621fd7)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-12 16:55:58 +02:00
Chris Wilson
9e2750fc80 drm/i915: Keep track of request among the scheduling lists
If we keep track of when the i915_request.sched.link is on the HW
runlist, or in the priority queue we can simplify our interactions with
the request (such as during rescheduling). This also simplifies the next
patch where we introduce a new in-between list, for requests that are
ready but neither on the run list or in the queue.

v2: Update i915_sched_node.link explanation for current usage where it
is a link on both the queue and on the runlists.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200116184754.2860848-1-chris@chris-wilson.co.uk
(cherry picked from commit 672c368f93)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-12 16:55:58 +02:00
Jani Nikula
cc3251d8ef Merge tag 'gvt-fixes-2020-02-12' of https://github.com/intel/gvt-linux into drm-intel-next-fixes
gvt-fixes-2020-02-12

- fix possible high-order allocation fail for late load (Igor)
- fix one missed lock for ppgtt mm LRU list (Igor)

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200212065912.GB4997@zhen-hp.sh.intel.com
2020-02-12 16:50:04 +02:00
Chris Wilson
2933803bdc drm/i915/gem: Tighten checks and acquiring the mmap object
Make sure we hold the rcu lock as we acquire the rcu protected reference
of the object when looking it up from the associated mmap vma.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1083
Fixes: cc662126b4 ("drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200130143931.1906301-1-chris@chris-wilson.co.uk
(cherry picked from commit 280d14a69d)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-12 13:24:45 +02:00
José Roberto de Souza
52144db130 drm/i915: Fix preallocated barrier list append
Only the first and the last nodes were being added to
ref->preallocated_barriers.

Renaming variables to make it more easy to read.

Fixes: 8413502238 ("drm/i915/gt: Drop mutex serialisation between context pin/unpin")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200129232345.84512-1-jose.souza@intel.com
(cherry picked from commit d4c3c0b822)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-12 13:24:45 +02:00