Commit graph

2181 commits

Author SHA1 Message Date
Daniel Vetter
d5d0804f8f drm/i915: Update DRIVER_DATE to 20160822
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-22 08:35:48 +02:00
Chris Wilson
c58305af18 drm/i915: Use remap_io_mapping() to prefault all PTE in a single pass
Very old numbers indicate this is a 66% improvement when remapping the
entire object for fence contention - due to the elimination of
track_pfn_insert and its strcmp.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Testcase: igt/gem_fence_upload/performance
Testcase: igt/gem_mmap_gtt
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160819155428.1670-6-chris@chris-wilson.co.uk
2016-08-19 17:13:36 +01:00
Dave Gordon
0be81156b3 drm/i915: Reattach comment, complete type specification
In the recent patch
bc3d674 drm/i915: Allow userspace to request no-error-capture upon ...
the final version moved the flags and the associated #defines around
so they were adjacent; unfortunately, they ended up between a comment
and the thing (hw_id) to which the comment applies :(

So this patch reshuffles the comment and subject back together.

Also, as we're touching 'hw_id', let's change it from just 'unsigned'
to a fully-specified 'unsigned int', because some code checking tools
(including checkpatch) object to plain 'unsigned'.

Fixes: bc3d674462 ("drm/i915: Allow userspace to request no-error-capture...")
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1471616622-6919-1-git-send-email-david.s.gordon@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-08-19 15:31:17 +01:00
Chris Wilson
7756e45407 drm/i915/cmdparser: Make initialisation failure non-fatal
If the developer adds a register in the wrong order, we BUG during boot.
That makes development and testing very difficult. Let's be a bit more
friendly and disable the command parser with a big warning if the tables
are invalid.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-31-chris@chris-wilson.co.uk
2016-08-18 22:36:58 +01:00
Chris Wilson
50349247ea drm/i915: Drop ORIGIN_GTT for untracked GTT writes
If FBC is set on a framebuffer that is unmapped, all GTT faults will be
from a partial mapping. Writes by the user through the partial VMA are
then untracked by the FBC and so we must use the ORIGIN_CPU when flushing
the I915_GEM_DOMAIN_GTT.

v2: Keep ORIGIN_CPU for set-to-domain(.write=CPU)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: "Zanoni, Paulo R" <paulo.r.zanoni@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-25-chris@chris-wilson.co.uk
2016-08-18 22:36:55 +01:00
Chris Wilson
49ef5294cd drm/i915: Move fence tracking from object to vma
In order to handle tiled partial GTT mmappings, we need to associate the
fence with an individual vma.

v2: A couple of silly drops replaced spotted by Joonas

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-21-chris@chris-wilson.co.uk
2016-08-18 22:36:50 +01:00
Chris Wilson
a1e5afbe4d drm/i915: Rename fence.lru_list to link
Our current practice is to only name the actual list (here
dev_priv->fence_list) using "list", and elements upon that list are
referred to as "link". Further, the lru nature is of the list and not of
the node and including in the name does not disambiguate the link from
anything else.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-20-chris@chris-wilson.co.uk
2016-08-18 22:36:49 +01:00
Chris Wilson
05a20d098d drm/i915: Move map-and-fenceable tracking to the VMA
By moving map-and-fenceable tracking from the object to the VMA, we gain
fine-grained tracking and the ability to track individual fences on the VMA
(subsequent patch).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-16-chris@chris-wilson.co.uk
2016-08-18 22:36:48 +01:00
Chris Wilson
43394c7d0d drm/i915: Extract i915_gem_obj_prepare_shmem_write()
This is a companion to i915_gem_obj_prepare_shmem_read() that prepares
the backing storage for direct writes. It first serialises with the GPU,
pins the backing storage and then indicates what clfushes are required in
order for the writes to be coherent.

Whilst here, fix support for ancient CPUs without clflush for which we
cannot do the GTT+clflush tricks.

v2: Add i915_gem_obj_finish_shmem_access() for symmetry

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-8-chris@chris-wilson.co.uk
2016-08-18 22:36:44 +01:00
Chris Wilson
4b30cb2334 drm/i915: vfree() no longer ignores the low bits of the address
Since vfree() now likes to WARN when passed a non-page-aligned pointer,
we need to discard the low bits to comply with it.

Fixes: d31d7cb146 ("drm/i915: Support for creating write combined type vmaps")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-3-chris@chris-wilson.co.uk
2016-08-18 22:36:23 +01:00
Chris Wilson
600f436801 drm/i915: Unconditionally flush any chipset buffers before execbuf
If userspace is asynchronously streaming into the batch or other
execobjects, we may not flush those writes along with a change in cache
domain (as there is no change). Therefore those writes may end up in
internal chipset buffers and not visible to the GPU upon execution. We
must issue a flush command or otherwise we encounter incoherency in the
batchbuffers and the GPU executing invalid commands (i.e. hanging) quite
regularly.

v2: Throw a paranoid wmb() into the general flush so that we remain
consistent with before.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90841
Fixes: 1816f92363 ("drm/i915: Support creation of unbound wc user...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Akash Goel <akash.goel@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Tested-by: Matti Hämäläinen <ccr@tnsp.org>
Cc: stable@vger.kernel.org
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-1-chris@chris-wilson.co.uk
2016-08-18 22:36:15 +01:00
Chris Wilson
21a2c58a9c drm/i915: Record the RING_MODE register for post-mortem debugging
Just another useful register to inspect following a GPU hang.

v2: Remove partial decoding of RING_MODE to userspace, be consistent and
use GEN > 2 guards around RING_MODE everywhere.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-32-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:19 +01:00
Chris Wilson
03382dfb63 drm/i915: Print the batchbuffer offset next to BBADDR in error state
It is useful when looking at captured error states to check the recorded
BBADDR register (the address of the last batchbuffer instruction loaded)
against the expected offset of the batch buffer, and so do a quick check
that (a) the capture is true or (b) HEAD hasn't wandered off into the
badlands.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-30-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:16 +01:00
Chris Wilson
c84455b4ba drm/i915: Move debug only per-request pid tracking from request to ctx
Since contexts are not currently shared between userspace processes, we
have an exact correspondence between context creator and guilty batch
submitter. Therefore we can save some per-batch work by inspecting the
context->pid upon error instead. Note that we take the context's
creator's pid rather than the file's pid in order to better track fd
passed over sockets.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-29-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:15 +01:00
Chris Wilson
bde13ebdab drm/i915: Introduce i915_ggtt_offset()
This little helper only exists to safely discard the upper unused 32bits
of the general 64-bit VMA address - as we know that all Global GTT
currently are less than 4GiB in size and so that the upper bits must be
zero. In many places, we use a u32 for the global GTT offset and we want
to document where we are discarding the full VMA offset.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-28-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:14 +01:00
Chris Wilson
058d88c433 drm/i915: Track pinned VMA
Treat the VMA as the primary struct responsible for tracking bindings
into the GPU's VM. That is we want to treat the VMA returned after we
pin an object into the VM as the cookie we hold and eventually release
when unpinning. Doing so eliminates the ambiguity in pinning the object
and then searching for the relevant pin later.

v2: Joonas' stylistic nitpicks, a fun rebase.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-27-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:13 +01:00
Chris Wilson
51d545d026 drm/i915: Use VMA as the primary tracker for semaphore page
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-23-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:10 +01:00
Chris Wilson
bf3783e52a drm/i915: Use VMA as the primary object for context state
When working with contexts, we most frequently want the GGTT VMA for the
context state, first and foremost. Since the object is available via the
VMA, we need only then store the VMA.

v2: Formatting tweaks to debugfs output, restored some comments removed
in the next patch

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-15-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:02 +01:00
Chris Wilson
624192cfd3 drm/i915: Add convenience wrappers for vma's object get/put
The VMA are unreferenced, they belong to the object and live until they
are closed. However, if we want to use the VMA as a cookie and use it to
keep the object alive, we want to hold onto a reference to the object
for the lifetime of the VMA cookie. To facilitate this, add a couple of
simple wrappers for managing the reference count on the object owning the
VMA.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-11-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:00:57 +01:00
Chris Wilson
78ef2d9aba drm/i915: Add fetch_and_zero() macro
A simple little macro to clear a pointer and return the old value. This
is useful for writing

	value = *ptr;
	if (!value)
		return;

	*ptr = 0;
	...
	free(value);

in a slightly more concise form:

	value = fetch_and_zero(ptr);
	if (!value)
		return;

	...
	free(value);

with the idea that this establishes a pattern that may be extended for
atomic use (using xchg or cmpxchg) i.e. atomic_fetch_and_zero() and
similar to llist.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-10-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:00:56 +01:00
Chris Wilson
2bd160a131 drm/i915: Reduce i915_gem_objects to only show object information
No longer is knowing how much of the GTT (both mappable aperture and
beyond) relevant, and the output clutters the real information - that is
how many objects are allocated and bound (and by who) so that we can
quickly grasp if there is a leak.

v2: Relent, and rename pinned to indicate display only. Since the
display objects are semi-static and are of variable size, they are the
interesting objects to watch over time for aperture leaking. The other
pins are either static (such as the scratch page) or very short lived
(such as execbuf) and not part of the precious GGTT.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-6-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:00:52 +01:00
Chris Wilson
c0ce466361 drm/i915: Reduce amount of duplicate buffer information captured on error
When capturing the error state, we do not need to know about every
address space - just those that are related to the error. We know which
context is active at the time, therefore we know which VM are implicated
in the error. We can then restrict the VM which we report to the
relevant subset.

v2: s/i/count_active/ (and similar)
    Rewrite label generation for "Buffers"

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-2-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:00:49 +01:00
Chris Wilson
d045446df1 drm/i915: Record the position of the start of the request
Not only does it make for good documentation and debugging aide, but it is
also vital for when we want to unwind requests - such as when throwing away
an incomplete request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1470414607-32453-2-git-send-email-arun.siluvery@linux.intel.com
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-1-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:00:49 +01:00
Chris Wilson
0b1de5d58e drm/i915: Use SSE4.1 movntdqa to accelerate reads from WC memory
This patch provides the infrastructure for performing a 16-byte aligned
read from WC memory using non-temporal instructions introduced with sse4.1.
Using movntdqa we can bypass the CPU caches and read directly from memory
and ignoring the page attributes set on the CPU PTE i.e. negating the
impact of an otherwise UC access. Copying using movntdqa from WC is almost
as fast as reading from WB memory, modulo the possibility of both hitting
the CPU cache or leaving the data in the CPU cache for the next consumer.
(The CPU cache itself my be flushed for the region of the movntdqa and on
later access the movntdqa reads from a separate internal buffer for the
cacheline.) The write back to the memory is however cached.

This will be used in later patches to accelerate accessing WC memory.

v2: Report whether the accelerated copy is successful/possible.
v3: Function alignment override was only necessary when using the
function target("sse4.1") - which is not necessary for emitting movntdqa
from __asm__.
v4: Improve notes on CPU cache behaviour vs non-temporal stores.
v5: Fix byte offsets for unrolled moves.
v6: Find all remaining typos of "movntqda", use kernel_fpu_begin.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Akash Goel <akash.goel@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471001999-17787-2-git-send-email-chris@chris-wilson.co.uk
2016-08-12 13:06:37 +01:00
Chris Wilson
d31d7cb146 drm/i915: Support for creating write combined type vmaps
vmaps has a provision for controlling the page protection bits, with which
we can use to control the mapping type, e.g. WB, WC, UC or even WT.
To allow the caller to choose their mapping type, we add a parameter to
i915_gem_object_pin_map - but we still only allow one vmap to be cached
per object. If the object is currently not pinned, then we recreate the
previous vmap with the new access type, but if it was pinned we report an
error. This effectively limits the access via i915_gem_object_pin_map to a
single mapping type for the lifetime of the object. Not usually a problem,
but something to be aware of when setting up the object's vmap.

We will want to vary the access type to enable WC mappings of ringbuffer
and context objects on !llc platforms, as well as other objects where we
need coherent access to the GPU's pages without going through the GTT

v2: Remove the redundant braces around pin count check and fix the marker
     in documentation (Chris)

v3:
- Add a new enum for the vmalloc mapping type & pass that as an argument to
   i915_object_pin_map. (Tvrtko)
- Use PAGE_MASK to extract or filter the mapping type info and remove a
   superfluous BUG_ON.(Tvrtko)

v4:
- Rename the enums and clean up the pin_map function. (Chris)

v5: Drop the VM_NO_GUARD, minor cosmetics.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471001999-17787-1-git-send-email-chris@chris-wilson.co.uk
2016-08-12 13:06:36 +01:00
Tvrtko Ursulin
c1bb11451e drm/i915: Store number of active engines in device info
Until now code was calling hweight32 to figure out the
number from device_info->ring_mask at runtime. Instead
we can cache it at engine init time and use directly.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1470842530-35854-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-08-11 11:33:10 +01:00
Imre Deak
eebb40e081 drm/i915: Remove LVDS and PPS suspend time save/restore
In the preceding patches we made sure that:
- the LVDS encoder takes care of reiniting both the LVDS register
and its PPS
- the eDP encoder takes care of reiniting its PPS
- the PPS register unlocking workaround is applied explicitly whenever
the PPS context is lost

Based on the above we can safely remove the opaque LVDS and PPS save /
restore from generic code.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470827254-21954-6-git-send-email-imre.deak@intel.com
2016-08-10 16:02:14 +03:00
Imre Deak
44cb734cd2 drm/i915: Merge the PPS register definitions
The PPS registers are pretty much the same everywhere, the differences
being:
- Register fields appearing, disappearing from one platform to the
  next: panel-reset-on-powerdown, backlight-on, panel-port,
  register-unlock
- Different register base addresses
- Different number of PPS instances: 2 on VLV/CHV/BXT, 1 everywhere
  else.

We can merge the separate set of PPS definitions by extending the PPS
instance argument to all platforms and using instance 0 on platforms
with a single instance. This means we'll need to calculate the register
addresses dynamically based on the given platform and PPS instance.

v2:
- Simplify if ladder in intel_pps_get_registers(). (Ville)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470827254-21954-1-git-send-email-imre.deak@intel.com
2016-08-10 16:00:07 +03:00
Chris Wilson
dbd6ef29a7 drm/i915: Use RCU to annotate and enforce protection for breadcrumb's bh
The bottom-half we use for processing the breadcrumb interrupt is a
task, which is an RCU protected struct. When accessing this struct, we
need to be holding the RCU read lock to prevent it disappearing beneath
us. We can use the RCU annotation to mark our irq_seqno_bh pointer as
being under RCU guard and then use the RCU accessors to both provide
correct ordering of access through the pointer.

Most notably, this fixes the access from hard irq context to use the RCU
read lock, which both Daniel and Tvrtko complained about.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1470761272-1245-3-git-send-email-chris@chris-wilson.co.uk
2016-08-10 10:37:49 +01:00
Maarten Lankhorst
7397489399 drm/i915: Fix modeset handling during gpu reset, v5.
This function would call drm_modeset_lock_all, while the suspend/resume
functions already have their own locking. Fix this by factoring out
__intel_display_resume, and calling the atomic helpers for duplicating
atomic state and disabling all crtc's during suspend.

Changes since v1:
- Deal with -EDEADLK right after lock_all and clean up calls
  to hw readout.
- Always take all modeset locks so updates during gpu reset are blocked.
Changes since v2:
- Fix deadlock in intel_update_primary_planes.
- Move WARN_ON(EDEADLK) to __intel_display_resume.
- pctx -> ctx
- only call __intel_display_resume on success in intel_display_resume.
Changes since v3:
- Rebase on top of dev_priv -> dev change.
- Use drm_modeset_lock_all_ctx instead of drm_modeset_lock_all.
Changes since v4 [by vsyrjala]:
- Deal with skip_intermediate_wm
- Update comment w.r.t. mode_config.mutex vs. ->detect()
- Rebase due to INTEL_GEN() etc.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Fixes: e2c8b8701e ("drm/i915: Use atomic helpers for suspend, v2.")
Cc: stable@vger.kernel.org
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470428910-12125-2-git-send-email-ville.syrjala@linux.intel.com
2016-08-05 23:28:27 +03:00
Daniel Vetter
c5b7e97b27 drm/i915: Update DRIVER_DATE to 20160808
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-08 09:37:31 +02:00
Chris Wilson
3e510a8e65 drm/i915: Repack fence tiling mode and stride into a single integer
In the previous commit, we moved the obj->tiling_mode out of a bitfield
and into its own integer so that we could safely use READ_ONCE(). Let us
now repair some of that damage by sharing the tiling_mode with its
companion, the fence stride.

v2: New magic

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-18-git-send-email-chris@chris-wilson.co.uk
2016-08-05 10:54:43 +01:00
Chris Wilson
9ad3676148 drm/i915: Remove locking for get_tiling
Since we are not concerned with userspace racing itself with set-tiling
(the order is indeterminant even if we take a lock), then we can safely
read back the single obj->tiling_mode and do the static lookup of
swizzle mode without having to take a lock.

get-tiling is reasonably frequent due to the back-channel passing around
of tiling parameters in DRI2/DRI3.

v2: Make tiling_mode a full unsigned int so that we can trivially use it
with READ_ONCE(). Separating it out into manual control over the flags
field was too noisy for a simple patch. Note that we could use the lower
bits of obj->stride for the tiling mode.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-16-git-send-email-chris@chris-wilson.co.uk
2016-08-05 10:54:42 +01:00
Chris Wilson
3b4e896f14 drm/i915: Remove unused no-shrinker-steal
After removing the user of this wart, we can remove the wart entirely.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-10-git-send-email-chris@chris-wilson.co.uk
2016-08-05 10:54:39 +01:00
Chris Wilson
dcff85c844 drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex
The principal motivation for this was to try and eliminate the
struct_mutex from i915_gem_suspend - but we still need to hold the mutex
current for the i915_gem_context_lost(). (The issue there is that there
may be an indirect lockdep cycle between cpu_hotplug (i.e. suspend) and
struct_mutex via the stop_machine().) For the moment, enabling last
request tracking for the engine, allows us to do busyness checking and
waiting without requiring the struct_mutex - which is useful in its own
right.

As a side-effect of having a robust means for tracking engine busyness,
we can replace our other busyness heuristic, that of comparing against
the last submitted seqno. For paranoid reasons, we have a semi-ordered
check of that seqno inside the hangchecker, which we can now improve to
an ordered check of the engine's busyness (removing a locked xchg in the
process).

v2: Pass along "bool interruptible" as being unlocked we cannot rely on
i915->mm.interruptible being stable or even under our control.
v3: Replace check Ironlake i915_gpu_busy() with the common precalculated value

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-6-git-send-email-chris@chris-wilson.co.uk
2016-08-05 10:54:37 +01:00
Chris Wilson
90f4fcd56b drm/i915: Remove forced stop ring on suspend/unload
Before suspending (or unloading), we would first wait upon all rendering
to be completed and then disable the rings. This later step is a remanent
from DRI1 days when we did not use request tracking for all operations
upon the ring. Now that we are sure we are waiting upon the very last
operation by the engine, we can forgo clobbering the ring registers,
though we do keep the assert that the engine is indeed idle before
sleeping.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-5-git-send-email-chris@chris-wilson.co.uk
2016-08-05 10:54:36 +01:00
Chris Wilson
573adb3962 drm/i915: Move obj->active:5 to obj->flags
We are motivated to avoid using a bitfield for obj->active for a couple
of reasons. Firstly, we wish to document our lockless read of obj->active
using READ_ONCE inside i915_gem_busy_ioctl() and that requires an
integral type (i.e. not a bitfield). Secondly, gcc produces abysmal code
when presented with a bitfield and that shows up high on the profiles of
request tracking (mainly due to excess memory traffic as it converts
the bitfield to a register and back and generates frequent AGI in the
process).

v2: BIT, break up a long line in compute the other engines, new paint
for i915_gem_object_is_active (now i915_gem_object_get_active).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-23-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:20:04 +01:00
Chris Wilson
faf5bf0ad6 drm/i915: Use atomics to manipulate obj->frontbuffer_bits
The individual bits inside obj->frontbuffer_bits are protected by each
plane->mutex, but the whole bitfield may be accessed by multiple KMS
operations simultaneously and so the RMW need to be under atomics.
However, for updating the single field we do not need to mandate that it
be under the struct_mutex, one more step towards its removal as the de
facto BKL.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-21-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:20:03 +01:00
Chris Wilson
b5add9591c drm/i915: Make fb_tracking.lock a spinlock
We only need a very lightweight mechanism here as the locking is only
used for co-ordinating a bitfield.

v2: Move the cheap unlikely tests into the caller
v3: Move the kerneldoc into the header (now separated out into
intel_fronbuffer.h for better kerneldoc and readability)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtien <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-20-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:20:02 +01:00
Chris Wilson
de895082f7 drm/i915: Remove highly confusing i915_gem_obj_ggtt_pin()
Since i915_gem_obj_ggtt_pin() is an idiom breaking curry function for
i915_gem_object_ggtt_pin(), spare us the confusion and remove it.
Removing it now simplifies later patches to change the i915_vma_pin()
(and friends) interface.

v2: Add a redundant GEM_BUG_ON(!view) to
i915_gem_obj_lookup_or_create_ggtt_vma()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-18-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:20:00 +01:00
Chris Wilson
59bfa1248e drm/i915: Start passing around i915_vma from execbuffer
During execbuffer we look up the i915_vma in order to reserve them in
the VM. However, we then do a double lookup of the vma in order to then
pin them, all because we lack the necessary interfaces to operate on
i915_vma - so introduce i915_vma_pin()!

v2: Tidy parameter lists to remove one level of redirection in the hot
path.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-15-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:19:58 +01:00
Chris Wilson
a9f1481f41 drm/i915: Update i915_gem_get_ggtt_size/_alignment to use drm_i915_private
For consistency, internal functions should take drm_i915_private rather
than drm_device. Now that we are subclassing drm_device, there are no
more size wins, but being consistent is its own blessing.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-12-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:19:56 +01:00
Chris Wilson
ad1a7d20a1 drm/i915: Update the GGTT size/alignment query functions
In order to be consistent with other address space functions, we want to
pass around 64-bit sizes, even though all known global GTT are limited
to 4GiB. Similarly, we are trying to be consistent in using the _ggtt_
nomenclature when referring to the special global GTT.

v2: Update docs to consistently state "global GTT".

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-11-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:19:55 +01:00
Chris Wilson
91b2db6f65 drm/i915: Pad GTT views of exec objects up to user specified size
Our GPUs impose certain requirements upon buffers that depend upon how
exactly they are used. Typically this is expressed as that they require
a larger surface than would be naively computed by pitch * height.
Normally such requirements are hidden away in the userspace driver, but
when we accept pointers from strangers and later impose extra conditions
on them, the original client allocator has no idea about the
monstrosities in the GPU and we require the userspace driver to inform
the kernel how many padding pages are required beyond the client
allocation.

v2: Long time, no see
v3: Try an anonymous union for uapi struct compatibility

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-7-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:19:53 +01:00
Chris Wilson
2ffffd0f85 drm/i915: Fix up vma alignment to be u64
This is not the full fix, as we are required to percolate the u64 nature
down through the drm_mm stack, but this is required now to prevent
explosions due to mismatch between execbuf (eb_vma_misplaced) and vma
binding (i915_vma_misplaced) - and reduces the risk of spurious changes
as we adjust the vma interface in the next patches.

v2: long long casts not required for u64 printk (%llx)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-6-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:19:52 +01:00
Chris Wilson
0340d9fd0f drm/i915: Remove request retirement before each batch
This reimplements the denial-of-service protection against igt from
commit 227f782e46 ("drm/i915: Retire requests before creating a new
one") and transfers the stall from before each batch into get_pages().
The issue is that the stall is increasing latency between batches which
is detrimental in some cases (especially coupled with execlists) to
keeping the GPU well fed. Also we have made the observation that retiring
requests can of itself free objects (and requests) and therefore makes
a good first step when shrinking.

v2: Recycle objects prior to i915_gem_object_get_pages()
v3: Remove the reference to the ring from i915_gem_requests_ring() as it
operates on an intel_engine_cs.
v4: Since commit 9b5f4e5ed6 ("drm/i915: Retire oldest completed request
before allocating next") we no longer need the safeguard to retire
requests before get_pages(). We no longer see the huge latencies when
hitting the shrinker between allocations.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-4-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:19:51 +01:00
Chris Wilson
e522ac2324 drm/i915: Remove surplus drm_device parameter to i915_gem_evict_something()
Eviction is VM local, so we can ignore the significance of the
drm_device in the caller, and leave it to i915_gem_evict_something() to
manage itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-2-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:19:50 +01:00
Chris Wilson
df0e9a287d Revert "drm/i915: Clean up associated VMAs on context destruction"
This reverts commit e9f24d5fb7.

The patch was only a stop-gap measure that fixed half the problem - the
leak of the fbcon when restarting X. A complete solution required
releasing the VMA when the object itself was closed rather than rely on
file/process exit. The previous patches add the VMA tracking necessary
to do close them along with the object, context or file, and so the time
has come to remove the partial fix.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-28-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:34 +01:00
Chris Wilson
50e046b6a0 drm/i915: Mark the context and address space as closed
When the user closes the context mark it and the dependent address space
as closed. As we use an asynchronous destruct method, this has two
purposes.  First it allows us to flag the closed context and detect
internal errors if we to create any new objects for it (as it is removed
from the user's namespace, these should be internal bugs only). And
secondly, it allows us to immediately reap stale vma.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-27-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:33 +01:00
Chris Wilson
b1f788c6ac drm/i915: Release vma when the handle is closed
In order to prevent a leak of the vma on shared objects, we need to
hook into the object_close callback to destroy the vma on the object for
this file. However, if we destroyed that vma immediately we may cause
unexpected application stalls as we try to unbind a busy vma - hence we
defer the unbind to when we retire the vma.

v2: Keep vma allocated until closed. This is useful for a later
optimisation, but it is required now in order to handle potential
recursion of i915_vma_unbind() by retiring itself.
v3: Comments are important.

Testcase: igt/gem_ppggtt/flink-and-close-vma-leak
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-26-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:33 +01:00
Chris Wilson
5cf3d28098 drm/i915: i915_vma_move_to_active prep patch
This patch is broken out of the next just to remove the code motion from
that patch and make it more readable. What we do here is move the
i915_vma_move_to_active() to i915_gem_execbuffer.c and put the three
stages (read, write, fenced) together so that future modifications to
active handling are all located in the same spot. The importance of this
is so that we can more simply control the order in which the requests
are place in the retirement list (i.e. control the order at which we
retire and so control the lifetimes to avoid having to hold onto
references).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-24-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:31 +01:00
Chris Wilson
fa545cbf97 drm/i915: Refactor activity tracking for requests
With the introduction of requests, we amplified the number of atomic
refcounted objects we use and update every execbuffer; from none to
several references, and a set of references that need to be changed. We
also introduced interesting side-effects in the order of retiring
requests and objects.

Instead of independently tracking the last request for an object, track
the active objects for each request. The object will reside in the
buffer list of its most recent active request and so we reduce the kref
interchange to a list_move. Now retirements are entirely driven by the
request, dramatically simplifying activity tracking on the object
themselves, and removing the ambiguity between retiring objects and
retiring requests.

Furthermore with the consolidation of managing the activity tracking
centrally, we can look forward to using RCU to enable lockless lookup of
the current active requests for an object. In the future, we will be
able to query the status or wait upon rendering to an object without
even touching the struct_mutex BKL.

All told, less code, simpler and faster, and more extensible.

v2: Add a typedef for the function pointer for convenience later.
v3: Make the noop retirement callback explicit. Allow passing NULL to
the init_request_active() which is expanded to a common noop function.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-16-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:25 +01:00
Chris Wilson
381f371b25 drm/i915: Introduce i915_gem_active for request tracking
In the next patch, request tracking is made more generic and for that we
need a new expanded struct and to separate out the logic changes from
the mechanical churn, we split out the structure renaming into this
patch.

v2: Writer's block. Add some spiel about why we track requests.
v3: Now i915_gem_active.
v4: Now with i915_gem_active_set() for attaching to the active request.
v5: Use i915_gem_active_set() from inside the retirement handlers

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-10-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:20 +01:00
Chris Wilson
aa653a685d drm/i915: Be more careful when unbinding vma
When we call i915_vma_unbind(), we will wait upon outstanding rendering.
This will also trigger a retirement phase, which may update the object
lists. If, we extend request tracking to the VMA itself (rather than
keep it at the encompassing object), then there is a potential that the
obj->vma_list be modified for other elements upon i915_vma_unbind(). As
a result, if we walk over the object list and call i915_vma_unbind(), we
need to be prepared for that list to change.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-8-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:18 +01:00
Chris Wilson
15717de219 drm/i915: Count how many VMA are bound for an object
Since we may have VMA allocated for an object, but we interrupted their
binding, there is a disparity between have elements on the obj->vma_list
and being bound. i915_gem_obj_bound_any() does this check, but this is
not rigorously observed - add an explicit count to make it easier.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-7-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:17 +01:00
Chris Wilson
2bfa996e03 drm/i915: Store owning file on the i915_address_space
For the global GTT (and aliasing GTT), the address space is owned by the
device (it is a global resource) and so the per-file owner field is
NULL. For per-process GTT (where we create an address space per
context), each is owned by the opening file. We can use this ownership
information to both distinguish GGTT and ppGTT address spaces, as well
as occasionally inspect the owner.

v2: Whitespace, tells us who owns i915_address_space

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-6-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:17 +01:00
Chris Wilson
ddf07be7a2 drm/i915: Simplify calling engine->sync_to
Since requests can no longer be generated as a side-effect of
intel_ring_begin(), we know that the seqno will be unchanged during
ring-emission. This predicatablity then means we do not have to check
for the seqno wrapping around whilst emitting the semaphore for
engine->sync_to().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-31-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-22-git-send-email-chris@chris-wilson.co.uk
2016-08-02 22:58:32 +01:00
Chris Wilson
5b043f4e60 drm/i915: Unify legacy/execlists submit_execbuf callbacks
Now that emitting requests is identical between legacy and execlists, we
can use the same function to build up the ring for submitting to either
engine. (With the exception of i915_switch_contexts(), but in time that
will also be handled gracefully.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-30-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-21-git-send-email-chris@chris-wilson.co.uk
2016-08-02 22:58:31 +01:00
Chris Wilson
8e63717837 drm/i915: Simplify request_alloc by returning the allocated request
If is simpler and leads to more readable code through the callstack if
the allocation returns the allocated struct through the return value.

The importance of this is that it no longer looks like we accidentally
allocate requests as side-effect of calling certain functions.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-19-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-9-git-send-email-chris@chris-wilson.co.uk
2016-08-02 22:58:20 +01:00
Chris Wilson
7e37f889b5 drm/i915: Rename struct intel_ringbuffer to struct intel_ring
The state stored in this struct is not only the information about the
buffer object, but the ring used to communicate with the hardware. Using
buffer here is overly specific and, for me at least, conflates with the
notion of buffer objects themselves.

s/struct intel_ringbuffer/struct intel_ring/
s/enum intel_ring_hangcheck/enum intel_engine_hangcheck/
s/describe_ctx_ringbuf()/describe_ctx_ring()/
s/intel_ring_get_active_head()/intel_engine_get_active_head()/
s/intel_ring_sync_index()/intel_engine_sync_index()/
s/intel_ring_init_seqno()/intel_engine_init_seqno()/
s/ring_stuck()/engine_stuck()/
s/intel_cleanup_engine()/intel_engine_cleanup()/
s/intel_stop_engine()/intel_engine_stop()/
s/intel_pin_and_map_ringbuffer_obj()/intel_pin_and_map_ring()/
s/intel_unpin_ringbuffer()/intel_unpin_ring()/
s/intel_engine_create_ringbuffer()/intel_engine_create_ring()/
s/intel_ring_flush_all_caches()/intel_engine_flush_all_caches()/
s/intel_ring_invalidate_all_caches()/intel_engine_invalidate_all_caches()/
s/intel_ringbuffer_free()/intel_ring_free()/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-15-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-4-git-send-email-chris@chris-wilson.co.uk
2016-08-02 22:58:16 +01:00
Chris Wilson
dca33ecc5f drm/i915: Rename intel_context[engine].ringbuf
Perform s/ringbuf/ring/ on the context struct for consistency with the
ring/engine split.

v2: Kill an outdated error_ringbuf label

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-14-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-3-git-send-email-chris@chris-wilson.co.uk
2016-08-02 22:58:15 +01:00
Chris Wilson
6361f4ba46 drm/i915: Avoid using intel_engine_cs *ring for GPU error capture
Inside the error capture itself, we refer to not only the hardware
engine, its ringbuffer but also the capture state. Finding clear names
for each whilst avoiding mixing ring/intel_engine_cs is tricky. As a
compromise we keep using ering for the error capture.

v2: Use 'ee' locals for struct drm_i915_error_engine

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-8-git-send-email-chris@chris-wilson.co.uk
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469606850-28659-3-git-send-email-chris@chris-wilson.co.uk
2016-07-27 16:23:05 +01:00
Chris Wilson
c80ff16e11 drm/i915: Use engine to refer to the user's BSD intel_engine_cs
This patch transitions the execbuf engine selection away from using the
ring nomenclature - though we still refer to the user's incoming
selector as their user_ring_id.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-7-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469606850-28659-2-git-send-email-chris@chris-wilson.co.uk
2016-07-27 16:23:05 +01:00
Chris Wilson
33a051a5fc drm/i915/cmdparser: Remove stray intel_engine_cs *ring
When we refer to intel_engine_cs, we want to use engine so as not to
confuse ourselves about ringbuffers.

v2: Rename all the functions as well, as well as a few more stray comments.
v3: Split the really long error message strings

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-6-git-send-email-chris@chris-wilson.co.uk
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469606850-28659-1-git-send-email-chris@chris-wilson.co.uk
2016-07-27 16:23:05 +01:00
Daniel Vetter
d2b9448f96 drm/i915: Update DRIVER_DATE to 20160725
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-07-25 08:00:19 +02:00
Chris Wilson
54b4f68f18 Revert "drm/i915: Enable RC6 immediately"
This reverts commit b12e0ee208 ("drm/i915: Enable RC6 immediately"),
as it was never meant to be sent anywhere other than the bug report for
experimentation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1469132179-4052-1-git-send-email-chris@chris-wilson.co.uk
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-07-21 21:43:19 +01:00
Chris Wilson
b12e0ee208 drm/i915: Enable RC6 immediately
Now that PCU communication is reasonably fast, we do not need to defer
RC6 initialisation to a workqueue.

References: https://bugs.freedesktop.org/show_bug.cgi?id=97017
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-07-21 18:30:22 +01:00
Mika Kuoppala
4ba9c1f7c7 drm/i915/gen9: Add WaInPlaceDecompressionHang
Add this workaround to prevent hang when in place compression
is used.

References: HSD#2135774
Cc: stable@vger.kernel.org
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-07-20 16:10:20 +03:00
Chris Wilson
39df91905d drm/i915: Convert i915_semaphores_is_enabled over to early sanitize
Rather than recomputing whether semaphores are enabled, we can do that
computation once during early initialisation as the i915.semaphores
module parameter is now read-only.

s/i915_semaphores_is_enabled/i915.semaphores/

v2: Add the state to the debug dmesg as well

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-10-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-9-git-send-email-chris@chris-wilson.co.uk
2016-07-20 13:40:43 +01:00
Chris Wilson
34911fd30c drm/i915: Rename drm_gem_object_unreference_unlocked in preparation for lockless free
Whilst this ultimately wraps kref_put_mutex(), our goal here is the
lockless variant, so keep the _unlocked() suffix until we need it no
more.

s/drm_gem_object_unreference_unlocked/i915_gem_object_put_unlocked/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-7-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-6-git-send-email-chris@chris-wilson.co.uk
2016-07-20 13:40:13 +01:00
Chris Wilson
f8c417cdb1 drm/i915: Rename drm_gem_object_unreference in preparation for lockless free
Ultimately wraps kref_put(), so adopt its nomenclature for consistency
with other subsystems.

s/drm_gem_object_unreference/i915_gem_object_put/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-6-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-5-git-send-email-chris@chris-wilson.co.uk
2016-07-20 13:40:12 +01:00
Chris Wilson
25dc556a2a drm/i915: Wrap drm_gem_object_reference in i915_gem_object_get
Ultimately wraps kref_get(), so adopt its nomenclature for consistency
with other subsystems.

s/drm_gem_object_reference/i915_gem_object_get/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-5-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-4-git-send-email-chris@chris-wilson.co.uk
2016-07-20 13:40:11 +01:00
Chris Wilson
03ac0642f6 drm/i915: Wrap drm_gem_object_lookup in i915_gem_object_lookup
For symmetry with a forthcoming i915_gem_object_get() and
i915_gem_object_put().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-4-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-3-git-send-email-chris@chris-wilson.co.uk
2016-07-20 13:40:10 +01:00
Chris Wilson
9a6feaf0d7 drm/i915: Rename i915_gem_context_reference/unreference()
As these are wrappers around kref_get/kref_put() it is preferable to
follow the naming convention and use the same verb get/put in our
wrapper names for manipulating a reference to the context.

s/i915_gem_context_reference/i915_gem_context_get/
s/i915_gem_context_unreference/i915_gem_context_put/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-3-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-2-git-send-email-chris@chris-wilson.co.uk
2016-07-20 13:40:10 +01:00
Chris Wilson
197be2ae8b drm/i915: Disable waitboosting for mmioflips/semaphores
Since commit a6f766f397 ("drm/i915: Limit ring synchronisation (sw
sempahores) RPS boosts") and commit bcafc4e38b ("drm/i915: Limit mmio
flip RPS boosts") we have limited the waitboosting for semaphores and
flips. Ideally we do not want to boost in either of these instances as no
userspace consumer is waiting upon the results (though a userspace producer
may be stalled trying to submit an execbuf - but in this case the
producer is being throttled due to the engine being saturated with
work). With the introduction of NO_WAITBOOST in the previous patch, we
can finally disable these needless boosts.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469002875-2335-6-git-send-email-chris@chris-wilson.co.uk
2016-07-20 09:29:53 +01:00
Chris Wilson
05235c5354 drm/i915: Move GEM request routines to i915_gem_request.c
Migrate the request operations out of the main body of i915_gem.c and
into their own C file for easier expansion.

v2: Move __i915_add_request() across as well

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469002875-2335-1-git-send-email-chris@chris-wilson.co.uk
2016-07-20 09:29:53 +01:00
Chris Wilson
5ab57c7020 drm/i915: Flush logical context image out to memory upon suspend
Before suspend, and especially before building the hibernation image, we
need to context image to be coherent in memory. To do this we require
that we perform a context switch to a disposable context (i.e. the
dev_priv->kernel_context) - when that switch is complete, all other
context images will be complete. This leaves the kernel_context image as
incomplete, but fortunately that is disposable and we can do a quick
fixup of the logical state after resuming.

v2: Share the nearly identical code to switch to the kernel context with
eviction.
v3: Explain why we need the switch and reset.

Testcase: igt/gem_exec_suspend # bsw
References: https://bugs.freedesktop.org/show_bug.cgi?id=96526
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1468590980-6186-2-git-send-email-chris@chris-wilson.co.uk
2016-07-15 20:41:05 +01:00
Chris Wilson
945657b461 drm/i915/evict: Always switch away from the current context
Currently execlists is exempt from emitting a request to switch each
ring away from the current context over to the dev_priv->kernel_context
(for whatever reason, just under execlists the GGTT is unlikely to be as
fragmented, however the switch may help in some extreme cases). Extract
the switcher and enable it for execlsts as well, as we need to do so in
a later patch to force the context switch before suspend. (And since for
that switch we explicitly require the disposable kernel context, rename
the extracted function.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1468590980-6186-1-git-send-email-chris@chris-wilson.co.uk
2016-07-15 20:40:53 +01:00
Lyude
19625e85c6 drm/i915: Enable polling when we don't have hpd
Unfortunately, there's two situations where we lose hpd right now:
- Runtime suspend
- When we've shut off all of the power wells on Valleyview/Cherryview

While it would be nice if this didn't cause issues, this has the
ability to get us in some awkward states where a user won't be able to
get their display to turn on. For instance; if we boot a Valleyview
system without any monitors connected, it won't need any of it's power
wells and thus shut them off. Since this causes us to lose HPD, this
means that unless the user knows how to ssh into their machine and do a
manual reprobe for monitors, none of the monitors they connect after
booting will actually work.

Eventually we should come up with a better fix then having to enable
polling for this, since this makes rpm a lot less useful, but for now
the infrastructure in i915 just isn't there yet to get hpd in these
situations.

Changes since v1:
 - Add comment explaining the addition of the if
   (!mode_config->poll_running) in intel_hpd_init()
 - Remove unneeded if (!dev->mode_config.poll_enabled) in
   i915_hpd_poll_init_work()
 - Call to drm_helper_hpd_irq_event() after we disable polling
 - Add cancel_work_sync() call to intel_hpd_cancel_work()

Changes since v2:
 - Apparently dev->mode_config.poll_running doesn't actually reflect
   whether or not a poll is currently in progress, and is actually used
   for dynamic module paramter enabling/disabling. So now we instead
   keep track of our own poll_running variable in dev_priv->hotplug
 - Clean i915_hpd_poll_init_work() a little bit

Changes since v3:
 - Remove the now-redundant connector loop in intel_hpd_init(), just
   rely on intel_hpd_poll_enable() for setting connector->polled
   correctly on each connector
 - Get rid of poll_running
 - Don't assign enabled in i915_hpd_poll_init_work before we actually
   lock dev->mode_config.mutex
 - Wrap enabled assignment in i915_hpd_poll_init_work() in READ_ONCE()
   for doc purposes
 - Do the same for dev_priv->hotplug.poll_enabled with WRITE_ONCE in
   intel_hpd_poll_enable()
 - Add some comments about racing not mattering in intel_hpd_poll_enable

Changes since v4:
 - Rename intel_hpd_poll_enable() to intel_hpd_poll_init()
 - Drop the bool argument from intel_hpd_poll_init()
 - Remove redundant calls to intel_hpd_poll_init()
 - Rename poll_enable_work to poll_init_work
 - Add some kerneldoc for intel_hpd_poll_init()
 - Cross-reference intel_hpd_poll_init() in intel_hpd_init()
 - Just copy the loop from intel_hpd_init() in intel_hpd_poll_init()

Changes since v5:
 - Minor kerneldoc nitpicks

Cc: stable@vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Lyude <cpaul@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-07-14 22:06:11 +02:00
Lyude
b236d7c842 drm/i915/vlv: Disable HPD in valleyview_crt_detect_hotplug()
One of the things preventing us from using polling is the fact that
calling valleyview_crt_detect_hotplug() when there's a VGA cable
connected results in sending another hotplug. With polling enabled when
HPD is disabled, this results in a scenario like this:

- We enable power wells and reset the ADPA
- output_poll_exec does force probe on VGA, triggering a hpd
- HPD handler waits for poll to unlock dev->mode_config.mutex
- output_poll_exec shuts off the ADPA, unlocks dev->mode_config.mutex
- HPD handler runs, resets ADPA and brings us back to the start

This results in an endless irq storm getting sent from the ADPA
whenever a VGA connector gets detected in the middle of polling.

Somewhat based off of the "drm/i915: Disable CRT HPD around force
trigger" patch Ville Syrjälä sent a while back

Cc: stable@vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Lyude <cpaul@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-07-14 22:06:11 +02:00
Chris Wilson
b7137e0cf1 drm/i915: Defer enabling rc6 til after we submit the first batch/context
Some hardware requires a valid render context before it can initiate
rc6 power gating of the GPU; the default state of the GPU is not
sufficient and may lead to undefined behaviour. The first execution of
any batch will load the "golden render state", at which point it is safe
to enable rc6. As we do not forcibly load the kernel context at resume,
we have to hook into the batch submission to be sure that the render
state is setup before enabling rc6.

However, since we don't enable powersaving until that first batch, we
queued a delayed task in order to guarantee that the batch is indeed
submitted.

v2: Rearrange intel_disable_gt_powersave() to match.
v3: Apply user specified cur_freq (or idle_freq if not set).
v4: Give in, and supply a delayed work to autoenable rc6
v5: Mika suggested a couple of better names for delayed_resume_work
v6: Rebalance rpm_put around the autoenable task

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1468397438-21226-7-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-07-14 15:24:21 +01:00
Chris Wilson
29ecd78d3b drm/i915: Define a separate variable and control for RPS waitboost frequency
To allow the user finer control over waitboosting, allow them to set the
frequency we request for the boost. This also them allows to effectively
disable the boosting by setting the boost request to a low frequency.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1468397438-21226-5-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-07-14 15:22:58 +01:00
Tvrtko Ursulin
8b3e2d3639 drm/i915: Unify engine init loop
With the unified common engine setup done, and the execlist engine
initialization loop clearly split into two phases, we can eliminate
the separate legacy engine initialization code.

v2: Fix cleanup path for legacy.
v3: Rename constructors. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris-wilson.co.uk>
2016-07-14 11:17:06 +01:00
Chris Wilson
8d35acba25 drm/i915: Provide argument names for static stubs
Make sure we keep kbuilder happy in all of its random configs by
providing argument names for compile-time stubs.

In file included from drivers/gpu/drm/i915/intel_dp_mst.c:27:0:
   drivers/gpu/drm/i915/i915_drv.h: In function 'i915_debugfs_register':
>> drivers/gpu/drm/i915/i915_drv.h:3612:48: error: parameter name omitted
    static inline int i915_debugfs_register(struct drm_i915_private *) {return 0;}
                                                   ^~~~~~~~~~~~~~~~
   drivers/gpu/drm/i915/i915_drv.h: In function 'i915_debugfs_unregister':
   drivers/gpu/drm/i915/i915_drv.h:3613:51: error: parameter name omitted
    static inline void i915_debugfs_unregister(struct drm_i915_private *) {}

Reported-by: 0day
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1468324529-20461-1-git-send-email-chris@chris-wilson.co.uk
2016-07-12 14:08:57 +01:00
Daniel Vetter
0b2c0582f1 drm/i915: Update DRIVER_DATE to 20160711
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-07-11 09:18:31 +02:00
Chris Wilson
48f112fed3 drm/i915: Fill unused GGTT with scratch pages for VT-d
One of the numerous VT-d workarounds we require is that the display
hardware reads past the end of the buffer triggering VT-d faults. This
is acknowledged in the code as being safe "since we fill the unused
portions of the GGTT with the scratch page". Alas, that is no longer
always true and so we trigger DMAR read faults.

Skylake also requires another workaround to avoid mixing VT-d and
unpopulated PTE, and so there we also need to ensure we fill unused
entries with the scratch page.

Reported-by: Mike Lothian <mike@fireburn.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96584
Fixes: f7770bfd9f ("drm/i915: Skip clearing the GGTT on full-ppgtt systems")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: David Weinehall <david.weinehall@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466773634-8106-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: David Weinehall <david.weinehall@intel.com>
2016-07-08 13:36:27 +01:00
Rodrigo Vivi
22dea0be50 drm/i915: Introduce Kabypoint PCH for Kabylake H/DT.
Some Kabylake SKUs are going to use Kabypoint PCH.
It is mainly for Halo and DT ones.

>From our specs it doesn't seem that KBP brings
any change on the display south engine. So let's consider
this as a continuation of SunrisePoint, i.e., SPT+.

Since it is easy to get confused by a letter change:
KBL = Kabylake - CPU/GPU codename.
KBP = Kabypoint - PCH codename.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96826
Link: http://patchwork.freedesktop.org/patch/msgid/1467418032-15167-1-git-send-email-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2016-07-07 10:01:37 -07:00
Chris Wilson
aca34b6e1c drm/i915: Group the irq breadcrumb variables into the same cacheline
As we inspect both the tasklet (to check for an active bottom-half) and
set the irq-posted flag at the same time (both in the interrupt handler
and then in the bottom-halt), group those two together into the same
cacheline. (Not having total control over placement of the struct means
we can't guarantee the cacheline boundary, we need to align the kmalloc
and then each struct, but the grouping should help.)

v2: Try a couple of different names for the state touched by the user
interrupt handler.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467805142-22219-3-git-send-email-chris@chris-wilson.co.uk
2016-07-06 12:47:39 +01:00
Chris Wilson
99fe4a5f73 drm/i915: Wake up the bottom-half if we steal their interrupt
Following on from the scenario Tvrtko envisioned to explain a hard-to-hit
race with multiple first waiters, we could also then race in the
__i915_request_irq_complete() and the bottom-half may miss the vital
irq-seqno barrier and so go to sleep not noticing their seqno is
complete.

v2: unlock, not double lock the rcu_read_lock.

Fixes: 3d5564e910 ("drm/i915: Only apply one barrier after...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467805142-22219-2-git-send-email-chris@chris-wilson.co.uk
2016-07-06 12:47:31 +01:00
Chris Wilson
91c8a326a1 drm/i915: Convert dev_priv->dev backpointers to dev_priv->drm
Since drm_i915_private is now a subclass of drm_device we do not need to
chase the drm_i915_private->dev backpointer and can instead simply
access drm_i915_private->drm directly.

   text	   data	    bss	    dec	    hex	filename
1068757	   4565	    416	1073738	 10624a	drivers/gpu/drm/i915/i915.ko
1066949	   4565	    416	1071930	 105b3a	drivers/gpu/drm/i915/i915.ko

Created by the coccinelle script:
@@
struct drm_i915_private *d;
identifier i;
@@
(
- d->dev->i
+ d->drm.i
|
- d->dev
+ &d->drm
)

and for good measure the dev_priv->dev backpointer was removed entirely.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467711623-2905-4-git-send-email-chris@chris-wilson.co.uk
2016-07-05 11:58:45 +01:00
Chris Wilson
94b4f3ba48 drm/i915: Split out runtime configuration of device info to its own file
Let's reclaim a few hundred lines from i915_drv.c by splitting out the
runtime configuration of the "constant" dev_priv->info.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1467711623-2905-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-07-05 11:53:27 +01:00
Tvrtko Ursulin
af1346a0f3 drm/i915: Explicitly convert some macros to boolean values
Some IS_ and HAS_ macros can return any non-zero value for true.

One potential problem with that is that someone could assign
them to integers and be surprised with the result. Therefore it
is probably safer to do the conversion to 0/1 in the macros
themselves.

Luckily this does not seem to have an effect on code size.

Only one call site was getting bit by this and a patch for
that has been sent as "drm/i915/guc: Protect against HAS_GUC_*
returning true values other than one".

v2: Added some extra braces as suggested by checkpatch.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1467643823-9798-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-07-04 16:38:04 +01:00
Peter Antoine
6f8be28012 Revert "drm/i915/kbl: drm/i915: Avoid GuC loading for now on Kabylake."
This reverts commit 2b81b84471

Cc: Christophe Prigent <christophe.prigent@intel.com>
Signed-off-by: Peter Antoine <peter.antoine@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-07-04 11:21:29 +01:00
Chris Wilson
bc3d674462 drm/i915: Allow userspace to request no-error-capture upon GPU hangs
igt likes to inject GPU hangs into its command streams. However, as we
expect these hangs, we don't actually want them recorded in the dmesg
output or stored in the i915_error_state (usually). To accommodate this
allow userspace to set a flag on the context that any hang emanating
from that context will not be recorded. We still do the error capture
(otherwise how do we find the guilty context and know its intent?) as
part of the reason for random GPU hang injection is to exercise the race
conditions between the error capture and normal execution.

v2: Split out the request->ringbuf error capture changes.
v3: Move the flag defines next to the intel_context->flags definition

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-9-git-send-email-chris@chris-wilson.co.uk
2016-07-04 08:18:24 +01:00
Chris Wilson
7b4d3a16dd drm/i915: Remove stop-rings debugfs interface
Now that we have (near) universal GPU recovery code, we can inject a
real hang from userspace and not need any fakery. Not only does this
mean that the testing is far more realistic, but we can simplify the
kernel in the process.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-7-git-send-email-chris@chris-wilson.co.uk
2016-07-04 08:18:23 +01:00
Chris Wilson
67d97da349 drm/i915: Only start retire worker when idle
The retire worker is a low frequency task that makes sure we retire
outstanding requests if userspace is being lax. We only need to start it
once as it remains active until the GPU is idle, so do a cheap test
before the more expensive queue_work(). A consequence of this is that we
need correct locking in the worker to make the hot path of request
submission cheap. To keep the symmetry and keep hangcheck strictly bound
by the GPU's wakelock, we move the cancel_sync(hangcheck) to the idle
worker before dropping the wakelock.

v2: Guard against RCU fouling the breadcrumbs bottom-half whilst we kick
the waiter.
v3: Remove the wakeref assertion squelching (now we hold a wakeref for
the hangcheck, any rpm error there is genuine).
v4: To prevent excess work when retiring requests, we split the busy
flag into two, a boolean to denote whether we hold the wakeref and a
bitmask of active engines.
v5: Reorder cancelling hangcheck upon idling to avoid a race where we
might cancel a hangcheck after being preempted by a new task

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
References: https://bugs.freedesktop.org/show_bug.cgi?id=88437
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-1-git-send-email-chris@chris-wilson.co.uk
2016-07-04 08:18:19 +01:00
Chris Wilson
b3850855f4 drm/i915: Embed signaling node into the GEM request
Under the assumption that enabling signaling will be a frequent
operation, lets preallocate our attachments for signaling inside the
(rather large) request struct (and so benefiting from the slab cache).

v2: Convert from void * to more meaningful names and types.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-17-git-send-email-chris@chris-wilson.co.uk
2016-07-01 21:04:14 +01:00
Chris Wilson
c81d46138d drm/i915: Convert trace-irq to the breadcrumb waiter
If we convert the tracing over from direct use of ring->irq_get() and
over to the breadcrumb infrastructure, we only have a single user of the
ring->irq_get and so we will be able to simplify the driver routines
(eliminating the redundant validation and irq refcounting).

Process context is preferred over softirq (or even hardirq) for a couple
of reasons:

 - we already utilize process context to have fast wakeup of a single
   client (i.e. the client waiting for the GPU inspects the seqno for
   itself following an interrupt to avoid the overhead of a context
   switch before it returns to userspace)

 - engine->irq_seqno() is not suitable for use from an softirq/hardirq
   context as we may require long waits (100-250us) to ensure the seqno
   write is posted before we read it from the CPU

A signaling framework is a requirement for enabling dma-fences.

v2: Move to a signaling framework based upon the waiter.
v3: Track the first-signal to avoid having to walk the rbtree everytime.
v4: Mark the signaler thread as RT priority to reduce latency in the
indirect wakeups.
v5: Make failure to allocate the thread fatal.
v6: Rename kthreads to i915/signal:%u

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-16-git-send-email-chris@chris-wilson.co.uk
2016-07-01 21:02:34 +01:00
Chris Wilson
3d5564e910 drm/i915: Only apply one barrier after a breadcrumb interrupt is posted
If we flag the seqno as potentially stale upon receiving an interrupt,
we can use that information to reduce the frequency that we apply the
heavyweight coherent seqno read (i.e. if we wake up a chain of waiters).

v2: Use cmpxchg to replace READ_ONCE/WRITE_ONCE for more explicit
control of the ordering wrt to interrupt generation and interrupt
checking in the bottom-half.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-14-git-send-email-chris@chris-wilson.co.uk
2016-07-01 21:00:54 +01:00
Chris Wilson
7ec2c73b1d drm/i915: Check the CPU cached value in HWS of seqno after waking the waiter
If we have multiple waiters, we may find that many complete on the same
wake up. If we first inspect the seqno from the CPU cache, we may reduce
the number of heavyweight coherent seqno reads we require.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-13-git-send-email-chris@chris-wilson.co.uk
2016-07-01 21:00:54 +01:00
Chris Wilson
1b7744e7ba drm/i915: Use HWS for seqno tracking everywhere
By using the same address for storing the HWS on every platform, we can
remove the platform specific vfuncs and reduce the get-seqno routine to
a single read of a cached memory location.

v2: Fix semaphore_passed() to look at the signaling engine (not the
waiter's)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-8-git-send-email-chris@chris-wilson.co.uk
2016-07-01 20:58:48 +01:00
Chris Wilson
f69a02c9d5 drm/i915: Spin after waking up for an interrupt
When waiting for an interrupt (waiting for the engine to complete some
work), we know we are the only waiter to be woken on this engine. We also
know when the GPU has nearly completed our request (or at least started
processing it), so after being woken and we detect that the GPU is
active and working on our request, allow us the bottom-half (the first
waiter who wakes up to handle checking the seqno after the interrupt) to
spin for a very short while to reduce client latencies.

The impact is minimal, there was an improvement to the realtime-vs-many
clients case, but exporting the function proves useful later. However,
it is tempting to adjust irq_seqno_barrier to include the spin. The
problem is first ensuring that the "start-of-request" seqno is coherent
as we use that as our basis for judging when it is ok to spin. If we
could, spinning there could dramatically shorten some sleeps, and allow
us to make the barriers more conservative to handle missed seqno writes
on more platforms (all gen7+ are known to have the occasional issue, at
least).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-7-git-send-email-chris@chris-wilson.co.uk
2016-07-01 20:58:47 +01:00
Chris Wilson
688e6c7258 drm/i915: Slaughter the thundering i915_wait_request herd
One particularly stressful scenario consists of many independent tasks
all competing for GPU time and waiting upon the results (e.g. realtime
transcoding of many, many streams). One bottleneck in particular is that
each client waits on its own results, but every client is woken up after
every batchbuffer - hence the thunder of hooves as then every client must
do its heavyweight dance to read a coherent seqno to see if it is the
lucky one.

Ideally, we only want one client to wake up after the interrupt and
check its request for completion. Since the requests must retire in
order, we can select the first client on the oldest request to be woken.
Once that client has completed his wait, we can then wake up the
next client and so on. However, all clients then incur latency as every
process in the chain may be delayed for scheduling - this may also then
cause some priority inversion. To reduce the latency, when a client
is added or removed from the list, we scan the tree for completed
seqno and wake up all the completed waiters in parallel.

Using igt/benchmarks/gem_latency, we can demonstrate this effect. The
benchmark measures the number of GPU cycles between completion of a
batch and the client waking up from a call to wait-ioctl. With many
concurrent waiters, with each on a different request, we observe that
the wakeup latency before the patch scales nearly linearly with the
number of waiters (before external factors kick in making the scaling much
worse). After applying the patch, we can see that only the single waiter
for the request is being woken up, providing a constant wakeup latency
for every operation. However, the situation is not quite as rosy for
many waiters on the same request, though to the best of my knowledge this
is much less likely in practice. Here, we can observe that the
concurrent waiters incur extra latency from being woken up by the
solitary bottom-half, rather than directly by the interrupt. This
appears to be scheduler induced (having discounted adverse effects from
having a rbtree walk/erase in the wakeup path), each additional
wake_up_process() costs approximately 1us on big core. Another effect of
performing the secondary wakeups from the first bottom-half is the
incurred delay this imposes on high priority threads - rather than
immediately returning to userspace and leaving the interrupt handler to
wake the others.

To offset the delay incurred with additional waiters on a request, we
could use a hybrid scheme that did a quick read in the interrupt handler
and dequeued all the completed waiters (incurring the overhead in the
interrupt handler, not the best plan either as we then incur GPU
submission latency) but we would still have to wake up the bottom-half
every time to do the heavyweight slow read. Or we could only kick the
waiters on the seqno with the same priority as the current task (i.e. in
the realtime waiter scenario, only it is woken up immediately by the
interrupt and simply queues the next waiter before returning to userspace,
minimising its delay at the expense of the chain, and also reducing
contention on its scheduler runqueue). This is effective at avoid long
pauses in the interrupt handler and at avoiding the extra latency in
realtime/high-priority waiters.

v2: Convert from a kworker per engine into a dedicated kthread for the
bottom-half.
v3: Rename request members and tweak comments.
v4: Use a per-engine spinlock in the breadcrumbs bottom-half.
v5: Fix race in locklessly checking waiter status and kicking the task on
adding a new waiter.
v6: Fix deciding when to force the timer to hide missing interrupts.
v7: Move the bottom-half from the kthread to the first client process.
v8: Reword a few comments
v9: Break the busy loop when the interrupt is unmasked or has fired.
v10: Comments, unnecessary churn, better debugging from Tvrtko
v11: Wake all completed waiters on removing the current bottom-half to
reduce the latency of waking up a herd of clients all waiting on the
same request.
v12: Rearrange missed-interrupt fault injection so that it works with
igt/drv_missed_irq_hang
v13: Rename intel_breadcrumb and friends to intel_wait in preparation
for signal handling.
v14: RCU commentary, assert_spin_locked
v15: Hide BUG_ON behind the compiler; report on gem_latency findings.
v16: Sort seqno-groups by priority so that first-waiter has the highest
task priority (and so avoid priority inversion).
v17: Add waiters to post-mortem GPU hang state.
v18: Return early for a completed wait after acquiring the spinlock.
Avoids adding ourselves to the tree if the is already complete, and
skips the awkward question of why we don't do completion wakeups for
waits earlier than or equal to ourselves.
v19: Prepare for init_breadcrumbs to fail. Later patches may want to
allocate during init, so be prepared to propagate back the error code.

Testcase: igt/gem_concurrent_blit
Testcase: igt/benchmarks/gem_latency
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: "Goel, Akash" <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> #v18
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-6-git-send-email-chris@chris-wilson.co.uk
2016-07-01 20:58:43 +01:00
Chris Wilson
1f15b76f1e drm/i915: Separate GPU hang waitqueue from advance
Currently __i915_wait_request uses a per-engine wait_queue_t for the dual
purpose of waking after the GPU advances or for waking after an error.
In the future, we may add even more wake sources and require greater
separation, but for now we can conceptually simplify wakeups by separating
the two sources. In particular, this allows us to use different wait-queues
(e.g. one on the engine advancement, a global one for errors and one on
each requests) without any hassle.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-5-git-send-email-chris@chris-wilson.co.uk
2016-07-01 20:42:31 +01:00
Chris Wilson
26a02b8fc3 drm/i915: Make queueing the hangcheck work inline
Since the function is a small wrapper around schedule_delayed_work(),
move it inline to remove the function call overhead for the principle
caller.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-4-git-send-email-chris@chris-wilson.co.uk
2016-07-01 20:42:30 +01:00
Chris Wilson
7774002586 drm/i915: Remove the dedicated hangcheck workqueue
The queue only ever contains at most one item and has no special flags.
It is just a very simple wrapper around the system-wq - a complication
with no benefits.

v2: Use the system_long_wq as we may wish to capture the error state
after detecting the hang - which may take a bit of time.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-3-git-send-email-chris@chris-wilson.co.uk
2016-07-01 20:42:29 +01:00
Chris Wilson
76f8421f2a drm/i915: Perform Sandybridge BSD tail write under the forcewake
Since we have a sequence of register reads and writes, we can reduce the
latency of starting the BSD ring by performing all the mmio operations
under the same forcewake wakeref.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467297225-21379-62-git-send-email-chris@chris-wilson.co.uk
2016-06-30 15:42:34 +01:00
Chris Wilson
1758b90e38 drm/i915: Use a hybrid scheme for fast register waits
Ville Syrjälä reported that in the majority of wait_for(I915_READ()) he
inspect, most completed within the first couple of reads and that the
delay between those wait_for() reads was the ratelimiting step for many
code paths. For example, __gen6_update_ring_freq() was blamed for
slowing down boot by many milliseconds, but under Ville's scrutiny the
issue was just excessive delay waiting for sandybridge_pcode_write().

We can eliminate the wait by initially using a busyspin upon the register
read and only fallback to the sleeping loop in cases where the hardware
is indeed too slow. A threshold of 2 microseconds is used as the initial
ballpark.

To avoid excessive code bloating from converting every wait_for() into a
hybrid busy/sleep loop, we extend wait_for_register_fw() and export it
for use by other callers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467297225-21379-1-git-send-email-chris@chris-wilson.co.uk
2016-06-30 15:41:58 +01:00
Chris Wilson
0c5eed6514 drm/i915: Remove request->reset_counter
Since commit 2ed53a94d8 ("drm/i915: On GPU reset, set the HWS
breadcrumb to the last seqno") once a hang is completed, the seqno is
advanced past all current requests. With this we know that if we wake up
from waiting for a request, if a hang has occurred and reset completed,
our request will be considered complete (i.e.
i915_gem_request_completed() returns true). Therefore we only need to
worry about the situation where a hang has occurred, but not yet reset,
where we may need to release our struct_mutex. Since we don't need to
detect the completed reset using the global gpu_error->reset_counter
anymore, we do not need to track the reset_counter epoch inside the
request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467211874-11552-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
2016-06-29 17:06:41 +01:00
Randy Dunlap
bdaa2dfbba drm/i915: fix build errors when ACPI is not enabled
Fix build errors when ACPI is not enabled by adding function stubs:

../drivers/gpu/drm/i915/i915_drv.c: In function 'i915_drm_suspend':
../drivers/gpu/drm/i915/i915_drv.c:635:2: error: implicit declaration of function 'intel_opregion_unregister' [-Werror=implicit-function-declaration]
  intel_opregion_unregister(dev_priv);
../drivers/gpu/drm/i915/i915_drv.c: In function 'i915_drm_resume':
../drivers/gpu/drm/i915/i915_drv.c:798:2: error: implicit declaration of function 'intel_opregion_register' [-Werror=implicit-function-declaration]
  intel_opregion_register(dev_priv);

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Fixes: 03d92e4779 ("drm/i915/opregion: Rename init/fini functions to register/unregister")
Cc: drm-intel-fixes@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[Jani: dropped the stale init/fini declarations]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467028399-9965-1-git-send-email-jani.nikula@intel.com
2016-06-28 13:10:14 +03:00
Chris Wilson
6e5a5beb8e drm/i915: Split idling from forcing context switch
We only need to force a switch to the kernel context placeholder during
eviction. All other uses of i915_gpu_idle() just want to wait until
existing work on the GPU is idle. Rename i915_gpu_idle() to
i915_gem_wait_for_idle() to avoid any implications about "parking" the
context first.

v2: Tweak an error message if the wait fails for the ilk vtd w/a

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466776558-21516-6-git-send-email-chris@chris-wilson.co.uk
2016-06-24 15:03:14 +01:00
Chris Wilson
0cb26a8ed1 drm/i915: Move legacy kernel context pinning to intel_ringbuffer.c
This is so that we have symmetry with intel_lrc.c and avoid a source of
if (i915.enable_execlists) layering violation within i915_gem_context.c -
that is we move the specific handling of the dev_priv->kernel_context
for legacy submission into the legacy submission code.

This depends upon the init/fini ordering between contexts and engines
already defined by intel_lrc.c, and also exporting the context alignment
required for pinning the legacy context.

v2: Separate out pin/unpin context funcs for greater symmetry with
intel_lrc. One more step towards unifying behaviour between the two
classes of engines and towards fixing another bug in i915_switch_context
vs requests.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466776558-21516-2-git-send-email-chris@chris-wilson.co.uk
2016-06-24 15:02:34 +01:00
Chris Wilson
0673ad472b drm/i915: Merge i915_dma.c into i915_drv.c
i915_dma.c used to contain the DRI1/UMS horror show, but now all that
remains are the out-of-place driver level interfaces (such as
allocating, initialising and registering the driver). These should be in
i915_drv.c alongside similar routines for suspend/resume.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466773227-7994-10-git-send-email-chris@chris-wilson.co.uk
2016-06-24 14:44:44 +01:00
Chris Wilson
091387c1cb drm/i915: Start exploiting drm_device subclassing
Baby step, update to_i915() conversion from drm_device to
drm_i915_private:

   text	   data	    bss	    dec	    hex	filename
1108812	  23207	    416	1132435	 114793	i915.ko (before)
1104999	  23207	    416	1128622	 1138ae	i915.ko (after)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466773227-7994-9-git-send-email-chris@chris-wilson.co.uk
2016-06-24 14:44:37 +01:00
Chris Wilson
2e5673e73d drm/i915: Remove redundant drm_connector_register_all()
drm_connector_register_all() is now automatically called by
drm_dev_register(), and so we no longer have to do so ourselves (via
intel_modeset_register() after calling drm_dev_register()). Similarly
for unregistering.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1466773227-7994-8-git-send-email-chris@chris-wilson.co.uk
2016-06-24 14:44:27 +01:00
Chris Wilson
8f460e2c78 drm/i915: Demidlayer driver loading
Take control over allocating, loading and registering the driver from the
DRM midlayer by performing it manually from i915_pci_probe. This allows
us to carefully control the order of when we setup the hardware vs when
it becomes visible to third parties (including userspace). The current
ordering makes the driver visible to userspace first (in order to
coordinate with removed DRI1 userspace), but that ordering incurs risk.
The risk increases as we strive for more asynchronous loading.

One side effect of controlling the allocation is that we can allocate
both the drm_device + drm_i915_private in one block, the next step
towards subclassing.

Unload is still left as before, a mix of midlayer and driver.

v2: After drm_dev_init(), we should call drm_dev_unref() so that we call
drm_dev_release() and free everything from drm_dev_init().
v3: Fixup missed error code for failing to allocate dev_priv

Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1466773227-7994-6-git-send-email-chris@chris-wilson.co.uk
2016-06-24 14:43:50 +01:00
Chris Wilson
1dac891c1c drm/i915: Register debugfs interface last
Currently debugfs files are created before the driver is even loads.
This gives the opportunity for userspace to open that interface and poke
around before the backing data structures are initialised - with the
possibility of oopsing or worse.

Move the creation of the debugfs files to our registration phase, where
we announce our presence to the world when we are ready, i.e the
sequence changes from

	drm_dev_register()
	 -> drm_minor_register()
	  -> drm_debugfs_init()
	   -> i915_debugfs_init()
	 -> i915_driver_load()

to

	drm_dev_register()
	 -> drm_minor_register()
	  -> drm_debugfs_init()
	 -> i915_driver_load()
	  -> i915_debugfs_register()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1466773227-7994-5-git-send-email-chris@chris-wilson.co.uk
2016-06-24 14:43:36 +01:00
Chris Wilson
843152b4b9 drm/i915: Move connector registration to driver registration
Defer connector registration from during construction to the driver
registration phase. This is important for ordering the action correctly,
e.g. not using debugfs before it is ready.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1466773227-7994-4-git-send-email-chris@chris-wilson.co.uk
2016-06-24 14:43:26 +01:00
Chris Wilson
1ebaa0b9c2 drm/i915: Move backlight registration to connector registration
Currently the backlight is being registered in the load phase (before
the display and its objects are registered). Move the backlight
registration into the analogous phase by performing it from the
connector registration, just after its creation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1466773227-7994-3-git-send-email-chris@chris-wilson.co.uk
2016-06-24 14:43:14 +01:00
Tvrtko Ursulin
a19d6ff29a drm/i915: Small compaction of the engine init code
Effectively removes one layer of indirection between the mask of
possible engines and the engine constructors. Instead of spelling
out in code the mapping of HAS_<engine> to constructors, makes
more use of the recently added data driven approach by putting
engine constructor vfuncs into the table as well.

Effect is fewer lines of source and smaller binary.

At the same time simplify the error handling since engine
destructors can run on unitialized engines anyway.

Similar approach could be done for legacy submission is wanted.

v2: Removed ugly BUILD_BUG_ONs in favour of newly introduced
    ENGINE_MASK and HAS_ENGINE macros.
    Also removed the forward declarations by shuffling functions
    around.

v3: Warn when logical_rings table does not contain enough data
    and disable the engines which could not be initialized.
    (Chris Wilson)

v4: Chris Wilson suggested a nicer engine init loop.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1466689961-23232-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-06-24 12:09:00 +01:00
Daniel Vetter
10bb667223 Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queued
Backmerge drm-next for the reworked device register/unregistering.
Chris Wilson needs that to be able to land his i915 load/unload
demidlayering.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-06-24 08:29:45 +02:00
Dave Airlie
9da1030e3c Merge tag 'drm-intel-next-2016-06-20' of git://anongit.freedesktop.org/drm-intel into drm-next
- Infrastructure for GVT-g (paravirtualized gpu on gen8+), from Zhi Wang
- another attemp at nonblocking atomic plane updates
- bugfixes and refactoring for GuC doorbell code (Dave Gordon)
- GuC command submission enabled by default, if fw available (Dave Gordon)
- more bxt w/a (Arun Siluvery)
- bxt phy improvements (Imre Deak)
- prep work for stolen objects support (Ankitprasa Sharma & Chris Wilson)
- skl/bkl w/a update from Mika Kuoppala
- bunch of small improvements and fixes all over, as usual

* tag 'drm-intel-next-2016-06-20' of git://anongit.freedesktop.org/drm-intel: (81 commits)
  drm/i915: Update DRIVER_DATE to 20160620
  drm/i915: Introduce GVT context creation API
  drm/i915: Support LRC context single submission
  drm/i915: Introduce execlist context status change notification
  drm/i915: Make addressing mode bits in context descriptor configurable
  drm/i915: Make ring buffer size of a LRC context configurable
  drm/i915: gvt: Introduce the basic architecture of GVT-g
  drm/i915: Fold vGPU active check into inner functions
  drm/i915: Use offsetof() to calculate the offset of members in PVINFO page
  drm/i915: Factor out i915_pvinfo.h
  drm/i915: Serialise presentation with imported dmabufs
  drm/i915: Use atomic commits for legacy page_flips
  drm/i915: Move fb_bits updating later in atomic_commit
  drm/i915: nonblocking commit
  Reapply "drm/i915: Pass atomic states to fbc update, functions."
  drm/i915: Roll out the helper nonblock tracking
  drm/i915: Signal drm events for atomic
  drm/i915/ilk: Don't disable SSC source if it's in use
  drm/i915/guc: (re)initialise doorbell h/w when enabling GuC submission
  drm/i915/guc: replace assign_doorbell() with select_doorbell_register()
  ...
2016-06-24 13:13:41 +10:00
Daniel Vetter
3b96a0b140 drm: document drm_auth.c
Also extract drm_auth.h for nicer grouping.

v2: Nuke the other comments since they don't really explain a lot, and
within the drm core we generally only document functions exported to
drivers: The main audience for these docs are driver writers.

v3: Limit the exposure of drm_master internals by only including
drm_auth.h where it is neede (Chris).

v4: Spelling polish (Emil).

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-06-21 22:10:55 +02:00
Tvrtko Ursulin
612515121b drm/i915/guc: Remove one unnecessary variable
No need for local struct drm_device * since dev_priv is the
correct thing to pass in to NEEDS_WaRsDisableCoarsePowerGating
anyway. Changed the macro definition for the latter to reflect
that as well.

v2: Alignment bikeshed.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1466518034-24838-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-06-21 15:43:32 +01:00
Chris Wilson
aeecc9696a drm/i915: use ORIGIN_CPU for frontbuffer invalidation on WC mmaps
... instead of the previous ORIGIN_GTT. This should actually
invalidate FBC once something is written on the frontbuffer using WC
mmaps. The problem with ORIGIN_GTT is that the automatic hardware
tracking is not able to detect the WC writes as it can detect the GTT
writes.

This should help fix the SKL bug where nothing happens when you type
your username/password on lightdm.

This patch was originally pasted on an email by Chris and converted to
an actual git patch by Paulo.

v2 (from Paulo):
 - Make it a full variable instead of a bit-field (Daniel)
 - Use WRITE_ONCE (Chris)
v3 (from Paulo):
 - Remove huge comment since now we have WRITE_ONCE (Chris)
 - Remove uneeded new line (Chris)
 - Add Chris' Signed-off-by, authorized via IRC

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466185599-26401-1-git-send-email-paulo.r.zanoni@intel.com
2016-06-20 17:47:36 -03:00
Chris Wilson
b9bcd14a2b drm/i915: Extract checking for backing struct pages to a helper
Currently to see if an object is backed by struct pages (as opposed to
being a simple pointer to stolen memory, for example) we do a manual
check on the obj->ops->flags. This is quite shouty and before adding
more checks in future, we should make it a bit calmer.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466431552-17860-1-git-send-email-chris@chris-wilson.co.uk
2016-06-20 15:57:54 +01:00
Daniel Vetter
a02b01096d drm/i915: Update DRIVER_DATE to 20160620
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-06-20 00:30:34 +02:00
Chris Wilson
c191eca110 drm/i915: Move intel_connector->unregister to connector->early_unregister
We now have a connector->func that serves the same purpose as our own
intel_connector->unregister vfunc allowing us to unwrap ourselves and
use drm_connector_register() (and friends) as the central function.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1466160034-12173-2-git-send-email-chris@chris-wilson.co.uk
2016-06-19 10:39:06 +02:00
Zhi Wang
c8c35799f2 drm/i915: Introduce GVT context creation API
GVT workload scheduler needs special host LRC contexts, the so called
"shadow LRC context" to submit guest workload to host i915. During the
guest workload submission, workload scheduler fills the shadow LRC
context with the content of guest LRC context: engine context is copied
without changes, ring context is mostly owned by host i915.

v8:

- Remove the graph temporarily. (Chris)
- Use interruptible mutex_lock. (Chris)
- Rename the function name of creating a GVT context. (Chris)
- Add the missing declaration in i915_drv.h (Chris)

v7:

- Move chart to a better place. (Joonas)

v6:

- Make GVT code as dead code when !CONFIG_DRM_I915_GVT. (Chris)

v5:
- Only compile this feature when CONFIG_DRM_I915_GVT is enabled. (Tvrtko)
- Rebase the code into new repo.
- Add a comment about the ring buffer size. (Joonas)

v2:

Mostly based on Daniel's idea. Call the refactored core logic of GEM
context creation service and LRC context creation service to create the GVT
context.

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1466078825-6662-10-git-send-email-zhi.a.wang@intel.com
2016-06-17 20:36:43 +01:00
Zhi Wang
80a9a8db16 drm/i915: Support LRC context single submission
This patch introduces the support of LRC context single submission.
As GVT context may come from different guests, which require different
configuration of render registers. It can't be combined into a dual ELSP
submission combo.

Only GVT-g will create this kinds of GEM context currently.

v8:

- Rename the data member in struct i915_gem_context. (Chris)

v7:

- Fix typos in commit message. (Joonas)

v6:
- Make GVT code as dead code when !CONFIG_DRM_I915_GVT. (Chris)

v5:

- Only compile this feature when CONFIG_DRM_I915_GVT=y. (Tvrtko)

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1466078825-6662-9-git-send-email-zhi.a.wang@intel.com
2016-06-17 20:36:37 +01:00
Zhi Wang
3c7ba6359d drm/i915: Introduce execlist context status change notification
This patch introduces an approach to track the execlist context status
change.

GVT-g uses GVT context as the "shadow context". The content inside GVT
context will be copied back to guest after the context is idle. And GVT-g
has to know the status of the execlist context.

This function is configurable when creating a new GEM context. Currently,
Only GVT-g will create the "status-change-notification" enabled GEM
context.

v10:

- Fix the identation. (Joonas)

v8:

- Remove the boolean flag in struct i915_gem_context. (Joonas)

v7:

- Remove per-engine ctx status notifiers. Use one status notifier for all
engines. (Joonas)
- Add prefix "INTEL_" for related definitions. (Joonas)
- Refine the comments in execlists_context_status_change(). (Joonas)

v6:

- When !CONFIG_DRM_I915_GVT, make GVT code as dead code then compiler
could automatically eliminate them for us. (Chris)
- Always initialize the notifier header, so it could be switched on/off
at runtime. (Chris)

v5:

- Only compile this feature when CONFIG_DRM_I915_GVT is enabled.(Tvrtko)

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v8)
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466078825-6662-8-git-send-email-zhi.a.wang@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-06-17 20:36:19 +01:00
Zhi Wang
c01fc53229 drm/i915: Make addressing mode bits in context descriptor configurable
Currently the addressing mode bit in context descriptor is statically
generated from the configuration of system-wide PPGTT usage model.

GVT-g will load the PPGTT shadow page table by itself and probably one
guest is using a different addressing mode with i915 host. The addressing
mode bits of a LRC context should be configurable under this case.

v10:

- Fix the identation. (Joonas)

v9:
- Rename the data member in struct i915_gem_context. (Chris)

v8:
- Rename the data member in struct i915_gem_context. (Chris)

v7:
- Move context addressing mode bit into i915_reg.h. (Joonas/Chris)
- Add prefix "INTEL_" for related definitions. (Joonas)

v6:
- Directly save the addressing mode bits inside i915_gem_context. (Chris)
- Move the LRC context addressing mode bits into intel_lrc.h. (Chris)

v5:
- Change USES_FULL_48BIT(dev) to USES_FULL_48BIT(dev_priv) (Tvrtko)

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v9)
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1466078825-6662-7-git-send-email-zhi.a.wang@intel.com
2016-06-17 20:35:26 +01:00
Zhi Wang
bcd794c227 drm/i915: Make ring buffer size of a LRC context configurable
This patch introduces an option for configuring the ring buffer size
of a LRC context after the context creation.

v9:
- Fix an identation issue. (Chris)

v8:
- Rename the data member in i915_gem_context. (Chris)

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1466078825-6662-6-git-send-email-zhi.a.wang@intel.com
2016-06-17 20:35:18 +01:00
Zhi Wang
0ad35fed61 drm/i915: gvt: Introduce the basic architecture of GVT-g
This patch introduces the very basic framework of GVT-g device model,
includes basic prototypes, definitions, initialization.

v12:
- Call intel_gvt_init() in driver early initialization stage. (Chris)

v8:
- Remove the GVT idr and mutex in intel_gvt_host. (Joonas)

v7:
- Refine the URL link in Kconfig. (Joonas)
- Refine the introduction of GVT-g host support in Kconfig. (Joonas)
- Remove the macro GVT_ALIGN(), use round_down() instead. (Joonas)
- Make "struct intel_gvt" a data member in struct drm_i915_private.(Joonas)
	- Remove {alloc, free}_gvt_device()
	- Rename intel_gvt_{create, destroy}_gvt_device()
	- Expost intel_gvt_init_host()
- Remove the dummy "struct intel_gvt" declaration in intel_gvt.h (Joonas)

v6:
- Refine introduction in Kconfig. (Chris)
- The exposed API functions will take struct intel_gvt * instead of
void *. (Chris/Tvrtko)
- Remove most memebers of strct intel_gvt_device_info. Will add them
in the device model patches.(Chris)
- Remove gvt_info() and gvt_err() in debug.h. (Chris)
- Move GVT kernel parameter into i915_params. (Chris)
- Remove include/drm/i915_gvt.h, as GVT-g will be built within i915.
- Remove the redundant struct i915_gvt *, as the functions in i915
will directly take struct intel_gvt *.
- Add more comments for reviewer.

v5:
Take Tvrtko's comments:
- Fix the misspelled words in Kconfig
- Let functions take drm_i915_private * instead of struct drm_device *
- Remove redundant prints/local varible initialization

v3:
Take Joonas' comments:
- Change file name i915_gvt.* to intel_gvt.*
- Move GVT kernel parameter into intel_gvt.c
- Remove redundant debug macros
- Change error handling style
- Add introductions for some stub functions
- Introduce drm/i915_gvt.h.

Take Kevin's comments:
- Move GVT-g host/guest check into intel_vgt_balloon in i915_gem_gtt.c

v2:
- Introduce i915_gvt.c.
It's necessary to introduce the stubs between i915 driver and GVT-g host,
as GVT-g components is configurable in kernel config. When disabled, the
stubs here do nothing.

Take Joonas' comments:
- Replace boolean return value with int.
- Replace customized info/warn/debug macros with DRM macros.
- Document all non-static functions like i915.
- Remove empty and unused functions.
- Replace magic number with marcos.
- Set GVT-g in kernel config to "n" by default.

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466078825-6662-5-git-send-email-zhi.a.wang@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-06-17 20:35:06 +01:00
arun.siluvery@linux.intel.com
33e141ed1c drm/i915:bxt: Enable Pooled EU support
This mode allows to assign EUs to pools which can process work collectively.
The command to enable this mode should be issued as part of context initialization.

The pooled mode is global, once enabled it has to stay the same across all
contexts until HW reset hence this is sent in auxiliary golden context batch.
Thanks to Mika for the preliminary review and comments.

v2: explain why this is enabled in golden context, use feature flag while
enabling the support (Chris)

v3: Include only kernel support as userspace support is not available yet.

User space clients need to know when the pooled EU feature is present
and enabled on the hardware so that they can adapt work submissions.
Create a new device info flag for this purpose.

Set has_pooled_eu to true in the Broxton static device info - Broxton
supports the feature in hardware and the driver will enable it by
default.

We need to add getparam ioctls to enable userspace to query availability of
this feature and to retrieve min. no of eus in a pool but we will expose
them once userspace support is available. Opensource users for this feature
are mesa, libva and beignet.

Beignet team is currently working on adding userspace support.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Cc: Winiarski, Michal <michal.winiarski@intel.com>
Cc: Zou, Nanhai <nanhai.zou@intel.com>
Cc: Yang, Rong R <rong.r.yang@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Armin Reese <armin.c.reese@intel.com>
Cc: Tim Gore <tim.gore@intel.com>
Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-06-14 10:42:35 +01:00
Chris Wilson
341be1cd61 drm/i915: Introduce i915_gem_object_get_dma_address()
This utility function is a companion to i915_gem_object_get_page() that
uses the same cached iterator for the scatterlist to perform fast
sequential lookup of the dma address associated with any page within the
object.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-06-13 10:03:55 +01:00
Mika Kuoppala
fe90581987 drm/i915/kbl: Add WaDisableLSQCROPERFforOCL
Extend the scope of this workaround, already used in skl,
to also take effect in kbl.

v2: Fix KBL_REVID_E0 (Matthew)

References: HSD#2132677
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1465309159-30531-12-git-send-email-mika.kuoppala@intel.com
2016-06-08 16:24:03 +03:00
Mika Kuoppala
c033a37cd4 drm/i915/kbl: Add REVID macro
Add REVID macro for kbl to limit wa applicability to particular
revision range.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1465309159-30531-4-git-send-email-mika.kuoppala@intel.com
2016-06-08 16:20:03 +03:00
Ville Syrjälä
22f3504259 drm/i915: Check VBT for port presence in addition to the strap on VLV/CHV
Apparently some CHV boards failed to hook up the port presence straps
for HDMI ports as well (earlier we assumed this problem only affected
eDP ports). So let's check the VBT in addition to the strap, and if
either one claims that the port is present go ahead and register the
relevant connector.

While at it, change port D to register DP before HDMI as we do for ports
B and C since
commit 457c52d87e ("drm/i915: Only ignore eDP ports that are connected")

Also print a debug message when we register a HDMI connector to aid
in diagnosing missing/incorrect ports. We already had such a print for
DP/eDP.

v2: Improve the comment in the code a bit, note the port D change in
    the commit message

Cc: Radoslav Duda <radosd@radosd.com>
Tested-by: Radoslav Duda <radosd@radosd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96321
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1464945463-14364-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-06-07 18:45:43 +03:00
Daniel Vetter
1750d59dfa drm/i915: Update DRIVER_DATE to 20160606
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-06-06 00:29:53 +02:00
Daniel Vetter
5599617ec0 Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queued
Git got absolutely destroyed with all our cherry-picking from
drm-intel-next-queued to various branches. It ended up inserting
intel_crtc_page_flip 2x even in intel_display.c.

Backmerge to get back to sanity.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-06-02 09:54:12 +02:00
Dave Airlie
66fd7a66e8 Merge branch 'drm-intel-next' of git://anongit.freedesktop.org/drm-intel into drm-next
drm-intel-next-2016-05-22:
- cmd-parser support for direct reg->reg loads (Ken Graunke)
- better handle DP++ smart dongles (Ville)
- bxt guc fw loading support (Nick Hoathe)
- remove a bunch of struct typedefs from dpll code (Ander)
- tons of small work all over to avoid casting between drm_device and the i915
  dev struct (Tvrtko&Chris)
- untangle request retiring from other operations, also fixes reset stat corner
  cases (Chris)
- skl atomic watermark support from Matt Roper, yay!
- various wm handling bugfixes from Ville
- big pile of cdclck rework for bxt/skl (Ville)
- CABC (Content Adaptive Brigthness Control) for dsi panels (Jani&Deepak M)
- nonblocking atomic commits for plane-only updates (Maarten Lankhorst)
- bunch of PSR fixes&improvements
- untangle our map/pin/sg_iter code a bit (Dave Gordon)
drm-intel-next-2016-05-08:
- refactor stolen quirks to share code between early quirks and i915 (Joonas)
- refactor gem BO/vma funcstion (Tvrtko&Dave)
- backlight over DPCD support (Yetunde Abedisi)
- more dsi panel sequence support (Jani)
- lots of refactoring around handling iomaps, vma, ring access and related
  topics culmulating in removing the duplicated request tracking in the execlist
  code (Chris & Tvrtko) includes a small patch for core iomapping code
- hw state readout for bxt dsi (Ramalingam C)
- cdclk cleanups (Ville)
- dedupe chv pll code a bit (Ander)
- enable semaphores on gen8+ for legacy submission, to be able to have a direct
  comparison against execlist on the same platform (Chris) Not meant to be used
  for anything else but performance tuning
- lvds border bit hw state checker fix (Jani)
- rpm vs. shrinker/oom-notifier fixes (Praveen Paneri)
- l3 tuning (Imre)
- revert mst dp audio, it's totally non-functional and crash-y (Lyude)
- first official dmc for kbl (Rodrigo)
- and tons of small things all over as usual

* 'drm-intel-next' of git://anongit.freedesktop.org/drm-intel: (194 commits)
  drm/i915: Revert async unpin and nonblocking atomic commit
  drm/i915: Update DRIVER_DATE to 20160522
  drm/i915: Inline sg_next() for the optimised SGL iterator
  drm/i915: Introduce & use new lightweight SGL iterators
  drm/i915: optimise i915_gem_object_map() for small objects
  drm/i915: refactor i915_gem_object_pin_map()
  drm/i915/psr: Implement PSR2 w/a for gen9
  drm/i915/psr: Use ->get_aux_send_ctl functions
  drm/i915/psr: Order DP aux transactions correctly
  drm/i915/psr: Make idle_frames sensible again
  drm/i915/psr: Try to program link training times correctly
  drm/i915/userptr: Convert to drm_i915_private
  drm/i915: Allow nonblocking update of pageflips.
  drm/i915: Check for unpin correctness.
  Reapply "drm/i915: Avoid stalling on pending flips for legacy cursor updates"
  drm/i915: Make unpin async.
  drm/i915: Prepare connectors for nonblocking checks.
  drm/i915: Pass atomic states to fbc update functions.
  drm/i915: Remove reset_counter from intel_crtc.
  drm/i915: Remove queue_flip pointer.
  ...
2016-06-02 07:58:36 +10:00
Daniel Vetter
e42aeef123 drm/i915: Revert async unpin and nonblocking atomic commit
This reverts the following patches:

d55dbd06bb drm/i915: Allow nonblocking update of pageflips.
15c86bdb76 drm/i915: Check for unpin correctness.
95c2ccdc82 Reapply "drm/i915: Avoid stalling on pending flips for legacy cursor updates"
a6747b7304 drm/i915: Make unpin async.
03f476e1fc drm/i915: Prepare connectors for nonblocking checks.
2099deffef drm/i915: Pass atomic states to fbc update functions.
ee7171af72 drm/i915: Remove reset_counter from intel_crtc.
2ee004f7c5 drm/i915: Remove queue_flip pointer.
b8d2afae55 drm/i915: Remove use_mmio_flip kernel parameter.
8dd634d922 drm/i915: Remove cs based page flip support.
143f73b3bf drm/i915: Rework intel_crtc_page_flip to be almost atomic, v3.
84fc494b64 drm/i915: Add the exclusive fence to plane_state.
6885843ae1 drm/i915: Convert flip_work to a list.
aa420ddd8e drm/i915: Allow mmio updates on all platforms, v2.
afee4d8707 Revert "drm/i915: Avoid stalling on pending flips for legacy cursor updates"

"drm/i915: Allow nonblocking update of pageflips" should have been
split up, misses a proper commit message and seems to cause issues in
the legacy page_flip path as demonstrated by kms_flip.

"drm/i915: Make unpin async" doesn't handle the unthrottled cursor
updates correctly, leading to an apparent pin count leak. This is
caught by the WARN_ON in i915_gem_object_do_pin which screams if we
have more than DRM_I915_GEM_OBJECT_MAX_PIN_COUNT pins.

Unfortuantely we can't just revert these two because this patch series
came with a built-in bisect breakage in the form of temporarily
removing the unthrottled cursor update hack for legacy cursor ioctl.
Therefore there's no other option than to revert the entire pile :(

There's one tiny conflict in intel_drv.h due to other patches, nothing
serious.

Normally I'd wait a bit longer with doing a maintainer revert, but
since the minimal set of patches we need to revert (due to the bisect
breakage) is so big, time is running out fast. And very soon
(especially after a few attempts at fixing issues) it'll be really
hard to revert things cleanly.

Lessons learned:
- Not a good idea to rush the review (done by someone fairly new to
  the area) and not make sure domain experts had a chance to read it.

- Patches should be properly split up. I only looked at the two
  patches that should be reverted in detail, but both look like the
  mix up different things in one patch.

- Patches really should have proper commit messages. Especially when
  doing more than one thing, and especially when touching critical and
  tricky core code.

- Building a patch series and r-b stamping it when it has a built-in
  bisect breakage is not a good idea.

- I also think we need to stop building up technical debt by
  postponing atomic igt testcases even longer. I think it's clear that
  there's enough corner cases in this beast that we really need to
  have the testcases _before_ the next step lands.

(cherry picked from commit 5a21b6650a
from drm-intel-next-queeud)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-06-01 09:46:46 +02:00
Sagar Arun Kamble
1800ad255c drm/i915: Update GEN6_PMINTRMSK setup with GuC enabled
On Loading, GuC sets PM interrupts routing (bit 31) and clears ARAT
expired interrupt (bit 9). Host turbo also updates this register
in RPS flows. This patch ensures bit 31 and bit 9 setup by GuC persists.
ARAT timer interrupt is needed in GuC for various features. It also
facilitates halting GuC and hence achieving RC6. PM interrupt routing
will not impact RPS interrupt reception by host as GuC will redirect
them.
This patch fixes igt test pm_rc6_residency that was failing with guc
load/submission enabled. Tested with SKL GuC v6.1 and BXT GuC v5.1 and v8.7.

v2: i915_irq/i915_pm decoupling from intel_guc. (ChrisW)

v3: restructuring the mask update and rebase w.r.t Ville's patch. (ChrisW)

v4: Updating the pm_intr_keep during direct_interrupts_to_guc. (Sagar)

Cc: Chris Harris <chris.harris@intel.com>
Cc: Zhe Wang <zhe1.wang@intel.com>
Cc: Deepak S <deepak.s@intel.com>
Cc: Satyanantha, Rama Gopal M <rama.gopal.m.satyanantha@intel.com>
Cc: Akash Goel <akash.goel@intel.com>
Testcase: igt/pm_rc6_residency
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Tested-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1464683307-19475-1-git-send-email-sagar.a.kamble@intel.com
2016-05-31 16:13:47 -07:00
Dave Airlie
7fa1d27b63 Merge tag 'drm-intel-next-fixes-2016-05-25' of git://anongit.freedesktop.org/drm-intel into drm-next
I see the main drm pull got merged, here's the first batch of fixes for
4.7 already. Fixes all around, a large portion cc: stable stuff.

[airlied: the DP++ stuff is a regression fix].
* tag 'drm-intel-next-fixes-2016-05-25' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: Stop automatically retiring requests after a GPU hang
  drm/i915: Unify intel_ring_begin()
  drm/i915: Ignore stale wm register values on resume on ilk-bdw (v2)
  drm/i915/psr: Try to program link training times correctly
  drm/i915/bxt: Adjusting the error in horizontal timings retrieval
  drm/i915: Don't leave old junk in ilk active watermarks on readout
  drm/i915: s/DPPL/DPLL/ for SKL DPLLs
  drm/i915: Fix gen8 semaphores id for legacy mode
  drm/i915: Set crtc_state->lane_count for HDMI
  drm/i915/BXT: Retrieving the horizontal timing for DSI
  drm/i915: Protect gen7 irq_seqno_barrier with uncore lock
  drm/i915: Re-enable GGTT earlier during resume on pre-gen6 platforms
  drm/i915: Determine DP++ type 1 DVI adaptor presence based on VBT
  drm/i915: Enable/disable TMDS output buffers in DP++ adaptor as needed
  drm/i915: Respect DP++ adaptor TMDS clock limit
  drm: Add helper for DP++ adaptors
2016-05-27 16:08:38 +10:00
Daniel Vetter
5a21b6650a drm/i915: Revert async unpin and nonblocking atomic commit
This reverts the following patches:

d55dbd06bb drm/i915: Allow nonblocking update of pageflips.
15c86bdb76 drm/i915: Check for unpin correctness.
95c2ccdc82 Reapply "drm/i915: Avoid stalling on pending flips for legacy cursor updates"
a6747b7304 drm/i915: Make unpin async.
03f476e1fc drm/i915: Prepare connectors for nonblocking checks.
2099deffef drm/i915: Pass atomic states to fbc update functions.
ee7171af72 drm/i915: Remove reset_counter from intel_crtc.
2ee004f7c5 drm/i915: Remove queue_flip pointer.
b8d2afae55 drm/i915: Remove use_mmio_flip kernel parameter.
8dd634d922 drm/i915: Remove cs based page flip support.
143f73b3bf drm/i915: Rework intel_crtc_page_flip to be almost atomic, v3.
84fc494b64 drm/i915: Add the exclusive fence to plane_state.
6885843ae1 drm/i915: Convert flip_work to a list.
aa420ddd8e drm/i915: Allow mmio updates on all platforms, v2.
afee4d8707 Revert "drm/i915: Avoid stalling on pending flips for legacy cursor updates"

"drm/i915: Allow nonblocking update of pageflips" should have been
split up, misses a proper commit message and seems to cause issues in
the legacy page_flip path as demonstrated by kms_flip.

"drm/i915: Make unpin async" doesn't handle the unthrottled cursor
updates correctly, leading to an apparent pin count leak. This is
caught by the WARN_ON in i915_gem_object_do_pin which screams if we
have more than DRM_I915_GEM_OBJECT_MAX_PIN_COUNT pins.

Unfortuantely we can't just revert these two because this patch series
came with a built-in bisect breakage in the form of temporarily
removing the unthrottled cursor update hack for legacy cursor ioctl.
Therefore there's no other option than to revert the entire pile :(

There's one tiny conflict in intel_drv.h due to other patches, nothing
serious.

Normally I'd wait a bit longer with doing a maintainer revert, but
since the minimal set of patches we need to revert (due to the bisect
breakage) is so big, time is running out fast. And very soon
(especially after a few attempts at fixing issues) it'll be really
hard to revert things cleanly.

Lessons learned:
- Not a good idea to rush the review (done by someone fairly new to
  the area) and not make sure domain experts had a chance to read it.

- Patches should be properly split up. I only looked at the two
  patches that should be reverted in detail, but both look like the
  mix up different things in one patch.

- Patches really should have proper commit messages. Especially when
  doing more than one thing, and especially when touching critical and
  tricky core code.

- Building a patch series and r-b stamping it when it has a built-in
  bisect breakage is not a good idea.

- I also think we need to stop building up technical debt by
  postponing atomic igt testcases even longer. I think it's clear that
  there's enough corner cases in this beast that we really need to
  have the testcases _before_ the next step lands.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-05-25 09:33:04 +02:00
Chris Wilson
8d59bc6ae9 drm/i915: Rearrange i915_gem_context
Pack the integers and related types together inside the struct.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1464098023-3294-9-git-send-email-chris@chris-wilson.co.uk
2016-05-24 15:32:03 +01:00
Chris Wilson
bca44d8055 drm/i915: Merge legacy+execlists context structs
struct intel_context contains two substructs, one for the legacy RCS and
one for every execlists engine. Since legacy RCS is a subset of the
execlists engine support, just combine the two substructs.

v2: Only pin the default context for legacy mode (the object only exists
for legacy, but adding i915.enable_execlists provides symmetry with the
cleanup functions).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1464098023-3294-8-git-send-email-chris@chris-wilson.co.uk
2016-05-24 15:30:31 +01:00
Chris Wilson
0ca5fa3a6b drm/i915: Put the kernel_context in drm_i915_private next to its friends
Just move the kernel_context member of drm_i915_private next to the
engines it is associated with.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1464098023-3294-7-git-send-email-chris@chris-wilson.co.uk
2016-05-24 15:29:57 +01:00
Chris Wilson
9021ad03d0 drm/i915: Name the inner most per-engine intel_context struct
We want to give a name to the currently anonymous per-engine struct
inside the context, so that we can assign it to a local variable and
save clumsy typing. The name we have chosen is intel_context as it
reflects the HW facing portion of the context state (the logical context
state, the registers, the ringbuffer etc).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1464098023-3294-4-git-send-email-chris@chris-wilson.co.uk
2016-05-24 15:28:37 +01:00
Chris Wilson
ca585b5d0a drm/i915: Rename and inline i915_gem_context_get()
i915_gem_context_get() is a very simple wrapper around idr_find(), so
simple that it would be smaller to do the lookup inline. Also we use the
verb 'lookup' to return a pointer from a handle, freeing 'get' to imply
obtaining a reference to the context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1464098023-3294-3-git-send-email-chris@chris-wilson.co.uk
2016-05-24 15:28:29 +01:00
Chris Wilson
499f2697da drm/i915: Apply lockdep annotations to i915_gem_context.c
Markup the functions that require the caller to hold struct_mutex with
lockdep_assert_held(). In the hopefully not-too-distant future we will
split the struct_mutex up, and in doing so we need to be sure that we
know what it protects - here the lockdep annotations are invaluable.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1464098023-3294-2-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1464098023-3294-2-git-send-email-chris@chris-wilson.co.uk
2016-05-24 15:28:18 +01:00
Chris Wilson
e2efd13007 drm/i915: Rename struct intel_context
Our goal is to rename the anonymous per-engine struct beneath the
current intel_context. However, after a lively debate resolving around
the confusion between intel_context_engine and intel_engine_context, the
realisation is that the two structs target different users. The outer
struct is API / user facing, and so carries the higher level GEM
information. The inner struct is hw facing. Thus we want to name the
inner struct intel_context and the outer one i915_gem_context. As the
first step, we need to rename the current struct:

	s/struct intel_context/struct i915_gem_context/

which fits much better with its constructors already conveying the
i915_gem_context prefix!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1464098023-3294-1-git-send-email-chris@chris-wilson.co.uk
2016-05-24 15:27:14 +01:00
Linus Torvalds
1d6da87a32 Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
 "Here's the main drm pull request for 4.7, it's been a busy one, and
  I've been a bit more distracted in real life this merge window.  Lots
  more ARM drivers, not sure if it'll ever end.  I think I've at least
  one more coming the next merge window.

  But changes are all over the place, support for AMD Polaris GPUs is in
  here, some missing GM108 support for nouveau (found in some Lenovos),
  a bunch of MST and skylake fixes.

  I've also noticed a few fixes from Arnd in my inbox, that I'll try and
  get in asap, but I didn't think they should hold this up.

  New drivers:
   - Hisilicon kirin display driver
   - Mediatek MT8173 display driver
   - ARC PGU - bitstreamer on Synopsys ARC SDP boards
   - Allwinner A13 initial RGB output driver
   - Analogix driver for DisplayPort IP found in exynos and rockchip

  DRM Core:
   - UAPI headers fixes and C++ safety
   - DRM connector reference counting
   - DisplayID mode parsing for Dell 5K monitors
   - Removal of struct_mutex from drivers
   - Connector registration cleanups
   - MST robustness fixes
   - MAINTAINERS updates
   - Lockless GEM object freeing
   - Generic fbdev deferred IO support

  panel:
   - Support for a bunch of new panels

  i915:
   - VBT refactoring
   - PLL computation cleanups
   - DSI support for BXT
   - Color manager support
   - More atomic patches
   - GEM improvements
   - GuC fw loading fixes
   - DP detection fixes
   - SKL GPU hang fixes
   - Lots of BXT fixes

  radeon/amdgpu:
   - Initial Polaris support
   - GPUVM/Scheduler/Clock/Power improvements
   - ASYNC pageflip support
   - New mesa feature support

  nouveau:
   - GM108 support
   - Power sensor support improvements
   - GR init + ucode fixes.
   - Use GPU provided topology information

  vmwgfx:
   - Add host messaging support

  gma500:
   - Some cleanups and fixes

  atmel:
   - Bridge support
   - Async atomic commit support

  fsl-dcu:
   - Timing controller for LCD support
   - Pixel clock polarity support

  rcar-du:
   - Misc fixes

  exynos:
   - Pipeline clock support
   - Exynoss4533 SoC support
   - HW trigger mode support
   - export HDMI_PHY clock
   - DECON5433 fixes
   - Use generic prime functions
   - use DMA mapping APIs

  rockchip:
   - Lots of little fixes

  vc4:
   - Render node support
   - Gamma ramp support
   - DPI output support

  msm:
   - Mostly cleanups and fixes
   - Conversion to generic struct fence

  etnaviv:
   - Fix for prime buffer handling
   - Allow hangcheck to be coalesced with other wakeups

  tegra:
   - Gamme table size fix"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (1050 commits)
  drm/edid: add displayid detailed 1 timings to the modelist. (v1.1)
  drm/edid: move displayid validation to it's own function.
  drm/displayid: Iterate over all DisplayID blocks
  drm/edid: move displayid tiled block parsing into separate function.
  drm: Nuke ->vblank_disable_allowed
  drm/vmwgfx: Report vmwgfx version to vmware.log
  drm/vmwgfx: Add VMWare host messaging capability
  drm/vmwgfx: Kill some lockdep warnings
  drm/nouveau/gr/gf100-: fix race condition in fecs/gpccs ucode
  drm/nouveau/core: recognise GM108 chipsets
  drm/nouveau/gr/gm107-: fix touching non-existent ppcs in attrib cb setup
  drm/nouveau/gr/gk104-: share implementation of ppc exception init
  drm/nouveau/gr/gk104-: move rop_active_fbps init to nonctx
  drm/nouveau/bios/pll: check BIT table version before trying to parse it
  drm/nouveau/bios/pll: prevent oops when limits table can't be parsed
  drm/nouveau/volt/gk104: round up in gk104_volt_set
  drm/nouveau/fb/gm200: setup mmu debug buffer registers at init()
  drm/nouveau/fb/gk20a,gm20b: setup mmu debug buffer registers at init()
  drm/nouveau/fb/gf100-: allocate mmu debug buffers
  drm/nouveau/fb: allow chipset-specific actions for oneinit()
  ...
2016-05-23 11:48:48 -07:00
Ville Syrjälä
709e05c3c4 drm/i915: Store cdclk PLL reference clock under dev_priv
Future platforms will have multiple options for the cdclk PLL reference
clock, so let's start tracking that under dev_priv alreday on SKL,
although on SKL it's always 24 MHz.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-15-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-05-23 21:11:15 +03:00
Ville Syrjälä
63911d7295 drm/i915: Rename skl_vco_freq to cdclk_pll.vco
We'll want to store the cdclk PLL (whatever PLL that is in reality) vco
frequency somewhere on other platforms too, so let's rename the
skl_vco_freq to cdclk_pll.vco, and let's store it in kHz instead of MHz
to match most of the other clocks.

v2: Drop the spurious > vs != change (Imre)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-14-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-05-23 21:11:15 +03:00
Ville Syrjälä
b204535204 drm/i915: Keep track of preferred cdclk vco frequency on SKL
Now that skl_vco_freq tracks the actual DPLL0 vco frequency, we'll need
something that keeps track of which vco frequency we want to use in case
the current vco is 0. This would be important across supend/resume since
we'll disable DPLL0 around those parts.

We'll also update our idea of max cdclk/dotclock when the preferred
vco changes. That could happen if out initial guess was wrong, and
later eDP would force us to change it. One issue here could be that
changing the max dotclock could cause our mode list to change during
next time the displays get probed. But I don't see a good way to avoid
that, except perhaps by allowing either vco frequency to be used as
needed. But the docs suggest that such usage wasn't really inteded.

Also need to make sure we don't update our max_cdclk value before we
have a preferred vco value, which means moving that to happen after
the cdclk sanitation.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-9-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-05-23 21:11:13 +03:00
Clint Taylor
c89e39f327 drm/i915/skl: SKL CDCLK change on modeset tracking VCO
WARNING: Using ChromeOS with an eDP panel and a 4K@60 DP monitor connected
to DDI1 the system will hard hang during a cold boot. Occurs when DDI1
is enabled when the cdclk is less then required. DP connected to DDI2
and HPD on either port works correctly.

Set cdclk based on the max required pixel clock based on VCO
selected. Track boot vco instead of boot cdclk.

The vco is now tracked at the atomic level and all CRTCs updated if
the required vco is changed. Not tested with eDP v1.4 panels that
require 8640 vco due to availability.

V1: initial version
V2: add vco tracking in intel_dp_compute_config(), rename
skl_boot_cdclk.
V3: rebase, V2 feedback not possible as encoders are not aware of
atomic.
V4: track target vco is atomic state. modeset all CRTCs if vco changes
V5: rename atomic variable, cleaner if/else logic, use existing vco if
      encoder does not return a new vco value. check_patch.pl cleanup
V6: simplify logic in intel_modeset_checks.
V7: reorder an IF for readability and whitespace fix.
V8: use dev_cdclk for tracking new cdclk during atomic
V9: correctly handle vco 8640 when crtcs==0
V10: Clean up if else in crtcs==0
V11: Rebase for new intel_dpll_mgr.c

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
[vsyrjala: rebased due to churn]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463172100-24715-3-git-send-email-ville.syrjala@linux.intel.com
2016-05-23 21:10:55 +03:00
Chris Wilson
03d92e4779 drm/i915/opregion: Rename init/fini functions to register/unregister
Current intel_opregion_init is called during the driver registration
phase and intel_opregion_fini from the unregistration phase. Rename the
functions so that this is clear from their names. The phases tell us
what we expect the existing hw state to be, e.g. whether interrupts are
still enabled etc.

It should be noted that the opregion init/fini routines are asymmetric
and this is carried across into their new names. Indeed, their new names
make it even clearer that perhaps all is not well in the opregion
suspend/resume sequence (as well in the module unload).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1464012490-30961-2-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com>
2016-05-23 16:35:50 +01:00
Chris Wilson
6f9f4b7a2c drm/i915/opregion: Convert to using native drm_i915_private
Prefer passing struct drm_i915_private to internal interfaces as this
saves us having to dance between drm_device and our native struct. The
savings hare are small (only 70 bytes of unrequired dancing), but
progressive!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1464012490-30961-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com>
2016-05-23 16:35:21 +01:00
Dave Gordon
1a3d189837 drm/i915/guc: distinguish HAS_GUC() from HAS_GUC_UCODE/HAS_GUC_SCHED
For now, anything with a GuC requires uCode loading, and then supports
command submission once loaded. But these are logically distinct from
simply "having a GuC", so we need a separate macro for the latter. Then,
various tests should use this new macro rather than HAS_GUC_UCODE() or
testing enable_guc_submission.

v4:
    Added a couple more uses of the new macro.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-05-23 14:21:52 +01:00
Ville Syrjälä
55d7f30ee1 drm/i915: Determine DP++ type 1 DVI adaptor presence based on VBT
DP dual mode type 1 DVI adaptors aren't required to implement any
registers, so it's a bit hard to detect them. The best way would
be to check the state of the CONFIG1 pin, but we have no way to
do that. So as a last resort, check the VBT to see if the HDMI
port is in fact a dual mode capable DP port.

v2: Deal with VBT code reorganization
    Deal with DRM_DP_DUAL_MODE_UNKNOWN
    Reduce DEVICE_TYPE_DP_DUAL_MODE_BITS a bit
    Accept both DP and HDMI dvo_port in VBT as my BSW
    at least declare its DP port as HDMI :(
v3: Ignore DEVICE_TYPE_NOT_HDMI_OUTPUT (Shashank)

Cc: stable@vger.kernel.org
Cc: Tore Anderson <tore@fud.no>
Reported-by: Tore Anderson <tore@fud.no>
Fixes: 7a0baa6234 ("Revert "drm/i915: Disable 12bpc hdmi for now"")
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Shashank Sharma <shashank.sharma@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1462362322-31278-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
(cherry picked from commit d61992565b)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-05-23 11:10:48 +03:00
Daniel Vetter
9cce44316a drm/i915: Update DRIVER_DATE to 20160522
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-05-22 18:22:42 +02:00
Dave Gordon
63d1532616 drm/i915: Inline sg_next() for the optimised SGL iterator
Avoiding the out-of-line call to sg_next() reduces the kernel execution
overhead by 10% in some workloads (for example the Unreal Engine 4 demo
Atlantis on 2GiB GTTs) which are dominated by the cost of inserting PTEs
due to texture thrashing. We can demonstrate this in a microbenchmark
that forces us to rebind the object on every execbuf, where we can
measure a 25% improvement, in the time required to execute an execbuf
requiring a texture to be rebound, for inlining the sg_next() for large
texture sizes.

Benchmark: igt/benchmarks/gem_exec_fault
Benchmark: igt/benchmarks/gem_exec_trace/Atlantis
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1463741647-15666-5-git-send-email-chris@chris-wilson.co.uk
2016-05-20 13:43:00 +01:00
Dave Gordon
85d1225ec0 drm/i915: Introduce & use new lightweight SGL iterators
The existing for_each_sg_page() iterator is somewhat heavyweight, and is
limiting i915 driver performance in a few benchmarks. So here we
introduce somewhat lighter weight iterators, primarily for use with GEM
objects or other case where we need only deal with whole aligned pages.

Unlike the old iterator, the new iterators use an internal state
structure which is not intended to be accessed by the caller; instead
each takes as a parameter an output variable which is set before each
iteration. This makes them particularly simple to use :)

One of the new iterators provides the caller with the DMA address of
each page in turn; the other provides the 'struct page' pointer required
by many memory management operations.

Various uses of for_each_sg_page() are then converted to the new macros.

v2: Force inlining of the sg_iter constructor and make the union
anonymous.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1463741647-15666-4-git-send-email-chris@chris-wilson.co.uk
2016-05-20 13:43:00 +01:00
Chris Wilson
72778cb266 drm/i915/userptr: Convert to drm_i915_private
userptr directly only uses drm_device in a single interface where it
meant to use drm_i915_private (everywhere else we have to derive it from
the drm_i915_gem_object and so require going from drm_device).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463671036-3235-1-git-send-email-chris@chris-wilson.co.uk
2016-05-19 17:57:08 +01:00
Maarten Lankhorst
2ee004f7c5 drm/i915: Remove queue_flip pointer.
With the removal of cs support this is no longer reachable.
Can be revived if needed.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463490484-19540-15-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
2016-05-19 14:38:40 +02:00
Maarten Lankhorst
6885843ae1 drm/i915: Convert flip_work to a list.
This will be required to allow more than 1 update in the future.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463490484-19540-10-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
2016-05-19 14:37:46 +02:00
Chris Wilson
461fb99c15 drm/i915: Update domain tracking for GEM objects on hibernation
When creating the hibernation image, the CPU will read the pages of all
objects and thus conflict with our domain tracking. We need to update
our domain tracking to accurately reflect the state on restoration.

v2: Perform the domain tracking inside freeze, before the image is
written, rather than upon restoration.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: David Weinehall <david.weinehall@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463207195-22076-2-git-send-email-chris@chris-wilson.co.uk
2016-05-14 08:51:39 +01:00
Matt Roper
5b483747a9 drm/i915: Remove wm_config from dev_priv/intel_atomic_state
We calculate the watermark config into intel_atomic_state and then save
it into dev_priv, but never actually use it from there.  This is
left-over from some early ILK-style watermark programming designs that
got changed over time.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463061971-19638-18-git-send-email-matthew.d.roper@intel.com
2016-05-13 07:36:05 -07:00
Matt Roper
2b4b9f35d9 drm/i915/gen9: Use a bitmask to track dirty pipe watermarks
Slightly easier to work with than an array of bools.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463061971-19638-14-git-send-email-matthew.d.roper@intel.com
2016-05-13 07:34:40 -07:00
Matt Roper
98d39494d3 drm/i915/gen9: Compute DDB allocation at atomic check time (v4)
Calculate the DDB blocks needed to satisfy the current atomic
transaction at atomic check time.  This is a prerequisite to calculating
SKL watermarks during the 'check' phase and rejecting any configurations
that we can't find valid watermarks for.

Due to the nature of DDB allocation, it's possible for the addition of a
new CRTC to make the watermark configuration already in use on another,
unchanged CRTC become invalid.  A change in which CRTC's are active
triggers a recompute of the entire DDB, which unfortunately means we
need to disallow any other atomic commits from racing with such an
update.  If the active CRTC's change, we need to grab the lock on all
CRTC's and run all CRTC's through their 'check' handler to recompute and
re-check their per-CRTC DDB allocations.

Note that with this patch we only compute the DDB allocation but we
don't actually use the computed values during watermark programming yet.
For ease of review/testing/bisecting, we still recompute the DDB at
watermark programming time and just WARN() if it doesn't match the
precomputed values.  A future patch will switch over to using the
precomputed values once we're sure they're being properly computed.

Another clarifying note:  DDB allocation itself shouldn't ever fail with
the algorithm we use today (i.e., we have enough DDB blocks on BXT to
support the minimum needs of the worst-case scenario of every pipe/plane
enabled at full size).  However the watermarks calculations based on the
DDB may fail and we'll be moving those to the atomic check as well in
future patches.

v2:
 - Skip DDB calculations in the rare case where our transaction doesn't
   actually touch any CRTC's at all.  Assuming at least one CRTC state
   is present in our transaction, then it means we can't race with any
   transactions that would update dev_priv->active_crtcs (which requires
   _all_ CRTC locks).

v3:
 - Also calculate DDB during initial hw readout, to prevent using
   incorrect bios values. (Maarten)

v4:
 - Use new distrust_bios_wm flag instead of skip_initial_wm (which was
   never actually set).
 - Set intel_state->active_pipe_changes instead of just realloc_pipes

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Lyude Paul <cpaul@redhat.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463061971-19638-10-git-send-email-matthew.d.roper@intel.com
2016-05-13 07:34:00 -07:00
Matt Roper
279e99d76e drm/i915: Add distrust_bios_wm flag to dev_priv (v2)
SKL-style platforms can't fully trust the watermark/DDB settings
programmed by the BIOS and need to do extra sanitization on their first
atomic update.  Add a flag to dev_priv that is set during hardware
readout and cleared at the end of the first commit.

Note that for the somewhat common case where everything is turned off
when the driver starts up, we don't need to bother with a recompute...we
know exactly what the DDB should be (all zero's) so just setup the DDB
directly in that case.

v2:
 - Move clearing of distrust_bios_wm up below the swap_state call since
   it's a more natural / self-explanatory location.  (Maarten)
 - Use dev_priv->active_crtcs to test whether any CRTC's are turned on
   during HW WM readout rather than trying to count the active CRTC's
   again ourselves.  (Maarten)

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463061971-19638-9-git-send-email-matthew.d.roper@intel.com
2016-05-13 07:33:54 -07:00
Matt Roper
c107acfeb0 drm/i915/gen9: Allow skl_allocate_pipe_ddb() to operate on in-flight state (v3)
We eventually want to calculate watermark values at atomic 'check' time
instead of atomic 'commit' time so that any requested configurations
that result in impossible watermark requirements are properly rejected.
The first step along this path is to allocate the DDB at atomic 'check'
time.  As we perform this transition, allow the main allocation function
to operate successfully on either an in-flight state or an
already-commited state.  Once we complete the transition in a future
patch, we'll come back and remove the unnecessary logic for the
already-committed case.

v2: Rebase/refactor; we should no longer need to grab extra plane states
    while allocating the DDB since we can pull cached data rates and
    minimum block counts from the CRTC state for any planes that aren't
    being modified by this transaction.

v3:
 - Simplify memsets to clear DDB plane entries.  (Maarten)
 - Drop a redundant memset of plane[pipe][PLANE_CURSOR] that was added
   by an earlier Coccinelle patch.  (Maarten)
 - Assign *num_active at the top of skl_ddb_get_pipe_allocation_limits()
   so that no code paths return without setting it.  (kbuild robot)

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463061971-19638-8-git-send-email-matthew.d.roper@intel.com
2016-05-13 07:33:16 -07:00
Chris Wilson
d538704ba0 drm/i915: Move get-reset-stats ioctl from intel_uncore.c to i915_gem_context.c
The get-reset-stats ioctl reports upon the statistics (number of hangs,
be it as a victim or the guilty party) of a particular context. It is
semantically better as being part of i915_gem_context.c user interface,
as opposed to the hardware level access of intel_uncore.c

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463137042-9669-1-git-send-email-chris@chris-wilson.co.uk
2016-05-13 12:38:42 +01:00
Tvrtko Ursulin
ac657f6461 drm/i915: Introduce IS_GEN macro
To be used for more efficient Gen range checking.

v2: Remove spurious chunk. (Chris Wilson)
v3: Rebase.
v4: Renamed from INTEL_GEN_RANGE and added GEN_FOREVER.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v3)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1462874228-6601-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-05-11 12:27:28 +01:00
Tvrtko Ursulin
ac208a8b9a drm/i915: Do not use a bitfield for INTEL_INFO->num_pipes
It just makes more work for the compiler and generates more code.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-05-11 12:27:28 +01:00
Tvrtko Ursulin
7e22dbbbae drm/i915: Replace "INTEL_INFO->gen == x" checks with IS_GENx
This way optimization from a previous patch works even better.

v2: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2016-05-11 12:27:27 +01:00
Tvrtko Ursulin
ab0d24ac38 drm/i915: Promote IS_BROADWELL to a simple macro
If we allow it a dedicated flag in dev_priv we enable the
compiler to nicely optimize conditions like IS_HASSWELL ||
IS_BROADWELL.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-05-11 12:27:27 +01:00
Tvrtko Ursulin
ae5702d245 drm/i915: Make IS_GENx macros work on a mask
If instead of numerical comparison me make these test a
bitmask, we enable the compiler to optimize all instances
of IS_GENx || IS_GENy.

v2: Make bit zero of gen mask mean gen 1.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-05-11 12:27:27 +01:00
Chris Wilson
dc97997a21 drm/i915: Use drm_i915_private as the native pointer for intel_uncore.c
Pass drm_i915_private to the uncore init/fini routines and their
subservients as it is their native type.

   text    data     bss     dec     hex filename
6309978 3578778  696320 10585076         a183f4 vmlinux
6309530 3578778  696320 10584628         a18234 vmlinux

a modest 400 bytes of saving, but 60 lines of code deleted!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1462885804-26750-1-git-send-email-chris@chris-wilson.co.uk
2016-05-10 17:20:20 +01:00
Chris Wilson
c033666a94 drm/i915: Store a i915 backpointer from engine, and use it
text	   data	    bss	    dec	    hex	filename
6309351	3578714	 696320	10584385	 a18141	vmlinux
6308391	3578714	 696320	10583425	 a17d81	vmlinux

Almost 1KiB of code reduction.

v2: More s/INTEL_INFO()->gen/INTEL_GEN()/ and IS_GENx() conversions

   text	   data	    bss	    dec	    hex	filename
6304579	3578778	 696320	10579677	 a16edd	vmlinux
6303427	3578778	 696320	10578525	 a16a5d	vmlinux

Now over 1KiB!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1462545621-30125-3-git-send-email-chris@chris-wilson.co.uk
2016-05-09 13:41:24 +01:00
Tvrtko Ursulin
91d14251bb drm/i915: Small display interrupt handlers tidy
I have noticed some of our interrupt handlers use both dev and
dev_priv while they could get away with only dev_priv in the
huge majority of cases.

Tidying that up had a cascading effect on changing functions
prototypes, so relatively big churn factor, but I think it is
for the better.

For example even where changes cascade out of i915_irq.c, for
functions prefixed with intel_, genX_ or <plat>_, it makes more
sense to take dev_priv directly anyway.

This allows us to eliminate local variables and intermixed usage
of dev and dev_priv where only one is good enough.

End result is shrinkage of both source and the resulting binary.

i915.ko:

 - .text         000b0899
 + .text         000b0619

Or if we look at the Gen8 display irq chain:

 -00000000000006ad t gen8_irq_handler
 +0000000000000663 t gen8_irq_handler
   -0000000000000028 T intel_opregion_asle_intr
   +0000000000000024 T intel_opregion_asle_intr
   -000000000000008c t ilk_hpd_irq_handler
   +000000000000007f t ilk_hpd_irq_handler
   -0000000000000116 T intel_check_page_flip
   +0000000000000112 T intel_check_page_flip
   -000000000000011a T intel_prepare_page_flip
   +0000000000000119 T intel_prepare_page_flip
   -0000000000000014 T intel_finish_page_flip_plane
   +0000000000000013 T intel_finish_page_flip_plane
   -0000000000000053 t hsw_pipe_crc_irq_handler
   +000000000000004c t hsw_pipe_crc_irq_handler
   -000000000000022e t cpt_irq_handler
   +0000000000000213 t cpt_irq_handler

So small shrinkage but it is all fast paths so doesn't harm.

Situation is similar in other interrupt handlers as well.

v2: Tidy intel_queue_rps_boost_for_request as well. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-05-09 13:38:16 +01:00
Greg Kroah-Hartman
4096e645d8 Merge 4.6-rc7 into staging-next
This fixes some merge issues with some iio drivers that were found in
linux-next.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-09 13:20:04 +02:00
Ville Syrjälä
d61992565b drm/i915: Determine DP++ type 1 DVI adaptor presence based on VBT
DP dual mode type 1 DVI adaptors aren't required to implement any
registers, so it's a bit hard to detect them. The best way would
be to check the state of the CONFIG1 pin, but we have no way to
do that. So as a last resort, check the VBT to see if the HDMI
port is in fact a dual mode capable DP port.

v2: Deal with VBT code reorganization
    Deal with DRM_DP_DUAL_MODE_UNKNOWN
    Reduce DEVICE_TYPE_DP_DUAL_MODE_BITS a bit
    Accept both DP and HDMI dvo_port in VBT as my BSW
    at least declare its DP port as HDMI :(
v3: Ignore DEVICE_TYPE_NOT_HDMI_OUTPUT (Shashank)

Cc: stable@vger.kernel.org
Cc: Tore Anderson <tore@fud.no>
Reported-by: Tore Anderson <tore@fud.no>
Fixes: 7a0baa6234 ("Revert "drm/i915: Disable 12bpc hdmi for now"")
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Shashank Sharma <shashank.sharma@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1462362322-31278-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
2016-05-09 14:09:38 +03:00
Daniel Vetter
2a55135c67 drm/i915: Update DRIVER_DATE to 20160508
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-05-08 18:20:53 +02:00
Chris Wilson
1ca3712ca3 drm/i915: Report command parser version 0 if disabled
If the command parser is not active, then it is appropriate to report it
as operating at version 0 as no higher mode is supported. This greatly
simplifies userspace querying for the command parser as we then do not
need to second guess when it will be active (a mixture of module
parameters and generational support, which may change over time).

v2: s/comand/command/ misspelling in comment

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1462368336-21230-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-05-05 08:40:02 +01:00
Matthew Auld
e96b7e575b drm/i915: remove i915_gem_object_ggtt_unbind
Only has one user and is nothing more than a shim on top of
i915_vma_unbind, so let's just get rid of it.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461842691-27575-1-git-send-email-matthew.auld@intel.com
2016-05-04 13:10:10 +03:00
Deepak M
9a41e17de3 drm/i915: Parse LFP brightness control field in VBT
These fields in VBT indicates the PWM source which
is used and also the controller number.

v2 by Jani: check for out of bounds access, some renames, change default
type, etc.

v3 by Jani: s/INTEL_BACKLIGHT_CABC/INTEL_BACKLIGHT_DSI_DCS/

Signed-off-by: Deepak M <m.deepak@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/eee2f7b683a081f006a7df1ddad9b20fbf53c48c.1461676337.git.jani.nikula@intel.com
2016-05-02 16:17:38 +03:00
Gustavo Padovan
3ed605bc8a kernel.h: add u64_to_user_ptr()
This function had copies in 3 different files. Unify them in kernel.h.

Cc: Joe Perches <joe@perches.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rob Clark <robdclark@gmail.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Acked-by: Daniel Vetter <daniel.vetter@intel.com>	[drm/i915/]
Acked-by: Rob Clark <robdclark@gmail.com>		[drm/msm/]
Acked-by: Lucas Stach <l.stach@pengutronix.de>		[drm/etinav/]
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-29 17:03:49 -07:00
Chris Wilson
0e4ca1008e drm/i915: Fix ordering of sanitize ppgtt and sanitize execlists
The i915.enable_ppgtt option depends upon the state of
i915.enable_execlists option - so we need to sanitize execlists first.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461932305-14637-2-git-send-email-chris@chris-wilson.co.uk
2016-04-29 13:59:05 +01:00
Ander Conselvan de Oliveira
0f572ebec0 drm/i915: Move VLV HDMI lane reset work around logic to intel_dpio_phy.c
This moves the last phy specific code from the encoders to the phy
specific file.

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-11-git-send-email-ander.conselvan.de.oliveira@intel.com
2016-04-29 09:59:00 +03:00
Ander Conselvan de Oliveira
5f68c275b4 drm/i915: Unduplicate pre encoder enabling phy code
The phy code in vlv_pre_enable_dp() and vlv_hdmi_pre_enable() is
exectly the same, so extract it to intel_dpio_phy.c.

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-10-git-send-email-ander.conselvan.de.oliveira@intel.com
2016-04-29 09:58:53 +03:00
Ander Conselvan de Oliveira
6da2e61602 drm/i915: Unduplicate VLV phy pre pll enabling code
The code used by the DP and HDMI paths was very similar, so make them
share it. Note that this removes the write to signal level registers
from the HDMI pre pll enable path, but that's OK since those are set
in vlv_hdmi_pre_enable() function.

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-9-git-send-email-ander.conselvan.de.oliveira@intel.com
2016-04-29 09:58:45 +03:00
Ander Conselvan de Oliveira
53d9872511 drm/i915: Unduplicate VLV signal level code
The logic for setting signal levels is used for both HDMI and DP with
small variations. But it is similar enough to put behind a function
called from the encoders.

v2: Remove unrelated MST changes due to rebase fumble. (Jim Bride)
    Fix typo in the commit message. (Jim Bride)

v3: Really fix the typo. (Jim)

Cc: Jim Bride <jim.bride@linux.intel.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-8-git-send-email-ander.conselvan.de.oliveira@intel.com
2016-04-29 09:57:36 +03:00
Ander Conselvan de Oliveira
204970b5cf drm/i915: Unduplicate CHV encoders' post pll disable code
The exact same code was used by HDMI and DP encoders, so move it to
intel_dpio_phy.c.

v2: Fix typo in the commit message. (Jim Bride)
v3: Call the new function chv_phy_post_pll_disable() instead of
    chv_phy_post_disable(), as it should be called after the pll
    is disabled. (Ville)

Cc: Jim Bride <jim.bride@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-7-git-send-email-ander.conselvan.de.oliveira@intel.com
2016-04-29 09:57:20 +03:00
Ander Conselvan de Oliveira
e7d2a71724 drm/i915: Unduplicate CHV pre-encoder enabling phy logic
The only difference between the DP and HDMI versions was the lane count.
Since lane_count is now set appropriately for HDMI too, get rid of the
duplication and move this to intel_dpio_phy.c

v2: Don't move comments about 2nd common lane staying alive. (Ville)
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-6-git-send-email-ander.conselvan.de.oliveira@intel.com
2016-04-29 09:56:34 +03:00
Ander Conselvan de Oliveira
419b1b7ae1 drm/i915: Unduplicate CHV phy-releated pre pll enabling code
The same logic is used for DP and HDMI so move it to intel_dpio_phy.c.

v2: Rebase

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-5-git-send-email-ander.conselvan.de.oliveira@intel.com
2016-04-29 09:56:22 +03:00
Ander Conselvan de Oliveira
844b2f9a5d drm/i915: Unduplicate chv_data_lane_soft_reset()
The function chv_data_lane_soft_reset() was duplicated in DP and HDMI
code. Move it to intel_dpio_phy.c.

Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-4-git-send-email-ander.conselvan.de.oliveira@intel.com
2016-04-29 09:56:10 +03:00
Ander Conselvan de Oliveira
b7fa22d872 drm/i915: Unduplicate CHV signal level code
The code for programming voltage swing and emphasis was duplicated
between DP and HDMI code. Move that to a new file, intel_dpio_phy.c.

v2: Keep the "Use 800mV-0dB" comment in the HDMI code. (Ville)
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-3-git-send-email-ander.conselvan.de.oliveira@intel.com
2016-04-29 09:55:14 +03:00
Tvrtko Ursulin
a3d127616e drm/i915: Store LRC hardware id in the request
This way in the following patch we can disconnect requests
from contexts.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-23-git-send-email-chris@chris-wilson.co.uk
2016-04-28 12:17:32 +01:00
Chris Wilson
a16a405259 drm/i915: Track the previous pinned context inside the request
As the contexts are accessed by the hardware until the switch is completed
to a new context, the hardware may still be writing to the context object
after the breadcrumb is visible. We must not unpin/unbind/prune that
object whilst still active and so we keep the previous context pinned until
the following request. We can generalise the tracking we already do via
the engine->last_context and move it to the request so that it works
equally for execlists and GuC.

v2: Drop the execlists double pin as that exposes a race inside the lrc
irq handler as it tries to access the context after it may be retired.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-22-git-send-email-chris@chris-wilson.co.uk
2016-04-28 12:17:32 +01:00
Chris Wilson
73db04cfa8 drm/i915: Move releasing of the GEM request from free to retire/cancel
If we move the release of the GEM request (i.e. decoupling it from the
various lists used for client and context tracking) after it is complete
(either by the GPU retiring the request, or by the caller cancelling the
request), we can remove the requirement that the final unreference of
the GEM request need to be under the struct_mutex.

The careful reader may notice that one or two impossible NULL pointer
tests are dropped for readability. These pointers cannot be NULL since
they are assigned during request construction and never unset.

v2,v3: Rebalance execlists by moving the context unpinning.
v4: Rebase onto -nightly
v5: Avoid trying to rebalance execlist/GuC context pinning, leave that
to the next step

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-21-git-send-email-chris@chris-wilson.co.uk
2016-04-28 12:17:32 +01:00
Chris Wilson
24f1d3cc09 drm/i915: Refactor execlists default context pinning
Refactor pinning and unpinning of contexts, such that the default
context for an engine is pinned during initialisation and unpinned
during teardown (pinning of the context handles the reference counting).
Thus we can eliminate the special case handling of the default context
that was required to mask that it was not being pinned normally.

v2: Rebalance context_queue after rebasing.
v3: Rebase to -nightly (not 40 patches in)
v4: Rebase onto request_alloc unwinding

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-19-git-send-email-chris@chris-wilson.co.uk
2016-04-28 12:17:32 +01:00
Chris Wilson
5d1808ecbc drm/i915: Assign every HW context a unique ID
The hardware tracks contexts and expects all live contexts (those active
on the hardware) to have a unique identifier. This is used by the
hardware to assign pagefaults and the like to a particular context.

v2: Reorder to make sure ctx->link is not left dangling if the
assignment of a hw_id fails (Mika).

v3: We have 21bits of context space, not 20.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-17-git-send-email-chris@chris-wilson.co.uk
2016-04-28 12:17:32 +01:00
Chris Wilson
0251a96322 drm/i915: Remove the identical implementations of request space reservation
Now that we share intel_ring_begin(), reserving space for the tail of
the request is identical between legacy/execlists and so the tautology
can be removed. In the process, we move the reserved space tracking
from the ringbuffer on to the request. This is to enable us to reorder
the reserved space allocation in the next patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-13-git-send-email-chris@chris-wilson.co.uk
2016-04-28 12:17:32 +01:00
Chris Wilson
f9326be5f1 drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use
The code to switch_mm() is already handled by i915_switch_context(), the
only difference required to setup the aliasing ppgtt is that we need to
emit te switch_mm() on the first context, i.e. when transitioning from
engine->last_context == NULL. This allows us to defer the
initialisation of the GPU from early device initialisation to first use,
which should marginally speed up both. The caveat is that we then defer
the context initialisation until first use - i.e. we cannot assume that
the GPU engines are initialised. For example, this means that power
contexts for rc6 (Ironlake) need to explicitly loaded, as they are.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-11-git-send-email-chris@chris-wilson.co.uk
2016-04-28 12:17:32 +01:00
Chris Wilson
d200cda6bd drm/i915: Remove early l3-remap
Since we do the l3-remap on context switch, we can remove the redundant
early call to set the mapping prior to performing the first context
switch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-10-git-send-email-chris@chris-wilson.co.uk
2016-04-28 12:17:32 +01:00
Chris Wilson
b2e862d08e drm/i915: Mark the current context as lost on suspend
In order to force a reload of the context image upon resume, we first
need to mark its absence on suspend. Currently we are failing to restore
the golden context state and any context w/a to the default context
after resume.

One oversight corrected, is that we had forgotten to reapply the L3
remapping when restoring the lost default context.

v2: Remove deprecated WARN.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-7-git-send-email-chris@chris-wilson.co.uk
2016-04-28 12:17:32 +01:00
Tvrtko Ursulin
8da32727ac drm/i915: Remove i915_gem_obj_size
Only caller is i915_gem_obj_ggtt_size which only cares about
GGTT so simplify it and implement under that name.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-04-25 12:34:05 +01:00
Dave Gordon
d37cd8a887 drm/i915: rename i915_gem_alloc_object() to i915_gem_object_create()
Because having both i915_gem_object_alloc() and i915_gem_alloc_object()
(with different return conventions) is just too confusing!

(i915_gem_object_alloc() is the low-level memory allocator, and remains
unchanged, whereas i915_gem_alloc_object() is a constructor that ALSO
initialises the newly-allocated object.)

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461348872-4702-1-git-send-email-david.s.gordon@intel.com
2016-04-25 12:31:34 +01:00
Daniel Vetter
5b4fd5b111 drm/i915: Update DRIVER_DATE to 20160425
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-04-25 09:35:38 +02:00
Dave Gordon
8305216ff8 drm/i915: check for ERR_PTR from i915_gem_object_pin_map()
The newly-introduced function i915_gem_object_pin_map() returns an
ERR_PTR (not NULL) if the pin-and-map opertaion fails, so that's what we
must check for. And it's nicer not to assign such a pointer-or-error to
a structure being filled in until after it's been validated, so we
should keep it local and avoid exporting a bogus pointer. Also, for
clarity and symmetry, we should clear 'virtual_start' along with 'vma'
when unmapping a ringbuffer.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-04-20 16:59:03 +01:00
Mika Kuoppala
d528a6a0f3 drm/i915/skl: Fix rc6 based gpu/system hang
For all gt3 and gt4 skylake variants, extend the usage of
WaRsDisableCoarsePowerGating for all revisions. Without this
gt3 and gt4 skylakes up to atleast rev 0xa can gpu hang or
system hang.

Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Timo Aaltonen <tjaalton@ubuntu.com>
Cc: <stable@vger.kernel.org>
Reported-by: Mikael Djurfeldt <mikael@djurfeldt.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=94161
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459860977-27751-1-git-send-email-mika.kuoppala@intel.com
(cherry picked from commit 185c66e57c)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-04-18 12:35:49 +03:00
Imre Deak
8f6d855c4b drm/i915/bxt: Enable runtime PM
With the preceding fixes runtime PM should be functional, I could
runtime suspend/resume the device without problems.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: David Weinehall <david.weinehall@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-17-git-send-email-imre.deak@intel.com
2016-04-15 14:48:19 +03:00
Imre Deak
adc7f04bfd drm/i915/bxt: Add HW state verification for DDI PHY and CDCLK
I caught a few errors in our current PHY/CDCLK programming by sanity
checking the actual programmed state, so I thought it would be also
useful for the future. In addition to verifying the state after
programming it also verify it after exiting DC5, to make sure DMC
restored/kept intact everything related.

v2:
- Inlining __phy_reg_verify_state() doesn't make sense and also
  incorrect, so don't do it (PW/CI gcc)
v3:
- Rebase on latest -nightly

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: David Weinehall <david.weinehall@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459780030-15781-1-git-send-email-imre.deak@intel.com
2016-04-15 14:48:19 +03:00
Chris Wilson
aa9b78104f drm/i915: Late request cancellations are harmful
Conceptually, each request is a record of a hardware transaction - we
build up a list of pending commands and then either commit them to
hardware, or cancel them. However, whilst building up the list of
pending commands, we may modify state outside of the request and make
references to the pending request. If we do so and then cancel that
request, external objects then point to the deleted request leading to
both graphical and memory corruption.

The easiest example is to consider object/VMA tracking. When we mark an
object as active in a request, we store a pointer to this, the most
recent request, in the object. Then we want to free that object, we wait
for the most recent request to be idle before proceeding (otherwise the
hardware will write to pages now owned by the system, or we will attempt
to read from those pages before the hardware is finished writing). If
the request was cancelled instead, that wait completes immediately. As a
result, all requests must be committed and not cancelled if the external
state is unknown.

All that remains of i915_gem_request_cancel() users are just a couple of
extremely unlikely allocation failures, so remove the API entirely.

A consequence of committing all incomplete requests is that we generate
excess breadcrumbs and fill the ring much more often with dummy work. We
have completely undone the outstanding_last_seqno optimisation.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-16-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
f4457ae71f drm/i915: Prevent leaking of -EIO from i915_wait_request()
Reporting -EIO from i915_wait_request() has proven very troublematic
over the years, with numerous hard-to-reproduce bugs cropping up in the
corner case of where a reset occurs and the code wasn't expecting such
an error.

If the we reset the GPU or have detected a hang and wish to reset the
GPU, the request is forcibly complete and the wait broken. Currently, we
report either -EAGAIN or -EIO in order for the caller to retreat and
restart the wait (if appropriate) after dropping and then reacquiring
the struct_mutex (essential to allow the GPU reset to proceed). However,
if we take the view that the request is complete (no further work will
be done on it by the GPU because it is dead and soon to be reset), then
we can proceed with the task at hand and then drop the struct_mutex
allowing the reset to occur. This transfers the burden of checking
whether it is safe to proceed to the caller, which in all but one
instance it is safe - completely eliminating the source of all spurious
-EIO.

Of note, we only have two API entry points where we expect that
userspace can observe an EIO. First is when submitting an execbuf, if
the GPU is terminally wedged, then the operation cannot succeed and an
-EIO is reported. Secondly, existing userspace uses the throttle ioctl
to detect an already wedged GPU before starting using HW acceleration
(or to confirm that the GPU is wedged after an error condition). So if
the GPU is wedged when the user calls throttle, also report -EIO.

v2: Split more carefully the change to i915_wait_request() and assorted
ABI from the reset handling.
v3: Add a couple of WARN_ON(EIO) to the interruptible modesetting code
so that we don't start to leak EIO there in future (and break our hang
resistant modesetting).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-9-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-1-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
299259a3a9 drm/i915: Store the reset counter when constructing a request
As the request is only valid during the same global reset epoch, we can
record the current reset_counter when constructing the request and reuse
it when waiting upon that request in future. This removes a very hairy
atomic check serialised by the struct_mutex at the time of waiting and
allows us to transfer those waits to a central dispatcher for all
waiters and all requests.

PS: With per-engine resets, we obviously cannot assume a global reset
epoch for the requests - a per-engine epoch makes the most sense. The
challenge then is how to handle checking in the waiter for when to break
the wait, as the fine-grained reset may also want to requeue the
request (i.e. the assumption that just because the epoch changes the
request is completed may be broken - or we just avoid breaking that
assumption with the fine-grained resets).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-7-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
d98c52cf4f drm/i915: Tighten reset_counter for reset status
In the reset_counter, we use two bits to track a GPU hang and reset. The
low bit is a "reset-in-progress" flag that we set to signal when we need
to break waiters in order for the recovery task to grab the mutex. As
soon as the recovery task has the mutex, we can clear that flag (which
we do by incrementing the reset_counter thereby incrementing the gobal
reset epoch). By clearing that flag when the recovery task holds the
struct_mutex, we can forgo a second flag that simply tells GEM to ignore
the "reset-in-progress" flag.

The second flag we store in the reset_counter is whether the
reset failed and we consider the GPU terminally wedged. Whilst this flag
is set, all access to the GPU (at least through GEM rather than direct mmio
access) is verboten.

PS: Fun is in store, as in the future we want to move from a global
reset epoch to a per-engine reset engine with request recovery.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-6-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
c19ae989b0 drm/i915: Hide the atomic_read(reset_counter) behind a helper
This is principally a little bit of syntatic sugar to hide the
atomic_read()s throughout the code to retrieve the current reset_counter.
It also provides the other utility functions to check the reset state on the
already read reset_counter, so that (in later patches) we can read it once
and do multiple tests rather than risk the value changing between tests.

v2: Be more strict on converting existing i915_reset_in_progress() over to
the more verbose i915_reset_in_progress_or_wedged().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-4-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
d501b1d2b1 drm/i915: Add GEM debugging Kconfig option
Currently there is a #define to enable extra BUG_ON for debugging
requests and associated activities. I want to expand its use to cover
all of GEM internals (so that we can saturate the code with asserts).
We can add a Kconfig option to make it easier to enable - with the usual
caveats of not enabling unless explicitly requested.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-3-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
e73bdd206a drm/i915: Disentangle i915_drv.h includes
Separate out the layers of includes (linux, drm, intel, i915) so that it
is a little easier to order our definitions between our multiple
reentrant headers. A couple of headers needed fixes to make them more
standalone (forgotten includes, forward declarations etc).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-2-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Mika Kuoppala
3accaf7e73 drm/i915: Store and use edram capabilities
Store the edram capabilities instead of only the size of
edram. This is preparatory patch to allow edram size calculation
based on edram capability bits for gen9+. With gen9 the
edram is behind llc and is a separate entity. With hsw/bdw
it was more of a victim cache for LLC so the name 'eLLC' might
be warranted. Regardless, rename all mentions of eLLC to EDRAM to
clear the confusion.

v2: return bytes for edram size (Chris)
    s/eLLC/eDRAM in output if we are gen > 8

v3: rebase, INTEL_GEN (Chris)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-14 12:27:37 +03:00
Mika Kuoppala
185c66e57c drm/i915/skl: Fix rc6 based gpu/system hang
For all gt3 and gt4 skylake variants, extend the usage of
WaRsDisableCoarsePowerGating for all revisions. Without this
gt3 and gt4 skylakes up to atleast rev 0xa can gpu hang or
system hang.

Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Timo Aaltonen <tjaalton@ubuntu.com>
Cc: <stable@vger.kernel.org>
Reported-by: Mikael Djurfeldt <mikael@djurfeldt.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=94161
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459860977-27751-1-git-send-email-mika.kuoppala@intel.com
2016-04-13 15:28:25 +03:00
Jani Nikula
3f10e82fe1 drm/i915: add INTEL_GEN() helper shorthand for INTEL_INFO()->gen
Sudden realization:

$ grep -ho "INTEL_INFO([^)]*)->[a-zA-Z0-9_]*" *.[ch] | sed 's/.*->//' |\
  sort | uniq -c | sort -rn | head -5
  446 gen
   24 num_pipes
   10 ring_mask
    9 color
    4 subslice_per_slice

Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460022497-29304-1-git-send-email-jani.nikula@intel.com
2016-04-13 14:36:14 +03:00
Tvrtko Ursulin
3756685a18 drm/i915: Only grab correct forcewake for the engine with execlists
Rather than blindly waking up all forcewake domains on command
submission, we can teach each engine what is (or are) the correct
one to take.

On platforms with multiple forcewake domains like VLV, CHV, SKL
and BXT, this has the potential of lowering the GPU and CPU
power use and submission latency.

To implement it we add a function named
intel_uncore_forcewake_for_reg whose purpose is to query which
forcewake domains need to be taken to read or write a specific
register with raw mmio accessors.

These enables the execlists engine setup  to query which
forcewake domains are relevant per engine on the currently
running platform.

v2:
  * Kerneldoc.
  * Split from intel_uncore.c macro extraction, WARN_ON,
    no warns on old platforms. (Chris Wilson)

v3:
  * Single domain per engine, mention all registers,
    bi-directional function and a new name, fix handling
    of gen6 and gen7 writes. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1460468251-14069-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-04-12 15:35:22 +01:00
Tvrtko Ursulin
33c582c10a drm/i915: Simplify for_each_fw_domain iterators
As the vast majority of users do not use the domain id variable,
we can eliminate it from the iterator and also change the latter
using the same principle as was recently done for for_each_engine.

For a couple of callers which do need the domain mask, store it
in the domain array (which already has the domain id), then both
can be retrieved thence.

Result is clearer code and smaller generated binary, especially
in the tight fw get/put loops. Also, relationship between domain
id and mask is no longer assumed in the macro.

v2: Improve grammar in the commit message and rename the
    iterator to for_each_fw_domain_masked for consistency.
    (Dave Gordon)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
2016-04-12 14:30:41 +01:00
Tvrtko Ursulin
a57a4a67e5 drm/i915: Use consistent forcewake auto-release timeout across kernel configs
Because it is based on jiffies, current implementation releases the
forcewake at any time between straight away and between 1ms and 10ms,
depending on the kernel configuration (CONFIG_HZ).

This is probably not what has been desired, since the dynamics of keeping
parts of the GPU awake should not be correlated with this kernel
configuration parameter.

Change the auto-release mechanism to use hrtimers and set the timeout to
1ms with a 1ms of slack. This should make the GPU power consistent
across kernel configs, and timer slack should enable some timer coalescing
where multiple force-wake domains exist, or with unrelated timers.

For GlBench/T-Rex this decreases the number of forcewake releases from
~480 to ~300 per second, and for a heavy combined OGL/OCL test from
~670 to ~360 (HZ=1000 kernel).

Even though this reduction can be attributed to the average release period
extending from 0-1ms to 1-2ms, as discussed above, it will make the
forcewake timeout consistent for different CONFIG_HZ values.

Real life measurements with the above workload has shown that, with this
patch, both manage to auto-release the forcewake between 2-4 times per
10ms, even though the number of forcewake gets is dramatically different.

T-Rex requests between 5-10 explicit gets and 5-10 implict gets in each
10ms period, while the OGL/OCL test requests 250 and 380 times in the same
period.

The two data points together suggest that the nature of the forwake
accesses is bursty and that further changes and potential timeout
extensions, or moving the start of timeout from the first to the last
automatic forcewake grab, should be carefully measured for power and
performance effects.

v2:
  * Commit spelling. (Dave Gordon)
  * More discussion on numbers in the commit. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-12 14:30:41 +01:00
Ville Syrjälä
a05628195a drm/i915: Get panel_type from OpRegion panel details
We've had problems on several occasions with using the panel type
from the VBT block 40. Usually it seems to be 2, which often
doesn't give us the correct timings for the panel. After some
more digging I found a way to get a panel type via the OpRegion
SWSCI GBDA "Get Panel Details" method. Let's try to use it.

The spec has this to say about the output:
"Bits [15:8] - Panel Type
 Bits contain the panel type user setting from CMOS
 00h = Not Valid, use default Panel Type & Timings from VBT
 01h - 0Fh = Panel Number"

Another version of the spec lists the valid range as 1-16, which makes
more sense since VBT supports 16 panels. Based on actual results
from Rob's G45, 1-16 is what we need to accept.

The other bits in the output don't look relevant for the problem at
hand.

The input is specified as:
"Bits [31:4] - Reserved
 Reserved (must be zero)
 Bits [3:0] - Panel Number
 These bits contain the sequential index of Panel, starting at 0 and
 counting upwards from the first integrated Internal Flat-Panel Display
 Encoder present, and then from the first external Display Encoder
 (e.g., S/DVO-B then S/DVO-C) which supports Internal Flat-Panels.
 0h - 0Fh = Panel number"

For now I've just hardcoded the input panel number as 0. That would seem
like a decent choise for LVDS. Not so sure about eDP when port != A.

v2: Accept values 1-16
    Filter out bogus results in opregion code (Jani)
    Add debug logging for all the different branches (Jani)

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rob Kramer <rob@solution-space.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94825
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460359431-11003-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Rob Kramer <rob@solution-space.com>
2016-04-12 13:23:43 +03:00
Ville Syrjälä
3e845c7a40 drm/i915: Replace the static panel_type variable with dev_priv->vbt.panel_type
Store the extracted panel_type under dev_priv.vbt instead of keeping
around a static variable for it.

Cc: Rob Kramer <rob@solution-space.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2016-04-12 13:23:43 +03:00
Ville Syrjälä
3e4d44e0fa drm/i915: Restore GMBUS operation after a failed bit-banging fallback
When the GMBUS based i2c transfer times out, we try to fall back to
bit-banging and retry the operation that way. However if the bit-banging
attempt also fails, we should probably go back to the GMBUS method for
the next attempt. Maybe there simply wasn't anyone one the bus at this
time.

There's also a bit of a mess going on with the force_bit handling.
It's supposed to be a ref count actually, and it is as far as
intel_gmbus_force_bit() is concerned. But it's treated as just a
flag by the timeout based bit-banging fallback. I suppose that's
fine since we should never end up in the timeout fallback case
if force_bit was already non-zero. However now that we want to restore
things back to where they were after the bit-banging attempt failed,
we're going to have to do things a bit differently to avoid clobbering
the force_bit count as set up by intel_gmbus_force_bit(). So let's
dedicate the high bit as a flag for the low level timeout based fallback
and treat the rest of the bits as a ref count just as before.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457366220-29409-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2016-04-12 13:20:58 +03:00
Daniel Vetter
ba3150ac38 drm/i915: Update DRIVER_DATE to 20160411
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-04-11 20:20:18 +02:00
Chris Wilson
eae2c43b12 drm/i915/shrinker: Restrict vmap purge to objects with vmaps
When called because we have run out of vmap address space, we only need
to recover objects that have vmappings and not all.

v2: Start using is_vmalloc_addr()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-5-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
2016-04-11 17:12:16 +01:00
Chris Wilson
0a798eb92e drm/i915: Refactor duplicate object vmap functions
We now have two implementations for vmapping a whole object, one for
dma-buf and one for the ringbuffer. If we couple the mapping into the
obj->pages lifetime, then we can reuse an obj->mapping for both and at
the same time couple it into the shrinker. There is a third vmapping
routine in the cmdparser that maps only a range within the object, for
the time being that is left alone, but will eventually use these routines
in order to cache the mapping between invocations.

v2: Mark the failable kmalloc() as __GFP_NOWARN (vsyrjala)
v3: Call unpin_vmap from the right dmabuf unmapper

v4: Rename vmap to map as we don't wish to imply the type of mapping
involved, just that it contiguously maps the object into kernel space.
Add kerneldoc and lockdep annotations

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-4-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
2016-04-11 17:11:44 +01:00
Chris Wilson
c04e0f3b4e drm/i915: Separate out the seqno-barrier from engine->get_seqno
In order to simplify future patches, extract the
lazy_coherency optimisation our of the engine->get_seqno() vfunc into
its own callback.

v2: Rename the barrier to engine->irq_seqno_barrier to try and better
reflect that the barrier is only required after the user interrupt before
reading the seqno (to ensure that the seqno update lands in time as we
do not have strict seqno-irq ordering on all platforms).

Reviewed-by: Dave Gordon <david.s.gordon@intel.com> [#v2]

v3: Comments for hangcheck paranoia. Mika wanted to keep the extra
barrier inside the hangcheck, just in case. I can argue that it doesn't
provide a barrier against anything, but the side-effects of applying the
barrier may prevent a false declaration of a hung GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-2-git-send-email-chris@chris-wilson.co.uk
2016-04-09 12:09:05 +01:00
Chris Wilson
14fd0d6d0b drm/i915: Include engine->last_submitted_seqno in GPU error state
It's useful to look at the last seqno submitted on a particular engine
and compare it against the HWS value to check for irregularities.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-1-git-send-email-chris@chris-wilson.co.uk
2016-04-08 11:42:50 +01:00
Joonas Lahtinen
9458f4ba7d drm/i915: Do not WARN_ON in i915_vm_to_ppgtt
According to Chris, use of i915_vm_to_ppgtt is visible in benchmark
unless WARN_ON is removed, so lets get rid of it.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-04-07 14:49:42 +03:00
Shubhangi Shrivastava
d252bf68b7 drm/i915: Set invert bit for hpd based on VBT
This patch sets the invert bit for hpd detection for each port
based on VBT configuration. Since each AOB can be designed to
depend on invert bit or not, it is expected if an AOB requires
invert bit, the user will set respective bit in VBT.

v2: Separated VBT parsing from the rest of the logic. (Jani)

v3: Moved setting invert bit logic to bxt_hpd_irq_setup()
    and changed its logic to avoid looping twice. (Ville)

v4: Changed the logic to mask out the bits first and then
    set them to remove need of temporary variable. (Ville)

v5: Moved defines to existing set of defines for the register
    and added required breaks. (Ville)

Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[Jani: fixed some checkpatch noise, added kernel-doc.]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459420907-11383-2-git-send-email-shubhangi.shrivastava@intel.com
2016-04-06 14:22:48 +03:00
Ville Syrjälä
c30fec656d drm/i915: Use GPLL ref clock to calculate GPU freqs on VLV/CHV
Extract the GPLL reference frequency from CCK and use it in the
GPU freq<->opcode conversions on VLV/CHV. This eliminates all the
assumptions we have about which divider is used for which czclk
frequency.

Note that unlike most clocks from CCK, the GPLL ref clock is a divided
down version of the CZ clock rather than the HPLL clock. CZ clock itself
is a divided down version of the HPLL clock though, so in effect it just
gets divided down twice.

While at it, throw in a few comments explaining the remaining constants
for anyone who later wants to compare this to the spreadsheets.

v2: Add slow/fast notes for CHV clocks (Imre)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457120584-26080-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com> (v1)
2016-04-05 21:17:39 +03:00
Arun Siluvery
6b332fa20f drm/i915/guc: reset GuC and retry on firmware load failure
Due to timing issues in the HW, some of the status bits required for GuC
authentication occasionally don't get set; when that happens, the GuC
cannot be initialized and we will be left with a wedged GPU. The W/A
suggested is to perform a soft reset of the GuC and attempt to reload
the F/W again for few times before giving up.

As the failure is dependent on timing, tests performed by triggering
manual full gpu reset (i915_wedged) showed that we could sometimes hit
this after several thousand iterations, but sometimes tests ran even
longer without any issues. Reset and reload mechanism proved helpful
when we indeed hit f/w load failure, so it is better to include this
to improve driver stability.

This change implements the following WAs,

	WaEnableuKernelHeaderValidFix:skl,bxt
	WaEnableGuCBootHashCheckNotSet:skl,bxt

Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-04-05 13:29:24 +01:00
Chris Wilson
e87666b52f drm/i915/shrinker: Hook up vmap allocation failure notifier
If the core runs out of vmap address space, it will call a notifier in
case any driver can reap some of its vmaps. As i915.ko is possibily
holding onto vmap address space that could be recovered, hook into the
notifier chain and try and reap objects holding onto vmaps.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Roman Pen <r.peniaev@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kahola <mika.kahola@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459777603-23618-4-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-04-05 11:13:15 +01:00
Ville Syrjälä
c231775c2d drm/i915: Implement WaPixelRepeatModeFixForC0:chv
DPLL_MD(PIPE_C) is AWOL on CHV. Instead of fixing it someone added
chicken bits to propagate the pixel multiplier from DPLL_MD(PIPE_B)
to either pipe B or C. So do that to make pixel repeat work on pipes
B and C. Pipe A is fine without any tricks.

Fortunately the pixel repeat propagation appears to be a oneshot
operation, so once the value has been written we can clear the
chicken bits. So it is still possible to drive pipe B and C with
different pixel multipliers simultaneosly.

Looks like DPLL_VGA_MODE_DIS must also be set in DPLL(PIPE_B)
for this to work. But since we keep that bit always set in all
DPLLs there's no problem.

This of course means we can't reliably read out the pixel multiplier
for pipes B and C. That would make the state checker unhappy, so I
added shadow copies of those registers in to dev_priv. The other
option would have been to skip pixel multiplier, dpll_md an dotclock
checks entirely on CHV, but that feels like a serious loss of cross
checking, so just pretending that we have working DPLL MD registers
seemed better. Obviously with the shadow copies we can't detect if
the pixel multiplier was properly configured, nor can we take over
its state from the BIOS, but hopefully people won't have displays
that would be limitd to such crappy modes.

There is one strange flicker still remaining. It's visible on
pipe C/HDMID when HDMIB is enabled while driven by pipe B.
It doesn't occur if pipe A drives HDMIB, nor is there any glitch
on pipe B/HDMIB when port C/HDMID starts up. I don't have a board
with HDMIC so not sure if it happens there too. So I'm not sure
if it's somehow tied in with this strange linkage between pipe B
and C. Sadly I was unable to find an enable sequence that would
avoid the glitch, but at least it's not fatal ie. the output
recovers afterwards.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458052809-23426-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2016-04-01 22:16:02 +03:00
Joonas Lahtinen
72e96d6450 drm/i915: Refer to GGTT {,VM} consistently
Refer to the GGTT VM consistently as "ggtt->base" instead of just "ggtt",
"vm" or indirectly through other variables like "dev_priv->ggtt.base"
to avoid confusion with the i915_ggtt object itself and PPGTT VMs.

Refer to the GGTT as "ggtt" instead of indirectly through chaining.

As a bonus gets rid of the long-standing i915_obj_to_ggtt vs.
i915_gem_obj_to_ggtt conflict, due to removal of i915_obj_to_ggtt!

v2:
- Added some more after grepping sources with Chris

v3:
- Refer to GGTT VM through ggtt->base consistently instead of ggtt_vm
  (Chris)

v4:
- Convert all dev_priv->ggtt->foo accesses to ggtt->foo.

v5:
- Make patch checker happy

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-03-31 17:55:43 +03:00
Maarten Lankhorst
b95c532148 drm/i915: Pass crtc_state to color management functions.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459350996-4957-2-git-send-email-maarten.lankhorst@linux.intel.com
2016-03-31 12:45:45 +02:00
Maarten Lankhorst
fbf6d8798f drm/i915: Add locking to pll updates, v3.
With async modesets this is no longer protected with connection_mutex,
so ensure that each pll has its own lock. The pll configuration state
is still protected; it's only the pll updates that need locking against
concurrency.

Changes since v1:
- Rebased.
- Fix locking to protect all accesses. (Durgadoss)
Changes since v2:
- Make the dpll_lock global to protect concurrent updates to the
  same register, for example DPLL_CTRL1 on skl. (Ander)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/56F29F50.1090708@linux.intel.com
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
2016-03-31 09:43:11 +02:00
Daniel Vetter
68d4aee9d1 drm/i915: Update DRIVER_DATE to 20160330
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-03-30 09:33:11 +02:00
Jani Nikula
5833498964 drm/i915: remove unused dev_priv->render_reclock_avail
Set from VBT, but never used. Good riddance.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458834623-8734-5-git-send-email-jani.nikula@intel.com
2016-03-29 15:12:40 +03:00
Jani Nikula
9d6c875db4 drm/i915: move sdvo mappings to vbt data
Move all data initialized from VBT under dev_priv->vbt. No functional
changes.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458834623-8734-4-git-send-email-jani.nikula@intel.com
2016-03-29 15:12:30 +03:00
Jani Nikula
06411f08b3 drm/i915: move edp low vswing config to vbt data
Move all data initialized from VBT under dev_priv->vbt. No functional
changes.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458834623-8734-3-git-send-email-jani.nikula@intel.com
2016-03-29 15:12:20 +03:00