linux-stable

Commit Graph

Author	SHA1	Message	Date
Maarten Lankhorst	50ae6c61a1	drm/i915: Revert relocation chaining commits. This reverts commit `964a9b0f61` ("drm/i915/gem: Use chained reloc batches") and commit `0e97fbb080` ("drm/i915/gem: Use a single chained reloc batches for a single execbuf"). When adding ww locking to execbuf, it's hard enough to deal with a single BO that is part of relocation execution. Chaining is hard to get right, and with GPU relocation deprecated, it's best to drop this altogether, instead of trying to fix something we will remove. This is not a completely 1:1 revert, we reset rq_size to 0 in reloc_cache_init, this was from `e3d291301f` ("drm/i915/gem: Implement legacy MI_STORE_DATA_IMM"), because we don't want to break the selftests. (Daniel) Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-3-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2020-09-07 14:29:07 +03:00
Maarten Lankhorst	102a0a9051	Revert "drm/i915/gem: Async GPU relocations only" This reverts commit `9e0f9464e2` ("drm/i915/gem: Async GPU relocations only"), and related commit `7ac2d2536d` ("drm/i915/gem: Delete unused code"). Async GPU relocations are not the path forward, we want to remove GPU accelerated relocation support eventually when userspace is fixed to use VM_BIND, and this is the first step towards that. We will keep async gpu relocations around for now, until userspace is fixed. Relocation support will be disabled completely on platforms where there was never any userspace that depends on it, as the hardware doesn't require it from at least gen9+ onward. For older platforms, the plan is to use cpu relocations only. The igt side is fixed in igt commit 39e9aa1032a4e ("tests/i915: Remove subtests that rely on async relocation behavior"). Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-2-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2020-09-07 14:28:46 +03:00
Chris Wilson	da1ea128a6	drm/i915/gem: Free the fence after a fence-chain lookup failure If dma_fence_chain_find_seqno() reports an error, it does so in its preamble before it disposes of the input fence. On handling the error, we need to drop the reference to the fence. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2292 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `13149e8baf` ("drm/i915: add syncobj timeline support") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200806161056.17593-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2020-09-07 14:28:21 +03:00
Chris Wilson	5d9341370f	drm/i915: Export a preallocate variant of i915_active_acquire() Sometimes we have to be very careful not to allocate underneath a mutex (or spinlock) and yet still want to track activity. Enter i915_active_acquire_for_context(). This raises the activity counter on i915_active prior to use and ensures that the fence-tree contains a slot for the context. v2: Refactor active_lookup() so it can be called again before/after locking to resolve contention. Since we protect the rbtree until we idle, we can do a lockfree lookup, with the caveat that if another thread performs a concurrent insertion, the rotations from the insert may cause us to not find our target. A second pass holding the treelock will find the target if it exists, or the place to perform our insertion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-3-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2020-09-07 13:18:17 +03:00
Chris Wilson	27a5dcfe73	drm/i915/gem: Remove disordered per-file request list for throttling I915_GEM_THROTTLE dates back to the time before contexts where there was just a single engine, and therefore a single timeline and request list globally. That request list was in execution/retirement order, and so walking it to find a particular aged request made sense and could be split per file. That is no more. We now have many timelines with a file, as many as the user wants to construct (essentially per-engine, per-context). Each of those run independently and so make the single list futile. Remove the disordered list, and iterate over all the timelines to find a request to wait on in each to satisfy the criteria that the CPU is no more than 20ms ahead of its oldest request. It should go without saying that the I915_GEM_THROTTLE ioctl is no longer used as the primary means of throttling, so it makes sense to push the complication into the ioctl where it only impacts upon its few irregular users, rather than the execbuf/retire where everybody has to pay the cost. Fortunately, the few users do not create vast amount of contexts, so the loops over contexts/engines should be concise. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200728152010.30701-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2020-09-07 13:13:50 +03:00
Lionel Landwerlin	13149e8baf	drm/i915: add syncobj timeline support Introduces a new parameters to execbuf so that we can specify syncobj handles as well as timeline points. v2: Reuse i915_user_extension_fn v3: Check that the chained extension is only present once (Chris) v4: Check that dma_fence_chain_find_seqno returns a non NULL fence (Lionel) v5: Use BIT_ULL (Chris) v6: Fix issue with already signaled timeline points, dma_fence_chain_find_seqno() setting fence to NULL (Chris) v7: Report ENOENT with invalid syncobj handle (Lionel) v8: Check for out of order timeline point insertion (Chris) v9: After explanations on https://lists.freedesktop.org/archives/dri-devel/2019-August/229287.html drop the ordering check from v8 (Lionel) v10: Set first extension enum item to 1 (Jason) v11: Rebase v12: Allow multiple extension nodes of timeline syncobj (Chris) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Co-authored-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v11) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200804085954.350343-3-lionel.g.landwerlin@intel.com Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2901 Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2020-08-17 16:16:51 -04:00
Lionel Landwerlin	cda9edd024	drm/i915: introduce a mechanism to extend execbuf2 We're planning to use this for a couple of new feature where we need to provide additional parameters to execbuf. v2: Check for invalid flags in execbuffer2 (Lionel) v3: Rename I915_EXEC_EXT -> I915_EXEC_USE_EXTENSIONS (Chris) v4: Rebase Move array fence parsing in i915_gem_do_execbuffer() Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200804085954.350343-2-lionel.g.landwerlin@intel.com Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2901 Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2020-08-17 16:16:48 -04:00
Daniele Ceraolo Spurio	792592e72a	drm/i915: Move the engine mask to intel_gt_info Since the engines belong to the GT, move the runtime-updated list of available engines to the intel_gt struct. The original mask has been renamed to indicate it contains the maximum engine list that can be found on a matching device. In preparation for other info being moved to the gt in follow up patches (sseu), introduce an intel_gt_info structure to group all gt-related runtime info. v2: s/max_engine_mask/platform_engine_mask (tvrtko), fix selftest Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200708003952.21831-5-daniele.ceraolospurio@intel.com	2020-07-08 21:07:11 +01:00
Chris Wilson	f7ce8639f6	drm/i915/gem: Split the context's obj:vma lut into its own mutex Rather than reuse the common ctx->mutex for locking the execbuffer LUT, split it into its own lock to avoid being taken [as part of ctx->mutex] at inappropriate times. In particular to avoid the inversion from taking the timeline->mutex for the whole execbuf submission in the next patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200703004306.11117-1-chris@chris-wilson.co.uk	2020-07-03 10:13:13 +01:00
Chris Wilson	096a42dd19	drm/i915/gem: Move obj->lut_list under its own lock The obj->lut_list is traversed when the object is closed as the file table is destroyed during process termination. As this occurs before we kill any outstanding context if, due to some bug or another, the closure is blocked, then we fail to shootdown any inflight operations potentially leaving the GPU spinning forever. As we only need to guard the list against concurrent closures and insertions, the hold is short and merits being treated as a simple spinlock. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200701084439.17025-1-chris@chris-wilson.co.uk	2020-07-01 11:58:49 +01:00
Jani Nikula	0f69403d25	Merge drm/drm-next into drm-intel-next-queued Catch up with upstream, in particular to get `c1e8d7c6a7` ("mmap locking API: convert mmap_sem comments"). Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2020-06-25 18:05:03 +03:00
Linus Torvalds	d4e181f204	drm fixes for 5.7-rc1 core: - fix race in connectors sending hotplug i915: - Avoid use after free in cmdparser - Avoid NULL dereference when probing all display encoders - Fixup to module parameter type sun4i: - clock divider fix ast: - 24/32 bpp mode setting fix -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJe4ec6AAoJEAx081l5xIa+nlwQAJbFmInMITfkNoLwh/x7MPZt MUbpO/cZqE/WaNTN8rLOMtU074yav+7YUSnvdjbTfZSy0+/f+aRzGRIL7BkckMmU NJVGU4lB7HeGmdafIdwe11+jCOEv4DLNNqFx2xk+Frp2cRTisF10uaXrTQKQZD1N E2kROWIuWD3cM4lqho4fhETm3HmMc62H3udgHJjMRI5lF0Gj+7uE4Eva08SPTr5E ogC0+R+kkJdzc3iXE/kBkh65Gsygh9x3NkCd62SDHMu8SAZ00YTCn3C1FZ6w0XrB GjaRd6NtlMElLuWy/DAcTxxOorCXVz1XslbGPcogG/lmjsIV8vzf/mTlH70sp9Ex qGbGj1i4sn4HUzOZi7Q4IR+9NVN15r9USSMgQT+nleQXFTq2rk8lctPvF7NtQYIm vHtieo73FjNcA7+ewI9b/2zDaOX2CxrjphCFGPMYneZQ61vYZXiJaV4ir0/k1XNv mQwv0riIXqQrRRlWVJz+wGBLtNgzKnxpZE0V9b2kpvcYpfqMmsXGcaUIpj+ufSmQ ERwS76aJ2QhsGRkvz8Ys9meK1K84wWWWaxDNozeQFRLd4Rjpa0cBjL41pX7RwD/+ WFsWg+ujt4slxZNvRUakO9LeQCV0GvL7WpMbdmCIJKYyTC3erf8j72H9AMxvwz9X 1V/7G+LUo863IOkaVjlG =IBzW -----END PGP SIGNATURE----- Merge tag 'drm-next-2020-06-11-1' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "One sun4i fix and a connector hotplug race The ast fix is for a regression in 5.6, and one of the i915 ones fixes an oops reported by dhowells. core: - fix race in connectors sending hotplug i915: - Avoid use after free in cmdparser - Avoid NULL dereference when probing all display encoders - Fixup to module parameter type sun4i: - clock divider fix ast: - 24/32 bpp mode setting fix" * tag 'drm-next-2020-06-11-1' of git://anongit.freedesktop.org/drm/drm: drm/ast: fix missing break in switch statement for format->cpp[0] case 4 drm/sun4i: hdmi ddc clk: Fix size of m divider drm/i915/display: Only query DP state of a DDI encoder drm/i915/params: fix i915.reset module param type drm/i915/gem: Mark the buffer pool as active for the cmdparser drm/connector: notify userspace on hotplug after register complete	2020-06-11 12:27:06 -07:00
Chris Wilson	d7466a5adb	drm/i915/gem: Mark the buffer pool as active for the cmdparser If the execbuf is interrupted after building the cmdparser pipeline, and before we commit to submitting the request to HW, we would attempt to clean up the cmdparser early. While we held active references to the vma being parsed and constructed, we did not hold an active reference for the buffer pool itself. The result was that an interrupted execbuf could still have run the cmdparser pipeline, but since the buffer pool was idle, its target vma could have been recycled. Note this problem only occurs if the cmdparser is running async due to pipelined waits on busy fences, and the execbuf is interrupted. Fixes: `686c7c35ab` ("drm/i915/gem: Asynchronous cmdparser") Fixes: `16e8745967` ("drm/i915/gt: Move the batch buffer pool from the engine to the gt") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200604103751.18816-1-chris@chris-wilson.co.uk (cherry picked from commit `57a78ca4ec`) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2020-06-08 12:52:44 +03:00
Chris Wilson	7ac2d2536d	drm/i915/gem: Delete unused code Unused as of commit `9e0f9464e2` ("drm/i915/gem: Async GPU relocations only"), but left behind. >> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:933:21: error: unused function 'unmask_page' [-Werror,-Wunused-function] static inline void unmask_page(unsigned long p) ^ >> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:938:28: error: unused function 'unmask_flags' [-Werror,-Wunused-function] static inline unsigned int unmask_flags(unsigned long p) ^ >> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:945:33: error: unused function 'cache_to_ggtt' [-Werror,-Wunused-function] static inline struct i915_ggtt cache_to_ggtt(struct reloc_cache *cache) Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200605200357.13069-1-chris@chris-wilson.co.uk	2020-06-05 21:46:31 +01:00
Chris Wilson	9e0f9464e2	drm/i915/gem: Async GPU relocations only Reduce the 3 relocation paths down to the single path that accommodates all. The primary motivation for this is to guard the relocations with a natural fence (derived from the i915_request used to write the relocation from the GPU). The tradeoff in using async gpu relocations is that it increases latency over using direct CPU relocations, for the cases where the target is idle and accessible by the CPU. The benefit is greatly reduced lock contention and improved concurrency by pipelining. Note that forcing the async gpu relocations does reveal a few issues they have. Firstly, is that they are visible as writes to gem_busy, causing to mark some buffers are being to written to by the GPU even though userspace only reads. Secondly is that, in combination with the cmdparser, they can cause priority inversions. This should be the case where the work is being put into a common workqueue losing our priority information and so being executed in FIFO from the worker, denying us the opportunity to reorder the requests afterwards. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200604211457.19696-1-chris@chris-wilson.co.uk	2020-06-05 09:45:44 +01:00
Chris Wilson	57a78ca4ec	drm/i915/gem: Mark the buffer pool as active for the cmdparser If the execbuf is interrupted after building the cmdparser pipeline, and before we commit to submitting the request to HW, we would attempt to clean up the cmdparser early. While we held active references to the vma being parsed and constructed, we did not hold an active reference for the buffer pool itself. The result was that an interrupted execbuf could still have run the cmdparser pipeline, but since the buffer pool was idle, its target vma could have been recycled. Note this problem only occurs if the cmdparser is running async due to pipelined waits on busy fences, and the execbuf is interrupted. Fixes: `686c7c35ab` ("drm/i915/gem: Asynchronous cmdparser") Fixes: `16e8745967` ("drm/i915/gt: Move the batch buffer pool from the engine to the gt") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200604103751.18816-1-chris@chris-wilson.co.uk	2020-06-04 14:38:06 +01:00
Chris Wilson	5a83399536	drm/i915: Drop i915_request.i915 backpointer We infrequently use the direct i915 backpointer from the i915_request, so do we really need to waste the space in the struct for it? 8 bytes from the most frequently allocated struct vs an 3 bytes and pointer chasing in using rq->engine->i915? Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200602220953.21178-1-chris@chris-wilson.co.uk	2020-06-03 13:53:39 +01:00
Linus Torvalds	faa392181a	drm pull for 5.8-rc1 core: - uapi: error out EBUSY when existing master - uapi: rework SET/DROP MASTER permission handling - remove drm_pci.h - drm_pci* are now legacy - introduced managed DRM resources - subclassing support for drm_framebuffer - simple encoder helper - edid improvements - vblank + writeback documentation improved - drm/mm - optimise tree searches - port drivers to use devm_drm_dev_alloc dma-buf: - add flag for p2p buffer support mst: - ACT timeout improvements - remove drm_dp_mst_has_audio - don't use 2nd TX slot - spec recommends against it bridge: - dw-hdmi various improvements - chrontel ch7033 support - fix stack issues with old gcc hdmi: - add unpack function for drm infoframe fbdev: - misc fbdev driver fixes i915: - uapi: global sseu pinning - uapi: OA buffer polling - uapi: remove generated perf code - uapi: per-engine default property values in sysfs - Tigerlake GEN12 enabled. - Lots of gem refactoring - Tigerlake enablement patches - move to drm_device logging - Icelake gamma HW readout - push MST link retrain to hotplug work - bandwidth atomic helpers - ICL fixes - RPS/GT refactoring - Cherryview full-ppgtt support - i915 locking guidelines documented - require linear fb stride to be 512 multiple on gen9 - Tigerlake SAGV support amdgpu: - uapi: encrypted GPU memory handling - uapi: add MEM_SYNC IB flag - p2p dma-buf support - export VRAM dma-bufs - FRU chip access support - RAS/SR-IOV updates - Powerplay locking fixes - VCN DPG (powergating) enablement - GFX10 clockgating fixes - DC fixes - GPU reset fixes - navi SDMA fix - expose FP16 for modesetting - DP 1.4 compliance fixes - gfx10 soft recovery - Improved Critical Thermal Faults handling - resizable BAR on gmc10 amdkfd: - uapi: GWS resource management - track GPU memory per process - report PCI domain in topology radeon: - safe reg list generator fixes nouveau: - HD audio fixes on recent systems - vGPU detection (fail probe if we're on one, for now) - Interlaced mode fixes (mostly avoidance on Turing, which doesn't support it) - SVM improvements/fixes - NVIDIA format modifier support - Misc other fixes. adv7511: - HDMI SPDIF support ast: - allocate crtc state size - fix double assignment - fix suspend bochs: - drop connector register cirrus: - move to tiny drivers. exynos: - fix imported dma-buf mapping - enable runtime PM - fixes and cleanups mediatek: - DPI pin mode swap - config mipi_tx current/impedance lima: - devfreq + cooling device support - task handling improvements - runtime PM support pl111: - vexpress init improvements - fix module auto-load rcar-du: - DT bindings conversion to YAML - Planes zpos sanity check and fix - MAINTAINERS entry for LVDS panel driver mcde: - fix return value mgag200: - use managed config init stm: - read endpoints from DT vboxvideo: - use PCI managed functions - drop WC mtrr vkms: - enable cursor by default rockchip: - afbc support virtio: - various cleanups qxl: - fix cursor notify port hisilicon: - 128-byte stride alignment fix sun4i: - improved format handling -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJe1edsAAoJEAx081l5xIa+bKEQAJAZv/8OMM2rx+p+GyKgrNpl ihTX/oyToy8dw97s1kWF7V5kKU+qjF8aWlKoPS0xovzaMAzYSFz9FRNEUgqtTXMI zIAzSXioqP21oL9/ZTHcXDULtz8Gk3uiPomgXMWLlNBdt3X5qvCwsmPRIYSwG0GJ 00VCvxDbVxGSM3wzcvbfyRwHCq3SrFvIusXv5jHnnxEFGH0C7Mj2/FLYMKLNjvli Q8VEI2wQPZj1QdA8fLFVneIQsR6YUSko9OfFMANP8VJGpPMnUkvVxTJ5ACGJspvn U/h6NYqJeUU2Y3BSKqtjIC3a1LY51tp5tL9q4H9TD1hqMckt6F2V7T2IeFU8i6+V YzUsSiT4q1xB+uiFVcgopx2hyIp8INOEyWrVdYgw2JviROeRD+pDHvJd13ZNMnTe GvLWQ/PfBFrcz8eligjiYjOf66ZTU+j/rivaOBFyrs9gdlsaEW2QRurFrcNX+0lZ kDbLsIFjhYnPXsvHP87x4BuQCKQIEh8wWuxXuJjunBPdqVrJyltZWbBiKO571b5/ BtX6xj6ztUOffR2RdiVanzY546I2hEi7SHMUuWnMqXsOV46GBN0QvlpZad/47n9x ZUy8HDDD0/qWuGwvPOJGIeAnUteWge9AhWXTeN5+1h5m+QEOzYkPKqC3Hp8TW1pM gToTWgAhnu731fhzLWyt =H7IS -----END PGP SIGNATURE----- Merge tag 'drm-next-2020-06-02' of git://anongit.freedesktop.org/drm/drm Pull drm updates from Dave Airlie: "Highlights: - Core DRM had a lot of refactoring around managed drm resources to make drivers simpler. - Intel Tigerlake support is on by default - amdgpu now support p2p PCI buffer sharing and encrypted GPU memory Details: core: - uapi: error out EBUSY when existing master - uapi: rework SET/DROP MASTER permission handling - remove drm_pci.h - drm_pci* are now legacy - introduced managed DRM resources - subclassing support for drm_framebuffer - simple encoder helper - edid improvements - vblank + writeback documentation improved - drm/mm - optimise tree searches - port drivers to use devm_drm_dev_alloc dma-buf: - add flag for p2p buffer support mst: - ACT timeout improvements - remove drm_dp_mst_has_audio - don't use 2nd TX slot - spec recommends against it bridge: - dw-hdmi various improvements - chrontel ch7033 support - fix stack issues with old gcc hdmi: - add unpack function for drm infoframe fbdev: - misc fbdev driver fixes i915: - uapi: global sseu pinning - uapi: OA buffer polling - uapi: remove generated perf code - uapi: per-engine default property values in sysfs - Tigerlake GEN12 enabled. - Lots of gem refactoring - Tigerlake enablement patches - move to drm_device logging - Icelake gamma HW readout - push MST link retrain to hotplug work - bandwidth atomic helpers - ICL fixes - RPS/GT refactoring - Cherryview full-ppgtt support - i915 locking guidelines documented - require linear fb stride to be 512 multiple on gen9 - Tigerlake SAGV support amdgpu: - uapi: encrypted GPU memory handling - uapi: add MEM_SYNC IB flag - p2p dma-buf support - export VRAM dma-bufs - FRU chip access support - RAS/SR-IOV updates - Powerplay locking fixes - VCN DPG (powergating) enablement - GFX10 clockgating fixes - DC fixes - GPU reset fixes - navi SDMA fix - expose FP16 for modesetting - DP 1.4 compliance fixes - gfx10 soft recovery - Improved Critical Thermal Faults handling - resizable BAR on gmc10 amdkfd: - uapi: GWS resource management - track GPU memory per process - report PCI domain in topology radeon: - safe reg list generator fixes nouveau: - HD audio fixes on recent systems - vGPU detection (fail probe if we're on one, for now) - Interlaced mode fixes (mostly avoidance on Turing, which doesn't support it) - SVM improvements/fixes - NVIDIA format modifier support - Misc other fixes. adv7511: - HDMI SPDIF support ast: - allocate crtc state size - fix double assignment - fix suspend bochs: - drop connector register cirrus: - move to tiny drivers. exynos: - fix imported dma-buf mapping - enable runtime PM - fixes and cleanups mediatek: - DPI pin mode swap - config mipi_tx current/impedance lima: - devfreq + cooling device support - task handling improvements - runtime PM support pl111: - vexpress init improvements - fix module auto-load rcar-du: - DT bindings conversion to YAML - Planes zpos sanity check and fix - MAINTAINERS entry for LVDS panel driver mcde: - fix return value mgag200: - use managed config init stm: - read endpoints from DT vboxvideo: - use PCI managed functions - drop WC mtrr vkms: - enable cursor by default rockchip: - afbc support virtio: - various cleanups qxl: - fix cursor notify port hisilicon: - 128-byte stride alignment fix sun4i: - improved format handling" * tag 'drm-next-2020-06-02' of git://anongit.freedesktop.org/drm/drm: (1401 commits) drm/amd/display: Fix potential integer wraparound resulting in a hang drm/amd/display: drop cursor position check in atomic test drm/amdgpu: fix device attribute node create failed with multi gpu drm/nouveau: use correct conflicting framebuffer API drm/vblank: Fix -Wformat compile warnings on some arches drm/amdgpu: Sync with VM root BO when switching VM to CPU update mode drm/amd/display: Handle GPU reset for DC block drm/amdgpu: add apu flags (v2) drm/amd/powerpay: Disable gfxoff when setting manual mode on picasso and raven drm/amdgpu: fix pm sysfs node handling (v2) drm/amdgpu: move gpu_info parsing after common early init drm/amdgpu: move discovery gfx config fetching drm/nouveau/dispnv50: fix runtime pm imbalance on error drm/nouveau: fix runtime pm imbalance on error drm/nouveau: fix runtime pm imbalance on error drm/nouveau/debugfs: fix runtime pm imbalance on error drm/nouveau/nouveau/hmm: fix migrate zero page to GPU drm/nouveau/nouveau/hmm: fix nouveau_dmem_chunk allocations drm/nouveau/kms/nv50-: Share DP SST mode_valid() handling with MST drm/nouveau/kms/nv50-: Move 8BPC limit for MST into nv50_mstc_get_modes() ...	2020-06-02 15:04:15 -07:00
Chris Wilson	ea97c4ca54	drm/i915/gem: Suppress some random warnings Leave the error propagation in place, but limit the warnings to only show up in CI if the unlikely errors are hit. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200525141957.3061-2-chris@chris-wilson.co.uk	2020-05-25 16:45:18 +01:00
Pankaj Bharadiya	6db20e27f6	drm/i915/gem: Prefer drm_WARN* over WARN* struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN* over WARN* at places where struct drm_device pointer can be extracted. Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504181600.18503-6-pankaj.laxminarayan.bharadiya@intel.com	2020-05-19 16:01:23 +03:00
Chris Wilson	18e4af04d2	drm/i915: Drop no-semaphore boosting Now that we have fast timeslicing on semaphores, we no longer need to prioritise none-semaphore work as we will yield any work blocked on a semaphore to the next in the queue. Previously with no timeslicing, blocking on the semaphore caused extremely bad scheduling with multiple clients utilising multiple rings. Now, there is no impact and we can remove the complication. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200513173504.28322-1-chris@chris-wilson.co.uk	2020-05-14 06:14:33 +01:00
Chris Wilson	889333c772	drm/i915/gem: Remove redundant exec_fence Since there can only be one of in_fence/exec_fence, just use the single in_fence local. v2: Consolidate lookup Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200513180937.28992-1-chris@chris-wilson.co.uk	2020-05-13 20:02:24 +01:00
Chris Wilson	eec39e441c	drm/i915: Remove wait priority boosting Upon waiting a request (when asked), we gave that request a small priority boost, not enough for it to cause preemption, but enough for it to be scheduled next before all equals. We also used that bit to give new clients a small priority boost, similar to FQ_CODEL, such that we favoured short interactive tasks ahead of long running streams. However, this is causing lots of complications with timeslicing where we both want to honour the boost and yet ignore it. Those complications cause unexpected user behaviour (tasks not being timesliced and run concurrently as epxected), and the easiest way to resolve that is to remove the boost. Hopefully, we can find a compromise again if we need to, but in theory timeslicing itself and future more advanced schedulers should give us the interactivity boost we seek. Testcase: igt/gem_exec_schedule/lateslice Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200507152338.7452-3-chris@chris-wilson.co.uk	2020-05-07 20:08:58 +01:00
Chris Wilson	e3d291301f	drm/i915/gem: Implement legacy MI_STORE_DATA_IMM The older arches did not convert MI_STORE_DATA_IMM to using the GTT, but left them writing to a physical address. The notes suggest that the primary reason would be so that the writes were cache coherent, as the CPU cache uses physical tagging. As such we did not implement the legacy variant of MI_STORE_DATA_IMM and so left all the relocations synchronous -- but with a small function to convert from the vma address into the physical address, we can implement asynchronous relocs on these older arches, fixing up a few tests that require them. In order to be able to test the legacy paths, refactor the gpu relocations so that we can hook them up to a selftest. v2: Use an array of offsets not enum labels for the selftest v3: Refactor the common igt_hexdump() Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/757 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504140629.28240-1-chris@chris-wilson.co.uk	2020-05-04 15:15:04 +01:00
Chris Wilson	f5b62bdbb6	drm/i915/gem: Specify address type for chained reloc batches It is required that a chained batch be in the same address domain as its parent, and also that must be specified in the command for earlier gen as it is not inferred from the chaining until gen6. Fixes: `964a9b0f61` ("drm/i915/gem: Use chained reloc batches") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504125149.4396-1-chris@chris-wilson.co.uk	2020-05-04 14:28:48 +01:00
Chris Wilson	6f576d6277	drm/i915/gem: Try an alternate engine for relocations If at first we don't succeed, try try again. Not all engines may support the MI ops we need to perform asynchronous relocation patching, and so we end up falling back to a synchronous operation that has a liability of blocking. However, Tvrtko pointed out we don't need to use the same engine to perform the relocations as we are planning to execute the execbuf on, and so if we switch over to a working engine, we can perform the relocation asynchronously. The user execbuf will be queued after the relocations by virtue of fencing. This patch creates a new context per execbuf requiring asynchronous relocations on an unusable engines. This is perhaps a bit excessive and can be ameliorated by a small context cache, but for the moment we only need it for working around a little used engine on Sandybridge, and only if relocations are actually required to an active batch buffer. Now we just need to teach the relocation code to handle physical addressing for gen2/3, and we should then have universal support! Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Testcase: igt/gem_exec_reloc/basic-spin # snb Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-3-chris@chris-wilson.co.uk	2020-05-01 22:56:16 +01:00
Chris Wilson	0e97fbb080	drm/i915/gem: Use a single chained reloc batches for a single execbuf As we can now keep chaining together a relocation batch to process any number of relocations, we can keep building that relocation batch for all of the target vma. This avoiding emitting a new request into the ring for each target, consuming precious ring space and a potential stall. v2: Propagate the failure from submitting the relocation batch. Testcase: igt/gem_exec_reloc/basic-wide-active Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-2-chris@chris-wilson.co.uk	2020-05-01 22:56:15 +01:00
Chris Wilson	964a9b0f61	drm/i915/gem: Use chained reloc batches The ring is a precious resource: we anticipate to only use a few hundred bytes for a request, and only try to reserve that before we start. If we go beyond our guess in building the request, then instead of waiting at the start of execbuf before we hold any locks or other resources, we may trigger a wait inside a critical region. One example is in using gpu relocations, where currently we emit a new MI_BB_START from the ring every time we overflow a page of relocation entries. However, instead of insert the command into the precious ring, we can chain the next page of relocation entries as MI_BB_START from the end of the previous. v2: Delay the emit_bb_start until after all the chained vma synchronisation is complete. Since the buffer pool batches are idle, this _should_ be a no-op, but one day we may some fancy async GPU bindings for new vma! v3: Use pool/batch consitently, once we start thinking in terms of the batch vma, use batch->obj. v4: Explain the magic number 4. Tvrtko spotted that we lose propagation of the error for failing to submit the relocation request; that's easier to fix up in the next patch. Testcase: igt/gem_exec_reloc/basic-many-active Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-1-chris@chris-wilson.co.uk	2020-05-01 22:56:15 +01:00
Christophe Leroy	b44f687386	drm/i915/gem: Replace user_access_begin by user_write_access_begin When i915_gem_execbuffer2_ioctl() is using user_access_begin(), that's only to perform unsafe_put_user() so use user_write_access_begin() in order to only open write access. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ebf1250b6d4f351469fb339e5399d8b92aa8a1c1.1585898438.git.christophe.leroy@c-s.fr	2020-05-01 12:35:22 +10:00
Chris Wilson	16e8745967	drm/i915/gt: Move the batch buffer pool from the engine to the gt Since the introduction of 'soft-rc6', we aim to park the device quickly and that results in frequent idling of the whole device. Currently upon idling we free the batch buffer pool, and so this renders the cache ineffective for many workloads. If we want to have an effective cache of recently allocated buffers available for reuse, we need to decouple that cache from the engine powermanagement and make it timer based. As there is no reason then to keep it within the engine (where it once made retirement order easier to track), we can move it up the hierarchy to the owner of the memory allocations. v2: Hook up to debugfs/drop_caches to clear the cache on demand. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430111819.10262-2-chris@chris-wilson.co.uk	2020-04-30 19:12:02 +01:00
Chris Wilson	50689771c8	drm/i915: Only close vma we open The history of i915_vma_close() is confusing, as is its use. As the lifetime of the i915_vma is currently bounded by the object it is attached to, we needed a means of identify when a vma was no longer in use by userspace (via the user's fd). This is further complicated by that only ppgtt vma should be closed at the user's behest, as the ggtt were always shared. Now that we attach the vma to a lut on the user's context, the open count does indicate how many unique and open context/vm are referencing this vma from the user. As such, we can and should just use the open_count to track when the vma is still in use by userspace. It's a poor man's replacement for reference counting. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1193 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422190558.30509-1-chris@chris-wilson.co.uk	2020-04-24 11:24:45 +01:00
Chris Wilson	e94f785642	drm/i915/gem: Promote 'remain' to unsigned long Tidy the code by casting remain to unsigned long once for the duration of eb_relocate_vma() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200407085930.19421-1-chris@chris-wilson.co.uk	2020-04-07 14:43:58 +01:00
Chris Wilson	1aaea8476d	drm/i915/gem: Flush all the reloc_gpu batch __i915_gem_object_flush_map() takes a byte range, so feed it the written bytes and do not mistake the u32 index as bytes! Fixes: `a679f58d05` ("drm/i915: Flush pages on acquisition") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: <stable@vger.kernel.org> # v5.2+ Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200406114821.10949-1-chris@chris-wilson.co.uk (cherry picked from commit `30c88a47f1`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2020-04-06 10:31:38 -07:00
Chris Wilson	721017cf4b	drm/i915/gem: Ignore readonly failures when updating relocs If the user passes in a readonly reloc[], by the time we notice we have already committed to modifying the execobjects, or have indeed done so already. Reporting the failure just compounds the issue as we have no second pass to fall back to anymore. "Be damned if you do, and damned if you don't." Testcase: igt/gem_exec_reloc/readonly Fixes: `7dc8f11437` ("drm/i915/gem: Drop relocation slowpath") References: `fddcd00a49` ("drm/i915: Force the slow path after a user-write error") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200331162150.3635-1-chris@chris-wilson.co.uk (cherry picked from commit `97a37c919f`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2020-04-06 10:31:23 -07:00
Chris Wilson	39d571d172	drm/i915/gem: Take DBG_FORCE_RELOC into account prior to using reloc_gpu If we set the debug flag to force ourselves not to relocate via the gpu, do not relocate via the gpu. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200406123616.7334-1-chris@chris-wilson.co.uk	2020-04-06 15:58:33 +01:00
Chris Wilson	30c88a47f1	drm/i915/gem: Flush all the reloc_gpu batch __i915_gem_object_flush_map() takes a byte range, so feed it the written bytes and do not mistake the u32 index as bytes! Fixes: `a679f58d05` ("drm/i915: Flush pages on acquisition") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: <stable@vger.kernel.org> # v5.2+ Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200406114821.10949-1-chris@chris-wilson.co.uk	2020-04-06 15:58:33 +01:00
Chris Wilson	8a338f4bf6	drm/i915/gem: Try allocating va from free space If the current node/entry location is occupied, and the object is not pinned, try assigning it some free space. We cannot wait here, so if in doubt, we unreserve and try to grab all at once. v2: Use the final pin_flags so that we won't have to move the object if we find the wrong free space. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200401194135.5442-1-chris@chris-wilson.co.uk	2020-04-01 21:57:48 +01:00
Chris Wilson	97a37c919f	drm/i915/gem: Ignore readonly failures when updating relocs If the user passes in a readonly reloc[], by the time we notice we have already committed to modifying the execobjects, or have indeed done so already. Reporting the failure just compounds the issue as we have no second pass to fall back to anymore. "Be damned if you do, and damned if you don't." Testcase: igt/gem_exec_reloc/readonly Fixes: `7dc8f11437` ("drm/i915/gem: Drop relocation slowpath") References: `fddcd00a49` ("drm/i915: Force the slow path after a user-write error") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200331162150.3635-1-chris@chris-wilson.co.uk	2020-03-31 22:40:39 +01:00
Chris Wilson	0f1dd02295	drm/i915/gem: Split eb_vma into its own allocation Use a separate array allocation for the execbuf vma, so that we can track their lifetime independently from the copy of the user arguments. With luck, this has a secondary benefit of splitting the malloc size to within reason and avoid vmalloc. The downside is that we might require two separate vmallocs -- but much less likely. In the process, this prevents a memory leak on the ww_mutex error unwind. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1390 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200330133710.14385-1-chris@chris-wilson.co.uk	2020-03-30 20:08:13 +01:00
Nathan Chancellor	7bf03e7504	drm/i915: Cast remain to unsigned long in eb_relocate_vma A recent commit in clang added -Wtautological-compare to -Wall, which is enabled for i915 after -Wtautological-compare is disabled for the rest of the kernel so we see the following warning on x86_64: ../drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1433:22: warning: result of comparison of constant 576460752303423487 with expression of type 'unsigned int' is always false [-Wtautological-constant-out-of-range-compare] if (unlikely(remain > N_RELOC(ULONG_MAX))) ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~ ../include/linux/compiler.h:78:42: note: expanded from macro 'unlikely' # define unlikely(x) __builtin_expect(!!(x), 0) ^ 1 warning generated. It is not wrong in the case where ULONG_MAX > UINT_MAX but it does not account for the case where this file is built for 32-bit x86, where ULONG_MAX == UINT_MAX and this check is still relevant. Cast remain to unsigned long, which keeps the generated code the same (verified with clang-11 on x86_64 and GCC 9.2.0 on x86 and x86_64) and the warning is silenced so we can catch more potential issues in the future. Closes: https://github.com/ClangBuiltLinux/linux/issues/778 Suggested-by: Michel Dänzer <michel@daenzer.net> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200214054706.33870-1-natechancellor@gmail.com	2020-03-27 00:12:42 +02:00
Chris Wilson	2e46a2a0b0	drm/i915: Use explicit flag to mark unreachable intel_context I need to keep the GEM context around a bit longer so adding an explicit flag for syncing execbuf with closed/abandonded contexts. v2: * Use already available context flags. (Chris) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200319170707.8262-1-chris@chris-wilson.co.uk (cherry picked from commit `207e4a71fb`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2020-03-26 10:21:04 -07:00
Chris Wilson	92581f9fb9	drm/i915: Immediately execute the fenced work If the caller allows and we do not have to wait for any signals, immediately execute the work within the caller's process. By doing so we avoid the overhead of scheduling a new task, and the latency in executing it, at the cost of pulling that work back into the immediate context. (Sometimes we still prefer to offload the task to another cpu, especially if we plan on executing many such tasks in parallel for this client.) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200325120227.8044-2-chris@chris-wilson.co.uk	2020-03-25 13:05:04 +00:00
Chris Wilson	41e4065a6b	drm/i915: Rely on direct submission to the queue Drop the pretense of kicking the tasklet (used only for the defunct guc submission backend, it should just take ownership of the submit!) and so remove the bh-kicking from around submission. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200323092841.22240-5-chris@chris-wilson.co.uk	2020-03-23 11:51:39 +00:00
Chris Wilson	93159e1235	drm/i915/gem: Avoid gem_context->mutex for simple vma lookup As we store the handle lookup inside a radix tree, we do not need the gem_context->mutex except until we need to insert our lookup into the common radix tree. This takes a small bit of rearranging to ensure that the lut we insert into the tree is ready prior to actually inserting it (as soon as it is exposed via the radixtree, it is visible to any other submission). v2: For brownie points, remove the goto spaghetti. v3: Tighten up the closed-handle checks. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200323092841.22240-8-chris@chris-wilson.co.uk	2020-03-23 11:18:16 +00:00
Chris Wilson	207e4a71fb	drm/i915: Use explicit flag to mark unreachable intel_context I need to keep the GEM context around a bit longer so adding an explicit flag for syncing execbuf with closed/abandonded contexts. v2: * Use already available context flags. (Chris) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200319170707.8262-1-chris@chris-wilson.co.uk	2020-03-19 21:28:20 +00:00
Chris Wilson	7dc8f11437	drm/i915/gem: Drop relocation slowpath Since the relocations are no longer performed under a global struct_mutex, or any other lock, that is also held by pagefault handlers, we can relax and allow our fast path to take a fault. As we no longer need to abort the fast path for lock avoidance, we no longer need the slow path handling at all. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200311160310.26711-1-chris@chris-wilson.co.uk	2020-03-12 20:28:57 +00:00
Chris Wilson	ef398881d2	drm/i915/gem: Limit struct_mutex to eb_reserve We only need to serialise the multiple pinning during the eb_reserve phase. Ideally this would be using the vm->mutex as an outer lock, or using a composite global mutex (ww_mutex), but at the moment we are using struct_mutex for the group. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1381 Fixes: `003d8b9143` ("drm/i915/gem: Only call eb_lookup_vma once during execbuf ioctl") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200306071614.2846708-3-chris@chris-wilson.co.uk	2020-03-06 10:58:05 +00:00
Matthew Auld	2920516b2f	drm/i915: be more solid in checking the alignment The alignment is u64, and yet is_power_of_2() assumes unsigned long, which might give different results between 32b and 64b kernel. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200305203534.210466-1-matthew.auld@intel.com Cc: stable@vger.kernel.org	2020-03-06 10:07:44 +00:00
Chris Wilson	36e191f064	drm/i915: Apply i915_request_skip() on submission Trying to use i915_request_skip() prior to i915_request_add() causes us to try and fill the ring upto request->postfix, which has not yet been set, and so may cause us to memset() past the end of the ring. Instead of skipping the request immediately, just flag the error on the request (only accepting the first fatal error we see) and then clear the request upon submission. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200304121849.2448028-1-chris@chris-wilson.co.uk	2020-03-04 14:29:50 +00:00
Chris Wilson	003d8b9143	drm/i915/gem: Only call eb_lookup_vma once during execbuf ioctl As we no longer stash anything inside i915_vma under the exclusive protection of struct_mutex, we do not need to revoke the i915_vma stashes before dropping struct_mutex to handle pagefaults. Knowing that we must drop the struct_mutex while keeping the eb->vma around, means that we are required to hold onto to the object reference until we have marked the vma as active. Fixes: `155ab8836c` ("drm/i915: Move object close under its own lock") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200303204345.1859734-3-chris@chris-wilson.co.uk	2020-03-03 21:52:51 +00:00

1 2 3

118 Commits