linux-stable

mirrors/linux-stable

Fork 0

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-09-12 21:57:43 +00:00

Commit graph

Author SHA1 Message Date

Author	SHA1	Message	Date
Changbin Du	219af7331f	drm/i915: Do not enable movntdqa optimization in hypervisor guest Our QA reported a problem caused by movntdqa instructions. Currently, the KVM hypervisor doesn't support VEX-prefix instructions emulation. If users passthrough a GPU to guest with vfio option 'x-no-mmap=on', then all access to the BARs will be trapped and emulated. The KVM hypervisor would raise an inertal error to qemu which cause the guest killed. (Since 'movntdqa' ins is not supported.) This patch try not to enable movntdqa optimization if the driver is running in hypervisor guest. Signed-off-by: Changbin Du <changbin.du@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1513924309-3113-1-git-send-email-changbin.du@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-12-22 11:12:15 +00:00
Chris Wilson	21aea5cc06	drm/i915: Mark the static key for movntqda as static Silence sparse who warns that the global variable is not declared static. Fixes: `0b1de5d58e` ("drm/i915: Use SSE4.1 movntdqa to ...") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Akash Goel <akash.goel@intel.com> Cc: Damien Lespiau <damien.lespiau@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471432146-5196-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2016-08-17 12:36:07 +01:00
Chris Wilson	0b1de5d58e	drm/i915: Use SSE4.1 movntdqa to accelerate reads from WC memory This patch provides the infrastructure for performing a 16-byte aligned read from WC memory using non-temporal instructions introduced with sse4.1. Using movntdqa we can bypass the CPU caches and read directly from memory and ignoring the page attributes set on the CPU PTE i.e. negating the impact of an otherwise UC access. Copying using movntdqa from WC is almost as fast as reading from WB memory, modulo the possibility of both hitting the CPU cache or leaving the data in the CPU cache for the next consumer. (The CPU cache itself my be flushed for the region of the movntdqa and on later access the movntdqa reads from a separate internal buffer for the cacheline.) The write back to the memory is however cached. This will be used in later patches to accelerate accessing WC memory. v2: Report whether the accelerated copy is successful/possible. v3: Function alignment override was only necessary when using the function target("sse4.1") - which is not necessary for emitting movntdqa from __asm__. v4: Improve notes on CPU cache behaviour vs non-temporal stores. v5: Fix byte offsets for unrolled moves. v6: Find all remaining typos of "movntqda", use kernel_fpu_begin. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Akash Goel <akash.goel@intel.com> Cc: Damien Lespiau <damien.lespiau@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471001999-17787-2-git-send-email-chris@chris-wilson.co.uk	2016-08-12 13:06:37 +01:00

Changbin Du

219af7331f

drm/i915: Do not enable movntdqa optimization in hypervisor guest

Our QA reported a problem caused by movntdqa instructions. Currently,
the KVM hypervisor doesn't support VEX-prefix instructions emulation.
If users passthrough a GPU to guest with vfio option 'x-no-mmap=on',
then all access to the BARs will be trapped and emulated. The KVM
hypervisor would raise an inertal error to qemu which cause the guest
killed. (Since 'movntdqa' ins is not supported.)

This patch try not to enable movntdqa optimization if the driver is
running in hypervisor guest.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1513924309-3113-1-git-send-email-changbin.du@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

2017-12-22 11:12:15 +00:00

Chris Wilson

21aea5cc06

drm/i915: Mark the static key for movntqda as static

Silence sparse who warns that the global variable is not declared
static.

Fixes: 0b1de5d58e ("drm/i915: Use SSE4.1 movntdqa to ...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Akash Goel <akash.goel@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471432146-5196-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

2016-08-17 12:36:07 +01:00

Chris Wilson

0b1de5d58e

drm/i915: Use SSE4.1 movntdqa to accelerate reads from WC memory

This patch provides the infrastructure for performing a 16-byte aligned
read from WC memory using non-temporal instructions introduced with sse4.1.
Using movntdqa we can bypass the CPU caches and read directly from memory
and ignoring the page attributes set on the CPU PTE i.e. negating the
impact of an otherwise UC access. Copying using movntdqa from WC is almost
as fast as reading from WB memory, modulo the possibility of both hitting
the CPU cache or leaving the data in the CPU cache for the next consumer.
(The CPU cache itself my be flushed for the region of the movntdqa and on
later access the movntdqa reads from a separate internal buffer for the
cacheline.) The write back to the memory is however cached.

This will be used in later patches to accelerate accessing WC memory.

v2: Report whether the accelerated copy is successful/possible.
v3: Function alignment override was only necessary when using the
function target("sse4.1") - which is not necessary for emitting movntdqa
from __asm__.
v4: Improve notes on CPU cache behaviour vs non-temporal stores.
v5: Fix byte offsets for unrolled moves.
v6: Find all remaining typos of "movntqda", use kernel_fpu_begin.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Akash Goel <akash.goel@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471001999-17787-2-git-send-email-chris@chris-wilson.co.uk

2016-08-12 13:06:37 +01:00

3 commits