drm/xe: Introduce a new DRM driver for Intel GPUs

Xe is a new driver for Intel GPUs that supports both integrated and
discrete platforms, starting with Tiger Lake (the first platform of the
Intel Xe architecture).

The code is already functional and has experimental support for multiple
platforms starting from Tiger Lake, with initial support implemented in
Mesa (for Iris and Anv, our OpenGL and Vulkan drivers) as well as in NEO
(for OpenCL and Level Zero).

The new Xe driver leverages a lot from i915.

As for display, the intent is to share the display code with the i915
driver so that there is maximum reuse there, but that code is not added
in this patch.

This initial work is a collaboration of many people, and unfortunately
this big squashed patch won't fully honor the proper credits. But let's
use some quick git stats so we can at least try to preserve some of
them:

Co-developed-by: Matthew Brost <matthew.brost@intel.com>
Co-developed-by: Matthew Auld <matthew.auld@intel.com>
Co-developed-by: Matt Roper <matthew.d.roper@intel.com>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Co-developed-by: Francois Dugast <francois.dugast@intel.com>
Co-developed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Co-developed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Co-developed-by: Philippe Lecluse <philippe.lecluse@intel.com>
Co-developed-by: Nirmoy Das <nirmoy.das@intel.com>
Co-developed-by: Jani Nikula <jani.nikula@intel.com>
Co-developed-by: José Roberto de Souza <jose.souza@intel.com>
Co-developed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Co-developed-by: Dave Airlie <airlied@redhat.com>
Co-developed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Co-developed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Co-developed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Authored by Matthew Brost on 2023-03-30 17:31:57 -04:00; committed by Rodrigo Vivi
parent a60501d7c2
commit dd08ebf6c3
210 changed files with 40575 additions and 0 deletions

@@ -18,6 +18,7 @@ GPU Driver Documentation
vkms
bridge/dw-hdmi
xen-front
xe/index
afbc
komeda-kms
panfrost

@@ -0,0 +1,23 @@
.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
=======================
drm/xe Intel GFX Driver
=======================
The drm/xe driver supports some future GFX cards with rendering, display,
compute and media. Support for currently available platforms like TGL, ADL,
DG2, etc., is provided to prototype the driver.
.. toctree::
:titlesonly:
xe_mm
xe_map
xe_migrate
xe_cs
xe_pm
xe_pcode
xe_gt_mcr
xe_wa
xe_rtp
xe_firmware

@@ -0,0 +1,8 @@
.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
==================
Command submission
==================
.. kernel-doc:: drivers/gpu/drm/xe/xe_exec.c
:doc: Execbuf (User GPU command submission)

@@ -0,0 +1,34 @@
.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
========
Firmware
========
Firmware Layout
===============
.. kernel-doc:: drivers/gpu/drm/xe/xe_uc_fw_abi.h
:doc: Firmware Layout
Write Once Protected Content Memory (WOPCM) Layout
==================================================
.. kernel-doc:: drivers/gpu/drm/xe/xe_wopcm.c
:doc: Write Once Protected Content Memory (WOPCM) Layout
GuC CTB Blob
============
.. kernel-doc:: drivers/gpu/drm/xe/xe_guc_ct.c
:doc: GuC CTB Blob
GuC Power Conservation (PC)
===========================
.. kernel-doc:: drivers/gpu/drm/xe/xe_guc_pc.c
:doc: GuC Power Conservation (PC)
Internal API
============
TODO

@@ -0,0 +1,13 @@
.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
==============================================
GT Multicast/Replicated (MCR) Register Support
==============================================
.. kernel-doc:: drivers/gpu/drm/xe/xe_gt_mcr.c
:doc: GT Multicast/Replicated (MCR) Register Support
Internal API
============
TODO

@@ -0,0 +1,8 @@
.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
=========
Map Layer
=========
.. kernel-doc:: drivers/gpu/drm/xe/xe_map.h
:doc: Map layer

@@ -0,0 +1,8 @@
.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
=============
Migrate Layer
=============
.. kernel-doc:: drivers/gpu/drm/xe/xe_migrate_doc.h
:doc: Migrate Layer

@@ -0,0 +1,14 @@
.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
=================
Memory Management
=================
.. kernel-doc:: drivers/gpu/drm/xe/xe_bo_doc.h
:doc: Buffer Objects (BO)
Pagetable building
==================
.. kernel-doc:: drivers/gpu/drm/xe/xe_pt.c
:doc: Pagetable building

@@ -0,0 +1,14 @@
.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
=====
Pcode
=====
.. kernel-doc:: drivers/gpu/drm/xe/xe_pcode.c
:doc: PCODE
Internal API
============
.. kernel-doc:: drivers/gpu/drm/xe/xe_pcode.c
:internal:

@@ -0,0 +1,14 @@
.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
========================
Runtime Power Management
========================
.. kernel-doc:: drivers/gpu/drm/xe/xe_pm.c
:doc: Xe Power Management
Internal API
============
.. kernel-doc:: drivers/gpu/drm/xe/xe_pm.c
:internal:

@@ -0,0 +1,20 @@
.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
=========================
Register Table Processing
=========================
.. kernel-doc:: drivers/gpu/drm/xe/xe_rtp.c
:doc: Register Table Processing
Internal API
============
.. kernel-doc:: drivers/gpu/drm/xe/xe_rtp_types.h
:internal:
.. kernel-doc:: drivers/gpu/drm/xe/xe_rtp.h
:internal:
.. kernel-doc:: drivers/gpu/drm/xe/xe_rtp.c
:internal:

@@ -0,0 +1,14 @@
.. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
====================
Hardware workarounds
====================
.. kernel-doc:: drivers/gpu/drm/xe/xe_wa.c
:doc: Hardware workarounds
Internal API
============
.. kernel-doc:: drivers/gpu/drm/xe/xe_wa.c
:internal:

@@ -276,6 +276,8 @@ source "drivers/gpu/drm/nouveau/Kconfig"
source "drivers/gpu/drm/i915/Kconfig"
source "drivers/gpu/drm/xe/Kconfig"
source "drivers/gpu/drm/kmb/Kconfig"
config DRM_VGEM

@@ -134,6 +134,7 @@ obj-$(CONFIG_DRM_RADEON)+= radeon/
obj-$(CONFIG_DRM_AMDGPU)+= amd/amdgpu/
obj-$(CONFIG_DRM_AMDGPU)+= amd/amdxcp/
obj-$(CONFIG_DRM_I915) += i915/
obj-$(CONFIG_DRM_XE) += xe/
obj-$(CONFIG_DRM_KMB_DISPLAY) += kmb/
obj-$(CONFIG_DRM_MGAG200) += mgag200/
obj-$(CONFIG_DRM_V3D) += v3d/

drivers/gpu/drm/xe/.gitignore (new file)
@@ -0,0 +1,2 @@
# SPDX-License-Identifier: GPL-2.0-only
*.hdrtest

@@ -0,0 +1,63 @@
# SPDX-License-Identifier: GPL-2.0-only
config DRM_XE
tristate "Intel Xe Graphics"
depends on DRM && PCI && MMU
select INTERVAL_TREE
# we need shmfs for the swappable backing store, and in particular
# the shmem_readpage() which depends upon tmpfs
select SHMEM
select TMPFS
select DRM_BUDDY
select DRM_KMS_HELPER
select DRM_PANEL
select DRM_SUBALLOC_HELPER
select RELAY
select IRQ_WORK
select SYNC_FILE
select IOSF_MBI
select CRC32
select SND_HDA_I915 if SND_HDA_CORE
select CEC_CORE if CEC_NOTIFIER
select VMAP_PFN
select DRM_TTM
select DRM_TTM_HELPER
select DRM_SCHED
select MMU_NOTIFIER
help
Experimental driver for Intel Xe series GPUs
If "M" is selected, the module will be called xe.
config DRM_XE_FORCE_PROBE
string "Force probe xe for selected Intel hardware IDs"
depends on DRM_XE
help
This is the default value for the xe.force_probe module
parameter. Using the module parameter overrides this option.
Force probe the xe driver for Intel graphics devices that are
recognized but not properly supported by this kernel version. It is
recommended to upgrade to a kernel version with proper support as soon
as it is available.
It can also be used to block the probe of recognized and fully
supported devices.
Use "" to disable force probe. If in doubt, use this.
Use "<pci-id>[,<pci-id>,...]" to force probe the xe for listed
devices. For example, "4500" or "4500,4571".
Use "*" to force probe the driver for all known devices.
Use "!" right before the ID to block the probe of the device. For
example, "4500,!4571" forces the probe of 4500 and blocks the probe of
4571.
Use "!*" to block the probe of the driver for all known devices.
menu "drm/Xe Debugging"
depends on DRM_XE
depends on EXPERT
source "drivers/gpu/drm/xe/Kconfig.debug"
endmenu

@@ -0,0 +1,96 @@
# SPDX-License-Identifier: GPL-2.0-only
config DRM_XE_WERROR
bool "Force GCC to throw an error instead of a warning when compiling"
# As this may inadvertently break the build, only allow the user
# to shoot oneself in the foot iff they aim really hard
depends on EXPERT
# We use the dependency on !COMPILE_TEST to not be enabled in
# allmodconfig or allyesconfig configurations
depends on !COMPILE_TEST
default n
help
Add -Werror to the build flags for (and only for) xe.ko.
Do not enable this unless you are writing code for the xe.ko module.
Recommended for driver developers only.
If in doubt, say "N".
config DRM_XE_DEBUG
bool "Enable additional driver debugging"
depends on DRM_XE
depends on EXPERT
depends on !COMPILE_TEST
default n
help
Choose this option to turn on extra driver debugging that may affect
performance but will catch some internal issues.
Recommended for driver developers only.
If in doubt, say "N".
config DRM_XE_DEBUG_VM
bool "Enable extra VM debugging info"
default n
help
Enable extra VM debugging info
Recommended for driver developers only.
If in doubt, say "N".
config DRM_XE_DEBUG_MEM
bool "Enable passing SYS/LMEM addresses to user space"
default n
help
Pass object location through uapi. Intended for extended
testing and development only.
Recommended for driver developers only.
If in doubt, say "N".
config DRM_XE_SIMPLE_ERROR_CAPTURE
bool "Enable simple error capture to dmesg on job timeout"
default n
help
Choose this option when debugging an unexpected job timeout
Recommended for driver developers only.
If in doubt, say "N".
config DRM_XE_KUNIT_TEST
tristate "KUnit tests for the drm xe driver" if !KUNIT_ALL_TESTS
depends on DRM_XE && KUNIT
default KUNIT_ALL_TESTS
select DRM_EXPORT_FOR_TESTS if m
help
Choose this option to allow the driver to perform selftests under
the kunit framework
Recommended for driver developers only.
If in doubt, say "N".
config DRM_XE_LARGE_GUC_BUFFER
bool "Enable larger guc log buffer"
default n
help
Choose this option when debugging guc issues.
Buffer should be large enough for complex issues.
Recommended for driver developers only.
If in doubt, say "N".
config DRM_XE_USERPTR_INVAL_INJECT
bool "Inject userptr invalidation -EINVAL errors"
default n
help
Choose this option when debugging error paths that
are hit during checks for userptr invalidations.
Recommended for driver developers only.
If in doubt, say "N".

drivers/gpu/drm/xe/Makefile (new file)
@@ -0,0 +1,121 @@
# SPDX-License-Identifier: GPL-2.0
#
# Makefile for the drm device driver. This driver provides support for the
# Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
# Add a set of useful warning flags and enable -Werror for CI to prevent
# trivial mistakes from creeping in. We have to do this piecemeal as we reject
# any patch that isn't warning clean, so when turning on -Wall -Wextra (or W=1)
# we need to filter out dubious warnings.  Still, it is in our interest
# to keep running locally with W=1 C=1 until we are completely clean.
#
# Note the danger in using -Wall -Wextra is that when CI updates gcc we
# will most likely get a sudden build breakage... Hopefully we will fix
# new warnings before CI updates!
subdir-ccflags-y := -Wall -Wextra
# Making these use cc-disable-warning breaks the build when trying to build
# xe.mod.o by calling make M=drivers/gpu/drm/xe. This doesn't happen in the
# upstream tree, so it was somehow fixed by changes in the build system. Move
# these back to $(call cc-disable-warning, ...) after the rebase.
subdir-ccflags-y += -Wno-unused-parameter
subdir-ccflags-y += -Wno-type-limits
#subdir-ccflags-y += $(call cc-disable-warning, unused-parameter)
#subdir-ccflags-y += $(call cc-disable-warning, type-limits)
subdir-ccflags-y += $(call cc-disable-warning, missing-field-initializers)
subdir-ccflags-y += $(call cc-disable-warning, unused-but-set-variable)
# clang warnings
subdir-ccflags-y += $(call cc-disable-warning, sign-compare)
subdir-ccflags-y += $(call cc-disable-warning, sometimes-uninitialized)
subdir-ccflags-y += $(call cc-disable-warning, initializer-overrides)
subdir-ccflags-y += $(call cc-disable-warning, frame-address)
subdir-ccflags-$(CONFIG_DRM_XE_WERROR) += -Werror
# Fine grained warnings disable
CFLAGS_xe_pci.o = $(call cc-disable-warning, override-init)
subdir-ccflags-y += -I$(srctree)/$(src)
# Please keep these build lists sorted!
# core driver code
xe-y += xe_bb.o \
xe_bo.o \
xe_bo_evict.o \
xe_debugfs.o \
xe_device.o \
xe_dma_buf.o \
xe_engine.o \
xe_exec.o \
xe_execlist.o \
xe_force_wake.o \
xe_ggtt.o \
xe_gpu_scheduler.o \
xe_gt.o \
xe_gt_clock.o \
xe_gt_debugfs.o \
xe_gt_mcr.o \
xe_gt_pagefault.o \
xe_gt_sysfs.o \
xe_gt_topology.o \
xe_guc.o \
xe_guc_ads.o \
xe_guc_ct.o \
xe_guc_debugfs.o \
xe_guc_hwconfig.o \
xe_guc_log.o \
xe_guc_pc.o \
xe_guc_submit.o \
xe_hw_engine.o \
xe_hw_fence.o \
xe_huc.o \
xe_huc_debugfs.o \
xe_irq.o \
xe_lrc.o \
xe_migrate.o \
xe_mmio.o \
xe_mocs.o \
xe_module.o \
xe_pci.o \
xe_pcode.o \
xe_pm.o \
xe_preempt_fence.o \
xe_pt.o \
xe_pt_walk.o \
xe_query.o \
xe_reg_sr.o \
xe_reg_whitelist.o \
xe_rtp.o \
xe_ring_ops.o \
xe_sa.o \
xe_sched_job.o \
xe_step.o \
xe_sync.o \
xe_trace.o \
xe_ttm_gtt_mgr.o \
xe_ttm_vram_mgr.o \
xe_tuning.o \
xe_uc.o \
xe_uc_debugfs.o \
xe_uc_fw.o \
xe_vm.o \
xe_vm_madvise.o \
xe_wait_user_fence.o \
xe_wa.o \
xe_wopcm.o
# XXX: Needed for i915 register definitions. Will be removed after xe-regs.
subdir-ccflags-y += -I$(srctree)/drivers/gpu/drm/i915/
obj-$(CONFIG_DRM_XE) += xe.o
obj-$(CONFIG_DRM_XE_KUNIT_TEST) += tests/
# header test
always-$(CONFIG_DRM_XE_WERROR) += \
$(patsubst %.h,%.hdrtest, $(shell cd $(srctree)/$(src) && find * -name '*.h'))
quiet_cmd_hdrtest = HDRTEST $(patsubst %.hdrtest,%.h,$@)
cmd_hdrtest = $(CC) -DHDRTEST $(filter-out $(CFLAGS_GCOV), $(c_flags)) -S -o /dev/null -x c /dev/null -include $<; touch $@
$(obj)/%.hdrtest: $(src)/%.h FORCE
$(call if_changed_dep,hdrtest)

@@ -0,0 +1,219 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2014-2021 Intel Corporation
*/
#ifndef _ABI_GUC_ACTIONS_ABI_H
#define _ABI_GUC_ACTIONS_ABI_H
/**
* DOC: HOST2GUC_SELF_CFG
*
* This message is used by Host KMD to setup of the `GuC Self Config KLVs`_.
*
* This message must be sent as `MMIO HXG Message`_.
*
* +---+-------+--------------------------------------------------------------+
* | | Bits | Description |
* +===+=======+==============================================================+
* | 0 | 31 | ORIGIN = GUC_HXG_ORIGIN_HOST_ |
* | +-------+--------------------------------------------------------------+
* | | 30:28 | TYPE = GUC_HXG_TYPE_REQUEST_ |
* | +-------+--------------------------------------------------------------+
* | | 27:16 | DATA0 = MBZ |
* | +-------+--------------------------------------------------------------+
* | | 15:0 | ACTION = _`GUC_ACTION_HOST2GUC_SELF_CFG` = 0x0508 |
* +---+-------+--------------------------------------------------------------+
* | 1 | 31:16 | **KLV_KEY** - KLV key, see `GuC Self Config KLVs`_ |
* | +-------+--------------------------------------------------------------+
* | | 15:0 | **KLV_LEN** - KLV length |
* | | | |
* | | | - 32 bit KLV = 1 |
* | | | - 64 bit KLV = 2 |
* +---+-------+--------------------------------------------------------------+
* | 2 | 31:0 | **VALUE32** - Bits 31-0 of the KLV value |
* +---+-------+--------------------------------------------------------------+
* | 3 | 31:0 | **VALUE64** - Bits 63-32 of the KLV value (**KLV_LEN** = 2) |
* +---+-------+--------------------------------------------------------------+
*
* +---+-------+--------------------------------------------------------------+
* | | Bits | Description |
* +===+=======+==============================================================+
* | 0 | 31 | ORIGIN = GUC_HXG_ORIGIN_GUC_ |
* | +-------+--------------------------------------------------------------+
* | | 30:28 | TYPE = GUC_HXG_TYPE_RESPONSE_SUCCESS_ |
* | +-------+--------------------------------------------------------------+
* | | 27:0 | DATA0 = **NUM** - 1 if KLV was parsed, 0 if not recognized |
* +---+-------+--------------------------------------------------------------+
*/
#define GUC_ACTION_HOST2GUC_SELF_CFG 0x0508
#define HOST2GUC_SELF_CFG_REQUEST_MSG_LEN (GUC_HXG_REQUEST_MSG_MIN_LEN + 3u)
#define HOST2GUC_SELF_CFG_REQUEST_MSG_0_MBZ GUC_HXG_REQUEST_MSG_0_DATA0
#define HOST2GUC_SELF_CFG_REQUEST_MSG_1_KLV_KEY (0xffff << 16)
#define HOST2GUC_SELF_CFG_REQUEST_MSG_1_KLV_LEN (0xffff << 0)
#define HOST2GUC_SELF_CFG_REQUEST_MSG_2_VALUE32 GUC_HXG_REQUEST_MSG_n_DATAn
#define HOST2GUC_SELF_CFG_REQUEST_MSG_3_VALUE64 GUC_HXG_REQUEST_MSG_n_DATAn
#define HOST2GUC_SELF_CFG_RESPONSE_MSG_LEN GUC_HXG_RESPONSE_MSG_MIN_LEN
#define HOST2GUC_SELF_CFG_RESPONSE_MSG_0_NUM GUC_HXG_RESPONSE_MSG_0_DATA0
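/*
* Illustrative sketch only (not part of the ABI): one way the host could pack
* a HOST2GUC_SELF_CFG request into raw dwords using the masks above. Assumes
* FIELD_PREP() from <linux/bitfield.h>, lower/upper_32_bits() from
* <linux/kernel.h>, and the GUC_HXG_* header masks from guc_messages_abi.h
* are available in the including translation unit; the helper name is
* hypothetical.
*/
static inline void example_pack_self_cfg_klv(u32 *msg, u16 key, u16 len,
					     u64 value)
{
	/* DW0: HXG request header carrying the H2G action code */
	msg[0] = FIELD_PREP(GUC_HXG_MSG_0_ORIGIN, GUC_HXG_ORIGIN_HOST) |
		 FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
		 FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION,
			    GUC_ACTION_HOST2GUC_SELF_CFG);
	/* DW1: KLV key and length (1 = 32 bit value, 2 = 64 bit value) */
	msg[1] = FIELD_PREP(HOST2GUC_SELF_CFG_REQUEST_MSG_1_KLV_KEY, key) |
		 FIELD_PREP(HOST2GUC_SELF_CFG_REQUEST_MSG_1_KLV_LEN, len);
	/* DW2/DW3: KLV value; DW3 is only meaningful when len == 2 */
	msg[2] = lower_32_bits(value);
	msg[3] = upper_32_bits(value);
}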
/**
* DOC: HOST2GUC_CONTROL_CTB
*
* This H2G action allows the VF host to enable or disable the H2G and G2H `CT Buffer`_.
*
* This message must be sent as `MMIO HXG Message`_.
*
* +---+-------+--------------------------------------------------------------+
* | | Bits | Description |
* +===+=======+==============================================================+
* | 0 | 31 | ORIGIN = GUC_HXG_ORIGIN_HOST_ |
* | +-------+--------------------------------------------------------------+
* | | 30:28 | TYPE = GUC_HXG_TYPE_REQUEST_ |
* | +-------+--------------------------------------------------------------+
* | | 27:16 | DATA0 = MBZ |
* | +-------+--------------------------------------------------------------+
* | | 15:0 | ACTION = _`GUC_ACTION_HOST2GUC_CONTROL_CTB` = 0x4509 |
* +---+-------+--------------------------------------------------------------+
* | 1 | 31:0 | **CONTROL** - control `CTB based communication`_ |
* | | | |
* | | | - _`GUC_CTB_CONTROL_DISABLE` = 0 |
* | | | - _`GUC_CTB_CONTROL_ENABLE` = 1 |
* +---+-------+--------------------------------------------------------------+
*
* +---+-------+--------------------------------------------------------------+
* | | Bits | Description |
* +===+=======+==============================================================+
* | 0 | 31 | ORIGIN = GUC_HXG_ORIGIN_GUC_ |
* | +-------+--------------------------------------------------------------+
* | | 30:28 | TYPE = GUC_HXG_TYPE_RESPONSE_SUCCESS_ |
* | +-------+--------------------------------------------------------------+
* | | 27:0 | DATA0 = MBZ |
* +---+-------+--------------------------------------------------------------+
*/
#define GUC_ACTION_HOST2GUC_CONTROL_CTB 0x4509
#define HOST2GUC_CONTROL_CTB_REQUEST_MSG_LEN (GUC_HXG_REQUEST_MSG_MIN_LEN + 1u)
#define HOST2GUC_CONTROL_CTB_REQUEST_MSG_0_MBZ GUC_HXG_REQUEST_MSG_0_DATA0
#define HOST2GUC_CONTROL_CTB_REQUEST_MSG_1_CONTROL GUC_HXG_REQUEST_MSG_n_DATAn
#define GUC_CTB_CONTROL_DISABLE 0u
#define GUC_CTB_CONTROL_ENABLE 1u
#define HOST2GUC_CONTROL_CTB_RESPONSE_MSG_LEN GUC_HXG_RESPONSE_MSG_MIN_LEN
#define HOST2GUC_CONTROL_CTB_RESPONSE_MSG_0_MBZ GUC_HXG_RESPONSE_MSG_0_DATA0
/* legacy definitions */
enum xe_guc_action {
XE_GUC_ACTION_DEFAULT = 0x0,
XE_GUC_ACTION_REQUEST_PREEMPTION = 0x2,
XE_GUC_ACTION_REQUEST_ENGINE_RESET = 0x3,
XE_GUC_ACTION_ALLOCATE_DOORBELL = 0x10,
XE_GUC_ACTION_DEALLOCATE_DOORBELL = 0x20,
XE_GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE = 0x30,
XE_GUC_ACTION_UK_LOG_ENABLE_LOGGING = 0x40,
XE_GUC_ACTION_FORCE_LOG_BUFFER_FLUSH = 0x302,
XE_GUC_ACTION_ENTER_S_STATE = 0x501,
XE_GUC_ACTION_EXIT_S_STATE = 0x502,
XE_GUC_ACTION_GLOBAL_SCHED_POLICY_CHANGE = 0x506,
XE_GUC_ACTION_SCHED_CONTEXT = 0x1000,
XE_GUC_ACTION_SCHED_CONTEXT_MODE_SET = 0x1001,
XE_GUC_ACTION_SCHED_CONTEXT_MODE_DONE = 0x1002,
XE_GUC_ACTION_SCHED_ENGINE_MODE_SET = 0x1003,
XE_GUC_ACTION_SCHED_ENGINE_MODE_DONE = 0x1004,
XE_GUC_ACTION_SET_CONTEXT_PRIORITY = 0x1005,
XE_GUC_ACTION_SET_CONTEXT_EXECUTION_QUANTUM = 0x1006,
XE_GUC_ACTION_SET_CONTEXT_PREEMPTION_TIMEOUT = 0x1007,
XE_GUC_ACTION_CONTEXT_RESET_NOTIFICATION = 0x1008,
XE_GUC_ACTION_ENGINE_FAILURE_NOTIFICATION = 0x1009,
XE_GUC_ACTION_HOST2GUC_UPDATE_CONTEXT_POLICIES = 0x100B,
XE_GUC_ACTION_SETUP_PC_GUCRC = 0x3004,
XE_GUC_ACTION_AUTHENTICATE_HUC = 0x4000,
XE_GUC_ACTION_GET_HWCONFIG = 0x4100,
XE_GUC_ACTION_REGISTER_CONTEXT = 0x4502,
XE_GUC_ACTION_DEREGISTER_CONTEXT = 0x4503,
XE_GUC_ACTION_REGISTER_COMMAND_TRANSPORT_BUFFER = 0x4505,
XE_GUC_ACTION_DEREGISTER_COMMAND_TRANSPORT_BUFFER = 0x4506,
XE_GUC_ACTION_DEREGISTER_CONTEXT_DONE = 0x4600,
XE_GUC_ACTION_REGISTER_CONTEXT_MULTI_LRC = 0x4601,
XE_GUC_ACTION_CLIENT_SOFT_RESET = 0x5507,
XE_GUC_ACTION_SET_ENG_UTIL_BUFF = 0x550A,
XE_GUC_ACTION_NOTIFY_MEMORY_CAT_ERROR = 0x6000,
XE_GUC_ACTION_REPORT_PAGE_FAULT_REQ_DESC = 0x6002,
XE_GUC_ACTION_PAGE_FAULT_RES_DESC = 0x6003,
XE_GUC_ACTION_ACCESS_COUNTER_NOTIFY = 0x6004,
XE_GUC_ACTION_TLB_INVALIDATION = 0x7000,
XE_GUC_ACTION_TLB_INVALIDATION_DONE = 0x7001,
XE_GUC_ACTION_TLB_INVALIDATION_ALL = 0x7002,
XE_GUC_ACTION_STATE_CAPTURE_NOTIFICATION = 0x8002,
XE_GUC_ACTION_NOTIFY_FLUSH_LOG_BUFFER_TO_FILE = 0x8003,
XE_GUC_ACTION_NOTIFY_CRASH_DUMP_POSTED = 0x8004,
XE_GUC_ACTION_NOTIFY_EXCEPTION = 0x8005,
XE_GUC_ACTION_LIMIT
};
enum xe_guc_rc_options {
XE_GUCRC_HOST_CONTROL,
XE_GUCRC_FIRMWARE_CONTROL,
};
enum xe_guc_preempt_options {
XE_GUC_PREEMPT_OPTION_DROP_WORK_Q = 0x4,
XE_GUC_PREEMPT_OPTION_DROP_SUBMIT_Q = 0x8,
};
enum xe_guc_report_status {
XE_GUC_REPORT_STATUS_UNKNOWN = 0x0,
XE_GUC_REPORT_STATUS_ACKED = 0x1,
XE_GUC_REPORT_STATUS_ERROR = 0x2,
XE_GUC_REPORT_STATUS_COMPLETE = 0x4,
};
enum xe_guc_sleep_state_status {
XE_GUC_SLEEP_STATE_SUCCESS = 0x1,
XE_GUC_SLEEP_STATE_PREEMPT_TO_IDLE_FAILED = 0x2,
XE_GUC_SLEEP_STATE_ENGINE_RESET_FAILED = 0x3
#define XE_GUC_SLEEP_STATE_INVALID_MASK 0x80000000
};
#define GUC_LOG_CONTROL_LOGGING_ENABLED (1 << 0)
#define GUC_LOG_CONTROL_VERBOSITY_SHIFT 4
#define GUC_LOG_CONTROL_VERBOSITY_MASK (0xF << GUC_LOG_CONTROL_VERBOSITY_SHIFT)
#define GUC_LOG_CONTROL_DEFAULT_LOGGING (1 << 8)
#define XE_GUC_TLB_INVAL_TYPE_SHIFT 0
#define XE_GUC_TLB_INVAL_MODE_SHIFT 8
/* Flush PPC or SMRO caches along with TLB invalidation request */
#define XE_GUC_TLB_INVAL_FLUSH_CACHE (1 << 31)
enum xe_guc_tlb_invalidation_type {
XE_GUC_TLB_INVAL_FULL = 0x0,
XE_GUC_TLB_INVAL_PAGE_SELECTIVE = 0x1,
XE_GUC_TLB_INVAL_PAGE_SELECTIVE_CTX = 0x2,
XE_GUC_TLB_INVAL_GUC = 0x3,
};
/*
* 0: Heavy mode of Invalidation:
* The pipeline of the engine(s) for which the invalidation is targeted to is
* blocked, and all the in-flight transactions are guaranteed to be Globally
* Observed before completing the TLB invalidation
* 1: Lite mode of Invalidation:
* TLBs of the targeted engine(s) are immediately invalidated.
* In-flight transactions are NOT guaranteed to be Globally Observed before
* completing TLB invalidation.
* Light Invalidation Mode is to be used only when
* it can be guaranteed (by SW) that the address translations remain invariant
* for the in-flight transactions across the TLB invalidation. In other words,
* this mode can be used when the TLB invalidation is intended to clear out the
* stale cached translations that are no longer in use. Light Invalidation Mode
* is much faster than the Heavy Invalidation Mode, as it does not wait for the
* in-flight transactions to be GOd.
*/
enum xe_guc_tlb_inval_mode {
XE_GUC_TLB_INVAL_MODE_HEAVY = 0x0,
XE_GUC_TLB_INVAL_MODE_LITE = 0x1,
};
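/*
* Illustrative sketch only: how the TYPE/MODE shifts and the flush flag above
* might be combined into a single invalidation-descriptor dword. The actual
* dword layout of the TLB invalidation H2G message is owned by the GuC
* interface; this helper (and its name) is hypothetical and assumes
* <linux/types.h> for u32/bool.
*/
static inline u32 example_tlb_inval_dw(enum xe_guc_tlb_invalidation_type type,
					enum xe_guc_tlb_inval_mode mode,
					bool flush_cache)
{
	u32 val = (type << XE_GUC_TLB_INVAL_TYPE_SHIFT) |
		  (mode << XE_GUC_TLB_INVAL_MODE_SHIFT);

	if (flush_cache)
		val |= XE_GUC_TLB_INVAL_FLUSH_CACHE;

	return val;
}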
#endif

@@ -0,0 +1,249 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2021 Intel Corporation
*/
#ifndef _GUC_ACTIONS_SLPC_ABI_H_
#define _GUC_ACTIONS_SLPC_ABI_H_
#include <linux/types.h>
/**
* DOC: SLPC SHARED DATA STRUCTURE
*
* +----+------+--------------------------------------------------------------+
* | CL | Bytes| Description |
* +====+======+==============================================================+
* | 1 | 0-3 | SHARED DATA SIZE |
* | +------+--------------------------------------------------------------+
* | | 4-7 | GLOBAL STATE |
* | +------+--------------------------------------------------------------+
* | | 8-11 | DISPLAY DATA ADDRESS |
* | +------+--------------------------------------------------------------+
* | | 12:63| PADDING |
* +----+------+--------------------------------------------------------------+
* | | 0:63 | PADDING(PLATFORM INFO) |
* +----+------+--------------------------------------------------------------+
* | 3 | 0-3 | TASK STATE DATA |
* + +------+--------------------------------------------------------------+
* | | 4:63 | PADDING |
* +----+------+--------------------------------------------------------------+
* |4-21|0:1087| OVERRIDE PARAMS AND BIT FIELDS |
* +----+------+--------------------------------------------------------------+
* | | | PADDING + EXTRA RESERVED PAGE |
* +----+------+--------------------------------------------------------------+
*/
/*
* SLPC exposes certain parameters for global configuration by the host.
* These are referred to as override parameters, because in most cases
* the host will not need to modify the default values used by SLPC.
* SLPC remembers the default values which allows the host to easily restore
* them by simply unsetting the override. The host can set or unset override
* parameters during SLPC (re-)initialization using the SLPC Reset event.
* The host can also set or unset override parameters on the fly using the
* Parameter Set and Parameter Unset events
*/
#define SLPC_MAX_OVERRIDE_PARAMETERS 256
#define SLPC_OVERRIDE_BITFIELD_SIZE \
(SLPC_MAX_OVERRIDE_PARAMETERS / 32)
#define SLPC_PAGE_SIZE_BYTES 4096
#define SLPC_CACHELINE_SIZE_BYTES 64
#define SLPC_SHARED_DATA_SIZE_BYTE_HEADER SLPC_CACHELINE_SIZE_BYTES
#define SLPC_SHARED_DATA_SIZE_BYTE_PLATFORM_INFO SLPC_CACHELINE_SIZE_BYTES
#define SLPC_SHARED_DATA_SIZE_BYTE_TASK_STATE SLPC_CACHELINE_SIZE_BYTES
#define SLPC_SHARED_DATA_MODE_DEFN_TABLE_SIZE SLPC_PAGE_SIZE_BYTES
#define SLPC_SHARED_DATA_SIZE_BYTE_MAX (2 * SLPC_PAGE_SIZE_BYTES)
/*
* Cacheline size aligned (Total size needed for
* SLPM_KMD_MAX_OVERRIDE_PARAMETERS=256 is 1088 bytes)
*/
#define SLPC_OVERRIDE_PARAMS_TOTAL_BYTES (((((SLPC_MAX_OVERRIDE_PARAMETERS * 4) \
+ ((SLPC_MAX_OVERRIDE_PARAMETERS / 32) * 4)) \
+ (SLPC_CACHELINE_SIZE_BYTES - 1)) / SLPC_CACHELINE_SIZE_BYTES) * \
SLPC_CACHELINE_SIZE_BYTES)
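/*
* Worked example of the arithmetic above with SLPC_MAX_OVERRIDE_PARAMETERS
* == 256: the values take 256 * 4 = 1024 bytes, the override bitfield takes
* (256 / 32) * 4 = 32 bytes, and 1024 + 32 = 1056 bytes rounded up to the
* next 64 byte cacheline gives 1088 bytes. A compile-time check (assuming
* static_assert() from <linux/build_bug.h> is available) is shown here purely
* as an illustration:
*/
static_assert(SLPC_OVERRIDE_PARAMS_TOTAL_BYTES == 1088);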
#define SLPC_SHARED_DATA_SIZE_BYTE_OTHER (SLPC_SHARED_DATA_SIZE_BYTE_MAX - \
(SLPC_SHARED_DATA_SIZE_BYTE_HEADER \
+ SLPC_SHARED_DATA_SIZE_BYTE_PLATFORM_INFO \
+ SLPC_SHARED_DATA_SIZE_BYTE_TASK_STATE \
+ SLPC_OVERRIDE_PARAMS_TOTAL_BYTES \
+ SLPC_SHARED_DATA_MODE_DEFN_TABLE_SIZE))
enum slpc_task_enable {
SLPC_PARAM_TASK_DEFAULT = 0,
SLPC_PARAM_TASK_ENABLED,
SLPC_PARAM_TASK_DISABLED,
SLPC_PARAM_TASK_UNKNOWN
};
enum slpc_global_state {
SLPC_GLOBAL_STATE_NOT_RUNNING = 0,
SLPC_GLOBAL_STATE_INITIALIZING = 1,
SLPC_GLOBAL_STATE_RESETTING = 2,
SLPC_GLOBAL_STATE_RUNNING = 3,
SLPC_GLOBAL_STATE_SHUTTING_DOWN = 4,
SLPC_GLOBAL_STATE_ERROR = 5
};
enum slpc_param_id {
SLPC_PARAM_TASK_ENABLE_GTPERF = 0,
SLPC_PARAM_TASK_DISABLE_GTPERF = 1,
SLPC_PARAM_TASK_ENABLE_BALANCER = 2,
SLPC_PARAM_TASK_DISABLE_BALANCER = 3,
SLPC_PARAM_TASK_ENABLE_DCC = 4,
SLPC_PARAM_TASK_DISABLE_DCC = 5,
SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ = 6,
SLPC_PARAM_GLOBAL_MAX_GT_UNSLICE_FREQ_MHZ = 7,
SLPC_PARAM_GLOBAL_MIN_GT_SLICE_FREQ_MHZ = 8,
SLPC_PARAM_GLOBAL_MAX_GT_SLICE_FREQ_MHZ = 9,
SLPC_PARAM_GTPERF_THRESHOLD_MAX_FPS = 10,
SLPC_PARAM_GLOBAL_DISABLE_GT_FREQ_MANAGEMENT = 11,
SLPC_PARAM_GTPERF_ENABLE_FRAMERATE_STALLING = 12,
SLPC_PARAM_GLOBAL_DISABLE_RC6_MODE_CHANGE = 13,
SLPC_PARAM_GLOBAL_OC_UNSLICE_FREQ_MHZ = 14,
SLPC_PARAM_GLOBAL_OC_SLICE_FREQ_MHZ = 15,
SLPC_PARAM_GLOBAL_ENABLE_IA_GT_BALANCING = 16,
SLPC_PARAM_GLOBAL_ENABLE_ADAPTIVE_BURST_TURBO = 17,
SLPC_PARAM_GLOBAL_ENABLE_EVAL_MODE = 18,
SLPC_PARAM_GLOBAL_ENABLE_BALANCER_IN_NON_GAMING_MODE = 19,
SLPC_PARAM_GLOBAL_RT_MODE_TURBO_FREQ_DELTA_MHZ = 20,
SLPC_PARAM_PWRGATE_RC_MODE = 21,
SLPC_PARAM_EDR_MODE_COMPUTE_TIMEOUT_MS = 22,
SLPC_PARAM_EDR_QOS_FREQ_MHZ = 23,
SLPC_PARAM_MEDIA_FF_RATIO_MODE = 24,
SLPC_PARAM_ENABLE_IA_FREQ_LIMITING = 25,
SLPC_PARAM_STRATEGIES = 26,
SLPC_PARAM_POWER_PROFILE = 27,
SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY = 28,
SLPC_MAX_PARAM = 32,
};
enum slpc_media_ratio_mode {
SLPC_MEDIA_RATIO_MODE_DYNAMIC_CONTROL = 0,
SLPC_MEDIA_RATIO_MODE_FIXED_ONE_TO_ONE = 1,
SLPC_MEDIA_RATIO_MODE_FIXED_ONE_TO_TWO = 2,
};
enum slpc_gucrc_mode {
SLPC_GUCRC_MODE_HW = 0,
SLPC_GUCRC_MODE_GUCRC_NO_RC6 = 1,
SLPC_GUCRC_MODE_GUCRC_STATIC_TIMEOUT = 2,
SLPC_GUCRC_MODE_GUCRC_DYNAMIC_HYSTERESIS = 3,
SLPC_GUCRC_MODE_MAX,
};
enum slpc_event_id {
SLPC_EVENT_RESET = 0,
SLPC_EVENT_SHUTDOWN = 1,
SLPC_EVENT_PLATFORM_INFO_CHANGE = 2,
SLPC_EVENT_DISPLAY_MODE_CHANGE = 3,
SLPC_EVENT_FLIP_COMPLETE = 4,
SLPC_EVENT_QUERY_TASK_STATE = 5,
SLPC_EVENT_PARAMETER_SET = 6,
SLPC_EVENT_PARAMETER_UNSET = 7,
};
struct slpc_task_state_data {
union {
u32 task_status_padding;
struct {
u32 status;
#define SLPC_GTPERF_TASK_ENABLED REG_BIT(0)
#define SLPC_DCC_TASK_ENABLED REG_BIT(11)
#define SLPC_IN_DCC REG_BIT(12)
#define SLPC_BALANCER_ENABLED REG_BIT(15)
#define SLPC_IBC_TASK_ENABLED REG_BIT(16)
#define SLPC_BALANCER_IA_LMT_ENABLED REG_BIT(17)
#define SLPC_BALANCER_IA_LMT_ACTIVE REG_BIT(18)
};
};
union {
u32 freq_padding;
struct {
#define SLPC_MAX_UNSLICE_FREQ_MASK REG_GENMASK(7, 0)
#define SLPC_MIN_UNSLICE_FREQ_MASK REG_GENMASK(15, 8)
#define SLPC_MAX_SLICE_FREQ_MASK REG_GENMASK(23, 16)
#define SLPC_MIN_SLICE_FREQ_MASK REG_GENMASK(31, 24)
u32 freq;
};
};
} __packed;
struct slpc_shared_data_header {
/* Total size in bytes of this shared buffer. */
u32 size;
u32 global_state;
u32 display_data_addr;
} __packed;
struct slpc_override_params {
u32 bits[SLPC_OVERRIDE_BITFIELD_SIZE];
u32 values[SLPC_MAX_OVERRIDE_PARAMETERS];
} __packed;
struct slpc_shared_data {
struct slpc_shared_data_header header;
u8 shared_data_header_pad[SLPC_SHARED_DATA_SIZE_BYTE_HEADER -
sizeof(struct slpc_shared_data_header)];
u8 platform_info_pad[SLPC_SHARED_DATA_SIZE_BYTE_PLATFORM_INFO];
struct slpc_task_state_data task_state_data;
u8 task_state_data_pad[SLPC_SHARED_DATA_SIZE_BYTE_TASK_STATE -
sizeof(struct slpc_task_state_data)];
struct slpc_override_params override_params;
u8 override_params_pad[SLPC_OVERRIDE_PARAMS_TOTAL_BYTES -
sizeof(struct slpc_override_params)];
u8 shared_data_pad[SLPC_SHARED_DATA_SIZE_BYTE_OTHER];
/* PAGE 2 (4096 bytes), mode based parameter will be removed soon */
u8 reserved_mode_definition[4096];
} __packed;
/**
* DOC: SLPC H2G MESSAGE FORMAT
*
* +---+-------+--------------------------------------------------------------+
* | | Bits | Description |
* +===+=======+==============================================================+
* | 0 | 31 | ORIGIN = GUC_HXG_ORIGIN_HOST_ |
* | +-------+--------------------------------------------------------------+
* | | 30:28 | TYPE = GUC_HXG_TYPE_REQUEST_ |
* | +-------+--------------------------------------------------------------+
* | | 27:16 | DATA0 = MBZ |
* | +-------+--------------------------------------------------------------+
* | | 15:0 | ACTION = _`GUC_ACTION_HOST2GUC_PC_SLPC_REQUEST` = 0x3003 |
* +---+-------+--------------------------------------------------------------+
* | 1 | 31:8 | **EVENT_ID** |
* + +-------+--------------------------------------------------------------+
* | | 7:0 | **EVENT_ARGC** - number of data arguments |
* +---+-------+--------------------------------------------------------------+
* | 2 | 31:0 | **EVENT_DATA1** |
* +---+-------+--------------------------------------------------------------+
* |...| 31:0 | ... |
* +---+-------+--------------------------------------------------------------+
* |2+n| 31:0 | **EVENT_DATAn** |
* +---+-------+--------------------------------------------------------------+
*/
#define GUC_ACTION_HOST2GUC_PC_SLPC_REQUEST 0x3003
#define HOST2GUC_PC_SLPC_REQUEST_MSG_MIN_LEN \
(GUC_HXG_REQUEST_MSG_MIN_LEN + 1u)
#define HOST2GUC_PC_SLPC_EVENT_MAX_INPUT_ARGS 9
#define HOST2GUC_PC_SLPC_REQUEST_MSG_MAX_LEN \
(HOST2GUC_PC_SLPC_REQUEST_MSG_MIN_LEN + \
HOST2GUC_PC_SLPC_EVENT_MAX_INPUT_ARGS)
#define HOST2GUC_PC_SLPC_REQUEST_MSG_0_MBZ GUC_HXG_REQUEST_MSG_0_DATA0
#define HOST2GUC_PC_SLPC_REQUEST_MSG_1_EVENT_ID (0xff << 8)
#define HOST2GUC_PC_SLPC_REQUEST_MSG_1_EVENT_ARGC (0xff << 0)
#define HOST2GUC_PC_SLPC_REQUEST_MSG_N_EVENT_DATA_N GUC_HXG_REQUEST_MSG_n_DATAn
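/*
* Illustrative sketch only (not part of the ABI): packing the first two
* dwords of an SLPC request, e.g. a SLPC_EVENT_QUERY_TASK_STATE event.
* Assumes FIELD_PREP() from <linux/bitfield.h> and the GUC_HXG_* header masks
* from guc_messages_abi.h; the helper name is hypothetical.
*/
static inline void example_pack_slpc_event(u32 *msg, u8 event_id, u8 argc)
{
	msg[0] = FIELD_PREP(GUC_HXG_MSG_0_ORIGIN, GUC_HXG_ORIGIN_HOST) |
		 FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
		 FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION,
			    GUC_ACTION_HOST2GUC_PC_SLPC_REQUEST);
	msg[1] = FIELD_PREP(HOST2GUC_PC_SLPC_REQUEST_MSG_1_EVENT_ID, event_id) |
		 FIELD_PREP(HOST2GUC_PC_SLPC_REQUEST_MSG_1_EVENT_ARGC, argc);
	/* EVENT_DATA1..n follow in msg[2..], one dword per argument */
}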
#endif

@@ -0,0 +1,189 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2014-2021 Intel Corporation
*/
#ifndef _ABI_GUC_COMMUNICATION_CTB_ABI_H
#define _ABI_GUC_COMMUNICATION_CTB_ABI_H
#include <linux/types.h>
#include <linux/build_bug.h>
#include "guc_messages_abi.h"
/**
* DOC: CT Buffer
*
* Circular buffer used to send `CTB Message`_
*/
/**
* DOC: CTB Descriptor
*
* +---+-------+--------------------------------------------------------------+
* | | Bits | Description |
* +===+=======+==============================================================+
* | 0 | 31:0 | **HEAD** - offset (in dwords) to the last dword that was |
* | | | read from the `CT Buffer`_. |
* | | | It can only be updated by the receiver. |
* +---+-------+--------------------------------------------------------------+
* | 1 | 31:0 | **TAIL** - offset (in dwords) to the last dword that was |
* | | | written to the `CT Buffer`_. |
* | | | It can only be updated by the sender. |
* +---+-------+--------------------------------------------------------------+
* | 2 | 31:0 | **STATUS** - status of the CTB |
* | | | |
* | | | - _`GUC_CTB_STATUS_NO_ERROR` = 0 (normal operation) |
* | | | - _`GUC_CTB_STATUS_OVERFLOW` = 1 (head/tail too large) |
* | | | - _`GUC_CTB_STATUS_UNDERFLOW` = 2 (truncated message) |
* | | | - _`GUC_CTB_STATUS_MISMATCH` = 4 (head/tail modified) |
* +---+-------+--------------------------------------------------------------+
* |...| | RESERVED = MBZ |
* +---+-------+--------------------------------------------------------------+
* | 15| 31:0 | RESERVED = MBZ |
* +---+-------+--------------------------------------------------------------+
*/
struct guc_ct_buffer_desc {
u32 head;
u32 tail;
u32 status;
#define GUC_CTB_STATUS_NO_ERROR 0
#define GUC_CTB_STATUS_OVERFLOW (1 << 0)
#define GUC_CTB_STATUS_UNDERFLOW (1 << 1)
#define GUC_CTB_STATUS_MISMATCH (1 << 2)
u32 reserved[13];
} __packed;
static_assert(sizeof(struct guc_ct_buffer_desc) == 64);
/**
* DOC: CTB Message
*
* +---+-------+--------------------------------------------------------------+
* | | Bits | Description |
* +===+=======+==============================================================+
* | 0 | 31:16 | **FENCE** - message identifier |
* | +-------+--------------------------------------------------------------+
* | | 15:12 | **FORMAT** - format of the CTB message |
* | | | - _`GUC_CTB_FORMAT_HXG` = 0 - see `CTB HXG Message`_ |
* | +-------+--------------------------------------------------------------+
* | | 11:8 | **RESERVED** |
* | +-------+--------------------------------------------------------------+
* | | 7:0 | **NUM_DWORDS** - length of the CTB message (w/o header) |
* +---+-------+--------------------------------------------------------------+
* | 1 | 31:0 | optional (depends on FORMAT) |
* +---+-------+ |
* |...| | |
* +---+-------+ |
* | n | 31:0 | |
* +---+-------+--------------------------------------------------------------+
*/
#define GUC_CTB_HDR_LEN 1u
#define GUC_CTB_MSG_MIN_LEN GUC_CTB_HDR_LEN
#define GUC_CTB_MSG_MAX_LEN 256u
#define GUC_CTB_MSG_0_FENCE (0xffff << 16)
#define GUC_CTB_MSG_0_FORMAT (0xf << 12)
#define GUC_CTB_FORMAT_HXG 0u
#define GUC_CTB_MSG_0_RESERVED (0xf << 8)
#define GUC_CTB_MSG_0_NUM_DWORDS (0xff << 0)
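/*
* Illustrative sketch only (not part of the ABI): building the `CTB Message`_
* header dword from the fields above. Assumes FIELD_PREP() from
* <linux/bitfield.h>; the helper name is hypothetical.
*/
static inline u32 example_ctb_msg_header(u16 fence, u8 num_dwords)
{
	return FIELD_PREP(GUC_CTB_MSG_0_FENCE, fence) |
	       FIELD_PREP(GUC_CTB_MSG_0_FORMAT, GUC_CTB_FORMAT_HXG) |
	       FIELD_PREP(GUC_CTB_MSG_0_NUM_DWORDS, num_dwords);
}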
/**
* DOC: CTB HXG Message
*
* +---+-------+--------------------------------------------------------------+
* | | Bits | Description |
* +===+=======+==============================================================+
* | 0 | 31:16 | FENCE |
* | +-------+--------------------------------------------------------------+
* | | 15:12 | FORMAT = GUC_CTB_FORMAT_HXG_ |
* | +-------+--------------------------------------------------------------+
* | | 11:8 | RESERVED = MBZ |
* | +-------+--------------------------------------------------------------+
* | | 7:0 | NUM_DWORDS = length (in dwords) of the embedded HXG message |
* +---+-------+--------------------------------------------------------------+
* | 1 | 31:0 | |
* +---+-------+ |
* |...| | [Embedded `HXG Message`_] |
* +---+-------+ |
* | n | 31:0 | |
* +---+-------+--------------------------------------------------------------+
*/
#define GUC_CTB_HXG_MSG_MIN_LEN (GUC_CTB_MSG_MIN_LEN + GUC_HXG_MSG_MIN_LEN)
#define GUC_CTB_HXG_MSG_MAX_LEN GUC_CTB_MSG_MAX_LEN
/**
* DOC: CTB based communication
*
* The CTB (command transport buffer) communication between Host and GuC
* is based on u32 data stream written to the shared buffer. One buffer can
* be used to transmit data only in one direction (one-directional channel).
*
* The current status of each buffer is stored in the buffer descriptor.
* The buffer descriptor holds tail and head fields that represent the active
* data stream. The tail field is updated by the data producer (sender), and
* the head field is updated by the data consumer (receiver)::
*
* +------------+
* | DESCRIPTOR | +=================+============+========+
* +============+ | | MESSAGE(s) | |
* | address |--------->+=================+============+========+
* +------------+
* | head | ^-----head--------^
* +------------+
* | tail | ^---------tail-----------------^
* +------------+
* | size | ^---------------size--------------------^
* +------------+
*
* Each message in data stream starts with the single u32 treated as a header,
* followed by optional set of u32 data that makes message specific payload::
*
* +------------+---------+---------+---------+
* | MESSAGE |
* +------------+---------+---------+---------+
* | msg[0] | [1] | ... | [n-1] |
* +------------+---------+---------+---------+
* | MESSAGE | MESSAGE PAYLOAD |
* + HEADER +---------+---------+---------+
* | | 0 | ... | n |
* +======+=====+=========+=========+=========+
* | 31:16| code| | | |
* +------+-----+ | | |
* | 15:5|flags| | | |
* +------+-----+ | | |
* | 4:0| len| | | |
* +------+-----+---------+---------+---------+
*
* ^-------------len-------------^
*
* The message header consists of:
*
* - **len**, indicates length of the message payload (in u32)
* - **code**, indicates message code
* - **flags**, holds various bits to control message handling
*/
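/*
* Illustrative sketch only: space accounting for the circular buffer described
* above, with all values in dwords and head/tail < size. One slot is kept
* free so that head == tail unambiguously means "empty". These helpers and
* their names are hypothetical, not part of the ABI.
*/
static inline u32 example_ctb_used_dw(u32 head, u32 tail, u32 size)
{
	return (tail + size - head) % size;
}

static inline u32 example_ctb_free_dw(u32 head, u32 tail, u32 size)
{
	return size - example_ctb_used_dw(head, tail, size) - 1;
}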
/*
* Definition of the command transport message header (DW0)
*
* bit[4..0] message len (in dwords)
* bit[7..5] reserved
* bit[8] response (G2H only)
* bit[8] write fence to desc (H2G only)
* bit[9] write status to H2G buff (H2G only)
* bit[10] send status back via G2H (H2G only)
* bit[15..11] reserved
* bit[31..16] action code
*/
#define GUC_CT_MSG_LEN_SHIFT 0
#define GUC_CT_MSG_LEN_MASK 0x1F
#define GUC_CT_MSG_IS_RESPONSE (1 << 8)
#define GUC_CT_MSG_WRITE_FENCE_TO_DESC (1 << 8)
#define GUC_CT_MSG_WRITE_STATUS_TO_BUFF (1 << 9)
#define GUC_CT_MSG_SEND_STATUS (1 << 10)
#define GUC_CT_MSG_ACTION_SHIFT 16
#define GUC_CT_MSG_ACTION_MASK 0xFFFF
#endif

@@ -0,0 +1,49 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2014-2021 Intel Corporation
*/
#ifndef _ABI_GUC_COMMUNICATION_MMIO_ABI_H
#define _ABI_GUC_COMMUNICATION_MMIO_ABI_H
/**
* DOC: GuC MMIO based communication
*
* The MMIO based communication between Host and GuC relies on special
* hardware registers whose format can be defined by the software
* (so called scratch registers).
*
* Each MMIO based message, both Host to GuC (H2G) and GuC to Host (G2H),
* whose maximum length depends on the number of available scratch
* registers, is written directly into those scratch registers.
*
* For Gen9+, there are 16 software scratch registers 0xC180-0xC1B8,
* but no H2G command takes more than 4 parameters and the GuC firmware
* itself uses a 4-element array to store the H2G message.
*
* For Gen11+, there are 4 additional registers 0x190240-0x19024C, which
* are preferred over the legacy ones despite their lower count.
*
* The MMIO based communication is mainly used during driver initialization
* phase to setup the `CTB based communication`_ that will be used afterwards.
*/
#define GUC_MAX_MMIO_MSG_LEN 4
/**
* DOC: MMIO HXG Message
*
* Format of the MMIO messages follows definitions of `HXG Message`_.
*
* +---+-------+--------------------------------------------------------------+
* | | Bits | Description |
* +===+=======+==============================================================+
* | 0 | 31:0 | |
* +---+-------+ |
* |...| | [Embedded `HXG Message`_] |
* +---+-------+ |
* | n | 31:0 | |
* +---+-------+--------------------------------------------------------------+
*/
#endif

@@ -0,0 +1,37 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2014-2021 Intel Corporation
*/
#ifndef _ABI_GUC_ERRORS_ABI_H
#define _ABI_GUC_ERRORS_ABI_H
enum xe_guc_response_status {
XE_GUC_RESPONSE_STATUS_SUCCESS = 0x0,
XE_GUC_RESPONSE_STATUS_GENERIC_FAIL = 0xF000,
};
enum xe_guc_load_status {
XE_GUC_LOAD_STATUS_DEFAULT = 0x00,
XE_GUC_LOAD_STATUS_START = 0x01,
XE_GUC_LOAD_STATUS_ERROR_DEVID_BUILD_MISMATCH = 0x02,
XE_GUC_LOAD_STATUS_GUC_PREPROD_BUILD_MISMATCH = 0x03,
XE_GUC_LOAD_STATUS_ERROR_DEVID_INVALID_GUCTYPE = 0x04,
XE_GUC_LOAD_STATUS_GDT_DONE = 0x10,
XE_GUC_LOAD_STATUS_IDT_DONE = 0x20,
XE_GUC_LOAD_STATUS_LAPIC_DONE = 0x30,
XE_GUC_LOAD_STATUS_GUCINT_DONE = 0x40,
XE_GUC_LOAD_STATUS_DPC_READY = 0x50,
XE_GUC_LOAD_STATUS_DPC_ERROR = 0x60,
XE_GUC_LOAD_STATUS_EXCEPTION = 0x70,
XE_GUC_LOAD_STATUS_INIT_DATA_INVALID = 0x71,
XE_GUC_LOAD_STATUS_PXP_TEARDOWN_CTRL_ENABLED = 0x72,
XE_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_START,
XE_GUC_LOAD_STATUS_MPU_DATA_INVALID = 0x73,
XE_GUC_LOAD_STATUS_INIT_MMIO_SAVE_RESTORE_INVALID = 0x74,
XE_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_END,
XE_GUC_LOAD_STATUS_READY = 0xF0,
};
#endif

@@ -0,0 +1,322 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2021 Intel Corporation
*/
#ifndef _ABI_GUC_KLVS_ABI_H
#define _ABI_GUC_KLVS_ABI_H
#include <linux/types.h>
/**
* DOC: GuC KLV
*
* +---+-------+--------------------------------------------------------------+
* | | Bits | Description |
* +===+=======+==============================================================+
* | 0 | 31:16 | **KEY** - KLV key identifier |
* | | | - `GuC Self Config KLVs`_ |
* | | | - `GuC VGT Policy KLVs`_ |
* | | | - `GuC VF Configuration KLVs`_ |
* | | | |
* | +-------+--------------------------------------------------------------+
* | | 15:0 | **LEN** - length of VALUE (in 32bit dwords) |
* +---+-------+--------------------------------------------------------------+
* | 1 | 31:0 | **VALUE** - actual value of the KLV (format depends on KEY) |
* +---+-------+ |
* |...| | |
* +---+-------+ |
* | n | 31:0 | |
* +---+-------+--------------------------------------------------------------+
*/
#define GUC_KLV_LEN_MIN 1u
#define GUC_KLV_0_KEY (0xffff << 16)
#define GUC_KLV_0_LEN (0xffff << 0)
#define GUC_KLV_n_VALUE (0xffffffff << 0)
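/*
* Illustrative sketch only (not part of the ABI): emitting one `GuC KLV`_
* into a dword stream. Assumes FIELD_PREP() from <linux/bitfield.h>; the
* helper name is hypothetical. Returns the number of dwords written
* (1 header dword + len value dwords).
*/
static inline u32 example_emit_klv(u32 *klvs, u16 key, u16 len,
				   const u32 *value)
{
	u32 i;

	klvs[0] = FIELD_PREP(GUC_KLV_0_KEY, key) |
		  FIELD_PREP(GUC_KLV_0_LEN, len);
	for (i = 0; i < len; i++)
		klvs[1 + i] = value[i];

	return 1 + len;
}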
/**
* DOC: GuC Self Config KLVs
*
* `GuC KLV`_ keys available for use with HOST2GUC_SELF_CFG_.
*
* _`GUC_KLV_SELF_CFG_MEMIRQ_STATUS_ADDR` : 0x0900
* Refers to 64 bit Global Gfx address (in bytes) of memory based interrupts
* status vector for use by the GuC.
*
* _`GUC_KLV_SELF_CFG_MEMIRQ_SOURCE_ADDR` : 0x0901
* Refers to 64 bit Global Gfx address (in bytes) of memory based interrupts
* source vector for use by the GuC.
*
* _`GUC_KLV_SELF_CFG_H2G_CTB_ADDR` : 0x0902
* Refers to 64 bit Global Gfx address of H2G `CT Buffer`_.
* Should be above WOPCM address but below APIC base address for native mode.
*
* _`GUC_KLV_SELF_CFG_H2G_CTB_DESCRIPTOR_ADDR` : 0x0903
* Refers to 64 bit Global Gfx address of H2G `CTB Descriptor`_.
* Should be above WOPCM address but below APIC base address for native mode.
*
* _`GUC_KLV_SELF_CFG_H2G_CTB_SIZE` : 0x0904
* Refers to size of H2G `CT Buffer`_ in bytes.
* Should be a multiple of 4K.
*
* _`GUC_KLV_SELF_CFG_G2H_CTB_ADDR` : 0x0905
* Refers to 64 bit Global Gfx address of G2H `CT Buffer`_.
* Should be above WOPCM address but below APIC base address for native mode.
*
* _`GUC_KLV_SELF_CFG_G2H_CTB_DESCRIPTOR_ADDR` : 0x0906
* Refers to 64 bit Global Gfx address of G2H `CTB Descriptor`_.
* Should be above WOPCM address but below APIC base address for native mode.
*
* _`GUC_KLV_SELF_CFG_G2H_CTB_SIZE` : 0x0907
* Refers to size of G2H `CT Buffer`_ in bytes.
* Should be a multiple of 4K.
*/
#define GUC_KLV_SELF_CFG_MEMIRQ_STATUS_ADDR_KEY 0x0900
#define GUC_KLV_SELF_CFG_MEMIRQ_STATUS_ADDR_LEN 2u
#define GUC_KLV_SELF_CFG_MEMIRQ_SOURCE_ADDR_KEY 0x0901
#define GUC_KLV_SELF_CFG_MEMIRQ_SOURCE_ADDR_LEN 2u
#define GUC_KLV_SELF_CFG_H2G_CTB_ADDR_KEY 0x0902
#define GUC_KLV_SELF_CFG_H2G_CTB_ADDR_LEN 2u
#define GUC_KLV_SELF_CFG_H2G_CTB_DESCRIPTOR_ADDR_KEY 0x0903
#define GUC_KLV_SELF_CFG_H2G_CTB_DESCRIPTOR_ADDR_LEN 2u
#define GUC_KLV_SELF_CFG_H2G_CTB_SIZE_KEY 0x0904
#define GUC_KLV_SELF_CFG_H2G_CTB_SIZE_LEN 1u
#define GUC_KLV_SELF_CFG_G2H_CTB_ADDR_KEY 0x0905
#define GUC_KLV_SELF_CFG_G2H_CTB_ADDR_LEN 2u
#define GUC_KLV_SELF_CFG_G2H_CTB_DESCRIPTOR_ADDR_KEY 0x0906
#define GUC_KLV_SELF_CFG_G2H_CTB_DESCRIPTOR_ADDR_LEN 2u
#define GUC_KLV_SELF_CFG_G2H_CTB_SIZE_KEY 0x0907
#define GUC_KLV_SELF_CFG_G2H_CTB_SIZE_LEN 1u
/*
* Per context scheduling policy update keys.
*/
enum {
GUC_CONTEXT_POLICIES_KLV_ID_EXECUTION_QUANTUM = 0x2001,
GUC_CONTEXT_POLICIES_KLV_ID_PREEMPTION_TIMEOUT = 0x2002,
GUC_CONTEXT_POLICIES_KLV_ID_SCHEDULING_PRIORITY = 0x2003,
GUC_CONTEXT_POLICIES_KLV_ID_PREEMPT_TO_IDLE_ON_QUANTUM_EXPIRY = 0x2004,
GUC_CONTEXT_POLICIES_KLV_ID_SLPM_GT_FREQUENCY = 0x2005,
GUC_CONTEXT_POLICIES_KLV_NUM_IDS = 5,
};
/**
* DOC: GuC VGT Policy KLVs
*
* `GuC KLV`_ keys available for use with PF2GUC_UPDATE_VGT_POLICY.
*
* _`GUC_KLV_VGT_POLICY_SCHED_IF_IDLE` : 0x8001
* This config sets whether strict scheduling is enabled, whereby any VF
* that doesn't have work to submit is still allocated a fixed execution
* time-slice, to ensure active VFs' execution is always consistent even
* during other VF reprovisioning / rebooting events. Changing this KLV
* impacts all VFs and takes effect on the next VF-Switch event.
*
* :0: don't schedule idle (default)
* :1: schedule if idle
*
* _`GUC_KLV_VGT_POLICY_ADVERSE_SAMPLE_PERIOD` : 0x8002
* This config sets the sample period for tracking adverse event counters.
* A sample period is the period in millisecs during which events are counted.
* This is applicable for all the VFs.
*
* :0: adverse events are not counted (default)
* :n: sample period in milliseconds
*
* _`GUC_KLV_VGT_POLICY_RESET_AFTER_VF_SWITCH` : 0x8D00
* This config resets the utilized HW engines after a VF switch (i.e. to clean
* up stale HW registers left behind by the previous VF).
*
* :0: don't reset (default)
* :1: reset
*/
#define GUC_KLV_VGT_POLICY_SCHED_IF_IDLE_KEY 0x8001
#define GUC_KLV_VGT_POLICY_SCHED_IF_IDLE_LEN 1u
#define GUC_KLV_VGT_POLICY_ADVERSE_SAMPLE_PERIOD_KEY 0x8002
#define GUC_KLV_VGT_POLICY_ADVERSE_SAMPLE_PERIOD_LEN 1u
#define GUC_KLV_VGT_POLICY_RESET_AFTER_VF_SWITCH_KEY 0x8D00
#define GUC_KLV_VGT_POLICY_RESET_AFTER_VF_SWITCH_LEN 1u
/**
* DOC: GuC VF Configuration KLVs
*
* `GuC KLV`_ keys available for use with PF2GUC_UPDATE_VF_CFG.
*
* _`GUC_KLV_VF_CFG_GGTT_START` : 0x0001
* A 4K aligned start GTT address/offset assigned to VF.
* Value is 64 bits.
*
* _`GUC_KLV_VF_CFG_GGTT_SIZE` : 0x0002
* A 4K aligned size of GGTT assigned to VF.
* Value is 64 bits.
*
* _`GUC_KLV_VF_CFG_LMEM_SIZE` : 0x0003
* A 2M aligned size of local memory assigned to VF.
* Value is 64 bits.
*
* _`GUC_KLV_VF_CFG_NUM_CONTEXTS` : 0x0004
* Refers to the number of contexts allocated to this VF.
*
* :0: no contexts (default)
* :1-65535: number of contexts (Gen12)
*
* _`GUC_KLV_VF_CFG_TILE_MASK` : 0x0005
* For multi-tiled products, this field contains the bitwise-OR of tiles
* assigned to the VF. Bit 0 set means the VF has access to Tile 0,
* bit 31 set means the VF has access to Tile 31, and so on.
* At least one tile will always be allocated.
* If all bits are zero, the VF KMD should treat this as a fatal error.
* For single-tile products this KLV config is ignored.
*
* _`GUC_KLV_VF_CFG_NUM_DOORBELLS` : 0x0006
* Refers to the number of doorbells allocated to this VF.
*
* :0: no doorbells (default)
* :1-255: number of doorbells (Gen12)
*
* _`GUC_KLV_VF_CFG_EXEC_QUANTUM` : 0x8A01
* This config sets the VFs-execution-quantum in milliseconds.
* GUC will attempt to obey the maximum values as much as HW is capable
* of, and this will never be perfectly exact (accumulated nano-second
* granularity) since the GPU's clock time runs off a different crystal
* from the CPU's clock. Changing this KLV on a VF that is currently
* running a context won't take effect until a new context is scheduled in.
* That said, when the PF is changing this value from 0xFFFFFFFF to
* something else, it might never take effect if the VF is running an
* infinitely long compute or shader kernel. In such a scenario, the
* PF would need to trigger a VM PAUSE and then change the KLV to force
* it to take effect. Such cases might typically happen on a 1PF+1VF
* Virtualization config enabled for heavier workloads like AI/ML.
*
* :0: infinite exec quantum (default)
*
* _`GUC_KLV_VF_CFG_PREEMPT_TIMEOUT` : 0x8A02
* This config sets the VF-preemption-timeout in microseconds.
* GUC will attempt to obey the minimum and maximum values as much as
* HW is capable of, and this will never be perfectly exact (accumulated
* nano-second granularity) since the GPU's clock time runs off a
* different crystal from the CPU's clock. Changing this KLV on a VF
* that is currently running a context won't take effect until a new
* context is scheduled in.
* That said, when the PF is changing this value from 0xFFFFFFFF to
* something else, it might never take effect if the VF is running an
* infinitely long compute or shader kernel.
* In this case, the PF would need to trigger a VM PAUSE and then change
* the KLV to force it to take effect. Such cases might typically happen
* on a 1PF+1VF Virtualization config enabled for heavier workloads like
* AI/ML.
*
* :0: no preemption timeout (default)
*
* _`GUC_KLV_VF_CFG_THRESHOLD_CAT_ERR` : 0x8A03
* This config sets threshold for CAT errors caused by the VF.
*
* :0: adverse events or error will not be reported (default)
* :n: event occurrence count per sampling interval
*
* _`GUC_KLV_VF_CFG_THRESHOLD_ENGINE_RESET` : 0x8A04
* This config sets threshold for engine reset caused by the VF.
*
* :0: adverse events or error will not be reported (default)
* :n: event occurrence count per sampling interval
*
* _`GUC_KLV_VF_CFG_THRESHOLD_PAGE_FAULT` : 0x8A05
* This config sets threshold for page fault errors caused by the VF.
*
* :0: adverse events or error will not be reported (default)
* :n: event occurrence count per sampling interval
*
* _`GUC_KLV_VF_CFG_THRESHOLD_H2G_STORM` : 0x8A06
* This config sets threshold for H2G interrupts triggered by the VF.
*
* :0: adverse events or error will not be reported (default)
* :n: time (us) per sampling interval
*
* _`GUC_KLV_VF_CFG_THRESHOLD_IRQ_STORM` : 0x8A07
* This config sets threshold for GT interrupts triggered by the VF's
* workloads.
*
* :0: adverse events or error will not be reported (default)
* :n: time (us) per sampling interval
*
* _`GUC_KLV_VF_CFG_THRESHOLD_DOORBELL_STORM` : 0x8A08
* This config sets threshold for doorbell's ring triggered by the VF.
*
* :0: adverse events or error will not be reported (default)
* :n: time (us) per sampling interval
*
* _`GUC_KLV_VF_CFG_BEGIN_DOORBELL_ID` : 0x8A0A
* Refers to the start index of doorbell assigned to this VF.
*
* :0: (default)
* :1-255: number of doorbells (Gen12)
*
* _`GUC_KLV_VF_CFG_BEGIN_CONTEXT_ID` : 0x8A0B
* Refers to the start index in context array allocated to this VFs use.
*
* :0: (default)
* :1-65535: number of contexts (Gen12)
*/
#define GUC_KLV_VF_CFG_GGTT_START_KEY 0x0001
#define GUC_KLV_VF_CFG_GGTT_START_LEN 2u
#define GUC_KLV_VF_CFG_GGTT_SIZE_KEY 0x0002
#define GUC_KLV_VF_CFG_GGTT_SIZE_LEN 2u
#define GUC_KLV_VF_CFG_LMEM_SIZE_KEY 0x0003
#define GUC_KLV_VF_CFG_LMEM_SIZE_LEN 2u
#define GUC_KLV_VF_CFG_NUM_CONTEXTS_KEY 0x0004
#define GUC_KLV_VF_CFG_NUM_CONTEXTS_LEN 1u
#define GUC_KLV_VF_CFG_TILE_MASK_KEY 0x0005
#define GUC_KLV_VF_CFG_TILE_MASK_LEN 1u
#define GUC_KLV_VF_CFG_NUM_DOORBELLS_KEY 0x0006
#define GUC_KLV_VF_CFG_NUM_DOORBELLS_LEN 1u
#define GUC_KLV_VF_CFG_EXEC_QUANTUM_KEY 0x8a01
#define GUC_KLV_VF_CFG_EXEC_QUANTUM_LEN 1u
#define GUC_KLV_VF_CFG_PREEMPT_TIMEOUT_KEY 0x8a02
#define GUC_KLV_VF_CFG_PREEMPT_TIMEOUT_LEN 1u
#define GUC_KLV_VF_CFG_THRESHOLD_CAT_ERR_KEY 0x8a03
#define GUC_KLV_VF_CFG_THRESHOLD_CAT_ERR_LEN 1u
#define GUC_KLV_VF_CFG_THRESHOLD_ENGINE_RESET_KEY 0x8a04
#define GUC_KLV_VF_CFG_THRESHOLD_ENGINE_RESET_LEN 1u
#define GUC_KLV_VF_CFG_THRESHOLD_PAGE_FAULT_KEY 0x8a05
#define GUC_KLV_VF_CFG_THRESHOLD_PAGE_FAULT_LEN 1u
#define GUC_KLV_VF_CFG_THRESHOLD_H2G_STORM_KEY 0x8a06
#define GUC_KLV_VF_CFG_THRESHOLD_H2G_STORM_LEN 1u
#define GUC_KLV_VF_CFG_THRESHOLD_IRQ_STORM_KEY 0x8a07
#define GUC_KLV_VF_CFG_THRESHOLD_IRQ_STORM_LEN 1u
#define GUC_KLV_VF_CFG_THRESHOLD_DOORBELL_STORM_KEY 0x8a08
#define GUC_KLV_VF_CFG_THRESHOLD_DOORBELL_STORM_LEN 1u
#define GUC_KLV_VF_CFG_BEGIN_DOORBELL_ID_KEY 0x8a0a
#define GUC_KLV_VF_CFG_BEGIN_DOORBELL_ID_LEN 1u
#define GUC_KLV_VF_CFG_BEGIN_CONTEXT_ID_KEY 0x8a0b
#define GUC_KLV_VF_CFG_BEGIN_CONTEXT_ID_LEN 1u
#endif

@@ -0,0 +1,234 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2014-2021 Intel Corporation
*/
#ifndef _ABI_GUC_MESSAGES_ABI_H
#define _ABI_GUC_MESSAGES_ABI_H
/**
* DOC: HXG Message
*
* All messages exchanged with GuC are defined using 32 bit dwords.
* First dword is treated as a message header. Remaining dwords are optional.
*
* +---+-------+--------------------------------------------------------------+
* | | Bits | Description |
* +===+=======+==============================================================+
* | | | |
* | 0 | 31 | **ORIGIN** - originator of the message |
* | | | - _`GUC_HXG_ORIGIN_HOST` = 0 |
* | | | - _`GUC_HXG_ORIGIN_GUC` = 1 |
* | | | |
* | +-------+--------------------------------------------------------------+
* | | 30:28 | **TYPE** - message type |
* | | | - _`GUC_HXG_TYPE_REQUEST` = 0 |
* | | | - _`GUC_HXG_TYPE_EVENT` = 1 |
* | | | - _`GUC_HXG_TYPE_NO_RESPONSE_BUSY` = 3 |
* | | | - _`GUC_HXG_TYPE_NO_RESPONSE_RETRY` = 5 |
* | | | - _`GUC_HXG_TYPE_RESPONSE_FAILURE` = 6 |
* | | | - _`GUC_HXG_TYPE_RESPONSE_SUCCESS` = 7 |
* | +-------+--------------------------------------------------------------+
* | | 27:0 | **AUX** - auxiliary data (depends on TYPE) |
* +---+-------+--------------------------------------------------------------+
* | 1 | 31:0 | |
* +---+-------+ |
* |...| | **PAYLOAD** - optional payload (depends on TYPE) |
* +---+-------+ |
* | n | 31:0 | |
* +---+-------+--------------------------------------------------------------+
*/
#define GUC_HXG_MSG_MIN_LEN 1u
#define GUC_HXG_MSG_0_ORIGIN (0x1 << 31)
#define GUC_HXG_ORIGIN_HOST 0u
#define GUC_HXG_ORIGIN_GUC 1u
#define GUC_HXG_MSG_0_TYPE (0x7 << 28)
#define GUC_HXG_TYPE_REQUEST 0u
#define GUC_HXG_TYPE_EVENT 1u
#define GUC_HXG_TYPE_NO_RESPONSE_BUSY 3u
#define GUC_HXG_TYPE_NO_RESPONSE_RETRY 5u
#define GUC_HXG_TYPE_RESPONSE_FAILURE 6u
#define GUC_HXG_TYPE_RESPONSE_SUCCESS 7u
#define GUC_HXG_MSG_0_AUX (0xfffffff << 0)
#define GUC_HXG_MSG_n_PAYLOAD (0xffffffff << 0)
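/*
* Illustrative sketch only (not part of the ABI): decoding an HXG header
* dword with FIELD_GET() from <linux/bitfield.h>. The helper name is
* hypothetical and <linux/types.h> is assumed for u32/bool.
*/
static inline bool example_hxg_is_response_success(u32 hxg)
{
	return FIELD_GET(GUC_HXG_MSG_0_ORIGIN, hxg) == GUC_HXG_ORIGIN_GUC &&
	       FIELD_GET(GUC_HXG_MSG_0_TYPE, hxg) == GUC_HXG_TYPE_RESPONSE_SUCCESS;
}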
/**
* DOC: HXG Request
*
* The `HXG Request`_ message should be used to initiate synchronous activity
* for which confirmation or return data is expected.
*
* The recipient of this message shall use `HXG Response`_, `HXG Failure`_
* or `HXG Retry`_ message as a definite reply, and may use `HXG Busy`_
* message as an intermediate reply.
*
* Format of @DATA0 and all @DATAn fields depends on the @ACTION code.
*
* +---+-------+--------------------------------------------------------------+
* | | Bits | Description |
* +===+=======+==============================================================+
* | 0 | 31 | ORIGIN |
* | +-------+--------------------------------------------------------------+
* | | 30:28 | TYPE = GUC_HXG_TYPE_REQUEST_ |
* | +-------+--------------------------------------------------------------+
* | | 27:16 | **DATA0** - request data (depends on ACTION) |
* | +-------+--------------------------------------------------------------+
* | | 15:0 | **ACTION** - requested action code |
* +---+-------+--------------------------------------------------------------+
* | 1 | 31:0 | |
* +---+-------+ |
* |...| | **DATAn** - optional data (depends on ACTION) |
* +---+-------+ |
* | n | 31:0 | |
* +---+-------+--------------------------------------------------------------+
*/
#define GUC_HXG_REQUEST_MSG_MIN_LEN GUC_HXG_MSG_MIN_LEN
#define GUC_HXG_REQUEST_MSG_0_DATA0 (0xfff << 16)
#define GUC_HXG_REQUEST_MSG_0_ACTION (0xffff << 0)
#define GUC_HXG_REQUEST_MSG_n_DATAn GUC_HXG_MSG_n_PAYLOAD
/**
* DOC: HXG Event
*
* The `HXG Event`_ message should be used to initiate asynchronous activity
* that does not involve immediate confirmation or return data.
*
* Format of @DATA0 and all @DATAn fields depends on the @ACTION code.
*
* +---+-------+--------------------------------------------------------------+
* | | Bits | Description |
* +===+=======+==============================================================+
* | 0 | 31 | ORIGIN |
* | +-------+--------------------------------------------------------------+
* | | 30:28 | TYPE = GUC_HXG_TYPE_EVENT_ |
* | +-------+--------------------------------------------------------------+
* | | 27:16 | **DATA0** - event data (depends on ACTION) |
* | +-------+--------------------------------------------------------------+
* | | 15:0 | **ACTION** - event action code |
* +---+-------+--------------------------------------------------------------+
* | 1 | 31:0 | |
* +---+-------+ |
* |...| | **DATAn** - optional event data (depends on ACTION) |
* +---+-------+ |
* | n | 31:0 | |
* +---+-------+--------------------------------------------------------------+
*/
#define GUC_HXG_EVENT_MSG_MIN_LEN GUC_HXG_MSG_MIN_LEN
#define GUC_HXG_EVENT_MSG_0_DATA0 (0xfff << 16)
#define GUC_HXG_EVENT_MSG_0_ACTION (0xffff << 0)
#define GUC_HXG_EVENT_MSG_n_DATAn GUC_HXG_MSG_n_PAYLOAD
/**
* DOC: HXG Busy
*
* The `HXG Busy`_ message may be used to acknowledge reception of the `HXG Request`_
* message if the recipient expects that its processing will take longer than the
* default timeout.
*
* The @COUNTER field may be used as a progress indicator.
*
* +---+-------+--------------------------------------------------------------+
* | | Bits | Description |
* +===+=======+==============================================================+
* | 0 | 31 | ORIGIN |
* | +-------+--------------------------------------------------------------+
* | | 30:28 | TYPE = GUC_HXG_TYPE_NO_RESPONSE_BUSY_ |
* | +-------+--------------------------------------------------------------+
* | | 27:0 | **COUNTER** - progress indicator |
* +---+-------+--------------------------------------------------------------+
*/
#define GUC_HXG_BUSY_MSG_LEN GUC_HXG_MSG_MIN_LEN
#define GUC_HXG_BUSY_MSG_0_COUNTER GUC_HXG_MSG_0_AUX
/**
* DOC: HXG Retry
*
* The `HXG Retry`_ message should be used by the recipient to indicate that the
* `HXG Request`_ message was dropped and should be resent.
*
* The @REASON field may be used to provide additional information.
*
* +---+-------+--------------------------------------------------------------+
* | | Bits | Description |
* +===+=======+==============================================================+
* | 0 | 31 | ORIGIN |
* | +-------+--------------------------------------------------------------+
* | | 30:28 | TYPE = GUC_HXG_TYPE_NO_RESPONSE_RETRY_ |
* | +-------+--------------------------------------------------------------+
* | | 27:0 | **REASON** - reason for retry |
* | | | - _`GUC_HXG_RETRY_REASON_UNSPECIFIED` = 0 |
* +---+-------+--------------------------------------------------------------+
*/
#define GUC_HXG_RETRY_MSG_LEN GUC_HXG_MSG_MIN_LEN
#define GUC_HXG_RETRY_MSG_0_REASON GUC_HXG_MSG_0_AUX
#define GUC_HXG_RETRY_REASON_UNSPECIFIED 0u
/**
* DOC: HXG Failure
*
* The `HXG Failure`_ message shall be used as a reply to the `HXG Request`_
* message that could not be processed due to an error.
*
* +---+-------+--------------------------------------------------------------+
* | | Bits | Description |
* +===+=======+==============================================================+
* | 0 | 31 | ORIGIN |
* | +-------+--------------------------------------------------------------+
* | | 30:28 | TYPE = GUC_HXG_TYPE_RESPONSE_FAILURE_ |
* | +-------+--------------------------------------------------------------+
* | | 27:16 | **HINT** - additional error hint |
* | +-------+--------------------------------------------------------------+
* | | 15:0 | **ERROR** - error/result code |
* +---+-------+--------------------------------------------------------------+
*/
#define GUC_HXG_FAILURE_MSG_LEN GUC_HXG_MSG_MIN_LEN
#define GUC_HXG_FAILURE_MSG_0_HINT (0xfff << 16)
#define GUC_HXG_FAILURE_MSG_0_ERROR (0xffff << 0)
/**
* DOC: HXG Response
*
* The `HXG Response`_ message shall be used as a reply to the `HXG Request`_
* message that was successfully processed without an error.
*
* +---+-------+--------------------------------------------------------------+
* | | Bits | Description |
* +===+=======+==============================================================+
* | 0 | 31 | ORIGIN |
* | +-------+--------------------------------------------------------------+
* | | 30:28 | TYPE = GUC_HXG_TYPE_RESPONSE_SUCCESS_ |
* | +-------+--------------------------------------------------------------+
* | | 27:0 | **DATA0** - data (depends on ACTION from `HXG Request`_) |
* +---+-------+--------------------------------------------------------------+
* | 1 | 31:0 | |
* +---+-------+ |
* |...| | **DATAn** - data (depends on ACTION from `HXG Request`_) |
* +---+-------+ |
* | n | 31:0 | |
* +---+-------+--------------------------------------------------------------+
*/
#define GUC_HXG_RESPONSE_MSG_MIN_LEN GUC_HXG_MSG_MIN_LEN
#define GUC_HXG_RESPONSE_MSG_0_DATA0 GUC_HXG_MSG_0_AUX
#define GUC_HXG_RESPONSE_MSG_n_DATAn GUC_HXG_MSG_n_PAYLOAD
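Putting the reply types together, a sender waiting on an `HXG Request`_ could classify the first reply dword roughly as in the sketch below. This is illustrative only; the helper name and the errno mapping are assumptions, not something this ABI header defines:

/* Sketch only: classify a reply dword (<linux/errno.h> assumed). */
static inline int example_hxg_classify_reply(u32 dw0, u32 *data_or_error)
{
        switch ((dw0 & GUC_HXG_MSG_0_TYPE) >> 28) {
        case GUC_HXG_TYPE_RESPONSE_SUCCESS:
                *data_or_error = dw0 & GUC_HXG_RESPONSE_MSG_0_DATA0;
                return 0;               /* done, DATA0 (and any DATAn) valid */
        case GUC_HXG_TYPE_RESPONSE_FAILURE:
                *data_or_error = dw0 & GUC_HXG_FAILURE_MSG_0_ERROR;
                return -EIO;            /* ERROR/HINT describe the failure */
        case GUC_HXG_TYPE_NO_RESPONSE_BUSY:
                return -EBUSY;          /* keep waiting, COUNTER may progress */
        case GUC_HXG_TYPE_NO_RESPONSE_RETRY:
                return -EAGAIN;         /* resend the request */
        default:
                return -EPROTO;         /* unexpected reply type */
        }
}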
/* deprecated */
#define INTEL_GUC_MSG_TYPE_SHIFT 28
#define INTEL_GUC_MSG_TYPE_MASK (0xF << INTEL_GUC_MSG_TYPE_SHIFT)
#define INTEL_GUC_MSG_DATA_SHIFT 16
#define INTEL_GUC_MSG_DATA_MASK (0xFFF << INTEL_GUC_MSG_DATA_SHIFT)
#define INTEL_GUC_MSG_CODE_SHIFT 0
#define INTEL_GUC_MSG_CODE_MASK (0xFFFF << INTEL_GUC_MSG_CODE_SHIFT)
enum intel_guc_msg_type {
INTEL_GUC_MSG_TYPE_REQUEST = 0x0,
INTEL_GUC_MSG_TYPE_RESPONSE = 0xF,
};
#endif

View File

@ -0,0 +1,4 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_DRM_XE_KUNIT_TEST) += xe_bo_test.o xe_dma_buf_test.o \
xe_migrate_test.o

View File

@ -0,0 +1,303 @@
// SPDX-License-Identifier: GPL-2.0 AND MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include <kunit/test.h>
#include "xe_bo_evict.h"
#include "xe_pci.h"
static int ccs_test_migrate(struct xe_gt *gt, struct xe_bo *bo,
bool clear, u64 get_val, u64 assign_val,
struct kunit *test)
{
struct dma_fence *fence;
struct ttm_tt *ttm;
struct page *page;
pgoff_t ccs_page;
long timeout;
u64 *cpu_map;
int ret;
u32 offset;
/* Move bo to VRAM if not already there. */
ret = xe_bo_validate(bo, NULL, false);
if (ret) {
KUNIT_FAIL(test, "Failed to validate bo.\n");
return ret;
}
/* Optionally clear bo *and* CCS data in VRAM. */
if (clear) {
fence = xe_migrate_clear(gt->migrate, bo, bo->ttm.resource, 0);
if (IS_ERR(fence)) {
KUNIT_FAIL(test, "Failed to submit bo clear.\n");
return PTR_ERR(fence);
}
dma_fence_put(fence);
}
/* Evict to system. CCS data should be copied. */
ret = xe_bo_evict(bo, true);
if (ret) {
KUNIT_FAIL(test, "Failed to evict bo.\n");
return ret;
}
/* Sync all migration blits */
timeout = dma_resv_wait_timeout(bo->ttm.base.resv,
DMA_RESV_USAGE_KERNEL,
true,
5 * HZ);
if (timeout <= 0) {
KUNIT_FAIL(test, "Failed to sync bo eviction.\n");
return -ETIME;
}
/*
* Bo with CCS data is now in system memory. Verify backing store
* and data integrity. Then assign for the next testing round while
* we still have a CPU map.
*/
ttm = bo->ttm.ttm;
if (!ttm || !ttm_tt_is_populated(ttm)) {
KUNIT_FAIL(test, "Bo was not in expected placement.\n");
return -EINVAL;
}
ccs_page = xe_bo_ccs_pages_start(bo) >> PAGE_SHIFT;
if (ccs_page >= ttm->num_pages) {
KUNIT_FAIL(test, "No TTM CCS pages present.\n");
return -EINVAL;
}
page = ttm->pages[ccs_page];
cpu_map = kmap_local_page(page);
/* Check first CCS value */
if (cpu_map[0] != get_val) {
KUNIT_FAIL(test,
"Expected CCS readout 0x%016llx, got 0x%016llx.\n",
(unsigned long long)get_val,
(unsigned long long)cpu_map[0]);
ret = -EINVAL;
}
/* Check last CCS value, or at least last value in page. */
offset = xe_device_ccs_bytes(gt->xe, bo->size);
offset = min_t(u32, offset, PAGE_SIZE) / sizeof(u64) - 1;
if (cpu_map[offset] != get_val) {
KUNIT_FAIL(test,
"Expected CCS readout 0x%016llx, got 0x%016llx.\n",
(unsigned long long)get_val,
(unsigned long long)cpu_map[offset]);
ret = -EINVAL;
}
cpu_map[0] = assign_val;
cpu_map[offset] = assign_val;
kunmap_local(cpu_map);
return ret;
}
static void ccs_test_run_gt(struct xe_device *xe, struct xe_gt *gt,
struct kunit *test)
{
struct xe_bo *bo;
u32 vram_bit;
int ret;
/* TODO: Sanity check */
vram_bit = XE_BO_CREATE_VRAM0_BIT << gt->info.vram_id;
kunit_info(test, "Testing gt id %u vram id %u\n", gt->info.id,
gt->info.vram_id);
bo = xe_bo_create_locked(xe, NULL, NULL, SZ_1M, ttm_bo_type_device,
vram_bit);
if (IS_ERR(bo)) {
KUNIT_FAIL(test, "Failed to create bo.\n");
return;
}
kunit_info(test, "Verifying that CCS data is cleared on creation.\n");
ret = ccs_test_migrate(gt, bo, false, 0ULL, 0xdeadbeefdeadbeefULL,
test);
if (ret)
goto out_unlock;
kunit_info(test, "Verifying that CCS data survives migration.\n");
ret = ccs_test_migrate(gt, bo, false, 0xdeadbeefdeadbeefULL,
0xdeadbeefdeadbeefULL, test);
if (ret)
goto out_unlock;
kunit_info(test, "Verifying that CCS data can be properly cleared.\n");
ret = ccs_test_migrate(gt, bo, true, 0ULL, 0ULL, test);
out_unlock:
xe_bo_unlock_no_vm(bo);
xe_bo_put(bo);
}
static int ccs_test_run_device(struct xe_device *xe)
{
struct kunit *test = xe_cur_kunit();
struct xe_gt *gt;
int id;
if (!xe_device_has_flat_ccs(xe)) {
kunit_info(test, "Skipping non-flat-ccs device.\n");
return 0;
}
for_each_gt(gt, xe, id)
ccs_test_run_gt(xe, gt, test);
return 0;
}
void xe_ccs_migrate_kunit(struct kunit *test)
{
xe_call_for_each_device(ccs_test_run_device);
}
EXPORT_SYMBOL(xe_ccs_migrate_kunit);
static int evict_test_run_gt(struct xe_device *xe, struct xe_gt *gt, struct kunit *test)
{
struct xe_bo *bo, *external;
unsigned int bo_flags = XE_BO_CREATE_USER_BIT |
XE_BO_CREATE_VRAM_IF_DGFX(gt);
struct xe_vm *vm = xe_migrate_get_vm(xe->gt[0].migrate);
struct ww_acquire_ctx ww;
int err, i;
kunit_info(test, "Testing device %s gt id %u vram id %u\n",
dev_name(xe->drm.dev), gt->info.id, gt->info.vram_id);
for (i = 0; i < 2; ++i) {
xe_vm_lock(vm, &ww, 0, false);
bo = xe_bo_create(xe, NULL, vm, 0x10000, ttm_bo_type_device,
bo_flags);
xe_vm_unlock(vm, &ww);
if (IS_ERR(bo)) {
KUNIT_FAIL(test, "bo create err=%pe\n", bo);
break;
}
external = xe_bo_create(xe, NULL, NULL, 0x10000,
ttm_bo_type_device, bo_flags);
if (IS_ERR(external)) {
KUNIT_FAIL(test, "external bo create err=%pe\n", external);
goto cleanup_bo;
}
xe_bo_lock(external, &ww, 0, false);
err = xe_bo_pin_external(external);
xe_bo_unlock(external, &ww);
if (err) {
KUNIT_FAIL(test, "external bo pin err=%pe\n",
ERR_PTR(err));
goto cleanup_external;
}
err = xe_bo_evict_all(xe);
if (err) {
KUNIT_FAIL(test, "evict err=%pe\n", ERR_PTR(err));
goto cleanup_all;
}
err = xe_bo_restore_kernel(xe);
if (err) {
KUNIT_FAIL(test, "restore kernel err=%pe\n",
ERR_PTR(err));
goto cleanup_all;
}
err = xe_bo_restore_user(xe);
if (err) {
KUNIT_FAIL(test, "restore user err=%pe\n", ERR_PTR(err));
goto cleanup_all;
}
if (!xe_bo_is_vram(external)) {
KUNIT_FAIL(test, "external bo is not vram\n");
err = -EPROTO;
goto cleanup_all;
}
if (xe_bo_is_vram(bo)) {
KUNIT_FAIL(test, "bo is vram\n");
err = -EPROTO;
goto cleanup_all;
}
if (i) {
down_read(&vm->lock);
xe_vm_lock(vm, &ww, 0, false);
err = xe_bo_validate(bo, bo->vm, false);
xe_vm_unlock(vm, &ww);
up_read(&vm->lock);
if (err) {
KUNIT_FAIL(test, "bo valid err=%pe\n",
ERR_PTR(err));
goto cleanup_all;
}
xe_bo_lock(external, &ww, 0, false);
err = xe_bo_validate(external, NULL, false);
xe_bo_unlock(external, &ww);
if (err) {
KUNIT_FAIL(test, "external bo valid err=%pe\n",
ERR_PTR(err));
goto cleanup_all;
}
}
xe_bo_lock(external, &ww, 0, false);
xe_bo_unpin_external(external);
xe_bo_unlock(external, &ww);
xe_bo_put(external);
xe_bo_put(bo);
continue;
cleanup_all:
xe_bo_lock(external, &ww, 0, false);
xe_bo_unpin_external(external);
xe_bo_unlock(external, &ww);
cleanup_external:
xe_bo_put(external);
cleanup_bo:
xe_bo_put(bo);
break;
}
xe_vm_put(vm);
return 0;
}
static int evict_test_run_device(struct xe_device *xe)
{
struct kunit *test = xe_cur_kunit();
struct xe_gt *gt;
int id;
if (!IS_DGFX(xe)) {
kunit_info(test, "Skipping non-discrete device %s.\n",
dev_name(xe->drm.dev));
return 0;
}
for_each_gt(gt, xe, id)
evict_test_run_gt(xe, gt, test);
return 0;
}
void xe_bo_evict_kunit(struct kunit *test)
{
xe_call_for_each_device(evict_test_run_device);
}
EXPORT_SYMBOL(xe_bo_evict_kunit);

View File

@ -0,0 +1,25 @@
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright © 2022 Intel Corporation
*/
#include <kunit/test.h>
void xe_ccs_migrate_kunit(struct kunit *test);
void xe_bo_evict_kunit(struct kunit *test);
static struct kunit_case xe_bo_tests[] = {
KUNIT_CASE(xe_ccs_migrate_kunit),
KUNIT_CASE(xe_bo_evict_kunit),
{}
};
static struct kunit_suite xe_bo_test_suite = {
.name = "xe_bo",
.test_cases = xe_bo_tests,
};
kunit_test_suite(xe_bo_test_suite);
MODULE_AUTHOR("Intel Corporation");
MODULE_LICENSE("GPL");

View File

@ -0,0 +1,259 @@
// SPDX-License-Identifier: GPL-2.0 AND MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include <kunit/test.h>
#include "xe_pci.h"
static bool p2p_enabled(struct dma_buf_test_params *params)
{
return IS_ENABLED(CONFIG_PCI_P2PDMA) && params->attach_ops &&
params->attach_ops->allow_peer2peer;
}
static bool is_dynamic(struct dma_buf_test_params *params)
{
return IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY) && params->attach_ops &&
params->attach_ops->move_notify;
}
static void check_residency(struct kunit *test, struct xe_bo *exported,
struct xe_bo *imported, struct dma_buf *dmabuf)
{
struct dma_buf_test_params *params = to_dma_buf_test_params(test->priv);
u32 mem_type;
int ret;
xe_bo_assert_held(exported);
xe_bo_assert_held(imported);
mem_type = XE_PL_VRAM0;
if (!(params->mem_mask & XE_BO_CREATE_VRAM0_BIT))
/* No VRAM allowed */
mem_type = XE_PL_TT;
else if (params->force_different_devices && !p2p_enabled(params))
/* No P2P */
mem_type = XE_PL_TT;
else if (params->force_different_devices && !is_dynamic(params) &&
(params->mem_mask & XE_BO_CREATE_SYSTEM_BIT))
/* Pin migrated to TT */
mem_type = XE_PL_TT;
if (!xe_bo_is_mem_type(exported, mem_type)) {
KUNIT_FAIL(test, "Exported bo was not in expected memory type.\n");
return;
}
if (xe_bo_is_pinned(exported))
return;
/*
* Evict exporter. Note that the gem object dma_buf member isn't
* set from xe_gem_prime_export(), and it's needed for the move_notify()
* functionality, so hack that up here. Evicting the exported bo will
* evict also the imported bo through the move_notify() functionality if
* importer is on a different device. If they're on the same device,
* the exporter and the importer should be the same bo.
*/
swap(exported->ttm.base.dma_buf, dmabuf);
ret = xe_bo_evict(exported, true);
swap(exported->ttm.base.dma_buf, dmabuf);
if (ret) {
if (ret != -EINTR && ret != -ERESTARTSYS)
KUNIT_FAIL(test, "Evicting exporter failed with err=%d.\n",
ret);
return;
}
/* Verify that also importer has been evicted to SYSTEM */
if (!xe_bo_is_mem_type(imported, XE_PL_SYSTEM)) {
KUNIT_FAIL(test, "Importer wasn't properly evicted.\n");
return;
}
/* Re-validate the importer. This should move also exporter in. */
ret = xe_bo_validate(imported, NULL, false);
if (ret) {
if (ret != -EINTR && ret != -ERESTARTSYS)
KUNIT_FAIL(test, "Validating importer failed with err=%d.\n",
ret);
return;
}
/*
* If on different devices, the exporter is kept in system if
* possible, saving a migration step as the transfer is likely just
* as fast from system memory.
*/
if (params->force_different_devices &&
params->mem_mask & XE_BO_CREATE_SYSTEM_BIT)
KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(exported, XE_PL_TT));
else
KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(exported, mem_type));
if (params->force_different_devices)
KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(imported, XE_PL_TT));
else
KUNIT_EXPECT_TRUE(test, exported == imported);
}
static void xe_test_dmabuf_import_same_driver(struct xe_device *xe)
{
struct kunit *test = xe_cur_kunit();
struct dma_buf_test_params *params = to_dma_buf_test_params(test->priv);
struct drm_gem_object *import;
struct dma_buf *dmabuf;
struct xe_bo *bo;
/* No VRAM on this device? */
if (!ttm_manager_type(&xe->ttm, XE_PL_VRAM0) &&
(params->mem_mask & XE_BO_CREATE_VRAM0_BIT))
return;
kunit_info(test, "running %s\n", __func__);
bo = xe_bo_create(xe, NULL, NULL, PAGE_SIZE, ttm_bo_type_device,
XE_BO_CREATE_USER_BIT | params->mem_mask);
if (IS_ERR(bo)) {
KUNIT_FAIL(test, "xe_bo_create() failed with err=%ld\n",
PTR_ERR(bo));
return;
}
dmabuf = xe_gem_prime_export(&bo->ttm.base, 0);
if (IS_ERR(dmabuf)) {
KUNIT_FAIL(test, "xe_gem_prime_export() failed with err=%ld\n",
PTR_ERR(dmabuf));
goto out;
}
import = xe_gem_prime_import(&xe->drm, dmabuf);
if (!IS_ERR(import)) {
struct xe_bo *import_bo = gem_to_xe_bo(import);
/*
* Did import succeed when it shouldn't due to lack of p2p support?
*/
if (params->force_different_devices &&
!p2p_enabled(params) &&
!(params->mem_mask & XE_BO_CREATE_SYSTEM_BIT)) {
KUNIT_FAIL(test,
"xe_gem_prime_import() succeeded when it shouldn't have\n");
} else {
int err;
/* Is everything where we expect it to be? */
xe_bo_lock_no_vm(import_bo, NULL);
err = xe_bo_validate(import_bo, NULL, false);
if (err && err != -EINTR && err != -ERESTARTSYS)
KUNIT_FAIL(test,
"xe_bo_validate() failed with err=%d\n", err);
check_residency(test, bo, import_bo, dmabuf);
xe_bo_unlock_no_vm(import_bo);
}
drm_gem_object_put(import);
} else if (PTR_ERR(import) != -EOPNOTSUPP) {
/* Unexpected error code. */
KUNIT_FAIL(test,
"xe_gem_prime_import failed with the wrong err=%ld\n",
PTR_ERR(import));
} else if (!params->force_different_devices ||
p2p_enabled(params) ||
(params->mem_mask & XE_BO_CREATE_SYSTEM_BIT)) {
/* Shouldn't fail if we can reuse same bo, use p2p or use system */
KUNIT_FAIL(test, "dynamic p2p attachment failed with err=%ld\n",
PTR_ERR(import));
}
dma_buf_put(dmabuf);
out:
drm_gem_object_put(&bo->ttm.base);
}
static const struct dma_buf_attach_ops nop2p_attach_ops = {
.allow_peer2peer = false,
.move_notify = xe_dma_buf_move_notify
};
/*
* We test the implementation with bos of different residency and with
* importers with different capabilities; some lacking p2p support and some
* lacking dynamic capabilities (attach_ops == NULL). We also fake
* different devices avoiding the import shortcut that just reuses the same
* gem object.
*/
static const struct dma_buf_test_params test_params[] = {
{.mem_mask = XE_BO_CREATE_VRAM0_BIT,
.attach_ops = &xe_dma_buf_attach_ops},
{.mem_mask = XE_BO_CREATE_VRAM0_BIT,
.attach_ops = &xe_dma_buf_attach_ops,
.force_different_devices = true},
{.mem_mask = XE_BO_CREATE_VRAM0_BIT,
.attach_ops = &nop2p_attach_ops},
{.mem_mask = XE_BO_CREATE_VRAM0_BIT,
.attach_ops = &nop2p_attach_ops,
.force_different_devices = true},
{.mem_mask = XE_BO_CREATE_VRAM0_BIT},
{.mem_mask = XE_BO_CREATE_VRAM0_BIT,
.force_different_devices = true},
{.mem_mask = XE_BO_CREATE_SYSTEM_BIT,
.attach_ops = &xe_dma_buf_attach_ops},
{.mem_mask = XE_BO_CREATE_SYSTEM_BIT,
.attach_ops = &xe_dma_buf_attach_ops,
.force_different_devices = true},
{.mem_mask = XE_BO_CREATE_SYSTEM_BIT,
.attach_ops = &nop2p_attach_ops},
{.mem_mask = XE_BO_CREATE_SYSTEM_BIT,
.attach_ops = &nop2p_attach_ops,
.force_different_devices = true},
{.mem_mask = XE_BO_CREATE_SYSTEM_BIT},
{.mem_mask = XE_BO_CREATE_SYSTEM_BIT,
.force_different_devices = true},
{.mem_mask = XE_BO_CREATE_SYSTEM_BIT | XE_BO_CREATE_VRAM0_BIT,
.attach_ops = &xe_dma_buf_attach_ops},
{.mem_mask = XE_BO_CREATE_SYSTEM_BIT | XE_BO_CREATE_VRAM0_BIT,
.attach_ops = &xe_dma_buf_attach_ops,
.force_different_devices = true},
{.mem_mask = XE_BO_CREATE_SYSTEM_BIT | XE_BO_CREATE_VRAM0_BIT,
.attach_ops = &nop2p_attach_ops},
{.mem_mask = XE_BO_CREATE_SYSTEM_BIT | XE_BO_CREATE_VRAM0_BIT,
.attach_ops = &nop2p_attach_ops,
.force_different_devices = true},
{.mem_mask = XE_BO_CREATE_SYSTEM_BIT | XE_BO_CREATE_VRAM0_BIT},
{.mem_mask = XE_BO_CREATE_SYSTEM_BIT | XE_BO_CREATE_VRAM0_BIT,
.force_different_devices = true},
{}
};
static int dma_buf_run_device(struct xe_device *xe)
{
const struct dma_buf_test_params *params;
struct kunit *test = xe_cur_kunit();
for (params = test_params; params->mem_mask; ++params) {
struct dma_buf_test_params p = *params;
p.base.id = XE_TEST_LIVE_DMA_BUF;
test->priv = &p;
xe_test_dmabuf_import_same_driver(xe);
}
/* A non-zero return would halt iteration over driver devices */
return 0;
}
void xe_dma_buf_kunit(struct kunit *test)
{
xe_call_for_each_device(dma_buf_run_device);
}
EXPORT_SYMBOL(xe_dma_buf_kunit);

View File

@ -0,0 +1,23 @@
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright © 2022 Intel Corporation
*/
#include <kunit/test.h>
void xe_dma_buf_kunit(struct kunit *test);
static struct kunit_case xe_dma_buf_tests[] = {
KUNIT_CASE(xe_dma_buf_kunit),
{}
};
static struct kunit_suite xe_dma_buf_test_suite = {
.name = "xe_dma_buf",
.test_cases = xe_dma_buf_tests,
};
kunit_test_suite(xe_dma_buf_test_suite);
MODULE_AUTHOR("Intel Corporation");
MODULE_LICENSE("GPL");

View File

@ -0,0 +1,378 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2020-2022 Intel Corporation
*/
#include <kunit/test.h>
#include "xe_pci.h"
static bool sanity_fence_failed(struct xe_device *xe, struct dma_fence *fence,
const char *str, struct kunit *test)
{
long ret;
if (IS_ERR(fence)) {
KUNIT_FAIL(test, "Failed to create fence for %s: %li\n", str,
PTR_ERR(fence));
return true;
}
if (!fence)
return true;
ret = dma_fence_wait_timeout(fence, false, 5 * HZ);
if (ret <= 0) {
KUNIT_FAIL(test, "Fence timed out for %s: %li\n", str, ret);
return true;
}
return false;
}
static int run_sanity_job(struct xe_migrate *m, struct xe_device *xe,
struct xe_bb *bb, u32 second_idx, const char *str,
struct kunit *test)
{
struct xe_sched_job *job = xe_bb_create_migration_job(m->eng, bb,
m->batch_base_ofs,
second_idx);
struct dma_fence *fence;
if (IS_ERR(job)) {
KUNIT_FAIL(test, "Failed to allocate fake pt: %li\n",
PTR_ERR(job));
return PTR_ERR(job);
}
xe_sched_job_arm(job);
fence = dma_fence_get(&job->drm.s_fence->finished);
xe_sched_job_push(job);
if (sanity_fence_failed(xe, fence, str, test))
return -ETIMEDOUT;
dma_fence_put(fence);
kunit_info(test, "%s: Job completed\n", str);
return 0;
}
static void
sanity_populate_cb(struct xe_migrate_pt_update *pt_update,
struct xe_gt *gt, struct iosys_map *map, void *dst,
u32 qword_ofs, u32 num_qwords,
const struct xe_vm_pgtable_update *update)
{
int i;
u64 *ptr = dst;
for (i = 0; i < num_qwords; i++)
ptr[i] = (qword_ofs + i - update->ofs) * 0x1111111111111111ULL;
}
static const struct xe_migrate_pt_update_ops sanity_ops = {
.populate = sanity_populate_cb,
};
#define check(_retval, _expected, str, _test) \
do { if ((_retval) != (_expected)) { \
KUNIT_FAIL(_test, "Sanity check failed: " str \
" expected %llx, got %llx\n", \
(u64)(_expected), (u64)(_retval)); \
} } while (0)
static void test_copy(struct xe_migrate *m, struct xe_bo *bo,
struct kunit *test)
{
struct xe_device *xe = gt_to_xe(m->gt);
u64 retval, expected = 0xc0c0c0c0c0c0c0c0ULL;
bool big = bo->size >= SZ_2M;
struct dma_fence *fence;
const char *str = big ? "Copying big bo" : "Copying small bo";
int err;
struct xe_bo *sysmem = xe_bo_create_locked(xe, m->gt, NULL,
bo->size,
ttm_bo_type_kernel,
XE_BO_CREATE_SYSTEM_BIT);
if (IS_ERR(sysmem)) {
KUNIT_FAIL(test, "Failed to allocate sysmem bo for %s: %li\n",
str, PTR_ERR(sysmem));
return;
}
err = xe_bo_validate(sysmem, NULL, false);
if (err) {
KUNIT_FAIL(test, "Failed to validate system bo for %s: %li\n",
str, err);
goto out_unlock;
}
err = xe_bo_vmap(sysmem);
if (err) {
KUNIT_FAIL(test, "Failed to vmap system bo for %s: %li\n",
str, err);
goto out_unlock;
}
xe_map_memset(xe, &sysmem->vmap, 0, 0xd0, sysmem->size);
fence = xe_migrate_clear(m, sysmem, sysmem->ttm.resource, 0xc0c0c0c0);
if (!sanity_fence_failed(xe, fence, big ? "Clearing sysmem big bo" :
"Clearing sysmem small bo", test)) {
retval = xe_map_rd(xe, &sysmem->vmap, 0, u64);
check(retval, expected, "sysmem first offset should be cleared",
test);
retval = xe_map_rd(xe, &sysmem->vmap, sysmem->size - 8, u64);
check(retval, expected, "sysmem last offset should be cleared",
test);
}
dma_fence_put(fence);
/* Try to copy 0xc0 from sysmem to lmem with 2MB or 64KiB/4KiB pages */
xe_map_memset(xe, &sysmem->vmap, 0, 0xc0, sysmem->size);
xe_map_memset(xe, &bo->vmap, 0, 0xd0, bo->size);
fence = xe_migrate_copy(m, sysmem, sysmem->ttm.resource,
bo->ttm.resource);
if (!sanity_fence_failed(xe, fence, big ? "Copying big bo sysmem -> vram" :
"Copying small bo sysmem -> vram", test)) {
retval = xe_map_rd(xe, &bo->vmap, 0, u64);
check(retval, expected,
"sysmem -> vram bo first offset should be copied", test);
retval = xe_map_rd(xe, &bo->vmap, bo->size - 8, u64);
check(retval, expected,
"sysmem -> vram bo offset should be copied", test);
}
dma_fence_put(fence);
/* And other way around.. slightly hacky.. */
xe_map_memset(xe, &sysmem->vmap, 0, 0xd0, sysmem->size);
xe_map_memset(xe, &bo->vmap, 0, 0xc0, bo->size);
fence = xe_migrate_copy(m, sysmem, bo->ttm.resource,
sysmem->ttm.resource);
if (!sanity_fence_failed(xe, fence, big ? "Copying big bo vram -> sysmem" :
"Copying small bo vram -> sysmem", test)) {
retval = xe_map_rd(xe, &sysmem->vmap, 0, u64);
check(retval, expected,
"vram -> sysmem bo first offset should be copied", test);
retval = xe_map_rd(xe, &sysmem->vmap, bo->size - 8, u64);
check(retval, expected,
"vram -> sysmem bo last offset should be copied", test);
}
dma_fence_put(fence);
xe_bo_vunmap(sysmem);
out_unlock:
xe_bo_unlock_no_vm(sysmem);
xe_bo_put(sysmem);
}
static void test_pt_update(struct xe_migrate *m, struct xe_bo *pt,
struct kunit *test)
{
struct xe_device *xe = gt_to_xe(m->gt);
struct dma_fence *fence;
u64 retval, expected;
int i;
struct xe_vm_pgtable_update update = {
.ofs = 1,
.qwords = 0x10,
.pt_bo = pt,
};
struct xe_migrate_pt_update pt_update = {
.ops = &sanity_ops,
};
/* Test xe_migrate_update_pgtables() updates the pagetable as expected */
expected = 0xf0f0f0f0f0f0f0f0ULL;
xe_map_memset(xe, &pt->vmap, 0, (u8)expected, pt->size);
fence = xe_migrate_update_pgtables(m, NULL, NULL, m->eng, &update, 1,
NULL, 0, &pt_update);
if (sanity_fence_failed(xe, fence, "Migration pagetable update", test))
return;
dma_fence_put(fence);
retval = xe_map_rd(xe, &pt->vmap, 0, u64);
check(retval, expected, "PTE[0] must stay untouched", test);
for (i = 0; i < update.qwords; i++) {
retval = xe_map_rd(xe, &pt->vmap, (update.ofs + i) * 8, u64);
check(retval, i * 0x1111111111111111ULL, "PTE update", test);
}
retval = xe_map_rd(xe, &pt->vmap, 8 * (update.ofs + update.qwords),
u64);
check(retval, expected, "PTE[0x11] must stay untouched", test);
}
static void xe_migrate_sanity_test(struct xe_migrate *m, struct kunit *test)
{
struct xe_gt *gt = m->gt;
struct xe_device *xe = gt_to_xe(gt);
struct xe_bo *pt, *bo = m->pt_bo, *big, *tiny;
struct xe_res_cursor src_it;
struct dma_fence *fence;
u64 retval, expected;
struct xe_bb *bb;
int err;
u8 id = gt->info.id;
err = xe_bo_vmap(bo);
if (err) {
KUNIT_FAIL(test, "Failed to vmap our pagetables: %li\n",
PTR_ERR(bo));
return;
}
big = xe_bo_create_pin_map(xe, m->gt, m->eng->vm, SZ_4M,
ttm_bo_type_kernel,
XE_BO_CREATE_VRAM_IF_DGFX(m->gt) |
XE_BO_CREATE_PINNED_BIT);
if (IS_ERR(big)) {
KUNIT_FAIL(test, "Failed to allocate bo: %li\n", PTR_ERR(big));
goto vunmap;
}
pt = xe_bo_create_pin_map(xe, m->gt, m->eng->vm, GEN8_PAGE_SIZE,
ttm_bo_type_kernel,
XE_BO_CREATE_VRAM_IF_DGFX(m->gt) |
XE_BO_CREATE_PINNED_BIT);
if (IS_ERR(pt)) {
KUNIT_FAIL(test, "Failed to allocate fake pt: %li\n",
PTR_ERR(pt));
goto free_big;
}
tiny = xe_bo_create_pin_map(xe, m->gt, m->eng->vm,
2 * SZ_4K,
ttm_bo_type_kernel,
XE_BO_CREATE_VRAM_IF_DGFX(m->gt) |
XE_BO_CREATE_PINNED_BIT);
if (IS_ERR(tiny)) {
KUNIT_FAIL(test, "Failed to allocate fake pt: %li\n",
PTR_ERR(pt));
goto free_pt;
}
bb = xe_bb_new(m->gt, 32, xe->info.supports_usm);
if (IS_ERR(bb)) {
KUNIT_FAIL(test, "Failed to create batchbuffer: %li\n",
PTR_ERR(bb));
goto free_tiny;
}
kunit_info(test, "Starting tests, top level PT addr: %llx, special pagetable base addr: %llx\n",
xe_bo_main_addr(m->eng->vm->pt_root[id]->bo, GEN8_PAGE_SIZE),
xe_bo_main_addr(m->pt_bo, GEN8_PAGE_SIZE));
/* First part of the test, are we updating our pagetable bo with a new entry? */
xe_map_wr(xe, &bo->vmap, GEN8_PAGE_SIZE * (NUM_KERNEL_PDE - 1), u64, 0xdeaddeadbeefbeef);
expected = gen8_pte_encode(NULL, pt, 0, XE_CACHE_WB, 0, 0);
if (m->eng->vm->flags & XE_VM_FLAGS_64K)
expected |= GEN12_PTE_PS64;
xe_res_first(pt->ttm.resource, 0, pt->size, &src_it);
emit_pte(m, bb, NUM_KERNEL_PDE - 1, xe_bo_is_vram(pt),
&src_it, GEN8_PAGE_SIZE, pt);
run_sanity_job(m, xe, bb, bb->len, "Writing PTE for our fake PT", test);
retval = xe_map_rd(xe, &bo->vmap, GEN8_PAGE_SIZE * (NUM_KERNEL_PDE - 1),
u64);
check(retval, expected, "PTE entry write", test);
/* Now try to write data to our newly mapped 'pagetable', see if it succeeds */
bb->len = 0;
bb->cs[bb->len++] = MI_BATCH_BUFFER_END;
xe_map_wr(xe, &pt->vmap, 0, u32, 0xdeaddead);
expected = 0x12345678U;
emit_clear(m->gt, bb, xe_migrate_vm_addr(NUM_KERNEL_PDE - 1, 0), 4, 4,
expected, IS_DGFX(xe));
run_sanity_job(m, xe, bb, 1, "Writing to our newly mapped pagetable",
test);
retval = xe_map_rd(xe, &pt->vmap, 0, u32);
check(retval, expected, "Write to PT after adding PTE", test);
/* Sanity checks passed, try the full ones! */
/* Clear a small bo */
kunit_info(test, "Clearing small buffer object\n");
xe_map_memset(xe, &tiny->vmap, 0, 0x22, tiny->size);
expected = 0x224488ff;
fence = xe_migrate_clear(m, tiny, tiny->ttm.resource, expected);
if (sanity_fence_failed(xe, fence, "Clearing small bo", test))
goto out;
dma_fence_put(fence);
retval = xe_map_rd(xe, &tiny->vmap, 0, u32);
check(retval, expected, "Command clear small first value", test);
retval = xe_map_rd(xe, &tiny->vmap, tiny->size - 4, u32);
check(retval, expected, "Command clear small last value", test);
if (IS_DGFX(xe)) {
kunit_info(test, "Copying small buffer object to system\n");
test_copy(m, tiny, test);
}
/* Clear a big bo with a fixed value */
kunit_info(test, "Clearing big buffer object\n");
xe_map_memset(xe, &big->vmap, 0, 0x11, big->size);
expected = 0x11223344U;
fence = xe_migrate_clear(m, big, big->ttm.resource, expected);
if (sanity_fence_failed(xe, fence, "Clearing big bo", test))
goto out;
dma_fence_put(fence);
retval = xe_map_rd(xe, &big->vmap, 0, u32);
check(retval, expected, "Command clear big first value", test);
retval = xe_map_rd(xe, &big->vmap, big->size - 4, u32);
check(retval, expected, "Command clear big last value", test);
if (IS_DGFX(xe)) {
kunit_info(test, "Copying big buffer object to system\n");
test_copy(m, big, test);
}
test_pt_update(m, pt, test);
out:
xe_bb_free(bb, NULL);
free_tiny:
xe_bo_unpin(tiny);
xe_bo_put(tiny);
free_pt:
xe_bo_unpin(pt);
xe_bo_put(pt);
free_big:
xe_bo_unpin(big);
xe_bo_put(big);
vunmap:
xe_bo_vunmap(m->pt_bo);
}
static int migrate_test_run_device(struct xe_device *xe)
{
struct kunit *test = xe_cur_kunit();
struct xe_gt *gt;
int id;
for_each_gt(gt, xe, id) {
struct xe_migrate *m = gt->migrate;
struct ww_acquire_ctx ww;
kunit_info(test, "Testing gt id %d.\n", id);
xe_vm_lock(m->eng->vm, &ww, 0, true);
xe_migrate_sanity_test(m, test);
xe_vm_unlock(m->eng->vm, &ww);
}
return 0;
}
void xe_migrate_sanity_kunit(struct kunit *test)
{
xe_call_for_each_device(migrate_test_run_device);
}
EXPORT_SYMBOL(xe_migrate_sanity_kunit);

View File

@ -0,0 +1,23 @@
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright © 2022 Intel Corporation
*/
#include <kunit/test.h>
void xe_migrate_sanity_kunit(struct kunit *test);
static struct kunit_case xe_migrate_tests[] = {
KUNIT_CASE(xe_migrate_sanity_kunit),
{}
};
static struct kunit_suite xe_migrate_test_suite = {
.name = "xe_migrate",
.test_cases = xe_migrate_tests,
};
kunit_test_suite(xe_migrate_test_suite);
MODULE_AUTHOR("Intel Corporation");
MODULE_LICENSE("GPL");

View File

@ -0,0 +1,66 @@
/* SPDX-License-Identifier: GPL-2.0 AND MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef __XE_TEST_H__
#define __XE_TEST_H__
#include <linux/types.h>
#if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST)
#include <linux/sched.h>
#include <kunit/test.h>
/*
* For each test that provides a kunit private test structure, place a test id
* here and point kunit->priv to an embedded struct xe_test_priv.
*/
enum xe_test_priv_id {
XE_TEST_LIVE_DMA_BUF,
};
/**
* struct xe_test_priv - Base class for test private info
* @id: enum xe_test_priv_id to identify the subclass.
*/
struct xe_test_priv {
enum xe_test_priv_id id;
};
#define XE_TEST_DECLARE(x) x
#define XE_TEST_ONLY(x) unlikely(x)
#define XE_TEST_EXPORT
#define xe_cur_kunit() current->kunit_test
/**
* xe_cur_kunit_priv - Obtain the struct xe_test_priv pointed to by the
* current kunit test's priv pointer, if it exists and is embedded in the
* expected subclass.
* @id: Id of the expected subclass.
*
* Return: NULL if the process is not a kunit test, or if the current
* kunit->priv pointer does not point to an object of the expected subclass.
* Otherwise a pointer to the embedded struct xe_test_priv.
*/
static inline struct xe_test_priv *
xe_cur_kunit_priv(enum xe_test_priv_id id)
{
struct xe_test_priv *priv;
if (!xe_cur_kunit())
return NULL;
priv = xe_cur_kunit()->priv;
return priv->id == id ? priv : NULL;
}
#else /* if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST) */
#define XE_TEST_DECLARE(x)
#define XE_TEST_ONLY(x) 0
#define XE_TEST_EXPORT static
#define xe_cur_kunit() NULL
#define xe_cur_kunit_priv(_id) NULL
#endif
#endif
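For context, the intended usage pattern of struct xe_test_priv and xe_cur_kunit_priv() looks roughly like the sketch below; the struct, macro and function names are hypothetical and only mirror what the live dma-buf test in this patch does:

/* Sketch only: subclass xe_test_priv and recover it from driver code. */
struct example_test_params {
        struct xe_test_priv base;       /* embedded base, carries the id */
        bool flag_under_test;
};

#define to_example_test_params(_priv) \
        container_of(_priv, struct example_test_params, base)

/*
 * In the kunit test:
 *      struct example_test_params p = { .base.id = XE_TEST_LIVE_DMA_BUF };
 *      test->priv = &p;
 *
 * In driver code built with CONFIG_DRM_XE_KUNIT_TEST:
 */
static inline bool example_flag_under_test(void)
{
        struct xe_test_priv *priv = xe_cur_kunit_priv(XE_TEST_LIVE_DMA_BUF);

        return priv ? to_example_test_params(priv)->flag_under_test : false;
}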

View File

@ -0,0 +1,97 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include "xe_bb.h"
#include "xe_sa.h"
#include "xe_device.h"
#include "xe_engine_types.h"
#include "xe_hw_fence.h"
#include "xe_sched_job.h"
#include "xe_vm_types.h"
#include "gt/intel_gpu_commands.h"
struct xe_bb *xe_bb_new(struct xe_gt *gt, u32 dwords, bool usm)
{
struct xe_bb *bb = kmalloc(sizeof(*bb), GFP_KERNEL);
int err;
if (!bb)
return ERR_PTR(-ENOMEM);
bb->bo = xe_sa_bo_new(!usm ? &gt->kernel_bb_pool :
&gt->usm.bb_pool, 4 * dwords + 4);
if (IS_ERR(bb->bo)) {
err = PTR_ERR(bb->bo);
goto err;
}
bb->cs = xe_sa_bo_cpu_addr(bb->bo);
bb->len = 0;
return bb;
err:
kfree(bb);
return ERR_PTR(err);
}
static struct xe_sched_job *
__xe_bb_create_job(struct xe_engine *kernel_eng, struct xe_bb *bb, u64 *addr)
{
u32 size = drm_suballoc_size(bb->bo);
XE_BUG_ON((bb->len * 4 + 1) > size);
bb->cs[bb->len++] = MI_BATCH_BUFFER_END;
xe_sa_bo_flush_write(bb->bo);
return xe_sched_job_create(kernel_eng, addr);
}
struct xe_sched_job *xe_bb_create_wa_job(struct xe_engine *wa_eng,
struct xe_bb *bb, u64 batch_base_ofs)
{
u64 addr = batch_base_ofs + drm_suballoc_soffset(bb->bo);
XE_BUG_ON(!(wa_eng->vm->flags & XE_VM_FLAG_MIGRATION));
return __xe_bb_create_job(wa_eng, bb, &addr);
}
struct xe_sched_job *xe_bb_create_migration_job(struct xe_engine *kernel_eng,
struct xe_bb *bb,
u64 batch_base_ofs,
u32 second_idx)
{
u64 addr[2] = {
batch_base_ofs + drm_suballoc_soffset(bb->bo),
batch_base_ofs + drm_suballoc_soffset(bb->bo) +
4 * second_idx,
};
BUG_ON(second_idx > bb->len);
BUG_ON(!(kernel_eng->vm->flags & XE_VM_FLAG_MIGRATION));
return __xe_bb_create_job(kernel_eng, bb, addr);
}
struct xe_sched_job *xe_bb_create_job(struct xe_engine *kernel_eng,
struct xe_bb *bb)
{
u64 addr = xe_sa_bo_gpu_addr(bb->bo);
BUG_ON(kernel_eng->vm && kernel_eng->vm->flags & XE_VM_FLAG_MIGRATION);
return __xe_bb_create_job(kernel_eng, bb, &addr);
}
void xe_bb_free(struct xe_bb *bb, struct dma_fence *fence)
{
if (!bb)
return;
xe_sa_bo_free(bb->bo, fence);
kfree(bb);
}

View File

@ -0,0 +1,27 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_BB_H_
#define _XE_BB_H_
#include "xe_bb_types.h"
struct dma_fence;
struct xe_gt;
struct xe_engine;
struct xe_sched_job;
struct xe_bb *xe_bb_new(struct xe_gt *gt, u32 size, bool usm);
struct xe_sched_job *xe_bb_create_job(struct xe_engine *kernel_eng,
struct xe_bb *bb);
struct xe_sched_job *xe_bb_create_migration_job(struct xe_engine *kernel_eng,
struct xe_bb *bb, u64 batch_ofs,
u32 second_idx);
struct xe_sched_job *xe_bb_create_wa_job(struct xe_engine *wa_eng,
struct xe_bb *bb, u64 batch_ofs);
void xe_bb_free(struct xe_bb *bb, struct dma_fence *fence);
#endif

View File

@ -0,0 +1,20 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_BB_TYPES_H_
#define _XE_BB_TYPES_H_
#include <linux/types.h>
struct drm_suballoc;
struct xe_bb {
struct drm_suballoc *bo;
u32 *cs;
u32 len; /* in dwords */
};
#endif

1698
drivers/gpu/drm/xe/xe_bo.c Normal file

File diff suppressed because it is too large

290
drivers/gpu/drm/xe/xe_bo.h Normal file
View File

@ -0,0 +1,290 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2021 Intel Corporation
*/
#ifndef _XE_BO_H_
#define _XE_BO_H_
#include "xe_bo_types.h"
#include "xe_macros.h"
#include "xe_vm_types.h"
#define XE_DEFAULT_GTT_SIZE_MB 3072ULL /* 3GB by default */
#define XE_BO_CREATE_USER_BIT BIT(1)
#define XE_BO_CREATE_SYSTEM_BIT BIT(2)
#define XE_BO_CREATE_VRAM0_BIT BIT(3)
#define XE_BO_CREATE_VRAM1_BIT BIT(4)
#define XE_BO_CREATE_VRAM_IF_DGFX(gt) \
(IS_DGFX(gt_to_xe(gt)) ? XE_BO_CREATE_VRAM0_BIT << gt->info.vram_id : \
XE_BO_CREATE_SYSTEM_BIT)
#define XE_BO_CREATE_GGTT_BIT BIT(5)
#define XE_BO_CREATE_IGNORE_MIN_PAGE_SIZE_BIT BIT(6)
#define XE_BO_CREATE_PINNED_BIT BIT(7)
#define XE_BO_DEFER_BACKING BIT(8)
#define XE_BO_SCANOUT_BIT BIT(9)
/* this one is triggered internally only */
#define XE_BO_INTERNAL_TEST BIT(30)
#define XE_BO_INTERNAL_64K BIT(31)
#define PPAT_UNCACHED GENMASK_ULL(4, 3)
#define PPAT_CACHED_PDE 0
#define PPAT_CACHED BIT_ULL(7)
#define PPAT_DISPLAY_ELLC BIT_ULL(4)
#define GEN8_PTE_SHIFT 12
#define GEN8_PAGE_SIZE (1 << GEN8_PTE_SHIFT)
#define GEN8_PTE_MASK (GEN8_PAGE_SIZE - 1)
#define GEN8_PDE_SHIFT (GEN8_PTE_SHIFT - 3)
#define GEN8_PDES (1 << GEN8_PDE_SHIFT)
#define GEN8_PDE_MASK (GEN8_PDES - 1)
#define GEN8_64K_PTE_SHIFT 16
#define GEN8_64K_PAGE_SIZE (1 << GEN8_64K_PTE_SHIFT)
#define GEN8_64K_PTE_MASK (GEN8_64K_PAGE_SIZE - 1)
#define GEN8_64K_PDE_MASK (GEN8_PDE_MASK >> 4)
#define GEN8_PDE_PS_2M BIT_ULL(7)
#define GEN8_PDPE_PS_1G BIT_ULL(7)
#define GEN8_PDE_IPS_64K BIT_ULL(11)
#define GEN12_GGTT_PTE_LM BIT_ULL(1)
#define GEN12_USM_PPGTT_PTE_AE BIT_ULL(10)
#define GEN12_PPGTT_PTE_LM BIT_ULL(11)
#define GEN12_PDE_64K BIT_ULL(6)
#define GEN12_PTE_PS64 BIT_ULL(8)
#define GEN8_PAGE_PRESENT BIT_ULL(0)
#define GEN8_PAGE_RW BIT_ULL(1)
#define PTE_READ_ONLY BIT(0)
#define XE_PL_SYSTEM TTM_PL_SYSTEM
#define XE_PL_TT TTM_PL_TT
#define XE_PL_VRAM0 TTM_PL_VRAM
#define XE_PL_VRAM1 (XE_PL_VRAM0 + 1)
#define XE_BO_PROPS_INVALID (-1)
struct sg_table;
struct xe_bo *xe_bo_alloc(void);
void xe_bo_free(struct xe_bo *bo);
struct xe_bo *__xe_bo_create_locked(struct xe_device *xe, struct xe_bo *bo,
struct xe_gt *gt, struct dma_resv *resv,
size_t size, enum ttm_bo_type type,
u32 flags);
struct xe_bo *xe_bo_create_locked(struct xe_device *xe, struct xe_gt *gt,
struct xe_vm *vm, size_t size,
enum ttm_bo_type type, u32 flags);
struct xe_bo *xe_bo_create(struct xe_device *xe, struct xe_gt *gt,
struct xe_vm *vm, size_t size,
enum ttm_bo_type type, u32 flags);
struct xe_bo *xe_bo_create_pin_map(struct xe_device *xe, struct xe_gt *gt,
struct xe_vm *vm, size_t size,
enum ttm_bo_type type, u32 flags);
struct xe_bo *xe_bo_create_from_data(struct xe_device *xe, struct xe_gt *gt,
const void *data, size_t size,
enum ttm_bo_type type, u32 flags);
int xe_bo_placement_for_flags(struct xe_device *xe, struct xe_bo *bo,
u32 bo_flags);
static inline struct xe_bo *ttm_to_xe_bo(const struct ttm_buffer_object *bo)
{
return container_of(bo, struct xe_bo, ttm);
}
static inline struct xe_bo *gem_to_xe_bo(const struct drm_gem_object *obj)
{
return container_of(obj, struct xe_bo, ttm.base);
}
#define xe_bo_device(bo) ttm_to_xe_device((bo)->ttm.bdev)
static inline struct xe_bo *xe_bo_get(struct xe_bo *bo)
{
if (bo)
drm_gem_object_get(&bo->ttm.base);
return bo;
}
static inline void xe_bo_put(struct xe_bo *bo)
{
if (bo)
drm_gem_object_put(&bo->ttm.base);
}
static inline void xe_bo_assert_held(struct xe_bo *bo)
{
if (bo)
dma_resv_assert_held((bo)->ttm.base.resv);
}
int xe_bo_lock(struct xe_bo *bo, struct ww_acquire_ctx *ww,
int num_resv, bool intr);
void xe_bo_unlock(struct xe_bo *bo, struct ww_acquire_ctx *ww);
static inline void xe_bo_unlock_vm_held(struct xe_bo *bo)
{
if (bo) {
XE_BUG_ON(bo->vm && bo->ttm.base.resv != &bo->vm->resv);
if (bo->vm)
xe_vm_assert_held(bo->vm);
else
dma_resv_unlock(bo->ttm.base.resv);
}
}
static inline void xe_bo_lock_no_vm(struct xe_bo *bo,
struct ww_acquire_ctx *ctx)
{
if (bo) {
XE_BUG_ON(bo->vm || (bo->ttm.type != ttm_bo_type_sg &&
bo->ttm.base.resv != &bo->ttm.base._resv));
dma_resv_lock(bo->ttm.base.resv, ctx);
}
}
static inline void xe_bo_unlock_no_vm(struct xe_bo *bo)
{
if (bo) {
XE_BUG_ON(bo->vm || (bo->ttm.type != ttm_bo_type_sg &&
bo->ttm.base.resv != &bo->ttm.base._resv));
dma_resv_unlock(bo->ttm.base.resv);
}
}
int xe_bo_pin_external(struct xe_bo *bo);
int xe_bo_pin(struct xe_bo *bo);
void xe_bo_unpin_external(struct xe_bo *bo);
void xe_bo_unpin(struct xe_bo *bo);
int xe_bo_validate(struct xe_bo *bo, struct xe_vm *vm, bool allow_res_evict);
static inline bool xe_bo_is_pinned(struct xe_bo *bo)
{
return bo->ttm.pin_count;
}
static inline void xe_bo_unpin_map_no_vm(struct xe_bo *bo)
{
if (likely(bo)) {
xe_bo_lock_no_vm(bo, NULL);
xe_bo_unpin(bo);
xe_bo_unlock_no_vm(bo);
xe_bo_put(bo);
}
}
bool xe_bo_is_xe_bo(struct ttm_buffer_object *bo);
dma_addr_t xe_bo_addr(struct xe_bo *bo, u64 offset,
size_t page_size, bool *is_lmem);
static inline dma_addr_t
xe_bo_main_addr(struct xe_bo *bo, size_t page_size)
{
bool is_lmem;
return xe_bo_addr(bo, 0, page_size, &is_lmem);
}
static inline u32
xe_bo_ggtt_addr(struct xe_bo *bo)
{
XE_BUG_ON(bo->ggtt_node.size > bo->size);
XE_BUG_ON(bo->ggtt_node.start + bo->ggtt_node.size > (1ull << 32));
return bo->ggtt_node.start;
}
int xe_bo_vmap(struct xe_bo *bo);
void xe_bo_vunmap(struct xe_bo *bo);
bool mem_type_is_vram(u32 mem_type);
bool xe_bo_is_vram(struct xe_bo *bo);
bool xe_bo_can_migrate(struct xe_bo *bo, u32 mem_type);
int xe_bo_migrate(struct xe_bo *bo, u32 mem_type);
int xe_bo_evict(struct xe_bo *bo, bool force_alloc);
extern struct ttm_device_funcs xe_ttm_funcs;
int xe_gem_create_ioctl(struct drm_device *dev, void *data,
struct drm_file *file);
int xe_gem_mmap_offset_ioctl(struct drm_device *dev, void *data,
struct drm_file *file);
int xe_bo_dumb_create(struct drm_file *file_priv,
struct drm_device *dev,
struct drm_mode_create_dumb *args);
bool xe_bo_needs_ccs_pages(struct xe_bo *bo);
static inline size_t xe_bo_ccs_pages_start(struct xe_bo *bo)
{
return PAGE_ALIGN(bo->ttm.base.size);
}
void __xe_bo_release_dummy(struct kref *kref);
/**
* xe_bo_put_deferred() - Put a buffer object with delayed final freeing
* @bo: The bo to put.
* @deferred: List to which to add the buffer object if we cannot put, or
* NULL if the function is to put unconditionally.
*
* Since the final freeing of an object includes both sleeping and (!)
* memory allocation in the dma_resv individualization, it's not ok
* to put an object from atomic context or while holding a lock
* tainted by reclaim. In such situations we want to defer the final
* freeing until we've exited the restricting context, or in the worst
* case to a workqueue.
* This function either puts the object if possible without the refcount
* reaching zero, or adds it to the @deferred list if that was not possible.
* The caller needs to follow up with a call to xe_bo_put_commit() to actually
* put the bo iff this function returns true. It's safe to always
* follow up with a call to xe_bo_put_commit().
* TODO: It's TTM that is the villain here. Perhaps TTM should add an
* interface like this.
*
* Return: true if @bo was the first object put on the @deferred list,
* false otherwise.
*/
static inline bool
xe_bo_put_deferred(struct xe_bo *bo, struct llist_head *deferred)
{
if (!deferred) {
xe_bo_put(bo);
return false;
}
if (!kref_put(&bo->ttm.base.refcount, __xe_bo_release_dummy))
return false;
return llist_add(&bo->freed, deferred);
}
void xe_bo_put_commit(struct llist_head *deferred);
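A minimal sketch of the deferred-put pattern that the xe_bo_put_deferred() comment above describes; the caller and its locking context are hypothetical:

/* Sketch only: put a bo from a context where the final free is unsafe. */
static inline void example_put_bo_under_lock(struct xe_bo *bo,
                                             struct llist_head *deferred)
{
        /*
         * Safe in atomic / reclaim-tainted context: either the refcount
         * stays above zero, or the bo is queued on @deferred.
         */
        xe_bo_put_deferred(bo, deferred);
}

/*
 * Later, from a context that may sleep:
 *      xe_bo_put_commit(deferred);
 */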
struct sg_table *xe_bo_get_sg(struct xe_bo *bo);
#if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST)
/**
* xe_bo_is_mem_type - Whether the bo currently resides in the given
* TTM memory type
* @bo: The bo to check.
* @mem_type: The TTM memory type.
*
* Return: true iff the bo resides in @mem_type, false otherwise.
*/
static inline bool xe_bo_is_mem_type(struct xe_bo *bo, u32 mem_type)
{
xe_bo_assert_held(bo);
return bo->ttm.resource->mem_type == mem_type;
}
#endif
#endif

View File

@ -0,0 +1,179 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_BO_DOC_H_
#define _XE_BO_DOC_H_
/**
* DOC: Buffer Objects (BO)
*
* BO management
* =============
*
* TTM manages (placement, eviction, etc...) all BOs in XE.
*
* BO creation
* ===========
*
* Create a chunk of memory which can be used by the GPU. Placement rules
* (sysmem or a VRAM region) are passed in upon creation. TTM handles placement
* of the BO and can trigger eviction of other BOs to make space for the new BO.
*
* Kernel BOs
* ----------
*
* A kernel BO is created as part of driver load (e.g. uC firmware images, GuC
* ADS, etc...) or a BO created as part of a user operation which requires
* a kernel BO (e.g. engine state, memory for page tables, etc...). These BOs
* are typically mapped in the GGTT (any kernel BO aside from memory for page
* tables is in the GGTT), are pinned (can't move or be evicted at runtime),
* have a vmap (XE can access the memory via the xe_map layer) and have
* contiguous physical memory.
*
* More details of why kernel BOs are pinned and contiguous below.
*
* User BOs
* --------
*
* A user BO is created via the DRM_IOCTL_XE_GEM_CREATE IOCTL. Once it is
* created the BO can be mmap'd (via DRM_IOCTL_XE_GEM_MMAP_OFFSET) for user
* access and it can be bound for GPU access (via DRM_IOCTL_XE_VM_BIND). All
* user BOs are evictable and user BOs are never pinned by XE. The allocation of
* the backing store can be deferred from creation time until first use, which is
* either mmap, bind, or pagefault.
*
* Private BOs
* ~~~~~~~~~~~
*
* A private BO is a user BO created with a valid VM argument passed into the
* create IOCTL. If a BO is private it cannot be exported via prime FD and
* mappings can only be created for the BO within the VM it is tied to. Lastly,
* the BO dma-resv slots / lock point to the VM's dma-resv slots / lock (all
* private BOs to a VM share common dma-resv slots / lock).
*
* External BOs
* ~~~~~~~~~~~~
*
* An external BO is a user BO created with a NULL VM argument passed into the
* create IOCTL. An external BO can be shared with different UMDs / devices via
* prime FD and the BO can be mapped into multiple VMs. An external BO has its
* own unique dma-resv slots / lock. An external BO will be in an array of every
* VM which has a mapping of the BO. This allows VMs to look up and lock all
* external BOs mapped in the VM as needed.
*
* BO placement
* ~~~~~~~~~~~~
*
* When a user BO is created, a mask of valid placements is passed indicating
* which memory regions are considered valid.
*
* The memory region information is available via query uAPI (TODO: add link).
*
* BO validation
* =============
*
* BO validation (ttm_bo_validate) refers to ensuring a BO has a valid
* placement. If a BO was swapped to temporary storage, a validation call will
* trigger a move back to a valid (location where GPU can access BO) placement.
* Validation of a BO may evict other BOs to make room for the BO being
* validated.
*
* BO eviction / moving
* ====================
*
* All eviction (or in other words, moving a BO from one memory location to
* another) is routed through TTM with a callback into XE.
*
* Runtime eviction
* ----------------
*
* Runtime eviction refers to normal operation, where TTM decides it needs to
* move a BO. Typically this is because TTM needs to make room for another BO
* and the evicted BO is the first BO on the LRU list that is not locked.
*
* An example of this is a new BO which can only be placed in VRAM but there is
* no space left in VRAM. There could be multiple BOs which have sysmem and VRAM
* placement rules and currently reside in VRAM; TTM will trigger a move of
* one (or multiple) of these BO(s) until there is room in VRAM to place the new
* BO. The evicted BO(s) are valid but still need new bindings before the BO is
* used again (exec or compute mode rebind worker).
*
* Another example would be TTM not finding a BO to evict which has another
* valid placement. In this case TTM will evict one (or multiple) unlocked BO(s)
* to a temporary unreachable (invalid) placement. The evicted BO(s) are invalid
* and before next use need to be moved to a valid placement and rebound.
*
* In both cases, moves of these BOs are scheduled behind the fences in the BO's
* dma-resv slots.
*
* WW locking tries to ensure that if 2 VMs use 51% of the memory, forward
* progress is still made on both VMs.
*
* Runtime eviction uses a per-GT migration engine (TODO: link to migration
* engine doc) to do a GPU memcpy from one location to another.
*
* Rebinds after runtime eviction
* ------------------------------
*
* When BOs are moved, every mapping (VMA) of the BO needs to be rebound before
* the BO is used again. Every VMA is added to an evicted list of its VM when
* the BO is moved. This is safe because of the VM locking structure (TODO: link
* to VM locking doc). On the next use of a VM (exec or compute mode rebind
* worker) the evicted VMA list is checked and rebinds are triggered. In the
* case of faulting VM, the rebind is done in the page fault handler.
*
* Suspend / resume eviction of VRAM
* ---------------------------------
*
* During device suspend / resume VRAM may lose power which means the contents
* of VRAM are blown away. Thus BOs present in VRAM at the time of
* suspend must be moved to sysmem in order for their contents to be saved.
*
* A simple TTM call (ttm_resource_manager_evict_all) can move all non-pinned
* (user) BOs to sysmem. External BOs that are pinned need to be manually
* evicted with a simple loop + xe_bo_evict call. It gets a little trickier
* with kernel BOs.
*
* Some kernel BOs are used by the GT migration engine to do moves, thus we
* can't move all of the BOs via the GT migration engine. For simplicity, use a
* TTM memcpy (CPU) to move any kernel (pinned) BO on either suspend or resume.
*
* Some kernel BOs need to be restored to the exact same physical location. TTM
* makes this rather easy but the caveat is the memory must be contiguous. Again
* for simplicity, we enforce that all kernel (pinned) BOs are contiguous and
* restored to the same physical location.
*
* Pinned external BOs in VRAM are restored on resume via the GPU.
*
* Rebinds after suspend / resume
* ------------------------------
*
* Most kernel BOs have GGTT mappings which must be restored during the resume
* process. All user BOs are rebound after validation on their next use.
*
* Future work
* ===========
*
* Trim the list of BOs which is saved / restored via TTM memcpy on suspend /
* resume. All we really need to save / restore via TTM memcpy is the memory
* required for the GuC to load and the memory for the GT migrate engine to
* operate.
*
* Do not require kernel BOs to be contiguous in physical memory / restored to
* the same physical address on resume. In all likelihood the only memory that
* needs to be restored to the same physical address is memory used for page
* tables. All of that memory is allocated 1 page at a time so the contiguous
* requirement isn't needed. Some work on the vmap code would also need to be
* done if kernel BOs are not contiguous.
*
* Make some kernel BOs evictable rather than pinned. An example of this would be
* engine state; in all likelihood, if the dma-resv slots of these BOs were
* properly used rather than pinning, we could safely evict + rebind these BOs as
* needed.
*
* Some kernel BOs do not need to be restored on resume (e.g. GuC ADS as that is
* repopulated on resume), add flag to mark such objects as no save / restore.
*/
#endif
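Tying the "Kernel BOs" description above back to the helpers declared in xe_bo.h earlier in this patch, a pinned, GGTT-mapped kernel BO might be created roughly as in the sketch below; the wrapper function and the exact flag combination are illustrative assumptions, not code from the patch:

/* Sketch only: a pinned, GGTT-mapped, vmapped kernel BO as described above. */
static struct xe_bo *example_create_kernel_bo(struct xe_device *xe,
                                              struct xe_gt *gt, size_t size)
{
        return xe_bo_create_pin_map(xe, gt, NULL, size, ttm_bo_type_kernel,
                                    XE_BO_CREATE_VRAM_IF_DGFX(gt) |
                                    XE_BO_CREATE_GGTT_BIT |
                                    XE_BO_CREATE_PINNED_BIT);
}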

View File

@ -0,0 +1,225 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include "xe_bo.h"
#include "xe_bo_evict.h"
#include "xe_device.h"
#include "xe_ggtt.h"
#include "xe_gt.h"
/**
* xe_bo_evict_all - evict all BOs from VRAM
*
* @xe: xe device
*
* Evict non-pinned user BOs first (via GPU), evict pinned external BOs next
* (via GPU), wait for evictions, and finally evict pinned kernel BOs via CPU.
* All eviction magic done via TTM calls.
*
* Evict == move VRAM BOs to temporary (typically system) memory.
*
* This function should be called before the device goes into a suspend state
* where the VRAM loses power.
*/
int xe_bo_evict_all(struct xe_device *xe)
{
struct ttm_device *bdev = &xe->ttm;
struct ww_acquire_ctx ww;
struct xe_bo *bo;
struct xe_gt *gt;
struct list_head still_in_list;
u32 mem_type;
u8 id;
int ret;
if (!IS_DGFX(xe))
return 0;
/* User memory */
for (mem_type = XE_PL_VRAM0; mem_type <= XE_PL_VRAM1; ++mem_type) {
struct ttm_resource_manager *man =
ttm_manager_type(bdev, mem_type);
if (man) {
ret = ttm_resource_manager_evict_all(bdev, man);
if (ret)
return ret;
}
}
/* Pinned user memory in VRAM */
INIT_LIST_HEAD(&still_in_list);
spin_lock(&xe->pinned.lock);
for (;;) {
bo = list_first_entry_or_null(&xe->pinned.external_vram,
typeof(*bo), pinned_link);
if (!bo)
break;
xe_bo_get(bo);
list_move_tail(&bo->pinned_link, &still_in_list);
spin_unlock(&xe->pinned.lock);
xe_bo_lock(bo, &ww, 0, false);
ret = xe_bo_evict(bo, true);
xe_bo_unlock(bo, &ww);
xe_bo_put(bo);
if (ret) {
spin_lock(&xe->pinned.lock);
list_splice_tail(&still_in_list,
&xe->pinned.external_vram);
spin_unlock(&xe->pinned.lock);
return ret;
}
spin_lock(&xe->pinned.lock);
}
list_splice_tail(&still_in_list, &xe->pinned.external_vram);
spin_unlock(&xe->pinned.lock);
/*
* Wait for all user BOs to be evicted as those evictions depend on the
* memory moved below.
*/
for_each_gt(gt, xe, id)
xe_gt_migrate_wait(gt);
spin_lock(&xe->pinned.lock);
for (;;) {
bo = list_first_entry_or_null(&xe->pinned.kernel_bo_present,
typeof(*bo), pinned_link);
if (!bo)
break;
xe_bo_get(bo);
list_move_tail(&bo->pinned_link, &xe->pinned.evicted);
spin_unlock(&xe->pinned.lock);
xe_bo_lock(bo, &ww, 0, false);
ret = xe_bo_evict(bo, true);
xe_bo_unlock(bo, &ww);
xe_bo_put(bo);
if (ret)
return ret;
spin_lock(&xe->pinned.lock);
}
spin_unlock(&xe->pinned.lock);
return 0;
}
/**
* xe_bo_restore_kernel - restore kernel BOs to VRAM
*
* @xe: xe device
*
* Move kernel BOs from temporary (typically system) memory to VRAM via CPU. All
* moves done via TTM calls.
*
* This function should be called early, before trying to init the GT, on device
* resume.
*/
int xe_bo_restore_kernel(struct xe_device *xe)
{
struct ww_acquire_ctx ww;
struct xe_bo *bo;
int ret;
if (!IS_DGFX(xe))
return 0;
spin_lock(&xe->pinned.lock);
for (;;) {
bo = list_first_entry_or_null(&xe->pinned.evicted,
typeof(*bo), pinned_link);
if (!bo)
break;
xe_bo_get(bo);
list_move_tail(&bo->pinned_link, &xe->pinned.kernel_bo_present);
spin_unlock(&xe->pinned.lock);
xe_bo_lock(bo, &ww, 0, false);
ret = xe_bo_validate(bo, NULL, false);
xe_bo_unlock(bo, &ww);
if (ret) {
xe_bo_put(bo);
return ret;
}
if (bo->flags & XE_BO_CREATE_GGTT_BIT)
xe_ggtt_map_bo(bo->gt->mem.ggtt, bo);
/*
* We expect validate to trigger a move to VRAM and our move code
* should set up the iosys map.
*/
XE_BUG_ON(iosys_map_is_null(&bo->vmap));
XE_BUG_ON(!xe_bo_is_vram(bo));
xe_bo_put(bo);
spin_lock(&xe->pinned.lock);
}
spin_unlock(&xe->pinned.lock);
return 0;
}
/**
* xe_bo_restore_user - restore pinned user BOs to VRAM
*
* @xe: xe device
*
* Move pinned user BOs from temporary (typically system) memory to VRAM via
* CPU. All moves done via TTM calls.
*
* This function should be called late, after GT init, on device resume.
*/
int xe_bo_restore_user(struct xe_device *xe)
{
struct ww_acquire_ctx ww;
struct xe_bo *bo;
struct xe_gt *gt;
struct list_head still_in_list;
u8 id;
int ret;
if (!IS_DGFX(xe))
return 0;
/* Pinned user memory in VRAM should be validated on resume */
INIT_LIST_HEAD(&still_in_list);
spin_lock(&xe->pinned.lock);
for (;;) {
bo = list_first_entry_or_null(&xe->pinned.external_vram,
typeof(*bo), pinned_link);
if (!bo)
break;
list_move_tail(&bo->pinned_link, &still_in_list);
xe_bo_get(bo);
spin_unlock(&xe->pinned.lock);
xe_bo_lock(bo, &ww, 0, false);
ret = xe_bo_validate(bo, NULL, false);
xe_bo_unlock(bo, &ww);
xe_bo_put(bo);
if (ret) {
spin_lock(&xe->pinned.lock);
list_splice_tail(&still_in_list,
&xe->pinned.external_vram);
spin_unlock(&xe->pinned.lock);
return ret;
}
spin_lock(&xe->pinned.lock);
}
list_splice_tail(&still_in_list, &xe->pinned.external_vram);
spin_unlock(&xe->pinned.lock);
/* Wait for validate to complete */
for_each_gt(gt, xe, id)
xe_gt_migrate_wait(gt);
return 0;
}

View File

@ -0,0 +1,15 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_BO_EVICT_H_
#define _XE_BO_EVICT_H_
struct xe_device;
int xe_bo_evict_all(struct xe_device *xe);
int xe_bo_restore_kernel(struct xe_device *xe);
int xe_bo_restore_user(struct xe_device *xe);
#endif

View File

@ -0,0 +1,73 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_BO_TYPES_H_
#define _XE_BO_TYPES_H_
#include <linux/iosys-map.h>
#include <drm/drm_mm.h>
#include <drm/ttm/ttm_bo.h>
#include <drm/ttm/ttm_device.h>
#include <drm/ttm/ttm_execbuf_util.h>
#include <drm/ttm/ttm_placement.h>
struct xe_device;
struct xe_vm;
#define XE_BO_MAX_PLACEMENTS 3
/** struct xe_bo - XE buffer object */
struct xe_bo {
/** @ttm: TTM base buffer object */
struct ttm_buffer_object ttm;
/** @size: Size of this buffer object */
size_t size;
/** @flags: flags for this buffer object */
u32 flags;
/** @vm: VM this BO is attached to, for extobj this will be NULL */
struct xe_vm *vm;
/** @gt: GT this BO is attached to (kernel BO only) */
struct xe_gt *gt;
/** @vmas: List of VMAs for this BO */
struct list_head vmas;
/** @placements: valid placements for this BO */
struct ttm_place placements[XE_BO_MAX_PLACEMENTS];
/** @placement: current placement for this BO */
struct ttm_placement placement;
/** @ggtt_node: GGTT node if this BO is mapped in the GGTT */
struct drm_mm_node ggtt_node;
/** @vmap: iosys map of this buffer */
struct iosys_map vmap;
/** @kmap: TTM BO kmap object for internal use only. Keep off. */
struct ttm_bo_kmap_obj kmap;
/** @pinned_link: link to present / evicted list of pinned BO */
struct list_head pinned_link;
/** @props: BO user controlled properties */
struct {
/** @preferred_mem_class: preferred memory class for this BO */
s16 preferred_mem_class;
/** @preferred_gt: preferred GT for this BO */
s16 preferred_gt;
/** @preferred_mem_type: preferred memory type */
s32 preferred_mem_type;
/**
* @cpu_atomic: the CPU expects to do atomic operations on
* this BO
*/
bool cpu_atomic;
/**
* @device_atomic: the device expects to do atomic operations
* on this BO
*/
bool device_atomic;
} props;
/** @freed: List node for delayed put. */
struct llist_node freed;
/** @created: Whether the bo has passed initial creation */
bool created;
};
#endif

View File

@ -0,0 +1,129 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include <linux/string_helpers.h>
#include <drm/drm_debugfs.h>
#include "xe_bo.h"
#include "xe_device.h"
#include "xe_debugfs.h"
#include "xe_gt_debugfs.h"
#include "xe_step.h"
#ifdef CONFIG_DRM_XE_DEBUG
#include "xe_bo_evict.h"
#include "xe_migrate.h"
#include "xe_vm.h"
#endif
static struct xe_device *node_to_xe(struct drm_info_node *node)
{
return to_xe_device(node->minor->dev);
}
static int info(struct seq_file *m, void *data)
{
struct xe_device *xe = node_to_xe(m->private);
struct drm_printer p = drm_seq_file_printer(m);
struct xe_gt *gt;
u8 id;
drm_printf(&p, "graphics_verx100 %d\n", xe->info.graphics_verx100);
drm_printf(&p, "media_verx100 %d\n", xe->info.media_verx100);
drm_printf(&p, "stepping G:%s M:%s D:%s B:%s\n",
xe_step_name(xe->info.step.graphics),
xe_step_name(xe->info.step.media),
xe_step_name(xe->info.step.display),
xe_step_name(xe->info.step.basedie));
drm_printf(&p, "is_dgfx %s\n", str_yes_no(xe->info.is_dgfx));
drm_printf(&p, "platform %d\n", xe->info.platform);
drm_printf(&p, "subplatform %d\n",
xe->info.subplatform > XE_SUBPLATFORM_NONE ? xe->info.subplatform : 0);
drm_printf(&p, "devid 0x%x\n", xe->info.devid);
drm_printf(&p, "revid %d\n", xe->info.revid);
drm_printf(&p, "tile_count %d\n", xe->info.tile_count);
drm_printf(&p, "vm_max_level %d\n", xe->info.vm_max_level);
drm_printf(&p, "enable_guc %s\n", str_yes_no(xe->info.enable_guc));
drm_printf(&p, "supports_usm %s\n", str_yes_no(xe->info.supports_usm));
drm_printf(&p, "has_flat_ccs %s\n", str_yes_no(xe->info.has_flat_ccs));
for_each_gt(gt, xe, id) {
drm_printf(&p, "gt%d force wake %d\n", id,
xe_force_wake_ref(gt_to_fw(gt), XE_FW_GT));
drm_printf(&p, "gt%d engine_mask 0x%llx\n", id,
gt->info.engine_mask);
}
return 0;
}
static const struct drm_info_list debugfs_list[] = {
{"info", info, 0},
};
static int forcewake_open(struct inode *inode, struct file *file)
{
struct xe_device *xe = inode->i_private;
struct xe_gt *gt;
u8 id;
for_each_gt(gt, xe, id)
XE_WARN_ON(xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL));
return 0;
}
static int forcewake_release(struct inode *inode, struct file *file)
{
struct xe_device *xe = inode->i_private;
struct xe_gt *gt;
u8 id;
for_each_gt(gt, xe, id)
XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
return 0;
}
static const struct file_operations forcewake_all_fops = {
.owner = THIS_MODULE,
.open = forcewake_open,
.release = forcewake_release,
};
void xe_debugfs_register(struct xe_device *xe)
{
struct ttm_device *bdev = &xe->ttm;
struct drm_minor *minor = xe->drm.primary;
struct dentry *root = minor->debugfs_root;
struct ttm_resource_manager *man;
struct xe_gt *gt;
u32 mem_type;
u8 id;
drm_debugfs_create_files(debugfs_list,
ARRAY_SIZE(debugfs_list),
root, minor);
debugfs_create_file("forcewake_all", 0400, root, xe,
&forcewake_all_fops);
for (mem_type = XE_PL_VRAM0; mem_type <= XE_PL_VRAM1; ++mem_type) {
man = ttm_manager_type(bdev, mem_type);
if (man) {
char name[16];
sprintf(name, "vram%d_mm", mem_type - XE_PL_VRAM0);
ttm_resource_manager_create_debugfs(man, root, name);
}
}
man = ttm_manager_type(bdev, XE_PL_TT);
ttm_resource_manager_create_debugfs(man, root, "gtt_mm");
for_each_gt(gt, xe, id)
xe_gt_debugfs_register(gt);
}

View File

@ -0,0 +1,13 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_DEBUGFS_H_
#define _XE_DEBUGFS_H_
struct xe_device;
void xe_debugfs_register(struct xe_device *xe);
#endif

View File

@ -0,0 +1,359 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2021 Intel Corporation
*/
#include "xe_device.h"
#include <drm/drm_gem_ttm_helper.h>
#include <drm/drm_aperture.h>
#include <drm/drm_ioctl.h>
#include <drm/xe_drm.h>
#include <drm/drm_managed.h>
#include <drm/drm_atomic_helper.h>
#include "xe_bo.h"
#include "xe_debugfs.h"
#include "xe_dma_buf.h"
#include "xe_drv.h"
#include "xe_engine.h"
#include "xe_exec.h"
#include "xe_gt.h"
#include "xe_irq.h"
#include "xe_module.h"
#include "xe_mmio.h"
#include "xe_pcode.h"
#include "xe_pm.h"
#include "xe_query.h"
#include "xe_vm.h"
#include "xe_vm_madvise.h"
#include "xe_wait_user_fence.h"
static int xe_file_open(struct drm_device *dev, struct drm_file *file)
{
struct xe_file *xef;
xef = kzalloc(sizeof(*xef), GFP_KERNEL);
if (!xef)
return -ENOMEM;
xef->drm = file;
mutex_init(&xef->vm.lock);
xa_init_flags(&xef->vm.xa, XA_FLAGS_ALLOC1);
mutex_init(&xef->engine.lock);
xa_init_flags(&xef->engine.xa, XA_FLAGS_ALLOC1);
file->driver_priv = xef;
return 0;
}
static void device_kill_persitent_engines(struct xe_device *xe,
struct xe_file *xef);
static void xe_file_close(struct drm_device *dev, struct drm_file *file)
{
struct xe_device *xe = to_xe_device(dev);
struct xe_file *xef = file->driver_priv;
struct xe_vm *vm;
struct xe_engine *e;
unsigned long idx;
mutex_lock(&xef->engine.lock);
xa_for_each(&xef->engine.xa, idx, e) {
xe_engine_kill(e);
xe_engine_put(e);
}
mutex_unlock(&xef->engine.lock);
mutex_destroy(&xef->engine.lock);
device_kill_persitent_engines(xe, xef);
mutex_lock(&xef->vm.lock);
xa_for_each(&xef->vm.xa, idx, vm)
xe_vm_close_and_put(vm);
mutex_unlock(&xef->vm.lock);
mutex_destroy(&xef->vm.lock);
kfree(xef);
}
static const struct drm_ioctl_desc xe_ioctls[] = {
DRM_IOCTL_DEF_DRV(XE_DEVICE_QUERY, xe_query_ioctl, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(XE_GEM_CREATE, xe_gem_create_ioctl, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(XE_GEM_MMAP_OFFSET, xe_gem_mmap_offset_ioctl,
DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(XE_VM_CREATE, xe_vm_create_ioctl, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(XE_VM_DESTROY, xe_vm_destroy_ioctl, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(XE_VM_BIND, xe_vm_bind_ioctl, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(XE_ENGINE_CREATE, xe_engine_create_ioctl,
DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(XE_ENGINE_DESTROY, xe_engine_destroy_ioctl,
DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(XE_EXEC, xe_exec_ioctl, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(XE_MMIO, xe_mmio_ioctl, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(XE_ENGINE_SET_PROPERTY, xe_engine_set_property_ioctl,
DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(XE_WAIT_USER_FENCE, xe_wait_user_fence_ioctl,
DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(XE_VM_MADVISE, xe_vm_madvise_ioctl, DRM_RENDER_ALLOW),
};
static const struct file_operations xe_driver_fops = {
.owner = THIS_MODULE,
.open = drm_open,
.release = drm_release_noglobal,
.unlocked_ioctl = drm_ioctl,
.mmap = drm_gem_mmap,
.poll = drm_poll,
.read = drm_read,
// .compat_ioctl = i915_ioc32_compat_ioctl,
.llseek = noop_llseek,
};
static void xe_driver_release(struct drm_device *dev)
{
struct xe_device *xe = to_xe_device(dev);
pci_set_drvdata(to_pci_dev(xe->drm.dev), NULL);
}
static struct drm_driver driver = {
/* Don't use MTRRs here; the Xserver or userspace app should
* deal with them for Intel hardware.
*/
.driver_features =
DRIVER_GEM |
DRIVER_RENDER | DRIVER_SYNCOBJ |
DRIVER_SYNCOBJ_TIMELINE,
.open = xe_file_open,
.postclose = xe_file_close,
.gem_prime_import = xe_gem_prime_import,
.dumb_create = xe_bo_dumb_create,
.dumb_map_offset = drm_gem_ttm_dumb_map_offset,
.release = &xe_driver_release,
.ioctls = xe_ioctls,
.num_ioctls = ARRAY_SIZE(xe_ioctls),
.fops = &xe_driver_fops,
.name = DRIVER_NAME,
.desc = DRIVER_DESC,
.date = DRIVER_DATE,
.major = DRIVER_MAJOR,
.minor = DRIVER_MINOR,
.patchlevel = DRIVER_PATCHLEVEL,
};
static void xe_device_destroy(struct drm_device *dev, void *dummy)
{
struct xe_device *xe = to_xe_device(dev);
destroy_workqueue(xe->ordered_wq);
mutex_destroy(&xe->persitent_engines.lock);
ttm_device_fini(&xe->ttm);
}
struct xe_device *xe_device_create(struct pci_dev *pdev,
const struct pci_device_id *ent)
{
struct xe_device *xe;
int err;
err = drm_aperture_remove_conflicting_pci_framebuffers(pdev, &driver);
if (err)
return ERR_PTR(err);
xe = devm_drm_dev_alloc(&pdev->dev, &driver, struct xe_device, drm);
if (IS_ERR(xe))
return xe;
err = ttm_device_init(&xe->ttm, &xe_ttm_funcs, xe->drm.dev,
xe->drm.anon_inode->i_mapping,
xe->drm.vma_offset_manager, false, false);
if (WARN_ON(err))
goto err_put;
xe->info.devid = pdev->device;
xe->info.revid = pdev->revision;
xe->info.enable_guc = enable_guc;
spin_lock_init(&xe->irq.lock);
init_waitqueue_head(&xe->ufence_wq);
mutex_init(&xe->usm.lock);
xa_init_flags(&xe->usm.asid_to_vm, XA_FLAGS_ALLOC1);
mutex_init(&xe->persitent_engines.lock);
INIT_LIST_HEAD(&xe->persitent_engines.list);
spin_lock_init(&xe->pinned.lock);
INIT_LIST_HEAD(&xe->pinned.kernel_bo_present);
INIT_LIST_HEAD(&xe->pinned.external_vram);
INIT_LIST_HEAD(&xe->pinned.evicted);
xe->ordered_wq = alloc_ordered_workqueue("xe-ordered-wq", 0);
mutex_init(&xe->sb_lock);
xe->enabled_irq_mask = ~0;
err = drmm_add_action_or_reset(&xe->drm, xe_device_destroy, NULL);
if (err)
goto err_put;
mutex_init(&xe->mem_access.lock);
return xe;
err_put:
drm_dev_put(&xe->drm);
return ERR_PTR(err);
}
int xe_device_probe(struct xe_device *xe)
{
struct xe_gt *gt;
int err;
u8 id;
xe->info.mem_region_mask = 1;
for_each_gt(gt, xe, id) {
err = xe_gt_alloc(xe, gt);
if (err)
return err;
}
err = xe_mmio_init(xe);
if (err)
return err;
for_each_gt(gt, xe, id) {
err = xe_pcode_probe(gt);
if (err)
return err;
}
err = xe_irq_install(xe);
if (err)
return err;
for_each_gt(gt, xe, id) {
err = xe_gt_init_early(gt);
if (err)
goto err_irq_shutdown;
}
err = xe_mmio_probe_vram(xe);
if (err)
goto err_irq_shutdown;
for_each_gt(gt, xe, id) {
err = xe_gt_init_noalloc(gt);
if (err)
goto err_irq_shutdown;
}
for_each_gt(gt, xe, id) {
err = xe_gt_init(gt);
if (err)
goto err_irq_shutdown;
}
err = drm_dev_register(&xe->drm, 0);
if (err)
goto err_irq_shutdown;
xe_debugfs_register(xe);
return 0;
err_irq_shutdown:
xe_irq_shutdown(xe);
return err;
}
void xe_device_remove(struct xe_device *xe)
{
xe_irq_shutdown(xe);
}
void xe_device_shutdown(struct xe_device *xe)
{
}
void xe_device_add_persitent_engines(struct xe_device *xe, struct xe_engine *e)
{
mutex_lock(&xe->persitent_engines.lock);
list_add_tail(&e->persitent.link, &xe->persitent_engines.list);
mutex_unlock(&xe->persitent_engines.lock);
}
void xe_device_remove_persitent_engines(struct xe_device *xe,
struct xe_engine *e)
{
mutex_lock(&xe->persitent_engines.lock);
if (!list_empty(&e->persitent.link))
list_del(&e->persitent.link);
mutex_unlock(&xe->persitent_engines.lock);
}
static void device_kill_persitent_engines(struct xe_device *xe,
struct xe_file *xef)
{
struct xe_engine *e, *next;
mutex_lock(&xe->persitent_engines.lock);
list_for_each_entry_safe(e, next, &xe->persitent_engines.list,
persitent.link)
if (e->persitent.xef == xef) {
xe_engine_kill(e);
list_del_init(&e->persitent.link);
}
mutex_unlock(&xe->persitent_engines.lock);
}
#define SOFTWARE_FLAGS_SPR33 _MMIO(0x4F084)
void xe_device_wmb(struct xe_device *xe)
{
struct xe_gt *gt = xe_device_get_gt(xe, 0);
wmb();
if (IS_DGFX(xe))
xe_mmio_write32(gt, SOFTWARE_FLAGS_SPR33.reg, 0);
}
u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size)
{
return xe_device_has_flat_ccs(xe) ?
DIV_ROUND_UP(size, NUM_BYTES_PER_CCS_BYTE) : 0;
}
void xe_device_mem_access_get(struct xe_device *xe)
{
bool resumed = xe_pm_runtime_resume_if_suspended(xe);
mutex_lock(&xe->mem_access.lock);
if (xe->mem_access.ref++ == 0)
xe->mem_access.hold_rpm = xe_pm_runtime_get_if_active(xe);
mutex_unlock(&xe->mem_access.lock);
/* The usage counter increased if device was immediately resumed */
if (resumed)
xe_pm_runtime_put(xe);
XE_WARN_ON(xe->mem_access.ref == U32_MAX);
}
void xe_device_mem_access_put(struct xe_device *xe)
{
mutex_lock(&xe->mem_access.lock);
if (--xe->mem_access.ref == 0 && xe->mem_access.hold_rpm)
xe_pm_runtime_put(xe);
mutex_unlock(&xe->mem_access.lock);
XE_WARN_ON(xe->mem_access.ref < 0);
}
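
The mem_access get/put pair above is the guard callers are expected to wrap around any CPU access to device memory or MMIO. A minimal usage sketch, assuming a hypothetical caller (example_poke_device is not part of this patch):

static void example_poke_device(struct xe_device *xe)
{
	/* Take a memory-access reference; this may also grab a runtime PM
	 * reference so the device stays awake while we touch it.
	 */
	xe_device_mem_access_get(xe);

	/* ... MMIO writes / CPU access to VRAM mappings go here ... */
	xe_device_assert_mem_access(xe);

	xe_device_mem_access_put(xe);
}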

View File

@ -0,0 +1,126 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2021 Intel Corporation
*/
#ifndef _XE_DEVICE_H_
#define _XE_DEVICE_H_
struct xe_engine;
struct xe_file;
#include <drm/drm_util.h>
#include "xe_device_types.h"
#include "xe_macros.h"
#include "xe_force_wake.h"
#include "gt/intel_gpu_commands.h"
static inline struct xe_device *to_xe_device(const struct drm_device *dev)
{
return container_of(dev, struct xe_device, drm);
}
static inline struct xe_device *pdev_to_xe_device(struct pci_dev *pdev)
{
return pci_get_drvdata(pdev);
}
static inline struct xe_device *ttm_to_xe_device(struct ttm_device *ttm)
{
return container_of(ttm, struct xe_device, ttm);
}
struct xe_device *xe_device_create(struct pci_dev *pdev,
const struct pci_device_id *ent);
int xe_device_probe(struct xe_device *xe);
void xe_device_remove(struct xe_device *xe);
void xe_device_shutdown(struct xe_device *xe);
void xe_device_add_persitent_engines(struct xe_device *xe, struct xe_engine *e);
void xe_device_remove_persitent_engines(struct xe_device *xe,
struct xe_engine *e);
void xe_device_wmb(struct xe_device *xe);
static inline struct xe_file *to_xe_file(const struct drm_file *file)
{
return file->driver_priv;
}
static inline struct xe_gt *xe_device_get_gt(struct xe_device *xe, u8 gt_id)
{
struct xe_gt *gt;
XE_BUG_ON(gt_id > XE_MAX_GT);
gt = xe->gt + gt_id;
XE_BUG_ON(gt->info.id != gt_id);
XE_BUG_ON(gt->info.type == XE_GT_TYPE_UNINITIALIZED);
return gt;
}
/*
* FIXME: Placeholder until multi-gt lands. Once that lands, kill this function.
*/
static inline struct xe_gt *to_gt(struct xe_device *xe)
{
return xe->gt;
}
static inline bool xe_device_guc_submission_enabled(struct xe_device *xe)
{
return xe->info.enable_guc;
}
static inline void xe_device_guc_submission_disable(struct xe_device *xe)
{
xe->info.enable_guc = false;
}
#define for_each_gt(gt__, xe__, id__) \
for ((id__) = 0; (id__) < (xe__)->info.tile_count; (id__++)) \
for_each_if ((gt__) = xe_device_get_gt((xe__), (id__)))
static inline struct xe_force_wake * gt_to_fw(struct xe_gt *gt)
{
return &gt->mmio.fw;
}
void xe_device_mem_access_get(struct xe_device *xe);
void xe_device_mem_access_put(struct xe_device *xe);
static inline void xe_device_assert_mem_access(struct xe_device *xe)
{
XE_WARN_ON(!xe->mem_access.ref);
}
static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
{
bool ret;
mutex_lock(&xe->mem_access.lock);
ret = xe->mem_access.ref;
mutex_unlock(&xe->mem_access.lock);
return ret;
}
static inline bool xe_device_in_fault_mode(struct xe_device *xe)
{
return xe->usm.num_vm_in_fault_mode != 0;
}
static inline bool xe_device_in_non_fault_mode(struct xe_device *xe)
{
return xe->usm.num_vm_in_non_fault_mode != 0;
}
static inline bool xe_device_has_flat_ccs(struct xe_device *xe)
{
return xe->info.has_flat_ccs;
}
u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size);
#endif
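
Registers in powered-down domains need a forcewake reference before they are touched; the debugfs code earlier in this patch shows the pattern with xe_force_wake_get()/xe_force_wake_put() on gt_to_fw(). A minimal sketch of a guarded read follows; it is illustrative only, and xe_mmio_read32() with this exact signature is an assumption based on the xe_mmio_write32() call seen elsewhere in the patch:

static u32 example_guarded_read(struct xe_gt *gt, u32 reg)
{
	u32 val;

	/* Wake all forcewake domains for the duration of the access. */
	if (xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL))
		return 0;

	val = xe_mmio_read32(gt, reg);	/* assumed accessor */

	XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
	return val;
}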

View File

@ -0,0 +1,214 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_DEVICE_TYPES_H_
#define _XE_DEVICE_TYPES_H_
#include <linux/pci.h>
#include <drm/drm_device.h>
#include <drm/drm_file.h>
#include <drm/ttm/ttm_device.h>
#include "xe_gt_types.h"
#include "xe_platform_types.h"
#include "xe_step_types.h"
#define XE_BO_INVALID_OFFSET LONG_MAX
#define GRAPHICS_VER(xe) ((xe)->info.graphics_verx100 / 100)
#define MEDIA_VER(xe) ((xe)->info.media_verx100 / 100)
#define GRAPHICS_VERx100(xe) ((xe)->info.graphics_verx100)
#define MEDIA_VERx100(xe) ((xe)->info.media_verx100)
#define IS_DGFX(xe) ((xe)->info.is_dgfx)
#define XE_VRAM_FLAGS_NEED64K BIT(0)
#define XE_GT0 0
#define XE_GT1 1
#define XE_MAX_GT (XE_GT1 + 1)
#define XE_MAX_ASID (BIT(20))
#define IS_PLATFORM_STEP(_xe, _platform, min_step, max_step) \
((_xe)->info.platform == (_platform) && \
(_xe)->info.step.graphics >= (min_step) && \
(_xe)->info.step.graphics < (max_step))
#define IS_SUBPLATFORM_STEP(_xe, _platform, sub, min_step, max_step) \
((_xe)->info.platform == (_platform) && \
(_xe)->info.subplatform == (sub) && \
(_xe)->info.step.graphics >= (min_step) && \
(_xe)->info.step.graphics < (max_step))
/**
* struct xe_device - Top level struct of XE device
*/
struct xe_device {
/** @drm: drm device */
struct drm_device drm;
/** @info: device info */
struct intel_device_info {
/** @graphics_verx100: graphics IP version */
u32 graphics_verx100;
/** @media_verx100: media IP version */
u32 media_verx100;
/** @mem_region_mask: mask of valid memory regions */
u32 mem_region_mask;
/** @is_dgfx: is discrete device */
bool is_dgfx;
/** @platform: XE platform enum */
enum xe_platform platform;
/** @subplatform: XE subplatform enum */
enum xe_subplatform subplatform;
/** @devid: device ID */
u16 devid;
/** @revid: device revision */
u8 revid;
/** @step: stepping information for each IP */
struct xe_step_info step;
/** @dma_mask_size: DMA address bits */
u8 dma_mask_size;
/** @vram_flags: Vram flags */
u8 vram_flags;
/** @tile_count: Number of tiles */
u8 tile_count;
/** @vm_max_level: Max VM level */
u8 vm_max_level;
/** @media_ver: Media version */
u8 media_ver;
/** @supports_usm: Supports unified shared memory */
bool supports_usm;
/** @enable_guc: GuC submission enabled */
bool enable_guc;
/** @has_flat_ccs: Whether flat CCS metadata is used */
bool has_flat_ccs;
/** @has_4tile: Whether tile-4 tiling is supported */
bool has_4tile;
} info;
/** @irq: device interrupt state */
struct {
/** @lock: lock for processing irq's on this device */
spinlock_t lock;
/** @enabled: interrupts enabled on this device */
bool enabled;
} irq;
/** @ttm: ttm device */
struct ttm_device ttm;
/** @mmio: mmio info for device */
struct {
/** @size: size of MMIO space for device */
size_t size;
/** @regs: pointer to MMIO space for device */
void *regs;
} mmio;
/** @mem: memory info for device */
struct {
/** @vram: VRAM info for device */
struct {
/** @io_start: start address of VRAM */
resource_size_t io_start;
/** @size: size of VRAM */
resource_size_t size;
/** @mapping: pointer to VRAM mappable space */
void __iomem *mapping;
} vram;
} mem;
/** @usm: unified memory state */
struct {
/** @asid_to_vm: map of ASIDs to VMs */
struct xarray asid_to_vm;
/** @next_asid: next ASID, used to cyclically allocate ASIDs */
u32 next_asid;
/** @num_vm_in_fault_mode: number of VMs in fault mode */
u32 num_vm_in_fault_mode;
/** @num_vm_in_non_fault_mode: number of VMs in non-fault mode */
u32 num_vm_in_non_fault_mode;
/** @lock: protects USM state */
struct mutex lock;
} usm;
/** @persitent_engines: engines that are closed but still running */
struct {
/** @lock: protects the persistent engines list */
struct mutex lock;
/** @list: list of persistent engines */
struct list_head list;
} persitent_engines;
/** @pinned: pinned BO state */
struct {
/** @lock: protects pinned BO list state */
spinlock_t lock;
/** @kernel_bo_present: pinned kernel BOs that are present */
struct list_head kernel_bo_present;
/** @evicted: pinned BOs that have been evicted */
struct list_head evicted;
/** @external_vram: pinned external BOs in VRAM */
struct list_head external_vram;
} pinned;
/** @ufence_wq: user fence wait queue */
wait_queue_head_t ufence_wq;
/** @ordered_wq: used to serialize compute mode resume */
struct workqueue_struct *ordered_wq;
/** @gt: graphics tile */
struct xe_gt gt[XE_MAX_GT];
/**
* @mem_access: keep track of memory access in the device, possibly
* triggering additional actions when they occur.
*/
struct {
/** @lock: protect the ref count */
struct mutex lock;
/** @ref: ref count of memory accesses */
u32 ref;
/** @hold_rpm: need to put rpm ref back at the end */
bool hold_rpm;
} mem_access;
/** @d3cold_allowed: Indicates if d3cold is a valid device state */
bool d3cold_allowed;
/* For pcode */
struct mutex sb_lock;
u32 enabled_irq_mask;
};
/**
* struct xe_file - file handle for XE driver
*/
struct xe_file {
/** @drm: base DRM file */
struct drm_file *drm;
/** @vm: VM state for file */
struct {
/** @xa: xarray to store VMs */
struct xarray xa;
/** @lock: protects file VM state */
struct mutex lock;
} vm;
/** @engine: Submission engine state for file */
struct {
/** @xa: xarray to store engines */
struct xarray xa;
/** @lock: protects file engine state */
struct mutex lock;
} engine;
};
#endif

View File

@ -0,0 +1,307 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include <linux/dma-buf.h>
#include <drm/drm_device.h>
#include <drm/drm_prime.h>
#include <drm/ttm/ttm_tt.h>
#include <kunit/test.h>
#include <linux/pci-p2pdma.h>
#include "tests/xe_test.h"
#include "xe_bo.h"
#include "xe_device.h"
#include "xe_dma_buf.h"
#include "xe_ttm_vram_mgr.h"
#include "xe_vm.h"
MODULE_IMPORT_NS(DMA_BUF);
static int xe_dma_buf_attach(struct dma_buf *dmabuf,
struct dma_buf_attachment *attach)
{
struct drm_gem_object *obj = attach->dmabuf->priv;
if (attach->peer2peer &&
pci_p2pdma_distance(to_pci_dev(obj->dev->dev), attach->dev, false) < 0)
attach->peer2peer = false;
if (!attach->peer2peer && !xe_bo_can_migrate(gem_to_xe_bo(obj), XE_PL_TT))
return -EOPNOTSUPP;
xe_device_mem_access_get(to_xe_device(obj->dev));
return 0;
}
static void xe_dma_buf_detach(struct dma_buf *dmabuf,
struct dma_buf_attachment *attach)
{
struct drm_gem_object *obj = attach->dmabuf->priv;
xe_device_mem_access_put(to_xe_device(obj->dev));
}
static int xe_dma_buf_pin(struct dma_buf_attachment *attach)
{
struct drm_gem_object *obj = attach->dmabuf->priv;
struct xe_bo *bo = gem_to_xe_bo(obj);
/*
* Migrate to TT first to increase the chance that non-p2p
* clients can attach.
*/
(void)xe_bo_migrate(bo, XE_PL_TT);
xe_bo_pin_external(bo);
return 0;
}
static void xe_dma_buf_unpin(struct dma_buf_attachment *attach)
{
struct drm_gem_object *obj = attach->dmabuf->priv;
struct xe_bo *bo = gem_to_xe_bo(obj);
xe_bo_unpin_external(bo);
}
static struct sg_table *xe_dma_buf_map(struct dma_buf_attachment *attach,
enum dma_data_direction dir)
{
struct dma_buf *dma_buf = attach->dmabuf;
struct drm_gem_object *obj = dma_buf->priv;
struct xe_bo *bo = gem_to_xe_bo(obj);
struct sg_table *sgt;
int r = 0;
if (!attach->peer2peer && !xe_bo_can_migrate(bo, XE_PL_TT))
return ERR_PTR(-EOPNOTSUPP);
if (!xe_bo_is_pinned(bo)) {
if (!attach->peer2peer ||
bo->ttm.resource->mem_type == XE_PL_SYSTEM) {
if (xe_bo_can_migrate(bo, XE_PL_TT))
r = xe_bo_migrate(bo, XE_PL_TT);
else
r = xe_bo_validate(bo, NULL, false);
}
if (r)
return ERR_PTR(r);
}
switch (bo->ttm.resource->mem_type) {
case XE_PL_TT:
sgt = drm_prime_pages_to_sg(obj->dev,
bo->ttm.ttm->pages,
bo->ttm.ttm->num_pages);
if (IS_ERR(sgt))
return sgt;
if (dma_map_sgtable(attach->dev, sgt, dir,
DMA_ATTR_SKIP_CPU_SYNC))
goto error_free;
break;
case XE_PL_VRAM0:
case XE_PL_VRAM1:
r = xe_ttm_vram_mgr_alloc_sgt(xe_bo_device(bo),
bo->ttm.resource, 0,
bo->ttm.base.size, attach->dev,
dir, &sgt);
if (r)
return ERR_PTR(r);
break;
default:
return ERR_PTR(-EINVAL);
}
return sgt;
error_free:
sg_free_table(sgt);
kfree(sgt);
return ERR_PTR(-EBUSY);
}
static void xe_dma_buf_unmap(struct dma_buf_attachment *attach,
struct sg_table *sgt,
enum dma_data_direction dir)
{
struct dma_buf *dma_buf = attach->dmabuf;
struct xe_bo *bo = gem_to_xe_bo(dma_buf->priv);
if (!xe_bo_is_vram(bo)) {
dma_unmap_sgtable(attach->dev, sgt, dir, 0);
sg_free_table(sgt);
kfree(sgt);
} else {
xe_ttm_vram_mgr_free_sgt(attach->dev, dir, sgt);
}
}
static int xe_dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
enum dma_data_direction direction)
{
struct drm_gem_object *obj = dma_buf->priv;
struct xe_bo *bo = gem_to_xe_bo(obj);
bool reads = (direction == DMA_BIDIRECTIONAL ||
direction == DMA_FROM_DEVICE);
if (!reads)
return 0;
xe_bo_lock_no_vm(bo, NULL);
(void)xe_bo_migrate(bo, XE_PL_TT);
xe_bo_unlock_no_vm(bo);
return 0;
}
const struct dma_buf_ops xe_dmabuf_ops = {
.attach = xe_dma_buf_attach,
.detach = xe_dma_buf_detach,
.pin = xe_dma_buf_pin,
.unpin = xe_dma_buf_unpin,
.map_dma_buf = xe_dma_buf_map,
.unmap_dma_buf = xe_dma_buf_unmap,
.release = drm_gem_dmabuf_release,
.begin_cpu_access = xe_dma_buf_begin_cpu_access,
.mmap = drm_gem_dmabuf_mmap,
.vmap = drm_gem_dmabuf_vmap,
.vunmap = drm_gem_dmabuf_vunmap,
};
struct dma_buf *xe_gem_prime_export(struct drm_gem_object *obj, int flags)
{
struct xe_bo *bo = gem_to_xe_bo(obj);
struct dma_buf *buf;
if (bo->vm)
return ERR_PTR(-EPERM);
buf = drm_gem_prime_export(obj, flags);
if (!IS_ERR(buf))
buf->ops = &xe_dmabuf_ops;
return buf;
}
static struct drm_gem_object *
xe_dma_buf_init_obj(struct drm_device *dev, struct xe_bo *storage,
struct dma_buf *dma_buf)
{
struct dma_resv *resv = dma_buf->resv;
struct xe_device *xe = to_xe_device(dev);
struct xe_bo *bo;
int ret;
dma_resv_lock(resv, NULL);
bo = __xe_bo_create_locked(xe, storage, NULL, resv, dma_buf->size,
ttm_bo_type_sg, XE_BO_CREATE_SYSTEM_BIT);
if (IS_ERR(bo)) {
ret = PTR_ERR(bo);
goto error;
}
dma_resv_unlock(resv);
return &bo->ttm.base;
error:
dma_resv_unlock(resv);
return ERR_PTR(ret);
}
static void xe_dma_buf_move_notify(struct dma_buf_attachment *attach)
{
struct drm_gem_object *obj = attach->importer_priv;
struct xe_bo *bo = gem_to_xe_bo(obj);
XE_WARN_ON(xe_bo_evict(bo, false));
}
static const struct dma_buf_attach_ops xe_dma_buf_attach_ops = {
.allow_peer2peer = true,
.move_notify = xe_dma_buf_move_notify
};
#if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST)
struct dma_buf_test_params {
struct xe_test_priv base;
const struct dma_buf_attach_ops *attach_ops;
bool force_different_devices;
u32 mem_mask;
};
#define to_dma_buf_test_params(_priv) \
container_of(_priv, struct dma_buf_test_params, base)
#endif
struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev,
struct dma_buf *dma_buf)
{
XE_TEST_DECLARE(struct dma_buf_test_params *test =
to_dma_buf_test_params
(xe_cur_kunit_priv(XE_TEST_LIVE_DMA_BUF));)
const struct dma_buf_attach_ops *attach_ops;
struct dma_buf_attachment *attach;
struct drm_gem_object *obj;
struct xe_bo *bo;
if (dma_buf->ops == &xe_dmabuf_ops) {
obj = dma_buf->priv;
if (obj->dev == dev &&
!XE_TEST_ONLY(test && test->force_different_devices)) {
/*
* Importing a dmabuf exported from our own GEM object increases the
* refcount on the GEM object itself instead of the f_count of the dmabuf.
*/
drm_gem_object_get(obj);
return obj;
}
}
/*
* Don't publish the bo until we have a valid attachment, and a
* valid attachment needs the bo address. So pre-create a bo before
* creating the attachment and publish it afterwards.
*/
bo = xe_bo_alloc();
if (IS_ERR(bo))
return ERR_CAST(bo);
attach_ops = &xe_dma_buf_attach_ops;
#if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST)
if (test)
attach_ops = test->attach_ops;
#endif
attach = dma_buf_dynamic_attach(dma_buf, dev->dev, attach_ops, &bo->ttm.base);
if (IS_ERR(attach)) {
obj = ERR_CAST(attach);
goto out_err;
}
/* Errors here will take care of freeing the bo. */
obj = xe_dma_buf_init_obj(dev, bo, dma_buf);
if (IS_ERR(obj))
return obj;
get_dma_buf(dma_buf);
obj->import_attach = attach;
return obj;
out_err:
xe_bo_free(bo);
return obj;
}
#if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST)
#include "tests/xe_dma_buf.c"
#endif
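
A short sketch of how the export/import hooks above compose: exporting a VM-private BO is refused with -EPERM, and re-importing a dma-buf we exported ourselves on the same device only takes another reference on the underlying GEM object. The example_reimport() helper below is illustrative only and not part of this patch:

static struct drm_gem_object *
example_reimport(struct drm_device *dev, struct drm_gem_object *obj)
{
	struct dma_buf *buf;
	struct drm_gem_object *imported;

	/* Fails for BOs private to a VM (bo->vm set). */
	buf = xe_gem_prime_export(obj, 0);
	if (IS_ERR(buf))
		return ERR_CAST(buf);

	/* Same-device import of our own dma-buf bumps the GEM refcount. */
	imported = xe_gem_prime_import(dev, buf);
	dma_buf_put(buf);
	return imported;
}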

View File

@ -0,0 +1,15 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_DMA_BUF_H_
#define _XE_DMA_BUF_H_
#include <drm/drm_gem.h>
struct dma_buf *xe_gem_prime_export(struct drm_gem_object *obj, int flags);
struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev,
struct dma_buf *dma_buf);
#endif

View File

@ -0,0 +1,24 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2021 Intel Corporation
*/
#ifndef _XE_DRV_H_
#define _XE_DRV_H_
#include <drm/drm_drv.h>
#define DRIVER_NAME "xe"
#define DRIVER_DESC "Intel Xe Graphics"
#define DRIVER_DATE "20201103"
#define DRIVER_TIMESTAMP 1604406085
/* Interface history:
*
* 1.1: Original.
*/
#define DRIVER_MAJOR 1
#define DRIVER_MINOR 1
#define DRIVER_PATCHLEVEL 0
#endif

View File

@ -0,0 +1,734 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2021 Intel Corporation
*/
#include "xe_engine.h"
#include <drm/drm_device.h>
#include <drm/drm_file.h>
#include <drm/xe_drm.h>
#include <linux/nospec.h>
#include "xe_device.h"
#include "xe_gt.h"
#include "xe_lrc.h"
#include "xe_macros.h"
#include "xe_migrate.h"
#include "xe_pm.h"
#include "xe_trace.h"
#include "xe_vm.h"
static struct xe_engine *__xe_engine_create(struct xe_device *xe,
struct xe_vm *vm,
u32 logical_mask,
u16 width, struct xe_hw_engine *hwe,
u32 flags)
{
struct xe_engine *e;
struct xe_gt *gt = hwe->gt;
int err;
int i;
e = kzalloc(sizeof(*e) + sizeof(struct xe_lrc) * width, GFP_KERNEL);
if (!e)
return ERR_PTR(-ENOMEM);
kref_init(&e->refcount);
e->flags = flags;
e->hwe = hwe;
e->gt = gt;
if (vm)
e->vm = xe_vm_get(vm);
e->class = hwe->class;
e->width = width;
e->logical_mask = logical_mask;
e->fence_irq = &gt->fence_irq[hwe->class];
e->ring_ops = gt->ring_ops[hwe->class];
e->ops = gt->engine_ops;
INIT_LIST_HEAD(&e->persitent.link);
INIT_LIST_HEAD(&e->compute.link);
INIT_LIST_HEAD(&e->multi_gt_link);
/* FIXME: Wire up to configurable default value */
e->sched_props.timeslice_us = 1 * 1000;
e->sched_props.preempt_timeout_us = 640 * 1000;
if (xe_engine_is_parallel(e)) {
e->parallel.composite_fence_ctx = dma_fence_context_alloc(1);
e->parallel.composite_fence_seqno = 1;
}
if (e->flags & ENGINE_FLAG_VM) {
e->bind.fence_ctx = dma_fence_context_alloc(1);
e->bind.fence_seqno = 1;
}
for (i = 0; i < width; ++i) {
err = xe_lrc_init(e->lrc + i, hwe, e, vm, SZ_16K);
if (err)
goto err_lrc;
}
err = e->ops->init(e);
if (err)
goto err_lrc;
return e;
err_lrc:
for (i = i - 1; i >= 0; --i)
xe_lrc_finish(e->lrc + i);
kfree(e);
return ERR_PTR(err);
}
struct xe_engine *xe_engine_create(struct xe_device *xe, struct xe_vm *vm,
u32 logical_mask, u16 width,
struct xe_hw_engine *hwe, u32 flags)
{
struct ww_acquire_ctx ww;
struct xe_engine *e;
int err;
if (vm) {
err = xe_vm_lock(vm, &ww, 0, true);
if (err)
return ERR_PTR(err);
}
e = __xe_engine_create(xe, vm, logical_mask, width, hwe, flags);
if (vm)
xe_vm_unlock(vm, &ww);
return e;
}
struct xe_engine *xe_engine_create_class(struct xe_device *xe, struct xe_gt *gt,
struct xe_vm *vm,
enum xe_engine_class class, u32 flags)
{
struct xe_hw_engine *hwe, *hwe0 = NULL;
enum xe_hw_engine_id id;
u32 logical_mask = 0;
for_each_hw_engine(hwe, gt, id) {
if (xe_hw_engine_is_reserved(hwe))
continue;
if (hwe->class == class) {
logical_mask |= BIT(hwe->logical_instance);
if (!hwe0)
hwe0 = hwe;
}
}
if (!logical_mask)
return ERR_PTR(-ENODEV);
return xe_engine_create(xe, vm, logical_mask, 1, hwe0, flags);
}
void xe_engine_destroy(struct kref *ref)
{
struct xe_engine *e = container_of(ref, struct xe_engine, refcount);
struct xe_engine *engine, *next;
if (!(e->flags & ENGINE_FLAG_BIND_ENGINE_CHILD)) {
list_for_each_entry_safe(engine, next, &e->multi_gt_list,
multi_gt_link)
xe_engine_put(engine);
}
e->ops->fini(e);
}
void xe_engine_fini(struct xe_engine *e)
{
int i;
for (i = 0; i < e->width; ++i)
xe_lrc_finish(e->lrc + i);
if (e->vm)
xe_vm_put(e->vm);
kfree(e);
}
struct xe_engine *xe_engine_lookup(struct xe_file *xef, u32 id)
{
struct xe_engine *e;
mutex_lock(&xef->engine.lock);
e = xa_load(&xef->engine.xa, id);
mutex_unlock(&xef->engine.lock);
if (e)
xe_engine_get(e);
return e;
}
static int engine_set_priority(struct xe_device *xe, struct xe_engine *e,
u64 value, bool create)
{
if (XE_IOCTL_ERR(xe, value > XE_ENGINE_PRIORITY_HIGH))
return -EINVAL;
if (XE_IOCTL_ERR(xe, value == XE_ENGINE_PRIORITY_HIGH &&
!capable(CAP_SYS_NICE)))
return -EPERM;
return e->ops->set_priority(e, value);
}
static int engine_set_timeslice(struct xe_device *xe, struct xe_engine *e,
u64 value, bool create)
{
if (!capable(CAP_SYS_NICE))
return -EPERM;
return e->ops->set_timeslice(e, value);
}
static int engine_set_preemption_timeout(struct xe_device *xe,
struct xe_engine *e, u64 value,
bool create)
{
if (!capable(CAP_SYS_NICE))
return -EPERM;
return e->ops->set_preempt_timeout(e, value);
}
static int engine_set_compute_mode(struct xe_device *xe, struct xe_engine *e,
u64 value, bool create)
{
if (XE_IOCTL_ERR(xe, !create))
return -EINVAL;
if (XE_IOCTL_ERR(xe, e->flags & ENGINE_FLAG_COMPUTE_MODE))
return -EINVAL;
if (XE_IOCTL_ERR(xe, e->flags & ENGINE_FLAG_VM))
return -EINVAL;
if (value) {
struct xe_vm *vm = e->vm;
int err;
if (XE_IOCTL_ERR(xe, xe_vm_in_fault_mode(vm)))
return -EOPNOTSUPP;
if (XE_IOCTL_ERR(xe, !xe_vm_in_compute_mode(vm)))
return -EOPNOTSUPP;
if (XE_IOCTL_ERR(xe, e->width != 1))
return -EINVAL;
e->compute.context = dma_fence_context_alloc(1);
spin_lock_init(&e->compute.lock);
err = xe_vm_add_compute_engine(vm, e);
if (XE_IOCTL_ERR(xe, err))
return err;
e->flags |= ENGINE_FLAG_COMPUTE_MODE;
e->flags &= ~ENGINE_FLAG_PERSISTENT;
}
return 0;
}
static int engine_set_persistence(struct xe_device *xe, struct xe_engine *e,
u64 value, bool create)
{
if (XE_IOCTL_ERR(xe, !create))
return -EINVAL;
if (XE_IOCTL_ERR(xe, e->flags & ENGINE_FLAG_COMPUTE_MODE))
return -EINVAL;
if (value)
e->flags |= ENGINE_FLAG_PERSISTENT;
else
e->flags &= ~ENGINE_FLAG_PERSISTENT;
return 0;
}
static int engine_set_job_timeout(struct xe_device *xe, struct xe_engine *e,
u64 value, bool create)
{
if (XE_IOCTL_ERR(xe, !create))
return -EINVAL;
if (!capable(CAP_SYS_NICE))
return -EPERM;
return e->ops->set_job_timeout(e, value);
}
static int engine_set_acc_trigger(struct xe_device *xe, struct xe_engine *e,
u64 value, bool create)
{
if (XE_IOCTL_ERR(xe, !create))
return -EINVAL;
if (XE_IOCTL_ERR(xe, !xe->info.supports_usm))
return -EINVAL;
e->usm.acc_trigger = value;
return 0;
}
static int engine_set_acc_notify(struct xe_device *xe, struct xe_engine *e,
u64 value, bool create)
{
if (XE_IOCTL_ERR(xe, !create))
return -EINVAL;
if (XE_IOCTL_ERR(xe, !xe->info.supports_usm))
return -EINVAL;
e->usm.acc_notify = value;
return 0;
}
static int engine_set_acc_granularity(struct xe_device *xe, struct xe_engine *e,
u64 value, bool create)
{
if (XE_IOCTL_ERR(xe, !create))
return -EINVAL;
if (XE_IOCTL_ERR(xe, !xe->info.supports_usm))
return -EINVAL;
e->usm.acc_granularity = value;
return 0;
}
typedef int (*xe_engine_set_property_fn)(struct xe_device *xe,
struct xe_engine *e,
u64 value, bool create);
static const xe_engine_set_property_fn engine_set_property_funcs[] = {
[XE_ENGINE_PROPERTY_PRIORITY] = engine_set_priority,
[XE_ENGINE_PROPERTY_TIMESLICE] = engine_set_timeslice,
[XE_ENGINE_PROPERTY_PREEMPTION_TIMEOUT] = engine_set_preemption_timeout,
[XE_ENGINE_PROPERTY_COMPUTE_MODE] = engine_set_compute_mode,
[XE_ENGINE_PROPERTY_PERSISTENCE] = engine_set_persistence,
[XE_ENGINE_PROPERTY_JOB_TIMEOUT] = engine_set_job_timeout,
[XE_ENGINE_PROPERTY_ACC_TRIGGER] = engine_set_acc_trigger,
[XE_ENGINE_PROPERTY_ACC_NOTIFY] = engine_set_acc_notify,
[XE_ENGINE_PROPERTY_ACC_GRANULARITY] = engine_set_acc_granularity,
};
static int engine_user_ext_set_property(struct xe_device *xe,
struct xe_engine *e,
u64 extension,
bool create)
{
u64 __user *address = u64_to_user_ptr(extension);
struct drm_xe_ext_engine_set_property ext;
int err;
u32 idx;
err = __copy_from_user(&ext, address, sizeof(ext));
if (XE_IOCTL_ERR(xe, err))
return -EFAULT;
if (XE_IOCTL_ERR(xe, ext.property >=
ARRAY_SIZE(engine_set_property_funcs)))
return -EINVAL;
idx = array_index_nospec(ext.property, ARRAY_SIZE(engine_set_property_funcs));
return engine_set_property_funcs[idx](xe, e, ext.value, create);
}
typedef int (*xe_engine_user_extension_fn)(struct xe_device *xe,
struct xe_engine *e,
u64 extension,
bool create);
static const xe_engine_set_property_fn engine_user_extension_funcs[] = {
[XE_ENGINE_EXTENSION_SET_PROPERTY] = engine_user_ext_set_property,
};
#define MAX_USER_EXTENSIONS 16
static int engine_user_extensions(struct xe_device *xe, struct xe_engine *e,
u64 extensions, int ext_number, bool create)
{
u64 __user *address = u64_to_user_ptr(extensions);
struct xe_user_extension ext;
int err;
u32 idx;
if (XE_IOCTL_ERR(xe, ext_number >= MAX_USER_EXTENSIONS))
return -E2BIG;
err = __copy_from_user(&ext, address, sizeof(ext));
if (XE_IOCTL_ERR(xe, err))
return -EFAULT;
if (XE_IOCTL_ERR(xe, ext.name >=
ARRAY_SIZE(engine_user_extension_funcs)))
return -EINVAL;
idx = array_index_nospec(ext.name,
ARRAY_SIZE(engine_user_extension_funcs));
err = engine_user_extension_funcs[idx](xe, e, extensions, create);
if (XE_IOCTL_ERR(xe, err))
return err;
if (ext.next_extension)
return engine_user_extensions(xe, e, ext.next_extension,
++ext_number, create);
return 0;
}
static const enum xe_engine_class user_to_xe_engine_class[] = {
[DRM_XE_ENGINE_CLASS_RENDER] = XE_ENGINE_CLASS_RENDER,
[DRM_XE_ENGINE_CLASS_COPY] = XE_ENGINE_CLASS_COPY,
[DRM_XE_ENGINE_CLASS_VIDEO_DECODE] = XE_ENGINE_CLASS_VIDEO_DECODE,
[DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE] = XE_ENGINE_CLASS_VIDEO_ENHANCE,
[DRM_XE_ENGINE_CLASS_COMPUTE] = XE_ENGINE_CLASS_COMPUTE,
};
static struct xe_hw_engine *
find_hw_engine(struct xe_device *xe,
struct drm_xe_engine_class_instance eci)
{
u32 idx;
if (eci.engine_class >= ARRAY_SIZE(user_to_xe_engine_class))
return NULL;
if (eci.gt_id >= xe->info.tile_count)
return NULL;
idx = array_index_nospec(eci.engine_class,
ARRAY_SIZE(user_to_xe_engine_class));
return xe_gt_hw_engine(xe_device_get_gt(xe, eci.gt_id),
user_to_xe_engine_class[idx],
eci.engine_instance, true);
}
static u32 bind_engine_logical_mask(struct xe_device *xe, struct xe_gt *gt,
struct drm_xe_engine_class_instance *eci,
u16 width, u16 num_placements)
{
struct xe_hw_engine *hwe;
enum xe_hw_engine_id id;
u32 logical_mask = 0;
if (XE_IOCTL_ERR(xe, width != 1))
return 0;
if (XE_IOCTL_ERR(xe, num_placements != 1))
return 0;
if (XE_IOCTL_ERR(xe, eci[0].engine_instance != 0))
return 0;
eci[0].engine_class = DRM_XE_ENGINE_CLASS_COPY;
for_each_hw_engine(hwe, gt, id) {
if (xe_hw_engine_is_reserved(hwe))
continue;
if (hwe->class ==
user_to_xe_engine_class[DRM_XE_ENGINE_CLASS_COPY])
logical_mask |= BIT(hwe->logical_instance);
}
return logical_mask;
}
static u32 calc_validate_logical_mask(struct xe_device *xe, struct xe_gt *gt,
struct drm_xe_engine_class_instance *eci,
u16 width, u16 num_placements)
{
int len = width * num_placements;
int i, j, n;
u16 class;
u16 gt_id;
u32 return_mask = 0, prev_mask;
if (XE_IOCTL_ERR(xe, !xe_device_guc_submission_enabled(xe) &&
len > 1))
return 0;
for (i = 0; i < width; ++i) {
u32 current_mask = 0;
for (j = 0; j < num_placements; ++j) {
struct xe_hw_engine *hwe;
n = j * width + i;
hwe = find_hw_engine(xe, eci[n]);
if (XE_IOCTL_ERR(xe, !hwe))
return 0;
if (XE_IOCTL_ERR(xe, xe_hw_engine_is_reserved(hwe)))
return 0;
if (XE_IOCTL_ERR(xe, n && eci[n].gt_id != gt_id) ||
XE_IOCTL_ERR(xe, n && eci[n].engine_class != class))
return 0;
class = eci[n].engine_class;
gt_id = eci[n].gt_id;
if (width == 1 || !i)
return_mask |= BIT(eci[n].engine_instance);
current_mask |= BIT(eci[n].engine_instance);
}
/* Parallel submissions must be logically contiguous */
if (i && XE_IOCTL_ERR(xe, current_mask != prev_mask << 1))
return 0;
prev_mask = current_mask;
}
return return_mask;
}
int xe_engine_create_ioctl(struct drm_device *dev, void *data,
struct drm_file *file)
{
struct xe_device *xe = to_xe_device(dev);
struct xe_file *xef = to_xe_file(file);
struct drm_xe_engine_create *args = data;
struct drm_xe_engine_class_instance eci[XE_HW_ENGINE_MAX_INSTANCE];
struct drm_xe_engine_class_instance __user *user_eci =
u64_to_user_ptr(args->instances);
struct xe_hw_engine *hwe;
struct xe_vm *vm, *migrate_vm;
struct xe_gt *gt;
struct xe_engine *e = NULL;
u32 logical_mask;
u32 id;
int len;
int err;
if (XE_IOCTL_ERR(xe, args->flags))
return -EINVAL;
len = args->width * args->num_placements;
if (XE_IOCTL_ERR(xe, !len || len > XE_HW_ENGINE_MAX_INSTANCE))
return -EINVAL;
err = __copy_from_user(eci, user_eci,
sizeof(struct drm_xe_engine_class_instance) *
len);
if (XE_IOCTL_ERR(xe, err))
return -EFAULT;
if (XE_IOCTL_ERR(xe, eci[0].gt_id >= xe->info.tile_count))
return -EINVAL;
xe_pm_runtime_get(xe);
if (eci[0].engine_class == DRM_XE_ENGINE_CLASS_VM_BIND) {
for_each_gt(gt, xe, id) {
struct xe_engine *new;
if (xe_gt_is_media_type(gt))
continue;
eci[0].gt_id = gt->info.id;
logical_mask = bind_engine_logical_mask(xe, gt, eci,
args->width,
args->num_placements);
if (XE_IOCTL_ERR(xe, !logical_mask)) {
err = -EINVAL;
goto put_rpm;
}
hwe = find_hw_engine(xe, eci[0]);
if (XE_IOCTL_ERR(xe, !hwe)) {
err = -EINVAL;
goto put_rpm;
}
migrate_vm = xe_migrate_get_vm(gt->migrate);
new = xe_engine_create(xe, migrate_vm, logical_mask,
args->width, hwe,
ENGINE_FLAG_PERSISTENT |
ENGINE_FLAG_VM |
(id ?
ENGINE_FLAG_BIND_ENGINE_CHILD :
0));
xe_vm_put(migrate_vm);
if (IS_ERR(new)) {
err = PTR_ERR(new);
if (e)
goto put_engine;
goto put_rpm;
}
if (id == 0)
e = new;
else
list_add_tail(&new->multi_gt_list,
&e->multi_gt_link);
}
} else {
gt = xe_device_get_gt(xe, eci[0].gt_id);
logical_mask = calc_validate_logical_mask(xe, gt, eci,
args->width,
args->num_placements);
if (XE_IOCTL_ERR(xe, !logical_mask)) {
err = -EINVAL;
goto put_rpm;
}
hwe = find_hw_engine(xe, eci[0]);
if (XE_IOCTL_ERR(xe, !hwe)) {
err = -EINVAL;
goto put_rpm;
}
vm = xe_vm_lookup(xef, args->vm_id);
if (XE_IOCTL_ERR(xe, !vm)) {
err = -ENOENT;
goto put_rpm;
}
e = xe_engine_create(xe, vm, logical_mask,
args->width, hwe, ENGINE_FLAG_PERSISTENT);
xe_vm_put(vm);
if (IS_ERR(e)) {
err = PTR_ERR(e);
goto put_rpm;
}
}
if (args->extensions) {
err = engine_user_extensions(xe, e, args->extensions, 0, true);
if (XE_IOCTL_ERR(xe, err))
goto put_engine;
}
if (XE_IOCTL_ERR(xe, e->vm && xe_vm_in_compute_mode(e->vm) !=
!!(e->flags & ENGINE_FLAG_COMPUTE_MODE))) {
err = -ENOTSUPP;
goto put_engine;
}
e->persitent.xef = xef;
mutex_lock(&xef->engine.lock);
err = xa_alloc(&xef->engine.xa, &id, e, xa_limit_32b, GFP_KERNEL);
mutex_unlock(&xef->engine.lock);
if (err)
goto put_engine;
args->engine_id = id;
return 0;
put_engine:
xe_engine_kill(e);
xe_engine_put(e);
put_rpm:
xe_pm_runtime_put(xe);
return err;
}
static void engine_kill_compute(struct xe_engine *e)
{
if (!xe_vm_in_compute_mode(e->vm))
return;
down_write(&e->vm->lock);
list_del(&e->compute.link);
--e->vm->preempt.num_engines;
if (e->compute.pfence) {
dma_fence_enable_sw_signaling(e->compute.pfence);
dma_fence_put(e->compute.pfence);
e->compute.pfence = NULL;
}
up_write(&e->vm->lock);
}
void xe_engine_kill(struct xe_engine *e)
{
struct xe_engine *engine = e, *next;
list_for_each_entry_safe(engine, next, &engine->multi_gt_list,
multi_gt_link) {
e->ops->kill(engine);
engine_kill_compute(engine);
}
e->ops->kill(e);
engine_kill_compute(e);
}
int xe_engine_destroy_ioctl(struct drm_device *dev, void *data,
struct drm_file *file)
{
struct xe_device *xe = to_xe_device(dev);
struct xe_file *xef = to_xe_file(file);
struct drm_xe_engine_destroy *args = data;
struct xe_engine *e;
if (XE_IOCTL_ERR(xe, args->pad))
return -EINVAL;
mutex_lock(&xef->engine.lock);
e = xa_erase(&xef->engine.xa, args->engine_id);
mutex_unlock(&xef->engine.lock);
if (XE_IOCTL_ERR(xe, !e))
return -ENOENT;
if (!(e->flags & ENGINE_FLAG_PERSISTENT))
xe_engine_kill(e);
else
xe_device_add_persitent_engines(xe, e);
trace_xe_engine_close(e);
xe_engine_put(e);
xe_pm_runtime_put(xe);
return 0;
}
int xe_engine_set_property_ioctl(struct drm_device *dev, void *data,
struct drm_file *file)
{
struct xe_device *xe = to_xe_device(dev);
struct xe_file *xef = to_xe_file(file);
struct drm_xe_engine_set_property *args = data;
struct xe_engine *e;
int ret;
u32 idx;
e = xe_engine_lookup(xef, args->engine_id);
if (XE_IOCTL_ERR(xe, !e))
return -ENOENT;
if (XE_IOCTL_ERR(xe, args->property >=
ARRAY_SIZE(engine_set_property_funcs))) {
ret = -EINVAL;
goto out;
}
idx = array_index_nospec(args->property,
ARRAY_SIZE(engine_set_property_funcs));
ret = engine_set_property_funcs[idx](xe, e, args->value, false);
if (XE_IOCTL_ERR(xe, ret))
goto out;
if (args->extensions)
ret = engine_user_extensions(xe, e, args->extensions, 0,
false);
out:
xe_engine_put(e);
return ret;
}
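
From userspace, the create ioctl above is reached through the usual libdrm wrappers. The sketch below creates a single-width render engine on GT0 bound to an existing VM; it is illustrative only, and the DRM_IOCTL_XE_ENGINE_CREATE macro name and the drmIoctl() helper are assumptions about the uAPI header and libdrm rather than something introduced in this file:

#include <stdint.h>
#include <xf86drm.h>		/* drmIoctl(), from libdrm (assumed) */
#include <drm/xe_drm.h>		/* struct drm_xe_engine_create et al. */

static int example_engine_create(int fd, uint32_t vm_id, uint32_t *engine_id)
{
	struct drm_xe_engine_class_instance instance = {
		.engine_class = DRM_XE_ENGINE_CLASS_RENDER,
		.engine_instance = 0,
		.gt_id = 0,
	};
	struct drm_xe_engine_create create = {
		.vm_id = vm_id,
		.width = 1,
		.num_placements = 1,
		.instances = (uint64_t)(uintptr_t)&instance,
	};
	int err;

	err = drmIoctl(fd, DRM_IOCTL_XE_ENGINE_CREATE, &create);
	if (err)
		return err;

	*engine_id = create.engine_id;
	return 0;
}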

View File

@ -0,0 +1,54 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2021 Intel Corporation
*/
#ifndef _XE_ENGINE_H_
#define _XE_ENGINE_H_
#include "xe_engine_types.h"
#include "xe_vm_types.h"
struct drm_device;
struct drm_file;
struct xe_device;
struct xe_file;
struct xe_engine *xe_engine_create(struct xe_device *xe, struct xe_vm *vm,
u32 logical_mask, u16 width,
struct xe_hw_engine *hw_engine, u32 flags);
struct xe_engine *xe_engine_create_class(struct xe_device *xe, struct xe_gt *gt,
struct xe_vm *vm,
enum xe_engine_class class, u32 flags);
void xe_engine_fini(struct xe_engine *e);
void xe_engine_destroy(struct kref *ref);
struct xe_engine *xe_engine_lookup(struct xe_file *xef, u32 id);
static inline struct xe_engine *xe_engine_get(struct xe_engine *engine)
{
kref_get(&engine->refcount);
return engine;
}
static inline void xe_engine_put(struct xe_engine *engine)
{
kref_put(&engine->refcount, xe_engine_destroy);
}
static inline bool xe_engine_is_parallel(struct xe_engine *engine)
{
return engine->width > 1;
}
void xe_engine_kill(struct xe_engine *e);
int xe_engine_create_ioctl(struct drm_device *dev, void *data,
struct drm_file *file);
int xe_engine_destroy_ioctl(struct drm_device *dev, void *data,
struct drm_file *file);
int xe_engine_set_property_ioctl(struct drm_device *dev, void *data,
struct drm_file *file);
#endif
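
For kernel-internal users, xe_engine_create_class() is the convenience path: it builds the logical mask from every non-reserved hardware engine of the requested class and creates a single-width engine on top of it. A minimal, illustrative sketch (the example_* name and flag choice are assumptions):

static struct xe_engine *example_kernel_copy_engine(struct xe_device *xe,
						    struct xe_gt *gt,
						    struct xe_vm *vm)
{
	/* Any copy engine on this GT may be picked by the backend. */
	return xe_engine_create_class(xe, gt, vm, XE_ENGINE_CLASS_COPY,
				      ENGINE_FLAG_KERNEL);
}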

View File

@ -0,0 +1,208 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_ENGINE_TYPES_H_
#define _XE_ENGINE_TYPES_H_
#include <linux/kref.h>
#include <drm/gpu_scheduler.h>
#include "xe_gpu_scheduler_types.h"
#include "xe_hw_engine_types.h"
#include "xe_hw_fence_types.h"
#include "xe_lrc_types.h"
struct xe_execlist_engine;
struct xe_gt;
struct xe_guc_engine;
struct xe_hw_engine;
struct xe_vm;
enum xe_engine_priority {
XE_ENGINE_PRIORITY_UNSET = -2, /* For execlist usage only */
XE_ENGINE_PRIORITY_LOW = 0,
XE_ENGINE_PRIORITY_NORMAL,
XE_ENGINE_PRIORITY_HIGH,
XE_ENGINE_PRIORITY_KERNEL,
XE_ENGINE_PRIORITY_COUNT
};
/**
* struct xe_engine - Submission engine
*
* Contains all state necessary for submissions. Can either be a user object or
* a kernel object.
*/
struct xe_engine {
/** @gt: graphics tile this engine can submit to */
struct xe_gt *gt;
/**
* @hwe: A hardware engine of the same class. May (physical engine) or may
* not (virtual engine) be where the job actually ends up running. Should
* never really be used for submissions.
*/
struct xe_hw_engine *hwe;
/** @refcount: ref count of this engine */
struct kref refcount;
/** @vm: VM (address space) for this engine */
struct xe_vm *vm;
/** @class: class of this engine */
enum xe_engine_class class;
/** @priority: priority of this engine */
enum xe_engine_priority priority;
/**
* @logical_mask: logical mask of where job submitted to engine can run
*/
u32 logical_mask;
/** @name: name of this engine */
char name[MAX_FENCE_NAME_LEN];
/** @width: width (number BB submitted per exec) of this engine */
u16 width;
/** @fence_irq: fence IRQ used to signal job completion */
struct xe_hw_fence_irq *fence_irq;
#define ENGINE_FLAG_BANNED BIT(0)
#define ENGINE_FLAG_KERNEL BIT(1)
#define ENGINE_FLAG_PERSISTENT BIT(2)
#define ENGINE_FLAG_COMPUTE_MODE BIT(3)
#define ENGINE_FLAG_VM BIT(4)
#define ENGINE_FLAG_BIND_ENGINE_CHILD BIT(5)
#define ENGINE_FLAG_WA BIT(6)
/**
* @flags: flags for this engine; should be statically set up aside from the
* ban bit
*/
unsigned long flags;
union {
/** @multi_gt_list: list head for VM bind engines if multi-GT */
struct list_head multi_gt_list;
/** @multi_gt_link: link for VM bind engines if multi-GT */
struct list_head multi_gt_link;
};
union {
/** @execlist: execlist backend specific state for engine */
struct xe_execlist_engine *execlist;
/** @guc: GuC backend specific state for engine */
struct xe_guc_engine *guc;
};
/**
* @persitent: persistent engine state
*/
struct {
/** @xef: file which this engine belongs to */
struct xe_file *xef;
/** @link: link in list of persistent engines */
struct list_head link;
} persitent;
union {
/**
* @parallel: parallel submission state
*/
struct {
/** @composite_fence_ctx: context composite fence */
u64 composite_fence_ctx;
/** @composite_fence_seqno: seqno for composite fence */
u32 composite_fence_seqno;
} parallel;
/**
* @bind: bind submission state
*/
struct {
/** @fence_ctx: context bind fence */
u64 fence_ctx;
/** @fence_seqno: seqno for bind fence */
u32 fence_seqno;
} bind;
};
/** @sched_props: scheduling properties */
struct {
/** @timeslice_us: timeslice period in micro-seconds */
u32 timeslice_us;
/** @preempt_timeout_us: preemption timeout in micro-seconds */
u32 preempt_timeout_us;
} sched_props;
/** @compute: compute engine state */
struct {
/** @pfence: preemption fence */
struct dma_fence *pfence;
/** @context: preemption fence context */
u64 context;
/** @seqno: preemption fence seqno */
u32 seqno;
/** @link: link into VM's list of engines */
struct list_head link;
/** @lock: preemption fences lock */
spinlock_t lock;
} compute;
/** @usm: unified shared memory state */
struct {
/** @acc_trigger: access counter trigger */
u32 acc_trigger;
/** @acc_notify: access counter notify */
u32 acc_notify;
/** @acc_granularity: access counter granularity */
u32 acc_granularity;
} usm;
/** @ops: submission backend engine operations */
const struct xe_engine_ops *ops;
/** @ring_ops: ring operations for this engine */
const struct xe_ring_ops *ring_ops;
/** @entity: DRM sched entity for this engine (1 to 1 relationship) */
struct drm_sched_entity *entity;
/** @lrc: logical ring context for this engine */
struct xe_lrc lrc[];
};
/**
* struct xe_engine_ops - Submission backend engine operations
*/
struct xe_engine_ops {
/** @init: Initialize engine for submission backend */
int (*init)(struct xe_engine *e);
/** @kill: Kill inflight submissions for backend */
void (*kill)(struct xe_engine *e);
/** @fini: Fini engine for submission backend */
void (*fini)(struct xe_engine *e);
/** @set_priority: Set priority for engine */
int (*set_priority)(struct xe_engine *e,
enum xe_engine_priority priority);
/** @set_timeslice: Set timeslice for engine */
int (*set_timeslice)(struct xe_engine *e, u32 timeslice_us);
/** @set_preempt_timeout: Set preemption timeout for engine */
int (*set_preempt_timeout)(struct xe_engine *e, u32 preempt_timeout_us);
/** @set_job_timeout: Set job timeout for engine */
int (*set_job_timeout)(struct xe_engine *e, u32 job_timeout_ms);
/**
* @suspend: Suspend engine from executing; allowed to be called
* multiple times in a row before resume, with the caveat that
* suspend_wait must return before calling suspend again.
*/
int (*suspend)(struct xe_engine *e);
/**
* @suspend_wait: Wait for an engine to suspend executing; should be
* called after suspend.
*/
void (*suspend_wait)(struct xe_engine *e);
/**
* @resume: Resume engine execution; the engine must be in a suspended
* state and the dma fence returned from the most recent suspend call must
* be signalled when this function is called.
*/
void (*resume)(struct xe_engine *e);
};
#endif
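
The suspend/suspend_wait/resume ops above carry an implicit ordering contract. A caller-side sketch of that contract (illustrative only, errors mostly ignored for brevity, the example_* name is an assumption):

static void example_suspend_resume(struct xe_engine *e)
{
	/* suspend() may be called repeatedly, but only after suspend_wait()
	 * for the previous call has returned.
	 */
	if (e->ops->suspend(e))
		return;
	e->ops->suspend_wait(e);

	/* resume() requires a suspended engine and a signalled fence from
	 * the most recent suspend.
	 */
	e->ops->resume(e);
}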

View File

@ -0,0 +1,390 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include <drm/drm_device.h>
#include <drm/drm_file.h>
#include <drm/xe_drm.h>
#include "xe_bo.h"
#include "xe_device.h"
#include "xe_engine.h"
#include "xe_exec.h"
#include "xe_macros.h"
#include "xe_sched_job.h"
#include "xe_sync.h"
#include "xe_vm.h"
/**
* DOC: Execbuf (User GPU command submission)
*
* Execs have historically been rather complicated in DRM drivers (at least in
* the i915) because of a few things:
*
* - Passing in a list of BOs which are read / written to, creating implicit syncs
* - Binding at exec time
* - Flow controlling the ring at exec time
*
* In XE we avoid all of this complication by not allowing a BO list to be
* passed into an exec, using the dma-buf implicit sync uAPI, having binds as
* separate operations, and using the DRM scheduler to flow control the ring.
* Let's dive into each of these.
*
* We can get away from a BO list by forcing the user to use in / out fences on
* every exec rather than the kernel tracking dependencies of BOs (e.g. if the
* user knows an exec writes to a BO and reads from the BO in the next exec, it
* is the user's responsibility to pass an in / out fence between the two execs).
*
* Implicit dependencies for external BOs are handled by using the dma-buf
* implicit dependency uAPI (TODO: add link). To make this work, each exec must
* install the job's fence into the DMA_RESV_USAGE_WRITE slot of every external
* BO mapped in the VM.
*
* We do not allow a user to trigger a bind at exec time; rather, we have a VM
* bind IOCTL which uses the same in / out fence interface as exec. In that
* sense, a VM bind is basically the same operation as an exec from the user
* perspective, e.g. if an exec depends on a VM bind, use the in / out fence
* interface (struct drm_xe_sync) to synchronize like syncing between two
* dependent execs.
*
* Although a user cannot trigger a bind, we still have to rebind userptrs in
* the VM that have been invalidated since the last exec; likewise, we also have
* to rebind BOs that have been evicted by the kernel. We schedule these rebinds
* behind any pending kernel operations on any external BOs in the VM or any BOs
* private to the VM. This is accomplished by the rebinds waiting on the BOs'
* DMA_RESV_USAGE_KERNEL slot (kernel ops) and kernel ops waiting on all BOs'
* slots (inflight execs are in the DMA_RESV_USAGE_BOOKKEEP slot for private BOs
* and in DMA_RESV_USAGE_WRITE for external BOs).
*
* Rebinds / dma-resv usage applies to non-compute mode VMs only as for compute
* mode VMs we use preempt fences and a rebind worker (TODO: add link).
*
* There is no need to flow control the ring in the exec as we write the ring at
* submission time and set the DRM scheduler max job limit to SIZE_OF_RING /
* MAX_JOB_SIZE. The DRM scheduler will then hold all jobs until space in the
* ring is available.
*
* All of this results in a rather simple exec implementation.
*
* Flow
* ~~~~
*
* .. code-block::
*
* Parse input arguments
* Wait for any async VM bind passed as in-fences to start
* <----------------------------------------------------------------------|
* Lock global VM lock in read mode |
* Pin userptrs (also finds userptr invalidated since last exec) |
* Lock exec (VM dma-resv lock, external BOs dma-resv locks) |
* Validate BOs that have been evicted |
* Create job |
* Rebind invalidated userptrs + evicted BOs (non-compute-mode) |
* Add rebind fence dependency to job |
* Add job VM dma-resv bookkeeping slot (non-compute mode) |
* Add job to external BOs dma-resv write slots (non-compute mode) |
* Check if any userptrs invalidated since pin ------ Drop locks ---------|
* Install in / out fences for job
* Submit job
* Unlock all
*/
static int xe_exec_begin(struct xe_engine *e, struct ww_acquire_ctx *ww,
struct ttm_validate_buffer tv_onstack[],
struct ttm_validate_buffer **tv,
struct list_head *objs)
{
struct xe_vm *vm = e->vm;
struct xe_vma *vma;
LIST_HEAD(dups);
int err;
*tv = NULL;
if (xe_vm_no_dma_fences(e->vm))
return 0;
err = xe_vm_lock_dma_resv(vm, ww, tv_onstack, tv, objs, true, 1);
if (err)
return err;
/*
* Validate BOs that have been evicted (i.e. make sure the
* BOs have valid placements, possibly moving an evicted BO back
* to a location where the GPU can access it).
*/
list_for_each_entry(vma, &vm->rebind_list, rebind_link) {
if (xe_vma_is_userptr(vma))
continue;
err = xe_bo_validate(vma->bo, vm, false);
if (err) {
xe_vm_unlock_dma_resv(vm, tv_onstack, *tv, ww, objs);
*tv = NULL;
return err;
}
}
return 0;
}
static void xe_exec_end(struct xe_engine *e,
struct ttm_validate_buffer *tv_onstack,
struct ttm_validate_buffer *tv,
struct ww_acquire_ctx *ww,
struct list_head *objs)
{
if (!xe_vm_no_dma_fences(e->vm))
xe_vm_unlock_dma_resv(e->vm, tv_onstack, tv, ww, objs);
}
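/*
 * Main exec IOCTL: look up the engine, parse the user syncs, take vm->lock
 * (write mode if userptrs need re-pinning), pin userptrs, lock the VM and
 * external BO dma-resvs via xe_exec_begin(), rebind anything invalidated or
 * evicted, create and arm the job, install the dma-resv and user fences, and
 * push the job to the scheduler. A userptr invalidation detected after
 * pinning drops all locks and retries the whole sequence (-EAGAIN).
 */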
int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
{
struct xe_device *xe = to_xe_device(dev);
struct xe_file *xef = to_xe_file(file);
struct drm_xe_exec *args = data;
struct drm_xe_sync __user *syncs_user = u64_to_user_ptr(args->syncs);
u64 __user *addresses_user = u64_to_user_ptr(args->address);
struct xe_engine *engine;
struct xe_sync_entry *syncs = NULL;
u64 addresses[XE_HW_ENGINE_MAX_INSTANCE];
struct ttm_validate_buffer tv_onstack[XE_ONSTACK_TV];
struct ttm_validate_buffer *tv = NULL;
u32 i, num_syncs = 0;
struct xe_sched_job *job;
struct dma_fence *rebind_fence;
struct xe_vm *vm;
struct ww_acquire_ctx ww;
struct list_head objs;
bool write_locked;
int err = 0;
if (XE_IOCTL_ERR(xe, args->extensions))
return -EINVAL;
engine = xe_engine_lookup(xef, args->engine_id);
if (XE_IOCTL_ERR(xe, !engine))
return -ENOENT;
if (XE_IOCTL_ERR(xe, engine->flags & ENGINE_FLAG_VM))
return -EINVAL;
if (XE_IOCTL_ERR(xe, engine->width != args->num_batch_buffer))
return -EINVAL;
if (XE_IOCTL_ERR(xe, engine->flags & ENGINE_FLAG_BANNED)) {
err = -ECANCELED;
goto err_engine;
}
if (args->num_syncs) {
syncs = kcalloc(args->num_syncs, sizeof(*syncs), GFP_KERNEL);
if (!syncs) {
err = -ENOMEM;
goto err_engine;
}
}
vm = engine->vm;
for (i = 0; i < args->num_syncs; i++) {
err = xe_sync_entry_parse(xe, xef, &syncs[num_syncs++],
&syncs_user[i], true,
xe_vm_no_dma_fences(vm));
if (err)
goto err_syncs;
}
if (xe_engine_is_parallel(engine)) {
err = __copy_from_user(addresses, addresses_user, sizeof(u64) *
engine->width);
if (err) {
err = -EFAULT;
goto err_syncs;
}
}
/*
* We can't install a job into the VM dma-resv shared slot before an
	 * async VM bind passed in as a fence without the risk of deadlocking, as
	 * the bind can trigger an eviction which in turn depends on anything in
	 * the VM dma-resv shared slots. This is not an ideal solution, but we
	 * wait for all dependent async VM binds to start (install the correct
	 * fences into the dma-resv slots) before moving forward.
*/
if (!xe_vm_no_dma_fences(vm) &&
vm->flags & XE_VM_FLAG_ASYNC_BIND_OPS) {
for (i = 0; i < args->num_syncs; i++) {
struct dma_fence *fence = syncs[i].fence;
if (fence) {
err = xe_vm_async_fence_wait_start(fence);
if (err)
goto err_syncs;
}
}
}
retry:
if (!xe_vm_no_dma_fences(vm) && xe_vm_userptr_check_repin(vm)) {
err = down_write_killable(&vm->lock);
write_locked = true;
} else {
/* We don't allow execs while the VM is in error state */
err = down_read_interruptible(&vm->lock);
write_locked = false;
}
if (err)
goto err_syncs;
/* We don't allow execs while the VM is in error state */
if (vm->async_ops.error) {
err = vm->async_ops.error;
goto err_unlock_list;
}
/*
	 * Extreme corner case where we exit a VM error state with a munmap-style
	 * VM unbind in flight which requires a rebind. In this case the rebind
	 * needs to install some fences into the dma-resv slots. The worker to do
	 * this has already been queued; let that worker make progress by
	 * dropping vm->lock, flushing the worker, and retrying the exec.
*/
if (vm->async_ops.munmap_rebind_inflight) {
if (write_locked)
up_write(&vm->lock);
else
up_read(&vm->lock);
flush_work(&vm->async_ops.work);
goto retry;
}
if (write_locked) {
err = xe_vm_userptr_pin(vm);
downgrade_write(&vm->lock);
write_locked = false;
if (err)
goto err_unlock_list;
}
err = xe_exec_begin(engine, &ww, tv_onstack, &tv, &objs);
if (err)
goto err_unlock_list;
if (xe_vm_is_closed(engine->vm)) {
drm_warn(&xe->drm, "Trying to schedule after vm is closed\n");
err = -EIO;
goto err_engine_end;
}
job = xe_sched_job_create(engine, xe_engine_is_parallel(engine) ?
addresses : &args->address);
if (IS_ERR(job)) {
err = PTR_ERR(job);
goto err_engine_end;
}
/*
* Rebind any invalidated userptr or evicted BOs in the VM, non-compute
* VM mode only.
*/
rebind_fence = xe_vm_rebind(vm, false);
if (IS_ERR(rebind_fence)) {
err = PTR_ERR(rebind_fence);
goto err_put_job;
}
/*
* We store the rebind_fence in the VM so subsequent execs don't get
	 * scheduled before the rebinds of userptrs / evicted BOs are complete.
*/
if (rebind_fence) {
dma_fence_put(vm->rebind_fence);
vm->rebind_fence = rebind_fence;
}
if (vm->rebind_fence) {
if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
&vm->rebind_fence->flags)) {
dma_fence_put(vm->rebind_fence);
vm->rebind_fence = NULL;
} else {
dma_fence_get(vm->rebind_fence);
err = drm_sched_job_add_dependency(&job->drm,
vm->rebind_fence);
if (err)
goto err_put_job;
}
}
/* Wait behind munmap style rebinds */
if (!xe_vm_no_dma_fences(vm)) {
err = drm_sched_job_add_resv_dependencies(&job->drm,
&vm->resv,
DMA_RESV_USAGE_KERNEL);
if (err)
goto err_put_job;
}
for (i = 0; i < num_syncs && !err; i++)
err = xe_sync_entry_add_deps(&syncs[i], job);
if (err)
goto err_put_job;
if (!xe_vm_no_dma_fences(vm)) {
err = down_read_interruptible(&vm->userptr.notifier_lock);
if (err)
goto err_put_job;
err = __xe_vm_userptr_needs_repin(vm);
if (err)
goto err_repin;
}
/*
* Point of no return, if we error after this point just set an error on
* the job and let the DRM scheduler / backend clean up the job.
*/
xe_sched_job_arm(job);
if (!xe_vm_no_dma_fences(vm)) {
/* Block userptr invalidations / BO eviction */
dma_resv_add_fence(&vm->resv,
&job->drm.s_fence->finished,
DMA_RESV_USAGE_BOOKKEEP);
/*
* Make implicit sync work across drivers, assuming all external
* BOs are written as we don't pass in a read / write list.
*/
xe_vm_fence_all_extobjs(vm, &job->drm.s_fence->finished,
DMA_RESV_USAGE_WRITE);
}
for (i = 0; i < num_syncs; i++)
xe_sync_entry_signal(&syncs[i], job,
&job->drm.s_fence->finished);
xe_sched_job_push(job);
err_repin:
if (!xe_vm_no_dma_fences(vm))
up_read(&vm->userptr.notifier_lock);
err_put_job:
if (err)
xe_sched_job_put(job);
err_engine_end:
xe_exec_end(engine, tv_onstack, tv, &ww, &objs);
err_unlock_list:
if (write_locked)
up_write(&vm->lock);
else
up_read(&vm->lock);
if (err == -EAGAIN)
goto retry;
err_syncs:
for (i = 0; i < num_syncs; i++)
xe_sync_entry_cleanup(&syncs[i]);
kfree(syncs);
err_engine:
xe_engine_put(engine);
return err;
}

View File

@ -0,0 +1,14 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_EXEC_H_
#define _XE_EXEC_H_
struct drm_device;
struct drm_file;
int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file);
#endif

View File

@ -0,0 +1,489 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2021 Intel Corporation
*/
#include <drm/drm_managed.h>
#include "xe_execlist.h"
#include "xe_bo.h"
#include "xe_device.h"
#include "xe_engine.h"
#include "xe_hw_fence.h"
#include "xe_gt.h"
#include "xe_lrc.h"
#include "xe_macros.h"
#include "xe_mmio.h"
#include "xe_mocs.h"
#include "xe_ring_ops_types.h"
#include "xe_sched_job.h"
#include "i915_reg.h"
#include "gt/intel_gpu_commands.h"
#include "gt/intel_gt_regs.h"
#include "gt/intel_lrc_reg.h"
#include "gt/intel_engine_regs.h"
#define XE_EXECLIST_HANG_LIMIT 1
#define GEN11_SW_CTX_ID_SHIFT 37
#define GEN11_SW_CTX_ID_WIDTH 11
#define XEHP_SW_CTX_ID_SHIFT 39
#define XEHP_SW_CTX_ID_WIDTH 16
#define GEN11_SW_CTX_ID \
GENMASK_ULL(GEN11_SW_CTX_ID_WIDTH + GEN11_SW_CTX_ID_SHIFT - 1, \
GEN11_SW_CTX_ID_SHIFT)
#define XEHP_SW_CTX_ID \
GENMASK_ULL(XEHP_SW_CTX_ID_WIDTH + XEHP_SW_CTX_ID_SHIFT - 1, \
XEHP_SW_CTX_ID_SHIFT)
static void __start_lrc(struct xe_hw_engine *hwe, struct xe_lrc *lrc,
u32 ctx_id)
{
struct xe_gt *gt = hwe->gt;
struct xe_device *xe = gt_to_xe(gt);
u64 lrc_desc;
printk(KERN_INFO "__start_lrc(%s, 0x%p, %u)\n", hwe->name, lrc, ctx_id);
lrc_desc = xe_lrc_descriptor(lrc);
if (GRAPHICS_VERx100(xe) >= 1250) {
XE_BUG_ON(!FIELD_FIT(XEHP_SW_CTX_ID, ctx_id));
lrc_desc |= FIELD_PREP(XEHP_SW_CTX_ID, ctx_id);
} else {
XE_BUG_ON(!FIELD_FIT(GEN11_SW_CTX_ID, ctx_id));
lrc_desc |= FIELD_PREP(GEN11_SW_CTX_ID, ctx_id);
}
if (hwe->class == XE_ENGINE_CLASS_COMPUTE)
xe_mmio_write32(hwe->gt, GEN12_RCU_MODE.reg,
_MASKED_BIT_ENABLE(GEN12_RCU_MODE_CCS_ENABLE));
xe_lrc_write_ctx_reg(lrc, CTX_RING_TAIL, lrc->ring.tail);
lrc->ring.old_tail = lrc->ring.tail;
/*
* Make sure the context image is complete before we submit it to HW.
*
	 * Ostensibly, writes (including the WCB) should be flushed prior to
	 * an uncached write such as our mmio register access. However, the
	 * empirical evidence (esp. on Braswell) suggests that the WC write
	 * into memory may not be visible to the HW prior to the completion of
	 * the UC register write, and that we may begin execution from the
	 * context before its image is complete, leading to invalid PD chasing.
*/
wmb();
xe_mmio_write32(gt, RING_HWS_PGA(hwe->mmio_base).reg,
xe_bo_ggtt_addr(hwe->hwsp));
xe_mmio_read32(gt, RING_HWS_PGA(hwe->mmio_base).reg);
xe_mmio_write32(gt, RING_MODE_GEN7(hwe->mmio_base).reg,
_MASKED_BIT_ENABLE(GEN11_GFX_DISABLE_LEGACY_MODE));
xe_mmio_write32(gt, RING_EXECLIST_SQ_CONTENTS(hwe->mmio_base).reg + 0,
lower_32_bits(lrc_desc));
xe_mmio_write32(gt, RING_EXECLIST_SQ_CONTENTS(hwe->mmio_base).reg + 4,
upper_32_bits(lrc_desc));
xe_mmio_write32(gt, RING_EXECLIST_CONTROL(hwe->mmio_base).reg,
EL_CTRL_LOAD);
}
static void __xe_execlist_port_start(struct xe_execlist_port *port,
struct xe_execlist_engine *exl)
{
struct xe_device *xe = gt_to_xe(port->hwe->gt);
int max_ctx = FIELD_MAX(GEN11_SW_CTX_ID);
if (GRAPHICS_VERx100(xe) >= 1250)
max_ctx = FIELD_MAX(XEHP_SW_CTX_ID);
xe_execlist_port_assert_held(port);
if (port->running_exl != exl || !exl->has_run) {
port->last_ctx_id++;
/* 0 is reserved for the kernel context */
if (port->last_ctx_id > max_ctx)
port->last_ctx_id = 1;
}
__start_lrc(port->hwe, exl->engine->lrc, port->last_ctx_id);
port->running_exl = exl;
exl->has_run = true;
}
static void __xe_execlist_port_idle(struct xe_execlist_port *port)
{
u32 noop[2] = { MI_NOOP, MI_NOOP };
xe_execlist_port_assert_held(port);
if (!port->running_exl)
return;
printk(KERN_INFO "__xe_execlist_port_idle(%d:%d)\n", port->hwe->class,
port->hwe->instance);
xe_lrc_write_ring(&port->hwe->kernel_lrc, noop, sizeof(noop));
__start_lrc(port->hwe, &port->hwe->kernel_lrc, 0);
port->running_exl = NULL;
}
static bool xe_execlist_is_idle(struct xe_execlist_engine *exl)
{
struct xe_lrc *lrc = exl->engine->lrc;
return lrc->ring.tail == lrc->ring.old_tail;
}
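/*
 * Walk the port's active lists from highest to lowest priority, dropping any
 * engine whose ring has no new work (tail == old_tail) and (re)starting the
 * first one that does. If nothing is runnable, park the port on the kernel
 * context via __xe_execlist_port_idle().
 */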
static void __xe_execlist_port_start_next_active(struct xe_execlist_port *port)
{
struct xe_execlist_engine *exl = NULL;
int i;
xe_execlist_port_assert_held(port);
for (i = ARRAY_SIZE(port->active) - 1; i >= 0; i--) {
while (!list_empty(&port->active[i])) {
exl = list_first_entry(&port->active[i],
struct xe_execlist_engine,
active_link);
list_del(&exl->active_link);
if (xe_execlist_is_idle(exl)) {
exl->active_priority = XE_ENGINE_PRIORITY_UNSET;
continue;
}
list_add_tail(&exl->active_link, &port->active[i]);
__xe_execlist_port_start(port, exl);
return;
}
}
__xe_execlist_port_idle(port);
}
static u64 read_execlist_status(struct xe_hw_engine *hwe)
{
struct xe_gt *gt = hwe->gt;
u32 hi, lo;
lo = xe_mmio_read32(gt, RING_EXECLIST_STATUS_LO(hwe->mmio_base).reg);
hi = xe_mmio_read32(gt, RING_EXECLIST_STATUS_HI(hwe->mmio_base).reg);
printk(KERN_INFO "EXECLIST_STATUS %d:%d = 0x%08x %08x\n", hwe->class,
hwe->instance, hi, lo);
return lo | (u64)hi << 32;
}
static void xe_execlist_port_irq_handler_locked(struct xe_execlist_port *port)
{
u64 status;
xe_execlist_port_assert_held(port);
status = read_execlist_status(port->hwe);
if (status & BIT(7))
return;
__xe_execlist_port_start_next_active(port);
}
static void xe_execlist_port_irq_handler(struct xe_hw_engine *hwe,
u16 intr_vec)
{
struct xe_execlist_port *port = hwe->exl_port;
spin_lock(&port->lock);
xe_execlist_port_irq_handler_locked(port);
spin_unlock(&port->lock);
}
static void xe_execlist_port_wake_locked(struct xe_execlist_port *port,
enum xe_engine_priority priority)
{
xe_execlist_port_assert_held(port);
if (port->running_exl && port->running_exl->active_priority >= priority)
return;
__xe_execlist_port_start_next_active(port);
}
static void xe_execlist_make_active(struct xe_execlist_engine *exl)
{
struct xe_execlist_port *port = exl->port;
enum xe_engine_priority priority = exl->active_priority;
XE_BUG_ON(priority == XE_ENGINE_PRIORITY_UNSET);
XE_BUG_ON(priority < 0);
XE_BUG_ON(priority >= ARRAY_SIZE(exl->port->active));
spin_lock_irq(&port->lock);
if (exl->active_priority != priority &&
exl->active_priority != XE_ENGINE_PRIORITY_UNSET) {
/* Priority changed, move it to the right list */
list_del(&exl->active_link);
exl->active_priority = XE_ENGINE_PRIORITY_UNSET;
}
if (exl->active_priority == XE_ENGINE_PRIORITY_UNSET) {
exl->active_priority = priority;
list_add_tail(&exl->active_link, &port->active[priority]);
}
xe_execlist_port_wake_locked(exl->port, priority);
spin_unlock_irq(&port->lock);
}
static void xe_execlist_port_irq_fail_timer(struct timer_list *timer)
{
struct xe_execlist_port *port =
container_of(timer, struct xe_execlist_port, irq_fail);
spin_lock_irq(&port->lock);
xe_execlist_port_irq_handler_locked(port);
spin_unlock_irq(&port->lock);
port->irq_fail.expires = jiffies + msecs_to_jiffies(1000);
add_timer(&port->irq_fail);
}
struct xe_execlist_port *xe_execlist_port_create(struct xe_device *xe,
struct xe_hw_engine *hwe)
{
struct drm_device *drm = &xe->drm;
struct xe_execlist_port *port;
int i;
port = drmm_kzalloc(drm, sizeof(*port), GFP_KERNEL);
if (!port)
return ERR_PTR(-ENOMEM);
port->hwe = hwe;
spin_lock_init(&port->lock);
for (i = 0; i < ARRAY_SIZE(port->active); i++)
INIT_LIST_HEAD(&port->active[i]);
port->last_ctx_id = 1;
port->running_exl = NULL;
hwe->irq_handler = xe_execlist_port_irq_handler;
/* TODO: Fix the interrupt code so it doesn't race like mad */
timer_setup(&port->irq_fail, xe_execlist_port_irq_fail_timer, 0);
port->irq_fail.expires = jiffies + msecs_to_jiffies(1000);
add_timer(&port->irq_fail);
return port;
}
void xe_execlist_port_destroy(struct xe_execlist_port *port)
{
del_timer(&port->irq_fail);
/* Prevent an interrupt while we're destroying */
spin_lock_irq(&gt_to_xe(port->hwe->gt)->irq.lock);
port->hwe->irq_handler = NULL;
spin_unlock_irq(&gt_to_xe(port->hwe->gt)->irq.lock);
}
static struct dma_fence *
execlist_run_job(struct drm_sched_job *drm_job)
{
struct xe_sched_job *job = to_xe_sched_job(drm_job);
struct xe_engine *e = job->engine;
struct xe_execlist_engine *exl = job->engine->execlist;
e->ring_ops->emit_job(job);
xe_execlist_make_active(exl);
return dma_fence_get(job->fence);
}
static void execlist_job_free(struct drm_sched_job *drm_job)
{
struct xe_sched_job *job = to_xe_sched_job(drm_job);
xe_sched_job_put(job);
}
static const struct drm_sched_backend_ops drm_sched_ops = {
.run_job = execlist_run_job,
.free_job = execlist_job_free,
};
static int execlist_engine_init(struct xe_engine *e)
{
struct drm_gpu_scheduler *sched;
struct xe_execlist_engine *exl;
int err;
XE_BUG_ON(xe_device_guc_submission_enabled(gt_to_xe(e->gt)));
exl = kzalloc(sizeof(*exl), GFP_KERNEL);
if (!exl)
return -ENOMEM;
exl->engine = e;
err = drm_sched_init(&exl->sched, &drm_sched_ops, NULL, 1,
e->lrc[0].ring.size / MAX_JOB_SIZE_BYTES,
XE_SCHED_HANG_LIMIT, XE_SCHED_JOB_TIMEOUT,
NULL, NULL, e->hwe->name,
gt_to_xe(e->gt)->drm.dev);
if (err)
goto err_free;
sched = &exl->sched;
err = drm_sched_entity_init(&exl->entity, 0, &sched, 1, NULL);
if (err)
goto err_sched;
exl->port = e->hwe->exl_port;
exl->has_run = false;
exl->active_priority = XE_ENGINE_PRIORITY_UNSET;
e->execlist = exl;
e->entity = &exl->entity;
switch (e->class) {
case XE_ENGINE_CLASS_RENDER:
sprintf(e->name, "rcs%d", ffs(e->logical_mask) - 1);
break;
case XE_ENGINE_CLASS_VIDEO_DECODE:
sprintf(e->name, "vcs%d", ffs(e->logical_mask) - 1);
break;
case XE_ENGINE_CLASS_VIDEO_ENHANCE:
sprintf(e->name, "vecs%d", ffs(e->logical_mask) - 1);
break;
case XE_ENGINE_CLASS_COPY:
sprintf(e->name, "bcs%d", ffs(e->logical_mask) - 1);
break;
case XE_ENGINE_CLASS_COMPUTE:
sprintf(e->name, "ccs%d", ffs(e->logical_mask) - 1);
break;
default:
XE_WARN_ON(e->class);
}
return 0;
err_sched:
drm_sched_fini(&exl->sched);
err_free:
kfree(exl);
return err;
}
static void execlist_engine_fini_async(struct work_struct *w)
{
struct xe_execlist_engine *ee =
container_of(w, struct xe_execlist_engine, fini_async);
struct xe_engine *e = ee->engine;
struct xe_execlist_engine *exl = e->execlist;
unsigned long flags;
XE_BUG_ON(xe_device_guc_submission_enabled(gt_to_xe(e->gt)));
spin_lock_irqsave(&exl->port->lock, flags);
if (WARN_ON(exl->active_priority != XE_ENGINE_PRIORITY_UNSET))
list_del(&exl->active_link);
spin_unlock_irqrestore(&exl->port->lock, flags);
if (e->flags & ENGINE_FLAG_PERSISTENT)
xe_device_remove_persitent_engines(gt_to_xe(e->gt), e);
drm_sched_entity_fini(&exl->entity);
drm_sched_fini(&exl->sched);
kfree(exl);
xe_engine_fini(e);
}
static void execlist_engine_kill(struct xe_engine *e)
{
/* NIY */
}
static void execlist_engine_fini(struct xe_engine *e)
{
INIT_WORK(&e->execlist->fini_async, execlist_engine_fini_async);
queue_work(system_unbound_wq, &e->execlist->fini_async);
}
static int execlist_engine_set_priority(struct xe_engine *e,
enum xe_engine_priority priority)
{
/* NIY */
return 0;
}
static int execlist_engine_set_timeslice(struct xe_engine *e, u32 timeslice_us)
{
/* NIY */
return 0;
}
static int execlist_engine_set_preempt_timeout(struct xe_engine *e,
u32 preempt_timeout_us)
{
/* NIY */
return 0;
}
static int execlist_engine_set_job_timeout(struct xe_engine *e,
u32 job_timeout_ms)
{
/* NIY */
return 0;
}
static int execlist_engine_suspend(struct xe_engine *e)
{
/* NIY */
return 0;
}
static void execlist_engine_suspend_wait(struct xe_engine *e)
{
/* NIY */
}
static void execlist_engine_resume(struct xe_engine *e)
{
xe_mocs_init_engine(e);
}
static const struct xe_engine_ops execlist_engine_ops = {
.init = execlist_engine_init,
.kill = execlist_engine_kill,
.fini = execlist_engine_fini,
.set_priority = execlist_engine_set_priority,
.set_timeslice = execlist_engine_set_timeslice,
.set_preempt_timeout = execlist_engine_set_preempt_timeout,
.set_job_timeout = execlist_engine_set_job_timeout,
.suspend = execlist_engine_suspend,
.suspend_wait = execlist_engine_suspend_wait,
.resume = execlist_engine_resume,
};
int xe_execlist_init(struct xe_gt *gt)
{
/* GuC submission enabled, nothing to do */
if (xe_device_guc_submission_enabled(gt_to_xe(gt)))
return 0;
gt->engine_ops = &execlist_engine_ops;
return 0;
}

View File

@ -0,0 +1,21 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2021 Intel Corporation
*/
#ifndef _XE_EXECLIST_H_
#define _XE_EXECLIST_H_
#include "xe_execlist_types.h"
struct xe_device;
struct xe_gt;
#define xe_execlist_port_assert_held(port) lockdep_assert_held(&(port)->lock);
int xe_execlist_init(struct xe_gt *gt);
struct xe_execlist_port *xe_execlist_port_create(struct xe_device *xe,
struct xe_hw_engine *hwe);
void xe_execlist_port_destroy(struct xe_execlist_port *port);
#endif

View File

@ -0,0 +1,49 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_EXECLIST_TYPES_H_
#define _XE_EXECLIST_TYPES_H_
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/workqueue.h>
#include "xe_engine_types.h"
struct xe_hw_engine;
struct xe_execlist_engine;
struct xe_execlist_port {
struct xe_hw_engine *hwe;
spinlock_t lock;
struct list_head active[XE_ENGINE_PRIORITY_COUNT];
u32 last_ctx_id;
struct xe_execlist_engine *running_exl;
struct timer_list irq_fail;
};
struct xe_execlist_engine {
struct xe_engine *engine;
struct drm_gpu_scheduler sched;
struct drm_sched_entity entity;
struct xe_execlist_port *port;
bool has_run;
struct work_struct fini_async;
enum xe_engine_priority active_priority;
struct list_head active_link;
};
#endif

View File

@ -0,0 +1,203 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include <drm/drm_util.h>
#include "xe_force_wake.h"
#include "xe_gt.h"
#include "xe_mmio.h"
#include "gt/intel_gt_regs.h"
#define XE_FORCE_WAKE_ACK_TIMEOUT_MS 50
static struct xe_gt *
fw_to_gt(struct xe_force_wake *fw)
{
return fw->gt;
}
static struct xe_device *
fw_to_xe(struct xe_force_wake *fw)
{
return gt_to_xe(fw_to_gt(fw));
}
static void domain_init(struct xe_force_wake_domain *domain,
enum xe_force_wake_domain_id id,
u32 reg, u32 ack, u32 val, u32 mask)
{
domain->id = id;
domain->reg_ctl = reg;
domain->reg_ack = ack;
domain->val = val;
domain->mask = mask;
}
#define FORCEWAKE_ACK_GT_MTL _MMIO(0xdfc)
void xe_force_wake_init_gt(struct xe_gt *gt, struct xe_force_wake *fw)
{
struct xe_device *xe = gt_to_xe(gt);
fw->gt = gt;
mutex_init(&fw->lock);
/* Assuming gen11+ so assert this assumption is correct */
XE_BUG_ON(GRAPHICS_VER(gt_to_xe(gt)) < 11);
if (xe->info.platform == XE_METEORLAKE) {
domain_init(&fw->domains[XE_FW_DOMAIN_ID_GT],
XE_FW_DOMAIN_ID_GT,
FORCEWAKE_GT_GEN9.reg,
FORCEWAKE_ACK_GT_MTL.reg,
BIT(0), BIT(16));
} else {
domain_init(&fw->domains[XE_FW_DOMAIN_ID_GT],
XE_FW_DOMAIN_ID_GT,
FORCEWAKE_GT_GEN9.reg,
FORCEWAKE_ACK_GT_GEN9.reg,
BIT(0), BIT(16));
}
}
void xe_force_wake_init_engines(struct xe_gt *gt, struct xe_force_wake *fw)
{
int i, j;
/* Assuming gen11+ so assert this assumption is correct */
XE_BUG_ON(GRAPHICS_VER(gt_to_xe(gt)) < 11);
if (!xe_gt_is_media_type(gt))
domain_init(&fw->domains[XE_FW_DOMAIN_ID_RENDER],
XE_FW_DOMAIN_ID_RENDER,
FORCEWAKE_RENDER_GEN9.reg,
FORCEWAKE_ACK_RENDER_GEN9.reg,
BIT(0), BIT(16));
for (i = XE_HW_ENGINE_VCS0, j = 0; i <= XE_HW_ENGINE_VCS7; ++i, ++j) {
if (!(gt->info.engine_mask & BIT(i)))
continue;
domain_init(&fw->domains[XE_FW_DOMAIN_ID_MEDIA_VDBOX0 + j],
XE_FW_DOMAIN_ID_MEDIA_VDBOX0 + j,
FORCEWAKE_MEDIA_VDBOX_GEN11(j).reg,
FORCEWAKE_ACK_MEDIA_VDBOX_GEN11(j).reg,
BIT(0), BIT(16));
}
	for (i = XE_HW_ENGINE_VECS0, j = 0; i <= XE_HW_ENGINE_VECS3; ++i, ++j) {
if (!(gt->info.engine_mask & BIT(i)))
continue;
domain_init(&fw->domains[XE_FW_DOMAIN_ID_MEDIA_VEBOX0 + j],
XE_FW_DOMAIN_ID_MEDIA_VEBOX0 + j,
FORCEWAKE_MEDIA_VEBOX_GEN11(j).reg,
FORCEWAKE_ACK_MEDIA_VEBOX_GEN11(j).reg,
BIT(0), BIT(16));
}
}
void xe_force_wake_prune(struct xe_gt *gt, struct xe_force_wake *fw)
{
int i, j;
/* Call after fuses have been read, prune domains that are fused off */
for (i = XE_HW_ENGINE_VCS0, j = 0; i <= XE_HW_ENGINE_VCS7; ++i, ++j)
if (!(gt->info.engine_mask & BIT(i)))
fw->domains[XE_FW_DOMAIN_ID_MEDIA_VDBOX0 + j].reg_ctl = 0;
	for (i = XE_HW_ENGINE_VECS0, j = 0; i <= XE_HW_ENGINE_VECS3; ++i, ++j)
if (!(gt->info.engine_mask & BIT(i)))
fw->domains[XE_FW_DOMAIN_ID_MEDIA_VEBOX0 + j].reg_ctl = 0;
}
static void domain_wake(struct xe_gt *gt, struct xe_force_wake_domain *domain)
{
xe_mmio_write32(gt, domain->reg_ctl, domain->mask | domain->val);
}
static int domain_wake_wait(struct xe_gt *gt,
struct xe_force_wake_domain *domain)
{
return xe_mmio_wait32(gt, domain->reg_ack, domain->val, domain->val,
XE_FORCE_WAKE_ACK_TIMEOUT_MS);
}
static void domain_sleep(struct xe_gt *gt, struct xe_force_wake_domain *domain)
{
xe_mmio_write32(gt, domain->reg_ctl, domain->mask);
}
static int domain_sleep_wait(struct xe_gt *gt,
struct xe_force_wake_domain *domain)
{
return xe_mmio_wait32(gt, domain->reg_ack, 0, domain->val,
XE_FORCE_WAKE_ACK_TIMEOUT_MS);
}
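/*
 * Iterate over the domains whose bits are set in mask__, skipping any domain
 * that was never initialized or has been pruned (reg_ctl == 0).
 */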
#define for_each_fw_domain_masked(domain__, mask__, fw__, tmp__) \
for (tmp__ = (mask__); tmp__ ;) \
for_each_if((domain__ = ((fw__)->domains + \
__mask_next_bit(tmp__))) && \
domain__->reg_ctl)
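/*
 * Force wake domains are reference counted under fw->lock: only a 0 -> 1
 * transition in xe_force_wake_get() writes the wake request and waits for the
 * hardware ack, and only a 1 -> 0 transition in xe_force_wake_put() releases
 * the domain again. The return value is the OR of all ack-wait results, so 0
 * means every requested domain reached the desired state.
 */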
int xe_force_wake_get(struct xe_force_wake *fw,
enum xe_force_wake_domains domains)
{
struct xe_device *xe = fw_to_xe(fw);
struct xe_gt *gt = fw_to_gt(fw);
struct xe_force_wake_domain *domain;
enum xe_force_wake_domains tmp, woken = 0;
int ret, ret2 = 0;
mutex_lock(&fw->lock);
for_each_fw_domain_masked(domain, domains, fw, tmp) {
if (!domain->ref++) {
woken |= BIT(domain->id);
domain_wake(gt, domain);
}
}
for_each_fw_domain_masked(domain, woken, fw, tmp) {
ret = domain_wake_wait(gt, domain);
ret2 |= ret;
if (ret)
drm_notice(&xe->drm, "Force wake domain (%d) failed to ack wake, ret=%d\n",
domain->id, ret);
}
fw->awake_domains |= woken;
mutex_unlock(&fw->lock);
return ret2;
}
int xe_force_wake_put(struct xe_force_wake *fw,
enum xe_force_wake_domains domains)
{
struct xe_device *xe = fw_to_xe(fw);
struct xe_gt *gt = fw_to_gt(fw);
struct xe_force_wake_domain *domain;
enum xe_force_wake_domains tmp, sleep = 0;
int ret, ret2 = 0;
mutex_lock(&fw->lock);
for_each_fw_domain_masked(domain, domains, fw, tmp) {
if (!--domain->ref) {
sleep |= BIT(domain->id);
domain_sleep(gt, domain);
}
}
for_each_fw_domain_masked(domain, sleep, fw, tmp) {
ret = domain_sleep_wait(gt, domain);
ret2 |= ret;
if (ret)
drm_notice(&xe->drm, "Force wake domain (%d) failed to ack sleep, ret=%d\n",
domain->id, ret);
}
fw->awake_domains &= ~sleep;
mutex_unlock(&fw->lock);
return ret2;
}

View File

@ -0,0 +1,40 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_FORCE_WAKE_H_
#define _XE_FORCE_WAKE_H_
#include "xe_force_wake_types.h"
#include "xe_macros.h"
struct xe_gt;
void xe_force_wake_init_gt(struct xe_gt *gt,
struct xe_force_wake *fw);
void xe_force_wake_init_engines(struct xe_gt *gt,
struct xe_force_wake *fw);
void xe_force_wake_prune(struct xe_gt *gt,
struct xe_force_wake *fw);
int xe_force_wake_get(struct xe_force_wake *fw,
enum xe_force_wake_domains domains);
int xe_force_wake_put(struct xe_force_wake *fw,
enum xe_force_wake_domains domains);
static inline int
xe_force_wake_ref(struct xe_force_wake *fw,
enum xe_force_wake_domains domain)
{
XE_BUG_ON(!domain);
return fw->domains[ffs(domain) - 1].ref;
}
static inline void
xe_force_wake_assert_held(struct xe_force_wake *fw,
enum xe_force_wake_domains domain)
{
XE_BUG_ON(!(fw->awake_domains & domain));
}
#endif

View File

@ -0,0 +1,84 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_FORCE_WAKE_TYPES_H_
#define _XE_FORCE_WAKE_TYPES_H_
#include <linux/mutex.h>
#include <linux/types.h>
enum xe_force_wake_domain_id {
XE_FW_DOMAIN_ID_GT = 0,
XE_FW_DOMAIN_ID_RENDER,
XE_FW_DOMAIN_ID_MEDIA,
XE_FW_DOMAIN_ID_MEDIA_VDBOX0,
XE_FW_DOMAIN_ID_MEDIA_VDBOX1,
XE_FW_DOMAIN_ID_MEDIA_VDBOX2,
XE_FW_DOMAIN_ID_MEDIA_VDBOX3,
XE_FW_DOMAIN_ID_MEDIA_VDBOX4,
XE_FW_DOMAIN_ID_MEDIA_VDBOX5,
XE_FW_DOMAIN_ID_MEDIA_VDBOX6,
XE_FW_DOMAIN_ID_MEDIA_VDBOX7,
XE_FW_DOMAIN_ID_MEDIA_VEBOX0,
XE_FW_DOMAIN_ID_MEDIA_VEBOX1,
XE_FW_DOMAIN_ID_MEDIA_VEBOX2,
XE_FW_DOMAIN_ID_MEDIA_VEBOX3,
XE_FW_DOMAIN_ID_GSC,
XE_FW_DOMAIN_ID_COUNT
};
enum xe_force_wake_domains {
XE_FW_GT = BIT(XE_FW_DOMAIN_ID_GT),
XE_FW_RENDER = BIT(XE_FW_DOMAIN_ID_RENDER),
XE_FW_MEDIA = BIT(XE_FW_DOMAIN_ID_MEDIA),
XE_FW_MEDIA_VDBOX0 = BIT(XE_FW_DOMAIN_ID_MEDIA_VDBOX0),
XE_FW_MEDIA_VDBOX1 = BIT(XE_FW_DOMAIN_ID_MEDIA_VDBOX1),
XE_FW_MEDIA_VDBOX2 = BIT(XE_FW_DOMAIN_ID_MEDIA_VDBOX2),
XE_FW_MEDIA_VDBOX3 = BIT(XE_FW_DOMAIN_ID_MEDIA_VDBOX3),
XE_FW_MEDIA_VDBOX4 = BIT(XE_FW_DOMAIN_ID_MEDIA_VDBOX4),
XE_FW_MEDIA_VDBOX5 = BIT(XE_FW_DOMAIN_ID_MEDIA_VDBOX5),
XE_FW_MEDIA_VDBOX6 = BIT(XE_FW_DOMAIN_ID_MEDIA_VDBOX6),
XE_FW_MEDIA_VDBOX7 = BIT(XE_FW_DOMAIN_ID_MEDIA_VDBOX7),
XE_FW_MEDIA_VEBOX0 = BIT(XE_FW_DOMAIN_ID_MEDIA_VEBOX0),
XE_FW_MEDIA_VEBOX1 = BIT(XE_FW_DOMAIN_ID_MEDIA_VEBOX1),
XE_FW_MEDIA_VEBOX2 = BIT(XE_FW_DOMAIN_ID_MEDIA_VEBOX2),
XE_FW_MEDIA_VEBOX3 = BIT(XE_FW_DOMAIN_ID_MEDIA_VEBOX3),
XE_FW_GSC = BIT(XE_FW_DOMAIN_ID_GSC),
XE_FORCEWAKE_ALL = BIT(XE_FW_DOMAIN_ID_COUNT) - 1
};
/**
* struct xe_force_wake_domain - XE force wake domains
*/
struct xe_force_wake_domain {
/** @id: domain force wake id */
enum xe_force_wake_domain_id id;
/** @reg_ctl: domain wake control register address */
u32 reg_ctl;
/** @reg_ack: domain ack register address */
u32 reg_ack;
/** @val: domain wake write value */
u32 val;
/** @mask: domain mask */
u32 mask;
/** @ref: domain reference */
u32 ref;
};
/**
* struct xe_force_wake - XE force wake
*/
struct xe_force_wake {
	/** @gt: back pointer to GT */
	struct xe_gt *gt;
	/** @lock: protects everything in the force wake struct */
struct mutex lock;
/** @awake_domains: mask of all domains awake */
enum xe_force_wake_domains awake_domains;
/** @domains: force wake domains */
struct xe_force_wake_domain domains[XE_FW_DOMAIN_ID_COUNT];
};
#endif

View File

@ -0,0 +1,304 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2021 Intel Corporation
*/
#include "xe_ggtt.h"
#include <linux/sizes.h>
#include <drm/i915_drm.h>
#include <drm/drm_managed.h>
#include "xe_device.h"
#include "xe_bo.h"
#include "xe_gt.h"
#include "xe_mmio.h"
#include "xe_wopcm.h"
#include "i915_reg.h"
#include "gt/intel_gt_regs.h"
/* FIXME: Common file, preferably auto-gen */
#define MTL_GGTT_PTE_PAT0 BIT(52)
#define MTL_GGTT_PTE_PAT1 BIT(53)
u64 xe_ggtt_pte_encode(struct xe_bo *bo, u64 bo_offset)
{
struct xe_device *xe = xe_bo_device(bo);
u64 pte;
bool is_lmem;
pte = xe_bo_addr(bo, bo_offset, GEN8_PAGE_SIZE, &is_lmem);
pte |= GEN8_PAGE_PRESENT;
if (is_lmem)
pte |= GEN12_GGTT_PTE_LM;
/* FIXME: vfunc + pass in caching rules */
if (xe->info.platform == XE_METEORLAKE) {
pte |= MTL_GGTT_PTE_PAT0;
pte |= MTL_GGTT_PTE_PAT1;
}
return pte;
}
static unsigned int probe_gsm_size(struct pci_dev *pdev)
{
u16 gmch_ctl, ggms;
pci_read_config_word(pdev, SNB_GMCH_CTRL, &gmch_ctl);
ggms = (gmch_ctl >> BDW_GMCH_GGMS_SHIFT) & BDW_GMCH_GGMS_MASK;
return ggms ? SZ_1M << ggms : 0;
}
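/*
 * @addr is a byte offset into the GGTT address space; each 64-bit PTE maps a
 * 4KiB page, hence the addr >> GEN8_PTE_SHIFT indexing into the GSM below.
 */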
void xe_ggtt_set_pte(struct xe_ggtt *ggtt, u64 addr, u64 pte)
{
XE_BUG_ON(addr & GEN8_PTE_MASK);
XE_BUG_ON(addr >= ggtt->size);
writeq(pte, &ggtt->gsm[addr >> GEN8_PTE_SHIFT]);
}
static void xe_ggtt_clear(struct xe_ggtt *ggtt, u64 start, u64 size)
{
u64 end = start + size - 1;
u64 scratch_pte;
XE_BUG_ON(start >= end);
if (ggtt->scratch)
scratch_pte = xe_ggtt_pte_encode(ggtt->scratch, 0);
else
scratch_pte = 0;
while (start < end) {
xe_ggtt_set_pte(ggtt, start, scratch_pte);
start += GEN8_PAGE_SIZE;
}
}
static void ggtt_fini_noalloc(struct drm_device *drm, void *arg)
{
struct xe_ggtt *ggtt = arg;
mutex_destroy(&ggtt->lock);
drm_mm_takedown(&ggtt->mm);
xe_bo_unpin_map_no_vm(ggtt->scratch);
}
int xe_ggtt_init_noalloc(struct xe_gt *gt, struct xe_ggtt *ggtt)
{
struct xe_device *xe = gt_to_xe(gt);
struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
unsigned int gsm_size;
XE_BUG_ON(xe_gt_is_media_type(gt));
ggtt->gt = gt;
gsm_size = probe_gsm_size(pdev);
if (gsm_size == 0) {
drm_err(&xe->drm, "Hardware reported no preallocated GSM\n");
return -ENOMEM;
}
ggtt->gsm = gt->mmio.regs + SZ_8M;
ggtt->size = (gsm_size / 8) * (u64)GEN8_PAGE_SIZE;
/*
* 8B per entry, each points to a 4KB page.
*
	 * The GuC owns the WOPCM space, thus we can't allocate GGTT addresses in
	 * this area. Even though we likely configure the WOPCM to less than the
	 * maximum value, to simplify the driver load (no need to fetch HuC +
	 * GuC firmwares and determine their sizes before initializing the GGTT)
	 * just start the GGTT allocation above the max WOPCM size. This might
	 * waste space in the GGTT (WOPCM is 2MB on modern platforms) but we can
	 * live with this.
	 *
	 * Another benefit of this is that the GuC bootrom can't access anything
	 * below the WOPCM max size, so anything the bootrom needs to access
	 * (e.g. a RSA key) needs to be placed in the GGTT above the WOPCM max
	 * size. Starting the GGTT allocations above the WOPCM max gives us the
	 * correct placement for free.
*/
drm_mm_init(&ggtt->mm, xe_wopcm_size(xe),
ggtt->size - xe_wopcm_size(xe));
mutex_init(&ggtt->lock);
return drmm_add_action_or_reset(&xe->drm, ggtt_fini_noalloc, ggtt);
}
static void xe_ggtt_initial_clear(struct xe_ggtt *ggtt)
{
struct drm_mm_node *hole;
u64 start, end;
/* Display may have allocated inside ggtt, so be careful with clearing here */
mutex_lock(&ggtt->lock);
drm_mm_for_each_hole(hole, &ggtt->mm, start, end)
xe_ggtt_clear(ggtt, start, end - start);
xe_ggtt_invalidate(ggtt->gt);
mutex_unlock(&ggtt->lock);
}
int xe_ggtt_init(struct xe_gt *gt, struct xe_ggtt *ggtt)
{
struct xe_device *xe = gt_to_xe(gt);
int err;
ggtt->scratch = xe_bo_create_locked(xe, gt, NULL, GEN8_PAGE_SIZE,
ttm_bo_type_kernel,
XE_BO_CREATE_VRAM_IF_DGFX(gt) |
XE_BO_CREATE_PINNED_BIT);
if (IS_ERR(ggtt->scratch)) {
err = PTR_ERR(ggtt->scratch);
goto err;
}
err = xe_bo_pin(ggtt->scratch);
xe_bo_unlock_no_vm(ggtt->scratch);
if (err) {
xe_bo_put(ggtt->scratch);
goto err;
}
xe_ggtt_initial_clear(ggtt);
return 0;
err:
ggtt->scratch = NULL;
return err;
}
#define GEN12_GUC_TLB_INV_CR _MMIO(0xcee8)
#define GEN12_GUC_TLB_INV_CR_INVALIDATE (1 << 0)
#define PVC_GUC_TLB_INV_DESC0 _MMIO(0xcf7c)
#define PVC_GUC_TLB_INV_DESC0_VALID (1 << 0)
#define PVC_GUC_TLB_INV_DESC1 _MMIO(0xcf80)
#define PVC_GUC_TLB_INV_DESC1_INVALIDATE (1 << 6)
void xe_ggtt_invalidate(struct xe_gt *gt)
{
/* TODO: vfunc for GuC vs. non-GuC */
/* TODO: i915 makes comments about this being uncached and
* therefore flushing WC buffers. Is that really true here?
*/
xe_mmio_write32(gt, GFX_FLSH_CNTL_GEN6.reg, GFX_FLSH_CNTL_EN);
if (xe_device_guc_submission_enabled(gt_to_xe(gt))) {
struct xe_device *xe = gt_to_xe(gt);
/* TODO: also use vfunc here */
if (xe->info.platform == XE_PVC) {
xe_mmio_write32(gt, PVC_GUC_TLB_INV_DESC1.reg,
PVC_GUC_TLB_INV_DESC1_INVALIDATE);
xe_mmio_write32(gt, PVC_GUC_TLB_INV_DESC0.reg,
PVC_GUC_TLB_INV_DESC0_VALID);
} else
xe_mmio_write32(gt, GEN12_GUC_TLB_INV_CR.reg,
GEN12_GUC_TLB_INV_CR_INVALIDATE);
}
}
void xe_ggtt_printk(struct xe_ggtt *ggtt, const char *prefix)
{
u64 addr, scratch_pte;
scratch_pte = xe_ggtt_pte_encode(ggtt->scratch, 0);
printk("%sGlobal GTT:", prefix);
for (addr = 0; addr < ggtt->size; addr += GEN8_PAGE_SIZE) {
unsigned int i = addr / GEN8_PAGE_SIZE;
XE_BUG_ON(addr > U32_MAX);
if (ggtt->gsm[i] == scratch_pte)
continue;
printk("%s ggtt[0x%08x] = 0x%016llx",
prefix, (u32)addr, ggtt->gsm[i]);
}
}
int xe_ggtt_insert_special_node_locked(struct xe_ggtt *ggtt, struct drm_mm_node *node,
u32 size, u32 align, u32 mm_flags)
{
return drm_mm_insert_node_generic(&ggtt->mm, node, size, align, 0,
mm_flags);
}
int xe_ggtt_insert_special_node(struct xe_ggtt *ggtt, struct drm_mm_node *node,
u32 size, u32 align)
{
int ret;
mutex_lock(&ggtt->lock);
ret = xe_ggtt_insert_special_node_locked(ggtt, node, size,
align, DRM_MM_INSERT_HIGH);
mutex_unlock(&ggtt->lock);
return ret;
}
void xe_ggtt_map_bo(struct xe_ggtt *ggtt, struct xe_bo *bo)
{
u64 start = bo->ggtt_node.start;
u64 offset, pte;
for (offset = 0; offset < bo->size; offset += GEN8_PAGE_SIZE) {
pte = xe_ggtt_pte_encode(bo, offset);
xe_ggtt_set_pte(ggtt, start + offset, pte);
}
xe_ggtt_invalidate(ggtt->gt);
}
int xe_ggtt_insert_bo(struct xe_ggtt *ggtt, struct xe_bo *bo)
{
int err;
if (XE_WARN_ON(bo->ggtt_node.size)) {
/* Someone's already inserted this BO in the GGTT */
XE_BUG_ON(bo->ggtt_node.size != bo->size);
return 0;
}
err = xe_bo_validate(bo, NULL, false);
if (err)
return err;
mutex_lock(&ggtt->lock);
err = drm_mm_insert_node(&ggtt->mm, &bo->ggtt_node, bo->size);
if (!err)
xe_ggtt_map_bo(ggtt, bo);
mutex_unlock(&ggtt->lock);
	return err;
}
void xe_ggtt_remove_node(struct xe_ggtt *ggtt, struct drm_mm_node *node)
{
mutex_lock(&ggtt->lock);
xe_ggtt_clear(ggtt, node->start, node->size);
drm_mm_remove_node(node);
node->size = 0;
xe_ggtt_invalidate(ggtt->gt);
mutex_unlock(&ggtt->lock);
}
void xe_ggtt_remove_bo(struct xe_ggtt *ggtt, struct xe_bo *bo)
{
	/* This BO is not currently in the GGTT */
	if (XE_WARN_ON(!bo->ggtt_node.size))
		return;

	XE_BUG_ON(bo->ggtt_node.size != bo->size);
xe_ggtt_remove_node(ggtt, &bo->ggtt_node);
}

View File

@ -0,0 +1,28 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2021 Intel Corporation
*/
#ifndef _XE_GGTT_H_
#define _XE_GGTT_H_
#include "xe_ggtt_types.h"
u64 xe_ggtt_pte_encode(struct xe_bo *bo, u64 bo_offset);
void xe_ggtt_set_pte(struct xe_ggtt *ggtt, u64 addr, u64 pte);
void xe_ggtt_invalidate(struct xe_gt *gt);
int xe_ggtt_init_noalloc(struct xe_gt *gt, struct xe_ggtt *ggtt);
int xe_ggtt_init(struct xe_gt *gt, struct xe_ggtt *ggtt);
void xe_ggtt_printk(struct xe_ggtt *ggtt, const char *prefix);
int xe_ggtt_insert_special_node(struct xe_ggtt *ggtt, struct drm_mm_node *node,
u32 size, u32 align);
int xe_ggtt_insert_special_node_locked(struct xe_ggtt *ggtt,
struct drm_mm_node *node,
u32 size, u32 align, u32 mm_flags);
void xe_ggtt_remove_node(struct xe_ggtt *ggtt, struct drm_mm_node *node);
void xe_ggtt_map_bo(struct xe_ggtt *ggtt, struct xe_bo *bo);
int xe_ggtt_insert_bo(struct xe_ggtt *ggtt, struct xe_bo *bo);
void xe_ggtt_remove_bo(struct xe_ggtt *ggtt, struct xe_bo *bo);
#endif

View File

@ -0,0 +1,28 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GGTT_TYPES_H_
#define _XE_GGTT_TYPES_H_
#include <drm/drm_mm.h>
struct xe_bo;
struct xe_gt;
struct xe_ggtt {
struct xe_gt *gt;
u64 size;
struct xe_bo *scratch;
struct mutex lock;
u64 __iomem *gsm;
struct drm_mm mm;
};
#endif

View File

@ -0,0 +1,101 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2023 Intel Corporation
*/
#include "xe_gpu_scheduler.h"
static void xe_sched_process_msg_queue(struct xe_gpu_scheduler *sched)
{
if (!READ_ONCE(sched->base.pause_submit))
queue_work(sched->base.submit_wq, &sched->work_process_msg);
}
static void xe_sched_process_msg_queue_if_ready(struct xe_gpu_scheduler *sched)
{
struct xe_sched_msg *msg;
spin_lock(&sched->base.job_list_lock);
msg = list_first_entry_or_null(&sched->msgs, struct xe_sched_msg, link);
if (msg)
xe_sched_process_msg_queue(sched);
spin_unlock(&sched->base.job_list_lock);
}
static struct xe_sched_msg *
xe_sched_get_msg(struct xe_gpu_scheduler *sched)
{
struct xe_sched_msg *msg;
spin_lock(&sched->base.job_list_lock);
msg = list_first_entry_or_null(&sched->msgs,
struct xe_sched_msg, link);
if (msg)
list_del(&msg->link);
spin_unlock(&sched->base.job_list_lock);
return msg;
}
static void xe_sched_process_msg_work(struct work_struct *w)
{
struct xe_gpu_scheduler *sched =
container_of(w, struct xe_gpu_scheduler, work_process_msg);
struct xe_sched_msg *msg;
if (READ_ONCE(sched->base.pause_submit))
return;
msg = xe_sched_get_msg(sched);
if (msg) {
sched->ops->process_msg(msg);
xe_sched_process_msg_queue_if_ready(sched);
}
}
int xe_sched_init(struct xe_gpu_scheduler *sched,
const struct drm_sched_backend_ops *ops,
const struct xe_sched_backend_ops *xe_ops,
struct workqueue_struct *submit_wq,
uint32_t hw_submission, unsigned hang_limit,
long timeout, struct workqueue_struct *timeout_wq,
atomic_t *score, const char *name,
struct device *dev)
{
sched->ops = xe_ops;
INIT_LIST_HEAD(&sched->msgs);
INIT_WORK(&sched->work_process_msg, xe_sched_process_msg_work);
return drm_sched_init(&sched->base, ops, submit_wq, 1, hw_submission,
hang_limit, timeout, timeout_wq, score, name,
dev);
}
void xe_sched_fini(struct xe_gpu_scheduler *sched)
{
xe_sched_submission_stop(sched);
drm_sched_fini(&sched->base);
}
void xe_sched_submission_start(struct xe_gpu_scheduler *sched)
{
drm_sched_wqueue_start(&sched->base);
queue_work(sched->base.submit_wq, &sched->work_process_msg);
}
void xe_sched_submission_stop(struct xe_gpu_scheduler *sched)
{
drm_sched_wqueue_stop(&sched->base);
cancel_work_sync(&sched->work_process_msg);
}
void xe_sched_add_msg(struct xe_gpu_scheduler *sched,
struct xe_sched_msg *msg)
{
spin_lock(&sched->base.job_list_lock);
list_add_tail(&msg->link, &sched->msgs);
spin_unlock(&sched->base.job_list_lock);
xe_sched_process_msg_queue(sched);
}
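/*
 * Illustrative sketch (not part of the driver) of how a submission backend
 * could use the message mechanism above; the "my_*" names and MY_OPCODE are
 * hypothetical:
 *
 * .. code-block::
 *
 *	static void my_process_msg(struct xe_sched_msg *msg)
 *	{
 *		// Called from sched->work_process_msg, in-band with job
 *		// submission; may block. The callback owns the message.
 *		my_handle(msg->opcode, msg->private_data);
 *		kfree(msg);
 *	}
 *
 *	static const struct xe_sched_backend_ops my_sched_ops = {
 *		.process_msg = my_process_msg,
 *	};
 *
 *	// From the backend's submission path:
 *	struct xe_sched_msg *msg = kmalloc(sizeof(*msg), GFP_KERNEL);
 *
 *	msg->opcode = MY_OPCODE;	// backend defined
 *	msg->private_data = my_ctx;
 *	xe_sched_add_msg(sched, msg);	// processed in submission order
 */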

View File

@ -0,0 +1,73 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2023 Intel Corporation
*/
#ifndef _XE_GPU_SCHEDULER_H_
#define _XE_GPU_SCHEDULER_H_
#include "xe_gpu_scheduler_types.h"
#include "xe_sched_job_types.h"
int xe_sched_init(struct xe_gpu_scheduler *sched,
const struct drm_sched_backend_ops *ops,
const struct xe_sched_backend_ops *xe_ops,
struct workqueue_struct *submit_wq,
uint32_t hw_submission, unsigned hang_limit,
long timeout, struct workqueue_struct *timeout_wq,
atomic_t *score, const char *name,
struct device *dev);
void xe_sched_fini(struct xe_gpu_scheduler *sched);
void xe_sched_submission_start(struct xe_gpu_scheduler *sched);
void xe_sched_submission_stop(struct xe_gpu_scheduler *sched);
void xe_sched_add_msg(struct xe_gpu_scheduler *sched,
struct xe_sched_msg *msg);
static inline void xe_sched_stop(struct xe_gpu_scheduler *sched)
{
drm_sched_stop(&sched->base, NULL);
}
static inline void xe_sched_tdr_queue_imm(struct xe_gpu_scheduler *sched)
{
drm_sched_tdr_queue_imm(&sched->base);
}
static inline void xe_sched_resubmit_jobs(struct xe_gpu_scheduler *sched)
{
drm_sched_resubmit_jobs(&sched->base);
}
static inline bool
xe_sched_invalidate_job(struct xe_sched_job *job, int threshold)
{
return drm_sched_invalidate_job(&job->drm, threshold);
}
static inline void xe_sched_add_pending_job(struct xe_gpu_scheduler *sched,
struct xe_sched_job *job)
{
list_add(&job->drm.list, &sched->base.pending_list);
}
static inline
struct xe_sched_job *xe_sched_first_pending_job(struct xe_gpu_scheduler *sched)
{
return list_first_entry_or_null(&sched->base.pending_list,
struct xe_sched_job, drm.list);
}
static inline int
xe_sched_entity_init(struct xe_sched_entity *entity,
struct xe_gpu_scheduler *sched)
{
return drm_sched_entity_init(entity, 0,
(struct drm_gpu_scheduler **)&sched,
1, NULL);
}
#define xe_sched_entity_fini drm_sched_entity_fini
#endif

View File

@ -0,0 +1,57 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2023 Intel Corporation
*/
#ifndef _XE_GPU_SCHEDULER_TYPES_H_
#define _XE_GPU_SCHEDULER_TYPES_H_
#include <drm/gpu_scheduler.h>
/**
* struct xe_sched_msg - an in-band (relative to GPU scheduler run queue)
* message
*
* Generic enough for backend defined messages, backend can expand if needed.
*/
struct xe_sched_msg {
/** @link: list link into the gpu scheduler list of messages */
struct list_head link;
/**
* @private_data: opaque pointer to message private data (backend defined)
*/
void *private_data;
/** @opcode: opcode of message (backend defined) */
unsigned int opcode;
};
/**
* struct xe_sched_backend_ops - Define the backend operations called by the
* scheduler
*/
struct xe_sched_backend_ops {
/**
	 * @process_msg: Process a message. Allowed to block; it is this
	 * function's responsibility to free the message if it was dynamically
	 * allocated.
*/
void (*process_msg)(struct xe_sched_msg *msg);
};
/**
* struct xe_gpu_scheduler - Xe GPU scheduler
*/
struct xe_gpu_scheduler {
/** @base: DRM GPU scheduler */
struct drm_gpu_scheduler base;
/** @ops: Xe scheduler ops */
const struct xe_sched_backend_ops *ops;
/** @msgs: list of messages to be processed in @work_process_msg */
struct list_head msgs;
/** @work_process_msg: processes messages */
struct work_struct work_process_msg;
};
#define xe_sched_entity drm_sched_entity
#define xe_sched_policy drm_sched_policy
#endif

830
drivers/gpu/drm/xe/xe_gt.c Normal file
View File

@ -0,0 +1,830 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include <linux/minmax.h>
#include <drm/drm_managed.h>
#include "xe_bb.h"
#include "xe_bo.h"
#include "xe_device.h"
#include "xe_engine.h"
#include "xe_execlist.h"
#include "xe_force_wake.h"
#include "xe_ggtt.h"
#include "xe_gt.h"
#include "xe_gt_clock.h"
#include "xe_gt_mcr.h"
#include "xe_gt_pagefault.h"
#include "xe_gt_sysfs.h"
#include "xe_gt_topology.h"
#include "xe_hw_fence.h"
#include "xe_irq.h"
#include "xe_lrc.h"
#include "xe_map.h"
#include "xe_migrate.h"
#include "xe_mmio.h"
#include "xe_mocs.h"
#include "xe_reg_sr.h"
#include "xe_ring_ops.h"
#include "xe_sa.h"
#include "xe_sched_job.h"
#include "xe_ttm_gtt_mgr.h"
#include "xe_ttm_vram_mgr.h"
#include "xe_tuning.h"
#include "xe_uc.h"
#include "xe_vm.h"
#include "xe_wa.h"
#include "xe_wopcm.h"
#include "gt/intel_gt_regs.h"
struct xe_gt *xe_find_full_gt(struct xe_gt *gt)
{
struct xe_gt *search;
u8 id;
XE_BUG_ON(!xe_gt_is_media_type(gt));
for_each_gt(search, gt_to_xe(gt), id) {
if (search->info.vram_id == gt->info.vram_id)
return search;
}
XE_BUG_ON("NOT POSSIBLE");
return NULL;
}
int xe_gt_alloc(struct xe_device *xe, struct xe_gt *gt)
{
struct drm_device *drm = &xe->drm;
XE_BUG_ON(gt->info.type == XE_GT_TYPE_UNINITIALIZED);
if (!xe_gt_is_media_type(gt)) {
gt->mem.ggtt = drmm_kzalloc(drm, sizeof(*gt->mem.ggtt),
GFP_KERNEL);
if (!gt->mem.ggtt)
return -ENOMEM;
gt->mem.vram_mgr = drmm_kzalloc(drm, sizeof(*gt->mem.vram_mgr),
GFP_KERNEL);
if (!gt->mem.vram_mgr)
return -ENOMEM;
gt->mem.gtt_mgr = drmm_kzalloc(drm, sizeof(*gt->mem.gtt_mgr),
GFP_KERNEL);
if (!gt->mem.gtt_mgr)
return -ENOMEM;
} else {
struct xe_gt *full_gt = xe_find_full_gt(gt);
gt->mem.ggtt = full_gt->mem.ggtt;
gt->mem.vram_mgr = full_gt->mem.vram_mgr;
gt->mem.gtt_mgr = full_gt->mem.gtt_mgr;
}
gt->ordered_wq = alloc_ordered_workqueue("gt-ordered-wq", 0);
return 0;
}
/* FIXME: These should be in a common file */
#define CHV_PPAT_SNOOP REG_BIT(6)
#define GEN8_PPAT_AGE(x) ((x)<<4)
#define GEN8_PPAT_LLCeLLC (3<<2)
#define GEN8_PPAT_LLCELLC (2<<2)
#define GEN8_PPAT_LLC (1<<2)
#define GEN8_PPAT_WB (3<<0)
#define GEN8_PPAT_WT (2<<0)
#define GEN8_PPAT_WC (1<<0)
#define GEN8_PPAT_UC (0<<0)
#define GEN8_PPAT_ELLC_OVERRIDE (0<<2)
#define GEN8_PPAT(i, x) ((u64)(x) << ((i) * 8))
#define GEN12_PPAT_CLOS(x) ((x)<<2)
static void tgl_setup_private_ppat(struct xe_gt *gt)
{
/* TGL doesn't support LLC or AGE settings */
xe_mmio_write32(gt, GEN12_PAT_INDEX(0).reg, GEN8_PPAT_WB);
xe_mmio_write32(gt, GEN12_PAT_INDEX(1).reg, GEN8_PPAT_WC);
xe_mmio_write32(gt, GEN12_PAT_INDEX(2).reg, GEN8_PPAT_WT);
xe_mmio_write32(gt, GEN12_PAT_INDEX(3).reg, GEN8_PPAT_UC);
xe_mmio_write32(gt, GEN12_PAT_INDEX(4).reg, GEN8_PPAT_WB);
xe_mmio_write32(gt, GEN12_PAT_INDEX(5).reg, GEN8_PPAT_WB);
xe_mmio_write32(gt, GEN12_PAT_INDEX(6).reg, GEN8_PPAT_WB);
xe_mmio_write32(gt, GEN12_PAT_INDEX(7).reg, GEN8_PPAT_WB);
}
static void pvc_setup_private_ppat(struct xe_gt *gt)
{
xe_mmio_write32(gt, GEN12_PAT_INDEX(0).reg, GEN8_PPAT_UC);
xe_mmio_write32(gt, GEN12_PAT_INDEX(1).reg, GEN8_PPAT_WC);
xe_mmio_write32(gt, GEN12_PAT_INDEX(2).reg, GEN8_PPAT_WT);
xe_mmio_write32(gt, GEN12_PAT_INDEX(3).reg, GEN8_PPAT_WB);
xe_mmio_write32(gt, GEN12_PAT_INDEX(4).reg,
GEN12_PPAT_CLOS(1) | GEN8_PPAT_WT);
xe_mmio_write32(gt, GEN12_PAT_INDEX(5).reg,
GEN12_PPAT_CLOS(1) | GEN8_PPAT_WB);
xe_mmio_write32(gt, GEN12_PAT_INDEX(6).reg,
GEN12_PPAT_CLOS(2) | GEN8_PPAT_WT);
xe_mmio_write32(gt, GEN12_PAT_INDEX(7).reg,
GEN12_PPAT_CLOS(2) | GEN8_PPAT_WB);
}
#define MTL_PPAT_L4_CACHE_POLICY_MASK REG_GENMASK(3, 2)
#define MTL_PAT_INDEX_COH_MODE_MASK REG_GENMASK(1, 0)
#define MTL_PPAT_3_UC REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 3)
#define MTL_PPAT_1_WT REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 1)
#define MTL_PPAT_0_WB REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 0)
#define MTL_3_COH_2W REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 3)
#define MTL_2_COH_1W REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 2)
#define MTL_0_COH_NON REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 0)
static void mtl_setup_private_ppat(struct xe_gt *gt)
{
xe_mmio_write32(gt, GEN12_PAT_INDEX(0).reg, MTL_PPAT_0_WB);
xe_mmio_write32(gt, GEN12_PAT_INDEX(1).reg,
MTL_PPAT_1_WT | MTL_2_COH_1W);
xe_mmio_write32(gt, GEN12_PAT_INDEX(2).reg,
MTL_PPAT_3_UC | MTL_2_COH_1W);
xe_mmio_write32(gt, GEN12_PAT_INDEX(3).reg,
MTL_PPAT_0_WB | MTL_2_COH_1W);
xe_mmio_write32(gt, GEN12_PAT_INDEX(4).reg,
MTL_PPAT_0_WB | MTL_3_COH_2W);
}
static void setup_private_ppat(struct xe_gt *gt)
{
struct xe_device *xe = gt_to_xe(gt);
if (xe->info.platform == XE_METEORLAKE)
mtl_setup_private_ppat(gt);
else if (xe->info.platform == XE_PVC)
pvc_setup_private_ppat(gt);
else
tgl_setup_private_ppat(gt);
}
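/*
 * Size the TTM system-memory (GTT) domain: 3/4 of total system RAM, and when
 * VRAM is present, the smaller of that figure and
 * max(XE_DEFAULT_GTT_SIZE_MB, VRAM size).
 */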
static int gt_ttm_mgr_init(struct xe_gt *gt)
{
struct xe_device *xe = gt_to_xe(gt);
int err;
struct sysinfo si;
u64 gtt_size;
si_meminfo(&si);
gtt_size = (u64)si.totalram * si.mem_unit * 3/4;
if (gt->mem.vram.size) {
err = xe_ttm_vram_mgr_init(gt, gt->mem.vram_mgr);
if (err)
return err;
gtt_size = min(max((XE_DEFAULT_GTT_SIZE_MB << 20),
gt->mem.vram.size),
gtt_size);
xe->info.mem_region_mask |= BIT(gt->info.vram_id) << 1;
}
err = xe_ttm_gtt_mgr_init(gt, gt->mem.gtt_mgr, gtt_size);
if (err)
return err;
return 0;
}
static void gt_fini(struct drm_device *drm, void *arg)
{
struct xe_gt *gt = arg;
int i;
destroy_workqueue(gt->ordered_wq);
for (i = 0; i < XE_ENGINE_CLASS_MAX; ++i)
xe_hw_fence_irq_finish(&gt->fence_irq[i]);
}
static void gt_reset_worker(struct work_struct *w);
int emit_nop_job(struct xe_gt *gt, struct xe_engine *e)
{
struct xe_sched_job *job;
struct xe_bb *bb;
struct dma_fence *fence;
u64 batch_ofs;
long timeout;
bb = xe_bb_new(gt, 4, false);
if (IS_ERR(bb))
return PTR_ERR(bb);
batch_ofs = xe_bo_ggtt_addr(gt->kernel_bb_pool.bo);
job = xe_bb_create_wa_job(e, bb, batch_ofs);
if (IS_ERR(job)) {
xe_bb_free(bb, NULL);
		return PTR_ERR(job);
}
xe_sched_job_arm(job);
fence = dma_fence_get(&job->drm.s_fence->finished);
xe_sched_job_push(job);
timeout = dma_fence_wait_timeout(fence, false, HZ);
dma_fence_put(fence);
xe_bb_free(bb, NULL);
if (timeout < 0)
return timeout;
else if (!timeout)
return -ETIME;
return 0;
}
int emit_wa_job(struct xe_gt *gt, struct xe_engine *e)
{
struct xe_reg_sr *sr = &e->hwe->reg_lrc;
struct xe_reg_sr_entry *entry;
unsigned long reg;
struct xe_sched_job *job;
struct xe_bb *bb;
struct dma_fence *fence;
u64 batch_ofs;
long timeout;
int count = 0;
bb = xe_bb_new(gt, SZ_4K, false); /* Just pick a large BB size */
if (IS_ERR(bb))
return PTR_ERR(bb);
xa_for_each(&sr->xa, reg, entry)
++count;
if (count) {
bb->cs[bb->len++] = MI_LOAD_REGISTER_IMM(count);
xa_for_each(&sr->xa, reg, entry) {
bb->cs[bb->len++] = reg;
bb->cs[bb->len++] = entry->set_bits;
}
}
bb->cs[bb->len++] = MI_NOOP;
bb->cs[bb->len++] = MI_BATCH_BUFFER_END;
batch_ofs = xe_bo_ggtt_addr(gt->kernel_bb_pool.bo);
job = xe_bb_create_wa_job(e, bb, batch_ofs);
if (IS_ERR(job)) {
xe_bb_free(bb, NULL);
		return PTR_ERR(job);
}
xe_sched_job_arm(job);
fence = dma_fence_get(&job->drm.s_fence->finished);
xe_sched_job_push(job);
timeout = dma_fence_wait_timeout(fence, false, HZ);
dma_fence_put(fence);
xe_bb_free(bb, NULL);
if (timeout < 0)
return timeout;
else if (!timeout)
return -ETIME;
return 0;
}
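/*
 * Record a default ("golden") LRC image per engine class: run a workaround
 * job on a fresh engine to prime its context with known good state, run a nop
 * job on a second engine so the hardware switches away and saves the first
 * context, run one more nop on the first engine so any indirect workarounds
 * land in its image, and finally copy that context image out as the class's
 * default LRC.
 */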
int xe_gt_record_default_lrcs(struct xe_gt *gt)
{
struct xe_device *xe = gt_to_xe(gt);
struct xe_hw_engine *hwe;
enum xe_hw_engine_id id;
int err = 0;
for_each_hw_engine(hwe, gt, id) {
struct xe_engine *e, *nop_e;
struct xe_vm *vm;
void *default_lrc;
if (gt->default_lrc[hwe->class])
continue;
xe_reg_sr_init(&hwe->reg_lrc, "LRC", xe);
xe_wa_process_lrc(hwe);
default_lrc = drmm_kzalloc(&xe->drm,
xe_lrc_size(xe, hwe->class),
GFP_KERNEL);
if (!default_lrc)
return -ENOMEM;
vm = xe_migrate_get_vm(gt->migrate);
e = xe_engine_create(xe, vm, BIT(hwe->logical_instance), 1,
hwe, ENGINE_FLAG_WA);
if (IS_ERR(e)) {
err = PTR_ERR(e);
goto put_vm;
}
/* Prime golden LRC with known good state */
err = emit_wa_job(gt, e);
if (err)
goto put_engine;
nop_e = xe_engine_create(xe, vm, BIT(hwe->logical_instance),
1, hwe, ENGINE_FLAG_WA);
if (IS_ERR(nop_e)) {
err = PTR_ERR(nop_e);
goto put_engine;
}
/* Switch to different LRC */
err = emit_nop_job(gt, nop_e);
if (err)
goto put_nop_e;
/* Reload golden LRC to record the effect of any indirect W/A */
err = emit_nop_job(gt, e);
if (err)
goto put_nop_e;
xe_map_memcpy_from(xe, default_lrc,
&e->lrc[0].bo->vmap,
xe_lrc_pphwsp_offset(&e->lrc[0]),
xe_lrc_size(xe, hwe->class));
gt->default_lrc[hwe->class] = default_lrc;
put_nop_e:
xe_engine_put(nop_e);
put_engine:
xe_engine_put(e);
put_vm:
xe_vm_put(vm);
if (err)
break;
}
return err;
}
int xe_gt_init_early(struct xe_gt *gt)
{
int err;
xe_force_wake_init_gt(gt, gt_to_fw(gt));
err = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
if (err)
return err;
xe_gt_topology_init(gt);
xe_gt_mcr_init(gt);
err = xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
if (err)
return err;
xe_reg_sr_init(&gt->reg_sr, "GT", gt_to_xe(gt));
xe_wa_process_gt(gt);
xe_tuning_process_gt(gt);
return 0;
}
/**
* xe_gt_init_noalloc - Init GT up to the point where allocations can happen.
* @gt: The GT to initialize.
*
 * This function prepares the GT so that memory allocations to VRAM become
 * possible, but it must not allocate memory itself. This state is useful for
 * display readout, because the inherited display framebuffer would otherwise
 * be overwritten, as it is usually placed at the start of VRAM.
*
* Returns: 0 on success, negative error code on error.
*/
int xe_gt_init_noalloc(struct xe_gt *gt)
{
int err, err2;
if (xe_gt_is_media_type(gt))
return 0;
xe_device_mem_access_get(gt_to_xe(gt));
err = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
if (err)
goto err;
err = gt_ttm_mgr_init(gt);
if (err)
goto err_force_wake;
err = xe_ggtt_init_noalloc(gt, gt->mem.ggtt);
err_force_wake:
err2 = xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
XE_WARN_ON(err2);
xe_device_mem_access_put(gt_to_xe(gt));
err:
return err;
}
static int gt_fw_domain_init(struct xe_gt *gt)
{
int err, i;
xe_device_mem_access_get(gt_to_xe(gt));
err = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
if (err)
goto err_hw_fence_irq;
if (!xe_gt_is_media_type(gt)) {
err = xe_ggtt_init(gt, gt->mem.ggtt);
if (err)
goto err_force_wake;
}
/* Allow driver to load if uC init fails (likely missing firmware) */
err = xe_uc_init(&gt->uc);
XE_WARN_ON(err);
err = xe_uc_init_hwconfig(&gt->uc);
if (err)
goto err_force_wake;
/* Enables per hw engine IRQs */
xe_gt_irq_postinstall(gt);
/* Rerun MCR init as we now have hw engine list */
xe_gt_mcr_init(gt);
err = xe_hw_engines_init_early(gt);
if (err)
goto err_force_wake;
err = xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
XE_WARN_ON(err);
xe_device_mem_access_put(gt_to_xe(gt));
return 0;
err_force_wake:
xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
err_hw_fence_irq:
for (i = 0; i < XE_ENGINE_CLASS_MAX; ++i)
xe_hw_fence_irq_finish(&gt->fence_irq[i]);
xe_device_mem_access_put(gt_to_xe(gt));
return err;
}
static int all_fw_domain_init(struct xe_gt *gt)
{
int err, i;
xe_device_mem_access_get(gt_to_xe(gt));
err = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
if (err)
goto err_hw_fence_irq;
setup_private_ppat(gt);
xe_reg_sr_apply_mmio(&gt->reg_sr, gt);
err = xe_gt_clock_init(gt);
if (err)
goto err_force_wake;
xe_mocs_init(gt);
err = xe_execlist_init(gt);
if (err)
goto err_force_wake;
err = xe_hw_engines_init(gt);
if (err)
goto err_force_wake;
err = xe_uc_init_post_hwconfig(&gt->uc);
if (err)
goto err_force_wake;
/*
	 * FIXME: This should be ok as the SA should only be used by gt->migrate
	 * and vm->gt->migrate, and both should be pointing to a non-media GT.
	 * But to be really safe, convert gt->kernel_bb_pool to a pointer and
	 * point a media GT to the kernel_bb_pool on a real tile.
*/
if (!xe_gt_is_media_type(gt)) {
err = xe_sa_bo_manager_init(gt, &gt->kernel_bb_pool, SZ_1M, 16);
if (err)
goto err_force_wake;
/*
		 * USM has its own SA pool so that it does not block behind
		 * user operations
*/
if (gt_to_xe(gt)->info.supports_usm) {
err = xe_sa_bo_manager_init(gt, &gt->usm.bb_pool,
SZ_1M, 16);
if (err)
goto err_force_wake;
}
}
if (!xe_gt_is_media_type(gt)) {
gt->migrate = xe_migrate_init(gt);
		if (IS_ERR(gt->migrate)) {
			err = PTR_ERR(gt->migrate);
			goto err_force_wake;
		}
} else {
gt->migrate = xe_find_full_gt(gt)->migrate;
}
err = xe_uc_init_hw(&gt->uc);
if (err)
goto err_force_wake;
err = xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
XE_WARN_ON(err);
xe_device_mem_access_put(gt_to_xe(gt));
return 0;
err_force_wake:
xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
err_hw_fence_irq:
for (i = 0; i < XE_ENGINE_CLASS_MAX; ++i)
xe_hw_fence_irq_finish(&gt->fence_irq[i]);
xe_device_mem_access_put(gt_to_xe(gt));
return err;
}
int xe_gt_init(struct xe_gt *gt)
{
int err;
int i;
INIT_WORK(&gt->reset.worker, gt_reset_worker);
for (i = 0; i < XE_ENGINE_CLASS_MAX; ++i) {
gt->ring_ops[i] = xe_ring_ops_get(gt, i);
xe_hw_fence_irq_init(&gt->fence_irq[i]);
}
err = xe_gt_pagefault_init(gt);
if (err)
return err;
xe_gt_sysfs_init(gt);
err = gt_fw_domain_init(gt);
if (err)
return err;
xe_force_wake_init_engines(gt, gt_to_fw(gt));
err = all_fw_domain_init(gt);
if (err)
return err;
xe_force_wake_prune(gt, gt_to_fw(gt));
err = drmm_add_action_or_reset(&gt_to_xe(gt)->drm, gt_fini, gt);
if (err)
return err;
return 0;
}
int do_gt_reset(struct xe_gt *gt)
{
struct xe_device *xe = gt_to_xe(gt);
int err;
xe_mmio_write32(gt, GEN6_GDRST.reg, GEN11_GRDOM_FULL);
err = xe_mmio_wait32(gt, GEN6_GDRST.reg, 0, GEN11_GRDOM_FULL, 5);
if (err)
drm_err(&xe->drm,
"GT reset failed to clear GEN11_GRDOM_FULL\n");
return err;
}
static int do_gt_restart(struct xe_gt *gt)
{
struct xe_hw_engine *hwe;
enum xe_hw_engine_id id;
int err;
setup_private_ppat(gt);
xe_reg_sr_apply_mmio(&gt->reg_sr, gt);
err = xe_wopcm_init(&gt->uc.wopcm);
if (err)
return err;
for_each_hw_engine(hwe, gt, id)
xe_hw_engine_enable_ring(hwe);
err = xe_uc_init_hw(&gt->uc);
if (err)
return err;
xe_mocs_init(gt);
err = xe_uc_start(&gt->uc);
if (err)
return err;
for_each_hw_engine(hwe, gt, id) {
xe_reg_sr_apply_mmio(&hwe->reg_sr, gt);
xe_reg_sr_apply_whitelist(&hwe->reg_whitelist,
hwe->mmio_base, gt);
}
return 0;
}
static int gt_reset(struct xe_gt *gt)
{
struct xe_device *xe = gt_to_xe(gt);
int err;
/* We only support GT resets with GuC submission */
if (!xe_device_guc_submission_enabled(gt_to_xe(gt)))
return -ENODEV;
drm_info(&xe->drm, "GT reset started\n");
xe_device_mem_access_get(gt_to_xe(gt));
err = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
if (err)
goto err_msg;
xe_uc_stop_prepare(&gt->uc);
xe_gt_pagefault_reset(gt);
err = xe_uc_stop(&gt->uc);
if (err)
goto err_out;
err = do_gt_reset(gt);
if (err)
goto err_out;
err = do_gt_restart(gt);
if (err)
goto err_out;
xe_device_mem_access_put(gt_to_xe(gt));
err = xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
XE_WARN_ON(err);
drm_info(&xe->drm, "GT reset done\n");
return 0;
err_out:
XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
err_msg:
XE_WARN_ON(xe_uc_start(&gt->uc));
xe_device_mem_access_put(gt_to_xe(gt));
drm_err(&xe->drm, "GT reset failed, err=%d\n", err);
return err;
}
static void gt_reset_worker(struct work_struct *w)
{
struct xe_gt *gt = container_of(w, typeof(*gt), reset.worker);
gt_reset(gt);
}
void xe_gt_reset_async(struct xe_gt *gt)
{
struct xe_device *xe = gt_to_xe(gt);
drm_info(&xe->drm, "Try GT reset\n");
/* Don't do a reset while one is already in flight */
if (xe_uc_reset_prepare(&gt->uc))
return;
drm_info(&xe->drm, "Doing GT reset\n");
queue_work(gt->ordered_wq, &gt->reset.worker);
}
void xe_gt_suspend_prepare(struct xe_gt *gt)
{
xe_device_mem_access_get(gt_to_xe(gt));
XE_WARN_ON(xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL));
xe_uc_stop_prepare(&gt->uc);
xe_device_mem_access_put(gt_to_xe(gt));
XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
}
int xe_gt_suspend(struct xe_gt *gt)
{
struct xe_device *xe = gt_to_xe(gt);
int err;
/* For now suspend/resume is only allowed with GuC */
if (!xe_device_guc_submission_enabled(gt_to_xe(gt)))
return -ENODEV;
xe_device_mem_access_get(gt_to_xe(gt));
err = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
if (err)
goto err_msg;
err = xe_uc_suspend(&gt->uc);
if (err)
goto err_force_wake;
xe_device_mem_access_put(gt_to_xe(gt));
XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
drm_info(&xe->drm, "GT suspended\n");
return 0;
err_force_wake:
XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
err_msg:
xe_device_mem_access_put(gt_to_xe(gt));
drm_err(&xe->drm, "GT suspend failed: %d\n", err);
return err;
}
int xe_gt_resume(struct xe_gt *gt)
{
struct xe_device *xe = gt_to_xe(gt);
int err;
xe_device_mem_access_get(gt_to_xe(gt));
err = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
if (err)
goto err_msg;
err = do_gt_restart(gt);
if (err)
goto err_force_wake;
xe_device_mem_access_put(gt_to_xe(gt));
XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
drm_info(&xe->drm, "GT resumed\n");
return 0;
err_force_wake:
XE_WARN_ON(xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL));
err_msg:
xe_device_mem_access_put(gt_to_xe(gt));
drm_err(&xe->drm, "GT resume failed: %d\n", err);
return err;
}
void xe_gt_migrate_wait(struct xe_gt *gt)
{
xe_migrate_wait(gt->migrate);
}
struct xe_hw_engine *xe_gt_hw_engine(struct xe_gt *gt,
enum xe_engine_class class,
u16 instance, bool logical)
{
struct xe_hw_engine *hwe;
enum xe_hw_engine_id id;
for_each_hw_engine(hwe, gt, id)
if (hwe->class == class &&
((!logical && hwe->instance == instance) ||
(logical && hwe->logical_instance == instance)))
return hwe;
return NULL;
}
struct xe_hw_engine *xe_gt_any_hw_engine_by_reset_domain(struct xe_gt *gt,
enum xe_engine_class class)
{
struct xe_hw_engine *hwe;
enum xe_hw_engine_id id;
for_each_hw_engine(hwe, gt, id) {
switch (class) {
case XE_ENGINE_CLASS_RENDER:
case XE_ENGINE_CLASS_COMPUTE:
if (hwe->class == XE_ENGINE_CLASS_RENDER ||
hwe->class == XE_ENGINE_CLASS_COMPUTE)
return hwe;
break;
default:
if (hwe->class == class)
return hwe;
}
}
return NULL;
}

View File

@ -0,0 +1,64 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GT_H_
#define _XE_GT_H_
#include <drm/drm_util.h>
#include "xe_device_types.h"
#include "xe_hw_engine.h"
#define for_each_hw_engine(hwe__, gt__, id__) \
for ((id__) = 0; (id__) < ARRAY_SIZE((gt__)->hw_engines); (id__)++) \
for_each_if (((hwe__) = (gt__)->hw_engines + (id__)) && \
xe_hw_engine_is_valid((hwe__)))
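/*
 * Illustrative usage sketch of the iterator above (do_something() is a
 * hypothetical stand-in for a per-engine operation; the real callers in
 * xe_gt.c follow the same pattern):
 *
 *	struct xe_hw_engine *hwe;
 *	enum xe_hw_engine_id id;
 *
 *	for_each_hw_engine(hwe, gt, id)
 *		do_something(hwe);
 */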
int xe_gt_alloc(struct xe_device *xe, struct xe_gt *gt);
int xe_gt_init_early(struct xe_gt *gt);
int xe_gt_init_noalloc(struct xe_gt *gt);
int xe_gt_init(struct xe_gt *gt);
int xe_gt_record_default_lrcs(struct xe_gt *gt);
void xe_gt_suspend_prepare(struct xe_gt *gt);
int xe_gt_suspend(struct xe_gt *gt);
int xe_gt_resume(struct xe_gt *gt);
void xe_gt_reset_async(struct xe_gt *gt);
void xe_gt_migrate_wait(struct xe_gt *gt);
struct xe_gt *xe_find_full_gt(struct xe_gt *gt);
/**
* xe_gt_any_hw_engine_by_reset_domain - scan the list of engines and return the
* first that matches the same reset domain as @class
* @gt: GT structure
* @class: hw engine class to lookup
*/
struct xe_hw_engine *
xe_gt_any_hw_engine_by_reset_domain(struct xe_gt *gt, enum xe_engine_class class);
struct xe_hw_engine *xe_gt_hw_engine(struct xe_gt *gt,
enum xe_engine_class class,
u16 instance,
bool logical);
static inline bool xe_gt_is_media_type(struct xe_gt *gt)
{
return gt->info.type == XE_GT_TYPE_MEDIA;
}
static inline struct xe_device *gt_to_xe(struct xe_gt *gt)
{
return gt->xe;
}
static inline bool xe_gt_is_usm_hwe(struct xe_gt *gt, struct xe_hw_engine *hwe)
{
struct xe_device *xe = gt_to_xe(gt);
return xe->info.supports_usm && hwe->class == XE_ENGINE_CLASS_COPY &&
hwe->instance == gt->usm.reserved_bcs_instance;
}
#endif

View File

@ -0,0 +1,83 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include "i915_reg.h"
#include "gt/intel_gt_regs.h"
#include "xe_device.h"
#include "xe_gt.h"
#include "xe_gt_clock.h"
#include "xe_macros.h"
#include "xe_mmio.h"
static u32 read_reference_ts_freq(struct xe_gt *gt)
{
u32 ts_override = xe_mmio_read32(gt, GEN9_TIMESTAMP_OVERRIDE.reg);
u32 base_freq, frac_freq;
base_freq = ((ts_override & GEN9_TIMESTAMP_OVERRIDE_US_COUNTER_DIVIDER_MASK) >>
GEN9_TIMESTAMP_OVERRIDE_US_COUNTER_DIVIDER_SHIFT) + 1;
base_freq *= 1000000;
frac_freq = ((ts_override &
GEN9_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_MASK) >>
GEN9_TIMESTAMP_OVERRIDE_US_COUNTER_DENOMINATOR_SHIFT);
frac_freq = 1000000 / (frac_freq + 1);
return base_freq + frac_freq;
}
static u32 get_crystal_clock_freq(u32 rpm_config_reg)
{
const u32 f19_2_mhz = 19200000;
const u32 f24_mhz = 24000000;
const u32 f25_mhz = 25000000;
const u32 f38_4_mhz = 38400000;
u32 crystal_clock =
(rpm_config_reg & GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK) >>
GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT;
switch (crystal_clock) {
case GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ:
return f24_mhz;
case GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ:
return f19_2_mhz;
case GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_38_4_MHZ:
return f38_4_mhz;
case GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_25_MHZ:
return f25_mhz;
default:
XE_BUG_ON("NOT_POSSIBLE");
return 0;
}
}
int xe_gt_clock_init(struct xe_gt *gt)
{
u32 ctc_reg = xe_mmio_read32(gt, CTC_MODE.reg);
u32 freq = 0;
/* Assuming gen11+ so assert this assumption is correct */
XE_BUG_ON(GRAPHICS_VER(gt_to_xe(gt)) < 11);
if ((ctc_reg & CTC_SOURCE_PARAMETER_MASK) == CTC_SOURCE_DIVIDE_LOGIC) {
freq = read_reference_ts_freq(gt);
} else {
u32 c0 = xe_mmio_read32(gt, RPM_CONFIG0.reg);
freq = get_crystal_clock_freq(c0);
/*
* Now figure out how the command stream's timestamp
* register increments from this frequency (it might
* increment only every few clock cycles).
*/
freq >>= 3 - ((c0 & GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK) >>
GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT);
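/*
 * Worked example (illustrative): a 38.4 MHz crystal with a CTC shift
 * parameter of 1 gives 38400000 >> (3 - 1) = 9600000, i.e. the timestamp
 * register ticks at 9.6 MHz.
 */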
}
gt->info.clock_freq = freq;
return 0;
}

View File

@ -0,0 +1,13 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GT_CLOCK_H_
#define _XE_GT_CLOCK_H_
struct xe_gt;
int xe_gt_clock_init(struct xe_gt *gt);
#endif

View File

@ -0,0 +1,160 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include <drm/drm_debugfs.h>
#include <drm/drm_managed.h>
#include "xe_device.h"
#include "xe_force_wake.h"
#include "xe_gt.h"
#include "xe_gt_debugfs.h"
#include "xe_gt_mcr.h"
#include "xe_gt_pagefault.h"
#include "xe_gt_topology.h"
#include "xe_hw_engine.h"
#include "xe_macros.h"
#include "xe_uc_debugfs.h"
static struct xe_gt *node_to_gt(struct drm_info_node *node)
{
return node->info_ent->data;
}
static int hw_engines(struct seq_file *m, void *data)
{
struct xe_gt *gt = node_to_gt(m->private);
struct xe_device *xe = gt_to_xe(gt);
struct drm_printer p = drm_seq_file_printer(m);
struct xe_hw_engine *hwe;
enum xe_hw_engine_id id;
int err;
xe_device_mem_access_get(xe);
err = xe_force_wake_get(gt_to_fw(gt), XE_FORCEWAKE_ALL);
if (err) {
xe_device_mem_access_put(xe);
return err;
}
for_each_hw_engine(hwe, gt, id)
xe_hw_engine_print_state(hwe, &p);
xe_device_mem_access_put(xe);
err = xe_force_wake_put(gt_to_fw(gt), XE_FORCEWAKE_ALL);
if (err)
return err;
return 0;
}
static int force_reset(struct seq_file *m, void *data)
{
struct xe_gt *gt = node_to_gt(m->private);
xe_gt_reset_async(gt);
return 0;
}
static int sa_info(struct seq_file *m, void *data)
{
struct xe_gt *gt = node_to_gt(m->private);
struct drm_printer p = drm_seq_file_printer(m);
drm_suballoc_dump_debug_info(&gt->kernel_bb_pool.base, &p,
gt->kernel_bb_pool.gpu_addr);
return 0;
}
static int topology(struct seq_file *m, void *data)
{
struct xe_gt *gt = node_to_gt(m->private);
struct drm_printer p = drm_seq_file_printer(m);
xe_gt_topology_dump(gt, &p);
return 0;
}
static int steering(struct seq_file *m, void *data)
{
struct xe_gt *gt = node_to_gt(m->private);
struct drm_printer p = drm_seq_file_printer(m);
xe_gt_mcr_steering_dump(gt, &p);
return 0;
}
#ifdef CONFIG_DRM_XE_DEBUG
static int invalidate_tlb(struct seq_file *m, void *data)
{
struct xe_gt *gt = node_to_gt(m->private);
int seqno;
int ret = 0;
seqno = xe_gt_tlb_invalidation(gt);
XE_WARN_ON(seqno < 0);
if (seqno > 0)
ret = xe_gt_tlb_invalidation_wait(gt, seqno);
XE_WARN_ON(ret < 0);
return 0;
}
#endif
static const struct drm_info_list debugfs_list[] = {
{"hw_engines", hw_engines, 0},
{"force_reset", force_reset, 0},
{"sa_info", sa_info, 0},
{"topology", topology, 0},
{"steering", steering, 0},
#ifdef CONFIG_DRM_XE_DEBUG
{"invalidate_tlb", invalidate_tlb, 0},
#endif
};
void xe_gt_debugfs_register(struct xe_gt *gt)
{
struct drm_minor *minor = gt_to_xe(gt)->drm.primary;
struct dentry *root;
struct drm_info_list *local;
char name[8];
int i;
XE_BUG_ON(!minor->debugfs_root);
sprintf(name, "gt%d", gt->info.id);
root = debugfs_create_dir(name, minor->debugfs_root);
if (IS_ERR(root)) {
XE_WARN_ON("Create GT directory failed");
return;
}
/*
* Allocate local copy as we need to pass in the GT to the debugfs
* entry and drm_debugfs_create_files just references the drm_info_list
* passed in (e.g. can't define this on the stack).
*/
#define DEBUGFS_SIZE (ARRAY_SIZE(debugfs_list) * sizeof(struct drm_info_list))
local = drmm_kmalloc(&gt_to_xe(gt)->drm, DEBUGFS_SIZE, GFP_KERNEL);
if (!local) {
XE_WARN_ON("Couldn't allocate memory");
return;
}
memcpy(local, debugfs_list, DEBUGFS_SIZE);
#undef DEBUGFS_SIZE
for (i = 0; i < ARRAY_SIZE(debugfs_list); ++i)
local[i].data = gt;
drm_debugfs_create_files(local,
ARRAY_SIZE(debugfs_list),
root, minor);
xe_uc_debugfs_register(&gt->uc, root);
}

View File

@ -0,0 +1,13 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GT_DEBUGFS_H_
#define _XE_GT_DEBUGFS_H_
struct xe_gt;
void xe_gt_debugfs_register(struct xe_gt *gt);
#endif

View File

@ -0,0 +1,552 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include "xe_gt.h"
#include "xe_gt_mcr.h"
#include "xe_gt_topology.h"
#include "xe_gt_types.h"
#include "xe_mmio.h"
#include "gt/intel_gt_regs.h"
/**
* DOC: GT Multicast/Replicated (MCR) Register Support
*
* Some GT registers are designed as "multicast" or "replicated" registers:
* multiple instances of the same register share a single MMIO offset. MCR
* registers are generally used when the hardware needs to potentially track
* independent values of a register per hardware unit (e.g., per-subslice,
* per-L3bank, etc.). The specific types of replication that exist vary
* per-platform.
*
* MMIO accesses to MCR registers are controlled according to the settings
* programmed in the platform's MCR_SELECTOR register(s). MMIO writes to MCR
* registers can be done in either a multicast fashion (i.e., a single write updates all
* instances of the register to the same value) or unicast (a write updates only
* one specific instance). Reads of MCR registers always operate in a unicast
* manner regardless of how the multicast/unicast bit is set in MCR_SELECTOR.
* Selection of a specific MCR instance for unicast operations is referred to
* as "steering."
*
* If MCR register operations are steered toward a hardware unit that is
* fused off or currently powered down due to power gating, the MMIO operation
* is "terminated" by the hardware. Terminated read operations will return a
* value of zero and terminated unicast write operations will be silently
* ignored.
*/
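/*
 * Illustrative usage sketch of the interfaces defined below (SOME_MCR_REG and
 * SOME_BIT are hypothetical names used only for illustration): a caller that
 * needs one non-terminated copy of a replicated register would read it with
 *
 *	val = xe_gt_mcr_unicast_read_any(gt, SOME_MCR_REG);
 *
 * while a workaround that must update every instance would use
 *
 *	xe_gt_mcr_multicast_write(gt, SOME_MCR_REG, val | SOME_BIT);
 */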
enum {
MCR_OP_READ,
MCR_OP_WRITE
};
static const struct xe_mmio_range xelp_l3bank_steering_table[] = {
{ 0x00B100, 0x00B3FF },
{},
};
/*
* Although the bspec lists more "MSLICE" ranges than shown here, some of those
* are of a "GAM" subclass that has special rules and doesn't need to be
* included here.
*/
static const struct xe_mmio_range xehp_mslice_steering_table[] = {
{ 0x00DD00, 0x00DDFF },
{ 0x00E900, 0x00FFFF }, /* 0xEA00 - 0xEFFF is unused */
{},
};
static const struct xe_mmio_range xehp_lncf_steering_table[] = {
{ 0x00B000, 0x00B0FF },
{ 0x00D880, 0x00D8FF },
{},
};
/*
* We have several types of MCR registers where steering to (0,0) will always
* provide us with a non-terminated value. We'll stick them all in the same
* table for simplicity.
*/
static const struct xe_mmio_range xehpc_instance0_steering_table[] = {
{ 0x004000, 0x004AFF }, /* HALF-BSLICE */
{ 0x008800, 0x00887F }, /* CC */
{ 0x008A80, 0x008AFF }, /* TILEPSMI */
{ 0x00B000, 0x00B0FF }, /* HALF-BSLICE */
{ 0x00B100, 0x00B3FF }, /* L3BANK */
{ 0x00C800, 0x00CFFF }, /* HALF-BSLICE */
{ 0x00D800, 0x00D8FF }, /* HALF-BSLICE */
{ 0x00DD00, 0x00DDFF }, /* BSLICE */
{ 0x00E900, 0x00E9FF }, /* HALF-BSLICE */
{ 0x00EC00, 0x00EEFF }, /* HALF-BSLICE */
{ 0x00F000, 0x00FFFF }, /* HALF-BSLICE */
{ 0x024180, 0x0241FF }, /* HALF-BSLICE */
{},
};
static const struct xe_mmio_range xelpg_instance0_steering_table[] = {
{ 0x000B00, 0x000BFF }, /* SQIDI */
{ 0x001000, 0x001FFF }, /* SQIDI */
{ 0x004000, 0x0048FF }, /* GAM */
{ 0x008700, 0x0087FF }, /* SQIDI */
{ 0x00B000, 0x00B0FF }, /* NODE */
{ 0x00C800, 0x00CFFF }, /* GAM */
{ 0x00D880, 0x00D8FF }, /* NODE */
{ 0x00DD00, 0x00DDFF }, /* OAAL2 */
{},
};
static const struct xe_mmio_range xelpg_l3bank_steering_table[] = {
{ 0x00B100, 0x00B3FF },
{},
};
static const struct xe_mmio_range xelp_dss_steering_table[] = {
{ 0x008150, 0x00815F },
{ 0x009520, 0x00955F },
{ 0x00DE80, 0x00E8FF },
{ 0x024A00, 0x024A7F },
{},
};
/* DSS steering is used for GSLICE ranges as well */
static const struct xe_mmio_range xehp_dss_steering_table[] = {
{ 0x005200, 0x0052FF }, /* GSLICE */
{ 0x005400, 0x007FFF }, /* GSLICE */
{ 0x008140, 0x00815F }, /* GSLICE (0x8140-0x814F), DSS (0x8150-0x815F) */
{ 0x008D00, 0x008DFF }, /* DSS */
{ 0x0094D0, 0x00955F }, /* GSLICE (0x94D0-0x951F), DSS (0x9520-0x955F) */
{ 0x009680, 0x0096FF }, /* DSS */
{ 0x00D800, 0x00D87F }, /* GSLICE */
{ 0x00DC00, 0x00DCFF }, /* GSLICE */
{ 0x00DE80, 0x00E8FF }, /* DSS (0xE000-0xE0FF reserved ) */
{ 0x017000, 0x017FFF }, /* GSLICE */
{ 0x024A00, 0x024A7F }, /* DSS */
{},
};
/* DSS steering is used for COMPUTE ranges as well */
static const struct xe_mmio_range xehpc_dss_steering_table[] = {
{ 0x008140, 0x00817F }, /* COMPUTE (0x8140-0x814F & 0x8160-0x817F), DSS (0x8150-0x815F) */
{ 0x0094D0, 0x00955F }, /* COMPUTE (0x94D0-0x951F), DSS (0x9520-0x955F) */
{ 0x009680, 0x0096FF }, /* DSS */
{ 0x00DC00, 0x00DCFF }, /* COMPUTE */
{ 0x00DE80, 0x00E7FF }, /* DSS (0xDF00-0xE1FF reserved ) */
{},
};
/* DSS steering is used for SLICE ranges as well */
static const struct xe_mmio_range xelpg_dss_steering_table[] = {
{ 0x005200, 0x0052FF }, /* SLICE */
{ 0x005500, 0x007FFF }, /* SLICE */
{ 0x008140, 0x00815F }, /* SLICE (0x8140-0x814F), DSS (0x8150-0x815F) */
{ 0x0094D0, 0x00955F }, /* SLICE (0x94D0-0x951F), DSS (0x9520-0x955F) */
{ 0x009680, 0x0096FF }, /* DSS */
{ 0x00D800, 0x00D87F }, /* SLICE */
{ 0x00DC00, 0x00DCFF }, /* SLICE */
{ 0x00DE80, 0x00E8FF }, /* DSS (0xE000-0xE0FF reserved) */
{},
};
static const struct xe_mmio_range xelpmp_oaddrm_steering_table[] = {
{ 0x393200, 0x39323F },
{ 0x393400, 0x3934FF },
{},
};
/*
* DG2 GAM registers are a special case; this table is checked directly in
* xe_gt_mcr_get_nonterminated_steering and is not hooked up via
* gt->steering[].
*/
static const struct xe_mmio_range dg2_gam_ranges[] = {
{ 0x004000, 0x004AFF },
{ 0x00C800, 0x00CFFF },
{ 0x00F000, 0x00FFFF },
{},
};
static void init_steering_l3bank(struct xe_gt *gt)
{
if (GRAPHICS_VERx100(gt_to_xe(gt)) >= 1270) {
u32 mslice_mask = REG_FIELD_GET(GEN12_MEML3_EN_MASK,
xe_mmio_read32(gt, GEN10_MIRROR_FUSE3.reg));
u32 bank_mask = REG_FIELD_GET(GT_L3_EXC_MASK,
xe_mmio_read32(gt, XEHP_FUSE4.reg));
/*
* Group selects mslice, instance selects bank within mslice.
* Bank 0 is always valid _except_ when the bank mask is 010b.
*/
gt->steering[L3BANK].group_target = __ffs(mslice_mask);
gt->steering[L3BANK].instance_target =
bank_mask & BIT(0) ? 0 : 2;
} else {
u32 fuse = REG_FIELD_GET(GEN10_L3BANK_MASK,
~xe_mmio_read32(gt, GEN10_MIRROR_FUSE3.reg));
gt->steering[L3BANK].group_target = 0; /* unused */
gt->steering[L3BANK].instance_target = __ffs(fuse);
}
}
static void init_steering_mslice(struct xe_gt *gt)
{
u32 mask = REG_FIELD_GET(GEN12_MEML3_EN_MASK,
xe_mmio_read32(gt, GEN10_MIRROR_FUSE3.reg));
/*
* mslice registers are valid (not terminated) if either the meml3
* associated with the mslice is present, or at least one DSS associated
* with the mslice is present. There will always be at least one meml3
* so we can just use that to find a non-terminated mslice and ignore
* the DSS fusing.
*/
gt->steering[MSLICE].group_target = __ffs(mask);
gt->steering[MSLICE].instance_target = 0; /* unused */
/*
* LNCF termination is also based on mslice presence, so we'll set
* it up here. Either LNCF within a non-terminated mslice will work,
* so we just always pick LNCF 0 here.
*/
gt->steering[LNCF].group_target = __ffs(mask) << 1;
gt->steering[LNCF].instance_target = 0; /* unused */
}
static void init_steering_dss(struct xe_gt *gt)
{
unsigned int dss = min(xe_dss_mask_group_ffs(gt->fuse_topo.g_dss_mask, 0, 0),
xe_dss_mask_group_ffs(gt->fuse_topo.c_dss_mask, 0, 0));
unsigned int dss_per_grp = gt_to_xe(gt)->info.platform == XE_PVC ? 8 : 4;
gt->steering[DSS].group_target = dss / dss_per_grp;
gt->steering[DSS].instance_target = dss % dss_per_grp;
}
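/*
 * Example for the DSS steering computation above (illustrative): on a
 * non-PVC platform (dss_per_grp = 4), if the first present DSS is number 6,
 * steering targets group 1, instance 2.
 */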
static void init_steering_oaddrm(struct xe_gt *gt)
{
/*
* First instance is only terminated if the entire first media slice
* is absent (i.e., no VCS0 or VECS0).
*/
if (gt->info.engine_mask & (XE_HW_ENGINE_VCS0 | XE_HW_ENGINE_VECS0))
gt->steering[OADDRM].group_target = 0;
else
gt->steering[OADDRM].group_target = 1;
gt->steering[OADDRM].instance_target = 0; /* unused */
}
static void init_steering_inst0(struct xe_gt *gt)
{
gt->steering[INSTANCE0].group_target = 0; /* unused */
gt->steering[INSTANCE0].instance_target = 0; /* unused */
}
static const struct {
const char *name;
void (*init)(struct xe_gt *);
} xe_steering_types[] = {
{ "L3BANK", init_steering_l3bank },
{ "MSLICE", init_steering_mslice },
{ "LNCF", NULL }, /* initialized by mslice init */
{ "DSS", init_steering_dss },
{ "OADDRM", init_steering_oaddrm },
{ "INSTANCE 0", init_steering_inst0 },
};
void xe_gt_mcr_init(struct xe_gt *gt)
{
struct xe_device *xe = gt_to_xe(gt);
BUILD_BUG_ON(ARRAY_SIZE(xe_steering_types) != NUM_STEERING_TYPES);
spin_lock_init(&gt->mcr_lock);
if (gt->info.type == XE_GT_TYPE_MEDIA) {
drm_WARN_ON(&xe->drm, MEDIA_VER(xe) < 13);
gt->steering[OADDRM].ranges = xelpmp_oaddrm_steering_table;
} else if (GRAPHICS_VERx100(xe) >= 1270) {
gt->steering[INSTANCE0].ranges = xelpg_instance0_steering_table;
gt->steering[L3BANK].ranges = xelpg_l3bank_steering_table;
gt->steering[DSS].ranges = xelpg_dss_steering_table;
} else if (xe->info.platform == XE_PVC) {
gt->steering[INSTANCE0].ranges = xehpc_instance0_steering_table;
gt->steering[DSS].ranges = xehpc_dss_steering_table;
} else if (xe->info.platform == XE_DG2) {
gt->steering[MSLICE].ranges = xehp_mslice_steering_table;
gt->steering[LNCF].ranges = xehp_lncf_steering_table;
gt->steering[DSS].ranges = xehp_dss_steering_table;
} else {
gt->steering[L3BANK].ranges = xelp_l3bank_steering_table;
gt->steering[DSS].ranges = xelp_dss_steering_table;
}
/* Select non-terminated steering target for each type */
for (int i = 0; i < NUM_STEERING_TYPES; i++)
if (gt->steering[i].ranges && xe_steering_types[i].init)
xe_steering_types[i].init(gt);
}
/*
* xe_gt_mcr_get_nonterminated_steering - find group/instance values that
* will steer a register to a non-terminated instance
* @gt: GT structure
* @reg: register for which the steering is required
* @group: return variable for group steering
* @instance: return variable for instance steering
*
* This function returns a group/instance pair that is guaranteed to work for
* read steering of the given register. Note that a value will be returned even
* if the register is not replicated and therefore does not actually require
* steering.
*
* Returns true if the caller should steer to the @group/@instance values
* returned. Returns false if the caller need not perform any steering (i.e.,
* the DG2 GAM range special case).
*/
static bool xe_gt_mcr_get_nonterminated_steering(struct xe_gt *gt,
i915_mcr_reg_t reg,
u8 *group, u8 *instance)
{
for (int type = 0; type < NUM_STEERING_TYPES; type++) {
if (!gt->steering[type].ranges)
continue;
for (int i = 0; gt->steering[type].ranges[i].end > 0; i++) {
if (xe_mmio_in_range(&gt->steering[type].ranges[i], reg.reg)) {
*group = gt->steering[type].group_target;
*instance = gt->steering[type].instance_target;
return true;
}
}
}
/*
* All MCR registers should usually be part of one of the steering
* ranges we're tracking. However there's one special case: DG2
* GAM registers are technically multicast registers, but are special
* in a number of ways:
* - they have their own dedicated steering control register (they
* don't share 0xFDC with other MCR classes)
* - all reads should be directed to instance 1 (unicast reads against
* other instances are not allowed), and instance 1 is already
* the hardware's default steering target, which we never change
*
* Ultimately this means that we can just treat them as if they were
* unicast registers and all operations will work properly.
*/
for (int i = 0; dg2_gam_ranges[i].end > 0; i++)
if (xe_mmio_in_range(&dg2_gam_ranges[i], reg.reg))
return false;
/*
* Not found in a steering table and not a DG2 GAM register? We'll
* just steer to 0/0 as a guess and raise a warning.
*/
drm_WARN(&gt_to_xe(gt)->drm, true,
"Did not find MCR register %#x in any MCR steering table\n",
reg.reg);
*group = 0;
*instance = 0;
return true;
}
#define STEER_SEMAPHORE 0xFD0
/*
* Obtain exclusive access to MCR steering. On MTL and beyond we also need
* to synchronize with external clients (e.g., firmware), so a semaphore
* register will also need to be taken.
*/
static void mcr_lock(struct xe_gt *gt)
{
struct xe_device *xe = gt_to_xe(gt);
int ret = 0;
spin_lock(&gt->mcr_lock);
/*
* Starting with MTL we also need to grab a semaphore register
* to synchronize with external agents (e.g., firmware) that now
* share the same steering control register.
*/
if (GRAPHICS_VERx100(xe) >= 1270)
ret = wait_for_us(xe_mmio_read32(gt, STEER_SEMAPHORE) == 0x1, 10);
drm_WARN_ON_ONCE(&xe->drm, ret == -ETIMEDOUT);
}
static void mcr_unlock(struct xe_gt *gt)
{
/* Release hardware semaphore */
if (GRAPHICS_VERx100(gt_to_xe(gt)) >= 1270)
xe_mmio_write32(gt, STEER_SEMAPHORE, 0x1);
spin_unlock(&gt->mcr_lock);
}
/*
* Access a register with specific MCR steering
*
* Caller needs to make sure the relevant forcewake wells are up.
*/
static u32 rw_with_mcr_steering(struct xe_gt *gt, i915_mcr_reg_t reg, u8 rw_flag,
int group, int instance, u32 value)
{
u32 steer_reg, steer_val, val = 0;
lockdep_assert_held(&gt->mcr_lock);
if (GRAPHICS_VERx100(gt_to_xe(gt)) >= 1270) {
steer_reg = MTL_MCR_SELECTOR.reg;
steer_val = REG_FIELD_PREP(MTL_MCR_GROUPID, group) |
REG_FIELD_PREP(MTL_MCR_INSTANCEID, instance);
} else {
steer_reg = GEN8_MCR_SELECTOR.reg;
steer_val = REG_FIELD_PREP(GEN11_MCR_SLICE_MASK, group) |
REG_FIELD_PREP(GEN11_MCR_SUBSLICE_MASK, instance);
}
/*
* Always leave the hardware in multicast mode when doing reads
* (see comment about Wa_22013088509 below) and only change it
* to unicast mode when doing writes of a specific instance.
*
* No need to save old steering reg value.
*/
if (rw_flag == MCR_OP_READ)
steer_val |= GEN11_MCR_MULTICAST;
xe_mmio_write32(gt, steer_reg, steer_val);
if (rw_flag == MCR_OP_READ)
val = xe_mmio_read32(gt, reg.reg);
else
xe_mmio_write32(gt, reg.reg, value);
/*
* If we turned off the multicast bit (during a write) we're required
* to turn it back on before finishing. The group and instance values
* don't matter since they'll be re-programmed on the next MCR
* operation.
*/
if (rw_flag == MCR_OP_WRITE)
xe_mmio_write32(gt, steer_reg, GEN11_MCR_MULTICAST);
return val;
}
/**
* xe_gt_mcr_unicast_read_any - reads a non-terminated instance of an MCR register
* @gt: GT structure
* @reg: register to read
*
* Reads a GT MCR register. The read will be steered to a non-terminated
* instance (i.e., one that isn't fused off or powered down by power gating).
* This function assumes the caller is already holding any necessary forcewake
* domains.
*
* Returns the value from a non-terminated instance of @reg.
*/
u32 xe_gt_mcr_unicast_read_any(struct xe_gt *gt, i915_mcr_reg_t reg)
{
u8 group, instance;
u32 val;
bool steer;
steer = xe_gt_mcr_get_nonterminated_steering(gt, reg, &group, &instance);
if (steer) {
mcr_lock(gt);
val = rw_with_mcr_steering(gt, reg, MCR_OP_READ,
group, instance, 0);
mcr_unlock(gt);
} else {
/* DG2 GAM special case rules; treat as if unicast */
val = xe_mmio_read32(gt, reg.reg);
}
return val;
}
/**
* xe_gt_mcr_unicast_read - read a specific instance of an MCR register
* @gt: GT structure
* @reg: the MCR register to read
* @group: the MCR group
* @instance: the MCR instance
*
* Returns the value read from an MCR register after steering toward a specific
* group/instance.
*/
u32 xe_gt_mcr_unicast_read(struct xe_gt *gt,
i915_mcr_reg_t reg,
int group, int instance)
{
u32 val;
mcr_lock(gt);
val = rw_with_mcr_steering(gt, reg, MCR_OP_READ, group, instance, 0);
mcr_unlock(gt);
return val;
}
/**
* xe_gt_mcr_unicast_write - write a specific instance of an MCR register
* @gt: GT structure
* @reg: the MCR register to write
* @value: value to write
* @group: the MCR group
* @instance: the MCR instance
*
* Write an MCR register in unicast mode after steering toward a specific
* group/instance.
*/
void xe_gt_mcr_unicast_write(struct xe_gt *gt, i915_mcr_reg_t reg, u32 value,
int group, int instance)
{
mcr_lock(gt);
rw_with_mcr_steering(gt, reg, MCR_OP_WRITE, group, instance, value);
mcr_unlock(gt);
}
/**
* xe_gt_mcr_multicast_write - write a value to all instances of an MCR register
* @gt: GT structure
* @reg: the MCR register to write
* @value: value to write
*
* Write an MCR register in multicast mode to update all instances.
*/
void xe_gt_mcr_multicast_write(struct xe_gt *gt, i915_mcr_reg_t reg, u32 value)
{
/*
* Synchronize with any unicast operations. Once we have exclusive
* access, the MULTICAST bit should already be set, so there's no need
* to touch the steering register.
*/
mcr_lock(gt);
xe_mmio_write32(gt, reg.reg, value);
mcr_unlock(gt);
}
void xe_gt_mcr_steering_dump(struct xe_gt *gt, struct drm_printer *p)
{
for (int i = 0; i < NUM_STEERING_TYPES; i++) {
if (gt->steering[i].ranges) {
drm_printf(p, "%s steering: group=%#x, instance=%#x\n",
xe_steering_types[i].name,
gt->steering[i].group_target,
gt->steering[i].instance_target);
for (int j = 0; gt->steering[i].ranges[j].end; j++)
drm_printf(p, "\t0x%06x - 0x%06x\n",
gt->steering[i].ranges[j].start,
gt->steering[i].ranges[j].end);
}
}
}

View File

@ -0,0 +1,26 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GT_MCR_H_
#define _XE_GT_MCR_H_
#include "i915_reg_defs.h"
struct drm_printer;
struct xe_gt;
void xe_gt_mcr_init(struct xe_gt *gt);
u32 xe_gt_mcr_unicast_read(struct xe_gt *gt, i915_mcr_reg_t reg,
int group, int instance);
u32 xe_gt_mcr_unicast_read_any(struct xe_gt *gt, i915_mcr_reg_t reg);
void xe_gt_mcr_unicast_write(struct xe_gt *gt, i915_mcr_reg_t reg, u32 value,
int group, int instance);
void xe_gt_mcr_multicast_write(struct xe_gt *gt, i915_mcr_reg_t reg, u32 value);
void xe_gt_mcr_steering_dump(struct xe_gt *gt, struct drm_printer *p);
#endif /* _XE_GT_MCR_H_ */

View File

@ -0,0 +1,750 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include <linux/circ_buf.h>
#include <drm/drm_managed.h>
#include <drm/ttm/ttm_execbuf_util.h>
#include "xe_bo.h"
#include "xe_gt.h"
#include "xe_guc.h"
#include "xe_guc_ct.h"
#include "xe_gt_pagefault.h"
#include "xe_migrate.h"
#include "xe_pt.h"
#include "xe_trace.h"
#include "xe_vm.h"
struct pagefault {
u64 page_addr;
u32 asid;
u16 pdata;
u8 vfid;
u8 access_type;
u8 fault_type;
u8 fault_level;
u8 engine_class;
u8 engine_instance;
u8 fault_unsuccessful;
};
enum access_type {
ACCESS_TYPE_READ = 0,
ACCESS_TYPE_WRITE = 1,
ACCESS_TYPE_ATOMIC = 2,
ACCESS_TYPE_RESERVED = 3,
};
enum fault_type {
NOT_PRESENT = 0,
WRITE_ACCESS_VIOLATION = 1,
ATOMIC_ACCESS_VIOLATION = 2,
};
struct acc {
u64 va_range_base;
u32 asid;
u32 sub_granularity;
u8 granularity;
u8 vfid;
u8 access_type;
u8 engine_class;
u8 engine_instance;
};
static struct xe_gt *
guc_to_gt(struct xe_guc *guc)
{
return container_of(guc, struct xe_gt, uc.guc);
}
static int send_tlb_invalidation(struct xe_guc *guc)
{
struct xe_gt *gt = guc_to_gt(guc);
u32 action[] = {
XE_GUC_ACTION_TLB_INVALIDATION,
0,
XE_GUC_TLB_INVAL_FULL << XE_GUC_TLB_INVAL_TYPE_SHIFT |
XE_GUC_TLB_INVAL_MODE_HEAVY << XE_GUC_TLB_INVAL_MODE_SHIFT |
XE_GUC_TLB_INVAL_FLUSH_CACHE,
};
int seqno;
int ret;
/*
* XXX: The seqno algorithm relies on TLB invalidations being processed
* in order, which they currently are; if that changes, the algorithm will
* need to be updated.
*/
mutex_lock(&guc->ct.lock);
seqno = gt->usm.tlb_invalidation_seqno;
action[1] = seqno;
gt->usm.tlb_invalidation_seqno = (gt->usm.tlb_invalidation_seqno + 1) %
TLB_INVALIDATION_SEQNO_MAX;
if (!gt->usm.tlb_invalidation_seqno)
gt->usm.tlb_invalidation_seqno = 1;
ret = xe_guc_ct_send_locked(&guc->ct, action, ARRAY_SIZE(action),
G2H_LEN_DW_TLB_INVALIDATE, 1);
if (!ret)
ret = seqno;
mutex_unlock(&guc->ct.lock);
return ret;
}
static bool access_is_atomic(enum access_type access_type)
{
return access_type == ACCESS_TYPE_ATOMIC;
}
static bool vma_is_valid(struct xe_gt *gt, struct xe_vma *vma)
{
return BIT(gt->info.id) & vma->gt_present &&
!(BIT(gt->info.id) & vma->usm.gt_invalidated);
}
static bool vma_matches(struct xe_vma *vma, struct xe_vma *lookup)
{
if (lookup->start > vma->end || lookup->end < vma->start)
return false;
return true;
}
static bool only_needs_bo_lock(struct xe_bo *bo)
{
return bo && bo->vm;
}
static struct xe_vma *lookup_vma(struct xe_vm *vm, u64 page_addr)
{
struct xe_vma *vma = NULL, lookup;
lookup.start = page_addr;
lookup.end = lookup.start + SZ_4K - 1;
if (vm->usm.last_fault_vma) { /* Fast lookup */
if (vma_matches(vm->usm.last_fault_vma, &lookup))
vma = vm->usm.last_fault_vma;
}
if (!vma)
vma = xe_vm_find_overlapping_vma(vm, &lookup);
return vma;
}
static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
{
struct xe_device *xe = gt_to_xe(gt);
struct xe_vm *vm;
struct xe_vma *vma = NULL;
struct xe_bo *bo;
LIST_HEAD(objs);
LIST_HEAD(dups);
struct ttm_validate_buffer tv_bo, tv_vm;
struct ww_acquire_ctx ww;
struct dma_fence *fence;
bool write_locked;
int ret = 0;
bool atomic;
/* ASID to VM */
mutex_lock(&xe->usm.lock);
vm = xa_load(&xe->usm.asid_to_vm, pf->asid);
if (vm)
xe_vm_get(vm);
mutex_unlock(&xe->usm.lock);
if (!vm || !xe_vm_in_fault_mode(vm))
return -EINVAL;
retry_userptr:
/*
* TODO: Avoid exclusive lock if VM doesn't have userptrs, or
* start out read-locked?
*/
down_write(&vm->lock);
write_locked = true;
vma = lookup_vma(vm, pf->page_addr);
if (!vma) {
ret = -EINVAL;
goto unlock_vm;
}
if (!xe_vma_is_userptr(vma) || !xe_vma_userptr_check_repin(vma)) {
downgrade_write(&vm->lock);
write_locked = false;
}
trace_xe_vma_pagefault(vma);
atomic = access_is_atomic(pf->access_type);
/* Check if VMA is valid */
if (vma_is_valid(gt, vma) && !atomic)
goto unlock_vm;
/* TODO: Validate fault */
if (xe_vma_is_userptr(vma) && write_locked) {
spin_lock(&vm->userptr.invalidated_lock);
list_del_init(&vma->userptr.invalidate_link);
spin_unlock(&vm->userptr.invalidated_lock);
ret = xe_vma_userptr_pin_pages(vma);
if (ret)
goto unlock_vm;
downgrade_write(&vm->lock);
write_locked = false;
}
/* Lock VM and BOs dma-resv */
bo = vma->bo;
if (only_needs_bo_lock(bo)) {
/* This path ensures the BO's LRU is updated */
ret = xe_bo_lock(bo, &ww, xe->info.tile_count, false);
} else {
tv_vm.num_shared = xe->info.tile_count;
tv_vm.bo = xe_vm_ttm_bo(vm);
list_add(&tv_vm.head, &objs);
if (bo) {
tv_bo.bo = &bo->ttm;
tv_bo.num_shared = xe->info.tile_count;
list_add(&tv_bo.head, &objs);
}
ret = ttm_eu_reserve_buffers(&ww, &objs, false, &dups);
}
if (ret)
goto unlock_vm;
if (atomic) {
if (xe_vma_is_userptr(vma)) {
ret = -EACCES;
goto unlock_dma_resv;
}
/* Migrate to VRAM, move should invalidate the VMA first */
ret = xe_bo_migrate(bo, XE_PL_VRAM0 + gt->info.vram_id);
if (ret)
goto unlock_dma_resv;
} else if (bo) {
/* Create backing store if needed */
ret = xe_bo_validate(bo, vm, true);
if (ret)
goto unlock_dma_resv;
}
/* Bind VMA only to the GT that has faulted */
trace_xe_vma_pf_bind(vma);
fence = __xe_pt_bind_vma(gt, vma, xe_gt_migrate_engine(gt), NULL, 0,
vma->gt_present & BIT(gt->info.id));
if (IS_ERR(fence)) {
ret = PTR_ERR(fence);
goto unlock_dma_resv;
}
/*
* XXX: Should we drop the lock before waiting? This only helps if doing
* GPU binds which is currently only done if we have to wait for more
* than 10ms on a move.
*/
dma_fence_wait(fence, false);
dma_fence_put(fence);
if (xe_vma_is_userptr(vma))
ret = xe_vma_userptr_check_repin(vma);
vma->usm.gt_invalidated &= ~BIT(gt->info.id);
unlock_dma_resv:
if (only_needs_bo_lock(bo))
xe_bo_unlock(bo, &ww);
else
ttm_eu_backoff_reservation(&ww, &objs);
unlock_vm:
if (!ret)
vm->usm.last_fault_vma = vma;
if (write_locked)
up_write(&vm->lock);
else
up_read(&vm->lock);
if (ret == -EAGAIN)
goto retry_userptr;
if (!ret) {
/*
* FIXME: Doing a full TLB invalidation for now, likely could
* defer TLB invalidate + fault response to a callback of fence
* too
*/
ret = send_tlb_invalidation(&gt->uc.guc);
if (ret >= 0)
ret = 0;
}
xe_vm_put(vm);
return ret;
}
static int send_pagefault_reply(struct xe_guc *guc,
struct xe_guc_pagefault_reply *reply)
{
u32 action[] = {
XE_GUC_ACTION_PAGE_FAULT_RES_DESC,
reply->dw0,
reply->dw1,
};
return xe_guc_ct_send(&guc->ct, action, ARRAY_SIZE(action), 0, 0);
}
static void print_pagefault(struct xe_device *xe, struct pagefault *pf)
{
drm_warn(&xe->drm, "\n\tASID: %d\n"
"\tVFID: %d\n"
"\tPDATA: 0x%04x\n"
"\tFaulted Address: 0x%08x%08x\n"
"\tFaultType: %d\n"
"\tAccessType: %d\n"
"\tFaultLevel: %d\n"
"\tEngineClass: %d\n"
"\tEngineInstance: %d\n",
pf->asid, pf->vfid, pf->pdata, upper_32_bits(pf->page_addr),
lower_32_bits(pf->page_addr),
pf->fault_type, pf->access_type, pf->fault_level,
pf->engine_class, pf->engine_instance);
}
#define PF_MSG_LEN_DW 4
static int get_pagefault(struct pf_queue *pf_queue, struct pagefault *pf)
{
const struct xe_guc_pagefault_desc *desc;
int ret = 0;
spin_lock_irq(&pf_queue->lock);
if (pf_queue->head != pf_queue->tail) {
desc = (const struct xe_guc_pagefault_desc *)
(pf_queue->data + pf_queue->head);
pf->fault_level = FIELD_GET(PFD_FAULT_LEVEL, desc->dw0);
pf->engine_class = FIELD_GET(PFD_ENG_CLASS, desc->dw0);
pf->engine_instance = FIELD_GET(PFD_ENG_INSTANCE, desc->dw0);
pf->pdata = FIELD_GET(PFD_PDATA_HI, desc->dw1) <<
PFD_PDATA_HI_SHIFT;
pf->pdata |= FIELD_GET(PFD_PDATA_LO, desc->dw0);
pf->asid = FIELD_GET(PFD_ASID, desc->dw1);
pf->vfid = FIELD_GET(PFD_VFID, desc->dw2);
pf->access_type = FIELD_GET(PFD_ACCESS_TYPE, desc->dw2);
pf->fault_type = FIELD_GET(PFD_FAULT_TYPE, desc->dw2);
pf->page_addr = (u64)(FIELD_GET(PFD_VIRTUAL_ADDR_HI, desc->dw3)) <<
PFD_VIRTUAL_ADDR_HI_SHIFT;
pf->page_addr |= FIELD_GET(PFD_VIRTUAL_ADDR_LO, desc->dw2) <<
PFD_VIRTUAL_ADDR_LO_SHIFT;
pf_queue->head = (pf_queue->head + PF_MSG_LEN_DW) %
PF_QUEUE_NUM_DW;
} else {
ret = -1;
}
spin_unlock_irq(&pf_queue->lock);
return ret;
}
static bool pf_queue_full(struct pf_queue *pf_queue)
{
lockdep_assert_held(&pf_queue->lock);
return CIRC_SPACE(pf_queue->tail, pf_queue->head, PF_QUEUE_NUM_DW) <=
PF_MSG_LEN_DW;
}
int xe_guc_pagefault_handler(struct xe_guc *guc, u32 *msg, u32 len)
{
struct xe_gt *gt = guc_to_gt(guc);
struct pf_queue *pf_queue;
unsigned long flags;
u32 asid;
bool full;
if (unlikely(len != PF_MSG_LEN_DW))
return -EPROTO;
asid = FIELD_GET(PFD_ASID, msg[1]);
pf_queue = &gt->usm.pf_queue[asid % NUM_PF_QUEUE];
spin_lock_irqsave(&pf_queue->lock, flags);
full = pf_queue_full(pf_queue);
if (!full) {
memcpy(pf_queue->data + pf_queue->tail, msg, len * sizeof(u32));
pf_queue->tail = (pf_queue->tail + len) % PF_QUEUE_NUM_DW;
queue_work(gt->usm.pf_wq, &pf_queue->worker);
} else {
XE_WARN_ON("PF Queue full, shouldn't be possible");
}
spin_unlock_irqrestore(&pf_queue->lock, flags);
return full ? -ENOSPC : 0;
}
static void pf_queue_work_func(struct work_struct *w)
{
struct pf_queue *pf_queue = container_of(w, struct pf_queue, worker);
struct xe_gt *gt = pf_queue->gt;
struct xe_device *xe = gt_to_xe(gt);
struct xe_guc_pagefault_reply reply = {};
struct pagefault pf = {};
int ret;
ret = get_pagefault(pf_queue, &pf);
if (ret)
return;
ret = handle_pagefault(gt, &pf);
if (unlikely(ret)) {
print_pagefault(xe, &pf);
pf.fault_unsuccessful = 1;
drm_warn(&xe->drm, "Fault response: Unsuccessful %d\n", ret);
}
reply.dw0 = FIELD_PREP(PFR_VALID, 1) |
FIELD_PREP(PFR_SUCCESS, pf.fault_unsuccessful) |
FIELD_PREP(PFR_REPLY, PFR_ACCESS) |
FIELD_PREP(PFR_DESC_TYPE, FAULT_RESPONSE_DESC) |
FIELD_PREP(PFR_ASID, pf.asid);
reply.dw1 = FIELD_PREP(PFR_VFID, pf.vfid) |
FIELD_PREP(PFR_ENG_INSTANCE, pf.engine_instance) |
FIELD_PREP(PFR_ENG_CLASS, pf.engine_class) |
FIELD_PREP(PFR_PDATA, pf.pdata);
send_pagefault_reply(&gt->uc.guc, &reply);
}
static void acc_queue_work_func(struct work_struct *w);
int xe_gt_pagefault_init(struct xe_gt *gt)
{
struct xe_device *xe = gt_to_xe(gt);
int i;
if (!xe->info.supports_usm)
return 0;
gt->usm.tlb_invalidation_seqno = 1;
for (i = 0; i < NUM_PF_QUEUE; ++i) {
gt->usm.pf_queue[i].gt = gt;
spin_lock_init(&gt->usm.pf_queue[i].lock);
INIT_WORK(&gt->usm.pf_queue[i].worker, pf_queue_work_func);
}
for (i = 0; i < NUM_ACC_QUEUE; ++i) {
gt->usm.acc_queue[i].gt = gt;
spin_lock_init(&gt->usm.acc_queue[i].lock);
INIT_WORK(&gt->usm.acc_queue[i].worker, acc_queue_work_func);
}
gt->usm.pf_wq = alloc_workqueue("xe_gt_page_fault_work_queue",
WQ_UNBOUND | WQ_HIGHPRI, NUM_PF_QUEUE);
if (!gt->usm.pf_wq)
return -ENOMEM;
gt->usm.acc_wq = alloc_workqueue("xe_gt_access_counter_work_queue",
WQ_UNBOUND | WQ_HIGHPRI,
NUM_ACC_QUEUE);
if (!gt->usm.acc_wq)
return -ENOMEM;
return 0;
}
void xe_gt_pagefault_reset(struct xe_gt *gt)
{
struct xe_device *xe = gt_to_xe(gt);
int i;
if (!xe->info.supports_usm)
return;
for (i = 0; i < NUM_PF_QUEUE; ++i) {
spin_lock_irq(&gt->usm.pf_queue[i].lock);
gt->usm.pf_queue[i].head = 0;
gt->usm.pf_queue[i].tail = 0;
spin_unlock_irq(&gt->usm.pf_queue[i].lock);
}
for (i = 0; i < NUM_ACC_QUEUE; ++i) {
spin_lock(&gt->usm.acc_queue[i].lock);
gt->usm.acc_queue[i].head = 0;
gt->usm.acc_queue[i].tail = 0;
spin_unlock(&gt->usm.acc_queue[i].lock);
}
}
int xe_gt_tlb_invalidation(struct xe_gt *gt)
{
return send_tlb_invalidation(&gt->uc.guc);
}
static bool tlb_invalidation_seqno_past(struct xe_gt *gt, int seqno)
{
if (gt->usm.tlb_invalidation_seqno_recv >= seqno)
return true;
if (seqno - gt->usm.tlb_invalidation_seqno_recv >
(TLB_INVALIDATION_SEQNO_MAX / 2))
return true;
return false;
}
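/*
 * Illustrative example of the wrap handling above: with
 * TLB_INVALIDATION_SEQNO_MAX = 0x100000, a received seqno of 2 and a waited-on
 * seqno of 0xFFFFF gives 0xFFFFF - 2 > 0x80000, so the waited-on seqno is
 * treated as already completed (it predates the counter wrap).
 */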
int xe_gt_tlb_invalidation_wait(struct xe_gt *gt, int seqno)
{
struct xe_device *xe = gt_to_xe(gt);
struct xe_guc *guc = &gt->uc.guc;
int ret;
/*
* XXX: See above, this algorithm only works if seqno are always in
* order
*/
ret = wait_event_timeout(guc->ct.wq,
tlb_invalidation_seqno_past(gt, seqno),
HZ / 5);
if (!ret) {
drm_err(&xe->drm, "TLB invalidation timed out, seqno=%d, recv=%d\n",
seqno, gt->usm.tlb_invalidation_seqno_recv);
return -ETIME;
}
return 0;
}
int xe_guc_tlb_invalidation_done_handler(struct xe_guc *guc, u32 *msg, u32 len)
{
struct xe_gt *gt = guc_to_gt(guc);
int expected_seqno;
if (unlikely(len != 1))
return -EPROTO;
/* Sanity check on seqno */
expected_seqno = (gt->usm.tlb_invalidation_seqno_recv + 1) %
TLB_INVALIDATION_SEQNO_MAX;
XE_WARN_ON(expected_seqno != msg[0]);
gt->usm.tlb_invalidation_seqno_recv = msg[0];
smp_wmb();
wake_up_all(&guc->ct.wq);
return 0;
}
static int granularity_in_byte(int val)
{
switch (val) {
case 0:
return SZ_128K;
case 1:
return SZ_2M;
case 2:
return SZ_16M;
case 3:
return SZ_64M;
default:
return 0;
}
}
static int sub_granularity_in_byte(int val)
{
return (granularity_in_byte(val) / 32);
}
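/*
 * Illustrative mapping of the encodings above: granularity 1 corresponds to a
 * 2M region which, split into 32 sub-granules, gives a 64K sub-granularity.
 */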
static void print_acc(struct xe_device *xe, struct acc *acc)
{
drm_warn(&xe->drm, "Access counter request:\n"
"\tType: %s\n"
"\tASID: %d\n"
"\tVFID: %d\n"
"\tEngine: %d:%d\n"
"\tGranularity: 0x%x KB Region/ %d KB sub-granularity\n"
"\tSub_Granularity Vector: 0x%08x\n"
"\tVA Range base: 0x%016llx\n",
acc->access_type ? "AC_NTFY_VAL" : "AC_TRIG_VAL",
acc->asid, acc->vfid, acc->engine_class, acc->engine_instance,
granularity_in_byte(acc->granularity) / SZ_1K,
sub_granularity_in_byte(acc->granularity) / SZ_1K,
acc->sub_granularity, acc->va_range_base);
}
static struct xe_vma *get_acc_vma(struct xe_vm *vm, struct acc *acc)
{
u64 page_va = acc->va_range_base + (ffs(acc->sub_granularity) - 1) *
sub_granularity_in_byte(acc->granularity);
struct xe_vma lookup;
lookup.start = page_va;
lookup.end = lookup.start + SZ_4K - 1;
return xe_vm_find_overlapping_vma(vm, &lookup);
}
static int handle_acc(struct xe_gt *gt, struct acc *acc)
{
struct xe_device *xe = gt_to_xe(gt);
struct xe_vm *vm;
struct xe_vma *vma;
struct xe_bo *bo;
LIST_HEAD(objs);
LIST_HEAD(dups);
struct ttm_validate_buffer tv_bo, tv_vm;
struct ww_acquire_ctx ww;
int ret = 0;
/* We only support ACC_TRIGGER at the moment */
if (acc->access_type != ACC_TRIGGER)
return -EINVAL;
/* ASID to VM */
mutex_lock(&xe->usm.lock);
vm = xa_load(&xe->usm.asid_to_vm, acc->asid);
if (vm)
xe_vm_get(vm);
mutex_unlock(&xe->usm.lock);
if (!vm || !xe_vm_in_fault_mode(vm))
return -EINVAL;
down_read(&vm->lock);
/* Lookup VMA */
vma = get_acc_vma(vm, acc);
if (!vma) {
ret = -EINVAL;
goto unlock_vm;
}
trace_xe_vma_acc(vma);
/* Userptr can't be migrated, nothing to do */
if (xe_vma_is_userptr(vma))
goto unlock_vm;
/* Lock VM and BOs dma-resv */
bo = vma->bo;
if (only_needs_bo_lock(bo)) {
/* This path ensures the BO's LRU is updated */
ret = xe_bo_lock(bo, &ww, xe->info.tile_count, false);
} else {
tv_vm.num_shared = xe->info.tile_count;
tv_vm.bo = xe_vm_ttm_bo(vm);
list_add(&tv_vm.head, &objs);
tv_bo.bo = &bo->ttm;
tv_bo.num_shared = xe->info.tile_count;
list_add(&tv_bo.head, &objs);
ret = ttm_eu_reserve_buffers(&ww, &objs, false, &dups);
}
if (ret)
goto unlock_vm;
/* Migrate to VRAM, move should invalidate the VMA first */
ret = xe_bo_migrate(bo, XE_PL_VRAM0 + gt->info.vram_id);
if (only_needs_bo_lock(bo))
xe_bo_unlock(bo, &ww);
else
ttm_eu_backoff_reservation(&ww, &objs);
unlock_vm:
up_read(&vm->lock);
xe_vm_put(vm);
return ret;
}
#define make_u64(hi__, low__) ((u64)(hi__) << 32 | (u64)(low__))
static int get_acc(struct acc_queue *acc_queue, struct acc *acc)
{
const struct xe_guc_acc_desc *desc;
int ret = 0;
spin_lock(&acc_queue->lock);
if (acc_queue->head != acc_queue->tail) {
desc = (const struct xe_guc_acc_desc *)
(acc_queue->data + acc_queue->head);
acc->granularity = FIELD_GET(ACC_GRANULARITY, desc->dw2);
acc->sub_granularity = FIELD_GET(ACC_SUBG_HI, desc->dw1) << 31 |
FIELD_GET(ACC_SUBG_LO, desc->dw0);
acc->engine_class = FIELD_GET(ACC_ENG_CLASS, desc->dw1);
acc->engine_instance = FIELD_GET(ACC_ENG_INSTANCE, desc->dw1);
acc->asid = FIELD_GET(ACC_ASID, desc->dw1);
acc->vfid = FIELD_GET(ACC_VFID, desc->dw2);
acc->access_type = FIELD_GET(ACC_TYPE, desc->dw0);
acc->va_range_base = make_u64(desc->dw3 & ACC_VIRTUAL_ADDR_RANGE_HI,
desc->dw2 & ACC_VIRTUAL_ADDR_RANGE_LO);
} else {
ret = -1;
}
spin_unlock(&acc_queue->lock);
return ret;
}
static void acc_queue_work_func(struct work_struct *w)
{
struct acc_queue *acc_queue = container_of(w, struct acc_queue, worker);
struct xe_gt *gt = acc_queue->gt;
struct xe_device *xe = gt_to_xe(gt);
struct acc acc = {};
int ret;
ret = get_acc(acc_queue, &acc);
if (ret)
return;
ret = handle_acc(gt, &acc);
if (unlikely(ret)) {
print_acc(xe, &acc);
drm_warn(&xe->drm, "ACC: Unsuccessful %d\n", ret);
}
}
#define ACC_MSG_LEN_DW 4
static bool acc_queue_full(struct acc_queue *acc_queue)
{
lockdep_assert_held(&acc_queue->lock);
return CIRC_SPACE(acc_queue->tail, acc_queue->head, ACC_QUEUE_NUM_DW) <=
ACC_MSG_LEN_DW;
}
int xe_guc_access_counter_notify_handler(struct xe_guc *guc, u32 *msg, u32 len)
{
struct xe_gt *gt = guc_to_gt(guc);
struct acc_queue *acc_queue;
u32 asid;
bool full;
if (unlikely(len != ACC_MSG_LEN_DW))
return -EPROTO;
asid = FIELD_GET(ACC_ASID, msg[1]);
acc_queue = &gt->usm.acc_queue[asid % NUM_ACC_QUEUE];
spin_lock(&acc_queue->lock);
full = acc_queue_full(acc_queue);
if (!full) {
memcpy(acc_queue->data + acc_queue->tail, msg,
len * sizeof(u32));
acc_queue->tail = (acc_queue->tail + len) % ACC_QUEUE_NUM_DW;
queue_work(gt->usm.acc_wq, &acc_queue->worker);
} else {
drm_warn(&gt_to_xe(gt)->drm, "ACC Queue full, dropping ACC");
}
spin_unlock(&acc_queue->lock);
return full ? -ENOSPC : 0;
}

View File

@ -0,0 +1,22 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GT_PAGEFAULT_H_
#define _XE_GT_PAGEFAULT_H_
#include <linux/types.h>
struct xe_gt;
struct xe_guc;
int xe_gt_pagefault_init(struct xe_gt *gt);
void xe_gt_pagefault_reset(struct xe_gt *gt);
int xe_gt_tlb_invalidation(struct xe_gt *gt);
int xe_gt_tlb_invalidation_wait(struct xe_gt *gt, int seqno);
int xe_guc_pagefault_handler(struct xe_guc *guc, u32 *msg, u32 len);
int xe_guc_tlb_invalidation_done_handler(struct xe_guc *guc, u32 *msg, u32 len);
int xe_guc_access_counter_notify_handler(struct xe_guc *guc, u32 *msg, u32 len);
#endif /* _XE_GT_PAGEFAULT_H_ */

View File

@ -0,0 +1,55 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include <linux/kobject.h>
#include <linux/sysfs.h>
#include <drm/drm_managed.h>
#include "xe_gt.h"
#include "xe_gt_sysfs.h"
static void xe_gt_sysfs_kobj_release(struct kobject *kobj)
{
kfree(kobj);
}
static struct kobj_type xe_gt_sysfs_kobj_type = {
.release = xe_gt_sysfs_kobj_release,
.sysfs_ops = &kobj_sysfs_ops,
};
static void gt_sysfs_fini(struct drm_device *drm, void *arg)
{
struct xe_gt *gt = arg;
kobject_put(gt->sysfs);
}
int xe_gt_sysfs_init(struct xe_gt *gt)
{
struct device *dev = gt_to_xe(gt)->drm.dev;
struct kobj_gt *kg;
int err;
kg = kzalloc(sizeof(*kg), GFP_KERNEL);
if (!kg)
return -ENOMEM;
kobject_init(&kg->base, &xe_gt_sysfs_kobj_type);
kg->gt = gt;
err = kobject_add(&kg->base, &dev->kobj, "gt%d", gt->info.id);
if (err) {
kobject_put(&kg->base);
return err;
}
gt->sysfs = &kg->base;
err = drmm_add_action_or_reset(&gt_to_xe(gt)->drm, gt_sysfs_fini, gt);
if (err)
return err;
return 0;
}

View File

@ -0,0 +1,19 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GT_SYSFS_H_
#define _XE_GT_SYSFS_H_
#include "xe_gt_sysfs_types.h"
int xe_gt_sysfs_init(struct xe_gt *gt);
static inline struct xe_gt *
kobj_to_gt(struct kobject *kobj)
{
return container_of(kobj, struct kobj_gt, base)->gt;
}
#endif /* _XE_GT_SYSFS_H_ */

View File

@ -0,0 +1,26 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GT_SYSFS_TYPES_H_
#define _XE_GT_SYSFS_TYPES_H_
#include <linux/kobject.h>
struct xe_gt;
/**
* struct kobj_gt - A GT's kobject struct that connects the kobject and the GT
*
* When dealing with multiple GTs, this struct helps to understand which GT
* needs to be addressed on a given sysfs call.
*/
struct kobj_gt {
/** @base: The actual kobject */
struct kobject base;
/** @gt: A pointer to the GT itself */
struct xe_gt *gt;
};
#endif /* _XE_GT_SYSFS_TYPES_H_ */

View File

@ -0,0 +1,144 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include <linux/bitmap.h>
#include "xe_gt.h"
#include "xe_gt_topology.h"
#include "xe_mmio.h"
#define XE_MAX_DSS_FUSE_BITS (32 * XE_MAX_DSS_FUSE_REGS)
#define XE_MAX_EU_FUSE_BITS (32 * XE_MAX_EU_FUSE_REGS)
#define XELP_EU_ENABLE 0x9134 /* "_DISABLE" on Xe_LP */
#define XELP_EU_MASK REG_GENMASK(7, 0)
#define XELP_GT_GEOMETRY_DSS_ENABLE 0x913c
#define XEHP_GT_COMPUTE_DSS_ENABLE 0x9144
#define XEHPC_GT_COMPUTE_DSS_ENABLE_EXT 0x9148
static void
load_dss_mask(struct xe_gt *gt, xe_dss_mask_t mask, int numregs, ...)
{
va_list argp;
u32 fuse_val[XE_MAX_DSS_FUSE_REGS] = {};
int i;
if (drm_WARN_ON(&gt_to_xe(gt)->drm, numregs > XE_MAX_DSS_FUSE_REGS))
numregs = XE_MAX_DSS_FUSE_REGS;
va_start(argp, numregs);
for (i = 0; i < numregs; i++)
fuse_val[i] = xe_mmio_read32(gt, va_arg(argp, u32));
va_end(argp);
bitmap_from_arr32(mask, fuse_val, numregs * 32);
}
static void
load_eu_mask(struct xe_gt *gt, xe_eu_mask_t mask)
{
struct xe_device *xe = gt_to_xe(gt);
u32 reg = xe_mmio_read32(gt, XELP_EU_ENABLE);
u32 val = 0;
int i;
BUILD_BUG_ON(XE_MAX_EU_FUSE_REGS > 1);
/*
* Pre-Xe_HP platforms inverted the bit meaning (disable instead
* of enable).
*/
if (GRAPHICS_VERx100(xe) < 1250)
reg = ~reg & XELP_EU_MASK;
/* On PVC, one bit = one EU */
if (GRAPHICS_VERx100(xe) == 1260) {
val = reg;
} else {
/* All other platforms, one bit = 2 EU */
for (i = 0; i < fls(reg); i++)
if (reg & BIT(i))
val |= 0x3 << 2 * i;
}
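/*
 * Illustrative example of the expansion above: a fuse value of 0b0101
 * (bits 0 and 2 set) expands to an EU mask of 0b00110011.
 */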
bitmap_from_arr32(mask, &val, XE_MAX_EU_FUSE_BITS);
}
void
xe_gt_topology_init(struct xe_gt *gt)
{
struct xe_device *xe = gt_to_xe(gt);
struct drm_printer p = drm_debug_printer("GT topology");
int num_geometry_regs, num_compute_regs;
if (GRAPHICS_VERx100(xe) == 1260) {
num_geometry_regs = 0;
num_compute_regs = 2;
} else if (GRAPHICS_VERx100(xe) >= 1250) {
num_geometry_regs = 1;
num_compute_regs = 1;
} else {
num_geometry_regs = 1;
num_compute_regs = 0;
}
load_dss_mask(gt, gt->fuse_topo.g_dss_mask, num_geometry_regs,
XELP_GT_GEOMETRY_DSS_ENABLE);
load_dss_mask(gt, gt->fuse_topo.c_dss_mask, num_compute_regs,
XEHP_GT_COMPUTE_DSS_ENABLE,
XEHPC_GT_COMPUTE_DSS_ENABLE_EXT);
load_eu_mask(gt, gt->fuse_topo.eu_mask_per_dss);
xe_gt_topology_dump(gt, &p);
}
unsigned int
xe_gt_topology_count_dss(xe_dss_mask_t mask)
{
return bitmap_weight(mask, XE_MAX_DSS_FUSE_BITS);
}
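/*
 * Illustrative example for xe_gt_topology_dss_group_mask() below: with
 * grpsize = 4 and only DSS 5 present, only group 1 (DSS 4-7) intersects the
 * mask, so the returned group mask is BIT(1).
 */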
u64
xe_gt_topology_dss_group_mask(xe_dss_mask_t mask, int grpsize)
{
xe_dss_mask_t per_dss_mask = {};
u64 grpmask = 0;
WARN_ON(DIV_ROUND_UP(XE_MAX_DSS_FUSE_BITS, grpsize) > BITS_PER_TYPE(grpmask));
bitmap_fill(per_dss_mask, grpsize);
for (int i = 0; !bitmap_empty(mask, XE_MAX_DSS_FUSE_BITS); i++) {
if (bitmap_intersects(mask, per_dss_mask, grpsize))
grpmask |= BIT(i);
bitmap_shift_right(mask, mask, grpsize, XE_MAX_DSS_FUSE_BITS);
}
return grpmask;
}
void
xe_gt_topology_dump(struct xe_gt *gt, struct drm_printer *p)
{
drm_printf(p, "dss mask (geometry): %*pb\n", XE_MAX_DSS_FUSE_BITS,
gt->fuse_topo.g_dss_mask);
drm_printf(p, "dss mask (compute): %*pb\n", XE_MAX_DSS_FUSE_BITS,
gt->fuse_topo.c_dss_mask);
drm_printf(p, "EU mask per DSS: %*pb\n", XE_MAX_EU_FUSE_BITS,
gt->fuse_topo.eu_mask_per_dss);
}
/*
* Used to obtain the index of the first DSS. Can start searching from the
* beginning of a specific dss group (e.g., gslice, cslice, etc.) if
* groupsize and groupnum are non-zero.
*/
unsigned int
xe_dss_mask_group_ffs(xe_dss_mask_t mask, int groupsize, int groupnum)
{
return find_next_bit(mask, XE_MAX_DSS_FUSE_BITS, groupnum * groupsize);
}

View File

@ -0,0 +1,20 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef __XE_GT_TOPOLOGY_H__
#define __XE_GT_TOPOLOGY_H__
#include "xe_gt_types.h"
struct drm_printer;
void xe_gt_topology_init(struct xe_gt *gt);
void xe_gt_topology_dump(struct xe_gt *gt, struct drm_printer *p);
unsigned int
xe_dss_mask_group_ffs(xe_dss_mask_t mask, int groupsize, int groupnum);
#endif /* __XE_GT_TOPOLOGY_H__ */

View File

@ -0,0 +1,320 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GT_TYPES_H_
#define _XE_GT_TYPES_H_
#include "xe_force_wake_types.h"
#include "xe_hw_engine_types.h"
#include "xe_hw_fence_types.h"
#include "xe_reg_sr_types.h"
#include "xe_sa_types.h"
#include "xe_uc_types.h"
struct xe_engine_ops;
struct xe_ggtt;
struct xe_migrate;
struct xe_ring_ops;
struct xe_ttm_gtt_mgr;
struct xe_ttm_vram_mgr;
enum xe_gt_type {
XE_GT_TYPE_UNINITIALIZED,
XE_GT_TYPE_MAIN,
XE_GT_TYPE_REMOTE,
XE_GT_TYPE_MEDIA,
};
#define XE_MAX_DSS_FUSE_REGS 2
#define XE_MAX_EU_FUSE_REGS 1
typedef unsigned long xe_dss_mask_t[BITS_TO_LONGS(32 * XE_MAX_DSS_FUSE_REGS)];
typedef unsigned long xe_eu_mask_t[BITS_TO_LONGS(32 * XE_MAX_EU_FUSE_REGS)];
struct xe_mmio_range {
u32 start;
u32 end;
};
/*
* The hardware has multiple kinds of multicast register ranges that need
* special register steering (and future platforms are expected to add
* additional types).
*
* During driver startup, we initialize the steering control register to
* direct reads to a slice/subslice that is valid for the 'subslice' class
* of multicast registers. If another type of steering does not have any
* overlap in valid steering targets with 'subslice' style registers, we will
* need to explicitly re-steer reads of registers of the other type.
*
* Only the replication types that may need additional non-default steering
* are listed here.
*/
enum xe_steering_type {
L3BANK,
MSLICE,
LNCF,
DSS,
OADDRM,
/*
* On some platforms there are multiple types of MCR registers that
* will always return a non-terminated value at instance (0, 0). We'll
* lump those all into a single category to keep things simple.
*/
INSTANCE0,
NUM_STEERING_TYPES
};
/**
* struct xe_gt - Top level struct of a graphics tile
*
* A graphics tile may be a physical split (duplicate pieces of silicon,
* different GGTT + VRAM) or a virtual split (shared GGTT + VRAM). Either way
* this structure encapsulates everything a GT is (MMIO, VRAM, memory
* management, microcontrollers, and a set of hardware engines).
*/
struct xe_gt {
/** @xe: backpointer to XE device */
struct xe_device *xe;
/** @info: GT info */
struct {
/** @type: type of GT */
enum xe_gt_type type;
/** @id: id of GT */
u8 id;
/** @vram_id: id of the VRAM for this GT */
u8 vram_id;
/** @clock_freq: clock frequency */
u32 clock_freq;
/** @engine_mask: mask of engines present on GT */
u64 engine_mask;
} info;
/**
* @mmio: mmio info for GT, can be subset of the global device mmio
* space
*/
struct {
/** @size: size of MMIO space on GT */
size_t size;
/** @regs: pointer to MMIO space on GT */
void *regs;
/** @fw: force wake for GT */
struct xe_force_wake fw;
/**
* @adj_limit: adjust MMIO address if address is below this
* value
*/
u32 adj_limit;
/** @adj_offset: offset to add to MMIO address when adjusting */
u32 adj_offset;
} mmio;
/**
* @reg_sr: table with registers to be restored on GT init/resume/reset
*/
struct xe_reg_sr reg_sr;
/**
* @mem: memory management info for GT, multiple GTs can point to same
* objects (virtual split)
*/
struct {
/**
* @vram: VRAM info for GT, multiple GTs can point to same info
* (virtual split), can be subset of global device VRAM
*/
struct {
/** @io_start: start address of VRAM */
resource_size_t io_start;
/** @size: size of VRAM */
resource_size_t size;
/** @mapping: pointer to VRAM mappable space */
void __iomem *mapping;
} vram;
/** @vram_mgr: VRAM TTM manager */
struct xe_ttm_vram_mgr *vram_mgr;
/** @gtt_mgr: GTT TTM manager */
struct xe_ttm_gtt_mgr *gtt_mgr;
/** @ggtt: Global graphics translation table */
struct xe_ggtt *ggtt;
} mem;
/** @reset: state for GT resets */
struct {
/**
* @worker: work so GT resets can be done async, allowing the
* reset code to safely flush all code paths
*/
struct work_struct worker;
} reset;
/** @usm: unified shared memory state */
struct {
/**
* @bb_pool: Pool from which batchbuffers, for USM operations
* (e.g. migrations, fixing page tables), are allocated.
* Dedicated pool needed so USM operations do not get blocked
* behind any user operations which may have resulted in a
* fault.
*/
struct xe_sa_manager bb_pool;
/**
* @reserved_bcs_instance: reserved BCS instance used for USM
* operations (e.g. migrations, fixing page tables)
*/
u16 reserved_bcs_instance;
/**
* @tlb_invalidation_seqno: TLB invalidation seqno, protected by
* CT lock
*/
#define TLB_INVALIDATION_SEQNO_MAX 0x100000
int tlb_invalidation_seqno;
/**
* @tlb_invalidation_seqno_recv: last received TLB invalidation
* seqno, protected by CT lock
*/
int tlb_invalidation_seqno_recv;
/** @pf_wq: page fault work queue, unbound, high priority */
struct workqueue_struct *pf_wq;
/** @acc_wq: access counter work queue, unbound, high priority */
struct workqueue_struct *acc_wq;
/**
* @pf_queue: Page fault queue used to sync faults so they can
* be processed outside of the GuC CT lock. The queue is sized so
* it can sync all possible faults (1 per physical engine).
* Multiple queues exist so page faults from different VMs can
* be processed in parallel.
*/
struct pf_queue {
/** @gt: back pointer to GT */
struct xe_gt *gt;
#define PF_QUEUE_NUM_DW 128
/** @data: data in the page fault queue */
u32 data[PF_QUEUE_NUM_DW];
/**
* @head: head pointer in DWs for page fault queue,
* moved by worker which processes faults.
*/
u16 head;
/**
* @tail: tail pointer in DWs for page fault queue,
* moved by G2H handler.
*/
u16 tail;
/** @lock: protects page fault queue */
spinlock_t lock;
/** @worker: to process page faults */
struct work_struct worker;
#define NUM_PF_QUEUE 4
} pf_queue[NUM_PF_QUEUE];
/**
* @acc_queue: Same as the page fault queue; access counters cannot
* be processed under the GuC CT lock.
*/
struct acc_queue {
/** @gt: back pointer to GT */
struct xe_gt *gt;
#define ACC_QUEUE_NUM_DW 128
/** @data: data in the access counter queue */
u32 data[ACC_QUEUE_NUM_DW];
/**
* @head: head pointer in DWs for access counter queue,
* moved by worker which processes access counters.
*/
u16 head;
/**
* @tail: tail pointer in DWs for access counter queue,
* moved by G2H handler.
*/
u16 tail;
/** @lock: protects access counter queue */
spinlock_t lock;
/** @worker: to process access counters */
struct work_struct worker;
#define NUM_ACC_QUEUE 4
} acc_queue[NUM_ACC_QUEUE];
} usm;
/** @ordered_wq: used to serialize GT resets and TDRs */
struct workqueue_struct *ordered_wq;
/** @uc: micro controllers on the GT */
struct xe_uc uc;
/** @engine_ops: submission backend engine operations */
const struct xe_engine_ops *engine_ops;
/**
* @ring_ops: ring operations for this hw engine (1 per engine class)
*/
const struct xe_ring_ops *ring_ops[XE_ENGINE_CLASS_MAX];
/** @fence_irq: fence IRQs (1 per engine class) */
struct xe_hw_fence_irq fence_irq[XE_ENGINE_CLASS_MAX];
/** @default_lrc: default LRC state */
void *default_lrc[XE_ENGINE_CLASS_MAX];
/** @hw_engines: hardware engines on the GT */
struct xe_hw_engine hw_engines[XE_NUM_HW_ENGINES];
/** @kernel_bb_pool: Pool from which batchbuffers are allocated */
struct xe_sa_manager kernel_bb_pool;
/** @migrate: Migration helper for vram blits and clearing */
struct xe_migrate *migrate;
/** @pcode: GT's PCODE */
struct {
/** @lock: protecting GT's PCODE mailbox data */
struct mutex lock;
} pcode;
/** @sysfs: sysfs' kobj used by xe_gt_sysfs */
struct kobject *sysfs;
/** @mocs: MOCS info */
struct {
/** @uc_index: UC index */
u8 uc_index;
/** @wb_index: WB index, only used on L3_CCS platforms */
u8 wb_index;
} mocs;
/** @fuse_topo: GT topology reported by fuse registers */
struct {
/** @g_dss_mask: dual-subslices usable by geometry */
xe_dss_mask_t g_dss_mask;
/** @c_dss_mask: dual-subslices usable by compute */
xe_dss_mask_t c_dss_mask;
/** @eu_mask_per_dss: EU mask per DSS */
xe_eu_mask_t eu_mask_per_dss;
} fuse_topo;
/** @steering: register steering for individual HW units */
struct {
/** @ranges: register ranges used for this steering type */
const struct xe_mmio_range *ranges;
/** @group_target: target to steer accesses to */
u16 group_target;
/** @instance_target: instance to steer accesses to */
u16 instance_target;
} steering[NUM_STEERING_TYPES];
/**
* @mcr_lock: protects the MCR_SELECTOR register for the duration
* of a steered operation
*/
spinlock_t mcr_lock;
};
#endif
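/*
 * Illustrative sketch, not part of this patch: pf_queue above is a circular
 * buffer of PF_QUEUE_NUM_DW dwords, with the G2H handler producing at @tail
 * and the worker consuming at @head. Because PF_QUEUE_NUM_DW is a power of
 * two, a producer-side space check before copying a fault message in could
 * look like the helper below (the name and exact check are assumptions; the
 * real logic lives in the GT page fault handler).
 */
static inline bool pf_queue_has_space(struct pf_queue *pf_queue, u32 len_dw)
{
	u32 used;

	lockdep_assert_held(&pf_queue->lock);

	/* Dwords currently queued between consumer head and producer tail */
	used = (pf_queue->tail - pf_queue->head) & (PF_QUEUE_NUM_DW - 1);

	return PF_QUEUE_NUM_DW - used > len_dw;
}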

drivers/gpu/drm/xe/xe_guc.c
View File

@ -0,0 +1,875 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include "xe_bo.h"
#include "xe_device.h"
#include "xe_guc.h"
#include "xe_guc_ads.h"
#include "xe_guc_ct.h"
#include "xe_guc_hwconfig.h"
#include "xe_guc_log.h"
#include "xe_guc_reg.h"
#include "xe_guc_pc.h"
#include "xe_guc_submit.h"
#include "xe_gt.h"
#include "xe_platform_types.h"
#include "xe_uc_fw.h"
#include "xe_wopcm.h"
#include "xe_mmio.h"
#include "xe_force_wake.h"
#include "i915_reg_defs.h"
#include "gt/intel_gt_regs.h"
/* TODO: move to common file */
#define GUC_PVC_MOCS_INDEX_MASK REG_GENMASK(25, 24)
#define PVC_MOCS_UC_INDEX 1
#define PVC_GUC_MOCS_INDEX(index) REG_FIELD_PREP(GUC_PVC_MOCS_INDEX_MASK,\
index)
static struct xe_gt *
guc_to_gt(struct xe_guc *guc)
{
return container_of(guc, struct xe_gt, uc.guc);
}
static struct xe_device *
guc_to_xe(struct xe_guc *guc)
{
return gt_to_xe(guc_to_gt(guc));
}
/* GuC addresses above GUC_GGTT_TOP also don't map through the GTT */
#define GUC_GGTT_TOP 0xFEE00000
static u32 guc_bo_ggtt_addr(struct xe_guc *guc,
struct xe_bo *bo)
{
u32 addr = xe_bo_ggtt_addr(bo);
XE_BUG_ON(addr < xe_wopcm_size(guc_to_xe(guc)));
XE_BUG_ON(range_overflows_t(u32, addr, bo->size, GUC_GGTT_TOP));
return addr;
}
static u32 guc_ctl_debug_flags(struct xe_guc *guc)
{
u32 level = xe_guc_log_get_level(&guc->log);
u32 flags = 0;
if (!GUC_LOG_LEVEL_IS_VERBOSE(level))
flags |= GUC_LOG_DISABLED;
else
flags |= GUC_LOG_LEVEL_TO_VERBOSITY(level) <<
GUC_LOG_VERBOSITY_SHIFT;
return flags;
}
static u32 guc_ctl_feature_flags(struct xe_guc *guc)
{
return GUC_CTL_ENABLE_SLPC;
}
static u32 guc_ctl_log_params_flags(struct xe_guc *guc)
{
u32 offset = guc_bo_ggtt_addr(guc, guc->log.bo) >> PAGE_SHIFT;
u32 flags;
#if (((CRASH_BUFFER_SIZE) % SZ_1M) == 0)
#define LOG_UNIT SZ_1M
#define LOG_FLAG GUC_LOG_LOG_ALLOC_UNITS
#else
#define LOG_UNIT SZ_4K
#define LOG_FLAG 0
#endif
#if (((CAPTURE_BUFFER_SIZE) % SZ_1M) == 0)
#define CAPTURE_UNIT SZ_1M
#define CAPTURE_FLAG GUC_LOG_CAPTURE_ALLOC_UNITS
#else
#define CAPTURE_UNIT SZ_4K
#define CAPTURE_FLAG 0
#endif
BUILD_BUG_ON(!CRASH_BUFFER_SIZE);
BUILD_BUG_ON(!IS_ALIGNED(CRASH_BUFFER_SIZE, LOG_UNIT));
BUILD_BUG_ON(!DEBUG_BUFFER_SIZE);
BUILD_BUG_ON(!IS_ALIGNED(DEBUG_BUFFER_SIZE, LOG_UNIT));
BUILD_BUG_ON(!CAPTURE_BUFFER_SIZE);
BUILD_BUG_ON(!IS_ALIGNED(CAPTURE_BUFFER_SIZE, CAPTURE_UNIT));
BUILD_BUG_ON((CRASH_BUFFER_SIZE / LOG_UNIT - 1) >
(GUC_LOG_CRASH_MASK >> GUC_LOG_CRASH_SHIFT));
BUILD_BUG_ON((DEBUG_BUFFER_SIZE / LOG_UNIT - 1) >
(GUC_LOG_DEBUG_MASK >> GUC_LOG_DEBUG_SHIFT));
BUILD_BUG_ON((CAPTURE_BUFFER_SIZE / CAPTURE_UNIT - 1) >
(GUC_LOG_CAPTURE_MASK >> GUC_LOG_CAPTURE_SHIFT));
flags = GUC_LOG_VALID |
GUC_LOG_NOTIFY_ON_HALF_FULL |
CAPTURE_FLAG |
LOG_FLAG |
((CRASH_BUFFER_SIZE / LOG_UNIT - 1) << GUC_LOG_CRASH_SHIFT) |
((DEBUG_BUFFER_SIZE / LOG_UNIT - 1) << GUC_LOG_DEBUG_SHIFT) |
((CAPTURE_BUFFER_SIZE / CAPTURE_UNIT - 1) <<
GUC_LOG_CAPTURE_SHIFT) |
(offset << GUC_LOG_BUF_ADDR_SHIFT);
#undef LOG_UNIT
#undef LOG_FLAG
#undef CAPTURE_UNIT
#undef CAPTURE_FLAG
return flags;
}
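/*
 * Worked example, not part of this patch: with the units chosen above, each
 * buffer size is encoded in its field as (size / unit - 1). A hypothetical
 * 2M debug buffer with 1M units encodes as 2M / 1M - 1 = 1 in
 * GUC_LOG_DEBUG_MASK, while a hypothetical 8K crash buffer with 4K units
 * encodes as 8K / 4K - 1 = 1 in GUC_LOG_CRASH_MASK. The BUILD_BUG_ON()s
 * ensure the real sizes are unit-aligned and fit in their fields.
 */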
static u32 guc_ctl_ads_flags(struct xe_guc *guc)
{
u32 ads = guc_bo_ggtt_addr(guc, guc->ads.bo) >> PAGE_SHIFT;
u32 flags = ads << GUC_ADS_ADDR_SHIFT;
return flags;
}
static u32 guc_ctl_wa_flags(struct xe_guc *guc)
{
struct xe_device *xe = guc_to_xe(guc);
struct xe_gt *gt = guc_to_gt(guc);
u32 flags = 0;
/* Wa_22012773006:gen11,gen12 < XeHP */
if (GRAPHICS_VER(xe) >= 11 &&
GRAPHICS_VERx100(xe) < 1250)
flags |= GUC_WA_POLLCS;
/* Wa_16011759253 */
/* Wa_22011383443 */
if (IS_SUBPLATFORM_STEP(xe, XE_DG2, XE_SUBPLATFORM_DG2_G10, STEP_A0, STEP_B0) ||
IS_PLATFORM_STEP(xe, XE_PVC, STEP_A0, STEP_B0))
flags |= GUC_WA_GAM_CREDITS;
/* Wa_14014475959 */
if (IS_PLATFORM_STEP(xe, XE_METEORLAKE, STEP_A0, STEP_B0) ||
xe->info.platform == XE_DG2)
flags |= GUC_WA_HOLD_CCS_SWITCHOUT;
/*
* Wa_14012197797
* Wa_22011391025
*
* The same WA bit is used for both and 22011391025 is applicable to
* all DG2.
*/
if (xe->info.platform == XE_DG2)
flags |= GUC_WA_DUAL_QUEUE;
/*
* Wa_2201180203
* GUC_WA_PRE_PARSER causes media workload hang for PVC A0 and PCIe
* errors. Disable this for PVC A0 steppings.
*/
if (GRAPHICS_VER(xe) <= 12 &&
!IS_PLATFORM_STEP(xe, XE_PVC, STEP_A0, STEP_B0))
flags |= GUC_WA_PRE_PARSER;
/* Wa_16011777198 */
if (IS_SUBPLATFORM_STEP(xe, XE_DG2, XE_SUBPLATFORM_DG2_G10, STEP_A0, STEP_C0) ||
IS_SUBPLATFORM_STEP(xe, XE_DG2, XE_SUBPLATFORM_DG2_G11, STEP_A0,
STEP_B0))
flags |= GUC_WA_RCS_RESET_BEFORE_RC6;
/*
* Wa_22012727170
* Wa_22012727685
*
* This WA is applicable to PVC CT A0, but causes media regressions.
* Drop the WA for PVC.
*/
if (IS_SUBPLATFORM_STEP(xe, XE_DG2, XE_SUBPLATFORM_DG2_G10, STEP_A0, STEP_C0) ||
IS_SUBPLATFORM_STEP(xe, XE_DG2, XE_SUBPLATFORM_DG2_G11, STEP_A0,
STEP_FOREVER))
flags |= GUC_WA_CONTEXT_ISOLATION;
/* Wa_16015675438, Wa_18020744125 */
if (!xe_hw_engine_mask_per_class(gt, XE_ENGINE_CLASS_RENDER))
flags |= GUC_WA_RCS_REGS_IN_CCS_REGS_LIST;
/* Wa_1509372804 */
if (IS_PLATFORM_STEP(xe, XE_PVC, STEP_A0, STEP_C0))
flags |= GUC_WA_RENDER_RST_RC6_EXIT;
return flags;
}
static u32 guc_ctl_devid(struct xe_guc *guc)
{
struct xe_device *xe = guc_to_xe(guc);
return (((u32)xe->info.devid) << 16) | xe->info.revid;
}
static void guc_init_params(struct xe_guc *guc)
{
struct xe_device *xe = guc_to_xe(guc);
u32 *params = guc->params;
int i;
BUILD_BUG_ON(sizeof(guc->params) != GUC_CTL_MAX_DWORDS * sizeof(u32));
BUILD_BUG_ON(SOFT_SCRATCH_COUNT != GUC_CTL_MAX_DWORDS + 2);
params[GUC_CTL_LOG_PARAMS] = guc_ctl_log_params_flags(guc);
params[GUC_CTL_FEATURE] = guc_ctl_feature_flags(guc);
params[GUC_CTL_DEBUG] = guc_ctl_debug_flags(guc);
params[GUC_CTL_ADS] = guc_ctl_ads_flags(guc);
params[GUC_CTL_WA] = guc_ctl_wa_flags(guc);
params[GUC_CTL_DEVID] = guc_ctl_devid(guc);
for (i = 0; i < GUC_CTL_MAX_DWORDS; i++)
drm_dbg(&xe->drm, "GuC param[%2d] = 0x%08x\n", i, params[i]);
}
/*
* Initialise the GuC parameter block before starting the firmware
* transfer. These parameters are read by the firmware on startup
* and cannot be changed thereafter.
*/
void guc_write_params(struct xe_guc *guc)
{
struct xe_gt *gt = guc_to_gt(guc);
int i;
xe_force_wake_assert_held(gt_to_fw(gt), XE_FW_GT);
xe_mmio_write32(gt, SOFT_SCRATCH(0).reg, 0);
for (i = 0; i < GUC_CTL_MAX_DWORDS; i++)
xe_mmio_write32(gt, SOFT_SCRATCH(1 + i).reg, guc->params[i]);
}
#define MEDIA_GUC_HOST_INTERRUPT _MMIO(0x190304)
int xe_guc_init(struct xe_guc *guc)
{
struct xe_device *xe = guc_to_xe(guc);
struct xe_gt *gt = guc_to_gt(guc);
int ret;
guc->fw.type = XE_UC_FW_TYPE_GUC;
ret = xe_uc_fw_init(&guc->fw);
if (ret)
goto out;
ret = xe_guc_log_init(&guc->log);
if (ret)
goto out;
ret = xe_guc_ads_init(&guc->ads);
if (ret)
goto out;
ret = xe_guc_ct_init(&guc->ct);
if (ret)
goto out;
ret = xe_guc_pc_init(&guc->pc);
if (ret)
goto out;
guc_init_params(guc);
if (xe_gt_is_media_type(gt))
guc->notify_reg = MEDIA_GUC_HOST_INTERRUPT.reg;
else
guc->notify_reg = GEN11_GUC_HOST_INTERRUPT.reg;
xe_uc_fw_change_status(&guc->fw, XE_UC_FIRMWARE_LOADABLE);
return 0;
out:
drm_err(&xe->drm, "GuC init failed with %d", ret);
return ret;
}
/**
* xe_guc_init_post_hwconfig - initialize GuC post hwconfig load
* @guc: The GuC object
*
* Return: 0 on success, negative error code on error.
*/
int xe_guc_init_post_hwconfig(struct xe_guc *guc)
{
return xe_guc_ads_init_post_hwconfig(&guc->ads);
}
int xe_guc_post_load_init(struct xe_guc *guc)
{
xe_guc_ads_populate_post_load(&guc->ads);
return 0;
}
int xe_guc_reset(struct xe_guc *guc)
{
struct xe_device *xe = guc_to_xe(guc);
struct xe_gt *gt = guc_to_gt(guc);
u32 guc_status;
int ret;
xe_force_wake_assert_held(gt_to_fw(gt), XE_FW_GT);
xe_mmio_write32(gt, GEN6_GDRST.reg, GEN11_GRDOM_GUC);
ret = xe_mmio_wait32(gt, GEN6_GDRST.reg, 0, GEN11_GRDOM_GUC, 5);
if (ret) {
drm_err(&xe->drm, "GuC reset timed out, GEN6_GDRST=0x%8x\n",
xe_mmio_read32(gt, GEN6_GDRST.reg));
goto err_out;
}
guc_status = xe_mmio_read32(gt, GUC_STATUS.reg);
if (!(guc_status & GS_MIA_IN_RESET)) {
drm_err(&xe->drm,
"GuC status: 0x%x, MIA core expected to be in reset\n",
guc_status);
ret = -EIO;
goto err_out;
}
return 0;
err_out:
return ret;
}
static void guc_prepare_xfer(struct xe_guc *guc)
{
struct xe_gt *gt = guc_to_gt(guc);
struct xe_device *xe = guc_to_xe(guc);
u32 shim_flags = GUC_ENABLE_READ_CACHE_LOGIC |
GUC_ENABLE_READ_CACHE_FOR_SRAM_DATA |
GUC_ENABLE_READ_CACHE_FOR_WOPCM_DATA |
GUC_ENABLE_MIA_CLOCK_GATING;
if (GRAPHICS_VERx100(xe) < 1250)
shim_flags |= GUC_DISABLE_SRAM_INIT_TO_ZEROES |
GUC_ENABLE_MIA_CACHING;
if (xe->info.platform == XE_PVC)
shim_flags |= PVC_GUC_MOCS_INDEX(PVC_MOCS_UC_INDEX);
/* Must program this register before loading the ucode with DMA */
xe_mmio_write32(gt, GUC_SHIM_CONTROL.reg, shim_flags);
xe_mmio_write32(gt, GEN9_GT_PM_CONFIG.reg, GT_DOORBELL_ENABLE);
}
/*
* Supporting MMIO & in memory RSA
*/
static int guc_xfer_rsa(struct xe_guc *guc)
{
struct xe_gt *gt = guc_to_gt(guc);
u32 rsa[UOS_RSA_SCRATCH_COUNT];
size_t copied;
int i;
if (guc->fw.rsa_size > 256) {
u32 rsa_ggtt_addr = xe_bo_ggtt_addr(guc->fw.bo) +
xe_uc_fw_rsa_offset(&guc->fw);
xe_mmio_write32(gt, UOS_RSA_SCRATCH(0).reg, rsa_ggtt_addr);
return 0;
}
copied = xe_uc_fw_copy_rsa(&guc->fw, rsa, sizeof(rsa));
if (copied < sizeof(rsa))
return -ENOMEM;
for (i = 0; i < UOS_RSA_SCRATCH_COUNT; i++)
xe_mmio_write32(gt, UOS_RSA_SCRATCH(i).reg, rsa[i]);
return 0;
}
/*
* Read the GuC status register (GUC_STATUS) and store it in the
* specified location; then return a boolean indicating whether
* the value matches either of two values representing completion
* of the GuC boot process.
*
* This is used for polling the GuC status in a wait_for()
* loop below.
*/
static bool guc_ready(struct xe_guc *guc, u32 *status)
{
u32 val = xe_mmio_read32(guc_to_gt(guc), GUC_STATUS.reg);
u32 uk_val = REG_FIELD_GET(GS_UKERNEL_MASK, val);
*status = val;
return uk_val == XE_GUC_LOAD_STATUS_READY;
}
static int guc_wait_ucode(struct xe_guc *guc)
{
struct xe_device *xe = guc_to_xe(guc);
u32 status;
int ret;
/*
* Wait for the GuC to start up.
* NB: Docs recommend not using the interrupt for completion.
* Measurements indicate this should take no more than 20ms
* (assuming the GT clock is at maximum frequency). So, a
* timeout here indicates that the GuC has failed and is unusable.
* (Higher levels of the driver may decide to reset the GuC and
* attempt the ucode load again if this happens.)
*
* FIXME: There is a known (but exceedingly unlikely) race condition
* where the asynchronous frequency management code could reduce
* the GT clock while a GuC reload is in progress (during a full
* GT reset). A fix is in progress but there are complex locking
* issues to be resolved. In the meantime bump the timeout to
* 200ms. Even at slowest clock, this should be sufficient. And
* in the working case, a larger timeout makes no difference.
*/
ret = wait_for(guc_ready(guc, &status), 200);
if (ret) {
struct drm_device *drm = &xe->drm;
struct drm_printer p = drm_info_printer(drm->dev);
drm_info(drm, "GuC load failed: status = 0x%08X\n", status);
drm_info(drm, "GuC load failed: status: Reset = %d, "
"BootROM = 0x%02X, UKernel = 0x%02X, "
"MIA = 0x%02X, Auth = 0x%02X\n",
REG_FIELD_GET(GS_MIA_IN_RESET, status),
REG_FIELD_GET(GS_BOOTROM_MASK, status),
REG_FIELD_GET(GS_UKERNEL_MASK, status),
REG_FIELD_GET(GS_MIA_MASK, status),
REG_FIELD_GET(GS_AUTH_STATUS_MASK, status));
if ((status & GS_BOOTROM_MASK) == GS_BOOTROM_RSA_FAILED) {
drm_info(drm, "GuC firmware signature verification failed\n");
ret = -ENOEXEC;
}
if (REG_FIELD_GET(GS_UKERNEL_MASK, status) ==
XE_GUC_LOAD_STATUS_EXCEPTION) {
drm_info(drm, "GuC firmware exception. EIP: %#x\n",
xe_mmio_read32(guc_to_gt(guc),
SOFT_SCRATCH(13).reg));
ret = -ENXIO;
}
xe_guc_log_print(&guc->log, &p);
} else {
drm_dbg(&xe->drm, "GuC successfully loaded");
}
return ret;
}
static int __xe_guc_upload(struct xe_guc *guc)
{
int ret;
guc_write_params(guc);
guc_prepare_xfer(guc);
/*
* Note that GuC needs the CSS header plus uKernel code to be copied
* by the DMA engine in one operation, whereas the RSA signature is
* loaded separately, either by copying it to the UOS_RSA_SCRATCH
* register (if key size <= 256) or through a ggtt-pinned vma (if key
* size > 256). The RSA size and therefore the way we provide it to the
* HW is fixed for each platform and hard-coded in the bootrom.
*/
ret = guc_xfer_rsa(guc);
if (ret)
goto out;
/*
* Current uCode expects the code to be loaded at 8k; locations below
* this are used for the stack.
*/
ret = xe_uc_fw_upload(&guc->fw, 0x2000, UOS_MOVE);
if (ret)
goto out;
/* Wait for authentication */
ret = guc_wait_ucode(guc);
if (ret)
goto out;
xe_uc_fw_change_status(&guc->fw, XE_UC_FIRMWARE_RUNNING);
return 0;
out:
xe_uc_fw_change_status(&guc->fw, XE_UC_FIRMWARE_LOAD_FAIL);
return 0 /* FIXME: ret, don't want to stop load currently */;
}
/**
* xe_guc_min_load_for_hwconfig - load minimal GuC and read hwconfig table
* @guc: The GuC object
*
* This function uploads a minimal GuC that does not support submissions but
* in a state where the hwconfig table can be read. Next, it reads and parses
* the hwconfig table so it can be used for subsequent steps in the driver load.
* Lastly, it enables CT communication (XXX: this is needed for PFs/VFs only).
*
* Return: 0 on success, negative error code on error.
*/
int xe_guc_min_load_for_hwconfig(struct xe_guc *guc)
{
int ret;
xe_guc_ads_populate_minimal(&guc->ads);
ret = __xe_guc_upload(guc);
if (ret)
return ret;
ret = xe_guc_hwconfig_init(guc);
if (ret)
return ret;
ret = xe_guc_enable_communication(guc);
if (ret)
return ret;
return 0;
}
int xe_guc_upload(struct xe_guc *guc)
{
xe_guc_ads_populate(&guc->ads);
return __xe_guc_upload(guc);
}
static void guc_handle_mmio_msg(struct xe_guc *guc)
{
struct xe_gt *gt = guc_to_gt(guc);
u32 msg;
xe_force_wake_assert_held(gt_to_fw(gt), XE_FW_GT);
msg = xe_mmio_read32(gt, SOFT_SCRATCH(15).reg);
msg &= XE_GUC_RECV_MSG_EXCEPTION |
XE_GUC_RECV_MSG_CRASH_DUMP_POSTED;
xe_mmio_write32(gt, SOFT_SCRATCH(15).reg, 0);
if (msg & XE_GUC_RECV_MSG_CRASH_DUMP_POSTED)
drm_err(&guc_to_xe(guc)->drm,
"Received early GuC crash dump notification!\n");
if (msg & XE_GUC_RECV_MSG_EXCEPTION)
drm_err(&guc_to_xe(guc)->drm,
"Received early GuC exception notification!\n");
}
void guc_enable_irq(struct xe_guc *guc)
{
struct xe_gt *gt = guc_to_gt(guc);
u32 events = xe_gt_is_media_type(gt) ?
REG_FIELD_PREP(ENGINE0_MASK, GUC_INTR_GUC2HOST) :
REG_FIELD_PREP(ENGINE1_MASK, GUC_INTR_GUC2HOST);
xe_mmio_write32(gt, GEN11_GUC_SG_INTR_ENABLE.reg,
REG_FIELD_PREP(ENGINE1_MASK, GUC_INTR_GUC2HOST));
if (xe_gt_is_media_type(gt))
xe_mmio_rmw32(gt, GEN11_GUC_SG_INTR_MASK.reg, events, 0);
else
xe_mmio_write32(gt, GEN11_GUC_SG_INTR_MASK.reg, ~events);
}
int xe_guc_enable_communication(struct xe_guc *guc)
{
int err;
guc_enable_irq(guc);
xe_mmio_rmw32(guc_to_gt(guc), GEN6_PMINTRMSK.reg,
ARAT_EXPIRED_INTRMSK, 0);
err = xe_guc_ct_enable(&guc->ct);
if (err)
return err;
guc_handle_mmio_msg(guc);
return 0;
}
int xe_guc_suspend(struct xe_guc *guc)
{
int ret;
u32 action[] = {
XE_GUC_ACTION_CLIENT_SOFT_RESET,
};
ret = xe_guc_send_mmio(guc, action, ARRAY_SIZE(action));
if (ret) {
drm_err(&guc_to_xe(guc)->drm,
"GuC suspend: CLIENT_SOFT_RESET fail: %d!\n", ret);
return ret;
}
xe_guc_sanitize(guc);
return 0;
}
void xe_guc_notify(struct xe_guc *guc)
{
struct xe_gt *gt = guc_to_gt(guc);
xe_mmio_write32(gt, guc->notify_reg, GUC_SEND_TRIGGER);
}
int xe_guc_auth_huc(struct xe_guc *guc, u32 rsa_addr)
{
u32 action[] = {
XE_GUC_ACTION_AUTHENTICATE_HUC,
rsa_addr
};
return xe_guc_ct_send_block(&guc->ct, action, ARRAY_SIZE(action));
}
#define MEDIA_SOFT_SCRATCH(n) _MMIO(0x190310 + (n) * 4)
#define MEDIA_SOFT_SCRATCH_COUNT 4
int xe_guc_send_mmio(struct xe_guc *guc, const u32 *request, u32 len)
{
struct xe_device *xe = guc_to_xe(guc);
struct xe_gt *gt = guc_to_gt(guc);
u32 header;
u32 reply_reg = xe_gt_is_media_type(gt) ?
MEDIA_SOFT_SCRATCH(0).reg : GEN11_SOFT_SCRATCH(0).reg;
int ret;
int i;
XE_BUG_ON(guc->ct.enabled);
XE_BUG_ON(!len);
XE_BUG_ON(len > GEN11_SOFT_SCRATCH_COUNT);
XE_BUG_ON(len > MEDIA_SOFT_SCRATCH_COUNT);
XE_BUG_ON(FIELD_GET(GUC_HXG_MSG_0_ORIGIN, request[0]) !=
GUC_HXG_ORIGIN_HOST);
XE_BUG_ON(FIELD_GET(GUC_HXG_MSG_0_TYPE, request[0]) !=
GUC_HXG_TYPE_REQUEST);
retry:
/* Not in critical data-path, just do if else for GT type */
if (xe_gt_is_media_type(gt)) {
for (i = 0; i < len; ++i)
xe_mmio_write32(gt, MEDIA_SOFT_SCRATCH(i).reg,
request[i]);
#define LAST_INDEX MEDIA_SOFT_SCRATCH_COUNT - 1
xe_mmio_read32(gt, MEDIA_SOFT_SCRATCH(LAST_INDEX).reg);
} else {
for (i = 0; i < len; ++i)
xe_mmio_write32(gt, GEN11_SOFT_SCRATCH(i).reg,
request[i]);
#undef LAST_INDEX
#define LAST_INDEX GEN11_SOFT_SCRATCH_COUNT - 1
xe_mmio_read32(gt, GEN11_SOFT_SCRATCH(LAST_INDEX).reg);
}
xe_guc_notify(guc);
ret = xe_mmio_wait32(gt, reply_reg,
FIELD_PREP(GUC_HXG_MSG_0_ORIGIN,
GUC_HXG_ORIGIN_GUC),
GUC_HXG_MSG_0_ORIGIN,
50);
if (ret) {
timeout:
drm_err(&xe->drm, "mmio request 0x%08x: no reply 0x%08x\n",
request[0], xe_mmio_read32(gt, reply_reg));
return ret;
}
header = xe_mmio_read32(gt, reply_reg);
if (FIELD_GET(GUC_HXG_MSG_0_TYPE, header) ==
GUC_HXG_TYPE_NO_RESPONSE_BUSY) {
#define done ({ header = xe_mmio_read32(gt, reply_reg); \
FIELD_GET(GUC_HXG_MSG_0_ORIGIN, header) != \
GUC_HXG_ORIGIN_GUC || \
FIELD_GET(GUC_HXG_MSG_0_TYPE, header) != \
GUC_HXG_TYPE_NO_RESPONSE_BUSY; })
ret = wait_for(done, 1000);
if (unlikely(ret))
goto timeout;
if (unlikely(FIELD_GET(GUC_HXG_MSG_0_ORIGIN, header) !=
GUC_HXG_ORIGIN_GUC))
goto proto;
#undef done
}
if (FIELD_GET(GUC_HXG_MSG_0_TYPE, header) ==
GUC_HXG_TYPE_NO_RESPONSE_RETRY) {
u32 reason = FIELD_GET(GUC_HXG_RETRY_MSG_0_REASON, header);
drm_dbg(&xe->drm, "mmio request %#x: retrying, reason %u\n",
request[0], reason);
goto retry;
}
if (FIELD_GET(GUC_HXG_MSG_0_TYPE, header) ==
GUC_HXG_TYPE_RESPONSE_FAILURE) {
u32 hint = FIELD_GET(GUC_HXG_FAILURE_MSG_0_HINT, header);
u32 error = FIELD_GET(GUC_HXG_FAILURE_MSG_0_ERROR, header);
drm_err(&xe->drm, "mmio request %#x: failure %x/%u\n",
request[0], error, hint);
return -ENXIO;
}
if (FIELD_GET(GUC_HXG_MSG_0_TYPE, header) !=
GUC_HXG_TYPE_RESPONSE_SUCCESS) {
proto:
drm_err(&xe->drm, "mmio request %#x: unexpected reply %#x\n",
request[0], header);
return -EPROTO;
}
/* Use data from the GuC response as our return value */
return FIELD_GET(GUC_HXG_RESPONSE_MSG_0_DATA0, header);
}
static int guc_self_cfg(struct xe_guc *guc, u16 key, u16 len, u64 val)
{
u32 request[HOST2GUC_SELF_CFG_REQUEST_MSG_LEN] = {
FIELD_PREP(GUC_HXG_MSG_0_ORIGIN, GUC_HXG_ORIGIN_HOST) |
FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION,
GUC_ACTION_HOST2GUC_SELF_CFG),
FIELD_PREP(HOST2GUC_SELF_CFG_REQUEST_MSG_1_KLV_KEY, key) |
FIELD_PREP(HOST2GUC_SELF_CFG_REQUEST_MSG_1_KLV_LEN, len),
FIELD_PREP(HOST2GUC_SELF_CFG_REQUEST_MSG_2_VALUE32,
lower_32_bits(val)),
FIELD_PREP(HOST2GUC_SELF_CFG_REQUEST_MSG_3_VALUE64,
upper_32_bits(val)),
};
int ret;
XE_BUG_ON(len > 2);
XE_BUG_ON(len == 1 && upper_32_bits(val));
/* Self config must go over MMIO */
ret = xe_guc_send_mmio(guc, request, ARRAY_SIZE(request));
if (unlikely(ret < 0))
return ret;
if (unlikely(ret > 1))
return -EPROTO;
if (unlikely(!ret))
return -ENOKEY;
return 0;
}
int xe_guc_self_cfg32(struct xe_guc *guc, u16 key, u32 val)
{
return guc_self_cfg(guc, key, 1, val);
}
int xe_guc_self_cfg64(struct xe_guc *guc, u16 key, u64 val)
{
return guc_self_cfg(guc, key, 2, val);
}
void xe_guc_irq_handler(struct xe_guc *guc, const u16 iir)
{
if (iir & GUC_INTR_GUC2HOST)
xe_guc_ct_irq_handler(&guc->ct);
}
void xe_guc_sanitize(struct xe_guc *guc)
{
xe_uc_fw_change_status(&guc->fw, XE_UC_FIRMWARE_LOADABLE);
xe_guc_ct_disable(&guc->ct);
}
int xe_guc_reset_prepare(struct xe_guc *guc)
{
return xe_guc_submit_reset_prepare(guc);
}
void xe_guc_reset_wait(struct xe_guc *guc)
{
xe_guc_submit_reset_wait(guc);
}
void xe_guc_stop_prepare(struct xe_guc *guc)
{
XE_WARN_ON(xe_guc_pc_stop(&guc->pc));
}
int xe_guc_stop(struct xe_guc *guc)
{
int ret;
xe_guc_ct_disable(&guc->ct);
ret = xe_guc_submit_stop(guc);
if (ret)
return ret;
return 0;
}
int xe_guc_start(struct xe_guc *guc)
{
int ret;
ret = xe_guc_submit_start(guc);
if (ret)
return ret;
ret = xe_guc_pc_start(&guc->pc);
XE_WARN_ON(ret);
return 0;
}
void xe_guc_print_info(struct xe_guc *guc, struct drm_printer *p)
{
struct xe_gt *gt = guc_to_gt(guc);
u32 status;
int err;
int i;
xe_uc_fw_print(&guc->fw, p);
err = xe_force_wake_get(gt_to_fw(gt), XE_FW_GT);
if (err)
return;
status = xe_mmio_read32(gt, GUC_STATUS.reg);
drm_printf(p, "\nGuC status 0x%08x:\n", status);
drm_printf(p, "\tBootrom status = 0x%x\n",
(status & GS_BOOTROM_MASK) >> GS_BOOTROM_SHIFT);
drm_printf(p, "\tuKernel status = 0x%x\n",
(status & GS_UKERNEL_MASK) >> GS_UKERNEL_SHIFT);
drm_printf(p, "\tMIA Core status = 0x%x\n",
(status & GS_MIA_MASK) >> GS_MIA_SHIFT);
drm_printf(p, "\tLog level = %d\n",
xe_guc_log_get_level(&guc->log));
drm_puts(p, "\nScratch registers:\n");
for (i = 0; i < SOFT_SCRATCH_COUNT; i++) {
drm_printf(p, "\t%2d: \t0x%x\n",
i, xe_mmio_read32(gt, SOFT_SCRATCH(i).reg));
}
xe_force_wake_put(gt_to_fw(gt), XE_FW_GT);
xe_guc_ct_print(&guc->ct, p);
xe_guc_submit_print(guc, p);
}

View File

@ -0,0 +1,57 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GUC_H_
#define _XE_GUC_H_
#include "xe_hw_engine_types.h"
#include "xe_guc_types.h"
#include "xe_macros.h"
struct drm_printer;
int xe_guc_init(struct xe_guc *guc);
int xe_guc_init_post_hwconfig(struct xe_guc *guc);
int xe_guc_post_load_init(struct xe_guc *guc);
int xe_guc_reset(struct xe_guc *guc);
int xe_guc_upload(struct xe_guc *guc);
int xe_guc_min_load_for_hwconfig(struct xe_guc *guc);
int xe_guc_enable_communication(struct xe_guc *guc);
int xe_guc_suspend(struct xe_guc *guc);
void xe_guc_notify(struct xe_guc *guc);
int xe_guc_auth_huc(struct xe_guc *guc, u32 rsa_addr);
int xe_guc_send_mmio(struct xe_guc *guc, const u32 *request, u32 len);
int xe_guc_self_cfg32(struct xe_guc *guc, u16 key, u32 val);
int xe_guc_self_cfg64(struct xe_guc *guc, u16 key, u64 val);
void xe_guc_irq_handler(struct xe_guc *guc, const u16 iir);
void xe_guc_sanitize(struct xe_guc *guc);
void xe_guc_print_info(struct xe_guc *guc, struct drm_printer *p);
int xe_guc_reset_prepare(struct xe_guc *guc);
void xe_guc_reset_wait(struct xe_guc *guc);
void xe_guc_stop_prepare(struct xe_guc *guc);
int xe_guc_stop(struct xe_guc *guc);
int xe_guc_start(struct xe_guc *guc);
static inline u16 xe_engine_class_to_guc_class(enum xe_engine_class class)
{
switch (class) {
case XE_ENGINE_CLASS_RENDER:
return GUC_RENDER_CLASS;
case XE_ENGINE_CLASS_VIDEO_DECODE:
return GUC_VIDEO_CLASS;
case XE_ENGINE_CLASS_VIDEO_ENHANCE:
return GUC_VIDEOENHANCE_CLASS;
case XE_ENGINE_CLASS_COPY:
return GUC_BLITTER_CLASS;
case XE_ENGINE_CLASS_COMPUTE:
return GUC_COMPUTE_CLASS;
case XE_ENGINE_CLASS_OTHER:
default:
XE_WARN_ON(class);
return -1;
}
}
#endif
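/*
 * Illustrative sketch, not part of this patch: one plausible ordering of the
 * calls declared above during driver bring-up, inferred from their kernel-doc.
 * The authoritative sequence is driven by the xe_uc layer and includes error
 * handling and GT-reset paths omitted here.
 *
 *	xe_guc_init(guc);                   - SW-only init (fw, log, ADS, CT, PC)
 *	xe_guc_min_load_for_hwconfig(guc);  - minimal ADS, upload, read hwconfig
 *	xe_guc_init_post_hwconfig(guc);     - recheck ADS sizing with final engines
 *	xe_guc_upload(guc);                 - full ADS + firmware upload
 *	xe_guc_enable_communication(guc);   - IRQ + CTB channel
 *	xe_guc_post_load_init(guc);         - golden LRCs into the ADS
 *	xe_guc_start(guc);                  - submission + power conservation (PC)
 */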

View File

@ -0,0 +1,676 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include <drm/drm_managed.h>
#include "xe_bo.h"
#include "xe_gt.h"
#include "xe_guc.h"
#include "xe_guc_ads.h"
#include "xe_guc_reg.h"
#include "xe_hw_engine.h"
#include "xe_lrc.h"
#include "xe_map.h"
#include "xe_mmio.h"
#include "xe_platform_types.h"
#include "gt/intel_gt_regs.h"
#include "gt/intel_engine_regs.h"
/* Slack of a few additional entries per engine */
#define ADS_REGSET_EXTRA_MAX 8
static struct xe_guc *
ads_to_guc(struct xe_guc_ads *ads)
{
return container_of(ads, struct xe_guc, ads);
}
static struct xe_gt *
ads_to_gt(struct xe_guc_ads *ads)
{
return container_of(ads, struct xe_gt, uc.guc.ads);
}
static struct xe_device *
ads_to_xe(struct xe_guc_ads *ads)
{
return gt_to_xe(ads_to_gt(ads));
}
static struct iosys_map *
ads_to_map(struct xe_guc_ads *ads)
{
return &ads->bo->vmap;
}
/* UM Queue parameters: */
#define GUC_UM_QUEUE_SIZE (SZ_64K)
#define GUC_PAGE_RES_TIMEOUT_US (-1)
/*
* The Additional Data Struct (ADS) has pointers for different buffers used by
* the GuC. One single gem object contains the ADS struct itself (guc_ads) and
* all the extra buffers indirectly linked via the ADS struct's entries.
*
* Layout of the ADS blob allocated for the GuC:
*
* +---------------------------------------+ <== base
* | guc_ads |
* +---------------------------------------+
* | guc_policies |
* +---------------------------------------+
* | guc_gt_system_info |
* +---------------------------------------+
* | guc_engine_usage |
* +---------------------------------------+
* | guc_um_init_params |
* +---------------------------------------+ <== static
* | guc_mmio_reg[countA] (engine 0.0) |
* | guc_mmio_reg[countB] (engine 0.1) |
* | guc_mmio_reg[countC] (engine 1.0) |
* | ... |
* +---------------------------------------+ <== dynamic
* | padding |
* +---------------------------------------+ <== 4K aligned
* | golden contexts |
* +---------------------------------------+
* | padding |
* +---------------------------------------+ <== 4K aligned
* | capture lists |
* +---------------------------------------+
* | padding |
* +---------------------------------------+ <== 4K aligned
* | UM queues |
* +---------------------------------------+
* | padding |
* +---------------------------------------+ <== 4K aligned
* | private data |
* +---------------------------------------+
* | padding |
* +---------------------------------------+ <== 4K aligned
*/
struct __guc_ads_blob {
struct guc_ads ads;
struct guc_policies policies;
struct guc_gt_system_info system_info;
struct guc_engine_usage engine_usage;
struct guc_um_init_params um_init_params;
/* From here on, location is dynamic! Refer to above diagram. */
struct guc_mmio_reg regset[0];
} __packed;
#define ads_blob_read(ads_, field_) \
xe_map_rd_field(ads_to_xe(ads_), ads_to_map(ads_), 0, \
struct __guc_ads_blob, field_)
#define ads_blob_write(ads_, field_, val_) \
xe_map_wr_field(ads_to_xe(ads_), ads_to_map(ads_), 0, \
struct __guc_ads_blob, field_, val_)
#define info_map_write(xe_, map_, field_, val_) \
xe_map_wr_field(xe_, map_, 0, struct guc_gt_system_info, field_, val_)
#define info_map_read(xe_, map_, field_) \
xe_map_rd_field(xe_, map_, 0, struct guc_gt_system_info, field_)
static size_t guc_ads_regset_size(struct xe_guc_ads *ads)
{
XE_BUG_ON(!ads->regset_size);
return ads->regset_size;
}
static size_t guc_ads_golden_lrc_size(struct xe_guc_ads *ads)
{
return PAGE_ALIGN(ads->golden_lrc_size);
}
static size_t guc_ads_capture_size(struct xe_guc_ads *ads)
{
/* FIXME: Allocate a proper capture list */
return PAGE_ALIGN(PAGE_SIZE);
}
static size_t guc_ads_um_queues_size(struct xe_guc_ads *ads)
{
struct xe_device *xe = ads_to_xe(ads);
if (!xe->info.supports_usm)
return 0;
return GUC_UM_QUEUE_SIZE * GUC_UM_HW_QUEUE_MAX;
}
static size_t guc_ads_private_data_size(struct xe_guc_ads *ads)
{
return PAGE_ALIGN(ads_to_guc(ads)->fw.private_data_size);
}
static size_t guc_ads_regset_offset(struct xe_guc_ads *ads)
{
return offsetof(struct __guc_ads_blob, regset);
}
static size_t guc_ads_golden_lrc_offset(struct xe_guc_ads *ads)
{
size_t offset;
offset = guc_ads_regset_offset(ads) +
guc_ads_regset_size(ads);
return PAGE_ALIGN(offset);
}
static size_t guc_ads_capture_offset(struct xe_guc_ads *ads)
{
size_t offset;
offset = guc_ads_golden_lrc_offset(ads) +
guc_ads_golden_lrc_size(ads);
return PAGE_ALIGN(offset);
}
static size_t guc_ads_um_queues_offset(struct xe_guc_ads *ads)
{
u32 offset;
offset = guc_ads_capture_offset(ads) +
guc_ads_capture_size(ads);
return PAGE_ALIGN(offset);
}
static size_t guc_ads_private_data_offset(struct xe_guc_ads *ads)
{
size_t offset;
offset = guc_ads_um_queues_offset(ads) +
guc_ads_um_queues_size(ads);
return PAGE_ALIGN(offset);
}
static size_t guc_ads_size(struct xe_guc_ads *ads)
{
return guc_ads_private_data_offset(ads) +
guc_ads_private_data_size(ads);
}
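/*
 * Worked example, not part of this patch: the offset helpers above chain the
 * sections from the ADS layout diagram, PAGE_ALIGN()ing each dynamic section.
 * With hypothetical sizes of 2K for the static part + regset, 96K of golden
 * LRCs, 4K of capture lists, 256K of UM queues and 4K of private data, the
 * chain resolves to:
 *
 *	golden LRC offset   = PAGE_ALIGN(2K)          = 4K
 *	capture offset      = PAGE_ALIGN(4K + 96K)    = 100K
 *	UM queues offset    = PAGE_ALIGN(100K + 4K)   = 104K
 *	private data offset = PAGE_ALIGN(104K + 256K) = 360K
 *	total ADS size      = 360K + 4K               = 364K
 */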
static void guc_ads_fini(struct drm_device *drm, void *arg)
{
struct xe_guc_ads *ads = arg;
xe_bo_unpin_map_no_vm(ads->bo);
}
static size_t calculate_regset_size(struct xe_gt *gt)
{
struct xe_reg_sr_entry *sr_entry;
unsigned long sr_idx;
struct xe_hw_engine *hwe;
enum xe_hw_engine_id id;
unsigned int count = 0;
for_each_hw_engine(hwe, gt, id)
xa_for_each(&hwe->reg_sr.xa, sr_idx, sr_entry)
count++;
count += (ADS_REGSET_EXTRA_MAX + LNCFCMOCS_REG_COUNT) * XE_NUM_HW_ENGINES;
return count * sizeof(struct guc_mmio_reg);
}
static u32 engine_enable_mask(struct xe_gt *gt, enum xe_engine_class class)
{
struct xe_hw_engine *hwe;
enum xe_hw_engine_id id;
u32 mask = 0;
for_each_hw_engine(hwe, gt, id)
if (hwe->class == class)
mask |= BIT(hwe->instance);
return mask;
}
static size_t calculate_golden_lrc_size(struct xe_guc_ads *ads)
{
struct xe_device *xe = ads_to_xe(ads);
struct xe_gt *gt = ads_to_gt(ads);
size_t total_size = 0, alloc_size, real_size;
int class;
for (class = 0; class < XE_ENGINE_CLASS_MAX; ++class) {
if (class == XE_ENGINE_CLASS_OTHER)
continue;
if (!engine_enable_mask(gt, class))
continue;
real_size = xe_lrc_size(xe, class);
alloc_size = PAGE_ALIGN(real_size);
total_size += alloc_size;
}
return total_size;
}
#define MAX_GOLDEN_LRC_SIZE (SZ_4K * 64)
int xe_guc_ads_init(struct xe_guc_ads *ads)
{
struct xe_device *xe = ads_to_xe(ads);
struct xe_gt *gt = ads_to_gt(ads);
struct xe_bo *bo;
int err;
ads->golden_lrc_size = calculate_golden_lrc_size(ads);
ads->regset_size = calculate_regset_size(gt);
bo = xe_bo_create_pin_map(xe, gt, NULL, guc_ads_size(ads) +
MAX_GOLDEN_LRC_SIZE,
ttm_bo_type_kernel,
XE_BO_CREATE_VRAM_IF_DGFX(gt) |
XE_BO_CREATE_GGTT_BIT);
if (IS_ERR(bo))
return PTR_ERR(bo);
ads->bo = bo;
err = drmm_add_action_or_reset(&xe->drm, guc_ads_fini, ads);
if (err)
return err;
return 0;
}
/**
* xe_guc_ads_init_post_hwconfig - initialize ADS post hwconfig load
* @ads: Additional data structures object
*
* Recalculate golden_lrc_size & regset_size as the number of hardware engines may
* have changed after the hwconfig was loaded. Also verify the new sizes fit in
* the already allocated ADS buffer object.
*
* Return: 0 on success, negative error code on error.
*/
int xe_guc_ads_init_post_hwconfig(struct xe_guc_ads *ads)
{
struct xe_gt *gt = ads_to_gt(ads);
u32 prev_regset_size = ads->regset_size;
XE_BUG_ON(!ads->bo);
ads->golden_lrc_size = calculate_golden_lrc_size(ads);
ads->regset_size = calculate_regset_size(gt);
XE_WARN_ON(ads->golden_lrc_size +
(ads->regset_size - prev_regset_size) >
MAX_GOLDEN_LRC_SIZE);
return 0;
}
static void guc_policies_init(struct xe_guc_ads *ads)
{
ads_blob_write(ads, policies.dpc_promote_time,
GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US);
ads_blob_write(ads, policies.max_num_work_items,
GLOBAL_POLICY_MAX_NUM_WI);
ads_blob_write(ads, policies.global_flags, 0);
ads_blob_write(ads, policies.is_valid, 1);
}
static void fill_engine_enable_masks(struct xe_gt *gt,
struct iosys_map *info_map)
{
struct xe_device *xe = gt_to_xe(gt);
info_map_write(xe, info_map, engine_enabled_masks[GUC_RENDER_CLASS],
engine_enable_mask(gt, XE_ENGINE_CLASS_RENDER));
info_map_write(xe, info_map, engine_enabled_masks[GUC_BLITTER_CLASS],
engine_enable_mask(gt, XE_ENGINE_CLASS_COPY));
info_map_write(xe, info_map, engine_enabled_masks[GUC_VIDEO_CLASS],
engine_enable_mask(gt, XE_ENGINE_CLASS_VIDEO_DECODE));
info_map_write(xe, info_map,
engine_enabled_masks[GUC_VIDEOENHANCE_CLASS],
engine_enable_mask(gt, XE_ENGINE_CLASS_VIDEO_ENHANCE));
info_map_write(xe, info_map, engine_enabled_masks[GUC_COMPUTE_CLASS],
engine_enable_mask(gt, XE_ENGINE_CLASS_COMPUTE));
}
static void guc_prep_golden_lrc_null(struct xe_guc_ads *ads)
{
struct xe_device *xe = ads_to_xe(ads);
struct iosys_map info_map = IOSYS_MAP_INIT_OFFSET(ads_to_map(ads),
offsetof(struct __guc_ads_blob, system_info));
u8 guc_class;
for (guc_class = 0; guc_class <= GUC_MAX_ENGINE_CLASSES; ++guc_class) {
if (!info_map_read(xe, &info_map,
engine_enabled_masks[guc_class]))
continue;
ads_blob_write(ads, ads.eng_state_size[guc_class],
guc_ads_golden_lrc_size(ads) -
xe_lrc_skip_size(xe));
ads_blob_write(ads, ads.golden_context_lrca[guc_class],
xe_bo_ggtt_addr(ads->bo) +
guc_ads_golden_lrc_offset(ads));
}
}
static void guc_mapping_table_init_invalid(struct xe_gt *gt,
struct iosys_map *info_map)
{
struct xe_device *xe = gt_to_xe(gt);
unsigned int i, j;
/* Table must be set to invalid values for entries not used */
for (i = 0; i < GUC_MAX_ENGINE_CLASSES; ++i)
for (j = 0; j < GUC_MAX_INSTANCES_PER_CLASS; ++j)
info_map_write(xe, info_map, mapping_table[i][j],
GUC_MAX_INSTANCES_PER_CLASS);
}
static void guc_mapping_table_init(struct xe_gt *gt,
struct iosys_map *info_map)
{
struct xe_device *xe = gt_to_xe(gt);
struct xe_hw_engine *hwe;
enum xe_hw_engine_id id;
guc_mapping_table_init_invalid(gt, info_map);
for_each_hw_engine(hwe, gt, id) {
u8 guc_class;
guc_class = xe_engine_class_to_guc_class(hwe->class);
info_map_write(xe, info_map,
mapping_table[guc_class][hwe->logical_instance],
hwe->instance);
}
}
static void guc_capture_list_init(struct xe_guc_ads *ads)
{
int i, j;
u32 addr = xe_bo_ggtt_addr(ads->bo) + guc_ads_capture_offset(ads);
/* FIXME: Populate a proper capture list */
for (i = 0; i < GUC_CAPTURE_LIST_INDEX_MAX; i++) {
for (j = 0; j < GUC_MAX_ENGINE_CLASSES; j++) {
ads_blob_write(ads, ads.capture_instance[i][j], addr);
ads_blob_write(ads, ads.capture_class[i][j], addr);
}
ads_blob_write(ads, ads.capture_global[i], addr);
}
}
static void guc_mmio_regset_write_one(struct xe_guc_ads *ads,
struct iosys_map *regset_map,
u32 reg, u32 flags,
unsigned int n_entry)
{
struct guc_mmio_reg entry = {
.offset = reg,
.flags = flags,
/* TODO: steering */
};
xe_map_memcpy_to(ads_to_xe(ads), regset_map, n_entry * sizeof(entry),
&entry, sizeof(entry));
}
static unsigned int guc_mmio_regset_write(struct xe_guc_ads *ads,
struct iosys_map *regset_map,
struct xe_hw_engine *hwe)
{
struct xe_hw_engine *hwe_rcs_reset_domain =
xe_gt_any_hw_engine_by_reset_domain(hwe->gt, XE_ENGINE_CLASS_RENDER);
struct xe_reg_sr_entry *entry;
unsigned long idx;
unsigned count = 0;
const struct {
u32 reg;
u32 flags;
bool skip;
} *e, extra_regs[] = {
{ .reg = RING_MODE_GEN7(hwe->mmio_base).reg, },
{ .reg = RING_HWS_PGA(hwe->mmio_base).reg, },
{ .reg = RING_IMR(hwe->mmio_base).reg, },
{ .reg = GEN12_RCU_MODE.reg, .flags = 0x3,
.skip = hwe != hwe_rcs_reset_domain },
};
u32 i;
BUILD_BUG_ON(ARRAY_SIZE(extra_regs) > ADS_REGSET_EXTRA_MAX);
xa_for_each(&hwe->reg_sr.xa, idx, entry) {
u32 flags = entry->masked_reg ? GUC_REGSET_MASKED : 0;
guc_mmio_regset_write_one(ads, regset_map, idx, flags, count++);
}
for (e = extra_regs; e < extra_regs + ARRAY_SIZE(extra_regs); e++) {
if (e->skip)
continue;
guc_mmio_regset_write_one(ads, regset_map,
e->reg, e->flags, count++);
}
for (i = 0; i < LNCFCMOCS_REG_COUNT; i++) {
guc_mmio_regset_write_one(ads, regset_map,
GEN9_LNCFCMOCS(i).reg, 0, count++);
}
XE_BUG_ON(ads->regset_size < (count * sizeof(struct guc_mmio_reg)));
return count;
}
static void guc_mmio_reg_state_init(struct xe_guc_ads *ads)
{
size_t regset_offset = guc_ads_regset_offset(ads);
struct xe_gt *gt = ads_to_gt(ads);
struct xe_hw_engine *hwe;
enum xe_hw_engine_id id;
u32 addr = xe_bo_ggtt_addr(ads->bo) + regset_offset;
struct iosys_map regset_map = IOSYS_MAP_INIT_OFFSET(ads_to_map(ads),
regset_offset);
for_each_hw_engine(hwe, gt, id) {
unsigned int count;
u8 gc;
/*
* 1. Write all MMIO entries for this engine to the table. No
* need to worry about fused-off engines and when there are
* no entries in the regset: the reg_state_list has been zero'ed
* by xe_guc_ads_populate()
*/
count = guc_mmio_regset_write(ads, &regset_map, hwe);
if (!count)
continue;
/*
* 2. Record in the header (ads.reg_state_list) the address
* location and number of entries
*/
gc = xe_engine_class_to_guc_class(hwe->class);
ads_blob_write(ads, ads.reg_state_list[gc][hwe->instance].address, addr);
ads_blob_write(ads, ads.reg_state_list[gc][hwe->instance].count, count);
addr += count * sizeof(struct guc_mmio_reg);
iosys_map_incr(&regset_map, count * sizeof(struct guc_mmio_reg));
}
}
static void guc_um_init_params(struct xe_guc_ads *ads)
{
u32 um_queue_offset = guc_ads_um_queues_offset(ads);
u64 base_dpa;
u32 base_ggtt;
int i;
base_ggtt = xe_bo_ggtt_addr(ads->bo) + um_queue_offset;
base_dpa = xe_bo_main_addr(ads->bo, PAGE_SIZE) + um_queue_offset;
for (i = 0; i < GUC_UM_HW_QUEUE_MAX; ++i) {
ads_blob_write(ads, um_init_params.queue_params[i].base_dpa,
base_dpa + (i * GUC_UM_QUEUE_SIZE));
ads_blob_write(ads, um_init_params.queue_params[i].base_ggtt_address,
base_ggtt + (i * GUC_UM_QUEUE_SIZE));
ads_blob_write(ads, um_init_params.queue_params[i].size_in_bytes,
GUC_UM_QUEUE_SIZE);
}
ads_blob_write(ads, um_init_params.page_response_timeout_in_us,
GUC_PAGE_RES_TIMEOUT_US);
}
static void guc_doorbell_init(struct xe_guc_ads *ads)
{
struct xe_device *xe = ads_to_xe(ads);
struct xe_gt *gt = ads_to_gt(ads);
if (GRAPHICS_VER(xe) >= 12 && !IS_DGFX(xe)) {
u32 distdbreg =
xe_mmio_read32(gt, GEN12_DIST_DBS_POPULATED.reg);
ads_blob_write(ads,
system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_DOORBELL_COUNT_PER_SQIDI],
((distdbreg >> GEN12_DOORBELLS_PER_SQIDI_SHIFT)
& GEN12_DOORBELLS_PER_SQIDI) + 1);
}
}
/**
* xe_guc_ads_populate_minimal - populate minimal ADS
* @ads: Additional data structures object
*
* This function populates a minimal ADS that does not support submissions but
* enough so the GuC can load and the hwconfig table can be read.
*/
void xe_guc_ads_populate_minimal(struct xe_guc_ads *ads)
{
struct xe_gt *gt = ads_to_gt(ads);
struct iosys_map info_map = IOSYS_MAP_INIT_OFFSET(ads_to_map(ads),
offsetof(struct __guc_ads_blob, system_info));
u32 base = xe_bo_ggtt_addr(ads->bo);
XE_BUG_ON(!ads->bo);
xe_map_memset(ads_to_xe(ads), ads_to_map(ads), 0, 0, ads->bo->size);
guc_policies_init(ads);
guc_prep_golden_lrc_null(ads);
guc_mapping_table_init_invalid(gt, &info_map);
guc_doorbell_init(ads);
ads_blob_write(ads, ads.scheduler_policies, base +
offsetof(struct __guc_ads_blob, policies));
ads_blob_write(ads, ads.gt_system_info, base +
offsetof(struct __guc_ads_blob, system_info));
ads_blob_write(ads, ads.private_data, base +
guc_ads_private_data_offset(ads));
}
void xe_guc_ads_populate(struct xe_guc_ads *ads)
{
struct xe_device *xe = ads_to_xe(ads);
struct xe_gt *gt = ads_to_gt(ads);
struct iosys_map info_map = IOSYS_MAP_INIT_OFFSET(ads_to_map(ads),
offsetof(struct __guc_ads_blob, system_info));
u32 base = xe_bo_ggtt_addr(ads->bo);
XE_BUG_ON(!ads->bo);
xe_map_memset(ads_to_xe(ads), ads_to_map(ads), 0, 0, ads->bo->size);
guc_policies_init(ads);
fill_engine_enable_masks(gt, &info_map);
guc_mmio_reg_state_init(ads);
guc_prep_golden_lrc_null(ads);
guc_mapping_table_init(gt, &info_map);
guc_capture_list_init(ads);
guc_doorbell_init(ads);
if (xe->info.supports_usm) {
guc_um_init_params(ads);
ads_blob_write(ads, ads.um_init_data, base +
offsetof(struct __guc_ads_blob, um_init_params));
}
ads_blob_write(ads, ads.scheduler_policies, base +
offsetof(struct __guc_ads_blob, policies));
ads_blob_write(ads, ads.gt_system_info, base +
offsetof(struct __guc_ads_blob, system_info));
ads_blob_write(ads, ads.private_data, base +
guc_ads_private_data_offset(ads));
}
static void guc_populate_golden_lrc(struct xe_guc_ads *ads)
{
struct xe_device *xe = ads_to_xe(ads);
struct xe_gt *gt = ads_to_gt(ads);
struct iosys_map info_map = IOSYS_MAP_INIT_OFFSET(ads_to_map(ads),
offsetof(struct __guc_ads_blob, system_info));
size_t total_size = 0, alloc_size, real_size;
u32 addr_ggtt, offset;
int class;
offset = guc_ads_golden_lrc_offset(ads);
addr_ggtt = xe_bo_ggtt_addr(ads->bo) + offset;
for (class = 0; class < XE_ENGINE_CLASS_MAX; ++class) {
u8 guc_class;
if (class == XE_ENGINE_CLASS_OTHER)
continue;
guc_class = xe_engine_class_to_guc_class(class);
if (!info_map_read(xe, &info_map,
engine_enabled_masks[guc_class]))
continue;
XE_BUG_ON(!gt->default_lrc[class]);
real_size = xe_lrc_size(xe, class);
alloc_size = PAGE_ALIGN(real_size);
total_size += alloc_size;
/*
* This interface is slightly confusing. We need to pass the
* base address of the full golden context and the size of just
* the engine state, which is the section of the context image
* that starts after the execlists LRC registers. This is
* required to allow the GuC to restore just the engine state
* when a watchdog reset occurs.
* We calculate the engine state size by removing the size of
* what comes before it in the context image (which is identical
* on all engines).
*/
ads_blob_write(ads, ads.eng_state_size[guc_class],
real_size - xe_lrc_skip_size(xe));
ads_blob_write(ads, ads.golden_context_lrca[guc_class],
addr_ggtt);
xe_map_memcpy_to(xe, ads_to_map(ads), offset,
gt->default_lrc[class], real_size);
addr_ggtt += alloc_size;
offset += alloc_size;
}
XE_BUG_ON(total_size != ads->golden_lrc_size);
}
void xe_guc_ads_populate_post_load(struct xe_guc_ads *ads)
{
guc_populate_golden_lrc(ads);
}

View File

@ -0,0 +1,17 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GUC_ADS_H_
#define _XE_GUC_ADS_H_
#include "xe_guc_ads_types.h"
int xe_guc_ads_init(struct xe_guc_ads *ads);
int xe_guc_ads_init_post_hwconfig(struct xe_guc_ads *ads);
void xe_guc_ads_populate(struct xe_guc_ads *ads);
void xe_guc_ads_populate_minimal(struct xe_guc_ads *ads);
void xe_guc_ads_populate_post_load(struct xe_guc_ads *ads);
#endif

View File

@ -0,0 +1,25 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GUC_ADS_TYPES_H_
#define _XE_GUC_ADS_TYPES_H_
#include <linux/types.h>
struct xe_bo;
/**
* struct xe_guc_ads - GuC additional data structures (ADS)
*/
struct xe_guc_ads {
/** @bo: XE BO for GuC ads blob */
struct xe_bo *bo;
/** @golden_lrc_size: golden LRC size */
size_t golden_lrc_size;
/** @regset_size: size of register set passed to GuC for save/restore */
u32 regset_size;
};
#endif

File diff suppressed because it is too large

View File

@ -0,0 +1,62 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GUC_CT_H_
#define _XE_GUC_CT_H_
#include "xe_guc_ct_types.h"
struct drm_printer;
int xe_guc_ct_init(struct xe_guc_ct *ct);
int xe_guc_ct_enable(struct xe_guc_ct *ct);
void xe_guc_ct_disable(struct xe_guc_ct *ct);
void xe_guc_ct_print(struct xe_guc_ct *ct, struct drm_printer *p);
void xe_guc_ct_fast_path(struct xe_guc_ct *ct);
static inline void xe_guc_ct_irq_handler(struct xe_guc_ct *ct)
{
wake_up_all(&ct->wq);
#ifdef XE_GUC_CT_SELFTEST
if (!ct->suppress_irq_handler && ct->enabled)
queue_work(system_unbound_wq, &ct->g2h_worker);
#else
if (ct->enabled)
queue_work(system_unbound_wq, &ct->g2h_worker);
#endif
xe_guc_ct_fast_path(ct);
}
/* Basic CT send / receives */
int xe_guc_ct_send(struct xe_guc_ct *ct, const u32 *action, u32 len,
u32 g2h_len, u32 num_g2h);
int xe_guc_ct_send_locked(struct xe_guc_ct *ct, const u32 *action, u32 len,
u32 g2h_len, u32 num_g2h);
int xe_guc_ct_send_recv(struct xe_guc_ct *ct, const u32 *action, u32 len,
u32 *response_buffer);
static inline int
xe_guc_ct_send_block(struct xe_guc_ct *ct, const u32 *action, u32 len)
{
return xe_guc_ct_send_recv(ct, action, len, NULL);
}
/* This is only version of the send CT you can call from a G2H handler */
int xe_guc_ct_send_g2h_handler(struct xe_guc_ct *ct, const u32 *action,
u32 len);
/* Can't fail because a GT reset is in progress */
int xe_guc_ct_send_recv_no_fail(struct xe_guc_ct *ct, const u32 *action,
u32 len, u32 *response_buffer);
static inline int
xe_guc_ct_send_block_no_fail(struct xe_guc_ct *ct, const u32 *action, u32 len)
{
return xe_guc_ct_send_recv_no_fail(ct, action, len, NULL);
}
#ifdef XE_GUC_CT_SELFTEST
void xe_guc_ct_selftest(struct xe_guc_ct *ct, struct drm_printer *p);
#endif
#endif
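/*
 * Illustrative sketch, not part of this patch: the typical pattern for a
 * blocking H2G request using the helpers above, mirroring callers such as
 * xe_guc_auth_huc(). The action id and payload here are caller-provided
 * placeholders rather than a specific ABI message.
 */
static inline int example_h2g_request(struct xe_guc_ct *ct, u32 action_id,
				      u32 payload)
{
	u32 action[] = {
		action_id,	/* one of the XE_GUC_ACTION_* H2G actions */
		payload,
	};

	/* Blocks until the GuC replies or the CT channel errors out */
	return xe_guc_ct_send_block(ct, action, ARRAY_SIZE(action));
}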

View File

@ -0,0 +1,87 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GUC_CT_TYPES_H_
#define _XE_GUC_CT_TYPES_H_
#include <linux/iosys-map.h>
#include <linux/interrupt.h>
#include <linux/spinlock_types.h>
#include <linux/wait.h>
#include <linux/xarray.h>
#include "abi/guc_communication_ctb_abi.h"
#define XE_GUC_CT_SELFTEST
struct xe_bo;
/**
* struct guc_ctb - GuC command transport buffer (CTB)
*/
struct guc_ctb {
/** @desc: dma buffer map for CTB descriptor */
struct iosys_map desc;
/** @cmds: dma buffer map for CTB commands */
struct iosys_map cmds;
/** @size: size of CTB commands (DW) */
u32 size;
/** @resv_space: reserved space of CTB commands (DW) */
u32 resv_space;
/** @head: head of CTB commands (DW) */
u32 head;
/** @tail: tail of CTB commands (DW) */
u32 tail;
/** @space: space in CTB commands (DW) */
u32 space;
/** @broken: channel broken */
bool broken;
};
/**
* struct xe_guc_ct - GuC command transport (CT) layer
*
* Includes a pair of CT buffers for bi-directional communication and tracking
* for the H2G and G2H requests sent and received through the buffers.
*/
struct xe_guc_ct {
/** @bo: XE BO for CT */
struct xe_bo *bo;
/** @lock: protects everything in CT layer */
struct mutex lock;
/** @fast_lock: protects G2H channel and credits */
spinlock_t fast_lock;
/** @ctbs: buffers for sending and receiving commands */
struct {
/** @send: Host to GuC (H2G, send) channel */
struct guc_ctb h2g;
/** @recv: GuC to Host (G2H, receive) channel */
struct guc_ctb g2h;
} ctbs;
/** @g2h_outstanding: number of outstanding G2H */
u32 g2h_outstanding;
/** @g2h_worker: worker to process G2H messages */
struct work_struct g2h_worker;
/** @enabled: CT enabled */
bool enabled;
/** @fence_seqno: G2H fence seqno - 16 bits used by CT */
u32 fence_seqno;
/** @fence_context: context for G2H fence */
u64 fence_context;
/** @fence_lookup: G2H fence lookup */
struct xarray fence_lookup;
/** @wq: wait queue used for reliable CT sends and freeing G2H credits */
wait_queue_head_t wq;
#ifdef XE_GUC_CT_SELFTEST
/** @suppress_irq_handler: force flow control to sender */
bool suppress_irq_handler;
#endif
/** @msg: Message buffer */
u32 msg[GUC_CTB_MSG_MAX_LEN];
/** @fast_msg: Message buffer */
u32 fast_msg[GUC_CTB_MSG_MAX_LEN];
};
#endif
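/*
 * Illustrative sketch, not part of this patch: @head, @tail, @size and @space
 * above describe a circular command buffer measured in dwords. For the H2G
 * channel the host produces at @tail and the GuC consumes at @head, so the
 * free space seen by the host could be derived with the kernel's CIRC_SPACE()
 * helper from <linux/circ_buf.h>, minus the @resv_space kept back for sends
 * issued from G2H handlers. The CT layer caches this result in @space.
 */
#include <linux/circ_buf.h>

static inline u32 example_h2g_free_dw(struct guc_ctb *h2g)
{
	return CIRC_SPACE(h2g->tail, h2g->head, h2g->size) - h2g->resv_space;
}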

View File

@ -0,0 +1,105 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include <drm/drm_debugfs.h>
#include <drm/drm_managed.h>
#include "xe_device.h"
#include "xe_gt.h"
#include "xe_guc.h"
#include "xe_guc_ct.h"
#include "xe_guc_debugfs.h"
#include "xe_guc_log.h"
#include "xe_macros.h"
static struct xe_gt *
guc_to_gt(struct xe_guc *guc)
{
return container_of(guc, struct xe_gt, uc.guc);
}
static struct xe_device *
guc_to_xe(struct xe_guc *guc)
{
return gt_to_xe(guc_to_gt(guc));
}
static struct xe_guc *node_to_guc(struct drm_info_node *node)
{
return node->info_ent->data;
}
static int guc_info(struct seq_file *m, void *data)
{
struct xe_guc *guc = node_to_guc(m->private);
struct xe_device *xe = guc_to_xe(guc);
struct drm_printer p = drm_seq_file_printer(m);
xe_device_mem_access_get(xe);
xe_guc_print_info(guc, &p);
xe_device_mem_access_put(xe);
return 0;
}
static int guc_log(struct seq_file *m, void *data)
{
struct xe_guc *guc = node_to_guc(m->private);
struct xe_device *xe = guc_to_xe(guc);
struct drm_printer p = drm_seq_file_printer(m);
xe_device_mem_access_get(xe);
xe_guc_log_print(&guc->log, &p);
xe_device_mem_access_put(xe);
return 0;
}
#ifdef XE_GUC_CT_SELFTEST
static int guc_ct_selftest(struct seq_file *m, void *data)
{
struct xe_guc *guc = node_to_guc(m->private);
struct xe_device *xe = guc_to_xe(guc);
struct drm_printer p = drm_seq_file_printer(m);
xe_device_mem_access_get(xe);
xe_guc_ct_selftest(&guc->ct, &p);
xe_device_mem_access_put(xe);
return 0;
}
#endif
static const struct drm_info_list debugfs_list[] = {
{"guc_info", guc_info, 0},
{"guc_log", guc_log, 0},
#ifdef XE_GUC_CT_SELFTEST
{"guc_ct_selftest", guc_ct_selftest, 0},
#endif
};
void xe_guc_debugfs_register(struct xe_guc *guc, struct dentry *parent)
{
struct drm_minor *minor = guc_to_xe(guc)->drm.primary;
struct drm_info_list *local;
int i;
#define DEBUGFS_SIZE ARRAY_SIZE(debugfs_list) * sizeof(struct drm_info_list)
local = drmm_kmalloc(&guc_to_xe(guc)->drm, DEBUGFS_SIZE, GFP_KERNEL);
if (!local) {
XE_WARN_ON("Couldn't allocate memory");
return;
}
memcpy(local, debugfs_list, DEBUGFS_SIZE);
#undef DEBUGFS_SIZE
for (i = 0; i < ARRAY_SIZE(debugfs_list); ++i)
local[i].data = guc;
drm_debugfs_create_files(local,
ARRAY_SIZE(debugfs_list),
parent, minor);
}

View File

@ -0,0 +1,14 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GUC_DEBUGFS_H_
#define _XE_GUC_DEBUGFS_H_
struct dentry;
struct xe_guc;
void xe_guc_debugfs_register(struct xe_guc *guc, struct dentry *parent);
#endif

View File

@ -0,0 +1,52 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GUC_ENGINE_TYPES_H_
#define _XE_GUC_ENGINE_TYPES_H_
#include <linux/spinlock.h>
#include <linux/workqueue.h>
#include "xe_gpu_scheduler_types.h"
struct dma_fence;
struct xe_engine;
/**
* struct xe_guc_engine - GuC specific state for an xe_engine
*/
struct xe_guc_engine {
/** @engine: Backpointer to parent xe_engine */
struct xe_engine *engine;
/** @sched: GPU scheduler for this xe_engine */
struct xe_gpu_scheduler sched;
/** @entity: Scheduler entity for this xe_engine */
struct xe_sched_entity entity;
/**
* @static_msgs: Static messages for this xe_engine, used when a message
* needs to be sent through the GPU scheduler but memory allocations are
* not allowed.
*/
#define MAX_STATIC_MSG_TYPE 3
struct xe_sched_msg static_msgs[MAX_STATIC_MSG_TYPE];
/** @fini_async: do final fini async from this worker */
struct work_struct fini_async;
/** @resume_time: time of last resume */
u64 resume_time;
/** @state: GuC specific state for this xe_engine */
atomic_t state;
/** @wqi_head: work queue item head */
u32 wqi_head;
/** @wqi_tail: work queue item tail */
u32 wqi_tail;
/** @id: GuC id for this xe_engine */
u16 id;
/** @suspend_wait: wait queue used to wait on pending suspends */
wait_queue_head_t suspend_wait;
/** @suspend_pending: a suspend of the engine is pending */
bool suspend_pending;
};
#endif

View File

@ -0,0 +1,392 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GUC_FWIF_H
#define _XE_GUC_FWIF_H
#include <linux/bits.h>
#include "abi/guc_actions_abi.h"
#include "abi/guc_actions_slpc_abi.h"
#include "abi/guc_errors_abi.h"
#include "abi/guc_communication_mmio_abi.h"
#include "abi/guc_communication_ctb_abi.h"
#include "abi/guc_klvs_abi.h"
#include "abi/guc_messages_abi.h"
#define G2H_LEN_DW_SCHED_CONTEXT_MODE_SET 4
#define G2H_LEN_DW_DEREGISTER_CONTEXT 3
#define G2H_LEN_DW_TLB_INVALIDATE 3
#define GUC_CONTEXT_DISABLE 0
#define GUC_CONTEXT_ENABLE 1
#define GUC_CLIENT_PRIORITY_KMD_HIGH 0
#define GUC_CLIENT_PRIORITY_HIGH 1
#define GUC_CLIENT_PRIORITY_KMD_NORMAL 2
#define GUC_CLIENT_PRIORITY_NORMAL 3
#define GUC_CLIENT_PRIORITY_NUM 4
#define GUC_RENDER_ENGINE 0
#define GUC_VIDEO_ENGINE 1
#define GUC_BLITTER_ENGINE 2
#define GUC_VIDEOENHANCE_ENGINE 3
#define GUC_VIDEO_ENGINE2 4
#define GUC_MAX_ENGINES_NUM (GUC_VIDEO_ENGINE2 + 1)
#define GUC_RENDER_CLASS 0
#define GUC_VIDEO_CLASS 1
#define GUC_VIDEOENHANCE_CLASS 2
#define GUC_BLITTER_CLASS 3
#define GUC_COMPUTE_CLASS 4
#define GUC_GSC_OTHER_CLASS 5
#define GUC_LAST_ENGINE_CLASS GUC_GSC_OTHER_CLASS
#define GUC_MAX_ENGINE_CLASSES 16
#define GUC_MAX_INSTANCES_PER_CLASS 32
/* Work item for submitting workloads into work queue of GuC. */
#define WQ_STATUS_ACTIVE 1
#define WQ_STATUS_SUSPENDED 2
#define WQ_STATUS_CMD_ERROR 3
#define WQ_STATUS_ENGINE_ID_NOT_USED 4
#define WQ_STATUS_SUSPENDED_FROM_RESET 5
#define WQ_TYPE_NOOP 0x4
#define WQ_TYPE_MULTI_LRC 0x5
#define WQ_TYPE_MASK GENMASK(7, 0)
#define WQ_LEN_MASK GENMASK(26, 16)
#define WQ_GUC_ID_MASK GENMASK(15, 0)
#define WQ_RING_TAIL_MASK GENMASK(28, 18)
struct guc_wq_item {
u32 header;
u32 context_desc;
u32 submit_element_info;
u32 fence_id;
} __packed;
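/*
 * Illustrative sketch, not part of this patch: how the WQ_* masks above could
 * be used with FIELD_PREP() from <linux/bitfield.h> to pack a work queue item.
 * Field usage follows the mask names only and is an assumption; the real
 * packing (including context_desc, fence_id and the ring tail units) is done
 * by the GuC submission backend.
 */
static inline void example_wq_item_pack(struct guc_wq_item *wqi, u32 guc_id,
					u32 ring_tail, u32 len_dw)
{
	wqi->header = FIELD_PREP(WQ_TYPE_MASK, WQ_TYPE_MULTI_LRC) |
		      FIELD_PREP(WQ_LEN_MASK, len_dw);
	wqi->submit_element_info = FIELD_PREP(WQ_GUC_ID_MASK, guc_id) |
				   FIELD_PREP(WQ_RING_TAIL_MASK, ring_tail);
	/* context_desc and fence_id intentionally left to the caller */
}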
struct guc_sched_wq_desc {
u32 head;
u32 tail;
u32 error_offset;
u32 wq_status;
u32 reserved[28];
} __packed;
/* Helper for context registration H2G */
struct guc_ctxt_registration_info {
u32 flags;
u32 context_idx;
u32 engine_class;
u32 engine_submit_mask;
u32 wq_desc_lo;
u32 wq_desc_hi;
u32 wq_base_lo;
u32 wq_base_hi;
u32 wq_size;
u32 hwlrca_lo;
u32 hwlrca_hi;
};
#define CONTEXT_REGISTRATION_FLAG_KMD BIT(0)
/* 32-bit KLV structure as used by policy updates and others */
struct guc_klv_generic_dw_t {
u32 kl;
u32 value;
} __packed;
/* Format of the UPDATE_CONTEXT_POLICIES H2G data packet */
struct guc_update_engine_policy_header {
u32 action;
u32 guc_id;
} __packed;
struct guc_update_engine_policy {
struct guc_update_engine_policy_header header;
struct guc_klv_generic_dw_t klv[GUC_CONTEXT_POLICIES_KLV_NUM_IDS];
} __packed;
/* GUC_CTL_* - Parameters for loading the GuC */
#define GUC_CTL_LOG_PARAMS 0
#define GUC_LOG_VALID BIT(0)
#define GUC_LOG_NOTIFY_ON_HALF_FULL BIT(1)
#define GUC_LOG_CAPTURE_ALLOC_UNITS BIT(2)
#define GUC_LOG_LOG_ALLOC_UNITS BIT(3)
#define GUC_LOG_CRASH_SHIFT 4
#define GUC_LOG_CRASH_MASK (0x3 << GUC_LOG_CRASH_SHIFT)
#define GUC_LOG_DEBUG_SHIFT 6
#define GUC_LOG_DEBUG_MASK (0xF << GUC_LOG_DEBUG_SHIFT)
#define GUC_LOG_CAPTURE_SHIFT 10
#define GUC_LOG_CAPTURE_MASK (0x3 << GUC_LOG_CAPTURE_SHIFT)
#define GUC_LOG_BUF_ADDR_SHIFT 12
#define GUC_CTL_WA 1
#define GUC_WA_GAM_CREDITS BIT(10)
#define GUC_WA_DUAL_QUEUE BIT(11)
#define GUC_WA_RCS_RESET_BEFORE_RC6 BIT(13)
#define GUC_WA_CONTEXT_ISOLATION BIT(15)
#define GUC_WA_PRE_PARSER BIT(14)
#define GUC_WA_HOLD_CCS_SWITCHOUT BIT(17)
#define GUC_WA_POLLCS BIT(18)
#define GUC_WA_RENDER_RST_RC6_EXIT BIT(19)
#define GUC_WA_RCS_REGS_IN_CCS_REGS_LIST BIT(21)
#define GUC_CTL_FEATURE 2
#define GUC_CTL_ENABLE_SLPC BIT(2)
#define GUC_CTL_DISABLE_SCHEDULER BIT(14)
#define GUC_CTL_DEBUG 3
#define GUC_LOG_VERBOSITY_SHIFT 0
#define GUC_LOG_VERBOSITY_LOW (0 << GUC_LOG_VERBOSITY_SHIFT)
#define GUC_LOG_VERBOSITY_MED (1 << GUC_LOG_VERBOSITY_SHIFT)
#define GUC_LOG_VERBOSITY_HIGH (2 << GUC_LOG_VERBOSITY_SHIFT)
#define GUC_LOG_VERBOSITY_ULTRA (3 << GUC_LOG_VERBOSITY_SHIFT)
#define GUC_LOG_VERBOSITY_MIN 0
#define GUC_LOG_VERBOSITY_MAX 3
#define GUC_LOG_VERBOSITY_MASK 0x0000000f
#define GUC_LOG_DESTINATION_MASK (3 << 4)
#define GUC_LOG_DISABLED (1 << 6)
#define GUC_PROFILE_ENABLED (1 << 7)
#define GUC_CTL_ADS 4
#define GUC_ADS_ADDR_SHIFT 1
#define GUC_ADS_ADDR_MASK (0xFFFFF << GUC_ADS_ADDR_SHIFT)
#define GUC_CTL_DEVID 5
#define GUC_CTL_MAX_DWORDS 14
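
The GUC_CTL_* indices describe a small dword array handed to the GuC at load time; each slot is encoded with the shift/mask pairs listed under it. A hedged sketch of populating such an array follows. example_fill_guc_ctl() is hypothetical, and the page-frame encoding of the ADS address is an assumption based on the GUC_ADS_ADDR_SHIFT/MASK pair rather than a statement of the driver's exact behaviour.

/* Sketch only: assembling the GuC load-time parameter block */
#include <linux/mm.h>		/* PAGE_SHIFT */
#include <linux/string.h>	/* memset() */

static void example_fill_guc_ctl(u32 params[GUC_CTL_MAX_DWORDS],
				 u32 ads_ggtt, u32 devid)
{
	memset(params, 0, sizeof(u32) * GUC_CTL_MAX_DWORDS);

	/* Enable logging; the log buffer address/size encoding is omitted */
	params[GUC_CTL_LOG_PARAMS] = GUC_LOG_VALID | GUC_LOG_NOTIFY_ON_HALF_FULL;
	/* Let the GuC-based power conservation (SLPC) feature run */
	params[GUC_CTL_FEATURE] = GUC_CTL_ENABLE_SLPC;
	params[GUC_CTL_DEBUG] = GUC_LOG_VERBOSITY_LOW;
	/* ADS location as a GGTT page frame number (assumed encoding) */
	params[GUC_CTL_ADS] = ((ads_ggtt >> PAGE_SHIFT) << GUC_ADS_ADDR_SHIFT) &
			      GUC_ADS_ADDR_MASK;
	params[GUC_CTL_DEVID] = devid;
}
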
/* Scheduling policy settings */
#define GLOBAL_POLICY_MAX_NUM_WI 15
/* Don't reset an engine upon preemption failure */
#define GLOBAL_POLICY_DISABLE_ENGINE_RESET BIT(0)
#define GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US 500000
struct guc_policies {
u32 submission_queue_depth[GUC_MAX_ENGINE_CLASSES];
/* In micro seconds. How much time to allow before DPC processing is
* called back via interrupt (to prevent DPC queue drain starving).
* Typically 1000s of micro seconds (example only, not granularity). */
u32 dpc_promote_time;
/* Must be set to take these new values. */
u32 is_valid;
/* Max number of WIs to process per call. A large value may keep CS
* idle. */
u32 max_num_work_items;
u32 global_flags;
u32 reserved[4];
} __packed;
/* GuC MMIO reg state struct */
struct guc_mmio_reg {
u32 offset;
u32 value;
u32 flags;
u32 mask;
#define GUC_REGSET_MASKED BIT(0)
#define GUC_REGSET_MASKED_WITH_VALUE BIT(2)
#define GUC_REGSET_RESTORE_ONLY BIT(3)
} __packed;
/* GuC register sets */
struct guc_mmio_reg_set {
u32 address;
u16 count;
u16 reserved;
} __packed;
/* Generic GT SysInfo data types */
#define GUC_GENERIC_GT_SYSINFO_SLICE_ENABLED 0
#define GUC_GENERIC_GT_SYSINFO_VDBOX_SFC_SUPPORT_MASK 1
#define GUC_GENERIC_GT_SYSINFO_DOORBELL_COUNT_PER_SQIDI 2
#define GUC_GENERIC_GT_SYSINFO_MAX 16
/* HW info */
struct guc_gt_system_info {
u8 mapping_table[GUC_MAX_ENGINE_CLASSES][GUC_MAX_INSTANCES_PER_CLASS];
u32 engine_enabled_masks[GUC_MAX_ENGINE_CLASSES];
u32 generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_MAX];
} __packed;
enum {
GUC_CAPTURE_LIST_INDEX_PF = 0,
GUC_CAPTURE_LIST_INDEX_VF = 1,
GUC_CAPTURE_LIST_INDEX_MAX = 2,
};
/* GuC Additional Data Struct */
struct guc_ads {
struct guc_mmio_reg_set reg_state_list[GUC_MAX_ENGINE_CLASSES][GUC_MAX_INSTANCES_PER_CLASS];
u32 reserved0;
u32 scheduler_policies;
u32 gt_system_info;
u32 reserved1;
u32 control_data;
u32 golden_context_lrca[GUC_MAX_ENGINE_CLASSES];
u32 eng_state_size[GUC_MAX_ENGINE_CLASSES];
u32 private_data;
u32 um_init_data;
u32 capture_instance[GUC_CAPTURE_LIST_INDEX_MAX][GUC_MAX_ENGINE_CLASSES];
u32 capture_class[GUC_CAPTURE_LIST_INDEX_MAX][GUC_MAX_ENGINE_CLASSES];
u32 capture_global[GUC_CAPTURE_LIST_INDEX_MAX];
u32 reserved[14];
} __packed;
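
Most fields of struct guc_ads are GGTT addresses of the sub-blocks defined around it (policies, system info, golden contexts, capture lists), typically laid out inside one buffer object. A rough sketch of wiring two of those pointers is below; example_wire_ads(), the ads_ggtt base and the offsets are hypothetical, and the real driver writes these through its mapping helpers rather than a plain CPU pointer.

/* Sketch only: pointing ADS fields at sub-blocks within one GGTT allocation */
static void example_wire_ads(struct guc_ads *ads, u32 ads_ggtt,
			     u32 policies_offset, u32 sysinfo_offset)
{
	/* GGTT address of the struct guc_policies blob */
	ads->scheduler_policies = ads_ggtt + policies_offset;
	/* GGTT address of the struct guc_gt_system_info blob */
	ads->gt_system_info = ads_ggtt + sysinfo_offset;
}
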
/* Engine usage stats */
struct guc_engine_usage_record {
u32 current_context_index;
u32 last_switch_in_stamp;
u32 reserved0;
u32 total_runtime;
u32 reserved1[4];
} __packed;
struct guc_engine_usage {
struct guc_engine_usage_record engines[GUC_MAX_ENGINE_CLASSES][GUC_MAX_INSTANCES_PER_CLASS];
} __packed;
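
The GuC updates struct guc_engine_usage_record in place so the KMD can derive per-engine busyness; total_runtime is a 32-bit free-running counter, so a consumer has to accumulate deltas to survive wraparound. A small sketch under those assumptions (example_runtime_delta() is hypothetical, and the record is treated as CPU-mapped for simplicity):

/* Sketch only: accumulate a wrap-safe runtime delta from a usage record */
static u64 example_runtime_delta(const struct guc_engine_usage_record *rec,
				 u32 *last_runtime)
{
	u32 now = rec->total_runtime;
	u32 delta = now - *last_runtime;	/* unsigned math handles wraparound */

	*last_runtime = now;
	return delta;
}
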
/* This action will be programmed in C1BC - SOFT_SCRATCH_15_REG */
enum xe_guc_recv_message {
XE_GUC_RECV_MSG_CRASH_DUMP_POSTED = BIT(1),
XE_GUC_RECV_MSG_EXCEPTION = BIT(30),
};
/* Page fault structures */
struct access_counter_desc {
u32 dw0;
#define ACCESS_COUNTER_TYPE BIT(0)
#define ACCESS_COUNTER_SUBG_LO GENMASK(31, 1)
u32 dw1;
#define ACCESS_COUNTER_SUBG_HI BIT(0)
#define ACCESS_COUNTER_RSVD0 GENMASK(2, 1)
#define ACCESS_COUNTER_ENG_INSTANCE GENMASK(8, 3)
#define ACCESS_COUNTER_ENG_CLASS GENMASK(11, 9)
#define ACCESS_COUNTER_ASID GENMASK(31, 12)
u32 dw2;
#define ACCESS_COUNTER_VFID GENMASK(5, 0)
#define ACCESS_COUNTER_RSVD1 GENMASK(7, 6)
#define ACCESS_COUNTER_GRANULARITY GENMASK(10, 8)
#define ACCESS_COUNTER_RSVD2 GENMASK(16, 11)
#define ACCESS_COUNTER_VIRTUAL_ADDR_RANGE_LO GENMASK(31, 17)
u32 dw3;
#define ACCESS_COUNTER_VIRTUAL_ADDR_RANGE_HI GENMASK(31, 0)
} __packed;
enum guc_um_queue_type {
GUC_UM_HW_QUEUE_PAGE_FAULT = 0,
GUC_UM_HW_QUEUE_PAGE_FAULT_RESPONSE,
GUC_UM_HW_QUEUE_ACCESS_COUNTER,
GUC_UM_HW_QUEUE_MAX
};
struct guc_um_queue_params {
u64 base_dpa;
u32 base_ggtt_address;
u32 size_in_bytes;
u32 rsvd[4];
} __packed;
struct guc_um_init_params {
u64 page_response_timeout_in_us;
u32 rsvd[6];
struct guc_um_queue_params queue_params[GUC_UM_HW_QUEUE_MAX];
} __packed;
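
struct guc_um_init_params tells the GuC where the three unified-memory queues (page fault, fault response, access counter) live, both as a device physical address and as a GGTT address. A hedged sketch of filling one queue's parameters follows; example_describe_pf_queue(), the queue_dpa input and the timeout value are purely illustrative.

/* Sketch only: describing the page-fault queue to the GuC */
static void example_describe_pf_queue(struct guc_um_init_params *params,
				      u64 queue_dpa, u32 queue_ggtt, u32 size)
{
	struct guc_um_queue_params *q =
		&params->queue_params[GUC_UM_HW_QUEUE_PAGE_FAULT];

	q->base_dpa = queue_dpa;		/* device physical address */
	q->base_ggtt_address = queue_ggtt;	/* GGTT address of the same buffer */
	q->size_in_bytes = size;

	/* Give the GuC a page-response deadline (hypothetical value) */
	params->page_response_timeout_in_us = 1000;
}
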
enum xe_guc_fault_reply_type {
PFR_ACCESS = 0,
PFR_ENGINE,
PFR_VFID,
PFR_ALL,
PFR_INVALID
};
enum xe_guc_response_desc_type {
TLB_INVALIDATION_DESC = 0,
FAULT_RESPONSE_DESC
};
struct xe_guc_pagefault_desc {
u32 dw0;
#define PFD_FAULT_LEVEL GENMASK(2, 0)
#define PFD_SRC_ID GENMASK(10, 3)
#define PFD_RSVD_0 GENMASK(17, 11)
#define XE2_PFD_TRVA_FAULT BIT(18)
#define PFD_ENG_INSTANCE GENMASK(24, 19)
#define PFD_ENG_CLASS GENMASK(27, 25)
#define PFD_PDATA_LO GENMASK(31, 28)
u32 dw1;
#define PFD_PDATA_HI GENMASK(11, 0)
#define PFD_PDATA_HI_SHIFT 4
#define PFD_ASID GENMASK(31, 12)
u32 dw2;
#define PFD_ACCESS_TYPE GENMASK(1, 0)
#define PFD_FAULT_TYPE GENMASK(3, 2)
#define PFD_VFID GENMASK(9, 4)
#define PFD_RSVD_1 GENMASK(11, 10)
#define PFD_VIRTUAL_ADDR_LO GENMASK(31, 12)
#define PFD_VIRTUAL_ADDR_LO_SHIFT 12
u32 dw3;
#define PFD_VIRTUAL_ADDR_HI GENMASK(31, 0)
#define PFD_VIRTUAL_ADDR_HI_SHIFT 32
} __packed;
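
Recovering the faulting GPU virtual address means stitching the low bits from dw2 back together with the high dword from dw3, using the *_SHIFT values above. A minimal sketch (example_pf_address() is a hypothetical helper, not the driver's fault handler):

/* Sketch only: reassemble the faulting address from a pagefault descriptor */
#include <linux/bitfield.h>

static u64 example_pf_address(const struct xe_guc_pagefault_desc *desc)
{
	u64 addr;

	/* Upper 32 bits of the page address */
	addr = (u64)FIELD_GET(PFD_VIRTUAL_ADDR_HI, desc->dw3) <<
	       PFD_VIRTUAL_ADDR_HI_SHIFT;
	/* Lower bits live in dw2, already page aligned after the shift */
	addr |= (u64)FIELD_GET(PFD_VIRTUAL_ADDR_LO, desc->dw2) <<
		PFD_VIRTUAL_ADDR_LO_SHIFT;

	return addr;
}
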
struct xe_guc_pagefault_reply {
u32 dw0;
#define PFR_VALID BIT(0)
#define PFR_SUCCESS BIT(1)
#define PFR_REPLY GENMASK(4, 2)
#define PFR_RSVD_0 GENMASK(9, 5)
#define PFR_DESC_TYPE GENMASK(11, 10)
#define PFR_ASID GENMASK(31, 12)
u32 dw1;
#define PFR_VFID GENMASK(5, 0)
#define PFR_RSVD_1 BIT(6)
#define PFR_ENG_INSTANCE GENMASK(12, 7)
#define PFR_ENG_CLASS GENMASK(15, 13)
#define PFR_PDATA GENMASK(31, 16)
u32 dw2;
#define PFR_RSVD_2 GENMASK(31, 0)
} __packed;
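
The reply travels in the opposite direction: the KMD packs a descriptor acknowledging the fault for a given ASID. A sketch of building a successful fault response, again using FIELD_PREP() and purely as an illustration (example_fill_pf_reply() is hypothetical and omits the engine/VF fields a real reply would carry):

/* Sketch only: build a successful page-fault reply for one ASID */
#include <linux/bitfield.h>

static void example_fill_pf_reply(struct xe_guc_pagefault_reply *reply,
				  u32 asid, u16 pdata)
{
	reply->dw0 = FIELD_PREP(PFR_VALID, 1) |
		     FIELD_PREP(PFR_SUCCESS, 1) |
		     FIELD_PREP(PFR_REPLY, PFR_ACCESS) |
		     FIELD_PREP(PFR_DESC_TYPE, FAULT_RESPONSE_DESC) |
		     FIELD_PREP(PFR_ASID, asid);
	reply->dw1 = FIELD_PREP(PFR_PDATA, pdata);
	reply->dw2 = 0;
}
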
struct xe_guc_acc_desc {
u32 dw0;
#define ACC_TYPE BIT(0)
#define ACC_TRIGGER 0
#define ACC_NOTIFY 1
#define ACC_SUBG_LO GENMASK(31, 1)
u32 dw1;
#define ACC_SUBG_HI BIT(0)
#define ACC_RSVD0 GENMASK(2, 1)
#define ACC_ENG_INSTANCE GENMASK(8, 3)
#define ACC_ENG_CLASS GENMASK(11, 9)
#define ACC_ASID GENMASK(31, 12)
u32 dw2;
#define ACC_VFID GENMASK(5, 0)
#define ACC_RSVD1 GENMASK(7, 6)
#define ACC_GRANULARITY GENMASK(10, 8)
#define ACC_RSVD2 GENMASK(16, 11)
#define ACC_VIRTUAL_ADDR_RANGE_LO GENMASK(31, 17)
u32 dw3;
#define ACC_VIRTUAL_ADDR_RANGE_HI GENMASK(31, 0)
} __packed;
#endif

View File

@ -0,0 +1,125 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include <drm/drm_managed.h>
#include "xe_bo.h"
#include "xe_device.h"
#include "xe_gt.h"
#include "xe_guc.h"
#include "xe_guc_hwconfig.h"
#include "xe_map.h"
static struct xe_gt *
guc_to_gt(struct xe_guc *guc)
{
return container_of(guc, struct xe_gt, uc.guc);
}
static struct xe_device *
guc_to_xe(struct xe_guc *guc)
{
return gt_to_xe(guc_to_gt(guc));
}
static int send_get_hwconfig(struct xe_guc *guc, u32 ggtt_addr, u32 size)
{
u32 action[] = {
XE_GUC_ACTION_GET_HWCONFIG,
lower_32_bits(ggtt_addr),
upper_32_bits(ggtt_addr),
size,
};
return xe_guc_send_mmio(guc, action, ARRAY_SIZE(action));
}
static int guc_hwconfig_size(struct xe_guc *guc, u32 *size)
{
int ret = send_get_hwconfig(guc, 0, 0);
if (ret < 0)
return ret;
*size = ret;
return 0;
}
static int guc_hwconfig_copy(struct xe_guc *guc)
{
int ret = send_get_hwconfig(guc, xe_bo_ggtt_addr(guc->hwconfig.bo),
guc->hwconfig.size);
if (ret < 0)
return ret;
return 0;
}
static void guc_hwconfig_fini(struct drm_device *drm, void *arg)
{
struct xe_guc *guc = arg;
xe_bo_unpin_map_no_vm(guc->hwconfig.bo);
}
int xe_guc_hwconfig_init(struct xe_guc *guc)
{
struct xe_device *xe = guc_to_xe(guc);
struct xe_gt *gt = guc_to_gt(guc);
struct xe_bo *bo;
u32 size;
int err;
/* Initialization already done */
if (guc->hwconfig.bo)
return 0;
/*
* The hwconfig is the same across all GTs, so only GT0 needs to be configured
*/
if (gt->info.id != XE_GT0)
return 0;
/* Only ADL_P and DG2 or newer support the hwconfig table */
if (GRAPHICS_VERx100(xe) < 1255 && xe->info.platform != XE_ALDERLAKE_P)
return 0;
err = guc_hwconfig_size(guc, &size);
if (err)
return err;
if (!size)
return -EINVAL;
bo = xe_bo_create_pin_map(xe, gt, NULL, PAGE_ALIGN(size),
ttm_bo_type_kernel,
XE_BO_CREATE_VRAM_IF_DGFX(gt) |
XE_BO_CREATE_GGTT_BIT);
if (IS_ERR(bo))
return PTR_ERR(bo);
guc->hwconfig.bo = bo;
guc->hwconfig.size = size;
err = drmm_add_action_or_reset(&xe->drm, guc_hwconfig_fini, guc);
if (err)
return err;
return guc_hwconfig_copy(guc);
}
u32 xe_guc_hwconfig_size(struct xe_guc *guc)
{
return !guc->hwconfig.bo ? 0 : guc->hwconfig.size;
}
void xe_guc_hwconfig_copy(struct xe_guc *guc, void *dst)
{
struct xe_device *xe = guc_to_xe(guc);
XE_BUG_ON(!guc->hwconfig.bo);
xe_map_memcpy_from(xe, dst, &guc->hwconfig.bo->vmap, 0,
guc->hwconfig.size);
}

View File

@ -0,0 +1,17 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GUC_HWCONFIG_H_
#define _XE_GUC_HWCONFIG_H_
#include <linux/types.h>
struct xe_guc;
int xe_guc_hwconfig_init(struct xe_guc *guc);
u32 xe_guc_hwconfig_size(struct xe_guc *guc);
void xe_guc_hwconfig_copy(struct xe_guc *guc, void *dst);
#endif
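
A consumer of this small API first asks for the blob size, then copies the table out for parsing. A hedged usage sketch follows; example_snapshot_hwconfig() and its kzalloc-based error handling are illustrative, not an existing call site.

/* Sketch only: snapshot the GuC hwconfig blob into a kernel allocation */
#include <linux/slab.h>

static void *example_snapshot_hwconfig(struct xe_guc *guc, u32 *size_out)
{
	u32 size = xe_guc_hwconfig_size(guc);
	void *copy;

	if (!size)
		return NULL;	/* platform without a hwconfig table */

	copy = kzalloc(size, GFP_KERNEL);
	if (!copy)
		return NULL;

	xe_guc_hwconfig_copy(guc, copy);
	*size_out = size;
	return copy;
}
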

View File

@ -0,0 +1,109 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include <drm/drm_managed.h>
#include "xe_bo.h"
#include "xe_gt.h"
#include "xe_guc_log.h"
#include "xe_map.h"
#include "xe_module.h"
static struct xe_gt *
log_to_gt(struct xe_guc_log *log)
{
return container_of(log, struct xe_gt, uc.guc.log);
}
static struct xe_device *
log_to_xe(struct xe_guc_log *log)
{
return gt_to_xe(log_to_gt(log));
}
static size_t guc_log_size(void)
{
/*
* GuC Log buffer Layout
*
* +===============================+ 00B
* | Crash dump state header |
* +-------------------------------+ 32B
* | Debug state header |
* +-------------------------------+ 64B
* | Capture state header |
* +-------------------------------+ 96B
* | |
* +===============================+ PAGE_SIZE (4KB)
* | Crash Dump logs |
* +===============================+ + CRASH_SIZE
* | Debug logs |
* +===============================+ + DEBUG_SIZE
* | Capture logs |
* +===============================+ + CAPTURE_SIZE
*/
return PAGE_SIZE + CRASH_BUFFER_SIZE + DEBUG_BUFFER_SIZE +
CAPTURE_BUFFER_SIZE;
}
void xe_guc_log_print(struct xe_guc_log *log, struct drm_printer *p)
{
struct xe_device *xe = log_to_xe(log);
size_t size;
int i, j;
XE_BUG_ON(!log->bo);
size = log->bo->size;
#define DW_PER_READ 128
XE_BUG_ON(size % (DW_PER_READ * sizeof(u32)));
for (i = 0; i < size / sizeof(u32); i += DW_PER_READ) {
u32 read[DW_PER_READ];
xe_map_memcpy_from(xe, read, &log->bo->vmap, i * sizeof(u32),
DW_PER_READ * sizeof(u32));
#define DW_PER_PRINT 4
for (j = 0; j < DW_PER_READ / DW_PER_PRINT; ++j) {
u32 *print = read + j * DW_PER_PRINT;
drm_printf(p, "0x%08x 0x%08x 0x%08x 0x%08x\n",
*(print + 0), *(print + 1),
*(print + 2), *(print + 3));
}
}
}
static void guc_log_fini(struct drm_device *drm, void *arg)
{
struct xe_guc_log *log = arg;
xe_bo_unpin_map_no_vm(log->bo);
}
int xe_guc_log_init(struct xe_guc_log *log)
{
struct xe_device *xe = log_to_xe(log);
struct xe_gt *gt = log_to_gt(log);
struct xe_bo *bo;
int err;
bo = xe_bo_create_pin_map(xe, gt, NULL, guc_log_size(),
ttm_bo_type_kernel,
XE_BO_CREATE_VRAM_IF_DGFX(gt) |
XE_BO_CREATE_GGTT_BIT);
if (IS_ERR(bo))
return PTR_ERR(bo);
xe_map_memset(xe, &bo->vmap, 0, 0, guc_log_size());
log->bo = bo;
log->level = xe_guc_log_level;
err = drmm_add_action_or_reset(&xe->drm, guc_log_fini, log);
if (err)
return err;
return 0;
}

View File

@ -0,0 +1,48 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GUC_LOG_H_
#define _XE_GUC_LOG_H_
#include "xe_guc_log_types.h"
struct drm_printer;
#if IS_ENABLED(CONFIG_DRM_XE_LARGE_GUC_BUFFER)
#define CRASH_BUFFER_SIZE SZ_1M
#define DEBUG_BUFFER_SIZE SZ_8M
#define CAPTURE_BUFFER_SIZE SZ_2M
#else
#define CRASH_BUFFER_SIZE SZ_8K
#define DEBUG_BUFFER_SIZE SZ_64K
#define CAPTURE_BUFFER_SIZE SZ_16K
#endif
/*
* While we're using plain log level in i915, GuC controls are much more...
* "elaborate"? We have a couple of bits for verbosity, separate bit for actual
* log enabling, and separate bit for default logging - which "conveniently"
* ignores the enable bit.
*/
#define GUC_LOG_LEVEL_DISABLED 0
#define GUC_LOG_LEVEL_NON_VERBOSE 1
#define GUC_LOG_LEVEL_IS_ENABLED(x) ((x) > GUC_LOG_LEVEL_DISABLED)
#define GUC_LOG_LEVEL_IS_VERBOSE(x) ((x) > GUC_LOG_LEVEL_NON_VERBOSE)
#define GUC_LOG_LEVEL_TO_VERBOSITY(x) ({ \
typeof(x) _x = (x); \
GUC_LOG_LEVEL_IS_VERBOSE(_x) ? _x - 2 : 0; \
})
#define GUC_VERBOSITY_TO_LOG_LEVEL(x) ((x) + 2)
#define GUC_LOG_LEVEL_MAX GUC_VERBOSITY_TO_LOG_LEVEL(GUC_LOG_VERBOSITY_MAX)
int xe_guc_log_init(struct xe_guc_log *log);
void xe_guc_log_print(struct xe_guc_log *log, struct drm_printer *p);
static inline u32
xe_guc_log_get_level(struct xe_guc_log *log)
{
return log->level;
}
#endif
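
The comment above is the key to these macros: the driver's single log level folds "enabled" and "verbosity" into one number, and GUC_LOG_LEVEL_TO_VERBOSITY() peels the verbosity back out when building the GUC_CTL_DEBUG dword. A minimal sketch of that conversion, assuming the same scheme as the macros (example_debug_flags() itself is hypothetical):

/* Sketch only: translate a driver log level into GUC_CTL_DEBUG verbosity flags */
static u32 example_debug_flags(u32 level)
{
	/* Levels with no verbosity component map to the 'disabled' flag */
	if (!GUC_LOG_LEVEL_IS_VERBOSE(level))
		return GUC_LOG_DISABLED;

	return GUC_LOG_LEVEL_TO_VERBOSITY(level) << GUC_LOG_VERBOSITY_SHIFT;
}
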

View File

@ -0,0 +1,23 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef _XE_GUC_LOG_TYPES_H_
#define _XE_GUC_LOG_TYPES_H_
#include <linux/types.h>
struct xe_bo;
/**
* struct xe_guc_log - GuC log
*/
struct xe_guc_log {
/** @level: GuC log level */
u32 level;
/** @bo: XE BO for GuC log */
struct xe_bo *bo;
};
#endif

Some files were not shown because too many files have changed in this diff.