Commit Graph

1157 Commits

Author SHA1 Message Date
Hawking Zhang 65ba96e91b drm/amdgpu: Move to common indirect reg access helper
Replace soc15, nv, soc21 specific callbacks with common
one. so we don't need to duplicate code when introduce
new asics.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-15 18:45:27 -04:00
Guchun Chen c69d51395a drm/amdgpu: move poll enabled/disable into non DC path
Some amd asics having reliable hotplug support don't call
drm_kms_helper_poll_init in driver init sequence. However,
due to the unified suspend/resume path for all asics, because
the output_poll_work->func is not set for these asics, a warning
arrives when suspending.

[   90.656049]  <TASK>
[   90.656050]  ? console_unlock+0x4d/0x100
[   90.656053]  ? __irq_work_queue_local+0x27/0x60
[   90.656056]  ? irq_work_queue+0x2b/0x50
[   90.656057]  ? __wake_up_klogd+0x40/0x60
[   90.656059]  __cancel_work_timer+0xed/0x180
[   90.656061]  drm_kms_helper_poll_disable.cold+0x1f/0x2c [drm_kms_helper]
[   90.656072]  amdgpu_device_suspend+0x81/0x170 [amdgpu]
[   90.656180]  amdgpu_pmops_runtime_suspend+0xb5/0x1b0 [amdgpu]
[   90.656269]  pci_pm_runtime_suspend+0x61/0x1b0

drm_kms_helper_poll_enable/disable is valid when poll_init is called in
amdgpu code, which is only used in non DC path. So move such codes into
non-DC path code to get rid of such warnings.

v1: introduce use_kms_poll flag in amdgpu as the poll stuff check
v2: use dc_enabled as the flag to simply code
v3: move code into non DC path instead of relying on any flag

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2411
Fixes: a4e771729a ("drm/probe_helper: sort out poll_running vs poll_enabled")
Reported-by: Bert Karwatzki <spasswolf@web.de>
Suggested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-13 17:27:48 -04:00
Guchun Chen 6ab68650a1 drm/amdgpu: use drm_device pointer directly rather than convert again
The convert from adev is redundant.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-13 17:27:48 -04:00
Guchun Chen 53e9d836ea drm/amdgpu: drop pm_sysfs_en flag from amdgpu_device structure
pm_sysfs_en is overlapped with pm.sysfs_initialized, so drop it
for simplifying code(no functional change).

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-13 17:27:48 -04:00
Bjorn Helgaas 58265640fb drm/amdgpu: Drop redundant pci_enable_pcie_error_reporting()
pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-08 14:05:57 -05:00
Orlando Chamberlain d37a3929ca drm/amdgpu: register a vga_switcheroo client for MacBooks with apple-gmux
Commit 3840c5bcc2 ("drm/amdgpu: disentangle runtime pm and
vga_switcheroo") made amdgpu only register a vga_switcheroo client for
GPU's with PX, however AMD GPUs in dual gpu Apple Macbooks do need to
register, but don't have PX. Instead of AMD's PX, they use apple-gmux.

Use apple_gmux_detect() to identify these gpus, and
pci_is_thunderbolt_attached() to ensure eGPUs connected to Dual GPU
Macbooks don't register with vga_switcheroo.

Fixes: 3840c5bcc2 ("drm/amdgpu: disentangle runtime pm and vga_switcheroo")
Link: https://lore.kernel.org/amd-gfx/20230210044826.9834-10-orlandoch.dev@gmail.com/
Signed-off-by: Orlando Chamberlain <orlandoch.dev@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-07 14:22:41 -05:00
Jack Xiao dc907c9db8 drm/amd/amdgpu: fix warning during suspend
Freeing memory was warned during suspend.
Move the self test out of suspend.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=2151825
Cc: jfalempe@redhat.com
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-and-tested-by: Evan Quan <evan.quan@amd.com>
Tested-by: Jocelyn Falempe <jfalempe@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-15 22:19:43 -05:00
Kent Russell 2c496a6cf4 drm/amdgpu: Fix incorrect filenames in sysfs comments
This looks like a standard copy/paste mistake. Replace the incorrect
serial_number references with product_name and product_model

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-08 17:35:36 -05:00
Vitaly Prosyak 39934d3ed5 Revert "drm/amdgpu: TA unload messages are not actually sent to psp when amdgpu is uninstalled"
This reverts commit fac53471d0.
The following change: move the drm_dev_unplug call after
amdgpu_driver_unload_kms in amdgpu_pci_remove. The reason is
the following: amdgpu_pci_remove calls drm_dev_unregister
and it should be called first to ensure userspace can't access the
device instance anymore. If we call drm_dev_unplug after
amdgpu_driver_unload_kms then we observe IGT PCI software unplug
test failure (kernel hung) for all ASICs. This is how this
regression was found.

After this revert, the following commands do work not, but it would
be fixed in the next commit:
 - sudo modprobe -r amdgpu
 - sudo modprobe amdgpu

Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-31 13:59:21 -05:00
Dave Airlie 155c6b16ee Merge tag 'amd-drm-next-6.3-2023-01-27' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.3-2023-01-27:

amdgpu:
- GC11 fixes
- SMU13 fixes
- Freesync fixes
- DP MST fixes
- DP MST code rework and cleanup
- AV1 fixes for VCN4
- DCN 3.2.x fixes
- PSR fixes
- DML optimizations
- DC link code rework

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127225917.2419162-1-alexander.deucher@amd.com
2023-01-30 15:37:57 +10:00
Dave Airlie 7dd1be30f0 Merge tag 'amd-drm-next-6.3-2023-01-20' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.3-2023-01-20:

amdgpu:
- Secure display fixes
- Fix scaling
- Misc code cleanups
- Display BW alloc logic updates
- DCN 3.2 fixes
- Fix power reporting on certain firmwares for CZN/RN
- SR-IOV fixes
- Link training cleanup and code rework
- HDCP fixes
- Reserved VMID fix
- Documentation updates
- Colorspace fixes
- RAS updates
- GC11.0 fixes
- VCN instance harvesting fixes
- DCN 3.1.4/5 workarounds for S/G displays
- Add PCIe info to the INFO IOCTL

amdkfd:
- XNACK fix

UAPI:
- Add PCIe gen/lanes info to the amdgpu INFO IOCTL
  Nesa ultimately plans to use this to make decisions about buffer placement optimizations
  Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20790

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230120234523.7610-1-alexander.deucher@amd.com
2023-01-25 12:07:53 +10:00
Tim Huang e11c775030 drm/amdgpu: skip psp suspend for IMU enabled ASICs mode2 reset
The psp suspend & resume should be skipped to avoid destroy
the TMR and reload FWs again for IMU enabled APU ASICs.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-24 12:27:37 -05:00
Daniel Vetter b8f55f24bc drm-misc-next for $kernel-version:
UAPI Changes:
 
 Cross-subsystem Changes:
 
 Core Changes:
 
  * Cleanup unneeded include statements wrt <linux/fb.h>, <drm/drm_fb_helper.h>
    and <drm/drm_crtc_helper.h>
 
  * Remove unused helper DRM_DEBUG_KMS_RATELIMITED()
 
  * fbdev: Remove obsolete aperture field from struct fb_device, plus
    driver cleanups; Remove unused flag FBINFO_MISC_FIRMWARE
 
  * MIPI-DSI: Fix brightness, plus rsp. driver updates
 
  * scheduler: Deprecate drm_sched_resubmit_jobs()
 
  * ttm: Fix MIPS build; Remove ttm_bo_wait(); Documentation fixes
 
 Driver Changes:
 
  * Remove obsolete drivers for userspace modesetting i810, mga, r128,
    savage, sis, tdfx, via
 
  * bridge: Support CDNS DSI J721E, plus DT bindings; lt9611: Various
    fixes and improvements; sil902x: Various fixes; Fixes
 
  * nouveau: Removed support for legacy ioctls; Replace zero-size array;
    Cleanups
 
  * panel: Fixes
 
  * radeon: Use new DRM logging helpers
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmPJAq4ACgkQaA3BHVML
 eiNEIgf+I0R9KmX890K4usKG9LfPH/nIv+4Am6x4/4lv0PzN2vYGhoyPJG8cyNvs
 KFms+lTUJBBgHeTG8S8NU1qKWUlA78eYQz8S4dbaocchsAiPTHq4f5J45zbQWMGI
 P56iNAflaO2ETtb3CsH0P0TPsW2TpZC3dvZUYpAEQDli66Bn2BCPCYspt4scOhZX
 S9usD28sB6L9AnALcCUMLqF4DUsW4FC8Zz46hKVUFlQpN5dcC1b0x0gyclyWy0wh
 yi1fkqzBB3N44JOIFFwan/KxQttgvrc9Shllkqss525AhE+v3afkK2i9ZXgdckuU
 kLC09pn6yuxubYgS0vJEU1bsqiMs+Q==
 =WjQb
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2023-01-19' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for $kernel-version:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:

 * Cleanup unneeded include statements wrt <linux/fb.h>, <drm/drm_fb_helper.h>
   and <drm/drm_crtc_helper.h>

 * Remove unused helper DRM_DEBUG_KMS_RATELIMITED()

 * fbdev: Remove obsolete aperture field from struct fb_device, plus
   driver cleanups; Remove unused flag FBINFO_MISC_FIRMWARE

 * MIPI-DSI: Fix brightness, plus rsp. driver updates

 * scheduler: Deprecate drm_sched_resubmit_jobs()

 * ttm: Fix MIPS build; Remove ttm_bo_wait(); Documentation fixes

Driver Changes:

 * Remove obsolete drivers for userspace modesetting i810, mga, r128,
   savage, sis, tdfx, via

 * bridge: Support CDNS DSI J721E, plus DT bindings; lt9611: Various
   fixes and improvements; sil902x: Various fixes; Fixes

 * nouveau: Removed support for legacy ioctls; Replace zero-size array;
   Cleanups

 * panel: Fixes

 * radeon: Use new DRM logging helpers

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/Y8kDk5YX7Yz3eRhM@linux-uq9g
2023-01-24 17:36:29 +01:00
Thomas Zimmermann 973ad6273c drm/amdgpu: Remove unnecessary include statements for drm_crtc_helper.h
Several source files include drm_crtc_helper.h without needing it or
only to get its transitive include statements; leading to unnecessary
compile-time dependencies.

Directly include required headers and drop drm_crtc_helper.h where
possible.

v2:
	* keep includes sorted in amdgpu_device.c (Sam)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230116131235.18917-4-tzimmermann@suse.de
2023-01-18 09:25:30 +01:00
Thomas Zimmermann 53a17b6b75 drm/amdgpu: Fix coding style
Align a closing brace and remove trailing whitespaces. No functional
changes.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-17 16:11:52 -05:00
Mario Limonciello ced6950276 drm/amd: Evaluate early init for all IP blocks even if one fails
If early init fails for a single IP block, then no further IP blocks
are evaluated.  This means that if a user was missing more than one
firmware binary they would have to keep adding binaries and re-probing
until they discovered the ones missing.

To make this easier, run early init for each IP block and report a single
failure if not all passed.

Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-17 16:11:51 -05:00
Dave Airlie 1f1c24dee2 Merge tag 'amd-drm-next-6.3-2023-01-13' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.3-2023-01-13:

amdgpu:
- Fix possible segfault in failure case
- Rework FW requests to happen in early_init for all IPs so
  that we don't lose the sbios console if FW is missing
- PSR fixes
- Misc cleanups
- Unload fix
- SMU13 fixes

amdkfd:
- Fix for cleared VRAM BOs
- Fix cleanup if GPUVM creation fails
- Memory accounting fix
- Use resource_size rather than open codeing it
- GC11 mGPU fix

radeon:
- Fix memory leak on shutdown

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230113225911.7776-1-alexander.deucher@amd.com
2023-01-16 15:04:13 +10:00
Dave Airlie 45be204806 Merge tag 'amd-drm-next-6.3-2023-01-06' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.3-2023-01-06:

amdgpu:
- secure display support for multiple displays
- DML optimizations
- DCN 3.2 updates
- PSR updates
- DP 2.1 updates
- SR-IOV RAS updates
- VCN RAS support
- SMU 13.x updates
- Switch 1 element arrays to flexible arrays
- Add RAS support for DF 4.3
- Stack size improvements
- S0ix rework
- Soft reset fix
- Allow 0 as a vram limit on APUs
- Display fixes
- Misc code cleanups
- Documentation fixes
- Handle profiling modes for SMU13.x

amdkfd:
- Error handling fixes
- PASID fixes

radeon:
- Switch 1 element arrays to flexible arrays

drm:
- Add DP adaptive sync DPCD definitions

UAPI:
- Add new INFO queries for peak and min sclk/mclk for profile modes on newer chips
  Proposed mesa patch: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/278

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230106222037.7870-1-alexander.deucher@amd.com
2023-01-16 14:00:12 +10:00
Mario Limonciello b31d306378 drm/amd: Use `amdgpu_ucode_*` helpers for GPU info bin
The `amdgpu_ucode_request` helper will ensure that the return code for
missing firmware is -ENODEV so that early_init can fail.

The `amdgpu_ucode_release` helper is for symmetry on unloading.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-10 14:32:58 -05:00
Mario Limonciello b7cdb41e7d drm/amd: Delay removal of the firmware framebuffer
Removing the firmware framebuffer from the driver means that even
if the driver doesn't support the IP blocks in a GPU it will no
longer be functional after the driver fails to initialize.

This change will ensure that unsupported IP blocks at least cause
the driver to work with the EFI framebuffer.

Cc: stable@vger.kernel.org
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-09 16:26:48 -05:00
Daniel Vetter 03a0a10408 drm-misc-next for v6.3:
UAPI Changes:
 
  * connector: Support analog-TV mode property
 
  * media: Add MEDIA_BUS_FMT_RGB565_1X24_CPADHI,
    MEDIA_BUS_FMT_RGB666_1X18 and MEDIA_BUS_FMT_RGB666_1X24_CPADHI
 
 Cross-subsystem Changes:
 
  * dma-buf: Documentation fixes
 
  * i2c: Introduce i2c_client_get_device_id() helper
 
 Core Changes:
 
  * Improve support for analog TV output
 
  * bridge: Remove unused drm_bridge_chain functions
 
  * debugfs: Add per-device helpers and convert various DRM drivers
 
  * dp-mst: Various fixes
 
  * fbdev emulation: Always pick 32 bpp as default
 
  * KUnit: Add tests for managed helpers; Various cleanups
 
  * panel-orientation: Add quirks for Lenovo Yoga Tab 3 X90F and DynaBook K50
 
  * TTM: Open-code ttm_bo_wait() and remove the helper
 
 Driver Changes:
 
  * Fix preferred depth and bpp values throughout DRM drivers
 
  * Remove #CONFIG_PM guards throughout DRM drivers
 
  * ast: Various fixes
 
  * bridge: Implement i2c's probe_new in various drivers; Fixes; ite-it6505:
    Locking fixes, Cache EDID data; ite-it66121: Support IT6610 chip,
    Cleanups; lontium-tl9611: Fix HDMI on DragonBoard 845c; parade-ps8640:
    Use atomic bridge functions
 
  * gud: Convert to DRM shadow-plane helpers; Perform flushing synchronously
    during atomic update
 
  * ili9486: Support 16-bit pixel data
 
  * imx: Split off IPUv3 driver; Various fixes
 
  * mipi-dbi: Convert to DRM shadow-plane helpers plus rsp driver changes;i
    Support separate I/O-voltage supply
 
  * mxsfb: Depend on ARCH_MXS or ARCH_MXC
 
  * omapdrm: Various fixes
 
  * panel: Use ktime_get_boottime() to measure power-down delay in various
    drivers; Fix auto-suspend delay in various drivers; orisetech-ota5601a:
    Add support
 
  * sprd: Cleanups
 
  * sun4i: Convert to new TV-mode property
 
  * tidss: Various fixes
 
  * v3d: Various fixes
 
  * vc4: Convert to new TV-mode property; Support Kunit tests; Cleanups;
    dpi: Support RGB565 and RGB666 formats; dsi: Convert DSI driver to
    bridge
 
  * virtio: Improve tracing
 
  * vkms: Support small cursors in IGT tests; Various fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmO0B0QACgkQaA3BHVML
 eiMKtAf9EXt5yaEonR4gsGNVD70VkebW+jPEWbEMg5hMFDHE9sjBja8T7bOxeeL1
 BVofJiAuEZW9176eqeWvShuwOuiE7lf4WQLMvXfFmtHF/Nac9HUtEvOcvc1vEDUB
 y6VFlVHe8mSp+Iy0WPLyZCtT4d7v2eM+VYm0HgFa74dSTzQTLGPiUI/XVb/YDaA7
 FHKB5NvsMX9S1XvjxCsq0jA5bo8SSzh5CVKerdAZkBJMCSmKiY09o2p5C7vw3EAz
 WUXL5CXqd9bwgvZa/9ZClQtpJaikNGoOaMSEl67rbWdVTk0LIpqx3nTY15py5LXd
 SG4Wtfor3Vf1REa9TrR2uCIOmh+1gQ==
 =0/PC
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2023-01-03' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v6.3:

UAPI Changes:

 * connector: Support analog-TV mode property
 * media: Add MEDIA_BUS_FMT_RGB565_1X24_CPADHI,
   MEDIA_BUS_FMT_RGB666_1X18 and MEDIA_BUS_FMT_RGB666_1X24_CPADHI

Cross-subsystem Changes:

 * dma-buf: Documentation fixes
 * i2c: Introduce i2c_client_get_device_id() helper

Core Changes:

 * Improve support for analog TV output
 * bridge: Remove unused drm_bridge_chain functions
 * debugfs: Add per-device helpers and convert various DRM drivers
 * dp-mst: Various fixes
 * fbdev emulation: Always pick 32 bpp as default
 * KUnit: Add tests for managed helpers; Various cleanups
 * panel-orientation: Add quirks for Lenovo Yoga Tab 3 X90F and DynaBook K50
 * TTM: Open-code ttm_bo_wait() and remove the helper

Driver Changes:

 * Fix preferred depth and bpp values throughout DRM drivers
 * Remove #CONFIG_PM guards throughout DRM drivers
 * ast: Various fixes
 * bridge: Implement i2c's probe_new in various drivers; Fixes; ite-it6505:
   Locking fixes, Cache EDID data; ite-it66121: Support IT6610 chip,
   Cleanups; lontium-tl9611: Fix HDMI on DragonBoard 845c; parade-ps8640:
   Use atomic bridge functions
 * gud: Convert to DRM shadow-plane helpers; Perform flushing synchronously
   during atomic update
 * ili9486: Support 16-bit pixel data
 * imx: Split off IPUv3 driver; Various fixes
 * mipi-dbi: Convert to DRM shadow-plane helpers plus rsp driver changes;i
   Support separate I/O-voltage supply
 * mxsfb: Depend on ARCH_MXS or ARCH_MXC
 * omapdrm: Various fixes
 * panel: Use ktime_get_boottime() to measure power-down delay in various
   drivers; Fix auto-suspend delay in various drivers; orisetech-ota5601a:
   Add support
 * sprd: Cleanups
 * sun4i: Convert to new TV-mode property
 * tidss: Various fixes
 * v3d: Various fixes
 * vc4: Convert to new TV-mode property; Support Kunit tests; Cleanups;
   dpi: Support RGB565 and RGB666 formats; dsi: Convert DSI driver to
   bridge
 * virtio: Improve tracing
 * vkms: Support small cursors in IGT tests; Various fixes

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/Y7QIwlfElAYWxRcR@linux-uq9g
2023-01-04 14:59:25 +01:00
Christian König 7ccfd79fdd drm/amdgpu: rename vram_scratch into mem_scratch
Rename vram_scratch into mem_scratch and allow allocating it into GTT as
well.

The only problem with that is that we won't have a default page for the
system aperture any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-03 16:50:03 -05:00
Christian König 58ab2c08d7 drm/amdgpu: use VRAM|GTT for a bunch of kernel allocations
Technically all of those can use GTT as well, no need to force things
into VRAM.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-03 16:49:54 -05:00
Likun Gao 360cd08196 drm/amdgpu: adjust the sequence to check soft reset
1.Drop soft reset check when do should recover gpu check.
  (As it will skip gpu reset operation if some ip is hang but
   not support soft reset)
2.Check soft reset status before do soft reset when pre asic reset.
  a. If check soft reset return true, it means: some ip is hang and
     it also support soft reset, will try soft reset first.
  b. If check soft reset return false, it means:
       I.  All the ip are not hang, will skip gpu reset.
       II. Some ip is hang but not support soft reset, will skip soft
           reset and retry with full reset later.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-01-03 16:49:26 -05:00
Alex Deucher afa6646b1c drm/amdgpu: skip MES for S0ix as well since it's part of GFX
It's also part of gfxoff.

Cc: stable@vger.kernel.org # 6.0, 6.1
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-12-20 13:08:12 -05:00
Alex Deucher 5620a1889e drm/amdgpu: skip MES for S0ix as well since it's part of GFX
It's also part of gfxoff.

Cc: stable@vger.kernel.org # 6.0, 6.1
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-12-20 12:47:07 -05:00
Alex Deucher 5804463a65 Revert "drm/amdgpu: disallow gfxoff until GC IP blocks complete s2idle resume"
This reverts commit f543d28687.

This is no longer needed since we no longer touch SDMA 5.x for s0i3.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-12-20 12:46:58 -05:00
Alex Deucher 2a7798ea73 drm/amdgpu: for S0ix, skip SDMA 5.x+ suspend/resume
SDMA 5.x is part of the GFX block so it's controlled via
GFXOFF.  Skip suspend as it should be handled the same
as GFX.

v2: drop SDMA 4.x.  That requires special handling.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-12-20 12:46:55 -05:00
Alex Deucher 47198eb721 drm/amdgpu: don't mess with SDMA clock or powergating in S0ix
It's handled by GFXOFF for SDMA 5.x and SMU saves the state on
SDMA 4.x.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-12-20 12:46:52 -05:00
Shikang Fan 47ea20762b drm/amdgpu: Add an extra evict_resource call during device_suspend.
- evict_resource is taking too long causing sriov full access mode timeout.
  So, add an extra evict_resource in the beginning as an early evict.

Signed-off-by: Shikang Fan <shikang.fan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-12-13 17:10:32 -05:00
Christian König 9bff18d134 drm/ttm: use per BO cleanup workers
Instead of a single worker going over the list of delete BOs in regular
intervals use a per BO worker which blocks for the resv object and
locking of the BO.

This not only simplifies the handling massively, but also results in
much better response time when cleaning up buffers.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221125102137.1801-3-christian.koenig@amd.com
2022-12-06 10:53:20 +01:00
Christian König cd3a8a5962 drm/ttm: remove ttm_bo_(un)lock_delayed_workqueue
Those functions never worked correctly since it is still perfectly
possible that a buffer object is released and the background worker
restarted even after calling them.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221125102137.1801-2-christian.koenig@amd.com
2022-12-06 10:28:12 +01:00
Liang He dfd0287bd3 drm/amdgpu: Fix potential double free and null pointer dereference
In amdgpu_get_xgmi_hive(), we should not call kfree() after
kobject_put() as the PUT will call kfree().

In amdgpu_device_ip_init(), we need to check the returned *hive*
which can be NULL before we dereference it.

Signed-off-by: Liang He <windhl@126.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-29 11:03:37 -05:00
Shikang Fan 3c22c1ead6 drm/amdgpu: fix for suspend/resume kiq fence fallback under sriov
- in device_resume, sriov configure interrupt should be in full access,
  so release_full_gpu should be done after kfd_resume.
- remove the previous workaround solution for sriov.

Fixes: ec4927d463 ("drm/amdgpu: fix for suspend/resume sequence under sriov")
Signed-off-by: Shikang Fan <shikang.fan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-23 10:31:31 -05:00
Yang Yingliang b85e285e3d drm/amdgpu: fix pci device refcount leak
As comment of pci_get_domain_bus_and_slot() says, it returns
a pci device with refcount increment, when finish using it,
the caller must decrement the reference count by calling
pci_dev_put().

So before returning from amdgpu_device_resume|suspend_display_audio(),
pci_dev_put() is called to avoid refcount leak.

Fixes: 3f12acc8d6 ("drm/amdgpu: put the audio codec into suspend state before gpu reset V3")
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-23 09:47:13 -05:00
Dave Airlie fc58764bbf Merge tag 'amd-drm-next-6.2-2022-11-18' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.2-2022-11-18:

amdgpu:
- SR-IOV fixes
- Clean up DC checks
- DCN 3.2.x fixes
- DCN 3.1.x fixes
- Don't enable degamma on asics which don't support it
- IP discovery fixes
- BACO fixes
- Fix vbios allocation handling when vkms is enabled
- Drop buggy tdr advanced mode GPU reset handling
- Fix the build when DCN is not set in kconfig
- MST DSC fixes
- Userptr fixes
- FRU and RAS EEPROM fixes
- VCN 4.x RAS support
- Aldrebaran CU occupancy reporting fix
- PSP ring cleanup

amdkfd:
- Memory limit fix
- Enable cooperative launch on gfx 10.3

amd-drm-next-6.2-2022-11-11:

amdgpu:
- SMU 13.x updates
- GPUVM TLB race fix
- DCN 3.1.4 updates
- DCN 3.2.x updates
- PSR fixes
- Kerneldoc fix
- Vega10 fan fix
- GPUVM locking fixes in error pathes
- BACO fix for Beige Goby
- EEPROM I2C address cleanup
- GFXOFF fix
- Fix DC memory leak in error pathes
- Flexible array updates
- Mtype fix for GPUVM PTEs
- Move Kconfig into amdgpu directory
- SR-IOV updates
- Fix possible memory leak in CS IOCTL error path

amdkfd:
- Fix possible memory overrun
- CRIU fixes

radeon:
- ACPI ref count fix
- HDA audio notifier support
- Move Kconfig into radeon directory

UAPI:
- Add new GEM_CREATE flags to help to transition more KFD functionality to the DRM UAPI.
  These are used internally in the driver to align location based memory coherency
  requirements from memory allocated in the KFD with how we manage GPUVM PTEs.  They
  are currently blocked in the GEM_CREATE IOCTL as we don't have a user right now.
  They are just used internally in the kernel driver for now for existing KFD memory
  allocations. So a change to the UAPI header, but no functional change in the UAPI.

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221118170807.6505-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
2022-11-22 13:41:11 +10:00
YiPeng Chai 1a11a65d53 drm/amdgpu: Enable mode-1 reset for RAS recovery in fatal error mode
The patch is enabling mode-1 reset for RAS recovery in fatal error mode.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-17 18:07:52 -05:00
Dave Airlie 4e291f2f58 drm-misc-next for 6.2:
UAPI Changes:
 
 Cross-subsystem Changes:
 
 Core Changes:
 - atomic-helper: Add begin_fb_access and end_fb_access hooks
 - fb-helper: Rework to move fb emulation into helpers
 - scheduler: rework entity flush, kill and fini
 - ttm: Optimize pool allocations
 
 Driver Changes:
 - amdgpu: scheduler rework
 - hdlcd: Switch to DRM-managed resources
 - ingenic: Fix registration error path
 - lcdif: FIFO threshold tuning
 - meson: Fix return type of cvbs' mode_valid
 - ofdrm: multiple fixes (kconfig, types, endianness)
 - sun4i: A100 and D1 support
 - panel:
   - New Panel: Jadard JD9365DA-H3
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCY2y3vQAKCRDj7w1vZxhR
 xQu+AP9CzbNI2s12aNS8DskEZggo0lqUyRiEBaRJ2jrWZdGr0gEA1+Lc06HaKmGC
 2WBD4nw2I7ch0NUN6VFjQ9ATofevPwA=
 =AY3a
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2022-11-10-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 6.2:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
- atomic-helper: Add begin_fb_access and end_fb_access hooks
- fb-helper: Rework to move fb emulation into helpers
- scheduler: rework entity flush, kill and fini
- ttm: Optimize pool allocations

Driver Changes:
- amdgpu: scheduler rework
- hdlcd: Switch to DRM-managed resources
- ingenic: Fix registration error path
- lcdif: FIFO threshold tuning
- meson: Fix return type of cvbs' mode_valid
- ofdrm: multiple fixes (kconfig, types, endianness)
- sun4i: A100 and D1 support
- panel:
  - New Panel: Jadard JD9365DA-H3

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20221110083612.g63eaocoaa554soh@houat
2022-11-16 07:17:32 +10:00
Christian König 0788a47e7c drm/amdgpu: stop resubmittting jobs in amdgpu_pci_resume
The state of VRAM is unreliable due to a PCI event like AER, link reset
or DPC.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-15 15:25:45 -05:00
Christian König 6868a2c465 drm/amdgpu: stop resubmitting jobs for GPU reset v2
Re-submitting IBs by the kernel has many problems because pre-
requisite state is not automatically re-created as well. In
other words neither binary semaphores nor things like ring
buffer pointers are in the state they should be when the
hardware starts to work on the IBs again.

Additional to that even after more than 5 years of
developing this feature it is still not stable and we have
massively problems getting the reference counts right.

As discussed with user space developers this behavior is not
helpful in the first place. For graphics and multimedia
workloads it makes much more sense to either completely
re-create the context or at least re-submitting the IBs
from userspace.

For compute use cases re-submitting is also not very
helpful since userspace must rely on the accuracy of
the result.

Because of this we stop this practice and instead just
properly note that the fence submission was canceled. The
only use case we keep the re-submission for now is SRIOV
and function level resets.

v2: as suggested by Sshaoyun stop resubmitting jobs even for SRIOV

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-15 15:25:37 -05:00
Christian König 06a2d7cc3f drm/amdgpu: revert "implement tdr advanced mode"
This reverts commit e6c6338f39.

This feature basically re-submits one job after another to
figure out which one was the one causing a hang.

This is obviously incompatible with gang-submit which requires
that multiple jobs run at the same time. It's also absolutely
not helpful to crash the hardware multiple times if a clean
recovery is desired.

For testing and debugging environments we should rather disable
recovery alltogether to be able to inspect the state with a hw
debugger.

Additional to that the sw implementation is clearly buggy and causes
reference count issues for the hardware fence.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-15 15:25:22 -05:00
YiPeng Chai d293470e10 drm/amdgpu: Fixed the problem that ras error can't be queried after gpu recovery is completed
Amdgpu_ras_set_error_query_ready is called at the start of
amdgpu_device_gpu_recover to disable query ras error, but the
code behind only enables query ras error in full reset path,
but not in soft reset path, emergency restart path and skip
the hardware reset path.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-15 13:35:16 -05:00
Alex Deucher 220c8cc855 drm/amdgpu: there is no vbios fb on devices with no display hw (v2)
If we enable virtual display functionality on parts with
no display hardware we can end up trying to check for and
reserve the vbios FB area on devices where it doesn't exist.
Check if display hardware is actually present on the hardware
before trying to reserve the memory.

v2: move the check into common code

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-15 13:35:16 -05:00
Alex Deucher d09ef24303 drm/amdgpu: clarify DC checks
There are several places where we don't want to check
if a particular asic could support DC, but rather, if
DC is enabled.  Set a flag if DC is enabled and check
for that rather than if a device supports DC or not.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-15 11:51:45 -05:00
Alex Deucher 25263da376 drm/amdgpu: rework SR-IOV virtual display handling
virtual display is enabled unconditionally in SR-IOV, but
without specifying the virtual_display module, the number
of crtcs defaults to 0.  Set a single display by default
for SR-IOV if the virtual_display parameter is not set.
Only enable virtual display by default on SR-IOV on asics
which actually have display hardware.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-15 11:51:32 -05:00
Thomas Zimmermann 45b64fd9f7 drm/fb-helper: Remove unnecessary include statements
Remove include statements for <drm/drm_fb_helper.h> where it is not
required (i.e., most of them). In a few places include other header
files that are required by the source code.

v3:
	* fix amdgpu include statements
	* fix rockchip include statements

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221103151446.2638-23-tzimmermann@suse.de
2022-11-05 17:12:04 +01:00
Victor Zhao ec4927d463 drm/amdgpu: fix for suspend/resume sequence under sriov
- clear kiq ring after suspend/resume under sriov to aviod kiq ring
test failure
- update irq after resume to fix kiq interrput loss

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-04 16:05:53 -04:00
Yiqing Yao 8a1fbb4a5e drm/amdgpu: Disable MCBP from soc21 for SRIOV
[why]
Start from soc21, CP does not support MCBP, so disable it.

[how]
Used amgpu_mcbp flag alone instead of checking if is in SRIOV to
enable/disable MCBP.
Only set flag to enable on asic_type prior to soc21 in SRIOV.

Signed-off-by: Yiqing Yao <yiqing.yao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-04 16:05:53 -04:00
Mario Limonciello 7863c15526 drm/amd: Fail the suspend if resources can't be evicted
If a system does not have swap and memory is under 100% usage,
amdgpu will fail to evict resources.  Currently the suspend
carries on proceeding to reset the GPU:

```
[drm] evicting device resources failed
[drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <vcn_v3_0> failed -12
[drm] free PSP TMR buffer
[TTM] Failed allocating page table
[drm] evicting device resources failed
amdgpu 0000:03:00.0: amdgpu: MODE1 reset
amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset
amdgpu 0000:03:00.0: amdgpu: GPU smu mode1 reset
```

At this point if the suspend actually succeeded I think that amdgpu
would have recovered because the GPU would have power cut off and
restored.  However the kernel fails to continue the suspend from the
memory pressure and amdgpu fails to run the "resume" from the aborted
suspend.

```
ACPI: PM: Preparing to enter system sleep state S3
SLUB: Unable to allocate memory on node -1, gfp=0xdc0(GFP_KERNEL|__GFP_ZERO)
  cache: Acpi-State, object size: 80, buffer size: 80, default order: 0, min order: 0
  node 0: slabs: 22, objs: 1122, free: 0
ACPI Error: AE_NO_MEMORY, Could not update object reference count (20210730/utdelete-651)

[drm:psp_hw_start [amdgpu]] *ERROR* PSP load kdb failed!
[drm:psp_resume [amdgpu]] *ERROR* PSP resume failed
[drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP block <psp> failed -62
amdgpu 0000:03:00.0: amdgpu: amdgpu_device_ip_resume failed (-62).
PM: dpm_run_callback(): pci_pm_resume+0x0/0x100 returns -62
amdgpu 0000:03:00.0: PM: failed to resume async: error -62
```

To avoid this series of unfortunate events, fail amdgpu's suspend
when the memory eviction fails.  This will let the system gracefully
recover and the user can try suspend again when the memory pressure
is relieved.

Reported-by: post@davidak.de
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2223
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27 15:12:09 -04:00
wangjianli 12024b1761 amd/amdgpu: fix repeated words in comments
Delete the redundant word 'the'.

Signed-off-by: wangjianli <wangjianli@cdjrlc.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24 14:38:46 -04:00