Commit graph

646 commits

Author SHA1 Message Date
Oak Zeng
14568cf658 drm/amdkfd: Expose sdma engine numbers to topology
Expose available numbers of both SDMA queue types in the topology.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:02 -05:00
Oak Zeng
1b4670f698 drm/amdkfd: Introduce XGMI SDMA queue type
Existing QUEUE_TYPE_SDMA means PCIe optimized SDMA queues.
Introduce a new QUEUE_TYPE_SDMA_XGMI, which is optimized
for non-PCIe transfer such as XGMI.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:02 -05:00
Oak Zeng
065e4bdfa1 drm/amdkfd: Fix sdma queue map issue
Previous codes assumes there are two sdma engines.
This is not true e.g., Raven only has 1 SDMA engine.
Fix the issue by using sdma engine number info in
device_info.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:02 -05:00
Yong Zhao
e78579aab7 drm/amdkfd: Move sdma_queue_id calculation into allocate_sdma_queue()
This avoids duplicated code.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:02 -05:00
Oak Zeng
0803e7a9e8 drm/amdkfd: Allocate hiq and sdma mqd from mqd trunk
Instead of allocat hiq and sdma mqd from sub-allocator, allocate
them from a mqd trunk pool. This is done for all asics

Signed-off-by: Oak Zeng <ozeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:02 -05:00
Oak Zeng
d1f8f0d17d drm/amdkfd: Move non-sdma mqd allocation out of init_mqd
This is preparation work to introduce more mqd allocation
scheme

Signed-off-by: Oak Zeng <ozeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:02 -05:00
Oak Zeng
e73390d181 drm/amdkfd: Fix a potential memory leak
Free mqd_mem_obj it GTT buffer allocation for MQD+control stack fails.

Signed-off-by: Oak Zeng <ozeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:02 -05:00
Oak Zeng
11614c36bc drm/amdkfd: Allocate MQD trunk for HIQ and SDMA
MEC FW for some new asic requires all SDMA MQDs to be in a continuous
trunk of memory right after HIQ MQD. Add a field in device queue manager
to hold the HIQ/SDMA MQD memory object and allocate MQD trunk on device
queue manager initialization.

Signed-off-by: Oak Zeng <ozeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:02 -05:00
Oak Zeng
6c6cde557a drm/amdkfd: Add mqd size in mqd manager struct
Also initialize mqd size on mqd manager initialization

Signed-off-by: Oak Zeng <ozeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:02 -05:00
Oak Zeng
fdfa090bc9 drm/amdkfd: Init mqd managers in device queue manager init
Previously mqd managers was initialized on demand. As there
are only a few type of mqd managers, the on demand initialization
doesn't save too much memory. Initialize them on device
queue initialization instead and delete the get_mqd_manager
interface. This makes codes more organized for future changes.

Signed-off-by: Oak Zeng <ozeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:02 -05:00
Oak Zeng
59f650a06f drm/amdkfd: Introduce DIQ type mqd manager
With introduction of new mqd allocation scheme for HIQ,
DIQ and HIQ use different mqd allocation scheme, DIQ
can't reuse HIQ mqd manager

Signed-off-by: Oak Zeng <ozeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:02 -05:00
Oak Zeng
972fcdb52f drm/amdkfd: Introduce asic-specific mqd_manager_init function
Global function mqd_manager_init just calls asic-specific functions and it
is not necessary. Delete it and introduce a mqd_manager_init interface in
dqm for asic-specific mqd manager init. Call mqd_manager_init interface
directly to initialize mqd manager

Signed-off-by: Oak Zeng <ozeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:02 -05:00
Philip Yang
89cd9d23e9 drm/amdkfd: avoid HMM change cause circular lock
There is circular lock between gfx and kfd path with HMM change:
lock(dqm) -> bo::reserve -> amdgpu_mn_lock

To avoid this, move init/unint_mqd() out of lock(dqm), to remove nested
locking between mmap_sem and bo::reserve. The locking order
is: bo::reserve -> amdgpu_mn_lock(p->mn)

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:01 -05:00
Jay Cornwall
fa722f0d98 drm/amdkfd: Preserve ttmp[4:5] instead of ttmp[14:15]
ttmp[4:5] is initialized by the SPI with SPI_GDBG_TRAP_DATA* values.
These values are more useful to the debugger than ttmp[14:15], which
carries dispatch_scratch_base*. There are too few registers to
preserve both.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:01 -05:00
Jay Cornwall
5883600901 drm/amdkfd: Fix gfx9 XNACK state save/restore
SQ_WAVE_IB_STS.RCNT grew from 4 bits to 5 in gfx9. Do not truncate
when saving in the high bits of TTMP1.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:01 -05:00
Jay Cornwall
157e586dc9 drm/amdkfd: Preserve wave state after instruction fetch MEM_VIOL
If instruction fetch fails the wave cannot be halted and returned to
the shader without raising MEM_VIOL again. Currently the wave is
terminated if this occurs, but this loses information about the cause
of the fault. The debugger would prefer the faulting wave state to be
context-saved.

Poll inside the trap handler until TRAPSTS.SAVECTX indicates context
save is ready. Exit the poll loop and complete the remainder of the
exception handler, then return to the shader. The next instruction
fetch will be from the trap handler and not the faulting PC. Context
save will then deschedule the wave and save its state.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:01 -05:00
Jay Cornwall
2db2f25959 drm/amdkfd: Fix gfx8 MEM_VIOL exception handler
When MEM_VIOL is asserted the context save handler rewinds the
program counter. This is incorrect for any source of the exception.
MEM_VIOL may be raised in normal operation by out-of-bounds access
to LDS or GDS and does not require special handling.

Remove PC adjustment when MEM_VIOL has been raised.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:01 -05:00
Harish Kasiviswanathan
f756e6319c drm/amdkfd: Fix compute profile switching
Fix compute profile switching on process termination.

Add a dedicated reference counter to keep track of entry/exit to/from
compute profile. This enables switching compute profiles for other
reasons than process creation or termination.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:01 -05:00
Oak Zeng
323c71df94 drm/amdkfd: Differentiate b/t sdma_id and sdma_queue_id
sdma_queue_id is sdma queue index inside one sdma engine.
sdma_id is sdma queue index among all sdma engines. Use
those two names properly.

Signed-off-by: Oak Zeng <ozeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:01 -05:00
Oak Zeng
96eb5f9dd3 drm/amdkfd: Add sdma allocation debug message
Add debug messages during SDMA queue allocation.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:01 -05:00
Oak Zeng
cb77ee7cae drm/amdkfd: Use 64 bit sdma_bitmap
Maximumly support 64 sdma queues

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:21:01 -05:00
Kent Russell
0d87c9cfc0 drm/amdkfd: Cosmetic cleanup
Fix some spacing issues, log output, uses of !=NULL/==NULL, unneeded
extra lines and clean up a declaration from =1 to =true for clarity

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:20:48 -05:00
shaoyunl
0fb0df031a drm/amdkfd: Adjust weight to represent num_hops info when report xgmi iolink
Upper level runtime need the xgmi hops info to determine the data path

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:20:48 -05:00
Oak Zeng
d8e408a827 drm/amdkfd: Expose HDP registers to user space
Introduce a new memory type (KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP) and
expose mmio page of HDP registers to user space through this new
memory type.

v2: moved remapped hdp regs to adev struct
v3: rename the new memory type to ALLOC_MEM_FLAGS_MMIO_REMAP
v4: use more generic function name
v5: Fail remapped mmio allocation for asics before gfx9

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-24 12:20:47 -05:00
Linus Torvalds
a3b25d157d drm i915, amdgpu, vmwgfx, sun4i, panfrost, gma500 fixes. + revert build breakage
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJc58B8AAoJEAx081l5xIa+mVYP/R0nRBWy8KhQfTFIXavhDCwc
 cEgMm8K/fssSpmvsFqgx6NDo2SjqaKAzl5fRk+ilSZTFFbA52f/Q5QItQFwBGRk2
 ACJVU+e/zILWtJGyRudpk/eQLK63pft0H9HnBcrEdnyrknKe9iQt91XU+UUc2GRS
 NJGNXZqP+aSwGHfBFxtlmpWgEzcS+iwqcLC8iRtU67WCiOJls50x8pC0awWFpw9V
 SDen8x6LP0PmcUiJqz4rWLa3/UMH4lmaT14DulPkZBQjaN1Sm3J7+jO4d2fz2qQL
 YmchtMSxQhfxbon6vxJNlDFqDRy7X+/47nRLToKp5biwGYUa9vp7MWgp3vhc4/Tk
 LzwYvGhYq81J9NnAqr96FQGStXWzThamjaV6aWbKJ8zwlSki4zPxi5YKZ+xbSVhm
 aOHjC57cgv98ppg24mHd7smAoHdCePDQz/fB1KNSrAXTdit323LoRiOKHYMyYMGR
 dtAsDMt2WYaihVJSGK0HP0ZcSem6oGGFz1jRVap+zQ6suVCxdmpLqy0pn/s6QH3r
 4Tjxai2iW8oYpL9nHet4SDO2SI4RNUev4vNh84Mr5SddK5N/yAC3QNiCP5ND6LJv
 kAGoHsRJ7dnlXYtU4hKdT9LrJW7dj6+PkuyPZCPy/1y3qQoROPSVHPZwc98BYMYE
 aUpQ0E+KvOD2l/CEa2FF
 =4Wu/
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2019-05-24-1' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Nothing too unusual here for rc2. Except the amdgpu DMCU firmware
  loading fix caused build breakage with a different set of Kconfig
  options. I've just reverted it for now until the AMD folks can rewrite
  it to avoid that problem.

  i915:
   - boosting fix
   - bump ready task fixes
   - GVT - reset fix, error return, TRTT handling fix

  amdgpu:
   - DMCU firmware loading fix
   - Polaris 10 pci id for kfd
   - picasso screen corruption fix
   - SR-IOV fixes
   - vega driver reload fixes
   - SMU locking fix
   - compute profile fix for kfd

  vmwgfx:
   - integer overflow fixes
   - dma sg fix

  sun4i:
   - HDMI phy fixes

  gma500:
   - LVDS detection fix

  panfrost:
   - devfreq selection fix"

* tag 'drm-fixes-2019-05-24-1' of git://anongit.freedesktop.org/drm/drm: (32 commits)
  Revert "drm/amd/display: Don't load DMCU for Raven 1"
  drm/panfrost: Select devfreq
  drm/gma500/cdv: Check vbt config bits when detecting lvds panels
  drm/vmwgfx: integer underflow in vmw_cmd_dx_set_shader() leading to an invalid read
  drm/vmwgfx: NULL pointer dereference from vmw_cmd_dx_view_define()
  drm/vmwgfx: Use the dma scatter-gather iterator to get dma addresses
  drm/vmwgfx: Fix compat mode shader operation
  drm/vmwgfx: Fix user space handle equal to zero
  drm/vmwgfx: Don't send drm sysfs hotplug events on initial master set
  drm/i915/gvt: Fix an error code in ppgtt_populate_spt_by_guest_entry()
  drm/i915/gvt: do not let TRTTE and 0x4dfc write passthrough to hardware
  drm/i915/gvt: add 0x4dfc to gen9 save-restore list
  drm/i915/gvt: Tiled Resources mmios are in-context mmios for gen9+
  drm/i915/gvt: use cmd to restore in-context mmios to hw for gen9 platform
  drm/i915/gvt: emit init breadcrumb for gvt request
  drm/amdkfd: Fix compute profile switching
  drm/amdgpu: skip fw pri bo alloc for SRIOV
  drm/amd/powerplay: fix locking in smu_feature_set_supported()
  drm/amdgpu/gmc9: set vram_width properly for SR-IOV
  drm/amdgpu/soc15: skip reset on init
  ...
2019-05-24 09:12:46 -07:00
Thomas Gleixner
ec8f24b7fa treewide: Add SPDX license identifier - Makefile/Kconfig
Add SPDX license identifiers to all Make/Kconfig files which:

 - Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:46 +02:00
Harish Kasiviswanathan
43d8107f0b drm/amdkfd: Fix compute profile switching
Fix compute profile switching on process termination.

Add a dedicated reference counter to keep track of entry/exit to/from
compute profile. This enables switching compute profiles for other
reasons than process creation or termination.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-05-20 21:22:49 -05:00
Kent Russell
0a5a9c276c drm/amdkfd: Add missing Polaris10 ID
This was added to amdgpu but was missed in amdkfd

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.rg
2019-05-20 12:42:41 -05:00
Linus Torvalds
414147d99b pci-v5.2-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAlzZ/4MUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vwmYw/+Mzkkz/zOpzYdsYyy6Xv3qRdn92Kp
 bePOPACdwpUK+HV4qE6EEYBcVZdkO7NMkshA7wIb4VlsE0sVHSPvlybUmTUGWeFd
 CG87YytVOo4K7cAeKdGVwGaoQSeaZX3wmXVGGQtm/T4b63GdBjlNJ/MBuPWDDMlM
 XEis29MTH6xAu3MbT7pp5q+snSzOmt0RWuVpX/U1YcZdhu8fbwfOxj9Jx6slh4+2
 MvseYNNrTRJrMF0o5o83Khx3tAcW8OTTnDJ9+BCrAlE1PId1s/KjzY6nqReBtom9
 CIqtwAlx/wGkRBRgfsmEtFBhkDA05PPilhSy6k2LP8B4A3qBqir1Pd+5bhHG4FIu
 nPPCZjZs2+0DNrZwQv59qIlWsqDFm214WRln9Z7d/VNtrLs2UknVghjQcHv7rP+K
 /NKfPlAuHTI/AFi9pIPFWTMx5J4iXX1hX4LiptE9M0k9/vSiaCVnTS3QbFvp3py3
 VTT9sprzfV4JX4aqS/rbQc/9g4k9OXPW9viOuWf5rYZJTBbsu6PehjUIRECyFaO+
 0gDqE8WsQOtNNX7e5q2HJ/HpPQ+Q1IIlReC+1H56T/EQZmSIBwhTLttQMREL/8af
 Lka3/1SVUi4WG6SBrBI75ClsR91UzE6AK+h9fAyDuR6XJkbysWjkyG6Lmy617g6w
 lb+fQwOzUt4eGDo=
 =4Vc+
 -----END PGP SIGNATURE-----

Merge tag 'pci-v5.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration changes:

   - Add _HPX Type 3 settings support, which gives firmware more
     influence over device configuration (Alexandru Gagniuc)

   - Support fixed bus numbers from bridge Enhanced Allocation
     capabilities (Subbaraya Sundeep)

   - Add "external-facing" DT property to identify cases where we
     require IOMMU protection against untrusted devices (Jean-Philippe
     Brucker)

   - Enable PCIe services for host controller drivers that use managed
     host bridge alloc (Jean-Philippe Brucker)

   - Log PCIe port service messages with pci_dev, not the pcie_device
     (Frederick Lawler)

   - Convert pciehp from pciehp_debug module parameter to generic
     dynamic debug (Frederick Lawler)

  Peer-to-peer DMA:

   - Add whitelist of Root Complexes that support peer-to-peer DMA
     between Root Ports (Christian König)

  Native controller drivers:

   - Add PCI host bridge DMA ranges for bridges that can't DMA
     everywhere, e.g., iProc (Srinath Mannam)

   - Add Amazon Annapurna Labs PCIe host controller driver (Jonathan
     Chocron)

   - Fix Tegra MSI target allocation so DMA doesn't generate unwanted
     MSIs (Vidya Sagar)

   - Fix of_node reference leaks (Wen Yang)

   - Fix Hyper-V module unload & device removal issues (Dexuan Cui)

   - Cleanup R-Car driver (Marek Vasut)

   - Cleanup Keystone driver (Kishon Vijay Abraham I)

   - Cleanup i.MX6 driver (Andrey Smirnov)

  Significant bug fixes:

   - Reset Lenovo ThinkPad P50 GPU so nouveau works after reboot (Lyude
     Paul)

   - Fix Switchtec firmware update performance issue (Wesley Sheng)

   - Work around Pericom switch link retraining erratum (Stefan Mätje)"

* tag 'pci-v5.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (141 commits)
  MAINTAINERS: Add Karthikeyan Mitran and Hou Zhiqiang for Mobiveil PCI
  PCI: pciehp: Remove pointless MY_NAME definition
  PCI: pciehp: Remove pointless PCIE_MODULE_NAME definition
  PCI: pciehp: Remove unused dbg/err/info/warn() wrappers
  PCI: pciehp: Log messages with pci_dev, not pcie_device
  PCI: pciehp: Replace pciehp_debug module param with dyndbg
  PCI: pciehp: Remove pciehp_debug uses
  PCI/AER: Log messages with pci_dev, not pcie_device
  PCI/DPC: Log messages with pci_dev, not pcie_device
  PCI/PME: Replace dev_printk(KERN_DEBUG) with dev_info()
  PCI/AER: Replace dev_printk(KERN_DEBUG) with dev_info()
  PCI: Replace dev_printk(KERN_DEBUG) with dev_info(), etc
  PCI: Replace printk(KERN_INFO) with pr_info(), etc
  PCI: Use dev_printk() when possible
  PCI: Cleanup setup-bus.c comments and whitespace
  PCI: imx6: Allow asynchronous probing
  PCI: dwc: Save root bus for driver remove hooks
  PCI: dwc: Use devm_pci_alloc_host_bridge() to simplify code
  PCI: dwc: Free MSI in dw_pcie_host_init() error path
  PCI: dwc: Free MSI IRQ page in dw_pcie_free_msi()
  ...
2019-05-14 10:30:10 -07:00
Dave Airlie
422449238e Merge branch 'drm-next-5.2' of git://people.freedesktop.org/~agd5f/linux into drm-next
- SR-IOV fixes
- Raven flickering fix
- Misc spelling fixes
- Vega20 power fixes
- Freesync improvements
- DC fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190502193020.3562-1-alexander.deucher@amd.com
2019-05-03 10:31:07 +10:00
Heiner Kallweit
babe2ef342 drm/amdkfd: Use pci_dev_id() helper
Use new helper pci_dev_id() to simplify the code.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Christian König <christian.koenig@amd.com>
2019-04-29 16:12:35 -05:00
Amber Lin
0da8b10e36 drm/amdgpu: get_fw_version isn't ASIC specific
Method of getting firmware version is the same across ASICs, so remove
them from ASIC-specific files and create one in amdgpu_amdkfd.c. This new
created get_fw_version simply reads fw_version from adev->gfx than parsing
the ucode header.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-04-19 11:32:40 -05:00
Dave Airlie
f06ddb5309 Linux 5.1-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlyzsYgeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGMw0H/ir42KJiABBKSETD
 0d38qXVclAI/123zl8EkSfDrBKOsuIpXUDxzKeoDMhMkiurMpK6bbEOTPJAQMZJe
 nEYpq/bZQi+vO8Q/pMMpaC3ExlIRosd0JAR7TyDUh5ZAeeMuDNzmvMk/DPxXPbNt
 0P1FWePDa7908ajCOW1T8ZrB9Ak8boo7TKkF3LBb00ks1mEkyp/l74MKOHdu+HYn
 XIwncX/Jotl4BrKdNC2f/NXYLYk6MrJDGug8TxuHgIqiMWhhrcSqbxU1ri7iqFXB
 cBYdFo6ZJ8CWHux8/5LY5CMjSqEtzKha2Ohuhy3MMu1RsICyFLQtHnxHJ1ytLSBt
 DOPcDQ0=
 =CEUD
 -----END PGP SIGNATURE-----

BackMerge v5.1-rc5 into drm-next

Need rc5 for udl fix to add udl cleanups on top.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-04-15 15:51:49 +10:00
Alex Deucher
e7ad88553a drm/amdkfd: Add picasso pci id
Picasso is a new raven variant.

Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-04-04 10:20:34 -05:00
Alex Deucher
20d059278e Revert "drm/amdkfd: avoid HMM change cause circular lock"
This reverts commit 8dd69e69f4.

This depends on an HMM fix which is not upstream yet.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-28 10:15:49 -05:00
Eric Huang
9b54d20176 drm/amdkfd: add RAS ECC event support (v3)
RAS ECC event will combine with GPU reset event, due to
ECC interrupts are caused by uncorrectable error that triggers
GPU reset.

v2: Fix misleading-indentation warning
v3: fix build with CONFIG_HSA_AMD disabled

Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-19 15:36:51 -05:00
Eric Huang
0dee45a25a drm/amdkfd: add RAS capabilities in topology for Vega20 (v2)
It is to collaborate with HSA_CAPABILITY in libhsakmt.

v2: squash in NULL pointer check

Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-19 15:36:51 -05:00
Philip Yang
8dd69e69f4 drm/amdkfd: avoid HMM change cause circular lock
There is circular lock between gfx and kfd path with HMM change:
lock(dqm) -> bo::reserve -> amdgpu_mn_lock

To avoid this, move init/unint_mqd() out of lock(dqm), to remove nested
locking between mmap_sem and bo::reserve. The locking order
is: bo::reserve -> amdgpu_mn_lock(p->mn)

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-03-19 15:03:37 -05:00
Kevin Wang
cac734c2db drm/amdkfd: use init_mqd function to allocate object for hid_mqd (CI)
if use the legacy method to allocate object, when mqd_hiq need to run
uninit code, it will be cause WARNING call trace.

eg: (s3 suspend test)
[   34.918944] Call Trace:
[   34.918948]  [<ffffffff92961dc1>] dump_stack+0x19/0x1b
[   34.918950]  [<ffffffff92297648>] __warn+0xd8/0x100
[   34.918951]  [<ffffffff9229778d>] warn_slowpath_null+0x1d/0x20
[   34.918991]  [<ffffffffc03ce1fe>] uninit_mqd_hiq_sdma+0x4e/0x50 [amdgpu]
[   34.919028]  [<ffffffffc03d0ef7>] uninitialize+0x37/0xe0 [amdgpu]
[   34.919064]  [<ffffffffc03d15a6>] kernel_queue_uninit+0x16/0x30 [amdgpu]
[   34.919086]  [<ffffffffc03d26c2>] pm_uninit+0x12/0x20 [amdgpu]
[   34.919107]  [<ffffffffc03d4915>] stop_nocpsch+0x15/0x20 [amdgpu]
[   34.919129]  [<ffffffffc03c1dce>] kgd2kfd_suspend.part.4+0x2e/0x50 [amdgpu]
[   34.919150]  [<ffffffffc03c2667>] kgd2kfd_suspend+0x17/0x20 [amdgpu]
[   34.919171]  [<ffffffffc03c103a>] amdgpu_amdkfd_suspend+0x1a/0x20 [amdgpu]
[   34.919187]  [<ffffffffc02ec428>] amdgpu_device_suspend+0x88/0x3a0 [amdgpu]
[   34.919189]  [<ffffffff922e22cf>] ? enqueue_entity+0x2ef/0xbe0
[   34.919205]  [<ffffffffc02e8220>] amdgpu_pmops_suspend+0x20/0x30 [amdgpu]
[   34.919207]  [<ffffffff925c56ff>] pci_pm_suspend+0x6f/0x150
[   34.919208]  [<ffffffff925c5690>] ? pci_pm_freeze+0xf0/0xf0
[   34.919210]  [<ffffffff926b45c6>] dpm_run_callback+0x46/0x90
[   34.919212]  [<ffffffff926b49db>] __device_suspend+0xfb/0x2a0
[   34.919213]  [<ffffffff926b4b9f>] async_suspend+0x1f/0xa0
[   34.919214]  [<ffffffff922c918f>] async_run_entry_fn+0x3f/0x130
[   34.919216]  [<ffffffff922b9d4f>] process_one_work+0x17f/0x440
[   34.919217]  [<ffffffff922bade6>] worker_thread+0x126/0x3c0
[   34.919218]  [<ffffffff922bacc0>] ? manage_workers.isra.25+0x2a0/0x2a0
[   34.919220]  [<ffffffff922c1c31>] kthread+0xd1/0xe0
[   34.919221]  [<ffffffff922c1b60>] ? insert_kthread_work+0x40/0x40
[   34.919222]  [<ffffffff92974c1d>] ret_from_fork_nospec_begin+0x7/0x21
[   34.919224]  [<ffffffff922c1b60>] ? insert_kthread_work+0x40/0x40
[   34.919224] ---[ end trace 38cd9f65c963adad ]---

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-02-27 22:19:07 -05:00
Dave Airlie
fbac3c48fa Merge branch 'drm-next-5.1' of git://people.freedesktop.org/~agd5f/linux into drm-next
Fixes for 5.1:
amdgpu:
- Fix missing fw declaration after dropping old CI DPM code
- Fix debugfs access to registers beyond the MMIO bar size
- Fix context priority handling
- Add missing license on some new files
- Various cleanups and bug fixes

radeon:
- Fix missing break in CS parser for evergreen
- Various cleanups and bug fixes

sched:
- Fix entities with 0 run queues

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190221214134.3308-1-alexander.deucher@amd.com
2019-02-22 15:56:42 +10:00
Yong Zhao
234441dd49 drm/amdkfd: Optimize out sdma doorbell array in kgd2kfd_shared_resources
We can directly calculate sdma doorbell indexes in the process doorbell
pages through the doorbell_index structure in amdgpu_device, so no need
to cache them in kgd2kfd_shared_resources any more. This alleviates the
adaptation needs when new SDMA configurations are introduced.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-02-18 18:00:50 -05:00
Yong Zhao
1f86805adc drm/amdkfd: Fix bugs regarding CP queue doorbell mask on SOC15
Reserved doorbells for SDMA IH and VCN were not properly masked out
when allocating doorbells for CP user queues. This patch fixed that.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-02-18 18:00:41 -05:00
Yong Zhao
7452394310 drm/amdkfd: Move a constant definition around
The similar definitions should be consecutive.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-02-18 17:59:56 -05:00
Dave Airlie
c06de56121 Linux 5.0-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlxqHJYeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGWl8H/jPI4EipzD2GbnjZ
 GaFpMBBjcXBaVmoA+Y69so+7BHx1Ql+5GQtqbK0RHJRb9qEPLw3FBhHNjM/N8Sgf
 nSrK+GnBZp9s+k/NR/Yf2RacUR3jhz+Q9JEoQd3u9bFUeQyvE8Rf3vgtoBBwFOfz
 +t7N1memYVF3asLGWB4e4sP1YVMGfseTQpSPojvM30YWM86Bv+QtSx1AGgHczQIM
 kMKealR8ZPelN6JAXgLhQ5opDojBrE4YKB98pwsMDI6abz0Tz2JLFEUTTxsv5XNN
 o/Iz+XDoylskEyxN2unNWfHx7Swkvoklog8J/hDg5XlTvipL/WkT66PHBgcGMNvj
 BW9GgU8=
 =ZizU
 -----END PGP SIGNATURE-----

Merge v5.0-rc7 into drm-next

Backmerging for nouveau and imx that needed some fixes for next pulls.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-02-18 13:27:15 +10:00
Nathan Chancellor
6d3d8065bb drm/amdkfd: Fix if preprocessor statement above kfd_fill_iolink_info_for_cpu
Clang warns:

drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:866:5: warning:
'CONFIG_X86_64' is not defined, evaluates to 0 [-Wundef]
    ^
1 warning generated.

Fixes: d1c234e2cd ("drm/amdkfd: Allow building KFD on ARM64 (v2)")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-02-05 18:10:28 -05:00
Felix Kuehling
bbdf514fe5 drm/amdkfd: Don't assign dGPUs to APU topology devices
dGPUs need their own topology devices. Don't assign them to APU topology
devices with CPU cores.

Bug: https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/issues/66
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Elias Konstantinidis <ekondis@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:59:50 -05:00
Felix Kuehling
d1c234e2cd drm/amdkfd: Allow building KFD on ARM64 (v2)
ifdef x86_64 specific code.
Allow enabling CONFIG_HSA_AMD on ARM64.

v2: Fixed a compiler warning due to an unused variable

CC: Mark Nutter <Mark.Nutter@arm.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Mark Nutter <Mark.Nutter@arm.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:59:37 -05:00
Felix Kuehling
b8fe05247d drm/amdkfd: Don't assign dGPUs to APU topology devices
dGPUs need their own topology devices. Don't assign them to APU topology
devices with CPU cores.

Bug: https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/issues/66
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Elias Konstantinidis <ekondis@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:37:33 -05:00
Felix Kuehling
df1dd4f4a7 drm/amdkfd: Allow building KFD on ARM64 (v2)
ifdef x86_64 specific code.
Allow enabling CONFIG_HSA_AMD on ARM64.

v2: Fixed a compiler warning due to an unused variable

CC: Mark Nutter <Mark.Nutter@arm.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Mark Nutter <Mark.Nutter@arm.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:37:15 -05:00
Amber Lin
308176d6f6 drm/amdgpu: Remove kgd2kfd function pointers
kgd2kfd function pointers and global kgd2kfd pointer are no longer in use.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:04:29 -05:00
Amber Lin
2d3d25b616 drm/amdgpu: Relocate kgd2kfd function declaration
Since amdkfd is merged into amdgpu module and amdgpu can access amdkfd
directly, move declaration of kgd2kfd functions from kfd_priv.h to
amdgpu_amdkfd.h

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-14 15:04:28 -05:00
Linus Torvalds
0fe4e2d5cd drm i915 gvt, amdgpu, core fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJcL65eAAoJEAx081l5xIa+y7EP+wQnTk3GV7rKiIi5LEtux5xW
 X2tTaPKHnwrMYjRaP2VNUntJPH6Wxcby3OHGNvGMe1IqNGL/5qRLQ/g1rSSPuM4z
 rYwWR/ooDU/KwYvsT/o+DSO62AoVzIqx8gn8+ShirRN3MdobCcwDebd5oqKjduOn
 hRy9WQwgPOnDG1D3fRWOGSzOE1K9yDFCUaR0AmhUehn9NvsztQGamMBBwMNg+y52
 a5vu+nSLxQrv3ZyZ5TQUgAzi2pWFtC6QxIVuLpl5TqFA3vdRVyN1T78klDnQ7WU7
 6GY1yq9D923c1Tfa0RZoXnE++bX91KKJ5y9YFuNFv8X/th6UoEzRrOPDINfLoZv3
 JsPPSPAiZTgoXc/RGfoMbnidajNB7Gx+No+Pd8P6MeY5H1E+ivMXt5MrOgcMXUqk
 FajthiuSlaB+u5OjNjuS6gBbAMIKw7Idg4hEFSabj91qhJIet/fPhzNmp0HPJ1wF
 XlNnxI7XOytCAORrjLy2q4/lkaoG2AlVpZzeMLgXSxGGlSCtIpDUIqgQbtV1ppCi
 RboQ8yMflRejeK6oXoC92mI8yDB6rwoQy2tK0Hvnag5/q1r7AVYJq+3890NFEU4X
 F5TuCgvhswdkTEJUED1G6pnX7aQzW0dh6KrCltF34sFzD1etYb150En7laa+2kmX
 G5HfZbkLwscPt91moA6B
 =hFld
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2019-01-05' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Happy New Year, just decloaking from leave to get some stuff from the
  last week in before rc1:

  core:
   - two regression fixes for damage blob and atomic

  i915 gvt:
   - Some missed GVT fixes from the original pull

  amdgpu:
   - new PCI IDs
   - SR-IOV fixes
   - DC fixes
   - Vega20 fixes"

* tag 'drm-next-2019-01-05' of git://anongit.freedesktop.org/drm/drm: (53 commits)
  drm: Put damage blob when destroy plane state
  drm: fix null pointer dereference on null state pointer
  drm/amdgpu: Add new VegaM pci id
  drm/ttm: Use drm_debug_printer for all ttm_bo_mem_space_debug output
  drm/amdgpu: add Vega20 PSP ASD firmware loading
  drm/amd/display: Fix MST dp_blank REG_WAIT timeout
  drm/amd/display: validate extended dongle caps
  drm/amd/display: Use div_u64 for flip timestamp ns to ms
  drm/amdgpu/uvd:Change uvd ring name convention
  drm/amd/powerplay: add Vega20 LCLK DPM level setting support
  drm/amdgpu: print process info when job timeout
  drm/amdgpu/nbio7.4: add hw bug workaround for vega20
  drm/amdgpu/nbio6.1: add hw bug workaround for vega10/12
  drm/amd/display: Optimize passive update planes.
  drm/amd/display: verify lane status before exiting verify link cap
  drm/amd/display: Fix bug with not updating VSP infoframe
  drm/amd/display: Add retry to read ddc_clock pin
  drm/amd/display: Don't skip link training for empty dongle
  drm/amd/display: Wait edp HPD to high in detect_sink
  drm/amd/display: fix surface update sequence
  ...
2019-01-05 18:25:19 -08:00
Linus Torvalds
96d4f267e4 Remove 'type' argument from access_ok() function
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
of the user address range verification function since we got rid of the
old racy i386-only code to walk page tables by hand.

It existed because the original 80386 would not honor the write protect
bit when in kernel mode, so you had to do COW by hand before doing any
user access.  But we haven't supported that in a long time, and these
days the 'type' argument is a purely historical artifact.

A discussion about extending 'user_access_begin()' to do the range
checking resulted this patch, because there is no way we're going to
move the old VERIFY_xyz interface to that model.  And it's best done at
the end of the merge window when I've done most of my merges, so let's
just get this done once and for all.

This patch was mostly done with a sed-script, with manual fix-ups for
the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.

There were a couple of notable cases:

 - csky still had the old "verify_area()" name as an alias.

 - the iter_iov code had magical hardcoded knowledge of the actual
   values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
   really used it)

 - microblaze used the type argument for a debug printout

but other than those oddities this should be a total no-op patch.

I tried to fix up all architectures, did fairly extensive grepping for
access_ok() uses, and the changes are trivial, but I may have missed
something.  Any missed conversion should be trivially fixable, though.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-03 18:57:57 -08:00
Arun KS
9705bea5f8 mm: convert zone->managed_pages to atomic variable
totalram_pages, zone->managed_pages and totalhigh_pages updates are
protected by managed_page_count_lock, but readers never care about it.
Convert these variables to atomic to avoid readers potentially seeing a
store tear.

This patch converts zone->managed_pages.  Subsequent patches will convert
totalram_panges, totalhigh_pages and eventually managed_page_count_lock
will be removed.

Main motivation was that managed_page_count_lock handling was complicating
things.  It was discussed in length here,
https://lore.kernel.org/patchwork/patch/995739/#1181785 So it seemes
better to remove the lock and convert variables to atomic, with preventing
poteintial store-to-read tearing as a bonus.

Link: http://lkml.kernel.org/r/1542090790-21750-3-git-send-email-arunks@codeaurora.org
Signed-off-by: Arun KS <arunks@codeaurora.org>
Suggested-by: Michal Hocko <mhocko@suse.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28 12:11:47 -08:00
Linus Torvalds
4971f090aa drm pull request for 4.21-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJcExwOAAoJEAx081l5xIa+euIP/1NZZvSB+bsCtOwDG8I6uWsS
 OU5JUZ8q2dqyyFagRxzlkeSt3uWJqKp5NyNwuc9z/5u6AGF+3/97D0J1lG6Os/st
 4abF6NadivYJ4cXhJ1ddIHOFMVDcAsyMWNDb93NwPwncCsQ0jt5FFOsrCyj6BGY+
 ihHFlHrIyDrbBGDHz+u1E/EO5WkNnaLDoC+/k2fTRWCNI3bQL3O+orsYTI6S2uvU
 lQJnRfYAllgLD2p1k/rrBHcHXBv50roR0e8uhGmbdhGdp5bEW30UGBLHXxQjjSVy
 fQCwFwTO8X6zoxU53Zbbk+MVrp+jkTHcGKViHRuLkaHzE5mX26UXDwlXdN32ZUbK
 yHOJp+uDaWXX7MIz0LsB9Iqj2+eIUoFaIJMoZTMGVTNvqnTxKnoHnjAtbTH2u258
 teFgmy4BIgPgo2kwEnBEZjCapou0Eivyut2wq8bTAB2Fe8LwURJpr3cioTtMLlUO
 L5/PoD27eFvBCAeFrQIwF3b2XiQEnBpXocmilEwP1xDMPgoyeePAfIF2iEpDvi0U
 jce3rLd2yVvo92xYUgoHkVTD8si/pKKnZ1D0U3+RI6pxK6s0HJEHjcNEMdvdm+2S
 4qgvBQV3wlWFkXEK8PR5BHPoLntg18tKon/BTLBjgGkN9E1o9fWs1/s6KQGY4xdo
 l3Vvfx2LTdkgEoBssSwB
 =wh4W
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2018-12-14' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "Core:
   - shared fencing staging removal
   - drop transactional atomic helpers and move helpers to new location
   - DP/MST atomic cleanup
   - Leasing cleanups and drop EXPORT_SYMBOL
   - Convert drivers to atomic helpers and generic fbdev.
   - removed deprecated obj_ref/unref in favour of get/put
   - Improve dumb callback documentation
   - MODESET_LOCK_BEGIN/END helpers

  panels:
   - CDTech panels, Banana Pi Panel, DLC1010GIG,
   - Olimex LCD-O-LinuXino, Samsung S6D16D0, Truly NT35597 WQXGA,
   - Himax HX8357D, simulated RTSM AEMv8.
   - GPD Win2 panel
   - AUO G101EVN010

  vgem:
   - render node support

  ttm:
   - move global init out of drivers
   - fix LRU handling for ghost objects
   - Support for simultaneous submissions to multiple engines

  scheduler:
   - timeout/fault handling changes to help GPU recovery
   - helpers for hw with preemption support

  i915:
   - Scaler/Watermark fixes
   - DP MST + powerwell fixes
   - PSR fixes
   - Break long get/put shmemfs pages
   - Icelake fixes
   - Icelake DSI video mode enablement
   - Engine workaround improvements

  amdgpu:
   - freesync support
   - GPU reset enabled on CI, VI, SOC15 dGPUs
   - ABM support in DC
   - KFD support for vega12/polaris12
   - SDMA paging queue on vega
   - More amdkfd code sharing
   - DCC scanout on GFX9
   - DC kerneldoc
   - Updated SMU firmware for GFX8 chips
   - XGMI PSP + hive reset support
   - GPU reset
   - DC trace support
   - Powerplay updates for newer Polaris
   - Cursor plane update fast path
   - kfd dma-buf support

  virtio-gpu:
   - add EDID support

  vmwgfx:
   - pageflip with damage support

  nouveau:
   - Initial Turing TU104/TU106 modesetting support

  msm:
   - a2xx gpu support for apq8060 and imx5
   - a2xx gpummu support
   - mdp4 display support for apq8060
   - DPU fixes and cleanups
   - enhanced profiling support
   - debug object naming interface
   - get_iova/page pinning decoupling

  tegra:
   - Tegra194 host1x, VIC and display support enabled
   - Audio over HDMI for Tegra186 and Tegra194

  exynos:
   - DMA/IOMMU refactoring
   - plane alpha + blend mode support
   - Color format fixes for mixer driver

  rcar-du:
   - R8A7744 and R8A77470 support
   - R8A77965 LVDS support

  imx:
   - fbdev emulation fix
   - multi-tiled scalling fixes
   - SPDX identifiers

  rockchip
   - dw_hdmi support
   - dw-mipi-dsi + dual dsi support
   - mailbox read size fix

  qxl:
   - fix cursor pinning

  vc4:
   - YUV support (scaling + cursor)

  v3d:
   - enable TFU (Texture Formatting Unit)

  mali-dp:
   - add support for linear tiled formats

  sun4i:
   - Display Engine 3 support
   - H6 DE3 mixer 0 support
   - H6 display engine support
   - dw-hdmi support
   - H6 HDMI phy support
   - implicit fence waiting
   - BGRX8888 support

  meson:
   - Overlay plane support
   - implicit fence waiting
   - HDMI 1.4 4k modes

  bridge:
   - i2c fixes for sii902x"

* tag 'drm-next-2018-12-14' of git://anongit.freedesktop.org/drm/drm: (1403 commits)
  drm/amd/display: Add fast path for cursor plane updates
  drm/amdgpu: Enable GPU recovery by default for CI
  drm/amd/display: Fix duplicating scaling/underscan connector state
  drm/amd/display: Fix unintialized max_bpc state values
  Revert "drm/amd/display: Set RMX_ASPECT as default"
  drm/amdgpu: Fix stub function name
  drm/msm/dpu: Fix clock issue after bind failure
  drm/msm/dpu: Clean up dpu_media_info.h static inline functions
  drm/msm/dpu: Further cleanups for static inline functions
  drm/msm/dpu: Cleanup the debugfs functions
  drm/msm/dpu: Remove dpu_irq and unused functions
  drm/msm: Make irq_postinstall optional
  drm/msm/dpu: Cleanup callers of dpu_hw_blk_init
  drm/msm/dpu: Remove unused functions
  drm/msm/dpu: Remove dpu_crtc_is_enabled()
  drm/msm/dpu: Remove dpu_crtc_get_mixer_height
  drm/msm/dpu: Remove dpu_dbg
  drm/msm: dpu: Remove crtc_lock
  drm/msm: dpu: Remove vblank_requested flag from dpu_crtc
  drm/msm: dpu: Separate crtc assignment from vblank enable
  ...
2018-12-25 11:48:26 -08:00
Felix Kuehling
e98bdb8061 drm/amdkfd: Fix handling of return code of dma_buf_get
On errors, dma_buf_get returns a negative error code, rather than NULL.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-18 17:39:11 -05:00
Alex Deucher
9bd206f89f drm/amdkfd: add new vega20 pci id
New vega20 id.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-10 15:27:45 -05:00
Alex Deucher
756e16bf79 drm/amdkfd: add new vega10 pci ids
New vega10 ids.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2018-12-10 15:27:32 -05:00
Felix Kuehling
b408a54884 drm/amdkfd: Add support for doorbell BOs
This allows user mode to map doorbell pages into GPUVM address space.
That way GPUs can submit to user mode queues (self-dispatch).

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-07 18:14:00 -05:00
Felix Kuehling
1dde0ea95b drm/amdkfd: Add DMABuf import functionality
This is used for interoperability between ROCm compute and graphics
APIs. It allows importing graphics driver BOs into the ROCm SVM
address space for zero-copy GPU access.

The API is split into two steps (query and import) to allow user mode
to manage the virtual address space allocation for the imported buffer.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-07 18:13:54 -05:00
Felix Kuehling
3704d56e1a drm/amdkfd: Add NULL-pointer check
top_dev->gpu is NULL for CPUs. Avoid dereferencing it if NULL.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-07 18:13:48 -05:00
Brajeswar Ghosh
b8b3ede2de drm/amd/amdkfd: Remove duplicate header
Remove gca/gfx_8_0_enum.h which is included more than once

Signed-off-by: Brajeswar Ghosh <brajeswar.linux@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-26 15:54:39 -05:00
Yong Zhao
a53a11a835 drm/amdkfd: Workaround PASID missing in gfx9 interrupt payload under non HWS
This is a known gfx9 HW issue, and this change can perfectly workaround
the issue.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-19 16:38:14 -05:00
Yong Zhao
00557f4131 drm/amdkfd: Adjust the debug message in KFD ISR
This makes debug message get printed even when there is early return.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-19 16:38:14 -05:00
Gang Ba
846a44d7e9 drm/amdkfd: Added Vega12 and Polaris12 for KFD.
Add Vega12 and Polaris12 device info and device IDs to KFD.

Signed-off-by: Gang Ba <gaba@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-19 16:38:13 -05:00
Yong Zhao
4e6c6fc19d drm/amdkfd: Replace mqd with mqd_mgr as the variable name for mqd_manager
This will make reading code much easier. This fixes a few spots missed in a
previous commit with the same title.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-19 16:38:13 -05:00
Christian König
2383a767c0 drm/amdkfd: fix interrupt spin lock
Vega10 has multiple interrupt rings, so this can be called from multiple
calles at the same time resulting in:

[   71.779334] ================================
[   71.779406] WARNING: inconsistent lock state
[   71.779478] 4.19.0-rc1+ #44 Tainted: G        W
[   71.779565] --------------------------------
[   71.779637] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[   71.779740] kworker/6:1/120 [HC0[0]:SC0[0]:HE1:SE1] takes:
[   71.779832] 00000000ad761971 (&(&kfd->interrupt_lock)->rlock){?...},
at: kgd2kfd_interrupt+0x75/0x100 [amdgpu]
[   71.780058] {IN-HARDIRQ-W} state was registered at:
[   71.780115]   _raw_spin_lock+0x2c/0x40
[   71.780180]   kgd2kfd_interrupt+0x75/0x100 [amdgpu]
[   71.780248]   amdgpu_irq_callback+0x6c/0x150 [amdgpu]
[   71.780315]   amdgpu_ih_process+0x88/0x100 [amdgpu]
[   71.780380]   amdgpu_irq_handler+0x20/0x40 [amdgpu]
[   71.780409]   __handle_irq_event_percpu+0x49/0x2a0
[   71.780436]   handle_irq_event_percpu+0x30/0x70
[   71.780461]   handle_irq_event+0x37/0x60
[   71.780484]   handle_edge_irq+0x83/0x1b0
[   71.780506]   handle_irq+0x1f/0x30
[   71.780526]   do_IRQ+0x53/0x110
[   71.780544]   ret_from_intr+0x0/0x22
[   71.780566]   cpuidle_enter_state+0xaa/0x330
[   71.780591]   do_idle+0x203/0x280
[   71.780610]   cpu_startup_entry+0x6f/0x80
[   71.780634]   start_secondary+0x1b0/0x200
[   71.780657]   secondary_startup_64+0xa4/0xb0

Fix this by always using irq save spin locks.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 15:49:38 -05:00
Yong Zhao
435e2f9709 drm/amdkfd: page_table_base already have the flags needed
The flags are added when calling amdgpu_gmc_pd_addr().

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 14:21:14 -05:00
Yong Zhao
deb99d7c4f drm/amdkfd: Delete a duplicate statement in set_pasid_vmid_mapping()
The same statement is later done in kgd_set_pasid_vmid_mapping(), so no
need to do it in set_pasid_vmid_mapping().

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 14:21:13 -05:00
Amber Lin
7cd52c917a drm/amdkfd: Add proper prefix to functions
Add amdgpu_amdkfd_ prefix to amdgpu functions served for amdkfd usage.

v2: fix indentation

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 14:21:08 -05:00
Amber Lin
5b87245faf drm/amdkfd: Simplify kfd2kgd interface
After amdkfd module is merged into amdgpu, KFD can call amdgpu directly
and no longer needs to use the function pointer. Replace those function
pointers with functions if they are not ASIC dependent.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 14:21:07 -05:00
zhong jiang
6dfeb11a4b drm/amdkfd: Use kmemdup instead of duplicating its function
kmemdup has implemented the function that kmalloc() + memcpy().
We prefer to kmemdup rather than code opened implementation.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05 14:20:36 -05:00
YueHaibing
ae5c59a83b drm/amdkfd: Remove set but not used variable 'preempt_all_queues'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c: In function 'destroy_queue_cpsch':
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c:1366:7: warning:
 variable 'preempt_all_queues' set but not used [-Wunused-but-set-variable]

It never used since introduct in
commit 992839ad64 ("drm/amdkfd: Add static user-mode queues support")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-10 14:49:42 -05:00
Felix Kuehling
1b19aa5aa8 drm/amdkfd: Fix incorrect use of process->mm
This mm_struct pointer should never be dereferenced. If running in
a user thread, just use current->mm. If running in a kernel worker
use get_task_mm to get a safe reference to the mm_struct.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-09 17:06:19 -05:00
Shaoyun Liu
006a0b3d86 drm/amdkfd: Remove the requirement for atomic Ops on vg20
Firmware have the workaround to replace the atomic Ops with read-modify-write on CP side.
User should not expect atomic Ops on system memory works normally if system didn't not
support it.

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-By: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-27 09:39:57 -05:00
Shaoyun Liu
22a3a2941b drm/amdkfd: Vega20 bring up on amdkfd side
Add Vega20 device IDs, device info and enable it in KFD.

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-09-26 21:09:18 -05:00
Shaoyun Liu
e715c6d0ea drm/amd: Interface change to support 64 bit page_table_base
amdgpu_gpuvm_get_process_page_dir should return the page table address
in the format expected by the pm4_map_process packet for all ASIC
generations.

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-26 21:09:17 -05:00
Shaoyun Liu
d50941892e drm/amdkfd: Make the number of SDMA queues variable
Vega20 supports 8 SDMA queues per engine

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-26 21:09:16 -05:00
Jay Cornwall
5df099e8bc drm/amdkfd: Add wavefront context save state retrieval ioctl
Wavefront context save data is of interest to userspace clients for
debugging static wavefront state. The MQD contains two parameters
required to parse the control stack and the control stack itself
is kept in the MQD from gfx9 onwards.

Add an ioctl to fetch the context save area and control stack offsets
and to copy the control stack to a userspace address if it is kept in
the MQD.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-26 21:09:15 -05:00
Felix Kuehling
5ade6c9c35 drm/amdkfd: Report SDMA firmware version in the topology
Also save the version in struct kfd_dev so we only need to query
it once.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-26 21:09:14 -05:00
Emily Deng
6d12aa8741 drm/amdkfd: KFD doesn't support TONGA SRIOV
KFD module doesn't support TONGA SRIOV, if init KFD module in TONGA SRIOV
environment, it will let compute ring IB test fail.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-26 21:09:14 -05:00
Eric Huang
d35f00d8ec drm/amdkfd: reflect atomic support in IO link properties
Add the flags of properties according to Asic type and pcie
capabilities.

Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-26 21:09:13 -05:00
Dave Airlie
bf78296ab1 This is the 4.19-rc5 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlunyjMACgkQONu9yGCS
 aT52HhAA0JU7E88QPZ1gSxc1ifTaIlHXhLQSvQKAXOhIvHDwj4tEKDqPhpCN/dWX
 /o/xaUf36gU0VzUD/1IyEiMFmJEeFKnfvN5SZYZLk8uSrd4swqaY8mSueZxNEDz4
 YNK9ugI/tPztuuz7I6KrO1iVquY1WlnECxc9FH76wvHsit8Sr3PvzhR+CVrOi+8p
 k3cpWlhHiOzT/3K3Wv2Et+oh+U+myKtQTlJDSe3fMx5chksJpBmsV/IDEtsLNZfz
 3v25fHz5a3DOYqKkGJaDrbLyPNC85249B+CiXqbXvfOAHDVkMwYqcxYUG+YZ5cpm
 U0OShLXm67dz8vT9cxqOSguCliPRlM9W5+EKzmVT7l8+ycds3BuEEHg1xWPrJWgG
 7XO10HkhZl+VvnJCj54KaszMUOdpvdEQSUs82gAFxjPbQIx5gosN9O0H+DnirMhS
 6VtzS20ZoIzjd4YVkRoLNcobHB4bZVTNXZ1Zi3C/neP9pxUjhOk0y+Vr/crC5Xph
 3TykIMgiVa+CdvQ/f4LOSiCgTFhF0tLGtfDQTG7f+9+W5pMc4NKSLi8EOMlJtYEy
 wsCYZ7/T9ElgrEzFvlxSvDwiPUhcldNao/EGdRYvMxXtgj0Ctw8LhR/2YKkqo6LK
 oMoKKWkj0o7uKSHKq+dakS0FprKnBnvE2Y+XA4SO/saPGFlDAVc=
 =OFJh
 -----END PGP SIGNATURE-----

BackMerge v4.19-rc5 into drm-next

Sean Paul requested an -rc5 backmerge from some sun4i fixes.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-09-27 11:06:46 +10:00
Yong Zhao
44d8cc6f1a drm/amdkfd: Fix ATS capablity was not reported correctly on some APUs
Because CRAT_CU_FLAGS_IOMMU_PRESENT was not set in some BIOS crat, we
need to workaround this.

For future compatibility, we also overwrite the bit in capability according
to the value of needs_iommu_device.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-20 10:25:23 -05:00
Yong Zhao
15426dbb65 drm/amdkfd: Change the control stack MTYPE from UC to NC on GFX9
CWSR fails on Raven if the control stack is MTYPE_UC, which is used
for regular GART mappings. As a workaround we map it using MTYPE_NC.

The MEC firmware expects the control stack at one page offset from the
start of the MQD so it is part of the MQD allocation on GFXv9. AMDGPU
added a memory allocation flag just for this purpose.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-20 10:25:17 -05:00
shaoyunl
67f7cf9f76 drm/amdkfd: Only add bi-directional iolink on GPU with XGMI or largebar (v2)
v2: compile fix

Signed-off-by: shaoyunl <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-10 22:49:33 -05:00
Shaoyun Liu
ae9a25aea7 drm/amdkfd: Generate xGMI direct iolink
Generate xGMI iolink for upper level usage

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-10 22:49:00 -05:00
Shaoyun Liu
aa64ca38ed drm/amdkfd: Add new iolink type defines
Update the iolink type defines according to the new thunk spec

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-10 22:48:51 -05:00
Shaoyun Liu
0c1690e38b drm/amdkfd: kfd expose the hive_id of the device through its node properties
Thunk will generate the XGMI topology information when necessary with the hive_id
for each specified device

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-10 22:48:43 -05:00
Amber Lin
2690262ec9 drm/amdgpu: Relocate some definitions v2
Move some KFD-related (but used in amdgpu_drv.c) definitions from
kfd_priv.h to kgd_kfd_interface.h so we don't need to include kfd_priv.h
in amdgpu_drv.c. This fixes a build failure when AMDGPU is enabled but
MMU_NOTIFIER is not.
This patch also disables KFD-related module options when HSA_AMD is not
enabled.

v2: rebase (Alex)

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-29 12:41:50 -05:00
Oak Zeng
bf47afbabf drm/amdkfd: Release an acquired process vm
For compute vm acquired from amdgpu, vm.pasid is managed
by kfd. Decouple pasid from such vm on process destroy
to avoid duplicate pasid release.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-29 12:35:00 -05:00
Oak Zeng
1685b01a85 drm/amdgpu: Set pasid for compute vm (v2)
To make a amdgpu vm to a compute vm, the old pasid will be freed and
replaced with a pasid managed by kfd. Kfd can't reuse original pasid
allocated by amdgpu because kfd uses different pasid policy with amdgpu.
For example, all graphic devices share one same pasid in a process.

v2: rebase (Alex)

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-29 12:34:49 -05:00
Amber Lin
521fb7d021 drm/amdgpu: Move KFD parameters to amdgpu (v3)
After merging KFD into amdgpu, move module parameters defined in KFD to
amdgpu_drv.c, where other module parameters are declared.

v2: add kernel-doc comments
v3: rebase and fix parameter variable name (Alex)

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-28 11:51:11 -05:00
Amber Lin
04d5e27658 drm/amdgpu: Merge amdkfd into amdgpu
Since KFD is only supported by single GPU driver, it makes sense to merge
amdgpu and amdkfd into one module. This patch is the initial step: merge
Kconfig and Makefile.

v2: also remove kfd from drm Kconfig

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-28 11:22:42 -05:00
Felix Kuehling
b5aa3f4aef drm/amdkfd: Call kfd2kgd.set_compute_idle
User mode queue submissions don't go through KFD. Therefore we don't
know exactly when compute is idle or not idle. We use the existence
of user mode queues on a device as an approximation.

register_process is called when the first queue of a process is
created. Conversely unregister_process is called when the last queue
is destroyed. The first process that is registered takes compute
out of idle. The last process that is unregisters sets compute back
to idle.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-16 19:10:37 -04:00
Felix Kuehling
39e7f33186 drm/amdkfd: Add CU-masking ioctl to KFD
CU-masking allows a KFD client to control the set of CUs used by a
user mode queue for executing compute dispatches. This can be used
for optimizing the partitioning of the GPU and minimize conflicts
between concurrent tasks.

Signed-off-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-14 19:05:59 -04:00
Yong Zhao
4d663df658 drm/amdkfd: Enable Raven for KFD
Add DID and kfd_device_info for Raven.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-13 16:17:48 -04:00
Yong Zhao
359cecdd49 drm/amdkfd: Optimize out some duplicated code in kfd_signal_iommu_event()
memory_exception_data is already initialized for not-present faults.
It only needs to be overridden for permission faults.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-13 16:17:47 -04:00
Yong Zhao
8725aecac3 drm/amdkfd: Workaround to accommodate Raven too many PPR issue
On Raven multiple PPRs can be queued up by the hardware. When the
first of those requests is handled by the IOMMU driver, the memory
access succeeds. After that the application may be done with the
memory and unmap it. At that point the page table entries are
invalidated, but there are still outstanding duplicate PPRs for those
addresses. When the IOMMU driver processes those duplicate requests,
it finds invalid page table entries and triggers an invalid PPR fault.

As a workaround, don't signal invalid PPR faults on Raven to avoid
segfaulting applications that haven't done anything wrong. As a side
effect, real GPU memory access faults may go unnoticed by the
application.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-13 16:17:46 -04:00
Yong Zhao
eab69801cf drm/amdkfd: Avoid flooding dmesg on Raven due to IOMMU issues
On Raven Invalid PPRs (peripheral page requests) can be reported
because multiple PPRs can be still queued when memory is freed.
Apply a rate limit to avoid flooding the log in this case.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-13 16:17:45 -04:00
Yong Zhao
98bb92222e drm/amdkfd: Make SDMA engine number an ASIC-dependent variable
On Raven there is only one SDMA engine instead of previously assumed two,
so we need to adapt our code to this new scenario.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-13 16:17:44 -04:00
Yong Zhao
f3ed5df84c drm/amdkfd: Consolidate duplicate memory banks info in topology
If there are several memory banks that has the same properties in CRAT,
we aggregate them into one memory bank. This cleans up memory banks on
APUs (e.g. Raven) where the CRAT reports each memory channel as a
separate bank. This only confuses user mode, which only deals with
virtual memory.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-13 16:17:43 -04:00
Yong Zhao
e7016d8e6f drm/amdkfd: Clean up reference of radeon
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:33:08 -04:00
Yong Zhao
8d5f355290 drm/amdkfd: Replace mqd with mqd_mgr as the variable name for mqd_manager
This will make reading code much easier.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:33:07 -04:00
Yong Zhao
2b281977f5 drm/amdkfd: Use module parameters noretry as the internal variable name
This makes all module parameters use the same form. Meanwhile clean up
the surrounding code.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:33:06 -04:00
Yong Zhao
0e9a860c72 drm/amdkfd: Introduce KFD module parameter halt_if_hws_hang
This avoids triggering a GPU reset or otherwise changing the HW
state. Instead KFD will hang, which allows HW debugging tools to
analyze the problem.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:33:05 -04:00
Shaoyun Liu
a29ec470b1 drm/amdkfd: Add debugfs interface to trigger HWS hang
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:33:04 -04:00
Shaoyun Liu
951df6d9cf drm/amdkfd: Fix kernel queue 64 bit doorbell offset calculation
The bitmap index calculation should reverse the logic used on allocation
so it will clear the same bit used on allocation

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:33:01 -04:00
Shaoyun Liu
73ea648d92 drm/amdkfd: Implement hang detection in KFD and call amdgpu
The reset will be performed in a new hw_exception work thread to
handle HWS hang without blocking the thread that detected the hang.

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:58 -04:00
Shaoyun Liu
e42051d213 drm/amdkfd: Implement GPU reset handlers in KFD
Lock KFD and evict existing queues on reset. Notify user mode by
signaling hw_exception events.

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:56 -04:00
Shaoyun Liu
e3b7a96774 drm/amdkfd: Add gpu reset interface and place holder
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:54 -04:00
Lan Xiao
58e6988612 drm/amdkfd: fix zero reading of VMID and PASID for Hawaii
Upon VM Fault, the VMID and PASID written by HW are zeros in
Hawaii. Instead of reading from ih_ring_entry, read directly
from the registers. This workaround fix the soft hang issues
caused by mishandled VM Fault in Hawaii.

Signed-off-by: Lan Xiao <Lan.Xiao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:51 -04:00
shaoyunl
2640c3facb drm/amdkfd: Handle VM faults in KFD
1. Pre-GFX9 the amdgpu ISR saves the vm-fault status and address per
   per-vmid. amdkfd needs to get the information from amdgpu through the
   new get_vm_fault_info interface. On GFX9 and later, all the required
   information is in the IH ring
2. amdkfd unmaps all queues from the faulting process and create new
   run-list without the guilty process
3. amdkfd notifies the runtime of the vm fault trap via EVENT_TYPE_MEMORY

Signed-off-by: shaoyun liu <shaoyun.liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:50 -04:00
Moses Reuben
101fee63cb drm/amdkfd: send SIGSEGV to process upon KFD_EVENT_TYPE_MEMORY
Signed-off-by: Moses Reuben <moses.reuben@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:48 -04:00
Wei Lu
e47cb828eb drm/amdkfd: Fix error codes in kfd_get_process
Return ERR_PTR(-EINVAL) if kfd_get_process fails to find the process.
This fixes kernel oopses when a child process calls KFD ioctls with
a file descriptor inherited from the parent process.

Signed-off-by: Wei Lu <wei.lu2@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:47 -04:00
Jay Cornwall
a60d811b2b drm/amdkfd: Fix race between scheduler and context restore
The scheduler may raise SQ_WAVE_STATUS.SPI_PRIO via SQ_CMD before
context restore has completed. Restoring SPI_PRIO=0 after this point
may cause context save to fail as the lower priority wavefronts
are not selected for execution among spin-waiting wavefronts.

Leave SPI_PRIO at its SPI-initialized or scheduler-raised value.

v2: Also fix race with exception handler

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:46 -04:00
Felix Kuehling
1cd106ecfc drm/amdkfd: Stop using GFP_NOIO explicitly
This is no longer needed with the memalloc_nofs_save/restore in
dqm_lock/unlock.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:45 -04:00
Felix Kuehling
efeaed4d98 drm/amdkfd: Reliably prevent reclaim-FS while holding DQM lock
This is needed to prevent deadlocks when MMU notifiers run in
reclaim-FS context and take the DQM lock for userptr evictions.
Previously this was done by making all memory allocations under
DQM locks GFP_NOIO. This is error prone. Using
memalloc_nofs_save/restore will reliably affect all memory
allocations anywhere in the kernel while the DQM lock is held.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 22:32:44 -04:00
Arnd Bergmann
0337976f40 drm/admkfd use modern ktime accessors
getrawmonotonic64() and get_monotonic_boottime64() are deprecated
because of the nonstandard naming.

The replacement functions ktime_get_raw_ns() and ktime_get_boot_ns()
also simplify the callers.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11 14:41:00 +02:00
Dave Airlie
c76f0b2cc2 Merge tag 'drm-amdkfd-next-2018-05-14' of git://people.freedesktop.org/~gabbayo/linux into drm-next
This is amdkfd pull for 4.18. The major new features are:

- Add support for GFXv9 dGPUs (VEGA)
- Add support for userptr memory mapping

In addition, there are a couple of small fixes and improvements, such as:
- Fix lock handling
- Fix rollback packet in kernel kfd_queue
- Optimize kfd signal handling
- Fix CP hang in APU

Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180514070126.GA1827@odedg-x270
2018-05-15 16:06:08 +10:00
Randy Dunlap
7bbc0b950f drm/amdkfd: fix build, select MMU_NOTIFIER
When CONFIG_MMU_NOTIFIER is not enabled, struct mmu_notifier has an
incomplete type definition, which causes build errors.

../drivers/gpu/drm/amd/amdkfd/kfd_priv.h:607:22: error: field 'mmu_notifier' has incomplete type
../include/linux/kernel.h:979:32: error: dereferencing pointer to incomplete type
../include/linux/kernel.h:980:18: error: dereferencing pointer to incomplete type
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:434:2: error: implicit declaration of function 'mmu_notifier_unregister_no_release' [-Werror=implicit-function-declaration]
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:435:2: error: implicit declaration of function 'mmu_notifier_call_srcu' [-Werror=implicit-function-declaration]
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:438:21: error: variable 'kfd_process_mmu_notifier_ops' has initializer but incomplete type
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:439:2: error: unknown field 'release' specified in initializer
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:439:2: warning: excess elements in struct initializer [enabled by default]
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:439:2: warning: (near initialization for 'kfd_process_mmu_notifier_ops') [enabled by default]
../drivers/gpu/drm/amd/amdkfd/kfd_process.c:534:2: error: implicit declaration of function 'mmu_notifier_register' [-Werror=implicit-function-declaration]

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-24 12:50:04 +03:00
Andres Rodriguez
1cf6cc74bb drm/amdkfd: fix clock counter retrieval for node without GPU
Currently if a user requests clock counters for a node without a GPU
resource we will always return EINVAL.

Instead if no GPU resource is attached, fill the gpu_clock_counter
argument with zeroes so that we may proceed and return valid CPU
counters.

Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-24 12:34:44 +03:00
Wei Yongjun
ded5e5622c drm/amdkfd: Fix the error return code in kfd_ioctl_unmap_memory_from_gpu()
Passing NULL pointer to PTR_ERR will result in return value of 0
indicating success which is clearly not what it is intended here.
This patch returns -EINVAL instead.

v2: change ret code to -ENODEV

Fixes: 5ec7e02854 ("drm/amdkfd: Add ioctls for GPUVM memory management")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-24 12:14:55 +03:00
kbuild test robot
a4efd3a4e6 drm/amdkfd: kfd_dev_is_large_bar() can be static
Fixes: 5ec7e02854 ("drm/amdkfd: Add ioctls for GPUVM memory management")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-24 12:05:27 +03:00
Laura Abbott
af47b39027 drm/amdkfd: Remove vla
There's an ongoing effort to remove VLAs[1] from the kernel to eventually
turn on -Wvla. Switch to a constant value that covers all hardware.

[1] https://lkml.org/lkml/2018/3/7/621

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-13 14:24:12 -07:00
Felix Kuehling
c129db1206 drm/amdkfd: Add sanity checks in IRQ handlers
Only accept interrupts from KFD VMIDs. Just checking for a PASID may
not be enough because amdgpu started using PASIDs to map VM faults
to processes.

Warn if an IRQ doesn't have a valid PASID (indicating a firmware bug).

Suggested-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Suggested-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:12 -04:00
Shaoyun Liu
2533f0741e drm/amdkfd: Remove queue node when destroy queue failed
HWS may hang in the middle of destroy queue, remove the queue from the
process queue list so it won't be freed again in the future

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:11 -04:00
Ben Goz
bfdcbfd255 drm/amdkfd: Locking PM mutex while allocating IB buffer
Signed-off-by: Ben Goz <ben.goz@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:10 -04:00
Felix Kuehling
ccb76b149e drm/amdkfd: Remove initialization of cp_hqd_ib_control on CIK
The initialization is not necessary. amd-kfd-staging and ROCm
releases have worked without it for two years.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:09 -04:00
Felix Kuehling
eeb27b7eb3 drm/amdkfd: Fix signal handling performance again
It turns out that idr_for_each_entry is really slow compared to just
iterating over the slots. Based on measurements the difference is
estimated to be about a factor 64. That means using idr_for_each_entry
is only worth it with very few allocated events.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:08 -04:00
Yong Zhao
f8ea72d097 drm/amdkfd: Fix CP soft hang on APUs
The problem happens on Raven and Carrizo. The context save handler
should not clear the high bits of PC_HI before extracting the bits
of IB_STS.

The bug is not relevant to VEGA10 until we enable demand paging.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:07 -04:00
Yong Zhao
0db54b24ad drm/amdkfd: Separate trap handler assembly code and its hex values
Since the assembly code is inside "#if 0", it is ineffective. Despite that,
during debugging, we need to change the assembly code, extract it into
a separate file and compile the new file into hex values using sp3.
That process also requires us to remove "#if 0" and modify lines starting
with "#", so that sp3 can successfully compile the new file.

With this change, all the above chore is no longer needed, and
cwsr_trap_handler_gfx*.asm can be directly used by sp3 to generate its
hex values.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:06 -04:00
Felix Kuehling
a2e94158b8 drm/amdkfd: Remove redundant include of amd-iommu.h
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:05 -04:00
Philip Yang
fa7e65147e drm/amdkfd: use %px to print user space address instead of %p
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:04 -04:00
Jay Cornwall
2774c63ef3 drm/amdkfd: Use volatile MTYPE in default/alternate apertures
MTYPE_NC_NV (0) marks scalar/vector L1 cache lines as non-volatile.
Cache lines loaded through these apertures are intended to be
invalidated before (and sometimes during) a dispatch. The non-volatile
qualifier prevents these cache lines from being distinguished from
those loaded through the private aperture.

Use MTYPE_NC (1) instead on both Gfx7 and Gfx8. This allows the
compiler to use the BUFFER_WBINVL1_VOL instruction and is a precursor
to automatic per-dispatch scalar/vector L1 volatile invalidation.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:03 -04:00
Jay Cornwall
87e6d4e077 drm/amdkfd: Reduce priority of context-saving waves before spin-wait
Synchronization between context-saving wavefronts is achieved by
sending a SAVEWAVE message to the SPI and then spin-waiting for a
response. These spin-waiting wavefronts may inhibit the progress
of other wavefronts in the context save handler, leading to the
synchronization condition never being achieved.

Before spin-waiting reduce the priority of each wavefront to
guarantee foward progress in the others.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:02 -04:00
Oak Zeng
24f48a4203 drm/amdkfd: Dump HQD of HIQ
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-05-01 17:56:01 -04:00
Dan Carpenter
8feaccf71d drm/amdkfd: Integer overflows in ioctl
args->n_devices is a u32 that comes from the user.  The multiplication
could overflow on 32 bit systems possibly leading to privilege
escalation.

Fixes: 5ec7e02854 ("drm/amdkfd: Add ioctls for GPUVM memory management")
Signed-off-by: Dan Carpenter dan.carpenter@oracle.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-24 16:35:49 +03:00
Felix Kuehling
389056e5fe drm/amdkfd: Add Vega10 topology and device info
* Report 64-bit doorbells as HSA_CAP_DOORBELL_TYPE_2_0 in topology
* Report cache information in topology (duplicates GFXv8 info for now)
* Add device info for Vega10 support in KFD

Raven is not enabled at this time as it needs additional changes in
DQM to work with a single SDMA engine.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:18 -04:00
welu
6106dce955 drm/amdkfd: Try to enable atomics for all GPUs
Report failure to enable atomics only on GPUs that require them.
This allows GPUs that don't require atomics to function, but can
benefit if they are available. This is the case for Vega10, which
doesn't use atomics for basic functioning of the MEC, AQL and HWS
microcode. So it can work without atomics. But shader programs can
still use atomic instructions on systems that support PCIe atomics.

Signed-off-by: welu <Wei.Lu2@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:17 -04:00
Felix Kuehling
3e76c2399b drm/amdkfd: Add GFXv9 CWSR trap handler
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:16 -04:00
Felix Kuehling
70a31d16cc drm/amdkfd: Support flat memory apertures for GFXv9
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:15 -04:00
Felix Kuehling
6aac0a48b0 drm/amdkfd: Remove limit on number of GPUs (follow-up)
This condition was missed in a previous commit with the same title.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:14 -04:00
Felix Kuehling
9d7d024816 drm/amdkfd: Add 64-bit doorbell and wptr support to kernel queue
v2: Removed redundant 0x before %p.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-08 22:03:51 -04:00
Felix Kuehling
bebfd2f412 drm/amdkfd: Fix kernel queue rollback_packet
kq->queue->properties.write_ptr is a GPU address which can'd be
derefenced in the kernel. Use kq->wptr_kernel instead, which is the
kernel CPU address of the same buffer.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:12 -04:00
Felix Kuehling
2a26fbfe80 drm/amdkfd: Fix goto usage
Missed a spot in previous cleanup commit:
Remove gotos that do not feature any common cleanup, and use gotos
instead of repeating cleanup commands.

According to kernel.org: "The goto statement comes in handy when a
function exits from multiple locations and some common work such as
cleanup has to be done. If there is no cleanup needed then just return
directly."

Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:11 -04:00
Felix Kuehling
ca750681bc drm/amdkfd: Add SOC15 interrupt processing support
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:10 -04:00
Felix Kuehling
bed4f11025 drm/amdkfd: Add GFXv9 device queue manager
Signed-off-by: John Bridgman <john.bridgman@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:09 -04:00
Felix Kuehling
b91d43dd01 drm/amdkfd: Add GFXv9 MQD manager
Signed-off-by: John Bridgman <john.bridgman@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:08 -04:00
Felix Kuehling
454150b1f9 drm/amdkfd: Add GFXv9 PM4 packet writer functions
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:07 -04:00
Felix Kuehling
f6e27ff19d drm/amdkfd: Move packet writer functions into ASIC-specific file
This is in preparation for GFXv9 (Vega10) which uses incompatible PM4
packet formats from previous ASIC generations.

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:06 -04:00
Felix Kuehling
ef568db792 drm/amdkfd: Implement doorbell allocation for SOC15
Allocate doorbells according to the doorbell routing information on
SOC15 ASICs (Vega10 and later). On older ASICs we continue to use the
queue_id as the doorbell ID to maintain compatibility with the Thunk.

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:05 -04:00
Harish Kasiviswanathan
df03ef9342 drm/amdkfd: Clean up KFD_MMAP_ offset handling
Use bit-rotate for better clarity and remove _MASK from the #defines as
these represent mmap types.

Centralize all the parsing of the mmap offset in kfd_mmap and add device
parameter to doorbell and reserved_mem map functions.

Encode gpu_id into upper bits of vm_pgoff. This frees up the lower bits
for encoding the the doorbell ID on Vega10.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:04 -04:00
Felix Kuehling
ada2b29c4a drm/amdkfd: Make doorbell size ASIC-dependent
This prepares for GFXv9 (Vega10), which has 64-bit doorbells.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-10 17:33:03 -04:00
Linus Torvalds
320b164abb main drm pull request for v4.17
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJavDxYAAoJEAx081l5xIa+pCoP/iwjuxkSTdJpZUx5g0daGkCK
 O18moGqGPChb7qJovfHqCKZ1f9PGulQt7SxwFzzJXNbv0PbfMA/Og0EhMLBImb+Q
 VfYgq2vJLpmkikgcI5fBrzs9DRMQKKobGIzw24VS7IkPYA7d8KgAyywBwG0+LUFR
 G3sobClgapsfaUcleb3ZOeDwymGkGCuuYRpYE4giHtuMDIxCWLePKJKOaOIq8o6P
 A1557EvSbKuLQGI9X50jzJOoBE3TKRQYkzuM1GthdOF8RHaMNcFy44lDNO030HwZ
 hzwAIg5Izhu16PqZGyEdIQ6SJTv3isRJWEciPnOsijvjl1li3ehMdQfhGISa/jZO
 ivEGd32kaactiT0jJ5OyexergEViCPVKCIORksSIk46L84luDva9L22A3yu0mf3F
 ixB63bAiLH7Py77kH3DmeJdqhMxlVZXCbdBVFDvzZvY4O3Mx0Dv9mmN/nw1FVCFH
 scSYnXea9/o4IY5yGASU6FAUJEEGu20HAN12oHJw7/taqV/gbbEos3F7AGmjJE0f
 qe6Rt/8fwi7Lhm2va6EoOo6yltH/gL4/AgnsN76VzppNGbaIv7W8Qa4Y/ES1lAE1
 SATAEUJfU8kiLrVOolIElPbgfdJwv8TzoxiKB5wK/eoH20wf4BTmOuBMviaL2qXK
 Sz6wihq+IlMXW7Y7pIl/
 =DrA+
 -----END PGP SIGNATURE-----

Merge tag 'drm-for-v4.17' of git://people.freedesktop.org/~airlied/linux

Pull drm updates from Dave Airlie:
 "Cannonlake and Vega12 support are probably the two major things. This
  pull lacks nouveau, Ben had some unforseen leave and a few other
  blockers so we'll see how things look or maybe leave it for this merge
  window.

  core:
   - Device links to handle sound/gpu pm dependency
   - Color encoding/range properties
   - Plane clipping into plane check helper
   - Backlight helpers
   - DP TP4 + HBR3 helper support

  amdgpu:
   - Vega12 support
   - Enable DC by default on all supported GPUs
   - Powerplay restructuring and cleanup
   - DC bandwidth calc updates
   - DC backlight on pre-DCE11
   - TTM backing store dropping support
   - SR-IOV fixes
   - Adding "wattman" like functionality
   - DC crc support
   - Improved DC dual-link handling

  amdkfd:
   - GPUVM support for dGPU
   - KFD events for dGPU
   - Enable PCIe atomics for dGPUs
   - HSA process eviction support
   - Live-lock fixes for process eviction
   - VM page table allocation fix for large-bar systems

  panel:
   - Raydium RM68200
   - AUO G104SN02 V2
   - KEO TX31D200VM0BAA
   - ARM Versatile panels

  i915:
   - Cannonlake support enabled
   - AUX-F port support added
   - Icelake base enabling until internal milestone of forcewake support
   - Query uAPI interface (used for GPU topology information currently)
   - Compressed framebuffer support for sprites
   - kmem cache shrinking when GPU is idle
   - Avoid boosting GPU when waited item is being processed already
   - Avoid retraining LSPCON link unnecessarily
   - Decrease request signaling latency
   - Deprecation of I915_SET_COLORKEY_NONE
   - Kerneldoc and compiler warning cleanup for upcoming CI enforcements
   - Full range ycbcr toggling
   - HDCP support

  i915/gvt:
   - Big refactor for shadow ppgtt
   - KBL context save/restore via LRI cmd (Weinan)
   - Properly unmap dma for guest page (Changbin)

  vmwgfx:
   - Lots of various improvements

  etnaviv:
   - Use the drm gpu scheduler
   - prep work for GC7000L support

  vc4:
   - fix alpha blending
   - Expose perf counters to userspace

  pl111:
   - Bandwidth checking/limiting
   - Versatile panel support

  sun4i:
   - A83T HDMI support
   - A80 support
   - YUV plane support
   - H3/H5 HDMI support

  omapdrm:
   - HPD support for DVI connector
   - remove lots of static variables

  msm:
   - DSI updates from 10nm / SDM845
   - fix for race condition with a3xx/a4xx fence completion irq
   - some refactoring/prep work for eventual a6xx support (ie. when we
     have a userspace)
   - a5xx debugfs enhancements
   - some mdp5 fixes/cleanups to prepare for eventually merging
     writeback
   - support (ie. when we have a userspace)

  tegra:
   - mmap() fixes for fbdev devices
   - Overlay plane for hw cursor fix
   - dma-buf cache maintenance support

  mali-dp:
   - YUV->RGB conversion support

  rockchip:
   - rk3399/chromebook fixes and improvements

  rcar-du:
   - LVDS support move to drm bridge
   - DT bindings for R8A77995
   - Driver/DT support for R8A77970

  tilcdc:
   - DRM panel support"

* tag 'drm-for-v4.17' of git://people.freedesktop.org/~airlied/linux: (1646 commits)
  drm/i915: Fix hibernation with ACPI S0 target state
  drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
  drm/i915: Specify which engines to reset following semaphore/event lockups
  drm/i915/dp: Write to SET_POWER dpcd to enable MST hub.
  drm/amdkfd: Use ordered workqueue to restore processes
  drm/amdgpu: Fix acquiring VM on large-BAR systems
  drm/amd/pp: clean header file hwmgr.h
  drm/amd/pp: use mlck_table.count for array loop index limit
  drm: Fix uabi regression by allowing garbage mode->type from userspace
  drm/amdgpu: Add an ATPX quirk for hybrid laptop
  drm/amdgpu: fix spelling mistake: "asssert" -> "assert"
  drm/amd/pp: Add new asic support in pp_psm.c
  drm/amd/pp: Clean up powerplay code on Vega12
  drm/amd/pp: Add smu irq handlers for legacy asics
  drm/amd/pp: Fix set wrong temperature range on smu7
  drm/amdgpu: Don't change preferred domian when fallback GTT v5
  drm/vmwgfx: Bump version patchlevel and date
  drm/vmwgfx: use monotonic event timestamps
  drm/vmwgfx: Unpin the screen object backup buffer when not used
  drm/vmwgfx: Stricter count of legacy surface device resources
  ...
2018-04-02 07:59:23 -07:00
Felix Kuehling
6b95e7973a drm/amdkfd: Add quiesce_mm and resume_mm to kgd2kfd_calls
These interfaces allow KGD to stop and resume all GPU user mode queue
access to a process address space. This is needed for handling MMU
notifiers of userptrs mapped for GPU access in KFD VMs.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-23 15:32:32 -04:00
Felix Kuehling
d1853f42b6 drm/amdkfd: GFP_NOIO while holding locks taken in MMU notifier
When an MMU notifier runs in memory reclaim context, it can deadlock
trying to take locks that are already held in the thread causing the
memory reclaim. The solution is to avoid memory reclaim while holding
locks that are taken in MMU notifiers by using GFP_NOIO.

This commit fixes memory allocations done while holding the dqm->lock
which is needed in the MMU notifier (dqm->ops.evict_process_queues).

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-23 15:32:31 -04:00
Felix Kuehling
1679ae8f8f drm/amdkfd: Use ordered workqueue to restore processes
Restoring multiple processes concurrently can lead to live-locks
where each process prevents the other from validating all its BOs.

v2: fix duplicate check of same variable

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-23 15:30:36 -04:00
Felix Kuehling
72a01d231d drm/amdkfd: Deallocate SDMA queues correctly
Deallocate SDMA queues during abnormal process termination and when
queue creation fails after the SDMA allocation.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-23 15:30:34 -04:00
Felix Kuehling
c70a362687 drm/amdkfd: Fix scratch memory with HWS enabled
Program sh_hidden_private_base_vmid correctly in the map-process
PM4 packet.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-23 15:30:33 -04:00
Felix Kuehling
374200b154 drm/amdkfd: Add module option for testing large-BAR functionality
Simulate large-BAR system by exporting only visible memory. This
limits the amount of available VRAM to the size of the BAR, but
enables CPU access to VRAM.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:53 -04:00
Felix Kuehling
0fc8011f89 drm/amdkfd: Kmap event page for dGPUs
The events page must be accessible in user mode by the GPU and CPU
as well as in kernel mode by the CPU. On dGPUs user mode virtual
addresses are managed by the Thunk's GPU memory allocation code.
Therefore we can't allocate the memory in kernel mode like we do
on APUs. But KFD still needs to map the memory for kernel access.
To facilitate this, the Thunk provides the buffer handle of the
events page to KFD when creating the first event.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:52 -04:00
Felix Kuehling
5ec7e02854 drm/amdkfd: Add ioctls for GPUVM memory management
v2:
* Fix error handling after kfd_bind_process_to_device in
  kfd_ioctl_map_memory_to_gpu
v3:
* Add ioctl to acquire VM from a DRM FD
v4:
* Return number of successful map/unmap operations in failure cases
* Facilitate partial retry after failed map/unmap
* Added comments with parameter descriptions to new APIs
* Defined AMDKFD_IOC_FREE_MEMORY_OF_GPU write-only

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:51 -04:00
Felix Kuehling
552764b680 drm/amdkfd: Add TC flush on VMID deallocation for Hawaii
On GFX7 the CP does not perform a TC flush when queues are unmapped.
To avoid TC eviction from accessing an invalid VMID, flush it
explicitly before releasing a VMID.

v2: Fix unnecessary list_for_each_entry_safe
v3: Moved allocation to kfd_process_device_init_vm

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:50 -04:00
Felix Kuehling
f35751b870 drm/amdkfd: Allocate CWSR trap handler memory for dGPUs
Add helpers for allocating GPUVM memory in kernel mode and use them
to allocate memory for the CWSR trap handler.

v2: Use dev instead of pdd->dev in kfd_process_free_gpuvm
v3:
* Cleaned up and simplified kfd_process_alloc_gpuvm
* Moved allocation for dGPU to kfd_process_device_init_vm

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:49 -04:00
Felix Kuehling
52b29d7334 drm/amdkfd: Add per-process IDR for buffer handles
Also used for cleaning up on process termination.

v2: Refactored cleanup on process termination

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:48 -04:00
Felix Kuehling
d01994c24c drm/amdkfd: Aperture setup for dGPUs
Set up the GPUVM aperture for SVM (shared virtual memory) that allows
sharing a part of virtual address space between GPUs and CPUs.

Report the size of the GPUVM aperture that is supported by KGD accurately.

The low part of the GPUVM aperture is reserved for kernel use. This is
for kernel-allocated buffers that are only accessed on the GPU:
- CWSR trap handler
- IB for submitting commands in user-mode context from kernel mode

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:47 -04:00
Felix Kuehling
c7bcbfa4f8 drm/amdkfd: Remove limit on number of GPUs
Currently the number of GPUs is limited by aperture placement options
available on GFX7 and GFX8 hardware. This limitation is not necessary.
Scratch and LDS represent per-work-item and per-work-group storage
respectively. Different work-items and work-groups use the same virtual
address to access their own data. Work running on different GPUs is by
definition in different work-groups (different dispatches, in fact).
That means the same virtual addresses can be used for these apertures
on different GPUs.

Add a new AMDKFD_IOC_GET_PROCESS_APERTURES_NEW ioctl that removes the
artificial limitation on the number of GPUs that can be supported. The
new ioctl allows user mode to query the number of GPUs to allocate
enough memory for all GPUs to be reported.

This deprecates AMDKFD_IOC_GET_PROCESS_APERTURES.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:46 -04:00
Oak Zeng
7c9b717196 drm/amdkfd: Populate DRM render device minor
Populate DRM render device minor in kfd topology

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:45 -04:00
Felix Kuehling
b84394e206 drm/amdkfd: Create KFD VMs on demand
Instead of creating all VMs on process creation, create them when
a process is bound to a device. This will later allow registering
an existing VM from a DRM render node FD at runtime, before the
process is bound to the device. This way the render node VM can be
used for KFD instead of creating our own redundant VM.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:27:44 -04:00
Arnd Bergmann
48a4438718 drm/amdkfd: fix uninitialized variable use
When CONFIG_ACPI is disabled, we never initialize the acpi_table
structure in kfd_create_crat_image_virtual:

drivers/gpu/drm/amd/amdkfd/kfd_crat.c: In function 'kfd_create_crat_image_virtual':
drivers/gpu/drm/amd/amdkfd/kfd_crat.c:888:40: error: 'acpi_table' may be used uninitialized in this function [-Werror=maybe-uninitialized]

The undefined behavior also happens for any other acpi_get_table()
failure, but then the compiler can't warn about it.

This adds an error check that prevents the structure from
being used in error, avoiding both the undefined behavior and
the warning about it.

Fixes: 520b8fb755 ("drm/amdkfd: Add topology support for CPUs")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-03-15 17:49:40 +01:00
Felix Kuehling
26103436da drm/amdkfd: Implement KFD process eviction/restore
When the TTM memory manager in KGD evicts BOs, all user mode queues
potentially accessing these BOs must be evicted temporarily. Once
user mode queues are evicted, the eviction fence is signaled,
allowing the migration of the BO to proceed.

A delayed worker is scheduled to restore all the BOs belonging to
the evicted process and restart its queues.

During suspend/resume of the GPU we also evict all processes to allow
KGD to save BOs in system memory, since VRAM will be lost.

v2:
* Account for eviction when updating of q->is_active in MQD manager

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-02-06 20:32:45 -05:00
Felix Kuehling
403575c44e drm/amdkfd: Add GPUVM virtual address space to PDD
Create/destroy the GPUVM context during PDD creation/destruction.
Get VM page table base and program it during process registration
(HWS) or VMID allocation (non-HWS).

v2:
* Used dev instead of pdd->dev in kfd_flush_tlb

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-02-06 20:32:44 -05:00
Harish Kasiviswanathan
4252bf6866 drm/amdkfd: Remove unaligned memory access
Unaligned atomic operations can cause problems on some CPU
architectures. Use simpler bitmask operations instead. Atomic bit
manipulations are not necessary since dqm->lock is held during these
operations.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-02-06 20:32:42 -05:00
Gustavo A. R. Silva
2e3dca5365 drm/amdkfd: Fix potential NULL pointer dereferences
In case kfd_get_process_device_data returns null, there are some
null pointer dereferences in functions kfd_bind_processes_to_device
and kfd_unbind_processes_from_device.

Fix this by printing a WARN_ON for PDDs that aren't found and skip
them with continue statements.

Addresses-Coverity-ID: 1463794 ("Dereference null return value")
Addresses-Coverity-ID: 1463772 ("Dereference null return value")
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-01-10 17:15:09 -06:00
Oded Gabbay
a1235e10ee drm/amdkfd: add ull suffix to 64bit defines
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-01-10 12:55:17 +02:00
Yong Zhao
40a526dc1e drm/amdkfd: don't always call execute_queues_cpsch()
When destroying an inactive queue, we don't need to call
execute_queues_cpsch.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Oak Zeng <oak.zeng@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-01-02 13:10:50 -05:00
Yong Zhao
9e8272240b drm/amdkfd: Fix return value 0 when execute_queues_cpsch fails
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Oak Zeng <oak.zeng@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-01-02 13:10:49 -05:00
Harish Kasiviswanathan
b441093e40 drm/amdkfd: Ignore ACPI CRAT for non-APU systems
Some AMD motherboards without an APU have a broken CRAT table which
causes KFD initialization failures or incorrect information about
NUMA nodes, CPU cores or system memory. Ignore CRAT tables without
GPUs and rely on KFD's code to create a CRAT table for the CPU.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:09:04 -05:00
Felix Kuehling
ebcfd1e276 drm/amdkfd: Module option to disable CRAT table
Some systems have broken CRAT tables. Add a module option to ignore
a CRAT table.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:09:03 -05:00
Ben Goz
413e85d5d3 drm/amdkfd: Add AQL Queue Memory flag on topology
This is needed for enabling a user-mode workaround for an AQL queue
wrapping HW bug on Tonga.

Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:09:02 -05:00
Philip Cox
70f372bffc drm/amdkfd: Fixup incorrect info in the CZ CRAT table
* Wrong value for max_waves_per_simd
* Missing ATC capability bit

Signed-off-by: Philip Cox <Philip.Cox@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:09:01 -05:00
Amber Lin
f475734729 drm/amdkfd: Add perf counters to topology
For hardware blocks whose performance counters are accessed via MMIO
registers, KFD provides the support for those privileged blocks.

IOMMU is one of those privileged blocks. Most performance counter properties
required by Thunk are available at /sys/bus/event_source/devices/amd_iommu.

This patch adds properties to topology in KFD sysfs for information not
available in /sys/bus/event_source/devices/amd_iommu. They are shown at
/sys/devices/virtual/kfd/kfd/topology/nodes/0/perf/iommu/ formatted as
/sys/devices/virtual/kfd/kfd/topology/nodes/0/perf/<block>/<property>, i.e.
/sys/devices/virtual/kfd/kfd/topology/nodes/0/perf/iommu/max_concurrent.

For dGPUs, who don't have IOMMU, nothing appears under
/sys/devices/virtual/kfd/kfd/topology/nodes/0/perf.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:09:00 -05:00
Harish Kasiviswanathan
3a87177eb1 drm/amdkfd: Add topology support for dGPUs
Generate and parse VCRAT tables for dGPUs in kfd_topology_add_device.

Some information that isn't available in the CRAT table is patched
into the topology after parsing.

HSA_CAP_DOORBELL_TYPE_1_0 is dependent on the ASIC feature
CP_HQD_PQ_CONTROL.SLOT_BASED_WPTR, which was not introduced in VI
until Carrizo. Report HSA_CAP_DOORBELL_TYPE_PRE_1_0 on Tonga ASICs.

v2: Added #include <linux/pci.h> to kfd_crat.c to make it compile

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:59 -05:00
Felix Kuehling
520b8fb755 drm/amdkfd: Add topology support for CPUs
Currently, the KFD topology information is generated by parsing the CRAT
(ACPI) table. However, at present CRAT table is available only for AMD
APUs. To support CPUs on systems without a CRAT table, the KFD driver will
create a Virtual CRAT (VCRAT) table and then the existing code will parse
that table to generate topology.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:58 -05:00
Harish Kasiviswanathan
bc0c75a367 drm/amdkfd: Fix sibling_map[] size
Change kfd_cache_properties.sibling_map[256] to
kfd_cache_properties.sibling_map[32]. Since, CRAT uses bitmap for
sibling_map, it is more efficient to use bitmap in the kfd structure
also.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:57 -05:00
Felix Kuehling
175b926335 drm/amdkfd: Simplify counting of memory banks
Only count memory banks in one place. Ignore redundant num_banks
entry in crat_subtype_computeunit.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:56 -05:00
Felix Kuehling
42aa8793d7 drm/amdkfd: Turn verbose topology messages into pr_debug
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:55 -05:00
Harish Kasiviswanathan
4f2937bfff drm/amdkfd: sync IOLINK defines to thunk spec
Current thunk spec v1.07 dated Feb 1, 2016

v2: fix indentation

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:54 -05:00
Harish Kasiviswanathan
6d82eb0ef2 drm/amdkfd: Support enumerating non-GPU devices
Modify kfd_topology_enum_kfd_devices(..) function to support non-GPU
nodes. The function returned NULL when it encountered non-GPU (say CPU)
nodes. This caused kfd_ioctl_create_event and kfd_init_apertures to fail
for Intel + Tonga.

kfd_topology_enum_kfd_devices will now parse all the nodes and return
valid kfd_dev for nodes with GPU.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:53 -05:00
Harish Kasiviswanathan
4f449311e9 drm/amdkfd: Decouple CRAT parsing from device list update
Currently, CRAT parsing is intertwined with topology_device_list and
hence repeated calls to kfd_parse_crat_table() will fail. Decouple
kfd_parse_crat_table() and topology_device_list.

kfd_parse_crat_table() will parse CRAT and add topology devices to a
temporary list temp_topology_device_list and then
kfd_topology_update_device_list will move contents from temporary list to
master list.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:52 -05:00
Harish Kasiviswanathan
8e05247d4c drm/amdkfd: Reorganize CRAT fetching from ACPI
Reorganize and rename kfd_topology_get_crat_acpi function. In this way
acpi_get_table(..) needs to be called only once. This will also aid in
dGPU topology implementation.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:51 -05:00
Felix Kuehling
174de876d6 drm/amdkfd: Group up CRAT related functions
Take CRAT related functions out of kfd_topology.c and place them in
kfd_crat.c. This is the initial step of supporting more CRAT features,
i.e. creating virtual CRAT table for KFD devices without CRAT.

v2: Minor cleanup that was missed previously because code moved around

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:49 -05:00
Yong Zhao
5108d76840 drm/amdkfd: Fix memory leaks in kfd topology
Kobject created using kobject_create_and_add() can be freed using
kobject_put() when there is no referenece any more. However,
kobject memory allocated with kzalloc() has to set up a release
callback in order to free it when the counter decreases to 0.
Otherwise it causes memory leak.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:48 -05:00
Harish Kasiviswanathan
d63f0ba27a drm/amdkfd: Topology: Fix location_id
Fix location_id format to match Thunk specification.

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:47 -05:00
Flora Cui
f7ce2fade6 drm/amdkfd: Update number of compute unit from KGD
Overwrite the active simd_count from KGD at driver loading time. This is
based on assumption that register GC_USER_SHADER_ARRAY_CONFIG won’t get
changed.

V2: remove the incorrect simd_count reported at loading module.

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed by: Yair Shachar< yair.shachar@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:46 -05:00
Harish Kasiviswanathan
0504cccf34 drm/amdkfd: Stop using get_vmem_size KGD-KFD interface
get_vmem_size() is deprecated. Instead use get_local_mem_info().

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 23:08:43 -05:00
Felix Kuehling
64d1c3a43a drm/amdkfd: Centralize IOMMUv2 code and make it conditional
dGPUs work without IOMMUv2. Make IOMMUv2 initialization dependent on
ASIC information. Also allow building KFD without IOMMUv2 support.
This is still useful for dGPUs and prepares for enabling KFD on
architectures that don't support AMD IOMMUv2.

v2:
* Centralize IOMMUv2 code to avoid #ifdefs in too many places

v3:
* Imply AMD_IOMMU_V2 in Kconfig

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-12-08 19:22:12 -05:00
Felix Kuehling
a3084e6c52 drm/amdkfd: Add dGPU device IDs and device info
v2: remove needs_iommu field as it doesn't exists

CC: linux-pci@vger.kernel.org
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-01-04 17:17:47 -05:00
Felix Kuehling
1d63669885 drm/amdkfd: Add dGPU support to kernel_queue_init
Recognize dGPU ASIC families.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-01-04 17:17:46 -05:00