drm/amdkfd: Reset GPU on queue preemption failure

commit 8bdfb4ea95 upstream.

Currently, with F32 HWS GPU reset is only when unmap queue fails.

However, if compute queue doesn't repond to preemption request in time
unmap will return without any error. In this case, only preemption error
is logged and Reset is not triggered. Call GPU reset in this case also.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This commit is contained in:
Harish Kasiviswanathan 2024-03-26 15:32:46 -04:00 committed by Greg Kroah-Hartman
parent 1a867afa54
commit 38042ce767
1 changed files with 1 additions and 0 deletions

View File

@ -1997,6 +1997,7 @@ static int unmap_queues_cpsch(struct device_queue_manager *dqm,
dev_err(dev, "HIQ MQD's queue_doorbell_id0 is not 0, Queue preemption time out\n");
while (halt_if_hws_hang)
schedule();
kfd_hws_hang(dqm);
return -ETIME;
}