drm/sched: Document what the timedout_job method should do

The documentation is a bit vague and doesn't really describe what the
->timedout_job() is expected to do. Let's add a few more details.

v5:
* New patch

Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-2-boris.brezillon@collabora.com
This commit is contained in:
Boris Brezillon 2021-06-30 08:27:36 +02:00
parent 60f3c604bc
commit 1fad1b7ed1
1 changed files with 14 additions and 0 deletions

View File

@ -239,6 +239,20 @@ struct drm_sched_backend_ops {
* @timedout_job: Called when a job has taken too long to execute,
* to trigger GPU recovery.
*
* This method is called in a workqueue context.
*
* Drivers typically issue a reset to recover from GPU hangs, and this
* procedure usually follows the following workflow:
*
* 1. Stop the scheduler using drm_sched_stop(). This will park the
* scheduler thread and cancel the timeout work, guaranteeing that
* nothing is queued while we reset the hardware queue
* 2. Try to gracefully stop non-faulty jobs (optional)
* 3. Issue a GPU reset (driver-specific)
* 4. Re-submit jobs using drm_sched_resubmit_jobs()
* 5. Restart the scheduler using drm_sched_start(). At that point, new
* jobs can be queued, and the scheduler thread is unblocked
*
* Return DRM_GPU_SCHED_STAT_NOMINAL, when all is normal,
* and the underlying driver has started or completed recovery.
*