2019-04-18 08:46:05

by Tomeu Vizoso

Subject: [PATCH] drm/panfrost: Prevent concurrent resets

If a job times out in slot 0 while a reset is performed because a job
timed out in slot 1, the drm-sched core can get into a deadlock.

Signed-off-by: Tomeu Vizoso <[email protected]>
---
drivers/gpu/drm/panfrost/panfrost_device.c | 1 +
drivers/gpu/drm/panfrost/panfrost_device.h | 1 +
drivers/gpu/drm/panfrost/panfrost_job.c | 4 ++++
3 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c
index 91e8fb0f2b25..970f669c6d29 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.c
+++ b/drivers/gpu/drm/panfrost/panfrost_device.c
@@ -98,6 +98,7 @@ int panfrost_device_init(struct panfrost_device *pfdev)
 	struct resource *res;

 	mutex_init(&pfdev->sched_lock);
+	mutex_init(&pfdev->reset_lock);
 	INIT_LIST_HEAD(&pfdev->scheduled_jobs);

 	spin_lock_init(&pfdev->hwaccess_lock);
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
index 1ba48d105763..56f452dfb490 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -78,6 +78,7 @@ struct panfrost_device {
 	struct list_head scheduled_jobs;

 	struct mutex sched_lock;
+	struct mutex reset_lock;

 	struct {
 		struct devfreq *devfreq;
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 0a7ed04f7d52..a5716c8fe8b3 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -384,6 +384,8 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
 		job_read(pfdev, JS_TAIL_LO(js)),
 		sched_job);

+	mutex_lock(&pfdev->reset_lock);
+
 	for (i = 0; i < NUM_JOB_SLOTS; i++)
 		drm_sched_stop(&pfdev->js->queue[i].sched);

@@ -406,6 +408,8 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
 	/* restart scheduler after GPU is usable again */
 	for (i = 0; i < NUM_JOB_SLOTS; i++)
 		drm_sched_start(&pfdev->js->queue[i].sched, true);
+
+	mutex_unlock(&pfdev->reset_lock);
 }

 static const struct drm_sched_backend_ops panfrost_sched_ops = {
--
2.20.1


2019-04-18 14:35:05

by Rob Herring (Arm)

Subject: Re: [PATCH] drm/panfrost: Prevent concurrent resets

On Thu, Apr 18, 2019 at 3:43 AM Tomeu Vizoso <[email protected]> wrote:
>
> If a job times out in slot 0 while a reset is performed because a job
> timed out in slot 1, the drm-sched core can get into a deadlock.
>
> Signed-off-by: Tomeu Vizoso <[email protected]>
> ---
> drivers/gpu/drm/panfrost/panfrost_device.c | 1 +
> drivers/gpu/drm/panfrost/panfrost_device.h | 1 +
> drivers/gpu/drm/panfrost/panfrost_job.c | 4 ++++
> 3 files changed, 6 insertions(+)

Applied, thanks.
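
For readers following the thread, the serialization the patch introduces can be illustrated outside the kernel. The sketch below is a user-space analogy only, not driver code: pthreads stand in for the scheduler's timeout handlers, and `fake_device`, `fake_job_timedout` and `run_concurrent_timeouts` are hypothetical names invented for this example. It assumes one timeout fires per job slot at roughly the same time and records how many handlers are ever inside the reset path at once.

```c
/* User-space analogy only. pthread mutexes stand in for the kernel
 * mutex added by the patch; all names here are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define NUM_JOB_SLOTS 2

struct fake_device {
	pthread_mutex_t reset_lock;  /* plays the role of pfdev->reset_lock */
	atomic_int resets_in_flight; /* handlers currently in the reset path */
	atomic_int max_in_flight;    /* highest overlap observed so far */
	atomic_int resets_done;      /* completed reset sequences */
};

/* Stand-in for a timeout handler: one thread per timed-out slot. */
static void *fake_job_timedout(void *arg)
{
	struct fake_device *dev = arg;

	pthread_mutex_lock(&dev->reset_lock);

	/* Record how many handlers are inside the reset path right now. */
	int now = atomic_fetch_add(&dev->resets_in_flight, 1) + 1;
	int prev = atomic_load(&dev->max_in_flight);
	while (now > prev &&
	       !atomic_compare_exchange_weak(&dev->max_in_flight, &prev, now))
		;

	/* Placeholder for: stop all schedulers, reset the GPU, restart
	 * all schedulers. Yielding gives the other thread a chance to
	 * run while this one is still "resetting". */
	sched_yield();

	atomic_fetch_sub(&dev->resets_in_flight, 1);
	atomic_fetch_add(&dev->resets_done, 1);

	pthread_mutex_unlock(&dev->reset_lock);
	return NULL;
}

/* Fires one simulated timeout per slot and returns the largest number
 * of handlers that were ever inside the reset path at the same time. */
int run_concurrent_timeouts(void)
{
	struct fake_device dev;
	pthread_t threads[NUM_JOB_SLOTS];
	int i, rc;

	pthread_mutex_init(&dev.reset_lock, NULL);
	atomic_init(&dev.resets_in_flight, 0);
	atomic_init(&dev.max_in_flight, 0);
	atomic_init(&dev.resets_done, 0);

	for (i = 0; i < NUM_JOB_SLOTS; i++) {
		rc = pthread_create(&threads[i], NULL, fake_job_timedout, &dev);
		assert(rc == 0);
	}
	for (i = 0; i < NUM_JOB_SLOTS; i++)
		pthread_join(threads[i], NULL);

	assert(atomic_load(&dev.resets_done) == NUM_JOB_SLOTS);
	pthread_mutex_destroy(&dev.reset_lock);
	return atomic_load(&dev.max_in_flight);
}
```

Calling `run_concurrent_timeouts()` from a small `main` returns 1 while the lock is held around the whole sequence. With the lock and unlock calls removed, the two handlers may overlap, which corresponds to the situation the commit message describes. The sketch does not reproduce the drm-sched deadlock itself.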