2022-03-11 02:55:55

by Rob Clark

Subject: [PATCH 0/3] drm/msm/gpu: More system suspend fixes

From: Rob Clark <[email protected]>

In particular, we want to park the scheduler threads so that suspend
is not racing with the kthread pushing more jobs to the driver.

Rob Clark (3):
drm/msm/gpu: Rename runtime suspend/resume functions
drm/msm/gpu: Park scheduler threads for system suspend
drm/msm/gpu: Remove mutex from wait_event condition

drivers/gpu/drm/msm/adreno/adreno_device.c | 79 ++++++++++++++++++----
1 file changed, 65 insertions(+), 14 deletions(-)

--
2.35.1
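
Patch 2/3 is not reproduced in this thread, so the following is only a userspace sketch of the parking idea described above, not the driver code. It assumes a pthread stands in for the scheduler kthread, and every name in it (struct worker, worker_park(), worker_unpark(), park_demo()) is hypothetical. In the kernel the corresponding primitives are kthread_park() and kthread_unpark(); whether and how the patch uses them is not shown here.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical stand-in for one scheduler thread; not a kernel structure. */
struct worker {
	pthread_t thread;
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool should_park;	/* set by the suspend path */
	bool parked;		/* set by the worker once it stopped pushing */
	bool should_stop;
	long jobs_pushed;	/* stands in for jobs handed to the driver */
};

static void *worker_fn(void *arg)
{
	struct worker *w = arg;

	pthread_mutex_lock(&w->lock);
	while (!w->should_stop) {
		if (w->should_park) {
			/* Acknowledge the request, then sleep until unparked. */
			w->parked = true;
			pthread_cond_broadcast(&w->cond);
			while (w->should_park && !w->should_stop)
				pthread_cond_wait(&w->cond, &w->lock);
			w->parked = false;
			continue;
		}
		w->jobs_pushed++;	/* "push a job to the driver" */
		pthread_mutex_unlock(&w->lock);
		sched_yield();
		pthread_mutex_lock(&w->lock);
	}
	pthread_mutex_unlock(&w->lock);
	return NULL;
}

static int worker_start(struct worker *w)
{
	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->cond, NULL);
	w->should_park = w->parked = w->should_stop = false;
	w->jobs_pushed = 0;
	return pthread_create(&w->thread, NULL, worker_fn, w);
}

/* Returns only after the worker has acknowledged the park request. */
static void worker_park(struct worker *w)
{
	pthread_mutex_lock(&w->lock);
	w->should_park = true;
	while (!w->parked)
		pthread_cond_wait(&w->cond, &w->lock);
	assert(w->parked);	/* from here on, no new jobs are pushed */
	pthread_mutex_unlock(&w->lock);
}

static void worker_unpark(struct worker *w)
{
	pthread_mutex_lock(&w->lock);
	w->should_park = false;
	pthread_cond_broadcast(&w->cond);
	pthread_mutex_unlock(&w->lock);
}

static void worker_stop(struct worker *w)
{
	pthread_mutex_lock(&w->lock);
	w->should_stop = true;
	pthread_cond_broadcast(&w->cond);
	pthread_mutex_unlock(&w->lock);
	pthread_join(w->thread, NULL);
}

/* Returns 0 if no job was pushed while the worker was parked. */
static int park_demo(void)
{
	struct worker w;
	long before, after;

	if (worker_start(&w) != 0)
		return -1;
	worker_park(&w);
	before = w.jobs_pushed;	/* stable: the worker is parked */
	sched_yield();		/* give the worker a chance to misbehave */
	after = w.jobs_pushed;
	worker_unpark(&w);
	worker_stop(&w);
	return before == after ? 0 : 1;
}
```

Calling park_demo() from a small main() and checking for a return value of 0 exercises the sketch (build with -pthread). The point it illustrates is that the suspend path waits for the worker to acknowledge the park request before going on, so nothing new is pushed while the suspend is in progress.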


2022-03-11 08:12:05

by Rob Clark

Subject: [PATCH 1/3] drm/msm/gpu: Rename runtime suspend/resume functions

From: Rob Clark <[email protected]>

Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/msm/adreno/adreno_device.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
index 89cfd84760d7..8859834b51b8 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -600,7 +600,7 @@ static const struct of_device_id dt_match[] = {
 };

 #ifdef CONFIG_PM
-static int adreno_resume(struct device *dev)
+static int adreno_runtime_resume(struct device *dev)
 {
 	struct msm_gpu *gpu = dev_to_gpu(dev);

@@ -616,7 +616,7 @@ static int active_submits(struct msm_gpu *gpu)
 	return active_submits;
 }

-static int adreno_suspend(struct device *dev)
+static int adreno_runtime_suspend(struct device *dev)
 {
 	struct msm_gpu *gpu = dev_to_gpu(dev);
 	int remaining;
@@ -635,7 +635,7 @@ static int adreno_suspend(struct device *dev)

 static const struct dev_pm_ops adreno_pm_ops = {
 	SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
-	SET_RUNTIME_PM_OPS(adreno_suspend, adreno_resume, NULL)
+	SET_RUNTIME_PM_OPS(adreno_runtime_suspend, adreno_runtime_resume, NULL)
 };

 static struct platform_driver adreno_driver = {
--
2.35.1

Subject: Re: [PATCH 1/3] drm/msm/gpu: Rename runtime suspend/resume functions

On 11/03/22 00:46, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>

Hey Rob,
looks like you've somehow lost the commit description on this one!

Cheers,
Angelo

> Signed-off-by: Rob Clark <[email protected]>
> ---
> drivers/gpu/drm/msm/adreno/adreno_device.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>

2022-03-11 22:13:37

by Rob Clark

Subject: [PATCH 3/3] drm/msm/gpu: Remove mutex from wait_event condition

From: Rob Clark <[email protected]>

The mutex wasn't really protecting anything before. Before the previous
patch we could still be racing with the scheduler's kthread, as that is
not necessarily frozen yet. Now that we've parked the sched threads,
the only race is with jobs retiring, and that is harmless, ie.

Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/msm/adreno/adreno_device.c | 11 +----------
1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
index 0440a98988fc..661dfa7681fb 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -607,15 +607,6 @@ static int adreno_runtime_resume(struct device *dev)
 	return gpu->funcs->pm_resume(gpu);
 }

-static int active_submits(struct msm_gpu *gpu)
-{
-	int active_submits;
-	mutex_lock(&gpu->active_lock);
-	active_submits = gpu->active_submits;
-	mutex_unlock(&gpu->active_lock);
-	return active_submits;
-}
-
 static int adreno_runtime_suspend(struct device *dev)
 {
 	struct msm_gpu *gpu = dev_to_gpu(dev);
@@ -669,7 +660,7 @@ static int adreno_system_suspend(struct device *dev)
 	suspend_scheduler(gpu);

 	remaining = wait_event_timeout(gpu->retire_event,
-				       active_submits(gpu) == 0,
+				       gpu->active_submits == 0,
 				       msecs_to_jiffies(1000));
 	if (remaining == 0) {
 		dev_err(dev, "Timeout waiting for GPU to suspend\n");
--
2.35.1

2022-03-17 21:14:11

by Akhil P Oommen

Subject: Re: [Freedreno] [PATCH 3/3] drm/msm/gpu: Remove mutex from wait_event condition

On 3/11/2022 5:16 AM, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>
> The mutex wasn't really protecting anything before. Before the previous
> patch we could still be racing with the scheduler's kthread, as that is
> not necessarily frozen yet. Now that we've parked the sched threads,
> the only race is with jobs retiring, and that is harmless, ie.
>
> Signed-off-by: Rob Clark <[email protected]>
> ---
> drivers/gpu/drm/msm/adreno/adreno_device.c | 11 +----------
> 1 file changed, 1 insertion(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
> index 0440a98988fc..661dfa7681fb 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> @@ -607,15 +607,6 @@ static int adreno_runtime_resume(struct device *dev)
> return gpu->funcs->pm_resume(gpu);
> }
>
> -static int active_submits(struct msm_gpu *gpu)
> -{
> - int active_submits;
> - mutex_lock(&gpu->active_lock);
> - active_submits = gpu->active_submits;
> - mutex_unlock(&gpu->active_lock);
I assumed that this lock here was to ensure proper barriers while
reading active_submits. Is that not required?

-Akhil.
> - return active_submits;
> -}
> -
> static int adreno_runtime_suspend(struct device *dev)
> {
> struct msm_gpu *gpu = dev_to_gpu(dev);
> @@ -669,7 +660,7 @@ static int adreno_system_suspend(struct device *dev)
> suspend_scheduler(gpu);
>
> remaining = wait_event_timeout(gpu->retire_event,
> - active_submits(gpu) == 0,
> + gpu->active_submits == 0,
> msecs_to_jiffies(1000));
> if (remaining == 0) {
> dev_err(dev, "Timeout waiting for GPU to suspend\n");

2022-03-17 21:41:00

by Rob Clark

Subject: Re: [Freedreno] [PATCH 3/3] drm/msm/gpu: Remove mutex from wait_event condition

On Thu, Mar 17, 2022 at 1:45 PM Akhil P Oommen <[email protected]> wrote:
>
> On 3/11/2022 5:16 AM, Rob Clark wrote:
> > From: Rob Clark <[email protected]>
> >
> > The mutex wasn't really protecting anything before. Before the previous
> > patch we could still be racing with the scheduler's kthread, as that is
> > not necessarily frozen yet. Now that we've parked the sched threads,
> > the only race is with jobs retiring, and that is harmless, ie.
> >
> > Signed-off-by: Rob Clark <[email protected]>
> > ---
> > drivers/gpu/drm/msm/adreno/adreno_device.c | 11 +----------
> > 1 file changed, 1 insertion(+), 10 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > index 0440a98988fc..661dfa7681fb 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > @@ -607,15 +607,6 @@ static int adreno_runtime_resume(struct device *dev)
> > return gpu->funcs->pm_resume(gpu);
> > }
> >
> > -static int active_submits(struct msm_gpu *gpu)
> > -{
> > - int active_submits;
> > - mutex_lock(&gpu->active_lock);
> > - active_submits = gpu->active_submits;
> > - mutex_unlock(&gpu->active_lock);
> I assumed that this lock here was to ensure proper barriers while
> reading active_submits. Is that not required?

There is a spinlock in prepare_to_wait_event() ahead of checking the
condition, which AFAIU is a sufficient barrier.

BR,
-R

>
> -Akhil.
> > - return active_submits;
> > -}
> > -
> > static int adreno_runtime_suspend(struct device *dev)
> > {
> > struct msm_gpu *gpu = dev_to_gpu(dev);
> > @@ -669,7 +660,7 @@ static int adreno_system_suspend(struct device *dev)
> > suspend_scheduler(gpu);
> >
> > remaining = wait_event_timeout(gpu->retire_event,
> > - active_submits(gpu) == 0,
> > + gpu->active_submits == 0,
> > msecs_to_jiffies(1000));
> > if (remaining == 0) {
> > dev_err(dev, "Timeout waiting for GPU to suspend\n");
>
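
For readers unfamiliar with the return convention of the wait in patch 3/3: wait_event_timeout() returns 0 when the timeout elapsed with the condition still false, and a positive value (the remaining jiffies, at least 1) when the condition became true, which is why the driver treats remaining == 0 as a failure. Below is a minimal userspace sketch of that convention. It assumes a pthread condition variable stands in for gpu->retire_event, and struct gpu_sim and its helpers are hypothetical names. Unlike the patched kernel code, the sketch has to hold a mutex around the predicate, because pthread condition variables require one; it does not model the barrier question discussed above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-in for the GPU state; not the driver's msm_gpu. */
struct gpu_sim {
	pthread_mutex_t lock;
	pthread_cond_t retire_event;
	int active_submits;
};

static void gpu_sim_init(struct gpu_sim *g, int active_submits)
{
	pthread_mutex_init(&g->lock, NULL);
	pthread_cond_init(&g->retire_event, NULL);
	g->active_submits = active_submits;
}

/* A job retires: drop the count and wake any waiter. */
static void gpu_sim_retire(struct gpu_sim *g)
{
	pthread_mutex_lock(&g->lock);
	if (g->active_submits > 0)
		g->active_submits--;
	pthread_cond_broadcast(&g->retire_event);
	pthread_mutex_unlock(&g->lock);
}

/* Milliseconds from now until the deadline (may be zero or negative). */
static long ms_until(const struct timespec *deadline)
{
	struct timespec now;

	clock_gettime(CLOCK_REALTIME, &now);
	return (long)(deadline->tv_sec - now.tv_sec) * 1000 +
	       (deadline->tv_nsec - now.tv_nsec) / 1000000;
}

/*
 * Mirrors the wait_event_timeout() convention: 0 on timeout with work
 * still pending, otherwise the remaining time in ms, clamped to >= 1.
 */
static long gpu_sim_wait_idle(struct gpu_sim *g, long timeout_ms)
{
	struct timespec deadline;
	long remaining;
	int err = 0;

	assert(timeout_ms >= 0);
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&g->lock);
	while (g->active_submits != 0 && err != ETIMEDOUT)
		err = pthread_cond_timedwait(&g->retire_event, &g->lock,
					     &deadline);
	if (g->active_submits != 0) {
		remaining = 0;
	} else {
		remaining = ms_until(&deadline);
		if (remaining < 1)
			remaining = 1;
	}
	pthread_mutex_unlock(&g->lock);
	return remaining;
}
```

A caller would check the result the same way the driver does: a return of 0 from gpu_sim_wait_idle() means submits were still active when the timeout expired, and any positive value means the queue drained in time.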