2019-09-09 03:41:00

by Rob Clark

Subject: [PATCH] iommu/arm-smmu: fix "hang" when games exit

From: Rob Clark <[email protected]>

When games, browser, or anything using a lot of GPU buffers exits, there
can be many hundreds or thousands of buffers to unmap and free. If the
GPU is otherwise suspended, this can cause arm-smmu to resume/suspend
for each buffer, resulting in 5-10 seconds worth of reprogramming the
context bank (arm_smmu_write_context_bank()/arm_smmu_write_s2cr()/etc).
To the user it would appear that the system is locked up.

A simple solution is to use pm_runtime_put_autosuspend() instead, so we
don't immediately suspend the SMMU device.

Signed-off-by: Rob Clark <[email protected]>
---
Note: I've tied the autosuspend enable/delay to the consumer device,
based on the reasoning that if the consumer device benefits from using
an autosuspend delay, then its corresponding SMMU probably does too.
Maybe that is overkill and we should just unconditionally enable
autosuspend.

drivers/iommu/arm-smmu.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index c2733b447d9c..73a0dd53c8a3 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -289,7 +289,7 @@ static inline int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
 static inline void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
 {
 	if (pm_runtime_enabled(smmu->dev))
-		pm_runtime_put(smmu->dev);
+		pm_runtime_put_autosuspend(smmu->dev);
 }

 static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
@@ -1445,6 +1445,15 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	/* Looks ok, so add the device to the domain */
 	ret = arm_smmu_domain_add_master(smmu_domain, fwspec);

+#ifdef CONFIG_PM
+	/* TODO maybe device_link_add() should do this for us? */
+	if (dev->power.use_autosuspend) {
+		pm_runtime_set_autosuspend_delay(smmu->dev,
+			dev->power.autosuspend_delay);
+		pm_runtime_use_autosuspend(smmu->dev);
+	}
+#endif
+
 rpm_put:
 	arm_smmu_rpm_put(smmu);
 	return ret;
--
2.21.0

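The effect described in the commit message can be illustrated with a small userspace model. This is only a sketch, not kernel code: count_resumes() and its parameters are hypothetical names, and the model assumes a put with a zero delay suspends the device before the next unmap arrives, which simplifies how runtime PM actually schedules its idle work.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model (hypothetical, userspace only): count how many times a
 * device is resumed while servicing a burst of unmap calls.
 *
 * n_unmaps - number of unmap operations in the burst
 * gap_ms   - idle time between consecutive unmaps
 * delay_ms - autosuspend delay; 0 models suspending as soon as the
 *            usage count drops to zero
 */
unsigned int count_resumes(unsigned int n_unmaps, unsigned int gap_ms,
			   unsigned int delay_ms)
{
	bool active = false;	/* the device starts out suspended */
	unsigned int resumes = 0;
	unsigned int i;

	for (i = 0; i < n_unmaps; i++) {
		/* "get": resume the device if it is currently suspended */
		if (!active) {
			active = true;
			resumes++;
		}

		/* ... the unmap itself would happen here ... */

		/*
		 * "put": the device suspends before the next unmap only
		 * if the idle gap is at least the autosuspend delay.
		 */
		if (gap_ms >= delay_ms)
			active = false;
	}

	return resumes;
}

/* Usage: 1000 unmaps arriving 1 ms apart. */
void count_resumes_example(void)
{
	/* No delay: every unmap pays for a full resume. */
	assert(count_resumes(1000, 1, 0) == 1000);
	/* A 20 ms delay: the whole burst shares a single resume. */
	assert(count_resumes(1000, 1, 20) == 1);
}
```

In this model the delay only has to exceed the gap between consecutive unmaps for the whole burst to share one resume; once the burst ends, the device still suspends after the delay expires.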

2019-09-10 15:05:18

by Robin Murphy

Subject: Re: [PATCH] iommu/arm-smmu: fix "hang" when games exit

On 07/09/2019 18:50, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>
> When games, browser, or anything using a lot of GPU buffers exits, there
> can be many hundreds or thousands of buffers to unmap and free. If the
> GPU is otherwise suspended, this can cause arm-smmu to resume/suspend
> for each buffer, resulting in 5-10 seconds worth of reprogramming the
> context bank (arm_smmu_write_context_bank()/arm_smmu_write_s2cr()/etc).
> To the user it would appear that the system is locked up.
>
> A simple solution is to use pm_runtime_put_autosuspend() instead, so we
> don't immediately suspend the SMMU device.
>
> Signed-off-by: Rob Clark <[email protected]>
> ---
> Note: I've tied the autosuspend enable/delay to the consumer device,
> based on the reasoning that if the consumer device benefits from using
> an autosuspend delay, then its corresponding SMMU probably does too.
> Maybe that is overkill and we should just unconditionally enable
> autosuspend.

I'm not sure there's really any reason to expect that a supplier's usage
model when doing things for itself bears any relation to that of its
consumer(s), so I'd certainly lean towards the "unconditional" argument
myself.

Of course ideally we'd skip resuming altogether in the map/unmap paths
(since resume implies a full TLB reset anyway), but IIRC that approach
started to get messy in the context of the initial RPM patchset. I'm
planning to fiddle around a bit more to clean up the implementation of
the new iommu_flush_ops stuff, so I've made a note to myself to revisit
RPM to see if there's a sufficiently clean way to do better. In the
meantime, though, I don't have any real objection to using some
reasonable autosuspend delay on the principle that if we've been woken
up to map/unmap one page, there's a high likelihood that more will
follow in short order (and in the configuration slow-paths it won't have
much impact either way).

Robin.

> drivers/iommu/arm-smmu.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index c2733b447d9c..73a0dd53c8a3 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -289,7 +289,7 @@ static inline int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
> static inline void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
> {
> if (pm_runtime_enabled(smmu->dev))
> - pm_runtime_put(smmu->dev);
> + pm_runtime_put_autosuspend(smmu->dev);
> }
>
> static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
> @@ -1445,6 +1445,15 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> /* Looks ok, so add the device to the domain */
> ret = arm_smmu_domain_add_master(smmu_domain, fwspec);
>
> +#ifdef CONFIG_PM
> + /* TODO maybe device_link_add() should do this for us? */
> + if (dev->power.use_autosuspend) {
> + pm_runtime_set_autosuspend_delay(smmu->dev,
> + dev->power.autosuspend_delay);
> + pm_runtime_use_autosuspend(smmu->dev);
> + }
> +#endif
> +
> rpm_put:
> arm_smmu_rpm_put(smmu);
> return ret;
>

2019-09-10 15:48:07

by Rob Clark

Subject: Re: [PATCH] iommu/arm-smmu: fix "hang" when games exit

On Tue, Sep 10, 2019 at 8:01 AM Robin Murphy <[email protected]> wrote:
>
> On 07/09/2019 18:50, Rob Clark wrote:
> > From: Rob Clark <[email protected]>
> >
> > When games, browser, or anything using a lot of GPU buffers exits, there
> > can be many hundreds or thousands of buffers to unmap and free. If the
> > GPU is otherwise suspended, this can cause arm-smmu to resume/suspend
> > for each buffer, resulting in 5-10 seconds worth of reprogramming the
> > context bank (arm_smmu_write_context_bank()/arm_smmu_write_s2cr()/etc).
> > To the user it would appear that the system is locked up.
> >
> > A simple solution is to use pm_runtime_put_autosuspend() instead, so we
> > don't immediately suspend the SMMU device.
> >
> > Signed-off-by: Rob Clark <[email protected]>
> > ---
> > Note: I've tied the autosuspend enable/delay to the consumer device,
> > based on the reasoning that if the consumer device benefits from using
> > an autosuspend delay, then its corresponding SMMU probably does too.
> > Maybe that is overkill and we should just unconditionally enable
> > autosuspend.
>
> I'm not sure there's really any reason to expect that a supplier's usage
> model when doing things for itself bears any relation to that of its
> consumer(s), so I'd certainly lean towards the "unconditional" argument
> myself.

Sounds good, I'll respin w/ unconditional autosuspend

> Of course ideally we'd skip resuming altogether in the map/unmap paths
> (since resume implies a full TLB reset anyway), but IIRC that approach
> started to get messy in the context of the initial RPM patchset. I'm
> planning to fiddle around a bit more to clean up the implementation of
> the new iommu_flush_ops stuff, so I've made a note to myself to revisit
> RPM to see if there's a sufficiently clean way to do better. In the
> meantime, though, I don't have any real objection to using some
> reasonable autosuspend delay on the principle that if we've been woken
> up to map/unmap one page, there's a high likelihood that more will
> follow in short order (and in the configuration slow-paths it won't have
> much impact either way).

It does sort of remind me of something I was chatting about with Jordan
the other day: how we could possibly skip the TLB inv for unmaps from
non-current pagetables once we have per-context pagetables.

The challenge is, since the GPU's command parser is the one switching
pagetables, we don't have any race-free way to know which pagetables
are current. But we do know which contexts have work queued up for
the GPU, so we can know either that a given context definitely isn't
current, or that it might be current. And in the "definitely not
current" case we could skip TLB inv.

BR,
-R

>
> Robin.
>
> > drivers/iommu/arm-smmu.c | 11 ++++++++++-
> > 1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index c2733b447d9c..73a0dd53c8a3 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -289,7 +289,7 @@ static inline int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
> > static inline void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
> > {
> > if (pm_runtime_enabled(smmu->dev))
> > - pm_runtime_put(smmu->dev);
> > + pm_runtime_put_autosuspend(smmu->dev);
> > }
> >
> > static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
> > @@ -1445,6 +1445,15 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> > /* Looks ok, so add the device to the domain */
> > ret = arm_smmu_domain_add_master(smmu_domain, fwspec);
> >
> > +#ifdef CONFIG_PM
> > + /* TODO maybe device_link_add() should do this for us? */
> > + if (dev->power.use_autosuspend) {
> > + pm_runtime_set_autosuspend_delay(smmu->dev,
> > + dev->power.autosuspend_delay);
> > + pm_runtime_use_autosuspend(smmu->dev);
> > + }
> > +#endif
> > +
> > rpm_put:
> > arm_smmu_rpm_put(smmu);
> > return ret;
> >
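The "definitely not current" versus "might be current" distinction from the message above could be sketched roughly as follows. Everything here is hypothetical: struct gpu_ctx, queued_jobs, classify_ctx() and unmap_needs_tlb_inv() are illustrative names rather than existing driver symbols, and the sketch leaves aside how stale TLB entries are handled when the GPU later switches to that pagetable.

```c
#include <stdbool.h>

/* Hypothetical per-context bookkeeping; not an existing driver struct. */
struct gpu_ctx {
	unsigned int queued_jobs;	/* jobs submitted but not yet retired */
};

enum ctx_residency {
	CTX_DEFINITELY_NOT_CURRENT,	/* nothing queued, so cannot be live */
	CTX_MAYBE_CURRENT,		/* work queued, so pagetable may be live */
};

/*
 * The GPU's command parser switches pagetables on its own, so the CPU
 * cannot tell race-free which context is current. It does know which
 * contexts have work queued, which gives a conservative two-way answer.
 */
enum ctx_residency classify_ctx(const struct gpu_ctx *ctx)
{
	return ctx->queued_jobs ? CTX_MAYBE_CURRENT
				: CTX_DEFINITELY_NOT_CURRENT;
}

/* Only the "definitely not current" case may skip the invalidation. */
bool unmap_needs_tlb_inv(const struct gpu_ctx *ctx)
{
	return classify_ctx(ctx) == CTX_MAYBE_CURRENT;
}
```

The check errs on the safe side: a context with queued work is always treated as possibly current, so an invalidation is skipped only when the context cannot be running.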

2019-10-07 20:54:50

by Rob Clark

Subject: [PATCH v2] iommu/arm-smmu: fix "hang" when games exit

From: Rob Clark <[email protected]>

When games, browser, or anything using a lot of GPU buffers exits, there
can be many hundreds or thousands of buffers to unmap and free. If the
GPU is otherwise suspended, this can cause arm-smmu to resume/suspend
for each buffer, resulting in 5-10 seconds worth of reprogramming the
context bank (arm_smmu_write_context_bank()/arm_smmu_write_s2cr()/etc).
To the user it would appear that the system just locked up.

A simple solution is to use pm_runtime_put_autosuspend() instead, so we
don't immediately suspend the SMMU device.

Signed-off-by: Rob Clark <[email protected]>
---
v1: original
v2: unconditionally use autosuspend, rather than deciding based on what
consumer does

drivers/iommu/arm-smmu.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 3f1d55fb43c4..b7b41f5001bc 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -289,7 +289,7 @@ static inline int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
 static inline void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
 {
 	if (pm_runtime_enabled(smmu->dev))
-		pm_runtime_put(smmu->dev);
+		pm_runtime_put_autosuspend(smmu->dev);
 }

 static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
@@ -1445,6 +1445,9 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	/* Looks ok, so add the device to the domain */
 	ret = arm_smmu_domain_add_master(smmu_domain, fwspec);

+	pm_runtime_set_autosuspend_delay(smmu->dev, 20);
+	pm_runtime_use_autosuspend(smmu->dev);
+
 rpm_put:
 	arm_smmu_rpm_put(smmu);
 	return ret;
--
2.21.0

2019-10-09 10:11:27

by Robin Murphy

Subject: Re: [PATCH v2] iommu/arm-smmu: fix "hang" when games exit

On 2019-10-07 9:49 pm, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>
> When games, browser, or anything using a lot of GPU buffers exits, there
> can be many hundreds or thousands of buffers to unmap and free. If the
> GPU is otherwise suspended, this can cause arm-smmu to resume/suspend
> for each buffer, resulting in 5-10 seconds worth of reprogramming the
> context bank (arm_smmu_write_context_bank()/arm_smmu_write_s2cr()/etc).
> To the user it would appear that the system just locked up.
>
> A simple solution is to use pm_runtime_put_autosuspend() instead, so we
> don't immediately suspend the SMMU device.

Reviewed-by: Robin Murphy <[email protected]>

> Signed-off-by: Rob Clark <[email protected]>
> ---
> v1: original
> v2: unconditionally use autosuspend, rather than deciding based on what
> consumer does
>
> drivers/iommu/arm-smmu.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 3f1d55fb43c4..b7b41f5001bc 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -289,7 +289,7 @@ static inline int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
> static inline void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
> {
> if (pm_runtime_enabled(smmu->dev))
> - pm_runtime_put(smmu->dev);
> + pm_runtime_put_autosuspend(smmu->dev);
> }
>
> static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
> @@ -1445,6 +1445,9 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> /* Looks ok, so add the device to the domain */
> ret = arm_smmu_domain_add_master(smmu_domain, fwspec);
>
> + pm_runtime_set_autosuspend_delay(smmu->dev, 20);
> + pm_runtime_use_autosuspend(smmu->dev);
> +
> rpm_put:
> arm_smmu_rpm_put(smmu);
> return ret;
>

2019-10-28 23:42:01

by Will Deacon

Subject: Re: [PATCH v2] iommu/arm-smmu: fix "hang" when games exit

Hi Rob,

On Mon, Oct 07, 2019 at 01:49:06PM -0700, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>
> When games, browser, or anything using a lot of GPU buffers exits, there
> can be many hundreds or thousands of buffers to unmap and free. If the
> GPU is otherwise suspended, this can cause arm-smmu to resume/suspend
> for each buffer, resulting in 5-10 seconds worth of reprogramming the
> context bank (arm_smmu_write_context_bank()/arm_smmu_write_s2cr()/etc).
> To the user it would appear that the system just locked up.
>
> A simple solution is to use pm_runtime_put_autosuspend() instead, so we
> don't immediately suspend the SMMU device.

Please can you reword the subject to be a bit more useful? The commit
message is great, but the subject is a bit like "fix bug in code" to me.

> Signed-off-by: Rob Clark <[email protected]>
> ---
> v1: original
> v2: unconditionally use autosuspend, rather than deciding based on what
> consumer does
>
> drivers/iommu/arm-smmu.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 3f1d55fb43c4..b7b41f5001bc 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -289,7 +289,7 @@ static inline int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
> static inline void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
> {
> if (pm_runtime_enabled(smmu->dev))
> - pm_runtime_put(smmu->dev);
> + pm_runtime_put_autosuspend(smmu->dev);
> }
>
> static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
> @@ -1445,6 +1445,9 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> /* Looks ok, so add the device to the domain */
> ret = arm_smmu_domain_add_master(smmu_domain, fwspec);

Please can you put a comment here explaining what this is doing? An abridged
version of the commit message is fine.

> + pm_runtime_set_autosuspend_delay(smmu->dev, 20);
> + pm_runtime_use_autosuspend(smmu->dev);

Cheers,

Will

2019-10-28 23:50:59

by Robin Murphy

Subject: Re: [PATCH v2] iommu/arm-smmu: fix "hang" when games exit

On 2019-10-28 10:38 pm, Rob Clark wrote:
> On Mon, Oct 28, 2019 at 3:20 PM Will Deacon <[email protected]> wrote:
>>
>> Hi Rob,
>>
>> On Mon, Oct 07, 2019 at 01:49:06PM -0700, Rob Clark wrote:
>>> From: Rob Clark <[email protected]>
>>>
>>> When games, browser, or anything using a lot of GPU buffers exits, there
>>> can be many hundreds or thousands of buffers to unmap and free. If the
>>> GPU is otherwise suspended, this can cause arm-smmu to resume/suspend
>>> for each buffer, resulting in 5-10 seconds worth of reprogramming the
>>> context bank (arm_smmu_write_context_bank()/arm_smmu_write_s2cr()/etc).
>>> To the user it would appear that the system just locked up.
>>>
>>> A simple solution is to use pm_runtime_put_autosuspend() instead, so we
>>> don't immediately suspend the SMMU device.
>>
>> Please can you reword the subject to be a bit more useful? The commit
>> message is great, but the subject is a bit like "fix bug in code" to me.
>
> yeah, not the best $subject, but I wasn't quite sure how to fit
> something better in a reasonable # of chars.. maybe something like:
> "iommu/arm-smmu: optimize unmap by avoiding toggling runpm state"?

FWIW, I'd be inclined to frame it as something like "avoid pathological
RPM behaviour for unmaps".

Robin.

2019-10-29 06:56:24

by Rob Clark

Subject: Re: [PATCH v2] iommu/arm-smmu: fix "hang" when games exit

On Mon, Oct 28, 2019 at 3:20 PM Will Deacon <[email protected]> wrote:
>
> Hi Rob,
>
> On Mon, Oct 07, 2019 at 01:49:06PM -0700, Rob Clark wrote:
> > From: Rob Clark <[email protected]>
> >
> > When games, browser, or anything using a lot of GPU buffers exits, there
> > can be many hundreds or thousands of buffers to unmap and free. If the
> > GPU is otherwise suspended, this can cause arm-smmu to resume/suspend
> > for each buffer, resulting in 5-10 seconds worth of reprogramming the
> > context bank (arm_smmu_write_context_bank()/arm_smmu_write_s2cr()/etc).
> > To the user it would appear that the system just locked up.
> >
> > A simple solution is to use pm_runtime_put_autosuspend() instead, so we
> > don't immediately suspend the SMMU device.
>
> Please can you reword the subject to be a bit more useful? The commit
> message is great, but the subject is a bit like "fix bug in code" to me.

yeah, not the best $subject, but I wasn't quite sure how to fit
something better in a reasonable # of chars.. maybe something like:
"iommu/arm-smmu: optimize unmap by avoiding toggling runpm state"?

BR,
-R


>
> > Signed-off-by: Rob Clark <[email protected]>
> > ---
> > v1: original
> > v2: unconditionally use autosuspend, rather than deciding based on what
> > consumer does
> >
> > drivers/iommu/arm-smmu.c | 5 ++++-
> > 1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index 3f1d55fb43c4..b7b41f5001bc 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -289,7 +289,7 @@ static inline int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
> > static inline void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
> > {
> > if (pm_runtime_enabled(smmu->dev))
> > - pm_runtime_put(smmu->dev);
> > + pm_runtime_put_autosuspend(smmu->dev);
> > }
> >
> > static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
> > @@ -1445,6 +1445,9 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> > /* Looks ok, so add the device to the domain */
> > ret = arm_smmu_domain_add_master(smmu_domain, fwspec);
>
> Please can you put a comment here explaining what this is doing? An abridged
> version of the commit message is fine.
>
> > + pm_runtime_set_autosuspend_delay(smmu->dev, 20);
> > + pm_runtime_use_autosuspend(smmu->dev);
>
> Cheers,
>
> Will

2019-10-29 18:00:37

by Will Deacon

Subject: Re: [PATCH v2] iommu/arm-smmu: fix "hang" when games exit

On Mon, Oct 28, 2019 at 10:51:53PM +0000, Robin Murphy wrote:
> On 2019-10-28 10:38 pm, Rob Clark wrote:
> > On Mon, Oct 28, 2019 at 3:20 PM Will Deacon <[email protected]> wrote:
> > > On Mon, Oct 07, 2019 at 01:49:06PM -0700, Rob Clark wrote:
> > > > From: Rob Clark <[email protected]>
> > > >
> > > > When games, browser, or anything using a lot of GPU buffers exits, there
> > > > can be many hundreds or thousands of buffers to unmap and free. If the
> > > > GPU is otherwise suspended, this can cause arm-smmu to resume/suspend
> > > > for each buffer, resulting in 5-10 seconds worth of reprogramming the
> > > > context bank (arm_smmu_write_context_bank()/arm_smmu_write_s2cr()/etc).
> > > > To the user it would appear that the system just locked up.
> > > >
> > > > A simple solution is to use pm_runtime_put_autosuspend() instead, so we
> > > > don't immediately suspend the SMMU device.
> > >
> > > Please can you reword the subject to be a bit more useful? The commit
> > > message is great, but the subject is a bit like "fix bug in code" to me.
> >
> > yeah, not the best $subject, but I wasn't quite sure how to fit
> > something better in a reasonable # of chars.. maybe something like:
> > "iommu/arm-smmu: optimize unmap by avoiding toggling runpm state"?
>
> FWIW, I'd be inclined to frame it as something like "avoid pathological RPM
> behaviour for unmaps".

LGTM!

Will