2021-03-20 05:35:33

by hieagle

Subject: [PATCH] mmc: core: Mark mmc_host device with pm_runtime_no_callbacks

rpm_resume() calls the parent's resume callback recursively.
Since mmc_host has no pm_runtime callbacks of its own, the mmc devices
may sometimes fail to resume (-ENOSYS in rpm_callback). Marking the
mmc_host device with pm_runtime_no_callbacks() fixes the issue.

Signed-off-by: kehuanlin <[email protected]>
---
drivers/mmc/core/host.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/mmc/core/host.c b/drivers/mmc/core/host.c
index 9b89a91b6b47..177bebd9a6c4 100644
--- a/drivers/mmc/core/host.c
+++ b/drivers/mmc/core/host.c
@@ -15,6 +15,7 @@
 #include <linux/of.h>
 #include <linux/of_gpio.h>
 #include <linux/pagemap.h>
+#include <linux/pm_runtime.h>
 #include <linux/pm_wakeup.h>
 #include <linux/export.h>
 #include <linux/leds.h>
@@ -480,6 +481,7 @@ struct mmc_host *mmc_alloc_host(int extra, struct device *dev)
 	host->class_dev.class = &mmc_host_class;
 	device_initialize(&host->class_dev);
 	device_enable_async_suspend(&host->class_dev);
+	pm_runtime_no_callbacks(&host->class_dev);
 
 	if (mmc_gpio_alloc(host)) {
 		put_device(&host->class_dev);
--
2.30.0


2021-03-22 10:27:30

by Ulf Hansson

Subject: Re: [PATCH] mmc: core: Mark mmc_host device with pm_runtime_no_callbacks

On Sat, 20 Mar 2021 at 05:57, kehuanlin <[email protected]> wrote:
>
> rpm_resume() calls the parent's resume callback recursively.
> Since mmc_host has no pm_runtime callbacks of its own, the mmc devices
> may sometimes fail to resume (-ENOSYS in rpm_callback). Marking the
> mmc_host device with pm_runtime_no_callbacks() fixes the issue.

Can you please elaborate more on this? What do you mean by "sometimes"?

More precisely, how do you trigger the rpm_callback() for mmc class
device to return -ENOSYS?

Don't get me wrong, the patch is fine, but I want to understand if it
actually solves a problem for you - or that it's better considered as
an optimization?

Kind regards
Uffe

2021-03-23 10:53:55

by hieagle

Subject: Re: [PATCH] mmc: core: Mark mmc_host device with pm_runtime_no_callbacks

We sometimes encounter a resume issue on our device. The mmc device's
parent chain on our SoC is
mmc0:0001->mmc_host mmc0->fa630000.mmc->soc. We found that in the case
below, with mmc0->power.disable_depth=0, mmc_runtime_resume() is skipped,
which causes subsequent mmc commands to fail.

mmc_get_card(mmc0:0001)->pm_runtime_get_sync->rpm_resume(mmc0:0001)->rpm_resume(mmc0)
rpm_resume(mmc0) returns -ENOSYS because there is no callback, and
mmc0->power.runtime_status stays RPM_SUSPENDED. This makes
rpm_resume(mmc0:0001) return -EBUSY and skip the rpm_callback() that would
call mmc_runtime_resume(), so the mmc device is still suspended and
subsequent mmc commands fail.

[ 198.856157] Call trace:
[ 198.858917] [<ffffff800808bd9c>] dump_backtrace+0x0/0x1cc
[ 198.864966] [<ffffff800808bf7c>] show_stack+0x14/0x1c
[ 198.870627] [<ffffff8008400e88>] dump_stack+0xa8/0xe0
[ 198.876288] [<ffffff800854d38c>] rpm_resume+0x850/0x938
[ 198.882141] [<ffffff800854cd8c>] rpm_resume+0x250/0x938
[ 198.887994] [<ffffff800854d4c4>] __pm_runtime_resume+0x50/0x74
[ 198.894530] [<ffffff80087b9e64>] mmc_get_card+0x3c/0xb8
[ 198.900388] [<ffffff80087cd2e0>] mmc_blk_issue_rq+0x2b0/0x4d8
[ 198.906824] [<ffffff80087cd5e4>] mmc_queue_thread+0xdc/0x198
[ 198.913165] [<ffffff80080d4b2c>] kthread+0xec/0x100
[ 198.918632] [<ffffff8008083890>] ret_from_fork+0x10/0x40
[ 198.924582] mmc0 callback (null)
[ 198.935837] mmcblk mmc0:0001: __pm_runtime_resume ret -16

Marking the mmc_host device with pm_runtime_no_callbacks() solves the issue.
Thanks.
Huanlin Ke
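
The failure chain described in this message can be illustrated with a small
userspace model. It is only a sketch, not the kernel's rpm_resume():
struct model_dev, model_resume(), model_callback() and resume_card() are
hypothetical names invented for illustration. Locking, usage counters and
error bookkeeping are left out, the host's disable_depth is simply assumed
to be 0 as in the report, and the controller (standing in for fa630000.mmc)
is assumed to be already active.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for runtime PM state; not the kernel's structs. */
enum model_status { MODEL_SUSPENDED, MODEL_ACTIVE };

struct model_dev {
	struct model_dev *parent;
	unsigned int disable_depth;	/* > 0 means runtime PM is disabled */
	bool ignore_children;
	bool no_callbacks;		/* what pm_runtime_no_callbacks() would set */
	enum model_status status;
	int (*runtime_resume)(struct model_dev *dev);
	int resume_calls;		/* how often runtime_resume actually ran */
};

/* A missing callback yields -ENOSYS, as described in the report. */
static int model_callback(struct model_dev *dev)
{
	if (!dev->runtime_resume)
		return -ENOSYS;
	dev->resume_calls++;
	return dev->runtime_resume(dev);
}

static int model_resume(struct model_dev *dev)
{
	struct model_dev *parent = dev->parent;
	int ret;

	if (dev->disable_depth > 0)
		return -EACCES;
	if (dev->status == MODEL_ACTIVE)
		return 1;

	/* Resume the parent first, but only if its runtime PM is enabled. */
	if (parent && !parent->disable_depth && !parent->ignore_children) {
		model_resume(parent);	/* return value is not propagated */
		if (parent->status != MODEL_ACTIVE)
			return -EBUSY;	/* the child's callback is never reached */
	}

	if (!dev->no_callbacks) {
		ret = model_callback(dev);
		if (ret)
			return ret;	/* status stays MODEL_SUSPENDED */
	}
	dev->status = MODEL_ACTIVE;
	return 0;
}

static int card_resume_ok(struct model_dev *dev)
{
	(void)dev;
	return 0;
}

/* Build card -> host -> controller and try to resume the card. */
int resume_card(bool host_no_callbacks, int *card_resume_calls)
{
	struct model_dev controller = { .status = MODEL_ACTIVE };
	struct model_dev host = {
		.parent = &controller,
		.disable_depth = 0,	/* assumed precondition from the report */
		.no_callbacks = host_no_callbacks,
		.status = MODEL_SUSPENDED,
	};
	struct model_dev card = {
		.parent = &host,
		.status = MODEL_SUSPENDED,
		.runtime_resume = card_resume_ok,
	};
	int ret = model_resume(&card);

	*card_resume_calls = card.resume_calls;
	return ret;
}
```

Under these assumptions, resume_card(false, &calls) returns -EBUSY with
calls == 0, which matches the "ret -16" in the log above, while
resume_card(true, &calls) returns 0 after the card's callback has run once.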

2021-03-23 14:06:38

by Ulf Hansson

Subject: Re: [PATCH] mmc: core: Mark mmc_host device with pm_runtime_no_callbacks

Thanks for sharing more details! I have to admit, that this sounds
quite weird to me. I wonder if this is a problem that deserves to be
fixed in the runtime PM core....

Let me have a closer look and get back to you again. Please be patient
though, I have a busy week in front of me.

Kind regards
Uffe

2021-05-07 15:31:07

by Ulf Hansson

Subject: Re: [PATCH] mmc: core: Mark mmc_host device with pm_runtime_no_callbacks

Just wanted to notify you that I haven't forgotten. I will look into
this at the beginning of next week.

[...]

Kind regards
Uffe

2021-05-26 17:25:43

by Ulf Hansson

Subject: Re: [PATCH] mmc: core: Mark mmc_host device with pm_runtime_no_callbacks

So I have looked a bit closer at this, finally. Apologies for the delay.

If I am not mistaken, rpm_resume() should *not* be called recursively
for a device's parent, unless the parent's ->power.disable_depth has
been decremented to zero. No matter if we have called
pm_runtime_no_callbacks() for the child device or not.

In the scenario you describe, the parent device corresponds to the mmc
class device (initialized in mmc_alloc_host()). Since the mmc core
never calls pm_runtime_enable() for it, its ->power.disable_depth
should always remain greater than 0.

Although, I admit, the code in runtime.c isn't that easy to browse, so
I may be wrong. That said, I fail to see how -ENOSYS can be returned
from rpm_resume() in the path you describe.

Would it be possible for you to provide more logs to show this?

Or could it be that you have a patch locally that calls
pm_runtime_enable() for the mmc class device, somewhere? That would
also explain things? :-)

[...]

Kind regards
Uffe
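
The condition at the heart of this reply can be written down as a tiny
predicate. This is a sketch that assumes the gate in rpm_resume() has the
form !parent->power.disable_depth && !parent->power.ignore_children;
parent_gets_resumed() is a hypothetical helper for illustration, not a
kernel function.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true when a child's resume would recurse into its parent.
 * The child's own no_callbacks flag plays no part in this decision.
 */
bool parent_gets_resumed(unsigned int parent_disable_depth,
			 bool parent_ignore_children)
{
	return parent_disable_depth == 0 && !parent_ignore_children;
}
```

With a parent that was never enabled (disable_depth of 1 or more) the
predicate is false, so the recursion described in the report could only
happen if something brought the parent's disable_depth down to zero.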

2021-06-21 01:40:01

by hieagle

Subject: Re: [PATCH] mmc: core: Mark mmc_host device with pm_runtime_no_callbacks

Sorry, I didn't receive the reply email in my Gmail.

Normally the mmc_host's power.disable_depth is larger than zero, so
rpm_resume(mmc0:0001) is not called recursively for the parent. This is
the common case.

Although the mmc class device never calls pm_runtime_enable() directly,
there are still some paths, listed below, that call pm_runtime_enable() and
may cause its power.disable_depth to be decremented to zero:
case 1: device_resume_early() -> pm_runtime_enable()
case 2: device_resume() -> pm_runtime_enable()

Anything that can go wrong will go wrong. Unfortunately, we hit this case.
If you force the mmc_host's power.disable_depth to zero
after the mmc device is suspended, you can reproduce the issue.

On our platform the mmc device's parent chain is as follows:
mmc0:0001->mmc_host mmc0->fa630000.mmc->soc.
The rpm_resume() call trace in our scenario is as follows:

rpm_resume(mmc0:0001)
|
if (!parent && dev->parent) //true
if (!parent->power.disable_depth
&& !parent->power.ignore_children) //true
rpm_resume(parent, 0) ---> rpm_resume(mmc_host, 0)
| |
| callback = RPM_GET_CALLBACK(mmc_host, ...) = NULL
| retval = rpm_callback(callback, mmc_host) = -ENOSYS
| |
| return retval = -ENOSYS
if (retval) goto out; //skip rpm_callback()
return retval = -ENOSYS

The scenario is rare, but anything that can go wrong will go wrong.
The patch hardens the code against this scenario.

2021-06-21 09:25:35

by Ulf Hansson

Subject: Re: [PATCH] mmc: core: Mark mmc_host device with pm_runtime_no_callbacks

On Mon, 21 Jun 2021 at 03:38, hieagle <[email protected]> wrote:
>
> Sorry, I didn't receive the reply email in my Gmail.
>
> Normally the mmc_host's power.disable_depth is larger than zero, so
> rpm_resume(mmc0:0001) is not called recursively for the parent. This is
> the common case.
>
> Although the mmc class device never calls pm_runtime_enable() directly,
> there are still some paths, listed below, that call pm_runtime_enable() and
> may cause its power.disable_depth to be decremented to zero:
> case 1: device_resume_early() -> pm_runtime_enable()
> case 2: device_resume() -> pm_runtime_enable()

Those calls to pm_runtime_enable() are in balance with previous calls
to __pm_runtime_disable() and pm_runtime_disable(), in
__device_suspend() and __device_suspend_late().

In other words, the power.disable_depth is not being decremented to
zero in any of the paths above, I think.

>
> Anything that can go wrong will go wrong. Unfortunately, we hit this case.
> If you force the mmc_host's power.disable_depth to zero
> after the mmc device is suspended, you can reproduce the issue.
>
> On our platform the mmc device's parent chain is as follows:
> mmc0:0001->mmc_host mmc0->fa630000.mmc->soc.
> The rpm_resume() call trace in our scenario is as follows:
>
> rpm_resume(mmc0:0001)
> |
> if (!parent && dev->parent) //true
> if (!parent->power.disable_depth
> && !parent->power.ignore_children) //true
> rpm_resume(parent, 0) ---> rpm_resume(mmc_host, 0)
> | |
> | callback = RPM_GET_CALLBACK(mmc_host, ...) = NULL
> | retval = rpm_callback(callback, mmc_host) = -ENOSYS
> | |
> | return retval = -ENOSYS
> if (retval) goto out; //skip rpm_callback()
> return retval = -ENOSYS
>
> The scenario is rare, but anything that can go wrong will go wrong.
> The patch hardens the code against this scenario.

Well, I am still not convinced as I don't see how the
power.disable_depth can ever reach zero.

If you could provide a stack-trace of when power.disable_depth reaches
zero, that would be helpful.

[...]

Kind regards
Uffe
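
The balance argument in this reply can be illustrated by modelling
power.disable_depth as a plain counter. This is a sketch under stated
assumptions: struct depth_model and the model_*() helpers are hypothetical
stand-ins for the kernel's bookkeeping, the counter starts at 1 as it does
after pm_runtime_init(), and the extra_enables parameter represents a stray,
unpaired pm_runtime_enable() call that is purely an assumption here (for
example from a local patch).

```c
#include <assert.h>

/* Hypothetical model of power.disable_depth; not the kernel's struct. */
struct depth_model {
	unsigned int disable_depth;
};

/* Every device starts out with runtime PM disabled. */
void model_init(struct depth_model *m)
{
	m->disable_depth = 1;
}

/* Stand-in for __pm_runtime_disable(): one more level of "disabled". */
void model_disable(struct depth_model *m)
{
	m->disable_depth++;
}

/* Stand-in for pm_runtime_enable(): drops one level if there is one. */
void model_enable(struct depth_model *m)
{
	if (m->disable_depth == 0)
		return;		/* the kernel warns about an unbalanced enable */
	m->disable_depth--;
}

/* One system sleep cycle with paired calls, plus optional stray enables. */
unsigned int depth_after_sleep_cycle(unsigned int extra_enables)
{
	struct depth_model m;
	unsigned int i;

	model_init(&m);
	model_disable(&m);	/* suspend side, e.g. __device_suspend_late() */
	model_enable(&m);	/* resume side, e.g. device_resume_early() */
	for (i = 0; i < extra_enables; i++)
		model_enable(&m);
	return m.disable_depth;
}
```

With paired calls only, depth_after_sleep_cycle(0) ends at 1 again, so the
class device stays disabled. It takes at least one unpaired enable, as in
depth_after_sleep_cycle(1), to reach 0, which is why a stack trace of that
extra call would settle the question.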