2021-08-10 20:08:53

by Prasad Sodagudi

Subject: [PATCH v3] PM: sleep: core: Avoid setting power.must_resume to false

Two variables (power.may_skip_resume and power.must_resume) and the
DPM_FLAG_MAY_SKIP_RESUME driver flag control whether devices are resumed
after a system-wide suspend transition.

Setting the DPM_FLAG_MAY_SKIP_RESUME flag means that the driver allows
its "noirq" and "early" resume callbacks to be skipped if the device
can be left in suspend after a system-wide transition into the working
state. The PM core decides whether the driver's "noirq" and "early"
resume callbacks should be skipped by calling dev_pm_skip_resume(),
which checks the power.must_resume variable.

power.must_resume is set to false in __device_suspend() without
checking the device's DPM_FLAG_MAY_SKIP_RESUME flag or
dev->power.usage_count. In the problematic scenario, all devices
suspend successfully in the suspend_late stage, but one device then
fails to suspend in the suspend_noirq phase. Devices that suspended
successfully in suspend_late but had not yet reached the noirq phase
never get a chance to run __device_suspend_noirq(), which would set
dev->power.must_resume to true, so they are not resumed in the
early_resume phase.
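
The scenario above can be reproduced in a small userspace model. This is only a sketch, not kernel code: struct toy_dev and the toy_* functions are made-up stand-ins, the "patched" switch models only the flag check (not usage_count), and the rule "early resume is skipped when must_resume is false" is an assumption based on my reading of dev_pm_skip_resume().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the relevant parts of a device's PM state. */
struct toy_dev {
	bool may_skip_flag;	/* models DPM_FLAG_MAY_SKIP_RESUME */
	bool must_resume;	/* models dev->power.must_resume */
	bool noirq_fails;	/* this device's noirq suspend fails */
	bool resumed;		/* set if the early resume callback ran */
};

/* Models __device_suspend(); 'patched' selects the flag check. */
static void toy_suspend(struct toy_dev *d, bool patched)
{
	d->must_resume = patched ? !d->may_skip_flag : false;
}

/* Models __device_suspend_noirq(); returns false on failure. */
static bool toy_suspend_noirq(struct toy_dev *d)
{
	if (d->noirq_fails)
		return false;	/* bail out before must_resume is updated */
	/* Simplified: the real code also looks at other conditions. */
	if (!d->may_skip_flag)
		d->must_resume = true;
	return true;
}

/*
 * Suspend all devices, run noirq suspend until the first failure, then
 * run early resume for everything. Returns how many devices were left
 * un-resumed.
 */
static size_t toy_cycle(struct toy_dev *devs, size_t n, bool patched)
{
	size_t i, skipped = 0;

	for (i = 0; i < n; i++) {
		devs[i].resumed = false;
		toy_suspend(&devs[i], patched);
	}
	for (i = 0; i < n; i++)
		if (!toy_suspend_noirq(&devs[i]))
			break;	/* later devices never reach noirq */
	for (i = 0; i < n; i++) {
		devs[i].resumed = devs[i].must_resume;
		if (!devs[i].resumed)
			skipped++;
	}
	return skipped;
}

/* Three devices without the flag; the second one fails in noirq. */
static void toy_demo(void)
{
	struct toy_dev devs[3] = {
		{ .noirq_fails = false },
		{ .noirq_fails = true },
		{ .noirq_fails = false },
	};

	assert(toy_cycle(devs, 3, false) == 2);	/* two devices stay suspended */
	assert(toy_cycle(devs, 3, true) == 0);	/* with the check, all resume */
}
```

Calling toy_demo() from a main() exercises both cases. In the model, a device that does set the flag is still skipped in both variants, which is the behaviour the flag asks for.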

Add a check for device's DPM_FLAG_MAY_SKIP_RESUME flag before
setting power.must_resume variable in __device_suspend function.

Fixes: 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the resume phase")
Signed-off-by: Prasad Sodagudi <[email protected]>
---
V2 -> V3: Fixed formatting issues in the patch posting
V1 -> V2: Fixed indentation and commit text to include scenario
drivers/base/power/main.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index d568772..9ee6987 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1642,7 +1642,11 @@ static int __device_suspend(struct device *dev, pm_message_t state, bool async)
 	}

 	dev->power.may_skip_resume = true;
-	dev->power.must_resume = false;
+	if ((atomic_read(&dev->power.usage_count) <= 1) &&
+	    (dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME)))
+		dev->power.must_resume = false;
+	else
+		dev->power.must_resume = true;

 	dpm_watchdog_set(&wd, dev);
 	device_lock(dev);
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


2021-08-13 09:38:57

by Greg Kroah-Hartman

Subject: Re: [PATCH v3] PM: sleep: core: Avoid setting power.must_resume to false

On Tue, Aug 10, 2021 at 01:05:38PM -0700, Prasad Sodagudi wrote:
> There are variables(power.may_skip_resume and dev->power.must_resume)
> and DPM_FLAG_MAY_SKIP_RESUME flags to control the resume of devices after
> a system wide suspend transition.
>
> Setting the DPM_FLAG_MAY_SKIP_RESUME flag means that the driver allows
> its "noirq" and "early" resume callbacks to be skipped if the device
> can be left in suspend after a system-wide transition into the working
> state. PM core determines that the driver's "noirq" and "early" resume
> callbacks should be skipped or not with dev_pm_skip_resume() function by
> checking power.may_skip_resume variable.
>
> power.must_resume variable is getting set to false in __device_suspend()
> function without checking device's DPM_FLAG_MAY_SKIP_RESUME and
> dev->power.usage_count variables. In problematic scenario, where
> all the devices in the suspend_late stage are successful and some
> device can fail to suspend in suspend_noirq phase. So some devices
> successfully suspended in suspend_late stage are not getting chance
> to execute __device_suspend_noirq() to set dev->power.must_resume
> variable to true and not getting resumed in early_resume phase.
>
> Add a check for device's DPM_FLAG_MAY_SKIP_RESUME flag before
> setting power.must_resume variable in __device_suspend function.
>
> Fixes: 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the resume phase")
> Signed-off-by: Prasad Sodagudi <[email protected]>
> ---
> V2 -> V3: Format issues patch posting
> V1 -> V2: Fixed indentation and commit text to include scenario
> drivers/base/power/main.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> index d568772..9ee6987 100644
> --- a/drivers/base/power/main.c
> +++ b/drivers/base/power/main.c
> @@ -1642,7 +1642,11 @@ static int __device_suspend(struct device *dev, pm_message_t state, bool async)
>  	}
>
>  	dev->power.may_skip_resume = true;
> -	dev->power.must_resume = false;
> +	if ((atomic_read(&dev->power.usage_count) <= 1) &&
> +	    (dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME)))
> +		dev->power.must_resume = false;
> +	else
> +		dev->power.must_resume = true;

Again, what happens if the usage_count changes right after reading the
value? What protects that from happening?
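
To spell out the concern: the read itself is atomic, but the decision built on it is a separate step, so the value can be stale by the time must_resume is written. The userspace sketch below uses C11 atomics and made-up names (struct toy_pm, toy_decide, toy_get); the "interleave" callback simply stands in for an interrupt or another CPU, and the model assumes the driver flag is set. It is not a claim about which kernel paths actually touch usage_count at this point.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_pm {
	atomic_int usage_count;	/* models dev->power.usage_count */
	bool must_resume;	/* models dev->power.must_resume */
};

/* Models a concurrent reference being taken on the device. */
static void toy_get(struct toy_pm *pm)
{
	atomic_fetch_add(&pm->usage_count, 1);
}

/*
 * Check-then-act: the load is atomic, the decision is not. Whatever
 * runs between the load and the store is invisible to the decision.
 */
static void toy_decide(struct toy_pm *pm, void (*interleave)(struct toy_pm *))
{
	int snapshot = atomic_load(&pm->usage_count);

	if (interleave)
		interleave(pm);	/* stands in for an IRQ or another CPU */

	pm->must_resume = snapshot > 1;
}

/* Shows the decision disagreeing with the live counter. */
static void toy_demo_race(void)
{
	struct toy_pm pm;

	atomic_init(&pm.usage_count, 1);
	pm.must_resume = true;

	toy_decide(&pm, toy_get);

	assert(atomic_load(&pm.usage_count) == 2);	/* device is now in use */
	assert(!pm.must_resume);			/* but resume may be skipped */
}
```

Without some lock or other guarantee covering both the read and the store, nothing in this pattern rules out the stale outcome.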

thanks,

greg k-h

2021-08-23 13:09:01

by Prasad Sodagudi

Subject: Re: [PATCH v3] PM: sleep: core: Avoid setting power.must_resume to false

On 2021-08-13 00:23, Greg KH wrote:
> On Tue, Aug 10, 2021 at 01:05:38PM -0700, Prasad Sodagudi wrote:
>> There are variables(power.may_skip_resume and dev->power.must_resume)
>> and DPM_FLAG_MAY_SKIP_RESUME flags to control the resume of devices after
>> a system wide suspend transition.
>>
>> Setting the DPM_FLAG_MAY_SKIP_RESUME flag means that the driver allows
>> its "noirq" and "early" resume callbacks to be skipped if the device
>> can be left in suspend after a system-wide transition into the working
>> state. PM core determines that the driver's "noirq" and "early" resume
>> callbacks should be skipped or not with dev_pm_skip_resume() function by
>> checking power.may_skip_resume variable.
>>
>> power.must_resume variable is getting set to false in __device_suspend()
>> function without checking device's DPM_FLAG_MAY_SKIP_RESUME and
>> dev->power.usage_count variables. In problematic scenario, where
>> all the devices in the suspend_late stage are successful and some
>> device can fail to suspend in suspend_noirq phase. So some devices
>> successfully suspended in suspend_late stage are not getting chance
>> to execute __device_suspend_noirq() to set dev->power.must_resume
>> variable to true and not getting resumed in early_resume phase.
>>
>> Add a check for device's DPM_FLAG_MAY_SKIP_RESUME flag before
>> setting power.must_resume variable in __device_suspend function.
>>
>> Fixes: 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the resume phase")
>> Signed-off-by: Prasad Sodagudi <[email protected]>
>> ---
>> V2 -> V3: Format issues patch posting
>> V1 -> V2: Fixed indentation and commit text to include scenario
>> drivers/base/power/main.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
>> index d568772..9ee6987 100644
>> --- a/drivers/base/power/main.c
>> +++ b/drivers/base/power/main.c
>> @@ -1642,7 +1642,11 @@ static int __device_suspend(struct device *dev, pm_message_t state, bool async)
>>  	}
>>
>>  	dev->power.may_skip_resume = true;
>> -	dev->power.must_resume = false;
>> +	if ((atomic_read(&dev->power.usage_count) <= 1) &&
>> +	    (dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME)))
>> +		dev->power.must_resume = false;
>> +	else
>> +		dev->power.must_resume = true;
>
> Again, what happens if the usage_count changes right after reading the
> value? What protects that from happening?

Hi Greg KH,
Yes, you are right. I think relying on usage_count at the
__device_suspend() stage may not be correct: device IRQs are still
enabled at that point, so usage_count can change right after it is read.
I will send the next patchset without the power.usage_count check.

@@ -1649,7 +1651,10 @@ static int __device_suspend(struct device *dev, pm_message_t state, bool async)
 	}

 	dev->power.may_skip_resume = true;
-	dev->power.must_resume = false;
+	if (dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME))
+		dev->power.must_resume = false;
+	else
+		dev->power.must_resume = true;
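
With the usage_count test gone, the if/else reduces to the negation of the flag test. The userspace sketch below only checks that equivalence; toy_test_flags() is a made-up stand-in for dev_pm_test_driver_flags(), and TOY_FLAG_MAY_SKIP_RESUME is a placeholder bit value, not taken from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit, chosen for illustration only. */
#define TOY_FLAG_MAY_SKIP_RESUME (1u << 3)

struct toy_dev_flags {
	uint32_t driver_flags;	/* models dev->power.driver_flags */
};

/* Stand-in for dev_pm_test_driver_flags(). */
static bool toy_test_flags(const struct toy_dev_flags *d, uint32_t flags)
{
	return (d->driver_flags & flags) != 0;
}

/* The if/else form, as in the hunk above. */
static bool must_resume_branchy(const struct toy_dev_flags *d)
{
	if (toy_test_flags(d, TOY_FLAG_MAY_SKIP_RESUME))
		return false;
	else
		return true;
}

/* The same truth table as a single expression. */
static bool must_resume_direct(const struct toy_dev_flags *d)
{
	return !toy_test_flags(d, TOY_FLAG_MAY_SKIP_RESUME);
}

/* Both forms agree whether or not the flag is set. */
static void toy_demo_equiv(void)
{
	struct toy_dev_flags with = { .driver_flags = TOY_FLAG_MAY_SKIP_RESUME };
	struct toy_dev_flags without = { .driver_flags = 0 };

	assert(must_resume_branchy(&with) == must_resume_direct(&with));
	assert(must_resume_branchy(&without) == must_resume_direct(&without));
}
```

Whether the single-assignment form reads better in main.c is a style question; the behaviour in this model is identical.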


>
> thanks,
>
> greg k-h