2022-11-01 03:00:14

by Shawn Guo

[permalink] [raw]
Subject: [PATCH v2 0/3] Manage domain power state for hibernate freeze

The series tries to fix a hardware lockup issue caused by missing domain
power state management from hibernate .freeze[thaw]_noirq hooks. The
detailed description of the issue can be found in commit log of patch
#3.

Changes for v2:
- Fix kernel doc of genpd_finish_resume() to use the new function name

Shawn Guo (3):
PM: domains: Pass generic PM noirq hooks to genpd_finish_suspend()
PM: domains: Consolidate genpd_restore_noirq() and
genpd_resume_noirq()
PM: domains: Power off[on] domain in hibernate .freeze[thaw]_noirq
hook

drivers/base/power/domain.c | 130 +++++++++++++++---------------------
1 file changed, 52 insertions(+), 78 deletions(-)

--
2.25.1



2022-11-01 03:03:43

by Shawn Guo

[permalink] [raw]
Subject: [PATCH v2 2/3] PM: domains: Consolidate genpd_restore_noirq() and genpd_resume_noirq()

Most of the logic between genpd_restore_noirq() and genpd_resume_noirq()
are same except GENPD_STATE_OFF status reset for hibernation restore.
The suspended_count decrement for restore should be the right thing to do
anyway, considering there is an increment in genpd_finish_suspend() for
hibernation.

Consolidate genpd_restore_noirq() and genpd_resume_noirq() into
genpd_finish_resume() and handle GENPD_STATE_OFF status reset for
restore case specially.

Signed-off-by: Shawn Guo <[email protected]>
---
drivers/base/power/domain.c | 70 ++++++++++++++++++-------------------
1 file changed, 35 insertions(+), 35 deletions(-)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 54f6b0dd35fb..b81baeb38d81 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -1247,12 +1247,14 @@ static int genpd_suspend_noirq(struct device *dev)
}

/**
- * genpd_resume_noirq - Start of resume of device in an I/O PM domain.
+ * genpd_finish_resume - Completion of resume of device in an I/O PM domain.
* @dev: Device to resume.
+ * @resume_noirq: Generic resume_noirq callback.
*
* Restore power to the device's PM domain, if necessary, and start the device.
*/
-static int genpd_resume_noirq(struct device *dev)
+static int genpd_finish_resume(struct device *dev,
+ int (*resume_noirq)(struct device *dev))
{
struct generic_pm_domain *genpd;
int ret;
@@ -1264,9 +1266,25 @@ static int genpd_resume_noirq(struct device *dev)
return -EINVAL;

if (device_wakeup_path(dev) && genpd_is_active_wakeup(genpd))
- return pm_generic_resume_noirq(dev);
+ return resume_noirq(dev);

genpd_lock(genpd);
+
+ /*
+ * Special handling for hibernation restore:
+ * At this point suspended_count == 0 means we are being run for the
+ * first time for the given domain in the present cycle.
+ */
+ if (resume_noirq == pm_generic_restore_noirq &&
+ genpd->suspended_count++ == 0) {
+ /*
+ * The boot kernel might put the domain into arbitrary state,
+ * so make it appear as powered off to genpd_sync_power_on(),
+ * so that it tries to power it on in case it was really off.
+ */
+ genpd->status = GENPD_STATE_OFF;
+ }
+
genpd_sync_power_on(genpd, true, 0);
genpd->suspended_count--;
genpd_unlock(genpd);
@@ -1281,6 +1299,19 @@ static int genpd_resume_noirq(struct device *dev)
return pm_generic_resume_noirq(dev);
}

+/**
+ * genpd_resume_noirq - Start of resume of device in an I/O PM domain.
+ * @dev: Device to resume.
+ *
+ * Restore power to the device's PM domain, if necessary, and start the device.
+ */
+static int genpd_resume_noirq(struct device *dev)
+{
+ dev_dbg(dev, "%s()\n", __func__);
+
+ return genpd_finish_resume(dev, pm_generic_resume_noirq);
+}
+
/**
* genpd_freeze_noirq - Completion of freezing a device in an I/O PM domain.
* @dev: Device to freeze.
@@ -1366,40 +1397,9 @@ static int genpd_poweroff_noirq(struct device *dev)
*/
static int genpd_restore_noirq(struct device *dev)
{
- struct generic_pm_domain *genpd;
- int ret = 0;
-
dev_dbg(dev, "%s()\n", __func__);

- genpd = dev_to_genpd(dev);
- if (IS_ERR(genpd))
- return -EINVAL;
-
- /*
- * At this point suspended_count == 0 means we are being run for the
- * first time for the given domain in the present cycle.
- */
- genpd_lock(genpd);
- if (genpd->suspended_count++ == 0) {
- /*
- * The boot kernel might put the domain into arbitrary state,
- * so make it appear as powered off to genpd_sync_power_on(),
- * so that it tries to power it on in case it was really off.
- */
- genpd->status = GENPD_STATE_OFF;
- }
-
- genpd_sync_power_on(genpd, true, 0);
- genpd_unlock(genpd);
-
- if (genpd->dev_ops.stop && genpd->dev_ops.start &&
- !pm_runtime_status_suspended(dev)) {
- ret = genpd_start_dev(genpd, dev);
- if (ret)
- return ret;
- }
-
- return pm_generic_restore_noirq(dev);
+ return genpd_finish_resume(dev, pm_generic_restore_noirq);
}

/**
--
2.25.1


2022-11-01 03:06:27

by Shawn Guo

[permalink] [raw]
Subject: [PATCH v2 3/3] PM: domains: Power off[on] domain in hibernate .freeze[thaw]_noirq hook

On platforms which use SHUTDOWN as hibernation mode, the genpd noirq
hooks will be called like below.

genpd_freeze_noirq() genpd_restore_noirq()
↓ ↑
Create snapshot image Restore system
↓ ↑
genpd_thaw_noirq() Read snapshot image
↓ ↑
Write snapshot image Kernel boot
↓ ↑
power_down() Power on device

As of today suspend hooks genpd_suspend[resume]_noirq() manages domain
on/off state, but hibernate hooks genpd_freeze[thaw]_noirq() doesn't.
This results in a different behavior of domain power state between suspend
and hibernate freeze, i.e. domain is powered off for the former while on
for the later. It causes a problem on platforms like i.MX where the
domain needs to be powered on/off by calling clock and regulator interface.
When the platform restores from hibernation, the domain is off in hardware
and genpd_restore_noirq() tries to power it on, but will never succeed
because software state of domain (clock and regulator) is left on from the
last hibernate freeze, so kernel thinks that clock and regulator are
enabled while they are actually not turned on in hardware. The
consequence would be that devices in the power domain will access
registers without clock or power, and cause hardware lockup.

Power off[on] domain in hibernate .freeze[thaw]_noirq hook for reasons:

- Align the behavior between suspend and hibernate freeze.
- Have power state of domains stay in sync between hardware and software
for hibernate freeze, and thus fix the lockup issue seen on i.MX
platform.

Signed-off-by: Shawn Guo <[email protected]>
---
drivers/base/power/domain.c | 35 ++++-------------------------------
1 file changed, 4 insertions(+), 31 deletions(-)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index b81baeb38d81..28c9e04e3488 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -1323,24 +1323,11 @@ static int genpd_resume_noirq(struct device *dev)
*/
static int genpd_freeze_noirq(struct device *dev)
{
- const struct generic_pm_domain *genpd;
- int ret = 0;
-
dev_dbg(dev, "%s()\n", __func__);

- genpd = dev_to_genpd(dev);
- if (IS_ERR(genpd))
- return -EINVAL;
-
- ret = pm_generic_freeze_noirq(dev);
- if (ret)
- return ret;
-
- if (genpd->dev_ops.stop && genpd->dev_ops.start &&
- !pm_runtime_status_suspended(dev))
- ret = genpd_stop_dev(genpd, dev);
-
- return ret;
+ return genpd_finish_suspend(dev,
+ pm_generic_freeze_noirq,
+ pm_generic_thaw_noirq);
}

/**
@@ -1352,23 +1339,9 @@ static int genpd_freeze_noirq(struct device *dev)
*/
static int genpd_thaw_noirq(struct device *dev)
{
- const struct generic_pm_domain *genpd;
- int ret = 0;
-
dev_dbg(dev, "%s()\n", __func__);

- genpd = dev_to_genpd(dev);
- if (IS_ERR(genpd))
- return -EINVAL;
-
- if (genpd->dev_ops.stop && genpd->dev_ops.start &&
- !pm_runtime_status_suspended(dev)) {
- ret = genpd_start_dev(genpd, dev);
- if (ret)
- return ret;
- }
-
- return pm_generic_thaw_noirq(dev);
+ return genpd_finish_resume(dev, pm_generic_thaw_noirq);
}

/**
--
2.25.1


2022-11-01 04:05:00

by Shawn Guo

[permalink] [raw]
Subject: [PATCH v2 1/3] PM: domains: Pass generic PM noirq hooks to genpd_finish_suspend()

While argument `poweroff` works fine for genpd_finish_suspend() to handle
distinction between suspend and poweroff, it won't scale if we want to
use it for freeze as well. Pass generic PM noirq hooks as arguments
instead, so that the function can possibly cover freeze case too.

Signed-off-by: Shawn Guo <[email protected]>
---
drivers/base/power/domain.c | 25 +++++++++++++------------
1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 6471b559230e..54f6b0dd35fb 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -1189,12 +1189,15 @@ static int genpd_prepare(struct device *dev)
* genpd_finish_suspend - Completion of suspend or hibernation of device in an
* I/O pm domain.
* @dev: Device to suspend.
- * @poweroff: Specifies if this is a poweroff_noirq or suspend_noirq callback.
+ * @suspend_noirq: Generic suspend_noirq callback.
+ * @resume_noirq: Generic resume_noirq callback.
*
* Stop the device and remove power from the domain if all devices in it have
* been stopped.
*/
-static int genpd_finish_suspend(struct device *dev, bool poweroff)
+static int genpd_finish_suspend(struct device *dev,
+ int (*suspend_noirq)(struct device *dev),
+ int (*resume_noirq)(struct device *dev))
{
struct generic_pm_domain *genpd;
int ret = 0;
@@ -1203,10 +1206,7 @@ static int genpd_finish_suspend(struct device *dev, bool poweroff)
if (IS_ERR(genpd))
return -EINVAL;

- if (poweroff)
- ret = pm_generic_poweroff_noirq(dev);
- else
- ret = pm_generic_suspend_noirq(dev);
+ ret = suspend_noirq(dev);
if (ret)
return ret;

@@ -1217,10 +1217,7 @@ static int genpd_finish_suspend(struct device *dev, bool poweroff)
!pm_runtime_status_suspended(dev)) {
ret = genpd_stop_dev(genpd, dev);
if (ret) {
- if (poweroff)
- pm_generic_restore_noirq(dev);
- else
- pm_generic_resume_noirq(dev);
+ resume_noirq(dev);
return ret;
}
}
@@ -1244,7 +1241,9 @@ static int genpd_suspend_noirq(struct device *dev)
{
dev_dbg(dev, "%s()\n", __func__);

- return genpd_finish_suspend(dev, false);
+ return genpd_finish_suspend(dev,
+ pm_generic_suspend_noirq,
+ pm_generic_resume_noirq);
}

/**
@@ -1353,7 +1352,9 @@ static int genpd_poweroff_noirq(struct device *dev)
{
dev_dbg(dev, "%s()\n", __func__);

- return genpd_finish_suspend(dev, true);
+ return genpd_finish_suspend(dev,
+ pm_generic_poweroff_noirq,
+ pm_generic_restore_noirq);
}

/**
--
2.25.1


2022-11-01 10:57:35

by Ulf Hansson

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] PM: domains: Consolidate genpd_restore_noirq() and genpd_resume_noirq()

On Tue, 1 Nov 2022 at 03:47, Shawn Guo <[email protected]> wrote:
>
> Most of the logic between genpd_restore_noirq() and genpd_resume_noirq()
> are same except GENPD_STATE_OFF status reset for hibernation restore.
> The suspended_count decrement for restore should be the right thing to do
> anyway, considering there is an increment in genpd_finish_suspend() for
> hibernation.
>
> Consolidate genpd_restore_noirq() and genpd_resume_noirq() into
> genpd_finish_resume() and handle GENPD_STATE_OFF status reset for
> restore case specially.
>
> Signed-off-by: Shawn Guo <[email protected]>

I have a comment, see more below.

Nevertheless, please add:

Reviewed-by: Ulf Hansson <[email protected]>

> ---
> drivers/base/power/domain.c | 70 ++++++++++++++++++-------------------
> 1 file changed, 35 insertions(+), 35 deletions(-)
>
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index 54f6b0dd35fb..b81baeb38d81 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -1247,12 +1247,14 @@ static int genpd_suspend_noirq(struct device *dev)
> }
>
> /**
> - * genpd_resume_noirq - Start of resume of device in an I/O PM domain.
> + * genpd_finish_resume - Completion of resume of device in an I/O PM domain.
> * @dev: Device to resume.
> + * @resume_noirq: Generic resume_noirq callback.
> *
> * Restore power to the device's PM domain, if necessary, and start the device.
> */
> -static int genpd_resume_noirq(struct device *dev)
> +static int genpd_finish_resume(struct device *dev,
> + int (*resume_noirq)(struct device *dev))
> {
> struct generic_pm_domain *genpd;
> int ret;
> @@ -1264,9 +1266,25 @@ static int genpd_resume_noirq(struct device *dev)
> return -EINVAL;
>
> if (device_wakeup_path(dev) && genpd_is_active_wakeup(genpd))
> - return pm_generic_resume_noirq(dev);
> + return resume_noirq(dev);
>
> genpd_lock(genpd);
> +
> + /*
> + * Special handling for hibernation restore:
> + * At this point suspended_count == 0 means we are being run for the
> + * first time for the given domain in the present cycle.
> + */
> + if (resume_noirq == pm_generic_restore_noirq &&
> + genpd->suspended_count++ == 0) {
> + /*
> + * The boot kernel might put the domain into arbitrary state,
> + * so make it appear as powered off to genpd_sync_power_on(),
> + * so that it tries to power it on in case it was really off.
> + */
> + genpd->status = GENPD_STATE_OFF;

This has really never worked as intended. Resetting the status like
this, needs more careful actions.

For example, if the genpd->status was GENPD_STATE_ON, the parent
domain's ->sd_count have been increased - so that needs to be adjusted
too.

By looking at patch3/3, I wonder if we shouldn't try to align the
hibernation behaviors so the above hack can be dropped. Do you think
that could work?

> + }
> +
> genpd_sync_power_on(genpd, true, 0);
> genpd->suspended_count--;
> genpd_unlock(genpd);
> @@ -1281,6 +1299,19 @@ static int genpd_resume_noirq(struct device *dev)
> return pm_generic_resume_noirq(dev);
> }
>
> +/**
> + * genpd_resume_noirq - Start of resume of device in an I/O PM domain.
> + * @dev: Device to resume.
> + *
> + * Restore power to the device's PM domain, if necessary, and start the device.
> + */
> +static int genpd_resume_noirq(struct device *dev)
> +{
> + dev_dbg(dev, "%s()\n", __func__);
> +
> + return genpd_finish_resume(dev, pm_generic_resume_noirq);
> +}
> +
> /**
> * genpd_freeze_noirq - Completion of freezing a device in an I/O PM domain.
> * @dev: Device to freeze.
> @@ -1366,40 +1397,9 @@ static int genpd_poweroff_noirq(struct device *dev)
> */
> static int genpd_restore_noirq(struct device *dev)
> {
> - struct generic_pm_domain *genpd;
> - int ret = 0;
> -
> dev_dbg(dev, "%s()\n", __func__);
>
> - genpd = dev_to_genpd(dev);
> - if (IS_ERR(genpd))
> - return -EINVAL;
> -
> - /*
> - * At this point suspended_count == 0 means we are being run for the
> - * first time for the given domain in the present cycle.
> - */
> - genpd_lock(genpd);
> - if (genpd->suspended_count++ == 0) {
> - /*
> - * The boot kernel might put the domain into arbitrary state,
> - * so make it appear as powered off to genpd_sync_power_on(),
> - * so that it tries to power it on in case it was really off.
> - */
> - genpd->status = GENPD_STATE_OFF;
> - }
> -
> - genpd_sync_power_on(genpd, true, 0);
> - genpd_unlock(genpd);
> -
> - if (genpd->dev_ops.stop && genpd->dev_ops.start &&
> - !pm_runtime_status_suspended(dev)) {
> - ret = genpd_start_dev(genpd, dev);
> - if (ret)
> - return ret;
> - }
> -
> - return pm_generic_restore_noirq(dev);
> + return genpd_finish_resume(dev, pm_generic_restore_noirq);
> }
>
> /**

Kind regards
Uffe

2022-11-01 11:00:37

by Ulf Hansson

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] PM: domains: Power off[on] domain in hibernate .freeze[thaw]_noirq hook

On Tue, 1 Nov 2022 at 03:47, Shawn Guo <[email protected]> wrote:
>
> On platforms which use SHUTDOWN as hibernation mode, the genpd noirq
> hooks will be called like below.
>
> genpd_freeze_noirq() genpd_restore_noirq()
> ↓ ↑
> Create snapshot image Restore system
> ↓ ↑
> genpd_thaw_noirq() Read snapshot image
> ↓ ↑
> Write snapshot image Kernel boot
> ↓ ↑
> power_down() Power on device
>
> As of today suspend hooks genpd_suspend[resume]_noirq() manages domain
> on/off state, but hibernate hooks genpd_freeze[thaw]_noirq() doesn't.
> This results in a different behavior of domain power state between suspend
> and hibernate freeze, i.e. domain is powered off for the former while on
> for the later. It causes a problem on platforms like i.MX where the
> domain needs to be powered on/off by calling clock and regulator interface.
> When the platform restores from hibernation, the domain is off in hardware
> and genpd_restore_noirq() tries to power it on, but will never succeed
> because software state of domain (clock and regulator) is left on from the
> last hibernate freeze, so kernel thinks that clock and regulator are
> enabled while they are actually not turned on in hardware. The
> consequence would be that devices in the power domain will access
> registers without clock or power, and cause hardware lockup.
>
> Power off[on] domain in hibernate .freeze[thaw]_noirq hook for reasons:
>
> - Align the behavior between suspend and hibernate freeze.
> - Have power state of domains stay in sync between hardware and software
> for hibernate freeze, and thus fix the lockup issue seen on i.MX
> platform.
>
> Signed-off-by: Shawn Guo <[email protected]>

Reviewed-by: Ulf Hansson <[email protected]>

Kind regards
Uffe

> ---
> drivers/base/power/domain.c | 35 ++++-------------------------------
> 1 file changed, 4 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index b81baeb38d81..28c9e04e3488 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -1323,24 +1323,11 @@ static int genpd_resume_noirq(struct device *dev)
> */
> static int genpd_freeze_noirq(struct device *dev)
> {
> - const struct generic_pm_domain *genpd;
> - int ret = 0;
> -
> dev_dbg(dev, "%s()\n", __func__);
>
> - genpd = dev_to_genpd(dev);
> - if (IS_ERR(genpd))
> - return -EINVAL;
> -
> - ret = pm_generic_freeze_noirq(dev);
> - if (ret)
> - return ret;
> -
> - if (genpd->dev_ops.stop && genpd->dev_ops.start &&
> - !pm_runtime_status_suspended(dev))
> - ret = genpd_stop_dev(genpd, dev);
> -
> - return ret;
> + return genpd_finish_suspend(dev,
> + pm_generic_freeze_noirq,
> + pm_generic_thaw_noirq);
> }
>
> /**
> @@ -1352,23 +1339,9 @@ static int genpd_freeze_noirq(struct device *dev)
> */
> static int genpd_thaw_noirq(struct device *dev)
> {
> - const struct generic_pm_domain *genpd;
> - int ret = 0;
> -
> dev_dbg(dev, "%s()\n", __func__);
>
> - genpd = dev_to_genpd(dev);
> - if (IS_ERR(genpd))
> - return -EINVAL;
> -
> - if (genpd->dev_ops.stop && genpd->dev_ops.start &&
> - !pm_runtime_status_suspended(dev)) {
> - ret = genpd_start_dev(genpd, dev);
> - if (ret)
> - return ret;
> - }
> -
> - return pm_generic_thaw_noirq(dev);
> + return genpd_finish_resume(dev, pm_generic_thaw_noirq);
> }
>
> /**
> --
> 2.25.1
>

2022-11-01 11:19:07

by Ulf Hansson

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] PM: domains: Pass generic PM noirq hooks to genpd_finish_suspend()

On Tue, 1 Nov 2022 at 03:47, Shawn Guo <[email protected]> wrote:
>
> While argument `poweroff` works fine for genpd_finish_suspend() to handle
> distinction between suspend and poweroff, it won't scale if we want to
> use it for freeze as well. Pass generic PM noirq hooks as arguments
> instead, so that the function can possibly cover freeze case too.
>
> Signed-off-by: Shawn Guo <[email protected]>

Reviewed-by: Ulf Hansson <[email protected]>

Kind regards
Uffe


> ---
> drivers/base/power/domain.c | 25 +++++++++++++------------
> 1 file changed, 13 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index 6471b559230e..54f6b0dd35fb 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -1189,12 +1189,15 @@ static int genpd_prepare(struct device *dev)
> * genpd_finish_suspend - Completion of suspend or hibernation of device in an
> * I/O pm domain.
> * @dev: Device to suspend.
> - * @poweroff: Specifies if this is a poweroff_noirq or suspend_noirq callback.
> + * @suspend_noirq: Generic suspend_noirq callback.
> + * @resume_noirq: Generic resume_noirq callback.
> *
> * Stop the device and remove power from the domain if all devices in it have
> * been stopped.
> */
> -static int genpd_finish_suspend(struct device *dev, bool poweroff)
> +static int genpd_finish_suspend(struct device *dev,
> + int (*suspend_noirq)(struct device *dev),
> + int (*resume_noirq)(struct device *dev))
> {
> struct generic_pm_domain *genpd;
> int ret = 0;
> @@ -1203,10 +1206,7 @@ static int genpd_finish_suspend(struct device *dev, bool poweroff)
> if (IS_ERR(genpd))
> return -EINVAL;
>
> - if (poweroff)
> - ret = pm_generic_poweroff_noirq(dev);
> - else
> - ret = pm_generic_suspend_noirq(dev);
> + ret = suspend_noirq(dev);
> if (ret)
> return ret;
>
> @@ -1217,10 +1217,7 @@ static int genpd_finish_suspend(struct device *dev, bool poweroff)
> !pm_runtime_status_suspended(dev)) {
> ret = genpd_stop_dev(genpd, dev);
> if (ret) {
> - if (poweroff)
> - pm_generic_restore_noirq(dev);
> - else
> - pm_generic_resume_noirq(dev);
> + resume_noirq(dev);
> return ret;
> }
> }
> @@ -1244,7 +1241,9 @@ static int genpd_suspend_noirq(struct device *dev)
> {
> dev_dbg(dev, "%s()\n", __func__);
>
> - return genpd_finish_suspend(dev, false);
> + return genpd_finish_suspend(dev,
> + pm_generic_suspend_noirq,
> + pm_generic_resume_noirq);
> }
>
> /**
> @@ -1353,7 +1352,9 @@ static int genpd_poweroff_noirq(struct device *dev)
> {
> dev_dbg(dev, "%s()\n", __func__);
>
> - return genpd_finish_suspend(dev, true);
> + return genpd_finish_suspend(dev,
> + pm_generic_poweroff_noirq,
> + pm_generic_restore_noirq);
> }
>
> /**
> --
> 2.25.1
>

2022-11-02 09:41:40

by Shawn Guo

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] PM: domains: Consolidate genpd_restore_noirq() and genpd_resume_noirq()

On Tue, Nov 01, 2022 at 11:23:45AM +0100, Ulf Hansson wrote:
> On Tue, 1 Nov 2022 at 03:47, Shawn Guo <[email protected]> wrote:
> >
> > Most of the logic between genpd_restore_noirq() and genpd_resume_noirq()
> > are same except GENPD_STATE_OFF status reset for hibernation restore.
> > The suspended_count decrement for restore should be the right thing to do
> > anyway, considering there is an increment in genpd_finish_suspend() for
> > hibernation.
> >
> > Consolidate genpd_restore_noirq() and genpd_resume_noirq() into
> > genpd_finish_resume() and handle GENPD_STATE_OFF status reset for
> > restore case specially.
> >
> > Signed-off-by: Shawn Guo <[email protected]>
>
> I have a comment, see more below.
>
> Nevertheless, please add:
>
> Reviewed-by: Ulf Hansson <[email protected]>
>
> > ---
> > drivers/base/power/domain.c | 70 ++++++++++++++++++-------------------
> > 1 file changed, 35 insertions(+), 35 deletions(-)
> >
> > diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> > index 54f6b0dd35fb..b81baeb38d81 100644
> > --- a/drivers/base/power/domain.c
> > +++ b/drivers/base/power/domain.c
> > @@ -1247,12 +1247,14 @@ static int genpd_suspend_noirq(struct device *dev)
> > }
> >
> > /**
> > - * genpd_resume_noirq - Start of resume of device in an I/O PM domain.
> > + * genpd_finish_resume - Completion of resume of device in an I/O PM domain.
> > * @dev: Device to resume.
> > + * @resume_noirq: Generic resume_noirq callback.
> > *
> > * Restore power to the device's PM domain, if necessary, and start the device.
> > */
> > -static int genpd_resume_noirq(struct device *dev)
> > +static int genpd_finish_resume(struct device *dev,
> > + int (*resume_noirq)(struct device *dev))
> > {
> > struct generic_pm_domain *genpd;
> > int ret;
> > @@ -1264,9 +1266,25 @@ static int genpd_resume_noirq(struct device *dev)
> > return -EINVAL;
> >
> > if (device_wakeup_path(dev) && genpd_is_active_wakeup(genpd))
> > - return pm_generic_resume_noirq(dev);
> > + return resume_noirq(dev);
> >
> > genpd_lock(genpd);
> > +
> > + /*
> > + * Special handling for hibernation restore:
> > + * At this point suspended_count == 0 means we are being run for the
> > + * first time for the given domain in the present cycle.
> > + */
> > + if (resume_noirq == pm_generic_restore_noirq &&
> > + genpd->suspended_count++ == 0) {
> > + /*
> > + * The boot kernel might put the domain into arbitrary state,
> > + * so make it appear as powered off to genpd_sync_power_on(),
> > + * so that it tries to power it on in case it was really off.
> > + */
> > + genpd->status = GENPD_STATE_OFF;
>
> This has really never worked as intended. Resetting the status like
> this, needs more careful actions.
>
> For example, if the genpd->status was GENPD_STATE_ON, the parent
> domain's ->sd_count have been increased - so that needs to be adjusted
> too.
>
> By looking at patch3/3, I wonder if we shouldn't try to align the
> hibernation behaviors so the above hack can be dropped. Do you think
> that could work?

To be honest, I found this piece of code suspicious when I was fixing my
problem. To be on the safe side, I chose to leave it there because I'm
not sure if it's handling any special cases or platform quirks.

I tested on my platform with dropping the code. Worked perfectly fine.
So I will repost the series by starting with this cleanup.

Shawn