2022-06-30 19:46:39

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH] PM: runtime: Fix supplier device management during consumer probe

From: Rafael J. Wysocki <[email protected]>

Because pm_runtime_get_suppliers() bumps up the rpm_active counter
of each device link to a supplier of the given device in addition
to bumping up the supplier's PM-runtime usage counter, a runtime
suspend of the consumer device may case the latter to go down to 0
when pm_runtime_put_suppliers() is running on a remote CPU. If that
happens after pm_runtime_put_suppliers() has released power.lock for
the consumer device, and a runtime resume of that device takes place
immediately after it, before pm_runtime_put() is called for the
supplier, that pm_runtime_put() call may cause the supplier to be
suspended even though the consumer is active.

To prevent that from happening, modify pm_runtime_get_suppliers() to
call pm_runtime_get_sync() for the given device's suppliers without
touching the rpm_active counters of the involved device links
Accordingly, modify pm_runtime_put_suppliers() to call pm_runtime_put()
for the given device's suppliers without looking at the rpm_active
counters of the device links at hand. [This is analogous to what
happened before commit 4c06c4e6cf63 ("driver core: Fix possible
supplier PM-usage counter imbalance").]

Since pm_runtime_get_suppliers() sets supplier_preactivated for each
device link where the supplier's PM-runtime usage counter has been
incremented and pm_runtime_put_suppliers() calls pm_runtime_put() for
the suppliers whose device links have supplier_preactivated set, the
PM-runtime usage counter is balanced for each supplier and this is
independent of the runtime suspend and resume of the consumer device.

However, in case a device link with DL_FLAG_PM_RUNTIME set is dropped
during the consumer device probe, so pm_runtime_get_suppliers() bumps
up the supplier's PM-runtime usage counter, but it cannot be dropped by
pm_runtime_put_suppliers(), make device_link_release_fn() take care of
that.

Fixes: 4c06c4e6cf63 ("driver core: Fix possible supplier PM-usage counter imbalance")
Reported-by: Peter Wang <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/base/core.c | 10 ++++++++++
drivers/base/power/runtime.c | 14 +-------------
2 files changed, 11 insertions(+), 13 deletions(-)

Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -1768,7 +1768,6 @@ void pm_runtime_get_suppliers(struct dev
if (link->flags & DL_FLAG_PM_RUNTIME) {
link->supplier_preactivated = true;
pm_runtime_get_sync(link->supplier);
- refcount_inc(&link->rpm_active);
}

device_links_read_unlock(idx);
@@ -1788,19 +1787,8 @@ void pm_runtime_put_suppliers(struct dev
list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
device_links_read_lock_held())
if (link->supplier_preactivated) {
- bool put;
-
link->supplier_preactivated = false;
-
- spin_lock_irq(&dev->power.lock);
-
- put = pm_runtime_status_suspended(dev) &&
- refcount_dec_not_one(&link->rpm_active);
-
- spin_unlock_irq(&dev->power.lock);
-
- if (put)
- pm_runtime_put(link->supplier);
+ pm_runtime_put(link->supplier);
}

device_links_read_unlock(idx);
Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -487,6 +487,16 @@ static void device_link_release_fn(struc
device_link_synchronize_removal();

pm_runtime_release_supplier(link);
+ /*
+ * If supplier_preactivated is set, the link has been dropped between
+ * the pm_runtime_get_suppliers() and pm_runtime_put_suppliers() calls
+ * in __driver_probe_device(). In that case, drop the supplier's
+ * PM-runtime usage counter to remove the reference taken by
+ * pm_runtime_get_suppliers().
+ */
+ if (link->supplier_preactivated)
+ pm_runtime_put_noidle(link->supplier);
+
pm_request_idle(link->supplier);

put_device(link->consumer);




2022-07-01 08:36:59

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] PM: runtime: Fix supplier device management during consumer probe

On Thu, Jun 30, 2022 at 09:16:41PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> Because pm_runtime_get_suppliers() bumps up the rpm_active counter
> of each device link to a supplier of the given device in addition
> to bumping up the supplier's PM-runtime usage counter, a runtime
> suspend of the consumer device may case the latter to go down to 0
> when pm_runtime_put_suppliers() is running on a remote CPU. If that
> happens after pm_runtime_put_suppliers() has released power.lock for
> the consumer device, and a runtime resume of that device takes place
> immediately after it, before pm_runtime_put() is called for the
> supplier, that pm_runtime_put() call may cause the supplier to be
> suspended even though the consumer is active.
>
> To prevent that from happening, modify pm_runtime_get_suppliers() to
> call pm_runtime_get_sync() for the given device's suppliers without
> touching the rpm_active counters of the involved device links
> Accordingly, modify pm_runtime_put_suppliers() to call pm_runtime_put()
> for the given device's suppliers without looking at the rpm_active
> counters of the device links at hand. [This is analogous to what
> happened before commit 4c06c4e6cf63 ("driver core: Fix possible
> supplier PM-usage counter imbalance").]
>
> Since pm_runtime_get_suppliers() sets supplier_preactivated for each
> device link where the supplier's PM-runtime usage counter has been
> incremented and pm_runtime_put_suppliers() calls pm_runtime_put() for
> the suppliers whose device links have supplier_preactivated set, the
> PM-runtime usage counter is balanced for each supplier and this is
> independent of the runtime suspend and resume of the consumer device.
>
> However, in case a device link with DL_FLAG_PM_RUNTIME set is dropped
> during the consumer device probe, so pm_runtime_get_suppliers() bumps
> up the supplier's PM-runtime usage counter, but it cannot be dropped by
> pm_runtime_put_suppliers(), make device_link_release_fn() take care of
> that.
>
> Fixes: 4c06c4e6cf63 ("driver core: Fix possible supplier PM-usage counter imbalance")
> Reported-by: Peter Wang <[email protected]>
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> ---
> drivers/base/core.c | 10 ++++++++++
> drivers/base/power/runtime.c | 14 +-------------
> 2 files changed, 11 insertions(+), 13 deletions(-)

Reviewed-by: Greg Kroah-Hartman <[email protected]>

2022-07-01 11:00:15

by Peter Wang (王信友)

[permalink] [raw]
Subject: Re: [PATCH] PM: runtime: Fix supplier device management during consumer probe


On 7/1/22 3:16 AM, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> Because pm_runtime_get_suppliers() bumps up the rpm_active counter
> of each device link to a supplier of the given device in addition
> to bumping up the supplier's PM-runtime usage counter, a runtime
> suspend of the consumer device may case the latter to go down to 0
> when pm_runtime_put_suppliers() is running on a remote CPU. If that
> happens after pm_runtime_put_suppliers() has released power.lock for
> the consumer device, and a runtime resume of that device takes place
> immediately after it, before pm_runtime_put() is called for the
> supplier, that pm_runtime_put() call may cause the supplier to be
> suspended even though the consumer is active.
>
> To prevent that from happening, modify pm_runtime_get_suppliers() to
> call pm_runtime_get_sync() for the given device's suppliers without
> touching the rpm_active counters of the involved device links
> Accordingly, modify pm_runtime_put_suppliers() to call pm_runtime_put()
> for the given device's suppliers without looking at the rpm_active
> counters of the device links at hand. [This is analogous to what
> happened before commit 4c06c4e6cf63 ("driver core: Fix possible
> supplier PM-usage counter imbalance").]
>
> Since pm_runtime_get_suppliers() sets supplier_preactivated for each
> device link where the supplier's PM-runtime usage counter has been
> incremented and pm_runtime_put_suppliers() calls pm_runtime_put() for
> the suppliers whose device links have supplier_preactivated set, the
> PM-runtime usage counter is balanced for each supplier and this is
> independent of the runtime suspend and resume of the consumer device.
>
> However, in case a device link with DL_FLAG_PM_RUNTIME set is dropped
> during the consumer device probe, so pm_runtime_get_suppliers() bumps
> up the supplier's PM-runtime usage counter, but it cannot be dropped by
> pm_runtime_put_suppliers(), make device_link_release_fn() take care of
> that.
>
> Fixes: 4c06c4e6cf63 ("driver core: Fix possible supplier PM-usage counter imbalance")
> Reported-by: Peter Wang <[email protected]>
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> ---
> drivers/base/core.c | 10 ++++++++++
> drivers/base/power/runtime.c | 14 +-------------
> 2 files changed, 11 insertions(+), 13 deletions(-)
>
> Index: linux-pm/drivers/base/power/runtime.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/runtime.c
> +++ linux-pm/drivers/base/power/runtime.c
> @@ -1768,7 +1768,6 @@ void pm_runtime_get_suppliers(struct dev
> if (link->flags & DL_FLAG_PM_RUNTIME) {
> link->supplier_preactivated = true;
> pm_runtime_get_sync(link->supplier);
> - refcount_inc(&link->rpm_active);
> }
>
> device_links_read_unlock(idx);
> @@ -1788,19 +1787,8 @@ void pm_runtime_put_suppliers(struct dev
> list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
> device_links_read_lock_held())
> if (link->supplier_preactivated) {
> - bool put;
> -
> link->supplier_preactivated = false;
> -
> - spin_lock_irq(&dev->power.lock);
> -
> - put = pm_runtime_status_suspended(dev) &&
> - refcount_dec_not_one(&link->rpm_active);
> -
> - spin_unlock_irq(&dev->power.lock);
> -
> - if (put)
> - pm_runtime_put(link->supplier);
> + pm_runtime_put(link->supplier);
> }
>
> device_links_read_unlock(idx);
> Index: linux-pm/drivers/base/core.c
> ===================================================================
> --- linux-pm.orig/drivers/base/core.c
> +++ linux-pm/drivers/base/core.c
> @@ -487,6 +487,16 @@ static void device_link_release_fn(struc
> device_link_synchronize_removal();
>
> pm_runtime_release_supplier(link);
> + /*
> + * If supplier_preactivated is set, the link has been dropped between
> + * the pm_runtime_get_suppliers() and pm_runtime_put_suppliers() calls
> + * in __driver_probe_device(). In that case, drop the supplier's
> + * PM-runtime usage counter to remove the reference taken by
> + * pm_runtime_get_suppliers().
> + */
> + if (link->supplier_preactivated)
> + pm_runtime_put_noidle(link->supplier);
> +
> pm_request_idle(link->supplier);
>
> put_device(link->consumer);

Thanks for fix this bug.
Reviewed-by: Peter Wang <[email protected]>


>
>