2019-02-01 01:11:52

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH v2 0/9] driver core: Fix some device links issues and add "consumer autoprobe" flag

Hi Greg at al,

This is a combination of the two device links series I have posted
recently (https://lore.kernel.org/lkml/[email protected]/
and https://lore.kernel.org/lkml/[email protected]/) rebased
on top of your driver-core-next branch.

Recently I have been looking at the device links code because of the
recent discussion on possibly using them in the DRM subsystem (see for
example https://marc.info/?l=linux-pm&m=154832771905309&w=2) and I have
found a few issues in that code which should be addressed by this patch
series. Please refer to the patch changelogs for details.

None of the problems addressed here should be manifesting themselves in
mainline kernel today, but if there are more device links users in the
future, they most likely will be encountered sooner or later. Also they
need to be fixed for the DRM use case to be supported IMO.

On top of this the series makes device links support the "composite device"
use case in the DRM subsystem mentioned above (essentially, the last patch
in the series is for that purpose).

Cheers,
Rafael



2019-02-01 01:12:37

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH v2 7/9] IOMMU: Make dwo drivers use stateless device links

From: Rafael J. Wysocki <[email protected]>

The device links used by rockchip-iommu and exynos-iommu are
completely managed by these drivers within the IOMMU framework,
so there is no reason to involve the driver core in the management
of these links.

For this reason, make rockchip-iommu and exynos-iommu pass
DL_FLAG_STATELESS in flags to device_link_add(), so that the device
links used by them are stateless.

[Note that this change is requisite for a subsequent one that will
rework the management of stateful device links in the driver core
and it will not be compatible with the two drivers in question any
more.]

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Marek Szyprowski <[email protected]>
Acked-by: Joerg Roedel <[email protected]>
---
drivers/iommu/exynos-iommu.c | 1 +
drivers/iommu/rockchip-iommu.c | 3 ++-
2 files changed, 3 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/iommu/rockchip-iommu.c
===================================================================
--- linux-pm.orig/drivers/iommu/rockchip-iommu.c
+++ linux-pm/drivers/iommu/rockchip-iommu.c
@@ -1071,7 +1071,8 @@ static int rk_iommu_add_device(struct de
iommu_group_put(group);

iommu_device_link(&iommu->iommu, dev);
- data->link = device_link_add(dev, iommu->dev, DL_FLAG_PM_RUNTIME);
+ data->link = device_link_add(dev, iommu->dev,
+ DL_FLAG_STATELESS | DL_FLAG_PM_RUNTIME);

return 0;
}
Index: linux-pm/drivers/iommu/exynos-iommu.c
===================================================================
--- linux-pm.orig/drivers/iommu/exynos-iommu.c
+++ linux-pm/drivers/iommu/exynos-iommu.c
@@ -1260,6 +1260,7 @@ static int exynos_iommu_add_device(struc
* direct calls to pm_runtime_get/put in this driver.
*/
data->link = device_link_add(dev, data->sysmmu,
+ DL_FLAG_STATELESS |
DL_FLAG_PM_RUNTIME);
}
iommu_group_put(group);


2019-02-01 01:12:39

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH v2 2/9] driver core: Avoid careless re-use of existing device links

From: Rafael J. Wysocki <[email protected]>

After commit ead18c23c263 ("driver core: Introduce device links
reference counting"), if there is a link between the given supplier
and the given consumer already, device_link_add() will refcount it
and return it unconditionally. However, if the flags passed to
it on the second (or any subsequent) attempt to create a device
link between the same consumer-supplier pair are not compatible with
the existing link's flags, that is incorrect.

First off, if the existing link is stateless and the next caller of
device_link_add() for the same consumer-supplier pair wants a
stateful one, or the other way around, the existing link cannot be
returned, because it will not match the expected behavior, so make
device_link_add() dump the stack and return NULL in that case.

Moreover, if the DL_FLAG_AUTOREMOVE_CONSUMER flag is passed to
device_link_add(), its caller will expect its reference to the link
to be dropped automatically on consumer driver removal, which will
not happen if that flag is not set in the link's flags (and
analogously for DL_FLAG_AUTOREMOVE_SUPPLIER). For this reason, make
device_link_add() update the existing link's flags accordingly
before returning it to the caller.

Fixes: ead18c23c263 ("driver core: Introduce device links reference counting")
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/base/core.c | 23 ++++++++++++++++++++---
1 file changed, 20 insertions(+), 3 deletions(-)

Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -221,12 +221,29 @@ struct device_link *device_link_add(stru
goto out;
}

- list_for_each_entry(link, &supplier->links.consumers, s_node)
- if (link->consumer == consumer) {
- kref_get(&link->kref);
+ list_for_each_entry(link, &supplier->links.consumers, s_node) {
+ if (link->consumer != consumer)
+ continue;
+
+ /*
+ * Don't return a stateless link if the caller wants a stateful
+ * one and vice versa.
+ */
+ if (WARN_ON((flags & DL_FLAG_STATELESS) != (link->flags & DL_FLAG_STATELESS))) {
+ link = NULL;
goto out;
}

+ if (flags & DL_FLAG_AUTOREMOVE_CONSUMER)
+ link->flags |= DL_FLAG_AUTOREMOVE_CONSUMER;
+
+ if (flags & DL_FLAG_AUTOREMOVE_SUPPLIER)
+ link->flags |= DL_FLAG_AUTOREMOVE_SUPPLIER;
+
+ kref_get(&link->kref);
+ goto out;
+ }
+
link = kzalloc(sizeof(*link), GFP_KERNEL);
if (!link)
goto out;


2019-02-01 01:13:08

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH v2 1/9] driver core: Fix DL_FLAG_AUTOREMOVE_SUPPLIER device link flag handling

From: Rafael J. Wysocki <[email protected]>

Change the list walk in device_links_driver_cleanup() to a safe one
to avoid use-after-free when dropping a link from the list during the
walk.

Also, while at it, fix device_link_add() to refuse to create
stateless device links with DL_FLAG_AUTOREMOVE_SUPPLIER set, which is
an invalid combination (setting that flag means that the driver core
should manage the link, so it cannot be stateless), and extend the
kerneldoc comment of device_link_add() to cover the
DL_FLAG_AUTOREMOVE_SUPPLIER flag properly too.

Fixes: 1689cac5b32a ("driver core: Add flag to autoremove device link on supplier unbind")
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/base/core.c | 20 ++++++++++++--------
1 file changed, 12 insertions(+), 8 deletions(-)

Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -179,10 +179,14 @@ void device_pm_move_to_tail(struct devic
* of the link. If DL_FLAG_PM_RUNTIME is not set, DL_FLAG_RPM_ACTIVE will be
* ignored.
*
- * If the DL_FLAG_AUTOREMOVE_CONSUMER is set, the link will be removed
- * automatically when the consumer device driver unbinds from it.
- * The combination of both DL_FLAG_AUTOREMOVE_CONSUMER and DL_FLAG_STATELESS
- * set is invalid and will cause NULL to be returned.
+ * If the DL_FLAG_AUTOREMOVE_CONSUMER flag is set, the link will be removed
+ * automatically when the consumer device driver unbinds from it. Analogously,
+ * if DL_FLAG_AUTOREMOVE_SUPPLIER is set in @flags, the link will be removed
+ * automatically when the supplier device driver unbinds from it.
+ *
+ * The combination of DL_FLAG_STATELESS and either DL_FLAG_AUTOREMOVE_CONSUMER
+ * or DL_FLAG_AUTOREMOVE_SUPPLIER set in @flags at the same time is invalid and
+ * will cause NULL to be returned upfront.
*
* A side effect of the link creation is re-ordering of dpm_list and the
* devices_kset list by moving the consumer device and all devices depending
@@ -199,8 +203,8 @@ struct device_link *device_link_add(stru
struct device_link *link;

if (!consumer || !supplier ||
- ((flags & DL_FLAG_STATELESS) &&
- (flags & DL_FLAG_AUTOREMOVE_CONSUMER)))
+ (flags & DL_FLAG_STATELESS &&
+ flags & (DL_FLAG_AUTOREMOVE_CONSUMER | DL_FLAG_AUTOREMOVE_SUPPLIER)))
return NULL;

device_links_write_lock();
@@ -539,11 +543,11 @@ void device_links_no_driver(struct devic
*/
void device_links_driver_cleanup(struct device *dev)
{
- struct device_link *link;
+ struct device_link *link, *ln;

device_links_write_lock();

- list_for_each_entry(link, &dev->links.consumers, s_node) {
+ list_for_each_entry_safe(link, ln, &dev->links.consumers, s_node) {
if (link->flags & DL_FLAG_STATELESS)
continue;



2019-02-01 01:13:15

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH v2 8/9] driver core: Make driver core own stateful device links

From: Rafael J. Wysocki <[email protected]>

Even though stateful device links are managed by the driver core in
principle, their creators are allowed and sometimes even expected
to drop references to them via device_link_del() or
device_link_remove(), but that doesn't really play well with the
"persistent" link concept.

If "persistent" managed device links are created from driver
probe callbacks, device_link_add() called to do that will take a
new reference on the link each time the callback runs and those
references will never be dropped, which kind of isn't nice.

This issues arises because of the link reference counting carried
out by device_link_add() for existing links, but that is only done to
avoid deleting device links that may still be necessary, which
shouldn't be a concern for managed (stateful) links. These device
links are managed by the driver core and whoever creates one of them
will need it at least as long as until the consumer driver is detached
from its device and deleting it may be left to the driver core just
fine.

For this reason, rework device_link_add() to apply the reference
counting to stateless links only and make device_link_del() and
device_link_remove() drop references to stateless links only too.
After this change, if called to add a stateful device link for
a consumer-supplier pair for which a stateful device link is
present already, device_link_add() will return the existing link
without incrementing its reference counter. Accordingly,
device_link_del() and device_link_remove() will WARN() and do
nothing when called to drop a reference to a stateful link. Thus,
effectively, all stateful device links will be owned by the driver
core.

In addition, clean up the handling of the link management flags,
DL_FLAG_AUTOREMOVE_CONSUMER and DL_FLAG_AUTOREMOVE_SUPPLIER, so that
(a) they are never set at the same time and (b) if device_link_add()
is called for a consumer-supplier pair with an existing stateful link
between them, the flags of that link will be combined with the flags
passed to device_link_add() to ensure that the life time of the link
is sufficient for all of the callers of device_link_add() for the
same consumer-supplier pair.

Update the device_link_add() kerneldoc comment to reflect the
above changes.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
Documentation/driver-api/device_link.rst | 42 +++++++++++-------
drivers/base/core.c | 69 ++++++++++++++++++++++++-------
2 files changed, 79 insertions(+), 32 deletions(-)

Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -192,10 +192,21 @@ static void device_link_rpm_prepare(stru
* of the link. If DL_FLAG_PM_RUNTIME is not set, DL_FLAG_RPM_ACTIVE will be
* ignored.
*
- * If the DL_FLAG_AUTOREMOVE_CONSUMER flag is set, the link will be removed
- * automatically when the consumer device driver unbinds from it. Analogously,
- * if DL_FLAG_AUTOREMOVE_SUPPLIER is set in @flags, the link will be removed
- * automatically when the supplier device driver unbinds from it.
+ * If DL_FLAG_STATELESS is set in @flags, the link is not going to be managed by
+ * the driver core and, in particular, the caller of this function is expected
+ * to drop the reference to the link acquired by it directly.
+ *
+ * If that flag is not set, however, the caller of this function is handing the
+ * management of the link over to the driver core entirely and its return value
+ * can only be used to check whether or not the link is present. In that case,
+ * the DL_FLAG_AUTOREMOVE_CONSUMER and DL_FLAG_AUTOREMOVE_SUPPLIER device link
+ * flags can be used to indicate to the driver core when the link can be safely
+ * deleted. Namely, setting one of them in @flags indicates to the driver core
+ * that the link is not going to be used (by the given caller of this function)
+ * after unbinding the consumer or supplier driver, respectively, from its
+ * device, so the link can be deleted at that point. If none of them is set,
+ * the link will be maintained until one of the devices pointed to by it (either
+ * the consumer or the supplier) is unregistered.
*
* The combination of DL_FLAG_STATELESS and either DL_FLAG_AUTOREMOVE_CONSUMER
* or DL_FLAG_AUTOREMOVE_SUPPLIER set in @flags at the same time is invalid and
@@ -241,6 +252,14 @@ struct device_link *device_link_add(stru
goto out;
}

+ /*
+ * DL_FLAG_AUTOREMOVE_SUPPLIER indicates that the link will be needed
+ * longer than for DL_FLAG_AUTOREMOVE_CONSUMER and setting them both
+ * together doesn't make sense, so prefer DL_FLAG_AUTOREMOVE_SUPPLIER.
+ */
+ if (flags & DL_FLAG_AUTOREMOVE_SUPPLIER)
+ flags &= ~DL_FLAG_AUTOREMOVE_CONSUMER;
+
list_for_each_entry(link, &supplier->links.consumers, s_node) {
if (link->consumer != consumer)
continue;
@@ -254,12 +273,6 @@ struct device_link *device_link_add(stru
goto out;
}

- if (flags & DL_FLAG_AUTOREMOVE_CONSUMER)
- link->flags |= DL_FLAG_AUTOREMOVE_CONSUMER;
-
- if (flags & DL_FLAG_AUTOREMOVE_SUPPLIER)
- link->flags |= DL_FLAG_AUTOREMOVE_SUPPLIER;
-
if (flags & DL_FLAG_PM_RUNTIME) {
if (!(link->flags & DL_FLAG_PM_RUNTIME)) {
device_link_rpm_prepare(consumer, supplier);
@@ -269,7 +282,25 @@ struct device_link *device_link_add(stru
refcount_inc(&link->rpm_active);
}

- kref_get(&link->kref);
+ if (flags & DL_FLAG_STATELESS) {
+ kref_get(&link->kref);
+ goto out;
+ }
+
+ /*
+ * If the life time of the link following from the new flags is
+ * longer than indicated by the flags of the existing link,
+ * update the existing link to stay around longer.
+ */
+ if (flags & DL_FLAG_AUTOREMOVE_SUPPLIER) {
+ if (link->flags & DL_FLAG_AUTOREMOVE_CONSUMER) {
+ link->flags &= ~DL_FLAG_AUTOREMOVE_CONSUMER;
+ link->flags |= DL_FLAG_AUTOREMOVE_SUPPLIER;
+ }
+ } else if (!(flags & DL_FLAG_AUTOREMOVE_CONSUMER)) {
+ link->flags &= ~(DL_FLAG_AUTOREMOVE_CONSUMER |
+ DL_FLAG_AUTOREMOVE_SUPPLIER);
+ }
goto out;
}

@@ -419,8 +450,16 @@ static void __device_link_del(struct kre
}
#endif /* !CONFIG_SRCU */

+static void device_link_put_kref(struct device_link *link)
+{
+ if (link->flags & DL_FLAG_STATELESS)
+ kref_put(&link->kref, __device_link_del);
+ else
+ WARN(1, "Unable to drop a managed device link reference\n");
+}
+
/**
- * device_link_del - Delete a link between two devices.
+ * device_link_del - Delete a stateless link between two devices.
* @link: Device link to delete.
*
* The caller must ensure proper synchronization of this function with runtime
@@ -432,14 +471,14 @@ void device_link_del(struct device_link
{
device_links_write_lock();
device_pm_lock();
- kref_put(&link->kref, __device_link_del);
+ device_link_put_kref(link);
device_pm_unlock();
device_links_write_unlock();
}
EXPORT_SYMBOL_GPL(device_link_del);

/**
- * device_link_remove - remove a link between two devices.
+ * device_link_remove - Delete a stateless link between two devices.
* @consumer: Consumer end of the link.
* @supplier: Supplier end of the link.
*
@@ -458,7 +497,7 @@ void device_link_remove(void *consumer,

list_for_each_entry(link, &supplier->links.consumers, s_node) {
if (link->consumer == consumer) {
- kref_put(&link->kref, __device_link_del);
+ device_link_put_kref(link);
break;
}
}
Index: linux-pm/Documentation/driver-api/device_link.rst
===================================================================
--- linux-pm.orig/Documentation/driver-api/device_link.rst
+++ linux-pm/Documentation/driver-api/device_link.rst
@@ -25,8 +25,8 @@ suspend/resume and shutdown ordering.

Device links allow representation of such dependencies in the driver core.

-In its standard form, a device link combines *both* dependency types:
-It guarantees correct suspend/resume and shutdown ordering between a
+In its standard or *managed* form, a device link combines *both* dependency
+types: It guarantees correct suspend/resume and shutdown ordering between a
"supplier" device and its "consumer" devices, and it guarantees driver
presence on the supplier. The consumer devices are not probed before the
supplier is bound to a driver, and they're unbound before the supplier
@@ -69,12 +69,14 @@ know that the supplier is functional alr
the case, for instance, if the consumer has just acquired some resources that
would not have been available had the supplier not been functional then).]

-If a device link is added in the ``->probe`` callback of the supplier or
-consumer driver, it is typically deleted in its ``->remove`` callback for
-symmetry. That way, if the driver is compiled as a module, the device
-link is added on module load and orderly deleted on unload. The same
-restrictions that apply to device link addition (e.g. exclusion of a
-parallel suspend/resume transition) apply equally to deletion.
+If a device link with ``DL_FLAG_STATELESS`` set (i.e. a stateless device link)
+is added in the ``->probe`` callback of the supplier or consumer driver, it is
+typically deleted in its ``->remove`` callback for symmetry. That way, if the
+driver is compiled as a module, the device link is added on module load and
+orderly deleted on unload. The same restrictions that apply to device link
+addition (e.g. exclusion of a parallel suspend/resume transition) apply equally
+to deletion. Device links with ``DL_FLAG_STATELESS`` unset (i.e. managed
+device links) are deleted automatically by the driver core.

Several flags may be specified on device link addition, two of which
have already been mentioned above: ``DL_FLAG_STATELESS`` to express that no
@@ -87,8 +89,6 @@ link is added from the consumer's ``->pr
can be specified to runtime resume the supplier upon addition of the
device link. ``DL_FLAG_AUTOREMOVE_CONSUMER`` causes the device link to be
automatically purged when the consumer fails to probe or later unbinds.
-This obviates the need to explicitly delete the link in the ``->remove``
-callback or in the error path of the ``->probe`` callback.

Similarly, when the device link is added from supplier's ``->probe`` callback,
``DL_FLAG_AUTOREMOVE_SUPPLIER`` causes the device link to be automatically
@@ -97,12 +97,20 @@ purged when the supplier fails to probe
Limitations
===========

-Driver authors should be aware that a driver presence dependency (i.e. when
-``DL_FLAG_STATELESS`` is not specified on link addition) may cause probing of
-the consumer to be deferred indefinitely. This can become a problem if the
-consumer is required to probe before a certain initcall level is reached.
-Worse, if the supplier driver is blacklisted or missing, the consumer will
-never be probed.
+Driver authors should be aware that a driver presence dependency for managed
+device links (i.e. when ``DL_FLAG_STATELESS`` is not specified on link addition)
+may cause probing of the consumer to be deferred indefinitely. This can become
+a problem if the consumer is required to probe before a certain initcall level
+is reached. Worse, if the supplier driver is blacklisted or missing, the
+consumer will never be probed.
+
+Moreover, managed device links cannot be deleted directly. They are deleted
+by the driver core when they are not necessary any more in accordance with the
+``DL_FLAG_AUTOREMOVE_CONSUMER`` and ``DL_FLAG_AUTOREMOVE_SUPPLIER`` flags.
+However, stateless device links (i.e. device links with ``DL_FLAG_STATELESS``
+set) are expected to be removed by whoever called :c:func:`device_link_add()`
+to add them with the help of either :c:func:`device_link_del()` or
+:c:func:`device_link_remove()`.

Sometimes drivers depend on optional resources. They are able to operate
in a degraded mode (reduced feature set or performance) when those resources
@@ -286,4 +294,4 @@ API
===

.. kernel-doc:: drivers/base/core.c
- :functions: device_link_add device_link_del
+ :functions: device_link_add device_link_del device_link_remove


2019-02-01 01:13:23

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH v2 3/9] driver core: Do not resume suppliers under device_links_write_lock()

From: Rafael J. Wysocki <[email protected]>

It is incorrect to call pm_runtime_get_sync() under
device_links_write_lock(), because it may end up trying to take
device_links_read_lock() while resuming the target device and that
will deadlock in the non-SRCU case, so avoid that by resuming the
supplier device in device_link_add() before calling
device_links_write_lock().

Fixes: 21d5c57b3726 ("PM / runtime: Use device links")
Fixes: baa8809f6097 ("PM / runtime: Optimize the use of device links")
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/base/core.c | 20 ++++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)

Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -201,12 +201,21 @@ struct device_link *device_link_add(stru
struct device *supplier, u32 flags)
{
struct device_link *link;
+ bool rpm_put_supplier = false;

if (!consumer || !supplier ||
(flags & DL_FLAG_STATELESS &&
flags & (DL_FLAG_AUTOREMOVE_CONSUMER | DL_FLAG_AUTOREMOVE_SUPPLIER)))
return NULL;

+ if (flags & DL_FLAG_PM_RUNTIME && flags & DL_FLAG_RPM_ACTIVE) {
+ if (pm_runtime_get_sync(supplier) < 0) {
+ pm_runtime_put_noidle(supplier);
+ return NULL;
+ }
+ rpm_put_supplier = true;
+ }
+
device_links_write_lock();
device_pm_lock();

@@ -250,13 +259,8 @@ struct device_link *device_link_add(stru

if (flags & DL_FLAG_PM_RUNTIME) {
if (flags & DL_FLAG_RPM_ACTIVE) {
- if (pm_runtime_get_sync(supplier) < 0) {
- pm_runtime_put_noidle(supplier);
- kfree(link);
- link = NULL;
- goto out;
- }
link->rpm_active = true;
+ rpm_put_supplier = false;
}
pm_runtime_new_link(consumer);
/*
@@ -328,6 +332,10 @@ struct device_link *device_link_add(stru
out:
device_pm_unlock();
device_links_write_unlock();
+
+ if (rpm_put_supplier)
+ pm_runtime_put(supplier);
+
return link;
}
EXPORT_SYMBOL_GPL(device_link_add);


2019-02-01 01:13:32

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH v2 5/9] driver core: Fix adding device links to probing suppliers

From: Rafael J. Wysocki <[email protected]>

Currently, it is not valid to add a device link from a consumer
driver ->probe callback to a supplier that is still probing too, but
generally this is a valid use case. For example, if the consumer has
just acquired a resource that can only be available if the supplier
is functional, adding a device link to that supplier right away
should be safe (and even desirable arguably), but device_link_add()
doesn't handle that case correctly and the initial state of the link
created by it is wrong then.

To address this problem, change the initial state of device links
added between a probing supplier and a probing consumer to
DL_STATE_CONSUMER_PROBE and update device_links_driver_bound() to
skip such links on the supplier side.

With this change, if the supplier probe completes first,
device_links_driver_bound() called for it will skip the link state
update and when it is called for the consumer, the link state will
be updated to "active". In turn, if the consumer probe completes
first, device_links_driver_bound() called for it will change the
state of the link to "active" and when it is called for the
supplier, the link status update will be skipped.

However, in principle the supplier or consumer probe may still fail
after the link has been added, so modify device_links_no_driver() to
change device links in the "active" or "consumer probe" state to
"dormant" on the supplier side and update __device_links_no_driver()
to change the link state to "available" only if it is "consumer
probe" or "active".

Then, if the supplier probe fails first, the leftover link to the
probing consumer will become "dormant" and device_links_no_driver()
called for the consumer (when its probe fails) will clean it up.
In turn, if the consumer probe fails first, it will either drop the
link, or change its state to "available" and, in the latter case,
when device_links_no_driver() is called for the supplier, it will
update the link state to "dormant". [If the supplier probe fails,
but the consumer probe succeeds, which should not happen as long as
the consumer driver is correct, the link still will be around, but
it will be "dormant" until the supplier is probed again.]

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
Documentation/driver-api/device_link.rst | 10 ++--
drivers/base/core.c | 74 +++++++++++++++++++++++++++----
2 files changed, 73 insertions(+), 11 deletions(-)

Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -300,17 +300,26 @@ struct device_link *device_link_add(stru
link->status = DL_STATE_NONE;
} else {
switch (supplier->links.status) {
- case DL_DEV_DRIVER_BOUND:
+ case DL_DEV_PROBING:
switch (consumer->links.status) {
case DL_DEV_PROBING:
/*
- * Some callers expect the link creation during
- * consumer driver probe to resume the supplier
- * even without DL_FLAG_RPM_ACTIVE.
+ * A consumer driver can create a link to a
+ * supplier that has not completed its probing
+ * yet as long as it knows that the supplier is
+ * already functional (for example, it has just
+ * acquired some resources from the supplier).
*/
- if (flags & DL_FLAG_PM_RUNTIME)
- pm_runtime_resume(supplier);
-
+ link->status = DL_STATE_CONSUMER_PROBE;
+ break;
+ default:
+ link->status = DL_STATE_DORMANT;
+ break;
+ }
+ break;
+ case DL_DEV_DRIVER_BOUND:
+ switch (consumer->links.status) {
+ case DL_DEV_PROBING:
link->status = DL_STATE_CONSUMER_PROBE;
break;
case DL_DEV_DRIVER_BOUND:
@@ -331,6 +340,14 @@ struct device_link *device_link_add(stru
}

/*
+ * Some callers expect the link creation during consumer driver probe to
+ * resume the supplier even without DL_FLAG_RPM_ACTIVE.
+ */
+ if (link->status == DL_STATE_CONSUMER_PROBE &&
+ flags & DL_FLAG_PM_RUNTIME)
+ pm_runtime_resume(supplier);
+
+ /*
* Move the consumer and all of the devices depending on it to the end
* of dpm_list and the devices_kset list.
*
@@ -518,6 +535,16 @@ void device_links_driver_bound(struct de
if (link->flags & DL_FLAG_STATELESS)
continue;

+ /*
+ * Links created during consumer probe may be in the "consumer
+ * probe" state to start with if the supplier is still probing
+ * when they are created and they may become "active" if the
+ * consumer probe returns first. Skip them here.
+ */
+ if (link->status == DL_STATE_CONSUMER_PROBE ||
+ link->status == DL_STATE_ACTIVE)
+ continue;
+
WARN_ON(link->status != DL_STATE_DORMANT);
WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
}
@@ -557,17 +584,48 @@ static void __device_links_no_driver(str

if (link->flags & DL_FLAG_AUTOREMOVE_CONSUMER)
__device_link_del(&link->kref);
- else if (link->status != DL_STATE_SUPPLIER_UNBIND)
+ else if (link->status == DL_STATE_CONSUMER_PROBE ||
+ link->status == DL_STATE_ACTIVE)
WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
}

dev->links.status = DL_DEV_NO_DRIVER;
}

+/**
+ * device_links_no_driver - Update links after failing driver probe.
+ * @dev: Device whose driver has just failed to probe.
+ *
+ * Clean up leftover links to consumers for @dev and invoke
+ * %__device_links_no_driver() to update links to suppliers for it as
+ * appropriate.
+ *
+ * Links with the DL_FLAG_STATELESS flag set are ignored.
+ */
void device_links_no_driver(struct device *dev)
{
+ struct device_link *link;
+
device_links_write_lock();
+
+ list_for_each_entry(link, &dev->links.consumers, s_node) {
+ if (link->flags & DL_FLAG_STATELESS)
+ continue;
+
+ /*
+ * The probe has failed, so if the status of the link is
+ * "consumer probe" or "active", it must have been added by
+ * a probing consumer while this device was still probing.
+ * Change its state to "dormant", as it represents a valid
+ * relationship, but it is not functionally meaningful.
+ */
+ if (link->status == DL_STATE_CONSUMER_PROBE ||
+ link->status == DL_STATE_ACTIVE)
+ WRITE_ONCE(link->status, DL_STATE_DORMANT);
+ }
+
__device_links_no_driver(dev);
+
device_links_write_unlock();
}

Index: linux-pm/Documentation/driver-api/device_link.rst
===================================================================
--- linux-pm.orig/Documentation/driver-api/device_link.rst
+++ linux-pm/Documentation/driver-api/device_link.rst
@@ -59,11 +59,15 @@ device ``->probe`` callback or a boot-ti

Another example for an inconsistent state would be a device link that
represents a driver presence dependency, yet is added from the consumer's
-``->probe`` callback while the supplier hasn't probed yet: Had the driver
-core known about the device link earlier, it wouldn't have probed the
+``->probe`` callback while the supplier hasn't started to probe yet: Had the
+driver core known about the device link earlier, it wouldn't have probed the
consumer in the first place. The onus is thus on the consumer to check
presence of the supplier after adding the link, and defer probing on
-non-presence.
+non-presence. [Note that it is valid to create a link from the consumer's
+``->probe`` callback while the supplier is still probing, but the consumer must
+know that the supplier is functional already at the link creation time (that is
+the case, for instance, if the consumer has just acquired some resources that
+would not have been available had the supplier not been functional then).]

If a device link is added in the ``->probe`` callback of the supplier or
consumer driver, it is typically deleted in its ``->remove`` callback for


2019-02-01 01:14:12

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH v2 4/9] driver core: Fix handling of runtime PM flags in device_link_add()

From: Rafael J. Wysocki <[email protected]>

After commit ead18c23c263 ("driver core: Introduce device links
reference counting"), if there is a link between the given supplier
and the given consumer already, device_link_add() will refcount it
and return it unconditionally without updating its flags. It is
possible, however, that the second (or any subsequent) caller of
device_link_add() for the same consumer-supplier pair will pass
DL_FLAG_PM_RUNTIME, possibly along with DL_FLAG_RPM_ACTIVE, in flags
to it and the existing link may not behave as expected then.

First, if DL_FLAG_PM_RUNTIME is not set in the existing link's flags
at all, it needs to be set like during the original initialization of
the link.

Second, if DL_FLAG_RPM_ACTIVE is passed to device_link_add() in flags
(in addition to DL_FLAG_PM_RUNTIME), the existing link should to be
updated to reflect the "active" runtime PM configuration of the
consumer-supplier pair and extra care must be taken here to avoid
possible destructive races with runtime PM of the consumer.

To that end, redefine the rpm_active field in struct device_link
as a refcount, initialize it to 1 and make rpm_resume() (for the
consumer) and device_link_add() increment it whenever they acquire
a runtime PM reference on the supplier device. Accordingly, make
rpm_suspend() (for the consumer) and pm_runtime_clean_up_links()
decrement it and drop runtime PM references to the supplier
device in a loop until rpm_active becones 1 again.

Fixes: ead18c23c263 ("driver core: Introduce device links reference counting")
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/base/core.c | 45 ++++++++++++++++++++++++++++---------------
drivers/base/power/runtime.c | 26 ++++++++++--------------
include/linux/device.h | 2 -
3 files changed, 42 insertions(+), 31 deletions(-)

Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -165,6 +165,19 @@ void device_pm_move_to_tail(struct devic
device_links_read_unlock(idx);
}

+static void device_link_rpm_prepare(struct device *consumer,
+ struct device *supplier)
+{
+ pm_runtime_new_link(consumer);
+ /*
+ * If the link is being added by the consumer driver at probe time,
+ * balance the decrementation of the supplier's runtime PM usage counter
+ * after consumer probe in driver_probe_device().
+ */
+ if (consumer->links.status == DL_DEV_PROBING)
+ pm_runtime_get_noresume(supplier);
+}
+
/**
* device_link_add - Create a link between two devices.
* @consumer: Consumer end of the link.
@@ -201,7 +214,6 @@ struct device_link *device_link_add(stru
struct device *supplier, u32 flags)
{
struct device_link *link;
- bool rpm_put_supplier = false;

if (!consumer || !supplier ||
(flags & DL_FLAG_STATELESS &&
@@ -213,7 +225,6 @@ struct device_link *device_link_add(stru
pm_runtime_put_noidle(supplier);
return NULL;
}
- rpm_put_supplier = true;
}

device_links_write_lock();
@@ -249,6 +260,15 @@ struct device_link *device_link_add(stru
if (flags & DL_FLAG_AUTOREMOVE_SUPPLIER)
link->flags |= DL_FLAG_AUTOREMOVE_SUPPLIER;

+ if (flags & DL_FLAG_PM_RUNTIME) {
+ if (!(link->flags & DL_FLAG_PM_RUNTIME)) {
+ device_link_rpm_prepare(consumer, supplier);
+ link->flags |= DL_FLAG_PM_RUNTIME;
+ }
+ if (flags & DL_FLAG_RPM_ACTIVE)
+ refcount_inc(&link->rpm_active);
+ }
+
kref_get(&link->kref);
goto out;
}
@@ -257,20 +277,15 @@ struct device_link *device_link_add(stru
if (!link)
goto out;

+ refcount_set(&link->rpm_active, 1);
+
if (flags & DL_FLAG_PM_RUNTIME) {
- if (flags & DL_FLAG_RPM_ACTIVE) {
- link->rpm_active = true;
- rpm_put_supplier = false;
- }
- pm_runtime_new_link(consumer);
- /*
- * If the link is being added by the consumer driver at probe
- * time, balance the decrementation of the supplier's runtime PM
- * usage counter after consumer probe in driver_probe_device().
- */
- if (consumer->links.status == DL_DEV_PROBING)
- pm_runtime_get_noresume(supplier);
+ if (flags & DL_FLAG_RPM_ACTIVE)
+ refcount_inc(&link->rpm_active);
+
+ device_link_rpm_prepare(consumer, supplier);
}
+
get_device(supplier);
link->supplier = supplier;
INIT_LIST_HEAD(&link->s_node);
@@ -333,7 +348,7 @@ struct device_link *device_link_add(stru
device_pm_unlock();
device_links_write_unlock();

- if (rpm_put_supplier)
+ if ((flags & DL_FLAG_PM_RUNTIME && flags & DL_FLAG_RPM_ACTIVE) && !link)
pm_runtime_put(supplier);

return link;
Index: linux-pm/include/linux/device.h
===================================================================
--- linux-pm.orig/include/linux/device.h
+++ linux-pm/include/linux/device.h
@@ -853,7 +853,7 @@ struct device_link {
struct list_head c_node;
enum device_link_state status;
u32 flags;
- bool rpm_active;
+ refcount_t rpm_active;
struct kref kref;
#ifdef CONFIG_SRCU
struct rcu_head rcu_head;
Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -259,11 +259,8 @@ static int rpm_get_suppliers(struct devi
list_for_each_entry_rcu(link, &dev->links.suppliers, c_node) {
int retval;

- if (!(link->flags & DL_FLAG_PM_RUNTIME))
- continue;
-
- if (READ_ONCE(link->status) == DL_STATE_SUPPLIER_UNBIND ||
- link->rpm_active)
+ if (!(link->flags & DL_FLAG_PM_RUNTIME) ||
+ READ_ONCE(link->status) == DL_STATE_SUPPLIER_UNBIND)
continue;

retval = pm_runtime_get_sync(link->supplier);
@@ -272,7 +269,7 @@ static int rpm_get_suppliers(struct devi
pm_runtime_put_noidle(link->supplier);
return retval;
}
- link->rpm_active = true;
+ refcount_inc(&link->rpm_active);
}
return 0;
}
@@ -281,12 +278,13 @@ static void rpm_put_suppliers(struct dev
{
struct device_link *link;

- list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
- if (link->rpm_active &&
- READ_ONCE(link->status) != DL_STATE_SUPPLIER_UNBIND) {
+ list_for_each_entry_rcu(link, &dev->links.suppliers, c_node) {
+ if (READ_ONCE(link->status) == DL_STATE_SUPPLIER_UNBIND)
+ continue;
+
+ while (refcount_dec_not_one(&link->rpm_active))
pm_runtime_put(link->supplier);
- link->rpm_active = false;
- }
+ }
}

/**
@@ -1539,7 +1537,7 @@ void pm_runtime_remove(struct device *de
*
* Check links from this device to any consumers and if any of them have active
* runtime PM references to the device, drop the usage counter of the device
- * (once per link).
+ * (as many times as needed).
*
* Links with the DL_FLAG_STATELESS flag set are ignored.
*
@@ -1561,10 +1559,8 @@ void pm_runtime_clean_up_links(struct de
if (link->flags & DL_FLAG_STATELESS)
continue;

- if (link->rpm_active) {
+ while (refcount_dec_not_one(&link->rpm_active))
pm_runtime_put_noidle(dev);
- link->rpm_active = false;
- }
}

device_links_read_unlock(idx);


2019-02-01 01:14:52

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH v2 6/9] driver core: Do not call rpm_put_suppliers() in pm_runtime_drop_link()

From: Rafael J. Wysocki <[email protected]>

Calling rpm_put_suppliers() from pm_runtime_drop_link() is excessive
as it affects all suppliers of the consumer device and not just the
one pointed to by the device link being dropped. Worst case it may
cause the consumer device to stop working unexpectedly. Moreover, in
principle it is racy with respect to runtime PM of the consumer
device.

To avoid these problems drop runtime PM references on the particular
supplier pointed to by the link in question only and do that after
the link has been dropped from the consumer device's list of links to
suppliers, which is in device_link_free().

Fixes: a0504aecba76 ("PM / runtime: Drop usage count for suppliers at device link removal")
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/base/core.c | 3 +++
drivers/base/power/runtime.c | 2 --
2 files changed, 3 insertions(+), 2 deletions(-)

Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -374,6 +374,9 @@ EXPORT_SYMBOL_GPL(device_link_add);

static void device_link_free(struct device_link *link)
{
+ while (refcount_dec_not_one(&link->rpm_active))
+ pm_runtime_put(link->supplier);
+
put_device(link->consumer);
put_device(link->supplier);
kfree(link);
Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -1611,8 +1611,6 @@ void pm_runtime_new_link(struct device *

void pm_runtime_drop_link(struct device *dev)
{
- rpm_put_suppliers(dev);
-
spin_lock_irq(&dev->power.lock);
WARN_ON(dev->power.links_count == 0);
dev->power.links_count--;


2019-02-01 01:15:03

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH v2 9/9] driver core: Add device link flag DL_FLAG_AUTOPROBE_CONSUMER

From: Rafael J. Wysocki <[email protected]>

Add a new device link flag, DL_FLAG_AUTOPROBE_CONSUMER, to request the
driver core to probe for a consumer driver automatically after binding
a driver to the supplier device on a persistent managed device link.

As unbinding the supplier driver on a managed device link causes the
consumer driver to be detached from its device automatically, this
flag provides a complementary mechanism which is needed to address
some "composite device" use cases.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
Documentation/driver-api/device_link.rst | 9 +++++++++
drivers/base/core.c | 16 +++++++++++++++-
drivers/base/dd.c | 2 +-
include/linux/device.h | 3 +++
4 files changed, 28 insertions(+), 2 deletions(-)

Index: linux-pm/include/linux/device.h
===================================================================
--- linux-pm.orig/include/linux/device.h
+++ linux-pm/include/linux/device.h
@@ -341,6 +341,7 @@ struct device *driver_find_device(struct
struct device *start, void *data,
int (*match)(struct device *dev, void *data));

+void driver_deferred_probe_add(struct device *dev);
int driver_deferred_probe_check_state(struct device *dev);

/**
@@ -827,12 +828,14 @@ enum device_link_state {
* PM_RUNTIME: If set, the runtime PM framework will use this link.
* RPM_ACTIVE: Run pm_runtime_get_sync() on the supplier during link creation.
* AUTOREMOVE_SUPPLIER: Remove the link automatically on supplier driver unbind.
+ * AUTOPROBE_CONSUMER: Probe consumer driver automatically after supplier binds.
*/
#define DL_FLAG_STATELESS BIT(0)
#define DL_FLAG_AUTOREMOVE_CONSUMER BIT(1)
#define DL_FLAG_PM_RUNTIME BIT(2)
#define DL_FLAG_RPM_ACTIVE BIT(3)
#define DL_FLAG_AUTOREMOVE_SUPPLIER BIT(4)
+#define DL_FLAG_AUTOPROBE_CONSUMER BIT(5)

/**
* struct device_link - Device link representation.
Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -208,6 +208,12 @@ static void device_link_rpm_prepare(stru
* the link will be maintained until one of the devices pointed to by it (either
* the consumer or the supplier) is unregistered.
*
+ * Also, if DL_FLAG_STATELESS, DL_FLAG_AUTOREMOVE_CONSUMER and
+ * DL_FLAG_AUTOREMOVE_SUPPLIER are not set in @flags (that is, a persistent
+ * managed device link is being added), the DL_FLAG_AUTOPROBE_CONSUMER flag can
+ * be used to request the driver core to automaticall probe for a consmer
+ * driver after successfully binding a driver to the supplier device.
+ *
* The combination of DL_FLAG_STATELESS and either DL_FLAG_AUTOREMOVE_CONSUMER
* or DL_FLAG_AUTOREMOVE_SUPPLIER set in @flags at the same time is invalid and
* will cause NULL to be returned upfront.
@@ -228,7 +234,12 @@ struct device_link *device_link_add(stru

if (!consumer || !supplier ||
(flags & DL_FLAG_STATELESS &&
- flags & (DL_FLAG_AUTOREMOVE_CONSUMER | DL_FLAG_AUTOREMOVE_SUPPLIER)))
+ flags & (DL_FLAG_AUTOREMOVE_CONSUMER |
+ DL_FLAG_AUTOREMOVE_SUPPLIER |
+ DL_FLAG_AUTOPROBE_CONSUMER)) ||
+ (flags & DL_FLAG_AUTOPROBE_CONSUMER &&
+ flags & (DL_FLAG_AUTOREMOVE_CONSUMER |
+ DL_FLAG_AUTOREMOVE_SUPPLIER)))
return NULL;

if (flags & DL_FLAG_PM_RUNTIME && flags & DL_FLAG_RPM_ACTIVE) {
@@ -589,6 +600,9 @@ void device_links_driver_bound(struct de

WARN_ON(link->status != DL_STATE_DORMANT);
WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
+
+ if (link->flags & DL_FLAG_AUTOPROBE_CONSUMER)
+ driver_deferred_probe_add(link->consumer);
}

list_for_each_entry(link, &dev->links.suppliers, c_node) {
Index: linux-pm/drivers/base/dd.c
===================================================================
--- linux-pm.orig/drivers/base/dd.c
+++ linux-pm/drivers/base/dd.c
@@ -116,7 +116,7 @@ static void deferred_probe_work_func(str
}
static DECLARE_WORK(deferred_probe_work, deferred_probe_work_func);

-static void driver_deferred_probe_add(struct device *dev)
+void driver_deferred_probe_add(struct device *dev)
{
mutex_lock(&deferred_probe_mutex);
if (list_empty(&dev->p->deferred_probe)) {
Index: linux-pm/Documentation/driver-api/device_link.rst
===================================================================
--- linux-pm.orig/Documentation/driver-api/device_link.rst
+++ linux-pm/Documentation/driver-api/device_link.rst
@@ -94,6 +94,15 @@ Similarly, when the device link is added
``DL_FLAG_AUTOREMOVE_SUPPLIER`` causes the device link to be automatically
purged when the supplier fails to probe or later unbinds.

+If neither ``DL_FLAG_AUTOREMOVE_CONSUMER`` nor ``DL_FLAG_AUTOREMOVE_SUPPLIER``
+is set, ``DL_FLAG_AUTOPROBE_CONSUMER`` can be used to request the driver core
+to probe for a driver for the consumer driver on the link automatically after
+a driver has been bound to the supplier device.
+
+Note, however, that any combinations of ``DL_FLAG_AUTOREMOVE_CONSUMER``,
+``DL_FLAG_AUTOREMOVE_SUPPLIER`` or ``DL_FLAG_AUTOPROBE_CONSUMER`` with
+``DL_FLAG_STATELESS`` are invalid and cannot be used.
+
Limitations
===========



2019-02-01 09:07:04

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] driver core: Fix some device links issues and add "consumer autoprobe" flag

On Fri, Feb 01, 2019 at 01:44:59AM +0100, Rafael J. Wysocki wrote:
> Hi Greg at al,
>
> This is a combination of the two device links series I have posted
> recently (https://lore.kernel.org/lkml/[email protected]/
> and https://lore.kernel.org/lkml/[email protected]/) rebased
> on top of your driver-core-next branch.
>
> Recently I have been looking at the device links code because of the
> recent discussion on possibly using them in the DRM subsystem (see for
> example https://marc.info/?l=linux-pm&m=154832771905309&w=2) and I have
> found a few issues in that code which should be addressed by this patch
> series. Please refer to the patch changelogs for details.
>
> None of the problems addressed here should be manifesting themselves in
> mainline kernel today, but if there are more device links users in the
> future, they most likely will be encountered sooner or later. Also they
> need to be fixed for the DRM use case to be supported IMO.
>
> On top of this the series makes device links support the "composite device"
> use case in the DRM subsystem mentioned above (essentially, the last patch
> in the series is for that purpose).

All now queued up, thanks!

greg k-h

2019-02-01 09:52:00

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] driver core: Fix some device links issues and add "consumer autoprobe" flag

On Friday, February 1, 2019 10:05:36 AM CET Greg Kroah-Hartman wrote:
> On Fri, Feb 01, 2019 at 01:44:59AM +0100, Rafael J. Wysocki wrote:
> > Hi Greg at al,
> >
> > This is a combination of the two device links series I have posted
> > recently (https://lore.kernel.org/lkml/[email protected]/
> > and https://lore.kernel.org/lkml/[email protected]/) rebased
> > on top of your driver-core-next branch.
> >
> > Recently I have been looking at the device links code because of the
> > recent discussion on possibly using them in the DRM subsystem (see for
> > example https://marc.info/?l=linux-pm&m=154832771905309&w=2) and I have
> > found a few issues in that code which should be addressed by this patch
> > series. Please refer to the patch changelogs for details.
> >
> > None of the problems addressed here should be manifesting themselves in
> > mainline kernel today, but if there are more device links users in the
> > future, they most likely will be encountered sooner or later. Also they
> > need to be fixed for the DRM use case to be supported IMO.
> >
> > On top of this the series makes device links support the "composite device"
> > use case in the DRM subsystem mentioned above (essentially, the last patch
> > in the series is for that purpose).
>
> All now queued up, thanks!

Thank you!


2019-02-01 15:19:42

by Ulf Hansson

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] driver core: Fix some device links issues and add "consumer autoprobe" flag

On Fri, 1 Feb 2019 at 02:04, Rafael J. Wysocki <[email protected]> wrote:
>
> Hi Greg at al,
>
> This is a combination of the two device links series I have posted
> recently (https://lore.kernel.org/lkml/[email protected]/
> and https://lore.kernel.org/lkml/[email protected]/) rebased
> on top of your driver-core-next branch.
>
> Recently I have been looking at the device links code because of the
> recent discussion on possibly using them in the DRM subsystem (see for
> example https://marc.info/?l=linux-pm&m=154832771905309&w=2) and I have
> found a few issues in that code which should be addressed by this patch
> series. Please refer to the patch changelogs for details.
>
> None of the problems addressed here should be manifesting themselves in
> mainline kernel today, but if there are more device links users in the
> future, they most likely will be encountered sooner or later. Also they
> need to be fixed for the DRM use case to be supported IMO.
>
> On top of this the series makes device links support the "composite device"
> use case in the DRM subsystem mentioned above (essentially, the last patch
> in the series is for that purpose).
>

Rafael, Greg, I have reviewed patch 1 -> 7, they all look good to me.

If not too late, feel free to add for the first 7 patches:

Reviewed-by: Ulf Hansson <[email protected]>

Although, I want to point out one problem that I have found when using
device links. I believe it's already there, even before this series,
but just wanted to described it for your consideration.

This is what happens:
I have a platform driver is being probed. During ->probe() the driver
adds a device link like this:

link = device_link_add(consumer-dev, supplier-dev, DL_FLAG_STATELESS |
DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE);

At some point later in ->probe(), the driver realizes that it must
remove the device link, either because it encountered an error or
simply because it doesn't need the device link to be there anymore.
Thus it calls:

device_link_del(link);

When probe finished of the driver, the runtime PM usage count for the
supplier-dev remains increased to 1 and thus it never becomes runtime
suspended.

I haven't thought more in detail of how to address this problem, just
wanted to describe it for you, briefly. The test I have done is by
using a my local RPM testdriver and my local PM domain testdriver, so
no real bug has been reported, at least as far as I know.

Kind regards
Uffe

2019-02-04 14:02:01

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] driver core: Fix some device links issues and add "consumer autoprobe" flag

On Fri, Feb 1, 2019 at 4:18 PM Ulf Hansson <[email protected]> wrote:
>
> On Fri, 1 Feb 2019 at 02:04, Rafael J. Wysocki <[email protected]> wrote:
> >
> > Hi Greg at al,
> >
> > This is a combination of the two device links series I have posted
> > recently (https://lore.kernel.org/lkml/[email protected]/
> > and https://lore.kernel.org/lkml/[email protected]/) rebased
> > on top of your driver-core-next branch.
> >
> > Recently I have been looking at the device links code because of the
> > recent discussion on possibly using them in the DRM subsystem (see for
> > example https://marc.info/?l=linux-pm&m=154832771905309&w=2) and I have
> > found a few issues in that code which should be addressed by this patch
> > series. Please refer to the patch changelogs for details.
> >
> > None of the problems addressed here should be manifesting themselves in
> > mainline kernel today, but if there are more device links users in the
> > future, they most likely will be encountered sooner or later. Also they
> > need to be fixed for the DRM use case to be supported IMO.
> >
> > On top of this the series makes device links support the "composite device"
> > use case in the DRM subsystem mentioned above (essentially, the last patch
> > in the series is for that purpose).
> >
>
> Rafael, Greg, I have reviewed patch 1 -> 7, they all look good to me.
>
> If not too late, feel free to add for the first 7 patches:
>
> Reviewed-by: Ulf Hansson <[email protected]>

Thanks!

> Although, I want to point out one problem that I have found when using
> device links. I believe it's already there, even before this series,
> but just wanted to described it for your consideration.
>
> This is what happens:
> I have a platform driver is being probed. During ->probe() the driver
> adds a device link like this:
>
> link = device_link_add(consumer-dev, supplier-dev, DL_FLAG_STATELESS |
> DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE);
>
> At some point later in ->probe(), the driver realizes that it must
> remove the device link, either because it encountered an error or
> simply because it doesn't need the device link to be there anymore.
> Thus it calls:
>
> device_link_del(link);
>
> When probe finished of the driver, the runtime PM usage count for the
> supplier-dev remains increased to 1 and thus it never becomes runtime
> suspended.

OK, so this is a tricky one.

With this series applied, if the link actually goes away after the
cleanup device_link_del(), device_link_free() should take care of
dropping the PM-runtime count of the supplier. If it doesn't do that,
there is a mistake in the code that needs to be fixed.

However, if the link doesn't go away after the cleanup
device_link_del(), the supplier's PM-runtime count will not be
dropped, because the core doesn't know whether or not the
device_link_del() has been called by the same entity that caused the
supplier's PM-runtime count to be incremented. For example, if the
consumer device is suspended after the device_link_add() that
incremented the supplier's PM-runtime count and then suspended again,
the link's rpm_active refcount is one already and so the supplier's
PM-runtime count should not be dropped.

Arguably, device_link_del() could be made automatically drop the
supplier's PM-runtime count by one if the link's rpm_active refcount
is not one, but there will be failing scenarios in that case too
AFAICS.

2019-02-04 14:09:15

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] driver core: Fix some device links issues and add "consumer autoprobe" flag

On Mon, Feb 4, 2019 at 12:40 PM Rafael J. Wysocki <[email protected]> wrote:
>
> On Fri, Feb 1, 2019 at 4:18 PM Ulf Hansson <[email protected]> wrote:
> >
> > On Fri, 1 Feb 2019 at 02:04, Rafael J. Wysocki <[email protected]> wrote:
> > >
> > > Hi Greg at al,
> > >
> > > This is a combination of the two device links series I have posted
> > > recently (https://lore.kernel.org/lkml/[email protected]/
> > > and https://lore.kernel.org/lkml/[email protected]/) rebased
> > > on top of your driver-core-next branch.
> > >
> > > Recently I have been looking at the device links code because of the
> > > recent discussion on possibly using them in the DRM subsystem (see for
> > > example https://marc.info/?l=linux-pm&m=154832771905309&w=2) and I have
> > > found a few issues in that code which should be addressed by this patch
> > > series. Please refer to the patch changelogs for details.
> > >
> > > None of the problems addressed here should be manifesting themselves in
> > > mainline kernel today, but if there are more device links users in the
> > > future, they most likely will be encountered sooner or later. Also they
> > > need to be fixed for the DRM use case to be supported IMO.
> > >
> > > On top of this the series makes device links support the "composite device"
> > > use case in the DRM subsystem mentioned above (essentially, the last patch
> > > in the series is for that purpose).
> > >
> >
> > Rafael, Greg, I have reviewed patch 1 -> 7, they all look good to me.
> >
> > If not too late, feel free to add for the first 7 patches:
> >
> > Reviewed-by: Ulf Hansson <[email protected]>
>
> Thanks!
>
> > Although, I want to point out one problem that I have found when using
> > device links. I believe it's already there, even before this series,
> > but just wanted to described it for your consideration.
> >
> > This is what happens:
> > I have a platform driver is being probed. During ->probe() the driver
> > adds a device link like this:
> >
> > link = device_link_add(consumer-dev, supplier-dev, DL_FLAG_STATELESS |
> > DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE);
> >
> > At some point later in ->probe(), the driver realizes that it must
> > remove the device link, either because it encountered an error or
> > simply because it doesn't need the device link to be there anymore.
> > Thus it calls:
> >
> > device_link_del(link);
> >
> > When probe finished of the driver, the runtime PM usage count for the
> > supplier-dev remains increased to 1 and thus it never becomes runtime
> > suspended.
>
> OK, so this is a tricky one.
>
> With this series applied, if the link actually goes away after the
> cleanup device_link_del(), device_link_free() should take care of
> dropping the PM-runtime count of the supplier. If it doesn't do that,
> there is a mistake in the code that needs to be fixed.
>
> However, if the link doesn't go away after the cleanup
> device_link_del(), the supplier's PM-runtime count will not be
> dropped, because the core doesn't know whether or not the
> device_link_del() has been called by the same entity that caused the
> supplier's PM-runtime count to be incremented. For example, if the
> consumer device is suspended after the device_link_add() that
> incremented the supplier's PM-runtime count and then suspended again,

I was distracted while writing this, sorry for the confusion.

So let me rephrase:

For example, if the consumer device is suspended after the
device_link_add() that incremented the supplier's PM-runtime count and
then resumed again, the rpm_active refcount will be greater than one
because of the last resume and not because of the initial link
creation. In that case, dropping the supplier's PM-runtime count on
link deletion may not work as expected.

> Arguably, device_link_del() could be made automatically drop the
> supplier's PM-runtime count by one if the link's rpm_active refcount
> is not one, but there will be failing scenarios in that case too
> AFAICS.

2019-02-05 08:17:23

by Ulf Hansson

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] driver core: Fix some device links issues and add "consumer autoprobe" flag

On Mon, 4 Feb 2019 at 12:45, Rafael J. Wysocki <[email protected]> wrote:
>
> On Mon, Feb 4, 2019 at 12:40 PM Rafael J. Wysocki <[email protected]> wrote:
> >
> > On Fri, Feb 1, 2019 at 4:18 PM Ulf Hansson <[email protected]> wrote:
> > >
> > > On Fri, 1 Feb 2019 at 02:04, Rafael J. Wysocki <[email protected]> wrote:
> > > >
> > > > Hi Greg at al,
> > > >
> > > > This is a combination of the two device links series I have posted
> > > > recently (https://lore.kernel.org/lkml/[email protected]/
> > > > and https://lore.kernel.org/lkml/[email protected]/) rebased
> > > > on top of your driver-core-next branch.
> > > >
> > > > Recently I have been looking at the device links code because of the
> > > > recent discussion on possibly using them in the DRM subsystem (see for
> > > > example https://marc.info/?l=linux-pm&m=154832771905309&w=2) and I have
> > > > found a few issues in that code which should be addressed by this patch
> > > > series. Please refer to the patch changelogs for details.
> > > >
> > > > None of the problems addressed here should be manifesting themselves in
> > > > mainline kernel today, but if there are more device links users in the
> > > > future, they most likely will be encountered sooner or later. Also they
> > > > need to be fixed for the DRM use case to be supported IMO.
> > > >
> > > > On top of this the series makes device links support the "composite device"
> > > > use case in the DRM subsystem mentioned above (essentially, the last patch
> > > > in the series is for that purpose).
> > > >
> > >
> > > Rafael, Greg, I have reviewed patch 1 -> 7, they all look good to me.
> > >
> > > If not too late, feel free to add for the first 7 patches:
> > >
> > > Reviewed-by: Ulf Hansson <[email protected]>
> >
> > Thanks!
> >
> > > Although, I want to point out one problem that I have found when using
> > > device links. I believe it's already there, even before this series,
> > > but just wanted to described it for your consideration.
> > >
> > > This is what happens:
> > > I have a platform driver is being probed. During ->probe() the driver
> > > adds a device link like this:
> > >
> > > link = device_link_add(consumer-dev, supplier-dev, DL_FLAG_STATELESS |
> > > DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE);
> > >
> > > At some point later in ->probe(), the driver realizes that it must
> > > remove the device link, either because it encountered an error or
> > > simply because it doesn't need the device link to be there anymore.
> > > Thus it calls:
> > >
> > > device_link_del(link);
> > >
> > > When probe finished of the driver, the runtime PM usage count for the
> > > supplier-dev remains increased to 1 and thus it never becomes runtime
> > > suspended.
> >
> > OK, so this is a tricky one.
> >
> > With this series applied, if the link actually goes away after the
> > cleanup device_link_del(), device_link_free() should take care of
> > dropping the PM-runtime count of the supplier. If it doesn't do that,
> > there is a mistake in the code that needs to be fixed.

Unless this is a of your "distracted part", then I think this is what
happening and thus is a problem.

> >
> > However, if the link doesn't go away after the cleanup
> > device_link_del(), the supplier's PM-runtime count will not be
> > dropped, because the core doesn't know whether or not the
> > device_link_del() has been called by the same entity that caused the
> > supplier's PM-runtime count to be incremented. For example, if the
> > consumer device is suspended after the device_link_add() that
> > incremented the supplier's PM-runtime count and then suspended again,
>
> I was distracted while writing this, sorry for the confusion.
>
> So let me rephrase:
>
> For example, if the consumer device is suspended after the
> device_link_add() that incremented the supplier's PM-runtime count and
> then resumed again, the rpm_active refcount will be greater than one
> because of the last resume and not because of the initial link
> creation. In that case, dropping the supplier's PM-runtime count on
> link deletion may not work as expected.

I see what your are saying and I must admit, by looking at the code,
that it has turned into being rather complicated. Assuming of good
reasons, of course.

Anyway, I will play a little bit more with my tests to see what I can find out.

>
> > Arguably, device_link_del() could be made automatically drop the
> > supplier's PM-runtime count by one if the link's rpm_active refcount
> > is not one, but there will be failing scenarios in that case too
> > AFAICS.

Let's see.

Kind regards
Uffe

2019-02-05 11:41:06

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] driver core: Fix some device links issues and add "consumer autoprobe" flag

On Tuesday, February 5, 2019 9:15:49 AM CET Ulf Hansson wrote:
> On Mon, 4 Feb 2019 at 12:45, Rafael J. Wysocki <[email protected]> wrote:
> >
> > On Mon, Feb 4, 2019 at 12:40 PM Rafael J. Wysocki <[email protected]> wrote:
> > >
> > > On Fri, Feb 1, 2019 at 4:18 PM Ulf Hansson <[email protected]> wrote:
> > > >
> > > > On Fri, 1 Feb 2019 at 02:04, Rafael J. Wysocki <[email protected]> wrote:
> > > > >
> > > > > Hi Greg at al,
> > > > >
> > > > > This is a combination of the two device links series I have posted
> > > > > recently (https://lore.kernel.org/lkml/[email protected]/
> > > > > and https://lore.kernel.org/lkml/[email protected]/) rebased
> > > > > on top of your driver-core-next branch.
> > > > >
> > > > > Recently I have been looking at the device links code because of the
> > > > > recent discussion on possibly using them in the DRM subsystem (see for
> > > > > example https://marc.info/?l=linux-pm&m=154832771905309&w=2) and I have
> > > > > found a few issues in that code which should be addressed by this patch
> > > > > series. Please refer to the patch changelogs for details.
> > > > >
> > > > > None of the problems addressed here should be manifesting themselves in
> > > > > mainline kernel today, but if there are more device links users in the
> > > > > future, they most likely will be encountered sooner or later. Also they
> > > > > need to be fixed for the DRM use case to be supported IMO.
> > > > >
> > > > > On top of this the series makes device links support the "composite device"
> > > > > use case in the DRM subsystem mentioned above (essentially, the last patch
> > > > > in the series is for that purpose).
> > > > >
> > > >
> > > > Rafael, Greg, I have reviewed patch 1 -> 7, they all look good to me.
> > > >
> > > > If not too late, feel free to add for the first 7 patches:
> > > >
> > > > Reviewed-by: Ulf Hansson <[email protected]>
> > >
> > > Thanks!
> > >
> > > > Although, I want to point out one problem that I have found when using
> > > > device links. I believe it's already there, even before this series,
> > > > but just wanted to described it for your consideration.
> > > >
> > > > This is what happens:
> > > > I have a platform driver is being probed. During ->probe() the driver
> > > > adds a device link like this:
> > > >
> > > > link = device_link_add(consumer-dev, supplier-dev, DL_FLAG_STATELESS |
> > > > DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE);
> > > >
> > > > At some point later in ->probe(), the driver realizes that it must
> > > > remove the device link, either because it encountered an error or
> > > > simply because it doesn't need the device link to be there anymore.
> > > > Thus it calls:
> > > >
> > > > device_link_del(link);
> > > >
> > > > When probe finished of the driver, the runtime PM usage count for the
> > > > supplier-dev remains increased to 1 and thus it never becomes runtime
> > > > suspended.
> > >
> > > OK, so this is a tricky one.
> > >
> > > With this series applied, if the link actually goes away after the
> > > cleanup device_link_del(), device_link_free() should take care of
> > > dropping the PM-runtime count of the supplier. If it doesn't do that,
> > > there is a mistake in the code that needs to be fixed.
>
> Unless this is a of your "distracted part", then I think this is what
> happening and thus is a problem.
>
> > >
> > > However, if the link doesn't go away after the cleanup
> > > device_link_del(), the supplier's PM-runtime count will not be
> > > dropped, because the core doesn't know whether or not the
> > > device_link_del() has been called by the same entity that caused the
> > > supplier's PM-runtime count to be incremented. For example, if the
> > > consumer device is suspended after the device_link_add() that
> > > incremented the supplier's PM-runtime count and then suspended again,
> >
> > I was distracted while writing this, sorry for the confusion.
> >
> > So let me rephrase:
> >
> > For example, if the consumer device is suspended after the
> > device_link_add() that incremented the supplier's PM-runtime count and
> > then resumed again, the rpm_active refcount will be greater than one
> > because of the last resume and not because of the initial link
> > creation. In that case, dropping the supplier's PM-runtime count on
> > link deletion may not work as expected.
>
> I see what your are saying and I must admit, by looking at the code,
> that it has turned into being rather complicated. Assuming of good
> reasons, of course.
>
> Anyway, I will play a little bit more with my tests to see what I can find out.
>
> >
> > > Arguably, device_link_del() could be made automatically drop the
> > > supplier's PM-runtime count by one if the link's rpm_active refcount
> > > is not one, but there will be failing scenarios in that case too
> > > AFAICS.
>
> Let's see.

So for the record, below is the (untested) patch I'm thinking about.

Having considered this for some time, I think that it would be better to
try to drop the supplier's PM-runtime usage counter on link removal even if
the link doesn't go away then. That would be more consistent at least IMO.

Of course, the potentially failing case is when device_link_add() is called
for the same consumer-supplier pair a number of times in a row, sometimes
with DL_FLAG_RPM_ACTIVE set and sometimes without it, and the callers of it
try to delete the link in a different order. However, if all of them remember
that the supplier is not guaranteed to be PM-runtime-activer after a call
to device_link_del(), they should be able to avoid any problems related to
that AFAICS.

---
drivers/base/core.c | 31 ++++++++++++++++++++++---------
include/linux/device.h | 2 ++
2 files changed, 24 insertions(+), 9 deletions(-)

Index: linux-pm/include/linux/device.h
===================================================================
--- linux-pm.orig/include/linux/device.h
+++ linux-pm/include/linux/device.h
@@ -847,6 +847,7 @@ enum device_link_state {
* @flags: Link flags.
* @rpm_active: Whether or not the consumer device is runtime-PM-active.
* @kref: Count repeated addition of the same link.
+ * @rpm_refs: Repeated addition of the same link with DL_FLAG_RPM_ACTIVE set.
* @rcu_head: An RCU head to use for deferred execution of SRCU callbacks.
*/
struct device_link {
@@ -858,6 +859,7 @@ struct device_link {
u32 flags;
refcount_t rpm_active;
struct kref kref;
+ unsigned int rpm_refs;
#ifdef CONFIG_SRCU
struct rcu_head rcu_head;
#endif
Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -178,6 +178,15 @@ static void device_link_rpm_prepare(stru
pm_runtime_get_noresume(supplier);
}

+static void device_link_rpm_active(struct device_link *link, u32 flags)
+{
+ if (flags & DL_FLAG_RPM_ACTIVE) {
+ refcount_inc(&link->rpm_active);
+ if (flags & DL_FLAG_STATELESS)
+ link->rpm_refs++;
+ }
+}
+
/**
* device_link_add - Create a link between two devices.
* @consumer: Consumer end of the link.
@@ -289,8 +298,7 @@ struct device_link *device_link_add(stru
device_link_rpm_prepare(consumer, supplier);
link->flags |= DL_FLAG_PM_RUNTIME;
}
- if (flags & DL_FLAG_RPM_ACTIVE)
- refcount_inc(&link->rpm_active);
+ device_link_rpm_active(link, flags);
}

if (flags & DL_FLAG_STATELESS) {
@@ -322,10 +330,8 @@ struct device_link *device_link_add(stru
refcount_set(&link->rpm_active, 1);

if (flags & DL_FLAG_PM_RUNTIME) {
- if (flags & DL_FLAG_RPM_ACTIVE)
- refcount_inc(&link->rpm_active);
-
device_link_rpm_prepare(consumer, supplier);
+ device_link_rpm_active(link, flags);
}

get_device(supplier);
@@ -463,10 +469,17 @@ static void __device_link_del(struct kre

static void device_link_put_kref(struct device_link *link)
{
- if (link->flags & DL_FLAG_STATELESS)
- kref_put(&link->kref, __device_link_del);
- else
- WARN(1, "Unable to drop a managed device link reference\n");
+ if (WARN(!(link->flags & DL_FLAG_STATELESS),
+ "Unable to drop a managed device link reference\n"))
+ return;
+
+ if (link->rpm_refs) {
+ link->rpm_refs--;
+ if (refcount_dec_not_one(&link->rpm_active))
+ pm_runtime_put(link->supplier);
+ }
+
+ kref_put(&link->kref, __device_link_del);
}

/**


2019-02-06 10:05:50

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] driver core: Fix some device links issues and add "consumer autoprobe" flag

On Tue, Feb 5, 2019 at 12:27 PM Rafael J. Wysocki <[email protected]> wrote:
>
> On Tuesday, February 5, 2019 9:15:49 AM CET Ulf Hansson wrote:
> > On Mon, 4 Feb 2019 at 12:45, Rafael J. Wysocki <[email protected]> wrote:
> > >
> > > On Mon, Feb 4, 2019 at 12:40 PM Rafael J. Wysocki <[email protected]> wrote:
> > > >
> > > > On Fri, Feb 1, 2019 at 4:18 PM Ulf Hansson <[email protected]> wrote:

[cut]

> > >
> > > For example, if the consumer device is suspended after the
> > > device_link_add() that incremented the supplier's PM-runtime count and
> > > then resumed again, the rpm_active refcount will be greater than one
> > > because of the last resume and not because of the initial link
> > > creation. In that case, dropping the supplier's PM-runtime count on
> > > link deletion may not work as expected.
> >
> > I see what your are saying and I must admit, by looking at the code,
> > that it has turned into being rather complicated. Assuming of good
> > reasons, of course.
> >
> > Anyway, I will play a little bit more with my tests to see what I can find out.
> >
> > >
> > > > Arguably, device_link_del() could be made automatically drop the
> > > > supplier's PM-runtime count by one if the link's rpm_active refcount
> > > > is not one, but there will be failing scenarios in that case too
> > > > AFAICS.
> >
> > Let's see.
>
> So for the record, below is the (untested) patch I'm thinking about.
>
> Having considered this for some time, I think that it would be better to
> try to drop the supplier's PM-runtime usage counter on link removal even if
> the link doesn't go away then. That would be more consistent at least IMO.

So I can't convince myself that this is the case.

Either way, if there are two callers of device_link_add() for one
consumer-supplier pair trying to add a stateless link between them and
one of these callers passes DL_FLAG_RPM_ACTIVE set in the flags to it,
there may be issues regardless of what device_link_del() and
device_link_remove() do. However, if they decrement the link's
rpm_active refcount (and possibly the supplier's PM-runtime usage
counter too), the supplier may be suspended prematurely, whereas in
the other case (no decrementation of rpm_active, which how the code
works after this series) it may just be prevented from suspending. To
me, the former is worse than the latter.

Moreover, there is a workaround for the latter issue that seems to be
easy enough: it is sufficient to let the consumer runtime suspend
after calling device_link_add() to create the link (with
DL_FLAG_RPM_ACTIVE set) and before trying to remove it.

Because of the above, I'm just going to post a patch to document the
current behavior of the code as a known limitation.

2019-02-06 11:24:28

by Ulf Hansson

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] driver core: Fix some device links issues and add "consumer autoprobe" flag

On Wed, 6 Feb 2019 at 10:56, Rafael J. Wysocki <[email protected]> wrote:
>
> On Tue, Feb 5, 2019 at 12:27 PM Rafael J. Wysocki <[email protected]> wrote:
> >
> > On Tuesday, February 5, 2019 9:15:49 AM CET Ulf Hansson wrote:
> > > On Mon, 4 Feb 2019 at 12:45, Rafael J. Wysocki <[email protected]> wrote:
> > > >
> > > > On Mon, Feb 4, 2019 at 12:40 PM Rafael J. Wysocki <[email protected]> wrote:
> > > > >
> > > > > On Fri, Feb 1, 2019 at 4:18 PM Ulf Hansson <[email protected]> wrote:
>
> [cut]
>
> > > >
> > > > For example, if the consumer device is suspended after the
> > > > device_link_add() that incremented the supplier's PM-runtime count and
> > > > then resumed again, the rpm_active refcount will be greater than one
> > > > because of the last resume and not because of the initial link
> > > > creation. In that case, dropping the supplier's PM-runtime count on
> > > > link deletion may not work as expected.
> > >
> > > I see what your are saying and I must admit, by looking at the code,
> > > that it has turned into being rather complicated. Assuming of good
> > > reasons, of course.
> > >
> > > Anyway, I will play a little bit more with my tests to see what I can find out.
> > >
> > > >
> > > > > Arguably, device_link_del() could be made automatically drop the
> > > > > supplier's PM-runtime count by one if the link's rpm_active refcount
> > > > > is not one, but there will be failing scenarios in that case too
> > > > > AFAICS.
> > >
> > > Let's see.
> >
> > So for the record, below is the (untested) patch I'm thinking about.
> >
> > Having considered this for some time, I think that it would be better to
> > try to drop the supplier's PM-runtime usage counter on link removal even if
> > the link doesn't go away then. That would be more consistent at least IMO.
>
> So I can't convince myself that this is the case.
>
> Either way, if there are two callers of device_link_add() for one
> consumer-supplier pair trying to add a stateless link between them and
> one of these callers passes DL_FLAG_RPM_ACTIVE set in the flags to it,
> there may be issues regardless of what device_link_del() and
> device_link_remove() do. However, if they decrement the link's
> rpm_active refcount (and possibly the supplier's PM-runtime usage
> counter too), the supplier may be suspended prematurely, whereas in
> the other case (no decrementation of rpm_active, which how the code
> works after this series) it may just be prevented from suspending. To
> me, the former is worse than the latter.

Well, I would say it sucks in both cases. :-)

>
> Moreover, there is a workaround for the latter issue that seems to be
> easy enough: it is sufficient to let the consumer runtime suspend
> after calling device_link_add() to create the link (with
> DL_FLAG_RPM_ACTIVE set) and before trying to remove it.

I get your point, but unfortunate I don't think it's that simple.

For example, someone (like a child) may prevent runtime suspend for
the consumer. Hence, also the supplier is prevented from being runtime
suspended.

So, if you want to push this responsibility to the driver, then I
think we need make __pm_runtime_set_status() to respect device links,
similar to how it already deals with child/parents.

In that way, the driver could call pm_runtime_set_suspended(), before
dropping the device link in ->probe(), which would allow the supplier
to also become runtime suspended.

I did a quick research of users of device links, unless I am mistaken,
this seems like an okay approach.

What do you think?

>
> Because of the above, I'm just going to post a patch to document the
> current behavior of the code as a known limitation.

Let's not give up, yet, please. I am sure we can figure something out.

Kind regards
Uffe

2019-02-06 12:12:22

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] driver core: Fix some device links issues and add "consumer autoprobe" flag

On Wed, Feb 6, 2019 at 12:24 PM Ulf Hansson <[email protected]> wrote:
>
> On Wed, 6 Feb 2019 at 10:56, Rafael J. Wysocki <[email protected]> wrote:
> >
> > On Tue, Feb 5, 2019 at 12:27 PM Rafael J. Wysocki <[email protected]> wrote:
> > >
> > > On Tuesday, February 5, 2019 9:15:49 AM CET Ulf Hansson wrote:
> > > > On Mon, 4 Feb 2019 at 12:45, Rafael J. Wysocki <[email protected]> wrote:
> > > > >
> > > > > On Mon, Feb 4, 2019 at 12:40 PM Rafael J. Wysocki <[email protected]> wrote:
> > > > > >
> > > > > > On Fri, Feb 1, 2019 at 4:18 PM Ulf Hansson <[email protected]> wrote:
> >
> > [cut]
> >
> > > > >
> > > > > For example, if the consumer device is suspended after the
> > > > > device_link_add() that incremented the supplier's PM-runtime count and
> > > > > then resumed again, the rpm_active refcount will be greater than one
> > > > > because of the last resume and not because of the initial link
> > > > > creation. In that case, dropping the supplier's PM-runtime count on
> > > > > link deletion may not work as expected.
> > > >
> > > > I see what your are saying and I must admit, by looking at the code,
> > > > that it has turned into being rather complicated. Assuming of good
> > > > reasons, of course.
> > > >
> > > > Anyway, I will play a little bit more with my tests to see what I can find out.
> > > >
> > > > >
> > > > > > Arguably, device_link_del() could be made automatically drop the
> > > > > > supplier's PM-runtime count by one if the link's rpm_active refcount
> > > > > > is not one, but there will be failing scenarios in that case too
> > > > > > AFAICS.
> > > >
> > > > Let's see.
> > >
> > > So for the record, below is the (untested) patch I'm thinking about.
> > >
> > > Having considered this for some time, I think that it would be better to
> > > try to drop the supplier's PM-runtime usage counter on link removal even if
> > > the link doesn't go away then. That would be more consistent at least IMO.
> >
> > So I can't convince myself that this is the case.
> >
> > Either way, if there are two callers of device_link_add() for one
> > consumer-supplier pair trying to add a stateless link between them and
> > one of these callers passes DL_FLAG_RPM_ACTIVE set in the flags to it,
> > there may be issues regardless of what device_link_del() and
> > device_link_remove() do. However, if they decrement the link's
> > rpm_active refcount (and possibly the supplier's PM-runtime usage
> > counter too), the supplier may be suspended prematurely, whereas in
> > the other case (no decrementation of rpm_active, which how the code
> > works after this series) it may just be prevented from suspending. To
> > me, the former is worse than the latter.
>
> Well, I would say it sucks in both cases. :-)
>
> >
> > Moreover, there is a workaround for the latter issue that seems to be
> > easy enough: it is sufficient to let the consumer runtime suspend
> > after calling device_link_add() to create the link (with
> > DL_FLAG_RPM_ACTIVE set) and before trying to remove it.
>
> I get your point, but unfortunate I don't think it's that simple.
>
> For example, someone (like a child) may prevent runtime suspend for
> the consumer. Hence, also the supplier is prevented from being runtime
> suspended.

Well, in that case the supplier should not be suspended until the
consumer can be suspended too.

IOW, if you call device_link_del() in that case, it would be a bug if
it allowed the supplier suspend.

> So, if you want to push this responsibility to the driver, then I
> think we need make __pm_runtime_set_status() to respect device links,
> similar to how it already deals with child/parents.
>
> In that way, the driver could call pm_runtime_set_suspended(), before
> dropping the device link in ->probe(), which would allow the supplier
> to also become runtime suspended.

I guess you mean that runtime PM would be disabled for the consumer at
that point?

> I did a quick research of users of device links, unless I am mistaken,
> this seems like an okay approach.
>
> What do you think?

Well, I think I need to know the exact use case you have in mind. :-)

2019-02-06 13:06:22

by Ulf Hansson

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] driver core: Fix some device links issues and add "consumer autoprobe" flag

On Wed, 6 Feb 2019 at 13:10, Rafael J. Wysocki <[email protected]> wrote:
>
> On Wed, Feb 6, 2019 at 12:24 PM Ulf Hansson <[email protected]> wrote:
> >
> > On Wed, 6 Feb 2019 at 10:56, Rafael J. Wysocki <[email protected]> wrote:
> > >
> > > On Tue, Feb 5, 2019 at 12:27 PM Rafael J. Wysocki <[email protected]> wrote:
> > > >
> > > > On Tuesday, February 5, 2019 9:15:49 AM CET Ulf Hansson wrote:
> > > > > On Mon, 4 Feb 2019 at 12:45, Rafael J. Wysocki <[email protected]> wrote:
> > > > > >
> > > > > > On Mon, Feb 4, 2019 at 12:40 PM Rafael J. Wysocki <[email protected]> wrote:
> > > > > > >
> > > > > > > On Fri, Feb 1, 2019 at 4:18 PM Ulf Hansson <[email protected]> wrote:
> > >
> > > [cut]
> > >
> > > > > >
> > > > > > For example, if the consumer device is suspended after the
> > > > > > device_link_add() that incremented the supplier's PM-runtime count and
> > > > > > then resumed again, the rpm_active refcount will be greater than one
> > > > > > because of the last resume and not because of the initial link
> > > > > > creation. In that case, dropping the supplier's PM-runtime count on
> > > > > > link deletion may not work as expected.
> > > > >
> > > > > I see what your are saying and I must admit, by looking at the code,
> > > > > that it has turned into being rather complicated. Assuming of good
> > > > > reasons, of course.
> > > > >
> > > > > Anyway, I will play a little bit more with my tests to see what I can find out.
> > > > >
> > > > > >
> > > > > > > Arguably, device_link_del() could be made automatically drop the
> > > > > > > supplier's PM-runtime count by one if the link's rpm_active refcount
> > > > > > > is not one, but there will be failing scenarios in that case too
> > > > > > > AFAICS.
> > > > >
> > > > > Let's see.
> > > >
> > > > So for the record, below is the (untested) patch I'm thinking about.
> > > >
> > > > Having considered this for some time, I think that it would be better to
> > > > try to drop the supplier's PM-runtime usage counter on link removal even if
> > > > the link doesn't go away then. That would be more consistent at least IMO.
> > >
> > > So I can't convince myself that this is the case.
> > >
> > > Either way, if there are two callers of device_link_add() for one
> > > consumer-supplier pair trying to add a stateless link between them and
> > > one of these callers passes DL_FLAG_RPM_ACTIVE set in the flags to it,
> > > there may be issues regardless of what device_link_del() and
> > > device_link_remove() do. However, if they decrement the link's
> > > rpm_active refcount (and possibly the supplier's PM-runtime usage
> > > counter too), the supplier may be suspended prematurely, whereas in
> > > the other case (no decrementation of rpm_active, which how the code
> > > works after this series) it may just be prevented from suspending. To
> > > me, the former is worse than the latter.
> >
> > Well, I would say it sucks in both cases. :-)
> >
> > >
> > > Moreover, there is a workaround for the latter issue that seems to be
> > > easy enough: it is sufficient to let the consumer runtime suspend
> > > after calling device_link_add() to create the link (with
> > > DL_FLAG_RPM_ACTIVE set) and before trying to remove it.
> >
> > I get your point, but unfortunate I don't think it's that simple.
> >
> > For example, someone (like a child) may prevent runtime suspend for
> > the consumer. Hence, also the supplier is prevented from being runtime
> > suspended.
>
> Well, in that case the supplier should not be suspended until the
> consumer can be suspended too.
>
> IOW, if you call device_link_del() in that case, it would be a bug if
> it allowed the supplier suspend.

Well, maybe the "child" was a bad example. The point is, the driver
doesn't control the RPM child count and nor the RPM usage count for
its consumer device - solely by itself.

Someone, like the driver/PM core can at some point decide to increase
the runtime PM usage count for example for the consumer device. For
whatever good reason.

>
> > So, if you want to push this responsibility to the driver, then I
> > think we need make __pm_runtime_set_status() to respect device links,
> > similar to how it already deals with child/parents.
> >
> > In that way, the driver could call pm_runtime_set_suspended(), before
> > dropping the device link in ->probe(), which would allow the supplier
> > to also become runtime suspended.
>
> I guess you mean that runtime PM would be disabled for the consumer at
> that point?

Yes.

Calling pm_runtime_set_suspended() should be a part of the error path
in the driver, which includes disabling runtime PM as well (if it
enabled it in the first place of course).

>
> > I did a quick research of users of device links, unless I am mistaken,
> > this seems like an okay approach.
> >
> > What do you think?
>
> Well, I think I need to know the exact use case you have in mind. :-)

The use case is simply a generic driver that fails to probe by
returning -EPROBE_DEFER. So it's hypothetical, but I often tests
common sequences by using my RPM testdriver, to make sure it all works
as expected.
The below sequence is common, so then I have added the use of device
links, to see how this plays. And it doesn't.

->probe()
...

pm_runtime_get_noresume()
pm_runtime_set_active()
pm_runtime_enable()

...

device_link_add(con, supp, DL_FLAG_STATELESS |DL_FLAG_PM_RUNTIME |
DL_FLAG_RPM_ACTIVE);

we got some errors...
goto err:

...

err:
pm_runtime_put_noidle()
pm_runtime_disable()
pm_runtime_set_suspended()

device_link_remove()

return err;

Kind regards
Uffe

2019-02-06 23:17:07

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v2 0/9] driver core: Fix some device links issues and add "consumer autoprobe" flag

On Wed, Feb 6, 2019 at 2:02 PM Ulf Hansson <[email protected]> wrote:
>
> On Wed, 6 Feb 2019 at 13:10, Rafael J. Wysocki <[email protected]> wrote:
> >
> > On Wed, Feb 6, 2019 at 12:24 PM Ulf Hansson <[email protected]> wrote:
> > >
> > > On Wed, 6 Feb 2019 at 10:56, Rafael J. Wysocki <[email protected]> wrote:
> > > >
> > > > On Tue, Feb 5, 2019 at 12:27 PM Rafael J. Wysocki <[email protected]> wrote:
> > > > >
> > > > > On Tuesday, February 5, 2019 9:15:49 AM CET Ulf Hansson wrote:
> > > > > > On Mon, 4 Feb 2019 at 12:45, Rafael J. Wysocki <[email protected]> wrote:
> > > > > > >
> > > > > > > On Mon, Feb 4, 2019 at 12:40 PM Rafael J. Wysocki <[email protected]> wrote:
> > > > > > > >
> > > > > > > > On Fri, Feb 1, 2019 at 4:18 PM Ulf Hansson <[email protected]> wrote:
> > > >
> > > > [cut]
> > > >
> > > > > > >
> > > > > > > For example, if the consumer device is suspended after the
> > > > > > > device_link_add() that incremented the supplier's PM-runtime count and
> > > > > > > then resumed again, the rpm_active refcount will be greater than one
> > > > > > > because of the last resume and not because of the initial link
> > > > > > > creation. In that case, dropping the supplier's PM-runtime count on
> > > > > > > link deletion may not work as expected.
> > > > > >
> > > > > > I see what your are saying and I must admit, by looking at the code,
> > > > > > that it has turned into being rather complicated. Assuming of good
> > > > > > reasons, of course.
> > > > > >
> > > > > > Anyway, I will play a little bit more with my tests to see what I can find out.
> > > > > >
> > > > > > >
> > > > > > > > Arguably, device_link_del() could be made automatically drop the
> > > > > > > > supplier's PM-runtime count by one if the link's rpm_active refcount
> > > > > > > > is not one, but there will be failing scenarios in that case too
> > > > > > > > AFAICS.
> > > > > >
> > > > > > Let's see.
> > > > >
> > > > > So for the record, below is the (untested) patch I'm thinking about.
> > > > >
> > > > > Having considered this for some time, I think that it would be better to
> > > > > try to drop the supplier's PM-runtime usage counter on link removal even if
> > > > > the link doesn't go away then. That would be more consistent at least IMO.
> > > >
> > > > So I can't convince myself that this is the case.
> > > >
> > > > Either way, if there are two callers of device_link_add() for one
> > > > consumer-supplier pair trying to add a stateless link between them and
> > > > one of these callers passes DL_FLAG_RPM_ACTIVE set in the flags to it,
> > > > there may be issues regardless of what device_link_del() and
> > > > device_link_remove() do. However, if they decrement the link's
> > > > rpm_active refcount (and possibly the supplier's PM-runtime usage
> > > > counter too), the supplier may be suspended prematurely, whereas in
> > > > the other case (no decrementation of rpm_active, which how the code
> > > > works after this series) it may just be prevented from suspending. To
> > > > me, the former is worse than the latter.
> > >
> > > Well, I would say it sucks in both cases. :-)
> > >
> > > >
> > > > Moreover, there is a workaround for the latter issue that seems to be
> > > > easy enough: it is sufficient to let the consumer runtime suspend
> > > > after calling device_link_add() to create the link (with
> > > > DL_FLAG_RPM_ACTIVE set) and before trying to remove it.
> > >
> > > I get your point, but unfortunate I don't think it's that simple.
> > >
> > > For example, someone (like a child) may prevent runtime suspend for
> > > the consumer. Hence, also the supplier is prevented from being runtime
> > > suspended.
> >
> > Well, in that case the supplier should not be suspended until the
> > consumer can be suspended too.
> >
> > IOW, if you call device_link_del() in that case, it would be a bug if
> > it allowed the supplier suspend.
>
> Well, maybe the "child" was a bad example. The point is, the driver
> doesn't control the RPM child count and nor the RPM usage count for
> its consumer device - solely by itself.
>
> Someone, like the driver/PM core can at some point decide to increase
> the runtime PM usage count for example for the consumer device. For
> whatever good reason.

If PM-runtime is enabled for the consumer and there is a good enough
reason why it cannot suspend, then the supplier cannot suspend too.
Again, allowing the supplier to suspend in that situation would be a
bug.

However, the supplier can be allowed to suspend if PM-runtime is not
enabled for the consumer and it is not operational (and the latter
needs to be known to whoever wants to allow the supplier to suspend).

> >
> > > So, if you want to push this responsibility to the driver, then I
> > > think we need make __pm_runtime_set_status() to respect device links,
> > > similar to how it already deals with child/parents.
> > >
> > > In that way, the driver could call pm_runtime_set_suspended(), before
> > > dropping the device link in ->probe(), which would allow the supplier
> > > to also become runtime suspended.
> >
> > I guess you mean that runtime PM would be disabled for the consumer at
> > that point?
>
> Yes.
>
> Calling pm_runtime_set_suspended() should be a part of the error path
> in the driver, which includes disabling runtime PM as well (if it
> enabled it in the first place of course).

OK

I admit that it would make sense to change __pm_runtime_set_status()
to take device links into account.

> >
> > > I did a quick research of users of device links, unless I am mistaken,
> > > this seems like an okay approach.
> > >
> > > What do you think?
> >
> > Well, I think I need to know the exact use case you have in mind. :-)
>
> The use case is simply a generic driver that fails to probe by
> returning -EPROBE_DEFER. So it's hypothetical, but I often tests
> common sequences by using my RPM testdriver, to make sure it all works
> as expected.
> The below sequence is common, so then I have added the use of device
> links, to see how this plays. And it doesn't.
>
> ->probe()
> ...
>
> pm_runtime_get_noresume()
> pm_runtime_set_active()
> pm_runtime_enable()

I don't think you can do the above safely without ensuring that the
supplier is not suspended upfront. So you need

pm_runtim_get_sync(supplier);

before the above, and then ->

>
> ...
>
> device_link_add(con, supp, DL_FLAG_STATELESS |DL_FLAG_PM_RUNTIME |
> DL_FLAG_RPM_ACTIVE);
>
> we got some errors...
> goto err:

-> it is not necessary to add the link before the point at which you
know that you are not going to return any errors (except when the link
creation itself fails). So then you do

device_link_add(con, supp, DL_FLAG_STATELESS |
DL_FLAG_PM_RUNTIME |
DL_FLAG_RPM_ACTIVE);
...
pm_runtime_put_noidle(supplier);

and the link prevents the supplier from suspending until the consumer
is suspended.

>
> ...
>
> err:
> pm_runtime_put_noidle()
> pm_runtime_disable()
> pm_runtime_set_suspended()
>
> device_link_remove()
>
> return err;

In particular, device links really should not be added and removed
before returning -EPROBE_DEFER IMO, because adding them causes
dpm_list to be re-ordered and that may be quite heavy if the consumer
has children etc.

2019-02-07 19:05:09

by Lukas Wunner

[permalink] [raw]
Subject: Re: [PATCH v2 2/9] driver core: Avoid careless re-use of existing device links

Sorry, late to the party.

On Fri, Feb 01, 2019 at 01:46:54AM +0100, Rafael J. Wysocki wrote:
> After commit ead18c23c263 ("driver core: Introduce device links
> reference counting"), if there is a link between the given supplier
> and the given consumer already, device_link_add() will refcount it
> and return it unconditionally. However, if the flags passed to
> it on the second (or any subsequent) attempt to create a device
> link between the same consumer-supplier pair are not compatible with
> the existing link's flags, that is incorrect.

Prior to ead18c23c263, a second invocation of device_link_add() also
returned the existing device link without checking for incompatible
flags. Hence this issue was not introduced by ead18c23c263, but
rather by 9ed9895370ae ("driver core: Functional dependencies tracking
support"), i.e. it was present all along. (Unless I'm missing something.)


> Moreover, if the DL_FLAG_AUTOREMOVE_CONSUMER flag is passed to
> device_link_add(), its caller will expect its reference to the link
> to be dropped automatically on consumer driver removal, which will
> not happen if that flag is not set in the link's flags (and
> analogously for DL_FLAG_AUTOREMOVE_SUPPLIER). For this reason, make
> device_link_add() update the existing link's flags accordingly
> before returning it to the caller.

Same here.


> Fixes: ead18c23c263 ("driver core: Introduce device links reference counting")

Should rather be
Fixes: 9ed9895370ae ("driver core: Functional dependencies tracking support")

(Again, unless I'm missing something.)

Thanks,

Lukas

2019-02-07 19:13:42

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v2 2/9] driver core: Avoid careless re-use of existing device links

On Thu, Feb 7, 2019 at 8:03 PM Lukas Wunner <[email protected]> wrote:
>
> Sorry, late to the party.
>
> On Fri, Feb 01, 2019 at 01:46:54AM +0100, Rafael J. Wysocki wrote:
> > After commit ead18c23c263 ("driver core: Introduce device links
> > reference counting"), if there is a link between the given supplier
> > and the given consumer already, device_link_add() will refcount it
> > and return it unconditionally. However, if the flags passed to
> > it on the second (or any subsequent) attempt to create a device
> > link between the same consumer-supplier pair are not compatible with
> > the existing link's flags, that is incorrect.
>
> Prior to ead18c23c263, a second invocation of device_link_add() also
> returned the existing device link without checking for incompatible
> flags. Hence this issue was not introduced by ead18c23c263, but
> rather by 9ed9895370ae ("driver core: Functional dependencies tracking
> support"), i.e. it was present all along. (Unless I'm missing something.)
>
>
> > Moreover, if the DL_FLAG_AUTOREMOVE_CONSUMER flag is passed to
> > device_link_add(), its caller will expect its reference to the link
> > to be dropped automatically on consumer driver removal, which will
> > not happen if that flag is not set in the link's flags (and
> > analogously for DL_FLAG_AUTOREMOVE_SUPPLIER). For this reason, make
> > device_link_add() update the existing link's flags accordingly
> > before returning it to the caller.
>
> Same here.
>
>
> > Fixes: ead18c23c263 ("driver core: Introduce device links reference counting")
>
> Should rather be
> Fixes: 9ed9895370ae ("driver core: Functional dependencies tracking support")
>
> (Again, unless I'm missing something.)

I haven't checked far enough into the past, sorry about that.

You are right, but actually I think that

Fixes: 9ed9895370ae ("driver core: Functional dependencies tracking support")

should be there in addition to the other Fixes: tag, because the issue
is still present after that commit and the patch does not apply
directly on top of 9ed9895370ae.

2019-02-07 19:17:30

by Lukas Wunner

[permalink] [raw]
Subject: Re: [PATCH v2 4/9] driver core: Fix handling of runtime PM flags in device_link_add()

On Fri, Feb 01, 2019 at 01:49:14AM +0100, Rafael J. Wysocki wrote:
> After commit ead18c23c263 ("driver core: Introduce device links
> reference counting"), if there is a link between the given supplier
> and the given consumer already, device_link_add() will refcount it
> and return it unconditionally without updating its flags. It is
> possible, however, that the second (or any subsequent) caller of
> device_link_add() for the same consumer-supplier pair will pass
> DL_FLAG_PM_RUNTIME, possibly along with DL_FLAG_RPM_ACTIVE, in flags
> to it and the existing link may not behave as expected then.
[...]
> Fixes: ead18c23c263 ("driver core: Introduce device links reference counting")

I think this should be:

Fixes: 21d5c57b3726 ("PM / runtime: Use device links")

Thanks,

Lukas

2019-02-07 19:21:22

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v2 4/9] driver core: Fix handling of runtime PM flags in device_link_add()

On Thu, Feb 7, 2019 at 8:15 PM Lukas Wunner <[email protected]> wrote:
>
> On Fri, Feb 01, 2019 at 01:49:14AM +0100, Rafael J. Wysocki wrote:
> > After commit ead18c23c263 ("driver core: Introduce device links
> > reference counting"), if there is a link between the given supplier
> > and the given consumer already, device_link_add() will refcount it
> > and return it unconditionally without updating its flags. It is
> > possible, however, that the second (or any subsequent) caller of
> > device_link_add() for the same consumer-supplier pair will pass
> > DL_FLAG_PM_RUNTIME, possibly along with DL_FLAG_RPM_ACTIVE, in flags
> > to it and the existing link may not behave as expected then.
> [...]
> > Fixes: ead18c23c263 ("driver core: Introduce device links reference counting")
>
> I think this should be:
>
> Fixes: 21d5c57b3726 ("PM / runtime: Use device links")

Again, sorry for failing to look deeper into the past.

Yes, there should be

Fixes: 21d5c57b3726 ("PM / runtime: Use device links")

in the changelog of this patch, but again, I still would add the other
Fixes: tag too as the issue is still present after ead18c23c263 and
that commit is needed for the patch to apply.