2016-10-10 12:59:26

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH v5 0/5] Functional dependencies between devices

Hi Everyone,

> On Thursday, September 08, 2016 11:25:44 PM Rafael J. Wysocki wrote:
>

[cut]

>
> Time for another update. :-)
>
> Fewer changes this time, mostly to address issues found by Lukas and Marek.
>
> The most significant one is to make device_link_add() cope with the case
> when
> the consumer device has not been registered yet when it is called. The
> supplier device still is required to be registered and the function will
> return NULL if that is not the case.
>
> Another significant change is in patch [4/5] that now makes the core apply
> pm_runtime_get_sync()/pm_runtime_put() to supplier devices around the
> probing of a consumer one (in analogy with the parent).

One more update after some conversations during LinuxCon Europe.

The main point was to make it possible for device_link_add() to figure out the
initial state of the link instead of expecting the caller to provide it which
might not be reliable enough in general.

In this version device_link_add() takes three arguments, the supplier and
consumer pointers and flags and it sets the correct initial state of the link
automatically (unless invoked with the "stateless" flag, of course). The cost
is one additional field in struct device (I moved all of the links-related
fields in struct device to a separate sub-structure while at it) to track
the "driver presence status" of the device (to be used by device_link_add()).

In addition to that, the links list walks in the core.c and dd.c code are
under the device links mutex now, so the iternal link spinlock is not needed
any more and I have renamed symbols to distinguish between flags, link states
and device "driver presence statuses".

More information is there in the changelogs.

Thanks,
Rafael


2016-10-10 12:59:31

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH v5 4/5] PM / runtime: Use device links

From: Rafael J. Wysocki <[email protected]>

Modify the runtime PM framework to use device links to ensure that
supplier devices will not be suspended if any of their consumer
devices are active.

The idea is to reference count suppliers on the consumer's resume
and drop references to them on its suspend. The information on
whether or not the supplier has been reference counted by the
consumer's (runtime) resume is stored in a new field (rpm_active)
in the link object for each link.

It may be necessary to clean up those references when the
supplier is unbinding and that's why the links whose status is
DEVICE_LINK_SUPPLIER_UNBIND are skipped by the runtime suspend
and resume code.

The above means that if the consumer device is probed in the
runtime-active state, the supplier has to be resumed and reference
counted by device_link_add() so the code works as expected on its
(runtime) suspend. There is a new flag, DEVICE_LINK_RPM_ACTIVE,
to tell device_link_add() about that (in which case the caller
is responsible for making sure that the consumer really will
be runtime-active when runtime PM is enabled for it).

The other new link flag, DEVICE_LINK_PM_RUNTIME, tells the core
whether or not the link should be used for runtime PM at all.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---

v4 -> v5:

- Rebase, change the symbol names in accordance with patch [2/5] etc.

- Use READ_ONCE() to annotate "unprotected" accesses to link status data.

v3 -> v4:

- Add pm_runtime_get/put_suppliers() and make the core call them
around the probing of a consumer device (in analogy with what it
does to the device's parent).

---
drivers/base/core.c | 19 ++++-
drivers/base/dd.c | 3
drivers/base/power/runtime.c | 157 +++++++++++++++++++++++++++++++++++++++++--
include/linux/device.h | 5 +
include/linux/pm_runtime.h | 6 +
5 files changed, 184 insertions(+), 6 deletions(-)

Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -117,6 +117,14 @@ static int device_reorder_to_tail(struct
* @supplier: Supplier end of the link.
* @flags: Link flags.
*
+ * The caller is responsible for the proper synchronization of the link creation
+ * with runtime PM. First, setting the DL_FLAG_PM_RUNTIME flag will cause the
+ * runtime PM framework to take the link into account. Second, if the
+ * DL_FLAG_RPM_ACTIVE flag is set in addition to it, the supplier devices will
+ * be forced into the active metastate and reference-counted upon the creation
+ * of the link. If DL_FLAG_PM_RUNTIME is not set, DL_FLAG_RPM_ACTIVE will be
+ * ignored.
+ *
* If the DL_FLAG_AUTOREMOVE is set, the link will be removed automatically
* when the consumer device driver unbinds from it. The combination of both
* DL_FLAG_AUTOREMOVE and DL_FLAG_STATELESS set is invalid and will cause NULL
@@ -158,10 +166,19 @@ struct device_link *device_link_add(stru
if (link->consumer == consumer)
goto out;

- link = kmalloc(sizeof(*link), GFP_KERNEL);
+ link = kzalloc(sizeof(*link), GFP_KERNEL);
if (!link)
goto out;

+ if ((flags & DL_FLAG_PM_RUNTIME) && (flags & DL_FLAG_RPM_ACTIVE)) {
+ if (pm_runtime_get_sync(supplier) < 0) {
+ pm_runtime_put_noidle(supplier);
+ kfree(link);
+ link = NULL;
+ goto out;
+ }
+ link->rpm_active = true;
+ }
get_device(supplier);
link->supplier = supplier;
INIT_LIST_HEAD(&link->s_node);
Index: linux-pm/drivers/base/dd.c
===================================================================
--- linux-pm.orig/drivers/base/dd.c
+++ linux-pm/drivers/base/dd.c
@@ -498,6 +498,7 @@ int driver_probe_device(struct device_dr
pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
drv->bus->name, __func__, dev_name(dev), drv->name);

+ pm_runtime_get_suppliers(dev);
if (dev->parent)
pm_runtime_get_sync(dev->parent);

@@ -508,6 +509,7 @@ int driver_probe_device(struct device_dr
if (dev->parent)
pm_runtime_put(dev->parent);

+ pm_runtime_put_suppliers(dev);
return ret;
}

@@ -791,6 +793,7 @@ static void __device_release_driver(stru
}

pm_runtime_get_sync(dev);
+ pm_runtime_clean_up_links(dev);

driver_sysfs_remove(dev);

Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -12,6 +12,8 @@
#include <linux/pm_runtime.h>
#include <linux/pm_wakeirq.h>
#include <trace/events/rpm.h>
+
+#include "../base.h"
#include "power.h"

typedef int (*pm_callback_t)(struct device *);
@@ -258,6 +260,42 @@ static int rpm_check_suspend_allowed(str
return retval;
}

+static int rpm_get_suppliers(struct device *dev)
+{
+ struct device_link *link;
+
+ list_for_each_entry_rcu(link, &dev->links.suppliers, c_node) {
+ int retval;
+
+ if (!(link->flags & DL_FLAG_PM_RUNTIME))
+ continue;
+
+ if (READ_ONCE(link->status) == DL_STATE_SUPPLIER_UNBIND ||
+ link->rpm_active)
+ continue;
+
+ retval = pm_runtime_get_sync(link->supplier);
+ if (retval < 0) {
+ pm_runtime_put_noidle(link->supplier);
+ return retval;
+ }
+ link->rpm_active = true;
+ }
+ return 0;
+}
+
+static void rpm_put_suppliers(struct device *dev)
+{
+ struct device_link *link;
+
+ list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
+ if (link->rpm_active &&
+ READ_ONCE(link->status) != DL_STATE_SUPPLIER_UNBIND) {
+ pm_runtime_put(link->supplier);
+ link->rpm_active = false;
+ }
+}
+
/**
* __rpm_callback - Run a given runtime PM callback for a given device.
* @cb: Runtime PM callback to run.
@@ -266,19 +304,55 @@ static int rpm_check_suspend_allowed(str
static int __rpm_callback(int (*cb)(struct device *), struct device *dev)
__releases(&dev->power.lock) __acquires(&dev->power.lock)
{
- int retval;
+ int retval, idx;

- if (dev->power.irq_safe)
+ if (dev->power.irq_safe) {
spin_unlock(&dev->power.lock);
- else
+ } else {
spin_unlock_irq(&dev->power.lock);

+ /*
+ * Resume suppliers if necessary.
+ *
+ * The device's runtime PM status cannot change until this
+ * routine returns, so it is safe to read the status outside of
+ * the lock.
+ */
+ if (dev->power.runtime_status == RPM_RESUMING) {
+ idx = device_links_read_lock();
+
+ retval = rpm_get_suppliers(dev);
+ if (retval)
+ goto fail;
+
+ device_links_read_unlock(idx);
+ }
+ }
+
retval = cb(dev);

- if (dev->power.irq_safe)
+ if (dev->power.irq_safe) {
spin_lock(&dev->power.lock);
- else
+ } else {
+ /*
+ * If the device is suspending and the callback has returned
+ * success, drop the usage counters of the suppliers that have
+ * been reference counted on its resume.
+ *
+ * Do that if resume fails too.
+ */
+ if ((dev->power.runtime_status == RPM_SUSPENDING && !retval)
+ || (dev->power.runtime_status == RPM_RESUMING && retval)) {
+ idx = device_links_read_lock();
+
+ fail:
+ rpm_put_suppliers(dev);
+
+ device_links_read_unlock(idx);
+ }
+
spin_lock_irq(&dev->power.lock);
+ }

return retval;
}
@@ -1447,6 +1521,79 @@ void pm_runtime_remove(struct device *de
}

/**
+ * pm_runtime_clean_up_links - Prepare links to consumers for driver removal.
+ * @dev: Device whose driver is going to be removed.
+ *
+ * Check links from this device to any consumers and if any of them have active
+ * runtime PM references to the device, drop the usage counter of the device
+ * (once per link).
+ *
+ * Links with the DL_FLAG_STATELESS flag set are ignored.
+ *
+ * Since the device is guaranteed to be runtime-active at the point this is
+ * called, nothing else needs to be done here.
+ *
+ * Moreover, this is called after device_links_busy() has returned 'false', so
+ * the status of each link is guaranteed to be DL_STATE_SUPPLIER_UNBIND and
+ * therefore rpm_active can't be manipulated concurrently.
+ */
+void pm_runtime_clean_up_links(struct device *dev)
+{
+ struct device_link *link;
+ int idx;
+
+ idx = device_links_read_lock();
+
+ list_for_each_entry_rcu(link, &dev->links.consumers, s_node) {
+ if (link->flags & DL_FLAG_STATELESS)
+ continue;
+
+ if (link->rpm_active) {
+ pm_runtime_put_noidle(dev);
+ link->rpm_active = false;
+ }
+ }
+
+ device_links_read_unlock(idx);
+}
+
+/**
+ * pm_runtime_get_suppliers - Resume and reference-count supplier devices.
+ * @dev: Consumer device.
+ */
+void pm_runtime_get_suppliers(struct device *dev)
+{
+ struct device_link *link;
+ int idx;
+
+ idx = device_links_read_lock();
+
+ list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
+ if (link->flags & DL_FLAG_PM_RUNTIME)
+ pm_runtime_get_sync(link->supplier);
+
+ device_links_read_unlock(idx);
+}
+
+/**
+ * pm_runtime_put_suppliers - Drop references to supplier devices.
+ * @dev: Consumer device.
+ */
+void pm_runtime_put_suppliers(struct device *dev)
+{
+ struct device_link *link;
+ int idx;
+
+ idx = device_links_read_lock();
+
+ list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
+ if (link->flags & DL_FLAG_PM_RUNTIME)
+ pm_runtime_put(link->supplier);
+
+ device_links_read_unlock(idx);
+}
+
+/**
* pm_runtime_force_suspend - Force a device into suspend state if needed.
* @dev: Device to suspend.
*
Index: linux-pm/include/linux/device.h
===================================================================
--- linux-pm.orig/include/linux/device.h
+++ linux-pm/include/linux/device.h
@@ -729,9 +729,13 @@ enum device_link_state {
*
* STATELESS: The core won't track the presence of supplier/consumer drivers.
* AUTOREMOVE: Remove this link automatically on consumer driver unbind.
+ * PM_RUNTIME: If set, the runtime PM framework will use this link.
+ * RPM_ACTIVE: Run pm_runtime_get_sync() on the supplier during link creation.
*/
#define DL_FLAG_STATELESS (1 << 0)
#define DL_FLAG_AUTOREMOVE (1 << 1)
+#define DL_FLAG_PM_RUNTIME (1 << 2)
+#define DL_FLAG_RPM_ACTIVE (1 << 3)

/**
* struct device_link - Device link representation.
@@ -750,6 +754,7 @@ struct device_link {
struct list_head c_node;
enum device_link_state status;
u32 flags;
+ bool rpm_active;
struct rcu_head rcu_head;
};

Index: linux-pm/include/linux/pm_runtime.h
===================================================================
--- linux-pm.orig/include/linux/pm_runtime.h
+++ linux-pm/include/linux/pm_runtime.h
@@ -55,6 +55,9 @@ extern unsigned long pm_runtime_autosusp
extern void pm_runtime_update_max_time_suspended(struct device *dev,
s64 delta_ns);
extern void pm_runtime_set_memalloc_noio(struct device *dev, bool enable);
+extern void pm_runtime_clean_up_links(struct device *dev);
+extern void pm_runtime_get_suppliers(struct device *dev);
+extern void pm_runtime_put_suppliers(struct device *dev);

static inline void pm_suspend_ignore_children(struct device *dev, bool enable)
{
@@ -186,6 +189,9 @@ static inline unsigned long pm_runtime_a
struct device *dev) { return 0; }
static inline void pm_runtime_set_memalloc_noio(struct device *dev,
bool enable){}
+static inline void pm_runtime_clean_up_links(struct device *dev) {}
+static inline void pm_runtime_get_suppliers(struct device *dev) {}
+static inline void pm_runtime_put_suppliers(struct device *dev) {}

#endif /* !CONFIG_PM */


2016-10-10 12:59:30

by Rafael J. Wysocki

[permalink] [raw]
Subject: [Rebase][PATCH v5 5/5] PM / runtime: Optimize the use of device links

From: Rafael J. Wysocki <[email protected]>

If the device has no links to suppliers that should be used for
runtime PM (links with DEVICE_LINK_PM_RUNTIME set), there is no
reason to walk the list of suppliers for that device during
runtime suspend and resume.

Add a simple mechanism to detect that case and possibly avoid the
extra unnecessary overhead.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/base/core.c | 20 +++++++++++++-------
drivers/base/power/runtime.c | 23 ++++++++++++++++++++---
include/linux/pm.h | 1 +
include/linux/pm_runtime.h | 4 ++++
4 files changed, 38 insertions(+), 10 deletions(-)

Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -170,14 +170,17 @@ struct device_link *device_link_add(stru
if (!link)
goto out;

- if ((flags & DL_FLAG_PM_RUNTIME) && (flags & DL_FLAG_RPM_ACTIVE)) {
- if (pm_runtime_get_sync(supplier) < 0) {
- pm_runtime_put_noidle(supplier);
- kfree(link);
- link = NULL;
- goto out;
+ if (flags & DL_FLAG_PM_RUNTIME) {
+ if (flags & DL_FLAG_RPM_ACTIVE) {
+ if (pm_runtime_get_sync(supplier) < 0) {
+ pm_runtime_put_noidle(supplier);
+ kfree(link);
+ link = NULL;
+ goto out;
+ }
+ link->rpm_active = true;
}
- link->rpm_active = true;
+ pm_runtime_new_link(consumer);
}
get_device(supplier);
link->supplier = supplier;
@@ -250,6 +253,9 @@ static void __device_link_del(struct dev
dev_info(link->consumer, "Dropping the link to %s\n",
dev_name(link->supplier));

+ if (link->flags & DL_FLAG_PM_RUNTIME)
+ pm_runtime_drop_link(link->consumer);
+
list_del_rcu(&link->s_node);
list_del_rcu(&link->c_node);
call_srcu(&device_links_srcu, &link->rcu_head, __device_link_free_srcu);
Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -305,6 +305,7 @@ static int __rpm_callback(int (*cb)(stru
__releases(&dev->power.lock) __acquires(&dev->power.lock)
{
int retval, idx;
+ bool use_links = dev->power.links_count > 0;

if (dev->power.irq_safe) {
spin_unlock(&dev->power.lock);
@@ -318,7 +319,7 @@ static int __rpm_callback(int (*cb)(stru
* routine returns, so it is safe to read the status outside of
* the lock.
*/
- if (dev->power.runtime_status == RPM_RESUMING) {
+ if (use_links && dev->power.runtime_status == RPM_RESUMING) {
idx = device_links_read_lock();

retval = rpm_get_suppliers(dev);
@@ -341,8 +342,9 @@ static int __rpm_callback(int (*cb)(stru
*
* Do that if resume fails too.
*/
- if ((dev->power.runtime_status == RPM_SUSPENDING && !retval)
- || (dev->power.runtime_status == RPM_RESUMING && retval)) {
+ if (use_links
+ && ((dev->power.runtime_status == RPM_SUSPENDING && !retval)
+ || (dev->power.runtime_status == RPM_RESUMING && retval))) {
idx = device_links_read_lock();

fail:
@@ -1593,6 +1595,21 @@ void pm_runtime_put_suppliers(struct dev
device_links_read_unlock(idx);
}

+void pm_runtime_new_link(struct device *dev)
+{
+ spin_lock_irq(&dev->power.lock);
+ dev->power.links_count++;
+ spin_unlock_irq(&dev->power.lock);
+}
+
+void pm_runtime_drop_link(struct device *dev)
+{
+ spin_lock_irq(&dev->power.lock);
+ WARN_ON(dev->power.links_count == 0);
+ dev->power.links_count--;
+ spin_unlock_irq(&dev->power.lock);
+}
+
/**
* pm_runtime_force_suspend - Force a device into suspend state if needed.
* @dev: Device to suspend.
Index: linux-pm/include/linux/pm.h
===================================================================
--- linux-pm.orig/include/linux/pm.h
+++ linux-pm/include/linux/pm.h
@@ -597,6 +597,7 @@ struct dev_pm_info {
unsigned int use_autosuspend:1;
unsigned int timer_autosuspends:1;
unsigned int memalloc_noio:1;
+ unsigned int links_count;
enum rpm_request request;
enum rpm_status runtime_status;
int runtime_error;
Index: linux-pm/include/linux/pm_runtime.h
===================================================================
--- linux-pm.orig/include/linux/pm_runtime.h
+++ linux-pm/include/linux/pm_runtime.h
@@ -58,6 +58,8 @@ extern void pm_runtime_set_memalloc_noio
extern void pm_runtime_clean_up_links(struct device *dev);
extern void pm_runtime_get_suppliers(struct device *dev);
extern void pm_runtime_put_suppliers(struct device *dev);
+extern void pm_runtime_new_link(struct device *dev);
+extern void pm_runtime_drop_link(struct device *dev);

static inline void pm_suspend_ignore_children(struct device *dev, bool enable)
{
@@ -192,6 +194,8 @@ static inline void pm_runtime_set_memall
static inline void pm_runtime_clean_up_links(struct device *dev) {}
static inline void pm_runtime_get_suppliers(struct device *dev) {}
static inline void pm_runtime_put_suppliers(struct device *dev) {}
+static inline void pm_runtime_new_link(struct device *dev) {}
+static inline void pm_runtime_drop_link(struct device *dev) {}

#endif /* !CONFIG_PM */


2016-10-10 12:59:29

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH v5 3/5] PM / sleep: Make async suspend/resume of devices use device links

From: Rafael J. Wysocki <[email protected]>

Make the device suspend/resume part of the core system
suspend/resume code use device links to ensure that supplier
and consumer devices will be suspended and resumed in the right
order in case of async suspend/resume.

The idea, roughly, is to use dpm_wait() to wait for all consumers
before a supplier device suspend and to wait for all suppliers
before a consumer device resume.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---

v4 -> v5:

- Rebase, change the symbol names in accordance with patch [2/5] etc.

- Use READ_ONCE() to annotate "unprotected" accesses to link status data.

---
drivers/base/power/main.c | 85 ++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 79 insertions(+), 6 deletions(-)

Index: linux-pm/drivers/base/power/main.c
===================================================================
--- linux-pm.orig/drivers/base/power/main.c
+++ linux-pm/drivers/base/power/main.c
@@ -246,6 +246,62 @@ static void dpm_wait_for_children(struct
device_for_each_child(dev, &async, dpm_wait_fn);
}

+static void dpm_wait_for_suppliers(struct device *dev, bool async)
+{
+ struct device_link *link;
+ int idx;
+
+ idx = device_links_read_lock();
+
+ /*
+ * If the supplier goes away right after we've checked the link to it,
+ * we'll wait for its completion to change the state, but that's fine,
+ * because the only things that will block as a result are the SRCU
+ * callbacks freeing the link objects for the links in the list we're
+ * walking.
+ */
+ list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
+ if (READ_ONCE(link->status) != DL_STATE_DORMANT)
+ dpm_wait(link->supplier, async);
+
+ device_links_read_unlock(idx);
+}
+
+static void dpm_wait_for_superior(struct device *dev, bool async)
+{
+ dpm_wait(dev->parent, async);
+ dpm_wait_for_suppliers(dev, async);
+}
+
+static void dpm_wait_for_consumers(struct device *dev, bool async)
+{
+ struct device_link *link;
+ int idx;
+
+ idx = device_links_read_lock();
+
+ /*
+ * The status of a device link can only be changed from "dormant" by a
+ * probe, but that cannot happen during system suspend/resume. In
+ * theory it can change to "dormant" at that time, but then it is
+ * reasonable to wait for the target device anyway (eg. if it goes
+ * away, it's better to wait for it to go away completely and then
+ * continue instead of trying to continue in parallel with its
+ * unregistration).
+ */
+ list_for_each_entry_rcu(link, &dev->links.consumers, s_node)
+ if (READ_ONCE(link->status) != DL_STATE_DORMANT)
+ dpm_wait(link->consumer, async);
+
+ device_links_read_unlock(idx);
+}
+
+static void dpm_wait_for_subordinate(struct device *dev, bool async)
+{
+ dpm_wait_for_children(dev, async);
+ dpm_wait_for_consumers(dev, async);
+}
+
/**
* pm_op - Return the PM operation appropriate for given PM event.
* @ops: PM operations to choose from.
@@ -490,7 +546,7 @@ static int device_resume_noirq(struct de
if (!dev->power.is_noirq_suspended)
goto Out;

- dpm_wait(dev->parent, async);
+ dpm_wait_for_superior(dev, async);

if (dev->pm_domain) {
info = "noirq power domain ";
@@ -620,7 +676,7 @@ static int device_resume_early(struct de
if (!dev->power.is_late_suspended)
goto Out;

- dpm_wait(dev->parent, async);
+ dpm_wait_for_superior(dev, async);

if (dev->pm_domain) {
info = "early power domain ";
@@ -752,7 +808,7 @@ static int device_resume(struct device *
goto Complete;
}

- dpm_wait(dev->parent, async);
+ dpm_wait_for_superior(dev, async);
dpm_watchdog_set(&wd, dev);
device_lock(dev);

@@ -1040,7 +1096,7 @@ static int __device_suspend_noirq(struct
if (dev->power.syscore || dev->power.direct_complete)
goto Complete;

- dpm_wait_for_children(dev, async);
+ dpm_wait_for_subordinate(dev, async);

if (dev->pm_domain) {
info = "noirq power domain ";
@@ -1187,7 +1243,7 @@ static int __device_suspend_late(struct
if (dev->power.syscore || dev->power.direct_complete)
goto Complete;

- dpm_wait_for_children(dev, async);
+ dpm_wait_for_subordinate(dev, async);

if (dev->pm_domain) {
info = "late power domain ";
@@ -1344,6 +1400,22 @@ static int legacy_suspend(struct device
return error;
}

+static void dpm_clear_suppliers_direct_complete(struct device *dev)
+{
+ struct device_link *link;
+ int idx;
+
+ idx = device_links_read_lock();
+
+ list_for_each_entry_rcu(link, &dev->links.suppliers, c_node) {
+ spin_lock_irq(&link->supplier->power.lock);
+ link->supplier->power.direct_complete = false;
+ spin_unlock_irq(&link->supplier->power.lock);
+ }
+
+ device_links_read_unlock(idx);
+}
+
/**
* device_suspend - Execute "suspend" callbacks for given device.
* @dev: Device to handle.
@@ -1360,7 +1432,7 @@ static int __device_suspend(struct devic
TRACE_DEVICE(dev);
TRACE_SUSPEND(0);

- dpm_wait_for_children(dev, async);
+ dpm_wait_for_subordinate(dev, async);

if (async_error)
goto Complete;
@@ -1456,6 +1528,7 @@ static int __device_suspend(struct devic

spin_unlock_irq(&parent->power.lock);
}
+ dpm_clear_suppliers_direct_complete(dev);
}

device_unlock(dev);

2016-10-10 12:59:28

by Rafael J. Wysocki

[permalink] [raw]
Subject: [Resend][PATCH v5 1/5] driver core: Add a wrapper around __device_release_driver()

From: Rafael J. Wysocki <[email protected]>

Add an internal wrapper around __device_release_driver() that will
acquire device locks and do the necessary checks before calling it.

The next patch will make use of it.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/base/dd.c | 30 ++++++++++++++++++------------
1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 16688f50729c..d9e76e9205c7 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -796,6 +796,22 @@ static void __device_release_driver(struct device *dev)
}
}

+static void device_release_driver_internal(struct device *dev,
+ struct device_driver *drv,
+ struct device *parent)
+{
+ if (parent)
+ device_lock(parent);
+
+ device_lock(dev);
+ if (!drv || drv == dev->driver)
+ __device_release_driver(dev);
+
+ device_unlock(dev);
+ if (parent)
+ device_unlock(parent);
+}
+
/**
* device_release_driver - manually detach device from driver.
* @dev: device.
@@ -810,9 +826,7 @@ void device_release_driver(struct device *dev)
* within their ->remove callback for the same device, they
* will deadlock right here.
*/
- device_lock(dev);
- __device_release_driver(dev);
- device_unlock(dev);
+ device_release_driver_internal(dev, NULL, NULL);
}
EXPORT_SYMBOL_GPL(device_release_driver);

@@ -837,15 +851,7 @@ void driver_detach(struct device_driver *drv)
dev = dev_prv->device;
get_device(dev);
spin_unlock(&drv->p->klist_devices.k_lock);
-
- if (dev->parent) /* Needed for USB */
- device_lock(dev->parent);
- device_lock(dev);
- if (dev->driver == drv)
- __device_release_driver(dev);
- device_unlock(dev);
- if (dev->parent)
- device_unlock(dev->parent);
+ device_release_driver_internal(dev, drv, dev->parent);
put_device(dev);
}
}

2016-10-10 12:59:27

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH v5 2/5] driver core: Functional dependencies tracking support

From: Rafael J. Wysocki <[email protected]>

Currently, there is a problem with taking functional dependencies
between devices into account.

What I mean by a "functional dependency" is when the driver of device
B needs device A to be functional and (generally) its driver to be
present in order to work properly. This has certain consequences
for power management (suspend/resume and runtime PM ordering) and
shutdown ordering of these devices. In general, it also implies that
the driver of A needs to be working for B to be probed successfully
and it cannot be unbound from the device before the B's driver.

Support for representing those functional dependencies between
devices is added here to allow the driver core to track them and act
on them in certain cases where applicable.

The argument for doing that in the driver core is that there are
quite a few distinct use cases involving device dependencies, they
are relatively hard to get right in a driver (if one wants to
address all of them properly) and it only gets worse if multiplied
by the number of drivers potentially needing to do it. Morever, at
least one case (asynchronous system suspend/resume) cannot be handled
in a single driver at all, because it requires the driver of A to
wait for B to suspend (during system suspend) and the driver of B to
wait for A to resume (during system resume).

For this reason, represent dependencies between devices as "links",
with the help of struct device_link objects each containing pointers
to the "linked" devices, a list node for each of them, status
information, flags, and an RCU head for synchronization.

Also add two new list heads, representing the lists of links to the
devices that depend on the given one (consumers) and to the devices
depended on by it (suppliers), and a "driver presence status" field
(needed for figuring out initial states of device links) to struct
device.

The entire data structure consisting of all of the lists of link
objects for all devices is protected by a mutex (for link object
addition/removal and for list walks during device driver probing
and removal) and by SRCU (for list walking in other case that will
be introduced by subsequent change sets). In addition, each link
object has an internal status field whose value reflects whether or
not drivers are bound to the devices pointed to by the link or
probing/removal of their drivers is in progress etc. That field
is only modified under the device links mutex, but it may be read
outside of it in some cases (introduced by subsequent change sets),
so modifications of it are annotated with WRITE_ONCE().

New links are added by calling device_link_add() which takes three
arguments: pointers to the devices in question and flags. In
particular, if DL_FLAG_STATELESS is set in the flags, the link status
is not to be taken into account for this link and the driver core
will not manage it. In turn, if DL_FLAG_AUTOREMOVE is set in the
flags, the driver core will remove the link automatically when the
consumer device driver unbinds from it.

One of the actions carried out by device_link_add() is to reorder
the lists used for device shutdown and system suspend/resume to
put the consumer device along with all of its children and all of
its consumers (and so on, recursively) to the ends of those lists
in order to ensure the right ordering between all of the supplier
and consumer devices.

For this reason, it is not possible to create a link between two
devices if the would-be supplier device already depends on the
would-be consumer device as either a direct descendant of it or a
consumer of one of its direct descendants or one of its consumers
and so on.

There are two types of link objects, persistent and non-persistent.
The persistent ones stay around until one of the target devices is
deleted, while the non-persistent ones are removed automatically when
the consumer driver unbinds from its device (ie. they are assumed to
be valid only as long as the consumer device has a driver bound to
it). Persistent links are created by default and non-persistent
links are created when the DL_FLAG_AUTOREMOVE flag is passed
to device_link_add().

Both persistent and non-persistent device links can be deleted
with an explicit call to device_link_del().

Links created without the DL_FLAG_STATELESS flag set are managed
by the driver core using a simple state machine. There are 5 states
each link can be in: DORMANT (unused), AVAILABLE (the supplier driver
is present and functional), CONSUMER_PROBE (the consumer driver is
probing), ACTIVE (both supplier and consumer drivers are present and
functional), and SUPPLIER_UNBIND (the supplier driver is unbinding).
The driver core updates the link state automatically depending on
what happens to the linked devices and for each link state specific
actions are taken in addition to that.

For example, if the supplier driver unbinds from its device, the
driver core will also unbind the drivers of all of its consumers
automatically under the assumption that they cannot function
properly without the supplier. Analogously, the driver core will
only allow the consumer driver to bind to its device if the
supplier driver is present and functional (ie. the link is in
the AVAILABLE state). If that's not the case, it will rely on
the existing deferred probing mechanism to wait for the supplier
driver to become available.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---

v4 -> v5:

- Redefine device_link_add() to take three arguments, the supplier and consumer
pointers and flags, and to figure out the initial link state automatically.

- Move the links-related fields in struct device to a separate sub-structure
and add a "driver presence tracking" field to it (to help device_link_add()
to do its job).

- Modify device_links_check_suppliers(), device_links_driver_bound(),
device_links_no_driver(), device_links_driver_cleanup(), device_links_busy(),
and device_links_unbind_consumers() to walk link lists under device_links_lock
(to make the new "driver presence tracking" mechanism work reliably).

- Drop the (not necessary any more) spinlock from struct device_link.

- Rename symbols to better reflect their purpose (flags vs link states etc).

v3 -> v4:

- Add the in_dpm_list field to struct dev_pm_info and use it for checking
if the device has been added to dpm_list already, which is needed for
handling the "not registered consumer" case in device_link_add().

- Rework device_link_add() to handle consumer devices that have not been
registered before calling it.

- Drop the parent check from device_link_add().

- Rename device_links_driver_gone() to device_links_driver_cleanup()

- Rearrange device_links_unbind_consumers() to avoid one spin_unlock()
instance (which wasn't necessary).

- Introduce device_links_purge() to delete links from a device going away
and make device_del() call it (instead of running links-related code
directly).

- Move the invocation of device_links_ckeck_suppliers() to really_probe().

- Change the description of the DEVICE_LINK_STATELESS flag.

---
drivers/base/base.h | 13 +
drivers/base/core.c | 491 +++++++++++++++++++++++++++++++++++++++++++++
drivers/base/dd.c | 41 +++
drivers/base/power/main.c | 2
drivers/base/power/power.h | 10
include/linux/device.h | 78 +++++++
include/linux/pm.h | 1
7 files changed, 631 insertions(+), 5 deletions(-)

Index: linux-pm/drivers/base/base.h
===================================================================
--- linux-pm.orig/drivers/base/base.h
+++ linux-pm/drivers/base/base.h
@@ -107,6 +107,9 @@ extern void bus_remove_device(struct dev

extern int bus_add_driver(struct device_driver *drv);
extern void bus_remove_driver(struct device_driver *drv);
+extern void device_release_driver_internal(struct device *dev,
+ struct device_driver *drv,
+ struct device *parent);

extern void driver_detach(struct device_driver *drv);
extern int driver_probe_device(struct device_driver *drv, struct device *dev);
@@ -152,3 +155,13 @@ extern int devtmpfs_init(void);
#else
static inline int devtmpfs_init(void) { return 0; }
#endif
+
+/* Device links support */
+extern int device_links_read_lock(void);
+extern void device_links_read_unlock(int idx);
+extern int device_links_check_suppliers(struct device *dev);
+extern void device_links_driver_bound(struct device *dev);
+extern void device_links_driver_cleanup(struct device *dev);
+extern void device_links_no_driver(struct device *dev);
+extern bool device_links_busy(struct device *dev);
+extern void device_links_unbind_consumers(struct device *dev);
Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -44,6 +44,492 @@ static int __init sysfs_deprecated_setup
early_param("sysfs.deprecated", sysfs_deprecated_setup);
#endif

+/* Device links support. */
+
+static DEFINE_MUTEX(device_links_lock);
+DEFINE_STATIC_SRCU(device_links_srcu);
+
+int device_links_read_lock(void)
+{
+ return srcu_read_lock(&device_links_srcu);
+}
+
+void device_links_read_unlock(int idx)
+{
+ return srcu_read_unlock(&device_links_srcu, idx);
+}
+
+/**
+ * device_is_dependent - Check if one device depends on another one
+ * @dev: Device to check dependencies for.
+ * @target: Device to check against.
+ *
+ * Check if @target depends on @dev or any device dependent on it (its child or
+ * its consumer etc). Return 1 if that is the case or 0 otherwise.
+ */
+static int device_is_dependent(struct device *dev, void *target)
+{
+ struct device_link *link;
+ int ret;
+
+ if (WARN_ON(dev == target))
+ return 1;
+
+ ret = device_for_each_child(dev, target, device_is_dependent);
+ if (ret)
+ return ret;
+
+ list_for_each_entry(link, &dev->links.consumers, s_node) {
+ if (WARN_ON(link->consumer == target))
+ return 1;
+
+ ret = device_is_dependent(link->consumer, target);
+ if (ret)
+ break;
+ }
+ return ret;
+}
+
+static int device_reorder_to_tail(struct device *dev, void *not_used)
+{
+ struct device_link *link;
+
+ /*
+ * Devices that have not been registered yet will be put to the ends
+ * of the lists during the registration, so skip them here.
+ */
+ if (device_is_registered(dev))
+ devices_kset_move_last(dev);
+
+ if (device_pm_initialized(dev))
+ device_pm_move_last(dev);
+
+ device_for_each_child(dev, NULL, device_reorder_to_tail);
+ list_for_each_entry(link, &dev->links.consumers, s_node)
+ device_reorder_to_tail(link->consumer, NULL);
+
+ return 0;
+}
+
+/**
+ * device_link_add - Create a link between two devices.
+ * @consumer: Consumer end of the link.
+ * @supplier: Supplier end of the link.
+ * @flags: Link flags.
+ *
+ * If the DL_FLAG_AUTOREMOVE is set, the link will be removed automatically
+ * when the consumer device driver unbinds from it. The combination of both
+ * DL_FLAG_AUTOREMOVE and DL_FLAG_STATELESS set is invalid and will cause NULL
+ * to be returned.
+ *
+ * A side effect of the link creation is re-ordering of dpm_list and the
+ * devices_kset list by moving the consumer device and all devices depending
+ * on it to the ends of these lists (that does not happen to devices that have
+ * not been registered when this function is called).
+ *
+ * The supplier device is required to be registered when this function is called
+ * and NULL will be returned if that is not the case. The consumer device need
+ * not be registerd, however.
+ */
+struct device_link *device_link_add(struct device *consumer,
+ struct device *supplier, u32 flags)
+{
+ struct device_link *link;
+
+ if (!consumer || !supplier ||
+ ((flags & DL_FLAG_STATELESS) && (flags & DL_FLAG_AUTOREMOVE)))
+ return NULL;
+
+ mutex_lock(&device_links_lock);
+ device_pm_lock();
+
+ /*
+ * If the supplier has not been fully registered yet or there is a
+ * reverse dependency between the consumer and the supplier already in
+ * the graph, return NULL.
+ */
+ if (!device_pm_initialized(supplier)
+ || device_is_dependent(consumer, supplier)) {
+ link = NULL;
+ goto out;
+ }
+
+ list_for_each_entry(link, &supplier->links.consumers, s_node)
+ if (link->consumer == consumer)
+ goto out;
+
+ link = kmalloc(sizeof(*link), GFP_KERNEL);
+ if (!link)
+ goto out;
+
+ get_device(supplier);
+ link->supplier = supplier;
+ INIT_LIST_HEAD(&link->s_node);
+ get_device(consumer);
+ link->consumer = consumer;
+ INIT_LIST_HEAD(&link->c_node);
+ link->flags = flags;
+
+ /* Deterine the initial link state. */
+ if (flags & DL_FLAG_STATELESS) {
+ link->status = DL_STATE_NONE;
+ } else {
+ switch (supplier->links.status) {
+ case DL_DEV_DRIVER_BOUND:
+ switch (consumer->links.status) {
+ case DL_DEV_PROBING:
+ link->status = DL_STATE_CONSUMER_PROBE;
+ break;
+ case DL_DEV_DRIVER_BOUND:
+ link->status = DL_STATE_ACTIVE;
+ break;
+ default:
+ link->status = DL_STATE_AVAILABLE;
+ break;
+ }
+ break;
+ case DL_DEV_UNBINDING:
+ link->status = DL_STATE_SUPPLIER_UNBIND;
+ break;
+ default:
+ link->status = DL_STATE_DORMANT;
+ break;
+ }
+ }
+
+ /*
+ * Move the consumer and all of the devices depending on it to the end
+ * of dpm_list and the devices_kset list.
+ *
+ * It is necessary to hold dpm_list locked throughout all that or else
+ * we may end up suspending with a wrong ordering of it.
+ */
+ device_reorder_to_tail(consumer, NULL);
+
+ list_add_tail_rcu(&link->s_node, &supplier->links.consumers);
+ list_add_tail_rcu(&link->c_node, &consumer->links.suppliers);
+
+ dev_info(consumer, "Linked as a consumer to %s\n", dev_name(supplier));
+
+ out:
+ device_pm_unlock();
+ mutex_unlock(&device_links_lock);
+ return link;
+}
+EXPORT_SYMBOL_GPL(device_link_add);
+
+static void __device_link_free_srcu(struct rcu_head *rhead)
+{
+ struct device_link *link;
+
+ link = container_of(rhead, struct device_link, rcu_head);
+ put_device(link->consumer);
+ put_device(link->supplier);
+ kfree(link);
+}
+
+static void __device_link_del(struct device_link *link)
+{
+ dev_info(link->consumer, "Dropping the link to %s\n",
+ dev_name(link->supplier));
+
+ list_del_rcu(&link->s_node);
+ list_del_rcu(&link->c_node);
+ call_srcu(&device_links_srcu, &link->rcu_head, __device_link_free_srcu);
+}
+
+/**
+ * device_link_del - Delete a link between two devices.
+ * @link: Device link to delete.
+ *
+ * The caller must ensure proper synchronization of this function with runtime
+ * PM.
+ */
+void device_link_del(struct device_link *link)
+{
+ mutex_lock(&device_links_lock);
+ device_pm_lock();
+ __device_link_del(link);
+ device_pm_unlock();
+ mutex_unlock(&device_links_lock);
+}
+EXPORT_SYMBOL_GPL(device_link_del);
+
+static void device_links_missing_supplier(struct device *dev)
+{
+ struct device_link *link;
+
+ list_for_each_entry(link, &dev->links.suppliers, c_node)
+ if (link->status == DL_STATE_CONSUMER_PROBE)
+ WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
+}
+
+/**
+ * device_links_check_suppliers - Check presence of supplier drivers.
+ * @dev: Consumer device.
+ *
+ * Check links from this device to any suppliers. Walk the list of the device's
+ * links to suppliers and see if all of them are available. If not, simply
+ * return -EPROBE_DEFER.
+ *
+ * We need to guarantee that the supplier will not go away after the check has
+ * been positive here. It only can go away in __device_release_driver() and
+ * that function checks the device's links to consumers. This means we need to
+ * mark the link as "consumer probe in progress" to make the supplier removal
+ * wait for us to complete (or bad things may happen).
+ *
+ * Links with the DL_FLAG_STATELESS flag set are ignored.
+ */
+int device_links_check_suppliers(struct device *dev)
+{
+ struct device_link *link;
+ int ret = 0;
+
+ mutex_lock(&device_links_lock);
+
+ list_for_each_entry(link, &dev->links.suppliers, c_node) {
+ if (link->flags & DL_FLAG_STATELESS)
+ continue;
+
+ if (link->status != DL_STATE_AVAILABLE) {
+ device_links_missing_supplier(dev);
+ ret = -EPROBE_DEFER;
+ break;
+ }
+ WRITE_ONCE(link->status, DL_STATE_CONSUMER_PROBE);
+ }
+ dev->links.status = DL_DEV_PROBING;
+
+ mutex_unlock(&device_links_lock);
+ return ret;
+}
+
+/**
+ * device_links_driver_bound - Update device links after probing its driver.
+ * @dev: Device to update the links for.
+ *
+ * The probe has been successful, so update links from this device to any
+ * consumers by changing their status to "available".
+ *
+ * Also change the status of @dev's links to suppliers to "active".
+ *
+ * Links with the DL_FLAG_STATELESS flag set are ignored.
+ */
+void device_links_driver_bound(struct device *dev)
+{
+ struct device_link *link;
+
+ mutex_lock(&device_links_lock);
+
+ list_for_each_entry(link, &dev->links.consumers, s_node) {
+ if (link->flags & DL_FLAG_STATELESS)
+ continue;
+
+ WARN_ON(link->status != DL_STATE_DORMANT);
+ WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
+ }
+
+ list_for_each_entry(link, &dev->links.suppliers, c_node) {
+ if (link->flags & DL_FLAG_STATELESS)
+ continue;
+
+ WARN_ON(link->status != DL_STATE_CONSUMER_PROBE);
+ WRITE_ONCE(link->status, DL_STATE_ACTIVE);
+ }
+
+ dev->links.status = DL_DEV_DRIVER_BOUND;
+
+ mutex_unlock(&device_links_lock);
+}
+
+/**
+ * __device_links_no_driver - Update links of a device without a driver.
+ * @dev: Device without a drvier.
+ *
+ * Delete all non-persistent links from this device to any suppliers.
+ *
+ * Persistent links stay around, but their status is changed to "available",
+ * unless they already are in the "supplier unbind in progress" state in which
+ * case they need not be updated.
+ *
+ * Links with the DL_FLAG_STATELESS flag set are ignored.
+ */
+static void __device_links_no_driver(struct device *dev)
+{
+ struct device_link *link, *ln;
+
+ list_for_each_entry_safe_reverse(link, ln, &dev->links.suppliers, c_node) {
+ if (link->flags & DL_FLAG_STATELESS)
+ continue;
+
+ if (link->flags & DL_FLAG_AUTOREMOVE)
+ __device_link_del(link);
+ else if (link->status != DL_STATE_SUPPLIER_UNBIND)
+ WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
+ }
+
+ dev->links.status = DL_DEV_NO_DRIVER;
+}
+
+void device_links_no_driver(struct device *dev)
+{
+ mutex_lock(&device_links_lock);
+ __device_links_no_driver(dev);
+ mutex_unlock(&device_links_lock);
+}
+
+/**
+ * device_links_driver_cleanup - Update links after driver removal.
+ * @dev: Device whose driver has just gone away.
+ *
+ * Update links to consumers for @dev by changing their status to "dormant" and
+ * invoke %__device_links_no_driver() to update links to suppliers for it as
+ * appropriate.
+ *
+ * Links with the DL_FLAG_STATELESS flag set are ignored.
+ */
+void device_links_driver_cleanup(struct device *dev)
+{
+ struct device_link *link;
+
+ mutex_lock(&device_links_lock);
+
+ list_for_each_entry(link, &dev->links.consumers, s_node) {
+ if (link->flags & DL_FLAG_STATELESS)
+ continue;
+
+ WARN_ON(link->flags & DL_FLAG_AUTOREMOVE);
+ WARN_ON(link->status != DL_STATE_SUPPLIER_UNBIND);
+ WRITE_ONCE(link->status, DL_STATE_DORMANT);
+ }
+
+ __device_links_no_driver(dev);
+
+ mutex_unlock(&device_links_lock);
+}
+
+/**
+ * device_links_busy - Check if there are any busy links to consumers.
+ * @dev: Device to check.
+ *
+ * Check each consumer of the device and return 'true' if its link's status
+ * is one of "consumer probe" or "active" (meaning that the given consumer is
+ * probing right now or its driver is present). Otherwise, change the link
+ * state to "supplier unbind" to prevent the consumer from being probed
+ * successfully going forward.
+ *
+ * Return 'false' if there are no probing or active consumers.
+ *
+ * Links with the DL_FLAG_STATELESS flag set are ignored.
+ */
+bool device_links_busy(struct device *dev)
+{
+ struct device_link *link;
+ bool ret = false;
+
+ mutex_lock(&device_links_lock);
+
+ list_for_each_entry(link, &dev->links.consumers, s_node) {
+ if (link->flags & DL_FLAG_STATELESS)
+ continue;
+
+ if (link->status == DL_STATE_CONSUMER_PROBE
+ || link->status == DL_STATE_ACTIVE) {
+ ret = true;
+ break;
+ }
+ WRITE_ONCE(link->status, DL_STATE_SUPPLIER_UNBIND);
+ }
+
+ dev->links.status = DL_DEV_UNBINDING;
+
+ mutex_unlock(&device_links_lock);
+ return ret;
+}
+
+/**
+ * device_links_unbind_consumers - Force unbind consumers of the given device.
+ * @dev: Device to unbind the consumers of.
+ *
+ * Walk the list of links to consumers for @dev and if any of them is in the
+ * "consumer probe" state, wait for all device probes in progress to complete
+ * and start over.
+ *
+ * If that's not the case, change the status of the link to "supplier unbind"
+ * and check if the link was in the "active" state. If so, force the consumer
+ * driver to unbind and start over (the consumer will not re-probe as we have
+ * changed the state of the link already).
+ *
+ * Links with the DL_FLAG_STATELESS flag set are ignored.
+ */
+void device_links_unbind_consumers(struct device *dev)
+{
+ struct device_link *link;
+
+ start:
+ mutex_lock(&device_links_lock);
+
+ list_for_each_entry(link, &dev->links.consumers, s_node) {
+ enum device_link_state status;
+
+ if (link->flags & DL_FLAG_STATELESS)
+ continue;
+
+ status = link->status;
+ if (status == DL_STATE_CONSUMER_PROBE) {
+ mutex_unlock(&device_links_lock);
+
+ wait_for_device_probe();
+ goto start;
+ }
+ WRITE_ONCE(link->status, DL_STATE_SUPPLIER_UNBIND);
+ if (status == DL_STATE_ACTIVE) {
+ struct device *consumer = link->consumer;
+
+ get_device(consumer);
+
+ mutex_unlock(&device_links_lock);
+
+ device_release_driver_internal(consumer, NULL,
+ consumer->parent);
+ put_device(consumer);
+ goto start;
+ }
+ }
+
+ mutex_unlock(&device_links_lock);
+}
+
+/**
+ * device_links_purge - Delete existing links to other devices.
+ * @dev: Target device.
+ */
+static void device_links_purge(struct device *dev)
+{
+ struct device_link *link, *ln;
+
+ /*
+ * Delete all of the remaining links from this device to any other
+ * devices (either consumers or suppliers).
+ */
+ mutex_lock(&device_links_lock);
+
+ list_for_each_entry_safe_reverse(link, ln, &dev->links.suppliers, c_node) {
+ WARN_ON(link->status == DL_STATE_ACTIVE);
+ __device_link_del(link);
+ }
+
+ list_for_each_entry_safe_reverse(link, ln, &dev->links.consumers, s_node) {
+ WARN_ON(link->status != DL_STATE_DORMANT &&
+ link->status != DL_STATE_NONE);
+ __device_link_del(link);
+ }
+
+ mutex_unlock(&device_links_lock);
+}
+
+/* Device links support end. */
+
int (*platform_notify)(struct device *dev) = NULL;
int (*platform_notify_remove)(struct device *dev) = NULL;
static struct kobject *dev_kobj;
@@ -711,6 +1197,9 @@ void device_initialize(struct device *de
#ifdef CONFIG_GENERIC_MSI_IRQ
INIT_LIST_HEAD(&dev->msi_list);
#endif
+ INIT_LIST_HEAD(&dev->links.consumers);
+ INIT_LIST_HEAD(&dev->links.suppliers);
+ dev->links.status = DL_DEV_NO_DRIVER;
}
EXPORT_SYMBOL_GPL(device_initialize);

@@ -1240,6 +1729,8 @@ void device_del(struct device *dev)
if (dev->bus)
blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
BUS_NOTIFY_DEL_DEVICE, dev);
+
+ device_links_purge(dev);
dpm_sysfs_remove(dev);
if (parent)
klist_del(&dev->p->knode_parent);
Index: linux-pm/drivers/base/dd.c
===================================================================
--- linux-pm.orig/drivers/base/dd.c
+++ linux-pm/drivers/base/dd.c
@@ -249,6 +249,7 @@ static void driver_bound(struct device *
__func__, dev_name(dev));

klist_add_tail(&dev->p->knode_driver, &dev->driver->p->klist_devices);
+ device_links_driver_bound(dev);

device_pm_check_callbacks(dev);

@@ -341,6 +342,10 @@ static int really_probe(struct device *d
return ret;
}

+ ret = device_links_check_suppliers(dev);
+ if (ret)
+ return ret;
+
atomic_inc(&probe_count);
pr_debug("bus: '%s': %s: probing driver %s with device %s\n",
drv->bus->name, __func__, drv->name, dev_name(dev));
@@ -399,6 +404,7 @@ probe_failed:
blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
BUS_NOTIFY_DRIVER_NOT_BOUND, dev);
pinctrl_bind_failed:
+ device_links_no_driver(dev);
devres_release_all(dev);
driver_sysfs_remove(dev);
dev->driver = NULL;
@@ -756,7 +762,7 @@ EXPORT_SYMBOL_GPL(driver_attach);
* __device_release_driver() must be called with @dev lock held.
* When called for a USB interface, @dev->parent lock must be held as well.
*/
-static void __device_release_driver(struct device *dev)
+static void __device_release_driver(struct device *dev, struct device *parent)
{
struct device_driver *drv;

@@ -765,6 +771,25 @@ static void __device_release_driver(stru
if (driver_allows_async_probing(drv))
async_synchronize_full();

+ while (device_links_busy(dev)) {
+ device_unlock(dev);
+ if (parent)
+ device_unlock(parent);
+
+ device_links_unbind_consumers(dev);
+ if (parent)
+ device_lock(parent);
+
+ device_lock(dev);
+ /*
+ * A concurrent invocation of the same function might
+ * have released the driver successfully while this one
+ * was waiting, so check for that.
+ */
+ if (dev->driver != drv)
+ return;
+ }
+
pm_runtime_get_sync(dev);

driver_sysfs_remove(dev);
@@ -780,6 +805,8 @@ static void __device_release_driver(stru
dev->bus->remove(dev);
else if (drv->remove)
drv->remove(dev);
+
+ device_links_driver_cleanup(dev);
devres_release_all(dev);
dev->driver = NULL;
dev_set_drvdata(dev, NULL);
@@ -796,16 +823,16 @@ static void __device_release_driver(stru
}
}

-static void device_release_driver_internal(struct device *dev,
- struct device_driver *drv,
- struct device *parent)
+void device_release_driver_internal(struct device *dev,
+ struct device_driver *drv,
+ struct device *parent)
{
if (parent)
device_lock(parent);

device_lock(dev);
if (!drv || drv == dev->driver)
- __device_release_driver(dev);
+ __device_release_driver(dev, parent);

device_unlock(dev);
if (parent)
@@ -818,6 +845,10 @@ static void device_release_driver_intern
*
* Manually detach device from driver.
* When called for a USB interface, @dev->parent lock must be held.
+ *
+ * If this function is to be called with @dev->parent lock held, ensure that
+ * the device's consumers are unbound in advance or that their locks can be
+ * acquired under the @dev->parent lock.
*/
void device_release_driver(struct device *dev)
{
Index: linux-pm/include/linux/device.h
===================================================================
--- linux-pm.orig/include/linux/device.h
+++ linux-pm/include/linux/device.h
@@ -707,6 +707,79 @@ struct device_dma_parameters {
};

/**
+ * enum device_link_state - Device link states.
+ * @NONE: The presence of the drivers is not being tracked.
+ * @DORMANT: None of the supplier/consumer drivers is present.
+ * @AVAILABLE: The supplier driver is present, but the consumer is not.
+ * @CONSUMER_PROBE: The supplier driver is present and the consumer is probing.
+ * @ACTIVE: Active link; both the supplier and consumer drivers are present.
+ * @SUPPLIER_UNBIND: The supplier driver is unbinding.
+ */
+enum device_link_state {
+ DL_STATE_NONE = -1,
+ DL_STATE_DORMANT = 0,
+ DL_STATE_AVAILABLE,
+ DL_STATE_CONSUMER_PROBE,
+ DL_STATE_ACTIVE,
+ DL_STATE_SUPPLIER_UNBIND,
+};
+
+/*
+ * Device link flags.
+ *
+ * STATELESS: The core won't track the presence of supplier/consumer drivers.
+ * AUTOREMOVE: Remove this link automatically on consumer driver unbind.
+ */
+#define DL_FLAG_STATELESS (1 << 0)
+#define DL_FLAG_AUTOREMOVE (1 << 1)
+
+/**
+ * struct device_link - Device link representation.
+ * @supplier: The device on the supplier end of the link.
+ * @s_node: Hook to the supplier device's list of links to consumers.
+ * @consumer: The device on the consumer end of the link.
+ * @c_node: Hook to the consumer device's list of links to suppliers.
+ * @status: The state of the link (with respect to the presence of drivers).
+ * @flags: Link flags.
+ * @rcu_head: An RCU head to use for deferred execution of SRCU callbacks.
+ */
+struct device_link {
+ struct device *supplier;
+ struct list_head s_node;
+ struct device *consumer;
+ struct list_head c_node;
+ enum device_link_state status;
+ u32 flags;
+ struct rcu_head rcu_head;
+};
+
+/**
+ * enum dl_dev_state - Device driver presence tracking information.
+ * @NO_DRIVER: There is no driver attached to the device.
+ * @PROBING: A driver is probing.
+ * @DRIVER_BOUND: The driver has been bound to the device.
+ * @UNBINDING: The driver is unbinding from the device.
+ */
+enum dl_dev_state {
+ DL_DEV_NO_DRIVER = 0,
+ DL_DEV_PROBING,
+ DL_DEV_DRIVER_BOUND,
+ DL_DEV_UNBINDING,
+};
+
+/**
+ * struct dev_links_info - Device data related to device links.
+ * @suppliers: List of links to supplier devices.
+ * @consumers: List of links to consumer devices.
+ * @status: Driver status information.
+ */
+struct dev_links_info {
+ struct list_head suppliers;
+ struct list_head consumers;
+ enum dl_dev_state status;
+};
+
+/**
* struct device - The basic device structure
* @parent: The device's "parent" device, the device to which it is attached.
* In most cases, a parent device is some sort of bus or host
@@ -797,6 +870,7 @@ struct device {
core doesn't touch it */
void *driver_data; /* Driver data, set and get with
dev_set/get_drvdata */
+ struct dev_links_info links;
struct dev_pm_info power;
struct dev_pm_domain *pm_domain;

@@ -1113,6 +1187,10 @@ extern void device_shutdown(void);
/* debugging and troubleshooting/diagnostic helpers. */
extern const char *dev_driver_string(const struct device *dev);

+/* Device links interface. */
+struct device_link *device_link_add(struct device *consumer,
+ struct device *supplier, u32 flags);
+void device_link_del(struct device_link *link);

#ifdef CONFIG_PRINTK

Index: linux-pm/drivers/base/power/main.c
===================================================================
--- linux-pm.orig/drivers/base/power/main.c
+++ linux-pm/drivers/base/power/main.c
@@ -131,6 +131,7 @@ void device_pm_add(struct device *dev)
dev_warn(dev, "parent %s should not be sleeping\n",
dev_name(dev->parent));
list_add_tail(&dev->power.entry, &dpm_list);
+ dev->power.in_dpm_list = true;
mutex_unlock(&dpm_list_mtx);
}

@@ -145,6 +146,7 @@ void device_pm_remove(struct device *dev
complete_all(&dev->power.completion);
mutex_lock(&dpm_list_mtx);
list_del_init(&dev->power.entry);
+ dev->power.in_dpm_list = false;
mutex_unlock(&dpm_list_mtx);
device_wakeup_disable(dev);
pm_runtime_remove(dev);
Index: linux-pm/drivers/base/power/power.h
===================================================================
--- linux-pm.orig/drivers/base/power/power.h
+++ linux-pm/drivers/base/power/power.h
@@ -127,6 +127,11 @@ extern void device_pm_move_after(struct
extern void device_pm_move_last(struct device *);
extern void device_pm_check_callbacks(struct device *dev);

+static inline bool device_pm_initialized(struct device *dev)
+{
+ return dev->power.in_dpm_list;
+}
+
#else /* !CONFIG_PM_SLEEP */

static inline void device_pm_sleep_init(struct device *dev) {}
@@ -146,6 +151,11 @@ static inline void device_pm_move_last(s

static inline void device_pm_check_callbacks(struct device *dev) {}

+static inline bool device_pm_initialized(struct device *dev)
+{
+ return device_is_registered(dev);
+}
+
#endif /* !CONFIG_PM_SLEEP */

static inline void device_pm_init(struct device *dev)
Index: linux-pm/include/linux/pm.h
===================================================================
--- linux-pm.orig/include/linux/pm.h
+++ linux-pm/include/linux/pm.h
@@ -559,6 +559,7 @@ struct dev_pm_info {
pm_message_t power_state;
unsigned int can_wakeup:1;
unsigned int async_suspend:1;
+ bool in_dpm_list:1; /* Owned by the PM core */
bool is_prepared:1; /* Owned by the PM core */
bool is_suspended:1; /* Ditto */
bool is_noirq_suspended:1;

2016-10-18 10:46:46

by Marek Szyprowski

[permalink] [raw]
Subject: Re: [PATCH v5 0/5] Functional dependencies between devices

Hi Rafael,


On 2016-10-10 14:36, Rafael J. Wysocki wrote:
> [...]
>
> One more update after some conversations during LinuxCon Europe.
>
> The main point was to make it possible for device_link_add() to figure out the
> initial state of the link instead of expecting the caller to provide it which
> might not be reliable enough in general.
>
> In this version device_link_add() takes three arguments, the supplier and
> consumer pointers and flags and it sets the correct initial state of the link
> automatically (unless invoked with the "stateless" flag, of course). The cost
> is one additional field in struct device (I moved all of the links-related
> fields in struct device to a separate sub-structure while at it) to track
> the "driver presence status" of the device (to be used by device_link_add()).
>
> In addition to that, the links list walks in the core.c and dd.c code are
> under the device links mutex now, so the iternal link spinlock is not needed
> any more and I have renamed symbols to distinguish between flags, link states
> and device "driver presence statuses".
>
> More information is there in the changelogs.

Thanks for the update. This version is indeed easier to use and still
works fine
with my Exynos IOMMU runtime pm rework. You can keep my:

Tested-by: Marek Szyprowski <[email protected]>

I will send updated version of my patchset soon.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland

2016-10-19 14:42:23

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v5 0/5] Functional dependencies between devices

On Tue, Oct 18, 2016 at 12:46 PM, Marek Szyprowski
<[email protected]> wrote:
> Hi Rafael,
>
>
> On 2016-10-10 14:36, Rafael J. Wysocki wrote:
>>
>> [...]
>>
>> One more update after some conversations during LinuxCon Europe.
>>
>> The main point was to make it possible for device_link_add() to figure out
>> the
>> initial state of the link instead of expecting the caller to provide it
>> which
>> might not be reliable enough in general.
>>
>> In this version device_link_add() takes three arguments, the supplier and
>> consumer pointers and flags and it sets the correct initial state of the
>> link
>> automatically (unless invoked with the "stateless" flag, of course). The
>> cost
>> is one additional field in struct device (I moved all of the links-related
>> fields in struct device to a separate sub-structure while at it) to track
>> the "driver presence status" of the device (to be used by
>> device_link_add()).
>>
>> In addition to that, the links list walks in the core.c and dd.c code are
>> under the device links mutex now, so the iternal link spinlock is not
>> needed
>> any more and I have renamed symbols to distinguish between flags, link
>> states
>> and device "driver presence statuses".
>>
>> More information is there in the changelogs.
>
>
> Thanks for the update. This version is indeed easier to use and still works
> fine
> with my Exynos IOMMU runtime pm rework. You can keep my:
>
> Tested-by: Marek Szyprowski <[email protected]>
>
> I will send updated version of my patchset soon.

Thanks for the testing, much appreciated!

The series is in a new branch called "device-links-test" in my tree now:

git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git
device-links-test

Thanks,
Rafael

2016-10-20 10:22:02

by Marek Szyprowski

[permalink] [raw]
Subject: Re: [PATCH v5 0/5] Functional dependencies between devices

Hi Rafael,


On 2016-10-19 13:57, Rafael J. Wysocki wrote:
> On Tue, Oct 18, 2016 at 12:46 PM, Marek Szyprowski
> <[email protected]> wrote:
>> On 2016-10-10 14:36, Rafael J. Wysocki wrote:
>>> [...]
>>>
>>> One more update after some conversations during LinuxCon Europe.
>>>
>>> The main point was to make it possible for device_link_add() to figure out
>>> the
>>> initial state of the link instead of expecting the caller to provide it
>>> which
>>> might not be reliable enough in general.
>>>
>>> In this version device_link_add() takes three arguments, the supplier and
>>> consumer pointers and flags and it sets the correct initial state of the
>>> link
>>> automatically (unless invoked with the "stateless" flag, of course). The
>>> cost
>>> is one additional field in struct device (I moved all of the links-related
>>> fields in struct device to a separate sub-structure while at it) to track
>>> the "driver presence status" of the device (to be used by
>>> device_link_add()).
>>>
>>> In addition to that, the links list walks in the core.c and dd.c code are
>>> under the device links mutex now, so the iternal link spinlock is not
>>> needed
>>> any more and I have renamed symbols to distinguish between flags, link
>>> states
>>> and device "driver presence statuses".
>>>
>>> More information is there in the changelogs.
>> Thanks for the update. This version is indeed easier to use and still works
>> fine
>> with my Exynos IOMMU runtime pm rework. You can keep my:
>>
>> Tested-by: Marek Szyprowski <[email protected]>
>>
>> I will send updated version of my patchset soon.
> Thanks for the testing, much appreciated!
>
> The series is in a new branch called "device-links-test" in my tree now:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git
> device-links-test

While working on integrating IOMMU deferred probing patches I found a bug,
which has been introduced in v4 of device dependency patchset (v3 worked
fine in this area, v5 also contains this bug). The following fixup is
needed to properly create links with DL_FLAG_PM_RUNTIME flag set during
consumer device probing:

From: Marek Szyprowski <[email protected]>
Date: Thu, 20 Oct 2016 12:12:14 +0200
Subject: [PATCH] driver core: fix runtime pm state for
DEVICE_LINK_CONSUMER_PROBE links

If link is added during consumer probe with DL_FLAG_PM_RUNTIME flag set,
the code will do additional pm_runtime_put() on the supplier after
finishing consumer probing. This will break runtime pm operation for
the supplier device. To solve this issue, enforce additional call to
pm_runtime_get_sync() on the supplier device while creating such link.

Signed-off-by: Marek Szyprowski <[email protected]>
---
drivers/base/core.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 48bc5a362f7d..d4cc285a1df7 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -217,6 +217,9 @@ struct device_link *device_link_add(struct device
*consumer,
}
}

+ if (flags & DL_FLAG_PM_RUNTIME && status == DL_STATE_CONSUMER_PROBE)
+ pm_runtime_get_sync(supplier);
+
/*
* Move the consumer and all of the devices depending on it to the end
* of dpm_list and the devices_kset list.
--
1.9.1

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland

2016-10-20 12:55:03

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v5 0/5] Functional dependencies between devices

Hi,

On Thu, Oct 20, 2016 at 12:21 PM, Marek Szyprowski
<[email protected]> wrote:
> Hi Rafael,
>
>
> On 2016-10-19 13:57, Rafael J. Wysocki wrote:
>>
>> On Tue, Oct 18, 2016 at 12:46 PM, Marek Szyprowski
>> <[email protected]> wrote:
>>>
>>> On 2016-10-10 14:36, Rafael J. Wysocki wrote:
>>>>
>>>> [...]
>>>>
>>>> One more update after some conversations during LinuxCon Europe.
>>>>
>>>> The main point was to make it possible for device_link_add() to figure
>>>> out
>>>> the
>>>> initial state of the link instead of expecting the caller to provide it
>>>> which
>>>> might not be reliable enough in general.
>>>>
>>>> In this version device_link_add() takes three arguments, the supplier
>>>> and
>>>> consumer pointers and flags and it sets the correct initial state of the
>>>> link
>>>> automatically (unless invoked with the "stateless" flag, of course).
>>>> The
>>>> cost
>>>> is one additional field in struct device (I moved all of the
>>>> links-related
>>>> fields in struct device to a separate sub-structure while at it) to
>>>> track
>>>> the "driver presence status" of the device (to be used by
>>>> device_link_add()).
>>>>
>>>> In addition to that, the links list walks in the core.c and dd.c code
>>>> are
>>>> under the device links mutex now, so the iternal link spinlock is not
>>>> needed
>>>> any more and I have renamed symbols to distinguish between flags, link
>>>> states
>>>> and device "driver presence statuses".
>>>>
>>>> More information is there in the changelogs.
>>>
>>> Thanks for the update. This version is indeed easier to use and still
>>> works
>>> fine
>>> with my Exynos IOMMU runtime pm rework. You can keep my:
>>>
>>> Tested-by: Marek Szyprowski <[email protected]>
>>>
>>> I will send updated version of my patchset soon.
>>
>> Thanks for the testing, much appreciated!
>>
>> The series is in a new branch called "device-links-test" in my tree now:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git
>> device-links-test
>
>
> While working on integrating IOMMU deferred probing patches I found a bug,
> which has been introduced in v4 of device dependency patchset (v3 worked
> fine in this area, v5 also contains this bug). The following fixup is
> needed to properly create links with DL_FLAG_PM_RUNTIME flag set during
> consumer device probing:
>
> From: Marek Szyprowski <[email protected]>
> Date: Thu, 20 Oct 2016 12:12:14 +0200
> Subject: [PATCH] driver core: fix runtime pm state for
> DEVICE_LINK_CONSUMER_PROBE links
>
> If link is added during consumer probe with DL_FLAG_PM_RUNTIME flag set,
> the code will do additional pm_runtime_put() on the supplier after
> finishing consumer probing. This will break runtime pm operation for
> the supplier device. To solve this issue, enforce additional call to
> pm_runtime_get_sync() on the supplier device while creating such link.
>
> Signed-off-by: Marek Szyprowski <[email protected]>
> ---
> drivers/base/core.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 48bc5a362f7d..d4cc285a1df7 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -217,6 +217,9 @@ struct device_link *device_link_add(struct device
> *consumer,
> }
> }
>
> + if (flags & DL_FLAG_PM_RUNTIME && status == DL_STATE_CONSUMER_PROBE)
> + pm_runtime_get_sync(supplier);
> +

Good catch!

I'd prefer to do this slightly differently, though, so I'll send an
update of the runtime PM patch with this folded in shortly.

Thanks,
Rafael

2016-10-20 13:10:06

by Rafael J. Wysocki

[permalink] [raw]
Subject: [Update][PATCH v5 4/5] PM / runtime: Use device links

From: Rafael J. Wysocki <[email protected]>

Modify the runtime PM framework to use device links to ensure that
supplier devices will not be suspended if any of their consumer
devices are active.

The idea is to reference count suppliers on the consumer's resume
and drop references to them on its suspend. The information on
whether or not the supplier has been reference counted by the
consumer's (runtime) resume is stored in a new field (rpm_active)
in the link object for each link.

It may be necessary to clean up those references when the
supplier is unbinding and that's why the links whose status is
DEVICE_LINK_SUPPLIER_UNBIND are skipped by the runtime suspend
and resume code.

The above means that if the consumer device is probed in the
runtime-active state, the supplier has to be resumed and reference
counted by device_link_add() so the code works as expected on its
(runtime) suspend. There is a new flag, DEVICE_LINK_RPM_ACTIVE,
to tell device_link_add() about that (in which case the caller
is responsible for making sure that the consumer really will
be runtime-active when runtime PM is enabled for it).

The other new link flag, DEVICE_LINK_PM_RUNTIME, tells the core
whether or not the link should be used for runtime PM at all.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---

As Marek noticed, it is necessary to balance the pm_runtime_put() done by
the core on the supplier if the link is added during consumer probe.

---
drivers/base/core.c | 27 +++++++
drivers/base/dd.c | 3
drivers/base/power/runtime.c | 157 +++++++++++++++++++++++++++++++++++++++++--
include/linux/device.h | 6 +
include/linux/pm_runtime.h | 6 +
5 files changed, 193 insertions(+), 6 deletions(-)

Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -117,6 +117,14 @@ static int device_reorder_to_tail(struct
* @supplier: Supplier end of the link.
* @flags: Link flags.
*
+ * The caller is responsible for the proper synchronization of the link creation
+ * with runtime PM. First, setting the DL_FLAG_PM_RUNTIME flag will cause the
+ * runtime PM framework to take the link into account. Second, if the
+ * DL_FLAG_RPM_ACTIVE flag is set in addition to it, the supplier devices will
+ * be forced into the active metastate and reference-counted upon the creation
+ * of the link. If DL_FLAG_PM_RUNTIME is not set, DL_FLAG_RPM_ACTIVE will be
+ * ignored.
+ *
* If the DL_FLAG_AUTOREMOVE is set, the link will be removed automatically
* when the consumer device driver unbinds from it. The combination of both
* DL_FLAG_AUTOREMOVE and DL_FLAG_STATELESS set is invalid and will cause NULL
@@ -158,10 +166,19 @@ struct device_link *device_link_add(stru
if (link->consumer == consumer)
goto out;

- link = kmalloc(sizeof(*link), GFP_KERNEL);
+ link = kzalloc(sizeof(*link), GFP_KERNEL);
if (!link)
goto out;

+ if ((flags & DL_FLAG_PM_RUNTIME) && (flags & DL_FLAG_RPM_ACTIVE)) {
+ if (pm_runtime_get_sync(supplier) < 0) {
+ pm_runtime_put_noidle(supplier);
+ kfree(link);
+ link = NULL;
+ goto out;
+ }
+ link->rpm_active = true;
+ }
get_device(supplier);
link->supplier = supplier;
INIT_LIST_HEAD(&link->s_node);
@@ -178,6 +195,14 @@ struct device_link *device_link_add(stru
case DL_DEV_DRIVER_BOUND:
switch (consumer->links.status) {
case DL_DEV_PROBING:
+ /*
+ * Balance the decrementation of the supplier's
+ * runtime PM usage counter after consumer probe
+ * in driver_probe_device().
+ */
+ if (flags & DL_FLAG_PM_RUNTIME)
+ pm_runtime_get_sync(supplier);
+
link->status = DL_STATE_CONSUMER_PROBE;
break;
case DL_DEV_DRIVER_BOUND:
Index: linux-pm/drivers/base/dd.c
===================================================================
--- linux-pm.orig/drivers/base/dd.c
+++ linux-pm/drivers/base/dd.c
@@ -498,6 +498,7 @@ int driver_probe_device(struct device_dr
pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
drv->bus->name, __func__, dev_name(dev), drv->name);

+ pm_runtime_get_suppliers(dev);
if (dev->parent)
pm_runtime_get_sync(dev->parent);

@@ -508,6 +509,7 @@ int driver_probe_device(struct device_dr
if (dev->parent)
pm_runtime_put(dev->parent);

+ pm_runtime_put_suppliers(dev);
return ret;
}

@@ -791,6 +793,7 @@ static void __device_release_driver(stru
}

pm_runtime_get_sync(dev);
+ pm_runtime_clean_up_links(dev);

driver_sysfs_remove(dev);

Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -12,6 +12,8 @@
#include <linux/pm_runtime.h>
#include <linux/pm_wakeirq.h>
#include <trace/events/rpm.h>
+
+#include "../base.h"
#include "power.h"

typedef int (*pm_callback_t)(struct device *);
@@ -258,6 +260,42 @@ static int rpm_check_suspend_allowed(str
return retval;
}

+static int rpm_get_suppliers(struct device *dev)
+{
+ struct device_link *link;
+
+ list_for_each_entry_rcu(link, &dev->links.suppliers, c_node) {
+ int retval;
+
+ if (!(link->flags & DL_FLAG_PM_RUNTIME))
+ continue;
+
+ if (READ_ONCE(link->status) == DL_STATE_SUPPLIER_UNBIND ||
+ link->rpm_active)
+ continue;
+
+ retval = pm_runtime_get_sync(link->supplier);
+ if (retval < 0) {
+ pm_runtime_put_noidle(link->supplier);
+ return retval;
+ }
+ link->rpm_active = true;
+ }
+ return 0;
+}
+
+static void rpm_put_suppliers(struct device *dev)
+{
+ struct device_link *link;
+
+ list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
+ if (link->rpm_active &&
+ READ_ONCE(link->status) != DL_STATE_SUPPLIER_UNBIND) {
+ pm_runtime_put(link->supplier);
+ link->rpm_active = false;
+ }
+}
+
/**
* __rpm_callback - Run a given runtime PM callback for a given device.
* @cb: Runtime PM callback to run.
@@ -266,19 +304,55 @@ static int rpm_check_suspend_allowed(str
static int __rpm_callback(int (*cb)(struct device *), struct device *dev)
__releases(&dev->power.lock) __acquires(&dev->power.lock)
{
- int retval;
+ int retval, idx;

- if (dev->power.irq_safe)
+ if (dev->power.irq_safe) {
spin_unlock(&dev->power.lock);
- else
+ } else {
spin_unlock_irq(&dev->power.lock);

+ /*
+ * Resume suppliers if necessary.
+ *
+ * The device's runtime PM status cannot change until this
+ * routine returns, so it is safe to read the status outside of
+ * the lock.
+ */
+ if (dev->power.runtime_status == RPM_RESUMING) {
+ idx = device_links_read_lock();
+
+ retval = rpm_get_suppliers(dev);
+ if (retval)
+ goto fail;
+
+ device_links_read_unlock(idx);
+ }
+ }
+
retval = cb(dev);

- if (dev->power.irq_safe)
+ if (dev->power.irq_safe) {
spin_lock(&dev->power.lock);
- else
+ } else {
+ /*
+ * If the device is suspending and the callback has returned
+ * success, drop the usage counters of the suppliers that have
+ * been reference counted on its resume.
+ *
+ * Do that if resume fails too.
+ */
+ if ((dev->power.runtime_status == RPM_SUSPENDING && !retval)
+ || (dev->power.runtime_status == RPM_RESUMING && retval)) {
+ idx = device_links_read_lock();
+
+ fail:
+ rpm_put_suppliers(dev);
+
+ device_links_read_unlock(idx);
+ }
+
spin_lock_irq(&dev->power.lock);
+ }

return retval;
}
@@ -1447,6 +1521,79 @@ void pm_runtime_remove(struct device *de
}

/**
+ * pm_runtime_clean_up_links - Prepare links to consumers for driver removal.
+ * @dev: Device whose driver is going to be removed.
+ *
+ * Check links from this device to any consumers and if any of them have active
+ * runtime PM references to the device, drop the usage counter of the device
+ * (once per link).
+ *
+ * Links with the DL_FLAG_STATELESS flag set are ignored.
+ *
+ * Since the device is guaranteed to be runtime-active at the point this is
+ * called, nothing else needs to be done here.
+ *
+ * Moreover, this is called after device_links_busy() has returned 'false', so
+ * the status of each link is guaranteed to be DL_STATE_SUPPLIER_UNBIND and
+ * therefore rpm_active can't be manipulated concurrently.
+ */
+void pm_runtime_clean_up_links(struct device *dev)
+{
+ struct device_link *link;
+ int idx;
+
+ idx = device_links_read_lock();
+
+ list_for_each_entry_rcu(link, &dev->links.consumers, s_node) {
+ if (link->flags & DL_FLAG_STATELESS)
+ continue;
+
+ if (link->rpm_active) {
+ pm_runtime_put_noidle(dev);
+ link->rpm_active = false;
+ }
+ }
+
+ device_links_read_unlock(idx);
+}
+
+/**
+ * pm_runtime_get_suppliers - Resume and reference-count supplier devices.
+ * @dev: Consumer device.
+ */
+void pm_runtime_get_suppliers(struct device *dev)
+{
+ struct device_link *link;
+ int idx;
+
+ idx = device_links_read_lock();
+
+ list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
+ if (link->flags & DL_FLAG_PM_RUNTIME)
+ pm_runtime_get_sync(link->supplier);
+
+ device_links_read_unlock(idx);
+}
+
+/**
+ * pm_runtime_put_suppliers - Drop references to supplier devices.
+ * @dev: Consumer device.
+ */
+void pm_runtime_put_suppliers(struct device *dev)
+{
+ struct device_link *link;
+ int idx;
+
+ idx = device_links_read_lock();
+
+ list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
+ if (link->flags & DL_FLAG_PM_RUNTIME)
+ pm_runtime_put(link->supplier);
+
+ device_links_read_unlock(idx);
+}
+
+/**
* pm_runtime_force_suspend - Force a device into suspend state if needed.
* @dev: Device to suspend.
*
Index: linux-pm/include/linux/device.h
===================================================================
--- linux-pm.orig/include/linux/device.h
+++ linux-pm/include/linux/device.h
@@ -729,9 +729,13 @@ enum device_link_state {
*
* STATELESS: The core won't track the presence of supplier/consumer drivers.
* AUTOREMOVE: Remove this link automatically on consumer driver unbind.
+ * PM_RUNTIME: If set, the runtime PM framework will use this link.
+ * RPM_ACTIVE: Run pm_runtime_get_sync() on the supplier during link creation.
*/
#define DL_FLAG_STATELESS (1 << 0)
#define DL_FLAG_AUTOREMOVE (1 << 1)
+#define DL_FLAG_PM_RUNTIME (1 << 2)
+#define DL_FLAG_RPM_ACTIVE (1 << 3)

/**
* struct device_link - Device link representation.
@@ -741,6 +745,7 @@ enum device_link_state {
* @c_node: Hook to the consumer device's list of links to suppliers.
* @status: The state of the link (with respect to the presence of drivers).
* @flags: Link flags.
+ * @rpm_active: Whether or not the consumer device is runtime-PM-active.
* @rcu_head: An RCU head to use for deferred execution of SRCU callbacks.
*/
struct device_link {
@@ -750,6 +755,7 @@ struct device_link {
struct list_head c_node;
enum device_link_state status;
u32 flags;
+ bool rpm_active;
struct rcu_head rcu_head;
};

Index: linux-pm/include/linux/pm_runtime.h
===================================================================
--- linux-pm.orig/include/linux/pm_runtime.h
+++ linux-pm/include/linux/pm_runtime.h
@@ -55,6 +55,9 @@ extern unsigned long pm_runtime_autosusp
extern void pm_runtime_update_max_time_suspended(struct device *dev,
s64 delta_ns);
extern void pm_runtime_set_memalloc_noio(struct device *dev, bool enable);
+extern void pm_runtime_clean_up_links(struct device *dev);
+extern void pm_runtime_get_suppliers(struct device *dev);
+extern void pm_runtime_put_suppliers(struct device *dev);

static inline void pm_suspend_ignore_children(struct device *dev, bool enable)
{
@@ -186,6 +189,9 @@ static inline unsigned long pm_runtime_a
struct device *dev) { return 0; }
static inline void pm_runtime_set_memalloc_noio(struct device *dev,
bool enable){}
+static inline void pm_runtime_clean_up_links(struct device *dev) {}
+static inline void pm_runtime_get_suppliers(struct device *dev) {}
+static inline void pm_runtime_put_suppliers(struct device *dev) {}

#endif /* !CONFIG_PM */


2016-10-26 11:18:36

by Lukas Wunner

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

Hi Rafael,

sorry for not responding to v5 of your series earlier, just sending
this out now in the hope that it reaches you before your travels.

On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote:
> - Modify device_links_check_suppliers(), device_links_driver_bound(),
> device_links_no_driver(), device_links_driver_cleanup(), device_links_busy(),
> and device_links_unbind_consumers() to walk link lists under device_links_lock
> (to make the new "driver presence tracking" mechanism work reliably).

This change might increase boot time if drivers return -EPROBE_DEFER.
It can easily happen that such drivers are probed dozens of times,
and each time the lock will be acquired, blocking other drivers from
probing. Granted, it's hard to measure if and in how far boot time
really increases, but somehow I liked the previous approach better.
The combination of a mutex with an RCU plus a spinlock seemed complicated
when I reviewed the patch, but I never said anything because I came
to the conclusion that the effort seemed worthwhile to keep up the
performance.

(BTW, kbuild test robot has complained that the usage of RCUs isn't
enclosed in #ifdef CONFIG_SRCU. Just in case you haven't seen that.)


> - Redefine device_link_add() to take three arguments, the supplier and consumer
> pointers and flags, and to figure out the initial link state automatically.

That's a good idea, however it seems to me that there's some redundancy
between the dl_dev_state and device_link_state: AFAICS, at any given
moment the device_link_state can be computed from the supplier's and
consumer's dl_dev_state.

One could argue that caching the device_link_state is cheaper than
recomputing it every time it's needed, but often the dl_dev_state of
one of the two devices is known, so the link can only be in a subset
of all possible states, and instead of computing the device_link_state
it's sufficient to just look at the other device's dl_dev_state.

E.g. in device_links_check_suppliers() we have this:

+ if (link->status != DL_STATE_AVAILABLE) {
+ device_links_missing_supplier(dev);
+ ret = -EPROBE_DEFER;
+ break;
+ }

When this function is called we know that the consumer's dl_dev_state is
DL_DEV_PROBING. Of interest is only the status of the supplier. The above
if-condition would seem to be equivalent to:

if (link->supplier->links.status == DL_DEV_DRIVER_BOUND) {

So the device_link_state seems redundant, but if it's removed from
struct device_link, changes to dl_dev_state need to be synchronized
between supplier and consumers. Perhaps this could be accomplished
by acquiring the device_lock for the other device. E.g. when a
consumer wants to probe, before changing its own dl_dev_state it would
have to acquire the device_lock for all suppliers to prevent races
if one of them simultaneously decides to unbind (and changes its
dl_dev_state to DL_DEV_UNBINDING). Hm, could this deadlock?


Also, I notice that the dl_dev_state is part of a device's "links"
structure, but being able to look up a device's driver state might be
useful in other cases, not just for device links. Maybe it makes sense
to move the dl_dev_state one level up to struct device and name the
attribute "driver_status" or somesuch (like the existing "driver" and
"driver_data" attributes).


> + /* Deterine the initial link state. */
^
m

Thanks,

Lukas

2016-10-27 15:33:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Wed, Oct 26, 2016 at 01:19:02PM +0200, Lukas Wunner wrote:
> Hi Rafael,
>
> sorry for not responding to v5 of your series earlier, just sending
> this out now in the hope that it reaches you before your travels.
>
> On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote:
> > - Modify device_links_check_suppliers(), device_links_driver_bound(),
> > device_links_no_driver(), device_links_driver_cleanup(), device_links_busy(),
> > and device_links_unbind_consumers() to walk link lists under device_links_lock
> > (to make the new "driver presence tracking" mechanism work reliably).
>
> This change might increase boot time if drivers return -EPROBE_DEFER.

"might"? Please verify this before guessing....

And don't make this more complex than needed before actually determining
a real issue.

thanks,

greg k-h

2016-10-27 15:33:51

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v5 0/5] Functional dependencies between devices

On Mon, Oct 10, 2016 at 02:36:31PM +0200, Rafael J. Wysocki wrote:
> Hi Everyone,
>
> > On Thursday, September 08, 2016 11:25:44 PM Rafael J. Wysocki wrote:
> >
>
> [cut]
>
> >
> > Time for another update. :-)
> >
> > Fewer changes this time, mostly to address issues found by Lukas and Marek.
> >
> > The most significant one is to make device_link_add() cope with the case
> > when
> > the consumer device has not been registered yet when it is called. The
> > supplier device still is required to be registered and the function will
> > return NULL if that is not the case.
> >
> > Another significant change is in patch [4/5] that now makes the core apply
> > pm_runtime_get_sync()/pm_runtime_put() to supplier devices around the
> > probing of a consumer one (in analogy with the parent).
>
> One more update after some conversations during LinuxCon Europe.
>
> The main point was to make it possible for device_link_add() to figure out the
> initial state of the link instead of expecting the caller to provide it which
> might not be reliable enough in general.
>
> In this version device_link_add() takes three arguments, the supplier and
> consumer pointers and flags and it sets the correct initial state of the link
> automatically (unless invoked with the "stateless" flag, of course). The cost
> is one additional field in struct device (I moved all of the links-related
> fields in struct device to a separate sub-structure while at it) to track
> the "driver presence status" of the device (to be used by device_link_add()).
>
> In addition to that, the links list walks in the core.c and dd.c code are
> under the device links mutex now, so the iternal link spinlock is not needed
> any more and I have renamed symbols to distinguish between flags, link states
> and device "driver presence statuses".
>
> More information is there in the changelogs.

First off, thanks so much for doing this work, it's been needed for a
long time. It all looks good, and I've added it to my testing tree to
give the 0-day bot some fun with it for a day or so.

thanks,

greg k-h

2016-10-27 15:33:44

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote:
> +/*
> + * Device link flags.
> + *
> + * STATELESS: The core won't track the presence of supplier/consumer drivers.
> + * AUTOREMOVE: Remove this link automatically on consumer driver unbind.
> + */
> +#define DL_FLAG_STATELESS (1 << 0)
> +#define DL_FLAG_AUTOREMOVE (1 << 1)

Minor nit, BIT()? I'll leave this for someone to send a checkpatch
cleanup patch later on :)

thanks,

greg k-h

2016-10-28 09:39:20

by Lukas Wunner

[permalink] [raw]
Subject: Re: [PATCH v5 0/5] Functional dependencies between devices

On Thu, Oct 27, 2016 at 08:19:27PM +0800, Hanjun Guo wrote:
> I'm trying to using this patch set to solve the functional dependency
> between devices and irqchip, which are both ACPI platform devices.
> irqchip needs to be probed before the devices connecting to them,
> which specifically, it's the mbi-gen support I send out recently:
>
> https://lkml.org/lkml/2016/10/25/453
>
> But I didn't see an example to do so in this patch set, and seems that
> some extra code needs to be added for that purpose, could you give me
> some suggestions for how to do that then I can work on and test against
> your patch set?

If the consumers can detect that there's a consumer on which they depend,
you could call device_link_add() from their ->probe hook.

Generally the earliest point in time when device links can be added is
after device_initialize() has been called for the consumer and device_add()
has been called for the supplier. (At least that's my understanding.)

HTH,

Lukas

2016-10-28 09:56:25

by Lukas Wunner

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Thu, Oct 27, 2016 at 05:25:51PM +0200, Greg Kroah-Hartman wrote:
> On Wed, Oct 26, 2016 at 01:19:02PM +0200, Lukas Wunner wrote:
> > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote:
> > > - Modify device_links_check_suppliers(), device_links_driver_bound(),
> > > device_links_no_driver(), device_links_driver_cleanup(), device_links_busy(),
> > > and device_links_unbind_consumers() to walk link lists under device_links_lock
> > > (to make the new "driver presence tracking" mechanism work reliably).
> >
> > This change might increase boot time if drivers return -EPROBE_DEFER.
>
> "might"? Please verify this before guessing....

I can't, my machine only uses device links for Thunderbolt hotplug ports,
and their driver (portdrv) doesn't use deferred probing. I'm only aware
of a single device in my system whose driver causes others to defer
probing (apple-gmux), but they don't use device links, so the mutex is
only acquired very briefly because the supplier/consumer lists are empty,
hence the performance impact can't be measured on my system. But the
situation may be different for Marek's or Hanjun's use cases. I'm not
familiar with their systems, hence the "might".


> And don't make this more complex than needed before actually determining
> a real issue.

As pointed out it currently *is* more complex than needed because the
device_link_state is dispensable. The whole issue is moot with the
changes I suggested because the mutex would only have to be taken for
addition/deletion of device links, not when probing.

Thanks,

Lukas

2016-11-02 20:55:36

by Hanjun Guo

[permalink] [raw]
Subject: Re: [PATCH v5 0/5] Functional dependencies between devices

On 10/28/2016 05:39 PM, Lukas Wunner wrote:
> On Thu, Oct 27, 2016 at 08:19:27PM +0800, Hanjun Guo wrote:
>> I'm trying to using this patch set to solve the functional dependency
>> between devices and irqchip, which are both ACPI platform devices.
>> irqchip needs to be probed before the devices connecting to them,
>> which specifically, it's the mbi-gen support I send out recently:
>>
>> https://lkml.org/lkml/2016/10/25/453
>>
>> But I didn't see an example to do so in this patch set, and seems that
>> some extra code needs to be added for that purpose, could you give me
>> some suggestions for how to do that then I can work on and test against
>> your patch set?
>
> If the consumers can detect that there's a consumer on which they depend,
> you could call device_link_add() from their ->probe hook.
>
> Generally the earliest point in time when device links can be added is
> after device_initialize() has been called for the consumer and device_add()
> has been called for the supplier. (At least that's my understanding.)

Thank you, currently I'm on travailing and will take a deep look to
see if it works (or adding things on top) on my case.

Thanks
Hanjun

2016-11-07 21:22:54

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Thu, Oct 27, 2016 at 05:25:51PM +0200, Greg Kroah-Hartman wrote:
> On Wed, Oct 26, 2016 at 01:19:02PM +0200, Lukas Wunner wrote:
> > Hi Rafael,
> >
> > sorry for not responding to v5 of your series earlier, just sending
> > this out now in the hope that it reaches you before your travels.
> >
> > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote:
> > > - Modify device_links_check_suppliers(), device_links_driver_bound(),
> > > device_links_no_driver(), device_links_driver_cleanup(), device_links_busy(),
> > > and device_links_unbind_consumers() to walk link lists under device_links_lock
> > > (to make the new "driver presence tracking" mechanism work reliably).
> >
> > This change might increase boot time if drivers return -EPROBE_DEFER.
>
> "might"? Please verify this before guessing....
>
> And don't make this more complex than needed before actually determining
> a real issue.

As clarified by Rafael at Plumbers, this functional dependencies
framework assumes your driver / subsystem supports deferred probe,
if it does not support its not clear what will happen....

We have no explicit semantics to check if a driver / subsystem
supports deferred probe.

Luis

2016-11-07 21:40:07

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> Currently, there is a problem with taking functional dependencies
> between devices into account.
>
> What I mean by a "functional dependency" is when the driver of device
> B needs device A to be functional and (generally) its driver to be
> present in order to work properly. This has certain consequences
> for power management (suspend/resume and runtime PM ordering) and
> shutdown ordering of these devices. In general, it also implies that
> the driver of A needs to be working for B to be probed successfully
> and it cannot be unbound from the device before the B's driver.
>
> Support for representing those functional dependencies between
> devices is added here to allow the driver core to track them and act
> on them in certain cases where applicable.
>
> The argument for doing that in the driver core is that there are
> quite a few distinct use cases involving device dependencies, they
> are relatively hard to get right in a driver (if one wants to
> address all of them properly) and it only gets worse if multiplied
> by the number of drivers potentially needing to do it.

How many drivers actually *need this* today for suspend / resume ?

How many of these are because of ACPI firmware bugs rather than
some other reason ?

Can ACPI somehow be used instead by these devices to address quirks?

> Morever, at
> least one case (asynchronous system suspend/resume) cannot be handled
> in a single driver at all, because it requires the driver of A to
> wait for B to suspend (during system suspend) and the driver of B to
> wait for A to resume (during system resume).

Why is this all of a sudden a new issue? It seems there is quite a bit of
frameworks already out there that somehow deal with some sort of device
ordering / dependencies, and I am curious if they've been addressing some of
these problems for a while now on their own somehow with their own solutions,
is that the case? For instance can DAPM the / DRM / Audio component framework
v4l async solution be used or does it already address some of these concerns ?

> For this reason, represent dependencies between devices as "links",
> with the help of struct device_link objects each containing pointers
> to the "linked" devices, a list node for each of them, status
> information, flags, and an RCU head for synchronization.
>
> Also add two new list heads, representing the lists of links to the
> devices that depend on the given one (consumers) and to the devices
> depended on by it (suppliers), and a "driver presence status" field
> (needed for figuring out initial states of device links) to struct
> device.
>
> The entire data structure consisting of all of the lists of link
> objects for all devices is protected by a mutex (for link object
> addition/removal and for list walks during device driver probing
> and removal) and by SRCU (for list walking in other case that will
> be introduced by subsequent change sets). In addition, each link
> object has an internal status field whose value reflects whether or
> not drivers are bound to the devices pointed to by the link or
> probing/removal of their drivers is in progress etc. That field
> is only modified under the device links mutex, but it may be read
> outside of it in some cases (introduced by subsequent change sets),
> so modifications of it are annotated with WRITE_ONCE().
>
> New links are added by calling device_link_add() which takes three
> arguments: pointers to the devices in question and flags. In
> particular, if DL_FLAG_STATELESS is set in the flags, the link status
> is not to be taken into account for this link and the driver core
> will not manage it. In turn, if DL_FLAG_AUTOREMOVE is set in the
> flags, the driver core will remove the link automatically when the
> consumer device driver unbinds from it.
>
> One of the actions carried out by device_link_add() is to reorder
> the lists used for device shutdown and system suspend/resume to
> put the consumer device along with all of its children and all of
> its consumers (and so on, recursively) to the ends of those lists
> in order to ensure the right ordering between all of the supplier
> and consumer devices.

There's no explanation as to why this order is ensured to be
correct, I think its important to document this. From our discussions
at Plumbers it seems the order is ensured due to the fact that order
was already implicitly provided through platform firmware (ACPI
enumeration is one), adjusting order on the dpm list is just shuffling
order between consumer / provider, but nothing else. In theory it
works because in the simple case this should suffice however I
remain unconvinced that if we have more users of this framework this
simple algorithm will suffice. Having a test driver or series of
test drivers that shows this would be good. As it stands there is simply
an assumption that this is correct, how *strongly* do you feel that
the order will *always* be correct if this is done and no cycles
can be added, even if tons of drivers start using this ?

> For this reason, it is not possible to create a link between two
> devices if the would-be supplier device already depends on the
> would-be consumer device as either a direct descendant of it or a
> consumer of one of its direct descendants or one of its consumers
> and so on.
>
> There are two types of link objects, persistent and non-persistent.
> The persistent ones stay around until one of the target devices is
> deleted, while the non-persistent ones are removed automatically when
> the consumer driver unbinds from its device (ie. they are assumed to
> be valid only as long as the consumer device has a driver bound to
> it). Persistent links are created by default and non-persistent
> links are created when the DL_FLAG_AUTOREMOVE flag is passed
> to device_link_add().
>
> Both persistent and non-persistent device links can be deleted
> with an explicit call to device_link_del().
>
> Links created without the DL_FLAG_STATELESS flag set are managed
> by the driver core using a simple state machine. There are 5 states
> each link can be in: DORMANT (unused), AVAILABLE (the supplier driver
> is present and functional), CONSUMER_PROBE (the consumer driver is
> probing), ACTIVE (both supplier and consumer drivers are present and
> functional), and SUPPLIER_UNBIND (the supplier driver is unbinding).
> The driver core updates the link state automatically depending on
> what happens to the linked devices and for each link state specific
> actions are taken in addition to that.
>
> For example, if the supplier driver unbinds from its device, the
> driver core will also unbind the drivers of all of its consumers
> automatically under the assumption that they cannot function
> properly without the supplier. Analogously, the driver core will
> only allow the consumer driver to bind to its device if the
> supplier driver is present and functional (ie. the link is in
> the AVAILABLE state). If that's not the case, it will rely on
> the existing deferred probing mechanism to wait for the supplier
> driver to become available.

This assumes drivers and subsystems support deferred probe, is
there a way to ensure that drivers that use this framework support
it instead of possibly breaking them if they use this framework ?

Luis

2016-11-08 06:45:37

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Mon, Nov 07, 2016 at 10:22:50PM +0100, Luis R. Rodriguez wrote:
> We have no explicit semantics to check if a driver / subsystem
> supports deferred probe.

Why would we need such a thing?

thanks,

greg k-h

2016-11-08 19:21:10

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Tue, Nov 08, 2016 at 07:45:41AM +0100, Greg Kroah-Hartman wrote:
> On Mon, Nov 07, 2016 at 10:22:50PM +0100, Luis R. Rodriguez wrote:
> > We have no explicit semantics to check if a driver / subsystem
> > supports deferred probe.
>
> Why would we need such a thing?

It depends on the impact of a driver/subsystem not properly supporting
deffered probe, if this is no-op then such a need is not critical but
would be good to proactively inform developers / users so they avoid
its use, if this will cause issues its perhaps best to make this a
no-op through a check. AFAICT reviewing implications of not supporting
deferred probe on drivers/subsytsems for this framework is not clearly
spelled out, if we start considering re-using this framework for probe
ordering I'd hate to see issues come up without this corner case being
concretely considered.

Furthermore -- how does this framework compare to Andrzej's resource tracking
solution? I confess I have not had a chance yet to review yet but in light of
this question it would be good to know if Andrzej's framework also requires
deferred probe as similar concerns would exist there as well.

Luis

2016-11-08 19:43:30

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Tue, Nov 08, 2016 at 08:21:04PM +0100, Luis R. Rodriguez wrote:
> On Tue, Nov 08, 2016 at 07:45:41AM +0100, Greg Kroah-Hartman wrote:
> > On Mon, Nov 07, 2016 at 10:22:50PM +0100, Luis R. Rodriguez wrote:
> > > We have no explicit semantics to check if a driver / subsystem
> > > supports deferred probe.
> >
> > Why would we need such a thing?
>
> It depends on the impact of a driver/subsystem not properly supporting
> deffered probe, if this is no-op then such a need is not critical but
> would be good to proactively inform developers / users so they avoid
> its use, if this will cause issues its perhaps best to make this a
> no-op through a check. AFAICT reviewing implications of not supporting
> deferred probe on drivers/subsytsems for this framework is not clearly
> spelled out, if we start considering re-using this framework for probe
> ordering I'd hate to see issues come up without this corner case being
> concretely considered.

It should not matter to the driver core if a subsystem, or a driver,
supports or does not support deferred probe. It's a quick and simple
solution to a complex problem that works well. Yes, you can iterate a
lot of times, but that's fine, we have time at boot to do that (and
really, it is fast.)

> Furthermore -- how does this framework compare to Andrzej's resource tracking
> solution? I confess I have not had a chance yet to review yet but in light of
> this question it would be good to know if Andrzej's framework also requires
> deferred probe as similar concerns would exist there as well.

I have no idea what "framework" you are talking about here, do you have
a pointer to patches?

thanks,

greg k-h

2016-11-08 20:58:31

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Tue, Nov 08, 2016 at 08:43:35PM +0100, Greg Kroah-Hartman wrote:
> On Tue, Nov 08, 2016 at 08:21:04PM +0100, Luis R. Rodriguez wrote:
> > On Tue, Nov 08, 2016 at 07:45:41AM +0100, Greg Kroah-Hartman wrote:
> > > On Mon, Nov 07, 2016 at 10:22:50PM +0100, Luis R. Rodriguez wrote:
> > > > We have no explicit semantics to check if a driver / subsystem
> > > > supports deferred probe.
> > >
> > > Why would we need such a thing?
> >
> > It depends on the impact of a driver/subsystem not properly supporting
> > deffered probe, if this is no-op then such a need is not critical but
> > would be good to proactively inform developers / users so they avoid
> > its use, if this will cause issues its perhaps best to make this a
> > no-op through a check. AFAICT reviewing implications of not supporting
> > deferred probe on drivers/subsytsems for this framework is not clearly
> > spelled out, if we start considering re-using this framework for probe
> > ordering I'd hate to see issues come up without this corner case being
> > concretely considered.
>
> It should not matter to the driver core if a subsystem, or a driver,
> supports or does not support deferred probe.

That was my impression as well -- however Rafael noted this as an area
worth highlighting. So perhaps he can elaborate.

But at least as per my notes I do have here Geert Uytterhoeven reminding
us that:

"Some drivers / subsystems don’t support deferred probe yet, such failures
usually don’t blow up, but cause subtle malfunctioning. Example, an Ethernet
phy could not get its interrupt as the primary IRQ chip had not been probed
yet, it reverted to polling though. Sub-optimal." [0]

[0] https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2016-August/003425.html

This sounds more like the existing deffered probe solution has detrimental
unexpected effects on some drivers / subsystems -- its not clear *why*, but
its worth reviewing if there are other drivers and seeing if we need this
annotation.

> It's a quick and simple solution to a complex problem that works well.

We all agree. The recent discussions over probe ordering are simply
optimization considerations -- should the time incurred to use deferred
probe be X and the time incurred when using an alternative is Y and we
determine the time Y is < X we have a winning alternative. Part of my
own big issue with deferred probe is a) its extremely non-deterministic,
b) it seems some folks have thought about similar problems and we might
really be able to do better. I'm not convinced that the functional
dependencies patches are the panacea for probe ordering, and while it
was not intended for that, some folks are assuming it could be. To really
vet this prospect we must really consider what other subsystem have done
and review other alternatives efforts.

> Yes, you can iterate a
> lot of times, but that's fine, we have time at boot to do that (and
> really, it is fast.)

Deferred probe is left for late_initcall() [1] -- this *assumes* that the
driver/subsystem can be loaded so late, and as per Andrzej this solution
isunacceptable/undesirable. So it would be unfair and incorrect to categorize
all drivers in the same boat, in fact given this lone fact it may be
worth revisiting the idea I mentioned about a capability aspect to support
a late deferred probe later. We tend to assume its fine, however it does
not seem to be the case.

[1] drivers/base/dd.c:late_initcall(deferred_probe_initcall);

> > Furthermore -- how does this framework compare to Andrzej's resource tracking
> > solution? I confess I have not had a chance yet to review yet but in light of
> > this question it would be good to know if Andrzej's framework also requires
> > deferred probe as similar concerns would exist there as well.
>
> I have no idea what "framework" you are talking about here, do you have
> a pointer to patches?

I'm surprised given Andrzej did both Cc you on his patches [2] *and* chimed
in on Rafael's patches to indicate that we likely can integrate PM concerns
into his own "framework" [3]. There was no resolution to this discussion, however
its not IMHO sufficient to brush off Andrzej's points in particular because
Andrzej *is* indicating that his framework:

- Eliminates deferred probe and resulting late_initcall(), consumer registers
callbacks informing when given resources (clock, regulator, etc) becomes
available
- Properly handle resource disappearance (driver unbind, hotplug)
- Track resources which are not vital to the device, but can influence behavior
- Offers simplified resource allocation
- Can be easily expanded to help with power management

Granted I have not reviewed this yet but it at least was on my radar, and
I do believe its worth reviewing this further given the generally expressed
interest to see if we can have a common framework to address both ordering
problems, suspend and probe. At a quick glance the "ghost provider" idea
seems like a rather crazy idea but hey, there may be some goods in there.

It was sad both Andrzej and yourself could not attend the complex dependencies
tracks -- I think it would have been useful. I don't expect us to address a
resolution to probe ordering immediately -- but I am in the hopes we at least
can keep an open mind about the similarity of the problems and see if we can
aim for a clean elegant solution that might help both.

[2] https://lwn.net/Articles/625454/
[3] http://thread.gmane.org/gmane.linux.kernel/2087152

Luis

2016-11-09 06:44:56

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Tue, Nov 08, 2016 at 09:58:24PM +0100, Luis R. Rodriguez wrote:
> > > Furthermore -- how does this framework compare to Andrzej's resource tracking
> > > solution? I confess I have not had a chance yet to review yet but in light of
> > > this question it would be good to know if Andrzej's framework also requires
> > > deferred probe as similar concerns would exist there as well.
> >
> > I have no idea what "framework" you are talking about here, do you have
> > a pointer to patches?
>
> I'm surprised given Andrzej did both Cc you on his patches [2] *and* chimed
> in on Rafael's patches to indicate that we likely can integrate PM concerns
> into his own "framework" [3]. There was no resolution to this discussion, however
> its not IMHO sufficient to brush off Andrzej's points in particular because
> Andrzej *is* indicating that his framework:

Dude, those patches were from 2014! I can't remember patches people
sent to me a month ago...

> - Eliminates deferred probe and resulting late_initcall(), consumer registers
> callbacks informing when given resources (clock, regulator, etc) becomes
> available
> - Properly handle resource disappearance (driver unbind, hotplug)
> - Track resources which are not vital to the device, but can influence behavior
> - Offers simplified resource allocation
> - Can be easily expanded to help with power management
>
> Granted I have not reviewed this yet but it at least was on my radar, and
> I do believe its worth reviewing this further given the generally expressed
> interest to see if we can have a common framework to address both ordering
> problems, suspend and probe. At a quick glance the "ghost provider" idea
> seems like a rather crazy idea but hey, there may be some goods in there.

>From what I remember, and I could be totally wrong, these patches were
way too complex and required that every subsystem change their
interfaces. That's not going to work out well, but read the email
threads for the details...

> It was sad both Andrzej and yourself could not attend the complex dependencies
> tracks -- I think it would have been useful.

Sometimes real-life gets in the way of work, sorry :(

> I don't expect us to address a
> resolution to probe ordering immediately -- but I am in the hopes we at least
> can keep an open mind about the similarity of the problems and see if we can
> aim for a clean elegant solution that might help both.

I'll always review patches of what people come up with.

thanks,

greg k-h

2016-11-09 09:37:18

by Andrzej Hajda

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On 09.11.2016 07:45, Greg Kroah-Hartman wrote:
> On Tue, Nov 08, 2016 at 09:58:24PM +0100, Luis R. Rodriguez wrote:
>>>> Furthermore -- how does this framework compare to Andrzej's resource tracking
>>>> solution? I confess I have not had a chance yet to review yet but in light of
>>>> this question it would be good to know if Andrzej's framework also requires
>>>> deferred probe as similar concerns would exist there as well.
>>> I have no idea what "framework" you are talking about here, do you have
>>> a pointer to patches?
>> I'm surprised given Andrzej did both Cc you on his patches [2] *and* chimed
>> in on Rafael's patches to indicate that we likely can integrate PM concerns
>> into his own "framework" [3]. There was no resolution to this discussion, however
>> its not IMHO sufficient to brush off Andrzej's points in particular because
>> Andrzej *is* indicating that his framework:
> Dude, those patches were from 2014! I can't remember patches people
> sent to me a month ago...
>
>> - Eliminates deferred probe and resulting late_initcall(), consumer registers
>> callbacks informing when given resources (clock, regulator, etc) becomes
>> available
>> - Properly handle resource disappearance (driver unbind, hotplug)
>> - Track resources which are not vital to the device, but can influence behavior
>> - Offers simplified resource allocation
>> - Can be easily expanded to help with power management
>>
>> Granted I have not reviewed this yet but it at least was on my radar, and
>> I do believe its worth reviewing this further given the generally expressed
>> interest to see if we can have a common framework to address both ordering
>> problems, suspend and probe. At a quick glance the "ghost provider" idea
>> seems like a rather crazy idea but hey, there may be some goods in there.
> >From what I remember, and I could be totally wrong, these patches were
> way too complex and required that every subsystem change their
> interfaces. That's not going to work out well, but read the email
> threads for the details...

I haven't seen your comment on my patches, except few general questions
regarding one of earlier version of the framework.
So maybe you are talking about different framework.

Regarding complexity, if the subsystem have simple way of
'(un)publishing' resources it just adds single calls to restrack core:
restrack_up, restrack_down in proper places.
Additionally it adds quite simple stuff to encapsulate resource
description and allocation routines into generic *_restrack_desc
structure, see for example patch adding restrack to phy framework[1].

[1]:
https://lists.freedesktop.org/archives/dri-devel/2014-December/073759.html

Regards
Andrzej

>
>> It was sad both Andrzej and yourself could not attend the complex dependencies
>> tracks -- I think it would have been useful.
> Sometimes real-life gets in the way of work, sorry :(
>
>> I don't expect us to address a
>> resolution to probe ordering immediately -- but I am in the hopes we at least
>> can keep an open mind about the similarity of the problems and see if we can
>> aim for a clean elegant solution that might help both.
> I'll always review patches of what people come up with.
>
> thanks,
>
> greg k-h
>
>
>

2016-11-09 09:41:39

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Wed, Nov 09, 2016 at 10:36:54AM +0100, Andrzej Hajda wrote:
> On 09.11.2016 07:45, Greg Kroah-Hartman wrote:
> > On Tue, Nov 08, 2016 at 09:58:24PM +0100, Luis R. Rodriguez wrote:
> >>>> Furthermore -- how does this framework compare to Andrzej's resource tracking
> >>>> solution? I confess I have not had a chance yet to review yet but in light of
> >>>> this question it would be good to know if Andrzej's framework also requires
> >>>> deferred probe as similar concerns would exist there as well.
> >>> I have no idea what "framework" you are talking about here, do you have
> >>> a pointer to patches?
> >> I'm surprised given Andrzej did both Cc you on his patches [2] *and* chimed
> >> in on Rafael's patches to indicate that we likely can integrate PM concerns
> >> into his own "framework" [3]. There was no resolution to this discussion, however
> >> its not IMHO sufficient to brush off Andrzej's points in particular because
> >> Andrzej *is* indicating that his framework:
> > Dude, those patches were from 2014! I can't remember patches people
> > sent to me a month ago...
> >
> >> - Eliminates deferred probe and resulting late_initcall(), consumer registers
> >> callbacks informing when given resources (clock, regulator, etc) becomes
> >> available
> >> - Properly handle resource disappearance (driver unbind, hotplug)
> >> - Track resources which are not vital to the device, but can influence behavior
> >> - Offers simplified resource allocation
> >> - Can be easily expanded to help with power management
> >>
> >> Granted I have not reviewed this yet but it at least was on my radar, and
> >> I do believe its worth reviewing this further given the generally expressed
> >> interest to see if we can have a common framework to address both ordering
> >> problems, suspend and probe. At a quick glance the "ghost provider" idea
> >> seems like a rather crazy idea but hey, there may be some goods in there.
> > >From what I remember, and I could be totally wrong, these patches were
> > way too complex and required that every subsystem change their
> > interfaces. That's not going to work out well, but read the email
> > threads for the details...
>
> I haven't seen your comment on my patches, except few general questions
> regarding one of earlier version of the framework.
> So maybe you are talking about different framework.
>
> Regarding complexity, if the subsystem have simple way of
> '(un)publishing' resources it just adds single calls to restrack core:
> restrack_up, restrack_down in proper places.
> Additionally it adds quite simple stuff to encapsulate resource
> description and allocation routines into generic *_restrack_desc
> structure, see for example patch adding restrack to phy framework[1].

Ok, again, I have no idea what my response was to a 2 year-old patchset,
again, I can't remember my response to a patchset that was sent just a
month ago...

update it, and repost and we can all go from there if you think it is a
viable solution.

thanks,

greg k-h

2016-11-10 00:43:43

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Mon, Nov 7, 2016 at 10:22 PM, Luis R. Rodriguez <[email protected]> wrote:
> On Thu, Oct 27, 2016 at 05:25:51PM +0200, Greg Kroah-Hartman wrote:
>> On Wed, Oct 26, 2016 at 01:19:02PM +0200, Lukas Wunner wrote:
>> > Hi Rafael,
>> >
>> > sorry for not responding to v5 of your series earlier, just sending
>> > this out now in the hope that it reaches you before your travels.
>> >
>> > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote:
>> > > - Modify device_links_check_suppliers(), device_links_driver_bound(),
>> > > device_links_no_driver(), device_links_driver_cleanup(), device_links_busy(),
>> > > and device_links_unbind_consumers() to walk link lists under device_links_lock
>> > > (to make the new "driver presence tracking" mechanism work reliably).
>> >
>> > This change might increase boot time if drivers return -EPROBE_DEFER.
>>
>> "might"? Please verify this before guessing....
>>
>> And don't make this more complex than needed before actually determining
>> a real issue.
>
> As clarified by Rafael at Plumbers, this functional dependencies
> framework assumes your driver / subsystem supports deferred probe,

It isn't particularly clear what you mean by "support" here.

I guess that you mean that it will allow the ->probe callback to be
invoked for multiple times for the same device/driver combination
without issues. If that's the case, the way the new code uses
-EPROBE_DEFER doesn't interfere with this, because it will not invoke
the ->probe callbacks for consumers at all until their (required)
suppliers are ready.

> if it does not support its not clear what will happen....

I don't see any problems here, but if you see any, please just say
what they are.

> We have no explicit semantics to check if a driver / subsystem
> supports deferred probe.

That's correct, but then do we need it?

Thanks,
Rafael

2016-11-10 00:59:59

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Wed, Nov 9, 2016 at 4:43 PM, Rafael J. Wysocki <[email protected]> wrote:
> On Mon, Nov 7, 2016 at 10:22 PM, Luis R. Rodriguez <[email protected]> wrote:
>> On Thu, Oct 27, 2016 at 05:25:51PM +0200, Greg Kroah-Hartman wrote:
>>> On Wed, Oct 26, 2016 at 01:19:02PM +0200, Lukas Wunner wrote:
>>> > Hi Rafael,
>>> >
>>> > sorry for not responding to v5 of your series earlier, just sending
>>> > this out now in the hope that it reaches you before your travels.
>>> >
>>> > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote:
>>> > > - Modify device_links_check_suppliers(), device_links_driver_bound(),
>>> > > device_links_no_driver(), device_links_driver_cleanup(), device_links_busy(),
>>> > > and device_links_unbind_consumers() to walk link lists under device_links_lock
>>> > > (to make the new "driver presence tracking" mechanism work reliably).
>>> >
>>> > This change might increase boot time if drivers return -EPROBE_DEFER.
>>>
>>> "might"? Please verify this before guessing....
>>>
>>> And don't make this more complex than needed before actually determining
>>> a real issue.
>>
>> As clarified by Rafael at Plumbers, this functional dependencies
>> framework assumes your driver / subsystem supports deferred probe,
>
> It isn't particularly clear what you mean by "support" here.

I noted some folks had reported issues, and you acknowledged that if
deferred probe was used in some drivers and if this created an issue
the same issue would be seen with this framework. AFAICT there are two
possible issues to consider:

1) the one Geert Uytterhoeven noted. Again I'll note what he had mentioned [0].

"Some drivers / subsystems don’t support deferred probe yet, such failures
usually don’t blow up, but cause subtle malfunctioning. Example, an Ethernet
phy could not get its interrupt as the primary IRQ chip had not been probed
yet, it reverted to polling though. Sub-optimal." [0]

[0] https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2016-August/003425.html

Geert can you provide more details?

2) Since deferred probe relies on late_initcall() if your driver must
load earlier than this deferred probe can create an issue. Andrzej had
you identified a driver that ran into this and had issues ? If not
this seems like a semantics thing we should consider in extending the
documentation for drivers so that driver writers are aware of this
limitation. I would suppose candidates for this would be anything not
using module_init() or late_initcall() on their inits and have a
probe.

>> if it does not support its not clear what will happen....
>
> I don't see any problems here, but if you see any, please just say
> what they are.
>
>> We have no explicit semantics to check if a driver / subsystem
>> supports deferred probe.
>
> That's correct, but then do we need it?

We can determine this by reviewing the two items above.

Luis

2016-11-10 01:07:30

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Mon, Nov 7, 2016 at 10:39 PM, Luis R. Rodriguez <[email protected]> wrote:
> On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote:
>> From: Rafael J. Wysocki <[email protected]>
>>
>> Currently, there is a problem with taking functional dependencies
>> between devices into account.
>>
>> What I mean by a "functional dependency" is when the driver of device
>> B needs device A to be functional and (generally) its driver to be
>> present in order to work properly. This has certain consequences
>> for power management (suspend/resume and runtime PM ordering) and
>> shutdown ordering of these devices. In general, it also implies that
>> the driver of A needs to be working for B to be probed successfully
>> and it cannot be unbound from the device before the B's driver.
>>
>> Support for representing those functional dependencies between
>> devices is added here to allow the driver core to track them and act
>> on them in certain cases where applicable.
>>
>> The argument for doing that in the driver core is that there are
>> quite a few distinct use cases involving device dependencies, they
>> are relatively hard to get right in a driver (if one wants to
>> address all of them properly) and it only gets worse if multiplied
>> by the number of drivers potentially needing to do it.
>
> How many drivers actually *need this* today for suspend / resume ?

Admittedly, I don't have a list, but from discussions I had in the
past it looked like at least a dozen.

Plus, there is this pesky ACPI _DEP thing that in some cases is
actually correct and (before this patchest) there is no way to follow
it (a classical example here is when ACPI power management methods on
one device are implemented in terms of I2C accesses and require a
specific I2C controller to be present and functional for the PM of the
device in question to work, but there are other cases like that).

> How many of these are because of ACPI firmware bugs rather than
> some other reason ?

Two things here (as I said elsewhere). There is the order in which
operations (eg. async suspend resume of devices during system-wide
state transitions) are started and there is the order in which they
are run/completed. Even if the former is right, the latter still may
not be correct and there needs to be a generic way to make that
happen.

The "firmware bugs" above affect the first item and even if it is
entirely correct (ie. all PM operations for all devices are always
started in the right order, consumers before suppliers or the other
way around depending on whether this is suspend or resume), there
still is the second item to address.

Apart from this, even if the information provided by the firmware does
not lead to a device registration ordering that will produce the
correct dpm_list ordering (for example), it may still not be clear
whether or not that's a bug in the firmware. The device registration
ordering itself is just not sufficient in general.

> Can ACPI somehow be used instead by these devices to address quirks?
>
>> Morever, at
>> least one case (asynchronous system suspend/resume) cannot be handled
>> in a single driver at all, because it requires the driver of A to
>> wait for B to suspend (during system suspend) and the driver of B to
>> wait for A to resume (during system resume).
>
> Why is this all of a sudden a new issue? It seems there is quite a bit of
> frameworks already out there that somehow deal with some sort of device
> ordering / dependencies, and I am curious if they've been addressing some of
> these problems for a while now on their own somehow with their own solutions,
> is that the case? For instance can DAPM the / DRM / Audio component framework
> v4l async solution be used or does it already address some of these concerns ?

None of them don't address all of the problems involved in a
sufficiently generic way to my knowledge.

>> For this reason, represent dependencies between devices as "links",
>> with the help of struct device_link objects each containing pointers
>> to the "linked" devices, a list node for each of them, status
>> information, flags, and an RCU head for synchronization.
>>
>> Also add two new list heads, representing the lists of links to the
>> devices that depend on the given one (consumers) and to the devices
>> depended on by it (suppliers), and a "driver presence status" field
>> (needed for figuring out initial states of device links) to struct
>> device.
>>
>> The entire data structure consisting of all of the lists of link
>> objects for all devices is protected by a mutex (for link object
>> addition/removal and for list walks during device driver probing
>> and removal) and by SRCU (for list walking in other case that will
>> be introduced by subsequent change sets). In addition, each link
>> object has an internal status field whose value reflects whether or
>> not drivers are bound to the devices pointed to by the link or
>> probing/removal of their drivers is in progress etc. That field
>> is only modified under the device links mutex, but it may be read
>> outside of it in some cases (introduced by subsequent change sets),
>> so modifications of it are annotated with WRITE_ONCE().
>>
>> New links are added by calling device_link_add() which takes three
>> arguments: pointers to the devices in question and flags. In
>> particular, if DL_FLAG_STATELESS is set in the flags, the link status
>> is not to be taken into account for this link and the driver core
>> will not manage it. In turn, if DL_FLAG_AUTOREMOVE is set in the
>> flags, the driver core will remove the link automatically when the
>> consumer device driver unbinds from it.
>>
>> One of the actions carried out by device_link_add() is to reorder
>> the lists used for device shutdown and system suspend/resume to
>> put the consumer device along with all of its children and all of
>> its consumers (and so on, recursively) to the ends of those lists
>> in order to ensure the right ordering between all of the supplier
>> and consumer devices.
>
> There's no explanation as to why this order is ensured to be
> correct, I think its important to document this.

Agreed. There is a documentation work in progress (started by Lukas).
We will get to that point.

> From our discussions
> at Plumbers it seems the order is ensured due to the fact that order
> was already implicitly provided through platform firmware (ACPI
> enumeration is one), adjusting order on the dpm list is just shuffling
> order between consumer / provider, but nothing else.

That basically is the case.

> In theory it
> works because in the simple case this should suffice however I
> remain unconvinced that if we have more users of this framework this
> simple algorithm will suffice. Having a test driver or series of
> test drivers that shows this would be good. As it stands there is simply
> an assumption that this is correct, how *strongly* do you feel that
> the order will *always* be correct if this is done and no cycles
> can be added, even if tons of drivers start using this ?

Cycles cannot be added because of the way the code works. It will
just return NULL from device_link_add() if it detects a cycle (that it
cannot deal with).

>> For this reason, it is not possible to create a link between two
>> devices if the would-be supplier device already depends on the
>> would-be consumer device as either a direct descendant of it or a
>> consumer of one of its direct descendants or one of its consumers
>> and so on.
>>
>> There are two types of link objects, persistent and non-persistent.
>> The persistent ones stay around until one of the target devices is
>> deleted, while the non-persistent ones are removed automatically when
>> the consumer driver unbinds from its device (ie. they are assumed to
>> be valid only as long as the consumer device has a driver bound to
>> it). Persistent links are created by default and non-persistent
>> links are created when the DL_FLAG_AUTOREMOVE flag is passed
>> to device_link_add().
>>
>> Both persistent and non-persistent device links can be deleted
>> with an explicit call to device_link_del().
>>
>> Links created without the DL_FLAG_STATELESS flag set are managed
>> by the driver core using a simple state machine. There are 5 states
>> each link can be in: DORMANT (unused), AVAILABLE (the supplier driver
>> is present and functional), CONSUMER_PROBE (the consumer driver is
>> probing), ACTIVE (both supplier and consumer drivers are present and
>> functional), and SUPPLIER_UNBIND (the supplier driver is unbinding).
>> The driver core updates the link state automatically depending on
>> what happens to the linked devices and for each link state specific
>> actions are taken in addition to that.
>>
>> For example, if the supplier driver unbinds from its device, the
>> driver core will also unbind the drivers of all of its consumers
>> automatically under the assumption that they cannot function
>> properly without the supplier. Analogously, the driver core will
>> only allow the consumer driver to bind to its device if the
>> supplier driver is present and functional (ie. the link is in
>> the AVAILABLE state). If that's not the case, it will rely on
>> the existing deferred probing mechanism to wait for the supplier
>> driver to become available.
>
> This assumes drivers and subsystems support deferred probe,

No, I don't think so.

Where does that assumption matter, exactly?

> is there a way to ensure that drivers that use this framework support
> it instead of possibly breaking them if they use this framework ?

The whole driver/device binding tracking thing is not mandatory in the
first place. You can just tell the framework to ignore that part and
just use the links for PM and shutdown.

Thanks,
Rafael

2016-11-10 07:05:31

by Laurent Pinchart

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

Hi Luis,

On Monday 07 Nov 2016 22:39:54 Luis R. Rodriguez wrote:
> On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <[email protected]>
> >
> > Currently, there is a problem with taking functional dependencies
> > between devices into account.
> >
> > What I mean by a "functional dependency" is when the driver of device
> > B needs device A to be functional and (generally) its driver to be
> > present in order to work properly. This has certain consequences
> > for power management (suspend/resume and runtime PM ordering) and
> > shutdown ordering of these devices. In general, it also implies that
> > the driver of A needs to be working for B to be probed successfully
> > and it cannot be unbound from the device before the B's driver.
> >
> > Support for representing those functional dependencies between
> > devices is added here to allow the driver core to track them and act
> > on them in certain cases where applicable.
> >
> > The argument for doing that in the driver core is that there are
> > quite a few distinct use cases involving device dependencies, they
> > are relatively hard to get right in a driver (if one wants to
> > address all of them properly) and it only gets worse if multiplied
> > by the number of drivers potentially needing to do it.
>
> How many drivers actually *need this* today for suspend / resume ?

I don't think there's a list, but just speaking for Renesas R-Car platforms
such a dependency exists from all DMA bus masters to the system IOMMUs (we're
talking about dozens of devices here), as well as from the display, video
codec and video processing IP cores to transparent memory access IP cores that
handle burst access and compression/decompression.

> How many of these are because of ACPI firmware bugs rather than
> some other reason ?

None from the list above, even with s/ACPI/DT/.

> Can ACPI somehow be used instead by these devices to address quirks?

Certainly not on ARM embedded platforms :-)

> > Morever, at least one case (asynchronous system suspend/resume) cannot be
> > handled in a single driver at all, because it requires the driver of A to
> > wait for B to suspend (during system suspend) and the driver of B to
> > wait for A to resume (during system resume).
>
> Why is this all of a sudden a new issue?

It's not a new issue, we've just never addressed it properly so far. Various
workarounds existed, with various level of success (usually more for runtime
PM than system suspend/resume).

> It seems there is quite a bit of frameworks already out there that somehow
> deal with some sort of device ordering / dependencies, and I am curious if
> they've been addressing some of these problems for a while now on their own
> somehow with their own solutions, is that the case? For instance can DAPM
> the / DRM / Audio component framework v4l async solution be used or does it
> already address some of these concerns ?
>
> > For this reason, represent dependencies between devices as "links",
> > with the help of struct device_link objects each containing pointers
> > to the "linked" devices, a list node for each of them, status
> > information, flags, and an RCU head for synchronization.
> >
> > Also add two new list heads, representing the lists of links to the
> > devices that depend on the given one (consumers) and to the devices
> > depended on by it (suppliers), and a "driver presence status" field
> > (needed for figuring out initial states of device links) to struct
> > device.
> >
> > The entire data structure consisting of all of the lists of link
> > objects for all devices is protected by a mutex (for link object
> > addition/removal and for list walks during device driver probing
> > and removal) and by SRCU (for list walking in other case that will
> > be introduced by subsequent change sets). In addition, each link
> > object has an internal status field whose value reflects whether or
> > not drivers are bound to the devices pointed to by the link or
> > probing/removal of their drivers is in progress etc. That field
> > is only modified under the device links mutex, but it may be read
> > outside of it in some cases (introduced by subsequent change sets),
> > so modifications of it are annotated with WRITE_ONCE().
> >
> > New links are added by calling device_link_add() which takes three
> > arguments: pointers to the devices in question and flags. In
> > particular, if DL_FLAG_STATELESS is set in the flags, the link status
> > is not to be taken into account for this link and the driver core
> > will not manage it. In turn, if DL_FLAG_AUTOREMOVE is set in the
> > flags, the driver core will remove the link automatically when the
> > consumer device driver unbinds from it.
> >
> > One of the actions carried out by device_link_add() is to reorder
> > the lists used for device shutdown and system suspend/resume to
> > put the consumer device along with all of its children and all of
> > its consumers (and so on, recursively) to the ends of those lists
> > in order to ensure the right ordering between all of the supplier
> > and consumer devices.
>
> There's no explanation as to why this order is ensured to be
> correct, I think its important to document this. From our discussions
> at Plumbers it seems the order is ensured due to the fact that order
> was already implicitly provided through platform firmware (ACPI
> enumeration is one), adjusting order on the dpm list is just shuffling
> order between consumer / provider, but nothing else. In theory it
> works because in the simple case this should suffice however I
> remain unconvinced that if we have more users of this framework this
> simple algorithm will suffice. Having a test driver or series of
> test drivers that shows this would be good. As it stands there is simply
> an assumption that this is correct, how *strongly* do you feel that
> the order will *always* be correct if this is done and no cycles
> can be added, even if tons of drivers start using this ?
>
> > For this reason, it is not possible to create a link between two
> > devices if the would-be supplier device already depends on the
> > would-be consumer device as either a direct descendant of it or a
> > consumer of one of its direct descendants or one of its consumers
> > and so on.
> >
> > There are two types of link objects, persistent and non-persistent.
> > The persistent ones stay around until one of the target devices is
> > deleted, while the non-persistent ones are removed automatically when
> > the consumer driver unbinds from its device (ie. they are assumed to
> > be valid only as long as the consumer device has a driver bound to
> > it). Persistent links are created by default and non-persistent
> > links are created when the DL_FLAG_AUTOREMOVE flag is passed
> > to device_link_add().
> >
> > Both persistent and non-persistent device links can be deleted
> > with an explicit call to device_link_del().
> >
> > Links created without the DL_FLAG_STATELESS flag set are managed
> > by the driver core using a simple state machine. There are 5 states
> > each link can be in: DORMANT (unused), AVAILABLE (the supplier driver
> > is present and functional), CONSUMER_PROBE (the consumer driver is
> > probing), ACTIVE (both supplier and consumer drivers are present and
> > functional), and SUPPLIER_UNBIND (the supplier driver is unbinding).
> > The driver core updates the link state automatically depending on
> > what happens to the linked devices and for each link state specific
> > actions are taken in addition to that.
> >
> > For example, if the supplier driver unbinds from its device, the
> > driver core will also unbind the drivers of all of its consumers
> > automatically under the assumption that they cannot function
> > properly without the supplier. Analogously, the driver core will
> > only allow the consumer driver to bind to its device if the
> > supplier driver is present and functional (ie. the link is in
> > the AVAILABLE state). If that's not the case, it will rely on
> > the existing deferred probing mechanism to wait for the supplier
> > driver to become available.
>
> This assumes drivers and subsystems support deferred probe, is
> there a way to ensure that drivers that use this framework support
> it instead of possibly breaking them if they use this framework ?

--
Regards,

Laurent Pinchart

2016-11-10 07:14:32

by Laurent Pinchart

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

Hi Luis,

On Wednesday 09 Nov 2016 16:59:30 Luis R. Rodriguez wrote:
> On Wed, Nov 9, 2016 at 4:43 PM, Rafael J. Wysocki wrote:
> > On Mon, Nov 7, 2016 at 10:22 PM, Luis R. Rodriguez wrote:
> >> On Thu, Oct 27, 2016 at 05:25:51PM +0200, Greg Kroah-Hartman wrote:
> >>> On Wed, Oct 26, 2016 at 01:19:02PM +0200, Lukas Wunner wrote:
> >>>> Hi Rafael,
> >>>>
> >>>> sorry for not responding to v5 of your series earlier, just sending
> >>>> this out now in the hope that it reaches you before your travels.
> >>>>
> >>>> On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote:
> >>>>> - Modify device_links_check_suppliers(),
> >>>>> device_links_driver_bound(),
> >>>>>
> >>>>> device_links_no_driver(), device_links_driver_cleanup(),
> >>>>> device_links_busy(), and device_links_unbind_consumers() to walk
> >>>>> link lists under device_links_lock (to make the new "driver
> >>>>> presence tracking" mechanism work reliably).
> >>>>
> >>>> This change might increase boot time if drivers return -EPROBE_DEFER.
> >>>
> >>> "might"? Please verify this before guessing....
> >>>
> >>> And don't make this more complex than needed before actually determining
> >>> a real issue.
> >>
> >> As clarified by Rafael at Plumbers, this functional dependencies
> >> framework assumes your driver / subsystem supports deferred probe,
> >
> > It isn't particularly clear what you mean by "support" here.
>
> I noted some folks had reported issues, and you acknowledged that if
> deferred probe was used in some drivers and if this created an issue
> the same issue would be seen with this framework. AFAICT there are two
> possible issues to consider:
>
> 1) the one Geert Uytterhoeven noted. Again I'll note what he had mentioned
> [0].
>
> "Some drivers / subsystems don’t support deferred probe yet, such failures
> usually don’t blow up, but cause subtle malfunctioning. Example, an
> Ethernet phy could not get its interrupt as the primary IRQ chip had not
> been probed yet, it reverted to polling though. Sub-optimal." [0]
>
> [0]
> https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2016-August/003
> 425.html
>
> Geert can you provide more details?

This is a more global issue. In many cases drivers depend on optional
resources. They are able to operate in a degraded mode (reduced feature set,
reduced performances, ...) when those resources are not present. They can
easily determine at probe time whether those resources are present, but have
no way to know, in case they're absent, whether they will be present at some
point in the near future (due to another driver probing the device providing
the resource for instance) or if they will never be present (for instance
because the required driver is missing). In the first case it would make sense
to defer probe, in the latter case deferring probe forever for missing
optional resources would prevent the device from being probed successfully at
all.

The functional dependencies tracking patch series isn't meant to address this
issue. I can imagine a framework that would notify drivers of optional
resource availability after probe time, but it would come at a high cost for
drivers as switching between modes of operation at runtime based on the
availability of such resources would be way more complex than a mechanism
based on probe deferral.

> 2) Since deferred probe relies on late_initcall() if your driver must
> load earlier than this deferred probe can create an issue. Andrzej had
> you identified a driver that ran into this and had issues ? If not
> this seems like a semantics thing we should consider in extending the
> documentation for drivers so that driver writers are aware of this
> limitation. I would suppose candidates for this would be anything not
> using module_init() or late_initcall() on their inits and have a
> probe.
>
> >> if it does not support its not clear what will happen....
> >
> > I don't see any problems here, but if you see any, please just say
> > what they are.
> >
> >> We have no explicit semantics to check if a driver / subsystem
> >> supports deferred probe.
> >
> > That's correct, but then do we need it?
>
> We can determine this by reviewing the two items above.

--
Regards,

Laurent Pinchart


2016-11-10 08:46:45

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Thu, Nov 10, 2016 at 1:59 AM, Luis R. Rodriguez <[email protected]> wrote:
> On Wed, Nov 9, 2016 at 4:43 PM, Rafael J. Wysocki <[email protected]> wrote:
>> On Mon, Nov 7, 2016 at 10:22 PM, Luis R. Rodriguez <[email protected]> wrote:
>>> As clarified by Rafael at Plumbers, this functional dependencies
>>> framework assumes your driver / subsystem supports deferred probe,
>>
>> It isn't particularly clear what you mean by "support" here.
>
> I noted some folks had reported issues, and you acknowledged that if
> deferred probe was used in some drivers and if this created an issue
> the same issue would be seen with this framework. AFAICT there are two
> possible issues to consider:
>
> 1) the one Geert Uytterhoeven noted. Again I'll note what he had mentioned [0].
>
> "Some drivers / subsystems don’t support deferred probe yet, such failures
> usually don’t blow up, but cause subtle malfunctioning. Example, an Ethernet
> phy could not get its interrupt as the primary IRQ chip had not been probed
> yet, it reverted to polling though. Sub-optimal." [0]
>
> [0] https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2016-August/003425.html
>
> Geert can you provide more details?

Issue reported in "of_mdiobus_register_phy() and deferred probe"
(http://lkml.iu.edu/hypermail/linux/kernel/1510.2/05770.html)

Key point is:
"However, of_mdiobus_register_phy() uses irq_of_parse_and_map(), which plainly
ignores EPROBE_DEFER, and it just continues."

At that time, the PHY driver fell back to polling, but as of commit d5c3d8465
("net: phy: Avoid polling PHY with PHY_IGNORE_INTERRUPTS") that's no longer the
case, and now the PHY fails to work completely.

Workaround is "[PATCH v2] irqchip/renesas-irqc: Postpone driver initialization"
(https://www.spinics.net/lists/netdev/msg403325.html), which seems to have
sparked some interest in fixing the issue for good ;-)

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2016-11-10 22:04:14

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Thu, Nov 10, 2016 at 09:14:32AM +0200, Laurent Pinchart wrote:
> Hi Luis,
>
> On Wednesday 09 Nov 2016 16:59:30 Luis R. Rodriguez wrote:
> > On Wed, Nov 9, 2016 at 4:43 PM, Rafael J. Wysocki wrote:
> > > On Mon, Nov 7, 2016 at 10:22 PM, Luis R. Rodriguez wrote:
> > >> On Thu, Oct 27, 2016 at 05:25:51PM +0200, Greg Kroah-Hartman wrote:
> > >>> On Wed, Oct 26, 2016 at 01:19:02PM +0200, Lukas Wunner wrote:
> > >>>> Hi Rafael,
> > >>>>
> > >>>> sorry for not responding to v5 of your series earlier, just sending
> > >>>> this out now in the hope that it reaches you before your travels.
> > >>>>
> > >>>> On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote:
> > >>>>> - Modify device_links_check_suppliers(),
> > >>>>> device_links_driver_bound(),
> > >>>>>
> > >>>>> device_links_no_driver(), device_links_driver_cleanup(),
> > >>>>> device_links_busy(), and device_links_unbind_consumers() to walk
> > >>>>> link lists under device_links_lock (to make the new "driver
> > >>>>> presence tracking" mechanism work reliably).
> > >>>>
> > >>>> This change might increase boot time if drivers return -EPROBE_DEFER.
> > >>>
> > >>> "might"? Please verify this before guessing....
> > >>>
> > >>> And don't make this more complex than needed before actually determining
> > >>> a real issue.
> > >>
> > >> As clarified by Rafael at Plumbers, this functional dependencies
> > >> framework assumes your driver / subsystem supports deferred probe,
> > >
> > > It isn't particularly clear what you mean by "support" here.
> >
> > I noted some folks had reported issues, and you acknowledged that if
> > deferred probe was used in some drivers and if this created an issue
> > the same issue would be seen with this framework. AFAICT there are two
> > possible issues to consider:
> >
> > 1) the one Geert Uytterhoeven noted. Again I'll note what he had mentioned
> > [0].
> >
> > "Some drivers / subsystems don’t support deferred probe yet, such failures
> > usually don’t blow up, but cause subtle malfunctioning. Example, an
> > Ethernet phy could not get its interrupt as the primary IRQ chip had not
> > been probed yet, it reverted to polling though. Sub-optimal." [0]
> >
> > [0]
> > https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2016-August/003
> > 425.html
> >
> > Geert can you provide more details?
>
> This is a more global issue. In many cases drivers depend on optional
> resources. They are able to operate in a degraded mode (reduced feature set,
> reduced performances, ...) when those resources are not present. They can
> easily determine at probe time whether those resources are present, but have
> no way to know, in case they're absent, whether they will be present at some
> point in the near future (due to another driver probing the device providing
> the resource for instance) or if they will never be present (for instance
> because the required driver is missing).

I see thanks, so -EPROBE_DEFER assumes a late_initcall() would suffice to load
all necessary requirements.

> In the first case it would make sense to defer probe,

So if the assumption is correct then it -EPROBE_DEFER should work.

> in the latter case deferring probe forever for missing
> optional resources would prevent the device from being probed successfully at
> all.

Right I see. And the driver core has no way to know what things *may* be
needed.

> The functional dependencies tracking patch series isn't meant to address this
> issue.

Right, however it does track functional dependencies for suspend/run time PM
and we were certainly in hope this could help with probe ordering *later* in
the future. This is a separate topic. But more on point, the issue here is that
this framework relies on -EPROBE_DEFER -- so the issues discussed with it still
exist and should be properly documented.

> I can imagine a framework that would notify drivers of optional
> resource availability after probe time, but it would come at a high cost for
> drivers as switching between modes of operation at runtime based on the
> availability of such resources would be way more complex than a mechanism
> based on probe deferral.

Right, I see.

This is more forward looking, but -- if we had an annotation in Kconfig/turned
to a mod info section, or to start off with just a driver MODULE_SUGGESTS() macro
to start off with it might suffice for the driver core to request_module()
annotated dependencies, such requests could be explicitly suggested as
synchronous so init + probe do run together (as-is today), after which it
could know that all possible drivers that needed to be loaded should now be
loaded. If this sounds plausible to help, do we have drivers where we can
test this on? For instance, since the functional dependency framework
annotates functional dependencies for consumers/providers for suspend/resume
and un time PM could such MODULE_SUGGESTS() annotations be considered on the
consumers to suggest the provider drivers so their own probe yields to their
providers to try first ?

Luis

> > 2) Since deferred probe relies on late_initcall() if your driver must
> > load earlier than this deferred probe can create an issue. Andrzej had
> > you identified a driver that ran into this and had issues ? If not
> > this seems like a semantics thing we should consider in extending the
> > documentation for drivers so that driver writers are aware of this
> > limitation. I would suppose candidates for this would be anything not
> > using module_init() or late_initcall() on their inits and have a
> > probe.
> >
> > >> if it does not support its not clear what will happen....
> > >
> > > I don't see any problems here, but if you see any, please just say
> > > what they are.
> > >
> > >> We have no explicit semantics to check if a driver / subsystem
> > >> supports deferred probe.
> > >
> > > That's correct, but then do we need it?
> >
> > We can determine this by reviewing the two items above.
>
> --
> Regards,
>
> Laurent Pinchart
>
>

--
Luis Rodriguez, SUSE LINUX GmbH
Maxfeldstrasse 5; D-90409 Nuernberg

2016-11-10 22:13:07

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Thu, Nov 10, 2016 at 09:46:42AM +0100, Geert Uytterhoeven wrote:
> On Thu, Nov 10, 2016 at 1:59 AM, Luis R. Rodriguez <[email protected]> wrote:
> > On Wed, Nov 9, 2016 at 4:43 PM, Rafael J. Wysocki <[email protected]> wrote:
> >> On Mon, Nov 7, 2016 at 10:22 PM, Luis R. Rodriguez <[email protected]> wrote:
> >>> As clarified by Rafael at Plumbers, this functional dependencies
> >>> framework assumes your driver / subsystem supports deferred probe,
> >>
> >> It isn't particularly clear what you mean by "support" here.
> >
> > I noted some folks had reported issues, and you acknowledged that if
> > deferred probe was used in some drivers and if this created an issue
> > the same issue would be seen with this framework. AFAICT there are two
> > possible issues to consider:
> >
> > 1) the one Geert Uytterhoeven noted. Again I'll note what he had mentioned [0].
> >
> > "Some drivers / subsystems don’t support deferred probe yet, such failures
> > usually don’t blow up, but cause subtle malfunctioning. Example, an Ethernet
> > phy could not get its interrupt as the primary IRQ chip had not been probed
> > yet, it reverted to polling though. Sub-optimal." [0]
> >
> > [0] https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2016-August/003425.html
> >
> > Geert can you provide more details?
>
> Issue reported in "of_mdiobus_register_phy() and deferred probe"
> (http://lkml.iu.edu/hypermail/linux/kernel/1510.2/05770.html)
>
> Key point is:
> "However, of_mdiobus_register_phy() uses irq_of_parse_and_map(), which plainly
> ignores EPROBE_DEFER, and it just continues."
>
> At that time, the PHY driver fell back to polling, but as of commit d5c3d8465
> ("net: phy: Avoid polling PHY with PHY_IGNORE_INTERRUPTS") that's no longer the
> case, and now the PHY fails to work completely.
>
> Workaround is "[PATCH v2] irqchip/renesas-irqc: Postpone driver initialization"
> (https://www.spinics.net/lists/netdev/msg403325.html), which seems to have
> sparked some interest in fixing the issue for good ;-)

Ah playing with init levels. You are lucky here that this suffices, the
IOMMU folks already ran out with enough levels to play with so they cannot
resolve their issue this way, for instance.

Luis

2016-11-10 22:40:51

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Thu, Nov 10, 2016 at 11:04:07PM +0100, Luis R. Rodriguez wrote:
> This is more forward looking, but -- if we had an annotation in Kconfig/turned
> to a mod info section, or to start off with just a driver MODULE_SUGGESTS() macro
> to start off with it might suffice for the driver core to request_module()
> annotated dependencies, such requests could be explicitly suggested as
> synchronous so init + probe do run together (as-is today), after which it
> could know that all possible drivers that needed to be loaded should now be
> loaded. If this sounds plausible to help, do we have drivers where we can
> test this on? For instance, since the functional dependency framework
> annotates functional dependencies for consumers/providers for suspend/resume
> and un time PM could such MODULE_SUGGESTS() annotations be considered on the
> consumers to suggest the provider drivers so their own probe yields to their
> providers to try first ?

No.

Stop.

First off, the "driver core" NEVER can "know" if "all possible drivers
that should be loaded, are loaded. That way lies madness and
impossibility.

Secondly, yet-another-section isn't going to help anything here, we
alredy "suggest" to userspace a bunch of stuff, so we get the needed
modules loaded, at sometime in the future, if they are around, and
userspace feels like it. That's the best we can ever do.

Don't try to make this more difficult than it is please. DEFER works
today really really well, and it's really really simple.
Inter-dependancy of modules and devices connected to each other are two
different things, be careful about this.

thanks,

greg k-h

2016-11-10 23:09:15

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Thu, Nov 10, 2016 at 09:05:30AM +0200, Laurent Pinchart wrote:
> Hi Luis,
>
> On Monday 07 Nov 2016 22:39:54 Luis R. Rodriguez wrote:
> > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <[email protected]>
> > >
> > > Currently, there is a problem with taking functional dependencies
> > > between devices into account.
> > >
> > > What I mean by a "functional dependency" is when the driver of device
> > > B needs device A to be functional and (generally) its driver to be
> > > present in order to work properly. This has certain consequences
> > > for power management (suspend/resume and runtime PM ordering) and
> > > shutdown ordering of these devices. In general, it also implies that
> > > the driver of A needs to be working for B to be probed successfully
> > > and it cannot be unbound from the device before the B's driver.
> > >
> > > Support for representing those functional dependencies between
> > > devices is added here to allow the driver core to track them and act
> > > on them in certain cases where applicable.
> > >
> > > The argument for doing that in the driver core is that there are
> > > quite a few distinct use cases involving device dependencies, they
> > > are relatively hard to get right in a driver (if one wants to
> > > address all of them properly) and it only gets worse if multiplied
> > > by the number of drivers potentially needing to do it.
> >
> > How many drivers actually *need this* today for suspend / resume ?
>
> I don't think there's a list, but just speaking for Renesas R-Car platforms
> such a dependency exists from all DMA bus masters to the system IOMMUs (we're
> talking about dozens of devices here), as well as from the display, video
> codec and video processing IP cores to transparent memory access IP cores that
> handle burst access and compression/decompression.

This is very significant, thanks.

> > How many of these are because of ACPI firmware bugs rather than
> > some other reason ?
>
> None from the list above, even with s/ACPI/DT/.
>
> > Can ACPI somehow be used instead by these devices to address quirks?
>
> Certainly not on ARM embedded platforms :-)

So this framework adds a generic way.

> > > Morever, at least one case (asynchronous system suspend/resume) cannot be
> > > handled in a single driver at all, because it requires the driver of A to
> > > wait for B to suspend (during system suspend) and the driver of B to
> > > wait for A to resume (during system resume).
> >
> > Why is this all of a sudden a new issue?
>
> It's not a new issue, we've just never addressed it properly so far. Various
> workarounds existed, with various level of success (usually more for runtime
> PM than system suspend/resume).

This helps a lot. I think there has at this point been enough effort to
try to inform as many folks who might have these workaround or their own
solution to voice their concerns, I tried to review some of the existing
solutions myself but its hard to get a quick grasp of them -- provided no one
else yells out I welcome then this change as a proper evolution.

I suppose the only last remaining concerns I had were aggregating on top of
deferred probe and the lack of a test driver to test complex topologies
using this framework to test order is never broken. For the first concern
it would seem that folks using this framework would actually test and ensure
device links works for their drivers, even if deferred probe is used. For
the later concern, a separate test driver could be added later.

Luis

2016-11-11 00:08:36

by Laurent Pinchart

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

Hi Greg,

On Thursday 10 Nov 2016 23:40:54 Greg Kroah-Hartman wrote:
> On Thu, Nov 10, 2016 at 11:04:07PM +0100, Luis R. Rodriguez wrote:
> > This is more forward looking, but -- if we had an annotation in
> > Kconfig/turned to a mod info section, or to start off with just a driver
> > MODULE_SUGGESTS() macro to start off with it might suffice for the driver
> > core to request_module() annotated dependencies, such requests could be
> > explicitly suggested as synchronous so init + probe do run together
> > (as-is today), after which it could know that all possible drivers that
> > needed to be loaded should now be loaded. If this sounds plausible to
> > help, do we have drivers where we can test this on? For instance, since
> > the functional dependency framework annotates functional dependencies for
> > consumers/providers for suspend/resume and un time PM could such
> > MODULE_SUGGESTS() annotations be considered on the consumers to suggest
> > the provider drivers so their own probe yields to their providers to try
> > first ?
>
> No.
>
> Stop.
>
> First off, the "driver core" NEVER can "know" if "all possible drivers
> that should be loaded, are loaded. That way lies madness and
> impossibility.
>
> Secondly, yet-another-section isn't going to help anything here, we
> alredy "suggest" to userspace a bunch of stuff, so we get the needed
> modules loaded, at sometime in the future, if they are around, and
> userspace feels like it. That's the best we can ever do.
>
> Don't try to make this more difficult than it is please. DEFER works
> today really really well, and it's really really simple.
> Inter-dependancy of modules and devices connected to each other are two
> different things, be careful about this.

One issue we don't address today is handling of optional dependencies. A
simple example is an SPI controller that can use a DMA engine or work in PIO
mode. At probe time the driver will request a DMA channel if the platform
(ACPI, DT, platform data) specifies that DMA is available. This can fail for
various reasons, one of them being that the DMA engine driver hasn't probed
the DMA device yet. In that case the SPI controller driver will continue in
PIO mode, ignoring the DMA engine that will later be probed. We can't defer
probing of the SPI controller as the DMA engine driver might never get loaded,
which would result in the SPI controller probe being deferred forever.

One solution for this type of dependency issue would be to notify the SPI
controller driver after probe that the DMA channel is now available. I'd like
to avoid that though, as it would drastically increase the complexity of lots
of drivers and create lots of race conditions.

There are certain configurations that we could possibly consider as invalid.
For instance if the SPI controller driver is built-in and the DMA engine
driver built as a module, the user clearly shot themselves in the foot and the
kernel can't be blamed.

For resources that can't be built as a module (IOMMUs for instance) we thus
only have to consider the case where both drivers are built-in, as the
resource built-in and consumer as a module should work properly from an
ordering point of view (at least as long as we don't allow asynchronous
probing of built-in drivers to be delayed enough for modules to be loaded...).
In this case, if the resource driver isn't available when the consumer is
probed, if will never be available at the consumer can safely proceed in a
degraded mode. We would thus only need to solve the probe ordering issue.

I'm not sure how far these simple(r) solutions that consider certain cases as
invalid would scale though, and whether we won't need a more generic solution
at some point anyway.

--
Regards,

Laurent Pinchart

2016-11-13 10:59:34

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Fri, Nov 11, 2016 at 02:08:35AM +0200, Laurent Pinchart wrote:
> Hi Greg,
>
> On Thursday 10 Nov 2016 23:40:54 Greg Kroah-Hartman wrote:
> > On Thu, Nov 10, 2016 at 11:04:07PM +0100, Luis R. Rodriguez wrote:
> > > This is more forward looking, but -- if we had an annotation in
> > > Kconfig/turned to a mod info section, or to start off with just a driver
> > > MODULE_SUGGESTS() macro to start off with it might suffice for the driver
> > > core to request_module() annotated dependencies, such requests could be
> > > explicitly suggested as synchronous so init + probe do run together
> > > (as-is today), after which it could know that all possible drivers that
> > > needed to be loaded should now be loaded. If this sounds plausible to
> > > help, do we have drivers where we can test this on? For instance, since
> > > the functional dependency framework annotates functional dependencies for
> > > consumers/providers for suspend/resume and un time PM could such
> > > MODULE_SUGGESTS() annotations be considered on the consumers to suggest
> > > the provider drivers so their own probe yields to their providers to try
> > > first ?
> >
> > No.
> >
> > Stop.
> >
> > First off, the "driver core" NEVER can "know" if "all possible drivers
> > that should be loaded, are loaded. That way lies madness and
> > impossibility.
> >
> > Secondly, yet-another-section isn't going to help anything here, we
> > alredy "suggest" to userspace a bunch of stuff, so we get the needed
> > modules loaded, at sometime in the future, if they are around, and
> > userspace feels like it. That's the best we can ever do.
> >
> > Don't try to make this more difficult than it is please. DEFER works
> > today really really well, and it's really really simple.
> > Inter-dependancy of modules and devices connected to each other are two
> > different things, be careful about this.
>
> One issue we don't address today is handling of optional dependencies. A
> simple example is an SPI controller that can use a DMA engine or work in PIO
> mode. At probe time the driver will request a DMA channel if the platform
> (ACPI, DT, platform data) specifies that DMA is available. This can fail for
> various reasons, one of them being that the DMA engine driver hasn't probed
> the DMA device yet. In that case the SPI controller driver will continue in
> PIO mode, ignoring the DMA engine that will later be probed. We can't defer
> probing of the SPI controller as the DMA engine driver might never get loaded,
> which would result in the SPI controller probe being deferred forever.
>
> One solution for this type of dependency issue would be to notify the SPI
> controller driver after probe that the DMA channel is now available. I'd like
> to avoid that though, as it would drastically increase the complexity of lots
> of drivers and create lots of race conditions.
>
> There are certain configurations that we could possibly consider as invalid.
> For instance if the SPI controller driver is built-in and the DMA engine
> driver built as a module, the user clearly shot themselves in the foot and the
> kernel can't be blamed.
>
> For resources that can't be built as a module (IOMMUs for instance) we thus
> only have to consider the case where both drivers are built-in, as the
> resource built-in and consumer as a module should work properly from an
> ordering point of view (at least as long as we don't allow asynchronous
> probing of built-in drivers to be delayed enough for modules to be loaded...).
> In this case, if the resource driver isn't available when the consumer is
> probed, if will never be available at the consumer can safely proceed in a
> degraded mode. We would thus only need to solve the probe ordering issue.
>
> I'm not sure how far these simple(r) solutions that consider certain cases as
> invalid would scale though, and whether we won't need a more generic solution
> at some point anyway.

I would love to see a generic solution that works for all of these
complex cases, as I agree with you, it's complex :)

But I have yet to see any such patches that implement this. As always,
I am very glad to review anything that people create, but I don't have
the time to work on such a solution myself at the moment.

thanks,

greg k-h

2016-11-13 16:57:37

by Lukas Wunner

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Tue, Nov 08, 2016 at 09:58:24PM +0100, Luis R. Rodriguez wrote:
> On Tue, Nov 08, 2016 at 08:43:35PM +0100, Greg Kroah-Hartman wrote:
> > Yes, you can iterate a
> > lot of times, but that's fine, we have time at boot to do that (and
> > really, it is fast.)
>
> Deferred probe is left for late_initcall() [1] -- this *assumes* that the
> driver/subsystem can be loaded so late, and as per Andrzej this solution
> isunacceptable/undesirable. So it would be unfair and incorrect to categorize
> all drivers in the same boat, in fact given this lone fact it may be
> worth revisiting the idea I mentioned about a capability aspect to support
> a late deferred probe later. We tend to assume its fine, however it does
> not seem to be the case.
>
> [1] drivers/base/dd.c:late_initcall(deferred_probe_initcall);

Note that driver_deferred_probe_trigger() is called not only from this
late_initcall but also from driver_bound(). In other words, whenever
a driver binds successfully, deferred devices will reprobe, and devices
which had to defer will not necessarily have to wait until late_initcall
until they're reprobed. The late_initcall is just a way to tell devices
"this is last call".

Thanks,

Lukas

2016-11-13 17:33:03

by Lukas Wunner

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Mon, Nov 07, 2016 at 10:39:54PM +0100, Luis R. Rodriguez wrote:
> On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote:
> > One of the actions carried out by device_link_add() is to reorder
> > the lists used for device shutdown and system suspend/resume to
> > put the consumer device along with all of its children and all of
> > its consumers (and so on, recursively) to the ends of those lists
> > in order to ensure the right ordering between all of the supplier
> > and consumer devices.
>
> There's no explanation as to why this order is ensured to be
> correct, I think its important to document this. From our discussions
> at Plumbers it seems the order is ensured due to the fact that order
> was already implicitly provided through platform firmware (ACPI
> enumeration is one), adjusting order on the dpm list is just shuffling
> order between consumer / provider, but nothing else.

ACPI specifies a hierarchy and the order on the dpm_list and
devices_kset is such that children are behind their parent.

A device link specifies a dependency that exists in addition
to the hierarchy, hence consumers need to be moved behind
their supplier. And not only the consumers themselves but
also recursively their children and consumers. Essentially
the entire subtree is moved to the back. That happens in
device_reorder_to_tail() in patch 2.

If another device is enumerated which acts as a supplier to
an existing other supplier, that other supplier and all its
dependents are moved behind the newly enumerated device,
and so on.

That is provably correct so long as no loops are introduced
in the dependency graph. That is checked by device_is_dependent(),
which is called from device_link_add(), and the addition of the
link is aborted if a loop is detected.

Best regards,

Lukas

2016-11-14 08:15:06

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

Hi Laurent,

On Fri, Nov 11, 2016 at 1:08 AM, Laurent Pinchart
<[email protected]> wrote:
> On Thursday 10 Nov 2016 23:40:54 Greg Kroah-Hartman wrote:
>> On Thu, Nov 10, 2016 at 11:04:07PM +0100, Luis R. Rodriguez wrote:
>> Don't try to make this more difficult than it is please. DEFER works
>> today really really well, and it's really really simple.
>> Inter-dependancy of modules and devices connected to each other are two
>> different things, be careful about this.
>
> One issue we don't address today is handling of optional dependencies. A
> simple example is an SPI controller that can use a DMA engine or work in PIO
> mode. At probe time the driver will request a DMA channel if the platform
> (ACPI, DT, platform data) specifies that DMA is available. This can fail for
> various reasons, one of them being that the DMA engine driver hasn't probed
> the DMA device yet. In that case the SPI controller driver will continue in
> PIO mode, ignoring the DMA engine that will later be probed. We can't defer
> probing of the SPI controller as the DMA engine driver might never get loaded,
> which would result in the SPI controller probe being deferred forever.
>
> One solution for this type of dependency issue would be to notify the SPI
> controller driver after probe that the DMA channel is now available. I'd like
> to avoid that though, as it would drastically increase the complexity of lots
> of drivers and create lots of race conditions.

Alternatively, the driver can request optional DMA resources at open time
instead of at probe time.
E.g. sh_sci does that in its uart_ops.startup() callback.

Regardless of that, DMA can fail temporarily for other reasons
(e.g. running out of channels).

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2016-11-14 13:48:43

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Sun, Nov 13, 2016 at 06:34:13PM +0100, Lukas Wunner wrote:
> On Mon, Nov 07, 2016 at 10:39:54PM +0100, Luis R. Rodriguez wrote:
> > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote:
> > > One of the actions carried out by device_link_add() is to reorder
> > > the lists used for device shutdown and system suspend/resume to
> > > put the consumer device along with all of its children and all of
> > > its consumers (and so on, recursively) to the ends of those lists
> > > in order to ensure the right ordering between all of the supplier
> > > and consumer devices.
> >
> > There's no explanation as to why this order is ensured to be
> > correct, I think its important to document this. From our discussions
> > at Plumbers it seems the order is ensured due to the fact that order
> > was already implicitly provided through platform firmware (ACPI
> > enumeration is one), adjusting order on the dpm list is just shuffling
> > order between consumer / provider, but nothing else.
>
> ACPI specifies a hierarchy and the order on the dpm_list and
> devices_kset is such that children are behind their parent.
>
> A device link specifies a dependency that exists in addition
> to the hierarchy, hence consumers need to be moved behind
> their supplier. And not only the consumers themselves but
> also recursively their children and consumers. Essentially
> the entire subtree is moved to the back. That happens in
> device_reorder_to_tail() in patch 2.

Ah neat, I failed to notice this full subtree tree move, its
rather important.

> If another device is enumerated which acts as a supplier to
> an existing other supplier, that other supplier and all its
> dependents are moved behind the newly enumerated device,
> and so on.
>
> That is probably correct so long as no loops are introduced
> in the dependency graph.

"Probably" is what concerns me, there is no formality about
the correctness of this.

> That is checked by device_is_dependent(),
> which is called from device_link_add(), and the addition of the
> link is aborted if a loop is detected.

And that is sufficient ?

Luis

2016-11-14 14:51:03

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Sun, Nov 13, 2016 at 11:59:42AM +0100, Greg Kroah-Hartman wrote:
> On Fri, Nov 11, 2016 at 02:08:35AM +0200, Laurent Pinchart wrote:
> > Hi Greg,
> >
> > On Thursday 10 Nov 2016 23:40:54 Greg Kroah-Hartman wrote:
> > > On Thu, Nov 10, 2016 at 11:04:07PM +0100, Luis R. Rodriguez wrote:
> > > > This is more forward looking, but -- if we had an annotation in
> > > > Kconfig/turned to a mod info section, or to start off with just a driver
> > > > MODULE_SUGGESTS() macro to start off with it might suffice for the driver
> > > > core to request_module() annotated dependencies, such requests could be
> > > > explicitly suggested as synchronous so init + probe do run together
> > > > (as-is today), after which it could know that all possible drivers that
> > > > needed to be loaded should now be loaded. If this sounds plausible to
> > > > help, do we have drivers where we can test this on? For instance, since
> > > > the functional dependency framework annotates functional dependencies for
> > > > consumers/providers for suspend/resume and un time PM could such
> > > > MODULE_SUGGESTS() annotations be considered on the consumers to suggest
> > > > the provider drivers so their own probe yields to their providers to try
> > > > first ?
> > >
> > > No.
> > >
> > > Stop.
> > >
> > > First off, the "driver core" NEVER can "know" if "all possible drivers
> > > that should be loaded, are loaded. That way lies madness and
> > > impossibility.

At first I had discarded the generic driver problem as a slightly unrelated
topic however its clear now its not. In terms of functional dependencies I
agree that not providing strict order is *sometimes* desirable to help with
simplicity. The generic driver problem can be described graph-wise: on a DAG,
we're considering a topology with where nodes have optional superior
alternatives, and what you seem to be advocating is a transition to an
alternative is better and much simpler than forcing order from the start. That
can only work so long as some intermediary dependencies are (for lack of a
better term) soft-nodes -- where transitions to better alternatives are
possible and can such transitions can be handled in software. You may have
(again for lack of a better term) hard-nodes though where an entry in the DAG
is required as a hard requirement immediately prior to letting another entry
proceed. An example here is the x86 IOMMU drivers and dependent GPU DRM
drivers, currently only link order asserts proper order given we have run out
of semantics in between to ensure proper order is correct.

> > > Secondly, yet-another-section isn't going to help anything here, we
> > > alredy "suggest" to userspace a bunch of stuff, so we get the needed
> > > modules loaded, at sometime in the future, if they are around, and
> > > userspace feels like it. That's the best we can ever do.

For some cases this is sufficient, for hard-nodes (term used above) though
if you get the incorrect order you may in the worst case oops.

> > > Don't try to make this more difficult than it is please. DEFER works
> > > today really really well, and it's really really simple.

It seems many disagree. What is clear is its simplicity outweighs the
complexity by alternatives which have been historically considered. This is
reasonable. Part of the reason probe ordering, as an optimization
consideration, came up while function dependencies for runtime PM and suspend
are being discussed is as we've determined this is a related problem and
at least for hard-nodes this is critical to resolve. For now we have enough
tools to work around problems for hard-nodes, but it would be silly for us
not to start thinking about ways to improve upon this for the future.

> > > Inter-dependancy of modules and devices connected to each other are two
> > > different things, be careful about this.

This is a *very* fair warning :)

> > One issue we don't address today is handling of optional dependencies. A
> > simple example is an SPI controller that can use a DMA engine or work in PIO
> > mode. At probe time the driver will request a DMA channel if the platform
> > (ACPI, DT, platform data) specifies that DMA is available. This can fail for
> > various reasons, one of them being that the DMA engine driver hasn't probed
> > the DMA device yet. In that case the SPI controller driver will continue in
> > PIO mode, ignoring the DMA engine that will later be probed. We can't defer
> > probing of the SPI controller as the DMA engine driver might never get loaded,
> > which would result in the SPI controller probe being deferred forever.
> >
> > One solution for this type of dependency issue would be to notify the SPI
> > controller driver after probe that the DMA channel is now available. I'd like
> > to avoid that though, as it would drastically increase the complexity of lots
> > of drivers and create lots of race conditions.
> >
> > There are certain configurations that we could possibly consider as invalid.
> > For instance if the SPI controller driver is built-in and the DMA engine
> > driver built as a module, the user clearly shot themselves in the foot and the
> > kernel can't be blamed.
> >
> > For resources that can't be built as a module (IOMMUs for instance) we thus
> > only have to consider the case where both drivers are built-in, as the
> > resource built-in and consumer as a module should work properly from an
> > ordering point of view (at least as long as we don't allow asynchronous
> > probing of built-in drivers to be delayed enough for modules to be loaded...).
> > In this case, if the resource driver isn't available when the consumer is
> > probed, if will never be available at the consumer can safely proceed in a
> > degraded mode. We would thus only need to solve the probe ordering issue.
> >
> > I'm not sure how far these simple(r) solutions that consider certain cases as
> > invalid would scale though, and whether we won't need a more generic solution
> > at some point anyway.
>
> I would love to see a generic solution that works for all of these
> complex cases, as I agree with you, it's complex :)
>
> But I have yet to see any such patches that implement this.

The generic driver topic is related but it certainly only part of the
picture. It seems there were enough folks interested in that topic though
so perhaps patches will be eventually produced for it.

> As always,
> I am very glad to review anything that people create, but I don't have
> the time to work on such a solution myself at the moment.

Part of what we tried to discuss during the complex dependencies topics at
Plumbers was evaluating if some of the existing solutions for run time PM and
suspend could help with probe ordering, it seems we had agreement on it, what
we found though was that for many cases the use of struct device for link
association is too late. Alternatives mechanisms will be considered in the
future, and it seems that one path forward will be to consider expanding upon
this simple functional device dependency framework.

So let's see more patches!

Luis

2016-11-14 15:46:54

by Lukas Wunner

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Mon, Nov 14, 2016 at 02:48:32PM +0100, Luis R. Rodriguez wrote:
> On Sun, Nov 13, 2016 at 06:34:13PM +0100, Lukas Wunner wrote:
> > On Mon, Nov 07, 2016 at 10:39:54PM +0100, Luis R. Rodriguez wrote:
> > > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote:
> > > > One of the actions carried out by device_link_add() is to reorder
> > > > the lists used for device shutdown and system suspend/resume to
> > > > put the consumer device along with all of its children and all of
> > > > its consumers (and so on, recursively) to the ends of those lists
> > > > in order to ensure the right ordering between all of the supplier
> > > > and consumer devices.
> > >
> > > There's no explanation as to why this order is ensured to be
> > > correct, I think its important to document this. From our discussions
> > > at Plumbers it seems the order is ensured due to the fact that order
> > > was already implicitly provided through platform firmware (ACPI
> > > enumeration is one), adjusting order on the dpm list is just shuffling
> > > order between consumer / provider, but nothing else.
> >
> > ACPI specifies a hierarchy and the order on the dpm_list and
> > devices_kset is such that children are behind their parent.
> >
> > A device link specifies a dependency that exists in addition
> > to the hierarchy, hence consumers need to be moved behind
> > their supplier. And not only the consumers themselves but
> > also recursively their children and consumers. Essentially
> > the entire subtree is moved to the back. That happens in
> > device_reorder_to_tail() in patch 2.
>
> Ah neat, I failed to notice this full subtree tree move, its
> rather important.
>
> > If another device is enumerated which acts as a supplier to
> > an existing other supplier, that other supplier and all its
> > dependents are moved behind the newly enumerated device,
> > and so on.
> >
> > That is probably correct so long as no loops are introduced
> > in the dependency graph.
>
> "Probably" is what concerns me, there is no formality about
> the correctness of this.

It's a typo, I meant to say "provably correct". Sorry.

Quite a difference in meaning. :-)


> > That is checked by device_is_dependent(),
> > which is called from device_link_add(), and the addition of the
> > link is aborted if a loop is detected.
>
> And that is sufficient ?

The device links turn the device tree into a directed acyclic graph.
For the dpm_list and devices_kset, that graph is flattened into a
one-dimensional form such that all ancestors and suppliers of a
device appear in front of that device in the lists. I'm not a
graph theorist and can't provide a formal proof. I think Rafael
is a Dr., maybe he can do it. :-) I merely looked at this from a
practical point of view, i.e. I tried to come up with corner cases
where dependencies are added that would result in incorrect ordering,
and concluded that I couldn't find any.

Thanks,

Lukas

2016-11-14 16:01:52

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v5 2/5] driver core: Functional dependencies tracking support

On Mon, Nov 14, 2016 at 7:48 AM, Lukas Wunner <[email protected]> wrote:
> On Mon, Nov 14, 2016 at 02:48:32PM +0100, Luis R. Rodriguez wrote:
>> On Sun, Nov 13, 2016 at 06:34:13PM +0100, Lukas Wunner wrote:
>> > On Mon, Nov 07, 2016 at 10:39:54PM +0100, Luis R. Rodriguez wrote:
>> > > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote:
>> > > > One of the actions carried out by device_link_add() is to reorder
>> > > > the lists used for device shutdown and system suspend/resume to
>> > > > put the consumer device along with all of its children and all of
>> > > > its consumers (and so on, recursively) to the ends of those lists
>> > > > in order to ensure the right ordering between all of the supplier
>> > > > and consumer devices.
>> > >
>> > > There's no explanation as to why this order is ensured to be
>> > > correct, I think its important to document this. From our discussions
>> > > at Plumbers it seems the order is ensured due to the fact that order
>> > > was already implicitly provided through platform firmware (ACPI
>> > > enumeration is one), adjusting order on the dpm list is just shuffling
>> > > order between consumer / provider, but nothing else.
>> >
>> > ACPI specifies a hierarchy and the order on the dpm_list and
>> > devices_kset is such that children are behind their parent.
>> >
>> > A device link specifies a dependency that exists in addition
>> > to the hierarchy, hence consumers need to be moved behind
>> > their supplier. And not only the consumers themselves but
>> > also recursively their children and consumers. Essentially
>> > the entire subtree is moved to the back. That happens in
>> > device_reorder_to_tail() in patch 2.
>>
>> Ah neat, I failed to notice this full subtree tree move, its
>> rather important.
>>
>> > If another device is enumerated which acts as a supplier to
>> > an existing other supplier, that other supplier and all its
>> > dependents are moved behind the newly enumerated device,
>> > and so on.
>> >
>> > That is probably correct so long as no loops are introduced
>> > in the dependency graph.
>>
>> "Probably" is what concerns me, there is no formality about
>> the correctness of this.
>
> It's a typo, I meant to say "provably correct". Sorry.
>
> Quite a difference in meaning. :-)

No sorry my own mistake -- you had written provably but I thought you
had a typo, clearly you meant as you typed :)

If the trees are independent then yes, I can see this working.

>> > That is checked by device_is_dependent(),
>> > which is called from device_link_add(), and the addition of the
>> > link is aborted if a loop is detected.
>>
>> And that is sufficient ?
>
> The device links turn the device tree into a directed acyclic graph.
> For the dpm_list and devices_kset, that graph is flattened into a
> one-dimensional form such that all ancestors and suppliers of a
> device appear in front of that device in the lists. I'm not a
> graph theorist and can't provide a formal proof. I think Rafael
> is a Dr., maybe he can do it. :-)

If you traverse the DAG you can get a linear one-dimensional
representation of the dependencies -- I'm no expert on this either,
however I'm buying what you described above then, provided we do
indeed avoid cycles.

> I merely looked at this from a
> practical point of view, i.e. I tried to come up with corner cases
> where dependencies are added that would result in incorrect ordering,
> and concluded that I couldn't find any.

Fair enough, likewise. I think the takeaway from this discussion is a
test driver trying to break this might be good to have, but also
documenting how this works in practice and how we avoid the cycles is
important. This was not very clear from the patches.

Luis