Add device-links to track functional dependencies between devices
after they are created (but before they are probed) by looking at
their common DT bindings like clocks, interconnects, etc.
Having functional dependencies automatically added before the devices
are probed, provides the following benefits:
- Optimizes device probe order and avoids the useless work of
attempting probes of devices that will not probe successfully
(because their suppliers aren't present or haven't probed yet).
For example, in a commonly available mobile SoC, registering just
one consumer device's driver at an initcall level earlier than the
supplier device's driver causes 11 failed probe attempts before the
consumer device probes successfully. This was with a kernel with all
the drivers statically compiled in. This problem gets a lot worse if
all the drivers are loaded as modules without direct symbol
dependencies.
- Supplier devices like clock providers, interconnect providers, etc
need to keep the resources they provide active and at a particular
state(s) during boot up even if their current set of consumers don't
request the resource to be active. This is because the rest of the
consumers might not have probed yet and turning off the resource
before all the consumers have probed could lead to a hang or
undesired user experience.
Some frameworks (Eg: regulator) handle this today by turning off
"unused" resources at late_initcall_sync and hoping all the devices
have probed by then. This is not a valid assumption for systems with
loadable modules. Other frameworks (Eg: clock) just don't handle
this due to the lack of a clear signal for when they can turn off
resources. This leads to downstream hacks to handle cases like this
that can easily be solved in the upstream kernel.
By linking devices before they are probed, we give suppliers a clear
count of the number of dependent consumers. Once all of the
consumers are active, the suppliers can turn off the unused
resources without making assumptions about the number of consumers.
By default we just add device-links to track "driver presence" (probe
succeeded) of the supplier device. If any other functionality provided
by device-links are needed, it is left to the consumer/supplier
devices to change the link when they probe.
v1 -> v2:
- Drop patch to speed up of_find_device_by_node()
- Drop depends-on property and use existing bindings
v2 -> v3:
- Refactor the code to have driver core initiate the linking of devs
- Have driver core link consumers to supplier before it's probed
- Add support for drivers to edit the device links before probing
TODO:
- For the case of consumer child sub-nodes being added by a parent
device after late_initcall_sync we might be able to address that by
recursively parsing all child nodes and adding all their suppliers as
suppliers of the parent node too. The parent probe will add the
children before its probe is completed and that will prevent the
supplier's sync_state from being executed before the children are
probed.
But I'll write that part once I see how this series is received.
-Saravana
Saravana Kannan (4):
driver core: Add support for linking devices during device addition
of/platform: Add functional dependency link from DT bindings
driver core: Add sync_state driver/bus callback
driver core: Add edit_links() callback for drivers
drivers/base/core.c | 142 +++++++++++++++++++++++++++++++++++++++++
drivers/base/dd.c | 29 +++++++++
drivers/of/Kconfig | 9 +++
drivers/of/platform.c | 61 ++++++++++++++++++
include/linux/device.h | 45 +++++++++++++
5 files changed, 286 insertions(+)
--
2.22.0.410.gd8fdbe21b5-goog
When devices are added, the bus might want to create device links to track
functional dependencies between supplier and consumer devices. This
tracking of supplier-consumer relationship allows optimizing device probe
order and tracking whether all consumers of a supplier are active. The
add_links bus callback is added to support this.
However, when consumer devices are added, they might not have a supplier
device to link to despite needing mandatory resources/functionality from
one or more suppliers. A waiting_for_suppliers list is created to track
such consumers and retry linking them when new devices get added.
Signed-off-by: Saravana Kannan <[email protected]>
---
drivers/base/core.c | 83 ++++++++++++++++++++++++++++++++++++++++++
include/linux/device.h | 8 ++++
2 files changed, 91 insertions(+)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index fd7511e04e62..0705926d362f 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -44,6 +44,8 @@ early_param("sysfs.deprecated", sysfs_deprecated_setup);
#endif
/* Device links support. */
+static LIST_HEAD(wait_for_suppliers);
+static DEFINE_MUTEX(wfs_lock);
#ifdef CONFIG_SRCU
static DEFINE_MUTEX(device_links_lock);
@@ -401,6 +403,51 @@ struct device_link *device_link_add(struct device *consumer,
}
EXPORT_SYMBOL_GPL(device_link_add);
+/**
+ * device_link_wait_for_supplier - Mark device as waiting for supplier
+ * @consumer: Consumer device
+ *
+ * Marks the consumer device as waiting for suppliers to become available. The
+ * consumer device will never be probed until it's unmarked as waiting for
+ * suppliers. The caller is responsible for adding the link to the supplier
+ * once the supplier device is present.
+ *
+ * This function is NOT meant to be called from the probe function of the
+ * consumer but rather from code that creates/adds the consumer device.
+ */
+static void device_link_wait_for_supplier(struct device *consumer)
+{
+ mutex_lock(&wfs_lock);
+ list_add_tail(&consumer->links.needs_suppliers, &wait_for_suppliers);
+ mutex_unlock(&wfs_lock);
+}
+
+/**
+ * device_link_check_waiting_consumers - Try to unmark waiting consumers
+ *
+ * Loops through all consumers waiting on suppliers and tries to add all their
+ * supplier links. If that succeeds, the consumer device is unmarked as waiting
+ * for suppliers. Otherwise, they are left marked as waiting on suppliers,
+ *
+ * The add_links bus callback is expected to return 0 if it has found and added
+ * all the supplier links for the consumer device. It should return an error if
+ * it isn't able to do so.
+ *
+ * The caller of device_link_wait_for_supplier() is expected to call this once
+ * it's aware of potential suppliers becoming available.
+ */
+static void device_link_check_waiting_consumers(void)
+{
+ struct device *dev, *tmp;
+
+ mutex_lock(&wfs_lock);
+ list_for_each_entry_safe(dev, tmp, &wait_for_suppliers,
+ links.needs_suppliers)
+ if (!dev->bus->add_links(dev))
+ list_del_init(&dev->links.needs_suppliers);
+ mutex_unlock(&wfs_lock);
+}
+
static void device_link_free(struct device_link *link)
{
while (refcount_dec_not_one(&link->rpm_active))
@@ -535,6 +582,19 @@ int device_links_check_suppliers(struct device *dev)
struct device_link *link;
int ret = 0;
+ /*
+ * If a device is waiting for one or more suppliers (in
+ * wait_for_suppliers list), it is not ready to probe yet. So just
+ * return -EPROBE_DEFER without having to check the links with existing
+ * suppliers.
+ */
+ mutex_lock(&wfs_lock);
+ if (!list_empty(&dev->links.needs_suppliers)) {
+ mutex_unlock(&wfs_lock);
+ return -EPROBE_DEFER;
+ }
+ mutex_unlock(&wfs_lock);
+
device_links_write_lock();
list_for_each_entry(link, &dev->links.suppliers, c_node) {
@@ -812,6 +872,10 @@ static void device_links_purge(struct device *dev)
{
struct device_link *link, *ln;
+ mutex_lock(&wfs_lock);
+ list_del(&dev->links.needs_suppliers);
+ mutex_unlock(&wfs_lock);
+
/*
* Delete all of the remaining links from this device to any other
* devices (either consumers or suppliers).
@@ -1673,6 +1737,7 @@ void device_initialize(struct device *dev)
#endif
INIT_LIST_HEAD(&dev->links.consumers);
INIT_LIST_HEAD(&dev->links.suppliers);
+ INIT_LIST_HEAD(&dev->links.needs_suppliers);
dev->links.status = DL_DEV_NO_DRIVER;
}
EXPORT_SYMBOL_GPL(device_initialize);
@@ -2108,6 +2173,24 @@ int device_add(struct device *dev)
BUS_NOTIFY_ADD_DEVICE, dev);
kobject_uevent(&dev->kobj, KOBJ_ADD);
+
+ /*
+ * Check if any of the other devices (consumers) have been waiting for
+ * this device (supplier) to be added so that they can create a device
+ * link to it.
+ *
+ * This needs to happen after device_pm_add() because device_link_add()
+ * requires the supplier be registered before it's called.
+ *
+ * But this also needs to happe before bus_probe_device() to make sure
+ * waiting consumers can link to it before the driver is bound to the
+ * device and the driver sync_state callback is called for this device.
+ */
+ device_link_check_waiting_consumers();
+
+ if (dev->bus && dev->bus->add_links && dev->bus->add_links(dev))
+ device_link_wait_for_supplier(dev);
+
bus_probe_device(dev);
if (parent)
klist_add_tail(&dev->p->knode_parent,
diff --git a/include/linux/device.h b/include/linux/device.h
index 848fc71c6ba6..7f8ae7e5fc6b 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -77,6 +77,11 @@ extern void bus_remove_file(struct bus_type *, struct bus_attribute *);
* -EPROBE_DEFER it will queue the device for deferred probing.
* @uevent: Called when a device is added, removed, or a few other things
* that generate uevents to add the environment variables.
+ * @add_links: Called, perhaps multiple times, when a new device is added to
+ * this bus. The function is expected to create all the device
+ * links for the new device and return 0 if it was completed
+ * successfully or return an error if it needs to be reattempted
+ * in the future.
* @probe: Called when a new device or driver add to this bus, and callback
* the specific driver's probe to initial the matched device.
* @remove: Called when a device removed from this bus.
@@ -121,6 +126,7 @@ struct bus_type {
int (*match)(struct device *dev, struct device_driver *drv);
int (*uevent)(struct device *dev, struct kobj_uevent_env *env);
+ int (*add_links)(struct device *dev);
int (*probe)(struct device *dev);
int (*remove)(struct device *dev);
void (*shutdown)(struct device *dev);
@@ -888,11 +894,13 @@ enum dl_dev_state {
* struct dev_links_info - Device data related to device links.
* @suppliers: List of links to supplier devices.
* @consumers: List of links to consumer devices.
+ * @needs_suppliers: Hook to global list of devices waiting for suppliers.
* @status: Driver status information.
*/
struct dev_links_info {
struct list_head suppliers;
struct list_head consumers;
+ struct list_head needs_suppliers;
enum dl_dev_state status;
};
--
2.22.0.410.gd8fdbe21b5-goog
Add device-links after the devices are created (but before they are
probed) by looking at common DT bindings like clocks and
interconnects.
Automatically adding device-links for functional dependencies at the
framework level provides the following benefits:
- Optimizes device probe order and avoids the useless work of
attempting probes of devices that will not probe successfully
(because their suppliers aren't present or haven't probed yet).
For example, in a commonly available mobile SoC, registering just
one consumer device's driver at an initcall level earlier than the
supplier device's driver causes 11 failed probe attempts before the
consumer device probes successfully. This was with a kernel with all
the drivers statically compiled in. This problem gets a lot worse if
all the drivers are loaded as modules without direct symbol
dependencies.
- Supplier devices like clock providers, interconnect providers, etc
need to keep the resources they provide active and at a particular
state(s) during boot up even if their current set of consumers don't
request the resource to be active. This is because the rest of the
consumers might not have probed yet and turning off the resource
before all the consumers have probed could lead to a hang or
undesired user experience.
Some frameworks (Eg: regulator) handle this today by turning off
"unused" resources at late_initcall_sync and hoping all the devices
have probed by then. This is not a valid assumption for systems with
loadable modules. Other frameworks (Eg: clock) just don't handle
this due to the lack of a clear signal for when they can turn off
resources. This leads to downstream hacks to handle cases like this
that can easily be solved in the upstream kernel.
By linking devices before they are probed, we give suppliers a clear
count of the number of dependent consumers. Once all of the
consumers are active, the suppliers can turn off the unused
resources without making assumptions about the number of consumers.
By default we just add device-links to track "driver presence" (probe
succeeded) of the supplier device. If any other functionality provided
by device-links are needed, it is left to the consumer/supplier
devices to change the link when they probe.
Signed-off-by: Saravana Kannan <[email protected]>
---
drivers/of/Kconfig | 9 ++++++++
drivers/of/platform.c | 52 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 61 insertions(+)
diff --git a/drivers/of/Kconfig b/drivers/of/Kconfig
index 37c2ccbefecd..7c7fa7394b4c 100644
--- a/drivers/of/Kconfig
+++ b/drivers/of/Kconfig
@@ -103,4 +103,13 @@ config OF_OVERLAY
config OF_NUMA
bool
+config OF_DEVLINKS
+ bool "Device links from DT bindings"
+ help
+ Common DT bindings like clocks, interconnects, etc represent a
+ consumer device's dependency on suppliers devices. This option
+ creates device links from these common bindings so that consumers are
+ probed only after all their suppliers are active and suppliers can
+ tell when all their consumers are active.
+
endif # OF
diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index 04ad312fd85b..a53717168aca 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -61,6 +61,57 @@ struct platform_device *of_find_device_by_node(struct device_node *np)
EXPORT_SYMBOL(of_find_device_by_node);
#ifdef CONFIG_OF_ADDRESS
+static int of_link_binding(struct device *dev, char *binding, char *cell)
+{
+ struct of_phandle_args sup_args;
+ struct platform_device *sup_dev;
+ unsigned int i = 0, links = 0;
+ u32 dl_flags = DL_FLAG_AUTOPROBE_CONSUMER;
+
+ while (!of_parse_phandle_with_args(dev->of_node, binding, cell, i,
+ &sup_args)) {
+ i++;
+ sup_dev = of_find_device_by_node(sup_args.np);
+ if (!sup_dev)
+ continue;
+ if (device_link_add(dev, &sup_dev->dev, dl_flags))
+ links++;
+ put_device(&sup_dev->dev);
+ }
+ if (links < i)
+ return -ENODEV;
+ return 0;
+}
+
+/*
+ * List of bindings and their cell names (use NULL if no cell names) from which
+ * device links need to be created.
+ */
+static char *link_bindings[] = {
+#ifdef CONFIG_OF_DEVLINKS
+ "clocks", "#clock-cells",
+ "interconnects", "#interconnect-cells",
+#endif
+};
+
+static int of_link_to_suppliers(struct device *dev)
+{
+ unsigned int i = 0;
+ bool done = true;
+
+ if (unlikely(!dev->of_node))
+ return 0;
+
+ for (i = 0; i < ARRAY_SIZE(link_bindings) / 2; i++)
+ if (of_link_binding(dev, link_bindings[i * 2],
+ link_bindings[i * 2 + 1]))
+ done = false;
+
+ if (!done)
+ return -ENODEV;
+ return 0;
+}
+
/*
* The following routines scan a subtree and registers a device for
* each applicable node.
@@ -524,6 +575,7 @@ static int __init of_platform_default_populate_init(void)
if (!of_have_populated_dt())
return -ENODEV;
+ platform_bus_type.add_links = of_link_to_suppliers;
/*
* Handle certain compatibles explicitly, since we don't want to create
* platform_devices for every node in /reserved-memory with a
--
2.22.0.410.gd8fdbe21b5-goog
This sync_state driver/bus callback is called once all the consumers
of a supplier have probed successfully.
This allows the supplier device's driver/bus to sync the supplier
device's state to the software state with the guarantee that all the
consumers are actively managing the resources provided by the supplier
device.
To maintain backwards compatibility and ease transition from existing
frameworks and resource cleanup schemes, late_initcall_sync is the
earliest when the sync_state callback might be called.
There is no upper bound on the time by which the sync_state callback
has to be called. This is because if a consumer device never probes,
the supplier has to maintain its resources in the state left by the
bootloader. For example, if the bootloader leaves the display
backlight at a fixed voltage and the backlight driver is never probed,
you don't want the backlight to ever be turned off after boot up.
Signed-off-by: Saravana Kannan <[email protected]>
---
drivers/base/core.c | 39 +++++++++++++++++++++++++++++++++++++++
drivers/of/platform.c | 9 +++++++++
include/linux/device.h | 19 +++++++++++++++++++
3 files changed, 67 insertions(+)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 0705926d362f..8b8b812d26f1 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -46,6 +46,7 @@ early_param("sysfs.deprecated", sysfs_deprecated_setup);
/* Device links support. */
static LIST_HEAD(wait_for_suppliers);
static DEFINE_MUTEX(wfs_lock);
+static bool supplier_sync_state_enabled;
#ifdef CONFIG_SRCU
static DEFINE_MUTEX(device_links_lock);
@@ -614,6 +615,41 @@ int device_links_check_suppliers(struct device *dev)
return ret;
}
+static void __device_links_supplier_sync_state(struct device *dev)
+{
+ struct device_link *link;
+
+ if (dev->state_synced)
+ return;
+
+ list_for_each_entry(link, &dev->links.consumers, s_node) {
+ if (link->flags & DL_FLAG_STATELESS)
+ continue;
+ if (link->status != DL_STATE_ACTIVE)
+ return;
+ }
+
+ if (dev->bus->sync_state)
+ dev->bus->sync_state(dev);
+ else if (dev->driver && dev->driver->sync_state)
+ dev->driver->sync_state(dev);
+
+ dev->state_synced = true;
+}
+
+int device_links_supplier_sync_state(struct device *dev, void *data)
+{
+ device_links_write_lock();
+ __device_links_supplier_sync_state(dev);
+ device_links_write_unlock();
+ return 0;
+}
+
+void device_links_supplier_sync_state_enable(void)
+{
+ supplier_sync_state_enabled = true;
+}
+
/**
* device_links_driver_bound - Update device links after probing its driver.
* @dev: Device to update the links for.
@@ -658,6 +694,9 @@ void device_links_driver_bound(struct device *dev)
WARN_ON(link->status != DL_STATE_CONSUMER_PROBE);
WRITE_ONCE(link->status, DL_STATE_ACTIVE);
+
+ if (supplier_sync_state_enabled)
+ __device_links_supplier_sync_state(link->supplier);
}
dev->links.status = DL_DEV_DRIVER_BOUND;
diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index a53717168aca..ebafa2410045 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -596,6 +596,15 @@ static int __init of_platform_default_populate_init(void)
return 0;
}
arch_initcall_sync(of_platform_default_populate_init);
+
+static int __init of_platform_sync_state_init(void)
+{
+ device_links_supplier_sync_state_enable();
+ bus_for_each_dev(&platform_bus_type, NULL, NULL,
+ device_links_supplier_sync_state);
+ return 0;
+}
+late_initcall_sync(of_platform_sync_state_init);
#endif
int of_platform_device_destroy(struct device *dev, void *data)
diff --git a/include/linux/device.h b/include/linux/device.h
index 7f8ae7e5fc6b..4a0db34ae650 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -84,6 +84,13 @@ extern void bus_remove_file(struct bus_type *, struct bus_attribute *);
* in the future.
* @probe: Called when a new device or driver add to this bus, and callback
* the specific driver's probe to initial the matched device.
+ * @sync_state: Called to sync device state to software state after all the
+ * state tracking consumers linked to this device (present at
+ * the time of late_initcall) have successfully bound to a
+ * driver. If the device has no consumers, this function will
+ * be called at late_initcall_sync level. If the device has
+ * consumers that are never bound to a driver, this function
+ * will never get called until they do.
* @remove: Called when a device removed from this bus.
* @shutdown: Called at shut-down time to quiesce the device.
*
@@ -128,6 +135,7 @@ struct bus_type {
int (*uevent)(struct device *dev, struct kobj_uevent_env *env);
int (*add_links)(struct device *dev);
int (*probe)(struct device *dev);
+ void (*sync_state)(struct device *dev);
int (*remove)(struct device *dev);
void (*shutdown)(struct device *dev);
@@ -257,6 +265,13 @@ enum probe_type {
* @probe: Called to query the existence of a specific device,
* whether this driver can work with it, and bind the driver
* to a specific device.
+ * @sync_state: Called to sync device state to software state after all the
+ * state tracking consumers linked to this device (present at
+ * the time of late_initcall) have successfully bound to a
+ * driver. If the device has no consumers, this function will
+ * be called at late_initcall_sync level. If the device has
+ * consumers that are never bound to a driver, this function
+ * will never get called until they do.
* @remove: Called when the device is removed from the system to
* unbind a device from this driver.
* @shutdown: Called at shut-down time to quiesce the device.
@@ -294,6 +309,7 @@ struct device_driver {
const struct acpi_device_id *acpi_match_table;
int (*probe) (struct device *dev);
+ void (*sync_state)(struct device *dev);
int (*remove) (struct device *dev);
void (*shutdown) (struct device *dev);
int (*suspend) (struct device *dev, pm_message_t state);
@@ -1065,6 +1081,7 @@ struct device {
bool offline_disabled:1;
bool offline:1;
bool of_node_reused:1;
+ bool state_synced:1;
#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) || \
defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
@@ -1404,6 +1421,8 @@ struct device_link *device_link_add(struct device *consumer,
struct device *supplier, u32 flags);
void device_link_del(struct device_link *link);
void device_link_remove(void *consumer, struct device *supplier);
+int device_links_supplier_sync_state(struct device *dev, void *data);
+void device_links_supplier_sync_state_enable(void);
#ifndef dev_fmt
#define dev_fmt(fmt) fmt
--
2.22.0.410.gd8fdbe21b5-goog
The driver core/bus adding dependencies by default makes sure that
suppliers don't sync the hardware state with software state before all the
consumers have their drivers loaded (if they are modules) and are probed.
However, when the bus incorrectly adds dependencies that it shouldn't have
added, the devices might never probe.
For example, if device-C is a consumer of device-S and they have phandles
to each other in DT, the following could happen:
1. Device-S get added first.
2. The bus add_links() callback will (incorrectly) try to link it as
a consumer of device-C.
3. Since device-C isn't present, device-S will be put in
"waiting-for-supplier" list.
4. Device-C gets added next.
5. All devices in "waiting-for-supplier" list are retried for linking.
6. Device-S gets linked as consumer to Device-C.
7. The bus add_links() callback will (correctly) try to link it as
a consumer of device-S.
8. This isn't allowed because it would create a cyclic device links.
So neither devices will get probed since the supplier is dependent on a
consumer that'll never probe (because it can't get resources from the
supplier).
Without this patch, things stay in this broken state. However, with this
patch, the execution will continue like this:
9. Device-C's driver is loaded.
10. Device-C's driver removes Device-S as a consumer of Device-C.
11. Device-C's driver adds Device-C as a consumer of Device-S.
12. Device-S probes.
13. Device-S sync_state() isn't called because Device-C hasn't probed yet.
14. Device-C probes.
15. Device-S's sync_state() callback is called.
Signed-off-by: Saravana Kannan <[email protected]>
---
drivers/base/core.c | 24 ++++++++++++++++++++++--
drivers/base/dd.c | 29 +++++++++++++++++++++++++++++
include/linux/device.h | 18 ++++++++++++++++++
3 files changed, 69 insertions(+), 2 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 8b8b812d26f1..dce97b5f3536 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -423,6 +423,19 @@ static void device_link_wait_for_supplier(struct device *consumer)
mutex_unlock(&wfs_lock);
}
+/**
+ * device_link_remove_from_wfs - Unmark device as waiting for supplier
+ * @consumer: Consumer device
+ *
+ * Unmark the consumer device as waiting for suppliers to become available.
+ */
+void device_link_remove_from_wfs(struct device *consumer)
+{
+ mutex_lock(&wfs_lock);
+ list_del_init(&consumer->links.needs_suppliers);
+ mutex_unlock(&wfs_lock);
+}
+
/**
* device_link_check_waiting_consumers - Try to unmark waiting consumers
*
@@ -440,12 +453,19 @@ static void device_link_wait_for_supplier(struct device *consumer)
static void device_link_check_waiting_consumers(void)
{
struct device *dev, *tmp;
+ int ret;
mutex_lock(&wfs_lock);
list_for_each_entry_safe(dev, tmp, &wait_for_suppliers,
- links.needs_suppliers)
- if (!dev->bus->add_links(dev))
+ links.needs_suppliers) {
+ ret = 0;
+ if (dev->has_edit_links)
+ ret = driver_edit_links(dev);
+ else if (dev->bus->add_links)
+ ret = dev->bus->add_links(dev);
+ if (!ret)
list_del_init(&dev->links.needs_suppliers);
+ }
mutex_unlock(&wfs_lock);
}
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 0df9b4461766..842fc7b704f9 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -659,6 +659,12 @@ int driver_probe_device(struct device_driver *drv, struct device *dev)
pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
drv->bus->name, __func__, dev_name(dev), drv->name);
+ if (drv->edit_links) {
+ if (drv->edit_links(dev))
+ dev->has_edit_links = true;
+ else
+ device_link_remove_from_wfs(dev);
+ }
pm_runtime_get_suppliers(dev);
if (dev->parent)
pm_runtime_get_sync(dev->parent);
@@ -747,6 +753,29 @@ struct device_attach_data {
bool have_async;
};
+static int __driver_edit_links(struct device_driver *drv, void *data)
+{
+ struct device *dev = data;
+
+ if (!drv->edit_links)
+ return 0;
+
+ if (driver_match_device(drv, dev) <= 0)
+ return 0;
+
+ return drv->edit_links(dev);
+}
+
+int driver_edit_links(struct device *dev)
+{
+ int ret;
+
+ device_lock(dev);
+ ret = bus_for_each_drv(dev->bus, NULL, dev, __driver_edit_links);
+ device_unlock(dev);
+ return ret;
+}
+
static int __device_attach_driver(struct device_driver *drv, void *_data)
{
struct device_attach_data *data = _data;
diff --git a/include/linux/device.h b/include/linux/device.h
index 4a0db34ae650..d3c9e70052d8 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -262,6 +262,20 @@ enum probe_type {
* @probe_type: Type of the probe (synchronous or asynchronous) to use.
* @of_match_table: The open firmware table.
* @acpi_match_table: The ACPI match table.
+ * @edit_links: Called to allow a matched driver to edit the device links the
+ * bus might have added incorrectly. This will be useful to handle
+ * cases where the bus incorrectly adds functional dependencies
+ * that aren't true or tries to create cyclic dependencies. But
+ * doesn't correctly handle functional dependencies that are
+ * missed by the bus as the supplier's sync_state might get to
+ * execute before the driver for a missing consumer is loaded and
+ * gets to edit the device links for the consumer.
+ *
+ * This function might be called multiple times after a new device
+ * is added. The function is expected to create all the device
+ * links for the new device and return 0 if it was completed
+ * successfully or return an error if it needs to be reattempted
+ * in the future.
* @probe: Called to query the existence of a specific device,
* whether this driver can work with it, and bind the driver
* to a specific device.
@@ -308,6 +322,7 @@ struct device_driver {
const struct of_device_id *of_match_table;
const struct acpi_device_id *acpi_match_table;
+ int (*edit_links)(struct device *dev);
int (*probe) (struct device *dev);
void (*sync_state)(struct device *dev);
int (*remove) (struct device *dev);
@@ -1082,6 +1097,7 @@ struct device {
bool offline:1;
bool of_node_reused:1;
bool state_synced:1;
+ bool has_edit_links:1;
#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) || \
defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
@@ -1331,6 +1347,7 @@ extern int __must_check device_attach(struct device *dev);
extern int __must_check driver_attach(struct device_driver *drv);
extern void device_initial_probe(struct device *dev);
extern int __must_check device_reprobe(struct device *dev);
+extern int driver_edit_links(struct device *dev);
extern bool device_is_bound(struct device *dev);
@@ -1423,6 +1440,7 @@ void device_link_del(struct device_link *link);
void device_link_remove(void *consumer, struct device *supplier);
int device_links_supplier_sync_state(struct device *dev, void *data);
void device_links_supplier_sync_state_enable(void);
+void device_link_remove_from_wfs(struct device *consumer);
#ifndef dev_fmt
#define dev_fmt(fmt) fmt
--
2.22.0.410.gd8fdbe21b5-goog
On Mon, Jul 1, 2019 at 6:48 PM Saravana Kannan <[email protected]> wrote:
>
> Add device-links after the devices are created (but before they are
> probed) by looking at common DT bindings like clocks and
> interconnects.
>
> Automatically adding device-links for functional dependencies at the
> framework level provides the following benefits:
>
> - Optimizes device probe order and avoids the useless work of
> attempting probes of devices that will not probe successfully
> (because their suppliers aren't present or haven't probed yet).
>
> For example, in a commonly available mobile SoC, registering just
> one consumer device's driver at an initcall level earlier than the
> supplier device's driver causes 11 failed probe attempts before the
> consumer device probes successfully. This was with a kernel with all
> the drivers statically compiled in. This problem gets a lot worse if
> all the drivers are loaded as modules without direct symbol
> dependencies.
>
> - Supplier devices like clock providers, interconnect providers, etc
> need to keep the resources they provide active and at a particular
> state(s) during boot up even if their current set of consumers don't
> request the resource to be active. This is because the rest of the
> consumers might not have probed yet and turning off the resource
> before all the consumers have probed could lead to a hang or
> undesired user experience.
>
> Some frameworks (Eg: regulator) handle this today by turning off
> "unused" resources at late_initcall_sync and hoping all the devices
> have probed by then. This is not a valid assumption for systems with
> loadable modules. Other frameworks (Eg: clock) just don't handle
> this due to the lack of a clear signal for when they can turn off
> resources. This leads to downstream hacks to handle cases like this
> that can easily be solved in the upstream kernel.
>
> By linking devices before they are probed, we give suppliers a clear
> count of the number of dependent consumers. Once all of the
> consumers are active, the suppliers can turn off the unused
> resources without making assumptions about the number of consumers.
>
> By default we just add device-links to track "driver presence" (probe
> succeeded) of the supplier device. If any other functionality provided
> by device-links are needed, it is left to the consumer/supplier
> devices to change the link when they probe.
>
> Signed-off-by: Saravana Kannan <[email protected]>
> ---
> drivers/of/Kconfig | 9 ++++++++
> drivers/of/platform.c | 52 +++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 61 insertions(+)
>
> diff --git a/drivers/of/Kconfig b/drivers/of/Kconfig
> index 37c2ccbefecd..7c7fa7394b4c 100644
> --- a/drivers/of/Kconfig
> +++ b/drivers/of/Kconfig
> @@ -103,4 +103,13 @@ config OF_OVERLAY
> config OF_NUMA
> bool
>
> +config OF_DEVLINKS
I'd prefer this not be a config option. After all, we want one kernel
build that works for all platforms.
A kernel command line option to disable might be useful for debugging.
> + bool "Device links from DT bindings"
> + help
> + Common DT bindings like clocks, interconnects, etc represent a
> + consumer device's dependency on suppliers devices. This option
> + creates device links from these common bindings so that consumers are
> + probed only after all their suppliers are active and suppliers can
> + tell when all their consumers are active.
> +
> endif # OF
> diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> index 04ad312fd85b..a53717168aca 100644
> --- a/drivers/of/platform.c
> +++ b/drivers/of/platform.c
> @@ -61,6 +61,57 @@ struct platform_device *of_find_device_by_node(struct device_node *np)
> EXPORT_SYMBOL(of_find_device_by_node);
>
> #ifdef CONFIG_OF_ADDRESS
> +static int of_link_binding(struct device *dev, char *binding, char *cell)
Under CONFIG_OF_ADDRESS seems like a strange location.
> +{
> + struct of_phandle_args sup_args;
> + struct platform_device *sup_dev;
> + unsigned int i = 0, links = 0;
> + u32 dl_flags = DL_FLAG_AUTOPROBE_CONSUMER;
> +
> + while (!of_parse_phandle_with_args(dev->of_node, binding, cell, i,
> + &sup_args)) {
> + i++;
> + sup_dev = of_find_device_by_node(sup_args.np);
> + if (!sup_dev)
> + continue;
> + if (device_link_add(dev, &sup_dev->dev, dl_flags))
> + links++;
> + put_device(&sup_dev->dev);
> + }
> + if (links < i)
> + return -ENODEV;
> + return 0;
> +}
> +
> +/*
> + * List of bindings and their cell names (use NULL if no cell names) from which
> + * device links need to be created.
> + */
> +static char *link_bindings[] = {
const
> +#ifdef CONFIG_OF_DEVLINKS
> + "clocks", "#clock-cells",
> + "interconnects", "#interconnect-cells",
Planning to add others?
> +#endif
> +};
> +
> +static int of_link_to_suppliers(struct device *dev)
> +{
> + unsigned int i = 0;
> + bool done = true;
> +
> + if (unlikely(!dev->of_node))
> + return 0;
> +
> + for (i = 0; i < ARRAY_SIZE(link_bindings) / 2; i++)
> + if (of_link_binding(dev, link_bindings[i * 2],
> + link_bindings[i * 2 + 1]))
> + done = false;
> +
> + if (!done)
> + return -ENODEV;
> + return 0;
> +}
> +
> /*
> * The following routines scan a subtree and registers a device for
> * each applicable node.
> @@ -524,6 +575,7 @@ static int __init of_platform_default_populate_init(void)
> if (!of_have_populated_dt())
> return -ENODEV;
>
> + platform_bus_type.add_links = of_link_to_suppliers;
> /*
> * Handle certain compatibles explicitly, since we don't want to create
> * platform_devices for every node in /reserved-memory with a
> --
> 2.22.0.410.gd8fdbe21b5-goog
>
On Mon, Jul 1, 2019 at 6:48 PM Saravana Kannan <[email protected]> wrote:
>
> The driver core/bus adding dependencies by default makes sure that
> suppliers don't sync the hardware state with software state before all the
> consumers have their drivers loaded (if they are modules) and are probed.
>
> However, when the bus incorrectly adds dependencies that it shouldn't have
> added, the devices might never probe.
>
> For example, if device-C is a consumer of device-S and they have phandles
> to each other in DT, the following could happen:
>
> 1. Device-S get added first.
> 2. The bus add_links() callback will (incorrectly) try to link it as
> a consumer of device-C.
> 3. Since device-C isn't present, device-S will be put in
> "waiting-for-supplier" list.
> 4. Device-C gets added next.
> 5. All devices in "waiting-for-supplier" list are retried for linking.
> 6. Device-S gets linked as consumer to Device-C.
> 7. The bus add_links() callback will (correctly) try to link it as
> a consumer of device-S.
> 8. This isn't allowed because it would create a cyclic device links.
>
> So neither devices will get probed since the supplier is dependent on a
> consumer that'll never probe (because it can't get resources from the
> supplier).
>
> Without this patch, things stay in this broken state. However, with this
> patch, the execution will continue like this:
>
> 9. Device-C's driver is loaded.
> 10. Device-C's driver removes Device-S as a consumer of Device-C.
> 11. Device-C's driver adds Device-C as a consumer of Device-S.
> 12. Device-S probes.
> 13. Device-S sync_state() isn't called because Device-C hasn't probed yet.
> 14. Device-C probes.
> 15. Device-S's sync_state() callback is called.
We already have some DT unittests around platform devices. It would be
nice to extend them to demonstrate this problem. Could be a follow-up
patch though.
In the case a driver hasn't been updated, couldn't the driver core
just remove all the links of C to S and S to C so that progress can be
made and we retain the status quo of what we have today? That would
lessen the chances of breaking platforms and reduce the immediate need
to fix them.
Rob
On Mon, Jul 1, 2019 at 6:32 PM Rob Herring <[email protected]> wrote:
>
> On Mon, Jul 1, 2019 at 6:48 PM Saravana Kannan <[email protected]> wrote:
> >
> > Add device-links after the devices are created (but before they are
> > probed) by looking at common DT bindings like clocks and
> > interconnects.
> >
> > Automatically adding device-links for functional dependencies at the
> > framework level provides the following benefits:
> >
> > - Optimizes device probe order and avoids the useless work of
> > attempting probes of devices that will not probe successfully
> > (because their suppliers aren't present or haven't probed yet).
> >
> > For example, in a commonly available mobile SoC, registering just
> > one consumer device's driver at an initcall level earlier than the
> > supplier device's driver causes 11 failed probe attempts before the
> > consumer device probes successfully. This was with a kernel with all
> > the drivers statically compiled in. This problem gets a lot worse if
> > all the drivers are loaded as modules without direct symbol
> > dependencies.
> >
> > - Supplier devices like clock providers, interconnect providers, etc
> > need to keep the resources they provide active and at a particular
> > state(s) during boot up even if their current set of consumers don't
> > request the resource to be active. This is because the rest of the
> > consumers might not have probed yet and turning off the resource
> > before all the consumers have probed could lead to a hang or
> > undesired user experience.
> >
> > Some frameworks (Eg: regulator) handle this today by turning off
> > "unused" resources at late_initcall_sync and hoping all the devices
> > have probed by then. This is not a valid assumption for systems with
> > loadable modules. Other frameworks (Eg: clock) just don't handle
> > this due to the lack of a clear signal for when they can turn off
> > resources. This leads to downstream hacks to handle cases like this
> > that can easily be solved in the upstream kernel.
> >
> > By linking devices before they are probed, we give suppliers a clear
> > count of the number of dependent consumers. Once all of the
> > consumers are active, the suppliers can turn off the unused
> > resources without making assumptions about the number of consumers.
> >
> > By default we just add device-links to track "driver presence" (probe
> > succeeded) of the supplier device. If any other functionality provided
> > by device-links are needed, it is left to the consumer/supplier
> > devices to change the link when they probe.
> >
> > Signed-off-by: Saravana Kannan <[email protected]>
> > ---
> > drivers/of/Kconfig | 9 ++++++++
> > drivers/of/platform.c | 52 +++++++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 61 insertions(+)
> >
> > diff --git a/drivers/of/Kconfig b/drivers/of/Kconfig
> > index 37c2ccbefecd..7c7fa7394b4c 100644
> > --- a/drivers/of/Kconfig
> > +++ b/drivers/of/Kconfig
> > @@ -103,4 +103,13 @@ config OF_OVERLAY
> > config OF_NUMA
> > bool
> >
> > +config OF_DEVLINKS
>
> I'd prefer this not be a config option. After all, we want one kernel
> build that works for all platforms.
We need a lot more changes before one kernel build can work for all
platforms. At least until then, I think we need this. Lot less chance
of breaking existing platforms before all the missing pieces are
created.
> A kernel command line option to disable might be useful for debugging.
Or we can have a command line to enable this for platforms that want
to use it and have it default off.
> > + bool "Device links from DT bindings"
> > + help
> > + Common DT bindings like clocks, interconnects, etc represent a
> > + consumer device's dependency on suppliers devices. This option
> > + creates device links from these common bindings so that consumers are
> > + probed only after all their suppliers are active and suppliers can
> > + tell when all their consumers are active.
> > +
> > endif # OF
> > diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> > index 04ad312fd85b..a53717168aca 100644
> > --- a/drivers/of/platform.c
> > +++ b/drivers/of/platform.c
> > @@ -61,6 +61,57 @@ struct platform_device *of_find_device_by_node(struct device_node *np)
> > EXPORT_SYMBOL(of_find_device_by_node);
> >
> > #ifdef CONFIG_OF_ADDRESS
> > +static int of_link_binding(struct device *dev, char *binding, char *cell)
>
> Under CONFIG_OF_ADDRESS seems like a strange location.
Yeah, but the rest of the file seems to be under this. So I'm not
touching that. I can probably move this function further down (close
to platform populate) if you want that.
>
> > +{
> > + struct of_phandle_args sup_args;
> > + struct platform_device *sup_dev;
> > + unsigned int i = 0, links = 0;
> > + u32 dl_flags = DL_FLAG_AUTOPROBE_CONSUMER;
> > +
> > + while (!of_parse_phandle_with_args(dev->of_node, binding, cell, i,
> > + &sup_args)) {
> > + i++;
> > + sup_dev = of_find_device_by_node(sup_args.np);
> > + if (!sup_dev)
> > + continue;
> > + if (device_link_add(dev, &sup_dev->dev, dl_flags))
> > + links++;
> > + put_device(&sup_dev->dev);
> > + }
> > + if (links < i)
> > + return -ENODEV;
> > + return 0;
> > +}
> > +
> > +/*
> > + * List of bindings and their cell names (use NULL if no cell names) from which
> > + * device links need to be created.
> > + */
> > +static char *link_bindings[] = {
>
> const
Ack
>
> > +#ifdef CONFIG_OF_DEVLINKS
> > + "clocks", "#clock-cells",
> > + "interconnects", "#interconnect-cells",
>
> Planning to add others?
Not in this patch.
Regulators are the other big missing piece that I'm aware of now but
they need a lot of discussion (see email from David and my reply).
Not sure what other resources are shared where they can be "turned
off" and cause devices set up at boot to fail. For example, I don't
think interrupts need functional dependency tracking because they
aren't really turned off by consumer 1 in a way that breaks things for
consumer 2. Just masked and the consumer 2 can unmask and use it once
it probes.
I'm only intimately familiar with clocks, interconnects and regulators
(to some extent). I'm open to adding other supplier categories in
future patches as I educate myself of those or if other people want to
add support for more categories.
-Saravana
> > +#endif
> > +};
> > +
> > +static int of_link_to_suppliers(struct device *dev)
> > +{
> > + unsigned int i = 0;
> > + bool done = true;
> > +
> > + if (unlikely(!dev->of_node))
> > + return 0;
> > +
> > + for (i = 0; i < ARRAY_SIZE(link_bindings) / 2; i++)
> > + if (of_link_binding(dev, link_bindings[i * 2],
> > + link_bindings[i * 2 + 1]))
> > + done = false;
> > +
> > + if (!done)
> > + return -ENODEV;
> > + return 0;
> > +}
> > +
> > /*
> > * The following routines scan a subtree and registers a device for
> > * each applicable node.
> > @@ -524,6 +575,7 @@ static int __init of_platform_default_populate_init(void)
> > if (!of_have_populated_dt())
> > return -ENODEV;
> >
> > + platform_bus_type.add_links = of_link_to_suppliers;
> > /*
> > * Handle certain compatibles explicitly, since we don't want to create
> > * platform_devices for every node in /reserved-memory with a
> > --
> > 2.22.0.410.gd8fdbe21b5-goog
> >
On Mon, Jul 1, 2019 at 6:46 PM Rob Herring <[email protected]> wrote:
>
> On Mon, Jul 1, 2019 at 6:48 PM Saravana Kannan <[email protected]> wrote:
> >
> > The driver core/bus adding dependencies by default makes sure that
> > suppliers don't sync the hardware state with software state before all the
> > consumers have their drivers loaded (if they are modules) and are probed.
> >
> > However, when the bus incorrectly adds dependencies that it shouldn't have
> > added, the devices might never probe.
> >
> > For example, if device-C is a consumer of device-S and they have phandles
> > to each other in DT, the following could happen:
> >
> > 1. Device-S get added first.
> > 2. The bus add_links() callback will (incorrectly) try to link it as
> > a consumer of device-C.
> > 3. Since device-C isn't present, device-S will be put in
> > "waiting-for-supplier" list.
> > 4. Device-C gets added next.
> > 5. All devices in "waiting-for-supplier" list are retried for linking.
> > 6. Device-S gets linked as consumer to Device-C.
> > 7. The bus add_links() callback will (correctly) try to link it as
> > a consumer of device-S.
> > 8. This isn't allowed because it would create a cyclic device links.
> >
> > So neither devices will get probed since the supplier is dependent on a
> > consumer that'll never probe (because it can't get resources from the
> > supplier).
> >
> > Without this patch, things stay in this broken state. However, with this
> > patch, the execution will continue like this:
> >
> > 9. Device-C's driver is loaded.
> > 10. Device-C's driver removes Device-S as a consumer of Device-C.
> > 11. Device-C's driver adds Device-C as a consumer of Device-S.
> > 12. Device-S probes.
> > 13. Device-S sync_state() isn't called because Device-C hasn't probed yet.
> > 14. Device-C probes.
> > 15. Device-S's sync_state() callback is called.
>
> We already have some DT unittests around platform devices. It would be
> nice to extend them to demonstrate this problem. Could be a follow-up
> patch though.
>
> In the case a driver hasn't been updated, couldn't the driver core
> just remove all the links of C to S and S to C so that progress can be
> made and we retain the status quo of what we have today?
The problem is knowing which of those links to delete and when.
If a link between S and C fails, how do we know and keep track of
which of the other 100 links in the system are causing a cycle? It can
get unwieldy real quick. We could delete all the links to fall back to
status quo, but how do we tell at what point in time we can delete
them all?
> That would
> lessen the chances of breaking platforms and reduce the immediate need
> to fix them.
Which is why I think we need to have a commandline/config option to
turn this series on. Keep in mind that once this patch is merged, the
API for the supplier drivers would be the same whether the feature is
enabled or not. They just fallback to status quo behavior (do their
stuff in late_initcall_sync() like they do today).
This patch series has a huge impact on the behavior and I don't think
there's a sound reason to force it on everyone right away. This is
something that needs incremental changes to bring in more and more
platforms/drivers into the new scheme. At a minimum Qualcomm seems
pretty interested in using this to solve their "when do I change/turn
off this clock/interconnect after boot?" question.
-Saravana
Hello Saravana,
On 7/1/19 5:48 PM, Saravana Kannan wrote:
...
> TODO:
> - For the case of consumer child sub-nodes being added by a parent
> device after late_initcall_sync we might be able to address that by
> recursively parsing all child nodes and adding all their suppliers as
> suppliers of the parent node too. The parent probe will add the
> children before its probe is completed and that will prevent the
> supplier's sync_state from being executed before the children are
> probed.
>
> But I'll write that part once I see how this series is received.
I don't think that this scheme will work in all cases. It can also lead
to probing deadlock.
Here is an example:
Three DT devices (top level A with subnodes B and C):
/A
/A/B
/A/C
C is a consumer of B.
When device A is created, a search of its subnodes will find the link from
C to B. Since device B hasn't been created yet, of_link_to_suppliers()
will fail and add A to the wait_for_suppliers list. This will cause the
probe of A to fail with -EPROBE_DEFER (thanks to the check in
device_links_check_suppliers()). As a result device B will not be created
and device A will never probe.
You could try to resolve this situation by detecting the cycle and *not*
adding A to the wait_for_suppliers list. However, that would get us back
to the problem we had before. A would be allowed to probe which would
then result in devices being added for B and C. If the device for B is
added before C, then it would be allowed to immediately probe and
(assuming this all takes place after late_initcall_sync thanks to modules)
its sync_state() callback would be called since no consumer devices are
linked to B.
Please note that to change this example from theoretical to practical,
replace "A" with apps_rsc, "B" with pmi8998-rpmh-regulators, and "C" with
pm8998-rpmh-regulators in [1].
Take care,
David
[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/qcom/sdm845-mtp.dts?h=v5.2-rc7#n55
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
On Tue, Jul 2, 2019 at 5:03 PM David Collins <[email protected]> wrote:
>
> Hello Saravana,
>
> On 7/1/19 5:48 PM, Saravana Kannan wrote:
> ...
> > TODO:
> > - For the case of consumer child sub-nodes being added by a parent
> > device after late_initcall_sync we might be able to address that by
> > recursively parsing all child nodes and adding all their suppliers as
> > suppliers of the parent node too. The parent probe will add the
> > children before its probe is completed and that will prevent the
> > supplier's sync_state from being executed before the children are
> > probed.
> >
> > But I'll write that part once I see how this series is received.
>
> I don't think that this scheme will work in all cases. It can also lead
> to probing deadlock.
>
> Here is an example:
>
> Three DT devices (top level A with subnodes B and C):
> /A
> /A/B
> /A/C
> C is a consumer of B.
>
> When device A is created, a search of its subnodes will find the link from
> C to B. Since device B hasn't been created yet, of_link_to_suppliers()
> will fail and add A to the wait_for_suppliers list. This will cause the
> probe of A to fail with -EPROBE_DEFER (thanks to the check in
> device_links_check_suppliers()). As a result device B will not be created
> and device A will never probe.
>
> You could try to resolve this situation by detecting the cycle and *not*
> adding A to the wait_for_suppliers list. However, that would get us back
> to the problem we had before. A would be allowed to probe which would
> then result in devices being added for B and C. If the device for B is
> added before C, then it would be allowed to immediately probe and
> (assuming this all takes place after late_initcall_sync thanks to modules)
> its sync_state() callback would be called since no consumer devices are
> linked to B.
>
> Please note that to change this example from theoretical to practical,
> replace "A" with apps_rsc, "B" with pmi8998-rpmh-regulators, and "C" with
> pm8998-rpmh-regulators in [1].
Interesting use case.
First, to clarify my TODO: I was initially thinking of the recursive
"up-heritance" of suppliers from child to parent to handle cases where
the supplier is a device from some other top level device (or its
child). My thinking has evolved a bit on that. I think the parent
needs to inherit only from it's immediate children and not its
grandchildren (the child is responsible for handling grandchildren
suppliers). I'll also have to make sure I don't try to create a link
from a parent device to one of its child device nodes (should be easy
to check).
Anyway, going back to your case, for dependencies between child nodes
of a parent, can't the parent just populate them in the right order?
You can loop through the children and add them in multiple stages.
I'll continue to think if I can come up with anything nicer on the
drivers, but even if we can't come up with anything better, we can
still make sync_state() work.
Cheers,
Saravana
>
> Take care,
> David
>
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/qcom/sdm845-mtp.dts?h=v5.2-rc7#n55
>
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
On Tue, Jul 2, 2019 at 5:59 PM Saravana Kannan <[email protected]> wrote:
>
> On Tue, Jul 2, 2019 at 5:03 PM David Collins <[email protected]> wrote:
> >
> > Hello Saravana,
> >
> > On 7/1/19 5:48 PM, Saravana Kannan wrote:
> > ...
> > > TODO:
> > > - For the case of consumer child sub-nodes being added by a parent
> > > device after late_initcall_sync we might be able to address that by
> > > recursively parsing all child nodes and adding all their suppliers as
> > > suppliers of the parent node too. The parent probe will add the
> > > children before its probe is completed and that will prevent the
> > > supplier's sync_state from being executed before the children are
> > > probed.
> > >
> > > But I'll write that part once I see how this series is received.
> >
> > I don't think that this scheme will work in all cases. It can also lead
> > to probing deadlock.
> >
> > Here is an example:
> >
> > Three DT devices (top level A with subnodes B and C):
> > /A
> > /A/B
> > /A/C
> > C is a consumer of B.
> >
> > When device A is created, a search of its subnodes will find the link from
> > C to B. Since device B hasn't been created yet, of_link_to_suppliers()
> > will fail and add A to the wait_for_suppliers list. This will cause the
> > probe of A to fail with -EPROBE_DEFER (thanks to the check in
> > device_links_check_suppliers()). As a result device B will not be created
> > and device A will never probe.
> >
> > You could try to resolve this situation by detecting the cycle and *not*
> > adding A to the wait_for_suppliers list. However, that would get us back
> > to the problem we had before. A would be allowed to probe which would
> > then result in devices being added for B and C. If the device for B is
> > added before C, then it would be allowed to immediately probe and
> > (assuming this all takes place after late_initcall_sync thanks to modules)
> > its sync_state() callback would be called since no consumer devices are
> > linked to B.
> >
> > Please note that to change this example from theoretical to practical,
> > replace "A" with apps_rsc, "B" with pmi8998-rpmh-regulators, and "C" with
> > pm8998-rpmh-regulators in [1].
>
> Interesting use case.
>
> First, to clarify my TODO: I was initially thinking of the recursive
> "up-heritance" of suppliers from child to parent to handle cases where
> the supplier is a device from some other top level device (or its
> child). My thinking has evolved a bit on that. I think the parent
> needs to inherit only from it's immediate children and not its
> grandchildren (the child is responsible for handling grandchildren
> suppliers). I'll also have to make sure I don't try to create a link
> from a parent device to one of its child device nodes (should be easy
> to check).
>
> Anyway, going back to your case, for dependencies between child nodes
> of a parent, can't the parent just populate them in the right order?
> You can loop through the children and add them in multiple stages.
>
> I'll continue to think if I can come up with anything nicer on the
> drivers, but even if we can't come up with anything better, we can
> still make sync_state() work.
There's actually a much better way to handle this case where you won't
have to handle ordering on the driver side. I just need to add one or
two patches to my patch series. I'll send that out sometime next week.
-Saravana
HiRob,
Sorry for such a late reply...
On 7/1/19 8:25 PM, Saravana Kannan wrote:
> On Mon, Jul 1, 2019 at 6:32 PM Rob Herring <[email protected]> wrote:
>>
>> On Mon, Jul 1, 2019 at 6:48 PM Saravana Kannan <[email protected]> wrote:
>>>
>>> Add device-links after the devices are created (but before they are
>>> probed) by looking at common DT bindings like clocks and
>>> interconnects.
>>>
>>> Automatically adding device-links for functional dependencies at the
>>> framework level provides the following benefits:
>>>
>>> - Optimizes device probe order and avoids the useless work of
>>> attempting probes of devices that will not probe successfully
>>> (because their suppliers aren't present or haven't probed yet).
>>>
>>> For example, in a commonly available mobile SoC, registering just
>>> one consumer device's driver at an initcall level earlier than the
>>> supplier device's driver causes 11 failed probe attempts before the
>>> consumer device probes successfully. This was with a kernel with all
>>> the drivers statically compiled in. This problem gets a lot worse if
>>> all the drivers are loaded as modules without direct symbol
>>> dependencies.
>>>
>>> - Supplier devices like clock providers, interconnect providers, etc
>>> need to keep the resources they provide active and at a particular
>>> state(s) during boot up even if their current set of consumers don't
>>> request the resource to be active. This is because the rest of the
>>> consumers might not have probed yet and turning off the resource
>>> before all the consumers have probed could lead to a hang or
>>> undesired user experience.
>>>
>>> Some frameworks (Eg: regulator) handle this today by turning off
>>> "unused" resources at late_initcall_sync and hoping all the devices
>>> have probed by then. This is not a valid assumption for systems with
>>> loadable modules. Other frameworks (Eg: clock) just don't handle
>>> this due to the lack of a clear signal for when they can turn off
>>> resources. This leads to downstream hacks to handle cases like this
>>> that can easily be solved in the upstream kernel.
>>>
>>> By linking devices before they are probed, we give suppliers a clear
>>> count of the number of dependent consumers. Once all of the
>>> consumers are active, the suppliers can turn off the unused
>>> resources without making assumptions about the number of consumers.
>>>
>>> By default we just add device-links to track "driver presence" (probe
>>> succeeded) of the supplier device. If any other functionality provided
>>> by device-links are needed, it is left to the consumer/supplier
>>> devices to change the link when they probe.
>>>
>>> Signed-off-by: Saravana Kannan <[email protected]>
>>> ---
>>> drivers/of/Kconfig | 9 ++++++++
>>> drivers/of/platform.c | 52 +++++++++++++++++++++++++++++++++++++++++++
>>> 2 files changed, 61 insertions(+)
>>>
>>> diff --git a/drivers/of/Kconfig b/drivers/of/Kconfig
>>> index 37c2ccbefecd..7c7fa7394b4c 100644
>>> --- a/drivers/of/Kconfig
>>> +++ b/drivers/of/Kconfig
>>> @@ -103,4 +103,13 @@ config OF_OVERLAY
>>> config OF_NUMA
>>> bool
>>>
>>> +config OF_DEVLINKS
>>
>> I'd prefer this not be a config option. After all, we want one kernel
>> build that works for all platforms.
>
> We need a lot more changes before one kernel build can work for all
> platforms. At least until then, I think we need this. Lot less chance
> of breaking existing platforms before all the missing pieces are
> created.
>
>> A kernel command line option to disable might be useful for debugging.
>
> Or we can have a command line to enable this for platforms that want
> to use it and have it default off.
Given the fragility of the current boot sequence (without this patch set)
and the potential breakage of existing systems, I think that if we choose
to accept this patch set that it should first bake in the -next tree for
at least one major release cycle. Maybe even two major release cycles.
-Frank
>
>>> + bool "Device links from DT bindings"
>>> + help
>>> + Common DT bindings like clocks, interconnects, etc represent a
>>> + consumer device's dependency on suppliers devices. This option
>>> + creates device links from these common bindings so that consumers are
>>> + probed only after all their suppliers are active and suppliers can
>>> + tell when all their consumers are active.
>>> +
>>> endif # OF
>>> diff --git a/drivers/of/platform.c b/drivers/of/platform.c
>>> index 04ad312fd85b..a53717168aca 100644
>>> --- a/drivers/of/platform.c
>>> +++ b/drivers/of/platform.c
>>> @@ -61,6 +61,57 @@ struct platform_device *of_find_device_by_node(struct device_node *np)
>>> EXPORT_SYMBOL(of_find_device_by_node);
>>>
>>> #ifdef CONFIG_OF_ADDRESS
>>> +static int of_link_binding(struct device *dev, char *binding, char *cell)
>>
>> Under CONFIG_OF_ADDRESS seems like a strange location.
>
> Yeah, but the rest of the file seems to be under this. So I'm not
> touching that. I can probably move this function further down (close
> to platform populate) if you want that.
>>
>>> +{
>>> + struct of_phandle_args sup_args;
>>> + struct platform_device *sup_dev;
>>> + unsigned int i = 0, links = 0;
>>> + u32 dl_flags = DL_FLAG_AUTOPROBE_CONSUMER;
>>> +
>>> + while (!of_parse_phandle_with_args(dev->of_node, binding, cell, i,
>>> + &sup_args)) {
>>> + i++;
>>> + sup_dev = of_find_device_by_node(sup_args.np);
>>> + if (!sup_dev)
>>> + continue;
>>> + if (device_link_add(dev, &sup_dev->dev, dl_flags))
>>> + links++;
>>> + put_device(&sup_dev->dev);
>>> + }
>>> + if (links < i)
>>> + return -ENODEV;
>>> + return 0;
>>> +}
>>> +
>>> +/*
>>> + * List of bindings and their cell names (use NULL if no cell names) from which
>>> + * device links need to be created.
>>> + */
>>> +static char *link_bindings[] = {
>>
>> const
>
> Ack
>
>>
>>> +#ifdef CONFIG_OF_DEVLINKS
>>> + "clocks", "#clock-cells",
>>> + "interconnects", "#interconnect-cells",
>>
>> Planning to add others?
>
> Not in this patch.
>
> Regulators are the other big missing piece that I'm aware of now but
> they need a lot of discussion (see email from David and my reply).
>
> Not sure what other resources are shared where they can be "turned
> off" and cause devices set up at boot to fail. For example, I don't
> think interrupts need functional dependency tracking because they
> aren't really turned off by consumer 1 in a way that breaks things for
> consumer 2. Just masked and the consumer 2 can unmask and use it once
> it probes.
>
> I'm only intimately familiar with clocks, interconnects and regulators
> (to some extent). I'm open to adding other supplier categories in
> future patches as I educate myself of those or if other people want to
> add support for more categories.
>
> -Saravana
>
>>> +#endif
>>> +};
>>> +
>>> +static int of_link_to_suppliers(struct device *dev)
>>> +{
>>> + unsigned int i = 0;
>>> + bool done = true;
>>> +
>>> + if (unlikely(!dev->of_node))
>>> + return 0;
>>> +
>>> + for (i = 0; i < ARRAY_SIZE(link_bindings) / 2; i++)
>>> + if (of_link_binding(dev, link_bindings[i * 2],
>>> + link_bindings[i * 2 + 1]))
>>> + done = false;
>>> +
>>> + if (!done)
>>> + return -ENODEV;
>>> + return 0;
>>> +}
>>> +
>>> /*
>>> * The following routines scan a subtree and registers a device for
>>> * each applicable node.
>>> @@ -524,6 +575,7 @@ static int __init of_platform_default_populate_init(void)
>>> if (!of_have_populated_dt())
>>> return -ENODEV;
>>>
>>> + platform_bus_type.add_links = of_link_to_suppliers;
>>> /*
>>> * Handle certain compatibles explicitly, since we don't want to create
>>> * platform_devices for every node in /reserved-memory with a
>>> --
>>> 2.22.0.410.gd8fdbe21b5-goog
>>>
>
On 7/15/19 7:26 AM, Frank Rowand wrote:
> HiRob,
>
> Sorry for such a late reply...
>
>
> On 7/1/19 8:25 PM, Saravana Kannan wrote:
>> On Mon, Jul 1, 2019 at 6:32 PM Rob Herring <[email protected]> wrote:
>>>
>>> On Mon, Jul 1, 2019 at 6:48 PM Saravana Kannan <[email protected]> wrote:
>>>>
>>>> Add device-links after the devices are created (but before they are
>>>> probed) by looking at common DT bindings like clocks and
>>>> interconnects.
>>>>
>>>> Automatically adding device-links for functional dependencies at the
>>>> framework level provides the following benefits:
>>>>
>>>> - Optimizes device probe order and avoids the useless work of
>>>> attempting probes of devices that will not probe successfully
>>>> (because their suppliers aren't present or haven't probed yet).
>>>>
>>>> For example, in a commonly available mobile SoC, registering just
>>>> one consumer device's driver at an initcall level earlier than the
>>>> supplier device's driver causes 11 failed probe attempts before the
>>>> consumer device probes successfully. This was with a kernel with all
>>>> the drivers statically compiled in. This problem gets a lot worse if
>>>> all the drivers are loaded as modules without direct symbol
>>>> dependencies.
>>>>
>>>> - Supplier devices like clock providers, interconnect providers, etc
>>>> need to keep the resources they provide active and at a particular
>>>> state(s) during boot up even if their current set of consumers don't
>>>> request the resource to be active. This is because the rest of the
>>>> consumers might not have probed yet and turning off the resource
>>>> before all the consumers have probed could lead to a hang or
>>>> undesired user experience.
>>>>
>>>> Some frameworks (Eg: regulator) handle this today by turning off
>>>> "unused" resources at late_initcall_sync and hoping all the devices
>>>> have probed by then. This is not a valid assumption for systems with
>>>> loadable modules. Other frameworks (Eg: clock) just don't handle
>>>> this due to the lack of a clear signal for when they can turn off
>>>> resources. This leads to downstream hacks to handle cases like this
>>>> that can easily be solved in the upstream kernel.
>>>>
>>>> By linking devices before they are probed, we give suppliers a clear
>>>> count of the number of dependent consumers. Once all of the
>>>> consumers are active, the suppliers can turn off the unused
>>>> resources without making assumptions about the number of consumers.
>>>>
>>>> By default we just add device-links to track "driver presence" (probe
>>>> succeeded) of the supplier device. If any other functionality provided
>>>> by device-links are needed, it is left to the consumer/supplier
>>>> devices to change the link when they probe.
>>>>
>>>> Signed-off-by: Saravana Kannan <[email protected]>
>>>> ---
>>>> drivers/of/Kconfig | 9 ++++++++
>>>> drivers/of/platform.c | 52 +++++++++++++++++++++++++++++++++++++++++++
>>>> 2 files changed, 61 insertions(+)
>>>>
>>>> diff --git a/drivers/of/Kconfig b/drivers/of/Kconfig
>>>> index 37c2ccbefecd..7c7fa7394b4c 100644
>>>> --- a/drivers/of/Kconfig
>>>> +++ b/drivers/of/Kconfig
>>>> @@ -103,4 +103,13 @@ config OF_OVERLAY
>>>> config OF_NUMA
>>>> bool
>>>>
>>>> +config OF_DEVLINKS
>>>
>>> I'd prefer this not be a config option. After all, we want one kernel
>>> build that works for all platforms.
>>
>> We need a lot more changes before one kernel build can work for all
>> platforms. At least until then, I think we need this. Lot less chance
>> of breaking existing platforms before all the missing pieces are
>> created.
>>
>>> A kernel command line option to disable might be useful for debugging.
>>
>> Or we can have a command line to enable this for platforms that want
>> to use it and have it default off.
>
> Given the fragility of the current boot sequence (without this patch set)
> and the potential breakage of existing systems, I think that if we choose
> to accept this patch set that it should first bake in the -next tree for
> at least one major release cycle. Maybe even two major release cycles.
I probably didn't state that very well. I was trying to not sound like
I was accusing this patch series of being fragile. The issue is that
adding the patches to systems that weren't expecting the new ordering
may cause boot problems for some systems. I'm not concerned with
pointing fingers, just want to make sure that we proceed cautiously
until we know that the resulting system is robust.
-Frank
>
> -Frank
>
>
>>
>>>> + bool "Device links from DT bindings"
>>>> + help
>>>> + Common DT bindings like clocks, interconnects, etc represent a
>>>> + consumer device's dependency on suppliers devices. This option
>>>> + creates device links from these common bindings so that consumers are
>>>> + probed only after all their suppliers are active and suppliers can
>>>> + tell when all their consumers are active.
>>>> +
>>>> endif # OF
>>>> diff --git a/drivers/of/platform.c b/drivers/of/platform.c
>>>> index 04ad312fd85b..a53717168aca 100644
>>>> --- a/drivers/of/platform.c
>>>> +++ b/drivers/of/platform.c
>>>> @@ -61,6 +61,57 @@ struct platform_device *of_find_device_by_node(struct device_node *np)
>>>> EXPORT_SYMBOL(of_find_device_by_node);
>>>>
>>>> #ifdef CONFIG_OF_ADDRESS
>>>> +static int of_link_binding(struct device *dev, char *binding, char *cell)
>>>
>>> Under CONFIG_OF_ADDRESS seems like a strange location.
>>
>> Yeah, but the rest of the file seems to be under this. So I'm not
>> touching that. I can probably move this function further down (close
>> to platform populate) if you want that.
>>>
>>>> +{
>>>> + struct of_phandle_args sup_args;
>>>> + struct platform_device *sup_dev;
>>>> + unsigned int i = 0, links = 0;
>>>> + u32 dl_flags = DL_FLAG_AUTOPROBE_CONSUMER;
>>>> +
>>>> + while (!of_parse_phandle_with_args(dev->of_node, binding, cell, i,
>>>> + &sup_args)) {
>>>> + i++;
>>>> + sup_dev = of_find_device_by_node(sup_args.np);
>>>> + if (!sup_dev)
>>>> + continue;
>>>> + if (device_link_add(dev, &sup_dev->dev, dl_flags))
>>>> + links++;
>>>> + put_device(&sup_dev->dev);
>>>> + }
>>>> + if (links < i)
>>>> + return -ENODEV;
>>>> + return 0;
>>>> +}
>>>> +
>>>> +/*
>>>> + * List of bindings and their cell names (use NULL if no cell names) from which
>>>> + * device links need to be created.
>>>> + */
>>>> +static char *link_bindings[] = {
>>>
>>> const
>>
>> Ack
>>
>>>
>>>> +#ifdef CONFIG_OF_DEVLINKS
>>>> + "clocks", "#clock-cells",
>>>> + "interconnects", "#interconnect-cells",
>>>
>>> Planning to add others?
>>
>> Not in this patch.
>>
>> Regulators are the other big missing piece that I'm aware of now but
>> they need a lot of discussion (see email from David and my reply).
>>
>> Not sure what other resources are shared where they can be "turned
>> off" and cause devices set up at boot to fail. For example, I don't
>> think interrupts need functional dependency tracking because they
>> aren't really turned off by consumer 1 in a way that breaks things for
>> consumer 2. Just masked and the consumer 2 can unmask and use it once
>> it probes.
>>
>> I'm only intimately familiar with clocks, interconnects and regulators
>> (to some extent). I'm open to adding other supplier categories in
>> future patches as I educate myself of those or if other people want to
>> add support for more categories.
>>
>> -Saravana
>>
>>>> +#endif
>>>> +};
>>>> +
>>>> +static int of_link_to_suppliers(struct device *dev)
>>>> +{
>>>> + unsigned int i = 0;
>>>> + bool done = true;
>>>> +
>>>> + if (unlikely(!dev->of_node))
>>>> + return 0;
>>>> +
>>>> + for (i = 0; i < ARRAY_SIZE(link_bindings) / 2; i++)
>>>> + if (of_link_binding(dev, link_bindings[i * 2],
>>>> + link_bindings[i * 2 + 1]))
>>>> + done = false;
>>>> +
>>>> + if (!done)
>>>> + return -ENODEV;
>>>> + return 0;
>>>> +}
>>>> +
>>>> /*
>>>> * The following routines scan a subtree and registers a device for
>>>> * each applicable node.
>>>> @@ -524,6 +575,7 @@ static int __init of_platform_default_populate_init(void)
>>>> if (!of_have_populated_dt())
>>>> return -ENODEV;
>>>>
>>>> + platform_bus_type.add_links = of_link_to_suppliers;
>>>> /*
>>>> * Handle certain compatibles explicitly, since we don't want to create
>>>> * platform_devices for every node in /reserved-memory with a
>>>> --
>>>> 2.22.0.410.gd8fdbe21b5-goog
>>>>
>>
>
>
Replying again because the previous email accidentally included HTML.
Thanks for taking the time to reconsider the wording Frank. Your
intention was clear to me in the first email too.
A kernel command line option can also completely disable this
functionality easily and cleanly. Can we pick that as an option? I've
an implementation of that in the v5 series I sent out last week.
-Saravana
On Mon, Jul 15, 2019 at 7:39 AM Frank Rowand <[email protected]> wrote:
>
> On 7/15/19 7:26 AM, Frank Rowand wrote:
> > HiRob,
> >
> > Sorry for such a late reply...
> >
> >
> > On 7/1/19 8:25 PM, Saravana Kannan wrote:
> >> On Mon, Jul 1, 2019 at 6:32 PM Rob Herring <[email protected]> wrote:
> >>>
> >>> On Mon, Jul 1, 2019 at 6:48 PM Saravana Kannan <[email protected]> wrote:
> >>>>
> >>>> Add device-links after the devices are created (but before they are
> >>>> probed) by looking at common DT bindings like clocks and
> >>>> interconnects.
> >>>>
> >>>> Automatically adding device-links for functional dependencies at the
> >>>> framework level provides the following benefits:
> >>>>
> >>>> - Optimizes device probe order and avoids the useless work of
> >>>> attempting probes of devices that will not probe successfully
> >>>> (because their suppliers aren't present or haven't probed yet).
> >>>>
> >>>> For example, in a commonly available mobile SoC, registering just
> >>>> one consumer device's driver at an initcall level earlier than the
> >>>> supplier device's driver causes 11 failed probe attempts before the
> >>>> consumer device probes successfully. This was with a kernel with all
> >>>> the drivers statically compiled in. This problem gets a lot worse if
> >>>> all the drivers are loaded as modules without direct symbol
> >>>> dependencies.
> >>>>
> >>>> - Supplier devices like clock providers, interconnect providers, etc
> >>>> need to keep the resources they provide active and at a particular
> >>>> state(s) during boot up even if their current set of consumers don't
> >>>> request the resource to be active. This is because the rest of the
> >>>> consumers might not have probed yet and turning off the resource
> >>>> before all the consumers have probed could lead to a hang or
> >>>> undesired user experience.
> >>>>
> >>>> Some frameworks (Eg: regulator) handle this today by turning off
> >>>> "unused" resources at late_initcall_sync and hoping all the devices
> >>>> have probed by then. This is not a valid assumption for systems with
> >>>> loadable modules. Other frameworks (Eg: clock) just don't handle
> >>>> this due to the lack of a clear signal for when they can turn off
> >>>> resources. This leads to downstream hacks to handle cases like this
> >>>> that can easily be solved in the upstream kernel.
> >>>>
> >>>> By linking devices before they are probed, we give suppliers a clear
> >>>> count of the number of dependent consumers. Once all of the
> >>>> consumers are active, the suppliers can turn off the unused
> >>>> resources without making assumptions about the number of consumers.
> >>>>
> >>>> By default we just add device-links to track "driver presence" (probe
> >>>> succeeded) of the supplier device. If any other functionality provided
> >>>> by device-links are needed, it is left to the consumer/supplier
> >>>> devices to change the link when they probe.
> >>>>
> >>>> Signed-off-by: Saravana Kannan <[email protected]>
> >>>> ---
> >>>> drivers/of/Kconfig | 9 ++++++++
> >>>> drivers/of/platform.c | 52 +++++++++++++++++++++++++++++++++++++++++++
> >>>> 2 files changed, 61 insertions(+)
> >>>>
> >>>> diff --git a/drivers/of/Kconfig b/drivers/of/Kconfig
> >>>> index 37c2ccbefecd..7c7fa7394b4c 100644
> >>>> --- a/drivers/of/Kconfig
> >>>> +++ b/drivers/of/Kconfig
> >>>> @@ -103,4 +103,13 @@ config OF_OVERLAY
> >>>> config OF_NUMA
> >>>> bool
> >>>>
> >>>> +config OF_DEVLINKS
> >>>
> >>> I'd prefer this not be a config option. After all, we want one kernel
> >>> build that works for all platforms.
> >>
> >> We need a lot more changes before one kernel build can work for all
> >> platforms. At least until then, I think we need this. Lot less chance
> >> of breaking existing platforms before all the missing pieces are
> >> created.
> >>
> >>> A kernel command line option to disable might be useful for debugging.
> >>
> >> Or we can have a command line to enable this for platforms that want
> >> to use it and have it default off.
> >
>
> > Given the fragility of the current boot sequence (without this patch set)
> > and the potential breakage of existing systems, I think that if we choose
> > to accept this patch set that it should first bake in the -next tree for
> > at least one major release cycle. Maybe even two major release cycles.
>
> I probably didn't state that very well. I was trying to not sound like
> I was accusing this patch series of being fragile. The issue is that
> adding the patches to systems that weren't expecting the new ordering
> may cause boot problems for some systems. I'm not concerned with
> pointing fingers, just want to make sure that we proceed cautiously
> until we know that the resulting system is robust.
>
> -Frank
>
> >
> > -Frank
> >
> >
> >>
> >>>> + bool "Device links from DT bindings"
> >>>> + help
> >>>> + Common DT bindings like clocks, interconnects, etc represent a
> >>>> + consumer device's dependency on suppliers devices. This option
> >>>> + creates device links from these common bindings so that consumers are
> >>>> + probed only after all their suppliers are active and suppliers can
> >>>> + tell when all their consumers are active.
> >>>> +
> >>>> endif # OF
> >>>> diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> >>>> index 04ad312fd85b..a53717168aca 100644
> >>>> --- a/drivers/of/platform.c
> >>>> +++ b/drivers/of/platform.c
> >>>> @@ -61,6 +61,57 @@ struct platform_device *of_find_device_by_node(struct device_node *np)
> >>>> EXPORT_SYMBOL(of_find_device_by_node);
> >>>>
> >>>> #ifdef CONFIG_OF_ADDRESS
> >>>> +static int of_link_binding(struct device *dev, char *binding, char *cell)
> >>>
> >>> Under CONFIG_OF_ADDRESS seems like a strange location.
> >>
> >> Yeah, but the rest of the file seems to be under this. So I'm not
> >> touching that. I can probably move this function further down (close
> >> to platform populate) if you want that.
> >>>
> >>>> +{
> >>>> + struct of_phandle_args sup_args;
> >>>> + struct platform_device *sup_dev;
> >>>> + unsigned int i = 0, links = 0;
> >>>> + u32 dl_flags = DL_FLAG_AUTOPROBE_CONSUMER;
> >>>> +
> >>>> + while (!of_parse_phandle_with_args(dev->of_node, binding, cell, i,
> >>>> + &sup_args)) {
> >>>> + i++;
> >>>> + sup_dev = of_find_device_by_node(sup_args.np);
> >>>> + if (!sup_dev)
> >>>> + continue;
> >>>> + if (device_link_add(dev, &sup_dev->dev, dl_flags))
> >>>> + links++;
> >>>> + put_device(&sup_dev->dev);
> >>>> + }
> >>>> + if (links < i)
> >>>> + return -ENODEV;
> >>>> + return 0;
> >>>> +}
> >>>> +
> >>>> +/*
> >>>> + * List of bindings and their cell names (use NULL if no cell names) from which
> >>>> + * device links need to be created.
> >>>> + */
> >>>> +static char *link_bindings[] = {
> >>>
> >>> const
> >>
> >> Ack
> >>
> >>>
> >>>> +#ifdef CONFIG_OF_DEVLINKS
> >>>> + "clocks", "#clock-cells",
> >>>> + "interconnects", "#interconnect-cells",
> >>>
> >>> Planning to add others?
> >>
> >> Not in this patch.
> >>
> >> Regulators are the other big missing piece that I'm aware of now but
> >> they need a lot of discussion (see email from David and my reply).
> >>
> >> Not sure what other resources are shared where they can be "turned
> >> off" and cause devices set up at boot to fail. For example, I don't
> >> think interrupts need functional dependency tracking because they
> >> aren't really turned off by consumer 1 in a way that breaks things for
> >> consumer 2. Just masked and the consumer 2 can unmask and use it once
> >> it probes.
> >>
> >> I'm only intimately familiar with clocks, interconnects and regulators
> >> (to some extent). I'm open to adding other supplier categories in
> >> future patches as I educate myself of those or if other people want to
> >> add support for more categories.
> >>
> >> -Saravana
> >>
> >>>> +#endif
> >>>> +};
> >>>> +
> >>>> +static int of_link_to_suppliers(struct device *dev)
> >>>> +{
> >>>> + unsigned int i = 0;
> >>>> + bool done = true;
> >>>> +
> >>>> + if (unlikely(!dev->of_node))
> >>>> + return 0;
> >>>> +
> >>>> + for (i = 0; i < ARRAY_SIZE(link_bindings) / 2; i++)
> >>>> + if (of_link_binding(dev, link_bindings[i * 2],
> >>>> + link_bindings[i * 2 + 1]))
> >>>> + done = false;
> >>>> +
> >>>> + if (!done)
> >>>> + return -ENODEV;
> >>>> + return 0;
> >>>> +}
> >>>> +
> >>>> /*
> >>>> * The following routines scan a subtree and registers a device for
> >>>> * each applicable node.
> >>>> @@ -524,6 +575,7 @@ static int __init of_platform_default_populate_init(void)
> >>>> if (!of_have_populated_dt())
> >>>> return -ENODEV;
> >>>>
> >>>> + platform_bus_type.add_links = of_link_to_suppliers;
> >>>> /*
> >>>> * Handle certain compatibles explicitly, since we don't want to create
> >>>> * platform_devices for every node in /reserved-memory with a
> >>>> --
> >>>> 2.22.0.410.gd8fdbe21b5-goog
> >>>>
> >>
> >
> >
>
On 7/15/19 11:40 AM, Saravana Kannan wrote:
> Replying again because the previous email accidentally included HTML.
>
> Thanks for taking the time to reconsider the wording Frank. Your
> intention was clear to me in the first email too.
>
> A kernel command line option can also completely disable this
> functionality easily and cleanly. Can we pick that as an option? I've
> an implementation of that in the v5 series I sent out last week.
Yes, Rob suggested a command line option for debugging, and I am fine with
that. But even with that, I would like a lot of testing so that we have a
chance of finding systems that have trouble with the changes and could
potentially be fixed before impacting a large number of users.
-Frank
>
> -Saravana
>
> On Mon, Jul 15, 2019 at 7:39 AM Frank Rowand <[email protected]> wrote:
>>
>> On 7/15/19 7:26 AM, Frank Rowand wrote:
>>> HiRob,
>>>
>>> Sorry for such a late reply...
>>>
>>>
>>> On 7/1/19 8:25 PM, Saravana Kannan wrote:
>>>> On Mon, Jul 1, 2019 at 6:32 PM Rob Herring <[email protected]> wrote:
>>>>>
>>>>> On Mon, Jul 1, 2019 at 6:48 PM Saravana Kannan <[email protected]> wrote:
>>>>>>
>>>>>> Add device-links after the devices are created (but before they are
>>>>>> probed) by looking at common DT bindings like clocks and
>>>>>> interconnects.
>>>>>>
>>>>>> Automatically adding device-links for functional dependencies at the
>>>>>> framework level provides the following benefits:
>>>>>>
>>>>>> - Optimizes device probe order and avoids the useless work of
>>>>>> attempting probes of devices that will not probe successfully
>>>>>> (because their suppliers aren't present or haven't probed yet).
>>>>>>
>>>>>> For example, in a commonly available mobile SoC, registering just
>>>>>> one consumer device's driver at an initcall level earlier than the
>>>>>> supplier device's driver causes 11 failed probe attempts before the
>>>>>> consumer device probes successfully. This was with a kernel with all
>>>>>> the drivers statically compiled in. This problem gets a lot worse if
>>>>>> all the drivers are loaded as modules without direct symbol
>>>>>> dependencies.
>>>>>>
>>>>>> - Supplier devices like clock providers, interconnect providers, etc
>>>>>> need to keep the resources they provide active and at a particular
>>>>>> state(s) during boot up even if their current set of consumers don't
>>>>>> request the resource to be active. This is because the rest of the
>>>>>> consumers might not have probed yet and turning off the resource
>>>>>> before all the consumers have probed could lead to a hang or
>>>>>> undesired user experience.
>>>>>>
>>>>>> Some frameworks (Eg: regulator) handle this today by turning off
>>>>>> "unused" resources at late_initcall_sync and hoping all the devices
>>>>>> have probed by then. This is not a valid assumption for systems with
>>>>>> loadable modules. Other frameworks (Eg: clock) just don't handle
>>>>>> this due to the lack of a clear signal for when they can turn off
>>>>>> resources. This leads to downstream hacks to handle cases like this
>>>>>> that can easily be solved in the upstream kernel.
>>>>>>
>>>>>> By linking devices before they are probed, we give suppliers a clear
>>>>>> count of the number of dependent consumers. Once all of the
>>>>>> consumers are active, the suppliers can turn off the unused
>>>>>> resources without making assumptions about the number of consumers.
>>>>>>
>>>>>> By default we just add device-links to track "driver presence" (probe
>>>>>> succeeded) of the supplier device. If any other functionality provided
>>>>>> by device-links are needed, it is left to the consumer/supplier
>>>>>> devices to change the link when they probe.
>>>>>>
>>>>>> Signed-off-by: Saravana Kannan <[email protected]>
>>>>>> ---
>>>>>> drivers/of/Kconfig | 9 ++++++++
>>>>>> drivers/of/platform.c | 52 +++++++++++++++++++++++++++++++++++++++++++
>>>>>> 2 files changed, 61 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/of/Kconfig b/drivers/of/Kconfig
>>>>>> index 37c2ccbefecd..7c7fa7394b4c 100644
>>>>>> --- a/drivers/of/Kconfig
>>>>>> +++ b/drivers/of/Kconfig
>>>>>> @@ -103,4 +103,13 @@ config OF_OVERLAY
>>>>>> config OF_NUMA
>>>>>> bool
>>>>>>
>>>>>> +config OF_DEVLINKS
>>>>>
>>>>> I'd prefer this not be a config option. After all, we want one kernel
>>>>> build that works for all platforms.
>>>>
>>>> We need a lot more changes before one kernel build can work for all
>>>> platforms. At least until then, I think we need this. Lot less chance
>>>> of breaking existing platforms before all the missing pieces are
>>>> created.
>>>>
>>>>> A kernel command line option to disable might be useful for debugging.
>>>>
>>>> Or we can have a command line to enable this for platforms that want
>>>> to use it and have it default off.
>>>
>>
>>> Given the fragility of the current boot sequence (without this patch set)
>>> and the potential breakage of existing systems, I think that if we choose
>>> to accept this patch set that it should first bake in the -next tree for
>>> at least one major release cycle. Maybe even two major release cycles.
>>
>> I probably didn't state that very well. I was trying to not sound like
>> I was accusing this patch series of being fragile. The issue is that
>> adding the patches to systems that weren't expecting the new ordering
>> may cause boot problems for some systems. I'm not concerned with
>> pointing fingers, just want to make sure that we proceed cautiously
>> until we know that the resulting system is robust.
>>
>> -Frank
>>
>>>
>>> -Frank
>>>
>>>
>>>>
>>>>>> + bool "Device links from DT bindings"
>>>>>> + help
>>>>>> + Common DT bindings like clocks, interconnects, etc represent a
>>>>>> + consumer device's dependency on suppliers devices. This option
>>>>>> + creates device links from these common bindings so that consumers are
>>>>>> + probed only after all their suppliers are active and suppliers can
>>>>>> + tell when all their consumers are active.
>>>>>> +
>>>>>> endif # OF
>>>>>> diff --git a/drivers/of/platform.c b/drivers/of/platform.c
>>>>>> index 04ad312fd85b..a53717168aca 100644
>>>>>> --- a/drivers/of/platform.c
>>>>>> +++ b/drivers/of/platform.c
>>>>>> @@ -61,6 +61,57 @@ struct platform_device *of_find_device_by_node(struct device_node *np)
>>>>>> EXPORT_SYMBOL(of_find_device_by_node);
>>>>>>
>>>>>> #ifdef CONFIG_OF_ADDRESS
>>>>>> +static int of_link_binding(struct device *dev, char *binding, char *cell)
>>>>>
>>>>> Under CONFIG_OF_ADDRESS seems like a strange location.
>>>>
>>>> Yeah, but the rest of the file seems to be under this. So I'm not
>>>> touching that. I can probably move this function further down (close
>>>> to platform populate) if you want that.
>>>>>
>>>>>> +{
>>>>>> + struct of_phandle_args sup_args;
>>>>>> + struct platform_device *sup_dev;
>>>>>> + unsigned int i = 0, links = 0;
>>>>>> + u32 dl_flags = DL_FLAG_AUTOPROBE_CONSUMER;
>>>>>> +
>>>>>> + while (!of_parse_phandle_with_args(dev->of_node, binding, cell, i,
>>>>>> + &sup_args)) {
>>>>>> + i++;
>>>>>> + sup_dev = of_find_device_by_node(sup_args.np);
>>>>>> + if (!sup_dev)
>>>>>> + continue;
>>>>>> + if (device_link_add(dev, &sup_dev->dev, dl_flags))
>>>>>> + links++;
>>>>>> + put_device(&sup_dev->dev);
>>>>>> + }
>>>>>> + if (links < i)
>>>>>> + return -ENODEV;
>>>>>> + return 0;
>>>>>> +}
>>>>>> +
>>>>>> +/*
>>>>>> + * List of bindings and their cell names (use NULL if no cell names) from which
>>>>>> + * device links need to be created.
>>>>>> + */
>>>>>> +static char *link_bindings[] = {
>>>>>
>>>>> const
>>>>
>>>> Ack
>>>>
>>>>>
>>>>>> +#ifdef CONFIG_OF_DEVLINKS
>>>>>> + "clocks", "#clock-cells",
>>>>>> + "interconnects", "#interconnect-cells",
>>>>>
>>>>> Planning to add others?
>>>>
>>>> Not in this patch.
>>>>
>>>> Regulators are the other big missing piece that I'm aware of now but
>>>> they need a lot of discussion (see email from David and my reply).
>>>>
>>>> Not sure what other resources are shared where they can be "turned
>>>> off" and cause devices set up at boot to fail. For example, I don't
>>>> think interrupts need functional dependency tracking because they
>>>> aren't really turned off by consumer 1 in a way that breaks things for
>>>> consumer 2. Just masked and the consumer 2 can unmask and use it once
>>>> it probes.
>>>>
>>>> I'm only intimately familiar with clocks, interconnects and regulators
>>>> (to some extent). I'm open to adding other supplier categories in
>>>> future patches as I educate myself of those or if other people want to
>>>> add support for more categories.
>>>>
>>>> -Saravana
>>>>
>>>>>> +#endif
>>>>>> +};
>>>>>> +
>>>>>> +static int of_link_to_suppliers(struct device *dev)
>>>>>> +{
>>>>>> + unsigned int i = 0;
>>>>>> + bool done = true;
>>>>>> +
>>>>>> + if (unlikely(!dev->of_node))
>>>>>> + return 0;
>>>>>> +
>>>>>> + for (i = 0; i < ARRAY_SIZE(link_bindings) / 2; i++)
>>>>>> + if (of_link_binding(dev, link_bindings[i * 2],
>>>>>> + link_bindings[i * 2 + 1]))
>>>>>> + done = false;
>>>>>> +
>>>>>> + if (!done)
>>>>>> + return -ENODEV;
>>>>>> + return 0;
>>>>>> +}
>>>>>> +
>>>>>> /*
>>>>>> * The following routines scan a subtree and registers a device for
>>>>>> * each applicable node.
>>>>>> @@ -524,6 +575,7 @@ static int __init of_platform_default_populate_init(void)
>>>>>> if (!of_have_populated_dt())
>>>>>> return -ENODEV;
>>>>>>
>>>>>> + platform_bus_type.add_links = of_link_to_suppliers;
>>>>>> /*
>>>>>> * Handle certain compatibles explicitly, since we don't want to create
>>>>>> * platform_devices for every node in /reserved-memory with a
>>>>>> --
>>>>>> 2.22.0.410.gd8fdbe21b5-goog
>>>>>>
>>>>
>>>
>>>
>>
>
On Mon, Jul 15, 2019 at 7:05 PM Frank Rowand <[email protected]> wrote:
>
> On 7/15/19 11:40 AM, Saravana Kannan wrote:
> > Replying again because the previous email accidentally included HTML.
> >
> > Thanks for taking the time to reconsider the wording Frank. Your
> > intention was clear to me in the first email too.
> >
> > A kernel command line option can also completely disable this
> > functionality easily and cleanly. Can we pick that as an option? I've
> > an implementation of that in the v5 series I sent out last week.
>
> Yes, Rob suggested a command line option for debugging, and I am fine with
> that. But even with that, I would like a lot of testing so that we have a
> chance of finding systems that have trouble with the changes and could
> potentially be fixed before impacting a large number of users.
Leaving it in -next for more than a cycle will not help. There's some
number of users who test linux-next. Then there's more that test -rc
kernels. Then there's more that test final releases and/or stable
kernels. Probably, the more stable the h/w, the more it tends to be
latter groups. (I don't get reports of breaking PowerMacs with the
changes sitting in linux-next.)
My main worry about this being off by default is it won't get tested.
I'm not sure there's enough interest to drive folks to turn it on and
test. Maybe it needs to be on until we see breakage.
Rob
Resending again due to HTML. Sorry about it, the darn thing keeps
getting turned on for some reason!
On Tue, Jul 16, 2019 at 3:57 PM Rob Herring <[email protected]> wrote:
>
> On Mon, Jul 15, 2019 at 7:05 PM Frank Rowand <[email protected]> wrote:
> >
> > On 7/15/19 11:40 AM, Saravana Kannan wrote:
> > > Replying again because the previous email accidentally included HTML.
> > >
> > > Thanks for taking the time to reconsider the wording Frank. Your
> > > intention was clear to me in the first email too.
> > >
> > > A kernel command line option can also completely disable this
> > > functionality easily and cleanly. Can we pick that as an option? I've
> > > an implementation of that in the v5 series I sent out last week.
> >
> > Yes, Rob suggested a command line option for debugging, and I am fine with
> > that. But even with that, I would like a lot of testing so that we have a
> > chance of finding systems that have trouble with the changes and could
> > potentially be fixed before impacting a large number of users.
>
> Leaving it in -next for more than a cycle will not help. There's some
> number of users who test linux-next. Then there's more that test -rc
> kernels. Then there's more that test final releases and/or stable
> kernels. Probably, the more stable the h/w, the more it tends to be
> latter groups. (I don't get reports of breaking PowerMacs with the
> changes sitting in linux-next.)
>
> My main worry about this being off by default is it won't get tested.
> I'm not sure there's enough interest to drive folks to turn it on and
> test. Maybe it needs to be on until we see breakage.
Android will start using this. So there's definitely going to be a lot
of motivation for people to start using this and testing this.
I'm also thinking we should start rejecting late_initcall_sync() level
clean up code in device drivers in the future and start asking them to
move to sync_state(). If this feature isn't turned on, the behavior
will default to a late_initcall_sync() level execution. But when this
is on, it'll actually work nicely with modules. So that's another way
to drive folks to it and improve things over time.
Maybe we can have this on by default in linux-next and -rc. Fix any
issues that show up and can be fixed without breaking the goal of this
series (make things work with modules). And finally turn it off by
default before it gets pulled into mainline? That way, we get the
maximum test coverage we can get, but not accidentally break anything
that wasn't tested?
Thanks,
Saravana
On 7/16/19 3:56 PM, Rob Herring wrote:
> On Mon, Jul 15, 2019 at 7:05 PM Frank Rowand <[email protected]> wrote:
>>
>> On 7/15/19 11:40 AM, Saravana Kannan wrote:
>>> Replying again because the previous email accidentally included HTML.
>>>
>>> Thanks for taking the time to reconsider the wording Frank. Your
>>> intention was clear to me in the first email too.
>>>
>>> A kernel command line option can also completely disable this
>>> functionality easily and cleanly. Can we pick that as an option? I've
>>> an implementation of that in the v5 series I sent out last week.
>>
>> Yes, Rob suggested a command line option for debugging, and I am fine with
>> that. But even with that, I would like a lot of testing so that we have a
>> chance of finding systems that have trouble with the changes and could
>> potentially be fixed before impacting a large number of users.
>
> Leaving it in -next for more than a cycle will not help. There's some
I have to agree with your scepticism of the value of -next for this
specific case. But I think there is a _tiny_ potential of additional
testing if the feature is in more than one -next cycle.
> number of users who test linux-next. Then there's more that test -rc
> kernels. Then there's more that test final releases and/or stable
> kernels. Probably, the more stable the h/w, the more it tends to be
> latter groups. (I don't get reports of breaking PowerMacs with the
> changes sitting in linux-next.)
>
> My main worry about this being off by default is it won't get tested.
> I'm not sure there's enough interest to drive folks to turn it on and
> test. Maybe it needs to be on until we see breakage.
Agreed, but worried about the potential disruption when breakage
occurs.
-Frank
>
> Rob
> .
>