This patch series improves fw_devlink in the following ways:
1. It no longer cares about a fwnode having a "compatible" property. It
figures this out more dynamically. The only expectation is that
fwnodes that are converted to devices actually get probed by a driver
for the dependencies to be enforced correctly.
2. Finer grained dependency tracking. fw_devlink will now create device
links from the consumer to the actual resource's device (if it has one,
e.g. gpio_device) instead of the parent supplier device. This improves
things like async suspend/resume ordering, potentially removes the need
for frameworks to create device links, allows more parallelized async
probing, and gives better sync_state() tracking.
3. Handle hardware/software quirks where a child firmware node gets
populated as a device before its parent firmware node AND actually
supplies a non-optional resource to the parent firmware node's
device.
4. Way more robust at cycle handling (see patch for the insane cases).
5. Stops depending on OF_POPULATED to figure out some corner cases.
6. Simplifies the work that needs to be done by the firmware specific
code.
Sorry it took a while to roll the fixes I gave in the v1 series
thread[1] into a v2 series.
Since I didn't make any additional changes on top of what I already gave
in the v1 thread and Dmitry is very eager to get this series going, I'm
sending it out without testing locally. I already tested these patches a
few months ago as part of the v1 series. So I don't expect any major
issues. I'll test them again on my end in the next few days and will
report here if I actually find anything wrong.
Tony, Naresh, Abel, Sudeep, Geert,
I got the following Reviewed-bys and Tested-bys a few months back, but
it's been 5 months since I sent out v1. So I wasn't sure if it was okay
to include them in the v2 commits. Let me know if you are okay with this
being included in the commits and/or if you want to test this series
again.
Reviewed-by: Tony Lindgren <[email protected]>
Tested-by: Tony Lindgren <[email protected]>
Tested-by: Linux Kernel Functional Testing <[email protected]>
Tested-by: Naresh Kamboju <[email protected]>
Tested-by: Abel Vesa <[email protected]>
Tested-by: Sudeep Holla <[email protected]>
Tested-by: Geert Uytterhoeven <[email protected]>
Dmitry, Maxim(s), Miquel, Luca, Doug, Colin, Martin, Jean-Philippe,
I've Cc-ed you because I had pointed you to v1 of this series + the
patches in that thread at one point or another as a fix to some issue
you were facing. I'd appreciate it if you can test this series and
report any issues or things it fixed, and give Tested-bys.
In addition, if you can also apply a revert of this series[2] and delete
driver_deferred_probe_check_state() from your tree and see if you hit
any issues and report them, that'd be great too! I'm pretty sure some of
you will hit issues with that. I want to fix those next and then
revert[2].
Thanks,
Saravana
[1] - https://lore.kernel.org/lkml/[email protected]/
[2] - https://lore.kernel.org/lkml/[email protected]/
[3] - https://lore.kernel.org/lkml/CAGETcx-JUV1nj8wBJrTPfyvM7=Mre5j_vkVmZojeiumUGG6QZQ@mail.gmail.com/
v1 -> v2:
- Fixed Patch 1 to handle a corner case discussed in [3].
- New patch 10 to handle "fsl,imx8mq-gpc" being initialized by 2 drivers.
- New patch 11 to add fw_devlink support for SCMI devices.
Cc: Abel Vesa <[email protected]>
Cc: Alexander Stein <[email protected]>
Cc: Tony Lindgren <[email protected]>
Cc: Sudeep Holla <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: John Stultz <[email protected]>
Cc: Doug Anderson <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Dmitry Baryshkov <[email protected]>
Cc: Maxim Kiselev <[email protected]>
Cc: Maxim Kochetkov <[email protected]>
Cc: Miquel Raynal <[email protected]>
Cc: Luca Weiss <[email protected]>
Cc: Colin Foster <[email protected]>
Cc: Martin Kepplinger <[email protected]>
Cc: Jean-Philippe Brucker <[email protected]>
Saravana Kannan (11):
driver core: fw_devlink: Don't purge child fwnode's consumer links
driver core: fw_devlink: Improve check for fwnode with no
device/driver
soc: renesas: Move away from using OF_POPULATED for fw_devlink
gpiolib: Clear the gpio_device's fwnode initialized flag before adding
driver core: fw_devlink: Add DL_FLAG_CYCLE support to device links
driver core: fw_devlink: Allow marking a fwnode link as being part of
a cycle
driver core: fw_devlink: Consolidate device link flag computation
driver core: fw_devlink: Make cycle detection more robust
of: property: Simplify of_link_to_phandle()
irqchip/irq-imx-gpcv2: Mark fwnode device as not initialized
firmware: arm_scmi: Set fwnode for the scmi_device
drivers/base/core.c | 443 +++++++++++++++++++++-----------
drivers/firmware/arm_scmi/bus.c | 2 +
drivers/gpio/gpiolib.c | 6 +
drivers/irqchip/irq-imx-gpcv2.c | 1 +
drivers/of/property.c | 84 +-----
drivers/soc/imx/gpcv2.c | 1 +
drivers/soc/renesas/rcar-sysc.c | 2 +-
include/linux/device.h | 1 +
include/linux/fwnode.h | 12 +-
9 files changed, 332 insertions(+), 220 deletions(-)
--
2.39.1.456.gfc5497dd1b-goog
When a device X is bound successfully to a driver, if it has a child
firmware node Y that doesn't have a struct device created by then, we
delete fwnode links where the child firmware node Y is the supplier. We
did this to avoid blocking the consumers of the child firmware node Y
from deferring probe indefinitely.
While that's a step in the right direction, it's better to make the
consumers of the child firmware node Y consumers of the device X
because device X is probably implementing whatever functionality is
represented by child firmware node Y. By doing this, we capture the
device dependencies more accurately and ensure better
probe/suspend/resume ordering.
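
As a rough userspace illustration of the idea above (not kernel code), the sketch below models firmware nodes with fixed-size arrays instead of list_head chains. The names toy_fwnode, toy_link_add(), toy_move_consumers() and toy_pickup_dangling_consumers() are hypothetical stand-ins, and the has_device_on_bus field is an assumed simplification of the "fwnode->dev && fwnode->dev->bus" check.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LINKS 8
#define MAX_CHILDREN 4

/* Hypothetical stand-in for struct fwnode_handle; not the kernel type. */
struct toy_fwnode {
	const char *name;
	int has_device_on_bus;	/* models fwnode->dev && fwnode->dev->bus */
	int not_device;		/* models FWNODE_FLAG_NOT_DEVICE */
	struct toy_fwnode *consumers[MAX_LINKS];
	size_t n_consumers;
	struct toy_fwnode *children[MAX_CHILDREN];
	size_t n_children;
};

/* Record @con as a consumer of @sup. Duplicate links are ignored. */
int toy_link_add(struct toy_fwnode *con, struct toy_fwnode *sup)
{
	size_t i;

	for (i = 0; i < sup->n_consumers; i++)
		if (sup->consumers[i] == con)
			return 0;
	if (sup->n_consumers == MAX_LINKS)
		return -1;	/* toy limit reached; the link is dropped */
	sup->consumers[sup->n_consumers++] = con;
	return 0;
}

/* Move every consumer of @from over to @to, leaving @from with none. */
static void toy_move_consumers(struct toy_fwnode *from, struct toy_fwnode *to)
{
	size_t i;

	for (i = 0; i < from->n_consumers; i++)
		toy_link_add(from->consumers[i], to);
	from->n_consumers = 0;
}

/*
 * Hand the consumers of @node and of its descendants to @new_sup, unless
 * @node has its own probeable device, in which case the subtree is left alone.
 */
void toy_pickup_dangling_consumers(struct toy_fwnode *node,
				   struct toy_fwnode *new_sup)
{
	size_t i;

	assert(node && new_sup);
	if (node->has_device_on_bus)
		return;

	node->not_device = 1;
	toy_move_consumers(node, new_sup);
	for (i = 0; i < node->n_children; i++)
		toy_pickup_dangling_consumers(node->children[i], new_sup);
}
```

With a parent X, a child Y without a device, and a consumer C of Y, calling toy_pickup_dangling_consumers(&y, &x) leaves C as a consumer of X, which is the fallback-supplier behaviour described above.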
Signed-off-by: Saravana Kannan <[email protected]>
---
drivers/base/core.c | 97 ++++++++++++++++++++++++++++++++++++---------
1 file changed, 79 insertions(+), 18 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index a3e14143ec0c..b6d98cc82f26 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -54,11 +54,12 @@ static LIST_HEAD(deferred_sync);
static unsigned int defer_sync_state_count = 1;
static DEFINE_MUTEX(fwnode_link_lock);
static bool fw_devlink_is_permissive(void);
+static void __fw_devlink_link_to_consumers(struct device *dev);
static bool fw_devlink_drv_reg_done;
static bool fw_devlink_best_effort;
/**
- * fwnode_link_add - Create a link between two fwnode_handles.
+ * __fwnode_link_add - Create a link between two fwnode_handles.
* @con: Consumer end of the link.
* @sup: Supplier end of the link.
*
@@ -74,22 +75,18 @@ static bool fw_devlink_best_effort;
* Attempts to create duplicate links between the same pair of fwnode handles
* are ignored and there is no reference counting.
*/
-int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
+static int __fwnode_link_add(struct fwnode_handle *con,
+ struct fwnode_handle *sup)
{
struct fwnode_link *link;
- int ret = 0;
-
- mutex_lock(&fwnode_link_lock);
list_for_each_entry(link, &sup->consumers, s_hook)
if (link->consumer == con)
- goto out;
+ return 0;
link = kzalloc(sizeof(*link), GFP_KERNEL);
- if (!link) {
- ret = -ENOMEM;
- goto out;
- }
+ if (!link)
+ return -ENOMEM;
link->supplier = sup;
INIT_LIST_HEAD(&link->s_hook);
@@ -100,9 +97,17 @@ int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
list_add(&link->c_hook, &con->suppliers);
pr_debug("%pfwP Linked as a fwnode consumer to %pfwP\n",
con, sup);
-out:
- mutex_unlock(&fwnode_link_lock);
+ return 0;
+}
+
+int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
+{
+ int ret = 0;
+
+ mutex_lock(&fwnode_link_lock);
+ ret = __fwnode_link_add(con, sup);
+ mutex_unlock(&fwnode_link_lock);
return ret;
}
@@ -181,6 +186,51 @@ void fw_devlink_purge_absent_suppliers(struct fwnode_handle *fwnode)
}
EXPORT_SYMBOL_GPL(fw_devlink_purge_absent_suppliers);
+/**
+ * __fwnode_links_move_consumers - Move consumer from @from to @to fwnode_handle
+ * @from: move consumers away from this fwnode
+ * @to: move consumers to this fwnode
+ *
+ * Move all consumer links from @from fwnode to @to fwnode.
+ */
+static void __fwnode_links_move_consumers(struct fwnode_handle *from,
+ struct fwnode_handle *to)
+{
+ struct fwnode_link *link, *tmp;
+
+ list_for_each_entry_safe(link, tmp, &from->consumers, s_hook) {
+ __fwnode_link_add(link->consumer, to);
+ __fwnode_link_del(link);
+ }
+}
+
+/**
+ * __fw_devlink_pickup_dangling_consumers - Pick up dangling consumers
+ * @fwnode: fwnode from which to pick up dangling consumers
+ * @new_sup: fwnode of new supplier
+ *
+ * If the @fwnode has a corresponding struct device and the device supports
+ * probing (that is, added to a bus), then we want to let fw_devlink create
+ * MANAGED device links to this device, so leave @fwnode and its descendant's
+ * fwnode links alone.
+ *
+ * Otherwise, move its consumers to the new supplier @new_sup.
+ */
+static void __fw_devlink_pickup_dangling_consumers(struct fwnode_handle *fwnode,
+ struct fwnode_handle *new_sup)
+{
+ struct fwnode_handle *child;
+
+ if (fwnode->dev && fwnode->dev->bus)
+ return;
+
+ fwnode->flags |= FWNODE_FLAG_NOT_DEVICE;
+ __fwnode_links_move_consumers(fwnode, new_sup);
+
+ fwnode_for_each_available_child_node(fwnode, child)
+ __fw_devlink_pickup_dangling_consumers(child, new_sup);
+}
+
#ifdef CONFIG_SRCU
static DEFINE_MUTEX(device_links_lock);
DEFINE_STATIC_SRCU(device_links_srcu);
@@ -1267,16 +1317,23 @@ void device_links_driver_bound(struct device *dev)
* them. So, fw_devlink no longer needs to create device links to any
* of the device's suppliers.
*
- * Also, if a child firmware node of this bound device is not added as
- * a device by now, assume it is never going to be added and make sure
- * other devices don't defer probe indefinitely by waiting for such a
- * child device.
+ * Also, if a child firmware node of this bound device is not added as a
+ * device by now, assume it is never going to be added. Make this bound
+ * device the fallback supplier to the dangling consumers of the child
+ * firmware node because this bound device is probably implementing the
+ * child firmware node functionality and we don't want the dangling
+ * consumers to defer probe indefinitely waiting for a device for the
+ * child firmware node.
*/
if (dev->fwnode && dev->fwnode->dev == dev) {
struct fwnode_handle *child;
fwnode_links_purge_suppliers(dev->fwnode);
+ mutex_lock(&fwnode_link_lock);
fwnode_for_each_available_child_node(dev->fwnode, child)
- fw_devlink_purge_absent_suppliers(child);
+ __fw_devlink_pickup_dangling_consumers(child,
+ dev->fwnode);
+ __fw_devlink_link_to_consumers(dev);
+ mutex_unlock(&fwnode_link_lock);
}
device_remove_file(dev, &dev_attr_waiting_for_supplier);
@@ -1855,7 +1912,11 @@ static int fw_devlink_create_devlink(struct device *con,
fwnode_is_ancestor_of(sup_handle, con->fwnode))
return -EINVAL;
- sup_dev = get_dev_from_fwnode(sup_handle);
+ if (sup_handle->flags & FWNODE_FLAG_NOT_DEVICE)
+ sup_dev = fwnode_get_next_parent_dev(sup_handle);
+ else
+ sup_dev = get_dev_from_fwnode(sup_handle);
+
if (sup_dev) {
/*
* If it's one of those drivers that don't actually bind to
--
2.39.1.456.gfc5497dd1b-goog
fw_devlink shouldn't defer the probe of a device to wait on a supplier
that'll never have a struct device or will never be probed by a driver.
We currently check if a supplier falls into this category, but don't
check its ancestors. We need to check the ancestors too because if the
ancestor will never probe, then the supplier will never probe either.
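
A minimal userspace sketch of that ancestor walk, under simplifying assumptions: probe_node and its boolean fields are hypothetical stand-ins for the fwnode flags and the device link status, and plain parent pointers replace fwnode_for_each_parent_node().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a firmware node; not struct fwnode_handle. */
struct probe_node {
	struct probe_node *parent;
	bool initialized;	/* models FWNODE_FLAG_INITIALIZED */
	bool has_device;	/* a struct device exists for this node */
	bool has_driver;	/* models links.status != DL_DEV_NO_DRIVER */
};

/* True if the node was initialized without a device or without a driver. */
static bool toy_init_without_drv(const struct probe_node *n)
{
	if (!n->initialized)
		return false;
	return !n->has_device || !n->has_driver;
}

/*
 * True if the supplier itself or any of its ancestors will never be probed,
 * in which case deferring the consumer's probe would be pointless.
 */
bool toy_supplier_might_never_probe(const struct probe_node *sup)
{
	const struct probe_node *n;

	for (n = sup; n; n = n->parent)
		if (toy_init_without_drv(n))
			return true;
	return false;
}
```

In this model, a supplier two levels below an initialized-but-driverless root is reported as "might never probe", even though the supplier node itself looks fine.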
Signed-off-by: Saravana Kannan <[email protected]>
---
drivers/base/core.c | 40 ++++++++++++++++++++++++++++++++++++++--
1 file changed, 38 insertions(+), 2 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index b6d98cc82f26..919728e784e8 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1867,6 +1867,35 @@ static int fw_devlink_relax_cycle(struct device *con, void *sup)
return ret;
}
+static bool fwnode_init_without_drv(struct fwnode_handle *fwnode)
+{
+ struct device *dev;
+ bool ret;
+
+ if (!(fwnode->flags & FWNODE_FLAG_INITIALIZED))
+ return false;
+
+ dev = get_dev_from_fwnode(fwnode);
+ ret = !dev || dev->links.status == DL_DEV_NO_DRIVER;
+ put_device(dev);
+
+ return ret;
+}
+
+static bool fwnode_ancestor_init_without_drv(struct fwnode_handle *fwnode)
+{
+ struct fwnode_handle *parent;
+
+ fwnode_for_each_parent_node(fwnode, parent) {
+ if (fwnode_init_without_drv(parent)) {
+ fwnode_handle_put(parent);
+ return true;
+ }
+ }
+
+ return false;
+}
+
/**
* fw_devlink_create_devlink - Create a device link from a consumer to fwnode
* @con: consumer device for the device link
@@ -1948,9 +1977,16 @@ static int fw_devlink_create_devlink(struct device *con,
goto out;
}
- /* Supplier that's already initialized without a struct device. */
- if (sup_handle->flags & FWNODE_FLAG_INITIALIZED)
+ /*
+ * Supplier or supplier's ancestor already initialized without a struct
+ * device or being probed by a driver.
+ */
+ if (fwnode_init_without_drv(sup_handle) ||
+ fwnode_ancestor_init_without_drv(sup_handle)) {
+ dev_dbg(con, "Not linking %pfwP - Might never probe\n",
+ sup_handle);
return -EINVAL;
+ }
/*
* DL_FLAG_SYNC_STATE_ONLY doesn't block probing and supports
--
2.39.1.456.gfc5497dd1b-goog
Registering an irqdomain sets the flag for the fwnode. But having the
flag set when a device is added is interpreted by fw_devlink to mean the
device has already been initialized and will never probe. This prevents
fw_devlink from creating device links with the gpio_device as a
supplier. So, clear the flag before adding the device.
Signed-off-by: Saravana Kannan <[email protected]>
Acked-by: Bartosz Golaszewski <[email protected]>
---
drivers/gpio/gpiolib.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index 939c776b9488..b23140c6485f 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -578,6 +578,12 @@ static int gpiochip_setup_dev(struct gpio_device *gdev)
{
int ret;
+ /*
+ * If fwnode doesn't belong to another device, it's safe to clear its
+ * initialized flag.
+ */
+ if (!gdev->dev.fwnode->dev)
+ fwnode_dev_initialized(gdev->dev.fwnode, false);
ret = gcdev_register(gdev, gpio_devt);
if (ret)
return ret;
--
2.39.1.456.gfc5497dd1b-goog
The OF_POPULATED flag was set to let fw_devlink know that the device
tree node will not have a struct device created for it. This information
is used by fw_devlink to avoid deferring the probe of consumers of this
device tree node.
Let's use fwnode_dev_initialized() instead because it achieves the same
effect without using OF specific flags. This allows more generic code to
be written in driver core.
Signed-off-by: Saravana Kannan <[email protected]>
---
drivers/soc/renesas/rcar-sysc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/soc/renesas/rcar-sysc.c b/drivers/soc/renesas/rcar-sysc.c
index b0a80de34c98..03246ed4a79e 100644
--- a/drivers/soc/renesas/rcar-sysc.c
+++ b/drivers/soc/renesas/rcar-sysc.c
@@ -437,7 +437,7 @@ static int __init rcar_sysc_pd_init(void)
error = of_genpd_add_provider_onecell(np, &domains->onecell_data);
if (!error)
- of_node_set_flag(np, OF_POPULATED);
+ fwnode_dev_initialized(&np->fwnode, true);
out_put:
of_node_put(np);
--
2.39.1.456.gfc5497dd1b-goog
fw_devlink uses DL_FLAG_SYNC_STATE_ONLY device link flag for two
purposes:
1. To allow a parent device to proxy its child device's dependency on a
supplier so that the supplier doesn't get its sync_state() callback
before the child device/consumer can be added and probed. In this
usage scenario, we need to ignore cycles to ensure correctness of
sync_state() callbacks.
2. When there are dependency cycles in firmware, we don't know which of
those dependencies are valid. So, we have to ignore them all wrt
probe ordering while still making sure the sync_state() callbacks
come correctly.
However, when detecting dependency cycles, there can be multiple
dependency cycles between two devices that we need to detect. For
example:
A -> B -> A and A -> C -> B -> A.
To detect multiple cycles correctly, we need to be able to differentiate
DL_FLAG_SYNC_STATE_ONLY device links used for (1) vs (2) above.
To allow this differentiation, add a DL_FLAG_CYCLE that can be used to
mark use case (2). We can then use the DL_FLAG_CYCLE to decide which
DL_FLAG_SYNC_STATE_ONLY device links to follow when looking for
dependency cycles.
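
The flag test can be tried out in userspace with the standalone sketch below. The bit positions are copied from the include/linux/device.h hunk in this patch; is_sync_state_only() mirrors device_link_flag_is_sync_state_only(), while cycle_walk_should_skip() is a hypothetical helper name for the condition used in the cycle walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Bit positions as shown in the include/linux/device.h hunk. */
#define DL_FLAG_MANAGED		BIT(6)
#define DL_FLAG_SYNC_STATE_ONLY	BIT(7)
#define DL_FLAG_INFERRED	BIT(8)
#define DL_FLAG_CYCLE		BIT(9)

/* Ignoring INFERRED and CYCLE, is this a managed SYNC_STATE_ONLY link? */
bool is_sync_state_only(uint32_t flags)
{
	return (flags & ~(DL_FLAG_INFERRED | DL_FLAG_CYCLE)) ==
	       (DL_FLAG_SYNC_STATE_ONLY | DL_FLAG_MANAGED);
}

/*
 * Cycle walk rule: skip proxy links (use case 1), but keep following
 * SYNC_STATE_ONLY links that were already marked as part of a cycle
 * (use case 2).
 */
bool cycle_walk_should_skip(uint32_t flags)
{
	return !(flags & DL_FLAG_CYCLE) && is_sync_state_only(flags);
}
```

Both a proxy link and a relaxed cycle link pass is_sync_state_only(), so probe ordering ignores them equally; only the cycle walk tells them apart.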
Fixes: 2de9d8e0d2fe ("driver core: fw_devlink: Improve handling of cyclic dependencies")
Signed-off-by: Saravana Kannan <[email protected]>
---
drivers/base/core.c | 28 ++++++++++++++++++----------
include/linux/device.h | 1 +
2 files changed, 19 insertions(+), 10 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 919728e784e8..e5390b09a02f 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -322,6 +322,12 @@ static bool device_is_ancestor(struct device *dev, struct device *target)
return false;
}
+static inline bool device_link_flag_is_sync_state_only(u32 flags)
+{
+ return (flags & ~(DL_FLAG_INFERRED | DL_FLAG_CYCLE))
+ == (DL_FLAG_SYNC_STATE_ONLY | DL_FLAG_MANAGED);
+}
+
/**
* device_is_dependent - Check if one device depends on another one
* @dev: Device to check dependencies for.
@@ -348,8 +354,7 @@ int device_is_dependent(struct device *dev, void *target)
return ret;
list_for_each_entry(link, &dev->links.consumers, s_node) {
- if ((link->flags & ~DL_FLAG_INFERRED) ==
- (DL_FLAG_SYNC_STATE_ONLY | DL_FLAG_MANAGED))
+ if (device_link_flag_is_sync_state_only(link->flags))
continue;
if (link->consumer == target)
@@ -422,8 +427,7 @@ static int device_reorder_to_tail(struct device *dev, void *not_used)
device_for_each_child(dev, NULL, device_reorder_to_tail);
list_for_each_entry(link, &dev->links.consumers, s_node) {
- if ((link->flags & ~DL_FLAG_INFERRED) ==
- (DL_FLAG_SYNC_STATE_ONLY | DL_FLAG_MANAGED))
+ if (device_link_flag_is_sync_state_only(link->flags))
continue;
device_reorder_to_tail(link->consumer, NULL);
}
@@ -684,7 +688,8 @@ postcore_initcall(devlink_class_init);
DL_FLAG_AUTOREMOVE_SUPPLIER | \
DL_FLAG_AUTOPROBE_CONSUMER | \
DL_FLAG_SYNC_STATE_ONLY | \
- DL_FLAG_INFERRED)
+ DL_FLAG_INFERRED | \
+ DL_FLAG_CYCLE)
#define DL_ADD_VALID_FLAGS (DL_MANAGED_LINK_FLAGS | DL_FLAG_STATELESS | \
DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE)
@@ -753,8 +758,6 @@ struct device_link *device_link_add(struct device *consumer,
if (!consumer || !supplier || consumer == supplier ||
flags & ~DL_ADD_VALID_FLAGS ||
(flags & DL_FLAG_STATELESS && flags & DL_MANAGED_LINK_FLAGS) ||
- (flags & DL_FLAG_SYNC_STATE_ONLY &&
- (flags & ~DL_FLAG_INFERRED) != DL_FLAG_SYNC_STATE_ONLY) ||
(flags & DL_FLAG_AUTOPROBE_CONSUMER &&
flags & (DL_FLAG_AUTOREMOVE_CONSUMER |
DL_FLAG_AUTOREMOVE_SUPPLIER)))
@@ -770,6 +773,10 @@ struct device_link *device_link_add(struct device *consumer,
if (!(flags & DL_FLAG_STATELESS))
flags |= DL_FLAG_MANAGED;
+ if (flags & DL_FLAG_SYNC_STATE_ONLY &&
+ !device_link_flag_is_sync_state_only(flags))
+ return NULL;
+
device_links_write_lock();
device_pm_lock();
@@ -1729,7 +1736,7 @@ static void fw_devlink_relax_link(struct device_link *link)
if (!(link->flags & DL_FLAG_INFERRED))
return;
- if (link->flags == (DL_FLAG_MANAGED | FW_DEVLINK_FLAGS_PERMISSIVE))
+ if (device_link_flag_is_sync_state_only(link->flags))
return;
pm_runtime_drop_link(link);
@@ -1853,8 +1860,8 @@ static int fw_devlink_relax_cycle(struct device *con, void *sup)
return ret;
list_for_each_entry(link, &con->links.consumers, s_node) {
- if ((link->flags & ~DL_FLAG_INFERRED) ==
- (DL_FLAG_SYNC_STATE_ONLY | DL_FLAG_MANAGED))
+ if (!(link->flags & DL_FLAG_CYCLE) &&
+ device_link_flag_is_sync_state_only(link->flags))
continue;
if (!fw_devlink_relax_cycle(link->consumer, sup))
@@ -1863,6 +1870,7 @@ static int fw_devlink_relax_cycle(struct device *con, void *sup)
ret = 1;
fw_devlink_relax_link(link);
+ link->flags |= DL_FLAG_CYCLE;
}
return ret;
}
diff --git a/include/linux/device.h b/include/linux/device.h
index 44e3acae7b36..f4d20655d2d7 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -328,6 +328,7 @@ enum device_link_state {
#define DL_FLAG_MANAGED BIT(6)
#define DL_FLAG_SYNC_STATE_ONLY BIT(7)
#define DL_FLAG_INFERRED BIT(8)
+#define DL_FLAG_CYCLE BIT(9)
/**
* enum dl_dev_state - Device driver presence tracking information.
--
2.39.1.456.gfc5497dd1b-goog
To improve detection and handling of dependency cycles, we need to be
able to mark fwnode links as being part of cycles. fwnode links marked
as being part of a cycle should not block their consumers from probing.
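
A small userspace sketch of the resulting rule: a consumer is only blocked by a supplier link that is not marked as part of a cycle. The toy_sup_link array and toy_first_blocking_supplier() are hypothetical simplifications of the fwnode link list and the supplier check; FWLINK_FLAG_CYCLE uses the BIT(0) value from the fwnode.h hunk.

```c
#include <assert.h>
#include <stddef.h>

#define FWLINK_FLAG_CYCLE 0x1u	/* BIT(0), as in the fwnode.h hunk */

/* Hypothetical flattened supplier link; the kernel chains list_heads. */
struct toy_sup_link {
	const char *supplier;
	unsigned char flags;
};

/*
 * Return the first supplier that should still block probing, or NULL if
 * every remaining link is part of a cycle (or permissive mode is on).
 */
const char *toy_first_blocking_supplier(const struct toy_sup_link *links,
					size_t n, int permissive)
{
	size_t i;

	assert(links || n == 0);
	if (permissive)
		return NULL;

	for (i = 0; i < n; i++) {
		if (links[i].flags & FWLINK_FLAG_CYCLE)
			continue;	/* cycle links never defer probe */
		return links[i].supplier;
	}
	return NULL;
}
```

A NULL result corresponds to "nothing left to wait for", which is also what the waiting_for_supplier attribute would report as 0 in this model.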
Fixes: 2de9d8e0d2fe ("driver core: fw_devlink: Improve handling of cyclic dependencies")
Signed-off-by: Saravana Kannan <[email protected]>
---
drivers/base/core.c | 41 +++++++++++++++++++++++++++++++++++------
include/linux/fwnode.h | 11 ++++++++++-
2 files changed, 45 insertions(+), 7 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index e5390b09a02f..82b29e9070bf 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -126,6 +126,19 @@ static void __fwnode_link_del(struct fwnode_link *link)
kfree(link);
}
+/**
+ * __fwnode_link_cycle - Mark a fwnode link as being part of a cycle.
+ * @link: the fwnode_link to be marked
+ *
+ * The fwnode_link_lock needs to be held when this function is called.
+ */
+static void __fwnode_link_cycle(struct fwnode_link *link)
+{
+ pr_debug("%pfwf: Relaxing link with %pfwf\n",
+ link->consumer, link->supplier);
+ link->flags |= FWLINK_FLAG_CYCLE;
+}
+
/**
* fwnode_links_purge_suppliers - Delete all supplier links of fwnode_handle.
* @fwnode: fwnode whose supplier links need to be deleted
@@ -1041,6 +1054,23 @@ static bool dev_is_best_effort(struct device *dev)
(dev->fwnode && (dev->fwnode->flags & FWNODE_FLAG_BEST_EFFORT));
}
+static struct fwnode_handle *fwnode_links_check_suppliers(
+ struct fwnode_handle *fwnode)
+{
+ struct fwnode_link *link;
+
+ if (!fwnode || fw_devlink_is_permissive())
+ return NULL;
+
+ list_for_each_entry(link, &fwnode->suppliers, c_hook) {
+ if (link->flags & FWLINK_FLAG_CYCLE)
+ continue;
+ return link->supplier;
+ }
+
+ return NULL;
+}
+
/**
* device_links_check_suppliers - Check presence of supplier drivers.
* @dev: Consumer device.
@@ -1068,11 +1098,8 @@ int device_links_check_suppliers(struct device *dev)
* probe.
*/
mutex_lock(&fwnode_link_lock);
- if (dev->fwnode && !list_empty(&dev->fwnode->suppliers) &&
- !fw_devlink_is_permissive()) {
- sup_fw = list_first_entry(&dev->fwnode->suppliers,
- struct fwnode_link,
- c_hook)->supplier;
+ sup_fw = fwnode_links_check_suppliers(dev->fwnode);
+ if (sup_fw) {
if (!dev_is_best_effort(dev)) {
fwnode_ret = -EPROBE_DEFER;
dev_err_probe(dev, -EPROBE_DEFER,
@@ -1261,7 +1288,9 @@ static ssize_t waiting_for_supplier_show(struct device *dev,
bool val;
device_lock(dev);
- val = !list_empty(&dev->fwnode->suppliers);
+ mutex_lock(&fwnode_link_lock);
+ val = !!fwnode_links_check_suppliers(dev->fwnode);
+ mutex_unlock(&fwnode_link_lock);
device_unlock(dev);
return sysfs_emit(buf, "%u\n", val);
}
diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h
index 89b9bdfca925..fdf2ee0285b7 100644
--- a/include/linux/fwnode.h
+++ b/include/linux/fwnode.h
@@ -18,7 +18,7 @@ struct fwnode_operations;
struct device;
/*
- * fwnode link flags
+ * fwnode flags
*
* LINKS_ADDED: The fwnode has already be parsed to add fwnode links.
* NOT_DEVICE: The fwnode will never be populated as a struct device.
@@ -36,6 +36,7 @@ struct device;
#define FWNODE_FLAG_INITIALIZED BIT(2)
#define FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD BIT(3)
#define FWNODE_FLAG_BEST_EFFORT BIT(4)
+#define FWNODE_FLAG_VISITED BIT(5)
struct fwnode_handle {
struct fwnode_handle *secondary;
@@ -46,11 +47,19 @@ struct fwnode_handle {
u8 flags;
};
+/*
+ * fwnode link flags
+ *
+ * CYCLE: The fwnode link is part of a cycle. Don't defer probe.
+ */
+#define FWLINK_FLAG_CYCLE BIT(0)
+
struct fwnode_link {
struct fwnode_handle *supplier;
struct list_head s_hook;
struct fwnode_handle *consumer;
struct list_head c_hook;
+ u8 flags;
};
/**
--
2.39.1.456.gfc5497dd1b-goog
Consolidate the code that computes the flags to be used when creating a
device link from a fwnode link.
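
The consolidated decision can be sketched in userspace as a single function. This is only an illustration: toy_link_flags() and its parameters are hypothetical, default_flags stands in for the global fw_devlink_flags, and the value of TOY_FLAGS_PERMISSIVE is an assumption because the definition of FW_DEVLINK_FLAGS_PERMISSIVE is not shown in this series.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

#define DL_FLAG_SYNC_STATE_ONLY	BIT(7)
#define DL_FLAG_INFERRED	BIT(8)
#define DL_FLAG_CYCLE		BIT(9)
#define FWLINK_FLAG_CYCLE	0x1u

/* Assumed value; the macro's real definition is not part of this diff. */
#define TOY_FLAGS_PERMISSIVE	(DL_FLAG_INFERRED | DL_FLAG_SYNC_STATE_ONLY)

/*
 * Pick device link flags for a fwnode link:
 *  - the consumer device does not own the link (proxy case): permissive
 *  - the link is marked as part of a cycle: permissive plus DL_FLAG_CYCLE
 *  - otherwise: whatever the current fw_devlink mode asks for
 */
uint32_t toy_link_flags(int con_owns_link, uint8_t fwlink_flags,
			uint32_t default_flags)
{
	if (!con_owns_link)
		return TOY_FLAGS_PERMISSIVE;
	if (fwlink_flags & FWLINK_FLAG_CYCLE)
		return TOY_FLAGS_PERMISSIVE | DL_FLAG_CYCLE;
	return default_flags;
}
```

Keeping the three cases in one place means both call sites just pass the fwnode link along instead of computing the flags themselves.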
Fixes: 2de9d8e0d2fe ("driver core: fw_devlink: Improve handling of cyclic dependencies")
Signed-off-by: Saravana Kannan <[email protected]>
---
drivers/base/core.c | 28 +++++++++++++++-------------
include/linux/fwnode.h | 1 -
2 files changed, 15 insertions(+), 14 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 82b29e9070bf..b61d5d86a600 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1726,8 +1726,11 @@ static int __init fw_devlink_strict_setup(char *arg)
}
early_param("fw_devlink.strict", fw_devlink_strict_setup);
-u32 fw_devlink_get_flags(void)
+static inline u32 fw_devlink_get_flags(u8 fwlink_flags)
{
+ if (fwlink_flags & FWLINK_FLAG_CYCLE)
+ return FW_DEVLINK_FLAGS_PERMISSIVE | DL_FLAG_CYCLE;
+
return fw_devlink_flags;
}
@@ -1937,7 +1940,7 @@ static bool fwnode_ancestor_init_without_drv(struct fwnode_handle *fwnode)
* fw_devlink_create_devlink - Create a device link from a consumer to fwnode
* @con: consumer device for the device link
* @sup_handle: fwnode handle of supplier
- * @flags: devlink flags
+ * @link: fwnode link that's being converted to a device link
*
* This function will try to create a device link between the consumer device
* @con and the supplier device represented by @sup_handle.
@@ -1954,10 +1957,17 @@ static bool fwnode_ancestor_init_without_drv(struct fwnode_handle *fwnode)
* possible to do that in the future
*/
static int fw_devlink_create_devlink(struct device *con,
- struct fwnode_handle *sup_handle, u32 flags)
+ struct fwnode_handle *sup_handle,
+ struct fwnode_link *link)
{
struct device *sup_dev;
int ret = 0;
+ u32 flags;
+
+ if (con->fwnode == link->consumer)
+ flags = fw_devlink_get_flags(link->flags);
+ else
+ flags = FW_DEVLINK_FLAGS_PERMISSIVE;
/*
* In some cases, a device P might also be a supplier to its child node
@@ -2090,7 +2100,6 @@ static void __fw_devlink_link_to_consumers(struct device *dev)
struct fwnode_link *link, *tmp;
list_for_each_entry_safe(link, tmp, &fwnode->consumers, s_hook) {
- u32 dl_flags = fw_devlink_get_flags();
struct device *con_dev;
bool own_link = true;
int ret;
@@ -2120,14 +2129,13 @@ static void __fw_devlink_link_to_consumers(struct device *dev)
con_dev = NULL;
} else {
own_link = false;
- dl_flags = FW_DEVLINK_FLAGS_PERMISSIVE;
}
}
if (!con_dev)
continue;
- ret = fw_devlink_create_devlink(con_dev, fwnode, dl_flags);
+ ret = fw_devlink_create_devlink(con_dev, fwnode, link);
put_device(con_dev);
if (!own_link || ret == -EAGAIN)
continue;
@@ -2168,19 +2176,13 @@ static void __fw_devlink_link_to_suppliers(struct device *dev,
bool own_link = (dev->fwnode == fwnode);
struct fwnode_link *link, *tmp;
struct fwnode_handle *child = NULL;
- u32 dl_flags;
-
- if (own_link)
- dl_flags = fw_devlink_get_flags();
- else
- dl_flags = FW_DEVLINK_FLAGS_PERMISSIVE;
list_for_each_entry_safe(link, tmp, &fwnode->suppliers, c_hook) {
int ret;
struct device *sup_dev;
struct fwnode_handle *sup = link->supplier;
- ret = fw_devlink_create_devlink(dev, sup, dl_flags);
+ ret = fw_devlink_create_devlink(dev, sup, link);
if (!own_link || ret == -EAGAIN)
continue;
diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h
index fdf2ee0285b7..5700451b300f 100644
--- a/include/linux/fwnode.h
+++ b/include/linux/fwnode.h
@@ -207,7 +207,6 @@ static inline void fwnode_dev_initialized(struct fwnode_handle *fwnode,
fwnode->flags &= ~FWNODE_FLAG_INITIALIZED;
}
-extern u32 fw_devlink_get_flags(void);
extern bool fw_devlink_is_strict(void);
int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup);
void fwnode_links_purge(struct fwnode_handle *fwnode);
--
2.39.1.456.gfc5497dd1b-goog
fw_devlink could only detect a single and simple cycle because it relied
mainly on device link cycle detection code that only checked for cycles
between devices. The expectation was that the firmware wouldn't have
complicated cycles and multiple cycles between devices. That expectation
has been proven to be wrong.
For example, fw_devlink could handle:
+-+ +-+
|A+------> |B+
+-+ +++
^ |
| |
+----------+
But it couldn't handle even something as "simple" as:
+---------------------+
| |
v |
+-+ +-+ +++
|A+------> |B+------> |C|
+-+ +++ +-+
^ |
| |
+----------+
But firmware has even more complicated cycles like:
+---------------------+
| |
v |
+-+ +---+ +++
+--+A+------>| B +-----> |C|<--+
| +-+ ++--+ +++ |
| ^ | ^ | |
| | | | | |
| +---------+ +---------+ |
| |
+------------------------------+
And this is without including parent child dependencies or nodes in the
cycle that are just firmware nodes that'll never have a struct device
created for them.
The proper way to treat these devices is to not force any probe ordering
between them, while still enforcing dependencies between the nodes in
the cycles (A, B and C) and their consumers.
So this patch goes all out and just deals with all types of cycles. It
does this by:
1. Following dependencies across device links, parent-child and fwnode
links.
2. When it finds cycles, it marks the device links and fwnode links as
such instead of just deleting them or making them indistinguishable
from proxy SYNC_STATE_ONLY device links.
This way, when new nodes get added, we can immediately find and mark any
new cycles whether the new node is a device or firmware node.
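
To make the marking step concrete, here is a userspace sketch of a depth-first walk that marks every edge lying on a path back to the consumer. It is a deliberately reduced model: it follows only "depends on" edges (no parent-child relations, no device links), the toy_* names are hypothetical, and clearing the visited flag on the way out, so that it only guards the current path, is a choice made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SUPPLIERS 4

struct toy_node;

/* Hypothetical edge meaning "the owning node depends on supplier". */
struct toy_edge {
	struct toy_node *supplier;
	bool cycle;		/* models FWLINK_FLAG_CYCLE */
};

struct toy_node {
	const char *name;
	bool visited;		/* models FWNODE_FLAG_VISITED */
	struct toy_edge sup[MAX_SUPPLIERS];
	size_t n_sup;
};

/* Record that @con depends on @sup. */
void toy_depends_on(struct toy_node *con, struct toy_node *sup)
{
	assert(con->n_sup < MAX_SUPPLIERS);
	con->sup[con->n_sup].supplier = sup;
	con->sup[con->n_sup].cycle = false;
	con->n_sup++;
}

/*
 * Return true if @sup directly or indirectly depends on @con, marking every
 * edge that leads back to @con. More than one cycle can be marked per call.
 */
bool toy_relax_cycles(struct toy_node *con, struct toy_node *sup)
{
	bool ret = false;
	size_t i;

	if (!sup || sup->visited)
		return false;
	sup->visited = true;

	if (sup == con) {
		ret = true;	/* termination: walked back to the consumer */
	} else {
		for (i = 0; i < sup->n_sup; i++) {
			if (toy_relax_cycles(con, sup->sup[i].supplier)) {
				sup->sup[i].cycle = true;
				ret = true;
			}
		}
	}

	sup->visited = false;	/* the flag only guards the current path */
	return ret;
}
```

For example, if B depends on A and on C, and C depends on A, then checking a new "A depends on B" link with toy_relax_cycles(&a, &b) marks all three existing edges, while an edge from C to an unrelated node D stays unmarked.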
Fixes: 2de9d8e0d2fe ("driver core: fw_devlink: Improve handling of cyclic dependencies")
Signed-off-by: Saravana Kannan <[email protected]>
---
drivers/base/core.c | 245 +++++++++++++++++++++++---------------------
1 file changed, 130 insertions(+), 115 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index b61d5d86a600..fbb843220458 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1865,47 +1865,6 @@ static void fw_devlink_unblock_consumers(struct device *dev)
device_links_write_unlock();
}
-/**
- * fw_devlink_relax_cycle - Convert cyclic links to SYNC_STATE_ONLY links
- * @con: Device to check dependencies for.
- * @sup: Device to check against.
- *
- * Check if @sup depends on @con or any device dependent on it (its child or
- * its consumer etc). When such a cyclic dependency is found, convert all
- * device links created solely by fw_devlink into SYNC_STATE_ONLY device links.
- * This is the equivalent of doing fw_devlink=permissive just between the
- * devices in the cycle. We need to do this because, at this point, fw_devlink
- * can't tell which of these dependencies is not a real dependency.
- *
- * Return 1 if a cycle is found. Otherwise, return 0.
- */
-static int fw_devlink_relax_cycle(struct device *con, void *sup)
-{
- struct device_link *link;
- int ret;
-
- if (con == sup)
- return 1;
-
- ret = device_for_each_child(con, sup, fw_devlink_relax_cycle);
- if (ret)
- return ret;
-
- list_for_each_entry(link, &con->links.consumers, s_node) {
- if (!(link->flags & DL_FLAG_CYCLE) &&
- device_link_flag_is_sync_state_only(link->flags))
- continue;
-
- if (!fw_devlink_relax_cycle(link->consumer, sup))
- continue;
-
- ret = 1;
-
- fw_devlink_relax_link(link);
- link->flags |= DL_FLAG_CYCLE;
- }
- return ret;
-}
static bool fwnode_init_without_drv(struct fwnode_handle *fwnode)
{
@@ -1936,6 +1895,113 @@ static bool fwnode_ancestor_init_without_drv(struct fwnode_handle *fwnode)
return false;
}
+/**
+ * __fw_devlink_relax_cycles - Relax and mark dependency cycles.
+ * @con: Potential consumer device.
+ * @sup_handle: Potential supplier's fwnode.
+ *
+ * Needs to be called with fwnode_lock and device link lock held.
+ *
+ * Check if @sup_handle or any of its ancestors or suppliers direct/indirectly
+ * depend on @con. This function can detect multiple cycles between @sup_handle
+ * and @con. When such dependency cycles are found, convert all device links
+ * created solely by fw_devlink into SYNC_STATE_ONLY device links. Also, mark
+ * all fwnode links in the cycle with FWLINK_FLAG_CYCLE so that when they are
+ * converted into a device link in the future, they are created as
+ * SYNC_STATE_ONLY device links. This is the equivalent of doing
+ * fw_devlink=permissive just between the devices in the cycle. We need to do
+ * this because, at this point, fw_devlink can't tell which of these
+ * dependencies is not a real dependency.
+ *
+ * Return true if one or more cycles were found. Otherwise, return false.
+ */
+static bool __fw_devlink_relax_cycles(struct device *con,
+ struct fwnode_handle *sup_handle)
+{
+ struct fwnode_link *link;
+ struct device_link *dev_link;
+ struct device *sup_dev = NULL, *par_dev = NULL;
+ bool ret = false;
+
+ if (!sup_handle)
+ return false;
+
+ /*
+ * We aren't trying to find all cycles. Just a cycle between con and
+ * sup_handle.
+ */
+ if (sup_handle->flags & FWNODE_FLAG_VISITED)
+ return false;
+
+ sup_handle->flags |= FWNODE_FLAG_VISITED;
+
+ sup_dev = get_dev_from_fwnode(sup_handle);
+
+ /* Termination condition. */
+ if (sup_dev == con) {
+ ret = true;
+ goto out;
+ }
+
+ /*
+ * If sup_dev is bound to a driver and @con hasn't started binding to
+ * a driver, @sup_dev can't be a consumer of @con. So, no need to
+ * check further.
+ */
+ if (sup_dev && sup_dev->links.status == DL_DEV_DRIVER_BOUND &&
+ con->links.status == DL_DEV_NO_DRIVER) {
+ ret = false;
+ goto out;
+ }
+
+ list_for_each_entry(link, &sup_handle->suppliers, c_hook) {
+ if (__fw_devlink_relax_cycles(con, link->supplier)) {
+ __fwnode_link_cycle(link);
+ ret = true;
+ }
+ }
+
+ /*
+ * Give priority to device parent over fwnode parent to account for any
+ * quirks in how fwnodes are converted to devices.
+ */
+ if (sup_dev) {
+ par_dev = sup_dev->parent;
+ get_device(par_dev);
+ } else {
+ par_dev = fwnode_get_next_parent_dev(sup_handle);
+ }
+
+ if (par_dev)
+ ret |= __fw_devlink_relax_cycles(con, par_dev->fwnode);
+
+ if (!sup_dev)
+ goto out;
+
+ list_for_each_entry(dev_link, &sup_dev->links.suppliers, c_node) {
+ /*
+ * Ignore a SYNC_STATE_ONLY flag only if it wasn't marked as
+ * such due to a cycle.
+ */
+ if (device_link_flag_is_sync_state_only(dev_link->flags) &&
+ !(dev_link->flags & DL_FLAG_CYCLE))
+ continue;
+
+ if (__fw_devlink_relax_cycles(con,
+ dev_link->supplier->fwnode)) {
+ fw_devlink_relax_link(dev_link);
+ dev_link->flags |= DL_FLAG_CYCLE;
+ ret = true;
+ }
+ }
+
+out:
+ sup_handle->flags &= ~FWNODE_FLAG_VISITED;
+ put_device(sup_dev);
+ put_device(par_dev);
+ return ret;
+}
+
/**
* fw_devlink_create_devlink - Create a device link from a consumer to fwnode
* @con: consumer device for the device link
@@ -1988,6 +2054,21 @@ static int fw_devlink_create_devlink(struct device *con,
fwnode_is_ancestor_of(sup_handle, con->fwnode))
return -EINVAL;
+ /*
+ * SYNC_STATE_ONLY device links don't block probing and support cycles.
+ * So cycle detection isn't necessary and shouldn't be done.
+ */
+ if (!(flags & DL_FLAG_SYNC_STATE_ONLY)) {
+ device_links_write_lock();
+ if (__fw_devlink_relax_cycles(con, sup_handle)) {
+ __fwnode_link_cycle(link);
+ flags = fw_devlink_get_flags(link->flags);
+ dev_info(con, "Fixed dependency cycle(s) with %pfwf\n",
+ sup_handle);
+ }
+ device_links_write_unlock();
+ }
+
if (sup_handle->flags & FWNODE_FLAG_NOT_DEVICE)
sup_dev = fwnode_get_next_parent_dev(sup_handle);
else
@@ -2001,23 +2082,16 @@ static int fw_devlink_create_devlink(struct device *con,
*/
if (sup_dev->links.status == DL_DEV_NO_DRIVER &&
sup_handle->flags & FWNODE_FLAG_INITIALIZED) {
+ dev_dbg(con,
+ "Not linking %pfwf - dev might never probe\n",
+ sup_handle);
ret = -EINVAL;
goto out;
}
- /*
- * If this fails, it is due to cycles in device links. Just
- * give up on this link and treat it as invalid.
- */
- if (!device_link_add(con, sup_dev, flags) &&
- !(flags & DL_FLAG_SYNC_STATE_ONLY)) {
- dev_info(con, "Fixing up cyclic dependency with %s\n",
- dev_name(sup_dev));
- device_links_write_lock();
- fw_devlink_relax_cycle(con, sup_dev);
- device_links_write_unlock();
- device_link_add(con, sup_dev,
- FW_DEVLINK_FLAGS_PERMISSIVE);
+ if (!device_link_add(con, sup_dev, flags)) {
+ dev_err(con, "Failed to create device link with %s\n",
+ dev_name(sup_dev));
ret = -EINVAL;
}
@@ -2030,49 +2104,12 @@ static int fw_devlink_create_devlink(struct device *con,
*/
if (fwnode_init_without_drv(sup_handle) ||
fwnode_ancestor_init_without_drv(sup_handle)) {
- dev_dbg(con, "Not linking %pfwP - Might never probe\n",
+ dev_dbg(con, "Not linking %pfwf - might never become dev\n",
sup_handle);
return -EINVAL;
}
- /*
- * DL_FLAG_SYNC_STATE_ONLY doesn't block probing and supports
- * cycles. So cycle detection isn't necessary and shouldn't be
- * done.
- */
- if (flags & DL_FLAG_SYNC_STATE_ONLY)
- return -EAGAIN;
-
- /*
- * If we can't find the supplier device from its fwnode, it might be
- * due to a cyclic dependency between fwnodes. Some of these cycles can
- * be broken by applying logic. Check for these types of cycles and
- * break them so that devices in the cycle probe properly.
- *
- * If the supplier's parent is dependent on the consumer, then the
- * consumer and supplier have a cyclic dependency. Since fw_devlink
- * can't tell which of the inferred dependencies are incorrect, don't
- * enforce probe ordering between any of the devices in this cyclic
- * dependency. Do this by relaxing all the fw_devlink device links in
- * this cycle and by treating the fwnode link between the consumer and
- * the supplier as an invalid dependency.
- */
- sup_dev = fwnode_get_next_parent_dev(sup_handle);
- if (sup_dev && device_is_dependent(con, sup_dev)) {
- dev_info(con, "Fixing up cyclic dependency with %pfwP (%s)\n",
- sup_handle, dev_name(sup_dev));
- device_links_write_lock();
- fw_devlink_relax_cycle(con, sup_dev);
- device_links_write_unlock();
- ret = -EINVAL;
- } else {
- /*
- * Can't check for cycles or no cycles. So let's try
- * again later.
- */
- ret = -EAGAIN;
- }
-
+ ret = -EAGAIN;
out:
put_device(sup_dev);
return ret;
@@ -2179,7 +2216,6 @@ static void __fw_devlink_link_to_suppliers(struct device *dev,
list_for_each_entry_safe(link, tmp, &fwnode->suppliers, c_hook) {
int ret;
- struct device *sup_dev;
struct fwnode_handle *sup = link->supplier;
ret = fw_devlink_create_devlink(dev, sup, link);
@@ -2187,27 +2223,6 @@ static void __fw_devlink_link_to_suppliers(struct device *dev,
continue;
__fwnode_link_del(link);
-
- /* If no device link was created, nothing more to do. */
- if (ret)
- continue;
-
- /*
- * If a device link was successfully created to a supplier, we
- * now need to try and link the supplier to all its suppliers.
- *
- * This is needed to detect and delete false dependencies in
- * fwnode links that haven't been converted to a device link
- * yet. See comments in fw_devlink_create_devlink() for more
- * details on the false dependency.
- *
- * Without deleting these false dependencies, some devices will
- * never probe because they'll keep waiting for their false
- * dependency fwnode links to be converted to device links.
- */
- sup_dev = get_dev_from_fwnode(sup);
- __fw_devlink_link_to_suppliers(sup_dev, sup_dev->fwnode);
- put_device(sup_dev);
}
/*
--
2.39.1.456.gfc5497dd1b-goog
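The traversal in __fw_devlink_relax_cycles() can be easier to follow in a reduced form. The following is a minimal userspace sketch, not the kernel implementation: it only follows supplier edges (no parent devices, no device links, no locking), and the names struct node, struct edge, add_supplier() and relax_cycles() are hypothetical. It assumes the same idea as the patch: a per-node visited flag guards the current recursion path, and every edge that leads back to the consumer is marked instead of stopping at the first hit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SUPPLIERS 4

struct node;

/* One consumer -> supplier edge; in_cycle stands in for FWLINK_FLAG_CYCLE. */
struct edge {
	struct node *supplier;
	bool in_cycle;
};

/* Hypothetical stand-in for a fwnode; visited stands in for FWNODE_FLAG_VISITED. */
struct node {
	const char *name;
	bool visited;
	size_t nr_suppliers;
	struct edge suppliers[MAX_SUPPLIERS];
};

/* Record that @con depends on @sup and return the new edge. */
static struct edge *add_supplier(struct node *con, struct node *sup)
{
	struct edge *e;

	assert(con->nr_suppliers < MAX_SUPPLIERS);
	e = &con->suppliers[con->nr_suppliers++];
	e->supplier = sup;
	e->in_cycle = false;
	return e;
}

/*
 * Return true if @sup depends on @con, directly or through its suppliers.
 * Every edge found on such a path is marked; the loop keeps going after a
 * hit so that multiple cycles through @sup are all marked.
 */
static bool relax_cycles(struct node *con, struct node *sup)
{
	bool ret = false;
	size_t i;

	if (!sup || sup->visited)
		return false;

	/* Termination condition: we walked back to the consumer. */
	if (sup == con)
		return true;

	sup->visited = true;
	for (i = 0; i < sup->nr_suppliers; i++) {
		struct edge *e = &sup->suppliers[i];

		if (relax_cycles(con, e->supplier)) {
			e->in_cycle = true;
			ret = true;
		}
	}
	sup->visited = false;

	return ret;
}
```

As in the patch, the edge from the consumer to the supplier being checked is not marked by the walk itself; the caller is expected to mark it when the walk returns true.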
The driver core now:
- Has the parent device of a supplier pick up the consumers if the
supplier never has a device created for it.
- Ignores a supplier if the supplier has no parent device and will never
be probed by a driver
And already prevents creating a device link with the consumer as a
supplier of a parent.
So, we no longer need to find the "compatible" node of the supplier or
do any other checks in of_link_to_phandle(). We simply need to make sure
that the supplier is available in DT.
Signed-off-by: Saravana Kannan <[email protected]>
---
drivers/of/property.c | 84 +++++++------------------------------------
1 file changed, 13 insertions(+), 71 deletions(-)
diff --git a/drivers/of/property.c b/drivers/of/property.c
index 134cfc980b70..c651aad6f34b 100644
--- a/drivers/of/property.c
+++ b/drivers/of/property.c
@@ -1062,20 +1062,6 @@ of_fwnode_device_get_match_data(const struct fwnode_handle *fwnode,
return of_device_get_match_data(dev);
}
-static bool of_is_ancestor_of(struct device_node *test_ancestor,
- struct device_node *child)
-{
- of_node_get(child);
- while (child) {
- if (child == test_ancestor) {
- of_node_put(child);
- return true;
- }
- child = of_get_next_parent(child);
- }
- return false;
-}
-
static struct device_node *of_get_compat_node(struct device_node *np)
{
of_node_get(np);
@@ -1106,71 +1092,27 @@ static struct device_node *of_get_compat_node_parent(struct device_node *np)
return node;
}
-/**
- * of_link_to_phandle - Add fwnode link to supplier from supplier phandle
- * @con_np: consumer device tree node
- * @sup_np: supplier device tree node
- *
- * Given a phandle to a supplier device tree node (@sup_np), this function
- * finds the device that owns the supplier device tree node and creates a
- * device link from @dev consumer device to the supplier device. This function
- * doesn't create device links for invalid scenarios such as trying to create a
- * link with a parent device as the consumer of its child device. In such
- * cases, it returns an error.
- *
- * Returns:
- * - 0 if fwnode link successfully created to supplier
- * - -EINVAL if the supplier link is invalid and should not be created
- * - -ENODEV if struct device will never be create for supplier
- */
-static int of_link_to_phandle(struct device_node *con_np,
+static void of_link_to_phandle(struct device_node *con_np,
struct device_node *sup_np)
{
- struct device *sup_dev;
- struct device_node *tmp_np = sup_np;
+ struct device_node *tmp_np = of_node_get(sup_np);
- /*
- * Find the device node that contains the supplier phandle. It may be
- * @sup_np or it may be an ancestor of @sup_np.
- */
- sup_np = of_get_compat_node(sup_np);
- if (!sup_np) {
- pr_debug("Not linking %pOFP to %pOFP - No device\n",
- con_np, tmp_np);
- return -ENODEV;
- }
+ /* Check that sup_np and its ancestors are available. */
+ while (tmp_np) {
+ if (of_fwnode_handle(tmp_np)->dev) {
+ of_node_put(tmp_np);
+ break;
+ }
- /*
- * Don't allow linking a device node as a consumer of one of its
- * descendant nodes. By definition, a child node can't be a functional
- * dependency for the parent node.
- */
- if (of_is_ancestor_of(con_np, sup_np)) {
- pr_debug("Not linking %pOFP to %pOFP - is descendant\n",
- con_np, sup_np);
- of_node_put(sup_np);
- return -EINVAL;
- }
+ if (!of_device_is_available(tmp_np)) {
+ of_node_put(tmp_np);
+ return;
+ }
- /*
- * Don't create links to "early devices" that won't have struct devices
- * created for them.
- */
- sup_dev = get_dev_from_fwnode(&sup_np->fwnode);
- if (!sup_dev &&
- (of_node_check_flag(sup_np, OF_POPULATED) ||
- sup_np->fwnode.flags & FWNODE_FLAG_NOT_DEVICE)) {
- pr_debug("Not linking %pOFP to %pOFP - No struct device\n",
- con_np, sup_np);
- of_node_put(sup_np);
- return -ENODEV;
+ tmp_np = of_get_next_parent(tmp_np);
}
- put_device(sup_dev);
fwnode_link_add(of_fwnode_handle(con_np), of_fwnode_handle(sup_np));
- of_node_put(sup_np);
-
- return 0;
}
/**
--
2.39.1.456.gfc5497dd1b-goog
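The new availability walk in of_link_to_phandle() can be modelled outside the kernel. This is a rough sketch under stated assumptions: struct toy_node and should_link_to() are hypothetical, the available field stands in for of_device_is_available(), and has_dev stands in for a non-NULL of_fwnode_handle(np)->dev. Reference counting is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct device_node; all fields are illustrative. */
struct toy_node {
	struct toy_node *parent;
	bool available;	/* models of_device_is_available() */
	bool has_dev;	/* models of_fwnode_handle(np)->dev != NULL */
};

/*
 * Walk from @sup towards the root. A node that already has a device ends
 * the walk and allows the link. An unavailable node reached before that
 * means no link should be added. Reaching the root also allows the link.
 */
static bool should_link_to(const struct toy_node *sup)
{
	const struct toy_node *np;

	assert(sup);
	for (np = sup; np; np = np->parent) {
		if (np->has_dev)
			return true;
		if (!np->available)
			return false;
	}
	return true;
}
```

The order of the two checks matters in this model: a node that already has a device short-circuits the walk before its own availability is looked at, which mirrors the order of the checks in the patch.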
Since this device is only partially initialized by the irqchip driver,
we need to mark the fwnode device as not initialized. This is to let
fw_devlink know that the device will be completely initialized at a
later point. That way, fw_devlink will continue to defer the probe of
the power domain consumers till the power domain driver successfully
binds to the struct device and completes the initialization of the
device.
Signed-off-by: Saravana Kannan <[email protected]>
---
drivers/irqchip/irq-imx-gpcv2.c | 1 +
drivers/soc/imx/gpcv2.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/drivers/irqchip/irq-imx-gpcv2.c b/drivers/irqchip/irq-imx-gpcv2.c
index b9c22f764b4d..8a0e82067924 100644
--- a/drivers/irqchip/irq-imx-gpcv2.c
+++ b/drivers/irqchip/irq-imx-gpcv2.c
@@ -283,6 +283,7 @@ static int __init imx_gpcv2_irqchip_init(struct device_node *node,
* later the GPC power domain driver will not be skipped.
*/
of_node_clear_flag(node, OF_POPULATED);
+ fwnode_dev_initialized(domain->fwnode, false);
return 0;
}
diff --git a/drivers/soc/imx/gpcv2.c b/drivers/soc/imx/gpcv2.c
index 7a47d14fde44..b24f9ab634dc 100644
--- a/drivers/soc/imx/gpcv2.c
+++ b/drivers/soc/imx/gpcv2.c
@@ -1519,6 +1519,7 @@ static int imx_gpcv2_probe(struct platform_device *pdev)
pd_pdev->dev.parent = dev;
pd_pdev->dev.of_node = np;
+ pd_pdev->dev.fwnode = of_fwnode_handle(np);
ret = platform_device_add(pd_pdev);
if (ret) {
--
2.39.1.456.gfc5497dd1b-goog
This allows fw_devlink to track and enforce supplier-consumer
dependencies for scmi_device.
Signed-off-by: Saravana Kannan <[email protected]>
---
drivers/firmware/arm_scmi/bus.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/firmware/arm_scmi/bus.c b/drivers/firmware/arm_scmi/bus.c
index 35bb70724d44..1d8a6a8d9906 100644
--- a/drivers/firmware/arm_scmi/bus.c
+++ b/drivers/firmware/arm_scmi/bus.c
@@ -12,6 +12,7 @@
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/device.h>
+#include <linux/of.h>
#include "common.h"
@@ -192,6 +193,7 @@ scmi_device_create(struct device_node *np, struct device *parent, int protocol,
scmi_dev->protocol_id = protocol;
scmi_dev->dev.parent = parent;
scmi_dev->dev.of_node = np;
+ scmi_dev->dev.fwnode = of_fwnode_handle(np);
scmi_dev->dev.bus = &scmi_bus_type;
scmi_dev->dev.release = scmi_device_release;
dev_set_name(&scmi_dev->dev, "scmi_dev.%d", id);
--
2.39.1.456.gfc5497dd1b-goog
Hi Saravana,
On Fri, Jan 27, 2023 at 1:11 AM Saravana Kannan <[email protected]> wrote:
> The OF_POPULATED flag was set to let fw_devlink know that the device
> tree node will not have a struct device created for it. This information
> is used by fw_devlink to avoid deferring the probe of consumers of this
> device tree node.
>
> Let's use fwnode_dev_initialized() instead because it achieves the same
> effect without using OF specific flags. This allows more generic code to
> be written in driver core.
>
> Signed-off-by: Saravana Kannan <[email protected]>
Thanks for your patch!
> --- a/drivers/soc/renesas/rcar-sysc.c
> +++ b/drivers/soc/renesas/rcar-sysc.c
> @@ -437,7 +437,7 @@ static int __init rcar_sysc_pd_init(void)
>
> error = of_genpd_add_provider_onecell(np, &domains->onecell_data);
> if (!error)
> - of_node_set_flag(np, OF_POPULATED);
> + fwnode_dev_initialized(&np->fwnode, true);
As drivers/soc/renesas/rmobile-sysc.c is already using this method,
it should work fine.
Reviewed-by: Geert Uytterhoeven <[email protected]>
i.e. will queue in renesas-devel for v6.4.
>
> out_put:
> of_node_put(np);
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
On Thu, Jan 26, 2023 at 04:11:28PM -0800, Saravana Kannan wrote:
> When a device X is bound successfully to a driver, if it has a child
> firmware node Y that doesn't have a struct device created by then, we
> delete fwnode links where the child firmware node Y is the supplier. We
> did this to avoid blocking the consumers of the child firmware node Y
> from deferring probe indefinitely.
>
> While that is a step in the right direction, it's better to make the
> consumers of the child firmware node Y to be consumers of the device X
> because device X is probably implementing whatever functionality is
> represented by child firmware node Y. By doing this, we capture the
> device dependencies more accurately and ensure better
> probe/suspend/resume ordering.
...
> static unsigned int defer_sync_state_count = 1;
> static DEFINE_MUTEX(fwnode_link_lock);
> static bool fw_devlink_is_permissive(void);
> +static void __fw_devlink_link_to_consumers(struct device *dev);
> static bool fw_devlink_drv_reg_done;
> static bool fw_devlink_best_effort;
I'm wondering if we may avoid adding more forward declarations...
Perhaps it's a sign that devlink code should be split to its own
module?
...
> -int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
> +static int __fwnode_link_add(struct fwnode_handle *con,
> + struct fwnode_handle *sup)
I believe we tolerate a bit longer lines, so you may still have it on a single
line.
...
> +int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
> +{
> + int ret = 0;
Redundant assignment.
> + mutex_lock(&fwnode_link_lock);
> + ret = __fwnode_link_add(con, sup);
> + mutex_unlock(&fwnode_link_lock);
> return ret;
> }
...
> if (dev->fwnode && dev->fwnode->dev == dev) {
You may have above something like
fwnode = dev_fwnode(dev);
if (fwnode && fwnode->dev == dev) {
> struct fwnode_handle *child;
> fwnode_links_purge_suppliers(dev->fwnode);
> + mutex_lock(&fwnode_link_lock);
> fwnode_for_each_available_child_node(dev->fwnode, child)
> - fw_devlink_purge_absent_suppliers(child);
> + __fw_devlink_pickup_dangling_consumers(child,
> + dev->fwnode);
__fw_devlink_pickup_dangling_consumers(child, fwnode);
> + __fw_devlink_link_to_consumers(dev);
> + mutex_unlock(&fwnode_link_lock);
> }
--
With Best Regards,
Andy Shevchenko
On Thu, Jan 26, 2023 at 04:11:30PM -0800, Saravana Kannan wrote:
> The OF_POPULATED flag was set to let fw_devlink know that the device
> tree node will not have a struct device created for it. This information
> is used by fw_devlink to avoid deferring the probe of consumers of this
> device tree node.
>
> Let's use fwnode_dev_initialized() instead because it achieves the same
> effect without using OF specific flags. This allows more generic code to
> be written in driver core.
...
> - of_node_set_flag(np, OF_POPULATED);
> + fwnode_dev_initialized(&np->fwnode, true);
of_fwnode_handle(np) ?
--
With Best Regards,
Andy Shevchenko
On Thu, Jan 26, 2023 at 04:11:31PM -0800, Saravana Kannan wrote:
> Registering an irqdomain sets the flag for the fwnode. But having the
> flag set when a device is added is interpreted by fw_devlink to mean the
> device has already been initialized and will never probe. This prevents
> fw_devlink from creating device links with the gpio_device as a
> supplier. So, clear the flag before adding the device.
...
> + /*
> + * If fwnode doesn't belong to another device, it's safe to clear its
> + * initialized flag.
> + */
> + if (!gdev->dev.fwnode->dev)
> + fwnode_dev_initialized(gdev->dev.fwnode, false);
Do not dereference fwnode in struct device. Use dev_fwnode() for that.
struct fwnode_handle *fwnode = dev_fwnode(&gdev->dev);
if (!fwnode->dev)
fwnode_dev_initialized(fwnode, false);
+ Blank line.
> ret = gcdev_register(gdev, gpio_devt);
> if (ret)
> return ret;
--
With Best Regards,
Andy Shevchenko
On Thu, Jan 26, 2023 at 04:11:32PM -0800, Saravana Kannan wrote:
> fw_devlink uses DL_FLAG_SYNC_STATE_ONLY device link flag for two
> purposes:
>
> 1. To allow a parent device to proxy its child device's dependency on a
> supplier so that the supplier doesn't get its sync_state() callback
> before the child device/consumer can be added and probed. In this
> usage scenario, we need to ignore cycles to ensure correctness of
> sync_state() callbacks.
>
> 2. When there are dependency cycles in firmware, we don't know which of
> those dependencies are valid. So, we have to ignore them all wrt
> probe ordering while still making sure the sync_state() callbacks
> come correctly.
>
> However, when detecting dependency cycles, there can be multiple
> dependency cycles between two devices that we need to detect. For
> example:
>
> A -> B -> A and A -> C -> B -> A.
>
> To detect multiple cycles correctly, we need to be able to differentiate
> DL_FLAG_SYNC_STATE_ONLY device links used for (1) vs (2) above.
>
> To allow this differentiation, add a DL_FLAG_CYCLE that can be used to
> mark use case (2). We can then use the DL_FLAG_CYCLE to decide which
> DL_FLAG_SYNC_STATE_ONLY device links to follow when looking for
> dependency cycles.
...
> +static inline bool device_link_flag_is_sync_state_only(u32 flags)
> +{
> + return (flags & ~(DL_FLAG_INFERRED | DL_FLAG_CYCLE))
> + == (DL_FLAG_SYNC_STATE_ONLY | DL_FLAG_MANAGED);
Weird indentation, why not
return (flags & ~(DL_FLAG_INFERRED | DL_FLAG_CYCLE)) ==
(DL_FLAG_SYNC_STATE_ONLY | DL_FLAG_MANAGED);
?
> +}
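For readers who want to try the mask logic discussed above, here is a small standalone sketch. The helper body follows the quoted patch (with the suggested line break); the bit positions are illustrative assumptions and should be checked against include/linux/device.h, and is_proxy_sync_state_only() is a hypothetical helper added only to show how DL_FLAG_CYCLE separates use case (1) from use case (2).

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions; the authoritative values are in device.h. */
#define DL_FLAG_PM_RUNTIME	(1u << 2)
#define DL_FLAG_MANAGED		(1u << 6)
#define DL_FLAG_SYNC_STATE_ONLY	(1u << 7)
#define DL_FLAG_INFERRED	(1u << 8)
#define DL_FLAG_CYCLE		(1u << 9)

/* INFERRED and CYCLE are ignored; everything else must match exactly. */
static inline bool device_link_flag_is_sync_state_only(uint32_t flags)
{
	return (flags & ~(DL_FLAG_INFERRED | DL_FLAG_CYCLE)) ==
	       (DL_FLAG_SYNC_STATE_ONLY | DL_FLAG_MANAGED);
}

/*
 * Hypothetical helper: true for a proxy SYNC_STATE_ONLY link (use case 1),
 * false for a link that was relaxed because of a cycle (use case 2).
 */
static inline bool is_proxy_sync_state_only(uint32_t flags)
{
	return device_link_flag_is_sync_state_only(flags) &&
	       !(flags & DL_FLAG_CYCLE);
}
```

A cycle walk would skip links for which is_proxy_sync_state_only() is true and keep following the ones marked with DL_FLAG_CYCLE.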
...
> DL_FLAG_AUTOREMOVE_SUPPLIER | \
> DL_FLAG_AUTOPROBE_CONSUMER | \
> DL_FLAG_SYNC_STATE_ONLY | \
> - DL_FLAG_INFERRED)
> + DL_FLAG_INFERRED | \
> + DL_FLAG_CYCLE)
You can make less churn by squeezing the new one above the last one.
--
With Best Regards,
Andy Shevchenko
On Fri, Jan 27, 2023 at 11:29:43AM +0200, Andy Shevchenko wrote:
> On Thu, Jan 26, 2023 at 04:11:32PM -0800, Saravana Kannan wrote:
...
> > DL_FLAG_AUTOREMOVE_SUPPLIER | \
> > DL_FLAG_AUTOPROBE_CONSUMER | \
> > DL_FLAG_SYNC_STATE_ONLY | \
> > - DL_FLAG_INFERRED)
> > + DL_FLAG_INFERRED | \
> > + DL_FLAG_CYCLE)
>
> You can make less churn by squeezing the new one above the last one.
Or even define a mask with all necessary bits in the header and use it.
--
With Best Regards,
Andy Shevchenko
Hi Andy,
On Fri, Jan 27, 2023 at 10:25 AM Andy Shevchenko
<[email protected]> wrote:
> On Thu, Jan 26, 2023 at 04:11:30PM -0800, Saravana Kannan wrote:
> > The OF_POPULATED flag was set to let fw_devlink know that the device
> > tree node will not have a struct device created for it. This information
> > is used by fw_devlink to avoid deferring the probe of consumers of this
> > device tree node.
> >
> > Let's use fwnode_dev_initialized() instead because it achieves the same
> > effect without using OF specific flags. This allows more generic code to
> > be written in driver core.
>
> ...
>
> > - of_node_set_flag(np, OF_POPULATED);
> > + fwnode_dev_initialized(&np->fwnode, true);
>
> of_fwnode_handle(np) ?
Or of_node_to_fwnode(). Looks like we have (at least) two of them...
Gr{oetje,eeting}s,
Geert
On Thu, Jan 26, 2023 at 04:11:33PM -0800, Saravana Kannan wrote:
> To improve detection and handling of dependency cycles, we need to be
> able to mark fwnode links as being part of cycles. fwnode links marked
> as being part of a cycle should not block their consumers from probing.
...
> + list_for_each_entry(link, &fwnode->suppliers, c_hook) {
> + if (link->flags & FWLINK_FLAG_CYCLE)
> + continue;
> + return link->supplier;
Hmm...
if (!(link->flags & FWLINK_FLAG_CYCLE))
return link->supplier;
?
> + }
> +
> + return NULL;
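The lookup being discussed can be sketched in userspace with the condition inverted as suggested. This is only an illustration: struct toy_fwnode, struct toy_link and first_blocking_supplier() are hypothetical names, a plain singly linked list replaces the kernel list helpers, the value of FWLINK_FLAG_CYCLE is assumed, and the fw_devlink permissive check and locking are omitted.

```c
#include <stddef.h>

#define FWLINK_FLAG_CYCLE (1u << 0)	/* illustrative value */

struct toy_fwnode;

/* Toy supplier link; next replaces the kernel's list_head hook. */
struct toy_link {
	struct toy_fwnode *supplier;
	unsigned int flags;
	struct toy_link *next;
};

struct toy_fwnode {
	struct toy_link *suppliers;
};

/*
 * Return the first supplier that should still block probing, that is, the
 * first one whose link is not marked as part of a cycle. Return NULL if
 * every remaining link is part of a cycle or there are no links at all.
 */
static struct toy_fwnode *first_blocking_supplier(const struct toy_fwnode *fwnode)
{
	const struct toy_link *link;

	if (!fwnode)
		return NULL;

	for (link = fwnode->suppliers; link; link = link->next) {
		if (!(link->flags & FWLINK_FLAG_CYCLE))
			return link->supplier;
	}

	return NULL;
}
```

A NULL result is what lets a consumer in a cycle proceed with probing even though fwnode links to its suppliers still exist.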
...
> - if (dev->fwnode && !list_empty(&dev->fwnode->suppliers) &&
> - !fw_devlink_is_permissive()) {
> - sup_fw = list_first_entry(&dev->fwnode->suppliers,
> - struct fwnode_link,
> - c_hook)->supplier;
> + sup_fw = fwnode_links_check_suppliers(dev->fwnode);
dev_fwnode() ?
...
> - val = !list_empty(&dev->fwnode->suppliers);
> + mutex_lock(&fwnode_link_lock);
> + val = !!fwnode_links_check_suppliers(dev->fwnode);
Ditto?
> + mutex_unlock(&fwnode_link_lock);
--
With Best Regards,
Andy Shevchenko
On Thu, Jan 26, 2023 at 04:11:35PM -0800, Saravana Kannan wrote:
> fw_devlink could only detect a single and simple cycle because it relied
> mainly on device link cycle detection code that only checked for cycles
> between devices. The expectation was that the firmware wouldn't have
> complicated cycles and multiple cycles between devices. That expectation
> has been proven to be wrong.
>
> For example, fw_devlink could handle:
>
> +-+ +-+
> |A+------> |B+
> +-+ +++
> ^ |
> | |
> +----------+
>
> But it couldn't handle even something as "simple" as:
>
> +---------------------+
> | |
> v |
> +-+ +-+ +++
> |A+------> |B+------> |C|
> +-+ +++ +-+
> ^ |
> | |
> +----------+
>
> But firmware has even more complicated cycles like:
>
> +---------------------+
> | |
> v |
> +-+ +---+ +++
> +--+A+------>| B +-----> |C|<--+
> | +-+ ++--+ +++ |
> | ^ | ^ | |
> | | | | | |
> | +---------+ +---------+ |
> | |
> +------------------------------+
>
> And this is without including parent child dependencies or nodes in the
> cycle that are just firmware nodes that'll never have a struct device
> created for them.
>
> The proper way to treat these devices is to not force any probe ordering
> between them, while still enforcing dependencies between the nodes in the
> cycles (A, B and C) and their consumers.
>
> So this patch goes all out and just deals with all types of cycles. It
> does this by:
>
> 1. Following dependencies across device links, parent-child and fwnode
> links.
> 2. When it finds cycles, it marks the device links and fwnode links as
> such instead of just deleting them or making them indistinguishable
> from proxy SYNC_STATE_ONLY device links.
>
> This way, when new nodes get added, we can immediately find and mark any
> new cycles whether the new node is a device or firmware node.
...
> + * Check if @sup_handle or any of its ancestors or suppliers direct/indirectly
> + * depend on @con. This function can detect multiple cycles between @sup_handle
A single space is enough.
> + * and @con. When such dependency cycles are found, convert all device links
> + * created solely by fw_devlink into SYNC_STATE_ONLY device links. Also, mark
Ditto.
> + * all fwnode links in the cycle with FWLINK_FLAG_CYCLE so that when they are
> + * converted into a device link in the future, they are created as
> + * SYNC_STATE_ONLY device links. This is the equivalent of doing
Ditto.
> + * fw_devlink=permissive just between the devices in the cycle. We need to do
> + * this because, at this point, fw_devlink can't tell which of these
> + * dependencies is not a real dependency.
> + *
> + * Return true if one or more cycles were found. Otherwise, return false.
Return:
(you may run `kernel-doc -v ...` to see all warnings)
...
> +static bool __fw_devlink_relax_cycles(struct device *con,
> + struct fwnode_handle *sup_handle)
> +{
> + struct fwnode_link *link;
> + struct device_link *dev_link;
> + struct device *sup_dev = NULL, *par_dev = NULL;
You can put it on the first line since it's long enough.
But why do you need sup_dev assignment?
> + bool ret = false;
> +
> + if (!sup_handle)
> + return false;
> +
> + /*
> + * We aren't trying to find all cycles. Just a cycle between con and
> + * sup_handle.
> + */
> + if (sup_handle->flags & FWNODE_FLAG_VISITED)
> + return false;
> +
> + sup_handle->flags |= FWNODE_FLAG_VISITED;
> + sup_dev = get_dev_from_fwnode(sup_handle);
> +
I would put it closer to the condition:
> + /* Termination condition. */
> + if (sup_dev == con) {
/* Get supplier device and check for termination condition */
sup_dev = get_dev_from_fwnode(sup_handle);
if (sup_dev == con) {
> + ret = true;
> + goto out;
> + }
> +
> + /*
> + * If sup_dev is bound to a driver and @con hasn't started binding to
> + * a driver, @sup_dev can't be a consumer of @con. So, no need to
sup_dev or @sup_dev? What's the difference? Should you spell one of them
in full?
> + * check further.
> + */
> + if (sup_dev && sup_dev->links.status == DL_DEV_DRIVER_BOUND &&
As in the comment above, the single space is enough.
> + con->links.status == DL_DEV_NO_DRIVER) {
> + ret = false;
> + goto out;
> + }
> +
> + list_for_each_entry(link, &sup_handle->suppliers, c_hook) {
> + if (__fw_devlink_relax_cycles(con, link->supplier)) {
> + __fwnode_link_cycle(link);
> + ret = true;
> + }
> + }
> +
> + /*
> + * Give priority to device parent over fwnode parent to account for any
> + * quirks in how fwnodes are converted to devices.
> + */
> + if (sup_dev) {
> + par_dev = sup_dev->parent;
> + get_device(par_dev);
> + } else {
> + par_dev = fwnode_get_next_parent_dev(sup_handle);
> + }
if (sup_dev)
par_dev = get_device(sup_dev->parent);
else
par_dev = fwnode_get_next_parent_dev(sup_handle);
> + if (par_dev)
> + ret |= __fw_devlink_relax_cycles(con, par_dev->fwnode);
Instead I would rather do a similar pattern of the ret assignment as elsewhere
in the function.
if (par_dev && __fw_devlink_relax_cycles(con, par_dev->fwnode))
ret = true;
> + if (!sup_dev)
> + goto out;
> +
> + list_for_each_entry(dev_link, &sup_dev->links.suppliers, c_node) {
> + /*
> + * Ignore a SYNC_STATE_ONLY flag only if it wasn't marked as
> + * such due to a cycle.
> + */
> + if (device_link_flag_is_sync_state_only(dev_link->flags) &&
> + !(dev_link->flags & DL_FLAG_CYCLE))
> + continue;
> +
> + if (__fw_devlink_relax_cycles(con,
> + dev_link->supplier->fwnode)) {
Keep it on one line.
> + fw_devlink_relax_link(dev_link);
> + dev_link->flags |= DL_FLAG_CYCLE;
> + ret = true;
> + }
> + }
> +
> +out:
> + sup_handle->flags &= ~FWNODE_FLAG_VISITED;
> + put_device(sup_dev);
> + put_device(par_dev);
> + return ret;
> +}
--
With Best Regards,
Andy Shevchenko
On Fri, Jan 27, 2023 at 10:30:35AM +0100, Geert Uytterhoeven wrote:
> On Fri, Jan 27, 2023 at 10:25 AM Andy Shevchenko
> <[email protected]> wrote:
> > On Thu, Jan 26, 2023 at 04:11:30PM -0800, Saravana Kannan wrote:
...
> > > - of_node_set_flag(np, OF_POPULATED);
> > > + fwnode_dev_initialized(&np->fwnode, true);
> >
> > of_fwnode_handle(np) ?
>
> Or of_node_to_fwnode().
Not really.
> Looks like we have (at least) two of them...
Yes, and the latter one is an IRQ subsystem invention. It should go away in favour of
the generic helper.
--
With Best Regards,
Andy Shevchenko
On Thu, Jan 26, 2023 at 04:11:37PM -0800, Saravana Kannan wrote:
> Since this device is only partially initialized by the irqchip driver,
> we need to mark the fwnode device as not initialized. This is to let
> fw_devlink know that the device will be completely initialized at a
> later point. That way, fw_devlink will continue to defer the probe of
> the power domain consumers till the power domain driver successfully
> binds to the struct device and completes the initialization of the
> device.
...
> pd_pdev->dev.of_node = np;
> + pd_pdev->dev.fwnode = of_fwnode_handle(np);
Instead,
device_set_node(&pd_pdev->dev, of_fwnode_handle(np));
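The point of using one helper is that the fwnode and of_node pointers cannot drift apart. Below is a toy userspace model of that idea, not the kernel code: all toy_* names are hypothetical, and the OF node lookup is reduced to a bare container_of-style computation without the "is this really an OF node" check that the kernel's to_of_node() performs.

```c
#include <stddef.h>

struct toy_fwnode {
	int dummy;
};

/* The fwnode is embedded in the OF node, as in struct device_node. */
struct toy_of_node {
	const char *name;
	struct toy_fwnode fwnode;
};

struct toy_device {
	struct toy_fwnode *fwnode;
	struct toy_of_node *of_node;
};

/* Recover the containing OF node from its embedded fwnode. */
static struct toy_of_node *toy_to_of_node(struct toy_fwnode *fwnode)
{
	if (!fwnode)
		return NULL;
	return (struct toy_of_node *)((char *)fwnode -
				      offsetof(struct toy_of_node, fwnode));
}

/* Set both pointers in one place so they always describe the same node. */
static void toy_device_set_node(struct toy_device *dev,
				struct toy_fwnode *fwnode)
{
	dev->fwnode = fwnode;
	dev->of_node = toy_to_of_node(fwnode);
}
```

With a helper of this shape, a separate assignment to dev.of_node next to the fwnode assignment becomes unnecessary in the caller.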
--
With Best Regards,
Andy Shevchenko
Hi Andy,
On Fri, Jan 27, 2023 at 10:43 AM Andy Shevchenko
<[email protected]> wrote:
> On Thu, Jan 26, 2023 at 04:11:35PM -0800, Saravana Kannan wrote:
> > + * Check if @sup_handle or any of its ancestors or suppliers direct/indirectly
> > + * depend on @con. This function can detect multiple cycles between @sup_handle
>
> A single space is enough.
It's very common to write two spaces after a full stop.
When joining two sentences on separate lines in vim using SHIFT-J,
vim will make sure there are two spaces.
Gr{oetje,eeting}s,
Geert
On Thu, Jan 26, 2023 at 04:11:38PM -0800, Saravana Kannan wrote:
> This allows fw_devlink to track and enforce supplier-consumer
> dependencies for scmi_device.
...
> scmi_dev->dev.of_node = np;
> + scmi_dev->dev.fwnode = of_fwnode_handle(np);
As per previous patch:
device_set_node(&scmi_dev->dev, of_fwnode_handle(np));
--
With Best Regards,
Andy Shevchenko
On Fri, Jan 27, 2023 at 10:30 AM Andy Shevchenko
<[email protected]> wrote:
> On Thu, Jan 26, 2023 at 04:11:32PM -0800, Saravana Kannan wrote:
> > DL_FLAG_AUTOREMOVE_SUPPLIER | \
> > DL_FLAG_AUTOPROBE_CONSUMER | \
> > DL_FLAG_SYNC_STATE_ONLY | \
> > - DL_FLAG_INFERRED)
> > + DL_FLAG_INFERRED | \
> > + DL_FLAG_CYCLE)
>
> You can make less churn by squeezing the new one above the last one.
And avoiding some future churn by introducing alphabetical order.
Gr{oetje,eeting}s,
Geert
On Fri, Jan 27, 2023 at 10:52:02AM +0100, Geert Uytterhoeven wrote:
> On Fri, Jan 27, 2023 at 10:43 AM Andy Shevchenko
> <[email protected]> wrote:
> > On Thu, Jan 26, 2023 at 04:11:35PM -0800, Saravana Kannan wrote:
> > > + * Check if @sup_handle or any of its ancestors or suppliers direct/indirectly
> > > + * depend on @con. This function can detect multiple cyles between @sup_handle
> >
> > A single space is enough.
>
> It's very common to write two spaces after a full stop.
> When joining two sentences on separate lines in vim using SHIFT-J,
> vim will make sure there are two spaces.
But is this consistent with all kernel doc comments in the core.c?
I'm fine with either as long as it's consistent.
--
With Best Regards,
Andy Shevchenko
Hi Andy,
On Fri, Jan 27, 2023 at 11:10 AM Andy Shevchenko
<[email protected]> wrote:
> On Fri, Jan 27, 2023 at 10:52:02AM +0100, Geert Uytterhoeven wrote:
> > On Fri, Jan 27, 2023 at 10:43 AM Andy Shevchenko
> > <[email protected]> wrote:
> > > On Thu, Jan 26, 2023 at 04:11:35PM -0800, Saravana Kannan wrote:
> > > > + * Check if @sup_handle or any of its ancestors or suppliers direct/indirectly
> > > > + * depend on @con. This function can detect multiple cyles between @sup_handle
> > >
> > > A single space is enough.
> >
> > It's very common to write two spaces after a full stop.
See e.g.:
git grep "\.  [^ ]"
> > When joining two sentences on separate lines in vim using SHIFT-J,
> > vim will make sure there are two spaces.
>
> But is this consistent with all kernel doc comments in the core.c?
Probably there are inconsistencies...
(Aren't there everywhere?)
> I'm fine with either as long as it's consistent.
At least the kerneldoc source will look similar to the PDF output
(LaTeX inserts more space after a full stop automatically ;-).
Gr{oetje,eeting}s,
Geert
On Thu, Jan 26, 2023 at 04:11:38PM -0800, Saravana Kannan wrote:
> This allows fw_devlink to track and enforce supplier-consumer
> dependencies for scmi_device.
>
Is there any dependency on the rest of the series? If so,
Acked-by: Sudeep Holla <[email protected]>
after you incorporate Andy's suggestion.
Let me know if you want me to pick this up.
--
Regards,
Sudeep
On Thu, Jan 26, 2023 at 04:11:27PM -0800, Saravana Kannan wrote:
> Dmitry, Maxim(s), Miquel, Luca, Doug, Colin, Martin, Jean-Philippe,
>
> I've Cc-ed you because I had pointed you to v1 of this series + the
> patches in that thread at one point or another as a fix to some issue
> you were facing. I'd appreciate it if you can test this series and
> report any issues, or things it fixed and give Tested-bys.
I applied this on my working net-next/main development branch and can
confirm I am able to successfully boot the Beaglebone Black.
Tested-by: Colin Foster <[email protected]>
On Fri, Jan 27, 2023 at 12:30 PM Colin Foster
<[email protected]> wrote:
>
> On Thu, Jan 26, 2023 at 04:11:27PM -0800, Saravana Kannan wrote:
> > Dmitry, Maxim(s), Miquel, Luca, Doug, Colin, Martin, Jean-Philippe,
> >
> > I've Cc-ed you because I had pointed you to v1 of this series + the
> > patches in that thread at one point or another as a fix to some issue
> > you were facing. I'd appreciate it if you can test this series and
> > report any issues, or things it fixed and give Tested-bys.
>
> I applied this on my working net-next/main development branch and can
> confirm I am able to successfully boot the Beaglebone Black.
>
> Tested-by: Colin Foster <[email protected]>
Thanks!
-Saravana
On Fri, Jan 27, 2023 at 12:11 AM Geert Uytterhoeven
<[email protected]> wrote:
>
> Hi Saravana,
>
> On Fri, Jan 27, 2023 at 1:11 AM Saravana Kannan <[email protected]> wrote:
> > The OF_POPULATED flag was set to let fw_devlink know that the device
> > tree node will not have a struct device created for it. This information
> > is used by fw_devlink to avoid deferring the probe of consumers of this
> > device tree node.
> >
> > Let's use fwnode_dev_initialized() instead because it achieves the same
> > effect without using OF specific flags. This allows more generic code to
> > be written in driver core.
> >
> > Signed-off-by: Saravana Kannan <[email protected]>
>
> Thanks for your patch!
>
> > --- a/drivers/soc/renesas/rcar-sysc.c
> > +++ b/drivers/soc/renesas/rcar-sysc.c
> > @@ -437,7 +437,7 @@ static int __init rcar_sysc_pd_init(void)
> >
> > error = of_genpd_add_provider_onecell(np, &domains->onecell_data);
> > if (!error)
> > - of_node_set_flag(np, OF_POPULATED);
> > + fwnode_dev_initialized(&np->fwnode, true);
>
> As drivers/soc/renesas/rmobile-sysc.c is already using this method,
> it should work fine.
>
> Reviewed-by: Geert Uytterhoeven <[email protected]>
> i.e. will queue in renesas-devel for v6.4.
Thanks! Does that mean I should drop this from this series? If two
maintainers pick the same patch up, will it cause problems? I'm
eventually expecting this series to be picked up by Greg into
driver-core-next.
-Saravana
>
> >
> > out_put:
> > of_node_put(np);
>
> Gr{oetje,eeting}s,
>
> Geert
>
On Fri, Jan 27, 2023 at 1:22 AM Andy Shevchenko
<[email protected]> wrote:
>
> On Thu, Jan 26, 2023 at 04:11:28PM -0800, Saravana Kannan wrote:
> > When a device X is bound successfully to a driver, if it has a child
> > firmware node Y that doesn't have a struct device created by then, we
> > delete fwnode links where the child firmware node Y is the supplier. We
> > did this to avoid blocking the consumers of the child firmware node Y
> > from deferring probe indefinitely.
> >
> > While that's a step in the right direction, it's better to make the
> > consumers of the child firmware node Y to be consumers of the device X
> > because device X is probably implementing whatever functionality is
> > represented by child firmware node Y. By doing this, we capture the
> > device dependencies more accurately and ensure better
> > probe/suspend/resume ordering.
>
> ...
>
> > static unsigned int defer_sync_state_count = 1;
> > static DEFINE_MUTEX(fwnode_link_lock);
> > static bool fw_devlink_is_permissive(void);
> > +static void __fw_devlink_link_to_consumers(struct device *dev);
> > static bool fw_devlink_drv_reg_done;
> > static bool fw_devlink_best_effort;
>
> I'm wondering if may avoid adding more forward declarations...
>
> Perhaps it's a sign that devlink code should be split to its own
> module?
I've thought about that before, but I'm not there yet. Maybe once my
remaining refactors and TODOs are done, it'd be a good time to revisit
this question.
But I don't think it should be done for the reason of forward
declaration as we'd just end up moving these into base.h and we can do
that even today.
>
> ...
>
> > -int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
> > +static int __fwnode_link_add(struct fwnode_handle *con,
> > + struct fwnode_handle *sup)
>
> I believe we tolerate a bit longer lines, so you may still have it on a single
> line.
That'd make it >80 cols. I'm going to leave it as is.
>
> ...
>
> > +int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
> > +{
>
> > + int ret = 0;
>
> Redundant assignment.
Thanks. Will fix in v3.
>
> > + mutex_lock(&fwnode_link_lock);
> > + ret = __fwnode_link_add(con, sup);
> > + mutex_unlock(&fwnode_link_lock);
> > return ret;
> > }
>
> ...
>
> > if (dev->fwnode && dev->fwnode->dev == dev) {
>
> You may have above something like
>
>
> fwnode = dev_fwnode(dev);
I'll leave it as-is for now. I see dev->fwnode vs dev_fwnode() don't
always give the same results. I need to re-examine other places I use
dev->fwnode in fw_devlink code before I start using that function. But
in general it seems like a good idea. I'll add this to my TODOs.
> if (fwnode && fwnode->dev == dev) {
>
> > struct fwnode_handle *child;
> > fwnode_links_purge_suppliers(dev->fwnode);
> > + mutex_lock(&fwnode_link_lock);
> > fwnode_for_each_available_child_node(dev->fwnode, child)
> > - fw_devlink_purge_absent_suppliers(child);
> > + __fw_devlink_pickup_dangling_consumers(child,
> > + dev->fwnode);
>
> __fw_devlink_pickup_dangling_consumers(child, fwnode);
I like the dev->fwnode->dev == dev check. It makes it super clear that
I'm checking "The device's fwnode points back to the device". If I
just use fwnode->dev == dev, then one will have to go back and read
what fwnode is set to, etc. Also, when reading all these function
calls it's easier to see that I'm working on the dev's fwnode (where
dev is the device that was just bound to a driver) instead of some
other fwnode.
So I find it more readable as is and the compiler would optimize it
anyway. If you feel strongly about this, I can change to use fwnode
instead of dev->fwnode.
Thanks,
Saravana
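
For readers following along, the behaviour described in the quoted commit
message (consumers of a device-less child node Y being handed over to
device X) can be modelled in userspace roughly as below. This is only a
sketch under stated assumptions: struct model_fwnode, struct model_link,
pickup_consumers() and adopt_dangling_consumers() are hypothetical names,
not the driver core's data structures, and the rule that a child with its
own device keeps its consumers is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 4

/* Hypothetical stand-in for a firmware node; not the kernel's fwnode_handle. */
struct model_fwnode {
	bool has_device;                       /* a struct device exists for it */
	size_t nr_children;
	struct model_fwnode *children[MAX_CHILDREN];
};

/* One consumer -> supplier dependency between firmware nodes. */
struct model_link {
	struct model_fwnode *consumer;
	struct model_fwnode *supplier;
};

static void add_child(struct model_fwnode *parent, struct model_fwnode *child)
{
	assert(parent->nr_children < MAX_CHILDREN);
	parent->children[parent->nr_children++] = child;
}

/*
 * Re-point every link supplied by @node (and by its device-less descendants)
 * at @new_sup. Assumption: a node that has its own device keeps its consumers.
 */
static void pickup_consumers(struct model_fwnode *node,
			     struct model_fwnode *new_sup,
			     struct model_link *links, size_t nr_links)
{
	size_t i;

	if (node->has_device)
		return;

	for (i = 0; i < nr_links; i++)
		if (links[i].supplier == node)
			links[i].supplier = new_sup;

	for (i = 0; i < node->nr_children; i++)
		pickup_consumers(node->children[i], new_sup, links, nr_links);
}

/* Called once @x has bound to a driver: adopt the consumers of its children. */
static void adopt_dangling_consumers(struct model_fwnode *x,
				     struct model_link *links, size_t nr_links)
{
	size_t i;

	for (i = 0; i < x->nr_children; i++)
		pickup_consumers(x->children[i], x, links, nr_links);
}
```

With X as the parent of a device-less Y, a link whose supplier is Y ends up
pointing at X, while links to children that have their own device are left
untouched.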
On Fri, Jan 27, 2023 at 1:27 AM Andy Shevchenko
<[email protected]> wrote:
>
> On Thu, Jan 26, 2023 at 04:11:31PM -0800, Saravana Kannan wrote:
> > Registering an irqdomain sets the flag for the fwnode. But having the
> > flag set when a device is added is interpreted by fw_devlink to mean the
> > device has already been initialized and will never probe. This prevents
> > fw_devlink from creating device links with the gpio_device as a
> > supplier. So, clear the flag before adding the device.
>
> ...
>
> > + /*
> > + * If fwnode doesn't belong to another device, it's safe to clear its
> > + * initialized flag.
> > + */
> > + if (!gdev->dev.fwnode->dev)
> > + fwnode_dev_initialized(gdev->dev.fwnode, false);
>
> Do not dereference fwnode in struct device. Use dev_fwnode() for that.
>
> struct fwnode_handle *fwnode = dev_fwnode(&gdev->dev);
>
> if (!fwnode->dev)
> fwnode_dev_initialized(fwnode, false);
Honestly, we should work towards NOT needing dev_fwnode(). The
function literally dereferences dev->fwnode or the one inside of_node.
So my dereference is fine. The whole "fwnode might not be set for
devices with of_node" is wrong and we should fix that instead of
writing wrappers to work around it.
Also, for now I'm going to leave this as is for the same reasons I
mentioned in Patch 1.
>
> + Blank line.
Ack.
-Saravana
>
> > ret = gcdev_register(gdev, gpio_devt);
> > if (ret)
> > return ret;
>
> --
> With Best Regards,
> Andy Shevchenko
>
On Fri, Jan 27, 2023 at 1:30 AM Andy Shevchenko
<[email protected]> wrote:
>
> On Thu, Jan 26, 2023 at 04:11:32PM -0800, Saravana Kannan wrote:
> > fw_devlink uses DL_FLAG_SYNC_STATE_ONLY device link flag for two
> > purposes:
> >
> > 1. To allow a parent device to proxy its child device's dependency on a
> > supplier so that the supplier doesn't get its sync_state() callback
> > before the child device/consumer can be added and probed. In this
> > usage scenario, we need to ignore cycles to ensure correctness of
> > sync_state() callbacks.
> >
> > 2. When there are dependency cycles in firmware, we don't know which of
> > those dependencies are valid. So, we have to ignore them all wrt
> > probe ordering while still making sure the sync_state() callbacks
> > come correctly.
> >
> > However, when detecting dependency cycles, there can be multiple
> > dependency cycles between two devices that we need to detect. For
> > example:
> >
> > A -> B -> A and A -> C -> B -> A.
> >
> > To detect multiple cycles correctly, we need to be able to differentiate
> > DL_FLAG_SYNC_STATE_ONLY device links used for (1) vs (2) above.
> >
> > To allow this differentiation, add a DL_FLAG_CYCLE that can be used to
> > mark use case (2). We can then use the DL_FLAG_CYCLE to decide which
> > DL_FLAG_SYNC_STATE_ONLY device links to follow when looking for
> > dependency cycles.
>
> ...
>
> > +static inline bool device_link_flag_is_sync_state_only(u32 flags)
> > +{
> > + return (flags & ~(DL_FLAG_INFERRED | DL_FLAG_CYCLE))
> > + == (DL_FLAG_SYNC_STATE_ONLY | DL_FLAG_MANAGED);
>
> Weird indentation, why not
>
> return (flags & ~(DL_FLAG_INFERRED | DL_FLAG_CYCLE)) ==
> (DL_FLAG_SYNC_STATE_ONLY | DL_FLAG_MANAGED);
>
> ?
Ack. Will fix in v3.
>
> > +}
>
> ...
>
> > DL_FLAG_AUTOREMOVE_SUPPLIER | \
> > DL_FLAG_AUTOPROBE_CONSUMER | \
> > DL_FLAG_SYNC_STATE_ONLY | \
> > - DL_FLAG_INFERRED)
> > + DL_FLAG_INFERRED | \
> > + DL_FLAG_CYCLE)
>
> You can make less churn by squeezing the new one above the last one.
I feel like this part is getting bikeshedded. I'm going to leave it
as is. It's done in the order it's defined in the header, and keeping
it that way makes it much easier to read than worrying about a single
line of churn.
-Saravana
>
> --
> With Best Regards,
> Andy Shevchenko
>
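
To make the distinction between use cases (1) and (2) concrete, here is a
small userspace sketch of the flag test discussed above.
device_link_flag_is_sync_state_only() uses the layout suggested in the
review; the bit positions and the helper follow_link_for_cycle_check() are
assumptions made for illustration and should be checked against the real
headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions are assumed for this sketch, not copied from the headers. */
#define DL_FLAG_MANAGED         (1u << 6)
#define DL_FLAG_SYNC_STATE_ONLY (1u << 7)
#define DL_FLAG_INFERRED        (1u << 8)
#define DL_FLAG_CYCLE           (1u << 9)

/* True if, ignoring INFERRED and CYCLE, the link is a pure SYNC_STATE_ONLY one. */
static bool device_link_flag_is_sync_state_only(uint32_t flags)
{
	return (flags & ~(DL_FLAG_INFERRED | DL_FLAG_CYCLE)) ==
	       (DL_FLAG_SYNC_STATE_ONLY | DL_FLAG_MANAGED);
}

/*
 * Hypothetical helper: decide whether cycle detection should walk this link.
 * Proxy SYNC_STATE_ONLY links (use case 1) are skipped; SYNC_STATE_ONLY links
 * that exist because of a known cycle (use case 2) are still followed.
 */
static bool follow_link_for_cycle_check(uint32_t flags)
{
	if (device_link_flag_is_sync_state_only(flags) &&
	    !(flags & DL_FLAG_CYCLE))
		return false;
	return true;
}
```

The point of the extra flag is visible in the second helper: without
DL_FLAG_CYCLE both kinds of SYNC_STATE_ONLY link would look identical and
would have to be either followed or skipped together.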
On Fri, Jan 27, 2023 at 1:33 AM Andy Shevchenko
<[email protected]> wrote:
>
> On Thu, Jan 26, 2023 at 04:11:33PM -0800, Saravana Kannan wrote:
> > To improve detection and handling of dependency cycles, we need to be
> > able to mark fwnode links as being part of cycles. fwnode links marked
> > as being part of a cycle should not block their consumers from probing.
>
> ...
>
> > + list_for_each_entry(link, &fwnode->suppliers, c_hook) {
> > + if (link->flags & FWLINK_FLAG_CYCLE)
> > + continue;
> > + return link->supplier;
>
> Hmm...
Thanks!
>
> if (!(link->flags & FWLINK_FLAG_CYCLE))
> return link->supplier;
>
> ?
>
> > + }
> > +
> > + return NULL;
>
> ...
>
> > - if (dev->fwnode && !list_empty(&dev->fwnode->suppliers) &&
> > - !fw_devlink_is_permissive()) {
> > - sup_fw = list_first_entry(&dev->fwnode->suppliers,
> > - struct fwnode_link,
> > - c_hook)->supplier;
> > + sup_fw = fwnode_links_check_suppliers(dev->fwnode);
>
> dev_fwnode() ?
>
> ...
>
> > - val = !list_empty(&dev->fwnode->suppliers);
> > + mutex_lock(&fwnode_link_lock);
> > + val = !!fwnode_links_check_suppliers(dev->fwnode);
>
> Ditto?
Similar response as Patch 1 and Patch 4.
-Saravana
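
As an illustration of the loop being discussed, the following userspace
sketch returns the first supplier whose link is not marked as part of a
cycle, which is the supplier that would still block the consumer from
probing. The array-based struct model_fwnode, the flag value and the name
first_blocking_supplier() are assumptions of the sketch; the patch itself
walks a kernel list.

```c
#include <assert.h>
#include <stddef.h>

#define FWLINK_FLAG_CYCLE (1u << 0)	/* value assumed for this sketch */

struct model_fwnode;

struct model_fwnode_link {
	struct model_fwnode *supplier;
	unsigned int flags;
};

/* Hypothetical stand-in: suppliers kept in a plain array, not a list_head. */
struct model_fwnode {
	const struct model_fwnode_link *suppliers;
	size_t nr_suppliers;
};

/* Return the first supplier not excused by a cycle, or NULL if there is none. */
static struct model_fwnode *
first_blocking_supplier(const struct model_fwnode *fwnode)
{
	size_t i;

	if (!fwnode)
		return NULL;

	for (i = 0; i < fwnode->nr_suppliers; i++) {
		if (!(fwnode->suppliers[i].flags & FWLINK_FLAG_CYCLE))
			return fwnode->suppliers[i].supplier;
	}

	return NULL;
}
```

This uses the single-condition form from the review comment; the
continue-based form in the patch behaves the same way.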
On Fri, Jan 27, 2023 at 1:43 AM Andy Shevchenko
<[email protected]> wrote:
>
> On Thu, Jan 26, 2023 at 04:11:35PM -0800, Saravana Kannan wrote:
> > fw_devlink could only detect a single and simple cycle because it relied
> > mainly on device link cycle detection code that only checked for cycles
> > between devices. The expectation was that the firmware wouldn't have
> > complicated cycles and multiple cycles between devices. That expectation
> > has been proven to be wrong.
> >
> > For example, fw_devlink could handle:
> >
> > +-+ +-+
> > |A+------> |B+
> > +-+ +++
> > ^ |
> > | |
> > +----------+
> >
> > But it couldn't handle even something as "simple" as:
> >
> > +---------------------+
> > | |
> > v |
> > +-+ +-+ +++
> > |A+------> |B+------> |C|
> > +-+ +++ +-+
> > ^ |
> > | |
> > +----------+
> >
> > But firmware has even more complicated cycles like:
> >
> > +---------------------+
> > | |
> > v |
> > +-+ +---+ +++
> > +--+A+------>| B +-----> |C|<--+
> > | +-+ ++--+ +++ |
> > | ^ | ^ | |
> > | | | | | |
> > | +---------+ +---------+ |
> > | |
> > +------------------------------+
> >
> > And this is without including parent child dependencies or nodes in the
> > cycle that are just firmware nodes that'll never have a struct device
> > created for them.
> >
> > The proper way to treat these devices is to not force any probe ordering
> > between them, while still enforcing dependencies between nodes in the
> > cycles (A, B and C) and their consumers.
> >
> > So this patch goes all out and just deals with all types of cycles. It
> > does this by:
> >
> > 1. Following dependencies across device links, parent-child and fwnode
> > links.
> > 2. When it finds cycles, it marks the device links and fwnode links as
> > such instead of just deleting them or making them indistinguishable
> > from proxy SYNC_STATE_ONLY device links.
> >
> > This way, when new nodes get added, we can immediately find and mark any
> > new cycles whether the new node is a device or firmware node.
>
> ...
>
> > + * Check if @sup_handle or any of its ancestors or suppliers direct/indirectly
> > + * depend on @con. This function can detect multiple cyles between @sup_handle
>
> A single space is enough.
>
> > + * and @con. When such dependency cycles are found, convert all device links
> > + * created solely by fw_devlink into SYNC_STATE_ONLY device links. Also, mark
>
> Ditto.
>
> > + * all fwnode links in the cycle with FWLINK_FLAG_CYCLE so that when they are
> > + * converted into a device link in the future, they are created as
> > + * SYNC_STATE_ONLY device links. This is the equivalent of doing
>
> Ditto.
Lol, you are the king of nit picks :) I don't know how you even notice
these :) I don't like the double spacing either, but as Geert pointed
out, vim inserts them when I use it to auto word-wrap comment blocks.
I'll try to address them as I find them, but I'm not going to send out
revisions of patches just for double spaces.
>
> > + * fw_devlink=permissive just between the devices in the cycle. We need to do
> > + * this because, at this point, fw_devlink can't tell which of these
> > + * dependencies is not a real dependency.
> > + *
> > + * Return true if one or more cycles were found. Otherwise, return false.
>
> Return:
I'm following the rest of the function docs in this file.
>
> (you may run `kernel-doc -v ...` to see all warnings)
Hmmm I ran it on the patch file and it didn't give me anything useful.
Running it on the whole file is just a lot of lines to dig through.
>
> ...
>
> > +static bool __fw_devlink_relax_cycles(struct device *con,
> > + struct fwnode_handle *sup_handle)
> > +{
> > + struct fwnode_link *link;
> > + struct device_link *dev_link;
>
> > + struct device *sup_dev = NULL, *par_dev = NULL;
>
> You can put it the first line since it's long enough.
Wait, is that a style guideline to have the longer lines first?
> But why do you need sup_dev assignment?
Defensive programming I suppose. I can see this function being
refactored in the future where a goto out; is inserted before sup_dev
is assigned. And then the put_device(sup_dev) at "out" will end up
operating on some junk value and causing memory corruption.
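
A minimal userspace sketch of that concern, assuming a reference-counted
object whose put helper tolerates NULL. struct obj, obj_get(), obj_put() and
check() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct obj {
	int refcount;
};

/* Take a reference; passing NULL is allowed and returns NULL. */
static struct obj *obj_get(struct obj *o)
{
	if (o)
		o->refcount++;
	return o;
}

/* Drop a reference; passing NULL is a no-op. */
static void obj_put(struct obj *o)
{
	if (o)
		o->refcount--;
}

static bool check(struct obj *sup, bool bail_early)
{
	struct obj *held = NULL;	/* NULL keeps the "out" path safe */
	bool ret = false;

	if (bail_early)
		goto out;	/* taken before "held" is ever assigned */

	held = obj_get(sup);
	ret = held != NULL;
out:
	obj_put(held);
	return ret;
}
```

If held were left uninitialized, the early goto would hand an indeterminate
pointer to obj_put(); with the NULL initializer the cleanup is harmless on
every path.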
>
> > + bool ret = false;
> > +
> > + if (!sup_handle)
> > + return false;
> > +
> > + /*
> > + * We aren't trying to find all cycles. Just a cycle between con and
> > + * sup_handle.
> > + */
> > + if (sup_handle->flags & FWNODE_FLAG_VISITED)
> > + return false;
> > +
> > + sup_handle->flags |= FWNODE_FLAG_VISITED;
>
> > + sup_dev = get_dev_from_fwnode(sup_handle);
> > +
>
> I would put it closer to the condition:
>
> > + /* Termination condition. */
> > + if (sup_dev == con) {
>
> /* Get supplier device and check for termination condition */
> sup_dev = get_dev_from_fwnode(sup_handle);
> if (sup_dev == con) {
I put it the way it is because sup_dev is used for more than just
checking for termination condition.
>
> > + ret = true;
> > + goto out;
> > + }
> > +
> > + /*
> > + * If sup_dev is bound to a driver and @con hasn't started binding to
> > + * a driver, @sup_dev can't be a consumer of @con. So, no need to
>
> sup_dev or @sup_dev? What's the difference? Should you spell one of them
> in full?
Probably copy-pasta from a function doc. I'll make it sup_dev.
>
> > + * check further.
> > + */
> > + if (sup_dev && sup_dev->links.status == DL_DEV_DRIVER_BOUND &&
>
> As in the comment above, the single space is enough.
>
> > + con->links.status == DL_DEV_NO_DRIVER) {
> > + ret = false;
> > + goto out;
> > + }
> > +
> > + list_for_each_entry(link, &sup_handle->suppliers, c_hook) {
> > + if (__fw_devlink_relax_cycles(con, link->supplier)) {
> > + __fwnode_link_cycle(link);
> > + ret = true;
> > + }
> > + }
> > +
> > + /*
> > + * Give priority to device parent over fwnode parent to account for any
> > + * quirks in how fwnodes are converted to devices.
> > + */
>
> > + if (sup_dev) {
> > + par_dev = sup_dev->parent;
> > + get_device(par_dev);
> > + } else {
> > + par_dev = fwnode_get_next_parent_dev(sup_handle);
> > + }
>
> if (sup_dev)
> par_dev = get_device(sup_dev->parent);
> else
> par_dev = fwnode_get_next_parent_dev(sup_handle);
Ack, thanks. Makes it nicer.
>
> > + if (par_dev)
> > + ret |= __fw_devlink_relax_cycles(con, par_dev->fwnode);
>
> Instead I would rather do a similar pattern of the ret assignment as elsewhere
> in the function.
>
> if (par_dev && __fw_devlink_relax_cycles(con, par_dev->fwnode))
> ret = true;
Ack. Good suggestion!
>
> > + if (!sup_dev)
> > + goto out;
> > +
> > + list_for_each_entry(dev_link, &sup_dev->links.suppliers, c_node) {
> > + /*
> > + * Ignore a SYNC_STATE_ONLY flag only if it wasn't marked as a
> > + * such due to a cycle.
> > + */
> > + if (device_link_flag_is_sync_state_only(dev_link->flags) &&
> > + !(dev_link->flags & DL_FLAG_CYCLE))
> > + continue;
> > +
> > + if (__fw_devlink_relax_cycles(con,
> > + dev_link->supplier->fwnode)) {
>
> Keep it on one line.
It'll make it > 80. Is this some recent change about allowing > 80
cols? I'm leaving it as is for now.
> > + fw_devlink_relax_link(dev_link);
> > + dev_link->flags |= DL_FLAG_CYCLE;
> > + ret = true;
> > + }
> > + }
> > +
> > +out:
> > + sup_handle->flags &= ~FWNODE_FLAG_VISITED;
> > + put_device(sup_dev);
> > + put_device(par_dev);
> > + return ret;
> > +}
>
> --
> With Best Regards,
> Andy Shevchenko
>
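
As a rough illustration of the traversal discussed in this patch, the
following userspace model walks supplier edges depth-first, uses a per-node
visited flag to stop runaway recursion, and marks every link that lies on a
path leading back to the consumer. It only models fwnode-style links; device
links, parent devices and driver binding state are left out, and all names
(struct model_node, struct model_link, add_supplier(), relax_cycles()) are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SUPPLIERS 4

struct model_node;

struct model_link {
	struct model_node *supplier;
	bool cycle;		/* models FWLINK_FLAG_CYCLE */
};

struct model_node {
	bool visited;		/* models FWNODE_FLAG_VISITED */
	size_t nr_links;
	struct model_link links[MAX_SUPPLIERS];
};

/* Record that @con depends on @sup. */
static void add_supplier(struct model_node *con, struct model_node *sup)
{
	assert(con->nr_links < MAX_SUPPLIERS);
	con->links[con->nr_links].supplier = sup;
	con->links[con->nr_links].cycle = false;
	con->nr_links++;
}

/*
 * Return true if @sup directly or indirectly depends on @con, marking every
 * link on such a path as part of a cycle.
 */
static bool relax_cycles(struct model_node *con, struct model_node *sup)
{
	bool ret = false;
	size_t i;

	if (!sup || sup->visited)
		return false;

	/* Termination condition: we walked back to the consumer. */
	if (sup == con)
		return true;

	sup->visited = true;
	for (i = 0; i < sup->nr_links; i++) {
		if (relax_cycles(con, sup->links[i].supplier)) {
			sup->links[i].cycle = true;
			ret = true;
		}
	}
	sup->visited = false;	/* cleared on the way out, as in the patch */

	return ret;
}
```

Reading each arrow as "depends on", the example from the earlier commit
message (A -> B -> A and A -> C -> B -> A) gives a cycle for both
relax_cycles(&a, &b) and relax_cycles(&a, &c), and marks the B -> A and
C -> B links along the way.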
On Fri, Jan 27, 2023 at 1:51 AM Andy Shevchenko
<[email protected]> wrote:
>
> On Thu, Jan 26, 2023 at 04:11:37PM -0800, Saravana Kannan wrote:
> > Since this device is only partially initialized by the irqchip driver,
> > we need to mark the fwnode device as not initialized. This is to let
> > fw_devlink know that the device will be completely initialized at a
> > later point. That way, fw_devlink will continue to defer the probe of
> > the power domain consumers till the power domain driver successfully
> > binds to the struct device and completes the initialization of the
> > device.
>
> ...
>
> > pd_pdev->dev.of_node = np;
> > + pd_pdev->dev.fwnode = of_fwnode_handle(np);
>
> Instead,
>
> device_set_node(&pd_pdev->dev, of_fwnode_handle(np));
Ack
-Saravana
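
The suggestion replaces two separate assignments with one helper call. A
userspace model of the idea follows, assuming the helper stores the fwnode
pointer and derives of_node from it. The simplified structs, the is_of field,
to_of_node_model() and set_node_model() are stand-ins for this sketch, not the
kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_fwnode_handle {
	bool is_of;	/* true when embedded in a device tree node */
};

struct model_device_node {
	const char *name;
	struct model_fwnode_handle fwnode;
};

struct model_device {
	struct model_device_node *of_node;
	struct model_fwnode_handle *fwnode;
};

/* Recover the enclosing device tree node, or NULL for other fwnode types. */
static struct model_device_node *
to_of_node_model(struct model_fwnode_handle *fwnode)
{
	if (!fwnode || !fwnode->is_of)
		return NULL;
	return (struct model_device_node *)
		((char *)fwnode - offsetof(struct model_device_node, fwnode));
}

/* Keep both pointers in sync with a single call. */
static void set_node_model(struct model_device *dev,
			   struct model_fwnode_handle *fwnode)
{
	dev->fwnode = fwnode;
	dev->of_node = to_of_node_model(fwnode);
}
```

Setting both fields through one function removes the chance of updating
of_node and forgetting fwnode, or the other way around.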
Hi Saravana,
On Sat, Jan 28, 2023 at 8:19 AM Saravana Kannan <[email protected]> wrote:
> On Fri, Jan 27, 2023 at 12:11 AM Geert Uytterhoeven
> <[email protected]> wrote:
> > On Fri, Jan 27, 2023 at 1:11 AM Saravana Kannan <[email protected]> wrote:
> > > The OF_POPULATED flag was set to let fw_devlink know that the device
> > > tree node will not have a struct device created for it. This information
> > > is used by fw_devlink to avoid deferring the probe of consumers of this
> > > device tree node.
> > >
> > > Let's use fwnode_dev_initialized() instead because it achieves the same
> > > effect without using OF specific flags. This allows more generic code to
> > > be written in driver core.
> > >
> > > Signed-off-by: Saravana Kannan <[email protected]>
> >
> > Thanks for your patch!
> >
> > > --- a/drivers/soc/renesas/rcar-sysc.c
> > > +++ b/drivers/soc/renesas/rcar-sysc.c
> > > @@ -437,7 +437,7 @@ static int __init rcar_sysc_pd_init(void)
> > >
> > > error = of_genpd_add_provider_onecell(np, &domains->onecell_data);
> > > if (!error)
> > > - of_node_set_flag(np, OF_POPULATED);
> > > + fwnode_dev_initialized(&np->fwnode, true);
> >
> > As drivers/soc/renesas/rmobile-sysc.c is already using this method,
> > it should work fine.
> >
> > Reviewed-by: Geert Uytterhoeven <[email protected]>
> > i.e. will queue in renesas-devel for v6.4.
>
> Thanks! Does that mean I should drop this from this series? If two
> maintainers pick the same patch up, will it cause problems? I'm
> eventually expecting this series to be picked up by Greg into
> driver-core-next.
Indeed. Patches for drivers/soc/renesas/ are supposed to go upstream
through the renesas-devel and soc trees. This patch has no dependencies
on anything else in the series (or vice versa), so there is no reason
to deviate from that, and possibly cause conflicts later.
BTW, I will convert to of_node_to_fwnode() while applying.
Thanks!
Gr{oetje,eeting}s,
Geert
Build tests pass on arm, arm64, i386, mips, parisc, powerpc, riscv, s390, sh,
sparc and x86_64.
Boot and LTP smoke tests pass on qemu-arm64, qemu-armv7, qemu-i386 and
qemu-x86_64.
Boot failed on FVP.
Reported-by: Linux Kernel Functional Testing <[email protected]>
Please refer to the following link for details of the testing.
FVP boot failure log:
https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-devicetree_20230127001141_407071-1-saravanak_google_com/testrun/14389034/suite/boot/test/gcc-12-lkftconfig-64k_page_size/details/
[ 2.613437] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
[ 2.613628] Mem abort info:
[ 2.613756] ESR = 0x0000000096000005
[ 2.613904] EC = 0x25: DABT (current EL), IL = 32 bits
[ 2.614071] SET = 0, FnV = 0
[ 2.614215] EA = 0, S1PTW = 0
[ 2.614358] FSC = 0x05: level 1 translation fault
[ 2.614517] Data abort info:
[ 2.614647] ISV = 0, ISS = 0x00000005
[ 2.614792] CM = 0, WnR = 0
[ 2.614934] [0000000000000010] user address but active_mm is swapper
[ 2.615105] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
[ 2.615219] Modules linked in:
[ 2.615310] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc5 #1
[ 2.615445] Hardware name: FVP Base RevC (DT)
[ 2.615533] pstate: 61400009 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 2.615685] pc : gpiochip_setup_dev (include/linux/err.h:41 include/linux/fwnode.h:201 drivers/gpio/gpiolib.c:586)
[ 2.615816] lr : gpiochip_add_data_with_key (drivers/gpio/gpiolib.c:871)
[ 2.615970] sp : ffff8000081af5e0
[ 2.616051] x29: ffff8000081af5e0 x28: 0000000000000000 x27: ffff0008027cb5a0
[ 2.616261] x26: 0000000000000000 x25: ffffd7c5d6745910 x24: ffff0008027f4800
[ 2.616472] x23: 0000000000000000 x22: ffffd7c5d62b99a8 x21: 0000000000000202
[ 2.616679] x20: 0000000000000000 x19: ffff0008027f4800 x18: ffffffffffffffff
[ 2.616890] x17: ffffd7c5d6467928 x16: 0000000013e3690a x15: ffff8000081af3b0
[ 2.617102] x14: ffff00080275cd8a x13: ffff00080275cd88 x12: 0000000000000001
[ 2.617312] x11: 62726568746f6d3a x10: 0000000000000000 x9 : ffffd7c5d3b3ebe0
[ 2.617522] x8 : ffff8000081af548 x7 : 0000000000000000 x6 : 0000000000000001
[ 2.617727] x5 : 0000000000000000 x4 : ffff000800640000 x3 : ffffd7c5d62b99c8
[ 2.617933] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
[ 2.618138] Call trace:
[ 2.618204] gpiochip_setup_dev (include/linux/err.h:41 include/linux/fwnode.h:201 drivers/gpio/gpiolib.c:586)
[ 2.618337] gpiochip_add_data_with_key (drivers/gpio/gpiolib.c:871)
[ 2.618493] devm_gpiochip_add_data_with_key (drivers/gpio/gpiolib-devres.c:478)
[ 2.618654] bgpio_pdev_probe (drivers/gpio/gpio-mmio.c:793)
[ 2.618785] platform_probe (drivers/base/platform.c:1401)
[ 2.618928] really_probe (drivers/base/dd.c:560 drivers/base/dd.c:639)
[ 2.619056] __driver_probe_device (drivers/base/dd.c:778)
[ 2.619193] driver_probe_device (drivers/base/dd.c:808)
[ 2.619329] __device_attach_driver (drivers/base/dd.c:937)
[ 2.619464] bus_for_each_drv (drivers/base/bus.c:427)
[ 2.619590] __device_attach (drivers/base/dd.c:1010)
[ 2.619722] device_initial_probe (drivers/base/dd.c:1058)
[ 2.619861] bus_probe_device (drivers/base/bus.c:489)
[ 2.619988] device_add (drivers/base/core.c:3637)
[ 2.620102] platform_device_add (drivers/base/platform.c:717)
[ 2.620251] mfd_add_device (drivers/mfd/mfd-core.c:297)
[ 2.620397] devm_mfd_add_devices (drivers/mfd/mfd-core.c:351 drivers/mfd/mfd-core.c:449)
[ 2.620548] vexpress_sysreg_probe (drivers/mfd/vexpress-sysreg.c:115)
[ 2.620672] platform_probe (drivers/base/platform.c:1401)
[ 2.620814] really_probe (drivers/base/dd.c:560 drivers/base/dd.c:639)
[ 2.620940] __driver_probe_device (drivers/base/dd.c:778)
[ 2.621080] driver_probe_device (drivers/base/dd.c:808)
[ 2.621216] __driver_attach (drivers/base/dd.c:1195)
[ 2.621344] bus_for_each_dev (drivers/base/bus.c:301)
[ 2.621467] driver_attach (drivers/base/dd.c:1212)
[ 2.621596] bus_add_driver (drivers/base/bus.c:618)
[ 2.621720] driver_register (drivers/base/driver.c:246)
[ 2.621859] __platform_driver_register (drivers/base/platform.c:868)
[ 2.622012] vexpress_sysreg_driver_init (drivers/mfd/vexpress-sysreg.c:134)
[ 2.622145] do_one_initcall (init/main.c:1306)
[ 2.622269] kernel_init_freeable (init/main.c:1378 init/main.c:1395 init/main.c:1414 init/main.c:1634)
[ 2.622394] kernel_init (init/main.c:1526)
[ 2.622531] ret_from_fork (arch/arm64/kernel/entry.S:864)
[ 2.622692] Code: 910003fd a90153f3 aa0003f3 f9414c00 (f9400801)
[ 2.622807] ---[ end trace 0000000000000000 ]---
[ 2.623043] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 2.623157] SMP: stopping secondary CPUs
[ 2.623303] Kernel Offset: 0x57c5cb400000 from 0xffff800008000000
[ 2.623413] PHYS_OFFSET: 0x80000000
[ 2.623492] CPU features: 0x00000,001439ff,cd3e772f
[ 2.623591] Memory Limit: none
[ 2.623679] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
ref:
- https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-devicetree_20230127001141_407071-1-saravanak_google_com/?results_layout=table&failures_only=false#!?details=#test-results
--
Linaro LKFT
https://lkft.linaro.org
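
The fault address of 0x10 together with x0 = 0 in the register dump is
consistent with a field being read through a NULL pointer in
gpiochip_setup_dev(), for example a gpio_device whose dev.fwnode is not set.
The report does not confirm the root cause, so the following is only a
userspace sketch of how the "clear the initialized flag" check from the
gpiolib patch could be guarded. The simplified structs, the flag value,
set_initialized() and clear_initialized_if_unowned() are assumptions of the
sketch, not the eventual fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_FLAG_INITIALIZED (1u << 2)	/* value assumed for this sketch */

struct model_device;

struct model_fwnode_handle {
	struct model_device *dev;	/* device that owns this fwnode, if any */
	unsigned int flags;
};

struct model_device {
	struct model_fwnode_handle *fwnode;	/* may legitimately be NULL */
};

/* Set or clear the initialized flag; a NULL fwnode is silently ignored. */
static void set_initialized(struct model_fwnode_handle *fwnode, bool initialized)
{
	if (!fwnode)
		return;

	if (initialized)
		fwnode->flags |= MODEL_FLAG_INITIALIZED;
	else
		fwnode->flags &= ~MODEL_FLAG_INITIALIZED;
}

/*
 * Clear the flag only when a fwnode exists and no other device owns it.
 * The NULL test must come before fwnode->dev is read.
 */
static void clear_initialized_if_unowned(struct model_device *dev)
{
	struct model_fwnode_handle *fwnode = dev->fwnode;

	if (fwnode && !fwnode->dev)
		set_initialized(fwnode, false);
}
```

The only difference from the logic quoted earlier in the thread is the
extra NULL test before the owner check.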
Hi Maxim & Maxim,
[email protected] wrote on Thu, 26 Jan 2023 16:11:27 -0800:
> This patch series improves fw_devlink in the following ways:
>
> 1. It no longer cares about a fwnode having a "compatible" property. It
> figures this out more dynamically. The only expectation is that
> fwnodes that are converted to devices actually get probed by a driver
> for the dependencies to be enforced correctly.
>
> 2. Finer grained dependency tracking. fw_devlink will now create device
> links from the consumer to the actual resource's device (if it has one,
> Eg: gpio_device) instead of the parent supplier device. This improves
> things like async suspend/resume ordering, potentially remove the need
> for frameworks to create device links, more parallelized async probing,
> and better sync_state() tracking.
>
> 3. Handle hardware/software quirks where a child firmware node gets
> populated as a device before its parent firmware node AND actually
> supplies a non-optional resource to the parent firmware node's
> device.
>
> 4. Way more robust at cycle handling (see patch for the insane cases).
>
> 5. Stops depending on OF_POPULATED to figure out some corner cases.
>
> 6. Simplifies the work that needs to be done by the firmware specific
> code.
>
> Sorry it took a while to roll in the fixes I gave in the v1 series
> thread[1] into a v2 series.
>
> Since I didn't make any additional changes on top of what I already gave
> in the v1 thread and Dmitry is very eager to get this series going, I'm
> sending it out without testing locally. I already tested these patches a
> few months ago as part of the v1 series. So I don't expect any major
> issues. I'll test them again on my end in the next few days and will
> report here if I actually find anything wrong.
>
> Tony, Naresh, Abel, Sudeep, Geert,
>
> I got the following reviewed by's and tested by's a few months back, but
> it's been 5 months since I sent out v1. So I wasn't sure if it was okay
> to include them in the v2 commits. Let me know if you are okay with this
> being included in the commits and/or if you want to test this series
> again.
>
> Reviewed-by: Tony Lindgren <[email protected]>
> Tested-by: Tony Lindgren <[email protected]>
> Tested-by: Linux Kernel Functional Testing <[email protected]>
> Tested-by: Naresh Kamboju <[email protected]>
> Tested-by: Abel Vesa <[email protected]>
> Tested-by: Sudeep Holla <[email protected]>
> Tested-by: Geert Uytterhoeven <[email protected]>
>
> Dmitry, Maxim(s), Miquel, Luca, Doug, Colin, Martin, Jean-Philippe,
>
> I've Cc-ed you because I had pointed you to v1 of this series + the
> patches in that thread at one point or another as a fix to some issue
> you were facing. I'd appreciate it if you can test this series and
> report any issues, or things it fixed and give Tested-bys.
Maxim & Maxim, I would really appreciate it if you could validate that the
original issue you had is solved with this version. I don't have any
hardware suffering from this issue.
> In addition, if you can also apply a revert of this series[2] and delete
> driver_deferred_probe_check_state() from your tree and see if you hit
> any issues and report them, that'd be great too! I'm pretty sure some of
> you will hit issues with that. I want to fix those next and then
> revert[2].
>
> Thanks,
> Saravana
>
> [1] - https://lore.kernel.org/lkml/[email protected]/
> [2] - https://lore.kernel.org/lkml/[email protected]/
> [3] - https://lore.kernel.org/lkml/CAGETcx-JUV1nj8wBJrTPfyvM7=Mre5j_vkVmZojeiumUGG6QZQ@mail.gmail.com/
>
> v1 -> v2:
> - Fixed Patch 1 to handle a corner case discussed in [3].
> - New patch 10 to handle "fsl,imx8mq-gpc" being initialized by 2 drivers.
> - New patch 11 to add fw_devlink support for SCMI devices.
>
> Cc: Abel Vesa <[email protected]>
> Cc: Alexander Stein <[email protected]>
> Cc: Tony Lindgren <[email protected]>
> Cc: Sudeep Holla <[email protected]>
> Cc: Geert Uytterhoeven <[email protected]>
> Cc: John Stultz <[email protected]>
> Cc: Doug Anderson <[email protected]>
> Cc: Guenter Roeck <[email protected]>
> Cc: Dmitry Baryshkov <[email protected]>
> Cc: Maxim Kiselev <[email protected]>
> Cc: Maxim Kochetkov <[email protected]>
> Cc: Miquel Raynal <[email protected]>
> Cc: Luca Weiss <[email protected]>
> Cc: Colin Foster <[email protected]>
> Cc: Martin Kepplinger <[email protected]>
> Cc: Jean-Philippe Brucker <[email protected]>
>
> Saravana Kannan (11):
> driver core: fw_devlink: Don't purge child fwnode's consumer links
> driver core: fw_devlink: Improve check for fwnode with no
> device/driver
> soc: renesas: Move away from using OF_POPULATED for fw_devlink
> gpiolib: Clear the gpio_device's fwnode initialized flag before adding
> driver core: fw_devlink: Add DL_FLAG_CYCLE support to device links
> driver core: fw_devlink: Allow marking a fwnode link as being part of
> a cycle
> driver core: fw_devlink: Consolidate device link flag computation
> driver core: fw_devlink: Make cycle detection more robust
> of: property: Simplify of_link_to_phandle()
> irqchip/irq-imx-gpcv2: Mark fwnode device as not initialized
> firmware: arm_scmi: Set fwnode for the scmi_device
>
> drivers/base/core.c | 443 +++++++++++++++++++++-----------
> drivers/firmware/arm_scmi/bus.c | 2 +
> drivers/gpio/gpiolib.c | 6 +
> drivers/irqchip/irq-imx-gpcv2.c | 1 +
> drivers/of/property.c | 84 +-----
> drivers/soc/imx/gpcv2.c | 1 +
> drivers/soc/renesas/rcar-sysc.c | 2 +-
> include/linux/device.h | 1 +
> include/linux/fwnode.h | 12 +-
> 9 files changed, 332 insertions(+), 220 deletions(-)
>
Thanks,
Miquèl
On Mon, 30 Jan 2023 08:55:42 +0000,
Naresh Kamboju <[email protected]> wrote:
>
> Build test pass on arm, arm64, i386, mips, parisc, powerpc, riscv, s390, sh,
> sparc and x86_64.
>
> Boot and LTP smoke pass on qemu-arm64, qemu-armv7, qemu-i386 and qemu-x86_64.
> Boot failed on FVP.
>
> Reported-by: Linux Kernel Functional Testing <[email protected]>
>
> Please refer following link for details of testing.
> FVP boot log failed.
> https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-devicetree_20230127001141_407071-1-saravanak_google_com/testrun/14389034/suite/boot/test/gcc-12-lkftconfig-64k_page_size/details/
>
>
> [ 2.613437] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
> [ 2.613628] Mem abort info:
> [ 2.613756] ESR = 0x0000000096000005
> [ 2.613904] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 2.614071] SET = 0, FnV = 0
> [ 2.614215] EA = 0, S1PTW = 0
> [ 2.614358] FSC = 0x05: level 1 translation fault
> [ 2.614517] Data abort info:
> [ 2.614647] ISV = 0, ISS = 0x00000005
> [ 2.614792] CM = 0, WnR = 0
> [ 2.614934] [0000000000000010] user address but active_mm is swapper
> [ 2.615105] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
> [ 2.615219] Modules linked in:
> [ 2.615310] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc5 #1
> [ 2.615445] Hardware name: FVP Base RevC (DT)
> [ 2.615533] pstate: 61400009 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> [ 2.615685] pc : gpiochip_setup_dev (include/linux/err.h:41 include/linux/fwnode.h:201 drivers/gpio/gpiolib.c:586)
> [ 2.615816] lr : gpiochip_add_data_with_key (drivers/gpio/gpiolib.c:871)
> [ 2.615970] sp : ffff8000081af5e0
> [ 2.616051] x29: ffff8000081af5e0 x28: 0000000000000000 x27: ffff0008027cb5a0
> [ 2.616261] x26: 0000000000000000 x25: ffffd7c5d6745910 x24: ffff0008027f4800
> [ 2.616472] x23: 0000000000000000 x22: ffffd7c5d62b99a8 x21: 0000000000000202
> [ 2.616679] x20: 0000000000000000 x19: ffff0008027f4800 x18: ffffffffffffffff
> [ 2.616890] x17: ffffd7c5d6467928 x16: 0000000013e3690a x15: ffff8000081af3b0
> [ 2.617102] x14: ffff00080275cd8a x13: ffff00080275cd88 x12: 0000000000000001
> [ 2.617312] x11: 62726568746f6d3a x10: 0000000000000000 x9 : ffffd7c5d3b3ebe0
> [ 2.617522] x8 : ffff8000081af548 x7 : 0000000000000000 x6 : 0000000000000001
> [ 2.617727] x5 : 0000000000000000 x4 : ffff000800640000 x3 : ffffd7c5d62b99c8
> [ 2.617933] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
> [ 2.618138] Call trace:
> [ 2.618204] gpiochip_setup_dev (include/linux/err.h:41 include/linux/fwnode.h:201 drivers/gpio/gpiolib.c:586)
> [ 2.618337] gpiochip_add_data_with_key (drivers/gpio/gpiolib.c:871)
> [ 2.618493] devm_gpiochip_add_data_with_key (drivers/gpio/gpiolib-devres.c:478)
> [ 2.618654] bgpio_pdev_probe (drivers/gpio/gpio-mmio.c:793)
> [ 2.618785] platform_probe (drivers/base/platform.c:1401)
> [ 2.618928] really_probe (drivers/base/dd.c:560 drivers/base/dd.c:639)
> [ 2.619056] __driver_probe_device (drivers/base/dd.c:778)
> [ 2.619193] driver_probe_device (drivers/base/dd.c:808)
> [ 2.619329] __device_attach_driver (drivers/base/dd.c:937)
> [ 2.619464] bus_for_each_drv (drivers/base/bus.c:427)
> [ 2.619590] __device_attach (drivers/base/dd.c:1010)
> [ 2.619722] device_initial_probe (drivers/base/dd.c:1058)
> [ 2.619861] bus_probe_device (drivers/base/bus.c:489)
> [ 2.619988] device_add (drivers/base/core.c:3637)
> [ 2.620102] platform_device_add (drivers/base/platform.c:717)
> [ 2.620251] mfd_add_device (drivers/mfd/mfd-core.c:297)
> [ 2.620397] devm_mfd_add_devices (drivers/mfd/mfd-core.c:351 drivers/mfd/mfd-core.c:449)
> [ 2.620548] vexpress_sysreg_probe (drivers/mfd/vexpress-sysreg.c:115)
> [ 2.620672] platform_probe (drivers/base/platform.c:1401)
> [ 2.620814] really_probe (drivers/base/dd.c:560 drivers/base/dd.c:639)
> [ 2.620940] __driver_probe_device (drivers/base/dd.c:778)
> [ 2.621080] driver_probe_device (drivers/base/dd.c:808)
> [ 2.621216] __driver_attach (drivers/base/dd.c:1195)
> [ 2.621344] bus_for_each_dev (drivers/base/bus.c:301)
> [ 2.621467] driver_attach (drivers/base/dd.c:1212)
> [ 2.621596] bus_add_driver (drivers/base/bus.c:618)
> [ 2.621720] driver_register (drivers/base/driver.c:246)
> [ 2.621859] __platform_driver_register (drivers/base/platform.c:868)
> [ 2.622012] vexpress_sysreg_driver_init (drivers/mfd/vexpress-sysreg.c:134)
> [ 2.622145] do_one_initcall (init/main.c:1306)
> [ 2.622269] kernel_init_freeable (init/main.c:1378 init/main.c:1395 init/main.c:1414 init/main.c:1634)
> [ 2.622394] kernel_init (init/main.c:1526)
> [ 2.622531] ret_from_fork (arch/arm64/kernel/entry.S:864)
> [ 2.622692] Code: 910003fd a90153f3 aa0003f3 f9414c00 (f9400801)
> All code
> ========
> 0:* fd std <-- trapping instruction
> 1: 03 00 add (%rax),%eax
> 3: 91 xchg %eax,%ecx
> 4: f3 53 repz push %rbx
> 6: 01 a9 f3 03 00 aa add %ebp,-0x55fffc0d(%rcx)
> c: 00 4c 41 f9 add %cl,-0x7(%rcx,%rax,2)
> 10: 01 08 add %ecx,(%rax)
> 12: 40 f9 rex stc
Could you please fix your scripts so that they report something that
matches the tested architecture? I like x86 asm as much as the next
guy, but this is an arm64 crash... :-/
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
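For reference, the bracketed word in the "Code:" line can be decoded by hand as a 32-bit A64 instruction instead of a stream of x86 bytes. Below is a minimal sketch, assuming the standard A64 encoding of LDR (immediate, unsigned offset, 64-bit variant). The helper and struct names are made up for illustration and are not part of any kernel script.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded fields of an A64 "LDR Xt, [Xn, #imm]" (unsigned offset, 64-bit). */
struct ldr_x_uimm {
	unsigned int rt;	/* destination register number */
	unsigned int rn;	/* base register number */
	unsigned int offset;	/* byte offset: imm12 scaled by 8 */
};

/*
 * Hypothetical helper: return true and fill @out if @insn is a 64-bit
 * LDR (immediate, unsigned offset). The top ten bits of that encoding
 * are fixed at 0b1111100101, hence the mask/value pair below.
 */
static bool decode_ldr_x_uimm(uint32_t insn, struct ldr_x_uimm *out)
{
	if ((insn & 0xffc00000u) != 0xf9400000u)
		return false;

	out->rt = insn & 0x1f;
	out->rn = (insn >> 5) & 0x1f;
	out->offset = ((insn >> 10) & 0xfff) * 8;
	return true;
}
```

Feeding it the trapping word 0xf9400801 gives rt = 1, rn = 0, offset = 16, that is `ldr x1, [x0, #16]`. That lines up with x0 being 0 in the register dump and with the faulting address 0000000000000010.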
On Fri, Jan 27, 2023 at 11:33:28PM -0800, Saravana Kannan wrote:
> On Fri, Jan 27, 2023 at 1:22 AM Andy Shevchenko
> <[email protected]> wrote:
> > On Thu, Jan 26, 2023 at 04:11:28PM -0800, Saravana Kannan wrote:
...
> > > static unsigned int defer_sync_state_count = 1;
> > > static DEFINE_MUTEX(fwnode_link_lock);
> > > static bool fw_devlink_is_permissive(void);
> > > +static void __fw_devlink_link_to_consumers(struct device *dev);
> > > static bool fw_devlink_drv_reg_done;
> > > static bool fw_devlink_best_effort;
> >
> > I'm wondering if we may avoid adding more forward declarations...
> >
> > Perhaps it's a sign that devlink code should be split to its own
> > module?
>
> I've thought about that before, but I'm not there yet. Maybe once my
> remaining refactors and TODOs are done, it'd be a good time to revisit
> this question.
>
> But I don't think it should be done for the reason of forward
> declaration as we'd just end up moving these into base.h and we can do
> that even today.
What I meant is that the growing stack of forward declarations is a good sign
that something has to be done sooner rather than later.
...
> > > -int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
> > > +static int __fwnode_link_add(struct fwnode_handle *con,
> > > + struct fwnode_handle *sup)
> >
> > I believe we tolerate a bit longer lines, so you may still have it on a single
> > line.
>
> That'd make it >80 cols. I'm going to leave it as is.
Is it a problem?
...
> > > if (dev->fwnode && dev->fwnode->dev == dev) {
> >
> > You may have above something like
> >
> > fwnode = dev_fwnode(dev);
>
> I'll leave it as-is for now. I see dev->fwnode vs dev_fwnode() don't
> always give the same results. I need to re-examine other places I use
> dev->fwnode in fw_devlink code before I start using that function. But
> in general it seems like a good idea. I'll add this to my TODOs.
Please do. The rationale is to move the fwnode to the proper layer: now that we
have the singly linked list defined in struct fwnode_handle, dereferencing
fwnode from struct device without a helper will cause a lot of headaches in
the future. So I would really like to see us stop doing that.
> > if (fwnode && fwnode->dev == dev) {
> >
> > > struct fwnode_handle *child;
> > > fwnode_links_purge_suppliers(dev->fwnode);
> > > + mutex_lock(&fwnode_link_lock);
> > > fwnode_for_each_available_child_node(dev->fwnode, child)
> > > - fw_devlink_purge_absent_suppliers(child);
> > > + __fw_devlink_pickup_dangling_consumers(child,
> > > + dev->fwnode);
> >
> > __fw_devlink_pickup_dangling_consumers(child, fwnode);
>
> I like the dev->fwnode->dev == dev check. It makes it super clear that
> I'm checking "The device's fwnode points back to the device". If I
> just use fwnode->dev == dev, then one will have to go back and read
> what fwnode is set to, etc. Also, when reading all these function
> calls it's easier to see that I'm working on the dev's fwnode (where
> dev is the device that was just bound to a driver) instead of some
> other fwnode.
>
> So I find it more readable as is and the compiler would optimize it
> anyway. If you feel strongly about this, I can change to use fwnode
> instead of dev->fwnode.
Please, read above.
--
With Best Regards,
Andy Shevchenko
On Fri, Jan 27, 2023 at 11:33:38PM -0800, Saravana Kannan wrote:
> On Fri, Jan 27, 2023 at 1:27 AM Andy Shevchenko
> <[email protected]> wrote:
> > On Thu, Jan 26, 2023 at 04:11:31PM -0800, Saravana Kannan wrote:
...
> > > + /*
> > > + * If fwnode doesn't belong to another device, it's safe to clear its
> > > + * initialized flag.
> > > + */
> > > + if (!gdev->dev.fwnode->dev)
> > > + fwnode_dev_initialized(gdev->dev.fwnode, false);
> >
> > Do not dereference fwnode in struct device. Use dev_fwnode() for that.
> >
> > struct fwnode_handle *fwnode = dev_fwnode(&gdev->dev);
> >
> > if (!fwnode->dev)
> > fwnode_dev_initialized(fwnode, false);
>
> Honestly, we should work towards NOT needing dev_fwnode().
Honestly, where we SHOULD go is to avoid any direct dereference of fwnode from
struct device. I explained this in my comment on the other patch.
> The
> function literally dereferences dev->fwnode or the one inside of_node.
> So my dereference is fine. The whole "fwnode might not be set for
> devices with of_node" is wrong and we should fix that instead of
> writing wrappers to work around it.
>
> Also, for now I'm going to leave this as is for the same reasons as I
> mentioned in Patch 1.
Same.
--
With Best Regards,
Andy Shevchenko
Hi Saravana & Miquel.
Sorry for the late response. I finally got access to my test device
and tried this patch series.
Unfortunately, it didn't solve my issue. The f1070000.ethernet device is
still stuck waiting on its dependency on the nvmem-cell mac@6 subnode.
Here are the related parts of my kernel log and device tree:
[ 2.713302] device: 'mtd-0': device_add
[ 2.719528] device: 'spi0': device_add
[ 2.724180] device: 'spi0.0': device_add
[ 2.728957] spi-nor spi0.0: mx66l51235f (65536 Kbytes)
[ 2.735338] 7 fixed-partitions partitions found on MTD device spi0.0
[ 2.741978] device: 'f1010600.spi:m25p80@0:partitions:partition@1': device_add
[ 2.749636] Creating 7 MTD partitions on "spi0.0":
[ 2.754564] 0x000000000000-0x000000080000 : "SPI.U_BOOT"
[ 2.759981] device: 'mtd0': device_add
[ 2.764323] device: 'mtd0': device_add
[ 2.768280] device: 'mtd0ro': device_add
[ 2.772624] 0x0000000a0000-0x0000000c0000 : "SPI.INV_INFO"
[ 2.778218] device: 'mtd1': device_add
[ 2.782549] device: 'mtd1': device_add
[ 2.786582] device: 'mtd1ro': device_add
...
[ 5.426625] mvneta_bm f10c0000.bm: Buffer Manager for network controller enabled
[ 5.492867] platform f1070000.ethernet: error -EPROBE_DEFER: wait for supplier mac@6
[ 5.528636] device: 'Fixed MDIO bus.0': device_add
[ 5.533726] device: 'fixed-0': device_add
[ 5.547564] device: 'f1072004.mdio-eth-mii': device_add
[ 5.616368] device: 'f1072004.mdio-eth-mii:00': device_add
[ 5.645127] device: 'f1072004.mdio-eth-mii:1e': device_add
[ 5.651530] devices_kset: Moving f1070000.ethernet to end of list
[ 5.657948] platform f1070000.ethernet: error -EPROBE_DEFER: wait for supplier mac@6
spi@10600 {
    m25p80@0 {
        compatible = "mx66l51235l";
        partitions {
            compatible = "fixed-partitions";
            partition@0 {
                label = "SPI.U_BOOT";
            };
            partition@1 {
                compatible = "nvmem-cells";
                label = "SPI.INV_INFO";
                macaddr: mac@6 {
                    reg = <0x6 0x6>;
                };
            };
            ...
        };
    };
};
enet1: ethernet@70000 {
    nvmem-cells = <&macaddr>;
    nvmem-cell-names = "mac-address";
    phy-mode = "rgmii";
    phy = <&phy0>;
};
Maybe I should provide some additional debug info?
On Mon, 30 Jan 2023 at 13:48, Miquel Raynal <[email protected]> wrote:
>
> Hi Maxim & Maxim,
>
> ...
>
> Maxim & Maxim I would really appreciate if you could validate that the
> original issue you had is solved with this version? I don't have any
> hardware suffering from this issue.
>
> Thanks,
> Miquèl
On Fri, Jan 27, 2023 at 11:34:11PM -0800, Saravana Kannan wrote:
> On Fri, Jan 27, 2023 at 1:30 AM Andy Shevchenko
> <[email protected]> wrote:
> > On Thu, Jan 26, 2023 at 04:11:32PM -0800, Saravana Kannan wrote:
...
> > > DL_FLAG_AUTOREMOVE_SUPPLIER | \
> > > DL_FLAG_AUTOPROBE_CONSUMER | \
> > > DL_FLAG_SYNC_STATE_ONLY | \
> > > - DL_FLAG_INFERRED)
> > > + DL_FLAG_INFERRED | \
> > > + DL_FLAG_CYCLE)
> >
> > You can make less churn by squeezing the new one above the last one.
>
> I feel like this part is getting bike shedded. I'm going to leave it
> as is. It's done in the order it's defined in the header and keeping
> it that way makes it way easier to read than worrying about a single
> line of churn.
Not at all. What you are showing here is additional churn for the sake of
churn. With a no-cost trick you can avoid it.
Also, it will compress better in the Git index and save some mW here and there,
and given the size of the world and the number of Git copies of the Linux
kernel, this may save the planet (I'm serious, no jokes).
--
With Best Regards,
Andy Shevchenko
On Fri, Jan 27, 2023 at 11:34:19PM -0800, Saravana Kannan wrote:
> On Fri, Jan 27, 2023 at 1:33 AM Andy Shevchenko
> <[email protected]> wrote:
> > On Thu, Jan 26, 2023 at 04:11:33PM -0800, Saravana Kannan wrote:
...
> > > - if (dev->fwnode && !list_empty(&dev->fwnode->suppliers) &&
> > > - !fw_devlink_is_permissive()) {
> > > - sup_fw = list_first_entry(&dev->fwnode->suppliers,
> > > - struct fwnode_link,
> > > - c_hook)->supplier;
> > > + sup_fw = fwnode_links_check_suppliers(dev->fwnode);
> >
> > dev_fwnode() ?
> >
> > ...
> >
> > > - val = !list_empty(&dev->fwnode->suppliers);
> > > + mutex_lock(&fwnode_link_lock);
> > > + val = !!fwnode_links_check_suppliers(dev->fwnode);
> >
> > Ditto?
>
> Similar response as Patch 1 and Patch 4.
Same.
--
With Best Regards,
Andy Shevchenko
On Fri, Jan 27, 2023 at 11:34:28PM -0800, Saravana Kannan wrote:
> On Fri, Jan 27, 2023 at 1:43 AM Andy Shevchenko
> <[email protected]> wrote:
> > On Thu, Jan 26, 2023 at 04:11:35PM -0800, Saravana Kannan wrote:
...
> Lol, you are the king of nit picks :)
Not always, only when it comes with something else.
...
> > > + * Return true if one or more cycles were found. Otherwise, return false.
> >
> > Return:
>
> I'm following the rest of the function docs in this file.
Okay, it makes sense. We will need to address them all.
> > (you may run `kernel-doc -v ...` to see all warnings)
>
> Hmmm I ran it on the patch file and it didn't give me anything useful.
> Running it on the whole file is just a lot of lines to dig through.
The function description is missing a Return section. That is the kind of
warning I get from kernel-doc without it.
...
> > > +static bool __fw_devlink_relax_cycles(struct device *con,
> > > + struct fwnode_handle *sup_handle)
> > > +{
> > > + struct fwnode_link *link;
> > > + struct device_link *dev_link;
> >
> > > + struct device *sup_dev = NULL, *par_dev = NULL;
> >
> > You can put it the first line since it's long enough.
>
> Wait, is that a style guideline to have the longer lines first?
No, but it's easier to read.
> > But why do you need sup_dev assignment?
>
> Defensive programming I suppose. I can see this function being
> refactored in the future where a goto out; is inserted before sup_dev
> is assigned. And then the put_device(sup_dev) at "out" will end up
> operating on some junk value and causing memory corruption.
We don't need to be defensive here. Moreover, it's easier to make a mistake
with predefined values (it's actually better NOT to initialize anything, or at
least to initialize as close to the user as possible, so you won't miss the
compiler warnings; I believe you run your compiler with `make W=1 ...`, and
if not, I highly recommend getting into this habit).
...
> > > + sup_dev = get_dev_from_fwnode(sup_handle);
> > > +
> >
> > I would put it closer to the condition:
> >
> > > + /* Termination condition. */
> > > + if (sup_dev == con) {
> >
> > /* Get supplier device and check for termination condition */
> > sup_dev = get_dev_from_fwnode(sup_handle);
> > if (sup_dev == con) {
>
> I put it the way it is because sup_dev is used for more than just
> checking for termination condition.
Yes, but still it's better to see what you are actually checking.
> > > + ret = true;
> > > + goto out;
> > > + }
...
> > > + if (__fw_devlink_relax_cycles(con,
> > > + dev_link->supplier->fwnode)) {
> >
> > Keep it on one line.
>
> It'll make it > 80. Is this some recent change about allowing > 80
> cols? I'm leaving it as is for now.
Is it a problem? Do you have any tool that complains about it?
--
With Best Regards,
Andy Shevchenko
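The trade-off around pre-initializing sup_dev to NULL can be shown in a small user-space sketch. Everything below is a hypothetical stand-in (mock struct device, mock put_device() and get_dev(), made-up function names), not the code from the patch. The only kernel behaviour it relies on is that put_device() tolerates a NULL argument.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel type; illustration only. */
struct device { int refcount; };

static int put_calls;

/* Mock of put_device(): like the kernel helper, it tolerates NULL. */
static void put_device(struct device *dev)
{
	put_calls++;
	if (dev)
		dev->refcount--;
}

/* Mock lookup: takes a reference on the device it returns. */
static struct device *get_dev(struct device *dev)
{
	if (dev)
		dev->refcount++;
	return dev;
}

/*
 * Variant A: pointer pre-initialized to NULL. An early "goto out" is safe
 * at run time, but the compiler can no longer warn if a later refactor
 * forgets to assign sup_dev before it is used.
 */
static int relax_defensive(struct device *candidate, int bail_early)
{
	struct device *sup_dev = NULL;
	int ret = 0;

	if (bail_early)
		goto out;

	sup_dev = get_dev(candidate);
	ret = 1;
out:
	put_device(sup_dev);
	return ret;
}

/*
 * Variant B: pointer assigned right next to its first use. The early exit
 * returns before any reference is taken, so it needs no cleanup, and an
 * accidental use before assignment stays visible to the compiler.
 */
static int relax_late_init(struct device *candidate, int bail_early)
{
	struct device *sup_dev;

	if (bail_early)
		return 0;

	sup_dev = get_dev(candidate);
	put_device(sup_dev);
	return 1;
}
```

Both variants leave the reference count balanced. The difference is only in which mistakes the compiler can still catch after a refactor.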
On Thu, Jan 26, 2023 at 04:11:31PM -0800, Saravana Kannan wrote:
> Registering an irqdomain sets the flag for the fwnode. But having the
> flag set when a device is added is interpreted by fw_devlink to mean the
> device has already been initialized and will never probe. This prevents
> fw_devlink from creating device links with the gpio_device as a
> supplier. So, clear the flag before adding the device.
>
> Signed-off-by: Saravana Kannan <[email protected]>
> Acked-by: Bartosz Golaszewski <[email protected]>
> ---
> drivers/gpio/gpiolib.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
> index 939c776b9488..b23140c6485f 100644
> --- a/drivers/gpio/gpiolib.c
> +++ b/drivers/gpio/gpiolib.c
> @@ -578,6 +578,12 @@ static int gpiochip_setup_dev(struct gpio_device *gdev)
> {
> int ret;
>
> + /*
> + * If fwnode doesn't belong to another device, it's safe to clear its
> + * initialized flag.
> + */
> + if (!gdev->dev.fwnode->dev)
> + fwnode_dev_initialized(gdev->dev.fwnode, false);
This is the one causing the kernel crash during the boot on FVP which
Naresh has reported. Just reverted this and was able to boot, confirming
the issue with this patch.
--
Regards,
Sudeep
Hi Andy,
On Mon, Jan 30, 2023 at 1:16 PM Andy Shevchenko
<[email protected]> wrote:
> On Fri, Jan 27, 2023 at 11:34:28PM -0800, Saravana Kannan wrote:
> > On Fri, Jan 27, 2023 at 1:43 AM Andy Shevchenko
> > <[email protected]> wrote:
> > > On Thu, Jan 26, 2023 at 04:11:35PM -0800, Saravana Kannan wrote:
> > > > +static bool __fw_devlink_relax_cycles(struct device *con,
> > > > + struct fwnode_handle *sup_handle)
> > > > +{
> > > > + struct fwnode_link *link;
> > > > + struct device_link *dev_link;
> > >
> > > > + struct device *sup_dev = NULL, *par_dev = NULL;
> > >
> > > You can put it the first line since it's long enough.
> >
> > Wait, is that a style guideline to have the longer lines first?
>
> No, but it's easier to read.
Yes it is, "reverse xmas tree" local variable ordering:
https://elixir.bootlin.com/linux/v6.2-rc6/source/Documentation/process/maintainer-netdev.rst#L272
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
On Mon, Jan 30, 2023 at 02:31:53PM +0000, Sudeep Holla wrote:
> On Thu, Jan 26, 2023 at 04:11:31PM -0800, Saravana Kannan wrote:
> > Registering an irqdomain sets the flag for the fwnode. But having the
> > flag set when a device is added is interpreted by fw_devlink to mean the
> > device has already been initialized and will never probe. This prevents
> > fw_devlink from creating device links with the gpio_device as a
> > supplier. So, clear the flag before adding the device.
...
> > + /*
> > + * If fwnode doesn't belong to another device, it's safe to clear its
> > + * initialized flag.
> > + */
> > + if (!gdev->dev.fwnode->dev)
> > + fwnode_dev_initialized(gdev->dev.fwnode, false);
>
> This is the one causing the kernel crash during the boot on FVP which
> Naresh has reported. Just reverted this and was able to boot, confirming
> the issue with this patch.
I'm wondering if
	if (!dev_fwnode(&gdev->dev)->dev)
		fwnode_dev_initialized(dev_fwnode(&gdev->dev), false);
works.
--
With Best Regards,
Andy Shevchenko
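The reported oops is consistent with the gpio_device having no fwnode at all on FVP. If that is what happens, any variant of the check that dereferences the fwnode pointer unconditionally would fault the same way. Below is a minimal sketch of a NULL-tolerant version of the check, using trimmed-down mock types rather than the real kernel headers. The mock accessor, the boolean flag and the function name are assumptions for illustration, and whether this is the right fix for the series is for the author to decide.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down mocks of the kernel structures. */
struct device;

struct fwnode_handle {
	struct fwnode_handle *secondary;
	const void *ops;
	struct device *dev;	/* 16 bytes in on an LP64 target */
	bool initialized;	/* stands in for FWNODE_FLAG_INITIALIZED */
};

struct device {
	struct fwnode_handle *fwnode;	/* may be NULL */
};

/* Mock accessor; the real dev_fwnode() also considers dev->of_node. */
static struct fwnode_handle *mock_dev_fwnode(const struct device *dev)
{
	return dev->fwnode;
}

/*
 * Clear the "initialized" marker only when a fwnode exists and no other
 * device owns it. Returns true if the marker was cleared.
 */
static bool clear_initialized_if_unowned(struct device *dev)
{
	struct fwnode_handle *fwnode = mock_dev_fwnode(dev);

	if (!fwnode || fwnode->dev)
		return false;

	fwnode->initialized = false;
	return true;
}
```

In this mock layout the `dev` member sits at offset 16 on a 64-bit build, which would match a read at address 0x10 through a NULL fwnode pointer.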
On Mon, Jan 30, 2023 at 03:36:04PM +0100, Geert Uytterhoeven wrote:
> On Mon, Jan 30, 2023 at 1:16 PM Andy Shevchenko
> <[email protected]> wrote:
> > On Fri, Jan 27, 2023 at 11:34:28PM -0800, Saravana Kannan wrote:
> > > On Fri, Jan 27, 2023 at 1:43 AM Andy Shevchenko
> > > <[email protected]> wrote:
> > > > On Thu, Jan 26, 2023 at 04:11:35PM -0800, Saravana Kannan wrote:
> > > > > +static bool __fw_devlink_relax_cycles(struct device *con,
> > > > > + struct fwnode_handle *sup_handle)
> > > > > +{
> > > > > + struct fwnode_link *link;
> > > > > + struct device_link *dev_link;
> > > >
> > > > > + struct device *sup_dev = NULL, *par_dev = NULL;
> > > >
> > > > You can put it the first line since it's long enough.
> > >
> > > Wait, is that a style guideline to have the longer lines first?
> >
> > No, but it's easier to read.
>
> Yes it is, "reverse xmas tree" local variable ordering:
> https://elixir.bootlin.com/linux/v6.2-rc6/source/Documentation/process/maintainer-netdev.rst#L272
Good to know, thanks!
--
With Best Regards,
Andy Shevchenko
Hi Saravana,
On Thu, Jan 26, 2023 at 04:11:36PM -0800, Saravana Kannan wrote:
> The driver core now:
> - Has the parent device of a supplier pick up the consumers if the
> supplier never has a device created for it.
> - Ignores a supplier if the supplier has no parent device and will never
> be probed by a driver
>
> And already prevents creating a device link with the consumer as a
> supplier of a parent.
>
> So, we no longer need to find the "compatible" node of the supplier or
> do any other checks in of_link_to_phandle(). We simply need to make sure
> that the supplier is available in DT.
>
> Signed-off-by: Saravana Kannan <[email protected]>
> ---
> drivers/of/property.c | 84 +++++++------------------------------------
> 1 file changed, 13 insertions(+), 71 deletions(-)
>
> diff --git a/drivers/of/property.c b/drivers/of/property.c
> index 134cfc980b70..c651aad6f34b 100644
> --- a/drivers/of/property.c
> +++ b/drivers/of/property.c
> @@ -1062,20 +1062,6 @@ of_fwnode_device_get_match_data(const struct fwnode_handle *fwnode,
> return of_device_get_match_data(dev);
> }
>
> -static bool of_is_ancestor_of(struct device_node *test_ancestor,
> - struct device_node *child)
> -{
> - of_node_get(child);
> - while (child) {
> - if (child == test_ancestor) {
> - of_node_put(child);
> - return true;
> - }
> - child = of_get_next_parent(child);
> - }
> - return false;
> -}
> -
> static struct device_node *of_get_compat_node(struct device_node *np)
> {
> of_node_get(np);
> @@ -1106,71 +1092,27 @@ static struct device_node *of_get_compat_node_parent(struct device_node *np)
> return node;
> }
>
> -/**
> - * of_link_to_phandle - Add fwnode link to supplier from supplier phandle
> - * @con_np: consumer device tree node
> - * @sup_np: supplier device tree node
> - *
> - * Given a phandle to a supplier device tree node (@sup_np), this function
> - * finds the device that owns the supplier device tree node and creates a
> - * device link from @dev consumer device to the supplier device. This function
> - * doesn't create device links for invalid scenarios such as trying to create a
> - * link with a parent device as the consumer of its child device. In such
> - * cases, it returns an error.
> - *
> - * Returns:
> - * - 0 if fwnode link successfully created to supplier
> - * - -EINVAL if the supplier link is invalid and should not be created
> - * - -ENODEV if struct device will never be create for supplier
> - */
> -static int of_link_to_phandle(struct device_node *con_np,
> +static void of_link_to_phandle(struct device_node *con_np,
> struct device_node *sup_np)
> {
> - struct device *sup_dev;
> - struct device_node *tmp_np = sup_np;
> + struct device_node *tmp_np = of_node_get(sup_np);
>
> - /*
> - * Find the device node that contains the supplier phandle. It may be
> - * @sup_np or it may be an ancestor of @sup_np.
> - */
> - sup_np = of_get_compat_node(sup_np);
> - if (!sup_np) {
> - pr_debug("Not linking %pOFP to %pOFP - No device\n",
> - con_np, tmp_np);
> - return -ENODEV;
> - }
> + /* Check that sup_np and its ancestors are available. */
> + while (tmp_np) {
> + if (of_fwnode_handle(tmp_np)->dev) {
> + of_node_put(tmp_np);
> + break;
> + }
>
> - /*
> - * Don't allow linking a device node as a consumer of one of its
> - * descendant nodes. By definition, a child node can't be a functional
> - * dependency for the parent node.
> - */
> - if (of_is_ancestor_of(con_np, sup_np)) {
> - pr_debug("Not linking %pOFP to %pOFP - is descendant\n",
> - con_np, sup_np);
> - of_node_put(sup_np);
> - return -EINVAL;
> - }
> + if (!of_device_is_available(tmp_np)) {
> + of_node_put(tmp_np);
> + return;
> + }
>
> - /*
> - * Don't create links to "early devices" that won't have struct devices
> - * created for them.
> - */
> - sup_dev = get_dev_from_fwnode(&sup_np->fwnode);
> - if (!sup_dev &&
> - (of_node_check_flag(sup_np, OF_POPULATED) ||
> - sup_np->fwnode.flags & FWNODE_FLAG_NOT_DEVICE)) {
> - pr_debug("Not linking %pOFP to %pOFP - No struct device\n",
> - con_np, sup_np);
> - of_node_put(sup_np);
> - return -ENODEV;
> + tmp_np = of_get_next_parent(tmp_np);
> }
> - put_device(sup_dev);
>
> fwnode_link_add(of_fwnode_handle(con_np), of_fwnode_handle(sup_np));
fwnode_link_add() returns int. Why is the return type of this function
changed to void?
> - of_node_put(sup_np);
> -
> - return 0;
> }
>
> /**
--
Regards,
Sakari Ailus
On Mon, Jan 30, 2023 at 12:43 AM Geert Uytterhoeven
<[email protected]> wrote:
>
> Hi Saravana,
>
> On Sat, Jan 28, 2023 at 8:19 AM Saravana Kannan <[email protected]> wrote:
> > On Fri, Jan 27, 2023 at 12:11 AM Geert Uytterhoeven
> > <[email protected]> wrote:
> > > On Fri, Jan 27, 2023 at 1:11 AM Saravana Kannan <[email protected]> wrote:
> > > > The OF_POPULATED flag was set to let fw_devlink know that the device
> > > > tree node will not have a struct device created for it. This information
> > > > is used by fw_devlink to avoid deferring the probe of consumers of this
> > > > device tree node.
> > > >
> > > > Let's use fwnode_dev_initialized() instead because it achieves the same
> > > > effect without using OF specific flags. This allows more generic code to
> > > > be written in driver core.
> > > >
> > > > Signed-off-by: Saravana Kannan <[email protected]>
> > >
> > > Thanks for your patch!
> > >
> > > > --- a/drivers/soc/renesas/rcar-sysc.c
> > > > +++ b/drivers/soc/renesas/rcar-sysc.c
> > > > @@ -437,7 +437,7 @@ static int __init rcar_sysc_pd_init(void)
> > > >
> > > > error = of_genpd_add_provider_onecell(np, &domains->onecell_data);
> > > > if (!error)
> > > > - of_node_set_flag(np, OF_POPULATED);
> > > > + fwnode_dev_initialized(&np->fwnode, true);
> > >
> > > As drivers/soc/renesas/rmobile-sysc.c is already using this method,
> > > it should work fine.
> > >
> > > Reviewed-by: Geert Uytterhoeven <[email protected]>
> > > i.e. will queue in renesas-devel for v6.4.
I hope you meant queue it up for 6.3 and not 6.4?
> >
> > Thanks! Does that mean I should drop this from this series? If two
> > maintainers pick the same patch up, will it cause problems? I'm
> > eventually expecting this series to be picked up by Greg into
> > driver-core-next.
>
> Indeed. Patches for drivers/soc/renesas/ are supposed to go upstream
> through the renesas-devel and soc trees. This patch has no dependencies
> on anything else in the series (or vice versa), so there is no reason
> to deviate from that, and possibly cause conflicts later.
This series is supposed to fix a bunch of issues and I vaguely think
the series depends on this patch to work correctly on some Renesas
systems. You are my main renesas person, so it's probably some issue
you hit. If you pick it up outside of this series, I need to keep
asking folks to pick up two different patch threads. I don't have a
strong opinion, just a FYI. If you can take this patch soon, I don't
have any concerns.
> BTW, I will convert to of_node_to_fwnode() while applying.
Sounds good.
-Saravana
On Mon, Jan 30, 2023 at 12:56 AM Naresh Kamboju
<[email protected]> wrote:
>
> Build test pass on arm, arm64, i386, mips, parisc, powerpc, riscv, s390, sh,
> sparc and x86_64.
>
> Boot and LTP smoke pass on qemu-arm64, qemu-armv7, qemu-i386 and qemu-x86_64.
> Boot failed on FVP.
>
> Reported-by: Linux Kernel Functional Testing <[email protected]>
>
> Please refer following link for details of testing.
> FVP boot log failed.
> https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-devicetree_20230127001141_407071-1-saravanak_google_com/testrun/14389034/suite/boot/test/gcc-12-lkftconfig-64k_page_size/details/
Sudeep pointed me to what the issue might be. But it's strange that
you are hitting an issue now. I'm pretty sure I haven't changed this
part since v1. I'd also expect the limited assumptions I made to have
not been affected between v1 and v2.
Anyway, I'll look at this and fix it in v3.
-Saravana
On Mon, Jan 30, 2023 at 4:09 AM Maxim Kiselev <[email protected]> wrote:
>
> Hi Saravana & Miquel.
>
> Sorry for the long response. I finally got access to my test device
> and tried this patch series.
>
> And unfortunately it didn't solve my issue. I'm still getting a
> hanging f1070000.ethernet dependency
> from the nvmem-cell mac@6 subnode.
Thanks for testing the series.
Btw, don't top post. It's frowned upon. Top posting means your reply sits
on top, before the email you are replying to. See how the first line
of my reply is inline with the email I'm replying to?
>
> Here are related parts of my kernel log and device tree:
>
>
> [ 2.713302] device: 'mtd-0': device_add
> [ 2.719528] device: 'spi0': device_add
> [ 2.724180] device: 'spi0.0': device_add
> [ 2.728957] spi-nor spi0.0: mx66l51235f (65536 Kbytes)
> [ 2.735338] 7 fixed-partitions partitions found on MTD device spi0.0
> [ 2.741978] device:
> 'f1010600.spi:m25p80@0:partitions:partition@1': device_add
> [ 2.749636] Creating 7 MTD partitions on "spi0.0":
> [ 2.754564] 0x000000000000-0x000000080000 : "SPI.U_BOOT"
> [ 2.759981] device: 'mtd0': device_add
> [ 2.764323] device: 'mtd0': device_add
> [ 2.768280] device: 'mtd0ro': device_add
> [ 2.772624] 0x0000000a0000-0x0000000c0000 : "SPI.INV_INFO"
> [ 2.778218] device: 'mtd1': device_add
> [ 2.782549] device: 'mtd1': device_add
> [ 2.786582] device: 'mtd1ro': device_add
> ...
> [ 5.426625] mvneta_bm f10c0000.bm: Buffer Manager for network
> controller enabled
> [ 5.492867] platform f1070000.ethernet: error -EPROBE_DEFER:
> wait for supplier mac@6
> [ 5.528636] device: 'Fixed MDIO bus.0': device_add
> [ 5.533726] device: 'fixed-0': device_add
> [ 5.547564] device: 'f1072004.mdio-eth-mii': device_add
> [ 5.616368] device: 'f1072004.mdio-eth-mii:00': device_add
> [ 5.645127] device: 'f1072004.mdio-eth-mii:1e': device_add
> [ 5.651530] devices_kset: Moving f1070000.ethernet to end of list
> [ 5.657948] platform f1070000.ethernet: error -EPROBE_DEFER:
> wait for supplier mac@6
>
> spi@10600 {
> m25p80@0 {
> compatible = "mx66l51235l";
>
> partitions {
> compatible = "fixed-partitions";
>
> partition@0 {
> label = "SPI.U_BOOT";
> };
> partition@1 {
> compatible = "nvmem-cells";
> label = "SPI.INV_INFO";
> macaddr: mac@6 {
> reg = <0x6 0x6>;
> };
> };
> ...
> };
> };
> };
>
> enet1: ethernet@70000 {
> nvmem-cells = <&macaddr>;
> nvmem-cell-names = "mac-address";
> phy-mode = "rgmii";
> phy = <&phy0>;
> };
>
>
> Maybe I should provide some additional debug info?
I took a look at it and I think I know the issue. But it'll be good if
you can point me to the dts (not dtsi) file that corresponds to the
board you are seeing this issue on so I can double check my guess by
looking at the exact code/drivers.
The main problem/mistake is that the nvmem framework uses a "struct
bus_type" instead of a "struct class" to keep a list of the nvmem devices.
And we can't change it now because it'd affect the sysfs paths
significantly and might break userspace ABI.
Can you try the patch at the end of this email under these
configurations and tell me which ones fail vs pass? I don't need logs
for any pass/failures.
1. On top of this series
2. Without this series
3. On top of the series but with the call to fwnode_dev_initialized() deleted?
4. Without this series, but with the call to fwnode_dev_initialized() deleted?
-Saravana
Sorry about tabs to spaces conversion. Email client issue.
diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c
index 321d7d63e068..23d94c0ecccf 100644
--- a/drivers/nvmem/core.c
+++ b/drivers/nvmem/core.c
@@ -752,6 +752,7 @@ static int nvmem_add_cells_from_of(struct nvmem_device *nvmem)
struct nvmem_device *nvmem_register(const struct nvmem_config *config)
{
struct nvmem_device *nvmem;
+ struct fwnode_handle *fwnode = NULL;
int rval;
if (!config->dev)
@@ -804,9 +805,18 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
nvmem->keepout = config->keepout;
nvmem->nkeepout = config->nkeepout;
if (config->of_node)
- nvmem->dev.of_node = config->of_node;
+ fwnode = of_fwnode_handle(config->of_node);
else if (!config->no_of_node)
- nvmem->dev.of_node = config->dev->of_node;
+ fwnode = of_fwnode_handle(config->dev->of_node);
+ device_set_node(&nvmem->dev, fwnode);
+
+ /*
+ * If the fwnode doesn't have another device associated with it, mark
+ * the fwnode as initialized since no driver is going to bind to the
+ * nvmem.
+ */
+ if (fwnode && !fwnode->dev)
+ fwnode_dev_initialized(fwnode, true);
switch (config->id) {
case NVMEM_DEVID_NONE:
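To spell out the intent of the hunk above, here is a small userspace model of
the node selection. Everything in it (struct fake_fwnode, struct fake_config,
pick_and_mark_fwnode()) is a hypothetical stand-in, not the nvmem API. It also
assumes the fwnode pointer starts out NULL, so that the no_of_node case is
well defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for struct fwnode_handle: an owning device and the "initialized" flag. */
struct fake_fwnode {
	void *dev;
	bool initialized;
};

/* Stand-in for the three nvmem_config fields the hunk looks at. */
struct fake_config {
	struct fake_fwnode *of_node;	 /* models config->of_node */
	struct fake_fwnode *parent_node; /* models config->dev->of_node */
	bool no_of_node;		 /* models config->no_of_node */
};

/* Pick the node for the nvmem device and mark it if nothing else owns it. */
struct fake_fwnode *pick_and_mark_fwnode(const struct fake_config *cfg)
{
	struct fake_fwnode *fwnode = NULL; /* stays NULL when no node applies */

	assert(cfg);

	if (cfg->of_node)
		fwnode = cfg->of_node;
	else if (!cfg->no_of_node)
		fwnode = cfg->parent_node;

	/* No driver will bind to the nvmem device, so mark the node initialized. */
	if (fwnode && !fwnode->dev)
		fwnode->initialized = true;

	return fwnode;
}
```

A node that already belongs to another device is left alone, because that
device's driver is still expected to probe.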
On Mon, Jan 30, 2023 at 10:15 AM Sakari Ailus
<[email protected]> wrote:
>
> Hi Saravana,
>
> On Thu, Jan 26, 2023 at 04:11:36PM -0800, Saravana Kannan wrote:
> > The driver core now:
> > - Has the parent device of a supplier pick up the consumers if the
> > supplier never has a device created for it.
> > - Ignores a supplier if the supplier has no parent device and will never
> > be probed by a driver
> >
> > And already prevents creating a device link with the consumer as a
> > supplier of a parent.
> >
> > So, we no longer need to find the "compatible" node of the supplier or
> > do any other checks in of_link_to_phandle(). We simply need to make sure
> > that the supplier is available in DT.
> >
> > Signed-off-by: Saravana Kannan <[email protected]>
> > ---
> > drivers/of/property.c | 84 +++++++------------------------------------
> > 1 file changed, 13 insertions(+), 71 deletions(-)
> >
> > diff --git a/drivers/of/property.c b/drivers/of/property.c
> > index 134cfc980b70..c651aad6f34b 100644
> > --- a/drivers/of/property.c
> > +++ b/drivers/of/property.c
> > @@ -1062,20 +1062,6 @@ of_fwnode_device_get_match_data(const struct fwnode_handle *fwnode,
> > return of_device_get_match_data(dev);
> > }
> >
> > -static bool of_is_ancestor_of(struct device_node *test_ancestor,
> > - struct device_node *child)
> > -{
> > - of_node_get(child);
> > - while (child) {
> > - if (child == test_ancestor) {
> > - of_node_put(child);
> > - return true;
> > - }
> > - child = of_get_next_parent(child);
> > - }
> > - return false;
> > -}
> > -
> > static struct device_node *of_get_compat_node(struct device_node *np)
> > {
> > of_node_get(np);
> > @@ -1106,71 +1092,27 @@ static struct device_node *of_get_compat_node_parent(struct device_node *np)
> > return node;
> > }
> >
> > -/**
> > - * of_link_to_phandle - Add fwnode link to supplier from supplier phandle
> > - * @con_np: consumer device tree node
> > - * @sup_np: supplier device tree node
> > - *
> > - * Given a phandle to a supplier device tree node (@sup_np), this function
> > - * finds the device that owns the supplier device tree node and creates a
> > - * device link from @dev consumer device to the supplier device. This function
> > - * doesn't create device links for invalid scenarios such as trying to create a
> > - * link with a parent device as the consumer of its child device. In such
> > - * cases, it returns an error.
> > - *
> > - * Returns:
> > - * - 0 if fwnode link successfully created to supplier
> > - * - -EINVAL if the supplier link is invalid and should not be created
> > - * - -ENODEV if struct device will never be create for supplier
> > - */
> > -static int of_link_to_phandle(struct device_node *con_np,
> > +static void of_link_to_phandle(struct device_node *con_np,
> > struct device_node *sup_np)
> > {
> > - struct device *sup_dev;
> > - struct device_node *tmp_np = sup_np;
> > + struct device_node *tmp_np = of_node_get(sup_np);
> >
> > - /*
> > - * Find the device node that contains the supplier phandle. It may be
> > - * @sup_np or it may be an ancestor of @sup_np.
> > - */
> > - sup_np = of_get_compat_node(sup_np);
> > - if (!sup_np) {
> > - pr_debug("Not linking %pOFP to %pOFP - No device\n",
> > - con_np, tmp_np);
> > - return -ENODEV;
> > - }
> > + /* Check that sup_np and its ancestors are available. */
> > + while (tmp_np) {
> > + if (of_fwnode_handle(tmp_np)->dev) {
> > + of_node_put(tmp_np);
> > + break;
> > + }
> >
> > - /*
> > - * Don't allow linking a device node as a consumer of one of its
> > - * descendant nodes. By definition, a child node can't be a functional
> > - * dependency for the parent node.
> > - */
> > - if (of_is_ancestor_of(con_np, sup_np)) {
> > - pr_debug("Not linking %pOFP to %pOFP - is descendant\n",
> > - con_np, sup_np);
> > - of_node_put(sup_np);
> > - return -EINVAL;
> > - }
> > + if (!of_device_is_available(tmp_np)) {
> > + of_node_put(tmp_np);
> > + return;
> > + }
> >
> > - /*
> > - * Don't create links to "early devices" that won't have struct devices
> > - * created for them.
> > - */
> > - sup_dev = get_dev_from_fwnode(&sup_np->fwnode);
> > - if (!sup_dev &&
> > - (of_node_check_flag(sup_np, OF_POPULATED) ||
> > - sup_np->fwnode.flags & FWNODE_FLAG_NOT_DEVICE)) {
> > - pr_debug("Not linking %pOFP to %pOFP - No struct device\n",
> > - con_np, sup_np);
> > - of_node_put(sup_np);
> > - return -ENODEV;
> > + tmp_np = of_get_next_parent(tmp_np);
> > }
> > - put_device(sup_dev);
> >
> > fwnode_link_add(of_fwnode_handle(con_np), of_fwnode_handle(sup_np));
>
> fwnode_link_add() returns int. Why is the return type of this function
> changed to void?
The return value of fwnode_link_add() was ignored even before this
patch. Since all other reasons for of_link_to_phandle() to fail are
gone, I'm switching it to void.
fwnode_link_add() is ignored because it can only fail due to -ENOMEM.
Not much to do in that case. We do our best and move on.
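To illustrate the only check that is left in of_link_to_phandle(), here is a
userspace model of the ancestor walk. struct fake_node and
supplier_is_linkable() are hypothetical names; "available" stands in for
of_device_is_available() and "has_dev" for a non-NULL fwnode->dev.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for struct device_node, reduced to what the walk looks at. */
struct fake_node {
	struct fake_node *parent;
	bool available;	/* models of_device_is_available(np) */
	bool has_dev;	/* models of_fwnode_handle(np)->dev != NULL */
};

/*
 * Walk from the supplier up through its ancestors. Stop as soon as a node
 * already has a device. Refuse to link if an unavailable node is reached
 * before that.
 */
bool supplier_is_linkable(const struct fake_node *sup)
{
	const struct fake_node *np;

	assert(sup);

	for (np = sup; np; np = np->parent) {
		if (np->has_dev)
			return true;
		if (!np->available)
			return false;
	}

	return true;
}
```

In the patch, the "true" outcome corresponds to falling through to
fwnode_link_add() and the "false" outcome to the early return.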
-Saravana
On Mon, Jan 30, 2023 at 7:14 AM Andy Shevchenko
<[email protected]> wrote:
>
> On Mon, Jan 30, 2023 at 02:31:53PM +0000, Sudeep Holla wrote:
> > On Thu, Jan 26, 2023 at 04:11:31PM -0800, Saravana Kannan wrote:
> > > Registering an irqdomain sets the flag for the fwnode. But having the
> > > flag set when a device is added is interpreted by fw_devlink to mean the
> > > device has already been initialized and will never probe. This prevents
> > > fw_devlink from creating device links with the gpio_device as a
> > > supplier. So, clear the flag before adding the device.
>
> ...
>
> > > + /*
> > > + * If fwnode doesn't belong to another device, it's safe to clear its
> > > + * initialized flag.
> > > + */
> > > + if (!gdev->dev.fwnode->dev)
> > > + fwnode_dev_initialized(gdev->dev.fwnode, false);
> >
> > This is the one causing the kernel crash during the boot on FVP which
> > Naresh has reported. Just reverted this and was able to boot, confirming
> > the issue with this patch.
>
> I'm wondering if
>
> if (!dev_fwnode(&gdev->dev)->dev)
> fwnode_dev_initialized(dev_fwnode(&gdev->dev), false);
>
> works.
No, that won't help. The problem was that with arm32, we have gpio
devices created without any of_node or fwnode. So I can't assume
fwnode will always be present.
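As a rough userspace sketch of what a check that tolerates a missing fwnode
could look like (struct fake_fwnode, struct fake_gpio_device,
FAKE_FLAG_INITIALIZED and prepare_for_add() are all hypothetical stand-ins,
not gpiolib code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FAKE_FLAG_INITIALIZED 0x1u	/* hypothetical flag bit */

/* Stand-in for struct fwnode_handle. */
struct fake_fwnode {
	void *dev;		/* device that owns this node, if any */
	unsigned int flags;
};

/* Stand-in for struct gpio_device; the fwnode pointer may be NULL. */
struct fake_gpio_device {
	struct fake_fwnode *fwnode;
};

/*
 * Clear the "initialized" flag only when a fwnode exists and no other
 * device owns it. Returns true if the flag was cleared.
 */
bool prepare_for_add(struct fake_gpio_device *gdev)
{
	struct fake_fwnode *fwnode;

	assert(gdev);

	fwnode = gdev->fwnode;
	if (!fwnode || fwnode->dev)
		return false;

	fwnode->flags &= ~FAKE_FLAG_INITIALIZED;
	return true;
}
```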
-Saravana
Hi Saravana,
On Mon, Jan 30, 2023 at 9:00 PM Saravana Kannan <[email protected]> wrote:
> On Mon, Jan 30, 2023 at 12:43 AM Geert Uytterhoeven
> <[email protected]> wrote:
> > On Sat, Jan 28, 2023 at 8:19 AM Saravana Kannan <[email protected]> wrote:
> > > On Fri, Jan 27, 2023 at 12:11 AM Geert Uytterhoeven
> > > <[email protected]> wrote:
> > > > On Fri, Jan 27, 2023 at 1:11 AM Saravana Kannan <[email protected]> wrote:
> > > > > The OF_POPULATED flag was set to let fw_devlink know that the device
> > > > > tree node will not have a struct device created for it. This information
> > > > > is used by fw_devlink to avoid deferring the probe of consumers of this
> > > > > device tree node.
> > > > >
> > > > > Let's use fwnode_dev_initialized() instead because it achieves the same
> > > > > effect without using OF specific flags. This allows more generic code to
> > > > > be written in driver core.
> > > > >
> > > > > Signed-off-by: Saravana Kannan <[email protected]>
> > > >
> > > > Thanks for your patch!
> > > >
> > > > > --- a/drivers/soc/renesas/rcar-sysc.c
> > > > > +++ b/drivers/soc/renesas/rcar-sysc.c
> > > > > @@ -437,7 +437,7 @@ static int __init rcar_sysc_pd_init(void)
> > > > >
> > > > > error = of_genpd_add_provider_onecell(np, &domains->onecell_data);
> > > > > if (!error)
> > > > > - of_node_set_flag(np, OF_POPULATED);
> > > > > + fwnode_dev_initialized(&np->fwnode, true);
> > > >
> > > > As drivers/soc/renesas/rmobile-sysc.c is already using this method,
> > > > it should work fine.
> > > >
> > > > Reviewed-by: Geert Uytterhoeven <[email protected]>
> > > > i.e. will queue in renesas-devel for v6.4.
>
> I hope you meant queue it up for 6.3 and not 6.4?
V6.4.
The deadline for submitting pull requests for the soc tree is rc6.
Sorry, your series was posted too late to make that.
> > > Thanks! Does that mean I should drop this from this series? If two
> > > maintainers pick the same patch up, will it cause problems? I'm
> > > eventually expecting this series to be picked up by Greg into
> > > driver-core-next.
> >
> > Indeed. Patches for drivers/soc/renesas/ are supposed to go upstream
> > through the renesas-devel and soc trees. This patch has no dependencies
> > on anything else in the series (or vice versa), so there is no reason
> > to deviate from that, and possibly cause conflicts later.
>
> This series is supposed to fix a bunch of issues and I vaguely think
> the series depends on this patch to work correctly on some Renesas
> systems. You are my main renesas person, so it's probably some issue
> you hit. If you pick it up outside of this series, I need to keep
> asking folks to pick up two different patch threads. I don't have a
> strong opinion, just a FYI. If you can take this patch soon, I don't
> have any concerns.
Oh right, you do remove OF_POPULATED handling in
"[PATCH v2 09/11] of: property: Simplify of_link_to_phandle()".
It might be wise to postpone that removal, as after your series,
there are still several users left, and some of them might be impacted.
I do plan to test your full series on all my boards, but probably that
won't happen this week.
> > BTW, I will convert to of_node_to_fwnode() while applying.
>
> Sounds good.
If you still want this to land in v6.3 (with the of_node_to_fwnode()
conversion):
Acked-by: Geert Uytterhoeven <[email protected]>
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
On Mon, Jan 30, 2023 at 08:01:17PM -0800, Saravana Kannan wrote:
> On Mon, Jan 30, 2023 at 7:14 AM Andy Shevchenko
> <[email protected]> wrote:
> >
> > On Mon, Jan 30, 2023 at 02:31:53PM +0000, Sudeep Holla wrote:
> > > On Thu, Jan 26, 2023 at 04:11:31PM -0800, Saravana Kannan wrote:
> > > > Registering an irqdomain sets the flag for the fwnode. But having the
> > > > flag set when a device is added is interpreted by fw_devlink to mean the
> > > > device has already been initialized and will never probe. This prevents
> > > > fw_devlink from creating device links with the gpio_device as a
> > > > supplier. So, clear the flag before adding the device.
> >
> > ...
> >
> > > > + /*
> > > > + * If fwnode doesn't belong to another device, it's safe to clear its
> > > > + * initialized flag.
> > > > + */
> > > > + if (!gdev->dev.fwnode->dev)
> > > > + fwnode_dev_initialized(gdev->dev.fwnode, false);
> > >
> > > This is the one causing the kernel crash during the boot on FVP which
> > > Naresh has reported. Just reverted this and was able to boot, confirming
> > > the issue with this patch.
> >
> > I'm wondering if
> >
> > if (!dev_fwnode(&gdev->dev)->dev)
> > fwnode_dev_initialized(dev_fwnode(&gdev->dev), false);
> >
> > works.
>
> No, that won't help. The problem was that with arm32, we have gpio
> devices created without any of_node or fwnode. So I can't assume
> fwnode will always be present.
>
Correct, and this one is not even arm32. But it is just reusing a driver
that needs to be supported even on arm32.
Not sure how to proceed. As a simple check, I added a NULL check
for fwnode, building on top of Andy's suggestion[1]. That works.
Also, the driver in question on the arm64 FVP model is drivers/mfd/vexpress-sysreg.c.
mfd_add_device() in drivers/mfd/mfd-core.c allows adding devices without an
of_node/fwnode. I am sure returning an error as in [2] will break many platforms,
but I just wanted to confirm the root cause: [2] fixes the boot without the
NULL check for fwnode in gpiochip_setup_dev().
Hope this helps.
--
Regards,
Sudeep
[1]
-->8
diff --git i/drivers/gpio/gpiolib.c w/drivers/gpio/gpiolib.c
index b23140c6485f..e162f13aa2c9 100644
--- i/drivers/gpio/gpiolib.c
+++ w/drivers/gpio/gpiolib.c
@@ -577,13 +577,15 @@ static void gpiodevice_release(struct device *dev)
static int gpiochip_setup_dev(struct gpio_device *gdev)
{
int ret;
+ struct fwnode_handle *fwnode = dev_fwnode(&gdev->dev);
/*
* If fwnode doesn't belong to another device, it's safe to clear its
* initialized flag.
*/
- if (!gdev->dev.fwnode->dev)
- fwnode_dev_initialized(gdev->dev.fwnode, false);
+ if (fwnode && !fwnode->dev)
+ fwnode_dev_initialized(fwnode, false);
+
ret = gcdev_register(gdev, gpio_devt);
if (ret)
return ret;
[2]
-->8
diff --git i/drivers/mfd/mfd-core.c w/drivers/mfd/mfd-core.c
index 16d1861e9682..3b2c4b0e9a2a 100644
--- i/drivers/mfd/mfd-core.c
+++ w/drivers/mfd/mfd-core.c
@@ -231,9 +231,11 @@ static int mfd_add_device(struct device *parent, int id,
}
}
- if (!pdev->dev.of_node)
+ if (!pdev->dev.of_node) {
pr_warn("%s: Failed to locate of_node [id: %d]\n",
cell->name, platform_id);
+ goto fail_alias;
+ }
}
mfd_acpi_add_device(cell, pdev);
Hi Saravana,
On Mon, Jan 30, 2023 at 03:03:01PM -0800, Saravana Kannan wrote:
> On Mon, Jan 30, 2023 at 12:56 AM Naresh Kamboju
> <[email protected]> wrote:
> >
> > Build test pass on arm, arm64, i386, mips, parisc, powerpc, riscv, s390, sh,
> > sparc and x86_64.
> >
> > Boot and LTP smoke pass on qemu-arm64, qemu-armv7, qemu-i386 and qemu-x86_64.
> > Boot failed on FVP.
> >
> > Reported-by: Linux Kernel Functional Testing <[email protected]>
> >
> > Please refer following link for details of testing.
> > FVP boot log failed.
> > https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-devicetree_20230127001141_407071-1-saravanak_google_com/testrun/14389034/suite/boot/test/gcc-12-lkftconfig-64k_page_size/details/
>
> Sudeep pointed me to what the issue might be. But it's strange that
> you are hitting an issue now. I'm pretty sure I haven't changed this
> part since v1. I'd also expect the limited assumptions I made to have
> not been affected between v1 and v2.
>
Sorry I hadn't seen or tested v1.
FYI, the fwnode non-NULL check, as in your nvmem diff/suggestion and in the
diff I posted in reply on the gpiolib patch thread, fixes the issues.
> Anyway, I'll look at this and fix it in v3.
>
If you add that fwnode check, feel free to add my Tested-by.
--
Regards,
Sudeep
Hi Saravana,
> Can you try the patch at the end of this email under these
> configurations and tell me which ones fail vs pass? I don't need logs
I did these tests and here are the results:
1. On top of this series - Does not work
2. Without this series - Works
3. On top of the series with the fwnode_dev_initialized() deleted - Does not work
4. Without this series, with the fwnode_dev_initialized() deleted - Works
So your nvmem/core.c patch helps only when it is applied without the series.
But although this avoids getting stuck probing my ethernet device, there is
still a regression.
When the ethernet module is loaded, it takes a long time to drop the
dependency on the nvmem cell that holds the MAC address.
Please look at the kernel logs below.
The first log corresponds to the kernel with your nvmem/core.c patch:
[ 0.036462] ethernet@70000 Linked as a fwnode consumer to
clock-gating-control@1821c
[ 0.036572] ethernet@70000 Linked as a fwnode consumer to partition@1
[ 0.045596] device: 'f1070000.ethernet': device_add
[ 0.045854] ethernet@70000 Dropping the fwnode link to
clock-gating-control@1821c
[ 0.114990] device:
'platform:f1010600.spi:m25p80@0:partitions:partition@1--platform:f1070000.ethernet':
device_add
[ 0.115266] devices_kset: Moving f1070000.ethernet to end of list
[ 0.115308] platform f1070000.ethernet: Linked as a consumer to
f1010600.spi:m25p80@0:partitions:partition@1
[ 0.115345] ethernet@70000 Dropping the fwnode link to partition@1
[ 1.968232] platform f1070000.ethernet: error -EPROBE_DEFER:
supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
[ 2.088696] devices_kset: Moving f1070000.ethernet to end of list
[ 2.088988] platform f1070000.ethernet: error -EPROBE_DEFER:
supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
[ 2.152411] devices_kset: Moving f1070000.ethernet to end of list
[ 2.152735] platform f1070000.ethernet: error -EPROBE_DEFER:
supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
[ 2.153870] devices_kset: Moving f1070000.ethernet to end of list
[ 2.154152] platform f1070000.ethernet: error -EPROBE_DEFER:
supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
[ 2.644950] devices_kset: Moving f1070000.ethernet to end of list
[ 2.645282] platform f1070000.ethernet: error -EPROBE_DEFER:
supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
[ 3.169218] devices_kset: Moving f1070000.ethernet to end of list
[ 3.169506] platform f1070000.ethernet: error -EPROBE_DEFER:
supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
[ 3.170444] devices_kset: Moving f1070000.ethernet to end of list
[ 3.170721] platform f1070000.ethernet: error -EPROBE_DEFER:
supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
[ 3.419068] devices_kset: Moving f1070000.ethernet to end of list
[ 3.419359] platform f1070000.ethernet: error -EPROBE_DEFER:
supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
[ 3.521275] devices_kset: Moving f1070000.ethernet to end of list
[ 3.521564] platform f1070000.ethernet: error -EPROBE_DEFER:
supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
[ 3.639196] devices_kset: Moving f1070000.ethernet to end of list
[ 3.639532] platform f1070000.ethernet: error -EPROBE_DEFER:
supplier f1010600.spi:m25p80@0:partitions:partition@1 not ready
[ 13.960144] platform f1070000.ethernet: Relaxing link with
f1010600.spi:m25p80@0:partitions:partition@1
[ 13.960260] devices_kset: Moving f1070000.ethernet to end of list
[ 13.971735] device: 'eth0': device_add
[ 13.974140] mvneta f1070000.ethernet eth0: Using device tree
mac address de:fa:ce:db:ab:e1
[ 13.974275] mvneta f1070000.ethernet: Dropping the link to
f1010600.spi:m25p80@0:partitions:partition@1
[ 13.974318] device:
'platform:f1010600.spi:m25p80@0:partitions:partition@1--platform:f1070000.ethernet':
device_unregister
It took around 13 seconds to obtain the MAC address from the nvmem cell and
bring up f1070000.ethernet.
And here is the second log, which corresponds to the kernel without your
nvmem/core.c patch but with change 'bcdf0315' reverted:
[ 0.036285] ethernet@70000 Linked as a fwnode consumer to
clock-gating-control@1821c
[ 0.036395] ethernet@70000 Linked as a fwnode consumer to partition@1
[ 0.045416] device: 'f1070000.ethernet': device_add
[ 0.045674] ethernet@70000 Dropping the fwnode link to
clock-gating-control@1821c
[ 0.116136] ethernet@70000 Dropping the fwnode link to partition@1
[ 1.977060] device: 'eth0': device_add
[ 1.979145] mvneta f1070000.ethernet eth0: Using device tree
mac address de:fa:ce:db:ab:e1
It took around 1.5 seconds to obtain the MAC address from the nvmem cell.
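Figures like these can be read straight off the log timestamps. A minimal
sketch, assuming the usual "[seconds.microseconds]" printk prefix with six
fractional digits; the helper name dmesg_usec() is made up for illustration:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Return the timestamp at the start of a kernel log line in microseconds,
 * or -1 if the line does not begin with a "[sec.usec]" prefix.
 */
long long dmesg_usec(const char *line)
{
	long long sec, usec;

	assert(line);

	/* "%lld" skips the padding spaces that follow the opening bracket. */
	if (sscanf(line, "[%lld.%6lld]", &sec, &usec) != 2)
		return -1;

	return sec * 1000000LL + usec;
}
```

For the first log, the gap between "Dropping the fwnode link to partition@1"
(0.115345) and "device: 'eth0': device_add" (13.971735) works out to about
13.86 seconds.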
P.S. Your nvmem patch definitely helps to avoid the stuck device probe,
but it looks like it is not the best way to solve the problem we discussed
in the MTD thread.
P.P.S. Also, I don't know why your nvmem-cell patch doesn't help when it is
applied on top of this series. Maybe I missed something.
On Thu, Feb 2, 2023 at 9:36 AM Maxim Kiselev <[email protected]> wrote:
>
> Hi Saravana,
>
> > Can you try the patch at the end of this email under these
> > configurations and tell me which ones fail vs pass? I don't need logs
>
> I did these tests and here are the results:
Did you hand edit the In-Reply-To: in the header? Because in the
thread you are replying to the wrong email, but the context in your email
seems to be from the right email.
For example, see how your reply isn't under the email you are replying
to in this thread overview:
https://lore.kernel.org/lkml/[email protected]/#r
> 1. On top of this series - Does not work
> 2. Without this series - Works
> 3. On top of the series with the fwnode_dev_initialized() deleted - Does not work
> 4. Without this series, with the fwnode_dev_initialized() deleted - Works
>
> So your nvmem/core.c patch helps only when it is applied without the series.
> But although this avoids getting stuck probing my ethernet device, there is
> still a regression.
>
> When the ethernet module is loaded, it takes a long time to drop the
> dependency on the nvmem cell that holds the MAC address.
>
> Please look at the kernel logs below.
The kernel logs below really aren't that useful for me in their
current state. See more below.
---8<---- <snip> --->8----
> P.S. Your nvmem patch definitely helps to avoid the stuck device probe,
> but it looks like it is not the best way to solve the problem we discussed
> in the MTD thread.
>
> P.P.S. Also, I don't know why your nvmem-cell patch doesn't help when it is
> applied on top of this series. Maybe I missed something.
Yeah, I'm not too sure if the test was done correctly. You also didn't
answer my question about the dts from my earlier email.
https://lore.kernel.org/lkml/CAGETcx8FpmbaRm2CCwqt3BRBpgbogwP5gNB+iA5OEtuxWVTNLA@mail.gmail.com/#t
So, can you please retest config 1 with all pr_debug and dev_dbg in
drivers/base/core.c changed to the _info variants? And then share the
kernel log from the beginning of boot? Maybe attach it to the email so
it doesn't get word wrapped by your email client. And please point me
to the .dts that corresponds to your board. Without that, I can't
debug much.
Thanks,
Saravana
On Fri, Feb 3, 2023 at 09:07, Saravana Kannan <[email protected]> wrote:
> Did you hand edit the In-Reply-To: in the header? Because in the
> thread you are reply to the wrong email, but the context in your email
> seems to be from the right email.
Sorry about that, it seems I accidentally deleted it.
> So, can you please retest config 1 with all pr_debug and dev_dbg in
> drivers/core/base.c changed to the _info variants? And then share the
> kernel log from the beginning of boot? Maybe attach it to the email so
> it doesn't get word wrapped by your email client. And please point me
> to the .dts that corresponds to your board. Without that, I can't
> debug much.
OK, I retested config 1 with all _debug logs changed to _info. I have
attached the kernel log and the dts file to this email.
On Tue, Jan 31, 2023 at 12:14 AM Geert Uytterhoeven
<[email protected]> wrote:
>
> Hi Saravana,
>
> On Mon, Jan 30, 2023 at 9:00 PM Saravana Kannan <[email protected]> wrote:
> > On Mon, Jan 30, 2023 at 12:43 AM Geert Uytterhoeven
> > <[email protected]> wrote:
> > > On Sat, Jan 28, 2023 at 8:19 AM Saravana Kannan <[email protected]> wrote:
> > > > On Fri, Jan 27, 2023 at 12:11 AM Geert Uytterhoeven
> > > > <[email protected]> wrote:
> > > > > On Fri, Jan 27, 2023 at 1:11 AM Saravana Kannan <[email protected]> wrote:
> > > > > > The OF_POPULATED flag was set to let fw_devlink know that the device
> > > > > > tree node will not have a struct device created for it. This information
> > > > > > is used by fw_devlink to avoid deferring the probe of consumers of this
> > > > > > device tree node.
> > > > > >
> > > > > > Let's use fwnode_dev_initialized() instead because it achieves the same
> > > > > > effect without using OF specific flags. This allows more generic code to
> > > > > > be written in driver core.
> > > > > >
> > > > > > Signed-off-by: Saravana Kannan <[email protected]>
> > > > >
> > > > > Thanks for your patch!
> > > > >
> > > > > > --- a/drivers/soc/renesas/rcar-sysc.c
> > > > > > +++ b/drivers/soc/renesas/rcar-sysc.c
> > > > > > @@ -437,7 +437,7 @@ static int __init rcar_sysc_pd_init(void)
> > > > > >
> > > > > > error = of_genpd_add_provider_onecell(np, &domains->onecell_data);
> > > > > > if (!error)
> > > > > > - of_node_set_flag(np, OF_POPULATED);
> > > > > > + fwnode_dev_initialized(&np->fwnode, true);
> > > > >
> > > > > As drivers/soc/renesas/rmobile-sysc.c is already using this method,
> > > > > it should work fine.
> > > > >
> > > > > Reviewed-by: Geert Uytterhoeven <[email protected]>
> > > > > i.e. will queue in renesas-devel for v6.4.
> >
> > I hope you meant queue it up for 6.3 and not 6.4?
>
> V6.4.
> The deadline for submitting pull requests for the soc tree is rc6.
> Sorry, your series was posted too late to make that.
>
> > > > Thanks! Does that mean I should drop this from this series? If two
> > > > maintainers pick the same patch up, will it cause problems? I'm
> > > > eventually expecting this series to be picked up by Greg into
> > > > driver-core-next.
> > >
> > > Indeed. Patches for drivers/soc/renesas/ are supposed to go upstream
> > > through the renesas-devel and soc trees. This patch has no dependencies
> > > on anything else in the series (or vice versa), so there is no reason
> > > to deviate from that, and possibly cause conflicts later.
> >
> > This series is supposed to fix a bunch of issues and I vaguely think
> > the series depends on this patch to work correctly on some Renesas
> > systems. You are my main renesas person, so it's probably some issue
> > you hit. If you pick it up outside of this series I need to keep
> > asking folks to pick up two different patch threads. I don't have a
> > strong opinion, just a FYI. If you can take this patch soon, I don't
> > have any concerns.
>
> Oh right, you do remove OF_POPULATED handling in
> "[PATCH v2 09/11] of: property: Simplify of_link_to_phandle()".
> It might be wise to postpone that removal, as after your series,
> there are still several users left, some of them might be impacted.
>
> I do plan to test your full series on all my boards, but probably that
> won't happen this week.
>
> > > BTW, I will convert to of_node_to_fwnode() while applying.
> >
> > Sounds good.
>
> If you still want this to land in v6.3 (with the of_node_to_fwnode()
> conversion):
> Acked-by: Geert Uytterhoeven <[email protected]>
>
Yeah, let me try to land this in 6.3 with the series.
-Saravana
On Tue, Jan 31, 2023 at 2:13 AM Sudeep Holla <[email protected]> wrote:
>
> On Mon, Jan 30, 2023 at 08:01:17PM -0800, Saravana Kannan wrote:
> > On Mon, Jan 30, 2023 at 7:14 AM Andy Shevchenko
> > <[email protected]> wrote:
> > >
> > > On Mon, Jan 30, 2023 at 02:31:53PM +0000, Sudeep Holla wrote:
> > > > On Thu, Jan 26, 2023 at 04:11:31PM -0800, Saravana Kannan wrote:
> > > > > Registering an irqdomain sets the flag for the fwnode. But having the
> > > > > flag set when a device is added is interpreted by fw_devlink to mean the
> > > > > device has already been initialized and will never probe. This prevents
> > > > > fw_devlink from creating device links with the gpio_device as a
> > > > > supplier. So, clear the flag before adding the device.
> > >
> > > ...
> > >
> > > > > + /*
> > > > > + * If fwnode doesn't belong to another device, it's safe to clear its
> > > > > + * initialized flag.
> > > > > + */
> > > > > + if (!gdev->dev.fwnode->dev)
> > > > > + fwnode_dev_initialized(gdev->dev.fwnode, false);
> > > >
> > > > This is the one causing the kernel crash during the boot on FVP which
> > > > Naresh has reported. Just reverted this and was able to boot, confirming
> > > > the issue with this patch.
> > >
> > > I'm wondering if
> > >
> > > if (!dev_fwnode(&gdev->dev)->dev)
> > > fwnode_dev_initialized(dev_fwnode(&gdev->dev), false);
> > >
> > > works.
> >
> > No, that won't help. The problem was that with arm32, we have gpio
> > devices created without any of_node or fwnode. So I can't assume
> > fwnode will always be present.
> >
>
> Correct, and this one is not even arm32. But it is just reusing a driver
> that needs to be supported even on arm32.
>
> Not sure on how to proceed. As a simple way to check, I added a NULL check
> for fwnode building on top of Andy's suggestion[1]. That works.
>
> Also the driver in question on arm64 FVP model is drivers/mfd/vexpress-sysreg.c
> mfd_add_device() in drivers/mfd/mfd-core.c allows addition of devices without
> of_node/fwnode. I am sure returning error like[2] will break many platforms
> but I just wanted to confirm the root cause and [2] fixes the boot without
> NULL check for fwnode in gpiochip_setup_dev().
>
> Hope this helps.
Thanks for debugging it for me Sudeep. Incorporated into my v3.
-Saravana
>
> --
> Regards,
> Sudeep
>
> [1]
>
> -->8
> diff --git i/drivers/gpio/gpiolib.c w/drivers/gpio/gpiolib.c
> index b23140c6485f..e162f13aa2c9 100644
> --- i/drivers/gpio/gpiolib.c
> +++ w/drivers/gpio/gpiolib.c
> @@ -577,13 +577,15 @@ static void gpiodevice_release(struct device *dev)
>  static int gpiochip_setup_dev(struct gpio_device *gdev)
>  {
>  	int ret;
> +	struct fwnode_handle *fwnode = dev_fwnode(&gdev->dev);
>  
>  	/*
>  	 * If fwnode doesn't belong to another device, it's safe to clear its
>  	 * initialized flag.
>  	 */
> -	if (!gdev->dev.fwnode->dev)
> -		fwnode_dev_initialized(gdev->dev.fwnode, false);
> +	if (fwnode && !fwnode->dev)
> +		fwnode_dev_initialized(fwnode, false);
> +
>  	ret = gcdev_register(gdev, gpio_devt);
>  	if (ret)
>  		return ret;
>
> [2]
>
> -->8
>
> diff --git i/drivers/mfd/mfd-core.c w/drivers/mfd/mfd-core.c
> index 16d1861e9682..3b2c4b0e9a2a 100644
> --- i/drivers/mfd/mfd-core.c
> +++ w/drivers/mfd/mfd-core.c
> @@ -231,9 +231,11 @@ static int mfd_add_device(struct device *parent, int id,
>  			}
>  		}
>  
> -		if (!pdev->dev.of_node)
> +		if (!pdev->dev.of_node) {
>  			pr_warn("%s: Failed to locate of_node [id: %d]\n",
>  				cell->name, platform_id);
> +			goto fail_alias;
> +		}
>  	}
>  
>  	mfd_acpi_add_device(cell, pdev);
>
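For readers following along, the NULL check from [1] above can be modelled in plain userspace C. This is only a rough sketch under stated assumptions: the structs below are simplified stand-ins (not the kernel's real struct device or struct fwnode_handle), the "initialized" field merely models the flag toggled by fwnode_dev_initialized(), and clear_initialized_if_unowned() is a hypothetical helper name used for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; NOT the real definitions. */
struct device;

struct fwnode_handle {
	struct device *dev;	/* device that owns this fwnode, if any */
	bool initialized;	/* models the flag set by fwnode_dev_initialized() */
};

struct device {
	/* May be NULL, e.g. a device added without any of_node/fwnode. */
	struct fwnode_handle *fwnode;
};

/*
 * Hypothetical helper modelling the guarded clear from [1]: only touch the
 * flag when a fwnode exists and it does not belong to another device.
 * Returns true if the flag was cleared.
 */
static bool clear_initialized_if_unowned(struct device *dev)
{
	struct fwnode_handle *fwnode = dev->fwnode;

	if (fwnode && !fwnode->dev) {
		fwnode->initialized = false;
		return true;
	}
	return false;
}
```

Without the "fwnode &&" part, the first case (a device with no fwnode at all) would dereference a NULL pointer, which is the situation described above for the FVP boot.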
On Fri, Feb 3, 2023 at 1:39 AM Maxim Kiselev <[email protected]> wrote:
>
> Ok, I retested config 1 with all _debug logs changed to the _info. I
> added the kernel log and the dts file to the attachment of this email.
Ah, so your device is not supported/present upstream? Even though it's
not upstream, I'll help fix this because it should fix what I believe
are unreported issues in upstream.
Ok I know why configs 1 - 4 behaved the way they did and why my test
patch didn't help.
After staring at the mtd/nvmem code for a few hours, I think the mtd/nvmem
interaction is kind of a mess. The mtd core creates "partition" platform
devices (including for nvmem-cells) that are meant to be probed by drivers in
drivers/nvmem, but there's no driver for the "nvmem-cells" partition
platform device. Separately, the nvmem core creates an nvmem_device when
nvmem_register() is called by MTD or by these partition platform devices
created by MTD. These nvmem_devices are added to an nvmem_bus, but
the bus has no means to even register a driver (it should really be an
nvmem_class and not an nvmem_bus). And the nvmem_device sometimes points
to the DT node of the MTD device, sometimes to that of the partition platform
device, or maybe to no DT node at all.
So it's a mess of multiple devices pointing to the same DT node with
no clear way to identify which ones will point to a DT node and which
ones will probe and which ones won't. In the future, we shouldn't
allow adding new compatible strings for partitions for which we don't
plan on adding nvmem drivers.
Can you give the patch at the end of the email a shot? It should fix
the issue both with and without this series. It just avoids
this whole mess by not creating a useless platform device for
nvmem-cells compatible DT nodes.
Thanks,
Saravana
diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
index d442fa94c872..88a213f4d651 100644
--- a/drivers/mtd/mtdpart.c
+++ b/drivers/mtd/mtdpart.c
@@ -577,6 +577,7 @@ static int mtd_part_of_parse(struct mtd_info *master,
 {
 	struct mtd_part_parser *parser;
 	struct device_node *np;
+	struct device_node *child;
 	struct property *prop;
 	struct device *dev;
 	const char *compat;
@@ -594,6 +595,10 @@ static int mtd_part_of_parse(struct mtd_info *master,
 	else
 		np = of_get_child_by_name(np, "partitions");
 
+	for_each_child_of_node(np, child)
+		if (of_device_is_compatible(child, "nvmem-cells"))
+			of_node_set_flag(child, OF_POPULATED);
+
 	of_property_for_each_string(np, "compatible", prop, compat) {
 		parser = mtd_part_get_compatible_parser(compat);
 		if (!parser)
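As an illustration only, the loop added by the patch above can be modelled in userspace C as follows. This is a sketch under assumptions, not kernel code: struct node is a hypothetical stand-in for struct device_node, each node carries a single compatible string (the real of_device_is_compatible() checks a list of strings), and the "populated" field models the OF_POPULATED flag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a DT child node of the "partitions" node. */
struct node {
	const char *compatible;	/* single compatible string, or NULL */
	bool populated;		/* models the OF_POPULATED flag */
};

/*
 * Mark every "nvmem-cells" compatible child as already populated, so that
 * (in this model) no platform device would later be created for it.
 * Returns the number of children that were marked.
 */
static size_t mark_nvmem_cells_populated(struct node *children, size_t n)
{
	size_t marked = 0;

	for (size_t i = 0; i < n; i++) {
		if (children[i].compatible &&
		    strcmp(children[i].compatible, "nvmem-cells") == 0) {
			children[i].populated = true;
			marked++;
		}
	}
	return marked;
}
```

Children with any other compatible string are left untouched, so regular partitions would still get their devices as before.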
On Sun, Feb 5, 2023 at 5:32 PM Saravana Kannan <[email protected]> wrote:
>
> Ah, so your device is not supported/present upstream? Even though it's
> not upstream, I'll help fix this because it should fix what I believe
> are unreported issues in upstream.
>
> Ok I know why configs 1 - 4 behaved the way they did and why my test
> patch didn't help.
>
> After staring at mtd/nvmem code for a few hours I think mtd/nvmem
> interaction is kind of a mess. mtd core creates "partition" platform
> devices (including for nvmem-cells) that are probed by drivers in
> drivers/nvmem. However, there's no driver for "nvmem-cells" partition
> platform device. However, the nvmem core creates nvmem_device when
> nvmem_register() is called by MTD or these partition platform devices
> created by MTD. But these nvmem_devices are added to a nvmem_bus but
> the bus has no means to even register a driver (it should really be a
> nvmem_class and not nvmem_bus). And the nvmem_device sometimes points
> to the DT node of the MTD device or sometimes the partition platform
> devices or maybe no DT node at all.
>
> So it's a mess of multiple devices pointing to the same DT node with
> no clear way to identify which ones will point to a DT node and which
> ones will probe and which ones won't. In the future, we shouldn't
> allow adding new compatible strings for partitions for which we don't
> plan on adding nvmem drivers.
>
> Can you give the patch at the end of the email a shot? It should fix
> the issue with this series and without this series. It just avoids
> this whole mess by not creating useless platform device for
> nvmem-cells compatible DT nodes.
Actually, without this series, the patch below will need an additional
line of code inside the if block:
fwnode_dev_initialized(of_fwnode_handle(child), true);
-Saravana
>
> Thanks,
> Saravana
>
> diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
> index d442fa94c872..88a213f4d651 100644
> --- a/drivers/mtd/mtdpart.c
> +++ b/drivers/mtd/mtdpart.c
> @@ -577,6 +577,7 @@ static int mtd_part_of_parse(struct mtd_info *master,
> {
> struct mtd_part_parser *parser;
> struct device_node *np;
> + struct device_node *child;
> struct property *prop;
> struct device *dev;
> const char *compat;
> @@ -594,6 +595,10 @@ static int mtd_part_of_parse(struct mtd_info *master,
> else
> np = of_get_child_by_name(np, "partitions");
>
> + for_each_child_of_node(np, child)
> + if (of_device_is_compatible(child, "nvmem-cells"))
> + of_node_set_flag(child, OF_POPULATED);
> +
> of_property_for_each_string(np, "compatible", prop, compat) {
> parser = mtd_part_get_compatible_parser(compat);
> if (!parser)
Hi Saravana,
+ Srinivas, nvmem maintainer
[email protected] wrote on Sun, 5 Feb 2023 17:32:57 -0800:
> Ah, so your device is not supported/present upstream? Even though it's
> not upstream, I'll help fix this because it should fix what I believe
> are unreported issues in upstream.
>
> Ok I know why configs 1 - 4 behaved the way they did and why my test
> patch didn't help.
>
> After staring at mtd/nvmem code for a few hours I think mtd/nvmem
> interaction is kind of a mess.
nvmem is a recent subsystem but mtd carries a lot of legacy stuff we
cannot really re-wire without breaking users, so nvmem on top of mtd
of course inherits the fragile designs in place.
> mtd core creates "partition" platform
> devices (including for nvmem-cells) that are probed by drivers in
> drivers/nvmem. However, there's no driver for "nvmem-cells" partition
> platform device. However, the nvmem core creates nvmem_device when
> nvmem_register() is called by MTD or these partition platform devices
> created by MTD. But these nvmem_devices are added to a nvmem_bus but
> the bus has no means to even register a driver (it should really be a
> nvmem_class and not nvmem_bus).
Srinivas, do you think we could change this?
> And the nvmem_device sometimes points
> to the DT node of the MTD device or sometimes the partition platform
> devices or maybe no DT node at all.
I guess this comes from the fact that this is not strongly defined in
mtd and depends on the situation (not to mention 20 years of history
there as well). "mtd" is a bit inconsistent about what it means. Older
designs mixed controllers, ECC engines (when relevant) and memories,
even though these three components are completely separate. Hence
sometimes the mtd device ends up being the top level controller,
sometimes it's just one partition...
But I'm surprised not all of them point to a DT node. Could you show us
an example? Because that is likely unexpected (or perhaps I am
missing something).
> So it's a mess of multiple devices pointing to the same DT node with
> no clear way to identify which ones will point to a DT node and which
> ones will probe and which ones won't. In the future, we shouldn't
> allow adding new compatible strings for partitions for which we don't
> plan on adding nvmem drivers.
>
> Can you give the patch at the end of the email a shot? It should fix
> the issue with this series and without this series. It just avoids
> this whole mess by not creating useless platform device for
> nvmem-cells compatible DT nodes.
Thanks a lot for your help.
>
> Thanks,
> Saravana
>
> diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
> index d442fa94c872..88a213f4d651 100644
> --- a/drivers/mtd/mtdpart.c
> +++ b/drivers/mtd/mtdpart.c
> @@ -577,6 +577,7 @@ static int mtd_part_of_parse(struct mtd_info *master,
> {
> struct mtd_part_parser *parser;
> struct device_node *np;
> + struct device_node *child;
> struct property *prop;
> struct device *dev;
> const char *compat;
> @@ -594,6 +595,10 @@ static int mtd_part_of_parse(struct mtd_info *master,
> else
> np = of_get_child_by_name(np, "partitions");
>
> + for_each_child_of_node(np, child)
> + if (of_device_is_compatible(child, "nvmem-cells"))
> + of_node_set_flag(child, OF_POPULATED);
What about adding a comment in the final patch explaining why we need
that? Otherwise it's a little bit obscure.
> +
> of_property_for_each_string(np, "compatible", prop, compat) {
> parser = mtd_part_get_compatible_parser(compat);
> if (!parser)
Thanks,
Miquèl
On Sun, Feb 5, 2023 at 7:33 PM Saravana Kannan <[email protected]> wrote:
>
> On Fri, Feb 3, 2023 at 1:39 AM Maxim Kiselev <[email protected]> wrote:
> >
> > пт, 3 февр. 2023 г. в 09:07, Saravana Kannan <[email protected]>:
> > >
> > > On Thu, Feb 2, 2023 at 9:36 AM Maxim Kiselev <[email protected]> wrote:
> > > >
> > > > Hi Saravana,
> > > >
> > > > > Can you try the patch at the end of this email under these
> > > > > configurations and tell me which ones fail vs pass? I don't need logs
> > > >
> > > > I did these tests and here is the results:
> > >
> > > Did you hand edit the In-Reply-To: in the header? Because in the
> > > thread you are reply to the wrong email, but the context in your email
> > > seems to be from the right email.
> > >
> > > For example, see how your reply isn't under the email you are replying
> > > to in this thread overview:
> > > https://lore.kernel.org/lkml/[email protected]/#r
> > >
> > > > 1. On top of this series - Not works
> > > > 2. Without this series - Works
> > > > 3. On top of the series with the fwnode_dev_initialized() deleted - Not works
> > > > 4. Without this series, with the fwnode_dev_initialized() deleted - Works
> > > >
> > > > So your nvmem/core.c patch helps only when it is applied without the series.
> > > > But despite the fact that this helps to avoid getting stuck at probing
> > > > my ethernet device, there is still regression.
> > > >
> > > > When the ethernet module is loaded it takes a lot of time to drop dependency
> > > > from the nvmem-cell with mac address.
> > > >
> > > > Please look at the kernel logs below.
> > >
> > > The kernel logs below really aren't that useful for me in their
> > > current state. See more below.
> > >
> > > ---8<---- <snip> --->8----
> > >
> > > > P.S. Your nvmem patch definitely helps to avoid a device probe stuck
> > > > but look like it is not best way to solve a problem which we discussed
> > > > in the MTD thread.
> > > >
> > > > P.P.S. Also I don't know why your nvmem-cell patch doesn't help when it was
> > > > applied on top of this series. Maybe I missed something.
> > >
> > > Yeah, I'm not too sure if the test was done correctly. You also didn't
> > > answer my question about the dts from my earlier email.
> > > https://lore.kernel.org/lkml/CAGETcx8FpmbaRm2CCwqt3BRBpgbogwP5gNB+iA5OEtuxWVTNLA@mail.gmail.com/#t
> > >
> > > So, can you please retest config 1 with all pr_debug and dev_dbg in
> > > drivers/base/core.c changed to the _info variants? And then share the
> > > kernel log from the beginning of boot? Maybe attach it to the email so
> > > it doesn't get word wrapped by your email client. And please point me
> > > to the .dts that corresponds to your board. Without that, I can't
> > > debug much.
> > >
> > > Thanks,
> > > Saravana
> >
> > > Did you hand edit the In-Reply-To: in the header? Because in the
> > > thread you are replying to the wrong email, but the context in your email
> > > seems to be from the right email.
> >
> > Sorry for that, it seems like I accidentally deleted it.
> >
> > > So, can you please retest config 1 with all pr_debug and dev_dbg in
> > > drivers/base/core.c changed to the _info variants? And then share the
> > > kernel log from the beginning of boot? Maybe attach it to the email so
> > > it doesn't get word wrapped by your email client. And please point me
> > > to the .dts that corresponds to your board. Without that, I can't
> > > debug much.
> >
> > Ok, I retested config 1 with all _debug logs changed to the _info. I
> > added the kernel log and the dts file to the attachment of this email.
>
> Ah, so your device is not supported/present upstream? Even though it's
> not upstream, I'll help fix this because it should fix what I believe
> are unreported issues in upstream.
>
> Ok I know why configs 1 - 4 behaved the way they did and why my test
> patch didn't help.
>
> After staring at mtd/nvmem code for a few hours I think mtd/nvmem
> interaction is kind of a mess. mtd core creates "partition" platform
> devices (including for nvmem-cells) that are probed by drivers in
> drivers/nvmem. However, there's no driver for the "nvmem-cells" partition
> platform device. Meanwhile, the nvmem core creates an nvmem_device when
> nvmem_register() is called by MTD or by these partition platform devices
> created by MTD. These nvmem_devices are added to nvmem_bus, but
> the bus has no means to even register a driver (it should really be a
> nvmem_class and not nvmem_bus). And the nvmem_device sometimes points
> to the DT node of the MTD device, sometimes to that of the partition
> platform device, or maybe to no DT node at all.
>
> So it's a mess of multiple devices pointing to the same DT node with
> no clear way to identify which ones will point to a DT node and which
> ones will probe and which ones won't. In the future, we shouldn't
> allow adding new compatible strings for partitions for which we don't
> plan on adding nvmem drivers.

That won't work. Having a compatible string cannot mean there must be a driver.

Rob

On Mon, Feb 6, 2023 at 7:19 AM Rob Herring <[email protected]> wrote:
>
> On Sun, Feb 5, 2023 at 7:33 PM Saravana Kannan <[email protected]> wrote:
> >
> > On Fri, Feb 3, 2023 at 1:39 AM Maxim Kiselev <[email protected]> wrote:
> > >
> > > > On Fri, Feb 3, 2023 at 09:07, Saravana Kannan <[email protected]> wrote:
> > > >
> > [...]
> >
> > After staring at mtd/nvmem code for a few hours I think mtd/nvmem
> > interaction is kind of a mess. mtd core creates "partition" platform
> > devices (including for nvmem-cells) that are probed by drivers in
> > drivers/nvmem. However, there's no driver for "nvmem-cells" partition
> > platform device. However, the nvmem core creates nvmem_device when
> > nvmem_register() is called by MTD or these partition platform devices
> > created by MTD. But these nvmem_devices are added to a nvmem_bus but
> > the bus has no means to even register a driver (it should really be a
> > nvmem_class and not nvmem_bus). And the nvmem_device sometimes points
> > to the DT node of the MTD device or sometimes the partition platform
> > devices or maybe no DT node at all.
> >
> > So it's a mess of multiple devices pointing to the same DT node with
> > no clear way to identify which ones will point to a DT node and which
> > ones will probe and which ones won't. In the future, we shouldn't
> > allow adding new compatible strings for partitions for which we don't
> > plan on adding nvmem drivers.
>
> That won't work. Having a compatible string cannot mean there must be a driver.
Right, I know what you mean Rob and I know where you are coming from
(DT isn't just about Linux or even driver core). But what I'm saying
is that this seems to already be the case for MTD partitions after
commit bcdf0315a61a ("mtd: call of_platform_populate() for MTD
partitions").
So, if we are adding compatible properties only for some of them, then
I'm saying we should make sure people write drivers for them going
forward.
I don't know enough about MTD partitions to know why only some of them
have compatible properties.
-Saravana
On Mon, Feb 6, 2023 at 1:39 AM Miquel Raynal <[email protected]> wrote:
>
> Hi Saravana,
>
> + Srinivas, nvmem maintainer
>
> [email protected] wrote on Sun, 5 Feb 2023 17:32:57 -0800:
>
> > On Fri, Feb 3, 2023 at 1:39 AM Maxim Kiselev <[email protected]> wrote:
> > >
> > > > On Fri, Feb 3, 2023 at 09:07, Saravana Kannan <[email protected]> wrote:
> > [...]
> >
> > After staring at mtd/nvmem code for a few hours I think mtd/nvmem
> > interaction is kind of a mess.
>
> nvmem is a recent subsystem but mtd carries a lot of legacy stuff we
> cannot really re-wire without breaking users, so nvmem on top of mtd
> of course inherits the fragile designs in place.
Thanks for the context. Yeah, I figured. That's why I explicitly
limited my comment to "interaction". Although, I'd love to see the MTD
parsers all be converted to proper drivers that probe. MTD is
essentially repeating the driver matching logic. I think it can be
cleaned up to move to proper drivers and still not break backward
compatibility. Not saying it'll be trivial, but it should be possible.
Ironically MTD uses mtd_class but has real drivers that work on the
device (compared to nvmem_bus below).
> > mtd core creates "partition" platform
> > devices (including for nvmem-cells) that are probed by drivers in
> > drivers/nvmem. However, there's no driver for "nvmem-cells" partition
> > platform device. However, the nvmem core creates nvmem_device when
> > nvmem_register() is called by MTD or these partition platform devices
> > created by MTD. But these nvmem_devices are added to a nvmem_bus but
> > the bus has no means to even register a driver (it should really be a
> > nvmem_class and not nvmem_bus).
>
> Srinivas, do you think we could change this?
Yeah, this part gets a bit tricky. It depends on whether the sysfs
files for nvmem devices are considered an ABI. Changing from bus to
class would change the sysfs path for nvmem devices from:
/sys/bus/nvmem to /sys/class/nvmem
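
To make the ABI concern concrete: devices on a bus are listed under
/sys/bus/<bus>/devices/ while class devices are listed under
/sys/class/<class>/. Here is a small user-space sketch of the two layouts.
The helper names and the "mtd0" device name are made up for illustration;
they are not kernel APIs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: path of a device registered on a bus. */
int sysfs_bus_dev_path(char *buf, size_t len, const char *bus, const char *dev)
{
	return snprintf(buf, len, "/sys/bus/%s/devices/%s", bus, dev);
}

/* Hypothetical helper: path of a device registered in a class. */
int sysfs_class_dev_path(char *buf, size_t len, const char *cls, const char *dev)
{
	return snprintf(buf, len, "/sys/class/%s/%s", cls, dev);
}

/* Shows that the same device name lands at two different paths. */
void sysfs_path_demo(void)
{
	char bus_path[64], class_path[64];

	sysfs_bus_dev_path(bus_path, sizeof(bus_path), "nvmem", "mtd0");
	sysfs_class_dev_path(class_path, sizeof(class_path), "nvmem", "mtd0");

	/* Anything in userspace that hardcodes one path would not find the other. */
	assert(strcmp(bus_path, class_path) != 0);
	printf("%s\n%s\n", bus_path, class_path);
}
```

Any script or tool that hardcodes one of these paths would stop working
after such a switch, which is why it matters whether the path is ABI.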
> > And the nvmem_device sometimes points
> > to the DT node of the MTD device or sometimes the partition platform
> > devices or maybe no DT node at all.
>
> I guess this comes from the fact that this is not strongly defined in
> mtd and depends on the situation (not mentioning 20 years of history
> there as well). "mtd" is a bit inconsistent on what it means. Older
> designs mixed: controllers, ECC engines when relevant and memories;
> while these three components are completely separated. Hence
> sometimes the mtd device ends up being the top level controller,
> sometimes it's just one partition...
>
> But I'm surprised not all of them point to a DT node. Could you show us
> an example? Because that might likely be unexpected (or perhaps I am
> missing something).
Well, the logic that sets the DT node for nvmem_device is like so:

	if (config->of_node)
		nvmem->dev.of_node = config->of_node;
	else if (!config->no_of_node)
		nvmem->dev.of_node = config->dev->of_node;

So there's definitely a path (where both if's could be false) where
the DT node will not get set. I don't know if that path is possible
with the existing users of nvmem_register(), but it's definitely
possible.
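
To spell out the three outcomes, here is a stand-alone user-space model of
that if/else-if. The struct definitions are stripped-down stand-ins I made
up so it compiles outside the kernel, and nvmem_pick_of_node() is a
hypothetical helper name, not something that exists in nvmem core.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel types; only the fields used here. */
struct device_node { const char *name; };
struct device { struct device_node *of_node; };

struct nvmem_config {
	struct device *dev;          /* parent device passed by the caller */
	struct device_node *of_node; /* explicit DT node, may be NULL */
	bool no_of_node;             /* caller opts out of inheriting dev->of_node */
};

/* Mirrors the quoted logic: returns the node the nvmem device ends up with. */
struct device_node *nvmem_pick_of_node(const struct nvmem_config *config)
{
	if (config->of_node)
		return config->of_node;
	if (!config->no_of_node)
		return config->dev->of_node;
	return NULL; /* neither branch taken: no DT node gets set */
}

void nvmem_pick_of_node_demo(void)
{
	struct device_node parent_np = { "flash" }, part_np = { "partition" };
	struct device parent = { &parent_np };
	struct nvmem_config with_node = { &parent, &part_np, false };
	struct nvmem_config inherit = { &parent, NULL, false };
	struct nvmem_config opt_out = { &parent, NULL, true };

	assert(nvmem_pick_of_node(&with_node) == &part_np);  /* explicit node wins */
	assert(nvmem_pick_of_node(&inherit) == &parent_np);  /* falls back to parent */
	assert(nvmem_pick_of_node(&opt_out) == NULL);        /* no DT node at all */
}
```

The third case is the one I'm talking about: no explicit node and
no_of_node set.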
> > So it's a mess of multiple devices pointing to the same DT node with
> > no clear way to identify which ones will point to a DT node and which
> > ones will probe and which ones won't. In the future, we shouldn't
> > allow adding new compatible strings for partitions for which we don't
> > plan on adding nvmem drivers.
> >
> > Can you give the patch at the end of the email a shot? It should fix
> > the issue with this series and without this series. It just avoids
> > this whole mess by not creating useless platform device for
> > nvmem-cells compatible DT nodes.
>
> Thanks a lot for your help.
No problem. I want fw_devlink to work for everyone.
> >
> > Thanks,
> > Saravana
> >
> > diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
> > index d442fa94c872..88a213f4d651 100644
> > --- a/drivers/mtd/mtdpart.c
> > +++ b/drivers/mtd/mtdpart.c
> > @@ -577,6 +577,7 @@ static int mtd_part_of_parse(struct mtd_info *master,
> >  {
> >          struct mtd_part_parser *parser;
> >          struct device_node *np;
> > +        struct device_node *child;
> >          struct property *prop;
> >          struct device *dev;
> >          const char *compat;
> > @@ -594,6 +595,10 @@ static int mtd_part_of_parse(struct mtd_info *master,
> >          else
> >                  np = of_get_child_by_name(np, "partitions");
> >
> > +        for_each_child_of_node(np, child)
> > +                if (of_device_is_compatible(child, "nvmem-cells"))
> > +                        of_node_set_flag(child, OF_POPULATED);
>
> What about adding a comment in the final patch explaining why we need
> that? Otherwise it's a little bit obscure.
This wasn't meant to be reviewed :) Just a quick patch to make sure
I'm going down the right path. Once Maxim confirms I was going to roll
this into a proper patch.
But point noted. Will add a comment.
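
In the meantime, here is the idea of the quick patch as a user-space toy
model. Everything in it is a mock (mock_node, MOCK_OF_POPULATED, both
function names, and a single compatible string per node instead of a
list). It also assumes the populate code skips nodes that are already
flagged as populated, which is what the patch relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_OF_POPULATED 0x1UL /* stand-in for the kernel's OF_POPULATED bit */

/* Mock DT node: one compatible string, a flags word, child/sibling links. */
struct mock_node {
	const char *compatible;    /* NULL if the node has no compatible */
	unsigned long flags;
	struct mock_node *child;   /* first child */
	struct mock_node *sibling; /* next sibling */
};

/* Flag every "nvmem-cells" child as populated; returns how many were flagged. */
int mark_nvmem_cells_populated(struct mock_node *partitions)
{
	int marked = 0;

	for (struct mock_node *c = partitions->child; c; c = c->sibling) {
		if (c->compatible && strcmp(c->compatible, "nvmem-cells") == 0) {
			c->flags |= MOCK_OF_POPULATED;
			marked++;
		}
	}
	return marked;
}

/* Counts children that would still get a platform device created for them. */
int count_devices_to_create(const struct mock_node *partitions)
{
	int n = 0;

	for (const struct mock_node *c = partitions->child; c; c = c->sibling)
		if (c->compatible && !(c->flags & MOCK_OF_POPULATED))
			n++;
	return n;
}

void nvmem_cells_demo(void)
{
	struct mock_node other = { "some-other-compatible", 0, NULL, NULL };
	struct mock_node cells = { "nvmem-cells", 0, NULL, &other };
	struct mock_node partitions = { NULL, 0, &cells, NULL };

	assert(count_devices_to_create(&partitions) == 2);
	assert(mark_nvmem_cells_populated(&partitions) == 1);
	/* Only the non-nvmem-cells child is left to become a platform device. */
	assert(count_devices_to_create(&partitions) == 1);
}
```

So the nvmem-cells node never turns into a platform device that no driver
will probe.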
Thanks,
Saravana
Hi Saravana,
> > > > > So, can you please retest config 1 with all pr_debug and dev_dbg in
> > > > > drivers/base/core.c changed to the _info variants? And then share the
> > > > > kernel log from the beginning of boot? Maybe attach it to the email so
> > > > > it doesn't get word wrapped by your email client. And please point me
> > > > > to the .dts that corresponds to your board. Without that, I can't
> > > > > debug much.
> > > >
> > > > Ok, I retested config 1 with all _debug logs changed to the _info. I
> > > > added the kernel log and the dts file to the attachment of this email.
> > >
> > > Ah, so your device is not supported/present upstream? Even though it's
> > > not upstream, I'll help fix this because it should fix what I believe
> > > are unreported issues in upstream.
> > >
> > > Ok I know why configs 1 - 4 behaved the way they did and why my test
> > > patch didn't help.
> > >
> > > After staring at mtd/nvmem code for a few hours I think mtd/nvmem
> > > interaction is kind of a mess.
> >
> > nvmem is a recent subsystem but mtd carries a lot of legacy stuff we
> > cannot really re-wire without breaking users, so nvmem on top of mtd
> > of course inherits the fragile designs in place.
>
> Thanks for the context. Yeah, I figured. That's why I explicitly
> limited my comment to "interaction". Although, I'd love to see the MTD
> parsers all be converted to proper drivers that probe. MTD is
> essentially repeating the driver matching logic. I think it can be
> cleaned up to move to proper drivers and still not break backward
> compatibility. Not saying it'll be trivial, but it should be possible.
> Ironically MTD uses mtd_class but has real drivers that work on the
> device (compared to nvmem_bus below).
>
> > > mtd core creates "partition" platform
> > > devices (including for nvmem-cells) that are probed by drivers in
> > > drivers/nvmem. However, there's no driver for "nvmem-cells" partition
> > > platform device. However, the nvmem core creates nvmem_device when
> > > nvmem_register() is called by MTD or these partition platform devices
> > > created by MTD. But these nvmem_devices are added to a nvmem_bus but
> > > the bus has no means to even register a driver (it should really be a
> > > nvmem_class and not nvmem_bus).
> >
> > Srinivas, do you think we could change this?
>
> Yeah, this part gets a bit tricky. It depends on whether the sysfs
> files for nvmem devices are considered an ABI. Changing from bus to
> class would change the sysfs path for nvmem devices from:
> /sys/bus/nvmem to /sys/class/nvmem
Ok, so this is a no :)
> > > And the nvmem_device sometimes points
> > > to the DT node of the MTD device or sometimes the partition platform
> > > devices or maybe no DT node at all.
> >
> > I guess this comes from the fact that this is not strongly defined in
> > mtd and depends on the situation (not mentioning 20 years of history
> > there as well). "mtd" is a bit inconsistent on what it means. Older
> > designs mixed: controllers, ECC engines when relevant and memories;
> > while these three components are completely separated. Hence
> > sometimes the mtd device ends up being the top level controller,
> > sometimes it's just one partition...
> >
> > But I'm surprised not all of them point to a DT node. Could you show us
> > an example? Because that might likely be unexpected (or perhaps I am
> > missing something).
>
> Well, the logic that sets the DT node for nvmem_device is like so:
>
>         if (config->of_node)
>                 nvmem->dev.of_node = config->of_node;
>         else if (!config->no_of_node)
>                 nvmem->dev.of_node = config->dev->of_node;
>
> So there's definitely a path (where both if's could be false) where
> the DT node will not get set. I don't know if that path is possible
> with the existing users of nvmem_register(), but it's definitely
> possible.

It's an actual path. I just checked in more detail; this is the change
from 2018 which uses the no_of_node flag:
c4dfa25ab307 ("mtd: add support for reading MTD devices via the nvmem API")

It basically allows any mtd device to be accessible (read-only) through
nvmem. So mtd partitions or such which are not described in the DT may
just be accessed through nvmem (that is my current understanding).

There was later a patch in 2021 which prevented this flag from being
automatically set, so that if partitions (well, mtd devices in general)
were described in the DT, they would provide a valid of_node in order
to be used as cell providers (again, my understanding):
658c4448bbbf ("mtd: core: add nvmem-cells compatible to parse mtd as nvmem cells")

But I guess the major problem comes from the nvmem-cells compatible. I
am wondering if it would make sense to kind of transpose the meaning of
this compatible into a property. But, well, backward compatibility
would still be a problem I guess...

> > > So it's a mess of multiple devices pointing to the same DT node with
> > > no clear way to identify which ones will point to a DT node and which
> > > ones will probe and which ones won't. In the future, we shouldn't
> > > allow adding new compatible strings for partitions for which we don't
> > > plan on adding nvmem drivers.
> > >
> > > Can you give the patch at the end of the email a shot? It should fix
> > > the issue with this series and without this series. It just avoids
> > > this whole mess by not creating useless platform device for
> > > nvmem-cells compatible DT nodes.
> >
> > Thanks a lot for your help.
>
> No problem. I want fw_devlink to work for everyone.
>
Thanks,
Miquèl