2024-01-23 02:20:45

by Michael Pratt

[permalink] [raw]
Subject: [PATCH v1 0/4] fw_devlink: generically handle bad links to child fwnodes

Hi all,

First off, if you are CC'ed here, it's likely because you were
involved in or CC'ed in a patch series by Saravana Kannan for fw_devlink,
and for consistency, I'm kindly asking for review from you all as well.
If you think this may not affect your use cases, feel free to ignore this.
I'm also CC'ing Christian and Rafal from the Openwrt project.

This series is following up on some excellent work from Saravana [1]
which is another patch series that includes some changes
that helped make it possible for fw_devlink to work with MTD partitions
that are a supplier fwnode by being a NVMEM provider. For an MTD partition
to become an NVMEM provider, it must be registered as a platform device
using of_platform_populate() or similar function
which was done in a previous commit [2]
but this also resulted in fw_devlink to apply
to those fwnodes and their child fwnodes.

That regression caused consumer devices to defer indefinitely
while waiting for probing that will never happen or for device links
to form from fwnode links with fwnodes that are not associated
with any real device or driver that probes
(e.g. describing the location of a MAC address).

Saravana's patch series helped in two ways:
First, by moving consumers from a child fwnode (in this case,
the "nvmem-cells" compatible node) to an ancestor fwnode
that represents the actual supplier device when that device probes,
which handles the case where
the supplier device has a probe attempt first. [3] [4]
And secondly, by specifically marking "nvmem-cells" compatible nodes
as populated during MTD partition parsing which also occurs during
the supplier device probe, which handles both cases of initial probe order,
but only for that node type. [5]

However, there's some drawbacks to the second solution
from having to manually name which nodes need the workaround
for the corner case with the compatible string.
Most notably, that it's possible for this corner case
to apply to other nodes types, but also the fact that initial probe order
deciding whether or not everything probes in the intended order, if at all,
through driver core alone is still an issue with fw_devlink,
despite the fact that controlling probe order in driver core
is one of it's goals. In other words, the real problem is in driver core,
but the fix is in the mtd driver.

Unfortunately, with the Openwrt project
we are experiencing this regression again
by implementing the new NVMEM layout "fixed-layout"
after it was accepted into the kernel. [6]
This causes some subsystems of an SOC, like
ethernet or wireless or both depending on hardware and DTS,
and in some cases a completely different function like USB
to never probe once again, and the temporary fix, like before,
is by disabling fw_devlink. [7]

Below is a simplified DTS of an Atheros device with the NVMEM layout:


&eth0 {
nvmem-cells = <&macaddr_art_0>;
nvmem-cell-names = "mac-address";
};

&wifi {
nvmem-cells = <&caldata_art_1000>;
nvmem-cell-names = "calibration";
};

&spi {
status = "okay";

flash@0 {
compatible = "jedec,spi-nor";

partitions {
compatible = "fixed-partitions";

partition@ff0000 {
label = "art";
reg = <0xff0000 0x010000>;
read-only;

nvmem-layout {
compatible = "fixed-layout";

macaddr_art_0: macaddr@0 {
reg = <0x0 0x6>;
};

caldata_art_1000: caldata@1000 {
reg = <0x1000 0x440>;
};
};
};
};
};
};


When the DTS is written this way, not only is there a different node
involved for NVMEM, but this node is a new node that is yet another
descendant in the tree. In other words, the "partition@ff0000" node
used to be what designated this device as the NVMEM provider
with the "nvmem-cells" compatible, so the node that represents
the actual probing device is now 4 parent nodes up in the tree
in this example instead of 3 parent nodes up in the tree as before.

For the past year, and even before the "fixed-layout" issue came up,
I had been experimenting with a way to handle these reverse probe order
and linking of distant descendant node issues in a generic way instead of
naming exceptions with the compatible string, and this series is the
culmination of those efforts. It may look strange at first,
but there are a myriad set of cases to handle and other reasons
that led me to develop this approach of using existing bad device links
in order to find the correct fwnode to link to, and then redo
the relevant links for that consumer device altogether.
I'm concerned that doing this another way would be a much more massive
project that would basically rewrite what the fw_devlink feature does.
Or perhaps there would have to be a new, third form of device links
that would be a "placeholder" before it becomes a fwnode link.
Eventually, I came to the conclusion that
there simply is not enough information to form the correct fwnode link
before the real supplier device has a successful probe.

Some of the changes proposed here are made on the extreme side of caution,
for example, checking for null dereference when it might not be necessary,
and reducing the activity of some functions in order to reduce
the amount of assumptions taking place in the middle of driver core
in cases where the new functions proposed here handles that just as well
and closer to a possible probe defer event
(e.g. not declaring a fwnode as NOT_DEVICE before
a probe attempt is expected to have happened).

I have tried to make the commit messages as self-explanatory as I can,
but they might have ended up a little too verbose, and therefore confusing
but I'm ready to explain whatever has not been explained well already.
Lastly, this is my first attempt at sending a larger change to the kernel,
so I appreciate your time and patience very much.

MCP


[1] https://lore.kernel.org/lkml/[email protected]/

[2] bcdf0315a61a29eb753a607d3a85a4032de72d94

[3] 3a2dbc510c437ca392516b0105bad8e7970e6614

[4] 411c0d58ca6faa9bc4b9f5382118a31c7bb92a6f

[5] fb42378dcc7f247df56f0ecddfdae85487495fbc

[6] 27f699e578b1a72df89dfa3bc42e093a01dc8d10

[7] https://github.com/openwrt/openwrt/commit/703d667a0cdf6dfa402c08e72d0c77a257ca5009


Michael Pratt (4):
driver core: fw_devlink: Use driver to determine probe ability
driver core: fw_devlink: Link to supplier ancestor if no device
driver core: fw_devlink: Add function device_links_fixup_suppliers()
mtd: mtdpart: Allow fwnode links to NVMEM compatible fwnodes

drivers/base/base.h | 1 +
drivers/base/core.c | 144 ++++++++++++++++++++++++++++++++++++++---
drivers/base/dd.c | 2 +
drivers/mtd/mtdpart.c | 10 ---
include/linux/fwnode.h | 4 ++
5 files changed, 143 insertions(+), 18 deletions(-)


base-commit: b0d326da462e20285236e11e4cbc32085de9f363
--
2.30.2




2024-01-23 02:23:00

by Michael Pratt

[permalink] [raw]
Subject: [PATCH v1 4/4] mtd: mtdpart: Allow fwnode links to NVMEM compatible fwnodes

This reverts commit fb42378dcc7f247df56f0ecddfdae85487495fbc
("mtd: mtdpart: Don't create platform device that'll never probe").

That commit is a manual named exception in order to avoid fw_devlink links
to an "nvmem-cells" compatible node which is a descendant of the fwnode
that represents the real supplier device that probes.

The commit does not work for newer cases, like the "fixed-layout"
compatible nodes, but instead of adding another compatible string,
remove this workaround as it is no longer needed after
the previous few commits which handle the situation in a generic way
for all supplier nodes that are a child or further descendant fwnode
of a parent device that can probe, including when the consumer device
has a probe attempt before the supplier device, by using an existing
incorrect fwnode or device link to recreate the correct one.

Signed-off-by: Michael Pratt <[email protected]>
---
drivers/mtd/mtdpart.c | 10 ----------
1 file changed, 10 deletions(-)

diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
index 6811a714349d..dd2b27674f56 100644
--- a/drivers/mtd/mtdpart.c
+++ b/drivers/mtd/mtdpart.c
@@ -582,7 +582,6 @@ static int mtd_part_of_parse(struct mtd_info *master,
{
struct mtd_part_parser *parser;
struct device_node *np;
- struct device_node *child;
struct property *prop;
struct device *dev;
const char *compat;
@@ -600,15 +599,6 @@ static int mtd_part_of_parse(struct mtd_info *master,
else
np = of_get_child_by_name(np, "partitions");

- /*
- * Don't create devices that are added to a bus but will never get
- * probed. That'll cause fw_devlink to block probing of consumers of
- * this partition until the partition device is probed.
- */
- for_each_child_of_node(np, child)
- if (of_device_is_compatible(child, "nvmem-cells"))
- of_node_set_flag(child, OF_POPULATED);
-
of_property_for_each_string(np, "compatible", prop, compat) {
parser = mtd_part_get_compatible_parser(compat);
if (!parser)
--
2.30.2



2024-01-23 02:28:17

by Michael Pratt

[permalink] [raw]
Subject: [PATCH v1 3/4] driver core: fw_devlink: Add function device_links_fixup_suppliers()

If the supplier fwnode of a link is a child firmware node
of a real device, the probe of the consumer device can be indefinitely
deferred in some cases like when the consumer is registered as a device
and attempts probe first.

Add a function to fixup or recreate both fwnode and device links
to a consumer device before probing gets deferred.

As noted in the previous commit,
descendant fwnodes of the real supplier device only become flagged
with PARENT_IS_DEV when the real supplier device probes,
which is necessary in some cases for knowing how many ancestors up
in the tree the fwnode representing the real device is.

In the case where the consumer device has a probe attempt
before the supplier device while the correct supplier fwnode
is multiple ancestors up from the fwnode being linked,
the unknowns present before the supplier device probes
causes incorrect fwnode links that either become incorrect device links
or in some cases, fwnode links are unable to become device links at all,
which leads to the need to redo the link between the devices
after a probe defer of the consumer, therefore,
when the original supplier fwnode is flagged PARENT_IS_DEV
after supplier device probing,
and the consumer is about to be deferred again from a missing supplier,
flag the original supplier fwnode as NOT_DEVICE also
and recreate the relevant fwnode/device links to the consumer device.

In the case where the supplier probes first
but a descendant fwnode is the one linked to the consumer,
the link recreation happens on the first probe attempt of the consumer.

Because of the fwnode flags, the links will be handled differently
when deleted and recreated compared to before the consumer probe attempt
by linking the consumer with a ancestor of the original supplier fwnode,
which at this point represents a device that already probed successfully.
In the future, a way to develop "deferred device links" may be possible,
but that is likely much more complex.

The existing function device_links_check_suppliers() is not suitable
for these fixup changes because of the case where a different supplier
may also cause a deferred probe for a completely different reason.
Handle this in a new function device_links_fixup_suppliers()
which will be called before device_links_check_suppliers()
which is where it is decided whether or not the consumer must be deferred.
Checks in the new function exactly match checks in the function
device_links_check_suppliers() for consistency.

This new function is now the last opportunity to interact with
device links before probe is deferred, which stops device links from
being processed until re-probe or the next time device_add() is called.
This makes it an ideal location to finally assume that the supplier
must have probed at that point in time, unless there is an issue.
This should be called at every probe, because it is important to
handle as many device link changes as possible before probe deferral,
since deferred_probe_initcall() will only be called *once*
at late_initcall, and another probe defer at that point or later
can cause the consumer to be deferred indefinitely,
and the fact that in some cases, a probe deferral can be
avoided entirely if the real supplier device probes first.

Also, make sure that the device links are not handled earlier
than the fixup function for these cases.
Otherwise, before deferred probes are started again,
the link will be converted to sync-state only.

Signed-off-by: Michael Pratt <[email protected]>
---
drivers/base/base.h | 1 +
drivers/base/core.c | 91 +++++++++++++++++++++++++++++++++++++++++++++
drivers/base/dd.c | 2 +
3 files changed, 94 insertions(+)

diff --git a/drivers/base/base.h b/drivers/base/base.h
index eb4c0ace9242..96593650a861 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -229,6 +229,7 @@ bool device_links_busy(struct device *dev);
void device_links_unbind_consumers(struct device *dev);
void fw_devlink_drivers_done(void);
void fw_devlink_probing_done(void);
+void device_links_fixup_suppliers(struct device *dev);

/* device pm support */
void device_pm_move_to_tail(struct device *dev);
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 7f2652cf5edc..96edcd842b42 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1773,6 +1773,9 @@ static void fw_devlink_relax_link(struct device_link *link)
if (device_link_flag_is_sync_state_only(link->flags))
return;

+ if (link->supplier->fwnode && (link->supplier->fwnode->flags & FWNODE_FLAG_PARENT_IS_DEV))
+ return;
+
pm_runtime_drop_link(link);
link->flags = DL_FLAG_MANAGED | FW_DEVLINK_FLAGS_PERMISSIVE;
dev_dbg(link->consumer, "Relaxing link with %s\n",
@@ -2286,6 +2289,94 @@ static void fw_devlink_link_device(struct device *dev)
mutex_unlock(&fwnode_link_lock);
}

+/**
+ * device_links_fixup_suppliers - Fix bad device links to suppliers of @dev.
+ * @dev: Consumer device.
+ *
+ * This should be called after the suppliers of @dev have a chance to form
+ * device links and @dev is probing so that, even if the consumer device
+ * has had it's fwtree parsed and it's attempt to probe started first,
+ * the suppliers are guaranteed to have an attempt at probing
+ * thanks to the dependency defined through the existing device links.
+ *
+ * In driver core, the suppliers have had an opportunity to probe at this point.
+ * Therefore, this is an ideal position to handle corner cases where,
+ * for whatever reason, the supplier is not optional, but the link to
+ * a supplier is bad and causing the probing of the consumer to defer.
+ * This is the last opportunity to handle a bad device link before
+ * that link will cause a probe defer of the consumer.
+ *
+ * If a fwnode link has not been translated to a device link at this point,
+ * or if a device link has not successfully resulted in the supplier probing,
+ * we know it is not possible for that node to represent a real device that
+ * provides functionality to the consumer, but an ancestor of that node might be.
+ */
+void device_links_fixup_suppliers(struct device *dev)
+{
+ struct fwnode_link *fwlink, *fwtmp;
+ struct device_link *link, *tmp;
+
+ mutex_lock(&fwnode_link_lock);
+ device_links_write_lock();
+
+ if (!dev->fwnode)
+ goto no_fwnode;
+
+ /* Keep flag checks in sync with fwnode_links_check_suppliers() */
+ list_for_each_entry_safe(fwlink, fwtmp, &dev->fwnode->suppliers, c_hook) {
+ if (fwlink->flags & FWLINK_FLAG_CYCLE)
+ continue;
+
+ /* The supplier may be a child fwnode of a device, if so, relink */
+ if (fwlink->supplier->flags & FWNODE_FLAG_PARENT_IS_DEV) {
+ dev_dbg(dev,
+ "Linking to ancestor of fwnode %pfwf\n",
+ fwlink->supplier);
+ fwlink->supplier->flags |= FWNODE_FLAG_NOT_DEVICE;
+ dev->fwnode->flags &= ~FWNODE_FLAG_LINKS_ADDED;
+ __fwnode_link_del(fwlink);
+ mutex_unlock(&fwnode_link_lock);
+ device_links_write_unlock();
+ fw_devlink_link_device(dev);
+ mutex_lock(&fwnode_link_lock);
+ device_links_write_lock();
+ continue;
+ }
+ }
+
+no_fwnode:
+ /* Keep flag checks in sync with device_links_check_suppliers() */
+ list_for_each_entry_safe(link, tmp, &dev->links.suppliers, c_node) {
+ if (!(link->flags & DL_FLAG_MANAGED))
+ continue;
+
+ if (link->status != DL_STATE_AVAILABLE &&
+ !(link->flags & DL_FLAG_SYNC_STATE_ONLY)) {
+
+ if (!dev->fwnode || !link->supplier->fwnode)
+ continue;
+
+ /* The supplier may be a child fwnode of a device, if so, relink */
+ if (link->supplier->fwnode->flags & FWNODE_FLAG_PARENT_IS_DEV) {
+ dev_dbg(dev,
+ "Linking to ancestor of %s\n",
+ dev_name(link->supplier));
+ link->supplier->fwnode->flags |= FWNODE_FLAG_NOT_DEVICE;
+ dev->fwnode->flags &= ~FWNODE_FLAG_LINKS_ADDED;
+ __device_link_del(&link->kref);
+ mutex_unlock(&fwnode_link_lock);
+ device_links_write_unlock();
+ fw_devlink_link_device(dev);
+ mutex_lock(&fwnode_link_lock);
+ device_links_write_lock();
+ continue;
+ }
+ }
+ }
+ mutex_unlock(&fwnode_link_lock);
+ device_links_write_unlock();
+}
+
/* Device links support end. */

int (*platform_notify)(struct device *dev) = NULL;
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 85152537dbf1..24b8a506bc51 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -606,6 +606,8 @@ static int really_probe(struct device *dev, struct device_driver *drv)
!drv->suppress_bind_attrs;
int ret, link_ret;

+ device_links_fixup_suppliers(dev);
+
if (defer_all_probes) {
/*
* Value of defer_all_probes can be set only by
--
2.30.2



2024-01-23 02:49:25

by Michael Pratt

[permalink] [raw]
Subject: [PATCH v1 2/4] driver core: fw_devlink: Link to supplier ancestor if no device

Driver core currently supports linking to the next parent fwnode,
but is not yet handling cases where that parent
is also a firmware child node not representing a real device,
which can lead to an indefinite deferred probe in some cases.
In this case, the fwnode that should actually be linked to
is multiple ancestors up which presents a challenge where
it is unknown how many ancestors up the node that
represents the real probing device is. This makes the usage of
fwnode_get_next_parent_dev() insufficient because the real device's
fwnode may or may not be an ancestor of the next parent fwnode as well.

Introduce flag FWNODE_FLAG_PARENT_IS_DEV
in order to mark child firmware nodes of a device
as having a parent device that can probe.

Allow fwnode link creation to the original supplier fwnode's ancestors
when the original supplier fwnode and any fwnodes in between are flagged
as FWNODE_FLAG_NOT_DEVICE and/or FWNODE_FLAG_PARENT_IS_DEV
with a new function __fwnode_link_add_parents() which then creates
the fwnode link to a real device that provides the supplier's function.

This depends on other functions to label a supplier fwnode
as not a real device, which must be done before the fwnode links
are created, and if after that, relevant links to the supplier
would have to be deleted and have links recreated, otherwise,
the fwnode link would be dropped before the device link is attempted
or a fwnode link would not be able to become a device link at all,
because they were created before these fwnode flags can have any effect.

It also depends on the supplier device to actually probe first
in order to have the fwnode flags in place to know for certain
which fwnodes are non-probing child nodes
of the fwnode for the supplier device.

The use case of function __fw_devlink_pickup_dangling_consumers()
is designed so that the parameters are always a supplier fwnode
and one of it's parent fwnodes, so it is safer to assume and more specific
that the flag PARENT_IS_DEV should be added there, rather than
declaring the original supplier fwnode as NOT_DEVICE at that point.
Because this function is called when the real supplier device probes
and recursively calls itself for all child nodes of the device's fwnode,
set the new flag here in order to let it propagate down
to all descendant nodes, thereby providing the info needed later
in order to link to the proper fwnode representing the supplier device.

If a fwnode is flagged as FWNODE_FLAG_NOT_DEVICE
by the time a device link is to be made with it,
but not flagged as FWNODE_FLAG_PARENT_IS_DEV,
the link is dropped, otherwise the device link
is still made with the original supplier fwnode.
Theoretically, we can also handle linking to an ancestor
of the supplier fwnode when forming device links, but there
are still cases where the necessary fwnode flags are still missing
because the real supplier device did not probe yet.

Signed-off-by: Michael Pratt <[email protected]>
---
drivers/base/core.c | 49 ++++++++++++++++++++++++++++++++++++------
include/linux/fwnode.h | 4 ++++
2 files changed, 47 insertions(+), 6 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index c05a5f6b0641..7f2652cf5edc 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -92,13 +92,45 @@ static int __fwnode_link_add(struct fwnode_handle *con,
return 0;
}

+static int __fwnode_link_add_parents(struct fwnode_handle *con,
+ struct fwnode_handle *sup,
+ u8 flags, bool loop)
+{
+ int ret = -EINVAL;
+ struct fwnode_handle *parent;
+
+ fwnode_for_each_parent_node(sup, parent) {
+ /* Ignore the root node */
+ if (fwnode_count_parents(parent) &&
+ fwnode_device_is_available(parent) &&
+ !(parent->flags & FWNODE_FLAG_NOT_DEVICE) &&
+ !(parent->flags & FWNODE_FLAG_PARENT_IS_DEV)) {
+ ret = __fwnode_link_add(con, parent, flags);
+ }
+
+ if (!loop && !ret) {
+ fwnode_handle_put(parent);
+ return 0;
+ }
+ }
+
+ return ret;
+}
+
int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
{
int ret;

mutex_lock(&fwnode_link_lock);
- ret = __fwnode_link_add(con, sup, 0);
+
+ if ((sup->flags & FWNODE_FLAG_NOT_DEVICE) &&
+ (sup->flags & FWNODE_FLAG_PARENT_IS_DEV))
+ ret = __fwnode_link_add_parents(con, sup, 0, false);
+ else
+ ret = __fwnode_link_add(con, sup, 0);
+
mutex_unlock(&fwnode_link_lock);
+
return ret;
}

@@ -218,7 +250,8 @@ static void __fwnode_links_move_consumers(struct fwnode_handle *from,
* MANAGED device links to this device, so leave @fwnode and its descendant's
* fwnode links alone.
*
- * Otherwise, move its consumers to the new supplier @new_sup.
+ * Otherwise, flag descendants of @fwnode as having a parent fwnode for a device
+ * that has probed and move bad fwnode consumer links from @fwnode to @new_sup.
*/
static void __fw_devlink_pickup_dangling_consumers(struct fwnode_handle *fwnode,
struct fwnode_handle *new_sup)
@@ -228,8 +261,11 @@ static void __fw_devlink_pickup_dangling_consumers(struct fwnode_handle *fwnode,
if (fwnode->dev && fwnode->dev->driver)
return;

- fwnode->flags |= FWNODE_FLAG_NOT_DEVICE;
- __fwnode_links_move_consumers(fwnode, new_sup);
+ if (fwnode->flags & FWNODE_FLAG_NOT_DEVICE)
+ __fwnode_links_move_consumers(fwnode, new_sup);
+
+ fwnode->flags |= FWNODE_FLAG_PARENT_IS_DEV;
+ new_sup->flags &= ~(FWNODE_FLAG_PARENT_IS_DEV);

fwnode_for_each_available_child_node(fwnode, child)
__fw_devlink_pickup_dangling_consumers(child, new_sup);
@@ -2071,8 +2107,9 @@ static int fw_devlink_create_devlink(struct device *con,
device_links_write_unlock();
}

- if (sup_handle->flags & FWNODE_FLAG_NOT_DEVICE)
- sup_dev = fwnode_get_next_parent_dev(sup_handle);
+ if ((sup_handle->flags & FWNODE_FLAG_NOT_DEVICE) &&
+ !(sup_handle->flags & FWNODE_FLAG_PARENT_IS_DEV))
+ return -EINVAL;
else
sup_dev = get_dev_from_fwnode(sup_handle);

diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h
index 2a72f55d26eb..3ed8ba503062 100644
--- a/include/linux/fwnode.h
+++ b/include/linux/fwnode.h
@@ -30,6 +30,9 @@ struct device;
* BEST_EFFORT: The fwnode/device needs to probe early and might be missing some
* suppliers. Only enforce ordering with suppliers that have
* drivers.
+ * PARENT_IS_DEV: The fwnode is a child of a fwnode that is or will be populated as a
+ * struct device, which is more suitable for device links
+ * than the fwnode child which may never have a struct device.
*/
#define FWNODE_FLAG_LINKS_ADDED BIT(0)
#define FWNODE_FLAG_NOT_DEVICE BIT(1)
@@ -37,6 +40,7 @@ struct device;
#define FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD BIT(3)
#define FWNODE_FLAG_BEST_EFFORT BIT(4)
#define FWNODE_FLAG_VISITED BIT(5)
+#define FWNODE_FLAG_PARENT_IS_DEV BIT(6)

struct fwnode_handle {
struct fwnode_handle *secondary;
--
2.30.2



2024-01-23 02:49:55

by Michael Pratt

[permalink] [raw]
Subject: [PATCH v1 1/4] driver core: fw_devlink: Use driver to determine probe ability

The function __fw_devlink_pickup_dangling_consumers()
intends to ignore suppliers that are already capable of probing,
but uses whether or not a bus struct is defined in the device struct.

There are some cases where a firmware child node
can be address translatable but not able to probe
(e.g. the use of of_platform_populate() for MTD partitions),
so whether or not a driver is present is a more accurate way
to guess whether a fwnode represents a real probing device here.

This also serves as a preparation step for further changes
to fw_devlink including making the contents of this function
less strict in order to compensate for more cases being passed into
the rest of the function because the return case is now more strict.

"Hey! Who's driving the bus?"

Signed-off-by: Michael Pratt <[email protected]>
---
drivers/base/core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 14d46af40f9a..c05a5f6b0641 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -214,7 +214,7 @@ static void __fwnode_links_move_consumers(struct fwnode_handle *from,
* @new_sup: fwnode of new supplier
*
* If the @fwnode has a corresponding struct device and the device supports
- * probing (that is, added to a bus), then we want to let fw_devlink create
+ * probing (that is, bound to a driver), then we want to let fw_devlink create
* MANAGED device links to this device, so leave @fwnode and its descendant's
* fwnode links alone.
*
@@ -225,7 +225,7 @@ static void __fw_devlink_pickup_dangling_consumers(struct fwnode_handle *fwnode,
{
struct fwnode_handle *child;

- if (fwnode->dev && fwnode->dev->bus)
+ if (fwnode->dev && fwnode->dev->driver)
return;

fwnode->flags |= FWNODE_FLAG_NOT_DEVICE;
--
2.30.2



2024-02-05 15:02:43

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 2/4] driver core: fw_devlink: Link to supplier ancestor if no device

Hi Michael,

First, I want to say, this is a great job as fw_devlinks in mtd and
nvmem are really not easy to handle. I am willing to help, despite my
very light understanding of what the core actually does with these
flags.

[email protected] wrote on Tue, 23 Jan 2024 01:46:40 +0000:

> Driver core currently supports linking to the next parent fwnode,
> but is not yet handling cases where that parent
> is also a firmware child node not representing a real device,
> which can lead to an indefinite deferred probe in some cases.
> In this case, the fwnode that should actually be linked to
> is multiple ancestors up which presents a challenge where
> it is unknown how many ancestors up the node that
> represents the real probing device is. This makes the usage of
> fwnode_get_next_parent_dev() insufficient because the real device's
> fwnode may or may not be an ancestor of the next parent fwnode as well.
>
> Introduce flag FWNODE_FLAG_PARENT_IS_DEV
> in order to mark child firmware nodes of a device
> as having a parent device that can probe.
>
> Allow fwnode link creation to the original supplier fwnode's ancestors
> when the original supplier fwnode and any fwnodes in between are flagged
> as FWNODE_FLAG_NOT_DEVICE and/or FWNODE_FLAG_PARENT_IS_DEV
> with a new function __fwnode_link_add_parents() which then creates
> the fwnode link to a real device that provides the supplier's function.
>
> This depends on other functions to label a supplier fwnode
> as not a real device, which must be done before the fwnode links
> are created, and if after that, relevant links to the supplier
> would have to be deleted and have links recreated, otherwise,
> the fwnode link would be dropped before the device link is attempted
> or a fwnode link would not be able to become a device link at all,
> because they were created before these fwnode flags can have any effect.
>
> It also depends on the supplier device to actually probe first
> in order to have the fwnode flags in place to know for certain
> which fwnodes are non-probing child nodes
> of the fwnode for the supplier device.
>
> The use case of function __fw_devlink_pickup_dangling_consumers()
> is designed so that the parameters are always a supplier fwnode
> and one of it's parent fwnodes, so it is safer to assume and more specific
> that the flag PARENT_IS_DEV should be added there, rather than
> declaring the original supplier fwnode as NOT_DEVICE at that point.
> Because this function is called when the real supplier device probes
> and recursively calls itself for all child nodes of the device's fwnode,
> set the new flag here in order to let it propagate down
> to all descendant nodes, thereby providing the info needed later
> in order to link to the proper fwnode representing the supplier device.
>
> If a fwnode is flagged as FWNODE_FLAG_NOT_DEVICE
> by the time a device link is to be made with it,
> but not flagged as FWNODE_FLAG_PARENT_IS_DEV,
> the link is dropped, otherwise the device link
> is still made with the original supplier fwnode.
> Theoretically, we can also handle linking to an ancestor
> of the supplier fwnode when forming device links, but there
> are still cases where the necessary fwnode flags are still missing
> because the real supplier device did not probe yet.

I am not sure I follow this. In the following case, I would expect any
dependency towards node-c to be made against node-a. But the above
paragraph seems to tell otherwise: that the the link would be dropped
(and thus, not enforced) because recursively searching for a parent
that would be a device could be endless? It feels wrong, so I probably
mis

node-a {
# IS DEV
node-b {
# PARENT IS DEV
node-c {
# PARENT IS DEV
};
};
};

Besides that, the commit feels like a good idea.

Thanks,
Miquèl

2024-02-05 15:11:48

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH v1 4/4] mtd: mtdpart: Allow fwnode links to NVMEM compatible fwnodes

Hi Michael,

[email protected] wrote on Tue, 23 Jan 2024 01:47:21 +0000:

> This reverts commit fb42378dcc7f247df56f0ecddfdae85487495fbc
> ("mtd: mtdpart: Don't create platform device that'll never probe").
>
> That commit is a manual named exception in order to avoid fw_devlink links
> to an "nvmem-cells" compatible node which is a descendant of the fwnode
> that represents the real supplier device that probes.
>
> The commit does not work for newer cases, like the "fixed-layout"

Do you have plans for it? Because it is the modern description that is
now expected, so I don't feel convinced by all this work (which is
nevertheless considerable) if fixed-layouts are still broken?

> compatible nodes, but instead of adding another compatible string,
> remove this workaround as it is no longer needed after
> the previous few commits which handle the situation in a generic way
> for all supplier nodes that are a child or further descendant fwnode
> of a parent device that can probe, including when the consumer device
> has a probe attempt before the supplier device, by using an existing
> incorrect fwnode or device link to recreate the correct one.
>
> Signed-off-by: Michael Pratt <[email protected]>

Thanks,
Miquèl

2024-02-05 21:18:22

by Saravana Kannan

[permalink] [raw]
Subject: Re: [PATCH v1 2/4] driver core: fw_devlink: Link to supplier ancestor if no device

On Mon, Jan 22, 2024 at 5:46 PM Michael Pratt <[email protected]> wrote:
>
> Driver core currently supports linking to the next parent fwnode,
> but is not yet handling cases where that parent
> is also a firmware child node not representing a real device,
> which can lead to an indefinite deferred probe in some cases.
> In this case, the fwnode that should actually be linked to
> is multiple ancestors up which presents a challenge where
> it is unknown how many ancestors up the node that
> represents the real probing device is. This makes the usage of
> fwnode_get_next_parent_dev() insufficient because the real device's
> fwnode may or may not be an ancestor of the next parent fwnode as well.
>
> Introduce flag FWNODE_FLAG_PARENT_IS_DEV
> in order to mark child firmware nodes of a device
> as having a parent device that can probe.
>
> Allow fwnode link creation to the original supplier fwnode's ancestors
> when the original supplier fwnode and any fwnodes in between are flagged
> as FWNODE_FLAG_NOT_DEVICE and/or FWNODE_FLAG_PARENT_IS_DEV
> with a new function __fwnode_link_add_parents() which then creates
> the fwnode link to a real device that provides the supplier's function.
>
> This depends on other functions to label a supplier fwnode
> as not a real device, which must be done before the fwnode links
> are created, and if after that, relevant links to the supplier
> would have to be deleted and have links recreated, otherwise,
> the fwnode link would be dropped before the device link is attempted
> or a fwnode link would not be able to become a device link at all,
> because they were created before these fwnode flags can have any effect.
>
> It also depends on the supplier device to actually probe first
> in order to have the fwnode flags in place to know for certain
> which fwnodes are non-probing child nodes
> of the fwnode for the supplier device.
>
> The use case of function __fw_devlink_pickup_dangling_consumers()
> is designed so that the parameters are always a supplier fwnode
> and one of it's parent fwnodes, so it is safer to assume and more specific
> that the flag PARENT_IS_DEV should be added there, rather than
> declaring the original supplier fwnode as NOT_DEVICE at that point.
> Because this function is called when the real supplier device probes
> and recursively calls itself for all child nodes of the device's fwnode,
> set the new flag here in order to let it propagate down
> to all descendant nodes, thereby providing the info needed later
> in order to link to the proper fwnode representing the supplier device.
>
> If a fwnode is flagged as FWNODE_FLAG_NOT_DEVICE
> by the time a device link is to be made with it,
> but not flagged as FWNODE_FLAG_PARENT_IS_DEV,
> the link is dropped, otherwise the device link
> is still made with the original supplier fwnode.
> Theoretically, we can also handle linking to an ancestor
> of the supplier fwnode when forming device links, but there
> are still cases where the necessary fwnode flags are still missing
> because the real supplier device did not probe yet.
>
> Signed-off-by: Michael Pratt <[email protected]>

NACK for a bunch of reasons.

1. This logic should not be in fwnode_link_add(). That's supposed to
do the exact linking that the firmware meant. This logic should be
where the fwnode links are converted to device links or deferred
probing decision is done.

2. If we handle one parent above correctly, it should be easy to
handle multiple parents above too. So why not fix it where we handle
multiple parents above?

Btw, I'm happy to fix this or help you fix this once I understand the
issue. Just wanted to give a quick NACK to avoid people spending
cycles on this patch in its current state.

-Saravana


> ---
> drivers/base/core.c | 49 ++++++++++++++++++++++++++++++++++++------
> include/linux/fwnode.h | 4 ++++
> 2 files changed, 47 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index c05a5f6b0641..7f2652cf5edc 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -92,13 +92,45 @@ static int __fwnode_link_add(struct fwnode_handle *con,
> return 0;
> }
>
> +static int __fwnode_link_add_parents(struct fwnode_handle *con,
> + struct fwnode_handle *sup,
> + u8 flags, bool loop)
> +{
> + int ret = -EINVAL;
> + struct fwnode_handle *parent;
> +
> + fwnode_for_each_parent_node(sup, parent) {
> + /* Ignore the root node */
> + if (fwnode_count_parents(parent) &&
> + fwnode_device_is_available(parent) &&
> + !(parent->flags & FWNODE_FLAG_NOT_DEVICE) &&
> + !(parent->flags & FWNODE_FLAG_PARENT_IS_DEV)) {
> + ret = __fwnode_link_add(con, parent, flags);
> + }
> +
> + if (!loop && !ret) {
> + fwnode_handle_put(parent);
> + return 0;
> + }
> + }
> +
> + return ret;
> +}
> +
> int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
> {
> int ret;
>
> mutex_lock(&fwnode_link_lock);
> - ret = __fwnode_link_add(con, sup, 0);
> +
> + if ((sup->flags & FWNODE_FLAG_NOT_DEVICE) &&
> + (sup->flags & FWNODE_FLAG_PARENT_IS_DEV))
> + ret = __fwnode_link_add_parents(con, sup, 0, false);
> + else
> + ret = __fwnode_link_add(con, sup, 0);
> +
> mutex_unlock(&fwnode_link_lock);
> +
> return ret;
> }
>
> @@ -218,7 +250,8 @@ static void __fwnode_links_move_consumers(struct fwnode_handle *from,
> * MANAGED device links to this device, so leave @fwnode and its descendant's
> * fwnode links alone.
> *
> - * Otherwise, move its consumers to the new supplier @new_sup.
> + * Otherwise, flag descendants of @fwnode as having a parent fwnode for a device
> + * that has probed and move bad fwnode consumer links from @fwnode to @new_sup.
> */
> static void __fw_devlink_pickup_dangling_consumers(struct fwnode_handle *fwnode,
> struct fwnode_handle *new_sup)
> @@ -228,8 +261,11 @@ static void __fw_devlink_pickup_dangling_consumers(struct fwnode_handle *fwnode,
> if (fwnode->dev && fwnode->dev->driver)
> return;
>
> - fwnode->flags |= FWNODE_FLAG_NOT_DEVICE;
> - __fwnode_links_move_consumers(fwnode, new_sup);
> + if (fwnode->flags & FWNODE_FLAG_NOT_DEVICE)
> + __fwnode_links_move_consumers(fwnode, new_sup);
> +
> + fwnode->flags |= FWNODE_FLAG_PARENT_IS_DEV;
> + new_sup->flags &= ~(FWNODE_FLAG_PARENT_IS_DEV);
>
> fwnode_for_each_available_child_node(fwnode, child)
> __fw_devlink_pickup_dangling_consumers(child, new_sup);
> @@ -2071,8 +2107,9 @@ static int fw_devlink_create_devlink(struct device *con,
> device_links_write_unlock();
> }
>
> - if (sup_handle->flags & FWNODE_FLAG_NOT_DEVICE)
> - sup_dev = fwnode_get_next_parent_dev(sup_handle);
> + if ((sup_handle->flags & FWNODE_FLAG_NOT_DEVICE) &&
> + !(sup_handle->flags & FWNODE_FLAG_PARENT_IS_DEV))
> + return -EINVAL;
> else
> sup_dev = get_dev_from_fwnode(sup_handle);
>
> diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h
> index 2a72f55d26eb..3ed8ba503062 100644
> --- a/include/linux/fwnode.h
> +++ b/include/linux/fwnode.h
> @@ -30,6 +30,9 @@ struct device;
> * BEST_EFFORT: The fwnode/device needs to probe early and might be missing some
> * suppliers. Only enforce ordering with suppliers that have
> * drivers.
> + * PARENT_IS_DEV: The fwnode is a child of a fwnode that is or will be populated as a
> + * struct device, which is more suitable for device links
> + * than the fwnode child which may never have a struct device.
> */
> #define FWNODE_FLAG_LINKS_ADDED BIT(0)
> #define FWNODE_FLAG_NOT_DEVICE BIT(1)
> @@ -37,6 +40,7 @@ struct device;
> #define FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD BIT(3)
> #define FWNODE_FLAG_BEST_EFFORT BIT(4)
> #define FWNODE_FLAG_VISITED BIT(5)
> +#define FWNODE_FLAG_PARENT_IS_DEV BIT(6)
>
> struct fwnode_handle {
> struct fwnode_handle *secondary;
> --
> 2.30.2
>
>

2024-02-05 21:41:23

by Saravana Kannan

[permalink] [raw]
Subject: Re: [PATCH v1 0/4] fw_devlink: generically handle bad links to child fwnodes

On Mon, Jan 22, 2024 at 5:46 PM Michael Pratt <[email protected]> wrote:
>
> Hi all,
>
> First off, if you are CC'ed here, it's likely because you were
> involved in or CC'ed in a patch series by Saravana Kannan for fw_devlink,
> and for consistency, I'm kindly asking for review from you all as well.
> If you think this may not affect your use cases, feel free to ignore this.
> I'm also CC'ing Christian and Rafal from the Openwrt project.
>
> This series is following up on some excellent work from Saravana [1]
> which is another patch series that includes some changes
> that helped make it possible for fw_devlink to work with MTD partitions
> that are a supplier fwnode by being a NVMEM provider. For an MTD partition
> to become an NVMEM provider, it must be registered as a platform device
> using of_platform_populate() or similar function
> which was done in a previous commit [2]
> but this also resulted in fw_devlink to apply
> to those fwnodes and their child fwnodes.
>
> That regression caused consumer devices to defer indefinitely
> while waiting for probing that will never happen or for device links
> to form from fwnode links with fwnodes that are not associated
> with any real device or driver that probes
> (e.g. describing the location of a MAC address).
>
> Saravana's patch series helped in two ways:
> First, by moving consumers from a child fwnode (in this case,
> the "nvmem-cells" compatible node) to an ancestor fwnode
> that represents the actual supplier device when that device probes,
> which handles the case where
> the supplier device has a probe attempt first. [3] [4]
> And secondly, by specifically marking "nvmem-cells" compatible nodes
> as populated during MTD partition parsing which also occurs during
> the supplier device probe, which handles both cases of initial probe order,
> but only for that node type. [5]
>
> However, there's some drawbacks to the second solution

Oh, somehow missed this thread entirely until it saw some activity today.

> from having to manually name which nodes need the workaround
> for the corner case with the compatible string.
> Most notably, that it's possible for this corner case
> to apply to other nodes types, but also the fact that initial probe order
> deciding whether or not everything probes in the intended order, if at all,
> through driver core alone is still an issue with fw_devlink,
> despite the fact that controlling probe order in driver core
> is one of it's goals. In other words, the real problem is in driver core,
> but the fix is in the mtd driver.

It's been a while since I looked at MTD code, but the real problem is
not in driver core, but how it's used incorrectly by the MTD/nvmem
frameworks. Adding devices to a bus that'll never probe is wrong. I
think there's also a case of two devices being created off the same DT
node. While not technically wrong, it's weird when one of them never
probes.

I'll take a closer look and respond to this series later. Hopefully
this week, but if not, then next week.

As I said in the other patch, I don't like the series in the current
form because it's changing APIs in not so great ways.

fwnode_link_add() is supposed to be super dumb and simple. It's
literally there just to avoid reparsing DT nodes multiple times.
Nothing more ever. It just denotes the "pointers" between fwnode or DT
nodes.

The "smarts" should either be where fwnode links are converted into
device links (and not have to fix them up) or where nvmem-cells is
being parsed and converted to fwnode links.

As I said in the other patch, let me take a closer look this week or
next and get back. These things needs several hours of uninterrupted
time for me to debug and unwind.

-Saravana

>
> Unfortunately, with the Openwrt project
> we are experiencing this regression again
> by implementing the new NVMEM layout "fixed-layout"
> after it was accepted into the kernel. [6]
> This causes some subsystems of an SOC, like
> ethernet or wireless or both depending on hardware and DTS,
> and in some cases a completely different function like USB
> to never probe once again, and the temporary fix, like before,
> is by disabling fw_devlink. [7]
>
> Below is a simplified DTS of an Atheros device with the NVMEM layout:
>
>
> &eth0 {
> nvmem-cells = <&macaddr_art_0>;
> nvmem-cell-names = "mac-address";
> };
>
> &wifi {
> nvmem-cells = <&caldata_art_1000>;
> nvmem-cell-names = "calibration";
> };
>
> &spi {
> status = "okay";
>
> flash@0 {
> compatible = "jedec,spi-nor";
>
> partitions {
> compatible = "fixed-partitions";
>
> partition@ff0000 {
> label = "art";
> reg = <0xff0000 0x010000>;
> read-only;
>
> nvmem-layout {
> compatible = "fixed-layout";
>
> macaddr_art_0: macaddr@0 {
> reg = <0x0 0x6>;
> };
>
> caldata_art_1000: caldata@1000 {
> reg = <0x1000 0x440>;
> };
> };
> };
> };
> };
> };
>
>
> When the DTS is written this way, not only is there a different node
> involved for NVMEM, but this node is a new node that is yet another
> descendant in the tree. In other words, the "partition@ff0000" node
> used to be what designated this device as the NVMEM provider
> with the "nvmem-cells" compatible, so the node that represents
> the actual probing device is now 4 parent nodes up in the tree
> in this example instead of 3 parent nodes up in the tree as before.
>
> For the past year, and even before the "fixed-layout" issue came up,
> I had been experimenting with a way to handle these reverse probe order
> and linking of distant descendant node issues in a generic way instead of
> naming exceptions with the compatible string, and this series is the
> culmination of those efforts. It may look strange at first,
> but there are a myriad set of cases to handle and other reasons
> that led me to develop this approach of using existing bad device links
> in order to find the correct fwnode to link to, and then redo
> the relevant links for that consumer device altogether.
> I'm concerned that doing this another way would be a much more massive
> project that would basically rewrite what the fw_devlink feature does.
> Or perhaps there would have to be a new, third form of device links
> that would be a "placeholder" before it becomes a fwnode link.
> Eventually, I came to the conclusion that
> there simply is not enough information to form the correct fwnode link
> before the real supplier device has a successful probe.
>
> Some of the changes proposed here are made on the extreme side of caution,
> for example, checking for null dereference when it might not be necessary,
> and reducing the activity of some functions in order to reduce
> the amount of assumptions taking place in the middle of driver core
> in cases where the new functions proposed here handles that just as well
> and closer to a possible probe defer event
> (e.g. not declaring a fwnode as NOT_DEVICE before
> a probe attempt is expected to have happened).
>
> I have tried to make the commit messages as self-explanatory as I can,
> but they might have ended up a little too verbose, and therefore confusing
> but I'm ready to explain whatever has not been explained well already.
> Lastly, this is my first attempt at sending a larger change to the kernel,
> so I appreciate your time and patience very much.
>
> MCP
>
>
> [1] https://lore.kernel.org/lkml/[email protected]/
>
> [2] bcdf0315a61a29eb753a607d3a85a4032de72d94
>
> [3] 3a2dbc510c437ca392516b0105bad8e7970e6614
>
> [4] 411c0d58ca6faa9bc4b9f5382118a31c7bb92a6f
>
> [5] fb42378dcc7f247df56f0ecddfdae85487495fbc
>
> [6] 27f699e578b1a72df89dfa3bc42e093a01dc8d10
>
> [7] https://github.com/openwrt/openwrt/commit/703d667a0cdf6dfa402c08e72d0c77a257ca5009
>
>
> Michael Pratt (4):
> driver core: fw_devlink: Use driver to determine probe ability
> driver core: fw_devlink: Link to supplier ancestor if no device
> driver core: fw_devlink: Add function device_links_fixup_suppliers()
> mtd: mtdpart: Allow fwnode links to NVMEM compatible fwnodes
>
> drivers/base/base.h | 1 +
> drivers/base/core.c | 144 ++++++++++++++++++++++++++++++++++++++---
> drivers/base/dd.c | 2 +
> drivers/mtd/mtdpart.c | 10 ---
> include/linux/fwnode.h | 4 ++
> 5 files changed, 143 insertions(+), 18 deletions(-)
>
>
> base-commit: b0d326da462e20285236e11e4cbc32085de9f363
> --
> 2.30.2
>
>

2024-02-14 03:18:52

by Saravana Kannan

[permalink] [raw]
Subject: Re: [PATCH v1 1/4] driver core: fw_devlink: Use driver to determine probe ability

Hi Michael,

Thanks for reporting this and being willing to work on a fix.

On Mon, Jan 22, 2024 at 5:46 PM Michael Pratt <[email protected]> wrote:
>
> The function __fw_devlink_pickup_dangling_consumers()
> intends to ignore suppliers that are already capable of probing,

fw_devlink isn't trying to figure out if a fwnode is "already capable"
of probing. It's trying to figure out if a fwnode will NEVER probe. If
it's just looking at "right now" or "already capable", it becomes
kinda useless.

> but uses whether or not a bus struct is defined in the device struct.

Because if you don't need a class of devices to probe, you add them to
a "class" not a "bus".

> There are some cases where a firmware child node
> can be address translatable but not able to probe
> (e.g. the use of of_platform_populate() for MTD partitions),
> so whether or not a driver is present is a more accurate way
> to guess whether a fwnode represents a real probing device here.

No, checking for the driver is not a "more accurate way" for the
reasons mentioned above.

> This also serves as a preparation step for further changes
> to fw_devlink including making the contents of this function
> less strict in order to compensate for more cases being passed into
> the rest of the function because the return case is now more strict.

This change itself is a definite Nack, but I'll look at your other
patches to see what you are trying to do.

> "Hey! Who's driving the bus?"

The driver isn't here yet. He'll be here in a while. But at least this
is a mode of transportation and not a football stadium :)

See more below.

> Signed-off-by: Michael Pratt <[email protected]>
> ---
> drivers/base/core.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 14d46af40f9a..c05a5f6b0641 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -214,7 +214,7 @@ static void __fwnode_links_move_consumers(struct fwnode_handle *from,
> * @new_sup: fwnode of new supplier
> *
> * If the @fwnode has a corresponding struct device and the device supports
> - * probing (that is, added to a bus), then we want to let fw_devlink create
> + * probing (that is, bound to a driver), then we want to let fw_devlink create
> * MANAGED device links to this device, so leave @fwnode and its descendant's
> * fwnode links alone.
> *
> @@ -225,7 +225,7 @@ static void __fw_devlink_pickup_dangling_consumers(struct fwnode_handle *fwnode,
> {
> struct fwnode_handle *child;
>
> - if (fwnode->dev && fwnode->dev->bus)
> + if (fwnode->dev && fwnode->dev->driver)

This will completely break fw_devlink when modules are enabled. Which
is where fw_devlink is also very much needed. And if modules are
loaded using udev events, this is guaranteed to break for those cases.
Also, the driver gets set AFTER a device is probed. Not before. So, I
think you are just deleting/incorrectly moving a whole bunch of device
links that would have been created.

A first level sanity test for any fw_devlink change is to take a
sufficiently complicated board/system and then compare the output of
this command before and after your changes:
ls -1 /sys/class/devlink

The diff you see should be exactly what you expect/want to happen. If
there are other unexpected diffs it's generally a bug. I've caught so
many bugs in my changes (before I send them) this way.

Also, if a device is never supposed to probe, it should not be added
to a bus anyway. That's what a "class" is for. It's for a class of
devices. Adding a device to a bus and then never probing it is such a
waste. And this device (at least for nvmem-cells) is never even
referenced -- which is an even bigger waste of memory.

I'd really prefer if someone with nvmem cells experience (hint hint
Michael hint hint :) ) can clean up the framework to not create
devices unnecessarily or at least make it a device that's added to a
class instead of a bus.

-Saravana

2024-02-14 03:20:44

by Saravana Kannan

[permalink] [raw]
Subject: Re: [PATCH v1 2/4] driver core: fw_devlink: Link to supplier ancestor if no device

On Mon, Feb 5, 2024 at 11:25 AM Saravana Kannan <[email protected]> wrote:
>
> On Mon, Jan 22, 2024 at 5:46 PM Michael Pratt <[email protected]> wrote:
> >
> > Driver core currently supports linking to the next parent fwnode,
> > but is not yet handling cases where that parent
> > is also a firmware child node not representing a real device,

Can you tell me why you say this? Because as far as I can tell, I did
take into account climbing all the way up to find the actual struct
device that encompasses a fwnode and then create a device link to that
struct device.

> > which can lead to an indefinite deferred probe in some cases.
> > In this case, the fwnode that should actually be linked to
> > is multiple ancestors up which presents a challenge where
> > it is unknown how many ancestors up the node that
> > represents the real probing device is.

The goal is to find the device. Not "the probing device". That device
IS the one providing the functionality of the fwnode. If you never use
that device, don't create it? If you never probe the device, then use
a class instead of a bus?

> > This makes the usage of
> > fwnode_get_next_parent_dev() insufficient because the real device's
> > fwnode may or may not be an ancestor of the next parent fwnode as well.

It is doing what it's supposed to do. Finding the parent device. Not
the next probing parent device. This becomes even more relevant when
we start talking about suspend/resume. Too much to explain about
suspend/resume and it's not relevant to your case. So I'll leave that
as an exercise to the reader.

> > Introduce flag FWNODE_FLAG_PARENT_IS_DEV
> > in order to mark child firmware nodes of a device
> > as having a parent device that can probe.
> >
> > Allow fwnode link creation to the original supplier fwnode's ancestors
> > when the original supplier fwnode and any fwnodes in between are flagged
> > as FWNODE_FLAG_NOT_DEVICE and/or FWNODE_FLAG_PARENT_IS_DEV
> > with a new function __fwnode_link_add_parents() which then creates
> > the fwnode link to a real device that provides the supplier's function.
> >
> > This depends on other functions to label a supplier fwnode
> > as not a real device, which must be done before the fwnode links
> > are created, and if after that, relevant links to the supplier
> > would have to be deleted and have links recreated, otherwise,
> > the fwnode link would be dropped before the device link is attempted
> > or a fwnode link would not be able to become a device link at all,
> > because they were created before these fwnode flags can have any effect.
> >
> > It also depends on the supplier device to actually probe first
> > in order to have the fwnode flags in place to know for certain
> > which fwnodes are non-probing child nodes
> > of the fwnode for the supplier device.
> >
> > The use case of function __fw_devlink_pickup_dangling_consumers()
> > is designed so that the parameters are always a supplier fwnode
> > and one of it's parent fwnodes, so it is safer to assume and more specific
> > that the flag PARENT_IS_DEV should be added there, rather than
> > declaring the original supplier fwnode as NOT_DEVICE at that point.
> > Because this function is called when the real supplier device probes
> > and recursively calls itself for all child nodes of the device's fwnode,
> > set the new flag here in order to let it propagate down
> > to all descendant nodes, thereby providing the info needed later
> > in order to link to the proper fwnode representing the supplier device.
> >
> > If a fwnode is flagged as FWNODE_FLAG_NOT_DEVICE
> > by the time a device link is to be made with it,
> > but not flagged as FWNODE_FLAG_PARENT_IS_DEV,
> > the link is dropped, otherwise the device link
> > is still made with the original supplier fwnode.
> > Theoretically, we can also handle linking to an ancestor
> > of the supplier fwnode when forming device links, but there
> > are still cases where the necessary fwnode flags are still missing
> > because the real supplier device did not probe yet.

It's still not clear why you need FWNODE_FLAG_PARENT_IS_DEV, but
considering patch 1 is a definite no-go, I'll stop here.

-Saravana

> >
> > Signed-off-by: Michael Pratt <[email protected]>
>
> NACK for a bunch of reasons.
>
> 1. This logic should not be in fwnode_link_add(). That's supposed to
> do the exact linking that the firmware meant. This logic should be
> where the fwnode links are converted to device links or deferred
> probing decision is done.
>
> 2. If we handle one parent above correctly, it should be easy to
> handle multiple parents above too. So why not fix it where we handle
> multiple parents above?
>
> Btw, I'm happy to fix this or help you fix this once I understand the
> issue. Just wanted to give a quick NACK to avoid people spending
> cycles on this patch in its current state.
>
> -Saravana
>
>
> > ---
> > drivers/base/core.c | 49 ++++++++++++++++++++++++++++++++++++------
> > include/linux/fwnode.h | 4 ++++
> > 2 files changed, 47 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > index c05a5f6b0641..7f2652cf5edc 100644
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -92,13 +92,45 @@ static int __fwnode_link_add(struct fwnode_handle *con,
> > return 0;
> > }
> >
> > +static int __fwnode_link_add_parents(struct fwnode_handle *con,
> > + struct fwnode_handle *sup,
> > + u8 flags, bool loop)
> > +{
> > + int ret = -EINVAL;
> > + struct fwnode_handle *parent;
> > +
> > + fwnode_for_each_parent_node(sup, parent) {
> > + /* Ignore the root node */
> > + if (fwnode_count_parents(parent) &&
> > + fwnode_device_is_available(parent) &&
> > + !(parent->flags & FWNODE_FLAG_NOT_DEVICE) &&
> > + !(parent->flags & FWNODE_FLAG_PARENT_IS_DEV)) {
> > + ret = __fwnode_link_add(con, parent, flags);
> > + }
> > +
> > + if (!loop && !ret) {
> > + fwnode_handle_put(parent);
> > + return 0;
> > + }
> > + }
> > +
> > + return ret;
> > +}
> > +
> > int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
> > {
> > int ret;
> >
> > mutex_lock(&fwnode_link_lock);
> > - ret = __fwnode_link_add(con, sup, 0);
> > +
> > + if ((sup->flags & FWNODE_FLAG_NOT_DEVICE) &&
> > + (sup->flags & FWNODE_FLAG_PARENT_IS_DEV))
> > + ret = __fwnode_link_add_parents(con, sup, 0, false);
> > + else
> > + ret = __fwnode_link_add(con, sup, 0);
> > +
> > mutex_unlock(&fwnode_link_lock);
> > +
> > return ret;
> > }
> >
> > @@ -218,7 +250,8 @@ static void __fwnode_links_move_consumers(struct fwnode_handle *from,
> > * MANAGED device links to this device, so leave @fwnode and its descendant's
> > * fwnode links alone.
> > *
> > - * Otherwise, move its consumers to the new supplier @new_sup.
> > + * Otherwise, flag descendants of @fwnode as having a parent fwnode for a device
> > + * that has probed and move bad fwnode consumer links from @fwnode to @new_sup.
> > */
> > static void __fw_devlink_pickup_dangling_consumers(struct fwnode_handle *fwnode,
> > struct fwnode_handle *new_sup)
> > @@ -228,8 +261,11 @@ static void __fw_devlink_pickup_dangling_consumers(struct fwnode_handle *fwnode,
> > if (fwnode->dev && fwnode->dev->driver)
> > return;
> >
> > - fwnode->flags |= FWNODE_FLAG_NOT_DEVICE;
> > - __fwnode_links_move_consumers(fwnode, new_sup);
> > + if (fwnode->flags & FWNODE_FLAG_NOT_DEVICE)
> > + __fwnode_links_move_consumers(fwnode, new_sup);
> > +
> > + fwnode->flags |= FWNODE_FLAG_PARENT_IS_DEV;
> > + new_sup->flags &= ~(FWNODE_FLAG_PARENT_IS_DEV);
> >
> > fwnode_for_each_available_child_node(fwnode, child)
> > __fw_devlink_pickup_dangling_consumers(child, new_sup);
> > @@ -2071,8 +2107,9 @@ static int fw_devlink_create_devlink(struct device *con,
> > device_links_write_unlock();
> > }
> >
> > - if (sup_handle->flags & FWNODE_FLAG_NOT_DEVICE)
> > - sup_dev = fwnode_get_next_parent_dev(sup_handle);
> > + if ((sup_handle->flags & FWNODE_FLAG_NOT_DEVICE) &&
> > + !(sup_handle->flags & FWNODE_FLAG_PARENT_IS_DEV))
> > + return -EINVAL;
> > else
> > sup_dev = get_dev_from_fwnode(sup_handle);
> >
> > diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h
> > index 2a72f55d26eb..3ed8ba503062 100644
> > --- a/include/linux/fwnode.h
> > +++ b/include/linux/fwnode.h
> > @@ -30,6 +30,9 @@ struct device;
> > * BEST_EFFORT: The fwnode/device needs to probe early and might be missing some
> > * suppliers. Only enforce ordering with suppliers that have
> > * drivers.
> > + * PARENT_IS_DEV: The fwnode is a child of a fwnode that is or will be populated as a
> > + * struct device, which is more suitable for device links
> > + * than the fwnode child which may never have a struct device.
> > */
> > #define FWNODE_FLAG_LINKS_ADDED BIT(0)
> > #define FWNODE_FLAG_NOT_DEVICE BIT(1)
> > @@ -37,6 +40,7 @@ struct device;
> > #define FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD BIT(3)
> > #define FWNODE_FLAG_BEST_EFFORT BIT(4)
> > #define FWNODE_FLAG_VISITED BIT(5)
> > +#define FWNODE_FLAG_PARENT_IS_DEV BIT(6)
> >
> > struct fwnode_handle {
> > struct fwnode_handle *secondary;
> > --
> > 2.30.2
> >
> >

2024-02-14 03:29:28

by Saravana Kannan

[permalink] [raw]
Subject: Re: [PATCH v1 0/4] fw_devlink: generically handle bad links to child fwnodes

On Mon, Feb 5, 2024 at 12:06 PM Saravana Kannan <[email protected]> wrote:
>
> On Mon, Jan 22, 2024 at 5:46 PM Michael Pratt <[email protected]> wrote:
> >
> > Hi all,
> >
> > First off, if you are CC'ed here, it's likely because you were
> > involved in or CC'ed in a patch series by Saravana Kannan for fw_devlink,
> > and for consistency, I'm kindly asking for review from you all as well.
> > If you think this may not affect your use cases, feel free to ignore this.
> > I'm also CC'ing Christian and Rafal from the Openwrt project.
> >
> > This series is following up on some excellent work from Saravana [1]
> > which is another patch series that includes some changes
> > that helped make it possible for fw_devlink to work with MTD partitions
> > that are a supplier fwnode by being a NVMEM provider. For an MTD partition
> > to become an NVMEM provider, it must be registered as a platform device
> > using of_platform_populate() or similar function
> > which was done in a previous commit [2]
> > but this also resulted in fw_devlink to apply
> > to those fwnodes and their child fwnodes.
> >
> > That regression caused consumer devices to defer indefinitely
> > while waiting for probing that will never happen or for device links
> > to form from fwnode links with fwnodes that are not associated
> > with any real device or driver that probes
> > (e.g. describing the location of a MAC address).
> >
> > Saravana's patch series helped in two ways:
> > First, by moving consumers from a child fwnode (in this case,
> > the "nvmem-cells" compatible node) to an ancestor fwnode
> > that represents the actual supplier device when that device probes,
> > which handles the case where
> > the supplier device has a probe attempt first. [3] [4]
> > And secondly, by specifically marking "nvmem-cells" compatible nodes
> > as populated during MTD partition parsing which also occurs during
> > the supplier device probe, which handles both cases of initial probe order,
> > but only for that node type. [5]
> >
> > However, there's some drawbacks to the second solution
>
> Oh, somehow missed this thread entirely until it saw some activity today.
>
> > from having to manually name which nodes need the workaround
> > for the corner case with the compatible string.
> > Most notably, that it's possible for this corner case
> > to apply to other nodes types, but also the fact that initial probe order
> > deciding whether or not everything probes in the intended order, if at all,
> > through driver core alone is still an issue with fw_devlink,
> > despite the fact that controlling probe order in driver core
> > is one of it's goals. In other words, the real problem is in driver core,
> > but the fix is in the mtd driver.
>
> It's been a while since I looked at MTD code, but the real problem is
> not in driver core, but how it's used incorrectly by the MTD/nvmem
> frameworks. Adding devices to a bus that'll never probe is wrong. I
> think there's also a case of two devices being created off the same DT
> node. While not technically wrong, it's weird when one of them never
> probes.
>
> I'll take a closer look and respond to this series later. Hopefully
> this week, but if not, then next week.
>
> As I said in the other patch, I don't like the series in the current
> form because it's changing APIs in not so great ways.
>
> fwnode_link_add() is supposed to be super dumb and simple. It's
> literally there just to avoid reparsing DT nodes multiple times.
> Nothing more ever. It just denotes the "pointers" between fwnode or DT
> nodes.
>
> The "smarts" should either be where fwnode links are converted into
> device links (and not have to fix them up) or where nvmem-cells is
> being parsed and converted to fwnode links.
>
> As I said in the other patch, let me take a closer look this week or
> next and get back. These things needs several hours of uninterrupted
> time for me to debug and unwind.

As promised, I took a closer look and left some comments in Patch 1
and 2. Patch 1 is 100% broken/wrong so the series will not be
accepted.

Just for the sake of understanding, can you send a patch that'll add
these additional compatible strings similar to how I handled
nvmem-cells and show my how much worse it is than this series?

>
> >
> > Unfortunately, with the Openwrt project
> > we are experiencing this regression again
> > by implementing the new NVMEM layout "fixed-layout"
> > after it was accepted into the kernel. [6]
> > This causes some subsystems of an SOC, like
> > ethernet or wireless or both depending on hardware and DTS,
> > and in some cases a completely different function like USB
> > to never probe once again, and the temporary fix, like before,
> > is by disabling fw_devlink. [7]
> >
> > Below is a simplified DTS of an Atheros device with the NVMEM layout:
> >
> >
> > &eth0 {
> > nvmem-cells = <&macaddr_art_0>;
> > nvmem-cell-names = "mac-address";
> > };
> >
> > &wifi {
> > nvmem-cells = <&caldata_art_1000>;
> > nvmem-cell-names = "calibration";
> > };
> >
> > &spi {
> > status = "okay";
> >
> > flash@0 {
> > compatible = "jedec,spi-nor";
> >
> > partitions {
> > compatible = "fixed-partitions";
> >
> > partition@ff0000 {
> > label = "art";
> > reg = <0xff0000 0x010000>;
> > read-only;
> >
> > nvmem-layout {
> > compatible = "fixed-layout";
> >
> > macaddr_art_0: macaddr@0 {
> > reg = <0x0 0x6>;
> > };
> >
> > caldata_art_1000: caldata@1000 {
> > reg = <0x1000 0x440>;
> > };
> > };
> > };
> > };
> > };
> > };

In this example, can you walk me through the probe attempts/successes
of the nvmem supplier and it's consumers and at what point does
fw_devlink makes the wrong decision? You also said fw_devlink depends
on the order in which devices probe to work correctly. This is
definitely not the case/intention. So, if you can show me an example
of that, I'll fix that too.

I'm fairly certain this example will end up being a case of the nvmem
framework creating pointless devices or using a "bus" when it needs a
"class". But I don't want to assume.

I'm asking all these question to understand your case better and maybe
suggest a better fix that doesn't break fw_devlink or effectively
disable it.

Thanks,
Saravana

> >
> >
> > When the DTS is written this way, not only is there a different node
> > involved for NVMEM, but this node is a new node that is yet another
> > descendant in the tree. In other words, the "partition@ff0000" node
> > used to be what designated this device as the NVMEM provider
> > with the "nvmem-cells" compatible, so the node that represents
> > the actual probing device is now 4 parent nodes up in the tree
> > in this example instead of 3 parent nodes up in the tree as before.
> >
> > For the past year, and even before the "fixed-layout" issue came up,
> > I had been experimenting with a way to handle these reverse probe order
> > and linking of distant descendant node issues in a generic way instead of
> > naming exceptions with the compatible string, and this series is the
> > culmination of those efforts. It may look strange at first,
> > but there are a myriad set of cases to handle and other reasons
> > that led me to develop this approach of using existing bad device links
> > in order to find the correct fwnode to link to, and then redo
> > the relevant links for that consumer device altogether.
> > I'm concerned that doing this another way would be a much more massive
> > project that would basically rewrite what the fw_devlink feature does.
> > Or perhaps there would have to be a new, third form of device links
> > that would be a "placeholder" before it becomes a fwnode link.
> > Eventually, I came to the conclusion that
> > there simply is not enough information to form the correct fwnode link
> > before the real supplier device has a successful probe.
> >
> > Some of the changes proposed here are made on the extreme side of caution,
> > for example, checking for null dereference when it might not be necessary,
> > and reducing the activity of some functions in order to reduce
> > the amount of assumptions taking place in the middle of driver core
> > in cases where the new functions proposed here handles that just as well
> > and closer to a possible probe defer event
> > (e.g. not declaring a fwnode as NOT_DEVICE before
> > a probe attempt is expected to have happened).
> >
> > I have tried to make the commit messages as self-explanatory as I can,
> > but they might have ended up a little too verbose, and therefore confusing
> > but I'm ready to explain whatever has not been explained well already.
> > Lastly, this is my first attempt at sending a larger change to the kernel,
> > so I appreciate your time and patience very much.
> >
> > MCP
> >
> >
> > [1] https://lore.kernel.org/lkml/[email protected]/
> >
> > [2] bcdf0315a61a29eb753a607d3a85a4032de72d94
> >
> > [3] 3a2dbc510c437ca392516b0105bad8e7970e6614
> >
> > [4] 411c0d58ca6faa9bc4b9f5382118a31c7bb92a6f
> >
> > [5] fb42378dcc7f247df56f0ecddfdae85487495fbc
> >
> > [6] 27f699e578b1a72df89dfa3bc42e093a01dc8d10
> >
> > [7] https://github.com/openwrt/openwrt/commit/703d667a0cdf6dfa402c08e72d0c77a257ca5009
> >
> >
> > Michael Pratt (4):
> > driver core: fw_devlink: Use driver to determine probe ability
> > driver core: fw_devlink: Link to supplier ancestor if no device
> > driver core: fw_devlink: Add function device_links_fixup_suppliers()
> > mtd: mtdpart: Allow fwnode links to NVMEM compatible fwnodes
> >
> > drivers/base/base.h | 1 +
> > drivers/base/core.c | 144 ++++++++++++++++++++++++++++++++++++++---
> > drivers/base/dd.c | 2 +
> > drivers/mtd/mtdpart.c | 10 ---
> > include/linux/fwnode.h | 4 ++
> > 5 files changed, 143 insertions(+), 18 deletions(-)
> >
> >
> > base-commit: b0d326da462e20285236e11e4cbc32085de9f363
> > --
> > 2.30.2
> >
> >

2024-02-25 20:16:49

by Michael Pratt

[permalink] [raw]
Subject: Re: [PATCH v1 2/4] driver core: fw_devlink: Link to supplier ancestor if no device

Hi Miquel,

Thanks for taking a look at what I have so far.

On Monday, February 5th, 2024 at 10:00, Miquel Raynal <[email protected]> wrote:

>
>
> Hi Michael,
>
> [email protected] wrote on Tue, 23 Jan 2024 01:46:40 +0000:
>
> > If a fwnode is flagged as FWNODE_FLAG_NOT_DEVICE
> > by the time a device link is to be made with it,
> > but not flagged as FWNODE_FLAG_PARENT_IS_DEV,
> > the link is dropped, otherwise the device link
> > is still made with the original supplier fwnode.
> > Theoretically, we can also handle linking to an ancestor
> > of the supplier fwnode when forming device links, but there
> > are still cases where the necessary fwnode flags are still missing
> > because the real supplier device did not probe yet.
>
>
> I am not sure I follow this. In the following case, I would expect any
> dependency towards node-c to be made against node-a. But the above
> paragraph seems to tell otherwise: that the link would be dropped
> (and thus, not enforced) because recursively searching for a parent
> that would be a device could be endless? It feels wrong, so I probably
> mis
>
> node-a {
> # IS DEV
> node-b {
> # PARENT IS DEV
> node-c {
> # PARENT IS DEV
> };
> };
>};
>
> Besides that, the commit feels like a good idea.


The link is dropped _in order to_ make the dependency towards node-c
become a dependency towards node-a instead.

The "recursive searching" happens after links are dropped in order to
create the new fwnode links, and it depends on the new flag
being placed when the supplier device (node-a) probes, which can happen
before the links are re-attempted, or after a single probe defer of the consumer.

I placed all the logic that decides whether to drop links and retry linking
to the consumer immediately before a probe defer of the consumer would occur.
That logic could be distributed around multiple functions for fw_devlink,
but I'm concerned about false positives. The only reason I didn't use or make
a new function in order to "move" the links is that in this position in the driver core
which I believe is the right place to do the fixup function, we don't have direct access
to the fwnode that the links should go to, it would have to be discovered by recursively
walking up the tree looking for the flag in the new fixup function
instead of where the fwnode links are made.

To me, it doesn't matter which order we call the functions,
but if we are starting with fwnode links that refuse
to be converted to device links, we need to do more than just move the fwnode links
because after a probe defer there is no hook to automatically try switching them
to device links again. Driver core expects that to have already happened by then.
I imagine that without having to add a lot more code in a lot of places,
I would have to call fw_devlink_link_device() after "moving" links anyway...

It's possible to call that function only when the bad link is still a fwnode_link
and do a "move" function when the bad link is a device_link, that is,
if "moving" a finished device_link is possible or good practice at all
since it would be skipping quite a few checks that occur before a device_link is made.
It seems to me that making a device_link is a multiple-step process, so a new function
to only move the supplier of a device_link might be a big one as well.
I tend to try to reuse as many core functions as I could.

>
> Thanks,
> Miquèl

--
MCP

2024-02-25 20:38:00

by Michael Pratt

[permalink] [raw]
Subject: Re: [PATCH v1 2/4] driver core: fw_devlink: Link to supplier ancestor if no device

Hi Saravana,

Thanks for taking a look at the patch series.
I'm going to write a longer reply to the rest of what you said later.

On Monday, February 5th, 2024 at 14:25, Saravana Kannan <[email protected]> wrote:

>
>
> On Mon, Jan 22, 2024 at 5:46 PM Michael Pratt [email protected] wrote:
>
> >
> > Signed-off-by: Michael Pratt [email protected]
>
>
> NACK for a bunch of reasons.
>
> 1. This logic should not be in fwnode_link_add(). That's supposed to
> do the exact linking that the firmware meant. This logic should be
> where the fwnode links are converted to device links or deferred
> probing decision is done.

Yeah, I can definitely make this change without any problems.

>
> 2. If we handle one parent above correctly, it should be easy to
> handle multiple parents above too. So why not fix it where we handle
> multiple parents above?

I can do this too, but since going one parent up would have to occur
multiple times, it would still end up being a recursive search for the flag.
Without finding the right fwnode right away, the new fixup function here
would have to recursively call itself 3 or 4 times instead of once or twice,
and each call would result in going through the entire loop of
fw_devlink_link_device() for the consumer which would overall take a little more time.


>
> Btw, I'm happy to fix this or help you fix this once I understand the
> issue. Just wanted to give a quick NACK to avoid people spending
> cycles on this patch in its current state.


I totally understand. I am still new to kernel things, so I imagine that
whenever I think that I'm finished with something,
that I'm actually just halfway done with it, even though it took forever
to get to this point, haha.


>
> -Saravana
>
> > ---
> > drivers/base/core.c | 49 ++++++++++++++++++++++++++++++++++++------
> > include/linux/fwnode.h | 4 ++++
> > 2 files changed, 47 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > index c05a5f6b0641..7f2652cf5edc 100644
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -92,13 +92,45 @@ static int __fwnode_link_add(struct fwnode_handle *con,
> > return 0;
> > }
> >
> > +static int __fwnode_link_add_parents(struct fwnode_handle *con,
> > + struct fwnode_handle *sup,
> > + u8 flags, bool loop)
> > +{
> > + int ret = -EINVAL;
> > + struct fwnode_handle parent;
> > +
> > + fwnode_for_each_parent_node(sup, parent) {
> > + / Ignore the root node */
> > + if (fwnode_count_parents(parent) &&
> > + fwnode_device_is_available(parent) &&
> > + !(parent->flags & FWNODE_FLAG_NOT_DEVICE) &&
> > + !(parent->flags & FWNODE_FLAG_PARENT_IS_DEV)) {
> > + ret = __fwnode_link_add(con, parent, flags);
> > + }
> > +
> > + if (!loop && !ret) {
> > + fwnode_handle_put(parent);
> > + return 0;
> > + }
> > + }
> > +
> > + return ret;
> > +}
> > +
> > int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
> > {
> > int ret;
> >
> > mutex_lock(&fwnode_link_lock);
> > - ret = __fwnode_link_add(con, sup, 0);
> > +
> > + if ((sup->flags & FWNODE_FLAG_NOT_DEVICE) &&
> > + (sup->flags & FWNODE_FLAG_PARENT_IS_DEV))
> > + ret = __fwnode_link_add_parents(con, sup, 0, false);
> > + else
> > + ret = __fwnode_link_add(con, sup, 0);
> > +
> > mutex_unlock(&fwnode_link_lock);
> > +
> > return ret;
> > }
> >
> > @@ -218,7 +250,8 @@ static void __fwnode_links_move_consumers(struct fwnode_handle *from,
> > * MANAGED device links to this device, so leave @fwnode and its descendant's
> > * fwnode links alone.
> > *
> > - * Otherwise, move its consumers to the new supplier @new_sup.
> > + * Otherwise, flag descendants of @fwnode as having a parent fwnode for a device
> > + * that has probed and move bad fwnode consumer links from @fwnode to @new_sup.
> > */
> > static void __fw_devlink_pickup_dangling_consumers(struct fwnode_handle *fwnode,
> > struct fwnode_handle *new_sup)
> > @@ -228,8 +261,11 @@ static void __fw_devlink_pickup_dangling_consumers(struct fwnode_handle *fwnode,
> > if (fwnode->dev && fwnode->dev->driver)
> > return;
> >
> > - fwnode->flags |= FWNODE_FLAG_NOT_DEVICE;
> > - __fwnode_links_move_consumers(fwnode, new_sup);
> > + if (fwnode->flags & FWNODE_FLAG_NOT_DEVICE)
> > + __fwnode_links_move_consumers(fwnode, new_sup);
> > +
> > + fwnode->flags |= FWNODE_FLAG_PARENT_IS_DEV;
> > + new_sup->flags &= ~(FWNODE_FLAG_PARENT_IS_DEV);
> >
> > fwnode_for_each_available_child_node(fwnode, child)
> > __fw_devlink_pickup_dangling_consumers(child, new_sup);
> > @@ -2071,8 +2107,9 @@ static int fw_devlink_create_devlink(struct device *con,
> > device_links_write_unlock();
> > }
> >
> > - if (sup_handle->flags & FWNODE_FLAG_NOT_DEVICE)
> > - sup_dev = fwnode_get_next_parent_dev(sup_handle);
> > + if ((sup_handle->flags & FWNODE_FLAG_NOT_DEVICE) &&
> > + !(sup_handle->flags & FWNODE_FLAG_PARENT_IS_DEV))
> > + return -EINVAL;
> > else
> > sup_dev = get_dev_from_fwnode(sup_handle);
> >
> > diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h
> > index 2a72f55d26eb..3ed8ba503062 100644
> > --- a/include/linux/fwnode.h
> > +++ b/include/linux/fwnode.h
> > @@ -30,6 +30,9 @@ struct device;
> > * BEST_EFFORT: The fwnode/device needs to probe early and might be missing some
> > * suppliers. Only enforce ordering with suppliers that have
> > * drivers.
> > + * PARENT_IS_DEV: The fwnode is a child of a fwnode that is or will be populated as a
> > + * struct device, which is more suitable for device links
> > + * than the fwnode child which may never have a struct device.
> > */
> > #define FWNODE_FLAG_LINKS_ADDED BIT(0)
> > #define FWNODE_FLAG_NOT_DEVICE BIT(1)
> > @@ -37,6 +40,7 @@ struct device;
> > #define FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD BIT(3)
> > #define FWNODE_FLAG_BEST_EFFORT BIT(4)
> > #define FWNODE_FLAG_VISITED BIT(5)
> > +#define FWNODE_FLAG_PARENT_IS_DEV BIT(6)
> >
> > struct fwnode_handle {
> > struct fwnode_handle *secondary;
> > --
> > 2.30.2

--
MCP

2024-02-25 17:36:20

by Michael Pratt

[permalink] [raw]
Subject: Re: [PATCH v1 4/4] mtd: mtdpart: Allow fwnode links to NVMEM compatible fwnodes

Hi Miquel,

Sorry for the wait in replying.


On Monday, February 5th, 2024 at 10:11, Miquel Raynal <[email protected]> wrote:

>
> Hi Michael,
>
> [email protected] wrote on Tue, 23 Jan 2024 01:47:21 +0000:
>
> > This reverts commit fb42378dcc7f247df56f0ecddfdae85487495fbc
> > ("mtd: mtdpart: Don't create platform device that'll never probe").
> >
> > That commit is a manual named exception in order to avoid fw_devlink links
> > to an "nvmem-cells" compatible node which is a descendant of the fwnode
> > that represents the real supplier device that probes.
> >
> > The commit does not work for newer cases, like the "fixed-layout"
>
>
> Do you have plans for it? Because it is the modern description that is
> now expected, so I don't feel convinced by all this work (which is
> nevertheless considerable) if fixed-layouts are still broken?
>


Sorry for the misunderstanding, I was referring to the already existing commit
cited in the message when saying "The commit does not work for newer cases.."
which is why reverting it in this patch is part of the series. I'll reword it.

>
>
> Thanks,
> Miquèl