2024-02-29 10:52:29

by Herve Codina

Subject: [PATCH v3 0/2] Synchronize DT overlay removal with devlink removals

Hi,

In the following sequence:
of_platform_depopulate(); /* Remove devices from a DT overlay node */
of_overlay_remove(); /* Remove the DT overlay node itself */

Some warnings are raised by __of_changeset_entry_destroy(), which is
called from of_overlay_remove():
ERROR: memory leak, expected refcount 1 instead of 2 ...

The issue is that, during the devlink removals triggered from
of_platform_depopulate(), jobs are put in a workqueue.
These jobs drop the references to the devices. When a device is no
longer referenced (refcount == 0), it is released and the reference to
its of_node is dropped by a call to of_node_put().
These operations are fully correct except that, because of the
workqueue, they are done asynchronously with respect to function calls.

In the sequence provided, the jobs are run too late, after the call to
__of_changeset_entry_destroy() and so a missing of_node_put() call is
detected by __of_changeset_entry_destroy().

This series fixes this issue by introducing device_link_wait_removal()
to wait for the end of jobs execution (patch 1) and by using this
function to synchronize the overlay removal with the end of jobs
execution (patch 2).
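
For illustration only, the ordering problem can be modelled in plain
userspace C. This is a minimal sketch, not kernel code: every model_*
name is hypothetical, the "workqueue" is a simple array run by hand, and
the starting refcount of 2 is taken from the warning quoted above.

```c
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_node {
	int refcount;
};

typedef void (*model_job_fn)(struct model_node *node);

struct model_job {
	model_job_fn fn;
	struct model_node *node;
};

#define MODEL_MAX_JOBS 8

struct model_wq {
	struct model_job jobs[MODEL_MAX_JOBS];
	size_t head;	/* next job to run */
	size_t tail;	/* next free slot */
};

/* Stand-in for of_node_put(): drop one reference. */
static void model_node_put(struct model_node *node)
{
	node->refcount--;
}

/* Queue a job. Returns 0 on success, -1 if the queue is full. */
static int model_queue(struct model_wq *wq, model_job_fn fn,
		       struct model_node *node)
{
	if (wq->tail == MODEL_MAX_JOBS)
		return -1;
	wq->jobs[wq->tail].fn = fn;
	wq->jobs[wq->tail].node = node;
	wq->tail++;
	return 0;
}

/* Stand-in for the depopulate step: the put is deferred, not done now. */
static void model_depopulate(struct model_wq *wq, struct model_node *node)
{
	(void)model_queue(wq, model_node_put, node);
}

/* Stand-in for the proposed wait: run every pending job to completion. */
static void model_wait_removal(struct model_wq *wq)
{
	while (wq->head < wq->tail) {
		struct model_job *job = &wq->jobs[wq->head++];

		job->fn(job->node);
	}
}

/* Stand-in for the refcount check: returns 1 if a leak would be reported. */
static int model_destroy_reports_leak(const struct model_node *node)
{
	return node->refcount != 1;
}
```

In this model, calling model_destroy_reports_leak() right after
model_depopulate() reports a leak because the deferred put has not run
yet. Calling model_wait_removal() in between makes the check pass.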

Compared to the previous iteration:
https://lore.kernel.org/linux-kernel/[email protected]/
this v3 series:
- adds the missing device.h inclusion

This series handles cases reported by Luca [1] and Nuno [2].
[1]: https://lore.kernel.org/all/20231220181627.341e8789@booty/
[2]: https://lore.kernel.org/all/[email protected]/

Best regards,
Hervé

Changes v2 -> v3
- Patch 1
No changes

- Patch 2
Add missing device.h

Changes v1 -> v2
- Patch 1
Rename the workqueue to 'device_link_wq'
Add 'Fixes' tag and Cc stable

- Patch 2
Add device.h inclusion.
Call device_link_wait_removal() later in the overlay removal
sequence (i.e. in free_overlay_changeset() function).
Drop of_mutex lock while calling device_link_wait_removal().
Add 'Fixes' tag and Cc stable

Herve Codina (2):
driver core: Introduce device_link_wait_removal()
of: overlay: Synchronize of_overlay_remove() with the devlink removals

drivers/base/core.c | 26 +++++++++++++++++++++++---
drivers/of/overlay.c | 10 +++++++++-
include/linux/device.h | 1 +
3 files changed, 33 insertions(+), 4 deletions(-)

--
2.43.0



2024-02-29 10:52:51

by Herve Codina

Subject: [PATCH v3 1/2] driver core: Introduce device_link_wait_removal()

The commit 80dd33cf72d1 ("drivers: base: Fix device link removal")
introduces a workqueue to release the consumer and supplier devices used
in the devlink.
In the queued job, devices are released and, in turn, when all the
references to these devices are dropped, the release function of the
device itself is called.

Nothing provides synchronisation with this workqueue to ensure that all
ongoing release operations are done, so that other operations can be
started safely.

For instance, in the following sequence:
1) of_platform_depopulate()
2) of_overlay_remove()

During step 1, devices are released and related devlinks are removed
(jobs pushed in the workqueue).
During step 2, OF nodes are destroyed but, without any
synchronisation with devlink removal jobs, of_overlay_remove() can raise
warnings related to a missing of_node_put():
ERROR: memory leak, expected refcount 1 instead of 2

Indeed, the missing of_node_put() call is going to be done, too late,
from the workqueue job execution.

Introduce device_link_wait_removal() to offer a way to synchronize
operations waiting for the end of devlink removals (i.e. end of
workqueue jobs).
Also, as a flushing operation is done on the workqueue, the workqueue
used is moved from a system-wide workqueue to a local one.

Fixes: 80dd33cf72d1 ("drivers: base: Fix device link removal")
Cc: [email protected]
Signed-off-by: Herve Codina <[email protected]>
---
drivers/base/core.c | 26 +++++++++++++++++++++++---
include/linux/device.h | 1 +
2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index d5f4e4aac09b..80d9430856a8 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -44,6 +44,7 @@ static bool fw_devlink_is_permissive(void);
 static void __fw_devlink_link_to_consumers(struct device *dev);
 static bool fw_devlink_drv_reg_done;
 static bool fw_devlink_best_effort;
+static struct workqueue_struct *device_link_wq;
 
 /**
  * __fwnode_link_add - Create a link between two fwnode_handles.
@@ -532,12 +533,26 @@ static void devlink_dev_release(struct device *dev)
 	/*
 	 * It may take a while to complete this work because of the SRCU
 	 * synchronization in device_link_release_fn() and if the consumer or
-	 * supplier devices get deleted when it runs, so put it into the "long"
-	 * workqueue.
+	 * supplier devices get deleted when it runs, so put it into the
+	 * dedicated workqueue.
 	 */
-	queue_work(system_long_wq, &link->rm_work);
+	queue_work(device_link_wq, &link->rm_work);
 }
 
+/**
+ * device_link_wait_removal - Wait for ongoing devlink removal jobs to terminate
+ */
+void device_link_wait_removal(void)
+{
+	/*
+	 * devlink removal jobs are queued in the dedicated work queue.
+	 * To be sure that all removal jobs are terminated, ensure that any
+	 * scheduled work has run to completion.
+	 */
+	drain_workqueue(device_link_wq);
+}
+EXPORT_SYMBOL_GPL(device_link_wait_removal);
+
 static struct class devlink_class = {
 	.name = "devlink",
 	.dev_groups = devlink_groups,
@@ -4099,9 +4114,14 @@ int __init devices_init(void)
 	sysfs_dev_char_kobj = kobject_create_and_add("char", dev_kobj);
 	if (!sysfs_dev_char_kobj)
 		goto char_kobj_err;
+	device_link_wq = alloc_workqueue("device_link_wq", 0, 0);
+	if (!device_link_wq)
+		goto wq_err;
 
 	return 0;
 
+ wq_err:
+	kobject_put(sysfs_dev_char_kobj);
  char_kobj_err:
 	kobject_put(sysfs_dev_block_kobj);
  block_kobj_err:
diff --git a/include/linux/device.h b/include/linux/device.h
index 1795121dee9a..d7d8305a72e8 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -1249,6 +1249,7 @@ void device_link_del(struct device_link *link);
 void device_link_remove(void *consumer, struct device *supplier);
 void device_links_supplier_sync_state_pause(void);
 void device_links_supplier_sync_state_resume(void);
+void device_link_wait_removal(void);
 
 /* Create alias, so I can be autoloaded. */
 #define MODULE_ALIAS_CHARDEV(major,minor) \
--
2.43.0


2024-02-29 11:13:31

by Nuno Sá

Subject: Re: [PATCH v3 1/2] driver core: Introduce device_link_wait_removal()

Hi,

Just copy pasting my previous comments :)

On Thu, 2024-02-29 at 11:52 +0100, Herve Codina wrote:
> [...]
> +/**
> + * device_link_wait_removal - Wait for ongoing devlink removal jobs to terminate
> + */
> +void device_link_wait_removal(void)
> +{
> + /*
> + * devlink removal jobs are queued in the dedicated work queue.
> + * To be sure that all removal jobs are terminated, ensure that any
> + * scheduled work has run to completion.
> + */
> + drain_workqueue(device_link_wq);
> +}

I'm still not convinced we can have a recursive call into devlinks removal so I
do think flush_workqueue() is enough. I will defer to Saravana though...

> +EXPORT_SYMBOL_GPL(device_link_wait_removal);
> +
>  static struct class devlink_class = {
>   .name = "devlink",
>   .dev_groups = devlink_groups,
> @@ -4099,9 +4114,14 @@ int __init devices_init(void)
>   sysfs_dev_char_kobj = kobject_create_and_add("char", dev_kobj);
>   if (!sysfs_dev_char_kobj)
>   goto char_kobj_err;
> + device_link_wq = alloc_workqueue("device_link_wq", 0, 0);
> + if (!device_link_wq)
> + goto wq_err;
>

I still think this makes more sense in devlink_class_init() as this is really
device link specific. Moreover, as I said to Saravana, we need to "convince"
Rafael about this as he (in my series) did not agree with erroring out in case
we fail to allocate the queue.

Rafael?

- Nuno Sá



2024-02-29 11:13:56

by Herve Codina

Subject: [PATCH v3 2/2] of: overlay: Synchronize of_overlay_remove() with the devlink removals

In the following sequence:
1) of_platform_depopulate()
2) of_overlay_remove()

During step 1, devices are destroyed and devlinks are removed.
During step 2, OF nodes are destroyed but
__of_changeset_entry_destroy() can raise warnings related to a missing
of_node_put():
ERROR: memory leak, expected refcount 1 instead of 2 ...

Indeed, during the devlink removals performed at step 1, the removal
itself, which releases the device (and the attached of_node), is done by
a job queued in a workqueue and so it is done asynchronously with
respect to function calls.
When the warning is present, of_node_put() will be called, but too
late, from the workqueue job.

In order to be sure that any ongoing devlink removals are done before
the of_node destruction, synchronize the of_overlay_remove() with the
devlink removals.

Fixes: 80dd33cf72d1 ("drivers: base: Fix device link removal")
Cc: [email protected]
Signed-off-by: Herve Codina <[email protected]>
---
drivers/of/overlay.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
index 2ae7e9d24a64..7a010a62b9d8 100644
--- a/drivers/of/overlay.c
+++ b/drivers/of/overlay.c
@@ -8,6 +8,7 @@

#define pr_fmt(fmt) "OF: overlay: " fmt

+#include <linux/device.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/of.h>
@@ -853,6 +854,14 @@ static void free_overlay_changeset(struct overlay_changeset *ovcs)
 {
 	int i;
 
+	/*
+	 * Wait for any ongoing device link removals before removing some of
+	 * nodes. Drop the global lock while waiting
+	 */
+	mutex_unlock(&of_mutex);
+	device_link_wait_removal();
+	mutex_lock(&of_mutex);
+
 	if (ovcs->cset.entries.next)
 		of_changeset_destroy(&ovcs->cset);
 
@@ -862,7 +871,6 @@ static void free_overlay_changeset(struct overlay_changeset *ovcs)
 		ovcs->id = 0;
 	}
 
-
 	for (i = 0; i < ovcs->count; i++) {
 		of_node_put(ovcs->fragments[i].target);
 		of_node_put(ovcs->fragments[i].overlay);
--
2.43.0


2024-02-29 11:15:39

by Nuno Sá

Subject: Re: [PATCH v3 2/2] of: overlay: Synchronize of_overlay_remove() with the devlink removals

On Thu, 2024-02-29 at 11:52 +0100, Herve Codina wrote:
> [...]
> @@ -8,6 +8,7 @@
>  
>  #define pr_fmt(fmt) "OF: overlay: " fmt
>  
> +#include <linux/device.h>

This is clearly up to the DT maintainers to decide but, IMHO, I would very much
prefer to see fwnode.h included in here rather than directly device.h (so yeah,
renaming the function to fwnode_*).

But yeah, I might be biased by my own series :)

>  #include <linux/kernel.h>
>  #include <linux/module.h>
>  #include <linux/of.h>
> @@ -853,6 +854,14 @@ static void free_overlay_changeset(struct overlay_changeset *ovcs)
>  {
>   int i;
>  
> + /*
> + * Wait for any ongoing device link removals before removing some of
> + * nodes. Drop the global lock while waiting
> + */
> + mutex_unlock(&of_mutex);
> + device_link_wait_removal();
> + mutex_lock(&of_mutex);

I'm still not convinced we need to drop the lock. What happens if someone else
grabs the lock while we are in device_link_wait_removal()? Can we guarantee that
we can't screw things badly?

The question is, do you have a system/use case where you can really see the
deadlock happening? Until I see one, I'm very skeptical about this. And if we
have one, I'm not really sure this is also the right solution for it.

- Nuno Sá



2024-02-29 12:57:30

by Rafael J. Wysocki

Subject: Re: [PATCH v3 1/2] driver core: Introduce device_link_wait_removal()

On Thu, Feb 29, 2024 at 12:13 PM Nuno Sá <[email protected]> wrote:
>
> On Thu, 2024-02-29 at 11:52 +0100, Herve Codina wrote:
> > [...]
> > @@ -4099,9 +4114,14 @@ int __init devices_init(void)
> > sysfs_dev_char_kobj = kobject_create_and_add("char", dev_kobj);
> > if (!sysfs_dev_char_kobj)
> > goto char_kobj_err;
> > + device_link_wq = alloc_workqueue("device_link_wq", 0, 0);
> > + if (!device_link_wq)
> > + goto wq_err;
> >
>
> I still think this makes more sense in devlink_class_init() as this is really
> device link specific. Moreover, as I said to Saravana, we need to "convince"
> Rafael about this as he (in my series) did not agree with erroring out in case
> we fail to allocate the queue.
>
> Rafael?

I don't really think it matters in practice, so this is fine with me too.

2024-02-29 13:02:25

by Rafael J. Wysocki

Subject: Re: [PATCH v3 1/2] driver core: Introduce device_link_wait_removal()

On Thu, Feb 29, 2024 at 12:13 PM Nuno Sá <[email protected]> wrote:
>
> On Thu, 2024-02-29 at 11:52 +0100, Herve Codina wrote:
> > [...]
> > +/**
> > + * device_link_wait_removal - Wait for ongoing devlink removal jobs to terminate
> > + */
> > +void device_link_wait_removal(void)
> > +{
> > + /*
> > + * devlink removal jobs are queued in the dedicated work queue.
> > + * To be sure that all removal jobs are terminated, ensure that any
> > + * scheduled work has run to completion.
> > + */
> > + drain_workqueue(device_link_wq);
> > +}
>
> I'm still not convinced we can have a recursive call into devlinks removal so I
> do think flush_workqueue() is enough. I will defer to Saravana though...

AFAICS, the difference between flush_workqueue() and drain_workqueue()
is the handling of the case when a given work item can queue up itself
again. This does not happen here.

2024-02-29 13:03:18

by Nuno Sá

Subject: Re: [PATCH v3 1/2] driver core: Introduce device_link_wait_removal()

On Thu, 2024-02-29 at 14:01 +0100, Rafael J. Wysocki wrote:
> On Thu, Feb 29, 2024 at 12:13 PM Nuno Sá <[email protected]> wrote:
> >
> > On Thu, 2024-02-29 at 11:52 +0100, Herve Codina wrote:
> > > [...]
> > > +/**
> > > + * device_link_wait_removal - Wait for ongoing devlink removal jobs to terminate
> > > + */
> > > +void device_link_wait_removal(void)
> > > +{
> > > +     /*
> > > +      * devlink removal jobs are queued in the dedicated work queue.
> > > +      * To be sure that all removal jobs are terminated, ensure that any
> > > +      * scheduled work has run to completion.
> > > +      */
> > > +     drain_workqueue(device_link_wq);
> > > +}
> >
> > I'm still not convinced we can have a recursive call into devlinks removal
> > so I
> > do think flush_workqueue() is enough. I will defer to Saravana though...
>
> AFAICS, the difference between flush_workqueue() and drain_workqueue()
> is the handling of the case when a given work item can queue up itself
> again.  This does not happen here.


Yeah, that's also my understanding...

2024-02-29 13:11:20

by Rafael J. Wysocki

Subject: Re: [PATCH v3 1/2] driver core: Introduce device_link_wait_removal()

On Thu, Feb 29, 2024 at 2:03 PM Nuno Sá <[email protected]> wrote:
>
> On Thu, 2024-02-29 at 14:01 +0100, Rafael J. Wysocki wrote:
> > On Thu, Feb 29, 2024 at 12:13 PM Nuno Sá <[email protected]> wrote:
> > >
> > > On Thu, 2024-02-29 at 11:52 +0100, Herve Codina wrote:
> > > > [...]
> > > > +/**
> > > > + * device_link_wait_removal - Wait for ongoing devlink removal jobs to terminate
> > > > + */
> > > > +void device_link_wait_removal(void)
> > > > +{
> > > > + /*
> > > > + * devlink removal jobs are queued in the dedicated work queue.
> > > > + * To be sure that all removal jobs are terminated, ensure that any
> > > > + * scheduled work has run to completion.
> > > > + */
> > > > + drain_workqueue(device_link_wq);
> > > > +}
> > >
> > > I'm still not convinced we can have a recursive call into devlinks removal
> > > so I do think flush_workqueue() is enough. I will defer to Saravana though..
> >
> > AFAICS, the difference between flush_workqueue() and drain_workqueue()
> > is the handling of the case when a given work item can queue up itself
> > again. This does not happen here.
>
>
> Yeah, that's also my understanding...

Moreover, IIUC this is called after dropping the last reference to the
device link in question and so after queuing up the link removal work.
Because that work does not requeue itself, flush_workqueue() is
sufficient to ensure that the removal work has been completed.

If anyone thinks that it may not be sufficient, please explain to me
why you think so. Otherwise, don't do stuff to prevent things you
cannot explain.
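
The flush/drain distinction discussed above can be illustrated with a small
userspace model. This is only a sketch, not kernel code: toy_wq, toy_queue(),
toy_flush(), toy_drain() and toy_pending_after() are hypothetical names, and
the model is single-threaded. It assumes that a flush only waits for the items
pending when it is called, while a drain repeats until the queue is empty. A
one-shot job (like a link removal work that never requeues itself) is fully
handled by a single flush; only a self-requeueing job needs a drain.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_WQ_MAX 16

struct toy_wq;
typedef void (*toy_fn)(struct toy_wq *wq, int *arg);

struct toy_item { toy_fn fn; int *arg; };

/* Hypothetical single-threaded stand-in for a workqueue (FIFO ring). */
struct toy_wq {
	struct toy_item items[TOY_WQ_MAX];
	size_t head, tail;	/* head: next item to run, tail: next free slot */
};

static size_t toy_pending(const struct toy_wq *wq)
{
	return wq->tail - wq->head;
}

static void toy_queue(struct toy_wq *wq, toy_fn fn, int *arg)
{
	assert(toy_pending(wq) < TOY_WQ_MAX);
	wq->items[wq->tail % TOY_WQ_MAX] = (struct toy_item){ fn, arg };
	wq->tail++;
}

static void toy_run_one(struct toy_wq *wq)
{
	struct toy_item it = wq->items[wq->head % TOY_WQ_MAX];

	wq->head++;
	it.fn(wq, it.arg);
}

/* Model of a flush: run only the items pending at the time of the call. */
static void toy_flush(struct toy_wq *wq)
{
	size_t n = toy_pending(wq);

	while (n--)
		toy_run_one(wq);
}

/* Model of a drain: keep running until nothing is pending. */
static void toy_drain(struct toy_wq *wq)
{
	while (toy_pending(wq))
		toy_run_one(wq);
}

/* One-shot job: runs once and never requeues itself. */
static void one_shot(struct toy_wq *wq, int *done)
{
	(void)wq;
	(*done)++;
}

/* Self-requeueing job: requeues itself until the counter reaches zero. */
static void chained(struct toy_wq *wq, int *remaining)
{
	if (--(*remaining) > 0)
		toy_queue(wq, chained, remaining);
}

/* Queue one job of the given kind, flush or drain, return what is left. */
size_t toy_pending_after(int self_requeue, int use_drain)
{
	struct toy_wq wq = { .head = 0, .tail = 0 };
	int counter = self_requeue ? 3 : 0;

	toy_queue(&wq, self_requeue ? chained : one_shot, &counter);
	if (use_drain)
		toy_drain(&wq);
	else
		toy_flush(&wq);
	return toy_pending(&wq);
}
```

In this model, toy_pending_after(0, 0) leaves nothing pending, so the flush is
sufficient for the one-shot case. toy_pending_after(1, 0) leaves the requeued
item behind, which is the only case where the drain variant makes a difference.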

2024-03-04 15:06:07

by Rob Herring

Subject: Re: [PATCH v3 0/2] Synchronize DT overlay removal with devlink removals

On Thu, Feb 29, 2024 at 11:52:01AM +0100, Herve Codina wrote:
> Hi,

Please CC Saravana on this.

>
> In the following sequence:
> of_platform_depopulate(); /* Remove devices from a DT overlay node */
> of_overlay_remove(); /* Remove the DT overlay node itself */
>
> Some warnings are raised by __of_changeset_entry_destroy() which was
> called from of_overlay_remove():
> ERROR: memory leak, expected refcount 1 instead of 2 ...
>
> The issue is that, during the device devlink removals triggered from the
> of_platform_depopulate(), jobs are put in a workqueue.
> These jobs drop the reference to the devices. When a device is no more
> referenced (refcount == 0), it is released and the reference to its
> of_node is dropped by a call to of_node_put().
> These operations are fully correct except that, because of the
> workqueue, they are done asynchronously with respect to function calls.
>
> In the sequence provided, the jobs are run too late, after the call to
> __of_changeset_entry_destroy() and so a missing of_node_put() call is
> detected by __of_changeset_entry_destroy().
>
> This series fixes this issue introducing device_link_wait_removal() in
> order to wait for the end of jobs execution (patch 1) and using this
> function to synchronize the overlay removal with the end of jobs
> execution (patch 2).
>
> Compared to the previous iteration:
> https://lore.kernel.org/linux-kernel/[email protected]/
> this v3 series:
> - add the missing device.h
>
> This series handles cases reported by Luca [1] and Nuno [2].
> [1]: https://lore.kernel.org/all/20231220181627.341e8789@booty/
> [2]: https://lore.kernel.org/all/[email protected]/
>
> Best regards,
> Hervé
>
> Changes v2 -> v3
> - Patch 1
> No changes
>
> - Patch 2
> Add missing device.h
>
> Changes v1 -> v2
> - Patch 1
> Rename the workqueue to 'device_link_wq'
> Add 'Fixes' tag and Cc stable
>
> - Patch 2
> Add device.h inclusion.
> Call device_link_wait_removal() later in the overlay removal
> sequence (i.e. in free_overlay_changeset() function).
> Drop of_mutex lock while calling device_link_wait_removal().
> Add 'Fixes' tag and Cc stable
>
> Herve Codina (2):
> driver core: Introduce device_link_wait_removal()
> of: overlay: Synchronize of_overlay_remove() with the devlink removals
>
> drivers/base/core.c | 26 +++++++++++++++++++++++---
> drivers/of/overlay.c | 10 +++++++++-
> include/linux/device.h | 1 +
> 3 files changed, 33 insertions(+), 4 deletions(-)
>
> --
> 2.43.0
>

2024-03-04 15:23:32

by Rob Herring

Subject: Re: [PATCH v3 2/2] of: overlay: Synchronize of_overlay_remove() with the devlink removals

On Thu, Feb 29, 2024 at 12:18:49PM +0100, Nuno Sá wrote:
> On Thu, 2024-02-29 at 11:52 +0100, Herve Codina wrote:
> > In the following sequence:
> > ? 1) of_platform_depopulate()
> > ? 2) of_overlay_remove()
> >
> > During the step 1, devices are destroyed and devlinks are removed.
> > During the step 2, OF nodes are destroyed but
> > __of_changeset_entry_destroy() can raise warnings related to missing
> > of_node_put():
> > >   ERROR: memory leak, expected refcount 1 instead of 2 ...
> >
> > Indeed, during the devlink removals performed at step 1, the removal
> > itself releasing the device (and the attached of_node) is done by a job
> > queued in a workqueue and so, it is done asynchronously with respect to
> > function calls.
> > When the warning is present, of_node_put() will be called but wrongly
> > too late from the workqueue job.
> >
> > In order to be sure that any ongoing devlink removals are done before
> > the of_node destruction, synchronize the of_overlay_remove() with the
> > devlink removals.
> >
> > Fixes: 80dd33cf72d1 ("drivers: base: Fix device link removal")
> > Cc: [email protected]
> > Signed-off-by: Herve Codina <[email protected]>
> > ---
> > >  drivers/of/overlay.c | 10 +++++++++-
> > >  1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
> > index 2ae7e9d24a64..7a010a62b9d8 100644
> > --- a/drivers/of/overlay.c
> > +++ b/drivers/of/overlay.c
> > @@ -8,6 +8,7 @@
> > ?
> > ?#define pr_fmt(fmt) "OF: overlay: " fmt
> > ?
> > +#include <linux/device.h>
>
> This is clearly up to the DT maintainers to decide but, IMHO, I would very much
> prefer to see fwnode.h included in here rather than directly device.h (so yeah,
> renaming the function to fwnode_*).

IMO, the DT code should know almost nothing about fwnode because that's
the layer above it. But then overlay stuff is kind of a layer above the
core DT code too.

> But yeah, I might be biased by own series :)
>
> > >  #include <linux/kernel.h>
> > >  #include <linux/module.h>
> > >  #include <linux/of.h>
> > > @@ -853,6 +854,14 @@ static void free_overlay_changeset(struct overlay_changeset *ovcs)
> > >  {
> > >   int i;
> > >
> > + /*
> > + * Wait for any ongoing device link removals before removing some of
> > + * nodes. Drop the global lock while waiting
> > + */
> > + mutex_unlock(&of_mutex);
> > + device_link_wait_removal();
> > + mutex_lock(&of_mutex);
>
> I'm still not convinced we need to drop the lock. What happens if someone else
> grabs the lock while we are in device_link_wait_removal()? Can we guarantee that
> we can't screw things badly?

It is also just ugly because it's the callers of
free_overlay_changeset() that hold the lock and now we're releasing it
behind their back.

As device_link_wait_removal() is called before we touch anything, can't
it be called before we take the lock? And do we need to call it if
applying the overlay fails?

Rob

2024-03-04 15:34:49

by Nuno Sá

Subject: Re: [PATCH v3 2/2] of: overlay: Synchronize of_overlay_remove() with the devlink removals

On Mon, 2024-03-04 at 09:22 -0600, Rob Herring wrote:
> On Thu, Feb 29, 2024 at 12:18:49PM +0100, Nuno Sá wrote:
> > On Thu, 2024-02-29 at 11:52 +0100, Herve Codina wrote:
> > > In the following sequence:
> > >   1) of_platform_depopulate()
> > >   2) of_overlay_remove()
> > >
> > > During the step 1, devices are destroyed and devlinks are removed.
> > > During the step 2, OF nodes are destroyed but
> > > __of_changeset_entry_destroy() can raise warnings related to missing
> > > of_node_put():
> > >   ERROR: memory leak, expected refcount 1 instead of 2 ...
> > >
> > > Indeed, during the devlink removals performed at step 1, the removal
> > > itself releasing the device (and the attached of_node) is done by a job
> > > queued in a workqueue and so, it is done asynchronously with respect to
> > > function calls.
> > > When the warning is present, of_node_put() will be called but wrongly
> > > too late from the workqueue job.
> > >
> > > In order to be sure that any ongoing devlink removals are done before
> > > the of_node destruction, synchronize the of_overlay_remove() with the
> > > devlink removals.
> > >
> > > Fixes: 80dd33cf72d1 ("drivers: base: Fix device link removal")
> > > Cc: [email protected]
> > > Signed-off-by: Herve Codina <[email protected]>
> > > ---
> > >  drivers/of/overlay.c | 10 +++++++++-
> > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
> > > index 2ae7e9d24a64..7a010a62b9d8 100644
> > > --- a/drivers/of/overlay.c
> > > +++ b/drivers/of/overlay.c
> > > @@ -8,6 +8,7 @@
> > >  
> > >  #define pr_fmt(fmt) "OF: overlay: " fmt
> > >  
> > > +#include <linux/device.h>
> >
> > > This is clearly up to the DT maintainers to decide but, IMHO, I would very much
> > > prefer to see fwnode.h included in here rather than directly device.h (so yeah,
> > > renaming the function to fwnode_*).
>
> IMO, the DT code should know almost nothing about fwnode because that's
> the layer above it. But then overlay stuff is kind of a layer above the
> core DT code too.

Yeah, my reasoning is just that it may be better than knowing about device.h
code... But maybe I'm wrong :)

>
> > But yeah, I might be biased by own series :)
> >
> > >  #include <linux/kernel.h>
> > >  #include <linux/module.h>
> > >  #include <linux/of.h>
> > > @@ -853,6 +854,14 @@ static void free_overlay_changeset(struct
> > > overlay_changeset *ovcs)
> > >  {
> > >   int i;
> > >  
> > > + /*
> > > > + * Wait for any ongoing device link removals before removing some of
> > > + * nodes. Drop the global lock while waiting
> > > + */
> > > + mutex_unlock(&of_mutex);
> > > + device_link_wait_removal();
> > > + mutex_lock(&of_mutex);
> >
> > > I'm still not convinced we need to drop the lock. What happens if someone else
> > > grabs the lock while we are in device_link_wait_removal()? Can we guarantee that
> > > we can't screw things badly?
>
> It is also just ugly because it's the callers of
> free_overlay_changeset() that hold the lock and now we're releasing it
> behind their back.
>
> As device_link_wait_removal() is called before we touch anything, can't
> it be called before we take the lock? And do we need to call it if
> applying the overlay fails?
>

My natural feeling was to put it right before checking the node refcount... and
I would still like to see proof that there's any potential deadlock. I have not
checked the code, but the issue with calling it before we take the lock is that
the device links likely won't be removed yet, because the overlay removal path
(which unbinds devices from drivers) needs to run under the lock?

- Nuno Sá

2024-03-05 06:18:22

by Saravana Kannan

Subject: Re: [PATCH v3 0/2] Synchronize DT overlay removal with devlink removals

On Mon, Mar 4, 2024 at 7:02 AM Rob Herring <[email protected]> wrote:
>
> On Thu, Feb 29, 2024 at 11:52:01AM +0100, Herve Codina wrote:
> > Hi,
>
> Please CC Saravana on this.

Nuno, this is why I was replying to the older series. I didn't even
get this one.

>
> >
> > In the following sequence:
> > of_platform_depopulate(); /* Remove devices from a DT overlay node */
> > of_overlay_remove(); /* Remove the DT overlay node itself */
> >
> > Some warnings are raised by __of_changeset_entry_destroy() which was
> > called from of_overlay_remove():
> > ERROR: memory leak, expected refcount 1 instead of 2 ...
> >
> > The issue is that, during the device devlink removals triggered from the
> > of_platform_depopulate(), jobs are put in a workqueue.
> > These jobs drop the reference to the devices. When a device is no more
> > referenced (refcount == 0), it is released and the reference to its
> > of_node is dropped by a call to of_node_put().
> > These operations are fully correct except that, because of the
> > workqueue, they are done asynchronously with respect to function calls.
> >
> > In the sequence provided, the jobs are run too late, after the call to
> > __of_changeset_entry_destroy() and so a missing of_node_put() call is
> > detected by __of_changeset_entry_destroy().
> >
> > This series fixes this issue introducing device_link_wait_removal() in
> > order to wait for the end of jobs execution (patch 1) and using this
> > function to synchronize the overlay removal with the end of jobs
> > execution (patch 2).
> >
> > Compared to the previous iteration:
> > https://lore.kernel.org/linux-kernel/[email protected]/
> > this v3 series:
> > - add the missing device.h
> >
> > This series handles cases reported by Luca [1] and Nuno [2].
> > [1]: https://lore.kernel.org/all/20231220181627.341e8789@booty/
> > [2]: https://lore.kernel.org/all/[email protected]/
> >
> > Best regards,
> > Hervé
> >
> > Changes v2 -> v3
> > - Patch 1
> > No changes
> >
> > - Patch 2
> > Add missing device.h
> >
> > Changes v1 -> v2
> > - Patch 1
> > Rename the workqueue to 'device_link_wq'
> > Add 'Fixes' tag and Cc stable
> >
> > - Patch 2
> > Add device.h inclusion.
> > Call device_link_wait_removal() later in the overlay removal
> > sequence (i.e. in free_overlay_changeset() function).
> > Drop of_mutex lock while calling device_link_wait_removal().
> > Add 'Fixes' tag and Cc stable
> >
> > Herve Codina (2):
> > driver core: Introduce device_link_wait_removal()
> > of: overlay: Synchronize of_overlay_remove() with the devlink removals
> >
> > drivers/base/core.c | 26 +++++++++++++++++++++++---
> > drivers/of/overlay.c | 10 +++++++++-
> > include/linux/device.h | 1 +
> > 3 files changed, 33 insertions(+), 4 deletions(-)
> >
> > --
> > 2.43.0
> >

2024-03-05 06:48:02

by Saravana Kannan

Subject: Re: [PATCH v3 2/2] of: overlay: Synchronize of_overlay_remove() with the devlink removals

On Mon, Mar 4, 2024 at 8:49 AM Herve Codina <[email protected]> wrote:
>
> Hi Rob,
>
> On Mon, 4 Mar 2024 09:22:02 -0600
> Rob Herring <[email protected]> wrote:
>
> ...
>
> > > > @@ -853,6 +854,14 @@ static void free_overlay_changeset(struct
> > > > overlay_changeset *ovcs)
> > > > {
> > > > int i;
> > > >
> > > > + /*
> > > > + * Wait for any ongoing device link removals before removing some of
> > > > + * nodes. Drop the global lock while waiting
> > > > + */
> > > > + mutex_unlock(&of_mutex);
> > > > + device_link_wait_removal();
> > > > + mutex_lock(&of_mutex);
> > >
> > > I'm still not convinced we need to drop the lock. What happens if someone else
> > > grabs the lock while we are in device_link_wait_removal()? Can we guarantee that
> > > we can't screw things badly?
> >
> > It is also just ugly because it's the callers of
> > free_overlay_changeset() that hold the lock and now we're releasing it
> > behind their back.
> >
> > As device_link_wait_removal() is called before we touch anything, can't
> > it be called before we take the lock? And do we need to call it if
> > applying the overlay fails?

Rob,

This[1] scenario Luca reported seems like a reason for the
device_link_wait_removal() to be where Herve put it. That example
seems reasonable.

[1] - https://lore.kernel.org/all/20231220181627.341e8789@booty/

> >
>
> Indeed, having device_link_wait_removal() is not needed when applying the
> overlay fails.
>
> I can call device_link_wait_removal() from the caller of_overlay_remove()
> but not before the lock is taken.
> We need to call it between __of_changeset_revert_notify() and
> free_overlay_changeset() and so, the lock is taken.
>
> This leads to the following sequence:
> --- 8< ---
> int of_overlay_remove(int *ovcs_id)
> {
> ...
> mutex_lock(&of_mutex);
> ...
>
> ret = __of_changeset_revert_notify(&ovcs->cset);
> ...
>
> ret_tmp = overlay_notify(ovcs, OF_OVERLAY_POST_REMOVE);
> ...
>
> mutex_unlock(&of_mutex);
> device_link_wait_removal();
> mutex_lock(&of_mutex);
>
> free_overlay_changeset(ovcs);
> ...
> mutex_unlock(&of_mutex);
> ...
> }
> --- 8< ---
>
> In this sequence, the question is:
> Do we need to release the mutex lock while device_link_wait_removal() is
> called ?

In general I hate these kinds of sequences that release a lock and
then grab it again quickly. It's not always a bug, but my personal
take on that is 90% of these introduce a bug.

Drop the unlock/lock and we'll deal with a deadlock if we actually hit one.
I'm also fairly certain that device_link_wait_removal() can't trigger
something else that can cause an OF overlay change while we are in the
middle of one. And like Rob said, I'm not sure this unlock/lock is a
good solution for that anyway.

Please CC me on the next series. And I'm glad folks convinced you to
use flush_workqueue(). As I said in the older series, I think
drain_workqueue() will actually break device links.

-Saravana

2024-03-05 07:22:48

by Nuno Sá

Subject: Re: [PATCH v3 0/2] Synchronize DT overlay removal with devlink removals

On Mon, 2024-03-04 at 22:17 -0800, Saravana Kannan wrote:
> On Mon, Mar 4, 2024 at 7:02 AM Rob Herring <[email protected]> wrote:
> >
> > On Thu, Feb 29, 2024 at 11:52:01AM +0100, Herve Codina wrote:
> > > Hi,
> >
> > Please CC Saravana on this.
>
> Nuno, this is why I was replying to the older series. I didn't even
> get this one.

Arghh, I see... In lots of replies I was mentioning you :)

- Nuno Sá

>
> >
> > >
> > > In the following sequence:
> > >   of_platform_depopulate(); /* Remove devices from a DT overlay node */
> > >   of_overlay_remove(); /* Remove the DT overlay node itself */
> > >
> > > > Some warnings are raised by __of_changeset_entry_destroy() which was
> > > called from of_overlay_remove():
> > >   ERROR: memory leak, expected refcount 1 instead of 2 ...
> > >
> > > The issue is that, during the device devlink removals triggered from the
> > > of_platform_depopulate(), jobs are put in a workqueue.
> > > These jobs drop the reference to the devices. When a device is no more
> > > referenced (refcount == 0), it is released and the reference to its
> > > of_node is dropped by a call to of_node_put().
> > > These operations are fully correct except that, because of the
> > > workqueue, they are done asynchronously with respect to function calls.
> > >
> > > In the sequence provided, the jobs are run too late, after the call to
> > > __of_changeset_entry_destroy() and so a missing of_node_put() call is
> > > detected by __of_changeset_entry_destroy().
> > >
> > > This series fixes this issue introducing device_link_wait_removal() in
> > > order to wait for the end of jobs execution (patch 1) and using this
> > > function to synchronize the overlay removal with the end of jobs
> > > execution (patch 2).
> > >
> > > > Compared to the previous iteration:
> > > >   https://lore.kernel.org/linux-kernel/[email protected]/
> > > this v3 series:
> > > - add the missing device.h
> > >
> > > This series handles cases reported by Luca [1] and Nuno [2].
> > >   [1]: https://lore.kernel.org/all/20231220181627.341e8789@booty/
> > >   [2]:
> > > https://lore.kernel.org/all/[email protected]/
> > >
> > > Best regards,
> > > Hervé
> > >
> > > Changes v2 -> v3
> > >   - Patch 1
> > >     No changes
> > >
> > >   - Patch 2
> > >     Add missing device.h
> > >
> > > Changes v1 -> v2
> > >   - Patch 1
> > >     Rename the workqueue to 'device_link_wq'
> > >     Add 'Fixes' tag and Cc stable
> > >
> > >   - Patch 2
> > >     Add device.h inclusion.
> > >     Call device_link_wait_removal() later in the overlay removal
> > >     sequence (i.e. in free_overlay_changeset() function).
> > >     Drop of_mutex lock while calling device_link_wait_removal().
> > >     Add       'Fixes' tag and Cc stable
> > >
> > > Herve Codina (2):
> > >   driver core: Introduce device_link_wait_removal()
> > >   of: overlay: Synchronize of_overlay_remove() with the devlink removals
> > >
> > >  drivers/base/core.c    | 26 +++++++++++++++++++++++---
> > >  drivers/of/overlay.c   | 10 +++++++++-
> > >  include/linux/device.h |  1 +
> > >  3 files changed, 33 insertions(+), 4 deletions(-)
> > >
> > > --
> > > 2.43.0
> > >


2024-03-05 07:35:18

by Nuno Sá

Subject: Re: [PATCH v3 2/2] of: overlay: Synchronize of_overlay_remove() with the devlink removals

On Mon, 2024-03-04 at 22:47 -0800, Saravana Kannan wrote:
> On Mon, Mar 4, 2024 at 8:49 AM Herve Codina <herve.codina@bootlin.com> wrote:
> >
> > Hi Rob,
> >
> > On Mon, 4 Mar 2024 09:22:02 -0600
> > Rob Herring <[email protected]> wrote:
> >
> > ...
> >
> > > > > @@ -853,6 +854,14 @@ static void free_overlay_changeset(struct
> > > > > overlay_changeset *ovcs)
> > > > >  {
> > > > >   int i;
> > > > >
> > > > > + /*
> > > > > +  * Wait for any ongoing device link removals before removing some of
> > > > > +  * nodes. Drop the global lock while waiting
> > > > > +  */
> > > > > + mutex_unlock(&of_mutex);
> > > > > + device_link_wait_removal();
> > > > > + mutex_lock(&of_mutex);
> > > >
> > > > > I'm still not convinced we need to drop the lock. What happens if someone else
> > > > > grabs the lock while we are in device_link_wait_removal()? Can we guarantee that
> > > > > we can't screw things badly?
> > >
> > > It is also just ugly because it's the callers of
> > > free_overlay_changeset() that hold the lock and now we're releasing it
> > > behind their back.
> > >
> > > As device_link_wait_removal() is called before we touch anything, can't
> > > it be called before we take the lock? And do we need to call it if
> > > applying the overlay fails?
>
> Rob,
>
> This[1] scenario Luca reported seems like a reason for the
> device_link_wait_removal() to be where Herve put it. That example
> seems reasonable.
>
> [1] - https://lore.kernel.org/all/20231220181627.341e8789@booty/
>

I'm still not totally convinced about that. Why not put the flush right
before checking the kref in __of_changeset_entry_destroy()? I'll contradict
myself a bit because this is just theory, but if we look at pci_stop_dev(),
which, AFAIU, could be reached from a sysfs write(), we have:

device_release_driver(&dev->dev);
...
of_pci_remove_node(dev);
of_changeset_revert(np->data);
of_changeset_destroy(np->data);

So looking at the above we would hit the same issue if we flush the queue in
free_overlay_changeset() - as the queue won't be flushed at all and we could
have devlink removal due to device_release_driver(). Right?

Again, this is completely theoretical, but it seems like a reasonable scenario.
Plus, I don't understand the push against having the flush in
__of_changeset_entry_destroy(). Conceptually, it looks like the best place to
me, but I may be missing some issue in doing it there?

> > >
> >
> > Indeed, having device_link_wait_removal() is not needed when applying the
> > overlay fails.
> >
> > I can call device_link_wait_removal() from the caller of_overlay_remove()
> > but not before the lock is taken.
> > We need to call it between __of_changeset_revert_notify() and
> > free_overlay_changeset() and so, the lock is taken.
> >
> > This leads to the following sequence:
> > --- 8< ---
> > int of_overlay_remove(int *ovcs_id)
> > {
> >         ...
> >         mutex_lock(&of_mutex);
> >         ...
> >
> >         ret = __of_changeset_revert_notify(&ovcs->cset);
> >         ...
> >
> >         ret_tmp = overlay_notify(ovcs, OF_OVERLAY_POST_REMOVE);
> >         ...
> >
> >         mutex_unlock(&of_mutex);
> >         device_link_wait_removal();
> >         mutex_lock(&of_mutex);
> >
> >         free_overlay_changeset(ovcs);
> >         ...
> >         mutex_unlock(&of_mutex);
> >         ...
> > }
> > --- 8< ---
> >
> > In this sequence, the question is:
> > Do we need to release the mutex lock while device_link_wait_removal() is
> > called ?
>
> In general I hate these kinds of sequences that release a lock and
> then grab it again quickly. It's not always a bug, but my personal
> take on that is 90% of these introduce a bug.
>
> Drop the unlock/lock and we'll deal with a deadlock if we actually hit one.
> I'm also fairly certain that device_link_wait_removal() can't trigger
> something else that can cause an OF overlay change while we are in the
> middle of one. And like Rob said, I'm not sure this unlock/lock is a
> good solution for that anyway.

Totally agree. Unless we really see a deadlock this is a very bad idea (IMHO).
Even on the PCI code, it seems to me that we're never destroying a changeset
from a device/kobj_type release callback. That would be super weird right?

- Nuno Sá
>


2024-03-05 10:54:03

by Nuno Sá

Subject: Re: [PATCH v3 2/2] of: overlay: Synchronize of_overlay_remove() with the devlink removals

On Tue, 2024-03-05 at 11:27 +0100, Herve Codina wrote:
> Hi Nuno, Saravana, Rob,
>
> On Tue, 05 Mar 2024 08:36:45 +0100
> Nuno Sá <[email protected]> wrote:
>
> > On Mon, 2024-03-04 at 22:47 -0800, Saravana Kannan wrote:
> > > On Mon, Mar 4, 2024 at 8:49 AM Herve Codina <[email protected]>
> > > wrote: 
> > > >
> > > > Hi Rob,
> > > >
> > > > On Mon, 4 Mar 2024 09:22:02 -0600
> > > > Rob Herring <[email protected]> wrote:
> > > >
> > > > ...
> > > >  
> > > > > > > @@ -853,6 +854,14 @@ static void free_overlay_changeset(struct
> > > > > > > overlay_changeset *ovcs)
> > > > > > >  {
> > > > > > >   int i;
> > > > > > >
> > > > > > > + /*
> > > > > > > +  * Wait for any ongoing device link removals before removing
> > > > > > > some of
> > > > > > > +  * nodes. Drop the global lock while waiting
> > > > > > > +  */
> > > > > > > + mutex_unlock(&of_mutex);
> > > > > > > + device_link_wait_removal();
> > > > > > > + mutex_lock(&of_mutex); 
> > > > > >
> > > > > > I'm still not convinced we need to drop the lock. What happens if
> > > > > > someone else
> > > > > > grabs the lock while we are in device_link_wait_removal()? Can we
> > > > > > guarantee that
> > > > > > we can't screw things badly? 
> > > > >
> > > > > It is also just ugly because it's the callers of
> > > > > free_overlay_changeset() that hold the lock and now we're releasing it
> > > > > behind their back.
> > > > >
> > > > > As device_link_wait_removal() is called before we touch anything,
> > > > > can't
> > > > > it be called before we take the lock? And do we need to call it if
> > > > > applying the overlay fails? 
> > >
> > > Rob,
> > >
> > > This[1] scenario Luca reported seems like a reason for the
> > > device_link_wait_removal() to be where Herve put it. That example
> > > seems reasonable.
> > >
> > > [1] - https://lore.kernel.org/all/20231220181627.341e8789@booty/
> > >  
> >
> > I'm still not totally convinced about that. Why not putting the check right
> > before checking the kref in __of_changeset_entry_destroy(). I'll contradict
> > myself a bit because this is just theory but if we look at pci_stop_dev(),
> > which
> > AFAIU, could be reached from a sysfs write(), we have:
> >
> > device_release_driver(&dev->dev);
> > ...
> > of_pci_remove_node(dev);
> > of_changeset_revert(np->data);
> > of_changeset_destroy(np->data);
> >
> > So looking at the above we would hit the same issue if we flush the queue in
> > free_overlay_changeset() - as the queue won't be flushed at all and we could
> > have devlink removal due to device_release_driver(). Right?
> >
> > Again, completely theoretical but seems like a reasonable one plus I'm not
> > understanding the push against having the flush in
> > __of_changeset_entry_destroy(). Conceptually, it looks the best place to me
> > but
> > I may be missing some issue in doing it there?
>
> Instead of having the wait called in __of_changeset_entry_destroy(), and so
> called in a loop, I could move this call to the __of_changeset_entry_destroy()
> caller (without any of_mutex lock drop).
>

Oh, good catch! At this point all the devlinks removals (related to the
changeset) should have been queued so yes, we should only need to flush once.
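
The "flush once" reasoning can be sketched with a userspace toy model. Nothing
here is kernel API: toy_node, toy_defer_put(), toy_flush(),
toy_changeset_destroy() and toy_scenario() are hypothetical stand-ins, and the
model is single-threaded. It assumes that each node carries one reference
owned by the changeset plus one held by a device, and that releasing the
device only queues the matching put instead of doing it right away.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX 8

/* Hypothetical stand-in for a device_node: only the refcount matters here. */
struct toy_node { int refcount; };

/* Deferred puts, standing in for the queued devlink removal jobs. */
static struct toy_node *toy_deferred[TOY_MAX];
static size_t toy_ndeferred;

/* Device removal: the put on the node is queued, not done right away. */
static void toy_defer_put(struct toy_node *np)
{
	assert(toy_ndeferred < TOY_MAX);
	toy_deferred[toy_ndeferred++] = np;
}

/* Stand-in for waiting on the removal jobs: run every queued put. */
static void toy_flush(void)
{
	for (size_t i = 0; i < toy_ndeferred; i++)
		toy_deferred[i]->refcount--;
	toy_ndeferred = 0;
}

/*
 * Destroy a changeset made of n nodes and return how many entries would
 * trigger the "expected refcount 1" complaint. With flush_first set, the
 * wait happens once, before the loop.
 */
int toy_changeset_destroy(struct toy_node *nodes, size_t n, int flush_first)
{
	int complaints = 0;

	if (flush_first)
		toy_flush();

	for (size_t i = n; i-- > 0; )	/* walk the entries in reverse */
		if (nodes[i].refcount != 1)
			complaints++;

	return complaints;
}

/* Build n nodes held by the changeset and a device, then remove the devices. */
int toy_scenario(size_t n, int flush_first)
{
	struct toy_node nodes[TOY_MAX];
	int complaints;

	assert(n <= TOY_MAX);
	for (size_t i = 0; i < n; i++) {
		nodes[i].refcount = 2;		/* changeset + device */
		toy_defer_put(&nodes[i]);	/* device gone, put still pending */
	}
	complaints = toy_changeset_destroy(nodes, n, flush_first);
	toy_flush();	/* late jobs run eventually; empties the queue */
	return complaints;
}
```

In this model, toy_scenario(3, 0) reports a complaint for every entry because
the puts run too late, while toy_scenario(3, 1) reports none: all the puts were
queued before the destroy started, so a single wait ahead of the loop covers
every entry.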

> So this will look like this:
> --- 8< ---
> void of_changeset_destroy(struct of_changeset *ocs)
> {
> struct of_changeset_entry *ce, *cen;
>
> device_link_wait_removal();
>
> list_for_each_entry_safe_reverse(ce, cen, &ocs->entries, node)
> __of_changeset_entry_destroy(ce);
> }
> --- 8< ---
>
> I already tested on my system and it works correctly with
> device_link_wait_removal() only called from of_changeset_destroy()
> as proposed.
>
> Saravana, Nuno, Rob, does it seem OK to you?
>

It looks good to me...

- Nuno Sá
>


2024-03-06 03:09:23

by Saravana Kannan

Subject: Re: [PATCH v3 2/2] of: overlay: Synchronize of_overlay_remove() with the devlink removals

On Tue, Mar 5, 2024 at 2:43 AM Nuno Sá <[email protected]> wrote:
>
> On Tue, 2024-03-05 at 11:27 +0100, Herve Codina wrote:
> > Hi Nuno, Saravana, Rob,
> >
> > On Tue, 05 Mar 2024 08:36:45 +0100
> > Nuno Sá <[email protected]> wrote:
> >
> > > On Mon, 2024-03-04 at 22:47 -0800, Saravana Kannan wrote:
> > > > On Mon, Mar 4, 2024 at 8:49 AM Herve Codina <[email protected]>
> > > > wrote:
> > > > >
> > > > > Hi Rob,
> > > > >
> > > > > On Mon, 4 Mar 2024 09:22:02 -0600
> > > > > Rob Herring <[email protected]> wrote:
> > > > >
> > > > > ...
> > > > >
> > > > > > > > @@ -853,6 +854,14 @@ static void free_overlay_changeset(struct
> > > > > > > > overlay_changeset *ovcs)
> > > > > > > > {
> > > > > > > > int i;
> > > > > > > >
> > > > > > > > + /*
> > > > > > > > + * Wait for any ongoing device link removals before removing
> > > > > > > > some of
> > > > > > > > + * nodes. Drop the global lock while waiting
> > > > > > > > + */
> > > > > > > > + mutex_unlock(&of_mutex);
> > > > > > > > + device_link_wait_removal();
> > > > > > > > + mutex_lock(&of_mutex);
> > > > > > >
> > > > > > > I'm still not convinced we need to drop the lock. What happens if
> > > > > > > someone else
> > > > > > > grabs the lock while we are in device_link_wait_removal()? Can we
> > > > > > > guarantee that
> > > > > > > we can't screw things badly?
> > > > > >
> > > > > > It is also just ugly because it's the callers of
> > > > > > free_overlay_changeset() that hold the lock and now we're releasing it
> > > > > > behind their back.
> > > > > >
> > > > > > As device_link_wait_removal() is called before we touch anything,
> > > > > > can't
> > > > > > it be called before we take the lock? And do we need to call it if
> > > > > > applying the overlay fails?
> > > >
> > > > Rob,
> > > >
> > > > This[1] scenario Luca reported seems like a reason for the
> > > > device_link_wait_removal() to be where Herve put it. That example
> > > > seems reasonable.
> > > >
> > > > [1] - https://lore.kernel.org/all/20231220181627.341e8789@booty/
> > > >
> > >
> > > I'm still not totally convinced about that. Why not putting the check right
> > > before checking the kref in __of_changeset_entry_destroy(). I'll contradict
> > > myself a bit because this is just theory but if we look at pci_stop_dev(),
> > > which
> > > AFAIU, could be reached from a sysfs write(), we have:
> > >
> > > device_release_driver(&dev->dev);
> > > ...
> > > of_pci_remove_node(dev);
> > > of_changeset_revert(np->data);
> > > of_changeset_destroy(np->data);
> > >
> > > So looking at the above we would hit the same issue if we flush the queue in
> > > free_overlay_changeset() - as the queue won't be flushed at all and we could
> > > have devlink removal due to device_release_driver(). Right?
> > >
> > > Again, completely theoretical but seems like a reasonable one plus I'm not
> > > understanding the push against having the flush in
> > > __of_changeset_entry_destroy(). Conceptually, it looks the best place to me
> > > but
> > > I may be missing some issue in doing it there?
> >
> > Instead of having the wait called in __of_changeset_entry_destroy(), and so
> > called in a loop, I could move this call to the __of_changeset_entry_destroy()
> > caller (without any of_mutex lock drop).
> >
>
> Oh, good catch! At this point all the devlinks removals (related to the
> changeset) should have been queued so yes, we should only need to flush once.
>
> > So this will look like this:
> > --- 8< ---
> > void of_changeset_destroy(struct of_changeset *ocs)
> > {
> > struct of_changeset_entry *ce, *cen;
> >
> > device_link_wait_removal();
> >
> > list_for_each_entry_safe_reverse(ce, cen, &ocs->entries, node)
> > __of_changeset_entry_destroy(ce);
> > }
> > --- 8< ---
> >
> > I already tested on my system and it works correctly with
> > device_link_wait_removal() only called from of_changeset_destroy()
> > as proposed.
> >
> > Saravana, Nuno, Rob, does it seem OK to you?

Looks good to me.

-Saravana

> >
>
> It looks good to me...
>
> - Nuno Sá
> >
>