2023-12-29 10:26:52

by Luca Ceresoli

[permalink] [raw]
Subject: [PATCH] nvmem: core: fix nvmem cells not being available in notifiers

With current code, when an NVMEM notifier for NVMEM_CELL_ADD is called, the
cell is not accessible within the notifier call function.

This can be easily seen with a simple NVMEM notifier call doing:

if (action == NVMEM_CELL_ADD)
cell = nvmem_cell_get(hpm->dev, "id");

In this case nvmem_cell_get() always returns -EPROBE_DEFER if trying to get
the cell whose addition is being notified. Adding a long msleep() before
nvmem_cell_get() does not change the result. On the other hand,
nvmem_cell_get() works when invoked from a different code path not
involving the NVMEM event notifier.

The failing code path in my test case is:

nvmem_cell_get()
-> of_nvmem_cell_get()
-> __nvmem_device_get()
-> bus_find_device(); ---> returns NULL

The cause is in nvmem_register(), which adds the cells (via
nvmem_add_cells_from_fixed_layout() in my case) and in the process calls
the NVMEM_CELL_ADD notifiers _before_ calling device_add(&nvmem->dev), thus
before the nvmem device is added to the bus. This prevents
bus_find_device() from finding the device.

This makes the NVMEM_CELL_ADD notifier pretty useless, at least in the use
case where a consumer driver wants to read a cell when it becomes
available, without additional infrastructure to postpone the
nvmem_cell_get() call.

The easy solution would be moving the device_add() just before cell
addition instead of just after. This is exactly what the original
implementation was doing, but it had a race condition, which was fixed in
commit ab3428cfd9aa ("nvmem: core: fix registration vs use race") exactly
by swapping device_add() and cell addition.

Solve this problem by leaving cell addition before device_add() but moving
the notifications after. This would be pretty simple if all cells were
added by nvmem_register(), but cell layouts could run later if they are
built as modules, so they need to be allowed to notify when they add cells
after nvmem_register().

Solve this by adding a flag in struct nvmem_device to block all
notifications before calling device_add(), and keep track of whether each
cell got notified or not, so that exactly one notification is sent ber
cell.

Signed-off-by: Luca Ceresoli <[email protected]>
---
drivers/nvmem/core.c | 35 +++++++++++++++++++++++++++++++++--
drivers/nvmem/internals.h | 2 ++
2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c
index ba559e81f77f..42f8edbfb39c 100644
--- a/drivers/nvmem/core.c
+++ b/drivers/nvmem/core.c
@@ -36,6 +36,7 @@ struct nvmem_cell_entry {
struct device_node *np;
struct nvmem_device *nvmem;
struct list_head node;
+ atomic_t notified_add;
};

struct nvmem_cell {
@@ -520,9 +521,29 @@ static struct bus_type nvmem_bus_type = {
.name = "nvmem",
};

+/*
+ * Send cell add/remove notification unless it has been already sent.
+ *
+ * Uses and updates cell->notified_add to avoid duplicates.
+ *
+ * Must never be called with NVMEM_CELL_ADD after being called with
+ * NVMEM_CELL_REMOVE.
+ *
+ * @cell: the cell just added or going to be removed
+ * @event: NVMEM_CELL_ADD or NVMEM_CELL_REMOVE
+ */
+static void nvmem_cell_notify(struct nvmem_cell_entry *cell, unsigned long event)
+{
+ int new_notified = (event == NVMEM_CELL_ADD) ? 1 : 0;
+ int was_notified = atomic_xchg(&cell->notified_add, new_notified);
+
+ if (new_notified != was_notified)
+ blocking_notifier_call_chain(&nvmem_notifier, event, cell);
+}
+
static void nvmem_cell_entry_drop(struct nvmem_cell_entry *cell)
{
- blocking_notifier_call_chain(&nvmem_notifier, NVMEM_CELL_REMOVE, cell);
+ nvmem_cell_notify(cell, NVMEM_CELL_REMOVE);
mutex_lock(&nvmem_mutex);
list_del(&cell->node);
mutex_unlock(&nvmem_mutex);
@@ -544,7 +565,9 @@ static void nvmem_cell_entry_add(struct nvmem_cell_entry *cell)
mutex_lock(&nvmem_mutex);
list_add_tail(&cell->node, &cell->nvmem->cells);
mutex_unlock(&nvmem_mutex);
- blocking_notifier_call_chain(&nvmem_notifier, NVMEM_CELL_ADD, cell);
+
+ if (cell->nvmem->do_notify_cell_add)
+ nvmem_cell_notify(cell, NVMEM_CELL_ADD);
}

static int nvmem_cell_info_to_nvmem_cell_entry_nodup(struct nvmem_device *nvmem,
@@ -902,6 +925,7 @@ EXPORT_SYMBOL_GPL(nvmem_layout_get_match_data);
struct nvmem_device *nvmem_register(const struct nvmem_config *config)
{
struct nvmem_device *nvmem;
+ struct nvmem_cell_entry *cell;
int rval;

if (!config->dev)
@@ -1033,6 +1057,13 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)

blocking_notifier_call_chain(&nvmem_notifier, NVMEM_ADD, nvmem);

+ /* After device_add() it is now OK to notify of new cells */
+ nvmem->do_notify_cell_add = true;
+
+ /* Notify about cells previously added but not notified */
+ list_for_each_entry(cell, &nvmem->cells, node)
+ nvmem_cell_notify(cell, NVMEM_CELL_ADD);
+
return nvmem;

#ifdef CONFIG_NVMEM_SYSFS
diff --git a/drivers/nvmem/internals.h b/drivers/nvmem/internals.h
index 18fed57270e5..3dbaa8523530 100644
--- a/drivers/nvmem/internals.h
+++ b/drivers/nvmem/internals.h
@@ -33,6 +33,8 @@ struct nvmem_device {
struct nvmem_layout *layout;
void *priv;
bool sysfs_cells_populated;
+ /* Enable sending NVMEM_CELL_ADD notifications */
+ bool do_notify_cell_add;
};

#if IS_ENABLED(CONFIG_OF)

---
base-commit: 399769c2014d2aa0463636d50f2bc6431b377331
change-id: 20231229-nvmem-cell-add-notification-feb857742f0a

Best regards,
--
Luca Ceresoli <[email protected]>



2024-01-02 09:35:25

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH] nvmem: core: fix nvmem cells not being available in notifiers

Hi Luca,

[email protected] wrote on Fri, 29 Dec 2023 11:26:26 +0100:

> With current code, when an NVMEM notifier for NVMEM_CELL_ADD is called, the
> cell is not accessible within the notifier call function.
>

Nice commit log :) a few minor comments below.

...

>
> Solve this by adding a flag in struct nvmem_device to block all
> notifications before calling device_add(), and keep track of whether each
> cell got notified or not, so that exactly one notification is sent ber

per?

> cell.
>
> Signed-off-by: Luca Ceresoli <[email protected]>
> ---
> drivers/nvmem/core.c | 35 +++++++++++++++++++++++++++++++++--
> drivers/nvmem/internals.h | 2 ++
> 2 files changed, 35 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c
> index ba559e81f77f..42f8edbfb39c 100644
> --- a/drivers/nvmem/core.c
> +++ b/drivers/nvmem/core.c
> @@ -36,6 +36,7 @@ struct nvmem_cell_entry {
> struct device_node *np;
> struct nvmem_device *nvmem;
> struct list_head node;
> + atomic_t notified_add;
> };
>
> struct nvmem_cell {
> @@ -520,9 +521,29 @@ static struct bus_type nvmem_bus_type = {
> .name = "nvmem",
> };
>
> +/*
> + * Send cell add/remove notification unless it has been already sent.
> + *
> + * Uses and updates cell->notified_add to avoid duplicates.
> + *
> + * Must never be called with NVMEM_CELL_ADD after being called with
> + * NVMEM_CELL_REMOVE.
> + *
> + * @cell: the cell just added or going to be removed
> + * @event: NVMEM_CELL_ADD or NVMEM_CELL_REMOVE
> + */
> +static void nvmem_cell_notify(struct nvmem_cell_entry *cell, unsigned long event)
> +{
> + int new_notified = (event == NVMEM_CELL_ADD) ? 1 : 0;

The ternary operator is not needed here, (event == VAL) will return the
correct value.

Could we rename new_notified into something like "is_addition"? It took
me a bit of time understanding what this boolean meant.

> + int was_notified = atomic_xchg(&cell->notified_add, new_notified);
> +
> + if (new_notified != was_notified)

I believe what you want is (with my terms):

if ((is_addition && !was_notified) || !is_addition)

> + blocking_notifier_call_chain(&nvmem_notifier, event, cell);

I believe your if condition works, but is a bit complex to read. Is
there a reason for the following condition ?

(new_notified := 0) /*removal */ != (was_notified := 1)

I see no use to this, but I am probably over looking something.

> +}
> +
> static void nvmem_cell_entry_drop(struct nvmem_cell_entry *cell)
> {
> - blocking_notifier_call_chain(&nvmem_notifier, NVMEM_CELL_REMOVE, cell);
> + nvmem_cell_notify(cell, NVMEM_CELL_REMOVE);
> mutex_lock(&nvmem_mutex);
> list_del(&cell->node);
> mutex_unlock(&nvmem_mutex);
> @@ -544,7 +565,9 @@ static void nvmem_cell_entry_add(struct nvmem_cell_entry *cell)
> mutex_lock(&nvmem_mutex);
> list_add_tail(&cell->node, &cell->nvmem->cells);
> mutex_unlock(&nvmem_mutex);
> - blocking_notifier_call_chain(&nvmem_notifier, NVMEM_CELL_ADD, cell);
> +
> + if (cell->nvmem->do_notify_cell_add)
> + nvmem_cell_notify(cell, NVMEM_CELL_ADD);
> }
>
> static int nvmem_cell_info_to_nvmem_cell_entry_nodup(struct nvmem_device *nvmem,
> @@ -902,6 +925,7 @@ EXPORT_SYMBOL_GPL(nvmem_layout_get_match_data);
> struct nvmem_device *nvmem_register(const struct nvmem_config *config)
> {
> struct nvmem_device *nvmem;
> + struct nvmem_cell_entry *cell;
> int rval;
>
> if (!config->dev)
> @@ -1033,6 +1057,13 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
>
> blocking_notifier_call_chain(&nvmem_notifier, NVMEM_ADD, nvmem);
>
> + /* After device_add() it is now OK to notify of new cells */
> + nvmem->do_notify_cell_add = true;

Could we rename this as well to be simpler? Like
"notify_cell_additions" or "cells_can_be_notified"? I am actually
asking myself whether this boolean is useful. In practice we call the
notifier after setting this to true. On the other hand, the layouts
will only probe after the device_add(), so they should be safe?

> +
> + /* Notify about cells previously added but not notified */
> + list_for_each_entry(cell, &nvmem->cells, node)
> + nvmem_cell_notify(cell, NVMEM_CELL_ADD);
> +
> return nvmem;
>
> #ifdef CONFIG_NVMEM_SYSFS
> diff --git a/drivers/nvmem/internals.h b/drivers/nvmem/internals.h
> index 18fed57270e5..3dbaa8523530 100644
> --- a/drivers/nvmem/internals.h
> +++ b/drivers/nvmem/internals.h
> @@ -33,6 +33,8 @@ struct nvmem_device {
> struct nvmem_layout *layout;
> void *priv;
> bool sysfs_cells_populated;
> + /* Enable sending NVMEM_CELL_ADD notifications */
> + bool do_notify_cell_add;
> };
>
> #if IS_ENABLED(CONFIG_OF)
>
> ---
> base-commit: 399769c2014d2aa0463636d50f2bc6431b377331
> change-id: 20231229-nvmem-cell-add-notification-feb857742f0a
>
> Best regards,


Thanks,
Miquèl

2024-01-02 13:46:51

by Luca Ceresoli

[permalink] [raw]
Subject: Re: [PATCH] nvmem: core: fix nvmem cells not being available in notifiers

On Tue, 2 Jan 2024 10:35:03 +0100
Miquel Raynal <[email protected]> wrote:

> > Solve this by adding a flag in struct nvmem_device to block all
> > notifications before calling device_add(), and keep track of whether each
> > cell got notified or not, so that exactly one notification is sent ber
>
> per?

Sure.

> > +/*
> > + * Send cell add/remove notification unless it has been already sent.
> > + *
> > + * Uses and updates cell->notified_add to avoid duplicates.
> > + *
> > + * Must never be called with NVMEM_CELL_ADD after being called with
> > + * NVMEM_CELL_REMOVE.
> > + *
> > + * @cell: the cell just added or going to be removed
> > + * @event: NVMEM_CELL_ADD or NVMEM_CELL_REMOVE
> > + */
> > +static void nvmem_cell_notify(struct nvmem_cell_entry *cell, unsigned long event)
> > +{
> > + int new_notified = (event == NVMEM_CELL_ADD) ? 1 : 0;
>
> The ternary operator is not needed here, (event == VAL) will return the
> correct value.

OK.

> Could we rename new_notified into something like "is_addition"? It took
> me a bit of time understanding what this boolean meant.

Let me explain better the idea. This is the value that
cell->notified_add gets over time:

1. at initialization: 0
2. when calling nvmem_cell_notify(cell, NVMEM_CELL_ADD): 1
and ADD notifier functions are called
3. if calling nvmem_cell_notify(cell, NVMEM_CELL_ADD) again
nothing happens
4. when calling nvmem_cell_notify(cell, NVMEM_CELL_REMOVE): 0
and REMOVE notifier functions are called
5. if calling nvmem_cell_notify(cell, NVMEM_CELL_REMOVE) again
nothing happens

So it avoids calling multiple notifiers both for addition, which is the
main goal, but also for removal. I understand there is probably no code
path for multiple removal calls, so maybe this is not useful.

I tried to find a good variable name to express this, and failed. :)

> > + int was_notified = atomic_xchg(&cell->notified_add, new_notified);
> > +
> > + if (new_notified != was_notified)

The "{was,new}_notified" names in my mind mean "{old,new} value of the
atomic flag". Thus "if (new_notified != was_notified)" means "if there
is a change of state, then notify it".

> I believe what you want is (with my terms):
>
> if ((is_addition && !was_notified) || !is_addition)
>
> > + blocking_notifier_call_chain(&nvmem_notifier, event, cell);
>
> I believe your if condition works, but is a bit complex to read. Is
> there a reason for the following condition ?
>
> (new_notified := 0) /*removal */ != (was_notified := 1)

From my explanation above, it is hopefully now clear that this means:

(new_notified := 0, i.e. we are having a removal event) !=
(was_notified := 1, i.e. the last even notified was not a removal)

That said, I'm open to remove this logic, and on cell removal just
unconditionally send a notifier, probably without changing the variable
value:

if (removal || !notify_cell_additions(&cell->notified_add, 1)

> > @@ -1033,6 +1057,13 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
> >
> > blocking_notifier_call_chain(&nvmem_notifier, NVMEM_ADD, nvmem);
> >
> > + /* After device_add() it is now OK to notify of new cells */
> > + nvmem->do_notify_cell_add = true;
>
> Could we rename this as well to be simpler? Like
> "notify_cell_additions" or "cells_can_be_notified"?

"notify_cell_additions" seems the best, thanks for the suggestion.

> I am actually
> asking myself whether this boolean is useful. In practice we call the
> notifier after setting this to true. On the other hand, the layouts
> will only probe after the device_add(), so they should be safe?

What if the module implementing the layout is loaded after
nvmem_register() finished? of_nvmem_cell_get() ->
nvmem_layout_module_get_optional() -> try_module_get() should allow
that, but I may be missing something.

Luca

--
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

2024-01-02 15:31:24

by Miquel Raynal

[permalink] [raw]
Subject: Re: [PATCH] nvmem: core: fix nvmem cells not being available in notifiers

Hi Luca,

[...]

> > Could we rename new_notified into something like "is_addition"? It took
> > me a bit of time understanding what this boolean meant.
>
> Let me explain better the idea. This is the value that
> cell->notified_add gets over time:
>
> 1. at initialization: 0
> 2. when calling nvmem_cell_notify(cell, NVMEM_CELL_ADD): 1
> and ADD notifier functions are called
> 3. if calling nvmem_cell_notify(cell, NVMEM_CELL_ADD) again
> nothing happens
> 4. when calling nvmem_cell_notify(cell, NVMEM_CELL_REMOVE): 0
> and REMOVE notifier functions are called
> 5. if calling nvmem_cell_notify(cell, NVMEM_CELL_REMOVE) again
> nothing happens
>
> So it avoids calling multiple notifiers both for addition, which is the
> main goal, but also for removal. I understand there is probably no code
> path for multiple removal calls, so maybe this is not useful.

Ok, that's clear now, I was on the wrong path, not because of the
naming, but because you also focused on the REMOVE, while I was not
expecting anything on that side.

> I tried to find a good variable name to express this, and failed. :)
>
> > > + int was_notified = atomic_xchg(&cell->notified_add, new_notified);
> > > +
> > > + if (new_notified != was_notified)
>
> The "{was,new}_notified" names in my mind mean "{old,new} value of the
> atomic flag". Thus "if (new_notified != was_notified)" means "if there
> is a change of state, then notify it".
>
> > I believe what you want is (with my terms):
> >
> > if ((is_addition && !was_notified) || !is_addition)
> >
> > > + blocking_notifier_call_chain(&nvmem_notifier, event, cell);
> >
> > I believe your if condition works, but is a bit complex to read. Is
> > there a reason for the following condition ?
> >
> > (new_notified := 0) /*removal */ != (was_notified := 1)
>
> From my explanation above, it is hopefully now clear that this means:
>
> (new_notified := 0, i.e. we are having a removal event) !=
> (was_notified := 1, i.e. the last even notified was not a removal)
>
> That said, I'm open to remove this logic, and on cell removal just
> unconditionally send a notifier, probably without changing the variable
> value:
>
> if (removal || !notify_cell_additions(&cell->notified_add, 1)

Yes, I see no use of the atomic counter in the right path for now, so
I'd suggest to keep the logic simpler for now, if you don't mind.

> > > @@ -1033,6 +1057,13 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
> > >
> > > blocking_notifier_call_chain(&nvmem_notifier, NVMEM_ADD, nvmem);
> > >
> > > + /* After device_add() it is now OK to notify of new cells */
> > > + nvmem->do_notify_cell_add = true;
> >
> > Could we rename this as well to be simpler? Like
> > "notify_cell_additions" or "cells_can_be_notified"?
>
> "notify_cell_additions" seems the best, thanks for the suggestion.
>
> > I am actually
> > asking myself whether this boolean is useful. In practice we call the
> > notifier after setting this to true. On the other hand, the layouts
> > will only probe after the device_add(), so they should be safe?
>
> What if the module implementing the layout is loaded after
> nvmem_register() finished? of_nvmem_cell_get() ->
> nvmem_layout_module_get_optional() -> try_module_get() should allow
> that, but I may be missing something.

Consumers should get -EPROBE_DEFER in this case. They can either try it
later or... wait on the notifier :)

Thanks,
Miquèl