The i.MX6 CPU frequency driver sometimes fails to register at boot time
due to nvmem_cell_read_u32() sporadically returning -ENOENT.

This happens because there is a window where __nvmem_device_get() in
of_nvmem_cell_get() is able to return the nvmem device, but as cells
have not yet been set up, nvmem_find_cell_entry_by_node() returns NULL.

This occurs because the nvmem core registration code violates one of the
fundamental principles of kernel programming: do not publish data
structures before their setup is complete.

Fix this by making nvmem core code conform with this principle.

Signed-off-by: Russell King (Oracle) <[email protected]>
---
drivers/nvmem/core.c | 18 ++++++++----------
1 file changed, 8 insertions(+), 10 deletions(-)
diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c
index 321d7d63e068..6b89fb6fa582 100644
--- a/drivers/nvmem/core.c
+++ b/drivers/nvmem/core.c
@@ -835,22 +835,16 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
 	nvmem->dev.groups = nvmem_dev_groups;
 #endif

-	dev_dbg(&nvmem->dev, "Registering nvmem device %s\n", config->name);
-
-	rval = device_register(&nvmem->dev);
-	if (rval)
-		goto err_put_device;
-
 	if (nvmem->nkeepout) {
 		rval = nvmem_validate_keepouts(nvmem);
 		if (rval)
-			goto err_device_del;
+			goto err_put_device;
 	}

 	if (config->compat) {
 		rval = nvmem_sysfs_setup_compat(nvmem, config);
 		if (rval)
-			goto err_device_del;
+			goto err_put_device;
 	}

 	if (config->cells) {
@@ -867,6 +861,12 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
 	if (rval)
 		goto err_remove_cells;

+	dev_dbg(&nvmem->dev, "Registering nvmem device %s\n", config->name);
+
+	rval = device_register(&nvmem->dev);
+	if (rval)
+		goto err_remove_cells;
+
 	blocking_notifier_call_chain(&nvmem_notifier, NVMEM_ADD, nvmem);

 	return nvmem;
@@ -876,8 +876,6 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
 err_teardown_compat:
 	if (config->compat)
 		nvmem_sysfs_remove_compat(nvmem, config);
-err_device_del:
-	device_del(&nvmem->dev);
 err_put_device:
 	put_device(&nvmem->dev);

--
2.30.2
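
The ordering rule the patch applies (finish setup, then publish) can be
illustrated with a small userspace sketch. Everything below is a
hypothetical toy model, not kernel code: the toy_* types and functions
are invented for illustration, and the locking a real registry would
need is left out. It only shows that a consumer cannot observe a
half-built device if publication is the last step.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy types; these are NOT the kernel's nvmem structures. */
struct toy_cell {
	const char *name;
	unsigned int value;
};

struct toy_device {
	const char *name;
	struct toy_cell cells[4];
	size_t ncells;
	struct toy_device *next;	/* link in the global registry */
};

/* Only devices on this list are visible to consumers. */
static struct toy_device *toy_registry;

/* Step 1: private setup. The device is not reachable by lookups yet. */
static void toy_add_cell(struct toy_device *dev, const char *name,
			 unsigned int value)
{
	if (dev->ncells < 4) {
		dev->cells[dev->ncells].name = name;
		dev->cells[dev->ncells].value = value;
		dev->ncells++;
	}
}

/* Step 2: publish. This comes last, once the device is fully set up. */
static void toy_publish(struct toy_device *dev)
{
	dev->next = toy_registry;
	toy_registry = dev;
}

/* Consumer: returns 0 on success, -1 if the device or cell is missing. */
static int toy_cell_read(const char *devname, const char *cellname,
			 unsigned int *out)
{
	struct toy_device *dev;
	size_t i;

	for (dev = toy_registry; dev; dev = dev->next) {
		if (strcmp(dev->name, devname) != 0)
			continue;
		for (i = 0; i < dev->ncells; i++) {
			if (strcmp(dev->cells[i].name, cellname) == 0) {
				*out = dev->cells[i].value;
				return 0;
			}
		}
		return -1;	/* device is visible but the cell is not */
	}
	return -1;		/* device not published */
}
```

If toy_publish() were called before toy_add_cell(), a lookup in between
would find the device but not the cell, which is the toy analogue of the
-ENOENT window described in the commit message.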
On 03/01/2023 09:42, Russell King (Oracle) wrote:
> The i.MX6 CPU frequency driver sometimes fails to register at boot time
> due to nvmem_cell_read_u32() sporadically returning -ENOENT.
>
> This happens because there is a window where __nvmem_device_get() in
> of_nvmem_cell_get() is able to return the nvmem device, but as cells
> have not yet been set up, nvmem_find_cell_entry_by_node() returns NULL.
>
> This occurs because the nvmem core registration code violates one of the
> fundamental principles of kernel programming: do not publish data
> structures before their setup is complete.
>
> Fix this by making nvmem core code conform with this principle.
>
how about a Fixes tag and Cc stable?
> Signed-off-by: Russell King (Oracle) <[email protected]>
> ---
> drivers/nvmem/core.c | 18 ++++++++----------
> 1 file changed, 8 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c
> index 321d7d63e068..6b89fb6fa582 100644
> --- a/drivers/nvmem/core.c
> +++ b/drivers/nvmem/core.c
> @@ -835,22 +835,16 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
> nvmem->dev.groups = nvmem_dev_groups;
> #endif
>
> - dev_dbg(&nvmem->dev, "Registering nvmem device %s\n", config->name);
> -
> - rval = device_register(&nvmem->dev);
> - if (rval)
> - goto err_put_device;
> -
> if (nvmem->nkeepout) {
> rval = nvmem_validate_keepouts(nvmem);
> if (rval)
> - goto err_device_del;
> + goto err_put_device;
AFAIU, as we never did a get_device()/kobject_get(), calling put_device()
at this point will not invoke the release callback, which can potentially
leak both the nvmem structure and its ida entry.
--srini
> }
>
> if (config->compat) {
> rval = nvmem_sysfs_setup_compat(nvmem, config);
> if (rval)
> - goto err_device_del;
> + goto err_put_device;
> }
>
> if (config->cells) {
> @@ -867,6 +861,12 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
> if (rval)
> goto err_remove_cells;
>
> + dev_dbg(&nvmem->dev, "Registering nvmem device %s\n", config->name);
> +
> + rval = device_register(&nvmem->dev);
> + if (rval)
> + goto err_remove_cells;
> +
> blocking_notifier_call_chain(&nvmem_notifier, NVMEM_ADD, nvmem);
>
> return nvmem;
> @@ -876,8 +876,6 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
> err_teardown_compat:
> if (config->compat)
> nvmem_sysfs_remove_compat(nvmem, config);
> -err_device_del:
> - device_del(&nvmem->dev);
> err_put_device:
> put_device(&nvmem->dev);
>
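
The put_device() concern raised in the review comment above can be
pictured with a toy reference count. This is a simplified, hypothetical
model (the toy_* names are invented here, and it is not how the kernel's
kobject code is written): the release callback runs only when an
initialised count drops to zero. In the kernel, device_register() is
device_initialize() followed by device_add(), so moving device_register()
later also moves the point at which the reference count is initialised.

```c
#include <stdbool.h>

/* Hypothetical toy object; NOT the kernel's struct device or kobject. */
struct toy_obj {
	int refcount;
	bool initialized;
	void (*release)(struct toy_obj *obj);
};

static int toy_released;	/* counts how often release has run */

static void toy_release(struct toy_obj *obj)
{
	(void)obj;
	toy_released++;
}

/* Analogue of initialising the object: the count starts at 1. */
static void toy_init(struct toy_obj *obj,
		     void (*release)(struct toy_obj *obj))
{
	obj->refcount = 1;
	obj->initialized = true;
	obj->release = release;
}

/* Drop one reference; returns true if the release callback ran. */
static bool toy_put(struct toy_obj *obj)
{
	if (!obj->initialized || obj->refcount <= 0)
		return false;	/* nothing to drop, release never runs */

	if (--obj->refcount == 0) {
		obj->release(obj);
		return true;
	}
	return false;
}
```

In this model, an error path that calls toy_put() before toy_init() frees
nothing, so resources tied to the release callback would leak.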
On Tue, Jan 03, 2023 at 11:30:36AM +0000, Srinivas Kandagatla wrote:
>
>
> On 03/01/2023 09:42, Russell King (Oracle) wrote:
> > The i.MX6 CPU frequency driver sometimes fails to register at boot time
> > due to nvmem_cell_read_u32() sporadically returning -ENOENT.
> >
> > This happens because there is a window where __nvmem_device_get() in
> > of_nvmem_cell_get() is able to return the nvmem device, but as cells
> > have not yet been set up, nvmem_find_cell_entry_by_node() returns NULL.
> >
> > This occurs because the nvmem core registration code violates one of the
> > fundamental principles of kernel programming: do not publish data
> > structures before their setup is complete.
> >
> > Fix this by making nvmem core code conform with this principle.
> >
> how about a Fixes tag and Cc stable?
Which commit do you suggest? This error goes all the way back to the
inception of nvmem, commit
eace75cfdcf7 ("nvmem: Add a simple NVMEM framework for nvmem providers")
but clearly it's going to be a lot of effort to backport it all the
way due to all the changes.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
On 03/01/2023 11:46, Russell King (Oracle) wrote:
> On Tue, Jan 03, 2023 at 11:30:36AM +0000, Srinivas Kandagatla wrote:
>>
>>
>> On 03/01/2023 09:42, Russell King (Oracle) wrote:
>>> The i.MX6 CPU frequency driver sometimes fails to register at boot time
>>> due to nvmem_cell_read_u32() sporadically returning -ENOENT.
>>>
>>> This happens because there is a window where __nvmem_device_get() in
>>> of_nvmem_cell_get() is able to return the nvmem device, but as cells
>>> have not yet been set up, nvmem_find_cell_entry_by_node() returns NULL.
>>>
>>> This occurs because the nvmem core registration code violates one of the
>>> fundamental principles of kernel programming: do not publish data
>>> structures before their setup is complete.
>>>
>>> Fix this by making nvmem core code conform with this principle.
>>>
>> how about a Fixes tag and Cc stable?
>
> Which commit do you suggest? This error goes all the way back to the
> inception of nvmem, commit
>
> eace75cfdcf7 ("nvmem: Add a simple NVMEM framework for nvmem providers")
>
> but clearly it's going to be a lot of effort to backport it all the
> way due to all the changes.
I understand the backport issue. On the other hand, as this is a real
issue, backporting to at least the stable kernels would be worthwhile.
--srini
>
On Tue, Jan 03, 2023 at 12:42:49PM +0000, Srinivas Kandagatla wrote:
>
>
> On 03/01/2023 11:46, Russell King (Oracle) wrote:
> > On Tue, Jan 03, 2023 at 11:30:36AM +0000, Srinivas Kandagatla wrote:
> > >
> > >
> > > On 03/01/2023 09:42, Russell King (Oracle) wrote:
> > > > The i.MX6 CPU frequency driver sometimes fails to register at boot time
> > > > due to nvmem_cell_read_u32() sporadically returning -ENOENT.
> > > >
> > > > This happens because there is a window where __nvmem_device_get() in
> > > > of_nvmem_cell_get() is able to return the nvmem device, but as cells
> > > > have not yet been set up, nvmem_find_cell_entry_by_node() returns NULL.
> > > >
> > > > This occurs because the nvmem core registration code violates one of the
> > > > fundamental principles of kernel programming: do not publish data
> > > > structures before their setup is complete.
> > > >
> > > > Fix this by making nvmem core code conform with this principle.
> > > >
> > > how about a Fixes tag and Cc stable?
> >
> > Which commit do you suggest? This error goes all the way back to the
> > inception of nvmem, commit
> >
> > eace75cfdcf7 ("nvmem: Add a simple NVMEM framework for nvmem providers")
> >
> > but clearly it's going to be a lot of effort to backport it all the
> > way due to all the changes.
>
> I understand the backport issue. On the other hand, as this is a real
> issue, backporting to at least the stable kernels would be worthwhile.
I'll add this commit as a fixes tag, but I don't have the ability to
test backports of this, since the use of nvmem on imx6 platforms is
relatively recent. How do you suggest we end up with tested backports
for stable trees?