2021-12-21 15:46:00

by Rafał Miłecki

[permalink] [raw]
Subject: [PATCH] nvmem: fix unregistering device in nvmem_register() error path

From: Rafał Miłecki <[email protected]>

1. Drop incorrect put_device() calls

If device_register() fails then underlaying device_add() takes care of
calling put_device() if needed. There is no need to do that in a driver.

2. Use device_unregister()

Now that we don't call put_device() we can use above helper.

Fixes: 3360acdf8391 ("nvmem: core: fix leaks on registration errors")
Cc: Johan Hovold <[email protected]>
Signed-off-by: Rafał Miłecki <[email protected]>
---
That put_device() was explicitly added by Johan but after checking
device_register() twice I still think it's incorrect. I hope I didn't
miss sth obvious and I didn't mess it up.
---
drivers/nvmem/core.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c
index 785a56e33f69..f7f31af7226f 100644
--- a/drivers/nvmem/core.c
+++ b/drivers/nvmem/core.c
@@ -901,12 +901,12 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)

rval = device_register(&nvmem->dev);
if (rval)
- goto err_put_device;
+ return ERR_PTR(rval);

if (config->compat) {
rval = nvmem_sysfs_setup_compat(nvmem, config);
if (rval)
- goto err_device_del;
+ goto err_device_unreg;
}

if (config->cells) {
@@ -932,10 +932,8 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
err_teardown_compat:
if (config->compat)
nvmem_sysfs_remove_compat(nvmem, config);
-err_device_del:
- device_del(&nvmem->dev);
-err_put_device:
- put_device(&nvmem->dev);
+err_device_unreg:
+ device_unregister(&nvmem->dev);

return ERR_PTR(rval);
}
--
2.31.1



2021-12-21 16:06:30

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] nvmem: fix unregistering device in nvmem_register() error path

On Tue, Dec 21, 2021 at 04:45:50PM +0100, Rafał Miłecki wrote:
> From: Rafał Miłecki <[email protected]>
>
> 1. Drop incorrect put_device() calls
>
> If device_register() fails then underlaying device_add() takes care of
> calling put_device() if needed. There is no need to do that in a driver.

Did you read the documentation for device_register() that says:

* NOTE: _Never_ directly free @dev after calling this function, even
* if it returned an error! Always use put_device() to give up the
* reference initialized in this function instead.

> 2. Use device_unregister()
>
> Now that we don't call put_device() we can use above helper.
>
> Fixes: 3360acdf8391 ("nvmem: core: fix leaks on registration errors")
> Cc: Johan Hovold <[email protected]>
> Signed-off-by: Rafał Miłecki <[email protected]>
> ---
> That put_device() was explicitly added by Johan but after checking
> device_register() twice I still think it's incorrect. I hope I didn't
> miss sth obvious and I didn't mess it up.
> ---
> drivers/nvmem/core.c | 10 ++++------
> 1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c
> index 785a56e33f69..f7f31af7226f 100644
> --- a/drivers/nvmem/core.c
> +++ b/drivers/nvmem/core.c
> @@ -901,12 +901,12 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
>
> rval = device_register(&nvmem->dev);
> if (rval)
> - goto err_put_device;
> + return ERR_PTR(rval);

Where do you call put_device() to free the allocated memory?

You just leaked the kzalloc() call to allocate the memory pointed to by
nvmem :(

I think the code is fine as-is.

thanks,

greg k-h

2021-12-21 17:46:09

by Rafał Miłecki

[permalink] [raw]
Subject: Re: [PATCH] nvmem: fix unregistering device in nvmem_register() error path

On 21.12.2021 17:06, Greg Kroah-Hartman wrote:
> On Tue, Dec 21, 2021 at 04:45:50PM +0100, Rafał Miłecki wrote:
>> From: Rafał Miłecki <[email protected]>
>>
>> 1. Drop incorrect put_device() calls
>>
>> If device_register() fails then underlaying device_add() takes care of
>> calling put_device() if needed. There is no need to do that in a driver.
>
> Did you read the documentation for device_register() that says:
>
> * NOTE: _Never_ directly free @dev after calling this function, even
> * if it returned an error! Always use put_device() to give up the
> * reference initialized in this function instead.

I clearly tried to be too smart and ignored documentation.

I'd say device_add() behaviour is rather uncommon and a bit unintuitive.
Most kernel functions are safe to assume to do nothing that requires
cleanup if they fail.

E.g. if I call platform_device_register() and it fails I don't need to
call anything like platform_device_put(). I just free previously
allocated memory.

When calling device_register() / device_add() it seems device always
gets partially registered (even if it fails!). Enough to make it safe to
depend on core subsystem calling .release() after device_put().

So what initially looks like unbalanced device_put() call is actually
some device_add() specific magic behaviour ;)

Sorry. I should have checked documentation before posting patches.
That's not my best day.


>> 2. Use device_unregister()
>>
>> Now that we don't call put_device() we can use above helper.
>>
>> Fixes: 3360acdf8391 ("nvmem: core: fix leaks on registration errors")
>> Cc: Johan Hovold <[email protected]>
>> Signed-off-by: Rafał Miłecki <[email protected]>
>> ---
>> That put_device() was explicitly added by Johan but after checking
>> device_register() twice I still think it's incorrect. I hope I didn't
>> miss sth obvious and I didn't mess it up.
>> ---
>> drivers/nvmem/core.c | 10 ++++------
>> 1 file changed, 4 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c
>> index 785a56e33f69..f7f31af7226f 100644
>> --- a/drivers/nvmem/core.c
>> +++ b/drivers/nvmem/core.c
>> @@ -901,12 +901,12 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
>>
>> rval = device_register(&nvmem->dev);
>> if (rval)
>> - goto err_put_device;
>> + return ERR_PTR(rval);
>
> Where do you call put_device() to free the allocated memory?
>
> You just leaked the kzalloc() call to allocate the memory pointed to by
> nvmem :(
>
> I think the code is fine as-is.

Yeah, I forgot about:

ida_free(&nvmem_ida, nvmem->id);
kfree(nvmem);

2021-12-22 07:44:50

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] nvmem: fix unregistering device in nvmem_register() error path

On Tue, Dec 21, 2021 at 06:46:01PM +0100, Rafał Miłecki wrote:
> On 21.12.2021 17:06, Greg Kroah-Hartman wrote:
> > On Tue, Dec 21, 2021 at 04:45:50PM +0100, Rafał Miłecki wrote:
> > > From: Rafał Miłecki <[email protected]>
> > >
> > > 1. Drop incorrect put_device() calls
> > >
> > > If device_register() fails then underlaying device_add() takes care of
> > > calling put_device() if needed. There is no need to do that in a driver.
> >
> > Did you read the documentation for device_register() that says:
> >
> > * NOTE: _Never_ directly free @dev after calling this function, even
> > * if it returned an error! Always use put_device() to give up the
> > * reference initialized in this function instead.
>
> I clearly tried to be too smart and ignored documentation.
>
> I'd say device_add() behaviour is rather uncommon and a bit unintuitive.
> Most kernel functions are safe to assume to do nothing that requires
> cleanup if they fail.
>
> E.g. if I call platform_device_register() and it fails I don't need to
> call anything like platform_device_put(). I just free previously
> allocated memory.

And that is wrong.

{sigh}

Seems the author of that function did not read the documentation. I'll
add "fix platform_device_register()" to my long TODO list. Arguably, it
should handle this type of failure internally to it, to prevent all
individual drivers from having to handle it.

You also need to handle this type of functionality in your bus call, and
you do, which is good as you do not want everyone who calls
nvmem_register() to also have to do much the same thing.

> When calling device_register() / device_add() it seems device always
> gets partially registered (even if it fails!). Enough to make it safe to
> depend on core subsystem calling .release() after device_put().
>
> So what initially looks like unbalanced device_put() call is actually
> some device_add() specific magic behaviour ;)

It's documented magic behavior :)

thanks,

greg k-h

2021-12-22 08:38:41

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH] nvmem: fix unregistering device in nvmem_register() error path

On Wed, Dec 22, 2021 at 08:44:44AM +0100, Greg Kroah-Hartman wrote:
> On Tue, Dec 21, 2021 at 06:46:01PM +0100, Rafał Miłecki wrote:
> > On 21.12.2021 17:06, Greg Kroah-Hartman wrote:
> > > On Tue, Dec 21, 2021 at 04:45:50PM +0100, Rafał Miłecki wrote:
> > > > From: Rafał Miłecki <[email protected]>
> > > >
> > > > 1. Drop incorrect put_device() calls
> > > >
> > > > If device_register() fails then underlaying device_add() takes care of
> > > > calling put_device() if needed. There is no need to do that in a driver.
> > >
> > > Did you read the documentation for device_register() that says:
> > >
> > > * NOTE: _Never_ directly free @dev after calling this function, even
> > > * if it returned an error! Always use put_device() to give up the
> > > * reference initialized in this function instead.
> >
> > I clearly tried to be too smart and ignored documentation.
> >
> > I'd say device_add() behaviour is rather uncommon and a bit unintuitive.
> > Most kernel functions are safe to assume to do nothing that requires
> > cleanup if they fail.
> >
> > E.g. if I call platform_device_register() and it fails I don't need to
> > call anything like platform_device_put(). I just free previously
> > allocated memory.
>
> And that is wrong.

It seems Rafał is mistaken here too; you certainly need to call
platform_device_put() if platform_device_register() fail, even if many
current users do appear to get this wrong.

Johan

2021-12-22 08:56:44

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] nvmem: fix unregistering device in nvmem_register() error path

On Wed, Dec 22, 2021 at 09:38:27AM +0100, Johan Hovold wrote:
> On Wed, Dec 22, 2021 at 08:44:44AM +0100, Greg Kroah-Hartman wrote:
> > On Tue, Dec 21, 2021 at 06:46:01PM +0100, Rafał Miłecki wrote:
> > > On 21.12.2021 17:06, Greg Kroah-Hartman wrote:
> > > > On Tue, Dec 21, 2021 at 04:45:50PM +0100, Rafał Miłecki wrote:
> > > > > From: Rafał Miłecki <[email protected]>
> > > > >
> > > > > 1. Drop incorrect put_device() calls
> > > > >
> > > > > If device_register() fails then underlaying device_add() takes care of
> > > > > calling put_device() if needed. There is no need to do that in a driver.
> > > >
> > > > Did you read the documentation for device_register() that says:
> > > >
> > > > * NOTE: _Never_ directly free @dev after calling this function, even
> > > > * if it returned an error! Always use put_device() to give up the
> > > > * reference initialized in this function instead.
> > >
> > > I clearly tried to be too smart and ignored documentation.
> > >
> > > I'd say device_add() behaviour is rather uncommon and a bit unintuitive.
> > > Most kernel functions are safe to assume to do nothing that requires
> > > cleanup if they fail.
> > >
> > > E.g. if I call platform_device_register() and it fails I don't need to
> > > call anything like platform_device_put(). I just free previously
> > > allocated memory.
> >
> > And that is wrong.
>
> It seems Rafał is mistaken here too; you certainly need to call
> platform_device_put() if platform_device_register() fail, even if many
> current users do appear to get this wrong.

A short search found almost everyone getting this wrong. Arguably
platform_device_register() can clean up properly on its own if we want
it to do so. Will take a lot of auditing of the current codebase first
to see if it's safe...

thanks,

greg k-h

2021-12-22 09:00:10

by Rafał Miłecki

[permalink] [raw]
Subject: Re: [PATCH] nvmem: fix unregistering device in nvmem_register() error path

On 22.12.2021 09:38, Johan Hovold wrote:
> On Wed, Dec 22, 2021 at 08:44:44AM +0100, Greg Kroah-Hartman wrote:
>> On Tue, Dec 21, 2021 at 06:46:01PM +0100, Rafał Miłecki wrote:
>>> On 21.12.2021 17:06, Greg Kroah-Hartman wrote:
>>>> On Tue, Dec 21, 2021 at 04:45:50PM +0100, Rafał Miłecki wrote:
>>>>> From: Rafał Miłecki <[email protected]>
>>>>>
>>>>> 1. Drop incorrect put_device() calls
>>>>>
>>>>> If device_register() fails then underlaying device_add() takes care of
>>>>> calling put_device() if needed. There is no need to do that in a driver.
>>>>
>>>> Did you read the documentation for device_register() that says:
>>>>
>>>> * NOTE: _Never_ directly free @dev after calling this function, even
>>>> * if it returned an error! Always use put_device() to give up the
>>>> * reference initialized in this function instead.
>>>
>>> I clearly tried to be too smart and ignored documentation.
>>>
>>> I'd say device_add() behaviour is rather uncommon and a bit unintuitive.
>>> Most kernel functions are safe to assume to do nothing that requires
>>> cleanup if they fail.
>>>
>>> E.g. if I call platform_device_register() and it fails I don't need to
>>> call anything like platform_device_put(). I just free previously
>>> allocated memory.
>>
>> And that is wrong.
>
> It seems Rafał is mistaken here too; you certainly need to call
> platform_device_put() if platform_device_register() fail, even if many
> current users do appear to get this wrong.

Yes I was! Gosh I made up that "platform_device_put()" name and only
now I realized it actually exists!

I stand by saying this design is really misleading. Even though
platform_device_put() was obviously a bad example.

Please remember I'm just a minor kernel developer however in my humble
opinion behaviour of device_register() and platform_device_register()
should be changed.

If any function fails I expect:
1. That function to clean up its mess if any
2. Me to be responsible to clean up my mess if any

This is how "most" code (whatever it means) works.
1. If POSIX snprintf() fails I'm not expected to call *printf_put() sth
2. If POSIX bind() fails I'm not expected to call bind_put() sth
3. (...)

I'm not sure if those are the best examples but you should get my point.

2021-12-22 09:02:52

by Rafał Miłecki

[permalink] [raw]
Subject: Re: [PATCH] nvmem: fix unregistering device in nvmem_register() error path

On 22.12.2021 09:56, Greg Kroah-Hartman wrote:
> On Wed, Dec 22, 2021 at 09:38:27AM +0100, Johan Hovold wrote:
>> On Wed, Dec 22, 2021 at 08:44:44AM +0100, Greg Kroah-Hartman wrote:
>>> On Tue, Dec 21, 2021 at 06:46:01PM +0100, Rafał Miłecki wrote:
>>>> On 21.12.2021 17:06, Greg Kroah-Hartman wrote:
>>>>> On Tue, Dec 21, 2021 at 04:45:50PM +0100, Rafał Miłecki wrote:
>>>>>> From: Rafał Miłecki <[email protected]>
>>>>>>
>>>>>> 1. Drop incorrect put_device() calls
>>>>>>
>>>>>> If device_register() fails then underlaying device_add() takes care of
>>>>>> calling put_device() if needed. There is no need to do that in a driver.
>>>>>
>>>>> Did you read the documentation for device_register() that says:
>>>>>
>>>>> * NOTE: _Never_ directly free @dev after calling this function, even
>>>>> * if it returned an error! Always use put_device() to give up the
>>>>> * reference initialized in this function instead.
>>>>
>>>> I clearly tried to be too smart and ignored documentation.
>>>>
>>>> I'd say device_add() behaviour is rather uncommon and a bit unintuitive.
>>>> Most kernel functions are safe to assume to do nothing that requires
>>>> cleanup if they fail.
>>>>
>>>> E.g. if I call platform_device_register() and it fails I don't need to
>>>> call anything like platform_device_put(). I just free previously
>>>> allocated memory.
>>>
>>> And that is wrong.
>>
>> It seems Rafał is mistaken here too; you certainly need to call
>> platform_device_put() if platform_device_register() fail, even if many
>> current users do appear to get this wrong.
>
> A short search found almost everyone getting this wrong. Arguably
> platform_device_register() can clean up properly on its own if we want
> it to do so. Will take a lot of auditing of the current codebase first
> to see if it's safe...

If so many people get it wrong maybe that is indded an unintuitive
design?

I'll better hide now ;)

2021-12-22 09:04:03

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH] nvmem: fix unregistering device in nvmem_register() error path

On Wed, Dec 22, 2021 at 09:56:29AM +0100, Greg Kroah-Hartman wrote:
> On Wed, Dec 22, 2021 at 09:38:27AM +0100, Johan Hovold wrote:
> > On Wed, Dec 22, 2021 at 08:44:44AM +0100, Greg Kroah-Hartman wrote:
> > > On Tue, Dec 21, 2021 at 06:46:01PM +0100, Rafał Miłecki wrote:
> > > > On 21.12.2021 17:06, Greg Kroah-Hartman wrote:
> > > > > On Tue, Dec 21, 2021 at 04:45:50PM +0100, Rafał Miłecki wrote:
> > > > > > From: Rafał Miłecki <[email protected]>
> > > > > >
> > > > > > 1. Drop incorrect put_device() calls
> > > > > >
> > > > > > If device_register() fails then underlaying device_add() takes care of
> > > > > > calling put_device() if needed. There is no need to do that in a driver.
> > > > >
> > > > > Did you read the documentation for device_register() that says:
> > > > >
> > > > > * NOTE: _Never_ directly free @dev after calling this function, even
> > > > > * if it returned an error! Always use put_device() to give up the
> > > > > * reference initialized in this function instead.
> > > >
> > > > I clearly tried to be too smart and ignored documentation.
> > > >
> > > > I'd say device_add() behaviour is rather uncommon and a bit unintuitive.
> > > > Most kernel functions are safe to assume to do nothing that requires
> > > > cleanup if they fail.
> > > >
> > > > E.g. if I call platform_device_register() and it fails I don't need to
> > > > call anything like platform_device_put(). I just free previously
> > > > allocated memory.
> > >
> > > And that is wrong.
> >
> > It seems Rafał is mistaken here too; you certainly need to call
> > platform_device_put() if platform_device_register() fail, even if many
> > current users do appear to get this wrong.
>
> A short search found almost everyone getting this wrong. Arguably
> platform_device_register() can clean up properly on its own if we want
> it to do so. Will take a lot of auditing of the current codebase first
> to see if it's safe...

Right, but I found at least a couple of callers getting it it right, so
changing the behaviour now risks introducing a double free (which is
worse than a memleak on registration failure). But yeah, a careful
review might suffice.

Johan

2021-12-22 09:08:57

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH] nvmem: fix unregistering device in nvmem_register() error path

On Wed, Dec 22, 2021 at 10:00:03AM +0100, Rafał Miłecki wrote:
> On 22.12.2021 09:38, Johan Hovold wrote:

> > It seems Rafał is mistaken here too; you certainly need to call
> > platform_device_put() if platform_device_register() fail, even if many
> > current users do appear to get this wrong.
>
> Yes I was! Gosh I made up that "platform_device_put()" name and only
> now I realized it actually exists!
>
> I stand by saying this design is really misleading. Even though
> platform_device_put() was obviously a bad example.
>
> Please remember I'm just a minor kernel developer however in my humble
> opinion behaviour of device_register() and platform_device_register()
> should be changed.
>
> If any function fails I expect:
> 1. That function to clean up its mess if any
> 2. Me to be responsible to clean up my mess if any
>
> This is how "most" code (whatever it means) works.
> 1. If POSIX snprintf() fails I'm not expected to call *printf_put() sth
> 2. If POSIX bind() fails I'm not expected to call bind_put() sth
> 3. (...)
>
> I'm not sure if those are the best examples but you should get my point.

Yes, and we all agree that it's not the best interface. But it exists,
and changing it now risks introducing worse problem than a minor, mostly
theoretical, memleak.

Johan

2021-12-22 09:25:12

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH] nvmem: fix unregistering device in nvmem_register() error path

On Wed, Dec 22, 2021 at 10:03:17AM +0100, Johan Hovold wrote:
> On Wed, Dec 22, 2021 at 09:56:29AM +0100, Greg Kroah-Hartman wrote:
> > On Wed, Dec 22, 2021 at 09:38:27AM +0100, Johan Hovold wrote:
> > > On Wed, Dec 22, 2021 at 08:44:44AM +0100, Greg Kroah-Hartman wrote:
> > > > On Tue, Dec 21, 2021 at 06:46:01PM +0100, Rafał Miłecki wrote:
> > > > > On 21.12.2021 17:06, Greg Kroah-Hartman wrote:
> > > > > > On Tue, Dec 21, 2021 at 04:45:50PM +0100, Rafał Miłecki wrote:
> > > > > > > From: Rafał Miłecki <[email protected]>
> > > > > > >
> > > > > > > 1. Drop incorrect put_device() calls
> > > > > > >
> > > > > > > If device_register() fails then underlaying device_add() takes care of
> > > > > > > calling put_device() if needed. There is no need to do that in a driver.
> > > > > >
> > > > > > Did you read the documentation for device_register() that says:
> > > > > >
> > > > > > * NOTE: _Never_ directly free @dev after calling this function, even
> > > > > > * if it returned an error! Always use put_device() to give up the
> > > > > > * reference initialized in this function instead.
> > > > >
> > > > > I clearly tried to be too smart and ignored documentation.
> > > > >
> > > > > I'd say device_add() behaviour is rather uncommon and a bit unintuitive.
> > > > > Most kernel functions are safe to assume to do nothing that requires
> > > > > cleanup if they fail.
> > > > >
> > > > > E.g. if I call platform_device_register() and it fails I don't need to
> > > > > call anything like platform_device_put(). I just free previously
> > > > > allocated memory.
> > > >
> > > > And that is wrong.
> > >
> > > It seems Rafał is mistaken here too; you certainly need to call
> > > platform_device_put() if platform_device_register() fail, even if many
> > > current users do appear to get this wrong.
> >
> > A short search found almost everyone getting this wrong. Arguably
> > platform_device_register() can clean up properly on its own if we want
> > it to do so. Will take a lot of auditing of the current codebase first
> > to see if it's safe...
>
> Right, but I found at least a couple of callers getting it it right, so
> changing the behaviour now risks introducing a double free (which is
> worse than a memleak on registration failure). But yeah, a careful
> review might suffice.

Actually, I'm not sure we can (should) change
platform_device_register(). The platform device has been allocated by
the caller and it would be quite counterintuitive to have the
registration function deallocate that memory if registration fails.

Heh, we even have statically allocated structures being registered with
this function and we certainly don't want the helper to try to free
those.

Johan

2021-12-22 09:27:12

by Johan Hovold

[permalink] [raw]
Subject: Re: [PATCH] nvmem: fix unregistering device in nvmem_register() error path

On Wed, Dec 22, 2021 at 10:16:20AM +0100, Rafał Miłecki wrote:
> On 22.12.2021 10:08, Johan Hovold wrote:
> > On Wed, Dec 22, 2021 at 10:00:03AM +0100, Rafał Miłecki wrote:
> >> On 22.12.2021 09:38, Johan Hovold wrote:
> >
> >>> It seems Rafał is mistaken here too; you certainly need to call
> >>> platform_device_put() if platform_device_register() fail, even if many
> >>> current users do appear to get this wrong.
> >>
> >> Yes I was! Gosh I made up that "platform_device_put()" name and only
> >> now I realized it actually exists!
> >>
> >> I stand by saying this design is really misleading. Even though
> >> platform_device_put() was obviously a bad example.
> >>
> >> Please remember I'm just a minor kernel developer however in my humble
> >> opinion behaviour of device_register() and platform_device_register()
> >> should be changed.
> >>
> >> If any function fails I expect:
> >> 1. That function to clean up its mess if any
> >> 2. Me to be responsible to clean up my mess if any
> >>
> >> This is how "most" code (whatever it means) works.
> >> 1. If POSIX snprintf() fails I'm not expected to call *printf_put() sth
> >> 2. If POSIX bind() fails I'm not expected to call bind_put() sth
> >> 3. (...)
> >>
> >> I'm not sure if those are the best examples but you should get my point.
> >
> > Yes, and we all agree that it's not the best interface. But it exists,
> > and changing it now risks introducing worse problem than a minor, mostly
> > theoretical, memleak.
>
> Thanks for confirming that, I was wondering if it's just my mind that
> doesn't find this design clear enough.
>
> Now, assuming this design isn't perfect and some purists would like it
> cleaned up:
>
> Would that make sense to introduce something like
> 1. device_register2() / device_add2()
> and
> 2. platform_device_register2() / platform_device_add2()
>
> that would *not* require calling *_put() on failure? Then start
> converting existing drivers to those new (clearner?) helpers?

Nah, let's not add more helpers. Also see my last reply to Greg about
why the registration helper cannot free object being registered.

device_initialize() is special, and everyone just needs to learn that.

Johan

2021-12-22 09:29:30

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] nvmem: fix unregistering device in nvmem_register() error path

On Wed, Dec 22, 2021 at 10:00:03AM +0100, Rafał Miłecki wrote:
> On 22.12.2021 09:38, Johan Hovold wrote:
> > On Wed, Dec 22, 2021 at 08:44:44AM +0100, Greg Kroah-Hartman wrote:
> > > On Tue, Dec 21, 2021 at 06:46:01PM +0100, Rafał Miłecki wrote:
> > > > On 21.12.2021 17:06, Greg Kroah-Hartman wrote:
> > > > > On Tue, Dec 21, 2021 at 04:45:50PM +0100, Rafał Miłecki wrote:
> > > > > > From: Rafał Miłecki <[email protected]>
> > > > > >
> > > > > > 1. Drop incorrect put_device() calls
> > > > > >
> > > > > > If device_register() fails then underlaying device_add() takes care of
> > > > > > calling put_device() if needed. There is no need to do that in a driver.
> > > > >
> > > > > Did you read the documentation for device_register() that says:
> > > > >
> > > > > * NOTE: _Never_ directly free @dev after calling this function, even
> > > > > * if it returned an error! Always use put_device() to give up the
> > > > > * reference initialized in this function instead.
> > > >
> > > > I clearly tried to be too smart and ignored documentation.
> > > >
> > > > I'd say device_add() behaviour is rather uncommon and a bit unintuitive.
> > > > Most kernel functions are safe to assume to do nothing that requires
> > > > cleanup if they fail.
> > > >
> > > > E.g. if I call platform_device_register() and it fails I don't need to
> > > > call anything like platform_device_put(). I just free previously
> > > > allocated memory.
> > >
> > > And that is wrong.
> >
> > It seems Rafał is mistaken here too; you certainly need to call
> > platform_device_put() if platform_device_register() fail, even if many
> > current users do appear to get this wrong.
>
> Yes I was! Gosh I made up that "platform_device_put()" name and only
> now I realized it actually exists!
>
> I stand by saying this design is really misleading. Even though
> platform_device_put() was obviously a bad example.
>
> Please remember I'm just a minor kernel developer however in my humble
> opinion behaviour of device_register() and platform_device_register()
> should be changed.
>
> If any function fails I expect:
> 1. That function to clean up its mess if any
> 2. Me to be responsible to clean up my mess if any
>
> This is how "most" code (whatever it means) works.
> 1. If POSIX snprintf() fails I'm not expected to call *printf_put() sth
> 2. If POSIX bind() fails I'm not expected to call bind_put() sth
> 3. (...)
>
> I'm not sure if those are the best examples but you should get my point.

I do understand, and for platform_device_register() I agree with you.

But for device_register() we can not do this as the driver core is not
the "owner" of the structure being passed into it. If you call
device_register() you are bus and you have to know how to handle an
error here as there is usually much more that needs to be done that a
device_put() can not do by the core.

Yes, it's well down on the "Rusty's API usability scale", but it is
documented well and in a number of places for device_register().

platform_device_register() is not documented, and that's not good, so we
should fix it up. Although there's the larger issue of everyone using
static 'struct device' for this which is yet-another-reason I hate the
platform device code.

thanks,

greg k-h

2021-12-22 09:30:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] nvmem: fix unregistering device in nvmem_register() error path

On Wed, Dec 22, 2021 at 10:16:20AM +0100, Rafał Miłecki wrote:
> On 22.12.2021 10:08, Johan Hovold wrote:
> > On Wed, Dec 22, 2021 at 10:00:03AM +0100, Rafał Miłecki wrote:
> > > On 22.12.2021 09:38, Johan Hovold wrote:
> >
> > > > It seems Rafał is mistaken here too; you certainly need to call
> > > > platform_device_put() if platform_device_register() fail, even if many
> > > > current users do appear to get this wrong.
> > >
> > > Yes I was! Gosh I made up that "platform_device_put()" name and only
> > > now I realized it actually exists!
> > >
> > > I stand by saying this design is really misleading. Even though
> > > platform_device_put() was obviously a bad example.
> > >
> > > Please remember I'm just a minor kernel developer however in my humble
> > > opinion behaviour of device_register() and platform_device_register()
> > > should be changed.
> > >
> > > If any function fails I expect:
> > > 1. That function to clean up its mess if any
> > > 2. Me to be responsible to clean up my mess if any
> > >
> > > This is how "most" code (whatever it means) works.
> > > 1. If POSIX snprintf() fails I'm not expected to call *printf_put() sth
> > > 2. If POSIX bind() fails I'm not expected to call bind_put() sth
> > > 3. (...)
> > >
> > > I'm not sure if those are the best examples but you should get my point.
> >
> > Yes, and we all agree that it's not the best interface. But it exists,
> > and changing it now risks introducing worse problem than a minor, mostly
> > theoretical, memleak.
>
> Thanks for confirming that, I was wondering if it's just my mind that
> doesn't find this design clear enough.
>
> Now, assuming this design isn't perfect and some purists would like it
> cleaned up:
>
> Would that make sense to introduce something like
> 1. device_register2() / device_add2()
> and
> 2. platform_device_register2() / platform_device_add2()
>
> that would *not* require calling *_put() on failure? Then start
> converting existing drivers to those new (clearner?) helpers?

See my other response, but no, this is not a good idea.
device_register() is correct as-is, but platform_device_register()
isn't.

thanks,

greg k-h

2021-12-22 09:34:49

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] nvmem: fix unregistering device in nvmem_register() error path

On Wed, Dec 22, 2021 at 10:24:33AM +0100, Johan Hovold wrote:
> On Wed, Dec 22, 2021 at 10:03:17AM +0100, Johan Hovold wrote:
> > On Wed, Dec 22, 2021 at 09:56:29AM +0100, Greg Kroah-Hartman wrote:
> > > On Wed, Dec 22, 2021 at 09:38:27AM +0100, Johan Hovold wrote:
> > > > On Wed, Dec 22, 2021 at 08:44:44AM +0100, Greg Kroah-Hartman wrote:
> > > > > On Tue, Dec 21, 2021 at 06:46:01PM +0100, Rafał Miłecki wrote:
> > > > > > On 21.12.2021 17:06, Greg Kroah-Hartman wrote:
> > > > > > > On Tue, Dec 21, 2021 at 04:45:50PM +0100, Rafał Miłecki wrote:
> > > > > > > > From: Rafał Miłecki <[email protected]>
> > > > > > > >
> > > > > > > > 1. Drop incorrect put_device() calls
> > > > > > > >
> > > > > > > > If device_register() fails then underlaying device_add() takes care of
> > > > > > > > calling put_device() if needed. There is no need to do that in a driver.
> > > > > > >
> > > > > > > Did you read the documentation for device_register() that says:
> > > > > > >
> > > > > > > * NOTE: _Never_ directly free @dev after calling this function, even
> > > > > > > * if it returned an error! Always use put_device() to give up the
> > > > > > > * reference initialized in this function instead.
> > > > > >
> > > > > > I clearly tried to be too smart and ignored documentation.
> > > > > >
> > > > > > I'd say device_add() behaviour is rather uncommon and a bit unintuitive.
> > > > > > Most kernel functions are safe to assume to do nothing that requires
> > > > > > cleanup if they fail.
> > > > > >
> > > > > > E.g. if I call platform_device_register() and it fails I don't need to
> > > > > > call anything like platform_device_put(). I just free previously
> > > > > > allocated memory.
> > > > >
> > > > > And that is wrong.
> > > >
> > > > It seems Rafał is mistaken here too; you certainly need to call
> > > > platform_device_put() if platform_device_register() fail, even if many
> > > > current users do appear to get this wrong.
> > >
> > > A short search found almost everyone getting this wrong. Arguably
> > > platform_device_register() can clean up properly on its own if we want
> > > it to do so. Will take a lot of auditing of the current codebase first
> > > to see if it's safe...
> >
> > Right, but I found at least a couple of callers getting it it right, so
> > changing the behaviour now risks introducing a double free (which is
> > worse than a memleak on registration failure). But yeah, a careful
> > review might suffice.
>
> Actually, I'm not sure we can (should) change
> platform_device_register(). The platform device has been allocated by
> the caller and it would be quite counterintuitive to have the
> registration function deallocate that memory if registration fails.
>
> Heh, we even have statically allocated structures being registered with
> this function and we certainly don't want the helper to try to free
> those.

Yeah, it's a mess. I'll try to look at it this break if things calm
down...

greg k-h

2021-12-22 10:22:46

by Rafał Miłecki

[permalink] [raw]
Subject: Re: [PATCH] nvmem: fix unregistering device in nvmem_register() error path

On 22.12.2021 10:26, Johan Hovold wrote:
> device_initialize() is special, and everyone just needs to learn that.

Understood :) Thank you

2021-12-22 10:35:53

by Rafał Miłecki

[permalink] [raw]
Subject: Re: [PATCH] nvmem: fix unregistering device in nvmem_register() error path

On 22.12.2021 10:08, Johan Hovold wrote:
> On Wed, Dec 22, 2021 at 10:00:03AM +0100, Rafał Miłecki wrote:
>> On 22.12.2021 09:38, Johan Hovold wrote:
>
>>> It seems Rafał is mistaken here too; you certainly need to call
>>> platform_device_put() if platform_device_register() fail, even if many
>>> current users do appear to get this wrong.
>>
>> Yes I was! Gosh I made up that "platform_device_put()" name and only
>> now I realized it actually exists!
>>
>> I stand by saying this design is really misleading. Even though
>> platform_device_put() was obviously a bad example.
>>
>> Please remember I'm just a minor kernel developer however in my humble
>> opinion behaviour of device_register() and platform_device_register()
>> should be changed.
>>
>> If any function fails I expect:
>> 1. That function to clean up its mess if any
>> 2. Me to be responsible to clean up my mess if any
>>
>> This is how "most" code (whatever it means) works.
>> 1. If POSIX snprintf() fails I'm not expected to call *printf_put() sth
>> 2. If POSIX bind() fails I'm not expected to call bind_put() sth
>> 3. (...)
>>
>> I'm not sure if those are the best examples but you should get my point.
>
> Yes, and we all agree that it's not the best interface. But it exists,
> and changing it now risks introducing worse problem than a minor, mostly
> theoretical, memleak.

Thanks for confirming that, I was wondering if it's just my mind that
doesn't find this design clear enough.

Now, assuming this design isn't perfect and some purists would like it
cleaned up:

Would that make sense to introduce something like
1. device_register2() / device_add2()
and
2. platform_device_register2() / platform_device_add2()

that would *not* require calling *_put() on failure? Then start
converting existing drivers to those new (clearner?) helpers?