2023-12-08 08:25:05

by Li Nan

[permalink] [raw]
Subject: [PATCH] scsi: sd: unregister device if device_add_disk() failed in sd_probe()

From: Li Nan <[email protected]>

"if device_add() succeeds, you should call device_del() when you want to
get rid of it."

In sd_probe(), device_add_disk() fails when device_add() has already
succeeded, so change put_device() to device_unregister() to ensure device
resources are released.

Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
Signed-off-by: Li Nan <[email protected]>
---
drivers/scsi/sd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 542a4bbb21bc..d81cbeee06eb 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3736,7 +3736,7 @@ static int sd_probe(struct device *dev)

error = device_add_disk(dev, gd, NULL);
if (error) {
- put_device(&sdkp->disk_dev);
+ device_unregister(&sdkp->disk_dev);
put_disk(gd);
goto out;
}
--
2.39.2


2023-12-22 06:50:11

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH] scsi: sd: unregister device if device_add_disk() failed in sd_probe()

On Fri, Dec 08, 2023 at 04:23:35PM +0800, [email protected] wrote:
> From: Li Nan <[email protected]>
>
> "if device_add() succeeds, you should call device_del() when you want to
> get rid of it."
>
> In sd_probe(), device_add_disk() fails when device_add() has already
> succeeded, so change put_device() to device_unregister() to ensure device
> resources are released.
>
> Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
> Signed-off-by: Li Nan <[email protected]>

Nacked-by: Luis Chamberlain <[email protected]>

> ---
> drivers/scsi/sd.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 542a4bbb21bc..d81cbeee06eb 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -3736,7 +3736,7 @@ static int sd_probe(struct device *dev)
>
> error = device_add_disk(dev, gd, NULL);
> if (error) {
> - put_device(&sdkp->disk_dev);
> + device_unregister(&sdkp->disk_dev);
> put_disk(gd);
> goto out;
> }

This is incorrect, device_unregister() calls:

void device_unregister(struct device *dev)
{
pr_debug("device: '%s': %s\n", dev_name(dev), __func__);
device_del(dev);
put_device(dev);
}

So you're adding what you believe to be a correct missing device_del().
But what you missed is that if device_add_disk() fails then device_add()
did not succeed because the new code we have in the kernel *today* unwinds
this for us now.

What you missed is that in today's code inside device_add_disk(), if
device_add() succeeeds we now unwind and call device_del() for the
device for you. And so, quoting the next sentence you took from
device_add():

"If device_add() has *not* succeeded, use *only* put_device() to drop the
reference count."

Please do reference in the future a crash dump / or explain how you
reached your conclusions if you do not have a crash dump to prove an
issue. Specially if you are suggesting it Fixes a commit.

Luis

2023-12-22 08:27:55

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH] scsi: sd: unregister device if device_add_disk() failed in sd_probe()

Hi,

?? 2023/12/22 14:49, Luis Chamberlain д??:
> On Fri, Dec 08, 2023 at 04:23:35PM +0800, [email protected] wrote:
>> From: Li Nan <[email protected]>
>>
>> "if device_add() succeeds, you should call device_del() when you want to
>> get rid of it."
>>
>> In sd_probe(), device_add_disk() fails when device_add() has already
>> succeeded, so change put_device() to device_unregister() to ensure device
>> resources are released.
>>
>> Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
>> Signed-off-by: Li Nan <[email protected]>
>
> Nacked-by: Luis Chamberlain <[email protected]>
>
>> ---
>> drivers/scsi/sd.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>> index 542a4bbb21bc..d81cbeee06eb 100644
>> --- a/drivers/scsi/sd.c
>> +++ b/drivers/scsi/sd.c
>> @@ -3736,7 +3736,7 @@ static int sd_probe(struct device *dev)
>>
>> error = device_add_disk(dev, gd, NULL);
>> if (error) {
>> - put_device(&sdkp->disk_dev);
>> + device_unregister(&sdkp->disk_dev);
>> put_disk(gd);
>> goto out;
>> }
>
> This is incorrect, device_unregister() calls:
>
> void device_unregister(struct device *dev)
> {
> pr_debug("device: '%s': %s\n", dev_name(dev), __func__);
> device_del(dev);
> put_device(dev);
> }
>
> So you're adding what you believe to be a correct missing device_del().
> But what you missed is that if device_add_disk() fails then device_add()
> did not succeed because the new code we have in the kernel *today* unwinds
> this for us now.

I'm confused here, there are two device here, one is 'sdkp->disk_dev',
one is gendisk->part0->bd_device, and the order in which they
initialize:

sd_probe
device_add(&sdkp->disk_dev) -> succeed
device_add_disk -> failed, and device_add(bd_device) did not succeed
put_device(&sdkp->disk_dev) -> device_del is missed

I don't see that if device_add_disk() fail, device_del() for
'sdkp->disk_dev'is called from anywhere. Do I missing anything?

Thanks,
Kuai

>
> What you missed is that in today's code inside device_add_disk(), if
> device_add() succeeeds we now unwind and call device_del() for the
> device for you. And so, quoting the next sentence you took from
> device_add():
>
> "If device_add() has *not* succeeded, use *only* put_device() to drop the
> reference count."
>
> Please do reference in the future a crash dump / or explain how you
> reached your conclusions if you do not have a crash dump to prove an
> issue. Specially if you are suggesting it Fixes a commit.
>
> Luis
>
> .
>


2024-01-29 13:41:33

by Li Nan

[permalink] [raw]
Subject: Re: [PATCH] scsi: sd: unregister device if device_add_disk() failed in sd_probe()

friendly ping ...

在 2023/12/8 16:23, [email protected] 写道:
> From: Li Nan <[email protected]>
>
> "if device_add() succeeds, you should call device_del() when you want to
> get rid of it."
>
> In sd_probe(), device_add_disk() fails when device_add() has already
> succeeded, so change put_device() to device_unregister() to ensure device
> resources are released.
>
> Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
> Signed-off-by: Li Nan <[email protected]>
> ---
> drivers/scsi/sd.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 542a4bbb21bc..d81cbeee06eb 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -3736,7 +3736,7 @@ static int sd_probe(struct device *dev)
>
> error = device_add_disk(dev, gd, NULL);
> if (error) {
> - put_device(&sdkp->disk_dev);
> + device_unregister(&sdkp->disk_dev);
> put_disk(gd);
> goto out;
> }

--
Thanks,
Nan


2024-01-29 17:46:43

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH] scsi: sd: unregister device if device_add_disk() failed in sd_probe()

On Fri, Dec 22, 2023 at 04:27:16PM +0800, Yu Kuai wrote:
> Hi,
>
> 在 2023/12/22 14:49, Luis Chamberlain 写道:
> > On Fri, Dec 08, 2023 at 04:23:35PM +0800, [email protected] wrote:
> > > From: Li Nan <[email protected]>
> > >
> > > "if device_add() succeeds, you should call device_del() when you want to
> > > get rid of it."
> > >
> > > In sd_probe(), device_add_disk() fails when device_add() has already
> > > succeeded, so change put_device() to device_unregister() to ensure device
> > > resources are released.
> > >
> > > Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
> > > Signed-off-by: Li Nan <[email protected]>
> >
> > Nacked-by: Luis Chamberlain <[email protected]>
> >
> > > ---
> > > drivers/scsi/sd.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> > > index 542a4bbb21bc..d81cbeee06eb 100644
> > > --- a/drivers/scsi/sd.c
> > > +++ b/drivers/scsi/sd.c
> > > @@ -3736,7 +3736,7 @@ static int sd_probe(struct device *dev)
> > > error = device_add_disk(dev, gd, NULL);
> > > if (error) {
> > > - put_device(&sdkp->disk_dev);
> > > + device_unregister(&sdkp->disk_dev);
> > > put_disk(gd);
> > > goto out;
> > > }
> >
> > This is incorrect, device_unregister() calls:
> >
> > void device_unregister(struct device *dev)
> > {
> > pr_debug("device: '%s': %s\n", dev_name(dev), __func__);
> > device_del(dev);
> > put_device(dev);
> > }
> >
> > So you're adding what you believe to be a correct missing device_del().
> > But what you missed is that if device_add_disk() fails then device_add()
> > did not succeed because the new code we have in the kernel *today* unwinds
> > this for us now.
>
> I'm confused here, there are two device here, one is 'sdkp->disk_dev',
> one is gendisk->part0->bd_device, and the order in which they
> initialize:
>
> sd_probe
> device_add(&sdkp->disk_dev) -> succeed
> device_add_disk -> failed, and device_add(bd_device) did not succeed
> put_device(&sdkp->disk_dev) -> device_del is missed
>
> I don't see that if device_add_disk() fail, device_del() for
> 'sdkp->disk_dev'is called from anywhere. Do I missing anything?

Ah then the fix is still incorrect and the commit log should
describe that this is for another device.

How about this instead?

From c3f6e03f4a82aa253b6c487a293dcd576393b606 Mon Sep 17 00:00:00 2001
From: Luis Chamberlain <[email protected]>
Date: Mon, 29 Jan 2024 09:25:18 -0800
Subject: [PATCH] sd: remove extra put_device() for extra scsi device

The sd driver first device_add() its own device, and later use
device_add_disk() with another device. When we added error handling
for device_add_disk() we now call put_disk() and that will trigger
disk_release() when the refcount is 0. That will end up calling
the block driver's disk->fops->free_disk() if one is defined. The
sd driver has scsi_disk_free_disk() as its free_disk() and that
does the proper put_device(&sdkp->disk_dev) for us so we should not
need to call it, however we are left still missing the device_del()
for it.

While at it, unwind with scsi_autopm_put_device(sdp) *prior* to
putting to device as we do in sd_remove().

Reported-by: Li Nan <[email protected]>
Reported-by: Yu Kuai <[email protected]>
Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
Signed-off-by: Luis Chamberlain <[email protected]>
---
drivers/scsi/sd.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 7f949adbadfd..6475a3c947f8 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3693,8 +3693,9 @@ static int sd_probe(struct device *dev)

error = device_add(&sdkp->disk_dev);
if (error) {
+ scsi_autopm_put_device(sdp);
put_device(&sdkp->disk_dev);
- goto out;
+ return error;
}

dev_set_drvdata(dev, sdkp);
@@ -3734,9 +3735,10 @@ static int sd_probe(struct device *dev)

error = device_add_disk(dev, gd, NULL);
if (error) {
- put_device(&sdkp->disk_dev);
+ scsi_autopm_put_device(sdp);
+ device_del(&sdkp->disk_dev);
put_disk(gd);
- goto out;
+ return error;
}

if (sdkp->security) {
--
2.42.0

2024-01-30 01:30:52

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH] scsi: sd: unregister device if device_add_disk() failed in sd_probe()

Hi,

在 2024/01/30 1:46, Luis Chamberlain 写道:
> On Fri, Dec 22, 2023 at 04:27:16PM +0800, Yu Kuai wrote:
>> Hi,
>>
>> 在 2023/12/22 14:49, Luis Chamberlain 写道:
>>> On Fri, Dec 08, 2023 at 04:23:35PM +0800, [email protected] wrote:
>>>> From: Li Nan <[email protected]>
>>>>
>>>> "if device_add() succeeds, you should call device_del() when you want to
>>>> get rid of it."
>>>>
>>>> In sd_probe(), device_add_disk() fails when device_add() has already
>>>> succeeded, so change put_device() to device_unregister() to ensure device
>>>> resources are released.
>>>>
>>>> Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
>>>> Signed-off-by: Li Nan <[email protected]>
>>>
>>> Nacked-by: Luis Chamberlain <[email protected]>
>>>
>>>> ---
>>>> drivers/scsi/sd.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>>>> index 542a4bbb21bc..d81cbeee06eb 100644
>>>> --- a/drivers/scsi/sd.c
>>>> +++ b/drivers/scsi/sd.c
>>>> @@ -3736,7 +3736,7 @@ static int sd_probe(struct device *dev)
>>>> error = device_add_disk(dev, gd, NULL);
>>>> if (error) {
>>>> - put_device(&sdkp->disk_dev);
>>>> + device_unregister(&sdkp->disk_dev);
>>>> put_disk(gd);
>>>> goto out;
>>>> }
>>>
>>> This is incorrect, device_unregister() calls:
>>>
>>> void device_unregister(struct device *dev)
>>> {
>>> pr_debug("device: '%s': %s\n", dev_name(dev), __func__);
>>> device_del(dev);
>>> put_device(dev);
>>> }
>>>
>>> So you're adding what you believe to be a correct missing device_del().
>>> But what you missed is that if device_add_disk() fails then device_add()
>>> did not succeed because the new code we have in the kernel *today* unwinds
>>> this for us now.
>>
>> I'm confused here, there are two device here, one is 'sdkp->disk_dev',
>> one is gendisk->part0->bd_device, and the order in which they
>> initialize:
>>
>> sd_probe
>> device_add(&sdkp->disk_dev) -> succeed
>> device_add_disk -> failed, and device_add(bd_device) did not succeed
>> put_device(&sdkp->disk_dev) -> device_del is missed
>>
>> I don't see that if device_add_disk() fail, device_del() for
>> 'sdkp->disk_dev'is called from anywhere. Do I missing anything?
>
> Ah then the fix is still incorrect and the commit log should
> describe that this is for another device.
>
> How about this instead?
>
>>From c3f6e03f4a82aa253b6c487a293dcd576393b606 Mon Sep 17 00:00:00 2001
> From: Luis Chamberlain <[email protected]>
> Date: Mon, 29 Jan 2024 09:25:18 -0800
> Subject: [PATCH] sd: remove extra put_device() for extra scsi device
>
> The sd driver first device_add() its own device, and later use
> device_add_disk() with another device. When we added error handling
> for device_add_disk() we now call put_disk() and that will trigger
> disk_release() when the refcount is 0. That will end up calling
> the block driver's disk->fops->free_disk() if one is defined. The

This is incorrect. GD_ADDED will only set when device_add_disk()
succeed, and free_disk() will only be called from disk_release() if
GD_ADDED is set. I think Li Nan's patch is correct.

> sd driver has scsi_disk_free_disk() as its free_disk() and that
> does the proper put_device(&sdkp->disk_dev) for us so we should not
> need to call it, however we are left still missing the device_del()
> for it.
>
> While at it, unwind with scsi_autopm_put_device(sdp) *prior* to
> putting to device as we do in sd_remove().
>
> Reported-by: Li Nan <[email protected]>
> Reported-by: Yu Kuai <[email protected]>
> Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
> Signed-off-by: Luis Chamberlain <[email protected]>
> ---
> drivers/scsi/sd.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 7f949adbadfd..6475a3c947f8 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -3693,8 +3693,9 @@ static int sd_probe(struct device *dev)
>
> error = device_add(&sdkp->disk_dev);
> if (error) {
> + scsi_autopm_put_device(sdp);
> put_device(&sdkp->disk_dev);
> - goto out;
> + return error;

I don't see why this is necessary, the tag 'out' is still there. If
you think is a problem, I think you need a separate patch to call
scsi_autopm_put_device() before putting the device.

Thanks,
Kuai

> }
>
> dev_set_drvdata(dev, sdkp);
> @@ -3734,9 +3735,10 @@ static int sd_probe(struct device *dev)
>
> error = device_add_disk(dev, gd, NULL);
> if (error) {
> - put_device(&sdkp->disk_dev);
> + scsi_autopm_put_device(sdp);
> + device_del(&sdkp->disk_dev);
> put_disk(gd);
> - goto out;
> + return error;
> }
>
> if (sdkp->security) {
>


2024-02-22 09:24:37

by Li Nan

[permalink] [raw]
Subject: Re: [PATCH] scsi: sd: unregister device if device_add_disk() failed in sd_probe()

friendly ping...

在 2023/12/8 16:23, [email protected] 写道:
> From: Li Nan <[email protected]>
>
> "if device_add() succeeds, you should call device_del() when you want to
> get rid of it."
>
> In sd_probe(), device_add_disk() fails when device_add() has already
> succeeded, so change put_device() to device_unregister() to ensure device
> resources are released.
>
> Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
> Signed-off-by: Li Nan <[email protected]>
> ---
> drivers/scsi/sd.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 542a4bbb21bc..d81cbeee06eb 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -3736,7 +3736,7 @@ static int sd_probe(struct device *dev)
>
> error = device_add_disk(dev, gd, NULL);
> if (error) {
> - put_device(&sdkp->disk_dev);
> + device_unregister(&sdkp->disk_dev);
> put_disk(gd);
> goto out;
> }

--
Thanks,
Nan


2024-04-01 01:31:29

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH] scsi: sd: unregister device if device_add_disk() failed in sd_probe()

+CC Christoph Hellwig
+CC Bart Van Assche
?? 2023/12/08 16:23, [email protected] д??:
> From: Li Nan <[email protected]>
>
> "if device_add() succeeds, you should call device_del() when you want to
> get rid of it."
>
> In sd_probe(), device_add_disk() fails when device_add() has already
> succeeded, so change put_device() to device_unregister() to ensure device
> resources are released.

I was shocked that this patch is still there. This patch is easy and
straightforward.

LGTM
Reviewed-by: Yu Kuai <[email protected]>

BTW, Nan, it will be better if you have a reporducer for this.
>
> Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
> Signed-off-by: Li Nan <[email protected]>
> ---
> drivers/scsi/sd.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 542a4bbb21bc..d81cbeee06eb 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -3736,7 +3736,7 @@ static int sd_probe(struct device *dev)
>
> error = device_add_disk(dev, gd, NULL);
> if (error) {
> - put_device(&sdkp->disk_dev);
> + device_unregister(&sdkp->disk_dev);
> put_disk(gd);
> goto out;
> }
>


2024-04-01 17:50:10

by Bart Van Assche

[permalink] [raw]
Subject: Re: [PATCH] scsi: sd: unregister device if device_add_disk() failed in sd_probe()

On 12/8/23 00:23, [email protected] wrote:
> From: Li Nan <[email protected]>
>
> "if device_add() succeeds, you should call device_del() when you want to
> get rid of it."
>
> In sd_probe(), device_add_disk() fails when device_add() has already
> succeeded, so change put_device() to device_unregister() to ensure device
> resources are released.
>
> Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
> Signed-off-by: Li Nan <[email protected]>
> ---
> drivers/scsi/sd.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 542a4bbb21bc..d81cbeee06eb 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -3736,7 +3736,7 @@ static int sd_probe(struct device *dev)
>
> error = device_add_disk(dev, gd, NULL);
> if (error) {
> - put_device(&sdkp->disk_dev);
> + device_unregister(&sdkp->disk_dev);
> put_disk(gd);
> goto out;
> }

Reviewed-by: Bart Van Assche <[email protected]>

2024-04-01 17:51:45

by Bart Van Assche

[permalink] [raw]
Subject: Re: [PATCH] scsi: sd: unregister device if device_add_disk() failed in sd_probe()

On 3/31/24 18:31, Yu Kuai wrote:
> I was shocked that this patch is still there. This patch is easy and
> straightforward.
>
> LGTM
> Reviewed-by: Yu Kuai <[email protected]>
>
> BTW, Nan, it will be better if you have a reporducer for this.
Agreed that a reproducer would help. No information is present in the
patch description about how this issue was detected nor about how it was
tested. That probably would have encouraged reviewers.

Bart.

2024-04-02 00:17:53

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH] scsi: sd: unregister device if device_add_disk() failed in sd_probe()

On 4/2/24 02:51, Bart Van Assche wrote:
> On 3/31/24 18:31, Yu Kuai wrote:
>> I was shocked that this patch is still there. This patch is easy and
>> straightforward.
>>
>> LGTM
>> Reviewed-by: Yu Kuai <[email protected]>
>>
>> BTW, Nan, it will be better if you have a reporducer for this.
> Agreed that a reproducer would help. No information is present in the
> patch description about how this issue was detected nor about how it was
> tested. That probably would have encouraged reviewers.

May be submit the reproducer as a blktest case while at it ?

>
> Bart.
>

--
Damien Le Moal
Western Digital Research


2024-04-02 01:49:30

by Martin K. Petersen

[permalink] [raw]
Subject: Re: [PATCH] scsi: sd: unregister device if device_add_disk() failed in sd_probe()

On Fri, 08 Dec 2023 16:23:35 +0800, [email protected] wrote:

> "if device_add() succeeds, you should call device_del() when you want to
> get rid of it."
>
> In sd_probe(), device_add_disk() fails when device_add() has already
> succeeded, so change put_device() to device_unregister() to ensure device
> resources are released.
>
> [...]

Applied to 6.9/scsi-fixes, thanks!

[1/1] scsi: sd: unregister device if device_add_disk() failed in sd_probe()
https://git.kernel.org/mkp/scsi/c/0296bea01cfa

--
Martin K. Petersen Oracle Linux Engineering