2021-10-18 12:00:35

by Zqiang

Subject: [PATCH v3] block: fix incorrect references to disk objects

When adding a partition to a disk, the reference count of the disk
object is increased. The partition device is then allocated and
device_add() is called. If device_add() returns an error, the reference
count of the disk object is dropped twice, once in put_device(pdev)
and once in put_disk(disk). This ends the object's life cycle
prematurely and triggers the following call trace.

__init_work+0x2d/0x50 kernel/workqueue.c:519
synchronize_rcu_expedited+0x3af/0x650 kernel/rcu/tree_exp.h:847
bdi_remove_from_list mm/backing-dev.c:938 [inline]
bdi_unregister+0x17f/0x5c0 mm/backing-dev.c:946
release_bdi+0xa1/0xc0 mm/backing-dev.c:968
kref_put include/linux/kref.h:65 [inline]
bdi_put+0x72/0xa0 mm/backing-dev.c:976
bdev_free_inode+0x11e/0x220 block/bdev.c:408
i_callback+0x3f/0x70 fs/inode.c:226
rcu_do_batch kernel/rcu/tree.c:2508 [inline]
rcu_core+0x76d/0x16c0 kernel/rcu/tree.c:2743
__do_softirq+0x1d7/0x93b kernel/softirq.c:558
invoke_softirq kernel/softirq.c:432 [inline]
__irq_exit_rcu kernel/softirq.c:636 [inline]
irq_exit_rcu+0xf2/0x130 kernel/softirq.c:648
sysvec_apic_timer_interrupt+0x93/0xc0

Return directly after calling the put_device().

Reported-by: Hao Sun <[email protected]>
Signed-off-by: Zqiang <[email protected]>
---
v1->v2:
return directly instead of setting disk to NULL
v2->v3:
update the patch description

block/partitions/core.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/block/partitions/core.c b/block/partitions/core.c
index 9dbddc355b40..ed5deef1d7e1 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -424,6 +424,7 @@ static struct block_device *add_partition(struct gendisk *disk, int partno,
 	device_del(pdev);
 out_put:
 	put_device(pdev);
+	return ERR_PTR(err);
 out_put_disk:
 	put_disk(disk);
 	return ERR_PTR(err);
--
2.17.1
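
The double drop described in the patch can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the toy_* types and functions are hypothetical stand-ins, not kernel APIs, and the model assumes (as the description says) that put_device(pdev) ends up releasing the disk reference held on behalf of the partition.

```c
#include <stdio.h>

/* Hypothetical stand-ins for struct gendisk and the partition device. */
struct toy_disk { int refs; };
struct toy_pdev { struct toy_disk *disk; };

/* Models put_disk(): drop one reference on the disk. */
static void toy_put_disk(struct toy_disk *d)
{
	d->refs--;
}

/*
 * Models put_device(pdev) on the last device reference: the release
 * path drops the disk reference owned by the partition (assumption
 * taken from the patch description).
 */
static void toy_put_device(struct toy_pdev *p)
{
	toy_put_disk(p->disk);
}

/*
 * Models the error path of add_partition() after device_add() fails.
 * early_return != 0 selects the patched behaviour.
 * Returns the disk refcount left after the error path has run.
 */
int toy_add_partition_fail(struct toy_disk *d, int early_return)
{
	struct toy_pdev p = { .disk = d };

	d->refs++;		/* reference taken for the new partition */
	/* ... device_add() fails here ... */
	toy_put_device(&p);	/* out_put: drops the partition's disk reference */
	if (early_return)
		return d->refs;	/* patched: return right after put_device() */
	toy_put_disk(d);	/* out_put_disk: second drop, the bug */
	return d->refs;
}

/* Prints the remaining refcount for both variants, starting from 1. */
void toy_report(void)
{
	struct toy_disk before = { .refs = 1 }, after = { .refs = 1 };

	printf("unpatched: %d\n", toy_add_partition_fail(&before, 0));
	printf("patched:   %d\n", toy_add_partition_fail(&after, 1));
}
```

With a disk that starts with one reference held by the caller, the unpatched path leaves the count at 0 even though the caller still holds its reference, while the patched path leaves it at 1.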


2021-10-18 12:31:07

by Matthew Wilcox

Subject: Re: [PATCH v3] block: fix incorrect references to disk objects

On Mon, Oct 18, 2021 at 07:58:07PM +0800, Zqiang wrote:
> When adding a partition to a disk, the reference count of the disk
> object is increased. The partition device is then allocated and
> device_add() is called. If device_add() returns an error, the reference
> count of the disk object is dropped twice, once in put_device(pdev)
> and once in put_disk(disk). This ends the object's life cycle
> prematurely and triggers the following call trace.
>
> __init_work+0x2d/0x50 kernel/workqueue.c:519
> synchronize_rcu_expedited+0x3af/0x650 kernel/rcu/tree_exp.h:847
> bdi_remove_from_list mm/backing-dev.c:938 [inline]
> bdi_unregister+0x17f/0x5c0 mm/backing-dev.c:946
> release_bdi+0xa1/0xc0 mm/backing-dev.c:968
> kref_put include/linux/kref.h:65 [inline]
> bdi_put+0x72/0xa0 mm/backing-dev.c:976
> bdev_free_inode+0x11e/0x220 block/bdev.c:408
> i_callback+0x3f/0x70 fs/inode.c:226
> rcu_do_batch kernel/rcu/tree.c:2508 [inline]
> rcu_core+0x76d/0x16c0 kernel/rcu/tree.c:2743
> __do_softirq+0x1d7/0x93b kernel/softirq.c:558
> invoke_softirq kernel/softirq.c:432 [inline]
> __irq_exit_rcu kernel/softirq.c:636 [inline]
> irq_exit_rcu+0xf2/0x130 kernel/softirq.c:648
> sysvec_apic_timer_interrupt+0x93/0xc0
>
> Return directly after calling the put_device().
>
> Reported-by: Hao Sun <[email protected]>
> Signed-off-by: Zqiang <[email protected]>

Fixes: 9d3b8813895d ("block: change the refcounting for partitions")
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>

2021-10-25 08:29:42

by Zqiang

Subject: Re: [PATCH v3] block: fix incorrect references to disk objects


On 2021/10/18 8:26 PM, Matthew Wilcox wrote:
> On Mon, Oct 18, 2021 at 07:58:07PM +0800, Zqiang wrote:
>> When adding a partition to a disk, the reference count of the disk
>> object is increased. The partition device is then allocated and
>> device_add() is called. If device_add() returns an error, the reference
>> count of the disk object is dropped twice, once in put_device(pdev)
>> and once in put_disk(disk). This ends the object's life cycle
>> prematurely and triggers the following call trace.
>>
>> __init_work+0x2d/0x50 kernel/workqueue.c:519
>> synchronize_rcu_expedited+0x3af/0x650 kernel/rcu/tree_exp.h:847
>> bdi_remove_from_list mm/backing-dev.c:938 [inline]
>> bdi_unregister+0x17f/0x5c0 mm/backing-dev.c:946
>> release_bdi+0xa1/0xc0 mm/backing-dev.c:968
>> kref_put include/linux/kref.h:65 [inline]
>> bdi_put+0x72/0xa0 mm/backing-dev.c:976
>> bdev_free_inode+0x11e/0x220 block/bdev.c:408
>> i_callback+0x3f/0x70 fs/inode.c:226
>> rcu_do_batch kernel/rcu/tree.c:2508 [inline]
>> rcu_core+0x76d/0x16c0 kernel/rcu/tree.c:2743
>> __do_softirq+0x1d7/0x93b kernel/softirq.c:558
>> invoke_softirq kernel/softirq.c:432 [inline]
>> __irq_exit_rcu kernel/softirq.c:636 [inline]
>> irq_exit_rcu+0xf2/0x130 kernel/softirq.c:648
>> sysvec_apic_timer_interrupt+0x93/0xc0
>>
>> Return directly after calling the put_device().
>>
>> Reported-by: Hao Sun <[email protected]>
>> Signed-off-by: Zqiang <[email protected]>
> Fixes: 9d3b8813895d ("block: change the refcounting for partitions")
> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>

Hello Jens Axboe,

The patch description in the previous v2 version is incorrect, and
v3 corrects the description. Please apply the v3 version.

Thanks

Zqiang



2021-10-25 18:11:42

by Jens Axboe

Subject: Re: [PATCH v3] block: fix incorrect references to disk objects

On 10/25/21 2:27 AM, Zqiang wrote:
>
> On 2021/10/18 8:26 PM, Matthew Wilcox wrote:
>> On Mon, Oct 18, 2021 at 07:58:07PM +0800, Zqiang wrote:
>>> When adding a partition to a disk, the reference count of the disk
>>> object is increased. The partition device is then allocated and
>>> device_add() is called. If device_add() returns an error, the reference
>>> count of the disk object is dropped twice, once in put_device(pdev)
>>> and once in put_disk(disk). This ends the object's life cycle
>>> prematurely and triggers the following call trace.
>>>
>>> __init_work+0x2d/0x50 kernel/workqueue.c:519
>>> synchronize_rcu_expedited+0x3af/0x650 kernel/rcu/tree_exp.h:847
>>> bdi_remove_from_list mm/backing-dev.c:938 [inline]
>>> bdi_unregister+0x17f/0x5c0 mm/backing-dev.c:946
>>> release_bdi+0xa1/0xc0 mm/backing-dev.c:968
>>> kref_put include/linux/kref.h:65 [inline]
>>> bdi_put+0x72/0xa0 mm/backing-dev.c:976
>>> bdev_free_inode+0x11e/0x220 block/bdev.c:408
>>> i_callback+0x3f/0x70 fs/inode.c:226
>>> rcu_do_batch kernel/rcu/tree.c:2508 [inline]
>>> rcu_core+0x76d/0x16c0 kernel/rcu/tree.c:2743
>>> __do_softirq+0x1d7/0x93b kernel/softirq.c:558
>>> invoke_softirq kernel/softirq.c:432 [inline]
>>> __irq_exit_rcu kernel/softirq.c:636 [inline]
>>> irq_exit_rcu+0xf2/0x130 kernel/softirq.c:648
>>> sysvec_apic_timer_interrupt+0x93/0xc0
>>>
>>> Return directly after calling the put_device().
>>>
>>> Reported-by: Hao Sun <[email protected]>
>>> Signed-off-by: Zqiang <[email protected]>
>> Fixes: 9d3b8813895d ("block: change the refcounting for partitions")
>> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
>
> Hello Jens Axboe,
>
> The patch description in the previous v2 version is incorrect, and
> v3 corrects the description. Please apply the v3 version.

This patch is already upstream, I don't have a time machine...

--
Jens Axboe