2022-10-12 12:53:58

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC] block: fix use after free for bd_holder_dir/slave_dir

From: Yu Kuai <[email protected]>

Our test report a following uaf:

==================================================================
BUG: KASAN: use-after-free in sysfs_remove_link+0x23/0x60
Read of size 8 at addr ffff888117398db0 by task dmsetup/1097

CPU: 12 PID: 1097 Comm: dmsetup Not tainted 6.0.0-next-20221007-00011-gf46c8956
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-4
Call Trace:
<TASK>
? dump_stack_lvl+0x73/0x9f
? print_report+0x249/0x746
? __virt_addr_valid+0xd4/0x200
? sysfs_remove_link+0x23/0x60
? kasan_report+0xc0/0x120
? sysfs_remove_link+0x23/0x60
? __asan_load8+0x74/0x110
? sysfs_remove_link+0x23/0x60
? __unlink_disk_holder.isra.0+0x2f/0x80
? bd_unlink_disk_holder+0xd2/0x1c0
? dm_put_table_device+0xf1/0x250
? dm_put_device+0x14f/0x230
? linear_dtr+0x34/0x50
? dm_table_destroy+0x7b/0x280
? table_load+0x34a/0x710
? list_devices+0x4c0/0x4c0
? kvmalloc_node+0x7d/0x160
? __kmalloc_node+0x185/0x2b0
? ctl_ioctl+0x388/0x7b0
? list_devices+0x4c0/0x4c0
? free_params+0x50/0x50
? do_vfs_ioctl+0x931/0x10d0
? tick_program_event+0x65/0xd0
? __kasan_check_read+0x1d/0x30
? __fget_light+0xc2/0x370
? dm_ctl_ioctl+0x12/0x20
? __x64_sys_ioctl+0xd5/0x150
? do_syscall_64+0x35/0x80
? entry_SYSCALL_64_after_hwframe+0x63/0xcd
</TASK>

Allocated by task 1097:
kasan_save_stack+0x26/0x60
kasan_set_track+0x29/0x40
kasan_save_alloc_info+0x1f/0x40
__kasan_kmalloc+0xcb/0xe0
kmalloc_trace+0x7e/0x150
kobject_create_and_add+0x3d/0xc0
device_add_disk+0x429/0x7e0
dm_setup_md_queue+0x15b/0x240
table_load+0x469/0x710
ctl_ioctl+0x388/0x7b0
dm_ctl_ioctl+0x12/0x20
__x64_sys_ioctl+0xd5/0x150
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 1097:
kasan_save_stack+0x26/0x60
kasan_set_track+0x29/0x40
kasan_save_free_info+0x32/0x60
__kasan_slab_free+0x172/0x2c0
__kmem_cache_free+0x11c/0x560
kfree+0xd3/0x240
dynamic_kobj_release+0x1e/0x60
kobject_put+0x192/0x410
device_add_disk+0x535/0x7e0
dm_setup_md_queue+0x15b/0x240
table_load+0x469/0x710
ctl_ioctl+0x388/0x7b0
dm_ctl_ioctl+0x12/0x20
__x64_sys_ioctl+0xd5/0x150
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

This problem is very easy to reporduce by injecting failure while creating
dm, and root cause is that lifetime of bd_holder_dir/slave_dir is
problematic:

1) device alloc
alloc_disk_node
kzalloc_node -> gendisk
disk->part0 = bdev_alloc -> part0
device_initialize
kobject_init() -> part0->bd_device

2) device create
device_add_disk
device_add
kobject_add -> part0->bd_device
kobject_create_and_add -> part0->bd_holder_dir
kobject_create_and_add -> gendisk->slave_dir

3) device remove
del_gendisk
kobject_put -> part0->bd_holder_dir
kobject_put -> gendisk->slave_dir
device_del
kobject_del -> part0->bd_device

4) device free
disk_release
iput
bdev_free_inode
kfree -> gendisk
kmem_cache_free -> part0

bd_holder_dir and slave_dir is allocated in step 2), and freed in steup
3), while they are still accessible before step 4).

This patch delay the kobject destruction to disk_release/part_release to
fix the problem.

Signed-off-by: Yu Kuai <[email protected]>
---
block/genhd.c | 7 ++++---
block/partitions/core.c | 3 ++-
drivers/base/core.c | 1 -
3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 514395361d7c..7ebd085d3a3f 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -615,9 +615,6 @@ void del_gendisk(struct gendisk *disk)

blk_unregister_queue(disk);

- kobject_put(disk->part0->bd_holder_dir);
- kobject_put(disk->slave_dir);
-
part_stat_set_all(disk->part0, 0);
disk->part0->bd_stamp = 0;
if (!sysfs_deprecated)
@@ -1139,6 +1136,10 @@ static void disk_release(struct device *dev)
might_sleep();
WARN_ON_ONCE(disk_live(disk));

+ kobject_put(disk->part0->bd_holder_dir);
+ kobject_put(disk->slave_dir);
+ kobject_del(&dev->kobj);
+
/*
* To undo the all initialization from blk_mq_init_allocated_queue in
* case of a probe failure where add_disk is never called we have to
diff --git a/block/partitions/core.c b/block/partitions/core.c
index b8112f52d388..b86a66198cc8 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -250,6 +250,8 @@ static const struct attribute_group *part_attr_groups[] = {

static void part_release(struct device *dev)
{
+ kobject_put(dev_to_bdev(dev)->bd_holder_dir);
+ kobject_del(&dev->kobj);
put_disk(dev_to_bdev(dev)->bd_disk);
iput(dev_to_bdev(dev)->bd_inode);
}
@@ -279,7 +281,6 @@ static void delete_partition(struct block_device *part)
__invalidate_device(part, true);

xa_erase(&part->bd_disk->part_tbl, part->bd_partno);
- kobject_put(part->bd_holder_dir);
device_del(&part->bd_device);

/*
diff --git a/drivers/base/core.c b/drivers/base/core.c
index d02501933467..d2ff3d2a8710 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -3712,7 +3712,6 @@ void device_del(struct device *dev)
BUS_NOTIFY_REMOVED_DEVICE, dev);
kobject_uevent(&dev->kobj, KOBJ_REMOVE);
glue_dir = get_glue_dir(dev);
- kobject_del(&dev->kobj);
cleanup_glue_dir(dev, glue_dir);
memalloc_noio_restore(noio_flag);
put_device(parent);
--
2.31.1


2022-10-12 15:54:24

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH RFC] block: fix use after free for bd_holder_dir/slave_dir

On Wed, Oct 12, 2022 at 08:58:38PM +0800, Yu Kuai wrote:
> From: Yu Kuai <[email protected]>
>
> Our test report a following uaf:
>
> ==================================================================
> BUG: KASAN: use-after-free in sysfs_remove_link+0x23/0x60
> Read of size 8 at addr ffff888117398db0 by task dmsetup/1097
>
> CPU: 12 PID: 1097 Comm: dmsetup Not tainted 6.0.0-next-20221007-00011-gf46c8956
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-4
> Call Trace:
> <TASK>
> ? dump_stack_lvl+0x73/0x9f
> ? print_report+0x249/0x746
> ? __virt_addr_valid+0xd4/0x200
> ? sysfs_remove_link+0x23/0x60
> ? kasan_report+0xc0/0x120
> ? sysfs_remove_link+0x23/0x60
> ? __asan_load8+0x74/0x110
> ? sysfs_remove_link+0x23/0x60
> ? __unlink_disk_holder.isra.0+0x2f/0x80
> ? bd_unlink_disk_holder+0xd2/0x1c0
> ? dm_put_table_device+0xf1/0x250
> ? dm_put_device+0x14f/0x230
> ? linear_dtr+0x34/0x50
> ? dm_table_destroy+0x7b/0x280
> ? table_load+0x34a/0x710
> ? list_devices+0x4c0/0x4c0
> ? kvmalloc_node+0x7d/0x160
> ? __kmalloc_node+0x185/0x2b0
> ? ctl_ioctl+0x388/0x7b0
> ? list_devices+0x4c0/0x4c0
> ? free_params+0x50/0x50
> ? do_vfs_ioctl+0x931/0x10d0
> ? tick_program_event+0x65/0xd0
> ? __kasan_check_read+0x1d/0x30
> ? __fget_light+0xc2/0x370
> ? dm_ctl_ioctl+0x12/0x20
> ? __x64_sys_ioctl+0xd5/0x150
> ? do_syscall_64+0x35/0x80
> ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
> </TASK>
>
> Allocated by task 1097:
> kasan_save_stack+0x26/0x60
> kasan_set_track+0x29/0x40
> kasan_save_alloc_info+0x1f/0x40
> __kasan_kmalloc+0xcb/0xe0
> kmalloc_trace+0x7e/0x150
> kobject_create_and_add+0x3d/0xc0
> device_add_disk+0x429/0x7e0
> dm_setup_md_queue+0x15b/0x240
> table_load+0x469/0x710
> ctl_ioctl+0x388/0x7b0
> dm_ctl_ioctl+0x12/0x20
> __x64_sys_ioctl+0xd5/0x150
> do_syscall_64+0x35/0x80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> Freed by task 1097:
> kasan_save_stack+0x26/0x60
> kasan_set_track+0x29/0x40
> kasan_save_free_info+0x32/0x60
> __kasan_slab_free+0x172/0x2c0
> __kmem_cache_free+0x11c/0x560
> kfree+0xd3/0x240
> dynamic_kobj_release+0x1e/0x60
> kobject_put+0x192/0x410
> device_add_disk+0x535/0x7e0
> dm_setup_md_queue+0x15b/0x240
> table_load+0x469/0x710
> ctl_ioctl+0x388/0x7b0
> dm_ctl_ioctl+0x12/0x20
> __x64_sys_ioctl+0xd5/0x150
> do_syscall_64+0x35/0x80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> This problem is very easy to reporduce by injecting failure while creating
> dm, and root cause is that lifetime of bd_holder_dir/slave_dir is
> problematic:
>
> 1) device alloc
> alloc_disk_node
> kzalloc_node -> gendisk
> disk->part0 = bdev_alloc -> part0
> device_initialize
> kobject_init() -> part0->bd_device
>
> 2) device create
> device_add_disk
> device_add
> kobject_add -> part0->bd_device
> kobject_create_and_add -> part0->bd_holder_dir
> kobject_create_and_add -> gendisk->slave_dir
>
> 3) device remove
> del_gendisk
> kobject_put -> part0->bd_holder_dir
> kobject_put -> gendisk->slave_dir
> device_del
> kobject_del -> part0->bd_device
>
> 4) device free
> disk_release
> iput
> bdev_free_inode
> kfree -> gendisk
> kmem_cache_free -> part0
>
> bd_holder_dir and slave_dir is allocated in step 2), and freed in steup
> 3), while they are still accessible before step 4).
>
> This patch delay the kobject destruction to disk_release/part_release to
> fix the problem.
>
> Signed-off-by: Yu Kuai <[email protected]>
> ---
> block/genhd.c | 7 ++++---
> block/partitions/core.c | 3 ++-
> drivers/base/core.c | 1 -
> 3 files changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/block/genhd.c b/block/genhd.c
> index 514395361d7c..7ebd085d3a3f 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -615,9 +615,6 @@ void del_gendisk(struct gendisk *disk)
>
> blk_unregister_queue(disk);
>
> - kobject_put(disk->part0->bd_holder_dir);
> - kobject_put(disk->slave_dir);
> -
> part_stat_set_all(disk->part0, 0);
> disk->part0->bd_stamp = 0;
> if (!sysfs_deprecated)
> @@ -1139,6 +1136,10 @@ static void disk_release(struct device *dev)
> might_sleep();
> WARN_ON_ONCE(disk_live(disk));
>
> + kobject_put(disk->part0->bd_holder_dir);
> + kobject_put(disk->slave_dir);
> + kobject_del(&dev->kobj);
> +
> /*
> * To undo the all initialization from blk_mq_init_allocated_queue in
> * case of a probe failure where add_disk is never called we have to
> diff --git a/block/partitions/core.c b/block/partitions/core.c
> index b8112f52d388..b86a66198cc8 100644
> --- a/block/partitions/core.c
> +++ b/block/partitions/core.c
> @@ -250,6 +250,8 @@ static const struct attribute_group *part_attr_groups[] = {
>
> static void part_release(struct device *dev)
> {
> + kobject_put(dev_to_bdev(dev)->bd_holder_dir);
> + kobject_del(&dev->kobj);
> put_disk(dev_to_bdev(dev)->bd_disk);
> iput(dev_to_bdev(dev)->bd_inode);
> }
> @@ -279,7 +281,6 @@ static void delete_partition(struct block_device *part)
> __invalidate_device(part, true);
>
> xa_erase(&part->bd_disk->part_tbl, part->bd_partno);
> - kobject_put(part->bd_holder_dir);
> device_del(&part->bd_device);
>
> /*
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index d02501933467..d2ff3d2a8710 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -3712,7 +3712,6 @@ void device_del(struct device *dev)
> BUS_NOTIFY_REMOVED_DEVICE, dev);
> kobject_uevent(&dev->kobj, KOBJ_REMOVE);
> glue_dir = get_glue_dir(dev);
> - kobject_del(&dev->kobj);

Wait, no, you can't do this without having to change a bunch of other
code too. This feels really wrong, how did you test that this actually
works for all devices in the systems (not just block devices)?

so NAK on this change.

greg k-h

2022-10-13 02:23:36

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH RFC] block: fix use after free for bd_holder_dir/slave_dir

Hi,

?? 2022/10/12 23:49, Greg KH д??:
> On Wed, Oct 12, 2022 at 08:58:38PM +0800, Yu Kuai wrote:
>> From: Yu Kuai <[email protected]>
>>
>> Our test report a following uaf:
>>
>> ==================================================================
>> BUG: KASAN: use-after-free in sysfs_remove_link+0x23/0x60
>> Read of size 8 at addr ffff888117398db0 by task dmsetup/1097
>>
>> CPU: 12 PID: 1097 Comm: dmsetup Not tainted 6.0.0-next-20221007-00011-gf46c8956
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-4
>> Call Trace:
>> <TASK>
>> ? dump_stack_lvl+0x73/0x9f
>> ? print_report+0x249/0x746
>> ? __virt_addr_valid+0xd4/0x200
>> ? sysfs_remove_link+0x23/0x60
>> ? kasan_report+0xc0/0x120
>> ? sysfs_remove_link+0x23/0x60
>> ? __asan_load8+0x74/0x110
>> ? sysfs_remove_link+0x23/0x60
>> ? __unlink_disk_holder.isra.0+0x2f/0x80
>> ? bd_unlink_disk_holder+0xd2/0x1c0
>> ? dm_put_table_device+0xf1/0x250
>> ? dm_put_device+0x14f/0x230
>> ? linear_dtr+0x34/0x50
>> ? dm_table_destroy+0x7b/0x280
>> ? table_load+0x34a/0x710
>> ? list_devices+0x4c0/0x4c0
>> ? kvmalloc_node+0x7d/0x160
>> ? __kmalloc_node+0x185/0x2b0
>> ? ctl_ioctl+0x388/0x7b0
>> ? list_devices+0x4c0/0x4c0
>> ? free_params+0x50/0x50
>> ? do_vfs_ioctl+0x931/0x10d0
>> ? tick_program_event+0x65/0xd0
>> ? __kasan_check_read+0x1d/0x30
>> ? __fget_light+0xc2/0x370
>> ? dm_ctl_ioctl+0x12/0x20
>> ? __x64_sys_ioctl+0xd5/0x150
>> ? do_syscall_64+0x35/0x80
>> ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> </TASK>
>>
>> Allocated by task 1097:
>> kasan_save_stack+0x26/0x60
>> kasan_set_track+0x29/0x40
>> kasan_save_alloc_info+0x1f/0x40
>> __kasan_kmalloc+0xcb/0xe0
>> kmalloc_trace+0x7e/0x150
>> kobject_create_and_add+0x3d/0xc0
>> device_add_disk+0x429/0x7e0
>> dm_setup_md_queue+0x15b/0x240
>> table_load+0x469/0x710
>> ctl_ioctl+0x388/0x7b0
>> dm_ctl_ioctl+0x12/0x20
>> __x64_sys_ioctl+0xd5/0x150
>> do_syscall_64+0x35/0x80
>> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>
>> Freed by task 1097:
>> kasan_save_stack+0x26/0x60
>> kasan_set_track+0x29/0x40
>> kasan_save_free_info+0x32/0x60
>> __kasan_slab_free+0x172/0x2c0
>> __kmem_cache_free+0x11c/0x560
>> kfree+0xd3/0x240
>> dynamic_kobj_release+0x1e/0x60
>> kobject_put+0x192/0x410
>> device_add_disk+0x535/0x7e0
>> dm_setup_md_queue+0x15b/0x240
>> table_load+0x469/0x710
>> ctl_ioctl+0x388/0x7b0
>> dm_ctl_ioctl+0x12/0x20
>> __x64_sys_ioctl+0xd5/0x150
>> do_syscall_64+0x35/0x80
>> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>
>> This problem is very easy to reporduce by injecting failure while creating
>> dm, and root cause is that lifetime of bd_holder_dir/slave_dir is
>> problematic:
>>
>> 1) device alloc
>> alloc_disk_node
>> kzalloc_node -> gendisk
>> disk->part0 = bdev_alloc -> part0
>> device_initialize
>> kobject_init() -> part0->bd_device
>>
>> 2) device create
>> device_add_disk
>> device_add
>> kobject_add -> part0->bd_device
>> kobject_create_and_add -> part0->bd_holder_dir
>> kobject_create_and_add -> gendisk->slave_dir
>>
>> 3) device remove
>> del_gendisk
>> kobject_put -> part0->bd_holder_dir
>> kobject_put -> gendisk->slave_dir
>> device_del
>> kobject_del -> part0->bd_device
>>
>> 4) device free
>> disk_release
>> iput
>> bdev_free_inode
>> kfree -> gendisk
>> kmem_cache_free -> part0
>>
>> bd_holder_dir and slave_dir is allocated in step 2), and freed in steup
>> 3), while they are still accessible before step 4).
>>
>> This patch delay the kobject destruction to disk_release/part_release to
>> fix the problem.
>>
>> Signed-off-by: Yu Kuai <[email protected]>
>> ---
>> block/genhd.c | 7 ++++---
>> block/partitions/core.c | 3 ++-
>> drivers/base/core.c | 1 -
>> 3 files changed, 6 insertions(+), 5 deletions(-)
>>
>> diff --git a/block/genhd.c b/block/genhd.c
>> index 514395361d7c..7ebd085d3a3f 100644
>> --- a/block/genhd.c
>> +++ b/block/genhd.c
>> @@ -615,9 +615,6 @@ void del_gendisk(struct gendisk *disk)
>>
>> blk_unregister_queue(disk);
>>
>> - kobject_put(disk->part0->bd_holder_dir);
>> - kobject_put(disk->slave_dir);
>> -
>> part_stat_set_all(disk->part0, 0);
>> disk->part0->bd_stamp = 0;
>> if (!sysfs_deprecated)
>> @@ -1139,6 +1136,10 @@ static void disk_release(struct device *dev)
>> might_sleep();
>> WARN_ON_ONCE(disk_live(disk));
>>
>> + kobject_put(disk->part0->bd_holder_dir);
>> + kobject_put(disk->slave_dir);
>> + kobject_del(&dev->kobj);
>> +
>> /*
>> * To undo the all initialization from blk_mq_init_allocated_queue in
>> * case of a probe failure where add_disk is never called we have to
>> diff --git a/block/partitions/core.c b/block/partitions/core.c
>> index b8112f52d388..b86a66198cc8 100644
>> --- a/block/partitions/core.c
>> +++ b/block/partitions/core.c
>> @@ -250,6 +250,8 @@ static const struct attribute_group *part_attr_groups[] = {
>>
>> static void part_release(struct device *dev)
>> {
>> + kobject_put(dev_to_bdev(dev)->bd_holder_dir);
>> + kobject_del(&dev->kobj);
>> put_disk(dev_to_bdev(dev)->bd_disk);
>> iput(dev_to_bdev(dev)->bd_inode);
>> }
>> @@ -279,7 +281,6 @@ static void delete_partition(struct block_device *part)
>> __invalidate_device(part, true);
>>
>> xa_erase(&part->bd_disk->part_tbl, part->bd_partno);
>> - kobject_put(part->bd_holder_dir);
>> device_del(&part->bd_device);
>>
>> /*
>> diff --git a/drivers/base/core.c b/drivers/base/core.c
>> index d02501933467..d2ff3d2a8710 100644
>> --- a/drivers/base/core.c
>> +++ b/drivers/base/core.c
>> @@ -3712,7 +3712,6 @@ void device_del(struct device *dev)
>> BUS_NOTIFY_REMOVED_DEVICE, dev);
>> kobject_uevent(&dev->kobj, KOBJ_REMOVE);
>> glue_dir = get_glue_dir(dev);
>> - kobject_del(&dev->kobj);
>
> Wait, no, you can't do this without having to change a bunch of other
> code too. This feels really wrong, how did you test that this actually
> works for all devices in the systems (not just block devices)?

Yes, you're right, I just tested some common block devices, and I don't
have a good idea to fix this yet...

Thanks,
Kuai
>
> so NAK on this change.
>
> greg k-h
> .
>

2022-10-14 06:03:34

by kernel test robot

[permalink] [raw]
Subject: [block] ababeb0121: mdadm-selftests.18imsm-1d-takeover-r1_2d.fail


Greeting,

FYI, we noticed the following commit (built with gcc-11):

commit: ababeb0121d1dfee30f2b26533f8ced4fc5ad1f7 ("[PATCH RFC] block: fix use after free for bd_holder_dir/slave_dir")
url: https://github.com/intel-lab-lkp/linux/commits/Yu-Kuai/block-fix-use-after-free-for-bd_holder_dir-slave_dir/20221012-203747
base: https://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git for-next
patch link: https://lore.kernel.org/linux-block/[email protected]
patch subject: [PATCH RFC] block: fix use after free for bd_holder_dir/slave_dir

in testcase: mdadm-selftests
version: mdadm-selftests-x86_64-5f41845-1_20220826
with following parameters:

disk: 1HDD
test_prefix: 18



on test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-4790T CPU @ 2.70GHz (Haswell) with 16G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


If you fix the issue, kindly add following tag
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/r/[email protected]


Testing on linux-6.0.0-07245-gababeb0121d1 kernel
/lkp/benchmarks/mdadm-selftests/tests/18imsm-1d-takeover-r1_2d... FAILED - see /var/tmp/18imsm-1d-takeover-r1_2d.log and /var/tmp/fail18imsm-1d-takeover-r1_2d.log for details

(log is attached)


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file

# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.



--
0-DAY CI Kernel Test Service
https://01.org/lkp



Attachments:
(No filename) (1.85 kB)
config-6.0.0-07245-gababeb0121d1 (171.38 kB)
job-script (5.78 kB)
kmsg.xz (19.27 kB)
mdadm-selftests (2.22 kB)
job.yaml (4.70 kB)
reproduce (100.00 B)
log (2.14 kB)
Download all attachments

2022-10-17 08:12:26

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH RFC] block: fix use after free for bd_holder_dir/slave_dir

AFAICS the problem is the that pre-registered holders don't get
unregistered for a late add_disk failure.

Something like this should fix your error:

diff --git a/block/genhd.c b/block/genhd.c
index 17b33c62423df..6123005154b2a 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -484,7 +484,7 @@ int __must_check device_add_disk(struct device *parent, struct gendisk *disk,

ret = blk_register_queue(disk);
if (ret)
- goto out_put_slave_dir;
+ goto out_unregister_holders;

if (!(disk->flags & GENHD_FL_HIDDEN)) {
ret = bdi_register(disk->bdi, "%u:%u",
@@ -526,6 +526,8 @@ int __must_check device_add_disk(struct device *parent, struct gendisk *disk,
bdi_unregister(disk->bdi);
out_unregister_queue:
blk_unregister_queue(disk);
+out_unregister_holders:
+ bd_unregister_all_holders(disk);
out_put_slave_dir:
kobject_put(disk->slave_dir);
out_put_holder_dir:
diff --git a/block/holder.c b/block/holder.c
index 5283bc804cc14..12c09d5c21280 100644
--- a/block/holder.c
+++ b/block/holder.c
@@ -169,3 +169,13 @@ int bd_register_pending_holders(struct gendisk *disk)
mutex_unlock(&disk->open_mutex);
return ret;
}
+
+void bd_unregister_all_holders(struct gendisk *disk)
+{
+ struct bd_holder_disk *holder;
+
+ mutex_lock(&disk->open_mutex);
+ list_for_each_entry(holder, &disk->slave_bdevs, list)
+ __unlink_disk_holder(holder->bdev, disk);
+ mutex_unlock(&disk->open_mutex);
+}
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 50e358a19d986..ccab9a2dae4bd 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -840,6 +840,7 @@ void set_capacity(struct gendisk *disk, sector_t size);
int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk);
void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk);
int bd_register_pending_holders(struct gendisk *disk);
+void bd_unregister_all_holders(struct gendisk *disk);
#else
static inline int bd_link_disk_holder(struct block_device *bdev,
struct gendisk *disk)
@@ -854,6 +855,9 @@ static inline int bd_register_pending_holders(struct gendisk *disk)
{
return 0;
}
+static inline void bd_unregister_all_holders(struct gendisk *disk)
+{
+}
#endif /* CONFIG_BLOCK_HOLDER_DEPRECATED */

dev_t part_devt(struct gendisk *disk, u8 partno);

2022-10-17 09:58:06

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH RFC] block: fix use after free for bd_holder_dir/slave_dir

Hi, christoph!

?? 2022/10/17 15:59, Christoph Hellwig д??:
> AFAICS the problem is the that pre-registered holders don't get
> unregistered for a late add_disk failure.
>
> Something like this should fix your error:
>

I agree that this patch do make sense, however, it seems to me this
patch should fix the problem that kobject is leaked, not uaf... And I
verified that the problem can still be reporduced with this patch.

Some details about our error??

while creating dm using sda, sda is removed. Thus device_add_disk() will
fail at bd_register_pending_holders() and and slave_dir is removed.
Then, later in dm error path, slave_dir is accessed again.

Perhaps refactor dm error path can fix this error, however, I still
think the root cause is lifecycle of slave_dir/bd_holder_dir, and there
are still probably other potential problems...

Thanks,
Kuai
> diff --git a/block/genhd.c b/block/genhd.c
> index 17b33c62423df..6123005154b2a 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -484,7 +484,7 @@ int __must_check device_add_disk(struct device *parent, struct gendisk *disk,
>
> ret = blk_register_queue(disk);
> if (ret)
> - goto out_put_slave_dir;
> + goto out_unregister_holders;
>
> if (!(disk->flags & GENHD_FL_HIDDEN)) {
> ret = bdi_register(disk->bdi, "%u:%u",
> @@ -526,6 +526,8 @@ int __must_check device_add_disk(struct device *parent, struct gendisk *disk,
> bdi_unregister(disk->bdi);
> out_unregister_queue:
> blk_unregister_queue(disk);
> +out_unregister_holders:
> + bd_unregister_all_holders(disk);
> out_put_slave_dir:
> kobject_put(disk->slave_dir);
> out_put_holder_dir:
> diff --git a/block/holder.c b/block/holder.c
> index 5283bc804cc14..12c09d5c21280 100644
> --- a/block/holder.c
> +++ b/block/holder.c
> @@ -169,3 +169,13 @@ int bd_register_pending_holders(struct gendisk *disk)
> mutex_unlock(&disk->open_mutex);
> return ret;
> }
> +
> +void bd_unregister_all_holders(struct gendisk *disk)
> +{
> + struct bd_holder_disk *holder;
> +
> + mutex_lock(&disk->open_mutex);
> + list_for_each_entry(holder, &disk->slave_bdevs, list)
> + __unlink_disk_holder(holder->bdev, disk);
> + mutex_unlock(&disk->open_mutex);
> +}
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 50e358a19d986..ccab9a2dae4bd 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -840,6 +840,7 @@ void set_capacity(struct gendisk *disk, sector_t size);
> int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk);
> void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk);
> int bd_register_pending_holders(struct gendisk *disk);
> +void bd_unregister_all_holders(struct gendisk *disk);
> #else
> static inline int bd_link_disk_holder(struct block_device *bdev,
> struct gendisk *disk)
> @@ -854,6 +855,9 @@ static inline int bd_register_pending_holders(struct gendisk *disk)
> {
> return 0;
> }
> +static inline void bd_unregister_all_holders(struct gendisk *disk)
> +{
> +}
> #endif /* CONFIG_BLOCK_HOLDER_DEPRECATED */
>
> dev_t part_devt(struct gendisk *disk, u8 partno);
> .
>

2022-10-17 12:03:02

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH RFC] block: fix use after free for bd_holder_dir/slave_dir

On Mon, Oct 17, 2022 at 05:24:20PM +0800, Yu Kuai wrote:
> I agree that this patch do make sense, however, it seems to me this
> patch should fix the problem that kobject is leaked, not uaf... And I
> verified that the problem can still be reporduced with this patch.

Can you share your reproducer?

2022-10-17 13:30:22

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH RFC] block: fix use after free for bd_holder_dir/slave_dir

Hi,

?? 2022/10/17 19:36, Christoph Hellwig д??:
> On Mon, Oct 17, 2022 at 05:24:20PM +0800, Yu Kuai wrote:
>> I agree that this patch do make sense, however, it seems to me this
>> patch should fix the problem that kobject is leaked, not uaf... And I
>> verified that the problem can still be reporduced with this patch.
>
> Can you share your reproducer?
> .
>

Of course, I add some delay in kernel to make sure the problem is
reproduced 100%:

diff --git a/block/holder.c b/block/holder.c
index 5283bc804cc1..be8f4b4245f3 100644
--- a/block/holder.c
+++ b/block/holder.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0-only
#include <linux/blkdev.h>
#include <linux/slab.h>
+#include <linux/delay.h>

struct bd_holder_disk {
struct list_head list;
@@ -33,6 +34,8 @@ static int __link_disk_holder(struct block_device
*bdev, struct gendisk *disk)
{
int ret;

+ printk("%s: delay 5s\n", __func__);
+ msleep(5000);
ret = add_symlink(disk->slave_dir, bdev_kobj(bdev));
if (ret)
return ret;

test cmd is very simple:

dmsetup create test1 --table "0 100000 linear /dev/sda 0" &
sleep 1
echo 1 > /sys/block/sda/device/delete