2023-02-17 01:58:33

by Yu Kuai

Subject: [PATCH -next 0/2] block: fix scan partition for exclusively open device again

From: Yu Kuai <[email protected]>

Changes from RFC:
- remove the patch to factor out GD_NEED_PART_SCAN

Yu Kuai (2):
block: Revert "block: Do not reread partition table on exclusively
open device"
block: fix scan partition for exclusively open device again

block/blk.h | 2 +-
block/genhd.c | 37 ++++++++++++++++++++++++++++---------
block/ioctl.c | 13 ++++++-------
3 files changed, 35 insertions(+), 17 deletions(-)

--
2.31.1



2023-02-17 01:58:36

by Yu Kuai

Subject: [PATCH -next 1/2] block: Revert "block: Do not reread partition table on exclusively open device"

From: Yu Kuai <[email protected]>

This reverts commit 36369f46e91785688a5f39d7a5590e3f07981316.

The reverted patch can't fix the problem in a corner case: the device can
still be opened exclusively after the check and before blkdev_get_by_dev().
We'll use a new solution to fix the problem in the next patch, and the new
solution doesn't need to change any APIs.

Signed-off-by: Yu Kuai <[email protected]>
---
block/blk.h | 2 +-
block/genhd.c | 7 ++-----
block/ioctl.c | 13 ++++++-------
3 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/block/blk.h b/block/blk.h
index f02381405311..4a166f847ffd 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -439,7 +439,7 @@ static inline void bio_release_page(struct bio *bio, struct page *page)

struct request_queue *blk_alloc_queue(int node_id);

-int disk_scan_partitions(struct gendisk *disk, fmode_t mode, void *owner);
+int disk_scan_partitions(struct gendisk *disk, fmode_t mode);

int disk_alloc_events(struct gendisk *disk);
void disk_add_events(struct gendisk *disk);
diff --git a/block/genhd.c b/block/genhd.c
index d09d775c222a..b30d5538710c 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -356,7 +356,7 @@ void disk_uevent(struct gendisk *disk, enum kobject_action action)
}
EXPORT_SYMBOL_GPL(disk_uevent);

-int disk_scan_partitions(struct gendisk *disk, fmode_t mode, void *owner)
+int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
{
struct block_device *bdev;

@@ -366,9 +366,6 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode, void *owner)
return -EINVAL;
if (disk->open_partitions)
return -EBUSY;
- /* Someone else has bdev exclusively open? */
- if (disk->part0->bd_holder && disk->part0->bd_holder != owner)
- return -EBUSY;

set_bit(GD_NEED_PART_SCAN, &disk->state);
bdev = blkdev_get_by_dev(disk_devt(disk), mode, NULL);
@@ -499,7 +496,7 @@ int __must_check device_add_disk(struct device *parent, struct gendisk *disk,

bdev_add(disk->part0, ddev->devt);
if (get_capacity(disk))
- disk_scan_partitions(disk, FMODE_READ, NULL);
+ disk_scan_partitions(disk, FMODE_READ);

/*
* Announce the disk and partitions after all partitions are
diff --git a/block/ioctl.c b/block/ioctl.c
index 96617512982e..6dd49d877584 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -467,10 +467,10 @@ static int blkdev_bszset(struct block_device *bdev, fmode_t mode,
* user space. Note the separate arg/argp parameters that are needed
* to deal with the compat_ptr() conversion.
*/
-static int blkdev_common_ioctl(struct file *file, fmode_t mode, unsigned cmd,
- unsigned long arg, void __user *argp)
+static int blkdev_common_ioctl(struct block_device *bdev, fmode_t mode,
+ unsigned int cmd, unsigned long arg,
+ void __user *argp)
{
- struct block_device *bdev = I_BDEV(file->f_mapping->host);
unsigned int max_sectors;

switch (cmd) {
@@ -528,8 +528,7 @@ static int blkdev_common_ioctl(struct file *file, fmode_t mode, unsigned cmd,
return -EACCES;
if (bdev_is_partition(bdev))
return -EINVAL;
- return disk_scan_partitions(bdev->bd_disk, mode & ~FMODE_EXCL,
- file);
+ return disk_scan_partitions(bdev->bd_disk, mode & ~FMODE_EXCL);
case BLKTRACESTART:
case BLKTRACESTOP:
case BLKTRACETEARDOWN:
@@ -607,7 +606,7 @@ long blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg)
break;
}

- ret = blkdev_common_ioctl(file, mode, cmd, arg, argp);
+ ret = blkdev_common_ioctl(bdev, mode, cmd, arg, argp);
if (ret != -ENOIOCTLCMD)
return ret;

@@ -676,7 +675,7 @@ long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg)
break;
}

- ret = blkdev_common_ioctl(file, mode, cmd, arg, argp);
+ ret = blkdev_common_ioctl(bdev, mode, cmd, arg, argp);
if (ret == -ENOIOCTLCMD && disk->fops->compat_ioctl)
ret = disk->fops->compat_ioctl(bdev, mode, cmd, arg);

--
2.31.1


2023-02-17 01:58:38

by Yu Kuai

Subject: [PATCH -next 2/2] block: fix scan partition for exclusively open device again

From: Yu Kuai <[email protected]>

As explained in commit 36369f46e917 ("block: Do not reread partition table
on exclusively open device"), rereading partitions on a device that is
exclusively opened by someone else is problematic.

This patch makes sure the partition scan will only proceed if the current
thread opens the device exclusively, or the device is not opened
exclusively at all; in the latter case, other scanners and exclusive openers
will be blocked temporarily until the partition scan is done.

Fixes: 10c70d95c0f2 ("block: remove the bd_openers checks in blk_drop_partitions")
Cc: <[email protected]>
Suggested-by: Jan Kara <[email protected]>
Signed-off-by: Yu Kuai <[email protected]>
---
block/genhd.c | 30 ++++++++++++++++++++++++++----
block/ioctl.c | 2 +-
2 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index b30d5538710c..3ee5577e1586 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -359,6 +359,7 @@ EXPORT_SYMBOL_GPL(disk_uevent);
int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
{
struct block_device *bdev;
+ int ret = 0;

if (disk->flags & (GENHD_FL_NO_PART | GENHD_FL_HIDDEN))
return -EINVAL;
@@ -368,11 +369,27 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
return -EBUSY;

set_bit(GD_NEED_PART_SCAN, &disk->state);
- bdev = blkdev_get_by_dev(disk_devt(disk), mode, NULL);
+ /*
+ * If the device is opened exclusively by current thread already, it's
+ * safe to scan partitions, otherwise, use bd_prepare_to_claim() to
+ * synchronize with other exclusive openers and other partition
+ * scanners.
+ */
+ if (!(mode & FMODE_EXCL)) {
+ ret = bd_prepare_to_claim(disk->part0, disk_scan_partitions);
+ if (ret)
+ return ret;
+ }
+
+ bdev = blkdev_get_by_dev(disk_devt(disk), mode & ~FMODE_EXCL, NULL);
if (IS_ERR(bdev))
- return PTR_ERR(bdev);
- blkdev_put(bdev, mode);
- return 0;
+ ret = PTR_ERR(bdev);
+ else
+ blkdev_put(bdev, mode);
+
+ if (!(mode & FMODE_EXCL))
+ bd_abort_claiming(disk->part0, disk_scan_partitions);
+ return ret;
}

/**
@@ -494,6 +511,11 @@ int __must_check device_add_disk(struct device *parent, struct gendisk *disk,
if (ret)
goto out_unregister_bdi;

+ /* Make sure the first partition scan will be proceed */
+ if (get_capacity(disk) && !(disk->flags & GENHD_FL_NO_PART) &&
+ !test_bit(GD_SUPPRESS_PART_SCAN, &disk->state))
+ set_bit(GD_NEED_PART_SCAN, &disk->state);
+
bdev_add(disk->part0, ddev->devt);
if (get_capacity(disk))
disk_scan_partitions(disk, FMODE_READ);
diff --git a/block/ioctl.c b/block/ioctl.c
index 6dd49d877584..9c5f637ff153 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -528,7 +528,7 @@ static int blkdev_common_ioctl(struct block_device *bdev, fmode_t mode,
return -EACCES;
if (bdev_is_partition(bdev))
return -EINVAL;
- return disk_scan_partitions(bdev->bd_disk, mode & ~FMODE_EXCL);
+ return disk_scan_partitions(bdev->bd_disk, mode);
case BLKTRACESTART:
case BLKTRACESTOP:
case BLKTRACETEARDOWN:
--
2.31.1


2023-02-17 11:01:53

by Jan Kara

Subject: Re: [PATCH -next 1/2] block: Revert "block: Do not reread partition table on exclusively open device"

On Fri 17-02-23 10:21:59, Yu Kuai wrote:
> From: Yu Kuai <[email protected]>
>
> This reverts commit 36369f46e91785688a5f39d7a5590e3f07981316.
>
> This patch can't fix the problem in a corner case that device can be
> opened exclusively after the checking and before blkdev_get_by_dev().
> We'll use a new solution to fix the problem in the next patch, and
> the new solution doesn't need to change apis.
>
> Signed-off-by: Yu Kuai <[email protected]>

Sure. Feel free to add:

Acked-by: Jan Kara <[email protected]>

or Reviewed-by... whatever for the revert :)

Honza

> ---
> block/blk.h | 2 +-
> block/genhd.c | 7 ++-----
> block/ioctl.c | 13 ++++++-------
> 3 files changed, 9 insertions(+), 13 deletions(-)
>
> diff --git a/block/blk.h b/block/blk.h
> index f02381405311..4a166f847ffd 100644
> --- a/block/blk.h
> +++ b/block/blk.h
> @@ -439,7 +439,7 @@ static inline void bio_release_page(struct bio *bio, struct page *page)
>
> struct request_queue *blk_alloc_queue(int node_id);
>
> -int disk_scan_partitions(struct gendisk *disk, fmode_t mode, void *owner);
> +int disk_scan_partitions(struct gendisk *disk, fmode_t mode);
>
> int disk_alloc_events(struct gendisk *disk);
> void disk_add_events(struct gendisk *disk);
> diff --git a/block/genhd.c b/block/genhd.c
> index d09d775c222a..b30d5538710c 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -356,7 +356,7 @@ void disk_uevent(struct gendisk *disk, enum kobject_action action)
> }
> EXPORT_SYMBOL_GPL(disk_uevent);
>
> -int disk_scan_partitions(struct gendisk *disk, fmode_t mode, void *owner)
> +int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
> {
> struct block_device *bdev;
>
> @@ -366,9 +366,6 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode, void *owner)
> return -EINVAL;
> if (disk->open_partitions)
> return -EBUSY;
> - /* Someone else has bdev exclusively open? */
> - if (disk->part0->bd_holder && disk->part0->bd_holder != owner)
> - return -EBUSY;
>
> set_bit(GD_NEED_PART_SCAN, &disk->state);
> bdev = blkdev_get_by_dev(disk_devt(disk), mode, NULL);
> @@ -499,7 +496,7 @@ int __must_check device_add_disk(struct device *parent, struct gendisk *disk,
>
> bdev_add(disk->part0, ddev->devt);
> if (get_capacity(disk))
> - disk_scan_partitions(disk, FMODE_READ, NULL);
> + disk_scan_partitions(disk, FMODE_READ);
>
> /*
> * Announce the disk and partitions after all partitions are
> diff --git a/block/ioctl.c b/block/ioctl.c
> index 96617512982e..6dd49d877584 100644
> --- a/block/ioctl.c
> +++ b/block/ioctl.c
> @@ -467,10 +467,10 @@ static int blkdev_bszset(struct block_device *bdev, fmode_t mode,
> * user space. Note the separate arg/argp parameters that are needed
> * to deal with the compat_ptr() conversion.
> */
> -static int blkdev_common_ioctl(struct file *file, fmode_t mode, unsigned cmd,
> - unsigned long arg, void __user *argp)
> +static int blkdev_common_ioctl(struct block_device *bdev, fmode_t mode,
> + unsigned int cmd, unsigned long arg,
> + void __user *argp)
> {
> - struct block_device *bdev = I_BDEV(file->f_mapping->host);
> unsigned int max_sectors;
>
> switch (cmd) {
> @@ -528,8 +528,7 @@ static int blkdev_common_ioctl(struct file *file, fmode_t mode, unsigned cmd,
> return -EACCES;
> if (bdev_is_partition(bdev))
> return -EINVAL;
> - return disk_scan_partitions(bdev->bd_disk, mode & ~FMODE_EXCL,
> - file);
> + return disk_scan_partitions(bdev->bd_disk, mode & ~FMODE_EXCL);
> case BLKTRACESTART:
> case BLKTRACESTOP:
> case BLKTRACETEARDOWN:
> @@ -607,7 +606,7 @@ long blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg)
> break;
> }
>
> - ret = blkdev_common_ioctl(file, mode, cmd, arg, argp);
> + ret = blkdev_common_ioctl(bdev, mode, cmd, arg, argp);
> if (ret != -ENOIOCTLCMD)
> return ret;
>
> @@ -676,7 +675,7 @@ long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg)
> break;
> }
>
> - ret = blkdev_common_ioctl(file, mode, cmd, arg, argp);
> + ret = blkdev_common_ioctl(bdev, mode, cmd, arg, argp);
> if (ret == -ENOIOCTLCMD && disk->fops->compat_ioctl)
> ret = disk->fops->compat_ioctl(bdev, mode, cmd, arg);
>
> --
> 2.31.1
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2023-02-17 11:06:12

by Jan Kara

Subject: Re: [PATCH -next 2/2] block: fix scan partition for exclusively open device again

On Fri 17-02-23 10:22:00, Yu Kuai wrote:
> From: Yu Kuai <[email protected]>
>
> As explained in commit 36369f46e917 ("block: Do not reread partition table
> on exclusively open device"), reread partition on the device that is
> exclusively opened by someone else is problematic.
>
> This patch will make sure partition scan will only be proceed if current
> thread open the device exclusively, or the device is not opened
> exclusively, and in the later case, other scanners and exclusive openers
> will be blocked temporarily until partition scan is done.
>
> Fixes: 10c70d95c0f2 ("block: remove the bd_openers checks in blk_drop_partitions")
> Cc: <[email protected]>
> Suggested-by: Jan Kara <[email protected]>
> Signed-off-by: Yu Kuai <[email protected]>

Looks good to me, just two minor comments below:

> diff --git a/block/genhd.c b/block/genhd.c
> index b30d5538710c..3ee5577e1586 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -359,6 +359,7 @@ EXPORT_SYMBOL_GPL(disk_uevent);
> int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
> {
> struct block_device *bdev;
> + int ret = 0;
>
> if (disk->flags & (GENHD_FL_NO_PART | GENHD_FL_HIDDEN))
> return -EINVAL;
> @@ -368,11 +369,27 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
> return -EBUSY;
>
> set_bit(GD_NEED_PART_SCAN, &disk->state);

I'd move the set_bit() after we are sure we have exclusive access to the
bdev. Otherwise we could set GD_NEED_PART_SCAN on a device exclusively open
by someone else and if we race with open in an unfortunate way, we could
trigger unexpected partition scan...
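
In other words, roughly the following ordering (just a sketch against the
hunk above; the exact patch may end up looking different):

	if (!(mode & FMODE_EXCL)) {
		ret = bd_prepare_to_claim(disk->part0, disk_scan_partitions);
		if (ret)
			return ret;
	}
	/*
	 * Mark the disk for rescan only once the claim succeeded, so a
	 * failed claim never leaves GD_NEED_PART_SCAN behind on a device
	 * that someone else has open exclusively.
	 */
	set_bit(GD_NEED_PART_SCAN, &disk->state);
	bdev = blkdev_get_by_dev(disk_devt(disk), mode & ~FMODE_EXCL, NULL);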

> - bdev = blkdev_get_by_dev(disk_devt(disk), mode, NULL);
> + /*
> + * If the device is opened exclusively by current thread already, it's
> + * safe to scan partitons, otherwise, use bd_prepare_to_claim() to
> + * synchronize with other exclusive openers and other partition
> + * scanners.
> + */
> + if (!(mode & FMODE_EXCL)) {
> + ret = bd_prepare_to_claim(disk->part0, disk_scan_partitions);
> + if (ret)
> + return ret;
> + }
> +
> + bdev = blkdev_get_by_dev(disk_devt(disk), mode & ~FMODE_EXCL, NULL);
> if (IS_ERR(bdev))
> - return PTR_ERR(bdev);
> - blkdev_put(bdev, mode);
> - return 0;
> + ret = PTR_ERR(bdev);
> + else
> + blkdev_put(bdev, mode);
> +
> + if (!(mode & FMODE_EXCL))
> + bd_abort_claiming(disk->part0, disk_scan_partitions);
> + return ret;
> }
>
> /**
> @@ -494,6 +511,11 @@ int __must_check device_add_disk(struct device *parent, struct gendisk *disk,
> if (ret)
> goto out_unregister_bdi;
>
> + /* Make sure the first partition scan will be proceed */
^^^^^^ "will happen"
probably makes more sense here.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2023-02-17 13:16:46

by Jens Axboe

Subject: Re: [PATCH -next 0/2] block: fix scan partition for exclusively open device again


On Fri, 17 Feb 2023 10:21:58 +0800, Yu Kuai wrote:
> Changes from RFC:
> - remove the patch to factor out GD_NEED_PART_SCAN
>
> Yu Kuai (2):
> block: Revert "block: Do not reread partition table on exclusively
> open device"
> block: fix scan partition for exclusively open device again
>
> [...]

Applied, thanks!

[1/2] block: Revert "block: Do not reread partition table on exclusively open device"
commit: 0f77b29ad14e34a89961f32edc87b92db623bb37
[2/2] block: fix scan partition for exclusively open device again
commit: e5cfefa97bccf956ea0bb6464c1f6c84fd7a8d9f

Best regards,
--
Jens Axboe




2023-03-21 11:44:52

by Ming Lei

Subject: Re: [PATCH -next 0/2] block: fix scan partition for exclusively open device again

On Fri, Feb 17, 2023 at 10:21:58AM +0800, Yu Kuai wrote:
> From: Yu Kuai <[email protected]>
>
> Changes from RFC:
> - remove the patch to factor out GD_NEED_PART_SCAN
>
> Yu Kuai (2):
> block: Revert "block: Do not reread partition table on exclusively
> open device"
> block: fix scan partition for exclusively open device again

Hi Yu kuai,

Looks like the original issue starts to re-appear now with the two patches:

https://lore.kernel.org/linux-block/20221130135344.2ul4cyfstfs3znxg@quack3/

And underlying disk partition and raid partition can be observed at the
same time.

Can you take a look?

The script follows; it doesn't trigger the issue 100% of the time, but it is still easy to hit.

#create level 1 with 2 devices, meta 1.0
mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0

#create partition 0, start from 0 sector, size 100MiB
sgdisk -n 0:0:+100MiB /dev/md0

#observe partitions
cat /proc/partitions

#stop the array
mdadm -S /dev/md0

#re-assemble
mdadm -A /dev/md0 /dev/sda /dev/sdb
cat /proc/partitions


Thanks,
Ming


2023-03-22 01:34:29

by Yu Kuai

Subject: Re: [PATCH -next 0/2] block: fix scan partition for exclusively open device again

Hi,

On 2023/03/21 19:43, Ming Lei wrote:
> On Fri, Feb 17, 2023 at 10:21:58AM +0800, Yu Kuai wrote:
>> From: Yu Kuai <[email protected]>
>>
>> Changes from RFC:
>> - remove the patch to factor out GD_NEED_PART_SCAN
>>
>> Yu Kuai (2):
>> block: Revert "block: Do not reread partition table on exclusively
>> open device"
>> block: fix scan partition for exclusively open device again
>
> Hi Yu kuai,
>
> Looks the original issue starts to re-appear now with the two patches:
>
> https://lore.kernel.org/linux-block/20221130135344.2ul4cyfstfs3znxg@quack3/
>
> And underlying disk partition and raid partition can be observed at the
> same time.
>
> Can you take a look?
Yes, thanks for the report. I realize that sda1 and sdb1 are created
while the raid opens sda and sdb excl, and I think this problem should exist
before this patchset as well.

And I verify this with following test:

1) mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
2) sgdisk -n 0:0:+100MiB /dev/md0
3) mdadm -S /dev/md0
# scan partitions of sda
4) blockdev --rereadpt /dev/sda

Then sda1 is created.

I'm not sure how to fix this yet...

Thanks,
Kuai
>
> Follows the script, which isn't 100% triggered, but still easy.
>
> #create level 1 with 2 devices, meta 1.0
> mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
>
> #create partition 0, start from 0 sector, size 100MiB
> sgdisk -n 0:0:+100MiB /dev/md0
>
> #observe partitions
> cat /proc/partitions
>
> #stop the array
> mdadm -S /dev/md0
>
> #re-assemble
> mdadm -A /dev/md0 /dev/sda /dev/sdb
> cat /proc/partitions
>
>
> Thanks,
> Ming
>
> .
>

2023-03-22 01:37:24

by Ming Lei

Subject: Re: [PATCH -next 0/2] block: fix scan partition for exclusively open device again

On Wed, Mar 22, 2023 at 09:26:07AM +0800, Yu Kuai wrote:
> Hi,
>
> On 2023/03/21 19:43, Ming Lei wrote:
> > On Fri, Feb 17, 2023 at 10:21:58AM +0800, Yu Kuai wrote:
> > > From: Yu Kuai <[email protected]>
> > >
> > > Changes from RFC:
> > > - remove the patch to factor out GD_NEED_PART_SCAN
> > >
> > > Yu Kuai (2):
> > > block: Revert "block: Do not reread partition table on exclusively
> > > open device"
> > > block: fix scan partition for exclusively open device again
> >
> > Hi Yu kuai,
> >
> > Looks the original issue starts to re-appear now with the two patches:
> >
> > https://lore.kernel.org/linux-block/20221130135344.2ul4cyfstfs3znxg@quack3/
> >
> > And underlying disk partition and raid partition can be observed at the
> > same time.
> >
> > Can you take a look?
> Yes, thanks for the report. I realize that sda1 adn sdb1 is created
> while raid open sda and sdb excl, and I think this problem should exist
> before this patchset.

It looks like it can't be reproduced before applying your two patches; that is exactly what Jan
tried to fix with 36369f46e917 ("block: Do not reread partition table on exclusively open device").

The issue is reported by Changhui's block regression test.


Thanks,
Ming

2023-03-22 02:04:50

by Yu Kuai

Subject: Re: [PATCH -next 0/2] block: fix scan partition for exclusively open device again

Hi,

On 2023/03/22 9:34, Ming Lei wrote:
> On Wed, Mar 22, 2023 at 09:26:07AM +0800, Yu Kuai wrote:
>> Hi,
>>
> On 2023/03/21 19:43, Ming Lei wrote:
>>> On Fri, Feb 17, 2023 at 10:21:58AM +0800, Yu Kuai wrote:
>>>> From: Yu Kuai <[email protected]>
>>>>
>>>> Changes from RFC:
>>>> - remove the patch to factor out GD_NEED_PART_SCAN
>>>>
>>>> Yu Kuai (2):
>>>> block: Revert "block: Do not reread partition table on exclusively
>>>> open device"
>>>> block: fix scan partition for exclusively open device again
>>>
>>> Hi Yu kuai,
>>>
>>> Looks the original issue starts to re-appear now with the two patches:
>>>
>>> https://lore.kernel.org/linux-block/20221130135344.2ul4cyfstfs3znxg@quack3/
>>>
>>> And underlying disk partition and raid partition can be observed at the
>>> same time.
>>>
>>> Can you take a look?
>> Yes, thanks for the report. I realize that sda1 adn sdb1 is created
>> while raid open sda and sdb excl, and I think this problem should exist
>> before this patchset.
>
> Looks not reproduced before applying your two patches, that is exactly what Jan
> tried to fix with 36369f46e917 ("block: Do not reread partition table on exclusively open device").

Hi, Ming

I just tried your test with this patchset reverted, and I can still
reproduce the problem myself.

raid only opens this device excl, and disk_scan_partitions() is not called:

md_import_device
 blkdev_get_by_dev

I need to add some debug info to figure out how GD_NEED_PART_SCAN is set
for sda after the raid is stopped. That should explain why sda1 is
created.

Thanks,
Kuai
>
> The issue is reported by Changhui's block regression test.
>
>
> Thanks,
> Ming
>
>
> .
>

2023-03-22 02:19:25

by Yu Kuai

Subject: Re: [PATCH -next 0/2] block: fix scan partition for exclusively open device again

Hi,

On 2023/03/22 10:02, Yu Kuai wrote:
> Hi,
>
> On 2023/03/22 9:34, Ming Lei wrote:
>> On Wed, Mar 22, 2023 at 09:26:07AM +0800, Yu Kuai wrote:
>>> Hi,
>>>
>>> On 2023/03/21 19:43, Ming Lei wrote:
>>>> On Fri, Feb 17, 2023 at 10:21:58AM +0800, Yu Kuai wrote:
>>>>> From: Yu Kuai <[email protected]>
>>>>>
>>>>> Changes from RFC:
>>>>>    - remove the patch to factor out GD_NEED_PART_SCAN
>>>>>
>>>>> Yu Kuai (2):
>>>>>     block: Revert "block: Do not reread partition table on exclusively
>>>>>       open device"
>>>>>     block: fix scan partition for exclusively open device again
>>>>
>>>> Hi Yu kuai,
>>>>
>>>> Looks the original issue starts to re-appear now with the two patches:
>>>>
>>>> https://lore.kernel.org/linux-block/20221130135344.2ul4cyfstfs3znxg@quack3/
>>>>
>>>>
>>>> And underlying disk partition and raid partition can be observed at the
>>>> same time.
>>>>
>>>> Can you take a look?
>>> Yes, thanks for the report. I realize that sda1 adn sdb1 is created
>>> while raid open sda and sdb excl, and I think this problem should exist
>>> before this patchset.
>>
>> Looks not reproduced before applying your two patches, that is exactly
>> what Jan
>> tried to fix with 36369f46e917 ("block: Do not reread partition table
>> on exclusively open device").
>
> Hi, Ming
>
> I just tried your test with this patchset reverted, and I can still
> reporduce the problem myself.

Oops, I forgot to revert the first patch. You're right, the problem can't
be reproduced.
>
> raid only open this device excl, and disk_scan_partitions is not called:
>
> md_import_device
>  blkdev_get_by_devo
>
> I need to add some debuginfo to figure out how GD_NEED_PART_SCAN is set
> for sda after raid is stopped. And this should explain why sda1 is
> created.

I found how GD_NEED_PART_SCAN is set now: in patch 2, it is set before
bd_prepare_to_claim(), so a previously failed part scan will still leave this
bit set, and the following patch should fix this problem:

diff --git a/block/genhd.c b/block/genhd.c
index 08bb1a9ec22c..2487c9452b94 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -368,7 +368,6 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
if (disk->open_partitions)
return -EBUSY;

- set_bit(GD_NEED_PART_SCAN, &disk->state);
/*
* If the device is opened exclusively by current thread already, it's
* safe to scan partitons, otherwise, use bd_prepare_to_claim() to
@@ -381,6 +380,7 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
return ret;
}

+ set_bit(GD_NEED_PART_SCAN, &disk->state);
bdev = blkdev_get_by_dev(disk_devt(disk), mode & ~FMODE_EXCL, NULL);
if (IS_ERR(bdev))
ret = PTR_ERR(bdev);

>
> Thanks,
> Kuai
>>
>> The issue is reported by Changhui's block regression test.
>>
>>
>> Thanks,
>> Ming
>>
>>
>> .
>>
>
>
> .
>

2023-03-22 03:51:22

by Ming Lei

Subject: Re: [PATCH -next 0/2] block: fix scan partition for exclusively open device again

On Wed, Mar 22, 2023 at 10:15:30AM +0800, Yu Kuai wrote:
> Hi,
>
> > On 2023/03/22 10:02, Yu Kuai wrote:
> > Hi,
> >
> > > On 2023/03/22 9:34, Ming Lei wrote:
> > > On Wed, Mar 22, 2023 at 09:26:07AM +0800, Yu Kuai wrote:
> > > > Hi,
> > > >
> > > > On 2023/03/21 19:43, Ming Lei wrote:
> > > > > On Fri, Feb 17, 2023 at 10:21:58AM +0800, Yu Kuai wrote:
> > > > > > From: Yu Kuai <[email protected]>
> > > > > >
> > > > > > Changes from RFC:
> > > > > >    - remove the patch to factor out GD_NEED_PART_SCAN
> > > > > >
> > > > > > Yu Kuai (2):
> > > > > >     block: Revert "block: Do not reread partition table on exclusively
> > > > > >       open device"
> > > > > >     block: fix scan partition for exclusively open device again
> > > > >
> > > > > Hi Yu kuai,
> > > > >
> > > > > Looks the original issue starts to re-appear now with the two patches:
> > > > >
> > > > > https://lore.kernel.org/linux-block/20221130135344.2ul4cyfstfs3znxg@quack3/
> > > > >
> > > > >
> > > > > And underlying disk partition and raid partition can be observed at the
> > > > > same time.
> > > > >
> > > > > Can you take a look?
> > > > Yes, thanks for the report. I realize that sda1 adn sdb1 is created
> > > > while raid open sda and sdb excl, and I think this problem should exist
> > > > before this patchset.
> > >
> > > Looks not reproduced before applying your two patches, that is
> > > exactly what Jan
> > > tried to fix with 36369f46e917 ("block: Do not reread partition
> > > table on exclusively open device").
> >
> > Hi, Ming
> >
> > I just tried your test with this patchset reverted, and I can still
> > reporduce the problem myself.
>
> Oops, I forgot to revert the first patch. It's right the problem can't
> be reporduced.
> >
> > raid only open this device excl, and disk_scan_partitions is not called:
> >
> > md_import_device
> >  blkdev_get_by_devo
> >
> > I need to add some debuginfo to figure out how GD_NEED_PART_SCAN is set
> > for sda after raid is stopped. And this should explain why sda1 is
> > created.
>
> I found how GD_NEED_PART_SCAN is set now, in patch 2, this is set before
> bd_prepare_to_claim, so preciously faild part scan will still set this
> bit, and following patch shold fix this problem:

I just ran a quick test: the issue isn't reproduced with your patch, and
the change looks reasonable too,

Reviewed-by: Ming Lei <[email protected]>


Thanks,
Ming

2023-03-22 04:01:20

by Yu Kuai

Subject: [PATCH] block: don't set GD_NEED_PART_SCAN if scan partition failed

From: Yu Kuai <[email protected]>

Currently, if disk_scan_partitions() failed, GD_NEED_PART_SCAN will still
be set, and the partition scan will proceed again when blkdev_get_by_dev()
is called. However, this causes a problem: re-assembling a partitioned
raid device will create partitions for the underlying disks.

Test procedure:

mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
sgdisk -n 0:0:+100MiB /dev/md0
blockdev --rereadpt /dev/sda
blockdev --rereadpt /dev/sdb
mdadm -S /dev/md0
mdadm -A /dev/md0 /dev/sda /dev/sdb

Test result: the underlying disk partitions and the raid partitions can be
observed at the same time.

Note that this can still happen in some corner cases where
GD_NEED_PART_SCAN can be set for the underlying disk while the raid
device is re-assembled.

Fixes: e5cfefa97bcc ("block: fix scan partition for exclusively open device again")
Signed-off-by: Yu Kuai <[email protected]>
---
block/genhd.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/block/genhd.c b/block/genhd.c
index 08bb1a9ec22c..a72e27d6779d 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -368,7 +368,6 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
if (disk->open_partitions)
return -EBUSY;

- set_bit(GD_NEED_PART_SCAN, &disk->state);
/*
* If the device is opened exclusively by current thread already, it's
* safe to scan partitons, otherwise, use bd_prepare_to_claim() to
@@ -381,12 +380,19 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
return ret;
}

+ set_bit(GD_NEED_PART_SCAN, &disk->state);
bdev = blkdev_get_by_dev(disk_devt(disk), mode & ~FMODE_EXCL, NULL);
if (IS_ERR(bdev))
ret = PTR_ERR(bdev);
else
blkdev_put(bdev, mode & ~FMODE_EXCL);

+ /*
+ * If blkdev_get_by_dev() failed early, GD_NEED_PART_SCAN is still set,
+ * and this will cause re-assembling a partitioned raid device to
+ * create partitions for the underlying disk.
+ */
+ clear_bit(GD_NEED_PART_SCAN, &disk->state);
if (!(mode & FMODE_EXCL))
bd_abort_claiming(disk->part0, disk_scan_partitions);
return ret;
--
2.31.1

2023-03-22 04:05:31

by Yu Kuai

Subject: Re: [PATCH -next 0/2] block: fix scan partition for exclusively open device again

Hi,

On 2023/03/22 11:38, Ming Lei wrote:
>>>>>> Hi Yu kuai,
>>>>>>
>>>>>> Looks the original issue starts to re-appear now with the two patches:
>>>>>>
>>>>>> https://lore.kernel.org/linux-block/20221130135344.2ul4cyfstfs3znxg@quack3/
>>>>>>
>>>>>>
>>>>>> And underlying disk partition and raid partition can be observed at the
>>>>>> same time.
>>>>>>
>>>>>> Can you take a look?
>>>>> Yes, thanks for the report. I realize that sda1 adn sdb1 is created
>>>>> while raid open sda and sdb excl, and I think this problem should exist
>>>>> before this patchset.
>>>>
>>>> Looks not reproduced before applying your two patches, that is
>>>> exactly what Jan
>>>> tried to fix with 36369f46e917 ("block: Do not reread partition
>>>> table on exclusively open device").
>>>
>>> Hi, Ming
>>>
>>> I just tried your test with this patchset reverted, and I can still
>>> reporduce the problem myself.
>>
>> Oops, I forgot to revert the first patch. It's right the problem can't
>> be reporduced.
>>>
>>> raid only open this device excl, and disk_scan_partitions is not called:
>>>
>>> md_import_device
>>>  blkdev_get_by_devo
>>>
>>> I need to add some debuginfo to figure out how GD_NEED_PART_SCAN is set
>>> for sda after raid is stopped. And this should explain why sda1 is
>>> created.
>>
>> I found how GD_NEED_PART_SCAN is set now, in patch 2, this is set before
>> bd_prepare_to_claim, so preciously faild part scan will still set this
>> bit, and following patch shold fix this problem:
>
> Just run quick test, the issue won't be reproduced with your patch, and
> the change looks rational too,
>
> Reviewed-by: Ming Lei <[email protected]>
>

Thanks for the test and review. I just made an additional change to
clear GD_NEED_PART_SCAN; I will send a patch, can you take a look?

Kuai
>
> Thanks,
> Ming
>
> .
>

2023-03-22 08:06:23

by Ming Lei

Subject: Re: [PATCH] block: don't set GD_NEED_PART_SCAN if scan partition failed

On Wed, Mar 22, 2023 at 11:59:26AM +0800, Yu Kuai wrote:
> From: Yu Kuai <[email protected]>
>
> Currently if disk_scan_partitions() failed, GD_NEED_PART_SCAN will still
> set, and partition scan will be proceed again when blkdev_get_by_dev()
> is called. However, this will cause a problem that re-assemble partitioned
> raid device will creat partition for underlying disk.
>
> Test procedure:
>
> mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
> sgdisk -n 0:0:+100MiB /dev/md0
> blockdev --rereadpt /dev/sda
> blockdev --rereadpt /dev/sdb
> mdadm -S /dev/md0
> mdadm -A /dev/md0 /dev/sda /dev/sdb
>
> Test result: underlying disk partition and raid partition can be
> observed at the same time
>
> Note that this can still happen in come corner cases that
> GD_NEED_PART_SCAN can be set for underlying disk while re-assemble raid
> device.
>
> Fixes: e5cfefa97bcc ("block: fix scan partition for exclusively open device again")
> Signed-off-by: Yu Kuai <[email protected]>

The issue still can't be avoided completely; for example, after rebooting,
/dev/sda1 & /dev/md0p1 can be observed at the same time. This one should be
caused by the underlying partitions being scanned before the raid is
re-assembled, and I guess it may not be easy to avoid.

Also, it seems the following change added in e5cfefa97bcc isn't necessary:

/* Make sure the first partition scan will be proceed */
if (get_capacity(disk) && !(disk->flags & GENHD_FL_NO_PART) &&
!test_bit(GD_SUPPRESS_PART_SCAN, &disk->state))
set_bit(GD_NEED_PART_SCAN, &disk->state);

since the following disk_scan_partitions() in device_add_disk() should cover
the partition scan.

> ---
> block/genhd.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/block/genhd.c b/block/genhd.c
> index 08bb1a9ec22c..a72e27d6779d 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -368,7 +368,6 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
> if (disk->open_partitions)
> return -EBUSY;
>
> - set_bit(GD_NEED_PART_SCAN, &disk->state);
> /*
> * If the device is opened exclusively by current thread already, it's
> * safe to scan partitons, otherwise, use bd_prepare_to_claim() to
> @@ -381,12 +380,19 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
> return ret;
> }
>
> + set_bit(GD_NEED_PART_SCAN, &disk->state);
> bdev = blkdev_get_by_dev(disk_devt(disk), mode & ~FMODE_EXCL, NULL);
> if (IS_ERR(bdev))
> ret = PTR_ERR(bdev);
> else
> blkdev_put(bdev, mode & ~FMODE_EXCL);
>
> + /*
> + * If blkdev_get_by_dev() failed early, GD_NEED_PART_SCAN is still set,
> + * and this will cause that re-assemble partitioned raid device will
> + * creat partition for underlying disk.
> + */
> + clear_bit(GD_NEED_PART_SCAN, &disk->state);

I feel GD_NEED_PART_SCAN is becoming a bit hard to follow.

So far, it is only consumed by blkdev_get_whole(), and cleared in
bdev_disk_changed(). That means the partition scan can be retried
if bdev_disk_changed() fails.

Another mess is that more drivers are starting to touch this flag, such as
nbd/sd; it is probably better to change them into one API such as
blk_disk_need_partition_scan(), and hide the implementation detail
from drivers.
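
Just to illustrate the idea, such a helper could look roughly like this
(only a sketch; the exact conditions here are made up):

	/*
	 * Sketch only: one place that decides whether a disk needs a
	 * partition rescan, so drivers stop poking GD_NEED_PART_SCAN
	 * directly.
	 */
	static inline void blk_disk_need_partition_scan(struct gendisk *disk)
	{
		if (!(disk->flags & (GENHD_FL_NO_PART | GENHD_FL_HIDDEN)) &&
		    !test_bit(GD_SUPPRESS_PART_SCAN, &disk->state))
			set_bit(GD_NEED_PART_SCAN, &disk->state);
	}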


thanks,
Ming

2023-03-22 09:15:12

by Yu Kuai

Subject: Re: [PATCH] block: don't set GD_NEED_PART_SCAN if scan partition failed

Hi, Ming

On 2023/03/22 15:58, Ming Lei wrote:
> On Wed, Mar 22, 2023 at 11:59:26AM +0800, Yu Kuai wrote:
>> From: Yu Kuai <[email protected]>
>>
>> Currently if disk_scan_partitions() failed, GD_NEED_PART_SCAN will still
>> set, and partition scan will be proceed again when blkdev_get_by_dev()
>> is called. However, this will cause a problem that re-assemble partitioned
>> raid device will creat partition for underlying disk.
>>
>> Test procedure:
>>
>> mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
>> sgdisk -n 0:0:+100MiB /dev/md0
>> blockdev --rereadpt /dev/sda
>> blockdev --rereadpt /dev/sdb
>> mdadm -S /dev/md0
>> mdadm -A /dev/md0 /dev/sda /dev/sdb
>>
>> Test result: underlying disk partition and raid partition can be
>> observed at the same time
>>
>> Note that this can still happen in come corner cases that
>> GD_NEED_PART_SCAN can be set for underlying disk while re-assemble raid
>> device.
>>
>> Fixes: e5cfefa97bcc ("block: fix scan partition for exclusively open device again")
>> Signed-off-by: Yu Kuai <[email protected]>
>
> The issue still can't be avoided completely, such as, after rebooting,
> /dev/sda1 & /dev/md0p1 can be observed at the same time. And this one
> should be underlying partitions scanned before re-assembling raid, I
> guess it may not be easy to avoid.

Yes, this is possible and I don't know how to fix this yet...
>
> Also seems the following change added in e5cfefa97bcc isn't necessary:
>
> /* Make sure the first partition scan will be proceed */
> if (get_capacity(disk) && !(disk->flags & GENHD_FL_NO_PART) &&
> !test_bit(GD_SUPPRESS_PART_SCAN, &disk->state))
> set_bit(GD_NEED_PART_SCAN, &disk->state);
>
> since the following disk_scan_partitions() in device_add_disk() should cover
> partitions scan.

This can't be guaranteed if someone else opens the device excl after
bdev_add() and before disk_scan_partitions():

t1:                                     t2:
device_add_disk
 bdev_add
  insert_inode_hash
                                        // open device excl
 disk_scan_partitions
 // will fail

However, this is just in theory, and it's unlikely to happen in
practice.

Thanks,
Kuai
>
>> ---
>> block/genhd.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/genhd.c b/block/genhd.c
>> index 08bb1a9ec22c..a72e27d6779d 100644
>> --- a/block/genhd.c
>> +++ b/block/genhd.c
>> @@ -368,7 +368,6 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
>> if (disk->open_partitions)
>> return -EBUSY;
>>
>> - set_bit(GD_NEED_PART_SCAN, &disk->state);
>> /*
>> * If the device is opened exclusively by current thread already, it's
>> * safe to scan partitons, otherwise, use bd_prepare_to_claim() to
>> @@ -381,12 +380,19 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
>> return ret;
>> }
>>
>> + set_bit(GD_NEED_PART_SCAN, &disk->state);
>> bdev = blkdev_get_by_dev(disk_devt(disk), mode & ~FMODE_EXCL, NULL);
>> if (IS_ERR(bdev))
>> ret = PTR_ERR(bdev);
>> else
>> blkdev_put(bdev, mode & ~FMODE_EXCL);
>>
>> + /*
>> + * If blkdev_get_by_dev() failed early, GD_NEED_PART_SCAN is still set,
>> + * and this will cause that re-assemble partitioned raid device will
>> + * creat partition for underlying disk.
>> + */
>> + clear_bit(GD_NEED_PART_SCAN, &disk->state);
>
> I feel GD_NEED_PART_SCAN becomes a bit hard to follow.
>
> So far, it is only consumed by blkdev_get_whole(), and cleared in
> bdev_disk_changed(). That means partition scan can be retried
> if bdev_disk_changed() fails.
>
> Another mess is that more drivers start to touch this flag, such as
> nbd/sd, probably it is better to change them into one API of
> blk_disk_need_partition_scan(), and hide implementation detail
> to drivers.
>
>
> thanks,
> Ming
>
> .
>

2023-03-22 09:52:13

by Jan Kara

Subject: Re: [PATCH] block: don't set GD_NEED_PART_SCAN if scan partition failed

On Wed 22-03-23 15:58:35, Ming Lei wrote:
> On Wed, Mar 22, 2023 at 11:59:26AM +0800, Yu Kuai wrote:
> > From: Yu Kuai <[email protected]>
> >
> > Currently if disk_scan_partitions() failed, GD_NEED_PART_SCAN will still
> > set, and partition scan will be proceed again when blkdev_get_by_dev()
> > is called. However, this will cause a problem that re-assemble partitioned
> > raid device will creat partition for underlying disk.
> >
> > Test procedure:
> >
> > mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
> > sgdisk -n 0:0:+100MiB /dev/md0
> > blockdev --rereadpt /dev/sda
> > blockdev --rereadpt /dev/sdb
> > mdadm -S /dev/md0
> > mdadm -A /dev/md0 /dev/sda /dev/sdb
> >
> > Test result: underlying disk partition and raid partition can be
> > observed at the same time
> >
> > Note that this can still happen in come corner cases that
> > GD_NEED_PART_SCAN can be set for underlying disk while re-assemble raid
> > device.
> >
> > Fixes: e5cfefa97bcc ("block: fix scan partition for exclusively open device again")
> > Signed-off-by: Yu Kuai <[email protected]>
>
> The issue still can't be avoided completely, such as, after rebooting,
> /dev/sda1 & /dev/md0p1 can be observed at the same time. And this one
> should be underlying partitions scanned before re-assembling raid, I
> guess it may not be easy to avoid.

So this was always happening (before my patches, after my patches, and now
after Yu's patches) and the kernel does not have enough information to know
that sda will become part of the md0 device in the future. But mdadm actually
deals with this, as far as I remember, and deletes partitions from all devices
it is assembling the array from (and a quick tracing experiment I did
supports this).

Honza

--
Jan Kara <[email protected]>
SUSE Labs, CR

2023-03-22 09:55:45

by Jan Kara

Subject: Re: [PATCH] block: don't set GD_NEED_PART_SCAN if scan partition failed

On Wed 22-03-23 11:59:26, Yu Kuai wrote:
> From: Yu Kuai <[email protected]>
>
> Currently if disk_scan_partitions() failed, GD_NEED_PART_SCAN will still
> set, and partition scan will be proceed again when blkdev_get_by_dev()
> is called. However, this will cause a problem that re-assemble partitioned
> raid device will creat partition for underlying disk.
>
> Test procedure:
>
> mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
> sgdisk -n 0:0:+100MiB /dev/md0
> blockdev --rereadpt /dev/sda
> blockdev --rereadpt /dev/sdb
> mdadm -S /dev/md0
> mdadm -A /dev/md0 /dev/sda /dev/sdb
>
> Test result: underlying disk partition and raid partition can be
> observed at the same time
>
> Note that this can still happen in come corner cases that
> GD_NEED_PART_SCAN can be set for underlying disk while re-assemble raid
> device.
>
> Fixes: e5cfefa97bcc ("block: fix scan partition for exclusively open device again")
> Signed-off-by: Yu Kuai <[email protected]>

This looks good to me. I had actually noticed this problem already when
looking at the patch that resulted in commit e5cfefa97bcc, but Jens merged it
before I got to checking it, and then I convinced myself it was not serious
enough to redo the patch. Anyway, feel free to add:

Reviewed-by: Jan Kara <[email protected]>

Honza

> ---
> block/genhd.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/block/genhd.c b/block/genhd.c
> index 08bb1a9ec22c..a72e27d6779d 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -368,7 +368,6 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
> if (disk->open_partitions)
> return -EBUSY;
>
> - set_bit(GD_NEED_PART_SCAN, &disk->state);
> /*
> * If the device is opened exclusively by current thread already, it's
> * safe to scan partitons, otherwise, use bd_prepare_to_claim() to
> @@ -381,12 +380,19 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
> return ret;
> }
>
> + set_bit(GD_NEED_PART_SCAN, &disk->state);
> bdev = blkdev_get_by_dev(disk_devt(disk), mode & ~FMODE_EXCL, NULL);
> if (IS_ERR(bdev))
> ret = PTR_ERR(bdev);
> else
> blkdev_put(bdev, mode & ~FMODE_EXCL);
>
> + /*
> + * If blkdev_get_by_dev() failed early, GD_NEED_PART_SCAN is still set,
> + * and this will cause that re-assemble partitioned raid device will
> + * creat partition for underlying disk.
> + */
> + clear_bit(GD_NEED_PART_SCAN, &disk->state);
> if (!(mode & FMODE_EXCL))
> bd_abort_claiming(disk->part0, disk_scan_partitions);
> return ret;
> --
> 2.31.1
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2023-03-22 11:37:46

by Ming Lei

Subject: Re: [PATCH] block: don't set GD_NEED_PART_SCAN if scan partition failed

On Wed, Mar 22, 2023 at 10:47:07AM +0100, Jan Kara wrote:
> On Wed 22-03-23 15:58:35, Ming Lei wrote:
> > On Wed, Mar 22, 2023 at 11:59:26AM +0800, Yu Kuai wrote:
> > > From: Yu Kuai <[email protected]>
> > >
> > > Currently if disk_scan_partitions() failed, GD_NEED_PART_SCAN will still
> > > set, and partition scan will be proceed again when blkdev_get_by_dev()
> > > is called. However, this will cause a problem that re-assemble partitioned
> > > raid device will creat partition for underlying disk.
> > >
> > > Test procedure:
> > >
> > > mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
> > > sgdisk -n 0:0:+100MiB /dev/md0
> > > blockdev --rereadpt /dev/sda
> > > blockdev --rereadpt /dev/sdb
> > > mdadm -S /dev/md0
> > > mdadm -A /dev/md0 /dev/sda /dev/sdb
> > >
> > > Test result: underlying disk partition and raid partition can be
> > > observed at the same time
> > >
> > > Note that this can still happen in come corner cases that
> > > GD_NEED_PART_SCAN can be set for underlying disk while re-assemble raid
> > > device.
> > >
> > > Fixes: e5cfefa97bcc ("block: fix scan partition for exclusively open device again")
> > > Signed-off-by: Yu Kuai <[email protected]>
> >
> > The issue still can't be avoided completely, such as, after rebooting,
> > /dev/sda1 & /dev/md0p1 can be observed at the same time. And this one
> > should be underlying partitions scanned before re-assembling raid, I
> > guess it may not be easy to avoid.
>
> So this was always happening (before my patches, after my patches, and now
> after Yu's patches) and kernel does not have enough information to know
> that sda will become part of md0 device in the future. But mdadm actually
> deals with this as far as I remember and deletes partitions for all devices
> it is assembling the array from (and quick tracing experiment I did
> supports this).

I am testing on Fedora 37, and there mdadm v4.2 doesn't delete the underlying
partitions before re-assembling.

Also, given that mdadm or related userspace has to change to avoid scanning
the underlying partitions, I'm just wondering: why not let userspace tell
the kernel explicitly not to do it?

Thanks,
Ming

2023-03-22 13:11:18

by Jan Kara

Subject: Re: [PATCH] block: don't set GD_NEED_PART_SCAN if scan partition failed

On Wed 22-03-23 19:34:30, Ming Lei wrote:
> On Wed, Mar 22, 2023 at 10:47:07AM +0100, Jan Kara wrote:
> > On Wed 22-03-23 15:58:35, Ming Lei wrote:
> > > On Wed, Mar 22, 2023 at 11:59:26AM +0800, Yu Kuai wrote:
> > > > From: Yu Kuai <[email protected]>
> > > >
> > > > Currently if disk_scan_partitions() failed, GD_NEED_PART_SCAN will still
> > > > set, and partition scan will be proceed again when blkdev_get_by_dev()
> > > > is called. However, this will cause a problem that re-assemble partitioned
> > > > raid device will creat partition for underlying disk.
> > > >
> > > > Test procedure:
> > > >
> > > > mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
> > > > sgdisk -n 0:0:+100MiB /dev/md0
> > > > blockdev --rereadpt /dev/sda
> > > > blockdev --rereadpt /dev/sdb
> > > > mdadm -S /dev/md0
> > > > mdadm -A /dev/md0 /dev/sda /dev/sdb
> > > >
> > > > Test result: underlying disk partition and raid partition can be
> > > > observed at the same time
> > > >
> > > > Note that this can still happen in come corner cases that
> > > > GD_NEED_PART_SCAN can be set for underlying disk while re-assemble raid
> > > > device.
> > > >
> > > > Fixes: e5cfefa97bcc ("block: fix scan partition for exclusively open device again")
> > > > Signed-off-by: Yu Kuai <[email protected]>
> > >
> > > The issue still can't be avoided completely, such as, after rebooting,
> > > /dev/sda1 & /dev/md0p1 can be observed at the same time. And this one
> > > should be underlying partitions scanned before re-assembling raid, I
> > > guess it may not be easy to avoid.
> >
> > So this was always happening (before my patches, after my patches, and now
> > after Yu's patches) and kernel does not have enough information to know
> > that sda will become part of md0 device in the future. But mdadm actually
> > deals with this as far as I remember and deletes partitions for all devices
> > it is assembling the array from (and quick tracing experiment I did
> > supports this).
>
> I am testing on Fedora 37, so mdadm v4.2 doesn't delete underlying
> partitions before re-assemble.

Strange, I'm on openSUSE Leap 15.4 and mdadm v4.1 deletes these partitions
(at least I can see mdadm do BLKPG_DEL_PARTITION ioctls). And checking the
mdadm sources I can see calls to remove_partitions() from the start_array()
function in Assemble.c, so I'm not sure why this is not working for you...

> Also given mdadm or related userspace has to change for avoiding
> to scan underlying partitions, just wondering why not let userspace
> to tell kernel not do it explicitly?

Well, those userspace changes have long been deployed; now you would introduce
a new API that needs to proliferate again. Not very nice. Also, how exactly
would that work? I mean, once mdadm has the underlying device open, the current
logic makes sure we do not create partitions anymore. But there's no way
mdadm could possibly prevent creation of partitions for devices it
doesn't know about yet, so it still has to delete existing partitions...

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2023-03-22 16:11:26

by Ming Lei

Subject: Re: [PATCH] block: don't set GD_NEED_PART_SCAN if scan partition failed

On Wed, Mar 22, 2023 at 02:07:09PM +0100, Jan Kara wrote:
> On Wed 22-03-23 19:34:30, Ming Lei wrote:
> > On Wed, Mar 22, 2023 at 10:47:07AM +0100, Jan Kara wrote:
> > > On Wed 22-03-23 15:58:35, Ming Lei wrote:
> > > > On Wed, Mar 22, 2023 at 11:59:26AM +0800, Yu Kuai wrote:
> > > > > From: Yu Kuai <[email protected]>
> > > > >
> > > > > Currently if disk_scan_partitions() failed, GD_NEED_PART_SCAN will still
> > > > > set, and partition scan will be proceed again when blkdev_get_by_dev()
> > > > > is called. However, this will cause a problem that re-assemble partitioned
> > > > > raid device will creat partition for underlying disk.
> > > > >
> > > > > Test procedure:
> > > > >
> > > > > mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
> > > > > sgdisk -n 0:0:+100MiB /dev/md0
> > > > > blockdev --rereadpt /dev/sda
> > > > > blockdev --rereadpt /dev/sdb
> > > > > mdadm -S /dev/md0
> > > > > mdadm -A /dev/md0 /dev/sda /dev/sdb
> > > > >
> > > > > Test result: underlying disk partition and raid partition can be
> > > > > observed at the same time
> > > > >
> > > > > Note that this can still happen in come corner cases that
> > > > > GD_NEED_PART_SCAN can be set for underlying disk while re-assemble raid
> > > > > device.
> > > > >
> > > > > Fixes: e5cfefa97bcc ("block: fix scan partition for exclusively open device again")
> > > > > Signed-off-by: Yu Kuai <[email protected]>
> > > >
> > > > The issue still can't be avoided completely, such as, after rebooting,
> > > > /dev/sda1 & /dev/md0p1 can be observed at the same time. And this one
> > > > should be underlying partitions scanned before re-assembling raid, I
> > > > guess it may not be easy to avoid.
> > >
> > > So this was always happening (before my patches, after my patches, and now
> > > after Yu's patches) and kernel does not have enough information to know
> > > that sda will become part of md0 device in the future. But mdadm actually
> > > deals with this as far as I remember and deletes partitions for all devices
> > > it is assembling the array from (and quick tracing experiment I did
> > > supports this).
> >
> > I am testing on Fedora 37, so mdadm v4.2 doesn't delete underlying
> > partitions before re-assemble.
>
> Strange, I'm on openSUSE Leap 15.4 and mdadm v4.1 deletes these partitions
> (at least I can see mdadm do BLKPG_DEL_PARTITION ioctls). And checking
> mdadm sources I can see calls to remove_partitions() from start_array()
> function in Assemble.c so I'm not sure why this is not working for you...

I added dump_stack() in delete_partition() for partition 1, but did not
observe a stack trace during booting.

>
> > Also given mdadm or related userspace has to change for avoiding
> > to scan underlying partitions, just wondering why not let userspace
> > to tell kernel not do it explicitly?
>
> Well, those userspace changes are long deployed, now you would introduce
> new API that needs to proliferate again. Not very nice. Also how would that
> exactly work? I mean once mdadm has underlying device open, the current
> logic makes sure we do not create partitions anymore. But there's no way
> how mdadm could possibly prevent creation of partitions for devices it
> doesn't know about yet so it still has to delete existing partitions...

I meant: if mdadm has to change to delete existing partitions, why not add
one ioctl to disable partition scan for this disk when deleting
partitions/re-assembling, and re-enable the scan after stopping the array?

But it looks like that isn't so, since you mentioned that remove_partitions()
is supposed to be called before starting the array, yet I didn't observe this
behavior.

I am worried that the current approach may cause a regression; one concern is
that ioctl(BLKRRPART) needs exclusive open now, for example:

1) mount /dev/vdb1 /mnt

2) ioctl(BLKRRPART) may fail after removing /dev/vdb3


thanks,
Ming

2023-03-23 11:01:07

by Jan Kara

Subject: Re: [PATCH] block: don't set GD_NEED_PART_SCAN if scan partition failed

On Thu 23-03-23 00:08:51, Ming Lei wrote:
> On Wed, Mar 22, 2023 at 02:07:09PM +0100, Jan Kara wrote:
> > On Wed 22-03-23 19:34:30, Ming Lei wrote:
> > > On Wed, Mar 22, 2023 at 10:47:07AM +0100, Jan Kara wrote:
> > > > On Wed 22-03-23 15:58:35, Ming Lei wrote:
> > > > > On Wed, Mar 22, 2023 at 11:59:26AM +0800, Yu Kuai wrote:
> > > > > > From: Yu Kuai <[email protected]>
> > > > > >
> > > > > > Currently if disk_scan_partitions() failed, GD_NEED_PART_SCAN will still
> > > > > > set, and partition scan will be proceed again when blkdev_get_by_dev()
> > > > > > is called. However, this will cause a problem that re-assemble partitioned
> > > > > > raid device will creat partition for underlying disk.
> > > > > >
> > > > > > Test procedure:
> > > > > >
> > > > > > mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
> > > > > > sgdisk -n 0:0:+100MiB /dev/md0
> > > > > > blockdev --rereadpt /dev/sda
> > > > > > blockdev --rereadpt /dev/sdb
> > > > > > mdadm -S /dev/md0
> > > > > > mdadm -A /dev/md0 /dev/sda /dev/sdb
> > > > > >
> > > > > > Test result: underlying disk partition and raid partition can be
> > > > > > observed at the same time
> > > > > >
> > > > > > Note that this can still happen in come corner cases that
> > > > > > GD_NEED_PART_SCAN can be set for underlying disk while re-assemble raid
> > > > > > device.
> > > > > >
> > > > > > Fixes: e5cfefa97bcc ("block: fix scan partition for exclusively open device again")
> > > > > > Signed-off-by: Yu Kuai <[email protected]>
> > > > >
> > > > > The issue still can't be avoided completely, such as, after rebooting,
> > > > > /dev/sda1 & /dev/md0p1 can be observed at the same time. And this one
> > > > > should be underlying partitions scanned before re-assembling raid, I
> > > > > guess it may not be easy to avoid.
> > > >
> > > > So this was always happening (before my patches, after my patches, and now
> > > > after Yu's patches) and kernel does not have enough information to know
> > > > that sda will become part of md0 device in the future. But mdadm actually
> > > > deals with this as far as I remember and deletes partitions for all devices
> > > > it is assembling the array from (and quick tracing experiment I did
> > > > supports this).
> > >
> > > I am testing on Fedora 37, so mdadm v4.2 doesn't delete underlying
> > > partitions before re-assemble.
> >
> > Strange, I'm on openSUSE Leap 15.4 and mdadm v4.1 deletes these partitions
> > (at least I can see mdadm do BLKPG_DEL_PARTITION ioctls). And checking
> > mdadm sources I can see calls to remove_partitions() from start_array()
> > function in Assemble.c so I'm not sure why this is not working for you...
>
> I added dump_stack() in delete_partition() for partition 1, not observe
> stack trace during booting.
>
> >
> > > Also given mdadm or related userspace has to change for avoiding
> > > to scan underlying partitions, just wondering why not let userspace
> > > to tell kernel not do it explicitly?
> >
> > Well, those userspace changes are long deployed, now you would introduce
> > new API that needs to proliferate again. Not very nice. Also how would that
> > exactly work? I mean once mdadm has underlying device open, the current
> > logic makes sure we do not create partitions anymore. But there's no way
> > how mdadm could possibly prevent creation of partitions for devices it
> > doesn't know about yet so it still has to delete existing partitions...
>
> I meant if mdadm has to change to delete existed partitions, why not add
> one ioctl to disable partition scan for this disk when deleting
> partitions/re-assemble, and re-enable scan after stopping array.
>
> But looks it isn't so, since you mentioned that remove_partitions is
> supposed to be called before starting array, however I didn't observe this
> behavior.

Yeah, not sure what's happening on your system.

> I am worrying if the current approach may cause regression, one concern is
> that ioctl(BLKRRPART) needs exclusive open now, such as:
>
> 1) mount /dev/vdb1 /mnt
>
> 2) ioctl(BLKRRPART) may fail after removing /dev/vdb3

Well, but we always had some variant of:

if (disk->open_partitions)
return -EBUSY;

in disk_scan_partitions(). So as long as any partition on the disk is used,
EBUSY is the correct return value from BLKRRPART.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2023-03-23 12:15:21

by Ming Lei

Subject: Re: [PATCH] block: don't set GD_NEED_PART_SCAN if scan partition failed

On Thu, Mar 23, 2023 at 11:51:20AM +0100, Jan Kara wrote:
> On Thu 23-03-23 00:08:51, Ming Lei wrote:
> > On Wed, Mar 22, 2023 at 02:07:09PM +0100, Jan Kara wrote:
> > > On Wed 22-03-23 19:34:30, Ming Lei wrote:
> > > > On Wed, Mar 22, 2023 at 10:47:07AM +0100, Jan Kara wrote:
> > > > > On Wed 22-03-23 15:58:35, Ming Lei wrote:
> > > > > > On Wed, Mar 22, 2023 at 11:59:26AM +0800, Yu Kuai wrote:
> > > > > > > From: Yu Kuai <[email protected]>
> > > > > > >
> > > > > > > Currently if disk_scan_partitions() fails, GD_NEED_PART_SCAN will still be
> > > > > > > set, and the partition scan will proceed again when blkdev_get_by_dev()
> > > > > > > is called. However, this causes a problem: re-assembling a partitioned
> > > > > > > raid device will create partitions for the underlying disk.
> > > > > > >
> > > > > > > Test procedure:
> > > > > > >
> > > > > > > mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
> > > > > > > sgdisk -n 0:0:+100MiB /dev/md0
> > > > > > > blockdev --rereadpt /dev/sda
> > > > > > > blockdev --rereadpt /dev/sdb
> > > > > > > mdadm -S /dev/md0
> > > > > > > mdadm -A /dev/md0 /dev/sda /dev/sdb
> > > > > > >
> > > > > > > Test result: underlying disk partition and raid partition can be
> > > > > > > observed at the same time
> > > > > > >
> > > > > > > Note that this can still happen in some corner cases where
> > > > > > > GD_NEED_PART_SCAN can be set for the underlying disk while re-assembling
> > > > > > > the raid device.
> > > > > > >
> > > > > > > Fixes: e5cfefa97bcc ("block: fix scan partition for exclusively open device again")
> > > > > > > Signed-off-by: Yu Kuai <[email protected]>
> > > > > >
> > > > > > The issue still can't be avoided completely; for example, after rebooting,
> > > > > > /dev/sda1 & /dev/md0p1 can be observed at the same time. In that case the
> > > > > > underlying partitions were scanned before the raid was re-assembled, and I
> > > > > > guess it may not be easy to avoid.
> > > > >
> > > > > So this was always happening (before my patches, after my patches, and now
> > > > > after Yu's patches) and the kernel does not have enough information to know
> > > > > that sda will become part of the md0 device in the future. But mdadm actually
> > > > > deals with this as far as I remember and deletes partitions for all devices
> > > > > it is assembling the array from (and a quick tracing experiment I did
> > > > > supports this).
> > > >
> > > > I am testing on Fedora 37, and there mdadm v4.2 doesn't delete the underlying
> > > > partitions before re-assembling.
> > >
> > > Strange, I'm on openSUSE Leap 15.4 and mdadm v4.1 deletes these partitions
> > > (at least I can see mdadm do BLKPG_DEL_PARTITION ioctls). And checking
> > > mdadm sources I can see calls to remove_partitions() from start_array()
> > > function in Assemble.c so I'm not sure why this is not working for you...
> >
> > I added dump_stack() in delete_partition() for partition 1, and did not
> > observe any stack trace during booting.
> >
> > >
> > > > Also, given that mdadm or related userspace has to change to avoid
> > > > scanning the underlying partitions, I'm just wondering why not let userspace
> > > > tell the kernel explicitly not to do it?
> > >
> > > Well, those userspace changes have long been deployed; now you would introduce
> > > a new API that needs to proliferate again. Not very nice. Also, how exactly
> > > would that work? I mean, once mdadm has the underlying device open, the current
> > > logic makes sure we do not create partitions anymore. But there's no way
> > > mdadm could possibly prevent the creation of partitions for devices it
> > > doesn't know about yet, so it still has to delete existing partitions...
> >
> > I meant that if mdadm has to change to delete existing partitions, why not add
> > an ioctl to disable partition scanning for the disk when deleting
> > partitions / re-assembling, and re-enable scanning after stopping the array.
> >
> > But it looks like that isn't the case, since you mentioned that remove_partitions()
> > is supposed to be called before starting the array, yet I didn't observe this
> > behavior.
>
> Yeah, not sure what's happening on your system.

It looks like the issue does not show up on Fedora 38, but it does happen on Fedora 37.

>
> > I am worried that the current approach may cause a regression; one concern is
> > that ioctl(BLKRRPART) needs an exclusive open now, for example:
> >
> > 1) mount /dev/vdb1 /mnt
> >
> > 2) ioctl(BLKRRPART) may fail after removing /dev/vdb3
>
> Well, but we always had some variant of:
>
> if (disk->open_partitions)
> return -EBUSY;
>
> in disk_scan_partitions(). So as long as any partition on the disk is used,
> EBUSY is the correct return value from BLKRRPART.

OK, I missed that check.

Then the change can basically be thought of as ioctl(BLKRRPART) requiring O_EXCL:
if one app just keeps the disk open with O_EXCL for a while, ioctl(BLKRRPART)
can't be done from another process. I hope there isn't such a case in practice.
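
Just to illustrate that scenario (a minimal userspace sketch; the device path is
made up and error handling is omitted, so treat it as an assumption rather than
something from this thread): while one opener holds the whole disk with O_EXCL,
a BLKRRPART issued through another, non-exclusive open is expected to come back
with EBUSY:

	/*
	 * Minimal sketch of the concern above. In practice the two opens
	 * would live in two different processes.
	 */
	#include <fcntl.h>
	#include <stdio.h>
	#include <unistd.h>
	#include <sys/ioctl.h>
	#include <linux/fs.h>

	int main(void)
	{
		/* "app A": exclusive open of the whole disk, held for a while */
		int excl_fd = open("/dev/vdb", O_RDONLY | O_EXCL);

		/* "app B": ordinary open, then try to re-read the partition table */
		int fd = open("/dev/vdb", O_RDONLY);

		if (ioctl(fd, BLKRRPART) < 0)
			perror("BLKRRPART");	/* expected: EBUSY while excl_fd is held */

		close(fd);
		close(excl_fd);
		return 0;
	}

Against a disk that nobody holds exclusively, the same BLKRRPART call is
expected to succeed and trigger a rescan.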


Thanks,
Ming

2023-03-24 00:07:13

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] block: don't set GD_NEED_PART_SCAN if scan partition failed

On Wed, Mar 22, 2023 at 11:59:26AM +0800, Yu Kuai wrote:
> From: Yu Kuai <[email protected]>
>
> Currently if disk_scan_partitions() fails, GD_NEED_PART_SCAN will still be
> set, and the partition scan will proceed again when blkdev_get_by_dev()
> is called. However, this causes a problem: re-assembling a partitioned
> raid device will create partitions for the underlying disk.
>
> Test procedure:
>
> mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
> sgdisk -n 0:0:+100MiB /dev/md0
> blockdev --rereadpt /dev/sda
> blockdev --rereadpt /dev/sdb
> mdadm -S /dev/md0
> mdadm -A /dev/md0 /dev/sda /dev/sdb
>
> Test result: underlying disk partition and raid partition can be
> observed at the same time
>
> Note that this can still happen in some corner cases where
> GD_NEED_PART_SCAN can be set for the underlying disk while re-assembling
> the raid device.

That is why I suggest touching this flag as little as possible, and maybe
replacing it with a function parameter in the future.
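
Purely as an illustration of that idea (hypothetical names, not existing kernel
API; just a sketch of how the request could be passed explicitly instead of
being parked in disk->state):

	/*
	 * Hypothetical sketch only: a blkdev_get_by_dev() variant that takes
	 * the "scan partitions" request as an explicit argument, so callers
	 * never have to set GD_NEED_PART_SCAN themselves.
	 */
	struct block_device *blkdev_get_by_dev_scan(dev_t dev, fmode_t mode,
						    void *holder, bool scan_partitions);

	static int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
	{
		struct block_device *bdev;

		/* claiming and open_partitions checks unchanged */
		bdev = blkdev_get_by_dev_scan(disk_devt(disk), mode & ~FMODE_EXCL,
					      NULL, true);
		if (IS_ERR(bdev))
			return PTR_ERR(bdev);
		blkdev_put(bdev, mode & ~FMODE_EXCL);
		return 0;
	}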

>
> Fixes: e5cfefa97bcc ("block: fix scan partition for exclusively open device again")
> Signed-off-by: Yu Kuai <[email protected]>

So far, let's move on with the fix:

Reviewed-by: Ming Lei <[email protected]>

Thanks,
Ming

2023-04-06 03:52:00

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH] block: don't set GD_NEED_PART_SCAN if scan partition failed

Hi, Jens!

On 2023/03/22 11:59, Yu Kuai wrote:
> From: Yu Kuai <[email protected]>
>
> Currently if disk_scan_partitions() fails, GD_NEED_PART_SCAN will still be
> set, and the partition scan will proceed again when blkdev_get_by_dev()
> is called. However, this causes a problem: re-assembling a partitioned
> raid device will create partitions for the underlying disk.
>
> Test procedure:
>
> mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
> sgdisk -n 0:0:+100MiB /dev/md0
> blockdev --rereadpt /dev/sda
> blockdev --rereadpt /dev/sdb
> mdadm -S /dev/md0
> mdadm -A /dev/md0 /dev/sda /dev/sdb
>
> Test result: underlying disk partition and raid partition can be
> observed at the same time
>
> Note that this can still happen in some corner cases where
> GD_NEED_PART_SCAN can be set for the underlying disk while re-assembling
> the raid device.
>

Can you apply this patch?

Thanks,
Kuai
> Fixes: e5cfefa97bcc ("block: fix scan partition for exclusively open device again")
> Signed-off-by: Yu Kuai <[email protected]>
> ---
> block/genhd.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/block/genhd.c b/block/genhd.c
> index 08bb1a9ec22c..a72e27d6779d 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -368,7 +368,6 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
> if (disk->open_partitions)
> return -EBUSY;
>
> - set_bit(GD_NEED_PART_SCAN, &disk->state);
> /*
> * If the device is opened exclusively by current thread already, it's
> * safe to scan partitons, otherwise, use bd_prepare_to_claim() to
> @@ -381,12 +380,19 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
> return ret;
> }
>
> + set_bit(GD_NEED_PART_SCAN, &disk->state);
> bdev = blkdev_get_by_dev(disk_devt(disk), mode & ~FMODE_EXCL, NULL);
> if (IS_ERR(bdev))
> ret = PTR_ERR(bdev);
> else
> blkdev_put(bdev, mode & ~FMODE_EXCL);
>
> + /*
> + * If blkdev_get_by_dev() failed early, GD_NEED_PART_SCAN is still set,
> + * and this will cause that re-assemble partitioned raid device will
> + * creat partition for underlying disk.
> + */
> + clear_bit(GD_NEED_PART_SCAN, &disk->state);
> if (!(mode & FMODE_EXCL))
> bd_abort_claiming(disk->part0, disk_scan_partitions);
> return ret;
>
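
For reference, with the hunk above applied the function ends up doing roughly
the following (a simplified sketch, not the literal kernel code; the early
sanity checks are omitted):

	static int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
	{
		struct block_device *bdev;
		int ret = 0;

		if (disk->open_partitions)
			return -EBUSY;

		/* claim the disk unless the caller already holds it exclusively */
		if (!(mode & FMODE_EXCL)) {
			ret = bd_prepare_to_claim(disk->part0, disk_scan_partitions);
			if (ret)
				return ret;
		}

		set_bit(GD_NEED_PART_SCAN, &disk->state);
		bdev = blkdev_get_by_dev(disk_devt(disk), mode & ~FMODE_EXCL, NULL);
		if (IS_ERR(bdev))
			ret = PTR_ERR(bdev);
		else
			blkdev_put(bdev, mode & ~FMODE_EXCL);

		/* never leave the flag behind for the next opener if the open failed */
		clear_bit(GD_NEED_PART_SCAN, &disk->state);

		if (!(mode & FMODE_EXCL))
			bd_abort_claiming(disk->part0, disk_scan_partitions);
		return ret;
	}

The important property is that GD_NEED_PART_SCAN is only set once claiming has
succeeded, and is cleared again even when blkdev_get_by_dev() fails, which is
what prevents a stale flag from triggering a partition scan on the underlying
disk later.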

2023-04-06 22:47:09

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] block: don't set GD_NEED_PART_SCAN if scan partition failed

On 4/5/23 9:42 PM, Yu Kuai wrote:
> Hi, Jens!
>
> On 2023/03/22 11:59, Yu Kuai wrote:
>> From: Yu Kuai <[email protected]>
>>
>> Currently if disk_scan_partitions() fails, GD_NEED_PART_SCAN will still be
>> set, and the partition scan will proceed again when blkdev_get_by_dev()
>> is called. However, this causes a problem: re-assembling a partitioned
>> raid device will create partitions for the underlying disk.
>>
>> Test procedure:
>>
>> mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
>> sgdisk -n 0:0:+100MiB /dev/md0
>> blockdev --rereadpt /dev/sda
>> blockdev --rereadpt /dev/sdb
>> mdadm -S /dev/md0
>> mdadm -A /dev/md0 /dev/sda /dev/sdb
>>
>> Test result: underlying disk partition and raid partition can be
>> observed at the same time
>>
>> Note that this can still happen in some corner cases where
>> GD_NEED_PART_SCAN can be set for the underlying disk while re-assembling
>> the raid device.
>>
>
> Can you apply this patch?

None of them apply to my for-6.4/block branch...

--
Jens Axboe


2023-04-07 02:12:09

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] block: don't set GD_NEED_PART_SCAN if scan partition failed

On Thu, Apr 06, 2023 at 04:29:43PM -0600, Jens Axboe wrote:
> On 4/5/23 9:42 PM, Yu Kuai wrote:
> > Hi, Jens!
> >
> > On 2023/03/22 11:59, Yu Kuai wrote:
> >> From: Yu Kuai <[email protected]>
> >>
> >> Currently if disk_scan_partitions() fails, GD_NEED_PART_SCAN will still be
> >> set, and the partition scan will proceed again when blkdev_get_by_dev()
> >> is called. However, this causes a problem: re-assembling a partitioned
> >> raid device will create partitions for the underlying disk.
> >>
> >> Test procedure:
> >>
> >> mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
> >> sgdisk -n 0:0:+100MiB /dev/md0
> >> blockdev --rereadpt /dev/sda
> >> blockdev --rereadpt /dev/sdb
> >> mdadm -S /dev/md0
> >> mdadm -A /dev/md0 /dev/sda /dev/sdb
> >>
> >> Test result: underlying disk partition and raid partition can be
> >> observed at the same time
> >>
> >> Note that this can still happen in some corner cases where
> >> GD_NEED_PART_SCAN can be set for the underlying disk while re-assembling
> >> the raid device.
> >>
> >
> > Can you apply this patch?
>
> None of them apply to my for-6.4/block branch...

This patch is a bug fix, and probably should target 6.3.

Thanks,
Ming

2023-04-07 03:09:13

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] block: don't set GD_NEED_PART_SCAN if scan partition failed

On 4/6/23 8:01 PM, Ming Lei wrote:
> On Thu, Apr 06, 2023 at 04:29:43PM -0600, Jens Axboe wrote:
>> On 4/5/23 9:42 PM, Yu Kuai wrote:
>>> Hi, Jens!
>>>
>>> On 2023/03/22 11:59, Yu Kuai wrote:
>>>> From: Yu Kuai <[email protected]>
>>>>
>>>> Currently if disk_scan_partitions() fails, GD_NEED_PART_SCAN will still be
>>>> set, and the partition scan will proceed again when blkdev_get_by_dev()
>>>> is called. However, this causes a problem: re-assembling a partitioned
>>>> raid device will create partitions for the underlying disk.
>>>>
>>>> Test procedure:
>>>>
>>>> mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
>>>> sgdisk -n 0:0:+100MiB /dev/md0
>>>> blockdev --rereadpt /dev/sda
>>>> blockdev --rereadpt /dev/sdb
>>>> mdadm -S /dev/md0
>>>> mdadm -A /dev/md0 /dev/sda /dev/sdb
>>>>
>>>> Test result: underlying disk partition and raid partition can be
>>>> observed at the same time
>>>>
>>>> Note that this can still happen in some corner cases where
>>>> GD_NEED_PART_SCAN can be set for the underlying disk while re-assembling
>>>> the raid device.
>>>>
>>>
>>> Can you apply this patch?
>>
>> None of them apply to my for-6.4/block branch...
>
> This patch is bug fix, and probably should aim at 6.3.

Yeah I see now, but it's a bit of a mashup of 2 patches, and
then a separate one. I've applied the single fixup for 6.3.

--
Jens Axboe