2023-12-11 08:18:53

by Li Nan

[permalink] [raw]
Subject: [PATCH] md: Don't clear MD_CLOSING when the raid is about to stop

From: Li Nan <[email protected]>

The raid should not be opened anymore when it is about to be stopped.
However, other processes can open it again if the flag MD_CLOSING is
cleared before exiting. From now on, this flag will not be cleared when
the raid will be stopped.

Fixes: 065e519e71b2 ("md: MD_CLOSING needs to be cleared after called md_set_readonly or do_md_stop")
Signed-off-by: Li Nan <[email protected]>
---
drivers/md/md.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4e9fe5cbeedc..ebdfc9068a60 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6238,7 +6238,6 @@ static void md_clean(struct mddev *mddev)
mddev->persistent = 0;
mddev->level = LEVEL_NONE;
mddev->clevel[0] = 0;
- mddev->flags = 0;
mddev->sb_flags = 0;
mddev->ro = MD_RDWR;
mddev->metadata_type[0] = 0;
@@ -7614,7 +7613,6 @@ static int md_ioctl(struct block_device *bdev, blk_mode_t mode,
int err = 0;
void __user *argp = (void __user *)arg;
struct mddev *mddev = NULL;
- bool did_set_md_closing = false;

if (!md_ioctl_valid(cmd))
return -ENOTTY;
@@ -7698,7 +7696,6 @@ static int md_ioctl(struct block_device *bdev, blk_mode_t mode,
err = -EBUSY;
goto out;
}
- did_set_md_closing = true;
mutex_unlock(&mddev->open_mutex);
sync_blockdev(bdev);
}
@@ -7742,10 +7739,13 @@ static int md_ioctl(struct block_device *bdev, blk_mode_t mode,

case STOP_ARRAY:
err = do_md_stop(mddev, 0, bdev);
+ if (err)
+ clear_bit(MD_CLOSING, &mddev->flags);
goto unlock;

case STOP_ARRAY_RO:
err = md_set_readonly(mddev, bdev);
+ clear_bit(MD_CLOSING, &mddev->flags);
goto unlock;

case HOT_REMOVE_DISK:
@@ -7840,8 +7840,6 @@ static int md_ioctl(struct block_device *bdev, blk_mode_t mode,
mddev_unlock(mddev);

out:
- if(did_set_md_closing)
- clear_bit(MD_CLOSING, &mddev->flags);
return err;
}
#ifdef CONFIG_COMPAT
--
2.39.2


2023-12-11 09:56:37

by Mariusz Tkaczyk

[permalink] [raw]
Subject: Re: [PATCH] md: Don't clear MD_CLOSING when the raid is about to stop

On Mon, 11 Dec 2023 16:17:14 +0800
[email protected] wrote:

> From: Li Nan <[email protected]>
>
> The raid should not be opened anymore when it is about to be stopped.
> However, other processes can open it again if the flag MD_CLOSING is
> cleared before exiting. From now on, this flag will not be cleared when
> the raid will be stopped.
>
> Fixes: 065e519e71b2 ("md: MD_CLOSING needs to be cleared after called
> md_set_readonly or do_md_stop") Signed-off-by: Li Nan <[email protected]>

Hello Li Nan,
I was there when I needed to fix this:
https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?h=md-next&id=c8870379a21fbd9ad14ca36204ccfbe9d25def43

For sure, you have to consider applying same solution for array_store "clear".
Minor nit below.

Thanks,
Mariusz

> ---
> drivers/md/md.c | 8 +++-----
> 1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 4e9fe5cbeedc..ebdfc9068a60 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -6238,7 +6238,6 @@ static void md_clean(struct mddev *mddev)
> mddev->persistent = 0;
> mddev->level = LEVEL_NONE;
> mddev->clevel[0] = 0;
> - mddev->flags = 0;

I recommend (safety recommendation):
mddev->flags = MD_CLOSING;

Unless you can prove that other flags cannot race.

> mddev->sb_flags = 0;
> mddev->ro = MD_RDWR;
> mddev->metadata_type[0] = 0;

2023-12-12 03:21:46

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH] md: Don't clear MD_CLOSING when the raid is about to stop

Hi,

?? 2023/12/11 17:56, Mariusz Tkaczyk д??:
> On Mon, 11 Dec 2023 16:17:14 +0800
> [email protected] wrote:
>
>> From: Li Nan <[email protected]>
>>
>> The raid should not be opened anymore when it is about to be stopped.
>> However, other processes can open it again if the flag MD_CLOSING is
>> cleared before exiting. From now on, this flag will not be cleared when
>> the raid will be stopped.
>>
>> Fixes: 065e519e71b2 ("md: MD_CLOSING needs to be cleared after called
>> md_set_readonly or do_md_stop") Signed-off-by: Li Nan <[email protected]>
>
> Hello Li Nan,
> I was there when I needed to fix this:
> https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?h=md-next&id=c8870379a21fbd9ad14ca36204ccfbe9d25def43
>
> For sure, you have to consider applying same solution for array_store "clear".
> Minor nit below.
>
> Thanks,
> Mariusz
>
>> ---
>> drivers/md/md.c | 8 +++-----
>> 1 file changed, 3 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index 4e9fe5cbeedc..ebdfc9068a60 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -6238,7 +6238,6 @@ static void md_clean(struct mddev *mddev)
>> mddev->persistent = 0;
>> mddev->level = LEVEL_NONE;
>> mddev->clevel[0] = 0;
>> - mddev->flags = 0;
>
> I recommend (safety recommendation):
> mddev->flags = MD_CLOSING;

Taking a look I think both MD_CLOSING and MD_DELETED should not be
cleared, however, there is no guarantee that MD_CLOSING will be set
before md_clean, because mdadm can be removed without running. Hence I
think just set MD_CLOSING is werid.

I think the proper way is to keep MD_CLOSING and MD_DELETED if they are
set. However, there is no such api to clear other bits at once. Since
we're not expecting anyone else to write flags, following maybe
acceptable:

mddev->flags &= BIT_ULL_MASK(MD_CLOSING) | BIT_ULL_MASK(MD_DELETED);

Or after making sure other flags cannot race, this patch is ok.

Thanks,
Kuai

>
> Unless you can prove that other flags cannot race.
>
>> mddev->sb_flags = 0;
>> mddev->ro = MD_RDWR;
>> mddev->metadata_type[0] = 0;
>
> .
>

2023-12-12 09:35:10

by Mariusz Tkaczyk

[permalink] [raw]
Subject: Re: [PATCH] md: Don't clear MD_CLOSING when the raid is about to stop

On Tue, 12 Dec 2023 11:21:28 +0800
Yu Kuai <[email protected]> wrote:

> Hi,
>
> ?? 2023/12/11 17:56, Mariusz Tkaczyk д??:
> > On Mon, 11 Dec 2023 16:17:14 +0800
> > [email protected] wrote:
> >
> >> From: Li Nan <[email protected]>
> >>
> >> The raid should not be opened anymore when it is about to be stopped.
> >> However, other processes can open it again if the flag MD_CLOSING is
> >> cleared before exiting. From now on, this flag will not be cleared when
> >> the raid will be stopped.
> >>
> >> Fixes: 065e519e71b2 ("md: MD_CLOSING needs to be cleared after called
> >> md_set_readonly or do_md_stop") Signed-off-by: Li Nan
> >> <[email protected]>
> >
> > Hello Li Nan,
> > I was there when I needed to fix this:
> > https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?h=md-next&id=c8870379a21fbd9ad14ca36204ccfbe9d25def43
> >
> > For sure, you have to consider applying same solution for array_store
> > "clear". Minor nit below.
> >
> > Thanks,
> > Mariusz
> >
> >> ---
> >> drivers/md/md.c | 8 +++-----
> >> 1 file changed, 3 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/drivers/md/md.c b/drivers/md/md.c
> >> index 4e9fe5cbeedc..ebdfc9068a60 100644
> >> --- a/drivers/md/md.c
> >> +++ b/drivers/md/md.c
> >> @@ -6238,7 +6238,6 @@ static void md_clean(struct mddev *mddev)
> >> mddev->persistent = 0;
> >> mddev->level = LEVEL_NONE;
> >> mddev->clevel[0] = 0;
> >> - mddev->flags = 0;
> >
> > I recommend (safety recommendation):
> > mddev->flags = MD_CLOSING;
>
> Taking a look I think both MD_CLOSING and MD_DELETED should not be
> cleared, however, there is no guarantee that MD_CLOSING will be set
> before md_clean, because mdadm can be removed without running. Hence I
> think just set MD_CLOSING is werid.
>
> I think the proper way is to keep MD_CLOSING and MD_DELETED if they are
> set. However, there is no such api to clear other bits at once. Since
> we're not expecting anyone else to write flags, following maybe
> acceptable:
>
> mddev->flags &= BIT_ULL_MASK(MD_CLOSING) | BIT_ULL_MASK(MD_DELETED);

Yes, MD_CLOSING is a bit number to not a bit value I can assign directly.
Thanks for clarifying!

Mariusz
>
> Or after making sure other flags cannot race, this patch is ok.
>
> Thanks,
> Kuai
>
> >
> > Unless you can prove that other flags cannot race.
> >
> >> mddev->sb_flags = 0;
> >> mddev->ro = MD_RDWR;
> >> mddev->metadata_type[0] = 0;
> >
> > .
> >
>

2023-12-26 06:25:05

by Oliver Sang

[permalink] [raw]
Subject: Re: [PATCH] md: Don't clear MD_CLOSING when the raid is about to stop



Hello,

kernel test robot noticed "mdadm-selftests.06name.fail" on:

commit: 4d060913335fad6f1d1f0816859c20aae823851c ("[PATCH] md: Don't clear MD_CLOSING when the raid is about to stop")
url: https://github.com/intel-lab-lkp/linux/commits/linan666-huaweicloud-com/md-Don-t-clear-MD_CLOSING-when-the-raid-is-about-to-stop/20231211-162010
base: git://git.kernel.org/cgit/linux/kernel/git/song/md.git md-next
patch link: https://lore.kernel.org/all/[email protected]/
patch subject: [PATCH] md: Don't clear MD_CLOSING when the raid is about to stop

in testcase: mdadm-selftests
version: mdadm-selftests-x86_64-5f41845-1_20220826
with following parameters:

disk: 1HDD
test_prefix: 06name



compiler: gcc-12
test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-4790T CPU @ 2.70GHz (Haswell) with 16G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-lkp/[email protected]

2023-12-24 10:19:28 mkdir -p /var/tmp
2023-12-24 10:19:28 mke2fs -t ext3 -b 4096 -J size=4 -q /dev/sdb1
2023-12-24 10:19:59 mount -t ext3 /dev/sdb1 /var/tmp
sed -e 's/{DEFAULT_METADATA}/1.2/g' \
-e 's,{MAP_PATH},/run/mdadm/map,g' mdadm.8.in > mdadm.8
/usr/bin/install -D -m 644 mdadm.8 /usr/share/man/man8/mdadm.8
/usr/bin/install -D -m 644 mdmon.8 /usr/share/man/man8/mdmon.8
/usr/bin/install -D -m 644 md.4 /usr/share/man/man4/md.4
/usr/bin/install -D -m 644 mdadm.conf.5 /usr/share/man/man5/mdadm.conf.5
/usr/bin/install -D -m 644 udev-md-raid-creating.rules /lib/udev/rules.d/01-md-raid-creating.rules
/usr/bin/install -D -m 644 udev-md-raid-arrays.rules /lib/udev/rules.d/63-md-raid-arrays.rules
/usr/bin/install -D -m 644 udev-md-raid-assembly.rules /lib/udev/rules.d/64-md-raid-assembly.rules
/usr/bin/install -D -m 644 udev-md-clustered-confirm-device.rules /lib/udev/rules.d/69-md-clustered-confirm-device.rules
/usr/bin/install -D -m 755 mdadm /sbin/mdadm
/usr/bin/install -D -m 755 mdmon /sbin/mdmon
Testing on linux-6.7.0-rc3-00014-g4d060913335f kernel
/lkp/benchmarks/mdadm-selftests/tests/06name... FAILED - see /var/tmp/06name.log and /var/tmp/fail06name.log for details



The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231226/[email protected]



--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


2023-12-26 12:23:01

by Li Nan

[permalink] [raw]
Subject: Re: [PATCH] md: Don't clear MD_CLOSING when the raid is about to stop



在 2023/12/26 14:24, kernel test robot 写道:
>
>
> Hello,
>
> kernel test robot noticed "mdadm-selftests.06name.fail" on:
>
> commit: 4d060913335fad6f1d1f0816859c20aae823851c ("[PATCH] md: Don't clear MD_CLOSING when the raid is about to stop")
> url: https://github.com/intel-lab-lkp/linux/commits/linan666-huaweicloud-com/md-Don-t-clear-MD_CLOSING-when-the-raid-is-about-to-stop/20231211-162010
> base: git://git.kernel.org/cgit/linux/kernel/git/song/md.git md-next
> patch link: https://lore.kernel.org/all/[email protected]/
> patch subject: [PATCH] md: Don't clear MD_CLOSING when the raid is about to stop
>
> in testcase: mdadm-selftests
> version: mdadm-selftests-x86_64-5f41845-1_20220826
> with following parameters:
>
> disk: 1HDD
> test_prefix: 06name
>
>
>
> compiler: gcc-12
> test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-4790T CPU @ 2.70GHz (Haswell) with 16G memory
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
>
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <[email protected]>
> | Closes: https://lore.kernel.org/oe-lkp/[email protected]
>
> 2023-12-24 10:19:28 mkdir -p /var/tmp
> 2023-12-24 10:19:28 mke2fs -t ext3 -b 4096 -J size=4 -q /dev/sdb1
> 2023-12-24 10:19:59 mount -t ext3 /dev/sdb1 /var/tmp
> sed -e 's/{DEFAULT_METADATA}/1.2/g' \
> -e 's,{MAP_PATH},/run/mdadm/map,g' mdadm.8.in > mdadm.8
> /usr/bin/install -D -m 644 mdadm.8 /usr/share/man/man8/mdadm.8
> /usr/bin/install -D -m 644 mdmon.8 /usr/share/man/man8/mdmon.8
> /usr/bin/install -D -m 644 md.4 /usr/share/man/man4/md.4
> /usr/bin/install -D -m 644 mdadm.conf.5 /usr/share/man/man5/mdadm.conf.5
> /usr/bin/install -D -m 644 udev-md-raid-creating.rules /lib/udev/rules.d/01-md-raid-creating.rules
> /usr/bin/install -D -m 644 udev-md-raid-arrays.rules /lib/udev/rules.d/63-md-raid-arrays.rules
> /usr/bin/install -D -m 644 udev-md-raid-assembly.rules /lib/udev/rules.d/64-md-raid-assembly.rules
> /usr/bin/install -D -m 644 udev-md-clustered-confirm-device.rules /lib/udev/rules.d/69-md-clustered-confirm-device.rules
> /usr/bin/install -D -m 755 mdadm /sbin/mdadm
> /usr/bin/install -D -m 755 mdmon /sbin/mdmon
> Testing on linux-6.7.0-rc3-00014-g4d060913335f kernel
> /lkp/benchmarks/mdadm-selftests/tests/06name... FAILED - see /var/tmp/06name.log and /var/tmp/fail06name.log for details
>
>

If MD_CLOSING is not cleared when stopping, it will still exist when
assembling and mddev can't be opened anymore. We need to clear it when
starting. I will fix it in next version.

>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20231226/[email protected]
>
>
>

--
Thanks,
Nan


2024-01-18 01:50:30

by Li Nan

[permalink] [raw]
Subject: Re: [PATCH] md: Don't clear MD_CLOSING when the raid is about to stop



在 2023/12/12 11:21, Yu Kuai 写道:
> Hi,
>
> 在 2023/12/11 17:56, Mariusz Tkaczyk 写道:
>> On Mon, 11 Dec 2023 16:17:14 +0800
>> [email protected] wrote:
>>
>>> From: Li Nan <[email protected]>
>>>
>>> The raid should not be opened anymore when it is about to be stopped.
>>> However, other processes can open it again if the flag MD_CLOSING is
>>> cleared before exiting. From now on, this flag will not be cleared when
>>> the raid will be stopped.
>>>
>>> Fixes: 065e519e71b2 ("md: MD_CLOSING needs to be cleared after called
>>> md_set_readonly or do_md_stop") Signed-off-by: Li Nan
>>> <[email protected]>
>>
>> Hello Li Nan,
>> I was there when I needed to fix this:
>> https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?h=md-next&id=c8870379a21fbd9ad14ca36204ccfbe9d25def43
>>
>>
>> For sure, you have to consider applying same solution for array_store
>> "clear".
>> Minor nit below.
>>
>> Thanks,
>> Mariusz
>>
>>> ---
>>>   drivers/md/md.c | 8 +++-----
>>>   1 file changed, 3 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>>> index 4e9fe5cbeedc..ebdfc9068a60 100644
>>> --- a/drivers/md/md.c
>>> +++ b/drivers/md/md.c
>>> @@ -6238,7 +6238,6 @@ static void md_clean(struct mddev *mddev)
>>>       mddev->persistent = 0;
>>>       mddev->level = LEVEL_NONE;
>>>       mddev->clevel[0] = 0;
>>> -    mddev->flags = 0;
>>
>> I recommend (safety recommendation):
>>     mddev->flags = MD_CLOSING;
>
> Taking a look I think both MD_CLOSING and MD_DELETED should not be
> cleared, however, there is no guarantee that MD_CLOSING will be set
> before md_clean, because mdadm can be removed without running. Hence I
> think just set MD_CLOSING is werid.
>
> I think the proper way is to keep MD_CLOSING and MD_DELETED if they are
> set. However, there is no such api to clear other bits at once. Since
> we're not expecting anyone else to write flags, following maybe
> acceptable:
>
> mddev->flags &= BIT_ULL_MASK(MD_CLOSING) | BIT_ULL_MASK(MD_DELETED);
>

MD_DELETED is only set after mddev->active is put to 0. We need to open
mddev and get it before stropping raid, so the active must not be 0 and
MD_DELETED will not be set in md_clean.

> Or after making sure other flags cannot race, this patch is ok.
>
> Thanks,
> Kuai
>
>>
>> Unless you can prove that other flags cannot race.
>>
>>>       mddev->sb_flags = 0;
>>>       mddev->ro = MD_RDWR;
>>>       mddev->metadata_type[0] = 0;
>>
>> .
>>
>
>
> .

--
Thanks,
Nan