2023-08-15 06:44:46

by Yu Kuai

[permalink] [raw]
Subject: [PATCH -next v2 3/7] md: delay choosing sync direction to md_start_sync()

From: Yu Kuai <[email protected]>

Before this patch, for read-write array:

1) md_check_recover() found that something need to be done, and it'll
try to grab 'reconfig_mutex'. The case that md_check_recover() need
to do something:
- array is not suspend;
- super_block need to be updated;
- 'MD_RECOVERY_NEEDED' or ''MD_RECOVERY_DONE' is set;
- unusual case related to safemode;

2) if 'MD_RECOVERY_RUNNING' is not set, and 'MD_RECOVERY_NEEDED' is set,
md_check_recover() will try to choose a sync direction, and then
queue a work md_start_sync().

3) md_start_sync() register sync_thread;

After this patch,

1) is the same;
2) if 'MD_RECOVERY_RUNNING' is not set, and 'MD_RECOVERY_NEEDED' is set,
queue a work md_start_sync() directly;
3) md_start_sync() will try to choose a sync direction, and then
register sync_thread();

Because 'MD_RECOVERY_RUNNING' is cleared when sync_thread is done, 2)
and 3) is always ran in serial and they can never concurrent, this
change should not introduce any behavior change for now.

Also fix a problem that md_start_sync() can clear 'MD_RECOVERY_RUNNING'
without protection in error path, which might affect the logical in
md_check_recovery().

The advantage to change this is that array reconfiguration is
independent from daemon now, and it'll be much easier to synchronize it
with io, consider that io may rely on daemon thread to be done.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md.c | 70 ++++++++++++++++++++++++++-----------------------
1 file changed, 37 insertions(+), 33 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4846ff6d25b0..03615b0e9fe1 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9291,6 +9291,22 @@ static bool md_choose_sync_direction(struct mddev *mddev, int *spares)
static void md_start_sync(struct work_struct *ws)
{
struct mddev *mddev = container_of(ws, struct mddev, sync_work);
+ int spares = 0;
+
+ mddev_lock_nointr(mddev);
+
+ if (!md_choose_sync_direction(mddev, &spares))
+ goto not_running;
+
+ if (!mddev->pers->sync_request)
+ goto not_running;
+
+ /*
+ * We are adding a device or devices to an array which has the bitmap
+ * stored on all devices. So make sure all bitmap pages get written.
+ */
+ if (spares)
+ md_bitmap_write_all(mddev->bitmap);

rcu_assign_pointer(mddev->sync_thread,
md_register_thread(md_do_sync, mddev, "resync"));
@@ -9298,20 +9314,27 @@ static void md_start_sync(struct work_struct *ws)
pr_warn("%s: could not start resync thread...\n",
mdname(mddev));
/* leave the spares where they are, it shouldn't hurt */
- clear_bit(MD_RECOVERY_SYNC, &mddev->recovery);
- clear_bit(MD_RECOVERY_RESHAPE, &mddev->recovery);
- clear_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
- clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
- clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
- wake_up(&resync_wait);
- if (test_and_clear_bit(MD_RECOVERY_RECOVER,
- &mddev->recovery))
- if (mddev->sysfs_action)
- sysfs_notify_dirent_safe(mddev->sysfs_action);
- } else
- md_wakeup_thread(mddev->sync_thread);
+ goto not_running;
+ }
+
+ mddev_unlock(mddev);
+ md_wakeup_thread(mddev->sync_thread);
sysfs_notify_dirent_safe(mddev->sysfs_action);
md_new_event();
+ return;
+
+not_running:
+ clear_bit(MD_RECOVERY_SYNC, &mddev->recovery);
+ clear_bit(MD_RECOVERY_RESHAPE, &mddev->recovery);
+ clear_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
+ clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
+ clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
+ mddev_unlock(mddev);
+
+ wake_up(&resync_wait);
+ if (test_and_clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery) &&
+ mddev->sysfs_action)
+ sysfs_notify_dirent_safe(mddev->sysfs_action);
}

/*
@@ -9379,7 +9402,6 @@ void md_check_recovery(struct mddev *mddev)
return;

if (mddev_trylock(mddev)) {
- int spares = 0;
bool try_set_sync = mddev->safemode != 0;

if (!mddev->external && mddev->safemode == 1)
@@ -9467,29 +9489,11 @@ void md_check_recovery(struct mddev *mddev)
clear_bit(MD_RECOVERY_DONE, &mddev->recovery);

if (!test_and_clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery) ||
- test_bit(MD_RECOVERY_FROZEN, &mddev->recovery))
- goto not_running;
- if (!md_choose_sync_direction(mddev, &spares))
- goto not_running;
- if (mddev->pers->sync_request) {
- if (spares) {
- /* We are adding a device or devices to an array
- * which has the bitmap stored on all devices.
- * So make sure all bitmap pages get written
- */
- md_bitmap_write_all(mddev->bitmap);
- }
+ test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
queue_work(md_misc_wq, &mddev->sync_work);
- goto unlock;
- }
- not_running:
- if (!mddev->sync_thread) {
+ } else {
clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
wake_up(&resync_wait);
- if (test_and_clear_bit(MD_RECOVERY_RECOVER,
- &mddev->recovery))
- if (mddev->sysfs_action)
- sysfs_notify_dirent_safe(mddev->sysfs_action);
}
unlock:
wake_up(&mddev->sb_wait);
--
2.39.2



2023-08-16 01:37:19

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next v2 3/7] md: delay choosing sync direction to md_start_sync()

Hi,

?? 2023/08/15 11:09, Yu Kuai д??:
> From: Yu Kuai <[email protected]>
>
> Before this patch, for read-write array:
>
> 1) md_check_recover() found that something need to be done, and it'll
> try to grab 'reconfig_mutex'. The case that md_check_recover() need
> to do something:
> - array is not suspend;
> - super_block need to be updated;
> - 'MD_RECOVERY_NEEDED' or ''MD_RECOVERY_DONE' is set;
> - unusual case related to safemode;
>
> 2) if 'MD_RECOVERY_RUNNING' is not set, and 'MD_RECOVERY_NEEDED' is set,
> md_check_recover() will try to choose a sync direction, and then
> queue a work md_start_sync().
>
> 3) md_start_sync() register sync_thread;
>
> After this patch,
>
> 1) is the same;
> 2) if 'MD_RECOVERY_RUNNING' is not set, and 'MD_RECOVERY_NEEDED' is set,
> queue a work md_start_sync() directly;
> 3) md_start_sync() will try to choose a sync direction, and then
> register sync_thread();
>
> Because 'MD_RECOVERY_RUNNING' is cleared when sync_thread is done, 2)
> and 3) is always ran in serial and they can never concurrent, this
> change should not introduce any behavior change for now.
>
> Also fix a problem that md_start_sync() can clear 'MD_RECOVERY_RUNNING'
> without protection in error path, which might affect the logical in
> md_check_recovery().
>
> The advantage to change this is that array reconfiguration is
> independent from daemon now, and it'll be much easier to synchronize it
> with io, consider that io may rely on daemon thread to be done.
>
> Signed-off-by: Yu Kuai <[email protected]>
> ---
> drivers/md/md.c | 70 ++++++++++++++++++++++++++-----------------------
> 1 file changed, 37 insertions(+), 33 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 4846ff6d25b0..03615b0e9fe1 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -9291,6 +9291,22 @@ static bool md_choose_sync_direction(struct mddev *mddev, int *spares)
> static void md_start_sync(struct work_struct *ws)
> {
> struct mddev *mddev = container_of(ws, struct mddev, sync_work);
> + int spares = 0;
> +
> + mddev_lock_nointr(mddev);
> +
> + if (!md_choose_sync_direction(mddev, &spares))
> + goto not_running;
> +
> + if (!mddev->pers->sync_request)
> + goto not_running;
> +
> + /*
> + * We are adding a device or devices to an array which has the bitmap
> + * stored on all devices. So make sure all bitmap pages get written.
> + */
> + if (spares)
> + md_bitmap_write_all(mddev->bitmap);
>
> rcu_assign_pointer(mddev->sync_thread,
> md_register_thread(md_do_sync, mddev, "resync"));
> @@ -9298,20 +9314,27 @@ static void md_start_sync(struct work_struct *ws)
> pr_warn("%s: could not start resync thread...\n",
> mdname(mddev));
> /* leave the spares where they are, it shouldn't hurt */
> - clear_bit(MD_RECOVERY_SYNC, &mddev->recovery);
> - clear_bit(MD_RECOVERY_RESHAPE, &mddev->recovery);
> - clear_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
> - clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
> - clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
> - wake_up(&resync_wait);
> - if (test_and_clear_bit(MD_RECOVERY_RECOVER,
> - &mddev->recovery))
> - if (mddev->sysfs_action)
> - sysfs_notify_dirent_safe(mddev->sysfs_action);
> - } else
> - md_wakeup_thread(mddev->sync_thread);
> + goto not_running;
> + }
> +
> + mddev_unlock(mddev);
> + md_wakeup_thread(mddev->sync_thread);
> sysfs_notify_dirent_safe(mddev->sysfs_action);
> md_new_event();
> + return;
> +
> +not_running:
> + clear_bit(MD_RECOVERY_SYNC, &mddev->recovery);
> + clear_bit(MD_RECOVERY_RESHAPE, &mddev->recovery);
> + clear_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
> + clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
> + clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
> + mddev_unlock(mddev);
> +
> + wake_up(&resync_wait);
> + if (test_and_clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery) &&
> + mddev->sysfs_action)
> + sysfs_notify_dirent_safe(mddev->sysfs_action);
> }
>
> /*
> @@ -9379,7 +9402,6 @@ void md_check_recovery(struct mddev *mddev)
> return;
>
> if (mddev_trylock(mddev)) {
> - int spares = 0;
> bool try_set_sync = mddev->safemode != 0;
>
> if (!mddev->external && mddev->safemode == 1)
> @@ -9467,29 +9489,11 @@ void md_check_recovery(struct mddev *mddev)
> clear_bit(MD_RECOVERY_DONE, &mddev->recovery);
>
> if (!test_and_clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery) ||
> - test_bit(MD_RECOVERY_FROZEN, &mddev->recovery))
> - goto not_running;
> - if (!md_choose_sync_direction(mddev, &spares))
> - goto not_running;
> - if (mddev->pers->sync_request) {
> - if (spares) {
> - /* We are adding a device or devices to an array
> - * which has the bitmap stored on all devices.
> - * So make sure all bitmap pages get written
> - */
> - md_bitmap_write_all(mddev->bitmap);
> - }
> + test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {

Sorry that I made a mistake here while rebasing v2, here should be

!test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)

With this fixed, there are no new regression for mdadm tests using loop
devicein my VM.

Thanks,
Kuai
> queue_work(md_misc_wq, &mddev->sync_work);
> - goto unlock;
> - }
> - not_running:
> - if (!mddev->sync_thread) {
> + } else {
> clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
> wake_up(&resync_wait);
> - if (test_and_clear_bit(MD_RECOVERY_RECOVER,
> - &mddev->recovery))
> - if (mddev->sysfs_action)
> - sysfs_notify_dirent_safe(mddev->sysfs_action);
> }
> unlock:
> wake_up(&mddev->sb_wait);
>


2023-08-16 08:08:13

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next v2 3/7] md: delay choosing sync direction to md_start_sync()

Hi,

在 2023/08/15 23:54, Song Liu 写道:
> On Tue, Aug 15, 2023 at 2:00 PM Yu Kuai <[email protected]> wrote:
> [...]
>>> +
>>> +not_running:
>>> + clear_bit(MD_RECOVERY_SYNC, &mddev->recovery);
>>> + clear_bit(MD_RECOVERY_RESHAPE, &mddev->recovery);
>>> + clear_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
>>> + clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
>>> + clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
>>> + mddev_unlock(mddev);
>>> +
>>> + wake_up(&resync_wait);
>>> + if (test_and_clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery) &&
>>> + mddev->sysfs_action)
>>> + sysfs_notify_dirent_safe(mddev->sysfs_action);
>>> }
>>>
>>> /*
>>> @@ -9379,7 +9402,6 @@ void md_check_recovery(struct mddev *mddev)
>>> return;
>>>
>>> if (mddev_trylock(mddev)) {
>>> - int spares = 0;
>>> bool try_set_sync = mddev->safemode != 0;
>>>
>>> if (!mddev->external && mddev->safemode == 1)
>>> @@ -9467,29 +9489,11 @@ void md_check_recovery(struct mddev *mddev)
>>> clear_bit(MD_RECOVERY_DONE, &mddev->recovery);
>>>
>>> if (!test_and_clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery) ||
>>> - test_bit(MD_RECOVERY_FROZEN, &mddev->recovery))
>>> - goto not_running;
>>> - if (!md_choose_sync_direction(mddev, &spares))
>>> - goto not_running;
>>> - if (mddev->pers->sync_request) {
>>> - if (spares) {
>>> - /* We are adding a device or devices to an array
>>> - * which has the bitmap stored on all devices.
>>> - * So make sure all bitmap pages get written
>>> - */
>>> - md_bitmap_write_all(mddev->bitmap);
>>> - }
>>> + test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
>>
>> Sorry that I made a mistake here while rebasing v2, here should be
>>
>> !test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)
>>
>> With this fixed, there are no new regression for mdadm tests using loop
>> devicein my VM.
>
> if (!test_and_clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery) ||
> !test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
> queue_work(md_misc_wq, &mddev->sync_work);
> } else {
>
> This doesn't look right. Should we do
>
> if (test_and_clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery) &&
> !test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
> queue_work(md_misc_wq, &mddev->sync_work);
> } else {
>
> instead?
>

Yes you're right, this is exactly what I did in v1, sorry that I keep
making mistake while rebasing.

Thanks,
Kuai

> Thanks,
> Song
> .
>


2023-08-16 14:41:11

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH -next v2 3/7] md: delay choosing sync direction to md_start_sync()

On Tue, Aug 15, 2023 at 2:00 PM Yu Kuai <[email protected]> wrote:
[...]
> > +
> > +not_running:
> > + clear_bit(MD_RECOVERY_SYNC, &mddev->recovery);
> > + clear_bit(MD_RECOVERY_RESHAPE, &mddev->recovery);
> > + clear_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
> > + clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
> > + clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
> > + mddev_unlock(mddev);
> > +
> > + wake_up(&resync_wait);
> > + if (test_and_clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery) &&
> > + mddev->sysfs_action)
> > + sysfs_notify_dirent_safe(mddev->sysfs_action);
> > }
> >
> > /*
> > @@ -9379,7 +9402,6 @@ void md_check_recovery(struct mddev *mddev)
> > return;
> >
> > if (mddev_trylock(mddev)) {
> > - int spares = 0;
> > bool try_set_sync = mddev->safemode != 0;
> >
> > if (!mddev->external && mddev->safemode == 1)
> > @@ -9467,29 +9489,11 @@ void md_check_recovery(struct mddev *mddev)
> > clear_bit(MD_RECOVERY_DONE, &mddev->recovery);
> >
> > if (!test_and_clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery) ||
> > - test_bit(MD_RECOVERY_FROZEN, &mddev->recovery))
> > - goto not_running;
> > - if (!md_choose_sync_direction(mddev, &spares))
> > - goto not_running;
> > - if (mddev->pers->sync_request) {
> > - if (spares) {
> > - /* We are adding a device or devices to an array
> > - * which has the bitmap stored on all devices.
> > - * So make sure all bitmap pages get written
> > - */
> > - md_bitmap_write_all(mddev->bitmap);
> > - }
> > + test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
>
> Sorry that I made a mistake here while rebasing v2, here should be
>
> !test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)
>
> With this fixed, there are no new regression for mdadm tests using loop
> devicein my VM.

if (!test_and_clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery) ||
!test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
queue_work(md_misc_wq, &mddev->sync_work);
} else {

This doesn't look right. Should we do

if (test_and_clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery) &&
!test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
queue_work(md_misc_wq, &mddev->sync_work);
} else {

instead?

Thanks,
Song

2023-08-18 04:22:01

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH -next v2 3/7] md: delay choosing sync direction to md_start_sync()

On Tue, Aug 15, 2023 at 6:07 PM Yu Kuai <[email protected]> wrote:
>
> Hi,
>
> 在 2023/08/15 23:54, Song Liu 写道:
> > On Tue, Aug 15, 2023 at 2:00 PM Yu Kuai <[email protected]> wrote:
> > [...]
> >>> +
> >>> +not_running:
> >>> + clear_bit(MD_RECOVERY_SYNC, &mddev->recovery);
> >>> + clear_bit(MD_RECOVERY_RESHAPE, &mddev->recovery);
> >>> + clear_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
> >>> + clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
> >>> + clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
> >>> + mddev_unlock(mddev);
> >>> +
> >>> + wake_up(&resync_wait);
> >>> + if (test_and_clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery) &&
> >>> + mddev->sysfs_action)
> >>> + sysfs_notify_dirent_safe(mddev->sysfs_action);
> >>> }
> >>>
> >>> /*
> >>> @@ -9379,7 +9402,6 @@ void md_check_recovery(struct mddev *mddev)
> >>> return;
> >>>
> >>> if (mddev_trylock(mddev)) {
> >>> - int spares = 0;
> >>> bool try_set_sync = mddev->safemode != 0;
> >>>
> >>> if (!mddev->external && mddev->safemode == 1)
> >>> @@ -9467,29 +9489,11 @@ void md_check_recovery(struct mddev *mddev)
> >>> clear_bit(MD_RECOVERY_DONE, &mddev->recovery);
> >>>
> >>> if (!test_and_clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery) ||
> >>> - test_bit(MD_RECOVERY_FROZEN, &mddev->recovery))
> >>> - goto not_running;
> >>> - if (!md_choose_sync_direction(mddev, &spares))
> >>> - goto not_running;
> >>> - if (mddev->pers->sync_request) {
> >>> - if (spares) {
> >>> - /* We are adding a device or devices to an array
> >>> - * which has the bitmap stored on all devices.
> >>> - * So make sure all bitmap pages get written
> >>> - */
> >>> - md_bitmap_write_all(mddev->bitmap);
> >>> - }
> >>> + test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
> >>
> >> Sorry that I made a mistake here while rebasing v2, here should be
> >>
> >> !test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)
> >>
> >> With this fixed, there are no new regression for mdadm tests using loop
> >> devicein my VM.
> >
> > if (!test_and_clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery) ||
> > !test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
> > queue_work(md_misc_wq, &mddev->sync_work);
> > } else {
> >
> > This doesn't look right. Should we do
> >
> > if (test_and_clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery) &&
> > !test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
> > queue_work(md_misc_wq, &mddev->sync_work);
> > } else {
> >
> > instead?
> >
>
> Yes you're right, this is exactly what I did in v1, sorry that I keep
> making mistake while rebasing.

Please fix this, address comments from other reviews, and resend the
patches. Also, there are some typos in the commit logs, please also fix them.

Unfortunately, we won't ship this (and the two other big sets) in 6.6.

Thanks,
Song

2023-08-20 10:28:23

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next v2 3/7] md: delay choosing sync direction to md_start_sync()

Hi,

在 2023/08/16 14:38, Xiao Ni 写道:
> On Tue, Aug 15, 2023 at 11:13 AM Yu Kuai <[email protected]> wrote:
>>
>> From: Yu Kuai <[email protected]>
>>
>> Before this patch, for read-write array:
>>
>> 1) md_check_recover() found that something need to be done, and it'll
>> try to grab 'reconfig_mutex'. The case that md_check_recover() need
>> to do something:
>> - array is not suspend;
>> - super_block need to be updated;
>> - 'MD_RECOVERY_NEEDED' or ''MD_RECOVERY_DONE' is set;
>> - unusual case related to safemode;
>>
>> 2) if 'MD_RECOVERY_RUNNING' is not set, and 'MD_RECOVERY_NEEDED' is set,
>> md_check_recover() will try to choose a sync direction, and then
>> queue a work md_start_sync().
>>
>> 3) md_start_sync() register sync_thread;
>>
>> After this patch,
>>
>> 1) is the same;
>> 2) if 'MD_RECOVERY_RUNNING' is not set, and 'MD_RECOVERY_NEEDED' is set,
>> queue a work md_start_sync() directly;
>> 3) md_start_sync() will try to choose a sync direction, and then
>> register sync_thread();
>>
>> Because 'MD_RECOVERY_RUNNING' is cleared when sync_thread is done, 2)
>> and 3) is always ran in serial and they can never concurrent, this
>> change should not introduce any behavior change for now.
>>
>> Also fix a problem that md_start_sync() can clear 'MD_RECOVERY_RUNNING'
>> without protection in error path, which might affect the logical in
>> md_check_recovery().
>>
>> The advantage to change this is that array reconfiguration is
>> independent from daemon now, and it'll be much easier to synchronize it
>> with io, consider that io may rely on daemon thread to be done.
>
> Hi Kuai
>
> What's the meaning of "array reconfiguration" here? "mdadm -f/-r/-a"
> something like this, right?. Because before and after this patch, only
> one sync thread can be running, so If we don't do this change, are
> there bugs or performance problems?

As we discussed([1]), and explained in patch 0, the purpose of this
change is to prepare to synchronize io with array reconfiguration(add
or remove rdev from array, for example, modify
conf->mirrors[].rdev/replacement for raid10).

Without this change, normal io can rely on daemon thread, while daemone
thread can change array configuration. raid1/raid10 record such io as
'io_queued', and can use freeze_array() to do synchronization in daemon
thread. However, other personalities have to implement such logical as
well, and I found it quite complicated, at least for raid456.

[1]
https://lore.kernel.org/all/[email protected]/

Thanks,
Kuai

>
> If it's only a patch that wants to make md_check_recovery more clearer
> and easier, I'm good with this idea too.
>


2023-08-20 15:25:46

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next v2 3/7] md: delay choosing sync direction to md_start_sync()

Hi,

在 2023/08/18 5:53, Song Liu 写道:
> On Tue, Aug 15, 2023 at 6:07 PM Yu Kuai <[email protected]> wrote:
>>
>> Hi,
>>
>> 在 2023/08/15 23:54, Song Liu 写道:
>>> On Tue, Aug 15, 2023 at 2:00 PM Yu Kuai <[email protected]> wrote:
>>> [...]
>>>>> +
>>>>> +not_running:
>>>>> + clear_bit(MD_RECOVERY_SYNC, &mddev->recovery);
>>>>> + clear_bit(MD_RECOVERY_RESHAPE, &mddev->recovery);
>>>>> + clear_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
>>>>> + clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
>>>>> + clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
>>>>> + mddev_unlock(mddev);
>>>>> +
>>>>> + wake_up(&resync_wait);
>>>>> + if (test_and_clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery) &&
>>>>> + mddev->sysfs_action)
>>>>> + sysfs_notify_dirent_safe(mddev->sysfs_action);
>>>>> }
>>>>>
>>>>> /*
>>>>> @@ -9379,7 +9402,6 @@ void md_check_recovery(struct mddev *mddev)
>>>>> return;
>>>>>
>>>>> if (mddev_trylock(mddev)) {
>>>>> - int spares = 0;
>>>>> bool try_set_sync = mddev->safemode != 0;
>>>>>
>>>>> if (!mddev->external && mddev->safemode == 1)
>>>>> @@ -9467,29 +9489,11 @@ void md_check_recovery(struct mddev *mddev)
>>>>> clear_bit(MD_RECOVERY_DONE, &mddev->recovery);
>>>>>
>>>>> if (!test_and_clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery) ||
>>>>> - test_bit(MD_RECOVERY_FROZEN, &mddev->recovery))
>>>>> - goto not_running;
>>>>> - if (!md_choose_sync_direction(mddev, &spares))
>>>>> - goto not_running;
>>>>> - if (mddev->pers->sync_request) {
>>>>> - if (spares) {
>>>>> - /* We are adding a device or devices to an array
>>>>> - * which has the bitmap stored on all devices.
>>>>> - * So make sure all bitmap pages get written
>>>>> - */
>>>>> - md_bitmap_write_all(mddev->bitmap);
>>>>> - }
>>>>> + test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
>>>>
>>>> Sorry that I made a mistake here while rebasing v2, here should be
>>>>
>>>> !test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)
>>>>
>>>> With this fixed, there are no new regression for mdadm tests using loop
>>>> devicein my VM.
>>>
>>> if (!test_and_clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery) ||
>>> !test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
>>> queue_work(md_misc_wq, &mddev->sync_work);
>>> } else {
>>>
>>> This doesn't look right. Should we do
>>>
>>> if (test_and_clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery) &&
>>> !test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
>>> queue_work(md_misc_wq, &mddev->sync_work);
>>> } else {
>>>
>>> instead?
>>>
>>
>> Yes you're right, this is exactly what I did in v1, sorry that I keep
>> making mistake while rebasing.
>
> Please fix this, address comments from other reviews, and resend the
> patches. Also, there are some typos in the commit logs, please also fix them.
>

Of course, and sorry for the dealy, I was ill and rested at home for a
few days.

Thanks,
Kuai

> Unfortunately, we won't ship this (and the two other big sets) in 6.6.
>
> Thanks,
> Song
> .
>