2023-10-10 07:24:22

by Yu Kuai

Subject: [PATCH -next v4 00/19] md: synchronize io with array reconfiguration

From: Yu Kuai <[email protected]>

Changes in v4:
- add detailed explanation of background and design in cover letter;
- merge some patches;

Changes in v3:
- rebase with latest md-next;
- remove patch 2 from v2, and replace it with a new patch;
- fix a null-ptr-dereference in rdev_attr_store() where mddev is used
before it is checked;
- merge patch 20-22 from v1 into one patch;
- mddev_lock() used to be called first and can be interrupted; allow the new
api, which is now called before mddev_lock(), to be interrupted as well;
- improve some comments and coding;

Changes in v2:
- rebase with latest md-next;
- remove some follow up cleanup patches, these patches will be sent
later after this patchset.

##### Background

Our testers started to test raid10 last year, and we found that there are
lots of problems in the following test scenario:

- add or remove disks to the array
- issue io to the array

At first, we fixed each problem independently, accepting that io can run
concurrently with array reconfiguration. However, on the one hand new
issues keep being reported, and on the other hand other personalities
might have the same problems. So I started thinking about how to fix these
problems thoroughly.

Consider how the block layer protects io against queue reconfiguration (for
example, changing the elevator):

```
blk_mq_freeze_queue
-> wait for all io to be done, and prevent new io to be dispatched
// reconfiguration
blk_mq_unfreeze_queue
```

Then it comes to my mind that I can do something similar to synchronize io
with array reconfiguration.

##### rcu introduction

see details in https://www.kernel.org/doc/html/next/RCU/whatisRCU.html

- the writer should replace old data with new data first, and free the old
data after a grace period;
- the reader should handle both cases, where either old data or new data is
read, and must not dereference the data it read after the critical section;

##### Current synchronization

Adding or removing disks can be triggered by ioctl/sysfs/daemon thread, and
goes through the following steps:

1. hold 'reconfig_mutex';

2. check that rdev can be added/removed, one condition is that there is no
IO, for example:

```
raid10_remove_disk
if (atomic_read(&rdev->nr_pending))
err = -EBUSY;
```

3. do the actual operations to add/remove a rdev, one of which is to
set/clear a pointer to the rdev, for example:

```
raid10_remove_disk
p = conf->mirrors[xx]
rdevp = &p->rdev/replacement
*rdevp = NULL
```

4. check again that there is still no io on this rdev; if there is, revert
the pointer to the rdev and return failure, for example:

```
raid10_remove_disk
synchronize_rcu()
if (atomic_read(&rdev->nr_pending))
err = -EBUSY
*rdevp = rdev
```

The IO path uses rcu_read_lock/unlock() to access rdev, for example:

```
raid10_write_request
rcu_read_lock
rdev = rcu_dereference(mirror->rdev/replacement)
...
rcu_read_unlock

raid10_end_write_request
rdev = conf->mirrors[dev].rdev/replacement
-> rdev/rrdev is still used after rcu_read_unlock()
```

##### Current problems

- rcu is used wrongly: rdev is still used after rcu_read_unlock();
- There are lots of places where an old rdev can be read, however, many of
them don't handle the old value correctly;
- Between step 3 and 4, if new io is dispatched, NULL will be read for
the rdev, and data will be lost if step 4 fails.
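
The third problem can be replayed deterministically in a small userspace model. This is only a sketch: `toy_rdev`, `toy_mirror`, `toy_write()` and `toy_replay_race()` are hypothetical stand-ins rather than md code, and the interleaving is forced by hand instead of coming from real concurrency.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct md_rdev: just enough state for the race. */
struct toy_rdev {
	int nr_pending;		/* in-flight io, like rdev->nr_pending */
	int writes_seen;	/* how many writes reached this device */
};

/* Hypothetical stand-in for one mirror slot, like conf->mirrors[i]. */
struct toy_mirror {
	struct toy_rdev *rdev;
};

/* IO path: a write goes to every mirror whose rdev pointer is not NULL. */
static void toy_write(struct toy_mirror *mirrors, int n)
{
	for (int i = 0; i < n; i++)
		if (mirrors[i].rdev)
			mirrors[i].rdev->writes_seen++;
}

/*
 * Replays steps 2-4 of the removal with one write forced in between
 * steps 3 and 4. Returns the number of writes the "removed" rdev missed.
 */
static int toy_replay_race(void)
{
	struct toy_rdev a = { 0, 0 }, b = { 0, 0 };
	struct toy_mirror mirrors[2] = { { &a }, { &b } };
	struct toy_rdev *victim = mirrors[1].rdev;
	int total_writes = 0;

	/* step 2: no pending io yet, so the removal is allowed to proceed */
	assert(victim->nr_pending == 0);

	/* an io that looked up the rdev before step 3 takes its reference */
	victim->nr_pending++;

	/* step 3: clear the pointer */
	mirrors[1].rdev = NULL;

	/* new io dispatched between step 3 and 4 reads NULL and skips 'b' */
	toy_write(mirrors, 2);
	total_writes++;

	/* step 4: io is still pending, so revert the pointer and fail */
	if (victim->nr_pending)
		mirrors[1].rdev = victim;

	return total_writes - victim->writes_seen;
}
```

In this model the removal fails and the rdev stays in the array, yet it never received the write that arrived while its pointer was NULL.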

##### New synchronization

Similar to how blk_mq_freeze_queue() works

Add or remove disks:

1. suspend the array, this should guarantee no new io is dispatched and
wait for dispatched io to be done;
2. add or remove rdevs to array;
3. resume the array;

The IO path doesn't need to change for now, and all of the rcu handling can
be removed.
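
The three steps above can be sketched as a single-threaded toy model. Everything here is an assumption made for illustration: `toy_array` and the `toy_*()` functions are hypothetical names, and where the real code would sleep until io drains, the model simply reports whether the array is already quiet.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy array, not struct mddev. */
struct toy_array {
	int suspended;	/* nesting depth of suspend calls */
	int active_io;	/* io dispatched but not yet completed */
	int nr_disks;
};

/* IO path: new io is not dispatched while the array is suspended. */
static bool toy_io_begin(struct toy_array *arr)
{
	if (arr->suspended)
		return false;	/* real code would sleep here instead of failing */
	arr->active_io++;
	return true;
}

static void toy_io_end(struct toy_array *arr)
{
	arr->active_io--;
}

/* Step 1: block new io; report whether dispatched io has drained. */
static bool toy_suspend(struct toy_array *arr)
{
	arr->suspended++;
	return arr->active_io == 0;	/* real code would wait for this */
}

/* Step 3: allow io again. */
static void toy_resume(struct toy_array *arr)
{
	arr->suspended--;
}

/* Steps 1-3 in order: the disk count only changes on a quiet array. */
static bool toy_add_disk(struct toy_array *arr)
{
	bool drained = toy_suspend(arr);

	if (drained)
		arr->nr_disks++;	/* step 2 */
	toy_resume(arr);
	return drained;
}

/* Usage example. */
static void toy_demo(void)
{
	struct toy_array arr = { 0, 0, 2 };

	assert(toy_io_begin(&arr));	/* one io in flight */
	assert(!toy_add_disk(&arr));	/* not drained: no reconfiguration */
	toy_io_end(&arr);
	assert(toy_add_disk(&arr));	/* quiet array: disk is added */
	assert(arr.nr_disks == 3);
}
```

Because the reconfiguration in step 2 only runs while no io is in flight and none can start, the io path never observes a half-updated array.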

There are already apis to suspend/resume the array; unfortunately, they
can't be used here because:

- old apis only wait for io to be dispatched, not to be done;
- old apis are only supported for personalities that implement the quiesce
callback;
- old apis can only be called after the array starts running;
- old apis must hold 'reconfig_mutex', and will wait for io to be done;
this behavior is risky because 'reconfig_mutex' is also used by the daemon
thread to update super_block and handle io. In order to prevent potential
problems, there is some odd logic where the thread suspending the array holds
'reconfig_mutex' on behalf of md_check_recovery() so that it can update
super_block; however, this is still not enough;

The main work is divided into 3 steps. First, make sure the new apis to
suspend the array are general:

- make sure suspend array will wait for io to be done(Done by [1]);
- make sure suspend array can be called for all personalities(Done by [2]);
- make sure suspend array can be called at any time(Done by [3]);
- make sure suspend array doesn't rely on 'reconfig_mutex'(PATCH 3-5);

The second step is to replace old apis with new apis(PATCH 6-16):

```
From:
lock reconfig_mutex
suspend array
resume array
unlock reconfig_mutex

To:
suspend array
lock reconfig_mutex
unlock reconfig_mutex
resume array
```
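
The new ordering can be sketched in userspace as follows. This is an assumption about how the combined helpers compose, inferred from the From/To listing and from the names mddev_suspend_and_lock() and mddev_unlock_and_resume() used later in the series; `toy_mddev` and all `toy_*()` functions are hypothetical and only record the order of calls.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical toy array that records the order of operations in a trace. */
struct toy_mddev {
	char trace[64];
	int lock_err;	/* value the toy lock returns, e.g. -4 if interrupted */
};

static void toy_log(struct toy_mddev *m, const char *what)
{
	strncat(m->trace, what, sizeof(m->trace) - strlen(m->trace) - 1);
}

static int toy_suspend(struct toy_mddev *m)
{
	toy_log(m, "suspend ");
	return 0;
}

static void toy_resume(struct toy_mddev *m)
{
	toy_log(m, "resume ");
}

static int toy_lock(struct toy_mddev *m)
{
	if (m->lock_err)
		return m->lock_err;
	toy_log(m, "lock ");
	return 0;
}

static void toy_unlock(struct toy_mddev *m)
{
	toy_log(m, "unlock ");
}

/* Suspend first, then lock; undo the suspend if the lock cannot be taken. */
static int toy_suspend_and_lock(struct toy_mddev *m)
{
	int err = toy_suspend(m);

	if (err)
		return err;
	err = toy_lock(m);
	if (err)
		toy_resume(m);
	return err;
}

/* Mirror image: drop the lock before resuming. */
static void toy_unlock_and_resume(struct toy_mddev *m)
{
	toy_unlock(m);
	toy_resume(m);
}

/* Usage example. */
static void toy_demo(void)
{
	struct toy_mddev m = { "", 0 };

	assert(toy_suspend_and_lock(&m) == 0);
	/* reconfiguration would happen here */
	toy_unlock_and_resume(&m);
	assert(strcmp(m.trace, "suspend lock unlock resume ") == 0);
}
```

With this ordering the wait for io to drain happens before the mutex is taken, so nothing that needs the mutex is blocked behind in-flight io.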

Finally, for the remaining paths that involve reconfiguration, suspend the
array first (PATCH 11,12, [4] and PATCH 17):

```
From:
// reconfiguration

To:
suspend array
// reconfiguration
resume array
```

Preparatory work:
[1] https://lore.kernel.org/all/[email protected]/
[2] https://lore.kernel.org/all/[email protected]/
[3] https://lore.kernel.org/all/[email protected]/
[4] https://lore.kernel.org/all/[email protected]/

Yu Kuai (19):
md: use READ_ONCE/WRITE_ONCE for 'suspend_lo' and 'suspend_hi'
md/raid5-cache: use READ_ONCE/WRITE_ONCE for 'conf->log'
md: replace is_md_suspended() with 'mddev->suspended' in
md_check_recovery()
md: add new helpers to suspend/resume array
md: add new helpers to suspend/resume and lock/unlock array
md/dm-raid: use new apis to suspend array
md/md-bitmap: use new apis to suspend array for location_store()
md/raid5-cache: use new apis to suspend array
md/raid5: use new apis to suspend array
md: use new apis to suspend array for sysfs apis
md: use new apis to suspend array for adding/removing rdev from
state_store()
md: use new apis to suspend array for ioctls involed array
reconfiguration
md: use new apis to suspend array before
mddev_create/destroy_serial_pool
md: cleanup mddev_create/destroy_serial_pool()
md/md-linear: cleanup linear_add()
md/raid5: replace suspend with quiesce() callback
md: suspend array in md_start_sync() if array need reconfiguration
md: remove old apis to suspend the array
md: rename __mddev_suspend/resume() back to mddev_suspend/resume()

drivers/md/dm-raid.c | 10 +-
drivers/md/md-autodetect.c | 4 +-
drivers/md/md-bitmap.c | 18 ++-
drivers/md/md-linear.c | 2 -
drivers/md/md.c | 233 ++++++++++++++++++++-----------------
drivers/md/md.h | 43 +++++--
drivers/md/raid5-cache.c | 64 +++++-----
drivers/md/raid5.c | 56 ++++-----
8 files changed, 226 insertions(+), 204 deletions(-)

--
2.39.2


2023-10-10 07:24:24

by Yu Kuai

Subject: [PATCH -next v4 09/19] md/raid5: use new apis to suspend array

From: Yu Kuai <[email protected]>

Convert to use new apis, the old apis will be removed eventually.

These are not hot paths, so performance is not a concern.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/raid5.c | 38 ++++++++++++--------------------------
1 file changed, 12 insertions(+), 26 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 6383723468e5..e6b8c0145648 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7025,7 +7025,7 @@ raid5_store_stripe_size(struct mddev *mddev, const char *page, size_t len)
new != roundup_pow_of_two(new))
return -EINVAL;

- err = mddev_lock(mddev);
+ err = mddev_suspend_and_lock(mddev);
if (err)
return err;

@@ -7049,7 +7049,6 @@ raid5_store_stripe_size(struct mddev *mddev, const char *page, size_t len)
goto out_unlock;
}

- mddev_suspend(mddev);
mutex_lock(&conf->cache_size_mutex);
size = conf->max_nr_stripes;

@@ -7064,10 +7063,9 @@ raid5_store_stripe_size(struct mddev *mddev, const char *page, size_t len)
err = -ENOMEM;
}
mutex_unlock(&conf->cache_size_mutex);
- mddev_resume(mddev);

out_unlock:
- mddev_unlock(mddev);
+ mddev_unlock_and_resume(mddev);
return err ?: len;
}

@@ -7153,7 +7151,7 @@ raid5_store_skip_copy(struct mddev *mddev, const char *page, size_t len)
return -EINVAL;
new = !!new;

- err = mddev_lock(mddev);
+ err = mddev_suspend_and_lock(mddev);
if (err)
return err;
conf = mddev->private;
@@ -7162,15 +7160,13 @@ raid5_store_skip_copy(struct mddev *mddev, const char *page, size_t len)
else if (new != conf->skip_copy) {
struct request_queue *q = mddev->queue;

- mddev_suspend(mddev);
conf->skip_copy = new;
if (new)
blk_queue_flag_set(QUEUE_FLAG_STABLE_WRITES, q);
else
blk_queue_flag_clear(QUEUE_FLAG_STABLE_WRITES, q);
- mddev_resume(mddev);
}
- mddev_unlock(mddev);
+ mddev_unlock_and_resume(mddev);
return err ?: len;
}

@@ -7225,15 +7221,13 @@ raid5_store_group_thread_cnt(struct mddev *mddev, const char *page, size_t len)
if (new > 8192)
return -EINVAL;

- err = mddev_lock(mddev);
+ err = mddev_suspend_and_lock(mddev);
if (err)
return err;
conf = mddev->private;
if (!conf)
err = -ENODEV;
else if (new != conf->worker_cnt_per_group) {
- mddev_suspend(mddev);
-
old_groups = conf->worker_groups;
if (old_groups)
flush_workqueue(raid5_wq);
@@ -7250,9 +7244,8 @@ raid5_store_group_thread_cnt(struct mddev *mddev, const char *page, size_t len)
kfree(old_groups[0].workers);
kfree(old_groups);
}
- mddev_resume(mddev);
}
- mddev_unlock(mddev);
+ mddev_unlock_and_resume(mddev);

return err ?: len;
}
@@ -8974,12 +8967,12 @@ static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
struct r5conf *conf;
int err;

- err = mddev_lock(mddev);
+ err = mddev_suspend_and_lock(mddev);
if (err)
return err;
conf = mddev->private;
if (!conf) {
- mddev_unlock(mddev);
+ mddev_unlock_and_resume(mddev);
return -ENODEV;
}

@@ -8989,19 +8982,14 @@ static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
err = log_init(conf, NULL, true);
if (!err) {
err = resize_stripes(conf, conf->pool_size);
- if (err) {
- mddev_suspend(mddev);
+ if (err)
log_exit(conf);
- mddev_resume(mddev);
- }
}
} else
err = -EINVAL;
} else if (strncmp(buf, "resync", 6) == 0) {
if (raid5_has_ppl(conf)) {
- mddev_suspend(mddev);
log_exit(conf);
- mddev_resume(mddev);
err = resize_stripes(conf, conf->pool_size);
} else if (test_bit(MD_HAS_JOURNAL, &conf->mddev->flags) &&
r5l_log_disk_error(conf)) {
@@ -9014,11 +9002,9 @@ static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
break;
}

- if (!journal_dev_exists) {
- mddev_suspend(mddev);
+ if (!journal_dev_exists)
clear_bit(MD_HAS_JOURNAL, &mddev->flags);
- mddev_resume(mddev);
- } else /* need remove journal device first */
+ else /* need remove journal device first */
err = -EBUSY;
} else
err = -EINVAL;
@@ -9029,7 +9015,7 @@ static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
if (!err)
md_update_sb(mddev, 1);

- mddev_unlock(mddev);
+ mddev_unlock_and_resume(mddev);

return err;
}
--
2.39.2

2023-10-10 07:24:26

by Yu Kuai

Subject: [PATCH -next v4 06/19] md/dm-raid: use new apis to suspend array

From: Yu Kuai <[email protected]>

Convert to use new apis, the old apis will be removed eventually.

These are not hot paths, so performance is not a concern.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/dm-raid.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index 69805d37e113..05dd6ccf6f48 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3244,7 +3244,7 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv)
set_bit(MD_RECOVERY_FROZEN, &rs->md.recovery);

/* Has to be held on running the array */
- mddev_lock_nointr(&rs->md);
+ mddev_suspend_and_lock_nointr(&rs->md);
r = md_run(&rs->md);
rs->md.in_sync = 0; /* Assume already marked dirty */
if (r) {
@@ -3268,7 +3268,6 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv)
}
}

- mddev_suspend(&rs->md);
set_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags);

/* Try to adjust the raid4/5/6 stripe cache size to the stripe size */
@@ -3798,9 +3797,7 @@ static void raid_postsuspend(struct dm_target *ti)
if (!test_bit(MD_RECOVERY_FROZEN, &rs->md.recovery))
md_stop_writes(&rs->md);

- mddev_lock_nointr(&rs->md);
- mddev_suspend(&rs->md);
- mddev_unlock(&rs->md);
+ __mddev_suspend(&rs->md, false);
}
}

@@ -4012,7 +4009,7 @@ static int raid_preresume(struct dm_target *ti)
}

/* Check for any resize/reshape on @rs and adjust/initiate */
- /* Be prepared for mddev_resume() in raid_resume() */
+ /* Be prepared for __mddev_resume() in raid_resume() */
set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
if (mddev->recovery_cp && mddev->recovery_cp < MaxSector) {
set_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
@@ -4059,8 +4056,7 @@ static void raid_resume(struct dm_target *ti)
clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
mddev->ro = 0;
mddev->in_sync = 0;
- mddev_resume(mddev);
- mddev_unlock(mddev);
+ mddev_unlock_and_resume(mddev);
}
}

--
2.39.2

2023-10-10 07:24:34

by Yu Kuai

Subject: [PATCH -next v4 08/19] md/raid5-cache: use new apis to suspend array

From: Yu Kuai <[email protected]>

Convert to use new apis, the old apis will be removed eventually.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/raid5-cache.c | 19 ++++++++-----------
1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 889bba60d6ff..9909110262ee 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -686,7 +686,6 @@ static void r5c_disable_writeback_async(struct work_struct *work)
disable_writeback_work);
struct mddev *mddev = log->rdev->mddev;
struct r5conf *conf = mddev->private;
- int locked = 0;

if (log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_THROUGH)
return;
@@ -696,13 +695,13 @@ static void r5c_disable_writeback_async(struct work_struct *work)
/* wait superblock change before suspend */
wait_event(mddev->sb_wait,
!READ_ONCE(conf->log) ||
- (!test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags) &&
- (locked = mddev_trylock(mddev))));
- if (locked) {
- mddev_suspend(mddev);
+ !test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags));
+
+ log = READ_ONCE(conf->log);
+ if (log) {
+ __mddev_suspend(mddev, false);
log->r5c_journal_mode = R5C_JOURNAL_MODE_WRITE_THROUGH;
- mddev_resume(mddev);
- mddev_unlock(mddev);
+ __mddev_resume(mddev);
}
}

@@ -2586,9 +2585,7 @@ int r5c_journal_mode_set(struct mddev *mddev, int mode)
mode == R5C_JOURNAL_MODE_WRITE_BACK)
return -EINVAL;

- mddev_suspend(mddev);
conf->log->r5c_journal_mode = mode;
- mddev_resume(mddev);

pr_debug("md/raid:%s: setting r5c cache mode to %d: %s\n",
mdname(mddev), mode, r5c_journal_mode_str[mode]);
@@ -2613,11 +2610,11 @@ static ssize_t r5c_journal_mode_store(struct mddev *mddev,
if (strlen(r5c_journal_mode_str[mode]) == len &&
!strncmp(page, r5c_journal_mode_str[mode], len))
break;
- ret = mddev_lock(mddev);
+ ret = mddev_suspend_and_lock(mddev);
if (ret)
return ret;
ret = r5c_journal_mode_set(mddev, mode);
- mddev_unlock(mddev);
+ mddev_unlock_and_resume(mddev);
return ret ?: length;
}

--
2.39.2

2023-10-10 07:24:35

by Yu Kuai

Subject: [PATCH -next v4 02/19] md/raid5-cache: use READ_ONCE/WRITE_ONCE for 'conf->log'

From: Yu Kuai <[email protected]>

'conf->log' is set with 'reconfig_mutex' grabbed, however, readers are
not protected, hence protect it with READ_ONCE/WRITE_ONCE to prevent
reading abnormal values.

Signed-off-by: Yu Kuai <[email protected]>
---
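As an illustration only (not part of this patch), the read-once pattern applied below can be sketched in userspace. `TOY_READ_ONCE`/`TOY_WRITE_ONCE` are simplified stand-ins that rely on the GCC/Clang `__typeof__` extension, and `toy_conf`, `toy_log` and `toy_journal_mode()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel macros (GCC/Clang only). */
#define TOY_READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))
#define TOY_WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&(x) = (val))

struct toy_log {
	int journal_mode;
};

struct toy_conf {
	struct toy_log *log;	/* may be set to NULL by a concurrent writer */
};

/*
 * Load conf->log exactly once and use only the local copy afterwards, so
 * the NULL check and the dereference always see the same pointer value.
 */
static int toy_journal_mode(struct toy_conf *conf)
{
	struct toy_log *log = TOY_READ_ONCE(conf->log);

	if (!log)
		return -1;
	return log->journal_mode;
}

/* Usage example. */
static void toy_demo(void)
{
	struct toy_log log = { 1 };
	struct toy_conf conf = { NULL };

	assert(toy_journal_mode(&conf) == -1);	/* no log attached */
	TOY_WRITE_ONCE(conf.log, &log);
	assert(toy_journal_mode(&conf) == 1);
	TOY_WRITE_ONCE(conf.log, NULL);
	assert(toy_journal_mode(&conf) == -1);
}
```

The snapshot only guarantees a single, untorn load of the pointer; it says nothing about the lifetime of the object it points to.
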
drivers/md/raid5-cache.c | 47 +++++++++++++++++++++-------------------
1 file changed, 25 insertions(+), 22 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 518b7cfa78b9..889bba60d6ff 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -327,8 +327,9 @@ void r5l_wake_reclaim(struct r5l_log *log, sector_t space);
void r5c_check_stripe_cache_usage(struct r5conf *conf)
{
int total_cached;
+ struct r5l_log *log = READ_ONCE(conf->log);

- if (!r5c_is_writeback(conf->log))
+ if (!r5c_is_writeback(log))
return;

total_cached = atomic_read(&conf->r5c_cached_partial_stripes) +
@@ -344,7 +345,7 @@ void r5c_check_stripe_cache_usage(struct r5conf *conf)
*/
if (total_cached > conf->min_nr_stripes * 1 / 2 ||
atomic_read(&conf->empty_inactive_list_nr) > 0)
- r5l_wake_reclaim(conf->log, 0);
+ r5l_wake_reclaim(log, 0);
}

/*
@@ -353,7 +354,9 @@ void r5c_check_stripe_cache_usage(struct r5conf *conf)
*/
void r5c_check_cached_full_stripe(struct r5conf *conf)
{
- if (!r5c_is_writeback(conf->log))
+ struct r5l_log *log = READ_ONCE(conf->log);
+
+ if (!r5c_is_writeback(log))
return;

/*
@@ -363,7 +366,7 @@ void r5c_check_cached_full_stripe(struct r5conf *conf)
if (atomic_read(&conf->r5c_cached_full_stripes) >=
min(R5C_FULL_STRIPE_FLUSH_BATCH(conf),
conf->chunk_sectors >> RAID5_STRIPE_SHIFT(conf)))
- r5l_wake_reclaim(conf->log, 0);
+ r5l_wake_reclaim(log, 0);
}

/*
@@ -396,7 +399,7 @@ void r5c_check_cached_full_stripe(struct r5conf *conf)
*/
static sector_t r5c_log_required_to_flush_cache(struct r5conf *conf)
{
- struct r5l_log *log = conf->log;
+ struct r5l_log *log = READ_ONCE(conf->log);

if (!r5c_is_writeback(log))
return 0;
@@ -449,7 +452,7 @@ static inline void r5c_update_log_state(struct r5l_log *log)
void r5c_make_stripe_write_out(struct stripe_head *sh)
{
struct r5conf *conf = sh->raid_conf;
- struct r5l_log *log = conf->log;
+ struct r5l_log *log = READ_ONCE(conf->log);

BUG_ON(!r5c_is_writeback(log));

@@ -491,7 +494,7 @@ static void r5c_handle_parity_cached(struct stripe_head *sh)
*/
static void r5c_finish_cache_stripe(struct stripe_head *sh)
{
- struct r5l_log *log = sh->raid_conf->log;
+ struct r5l_log *log = READ_ONCE(sh->raid_conf->log);

if (log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_THROUGH) {
BUG_ON(test_bit(STRIPE_R5C_CACHING, &sh->state));
@@ -692,7 +695,7 @@ static void r5c_disable_writeback_async(struct work_struct *work)

/* wait superblock change before suspend */
wait_event(mddev->sb_wait,
- conf->log == NULL ||
+ !READ_ONCE(conf->log) ||
(!test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags) &&
(locked = mddev_trylock(mddev))));
if (locked) {
@@ -1151,7 +1154,7 @@ static void r5l_run_no_space_stripes(struct r5l_log *log)
static sector_t r5c_calculate_new_cp(struct r5conf *conf)
{
struct stripe_head *sh;
- struct r5l_log *log = conf->log;
+ struct r5l_log *log = READ_ONCE(conf->log);
sector_t new_cp;
unsigned long flags;

@@ -1159,12 +1162,12 @@ static sector_t r5c_calculate_new_cp(struct r5conf *conf)
return log->next_checkpoint;

spin_lock_irqsave(&log->stripe_in_journal_lock, flags);
- if (list_empty(&conf->log->stripe_in_journal_list)) {
+ if (list_empty(&log->stripe_in_journal_list)) {
/* all stripes flushed */
spin_unlock_irqrestore(&log->stripe_in_journal_lock, flags);
return log->next_checkpoint;
}
- sh = list_first_entry(&conf->log->stripe_in_journal_list,
+ sh = list_first_entry(&log->stripe_in_journal_list,
struct stripe_head, r5c);
new_cp = sh->log_start;
spin_unlock_irqrestore(&log->stripe_in_journal_lock, flags);
@@ -1399,7 +1402,7 @@ void r5c_flush_cache(struct r5conf *conf, int num)
struct stripe_head *sh, *next;

lockdep_assert_held(&conf->device_lock);
- if (!conf->log)
+ if (!READ_ONCE(conf->log))
return;

count = 0;
@@ -1420,7 +1423,7 @@ void r5c_flush_cache(struct r5conf *conf, int num)

static void r5c_do_reclaim(struct r5conf *conf)
{
- struct r5l_log *log = conf->log;
+ struct r5l_log *log = READ_ONCE(conf->log);
struct stripe_head *sh;
int count = 0;
unsigned long flags;
@@ -1549,7 +1552,7 @@ static void r5l_reclaim_thread(struct md_thread *thread)
{
struct mddev *mddev = thread->mddev;
struct r5conf *conf = mddev->private;
- struct r5l_log *log = conf->log;
+ struct r5l_log *log = READ_ONCE(conf->log);

if (!log)
return;
@@ -1591,7 +1594,7 @@ void r5l_quiesce(struct r5l_log *log, int quiesce)

bool r5l_log_disk_error(struct r5conf *conf)
{
- struct r5l_log *log = conf->log;
+ struct r5l_log *log = READ_ONCE(conf->log);

/* don't allow write if journal disk is missing */
if (!log)
@@ -2635,7 +2638,7 @@ int r5c_try_caching_write(struct r5conf *conf,
struct stripe_head_state *s,
int disks)
{
- struct r5l_log *log = conf->log;
+ struct r5l_log *log = READ_ONCE(conf->log);
int i;
struct r5dev *dev;
int to_cache = 0;
@@ -2802,7 +2805,7 @@ void r5c_finish_stripe_write_out(struct r5conf *conf,
struct stripe_head *sh,
struct stripe_head_state *s)
{
- struct r5l_log *log = conf->log;
+ struct r5l_log *log = READ_ONCE(conf->log);
int i;
int do_wakeup = 0;
sector_t tree_index;
@@ -2941,7 +2944,7 @@ int r5c_cache_data(struct r5l_log *log, struct stripe_head *sh)
/* check whether this big stripe is in write back cache. */
bool r5c_big_stripe_cached(struct r5conf *conf, sector_t sect)
{
- struct r5l_log *log = conf->log;
+ struct r5l_log *log = READ_ONCE(conf->log);
sector_t tree_index;
void *slot;

@@ -3049,14 +3052,14 @@ int r5l_start(struct r5l_log *log)
void r5c_update_on_rdev_error(struct mddev *mddev, struct md_rdev *rdev)
{
struct r5conf *conf = mddev->private;
- struct r5l_log *log = conf->log;
+ struct r5l_log *log = READ_ONCE(conf->log);

if (!log)
return;

if ((raid5_calc_degraded(conf) > 0 ||
test_bit(Journal, &rdev->flags)) &&
- conf->log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_BACK)
+ log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_BACK)
schedule_work(&log->disable_writeback_work);
}

@@ -3145,7 +3148,7 @@ int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev)
spin_lock_init(&log->stripe_in_journal_lock);
atomic_set(&log->stripe_in_journal_count, 0);

- conf->log = log;
+ WRITE_ONCE(conf->log, log);

set_bit(MD_HAS_JOURNAL, &conf->mddev->flags);
return 0;
@@ -3173,7 +3176,7 @@ void r5l_exit_log(struct r5conf *conf)
* 'reconfig_mutex' is held by caller, set 'confg->log' to NULL to
* ensure disable_writeback_work wakes up and exits.
*/
- conf->log = NULL;
+ WRITE_ONCE(conf->log, NULL);
wake_up(&conf->mddev->sb_wait);
flush_work(&log->disable_writeback_work);

--
2.39.2

2023-10-10 07:24:36

by Yu Kuai

Subject: [PATCH -next v4 07/19] md/md-bitmap: use new apis to suspend array for location_store()

From: Yu Kuai <[email protected]>

Convert to use new apis, the old apis will be removed eventually.

This is not a hot path, so performance is not a concern.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md-bitmap.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index 0c661e5036bb..7d21e2a5b06e 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -2348,11 +2348,10 @@ location_store(struct mddev *mddev, const char *buf, size_t len)
{
int rv;

- rv = mddev_lock(mddev);
+ rv = mddev_suspend_and_lock(mddev);
if (rv)
return rv;

- mddev_suspend(mddev);
if (mddev->pers) {
if (mddev->recovery || mddev->sync_thread) {
rv = -EBUSY;
@@ -2429,8 +2428,7 @@ location_store(struct mddev *mddev, const char *buf, size_t len)
}
rv = 0;
out:
- mddev_resume(mddev);
- mddev_unlock(mddev);
+ mddev_unlock_and_resume(mddev);
if (rv)
return rv;
return len;
--
2.39.2

2023-10-10 07:24:41

by Yu Kuai

Subject: [PATCH -next v4 11/19] md: use new apis to suspend array for adding/removing rdev from state_store()

From: Yu Kuai <[email protected]>

User can write 'remove' and 're-add' to trigger array reconfiguration
through sysfs; suspend the array in this case so that io can't run
concurrently with array reconfiguration.

And now that all the callers of add_bound_rdev() already suspend the
array, remove mddev_suspend/resume() from add_bound_rdev() as well.

Signed-off-by: Yu Kuai <[email protected]>
---
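As an illustration only (not part of this patch), the decision made in rdev_attr_store() can be sketched in userspace. `toy_cmd_match()` is a hypothetical, simplified matcher that accepts one optional trailing newline; it is not the kernel's cmd_match(), and `toy_state_store_needs_suspend()` is likewise a made-up name.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical simplified matcher: true if 'input' equals 'cmd', optionally
 * followed by a single trailing newline.
 */
static bool toy_cmd_match(const char *input, const char *cmd)
{
	size_t n = strlen(cmd);

	if (strncmp(input, cmd, n) != 0)
		return false;
	return input[n] == '\0' || (input[n] == '\n' && input[n + 1] == '\0');
}

/* Only commands that reconfigure the array need io to be stopped first. */
static bool toy_state_store_needs_suspend(const char *input)
{
	return toy_cmd_match(input, "remove") || toy_cmd_match(input, "re-add");
}

/* Usage example. */
static void toy_demo(void)
{
	assert(toy_state_store_needs_suspend("remove\n"));
	assert(toy_state_store_needs_suspend("re-add"));
	assert(!toy_state_store_needs_suspend("faulty\n"));
}
```

Other writes to the state attribute keep using the plain lock, so they do not pay for draining io.
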
drivers/md/md.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index aa08b9b78332..56523bac5140 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2940,11 +2940,7 @@ static int add_bound_rdev(struct md_rdev *rdev)
*/
super_types[mddev->major_version].
validate_super(mddev, rdev);
- if (add_journal)
- mddev_suspend(mddev);
err = mddev->pers->hot_add_disk(mddev, rdev);
- if (add_journal)
- mddev_resume(mddev);
if (err) {
md_kick_rdev_from_array(rdev);
return err;
@@ -3697,6 +3693,7 @@ rdev_attr_store(struct kobject *kobj, struct attribute *attr,
struct rdev_sysfs_entry *entry = container_of(attr, struct rdev_sysfs_entry, attr);
struct md_rdev *rdev = container_of(kobj, struct md_rdev, kobj);
struct kernfs_node *kn = NULL;
+ bool suspend = false;
ssize_t rv;
struct mddev *mddev = rdev->mddev;

@@ -3704,17 +3701,23 @@ rdev_attr_store(struct kobject *kobj, struct attribute *attr,
return -EIO;
if (!capable(CAP_SYS_ADMIN))
return -EACCES;
+ if (!mddev)
+ return -ENODEV;

- if (entry->store == state_store && cmd_match(page, "remove"))
- kn = sysfs_break_active_protection(kobj, attr);
+ if (entry->store == state_store) {
+ if (cmd_match(page, "remove"))
+ kn = sysfs_break_active_protection(kobj, attr);
+ if (cmd_match(page, "remove") || cmd_match(page, "re-add"))
+ suspend = true;
+ }

- rv = mddev ? mddev_lock(mddev) : -ENODEV;
+ rv = suspend ? mddev_suspend_and_lock(mddev) : mddev_lock(mddev);
if (!rv) {
if (rdev->mddev == NULL)
rv = -ENODEV;
else
rv = entry->store(rdev, page, length);
- mddev_unlock(mddev);
+ suspend ? mddev_unlock_and_resume(mddev) : mddev_unlock(mddev);
}

if (kn)
--
2.39.2

2023-10-10 07:24:47

by Yu Kuai

Subject: [PATCH -next v4 03/19] md: replace is_md_suspended() with 'mddev->suspended' in md_check_recovery()

From: Yu Kuai <[email protected]>

Prepare to cleanup pers->prepare_suspend(), which is used to fix a
deadlock in raid456 by returning error for io that is waiting for
reshape to make progress in mddev_suspend().

This change will allow reshape to make progress while waiting for io to
be done in mddev_suspend() in following patches.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 930bcabeec79..cf929c42cdbe 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9418,7 +9418,7 @@ void md_check_recovery(struct mddev *mddev)
wake_up(&mddev->sb_wait);
}

- if (is_md_suspended(mddev))
+ if (READ_ONCE(mddev->suspended))
return;

if (mddev->bitmap)
--
2.39.2

2023-10-10 07:24:56

by Yu Kuai

Subject: [PATCH -next v4 10/19] md: use new apis to suspend array for sysfs apis

From: Yu Kuai <[email protected]>

Convert to use new apis in following sysfs apis:
- level_store
- suspend_lo_store
- suspend_hi_store
- serialize_policy_store

These are not hot paths, so performance is not a concern.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md.c | 24 ++++++++----------------
1 file changed, 8 insertions(+), 16 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 201de29d913c..aa08b9b78332 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4019,7 +4019,7 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
if (slen == 0 || slen >= sizeof(clevel))
return -EINVAL;

- rv = mddev_lock(mddev);
+ rv = mddev_suspend_and_lock(mddev);
if (rv)
return rv;

@@ -4112,7 +4112,6 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
}

/* Looks like we have a winner */
- mddev_suspend(mddev);
mddev_detach(mddev);

spin_lock(&mddev->lock);
@@ -4198,14 +4197,13 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
blk_set_stacking_limits(&mddev->queue->limits);
pers->run(mddev);
set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags);
- mddev_resume(mddev);
if (!mddev->thread)
md_update_sb(mddev, 1);
sysfs_notify_dirent_safe(mddev->sysfs_level);
md_new_event();
rv = len;
out_unlock:
- mddev_unlock(mddev);
+ mddev_unlock_and_resume(mddev);
return rv;
}

@@ -5293,15 +5291,13 @@ suspend_lo_store(struct mddev *mddev, const char *buf, size_t len)
if (new != (sector_t)new)
return -EINVAL;

- err = mddev_lock(mddev);
+ err = __mddev_suspend(mddev, true);
if (err)
return err;

- mddev_suspend(mddev);
WRITE_ONCE(mddev->suspend_lo, new);
- mddev_resume(mddev);
+ __mddev_resume(mddev);

- mddev_unlock(mddev);
return len;
}
static struct md_sysfs_entry md_suspend_lo =
@@ -5326,15 +5322,13 @@ suspend_hi_store(struct mddev *mddev, const char *buf, size_t len)
if (new != (sector_t)new)
return -EINVAL;

- err = mddev_lock(mddev);
+ err = __mddev_suspend(mddev, true);
if (err)
return err;

- mddev_suspend(mddev);
WRITE_ONCE(mddev->suspend_hi, new);
- mddev_resume(mddev);
+ __mddev_resume(mddev);

- mddev_unlock(mddev);
return len;
}
static struct md_sysfs_entry md_suspend_hi =
@@ -5582,7 +5576,7 @@ serialize_policy_store(struct mddev *mddev, const char *buf, size_t len)
if (value == mddev->serialize_policy)
return len;

- err = mddev_lock(mddev);
+ err = mddev_suspend_and_lock(mddev);
if (err)
return err;
if (mddev->pers == NULL || (mddev->pers->level != 1)) {
@@ -5591,15 +5585,13 @@ serialize_policy_store(struct mddev *mddev, const char *buf, size_t len)
goto unlock;
}

- mddev_suspend(mddev);
if (value)
mddev_create_serial_pool(mddev, NULL, true);
else
mddev_destroy_serial_pool(mddev, NULL, true);
mddev->serialize_policy = value;
- mddev_resume(mddev);
unlock:
- mddev_unlock(mddev);
+ mddev_unlock_and_resume(mddev);
return err ?: len;
}

--
2.39.2

2023-10-10 07:25:00

by Yu Kuai

Subject: [PATCH -next v4 18/19] md: remove old apis to suspend the array

From: Yu Kuai <[email protected]>

Now that mddev_suspend() and mddev_resume() are not used anywhere, remove
them, and remove 'MD_ALLOW_SB_UPDATE' and 'MD_UPDATING_SB' as well.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md.c | 82 ++-----------------------------------------------
drivers/md/md.h | 8 -----
2 files changed, 3 insertions(+), 87 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 42f5fbde4e89..757cd53c0fa1 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -418,74 +418,10 @@ static void md_submit_bio(struct bio *bio)
md_handle_request(mddev, bio);
}

-/* mddev_suspend makes sure no new requests are submitted
- * to the device, and that any requests that have been submitted
- * are completely handled.
- * Once mddev_detach() is called and completes, the module will be
- * completely unused.
+/*
+ * Make sure no new requests are submitted to the device, and any requests that
+ * have been submitted are completely handled.
*/
-void mddev_suspend(struct mddev *mddev)
-{
- struct md_thread *thread = rcu_dereference_protected(mddev->thread,
- lockdep_is_held(&mddev->reconfig_mutex));
-
- WARN_ON_ONCE(thread && current == thread->tsk);
-
- /* can't concurrent with __mddev_suspend() and __mddev_resume() */
- mutex_lock(&mddev->suspend_mutex);
- if (mddev->suspended++) {
- mutex_unlock(&mddev->suspend_mutex);
- return;
- }
-
- wake_up(&mddev->sb_wait);
- set_bit(MD_ALLOW_SB_UPDATE, &mddev->flags);
- percpu_ref_kill(&mddev->active_io);
-
- /*
- * TODO: cleanup 'pers->prepare_suspend after all callers are replaced
- * by __mddev_suspend().
- */
- if (mddev->pers && mddev->pers->prepare_suspend)
- mddev->pers->prepare_suspend(mddev);
-
- wait_event(mddev->sb_wait, percpu_ref_is_zero(&mddev->active_io));
- clear_bit_unlock(MD_ALLOW_SB_UPDATE, &mddev->flags);
- wait_event(mddev->sb_wait, !test_bit(MD_UPDATING_SB, &mddev->flags));
-
- del_timer_sync(&mddev->safemode_timer);
- /* restrict memory reclaim I/O during raid array is suspend */
- mddev->noio_flag = memalloc_noio_save();
-
- mutex_unlock(&mddev->suspend_mutex);
-}
-EXPORT_SYMBOL_GPL(mddev_suspend);
-
-void mddev_resume(struct mddev *mddev)
-{
- lockdep_assert_held(&mddev->reconfig_mutex);
-
- /* can't concurrent with __mddev_suspend() and __mddev_resume() */
- mutex_lock(&mddev->suspend_mutex);
- if (--mddev->suspended) {
- mutex_unlock(&mddev->suspend_mutex);
- return;
- }
-
- /* entred the memalloc scope from mddev_suspend() */
- memalloc_noio_restore(mddev->noio_flag);
-
- percpu_ref_resurrect(&mddev->active_io);
- wake_up(&mddev->sb_wait);
-
- set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
- md_wakeup_thread(mddev->thread);
- md_wakeup_thread(mddev->sync_thread); /* possibly kick off a reshape */
-
- mutex_unlock(&mddev->suspend_mutex);
-}
-EXPORT_SYMBOL_GPL(mddev_resume);
-
int __mddev_suspend(struct mddev *mddev, bool interruptible)
{
int err = 0;
@@ -9503,18 +9439,6 @@ static void md_start_sync(struct work_struct *ws)
*/
void md_check_recovery(struct mddev *mddev)
{
- if (test_bit(MD_ALLOW_SB_UPDATE, &mddev->flags) && mddev->sb_flags) {
- /* Write superblock - thread that called mddev_suspend()
- * holds reconfig_mutex for us.
- */
- set_bit(MD_UPDATING_SB, &mddev->flags);
- smp_mb__after_atomic();
- if (test_bit(MD_ALLOW_SB_UPDATE, &mddev->flags))
- md_update_sb(mddev, 0);
- clear_bit_unlock(MD_UPDATING_SB, &mddev->flags);
- wake_up(&mddev->sb_wait);
- }
-
if (READ_ONCE(mddev->suspended))
return;

diff --git a/drivers/md/md.h b/drivers/md/md.h
index 63b4c393b1ee..4c5f3f032656 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -248,10 +248,6 @@ struct md_cluster_info;
* become failed.
* @MD_HAS_PPL: The raid array has PPL feature set.
* @MD_HAS_MULTIPLE_PPLS: The raid array has multiple PPLs feature set.
- * @MD_ALLOW_SB_UPDATE: md_check_recovery is allowed to update the metadata
- * without taking reconfig_mutex.
- * @MD_UPDATING_SB: md_check_recovery is updating the metadata without
- * explicitly holding reconfig_mutex.
* @MD_NOT_READY: do_md_run() is active, so 'array_state', ust not report that
* array is ready yet.
* @MD_BROKEN: This is used to stop writes and mark array as failed.
@@ -268,8 +264,6 @@ enum mddev_flags {
MD_FAILFAST_SUPPORTED,
MD_HAS_PPL,
MD_HAS_MULTIPLE_PPLS,
- MD_ALLOW_SB_UPDATE,
- MD_UPDATING_SB,
MD_NOT_READY,
MD_BROKEN,
MD_DELETED,
@@ -810,8 +804,6 @@ extern int md_rdev_init(struct md_rdev *rdev);
extern void md_rdev_clear(struct md_rdev *rdev);

extern void md_handle_request(struct mddev *mddev, struct bio *bio);
-extern void mddev_suspend(struct mddev *mddev);
-extern void mddev_resume(struct mddev *mddev);
extern int __mddev_suspend(struct mddev *mddev, bool interruptible);
extern void __mddev_resume(struct mddev *mddev);

--
2.39.2
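The suspend/resume pair removed above nests through a counter: only the first suspend stops io, and only the last resume restarts it. Below is a minimal user-space sketch of that counting rule. `struct fake_array`, `fake_suspend()` and `fake_resume()` are hypothetical names made up for illustration; the sketch leaves out the locking, the percpu ref and the superblock handling of the real code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct mddev; it only models the nesting counter. */
struct fake_array {
	int suspended;   /* nesting depth, like mddev->suspended */
	bool io_allowed; /* true while normal io may be submitted */
};

/* Only the first caller actually stops io; nested callers just count. */
static void fake_suspend(struct fake_array *a)
{
	if (a->suspended++)
		return;
	a->io_allowed = false;
}

/* Only the last caller actually restarts io. */
static void fake_resume(struct fake_array *a)
{
	assert(a->suspended > 0);
	if (--a->suspended)
		return;
	a->io_allowed = true;
}
```

In this model io stays blocked until every suspend has been matched by a resume.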

2023-10-10 07:25:44

by Yu Kuai

Subject: [PATCH -next v4 15/19] md/md-linear: cleanup linear_add()

From: Yu Kuai <[email protected]>

Now that the caller already suspends the array, there is no need to suspend
the array in linear_add().

Note that mddev_suspend/resume() is not used anymore.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md-linear.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/drivers/md/md-linear.c b/drivers/md/md-linear.c
index ae2826e9645b..8eca7693b793 100644
--- a/drivers/md/md-linear.c
+++ b/drivers/md/md-linear.c
@@ -183,7 +183,6 @@ static int linear_add(struct mddev *mddev, struct md_rdev *rdev)
* in linear_congested(), therefore kfree_rcu() is used to free
* oldconf until no one uses it anymore.
*/
- mddev_suspend(mddev);
oldconf = rcu_dereference_protected(mddev->private,
lockdep_is_held(&mddev->reconfig_mutex));
mddev->raid_disks++;
@@ -192,7 +191,6 @@ static int linear_add(struct mddev *mddev, struct md_rdev *rdev)
rcu_assign_pointer(mddev->private, newconf);
md_set_array_sectors(mddev, linear_size(mddev, 0, 0));
set_capacity_and_notify(mddev->gendisk, mddev->array_sectors);
- mddev_resume(mddev);
kfree_rcu(oldconf, rcu);
return 0;
}
--
2.39.2

2023-10-10 07:26:02

by Yu Kuai

Subject: [PATCH -next v4 16/19] md/raid5: replace suspend with quiesce() callback

From: Yu Kuai <[email protected]>

raid5 is the only personality that suspends the array in the check_reshape()
and start_reshape() callbacks. Both suspend and the quiesce() callback wait
for all normal io to be done and prevent new io from being dispatched; the
difference is that suspend is implemented in the common layer, while the
quiesce() callback is implemented in raid5.

In order to clean up all the users of mddev_suspend(), the new api
__mddev_suspend() needs to be called before 'reconfig_mutex' is held,
and it's not good to affect all the personalities in the common layer just
for raid5. Hence replace suspend with the quiesce() callback, in preparation
for removing all the users of mddev_suspend().

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/raid5.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index e6b8c0145648..d6de084a85e5 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -70,6 +70,8 @@ MODULE_PARM_DESC(devices_handle_discard_safely,
"Set to Y if all devices in each array reliably return zeroes on reads from discarded regions");
static struct workqueue_struct *raid5_wq;

+static void raid5_quiesce(struct mddev *mddev, int quiesce);
+
static inline struct hlist_head *stripe_hash(struct r5conf *conf, sector_t sect)
{
int hash = (sect >> RAID5_STRIPE_SHIFT(conf)) & HASH_MASK;
@@ -2492,15 +2494,12 @@ static int resize_chunks(struct r5conf *conf, int new_disks, int new_sectors)
unsigned long cpu;
int err = 0;

- /*
- * Never shrink. And mddev_suspend() could deadlock if this is called
- * from raid5d. In that case, scribble_disks and scribble_sectors
- * should equal to new_disks and new_sectors
- */
+ /* Never shrink. */
if (conf->scribble_disks >= new_disks &&
conf->scribble_sectors >= new_sectors)
return 0;
- mddev_suspend(conf->mddev);
+
+ raid5_quiesce(conf->mddev, true);
cpus_read_lock();

for_each_present_cpu(cpu) {
@@ -2514,7 +2513,8 @@ static int resize_chunks(struct r5conf *conf, int new_disks, int new_sectors)
}

cpus_read_unlock();
- mddev_resume(conf->mddev);
+ raid5_quiesce(conf->mddev, false);
+
if (!err) {
conf->scribble_disks = new_disks;
conf->scribble_sectors = new_sectors;
@@ -8551,8 +8551,8 @@ static int raid5_start_reshape(struct mddev *mddev)
* the reshape wasn't running - like Discard or Read - have
* completed.
*/
- mddev_suspend(mddev);
- mddev_resume(mddev);
+ raid5_quiesce(mddev, true);
+ raid5_quiesce(mddev, false);

/* Add some new drives, as many as will fit.
* We know there are enough to make the newly sized array work.
--
2.39.2

2023-10-10 07:26:47

by Yu Kuai

Subject: [PATCH -next v4 17/19] md: suspend array in md_start_sync() if array need reconfiguration

From: Yu Kuai <[email protected]>

Suspend the array so that io won't run concurrently with array
reconfiguration. It's safe to suspend the array directly because normal io
doesn't rely on md_start_sync().

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 77ba8b265e16..42f5fbde4e89 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9417,8 +9417,13 @@ static void md_start_sync(struct work_struct *ws)
{
struct mddev *mddev = container_of(ws, struct mddev, sync_work);
int spares = 0;
+ bool suspend = false;

- mddev_lock_nointr(mddev);
+ if (md_spares_need_change(mddev))
+ suspend = true;
+
+ suspend ? mddev_suspend_and_lock_nointr(mddev) :
+ mddev_lock_nointr(mddev);

if (!md_is_rdwr(mddev)) {
/*
@@ -9454,7 +9459,7 @@ static void md_start_sync(struct work_struct *ws)
goto not_running;
}

- mddev_unlock(mddev);
+ suspend ? mddev_unlock_and_resume(mddev) : mddev_unlock(mddev);
md_wakeup_thread(mddev->sync_thread);
sysfs_notify_dirent_safe(mddev->sysfs_action);
md_new_event();
@@ -9466,7 +9471,7 @@ static void md_start_sync(struct work_struct *ws)
clear_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
- mddev_unlock(mddev);
+ suspend ? mddev_unlock_and_resume(mddev) : mddev_unlock(mddev);

wake_up(&resync_wait);
if (test_and_clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery) &&
--
2.39.2
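The patch above decides once whether to suspend and then reuses that decision on every unlock path. The user-space sketch below shows why the answer is cached in a local variable: the reconfiguration itself changes the state the decision was based on. All names (`struct fake_array`, `fake_start_sync()` and the helpers) are hypothetical stand-ins, not kernel APIs, and the model is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct mddev; all names here are illustrative. */
struct fake_array {
	int suspended;           /* current suspend depth */
	int suspend_calls;       /* how many times the array was suspended */
	bool locked;             /* models 'reconfig_mutex' */
	bool spares_need_change; /* models the result of md_spares_need_change() */
};

static void fake_lock(struct fake_array *a)
{
	assert(!a->locked);
	a->locked = true;
}

static void fake_unlock(struct fake_array *a)
{
	assert(a->locked);
	a->locked = false;
}

/* Suspend first, then take the lock, mirroring the order used in the patch. */
static void fake_suspend_and_lock(struct fake_array *a)
{
	a->suspended++;
	a->suspend_calls++;
	fake_lock(a);
}

/* Drop the lock first, then resume. */
static void fake_unlock_and_resume(struct fake_array *a)
{
	fake_unlock(a);
	assert(a->suspended > 0);
	a->suspended--;
}

static void fake_start_sync(struct fake_array *a)
{
	/* Decide once, before locking, and reuse the answer on the way out. */
	bool suspend = a->spares_need_change;

	if (suspend)
		fake_suspend_and_lock(a);
	else
		fake_lock(a);

	/* Reconfiguration changes the state the decision was based on. */
	a->spares_need_change = false;

	if (suspend)
		fake_unlock_and_resume(a);
	else
		fake_unlock(a);
}
```

If the exit path re-evaluated `spares_need_change` instead of using the cached `suspend`, this model would leave the array suspended forever after the first reconfiguration.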

2023-10-10 07:26:47

by Yu Kuai

Subject: [PATCH -next v4 19/19] md: rename __mddev_suspend/resume() back to mddev_suspend/resume()

From: Yu Kuai <[email protected]>

Now that the old apis are removed, __mddev_suspend/resume() can be
renamed to their original names.

This is done by:

sed -i "s/__mddev_suspend/mddev_suspend/g" *.[ch]
sed -i "s/__mddev_resume/mddev_resume/g" *.[ch]

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/dm-raid.c | 4 ++--
drivers/md/md.c | 18 +++++++++---------
drivers/md/md.h | 12 ++++++------
drivers/md/raid5-cache.c | 4 ++--
4 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index 05dd6ccf6f48..a4692f8f98ee 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3797,7 +3797,7 @@ static void raid_postsuspend(struct dm_target *ti)
if (!test_bit(MD_RECOVERY_FROZEN, &rs->md.recovery))
md_stop_writes(&rs->md);

- __mddev_suspend(&rs->md, false);
+ mddev_suspend(&rs->md, false);
}
}

@@ -4009,7 +4009,7 @@ static int raid_preresume(struct dm_target *ti)
}

/* Check for any resize/reshape on @rs and adjust/initiate */
- /* Be prepared for __mddev_resume() in raid_resume() */
+ /* Be prepared for mddev_resume() in raid_resume() */
set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
if (mddev->recovery_cp && mddev->recovery_cp < MaxSector) {
set_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 757cd53c0fa1..8ee079c4dc1e 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -422,7 +422,7 @@ static void md_submit_bio(struct bio *bio)
* Make sure no new requests are submitted to the device, and any requests that
* have been submitted are completely handled.
*/
-int __mddev_suspend(struct mddev *mddev, bool interruptible)
+int mddev_suspend(struct mddev *mddev, bool interruptible)
{
int err = 0;

@@ -473,9 +473,9 @@ int __mddev_suspend(struct mddev *mddev, bool interruptible)
mutex_unlock(&mddev->suspend_mutex);
return 0;
}
-EXPORT_SYMBOL_GPL(__mddev_suspend);
+EXPORT_SYMBOL_GPL(mddev_suspend);

-void __mddev_resume(struct mddev *mddev)
+void mddev_resume(struct mddev *mddev)
{
lockdep_assert_not_held(&mddev->reconfig_mutex);

@@ -486,7 +486,7 @@ void __mddev_resume(struct mddev *mddev)
return;
}

- /* entred the memalloc scope from __mddev_suspend() */
+ /* entred the memalloc scope from mddev_suspend() */
memalloc_noio_restore(mddev->noio_flag);

percpu_ref_resurrect(&mddev->active_io);
@@ -498,7 +498,7 @@ void __mddev_resume(struct mddev *mddev)

mutex_unlock(&mddev->suspend_mutex);
}
-EXPORT_SYMBOL_GPL(__mddev_resume);
+EXPORT_SYMBOL_GPL(mddev_resume);

/*
* Generic flush handling for md
@@ -5219,12 +5219,12 @@ suspend_lo_store(struct mddev *mddev, const char *buf, size_t len)
if (new != (sector_t)new)
return -EINVAL;

- err = __mddev_suspend(mddev, true);
+ err = mddev_suspend(mddev, true);
if (err)
return err;

WRITE_ONCE(mddev->suspend_lo, new);
- __mddev_resume(mddev);
+ mddev_resume(mddev);

return len;
}
@@ -5250,12 +5250,12 @@ suspend_hi_store(struct mddev *mddev, const char *buf, size_t len)
if (new != (sector_t)new)
return -EINVAL;

- err = __mddev_suspend(mddev, true);
+ err = mddev_suspend(mddev, true);
if (err)
return err;

WRITE_ONCE(mddev->suspend_hi, new);
- __mddev_resume(mddev);
+ mddev_resume(mddev);

return len;
}
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 4c5f3f032656..55d01d431418 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -804,8 +804,8 @@ extern int md_rdev_init(struct md_rdev *rdev);
extern void md_rdev_clear(struct md_rdev *rdev);

extern void md_handle_request(struct mddev *mddev, struct bio *bio);
-extern int __mddev_suspend(struct mddev *mddev, bool interruptible);
-extern void __mddev_resume(struct mddev *mddev);
+extern int mddev_suspend(struct mddev *mddev, bool interruptible);
+extern void mddev_resume(struct mddev *mddev);

extern void md_reload_sb(struct mddev *mddev, int raid_disk);
extern void md_update_sb(struct mddev *mddev, int force);
@@ -853,27 +853,27 @@ static inline int mddev_suspend_and_lock(struct mddev *mddev)
{
int ret;

- ret = __mddev_suspend(mddev, true);
+ ret = mddev_suspend(mddev, true);
if (ret)
return ret;

ret = mddev_lock(mddev);
if (ret)
- __mddev_resume(mddev);
+ mddev_resume(mddev);

return ret;
}

static inline void mddev_suspend_and_lock_nointr(struct mddev *mddev)
{
- __mddev_suspend(mddev, false);
+ mddev_suspend(mddev, false);
mutex_lock(&mddev->reconfig_mutex);
}

static inline void mddev_unlock_and_resume(struct mddev *mddev)
{
mddev_unlock(mddev);
- __mddev_resume(mddev);
+ mddev_resume(mddev);
}

struct mdu_array_info_s;
diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 9909110262ee..6157f5beb9fe 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -699,9 +699,9 @@ static void r5c_disable_writeback_async(struct work_struct *work)

log = READ_ONCE(conf->log);
if (log) {
- __mddev_suspend(mddev, false);
+ mddev_suspend(mddev, false);
log->r5c_journal_mode = R5C_JOURNAL_MODE_WRITE_THROUGH;
- __mddev_resume(mddev);
+ mddev_resume(mddev);
}
}

--
2.39.2
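The mddev_suspend_and_lock() helper touched by this patch suspends first, then locks, and undoes the suspend if the lock attempt fails. A minimal user-space sketch of that unwinding order follows. The `fake_*` names and the `suspend_err` / `lock_err` fields used to inject failures are assumptions for illustration only; the real helpers can fail when an interruptible wait is interrupted.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for struct mddev with injectable failures. */
struct fake_array {
	int suspended;   /* suspend depth */
	int locked;      /* 1 while the model 'reconfig_mutex' is held */
	int suspend_err; /* if nonzero, fake_suspend() fails with this value */
	int lock_err;    /* if nonzero, fake_lock() fails with this value */
};

static int fake_suspend(struct fake_array *a)
{
	if (a->suspend_err)
		return a->suspend_err;
	a->suspended++;
	return 0;
}

static void fake_resume(struct fake_array *a)
{
	assert(a->suspended > 0);
	a->suspended--;
}

static int fake_lock(struct fake_array *a)
{
	if (a->lock_err)
		return a->lock_err;
	a->locked = 1;
	return 0;
}

/* Suspend, then lock; on lock failure undo the suspend before returning. */
static int fake_suspend_and_lock(struct fake_array *a)
{
	int ret;

	ret = fake_suspend(a);
	if (ret)
		return ret;

	ret = fake_lock(a);
	if (ret)
		fake_resume(a);

	return ret;
}
```

On any error the caller sees the array exactly as it was before the call, so it has nothing to clean up.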

2023-10-10 07:26:48

by Yu Kuai

Subject: [PATCH -next v4 13/19] md: use new apis to suspend array before mddev_create/destroy_serial_pool

From: Yu Kuai <[email protected]>

mddev_create/destroy_serial_pool() will be called from several places
where mddev_suspend() will be called later.

Prepare to remove the mddev_suspend() from
mddev_create/destroy_serial_pool().

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md-autodetect.c | 4 ++--
drivers/md/md-bitmap.c | 8 ++++----
drivers/md/md.c | 22 ++++++++++++----------
3 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/drivers/md/md-autodetect.c b/drivers/md/md-autodetect.c
index 6eaa0eab40f9..4b80165afd23 100644
--- a/drivers/md/md-autodetect.c
+++ b/drivers/md/md-autodetect.c
@@ -175,7 +175,7 @@ static void __init md_setup_drive(struct md_setup_args *args)
return;
}

- err = mddev_lock(mddev);
+ err = mddev_suspend_and_lock(mddev);
if (err) {
pr_err("md: failed to lock array %s\n", name);
goto out_mddev_put;
@@ -221,7 +221,7 @@ static void __init md_setup_drive(struct md_setup_args *args)
if (err)
pr_warn("md: starting %s failed\n", name);
out_unlock:
- mddev_unlock(mddev);
+ mddev_unlock_and_resume(mddev);
out_mddev_put:
mddev_put(mddev);
}
diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index 7d21e2a5b06e..b3d701c5c461 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -2537,7 +2537,7 @@ backlog_store(struct mddev *mddev, const char *buf, size_t len)
if (backlog > COUNTER_MAX)
return -EINVAL;

- rv = mddev_lock(mddev);
+ rv = mddev_suspend_and_lock(mddev);
if (rv)
return rv;

@@ -2562,16 +2562,16 @@ backlog_store(struct mddev *mddev, const char *buf, size_t len)
if (!backlog && mddev->serial_info_pool) {
/* serial_info_pool is not needed if backlog is zero */
if (!mddev->serialize_policy)
- mddev_destroy_serial_pool(mddev, NULL, false);
+ mddev_destroy_serial_pool(mddev, NULL, true);
} else if (backlog && !mddev->serial_info_pool) {
/* serial_info_pool is needed since backlog is not zero */
rdev_for_each(rdev, mddev)
- mddev_create_serial_pool(mddev, rdev, false);
+ mddev_create_serial_pool(mddev, rdev, true);
}
if (old_mwb != backlog)
md_bitmap_update_sb(mddev->bitmap);

- mddev_unlock(mddev);
+ mddev_unlock_and_resume(mddev);
return len;
}

diff --git a/drivers/md/md.c b/drivers/md/md.c
index ee20e6a4b998..85e263376be9 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2557,7 +2557,7 @@ static int bind_rdev_to_array(struct md_rdev *rdev, struct mddev *mddev)
pr_debug("md: bind<%s>\n", b);

if (mddev->raid_disks)
- mddev_create_serial_pool(mddev, rdev, false);
+ mddev_create_serial_pool(mddev, rdev, true);

if ((err = kobject_add(&rdev->kobj, &mddev->kobj, "dev-%s", b)))
goto fail;
@@ -3077,11 +3077,11 @@ state_store(struct md_rdev *rdev, const char *buf, size_t len)
}
} else if (cmd_match(buf, "writemostly")) {
set_bit(WriteMostly, &rdev->flags);
- mddev_create_serial_pool(rdev->mddev, rdev, false);
+ mddev_create_serial_pool(rdev->mddev, rdev, true);
need_update_sb = true;
err = 0;
} else if (cmd_match(buf, "-writemostly")) {
- mddev_destroy_serial_pool(rdev->mddev, rdev, false);
+ mddev_destroy_serial_pool(rdev->mddev, rdev, true);
clear_bit(WriteMostly, &rdev->flags);
need_update_sb = true;
err = 0;
@@ -3707,7 +3707,9 @@ rdev_attr_store(struct kobject *kobj, struct attribute *attr,
if (entry->store == state_store) {
if (cmd_match(page, "remove"))
kn = sysfs_break_active_protection(kobj, attr);
- if (cmd_match(page, "remove") || cmd_match(page, "re-add"))
+ if (cmd_match(page, "remove") || cmd_match(page, "re-add") ||
+ cmd_match(page, "writemostly") ||
+ cmd_match(page, "-writemostly"))
suspend = true;
}

@@ -4684,7 +4686,7 @@ new_dev_store(struct mddev *mddev, const char *buf, size_t len)
minor != MINOR(dev))
return -EOVERFLOW;

- err = mddev_lock(mddev);
+ err = mddev_suspend_and_lock(mddev);
if (err)
return err;
if (mddev->persistent) {
@@ -4705,14 +4707,14 @@ new_dev_store(struct mddev *mddev, const char *buf, size_t len)
rdev = md_import_device(dev, -1, -1);

if (IS_ERR(rdev)) {
- mddev_unlock(mddev);
+ mddev_unlock_and_resume(mddev);
return PTR_ERR(rdev);
}
err = bind_rdev_to_array(rdev, mddev);
out:
if (err)
export_rdev(rdev, mddev);
- mddev_unlock(mddev);
+ mddev_unlock_and_resume(mddev);
if (!err)
md_new_event();
return err ? err : len;
@@ -6649,13 +6651,13 @@ static void autorun_devices(int part)
if (IS_ERR(mddev))
break;

- if (mddev_lock(mddev))
+ if (mddev_suspend_and_lock(mddev))
pr_warn("md: %s locked, cannot run\n", mdname(mddev));
else if (mddev->raid_disks || mddev->major_version
|| !list_empty(&mddev->disks)) {
pr_warn("md: %s already running, cannot run %pg\n",
mdname(mddev), rdev0->bdev);
- mddev_unlock(mddev);
+ mddev_unlock_and_resume(mddev);
} else {
pr_debug("md: created %s\n", mdname(mddev));
mddev->persistent = 1;
@@ -6665,7 +6667,7 @@ static void autorun_devices(int part)
export_rdev(rdev, mddev);
}
autorun_array(mddev);
- mddev_unlock(mddev);
+ mddev_unlock_and_resume(mddev);
}
/* on success, candidates will be empty, on error
* it won't...
--
2.39.2

2023-10-10 07:26:56

by Yu Kuai

Subject: [PATCH -next v4 14/19] md: cleanup mddev_create/destroy_serial_pool()

From: Yu Kuai <[email protected]>

Now that all the callers, except for the one stopping the array, already
suspend the array, there is no need to suspend it again. Hence remove the
'is_suspend' parameter.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md-bitmap.c | 8 ++++----
drivers/md/md.c | 33 ++++++++++-----------------------
drivers/md/md.h | 7 +++----
3 files changed, 17 insertions(+), 31 deletions(-)

diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index b3d701c5c461..9672f75c3050 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -1861,7 +1861,7 @@ void md_bitmap_destroy(struct mddev *mddev)

md_bitmap_wait_behind_writes(mddev);
if (!mddev->serialize_policy)
- mddev_destroy_serial_pool(mddev, NULL, true);
+ mddev_destroy_serial_pool(mddev, NULL);

mutex_lock(&mddev->bitmap_info.mutex);
spin_lock(&mddev->lock);
@@ -1977,7 +1977,7 @@ int md_bitmap_load(struct mddev *mddev)
goto out;

rdev_for_each(rdev, mddev)
- mddev_create_serial_pool(mddev, rdev, true);
+ mddev_create_serial_pool(mddev, rdev);

if (mddev_is_clustered(mddev))
md_cluster_ops->load_bitmaps(mddev, mddev->bitmap_info.nodes);
@@ -2562,11 +2562,11 @@ backlog_store(struct mddev *mddev, const char *buf, size_t len)
if (!backlog && mddev->serial_info_pool) {
/* serial_info_pool is not needed if backlog is zero */
if (!mddev->serialize_policy)
- mddev_destroy_serial_pool(mddev, NULL, true);
+ mddev_destroy_serial_pool(mddev, NULL);
} else if (backlog && !mddev->serial_info_pool) {
/* serial_info_pool is needed since backlog is not zero */
rdev_for_each(rdev, mddev)
- mddev_create_serial_pool(mddev, rdev, true);
+ mddev_create_serial_pool(mddev, rdev);
}
if (old_mwb != backlog)
md_bitmap_update_sb(mddev->bitmap);
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 85e263376be9..77ba8b265e16 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -206,8 +206,7 @@ static int rdev_need_serial(struct md_rdev *rdev)
* 1. rdev is the first device which return true from rdev_enable_serial.
* 2. rdev is NULL, means we want to enable serialization for all rdevs.
*/
-void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
- bool is_suspend)
+void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev)
{
int ret = 0;

@@ -215,15 +214,12 @@ void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
!test_bit(CollisionCheck, &rdev->flags))
return;

- if (!is_suspend)
- mddev_suspend(mddev);
-
if (!rdev)
ret = rdevs_init_serial(mddev);
else
ret = rdev_init_serial(rdev);
if (ret)
- goto abort;
+ return;

if (mddev->serial_info_pool == NULL) {
/*
@@ -238,10 +234,6 @@ void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
pr_err("can't alloc memory pool for serialization\n");
}
}
-
-abort:
- if (!is_suspend)
- mddev_resume(mddev);
}

/*
@@ -250,8 +242,7 @@ void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
* 2. when bitmap is destroyed while policy is not enabled.
* 3. for disable policy, the pool is destroyed only when no rdev needs it.
*/
-void mddev_destroy_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
- bool is_suspend)
+void mddev_destroy_serial_pool(struct mddev *mddev, struct md_rdev *rdev)
{
if (rdev && !test_bit(CollisionCheck, &rdev->flags))
return;
@@ -260,8 +251,6 @@ void mddev_destroy_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
struct md_rdev *temp;
int num = 0; /* used to track if other rdevs need the pool */

- if (!is_suspend)
- mddev_suspend(mddev);
rdev_for_each(temp, mddev) {
if (!rdev) {
if (!mddev->serialize_policy ||
@@ -283,8 +272,6 @@ void mddev_destroy_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
mempool_destroy(mddev->serial_info_pool);
mddev->serial_info_pool = NULL;
}
- if (!is_suspend)
- mddev_resume(mddev);
}
}

@@ -2557,7 +2544,7 @@ static int bind_rdev_to_array(struct md_rdev *rdev, struct mddev *mddev)
pr_debug("md: bind<%s>\n", b);

if (mddev->raid_disks)
- mddev_create_serial_pool(mddev, rdev, true);
+ mddev_create_serial_pool(mddev, rdev);

if ((err = kobject_add(&rdev->kobj, &mddev->kobj, "dev-%s", b)))
goto fail;
@@ -2610,7 +2597,7 @@ static void md_kick_rdev_from_array(struct md_rdev *rdev)
bd_unlink_disk_holder(rdev->bdev, rdev->mddev->gendisk);
list_del_rcu(&rdev->same_set);
pr_debug("md: unbind<%pg>\n", rdev->bdev);
- mddev_destroy_serial_pool(rdev->mddev, rdev, false);
+ mddev_destroy_serial_pool(rdev->mddev, rdev);
rdev->mddev = NULL;
sysfs_remove_link(&rdev->kobj, "block");
sysfs_put(rdev->sysfs_state);
@@ -3077,11 +3064,11 @@ state_store(struct md_rdev *rdev, const char *buf, size_t len)
}
} else if (cmd_match(buf, "writemostly")) {
set_bit(WriteMostly, &rdev->flags);
- mddev_create_serial_pool(rdev->mddev, rdev, true);
+ mddev_create_serial_pool(rdev->mddev, rdev);
need_update_sb = true;
err = 0;
} else if (cmd_match(buf, "-writemostly")) {
- mddev_destroy_serial_pool(rdev->mddev, rdev, true);
+ mddev_destroy_serial_pool(rdev->mddev, rdev);
clear_bit(WriteMostly, &rdev->flags);
need_update_sb = true;
err = 0;
@@ -5591,9 +5578,9 @@ serialize_policy_store(struct mddev *mddev, const char *buf, size_t len)
}

if (value)
- mddev_create_serial_pool(mddev, NULL, true);
+ mddev_create_serial_pool(mddev, NULL);
else
- mddev_destroy_serial_pool(mddev, NULL, true);
+ mddev_destroy_serial_pool(mddev, NULL);
mddev->serialize_policy = value;
unlock:
mddev_unlock_and_resume(mddev);
@@ -6359,7 +6346,7 @@ static void __md_stop_writes(struct mddev *mddev)
}
/* disable policy to guarantee rdevs free resources for serialization */
mddev->serialize_policy = 0;
- mddev_destroy_serial_pool(mddev, NULL, true);
+ mddev_destroy_serial_pool(mddev, NULL);
}

void md_stop_writes(struct mddev *mddev)
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 5c8f3f045e78..63b4c393b1ee 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -817,10 +817,9 @@ extern void __mddev_resume(struct mddev *mddev);

extern void md_reload_sb(struct mddev *mddev, int raid_disk);
extern void md_update_sb(struct mddev *mddev, int force);
-extern void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
- bool is_suspend);
-extern void mddev_destroy_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
- bool is_suspend);
+extern void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev);
+extern void mddev_destroy_serial_pool(struct mddev *mddev,
+ struct md_rdev *rdev);
struct md_rdev *md_find_rdev_nr_rcu(struct mddev *mddev, int nr);
struct md_rdev *md_find_rdev_rcu(struct mddev *mddev, dev_t dev);

--
2.39.2

2023-10-10 07:26:56

by Yu Kuai

Subject: [PATCH -next v4 12/19] md: use new apis to suspend array for ioctls involved array reconfiguration

From: Yu Kuai <[email protected]>

'reconfig_mutex' will be grabbed before these ioctls. Suspend the array
before holding the lock, so that io won't run concurrently with array
reconfiguration through ioctls.

This is not a hot path, so performance is not a concern.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md.c | 30 ++++++++++++++++++++----------
1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 56523bac5140..ee20e6a4b998 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7215,7 +7215,6 @@ static int set_bitmap_file(struct mddev *mddev, int fd)
struct bitmap *bitmap;

bitmap = md_bitmap_create(mddev, -1);
- mddev_suspend(mddev);
if (!IS_ERR(bitmap)) {
mddev->bitmap = bitmap;
err = md_bitmap_load(mddev);
@@ -7225,11 +7224,8 @@ static int set_bitmap_file(struct mddev *mddev, int fd)
md_bitmap_destroy(mddev);
fd = -1;
}
- mddev_resume(mddev);
} else if (fd < 0) {
- mddev_suspend(mddev);
md_bitmap_destroy(mddev);
- mddev_resume(mddev);
}
}
if (fd < 0) {
@@ -7518,7 +7514,6 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
mddev->bitmap_info.space =
mddev->bitmap_info.default_space;
bitmap = md_bitmap_create(mddev, -1);
- mddev_suspend(mddev);
if (!IS_ERR(bitmap)) {
mddev->bitmap = bitmap;
rv = md_bitmap_load(mddev);
@@ -7526,7 +7521,6 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
rv = PTR_ERR(bitmap);
if (rv)
md_bitmap_destroy(mddev);
- mddev_resume(mddev);
} else {
/* remove the bitmap */
if (!mddev->bitmap) {
@@ -7551,9 +7545,7 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
module_put(md_cluster_mod);
mddev->safemode_delay = DEFAULT_SAFEMODE_DELAY;
}
- mddev_suspend(mddev);
md_bitmap_destroy(mddev);
- mddev_resume(mddev);
mddev->bitmap_info.offset = 0;
}
}
@@ -7624,6 +7616,20 @@ static inline bool md_ioctl_valid(unsigned int cmd)
}
}

+static bool md_ioctl_need_suspend(unsigned int cmd)
+{
+ switch (cmd) {
+ case ADD_NEW_DISK:
+ case HOT_ADD_DISK:
+ case HOT_REMOVE_DISK:
+ case SET_BITMAP_FILE:
+ case SET_ARRAY_INFO:
+ return true;
+ default:
+ return false;
+ }
+}
+
static int __md_set_array_info(struct mddev *mddev, void __user *argp)
{
mdu_array_info_t info;
@@ -7756,7 +7762,8 @@ static int md_ioctl(struct block_device *bdev, blk_mode_t mode,
if (!md_is_rdwr(mddev))
flush_work(&mddev->sync_work);

- err = mddev_lock(mddev);
+ err = md_ioctl_need_suspend(cmd) ? mddev_suspend_and_lock(mddev) :
+ mddev_lock(mddev);
if (err) {
pr_debug("md: ioctl lock interrupted, reason %d, cmd %d\n",
err, cmd);
@@ -7884,7 +7891,10 @@ static int md_ioctl(struct block_device *bdev, blk_mode_t mode,
if (mddev->hold_active == UNTIL_IOCTL &&
err != -EINVAL)
mddev->hold_active = 0;
- mddev_unlock(mddev);
+
+ md_ioctl_need_suspend(cmd) ? mddev_unlock_and_resume(mddev) :
+ mddev_unlock(mddev);
+
out:
if(did_set_md_closing)
clear_bit(MD_CLOSING, &mddev->flags);
--
2.39.2

2023-10-11 02:47:37

by Song Liu

Subject: Re: [PATCH -next v4 00/19] md: synchronize io with array reconfiguration

On Tue, Oct 10, 2023 at 12:23 AM Yu Kuai <[email protected]> wrote:
>
> From: Yu Kuai <[email protected]>
>
> Changes in v4:
> - add detailed explanation of backgroud and design in cover letter;
> - merge some patches;

Thanks for v4!

I applied it to a tmp branch (md-tmp-20231010) with a modified cover letter
as the merge log. Please let me know what you think about it.

Thanks,
Song

2023-10-11 03:26:35

by Yu Kuai

Subject: Re: [PATCH -next v4 00/19] md: synchronize io with array reconfiguration

Hi,

On 2023/10/11 10:47, Song Liu wrote:
> On Tue, Oct 10, 2023 at 12:23 AM Yu Kuai <[email protected]> wrote:
>>
>> From: Yu Kuai <[email protected]>
>>
>> Changes in v4:
>> - add detailed explanation of backgroud and design in cover letter;
>> - merge some patches;
>
> Thanks for v4!
>
> I applied it to a tmp branch (md-tmp-20231010) with a modified cover letter
> as the merge log. Please let me know what you think about it.

Thanks, this looks great!

I'll start cleaning up lots of stuff.

Kuai,

>
> Thanks,
> Song
> .
>