2024-06-03 13:00:49

by Yu Kuai

[permalink] [raw]
Subject: [PATCH 00/12] md: refacotor and some fixes related to sync_thread

Changes from RFC:
- fix some typos;
- add patch 7 to prevent some mdadm tests failure;
- add patch 12 to fix BUG_ON() panic by mdadm test 07revert-grow;

Yu Kuai (12):
md: rearrange recovery_flags
md: add a new enum type sync_action
md: add new helpers for sync_action
md: factor out helper to start reshape from action_store()
md: replace sysfs api sync_action with new helpers
md: remove parameter check_seq for stop_sync_thread()
md: don't fail action_store() if sync_thread is not registered
md: use new helers in md_do_sync()
md: replace last_sync_action with new enum type
md: factor out helpers for different sync_action in md_do_sync()
md: pass in max_sectors for pers->sync_request()
md/raid5: avoid BUG_ON() while continue reshape after reassembling

drivers/md/dm-raid.c | 2 +-
drivers/md/md.c | 437 ++++++++++++++++++++++++++-----------------
drivers/md/md.h | 124 +++++++++---
drivers/md/raid1.c | 5 +-
drivers/md/raid10.c | 8 +-
drivers/md/raid5.c | 23 ++-
6 files changed, 388 insertions(+), 211 deletions(-)

--
2.39.2



2024-06-03 13:00:59

by Yu Kuai

[permalink] [raw]
Subject: [PATCH 04/12] md: factor out helper to start reshape from action_store()

There are no functional changes, just to make code cleaner and prepare
for following refactor.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md.c | 65 +++++++++++++++++++++++++++++++------------------
1 file changed, 41 insertions(+), 24 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index e5487008f66e..28f50a106566 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5070,6 +5070,45 @@ static void frozen_sync_thread(struct mddev *mddev)
mutex_unlock(&mddev->sync_mutex);
}

+static int mddev_start_reshape(struct mddev *mddev)
+{
+ int ret;
+
+ if (mddev->pers->start_reshape == NULL)
+ return -EINVAL;
+
+ ret = mddev_lock(mddev);
+ if (ret)
+ return ret;
+
+ if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
+ mddev_unlock(mddev);
+ return -EBUSY;
+ }
+
+ if (mddev->reshape_position == MaxSector ||
+ mddev->pers->check_reshape == NULL ||
+ mddev->pers->check_reshape(mddev)) {
+ clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+ ret = mddev->pers->start_reshape(mddev);
+ if (ret) {
+ mddev_unlock(mddev);
+ return ret;
+ }
+ } else {
+ /*
+ * If reshape is still in progress, and md_check_recovery() can
+ * continue to reshape, don't restart reshape because data can
+ * be corrupted for raid456.
+ */
+ clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+ }
+
+ mddev_unlock(mddev);
+ sysfs_notify_dirent_safe(mddev->sysfs_degraded);
+ return 0;
+}
+
static ssize_t
action_store(struct mddev *mddev, const char *page, size_t len)
{
@@ -5089,32 +5128,10 @@ action_store(struct mddev *mddev, const char *page, size_t len)
clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
set_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
} else if (cmd_match(page, "reshape")) {
- int err;
- if (mddev->pers->start_reshape == NULL)
- return -EINVAL;
- err = mddev_lock(mddev);
- if (!err) {
- if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
- err = -EBUSY;
- } else if (mddev->reshape_position == MaxSector ||
- mddev->pers->check_reshape == NULL ||
- mddev->pers->check_reshape(mddev)) {
- clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
- err = mddev->pers->start_reshape(mddev);
- } else {
- /*
- * If reshape is still in progress, and
- * md_check_recovery() can continue to reshape,
- * don't restart reshape because data can be
- * corrupted for raid456.
- */
- clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
- }
- mddev_unlock(mddev);
- }
+ int err = mddev_start_reshape(mddev);
+
if (err)
return err;
- sysfs_notify_dirent_safe(mddev->sysfs_degraded);
} else {
if (cmd_match(page, "check"))
set_bit(MD_RECOVERY_CHECK, &mddev->recovery);
--
2.39.2


2024-06-03 13:01:22

by Yu Kuai

[permalink] [raw]
Subject: [PATCH 02/12] md: add a new enum type sync_action

In order to make code related to sync_thread cleaner in following
patches, also add detail comment about each sync action. And also
prepare to remove the related recovery_flags in the fulture.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md.h | 57 ++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 56 insertions(+), 1 deletion(-)

diff --git a/drivers/md/md.h b/drivers/md/md.h
index 170412a65b63..6b9d9246f260 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -34,6 +34,61 @@
*/
#define MD_FAILFAST (REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT)

+/* Status of sync thread. */
+enum sync_action {
+ /*
+ * Represent by MD_RECOVERY_SYNC, start when:
+ * 1) after assemble, sync data from first rdev to other copies, this
+ * must be done first before other sync actions and will only execute
+ * once;
+ * 2) resize the array(notice that this is not reshape), sync data for
+ * the new range;
+ */
+ ACTION_RESYNC,
+ /*
+ * Represent by MD_RECOVERY_RECOVER, start when:
+ * 1) for new replacement, sync data based on the replace rdev or
+ * available copies from other rdev;
+ * 2) for new member disk while the array is degraded, sync data from
+ * other rdev;
+ * 3) reassemble after power failure or re-add a hot removed rdev, sync
+ * data from first rdev to other copies based on bitmap;
+ */
+ ACTION_RECOVER,
+ /*
+ * Represent by MD_RECOVERY_SYNC | MD_RECOVERY_REQUESTED |
+ * MD_RECOVERY_CHECK, start when user echo "check" to sysfs api
+ * sync_action, used to check if data copies from differenct rdev are
+ * the same. The number of mismatch sectors will be exported to user
+ * by sysfs api mismatch_cnt;
+ */
+ ACTION_CHECK,
+ /*
+ * Represent by MD_RECOVERY_SYNC | MD_RECOVERY_REQUESTED, start when
+ * user echo "repair" to sysfs api sync_action, usually paired with
+ * ACTION_CHECK, used to force syncing data once user found that there
+ * are inconsistent data,
+ */
+ ACTION_REPAIR,
+ /*
+ * Represent by MD_RECOVERY_RESHAPE, start when new member disk is added
+ * to the conf, notice that this is different from spares or
+ * replacement;
+ */
+ ACTION_RESHAPE,
+ /*
+ * Represent by MD_RECOVERY_FROZEN, can be set by sysfs api sync_action
+ * or internal usage like setting the array read-only, will forbid above
+ * actions.
+ */
+ ACTION_FROZEN,
+ /*
+ * All above actions don't match.
+ */
+ ACTION_IDLE,
+ NR_SYNC_ACTIONS,
+};
+
/*
* The struct embedded in rdev is used to serialize IO.
*/
@@ -571,7 +626,7 @@ enum recovery_flags {
/* interrupted because io-error */
MD_RECOVERY_ERROR,

- /* flags determines sync action */
+ /* flags determines sync action, see details in enum sync_action */

/* if just this flag is set, action is resync. */
MD_RECOVERY_SYNC,
--
2.39.2


2024-06-03 13:01:51

by Yu Kuai

[permalink] [raw]
Subject: [PATCH 07/12] md: don't fail action_store() if sync_thread is not registered

MD_RECOVERY_RUNNING will always be set when trying to register a new
sync_thread, however, if md_start_sync() turns out to do nothing,
MD_RECOVERY_RUNNING will be cleared in this case. And during the race
window, action_store() will return -EBUSY, which will cause some
mdadm tests to fail. For example:

The test 07reshape5intr will add a new disk to array, then start
reshape:

mdadm /dev/md0 --add /dev/xxx
mdadm --grow /dev/md0 -n 3

And add_bound_rdev() from mdadm --add will set MD_RECOVERY_NEEDED,
then during the race windown, mdadm --grow will fail.

Fix the problem by waiting in action_store() during the race window,
fail only if sync_thread is registered.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md.c | 85 +++++++++++++++++++------------------------------
drivers/md/md.h | 2 --
2 files changed, 33 insertions(+), 54 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 5e3c3c109412..3890ae86449a 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -752,7 +752,6 @@ int mddev_init(struct mddev *mddev)

mutex_init(&mddev->open_mutex);
mutex_init(&mddev->reconfig_mutex);
- mutex_init(&mddev->sync_mutex);
mutex_init(&mddev->suspend_mutex);
mutex_init(&mddev->bitmap_info.mutex);
INIT_LIST_HEAD(&mddev->disks);
@@ -5020,34 +5019,6 @@ void md_unfrozen_sync_thread(struct mddev *mddev)
}
EXPORT_SYMBOL_GPL(md_unfrozen_sync_thread);

-static void idle_sync_thread(struct mddev *mddev)
-{
- mutex_lock(&mddev->sync_mutex);
- clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-
- if (mddev_lock(mddev)) {
- mutex_unlock(&mddev->sync_mutex);
- return;
- }
-
- stop_sync_thread(mddev, false);
- mutex_unlock(&mddev->sync_mutex);
-}
-
-static void frozen_sync_thread(struct mddev *mddev)
-{
- mutex_lock(&mddev->sync_mutex);
- set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-
- if (mddev_lock(mddev)) {
- mutex_unlock(&mddev->sync_mutex);
- return;
- }
-
- stop_sync_thread(mddev, false);
- mutex_unlock(&mddev->sync_mutex);
-}
-
static int mddev_start_reshape(struct mddev *mddev)
{
int ret;
@@ -5055,24 +5026,13 @@ static int mddev_start_reshape(struct mddev *mddev)
if (mddev->pers->start_reshape == NULL)
return -EINVAL;

- ret = mddev_lock(mddev);
- if (ret)
- return ret;
-
- if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
- mddev_unlock(mddev);
- return -EBUSY;
- }
-
if (mddev->reshape_position == MaxSector ||
mddev->pers->check_reshape == NULL ||
mddev->pers->check_reshape(mddev)) {
clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
ret = mddev->pers->start_reshape(mddev);
- if (ret) {
- mddev_unlock(mddev);
+ if (ret)
return ret;
- }
} else {
/*
* If reshape is still in progress, and md_check_recovery() can
@@ -5082,7 +5042,6 @@ static int mddev_start_reshape(struct mddev *mddev)
clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
}

- mddev_unlock(mddev);
sysfs_notify_dirent_safe(mddev->sysfs_degraded);
return 0;
}
@@ -5096,36 +5055,53 @@ action_store(struct mddev *mddev, const char *page, size_t len)
if (!mddev->pers || !mddev->pers->sync_request)
return -EINVAL;

+retry:
+ if (work_busy(&mddev->sync_work))
+ flush_work(&mddev->sync_work);
+
+ ret = mddev_lock(mddev);
+ if (ret)
+ return ret;
+
+ if (work_busy(&mddev->sync_work)) {
+ mddev_unlock(mddev);
+ goto retry;
+ }
+
action = md_sync_action_by_name(page);

/* TODO: mdadm rely on "idle" to start sync_thread. */
if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
switch (action) {
case ACTION_FROZEN:
- frozen_sync_thread(mddev);
- return len;
+ md_frozen_sync_thread(mddev);
+ ret = len;
+ goto out;
case ACTION_IDLE:
- idle_sync_thread(mddev);
+ md_idle_sync_thread(mddev);
break;
case ACTION_RESHAPE:
case ACTION_RECOVER:
case ACTION_CHECK:
case ACTION_REPAIR:
case ACTION_RESYNC:
- return -EBUSY;
+ ret = -EBUSY;
+ goto out;
default:
- return -EINVAL;
+ ret = -EINVAL;
+ goto out;
}
} else {
switch (action) {
case ACTION_FROZEN:
set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
- return len;
+ ret = len;
+ goto out;
case ACTION_RESHAPE:
clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
ret = mddev_start_reshape(mddev);
if (ret)
- return ret;
+ goto out;
break;
case ACTION_RECOVER:
clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
@@ -5143,7 +5119,8 @@ action_store(struct mddev *mddev, const char *page, size_t len)
clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
break;
default:
- return -EINVAL;
+ ret = -EINVAL;
+ goto out;
}
}

@@ -5151,14 +5128,18 @@ action_store(struct mddev *mddev, const char *page, size_t len)
/* A write to sync_action is enough to justify
* canceling read-auto mode
*/
- flush_work(&mddev->sync_work);
mddev->ro = MD_RDWR;
md_wakeup_thread(mddev->sync_thread);
}
+
set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
md_wakeup_thread(mddev->thread);
sysfs_notify_dirent_safe(mddev->sysfs_action);
- return len;
+ ret = len;
+
+out:
+ mddev_unlock(mddev);
+ return ret;
}

static struct md_sysfs_entry md_scan_mode =
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 018f3292a25c..f7afc5a46031 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -595,8 +595,6 @@ struct mddev {
*/
struct list_head deleting;

- /* Used to synchronize idle and frozen for action_store() */
- struct mutex sync_mutex;
/* The sequence number for sync thread */
atomic_t sync_seq;

--
2.39.2


2024-06-03 13:02:47

by Yu Kuai

[permalink] [raw]
Subject: [PATCH 06/12] md: remove parameter check_seq for stop_sync_thread()

Caller will always set MD_RECOVERY_FROZEN if check_seq is true, and
always clear MD_RECOVERY_FROZEN if check_seq is false, hence replace
the parameter with test_bit() to make code cleaner.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md.c | 26 +++++++++++---------------
1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 8a54b56e3463..5e3c3c109412 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4960,15 +4960,10 @@ action_show(struct mddev *mddev, char *page)
* @locked: if set, reconfig_mutex will still be held after this function
* return; if not set, reconfig_mutex will be released after this
* function return.
- * @check_seq: if set, only wait for curent running sync_thread to stop, noted
- * that new sync_thread can still start.
*/
-static void stop_sync_thread(struct mddev *mddev, bool locked, bool check_seq)
+static void stop_sync_thread(struct mddev *mddev, bool locked)
{
- int sync_seq;
-
- if (check_seq)
- sync_seq = atomic_read(&mddev->sync_seq);
+ int sync_seq = atomic_read(&mddev->sync_seq);

if (!test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
if (!locked)
@@ -4989,7 +4984,8 @@ static void stop_sync_thread(struct mddev *mddev, bool locked, bool check_seq)

wait_event(resync_wait,
!test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) ||
- (check_seq && sync_seq != atomic_read(&mddev->sync_seq)));
+ (!test_bit(MD_RECOVERY_FROZEN, &mddev->recovery) &&
+ sync_seq != atomic_read(&mddev->sync_seq)));

if (locked)
mddev_lock_nointr(mddev);
@@ -5000,7 +4996,7 @@ void md_idle_sync_thread(struct mddev *mddev)
lockdep_assert_held(&mddev->reconfig_mutex);

clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
- stop_sync_thread(mddev, true, true);
+ stop_sync_thread(mddev, true);
}
EXPORT_SYMBOL_GPL(md_idle_sync_thread);

@@ -5009,7 +5005,7 @@ void md_frozen_sync_thread(struct mddev *mddev)
lockdep_assert_held(&mddev->reconfig_mutex);

set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
- stop_sync_thread(mddev, true, false);
+ stop_sync_thread(mddev, true);
}
EXPORT_SYMBOL_GPL(md_frozen_sync_thread);

@@ -5034,7 +5030,7 @@ static void idle_sync_thread(struct mddev *mddev)
return;
}

- stop_sync_thread(mddev, false, true);
+ stop_sync_thread(mddev, false);
mutex_unlock(&mddev->sync_mutex);
}

@@ -5048,7 +5044,7 @@ static void frozen_sync_thread(struct mddev *mddev)
return;
}

- stop_sync_thread(mddev, false, false);
+ stop_sync_thread(mddev, false);
mutex_unlock(&mddev->sync_mutex);
}

@@ -6543,7 +6539,7 @@ void md_stop_writes(struct mddev *mddev)
{
mddev_lock_nointr(mddev);
set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
- stop_sync_thread(mddev, true, false);
+ stop_sync_thread(mddev, true);
__md_stop_writes(mddev);
mddev_unlock(mddev);
}
@@ -6611,7 +6607,7 @@ static int md_set_readonly(struct mddev *mddev)
set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
}

- stop_sync_thread(mddev, false, false);
+ stop_sync_thread(mddev, false);
wait_event(mddev->sb_wait,
!test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags));
mddev_lock_nointr(mddev);
@@ -6657,7 +6653,7 @@ static int do_md_stop(struct mddev *mddev, int mode)
set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
}

- stop_sync_thread(mddev, true, false);
+ stop_sync_thread(mddev, true);

if (mddev->sysfs_active ||
test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
--
2.39.2


2024-06-03 13:03:25

by Yu Kuai

[permalink] [raw]
Subject: [PATCH 08/12] md: use new helers in md_do_sync()

Make code cleaner. and also use the action_name directly in kernel log:
- "check" instead of "data-check"
- "repair" instead of "requested-resync"

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md.c | 21 +++++----------------
drivers/md/md.h | 2 +-
2 files changed, 6 insertions(+), 17 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 3890ae86449a..e44016055b56 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8934,7 +8934,8 @@ void md_do_sync(struct md_thread *thread)
sector_t last_check;
int skipped = 0;
struct md_rdev *rdev;
- char *desc, *action = NULL;
+ enum sync_action action;
+ const char *desc;
struct blk_plug plug;
int ret;

@@ -8965,21 +8966,9 @@ void md_do_sync(struct md_thread *thread)
goto skip;
}

- if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
- if (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
- desc = "data-check";
- action = "check";
- } else if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery)) {
- desc = "requested-resync";
- action = "repair";
- } else
- desc = "resync";
- } else if (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery))
- desc = "reshape";
- else
- desc = "recovery";
-
- mddev->last_sync_action = action ?: desc;
+ action = md_sync_action(mddev);
+ desc = md_sync_action_name(action);
+ mddev->last_sync_action = desc;

/*
* Before starting a resync we must have set curr_resync to
diff --git a/drivers/md/md.h b/drivers/md/md.h
index f7afc5a46031..614011651f79 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -432,7 +432,7 @@ struct mddev {
* when the sync thread is "frozen" (interrupted) or "idle" (stopped
* or finished). It is overwritten when a new sync operation is begun.
*/
- char *last_sync_action;
+ const char *last_sync_action;
sector_t curr_resync; /* last block scheduled */
/* As resync requests can complete out of order, we cannot easily track
* how much resync has been completed. So we occasionally pause until
--
2.39.2


2024-06-03 13:05:02

by Yu Kuai

[permalink] [raw]
Subject: [PATCH 09/12] md: replace last_sync_action with new enum type

The only difference is that "none" is removed and initial
last_sync_action will be idle.

On the one hand, this value is introduced by commit c4a395514516
("MD: Remember the last sync operation that was performed"), and the
usage described in commit message is not affected. On the other hand,
last_sync_action is not used in mdadm or mdmon, and none of the tests
that I can find.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/dm-raid.c | 2 +-
drivers/md/md.c | 7 ++++---
drivers/md/md.h | 9 ++++-----
3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index abe88d1e6735..052c00c1eb15 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3542,7 +3542,7 @@ static void raid_status(struct dm_target *ti, status_type_t type,
recovery = rs->md.recovery;
state = decipher_sync_action(mddev, recovery);
progress = rs_get_progress(rs, recovery, state, resync_max_sectors);
- resync_mismatches = (mddev->last_sync_action && !strcasecmp(mddev->last_sync_action, "check")) ?
+ resync_mismatches = mddev->last_sync_action == ACTION_CHECK ?
atomic64_read(&mddev->resync_mismatches) : 0;

/* HM FIXME: do we want another state char for raid0? It shows 'D'/'A'/'-' now */
diff --git a/drivers/md/md.c b/drivers/md/md.c
index e44016055b56..28977595b7ba 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -767,7 +767,7 @@ int mddev_init(struct mddev *mddev)
init_waitqueue_head(&mddev->recovery_wait);
mddev->reshape_position = MaxSector;
mddev->reshape_backwards = 0;
- mddev->last_sync_action = "none";
+ mddev->last_sync_action = ACTION_IDLE;
mddev->resync_min = 0;
mddev->resync_max = MaxSector;
mddev->level = LEVEL_NONE;
@@ -5148,7 +5148,8 @@ __ATTR_PREALLOC(sync_action, S_IRUGO|S_IWUSR, action_show, action_store);
static ssize_t
last_sync_action_show(struct mddev *mddev, char *page)
{
- return sprintf(page, "%s\n", mddev->last_sync_action);
+ return sprintf(page, "%s\n",
+ md_sync_action_name(mddev->last_sync_action));
}

static struct md_sysfs_entry md_last_scan_mode = __ATTR_RO(last_sync_action);
@@ -8968,7 +8969,7 @@ void md_do_sync(struct md_thread *thread)

action = md_sync_action(mddev);
desc = md_sync_action_name(action);
- mddev->last_sync_action = desc;
+ mddev->last_sync_action = action;

/*
* Before starting a resync we must have set curr_resync to
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 614011651f79..296a78568fc4 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -426,13 +426,12 @@ struct mddev {
struct md_thread __rcu *thread; /* management thread */
struct md_thread __rcu *sync_thread; /* doing resync or reconstruct */

- /* 'last_sync_action' is initialized to "none". It is set when a
- * sync operation (i.e "data-check", "requested-resync", "resync",
- * "recovery", or "reshape") is started. It holds this value even
+ /*
+ * Set when a sync operation is started. It holds this value even
* when the sync thread is "frozen" (interrupted) or "idle" (stopped
- * or finished). It is overwritten when a new sync operation is begun.
+ * or finished). It is overwritten when a new sync operation is begun.
*/
- const char *last_sync_action;
+ enum sync_action last_sync_action;
sector_t curr_resync; /* last block scheduled */
/* As resync requests can complete out of order, we cannot easily track
* how much resync has been completed. So we occasionally pause until
--
2.39.2


2024-06-03 13:05:07

by Yu Kuai

[permalink] [raw]
Subject: [PATCH 12/12] md/raid5: avoid BUG_ON() while continue reshape after reassembling

Currently, mdadm support --revert-reshape to abort the reshape while
reassembling, as the test 07revert-grow. However, following BUG_ON()
can be triggerred by the test:

kernel BUG at drivers/md/raid5.c:6278!
invalid opcode: 0000 [#1] PREEMPT SMP PTI
irq event stamp: 158985
CPU: 6 PID: 891 Comm: md0_reshape Not tainted 6.9.0-03335-g7592a0b0049a #94
RIP: 0010:reshape_request+0x3f1/0xe60
Call Trace:
<TASK>
raid5_sync_request+0x43d/0x550
md_do_sync+0xb7a/0x2110
md_thread+0x294/0x2b0
kthread+0x147/0x1c0
ret_from_fork+0x59/0x70
ret_from_fork_asm+0x1a/0x30
</TASK>

Root cause is that --revert-reshape update the raid_disks from 5 to 4,
while reshape position is still set, and after reassembling the array,
reshape position will be read from super block, then during reshape the
checking of 'writepos' that is caculated by old reshape position will
fail.

Fix this panic the easy way first, by converting the BUG_ON() to
WARN_ON(), and stop the reshape if checkings fail.

Noted that mdadm must fix --revert-shape as well, and probably md/raid
should enhance metadata validation as well, however this means
reassemble will fail and there must be user tools to fix the wrong
metadata.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/raid5.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 69a083ca41a3..d4c30a94e4e4 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -6255,7 +6255,9 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk
safepos = conf->reshape_safe;
sector_div(safepos, data_disks);
if (mddev->reshape_backwards) {
- BUG_ON(writepos < reshape_sectors);
+ if (WARN_ON(writepos < reshape_sectors))
+ return MaxSector;
+
writepos -= reshape_sectors;
readpos += reshape_sectors;
safepos += reshape_sectors;
@@ -6273,14 +6275,18 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk
* to set 'stripe_addr' which is where we will write to.
*/
if (mddev->reshape_backwards) {
- BUG_ON(conf->reshape_progress == 0);
+ if (WARN_ON(conf->reshape_progress == 0))
+ return MaxSector;
+
stripe_addr = writepos;
- BUG_ON((mddev->dev_sectors &
- ~((sector_t)reshape_sectors - 1))
- - reshape_sectors - stripe_addr
- != sector_nr);
+ if (WARN_ON((mddev->dev_sectors &
+ ~((sector_t)reshape_sectors - 1)) -
+ reshape_sectors - stripe_addr != sector_nr))
+ return MaxSector;
} else {
- BUG_ON(writepos != sector_nr + reshape_sectors);
+ if (WARN_ON(writepos != sector_nr + reshape_sectors))
+ return MaxSector;
+
stripe_addr = sector_nr;
}

--
2.39.2


2024-06-03 13:06:06

by Yu Kuai

[permalink] [raw]
Subject: [PATCH 10/12] md: factor out helpers for different sync_action in md_do_sync()

Make code cleaner by replacing if else if with switch, and it's more
obvious now what is doing for each sync_action. There are no
functional changes.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md.c | 123 ++++++++++++++++++++++++++++--------------------
1 file changed, 73 insertions(+), 50 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 28977595b7ba..a1f9d9b69911 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8919,6 +8919,77 @@ void md_allow_write(struct mddev *mddev)
}
EXPORT_SYMBOL_GPL(md_allow_write);

+static sector_t md_sync_max_sectors(struct mddev *mddev,
+ enum sync_action action)
+{
+ switch (action) {
+ case ACTION_RESYNC:
+ case ACTION_CHECK:
+ case ACTION_REPAIR:
+ atomic64_set(&mddev->resync_mismatches, 0);
+ fallthrough;
+ case ACTION_RESHAPE:
+ return mddev->resync_max_sectors;
+ case ACTION_RECOVER:
+ return mddev->dev_sectors;
+ default:
+ return 0;
+ }
+}
+
+static sector_t md_sync_position(struct mddev *mddev, enum sync_action action)
+{
+ sector_t start = 0;
+ struct md_rdev *rdev;
+
+ switch (action) {
+ case ACTION_CHECK:
+ case ACTION_REPAIR:
+ return mddev->resync_min;
+ case ACTION_RESYNC:
+ if (!mddev->bitmap)
+ return mddev->recovery_cp;
+ return 0;
+ case ACTION_RESHAPE:
+ /*
+ * If the original node aborts reshaping then we continue the
+ * reshaping, so set again to avoid restart reshape from the
+ * first beginning
+ */
+ if (mddev_is_clustered(mddev) &&
+ mddev->reshape_position != MaxSector)
+ return mddev->reshape_position;
+ return 0;
+ case ACTION_RECOVER:
+ start = MaxSector;
+ rcu_read_lock();
+ rdev_for_each_rcu(rdev, mddev)
+ if (rdev->raid_disk >= 0 &&
+ !test_bit(Journal, &rdev->flags) &&
+ !test_bit(Faulty, &rdev->flags) &&
+ !test_bit(In_sync, &rdev->flags) &&
+ rdev->recovery_offset < start)
+ start = rdev->recovery_offset;
+ rcu_read_unlock();
+
+ /* If there is a bitmap, we need to make sure all
+ * writes that started before we added a spare
+ * complete before we start doing a recovery.
+ * Otherwise the write might complete and (via
+ * bitmap_endwrite) set a bit in the bitmap after the
+ * recovery has checked that bit and skipped that
+ * region.
+ */
+ if (mddev->bitmap) {
+ mddev->pers->quiesce(mddev, 1);
+ mddev->pers->quiesce(mddev, 0);
+ }
+ return start;
+ default:
+ return MaxSector;
+ }
+}
+
#define SYNC_MARKS 10
#define SYNC_MARK_STEP (3*HZ)
#define UPDATE_FREQUENCY (5*60*HZ)
@@ -9037,56 +9108,8 @@ void md_do_sync(struct md_thread *thread)
spin_unlock(&all_mddevs_lock);
} while (mddev->curr_resync < MD_RESYNC_DELAYED);

- j = 0;
- if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
- /* resync follows the size requested by the personality,
- * which defaults to physical size, but can be virtual size
- */
- max_sectors = mddev->resync_max_sectors;
- atomic64_set(&mddev->resync_mismatches, 0);
- /* we don't use the checkpoint if there's a bitmap */
- if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery))
- j = mddev->resync_min;
- else if (!mddev->bitmap)
- j = mddev->recovery_cp;
-
- } else if (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery)) {
- max_sectors = mddev->resync_max_sectors;
- /*
- * If the original node aborts reshaping then we continue the
- * reshaping, so set j again to avoid restart reshape from the
- * first beginning
- */
- if (mddev_is_clustered(mddev) &&
- mddev->reshape_position != MaxSector)
- j = mddev->reshape_position;
- } else {
- /* recovery follows the physical size of devices */
- max_sectors = mddev->dev_sectors;
- j = MaxSector;
- rcu_read_lock();
- rdev_for_each_rcu(rdev, mddev)
- if (rdev->raid_disk >= 0 &&
- !test_bit(Journal, &rdev->flags) &&
- !test_bit(Faulty, &rdev->flags) &&
- !test_bit(In_sync, &rdev->flags) &&
- rdev->recovery_offset < j)
- j = rdev->recovery_offset;
- rcu_read_unlock();
-
- /* If there is a bitmap, we need to make sure all
- * writes that started before we added a spare
- * complete before we start doing a recovery.
- * Otherwise the write might complete and (via
- * bitmap_endwrite) set a bit in the bitmap after the
- * recovery has checked that bit and skipped that
- * region.
- */
- if (mddev->bitmap) {
- mddev->pers->quiesce(mddev, 1);
- mddev->pers->quiesce(mddev, 0);
- }
- }
+ max_sectors = md_sync_max_sectors(mddev, action);
+ j = md_sync_position(mddev, action);

pr_info("md: %s of RAID array %s\n", desc, mdname(mddev));
pr_debug("md: minimum _guaranteed_ speed: %d KB/sec/disk.\n", speed_min(mddev));
--
2.39.2


2024-06-03 13:06:53

by Yu Kuai

[permalink] [raw]
Subject: [PATCH 11/12] md: pass in max_sectors for pers->sync_request()

For different sync_action, sync_thread will use different max_sectors,
see details in md_sync_max_sectors(), currently both md_do_sync() and
pers->sync_request() in eatch iteration have to get the same
max_sectors. Hence pass in max_sectors for pers->sync_request() to
prevent redundant code.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md.c | 5 +++--
drivers/md/md.h | 3 ++-
drivers/md/raid1.c | 5 ++---
drivers/md/raid10.c | 8 ++------
drivers/md/raid5.c | 3 +--
5 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index a1f9d9b69911..e1555971ebe8 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9191,7 +9191,8 @@ void md_do_sync(struct md_thread *thread)
if (test_bit(MD_RECOVERY_INTR, &mddev->recovery))
break;

- sectors = mddev->pers->sync_request(mddev, j, &skipped);
+ sectors = mddev->pers->sync_request(mddev, j, max_sectors,
+ &skipped);
if (sectors == 0) {
set_bit(MD_RECOVERY_INTR, &mddev->recovery);
break;
@@ -9281,7 +9282,7 @@ void md_do_sync(struct md_thread *thread)
mddev->curr_resync_completed = mddev->curr_resync;
sysfs_notify_dirent_safe(mddev->sysfs_completed);
}
- mddev->pers->sync_request(mddev, max_sectors, &skipped);
+ mddev->pers->sync_request(mddev, max_sectors, max_sectors, &skipped);

if (!test_bit(MD_RECOVERY_CHECK, &mddev->recovery) &&
mddev->curr_resync > MD_RESYNC_ACTIVE) {
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 296a78568fc4..3458ac5ce9c1 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -729,7 +729,8 @@ struct md_personality
int (*hot_add_disk) (struct mddev *mddev, struct md_rdev *rdev);
int (*hot_remove_disk) (struct mddev *mddev, struct md_rdev *rdev);
int (*spare_active) (struct mddev *mddev);
- sector_t (*sync_request)(struct mddev *mddev, sector_t sector_nr, int *skipped);
+ sector_t (*sync_request)(struct mddev *mddev, sector_t sector_nr,
+ sector_t max_sector, int *skipped);
int (*resize) (struct mddev *mddev, sector_t sectors);
sector_t (*size) (struct mddev *mddev, sector_t sectors, int raid_disks);
int (*check_reshape) (struct mddev *mddev);
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 7b8a71ca66dd..387f9a53e35f 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2757,12 +2757,12 @@ static struct r1bio *raid1_alloc_init_r1buf(struct r1conf *conf)
*/

static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
- int *skipped)
+ sector_t max_sector, int *skipped)
{
struct r1conf *conf = mddev->private;
struct r1bio *r1_bio;
struct bio *bio;
- sector_t max_sector, nr_sectors;
+ sector_t nr_sectors;
int disk = -1;
int i;
int wonly = -1;
@@ -2778,7 +2778,6 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
if (init_resync(conf))
return 0;

- max_sector = mddev->dev_sectors;
if (sector_nr >= max_sector) {
/* If we aborted, we need to abort the
* sync on the 'current' bitmap chunk (there will
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index a4556d2e46bf..af54ac30a667 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3140,12 +3140,12 @@ static void raid10_set_cluster_sync_high(struct r10conf *conf)
*/

static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
- int *skipped)
+ sector_t max_sector, int *skipped)
{
struct r10conf *conf = mddev->private;
struct r10bio *r10_bio;
struct bio *biolist = NULL, *bio;
- sector_t max_sector, nr_sectors;
+ sector_t nr_sectors;
int i;
int max_sync;
sector_t sync_blocks;
@@ -3175,10 +3175,6 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
return 0;

skipped:
- max_sector = mddev->dev_sectors;
- if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ||
- test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery))
- max_sector = mddev->resync_max_sectors;
if (sector_nr >= max_sector) {
conf->cluster_sync_low = 0;
conf->cluster_sync_high = 0;
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 2bd1ce9b3922..69a083ca41a3 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -6458,11 +6458,10 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk
}

static inline sector_t raid5_sync_request(struct mddev *mddev, sector_t sector_nr,
- int *skipped)
+ sector_t max_sector, int *skipped)
{
struct r5conf *conf = mddev->private;
struct stripe_head *sh;
- sector_t max_sector = mddev->dev_sectors;
sector_t sync_blocks;
int still_degraded = 0;
int i;
--
2.39.2


2024-06-03 13:15:33

by Yu Kuai

[permalink] [raw]
Subject: [PATCH 05/12] md: replace sysfs api sync_action with new helpers

To get rid of extrem long if else if usage, and make code cleaner.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md.c | 94 +++++++++++++++++++++++++++----------------------
1 file changed, 52 insertions(+), 42 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 28f50a106566..8a54b56e3463 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4949,27 +4949,9 @@ const char *md_sync_action_name(enum sync_action action)
static ssize_t
action_show(struct mddev *mddev, char *page)
{
- char *type = "idle";
- unsigned long recovery = mddev->recovery;
- if (test_bit(MD_RECOVERY_FROZEN, &recovery))
- type = "frozen";
- else if (test_bit(MD_RECOVERY_RUNNING, &recovery) ||
- (md_is_rdwr(mddev) && test_bit(MD_RECOVERY_NEEDED, &recovery))) {
- if (test_bit(MD_RECOVERY_RESHAPE, &recovery))
- type = "reshape";
- else if (test_bit(MD_RECOVERY_SYNC, &recovery)) {
- if (!test_bit(MD_RECOVERY_REQUESTED, &recovery))
- type = "resync";
- else if (test_bit(MD_RECOVERY_CHECK, &recovery))
- type = "check";
- else
- type = "repair";
- } else if (test_bit(MD_RECOVERY_RECOVER, &recovery))
- type = "recover";
- else if (mddev->reshape_position != MaxSector)
- type = "reshape";
- }
- return sprintf(page, "%s\n", type);
+ enum sync_action action = md_sync_action(mddev);
+
+ return sprintf(page, "%s\n", md_sync_action_name(action));
}

/**
@@ -5112,35 +5094,63 @@ static int mddev_start_reshape(struct mddev *mddev)
static ssize_t
action_store(struct mddev *mddev, const char *page, size_t len)
{
+ int ret;
+ enum sync_action action;
+
if (!mddev->pers || !mddev->pers->sync_request)
return -EINVAL;

+ action = md_sync_action_by_name(page);

- if (cmd_match(page, "idle"))
- idle_sync_thread(mddev);
- else if (cmd_match(page, "frozen"))
- frozen_sync_thread(mddev);
- else if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
- return -EBUSY;
- else if (cmd_match(page, "resync"))
- clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
- else if (cmd_match(page, "recover")) {
- clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
- set_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
- } else if (cmd_match(page, "reshape")) {
- int err = mddev_start_reshape(mddev);
-
- if (err)
- return err;
+ /* TODO: mdadm rely on "idle" to start sync_thread. */
+ if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
+ switch (action) {
+ case ACTION_FROZEN:
+ frozen_sync_thread(mddev);
+ return len;
+ case ACTION_IDLE:
+ idle_sync_thread(mddev);
+ break;
+ case ACTION_RESHAPE:
+ case ACTION_RECOVER:
+ case ACTION_CHECK:
+ case ACTION_REPAIR:
+ case ACTION_RESYNC:
+ return -EBUSY;
+ default:
+ return -EINVAL;
+ }
} else {
- if (cmd_match(page, "check"))
+ switch (action) {
+ case ACTION_FROZEN:
+ set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+ return len;
+ case ACTION_RESHAPE:
+ clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+ ret = mddev_start_reshape(mddev);
+ if (ret)
+ return ret;
+ break;
+ case ACTION_RECOVER:
+ clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+ set_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
+ break;
+ case ACTION_CHECK:
set_bit(MD_RECOVERY_CHECK, &mddev->recovery);
- else if (!cmd_match(page, "repair"))
+ fallthrough;
+ case ACTION_REPAIR:
+ set_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
+ set_bit(MD_RECOVERY_SYNC, &mddev->recovery);
+ fallthrough;
+ case ACTION_RESYNC:
+ case ACTION_IDLE:
+ clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+ break;
+ default:
return -EINVAL;
- clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
- set_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
- set_bit(MD_RECOVERY_SYNC, &mddev->recovery);
+ }
}
+
if (mddev->ro == MD_AUTO_READ) {
/* A write to sync_action is enough to justify
* canceling read-auto mode
--
2.39.2


2024-06-11 13:04:57

by Mariusz Tkaczyk

[permalink] [raw]
Subject: Re: [PATCH 02/12] md: add a new enum type sync_action

On Mon, 3 Jun 2024 20:58:05 +0800
Yu Kuai <[email protected]> wrote:

> In order to make code related to sync_thread cleaner in following
> patches, also add detail comment about each sync action. And also
> prepare to remove the related recovery_flags in the fulture.
>
> Signed-off-by: Yu Kuai <[email protected]>
> ---
> drivers/md/md.h | 57 ++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 56 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index 170412a65b63..6b9d9246f260 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -34,6 +34,61 @@
> */
> #define MD_FAILFAST (REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT)
>
> +/* Status of sync thread. */
> +enum sync_action {
> + /*
> + * Represent by MD_RECOVERY_SYNC, start when:
> + * 1) after assemble, sync data from first rdev to other copies, this
> + * must be done first before other sync actions and will only execute
> + * once;
> + * 2) resize the array(notice that this is not reshape), sync data
> for
> + * the new range;
> + */
> + ACTION_RESYNC,
> + /*
> + * Represent by MD_RECOVERY_RECOVER, start when:
> + * 1) for new replacement, sync data based on the replace rdev or
> + * available copies from other rdev;
> + * 2) for new member disk while the array is degraded, sync data from
> + * other rdev;
> + * 3) reassemble after power failure or re-add a hot removed rdev,
> sync
> + * data from first rdev to other copies based on bitmap;
> + */
> + ACTION_RECOVER,
> + /*
> + * Represent by MD_RECOVERY_SYNC | MD_RECOVERY_REQUESTED |
> + * MD_RECOVERY_CHECK, start when user echo "check" to sysfs api
> + * sync_action, used to check if data copies from differenct rdev are
> + * the same. The number of mismatch sectors will be exported to user
> + * by sysfs api mismatch_cnt;
> + */
> + ACTION_CHECK,
> + /*
> + * Represent by MD_RECOVERY_SYNC | MD_RECOVERY_REQUESTED, start when
> + * user echo "repair" to sysfs api sync_action, usually paired with
> + * ACTION_CHECK, used to force syncing data once user found that
> there
> + * are inconsistent data,
> + */
> + ACTION_REPAIR,
> + /*
> + * Represent by MD_RECOVERY_RESHAPE, start when new member disk is
> added
> + * to the conf, notice that this is different from spares or
> + * replacement;
> + */
> + ACTION_RESHAPE,
> + /*
> + * Represent by MD_RECOVERY_FROZEN, can be set by sysfs api
> sync_action
> + * or internal usage like setting the array read-only, will forbid
> above
> + * actions.
> + */
> + ACTION_FROZEN,
> + /*
> + * All above actions don't match.
> + */
> + ACTION_IDLE,
> + NR_SYNC_ACTIONS,
> +};

I like if counter is keep in same style as rest enum values, like ACTION_COUNT.

Anyway LGTM.

Mariusz

2024-06-11 13:33:52

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH 02/12] md: add a new enum type sync_action

Hi,

?? 2024/06/11 16:31, Mariusz Tkaczyk ะด??:
> On Mon, 3 Jun 2024 20:58:05 +0800
> Yu Kuai <[email protected]> wrote:
>
>> In order to make code related to sync_thread cleaner in following
>> patches, also add detail comment about each sync action. And also
>> prepare to remove the related recovery_flags in the fulture.
>>
>> Signed-off-by: Yu Kuai <[email protected]>
>> ---
>> drivers/md/md.h | 57 ++++++++++++++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 56 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/md/md.h b/drivers/md/md.h
>> index 170412a65b63..6b9d9246f260 100644
>> --- a/drivers/md/md.h
>> +++ b/drivers/md/md.h
>> @@ -34,6 +34,61 @@
>> */
>> #define MD_FAILFAST (REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT)
>>
>> +/* Status of sync thread. */
>> +enum sync_action {
>> + /*
>> + * Represent by MD_RECOVERY_SYNC, start when:
>> + * 1) after assemble, sync data from first rdev to other copies, this
>> + * must be done first before other sync actions and will only execute
>> + * once;
>> + * 2) resize the array(notice that this is not reshape), sync data
>> for
>> + * the new range;
>> + */
>> + ACTION_RESYNC,
>> + /*
>> + * Represent by MD_RECOVERY_RECOVER, start when:
>> + * 1) for new replacement, sync data based on the replace rdev or
>> + * available copies from other rdev;
>> + * 2) for new member disk while the array is degraded, sync data from
>> + * other rdev;
>> + * 3) reassemble after power failure or re-add a hot removed rdev,
>> sync
>> + * data from first rdev to other copies based on bitmap;
>> + */
>> + ACTION_RECOVER,
>> + /*
>> + * Represent by MD_RECOVERY_SYNC | MD_RECOVERY_REQUESTED |
>> + * MD_RECOVERY_CHECK, start when user echo "check" to sysfs api
>> + * sync_action, used to check if data copies from differenct rdev are
>> + * the same. The number of mismatch sectors will be exported to user
>> + * by sysfs api mismatch_cnt;
>> + */
>> + ACTION_CHECK,
>> + /*
>> + * Represent by MD_RECOVERY_SYNC | MD_RECOVERY_REQUESTED, start when
>> + * user echo "repair" to sysfs api sync_action, usually paired with
>> + * ACTION_CHECK, used to force syncing data once user found that
>> there
>> + * are inconsistent data,
>> + */
>> + ACTION_REPAIR,
>> + /*
>> + * Represent by MD_RECOVERY_RESHAPE, start when new member disk is
>> added
>> + * to the conf, notice that this is different from spares or
>> + * replacement;
>> + */
>> + ACTION_RESHAPE,
>> + /*
>> + * Represent by MD_RECOVERY_FROZEN, can be set by sysfs api
>> sync_action
>> + * or internal usage like setting the array read-only, will forbid
>> above
>> + * actions.
>> + */
>> + ACTION_FROZEN,
>> + /*
>> + * All above actions don't match.
>> + */
>> + ACTION_IDLE,
>> + NR_SYNC_ACTIONS,
>> +};
>
> I like if counter is keep in same style as rest enum values, like ACTION_COUNT.

Thanks for the review, however, I didn't find this style "xxx_COUNT" in
md code, AFAIK, "NR_xxx" style is used more, for example:

enum stat_group {
STAT_READ,
STAT_WRITE,
STAT_DISCARD,
STAT_FLUSH,

NR_STAT_GROUPS
};

Just to let you know, I'll keep this style in the next version. :)

Thanks,
Kuai
>
> Anyway LGTM.
>
> Mariusz
> .
>