2023-06-28 02:17:32

by Li Nan

Subject: [PATCH 0/3] raid10 bugfix

From: Li Nan <[email protected]>

Li Nan (3):
  md/raid10: check replacement and rdev to prevent submit the same io
    twice
  md/raid10: factor out get_rdev_repl_from_mirror()
  md/raid10: use get_rdev_repl_from_mirror() to get devices

drivers/md/raid10.c | 43 ++++++++++++++++++++++++++-----------------
1 file changed, 26 insertions(+), 17 deletions(-)

--
2.39.2



2023-06-28 02:18:50

by Li Nan

Subject: [PATCH 2/3] md/raid10: factor out get_rdev_repl_from_mirror()

From: Li Nan <[email protected]>

Factor out a helper to get 'rdev' and 'replacement' from conf->mirrors.
This makes the code cleaner and prepares for fixing the io loss bug that
occurs while 'replacement' replaces 'rdev'.

There is no functional change.

Signed-off-by: Li Nan <[email protected]>
---
drivers/md/raid10.c | 30 +++++++++++++++++++++---------
1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 3e6a09aaaba6..eaaf6307ddda 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1346,6 +1346,26 @@ static void raid10_write_one_disk(struct mddev *mddev, struct r10bio *r10_bio,
 	}
 }

+static void get_rdev_repl_from_mirror(struct raid10_info *mirror,
+				      struct md_rdev **prdev,
+				      struct md_rdev **prrdev)
+{
+	struct md_rdev *rdev, *rrdev;
+
+	rrdev = rcu_dereference(mirror->replacement);
+	/*
+	 * Read replacement first to prevent reading both rdev and
+	 * replacement as NULL during replacement replace rdev.
+	 */
+	smp_mb();
+	rdev = rcu_dereference(mirror->rdev);
+	if (rdev == rrdev)
+		rrdev = NULL;
+
+	*prrdev = rrdev;
+	*prdev = rdev;
+}
+
 static void wait_blocked_dev(struct mddev *mddev, struct r10bio *r10_bio)
 {
 	int i;
@@ -1489,15 +1509,7 @@ static void raid10_write_request(struct mddev *mddev, struct bio *bio,
 		int d = r10_bio->devs[i].devnum;
 		struct md_rdev *rdev, *rrdev;

-		rrdev = rcu_dereference(conf->mirrors[d].replacement);
-		/*
-		 * Read replacement first to prevent reading both rdev and
-		 * replacement as NULL during replacement replace rdev.
-		 */
-		smp_mb();
-		rdev = rcu_dereference(conf->mirrors[d].rdev);
-		if (rdev == rrdev)
-			rrdev = NULL;
+		get_rdev_repl_from_mirror(&conf->mirrors[d], &rdev, &rrdev);
 		if (rdev && (test_bit(Faulty, &rdev->flags)))
 			rdev = NULL;
 		if (rrdev && (test_bit(Faulty, &rrdev->flags)))
--
2.39.2
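
The read order in the helper can be illustrated with a small userspace model. This is only a sketch and not part of the patch: it assumes the writer side moves through the four slot states listed in the comment (inferred from the "both rdev and replacement as NULL" comment in the helper), and it assumes reads and writes are observed in program order, which is what the barrier is for. `struct dev`, `struct slot`, and `can_see_both_null()` are hypothetical names used only here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct md_rdev; only pointer identity matters. */
struct dev { int id; };

/* One snapshot of a mirror slot (models rdev/replacement in raid10_info). */
struct slot { struct dev *rdev; struct dev *replacement; };

enum { NSTATES = 4 };

/*
 * Assumed writer sequence while the replacement takes over the slot:
 *   0: rdev = A,    replacement = B     (before the old rdev is removed)
 *   1: rdev = NULL, replacement = B     (old rdev cleared)
 *   2: rdev = B,    replacement = B     (both briefly identical)
 *   3: rdev = B,    replacement = NULL  (takeover finished)
 */
static void build_states(struct slot s[NSTATES], struct dev *a, struct dev *b)
{
	s[0] = (struct slot){ a, b };
	s[1] = (struct slot){ NULL, b };
	s[2] = (struct slot){ b, b };
	s[3] = (struct slot){ b, NULL };
}

/*
 * Try every pair of read times (first read at t1, second at t2 >= t1).
 * Returns true if some interleaving lets the reader observe both
 * pointers as NULL, which would mean the io is sent to no device.
 */
static bool can_see_both_null(bool replacement_first)
{
	struct dev a = { 1 }, b = { 2 };
	struct slot s[NSTATES];

	build_states(s, &a, &b);

	for (int t1 = 0; t1 < NSTATES; t1++) {
		for (int t2 = t1; t2 < NSTATES; t2++) {
			struct dev *rdev, *rrdev;

			if (replacement_first) {
				rrdev = s[t1].replacement;
				rdev = s[t2].rdev;
			} else {
				rdev = s[t1].rdev;
				rrdev = s[t2].replacement;
			}
			if (!rdev && !rrdev)
				return true;
		}
	}
	return false;
}
```

Under these assumptions, reading rdev first can observe NULL in state 1 and then a NULL replacement in state 3, while reading replacement first never sees both as NULL. It can see both pointing at the same device (state 2), which is why the helper also clears rrdev when the two are equal.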


2023-06-28 02:19:04

by Li Nan

Subject: [PATCH 1/3] md/raid10: check replacement and rdev to prevent submit the same io twice

From: Li Nan <[email protected]>

After commit 4ca40c2ce099 ("md/raid10: Allow replacement device to be
replace old drive."), 'rdev' and 'replacement' can appear to be
identical. wait_blocked_dev() and raid10_write_request() already check
for that. Add the same check to raid10_handle_discard().

Signed-off-by: Li Nan <[email protected]>
---
drivers/md/raid10.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index fabc340aae4f..3e6a09aaaba6 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1811,6 +1811,8 @@ static int raid10_handle_discard(struct mddev *mddev, struct bio *bio)
 		r10_bio->devs[disk].bio = NULL;
 		r10_bio->devs[disk].repl_bio = NULL;

+		if (rdev == rrdev)
+			rrdev = NULL;
 		if (rdev && (test_bit(Faulty, &rdev->flags)))
 			rdev = NULL;
 		if (rrdev && (test_bit(Faulty, &rrdev->flags)))
--
2.39.2
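
The effect of the added check can be shown with a simplified userspace sketch. It is not the kernel code: `struct dev` and `discard_targets()` are hypothetical, and the real raid10_handle_discard() does more per slot than count targets. The sketch only models the filtering shown in the hunk above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct md_rdev with only a Faulty flag. */
struct dev { bool faulty; };

/*
 * Number of devices a discard would be issued to for one mirror slot.
 * With dedup == false this models the code before the patch; with
 * dedup == true it models the code after the patch.
 */
static int discard_targets(struct dev *rdev, struct dev *rrdev, bool dedup)
{
	/* The check added by the patch: identical pointers count once. */
	if (dedup && rdev == rrdev)
		rrdev = NULL;
	if (rdev && rdev->faulty)
		rdev = NULL;
	if (rrdev && rrdev->faulty)
		rrdev = NULL;

	return (rdev != NULL) + (rrdev != NULL);
}
```

When both pointers refer to the same healthy device, the model without the check counts two targets (the same io submitted twice) and the model with the check counts one.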


2023-06-28 02:21:12

by Li Nan

Subject: [PATCH 3/3] md/raid10: use get_rdev_repl_from_mirror() to get devices

From: Li Nan <[email protected]>

Commit 2ae6aaf76912 ("md/raid10: fix io loss while replacement replace
rdev") reads replacement first to prevent io loss. However, the same
issue also exists in wait_blocked_dev() and raid10_handle_discard(). Fix
it by using get_rdev_repl_from_mirror() to get the devices.

Fixes: d30588b2731f ("md/raid10: improve raid10 discard request")
Fixes: f2e7e269a752 ("md/raid10: pull the code that wait for blocked dev into one function")
Signed-off-by: Li Nan <[email protected]>
---
drivers/md/raid10.c | 15 +++++----------
1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index eaaf6307ddda..2d55374d8b22 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1376,11 +1376,9 @@ static void wait_blocked_dev(struct mddev *mddev, struct r10bio *r10_bio)
 	blocked_rdev = NULL;
 	rcu_read_lock();
 	for (i = 0; i < conf->copies; i++) {
-		struct md_rdev *rdev = rcu_dereference(conf->mirrors[i].rdev);
-		struct md_rdev *rrdev = rcu_dereference(
-			conf->mirrors[i].replacement);
-		if (rdev == rrdev)
-			rrdev = NULL;
+		struct md_rdev *rdev, *rrdev;
+
+		get_rdev_repl_from_mirror(&conf->mirrors[i], &rdev, &rrdev);
 		if (rdev && unlikely(test_bit(Blocked, &rdev->flags))) {
 			atomic_inc(&rdev->nr_pending);
 			blocked_rdev = rdev;
@@ -1816,15 +1814,12 @@ static int raid10_handle_discard(struct mddev *mddev, struct bio *bio)
 	 */
 	rcu_read_lock();
 	for (disk = 0; disk < geo->raid_disks; disk++) {
-		struct md_rdev *rdev = rcu_dereference(conf->mirrors[disk].rdev);
-		struct md_rdev *rrdev = rcu_dereference(
-			conf->mirrors[disk].replacement);
+		struct md_rdev *rdev, *rrdev;

+		get_rdev_repl_from_mirror(&conf->mirrors[disk], &rdev, &rrdev);
 		r10_bio->devs[disk].bio = NULL;
 		r10_bio->devs[disk].repl_bio = NULL;

-		if (rdev == rrdev)
-			rrdev = NULL;
 		if (rdev && (test_bit(Faulty, &rdev->flags)))
 			rdev = NULL;
 		if (rrdev && (test_bit(Faulty, &rrdev->flags)))
--
2.39.2


2023-06-28 08:32:47

by Yu Kuai

Subject: Re: [PATCH 1/3] md/raid10: check replacement and rdev to prevent submit the same io twice

Hi,

On 2023/06/28 9:57, [email protected] wrote:
> From: Li Nan <[email protected]>
>
> After commit 4ca40c2ce099 ("md/raid10: Allow replacement device to be
> replace old drive."), 'rdev' and 'replacement' could appear to be
> identical. There are already checks for that in wait_blocked_dev() and
> raid10_write_request(). Add check for raid10_handle_discard() now.
>

I'm working on synchronizing io with array configuration changes, so
that these checks will no longer be necessary and can be removed.
However, this might take some time; until then, this patch LGTM:

Reviewed-by: Yu Kuai <[email protected]>

> Signed-off-by: Li Nan <[email protected]>
> ---
> drivers/md/raid10.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index fabc340aae4f..3e6a09aaaba6 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -1811,6 +1811,8 @@ static int raid10_handle_discard(struct mddev *mddev, struct bio *bio)
> r10_bio->devs[disk].bio = NULL;
> r10_bio->devs[disk].repl_bio = NULL;
>
> + if (rdev == rrdev)
> + rrdev = NULL;
> if (rdev && (test_bit(Faulty, &rdev->flags)))
> rdev = NULL;
> if (rrdev && (test_bit(Faulty, &rrdev->flags)))
>


2023-06-28 10:06:08

by Yu Kuai

Subject: Re: [PATCH 3/3] md/raid10: use get_rdev_repl_from_mirror() to get devices

On 2023/06/28 9:57, [email protected] wrote:
> From: Li Nan <[email protected]>
>
> Commit 2ae6aaf76912 ("md/raid10: fix io loss while replacement replace
> rdev") reads replacement first to prevent io loss. However, there are same
> issue in wait_blocked_dev() and raid10_handle_discard(), too. Fix it by
> using get_rdev_repl_from_mirror() to get devices.

LGTM

Reviewed-by: Yu Kuai <[email protected]>
>
> Fixes: d30588b2731f ("md/raid10: improve raid10 discard request")
> Fixes: f2e7e269a752 ("md/raid10: pull the code that wait for blocked dev into one function")
> Signed-off-by: Li Nan <[email protected]>
> ---
> drivers/md/raid10.c | 15 +++++----------
> 1 file changed, 5 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index eaaf6307ddda..2d55374d8b22 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -1376,11 +1376,9 @@ static void wait_blocked_dev(struct mddev *mddev, struct r10bio *r10_bio)
> blocked_rdev = NULL;
> rcu_read_lock();
> for (i = 0; i < conf->copies; i++) {
> - struct md_rdev *rdev = rcu_dereference(conf->mirrors[i].rdev);
> - struct md_rdev *rrdev = rcu_dereference(
> - conf->mirrors[i].replacement);
> - if (rdev == rrdev)
> - rrdev = NULL;
> + struct md_rdev *rdev, *rrdev;
> +
> + get_rdev_repl_from_mirror(&conf->mirrors[i], &rdev, &rrdev);
> if (rdev && unlikely(test_bit(Blocked, &rdev->flags))) {
> atomic_inc(&rdev->nr_pending);
> blocked_rdev = rdev;
> @@ -1816,15 +1814,12 @@ static int raid10_handle_discard(struct mddev *mddev, struct bio *bio)
> */
> rcu_read_lock();
> for (disk = 0; disk < geo->raid_disks; disk++) {
> - struct md_rdev *rdev = rcu_dereference(conf->mirrors[disk].rdev);
> - struct md_rdev *rrdev = rcu_dereference(
> - conf->mirrors[disk].replacement);
> + struct md_rdev *rdev, *rrdev;
>
> + get_rdev_repl_from_mirror(&conf->mirrors[disk], &rdev, &rrdev);
> r10_bio->devs[disk].bio = NULL;
> r10_bio->devs[disk].repl_bio = NULL;
>
> - if (rdev == rrdev)
> - rrdev = NULL;
> if (rdev && (test_bit(Faulty, &rdev->flags)))
> rdev = NULL;
> if (rrdev && (test_bit(Faulty, &rrdev->flags)))
>


2023-06-28 10:08:28

by Yu Kuai

Subject: Re: [PATCH 2/3] md/raid10: factor out get_rdev_repl_from_mirror()

Hi,

On 2023/06/28 9:57, [email protected] wrote:
> From: Li Nan <[email protected]>
>
> Factor out a helper to get 'rdev' and 'replacement' from config->mirrors.
> Just to make code cleaner and prepare to fix the bug of io loss while
> 'replacement' replace 'rdev'.
>
> There is no functional change.
>
> Signed-off-by: Li Nan <[email protected]>
> ---
> drivers/md/raid10.c | 30 +++++++++++++++++++++---------
> 1 file changed, 21 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 3e6a09aaaba6..eaaf6307ddda 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -1346,6 +1346,26 @@ static void raid10_write_one_disk(struct mddev *mddev, struct r10bio *r10_bio,
> }
> }
>
> +static void get_rdev_repl_from_mirror(struct raid10_info *mirror,
> + struct md_rdev **prdev,
> + struct md_rdev **prrdev)

I don't like this name, but I can live with it for now; the related code
will be removed eventually.

Reviewed-by: Yu Kuai <[email protected]>
> +{
> + struct md_rdev *rdev, *rrdev;
> +
> + rrdev = rcu_dereference(mirror->replacement);
> + /*
> + * Read replacement first to prevent reading both rdev and
> + * replacement as NULL during replacement replace rdev.
> + */
> + smp_mb();
> + rdev = rcu_dereference(mirror->rdev);
> + if (rdev == rrdev)
> + rrdev = NULL;
> +
> + *prrdev = rrdev;
> + *prdev = rdev;
> +}
> +
> static void wait_blocked_dev(struct mddev *mddev, struct r10bio *r10_bio)
> {
> int i;
> @@ -1489,15 +1509,7 @@ static void raid10_write_request(struct mddev *mddev, struct bio *bio,
> int d = r10_bio->devs[i].devnum;
> struct md_rdev *rdev, *rrdev;
>
> - rrdev = rcu_dereference(conf->mirrors[d].replacement);
> - /*
> - * Read replacement first to prevent reading both rdev and
> - * replacement as NULL during replacement replace rdev.
> - */
> - smp_mb();
> - rdev = rcu_dereference(conf->mirrors[d].rdev);
> - if (rdev == rrdev)
> - rrdev = NULL;
> + get_rdev_repl_from_mirror(&conf->mirrors[d], &rdev, &rrdev);
> if (rdev && (test_bit(Faulty, &rdev->flags)))
> rdev = NULL;
> if (rrdev && (test_bit(Faulty, &rrdev->flags)))
>


2023-07-01 00:22:42

by Song Liu

Subject: Re: [PATCH 2/3] md/raid10: factor out get_rdev_repl_from_mirror()

On Tue, Jun 27, 2023 at 6:58 PM <[email protected]> wrote:
>
[...]

>
> +static void get_rdev_repl_from_mirror(struct raid10_info *mirror,
> + struct md_rdev **prdev,
> + struct md_rdev **prrdev)
> +{
> + struct md_rdev *rdev, *rrdev;
> +
> + rrdev = rcu_dereference(mirror->replacement);
> + /*
> + * Read replacement first to prevent reading both rdev and
> + * replacement as NULL during replacement replace rdev.
> + */
> + smp_mb();
> + rdev = rcu_dereference(mirror->rdev);
> + if (rdev == rrdev)
> + rrdev = NULL;
> +
> + *prrdev = rrdev;
> + *prdev = rdev;

I don't think the reduction in duplicated code justifies two output arguments.

How about

static struct md_rdev *dereference_rdev_and_rrdev(struct raid10_info *mirror,
                                                  struct md_rdev **prrdev)
{
        ...
        *prrdev = xxx;
        return rdev;
}

So we only have one argument for output.

Also, "from_mirror" in the function name doesn't really add more value.

Thanks,
Song
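
For illustration, the shape suggested above could look roughly like the following userspace sketch. It is an assumption-laden model, not the eventual kernel code: `struct md_rdev` and `struct raid10_info` are reduced to hypothetical minimal stand-ins, and sequentially consistent C11 atomic loads stand in for rcu_dereference() plus smp_mb(). The body simply carries over the logic of get_rdev_repl_from_mirror() from the patch.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical minimal stand-ins for the kernel structures. */
struct md_rdev { int id; };

struct raid10_info {
	_Atomic(struct md_rdev *) rdev;
	_Atomic(struct md_rdev *) replacement;
};

/*
 * Return rdev and store the replacement in *prrdev, so that only one
 * output argument is needed. Replacement is read first, as in the patch.
 */
static struct md_rdev *dereference_rdev_and_rrdev(struct raid10_info *mirror,
						  struct md_rdev **prrdev)
{
	struct md_rdev *rdev, *rrdev;

	/* seq_cst loads model rcu_dereference() with smp_mb() in between. */
	rrdev = atomic_load(&mirror->replacement);
	rdev = atomic_load(&mirror->rdev);
	if (rdev == rrdev)
		rrdev = NULL;

	*prrdev = rrdev;
	return rdev;
}
```

A caller would then write `rdev = dereference_rdev_and_rrdev(&conf->mirrors[d], &rrdev);`, keeping the primary device as the return value and the replacement as the single output argument.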

2023-07-01 01:46:48

by Li Nan

Subject: Re: [PATCH 2/3] md/raid10: factor out get_rdev_repl_from_mirror()



On 2023/7/1 7:53, Song Liu wrote:
> On Tue, Jun 27, 2023 at 6:58 PM <[email protected]> wrote:
>>
> [...]
>
>>
>> +static void get_rdev_repl_from_mirror(struct raid10_info *mirror,
>> + struct md_rdev **prdev,
>> + struct md_rdev **prrdev)
>> +{
>> + struct md_rdev *rdev, *rrdev;
>> +
>> + rrdev = rcu_dereference(mirror->replacement);
>> + /*
>> + * Read replacement first to prevent reading both rdev and
>> + * replacement as NULL during replacement replace rdev.
>> + */
>> + smp_mb();
>> + rdev = rcu_dereference(mirror->rdev);
>> + if (rdev == rrdev)
>> + rrdev = NULL;
>> +
>> + *prrdev = rrdev;
>> + *prdev = rdev;
>
> I don't think the reduction in duplicated code justifies two output arguments.
>
> How about
>
> static struct md_rdev *dereference_rdev_and_rrdev(struct raid10_info *mirror,
> struct md_rdev **prrdev)
> {
> ...
> *prrdev = xxx;
> return rdev;
> }
>
> So we only have one argument for output.
>
> Also, "from_mirror" in the function name doesn't really add more value.
>
> Thanks,
> Song
> .

I agree. Let me improve this.

--
Thanks,
Nan