From: Li Nan <[email protected]>
Li Nan (3):
md/raid10: fix null-ptr-deref of mreplace in raid10_sync_request
md/raid10: fix incorrect done of recovery
md/raid10: fix io loss while replacement replace rdev
drivers/md/raid10.c | 51 +++++++++++++++++++++++++++++++++------------
1 file changed, 38 insertions(+), 13 deletions(-)
--
2.31.1
From: Li Nan <[email protected]>
Recovery will go to giveup and let chunks_skipped++ in
raid10_sync_request() if there are some bad_blocks, and it will return
max_sector when chunks_skipped >= geo.raid_disks. Now, recovery fail and
data is inconsistent but user think recovery is done, it is wrong.
Fix it by set mirror's recovery_disabled and spare device shouln't be
added to here.
Signed-off-by: Li Nan <[email protected]>
---
drivers/md/raid10.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index e21502c03b45..70cc87c7ee57 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3303,6 +3303,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
int chunks_skipped = 0;
sector_t chunk_mask = conf->geo.chunk_mask;
int page_idx = 0;
+ int error_disk = -1;
/*
* Allow skipping a full rebuild for incremental assembly
@@ -3386,7 +3387,18 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
return reshape_request(mddev, sector_nr, skipped);
if (chunks_skipped >= conf->geo.raid_disks) {
- /* if there has been nothing to do on any drive,
+ pr_err("md/raid10:%s: %s fail\n", mdname(mddev),
+ test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ? "resync" : "recovery");
+ if (error_disk >= 0 && !test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
+ /*
+ * recovery fail, set mirrors.recovory_disabled,
+ * device shouldn't be added to there.
+ */
+ conf->mirrors[error_disk].recovery_disabled = mddev->recovery_disabled;
+ return 0;
+ }
+ /*
+ * if there has been nothing to do on any drive,
* then there is nothing to do at all..
*/
*skipped = 1;
@@ -3640,6 +3652,8 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
mdname(mddev));
mirror->recovery_disabled
= mddev->recovery_disabled;
+ } else {
+ error_disk = i;
}
put_buf(r10_bio);
if (rb2)
--
2.31.1
Hi,
?? 2023/05/22 19:54, [email protected] д??:
> From: Li Nan <[email protected]>
>
> Recovery will go to giveup and let chunks_skipped++ in
> raid10_sync_request() if there are some bad_blocks, and it will return
> max_sector when chunks_skipped >= geo.raid_disks. Now, recovery fail and
> data is inconsistent but user think recovery is done, it is wrong.
>
> Fix it by set mirror's recovery_disabled and spare device shouln't be
> added to here.
>
> Signed-off-by: Li Nan <[email protected]>
> ---
> drivers/md/raid10.c | 16 +++++++++++++++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index e21502c03b45..70cc87c7ee57 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -3303,6 +3303,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
> int chunks_skipped = 0;
> sector_t chunk_mask = conf->geo.chunk_mask;
> int page_idx = 0;
> + int error_disk = -1;
>
> /*
> * Allow skipping a full rebuild for incremental assembly
> @@ -3386,7 +3387,18 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
> return reshape_request(mddev, sector_nr, skipped);
>
> if (chunks_skipped >= conf->geo.raid_disks) {
> - /* if there has been nothing to do on any drive,
> + pr_err("md/raid10:%s: %s fail\n", mdname(mddev),
> + test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ? "resync" : "recovery");
Line exceed 80 columns, and following.
> + if (error_disk >= 0 && !test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
Resync has the same problem, right?
Thanks,
Kuai
> + /*
> + * recovery fail, set mirrors.recovory_disabled,
> + * device shouldn't be added to there.
> + */
> + conf->mirrors[error_disk].recovery_disabled = mddev->recovery_disabled;
> + return 0;
> + }
> + /*
> + * if there has been nothing to do on any drive,
> * then there is nothing to do at all..
> */
> *skipped = 1;
> @@ -3640,6 +3652,8 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
> mdname(mddev));
> mirror->recovery_disabled
> = mddev->recovery_disabled;
> + } else {
> + error_disk = i;
> }
> put_buf(r10_bio);
> if (rb2)
>
On Mon, May 22, 2023 at 6:54 AM Yu Kuai <[email protected]> wrote:
>
> Hi,
>
> 在 2023/05/22 19:54, [email protected] 写道:
> > From: Li Nan <[email protected]>
> >
> > Recovery will go to giveup and let chunks_skipped++ in
> > raid10_sync_request() if there are some bad_blocks, and it will return
> > max_sector when chunks_skipped >= geo.raid_disks. Now, recovery fail and
> > data is inconsistent but user think recovery is done, it is wrong.
> >
> > Fix it by set mirror's recovery_disabled and spare device shouln't be
> > added to here.
> >
> > Signed-off-by: Li Nan <[email protected]>
> > ---
> > drivers/md/raid10.c | 16 +++++++++++++++-
> > 1 file changed, 15 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> > index e21502c03b45..70cc87c7ee57 100644
> > --- a/drivers/md/raid10.c
> > +++ b/drivers/md/raid10.c
> > @@ -3303,6 +3303,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
> > int chunks_skipped = 0;
> > sector_t chunk_mask = conf->geo.chunk_mask;
> > int page_idx = 0;
> > + int error_disk = -1;
> >
> > /*
> > * Allow skipping a full rebuild for incremental assembly
> > @@ -3386,7 +3387,18 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
> > return reshape_request(mddev, sector_nr, skipped);
> >
> > if (chunks_skipped >= conf->geo.raid_disks) {
> > - /* if there has been nothing to do on any drive,
> > + pr_err("md/raid10:%s: %s fail\n", mdname(mddev),
> > + test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ? "resync" : "recovery");
>
> Line exceed 80 columns, and following.
If it makes the code look better, such as in this case, it is OK to have
lines longer than 80 columns.
Thanks,
Song
> > + if (error_disk >= 0 && !test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
>
On Mon, May 22, 2023 at 4:56 AM <[email protected]> wrote:
>
> From: Li Nan <[email protected]>
>
> Li Nan (3):
> md/raid10: fix null-ptr-deref of mreplace in raid10_sync_request
> md/raid10: fix incorrect done of recovery
> md/raid10: fix io loss while replacement replace rdev
Please address feedback and send v2.
Thanks,
Song
在 2023/5/24 2:44, Song Liu 写道:
> On Mon, May 22, 2023 at 4:56 AM <[email protected]> wrote:
>>
>> From: Li Nan <[email protected]>
>>
>> Li Nan (3):
>> md/raid10: fix null-ptr-deref of mreplace in raid10_sync_request
>> md/raid10: fix incorrect done of recovery
>> md/raid10: fix io loss while replacement replace rdev
>
> Please address feedback and send v2.
>
> Thanks,
> Song
> .
Thanks for review, I will send v2 later.
--
Thanks,
Nan
在 2023/5/22 21:54, Yu Kuai 写道:
> Hi,
>
> 在 2023/05/22 19:54, [email protected] 写道:
>> From: Li Nan <[email protected]>
>>
>> Recovery will go to giveup and let chunks_skipped++ in
>> raid10_sync_request() if there are some bad_blocks, and it will return
>> max_sector when chunks_skipped >= geo.raid_disks. Now, recovery fail and
>> data is inconsistent but user think recovery is done, it is wrong.
>>
>> Fix it by set mirror's recovery_disabled and spare device shouln't be
>> added to here.
>>
>> Signed-off-by: Li Nan <[email protected]>
>> ---
>> drivers/md/raid10.c | 16 +++++++++++++++-
>> 1 file changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>> index e21502c03b45..70cc87c7ee57 100644
>> --- a/drivers/md/raid10.c
>> +++ b/drivers/md/raid10.c
>> @@ -3303,6 +3303,7 @@ static sector_t raid10_sync_request(struct mddev
>> *mddev, sector_t sector_nr,
>> int chunks_skipped = 0;
>> sector_t chunk_mask = conf->geo.chunk_mask;
>> int page_idx = 0;
>> + int error_disk = -1;
>> /*
>> * Allow skipping a full rebuild for incremental assembly
>> @@ -3386,7 +3387,18 @@ static sector_t raid10_sync_request(struct
>> mddev *mddev, sector_t sector_nr,
>> return reshape_request(mddev, sector_nr, skipped);
>> if (chunks_skipped >= conf->geo.raid_disks) {
>> - /* if there has been nothing to do on any drive,
>> + pr_err("md/raid10:%s: %s fail\n", mdname(mddev),
>> + test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ? "resync"
>> : "recovery");
>
> Line exceed 80 columns, and following.
>> + if (error_disk >= 0 && !test_bit(MD_RECOVERY_SYNC,
>> &mddev->recovery)) {
>
> Resync has the same problem, right?
>
Yes. But I have no idea to fix it. md_error disk nor set
recovery_disabled is a good solution. So, just print error message now.
Do you have any ideas?
--
Thanks,
Nan
Hi,
在 2023/05/25 22:00, Li Nan 写道:
>
>
> 在 2023/5/22 21:54, Yu Kuai 写道:
>> Hi,
>>
>> 在 2023/05/22 19:54, [email protected] 写道:
>>> From: Li Nan <[email protected]>
>>>
>>> Recovery will go to giveup and let chunks_skipped++ in
>>> raid10_sync_request() if there are some bad_blocks, and it will return
>>> max_sector when chunks_skipped >= geo.raid_disks. Now, recovery fail and
>>> data is inconsistent but user think recovery is done, it is wrong.
>>>
>>> Fix it by set mirror's recovery_disabled and spare device shouln't be
>>> added to here.
>>>
>>> Signed-off-by: Li Nan <[email protected]>
>>> ---
>>> drivers/md/raid10.c | 16 +++++++++++++++-
>>> 1 file changed, 15 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>>> index e21502c03b45..70cc87c7ee57 100644
>>> --- a/drivers/md/raid10.c
>>> +++ b/drivers/md/raid10.c
>>> @@ -3303,6 +3303,7 @@ static sector_t raid10_sync_request(struct
>>> mddev *mddev, sector_t sector_nr,
>>> int chunks_skipped = 0;
>>> sector_t chunk_mask = conf->geo.chunk_mask;
>>> int page_idx = 0;
>>> + int error_disk = -1;
>>> /*
>>> * Allow skipping a full rebuild for incremental assembly
>>> @@ -3386,7 +3387,18 @@ static sector_t raid10_sync_request(struct
>>> mddev *mddev, sector_t sector_nr,
>>> return reshape_request(mddev, sector_nr, skipped);
>>> if (chunks_skipped >= conf->geo.raid_disks) {
>>> - /* if there has been nothing to do on any drive,
>>> + pr_err("md/raid10:%s: %s fail\n", mdname(mddev),
>>> + test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ? "resync"
>>> : "recovery");
>>
>> Line exceed 80 columns, and following.
>>> + if (error_disk >= 0 && !test_bit(MD_RECOVERY_SYNC,
>>> &mddev->recovery)) {
>>
>> Resync has the same problem, right?
>>
>
> Yes. But I have no idea to fix it. md_error disk nor set
> recovery_disabled is a good solution. So, just print error message now.
> Do you have any ideas?
I'll look into this, in the meadtime, I don't suggest to apply this
patch because this is just temporary solution that only fix half of
the problem.
Thanks,
Kuai