Hey,
Please find two patches with fixes to the raid5 code.
The first patch fixes a bug in my recent commit that causes data
corruption in very specific circumstances.
The second patch fixes a theoretical issue with nested waits in
code that was recently cleaned up in the previous series (though the
issue always existed).
Thanks,
Logan
--
Logan Gunthorpe (2):
md/raid5: Fix sectors_to_do bitmap overflow in raid5_make_request()
md/raid5: Convert prepare_to_wait() to wait_woken() api
drivers/md/raid5.c | 32 +++++++++++++++++---------------
1 file changed, 17 insertions(+), 15 deletions(-)
base-commit: ff4ec5f79108cf82fe7168547c76fe754c4ade0a
--
2.30.2
For unaligned IO with nearly the maximum number of sectors, the number
of stripes will end up being one greater than the size of the bitmap.
When this happens, the last stripe in the IO will not be processed as it
should be, resulting in data corruption.
However, this is not normally seen when the backing block devices have
4K physical block sizes, since the block layer will split the request
before that happens.
To fix this, increase the bitmap size by one bit and ensure the full
number of stripes is checked when calling find_first_bit().
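The arithmetic behind the overflow can be sketched in plain userspace C.
This is only an illustration, not the kernel code: MAX_REQ_STRIPES = 256
and STRIPE_SECTORS = 8 are assumed values chosen for the example, and
stripe_count() and fits_in_bitmap() are hypothetical helpers that mirror
the round-down of the start sector and the DIV_ROUND_UP of the length
seen in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only; the kernel derives its own. */
#define MAX_REQ_STRIPES 256u
#define STRIPE_SECTORS  8u

/*
 * Hypothetical helper: number of stripes touched by a request that
 * starts at sector `start` and covers `nr_sectors` sectors. The start
 * is rounded down to a stripe boundary and the span is rounded up,
 * so an unaligned request can touch one extra stripe.
 */
static unsigned int stripe_count(uint64_t start, uint64_t nr_sectors)
{
	uint64_t first, last;

	assert(nr_sectors > 0);

	first = start & ~((uint64_t)STRIPE_SECTORS - 1);
	last = start + nr_sectors; /* one past the final sector */

	return (unsigned int)((last - first + STRIPE_SECTORS - 1) /
			      STRIPE_SECTORS);
}

/* Hypothetical helper: does a request need no more than `bits` bits? */
static bool fits_in_bitmap(uint64_t start, uint64_t nr_sectors,
			   unsigned int bits)
{
	return stripe_count(start, nr_sectors) <= bits;
}
```

Under these assumptions, an aligned request of 2048 sectors starting at
sector 0 touches exactly 256 stripes, while a request of 2042 sectors
starting at sector 7 touches 257. The second case fits only when the
bitmap has MAX_REQ_STRIPES + 1 bits.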
Reported-by: David Sloan <[email protected]>
Fixes: a5b9c6a653fb ("md/raid5: Pivot raid5_make_request()")
Signed-off-by: Logan Gunthorpe <[email protected]>
---
drivers/md/raid5.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 184145b49b7c..e37ed93d130f 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -5872,8 +5872,11 @@ struct stripe_request_ctx {
/* last sector in the request */
sector_t last_sector;
- /* bitmap to track stripe sectors that have been added to stripes */
- DECLARE_BITMAP(sectors_to_do, RAID5_MAX_REQ_STRIPES);
+ /*
+ * bitmap to track stripe sectors that have been added to stripes
+ * add one to account for unaligned requests
+ */
+ DECLARE_BITMAP(sectors_to_do, RAID5_MAX_REQ_STRIPES + 1);
/* the request had REQ_PREFLUSH, cleared after the first stripe_head */
bool do_flush;
@@ -6046,7 +6049,7 @@ static bool raid5_make_request(struct mddev *mddev, struct bio * bi)
const int rw = bio_data_dir(bi);
enum stripe_result res;
DEFINE_WAIT(w);
- int s;
+ int s, stripe_cnt;
if (unlikely(bi->bi_opf & REQ_PREFLUSH)) {
int ret = log_handle_flush_request(conf, bi);
@@ -6090,9 +6093,9 @@ static bool raid5_make_request(struct mddev *mddev, struct bio * bi)
ctx.last_sector = bio_end_sector(bi);
bi->bi_next = NULL;
- bitmap_set(ctx.sectors_to_do, 0,
- DIV_ROUND_UP_SECTOR_T(ctx.last_sector - logical_sector,
- RAID5_STRIPE_SECTORS(conf)));
+ stripe_cnt = DIV_ROUND_UP_SECTOR_T(ctx.last_sector - logical_sector,
+ RAID5_STRIPE_SECTORS(conf));
+ bitmap_set(ctx.sectors_to_do, 0, stripe_cnt);
pr_debug("raid456: %s, logical %llu to %llu\n", __func__,
bi->bi_iter.bi_sector, ctx.last_sector);
@@ -6137,8 +6140,8 @@ static bool raid5_make_request(struct mddev *mddev, struct bio * bi)
continue;
}
- s = find_first_bit(ctx.sectors_to_do, RAID5_MAX_REQ_STRIPES);
- if (s == RAID5_MAX_REQ_STRIPES)
+ s = find_first_bit(ctx.sectors_to_do, stripe_cnt);
+ if (s == stripe_cnt)
break;
logical_sector = ctx.first_sector +
--
2.30.2
On Thu, Jul 7, 2022 at 12:15 PM Logan Gunthorpe <[email protected]> wrote:
>
> Hey,
>
> Please find two patches with fixes to the raid5 code.
>
> The first patch fixes a bug in my recent commit that causes data
> corruption in very specific circumstances.
>
> The second patch fixes a theoretical issue with nested waits in
> code that was recently cleaned up in the previous series (though the
> issue always existed).
>
> Thanks,
Applied to md-next after fixing a couple typos.
Thanks!
Song