2023-10-07 03:24:48

by Yu Kuai

Subject: [PATCH -next] md/raid1: don't split discard io for write behind

From: Yu Kuai <[email protected]>

Currently, discard io is treated the same as normal write io, and in the
write behind case, io size is limited to:

BIO_MAX_VECS * (PAGE_SIZE >> 9)

For a 0.5KB sector size and a 4KB PAGE_SIZE, this is just 1MB. As a
consequence, if 'WriteMostly' is set on one of the underlying disks,
then discard io will be split into 1MB chunks and it will take a long
time for the discard to finish.
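
To make the arithmetic above concrete, here is a small userspace sketch.
It assumes BIO_MAX_VECS is 256 (the value in recent kernels, not stated
in this patch) and uses the 4KB PAGE_SIZE from the text. The names
bio_max_vecs, page_size, write_behind_max_sectors() and split_count()
are made up for illustration and are not kernel symbols.

```c
/* Assumed values: 256 matches BIO_MAX_VECS in recent kernels, and
 * 4096 is the 4KB PAGE_SIZE used in the commit message. */
static const unsigned long long bio_max_vecs = 256;
static const unsigned long long page_size = 4096;
static const unsigned int sector_shift = 9;	/* 512-byte sectors */

/* Largest write behind bio, in sectors: BIO_MAX_VECS * (PAGE_SIZE >> 9). */
unsigned long long write_behind_max_sectors(void)
{
	return bio_max_vecs * (page_size >> sector_shift);
}

/* Number of bios needed to cover `bytes` when each one is capped at
 * the write behind limit (rounded up). */
unsigned long long split_count(unsigned long long bytes)
{
	unsigned long long cap = write_behind_max_sectors() << sector_shift;

	return (bytes + cap - 1) / cap;
}
```

Under these assumptions the limit is 2048 sectors, i.e. 1MB, so a
discard covering 1TB would be cut into roughly a million separate bios.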

Fix this problem by disabling write behind for discard io.
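
The intended behaviour can be modelled in userspace roughly as follows.
This is only a sketch of the condition, not the kernel code: enum
model_op, struct model_rdev and wants_write_behind() are hypothetical
stand-ins for the bio op, struct md_rdev and the loop in
raid1_write_request().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the request type and the member disk. */
enum model_op { MODEL_OP_WRITE, MODEL_OP_DISCARD };

struct model_rdev {
	bool present;		/* models a non-NULL rdev */
	bool write_mostly;	/* models the WriteMostly flag */
};

/* Returns true if the request should take the write behind path.
 * A discard never does, whatever flags the member disks carry. */
bool wants_write_behind(enum model_op op, const struct model_rdev *rdevs,
			int nr_disks)
{
	bool is_discard = (op == MODEL_OP_DISCARD);
	bool write_behind = false;

	for (int i = 0; i < nr_disks; i++) {
		if (!is_discard && rdevs[i].present && rdevs[i].write_mostly)
			write_behind = true;
	}
	return write_behind;
}
```

With one write-mostly member, a normal write still selects write behind
in this model, while a discard does not and so is not subject to the
1MB limit.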

Reported-by: Roman Mamedov <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/
Reported-and-tested-by: Kirill Kirilenko <[email protected]>
Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/raid1.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 3a78f79ee6d5..35d12948e0a9 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1345,6 +1345,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
 	int first_clone;
 	int max_sectors;
 	bool write_behind = false;
+	bool is_discard = (bio_op(bio) == REQ_OP_DISCARD);

 	if (mddev_is_clustered(mddev) &&
 	     md_cluster_ops->area_resyncing(mddev, WRITE,
@@ -1405,7 +1406,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
 		 * write-mostly, which means we could allocate write behind
 		 * bio later.
 		 */
-		if (rdev && test_bit(WriteMostly, &rdev->flags))
+		if (!is_discard && rdev && test_bit(WriteMostly, &rdev->flags))
 			write_behind = true;

 		if (rdev && unlikely(test_bit(Blocked, &rdev->flags))) {
--
2.39.2


2023-10-09 23:33:44

by Song Liu

Subject: Re: [PATCH -next] md/raid1: don't split discard io for write behind

On Fri, Oct 6, 2023 at 8:24 PM Yu Kuai <[email protected]> wrote:
>
> From: Yu Kuai <[email protected]>
>
> Currently, discard io is treated the same as normal write io, and in the
> write behind case, io size is limited to:
>
> BIO_MAX_VECS * (PAGE_SIZE >> 9)
>
> For a 0.5KB sector size and a 4KB PAGE_SIZE, this is just 1MB. As a
> consequence, if 'WriteMostly' is set on one of the underlying disks,
> then discard io will be split into 1MB chunks and it will take a long
> time for the discard to finish.
>
> Fix this problem by disabling write behind for discard io.
>
> Reported-by: Roman Mamedov <[email protected]>
> Closes: https://lore.kernel.org/all/[email protected]/
> Reported-and-tested-by: Kirill Kirilenko <[email protected]>
> Signed-off-by: Yu Kuai <[email protected]>

Applied to md-next. Thanks!

Song

> ---
> drivers/md/raid1.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 3a78f79ee6d5..35d12948e0a9 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1345,6 +1345,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
>  	int first_clone;
>  	int max_sectors;
>  	bool write_behind = false;
> +	bool is_discard = (bio_op(bio) == REQ_OP_DISCARD);
>
>  	if (mddev_is_clustered(mddev) &&
>  	     md_cluster_ops->area_resyncing(mddev, WRITE,
> @@ -1405,7 +1406,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
>  		 * write-mostly, which means we could allocate write behind
>  		 * bio later.
>  		 */
> -		if (rdev && test_bit(WriteMostly, &rdev->flags))
> +		if (!is_discard && rdev && test_bit(WriteMostly, &rdev->flags))
>  			write_behind = true;
>
>  		if (rdev && unlikely(test_bit(Blocked, &rdev->flags))) {
> --
> 2.39.2
>