2023-05-29 13:20:50

by Yu Kuai

[permalink] [raw]
Subject: [PATCH -next v3 0/7] limit the number of plugged bio

From: Yu Kuai <[email protected]>

Changes in v3:
- prefix function with 'raid1_' instead of 'md_'
- use a globle workequeue instead of per bitmap in patch 5
Changes in v2:
- remove the patch to rename raid1-10.c

Yu Kuai (7):
md/raid10: prevent soft lockup while flush writes
md/raid1-10: factor out a helper to add bio to plug
md/raid1-10: factor out a helper to submit normal write
md/raid1-10: submit write io directly if bitmap is not enabled
md/md-bitmap: add a new helper to unplug bitmap asynchrously
md/raid1-10: don't handle pluged bio by daemon thread
md/raid1-10: limit the number of plugged bio

drivers/md/md-bitmap.c | 33 ++++++++++++++++++++--
drivers/md/md-bitmap.h | 8 ++++++
drivers/md/md.c | 9 ++++++
drivers/md/md.h | 1 +
drivers/md/raid1-10.c | 63 ++++++++++++++++++++++++++++++++++++++++++
drivers/md/raid1.c | 29 ++++---------------
drivers/md/raid10.c | 47 +++++++------------------------
7 files changed, 126 insertions(+), 64 deletions(-)

--
2.39.2



2023-05-29 13:24:21

by Yu Kuai

[permalink] [raw]
Subject: [PATCH -next v3 3/7] md/raid1-10: factor out a helper to submit normal write

From: Yu Kuai <[email protected]>

There are multiple places to do the same thing, factor out a helper to
prevent redundant code, and the helper will be used in following patch
as well.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/raid1-10.c | 17 +++++++++++++++++
drivers/md/raid1.c | 13 ++-----------
drivers/md/raid10.c | 26 ++++----------------------
3 files changed, 23 insertions(+), 33 deletions(-)

diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c
index 9bf19a3409ce..506299bd55cb 100644
--- a/drivers/md/raid1-10.c
+++ b/drivers/md/raid1-10.c
@@ -110,6 +110,23 @@ static void md_bio_reset_resync_pages(struct bio *bio, struct resync_pages *rp,
} while (idx++ < RESYNC_PAGES && size > 0);
}

+
+static inline void raid1_submit_write(struct bio *bio)
+{
+ struct md_rdev *rdev = (struct md_rdev *)bio->bi_bdev;
+
+ bio->bi_next = NULL;
+ bio_set_dev(bio, rdev->bdev);
+ if (test_bit(Faulty, &rdev->flags))
+ bio_io_error(bio);
+ else if (unlikely(bio_op(bio) == REQ_OP_DISCARD &&
+ !bdev_max_discard_sectors(bio->bi_bdev)))
+ /* Just ignore it */
+ bio_endio(bio);
+ else
+ submit_bio_noacct(bio);
+}
+
static inline bool raid1_add_bio_to_plug(struct mddev *mddev, struct bio *bio,
blk_plug_cb_fn unplug)
{
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index e86c5e71c604..0778e398584c 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -799,17 +799,8 @@ static void flush_bio_list(struct r1conf *conf, struct bio *bio)

while (bio) { /* submit pending writes */
struct bio *next = bio->bi_next;
- struct md_rdev *rdev = (void *)bio->bi_bdev;
- bio->bi_next = NULL;
- bio_set_dev(bio, rdev->bdev);
- if (test_bit(Faulty, &rdev->flags)) {
- bio_io_error(bio);
- } else if (unlikely((bio_op(bio) == REQ_OP_DISCARD) &&
- !bdev_max_discard_sectors(bio->bi_bdev)))
- /* Just ignore it */
- bio_endio(bio);
- else
- submit_bio_noacct(bio);
+
+ raid1_submit_write(bio);
bio = next;
cond_resched();
}
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 18702051ebd1..6640507ecb0d 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -909,17 +909,8 @@ static void flush_pending_writes(struct r10conf *conf)

while (bio) { /* submit pending writes */
struct bio *next = bio->bi_next;
- struct md_rdev *rdev = (void*)bio->bi_bdev;
- bio->bi_next = NULL;
- bio_set_dev(bio, rdev->bdev);
- if (test_bit(Faulty, &rdev->flags)) {
- bio_io_error(bio);
- } else if (unlikely((bio_op(bio) == REQ_OP_DISCARD) &&
- !bdev_max_discard_sectors(bio->bi_bdev)))
- /* Just ignore it */
- bio_endio(bio);
- else
- submit_bio_noacct(bio);
+
+ raid1_submit_write(bio);
bio = next;
cond_resched();
}
@@ -1134,17 +1125,8 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule)

while (bio) { /* submit pending writes */
struct bio *next = bio->bi_next;
- struct md_rdev *rdev = (void*)bio->bi_bdev;
- bio->bi_next = NULL;
- bio_set_dev(bio, rdev->bdev);
- if (test_bit(Faulty, &rdev->flags)) {
- bio_io_error(bio);
- } else if (unlikely((bio_op(bio) == REQ_OP_DISCARD) &&
- !bdev_max_discard_sectors(bio->bi_bdev)))
- /* Just ignore it */
- bio_endio(bio);
- else
- submit_bio_noacct(bio);
+
+ raid1_submit_write(bio);
bio = next;
cond_resched();
}
--
2.39.2


2023-05-29 13:24:37

by Yu Kuai

[permalink] [raw]
Subject: [PATCH -next v3 1/7] md/raid10: prevent soft lockup while flush writes

From: Yu Kuai <[email protected]>

Currently, there is no limit for raid1/raid10 plugged bio. While flushing
writes, raid1 has cond_resched() while raid10 doesn't, and too many
writes can cause soft lockup.

Follow up soft lockup can be triggered easily with writeback test for
raid10 with ramdisks:

watchdog: BUG: soft lockup - CPU#10 stuck for 27s! [md0_raid10:1293]
Call Trace:
<TASK>
call_rcu+0x16/0x20
put_object+0x41/0x80
__delete_object+0x50/0x90
delete_object_full+0x2b/0x40
kmemleak_free+0x46/0xa0
slab_free_freelist_hook.constprop.0+0xed/0x1a0
kmem_cache_free+0xfd/0x300
mempool_free_slab+0x1f/0x30
mempool_free+0x3a/0x100
bio_free+0x59/0x80
bio_put+0xcf/0x2c0
free_r10bio+0xbf/0xf0
raid_end_bio_io+0x78/0xb0
one_write_done+0x8a/0xa0
raid10_end_write_request+0x1b4/0x430
bio_endio+0x175/0x320
brd_submit_bio+0x3b9/0x9b7 [brd]
__submit_bio+0x69/0xe0
submit_bio_noacct_nocheck+0x1e6/0x5a0
submit_bio_noacct+0x38c/0x7e0
flush_pending_writes+0xf0/0x240
raid10d+0xac/0x1ed0

Fix the problem by adding cond_resched() to raid10 like what raid1 did.

Note that unlimited plugged bio still need to be optimized, for example,
in the case of lots of dirty pages writeback, this will take lots of
memory and io will spend a long time in plug, hence io latency is bad.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/raid10.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 32fb4ff0acdb..6b31f848a6d9 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -921,6 +921,7 @@ static void flush_pending_writes(struct r10conf *conf)
else
submit_bio_noacct(bio);
bio = next;
+ cond_resched();
}
blk_finish_plug(&plug);
} else
@@ -1145,6 +1146,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule)
else
submit_bio_noacct(bio);
bio = next;
+ cond_resched();
}
kfree(plug);
}
--
2.39.2


2023-05-29 13:25:04

by Yu Kuai

[permalink] [raw]
Subject: [PATCH -next v3 4/7] md/raid1-10: submit write io directly if bitmap is not enabled

From: Yu Kuai <[email protected]>

Commit 6cce3b23f6f8 ("[PATCH] md: write intent bitmap support for raid10")
add bitmap support, and it changed that write io is submitted through
daemon thread because bitmap need to be updated before write io. And
later, plug is used to fix performance regression because all the write io
will go to demon thread, which means io can't be issued concurrently.

However, if bitmap is not enabled, the write io should not go to daemon
thread in the first place, and plug is not needed as well.

Fixes: 6cce3b23f6f8 ("[PATCH] md: write intent bitmap support for raid10")
Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md-bitmap.c | 4 +---
drivers/md/md-bitmap.h | 7 +++++++
drivers/md/raid1-10.c | 13 +++++++++++--
3 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index ad5a3456cd8a..3ee590cf12a7 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -1016,7 +1016,6 @@ static int md_bitmap_file_test_bit(struct bitmap *bitmap, sector_t block)
return set;
}

-
/* this gets called when the md device is ready to unplug its underlying
* (slave) device queues -- before we let any writes go down, we need to
* sync the dirty pages of the bitmap file to disk */
@@ -1026,8 +1025,7 @@ void md_bitmap_unplug(struct bitmap *bitmap)
int dirty, need_write;
int writing = 0;

- if (!bitmap || !bitmap->storage.filemap ||
- test_bit(BITMAP_STALE, &bitmap->flags))
+ if (!md_bitmap_enabled(bitmap))
return;

/* look at each page to see if there are any set bits that need to be
diff --git a/drivers/md/md-bitmap.h b/drivers/md/md-bitmap.h
index cfd7395de8fd..3a4750952b3a 100644
--- a/drivers/md/md-bitmap.h
+++ b/drivers/md/md-bitmap.h
@@ -273,6 +273,13 @@ int md_bitmap_copy_from_slot(struct mddev *mddev, int slot,
sector_t *lo, sector_t *hi, bool clear_bits);
void md_bitmap_free(struct bitmap *bitmap);
void md_bitmap_wait_behind_writes(struct mddev *mddev);
+
+static inline bool md_bitmap_enabled(struct bitmap *bitmap)
+{
+ return bitmap && bitmap->storage.filemap &&
+ !test_bit(BITMAP_STALE, &bitmap->flags);
+}
+
#endif

#endif
diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c
index 506299bd55cb..73cc3cb9154d 100644
--- a/drivers/md/raid1-10.c
+++ b/drivers/md/raid1-10.c
@@ -131,9 +131,18 @@ static inline bool raid1_add_bio_to_plug(struct mddev *mddev, struct bio *bio,
blk_plug_cb_fn unplug)
{
struct raid1_plug_cb *plug = NULL;
- struct blk_plug_cb *cb = blk_check_plugged(unplug, mddev,
- sizeof(*plug));
+ struct blk_plug_cb *cb;
+
+ /*
+ * If bitmap is not enabled, it's safe to submit the io directly, and
+ * this can get optimal performance.
+ */
+ if (!md_bitmap_enabled(mddev->bitmap)) {
+ raid1_submit_write(bio);
+ return true;
+ }

+ cb = blk_check_plugged(unplug, mddev, sizeof(*plug));
if (!cb)
return false;

--
2.39.2


2023-05-29 13:25:22

by Yu Kuai

[permalink] [raw]
Subject: [PATCH -next v3 2/7] md/raid1-10: factor out a helper to add bio to plug

From: Yu Kuai <[email protected]>

The code in raid1 and raid10 is identical, prepare to limit the number
of plugged bios.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/raid1-10.c | 16 ++++++++++++++++
drivers/md/raid1.c | 12 +-----------
drivers/md/raid10.c | 11 +----------
3 files changed, 18 insertions(+), 21 deletions(-)

diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c
index e61f6cad4e08..9bf19a3409ce 100644
--- a/drivers/md/raid1-10.c
+++ b/drivers/md/raid1-10.c
@@ -109,3 +109,19 @@ static void md_bio_reset_resync_pages(struct bio *bio, struct resync_pages *rp,
size -= len;
} while (idx++ < RESYNC_PAGES && size > 0);
}
+
+static inline bool raid1_add_bio_to_plug(struct mddev *mddev, struct bio *bio,
+ blk_plug_cb_fn unplug)
+{
+ struct raid1_plug_cb *plug = NULL;
+ struct blk_plug_cb *cb = blk_check_plugged(unplug, mddev,
+ sizeof(*plug));
+
+ if (!cb)
+ return false;
+
+ plug = container_of(cb, struct raid1_plug_cb, cb);
+ bio_list_add(&plug->pending, bio);
+
+ return true;
+}
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 2f1011ffdf09..e86c5e71c604 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1343,8 +1343,6 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
struct bitmap *bitmap = mddev->bitmap;
unsigned long flags;
struct md_rdev *blocked_rdev;
- struct blk_plug_cb *cb;
- struct raid1_plug_cb *plug = NULL;
int first_clone;
int max_sectors;
bool write_behind = false;
@@ -1573,15 +1571,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
r1_bio->sector);
/* flush_pending_writes() needs access to the rdev so...*/
mbio->bi_bdev = (void *)rdev;
-
- cb = blk_check_plugged(raid1_unplug, mddev, sizeof(*plug));
- if (cb)
- plug = container_of(cb, struct raid1_plug_cb, cb);
- else
- plug = NULL;
- if (plug) {
- bio_list_add(&plug->pending, mbio);
- } else {
+ if (!raid1_add_bio_to_plug(mddev, mbio, raid1_unplug)) {
spin_lock_irqsave(&conf->device_lock, flags);
bio_list_add(&conf->pending_bio_list, mbio);
spin_unlock_irqrestore(&conf->device_lock, flags);
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 6b31f848a6d9..18702051ebd1 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1287,8 +1287,6 @@ static void raid10_write_one_disk(struct mddev *mddev, struct r10bio *r10_bio,
const blk_opf_t do_sync = bio->bi_opf & REQ_SYNC;
const blk_opf_t do_fua = bio->bi_opf & REQ_FUA;
unsigned long flags;
- struct blk_plug_cb *cb;
- struct raid1_plug_cb *plug = NULL;
struct r10conf *conf = mddev->private;
struct md_rdev *rdev;
int devnum = r10_bio->devs[n_copy].devnum;
@@ -1328,14 +1326,7 @@ static void raid10_write_one_disk(struct mddev *mddev, struct r10bio *r10_bio,

atomic_inc(&r10_bio->remaining);

- cb = blk_check_plugged(raid10_unplug, mddev, sizeof(*plug));
- if (cb)
- plug = container_of(cb, struct raid1_plug_cb, cb);
- else
- plug = NULL;
- if (plug) {
- bio_list_add(&plug->pending, mbio);
- } else {
+ if (!raid1_add_bio_to_plug(mddev, mbio, raid10_unplug)) {
spin_lock_irqsave(&conf->device_lock, flags);
bio_list_add(&conf->pending_bio_list, mbio);
spin_unlock_irqrestore(&conf->device_lock, flags);
--
2.39.2


2023-05-29 13:26:03

by Yu Kuai

[permalink] [raw]
Subject: [PATCH -next v3 5/7] md/md-bitmap: add a new helper to unplug bitmap asynchrously

From: Yu Kuai <[email protected]>

If bitmap is enabled, bitmap must update before submitting write io, this
is why unplug callback must move these io to 'conf->pending_io_list' if
'current->bio_list' is not empty, which will suffer performance
degradation.

A new helper md_bitmap_unplug_async() is introduced to submit bitmap io
in a kworker, so that submit bitmap io in raid10_unplug() doesn't require
that 'current->bio_list' is empty.

This patch prepare to limit the number of plugged bio.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md-bitmap.c | 29 +++++++++++++++++++++++++++++
drivers/md/md-bitmap.h | 1 +
drivers/md/md.c | 9 +++++++++
drivers/md/md.h | 1 +
4 files changed, 40 insertions(+)

diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index 3ee590cf12a7..25cd72b317f1 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -1054,6 +1054,35 @@ void md_bitmap_unplug(struct bitmap *bitmap)
}
EXPORT_SYMBOL(md_bitmap_unplug);

+struct bitmap_unplug_work {
+ struct work_struct work;
+ struct bitmap *bitmap;
+ struct completion *done;
+};
+
+static void md_bitmap_unplug_fn(struct work_struct *work)
+{
+ struct bitmap_unplug_work *unplug_work =
+ container_of(work, struct bitmap_unplug_work, work);
+
+ md_bitmap_unplug(unplug_work->bitmap);
+ complete(unplug_work->done);
+}
+
+void md_bitmap_unplug_async(struct bitmap *bitmap)
+{
+ DECLARE_COMPLETION_ONSTACK(done);
+ struct bitmap_unplug_work unplug_work;
+
+ INIT_WORK(&unplug_work.work, md_bitmap_unplug_fn);
+ unplug_work.bitmap = bitmap;
+ unplug_work.done = &done;
+
+ queue_work(md_bitmap_wq, &unplug_work.work);
+ wait_for_completion(&done);
+}
+EXPORT_SYMBOL(md_bitmap_unplug_async);
+
static void md_bitmap_set_memory_bits(struct bitmap *bitmap, sector_t offset, int needed);
/* * bitmap_init_from_disk -- called at bitmap_create time to initialize
* the in-memory bitmap from the on-disk bitmap -- also, sets up the
diff --git a/drivers/md/md-bitmap.h b/drivers/md/md-bitmap.h
index 3a4750952b3a..8a3788c9bfef 100644
--- a/drivers/md/md-bitmap.h
+++ b/drivers/md/md-bitmap.h
@@ -264,6 +264,7 @@ void md_bitmap_sync_with_cluster(struct mddev *mddev,
sector_t new_lo, sector_t new_hi);

void md_bitmap_unplug(struct bitmap *bitmap);
+void md_bitmap_unplug_async(struct bitmap *bitmap);
void md_bitmap_daemon_work(struct mddev *mddev);

int md_bitmap_resize(struct bitmap *bitmap, sector_t blocks,
diff --git a/drivers/md/md.c b/drivers/md/md.c
index e592f37a1071..a5a7af2f4e59 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -83,6 +83,7 @@ static struct module *md_cluster_mod;
static DECLARE_WAIT_QUEUE_HEAD(resync_wait);
static struct workqueue_struct *md_wq;
static struct workqueue_struct *md_misc_wq;
+struct workqueue_struct *md_bitmap_wq;

static int remove_and_add_spares(struct mddev *mddev,
struct md_rdev *this);
@@ -9636,6 +9637,11 @@ static int __init md_init(void)
if (!md_misc_wq)
goto err_misc_wq;

+ md_bitmap_wq = alloc_workqueue("md_bitmap", WQ_MEM_RECLAIM | WQ_UNBOUND,
+ 0);
+ if (!md_bitmap_wq)
+ goto err_bitmap_wq;
+
ret = __register_blkdev(MD_MAJOR, "md", md_probe);
if (ret < 0)
goto err_md;
@@ -9654,6 +9660,8 @@ static int __init md_init(void)
err_mdp:
unregister_blkdev(MD_MAJOR, "md");
err_md:
+ destroy_workqueue(md_bitmap_wq);
+err_bitmap_wq:
destroy_workqueue(md_misc_wq);
err_misc_wq:
destroy_workqueue(md_wq);
@@ -9950,6 +9958,7 @@ static __exit void md_exit(void)
spin_unlock(&all_mddevs_lock);

destroy_workqueue(md_misc_wq);
+ destroy_workqueue(md_bitmap_wq);
destroy_workqueue(md_wq);
}

diff --git a/drivers/md/md.h b/drivers/md/md.h
index a50122165fa1..bfd2306bc750 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -852,6 +852,7 @@ struct mdu_array_info_s;
struct mdu_disk_info_s;

extern int mdp_major;
+extern struct workqueue_struct *md_bitmap_wq;
void md_autostart_arrays(int part);
int md_set_array_info(struct mddev *mddev, struct mdu_array_info_s *info);
int md_add_new_disk(struct mddev *mddev, struct mdu_disk_info_s *info);
--
2.39.2


2023-05-31 07:27:53

by Xiao Ni

[permalink] [raw]
Subject: Re: [PATCH -next v3 3/7] md/raid1-10: factor out a helper to submit normal write

On Mon, May 29, 2023 at 9:14 PM Yu Kuai <[email protected]> wrote:
>
> From: Yu Kuai <[email protected]>
>
> There are multiple places to do the same thing, factor out a helper to
> prevent redundant code, and the helper will be used in following patch
> as well.
>
> Signed-off-by: Yu Kuai <[email protected]>
> ---
> drivers/md/raid1-10.c | 17 +++++++++++++++++
> drivers/md/raid1.c | 13 ++-----------
> drivers/md/raid10.c | 26 ++++----------------------
> 3 files changed, 23 insertions(+), 33 deletions(-)
>
> diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c
> index 9bf19a3409ce..506299bd55cb 100644
> --- a/drivers/md/raid1-10.c
> +++ b/drivers/md/raid1-10.c
> @@ -110,6 +110,23 @@ static void md_bio_reset_resync_pages(struct bio *bio, struct resync_pages *rp,
> } while (idx++ < RESYNC_PAGES && size > 0);
> }
>
> +
> +static inline void raid1_submit_write(struct bio *bio)

Hi Kuai

Is it better to change the name to rdev_submit_write? It's just a
suggestion. The patch looks good to me.

Regards
Xiao

> +{
> + struct md_rdev *rdev = (struct md_rdev *)bio->bi_bdev;
> +
> + bio->bi_next = NULL;
> + bio_set_dev(bio, rdev->bdev);
> + if (test_bit(Faulty, &rdev->flags))
> + bio_io_error(bio);
> + else if (unlikely(bio_op(bio) == REQ_OP_DISCARD &&
> + !bdev_max_discard_sectors(bio->bi_bdev)))
> + /* Just ignore it */
> + bio_endio(bio);
> + else
> + submit_bio_noacct(bio);
> +}
> +
> static inline bool raid1_add_bio_to_plug(struct mddev *mddev, struct bio *bio,
> blk_plug_cb_fn unplug)
> {
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index e86c5e71c604..0778e398584c 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -799,17 +799,8 @@ static void flush_bio_list(struct r1conf *conf, struct bio *bio)
>
> while (bio) { /* submit pending writes */
> struct bio *next = bio->bi_next;
> - struct md_rdev *rdev = (void *)bio->bi_bdev;
> - bio->bi_next = NULL;
> - bio_set_dev(bio, rdev->bdev);
> - if (test_bit(Faulty, &rdev->flags)) {
> - bio_io_error(bio);
> - } else if (unlikely((bio_op(bio) == REQ_OP_DISCARD) &&
> - !bdev_max_discard_sectors(bio->bi_bdev)))
> - /* Just ignore it */
> - bio_endio(bio);
> - else
> - submit_bio_noacct(bio);
> +
> + raid1_submit_write(bio);
> bio = next;
> cond_resched();
> }
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 18702051ebd1..6640507ecb0d 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -909,17 +909,8 @@ static void flush_pending_writes(struct r10conf *conf)
>
> while (bio) { /* submit pending writes */
> struct bio *next = bio->bi_next;
> - struct md_rdev *rdev = (void*)bio->bi_bdev;
> - bio->bi_next = NULL;
> - bio_set_dev(bio, rdev->bdev);
> - if (test_bit(Faulty, &rdev->flags)) {
> - bio_io_error(bio);
> - } else if (unlikely((bio_op(bio) == REQ_OP_DISCARD) &&
> - !bdev_max_discard_sectors(bio->bi_bdev)))
> - /* Just ignore it */
> - bio_endio(bio);
> - else
> - submit_bio_noacct(bio);
> +
> + raid1_submit_write(bio);
> bio = next;
> cond_resched();
> }
> @@ -1134,17 +1125,8 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule)
>
> while (bio) { /* submit pending writes */
> struct bio *next = bio->bi_next;
> - struct md_rdev *rdev = (void*)bio->bi_bdev;
> - bio->bi_next = NULL;
> - bio_set_dev(bio, rdev->bdev);
> - if (test_bit(Faulty, &rdev->flags)) {
> - bio_io_error(bio);
> - } else if (unlikely((bio_op(bio) == REQ_OP_DISCARD) &&
> - !bdev_max_discard_sectors(bio->bi_bdev)))
> - /* Just ignore it */
> - bio_endio(bio);
> - else
> - submit_bio_noacct(bio);
> +
> + raid1_submit_write(bio);
> bio = next;
> cond_resched();
> }
> --
> 2.39.2
>


2023-05-31 07:31:21

by Xiao Ni

[permalink] [raw]
Subject: Re: [PATCH -next v3 4/7] md/raid1-10: submit write io directly if bitmap is not enabled

On Mon, May 29, 2023 at 9:14 PM Yu Kuai <[email protected]> wrote:
>
> From: Yu Kuai <[email protected]>
>
> Commit 6cce3b23f6f8 ("[PATCH] md: write intent bitmap support for raid10")
> add bitmap support, and it changed that write io is submitted through
> daemon thread because bitmap need to be updated before write io. And
> later, plug is used to fix performance regression because all the write io
> will go to demon thread, which means io can't be issued concurrently.
>
> However, if bitmap is not enabled, the write io should not go to daemon
> thread in the first place, and plug is not needed as well.
>
> Fixes: 6cce3b23f6f8 ("[PATCH] md: write intent bitmap support for raid10")
> Signed-off-by: Yu Kuai <[email protected]>
> ---
> drivers/md/md-bitmap.c | 4 +---
> drivers/md/md-bitmap.h | 7 +++++++
> drivers/md/raid1-10.c | 13 +++++++++++--
> 3 files changed, 19 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
> index ad5a3456cd8a..3ee590cf12a7 100644
> --- a/drivers/md/md-bitmap.c
> +++ b/drivers/md/md-bitmap.c
> @@ -1016,7 +1016,6 @@ static int md_bitmap_file_test_bit(struct bitmap *bitmap, sector_t block)
> return set;
> }
>
> -
> /* this gets called when the md device is ready to unplug its underlying
> * (slave) device queues -- before we let any writes go down, we need to
> * sync the dirty pages of the bitmap file to disk */
> @@ -1026,8 +1025,7 @@ void md_bitmap_unplug(struct bitmap *bitmap)
> int dirty, need_write;
> int writing = 0;
>
> - if (!bitmap || !bitmap->storage.filemap ||
> - test_bit(BITMAP_STALE, &bitmap->flags))
> + if (!md_bitmap_enabled(bitmap))
> return;
>
> /* look at each page to see if there are any set bits that need to be
> diff --git a/drivers/md/md-bitmap.h b/drivers/md/md-bitmap.h
> index cfd7395de8fd..3a4750952b3a 100644
> --- a/drivers/md/md-bitmap.h
> +++ b/drivers/md/md-bitmap.h
> @@ -273,6 +273,13 @@ int md_bitmap_copy_from_slot(struct mddev *mddev, int slot,
> sector_t *lo, sector_t *hi, bool clear_bits);
> void md_bitmap_free(struct bitmap *bitmap);
> void md_bitmap_wait_behind_writes(struct mddev *mddev);
> +
> +static inline bool md_bitmap_enabled(struct bitmap *bitmap)
> +{
> + return bitmap && bitmap->storage.filemap &&
> + !test_bit(BITMAP_STALE, &bitmap->flags);
> +}
> +
> #endif
>
> #endif
> diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c
> index 506299bd55cb..73cc3cb9154d 100644
> --- a/drivers/md/raid1-10.c
> +++ b/drivers/md/raid1-10.c
> @@ -131,9 +131,18 @@ static inline bool raid1_add_bio_to_plug(struct mddev *mddev, struct bio *bio,
> blk_plug_cb_fn unplug)
> {
> struct raid1_plug_cb *plug = NULL;
> - struct blk_plug_cb *cb = blk_check_plugged(unplug, mddev,
> - sizeof(*plug));
> + struct blk_plug_cb *cb;
> +
> + /*
> + * If bitmap is not enabled, it's safe to submit the io directly, and
> + * this can get optimal performance.
> + */
> + if (!md_bitmap_enabled(mddev->bitmap)) {
> + raid1_submit_write(bio);
> + return true;
> + }

Can we check this out of raid1_add_bio_to_plug and call
raid1_submit_write directly in make_request function?

Regards
Xiao
>
> + cb = blk_check_plugged(unplug, mddev, sizeof(*plug));
> if (!cb)
> return false;
>
> --
> 2.39.2
>


2023-05-31 08:25:48

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next v3 3/7] md/raid1-10: factor out a helper to submit normal write

Hi,

在 2023/05/31 15:20, Xiao Ni 写道:
>> @@ -110,6 +110,23 @@ static void md_bio_reset_resync_pages(struct bio *bio, struct resync_pages *rp,
>> } while (idx++ < RESYNC_PAGES && size > 0);
>> }
>>
>> +
>> +static inline void raid1_submit_write(struct bio *bio)
>
> Hi Kuai
>
> Is it better to change the name to rdev_submit_write? It's just a
> suggestion. The patch looks good to me.

Yes, this sounds like a better name.

Thanks,
Kuai
>
> Regards
> Xiao


2023-05-31 08:37:32

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next v3 4/7] md/raid1-10: submit write io directly if bitmap is not enabled

Hi,

在 2023/05/31 15:26, Xiao Ni 写道:
> On Mon, May 29, 2023 at 9:14 PM Yu Kuai <[email protected]> wrote:
>>
>> From: Yu Kuai <[email protected]>
>>
>> Commit 6cce3b23f6f8 ("[PATCH] md: write intent bitmap support for raid10")
>> add bitmap support, and it changed that write io is submitted through
>> daemon thread because bitmap need to be updated before write io. And
>> later, plug is used to fix performance regression because all the write io
>> will go to demon thread, which means io can't be issued concurrently.
>>
>> However, if bitmap is not enabled, the write io should not go to daemon
>> thread in the first place, and plug is not needed as well.
>>
>> Fixes: 6cce3b23f6f8 ("[PATCH] md: write intent bitmap support for raid10")
>> Signed-off-by: Yu Kuai <[email protected]>
>> ---
>> drivers/md/md-bitmap.c | 4 +---
>> drivers/md/md-bitmap.h | 7 +++++++
>> drivers/md/raid1-10.c | 13 +++++++++++--
>> 3 files changed, 19 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
>> index ad5a3456cd8a..3ee590cf12a7 100644
>> --- a/drivers/md/md-bitmap.c
>> +++ b/drivers/md/md-bitmap.c
>> @@ -1016,7 +1016,6 @@ static int md_bitmap_file_test_bit(struct bitmap *bitmap, sector_t block)
>> return set;
>> }
>>
>> -
>> /* this gets called when the md device is ready to unplug its underlying
>> * (slave) device queues -- before we let any writes go down, we need to
>> * sync the dirty pages of the bitmap file to disk */
>> @@ -1026,8 +1025,7 @@ void md_bitmap_unplug(struct bitmap *bitmap)
>> int dirty, need_write;
>> int writing = 0;
>>
>> - if (!bitmap || !bitmap->storage.filemap ||
>> - test_bit(BITMAP_STALE, &bitmap->flags))
>> + if (!md_bitmap_enabled(bitmap))
>> return;
>>
>> /* look at each page to see if there are any set bits that need to be
>> diff --git a/drivers/md/md-bitmap.h b/drivers/md/md-bitmap.h
>> index cfd7395de8fd..3a4750952b3a 100644
>> --- a/drivers/md/md-bitmap.h
>> +++ b/drivers/md/md-bitmap.h
>> @@ -273,6 +273,13 @@ int md_bitmap_copy_from_slot(struct mddev *mddev, int slot,
>> sector_t *lo, sector_t *hi, bool clear_bits);
>> void md_bitmap_free(struct bitmap *bitmap);
>> void md_bitmap_wait_behind_writes(struct mddev *mddev);
>> +
>> +static inline bool md_bitmap_enabled(struct bitmap *bitmap)
>> +{
>> + return bitmap && bitmap->storage.filemap &&
>> + !test_bit(BITMAP_STALE, &bitmap->flags);
>> +}
>> +
>> #endif
>>
>> #endif
>> diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c
>> index 506299bd55cb..73cc3cb9154d 100644
>> --- a/drivers/md/raid1-10.c
>> +++ b/drivers/md/raid1-10.c
>> @@ -131,9 +131,18 @@ static inline bool raid1_add_bio_to_plug(struct mddev *mddev, struct bio *bio,
>> blk_plug_cb_fn unplug)
>> {
>> struct raid1_plug_cb *plug = NULL;
>> - struct blk_plug_cb *cb = blk_check_plugged(unplug, mddev,
>> - sizeof(*plug));
>> + struct blk_plug_cb *cb;
>> +
>> + /*
>> + * If bitmap is not enabled, it's safe to submit the io directly, and
>> + * this can get optimal performance.
>> + */
>> + if (!md_bitmap_enabled(mddev->bitmap)) {
>> + raid1_submit_write(bio);
>> + return true;
>> + }
>
> Can we check this out of raid1_add_bio_to_plug and call
> raid1_submit_write directly in make_request function?

Of course we can, I'm trying to avoid redundant code here...

Thanks,
Kuai
>
> Regards
> Xiao
>>
>> + cb = blk_check_plugged(unplug, mddev, sizeof(*plug));
>> if (!cb)
>> return false;
>>
>> --
>> 2.39.2
>>
>
> .
>


2023-05-31 15:24:45

by Xiao Ni

[permalink] [raw]
Subject: Re: [PATCH -next v3 4/7] md/raid1-10: submit write io directly if bitmap is not enabled

On Wed, May 31, 2023 at 4:25 PM Yu Kuai <[email protected]> wrote:
>
> Hi,
>
> 在 2023/05/31 15:26, Xiao Ni 写道:
> > On Mon, May 29, 2023 at 9:14 PM Yu Kuai <[email protected]> wrote:
> >>
> >> From: Yu Kuai <[email protected]>
> >>
> >> Commit 6cce3b23f6f8 ("[PATCH] md: write intent bitmap support for raid10")
> >> add bitmap support, and it changed that write io is submitted through
> >> daemon thread because bitmap need to be updated before write io. And
> >> later, plug is used to fix performance regression because all the write io
> >> will go to demon thread, which means io can't be issued concurrently.
> >>
> >> However, if bitmap is not enabled, the write io should not go to daemon
> >> thread in the first place, and plug is not needed as well.
> >>
> >> Fixes: 6cce3b23f6f8 ("[PATCH] md: write intent bitmap support for raid10")
> >> Signed-off-by: Yu Kuai <[email protected]>
> >> ---
> >> drivers/md/md-bitmap.c | 4 +---
> >> drivers/md/md-bitmap.h | 7 +++++++
> >> drivers/md/raid1-10.c | 13 +++++++++++--
> >> 3 files changed, 19 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
> >> index ad5a3456cd8a..3ee590cf12a7 100644
> >> --- a/drivers/md/md-bitmap.c
> >> +++ b/drivers/md/md-bitmap.c
> >> @@ -1016,7 +1016,6 @@ static int md_bitmap_file_test_bit(struct bitmap *bitmap, sector_t block)
> >> return set;
> >> }
> >>
> >> -
> >> /* this gets called when the md device is ready to unplug its underlying
> >> * (slave) device queues -- before we let any writes go down, we need to
> >> * sync the dirty pages of the bitmap file to disk */
> >> @@ -1026,8 +1025,7 @@ void md_bitmap_unplug(struct bitmap *bitmap)
> >> int dirty, need_write;
> >> int writing = 0;
> >>
> >> - if (!bitmap || !bitmap->storage.filemap ||
> >> - test_bit(BITMAP_STALE, &bitmap->flags))
> >> + if (!md_bitmap_enabled(bitmap))
> >> return;
> >>
> >> /* look at each page to see if there are any set bits that need to be
> >> diff --git a/drivers/md/md-bitmap.h b/drivers/md/md-bitmap.h
> >> index cfd7395de8fd..3a4750952b3a 100644
> >> --- a/drivers/md/md-bitmap.h
> >> +++ b/drivers/md/md-bitmap.h
> >> @@ -273,6 +273,13 @@ int md_bitmap_copy_from_slot(struct mddev *mddev, int slot,
> >> sector_t *lo, sector_t *hi, bool clear_bits);
> >> void md_bitmap_free(struct bitmap *bitmap);
> >> void md_bitmap_wait_behind_writes(struct mddev *mddev);
> >> +
> >> +static inline bool md_bitmap_enabled(struct bitmap *bitmap)
> >> +{
> >> + return bitmap && bitmap->storage.filemap &&
> >> + !test_bit(BITMAP_STALE, &bitmap->flags);
> >> +}
> >> +
> >> #endif
> >>
> >> #endif
> >> diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c
> >> index 506299bd55cb..73cc3cb9154d 100644
> >> --- a/drivers/md/raid1-10.c
> >> +++ b/drivers/md/raid1-10.c
> >> @@ -131,9 +131,18 @@ static inline bool raid1_add_bio_to_plug(struct mddev *mddev, struct bio *bio,
> >> blk_plug_cb_fn unplug)
> >> {
> >> struct raid1_plug_cb *plug = NULL;
> >> - struct blk_plug_cb *cb = blk_check_plugged(unplug, mddev,
> >> - sizeof(*plug));
> >> + struct blk_plug_cb *cb;
> >> +
> >> + /*
> >> + * If bitmap is not enabled, it's safe to submit the io directly, and
> >> + * this can get optimal performance.
> >> + */
> >> + if (!md_bitmap_enabled(mddev->bitmap)) {
> >> + raid1_submit_write(bio);
> >> + return true;
> >> + }
> >
> > Can we check this out of raid1_add_bio_to_plug and call
> > raid1_submit_write directly in make_request function?
>
> Of course we can, I'm trying to avoid redundant code here...

I know. But it doesn't fit the name of the function.

Regards
Xiao
>
> Thanks,
> Kuai
> >
> > Regards
> > Xiao
> >>
> >> + cb = blk_check_plugged(unplug, mddev, sizeof(*plug));
> >> if (!cb)
> >> return false;
> >>
> >> --
> >> 2.39.2
> >>
> >
> > .
> >
>


2023-06-06 18:09:16

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH -next v3 5/7] md/md-bitmap: add a new helper to unplug bitmap asynchrously

On Mon, May 29, 2023 at 6:14 AM Yu Kuai <[email protected]> wrote:
>
> From: Yu Kuai <[email protected]>
>
> If bitmap is enabled, bitmap must update before submitting write io, this
> is why unplug callback must move these io to 'conf->pending_io_list' if
> 'current->bio_list' is not empty, which will suffer performance
> degradation.
>
> A new helper md_bitmap_unplug_async() is introduced to submit bitmap io
> in a kworker, so that submit bitmap io in raid10_unplug() doesn't require
> that 'current->bio_list' is empty.
>
> This patch prepare to limit the number of plugged bio.
>
> Signed-off-by: Yu Kuai <[email protected]>
> ---
> drivers/md/md-bitmap.c | 29 +++++++++++++++++++++++++++++
> drivers/md/md-bitmap.h | 1 +
> drivers/md/md.c | 9 +++++++++
> drivers/md/md.h | 1 +
> 4 files changed, 40 insertions(+)
>
> diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
> index 3ee590cf12a7..25cd72b317f1 100644
> --- a/drivers/md/md-bitmap.c
> +++ b/drivers/md/md-bitmap.c
> @@ -1054,6 +1054,35 @@ void md_bitmap_unplug(struct bitmap *bitmap)
> }
> EXPORT_SYMBOL(md_bitmap_unplug);
>
> +struct bitmap_unplug_work {
> + struct work_struct work;
> + struct bitmap *bitmap;
> + struct completion *done;
> +};
> +
> +static void md_bitmap_unplug_fn(struct work_struct *work)
> +{
> + struct bitmap_unplug_work *unplug_work =
> + container_of(work, struct bitmap_unplug_work, work);
> +
> + md_bitmap_unplug(unplug_work->bitmap);
> + complete(unplug_work->done);
> +}
> +
> +void md_bitmap_unplug_async(struct bitmap *bitmap)
> +{
> + DECLARE_COMPLETION_ONSTACK(done);
> + struct bitmap_unplug_work unplug_work;
> +
> + INIT_WORK(&unplug_work.work, md_bitmap_unplug_fn);
> + unplug_work.bitmap = bitmap;
> + unplug_work.done = &done;
> +
> + queue_work(md_bitmap_wq, &unplug_work.work);
> + wait_for_completion(&done);
> +}
> +EXPORT_SYMBOL(md_bitmap_unplug_async);

I updated the patch with:

diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index 25cd72b317f1..1ff712889a3b 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -1074,7 +1074,7 @@ void md_bitmap_unplug_async(struct bitmap *bitmap)
DECLARE_COMPLETION_ONSTACK(done);
struct bitmap_unplug_work unplug_work;

- INIT_WORK(&unplug_work.work, md_bitmap_unplug_fn);
+ INIT_WORK_ONSTACK(&unplug_work.work, md_bitmap_unplug_fn);
unplug_work.bitmap = bitmap;
unplug_work.done = &done;

Otherwise, we will get the following warning with mdadm test
05r1-add-internalbitmap.

Thanks,
Song



[ 135.072374] ODEBUG: object ffffc90005adf500 is on stack
ffffc90005ad8000, but NOT annotated.
[ 135.074994] ------------[ cut here ]------------
[ 135.075770] WARNING: CPU: 1 PID: 1115 at lib/debugobjects.c:548
lookup_object_or_alloc+0x294/0x560
[ 135.077085] Modules linked in:
[ 135.077571] CPU: 1 PID: 1115 Comm: mkfs.ext3 Not tainted 6.4.0-rc2+ #737
[ 135.078542] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
[ 135.080398] RIP: 0010:lookup_object_or_alloc+0x294/0x560
[ 135.081165] Code: 2d b1 24 15 09 65 48 8b 2c 25 80 9e 1f 00 48 8d
7d 20 e8 7f 19 a9 ff 48 8b 55 20 48 89 de 48 c7 c7 00 f2 05 83 e8 5c
97 70 ff <0f> 0b 48 83 c4 58 4c 89 f8 5b 5d 41 5c 41 5d 41 5e 41 5f c3
cc cc
[ 135.083830] RSP: 0018:ffffc90005adf398 EFLAGS: 00010082
[ 135.084568] RAX: 0000000000000050 RBX: ffffc90005adf500 RCX: 0000000000000000
[ 135.085582] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffffffff8a7dea40
[ 135.086687] RBP: 0000000000000001 R08: fffff52000b5be3d R09: fffff52000b5be3d
[ 135.087684] R10: ffffc90005adf1e7 R11: fffff52000b5be3c R12: ffff888103ee0040
[ 135.088725] R13: ffff888103ee0060 R14: ffff888df71f7c40 R15: ffff88815eb90060
[ 135.089780] FS: 00007fa1f1c85780(0000) GS:ffff888df7000000(0000)
knlGS:0000000000000000
[ 135.090920] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 135.091751] CR2: 000055add4915f44 CR3: 000000011227a002 CR4: 0000000000370ee0
[ 135.092775] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 135.093812] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 135.094838] Call Trace:
[ 135.095178] <TASK>
[ 135.095528] ? _raw_spin_lock_irqsave+0x6d/0x90
[ 135.096197] __debug_object_init+0x63/0x130
[ 135.096853] md_bitmap_unplug_async+0x9c/0x180
[ 135.097512] ? __pfx_md_bitmap_unplug_async+0x10/0x10
[ 135.098267] ? bio_associate_blkg_from_css+0x446/0xb10
[ 135.099043] ? update_io_ticks+0xad/0x180
[ 135.099636] ? __pfx_update_io_ticks+0x10/0x10
[ 135.100355] flush_bio_list+0x60/0x1c0
[ 135.100922] raid1_unplug+0x64/0x120
[ 135.101460] raid1_make_request+0xfd7/0x1e10
[ 135.102125] ? __pfx_raid1_make_request+0x10/0x10
[ 135.102806] ? lock_release+0x27a/0x690
[ 135.103368] ? __pfx_lock_release+0x10/0x10
[ 135.103963] ? md_handle_request+0x2b3/0x7c0
[ 135.104570] ? __pfx_rcu_read_lock_held+0x10/0x10
[ 135.105232] ? md_handle_request+0x2b3/0x7c0
[ 135.105886] md_handle_request+0x365/0x7c0
[ 135.106492] ? __pfx_md_handle_request+0x10/0x10
[ 135.107177] __submit_bio+0x1dc/0x5d0
[ 135.107770] submit_bio_noacct_nocheck+0x2c2/0x5b0
[ 135.108457] ? lock_is_held_type+0xd8/0x130
[ 135.109076] ? __pfx_submit_bio_noacct_nocheck+0x10/0x10
[ 135.109872] ? submit_bio_noacct+0x283/0x750
[ 135.110533] __block_write_full_page+0x45f/0xbb0
[ 135.111200] ? __pfx_blkdev_get_block+0x10/0x10
[ 135.111901] ? __pfx_end_buffer_async_write+0x10/0x10
[ 135.112741] writepage_cb+0x3e/0xc0
[ 135.113285] write_cache_pages+0x2d8/0x730
[ 135.113906] ? __pfx_writepage_cb+0x10/0x10
[ 135.114531] ? __pfx_write_cache_pages+0x10/0x10
[ 135.115196] ? lock_release+0x27a/0x690
[ 135.115819] do_writepages+0x190/0x310
[ 135.116370] ? __pfx_do_writepages+0x10/0x10
[ 135.116995] ? preempt_count_sub+0x14/0xc0
[ 135.117596] ? _raw_spin_unlock+0x29/0x50
[ 135.118171] ? wbc_attach_and_unlock_inode+0x216/0x410
[ 135.118934] filemap_fdatawrite_wbc+0x93/0xc0
[ 135.119574] __filemap_fdatawrite_range+0x99/0xd0
[ 135.120246] ? __pfx___filemap_fdatawrite_range+0x10/0x10
[ 135.121055] ? ktime_get_coarse_real_ts64+0xec/0x100
[ 135.121789] file_write_and_wait_range+0x67/0xd0
[ 135.122578] blkdev_fsync+0x35/0x60
[ 135.123095] do_fsync+0x38/0x70
[ 135.123587] ? syscall_trace_enter.isra.15+0x15c/0x1f0
[ 135.124318] __x64_sys_fsync+0x1d/0x30
[ 135.124924] do_syscall_64+0x3a/0x90
[ 135.125481] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 135.126197] RIP: 0033:0x7fa1efef4478
[ 135.126765] Code: 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00
00 90 f3 0f 1e fa 48 8d 05 25 01 2d 00 8b 00 85 c0 75 17 b8 4a 00 00
00 0f 05 <48> 3d 00 f0 ff ff 77 40 c3 0f 1f 80 00 00 00 00 53 89 fb 48
83 ec
[ 135.129424] RSP: 002b:00007fffb2176318 EFLAGS: 00000246 ORIG_RAX:
000000000000004a
[ 135.130498] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fa1efef4478
[ 135.131493] RDX: 0000000000000400 RSI: 0000559e026969e0 RDI: 0000000000000003
[ 135.132491] RBP: 0000559e026950d0 R08: 0000559e026969e0 R09: 00007fa1eff80620
[ 135.133496] R10: 0000000000800800 R11: 0000000000000246 R12: 00007fffb2176390
[ 135.134491] R13: 00007fffb2176398 R14: 0000559e02698e70 R15: 0000559e026944f0
[ 135.135515] </TASK>
[ 135.135843] irq event stamp: 141152
[ 135.136349] hardirqs last enabled at (141151):
[<ffffffff827778e0>] _raw_spin_unlock_irqrestore+0x30/0x60
[ 135.137721] hardirqs last disabled at (141152):
[<ffffffff827775cd>] _raw_spin_lock_irqsave+0x6d/0x90
[ 135.139031] softirqs last enabled at (139890):
[<ffffffff827793fb>] __do_softirq+0x3eb/0x531
[ 135.140249] softirqs last disabled at (139885):
[<ffffffff810d8f45>] irq_exit_rcu+0x115/0x160
[ 135.141503] ---[ end trace 0000000000000000 ]---




> +
> static void md_bitmap_set_memory_bits(struct bitmap *bitmap, sector_t offset, int needed);
> /* * bitmap_init_from_disk -- called at bitmap_create time to initialize
> * the in-memory bitmap from the on-disk bitmap -- also, sets up the
> diff --git a/drivers/md/md-bitmap.h b/drivers/md/md-bitmap.h
> index 3a4750952b3a..8a3788c9bfef 100644
> --- a/drivers/md/md-bitmap.h
> +++ b/drivers/md/md-bitmap.h
> @@ -264,6 +264,7 @@ void md_bitmap_sync_with_cluster(struct mddev *mddev,
> sector_t new_lo, sector_t new_hi);
>
> void md_bitmap_unplug(struct bitmap *bitmap);
> +void md_bitmap_unplug_async(struct bitmap *bitmap);
> void md_bitmap_daemon_work(struct mddev *mddev);
>
> int md_bitmap_resize(struct bitmap *bitmap, sector_t blocks,
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index e592f37a1071..a5a7af2f4e59 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -83,6 +83,7 @@ static struct module *md_cluster_mod;
> static DECLARE_WAIT_QUEUE_HEAD(resync_wait);
> static struct workqueue_struct *md_wq;
> static struct workqueue_struct *md_misc_wq;
> +struct workqueue_struct *md_bitmap_wq;
>
> static int remove_and_add_spares(struct mddev *mddev,
> struct md_rdev *this);
> @@ -9636,6 +9637,11 @@ static int __init md_init(void)
> if (!md_misc_wq)
> goto err_misc_wq;
>
> + md_bitmap_wq = alloc_workqueue("md_bitmap", WQ_MEM_RECLAIM | WQ_UNBOUND,
> + 0);
> + if (!md_bitmap_wq)
> + goto err_bitmap_wq;
> +
> ret = __register_blkdev(MD_MAJOR, "md", md_probe);
> if (ret < 0)
> goto err_md;
> @@ -9654,6 +9660,8 @@ static int __init md_init(void)
> err_mdp:
> unregister_blkdev(MD_MAJOR, "md");
> err_md:
> + destroy_workqueue(md_bitmap_wq);
> +err_bitmap_wq:
> destroy_workqueue(md_misc_wq);
> err_misc_wq:
> destroy_workqueue(md_wq);
> @@ -9950,6 +9958,7 @@ static __exit void md_exit(void)
> spin_unlock(&all_mddevs_lock);
>
> destroy_workqueue(md_misc_wq);
> + destroy_workqueue(md_bitmap_wq);
> destroy_workqueue(md_wq);
> }
>
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index a50122165fa1..bfd2306bc750 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -852,6 +852,7 @@ struct mdu_array_info_s;
> struct mdu_disk_info_s;
>
> extern int mdp_major;
> +extern struct workqueue_struct *md_bitmap_wq;
> void md_autostart_arrays(int part);
> int md_set_array_info(struct mddev *mddev, struct mdu_array_info_s *info);
> int md_add_new_disk(struct mddev *mddev, struct mdu_disk_info_s *info);
> --
> 2.39.2
>

2023-06-06 22:53:32

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH -next v3 0/7] limit the number of plugged bio

On Mon, May 29, 2023 at 6:14 AM Yu Kuai <[email protected]> wrote:
>
> From: Yu Kuai <[email protected]>
>
> Changes in v3:
> - prefix function with 'raid1_' instead of 'md_'
> - use a globle workequeue instead of per bitmap in patch 5
> Changes in v2:
> - remove the patch to rename raid1-10.c
>
> Yu Kuai (7):
> md/raid10: prevent soft lockup while flush writes
> md/raid1-10: factor out a helper to add bio to plug
> md/raid1-10: factor out a helper to submit normal write
> md/raid1-10: submit write io directly if bitmap is not enabled
> md/md-bitmap: add a new helper to unplug bitmap asynchrously
> md/raid1-10: don't handle pluged bio by daemon thread
> md/raid1-10: limit the number of plugged bio

With the fix on 5/7, applied to md-next.

Thanks,
Song


>
> drivers/md/md-bitmap.c | 33 ++++++++++++++++++++--
> drivers/md/md-bitmap.h | 8 ++++++
> drivers/md/md.c | 9 ++++++
> drivers/md/md.h | 1 +
> drivers/md/raid1-10.c | 63 ++++++++++++++++++++++++++++++++++++++++++
> drivers/md/raid1.c | 29 ++++---------------
> drivers/md/raid10.c | 47 +++++++------------------------
> 7 files changed, 126 insertions(+), 64 deletions(-)
>
> --
> 2.39.2
>