2023-11-28 09:45:14

by Yunlong Xing

Subject: [PATCH] dm: increase the io priority of the kworker-kverityd process

From: Hongyu Jin <[email protected]>

When obtaining the hash value of a high-I/O-priority data block
from the disk, give the kverity worker that reads the hash the
same high I/O priority, so that it is not blocked by other I/O
with lower priority.
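
For context, a data bio typically ends up with a high priority because
the submitting task raised its own I/O priority, for example via the
ioprio_set(2) syscall. Illustrative userspace snippet (not part of this
patch):

#include <unistd.h>
#include <sys/syscall.h>
#include <linux/ioprio.h>

int main(void)
{
	/* give this task realtime I/O priority, level 0 */
	int prio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_RT, 0);

	if (syscall(SYS_ioprio_set, IOPRIO_WHO_PROCESS, 0, prio) < 0)
		return 1;

	/* block I/O issued by this task now carries this priority */
	return 0;
}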

Signed-off-by: Hongyu Jin <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
---
drivers/md/dm-verity-target.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index e115fcfe723c..ade9c6734154 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -22,6 +22,7 @@
#include <linux/scatterlist.h>
#include <linux/string.h>
#include <linux/jump_label.h>
+#include <linux/ioprio.h>

#define DM_MSG_PREFIX "verity"

@@ -639,7 +640,9 @@ static void verity_finish_io(struct dm_verity_io *io, blk_status_t status)
static void verity_work(struct work_struct *w)
{
struct dm_verity_io *io = container_of(w, struct dm_verity_io, work);
+ struct bio *bio = dm_bio_from_per_bio_data(io, io->v->ti->per_io_data_size);

+ set_task_ioprio(current, bio->bi_ioprio);
io->in_tasklet = false;

verity_fec_init_io(io);
--
2.25.1


2023-11-28 14:08:00

by Mikulas Patocka

Subject: Re: [PATCH] dm: increase the io priority of the kworker-kverityd process

Hi

This isn't correct - the workqueue process is reused by multiple work
items - from dm-verity as well as from other subsystems - so you would be
unexpectedly boosting the priority of other tasks.

The correct solution would be to set the ioprio field of the outgoing
bios in dm-bufio explicitly.
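
Roughly like this (untested sketch; the extra ioprio argument is
hypothetical at this point):

static void use_bio(struct dm_buffer *b, enum req_op op, sector_t sector,
		    unsigned int n_sectors, unsigned int offset,
		    unsigned short ioprio)
{
	...
	/* tag the outgoing bio itself instead of the worker task */
	bio->bi_ioprio = ioprio;
	...
}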

Mikulas


On Tue, 28 Nov 2023, Yunlong Xing wrote:

> From: Hongyu Jin <[email protected]>
>
> When obtaining the hash value of a high-I/O-priority data block
> from the disk, give the kverity worker that reads the hash the
> same high I/O priority, so that it is not blocked by other I/O
> with lower priority.
>
> Signed-off-by: Hongyu Jin <[email protected]>
> Signed-off-by: Yibin Ding <[email protected]>
> ---
> drivers/md/dm-verity-target.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
> index e115fcfe723c..ade9c6734154 100644
> --- a/drivers/md/dm-verity-target.c
> +++ b/drivers/md/dm-verity-target.c
> @@ -22,6 +22,7 @@
> #include <linux/scatterlist.h>
> #include <linux/string.h>
> #include <linux/jump_label.h>
> +#include <linux/ioprio.h>
>
> #define DM_MSG_PREFIX "verity"
>
> @@ -639,7 +640,9 @@ static void verity_finish_io(struct dm_verity_io *io, blk_status_t status)
> static void verity_work(struct work_struct *w)
> {
> struct dm_verity_io *io = container_of(w, struct dm_verity_io, work);
> + struct bio *bio = dm_bio_from_per_bio_data(io, io->v->ti->per_io_data_size);
>
> + set_task_ioprio(current, bio->bi_ioprio);
> io->in_tasklet = false;
>
> verity_fec_init_io(io);
> --
> 2.25.1
>
>

2023-12-06 11:41:05

by Hongyu Jin

Subject: [PATCH v2] dm verity: Inherit I/O priority from data I/O when read FEC and hash from disk

From: Hongyu Jin <[email protected]>

When reading FEC and hash blocks from disk, their I/O priority is
inconsistent with that of the data block, so they can be blocked by
other I/O with lower priority.

Add dm_bufio_prefetch_by_ioprio() and dm_bufio_read_by_ioprio(),
which let callers specify the I/O priority for individual requests.
Make the I/O for FEC and hash blocks use the same I/O priority as
the data I/O.
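
For example, dm-verity can then propagate the priority of the original
data bio when reading a hash block:

	data = dm_bufio_read_by_ioprio(v->bufio, hash_block, &buf,
				       bio->bi_ioprio);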

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>

---
Changes in v2:
- Add ioprio field in struct dm_io_region
- Initialize struct dm_io_region::ioprio to IOPRIO_DEFAULT
- Add two interfaces
---
drivers/md/dm-bufio.c | 50 ++++++++++++++++++++++-----------
drivers/md/dm-integrity.c | 5 ++++
drivers/md/dm-io.c | 1 +
drivers/md/dm-log.c | 1 +
drivers/md/dm-raid1.c | 2 ++
drivers/md/dm-snap-persistent.c | 2 ++
drivers/md/dm-verity-fec.c | 3 +-
drivers/md/dm-verity-target.c | 10 +++++--
drivers/md/dm-writecache.c | 4 +++
include/linux/dm-bufio.h | 6 ++++
include/linux/dm-io.h | 2 ++
11 files changed, 66 insertions(+), 20 deletions(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 62eb27639c9b..f1f89b89ff6d 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1256,7 +1256,7 @@ static void dmio_complete(unsigned long error, void *context)
}

static void use_dmio(struct dm_buffer *b, enum req_op op, sector_t sector,
- unsigned int n_sectors, unsigned int offset)
+ unsigned int n_sectors, unsigned int offset, unsigned short ioprio)
{
int r;
struct dm_io_request io_req = {
@@ -1269,6 +1269,7 @@ static void use_dmio(struct dm_buffer *b, enum req_op op, sector_t sector,
.bdev = b->c->bdev,
.sector = sector,
.count = n_sectors,
+ .ioprio = ioprio,
};

if (b->data_mode != DATA_MODE_VMALLOC) {
@@ -1295,7 +1296,7 @@ static void bio_complete(struct bio *bio)
}

static void use_bio(struct dm_buffer *b, enum req_op op, sector_t sector,
- unsigned int n_sectors, unsigned int offset)
+ unsigned int n_sectors, unsigned int offset, unsigned short ioprio)
{
struct bio *bio;
char *ptr;
@@ -1303,13 +1304,14 @@ static void use_bio(struct dm_buffer *b, enum req_op op, sector_t sector,

bio = bio_kmalloc(1, GFP_NOWAIT | __GFP_NORETRY | __GFP_NOWARN);
if (!bio) {
- use_dmio(b, op, sector, n_sectors, offset);
+ use_dmio(b, op, sector, n_sectors, offset, ioprio);
return;
}
bio_init(bio, b->c->bdev, bio->bi_inline_vecs, 1, op);
bio->bi_iter.bi_sector = sector;
bio->bi_end_io = bio_complete;
bio->bi_private = b;
+ bio->bi_ioprio = ioprio;

ptr = (char *)b->data + offset;
len = n_sectors << SECTOR_SHIFT;
@@ -1332,7 +1334,7 @@ static inline sector_t block_to_sector(struct dm_bufio_client *c, sector_t block
return sector;
}

-static void submit_io(struct dm_buffer *b, enum req_op op,
+static void submit_io(struct dm_buffer *b, enum req_op op, unsigned short ioprio,
void (*end_io)(struct dm_buffer *, blk_status_t))
{
unsigned int n_sectors;
@@ -1362,9 +1364,9 @@ static void submit_io(struct dm_buffer *b, enum req_op op,
}

if (b->data_mode != DATA_MODE_VMALLOC)
- use_bio(b, op, sector, n_sectors, offset);
+ use_bio(b, op, sector, n_sectors, offset, ioprio);
else
- use_dmio(b, op, sector, n_sectors, offset);
+ use_dmio(b, op, sector, n_sectors, offset, ioprio);
}

/*
@@ -1420,7 +1422,7 @@ static void __write_dirty_buffer(struct dm_buffer *b,
b->write_end = b->dirty_end;

if (!write_list)
- submit_io(b, REQ_OP_WRITE, write_endio);
+ submit_io(b, REQ_OP_WRITE, IOPRIO_DEFAULT, write_endio);
else
list_add_tail(&b->write_list, write_list);
}
@@ -1434,7 +1436,7 @@ static void __flush_write_list(struct list_head *write_list)
struct dm_buffer *b =
list_entry(write_list->next, struct dm_buffer, write_list);
list_del(&b->write_list);
- submit_io(b, REQ_OP_WRITE, write_endio);
+ submit_io(b, REQ_OP_WRITE, IOPRIO_DEFAULT, write_endio);
cond_resched();
}
blk_finish_plug(&plug);
@@ -1816,7 +1818,7 @@ static void read_endio(struct dm_buffer *b, blk_status_t status)
* and uses dm_bufio_mark_buffer_dirty to write new data back).
*/
static void *new_read(struct dm_bufio_client *c, sector_t block,
- enum new_flag nf, struct dm_buffer **bp)
+ enum new_flag nf, struct dm_buffer **bp, unsigned short ioprio)
{
int need_submit = 0;
struct dm_buffer *b;
@@ -1869,7 +1871,7 @@ static void *new_read(struct dm_bufio_client *c, sector_t block,
return NULL;

if (need_submit)
- submit_io(b, REQ_OP_READ, read_endio);
+ submit_io(b, REQ_OP_READ, ioprio, read_endio);

wait_on_bit_io(&b->state, B_READING, TASK_UNINTERRUPTIBLE);

@@ -1889,19 +1891,26 @@ static void *new_read(struct dm_bufio_client *c, sector_t block,
void *dm_bufio_get(struct dm_bufio_client *c, sector_t block,
struct dm_buffer **bp)
{
- return new_read(c, block, NF_GET, bp);
+ return new_read(c, block, NF_GET, bp, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_get);

void *dm_bufio_read(struct dm_bufio_client *c, sector_t block,
struct dm_buffer **bp)
+{
+ return dm_bufio_read_by_ioprio(c, block, bp, IOPRIO_DEFAULT);
+}
+EXPORT_SYMBOL_GPL(dm_bufio_read);
+
+void *dm_bufio_read_by_ioprio(struct dm_bufio_client *c, sector_t block,
+ struct dm_buffer **bp, unsigned short ioprio)
{
if (WARN_ON_ONCE(dm_bufio_in_request()))
return ERR_PTR(-EINVAL);

- return new_read(c, block, NF_READ, bp);
+ return new_read(c, block, NF_READ, bp, ioprio);
}
-EXPORT_SYMBOL_GPL(dm_bufio_read);
+EXPORT_SYMBOL_GPL(dm_bufio_read_by_ioprio);

void *dm_bufio_new(struct dm_bufio_client *c, sector_t block,
struct dm_buffer **bp)
@@ -1909,12 +1918,19 @@ void *dm_bufio_new(struct dm_bufio_client *c, sector_t block,
if (WARN_ON_ONCE(dm_bufio_in_request()))
return ERR_PTR(-EINVAL);

- return new_read(c, block, NF_FRESH, bp);
+ return new_read(c, block, NF_FRESH, bp, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_new);

void dm_bufio_prefetch(struct dm_bufio_client *c,
sector_t block, unsigned int n_blocks)
+{
+ return dm_bufio_prefetch_by_ioprio(c, block, n_blocks, IOPRIO_DEFAULT);
+}
+EXPORT_SYMBOL_GPL(dm_bufio_prefetch);
+
+void dm_bufio_prefetch_by_ioprio(struct dm_bufio_client *c,
+ sector_t block, unsigned int n_blocks, unsigned short ioprio)
{
struct blk_plug plug;

@@ -1950,7 +1966,7 @@ void dm_bufio_prefetch(struct dm_bufio_client *c,
dm_bufio_unlock(c);

if (need_submit)
- submit_io(b, REQ_OP_READ, read_endio);
+ submit_io(b, REQ_OP_READ, ioprio, read_endio);
dm_bufio_release(b);

cond_resched();
@@ -1965,7 +1981,7 @@ void dm_bufio_prefetch(struct dm_bufio_client *c,
flush_plug:
blk_finish_plug(&plug);
}
-EXPORT_SYMBOL_GPL(dm_bufio_prefetch);
+EXPORT_SYMBOL_GPL(dm_bufio_prefetch_by_ioprio);

void dm_bufio_release(struct dm_buffer *b)
{
@@ -2125,6 +2141,7 @@ int dm_bufio_issue_flush(struct dm_bufio_client *c)
.bdev = c->bdev,
.sector = 0,
.count = 0,
+ .ioprio = IOPRIO_DEFAULT,
};

if (WARN_ON_ONCE(dm_bufio_in_request()))
@@ -2149,6 +2166,7 @@ int dm_bufio_issue_discard(struct dm_bufio_client *c, sector_t block, sector_t c
.bdev = c->bdev,
.sector = block_to_sector(c, block),
.count = block_to_sector(c, count),
+ .ioprio = IOPRIO_DEFAULT,
};

if (WARN_ON_ONCE(dm_bufio_in_request()))
diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index e85c688fd91e..4bbfaf8f5230 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -543,6 +543,7 @@ static int sync_rw_sb(struct dm_integrity_c *ic, blk_opf_t opf)
io_loc.bdev = ic->meta_dev ? ic->meta_dev->bdev : ic->dev->bdev;
io_loc.sector = ic->start;
io_loc.count = SB_SECTORS;
+ io_loc.ioprio = IOPRIO_DEFAULT;

if (op == REQ_OP_WRITE) {
sb_set_version(ic);
@@ -1070,6 +1071,7 @@ static void rw_journal_sectors(struct dm_integrity_c *ic, blk_opf_t opf,
io_loc.bdev = ic->meta_dev ? ic->meta_dev->bdev : ic->dev->bdev;
io_loc.sector = ic->start + SB_SECTORS + sector;
io_loc.count = n_sectors;
+ io_loc.ioprio = IOPRIO_DEFAULT;

r = dm_io(&io_req, 1, &io_loc, NULL);
if (unlikely(r)) {
@@ -1187,6 +1189,7 @@ static void copy_from_journal(struct dm_integrity_c *ic, unsigned int section, u
io_loc.bdev = ic->dev->bdev;
io_loc.sector = target;
io_loc.count = n_sectors;
+ io_loc.ioprio = IOPRIO_DEFAULT;

r = dm_io(&io_req, 1, &io_loc, NULL);
if (unlikely(r)) {
@@ -1515,6 +1518,7 @@ static void dm_integrity_flush_buffers(struct dm_integrity_c *ic, bool flush_dat
fr.io_reg.bdev = ic->dev->bdev,
fr.io_reg.sector = 0,
fr.io_reg.count = 0,
+ fr.io_reg.ioprio = IOPRIO_DEFAULT,
fr.ic = ic;
init_completion(&fr.comp);
r = dm_io(&fr.io_req, 1, &fr.io_reg, NULL);
@@ -2738,6 +2742,7 @@ static void integrity_recalc(struct work_struct *w)
io_loc.bdev = ic->dev->bdev;
io_loc.sector = get_data_sector(ic, area, offset);
io_loc.count = n_sectors;
+ io_loc.ioprio = IOPRIO_DEFAULT;

r = dm_io(&io_req, 1, &io_loc, NULL);
if (unlikely(r)) {
diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index f053ce245814..b40f0a432981 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -354,6 +354,7 @@ static void do_region(const blk_opf_t opf, unsigned int region,
&io->client->bios);
bio->bi_iter.bi_sector = where->sector + (where->count - remaining);
bio->bi_end_io = endio;
+ bio->bi_ioprio = where->ioprio;
store_io_and_region_in_bio(bio, io, region);

if (op == REQ_OP_DISCARD || op == REQ_OP_WRITE_ZEROES) {
diff --git a/drivers/md/dm-log.c b/drivers/md/dm-log.c
index f9f84236dfcd..e0dacdcd94f1 100644
--- a/drivers/md/dm-log.c
+++ b/drivers/md/dm-log.c
@@ -309,6 +309,7 @@ static int flush_header(struct log_c *lc)
.bdev = lc->header_location.bdev,
.sector = 0,
.count = 0,
+ .ioprio = IOPRIO_DEFAULT,
};

lc->io_req.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;
diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index ddcb2bc4a617..2de9b1377de3 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -275,6 +275,7 @@ static int mirror_flush(struct dm_target *ti)
io[i].bdev = m->dev->bdev;
io[i].sector = 0;
io[i].count = 0;
+ io[i].ioprio = IOPRIO_DEFAULT;
}

error_bits = -1;
@@ -475,6 +476,7 @@ static void map_region(struct dm_io_region *io, struct mirror *m,
io->bdev = m->dev->bdev;
io->sector = map_sector(m, bio);
io->count = bio_sectors(bio);
+ io->ioprio = bio_prio(bio);
}

static void hold_bio(struct mirror_set *ms, struct bio *bio)
diff --git a/drivers/md/dm-snap-persistent.c b/drivers/md/dm-snap-persistent.c
index 15649921f2a9..d8f911727058 100644
--- a/drivers/md/dm-snap-persistent.c
+++ b/drivers/md/dm-snap-persistent.c
@@ -236,6 +236,8 @@ static int chunk_io(struct pstore *ps, void *area, chunk_t chunk, blk_opf_t opf,
.bdev = dm_snap_cow(ps->store->snap)->bdev,
.sector = ps->store->chunk_size * chunk,
.count = ps->store->chunk_size,
+ .ioprio = IOPRIO_DEFAULT,
+
};
struct dm_io_request io_req = {
.bi_opf = opf,
diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c
index 3ef9f018da60..160a4de56b28 100644
--- a/drivers/md/dm-verity-fec.c
+++ b/drivers/md/dm-verity-fec.c
@@ -209,6 +209,7 @@ static int fec_read_bufs(struct dm_verity *v, struct dm_verity_io *io,
u8 *bbuf, *rs_block;
u8 want_digest[HASH_MAX_DIGESTSIZE];
unsigned int n, k;
+ struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);

if (neras)
*neras = 0;
@@ -247,7 +248,7 @@ static int fec_read_bufs(struct dm_verity *v, struct dm_verity_io *io,
bufio = v->bufio;
}

- bbuf = dm_bufio_read(bufio, block, &buf);
+ bbuf = dm_bufio_read_by_ioprio(bufio, block, &buf, bio->bi_ioprio);
if (IS_ERR(bbuf)) {
DMWARN_LIMIT("%s: FEC %llu: read failed (%llu): %ld",
v->data_dev->name,
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 26adcfea0302..5945ac1dfdff 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -51,6 +51,7 @@ static DEFINE_STATIC_KEY_FALSE(use_tasklet_enabled);
struct dm_verity_prefetch_work {
struct work_struct work;
struct dm_verity *v;
+ struct dm_verity_io *io;
sector_t block;
unsigned int n_blocks;
};
@@ -293,6 +294,7 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
int r;
sector_t hash_block;
unsigned int offset;
+ struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);

verity_hash_at_level(v, block, level, &hash_block, &offset);

@@ -307,7 +309,7 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
return -EAGAIN;
}
} else
- data = dm_bufio_read(v->bufio, hash_block, &buf);
+ data = dm_bufio_read_by_ioprio(v->bufio, hash_block, &buf, bio->bi_ioprio);

if (IS_ERR(data))
return PTR_ERR(data);
@@ -692,6 +694,7 @@ static void verity_prefetch_io(struct work_struct *work)
container_of(work, struct dm_verity_prefetch_work, work);
struct dm_verity *v = pw->v;
int i;
+ struct bio *bio = dm_bio_from_per_bio_data(pw->io, v->ti->per_io_data_size);

for (i = v->levels - 2; i >= 0; i--) {
sector_t hash_block_start;
@@ -716,8 +719,8 @@ static void verity_prefetch_io(struct work_struct *work)
hash_block_end = v->hash_blocks - 1;
}
no_prefetch_cluster:
- dm_bufio_prefetch(v->bufio, hash_block_start,
- hash_block_end - hash_block_start + 1);
+ dm_bufio_prefetch_by_ioprio(v->bufio, hash_block_start,
+ hash_block_end - hash_block_start + 1, bio->bi_ioprio);
}

kfree(pw);
@@ -751,6 +754,7 @@ static void verity_submit_prefetch(struct dm_verity *v, struct dm_verity_io *io)
pw->v = v;
pw->block = block;
pw->n_blocks = n_blocks;
+ pw->io = io;
queue_work(v->verify_wq, &pw->work);
}

diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
index 074cb785eafc..135d1268246f 100644
--- a/drivers/md/dm-writecache.c
+++ b/drivers/md/dm-writecache.c
@@ -515,6 +515,7 @@ static void ssd_commit_flushed(struct dm_writecache *wc, bool wait_for_ios)
region.bdev = wc->ssd_dev->bdev;
region.sector = (sector_t)i * (BITMAP_GRANULARITY >> SECTOR_SHIFT);
region.count = (sector_t)(j - i) * (BITMAP_GRANULARITY >> SECTOR_SHIFT);
+ region.ioprio = IOPRIO_DEFAULT;

if (unlikely(region.sector >= wc->metadata_sectors))
break;
@@ -555,6 +556,7 @@ static void ssd_commit_superblock(struct dm_writecache *wc)
region.bdev = wc->ssd_dev->bdev;
region.sector = 0;
region.count = max(4096U, wc->block_size) >> SECTOR_SHIFT;
+ region.ioprio = IOPRIO_DEFAULT;

if (unlikely(region.sector + region.count > wc->metadata_sectors))
region.count = wc->metadata_sectors - region.sector;
@@ -590,6 +592,7 @@ static void writecache_disk_flush(struct dm_writecache *wc, struct dm_dev *dev)
region.bdev = dev->bdev;
region.sector = 0;
region.count = 0;
+ region.ioprio = IOPRIO_DEFAULT;
req.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;
req.mem.type = DM_IO_KMEM;
req.mem.ptr.addr = NULL;
@@ -984,6 +987,7 @@ static int writecache_read_metadata(struct dm_writecache *wc, sector_t n_sectors
region.bdev = wc->ssd_dev->bdev;
region.sector = wc->start_sector;
region.count = n_sectors;
+ region.ioprio = IOPRIO_DEFAULT;
req.bi_opf = REQ_OP_READ | REQ_SYNC;
req.mem.type = DM_IO_VMA;
req.mem.ptr.vma = (char *)wc->memory_map;
diff --git a/include/linux/dm-bufio.h b/include/linux/dm-bufio.h
index 75e7d8cbb532..39e939bf0419 100644
--- a/include/linux/dm-bufio.h
+++ b/include/linux/dm-bufio.h
@@ -11,6 +11,7 @@
#define _LINUX_DM_BUFIO_H

#include <linux/blkdev.h>
+#include <linux/ioprio.h>
#include <linux/types.h>

/*----------------------------------------------------------------*/
@@ -64,6 +65,9 @@ void dm_bufio_set_sector_offset(struct dm_bufio_client *c, sector_t start);
void *dm_bufio_read(struct dm_bufio_client *c, sector_t block,
struct dm_buffer **bp);

+void *dm_bufio_read_by_ioprio(struct dm_bufio_client *c, sector_t block,
+ struct dm_buffer **bp, unsigned short ioprio);
+
/*
* Like dm_bufio_read, but return buffer from cache, don't read
* it. If the buffer is not in the cache, return NULL.
@@ -86,6 +90,8 @@ void *dm_bufio_new(struct dm_bufio_client *c, sector_t block,
void dm_bufio_prefetch(struct dm_bufio_client *c,
sector_t block, unsigned int n_blocks);

+void dm_bufio_prefetch_by_ioprio(struct dm_bufio_client *c,
+ sector_t block, unsigned int n_blocks, unsigned short ioprio);
/*
* Release a reference obtained with dm_bufio_{read,get,new}. The data
* pointer and dm_buffer pointer is no longer valid after this call.
diff --git a/include/linux/dm-io.h b/include/linux/dm-io.h
index 7595142f3fc5..227ee6d77c70 100644
--- a/include/linux/dm-io.h
+++ b/include/linux/dm-io.h
@@ -20,6 +20,8 @@ struct dm_io_region {
struct block_device *bdev;
sector_t sector;
sector_t count; /* If this is zero the region is ignored. */
+ /* Set it to IOPRIO_DEFAULT if you don't know what value to set */
+ unsigned short ioprio;
};

struct page_list {
--
2.34.1

2023-12-08 01:54:42

by Eric Biggers

Subject: Re: [PATCH v2] dm verity: Inherit I/O priority from data I/O when read FEC and hash from disk

On Wed, Dec 06, 2023 at 07:39:35PM +0800, Hongyu Jin wrote:
> From: Hongyu Jin <[email protected]>
>
> When reading FEC and hash blocks from disk, their I/O priority is
> inconsistent with that of the data block, so they can be blocked by
> other I/O with lower priority.
>
> Add dm_bufio_prefetch_by_ioprio() and dm_bufio_read_by_ioprio(),
> which let callers specify the I/O priority for individual requests.
> Make the I/O for FEC and hash blocks use the same I/O priority as
> the data I/O.
>
> Co-developed-by: Yibin Ding <[email protected]>
> Signed-off-by: Yibin Ding <[email protected]>
> Signed-off-by: Hongyu Jin <[email protected]>
>
> ---
> Changes in v2:
> - Add ioprio field in struct dm_io_region
> - Initialize struct dm_io_region::ioprio to IOPRIO_DEFAULT
> - Add two interfaces
> ---
> drivers/md/dm-bufio.c | 50 ++++++++++++++++++++++-----------
> drivers/md/dm-integrity.c | 5 ++++
> drivers/md/dm-io.c | 1 +
> drivers/md/dm-log.c | 1 +
> drivers/md/dm-raid1.c | 2 ++
> drivers/md/dm-snap-persistent.c | 2 ++
> drivers/md/dm-verity-fec.c | 3 +-
> drivers/md/dm-verity-target.c | 10 +++++--
> drivers/md/dm-writecache.c | 4 +++
> include/linux/dm-bufio.h | 6 ++++
> include/linux/dm-io.h | 2 ++
> 11 files changed, 66 insertions(+), 20 deletions(-)

Changing so many things in one patch should be avoided if possible. Is there a
way to split this patch up? Maybe first add ioprio support to dm-io, then add
ioprio support to dm-bufio, then make dm-verity set the correct ioprio?

> void *dm_bufio_read(struct dm_bufio_client *c, sector_t block,
> struct dm_buffer **bp)
> +{
> + return dm_bufio_read_by_ioprio(c, block, bp, IOPRIO_DEFAULT);
> +}
> +EXPORT_SYMBOL_GPL(dm_bufio_read);
> +
> +void *dm_bufio_read_by_ioprio(struct dm_bufio_client *c, sector_t block,
> + struct dm_buffer **bp, unsigned short ioprio)
> {
> if (WARN_ON_ONCE(dm_bufio_in_request()))
> return ERR_PTR(-EINVAL);
>
> - return new_read(c, block, NF_READ, bp);
> + return new_read(c, block, NF_READ, bp, ioprio);
> }
> -EXPORT_SYMBOL_GPL(dm_bufio_read);
> +EXPORT_SYMBOL_GPL(dm_bufio_read_by_ioprio);
>
> void *dm_bufio_new(struct dm_bufio_client *c, sector_t block,
> struct dm_buffer **bp)
> @@ -1909,12 +1918,19 @@ void *dm_bufio_new(struct dm_bufio_client *c, sector_t block,
> if (WARN_ON_ONCE(dm_bufio_in_request()))
> return ERR_PTR(-EINVAL);
>
> - return new_read(c, block, NF_FRESH, bp);
> + return new_read(c, block, NF_FRESH, bp, IOPRIO_DEFAULT);
> }
> EXPORT_SYMBOL_GPL(dm_bufio_new);
>
> void dm_bufio_prefetch(struct dm_bufio_client *c,
> sector_t block, unsigned int n_blocks)
> +{
> + return dm_bufio_prefetch_by_ioprio(c, block, n_blocks, IOPRIO_DEFAULT);
> +}
> +EXPORT_SYMBOL_GPL(dm_bufio_prefetch);
> +
> +void dm_bufio_prefetch_by_ioprio(struct dm_bufio_client *c,
> + sector_t block, unsigned int n_blocks, unsigned short ioprio)

I think it would be cleaner to just add the ioprio parameter to dm_bufio_read()
and dm_bufio_prefetch(), instead of adding new functions.
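
I.e., prototypes along the lines of:

void *dm_bufio_read(struct dm_bufio_client *c, sector_t block,
		    struct dm_buffer **bp, unsigned short ioprio);

void dm_bufio_prefetch(struct dm_bufio_client *c,
		       sector_t block, unsigned int n_blocks,
		       unsigned short ioprio);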

> diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
> index 26adcfea0302..5945ac1dfdff 100644
> --- a/drivers/md/dm-verity-target.c
> +++ b/drivers/md/dm-verity-target.c
> @@ -51,6 +51,7 @@ static DEFINE_STATIC_KEY_FALSE(use_tasklet_enabled);
> struct dm_verity_prefetch_work {
> struct work_struct work;
> struct dm_verity *v;
> + struct dm_verity_io *io;
> sector_t block;
> unsigned int n_blocks;
> };

Isn't it possible for 'io' to complete and be freed while the prefetch work is
still running?
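
If only the priority is needed, copying it by value when the work is
queued would sidestep that, e.g.:

struct dm_verity_prefetch_work {
	struct work_struct work;
	struct dm_verity *v;
	unsigned short ioprio;	/* copied from the bio; no io pointer kept */
	sector_t block;
	unsigned int n_blocks;
};

and, before queue_work():

	pw->ioprio = bio_prio(bio);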

- Eric

2023-12-08 20:44:31

by Eric Wheeler

Subject: Re: [PATCH v2] dm verity: Inherit I/O priority from data I/O when read FEC and hash from disk

On Thu, 7 Dec 2023, Eric Biggers wrote:
> On Wed, Dec 06, 2023 at 07:39:35PM +0800, Hongyu Jin wrote:
> > From: Hongyu Jin <[email protected]>
> >
> > When reading FEC and hash blocks from disk, their I/O priority is
> > inconsistent with that of the data block, so they can be blocked by
> > other I/O with lower priority.
> >
> > Add dm_bufio_prefetch_by_ioprio() and dm_bufio_read_by_ioprio(),
> > which let callers specify the I/O priority for individual requests.
> > Make the I/O for FEC and hash blocks use the same I/O priority as
> > the data I/O.

Hi Hongyu,

+1 for the feature, thank you for cleaning up ioprio in device mapper!

A few years ago we proposed a similar patch for dm-crypt; however, it
was never committed, and I did not have the time to shepherd it through.
Maybe this has since been addressed in some other way, or perhaps your
work solves what we were doing with dm-crypt; either way, here is the
link to that thread in case it is relevant to your work:
https://www.mail-archive.com/[email protected]/msg03828.html

I look forward to seeing all (or at least the most common) device mapper
targets cleanly support ioprio.

Cheers,

--
Eric Wheeler




> > Co-developed-by: Yibin Ding <[email protected]>
> > Signed-off-by: Yibin Ding <[email protected]>
> > Signed-off-by: Hongyu Jin <[email protected]>
> >
> > ---
> > Changes in v2:
> > - Add ioprio field in struct dm_io_region
> > - Initialize struct dm_io_region::ioprio to IOPRIO_DEFAULT
> > - Add two interfaces
> > ---
> > drivers/md/dm-bufio.c | 50 ++++++++++++++++++++++-----------
> > drivers/md/dm-integrity.c | 5 ++++
> > drivers/md/dm-io.c | 1 +
> > drivers/md/dm-log.c | 1 +
> > drivers/md/dm-raid1.c | 2 ++
> > drivers/md/dm-snap-persistent.c | 2 ++
> > drivers/md/dm-verity-fec.c | 3 +-
> > drivers/md/dm-verity-target.c | 10 +++++--
> > drivers/md/dm-writecache.c | 4 +++
> > include/linux/dm-bufio.h | 6 ++++
> > include/linux/dm-io.h | 2 ++
> > 11 files changed, 66 insertions(+), 20 deletions(-)
>
> Changing so many things in one patch should be avoided if possible. Is there a
> way to split this patch up? Maybe first add ioprio support to dm-io, then add
> ioprio support to dm-bufio, then make dm-verity set the correct ioprio?
>
> > void *dm_bufio_read(struct dm_bufio_client *c, sector_t block,
> > struct dm_buffer **bp)
> > +{
> > + return dm_bufio_read_by_ioprio(c, block, bp, IOPRIO_DEFAULT);
> > +}
> > +EXPORT_SYMBOL_GPL(dm_bufio_read);
> > +
> > +void *dm_bufio_read_by_ioprio(struct dm_bufio_client *c, sector_t block,
> > + struct dm_buffer **bp, unsigned short ioprio)
> > {
> > if (WARN_ON_ONCE(dm_bufio_in_request()))
> > return ERR_PTR(-EINVAL);
> >
> > - return new_read(c, block, NF_READ, bp);
> > + return new_read(c, block, NF_READ, bp, ioprio);
> > }
> > -EXPORT_SYMBOL_GPL(dm_bufio_read);
> > +EXPORT_SYMBOL_GPL(dm_bufio_read_by_ioprio);
> >
> > void *dm_bufio_new(struct dm_bufio_client *c, sector_t block,
> > struct dm_buffer **bp)
> > @@ -1909,12 +1918,19 @@ void *dm_bufio_new(struct dm_bufio_client *c, sector_t block,
> > if (WARN_ON_ONCE(dm_bufio_in_request()))
> > return ERR_PTR(-EINVAL);
> >
> > - return new_read(c, block, NF_FRESH, bp);
> > + return new_read(c, block, NF_FRESH, bp, IOPRIO_DEFAULT);
> > }
> > EXPORT_SYMBOL_GPL(dm_bufio_new);
> >
> > void dm_bufio_prefetch(struct dm_bufio_client *c,
> > sector_t block, unsigned int n_blocks)
> > +{
> > + return dm_bufio_prefetch_by_ioprio(c, block, n_blocks, IOPRIO_DEFAULT);
> > +}
> > +EXPORT_SYMBOL_GPL(dm_bufio_prefetch);
> > +
> > +void dm_bufio_prefetch_by_ioprio(struct dm_bufio_client *c,
> > + sector_t block, unsigned int n_blocks, unsigned short ioprio)
>
> I think it would be cleaner to just add the ioprio parameter to dm_bufio_read()
> and dm_bufio_prefetch(), instead of adding new functions.
>
> > diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
> > index 26adcfea0302..5945ac1dfdff 100644
> > --- a/drivers/md/dm-verity-target.c
> > +++ b/drivers/md/dm-verity-target.c
> > @@ -51,6 +51,7 @@ static DEFINE_STATIC_KEY_FALSE(use_tasklet_enabled);
> > struct dm_verity_prefetch_work {
> > struct work_struct work;
> > struct dm_verity *v;
> > + struct dm_verity_io *io;
> > sector_t block;
> > unsigned int n_blocks;
> > };
>
> Isn't it possible for 'io' to complete and be freed while the prefetch work is
> still running?
>
> - Eric
>
>

2023-12-11 09:01:31

by Hongyu Jin

Subject: [PATCH v3 0/5] Fix I/O priority lost in device-mapper

From: Hongyu Jin <[email protected]>

Changes in v3:
- Split the patch per device-mapper component
- Add a patch to fix the dm-crypt I/O priority issue
- Include the block patch so it can be reviewed together
- Fix some errors in the v2 patch

Changes in v2:
- Add ioprio field in struct dm_io_region
- Initialize struct dm_io_region::ioprio to IOPRIO_DEFAULT
- Add two interfaces


Hongyu Jin (5):
block: Optimize bio io priority setting
dm: Support I/O priority for dm_io()
dm-bufio: Support I/O priority
dm verity: Fix I/O priority lost when read FEC and hash
dm-crypt: Fix lost ioprio when queuing write bios

block/blk-core.c | 10 ++++++
block/blk-mq.c | 11 ------
drivers/md/dm-bufio.c | 36 ++++++++++---------
drivers/md/dm-crypt.c | 1 +
drivers/md/dm-ebs-target.c | 8 ++---
drivers/md/dm-integrity.c | 7 +++-
drivers/md/dm-io.c | 1 +
drivers/md/dm-log.c | 1 +
drivers/md/dm-raid1.c | 2 ++
drivers/md/dm-snap-persistent.c | 5 +--
drivers/md/dm-verity-fec.c | 5 +--
drivers/md/dm-verity-target.c | 8 +++--
drivers/md/dm-writecache.c | 4 +++
drivers/md/persistent-data/dm-block-manager.c | 6 ++--
include/linux/dm-bufio.h | 6 ++--
include/linux/dm-io.h | 2 ++
16 files changed, 69 insertions(+), 44 deletions(-)

--
2.34.1

2023-12-11 09:01:32

by Hongyu Jin

Subject: [PATCH v3 1/5] block: Optimize bio io priority setting

From: Hongyu Jin <[email protected]>

Currently bio_set_ioprio() is called for each cloned and split bio,
and the I/O priority cannot be passed to modules that implement
struct gendisk::fops::submit_bio, such as device-mapper.

Move bio_set_ioprio() into submit_bio() so it is called only once to
set the priority of the original bio; cloned and split bios then
inherit the original bio's priority automatically during cloning.

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
block/blk-core.c | 10 ++++++++++
block/blk-mq.c | 11 -----------
2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index fdf25b8d6e78..68158c327aea 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -49,6 +49,7 @@
#include "blk-pm.h"
#include "blk-cgroup.h"
#include "blk-throttle.h"
+#include "blk-ioprio.h"

struct dentry *blk_debugfs_root;

@@ -809,6 +810,14 @@ void submit_bio_noacct(struct bio *bio)
}
EXPORT_SYMBOL(submit_bio_noacct);

+static void bio_set_ioprio(struct bio *bio)
+{
+ /* Nobody set ioprio so far? Initialize it based on task's nice value */
+ if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE)
+ bio->bi_ioprio = get_current_ioprio();
+ blkcg_set_ioprio(bio);
+}
+
/**
* submit_bio - submit a bio to the block device layer for I/O
* @bio: The &struct bio which describes the I/O
@@ -831,6 +840,7 @@ void submit_bio(struct bio *bio)
count_vm_events(PGPGOUT, bio_sectors(bio));
}

+ bio_set_ioprio(bio);
submit_bio_noacct(bio);
}
EXPORT_SYMBOL(submit_bio);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index e2d11183f62e..a6e2609df9c9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -40,7 +40,6 @@
#include "blk-stat.h"
#include "blk-mq-sched.h"
#include "blk-rq-qos.h"
-#include "blk-ioprio.h"

static DEFINE_PER_CPU(struct llist_head, blk_cpu_done);
static DEFINE_PER_CPU(call_single_data_t, blk_cpu_csd);
@@ -2922,14 +2921,6 @@ static inline struct request *blk_mq_get_cached_request(struct request_queue *q,
return rq;
}

-static void bio_set_ioprio(struct bio *bio)
-{
- /* Nobody set ioprio so far? Initialize it based on task's nice value */
- if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE)
- bio->bi_ioprio = get_current_ioprio();
- blkcg_set_ioprio(bio);
-}
-
/**
* blk_mq_submit_bio - Create and send a request to block device.
* @bio: Bio pointer.
@@ -2963,8 +2954,6 @@ void blk_mq_submit_bio(struct bio *bio)
if (!bio_integrity_prep(bio))
return;

- bio_set_ioprio(bio);
-
rq = blk_mq_get_cached_request(q, plug, &bio, nr_segs);
if (!rq) {
if (!bio)
--
2.34.1

2023-12-11 09:01:34

by Hongyu Jin

Subject: [PATCH v3 4/5] dm verity: Fix I/O priority lost when read FEC and hash

From: Hongyu Jin <[email protected]>

When FEC and hash blocks are read from disk, their I/O priority is
inconsistent with that of the data block, so they can be blocked by
other I/O with lower priority.

Fix this by making the I/O for FEC and hash blocks use the same I/O
priority as the original data I/O.

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-verity-fec.c | 3 ++-
drivers/md/dm-verity-target.c | 8 ++++++--
2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c
index 715173cbf0ee..6a5a679e7e8a 100644
--- a/drivers/md/dm-verity-fec.c
+++ b/drivers/md/dm-verity-fec.c
@@ -209,6 +209,7 @@ static int fec_read_bufs(struct dm_verity *v, struct dm_verity_io *io,
u8 *bbuf, *rs_block;
u8 want_digest[HASH_MAX_DIGESTSIZE];
unsigned int n, k;
+ struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);

if (neras)
*neras = 0;
@@ -247,7 +248,7 @@ static int fec_read_bufs(struct dm_verity *v, struct dm_verity_io *io,
bufio = v->bufio;
}

- bbuf = dm_bufio_read(bufio, block, &buf, IOPRIO_DEFAULT);
+ bbuf = dm_bufio_read(bufio, block, &buf, bio_prio(bio));
if (IS_ERR(bbuf)) {
DMWARN_LIMIT("%s: FEC %llu: read failed (%llu): %ld",
v->data_dev->name,
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 0038e168f3d7..8c911b6722ce 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -51,6 +51,7 @@ static DEFINE_STATIC_KEY_FALSE(use_tasklet_enabled);
struct dm_verity_prefetch_work {
struct work_struct work;
struct dm_verity *v;
+ unsigned short ioprio;
sector_t block;
unsigned int n_blocks;
};
@@ -293,6 +294,7 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
int r;
sector_t hash_block;
unsigned int offset;
+ struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);

verity_hash_at_level(v, block, level, &hash_block, &offset);

@@ -307,7 +309,7 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
return -EAGAIN;
}
} else
- data = dm_bufio_read(v->bufio, hash_block, &buf, IOPRIO_DEFAULT);
+ data = dm_bufio_read(v->bufio, hash_block, &buf, bio_prio(bio));

if (IS_ERR(data))
return PTR_ERR(data);
@@ -717,7 +719,7 @@ static void verity_prefetch_io(struct work_struct *work)
}
no_prefetch_cluster:
dm_bufio_prefetch(v->bufio, hash_block_start,
- hash_block_end - hash_block_start + 1, IOPRIO_DEFAULT);
+ hash_block_end - hash_block_start + 1, pw->ioprio);
}

kfree(pw);
@@ -728,6 +730,7 @@ static void verity_submit_prefetch(struct dm_verity *v, struct dm_verity_io *io)
sector_t block = io->block;
unsigned int n_blocks = io->n_blocks;
struct dm_verity_prefetch_work *pw;
+ struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);

if (v->validated_blocks) {
while (n_blocks && test_bit(block, v->validated_blocks)) {
@@ -751,6 +754,7 @@ static void verity_submit_prefetch(struct dm_verity *v, struct dm_verity_io *io)
pw->v = v;
pw->block = block;
pw->n_blocks = n_blocks;
+ pw->ioprio = bio_prio(bio);
queue_work(v->verify_wq, &pw->work);
}

--
2.34.1

2023-12-11 09:01:36

by Hongyu Jin

Subject: [PATCH v3 2/5] dm: Support I/O priority for dm_io()

From: Hongyu Jin <[email protected]>

Add an ioprio field to struct dm_io_region so that callers can
specify the I/O priority when calling dm_io().
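
A caller then fills in the region like this (illustrative; the values
depend on the target):

	struct dm_io_region io_loc = {
		.bdev = dev->bdev,
		.sector = sector,
		.count = n_sectors,
		.ioprio = bio_prio(bio),	/* or IOPRIO_DEFAULT */
	};

	r = dm_io(&io_req, 1, &io_loc, NULL);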

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-bufio.c | 3 +++
drivers/md/dm-integrity.c | 5 +++++
drivers/md/dm-io.c | 1 +
drivers/md/dm-log.c | 1 +
drivers/md/dm-raid1.c | 2 ++
drivers/md/dm-snap-persistent.c | 1 +
drivers/md/dm-writecache.c | 4 ++++
include/linux/dm-io.h | 2 ++
8 files changed, 19 insertions(+)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 62eb27639c9b..7f82262aed54 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1269,6 +1269,7 @@ static void use_dmio(struct dm_buffer *b, enum req_op op, sector_t sector,
.bdev = b->c->bdev,
.sector = sector,
.count = n_sectors,
+ .ioprio = IOPRIO_DEFAULT,
};

if (b->data_mode != DATA_MODE_VMALLOC) {
@@ -2125,6 +2126,7 @@ int dm_bufio_issue_flush(struct dm_bufio_client *c)
.bdev = c->bdev,
.sector = 0,
.count = 0,
+ .ioprio = IOPRIO_DEFAULT,
};

if (WARN_ON_ONCE(dm_bufio_in_request()))
@@ -2149,6 +2151,7 @@ int dm_bufio_issue_discard(struct dm_bufio_client *c, sector_t block, sector_t c
.bdev = c->bdev,
.sector = block_to_sector(c, block),
.count = block_to_sector(c, count),
+ .ioprio = IOPRIO_DEFAULT,
};

if (WARN_ON_ONCE(dm_bufio_in_request()))
diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index e85c688fd91e..7cba183abdce 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -543,6 +543,7 @@ static int sync_rw_sb(struct dm_integrity_c *ic, blk_opf_t opf)
io_loc.bdev = ic->meta_dev ? ic->meta_dev->bdev : ic->dev->bdev;
io_loc.sector = ic->start;
io_loc.count = SB_SECTORS;
+ io_loc.ioprio = IOPRIO_DEFAULT;

if (op == REQ_OP_WRITE) {
sb_set_version(ic);
@@ -1070,6 +1071,7 @@ static void rw_journal_sectors(struct dm_integrity_c *ic, blk_opf_t opf,
io_loc.bdev = ic->meta_dev ? ic->meta_dev->bdev : ic->dev->bdev;
io_loc.sector = ic->start + SB_SECTORS + sector;
io_loc.count = n_sectors;
+ io_loc.ioprio = IOPRIO_DEFAULT;

r = dm_io(&io_req, 1, &io_loc, NULL);
if (unlikely(r)) {
@@ -1187,6 +1189,7 @@ static void copy_from_journal(struct dm_integrity_c *ic, unsigned int section, u
io_loc.bdev = ic->dev->bdev;
io_loc.sector = target;
io_loc.count = n_sectors;
+ io_loc.ioprio = IOPRIO_DEFAULT;

r = dm_io(&io_req, 1, &io_loc, NULL);
if (unlikely(r)) {
@@ -1515,6 +1518,7 @@ static void dm_integrity_flush_buffers(struct dm_integrity_c *ic, bool flush_dat
fr.io_reg.bdev = ic->dev->bdev,
fr.io_reg.sector = 0,
fr.io_reg.count = 0,
+ fr.io_reg.ioprio = IOPRIO_DEFAULT,
fr.ic = ic;
init_completion(&fr.comp);
r = dm_io(&fr.io_req, 1, &fr.io_reg, NULL);
@@ -2738,6 +2742,7 @@ static void integrity_recalc(struct work_struct *w)
io_loc.bdev = ic->dev->bdev;
io_loc.sector = get_data_sector(ic, area, offset);
io_loc.count = n_sectors;
+ io_loc.ioprio = IOPRIO_DEFAULT;

r = dm_io(&io_req, 1, &io_loc, NULL);
if (unlikely(r)) {
diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index f053ce245814..b40f0a432981 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -354,6 +354,7 @@ static void do_region(const blk_opf_t opf, unsigned int region,
&io->client->bios);
bio->bi_iter.bi_sector = where->sector + (where->count - remaining);
bio->bi_end_io = endio;
+ bio->bi_ioprio = where->ioprio;
store_io_and_region_in_bio(bio, io, region);

if (op == REQ_OP_DISCARD || op == REQ_OP_WRITE_ZEROES) {
diff --git a/drivers/md/dm-log.c b/drivers/md/dm-log.c
index f9f84236dfcd..e0dacdcd94f1 100644
--- a/drivers/md/dm-log.c
+++ b/drivers/md/dm-log.c
@@ -309,6 +309,7 @@ static int flush_header(struct log_c *lc)
.bdev = lc->header_location.bdev,
.sector = 0,
.count = 0,
+ .ioprio = IOPRIO_DEFAULT,
};

lc->io_req.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;
diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index ddcb2bc4a617..2de9b1377de3 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -275,6 +275,7 @@ static int mirror_flush(struct dm_target *ti)
io[i].bdev = m->dev->bdev;
io[i].sector = 0;
io[i].count = 0;
+ io[i].ioprio = IOPRIO_DEFAULT;
}

error_bits = -1;
@@ -475,6 +476,7 @@ static void map_region(struct dm_io_region *io, struct mirror *m,
io->bdev = m->dev->bdev;
io->sector = map_sector(m, bio);
io->count = bio_sectors(bio);
+ io->ioprio = bio_prio(bio);
}

static void hold_bio(struct mirror_set *ms, struct bio *bio)
diff --git a/drivers/md/dm-snap-persistent.c b/drivers/md/dm-snap-persistent.c
index 15649921f2a9..4aa70b71f1da 100644
--- a/drivers/md/dm-snap-persistent.c
+++ b/drivers/md/dm-snap-persistent.c
@@ -236,6 +236,7 @@ static int chunk_io(struct pstore *ps, void *area, chunk_t chunk, blk_opf_t opf,
.bdev = dm_snap_cow(ps->store->snap)->bdev,
.sector = ps->store->chunk_size * chunk,
.count = ps->store->chunk_size,
+ .ioprio = IOPRIO_DEFAULT,
};
struct dm_io_request io_req = {
.bi_opf = opf,
diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
index 074cb785eafc..135d1268246f 100644
--- a/drivers/md/dm-writecache.c
+++ b/drivers/md/dm-writecache.c
@@ -515,6 +515,7 @@ static void ssd_commit_flushed(struct dm_writecache *wc, bool wait_for_ios)
region.bdev = wc->ssd_dev->bdev;
region.sector = (sector_t)i * (BITMAP_GRANULARITY >> SECTOR_SHIFT);
region.count = (sector_t)(j - i) * (BITMAP_GRANULARITY >> SECTOR_SHIFT);
+ region.ioprio = IOPRIO_DEFAULT;

if (unlikely(region.sector >= wc->metadata_sectors))
break;
@@ -555,6 +556,7 @@ static void ssd_commit_superblock(struct dm_writecache *wc)
region.bdev = wc->ssd_dev->bdev;
region.sector = 0;
region.count = max(4096U, wc->block_size) >> SECTOR_SHIFT;
+ region.ioprio = IOPRIO_DEFAULT;

if (unlikely(region.sector + region.count > wc->metadata_sectors))
region.count = wc->metadata_sectors - region.sector;
@@ -590,6 +592,7 @@ static void writecache_disk_flush(struct dm_writecache *wc, struct dm_dev *dev)
region.bdev = dev->bdev;
region.sector = 0;
region.count = 0;
+ region.ioprio = IOPRIO_DEFAULT;
req.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;
req.mem.type = DM_IO_KMEM;
req.mem.ptr.addr = NULL;
@@ -984,6 +987,7 @@ static int writecache_read_metadata(struct dm_writecache *wc, sector_t n_sectors
region.bdev = wc->ssd_dev->bdev;
region.sector = wc->start_sector;
region.count = n_sectors;
+ region.ioprio = IOPRIO_DEFAULT;
req.bi_opf = REQ_OP_READ | REQ_SYNC;
req.mem.type = DM_IO_VMA;
req.mem.ptr.vma = (char *)wc->memory_map;
diff --git a/include/linux/dm-io.h b/include/linux/dm-io.h
index 7595142f3fc5..227ee6d77c70 100644
--- a/include/linux/dm-io.h
+++ b/include/linux/dm-io.h
@@ -20,6 +20,8 @@ struct dm_io_region {
struct block_device *bdev;
sector_t sector;
sector_t count; /* If this is zero the region is ignored. */
+ /* Set it to IOPRIO_DEFAULT if you don't know what value to set */
+ unsigned short ioprio;
};

struct page_list {
--
2.34.1

2023-12-11 09:01:39

by Hongyu Jin

Subject: [PATCH v3 3/5] dm-bufio: Support I/O priority

From: Hongyu Jin <[email protected]>

Add an I/O priority parameter to dm_bufio_read() and
dm_bufio_prefetch().
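
Callers that have no specific priority pass IOPRIO_DEFAULT, e.g.:

	bbuf = dm_bufio_read(bufio, block, &buf, IOPRIO_DEFAULT);
	dm_bufio_prefetch(bufio, block, 1, IOPRIO_DEFAULT);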

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-bufio.c | 33 ++++++++++---------
drivers/md/dm-ebs-target.c | 8 ++---
drivers/md/dm-integrity.c | 2 +-
drivers/md/dm-snap-persistent.c | 4 +--
drivers/md/dm-verity-fec.c | 4 +--
drivers/md/dm-verity-target.c | 4 +--
drivers/md/persistent-data/dm-block-manager.c | 6 ++--
include/linux/dm-bufio.h | 6 ++--
8 files changed, 34 insertions(+), 33 deletions(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 7f82262aed54..739f5dc52432 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1256,7 +1256,7 @@ static void dmio_complete(unsigned long error, void *context)
}

static void use_dmio(struct dm_buffer *b, enum req_op op, sector_t sector,
- unsigned int n_sectors, unsigned int offset)
+ unsigned int n_sectors, unsigned int offset, unsigned short ioprio)
{
int r;
struct dm_io_request io_req = {
@@ -1296,7 +1296,7 @@ static void bio_complete(struct bio *bio)
}

static void use_bio(struct dm_buffer *b, enum req_op op, sector_t sector,
- unsigned int n_sectors, unsigned int offset)
+ unsigned int n_sectors, unsigned int offset, unsigned short ioprio)
{
struct bio *bio;
char *ptr;
@@ -1304,13 +1304,14 @@ static void use_bio(struct dm_buffer *b, enum req_op op, sector_t sector,

bio = bio_kmalloc(1, GFP_NOWAIT | __GFP_NORETRY | __GFP_NOWARN);
if (!bio) {
- use_dmio(b, op, sector, n_sectors, offset);
+ use_dmio(b, op, sector, n_sectors, offset, ioprio);
return;
}
bio_init(bio, b->c->bdev, bio->bi_inline_vecs, 1, op);
bio->bi_iter.bi_sector = sector;
bio->bi_end_io = bio_complete;
bio->bi_private = b;
+ bio->bi_ioprio = ioprio;

ptr = (char *)b->data + offset;
len = n_sectors << SECTOR_SHIFT;
@@ -1333,7 +1334,7 @@ static inline sector_t block_to_sector(struct dm_bufio_client *c, sector_t block
return sector;
}

-static void submit_io(struct dm_buffer *b, enum req_op op,
+static void submit_io(struct dm_buffer *b, enum req_op op, unsigned short ioprio,
void (*end_io)(struct dm_buffer *, blk_status_t))
{
unsigned int n_sectors;
@@ -1363,9 +1364,9 @@ static void submit_io(struct dm_buffer *b, enum req_op op,
}

if (b->data_mode != DATA_MODE_VMALLOC)
- use_bio(b, op, sector, n_sectors, offset);
+ use_bio(b, op, sector, n_sectors, offset, ioprio);
else
- use_dmio(b, op, sector, n_sectors, offset);
+ use_dmio(b, op, sector, n_sectors, offset, ioprio);
}

/*
@@ -1421,7 +1422,7 @@ static void __write_dirty_buffer(struct dm_buffer *b,
b->write_end = b->dirty_end;

if (!write_list)
- submit_io(b, REQ_OP_WRITE, write_endio);
+ submit_io(b, REQ_OP_WRITE, IOPRIO_DEFAULT, write_endio);
else
list_add_tail(&b->write_list, write_list);
}
@@ -1435,7 +1436,7 @@ static void __flush_write_list(struct list_head *write_list)
struct dm_buffer *b =
list_entry(write_list->next, struct dm_buffer, write_list);
list_del(&b->write_list);
- submit_io(b, REQ_OP_WRITE, write_endio);
+ submit_io(b, REQ_OP_WRITE, IOPRIO_DEFAULT, write_endio);
cond_resched();
}
blk_finish_plug(&plug);
@@ -1817,7 +1818,7 @@ static void read_endio(struct dm_buffer *b, blk_status_t status)
* and uses dm_bufio_mark_buffer_dirty to write new data back).
*/
static void *new_read(struct dm_bufio_client *c, sector_t block,
- enum new_flag nf, struct dm_buffer **bp)
+ enum new_flag nf, struct dm_buffer **bp, unsigned short ioprio)
{
int need_submit = 0;
struct dm_buffer *b;
@@ -1870,7 +1871,7 @@ static void *new_read(struct dm_bufio_client *c, sector_t block,
return NULL;

if (need_submit)
- submit_io(b, REQ_OP_READ, read_endio);
+ submit_io(b, REQ_OP_READ, ioprio, read_endio);

wait_on_bit_io(&b->state, B_READING, TASK_UNINTERRUPTIBLE);

@@ -1890,17 +1891,17 @@ static void *new_read(struct dm_bufio_client *c, sector_t block,
void *dm_bufio_get(struct dm_bufio_client *c, sector_t block,
struct dm_buffer **bp)
{
- return new_read(c, block, NF_GET, bp);
+ return new_read(c, block, NF_GET, bp, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_get);

void *dm_bufio_read(struct dm_bufio_client *c, sector_t block,
- struct dm_buffer **bp)
+ struct dm_buffer **bp, unsigned short ioprio)
{
if (WARN_ON_ONCE(dm_bufio_in_request()))
return ERR_PTR(-EINVAL);

- return new_read(c, block, NF_READ, bp);
+ return new_read(c, block, NF_READ, bp, ioprio);
}
EXPORT_SYMBOL_GPL(dm_bufio_read);

@@ -1910,12 +1911,12 @@ void *dm_bufio_new(struct dm_bufio_client *c, sector_t block,
if (WARN_ON_ONCE(dm_bufio_in_request()))
return ERR_PTR(-EINVAL);

- return new_read(c, block, NF_FRESH, bp);
+ return new_read(c, block, NF_FRESH, bp, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_new);

void dm_bufio_prefetch(struct dm_bufio_client *c,
- sector_t block, unsigned int n_blocks)
+ sector_t block, unsigned int n_blocks, unsigned short ioprio)
{
struct blk_plug plug;

@@ -1951,7 +1952,7 @@ void dm_bufio_prefetch(struct dm_bufio_client *c,
dm_bufio_unlock(c);

if (need_submit)
- submit_io(b, REQ_OP_READ, read_endio);
+ submit_io(b, REQ_OP_READ, ioprio, read_endio);
dm_bufio_release(b);

cond_resched();
diff --git a/drivers/md/dm-ebs-target.c b/drivers/md/dm-ebs-target.c
index 435b45201f4d..8198c8a7b416 100644
--- a/drivers/md/dm-ebs-target.c
+++ b/drivers/md/dm-ebs-target.c
@@ -84,7 +84,7 @@ static int __ebs_rw_bvec(struct ebs_c *ec, enum req_op op, struct bio_vec *bv,

/* Avoid reading for writes in case bio vector's page overwrites block completely. */
if (op == REQ_OP_READ || buf_off || bv_len < dm_bufio_get_block_size(ec->bufio))
- ba = dm_bufio_read(ec->bufio, block, &b);
+ ba = dm_bufio_read(ec->bufio, block, &b, IOPRIO_DEFAULT);
else
ba = dm_bufio_new(ec->bufio, block, &b);

@@ -194,13 +194,13 @@ static void __ebs_process_bios(struct work_struct *ws)
bio_list_for_each(bio, &bios) {
block1 = __sector_to_block(ec, bio->bi_iter.bi_sector);
if (bio_op(bio) == REQ_OP_READ)
- dm_bufio_prefetch(ec->bufio, block1, __nr_blocks(ec, bio));
+ dm_bufio_prefetch(ec->bufio, block1, __nr_blocks(ec, bio), IOPRIO_DEFAULT);
else if (bio_op(bio) == REQ_OP_WRITE && !(bio->bi_opf & REQ_PREFLUSH)) {
block2 = __sector_to_block(ec, bio_end_sector(bio));
if (__block_mod(bio->bi_iter.bi_sector, ec->u_bs))
- dm_bufio_prefetch(ec->bufio, block1, 1);
+ dm_bufio_prefetch(ec->bufio, block1, 1, IOPRIO_DEFAULT);
if (__block_mod(bio_end_sector(bio), ec->u_bs) && block2 != block1)
- dm_bufio_prefetch(ec->bufio, block2, 1);
+ dm_bufio_prefetch(ec->bufio, block2, 1, IOPRIO_DEFAULT);
}
}

diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index 7cba183abdce..a2853c24a259 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -1421,7 +1421,7 @@ static int dm_integrity_rw_tag(struct dm_integrity_c *ic, unsigned char *tag, se
if (unlikely(r))
return r;

- data = dm_bufio_read(ic->bufio, *metadata_block, &b);
+ data = dm_bufio_read(ic->bufio, *metadata_block, &b, IOPRIO_DEFAULT);
if (IS_ERR(data))
return PTR_ERR(data);

diff --git a/drivers/md/dm-snap-persistent.c b/drivers/md/dm-snap-persistent.c
index 4aa70b71f1da..eb6943fc7024 100644
--- a/drivers/md/dm-snap-persistent.c
+++ b/drivers/md/dm-snap-persistent.c
@@ -525,7 +525,7 @@ static int read_exceptions(struct pstore *ps,

if (unlikely(pf_chunk >= dm_bufio_get_device_size(client)))
break;
- dm_bufio_prefetch(client, pf_chunk, 1);
+ dm_bufio_prefetch(client, pf_chunk, 1, IOPRIO_DEFAULT);
prefetch_area++;
if (unlikely(!prefetch_area))
break;
@@ -534,7 +534,7 @@ static int read_exceptions(struct pstore *ps,

chunk = area_location(ps, ps->current_area);

- area = dm_bufio_read(client, chunk, &bp);
+ area = dm_bufio_read(client, chunk, &bp, IOPRIO_DEFAULT);
if (IS_ERR(area)) {
r = PTR_ERR(area);
goto ret_destroy_bufio;
diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c
index 3ef9f018da60..715173cbf0ee 100644
--- a/drivers/md/dm-verity-fec.c
+++ b/drivers/md/dm-verity-fec.c
@@ -68,7 +68,7 @@ static u8 *fec_read_parity(struct dm_verity *v, u64 rsb, int index,
block = div64_u64_rem(position, v->fec->io_size, &rem);
*offset = (unsigned int)rem;

- res = dm_bufio_read(v->fec->bufio, block, buf);
+ res = dm_bufio_read(v->fec->bufio, block, buf, IOPRIO_DEFAULT);
if (IS_ERR(res)) {
DMERR("%s: FEC %llu: parity read failed (block %llu): %ld",
v->data_dev->name, (unsigned long long)rsb,
@@ -247,7 +247,7 @@ static int fec_read_bufs(struct dm_verity *v, struct dm_verity_io *io,
bufio = v->bufio;
}

- bbuf = dm_bufio_read(bufio, block, &buf);
+ bbuf = dm_bufio_read(bufio, block, &buf, IOPRIO_DEFAULT);
if (IS_ERR(bbuf)) {
DMWARN_LIMIT("%s: FEC %llu: read failed (%llu): %ld",
v->data_dev->name,
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 26adcfea0302..0038e168f3d7 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -307,7 +307,7 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
return -EAGAIN;
}
} else
- data = dm_bufio_read(v->bufio, hash_block, &buf);
+ data = dm_bufio_read(v->bufio, hash_block, &buf, IOPRIO_DEFAULT);

if (IS_ERR(data))
return PTR_ERR(data);
@@ -717,7 +717,7 @@ static void verity_prefetch_io(struct work_struct *work)
}
no_prefetch_cluster:
dm_bufio_prefetch(v->bufio, hash_block_start,
- hash_block_end - hash_block_start + 1);
+ hash_block_end - hash_block_start + 1, IOPRIO_DEFAULT);
}

kfree(pw);
diff --git a/drivers/md/persistent-data/dm-block-manager.c b/drivers/md/persistent-data/dm-block-manager.c
index 0e010e1204aa..86a4f73d2f3d 100644
--- a/drivers/md/persistent-data/dm-block-manager.c
+++ b/drivers/md/persistent-data/dm-block-manager.c
@@ -474,7 +474,7 @@ int dm_bm_read_lock(struct dm_block_manager *bm, dm_block_t b,
void *p;
int r;

- p = dm_bufio_read(bm->bufio, b, (struct dm_buffer **) result);
+ p = dm_bufio_read(bm->bufio, b, (struct dm_buffer **) result, IOPRIO_DEFAULT);
if (IS_ERR(p))
return PTR_ERR(p);

@@ -510,7 +510,7 @@ int dm_bm_write_lock(struct dm_block_manager *bm,
if (dm_bm_is_read_only(bm))
return -EPERM;

- p = dm_bufio_read(bm->bufio, b, (struct dm_buffer **) result);
+ p = dm_bufio_read(bm->bufio, b, (struct dm_buffer **) result, IOPRIO_DEFAULT);
if (IS_ERR(p))
return PTR_ERR(p);

@@ -624,7 +624,7 @@ EXPORT_SYMBOL_GPL(dm_bm_flush);

void dm_bm_prefetch(struct dm_block_manager *bm, dm_block_t b)
{
- dm_bufio_prefetch(bm->bufio, b, 1);
+ dm_bufio_prefetch(bm->bufio, b, 1, IOPRIO_DEFAULT);
}

bool dm_bm_is_read_only(struct dm_block_manager *bm)
diff --git a/include/linux/dm-bufio.h b/include/linux/dm-bufio.h
index 75e7d8cbb532..d270d48891f7 100644
--- a/include/linux/dm-bufio.h
+++ b/include/linux/dm-bufio.h
@@ -11,6 +11,7 @@
#define _LINUX_DM_BUFIO_H

#include <linux/blkdev.h>
+#include <linux/ioprio.h>
#include <linux/types.h>

/*----------------------------------------------------------------*/
@@ -62,7 +63,7 @@ void dm_bufio_set_sector_offset(struct dm_bufio_client *c, sector_t start);
* it dirty.
*/
void *dm_bufio_read(struct dm_bufio_client *c, sector_t block,
- struct dm_buffer **bp);
+ struct dm_buffer **bp, unsigned short ioprio);

/*
* Like dm_bufio_read, but return buffer from cache, don't read
@@ -84,8 +85,7 @@ void *dm_bufio_new(struct dm_bufio_client *c, sector_t block,
* I/O to finish.
*/
void dm_bufio_prefetch(struct dm_bufio_client *c,
- sector_t block, unsigned int n_blocks);
-
+ sector_t block, unsigned int n_blocks, unsigned short ioprio);
/*
* Release a reference obtained with dm_bufio_{read,get,new}. The data
* pointer and dm_buffer pointer is no longer valid after this call.
--
2.34.1

2023-12-11 09:01:53

by Hongyu Jin

Subject: [PATCH v3 5/5] dm-crypt: Fix lost ioprio when queuing write bios

From: Hongyu Jin <[email protected]>

The ioprio of the originally submitted bio is retained in
struct dm_crypt_io::base_bio; set that ioprio on the cloned
bio for writes.

Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-crypt.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 6de107aff331..b67fec865f00 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1683,6 +1683,7 @@ static struct bio *crypt_alloc_buffer(struct dm_crypt_io *io, unsigned int size)
GFP_NOIO, &cc->bs);
clone->bi_private = io;
clone->bi_end_io = crypt_endio;
+ clone->bi_ioprio = bio_prio(io->base_bio);

remaining_size = size;

--
2.34.1

2023-12-11 20:32:43

by Eric Wheeler

Subject: Re: [PATCH v3 5/5] dm-crypt: Fix lost ioprio when queuing write bios

On Mon, 11 Dec 2023, Hongyu Jin wrote:
> From: Hongyu Jin <[email protected]>
>
> The ioprio of the originally submitted bio is retained in
> struct dm_crypt_io::base_bio; set that ioprio on the cloned
> bio for writes.
>
> Signed-off-by: Hongyu Jin <[email protected]>

Thanks,

Reviewed-by: Eric Wheeler <[email protected]>

> ---
> drivers/md/dm-crypt.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
> index 6de107aff331..b67fec865f00 100644
> --- a/drivers/md/dm-crypt.c
> +++ b/drivers/md/dm-crypt.c
> @@ -1683,6 +1683,7 @@ static struct bio *crypt_alloc_buffer(struct dm_crypt_io *io, unsigned int size)
> GFP_NOIO, &cc->bs);
> clone->bi_private = io;
> clone->bi_end_io = crypt_endio;
> + clone->bi_ioprio = bio_prio(io->base_bio);
>
> remaining_size = size;
>
> --
> 2.34.1
>
>
>

2023-12-11 21:12:56

by Mike Snitzer

Subject: Re: [PATCH v3 1/5] block: Optimize bio io priority setting

On Mon, Dec 11 2023 at 3:59P -0500,
Hongyu Jin <[email protected]> wrote:

> From: Hongyu Jin <[email protected]>
>
> Currently bio_set_ioprio() is called for each cloned and split bio,
> and the I/O priority cannot be passed to modules that implement
> struct gendisk::fops::submit_bio, such as device-mapper.
>
> Move bio_set_ioprio() into submit_bio() so it is called only once to
> set the priority of the original bio; cloned and split bios then
> inherit the original bio's priority automatically during cloning.
>
> Co-developed-by: Yibin Ding <[email protected]>
> Signed-off-by: Yibin Ding <[email protected]>
> Signed-off-by: Hongyu Jin <[email protected]>

This patch's subject needs fixing (this is a fix, not an optimization)
and the header needs fixing (various issues that make it hard to
read).

This should also be tagged with:
Fixes: a78418e6a04c9 ("block: Always initialize bio IO priority on submit")

(commit 82b74cac28493 was the commit immediately prior, which placed
the direct call incorrectly)

Reviewed-by: Mike Snitzer <[email protected]>

> ---
> block/blk-core.c | 10 ++++++++++
> block/blk-mq.c | 11 -----------
> 2 files changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index fdf25b8d6e78..68158c327aea 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -49,6 +49,7 @@
> #include "blk-pm.h"
> #include "blk-cgroup.h"
> #include "blk-throttle.h"
> +#include "blk-ioprio.h"
>
> struct dentry *blk_debugfs_root;
>
> @@ -809,6 +810,14 @@ void submit_bio_noacct(struct bio *bio)
> }
> EXPORT_SYMBOL(submit_bio_noacct);
>
> +static void bio_set_ioprio(struct bio *bio)
> +{
> + /* Nobody set ioprio so far? Initialize it based on task's nice value */
> + if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE)
> + bio->bi_ioprio = get_current_ioprio();
> + blkcg_set_ioprio(bio);
> +}
> +
> /**
> * submit_bio - submit a bio to the block device layer for I/O
> * @bio: The &struct bio which describes the I/O
> @@ -831,6 +840,7 @@ void submit_bio(struct bio *bio)
> count_vm_events(PGPGOUT, bio_sectors(bio));
> }
>
> + bio_set_ioprio(bio);
> submit_bio_noacct(bio);
> }
> EXPORT_SYMBOL(submit_bio);
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index e2d11183f62e..a6e2609df9c9 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -40,7 +40,6 @@
> #include "blk-stat.h"
> #include "blk-mq-sched.h"
> #include "blk-rq-qos.h"
> -#include "blk-ioprio.h"
>
> static DEFINE_PER_CPU(struct llist_head, blk_cpu_done);
> static DEFINE_PER_CPU(call_single_data_t, blk_cpu_csd);
> @@ -2922,14 +2921,6 @@ static inline struct request *blk_mq_get_cached_request(struct request_queue *q,
> return rq;
> }
>
> -static void bio_set_ioprio(struct bio *bio)
> -{
> - /* Nobody set ioprio so far? Initialize it based on task's nice value */
> - if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE)
> - bio->bi_ioprio = get_current_ioprio();
> - blkcg_set_ioprio(bio);
> -}
> -
> /**
> * blk_mq_submit_bio - Create and send a request to block device.
> * @bio: Bio pointer.
> @@ -2963,8 +2954,6 @@ void blk_mq_submit_bio(struct bio *bio)
> if (!bio_integrity_prep(bio))
> return;
>
> - bio_set_ioprio(bio);
> -
> rq = blk_mq_get_cached_request(q, plug, &bio, nr_segs);
> if (!rq) {
> if (!bio)
> --
> 2.34.1
>
>

2023-12-11 22:15:34

by Mike Snitzer

[permalink] [raw]
Subject: Re: [PATCH v3 5/5] dm-crypt: Fix lost ioprio when queuing write bios

On Mon, Dec 11 2023 at 4:00P -0500,
Hongyu Jin <[email protected]> wrote:

> From: Hongyu Jin <[email protected]>
>
> The original submitting bio's bi_ioprio setting is retained in
> struct dm_crypt_io::base_bio; copy it to the cloned bio on the
> write path.
>
> Signed-off-by: Hongyu Jin <[email protected]>
> ---
> drivers/md/dm-crypt.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
> index 6de107aff331..b67fec865f00 100644
> --- a/drivers/md/dm-crypt.c
> +++ b/drivers/md/dm-crypt.c
> @@ -1683,6 +1683,7 @@ static struct bio *crypt_alloc_buffer(struct dm_crypt_io *io, unsigned int size)
> GFP_NOIO, &cc->bs);
> clone->bi_private = io;
> clone->bi_end_io = crypt_endio;
> + clone->bi_ioprio = bio_prio(io->base_bio);

Weird use of bio_prio() wrapper given the assignment to
clone->bi_ioprio. I'd prefer:
clone->bi_ioprio = io->base_bio->bi_ioprio;

Some additional info to be mindful of:

This encryption bio has always been unique (ever since dm-crypt
stopped using the block layer's methods for cloning with 2007's commit
2f9941b6c55d7).

Prior to commit 2f9941b6c55d7, dm-crypt used to call __bio_clone() to
make sure not to miss cloning other capabilities -- and __bio_clone()
does exist again as of commit a0e8de798dd67 but it is private to bio.c
(in service to bio_alloc_clone, etc).

My point: because we aren't using traditional bio cloning (due to not
wanting to share the bio_vec) we also aren't transferring over the
cgroup (via bio_clone_blkg_association), etc.

That can be a secondary concern that you don't need to worry about
(but it is something Mikulas and I need to look at closer).

Mike

2023-12-12 11:13:38

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v4 0/5] Fix I/O priority lost in device-mapper

From: Hongyu Jin <[email protected]>

A high-priority task reads data from a dm-verity device with RT I/O
priority, but during verification the I/O that reads the FEC and hash
blocks is issued from a kworker, loses the RT priority, and is blocked
by low-priority I/O. dm-crypt has the same problem when writing data.

This happens because the io_context and blkcg are missing.

Move bio_set_ioprio() into submit_bio():
1. Call bio_set_ioprio() only once to set the priority of the original
bio; bios cloned and split from it automatically inherit that priority
during cloning (see the sketch after this list).

2. Pass the I/O priority of the original bio down to dm, so that each
dm target can inherit it as needed.
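
The inheritance claimed in point 1 comes from the block layer's clone
helpers copying bi_ioprio from the source bio. As a minimal sketch,
loosely modeled on __bio_clone() in block/bio.c (illustrative only;
the real helper copies more fields and its exact shape varies by
kernel version):

static void sketch_clone_inherits_ioprio(struct bio *dst,
					 struct bio *src)
{
	/*
	 * The clone path carries the priority over, so a bi_ioprio
	 * stamped once in submit_bio() follows the bio through every
	 * clone and split issued below it.
	 */
	dst->bi_ioprio = src->bi_ioprio;
	dst->bi_iter = src->bi_iter;
}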

Changes in v4:
- Modify commit messages as suggested
- Modify the dm-crypt patch
Changes in v3:
- Split the device-mapper patch
- Add a patch to fix the dm-crypt I/O priority issue
- Add the block patch so it can be reviewed together
- Fix some errors in the v2 patch

Changes in v2:
- Add an ioprio field to struct dm_io_region
- Initialize struct dm_io_region::ioprio to IOPRIO_DEFAULT
- Add two interfaces

Hongyu Jin (5):
block: Fix bio IO priority setting
dm: Support I/O priority for dm_io()
dm-bufio: Support I/O priority
dm verity: Fix I/O priority lost when read FEC and hash
dm-crypt: Fix lost ioprio when queuing write bios

block/blk-core.c | 10 ++++++
block/blk-mq.c | 11 ------
drivers/md/dm-bufio.c | 36 ++++++++++---------
drivers/md/dm-crypt.c | 1 +
drivers/md/dm-ebs-target.c | 8 ++---
drivers/md/dm-integrity.c | 7 +++-
drivers/md/dm-io.c | 1 +
drivers/md/dm-log.c | 1 +
drivers/md/dm-raid1.c | 2 ++
drivers/md/dm-snap-persistent.c | 5 +--
drivers/md/dm-verity-fec.c | 5 +--
drivers/md/dm-verity-target.c | 8 +++--
drivers/md/dm-writecache.c | 4 +++
drivers/md/persistent-data/dm-block-manager.c | 6 ++--
include/linux/dm-bufio.h | 6 ++--
include/linux/dm-io.h | 2 ++
16 files changed, 69 insertions(+), 44 deletions(-)

--
2.34.1

2023-12-12 11:13:55

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v4 1/5] block: Fix bio IO priority setting

From: Hongyu Jin <[email protected]>

Move bio_set_ioprio() into submit_bio():
1. Call bio_set_ioprio() only once to set the priority of the original
bio; bios cloned and split from it automatically inherit that priority
during cloning.

2. The I/O priority can now be passed to modules that implement
struct gendisk::fops::submit_bio, which helps resolve some of the
I/O priority loss issues.

This patch depends on commit 82b74cac2849 ("blk-ioprio: Convert from
rqos policy to direct call")

Fixes: a78418e6a04c ("block: Always initialize bio IO priority on submit")

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
block/blk-core.c | 10 ++++++++++
block/blk-mq.c | 11 -----------
2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index fdf25b8d6e78..68158c327aea 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -49,6 +49,7 @@
#include "blk-pm.h"
#include "blk-cgroup.h"
#include "blk-throttle.h"
+#include "blk-ioprio.h"

struct dentry *blk_debugfs_root;

@@ -809,6 +810,14 @@ void submit_bio_noacct(struct bio *bio)
}
EXPORT_SYMBOL(submit_bio_noacct);

+static void bio_set_ioprio(struct bio *bio)
+{
+ /* Nobody set ioprio so far? Initialize it based on task's nice value */
+ if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE)
+ bio->bi_ioprio = get_current_ioprio();
+ blkcg_set_ioprio(bio);
+}
+
/**
* submit_bio - submit a bio to the block device layer for I/O
* @bio: The &struct bio which describes the I/O
@@ -831,6 +840,7 @@ void submit_bio(struct bio *bio)
count_vm_events(PGPGOUT, bio_sectors(bio));
}

+ bio_set_ioprio(bio);
submit_bio_noacct(bio);
}
EXPORT_SYMBOL(submit_bio);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index e2d11183f62e..a6e2609df9c9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -40,7 +40,6 @@
#include "blk-stat.h"
#include "blk-mq-sched.h"
#include "blk-rq-qos.h"
-#include "blk-ioprio.h"

static DEFINE_PER_CPU(struct llist_head, blk_cpu_done);
static DEFINE_PER_CPU(call_single_data_t, blk_cpu_csd);
@@ -2922,14 +2921,6 @@ static inline struct request *blk_mq_get_cached_request(struct request_queue *q,
return rq;
}

-static void bio_set_ioprio(struct bio *bio)
-{
- /* Nobody set ioprio so far? Initialize it based on task's nice value */
- if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE)
- bio->bi_ioprio = get_current_ioprio();
- blkcg_set_ioprio(bio);
-}
-
/**
* blk_mq_submit_bio - Create and send a request to block device.
* @bio: Bio pointer.
@@ -2963,8 +2954,6 @@ void blk_mq_submit_bio(struct bio *bio)
if (!bio_integrity_prep(bio))
return;

- bio_set_ioprio(bio);
-
rq = blk_mq_get_cached_request(q, plug, &bio, nr_segs);
if (!rq) {
if (!bio)
--
2.34.1

2023-12-12 11:14:01

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v4 5/5] dm-crypt: Fix lost ioprio when queuing write bios

From: Hongyu Jin <[email protected]>

The original submitting bio's bi_ioprio setting is retained in
struct dm_crypt_io::base_bio; copy it to the cloned bio on the
write path.

Link: https://lore.kernel.org/dm-devel/[email protected]

Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-crypt.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 6de107aff331..7149da6555b8 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1683,6 +1683,7 @@ static struct bio *crypt_alloc_buffer(struct dm_crypt_io *io, unsigned int size)
GFP_NOIO, &cc->bs);
clone->bi_private = io;
clone->bi_end_io = crypt_endio;
+ clone->bi_ioprio = io->base_bio->bi_ioprio;

remaining_size = size;

--
2.34.1

2023-12-12 11:14:22

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v4 2/5] dm: Support I/O priority for dm_io()

From: Hongyu Jin <[email protected]>

Add an ioprio field to struct dm_io_region; this field specifies
the I/O priority to use when calling dm_io().

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-bufio.c | 3 +++
drivers/md/dm-integrity.c | 5 +++++
drivers/md/dm-io.c | 1 +
drivers/md/dm-log.c | 1 +
drivers/md/dm-raid1.c | 2 ++
drivers/md/dm-snap-persistent.c | 1 +
drivers/md/dm-writecache.c | 4 ++++
include/linux/dm-io.h | 2 ++
8 files changed, 19 insertions(+)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 62eb27639c9b..7f82262aed54 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1269,6 +1269,7 @@ static void use_dmio(struct dm_buffer *b, enum req_op op, sector_t sector,
.bdev = b->c->bdev,
.sector = sector,
.count = n_sectors,
+ .ioprio = IOPRIO_DEFAULT,
};

if (b->data_mode != DATA_MODE_VMALLOC) {
@@ -2125,6 +2126,7 @@ int dm_bufio_issue_flush(struct dm_bufio_client *c)
.bdev = c->bdev,
.sector = 0,
.count = 0,
+ .ioprio = IOPRIO_DEFAULT,
};

if (WARN_ON_ONCE(dm_bufio_in_request()))
@@ -2149,6 +2151,7 @@ int dm_bufio_issue_discard(struct dm_bufio_client *c, sector_t block, sector_t c
.bdev = c->bdev,
.sector = block_to_sector(c, block),
.count = block_to_sector(c, count),
+ .ioprio = IOPRIO_DEFAULT,
};

if (WARN_ON_ONCE(dm_bufio_in_request()))
diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index e85c688fd91e..7cba183abdce 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -543,6 +543,7 @@ static int sync_rw_sb(struct dm_integrity_c *ic, blk_opf_t opf)
io_loc.bdev = ic->meta_dev ? ic->meta_dev->bdev : ic->dev->bdev;
io_loc.sector = ic->start;
io_loc.count = SB_SECTORS;
+ io_loc.ioprio = IOPRIO_DEFAULT;

if (op == REQ_OP_WRITE) {
sb_set_version(ic);
@@ -1070,6 +1071,7 @@ static void rw_journal_sectors(struct dm_integrity_c *ic, blk_opf_t opf,
io_loc.bdev = ic->meta_dev ? ic->meta_dev->bdev : ic->dev->bdev;
io_loc.sector = ic->start + SB_SECTORS + sector;
io_loc.count = n_sectors;
+ io_loc.ioprio = IOPRIO_DEFAULT;

r = dm_io(&io_req, 1, &io_loc, NULL);
if (unlikely(r)) {
@@ -1187,6 +1189,7 @@ static void copy_from_journal(struct dm_integrity_c *ic, unsigned int section, u
io_loc.bdev = ic->dev->bdev;
io_loc.sector = target;
io_loc.count = n_sectors;
+ io_loc.ioprio = IOPRIO_DEFAULT;

r = dm_io(&io_req, 1, &io_loc, NULL);
if (unlikely(r)) {
@@ -1515,6 +1518,7 @@ static void dm_integrity_flush_buffers(struct dm_integrity_c *ic, bool flush_dat
fr.io_reg.bdev = ic->dev->bdev,
fr.io_reg.sector = 0,
fr.io_reg.count = 0,
+ fr.io_reg.ioprio = IOPRIO_DEFAULT,
fr.ic = ic;
init_completion(&fr.comp);
r = dm_io(&fr.io_req, 1, &fr.io_reg, NULL);
@@ -2738,6 +2742,7 @@ static void integrity_recalc(struct work_struct *w)
io_loc.bdev = ic->dev->bdev;
io_loc.sector = get_data_sector(ic, area, offset);
io_loc.count = n_sectors;
+ io_loc.ioprio = IOPRIO_DEFAULT;

r = dm_io(&io_req, 1, &io_loc, NULL);
if (unlikely(r)) {
diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index f053ce245814..b40f0a432981 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -354,6 +354,7 @@ static void do_region(const blk_opf_t opf, unsigned int region,
&io->client->bios);
bio->bi_iter.bi_sector = where->sector + (where->count - remaining);
bio->bi_end_io = endio;
+ bio->bi_ioprio = where->ioprio;
store_io_and_region_in_bio(bio, io, region);

if (op == REQ_OP_DISCARD || op == REQ_OP_WRITE_ZEROES) {
diff --git a/drivers/md/dm-log.c b/drivers/md/dm-log.c
index f9f84236dfcd..e0dacdcd94f1 100644
--- a/drivers/md/dm-log.c
+++ b/drivers/md/dm-log.c
@@ -309,6 +309,7 @@ static int flush_header(struct log_c *lc)
.bdev = lc->header_location.bdev,
.sector = 0,
.count = 0,
+ .ioprio = IOPRIO_DEFAULT,
};

lc->io_req.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;
diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index ddcb2bc4a617..2de9b1377de3 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -275,6 +275,7 @@ static int mirror_flush(struct dm_target *ti)
io[i].bdev = m->dev->bdev;
io[i].sector = 0;
io[i].count = 0;
+ io[i].ioprio = IOPRIO_DEFAULT;
}

error_bits = -1;
@@ -475,6 +476,7 @@ static void map_region(struct dm_io_region *io, struct mirror *m,
io->bdev = m->dev->bdev;
io->sector = map_sector(m, bio);
io->count = bio_sectors(bio);
+ io->ioprio = bio_prio(bio);
}

static void hold_bio(struct mirror_set *ms, struct bio *bio)
diff --git a/drivers/md/dm-snap-persistent.c b/drivers/md/dm-snap-persistent.c
index 15649921f2a9..4aa70b71f1da 100644
--- a/drivers/md/dm-snap-persistent.c
+++ b/drivers/md/dm-snap-persistent.c
@@ -236,6 +236,7 @@ static int chunk_io(struct pstore *ps, void *area, chunk_t chunk, blk_opf_t opf,
.bdev = dm_snap_cow(ps->store->snap)->bdev,
.sector = ps->store->chunk_size * chunk,
.count = ps->store->chunk_size,
+ .ioprio = IOPRIO_DEFAULT,
};
struct dm_io_request io_req = {
.bi_opf = opf,
diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
index 074cb785eafc..135d1268246f 100644
--- a/drivers/md/dm-writecache.c
+++ b/drivers/md/dm-writecache.c
@@ -515,6 +515,7 @@ static void ssd_commit_flushed(struct dm_writecache *wc, bool wait_for_ios)
region.bdev = wc->ssd_dev->bdev;
region.sector = (sector_t)i * (BITMAP_GRANULARITY >> SECTOR_SHIFT);
region.count = (sector_t)(j - i) * (BITMAP_GRANULARITY >> SECTOR_SHIFT);
+ region.ioprio = IOPRIO_DEFAULT;

if (unlikely(region.sector >= wc->metadata_sectors))
break;
@@ -555,6 +556,7 @@ static void ssd_commit_superblock(struct dm_writecache *wc)
region.bdev = wc->ssd_dev->bdev;
region.sector = 0;
region.count = max(4096U, wc->block_size) >> SECTOR_SHIFT;
+ region.ioprio = IOPRIO_DEFAULT;

if (unlikely(region.sector + region.count > wc->metadata_sectors))
region.count = wc->metadata_sectors - region.sector;
@@ -590,6 +592,7 @@ static void writecache_disk_flush(struct dm_writecache *wc, struct dm_dev *dev)
region.bdev = dev->bdev;
region.sector = 0;
region.count = 0;
+ region.ioprio = IOPRIO_DEFAULT;
req.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;
req.mem.type = DM_IO_KMEM;
req.mem.ptr.addr = NULL;
@@ -984,6 +987,7 @@ static int writecache_read_metadata(struct dm_writecache *wc, sector_t n_sectors
region.bdev = wc->ssd_dev->bdev;
region.sector = wc->start_sector;
region.count = n_sectors;
+ region.ioprio = IOPRIO_DEFAULT;
req.bi_opf = REQ_OP_READ | REQ_SYNC;
req.mem.type = DM_IO_VMA;
req.mem.ptr.vma = (char *)wc->memory_map;
diff --git a/include/linux/dm-io.h b/include/linux/dm-io.h
index 7595142f3fc5..227ee6d77c70 100644
--- a/include/linux/dm-io.h
+++ b/include/linux/dm-io.h
@@ -20,6 +20,8 @@ struct dm_io_region {
struct block_device *bdev;
sector_t sector;
sector_t count; /* If this is zero the region is ignored. */
+ /* Set it to IOPRIO_DEFAULT if you don't know what value to set */
+ unsigned short ioprio;
};

struct page_list {
--
2.34.1

2023-12-12 11:14:40

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v4 4/5] dm verity: Fix I/O priority lost when read FEC and hash

From: Hongyu Jin <[email protected]>

Fix the following problem: when the FEC and hash blocks are read from
disk, their I/O priority is inconsistent with that of the data block,
so they can be blocked by other I/O with lower priority.

Make the I/O for FEC and hash carry the same I/O priority as the
original data I/O.

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-verity-fec.c | 3 ++-
drivers/md/dm-verity-target.c | 8 ++++++--
2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c
index 715173cbf0ee..6a5a679e7e8a 100644
--- a/drivers/md/dm-verity-fec.c
+++ b/drivers/md/dm-verity-fec.c
@@ -209,6 +209,7 @@ static int fec_read_bufs(struct dm_verity *v, struct dm_verity_io *io,
u8 *bbuf, *rs_block;
u8 want_digest[HASH_MAX_DIGESTSIZE];
unsigned int n, k;
+ struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);

if (neras)
*neras = 0;
@@ -247,7 +248,7 @@ static int fec_read_bufs(struct dm_verity *v, struct dm_verity_io *io,
bufio = v->bufio;
}

- bbuf = dm_bufio_read(bufio, block, &buf, IOPRIO_DEFAULT);
+ bbuf = dm_bufio_read(bufio, block, &buf, bio_prio(bio));
if (IS_ERR(bbuf)) {
DMWARN_LIMIT("%s: FEC %llu: read failed (%llu): %ld",
v->data_dev->name,
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 0038e168f3d7..8c911b6722ce 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -51,6 +51,7 @@ static DEFINE_STATIC_KEY_FALSE(use_tasklet_enabled);
struct dm_verity_prefetch_work {
struct work_struct work;
struct dm_verity *v;
+ unsigned short ioprio;
sector_t block;
unsigned int n_blocks;
};
@@ -293,6 +294,7 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
int r;
sector_t hash_block;
unsigned int offset;
+ struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);

verity_hash_at_level(v, block, level, &hash_block, &offset);

@@ -307,7 +309,7 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
return -EAGAIN;
}
} else
- data = dm_bufio_read(v->bufio, hash_block, &buf, IOPRIO_DEFAULT);
+ data = dm_bufio_read(v->bufio, hash_block, &buf, bio_prio(bio));

if (IS_ERR(data))
return PTR_ERR(data);
@@ -717,7 +719,7 @@ static void verity_prefetch_io(struct work_struct *work)
}
no_prefetch_cluster:
dm_bufio_prefetch(v->bufio, hash_block_start,
- hash_block_end - hash_block_start + 1, IOPRIO_DEFAULT);
+ hash_block_end - hash_block_start + 1, pw->ioprio);
}

kfree(pw);
@@ -728,6 +730,7 @@ static void verity_submit_prefetch(struct dm_verity *v, struct dm_verity_io *io)
sector_t block = io->block;
unsigned int n_blocks = io->n_blocks;
struct dm_verity_prefetch_work *pw;
+ struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);

if (v->validated_blocks) {
while (n_blocks && test_bit(block, v->validated_blocks)) {
@@ -751,6 +754,7 @@ static void verity_submit_prefetch(struct dm_verity *v, struct dm_verity_io *io)
pw->v = v;
pw->block = block;
pw->n_blocks = n_blocks;
+ pw->ioprio = bio_prio(bio);
queue_work(v->verify_wq, &pw->work);
}

--
2.34.1

2023-12-12 11:24:58

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v4 3/5] dm-bufio: Support I/O priority

From: Hongyu Jin <[email protected]>

Add an I/O priority parameter to dm_bufio_read() and
dm_bufio_prefetch().

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-bufio.c | 33 ++++++++++---------
drivers/md/dm-ebs-target.c | 8 ++---
drivers/md/dm-integrity.c | 2 +-
drivers/md/dm-snap-persistent.c | 4 +--
drivers/md/dm-verity-fec.c | 4 +--
drivers/md/dm-verity-target.c | 4 +--
drivers/md/persistent-data/dm-block-manager.c | 6 ++--
include/linux/dm-bufio.h | 6 ++--
8 files changed, 34 insertions(+), 33 deletions(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 7f82262aed54..739f5dc52432 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1256,7 +1256,7 @@ static void dmio_complete(unsigned long error, void *context)
}

static void use_dmio(struct dm_buffer *b, enum req_op op, sector_t sector,
- unsigned int n_sectors, unsigned int offset)
+ unsigned int n_sectors, unsigned int offset, unsigned short ioprio)
{
int r;
struct dm_io_request io_req = {
@@ -1296,7 +1296,7 @@ static void bio_complete(struct bio *bio)
}

static void use_bio(struct dm_buffer *b, enum req_op op, sector_t sector,
- unsigned int n_sectors, unsigned int offset)
+ unsigned int n_sectors, unsigned int offset, unsigned short ioprio)
{
struct bio *bio;
char *ptr;
@@ -1304,13 +1304,14 @@ static void use_bio(struct dm_buffer *b, enum req_op op, sector_t sector,

bio = bio_kmalloc(1, GFP_NOWAIT | __GFP_NORETRY | __GFP_NOWARN);
if (!bio) {
- use_dmio(b, op, sector, n_sectors, offset);
+ use_dmio(b, op, sector, n_sectors, offset, ioprio);
return;
}
bio_init(bio, b->c->bdev, bio->bi_inline_vecs, 1, op);
bio->bi_iter.bi_sector = sector;
bio->bi_end_io = bio_complete;
bio->bi_private = b;
+ bio->bi_ioprio = ioprio;

ptr = (char *)b->data + offset;
len = n_sectors << SECTOR_SHIFT;
@@ -1333,7 +1334,7 @@ static inline sector_t block_to_sector(struct dm_bufio_client *c, sector_t block
return sector;
}

-static void submit_io(struct dm_buffer *b, enum req_op op,
+static void submit_io(struct dm_buffer *b, enum req_op op, unsigned short ioprio,
void (*end_io)(struct dm_buffer *, blk_status_t))
{
unsigned int n_sectors;
@@ -1363,9 +1364,9 @@ static void submit_io(struct dm_buffer *b, enum req_op op,
}

if (b->data_mode != DATA_MODE_VMALLOC)
- use_bio(b, op, sector, n_sectors, offset);
+ use_bio(b, op, sector, n_sectors, offset, ioprio);
else
- use_dmio(b, op, sector, n_sectors, offset);
+ use_dmio(b, op, sector, n_sectors, offset, ioprio);
}

/*
@@ -1421,7 +1422,7 @@ static void __write_dirty_buffer(struct dm_buffer *b,
b->write_end = b->dirty_end;

if (!write_list)
- submit_io(b, REQ_OP_WRITE, write_endio);
+ submit_io(b, REQ_OP_WRITE, IOPRIO_DEFAULT, write_endio);
else
list_add_tail(&b->write_list, write_list);
}
@@ -1435,7 +1436,7 @@ static void __flush_write_list(struct list_head *write_list)
struct dm_buffer *b =
list_entry(write_list->next, struct dm_buffer, write_list);
list_del(&b->write_list);
- submit_io(b, REQ_OP_WRITE, write_endio);
+ submit_io(b, REQ_OP_WRITE, IOPRIO_DEFAULT, write_endio);
cond_resched();
}
blk_finish_plug(&plug);
@@ -1817,7 +1818,7 @@ static void read_endio(struct dm_buffer *b, blk_status_t status)
* and uses dm_bufio_mark_buffer_dirty to write new data back).
*/
static void *new_read(struct dm_bufio_client *c, sector_t block,
- enum new_flag nf, struct dm_buffer **bp)
+ enum new_flag nf, struct dm_buffer **bp, unsigned short ioprio)
{
int need_submit = 0;
struct dm_buffer *b;
@@ -1870,7 +1871,7 @@ static void *new_read(struct dm_bufio_client *c, sector_t block,
return NULL;

if (need_submit)
- submit_io(b, REQ_OP_READ, read_endio);
+ submit_io(b, REQ_OP_READ, ioprio, read_endio);

wait_on_bit_io(&b->state, B_READING, TASK_UNINTERRUPTIBLE);

@@ -1890,17 +1891,17 @@ static void *new_read(struct dm_bufio_client *c, sector_t block,
void *dm_bufio_get(struct dm_bufio_client *c, sector_t block,
struct dm_buffer **bp)
{
- return new_read(c, block, NF_GET, bp);
+ return new_read(c, block, NF_GET, bp, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_get);

void *dm_bufio_read(struct dm_bufio_client *c, sector_t block,
- struct dm_buffer **bp)
+ struct dm_buffer **bp, unsigned short ioprio)
{
if (WARN_ON_ONCE(dm_bufio_in_request()))
return ERR_PTR(-EINVAL);

- return new_read(c, block, NF_READ, bp);
+ return new_read(c, block, NF_READ, bp, ioprio);
}
EXPORT_SYMBOL_GPL(dm_bufio_read);

@@ -1910,12 +1911,12 @@ void *dm_bufio_new(struct dm_bufio_client *c, sector_t block,
if (WARN_ON_ONCE(dm_bufio_in_request()))
return ERR_PTR(-EINVAL);

- return new_read(c, block, NF_FRESH, bp);
+ return new_read(c, block, NF_FRESH, bp, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_new);

void dm_bufio_prefetch(struct dm_bufio_client *c,
- sector_t block, unsigned int n_blocks)
+ sector_t block, unsigned int n_blocks, unsigned short ioprio)
{
struct blk_plug plug;

@@ -1951,7 +1952,7 @@ void dm_bufio_prefetch(struct dm_bufio_client *c,
dm_bufio_unlock(c);

if (need_submit)
- submit_io(b, REQ_OP_READ, read_endio);
+ submit_io(b, REQ_OP_READ, ioprio, read_endio);
dm_bufio_release(b);

cond_resched();
diff --git a/drivers/md/dm-ebs-target.c b/drivers/md/dm-ebs-target.c
index 435b45201f4d..8198c8a7b416 100644
--- a/drivers/md/dm-ebs-target.c
+++ b/drivers/md/dm-ebs-target.c
@@ -84,7 +84,7 @@ static int __ebs_rw_bvec(struct ebs_c *ec, enum req_op op, struct bio_vec *bv,

/* Avoid reading for writes in case bio vector's page overwrites block completely. */
if (op == REQ_OP_READ || buf_off || bv_len < dm_bufio_get_block_size(ec->bufio))
- ba = dm_bufio_read(ec->bufio, block, &b);
+ ba = dm_bufio_read(ec->bufio, block, &b, IOPRIO_DEFAULT);
else
ba = dm_bufio_new(ec->bufio, block, &b);

@@ -194,13 +194,13 @@ static void __ebs_process_bios(struct work_struct *ws)
bio_list_for_each(bio, &bios) {
block1 = __sector_to_block(ec, bio->bi_iter.bi_sector);
if (bio_op(bio) == REQ_OP_READ)
- dm_bufio_prefetch(ec->bufio, block1, __nr_blocks(ec, bio));
+ dm_bufio_prefetch(ec->bufio, block1, __nr_blocks(ec, bio), IOPRIO_DEFAULT);
else if (bio_op(bio) == REQ_OP_WRITE && !(bio->bi_opf & REQ_PREFLUSH)) {
block2 = __sector_to_block(ec, bio_end_sector(bio));
if (__block_mod(bio->bi_iter.bi_sector, ec->u_bs))
- dm_bufio_prefetch(ec->bufio, block1, 1);
+ dm_bufio_prefetch(ec->bufio, block1, 1, IOPRIO_DEFAULT);
if (__block_mod(bio_end_sector(bio), ec->u_bs) && block2 != block1)
- dm_bufio_prefetch(ec->bufio, block2, 1);
+ dm_bufio_prefetch(ec->bufio, block2, 1, IOPRIO_DEFAULT);
}
}

diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index 7cba183abdce..a2853c24a259 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -1421,7 +1421,7 @@ static int dm_integrity_rw_tag(struct dm_integrity_c *ic, unsigned char *tag, se
if (unlikely(r))
return r;

- data = dm_bufio_read(ic->bufio, *metadata_block, &b);
+ data = dm_bufio_read(ic->bufio, *metadata_block, &b, IOPRIO_DEFAULT);
if (IS_ERR(data))
return PTR_ERR(data);

diff --git a/drivers/md/dm-snap-persistent.c b/drivers/md/dm-snap-persistent.c
index 4aa70b71f1da..eb6943fc7024 100644
--- a/drivers/md/dm-snap-persistent.c
+++ b/drivers/md/dm-snap-persistent.c
@@ -525,7 +525,7 @@ static int read_exceptions(struct pstore *ps,

if (unlikely(pf_chunk >= dm_bufio_get_device_size(client)))
break;
- dm_bufio_prefetch(client, pf_chunk, 1);
+ dm_bufio_prefetch(client, pf_chunk, 1, IOPRIO_DEFAULT);
prefetch_area++;
if (unlikely(!prefetch_area))
break;
@@ -534,7 +534,7 @@ static int read_exceptions(struct pstore *ps,

chunk = area_location(ps, ps->current_area);

- area = dm_bufio_read(client, chunk, &bp);
+ area = dm_bufio_read(client, chunk, &bp, IOPRIO_DEFAULT);
if (IS_ERR(area)) {
r = PTR_ERR(area);
goto ret_destroy_bufio;
diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c
index 3ef9f018da60..715173cbf0ee 100644
--- a/drivers/md/dm-verity-fec.c
+++ b/drivers/md/dm-verity-fec.c
@@ -68,7 +68,7 @@ static u8 *fec_read_parity(struct dm_verity *v, u64 rsb, int index,
block = div64_u64_rem(position, v->fec->io_size, &rem);
*offset = (unsigned int)rem;

- res = dm_bufio_read(v->fec->bufio, block, buf);
+ res = dm_bufio_read(v->fec->bufio, block, buf, IOPRIO_DEFAULT);
if (IS_ERR(res)) {
DMERR("%s: FEC %llu: parity read failed (block %llu): %ld",
v->data_dev->name, (unsigned long long)rsb,
@@ -247,7 +247,7 @@ static int fec_read_bufs(struct dm_verity *v, struct dm_verity_io *io,
bufio = v->bufio;
}

- bbuf = dm_bufio_read(bufio, block, &buf);
+ bbuf = dm_bufio_read(bufio, block, &buf, IOPRIO_DEFAULT);
if (IS_ERR(bbuf)) {
DMWARN_LIMIT("%s: FEC %llu: read failed (%llu): %ld",
v->data_dev->name,
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 26adcfea0302..0038e168f3d7 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -307,7 +307,7 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
return -EAGAIN;
}
} else
- data = dm_bufio_read(v->bufio, hash_block, &buf);
+ data = dm_bufio_read(v->bufio, hash_block, &buf, IOPRIO_DEFAULT);

if (IS_ERR(data))
return PTR_ERR(data);
@@ -717,7 +717,7 @@ static void verity_prefetch_io(struct work_struct *work)
}
no_prefetch_cluster:
dm_bufio_prefetch(v->bufio, hash_block_start,
- hash_block_end - hash_block_start + 1);
+ hash_block_end - hash_block_start + 1, IOPRIO_DEFAULT);
}

kfree(pw);
diff --git a/drivers/md/persistent-data/dm-block-manager.c b/drivers/md/persistent-data/dm-block-manager.c
index 0e010e1204aa..86a4f73d2f3d 100644
--- a/drivers/md/persistent-data/dm-block-manager.c
+++ b/drivers/md/persistent-data/dm-block-manager.c
@@ -474,7 +474,7 @@ int dm_bm_read_lock(struct dm_block_manager *bm, dm_block_t b,
void *p;
int r;

- p = dm_bufio_read(bm->bufio, b, (struct dm_buffer **) result);
+ p = dm_bufio_read(bm->bufio, b, (struct dm_buffer **) result, IOPRIO_DEFAULT);
if (IS_ERR(p))
return PTR_ERR(p);

@@ -510,7 +510,7 @@ int dm_bm_write_lock(struct dm_block_manager *bm,
if (dm_bm_is_read_only(bm))
return -EPERM;

- p = dm_bufio_read(bm->bufio, b, (struct dm_buffer **) result);
+ p = dm_bufio_read(bm->bufio, b, (struct dm_buffer **) result, IOPRIO_DEFAULT);
if (IS_ERR(p))
return PTR_ERR(p);

@@ -624,7 +624,7 @@ EXPORT_SYMBOL_GPL(dm_bm_flush);

void dm_bm_prefetch(struct dm_block_manager *bm, dm_block_t b)
{
- dm_bufio_prefetch(bm->bufio, b, 1);
+ dm_bufio_prefetch(bm->bufio, b, 1, IOPRIO_DEFAULT);
}

bool dm_bm_is_read_only(struct dm_block_manager *bm)
diff --git a/include/linux/dm-bufio.h b/include/linux/dm-bufio.h
index 75e7d8cbb532..d270d48891f7 100644
--- a/include/linux/dm-bufio.h
+++ b/include/linux/dm-bufio.h
@@ -11,6 +11,7 @@
#define _LINUX_DM_BUFIO_H

#include <linux/blkdev.h>
+#include <linux/ioprio.h>
#include <linux/types.h>

/*----------------------------------------------------------------*/
@@ -62,7 +63,7 @@ void dm_bufio_set_sector_offset(struct dm_bufio_client *c, sector_t start);
* it dirty.
*/
void *dm_bufio_read(struct dm_bufio_client *c, sector_t block,
- struct dm_buffer **bp);
+ struct dm_buffer **bp, unsigned short ioprio);

/*
* Like dm_bufio_read, but return buffer from cache, don't read
@@ -84,8 +85,7 @@ void *dm_bufio_new(struct dm_bufio_client *c, sector_t block,
* I/O to finish.
*/
void dm_bufio_prefetch(struct dm_bufio_client *c,
- sector_t block, unsigned int n_blocks);
-
+ sector_t block, unsigned int n_blocks, unsigned short ioprio);
/*
* Release a reference obtained with dm_bufio_{read,get,new}. The data
* pointer and dm_buffer pointer is no longer valid after this call.
--
2.34.1

2023-12-12 13:13:42

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v4 1/5] block: Fix bio IO priority setting

On Tue, Dec 12, 2023 at 07:11:46PM +0800, Hongyu Jin wrote:
> From: Hongyu Jin <[email protected]>
>
> Move bio_set_ioprio() into submit_bio():
> 1. Call bio_set_ioprio() only once to set the priority of the original
> bio; bios cloned and split from it automatically inherit that priority
> during cloning.
>
> 2. The I/O priority can now be passed to modules that implement
> struct gendisk::fops::submit_bio, which helps resolve some of the
> I/O priority loss issues.

Can we reword this a bit? AFAICS what this primarily does is ensure
the priority is set before dispatching to submit_bio based drivers or
blk-mq instead of just blk-mq, and the rest follows from that.

> +static void bio_set_ioprio(struct bio *bio)
> +{
> + /* Nobody set ioprio so far? Initialize it based on task's nice value */
> + if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE)
> + bio->bi_ioprio = get_current_ioprio();
> + blkcg_set_ioprio(bio);
> +}

I don't think we need the check here as anyone resubmitting a bio
should be using submit_bio_noacct.

2023-12-12 18:02:51

by Mike Snitzer

[permalink] [raw]
Subject: Re: [PATCH v4 1/5] block: Fix bio IO priority setting

On Tue, Dec 12 2023 at 8:13P -0500,
Christoph Hellwig <[email protected]> wrote:

> On Tue, Dec 12, 2023 at 07:11:46PM +0800, Hongyu Jin wrote:
> > From: Hongyu Jin <[email protected]>
> >
> > Move bio_set_ioprio() into submit_bio():
> > 1. Call bio_set_ioprio() only once to set the priority of the original
> > bio; bios cloned and split from it automatically inherit that priority
> > during cloning.
> >
> > 2. The I/O priority can now be passed to modules that implement
> > struct gendisk::fops::submit_bio, which helps resolve some of the
> > I/O priority loss issues.
>
> Can we reword this a bit. AFAICS what this primarily does it to ensure
> the priority is set before dispatching to submit_bio based drivers or
> blk-mq instead of just blk-mq, and the rest follows from that.

Yeah, I agree.. something like:

Move bio_set_ioprio() and caller up from blk_mq_submit_bio() to
submit_bio(). This ensures all block drivers call bio_set_ioprio()
during initial bio submission.

> > +static void bio_set_ioprio(struct bio *bio)
> > +{
> > + /* Nobody set ioprio so far? Initialize it based on task's nice value */
> > + if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE)
> > + bio->bi_ioprio = get_current_ioprio();
> > + blkcg_set_ioprio(bio);
> > +}
>
> I don't think we need the check here as anyone resubmitting a bio
> should be using submit_bio_noacct.

This patch moves the caller from blk_mq_submit_bio() to submit_bio().

So I'm not sure why you're seizing on the "resubmitting a bio" use case
as the reason for dropping this check (which occurs in submit_bio).

The original justification for the check is detailed in commit
a78418e6a04c93b ("block: Always initialize bio IO priority on submit").

Mike

2023-12-13 04:46:53

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH v4 0/5] Fix I/O priority lost in device-mapper

On Tue, Dec 12, 2023 at 07:11:45PM +0800, Hongyu Jin wrote:
> From: Hongyu Jin <[email protected]>
>
> A high-priority task reads data from a dm-verity device with RT I/O
> priority, but during verification the I/O that reads the FEC and hash
> blocks is issued from a kworker, loses the RT priority, and is blocked
> by low-priority I/O. dm-crypt has the same problem when writing data.
>
> This happens because the io_context and blkcg are missing.
>
> Move bio_set_ioprio() into submit_bio():
> 1. Call bio_set_ioprio() only once to set the priority of the original
> bio; bios cloned and split from it automatically inherit that priority
> during cloning.
>
> 2. Pass the I/O priority of the original bio down to dm, so that each
> dm target can inherit it as needed.
>

What commit does this patch series apply to?

- Eric

2023-12-13 04:57:22

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH v4 2/5] dm: Support I/O priority for dm_io()

On Tue, Dec 12, 2023 at 07:11:47PM +0800, Hongyu Jin wrote:
> From: Hongyu Jin <[email protected]>
>
> Add an ioprio field to struct dm_io_region; this field specifies
> the I/O priority to use when calling dm_io().
>
> Co-developed-by: Yibin Ding <[email protected]>
> Signed-off-by: Yibin Ding <[email protected]>
> Signed-off-by: Hongyu Jin <[email protected]>

Is struct dm_io_region really the right place for this? What about
struct dm_io_request? Or a parameter to dm_io().

- Eric
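
(For illustration: the dm_io()-parameter variant suggested here would
look roughly like the sketch below, and it is the shape the series
ends up adopting in v5. Treat the prototype as a sketch of that
direction, not a final API.)

int dm_io(struct dm_io_request *io_req, unsigned int num_regions,
	  struct dm_io_region *where, unsigned long *sync_error_bits,
	  unsigned short ioprio);

/* callers with no particular priority to convey would pass: */
r = dm_io(&io_req, 1, &region, NULL, IOPRIO_DEFAULT);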

2023-12-13 05:01:09

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH v4 3/5] dm-bufio: Support I/O priority

On Tue, Dec 12, 2023 at 07:11:48PM +0800, Hongyu Jin wrote:
> static void use_dmio(struct dm_buffer *b, enum req_op op, sector_t sector,
> - unsigned int n_sectors, unsigned int offset)
> + unsigned int n_sectors, unsigned int offset, unsigned short ioprio)

The ioprio argument to this function is unused.

> bool dm_bm_is_read_only(struct dm_block_manager *bm)
> diff --git a/include/linux/dm-bufio.h b/include/linux/dm-bufio.h
> index 75e7d8cbb532..d270d48891f7 100644
> --- a/include/linux/dm-bufio.h
> +++ b/include/linux/dm-bufio.h
> @@ -11,6 +11,7 @@
> #define _LINUX_DM_BUFIO_H
>
> #include <linux/blkdev.h>
> +#include <linux/ioprio.h>
> #include <linux/types.h>

It's not necessary to include linux/ioprio.h here.

- Eric

2023-12-13 09:24:53

by Hongyu Jin

[permalink] [raw]
Subject: Re: [PATCH v4 0/5] Fix I/O priority lost in device-mapper

Eric Biggers <[email protected]> wrote on Wed, 13 Dec 2023 at 12:45:
>
> On Tue, Dec 12, 2023 at 07:11:45PM +0800, Hongyu Jin wrote:
> > From: Hongyu Jin <[email protected]>
> >
> > A high-priority task reads data from a dm-verity device with RT I/O
> > priority, but during verification the I/O that reads the FEC and hash
> > blocks is issued from a kworker, loses the RT priority, and is blocked
> > by low-priority I/O. dm-crypt has the same problem when writing data.
> >
> > This happens because the io_context and blkcg are missing.
> >
> > Move bio_set_ioprio() into submit_bio():
> > 1. Call bio_set_ioprio() only once to set the priority of the original
> > bio; bios cloned and split from it automatically inherit that priority
> > during cloning.
> >
> > 2. Pass the I/O priority of the original bio down to dm, so that each
> > dm target can inherit it as needed.
> >
>
> What commit does this patch series apply to?
>
> - Eric

The changes are based on the master branch, commit 9bacdd8996c7
("Merge tag 'for-6.7-rc1-tag' of
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux").

2023-12-13 10:43:00

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v5 0/5] Fix I/O priority lost in device-mapper

From: Hongyu Jin <[email protected]>

A high-priority task reads data from a dm-verity device with RT I/O
priority, but during verification the I/O that reads the FEC and hash
blocks is issued from a kworker, loses the RT priority, and is blocked
by low-priority I/O. dm-crypt has the same problem when writing data.

This happens because the io_context and blkcg are missing.

Move bio_set_ioprio() into submit_bio():
1. Call bio_set_ioprio() only once to set the priority of the original
bio; bios cloned and split from it automatically inherit that priority
during cloning.

2. Pass the I/O priority of the original bio down to dm, so that each
dm target can inherit it as needed.


All changes are based on commit 9bacdd8996c7 ("Merge tag 'for-6.7-rc1-tag'
of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux")
on the master branch.


Changes in v5:
- Rewrite patch 2; add an ioprio parameter to dm_io()
- Update the dm_io() calls in patch 3
Changes in v4:
- Modify commit messages as suggested
- Modify the dm-crypt patch
Changes in v3:
- Split the device-mapper patch
- Add a patch to fix the dm-crypt I/O priority issue
- Add the block patch so it can be reviewed together
- Fix some errors in the v2 patch

Changes in v2:
- Add an ioprio field to struct dm_io_region
- Initialize struct dm_io_region::ioprio to IOPRIO_DEFAULT
- Add two interfaces


Hongyu Jin (5):
block: Fix bio IO priority setting
dm: Support I/O priority for dm_io()
dm-bufio: Support I/O priority
dm verity: Fix I/O priority lost when read FEC and hash
dm-crypt: Fix lost ioprio when queuing write bios

block/blk-core.c | 10 +++++
block/blk-mq.c | 11 ------
drivers/md/dm-bufio.c | 39 ++++++++++---------
drivers/md/dm-crypt.c | 1 +
drivers/md/dm-ebs-target.c | 8 ++--
drivers/md/dm-integrity.c | 12 +++---
drivers/md/dm-io.c | 23 ++++++-----
drivers/md/dm-kcopyd.c | 4 +-
drivers/md/dm-log.c | 4 +-
drivers/md/dm-raid1.c | 6 +--
drivers/md/dm-snap-persistent.c | 8 ++--
drivers/md/dm-verity-fec.c | 5 ++-
drivers/md/dm-verity-target.c | 8 +++-
drivers/md/dm-writecache.c | 8 ++--
drivers/md/persistent-data/dm-block-manager.c | 6 +--
include/linux/dm-bufio.h | 5 +--
include/linux/dm-io.h | 3 +-
17 files changed, 85 insertions(+), 76 deletions(-)

--
2.34.1

2023-12-13 10:43:11

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v5 1/5] block: Fix bio IO priority setting

From: Hongyu Jin <[email protected]>

Move bio_set_ioprio() into submit_bio():
1. Call bio_set_ioprio() only once to set the priority of the original
bio; bios cloned and split from it automatically inherit that priority
during cloning.

2. The I/O priority can now be passed to modules that implement
struct gendisk::fops::submit_bio, which helps resolve some of the
I/O priority loss issues.

This patch depends on commit 82b74cac2849 ("blk-ioprio: Convert from
rqos policy to direct call")

Fixes: a78418e6a04c ("block: Always initialize bio IO priority on submit")

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
block/blk-core.c | 10 ++++++++++
block/blk-mq.c | 11 -----------
2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index fdf25b8d6e78..68158c327aea 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -49,6 +49,7 @@
#include "blk-pm.h"
#include "blk-cgroup.h"
#include "blk-throttle.h"
+#include "blk-ioprio.h"

struct dentry *blk_debugfs_root;

@@ -809,6 +810,14 @@ void submit_bio_noacct(struct bio *bio)
}
EXPORT_SYMBOL(submit_bio_noacct);

+static void bio_set_ioprio(struct bio *bio)
+{
+ /* Nobody set ioprio so far? Initialize it based on task's nice value */
+ if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE)
+ bio->bi_ioprio = get_current_ioprio();
+ blkcg_set_ioprio(bio);
+}
+
/**
* submit_bio - submit a bio to the block device layer for I/O
* @bio: The &struct bio which describes the I/O
@@ -831,6 +840,7 @@ void submit_bio(struct bio *bio)
count_vm_events(PGPGOUT, bio_sectors(bio));
}

+ bio_set_ioprio(bio);
submit_bio_noacct(bio);
}
EXPORT_SYMBOL(submit_bio);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index e2d11183f62e..a6e2609df9c9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -40,7 +40,6 @@
#include "blk-stat.h"
#include "blk-mq-sched.h"
#include "blk-rq-qos.h"
-#include "blk-ioprio.h"

static DEFINE_PER_CPU(struct llist_head, blk_cpu_done);
static DEFINE_PER_CPU(call_single_data_t, blk_cpu_csd);
@@ -2922,14 +2921,6 @@ static inline struct request *blk_mq_get_cached_request(struct request_queue *q,
return rq;
}

-static void bio_set_ioprio(struct bio *bio)
-{
- /* Nobody set ioprio so far? Initialize it based on task's nice value */
- if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE)
- bio->bi_ioprio = get_current_ioprio();
- blkcg_set_ioprio(bio);
-}
-
/**
* blk_mq_submit_bio - Create and send a request to block device.
* @bio: Bio pointer.
@@ -2963,8 +2954,6 @@ void blk_mq_submit_bio(struct bio *bio)
if (!bio_integrity_prep(bio))
return;

- bio_set_ioprio(bio);
-
rq = blk_mq_get_cached_request(q, plug, &bio, nr_segs);
if (!rq) {
if (!bio)
--
2.34.1

2023-12-13 10:43:15

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v5 3/5] dm-bufio: Support I/O priority

From: Hongyu Jin <[email protected]>

Add an I/O priority parameter to dm_bufio_read() and
dm_bufio_prefetch().

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-bufio.c | 35 ++++++++++---------
drivers/md/dm-ebs-target.c | 8 ++---
drivers/md/dm-integrity.c | 2 +-
drivers/md/dm-snap-persistent.c | 4 +--
drivers/md/dm-verity-fec.c | 4 +--
drivers/md/dm-verity-target.c | 4 +--
drivers/md/persistent-data/dm-block-manager.c | 6 ++--
include/linux/dm-bufio.h | 5 ++-
8 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 91b05bf4b920..f3b051d3517e 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1256,7 +1256,7 @@ static void dmio_complete(unsigned long error, void *context)
}

static void use_dmio(struct dm_buffer *b, enum req_op op, sector_t sector,
- unsigned int n_sectors, unsigned int offset)
+ unsigned int n_sectors, unsigned int offset, unsigned short ioprio)
{
int r;
struct dm_io_request io_req = {
@@ -1279,7 +1279,7 @@ static void use_dmio(struct dm_buffer *b, enum req_op op, sector_t sector,
io_req.mem.ptr.vma = (char *)b->data + offset;
}

- r = dm_io(&io_req, 1, &region, NULL, IOPRIO_DEFAULT);
+ r = dm_io(&io_req, 1, &region, NULL, ioprio);
if (unlikely(r))
b->end_io(b, errno_to_blk_status(r));
}
@@ -1295,7 +1295,7 @@ static void bio_complete(struct bio *bio)
}

static void use_bio(struct dm_buffer *b, enum req_op op, sector_t sector,
- unsigned int n_sectors, unsigned int offset)
+ unsigned int n_sectors, unsigned int offset, unsigned short ioprio)
{
struct bio *bio;
char *ptr;
@@ -1303,13 +1303,14 @@ static void use_bio(struct dm_buffer *b, enum req_op op, sector_t sector,

bio = bio_kmalloc(1, GFP_NOWAIT | __GFP_NORETRY | __GFP_NOWARN);
if (!bio) {
- use_dmio(b, op, sector, n_sectors, offset);
+ use_dmio(b, op, sector, n_sectors, offset, ioprio);
return;
}
bio_init(bio, b->c->bdev, bio->bi_inline_vecs, 1, op);
bio->bi_iter.bi_sector = sector;
bio->bi_end_io = bio_complete;
bio->bi_private = b;
+ bio->bi_ioprio = ioprio;

ptr = (char *)b->data + offset;
len = n_sectors << SECTOR_SHIFT;
@@ -1332,7 +1333,7 @@ static inline sector_t block_to_sector(struct dm_bufio_client *c, sector_t block
return sector;
}

-static void submit_io(struct dm_buffer *b, enum req_op op,
+static void submit_io(struct dm_buffer *b, enum req_op op, unsigned short ioprio,
void (*end_io)(struct dm_buffer *, blk_status_t))
{
unsigned int n_sectors;
@@ -1362,9 +1363,9 @@ static void submit_io(struct dm_buffer *b, enum req_op op,
}

if (b->data_mode != DATA_MODE_VMALLOC)
- use_bio(b, op, sector, n_sectors, offset);
+ use_bio(b, op, sector, n_sectors, offset, ioprio);
else
- use_dmio(b, op, sector, n_sectors, offset);
+ use_dmio(b, op, sector, n_sectors, offset, ioprio);
}

/*
@@ -1420,7 +1421,7 @@ static void __write_dirty_buffer(struct dm_buffer *b,
b->write_end = b->dirty_end;

if (!write_list)
- submit_io(b, REQ_OP_WRITE, write_endio);
+ submit_io(b, REQ_OP_WRITE, IOPRIO_DEFAULT, write_endio);
else
list_add_tail(&b->write_list, write_list);
}
@@ -1434,7 +1435,7 @@ static void __flush_write_list(struct list_head *write_list)
struct dm_buffer *b =
list_entry(write_list->next, struct dm_buffer, write_list);
list_del(&b->write_list);
- submit_io(b, REQ_OP_WRITE, write_endio);
+ submit_io(b, REQ_OP_WRITE, IOPRIO_DEFAULT, write_endio);
cond_resched();
}
blk_finish_plug(&plug);
@@ -1816,7 +1817,7 @@ static void read_endio(struct dm_buffer *b, blk_status_t status)
* and uses dm_bufio_mark_buffer_dirty to write new data back).
*/
static void *new_read(struct dm_bufio_client *c, sector_t block,
- enum new_flag nf, struct dm_buffer **bp)
+ enum new_flag nf, struct dm_buffer **bp, unsigned short ioprio)
{
int need_submit = 0;
struct dm_buffer *b;
@@ -1869,7 +1870,7 @@ static void *new_read(struct dm_bufio_client *c, sector_t block,
return NULL;

if (need_submit)
- submit_io(b, REQ_OP_READ, read_endio);
+ submit_io(b, REQ_OP_READ, ioprio, read_endio);

wait_on_bit_io(&b->state, B_READING, TASK_UNINTERRUPTIBLE);

@@ -1889,17 +1890,17 @@ static void *new_read(struct dm_bufio_client *c, sector_t block,
void *dm_bufio_get(struct dm_bufio_client *c, sector_t block,
struct dm_buffer **bp)
{
- return new_read(c, block, NF_GET, bp);
+ return new_read(c, block, NF_GET, bp, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_get);

void *dm_bufio_read(struct dm_bufio_client *c, sector_t block,
- struct dm_buffer **bp)
+ struct dm_buffer **bp, unsigned short ioprio)
{
if (WARN_ON_ONCE(dm_bufio_in_request()))
return ERR_PTR(-EINVAL);

- return new_read(c, block, NF_READ, bp);
+ return new_read(c, block, NF_READ, bp, ioprio);
}
EXPORT_SYMBOL_GPL(dm_bufio_read);

@@ -1909,12 +1910,12 @@ void *dm_bufio_new(struct dm_bufio_client *c, sector_t block,
if (WARN_ON_ONCE(dm_bufio_in_request()))
return ERR_PTR(-EINVAL);

- return new_read(c, block, NF_FRESH, bp);
+ return new_read(c, block, NF_FRESH, bp, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_new);

void dm_bufio_prefetch(struct dm_bufio_client *c,
- sector_t block, unsigned int n_blocks)
+ sector_t block, unsigned int n_blocks, unsigned short ioprio)
{
struct blk_plug plug;

@@ -1950,7 +1951,7 @@ void dm_bufio_prefetch(struct dm_bufio_client *c,
dm_bufio_unlock(c);

if (need_submit)
- submit_io(b, REQ_OP_READ, read_endio);
+ submit_io(b, REQ_OP_READ, ioprio, read_endio);
dm_bufio_release(b);

cond_resched();
diff --git a/drivers/md/dm-ebs-target.c b/drivers/md/dm-ebs-target.c
index 435b45201f4d..8198c8a7b416 100644
--- a/drivers/md/dm-ebs-target.c
+++ b/drivers/md/dm-ebs-target.c
@@ -84,7 +84,7 @@ static int __ebs_rw_bvec(struct ebs_c *ec, enum req_op op, struct bio_vec *bv,

/* Avoid reading for writes in case bio vector's page overwrites block completely. */
if (op == REQ_OP_READ || buf_off || bv_len < dm_bufio_get_block_size(ec->bufio))
- ba = dm_bufio_read(ec->bufio, block, &b);
+ ba = dm_bufio_read(ec->bufio, block, &b, IOPRIO_DEFAULT);
else
ba = dm_bufio_new(ec->bufio, block, &b);

@@ -194,13 +194,13 @@ static void __ebs_process_bios(struct work_struct *ws)
bio_list_for_each(bio, &bios) {
block1 = __sector_to_block(ec, bio->bi_iter.bi_sector);
if (bio_op(bio) == REQ_OP_READ)
- dm_bufio_prefetch(ec->bufio, block1, __nr_blocks(ec, bio));
+ dm_bufio_prefetch(ec->bufio, block1, __nr_blocks(ec, bio), IOPRIO_DEFAULT);
else if (bio_op(bio) == REQ_OP_WRITE && !(bio->bi_opf & REQ_PREFLUSH)) {
block2 = __sector_to_block(ec, bio_end_sector(bio));
if (__block_mod(bio->bi_iter.bi_sector, ec->u_bs))
- dm_bufio_prefetch(ec->bufio, block1, 1);
+ dm_bufio_prefetch(ec->bufio, block1, 1, IOPRIO_DEFAULT);
if (__block_mod(bio_end_sector(bio), ec->u_bs) && block2 != block1)
- dm_bufio_prefetch(ec->bufio, block2, 1);
+ dm_bufio_prefetch(ec->bufio, block2, 1, IOPRIO_DEFAULT);
}
}

diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index 9ffd093ad6cc..1e40e712bcd7 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -1418,7 +1418,7 @@ static int dm_integrity_rw_tag(struct dm_integrity_c *ic, unsigned char *tag, se
if (unlikely(r))
return r;

- data = dm_bufio_read(ic->bufio, *metadata_block, &b);
+ data = dm_bufio_read(ic->bufio, *metadata_block, &b, IOPRIO_DEFAULT);
if (IS_ERR(data))
return PTR_ERR(data);

diff --git a/drivers/md/dm-snap-persistent.c b/drivers/md/dm-snap-persistent.c
index 568d10842b1f..a2072b95e28c 100644
--- a/drivers/md/dm-snap-persistent.c
+++ b/drivers/md/dm-snap-persistent.c
@@ -524,7 +524,7 @@ static int read_exceptions(struct pstore *ps,

if (unlikely(pf_chunk >= dm_bufio_get_device_size(client)))
break;
- dm_bufio_prefetch(client, pf_chunk, 1);
+ dm_bufio_prefetch(client, pf_chunk, 1, IOPRIO_DEFAULT);
prefetch_area++;
if (unlikely(!prefetch_area))
break;
@@ -533,7 +533,7 @@ static int read_exceptions(struct pstore *ps,

chunk = area_location(ps, ps->current_area);

- area = dm_bufio_read(client, chunk, &bp);
+ area = dm_bufio_read(client, chunk, &bp, IOPRIO_DEFAULT);
if (IS_ERR(area)) {
r = PTR_ERR(area);
goto ret_destroy_bufio;
diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c
index 3ef9f018da60..715173cbf0ee 100644
--- a/drivers/md/dm-verity-fec.c
+++ b/drivers/md/dm-verity-fec.c
@@ -68,7 +68,7 @@ static u8 *fec_read_parity(struct dm_verity *v, u64 rsb, int index,
block = div64_u64_rem(position, v->fec->io_size, &rem);
*offset = (unsigned int)rem;

- res = dm_bufio_read(v->fec->bufio, block, buf);
+ res = dm_bufio_read(v->fec->bufio, block, buf, IOPRIO_DEFAULT);
if (IS_ERR(res)) {
DMERR("%s: FEC %llu: parity read failed (block %llu): %ld",
v->data_dev->name, (unsigned long long)rsb,
@@ -247,7 +247,7 @@ static int fec_read_bufs(struct dm_verity *v, struct dm_verity_io *io,
bufio = v->bufio;
}

- bbuf = dm_bufio_read(bufio, block, &buf);
+ bbuf = dm_bufio_read(bufio, block, &buf, IOPRIO_DEFAULT);
if (IS_ERR(bbuf)) {
DMWARN_LIMIT("%s: FEC %llu: read failed (%llu): %ld",
v->data_dev->name,
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 26adcfea0302..0038e168f3d7 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -307,7 +307,7 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
return -EAGAIN;
}
} else
- data = dm_bufio_read(v->bufio, hash_block, &buf);
+ data = dm_bufio_read(v->bufio, hash_block, &buf, IOPRIO_DEFAULT);

if (IS_ERR(data))
return PTR_ERR(data);
@@ -717,7 +717,7 @@ static void verity_prefetch_io(struct work_struct *work)
}
no_prefetch_cluster:
dm_bufio_prefetch(v->bufio, hash_block_start,
- hash_block_end - hash_block_start + 1);
+ hash_block_end - hash_block_start + 1, IOPRIO_DEFAULT);
}

kfree(pw);
diff --git a/drivers/md/persistent-data/dm-block-manager.c b/drivers/md/persistent-data/dm-block-manager.c
index 0e010e1204aa..86a4f73d2f3d 100644
--- a/drivers/md/persistent-data/dm-block-manager.c
+++ b/drivers/md/persistent-data/dm-block-manager.c
@@ -474,7 +474,7 @@ int dm_bm_read_lock(struct dm_block_manager *bm, dm_block_t b,
void *p;
int r;

- p = dm_bufio_read(bm->bufio, b, (struct dm_buffer **) result);
+ p = dm_bufio_read(bm->bufio, b, (struct dm_buffer **) result, IOPRIO_DEFAULT);
if (IS_ERR(p))
return PTR_ERR(p);

@@ -510,7 +510,7 @@ int dm_bm_write_lock(struct dm_block_manager *bm,
if (dm_bm_is_read_only(bm))
return -EPERM;

- p = dm_bufio_read(bm->bufio, b, (struct dm_buffer **) result);
+ p = dm_bufio_read(bm->bufio, b, (struct dm_buffer **) result, IOPRIO_DEFAULT);
if (IS_ERR(p))
return PTR_ERR(p);

@@ -624,7 +624,7 @@ EXPORT_SYMBOL_GPL(dm_bm_flush);

void dm_bm_prefetch(struct dm_block_manager *bm, dm_block_t b)
{
- dm_bufio_prefetch(bm->bufio, b, 1);
+ dm_bufio_prefetch(bm->bufio, b, 1, IOPRIO_DEFAULT);
}

bool dm_bm_is_read_only(struct dm_block_manager *bm)
diff --git a/include/linux/dm-bufio.h b/include/linux/dm-bufio.h
index 75e7d8cbb532..6cdd9cb66dd5 100644
--- a/include/linux/dm-bufio.h
+++ b/include/linux/dm-bufio.h
@@ -62,7 +62,7 @@ void dm_bufio_set_sector_offset(struct dm_bufio_client *c, sector_t start);
* it dirty.
*/
void *dm_bufio_read(struct dm_bufio_client *c, sector_t block,
- struct dm_buffer **bp);
+ struct dm_buffer **bp, unsigned short ioprio);

/*
* Like dm_bufio_read, but return buffer from cache, don't read
@@ -84,8 +84,7 @@ void *dm_bufio_new(struct dm_bufio_client *c, sector_t block,
* I/O to finish.
*/
void dm_bufio_prefetch(struct dm_bufio_client *c,
- sector_t block, unsigned int n_blocks);
-
+ sector_t block, unsigned int n_blocks, unsigned short ioprio);
/*
* Release a reference obtained with dm_bufio_{read,get,new}. The data
* pointer and dm_buffer pointer is no longer valid after this call.
--
2.34.1

2023-12-13 10:43:28

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v5 4/5] dm verity: Fix I/O priority lost when read FEC and hash

From: Hongyu Jin <[email protected]>

Fix the following problem: when the FEC and hash blocks are read from
disk, their I/O priority is inconsistent with that of the data block,
so they can be blocked by other I/O with lower priority.

Make the I/O for FEC and hash carry the same I/O priority as the
original data I/O.

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-verity-fec.c | 3 ++-
drivers/md/dm-verity-target.c | 8 ++++++--
2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c
index 715173cbf0ee..6a5a679e7e8a 100644
--- a/drivers/md/dm-verity-fec.c
+++ b/drivers/md/dm-verity-fec.c
@@ -209,6 +209,7 @@ static int fec_read_bufs(struct dm_verity *v, struct dm_verity_io *io,
u8 *bbuf, *rs_block;
u8 want_digest[HASH_MAX_DIGESTSIZE];
unsigned int n, k;
+ struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);

if (neras)
*neras = 0;
@@ -247,7 +248,7 @@ static int fec_read_bufs(struct dm_verity *v, struct dm_verity_io *io,
bufio = v->bufio;
}

- bbuf = dm_bufio_read(bufio, block, &buf, IOPRIO_DEFAULT);
+ bbuf = dm_bufio_read(bufio, block, &buf, bio_prio(bio));
if (IS_ERR(bbuf)) {
DMWARN_LIMIT("%s: FEC %llu: read failed (%llu): %ld",
v->data_dev->name,
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 0038e168f3d7..8c911b6722ce 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -51,6 +51,7 @@ static DEFINE_STATIC_KEY_FALSE(use_tasklet_enabled);
struct dm_verity_prefetch_work {
struct work_struct work;
struct dm_verity *v;
+ unsigned short ioprio;
sector_t block;
unsigned int n_blocks;
};
@@ -293,6 +294,7 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
int r;
sector_t hash_block;
unsigned int offset;
+ struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);

verity_hash_at_level(v, block, level, &hash_block, &offset);

@@ -307,7 +309,7 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
return -EAGAIN;
}
} else
- data = dm_bufio_read(v->bufio, hash_block, &buf, IOPRIO_DEFAULT);
+ data = dm_bufio_read(v->bufio, hash_block, &buf, bio_prio(bio));

if (IS_ERR(data))
return PTR_ERR(data);
@@ -717,7 +719,7 @@ static void verity_prefetch_io(struct work_struct *work)
}
no_prefetch_cluster:
dm_bufio_prefetch(v->bufio, hash_block_start,
- hash_block_end - hash_block_start + 1, IOPRIO_DEFAULT);
+ hash_block_end - hash_block_start + 1, pw->ioprio);
}

kfree(pw);
@@ -728,6 +730,7 @@ static void verity_submit_prefetch(struct dm_verity *v, struct dm_verity_io *io)
sector_t block = io->block;
unsigned int n_blocks = io->n_blocks;
struct dm_verity_prefetch_work *pw;
+ struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);

if (v->validated_blocks) {
while (n_blocks && test_bit(block, v->validated_blocks)) {
@@ -751,6 +754,7 @@ static void verity_submit_prefetch(struct dm_verity *v, struct dm_verity_io *io)
pw->v = v;
pw->block = block;
pw->n_blocks = n_blocks;
+ pw->ioprio = bio_prio(bio);
queue_work(v->verify_wq, &pw->work);
}

--
2.34.1

2023-12-13 10:43:31

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v5 2/5] dm: Support I/O priority for dm_io()

From: Hongyu Jin <[email protected]>

Add I/O priority parameter for dm_io().

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-bufio.c | 6 +++---
drivers/md/dm-integrity.c | 10 +++++-----
drivers/md/dm-io.c | 23 +++++++++++++----------
drivers/md/dm-kcopyd.c | 4 ++--
drivers/md/dm-log.c | 4 ++--
drivers/md/dm-raid1.c | 6 +++---
drivers/md/dm-snap-persistent.c | 4 ++--
drivers/md/dm-writecache.c | 8 ++++----
include/linux/dm-io.h | 3 ++-
9 files changed, 36 insertions(+), 32 deletions(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 62eb27639c9b..91b05bf4b920 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1279,7 +1279,7 @@ static void use_dmio(struct dm_buffer *b, enum req_op op, sector_t sector,
io_req.mem.ptr.vma = (char *)b->data + offset;
}

- r = dm_io(&io_req, 1, &region, NULL);
+ r = dm_io(&io_req, 1, &region, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
b->end_io(b, errno_to_blk_status(r));
}
@@ -2130,7 +2130,7 @@ int dm_bufio_issue_flush(struct dm_bufio_client *c)
if (WARN_ON_ONCE(dm_bufio_in_request()))
return -EINVAL;

- return dm_io(&io_req, 1, &io_reg, NULL);
+ return dm_io(&io_req, 1, &io_reg, NULL, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_issue_flush);

@@ -2154,7 +2154,7 @@ int dm_bufio_issue_discard(struct dm_bufio_client *c, sector_t block, sector_t c
if (WARN_ON_ONCE(dm_bufio_in_request()))
return -EINVAL; /* discards are optional */

- return dm_io(&io_req, 1, &io_reg, NULL);
+ return dm_io(&io_req, 1, &io_reg, NULL, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_issue_discard);

diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index e85c688fd91e..9ffd093ad6cc 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -553,7 +553,7 @@ static int sync_rw_sb(struct dm_integrity_c *ic, blk_opf_t opf)
}
}

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
return r;

@@ -1071,7 +1071,7 @@ static void rw_journal_sectors(struct dm_integrity_c *ic, blk_opf_t opf,
io_loc.sector = ic->start + SB_SECTORS + sector;
io_loc.count = n_sectors;

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r)) {
dm_integrity_io_error(ic, (opf & REQ_OP_MASK) == REQ_OP_READ ?
"reading journal" : "writing journal", r);
@@ -1188,7 +1188,7 @@ static void copy_from_journal(struct dm_integrity_c *ic, unsigned int section, u
io_loc.sector = target;
io_loc.count = n_sectors;

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r)) {
WARN_ONCE(1, "asynchronous dm_io failed: %d", r);
fn(-1UL, data);
@@ -1517,7 +1517,7 @@ static void dm_integrity_flush_buffers(struct dm_integrity_c *ic, bool flush_dat
fr.io_reg.count = 0,
fr.ic = ic;
init_completion(&fr.comp);
- r = dm_io(&fr.io_req, 1, &fr.io_reg, NULL);
+ r = dm_io(&fr.io_req, 1, &fr.io_reg, NULL, IOPRIO_DEFAULT);
BUG_ON(r);
}

@@ -2739,7 +2739,7 @@ static void integrity_recalc(struct work_struct *w)
io_loc.sector = get_data_sector(ic, area, offset);
io_loc.count = n_sectors;

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r)) {
dm_integrity_io_error(ic, "reading data", r);
goto err;
diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index f053ce245814..7409490259d1 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -305,7 +305,7 @@ static void km_dp_init(struct dpages *dp, void *data)
*/
static void do_region(const blk_opf_t opf, unsigned int region,
struct dm_io_region *where, struct dpages *dp,
- struct io *io)
+ struct io *io, unsigned short ioprio)
{
struct bio *bio;
struct page *page;
@@ -354,6 +354,7 @@ static void do_region(const blk_opf_t opf, unsigned int region,
&io->client->bios);
bio->bi_iter.bi_sector = where->sector + (where->count - remaining);
bio->bi_end_io = endio;
+ bio->bi_ioprio = ioprio;
store_io_and_region_in_bio(bio, io, region);

if (op == REQ_OP_DISCARD || op == REQ_OP_WRITE_ZEROES) {
@@ -383,7 +384,7 @@ static void do_region(const blk_opf_t opf, unsigned int region,

static void dispatch_io(blk_opf_t opf, unsigned int num_regions,
struct dm_io_region *where, struct dpages *dp,
- struct io *io, int sync)
+ struct io *io, int sync, unsigned short ioprio)
{
int i;
struct dpages old_pages = *dp;
@@ -400,7 +401,7 @@ static void dispatch_io(blk_opf_t opf, unsigned int num_regions,
for (i = 0; i < num_regions; i++) {
*dp = old_pages;
if (where[i].count || (opf & REQ_PREFLUSH))
- do_region(opf, i, where + i, dp, io);
+ do_region(opf, i, where + i, dp, io, ioprio);
}

/*
@@ -425,7 +426,7 @@ static void sync_io_complete(unsigned long error, void *context)

static int sync_io(struct dm_io_client *client, unsigned int num_regions,
struct dm_io_region *where, blk_opf_t opf, struct dpages *dp,
- unsigned long *error_bits)
+ unsigned long *error_bits, unsigned short ioprio)
{
struct io *io;
struct sync_io sio;
@@ -447,7 +448,7 @@ static int sync_io(struct dm_io_client *client, unsigned int num_regions,
io->vma_invalidate_address = dp->vma_invalidate_address;
io->vma_invalidate_size = dp->vma_invalidate_size;

- dispatch_io(opf, num_regions, where, dp, io, 1);
+ dispatch_io(opf, num_regions, where, dp, io, 1, ioprio);

wait_for_completion_io(&sio.wait);

@@ -459,7 +460,8 @@ static int sync_io(struct dm_io_client *client, unsigned int num_regions,

static int async_io(struct dm_io_client *client, unsigned int num_regions,
struct dm_io_region *where, blk_opf_t opf,
- struct dpages *dp, io_notify_fn fn, void *context)
+ struct dpages *dp, io_notify_fn fn, void *context,
+ unsigned short ioprio)
{
struct io *io;

@@ -479,7 +481,7 @@ static int async_io(struct dm_io_client *client, unsigned int num_regions,
io->vma_invalidate_address = dp->vma_invalidate_address;
io->vma_invalidate_size = dp->vma_invalidate_size;

- dispatch_io(opf, num_regions, where, dp, io, 0);
+ dispatch_io(opf, num_regions, where, dp, io, 0, ioprio);
return 0;
}

@@ -521,7 +523,8 @@ static int dp_init(struct dm_io_request *io_req, struct dpages *dp,
}

int dm_io(struct dm_io_request *io_req, unsigned int num_regions,
- struct dm_io_region *where, unsigned long *sync_error_bits)
+ struct dm_io_region *where, unsigned long *sync_error_bits,
+ unsigned short ioprio)
{
int r;
struct dpages dp;
@@ -532,11 +535,11 @@ int dm_io(struct dm_io_request *io_req, unsigned int num_regions,

if (!io_req->notify.fn)
return sync_io(io_req->client, num_regions, where,
- io_req->bi_opf, &dp, sync_error_bits);
+ io_req->bi_opf, &dp, sync_error_bits, ioprio);

return async_io(io_req->client, num_regions, where,
io_req->bi_opf, &dp, io_req->notify.fn,
- io_req->notify.context);
+ io_req->notify.context, ioprio);
}
EXPORT_SYMBOL(dm_io);

diff --git a/drivers/md/dm-kcopyd.c b/drivers/md/dm-kcopyd.c
index d01807c50f20..79c65c9ad5fa 100644
--- a/drivers/md/dm-kcopyd.c
+++ b/drivers/md/dm-kcopyd.c
@@ -578,9 +578,9 @@ static int run_io_job(struct kcopyd_job *job)
io_job_start(job->kc->throttle);

if (job->op == REQ_OP_READ)
- r = dm_io(&io_req, 1, &job->source, NULL);
+ r = dm_io(&io_req, 1, &job->source, NULL, IOPRIO_DEFAULT);
else
- r = dm_io(&io_req, job->num_dests, job->dests, NULL);
+ r = dm_io(&io_req, job->num_dests, job->dests, NULL, IOPRIO_DEFAULT);

return r;
}
diff --git a/drivers/md/dm-log.c b/drivers/md/dm-log.c
index f9f84236dfcd..f7f9c2100937 100644
--- a/drivers/md/dm-log.c
+++ b/drivers/md/dm-log.c
@@ -300,7 +300,7 @@ static int rw_header(struct log_c *lc, enum req_op op)
{
lc->io_req.bi_opf = op;

- return dm_io(&lc->io_req, 1, &lc->header_location, NULL);
+ return dm_io(&lc->io_req, 1, &lc->header_location, NULL, IOPRIO_DEFAULT);
}

static int flush_header(struct log_c *lc)
@@ -313,7 +313,7 @@ static int flush_header(struct log_c *lc)

lc->io_req.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;

- return dm_io(&lc->io_req, 1, &null_location, NULL);
+ return dm_io(&lc->io_req, 1, &null_location, NULL, IOPRIO_DEFAULT);
}

static int read_header(struct log_c *log)
diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index ddcb2bc4a617..9511dae5b556 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -278,7 +278,7 @@ static int mirror_flush(struct dm_target *ti)
}

error_bits = -1;
- dm_io(&io_req, ms->nr_mirrors, io, &error_bits);
+ dm_io(&io_req, ms->nr_mirrors, io, &error_bits, IOPRIO_DEFAULT);
if (unlikely(error_bits != 0)) {
for (i = 0; i < ms->nr_mirrors; i++)
if (test_bit(i, &error_bits))
@@ -554,7 +554,7 @@ static void read_async_bio(struct mirror *m, struct bio *bio)

map_region(&io, m, bio);
bio_set_m(bio, m);
- BUG_ON(dm_io(&io_req, 1, &io, NULL));
+ BUG_ON(dm_io(&io_req, 1, &io, NULL, IOPRIO_DEFAULT));
}

static inline int region_in_sync(struct mirror_set *ms, region_t region,
@@ -681,7 +681,7 @@ static void do_write(struct mirror_set *ms, struct bio *bio)
*/
bio_set_m(bio, get_default_mirror(ms));

- BUG_ON(dm_io(&io_req, ms->nr_mirrors, io, NULL));
+ BUG_ON(dm_io(&io_req, ms->nr_mirrors, io, NULL, IOPRIO_DEFAULT));
}

static void do_writes(struct mirror_set *ms, struct bio_list *writes)
diff --git a/drivers/md/dm-snap-persistent.c b/drivers/md/dm-snap-persistent.c
index 15649921f2a9..568d10842b1f 100644
--- a/drivers/md/dm-snap-persistent.c
+++ b/drivers/md/dm-snap-persistent.c
@@ -223,7 +223,7 @@ static void do_metadata(struct work_struct *work)
{
struct mdata_req *req = container_of(work, struct mdata_req, work);

- req->result = dm_io(req->io_req, 1, req->where, NULL);
+ req->result = dm_io(req->io_req, 1, req->where, NULL, IOPRIO_DEFAULT);
}

/*
@@ -247,7 +247,7 @@ static int chunk_io(struct pstore *ps, void *area, chunk_t chunk, blk_opf_t opf,
struct mdata_req req;

if (!metadata)
- return dm_io(&io_req, 1, &where, NULL);
+ return dm_io(&io_req, 1, &where, NULL, IOPRIO_DEFAULT);

req.where = &where;
req.io_req = &io_req;
diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
index 074cb785eafc..6a4279bfb1e7 100644
--- a/drivers/md/dm-writecache.c
+++ b/drivers/md/dm-writecache.c
@@ -531,7 +531,7 @@ static void ssd_commit_flushed(struct dm_writecache *wc, bool wait_for_ios)
req.notify.context = &endio;

/* writing via async dm-io (implied by notify.fn above) won't return an error */
- (void) dm_io(&req, 1, &region, NULL);
+ (void) dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
i = j;
}

@@ -568,7 +568,7 @@ static void ssd_commit_superblock(struct dm_writecache *wc)
req.notify.fn = NULL;
req.notify.context = NULL;

- r = dm_io(&req, 1, &region, NULL);
+ r = dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
writecache_error(wc, r, "error writing superblock");
}
@@ -596,7 +596,7 @@ static void writecache_disk_flush(struct dm_writecache *wc, struct dm_dev *dev)
req.client = wc->dm_io;
req.notify.fn = NULL;

- r = dm_io(&req, 1, &region, NULL);
+ r = dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
writecache_error(wc, r, "error flushing metadata: %d", r);
}
@@ -990,7 +990,7 @@ static int writecache_read_metadata(struct dm_writecache *wc, sector_t n_sectors
req.client = wc->dm_io;
req.notify.fn = NULL;

- return dm_io(&req, 1, &region, NULL);
+ return dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
}

static void writecache_resume(struct dm_target *ti)
diff --git a/include/linux/dm-io.h b/include/linux/dm-io.h
index 7595142f3fc5..7b2968612b7e 100644
--- a/include/linux/dm-io.h
+++ b/include/linux/dm-io.h
@@ -80,7 +80,8 @@ void dm_io_client_destroy(struct dm_io_client *client);
* error occurred doing io to the corresponding region.
*/
int dm_io(struct dm_io_request *io_req, unsigned int num_regions,
- struct dm_io_region *region, unsigned int long *sync_error_bits);
+ struct dm_io_region *region, unsigned int long *sync_error_bits,
+ unsigned short ioprio);

#endif /* __KERNEL__ */
#endif /* _LINUX_DM_IO_H */
--
2.34.1

2023-12-13 10:43:32

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v5 5/5] dm-crypt: Fix lost ioprio when queuing write bios

From: Hongyu Jin <[email protected]>

The original submitting bio's bi_ioprio setting is retained in
struct dm_crypt_io::base_bio; set that original ioprio on the cloned
bio for writes.

Link: https://lore.kernel.org/dm-devel/[email protected]

Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-crypt.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 6de107aff331..7149da6555b8 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1683,6 +1683,7 @@ static struct bio *crypt_alloc_buffer(struct dm_crypt_io *io, unsigned int size)
GFP_NOIO, &cc->bs);
clone->bi_private = io;
clone->bi_end_io = crypt_endio;
+ clone->bi_ioprio = io->base_bio->bi_ioprio;

remaining_size = size;

--
2.34.1

2023-12-13 17:42:55

by Mike Snitzer

[permalink] [raw]
Subject: Re: [PATCH v5 1/5] block: Fix bio IO priority setting

On Wed, Dec 13 2023 at 5:42P -0500,
Hongyu Jin <[email protected]> wrote:

> From: Hongyu Jin <[email protected]>
>
> Move bio_set_ioprio() into submit_bio():
> 1. Call bio_set_ioprio() only once to set the priority of the original
> bio; bios cloned and split from the original bio automatically
> inherit its priority during the clone process.
>
> 2. The IO priority can be passed to modules that implement
> struct gendisk::fops::submit_bio, which helps resolve some
> of the IO priority loss issues.
>
> This patch depends on commit 82b74cac2849 ("blk-ioprio: Convert from
> rqos policy to direct call")
>
> Fixes: a78418e6a04c ("block: Always initialize bio IO priority on submit")
>
> Co-developed-by: Yibin Ding <[email protected]>
> Signed-off-by: Yibin Ding <[email protected]>
> Signed-off-by: Hongyu Jin <[email protected]>

Would be nice to get this block core fix upstream ASAP independent of
your various DM changes.

Please simplify this patch's header like was requested in review of v4:
https://patchwork.kernel.org/project/dm-devel/patch/[email protected]/

2023-12-18 01:25:12

by Hongyu Jin

[permalink] [raw]
Subject: Re: [PATCH v5 1/5] block: Fix bio IO priority setting

On Thu, Dec 14, 2023 at 00:58, Mike Snitzer <[email protected]> wrote:
>
> On Wed, Dec 13 2023 at 5:42P -0500,
> Hongyu Jin <[email protected]> wrote:
>
> > From: Hongyu Jin <[email protected]>
> >
> > Move bio_set_ioprio() into submit_bio():
> > 1. Call bio_set_ioprio() only once to set the priority of the original
> > bio; bios cloned and split from the original bio automatically
> > inherit its priority during the clone process.
> >
> > 2. The IO priority can be passed to modules that implement
> > struct gendisk::fops::submit_bio, which helps resolve some
> > of the IO priority loss issues.
> >
> > This patch depends on commit 82b74cac2849 ("blk-ioprio: Convert from
> > rqos policy to direct call")
> >
> > Fixes: a78418e6a04c ("block: Always initialize bio IO priority on submit")
> >
> > Co-developed-by: Yibin Ding <[email protected]>
> > Signed-off-by: Yibin Ding <[email protected]>
> > Signed-off-by: Hongyu Jin <[email protected]>
>
> Would be nice to get this block core fix upstream ASAP independent of
> your various DM changes.
The dm changes depend on the block change to take effect, so they are
reviewed together.
>
> Please simplify this patch's header like was requested in review of v4:
> https://patchwork.kernel.org/project/dm-devel/patch/[email protected]/
ok, I will simplify the header and resend.

2023-12-18 01:28:20

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v5 RESEND 0/5] Fix I/O priority lost in device-mapper

From: Hongyu Jin <[email protected]>

A high-priority task obtains data from a dm-verity device using RT I/O
priority, but during verification the I/O that reads FEC and hash data
is submitted by a kworker, loses the RT priority, and can be blocked by
low-priority I/O. dm-crypt has the same problem on the write path.

This is because the io_context and blkcg information is missing in the
kworker context.
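
For reference, the failing scenario can be set up from userspace roughly
like this (illustrative sketch, not part of this series; the device the
task then reads from is assumed to be a dm-verity target):

	#include <unistd.h>
	#include <sys/syscall.h>
	#include <linux/ioprio.h>

	int main(void)
	{
		/* give the current task RT I/O priority, level 0 */
		if (syscall(SYS_ioprio_set, IOPRIO_WHO_PROCESS, 0,
			    IOPRIO_PRIO_VALUE(IOPRIO_CLASS_RT, 0)) < 0)
			return 1;

		/*
		 * Reads issued from here on carry RT priority, but the
		 * kworker reading the hash/FEC blocks does not inherit it.
		 */
		return 0;
	}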

Move bio_set_ioprio() into submit_bio():
1. Call bio_set_ioprio() only once to set the priority of the original
bio; bios cloned and split from the original bio automatically
inherit its priority during the clone process (see the sketch after
this list).

2. Pass the IO priority of the original bio down to dm, so that each
dm target can inherit it as needed.
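
With point 1 in place a target does not need to copy the priority by
hand when it clones the original bio, e.g. (sketch only; bdev, orig_bio
and bs stand for a target's own objects):

	struct bio *clone = bio_alloc_clone(bdev, orig_bio, GFP_NOIO, bs);

	/*
	 * __bio_clone() copies bi_ioprio from orig_bio, so the clone
	 * already carries the submitter's priority at this point.
	 */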


All changes are based on commit 9bacdd8996c7 ("Merge tag 'for-6.7-rc1-tag'
of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux")
on the master branch.


Changes in v5:
- Rewrite patch 2, add ioprio parameter in dm_io();
- Modify dm_io() in patch 3
Changes in v4:
- Modify commit message per review suggestions
- Modify patch for dm-crypt
Changes in v3:
- Split patch for device-mapper
- Add patch to fix dm-crypt I/O priority issue
- Add block patch to review together
- Fix some errors in the v2 patch

Changes in v2:
- Add ioprio field in struct dm_io_region
- Initialize struct dm_io_region::ioprio to IOPRIO_DEFAULT
- Add two interfaces



Hongyu Jin (5):
block: Fix bio IO priority setting
dm: Support I/O priority for dm_io()
dm-bufio: Support I/O priority
dm verity: Fix I/O priority lost when read FEC and hash
dm-crypt: Fix lost ioprio when queuing write bios

block/blk-core.c | 10 +++++
block/blk-mq.c | 11 ------
drivers/md/dm-bufio.c | 39 ++++++++++---------
drivers/md/dm-crypt.c | 1 +
drivers/md/dm-ebs-target.c | 8 ++--
drivers/md/dm-integrity.c | 12 +++---
drivers/md/dm-io.c | 23 ++++++-----
drivers/md/dm-kcopyd.c | 4 +-
drivers/md/dm-log.c | 4 +-
drivers/md/dm-raid1.c | 6 +--
drivers/md/dm-snap-persistent.c | 8 ++--
drivers/md/dm-verity-fec.c | 5 ++-
drivers/md/dm-verity-target.c | 8 +++-
drivers/md/dm-writecache.c | 8 ++--
drivers/md/persistent-data/dm-block-manager.c | 6 +--
include/linux/dm-bufio.h | 5 +--
include/linux/dm-io.h | 3 +-
17 files changed, 85 insertions(+), 76 deletions(-)

--
2.34.1


2023-12-18 01:28:49

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v5 RESEND 1/5] block: Fix bio IO priority setting

From: Hongyu Jin <[email protected]>

Move bio_set_ioprio() into submit_bio():
1. Call bio_set_ioprio() only once to set the priority of the original
bio; bios cloned and split from the original bio automatically
inherit its priority during the clone process.

2. The IO priority can be passed to modules that implement
struct gendisk::fops::submit_bio, which helps resolve some
of the IO priority loss issues.

This patch depends on commit 82b74cac2849 ("blk-ioprio: Convert from
rqos policy to direct call")

Fixes: a78418e6a04c ("block: Always initialize bio IO priority on submit")

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
block/blk-core.c | 10 ++++++++++
block/blk-mq.c | 11 -----------
2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index fdf25b8d6e78..68158c327aea 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -49,6 +49,7 @@
#include "blk-pm.h"
#include "blk-cgroup.h"
#include "blk-throttle.h"
+#include "blk-ioprio.h"

struct dentry *blk_debugfs_root;

@@ -809,6 +810,14 @@ void submit_bio_noacct(struct bio *bio)
}
EXPORT_SYMBOL(submit_bio_noacct);

+static void bio_set_ioprio(struct bio *bio)
+{
+ /* Nobody set ioprio so far? Initialize it based on task's nice value */
+ if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE)
+ bio->bi_ioprio = get_current_ioprio();
+ blkcg_set_ioprio(bio);
+}
+
/**
* submit_bio - submit a bio to the block device layer for I/O
* @bio: The &struct bio which describes the I/O
@@ -831,6 +840,7 @@ void submit_bio(struct bio *bio)
count_vm_events(PGPGOUT, bio_sectors(bio));
}

+ bio_set_ioprio(bio);
submit_bio_noacct(bio);
}
EXPORT_SYMBOL(submit_bio);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index e2d11183f62e..a6e2609df9c9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -40,7 +40,6 @@
#include "blk-stat.h"
#include "blk-mq-sched.h"
#include "blk-rq-qos.h"
-#include "blk-ioprio.h"

static DEFINE_PER_CPU(struct llist_head, blk_cpu_done);
static DEFINE_PER_CPU(call_single_data_t, blk_cpu_csd);
@@ -2922,14 +2921,6 @@ static inline struct request *blk_mq_get_cached_request(struct request_queue *q,
return rq;
}

-static void bio_set_ioprio(struct bio *bio)
-{
- /* Nobody set ioprio so far? Initialize it based on task's nice value */
- if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE)
- bio->bi_ioprio = get_current_ioprio();
- blkcg_set_ioprio(bio);
-}
-
/**
* blk_mq_submit_bio - Create and send a request to block device.
* @bio: Bio pointer.
@@ -2963,8 +2954,6 @@ void blk_mq_submit_bio(struct bio *bio)
if (!bio_integrity_prep(bio))
return;

- bio_set_ioprio(bio);
-
rq = blk_mq_get_cached_request(q, plug, &bio, nr_segs);
if (!rq) {
if (!bio)
--
2.34.1


2023-12-18 01:29:04

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v5 RESEND 2/5] dm: Support I/O priority for dm_io()

From: Hongyu Jin <[email protected]>

Add I/O priority parameter for dm_io().

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-bufio.c | 6 +++---
drivers/md/dm-integrity.c | 10 +++++-----
drivers/md/dm-io.c | 23 +++++++++++++----------
drivers/md/dm-kcopyd.c | 4 ++--
drivers/md/dm-log.c | 4 ++--
drivers/md/dm-raid1.c | 6 +++---
drivers/md/dm-snap-persistent.c | 4 ++--
drivers/md/dm-writecache.c | 8 ++++----
include/linux/dm-io.h | 3 ++-
9 files changed, 36 insertions(+), 32 deletions(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 62eb27639c9b..91b05bf4b920 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1279,7 +1279,7 @@ static void use_dmio(struct dm_buffer *b, enum req_op op, sector_t sector,
io_req.mem.ptr.vma = (char *)b->data + offset;
}

- r = dm_io(&io_req, 1, &region, NULL);
+ r = dm_io(&io_req, 1, &region, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
b->end_io(b, errno_to_blk_status(r));
}
@@ -2130,7 +2130,7 @@ int dm_bufio_issue_flush(struct dm_bufio_client *c)
if (WARN_ON_ONCE(dm_bufio_in_request()))
return -EINVAL;

- return dm_io(&io_req, 1, &io_reg, NULL);
+ return dm_io(&io_req, 1, &io_reg, NULL, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_issue_flush);

@@ -2154,7 +2154,7 @@ int dm_bufio_issue_discard(struct dm_bufio_client *c, sector_t block, sector_t c
if (WARN_ON_ONCE(dm_bufio_in_request()))
return -EINVAL; /* discards are optional */

- return dm_io(&io_req, 1, &io_reg, NULL);
+ return dm_io(&io_req, 1, &io_reg, NULL, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_issue_discard);

diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index e85c688fd91e..9ffd093ad6cc 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -553,7 +553,7 @@ static int sync_rw_sb(struct dm_integrity_c *ic, blk_opf_t opf)
}
}

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
return r;

@@ -1071,7 +1071,7 @@ static void rw_journal_sectors(struct dm_integrity_c *ic, blk_opf_t opf,
io_loc.sector = ic->start + SB_SECTORS + sector;
io_loc.count = n_sectors;

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r)) {
dm_integrity_io_error(ic, (opf & REQ_OP_MASK) == REQ_OP_READ ?
"reading journal" : "writing journal", r);
@@ -1188,7 +1188,7 @@ static void copy_from_journal(struct dm_integrity_c *ic, unsigned int section, u
io_loc.sector = target;
io_loc.count = n_sectors;

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r)) {
WARN_ONCE(1, "asynchronous dm_io failed: %d", r);
fn(-1UL, data);
@@ -1517,7 +1517,7 @@ static void dm_integrity_flush_buffers(struct dm_integrity_c *ic, bool flush_dat
fr.io_reg.count = 0,
fr.ic = ic;
init_completion(&fr.comp);
- r = dm_io(&fr.io_req, 1, &fr.io_reg, NULL);
+ r = dm_io(&fr.io_req, 1, &fr.io_reg, NULL, IOPRIO_DEFAULT);
BUG_ON(r);
}

@@ -2739,7 +2739,7 @@ static void integrity_recalc(struct work_struct *w)
io_loc.sector = get_data_sector(ic, area, offset);
io_loc.count = n_sectors;

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r)) {
dm_integrity_io_error(ic, "reading data", r);
goto err;
diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index f053ce245814..7409490259d1 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -305,7 +305,7 @@ static void km_dp_init(struct dpages *dp, void *data)
*/
static void do_region(const blk_opf_t opf, unsigned int region,
struct dm_io_region *where, struct dpages *dp,
- struct io *io)
+ struct io *io, unsigned short ioprio)
{
struct bio *bio;
struct page *page;
@@ -354,6 +354,7 @@ static void do_region(const blk_opf_t opf, unsigned int region,
&io->client->bios);
bio->bi_iter.bi_sector = where->sector + (where->count - remaining);
bio->bi_end_io = endio;
+ bio->bi_ioprio = ioprio;
store_io_and_region_in_bio(bio, io, region);

if (op == REQ_OP_DISCARD || op == REQ_OP_WRITE_ZEROES) {
@@ -383,7 +384,7 @@ static void do_region(const blk_opf_t opf, unsigned int region,

static void dispatch_io(blk_opf_t opf, unsigned int num_regions,
struct dm_io_region *where, struct dpages *dp,
- struct io *io, int sync)
+ struct io *io, int sync, unsigned short ioprio)
{
int i;
struct dpages old_pages = *dp;
@@ -400,7 +401,7 @@ static void dispatch_io(blk_opf_t opf, unsigned int num_regions,
for (i = 0; i < num_regions; i++) {
*dp = old_pages;
if (where[i].count || (opf & REQ_PREFLUSH))
- do_region(opf, i, where + i, dp, io);
+ do_region(opf, i, where + i, dp, io, ioprio);
}

/*
@@ -425,7 +426,7 @@ static void sync_io_complete(unsigned long error, void *context)

static int sync_io(struct dm_io_client *client, unsigned int num_regions,
struct dm_io_region *where, blk_opf_t opf, struct dpages *dp,
- unsigned long *error_bits)
+ unsigned long *error_bits, unsigned short ioprio)
{
struct io *io;
struct sync_io sio;
@@ -447,7 +448,7 @@ static int sync_io(struct dm_io_client *client, unsigned int num_regions,
io->vma_invalidate_address = dp->vma_invalidate_address;
io->vma_invalidate_size = dp->vma_invalidate_size;

- dispatch_io(opf, num_regions, where, dp, io, 1);
+ dispatch_io(opf, num_regions, where, dp, io, 1, ioprio);

wait_for_completion_io(&sio.wait);

@@ -459,7 +460,8 @@ static int sync_io(struct dm_io_client *client, unsigned int num_regions,

static int async_io(struct dm_io_client *client, unsigned int num_regions,
struct dm_io_region *where, blk_opf_t opf,
- struct dpages *dp, io_notify_fn fn, void *context)
+ struct dpages *dp, io_notify_fn fn, void *context,
+ unsigned short ioprio)
{
struct io *io;

@@ -479,7 +481,7 @@ static int async_io(struct dm_io_client *client, unsigned int num_regions,
io->vma_invalidate_address = dp->vma_invalidate_address;
io->vma_invalidate_size = dp->vma_invalidate_size;

- dispatch_io(opf, num_regions, where, dp, io, 0);
+ dispatch_io(opf, num_regions, where, dp, io, 0, ioprio);
return 0;
}

@@ -521,7 +523,8 @@ static int dp_init(struct dm_io_request *io_req, struct dpages *dp,
}

int dm_io(struct dm_io_request *io_req, unsigned int num_regions,
- struct dm_io_region *where, unsigned long *sync_error_bits)
+ struct dm_io_region *where, unsigned long *sync_error_bits,
+ unsigned short ioprio)
{
int r;
struct dpages dp;
@@ -532,11 +535,11 @@ int dm_io(struct dm_io_request *io_req, unsigned int num_regions,

if (!io_req->notify.fn)
return sync_io(io_req->client, num_regions, where,
- io_req->bi_opf, &dp, sync_error_bits);
+ io_req->bi_opf, &dp, sync_error_bits, ioprio);

return async_io(io_req->client, num_regions, where,
io_req->bi_opf, &dp, io_req->notify.fn,
- io_req->notify.context);
+ io_req->notify.context, ioprio);
}
EXPORT_SYMBOL(dm_io);

diff --git a/drivers/md/dm-kcopyd.c b/drivers/md/dm-kcopyd.c
index d01807c50f20..79c65c9ad5fa 100644
--- a/drivers/md/dm-kcopyd.c
+++ b/drivers/md/dm-kcopyd.c
@@ -578,9 +578,9 @@ static int run_io_job(struct kcopyd_job *job)
io_job_start(job->kc->throttle);

if (job->op == REQ_OP_READ)
- r = dm_io(&io_req, 1, &job->source, NULL);
+ r = dm_io(&io_req, 1, &job->source, NULL, IOPRIO_DEFAULT);
else
- r = dm_io(&io_req, job->num_dests, job->dests, NULL);
+ r = dm_io(&io_req, job->num_dests, job->dests, NULL, IOPRIO_DEFAULT);

return r;
}
diff --git a/drivers/md/dm-log.c b/drivers/md/dm-log.c
index f9f84236dfcd..f7f9c2100937 100644
--- a/drivers/md/dm-log.c
+++ b/drivers/md/dm-log.c
@@ -300,7 +300,7 @@ static int rw_header(struct log_c *lc, enum req_op op)
{
lc->io_req.bi_opf = op;

- return dm_io(&lc->io_req, 1, &lc->header_location, NULL);
+ return dm_io(&lc->io_req, 1, &lc->header_location, NULL, IOPRIO_DEFAULT);
}

static int flush_header(struct log_c *lc)
@@ -313,7 +313,7 @@ static int flush_header(struct log_c *lc)

lc->io_req.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;

- return dm_io(&lc->io_req, 1, &null_location, NULL);
+ return dm_io(&lc->io_req, 1, &null_location, NULL, IOPRIO_DEFAULT);
}

static int read_header(struct log_c *log)
diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index ddcb2bc4a617..9511dae5b556 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -278,7 +278,7 @@ static int mirror_flush(struct dm_target *ti)
}

error_bits = -1;
- dm_io(&io_req, ms->nr_mirrors, io, &error_bits);
+ dm_io(&io_req, ms->nr_mirrors, io, &error_bits, IOPRIO_DEFAULT);
if (unlikely(error_bits != 0)) {
for (i = 0; i < ms->nr_mirrors; i++)
if (test_bit(i, &error_bits))
@@ -554,7 +554,7 @@ static void read_async_bio(struct mirror *m, struct bio *bio)

map_region(&io, m, bio);
bio_set_m(bio, m);
- BUG_ON(dm_io(&io_req, 1, &io, NULL));
+ BUG_ON(dm_io(&io_req, 1, &io, NULL, IOPRIO_DEFAULT));
}

static inline int region_in_sync(struct mirror_set *ms, region_t region,
@@ -681,7 +681,7 @@ static void do_write(struct mirror_set *ms, struct bio *bio)
*/
bio_set_m(bio, get_default_mirror(ms));

- BUG_ON(dm_io(&io_req, ms->nr_mirrors, io, NULL));
+ BUG_ON(dm_io(&io_req, ms->nr_mirrors, io, NULL, IOPRIO_DEFAULT));
}

static void do_writes(struct mirror_set *ms, struct bio_list *writes)
diff --git a/drivers/md/dm-snap-persistent.c b/drivers/md/dm-snap-persistent.c
index 15649921f2a9..568d10842b1f 100644
--- a/drivers/md/dm-snap-persistent.c
+++ b/drivers/md/dm-snap-persistent.c
@@ -223,7 +223,7 @@ static void do_metadata(struct work_struct *work)
{
struct mdata_req *req = container_of(work, struct mdata_req, work);

- req->result = dm_io(req->io_req, 1, req->where, NULL);
+ req->result = dm_io(req->io_req, 1, req->where, NULL, IOPRIO_DEFAULT);
}

/*
@@ -247,7 +247,7 @@ static int chunk_io(struct pstore *ps, void *area, chunk_t chunk, blk_opf_t opf,
struct mdata_req req;

if (!metadata)
- return dm_io(&io_req, 1, &where, NULL);
+ return dm_io(&io_req, 1, &where, NULL, IOPRIO_DEFAULT);

req.where = &where;
req.io_req = &io_req;
diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
index 074cb785eafc..6a4279bfb1e7 100644
--- a/drivers/md/dm-writecache.c
+++ b/drivers/md/dm-writecache.c
@@ -531,7 +531,7 @@ static void ssd_commit_flushed(struct dm_writecache *wc, bool wait_for_ios)
req.notify.context = &endio;

/* writing via async dm-io (implied by notify.fn above) won't return an error */
- (void) dm_io(&req, 1, &region, NULL);
+ (void) dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
i = j;
}

@@ -568,7 +568,7 @@ static void ssd_commit_superblock(struct dm_writecache *wc)
req.notify.fn = NULL;
req.notify.context = NULL;

- r = dm_io(&req, 1, &region, NULL);
+ r = dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
writecache_error(wc, r, "error writing superblock");
}
@@ -596,7 +596,7 @@ static void writecache_disk_flush(struct dm_writecache *wc, struct dm_dev *dev)
req.client = wc->dm_io;
req.notify.fn = NULL;

- r = dm_io(&req, 1, &region, NULL);
+ r = dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
writecache_error(wc, r, "error flushing metadata: %d", r);
}
@@ -990,7 +990,7 @@ static int writecache_read_metadata(struct dm_writecache *wc, sector_t n_sectors
req.client = wc->dm_io;
req.notify.fn = NULL;

- return dm_io(&req, 1, &region, NULL);
+ return dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
}

static void writecache_resume(struct dm_target *ti)
diff --git a/include/linux/dm-io.h b/include/linux/dm-io.h
index 7595142f3fc5..7b2968612b7e 100644
--- a/include/linux/dm-io.h
+++ b/include/linux/dm-io.h
@@ -80,7 +80,8 @@ void dm_io_client_destroy(struct dm_io_client *client);
* error occurred doing io to the corresponding region.
*/
int dm_io(struct dm_io_request *io_req, unsigned int num_regions,
- struct dm_io_region *region, unsigned int long *sync_error_bits);
+ struct dm_io_region *region, unsigned int long *sync_error_bits,
+ unsigned short ioprio);

#endif /* __KERNEL__ */
#endif /* _LINUX_DM_IO_H */
--
2.34.1


2023-12-18 01:29:44

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v5 RESEND 4/5] dm verity: Fix I/O priority lost when read FEC and hash

From: Hongyu Jin <[email protected]>

To fix this problem: when FEC and hash data are read from disk, their
I/O priority is inconsistent with that of the data block and they can be
blocked by other low-priority I/O.

Make the I/O for FEC and hash carry the same I/O priority as the
original data I/O.

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-verity-fec.c | 3 ++-
drivers/md/dm-verity-target.c | 8 ++++++--
2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c
index 715173cbf0ee..6a5a679e7e8a 100644
--- a/drivers/md/dm-verity-fec.c
+++ b/drivers/md/dm-verity-fec.c
@@ -209,6 +209,7 @@ static int fec_read_bufs(struct dm_verity *v, struct dm_verity_io *io,
u8 *bbuf, *rs_block;
u8 want_digest[HASH_MAX_DIGESTSIZE];
unsigned int n, k;
+ struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);

if (neras)
*neras = 0;
@@ -247,7 +248,7 @@ static int fec_read_bufs(struct dm_verity *v, struct dm_verity_io *io,
bufio = v->bufio;
}

- bbuf = dm_bufio_read(bufio, block, &buf, IOPRIO_DEFAULT);
+ bbuf = dm_bufio_read(bufio, block, &buf, bio_prio(bio));
if (IS_ERR(bbuf)) {
DMWARN_LIMIT("%s: FEC %llu: read failed (%llu): %ld",
v->data_dev->name,
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 0038e168f3d7..8c911b6722ce 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -51,6 +51,7 @@ static DEFINE_STATIC_KEY_FALSE(use_tasklet_enabled);
struct dm_verity_prefetch_work {
struct work_struct work;
struct dm_verity *v;
+ unsigned short ioprio;
sector_t block;
unsigned int n_blocks;
};
@@ -293,6 +294,7 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
int r;
sector_t hash_block;
unsigned int offset;
+ struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);

verity_hash_at_level(v, block, level, &hash_block, &offset);

@@ -307,7 +309,7 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
return -EAGAIN;
}
} else
- data = dm_bufio_read(v->bufio, hash_block, &buf, IOPRIO_DEFAULT);
+ data = dm_bufio_read(v->bufio, hash_block, &buf, bio_prio(bio));

if (IS_ERR(data))
return PTR_ERR(data);
@@ -717,7 +719,7 @@ static void verity_prefetch_io(struct work_struct *work)
}
no_prefetch_cluster:
dm_bufio_prefetch(v->bufio, hash_block_start,
- hash_block_end - hash_block_start + 1, IOPRIO_DEFAULT);
+ hash_block_end - hash_block_start + 1, pw->ioprio);
}

kfree(pw);
@@ -728,6 +730,7 @@ static void verity_submit_prefetch(struct dm_verity *v, struct dm_verity_io *io)
sector_t block = io->block;
unsigned int n_blocks = io->n_blocks;
struct dm_verity_prefetch_work *pw;
+ struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);

if (v->validated_blocks) {
while (n_blocks && test_bit(block, v->validated_blocks)) {
@@ -751,6 +754,7 @@ static void verity_submit_prefetch(struct dm_verity *v, struct dm_verity_io *io)
pw->v = v;
pw->block = block;
pw->n_blocks = n_blocks;
+ pw->ioprio = bio_prio(bio);
queue_work(v->verify_wq, &pw->work);
}

--
2.34.1


2023-12-18 01:30:02

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v5 RESEND 5/5] dm-crypt: Fix lost ioprio when queuing write bios

From: Hongyu Jin <[email protected]>

The original submitting bio's bi_ioprio setting is retained in
struct dm_crypt_io::base_bio; set that original ioprio on the cloned
bio for writes.

Link: https://lore.kernel.org/dm-devel/[email protected]

Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-crypt.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 6de107aff331..7149da6555b8 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1683,6 +1683,7 @@ static struct bio *crypt_alloc_buffer(struct dm_crypt_io *io, unsigned int size)
GFP_NOIO, &cc->bs);
clone->bi_private = io;
clone->bi_end_io = crypt_endio;
+ clone->bi_ioprio = io->base_bio->bi_ioprio;

remaining_size = size;

--
2.34.1


2023-12-19 00:47:01

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH v5 RESEND 0/5] Fix I/O priority lost in device-mapper

On Mon, Dec 18, 2023 at 09:27:41AM +0800, Hongyu Jin wrote:
> All changes are based on commit 9bacdd8996c7 ("Merge tag 'for-6.7-rc1-tag'
> of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux")
> on the master branch.

That's from last month, and the patchset does not apply to the latest mainline.
Can you make sure to use an up-to-date tree? Thanks.

- Eric

2023-12-19 22:46:15

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH v5 RESEND 2/5] dm: Support I/O priority for dm_io()

On Mon, Dec 18, 2023 at 09:27:43AM +0800, Hongyu Jin wrote:
> From: Hongyu Jin <[email protected]>
>
> Add I/O priority parameter for dm_io().

Needs an explanation.

- Eric

2023-12-19 22:48:34

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH v5 RESEND 4/5] dm verity: Fix I/O priority lost when read FEC and hash

On Mon, Dec 18, 2023 at 09:27:45AM +0800, Hongyu Jin wrote:
> From: Hongyu Jin <[email protected]>
>
> To fix this problem: when FEC and hash data are read from disk, their
> I/O priority is inconsistent with that of the data block and they can be
> blocked by other low-priority I/O.
>
> Make the I/O for FEC and hash carry the same I/O priority as the
> original data I/O.

"To fix this problem" is supposed to be in the second paragraph, not the first,
right?

> @@ -728,6 +730,7 @@ static void verity_submit_prefetch(struct dm_verity *v, struct dm_verity_io *io)
> sector_t block = io->block;
> unsigned int n_blocks = io->n_blocks;
> struct dm_verity_prefetch_work *pw;
> + struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);
>
> if (v->validated_blocks) {
> while (n_blocks && test_bit(block, v->validated_blocks)) {

The caller has the bio pointer already, so maybe just add it as a parameter to
verity_submit_prefetch()?
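
Something like this untested sketch:

	static void verity_submit_prefetch(struct dm_verity *v,
					   struct dm_verity_io *io,
					   unsigned short ioprio)
	{
		...
		pw->ioprio = ioprio;
		...
	}

with the caller passing bio_prio(bio).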

- Eric

2023-12-19 22:51:03

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH v5 RESEND 5/5] dm-crypt: Fix lost ioprio when queuing write bios

On Mon, Dec 18, 2023 at 09:27:46AM +0800, Hongyu Jin wrote:
> From: Hongyu Jin <[email protected]>
>
> The original submitting bio's bi_ioprio setting is retained in
> struct dm_crypt_io::base_bio; set that original ioprio on the cloned
> bio for writes.

This commit message does not make sense. Can you make the commit message
properly describe the problem and how the patch fixes it?

- Eric

2023-12-20 01:15:27

by Hongyu Jin

[permalink] [raw]
Subject: Re: [PATCH v5 RESEND 4/5] dm verity: Fix I/O priority lost when read FEC and hash

On Wed, Dec 20, 2023 at 06:48, Eric Biggers <[email protected]> wrote:
>
> On Mon, Dec 18, 2023 at 09:27:45AM +0800, Hongyu Jin wrote:
> > From: Hongyu Jin <[email protected]>
> >
> > To fix this problem: when FEC and hash data are read from disk, their
> > I/O priority is inconsistent with that of the data block and they can be
> > blocked by other low-priority I/O.
> >
> > Make the I/O for FEC and hash carry the same I/O priority as the
> > original data I/O.
>
> "To fix this problem" is supposed to be in the second paragraph, not the first,
> right?
Yes. The verification and error-correction steps run only after the
data has been obtained.
>
> > @@ -728,6 +730,7 @@ static void verity_submit_prefetch(struct dm_verity *v, struct dm_verity_io *io)
> > sector_t block = io->block;
> > unsigned int n_blocks = io->n_blocks;
> > struct dm_verity_prefetch_work *pw;
> > + struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);
> >
> > if (v->validated_blocks) {
> > while (n_blocks && test_bit(block, v->validated_blocks)) {
>
> The caller has the bio pointer already, so maybe just add it as a parameter to
> verity_submit_prefetch()?
>
> - Eric
ok, I will change it.

2023-12-20 10:05:16

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v6 0/5] Fix I/O priority lost in device-mapper

From: Hongyu Jin <[email protected]>

High-priority tasks get data from dm-verity devices via RT I/O
priority, but the I/O that reads FEC and hash values during
verification is submitted by a kworker, loses the RT priority, and may
be blocked by low-priority I/O.

dm-crypt has the same problem on the data write path.

This is because the io_context and blkcg information is missing in the
kworker context.

Move bio_set_ioprio() into submit_bio():
1. Call bio_set_ioprio() only once to set the priority of the original
bio; bios cloned and split from the original bio automatically
inherit its priority during the clone process.

2. Pass the IO priority of the original bio down to dm, so that each
dm target can inherit it as needed (see the sketch after this list).
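
As an illustration of point 2, a target that issues metadata I/O on
behalf of a bio can now forward the submitter's priority (sketch only;
example_map, the bufio client and block 0 are placeholders, not code
from this series):

	static int example_map(struct dm_target *ti, struct bio *bio)
	{
		struct dm_bufio_client *c = ti->private;
		struct dm_buffer *bp;
		void *data;

		/* metadata read done on behalf of this bio keeps the
		 * submitter's priority instead of the kworker default */
		data = dm_bufio_read(c, 0, &bp, bio_prio(bio));
		if (IS_ERR(data))
			return DM_MAPIO_KILL;
		dm_bufio_release(bp);

		return DM_MAPIO_REMAPPED;
	}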

All changes are based on master branch commit 2cf4f94d8e86 ("Merge tag
'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi")

Changes in v6:
- Rebase patches and resolve conflicts for patches 1, 3, 4
- Modify patch 4: make fec_read_parity() follow the priority of the
original bio
- Update commit message
Changes in v5:
- Rewrite patch 2, add ioprio parameter in dm_io();
- Modify dm_io() in patch 3
Changes in v4:
- Modify commit message per review suggestions
- Modify patch for dm-crypt
Changes in v3:
- Split patch for device-mapper
- Add patch to fix dm-crypt I/O priority issue
- Add block patch to review together
- Fix some errors in the v2 patch
Changes in v2:
- Add ioprio field in struct dm_io_region
- Initialize struct dm_io_region::ioprio to IOPRIO_DEFAULT
- Add two interfaces

Hongyu Jin (5):
block: Fix bio IO priority setting
dm: Support I/O priority for dm_io()
dm-bufio: Support I/O priority
dm verity: Fix I/O priority lost when read FEC and hash
dm-crypt: Fix lost ioprio when queuing write bios

block/blk-core.c | 10 +++++
block/blk-mq.c | 11 -----
drivers/md/dm-bufio.c | 43 +++++++++++--------
drivers/md/dm-crypt.c | 1 +
drivers/md/dm-ebs-target.c | 8 ++--
drivers/md/dm-integrity.c | 12 +++---
drivers/md/dm-io.c | 23 +++++-----
drivers/md/dm-kcopyd.c | 4 +-
drivers/md/dm-log.c | 4 +-
drivers/md/dm-raid1.c | 6 +--
drivers/md/dm-snap-persistent.c | 8 ++--
drivers/md/dm-verity-fec.c | 20 ++++++---
drivers/md/dm-verity-target.c | 13 ++++--
drivers/md/dm-writecache.c | 8 ++--
drivers/md/persistent-data/dm-block-manager.c | 6 +--
include/linux/dm-bufio.h | 5 ++-
include/linux/dm-io.h | 3 +-
17 files changed, 105 insertions(+), 80 deletions(-)


base-commit: 2cf4f94d8e8646803f8fb0facf134b0cd7fb691a
--
2.34.1


2023-12-20 10:05:39

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v6 1/5] block: Fix bio IO priority setting

From: Hongyu Jin <[email protected]>

Move bio_set_ioprio() into submit_bio():
1. Call bio_set_ioprio() only once to set the priority of the original
bio; bios cloned and split from the original bio automatically
inherit its priority during the clone process.

2. The IO priority can be passed to modules that implement
struct gendisk::fops::submit_bio, which helps resolve some
of the IO priority loss issues.

This patch depends on commit 82b74cac2849 ("blk-ioprio: Convert from
rqos policy to direct call")

Fixes: a78418e6a04c ("block: Always initialize bio IO priority on submit")

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
block/blk-core.c | 10 ++++++++++
block/blk-mq.c | 11 -----------
2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 2eca76ccf4ee..d707ec056f34 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -49,6 +49,7 @@
#include "blk-pm.h"
#include "blk-cgroup.h"
#include "blk-throttle.h"
+#include "blk-ioprio.h"

struct dentry *blk_debugfs_root;

@@ -817,6 +818,14 @@ void submit_bio_noacct(struct bio *bio)
}
EXPORT_SYMBOL(submit_bio_noacct);

+static void bio_set_ioprio(struct bio *bio)
+{
+ /* Nobody set ioprio so far? Initialize it based on task's nice value */
+ if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE)
+ bio->bi_ioprio = get_current_ioprio();
+ blkcg_set_ioprio(bio);
+}
+
/**
* submit_bio - submit a bio to the block device layer for I/O
* @bio: The &struct bio which describes the I/O
@@ -839,6 +848,7 @@ void submit_bio(struct bio *bio)
count_vm_events(PGPGOUT, bio_sectors(bio));
}

+ bio_set_ioprio(bio);
submit_bio_noacct(bio);
}
EXPORT_SYMBOL(submit_bio);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index ac18f802c027..351e8283eda1 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -40,7 +40,6 @@
#include "blk-stat.h"
#include "blk-mq-sched.h"
#include "blk-rq-qos.h"
-#include "blk-ioprio.h"

static DEFINE_PER_CPU(struct llist_head, blk_cpu_done);
static DEFINE_PER_CPU(call_single_data_t, blk_cpu_csd);
@@ -2919,14 +2918,6 @@ static bool blk_mq_can_use_cached_rq(struct request *rq, struct blk_plug *plug,
return true;
}

-static void bio_set_ioprio(struct bio *bio)
-{
- /* Nobody set ioprio so far? Initialize it based on task's nice value */
- if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE)
- bio->bi_ioprio = get_current_ioprio();
- blkcg_set_ioprio(bio);
-}
-
/**
* blk_mq_submit_bio - Create and send a request to block device.
* @bio: Bio pointer.
@@ -2957,8 +2948,6 @@ void blk_mq_submit_bio(struct bio *bio)
return;
}

- bio_set_ioprio(bio);
-
if (plug) {
rq = rq_list_peek(&plug->cached_rq);
if (rq && rq->q != q)
--
2.34.1


2023-12-20 10:06:49

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v6 2/5] dm: Support I/O priority for dm_io()

From: Hongyu Jin <[email protected]>

Some I/O is dispatched from a kworker whose io_context settings differ
from those of the submitting task, so we may need to specify the
priority explicitly to avoid losing it.

Add I/O priority parameter for dm_io().
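
For example (illustrative sketch, not taken from this patch), a caller
that does I/O on behalf of a specific bio can propagate its priority:

	r = dm_io(&io_req, 1, &region, NULL, bio_prio(bio));

while callers with no associated bio keep passing IOPRIO_DEFAULT.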

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-bufio.c | 6 +++---
drivers/md/dm-integrity.c | 10 +++++-----
drivers/md/dm-io.c | 23 +++++++++++++----------
drivers/md/dm-kcopyd.c | 4 ++--
drivers/md/dm-log.c | 4 ++--
drivers/md/dm-raid1.c | 6 +++---
drivers/md/dm-snap-persistent.c | 4 ++--
drivers/md/dm-writecache.c | 8 ++++----
include/linux/dm-io.h | 3 ++-
9 files changed, 36 insertions(+), 32 deletions(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index f03d7dba270c..4f2808ef387f 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1315,7 +1315,7 @@ static void use_dmio(struct dm_buffer *b, enum req_op op, sector_t sector,
io_req.mem.ptr.vma = (char *)b->data + offset;
}

- r = dm_io(&io_req, 1, &region, NULL);
+ r = dm_io(&io_req, 1, &region, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
b->end_io(b, errno_to_blk_status(r));
}
@@ -2167,7 +2167,7 @@ int dm_bufio_issue_flush(struct dm_bufio_client *c)
if (WARN_ON_ONCE(dm_bufio_in_request()))
return -EINVAL;

- return dm_io(&io_req, 1, &io_reg, NULL);
+ return dm_io(&io_req, 1, &io_reg, NULL, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_issue_flush);

@@ -2191,7 +2191,7 @@ int dm_bufio_issue_discard(struct dm_bufio_client *c, sector_t block, sector_t c
if (WARN_ON_ONCE(dm_bufio_in_request()))
return -EINVAL; /* discards are optional */

- return dm_io(&io_req, 1, &io_reg, NULL);
+ return dm_io(&io_req, 1, &io_reg, NULL, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_issue_discard);

diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index e85c688fd91e..9ffd093ad6cc 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -553,7 +553,7 @@ static int sync_rw_sb(struct dm_integrity_c *ic, blk_opf_t opf)
}
}

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
return r;

@@ -1071,7 +1071,7 @@ static void rw_journal_sectors(struct dm_integrity_c *ic, blk_opf_t opf,
io_loc.sector = ic->start + SB_SECTORS + sector;
io_loc.count = n_sectors;

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r)) {
dm_integrity_io_error(ic, (opf & REQ_OP_MASK) == REQ_OP_READ ?
"reading journal" : "writing journal", r);
@@ -1188,7 +1188,7 @@ static void copy_from_journal(struct dm_integrity_c *ic, unsigned int section, u
io_loc.sector = target;
io_loc.count = n_sectors;

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r)) {
WARN_ONCE(1, "asynchronous dm_io failed: %d", r);
fn(-1UL, data);
@@ -1517,7 +1517,7 @@ static void dm_integrity_flush_buffers(struct dm_integrity_c *ic, bool flush_dat
fr.io_reg.count = 0,
fr.ic = ic;
init_completion(&fr.comp);
- r = dm_io(&fr.io_req, 1, &fr.io_reg, NULL);
+ r = dm_io(&fr.io_req, 1, &fr.io_reg, NULL, IOPRIO_DEFAULT);
BUG_ON(r);
}

@@ -2739,7 +2739,7 @@ static void integrity_recalc(struct work_struct *w)
io_loc.sector = get_data_sector(ic, area, offset);
io_loc.count = n_sectors;

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r)) {
dm_integrity_io_error(ic, "reading data", r);
goto err;
diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index f053ce245814..7409490259d1 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -305,7 +305,7 @@ static void km_dp_init(struct dpages *dp, void *data)
*/
static void do_region(const blk_opf_t opf, unsigned int region,
struct dm_io_region *where, struct dpages *dp,
- struct io *io)
+ struct io *io, unsigned short ioprio)
{
struct bio *bio;
struct page *page;
@@ -354,6 +354,7 @@ static void do_region(const blk_opf_t opf, unsigned int region,
&io->client->bios);
bio->bi_iter.bi_sector = where->sector + (where->count - remaining);
bio->bi_end_io = endio;
+ bio->bi_ioprio = ioprio;
store_io_and_region_in_bio(bio, io, region);

if (op == REQ_OP_DISCARD || op == REQ_OP_WRITE_ZEROES) {
@@ -383,7 +384,7 @@ static void do_region(const blk_opf_t opf, unsigned int region,

static void dispatch_io(blk_opf_t opf, unsigned int num_regions,
struct dm_io_region *where, struct dpages *dp,
- struct io *io, int sync)
+ struct io *io, int sync, unsigned short ioprio)
{
int i;
struct dpages old_pages = *dp;
@@ -400,7 +401,7 @@ static void dispatch_io(blk_opf_t opf, unsigned int num_regions,
for (i = 0; i < num_regions; i++) {
*dp = old_pages;
if (where[i].count || (opf & REQ_PREFLUSH))
- do_region(opf, i, where + i, dp, io);
+ do_region(opf, i, where + i, dp, io, ioprio);
}

/*
@@ -425,7 +426,7 @@ static void sync_io_complete(unsigned long error, void *context)

static int sync_io(struct dm_io_client *client, unsigned int num_regions,
struct dm_io_region *where, blk_opf_t opf, struct dpages *dp,
- unsigned long *error_bits)
+ unsigned long *error_bits, unsigned short ioprio)
{
struct io *io;
struct sync_io sio;
@@ -447,7 +448,7 @@ static int sync_io(struct dm_io_client *client, unsigned int num_regions,
io->vma_invalidate_address = dp->vma_invalidate_address;
io->vma_invalidate_size = dp->vma_invalidate_size;

- dispatch_io(opf, num_regions, where, dp, io, 1);
+ dispatch_io(opf, num_regions, where, dp, io, 1, ioprio);

wait_for_completion_io(&sio.wait);

@@ -459,7 +460,8 @@ static int sync_io(struct dm_io_client *client, unsigned int num_regions,

static int async_io(struct dm_io_client *client, unsigned int num_regions,
struct dm_io_region *where, blk_opf_t opf,
- struct dpages *dp, io_notify_fn fn, void *context)
+ struct dpages *dp, io_notify_fn fn, void *context,
+ unsigned short ioprio)
{
struct io *io;

@@ -479,7 +481,7 @@ static int async_io(struct dm_io_client *client, unsigned int num_regions,
io->vma_invalidate_address = dp->vma_invalidate_address;
io->vma_invalidate_size = dp->vma_invalidate_size;

- dispatch_io(opf, num_regions, where, dp, io, 0);
+ dispatch_io(opf, num_regions, where, dp, io, 0, ioprio);
return 0;
}

@@ -521,7 +523,8 @@ static int dp_init(struct dm_io_request *io_req, struct dpages *dp,
}

int dm_io(struct dm_io_request *io_req, unsigned int num_regions,
- struct dm_io_region *where, unsigned long *sync_error_bits)
+ struct dm_io_region *where, unsigned long *sync_error_bits,
+ unsigned short ioprio)
{
int r;
struct dpages dp;
@@ -532,11 +535,11 @@ int dm_io(struct dm_io_request *io_req, unsigned int num_regions,

if (!io_req->notify.fn)
return sync_io(io_req->client, num_regions, where,
- io_req->bi_opf, &dp, sync_error_bits);
+ io_req->bi_opf, &dp, sync_error_bits, ioprio);

return async_io(io_req->client, num_regions, where,
io_req->bi_opf, &dp, io_req->notify.fn,
- io_req->notify.context);
+ io_req->notify.context, ioprio);
}
EXPORT_SYMBOL(dm_io);

diff --git a/drivers/md/dm-kcopyd.c b/drivers/md/dm-kcopyd.c
index d01807c50f20..79c65c9ad5fa 100644
--- a/drivers/md/dm-kcopyd.c
+++ b/drivers/md/dm-kcopyd.c
@@ -578,9 +578,9 @@ static int run_io_job(struct kcopyd_job *job)
io_job_start(job->kc->throttle);

if (job->op == REQ_OP_READ)
- r = dm_io(&io_req, 1, &job->source, NULL);
+ r = dm_io(&io_req, 1, &job->source, NULL, IOPRIO_DEFAULT);
else
- r = dm_io(&io_req, job->num_dests, job->dests, NULL);
+ r = dm_io(&io_req, job->num_dests, job->dests, NULL, IOPRIO_DEFAULT);

return r;
}
diff --git a/drivers/md/dm-log.c b/drivers/md/dm-log.c
index f9f84236dfcd..f7f9c2100937 100644
--- a/drivers/md/dm-log.c
+++ b/drivers/md/dm-log.c
@@ -300,7 +300,7 @@ static int rw_header(struct log_c *lc, enum req_op op)
{
lc->io_req.bi_opf = op;

- return dm_io(&lc->io_req, 1, &lc->header_location, NULL);
+ return dm_io(&lc->io_req, 1, &lc->header_location, NULL, IOPRIO_DEFAULT);
}

static int flush_header(struct log_c *lc)
@@ -313,7 +313,7 @@ static int flush_header(struct log_c *lc)

lc->io_req.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;

- return dm_io(&lc->io_req, 1, &null_location, NULL);
+ return dm_io(&lc->io_req, 1, &null_location, NULL, IOPRIO_DEFAULT);
}

static int read_header(struct log_c *log)
diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index ddcb2bc4a617..9511dae5b556 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -278,7 +278,7 @@ static int mirror_flush(struct dm_target *ti)
}

error_bits = -1;
- dm_io(&io_req, ms->nr_mirrors, io, &error_bits);
+ dm_io(&io_req, ms->nr_mirrors, io, &error_bits, IOPRIO_DEFAULT);
if (unlikely(error_bits != 0)) {
for (i = 0; i < ms->nr_mirrors; i++)
if (test_bit(i, &error_bits))
@@ -554,7 +554,7 @@ static void read_async_bio(struct mirror *m, struct bio *bio)

map_region(&io, m, bio);
bio_set_m(bio, m);
- BUG_ON(dm_io(&io_req, 1, &io, NULL));
+ BUG_ON(dm_io(&io_req, 1, &io, NULL, IOPRIO_DEFAULT));
}

static inline int region_in_sync(struct mirror_set *ms, region_t region,
@@ -681,7 +681,7 @@ static void do_write(struct mirror_set *ms, struct bio *bio)
*/
bio_set_m(bio, get_default_mirror(ms));

- BUG_ON(dm_io(&io_req, ms->nr_mirrors, io, NULL));
+ BUG_ON(dm_io(&io_req, ms->nr_mirrors, io, NULL, IOPRIO_DEFAULT));
}

static void do_writes(struct mirror_set *ms, struct bio_list *writes)
diff --git a/drivers/md/dm-snap-persistent.c b/drivers/md/dm-snap-persistent.c
index 15649921f2a9..568d10842b1f 100644
--- a/drivers/md/dm-snap-persistent.c
+++ b/drivers/md/dm-snap-persistent.c
@@ -223,7 +223,7 @@ static void do_metadata(struct work_struct *work)
{
struct mdata_req *req = container_of(work, struct mdata_req, work);

- req->result = dm_io(req->io_req, 1, req->where, NULL);
+ req->result = dm_io(req->io_req, 1, req->where, NULL, IOPRIO_DEFAULT);
}

/*
@@ -247,7 +247,7 @@ static int chunk_io(struct pstore *ps, void *area, chunk_t chunk, blk_opf_t opf,
struct mdata_req req;

if (!metadata)
- return dm_io(&io_req, 1, &where, NULL);
+ return dm_io(&io_req, 1, &where, NULL, IOPRIO_DEFAULT);

req.where = &where;
req.io_req = &io_req;
diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
index 074cb785eafc..6a4279bfb1e7 100644
--- a/drivers/md/dm-writecache.c
+++ b/drivers/md/dm-writecache.c
@@ -531,7 +531,7 @@ static void ssd_commit_flushed(struct dm_writecache *wc, bool wait_for_ios)
req.notify.context = &endio;

/* writing via async dm-io (implied by notify.fn above) won't return an error */
- (void) dm_io(&req, 1, &region, NULL);
+ (void) dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
i = j;
}

@@ -568,7 +568,7 @@ static void ssd_commit_superblock(struct dm_writecache *wc)
req.notify.fn = NULL;
req.notify.context = NULL;

- r = dm_io(&req, 1, &region, NULL);
+ r = dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
writecache_error(wc, r, "error writing superblock");
}
@@ -596,7 +596,7 @@ static void writecache_disk_flush(struct dm_writecache *wc, struct dm_dev *dev)
req.client = wc->dm_io;
req.notify.fn = NULL;

- r = dm_io(&req, 1, &region, NULL);
+ r = dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
writecache_error(wc, r, "error flushing metadata: %d", r);
}
@@ -990,7 +990,7 @@ static int writecache_read_metadata(struct dm_writecache *wc, sector_t n_sectors
req.client = wc->dm_io;
req.notify.fn = NULL;

- return dm_io(&req, 1, &region, NULL);
+ return dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
}

static void writecache_resume(struct dm_target *ti)
diff --git a/include/linux/dm-io.h b/include/linux/dm-io.h
index 7595142f3fc5..7b2968612b7e 100644
--- a/include/linux/dm-io.h
+++ b/include/linux/dm-io.h
@@ -80,7 +80,8 @@ void dm_io_client_destroy(struct dm_io_client *client);
* error occurred doing io to the corresponding region.
*/
int dm_io(struct dm_io_request *io_req, unsigned int num_regions,
- struct dm_io_region *region, unsigned long *sync_error_bits);
+ struct dm_io_region *region, unsigned long *sync_error_bits,
+ unsigned short ioprio);

#endif /* __KERNEL__ */
#endif /* _LINUX_DM_IO_H */
--
2.34.1


2023-12-20 10:08:26

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v6 5/5] dm-crypt: Fix lost ioprio when queuing write bios

From: Hongyu Jin <[email protected]>

Since dm-crypt queues writes to a different kernel thread (a
workqueue), the bios are dispatched from tasks whose
io_context->ioprio settings and blkcg differ from the submitting
task's, thus presenting the wrong ioprio to the I/O scheduler.

Get the original I/O priority from struct dm_crypt_io::base_bio and
set it on the cloned write bio.
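
A condensed sketch of the effect (illustration only; the allocation
arguments follow the existing crypt_alloc_buffer() call):

    /* crypt_alloc_buffer(), reached from the kcryptd workqueue */
    clone = bio_alloc_bioset(cc->dev->bdev, nr_iovecs, io->base_bio->bi_opf,
                             GFP_NOIO, &cc->bs);
    clone->bi_ioprio = io->base_bio->bi_ioprio;  /* inherit issuer's ioprio */

    /* Without the assignment, bio_set_ioprio() in submit_bio() would see
     * IOPRIO_CLASS_NONE and fall back to the kworker's priority instead
     * of the issuer's.
     */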

Link: https://lore.kernel.org/dm-devel/[email protected]

Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-crypt.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 2ae8560b6a14..ba6e794f7871 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1683,6 +1683,7 @@ static struct bio *crypt_alloc_buffer(struct dm_crypt_io *io, unsigned int size)
GFP_NOIO, &cc->bs);
clone->bi_private = io;
clone->bi_end_io = crypt_endio;
+ clone->bi_ioprio = io->base_bio->bi_ioprio;

remaining_size = size;

--
2.34.1


2023-12-21 10:32:04

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v7 0/5] Fix I/O priority lost in device-mapper

From: Hongyu Jin <[email protected]>

High-priority tasks read data from dm-verity devices with RT I/O
priority, but that RT priority is lost when FEC and hash blocks are
read via kworker-submitted I/O during verification, so the
verification phase may be blocked by lower-priority I/O.

Dm-crypt has the same problem in its write path.

This happens because the kworker lacks the submitting task's
io_context and blkcg.

Move bio_set_ioprio() into submit_bio():
1. Call bio_set_ioprio() only once, on the original bio; bios cloned
or split from the original bio inherit its priority automatically
during the clone (see the sketch below).

2. Pass the I/O priority of the original bio down to dm, so that dm
targets can inherit it as needed.
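
As an illustration, a minimal sketch of how a submitter's priority now
reaches dm (hypothetical module code, not part of this series; the
sector value is a placeholder and completion handling is omitted):

    #include <linux/bio.h>
    #include <linux/ioprio.h>

    static void submit_rt_read(struct block_device *bdev, struct page *page)
    {
            struct bio *bio = bio_alloc(bdev, 1, REQ_OP_READ, GFP_NOIO);

            bio->bi_iter.bi_sector = 0;     /* placeholder */
            __bio_add_page(bio, page, PAGE_SIZE, 0);
            /* Mark the bio RT before submission ... */
            bio->bi_ioprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_RT, 0);
            /* ... so bio_set_ioprio() in submit_bio() keeps it, and any
             * bios that dm clones or splits from it inherit it.
             */
            submit_bio(bio);
    }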

All changes are based on master branch commit 2cf4f94d8e86 ("Merge tag
'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi")

Changes in v7:
- Modify patch 4: change dm-verity-fec.c
Changes in v6:
- Rebase patch and resolve conflict for patch 1, 3, 4
- Modify patch 4: make fec_read_parity() follow the priority of the
original bio
- Update commit message
Changes in v5:
- Rewrite patch 2, add ioprio parameter in dm_io();
- Modify dm_io() in patch 3
Changes in v4:
- Modify commit messages per review suggestions
- Modify patch for dm-crypt
Changes in v3:
- Split patch for device-mapper
- Add patch to fix the dm-crypt I/O priority issue
- Add block patch to review together
- Fix some errors in the v2 patch
Changes in v2:
- Add ioprio field in struct dm_io_region
- Initialize struct dm_io_region::ioprio to IOPRIO_DEFAULT
- Add two interfaces

Hongyu Jin (5):
block: Fix bio IO priority setting
dm: Support I/O priority for dm_io()
dm-bufio: Support I/O priority
dm verity: Fix I/O priority lost when read FEC and hash
dm-crypt: Fix lost ioprio when queuing write bios

block/blk-core.c | 10 +++++
block/blk-mq.c | 11 -----
drivers/md/dm-bufio.c | 43 +++++++++++--------
drivers/md/dm-crypt.c | 1 +
drivers/md/dm-ebs-target.c | 8 ++--
drivers/md/dm-integrity.c | 12 +++---
drivers/md/dm-io.c | 23 +++++-----
drivers/md/dm-kcopyd.c | 4 +-
drivers/md/dm-log.c | 4 +-
drivers/md/dm-raid1.c | 6 +--
drivers/md/dm-snap-persistent.c | 8 ++--
drivers/md/dm-verity-fec.c | 21 +++++----
drivers/md/dm-verity-target.c | 13 ++++--
drivers/md/dm-writecache.c | 8 ++--
drivers/md/persistent-data/dm-block-manager.c | 6 +--
include/linux/dm-bufio.h | 5 ++-
include/linux/dm-io.h | 3 +-
17 files changed, 102 insertions(+), 84 deletions(-)


base-commit: 2cf4f94d8e8646803f8fb0facf134b0cd7fb691a
--
2.34.1


2023-12-21 10:32:22

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v7 1/5] block: Fix bio IO priority setting

From: Hongyu Jin <[email protected]>

Move bio_set_ioprio() into submit_bio():
1. Call bio_set_ioprio() only once, on the original bio; bios cloned
or split from the original bio inherit its priority automatically
during the clone.

2. The I/O priority can now be passed to modules that implement
struct gendisk::fops::submit_bio, which helps resolve some of the
I/O priority loss issues.
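
A rough sketch of the resulting call flow (illustration only, not
literal kernel code):

    submit_bio(bio)
        bio_set_ioprio(bio)         /* priority set once, up front */
        submit_bio_noacct(bio)
            /* bio-based drivers: disk->fops->submit_bio() now sees the
             * final bi_ioprio, and clones/splits inherit it
             */
            /* blk-mq: blk_mq_submit_bio() no longer touches bi_ioprio */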

This patch depends on commit 82b74cac2849 ("blk-ioprio: Convert from
rqos policy to direct call")

Fixes: a78418e6a04c ("block: Always initialize bio IO priority on submit")

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
block/blk-core.c | 10 ++++++++++
block/blk-mq.c | 11 -----------
2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 2eca76ccf4ee..d707ec056f34 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -49,6 +49,7 @@
#include "blk-pm.h"
#include "blk-cgroup.h"
#include "blk-throttle.h"
+#include "blk-ioprio.h"

struct dentry *blk_debugfs_root;

@@ -817,6 +818,14 @@ void submit_bio_noacct(struct bio *bio)
}
EXPORT_SYMBOL(submit_bio_noacct);

+static void bio_set_ioprio(struct bio *bio)
+{
+ /* Nobody set ioprio so far? Initialize it based on task's nice value */
+ if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE)
+ bio->bi_ioprio = get_current_ioprio();
+ blkcg_set_ioprio(bio);
+}
+
/**
* submit_bio - submit a bio to the block device layer for I/O
* @bio: The &struct bio which describes the I/O
@@ -839,6 +848,7 @@ void submit_bio(struct bio *bio)
count_vm_events(PGPGOUT, bio_sectors(bio));
}

+ bio_set_ioprio(bio);
submit_bio_noacct(bio);
}
EXPORT_SYMBOL(submit_bio);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index ac18f802c027..351e8283eda1 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -40,7 +40,6 @@
#include "blk-stat.h"
#include "blk-mq-sched.h"
#include "blk-rq-qos.h"
-#include "blk-ioprio.h"

static DEFINE_PER_CPU(struct llist_head, blk_cpu_done);
static DEFINE_PER_CPU(call_single_data_t, blk_cpu_csd);
@@ -2919,14 +2918,6 @@ static bool blk_mq_can_use_cached_rq(struct request *rq, struct blk_plug *plug,
return true;
}

-static void bio_set_ioprio(struct bio *bio)
-{
- /* Nobody set ioprio so far? Initialize it based on task's nice value */
- if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE)
- bio->bi_ioprio = get_current_ioprio();
- blkcg_set_ioprio(bio);
-}
-
/**
* blk_mq_submit_bio - Create and send a request to block device.
* @bio: Bio pointer.
@@ -2957,8 +2948,6 @@ void blk_mq_submit_bio(struct bio *bio)
return;
}

- bio_set_ioprio(bio);
-
if (plug) {
rq = rq_list_peek(&plug->cached_rq);
if (rq && rq->q != q)
--
2.34.1


2023-12-21 10:32:43

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v7 2/5] dm: Support I/O priority for dm_io()

From: Hongyu Jin <[email protected]>

Some I/O is dispatched from a kworker with io_context settings that
differ from the submitting task's, so we may need to specify a
priority explicitly to avoid losing it.

Add an I/O priority parameter to dm_io().
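
A minimal sketch of a caller passing a priority through (illustration
only; the client, buffer and region are assumed to be set up as in the
existing dm_io() users):

    struct dm_io_request io_req = {
            .bi_opf = REQ_OP_READ,
            .mem.type = DM_IO_KMEM,
            .mem.ptr.addr = data,
            .notify.fn = NULL,              /* synchronous */
            .client = client,
    };
    struct dm_io_region region = {
            .bdev = bdev,
            .sector = sector,
            .count = n_sectors,
    };

    /* Callers with no specific requirement keep IOPRIO_DEFAULT;
     * others can forward the originating bio's priority.
     */
    r = dm_io(&io_req, 1, &region, NULL, IOPRIO_DEFAULT);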

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-bufio.c | 6 +++---
drivers/md/dm-integrity.c | 10 +++++-----
drivers/md/dm-io.c | 23 +++++++++++++----------
drivers/md/dm-kcopyd.c | 4 ++--
drivers/md/dm-log.c | 4 ++--
drivers/md/dm-raid1.c | 6 +++---
drivers/md/dm-snap-persistent.c | 4 ++--
drivers/md/dm-writecache.c | 8 ++++----
include/linux/dm-io.h | 3 ++-
9 files changed, 36 insertions(+), 32 deletions(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index f03d7dba270c..4f2808ef387f 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1315,7 +1315,7 @@ static void use_dmio(struct dm_buffer *b, enum req_op op, sector_t sector,
io_req.mem.ptr.vma = (char *)b->data + offset;
}

- r = dm_io(&io_req, 1, &region, NULL);
+ r = dm_io(&io_req, 1, &region, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
b->end_io(b, errno_to_blk_status(r));
}
@@ -2167,7 +2167,7 @@ int dm_bufio_issue_flush(struct dm_bufio_client *c)
if (WARN_ON_ONCE(dm_bufio_in_request()))
return -EINVAL;

- return dm_io(&io_req, 1, &io_reg, NULL);
+ return dm_io(&io_req, 1, &io_reg, NULL, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_issue_flush);

@@ -2191,7 +2191,7 @@ int dm_bufio_issue_discard(struct dm_bufio_client *c, sector_t block, sector_t c
if (WARN_ON_ONCE(dm_bufio_in_request()))
return -EINVAL; /* discards are optional */

- return dm_io(&io_req, 1, &io_reg, NULL);
+ return dm_io(&io_req, 1, &io_reg, NULL, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_issue_discard);

diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index e85c688fd91e..9ffd093ad6cc 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -553,7 +553,7 @@ static int sync_rw_sb(struct dm_integrity_c *ic, blk_opf_t opf)
}
}

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
return r;

@@ -1071,7 +1071,7 @@ static void rw_journal_sectors(struct dm_integrity_c *ic, blk_opf_t opf,
io_loc.sector = ic->start + SB_SECTORS + sector;
io_loc.count = n_sectors;

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r)) {
dm_integrity_io_error(ic, (opf & REQ_OP_MASK) == REQ_OP_READ ?
"reading journal" : "writing journal", r);
@@ -1188,7 +1188,7 @@ static void copy_from_journal(struct dm_integrity_c *ic, unsigned int section, u
io_loc.sector = target;
io_loc.count = n_sectors;

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r)) {
WARN_ONCE(1, "asynchronous dm_io failed: %d", r);
fn(-1UL, data);
@@ -1517,7 +1517,7 @@ static void dm_integrity_flush_buffers(struct dm_integrity_c *ic, bool flush_dat
fr.io_reg.count = 0,
fr.ic = ic;
init_completion(&fr.comp);
- r = dm_io(&fr.io_req, 1, &fr.io_reg, NULL);
+ r = dm_io(&fr.io_req, 1, &fr.io_reg, NULL, IOPRIO_DEFAULT);
BUG_ON(r);
}

@@ -2739,7 +2739,7 @@ static void integrity_recalc(struct work_struct *w)
io_loc.sector = get_data_sector(ic, area, offset);
io_loc.count = n_sectors;

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r)) {
dm_integrity_io_error(ic, "reading data", r);
goto err;
diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index f053ce245814..7409490259d1 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -305,7 +305,7 @@ static void km_dp_init(struct dpages *dp, void *data)
*/
static void do_region(const blk_opf_t opf, unsigned int region,
struct dm_io_region *where, struct dpages *dp,
- struct io *io)
+ struct io *io, unsigned short ioprio)
{
struct bio *bio;
struct page *page;
@@ -354,6 +354,7 @@ static void do_region(const blk_opf_t opf, unsigned int region,
&io->client->bios);
bio->bi_iter.bi_sector = where->sector + (where->count - remaining);
bio->bi_end_io = endio;
+ bio->bi_ioprio = ioprio;
store_io_and_region_in_bio(bio, io, region);

if (op == REQ_OP_DISCARD || op == REQ_OP_WRITE_ZEROES) {
@@ -383,7 +384,7 @@ static void do_region(const blk_opf_t opf, unsigned int region,

static void dispatch_io(blk_opf_t opf, unsigned int num_regions,
struct dm_io_region *where, struct dpages *dp,
- struct io *io, int sync)
+ struct io *io, int sync, unsigned short ioprio)
{
int i;
struct dpages old_pages = *dp;
@@ -400,7 +401,7 @@ static void dispatch_io(blk_opf_t opf, unsigned int num_regions,
for (i = 0; i < num_regions; i++) {
*dp = old_pages;
if (where[i].count || (opf & REQ_PREFLUSH))
- do_region(opf, i, where + i, dp, io);
+ do_region(opf, i, where + i, dp, io, ioprio);
}

/*
@@ -425,7 +426,7 @@ static void sync_io_complete(unsigned long error, void *context)

static int sync_io(struct dm_io_client *client, unsigned int num_regions,
struct dm_io_region *where, blk_opf_t opf, struct dpages *dp,
- unsigned long *error_bits)
+ unsigned long *error_bits, unsigned short ioprio)
{
struct io *io;
struct sync_io sio;
@@ -447,7 +448,7 @@ static int sync_io(struct dm_io_client *client, unsigned int num_regions,
io->vma_invalidate_address = dp->vma_invalidate_address;
io->vma_invalidate_size = dp->vma_invalidate_size;

- dispatch_io(opf, num_regions, where, dp, io, 1);
+ dispatch_io(opf, num_regions, where, dp, io, 1, ioprio);

wait_for_completion_io(&sio.wait);

@@ -459,7 +460,8 @@ static int sync_io(struct dm_io_client *client, unsigned int num_regions,

static int async_io(struct dm_io_client *client, unsigned int num_regions,
struct dm_io_region *where, blk_opf_t opf,
- struct dpages *dp, io_notify_fn fn, void *context)
+ struct dpages *dp, io_notify_fn fn, void *context,
+ unsigned short ioprio)
{
struct io *io;

@@ -479,7 +481,7 @@ static int async_io(struct dm_io_client *client, unsigned int num_regions,
io->vma_invalidate_address = dp->vma_invalidate_address;
io->vma_invalidate_size = dp->vma_invalidate_size;

- dispatch_io(opf, num_regions, where, dp, io, 0);
+ dispatch_io(opf, num_regions, where, dp, io, 0, ioprio);
return 0;
}

@@ -521,7 +523,8 @@ static int dp_init(struct dm_io_request *io_req, struct dpages *dp,
}

int dm_io(struct dm_io_request *io_req, unsigned int num_regions,
- struct dm_io_region *where, unsigned long *sync_error_bits)
+ struct dm_io_region *where, unsigned long *sync_error_bits,
+ unsigned short ioprio)
{
int r;
struct dpages dp;
@@ -532,11 +535,11 @@ int dm_io(struct dm_io_request *io_req, unsigned int num_regions,

if (!io_req->notify.fn)
return sync_io(io_req->client, num_regions, where,
- io_req->bi_opf, &dp, sync_error_bits);
+ io_req->bi_opf, &dp, sync_error_bits, ioprio);

return async_io(io_req->client, num_regions, where,
io_req->bi_opf, &dp, io_req->notify.fn,
- io_req->notify.context);
+ io_req->notify.context, ioprio);
}
EXPORT_SYMBOL(dm_io);

diff --git a/drivers/md/dm-kcopyd.c b/drivers/md/dm-kcopyd.c
index d01807c50f20..79c65c9ad5fa 100644
--- a/drivers/md/dm-kcopyd.c
+++ b/drivers/md/dm-kcopyd.c
@@ -578,9 +578,9 @@ static int run_io_job(struct kcopyd_job *job)
io_job_start(job->kc->throttle);

if (job->op == REQ_OP_READ)
- r = dm_io(&io_req, 1, &job->source, NULL);
+ r = dm_io(&io_req, 1, &job->source, NULL, IOPRIO_DEFAULT);
else
- r = dm_io(&io_req, job->num_dests, job->dests, NULL);
+ r = dm_io(&io_req, job->num_dests, job->dests, NULL, IOPRIO_DEFAULT);

return r;
}
diff --git a/drivers/md/dm-log.c b/drivers/md/dm-log.c
index f9f84236dfcd..f7f9c2100937 100644
--- a/drivers/md/dm-log.c
+++ b/drivers/md/dm-log.c
@@ -300,7 +300,7 @@ static int rw_header(struct log_c *lc, enum req_op op)
{
lc->io_req.bi_opf = op;

- return dm_io(&lc->io_req, 1, &lc->header_location, NULL);
+ return dm_io(&lc->io_req, 1, &lc->header_location, NULL, IOPRIO_DEFAULT);
}

static int flush_header(struct log_c *lc)
@@ -313,7 +313,7 @@ static int flush_header(struct log_c *lc)

lc->io_req.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;

- return dm_io(&lc->io_req, 1, &null_location, NULL);
+ return dm_io(&lc->io_req, 1, &null_location, NULL, IOPRIO_DEFAULT);
}

static int read_header(struct log_c *log)
diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index ddcb2bc4a617..9511dae5b556 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -278,7 +278,7 @@ static int mirror_flush(struct dm_target *ti)
}

error_bits = -1;
- dm_io(&io_req, ms->nr_mirrors, io, &error_bits);
+ dm_io(&io_req, ms->nr_mirrors, io, &error_bits, IOPRIO_DEFAULT);
if (unlikely(error_bits != 0)) {
for (i = 0; i < ms->nr_mirrors; i++)
if (test_bit(i, &error_bits))
@@ -554,7 +554,7 @@ static void read_async_bio(struct mirror *m, struct bio *bio)

map_region(&io, m, bio);
bio_set_m(bio, m);
- BUG_ON(dm_io(&io_req, 1, &io, NULL));
+ BUG_ON(dm_io(&io_req, 1, &io, NULL, IOPRIO_DEFAULT));
}

static inline int region_in_sync(struct mirror_set *ms, region_t region,
@@ -681,7 +681,7 @@ static void do_write(struct mirror_set *ms, struct bio *bio)
*/
bio_set_m(bio, get_default_mirror(ms));

- BUG_ON(dm_io(&io_req, ms->nr_mirrors, io, NULL));
+ BUG_ON(dm_io(&io_req, ms->nr_mirrors, io, NULL, IOPRIO_DEFAULT));
}

static void do_writes(struct mirror_set *ms, struct bio_list *writes)
diff --git a/drivers/md/dm-snap-persistent.c b/drivers/md/dm-snap-persistent.c
index 15649921f2a9..568d10842b1f 100644
--- a/drivers/md/dm-snap-persistent.c
+++ b/drivers/md/dm-snap-persistent.c
@@ -223,7 +223,7 @@ static void do_metadata(struct work_struct *work)
{
struct mdata_req *req = container_of(work, struct mdata_req, work);

- req->result = dm_io(req->io_req, 1, req->where, NULL);
+ req->result = dm_io(req->io_req, 1, req->where, NULL, IOPRIO_DEFAULT);
}

/*
@@ -247,7 +247,7 @@ static int chunk_io(struct pstore *ps, void *area, chunk_t chunk, blk_opf_t opf,
struct mdata_req req;

if (!metadata)
- return dm_io(&io_req, 1, &where, NULL);
+ return dm_io(&io_req, 1, &where, NULL, IOPRIO_DEFAULT);

req.where = &where;
req.io_req = &io_req;
diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
index 074cb785eafc..6a4279bfb1e7 100644
--- a/drivers/md/dm-writecache.c
+++ b/drivers/md/dm-writecache.c
@@ -531,7 +531,7 @@ static void ssd_commit_flushed(struct dm_writecache *wc, bool wait_for_ios)
req.notify.context = &endio;

/* writing via async dm-io (implied by notify.fn above) won't return an error */
- (void) dm_io(&req, 1, &region, NULL);
+ (void) dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
i = j;
}

@@ -568,7 +568,7 @@ static void ssd_commit_superblock(struct dm_writecache *wc)
req.notify.fn = NULL;
req.notify.context = NULL;

- r = dm_io(&req, 1, &region, NULL);
+ r = dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
writecache_error(wc, r, "error writing superblock");
}
@@ -596,7 +596,7 @@ static void writecache_disk_flush(struct dm_writecache *wc, struct dm_dev *dev)
req.client = wc->dm_io;
req.notify.fn = NULL;

- r = dm_io(&req, 1, &region, NULL);
+ r = dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
writecache_error(wc, r, "error flushing metadata: %d", r);
}
@@ -990,7 +990,7 @@ static int writecache_read_metadata(struct dm_writecache *wc, sector_t n_sectors
req.client = wc->dm_io;
req.notify.fn = NULL;

- return dm_io(&req, 1, &region, NULL);
+ return dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
}

static void writecache_resume(struct dm_target *ti)
diff --git a/include/linux/dm-io.h b/include/linux/dm-io.h
index 7595142f3fc5..7b2968612b7e 100644
--- a/include/linux/dm-io.h
+++ b/include/linux/dm-io.h
@@ -80,7 +80,8 @@ void dm_io_client_destroy(struct dm_io_client *client);
* error occurred doing io to the corresponding region.
*/
int dm_io(struct dm_io_request *io_req, unsigned int num_regions,
- struct dm_io_region *region, unsigned long *sync_error_bits);
+ struct dm_io_region *region, unsigned long *sync_error_bits,
+ unsigned short ioprio);

#endif /* __KERNEL__ */
#endif /* _LINUX_DM_IO_H */
--
2.34.1


2023-12-21 10:33:00

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v7 3/5] dm-bufio: Support I/O priority

From: Hongyu Jin <[email protected]>

Some I/O is dispatched from a kworker with io_context settings that
differ from the submitting task's, so we may need to specify a
priority explicitly to avoid losing it.

Add an I/O priority parameter to dm_bufio_read() and
dm_bufio_prefetch().
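
A minimal usage sketch (illustration only; "c" is an existing
dm_bufio_client and "block" a valid block number):

    struct dm_buffer *bp;
    void *data;

    /* Warm the cache and read at the caller's chosen priority. */
    dm_bufio_prefetch(c, block, 1, IOPRIO_DEFAULT);
    data = dm_bufio_read(c, block, &bp, IOPRIO_DEFAULT);
    if (!IS_ERR(data)) {
            /* ... use data ... */
            dm_bufio_release(bp);
    }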

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-bufio.c | 39 +++++++++++--------
drivers/md/dm-ebs-target.c | 8 ++--
drivers/md/dm-integrity.c | 2 +-
drivers/md/dm-snap-persistent.c | 4 +-
drivers/md/dm-verity-fec.c | 4 +-
drivers/md/dm-verity-target.c | 5 ++-
drivers/md/persistent-data/dm-block-manager.c | 6 +--
include/linux/dm-bufio.h | 5 ++-
8 files changed, 40 insertions(+), 33 deletions(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 4f2808ef387f..a6974ecab68e 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1292,7 +1292,8 @@ static void dmio_complete(unsigned long error, void *context)
}

static void use_dmio(struct dm_buffer *b, enum req_op op, sector_t sector,
- unsigned int n_sectors, unsigned int offset)
+ unsigned int n_sectors, unsigned int offset,
+ unsigned short ioprio)
{
int r;
struct dm_io_request io_req = {
@@ -1315,7 +1316,7 @@ static void use_dmio(struct dm_buffer *b, enum req_op op, sector_t sector,
io_req.mem.ptr.vma = (char *)b->data + offset;
}

- r = dm_io(&io_req, 1, &region, NULL, IOPRIO_DEFAULT);
+ r = dm_io(&io_req, 1, &region, NULL, ioprio);
if (unlikely(r))
b->end_io(b, errno_to_blk_status(r));
}
@@ -1331,7 +1332,8 @@ static void bio_complete(struct bio *bio)
}

static void use_bio(struct dm_buffer *b, enum req_op op, sector_t sector,
- unsigned int n_sectors, unsigned int offset)
+ unsigned int n_sectors, unsigned int offset,
+ unsigned short ioprio)
{
struct bio *bio;
char *ptr;
@@ -1339,13 +1341,14 @@ static void use_bio(struct dm_buffer *b, enum req_op op, sector_t sector,

bio = bio_kmalloc(1, GFP_NOWAIT | __GFP_NORETRY | __GFP_NOWARN);
if (!bio) {
- use_dmio(b, op, sector, n_sectors, offset);
+ use_dmio(b, op, sector, n_sectors, offset, ioprio);
return;
}
bio_init(bio, b->c->bdev, bio->bi_inline_vecs, 1, op);
bio->bi_iter.bi_sector = sector;
bio->bi_end_io = bio_complete;
bio->bi_private = b;
+ bio->bi_ioprio = ioprio;

ptr = (char *)b->data + offset;
len = n_sectors << SECTOR_SHIFT;
@@ -1368,7 +1371,7 @@ static inline sector_t block_to_sector(struct dm_bufio_client *c, sector_t block
return sector;
}

-static void submit_io(struct dm_buffer *b, enum req_op op,
+static void submit_io(struct dm_buffer *b, enum req_op op, unsigned short ioprio,
void (*end_io)(struct dm_buffer *, blk_status_t))
{
unsigned int n_sectors;
@@ -1398,9 +1401,9 @@ static void submit_io(struct dm_buffer *b, enum req_op op,
}

if (b->data_mode != DATA_MODE_VMALLOC)
- use_bio(b, op, sector, n_sectors, offset);
+ use_bio(b, op, sector, n_sectors, offset, ioprio);
else
- use_dmio(b, op, sector, n_sectors, offset);
+ use_dmio(b, op, sector, n_sectors, offset, ioprio);
}

/*
@@ -1456,7 +1459,7 @@ static void __write_dirty_buffer(struct dm_buffer *b,
b->write_end = b->dirty_end;

if (!write_list)
- submit_io(b, REQ_OP_WRITE, write_endio);
+ submit_io(b, REQ_OP_WRITE, IOPRIO_DEFAULT, write_endio);
else
list_add_tail(&b->write_list, write_list);
}
@@ -1470,7 +1473,7 @@ static void __flush_write_list(struct list_head *write_list)
struct dm_buffer *b =
list_entry(write_list->next, struct dm_buffer, write_list);
list_del(&b->write_list);
- submit_io(b, REQ_OP_WRITE, write_endio);
+ submit_io(b, REQ_OP_WRITE, IOPRIO_DEFAULT, write_endio);
cond_resched();
}
blk_finish_plug(&plug);
@@ -1852,7 +1855,8 @@ static void read_endio(struct dm_buffer *b, blk_status_t status)
* and uses dm_bufio_mark_buffer_dirty to write new data back).
*/
static void *new_read(struct dm_bufio_client *c, sector_t block,
- enum new_flag nf, struct dm_buffer **bp)
+ enum new_flag nf, struct dm_buffer **bp,
+ unsigned short ioprio)
{
int need_submit = 0;
struct dm_buffer *b;
@@ -1905,7 +1909,7 @@ static void *new_read(struct dm_bufio_client *c, sector_t block,
return NULL;

if (need_submit)
- submit_io(b, REQ_OP_READ, read_endio);
+ submit_io(b, REQ_OP_READ, ioprio, read_endio);

if (nf != NF_GET) /* we already tested this condition above */
wait_on_bit_io(&b->state, B_READING, TASK_UNINTERRUPTIBLE);
@@ -1926,17 +1930,17 @@ static void *new_read(struct dm_bufio_client *c, sector_t block,
void *dm_bufio_get(struct dm_bufio_client *c, sector_t block,
struct dm_buffer **bp)
{
- return new_read(c, block, NF_GET, bp);
+ return new_read(c, block, NF_GET, bp, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_get);

void *dm_bufio_read(struct dm_bufio_client *c, sector_t block,
- struct dm_buffer **bp)
+ struct dm_buffer **bp, unsigned short ioprio)
{
if (WARN_ON_ONCE(dm_bufio_in_request()))
return ERR_PTR(-EINVAL);

- return new_read(c, block, NF_READ, bp);
+ return new_read(c, block, NF_READ, bp, ioprio);
}
EXPORT_SYMBOL_GPL(dm_bufio_read);

@@ -1946,12 +1950,13 @@ void *dm_bufio_new(struct dm_bufio_client *c, sector_t block,
if (WARN_ON_ONCE(dm_bufio_in_request()))
return ERR_PTR(-EINVAL);

- return new_read(c, block, NF_FRESH, bp);
+ return new_read(c, block, NF_FRESH, bp, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_new);

void dm_bufio_prefetch(struct dm_bufio_client *c,
- sector_t block, unsigned int n_blocks)
+ sector_t block, unsigned int n_blocks,
+ unsigned short ioprio)
{
struct blk_plug plug;

@@ -1987,7 +1992,7 @@ void dm_bufio_prefetch(struct dm_bufio_client *c,
dm_bufio_unlock(c);

if (need_submit)
- submit_io(b, REQ_OP_READ, read_endio);
+ submit_io(b, REQ_OP_READ, ioprio, read_endio);
dm_bufio_release(b);

cond_resched();
diff --git a/drivers/md/dm-ebs-target.c b/drivers/md/dm-ebs-target.c
index 435b45201f4d..8198c8a7b416 100644
--- a/drivers/md/dm-ebs-target.c
+++ b/drivers/md/dm-ebs-target.c
@@ -84,7 +84,7 @@ static int __ebs_rw_bvec(struct ebs_c *ec, enum req_op op, struct bio_vec *bv,

/* Avoid reading for writes in case bio vector's page overwrites block completely. */
if (op == REQ_OP_READ || buf_off || bv_len < dm_bufio_get_block_size(ec->bufio))
- ba = dm_bufio_read(ec->bufio, block, &b);
+ ba = dm_bufio_read(ec->bufio, block, &b, IOPRIO_DEFAULT);
else
ba = dm_bufio_new(ec->bufio, block, &b);

@@ -194,13 +194,13 @@ static void __ebs_process_bios(struct work_struct *ws)
bio_list_for_each(bio, &bios) {
block1 = __sector_to_block(ec, bio->bi_iter.bi_sector);
if (bio_op(bio) == REQ_OP_READ)
- dm_bufio_prefetch(ec->bufio, block1, __nr_blocks(ec, bio));
+ dm_bufio_prefetch(ec->bufio, block1, __nr_blocks(ec, bio), IOPRIO_DEFAULT);
else if (bio_op(bio) == REQ_OP_WRITE && !(bio->bi_opf & REQ_PREFLUSH)) {
block2 = __sector_to_block(ec, bio_end_sector(bio));
if (__block_mod(bio->bi_iter.bi_sector, ec->u_bs))
- dm_bufio_prefetch(ec->bufio, block1, 1);
+ dm_bufio_prefetch(ec->bufio, block1, 1, IOPRIO_DEFAULT);
if (__block_mod(bio_end_sector(bio), ec->u_bs) && block2 != block1)
- dm_bufio_prefetch(ec->bufio, block2, 1);
+ dm_bufio_prefetch(ec->bufio, block2, 1, IOPRIO_DEFAULT);
}
}

diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index 9ffd093ad6cc..1e40e712bcd7 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -1418,7 +1418,7 @@ static int dm_integrity_rw_tag(struct dm_integrity_c *ic, unsigned char *tag, se
if (unlikely(r))
return r;

- data = dm_bufio_read(ic->bufio, *metadata_block, &b);
+ data = dm_bufio_read(ic->bufio, *metadata_block, &b, IOPRIO_DEFAULT);
if (IS_ERR(data))
return PTR_ERR(data);

diff --git a/drivers/md/dm-snap-persistent.c b/drivers/md/dm-snap-persistent.c
index 568d10842b1f..a2072b95e28c 100644
--- a/drivers/md/dm-snap-persistent.c
+++ b/drivers/md/dm-snap-persistent.c
@@ -524,7 +524,7 @@ static int read_exceptions(struct pstore *ps,

if (unlikely(pf_chunk >= dm_bufio_get_device_size(client)))
break;
- dm_bufio_prefetch(client, pf_chunk, 1);
+ dm_bufio_prefetch(client, pf_chunk, 1, IOPRIO_DEFAULT);
prefetch_area++;
if (unlikely(!prefetch_area))
break;
@@ -533,7 +533,7 @@ static int read_exceptions(struct pstore *ps,

chunk = area_location(ps, ps->current_area);

- area = dm_bufio_read(client, chunk, &bp);
+ area = dm_bufio_read(client, chunk, &bp, IOPRIO_DEFAULT);
if (IS_ERR(area)) {
r = PTR_ERR(area);
goto ret_destroy_bufio;
diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c
index b475200d8586..49db19e537f9 100644
--- a/drivers/md/dm-verity-fec.c
+++ b/drivers/md/dm-verity-fec.c
@@ -69,7 +69,7 @@ static u8 *fec_read_parity(struct dm_verity *v, u64 rsb, int index,
block = div64_u64_rem(position, v->fec->io_size, &rem);
*offset = (unsigned int)rem;

- res = dm_bufio_read(v->fec->bufio, block, buf);
+ res = dm_bufio_read(v->fec->bufio, block, buf, IOPRIO_DEFAULT);
if (IS_ERR(res)) {
DMERR("%s: FEC %llu: parity read failed (block %llu): %ld",
v->data_dev->name, (unsigned long long)rsb,
@@ -248,7 +248,7 @@ static int fec_read_bufs(struct dm_verity *v, struct dm_verity_io *io,
bufio = v->bufio;
}

- bbuf = dm_bufio_read(bufio, block, &buf);
+ bbuf = dm_bufio_read(bufio, block, &buf, IOPRIO_DEFAULT);
if (IS_ERR(bbuf)) {
DMWARN_LIMIT("%s: FEC %llu: read failed (%llu): %ld",
v->data_dev->name,
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 14e58ae70521..4758bfe2c156 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -308,7 +308,7 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
return -EAGAIN;
}
} else
- data = dm_bufio_read(v->bufio, hash_block, &buf);
+ data = dm_bufio_read(v->bufio, hash_block, &buf, IOPRIO_DEFAULT);

if (IS_ERR(data))
return PTR_ERR(data);
@@ -719,7 +719,8 @@ static void verity_prefetch_io(struct work_struct *work)
}
no_prefetch_cluster:
dm_bufio_prefetch(v->bufio, hash_block_start,
- hash_block_end - hash_block_start + 1);
+ hash_block_end - hash_block_start + 1,
+ IOPRIO_DEFAULT);
}

kfree(pw);
diff --git a/drivers/md/persistent-data/dm-block-manager.c b/drivers/md/persistent-data/dm-block-manager.c
index 0e010e1204aa..86a4f73d2f3d 100644
--- a/drivers/md/persistent-data/dm-block-manager.c
+++ b/drivers/md/persistent-data/dm-block-manager.c
@@ -474,7 +474,7 @@ int dm_bm_read_lock(struct dm_block_manager *bm, dm_block_t b,
void *p;
int r;

- p = dm_bufio_read(bm->bufio, b, (struct dm_buffer **) result);
+ p = dm_bufio_read(bm->bufio, b, (struct dm_buffer **) result, IOPRIO_DEFAULT);
if (IS_ERR(p))
return PTR_ERR(p);

@@ -510,7 +510,7 @@ int dm_bm_write_lock(struct dm_block_manager *bm,
if (dm_bm_is_read_only(bm))
return -EPERM;

- p = dm_bufio_read(bm->bufio, b, (struct dm_buffer **) result);
+ p = dm_bufio_read(bm->bufio, b, (struct dm_buffer **) result, IOPRIO_DEFAULT);
if (IS_ERR(p))
return PTR_ERR(p);

@@ -624,7 +624,7 @@ EXPORT_SYMBOL_GPL(dm_bm_flush);

void dm_bm_prefetch(struct dm_block_manager *bm, dm_block_t b)
{
- dm_bufio_prefetch(bm->bufio, b, 1);
+ dm_bufio_prefetch(bm->bufio, b, 1, IOPRIO_DEFAULT);
}

bool dm_bm_is_read_only(struct dm_block_manager *bm)
diff --git a/include/linux/dm-bufio.h b/include/linux/dm-bufio.h
index 75e7d8cbb532..256a246c7b97 100644
--- a/include/linux/dm-bufio.h
+++ b/include/linux/dm-bufio.h
@@ -62,7 +62,7 @@ void dm_bufio_set_sector_offset(struct dm_bufio_client *c, sector_t start);
* it dirty.
*/
void *dm_bufio_read(struct dm_bufio_client *c, sector_t block,
- struct dm_buffer **bp);
+ struct dm_buffer **bp, unsigned short ioprio);

/*
* Like dm_bufio_read, but return buffer from cache, don't read
@@ -84,7 +84,8 @@ void *dm_bufio_new(struct dm_bufio_client *c, sector_t block,
* I/O to finish.
*/
void dm_bufio_prefetch(struct dm_bufio_client *c,
- sector_t block, unsigned int n_blocks);
+ sector_t block, unsigned int n_blocks,
+ unsigned short ioprio);

/*
* Release a reference obtained with dm_bufio_{read,get,new}. The data
--
2.34.1


2023-12-21 10:33:17

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v7 4/5] dm verity: Fix I/O priority lost when read FEC and hash

From: Hongyu Jin <[email protected]>

After obtaining the data, the verification or error-correction process
may trigger new I/O that loses the priority of the original I/O; that
is, verification of a higher-priority I/O may be blocked by
lower-priority I/O.

Make the verification and error-correction I/O follow the priority of
the original I/O.
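
The pattern used throughout this patch, as a short sketch (taken from
the hunks below): recover the original bio from the per-bio data and
forward its priority to dm-bufio.

    struct bio *bio = dm_bio_from_per_bio_data(io,
                                               v->ti->per_io_data_size);

    data = dm_bufio_read(v->bufio, hash_block, &buf, bio_prio(bio));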

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-verity-fec.c | 21 ++++++++++++---------
drivers/md/dm-verity-target.c | 12 ++++++++----
2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c
index 49db19e537f9..066521de08da 100644
--- a/drivers/md/dm-verity-fec.c
+++ b/drivers/md/dm-verity-fec.c
@@ -60,7 +60,8 @@ static int fec_decode_rs8(struct dm_verity *v, struct dm_verity_fec_io *fio,
* to the data block. Caller is responsible for releasing buf.
*/
static u8 *fec_read_parity(struct dm_verity *v, u64 rsb, int index,
- unsigned int *offset, struct dm_buffer **buf)
+ unsigned int *offset, struct dm_buffer **buf,
+ unsigned short ioprio)
{
u64 position, block, rem;
u8 *res;
@@ -69,7 +70,7 @@ static u8 *fec_read_parity(struct dm_verity *v, u64 rsb, int index,
block = div64_u64_rem(position, v->fec->io_size, &rem);
*offset = (unsigned int)rem;

- res = dm_bufio_read(v->fec->bufio, block, buf, IOPRIO_DEFAULT);
+ res = dm_bufio_read(v->fec->bufio, block, buf, ioprio);
if (IS_ERR(res)) {
DMERR("%s: FEC %llu: parity read failed (block %llu): %ld",
v->data_dev->name, (unsigned long long)rsb,
@@ -121,16 +122,17 @@ static inline unsigned int fec_buffer_rs_index(unsigned int i, unsigned int j)
* Decode all RS blocks from buffers and copy corrected bytes into fio->output
* starting from block_offset.
*/
-static int fec_decode_bufs(struct dm_verity *v, struct dm_verity_fec_io *fio,
- u64 rsb, int byte_index, unsigned int block_offset,
- int neras)
+static int fec_decode_bufs(struct dm_verity *v, struct dm_verity_io *io,
+ struct dm_verity_fec_io *fio, u64 rsb, int byte_index,
+ unsigned int block_offset, int neras)
{
int r, corrected = 0, res;
struct dm_buffer *buf;
unsigned int n, i, offset;
u8 *par, *block;
+ struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);

- par = fec_read_parity(v, rsb, block_offset, &offset, &buf);
+ par = fec_read_parity(v, rsb, block_offset, &offset, &buf, bio_prio(bio));
if (IS_ERR(par))
return PTR_ERR(par);

@@ -158,7 +160,7 @@ static int fec_decode_bufs(struct dm_verity *v, struct dm_verity_fec_io *fio,
if (offset >= v->fec->io_size) {
dm_bufio_release(buf);

- par = fec_read_parity(v, rsb, block_offset, &offset, &buf);
+ par = fec_read_parity(v, rsb, block_offset, &offset, &buf, bio_prio(bio));
if (IS_ERR(par))
return PTR_ERR(par);
}
@@ -210,6 +212,7 @@ static int fec_read_bufs(struct dm_verity *v, struct dm_verity_io *io,
u8 *bbuf, *rs_block;
u8 want_digest[HASH_MAX_DIGESTSIZE];
unsigned int n, k;
+ struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);

if (neras)
*neras = 0;
@@ -248,7 +251,7 @@ static int fec_read_bufs(struct dm_verity *v, struct dm_verity_io *io,
bufio = v->bufio;
}

- bbuf = dm_bufio_read(bufio, block, &buf, IOPRIO_DEFAULT);
+ bbuf = dm_bufio_read(bufio, block, &buf, bio_prio(bio));
if (IS_ERR(bbuf)) {
DMWARN_LIMIT("%s: FEC %llu: read failed (%llu): %ld",
v->data_dev->name,
@@ -377,7 +380,7 @@ static int fec_decode_rsb(struct dm_verity *v, struct dm_verity_io *io,
if (unlikely(r < 0))
return r;

- r = fec_decode_bufs(v, fio, rsb, r, pos, neras);
+ r = fec_decode_bufs(v, io, fio, rsb, r, pos, neras);
if (r < 0)
return r;

diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 4758bfe2c156..8cbf81fc0031 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -51,6 +51,7 @@ static DEFINE_STATIC_KEY_FALSE(use_tasklet_enabled);
struct dm_verity_prefetch_work {
struct work_struct work;
struct dm_verity *v;
+ unsigned short ioprio;
sector_t block;
unsigned int n_blocks;
};
@@ -294,6 +295,7 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
int r;
sector_t hash_block;
unsigned int offset;
+ struct bio *bio = dm_bio_from_per_bio_data(io, v->ti->per_io_data_size);

verity_hash_at_level(v, block, level, &hash_block, &offset);

@@ -308,7 +310,7 @@ static int verity_verify_level(struct dm_verity *v, struct dm_verity_io *io,
return -EAGAIN;
}
} else
- data = dm_bufio_read(v->bufio, hash_block, &buf, IOPRIO_DEFAULT);
+ data = dm_bufio_read(v->bufio, hash_block, &buf, bio_prio(bio));

if (IS_ERR(data))
return PTR_ERR(data);
@@ -720,13 +722,14 @@ static void verity_prefetch_io(struct work_struct *work)
no_prefetch_cluster:
dm_bufio_prefetch(v->bufio, hash_block_start,
hash_block_end - hash_block_start + 1,
- IOPRIO_DEFAULT);
+ pw->ioprio);
}

kfree(pw);
}

-static void verity_submit_prefetch(struct dm_verity *v, struct dm_verity_io *io)
+static void verity_submit_prefetch(struct dm_verity *v, struct dm_verity_io *io,
+ unsigned short ioprio)
{
sector_t block = io->block;
unsigned int n_blocks = io->n_blocks;
@@ -754,6 +757,7 @@ static void verity_submit_prefetch(struct dm_verity *v, struct dm_verity_io *io)
pw->v = v;
pw->block = block;
pw->n_blocks = n_blocks;
+ pw->ioprio = ioprio;
queue_work(v->verify_wq, &pw->work);
}

@@ -796,7 +800,7 @@ static int verity_map(struct dm_target *ti, struct bio *bio)

verity_fec_init_io(io);

- verity_submit_prefetch(v, io);
+ verity_submit_prefetch(v, io, bio_prio(bio));

submit_bio_noacct(bio);

--
2.34.1


2023-12-21 10:33:36

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v7 5/5] dm-crypt: Fix lost ioprio when queuing write bios

From: Hongyu Jin <[email protected]>

Since dm-crypt queues writes to a different kernel thread (a
workqueue), the bios are dispatched from tasks whose
io_context->ioprio settings and blkcg differ from the submitting
task's, thus presenting the wrong ioprio to the I/O scheduler.

Get the original I/O priority from struct dm_crypt_io::base_bio and
set it on the cloned write bio.

Link: https://lore.kernel.org/dm-devel/[email protected]

Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-crypt.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 2ae8560b6a14..ba6e794f7871 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1683,6 +1683,7 @@ static struct bio *crypt_alloc_buffer(struct dm_crypt_io *io, unsigned int size)
GFP_NOIO, &cc->bs);
clone->bi_private = io;
clone->bi_end_io = crypt_endio;
+ clone->bi_ioprio = io->base_bio->bi_ioprio;

remaining_size = size;

--
2.34.1


2023-12-23 15:41:41

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH v7 0/5] Fix I/O priority lost in device-mapper

On Thu, Dec 21, 2023 at 06:31:34PM +0800, Hongyu Jin wrote:
> Hongyu Jin (5):
> block: Fix bio IO priority setting
> dm: Support I/O priority for dm_io()
> dm-bufio: Support I/O priority
> dm verity: Fix I/O priority lost when read FEC and hash
> dm-crypt: Fix lost ioprio when queuing write bios

Looks good,

Reviewed-by: Eric Biggers <[email protected]>

- Eric

2024-01-24 05:36:18

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v8 0/5] Fix I/O priority lost in device-mapper

From: Hongyu Jin <[email protected]>

High-priority tasks read data from dm-verity devices with RT I/O
priority, but that RT priority is lost when FEC and hash blocks are
read via kworker-submitted I/O during verification, so the
verification phase may be blocked by lower-priority I/O.

Dm-crypt has the same problem in its write path.

This happens because the kworker lacks the submitting task's
io_context and blkcg.

Move bio_set_ioprio() into submit_bio():
1. Call bio_set_ioprio() only once, on the original bio; bios cloned
or split from the original bio inherit its priority automatically
during the clone.

2. Pass the I/O priority of the original bio down to dm, so that dm
targets can inherit it as needed.

Changes in v8:
- Rebase patch 1 on commit 7ed2632ec7d7
Changes in v7:
- Modify patch 4: change dm-verity-fec.c
Changes in v6:
- Rebase patch and resolve conflict for patch 1, 3, 4
- Modify patch 4: make fec_read_parity() follow the priority of the
original bio
- Update commit message
Changes in v5:
- Rewrite patch 2, add ioprio parameter in dm_io();
- Modify dm_io() in patch 3
Changes in v4:
- Modify commit messages per review suggestions
- Modify patch for dm-crypt
Changes in v3:
- Split patch for device-mapper
- Add patch to fix the dm-crypt I/O priority issue
- Add block patch to review together
- Fix some errors in the v2 patch
Changes in v2:
- Add ioprio field in struct dm_io_region
- Initialize struct dm_io_region::ioprio to IOPRIO_DEFAULT
- Add two interfaces


Hongyu Jin (5):
block: Fix bio IO priority setting
dm: Support I/O priority for dm_io()
dm-bufio: Support I/O priority
dm verity: Fix I/O priority lost when read FEC and hash
dm-crypt: Fix lost ioprio when queuing write bios

block/blk-core.c | 10 +++++
block/blk-mq.c | 10 -----
drivers/md/dm-bufio.c | 43 +++++++++++--------
drivers/md/dm-crypt.c | 1 +
drivers/md/dm-ebs-target.c | 8 ++--
drivers/md/dm-integrity.c | 12 +++---
drivers/md/dm-io.c | 23 +++++-----
drivers/md/dm-kcopyd.c | 4 +-
drivers/md/dm-log.c | 4 +-
drivers/md/dm-raid1.c | 6 +--
drivers/md/dm-snap-persistent.c | 8 ++--
drivers/md/dm-verity-fec.c | 21 +++++----
drivers/md/dm-verity-target.c | 13 ++++--
drivers/md/dm-writecache.c | 8 ++--
drivers/md/persistent-data/dm-block-manager.c | 6 +--
include/linux/dm-bufio.h | 5 ++-
include/linux/dm-io.h | 3 +-
17 files changed, 102 insertions(+), 83 deletions(-)


base-commit: 7ed2632ec7d72e926b9e8bcc9ad1bb0cd37274bf
--
2.34.1


2024-01-24 05:37:00

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v8 2/5] dm: Support I/O priority for dm_io()

From: Hongyu Jin <[email protected]>

Some I/O is dispatched from a kworker with io_context settings that
differ from the submitting task's, so we may need to specify a
priority explicitly to avoid losing it.

Add an I/O priority parameter to dm_io().

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-bufio.c | 6 +++---
drivers/md/dm-integrity.c | 10 +++++-----
drivers/md/dm-io.c | 23 +++++++++++++----------
drivers/md/dm-kcopyd.c | 4 ++--
drivers/md/dm-log.c | 4 ++--
drivers/md/dm-raid1.c | 6 +++---
drivers/md/dm-snap-persistent.c | 4 ++--
drivers/md/dm-writecache.c | 8 ++++----
include/linux/dm-io.h | 3 ++-
9 files changed, 36 insertions(+), 32 deletions(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 13c65b7e1ed6..f5541b8f6320 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1315,7 +1315,7 @@ static void use_dmio(struct dm_buffer *b, enum req_op op, sector_t sector,
io_req.mem.ptr.vma = (char *)b->data + offset;
}

- r = dm_io(&io_req, 1, &region, NULL);
+ r = dm_io(&io_req, 1, &region, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
b->end_io(b, errno_to_blk_status(r));
}
@@ -2167,7 +2167,7 @@ int dm_bufio_issue_flush(struct dm_bufio_client *c)
if (WARN_ON_ONCE(dm_bufio_in_request()))
return -EINVAL;

- return dm_io(&io_req, 1, &io_reg, NULL);
+ return dm_io(&io_req, 1, &io_reg, NULL, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_issue_flush);

@@ -2191,7 +2191,7 @@ int dm_bufio_issue_discard(struct dm_bufio_client *c, sector_t block, sector_t c
if (WARN_ON_ONCE(dm_bufio_in_request()))
return -EINVAL; /* discards are optional */

- return dm_io(&io_req, 1, &io_reg, NULL);
+ return dm_io(&io_req, 1, &io_reg, NULL, IOPRIO_DEFAULT);
}
EXPORT_SYMBOL_GPL(dm_bufio_issue_discard);

diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index c5f03aab4552..ed45411eb68d 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -553,7 +553,7 @@ static int sync_rw_sb(struct dm_integrity_c *ic, blk_opf_t opf)
}
}

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
return r;

@@ -1071,7 +1071,7 @@ static void rw_journal_sectors(struct dm_integrity_c *ic, blk_opf_t opf,
io_loc.sector = ic->start + SB_SECTORS + sector;
io_loc.count = n_sectors;

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r)) {
dm_integrity_io_error(ic, (opf & REQ_OP_MASK) == REQ_OP_READ ?
"reading journal" : "writing journal", r);
@@ -1188,7 +1188,7 @@ static void copy_from_journal(struct dm_integrity_c *ic, unsigned int section, u
io_loc.sector = target;
io_loc.count = n_sectors;

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r)) {
WARN_ONCE(1, "asynchronous dm_io failed: %d", r);
fn(-1UL, data);
@@ -1517,7 +1517,7 @@ static void dm_integrity_flush_buffers(struct dm_integrity_c *ic, bool flush_dat
fr.io_reg.count = 0,
fr.ic = ic;
init_completion(&fr.comp);
- r = dm_io(&fr.io_req, 1, &fr.io_reg, NULL);
+ r = dm_io(&fr.io_req, 1, &fr.io_reg, NULL, IOPRIO_DEFAULT);
BUG_ON(r);
}

@@ -2740,7 +2740,7 @@ static void integrity_recalc(struct work_struct *w)
io_loc.sector = get_data_sector(ic, area, offset);
io_loc.count = n_sectors;

- r = dm_io(&io_req, 1, &io_loc, NULL);
+ r = dm_io(&io_req, 1, &io_loc, NULL, IOPRIO_DEFAULT);
if (unlikely(r)) {
dm_integrity_io_error(ic, "reading data", r);
goto err;
diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index f053ce245814..7409490259d1 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -305,7 +305,7 @@ static void km_dp_init(struct dpages *dp, void *data)
*/
static void do_region(const blk_opf_t opf, unsigned int region,
struct dm_io_region *where, struct dpages *dp,
- struct io *io)
+ struct io *io, unsigned short ioprio)
{
struct bio *bio;
struct page *page;
@@ -354,6 +354,7 @@ static void do_region(const blk_opf_t opf, unsigned int region,
&io->client->bios);
bio->bi_iter.bi_sector = where->sector + (where->count - remaining);
bio->bi_end_io = endio;
+ bio->bi_ioprio = ioprio;
store_io_and_region_in_bio(bio, io, region);

if (op == REQ_OP_DISCARD || op == REQ_OP_WRITE_ZEROES) {
@@ -383,7 +384,7 @@ static void do_region(const blk_opf_t opf, unsigned int region,

static void dispatch_io(blk_opf_t opf, unsigned int num_regions,
struct dm_io_region *where, struct dpages *dp,
- struct io *io, int sync)
+ struct io *io, int sync, unsigned short ioprio)
{
int i;
struct dpages old_pages = *dp;
@@ -400,7 +401,7 @@ static void dispatch_io(blk_opf_t opf, unsigned int num_regions,
for (i = 0; i < num_regions; i++) {
*dp = old_pages;
if (where[i].count || (opf & REQ_PREFLUSH))
- do_region(opf, i, where + i, dp, io);
+ do_region(opf, i, where + i, dp, io, ioprio);
}

/*
@@ -425,7 +426,7 @@ static void sync_io_complete(unsigned long error, void *context)

static int sync_io(struct dm_io_client *client, unsigned int num_regions,
struct dm_io_region *where, blk_opf_t opf, struct dpages *dp,
- unsigned long *error_bits)
+ unsigned long *error_bits, unsigned short ioprio)
{
struct io *io;
struct sync_io sio;
@@ -447,7 +448,7 @@ static int sync_io(struct dm_io_client *client, unsigned int num_regions,
io->vma_invalidate_address = dp->vma_invalidate_address;
io->vma_invalidate_size = dp->vma_invalidate_size;

- dispatch_io(opf, num_regions, where, dp, io, 1);
+ dispatch_io(opf, num_regions, where, dp, io, 1, ioprio);

wait_for_completion_io(&sio.wait);

@@ -459,7 +460,8 @@ static int sync_io(struct dm_io_client *client, unsigned int num_regions,

static int async_io(struct dm_io_client *client, unsigned int num_regions,
struct dm_io_region *where, blk_opf_t opf,
- struct dpages *dp, io_notify_fn fn, void *context)
+ struct dpages *dp, io_notify_fn fn, void *context,
+ unsigned short ioprio)
{
struct io *io;

@@ -479,7 +481,7 @@ static int async_io(struct dm_io_client *client, unsigned int num_regions,
io->vma_invalidate_address = dp->vma_invalidate_address;
io->vma_invalidate_size = dp->vma_invalidate_size;

- dispatch_io(opf, num_regions, where, dp, io, 0);
+ dispatch_io(opf, num_regions, where, dp, io, 0, ioprio);
return 0;
}

@@ -521,7 +523,8 @@ static int dp_init(struct dm_io_request *io_req, struct dpages *dp,
}

int dm_io(struct dm_io_request *io_req, unsigned int num_regions,
- struct dm_io_region *where, unsigned long *sync_error_bits)
+ struct dm_io_region *where, unsigned long *sync_error_bits,
+ unsigned short ioprio)
{
int r;
struct dpages dp;
@@ -532,11 +535,11 @@ int dm_io(struct dm_io_request *io_req, unsigned int num_regions,

if (!io_req->notify.fn)
return sync_io(io_req->client, num_regions, where,
- io_req->bi_opf, &dp, sync_error_bits);
+ io_req->bi_opf, &dp, sync_error_bits, ioprio);

return async_io(io_req->client, num_regions, where,
io_req->bi_opf, &dp, io_req->notify.fn,
- io_req->notify.context);
+ io_req->notify.context, ioprio);
}
EXPORT_SYMBOL(dm_io);

diff --git a/drivers/md/dm-kcopyd.c b/drivers/md/dm-kcopyd.c
index 36bcfdccae04..6ea75436a433 100644
--- a/drivers/md/dm-kcopyd.c
+++ b/drivers/md/dm-kcopyd.c
@@ -578,9 +578,9 @@ static int run_io_job(struct kcopyd_job *job)
io_job_start(job->kc->throttle);

if (job->op == REQ_OP_READ)
- r = dm_io(&io_req, 1, &job->source, NULL);
+ r = dm_io(&io_req, 1, &job->source, NULL, IOPRIO_DEFAULT);
else
- r = dm_io(&io_req, job->num_dests, job->dests, NULL);
+ r = dm_io(&io_req, job->num_dests, job->dests, NULL, IOPRIO_DEFAULT);

return r;
}
diff --git a/drivers/md/dm-log.c b/drivers/md/dm-log.c
index f9f84236dfcd..f7f9c2100937 100644
--- a/drivers/md/dm-log.c
+++ b/drivers/md/dm-log.c
@@ -300,7 +300,7 @@ static int rw_header(struct log_c *lc, enum req_op op)
{
lc->io_req.bi_opf = op;

- return dm_io(&lc->io_req, 1, &lc->header_location, NULL);
+ return dm_io(&lc->io_req, 1, &lc->header_location, NULL, IOPRIO_DEFAULT);
}

static int flush_header(struct log_c *lc)
@@ -313,7 +313,7 @@ static int flush_header(struct log_c *lc)

lc->io_req.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;

- return dm_io(&lc->io_req, 1, &null_location, NULL);
+ return dm_io(&lc->io_req, 1, &null_location, NULL, IOPRIO_DEFAULT);
}

static int read_header(struct log_c *log)
diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index ddcb2bc4a617..9511dae5b556 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -278,7 +278,7 @@ static int mirror_flush(struct dm_target *ti)
}

error_bits = -1;
- dm_io(&io_req, ms->nr_mirrors, io, &error_bits);
+ dm_io(&io_req, ms->nr_mirrors, io, &error_bits, IOPRIO_DEFAULT);
if (unlikely(error_bits != 0)) {
for (i = 0; i < ms->nr_mirrors; i++)
if (test_bit(i, &error_bits))
@@ -554,7 +554,7 @@ static void read_async_bio(struct mirror *m, struct bio *bio)

map_region(&io, m, bio);
bio_set_m(bio, m);
- BUG_ON(dm_io(&io_req, 1, &io, NULL));
+ BUG_ON(dm_io(&io_req, 1, &io, NULL, IOPRIO_DEFAULT));
}

static inline int region_in_sync(struct mirror_set *ms, region_t region,
@@ -681,7 +681,7 @@ static void do_write(struct mirror_set *ms, struct bio *bio)
*/
bio_set_m(bio, get_default_mirror(ms));

- BUG_ON(dm_io(&io_req, ms->nr_mirrors, io, NULL));
+ BUG_ON(dm_io(&io_req, ms->nr_mirrors, io, NULL, IOPRIO_DEFAULT));
}

static void do_writes(struct mirror_set *ms, struct bio_list *writes)
diff --git a/drivers/md/dm-snap-persistent.c b/drivers/md/dm-snap-persistent.c
index 15649921f2a9..568d10842b1f 100644
--- a/drivers/md/dm-snap-persistent.c
+++ b/drivers/md/dm-snap-persistent.c
@@ -223,7 +223,7 @@ static void do_metadata(struct work_struct *work)
{
struct mdata_req *req = container_of(work, struct mdata_req, work);

- req->result = dm_io(req->io_req, 1, req->where, NULL);
+ req->result = dm_io(req->io_req, 1, req->where, NULL, IOPRIO_DEFAULT);
}

/*
@@ -247,7 +247,7 @@ static int chunk_io(struct pstore *ps, void *area, chunk_t chunk, blk_opf_t opf,
struct mdata_req req;

if (!metadata)
- return dm_io(&io_req, 1, &where, NULL);
+ return dm_io(&io_req, 1, &where, NULL, IOPRIO_DEFAULT);

req.where = &where;
req.io_req = &io_req;
diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
index 074cb785eafc..6a4279bfb1e7 100644
--- a/drivers/md/dm-writecache.c
+++ b/drivers/md/dm-writecache.c
@@ -531,7 +531,7 @@ static void ssd_commit_flushed(struct dm_writecache *wc, bool wait_for_ios)
req.notify.context = &endio;

/* writing via async dm-io (implied by notify.fn above) won't return an error */
- (void) dm_io(&req, 1, &region, NULL);
+ (void) dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
i = j;
}

@@ -568,7 +568,7 @@ static void ssd_commit_superblock(struct dm_writecache *wc)
req.notify.fn = NULL;
req.notify.context = NULL;

- r = dm_io(&req, 1, &region, NULL);
+ r = dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
writecache_error(wc, r, "error writing superblock");
}
@@ -596,7 +596,7 @@ static void writecache_disk_flush(struct dm_writecache *wc, struct dm_dev *dev)
req.client = wc->dm_io;
req.notify.fn = NULL;

- r = dm_io(&req, 1, &region, NULL);
+ r = dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
if (unlikely(r))
writecache_error(wc, r, "error flushing metadata: %d", r);
}
@@ -990,7 +990,7 @@ static int writecache_read_metadata(struct dm_writecache *wc, sector_t n_sectors
req.client = wc->dm_io;
req.notify.fn = NULL;

- return dm_io(&req, 1, &region, NULL);
+ return dm_io(&req, 1, &region, NULL, IOPRIO_DEFAULT);
}

static void writecache_resume(struct dm_target *ti)
diff --git a/include/linux/dm-io.h b/include/linux/dm-io.h
index 7595142f3fc5..7b2968612b7e 100644
--- a/include/linux/dm-io.h
+++ b/include/linux/dm-io.h
@@ -80,7 +80,8 @@ void dm_io_client_destroy(struct dm_io_client *client);
* error occurred doing io to the corresponding region.
*/
int dm_io(struct dm_io_request *io_req, unsigned int num_regions,
- struct dm_io_region *region, unsigned long *sync_error_bits);
+ struct dm_io_region *region, unsigned long *sync_error_bits,
+ unsigned short ioprio);

#endif /* __KERNEL__ */
#endif /* _LINUX_DM_IO_H */
--
2.34.1


2024-01-24 05:38:44

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v8 5/5] dm-crypt: Fix lost ioprio when queuing write bios

From: Hongyu Jin <[email protected]>

Since dm-crypt queues writes to a different kernel thread (a
workqueue), the bios are dispatched from tasks whose
io_context->ioprio settings and blkcg differ from the submitting
task's, thus presenting the wrong ioprio to the I/O scheduler.

Get the original io priority setting via struct dm_crypt_io::base_bio
and set this priority to the bio for write.

Link: https://lore.kernel.org/dm-devel/[email protected]

Signed-off-by: Hongyu Jin <[email protected]>
---
drivers/md/dm-crypt.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 855b482cbff1..e0804a86946f 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1683,6 +1683,7 @@ static struct bio *crypt_alloc_buffer(struct dm_crypt_io *io, unsigned int size)
GFP_NOIO, &cc->bs);
clone->bi_private = io;
clone->bi_end_io = crypt_endio;
+ clone->bi_ioprio = io->base_bio->bi_ioprio;

remaining_size = size;

--
2.34.1
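
For context, a hedged sketch of the submitter side: the function below
is hypothetical, but it shows how a bio tagged with an RT priority
before submission now keeps that priority on the write clone that
dm-crypt queues, since crypt_alloc_buffer() copies bi_ioprio from the
base bio.

    #include <linux/bio.h>
    #include <linux/ioprio.h>

    /*
     * Hypothetical submitter: tag a write bio as realtime, level 0.
     * With this patch applied, the clone allocated in
     * crypt_alloc_buffer() inherits bi_ioprio from this base bio, so
     * the write queued to the dm-crypt workqueue reaches the I/O
     * scheduler with RT priority.
     */
    static void example_submit_rt_write(struct bio *bio)
    {
            bio->bi_ioprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_RT, 0);
            submit_bio(bio);
    }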


2024-01-24 05:39:01

by Hongyu Jin

[permalink] [raw]
Subject: [PATCH v8 1/5] block: Fix bio IO priority setting

From: Hongyu Jin <[email protected]>

Move bio_set_ioprio() into submit_bio():
1. Call bio_set_ioprio() only once, to set the priority of the
original bio; bios cloned or split from the original bio inherit its
priority automatically during the clone.

2. The I/O priority can be passed to modules that implement
struct gendisk::fops::submit_bio, which helps resolve some of the
I/O priority loss issues.

This patch depends on commit 82b74cac2849 ("blk-ioprio: Convert from
rqos policy to direct call").

Fixes: a78418e6a04c ("block: Always initialize bio IO priority on submit")

Co-developed-by: Yibin Ding <[email protected]>
Signed-off-by: Yibin Ding <[email protected]>
Signed-off-by: Hongyu Jin <[email protected]>
---
block/blk-core.c | 10 ++++++++++
block/blk-mq.c | 10 ----------
2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 11342af420d0..de771093b526 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -49,6 +49,7 @@
#include "blk-pm.h"
#include "blk-cgroup.h"
#include "blk-throttle.h"
+#include "blk-ioprio.h"

struct dentry *blk_debugfs_root;

@@ -833,6 +834,14 @@ void submit_bio_noacct(struct bio *bio)
}
EXPORT_SYMBOL(submit_bio_noacct);

+static void bio_set_ioprio(struct bio *bio)
+{
+ /* Nobody set ioprio so far? Initialize it based on task's nice value */
+ if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE)
+ bio->bi_ioprio = get_current_ioprio();
+ blkcg_set_ioprio(bio);
+}
+
/**
* submit_bio - submit a bio to the block device layer for I/O
* @bio: The &struct bio which describes the I/O
@@ -855,6 +864,7 @@ void submit_bio(struct bio *bio)
count_vm_events(PGPGOUT, bio_sectors(bio));
}

+ bio_set_ioprio(bio);
submit_bio_noacct(bio);
}
EXPORT_SYMBOL(submit_bio);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index aa87fcfda1ec..2dc01551e27c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -40,7 +40,6 @@
#include "blk-stat.h"
#include "blk-mq-sched.h"
#include "blk-rq-qos.h"
-#include "blk-ioprio.h"

static DEFINE_PER_CPU(struct llist_head, blk_cpu_done);
static DEFINE_PER_CPU(call_single_data_t, blk_cpu_csd);
@@ -2944,14 +2943,6 @@ static bool blk_mq_use_cached_rq(struct request *rq, struct blk_plug *plug,
return true;
}

-static void bio_set_ioprio(struct bio *bio)
-{
- /* Nobody set ioprio so far? Initialize it based on task's nice value */
- if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE)
- bio->bi_ioprio = get_current_ioprio();
- blkcg_set_ioprio(bio);
-}
-
/**
* blk_mq_submit_bio - Create and send a request to block device.
* @bio: Bio pointer.
@@ -2976,7 +2967,6 @@ void blk_mq_submit_bio(struct bio *bio)
blk_status_t ret;

bio = blk_queue_bounce(bio, q);
- bio_set_ioprio(bio);

if (plug) {
rq = rq_list_peek(&plug->cached_rq);
--
2.34.1
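
As a usage note, the fallback in bio_set_ioprio() means a task-level
priority set from userspace now reaches stacked drivers as well. A
minimal, illustrative userspace snippet (using the raw syscall, since
glibc provides no ioprio_set() wrapper) might look like this:

    #include <sys/syscall.h>
    #include <unistd.h>
    #include <linux/ioprio.h>

    int main(void)
    {
            /* Give the calling task an RT I/O priority, level 0. */
            int prio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_RT, 0);

            if (syscall(SYS_ioprio_set, IOPRIO_WHO_PROCESS, 0, prio) < 0)
                    return 1;

            /*
             * Bios this task submits without an explicit bi_ioprio now
             * pick up this priority via get_current_ioprio() in
             * bio_set_ioprio() before submit_bio_noacct() runs.
             */
            return 0;
    }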


2024-01-29 16:34:14

by Mike Snitzer

[permalink] [raw]
Subject: Re: [PATCH v8 0/5] Fix I/O priority lost in device-mapper

On Wed, Jan 24 2024 at 12:35P -0500,
Hongyu Jin <[email protected]> wrote:

> From: Hongyu Jin <[email protected]>
>
> High-priority tasks read data from dm-verity devices with RT I/O
> priority, but that I/O loses its RT priority when the FEC and hash
> blocks are read via kworker-submitted I/O during verification, so the
> verification phase may be blocked by low-priority I/O.
>
> Dm-crypt has the same problem in the data writing process.
>
> This is because io_context and blkcg are missing.
>
> Move bio_set_ioprio() into submit_bio():
> 1. Call bio_set_ioprio() only once, to set the priority of the
> original bio; bios cloned or split from the original bio inherit its
> priority automatically during the clone.
>
> 2. Pass the I/O priority of the original bio down to dm, and let the
> dm target inherit the I/O priority as needed.
>
> Changes in v8:
> - Rebase patch 1 on commit 7ed2632ec7d7
> Changes in v7:
> - Modify patch 4: change dm-verity-fec.c
> Changes in v6:
> - Rebase patches and resolve conflicts for patches 1, 3 and 4
> - Modify patch 4: fec_read_parity() follows the priority of the
> original bio
> - Update commit message
> Changes in v5:
> - Rewrite patch 2, adding an ioprio parameter to dm_io()
> - Modify dm_io() in patch 3
> Changes in v4:
> - Modify commit messages per suggestions
> - Modify patch for dm-crypt
> Changes in v3:
> - Split patch for device-mapper
> - Add patch to fix the dm-crypt I/O priority issue
> - Add block patch to review together
> - Fix some errors in the v2 patch
> Changes in v2:
> - Add ioprio field to struct dm_io_region
> - Initialize struct dm_io_region::ioprio to IOPRIO_DEFAULT
> - Add two interfaces
>
>
> Hongyu Jin (5):
> block: Fix bio IO priority setting
> dm: Support I/O priority for dm_io()
> dm-bufio: Support I/O priority
> dm verity: Fix I/O priority lost when read FEC and hash
> dm-crypt: Fix lost ioprio when queuing write bios

Sorry for the delay... I've been consumed with other work. I will
look at this patchset for consideration for the 6.9 merge window (we
still have time to make changes given we're now squarely in the 6.9
development window). So I appreciate that getting you feedback sooner
rather than later is both useful and important.

I see Eric provided his Reviewed-by for v7 -- that really helps. BUT,
for some reason you didn't add his Reviewed-by to each commit when you
rebased for v8...

Mikulas, if you beat me to providing a closer review: great. If not,
that's cool. That DM requires such care (with changes sprinkled
throughout DM core and targets) is unfortunate -- but it could be
unavoidable all things considered. I will look closer "soon" (if not
this week then next).

Thanks for following up!

Mike

2024-01-29 19:31:59

by Mikulas Patocka

[permalink] [raw]
Subject: Re: [PATCH v8 0/5] Fix I/O priority lost in device-mapper

The patchset seems OK to me.

Reviewed-by: Mikulas Patocka <[email protected]>


On Wed, 24 Jan 2024, Hongyu Jin wrote:

> From: Hongyu Jin <[email protected]>
>
> High-priority tasks read data from dm-verity devices with RT I/O
> priority, but that I/O loses its RT priority when the FEC and hash
> blocks are read via kworker-submitted I/O during verification, so the
> verification phase may be blocked by low-priority I/O.
>
> Dm-crypt has the same problem in the data writing process.
>
> This is because io_context and blkcg are missing.
>
> Move bio_set_ioprio() into submit_bio():
> 1. Call bio_set_ioprio() only once, to set the priority of the
> original bio; bios cloned or split from the original bio inherit its
> priority automatically during the clone.
>
> 2. Pass the I/O priority of the original bio down to dm, and let the
> dm target inherit the I/O priority as needed.
>
> Changes in v8:
> - Rebase patch 1 on commit 7ed2632ec7d7
> Changes in v7:
> - Modify patch 4: change dm-verity-fec.c
> Changes in v6:
> - Rebase patches and resolve conflicts for patches 1, 3 and 4
> - Modify patch 4: fec_read_parity() follows the priority of the
> original bio
> - Update commit message
> Changes in v5:
> - Rewrite patch 2, adding an ioprio parameter to dm_io()
> - Modify dm_io() in patch 3
> Changes in v4:
> - Modify commit messages per suggestions
> - Modify patch for dm-crypt
> Changes in v3:
> - Split patch for device-mapper
> - Add patch to fix the dm-crypt I/O priority issue
> - Add block patch to review together
> - Fix some errors in the v2 patch
> Changes in v2:
> - Add ioprio field to struct dm_io_region
> - Initialize struct dm_io_region::ioprio to IOPRIO_DEFAULT
> - Add two interfaces
>
>
> Hongyu Jin (5):
> block: Fix bio IO priority setting
> dm: Support I/O priority for dm_io()
> dm-bufio: Support I/O priority
> dm verity: Fix I/O priority lost when read FEC and hash
> dm-crypt: Fix lost ioprio when queuing write bios
>
> block/blk-core.c | 10 +++++
> block/blk-mq.c | 10 -----
> drivers/md/dm-bufio.c | 43 +++++++++++--------
> drivers/md/dm-crypt.c | 1 +
> drivers/md/dm-ebs-target.c | 8 ++--
> drivers/md/dm-integrity.c | 12 +++---
> drivers/md/dm-io.c | 23 +++++-----
> drivers/md/dm-kcopyd.c | 4 +-
> drivers/md/dm-log.c | 4 +-
> drivers/md/dm-raid1.c | 6 +--
> drivers/md/dm-snap-persistent.c | 8 ++--
> drivers/md/dm-verity-fec.c | 21 +++++----
> drivers/md/dm-verity-target.c | 13 ++++--
> drivers/md/dm-writecache.c | 8 ++--
> drivers/md/persistent-data/dm-block-manager.c | 6 +--
> include/linux/dm-bufio.h | 5 ++-
> include/linux/dm-io.h | 3 +-
> 17 files changed, 102 insertions(+), 83 deletions(-)
>
>
> base-commit: 7ed2632ec7d72e926b9e8bcc9ad1bb0cd37274bf
> --
> 2.34.1