2017-08-08 08:46:14

by Ming Lei

Subject: [PATCH v3 00/49] block: support multipage bvec

Hi,

This patchset brings multipage bvecs into the block layer:

1) what is multipage bvec?

A multipage bvec means that one 'struct bio_vec' can hold
multiple physically contiguous pages, instead of the single
page the Linux kernel has stored per bvec for a long time.
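
For illustration, a minimal sketch (field values are hypothetical) of one
multipage bvec describing an 8KB physically contiguous buffer that starts
at the beginning of a page; with singlepage bvecs the same buffer would
need two vectors:

    struct bio_vec mp_bvec = {
            .bv_page   = first_page,     /* 1st page of the buffer (hypothetical) */
            .bv_offset = 0,              /* offset of the buffer in that page */
            .bv_len    = 2 * PAGE_SIZE,  /* covers the next page too */
    };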

2) why is multipage bvec introduced?

Kent proposed the idea[1] first.

As system RAM becomes much bigger than before, and huge
pages, transparent huge pages and memory compaction are
widely used, it is now fairly common to see physically
contiguous pages coming from filesystems in I/O.
From the block layer's view, it isn't necessary to store
the intermediate pages in the bvec; it is enough to store
just the physically contiguous 'segment' in each io vector.

Also, huge pages are being brought to filesystems and swap
[2][6], and we can do IO on a whole huge page at a time[3],
which requires that one bio be able to transfer at least one
huge page in one go. It turns out that simply changing
BIO_MAX_PAGES isn't flexible enough[3][5]. Multipage bvecs
fit this case very well.

With multipage bvec:

- segment handling in the block layer can be improved a lot
in the future, since it becomes quite easy to convert a
multipage bvec into a segment. For example, we might
just store a segment in each bvec directly one day.

- bio size can be increased, which should improve some
high-bandwidth IO cases in theory[4].

- inside the block layer, both bio splitting and sg mapping
can become more efficient than before by traversing the
physically contiguous 'segment' instead of each page.

- there is an opportunity in the future to reduce the memory
footprint of bvecs.

3) how is multipage bvec implemented in this patchset?

The first 17 patches add comments on, and deal with, special
cases of direct access to the bvec table.

The 2nd part (18~29) implements multipage bvec in the block layer:

- put all the tricks into the bvec/bio/rq iterators; as long as
drivers and filesystems use these standard iterators, they keep
working with multipage bvec

- use multipage bvecs to split bios and map sg

- bio_for_each_segment_all() changes:
this helper passes a pointer to each bvec directly to the user,
so it has to be changed. Two new helpers (bio_for_each_segment_all_sp()
and bio_for_each_segment_all_mp()) are introduced.

The 3rd part (32~47) converts the current users of bio_for_each_segment_all()
to bio_for_each_segment_all_sp()/bio_for_each_segment_all_mp().

The last part (48~49) enables multipage bvecs.

These patches can be found in the following git tree:

https://github.com/ming1/linux/commits/v4.13-rc3-block-next-mp-bvec-V3

Thanks to Christoph for looking at the early version and providing
very good suggestions, such as introducing bio_init_with_vec_table(),
removing other unnecessary helpers for cleanup, and so on.

Any comments are welcome!

BTW, I will be on a trip during the following week, so I may not
reply promptly.

V3:
- rebase on v4.13-rc3 with for-next of block tree
- run more xfstests: xfs/ext4 over NVMe, SATA, DM (linear) and
MD (raid1); no regressions were triggered
- add Reviewed-by tags on some btrfs patches
- remove two MD patches because both are already merged into
Linus' tree

V2:
- direct access to the bvec table in raid has been cleaned up, so
the NO_MP flag is dropped
- rebase on Neil Brown's recent changes to the bio and bounce code
- reorganize the patchset

V1:
- against v4.10-rc1; some cleanups from V0 are in -linus already
- handle queue_virt_boundary() in the mp bvec change and make NVMe happy
- further BTRFS cleanup
- remove QUEUE_FLAG_SPLIT_MP
- rename the two new helpers of bio_for_each_segment_all()
- fix bounce conversion
- address comments on V0

[1], http://marc.info/?l=linux-kernel&m=141680246629547&w=2
[2], https://patchwork.kernel.org/patch/9451523/
[3], http://marc.info/?t=147735447100001&r=1&w=2
[4], http://marc.info/?l=linux-mm&m=147745525801433&w=2
[5], http://marc.info/?t=149569484500007&r=1&w=2
[6], http://marc.info/?t=149820215300004&r=1&w=2

Ming Lei (49):
block: drbd: comment on direct access bvec table
block: loop: comment on direct access to bvec table
kernel/power/swap.c: comment on direct access to bvec table
mm: page_io.c: comment on direct access to bvec table
fs/buffer: comment on direct access to bvec table
f2fs: f2fs_read_end_io: comment on direct access to bvec table
bcache: comment on direct access to bvec table
block: comment on bio_alloc_pages()
block: comment on bio_iov_iter_get_pages()
dm: limit the max bio size as BIO_MAX_PAGES * PAGE_SIZE
btrfs: avoid access to .bi_vcnt directly
btrfs: avoid to access bvec table directly for a cloned bio
btrfs: comment on direct access bvec table
block: bounce: avoid direct access to bvec table
bvec_iter: introduce BVEC_ITER_ALL_INIT
block: bounce: don't access bio->bi_io_vec in copy_to_high_bio_irq
block: comments on bio_for_each_segment[_all]
block: introduce multipage/single page bvec helpers
block: implement sp version of bvec iterator helpers
block: introduce bio_for_each_segment_mp()
blk-merge: compute bio->bi_seg_front_size efficiently
block: blk-merge: try to make front segments in full size
block: blk-merge: remove unnecessary check
block: use bio_for_each_segment_mp() to compute segments count
block: use bio_for_each_segment_mp() to map sg
block: introduce bvec_for_each_sp_bvec()
block: bio: introduce single/multi page version of
bio_for_each_segment_all()
block: introduce bvec_get_last_page()
fs/buffer.c: use bvec iterator to truncate the bio
btrfs: use bvec_get_last_page to get bio's last page
block: deal with dirtying pages for multipage bvec
block: convert to single/multi page version of
bio_for_each_segment_all()
bcache: convert to bio_for_each_segment_all_sp()
md: raid1: convert to bio_for_each_segment_all_sp()
dm-crypt: don't clear bvec->bv_page in crypt_free_buffer_pages()
dm-crypt: convert to bio_for_each_segment_all_sp()
fs/mpage: convert to bio_for_each_segment_all_sp()
fs/block: convert to bio_for_each_segment_all_sp()
fs/iomap: convert to bio_for_each_segment_all_sp()
ext4: convert to bio_for_each_segment_all_sp()
xfs: convert to bio_for_each_segment_all_sp()
gfs2: convert to bio_for_each_segment_all_sp()
f2fs: convert to bio_for_each_segment_all_sp()
exofs: convert to bio_for_each_segment_all_sp()
fs: crypto: convert to bio_for_each_segment_all_sp()
fs/btrfs: convert to bio_for_each_segment_all_sp()
fs/direct-io: convert to bio_for_each_segment_all_sp()
block: enable multipage bvecs
block: bio: pass segments to bio if bio_add_page() is bypassed

block/bio.c | 137 ++++++++++++++++++++----
block/blk-merge.c | 226 +++++++++++++++++++++++++++++++--------
block/blk-zoned.c | 5 +-
block/bounce.c | 39 ++++---
drivers/block/drbd/drbd_bitmap.c | 1 +
drivers/block/loop.c | 5 +
drivers/md/bcache/btree.c | 4 +-
drivers/md/bcache/super.c | 6 ++
drivers/md/bcache/util.c | 7 ++
drivers/md/dm-crypt.c | 4 +-
drivers/md/dm.c | 11 +-
drivers/md/raid1.c | 3 +-
fs/block_dev.c | 6 +-
fs/btrfs/compression.c | 12 ++-
fs/btrfs/disk-io.c | 3 +-
fs/btrfs/extent_io.c | 38 +++++--
fs/btrfs/extent_io.h | 2 +-
fs/btrfs/inode.c | 22 +++-
fs/btrfs/raid56.c | 1 +
fs/buffer.c | 11 +-
fs/crypto/bio.c | 3 +-
fs/direct-io.c | 4 +-
fs/exofs/ore.c | 3 +-
fs/exofs/ore_raid.c | 3 +-
fs/ext4/page-io.c | 3 +-
fs/ext4/readpage.c | 3 +-
fs/f2fs/data.c | 13 ++-
fs/gfs2/lops.c | 3 +-
fs/gfs2/meta_io.c | 3 +-
fs/iomap.c | 3 +-
fs/mpage.c | 3 +-
fs/xfs/xfs_aops.c | 3 +-
include/linux/bio.h | 67 +++++++++++-
include/linux/blk_types.h | 6 ++
include/linux/bvec.h | 141 ++++++++++++++++++++++--
kernel/power/swap.c | 2 +
mm/page_io.c | 2 +
37 files changed, 674 insertions(+), 134 deletions(-)

--
2.9.4


2017-08-08 08:46:26

by Ming Lei

Subject: [PATCH v3 01/49] block: drbd: comment on direct access bvec table

Signed-off-by: Ming Lei <[email protected]>
---
drivers/block/drbd/drbd_bitmap.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/block/drbd/drbd_bitmap.c b/drivers/block/drbd/drbd_bitmap.c
index 809fd245c3dc..70890d950dc9 100644
--- a/drivers/block/drbd/drbd_bitmap.c
+++ b/drivers/block/drbd/drbd_bitmap.c
@@ -953,6 +953,7 @@ static void drbd_bm_endio(struct bio *bio)
struct drbd_bm_aio_ctx *ctx = bio->bi_private;
struct drbd_device *device = ctx->device;
struct drbd_bitmap *b = device->bitmap;
+ /* single page bio, safe for multipage bvec */
unsigned int idx = bm_page_to_idx(bio->bi_io_vec[0].bv_page);

if ((ctx->flags & BM_AIO_COPY_PAGES) == 0 &&
--
2.9.4

2017-08-08 08:46:38

by Ming Lei

Subject: [PATCH v3 02/49] block: loop: comment on direct access to bvec table

Signed-off-by: Ming Lei <[email protected]>
---
drivers/block/loop.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index ef8334949b42..58df9ed70328 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -487,6 +487,11 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
/* nomerge for loop request queue */
WARN_ON(cmd->rq->bio != cmd->rq->biotail);

+ /*
+ * For multipage bvec support, it is safe to pass the bvec
+ * table to iov iterator, because iov iter still uses bvec
+ * iter helpers to traverse the bvec.
+ */
bvec = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
iov_iter_bvec(&iter, ITER_BVEC | rw, bvec,
bio_segments(bio), blk_rq_bytes(cmd->rq));
--
2.9.4

2017-08-08 08:46:49

by Ming Lei

Subject: [PATCH v3 03/49] kernel/power/swap.c: comment on direct access to bvec table

Cc: "Rafael J. Wysocki" <[email protected]>
Cc: [email protected]
Signed-off-by: Ming Lei <[email protected]>
---
kernel/power/swap.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index 57d22571f306..aa52ccc03fcc 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -238,6 +238,8 @@ static void hib_init_batch(struct hib_bio_batch *hb)
static void hib_end_io(struct bio *bio)
{
struct hib_bio_batch *hb = bio->bi_private;
+
+ /* single page bio, safe for multipage bvec */
struct page *page = bio->bi_io_vec[0].bv_page;

if (bio->bi_status) {
--
2.9.4

2017-08-08 08:47:08

by Ming Lei

Subject: [PATCH v3 05/49] fs/buffer: comment on direct access to bvec table

Signed-off-by: Ming Lei <[email protected]>
---
fs/buffer.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 5715dac7821f..c821ed6a6f0e 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -3054,8 +3054,13 @@ static void end_bio_bh_io_sync(struct bio *bio)
void guard_bio_eod(int op, struct bio *bio)
{
sector_t maxsector;
- struct bio_vec *bvec = &bio->bi_io_vec[bio->bi_vcnt - 1];
unsigned truncated_bytes;
+ /*
+ * It is safe to truncate the last bvec in the following way
+ * even though multipage bvec is supported, but we need to
+ * fix the parameters passed to zero_user().
+ */
+ struct bio_vec *bvec = &bio->bi_io_vec[bio->bi_vcnt - 1];

maxsector = i_size_read(bio->bi_bdev->bd_inode) >> 9;
if (!maxsector)
--
2.9.4

2017-08-08 08:47:16

by Ming Lei

Subject: [PATCH v3 06/49] f2fs: f2fs_read_end_io: comment on direct access to bvec table

Cc: Jaegeuk Kim <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: [email protected]
Signed-off-by: Ming Lei <[email protected]>
---
fs/f2fs/data.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 87c1f4150c64..99fa8e9780e8 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -56,6 +56,10 @@ static void f2fs_read_end_io(struct bio *bio)
int i;

#ifdef CONFIG_F2FS_FAULT_INJECTION
+ /*
+ * It is still safe to retrieve the 1st page of the bio
+ * in this way after supporting multipage bvec.
+ */
if (time_to_inject(F2FS_P_SB(bio->bi_io_vec->bv_page), FAULT_IO)) {
f2fs_show_injection_info(FAULT_IO);
bio->bi_status = BLK_STS_IOERR;
--
2.9.4

2017-08-08 08:47:28

by Ming Lei

Subject: [PATCH v3 07/49] bcache: comment on direct access to bvec table

All of these look safe after multipage bvec is supported.

Cc: [email protected]
Signed-off-by: Ming Lei <[email protected]>
---
drivers/md/bcache/btree.c | 1 +
drivers/md/bcache/super.c | 6 ++++++
drivers/md/bcache/util.c | 7 +++++++
3 files changed, 14 insertions(+)

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 866dcf78ff8e..3da595ae565b 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -431,6 +431,7 @@ static void do_btree_node_write(struct btree *b)

continue_at(cl, btree_node_write_done, NULL);
} else {
+ /* No harm for multipage bvec since the bio is newly allocated */
b->bio->bi_vcnt = 0;
bch_bio_map(b->bio, i);

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 8352fad765f6..6808f548cd13 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -208,6 +208,7 @@ static void write_bdev_super_endio(struct bio *bio)

static void __write_super(struct cache_sb *sb, struct bio *bio)
{
+ /* single page bio, safe for multipage bvec */
struct cache_sb *out = page_address(bio->bi_io_vec[0].bv_page);
unsigned i;

@@ -1154,6 +1155,8 @@ static void register_bdev(struct cache_sb *sb, struct page *sb_page,
dc->bdev->bd_holder = dc;

bio_init(&dc->sb_bio, dc->sb_bio.bi_inline_vecs, 1);
+
+ /* single page bio, safe for multipage bvec */
dc->sb_bio.bi_io_vec[0].bv_page = sb_page;
get_page(sb_page);

@@ -1799,6 +1802,7 @@ void bch_cache_release(struct kobject *kobj)
for (i = 0; i < RESERVE_NR; i++)
free_fifo(&ca->free[i]);

+ /* single page bio, safe for multipage bvec */
if (ca->sb_bio.bi_inline_vecs[0].bv_page)
put_page(ca->sb_bio.bi_io_vec[0].bv_page);

@@ -1854,6 +1858,8 @@ static int register_cache(struct cache_sb *sb, struct page *sb_page,
ca->bdev->bd_holder = ca;

bio_init(&ca->sb_bio, ca->sb_bio.bi_inline_vecs, 1);
+
+ /* single page bio, safe for multipage bvec */
ca->sb_bio.bi_io_vec[0].bv_page = sb_page;
get_page(sb_page);

diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c
index 8c3a938f4bf0..11b4230ea6ad 100644
--- a/drivers/md/bcache/util.c
+++ b/drivers/md/bcache/util.c
@@ -223,6 +223,13 @@ uint64_t bch_next_delay(struct bch_ratelimit *d, uint64_t done)
: 0;
}

+/*
+ * Generally it isn't good to access .bi_io_vec and .bi_vcnt
+ * directly; the preferred way is bio_add_page(). But in
+ * this case, bch_bio_map() assumes that the bvec table
+ * is empty, so it is safe to access .bi_vcnt & .bi_io_vec
+ * in this way even after multipage bvec is supported.
+ */
void bch_bio_map(struct bio *bio, void *base)
{
size_t size = bio->bi_iter.bi_size;
--
2.9.4

2017-08-08 08:47:38

by Ming Lei

Subject: [PATCH v3 08/49] block: comment on bio_alloc_pages()

This patch adds comment on usage of bio_alloc_pages().

Signed-off-by: Ming Lei <[email protected]>
---
block/bio.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/bio.c b/block/bio.c
index e241bbc49f14..826b5d173416 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -981,7 +981,9 @@ EXPORT_SYMBOL(bio_advance);
* @bio: bio to allocate pages for
* @gfp_mask: flags for allocation
*
- * Allocates pages up to @bio->bi_vcnt.
+ * Allocates pages up to @bio->bi_vcnt, and this function should only
+ * be called on a newly initialized bio, which means no pages have been
+ * added to the bio via bio_add_page() yet.
*
* Returns 0 on success, -ENOMEM on failure. On failure, any allocated pages are
* freed.
--
2.9.4

2017-08-08 08:47:48

by Ming Lei

Subject: [PATCH v3 09/49] block: comment on bio_iov_iter_get_pages()

bio_iov_iter_get_pages() uses unused bvec space for
temporarily storing the page pointer array, and this patch
comments on this usage wrt. multipage bvec support.
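
Roughly, the trick looks like the following sketch (simplified, not the
exact upstream code): the not-yet-used tail of the bvec table is borrowed
as a temporary page pointer array before the pages are turned into real
bvecs.

    /* free bvec slots start right after the already-added bvecs */
    struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;

    /* reuse that space as a 'struct page *' array for pinning pages */
    struct page **pages = (struct page **)bv;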

Signed-off-by: Ming Lei <[email protected]>
---
block/bio.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/block/bio.c b/block/bio.c
index 826b5d173416..28697e3c8ce3 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -875,6 +875,10 @@ EXPORT_SYMBOL(bio_add_page);
*
* Pins as many pages from *iter and appends them to @bio's bvec array. The
* pages will have to be released using put_page() when done.
+ *
+ * The hack of using the bvec table as a page pointer array is safe
+ * even after multipage bvec is introduced, because that space is
+ * still thought of as unused by bio_add_page().
*/
int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
{
--
2.9.4

2017-08-08 08:48:00

by Ming Lei

Subject: [PATCH v3 10/49] dm: limit the max bio size as BIO_MAX_PAGES * PAGE_SIZE

For BIO based DM, some targets, such as the crypt target, aren't
ready to deal with incoming bios bigger than 1 Mbyte.

Cc: Mike Snitzer <[email protected]>
Cc:[email protected]
Signed-off-by: Ming Lei <[email protected]>
---
drivers/md/dm.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 2edbcc2d7d3f..631348699fb8 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -922,7 +922,16 @@ int dm_set_target_max_io_len(struct dm_target *ti, sector_t len)
return -EINVAL;
}

- ti->max_io_len = (uint32_t) len;
+ /*
+ * A BIO based queue uses its own splitting. When multipage bvecs
+ * are switched on, the size of the incoming bio may be too big to
+ * be handled by some targets, such as crypt.
+ *
+ * When these targets are ready for the big bio, we can remove
+ * the limit.
+ */
+ ti->max_io_len = min_t(uint32_t, len,
+ (BIO_MAX_PAGES * PAGE_SIZE));

return 0;
}
--
2.9.4

2017-08-08 08:48:14

by Ming Lei

Subject: [PATCH v3 11/49] btrfs: avoid access to .bi_vcnt directly

BTRFS uses bio->bi_vcnt to figure out the number of pages;
this is no longer correct once we start to enable multipage
bvec.

So use bio_for_each_segment_all() to count the pages instead.

Cc: Chris Mason <[email protected]>
Cc: Josef Bacik <[email protected]>
Cc: David Sterba <[email protected]>
Cc: [email protected]
Acked-by: David Sterba <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
---
fs/btrfs/extent_io.c | 20 ++++++++++++++++----
fs/btrfs/extent_io.h | 2 +-
2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 0aff9b278c19..0e7367817b92 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2258,7 +2258,7 @@ int btrfs_get_io_failure_record(struct inode *inode, u64 start, u64 end,
return 0;
}

-bool btrfs_check_repairable(struct inode *inode, struct bio *failed_bio,
+bool btrfs_check_repairable(struct inode *inode, unsigned failed_bio_pages,
struct io_failure_record *failrec, int failed_mirror)
{
struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
@@ -2282,7 +2282,7 @@ bool btrfs_check_repairable(struct inode *inode, struct bio *failed_bio,
* a) deliver good data to the caller
* b) correct the bad sectors on disk
*/
- if (failed_bio->bi_vcnt > 1) {
+ if (failed_bio_pages > 1) {
/*
* to fulfill b), we need to know the exact failing sectors, as
* we don't want to rewrite any more than the failed ones. thus,
@@ -2355,6 +2355,17 @@ struct bio *btrfs_create_repair_bio(struct inode *inode, struct bio *failed_bio,
return bio;
}

+static unsigned int get_bio_pages(struct bio *bio)
+{
+ unsigned i;
+ struct bio_vec *bv;
+
+ bio_for_each_segment_all(bv, bio, i)
+ ;
+
+ return i;
+}
+
/*
* this is a generic handler for readpage errors (default
* readpage_io_failed_hook). if other copies exist, read those and write back
@@ -2375,6 +2386,7 @@ static int bio_readpage_error(struct bio *failed_bio, u64 phy_offset,
int read_mode = 0;
blk_status_t status;
int ret;
+ unsigned failed_bio_pages = get_bio_pages(failed_bio);

BUG_ON(bio_op(failed_bio) == REQ_OP_WRITE);

@@ -2382,13 +2394,13 @@ static int bio_readpage_error(struct bio *failed_bio, u64 phy_offset,
if (ret)
return ret;

- if (!btrfs_check_repairable(inode, failed_bio, failrec,
+ if (!btrfs_check_repairable(inode, failed_bio_pages, failrec,
failed_mirror)) {
free_io_failure(failure_tree, tree, failrec);
return -EIO;
}

- if (failed_bio->bi_vcnt > 1)
+ if (failed_bio_pages > 1)
read_mode |= REQ_FAILFAST_DEV;

phy_offset >>= inode->i_sb->s_blocksize_bits;
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 4f030912f3ef..300ee10f39f2 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -539,7 +539,7 @@ void btrfs_free_io_failure_record(struct btrfs_inode *inode, u64 start,
u64 end);
int btrfs_get_io_failure_record(struct inode *inode, u64 start, u64 end,
struct io_failure_record **failrec_ret);
-bool btrfs_check_repairable(struct inode *inode, struct bio *failed_bio,
+bool btrfs_check_repairable(struct inode *inode, unsigned failed_bio_pages,
struct io_failure_record *failrec, int fail_mirror);
struct bio *btrfs_create_repair_bio(struct inode *inode, struct bio *failed_bio,
struct io_failure_record *failrec,
--
2.9.4

2017-08-08 08:48:22

by Ming Lei

Subject: [PATCH v3 12/49] btrfs: avoid to access bvec table directly for a cloned bio

Commit 17347cec15f919901c90 ("Btrfs: change how we iterate bios in
endio") mentioned that for dio the submitted bio may be fast cloned.
We can't access the bvec table directly for a cloned bio, so use
bio_get_first_bvec() to retrieve the 1st bvec.

Cc: Chris Mason <[email protected]>
Cc: Josef Bacik <[email protected]>
Cc: David Sterba <[email protected]>
Cc: [email protected]
Cc: Liu Bo <[email protected]>
Reviewed-by: Liu Bo <[email protected]>
Acked-by: David Sterba <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
---
fs/btrfs/inode.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 95c212037095..5cf320ee7ea0 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7993,6 +7993,7 @@ static int dio_read_error(struct inode *inode, struct bio *failed_bio,
int read_mode = 0;
int segs;
int ret;
+ struct bio_vec bvec;

BUG_ON(bio_op(failed_bio) == REQ_OP_WRITE);

@@ -8008,8 +8009,9 @@ static int dio_read_error(struct inode *inode, struct bio *failed_bio,
}

segs = bio_segments(failed_bio);
+ bio_get_first_bvec(failed_bio, &bvec);
if (segs > 1 ||
- (failed_bio->bi_io_vec->bv_len > btrfs_inode_sectorsize(inode)))
+ (bvec.bv_len > btrfs_inode_sectorsize(inode)))
read_mode |= REQ_FAILFAST_DEV;

isector = start - btrfs_io_bio(failed_bio)->logical;
--
2.9.4

2017-08-08 08:48:31

by Ming Lei

Subject: [PATCH v3 13/49] btrfs: comment on direct access bvec table

Cc: Chris Mason <[email protected]>
Cc: Josef Bacik <[email protected]>
Cc: David Sterba <[email protected]>
Cc: [email protected]
Acked-by: David Sterba <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
---
fs/btrfs/compression.c | 4 ++++
fs/btrfs/inode.c | 12 ++++++++++++
2 files changed, 16 insertions(+)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index d2ef9ac2a630..f795d0a6d176 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -542,6 +542,10 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,

/* we need the actual starting offset of this extent in the file */
read_lock(&em_tree->lock);
+ /*
+ * It is still safe to retrieve the 1st page of the bio
+ * in this way after supporting multipage bvec.
+ */
em = lookup_extent_mapping(em_tree,
page_offset(bio->bi_io_vec->bv_page),
PAGE_SIZE);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 5cf320ee7ea0..084ed99dd308 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8051,6 +8051,12 @@ static void btrfs_retry_endio_nocsum(struct bio *bio)
if (bio->bi_status)
goto end;

+ /*
+ * WARNING:
+ *
+ * With multipage bvec, the following direct access to the
+ * bvec table is only safe if the bio includes a single page.
+ */
ASSERT(bio->bi_vcnt == 1);
io_tree = &BTRFS_I(inode)->io_tree;
failure_tree = &BTRFS_I(inode)->io_failure_tree;
@@ -8143,6 +8149,12 @@ static void btrfs_retry_endio(struct bio *bio)

uptodate = 1;

+ /*
+ * WARNING:
+ *
+ * With multipage bvec, the following direct access to the
+ * bvec table is only safe if the bio includes a single page.
+ */
ASSERT(bio->bi_vcnt == 1);
ASSERT(bio->bi_io_vec->bv_len == btrfs_inode_sectorsize(done->inode));

--
2.9.4

2017-08-08 08:48:40

by Ming Lei

Subject: [PATCH v3 14/49] block: bounce: avoid direct access to bvec table

We will support multipage bvecs in the future, so switch to the
iterator way of getting the bv_page of a bvec from the original bio.

Cc: Matthew Wilcox <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
---
block/bounce.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/block/bounce.c b/block/bounce.c
index 5793c2dc1a15..e57cf2bdcd27 100644
--- a/block/bounce.c
+++ b/block/bounce.c
@@ -136,21 +136,20 @@ static void copy_to_high_bio_irq(struct bio *to, struct bio *from)
static void bounce_end_io(struct bio *bio, mempool_t *pool)
{
struct bio *bio_orig = bio->bi_private;
- struct bio_vec *bvec, *org_vec;
+ struct bio_vec *bvec, orig_vec;
int i;
- int start = bio_orig->bi_iter.bi_idx;
+ struct bvec_iter orig_iter = bio_orig->bi_iter;

/*
* free up bounce indirect pages used
*/
bio_for_each_segment_all(bvec, bio, i) {
- org_vec = bio_orig->bi_io_vec + i + start;
-
- if (bvec->bv_page == org_vec->bv_page)
- continue;
-
- dec_zone_page_state(bvec->bv_page, NR_BOUNCE);
- mempool_free(bvec->bv_page, pool);
+ orig_vec = bio_iter_iovec(bio_orig, orig_iter);
+ if (bvec->bv_page != orig_vec.bv_page) {
+ dec_zone_page_state(bvec->bv_page, NR_BOUNCE);
+ mempool_free(bvec->bv_page, pool);
+ }
+ bio_advance_iter(bio_orig, &orig_iter, orig_vec.bv_len);
}

bio_orig->bi_status = bio->bi_status;
--
2.9.4

2017-08-08 08:48:54

by Ming Lei

Subject: [PATCH v3 15/49] bvec_iter: introduce BVEC_ITER_ALL_INIT

Introduce BVEC_ITER_ALL_INIT for iterating one bio
from start to end.

Signed-off-by: Ming Lei <[email protected]>
---
include/linux/bvec.h | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/include/linux/bvec.h b/include/linux/bvec.h
index ec8a4d7af6bd..fe7a22dd133b 100644
--- a/include/linux/bvec.h
+++ b/include/linux/bvec.h
@@ -125,4 +125,13 @@ static inline bool bvec_iter_rewind(const struct bio_vec *bv,
((bvl = bvec_iter_bvec((bio_vec), (iter))), 1); \
bvec_iter_advance((bio_vec), &(iter), (bvl).bv_len))

+/* for iterating one bio from start to end */
+#define BVEC_ITER_ALL_INIT (struct bvec_iter) \
+{ \
+ .bi_sector = 0, \
+ .bi_size = UINT_MAX, \
+ .bi_idx = 0, \
+ .bi_bvec_done = 0, \
+}
+
#endif /* __LINUX_BVEC_ITER_H */
--
2.9.4

2017-08-08 08:49:33

by Ming Lei

Subject: [PATCH v3 16/49] block: bounce: don't access bio->bi_io_vec in copy_to_high_bio_irq

Since we need to support multipage bvecs, don't access bio->bi_io_vec
in copy_to_high_bio_irq(); just use the standard iterator
to do that.

Signed-off-by: Ming Lei <[email protected]>
---
block/bounce.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/block/bounce.c b/block/bounce.c
index e57cf2bdcd27..50e965a15295 100644
--- a/block/bounce.c
+++ b/block/bounce.c
@@ -112,24 +112,30 @@ int init_emergency_isa_pool(void)
static void copy_to_high_bio_irq(struct bio *to, struct bio *from)
{
unsigned char *vfrom;
- struct bio_vec tovec, *fromvec = from->bi_io_vec;
+ struct bio_vec tovec, fromvec;
struct bvec_iter iter;
+ /*
+ * The bio of @from is created by bounce, so we can iterate
+ * its bvec from start to end, but the @from->bi_iter can't be
+ * trusted because it might be changed by splitting.
+ */
+ struct bvec_iter from_iter = BVEC_ITER_ALL_INIT;

bio_for_each_segment(tovec, to, iter) {
- if (tovec.bv_page != fromvec->bv_page) {
+ fromvec = bio_iter_iovec(from, from_iter);
+ if (tovec.bv_page != fromvec.bv_page) {
/*
* fromvec->bv_offset and fromvec->bv_len might have
* been modified by the block layer, so use the original
* copy, bounce_copy_vec already uses tovec->bv_len
*/
- vfrom = page_address(fromvec->bv_page) +
+ vfrom = page_address(fromvec.bv_page) +
tovec.bv_offset;

bounce_copy_vec(&tovec, vfrom);
flush_dcache_page(tovec.bv_page);
}
-
- fromvec++;
+ bio_advance_iter(from, &from_iter, tovec.bv_len);
}
}

--
2.9.4

2017-08-08 08:49:44

by Ming Lei

Subject: [PATCH v3 17/49] block: comments on bio_for_each_segment[_all]

This patch clarifies the fact that even though both
bio_for_each_segment() and bio_for_each_segment_all()
are named _segment/_segment_all, they still return
one page in each vector, instead of a real segment (multipage bvec).

With the coming multipage bvec, both helpers are
capable of returning a real segment (multipage bvec),
but the callers (users) of the two helpers may not be
capable of handling a multipage bvec or real
segment, so we keep the interfaces of the helpers
unchanged. New helpers for returning a multipage bvec (real segment)
will be introduced later.

Signed-off-by: Ming Lei <[email protected]>
---
include/linux/bio.h | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/bio.h b/include/linux/bio.h
index 7b1cf4ba0902..80defb3cfca4 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -155,7 +155,10 @@ static inline void *bio_data(struct bio *bio)

/*
* drivers should _never_ use the all version - the bio may have been split
- * before it got to the driver and the driver won't own all of it
+ * before it got to the driver and the driver won't own all of it.
+ *
+ * Even though the helper is named _segment_all, it still returns
+ * pages one by one instead of real segments.
*/
#define bio_for_each_segment_all(bvl, bio, i) \
for (i = 0, bvl = (bio)->bi_io_vec; i < (bio)->bi_vcnt; i++, bvl++)
@@ -194,6 +197,10 @@ static inline bool bio_rewind_iter(struct bio *bio, struct bvec_iter *iter,
((bvl = bio_iter_iovec((bio), (iter))), 1); \
bio_advance_iter((bio), &(iter), (bvl).bv_len))

+/*
+ * Even though the helper is named _segment, it still returns
+ * pages one by one instead of real segments.
+ */
#define bio_for_each_segment(bvl, bio, iter) \
__bio_for_each_segment(bvl, bio, iter, (bio)->bi_iter)

--
2.9.4

2017-08-08 08:49:54

by Ming Lei

Subject: [PATCH v3 18/49] block: introduce multipage/single page bvec helpers

This patch introduces helpers suffixed with _mp
and _sp for multipage bvec/segment support.

The helpers with the _mp suffix are the interfaces for treating
one bvec/segment as a real multipage one; for example, .bv_len
is the total length of the multipage segment.

The helpers with the _sp suffix are interfaces for supporting the
current bvec iterator, which is assumed to be singlepage only
by drivers, fs, dm and so on. These _sp helpers are introduced
to build singlepage bvecs in flight, so users of the bio/bvec
iterator still work well and needn't change even though
we store multipage bvecs in the table.

Signed-off-by: Ming Lei <[email protected]>
---
include/linux/bvec.h | 56 +++++++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 53 insertions(+), 3 deletions(-)

diff --git a/include/linux/bvec.h b/include/linux/bvec.h
index fe7a22dd133b..1eaf7ca5cab3 100644
--- a/include/linux/bvec.h
+++ b/include/linux/bvec.h
@@ -25,6 +25,42 @@
#include <linux/errno.h>

/*
+ * What is multipage bvecs(segment)?
+ *
+ * - bvec stored in bio->bi_io_vec is always multipage style vector
+ *
+ * - bvec(struct bio_vec) represents one physically contiguous I/O
+ * buffer; now the buffer may include more than one page since
+ * multipage(mp) bvec is supported, and all the pages represented
+ * by one bvec are physically contiguous. Before mp support, at
+ * most one page could be included in one bvec; we call that a
+ * singlepage(sp) bvec.
+ *
+ * - .bv_page of the bvec represents the 1st page in the mp segment
+ *
+ * - .bv_offset of the bvec represents offset of the buffer in the bvec
+ *
+ * The effect on the current drivers/filesystem/dm/bcache/...:
+ *
+ * - almost everyone supposes that one bvec only includes one single
+ * page, so we keep the sp interface not changed, for example,
+ * bio_for_each_segment() still returns bvec with single page
+ *
+ * - bio_for_each_segment_all() will be changed to return singlepage
+ * bvec too
+ *
+ * - during iteration, the iterator variable (struct bvec_iter) is always
+ * updated in multipage bvec style, which means bvec_iter_advance()
+ * is kept unchanged
+ *
+ * - returned(copied) singlepage bvec is generated in flight by bvec
+ * helpers from the stored mp bvec
+ *
+ * - In case that some components(such as iov_iter) need to support mp
+ * segment, we introduce new helpers(suffixed with _mp) for them.
+ */
+
+/*
* was unsigned short, but we might as well be ready for > 64kB I/O pages
*/
struct bio_vec {
@@ -52,16 +88,30 @@ struct bvec_iter {
*/
#define __bvec_iter_bvec(bvec, iter) (&(bvec)[(iter).bi_idx])

-#define bvec_iter_page(bvec, iter) \
+#define bvec_iter_page_mp(bvec, iter) \
(__bvec_iter_bvec((bvec), (iter))->bv_page)

-#define bvec_iter_len(bvec, iter) \
+#define bvec_iter_len_mp(bvec, iter) \
min((iter).bi_size, \
__bvec_iter_bvec((bvec), (iter))->bv_len - (iter).bi_bvec_done)

-#define bvec_iter_offset(bvec, iter) \
+#define bvec_iter_offset_mp(bvec, iter) \
(__bvec_iter_bvec((bvec), (iter))->bv_offset + (iter).bi_bvec_done)

+/*
+ * <page, offset,length> of singlepage(sp) segment.
+ *
+ * This helpers will be implemented for building sp bvec in flight.
+ */
+#define bvec_iter_offset_sp(bvec, iter) bvec_iter_offset_mp((bvec), (iter))
+#define bvec_iter_len_sp(bvec, iter) bvec_iter_len_mp((bvec), (iter))
+#define bvec_iter_page_sp(bvec, iter) bvec_iter_page_mp((bvec), (iter))
+
+/* current interfaces support sp style at default */
+#define bvec_iter_page(bvec, iter) bvec_iter_page_sp((bvec), (iter))
+#define bvec_iter_len(bvec, iter) bvec_iter_len_sp((bvec), (iter))
+#define bvec_iter_offset(bvec, iter) bvec_iter_offset_sp((bvec), (iter))
+
#define bvec_iter_bvec(bvec, iter) \
((struct bio_vec) { \
.bv_page = bvec_iter_page((bvec), (iter)), \
--
2.9.4

2017-08-08 08:50:04

by Ming Lei

Subject: [PATCH v3 19/49] block: implement sp version of bvec iterator helpers

This patch implements the singlepage version of the following
3 helpers:
- bvec_iter_offset_sp()
- bvec_iter_len_sp()
- bvec_iter_page_sp()

so that one multipage bvec can be split into singlepage
bvecs and users of the current bvec iterator keep working.
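
A rough worked example (assuming 4096-byte pages, hypothetical values) of
what the _sp helpers return for one stored multipage bvec:

    struct bio_vec mp_bvec = {
            .bv_page   = pg,        /* 'pg' is the hypothetical first page */
            .bv_offset = 512,
            .bv_len    = 8192,      /* runs into a third page */
    };
    struct bvec_iter iter = { .bi_size = 8192, .bi_idx = 0, .bi_bvec_done = 0 };

    /*
     * bvec_iter_page_sp(&mp_bvec, iter)   == nth_page(pg, 0)
     * bvec_iter_offset_sp(&mp_bvec, iter) == 512
     * bvec_iter_len_sp(&mp_bvec, iter)    == 3584   (4096 - 512)
     *
     * after bvec_iter_advance(&mp_bvec, &iter, 3584):
     *
     * bvec_iter_page_sp(&mp_bvec, iter)   == nth_page(pg, 1)
     * bvec_iter_offset_sp(&mp_bvec, iter) == 0
     * bvec_iter_len_sp(&mp_bvec, iter)    == 4096
     *
     * and a final 512-byte chunk remains on the third page.
     */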

Signed-off-by: Ming Lei <[email protected]>
---
include/linux/bvec.h | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/include/linux/bvec.h b/include/linux/bvec.h
index 1eaf7ca5cab3..d5f999a493de 100644
--- a/include/linux/bvec.h
+++ b/include/linux/bvec.h
@@ -23,6 +23,7 @@
#include <linux/kernel.h>
#include <linux/bug.h>
#include <linux/errno.h>
+#include <linux/mm.h>

/*
* What is multipage bvecs(segment)?
@@ -98,14 +99,25 @@ struct bvec_iter {
#define bvec_iter_offset_mp(bvec, iter) \
(__bvec_iter_bvec((bvec), (iter))->bv_offset + (iter).bi_bvec_done)

+#define bvec_iter_page_idx_mp(bvec, iter) \
+ (bvec_iter_offset_mp((bvec), (iter)) / PAGE_SIZE)
+
+
/*
* <page, offset,length> of singlepage(sp) segment.
*
* This helpers will be implemented for building sp bvec in flight.
*/
-#define bvec_iter_offset_sp(bvec, iter) bvec_iter_offset_mp((bvec), (iter))
-#define bvec_iter_len_sp(bvec, iter) bvec_iter_len_mp((bvec), (iter))
-#define bvec_iter_page_sp(bvec, iter) bvec_iter_page_mp((bvec), (iter))
+#define bvec_iter_offset_sp(bvec, iter) \
+ (bvec_iter_offset_mp((bvec), (iter)) % PAGE_SIZE)
+
+#define bvec_iter_len_sp(bvec, iter) \
+ min_t(unsigned, bvec_iter_len_mp((bvec), (iter)), \
+ (PAGE_SIZE - (bvec_iter_offset_sp((bvec), (iter)))))
+
+#define bvec_iter_page_sp(bvec, iter) \
+ nth_page(bvec_iter_page_mp((bvec), (iter)), \
+ bvec_iter_page_idx_mp((bvec), (iter)))

/* current interfaces support sp style at default */
#define bvec_iter_page(bvec, iter) bvec_iter_page_sp((bvec), (iter))
--
2.9.4

2017-08-08 08:50:17

by Ming Lei

Subject: [PATCH v3 20/49] block: introduce bio_for_each_segment_mp()

This helper is used to iterate over multipage bvecs and is
required by bio_clone().

Signed-off-by: Ming Lei <[email protected]>
---
include/linux/bio.h | 34 +++++++++++++++++++++++++++++++---
include/linux/bvec.h | 36 ++++++++++++++++++++++++++++++++----
2 files changed, 63 insertions(+), 7 deletions(-)

diff --git a/include/linux/bio.h b/include/linux/bio.h
index 80defb3cfca4..ac8248558ab4 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -68,6 +68,9 @@
#define bio_data_dir(bio) \
(op_is_write(bio_op(bio)) ? WRITE : READ)

+#define bio_iter_iovec_mp(bio, iter) \
+ bvec_iter_bvec_mp((bio)->bi_io_vec, (iter))
+
/*
* Check whether this bio carries any data or not. A NULL bio is allowed.
*/
@@ -163,8 +166,8 @@ static inline void *bio_data(struct bio *bio)
#define bio_for_each_segment_all(bvl, bio, i) \
for (i = 0, bvl = (bio)->bi_io_vec; i < (bio)->bi_vcnt; i++, bvl++)

-static inline void bio_advance_iter(struct bio *bio, struct bvec_iter *iter,
- unsigned bytes)
+static inline void __bio_advance_iter(struct bio *bio, struct bvec_iter *iter,
+ unsigned bytes, bool mp)
{
iter->bi_sector += bytes >> 9;

@@ -172,11 +175,26 @@ static inline void bio_advance_iter(struct bio *bio, struct bvec_iter *iter,
iter->bi_size -= bytes;
iter->bi_done += bytes;
} else {
- bvec_iter_advance(bio->bi_io_vec, iter, bytes);
+ if (!mp)
+ bvec_iter_advance(bio->bi_io_vec, iter, bytes);
+ else
+ bvec_iter_advance_mp(bio->bi_io_vec, iter, bytes);
/* TODO: It is reasonable to complete bio with error here. */
}
}

+static inline void bio_advance_iter(struct bio *bio, struct bvec_iter *iter,
+ unsigned bytes)
+{
+ __bio_advance_iter(bio, iter, bytes, false);
+}
+
+static inline void bio_advance_iter_mp(struct bio *bio, struct bvec_iter *iter,
+ unsigned bytes)
+{
+ __bio_advance_iter(bio, iter, bytes, true);
+}
+
static inline bool bio_rewind_iter(struct bio *bio, struct bvec_iter *iter,
unsigned int bytes)
{
@@ -204,6 +222,16 @@ static inline bool bio_rewind_iter(struct bio *bio, struct bvec_iter *iter,
#define bio_for_each_segment(bvl, bio, iter) \
__bio_for_each_segment(bvl, bio, iter, (bio)->bi_iter)

+#define __bio_for_each_segment_mp(bvl, bio, iter, start) \
+ for (iter = (start); \
+ (iter).bi_size && \
+ ((bvl = bio_iter_iovec_mp((bio), (iter))), 1); \
+ bio_advance_iter_mp((bio), &(iter), (bvl).bv_len))
+
+/* returns one real segment(multipage bvec) each time */
+#define bio_for_each_segment_mp(bvl, bio, iter) \
+ __bio_for_each_segment_mp(bvl, bio, iter, (bio)->bi_iter)
+
#define bio_iter_last(bvec, iter) ((iter).bi_size == (bvec).bv_len)

static inline unsigned bio_segments(struct bio *bio)
diff --git a/include/linux/bvec.h b/include/linux/bvec.h
index d5f999a493de..c1ec0945451a 100644
--- a/include/linux/bvec.h
+++ b/include/linux/bvec.h
@@ -131,8 +131,16 @@ struct bvec_iter {
.bv_offset = bvec_iter_offset((bvec), (iter)), \
})

-static inline bool bvec_iter_advance(const struct bio_vec *bv,
- struct bvec_iter *iter, unsigned bytes)
+#define bvec_iter_bvec_mp(bvec, iter) \
+((struct bio_vec) { \
+ .bv_page = bvec_iter_page_mp((bvec), (iter)), \
+ .bv_len = bvec_iter_len_mp((bvec), (iter)), \
+ .bv_offset = bvec_iter_offset_mp((bvec), (iter)), \
+})
+
+static inline bool __bvec_iter_advance(const struct bio_vec *bv,
+ struct bvec_iter *iter,
+ unsigned bytes, bool mp)
{
if (WARN_ONCE(bytes > iter->bi_size,
"Attempted to advance past end of bvec iter\n")) {
@@ -141,8 +149,14 @@ static inline bool bvec_iter_advance(const struct bio_vec *bv,
}

while (bytes) {
- unsigned iter_len = bvec_iter_len(bv, *iter);
- unsigned len = min(bytes, iter_len);
+ unsigned len;
+
+ if (mp)
+ len = bvec_iter_len_mp(bv, *iter);
+ else
+ len = bvec_iter_len_sp(bv, *iter);
+
+ len = min(bytes, len);

bytes -= len;
iter->bi_size -= len;
@@ -181,6 +195,20 @@ static inline bool bvec_iter_rewind(const struct bio_vec *bv,
return true;
}

+static inline bool bvec_iter_advance(const struct bio_vec *bv,
+ struct bvec_iter *iter,
+ unsigned bytes)
+{
+ return __bvec_iter_advance(bv, iter, bytes, false);
+}
+
+static inline bool bvec_iter_advance_mp(const struct bio_vec *bv,
+ struct bvec_iter *iter,
+ unsigned bytes)
+{
+ return __bvec_iter_advance(bv, iter, bytes, true);
+}
+
#define for_each_bvec(bvl, bio_vec, iter, start) \
for (iter = (start); \
(iter).bi_size && \
--
2.9.4

2017-08-08 08:50:28

by Ming Lei

Subject: [PATCH v3 21/49] blk-merge: compute bio->bi_seg_front_size efficiently

It is enough to check and compute bio->bi_seg_front_size just
after the 1st segment is found, but the current code checks that
for each bvec, which is inefficient.

This patch follows the approach of __blk_recalc_rq_segments()
for computing bio->bi_seg_front_size; it is more efficient
and the code becomes more readable too.

Signed-off-by: Ming Lei <[email protected]>
---
block/blk-merge.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 99038830fb42..d91f07813dee 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -145,22 +145,21 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
bvprvp = &bvprv;
sectors += bv.bv_len >> 9;

- if (nsegs == 1 && seg_size > front_seg_size)
- front_seg_size = seg_size;
continue;
}
new_segment:
if (nsegs == queue_max_segments(q))
goto split;

+ if (nsegs == 1 && seg_size > front_seg_size)
+ front_seg_size = seg_size;
+
nsegs++;
bvprv = bv;
bvprvp = &bvprv;
seg_size = bv.bv_len;
sectors += bv.bv_len >> 9;

- if (nsegs == 1 && seg_size > front_seg_size)
- front_seg_size = seg_size;
}

do_split = false;
@@ -173,6 +172,8 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
bio = new;
}

+ if (nsegs == 1 && seg_size > front_seg_size)
+ front_seg_size = seg_size;
bio->bi_seg_front_size = front_seg_size;
if (seg_size > bio->bi_seg_back_size)
bio->bi_seg_back_size = seg_size;
--
2.9.4

2017-08-08 08:50:38

by Ming Lei

Subject: [PATCH v3 22/49] block: blk-merge: try to make front segments in full size

When merging one bvec into a segment, if the bvec is too big
to merge, the current policy is to move the whole bvec into a
new segment.

This patch changes the policy to try to maximize the size of
the front segments; that means in the above situation part of
the bvec is merged into the current segment, and the remainder
is put into the next segment.

This patch prepares for multipage bvec support, because this
case can become quite common and we should try to make the
front segments full size.
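
A rough numeric illustration of the new policy (hypothetical limits):

    /*
     * Assume queue_max_segment_size(q) == 64K, the current segment
     * already holds seg_size == 60K, and the next bvec is 16K:
     *
     *   old policy: start a new 16K segment, the front segment stays at 60K
     *   new policy: advance = 64K - 60K = 4K
     *               -> 4K is merged into the current segment (now full, 64K)
     *               -> the remaining 12K starts the next segment
     */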

Signed-off-by: Ming Lei <[email protected]>
---
block/blk-merge.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 49 insertions(+), 5 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index d91f07813dee..aeb8933e6cae 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -108,6 +108,7 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
bool do_split = true;
struct bio *new = NULL;
const unsigned max_sectors = get_max_io_size(q, bio);
+ unsigned advance = 0;

bio_for_each_segment(bv, bio, iter) {
/*
@@ -133,12 +134,32 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
}

if (bvprvp && blk_queue_cluster(q)) {
- if (seg_size + bv.bv_len > queue_max_segment_size(q))
- goto new_segment;
if (!BIOVEC_PHYS_MERGEABLE(bvprvp, &bv))
goto new_segment;
if (!BIOVEC_SEG_BOUNDARY(q, bvprvp, &bv))
goto new_segment;
+ if (seg_size + bv.bv_len > queue_max_segment_size(q)) {
+ /*
+ * On assumption is that initial value of
+ * @seg_size(equals to bv.bv_len) won't be
+ * bigger than max segment size, but will
+ * becomes false after multipage bvec comes.
+ */
+ advance = queue_max_segment_size(q) - seg_size;
+
+ if (advance > 0) {
+ seg_size += advance;
+ sectors += advance >> 9;
+ bv.bv_len -= advance;
+ bv.bv_offset += advance;
+ }
+
+ /*
+ * Still need to put remainder of current
+ * bvec into a new segment.
+ */
+ goto new_segment;
+ }

seg_size += bv.bv_len;
bvprv = bv;
@@ -160,6 +181,12 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
seg_size = bv.bv_len;
sectors += bv.bv_len >> 9;

+ /* restore the bvec for iterator */
+ if (advance) {
+ bv.bv_len += advance;
+ bv.bv_offset -= advance;
+ advance = 0;
+ }
}

do_split = false;
@@ -360,16 +387,29 @@ __blk_segment_map_sg(struct request_queue *q, struct bio_vec *bvec,
{

int nbytes = bvec->bv_len;
+ unsigned advance = 0;

if (*sg && *cluster) {
- if ((*sg)->length + nbytes > queue_max_segment_size(q))
- goto new_segment;
-
if (!BIOVEC_PHYS_MERGEABLE(bvprv, bvec))
goto new_segment;
if (!BIOVEC_SEG_BOUNDARY(q, bvprv, bvec))
goto new_segment;

+ /*
+ * try our best to merge part of the bvec into the previous
+ * segment, following the same policy as
+ * blk_bio_segment_split()
+ */
+ if ((*sg)->length + nbytes > queue_max_segment_size(q)) {
+ advance = queue_max_segment_size(q) - (*sg)->length;
+ if (advance) {
+ (*sg)->length += advance;
+ bvec->bv_offset += advance;
+ bvec->bv_len -= advance;
+ }
+ goto new_segment;
+ }
+
(*sg)->length += nbytes;
} else {
new_segment:
@@ -392,6 +432,10 @@ __blk_segment_map_sg(struct request_queue *q, struct bio_vec *bvec,

sg_set_page(*sg, bvec->bv_page, nbytes, bvec->bv_offset);
(*nsegs)++;
+
+ /* for making iterator happy */
+ bvec->bv_offset -= advance;
+ bvec->bv_len += advance;
}
*bvprv = *bvec;
}
--
2.9.4

2017-08-08 08:50:47

by Ming Lei

Subject: [PATCH v3 23/49] block: blk-merge: remove unnecessary check

In this case, 'sectors' can't be zero at all, so remove the check
and let the bio be split.

Signed-off-by: Ming Lei <[email protected]>
---
block/blk-merge.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index aeb8933e6cae..ac217fce4921 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -128,9 +128,7 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
nsegs++;
sectors = max_sectors;
}
- if (sectors)
- goto split;
- /* Make this single bvec as the 1st segment */
+ goto split;
}

if (bvprvp && blk_queue_cluster(q)) {
--
2.9.4

2017-08-08 08:50:58

by Ming Lei

Subject: [PATCH v3 24/49] block: use bio_for_each_segment_mp() to compute segments count

Firstly, it is more efficient to use bio_for_each_segment_mp()
in both blk_bio_segment_split() and __blk_recalc_rq_segments()
to compute how many segments there are in the bio.

Secondly, once bio_for_each_segment_mp() is used, the bvec
may need to be split because its length can be very long,
more than the max segment size, so we have to support splitting
one bvec into several segments.

Thirdly, while splitting an mp bvec into segments, the max
segment number may be reached; the bio then needs to be split
when this happens.
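
As a rough example (hypothetical limits) of the new bvec_split_segs()
helper introduced below:

    /*
     * Assume bv.bv_len == 1M, queue_max_segment_size(q) == 256K,
     * no virt boundary, and enough room in queue_max_segments(q):
     *
     *   bvec_split_segs(q, &bv, &nsegs, &seg_size, &front_seg_size, &sectors)
     *
     *   -> nsegs grows by 4 (four 256K segments for one stored bvec)
     *   -> seg_size ends up as 256K (size of the last produced segment)
     *   -> sectors grows by 1M >> 9
     *   -> returns false: the whole bvec fit, no bio split is needed
     */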

Signed-off-by: Ming Lei <[email protected]>
---
block/blk-merge.c | 97 ++++++++++++++++++++++++++++++++++++++++++++-----------
1 file changed, 79 insertions(+), 18 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index ac217fce4921..c9b300f91fba 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -96,6 +96,62 @@ static inline unsigned get_max_io_size(struct request_queue *q,
return sectors;
}

+/*
+ * Split the bvec @bv into segments, and update all kinds of
+ * variables.
+ */
+static bool bvec_split_segs(struct request_queue *q, struct bio_vec *bv,
+ unsigned *nsegs, unsigned *last_seg_size,
+ unsigned *front_seg_size, unsigned *sectors)
+{
+ bool need_split = false;
+ unsigned len = bv->bv_len;
+ unsigned total_len = 0;
+ unsigned new_nsegs = 0, seg_size = 0;
+
+ if ((*nsegs >= queue_max_segments(q)) || !len)
+ return need_split;
+
+ /*
+ * A multipage bvec may be too big to hold in one segment,
+ * so the current bvec has to be split into multiple
+ * segments.
+ */
+ while (new_nsegs + *nsegs < queue_max_segments(q)) {
+ seg_size = min(queue_max_segment_size(q), len);
+
+ new_nsegs++;
+ total_len += seg_size;
+ len -= seg_size;
+
+ if ((queue_virt_boundary(q) && ((bv->bv_offset +
+ total_len) & queue_virt_boundary(q))) || !len)
+ break;
+ }
+
+ /* split in the middle of the bvec */
+ if (len)
+ need_split = true;
+
+ /* update front segment size */
+ if (!*nsegs) {
+ unsigned first_seg_size = seg_size;
+
+ if (new_nsegs > 1)
+ first_seg_size = queue_max_segment_size(q);
+ if (*front_seg_size < first_seg_size)
+ *front_seg_size = first_seg_size;
+ }
+
+ /* update other variables */
+ *last_seg_size = seg_size;
+ *nsegs += new_nsegs;
+ if (sectors)
+ *sectors += total_len >> 9;
+
+ return need_split;
+}
+
static struct bio *blk_bio_segment_split(struct request_queue *q,
struct bio *bio,
struct bio_set *bs,
@@ -110,7 +166,7 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
const unsigned max_sectors = get_max_io_size(q, bio);
unsigned advance = 0;

- bio_for_each_segment(bv, bio, iter) {
+ bio_for_each_segment_mp(bv, bio, iter) {
/*
* If the queue doesn't support SG gaps and adding this
* offset would create a gap, disallow it.
@@ -125,8 +181,12 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
*/
if (nsegs < queue_max_segments(q) &&
sectors < max_sectors) {
- nsegs++;
- sectors = max_sectors;
+ /* split in the middle of bvec */
+ bv.bv_len = (max_sectors - sectors) << 9;
+ bvec_split_segs(q, &bv, &nsegs,
+ &seg_size,
+ &front_seg_size,
+ &sectors);
}
goto split;
}
@@ -138,10 +198,9 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
goto new_segment;
if (seg_size + bv.bv_len > queue_max_segment_size(q)) {
/*
- * On assumption is that initial value of
- * @seg_size(equals to bv.bv_len) won't be
- * bigger than max segment size, but will
- * becomes false after multipage bvec comes.
+ * The initial value of @seg_size won't be
+ * bigger than max segment size, because we
+ * split the bvec via bvec_split_segs().
*/
advance = queue_max_segment_size(q) - seg_size;

@@ -173,11 +232,12 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
if (nsegs == 1 && seg_size > front_seg_size)
front_seg_size = seg_size;

- nsegs++;
bvprv = bv;
bvprvp = &bvprv;
- seg_size = bv.bv_len;
- sectors += bv.bv_len >> 9;
+
+ if (bvec_split_segs(q, &bv, &nsegs, &seg_size,
+ &front_seg_size, &sectors))
+ goto split;

/* restore the bvec for iterator */
if (advance) {
@@ -251,6 +311,7 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
struct bio_vec bv, bvprv = { NULL };
int cluster, prev = 0;
unsigned int seg_size, nr_phys_segs;
+ unsigned front_seg_size = bio->bi_seg_front_size;
struct bio *fbio, *bbio;
struct bvec_iter iter;

@@ -271,7 +332,7 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
seg_size = 0;
nr_phys_segs = 0;
for_each_bio(bio) {
- bio_for_each_segment(bv, bio, iter) {
+ bio_for_each_segment_mp(bv, bio, iter) {
/*
* If SG merging is disabled, each bio vector is
* a segment
@@ -293,20 +354,20 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
continue;
}
new_segment:
- if (nr_phys_segs == 1 && seg_size >
- fbio->bi_seg_front_size)
- fbio->bi_seg_front_size = seg_size;
+ if (nr_phys_segs == 1 && seg_size > front_seg_size)
+ front_seg_size = seg_size;

- nr_phys_segs++;
bvprv = bv;
prev = 1;
- seg_size = bv.bv_len;
+ bvec_split_segs(q, &bv, &nr_phys_segs, &seg_size,
+ &front_seg_size, NULL);
}
bbio = bio;
}

- if (nr_phys_segs == 1 && seg_size > fbio->bi_seg_front_size)
- fbio->bi_seg_front_size = seg_size;
+ if (nr_phys_segs == 1 && seg_size > front_seg_size)
+ front_seg_size = seg_size;
+ fbio->bi_seg_front_size = front_seg_size;
if (seg_size > bbio->bi_seg_back_size)
bbio->bi_seg_back_size = seg_size;

--
2.9.4

2017-08-08 08:51:12

by Ming Lei

Subject: [PATCH v3 25/49] block: use bio_for_each_segment_mp() to map sg

It is more efficient to use bio_for_each_segment_mp()
for mapping sg; meanwhile we have to consider splitting
the multipage bvec as done in blk_bio_segment_split().

Signed-off-by: Ming Lei <[email protected]>
---
block/blk-merge.c | 72 +++++++++++++++++++++++++++++++++++++++----------------
1 file changed, 52 insertions(+), 20 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index c9b300f91fba..33353ed8c32e 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -439,6 +439,56 @@ static int blk_phys_contig_segment(struct request_queue *q, struct bio *bio,
return 0;
}

+static inline struct scatterlist *blk_next_sg(struct scatterlist **sg,
+ struct scatterlist *sglist)
+{
+ if (!*sg)
+ return sglist;
+ else {
+ /*
+ * If the driver previously mapped a shorter
+ * list, we could see a termination bit
+ * prematurely unless it fully inits the sg
+ * table on each mapping. We KNOW that there
+ * must be more entries here or the driver
+ * would be buggy, so force clear the
+ * termination bit to avoid doing a full
+ * sg_init_table() in drivers for each command.
+ */
+ sg_unmark_end(*sg);
+ return sg_next(*sg);
+ }
+}
+
+static inline unsigned
+blk_bvec_map_sg(struct request_queue *q, struct bio_vec *bvec,
+ struct scatterlist *sglist, struct scatterlist **sg)
+{
+ unsigned nbytes = bvec->bv_len;
+ unsigned nsegs = 0, total = 0;
+
+ while (nbytes > 0) {
+ unsigned seg_size;
+ struct page *pg;
+ unsigned offset, idx;
+
+ *sg = blk_next_sg(sg, sglist);
+
+ seg_size = min(nbytes, queue_max_segment_size(q));
+ offset = (total + bvec->bv_offset) % PAGE_SIZE;
+ idx = (total + bvec->bv_offset) / PAGE_SIZE;
+ pg = nth_page(bvec->bv_page, idx);
+
+ sg_set_page(*sg, pg, seg_size, offset);
+
+ total += seg_size;
+ nbytes -= seg_size;
+ nsegs++;
+ }
+
+ return nsegs;
+}
+
static inline void
__blk_segment_map_sg(struct request_queue *q, struct bio_vec *bvec,
struct scatterlist *sglist, struct bio_vec *bvprv,
@@ -472,25 +522,7 @@ __blk_segment_map_sg(struct request_queue *q, struct bio_vec *bvec,
(*sg)->length += nbytes;
} else {
new_segment:
- if (!*sg)
- *sg = sglist;
- else {
- /*
- * If the driver previously mapped a shorter
- * list, we could see a termination bit
- * prematurely unless it fully inits the sg
- * table on each mapping. We KNOW that there
- * must be more entries here or the driver
- * would be buggy, so force clear the
- * termination bit to avoid doing a full
- * sg_init_table() in drivers for each command.
- */
- sg_unmark_end(*sg);
- *sg = sg_next(*sg);
- }
-
- sg_set_page(*sg, bvec->bv_page, nbytes, bvec->bv_offset);
- (*nsegs)++;
+ (*nsegs) += blk_bvec_map_sg(q, bvec, sglist, sg);

/* for making iterator happy */
bvec->bv_offset -= advance;
@@ -516,7 +548,7 @@ static int __blk_bios_map_sg(struct request_queue *q, struct bio *bio,
int cluster = blk_queue_cluster(q), nsegs = 0;

for_each_bio(bio)
- bio_for_each_segment(bvec, bio, iter)
+ bio_for_each_segment_mp(bvec, bio, iter)
__blk_segment_map_sg(q, &bvec, sglist, &bvprv, sg,
&nsegs, &cluster);

--
2.9.4

2017-08-08 08:51:23

by Ming Lei

Subject: [PATCH v3 26/49] block: introduce bvec_for_each_sp_bvec()

This helper can be used to iterate over each singlepage bvec
of one multipage bvec.
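
A minimal usage sketch, assuming 'mp_bv' is one multipage bvec taken from
a bio's bvec table:

    struct bio_vec sp_bv;
    struct bvec_iter iter;

    /* visit every single page covered by the multipage bvec */
    bvec_for_each_sp_bvec(sp_bv, &mp_bv, iter) {
            /* sp_bv.bv_page/bv_offset/bv_len never cross a page boundary */
    }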

Signed-off-by: Ming Lei <[email protected]>
---
include/linux/bvec.h | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/include/linux/bvec.h b/include/linux/bvec.h
index c1ec0945451a..23d3abdf057c 100644
--- a/include/linux/bvec.h
+++ b/include/linux/bvec.h
@@ -224,4 +224,18 @@ static inline bool bvec_iter_advance_mp(const struct bio_vec *bv,
.bi_bvec_done = 0, \
}

+/*
+ * This helper iterates over the multipage bvec of @mp_bvec and
+ * returns each singlepage bvec via @sp_bvl.
+ */
+#define __bvec_for_each_sp_bvec(sp_bvl, mp_bvec, iter, start) \
+ for (iter = start, \
+ (iter).bi_size = (mp_bvec)->bv_len - (iter).bi_bvec_done; \
+ (iter).bi_size && \
+ ((sp_bvl = bvec_iter_bvec((mp_bvec), (iter))), 1); \
+ bvec_iter_advance((mp_bvec), &(iter), (sp_bvl).bv_len))
+
+#define bvec_for_each_sp_bvec(sp_bvl, mp_bvec, iter) \
+ __bvec_for_each_sp_bvec(sp_bvl, mp_bvec, iter, BVEC_ITER_ALL_INIT)
+
#endif /* __LINUX_BVEC_ITER_H */
--
2.9.4

2017-08-08 08:51:31

by Ming Lei

Subject: [PATCH v3 27/49] block: bio: introduce single/multi page version of bio_for_each_segment_all()

This patch introduces bio_for_each_segment_all_sp() and
bio_for_each_segment_all_mp().

bio_for_each_segment_all_sp() replaces bio_for_each_segment_all()
in cases where the returned bvec has to be a singlepage bvec.

bio_for_each_segment_all_mp() replaces bio_for_each_segment_all()
in cases where the user wants to update the returned bvec via the pointer.
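
A minimal conversion sketch (a hypothetical end_io-style loop; the
surrounding bio and the per-page work are placeholders) showing how a
bio_for_each_segment_all() user switches to the sp variant:

    struct bio_vec *bvec;
    struct bvec_iter_all bia;
    int i;

    /* old code: bio_for_each_segment_all(bvec, bio, i) */
    bio_for_each_segment_all_sp(bvec, bio, i, bia) {
            /*
             * bvec points to an in-flight singlepage bvec built from
             * the stored multipage bvec, so it must not be used to
             * write back into the bvec table.
             */
            do_something_with(bvec->bv_page);  /* hypothetical per-page work */
    }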

Signed-off-by: Ming Lei <[email protected]>
---
include/linux/bio.h | 24 ++++++++++++++++++++++++
include/linux/blk_types.h | 6 ++++++
2 files changed, 30 insertions(+)

diff --git a/include/linux/bio.h b/include/linux/bio.h
index ac8248558ab4..cd43b4b80472 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -232,6 +232,30 @@ static inline bool bio_rewind_iter(struct bio *bio, struct bvec_iter *iter,
#define bio_for_each_segment_mp(bvl, bio, iter) \
__bio_for_each_segment_mp(bvl, bio, iter, (bio)->bi_iter)

+/*
+ * This helper returns each bvec stored in bvec table directly,
+ * so the returned bvec points to one multipage bvec in the table
+ * and the caller can update the bvec via the returned pointer.
+ */
+#define bio_for_each_segment_all_mp(bvl, bio, i) \
+ bio_for_each_segment_all((bvl), (bio), (i))
+
+/*
+ * This helper returns singlepage bvec to caller, and the sp bvec
+ * is generated in-flight from multipage bvec stored in bvec table.
+ * So we can _not_ change the bvec stored in bio->bi_io_vec[] via
+ * this helper.
+ *
+ * If someone needs to update a bvec in the table, please use
+ * bio_for_each_segment_all_mp() and make sure it is correctly used
+ * since the bvec points to one multipage bvec.
+ */
+#define bio_for_each_segment_all_sp(bvl, bio, i, bi) \
+ for ((bi).iter = BVEC_ITER_ALL_INIT, i = 0, bvl = &(bi).bv; \
+ (bi).iter.bi_idx < (bio)->bi_vcnt && \
+ (((bi).bv = bio_iter_iovec((bio), (bi).iter)), 1); \
+ bio_advance_iter((bio), &(bi).iter, (bi).bv.bv_len), i++)
+
#define bio_iter_last(bvec, iter) ((iter).bi_size == (bvec).bv_len)

static inline unsigned bio_segments(struct bio *bio)
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index d2eb87c84d82..99b47b7204fe 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -119,6 +119,12 @@ struct bio {

#define BIO_RESET_BYTES offsetof(struct bio, bi_max_vecs)

+/* this iter is only for implementing bio_for_each_segment_all_sp() */
+struct bvec_iter_all {
+ struct bvec_iter iter;
+ struct bio_vec bv; /* in-flight singlepage bvec */
+};
+
/*
* bio flags
*/
--
2.9.4

2017-08-08 08:51:42

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 28/49] block: introduce bvec_get_last_page()

BTRFS and guard_bio_eod() need to get the last page, so introduce
this helper to make them happy.

Signed-off-by: Ming Lei <[email protected]>
---
include/linux/bvec.h | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/include/linux/bvec.h b/include/linux/bvec.h
index 23d3abdf057c..ceb6292750d6 100644
--- a/include/linux/bvec.h
+++ b/include/linux/bvec.h
@@ -238,4 +238,18 @@ static inline bool bvec_iter_advance_mp(const struct bio_vec *bv,
#define bvec_for_each_sp_bvec(sp_bvl, mp_bvec, iter) \
__bvec_for_each_sp_bvec(sp_bvl, mp_bvec, iter, BVEC_ITER_ALL_INIT)

+/*
+ * get the last page from the multipage bvec and store it
+ * in @sp_bv
+ */
+static inline void bvec_get_last_page(struct bio_vec *mp_bv,
+ struct bio_vec *sp_bv)
+{
+ struct bvec_iter iter;
+
+ *sp_bv = *mp_bv;
+ bvec_for_each_sp_bvec(*sp_bv, mp_bv, iter)
+ ;
+}
+
#endif /* __LINUX_BVEC_ITER_H */
--
2.9.4

2017-08-08 08:51:51

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 29/49] fs/buffer.c: use bvec iterator to truncate the bio

Once multipage bvec is enabled, the last bvec may include
more than one page, so this patch uses bvec_get_last_page()
when truncating the bio.

Signed-off-by: Ming Lei <[email protected]>
---
fs/buffer.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index c821ed6a6f0e..32a63e5b00f3 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -3057,8 +3057,7 @@ void guard_bio_eod(int op, struct bio *bio)
unsigned truncated_bytes;
/*
* It is safe to truncate the last bvec in the following way
- * even though multipage bvec is supported, but we need to
- * fix the parameters passed to zero_user().
+ * even though multipage bvec is supported.
*/
struct bio_vec *bvec = &bio->bi_io_vec[bio->bi_vcnt - 1];

@@ -3087,7 +3086,10 @@ void guard_bio_eod(int op, struct bio *bio)

/* ..and clear the end of the buffer for reads */
if (op == REQ_OP_READ) {
- zero_user(bvec->bv_page, bvec->bv_offset + bvec->bv_len,
+ struct bio_vec bv;
+
+ bvec_get_last_page(bvec, &bv);
+ zero_user(bv.bv_page, bv.bv_offset + bv.bv_len,
truncated_bytes);
}
}
--
2.9.4

2017-08-08 08:52:04

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 30/49] btrfs: use bvec_get_last_page to get bio's last page

Preparing for supporting multipage bvec.

Cc: Chris Mason <[email protected]>
Cc: Josef Bacik <[email protected]>
Cc: David Sterba <[email protected]>
Cc: [email protected]
Signed-off-by: Ming Lei <[email protected]>
---
fs/btrfs/compression.c | 5 ++++-
fs/btrfs/extent_io.c | 8 ++++++--
2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index f795d0a6d176..28746588f228 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -392,8 +392,11 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start,
static u64 bio_end_offset(struct bio *bio)
{
struct bio_vec *last = &bio->bi_io_vec[bio->bi_vcnt - 1];
+ struct bio_vec bv;

- return page_offset(last->bv_page) + last->bv_len + last->bv_offset;
+ bvec_get_last_page(last, &bv);
+
+ return page_offset(bv.bv_page) + bv.bv_len + bv.bv_offset;
}

static noinline int add_ra_bio_pages(struct inode *inode,
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 0e7367817b92..c8f6a8657bf2 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2738,11 +2738,15 @@ static int __must_check submit_one_bio(struct bio *bio, int mirror_num,
{
blk_status_t ret = 0;
struct bio_vec *bvec = bio->bi_io_vec + bio->bi_vcnt - 1;
- struct page *page = bvec->bv_page;
struct extent_io_tree *tree = bio->bi_private;
+ struct bio_vec bv;
+ struct page *page;
u64 start;

- start = page_offset(page) + bvec->bv_offset;
+ bvec_get_last_page(bvec, &bv);
+ page = bv.bv_page;
+
+ start = page_offset(page) + bv.bv_offset;

bio->bi_private = NULL;
bio_get(bio);
--
2.9.4

2017-08-08 08:52:13

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 31/49] block: deal with dirtying pages for multipage bvec

In bio_check_pages_dirty(), bvec->bv_page is used as a flag
marking whether the page has been dirtied and released; if not,
it will be dirtied later in a deferred workqueue.

With multipage bvec, we can't do that per page any more, so change
the logic to check all pages in one mp bvec: release all of these
pages only if all are dirtied, otherwise dirty them all in the
deferred workqueue.

Signed-off-by: Ming Lei <[email protected]>
---
block/bio.c | 45 +++++++++++++++++++++++++++++++++++++--------
1 file changed, 37 insertions(+), 8 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 28697e3c8ce3..716e6917b0fd 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1650,8 +1650,9 @@ void bio_set_pages_dirty(struct bio *bio)
{
struct bio_vec *bvec;
int i;
+ struct bvec_iter_all bia;

- bio_for_each_segment_all(bvec, bio, i) {
+ bio_for_each_segment_all_sp(bvec, bio, i, bia) {
struct page *page = bvec->bv_page;

if (page && !PageCompound(page))
@@ -1659,16 +1660,26 @@ void bio_set_pages_dirty(struct bio *bio)
}
}

+static inline void release_mp_bvec_pages(struct bio_vec *bvec)
+{
+ struct bio_vec bv;
+ struct bvec_iter iter;
+
+ bvec_for_each_sp_bvec(bv, bvec, iter)
+ put_page(bv.bv_page);
+}
+
static void bio_release_pages(struct bio *bio)
{
struct bio_vec *bvec;
int i;

- bio_for_each_segment_all(bvec, bio, i) {
+ /* iterate each mp bvec */
+ bio_for_each_segment_all_mp(bvec, bio, i) {
struct page *page = bvec->bv_page;

if (page)
- put_page(page);
+ release_mp_bvec_pages(bvec);
}
}

@@ -1712,20 +1723,38 @@ static void bio_dirty_fn(struct work_struct *work)
}
}

+static inline void check_mp_bvec_pages(struct bio_vec *bvec,
+ int *nr_dirty, int *nr_pages)
+{
+ struct bio_vec bv;
+ struct bvec_iter iter;
+
+ bvec_for_each_sp_bvec(bv, bvec, iter) {
+ struct page *page = bv.bv_page;
+
+ if (PageDirty(page) || PageCompound(page))
+ (*nr_dirty)++;
+ (*nr_pages)++;
+ }
+}
+
void bio_check_pages_dirty(struct bio *bio)
{
struct bio_vec *bvec;
int nr_clean_pages = 0;
int i;

- bio_for_each_segment_all(bvec, bio, i) {
- struct page *page = bvec->bv_page;
+ bio_for_each_segment_all_mp(bvec, bio, i) {
+ int nr_dirty = 0, nr_pages = 0;
+
+ check_mp_bvec_pages(bvec, &nr_dirty, &nr_pages);

- if (PageDirty(page) || PageCompound(page)) {
- put_page(page);
+ /* release all pages in the mp bvec if all are dirtied */
+ if (nr_dirty == nr_pages) {
+ release_mp_bvec_pages(bvec);
bvec->bv_page = NULL;
} else {
- nr_clean_pages++;
+ nr_clean_pages += nr_pages;
}
}

--
2.9.4

2017-08-08 08:52:24

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 32/49] block: convert to single/multi page version of bio_for_each_segment_all()

Signed-off-by: Ming Lei <[email protected]>
---
block/bio.c | 17 +++++++++++------
block/blk-zoned.c | 5 +++--
block/bounce.c | 6 ++++--
3 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 716e6917b0fd..fd6a055f491c 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -997,7 +997,7 @@ int bio_alloc_pages(struct bio *bio, gfp_t gfp_mask)
int i;
struct bio_vec *bv;

- bio_for_each_segment_all(bv, bio, i) {
+ bio_for_each_segment_all_mp(bv, bio, i) {
bv->bv_page = alloc_page(gfp_mask);
if (!bv->bv_page) {
while (--bv >= bio->bi_io_vec)
@@ -1098,8 +1098,9 @@ static int bio_copy_from_iter(struct bio *bio, struct iov_iter iter)
{
int i;
struct bio_vec *bvec;
+ struct bvec_iter_all bia;

- bio_for_each_segment_all(bvec, bio, i) {
+ bio_for_each_segment_all_sp(bvec, bio, i, bia) {
ssize_t ret;

ret = copy_page_from_iter(bvec->bv_page,
@@ -1129,8 +1130,9 @@ static int bio_copy_to_iter(struct bio *bio, struct iov_iter iter)
{
int i;
struct bio_vec *bvec;
+ struct bvec_iter_all bia;

- bio_for_each_segment_all(bvec, bio, i) {
+ bio_for_each_segment_all_sp(bvec, bio, i, bia) {
ssize_t ret;

ret = copy_page_to_iter(bvec->bv_page,
@@ -1152,8 +1154,9 @@ void bio_free_pages(struct bio *bio)
{
struct bio_vec *bvec;
int i;
+ struct bvec_iter_all bia;

- bio_for_each_segment_all(bvec, bio, i)
+ bio_for_each_segment_all_sp(bvec, bio, i, bia)
__free_page(bvec->bv_page);
}
EXPORT_SYMBOL(bio_free_pages);
@@ -1444,11 +1447,12 @@ static void __bio_unmap_user(struct bio *bio)
{
struct bio_vec *bvec;
int i;
+ struct bvec_iter_all bia;

/*
* make sure we dirty pages we wrote to
*/
- bio_for_each_segment_all(bvec, bio, i) {
+ bio_for_each_segment_all_sp(bvec, bio, i, bia) {
if (bio_data_dir(bio) == READ)
set_page_dirty_lock(bvec->bv_page);

@@ -1540,8 +1544,9 @@ static void bio_copy_kern_endio_read(struct bio *bio)
char *p = bio->bi_private;
struct bio_vec *bvec;
int i;
+ struct bvec_iter_all bia;

- bio_for_each_segment_all(bvec, bio, i) {
+ bio_for_each_segment_all_sp(bvec, bio, i, bia) {
memcpy(p, page_address(bvec->bv_page), bvec->bv_len);
p += bvec->bv_len;
}
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 3bd15d8095b1..558b84ae2d86 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -81,6 +81,7 @@ int blkdev_report_zones(struct block_device *bdev,
unsigned int ofst;
void *addr;
int ret;
+ struct bvec_iter_all bia;

if (!q)
return -ENXIO;
@@ -148,7 +149,7 @@ int blkdev_report_zones(struct block_device *bdev,
n = 0;
nz = 0;
nr_rep = 0;
- bio_for_each_segment_all(bv, bio, i) {
+ bio_for_each_segment_all_sp(bv, bio, i, bia) {

if (!bv->bv_page)
break;
@@ -181,7 +182,7 @@ int blkdev_report_zones(struct block_device *bdev,

*nr_zones = nz;
out:
- bio_for_each_segment_all(bv, bio, i)
+ bio_for_each_segment_all_sp(bv, bio, i, bia)
__free_page(bv->bv_page);
bio_put(bio);

diff --git a/block/bounce.c b/block/bounce.c
index 50e965a15295..34286abc0a66 100644
--- a/block/bounce.c
+++ b/block/bounce.c
@@ -145,11 +145,12 @@ static void bounce_end_io(struct bio *bio, mempool_t *pool)
struct bio_vec *bvec, orig_vec;
int i;
struct bvec_iter orig_iter = bio_orig->bi_iter;
+ struct bvec_iter_all bia;

/*
* free up bounce indirect pages used
*/
- bio_for_each_segment_all(bvec, bio, i) {
+ bio_for_each_segment_all_sp(bvec, bio, i, bia) {
orig_vec = bio_iter_iovec(bio_orig, orig_iter);
if (bvec->bv_page != orig_vec.bv_page) {
dec_zone_page_state(bvec->bv_page, NR_BOUNCE);
@@ -204,6 +205,7 @@ static void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig,
unsigned i = 0;
bool bounce = false;
int sectors = 0;
+ struct bvec_iter_all bia;

bio_for_each_segment(from, *bio_orig, iter) {
if (i++ < BIO_MAX_PAGES)
@@ -222,7 +224,7 @@ static void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig,
}
bio = bio_clone_bioset(*bio_orig, GFP_NOIO, bounce_bio_set);

- bio_for_each_segment_all(to, bio, i) {
+ bio_for_each_segment_all_sp(to, bio, i, bia) {
struct page *page = to->bv_page;

if (page_to_pfn(page) <= q->limits.bounce_pfn)
--
2.9.4

2017-08-08 08:52:32

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 33/49] bcache: convert to bio_for_each_segment_all_sp()

Cc: [email protected]
Signed-off-by: Ming Lei <[email protected]>
---
drivers/md/bcache/btree.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 3da595ae565b..74cbb7387dc5 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -422,8 +422,9 @@ static void do_btree_node_write(struct btree *b)
int j;
struct bio_vec *bv;
void *base = (void *) ((unsigned long) i & ~(PAGE_SIZE - 1));
+ struct bvec_iter_all bia;

- bio_for_each_segment_all(bv, b->bio, j)
+ bio_for_each_segment_all_sp(bv, b->bio, j, bia)
memcpy(page_address(bv->bv_page),
base + j * PAGE_SIZE, PAGE_SIZE);

--
2.9.4

2017-08-08 08:52:41

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 34/49] md: raid1: convert to bio_for_each_segment_all_sp()

Cc: Shaohua Li <[email protected]>
Cc: [email protected]
Signed-off-by: Ming Lei <[email protected]>
---
drivers/md/raid1.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index f50958ded9f0..e34080bd91cb 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2107,13 +2107,14 @@ static void process_checks(struct r1bio *r1_bio)
struct page **spages = get_resync_pages(sbio)->pages;
struct bio_vec *bi;
int page_len[RESYNC_PAGES] = { 0 };
+ struct bvec_iter_all bia;

if (sbio->bi_end_io != end_sync_read)
continue;
/* Now we can 'fixup' the error value */
sbio->bi_status = 0;

- bio_for_each_segment_all(bi, sbio, j)
+ bio_for_each_segment_all_sp(bi, sbio, j, bia)
page_len[j] = bi->bv_len;

if (!status) {
--
2.9.4

2017-08-08 08:52:54

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 35/49] dm-crypt: don't clear bvec->bv_page in crypt_free_buffer_pages()

The bio is always freed after running crypt_free_buffer_pages(),
so it isn't necessary to clear the bv->bv_page.

Cc: Mike Snitzer <[email protected]>
Cc:[email protected]
Signed-off-by: Ming Lei <[email protected]>
---
drivers/md/dm-crypt.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index cdf6b1e12460..664ba3504f48 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1450,7 +1450,6 @@ static void crypt_free_buffer_pages(struct crypt_config *cc, struct bio *clone)
bio_for_each_segment_all(bv, clone, i) {
BUG_ON(!bv->bv_page);
mempool_free(bv->bv_page, cc->page_pool);
- bv->bv_page = NULL;
}
}

--
2.9.4

2017-08-08 08:53:07

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 36/49] dm-crypt: convert to bio_for_each_segment_all_sp()

Cc: Mike Snitzer <[email protected]>
Cc:[email protected]
Signed-off-by: Ming Lei <[email protected]>
---
drivers/md/dm-crypt.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 664ba3504f48..0f2f44a73a32 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1446,8 +1446,9 @@ static void crypt_free_buffer_pages(struct crypt_config *cc, struct bio *clone)
{
unsigned int i;
struct bio_vec *bv;
+ struct bvec_iter_all bia;

- bio_for_each_segment_all(bv, clone, i) {
+ bio_for_each_segment_all_sp(bv, clone, i, bia) {
BUG_ON(!bv->bv_page);
mempool_free(bv->bv_page, cc->page_pool);
}
--
2.9.4

2017-08-08 08:53:18

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 37/49] fs/mpage: convert to bio_for_each_segment_all_sp()

Signed-off-by: Ming Lei <[email protected]>
---
fs/mpage.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/mpage.c b/fs/mpage.c
index 2e4c41ccb5c9..b3c0f0d6bc21 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -46,9 +46,10 @@
static void mpage_end_io(struct bio *bio)
{
struct bio_vec *bv;
+ struct bvec_iter_all bia;
int i;

- bio_for_each_segment_all(bv, bio, i) {
+ bio_for_each_segment_all_sp(bv, bio, i, bia) {
struct page *page = bv->bv_page;
page_endio(page, op_is_write(bio_op(bio)),
blk_status_to_errno(bio->bi_status));
--
2.9.4

2017-08-08 08:53:28

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 38/49] fs/block: convert to bio_for_each_segment_all_sp()

Signed-off-by: Ming Lei <[email protected]>
---
fs/block_dev.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 9941dc8342df..489d103ae11b 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -209,6 +209,7 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
ssize_t ret;
blk_qc_t qc;
int i;
+ struct bvec_iter_all bia;

if ((pos | iov_iter_alignment(iter)) &
(bdev_logical_block_size(bdev) - 1))
@@ -254,7 +255,7 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
}
__set_current_state(TASK_RUNNING);

- bio_for_each_segment_all(bvec, &bio, i) {
+ bio_for_each_segment_all_sp(bvec, &bio, i, bia) {
if (should_dirty && !PageCompound(bvec->bv_page))
set_page_dirty_lock(bvec->bv_page);
put_page(bvec->bv_page);
@@ -321,8 +322,9 @@ static void blkdev_bio_end_io(struct bio *bio)
} else {
struct bio_vec *bvec;
int i;
+ struct bvec_iter_all bia;

- bio_for_each_segment_all(bvec, bio, i)
+ bio_for_each_segment_all_sp(bvec, bio, i, bia)
put_page(bvec->bv_page);
bio_put(bio);
}
--
2.9.4

2017-08-08 08:53:33

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 39/49] fs/iomap: convert to bio_for_each_segment_all_sp()

Signed-off-by: Ming Lei <[email protected]>
---
fs/iomap.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/iomap.c b/fs/iomap.c
index 039266128b7f..17541e1c86a2 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -790,8 +790,9 @@ static void iomap_dio_bio_end_io(struct bio *bio)
} else {
struct bio_vec *bvec;
int i;
+ struct bvec_iter_all bia;

- bio_for_each_segment_all(bvec, bio, i)
+ bio_for_each_segment_all_sp(bvec, bio, i, bia)
put_page(bvec->bv_page);
bio_put(bio);
}
--
2.9.4

2017-08-08 08:53:40

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 40/49] ext4: convert to bio_for_each_segment_all_sp()

Cc: "Theodore Ts'o" <[email protected]>
Cc: Andreas Dilger <[email protected]>
Cc: [email protected]
Signed-off-by: Ming Lei <[email protected]>
---
fs/ext4/page-io.c | 3 ++-
fs/ext4/readpage.c | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index c2fce4478cca..5a36ad110f14 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -62,8 +62,9 @@ static void ext4_finish_bio(struct bio *bio)
{
int i;
struct bio_vec *bvec;
+ struct bvec_iter_all bia;

- bio_for_each_segment_all(bvec, bio, i) {
+ bio_for_each_segment_all_sp(bvec, bio, i, bia) {
struct page *page = bvec->bv_page;
#ifdef CONFIG_EXT4_FS_ENCRYPTION
struct page *data_page = NULL;
diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
index 40a5497b0f60..6bd33c4c1f7f 100644
--- a/fs/ext4/readpage.c
+++ b/fs/ext4/readpage.c
@@ -71,6 +71,7 @@ static void mpage_end_io(struct bio *bio)
{
struct bio_vec *bv;
int i;
+ struct bvec_iter_all bia;

if (ext4_bio_encrypted(bio)) {
if (bio->bi_status) {
@@ -80,7 +81,7 @@ static void mpage_end_io(struct bio *bio)
return;
}
}
- bio_for_each_segment_all(bv, bio, i) {
+ bio_for_each_segment_all_sp(bv, bio, i, bia) {
struct page *page = bv->bv_page;

if (!bio->bi_status) {
--
2.9.4

2017-08-08 08:53:50

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 41/49] xfs: convert to bio_for_each_segment_all_sp()

Cc: "Darrick J. Wong" <[email protected]>
Cc: [email protected]
Signed-off-by: Ming Lei <[email protected]>
---
fs/xfs/xfs_aops.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 6bf120bb1a17..94df43dcae0b 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -139,6 +139,7 @@ xfs_destroy_ioend(
for (bio = &ioend->io_inline_bio; bio; bio = next) {
struct bio_vec *bvec;
int i;
+ struct bvec_iter_all bia;

/*
* For the last bio, bi_private points to the ioend, so we
@@ -150,7 +151,7 @@ xfs_destroy_ioend(
next = bio->bi_private;

/* walk each page on bio, ending page IO on them */
- bio_for_each_segment_all(bvec, bio, i)
+ bio_for_each_segment_all_sp(bvec, bio, i, bia)
xfs_finish_page_writeback(inode, bvec, error);

bio_put(bio);
--
2.9.4

2017-08-08 08:54:01

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 42/49] gfs2: convert to bio_for_each_segment_all_sp()

Cc: Steven Whitehouse <[email protected]>
Cc: Bob Peterson <[email protected]>
Cc: [email protected]
Signed-off-by: Ming Lei <[email protected]>
---
fs/gfs2/lops.c | 3 ++-
fs/gfs2/meta_io.c | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index 3010f9edd177..d1fd8ed01b9e 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -206,11 +206,12 @@ static void gfs2_end_log_write(struct bio *bio)
struct bio_vec *bvec;
struct page *page;
int i;
+ struct bvec_iter_all bia;

if (bio->bi_status)
fs_err(sdp, "Error %d writing to log\n", bio->bi_status);

- bio_for_each_segment_all(bvec, bio, i) {
+ bio_for_each_segment_all_sp(bvec, bio, i, bia) {
page = bvec->bv_page;
if (page_has_buffers(page))
gfs2_end_log_write_bh(sdp, bvec, bio->bi_status);
diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c
index fabe1614f879..6879b0103539 100644
--- a/fs/gfs2/meta_io.c
+++ b/fs/gfs2/meta_io.c
@@ -190,8 +190,9 @@ static void gfs2_meta_read_endio(struct bio *bio)
{
struct bio_vec *bvec;
int i;
+ struct bvec_iter_all bia;

- bio_for_each_segment_all(bvec, bio, i) {
+ bio_for_each_segment_all_sp(bvec, bio, i, bia) {
struct page *page = bvec->bv_page;
struct buffer_head *bh = page_buffers(page);
unsigned int len = bvec->bv_len;
--
2.9.4

2017-08-08 08:54:11

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 43/49] f2fs: convert to bio_for_each_segment_all_sp()

Cc: Jaegeuk Kim <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: [email protected]
Signed-off-by: Ming Lei <[email protected]>
---
fs/f2fs/data.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 99fa8e9780e8..69aa500834d3 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -54,6 +54,7 @@ static void f2fs_read_end_io(struct bio *bio)
{
struct bio_vec *bvec;
int i;
+ struct bvec_iter_all bia;

#ifdef CONFIG_F2FS_FAULT_INJECTION
/*
@@ -75,7 +76,7 @@ static void f2fs_read_end_io(struct bio *bio)
}
}

- bio_for_each_segment_all(bvec, bio, i) {
+ bio_for_each_segment_all_sp(bvec, bio, i, bia) {
struct page *page = bvec->bv_page;

if (!bio->bi_status) {
@@ -95,8 +96,9 @@ static void f2fs_write_end_io(struct bio *bio)
struct f2fs_sb_info *sbi = bio->bi_private;
struct bio_vec *bvec;
int i;
+ struct bvec_iter_all bia;

- bio_for_each_segment_all(bvec, bio, i) {
+ bio_for_each_segment_all_sp(bvec, bio, i, bia) {
struct page *page = bvec->bv_page;
enum count_type type = WB_DATA_TYPE(page);

@@ -256,6 +258,7 @@ static bool __has_merged_page(struct f2fs_bio_info *io,
struct bio_vec *bvec;
struct page *target;
int i;
+ struct bvec_iter_all bia;

if (!io->bio)
return false;
@@ -263,7 +266,7 @@ static bool __has_merged_page(struct f2fs_bio_info *io,
if (!inode && !ino)
return true;

- bio_for_each_segment_all(bvec, io->bio, i) {
+ bio_for_each_segment_all_sp(bvec, io->bio, i, bia) {

if (bvec->bv_page->mapping)
target = bvec->bv_page;
--
2.9.4

2017-08-08 08:54:22

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 44/49] exofs: convert to bio_for_each_segment_all_sp()

Cc: Boaz Harrosh <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
---
fs/exofs/ore.c | 3 ++-
fs/exofs/ore_raid.c | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/exofs/ore.c b/fs/exofs/ore.c
index 8bb72807e70d..38a7d8bfdd4c 100644
--- a/fs/exofs/ore.c
+++ b/fs/exofs/ore.c
@@ -406,8 +406,9 @@ static void _clear_bio(struct bio *bio)
{
struct bio_vec *bv;
unsigned i;
+ struct bvec_iter_all bia;

- bio_for_each_segment_all(bv, bio, i) {
+ bio_for_each_segment_all_sp(bv, bio, i, bia) {
unsigned this_count = bv->bv_len;

if (likely(PAGE_SIZE == this_count))
diff --git a/fs/exofs/ore_raid.c b/fs/exofs/ore_raid.c
index 27cbdb697649..37c0a9aa2ec2 100644
--- a/fs/exofs/ore_raid.c
+++ b/fs/exofs/ore_raid.c
@@ -429,6 +429,7 @@ static void _mark_read4write_pages_uptodate(struct ore_io_state *ios, int ret)
{
struct bio_vec *bv;
unsigned i, d;
+ struct bvec_iter_all bia;

/* loop on all devices all pages */
for (d = 0; d < ios->numdevs; d++) {
@@ -437,7 +438,7 @@ static void _mark_read4write_pages_uptodate(struct ore_io_state *ios, int ret)
if (!bio)
continue;

- bio_for_each_segment_all(bv, bio, i) {
+ bio_for_each_segment_all_sp(bv, bio, i, bia) {
struct page *page = bv->bv_page;

SetPageUptodate(page);
--
2.9.4

2017-08-08 08:54:31

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 45/49] fs: crypto: convert to bio_for_each_segment_all_sp()

Signed-off-by: Ming Lei <[email protected]>
---
fs/crypto/bio.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/crypto/bio.c b/fs/crypto/bio.c
index 6181e9526860..d5516ed19166 100644
--- a/fs/crypto/bio.c
+++ b/fs/crypto/bio.c
@@ -36,8 +36,9 @@ static void completion_pages(struct work_struct *work)
struct bio *bio = ctx->r.bio;
struct bio_vec *bv;
int i;
+ struct bvec_iter_all bia;

- bio_for_each_segment_all(bv, bio, i) {
+ bio_for_each_segment_all_sp(bv, bio, i, bia) {
struct page *page = bv->bv_page;
int ret = fscrypt_decrypt_page(page->mapping->host, page,
PAGE_SIZE, 0, page->index);
--
2.9.4

2017-08-08 08:54:42

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 46/49] fs/btrfs: convert to bio_for_each_segment_all_sp()

Cc: Chris Mason <[email protected]>
Cc: Josef Bacik <[email protected]>
Cc: David Sterba <[email protected]>
Cc: [email protected]
Signed-off-by: Ming Lei <[email protected]>
---
fs/btrfs/compression.c | 3 ++-
fs/btrfs/disk-io.c | 3 ++-
fs/btrfs/extent_io.c | 12 ++++++++----
fs/btrfs/inode.c | 6 ++++--
fs/btrfs/raid56.c | 1 +
5 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 28746588f228..55f251a83d0b 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -147,13 +147,14 @@ static void end_compressed_bio_read(struct bio *bio)
} else {
int i;
struct bio_vec *bvec;
+ struct bvec_iter_all bia;

/*
* we have verified the checksum already, set page
* checked so the end_io handlers know about it
*/
ASSERT(!bio_flagged(bio, BIO_CLONED));
- bio_for_each_segment_all(bvec, cb->orig_bio, i)
+ bio_for_each_segment_all_sp(bvec, cb->orig_bio, i, bia)
SetPageChecked(bvec->bv_page);

bio_endio(cb->orig_bio);
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 080e2ebb8aa0..a9cd75e6383d 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -963,9 +963,10 @@ static blk_status_t btree_csum_one_bio(struct bio *bio)
struct bio_vec *bvec;
struct btrfs_root *root;
int i, ret = 0;
+ struct bvec_iter_all bia;

ASSERT(!bio_flagged(bio, BIO_CLONED));
- bio_for_each_segment_all(bvec, bio, i) {
+ bio_for_each_segment_all_sp(bvec, bio, i, bia) {
root = BTRFS_I(bvec->bv_page->mapping->host)->root;
ret = csum_dirty_buffer(root->fs_info, bvec->bv_page);
if (ret)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index c8f6a8657bf2..4de9cfd1c385 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2359,8 +2359,9 @@ static unsigned int get_bio_pages(struct bio *bio)
{
unsigned i;
struct bio_vec *bv;
+ struct bvec_iter_all bia;

- bio_for_each_segment_all(bv, bio, i)
+ bio_for_each_segment_all_sp(bv, bio, i, bia)
;

return i;
@@ -2463,9 +2464,10 @@ static void end_bio_extent_writepage(struct bio *bio)
u64 start;
u64 end;
int i;
+ struct bvec_iter_all bia;

ASSERT(!bio_flagged(bio, BIO_CLONED));
- bio_for_each_segment_all(bvec, bio, i) {
+ bio_for_each_segment_all_sp(bvec, bio, i, bia) {
struct page *page = bvec->bv_page;
struct inode *inode = page->mapping->host;
struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
@@ -2534,9 +2536,10 @@ static void end_bio_extent_readpage(struct bio *bio)
int mirror;
int ret;
int i;
+ struct bvec_iter_all bia;

ASSERT(!bio_flagged(bio, BIO_CLONED));
- bio_for_each_segment_all(bvec, bio, i) {
+ bio_for_each_segment_all_sp(bvec, bio, i, bia) {
struct page *page = bvec->bv_page;
struct inode *inode = page->mapping->host;
struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
@@ -3693,9 +3696,10 @@ static void end_bio_extent_buffer_writepage(struct bio *bio)
struct bio_vec *bvec;
struct extent_buffer *eb;
int i, done;
+ struct bvec_iter_all bia;

ASSERT(!bio_flagged(bio, BIO_CLONED));
- bio_for_each_segment_all(bvec, bio, i) {
+ bio_for_each_segment_all_sp(bvec, bio, i, bia) {
struct page *page = bvec->bv_page;

eb = (struct extent_buffer *)page->private;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 084ed99dd308..eeb2ff662ec4 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8047,6 +8047,7 @@ static void btrfs_retry_endio_nocsum(struct bio *bio)
struct bio_vec *bvec;
struct extent_io_tree *io_tree, *failure_tree;
int i;
+ struct bvec_iter_all bia;

if (bio->bi_status)
goto end;
@@ -8064,7 +8065,7 @@ static void btrfs_retry_endio_nocsum(struct bio *bio)

done->uptodate = 1;
ASSERT(!bio_flagged(bio, BIO_CLONED));
- bio_for_each_segment_all(bvec, bio, i)
+ bio_for_each_segment_all_sp(bvec, bio, i, bia)
clean_io_failure(BTRFS_I(inode)->root->fs_info, failure_tree,
io_tree, done->start, bvec->bv_page,
btrfs_ino(BTRFS_I(inode)), 0);
@@ -8143,6 +8144,7 @@ static void btrfs_retry_endio(struct bio *bio)
int uptodate;
int ret;
int i;
+ struct bvec_iter_all bia;

if (bio->bi_status)
goto end;
@@ -8162,7 +8164,7 @@ static void btrfs_retry_endio(struct bio *bio)
failure_tree = &BTRFS_I(inode)->io_failure_tree;

ASSERT(!bio_flagged(bio, BIO_CLONED));
- bio_for_each_segment_all(bvec, bio, i) {
+ bio_for_each_segment_all_sp(bvec, bio, i, bia) {
ret = __readpage_endio_check(inode, io_bio, i, bvec->bv_page,
bvec->bv_offset, done->start,
bvec->bv_len);
diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 208638384cd2..9247226a2efd 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -1365,6 +1365,7 @@ static int find_logical_bio_stripe(struct btrfs_raid_bio *rbio,
u64 logical = bio->bi_iter.bi_sector;
u64 stripe_start;
int i;
+ struct bvec_iter_all bia;

logical <<= 9;

--
2.9.4

2017-08-08 08:54:53

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 47/49] fs/direct-io: convert to bio_for_each_segment_all_sp()

Signed-off-by: Ming Lei <[email protected]>
---
fs/direct-io.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index 08cf27811e5a..8447f4d55730 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -491,7 +491,9 @@ static blk_status_t dio_bio_complete(struct dio *dio, struct bio *bio)
if (dio->is_async && dio->op == REQ_OP_READ && dio->should_dirty) {
bio_check_pages_dirty(bio); /* transfers ownership */
} else {
- bio_for_each_segment_all(bvec, bio, i) {
+ struct bvec_iter_all bia;
+
+ bio_for_each_segment_all_sp(bvec, bio, i, bia) {
struct page *page = bvec->bv_page;

if (dio->op == REQ_OP_READ && !PageCompound(page) &&
--
2.9.4

2017-08-08 08:55:06

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 48/49] block: enable multipage bvecs

This patch pulls the trigger for multipage bvecs.

Now any request queue that supports clustering
will see multipage bvecs.
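
As an illustration (the wrapper function below is hypothetical; the
block-layer calls are existing API), adding two physically contiguous
pages now produces a single bvec on a queue with clustering enabled:

/* hypothetical example, not part of the patch */
static void mp_bvec_merge_example(struct block_device *bdev,
				  struct page *pg0, struct page *pg1)
{
	/* assumes page_to_phys(pg1) == page_to_phys(pg0) + PAGE_SIZE */
	struct bio *bio = bio_alloc(GFP_NOIO, 2);

	bio->bi_bdev = bdev;	/* lets bio_add_page() check clustering */
	bio_add_page(bio, pg0, PAGE_SIZE, 0);
	bio_add_page(bio, pg1, PAGE_SIZE, 0);

	/* with this patch: bi_vcnt == 1 and bv_len == 2 * PAGE_SIZE */
	bio_put(bio);	/* cleanup of the pages is omitted in this sketch */
}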

Signed-off-by: Ming Lei <[email protected]>
---
block/bio.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/block/bio.c b/block/bio.c
index fd6a055f491c..a5f7fd4ef818 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -844,6 +844,11 @@ int bio_add_page(struct bio *bio, struct page *page,
* a consecutive offset. Optimize this special case.
*/
if (bio->bi_vcnt > 0) {
+ struct request_queue *q = NULL;
+
+ if (bio->bi_bdev)
+ q = bdev_get_queue(bio->bi_bdev);
+
bv = &bio->bi_io_vec[bio->bi_vcnt - 1];

if (page == bv->bv_page &&
@@ -851,6 +856,14 @@ int bio_add_page(struct bio *bio, struct page *page,
bv->bv_len += len;
goto done;
}
+
+ /* only merge into a multipage bvec if the queue has clustering enabled */
+ if (q && blk_queue_cluster(q) &&
+ (bvec_to_phys(bv) + bv->bv_len ==
+ page_to_phys(page) + offset)) {
+ bv->bv_len += len;
+ goto done;
+ }
}

if (bio->bi_vcnt >= bio->bi_max_vecs)
--
2.9.4

2017-08-08 08:55:19

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 49/49] block: bio: pass segments to bio if bio_add_page() is bypassed

In some situations, such as block direct I/O, we can't use
bio_add_page() to merge pages into a multipage bvec, so
implement a new function that converts the page array into a
segment array; then these cases can benefit from multipage
bvec too.
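
As a worked example (the numbers are invented for illustration):

/*
 * Five pinned pages with pfns { 100, 101, 102, 200, 201 } are
 * converted by convert_to_segs() into two segments:
 *
 *	pages[]    = { pfn 100, pfn 200, ... }	(compacted in place)
 *	page_cnt[] = { 2, 1 }			(extra pages beyond the first)
 *	nr_segs    = 2
 *
 * bio_iov_iter_get_pages() then builds two bvecs of 3 * PAGE_SIZE and
 * 2 * PAGE_SIZE instead of five single-page bvecs.
 */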

Signed-off-by: Ming Lei <[email protected]>
---
block/bio.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++------
1 file changed, 48 insertions(+), 6 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index a5f7fd4ef818..aead0e3e36a9 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -881,6 +881,41 @@ int bio_add_page(struct bio *bio, struct page *page,
}
EXPORT_SYMBOL(bio_add_page);

+static unsigned convert_to_segs(struct bio* bio, struct page **pages,
+ unsigned char *page_cnt,
+ unsigned nr_pages)
+{
+
+ unsigned idx;
+ unsigned nr_seg = 0;
+ struct request_queue *q = NULL;
+
+ if (bio->bi_bdev)
+ q = bdev_get_queue(bio->bi_bdev);
+
+ if (!q || !blk_queue_cluster(q)) {
+ memset(page_cnt, 0, nr_pages);
+ return nr_pages;
+ }
+
+ page_cnt[nr_seg] = 0;
+ for (idx = 1; idx < nr_pages; idx++) {
+ struct page *pg_s = pages[nr_seg];
+ struct page *pg = pages[idx];
+
+ if (page_to_pfn(pg_s) + page_cnt[nr_seg] + 1 ==
+ page_to_pfn(pg)) {
+ page_cnt[nr_seg]++;
+ } else {
+ page_cnt[++nr_seg] = 0;
+ if (nr_seg < idx)
+ pages[nr_seg] = pg;
+ }
+ }
+
+ return nr_seg + 1;
+}
+
/**
* bio_iov_iter_get_pages - pin user or kernel pages and add them to a bio
* @bio: bio to add pages to
@@ -900,6 +935,8 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
struct page **pages = (struct page **)bv;
size_t offset, diff;
ssize_t size;
+ unsigned short nr_segs;
+ unsigned char page_cnt[nr_pages]; /* at most 256 pages */

size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset);
if (unlikely(size <= 0))
@@ -915,13 +952,18 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
* need to be reflected here as well.
*/
bio->bi_iter.bi_size += size;
- bio->bi_vcnt += nr_pages;
-
diff = (nr_pages * PAGE_SIZE - offset) - size;
- while (nr_pages--) {
- bv[nr_pages].bv_page = pages[nr_pages];
- bv[nr_pages].bv_len = PAGE_SIZE;
- bv[nr_pages].bv_offset = 0;
+
+ /* convert into segments */
+ nr_segs = convert_to_segs(bio, pages, page_cnt, nr_pages);
+ bio->bi_vcnt += nr_segs;
+
+ while (nr_segs--) {
+ unsigned cnt = (unsigned)page_cnt[nr_segs] + 1;
+
+ bv[nr_segs].bv_page = pages[nr_segs];
+ bv[nr_segs].bv_len = PAGE_SIZE * cnt;
+ bv[nr_segs].bv_offset = 0;
}

bv[0].bv_offset += offset;
--
2.9.4

2017-08-08 09:00:07

by Filipe Manana

[permalink] [raw]
Subject: Re: [PATCH v3 46/49] fs/btrfs: convert to bio_for_each_segment_all_sp()

On Tue, Aug 8, 2017 at 9:45 AM, Ming Lei <[email protected]> wrote:
> Cc: Chris Mason <[email protected]>
> Cc: Josef Bacik <[email protected]>
> Cc: David Sterba <[email protected]>
> Cc: [email protected]
> Signed-off-by: Ming Lei <[email protected]>

Can you please add some meaningful changelog? E.g., why is this
conversion needed.

> ---
> fs/btrfs/compression.c | 3 ++-
> fs/btrfs/disk-io.c | 3 ++-
> fs/btrfs/extent_io.c | 12 ++++++++----
> fs/btrfs/inode.c | 6 ++++--
> fs/btrfs/raid56.c | 1 +
> 5 files changed, 17 insertions(+), 8 deletions(-)
>
> diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
> index 28746588f228..55f251a83d0b 100644
> --- a/fs/btrfs/compression.c
> +++ b/fs/btrfs/compression.c
> @@ -147,13 +147,14 @@ static void end_compressed_bio_read(struct bio *bio)
> } else {
> int i;
> struct bio_vec *bvec;
> + struct bvec_iter_all bia;
>
> /*
> * we have verified the checksum already, set page
> * checked so the end_io handlers know about it
> */
> ASSERT(!bio_flagged(bio, BIO_CLONED));
> - bio_for_each_segment_all(bvec, cb->orig_bio, i)
> + bio_for_each_segment_all_sp(bvec, cb->orig_bio, i, bia)
> SetPageChecked(bvec->bv_page);
>
> bio_endio(cb->orig_bio);
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 080e2ebb8aa0..a9cd75e6383d 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -963,9 +963,10 @@ static blk_status_t btree_csum_one_bio(struct bio *bio)
> struct bio_vec *bvec;
> struct btrfs_root *root;
> int i, ret = 0;
> + struct bvec_iter_all bia;
>
> ASSERT(!bio_flagged(bio, BIO_CLONED));
> - bio_for_each_segment_all(bvec, bio, i) {
> + bio_for_each_segment_all_sp(bvec, bio, i, bia) {
> root = BTRFS_I(bvec->bv_page->mapping->host)->root;
> ret = csum_dirty_buffer(root->fs_info, bvec->bv_page);
> if (ret)
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index c8f6a8657bf2..4de9cfd1c385 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -2359,8 +2359,9 @@ static unsigned int get_bio_pages(struct bio *bio)
> {
> unsigned i;
> struct bio_vec *bv;
> + struct bvec_iter_all bia;
>
> - bio_for_each_segment_all(bv, bio, i)
> + bio_for_each_segment_all_sp(bv, bio, i, bia)
> ;
>
> return i;
> @@ -2463,9 +2464,10 @@ static void end_bio_extent_writepage(struct bio *bio)
> u64 start;
> u64 end;
> int i;
> + struct bvec_iter_all bia;
>
> ASSERT(!bio_flagged(bio, BIO_CLONED));
> - bio_for_each_segment_all(bvec, bio, i) {
> + bio_for_each_segment_all_sp(bvec, bio, i, bia) {
> struct page *page = bvec->bv_page;
> struct inode *inode = page->mapping->host;
> struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
> @@ -2534,9 +2536,10 @@ static void end_bio_extent_readpage(struct bio *bio)
> int mirror;
> int ret;
> int i;
> + struct bvec_iter_all bia;
>
> ASSERT(!bio_flagged(bio, BIO_CLONED));
> - bio_for_each_segment_all(bvec, bio, i) {
> + bio_for_each_segment_all_sp(bvec, bio, i, bia) {
> struct page *page = bvec->bv_page;
> struct inode *inode = page->mapping->host;
> struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
> @@ -3693,9 +3696,10 @@ static void end_bio_extent_buffer_writepage(struct bio *bio)
> struct bio_vec *bvec;
> struct extent_buffer *eb;
> int i, done;
> + struct bvec_iter_all bia;
>
> ASSERT(!bio_flagged(bio, BIO_CLONED));
> - bio_for_each_segment_all(bvec, bio, i) {
> + bio_for_each_segment_all_sp(bvec, bio, i, bia) {
> struct page *page = bvec->bv_page;
>
> eb = (struct extent_buffer *)page->private;
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 084ed99dd308..eeb2ff662ec4 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -8047,6 +8047,7 @@ static void btrfs_retry_endio_nocsum(struct bio *bio)
> struct bio_vec *bvec;
> struct extent_io_tree *io_tree, *failure_tree;
> int i;
> + struct bvec_iter_all bia;
>
> if (bio->bi_status)
> goto end;
> @@ -8064,7 +8065,7 @@ static void btrfs_retry_endio_nocsum(struct bio *bio)
>
> done->uptodate = 1;
> ASSERT(!bio_flagged(bio, BIO_CLONED));
> - bio_for_each_segment_all(bvec, bio, i)
> + bio_for_each_segment_all_sp(bvec, bio, i, bia)
> clean_io_failure(BTRFS_I(inode)->root->fs_info, failure_tree,
> io_tree, done->start, bvec->bv_page,
> btrfs_ino(BTRFS_I(inode)), 0);
> @@ -8143,6 +8144,7 @@ static void btrfs_retry_endio(struct bio *bio)
> int uptodate;
> int ret;
> int i;
> + struct bvec_iter_all bia;
>
> if (bio->bi_status)
> goto end;
> @@ -8162,7 +8164,7 @@ static void btrfs_retry_endio(struct bio *bio)
> failure_tree = &BTRFS_I(inode)->io_failure_tree;
>
> ASSERT(!bio_flagged(bio, BIO_CLONED));
> - bio_for_each_segment_all(bvec, bio, i) {
> + bio_for_each_segment_all_sp(bvec, bio, i, bia) {
> ret = __readpage_endio_check(inode, io_bio, i, bvec->bv_page,
> bvec->bv_offset, done->start,
> bvec->bv_len);
> diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
> index 208638384cd2..9247226a2efd 100644
> --- a/fs/btrfs/raid56.c
> +++ b/fs/btrfs/raid56.c
> @@ -1365,6 +1365,7 @@ static int find_logical_bio_stripe(struct btrfs_raid_bio *rbio,
> u64 logical = bio->bi_iter.bi_sector;
> u64 stripe_start;
> int i;
> + struct bvec_iter_all bia;

Unused variable.

Thanks.

>
> logical <<= 9;
>
> --
> 2.9.4
>



--
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

2017-08-08 08:46:58

by Ming Lei

[permalink] [raw]
Subject: [PATCH v3 04/49] mm: page_io.c: comment on direct access to bvec table

Cc: Andrew Morton <[email protected]>
Cc: [email protected]
Signed-off-by: Ming Lei <[email protected]>
---
mm/page_io.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/mm/page_io.c b/mm/page_io.c
index b6c4ac388209..11c6f4a9a25b 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -43,6 +43,7 @@ static struct bio *get_swap_bio(gfp_t gfp_flags,

void end_swap_bio_write(struct bio *bio)
{
+ /* single page bio, safe for multipage bvec */
struct page *page = bio->bi_io_vec[0].bv_page;

if (bio->bi_status) {
@@ -116,6 +117,7 @@ static void swap_slot_free_notify(struct page *page)

static void end_swap_bio_read(struct bio *bio)
{
+ /* single page bio, safe for multipage bvec */
struct page *page = bio->bi_io_vec[0].bv_page;
struct task_struct *waiter = bio->bi_private;

--
2.9.4

2017-08-08 12:35:48

by Coly Li

[permalink] [raw]
Subject: Re: [PATCH v3 33/49] bcache: convert to bio_for_each_segment_all_sp()

On 2017/8/8 4:45 PM, Ming Lei wrote:
> Cc: [email protected]
> Signed-off-by: Ming Lei <[email protected]>

The patch is good to me. Thanks.

Acked-by: Coly Li <[email protected]>

> ---
> drivers/md/bcache/btree.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
> index 3da595ae565b..74cbb7387dc5 100644
> --- a/drivers/md/bcache/btree.c
> +++ b/drivers/md/bcache/btree.c
> @@ -422,8 +422,9 @@ static void do_btree_node_write(struct btree *b)
> int j;
> struct bio_vec *bv;
> void *base = (void *) ((unsigned long) i & ~(PAGE_SIZE - 1));
> + struct bvec_iter_all bia;
>
> - bio_for_each_segment_all(bv, b->bio, j)
> + bio_for_each_segment_all_sp(bv, b->bio, j, bia)
> memcpy(page_address(bv->bv_page),
> base + j * PAGE_SIZE, PAGE_SIZE);
>
>


--
Coly Li

2017-08-08 12:36:45

by Coly Li

[permalink] [raw]
Subject: Re: [PATCH v3 07/49] bcache: comment on direct access to bvec table

On 2017/8/8 4:45 PM, Ming Lei wrote:
> Looks all are safe after multipage bvec is supported.
>
> Cc: [email protected]
> Signed-off-by: Ming Lei <[email protected]>

Acked-by: Coly Li <[email protected]>

Coly Li


> ---
> drivers/md/bcache/btree.c | 1 +
> drivers/md/bcache/super.c | 6 ++++++
> drivers/md/bcache/util.c | 7 +++++++
> 3 files changed, 14 insertions(+)
>
> diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
> index 866dcf78ff8e..3da595ae565b 100644
> --- a/drivers/md/bcache/btree.c
> +++ b/drivers/md/bcache/btree.c
> @@ -431,6 +431,7 @@ static void do_btree_node_write(struct btree *b)
>
> continue_at(cl, btree_node_write_done, NULL);
> } else {
> + /* No harm for multipage bvec since the new is just allocated */
> b->bio->bi_vcnt = 0;
> bch_bio_map(b->bio, i);
>
> diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
> index 8352fad765f6..6808f548cd13 100644
> --- a/drivers/md/bcache/super.c
> +++ b/drivers/md/bcache/super.c
> @@ -208,6 +208,7 @@ static void write_bdev_super_endio(struct bio *bio)
>
> static void __write_super(struct cache_sb *sb, struct bio *bio)
> {
> + /* single page bio, safe for multipage bvec */
> struct cache_sb *out = page_address(bio->bi_io_vec[0].bv_page);
> unsigned i;
>
> @@ -1154,6 +1155,8 @@ static void register_bdev(struct cache_sb *sb, struct page *sb_page,
> dc->bdev->bd_holder = dc;
>
> bio_init(&dc->sb_bio, dc->sb_bio.bi_inline_vecs, 1);
> +
> + /* single page bio, safe for multipage bvec */
> dc->sb_bio.bi_io_vec[0].bv_page = sb_page;
> get_page(sb_page);
>
> @@ -1799,6 +1802,7 @@ void bch_cache_release(struct kobject *kobj)
> for (i = 0; i < RESERVE_NR; i++)
> free_fifo(&ca->free[i]);
>
> + /* single page bio, safe for multipage bvec */
> if (ca->sb_bio.bi_inline_vecs[0].bv_page)
> put_page(ca->sb_bio.bi_io_vec[0].bv_page);
>
> @@ -1854,6 +1858,8 @@ static int register_cache(struct cache_sb *sb, struct page *sb_page,
> ca->bdev->bd_holder = ca;
>
> bio_init(&ca->sb_bio, ca->sb_bio.bi_inline_vecs, 1);
> +
> + /* single page bio, safe for multipage bvec */
> ca->sb_bio.bi_io_vec[0].bv_page = sb_page;
> get_page(sb_page);
>
> diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c
> index 8c3a938f4bf0..11b4230ea6ad 100644
> --- a/drivers/md/bcache/util.c
> +++ b/drivers/md/bcache/util.c
> @@ -223,6 +223,13 @@ uint64_t bch_next_delay(struct bch_ratelimit *d, uint64_t done)
> : 0;
> }
>
> +/*
> + * Generally it isn't good to access .bi_io_vec and .bi_vcnt
> + * directly, the preferred way is bio_add_page, but in
> + * this case, bch_bio_map() supposes that the bvec table
> + * is empty, so it is safe to access .bi_vcnt & .bi_io_vec
> + * in this way even after multipage bvec is supported.
> + */
> void bch_bio_map(struct bio *bio, void *base)
> {
> size_t size = bio->bi_iter.bi_size;
>

2017-08-08 16:33:17

by Darrick J. Wong

[permalink] [raw]
Subject: Re: [PATCH v3 41/49] xfs: convert to bio_for_each_segment_all_sp()

On Tue, Aug 08, 2017 at 04:45:40PM +0800, Ming Lei wrote:

Sure would be nice to have a changelog explaining why we're doing this.

> Cc: "Darrick J. Wong" <[email protected]>
> Cc: [email protected]
> Signed-off-by: Ming Lei <[email protected]>
> ---
> fs/xfs/xfs_aops.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index 6bf120bb1a17..94df43dcae0b 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -139,6 +139,7 @@ xfs_destroy_ioend(
> for (bio = &ioend->io_inline_bio; bio; bio = next) {
> struct bio_vec *bvec;
> int i;
> + struct bvec_iter_all bia;
>
> /*
> * For the last bio, bi_private points to the ioend, so we
> @@ -150,7 +151,7 @@ xfs_destroy_ioend(
> next = bio->bi_private;
>
> /* walk each page on bio, ending page IO on them */
> - bio_for_each_segment_all(bvec, bio, i)
> + bio_for_each_segment_all_sp(bvec, bio, i, bia)

It's confusing that you're splitting the old bio_for_each_segment_all
into multipage and singlepage variants, but bio_for_each_segment_all
continues to exist?

Hmm, the new multipage variant aliases the name bio_for_each_segment_all,
so clearly the _all function's semantics have changed a bit, but its name
and signature haven't, which seems likely to trip up someone who didn't
notice the behavioral change.

Is it still valid to call bio_for_each_segment_all? I get the feeling
from this patchset that you're really supposed to decide whether you
want one page at a time or more than one page at a time and choose _sp
or _mp?

(And, seeing how this was the only patch sent to this list, the chances
are higher of someone missing out on these subtle changes...)

--D

> xfs_finish_page_writeback(inode, bvec, error);
>
> bio_put(bio);
> --
> 2.9.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2017-08-10 11:13:00

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 01/49] block: drbd: comment on direct access bvec table

I really don't think that these comments are all that useful.
A big comment near the bi_io_vec field defintion explaining the rules
for access would be a lot better.

2017-08-10 11:14:49

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 04/49] mm: page_io.c: comment on direct access to bvec table

Can we just add a bio_first_page macro that always returns the first
page in the bio?
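
A sketch of what such a macro could look like (the name is taken from
the suggestion above; the implementation is an assumption):

	/* assumes the first bvec's bv_offset stays below PAGE_SIZE */
	#define bio_first_page(bio)	((bio)->bi_io_vec[0].bv_page)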

2017-08-10 11:16:24

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 05/49] fs/buffer: comment on direct access to bvec table

> + /*
> + * It is safe to truncate the last bvec in the following way
> + * even though multipage bvec is supported, but we need to
> + * fix the parameters passed to zero_user().
> + */
> + struct bio_vec *bvec = &bio->bi_io_vec[bio->bi_vcnt - 1];

A 'we need to fix XXX' comment isn't very useful. Just fix it in the
series (which I suspect you're going to do anyway).

Also a bio_last_vec helper might be nice for something like this and
documents properly converted places much better than these comments.
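
A minimal sketch of such a helper (name and shape assumed from the
suggestion, not taken from the series):

	#define bio_last_vec(bio)	(&(bio)->bi_io_vec[(bio)->bi_vcnt - 1])

Combined with bvec_get_last_page() from patch 28, that would also cover
the "last page of the bio" cases.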

2017-08-10 11:26:08

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 07/49] bcache: comment on direct access to bvec table

I think all this bcache code needs bigger attention. For one
bio_alloc_pages is only used in bcache, so we should move it in there.

Second the way bio_alloc_pages is currently written looks potentially
dangerous for multi-page biovecs, so we should think about a better
calling convention. The way bcache seems to generally use it is by
allocating a bio, then calling bch_bio_map on it and then calling
bio_alloc_pages. I think it just needs a new bio_alloc_pages calling
convention that passes the size to be allocated and stop looking into
the segment count.
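
A rough sketch of that calling convention (the function name and shape
are assumptions, not code from the series):

/* hypothetical replacement: allocate pages for @size bytes via bio_add_page() */
static int bio_add_alloc_pages(struct bio *bio, size_t size, gfp_t gfp)
{
	while (size) {
		struct page *page = alloc_page(gfp);
		unsigned len = min_t(size_t, size, PAGE_SIZE);

		if (!page)
			goto free_pages;
		if (bio_add_page(bio, page, len, 0) != len) {
			__free_page(page);
			goto free_pages;
		}
		size -= len;
	}
	return 0;
free_pages:
	bio_free_pages(bio);
	return -ENOMEM;
}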

Third bch_bio_map isn't something we should be doing in a driver,
it should be rewritten using bio_add_page.

> diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
> index 866dcf78ff8e..3da595ae565b 100644
> --- a/drivers/md/bcache/btree.c
> +++ b/drivers/md/bcache/btree.c
> @@ -431,6 +431,7 @@ static void do_btree_node_write(struct btree *b)
>
> continue_at(cl, btree_node_write_done, NULL);
> } else {
> + /* No harm for multipage bvec since the new is just allocated */
> b->bio->bi_vcnt = 0;

This should go away - bio_alloc_pages or its replacement should not
modify bi_vcnt on failure.

> + /* single page bio, safe for multipage bvec */
> dc->sb_bio.bi_io_vec[0].bv_page = sb_page;

needs to use bio_add_page.

> + /* single page bio, safe for multipage bvec */
> ca->sb_bio.bi_io_vec[0].bv_page = sb_page;

needs to use bio_add_page.

2017-08-10 11:28:10

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 09/49] block: comment on bio_iov_iter_get_pages()

> + * The hacking way of using bvec table as page pointer array is safe
> + * even after multipage bvec is introduced because that space can be
> + * thought as unused by bio_add_page().

I'm not sure what value this comment adds.

Note that once we have multi-page biovecs this code should change
to take advantage of multipage biovecs, so adding a comment before
that doesn't seem too helpful.

2017-08-10 11:29:08

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 10/49] dm: limit the max bio size as BIO_MAX_PAGES * PAGE_SIZE

> + ti->max_io_len = min_t(uint32_t, len,
> + (BIO_MAX_PAGES * PAGE_SIZE));

No need for the inner braces. Also all of the above fits nicely
onto a single < 80 char line.

Otherwise this looks fine:

Reviewed-by: Christoph Hellwig <[email protected]>

2017-08-10 11:30:07

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 11/49] btrfs: avoid access to .bi_vcnt directly

> +static unsigned int get_bio_pages(struct bio *bio)
> +{
> + unsigned i;
> + struct bio_vec *bv;
> +
> + bio_for_each_segment_all(bv, bio, i)
> + ;
> +
> + return i;
> +}

s/get_bio_pages/bio_nr_pages/ ?

Also this seems like a useful helper for bio.h

2017-08-10 11:32:15

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 17/49] block: comments on bio_for_each_segment[_all]

The comments should be added in the patches where semantics change,
not separately.

2017-08-10 12:00:37

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 18/49] block: introduce multipage/single page bvec helpers

Please skip adding the _sp names for the single page ones - those
are only used to implement the non-postfixed ones anyway.

The _mp ones should have bio_iter_segment_* names instead.

And while you're at it - I think this code would massively benefit
from turning it into inline functions in a prep step before doing these
changes, including passing the iter by reference for all these functions
instead of the odd by value calling convention.

2017-08-10 12:01:53

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 19/49] block: implement sp version of bvec iterator helpers

On Tue, Aug 08, 2017 at 04:45:18PM +0800, Ming Lei wrote:
> This patch implements singlepage version of the following
> 3 helpers:
> - bvec_iter_offset_sp()
> - bvec_iter_len_sp()
> - bvec_iter_page_sp()
>
> So that one multipage bvec can be splited to singlepage
> bvec, and make users of current bvec iterator happy.

Please merge this into the previous patch, and keep the existing
non postfixed names for the single page version, and use
bvec_iter_segment_* for the multipage versions.

2017-08-10 12:16:17

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 37/49] fs/mpage: convert to bio_for_each_segment_all_sp()

> struct bio_vec *bv;
> + struct bvec_iter_all bia;
> int i;
>
> - bio_for_each_segment_all(bv, bio, i) {
> + bio_for_each_segment_all_sp(bv, bio, i, bia) {
> struct page *page = bv->bv_page;
> page_endio(page, op_is_write(bio_op(bio)),
> blk_status_to_errno(bio->bi_status));

Hmm. Going back to my previous comment about implementing the single
page variants on top of multipage - I wonder if we should simply
do that in the callers, e.g. something like:

bio_for_each_segment_all(bv, bio, i) {
bvec_for_each_page(page, bv, j) {
page_endio(page, op_is_write(bio_op(bio)),
blk_status_to_errno(bio->bi_status));
}
}

with additional helpers to get the length and offset for the page, e.g.

bvec_page_offset(bv, idx)
bvec_page_len(bv, idx)

While this is a little more code in the callers it's a lot easier to
understand.
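
A rough sketch of those helpers (names from the suggestion above;
semantics are assumptions, not taken from the series):

	#define bvec_nr_pages(bv)					\
		DIV_ROUND_UP((bv)->bv_offset + (bv)->bv_len, PAGE_SIZE)

	/* iterate the pages spanned by one multipage bvec */
	#define bvec_for_each_page(pg, bv, idx)				\
		for ((idx) = 0;						\
		     (idx) < bvec_nr_pages(bv) &&			\
		     (((pg) = nth_page((bv)->bv_page, (idx))), 1);	\
		     (idx)++)

	static inline unsigned bvec_page_offset(const struct bio_vec *bv, int idx)
	{
		return idx ? 0 : bv->bv_offset;
	}

	static inline unsigned bvec_page_len(const struct bio_vec *bv, int idx)
	{
		/* bytes of this bvec already covered by pages 0..idx-1 */
		unsigned done = idx ? PAGE_SIZE - bv->bv_offset +
				      (idx - 1) * PAGE_SIZE : 0;

		return min_t(unsigned, PAGE_SIZE - bvec_page_offset(bv, idx),
			     bv->bv_len - done);
	}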

2017-08-10 12:18:46

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 23/49] block: blk-merge: remove unnecessary check

Looks good, and another candidate for a prep series:

Reviewed-by: Christoph Hellwig <[email protected]>

2017-08-10 12:21:05

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 22/49] block: blk-merge: try to make front segments in full size

This looks like another candidate for a standalone prep series.

2017-08-10 12:24:20

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 21/49] blk-merge: compute bio->bi_seg_front_size efficiently

On Tue, Aug 08, 2017 at 04:45:20PM +0800, Ming Lei wrote:
> It is enough to check and compute bio->bi_seg_front_size just
> after the 1st segment is found, but current code checks that
> for each bvec, which is inefficient.
>
> This patch follows the way in __blk_recalc_rq_segments()
> for computing bio->bi_seg_front_size, and it is more efficient
> and code becomes more readable too.

As far as I can tell this doesn't depend on anything else in the
series and could be sent standalone?

2017-08-10 12:28:49

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 20/49] block: introduce bio_for_each_segment_mp()

First: as mentioned in the previous patches I really hate the name
scheme with the _sp and _mp postfixes.

To be clear and understandable we should always name the versions
that iterate over segments *segment* and the ones that iterate over
pages *page*. To make sure we have a clean compile break for code
using the old _segment name I'd suggest to move to pass the bvec_iter
argument by reference, which is the right thing to do anyway.

As far as the implementation goes I don't think we actually need
to pass the mp argument down. Instead we always call the full-segment
version of bvec_iter_len / __bvec_iter_advance and then have an
inner loop that moves the fake bvecs forward inside each full-segment
one - that is implement the per-page version on top of the per-segment
one.

2017-08-11 16:55:46

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v3 46/49] fs/btrfs: convert to bio_for_each_segment_all_sp()

Hi Ming,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.13-rc4 next-20170810]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Ming-Lei/block-support-multipage-bvec/20170810-110521
config: x86_64-randconfig-b0-08112217 (attached as .config)
compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64

All warnings (new ones prefixed by >>):

fs/btrfs/raid56.c: In function 'find_logical_bio_stripe':
>> fs/btrfs/raid56.c:1368: warning: unused variable 'bia'

vim +/bia +1368 fs/btrfs/raid56.c

  1356	
  1357	/*
  1358	 * helper to find the stripe number for a given
  1359	 * bio (before mapping). Used to figure out which stripe has
  1360	 * failed. This looks up based on logical block numbers.
  1361	 */
  1362	static int find_logical_bio_stripe(struct btrfs_raid_bio *rbio,
  1363					   struct bio *bio)
  1364	{
  1365		u64 logical = bio->bi_iter.bi_sector;
  1366		u64 stripe_start;
  1367		int i;
> 1368		struct bvec_iter_all bia;
  1369	
  1370		logical <<= 9;
  1371	
  1372		for (i = 0; i < rbio->nr_data; i++) {
  1373			stripe_start = rbio->bbio->raid_map[i];
  1374			if (logical >= stripe_start &&
  1375			    logical < stripe_start + rbio->stripe_len) {
  1376				return i;
  1377			}
  1378		}
  1379		return -1;
  1380	}
  1381	
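
The variable looks like a leftover from the bio_for_each_segment_all_sp()
conversion: nothing in find_logical_bio_stripe() walks the bio any more, so
presumably (my reading of the warning, not a posted fix) the declaration can
simply be dropped, leaving only:

	u64 logical = bio->bi_iter.bi_sector;
	u64 stripe_start;
	int i;
	/* 'struct bvec_iter_all bia;' removed: it has no remaining user */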

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation


Attachments:
(No filename) (1.58 kB)
.config.gz (27.71 kB)

2017-08-18 01:25:49

by kernel test robot

[permalink] [raw]
Subject: [lkp-robot] [block] 434f2ea20d: fileio.requests_per_sec -3.1% regression


Greeting,

FYI, we noticed a -3.1% regression of fileio.requests_per_sec due to commit:


commit: 434f2ea20d5b4da12d9de87cb2838f320173f6a1 ("block: enable multipage bvecs")
url: https://github.com/0day-ci/linux/commits/Ming-Lei/block-support-multipage-bvec/20170810-110521


in testcase: fileio
on test machine: 4 threads Intel(R) Core(TM) i5-2300 CPU @ 2.80GHz with 4G memory
with following parameters:

period: 600s
nr_threads: 100%
disk: 1HDD
fs: f2fs
size: 64G
filenum: 1024f
rwmode: seqwr
iomode: sync
cpufreq_governor: performance

test-description: fileio is a subtest of SysBench benchmark suite to measure file IO performance.
test-url: https://github.com/akopytov/sysbench



Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/01org/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml

testcase/path_params/tbox_group/run: fileio/600s-100%-1HDD-f2fs-64G-1024f-seqwr-sync-performance/lkp-sb02

3d5b6047555ee331   %change   434f2ea20d5b4da12d9de87cb2
----------------             --------------------------
          7890        -3%          7642   fileio.requests_per_sec
           531         3%           548   fileio.time.elapsed_time
           531         3%           548   fileio.time.elapsed_time.max
         28163         4%         29269   interrupts.CAL:Function_call_interrupts
        564236         3%        581392   perf-stat.page-faults
        564236         3%        581391   perf-stat.minor-faults
        126044        -3%        122052   vmstat.io.bo
         29771        -3%         28842   vmstat.system.cs
           418         8%           450   iostat.sda.await
           418         8%           450   iostat.sda.w_await
        125990        -3%        122056   iostat.sda.wkB/s
     5.28 ± 5%       -43%          3.03   iostat.sda.wrqm/s


[The report also included per-sample ASCII trend charts for
fileio.requests_per_sec, perf-stat.page-faults, perf-stat.minor-faults,
fileio.time.elapsed_time, fileio.time.elapsed_time.max, iostat.sda.wrqm_s,
iostat.sda.wkB_s and vmstat.io.bo, plotting bisect-good ('*') against
bisect-bad ('O') samples.]


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Xiaolong


Attachments:
(No filename) (11.94 kB)
config-4.13.0-rc4-00187-g434f2ea (157.17 kB)
job-script (7.19 kB)
job.yaml (4.83 kB)
reproduce (550.00 B)