2018-06-05 18:04:21

by Tejun Heo

[permalink] [raw]
Subject: [PATCHSET v2] block: Separating discards from writes in Linux IO statistics

This patchset was originally posted by Michael Callahan.

https://marc.info/?l=linux-block&m=146541910129172&w=2

The patchset is refreshed on top of the current git master
v4.17-1306-g716a685 and a patch was added to add discard stats for
cgroup io.stat. The original patchset description from Michael
follows.

This patch set separates block layer statistics for discards from
writes. Discards are currently bundled with writes in the various
/sys/block/*/stat files as well as in /proc/diskstats. However
discards are nearly always used to mark storage that is no longer in
use. There are many reasons having discard not counted with writes is
useful:

1) For many non volatile memory devices it is just nice to know
that discards are enabled and working properly.

2) Discards have different performance characteristics than
writes. They are generally much faster and larger and bundling
them makes performance statistics less meaningful.

3) Discards are not writes in terms of tracking device lifetime.
If a device supports six device writes per day it is nice to know
how many writes have actually been written to the device as
discards do not count against that total.

Separation of discard statistics is accomplished by expanding the
struct diskstat arrays to 3 entries for STAT_READ, STAT_WRITE,
and STAT_DISCARD. A new rw_stat_group function is then used to
convert from rw_flags (cmd_flags from requests, bi_rw from bios)
into the appropriate stat group which is then tracked as before.
Lastly the new statistics are appended to the current
/sys/bloc/*/stat and /proc/diskstats on output such that they are
the last four entries of each. These are analogous to the four
read and write statistics.

* Number of discard ios completed
* Number of discard ios merged
* Number of discard sectors completed
* Milliseconds spent on discard requests.

[before ~]# cat sys/block/nvme0/stat
296550701 0 2372405688 67317193 19672752 0 7972237312
9375167 0 2787238 79718726

[after ~]# cat sys/block/nvme0/stat
296550701 0 2372405688 67317193 18034352 0 4616794112
9125902 0 2787238 79718726 1638400 0 3355443200
249265

Note that the discards have moved out of the write fields to the
end and that the write fields are now smaller by the difference.

Adding the new statistics to the end of /sys/block/*/stat and
/proc/diskstats is backwards compatible with both iostat and
vmstat which pick up just the old fields:

[root@after ~]# iostat -x
Linux 4.5.0_68319_ge5065f4-dirty (##hostname###) 05/17/2016
_x86_64_ (48 CPU)

avg-cpu: %user %nice %system %iowait %steal %idle
0.41 0.00 0.23 0.01 0.00 99.35

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s
avgrq-sz avgqu-sz await svctm %util
sda 0.01 7.50 0.20 3.61 16.65 708.41
190.03 0.09 22.53 1.16 0.44
nvme0n1 0.00 0.00 587.03 35.70 4696.25 9139.08
22.22 0.16 0.24 0.01 0.55


[root@after ~]# vmstat -d
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
ram0 0 0 0 0 0 0 0 0 0 0
ram1 0 0 0 0 0 0 0 0 0 0
ram2 0 0 0 0 0 0 0 0 0 0
ram3 0 0 0 0 0 0 0 0 0 0
ram4 0 0 0 0 0 0 0 0 0 0
ram5 0 0 0 0 0 0 0 0 0 0
ram6 0 0 0 0 0 0 0 0 0 0
ram7 0 0 0 0 0 0 0 0 0 0
ram8 0 0 0 0 0 0 0 0 0 0
ram9 0 0 0 0 0 0 0 0 0 0
ram10 0 0 0 0 0 0 0 0 0 0
ram11 0 0 0 0 0 0 0 0 0 0
ram12 0 0 0 0 0 0 0 0 0 0
ram13 0 0 0 0 0 0 0 0 0 0
ram14 0 0 0 0 0 0 0 0 0 0
ram15 0 0 0 0 0 0 0 0 0 0
sda 102903 5247 8408420 47424 1820306 3782041 357153613 43320121
0 2228
nvme0n1 145633279 0 1165066312 18981333 13107200 0
3355443200 6663655 0 1796
loop0 0 0 0 0 0 0 0 0 0 0
loop1 0 0 0 0 0 0 0 0 0 0
[chop rest of loop devices]


[root@after ~]# cat /sys/fs/cgroup/user.slice/io.stat
8:0 rbytes=3534848 wbytes=4096 rios=723 wios=1 dbytes=20592091136 dios=16189


This patchset contains the following six patches.

0001-block-make-bdev_ops-rw_page-take-a-REQ_OP-instead-of.patch
0002-block-Add-part_stat_read_accum-to-read-across-field-.patch
0003-block-Define-and-use-STAT_READ-and-STAT_WRITE.patch
0004-block-Add-and-use-op_stat_group-for-indexing-disk_st.patch
0005-block-Track-DISCARD-statistics-and-output-them-in-st.patch
0006-blkcg-Track-DISCARD-statistics-and-output-them-in-cg.patch

and also available in the following git branch.

git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git block-discard-stat

diffstat follows. Thanks.

Documentation/ABI/testing/procfs-diskstats | 10 ++++++++++
Documentation/admin-guide/cgroup-v2.rst | 10 ++++++----
Documentation/block/stat.txt | 28 ++++++++++++++++------------
Documentation/iostats.txt | 15 +++++++++++++++
block/bio.c | 16 +++++++++-------
block/blk-cgroup.c | 14 ++++++++++----
block/blk-core.c | 12 ++++++------
block/genhd.c | 29 ++++++++++++++++++-----------
block/partition-generic.c | 25 +++++++++++++++----------
drivers/block/brd.c | 14 +++++++-------
drivers/block/drbd/drbd_receiver.c | 3 +--
drivers/block/drbd/drbd_req.c | 4 ++--
drivers/block/drbd/drbd_worker.c | 4 +---
drivers/block/rsxx/dev.c | 6 +++---
drivers/block/zram/zram_drv.c | 19 +++++++++----------
drivers/lightnvm/pblk-cache.c | 5 +++--
drivers/lightnvm/pblk-read.c | 5 +++--
drivers/md/bcache/request.c | 13 +++++--------
drivers/md/dm.c | 6 ++++--
drivers/md/md.c | 8 ++++----
drivers/nvdimm/btt.c | 12 ++++++------
drivers/nvdimm/nd.h | 7 +++----
drivers/nvdimm/pmem.c | 13 ++++++-------
fs/block_dev.c | 6 ++++--
fs/ext4/super.c | 5 +++--
fs/ext4/sysfs.c | 6 ++++--
fs/f2fs/f2fs.h | 2 +-
fs/f2fs/super.c | 3 ++-
fs/mpage.c | 4 ++--
include/linux/bio.h | 4 ++--
include/linux/blk-cgroup.h | 5 ++++-
include/linux/blk_types.h | 20 ++++++++++++++++++++
include/linux/blkdev.h | 2 +-
include/linux/genhd.h | 14 ++++++++++----
34 files changed, 215 insertions(+), 134 deletions(-)

--
tejun


2018-06-05 18:02:20

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 4/6] block: Add and use op_stat_group() for indexing disk_stat fields.

From: Michael Callahan <[email protected]>

Add and use a new op_stat_group() function for indexing partition stat
fields rather than indexing them by rq_data_dir() or bio_data_dir().
This function works similarly to op_is_sync() in that it takes the
request::cmd_flags or bio::bi_opf flags and determines which stats
should et updated.

In addition, the second parameter to generic_start_io_acct() and
generic_end_io_acct() is now a REQ_OP rather than simply a read or
write bit and it uses op_stat_group() on the parameter to determine
the stat group.

Note that the partition in_flight counts are not part of the per-cpu
statistics and as such are not indexed via this function. It's now
indexed by op_is_write().

tj: Refreshed on top of v4.17. Updated to pass around REQ_OP.

Signed-off-by: Michael Callahan <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Joshua Morris <[email protected]>
Cc: Philipp Reisner <[email protected]>
Cc: Matias Bjorling <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Alasdair Kergon <[email protected]>
---
block/bio.c | 16 +++++++++-------
block/blk-core.c | 12 ++++++------
drivers/block/drbd/drbd_req.c | 4 ++--
drivers/block/rsxx/dev.c | 6 +++---
drivers/block/zram/zram_drv.c | 5 ++---
drivers/lightnvm/pblk-cache.c | 5 +++--
drivers/lightnvm/pblk-read.c | 5 +++--
drivers/md/bcache/request.c | 13 +++++--------
drivers/md/dm.c | 6 ++++--
drivers/md/md.c | 5 +++--
drivers/nvdimm/nd.h | 7 +++----
include/linux/bio.h | 4 ++--
include/linux/blk_types.h | 5 +++++
13 files changed, 50 insertions(+), 43 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 595663e..294455f 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1696,29 +1696,31 @@ void bio_check_pages_dirty(struct bio *bio)
}
EXPORT_SYMBOL_GPL(bio_check_pages_dirty);

-void generic_start_io_acct(struct request_queue *q, int rw,
+void generic_start_io_acct(struct request_queue *q, int op,
unsigned long sectors, struct hd_struct *part)
{
+ const int sgrp = op_stat_group(op);
int cpu = part_stat_lock();

part_round_stats(q, cpu, part);
- part_stat_inc(cpu, part, ios[rw]);
- part_stat_add(cpu, part, sectors[rw], sectors);
- part_inc_in_flight(q, part, rw);
+ part_stat_inc(cpu, part, ios[sgrp]);
+ part_stat_add(cpu, part, sectors[sgrp], sectors);
+ part_inc_in_flight(q, part, op_is_write(op));

part_stat_unlock();
}
EXPORT_SYMBOL(generic_start_io_acct);

-void generic_end_io_acct(struct request_queue *q, int rw,
+void generic_end_io_acct(struct request_queue *q, int req_op,
struct hd_struct *part, unsigned long start_time)
{
unsigned long duration = jiffies - start_time;
+ const int sgrp = op_stat_group(req_op);
int cpu = part_stat_lock();

- part_stat_add(cpu, part, ticks[rw], duration);
+ part_stat_add(cpu, part, ticks[sgrp], duration);
part_round_stats(q, cpu, part);
- part_dec_in_flight(q, part, rw);
+ part_dec_in_flight(q, part, op_is_write(req_op));

part_stat_unlock();
}
diff --git a/block/blk-core.c b/block/blk-core.c
index 3f56be1..c9ffd15 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2703,13 +2703,13 @@ EXPORT_SYMBOL_GPL(blk_rq_err_bytes);
void blk_account_io_completion(struct request *req, unsigned int bytes)
{
if (blk_do_io_stat(req)) {
- const int rw = rq_data_dir(req);
+ const int sgrp = op_stat_group(req_op(req));
struct hd_struct *part;
int cpu;

cpu = part_stat_lock();
part = req->part;
- part_stat_add(cpu, part, sectors[rw], bytes >> 9);
+ part_stat_add(cpu, part, sectors[sgrp], bytes >> 9);
part_stat_unlock();
}
}
@@ -2723,7 +2723,7 @@ void blk_account_io_done(struct request *req, u64 now)
*/
if (blk_do_io_stat(req) && !(req->rq_flags & RQF_FLUSH_SEQ)) {
unsigned long duration;
- const int rw = rq_data_dir(req);
+ const int sgrp = op_stat_group(req_op(req));
struct hd_struct *part;
int cpu;

@@ -2731,10 +2731,10 @@ void blk_account_io_done(struct request *req, u64 now)
cpu = part_stat_lock();
part = req->part;

- part_stat_inc(cpu, part, ios[rw]);
- part_stat_add(cpu, part, ticks[rw], duration);
+ part_stat_inc(cpu, part, ios[sgrp]);
+ part_stat_add(cpu, part, ticks[sgrp], duration);
part_round_stats(req->q, cpu, part);
- part_dec_in_flight(req->q, part, rw);
+ part_dec_in_flight(req->q, part, rq_data_dir(req));

hd_struct_put(part);
part_stat_unlock();
diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index a47e498..eba7b61 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -38,7 +38,7 @@ static void _drbd_start_io_acct(struct drbd_device *device, struct drbd_request
{
struct request_queue *q = device->rq_queue;

- generic_start_io_acct(q, bio_data_dir(req->master_bio),
+ generic_start_io_acct(q, bio_op(req->master_bio),
req->i.size >> 9, &device->vdisk->part0);
}

@@ -47,7 +47,7 @@ static void _drbd_end_io_acct(struct drbd_device *device, struct drbd_request *r
{
struct request_queue *q = device->rq_queue;

- generic_end_io_acct(q, bio_data_dir(req->master_bio),
+ generic_end_io_acct(q, bio_op(req->master_bio),
&device->vdisk->part0, req->start_jif);
}

diff --git a/drivers/block/rsxx/dev.c b/drivers/block/rsxx/dev.c
index dddb3f2..1a92f9e 100644
--- a/drivers/block/rsxx/dev.c
+++ b/drivers/block/rsxx/dev.c
@@ -112,7 +112,7 @@ static const struct block_device_operations rsxx_fops = {

static void disk_stats_start(struct rsxx_cardinfo *card, struct bio *bio)
{
- generic_start_io_acct(card->queue, bio_data_dir(bio), bio_sectors(bio),
+ generic_start_io_acct(card->queue, bio_op(bio), bio_sectors(bio),
&card->gendisk->part0);
}

@@ -120,8 +120,8 @@ static void disk_stats_complete(struct rsxx_cardinfo *card,
struct bio *bio,
unsigned long start_time)
{
- generic_end_io_acct(card->queue, bio_data_dir(bio),
- &card->gendisk->part0, start_time);
+ generic_end_io_acct(card->queue, bio_op(bio),
+ &card->gendisk->part0, start_time);
}

static void bio_dma_done_cb(struct rsxx_cardinfo *card,
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index dd3da44..f3aaf26 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1148,11 +1148,10 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
int offset, unsigned int op, struct bio *bio)
{
unsigned long start_time = jiffies;
- int rw_acct = op_is_write(op) ? REQ_OP_WRITE : REQ_OP_READ;
struct request_queue *q = zram->disk->queue;
int ret;

- generic_start_io_acct(q, rw_acct, bvec->bv_len >> SECTOR_SHIFT,
+ generic_start_io_acct(q, op, bvec->bv_len >> SECTOR_SHIFT,
&zram->disk->part0);

if (!op_is_write(op)) {
@@ -1164,7 +1163,7 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
ret = zram_bvec_write(zram, bvec, index, offset, bio);
}

- generic_end_io_acct(q, rw_acct, &zram->disk->part0, start_time);
+ generic_end_io_acct(q, op, &zram->disk->part0, start_time);

if (unlikely(ret < 0)) {
if (!op_is_write(op))
diff --git a/drivers/lightnvm/pblk-cache.c b/drivers/lightnvm/pblk-cache.c
index b1c6d7e..643e276 100644
--- a/drivers/lightnvm/pblk-cache.c
+++ b/drivers/lightnvm/pblk-cache.c
@@ -27,7 +27,8 @@ int pblk_write_to_cache(struct pblk *pblk, struct bio *bio, unsigned long flags)
int nr_entries = pblk_get_secs(bio);
int i, ret;

- generic_start_io_acct(q, WRITE, bio_sectors(bio), &pblk->disk->part0);
+ generic_start_io_acct(q, REQ_OP_WRITE, bio_sectors(bio),
+ &pblk->disk->part0);

/* Update the write buffer head (mem) with the entries that we can
* write. The write in itself cannot fail, so there is no need to
@@ -75,7 +76,7 @@ int pblk_write_to_cache(struct pblk *pblk, struct bio *bio, unsigned long flags)
pblk_rl_inserted(&pblk->rl, nr_entries);

out:
- generic_end_io_acct(q, WRITE, &pblk->disk->part0, start_time);
+ generic_end_io_acct(q, REQ_OP_WRITE, &pblk->disk->part0, start_time);
pblk_write_should_kick(pblk);
return ret;
}
diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c
index 1869469..cf90cee 100644
--- a/drivers/lightnvm/pblk-read.c
+++ b/drivers/lightnvm/pblk-read.c
@@ -199,7 +199,7 @@ static void __pblk_end_io_read(struct pblk *pblk, struct nvm_rq *rqd,
struct bio *int_bio = rqd->bio;
unsigned long start_time = r_ctx->start_time;

- generic_end_io_acct(dev->q, READ, &pblk->disk->part0, start_time);
+ generic_end_io_acct(dev->q, REQ_OP_READ, &pblk->disk->part0, start_time);

if (rqd->error)
pblk_log_read_err(pblk, rqd);
@@ -411,7 +411,8 @@ int pblk_submit_read(struct pblk *pblk, struct bio *bio)
return NVM_IO_ERR;
}

- generic_start_io_acct(q, READ, bio_sectors(bio), &pblk->disk->part0);
+ generic_start_io_acct(q, REQ_OP_READ, bio_sectors(bio),
+ &pblk->disk->part0);

bitmap_zero(&read_bitmap, nr_secs);

diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index ae67f5f..97707b0 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -667,8 +667,7 @@ static void backing_request_endio(struct bio *bio)
static void bio_complete(struct search *s)
{
if (s->orig_bio) {
- generic_end_io_acct(s->d->disk->queue,
- bio_data_dir(s->orig_bio),
+ generic_end_io_acct(s->d->disk->queue, bio_op(s->orig_bio),
&s->d->disk->part0, s->start_time);

trace_bcache_request_end(s->d, s->orig_bio);
@@ -1062,8 +1061,7 @@ static void detached_dev_end_io(struct bio *bio)
bio->bi_end_io = ddip->bi_end_io;
bio->bi_private = ddip->bi_private;

- generic_end_io_acct(ddip->d->disk->queue,
- bio_data_dir(bio),
+ generic_end_io_acct(ddip->d->disk->queue, bio_op(bio),
&ddip->d->disk->part0, ddip->start_time);

if (bio->bi_status) {
@@ -1120,7 +1118,7 @@ static blk_qc_t cached_dev_make_request(struct request_queue *q,
}

atomic_set(&dc->backing_idle, 0);
- generic_start_io_acct(q, rw, bio_sectors(bio), &d->disk->part0);
+ generic_start_io_acct(q, bio_op(bio), bio_sectors(bio), &d->disk->part0);

bio_set_dev(bio, dc->bdev);
bio->bi_iter.bi_sector += dc->sb.data_offset;
@@ -1229,7 +1227,6 @@ static blk_qc_t flash_dev_make_request(struct request_queue *q,
struct search *s;
struct closure *cl;
struct bcache_device *d = bio->bi_disk->private_data;
- int rw = bio_data_dir(bio);

if (unlikely(d->c && test_bit(CACHE_SET_IO_DISABLE, &d->c->flags))) {
bio->bi_status = BLK_STS_IOERR;
@@ -1237,7 +1234,7 @@ static blk_qc_t flash_dev_make_request(struct request_queue *q,
return BLK_QC_T_NONE;
}

- generic_start_io_acct(q, rw, bio_sectors(bio), &d->disk->part0);
+ generic_start_io_acct(q, bio_op(bio), bio_sectors(bio), &d->disk->part0);

s = search_alloc(bio, d);
cl = &s->cl;
@@ -1254,7 +1251,7 @@ static blk_qc_t flash_dev_make_request(struct request_queue *q,
flash_dev_nodata,
bcache_wq);
return BLK_QC_T_NONE;
- } else if (rw) {
+ } else if (bio_data_dir(bio)) {
bch_keybuf_check_overlapping(&s->iop.c->moving_gc_keys,
&KEY(d->id, bio->bi_iter.bi_sector, 0),
&KEY(d->id, bio_end_sector(bio), 0));
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 98dff36..8ac347e 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -609,7 +609,8 @@ static void start_io_acct(struct dm_io *io)

io->start_time = jiffies;

- generic_start_io_acct(md->queue, rw, bio_sectors(bio), &dm_disk(md)->part0);
+ generic_start_io_acct(md->queue, bio_op(bio), bio_sectors(bio),
+ &dm_disk(md)->part0);

atomic_set(&dm_disk(md)->part0.in_flight[rw],
atomic_inc_return(&md->pending[rw]));
@@ -628,7 +629,8 @@ static void end_io_acct(struct dm_io *io)
int pending;
int rw = bio_data_dir(bio);

- generic_end_io_acct(md->queue, rw, &dm_disk(md)->part0, io->start_time);
+ generic_end_io_acct(md->queue, bio_op(bio), &dm_disk(md)->part0,
+ io->start_time);

if (unlikely(dm_stats_used(&md->stats)))
dm_stats_account_io(&md->stats, bio_data_dir(bio),
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 817dd97..9f29cde 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -315,6 +315,7 @@ EXPORT_SYMBOL(md_handle_request);
static blk_qc_t md_make_request(struct request_queue *q, struct bio *bio)
{
const int rw = bio_data_dir(bio);
+ const int sgrp = op_stat_group(bio_op(bio));
struct mddev *mddev = q->queuedata;
unsigned int sectors;
int cpu;
@@ -343,8 +344,8 @@ static blk_qc_t md_make_request(struct request_queue *q, struct bio *bio)
md_handle_request(mddev, bio);

cpu = part_stat_lock();
- part_stat_inc(cpu, &mddev->gendisk->part0, ios[rw]);
- part_stat_add(cpu, &mddev->gendisk->part0, sectors[rw], sectors);
+ part_stat_inc(cpu, &mddev->gendisk->part0, ios[sgrp]);
+ part_stat_add(cpu, &mddev->gendisk->part0, sectors[sgrp], sectors);
part_stat_unlock();

return BLK_QC_T_NONE;
diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
index 32e0364..6ee7fd7 100644
--- a/drivers/nvdimm/nd.h
+++ b/drivers/nvdimm/nd.h
@@ -396,16 +396,15 @@ static inline bool nd_iostat_start(struct bio *bio, unsigned long *start)
return false;

*start = jiffies;
- generic_start_io_acct(disk->queue, bio_data_dir(bio),
- bio_sectors(bio), &disk->part0);
+ generic_start_io_acct(disk->queue, bio_op(bio), bio_sectors(bio),
+ &disk->part0);
return true;
}
static inline void nd_iostat_end(struct bio *bio, unsigned long start)
{
struct gendisk *disk = bio->bi_disk;

- generic_end_io_acct(disk->queue, bio_data_dir(bio), &disk->part0,
- start);
+ generic_end_io_acct(disk->queue, bio_op(bio), &disk->part0, start);
}
static inline bool is_bad_pmem(struct badblocks *bb, sector_t sector,
unsigned int len)
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 810a8be..e95f68e 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -486,9 +486,9 @@ extern struct bio *bio_copy_kern(struct request_queue *, void *, unsigned int,
extern void bio_set_pages_dirty(struct bio *bio);
extern void bio_check_pages_dirty(struct bio *bio);

-void generic_start_io_acct(struct request_queue *q, int rw,
+void generic_start_io_acct(struct request_queue *q, int op,
unsigned long sectors, struct hd_struct *part);
-void generic_end_io_acct(struct request_queue *q, int rw,
+void generic_end_io_acct(struct request_queue *q, int op,
struct hd_struct *part,
unsigned long start_time);

diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 5bdf8c3..698890a 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -402,6 +402,11 @@ static inline bool op_is_sync(unsigned int op)
(op & (REQ_SYNC | REQ_FUA | REQ_PREFLUSH));
}

+static inline int op_stat_group(unsigned int op)
+{
+ return op_is_write(op);
+}
+
typedef unsigned int blk_qc_t;
#define BLK_QC_T_NONE -1U
#define BLK_QC_T_SHIFT 16
--
2.9.5


2018-06-05 18:02:27

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 6/6] blkcg: Track DISCARD statistics and output them in cgroup io.stat

Add tracking of REQ_OP_DISCARD ios to the per-cgroup io.stat. Two
fields, dbytes and dios, to respectively count the total bytes and
number of discards are added.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Andy Newell <[email protected]>
Cc: Michael Callahan <[email protected]>
---
Documentation/admin-guide/cgroup-v2.rst | 10 ++++++----
block/blk-cgroup.c | 14 ++++++++++----
include/linux/blk-cgroup.h | 5 ++++-
3 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 74cdeae..7f7a161 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1263,17 +1263,19 @@ IO Interface Files
Lines are keyed by $MAJ:$MIN device numbers and not ordered.
The following nested keys are defined.

- ====== ===================
+ ====== =====================
rbytes Bytes read
wbytes Bytes written
rios Number of read IOs
wios Number of write IOs
- ====== ===================
+ dbytes Bytes discarded
+ dios Number of discard IOs
+ ====== =====================

An example read output follows:

- 8:16 rbytes=1459200 wbytes=314773504 rios=192 wios=353
- 8:0 rbytes=90430464 wbytes=299008000 rios=8950 wios=1252
+ 8:16 rbytes=1459200 wbytes=314773504 rios=192 wios=353 dbytes=0 dios=0
+ 8:0 rbytes=90430464 wbytes=299008000 rios=8950 wios=1252 dbytes=50331648 dios=3021

io.weight
A read-write flat-keyed file which exists on non-root cgroups.
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index eb85cb8..513d1ea 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -564,6 +564,7 @@ u64 __blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd,
[BLKG_RWSTAT_WRITE] = "Write",
[BLKG_RWSTAT_SYNC] = "Sync",
[BLKG_RWSTAT_ASYNC] = "Async",
+ [BLKG_RWSTAT_DISCARD] = "Discard",
};
const char *dname = blkg_dev_name(pd->blkg);
u64 v;
@@ -577,7 +578,8 @@ u64 __blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd,
(unsigned long long)atomic64_read(&rwstat->aux_cnt[i]));

v = atomic64_read(&rwstat->aux_cnt[BLKG_RWSTAT_READ]) +
- atomic64_read(&rwstat->aux_cnt[BLKG_RWSTAT_WRITE]);
+ atomic64_read(&rwstat->aux_cnt[BLKG_RWSTAT_WRITE]) +
+ atomic64_read(&rwstat->aux_cnt[BLKG_RWSTAT_DISCARD]);
seq_printf(sf, "%s Total %llu\n", dname, (unsigned long long)v);
return v;
}
@@ -955,7 +957,7 @@ static int blkcg_print_stat(struct seq_file *sf, void *v)
hlist_for_each_entry_rcu(blkg, &blkcg->blkg_list, blkcg_node) {
const char *dname;
struct blkg_rwstat rwstat;
- u64 rbytes, wbytes, rios, wios;
+ u64 rbytes, wbytes, rios, wios, dbytes, dios;

dname = blkg_dev_name(blkg);
if (!dname)
@@ -967,17 +969,21 @@ static int blkcg_print_stat(struct seq_file *sf, void *v)
offsetof(struct blkcg_gq, stat_bytes));
rbytes = atomic64_read(&rwstat.aux_cnt[BLKG_RWSTAT_READ]);
wbytes = atomic64_read(&rwstat.aux_cnt[BLKG_RWSTAT_WRITE]);
+ dbytes = atomic64_read(&rwstat.aux_cnt[BLKG_RWSTAT_DISCARD]);

rwstat = blkg_rwstat_recursive_sum(blkg, NULL,
offsetof(struct blkcg_gq, stat_ios));
rios = atomic64_read(&rwstat.aux_cnt[BLKG_RWSTAT_READ]);
wios = atomic64_read(&rwstat.aux_cnt[BLKG_RWSTAT_WRITE]);
+ dios = atomic64_read(&rwstat.aux_cnt[BLKG_RWSTAT_DISCARD]);

spin_unlock_irq(blkg->q->queue_lock);

if (rbytes || wbytes || rios || wios)
- seq_printf(sf, "%s rbytes=%llu wbytes=%llu rios=%llu wios=%llu\n",
- dname, rbytes, wbytes, rios, wios);
+ seq_printf(sf, "%s rbytes=%llu wbytes=%llu "
+ "rios=%llu wios=%llu dbytes=%llu dios=%llu\n",
+ dname, rbytes, wbytes, rios, wios,
+ dbytes, dios);
}

rcu_read_unlock();
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 6c666fd..bdb5c58d 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -35,6 +35,7 @@ enum blkg_rwstat_type {
BLKG_RWSTAT_WRITE,
BLKG_RWSTAT_SYNC,
BLKG_RWSTAT_ASYNC,
+ BLKG_RWSTAT_DISCARD,

BLKG_RWSTAT_NR,
BLKG_RWSTAT_TOTAL = BLKG_RWSTAT_NR,
@@ -589,7 +590,9 @@ static inline void blkg_rwstat_add(struct blkg_rwstat *rwstat,
{
struct percpu_counter *cnt;

- if (op_is_write(op))
+ if (op_is_discard(op))
+ cnt = &rwstat->cpu_cnt[BLKG_RWSTAT_DISCARD];
+ else if (op_is_write(op))
cnt = &rwstat->cpu_cnt[BLKG_RWSTAT_WRITE];
else
cnt = &rwstat->cpu_cnt[BLKG_RWSTAT_READ];
--
2.9.5


2018-06-05 18:02:46

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 5/6] block: Track DISCARD statistics and output them in stat and diskstat

From: Michael Callahan <[email protected]>

Add tracking of REQ_OP_DISCARD ios to the partition statistics and
append them to the various stat files in /sys as well as
/proc/diskstats. These are tracked with the same four stats as reads
and writes:

Number of discard ios completed.
Number of discard ios merged
Number of discard sectors completed
Milliseconds spent on discard requests

This is done via adding a new STAT_DISCARD define to genhd.h and then
using it to index that stat field for discard requests.

tj: Refreshed on top of v4.17 and other previous updates.

Signed-off-by: Michael Callahan <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
Cc: Andy Newell <[email protected]>
---
Documentation/ABI/testing/procfs-diskstats | 10 ++++++++++
Documentation/block/stat.txt | 28 ++++++++++++++++------------
Documentation/iostats.txt | 15 +++++++++++++++
block/genhd.c | 13 ++++++++++---
block/partition-generic.c | 9 +++++++--
include/linux/blk_types.h | 8 ++++++++
include/linux/genhd.h | 3 ++-
7 files changed, 68 insertions(+), 18 deletions(-)

diff --git a/Documentation/ABI/testing/procfs-diskstats b/Documentation/ABI/testing/procfs-diskstats
index f91a973..abac31d 100644
--- a/Documentation/ABI/testing/procfs-diskstats
+++ b/Documentation/ABI/testing/procfs-diskstats
@@ -5,6 +5,7 @@ Contact: Jerome Marchand <[email protected]>
The /proc/diskstats file displays the I/O statistics
of block devices. Each line contains the following 14
fields:
+
1 - major number
2 - minor mumber
3 - device name
@@ -19,4 +20,13 @@ Contact: Jerome Marchand <[email protected]>
12 - I/Os currently in progress
13 - time spent doing I/Os (ms)
14 - weighted time spent doing I/Os (ms)
+
+ Kernel 4.18+ appends four more fields for discard
+ tracking putting the total at 18:
+
+ 15 - discards completed successfully
+ 16 - discards merged
+ 17 - sectors discarded
+ 18 - time spent discarding
+
For more details refer to Documentation/iostats.txt
diff --git a/Documentation/block/stat.txt b/Documentation/block/stat.txt
index 0dbc946..0aace9c 100644
--- a/Documentation/block/stat.txt
+++ b/Documentation/block/stat.txt
@@ -31,28 +31,32 @@ write ticks milliseconds total wait time for write requests
in_flight requests number of I/Os currently in flight
io_ticks milliseconds total time this block device has been active
time_in_queue milliseconds total wait time for all requests
+discard I/Os requests number of discard I/Os processed
+discard merges requests number of discard I/Os merged with in-queue I/O
+discard sectors sectors number of sectors discarded
+discard ticks milliseconds total wait time for discard requests

-read I/Os, write I/Os
-=====================
+read I/Os, write I/Os, discard I/0s
+===================================

These values increment when an I/O request completes.

-read merges, write merges
-=========================
+read merges, write merges, discard merges
+=========================================

These values increment when an I/O request is merged with an
already-queued I/O request.

-read sectors, write sectors
-===========================
+read sectors, write sectors, discard_sectors
+============================================

-These values count the number of sectors read from or written to this
-block device. The "sectors" in question are the standard UNIX 512-byte
-sectors, not any device- or filesystem-specific block size. The
-counters are incremented when the I/O completes.
+These values count the number of sectors read from, written to, or
+discarded from this block device. The "sectors" in question are the
+standard UNIX 512-byte sectors, not any device- or filesystem-specific
+block size. The counters are incremented when the I/O completes.

-read ticks, write ticks
-=======================
+read ticks, write ticks, discard ticks
+======================================

These values count the number of milliseconds that I/O requests have
waited on this block device. If there are multiple I/O requests waiting,
diff --git a/Documentation/iostats.txt b/Documentation/iostats.txt
index 04d394a..49df45f 100644
--- a/Documentation/iostats.txt
+++ b/Documentation/iostats.txt
@@ -31,6 +31,9 @@ and so should not differ.
3 0 hda 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160
3 1 hda1 35486 38030 38030 38030

+ 4.18+ diskstats:
+ 3 0 hda 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160 0 0 0 0
+
On 2.4 you might execute ``grep 'hda ' /proc/partitions``. On 2.6+, you have
a choice of ``cat /sys/block/hda/stat`` or ``grep 'hda ' /proc/diskstats``.

@@ -101,6 +104,18 @@ Field 11 -- weighted # of milliseconds spent doing I/Os
last update of this field. This can provide an easy measure of both
I/O completion time and the backlog that may be accumulating.

+Field 12 -- # of discards completed
+ This is the total number of discards completed successfully.
+
+Field 13 -- # of discards merged
+ See the description of field 2
+
+Field 14 -- # of sectors discarded
+ This is the total number of sectors discarded successfully.
+
+Field 15 -- # of milliseconds spent discarding
+ This is the total number of milliseconds spent by all discards (as
+ measured from __make_request() to end_that_request_last()).

To avoid introducing performance bottlenecks, no locks are held while
modifying these counters. This implies that minor inaccuracies may be
diff --git a/block/genhd.c b/block/genhd.c
index 0711a80..8cc719a3 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1333,8 +1333,11 @@ static int diskstats_show(struct seq_file *seqf, void *v)
part_round_stats(gp->queue, cpu, hd);
part_stat_unlock();
part_in_flight(gp->queue, hd, inflight);
- seq_printf(seqf, "%4d %7d %s %lu %lu %lu "
- "%u %lu %lu %lu %u %u %u %u\n",
+ seq_printf(seqf, "%4d %7d %s "
+ "%lu %lu %lu %u "
+ "%lu %lu %lu %u "
+ "%u %u %u "
+ "%lu %lu %lu %u\n",
MAJOR(part_devt(hd)), MINOR(part_devt(hd)),
disk_name(gp, hd->partno, buf),
part_stat_read(hd, ios[STAT_READ]),
@@ -1347,7 +1350,11 @@ static int diskstats_show(struct seq_file *seqf, void *v)
jiffies_to_msecs(part_stat_read(hd, ticks[STAT_WRITE])),
inflight[0],
jiffies_to_msecs(part_stat_read(hd, io_ticks)),
- jiffies_to_msecs(part_stat_read(hd, time_in_queue))
+ jiffies_to_msecs(part_stat_read(hd, time_in_queue)),
+ part_stat_read(hd, ios[STAT_DISCARD]),
+ part_stat_read(hd, merges[STAT_DISCARD]),
+ part_stat_read(hd, sectors[STAT_DISCARD]),
+ jiffies_to_msecs(part_stat_read(hd, ticks[STAT_DISCARD]))
);
}
disk_part_iter_exit(&piter);
diff --git a/block/partition-generic.c b/block/partition-generic.c
index 0ddb067..5a8975a 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -130,7 +130,8 @@ ssize_t part_stat_show(struct device *dev,
return sprintf(buf,
"%8lu %8lu %8llu %8u "
"%8lu %8lu %8llu %8u "
- "%8u %8u %8u"
+ "%8u %8u %8u "
+ "%8lu %8lu %8llu %8u"
"\n",
part_stat_read(p, ios[STAT_READ]),
part_stat_read(p, merges[STAT_READ]),
@@ -142,7 +143,11 @@ ssize_t part_stat_show(struct device *dev,
jiffies_to_msecs(part_stat_read(p, ticks[STAT_WRITE])),
inflight[0],
jiffies_to_msecs(part_stat_read(p, io_ticks)),
- jiffies_to_msecs(part_stat_read(p, time_in_queue)));
+ jiffies_to_msecs(part_stat_read(p, time_in_queue)),
+ part_stat_read(p, ios[STAT_DISCARD]),
+ part_stat_read(p, merges[STAT_DISCARD]),
+ (unsigned long long)part_stat_read(p, sectors[STAT_DISCARD]),
+ jiffies_to_msecs(part_stat_read(p, ticks[STAT_DISCARD])));
}

ssize_t part_inflight_show(struct device *dev, struct device_attribute *attr,
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 698890a..6d72eb2 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -361,6 +361,7 @@ enum req_flag_bits {
enum stat_group {
STAT_READ,
STAT_WRITE,
+ STAT_DISCARD,

NR_STAT_GROUPS
};
@@ -402,8 +403,15 @@ static inline bool op_is_sync(unsigned int op)
(op & (REQ_SYNC | REQ_FUA | REQ_PREFLUSH));
}

+static inline bool op_is_discard(unsigned int op)
+{
+ return (op & REQ_OP_MASK) == REQ_OP_DISCARD;
+}
+
static inline int op_stat_group(unsigned int op)
{
+ if (op_is_discard(op))
+ return STAT_DISCARD;
return op_is_write(op);
}

diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index a754454..5786442 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -356,7 +356,8 @@ static inline void free_part_stats(struct hd_struct *part)

#define part_stat_read_accum(part, field) \
(part_stat_read(part, field[STAT_READ]) + \
- part_stat_read(part, field[STAT_WRITE]))
+ part_stat_read(part, field[STAT_WRITE]) + \
+ part_stat_read(part, field[STAT_DISCARD]))

#define part_stat_add(cpu, part, field, addnd) do { \
__part_stat_add((cpu), (part), field, addnd); \
--
2.9.5


2018-06-05 18:02:56

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 2/6] block: Add part_stat_read_accum to read across field entries.

From: Michael Callahan <[email protected]>

Add a part_stat_read_accum macro to genhd.h to read and sum across
field entries. For example to sum up the number read and write
sectors completed. In addition to being ar reasonable cleanup by
itself this will make it easier to add new stat fields in the future.

tj: Refreshed on top of v4.17.

Signed-off-by: Michael Callahan <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
---
drivers/block/drbd/drbd_receiver.c | 3 +--
drivers/block/drbd/drbd_worker.c | 4 +---
drivers/md/md.c | 3 +--
include/linux/genhd.h | 4 ++++
4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index be9450f..d7e6973 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -2674,8 +2674,7 @@ bool drbd_rs_c_min_rate_throttle(struct drbd_device *device)
if (c_min_rate == 0)
return false;

- curr_events = (int)part_stat_read(&disk->part0, sectors[0]) +
- (int)part_stat_read(&disk->part0, sectors[1]) -
+ curr_events = (int)part_stat_read_accum(&disk->part0, sectors) -
atomic_read(&device->rs_sect_ev);

if (atomic_read(&device->ap_actlog_cnt)
diff --git a/drivers/block/drbd/drbd_worker.c b/drivers/block/drbd/drbd_worker.c
index 1476cb3..2063804 100644
--- a/drivers/block/drbd/drbd_worker.c
+++ b/drivers/block/drbd/drbd_worker.c
@@ -1690,9 +1690,7 @@ void drbd_rs_controller_reset(struct drbd_device *device)
atomic_set(&device->rs_sect_in, 0);
atomic_set(&device->rs_sect_ev, 0);
device->rs_in_flight = 0;
- device->rs_last_events =
- (int)part_stat_read(&disk->part0, sectors[0]) +
- (int)part_stat_read(&disk->part0, sectors[1]);
+ device->rs_last_events = (int)part_stat_read_accum(&disk->part0, sectors);

/* Updating the RCU protected object in place is necessary since
this function gets called from atomic context.
diff --git a/drivers/md/md.c b/drivers/md/md.c
index fc692b7..817dd97 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8001,8 +8001,7 @@ static int is_mddev_idle(struct mddev *mddev, int init)
rcu_read_lock();
rdev_for_each_rcu(rdev, mddev) {
struct gendisk *disk = rdev->bdev->bd_contains->bd_disk;
- curr_events = (int)part_stat_read(&disk->part0, sectors[0]) +
- (int)part_stat_read(&disk->part0, sectors[1]) -
+ curr_events = (int)part_stat_read_accum(&disk->part0, sectors) -
atomic_read(&disk->sync_io);
/* sync IO will cause sync_io to increase before the disk_stats
* as sync_io is counted when a request starts, and
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 6cb8a57..19f36fa 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -353,6 +353,10 @@ static inline void free_part_stats(struct hd_struct *part)

#endif /* CONFIG_SMP */

+#define part_stat_read_accum(part, field) \
+ (part_stat_read(part, field[0]) + \
+ part_stat_read(part, field[1]))
+
#define part_stat_add(cpu, part, field, addnd) do { \
__part_stat_add((cpu), (part), field, addnd); \
if ((part)->partno) \
--
2.9.5


2018-06-05 18:03:09

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 3/6] block: Define and use STAT_READ and STAT_WRITE

From: Michael Callahan <[email protected]>

Add defines for STAT_READ and STAT_WRITE for indexing the partition
stat entries. This clarifies some fs/ code which has hardcoded 1 for
STAT_WRITE and will make it easier to extend the stats with additional
fields.

tj: Refreshed on top of v4.17.

Signed-off-by: Michael Callahan <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
Cc: "Theodore Ts'o" <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
---
block/genhd.c | 16 ++++++++--------
block/partition-generic.c | 16 ++++++++--------
fs/ext4/super.c | 5 +++--
fs/ext4/sysfs.c | 6 ++++--
fs/f2fs/f2fs.h | 2 +-
fs/f2fs/super.c | 3 ++-
include/linux/blk_types.h | 7 +++++++
include/linux/genhd.h | 13 +++++++------
8 files changed, 40 insertions(+), 28 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index f1543a4..0711a80 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1337,14 +1337,14 @@ static int diskstats_show(struct seq_file *seqf, void *v)
"%u %lu %lu %lu %u %u %u %u\n",
MAJOR(part_devt(hd)), MINOR(part_devt(hd)),
disk_name(gp, hd->partno, buf),
- part_stat_read(hd, ios[READ]),
- part_stat_read(hd, merges[READ]),
- part_stat_read(hd, sectors[READ]),
- jiffies_to_msecs(part_stat_read(hd, ticks[READ])),
- part_stat_read(hd, ios[WRITE]),
- part_stat_read(hd, merges[WRITE]),
- part_stat_read(hd, sectors[WRITE]),
- jiffies_to_msecs(part_stat_read(hd, ticks[WRITE])),
+ part_stat_read(hd, ios[STAT_READ]),
+ part_stat_read(hd, merges[STAT_READ]),
+ part_stat_read(hd, sectors[STAT_READ]),
+ jiffies_to_msecs(part_stat_read(hd, ticks[STAT_READ])),
+ part_stat_read(hd, ios[STAT_WRITE]),
+ part_stat_read(hd, merges[STAT_WRITE]),
+ part_stat_read(hd, sectors[STAT_WRITE]),
+ jiffies_to_msecs(part_stat_read(hd, ticks[STAT_WRITE])),
inflight[0],
jiffies_to_msecs(part_stat_read(hd, io_ticks)),
jiffies_to_msecs(part_stat_read(hd, time_in_queue))
diff --git a/block/partition-generic.c b/block/partition-generic.c
index 3dcfd4e..0ddb067 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -132,14 +132,14 @@ ssize_t part_stat_show(struct device *dev,
"%8lu %8lu %8llu %8u "
"%8u %8u %8u"
"\n",
- part_stat_read(p, ios[READ]),
- part_stat_read(p, merges[READ]),
- (unsigned long long)part_stat_read(p, sectors[READ]),
- jiffies_to_msecs(part_stat_read(p, ticks[READ])),
- part_stat_read(p, ios[WRITE]),
- part_stat_read(p, merges[WRITE]),
- (unsigned long long)part_stat_read(p, sectors[WRITE]),
- jiffies_to_msecs(part_stat_read(p, ticks[WRITE])),
+ part_stat_read(p, ios[STAT_READ]),
+ part_stat_read(p, merges[STAT_READ]),
+ (unsigned long long)part_stat_read(p, sectors[STAT_READ]),
+ jiffies_to_msecs(part_stat_read(p, ticks[STAT_READ])),
+ part_stat_read(p, ios[STAT_WRITE]),
+ part_stat_read(p, merges[STAT_WRITE]),
+ (unsigned long long)part_stat_read(p, sectors[STAT_WRITE]),
+ jiffies_to_msecs(part_stat_read(p, ticks[STAT_WRITE])),
inflight[0],
jiffies_to_msecs(part_stat_read(p, io_ticks)),
jiffies_to_msecs(part_stat_read(p, time_in_queue)));
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index eb104e8..b0bda4d 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3445,7 +3445,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
sbi->s_sb_block = sb_block;
if (sb->s_bdev->bd_part)
sbi->s_sectors_written_start =
- part_stat_read(sb->s_bdev->bd_part, sectors[1]);
+ part_stat_read(sb->s_bdev->bd_part, sectors[STAT_WRITE]);

/* Cleanup superblock name */
strreplace(sb->s_id, '/', '!');
@@ -4724,7 +4724,8 @@ static int ext4_commit_super(struct super_block *sb, int sync)
if (sb->s_bdev->bd_part)
es->s_kbytes_written =
cpu_to_le64(EXT4_SB(sb)->s_kbytes_written +
- ((part_stat_read(sb->s_bdev->bd_part, sectors[1]) -
+ ((part_stat_read(sb->s_bdev->bd_part,
+ sectors[STAT_WRITE]) -
EXT4_SB(sb)->s_sectors_written_start) >> 1));
else
es->s_kbytes_written =
diff --git a/fs/ext4/sysfs.c b/fs/ext4/sysfs.c
index f34da0b..2be9ad7 100644
--- a/fs/ext4/sysfs.c
+++ b/fs/ext4/sysfs.c
@@ -56,7 +56,8 @@ static ssize_t session_write_kbytes_show(struct ext4_sb_info *sbi, char *buf)
if (!sb->s_bdev->bd_part)
return snprintf(buf, PAGE_SIZE, "0\n");
return snprintf(buf, PAGE_SIZE, "%lu\n",
- (part_stat_read(sb->s_bdev->bd_part, sectors[1]) -
+ (part_stat_read(sb->s_bdev->bd_part,
+ sectors[STAT_WRITE]) -
sbi->s_sectors_written_start) >> 1);
}

@@ -68,7 +69,8 @@ static ssize_t lifetime_write_kbytes_show(struct ext4_sb_info *sbi, char *buf)
return snprintf(buf, PAGE_SIZE, "0\n");
return snprintf(buf, PAGE_SIZE, "%llu\n",
(unsigned long long)(sbi->s_kbytes_written +
- ((part_stat_read(sb->s_bdev->bd_part, sectors[1]) -
+ ((part_stat_read(sb->s_bdev->bd_part,
+ sectors[STAT_WRITE]) -
EXT4_SB(sb)->s_sectors_written_start) >> 1)));
}

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 1df7f10..43b4a4c 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1286,7 +1286,7 @@ static inline bool time_to_inject(struct f2fs_sb_info *sbi, int type)
* and the return value is in kbytes. s is of struct f2fs_sb_info.
*/
#define BD_PART_WRITTEN(s) \
-(((u64)part_stat_read((s)->sb->s_bdev->bd_part, sectors[1]) - \
+(((u64)part_stat_read((s)->sb->s_bdev->bd_part, sectors[STAT_WRITE]) - \
(s)->sectors_written_start) >> 1)

static inline void f2fs_update_time(struct f2fs_sb_info *sbi, int type)
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 42d564c..88d44ca 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -2816,7 +2816,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
/* For write statistics */
if (sb->s_bdev->bd_part)
sbi->sectors_written_start =
- (u64)part_stat_read(sb->s_bdev->bd_part, sectors[1]);
+ (u64)part_stat_read(sb->s_bdev->bd_part,
+ sectors[STAT_WRITE]);

/* Read accumulated write IO statistics if exists */
seg_i = CURSEG_I(sbi, CURSEG_HOT_NODE);
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 3c4f390..5bdf8c3 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -358,6 +358,13 @@ enum req_flag_bits {
#define REQ_NOMERGE_FLAGS \
(REQ_NOMERGE | REQ_PREFLUSH | REQ_FUA)

+enum stat_group {
+ STAT_READ,
+ STAT_WRITE,
+
+ NR_STAT_GROUPS
+};
+
#define bio_op(bio) \
((bio)->bi_opf & REQ_OP_MASK)
#define req_op(req) \
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 19f36fa..a754454 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -16,6 +16,7 @@
#include <linux/slab.h>
#include <linux/percpu-refcount.h>
#include <linux/uuid.h>
+#include <linux/blk_types.h>

#ifdef CONFIG_BLOCK

@@ -82,10 +83,10 @@ struct partition {
} __attribute__((packed));

struct disk_stats {
- unsigned long sectors[2]; /* READs and WRITEs */
- unsigned long ios[2];
- unsigned long merges[2];
- unsigned long ticks[2];
+ unsigned long sectors[NR_STAT_GROUPS];
+ unsigned long ios[NR_STAT_GROUPS];
+ unsigned long merges[NR_STAT_GROUPS];
+ unsigned long ticks[NR_STAT_GROUPS];
unsigned long io_ticks;
unsigned long time_in_queue;
};
@@ -354,8 +355,8 @@ static inline void free_part_stats(struct hd_struct *part)
#endif /* CONFIG_SMP */

#define part_stat_read_accum(part, field) \
- (part_stat_read(part, field[0]) + \
- part_stat_read(part, field[1]))
+ (part_stat_read(part, field[STAT_READ]) + \
+ part_stat_read(part, field[STAT_WRITE]))

#define part_stat_add(cpu, part, field, addnd) do { \
__part_stat_add((cpu), (part), field, addnd); \
--
2.9.5


2018-06-05 18:04:34

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 1/6] block: make bdev_ops->rw_page() take a REQ_OP instead of bool

c11f0c0b5bb9 ("block/mm: make bdev_ops->rw_page() take a bool for
read/write") replaced @op with boolean @is_write, which limited the
amount of information going into ->rw_page() and more importantly
page_endio(), which removed the need to expose block internals to mm.

Unfortunately, we want to track discards separately and @is_write
isn't enough information. This patch updates bdev_ops->rw_page() to
take REQ_OP instead but leaves page_endio() to take bool @is_write.
This allows the block part of operations to have enough information
while not leaking it to mm.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Mike Christie <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Dan Williams <[email protected]>
---
drivers/block/brd.c | 14 +++++++-------
drivers/block/zram/zram_drv.c | 16 ++++++++--------
drivers/nvdimm/btt.c | 12 ++++++------
drivers/nvdimm/pmem.c | 13 ++++++-------
fs/block_dev.c | 6 ++++--
fs/mpage.c | 4 ++--
include/linux/blkdev.h | 2 +-
7 files changed, 34 insertions(+), 33 deletions(-)

diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index bb97659..df8103d 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -254,20 +254,20 @@ static void copy_from_brd(void *dst, struct brd_device *brd,
* Process a single bvec of a bio.
*/
static int brd_do_bvec(struct brd_device *brd, struct page *page,
- unsigned int len, unsigned int off, bool is_write,
+ unsigned int len, unsigned int off, unsigned int op,
sector_t sector)
{
void *mem;
int err = 0;

- if (is_write) {
+ if (op_is_write(op)) {
err = copy_to_brd_setup(brd, sector, len);
if (err)
goto out;
}

mem = kmap_atomic(page);
- if (!is_write) {
+ if (!op_is_write(op)) {
copy_from_brd(mem + off, brd, sector, len);
flush_dcache_page(page);
} else {
@@ -296,7 +296,7 @@ static blk_qc_t brd_make_request(struct request_queue *q, struct bio *bio)
int err;

err = brd_do_bvec(brd, bvec.bv_page, len, bvec.bv_offset,
- op_is_write(bio_op(bio)), sector);
+ bio_op(bio), sector);
if (err)
goto io_error;
sector += len >> SECTOR_SHIFT;
@@ -310,15 +310,15 @@ static blk_qc_t brd_make_request(struct request_queue *q, struct bio *bio)
}

static int brd_rw_page(struct block_device *bdev, sector_t sector,
- struct page *page, bool is_write)
+ struct page *page, unsigned int op)
{
struct brd_device *brd = bdev->bd_disk->private_data;
int err;

if (PageTransHuge(page))
return -ENOTSUPP;
- err = brd_do_bvec(brd, page, PAGE_SIZE, 0, is_write, sector);
- page_endio(page, is_write, err);
+ err = brd_do_bvec(brd, page, PAGE_SIZE, 0, op, sector);
+ page_endio(page, op_is_write(op), err);
return err;
}

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 0f3fadd..dd3da44 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1145,17 +1145,17 @@ static void zram_bio_discard(struct zram *zram, u32 index,
* Returns 1 if IO request was successfully submitted.
*/
static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
- int offset, bool is_write, struct bio *bio)
+ int offset, unsigned int op, struct bio *bio)
{
unsigned long start_time = jiffies;
- int rw_acct = is_write ? REQ_OP_WRITE : REQ_OP_READ;
+ int rw_acct = op_is_write(op) ? REQ_OP_WRITE : REQ_OP_READ;
struct request_queue *q = zram->disk->queue;
int ret;

generic_start_io_acct(q, rw_acct, bvec->bv_len >> SECTOR_SHIFT,
&zram->disk->part0);

- if (!is_write) {
+ if (!op_is_write(op)) {
atomic64_inc(&zram->stats.num_reads);
ret = zram_bvec_read(zram, bvec, index, offset, bio);
flush_dcache_page(bvec->bv_page);
@@ -1167,7 +1167,7 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
generic_end_io_acct(q, rw_acct, &zram->disk->part0, start_time);

if (unlikely(ret < 0)) {
- if (!is_write)
+ if (!op_is_write(op))
atomic64_inc(&zram->stats.failed_reads);
else
atomic64_inc(&zram->stats.failed_writes);
@@ -1205,7 +1205,7 @@ static void __zram_make_request(struct zram *zram, struct bio *bio)
bv.bv_len = min_t(unsigned int, PAGE_SIZE - offset,
unwritten);
if (zram_bvec_rw(zram, &bv, index, offset,
- op_is_write(bio_op(bio)), bio) < 0)
+ bio_op(bio), bio) < 0)
goto out;

bv.bv_offset += bv.bv_len;
@@ -1257,7 +1257,7 @@ static void zram_slot_free_notify(struct block_device *bdev,
}

static int zram_rw_page(struct block_device *bdev, sector_t sector,
- struct page *page, bool is_write)
+ struct page *page, unsigned int op)
{
int offset, ret;
u32 index;
@@ -1281,7 +1281,7 @@ static int zram_rw_page(struct block_device *bdev, sector_t sector,
bv.bv_len = PAGE_SIZE;
bv.bv_offset = 0;

- ret = zram_bvec_rw(zram, &bv, index, offset, is_write, NULL);
+ ret = zram_bvec_rw(zram, &bv, index, offset, op, NULL);
out:
/*
* If I/O fails, just return error(ie, non-zero) without
@@ -1296,7 +1296,7 @@ static int zram_rw_page(struct block_device *bdev, sector_t sector,

switch (ret) {
case 0:
- page_endio(page, is_write, 0);
+ page_endio(page, op_is_write(op), 0);
break;
case 1:
ret = 0;
diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index 85de805..0360c01 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -1423,11 +1423,11 @@ static int btt_write_pg(struct btt *btt, struct bio_integrity_payload *bip,

static int btt_do_bvec(struct btt *btt, struct bio_integrity_payload *bip,
struct page *page, unsigned int len, unsigned int off,
- bool is_write, sector_t sector)
+ unsigned int op, sector_t sector)
{
int ret;

- if (!is_write) {
+ if (!op_is_write(op)) {
ret = btt_read_pg(btt, bip, page, off, sector, len);
flush_dcache_page(page);
} else {
@@ -1464,7 +1464,7 @@ static blk_qc_t btt_make_request(struct request_queue *q, struct bio *bio)
}

err = btt_do_bvec(btt, bip, bvec.bv_page, len, bvec.bv_offset,
- op_is_write(bio_op(bio)), iter.bi_sector);
+ bio_op(bio), iter.bi_sector);
if (err) {
dev_err(&btt->nd_btt->dev,
"io error in %s sector %lld, len %d,\n",
@@ -1483,16 +1483,16 @@ static blk_qc_t btt_make_request(struct request_queue *q, struct bio *bio)
}

static int btt_rw_page(struct block_device *bdev, sector_t sector,
- struct page *page, bool is_write)
+ struct page *page, unsigned int op)
{
struct btt *btt = bdev->bd_disk->private_data;
int rc;
unsigned int len;

len = hpage_nr_pages(page) * PAGE_SIZE;
- rc = btt_do_bvec(btt, NULL, page, len, 0, is_write, sector);
+ rc = btt_do_bvec(btt, NULL, page, len, 0, op, sector);
if (rc == 0)
- page_endio(page, is_write, 0);
+ page_endio(page, op_is_write(op), 0);

return rc;
}
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index e023d6a..90942508 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -120,7 +120,7 @@ static blk_status_t read_pmem(struct page *page, unsigned int off,
}

static blk_status_t pmem_do_bvec(struct pmem_device *pmem, struct page *page,
- unsigned int len, unsigned int off, bool is_write,
+ unsigned int len, unsigned int off, unsigned int op,
sector_t sector)
{
blk_status_t rc = BLK_STS_OK;
@@ -131,7 +131,7 @@ static blk_status_t pmem_do_bvec(struct pmem_device *pmem, struct page *page,
if (unlikely(is_bad_pmem(&pmem->bb, sector, len)))
bad_pmem = true;

- if (!is_write) {
+ if (!op_is_write(op)) {
if (unlikely(bad_pmem))
rc = BLK_STS_IOERR;
else {
@@ -185,8 +185,7 @@ static blk_qc_t pmem_make_request(struct request_queue *q, struct bio *bio)
do_acct = nd_iostat_start(bio, &start);
bio_for_each_segment(bvec, bio, iter) {
rc = pmem_do_bvec(pmem, bvec.bv_page, bvec.bv_len,
- bvec.bv_offset, op_is_write(bio_op(bio)),
- iter.bi_sector);
+ bvec.bv_offset, bio_op(bio), iter.bi_sector);
if (rc) {
bio->bi_status = rc;
break;
@@ -203,13 +202,13 @@ static blk_qc_t pmem_make_request(struct request_queue *q, struct bio *bio)
}

static int pmem_rw_page(struct block_device *bdev, sector_t sector,
- struct page *page, bool is_write)
+ struct page *page, unsigned int op)
{
struct pmem_device *pmem = bdev->bd_queue->queuedata;
blk_status_t rc;

rc = pmem_do_bvec(pmem, page, hpage_nr_pages(page) * PAGE_SIZE,
- 0, is_write, sector);
+ 0, op, sector);

/*
* The ->rw_page interface is subtle and tricky. The core
@@ -218,7 +217,7 @@ static int pmem_rw_page(struct block_device *bdev, sector_t sector,
* caused by double completion.
*/
if (rc == 0)
- page_endio(page, is_write, 0);
+ page_endio(page, op_is_write(op), 0);

return blk_status_to_errno(rc);
}
diff --git a/fs/block_dev.c b/fs/block_dev.c
index bef6934..82ed5e4 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -662,7 +662,8 @@ int bdev_read_page(struct block_device *bdev, sector_t sector,
result = blk_queue_enter(bdev->bd_queue, 0);
if (result)
return result;
- result = ops->rw_page(bdev, sector + get_start_sect(bdev), page, false);
+ result = ops->rw_page(bdev, sector + get_start_sect(bdev), page,
+ REQ_OP_READ);
blk_queue_exit(bdev->bd_queue);
return result;
}
@@ -700,7 +701,8 @@ int bdev_write_page(struct block_device *bdev, sector_t sector,
return result;

set_page_writeback(page);
- result = ops->rw_page(bdev, sector + get_start_sect(bdev), page, true);
+ result = ops->rw_page(bdev, sector + get_start_sect(bdev), page,
+ REQ_OP_WRITE);
if (result) {
end_page_writeback(page);
} else {
diff --git a/fs/mpage.c b/fs/mpage.c
index b7e7f57..b73638d 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -51,8 +51,8 @@ static void mpage_end_io(struct bio *bio)

bio_for_each_segment_all(bv, bio, i) {
struct page *page = bv->bv_page;
- page_endio(page, op_is_write(bio_op(bio)),
- blk_status_to_errno(bio->bi_status));
+ page_endio(page, bio_op(bio),
+ blk_status_to_errno(bio->bi_status));
}

bio_put(bio);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index bca3a92..b038366 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1955,7 +1955,7 @@ static inline bool integrity_req_gap_front_merge(struct request *req,
struct block_device_operations {
int (*open) (struct block_device *, fmode_t);
void (*release) (struct gendisk *, fmode_t);
- int (*rw_page)(struct block_device *, sector_t, struct page *, bool);
+ int (*rw_page)(struct block_device *, sector_t, struct page *, unsigned int);
int (*ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
int (*compat_ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
unsigned int (*check_events) (struct gendisk *disk,
--
2.9.5


2018-06-05 18:07:51

by Matias Bjørling

[permalink] [raw]
Subject: Re: [PATCH 4/6] block: Add and use op_stat_group() for indexing disk_stat fields.

On 06/05/2018 08:01 PM, Tejun Heo wrote:
> From: Michael Callahan <[email protected]>
>
> Add and use a new op_stat_group() function for indexing partition stat
> fields rather than indexing them by rq_data_dir() or bio_data_dir().
> This function works similarly to op_is_sync() in that it takes the
> request::cmd_flags or bio::bi_opf flags and determines which stats
> should et updated.
>
> In addition, the second parameter to generic_start_io_acct() and
> generic_end_io_acct() is now a REQ_OP rather than simply a read or
> write bit and it uses op_stat_group() on the parameter to determine
> the stat group.
>
> Note that the partition in_flight counts are not part of the per-cpu
> statistics and as such are not indexed via this function. It's now
> indexed by op_is_write().
>
> tj: Refreshed on top of v4.17. Updated to pass around REQ_OP.
>
> Signed-off-by: Michael Callahan <[email protected]>
> Signed-off-by: Tejun Heo <[email protected]>
> Cc: Minchan Kim <[email protected]>
> Cc: Dan Williams <[email protected]>
> Cc: Joshua Morris <[email protected]>
> Cc: Philipp Reisner <[email protected]>
> Cc: Matias Bjorling <[email protected]>
> Cc: Kent Overstreet <[email protected]>
> Cc: Alasdair Kergon <[email protected]>
> ---
> block/bio.c | 16 +++++++++-------
> block/blk-core.c | 12 ++++++------
> drivers/block/drbd/drbd_req.c | 4 ++--
> drivers/block/rsxx/dev.c | 6 +++---
> drivers/block/zram/zram_drv.c | 5 ++---
> drivers/lightnvm/pblk-cache.c | 5 +++--
> drivers/lightnvm/pblk-read.c | 5 +++--
> drivers/md/bcache/request.c | 13 +++++--------
> drivers/md/dm.c | 6 ++++--
> drivers/md/md.c | 5 +++--
> drivers/nvdimm/nd.h | 7 +++----
> include/linux/bio.h | 4 ++--
> include/linux/blk_types.h | 5 +++++
> 13 files changed, 50 insertions(+), 43 deletions(-)
>
> diff --git a/block/bio.c b/block/bio.c
....
> diff --git a/drivers/lightnvm/pblk-cache.c b/drivers/lightnvm/pblk-cache.c
> index b1c6d7e..643e276 100644
> --- a/drivers/lightnvm/pblk-cache.c
> +++ b/drivers/lightnvm/pblk-cache.c
> @@ -27,7 +27,8 @@ int pblk_write_to_cache(struct pblk *pblk, struct bio *bio, unsigned long flags)
> int nr_entries = pblk_get_secs(bio);
> int i, ret;
>
> - generic_start_io_acct(q, WRITE, bio_sectors(bio), &pblk->disk->part0);
> + generic_start_io_acct(q, REQ_OP_WRITE, bio_sectors(bio),
> + &pblk->disk->part0);
>
> /* Update the write buffer head (mem) with the entries that we can
> * write. The write in itself cannot fail, so there is no need to
> @@ -75,7 +76,7 @@ int pblk_write_to_cache(struct pblk *pblk, struct bio *bio, unsigned long flags)
> pblk_rl_inserted(&pblk->rl, nr_entries);
>
> out:
> - generic_end_io_acct(q, WRITE, &pblk->disk->part0, start_time);
> + generic_end_io_acct(q, REQ_OP_WRITE, &pblk->disk->part0, start_time);
> pblk_write_should_kick(pblk);
> return ret;
> }
> diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c
> index 1869469..cf90cee 100644
> --- a/drivers/lightnvm/pblk-read.c
> +++ b/drivers/lightnvm/pblk-read.c
> @@ -199,7 +199,7 @@ static void __pblk_end_io_read(struct pblk *pblk, struct nvm_rq *rqd,
> struct bio *int_bio = rqd->bio;
> unsigned long start_time = r_ctx->start_time;
>
> - generic_end_io_acct(dev->q, READ, &pblk->disk->part0, start_time);
> + generic_end_io_acct(dev->q, REQ_OP_READ, &pblk->disk->part0, start_time);
>
> if (rqd->error)
> pblk_log_read_err(pblk, rqd);
> @@ -411,7 +411,8 @@ int pblk_submit_read(struct pblk *pblk, struct bio *bio)
> return NVM_IO_ERR;
> }
>
> - generic_start_io_acct(q, READ, bio_sectors(bio), &pblk->disk->part0);
> + generic_start_io_acct(q, REQ_OP_READ, bio_sectors(bio),
> + &pblk->disk->part0);
>
> bitmap_zero(&read_bitmap, nr_secs);
>

Looks good to me. Thanks.

2018-07-17 16:36:42

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCHSET v2] block: Separating discards from writes in Linux IO statistics

On 7/17/18 9:57 AM, Tejun Heo wrote:
> Jens, ping.
>
> Thanks.

Looks ok to me, should probably get it queued up. Can you respin
against for-4.19/block?

--
Jens Axboe