2024-03-23 04:07:34

by Yu Kuai

[permalink] [raw]
Subject: [PATCH 0/2] block: support to account io_ticks precisely

From: Yu Kuai <[email protected]>

Yu Kuai (2):
block: support to account io_ticks precisely
block: remove blk_mq_in_flight() and blk_mq_in_flight_rw()

block/blk-core.c | 9 +++++----
block/blk-merge.c | 2 ++
block/blk-mq.c | 36 ++++--------------------------------
block/blk-mq.h | 5 -----
block/blk.h | 1 +
block/genhd.c | 20 ++++----------------
6 files changed, 16 insertions(+), 57 deletions(-)

--
2.39.2



2024-03-23 04:07:57

by Yu Kuai

[permalink] [raw]
Subject: [PATCH 0/2] block: support to account io_ticks precisely

From: Yu Kuai <[email protected]>

Yu Kuai (2):
block: support to account io_ticks precisely
block: remove blk_mq_in_flight() and blk_mq_in_flight_rw()

block/blk-core.c | 9 +++++----
block/blk-merge.c | 2 ++
block/blk-mq.c | 36 ++++--------------------------------
block/blk-mq.h | 5 -----
block/blk.h | 1 +
block/genhd.c | 20 ++++----------------
6 files changed, 16 insertions(+), 57 deletions(-)

--
2.39.2


2024-03-23 04:08:03

by Yu Kuai

[permalink] [raw]
Subject: [PATCH 1/2] block: support to account io_ticks precisely

From: Yu Kuai <[email protected]>

Currently, io_ticks is accounted based on sampling, specifically
update_io_ticks() will always account io_ticks by 1 jiffies from
bdev_start_io_acct()/blk_account_io_start(), and the result can be
inaccurate, for example(HZ is 250):

Test script:
fio -filename=/dev/sda -bs=4k -rw=write -direct=1 -name=test -thinktime=4ms

Test result: util is about 90%, while the disk is really idle.

This behaviour is introduced by commit 5b18b5a73760 ("block: delete
part_round_stats and switch to less precise counting"), however, there
was a key point that is missed that this patch also improve performance
a lot:

Before the commit:
part_round_stats:
if (part->stamp != now)
stats |= 1;

part_in_flight()
-> there can be lots of task here in 1 jiffies.
part_round_stats_single()
__part_stat_add()
part->stamp = now;

After the commit:
update_io_ticks:
stamp = part->bd_stamp;
if (time_after(now, stamp))
if (try_cmpxchg())
__part_stat_add()
-> only one task can reach here in 1 jiffies.

Hence in order to account io_ticks precisely, we only need to know if
there are IO inflight at most once in one jiffies. Noted that for
rq-based device, iterating tags should not be used here because
'tags->lock' is grabbed in blk_mq_find_and_get_req(), hence
part_stat_lock_inc/dec() and part_in_flight() is used to trace inflight.
The additional overhead is quite little:

- per cpu add/dec for each IO for rq-based device;
- per cpu sum for each jiffies;

And it's verified by null-blk that there are no performance degration
under heavy IO pressure.

Fixes: 5b18b5a73760 ("block: delete part_round_stats and switch to less precise counting")
Signed-off-by: Yu Kuai <[email protected]>
---
block/blk-core.c | 9 +++++----
block/blk-merge.c | 2 ++
block/blk-mq.c | 4 ++++
block/blk.h | 1 +
block/genhd.c | 2 +-
5 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index a16b5abdbbf5..71c4d3982ef4 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -985,10 +985,11 @@ void update_io_ticks(struct block_device *part, unsigned long now, bool end)
unsigned long stamp;
again:
stamp = READ_ONCE(part->bd_stamp);
- if (unlikely(time_after(now, stamp))) {
- if (likely(try_cmpxchg(&part->bd_stamp, &stamp, now)))
- __part_stat_add(part, io_ticks, end ? now - stamp : 1);
- }
+ if (unlikely(time_after(now, stamp)) &&
+ likely(try_cmpxchg(&part->bd_stamp, &stamp, now)) &&
+ (end || part_in_flight(part)))
+ __part_stat_add(part, io_ticks, now - stamp);
+
if (part->bd_partno) {
part = bdev_whole(part);
goto again;
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 2a06fd33039d..c96f7f612218 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -779,6 +779,8 @@ static void blk_account_io_merge_request(struct request *req)
if (blk_do_io_stat(req)) {
part_stat_lock();
part_stat_inc(req->part, merges[op_stat_group(req_op(req))]);
+ part_stat_local_dec(req->part,
+ in_flight[op_is_write(req_op(req))]);
part_stat_unlock();
}
}
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 555ada922cf0..82045f0ab5ba 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1002,6 +1002,8 @@ static inline void blk_account_io_done(struct request *req, u64 now)
update_io_ticks(req->part, jiffies, true);
part_stat_inc(req->part, ios[sgrp]);
part_stat_add(req->part, nsecs[sgrp], now - req->start_time_ns);
+ part_stat_local_dec(req->part,
+ in_flight[op_is_write(req_op(req))]);
part_stat_unlock();
}
}
@@ -1024,6 +1026,8 @@ static inline void blk_account_io_start(struct request *req)

part_stat_lock();
update_io_ticks(req->part, jiffies, false);
+ part_stat_local_inc(req->part,
+ in_flight[op_is_write(req_op(req))]);
part_stat_unlock();
}
}
diff --git a/block/blk.h b/block/blk.h
index 47960bf31543..5fc206ced744 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -358,6 +358,7 @@ static inline bool blk_do_io_stat(struct request *rq)
}

void update_io_ticks(struct block_device *part, unsigned long now, bool end);
+unsigned int part_in_flight(struct block_device *part);

static inline void req_set_nomerge(struct request_queue *q, struct request *req)
{
diff --git a/block/genhd.c b/block/genhd.c
index bb29a68e1d67..782a42718965 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -118,7 +118,7 @@ static void part_stat_read_all(struct block_device *part,
}
}

-static unsigned int part_in_flight(struct block_device *part)
+unsigned int part_in_flight(struct block_device *part)
{
unsigned int inflight = 0;
int cpu;
--
2.39.2


2024-03-23 04:08:13

by Yu Kuai

[permalink] [raw]
Subject: [PATCH 2/2] block: remove blk_mq_in_flight() and blk_mq_in_flight_rw()

From: Yu Kuai <[email protected]>

Now that blk-mq also use per_cpu counter to trace inflight as bio-based
device, they can be replaced by part_in_flight() and part_in_flight_rw()
directly.

Noted that there will be a change that inflight will be accounted from
blk_account_io_start() instead of blk_mq_start_request(). This also fix
an inconsistence for rq-based device that if there are rq allocated but
not started, io_ticks will be updated from blk_account_io_start() but not
from part_stat_show() or diskstats_show(). For consequence, for example:

blk_account_io_start -> 0s
-> something is wrong with driver, rq can't be dispatched to driver
-> finially the driver recovers
blk_mq_start_request -> 10s

Then in this case, if user is using "iostat 1", then user will found
that 'util' is 0 from 0s-9s, because diskstats_show() doesn't update
io_ticks, then in 9s-10s user will found 'util' is 1000%.

Signed-off-by: Yu Kuai <[email protected]>
---
block/blk-mq.c | 32 --------------------------------
block/blk-mq.h | 5 -----
block/genhd.c | 18 +++---------------
3 files changed, 3 insertions(+), 52 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 82045f0ab5ba..dfbb4e24f04a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -87,38 +87,6 @@ struct mq_inflight {
unsigned int inflight[2];
};

-static bool blk_mq_check_inflight(struct request *rq, void *priv)
-{
- struct mq_inflight *mi = priv;
-
- if (rq->part && blk_do_io_stat(rq) &&
- (!mi->part->bd_partno || rq->part == mi->part) &&
- blk_mq_rq_state(rq) == MQ_RQ_IN_FLIGHT)
- mi->inflight[rq_data_dir(rq)]++;
-
- return true;
-}
-
-unsigned int blk_mq_in_flight(struct request_queue *q,
- struct block_device *part)
-{
- struct mq_inflight mi = { .part = part };
-
- blk_mq_queue_tag_busy_iter(q, blk_mq_check_inflight, &mi);
-
- return mi.inflight[0] + mi.inflight[1];
-}
-
-void blk_mq_in_flight_rw(struct request_queue *q, struct block_device *part,
- unsigned int inflight[2])
-{
- struct mq_inflight mi = { .part = part };
-
- blk_mq_queue_tag_busy_iter(q, blk_mq_check_inflight, &mi);
- inflight[0] = mi.inflight[0];
- inflight[1] = mi.inflight[1];
-}
-
void blk_freeze_queue_start(struct request_queue *q)
{
mutex_lock(&q->mq_freeze_lock);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index f75a9ecfebde..aa92e1317f18 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -236,11 +236,6 @@ static inline bool blk_mq_hw_queue_mapped(struct blk_mq_hw_ctx *hctx)
return hctx->nr_ctx && hctx->tags;
}

-unsigned int blk_mq_in_flight(struct request_queue *q,
- struct block_device *part);
-void blk_mq_in_flight_rw(struct request_queue *q, struct block_device *part,
- unsigned int inflight[2]);
-
static inline void blk_mq_put_dispatch_budget(struct request_queue *q,
int budget_token)
{
diff --git a/block/genhd.c b/block/genhd.c
index 782a42718965..9c64d34b7dc3 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -954,15 +954,10 @@ ssize_t part_stat_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct block_device *bdev = dev_to_bdev(dev);
- struct request_queue *q = bdev_get_queue(bdev);
struct disk_stats stat;
unsigned int inflight;

- if (queue_is_mq(q))
- inflight = blk_mq_in_flight(q, bdev);
- else
- inflight = part_in_flight(bdev);
-
+ inflight = part_in_flight(bdev);
if (inflight) {
part_stat_lock();
update_io_ticks(bdev, jiffies, true);
@@ -1003,13 +998,9 @@ ssize_t part_inflight_show(struct device *dev, struct device_attribute *attr,
char *buf)
{
struct block_device *bdev = dev_to_bdev(dev);
- struct request_queue *q = bdev_get_queue(bdev);
unsigned int inflight[2];

- if (queue_is_mq(q))
- blk_mq_in_flight_rw(q, bdev, inflight);
- else
- part_in_flight_rw(bdev, inflight);
+ part_in_flight_rw(bdev, inflight);

return sprintf(buf, "%8u %8u\n", inflight[0], inflight[1]);
}
@@ -1251,11 +1242,8 @@ static int diskstats_show(struct seq_file *seqf, void *v)
xa_for_each(&gp->part_tbl, idx, hd) {
if (bdev_is_partition(hd) && !bdev_nr_sectors(hd))
continue;
- if (queue_is_mq(gp->queue))
- inflight = blk_mq_in_flight(gp->queue, hd);
- else
- inflight = part_in_flight(hd);

+ inflight = part_in_flight(hd);
if (inflight) {
part_stat_lock();
update_io_ticks(hd, jiffies, true);
--
2.39.2


2024-03-23 04:08:16

by Yu Kuai

[permalink] [raw]
Subject: [PATCH 2/2] block: remove blk_mq_in_flight() and blk_mq_in_flight_rw()

From: Yu Kuai <[email protected]>

Now that blk-mq also use per_cpu counter to trace inflight as bio-based
device, they can be replaced by part_in_flight() and part_in_flight_rw()
directly.

Noted that there will be a change that inflight will be accounted from
blk_account_io_start() instead of blk_mq_start_request(). This also fix
an inconsistence for rq-based device that if there are rq allocated but
not started, io_ticks will be updated from blk_account_io_start() but not
from part_stat_show() or diskstats_show(). For consequence, for example:

blk_account_io_start -> 0s
-> something is wrong with driver, rq can't be dispatched to driver
-> finially the driver recovers
blk_mq_start_request -> 10s

Then in this case, if user is using "iostat 1", then user will found
that 'util' is 0 from 0s-9s, because diskstats_show() doesn't update
io_ticks, then in 9s-10s user will found 'util' is 1000%.

Signed-off-by: Yu Kuai <[email protected]>
---
block/blk-mq.c | 32 --------------------------------
block/blk-mq.h | 5 -----
block/genhd.c | 18 +++---------------
3 files changed, 3 insertions(+), 52 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 82045f0ab5ba..dfbb4e24f04a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -87,38 +87,6 @@ struct mq_inflight {
unsigned int inflight[2];
};

-static bool blk_mq_check_inflight(struct request *rq, void *priv)
-{
- struct mq_inflight *mi = priv;
-
- if (rq->part && blk_do_io_stat(rq) &&
- (!mi->part->bd_partno || rq->part == mi->part) &&
- blk_mq_rq_state(rq) == MQ_RQ_IN_FLIGHT)
- mi->inflight[rq_data_dir(rq)]++;
-
- return true;
-}
-
-unsigned int blk_mq_in_flight(struct request_queue *q,
- struct block_device *part)
-{
- struct mq_inflight mi = { .part = part };
-
- blk_mq_queue_tag_busy_iter(q, blk_mq_check_inflight, &mi);
-
- return mi.inflight[0] + mi.inflight[1];
-}
-
-void blk_mq_in_flight_rw(struct request_queue *q, struct block_device *part,
- unsigned int inflight[2])
-{
- struct mq_inflight mi = { .part = part };
-
- blk_mq_queue_tag_busy_iter(q, blk_mq_check_inflight, &mi);
- inflight[0] = mi.inflight[0];
- inflight[1] = mi.inflight[1];
-}
-
void blk_freeze_queue_start(struct request_queue *q)
{
mutex_lock(&q->mq_freeze_lock);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index f75a9ecfebde..aa92e1317f18 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -236,11 +236,6 @@ static inline bool blk_mq_hw_queue_mapped(struct blk_mq_hw_ctx *hctx)
return hctx->nr_ctx && hctx->tags;
}

-unsigned int blk_mq_in_flight(struct request_queue *q,
- struct block_device *part);
-void blk_mq_in_flight_rw(struct request_queue *q, struct block_device *part,
- unsigned int inflight[2]);
-
static inline void blk_mq_put_dispatch_budget(struct request_queue *q,
int budget_token)
{
diff --git a/block/genhd.c b/block/genhd.c
index 782a42718965..9c64d34b7dc3 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -954,15 +954,10 @@ ssize_t part_stat_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct block_device *bdev = dev_to_bdev(dev);
- struct request_queue *q = bdev_get_queue(bdev);
struct disk_stats stat;
unsigned int inflight;

- if (queue_is_mq(q))
- inflight = blk_mq_in_flight(q, bdev);
- else
- inflight = part_in_flight(bdev);
-
+ inflight = part_in_flight(bdev);
if (inflight) {
part_stat_lock();
update_io_ticks(bdev, jiffies, true);
@@ -1003,13 +998,9 @@ ssize_t part_inflight_show(struct device *dev, struct device_attribute *attr,
char *buf)
{
struct block_device *bdev = dev_to_bdev(dev);
- struct request_queue *q = bdev_get_queue(bdev);
unsigned int inflight[2];

- if (queue_is_mq(q))
- blk_mq_in_flight_rw(q, bdev, inflight);
- else
- part_in_flight_rw(bdev, inflight);
+ part_in_flight_rw(bdev, inflight);

return sprintf(buf, "%8u %8u\n", inflight[0], inflight[1]);
}
@@ -1251,11 +1242,8 @@ static int diskstats_show(struct seq_file *seqf, void *v)
xa_for_each(&gp->part_tbl, idx, hd) {
if (bdev_is_partition(hd) && !bdev_nr_sectors(hd))
continue;
- if (queue_is_mq(gp->queue))
- inflight = blk_mq_in_flight(gp->queue, hd);
- else
- inflight = part_in_flight(hd);

+ inflight = part_in_flight(hd);
if (inflight) {
part_stat_lock();
update_io_ticks(hd, jiffies, true);
--
2.39.2


2024-03-23 04:08:24

by Yu Kuai

[permalink] [raw]
Subject: [PATCH 1/2] block: support to account io_ticks precisely

From: Yu Kuai <[email protected]>

Currently, io_ticks is accounted based on sampling, specifically
update_io_ticks() will always account io_ticks by 1 jiffies from
bdev_start_io_acct()/blk_account_io_start(), and the result can be
inaccurate, for example(HZ is 250):

Test script:
fio -filename=/dev/sda -bs=4k -rw=write -direct=1 -name=test -thinktime=4ms

Test result: util is about 90%, while the disk is really idle.

This behaviour is introduced by commit 5b18b5a73760 ("block: delete
part_round_stats and switch to less precise counting"), however, there
was a key point that is missed that this patch also improve performance
a lot:

Before the commit:
part_round_stats:
if (part->stamp != now)
stats |= 1;

part_in_flight()
-> there can be lots of task here in 1 jiffies.
part_round_stats_single()
__part_stat_add()
part->stamp = now;

After the commit:
update_io_ticks:
stamp = part->bd_stamp;
if (time_after(now, stamp))
if (try_cmpxchg())
__part_stat_add()
-> only one task can reach here in 1 jiffies.

Hence in order to account io_ticks precisely, we only need to know if
there are IO inflight at most once in one jiffies. Noted that for
rq-based device, iterating tags should not be used here because
'tags->lock' is grabbed in blk_mq_find_and_get_req(), hence
part_stat_lock_inc/dec() and part_in_flight() is used to trace inflight.
The additional overhead is quite little:

- per cpu add/dec for each IO for rq-based device;
- per cpu sum for each jiffies;

And it's verified by null-blk that there are no performance degration
under heavy IO pressure.

Fixes: 5b18b5a73760 ("block: delete part_round_stats and switch to less precise counting")
Signed-off-by: Yu Kuai <[email protected]>
---
block/blk-core.c | 9 +++++----
block/blk-merge.c | 2 ++
block/blk-mq.c | 4 ++++
block/blk.h | 1 +
block/genhd.c | 2 +-
5 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index a16b5abdbbf5..71c4d3982ef4 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -985,10 +985,11 @@ void update_io_ticks(struct block_device *part, unsigned long now, bool end)
unsigned long stamp;
again:
stamp = READ_ONCE(part->bd_stamp);
- if (unlikely(time_after(now, stamp))) {
- if (likely(try_cmpxchg(&part->bd_stamp, &stamp, now)))
- __part_stat_add(part, io_ticks, end ? now - stamp : 1);
- }
+ if (unlikely(time_after(now, stamp)) &&
+ likely(try_cmpxchg(&part->bd_stamp, &stamp, now)) &&
+ (end || part_in_flight(part)))
+ __part_stat_add(part, io_ticks, now - stamp);
+
if (part->bd_partno) {
part = bdev_whole(part);
goto again;
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 2a06fd33039d..c96f7f612218 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -779,6 +779,8 @@ static void blk_account_io_merge_request(struct request *req)
if (blk_do_io_stat(req)) {
part_stat_lock();
part_stat_inc(req->part, merges[op_stat_group(req_op(req))]);
+ part_stat_local_dec(req->part,
+ in_flight[op_is_write(req_op(req))]);
part_stat_unlock();
}
}
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 555ada922cf0..82045f0ab5ba 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1002,6 +1002,8 @@ static inline void blk_account_io_done(struct request *req, u64 now)
update_io_ticks(req->part, jiffies, true);
part_stat_inc(req->part, ios[sgrp]);
part_stat_add(req->part, nsecs[sgrp], now - req->start_time_ns);
+ part_stat_local_dec(req->part,
+ in_flight[op_is_write(req_op(req))]);
part_stat_unlock();
}
}
@@ -1024,6 +1026,8 @@ static inline void blk_account_io_start(struct request *req)

part_stat_lock();
update_io_ticks(req->part, jiffies, false);
+ part_stat_local_inc(req->part,
+ in_flight[op_is_write(req_op(req))]);
part_stat_unlock();
}
}
diff --git a/block/blk.h b/block/blk.h
index 47960bf31543..5fc206ced744 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -358,6 +358,7 @@ static inline bool blk_do_io_stat(struct request *rq)
}

void update_io_ticks(struct block_device *part, unsigned long now, bool end);
+unsigned int part_in_flight(struct block_device *part);

static inline void req_set_nomerge(struct request_queue *q, struct request *req)
{
diff --git a/block/genhd.c b/block/genhd.c
index bb29a68e1d67..782a42718965 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -118,7 +118,7 @@ static void part_stat_read_all(struct block_device *part,
}
}

-static unsigned int part_in_flight(struct block_device *part)
+unsigned int part_in_flight(struct block_device *part)
{
unsigned int inflight = 0;
int cpu;
--
2.39.2


2024-03-23 04:09:10

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH 0/2] block: support to account io_ticks precisely

Hi,

Sorry that I somehow send this set twice... Please ignore the additional
set.

Thanks,
Kuai

?? 2024/03/23 11:59, Yu Kuai д??:
> From: Yu Kuai <[email protected]>
>
> Yu Kuai (2):
> block: support to account io_ticks precisely
> block: remove blk_mq_in_flight() and blk_mq_in_flight_rw()
>
> block/blk-core.c | 9 +++++----
> block/blk-merge.c | 2 ++
> block/blk-mq.c | 36 ++++--------------------------------
> block/blk-mq.h | 5 -----
> block/blk.h | 1 +
> block/genhd.c | 20 ++++----------------
> 6 files changed, 16 insertions(+), 57 deletions(-)
>


2024-03-23 16:07:46

by Mike Snitzer

[permalink] [raw]
Subject: Re: [PATCH 2/2] block: remove blk_mq_in_flight() and blk_mq_in_flight_rw()

On Fri, Mar 22 2024 at 11:59P -0400,
Yu Kuai <[email protected]> wrote:

> From: Yu Kuai <[email protected]>
>
> Now that blk-mq also use per_cpu counter to trace inflight as bio-based
> device, they can be replaced by part_in_flight() and part_in_flight_rw()
> directly.

Please reference the commit that enabled this, e.g.:

With commit XXXXX ("commit subject") blk-mq was updated to use per_cpu
counters to track inflight IO same as bio-based devices, so replace
blk_mq_in_flight* with part_in_flight() and part_in_flight_rw()
accordingly.

(I'm not seeing the commit in question, but I only took a quick look).

Mike

2024-03-24 02:11:50

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH 2/2] block: remove blk_mq_in_flight() and blk_mq_in_flight_rw()

Hi,

?? 2024/03/24 0:05, Mike Snitzer д??:
> On Fri, Mar 22 2024 at 11:59P -0400,
> Yu Kuai <[email protected]> wrote:
>
>> From: Yu Kuai <[email protected]>
>>
>> Now that blk-mq also use per_cpu counter to trace inflight as bio-based
>> device, they can be replaced by part_in_flight() and part_in_flight_rw()
>> directly.
>
> Please reference the commit that enabled this, e.g.:
>
> With commit XXXXX ("commit subject") blk-mq was updated to use per_cpu
> counters to track inflight IO same as bio-based devices, so replace
> blk_mq_in_flight* with part_in_flight() and part_in_flight_rw()
> accordingly.

Patch 1 in this set do this, so there is no commit xxx yet.

Thanks,
Kuai

>
> (I'm not seeing the commit in question, but I only took a quick look).
>
> Mike
> .
>


2024-03-24 21:57:42

by Mike Snitzer

[permalink] [raw]
Subject: Re: [PATCH 2/2] block: remove blk_mq_in_flight() and blk_mq_in_flight_rw()

On Sat, Mar 23 2024 at 10:11P -0400,
Yu Kuai <[email protected]> wrote:

> Hi,
>
> 在 2024/03/24 0:05, Mike Snitzer 写道:
> > On Fri, Mar 22 2024 at 11:59P -0400,
> > Yu Kuai <[email protected]> wrote:
> >
> > > From: Yu Kuai <[email protected]>
> > >
> > > Now that blk-mq also use per_cpu counter to trace inflight as bio-based
> > > device, they can be replaced by part_in_flight() and part_in_flight_rw()
> > > directly.
> >
> > Please reference the commit that enabled this, e.g.:
> >
> > With commit XXXXX ("commit subject") blk-mq was updated to use per_cpu
> > counters to track inflight IO same as bio-based devices, so replace
> > blk_mq_in_flight* with part_in_flight() and part_in_flight_rw()
> > accordingly.
>
> Patch 1 in this set do this, so there is no commit xxx yet.
>
> Thanks,
> Kuai

Would've helped if I looked at 1/2, but please say:

With the previous commit blk-mq was updated to use per_cpu ...

2024-03-25 09:40:33

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH 2/2] block: remove blk_mq_in_flight() and blk_mq_in_flight_rw()

Hi,

在 2024/03/25 5:57, Mike Snitzer 写道:
> On Sat, Mar 23 2024 at 10:11P -0400,
> Yu Kuai <[email protected]> wrote:
>
>> Hi,
>>
>> 在 2024/03/24 0:05, Mike Snitzer 写道:
>>> On Fri, Mar 22 2024 at 11:59P -0400,
>>> Yu Kuai <[email protected]> wrote:
>>>
>>>> From: Yu Kuai <[email protected]>
>>>>
>>>> Now that blk-mq also use per_cpu counter to trace inflight as bio-based
>>>> device, they can be replaced by part_in_flight() and part_in_flight_rw()
>>>> directly.
>>>
>>> Please reference the commit that enabled this, e.g.:
>>>
>>> With commit XXXXX ("commit subject") blk-mq was updated to use per_cpu
>>> counters to track inflight IO same as bio-based devices, so replace
>>> blk_mq_in_flight* with part_in_flight() and part_in_flight_rw()
>>> accordingly.
>>
>> Patch 1 in this set do this, so there is no commit xxx yet.
>>
>> Thanks,
>> Kuai
>
> Would've helped if I looked at 1/2, but please say:
>
> With the previous commit blk-mq was updated to use per_cpu ...

OK, will pay attention to it later. :)

Thanks,
Kuai

> .
>


2024-04-06 07:57:43

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH 0/2] block: support to account io_ticks precisely

Hi, Jens!
Hi, Ming!
Hi, Christoph!
Hi, Bart!

Friendly ping ...

The 'util' reported by iostat is very important for users, they don't
have much choise to get disk status, while 'util' has been inaccurate
for a long time unnecessarily. I really think patch 1 is meaningful.

Patch 2 also tries to fix a problem by our customer that util can
sometimes be huge. The root cause is that 'inflight' is account from
blk_mq_start_request() while 'io_ticks' is account from
blk_account_io_start(), there is a gap. I let 'inflight' to be account
from blk_account_io_start() as well, please let me know if this is not
good.

Thanks!
Kuai

?? 2024/03/23 11:59, Yu Kuai д??:
> From: Yu Kuai <[email protected]>
>
> Yu Kuai (2):
> block: support to account io_ticks precisely
> block: remove blk_mq_in_flight() and blk_mq_in_flight_rw()
>
> block/blk-core.c | 9 +++++----
> block/blk-merge.c | 2 ++
> block/blk-mq.c | 36 ++++--------------------------------
> block/blk-mq.h | 5 -----
> block/blk.h | 1 +
> block/genhd.c | 20 ++++----------------
> 6 files changed, 16 insertions(+), 57 deletions(-)
>


2024-04-20 01:56:42

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH 0/2] block: support to account io_ticks precisely

Hi,

在 2024/04/06 15:57, Yu Kuai 写道:
> Hi, Jens!
> Hi, Ming!
> Hi, Christoph!
> Hi, Bart!
>
> Friendly ping ...
>
> The 'util' reported by iostat is very important for users, they don't
> have much choise to get disk status, while 'util' has been inaccurate
> for a long time unnecessarily. I really think patch 1 is meaningful.
>
> Patch 2 also tries to fix a problem by our customer that util can
> sometimes be huge. The root cause is that 'inflight' is account from
> blk_mq_start_request() while 'io_ticks' is account from
> blk_account_io_start(), there is a gap. I let 'inflight' to be account
> from blk_account_io_start() as well, please let me know if this is not
> good.

Friendly ping ...
>
> Thanks!
> Kuai
>
> 在 2024/03/23 11:59, Yu Kuai 写道:
>> From: Yu Kuai <[email protected]>
>>
>> Yu Kuai (2):
>>    block: support to account io_ticks precisely
>>    block: remove blk_mq_in_flight() and blk_mq_in_flight_rw()
>>
>>   block/blk-core.c  |  9 +++++----
>>   block/blk-merge.c |  2 ++
>>   block/blk-mq.c    | 36 ++++--------------------------------
>>   block/blk-mq.h    |  5 -----
>>   block/blk.h       |  1 +
>>   block/genhd.c     | 20 ++++----------------
>>   6 files changed, 16 insertions(+), 57 deletions(-)
>>
>
> .
>