From: Yu Kuai <[email protected]>
Yu Kuai (2):
block: support to account io_ticks precisely
block: remove blk_mq_in_flight() and blk_mq_in_flight_rw()
block/blk-core.c | 9 +++++----
block/blk-merge.c | 2 ++
block/blk-mq.c | 36 ++++--------------------------------
block/blk-mq.h | 5 -----
block/blk.h | 1 +
block/genhd.c | 20 ++++----------------
6 files changed, 16 insertions(+), 57 deletions(-)
--
2.39.2
From: Yu Kuai <[email protected]>
Currently, io_ticks is accounted based on sampling: update_io_ticks()
will always account io_ticks by one jiffy from
bdev_start_io_acct()/blk_account_io_start(), and the result can be
inaccurate, for example (HZ is 250):

Test script:
fio -filename=/dev/sda -bs=4k -rw=write -direct=1 -name=test -thinktime=4ms

Test result: util is about 90%, while the disk is really idle.
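To make the arithmetic behind that number concrete, here is a tiny
user-space sketch (not part of the patch; the 0.1ms per-IO service time
is an assumed illustrative figure) comparing what the sampled scheme
credits to io_ticks with the time the disk is actually busy. It
over-reports by roughly this order of magnitude, matching the ~90%
seen in the test:

/*
 * With HZ=250 a jiffy is 4ms, and the sampled scheme credits one full
 * jiffy to io_ticks for (at most) each jiffy in which an IO starts,
 * even if the IO only kept the disk busy for a fraction of that jiffy.
 */
#include <stdio.h>

int main(void)
{
	const double jiffy_ms = 1000.0 / 250;	/* HZ=250 -> 4ms per jiffy */
	const double service_ms = 0.1;		/* assumed per-IO busy time */
	const double think_ms = 4.0;		/* fio thinktime */
	const double period_ms = service_ms + think_ms;

	/* roughly one IO starts per period, and each start credits a jiffy */
	double sampled_util = jiffy_ms / period_ms;	/* ~98% */
	double real_util = service_ms / period_ms;	/* ~2% */

	printf("sampled util ~%.0f%%, real util ~%.0f%%\n",
	       100 * sampled_util, 100 * real_util);
	return 0;
}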
This behaviour was introduced by commit 5b18b5a73760 ("block: delete
part_round_stats and switch to less precise counting"); however, a key
point was missed: that commit also improved performance a lot:
Before the commit:

 part_round_stats:
   if (part->stamp != now)
    stats |= 1;

   part_in_flight()
   -> many tasks can reach here within one jiffy.

   part_round_stats_single()
    __part_stat_add()
   part->stamp = now;

After the commit:

 update_io_ticks:
  stamp = part->bd_stamp;
  if (time_after(now, stamp))
   if (try_cmpxchg())
    __part_stat_add()
    -> only one task can reach here within one jiffy.
Hence, in order to account io_ticks precisely, we only need to check
whether there is inflight IO, at most once per jiffy. Note that for
rq-based devices, iterating tags should not be used here because
'tags->lock' is grabbed in blk_mq_find_and_get_req(); hence
part_stat_local_inc/dec() and part_in_flight() are used to track
inflight IO instead.

The additional overhead is quite small:
- a per-cpu inc/dec for each IO on rq-based devices;
- a per-cpu sum at most once per jiffy.

And it has been verified with null-blk that there is no performance
degradation under heavy IO pressure.
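As a rough analogy for that bookkeeping cost (plain C11 atomics in user
space, not the kernel's per-cpu API; NR_CPUS and the function names are
made up for illustration), each IO start/completion only bumps a cheap
per-CPU counter, while the cross-CPU sum is only taken when
update_io_ticks() wants to know whether anything is inflight, i.e. at
most once per jiffy:

#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 8

/* hypothetical per-CPU inflight counters, one slot per CPU */
static atomic_long inflight[NR_CPUS];

static void io_start(int cpu) { atomic_fetch_add(&inflight[cpu], 1); }
static void io_done(int cpu)  { atomic_fetch_sub(&inflight[cpu], 1); }

/* the comparatively expensive part: sum over all CPUs on demand */
static bool any_inflight(void)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += atomic_load(&inflight[cpu]);
	return sum > 0;
}

int main(void)
{
	io_start(0);
	bool busy = any_inflight();	/* true: one IO outstanding */
	io_done(0);
	return (busy && !any_inflight()) ? 0 : 1;
}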
Fixes: 5b18b5a73760 ("block: delete part_round_stats and switch to less precise counting")
Signed-off-by: Yu Kuai <[email protected]>
---
block/blk-core.c | 9 +++++----
block/blk-merge.c | 2 ++
block/blk-mq.c | 4 ++++
block/blk.h | 1 +
block/genhd.c | 2 +-
5 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index a16b5abdbbf5..71c4d3982ef4 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -985,10 +985,11 @@ void update_io_ticks(struct block_device *part, unsigned long now, bool end)
unsigned long stamp;
again:
stamp = READ_ONCE(part->bd_stamp);
- if (unlikely(time_after(now, stamp))) {
- if (likely(try_cmpxchg(&part->bd_stamp, &stamp, now)))
- __part_stat_add(part, io_ticks, end ? now - stamp : 1);
- }
+ if (unlikely(time_after(now, stamp)) &&
+ likely(try_cmpxchg(&part->bd_stamp, &stamp, now)) &&
+ (end || part_in_flight(part)))
+ __part_stat_add(part, io_ticks, now - stamp);
+
if (part->bd_partno) {
part = bdev_whole(part);
goto again;
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 2a06fd33039d..c96f7f612218 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -779,6 +779,8 @@ static void blk_account_io_merge_request(struct request *req)
if (blk_do_io_stat(req)) {
part_stat_lock();
part_stat_inc(req->part, merges[op_stat_group(req_op(req))]);
+ part_stat_local_dec(req->part,
+ in_flight[op_is_write(req_op(req))]);
part_stat_unlock();
}
}
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 555ada922cf0..82045f0ab5ba 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1002,6 +1002,8 @@ static inline void blk_account_io_done(struct request *req, u64 now)
update_io_ticks(req->part, jiffies, true);
part_stat_inc(req->part, ios[sgrp]);
part_stat_add(req->part, nsecs[sgrp], now - req->start_time_ns);
+ part_stat_local_dec(req->part,
+ in_flight[op_is_write(req_op(req))]);
part_stat_unlock();
}
}
@@ -1024,6 +1026,8 @@ static inline void blk_account_io_start(struct request *req)
part_stat_lock();
update_io_ticks(req->part, jiffies, false);
+ part_stat_local_inc(req->part,
+ in_flight[op_is_write(req_op(req))]);
part_stat_unlock();
}
}
diff --git a/block/blk.h b/block/blk.h
index 47960bf31543..5fc206ced744 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -358,6 +358,7 @@ static inline bool blk_do_io_stat(struct request *rq)
}
void update_io_ticks(struct block_device *part, unsigned long now, bool end);
+unsigned int part_in_flight(struct block_device *part);
static inline void req_set_nomerge(struct request_queue *q, struct request *req)
{
diff --git a/block/genhd.c b/block/genhd.c
index bb29a68e1d67..782a42718965 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -118,7 +118,7 @@ static void part_stat_read_all(struct block_device *part,
}
}
-static unsigned int part_in_flight(struct block_device *part)
+unsigned int part_in_flight(struct block_device *part)
{
unsigned int inflight = 0;
int cpu;
--
2.39.2
From: Yu Kuai <[email protected]>
Now that blk-mq also uses per-cpu counters to track inflight IO, just
like bio-based devices, blk_mq_in_flight() and blk_mq_in_flight_rw()
can be replaced by part_in_flight() and part_in_flight_rw() directly.

Note that this changes behaviour: inflight is now accounted from
blk_account_io_start() instead of blk_mq_start_request(). This also
fixes an inconsistency for rq-based devices: if requests are allocated
but not yet started, io_ticks is updated from blk_account_io_start()
but not from part_stat_show() or diskstats_show(). As a consequence,
for example:

blk_account_io_start -> 0s
-> something is wrong with the driver, requests can't be dispatched
-> finally the driver recovers
blk_mq_start_request -> 10s

In this case, if the user runs "iostat 1", 'util' is reported as 0
from 0s to 9s, because diskstats_show() doesn't update io_ticks, and
then in the 9s-10s interval 'util' is reported as 1000%.
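For reference on why that shows up as 1000%, here is a minimal
user-space sketch (the numbers are assumptions for illustration, not a
trace of the real incident) of how tools like iostat derive %util:
they sample io_ticks from /proc/diskstats once per interval and divide
the delta by the interval length, so 10s worth of io_ticks landing in
one 1s interval reads as 1000%:

#include <stdio.h>

static double util_percent(unsigned long prev_ms, unsigned long cur_ms,
			   unsigned long interval_ms)
{
	return 100.0 * (cur_ms - prev_ms) / interval_ms;
}

int main(void)
{
	/* io_ticks (ms) sampled every 1000ms: frozen for 9s, then the
	 * whole 10s gets accounted at once when the request is started */
	unsigned long samples[] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10000 };
	int i;

	for (i = 1; i < (int)(sizeof(samples) / sizeof(samples[0])); i++)
		printf("second %2d: util = %6.1f%%\n", i,
		       util_percent(samples[i - 1], samples[i], 1000));
	return 0;
}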
Signed-off-by: Yu Kuai <[email protected]>
---
block/blk-mq.c | 32 --------------------------------
block/blk-mq.h | 5 -----
block/genhd.c | 18 +++---------------
3 files changed, 3 insertions(+), 52 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 82045f0ab5ba..dfbb4e24f04a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -87,38 +87,6 @@ struct mq_inflight {
unsigned int inflight[2];
};
-static bool blk_mq_check_inflight(struct request *rq, void *priv)
-{
- struct mq_inflight *mi = priv;
-
- if (rq->part && blk_do_io_stat(rq) &&
- (!mi->part->bd_partno || rq->part == mi->part) &&
- blk_mq_rq_state(rq) == MQ_RQ_IN_FLIGHT)
- mi->inflight[rq_data_dir(rq)]++;
-
- return true;
-}
-
-unsigned int blk_mq_in_flight(struct request_queue *q,
- struct block_device *part)
-{
- struct mq_inflight mi = { .part = part };
-
- blk_mq_queue_tag_busy_iter(q, blk_mq_check_inflight, &mi);
-
- return mi.inflight[0] + mi.inflight[1];
-}
-
-void blk_mq_in_flight_rw(struct request_queue *q, struct block_device *part,
- unsigned int inflight[2])
-{
- struct mq_inflight mi = { .part = part };
-
- blk_mq_queue_tag_busy_iter(q, blk_mq_check_inflight, &mi);
- inflight[0] = mi.inflight[0];
- inflight[1] = mi.inflight[1];
-}
-
void blk_freeze_queue_start(struct request_queue *q)
{
mutex_lock(&q->mq_freeze_lock);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index f75a9ecfebde..aa92e1317f18 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -236,11 +236,6 @@ static inline bool blk_mq_hw_queue_mapped(struct blk_mq_hw_ctx *hctx)
return hctx->nr_ctx && hctx->tags;
}
-unsigned int blk_mq_in_flight(struct request_queue *q,
- struct block_device *part);
-void blk_mq_in_flight_rw(struct request_queue *q, struct block_device *part,
- unsigned int inflight[2]);
-
static inline void blk_mq_put_dispatch_budget(struct request_queue *q,
int budget_token)
{
diff --git a/block/genhd.c b/block/genhd.c
index 782a42718965..9c64d34b7dc3 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -954,15 +954,10 @@ ssize_t part_stat_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct block_device *bdev = dev_to_bdev(dev);
- struct request_queue *q = bdev_get_queue(bdev);
struct disk_stats stat;
unsigned int inflight;
- if (queue_is_mq(q))
- inflight = blk_mq_in_flight(q, bdev);
- else
- inflight = part_in_flight(bdev);
-
+ inflight = part_in_flight(bdev);
if (inflight) {
part_stat_lock();
update_io_ticks(bdev, jiffies, true);
@@ -1003,13 +998,9 @@ ssize_t part_inflight_show(struct device *dev, struct device_attribute *attr,
char *buf)
{
struct block_device *bdev = dev_to_bdev(dev);
- struct request_queue *q = bdev_get_queue(bdev);
unsigned int inflight[2];
- if (queue_is_mq(q))
- blk_mq_in_flight_rw(q, bdev, inflight);
- else
- part_in_flight_rw(bdev, inflight);
+ part_in_flight_rw(bdev, inflight);
return sprintf(buf, "%8u %8u\n", inflight[0], inflight[1]);
}
@@ -1251,11 +1242,8 @@ static int diskstats_show(struct seq_file *seqf, void *v)
xa_for_each(&gp->part_tbl, idx, hd) {
if (bdev_is_partition(hd) && !bdev_nr_sectors(hd))
continue;
- if (queue_is_mq(gp->queue))
- inflight = blk_mq_in_flight(gp->queue, hd);
- else
- inflight = part_in_flight(hd);
+ inflight = part_in_flight(hd);
if (inflight) {
part_stat_lock();
update_io_ticks(hd, jiffies, true);
--
2.39.2
Hi,
Sorry that I somehow sent this set twice... Please ignore the duplicate
copy.
Thanks,
Kuai
On 2024/03/23 11:59, Yu Kuai wrote:
> From: Yu Kuai <[email protected]>
>
> Yu Kuai (2):
> block: support to account io_ticks precisely
> block: remove blk_mq_in_flight() and blk_mq_in_flight_rw()
>
> block/blk-core.c | 9 +++++----
> block/blk-merge.c | 2 ++
> block/blk-mq.c | 36 ++++--------------------------------
> block/blk-mq.h | 5 -----
> block/blk.h | 1 +
> block/genhd.c | 20 ++++----------------
> 6 files changed, 16 insertions(+), 57 deletions(-)
>
On Fri, Mar 22 2024 at 11:59P -0400,
Yu Kuai <[email protected]> wrote:
> From: Yu Kuai <[email protected]>
>
> Now that blk-mq also use per_cpu counter to trace inflight as bio-based
> device, they can be replaced by part_in_flight() and part_in_flight_rw()
> directly.
Please reference the commit that enabled this, e.g.:
With commit XXXXX ("commit subject") blk-mq was updated to use per_cpu
counters to track inflight IO same as bio-based devices, so replace
blk_mq_in_flight* with part_in_flight() and part_in_flight_rw()
accordingly.
(I'm not seeing the commit in question, but I only took a quick look).
Mike
Hi,
On 2024/03/24 0:05, Mike Snitzer wrote:
> On Fri, Mar 22 2024 at 11:59P -0400,
> Yu Kuai <[email protected]> wrote:
>
>> From: Yu Kuai <[email protected]>
>>
>> Now that blk-mq also use per_cpu counter to trace inflight as bio-based
>> device, they can be replaced by part_in_flight() and part_in_flight_rw()
>> directly.
>
> Please reference the commit that enabled this, e.g.:
>
> With commit XXXXX ("commit subject") blk-mq was updated to use per_cpu
> counters to track inflight IO same as bio-based devices, so replace
> blk_mq_in_flight* with part_in_flight() and part_in_flight_rw()
> accordingly.
Patch 1 in this set does this, so there is no commit xxx yet.
Thanks,
Kuai
>
> (I'm not seeing the commit in question, but I only took a quick look).
>
> Mike
> .
>
On Sat, Mar 23 2024 at 10:11P -0400,
Yu Kuai <[email protected]> wrote:
> Hi,
>
> On 2024/03/24 0:05, Mike Snitzer wrote:
> > On Fri, Mar 22 2024 at 11:59P -0400,
> > Yu Kuai <[email protected]> wrote:
> >
> > > From: Yu Kuai <[email protected]>
> > >
> > > Now that blk-mq also use per_cpu counter to trace inflight as bio-based
> > > device, they can be replaced by part_in_flight() and part_in_flight_rw()
> > > directly.
> >
> > Please reference the commit that enabled this, e.g.:
> >
> > With commit XXXXX ("commit subject") blk-mq was updated to use per_cpu
> > counters to track inflight IO same as bio-based devices, so replace
> > blk_mq_in_flight* with part_in_flight() and part_in_flight_rw()
> > accordingly.
>
> Patch 1 in this set do this, so there is no commit xxx yet.
>
> Thanks,
> Kuai
Would've helped if I looked at 1/2, but please say:
With the previous commit blk-mq was updated to use per_cpu ...
Hi,
On 2024/03/25 5:57, Mike Snitzer wrote:
> On Sat, Mar 23 2024 at 10:11P -0400,
> Yu Kuai <[email protected]> wrote:
>
>> Hi,
>>
>> On 2024/03/24 0:05, Mike Snitzer wrote:
>>> On Fri, Mar 22 2024 at 11:59P -0400,
>>> Yu Kuai <[email protected]> wrote:
>>>
>>>> From: Yu Kuai <[email protected]>
>>>>
>>>> Now that blk-mq also use per_cpu counter to trace inflight as bio-based
>>>> device, they can be replaced by part_in_flight() and part_in_flight_rw()
>>>> directly.
>>>
>>> Please reference the commit that enabled this, e.g.:
>>>
>>> With commit XXXXX ("commit subject") blk-mq was updated to use per_cpu
>>> counters to track inflight IO same as bio-based devices, so replace
>>> blk_mq_in_flight* with part_in_flight() and part_in_flight_rw()
>>> accordingly.
>>
>> Patch 1 in this set do this, so there is no commit xxx yet.
>>
>> Thanks,
>> Kuai
>
> Would've helped if I looked at 1/2, but please say:
>
> With the previous commit blk-mq was updated to use per_cpu ...
OK, I will pay attention to that in the future. :)
Thanks,
Kuai
> .
>
Hi, Jens!
Hi, Ming!
Hi, Christoph!
Hi, Bart!
Friendly ping ...
The 'util' reported by iostat is very important for users; they don't
have many other ways to get disk status, yet 'util' has been
unnecessarily inaccurate for a long time. I really think patch 1 is
meaningful.
Patch 2 also tries to fix a problem reported by our customer that util
can sometimes be huge. The root cause is that 'inflight' is accounted
from blk_mq_start_request() while 'io_ticks' is accounted from
blk_account_io_start(), so there is a gap. I make 'inflight' accounted
from blk_account_io_start() as well; please let me know if this is not
acceptable.
Thanks!
Kuai
On 2024/03/23 11:59, Yu Kuai wrote:
> From: Yu Kuai <[email protected]>
>
> Yu Kuai (2):
> block: support to account io_ticks precisely
> block: remove blk_mq_in_flight() and blk_mq_in_flight_rw()
>
> block/blk-core.c | 9 +++++----
> block/blk-merge.c | 2 ++
> block/blk-mq.c | 36 ++++--------------------------------
> block/blk-mq.h | 5 -----
> block/blk.h | 1 +
> block/genhd.c | 20 ++++----------------
> 6 files changed, 16 insertions(+), 57 deletions(-)
>
Hi,
On 2024/04/06 15:57, Yu Kuai wrote:
> Hi, Jens!
> Hi, Ming!
> Hi, Christoph!
> Hi, Bart!
>
> Friendly ping ...
>
> The 'util' reported by iostat is very important for users, they don't
> have much choise to get disk status, while 'util' has been inaccurate
> for a long time unnecessarily. I really think patch 1 is meaningful.
>
> Patch 2 also tries to fix a problem by our customer that util can
> sometimes be huge. The root cause is that 'inflight' is account from
> blk_mq_start_request() while 'io_ticks' is account from
> blk_account_io_start(), there is a gap. I let 'inflight' to be account
> from blk_account_io_start() as well, please let me know if this is not
> good.
Friendly ping ...
>
> Thanks!
> Kuai
>
> On 2024/03/23 11:59, Yu Kuai wrote:
>> From: Yu Kuai <[email protected]>
>>
>> Yu Kuai (2):
>> block: support to account io_ticks precisely
>> block: remove blk_mq_in_flight() and blk_mq_in_flight_rw()
>>
>> block/blk-core.c | 9 +++++----
>> block/blk-merge.c | 2 ++
>> block/blk-mq.c | 36 ++++--------------------------------
>> block/blk-mq.h | 5 -----
>> block/blk.h | 1 +
>> block/genhd.c | 20 ++++----------------
>> 6 files changed, 16 insertions(+), 57 deletions(-)
>>
>
> .
>