2021-07-05 22:13:21

by brookxu.cn

[permalink] [raw]
Subject: [PATCH v3] block: fix the problem of io_ticks becoming smaller

From: Chunguang Xu <[email protected]>

On the IO submission path, blk_account_io_start() may interrupt
the system interruption. When the interruption returns, the value
of part->stamp may have been updated by other cores, so the time
value collected before the interruption may be less than part->
stamp. So when this happens, we should do nothing to make io_ticks
more accurate? For kernels less than 5.0, this may cause io_ticks
to become smaller, which in turn may cause abnormal ioutil values.

v3: update the commit log
v2: sorry, fix compile error due to the missed ')'

Signed-off-by: Chunguang Xu <[email protected]>
---
block/blk-core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 514838c..bbf56ae 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1243,7 +1243,7 @@ static void update_io_ticks(struct block_device *part, unsigned long now,
unsigned long stamp;
again:
stamp = READ_ONCE(part->bd_stamp);
- if (unlikely(stamp != now)) {
+ if (unlikely(time_after(now, stamp))) {
if (likely(cmpxchg(&part->bd_stamp, stamp, now) == stamp))
__part_stat_add(part, io_ticks, end ? now - stamp : 1);
}
--
1.8.3.1


2021-07-06 05:26:42

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3] block: fix the problem of io_ticks becoming smaller

On Tue, Jul 06, 2021 at 05:47:26AM +0800, brookxu wrote:
> From: Chunguang Xu <[email protected]>
>
> On the IO submission path, blk_account_io_start() may interrupt
> the system interruption. When the interruption returns, the value
> of part->stamp may have been updated by other cores, so the time
> value collected before the interruption may be less than part->
> stamp. So when this happens, we should do nothing to make io_ticks
> more accurate? For kernels less than 5.0, this may cause io_ticks
> to become smaller, which in turn may cause abnormal ioutil values.
>
> v3: update the commit log
> v2: sorry, fix compile error due to the missed ')'
>
> Signed-off-by: Chunguang Xu <[email protected]>

The change looks good:

Reviewed-by: Christoph Hellwig <[email protected]>

Although I still have trouble understanding the commit log, especially
the last sentence.

2021-07-06 12:25:36

by brookxu.cn

[permalink] [raw]
Subject: Re: [PATCH v3] block: fix the problem of io_ticks becoming smaller



Christoph Hellwig wrote on 2021/7/6 13:25:
> On Tue, Jul 06, 2021 at 05:47:26AM +0800, brookxu wrote:
>> From: Chunguang Xu <[email protected]>
>>
>> On the IO submission path, blk_account_io_start() may interrupt
>> the system interruption. When the interruption returns, the value
>> of part->stamp may have been updated by other cores, so the time
>> value collected before the interruption may be less than part->
>> stamp. So when this happens, we should do nothing to make io_ticks
>> more accurate? For kernels less than 5.0, this may cause io_ticks
>> to become smaller, which in turn may cause abnormal ioutil values.
>>
>> v3: update the commit log
>> v2: sorry, fix compile error due to the missed ')'
>>
>> Signed-off-by: Chunguang Xu <[email protected]>
>
> The change looks good:
>
> Reviewed-by: Christoph Hellwig <[email protected]>
>
> Although I still have trouble understanding the commit log, especially
> the last sentence.

Thanks for your time,5b18b5a73760 ("block: delete part_round_stats and
switch to less precise counting") Merged in 5.0-rc1. Before that, we relied
on in_flight to count the disk's io_ticks,as follows:

static void part_round_stats_single(struct request_queue *q,
struct hd_struct *part, unsigned long now,
unsigned int inflight)
{
if (inflight) {
__part_stat_add(part, time_in_queue,
inflight * (now - part->stamp));
__part_stat_add(part, io_ticks, (now - part->stamp));
}
part->stamp = now;
}

The value of io_ticks should increase monotonically before the count wraps
around. However, due to the reasons mentioned above, the interrupt on the
IO submission path may cause the time obtained before the interrupt to be
less than part->stamp after the interrupt returns, resulting in the value
of io_ticks becoming smaller than the previous value. If the user task
periodically samples /proc/diskstats and calculates ioutil, since io_ticks
increases non-monotonously, the difference between adjacent times may be
negative, which in turn makes ioutil abnormal.Fortunately, this problem
can be easily circumvented.


2021-07-07 10:47:51

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH v3] block: fix the problem of io_ticks becoming smaller

On Tue, Jul 06, 2021 at 05:47:26AM +0800, brookxu wrote:
> From: Chunguang Xu <[email protected]>
>
> On the IO submission path, blk_account_io_start() may interrupt
> the system interruption. When the interruption returns, the value
> of part->stamp may have been updated by other cores, so the time
> value collected before the interruption may be less than part->
> stamp. So when this happens, we should do nothing to make io_ticks
> more accurate? For kernels less than 5.0, this may cause io_ticks
> to become smaller, which in turn may cause abnormal ioutil values.

Just be curious, is there any user visible difference with this change?

It can only be one issue if the stamp update crosses two jiffies, but
this event should be very unlikely, right?

Thanks,
Ming

2021-07-07 12:44:53

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH v3] block: fix the problem of io_ticks becoming smaller

On 7/5/21 3:47 PM, brookxu wrote:
> From: Chunguang Xu <[email protected]>
>
> On the IO submission path, blk_account_io_start() may interrupt
> the system interruption. When the interruption returns, the value
> of part->stamp may have been updated by other cores, so the time
> value collected before the interruption may be less than part->
> stamp. So when this happens, we should do nothing to make io_ticks
> more accurate? For kernels less than 5.0, this may cause io_ticks
> to become smaller, which in turn may cause abnormal ioutil values.
>
> v3: update the commit log
> v2: sorry, fix compile error due to the missed ')'

The version log goes below the '---' in the email, not in the commit
message. I fixed that up. Applied for 5.14, thanks.

--
Jens Axboe