2021-04-18 05:06:43

by Cong Wang

Subject: [Patch] blk-wbt: fix a divide-by-zero error in rwb_arm_timer()

From: Cong Wang <[email protected]>

We hit a divide error in rwb_arm_timer(), and the crash dump shows that
rqd->scale_step is 16777215 (0xffffff in hex), so the expression
"(rqd->scale_step + 1) << 8" evaluates to 0x100000000, which is just
beyond the 32-bit integer range. It is therefore truncated to 0, and
int_sqrt(0) returns 0 too, so we end up passing 0 as the divisor to
div_u64().
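
The truncation can be reproduced in userspace with a minimal sketch. The
helper name shift_step_32() is hypothetical and not part of blk-wbt.c, and
it uses uint32_t instead of the kernel's 'int' so that the wraparound is
well defined in standard C:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: models "(scale_step + 1) << 8"
 * when the shift is carried out in 32 bits. uint32_t is an assumption
 * made here so that the wraparound is defined behavior in standard C.
 */
static uint32_t shift_step_32(int scale_step)
{
	return (uint32_t)(scale_step + 1) << 8;
}
```

With scale_step = 16777215, the sum is 0x1000000 and shifting it left by 8
needs bit 32, so the 32-bit result wraps to 0.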

Looking at the assembly code generated:

add $0x1,%edi
shl $0x8,%edi
movslq %edi,%rdi
mov 0x10(%rbx),%rdi
xor %edx,%edx
mov %eax,%ecx
shl $0x4,%rdi
mov %rdi,%rax
div %rcx

we notice that the left shift still uses the 32-bit register %edi,
because the type of rqd->scale_step is 'int'. int_sqrt() actually
takes an 'unsigned long' parameter, so the result would fit, at least
on x86_64, if the shift were done in 64 bits. Fix this by explicitly
casting the expression to u64 and calling int_sqrt64(), which also
avoids any ambiguity on 32-bit architectures.
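
As a rough userspace illustration of the widened computation: isqrt64()
below is a stand-in written for this sketch, not the kernel's int_sqrt64()
implementation, and widen_and_shift() and scaled_window() are hypothetical
names. Only the cast-before-shift mirrors the patch.

```c
#include <stdint.h>

/* Stand-in integer square root (floor), assumed here for illustration. */
static uint32_t isqrt64(uint64_t x)
{
	uint64_t result = 0;
	uint64_t bit = (uint64_t)1 << 62; /* highest power of four in 64 bits */

	while (bit > x)
		bit >>= 2;

	while (bit != 0) {
		if (x >= result + bit) {
			x -= result + bit;
			result = (result >> 1) + bit;
		} else {
			result >>= 1;
		}
		bit >>= 2;
	}
	return (uint32_t)result;
}

/* Widen to 64 bits before shifting, as the patch does with (u64). */
static uint64_t widen_and_shift(int scale_step)
{
	return (uint64_t)(scale_step + 1) << 8;
}

/*
 * Models the window calculation for scale_step >= 0. The divisor is at
 * least 16 in that range, so the division cannot see a zero.
 */
static uint64_t scaled_window(uint64_t win_nsec, int scale_step)
{
	return (win_nsec << 4) / isqrt64(widen_and_shift(scale_step));
}
```

For scale_step = 16777215 the widened shift yields 0x100000000, whose
integer square root is 65536, so the divisor is no longer zero.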

After this patch, the assembly code looks correct:

add $0x1,%edi
movslq %edi,%rdi
shl $0x8,%rdi
mov 0x10(%rbx),%rdi
xor %edx,%edx
mov %eax,%ecx
shl $0x4,%rdi
mov %rdi,%rax
div %rcx

Fixes: e34cbd307477 ("blk-wbt: add general throttling mechanism")
Cc: Jens Axboe <[email protected]>
Cc: Fam Zheng <[email protected]>
Cc: Xiongchun Duan <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
---
block/blk-wbt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index 42aed0160f86..5157ca86574f 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -337,7 +337,7 @@ static void rwb_arm_timer(struct rq_wb *rwb)
 		 * though.
 		 */
 		rwb->cur_win_nsec = div_u64(rwb->win_nsec << 4,
-					int_sqrt((rqd->scale_step + 1) << 8));
+					int_sqrt64((u64)(rqd->scale_step + 1) << 8));
 	} else {
 		/*
 		 * For step < 0, we don't want to increase/decrease the
--
2.25.1


2021-04-20 19:58:09

by Cong Wang

Subject: Re: [Patch] blk-wbt: fix a divide-by-zero error in rwb_arm_timer()

On Sat, Apr 17, 2021 at 9:41 PM Cong Wang <[email protected]> wrote:
>
> From: Cong Wang <[email protected]>
>
> We hit a divide error in rwb_arm_timer(), and the crash dump shows that
> rqd->scale_step is 16777215 (0xffffff in hex), so the expression
> "(rqd->scale_step + 1) << 8" evaluates to 0x100000000, which is just
> beyond the 32-bit integer range. It is therefore truncated to 0, and
> int_sqrt(0) returns 0 too, so we end up passing 0 as the divisor to
> div_u64().
>

Never mind. rqd->scale_step should be capped by
rq_depth_scale_down(), so it should never get this large. In the old
calc_wb_limits() implementation, rwb->wb_max was accidentally set to
zero.

Thanks.