2023-09-28 02:08:30

by Li Nan

[permalink] [raw]
Subject: [PATCH] blk-throttle: Calculate allowed value only when the throttle is enabled

From: Li Nan <[email protected]>

When the throttle of bps is not enabled, tg_bps_limit() returns U64_MAX,
which is be used in calculate_bytes_allowed(), and divide 0 error will
happen.

To fix it, only calculate allowed value when the throttle of bps/iops is
enabled and the value will be used.

Fixes: e8368b57c006 ("blk-throttle: use calculate_io/bytes_allowed() for throtl_trim_slice()")
Reported-by: Changhui Zhong <[email protected]>
Closes: https://lore.kernel.org/all/CAGVVp+Vt6idZtxfU9jF=VSbu145Wi-d-WnAZx_hEfOL8yLZgBA@mail.gmail.com
Signed-off-by: Li Nan <[email protected]>
---
block/blk-throttle.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 38a881cf97d0..3c9a74ab9f0e 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -730,8 +730,10 @@ static u64 calculate_bytes_allowed(u64 bps_limit, unsigned long jiffy_elapsed)
static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
{
unsigned long time_elapsed;
- long long bytes_trim;
- int io_trim;
+ long long bytes_trim = 0;
+ int io_trim = 0;
+ u64 bps_limit;
+ u32 iops_limit;

BUG_ON(time_before(tg->slice_end[rw], tg->slice_start[rw]));

@@ -758,11 +760,14 @@ static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
if (!time_elapsed)
return;

- bytes_trim = calculate_bytes_allowed(tg_bps_limit(tg, rw),
- time_elapsed) +
- tg->carryover_bytes[rw];
- io_trim = calculate_io_allowed(tg_iops_limit(tg, rw), time_elapsed) +
- tg->carryover_ios[rw];
+ bps_limit = tg_bps_limit(tg, rw);
+ iops_limit = tg_iops_limit(tg, rw);
+ if (tg->bytes_disp[rw] > 0 && bps_limit != U64_MAX)
+ bytes_trim = calculate_bytes_allowed(bps_limit,
+ time_elapsed) + tg->carryover_bytes[rw];
+ if (tg->io_disp[rw] > 0 && iops_limit != UINT_MAX)
+ io_trim = calculate_io_allowed(iops_limit, time_elapsed) +
+ tg->carryover_ios[rw];
if (bytes_trim <= 0 && io_trim <= 0)
return;

--
2.39.2


2023-10-04 19:32:01

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH] blk-throttle: Calculate allowed value only when the throttle is enabled

Hello,

On Thu, Sep 28, 2023 at 09:58:58AM +0800, [email protected] wrote:
> From: Li Nan <[email protected]>
>
> When the throttle of bps is not enabled, tg_bps_limit() returns U64_MAX,
> which is be used in calculate_bytes_allowed(), and divide 0 error will
> happen.

calculate_bytes_allowed() is just

return mul_u64_u64_div_u64(bps_limit, (u64)jiffy_elapsed, (u64)HZ);

The only division is by HZ. How does divide by 0 happen?

Thanks.

--
tejun

2023-10-04 20:31:32

by Khazhy Kumykov

[permalink] [raw]
Subject: Re: [PATCH] blk-throttle: Calculate allowed value only when the throttle is enabled

On Wed, Oct 4, 2023 at 12:32 PM Tejun Heo <[email protected]> wrote:
>
> Hello,
>
> On Thu, Sep 28, 2023 at 09:58:58AM +0800, [email protected] wrote:
> > From: Li Nan <[email protected]>
> >
> > When the throttle of bps is not enabled, tg_bps_limit() returns U64_MAX,
> > which is be used in calculate_bytes_allowed(), and divide 0 error will
> > happen.
>
> calculate_bytes_allowed() is just
>
> return mul_u64_u64_div_u64(bps_limit, (u64)jiffy_elapsed, (u64)HZ);
>
> The only division is by HZ. How does divide by 0 happen?

We've also noticed this - haven't looked too deeply but I don't think
it's a divide by zero, but an overflow (bps_limit * jiffy_elapsed / HZ
will overflow for jiffies > HZ). mul_u64_u64_div_u64 does say it will
throw DE if the mul overflows

>
> Thanks.
>
> --
> tejun

2023-10-04 20:44:59

by Khazhy Kumykov

[permalink] [raw]
Subject: Re: [PATCH] blk-throttle: Calculate allowed value only when the throttle is enabled

On Wed, Sep 27, 2023 at 7:05 PM <[email protected]> wrote:
>
> From: Li Nan <[email protected]>
>
> When the throttle of bps is not enabled, tg_bps_limit() returns U64_MAX,
> which is be used in calculate_bytes_allowed(), and divide 0 error will
> happen.
>
> To fix it, only calculate allowed value when the throttle of bps/iops is
> enabled and the value will be used.
>
> Fixes: e8368b57c006 ("blk-throttle: use calculate_io/bytes_allowed() for throtl_trim_slice()")
> Reported-by: Changhui Zhong <[email protected]>
> Closes: https://lore.kernel.org/all/CAGVVp+Vt6idZtxfU9jF=VSbu145Wi-d-WnAZx_hEfOL8yLZgBA@mail.gmail.com
> Signed-off-by: Li Nan <[email protected]>
> ---
> block/blk-throttle.c | 19 ++++++++++++-------
> 1 file changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 38a881cf97d0..3c9a74ab9f0e 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -730,8 +730,10 @@ static u64 calculate_bytes_allowed(u64 bps_limit, unsigned long jiffy_elapsed)
> static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
> {
> unsigned long time_elapsed;
> - long long bytes_trim;
> - int io_trim;
> + long long bytes_trim = 0;
> + int io_trim = 0;
> + u64 bps_limit;
> + u32 iops_limit;
>
> BUG_ON(time_before(tg->slice_end[rw], tg->slice_start[rw]));
>
> @@ -758,11 +760,14 @@ static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
> if (!time_elapsed)
> return;
>
> - bytes_trim = calculate_bytes_allowed(tg_bps_limit(tg, rw),
> - time_elapsed) +
> - tg->carryover_bytes[rw];
> - io_trim = calculate_io_allowed(tg_iops_limit(tg, rw), time_elapsed) +
> - tg->carryover_ios[rw];
> + bps_limit = tg_bps_limit(tg, rw);
> + iops_limit = tg_iops_limit(tg, rw);
> + if (tg->bytes_disp[rw] > 0 && bps_limit != U64_MAX)
I don't think this change is sufficient to prevent kernel crash, as a
"clever" user could still set the bps_limit to U64_MAX - 1 (or another
large value), which probably would still result in the same crash. The
comment in mul_u64_u64_div_u64 suggests there's something we can do to
better handle the overflow case, but I'm not sure what it's referring
to. ("Will generate an #DE when the result doesn't fit u64, could fix
with an __ex_table[] entry when it becomes an issue.") Otherwise, we
probably need to remove the mul_u64_u64_div_u64 and check for
overflow/potential overflow ourselves?

Khazhy
> + bytes_trim = calculate_bytes_allowed(bps_limit,
> + time_elapsed) + tg->carryover_bytes[rw];
> + if (tg->io_disp[rw] > 0 && iops_limit != UINT_MAX)
> + io_trim = calculate_io_allowed(iops_limit, time_elapsed) +
> + tg->carryover_ios[rw];
> if (bytes_trim <= 0 && io_trim <= 0)
> return;
>
> --
> 2.39.2
>

2023-10-05 16:46:15

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] blk-throttle: Calculate allowed value only when the throttle is enabled

Hi Li,

On 10/05, Li Nan wrote:
>
> >I don't think this change is sufficient to prevent kernel crash, as a
> >"clever" user could still set the bps_limit to U64_MAX - 1 (or another
> >large value), which probably would still result in the same crash. The
> >comment in mul_u64_u64_div_u64 suggests there's something we can do to
> >better handle the overflow case, but I'm not sure what it's referring
> >to. ("Will generate an #DE when the result doesn't fit u64, could fix
> >with an __ex_table[] entry when it becomes an issue.") Otherwise, we
>
> When (a * mul) overflows, a divide 0 error occurs in
> mul_u64_u64_div_u64(). Commit 3dc167ba5729 ("sched/cputime: Improve
> cputime_adjust()") changed func and said: "Will generate an #DE when the
> result doesn't fit u64, could fix with an __ex_table[] entry when it
> becomes an issue." But we are unsure of how to fix it. Could you please
> explain how to fix this issue.

Not sure I understand the question...

OK, we can change mul_u64_u64_div_u64() to trap the exception, say,

static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div)
{
u64 q;

asm ("mulq %2; 1: divq %3; 2:\n"
_ASM_EXTABLE_TYPE(1b, 2b, EX_TYPE_DEFAULT|EX_FLAG_CLEAR_AX)
: "=a" (q)
: "a" (a), "rm" (mul), "rm" (div)
: "rdx");

return q;
}

should (iiuc) return 0 if the result doesn't fit u64 or div == 0.

But even if we forget that this is x86-specific, how can this help?
What should calculate_bytes_allowed() do/return in this case?

> >probably need to remove the mul_u64_u64_div_u64 and check for
> >overflow/potential overflow ourselves?

probably yes...

Oleg.

2023-10-05 17:17:49

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] blk-throttle: Calculate allowed value only when the throttle is enabled

sorry, didn't notice this part before.

I am not a asm expert (to say at least;) but

On 10/05, Li Nan wrote:
>
> When (a * mul) overflows, a divide 0 error occurs in
> mul_u64_u64_div_u64().

Just in case... No, iirc it is divq which triggers #DE when the
result of division doesn't fit u64.

(a * mul) can't overflow, the result is 128-bit rax:rdx number.

Oleg.

2023-10-05 17:21:59

by Khazhy Kumykov

[permalink] [raw]
Subject: Re: [PATCH] blk-throttle: Calculate allowed value only when the throttle is enabled

On Thu, Oct 5, 2023 at 10:05 AM Oleg Nesterov <[email protected]> wrote:
>
> sorry, didn't notice this part before.
>
> I am not a asm expert (to say at least;) but
>
> On 10/05, Li Nan wrote:
> >
> > When (a * mul) overflows, a divide 0 error occurs in
> > mul_u64_u64_div_u64().
>
> Just in case... No, iirc it is divq which triggers #DE when the
> result of division doesn't fit u64.
Yeah, sorry for my incorrect wording here - but we're probably seeing
exactly that the final result doesn't fit in u64. (I wasn't familiar
with the intermediary registers here, thanks for explaining)
>
> (a * mul) can't overflow, the result is 128-bit rax:rdx number.
>
> Oleg.
>

2023-10-07 01:24:33

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH] blk-throttle: Calculate allowed value only when the throttle is enabled

Hi,

?? 2023/10/06 0:24, Oleg Nesterov д??:
> Hi Li,
>
> On 10/05, Li Nan wrote:
>>
>>> I don't think this change is sufficient to prevent kernel crash, as a
>>> "clever" user could still set the bps_limit to U64_MAX - 1 (or another
>>> large value), which probably would still result in the same crash. The
>>> comment in mul_u64_u64_div_u64 suggests there's something we can do to
>>> better handle the overflow case, but I'm not sure what it's referring
>>> to. ("Will generate an #DE when the result doesn't fit u64, could fix
>>> with an __ex_table[] entry when it becomes an issue.") Otherwise, we
>>
>> When (a * mul) overflows, a divide 0 error occurs in
>> mul_u64_u64_div_u64(). Commit 3dc167ba5729 ("sched/cputime: Improve
>> cputime_adjust()") changed func and said: "Will generate an #DE when the
>> result doesn't fit u64, could fix with an __ex_table[] entry when it
>> becomes an issue." But we are unsure of how to fix it. Could you please
>> explain how to fix this issue.
>
> Not sure I understand the question...
>
> OK, we can change mul_u64_u64_div_u64() to trap the exception, say,
>
> static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div)
> {
> u64 q;
>
> asm ("mulq %2; 1: divq %3; 2:\n"
> _ASM_EXTABLE_TYPE(1b, 2b, EX_TYPE_DEFAULT|EX_FLAG_CLEAR_AX)
> : "=a" (q)
> : "a" (a), "rm" (mul), "rm" (div)
> : "rdx");
>
> return q;
> }
>
> should (iiuc) return 0 if the result doesn't fit u64 or div == 0.
>
> But even if we forget that this is x86-specific, how can this help?
> What should calculate_bytes_allowed() do/return in this case?

I believe, U64_MAX should be returned if result doesn't fit u64;
>
>>> probably need to remove the mul_u64_u64_div_u64 and check for
>>> overflow/potential overflow ourselves?
>
> probably yes...

How about this?

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 1101fb6f6cc8..5482c316a103 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -723,6 +723,10 @@ static unsigned int calculate_io_allowed(u32
iops_limit,

static u64 calculate_bytes_allowed(u64 bps_limit, unsigned long
jiffy_elapsed)
{
+ if (jiffy_elapsed > HZ &&
+ bps_limit > mul_u64_u64_div_u64(U64_MAX, (u64)HZ,
(u64)jiffy_elapsed);
+ return U64_MAX;
+
return mul_u64_u64_div_u64(bps_limit, (u64)jiffy_elapsed, (u64)HZ);
}

>
> Oleg.
>
> .
>

2023-10-07 15:18:15

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] blk-throttle: Calculate allowed value only when the throttle is enabled

On 10/07, Yu Kuai wrote:
>
> >>>probably need to remove the mul_u64_u64_div_u64 and check for
> >>>overflow/potential overflow ourselves?
> >
> >probably yes...
>
> How about this?
>
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 1101fb6f6cc8..5482c316a103 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -723,6 +723,10 @@ static unsigned int calculate_io_allowed(u32
> iops_limit,
>
> static u64 calculate_bytes_allowed(u64 bps_limit, unsigned long
> jiffy_elapsed)
> {
> + if (jiffy_elapsed > HZ &&
> + bps_limit > mul_u64_u64_div_u64(U64_MAX, (u64)HZ,
> (u64)jiffy_elapsed);
> + return U64_MAX;
> +

I can't suggest anything better...

but I do not know if it is possible that HZ > jiffy_elapsed. If yes, then
mul_u64_u64_div_u64() above is not safe too.

Oleg.

2023-10-08 01:26:49

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH] blk-throttle: Calculate allowed value only when the throttle is enabled

Hi,

?? 2023/10/07 23:16, Oleg Nesterov д??:
> On 10/07, Yu Kuai wrote:
>>
>>>>> probably need to remove the mul_u64_u64_div_u64 and check for
>>>>> overflow/potential overflow ourselves?
>>>
>>> probably yes...
>>
>> How about this?
>>
>> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
>> index 1101fb6f6cc8..5482c316a103 100644
>> --- a/block/blk-throttle.c
>> +++ b/block/blk-throttle.c
>> @@ -723,6 +723,10 @@ static unsigned int calculate_io_allowed(u32
>> iops_limit,
>>
>> static u64 calculate_bytes_allowed(u64 bps_limit, unsigned long
>> jiffy_elapsed)
>> {
>> + if (jiffy_elapsed > HZ &&
>> + bps_limit > mul_u64_u64_div_u64(U64_MAX, (u64)HZ,
>> (u64)jiffy_elapsed);
>> + return U64_MAX;
>> +
>
> I can't suggest anything better...
>
> but I do not know if it is possible that HZ > jiffy_elapsed. If yes, then
> mul_u64_u64_div_u64() above is not safe too.

Well, 'jiffy_elapsed > HZ' is judged before mul_u64_u64_div_u64().

Overflow can only happen if the above 2 conditions passed, and U64_MAX
is returned.

Thanks,
Kuai

>
> Oleg.
>
> .
>

2023-10-08 11:38:12

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] blk-throttle: Calculate allowed value only when the throttle is enabled

On 10/08, Yu Kuai wrote:
>
> >> static u64 calculate_bytes_allowed(u64 bps_limit, unsigned long
> >>jiffy_elapsed)
> >> {
> >>+ if (jiffy_elapsed > HZ &&
> >>+ bps_limit > mul_u64_u64_div_u64(U64_MAX, (u64)HZ,
> >>(u64)jiffy_elapsed);
> >>+ return U64_MAX;
> >>+
> >
> >I can't suggest anything better...
> >
> >but I do not know if it is possible that HZ > jiffy_elapsed. If yes, then
> >mul_u64_u64_div_u64() above is not safe too.
>
> Well, 'jiffy_elapsed > HZ' is judged before mul_u64_u64_div_u64().

Yes, sorry, somehow I didn't notice this check.

Oleg.

2023-10-13 21:52:54

by Khazhy Kumykov

[permalink] [raw]
Subject: Re: [PATCH] blk-throttle: Calculate allowed value only when the throttle is enabled

Looking at the generic mul_u64_u64_div_u64 impl, it doesn't handle
overflow of the final result either, as far as I can tell. So while on
x86 we get a DE, on non-x86 we just get the wrong result.

(Aside: after 8d6bbaada2e0 ("blk-throttle: prevent overflow while
calculating wait time"), setting a very-high bps_limit would probably
also cause this crash, no?)

Would it be possible to have a "check_mul_u64_u64_div_u64_overflow()",
where if the result doesn't fit in u64, we indicate (and let the
caller choose what to do? Here we should just return U64_MAX)?

Absent that, maybe we can take inspiration from the generic
mul_u64_u64_div_u64? (Forgive the paste)

static u64 calculate_bytes_allowed(u64 bps_limit, unsigned long jiffy_elapsed)
{
+ /* Final result probably won't fit in u64 */
+ if (ilog2(bps_limit) + ilog2(jiffy_elapsed) - ilog2(HZ) > 62)
+ return U64_MAX;
return mul_u64_u64_div_u64(bps_limit, (u64)jiffy_elapsed, (u64)HZ);
}

2023-10-16 01:47:43

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH] blk-throttle: Calculate allowed value only when the throttle is enabled

Hi,

在 2023/10/14 5:51, Khazhy Kumykov 写道:
> Looking at the generic mul_u64_u64_div_u64 impl, it doesn't handle
> overflow of the final result either, as far as I can tell. So while on
> x86 we get a DE, on non-x86 we just get the wrong result.
>
> (Aside: after 8d6bbaada2e0 ("blk-throttle: prevent overflow while
> calculating wait time"), setting a very-high bps_limit would probably
> also cause this crash, no?)
>
> Would it be possible to have a "check_mul_u64_u64_div_u64_overflow()",
> where if the result doesn't fit in u64, we indicate (and let the
> caller choose what to do? Here we should just return U64_MAX)?
>
> Absent that, maybe we can take inspiration from the generic
> mul_u64_u64_div_u64? (Forgive the paste)
>
> static u64 calculate_bytes_allowed(u64 bps_limit, unsigned long jiffy_elapsed)
> {
> + /* Final result probably won't fit in u64 */
> + if (ilog2(bps_limit) + ilog2(jiffy_elapsed) - ilog2(HZ) > 62)

I'm not sure, but this condition looks necessary, but doesn't look
sufficient, for example, jiffy_elapsed cound be greater than HZ, while
ilog2(jiffy_elapsed) is equal to ilog2(HZ).

Thanks,
Kuai

> + return U64_MAX;
> return mul_u64_u64_div_u64(bps_limit, (u64)jiffy_elapsed, (u64)HZ);
> }
>
> .
>

2023-10-16 20:07:12

by Khazhy Kumykov

[permalink] [raw]
Subject: Re: [PATCH] blk-throttle: Calculate allowed value only when the throttle is enabled

On Sun, Oct 15, 2023 at 6:47 PM Yu Kuai <[email protected]> wrote:
>
> Hi,
>
> 在 2023/10/14 5:51, Khazhy Kumykov 写道:
> > Looking at the generic mul_u64_u64_div_u64 impl, it doesn't handle
> > overflow of the final result either, as far as I can tell. So while on
> > x86 we get a DE, on non-x86 we just get the wrong result.
> >
> > (Aside: after 8d6bbaada2e0 ("blk-throttle: prevent overflow while
> > calculating wait time"), setting a very-high bps_limit would probably
> > also cause this crash, no?)
> >
> > Would it be possible to have a "check_mul_u64_u64_div_u64_overflow()",
> > where if the result doesn't fit in u64, we indicate (and let the
> > caller choose what to do? Here we should just return U64_MAX)?
> >
> > Absent that, maybe we can take inspiration from the generic
> > mul_u64_u64_div_u64? (Forgive the paste)
> >
> > static u64 calculate_bytes_allowed(u64 bps_limit, unsigned long jiffy_elapsed)
> > {
> > + /* Final result probably won't fit in u64 */
> > + if (ilog2(bps_limit) + ilog2(jiffy_elapsed) - ilog2(HZ) > 62)
>
> I'm not sure, but this condition looks necessary, but doesn't look
> sufficient, for example, jiffy_elapsed cound be greater than HZ, while
> ilog2(jiffy_elapsed) is equal to ilog2(HZ).
I believe 62 is correct, although admittedly it's less "intuitive"
than the check in mul_u64_u64_div_u64()....

The result overflows if log2(A * B / C) >= 64, so we want to ensure that:
log2(A) + log2(B) - log2(C) < 64

Given that:
ilog2(A) <= log2(A) < ilog2(A) + 1 // truncation defn
It follows that:
-log2(A) <= -ilog2(A) // Inverse rule
log2(A) - 1 < ilog2(A)

Starting from:
ilog2(A) + ilog2(B) - ilog2(C) <= X

We can show:
(log2(A) - 1) + (log2(B) - 1) + (-log2(C)) < ilog2(A) + ilog2(B) +
(-ilog2(C)) // strict inequality here since the substitutions for A
and B terms are strictly less
(log2(A) - 1) + (log2(B) - 1) + (-log2(C)) < X
log2(A) + log2(B) - log2(C) < X + 2

So for X = 62, log2(A) + log2(B) - log2(C) < 64 must be true, and we
must be safe from overflow.

So... by converse, if ilog2(A) + ilog2(B) - ilog2(C) > 62, we cannot
guarantee that the result will not overflow - thus we bail out.

// end math

It /is/ less exact than your proposal (sufficient, but not necessary),
but saves an extra 128bit mul.

I mostly just want us to pick /something/, since 6.6-rc and the LTSs
with the patch in question are busted currently. :)



>
> Thanks,
> Kuai
>
> > + return U64_MAX;
> > return mul_u64_u64_div_u64(bps_limit, (u64)jiffy_elapsed, (u64)HZ);
> > }
> >
> > .
> >
>

2023-10-16 21:48:47

by Khazhismel Kumykov

[permalink] [raw]
Subject: Re: [PATCH] blk-throttle: Calculate allowed value only when the throttle is enabled

On Mon, Oct 16, 2023 at 1:06 PM Khazhy Kumykov <[email protected]> wrote:
>
> I mostly just want us to pick /something/, since 6.6-rc and the LTSs
> with the patch in question are busted currently. :)
>
And just to make it explicit: I believe Kaui's proposal is correct.

Khazhy


Attachments:
smime.p7s (3.91 kB)
S/MIME Cryptographic Signature

2023-10-17 02:39:00

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH] blk-throttle: Calculate allowed value only when the throttle is enabled

Hi,

在 2023/10/17 4:06, Khazhy Kumykov 写道:
> On Sun, Oct 15, 2023 at 6:47 PM Yu Kuai <[email protected]> wrote:
>>
>> Hi,
>>
>> 在 2023/10/14 5:51, Khazhy Kumykov 写道:
>>> Looking at the generic mul_u64_u64_div_u64 impl, it doesn't handle
>>> overflow of the final result either, as far as I can tell. So while on
>>> x86 we get a DE, on non-x86 we just get the wrong result.
>>>
>>> (Aside: after 8d6bbaada2e0 ("blk-throttle: prevent overflow while
>>> calculating wait time"), setting a very-high bps_limit would probably
>>> also cause this crash, no?)
>>>
>>> Would it be possible to have a "check_mul_u64_u64_div_u64_overflow()",
>>> where if the result doesn't fit in u64, we indicate (and let the
>>> caller choose what to do? Here we should just return U64_MAX)?
>>>
>>> Absent that, maybe we can take inspiration from the generic
>>> mul_u64_u64_div_u64? (Forgive the paste)
>>>
>>> static u64 calculate_bytes_allowed(u64 bps_limit, unsigned long jiffy_elapsed)
>>> {
>>> + /* Final result probably won't fit in u64 */
>>> + if (ilog2(bps_limit) + ilog2(jiffy_elapsed) - ilog2(HZ) > 62)
>>
>> I'm not sure, but this condition looks necessary, but doesn't look
>> sufficient, for example, jiffy_elapsed cound be greater than HZ, while
>> ilog2(jiffy_elapsed) is equal to ilog2(HZ).
> I believe 62 is correct, although admittedly it's less "intuitive"
> than the check in mul_u64_u64_div_u64()....
>
> The result overflows if log2(A * B / C) >= 64, so we want to ensure that:
> log2(A) + log2(B) - log2(C) < 64
>
> Given that:
> ilog2(A) <= log2(A) < ilog2(A) + 1 // truncation defn
> It follows that:
> -log2(A) <= -ilog2(A) // Inverse rule
> log2(A) - 1 < ilog2(A)
>
> Starting from:
> ilog2(A) + ilog2(B) - ilog2(C) <= X
>
> We can show:
> (log2(A) - 1) + (log2(B) - 1) + (-log2(C)) < ilog2(A) + ilog2(B) +
> (-ilog2(C)) // strict inequality here since the substitutions for A
> and B terms are strictly less
> (log2(A) - 1) + (log2(B) - 1) + (-log2(C)) < X
> log2(A) + log2(B) - log2(C) < X + 2
>
> So for X = 62, log2(A) + log2(B) - log2(C) < 64 must be true, and we
> must be safe from overflow.
>
> So... by converse, if ilog2(A) + ilog2(B) - ilog2(C) > 62, we cannot
> guarantee that the result will not overflow - thus we bail out.
>
> // end math

Thanks for the explanation, I understand that, so the problem is that
if the above condition(>62) match, the result may not overflow, but
U64_MAX is returned here, this is not correct but I guess this doesn't
matter in real word, it's impossible that user will issue more than
1<<63 bytes IO in an extended slice.

I'm good with this fix with some comments.

Thanks,
Kuai

>
> It /is/ less exact than your proposal (sufficient, but not necessary),
> but saves an extra 128bit mul.
>
> I mostly just want us to pick /something/, since 6.6-rc and the LTSs
> with the patch in question are busted currently. :)
>
>
>
>>
>> Thanks,
>> Kuai
>>
>>> + return U64_MAX;
>>> return mul_u64_u64_div_u64(bps_limit, (u64)jiffy_elapsed, (u64)HZ);
>>> }
>>>
>>> .
>>>
>>
>
> .
>