2022-05-16 09:42:33

by Zhang Wensheng

[permalink] [raw]
Subject: [PATCH -next] block: fix io hung of setting throttle limit frequently

Our test found an IO hang problem which can be simplified as follows:
set the throttle iops/bps limit to a small value and issue a big
bio. If the IO is throttled for 10s, wait 1s and then set the same
throttle iops/bps limit again. The new throttle time becomes 10s
again. If the limit is set repeatedly within every 10s, this IO
will stay in the throttle queue forever.

When the throttle iops/bps limit is set, tg_conf_updated() is
called. It starts a new slice and updates the pending timer with a
new dispatch time, which makes the IO wait again.

Because of commit 9f5ede3c01f9 ("block: throttle split bio in
case of iops limit"), the IO works fine if it is limited by bps.
That commit fixes part of the problem, but not the root cause.

To fix this problem, add a check before updating the dispatch time:
if the pending timer is alive, do not update the time.

Signed-off-by: Zhang Wensheng <[email protected]>
---
block/blk-throttle.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 469c483719be..8acb205dfa85 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1321,12 +1321,14 @@ static void tg_conf_updated(struct throtl_grp *tg, bool global)
* that a group's limit are dropped suddenly and we don't want to
* account recently dispatched IO with new low rate.
*/
- throtl_start_new_slice(tg, READ);
- throtl_start_new_slice(tg, WRITE);
+ if (!timer_pending(&sq->parent_sq->pending_timer)) {
+ throtl_start_new_slice(tg, READ);
+ throtl_start_new_slice(tg, WRITE);

- if (tg->flags & THROTL_TG_PENDING) {
- tg_update_disptime(tg);
- throtl_schedule_next_dispatch(sq->parent_sq, true);
+ if (tg->flags & THROTL_TG_PENDING) {
+ tg_update_disptime(tg);
+ throtl_schedule_next_dispatch(sq->parent_sq, true);
+ }
}
}

--
2.31.1
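
The hang described in the commit message can be reproduced with a toy
model. The sketch below is not kernel code: the struct and all toy_*
names are hypothetical, time is in whole seconds, and it assumes that
a config update unconditionally restarts the slice the dispatch time
is measured from.

```c
#include <assert.h>

/* Toy model of one throttle group holding a single queued bio. */
struct toy_tg {
	long slice_start;	/* second at which the current slice began */
	long wait;		/* seconds the bio needs under the limit */
};

/* The dispatch time is measured from the start of the current slice. */
static long toy_disptime(const struct toy_tg *tg)
{
	return tg->slice_start + tg->wait;
}

/* Models a config update that always starts a new slice. */
static void toy_conf_updated(struct toy_tg *tg, long now)
{
	tg->slice_start = now;
}

/*
 * Step through time one second at a time. The limit is rewritten every
 * `period` seconds (0 means never). Returns the second at which the bio
 * is dispatched, or -1 if it is still queued at `horizon`.
 */
static long toy_run(long wait, long period, long horizon)
{
	struct toy_tg tg = { .slice_start = 0, .wait = wait };
	long now;

	assert(wait > 0 && period >= 0);
	for (now = 0; now <= horizon; now++) {
		if (now >= toy_disptime(&tg))
			return now;
		if (period > 0 && now > 0 && now % period == 0)
			toy_conf_updated(&tg, now);
	}
	return -1;
}
```

In this model toy_run(10, 0, 100) returns 10, while toy_run(10, 1, 100)
returns -1: rewriting the limit every second keeps pushing the dispatch
time 10s into the future.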



2022-05-17 02:42:32

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH -next] block: fix io hung of setting throttle limit frequently

On Mon, May 16, 2022 at 09:44:29AM +0800, Zhang Wensheng wrote:
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 469c483719be..8acb205dfa85 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -1321,12 +1321,14 @@ static void tg_conf_updated(struct throtl_grp *tg, bool global)
> * that a group's limit are dropped suddenly and we don't want to
> * account recently dispatched IO with new low rate.
> */
> - throtl_start_new_slice(tg, READ);
> - throtl_start_new_slice(tg, WRITE);
> + if (!timer_pending(&sq->parent_sq->pending_timer)) {
> + throtl_start_new_slice(tg, READ);
> + throtl_start_new_slice(tg, WRITE);
>
> - if (tg->flags & THROTL_TG_PENDING) {
> - tg_update_disptime(tg);
> - throtl_schedule_next_dispatch(sq->parent_sq, true);
> + if (tg->flags & THROTL_TG_PENDING) {
> + tg_update_disptime(tg);
> + throtl_schedule_next_dispatch(sq->parent_sq, true);
> + }

Yeah, but this ends up breaking the reason why it's starting the new slices
in the first place explained in the commit above, right? I'm not sure what
the right solution is but this likely isn't it.

Thanks.

--
tejun

2022-05-17 06:51:14

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next] block: fix io hung of setting throttle limit frequently

On 2022/05/17 3:29, Tejun Heo wrote:
> On Mon, May 16, 2022 at 09:44:29AM +0800, Zhang Wensheng wrote:
>> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
>> index 469c483719be..8acb205dfa85 100644
>> --- a/block/blk-throttle.c
>> +++ b/block/blk-throttle.c
>> @@ -1321,12 +1321,14 @@ static void tg_conf_updated(struct throtl_grp *tg, bool global)
>> * that a group's limit are dropped suddenly and we don't want to
>> * account recently dispatched IO with new low rate.
>> */
>> - throtl_start_new_slice(tg, READ);
>> - throtl_start_new_slice(tg, WRITE);
>> + if (!timer_pending(&sq->parent_sq->pending_timer)) {
>> + throtl_start_new_slice(tg, READ);
>> + throtl_start_new_slice(tg, WRITE);
>>
>> - if (tg->flags & THROTL_TG_PENDING) {
>> - tg_update_disptime(tg);
>> - throtl_schedule_next_dispatch(sq->parent_sq, true);
>> + if (tg->flags & THROTL_TG_PENDING) {
>> + tg_update_disptime(tg);
>> + throtl_schedule_next_dispatch(sq->parent_sq, true);
>> + }
>
> Yeah, but this ends up breaking the reason why it's starting the new slices
> in the first place explained in the commit above, right? I'm not sure what
> the right solution is but this likely isn't it.
>
Hi, Tejun

Ming added a condition in tg_with_in_bps_limit():
- if (bps_limit == U64_MAX) {
+ /* no need to throttle if this bio's bytes have been accounted */
+ if (bps_limit == U64_MAX || bio_flagged(bio, BIO_THROTTLED)) {

This lets the first throttled bio be issued immediately once
the config is updated.

Do you think this behaviour is OK? If so, we can do the same for
tg_with_in_iops_limit.

Thanks,
Kuai

>

2022-05-17 10:33:08

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH -next] block: fix io hung of setting throttle limit frequently

On Tue, May 17, 2022 at 11:12:28AM +0800, yukuai (C) wrote:
> On 2022/05/17 3:29, Tejun Heo wrote:
> > On Mon, May 16, 2022 at 09:44:29AM +0800, Zhang Wensheng wrote:
> > > diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> > > index 469c483719be..8acb205dfa85 100644
> > > --- a/block/blk-throttle.c
> > > +++ b/block/blk-throttle.c
> > > @@ -1321,12 +1321,14 @@ static void tg_conf_updated(struct throtl_grp *tg, bool global)
> > > * that a group's limit are dropped suddenly and we don't want to
> > > * account recently dispatched IO with new low rate.
> > > */
> > > - throtl_start_new_slice(tg, READ);
> > > - throtl_start_new_slice(tg, WRITE);
> > > + if (!timer_pending(&sq->parent_sq->pending_timer)) {
> > > + throtl_start_new_slice(tg, READ);
> > > + throtl_start_new_slice(tg, WRITE);
> > > - if (tg->flags & THROTL_TG_PENDING) {
> > > - tg_update_disptime(tg);
> > > - throtl_schedule_next_dispatch(sq->parent_sq, true);
> > > + if (tg->flags & THROTL_TG_PENDING) {
> > > + tg_update_disptime(tg);
> > > + throtl_schedule_next_dispatch(sq->parent_sq, true);
> > > + }
> >
> > Yeah, but this ends up breaking the reason why it's starting the new slices
> > in the first place explained in the commit above, right? I'm not sure what
> > the right solution is but this likely isn't it.
> >
> Hi, Tejun
>
> Ming added a condition in tg_with_in_bps_limit():
> - if (bps_limit == U64_MAX) {
> + /* no need to throttle if this bio's bytes have been accounted */
> + if (bps_limit == U64_MAX || bio_flagged(bio, BIO_THROTTLED)) {
>
> This lets the first throttled bio be issued immediately once
> the config is updated.
>
> Do you think this behaviour is OK? If so, we can do the same for
> tg_with_in_iops_limit.

IMO, you can't do that for the iops limit. If BIO_THROTTLED is set for
a bio, all its bytes have already been accounted, so there is no need
to throttle this bio in case of a bps limit. The iops limit is another
story: IO accounting is done per request IO, which is based on the
split bio, so the bio (split bio) still needs to be checked and
throttled in case of an iops limit.


Thanks,
Ming
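
The distinction drawn above between the bps and iops checks can be
pictured with a toy model. This is only an illustrative sketch, not
the blk-throttle code: the structs, the `throttled` field (standing in
for BIO_THROTTLED) and the toy_* helpers are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_bio {
	uint64_t bytes;
	bool throttled;		/* stands in for BIO_THROTTLED */
};

struct toy_budget {
	uint64_t bytes_allowed, bytes_used;
	uint32_t ios_allowed, ios_used;
};

/*
 * bps check: a flagged bio has already had all of its bytes charged,
 * so it is let through without being charged a second time.
 */
static bool toy_within_bps(const struct toy_budget *b,
			   const struct toy_bio *bio)
{
	assert(b->bytes_used <= b->bytes_allowed);
	if (bio->throttled)
		return true;
	return bio->bytes <= b->bytes_allowed - b->bytes_used;
}

/*
 * iops check: every (split) bio counts as one more IO, so the flag
 * cannot be used to skip the check.
 */
static bool toy_within_iops(const struct toy_budget *b,
			    const struct toy_bio *bio)
{
	(void)bio;		/* the flag is deliberately ignored */
	return b->ios_used < b->ios_allowed;
}
```

With both budgets exhausted, a flagged bio still passes toy_within_bps()
but fails toy_within_iops(), which is the asymmetry described above.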


2022-05-17 11:16:37

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH -next] block: fix io hung of setting throttle limit frequently

On Tue, May 17, 2022 at 11:12:28AM +0800, yukuai (C) wrote:
> Ming added a condition in tg_with_in_bps_limit():
> - if (bps_limit == U64_MAX) {
> + /* no need to throttle if this bio's bytes have been accounted */
> + if (bps_limit == U64_MAX || bio_flagged(bio, BIO_THROTTLED)) {
>
> This lets the first throttled bio be issued immediately once
> the config is updated.
>
> Do you think this behaviour is OK? If so, we can do the same for
> tg_with_in_iops_limit.

So, the current behavior is that if the user is being silly, it will get
slower and slower. The new behavior would be that if the user is being
silly, it can issue IOs faster and faster, which creates a perverse
incentive to be silly.

The right thing to do is probably something like translating the
existing budget in light of the new configuration so that a config change
neither gives nor takes away the budget which has already accumulated. That
said, are you guys seeing this becoming an issue in practice?

Thanks.

--
tejun
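
One possible reading of the budget-translation idea above, as a toy
model: on a config change, the unused (or overdrawn) byte budget is
carried into the new slice instead of being discarded. This sketch
only illustrates the idea and is not taken from the kernel; all names
are hypothetical, time is in whole seconds and only a bps limit is
modelled.

```c
#include <assert.h>
#include <stdint.h>

struct toy_tg {
	int64_t bps;		/* current limit, bytes per second */
	int64_t slice_start;	/* second at which the slice began */
	int64_t dispatched;	/* bytes dispatched in this slice */
	int64_t carryover;	/* budget kept from earlier slices, may be < 0 */
};

/* Bytes the group may still dispatch at time `now`. */
static int64_t toy_budget(const struct toy_tg *tg, int64_t now)
{
	assert(now >= tg->slice_start);
	return tg->carryover + tg->bps * (now - tg->slice_start)
	       - tg->dispatched;
}

/* Config change: keep the accumulated budget, then start a new slice. */
static void toy_set_limit(struct toy_tg *tg, int64_t new_bps, int64_t now)
{
	assert(new_bps > 0);
	tg->carryover = toy_budget(tg, now);
	tg->slice_start = now;
	tg->dispatched = 0;
	tg->bps = new_bps;
}

/* Seconds from `now` until a bio of `bytes` fits, rounded up. */
static int64_t toy_wait(const struct toy_tg *tg, int64_t bytes, int64_t now)
{
	int64_t missing = bytes - toy_budget(tg, now);

	if (missing <= 0)
		return 0;
	return (missing + tg->bps - 1) / tg->bps;
}
```

In this model, setting the same limit again after 1s leaves the
dispatch time of a 10s bio at second 10 rather than moving it to
second 11, and the update does not release the bio early either.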

2022-05-17 14:20:26

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next] block: fix io hung of setting throttle limit frequently

On 2022/05/17 12:18, Tejun Heo wrote:
> On Tue, May 17, 2022 at 11:12:28AM +0800, yukuai (C) wrote:
>> Ming added a condition in tg_with_in_bps_limit():
>> - if (bps_limit == U64_MAX) {
>> + /* no need to throttle if this bio's bytes have been accounted */
>> + if (bps_limit == U64_MAX || bio_flagged(bio, BIO_THROTTLED)) {
>>
>> This lets the first throttled bio be issued immediately once
>> the config is updated.
>>
>> Do you think this behaviour is OK? If so, we can do the same for
>> tg_with_in_iops_limit.
>
> So, the current behavior is that if the user is being silly, it will get
> slower and slower. The new behavior would be that if the user is being
> silly, it can issue IOs faster and faster, which creates a perverse
> incentive to be silly.
Yes,

I just found that Ming's patch introduces a new problem:

If multiple bios are throttled, they are issued one by one, each at its
corresponding time. However, after Ming's patch, all throttled bios are
issued immediately once the waiting time of the first bio is reached.
Such behaviour is definitely a problem...

>
> The right thing to do is probably something like translating the
> existing budget in light of the new configuration so that a config change
> neither gives nor takes away the budget which has already accumulated. That
> said, are you guys seeing this becoming an issue in practice?

Agreed, the solution sounds reasonable. This problem was found during
testing, which issues a large IO while updating the config with
random values.

Thanks,
Kuai
>
> Thanks.
>