2022-06-10 02:47:34

by Yu Kuai

Subject: [PATCH -next v10 3/4] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'

Currently, bfq can't handle sync I/O concurrently unless it is
issued from the root group. This is because
'bfqd->num_groups_with_pending_reqs > 0' is always true in
bfq_asymmetric_scenario().

How a bfqg is counted in 'num_groups_with_pending_reqs':

Before this patch:
1) The root group is never counted.
2) A bfqg is counted if it or its child bfqgs have pending requests.
3) A bfqg is not counted once it and its child bfqgs have completed
   all their requests.

After this patch:
1) The root group is counted.
2) A bfqg is counted if it has pending requests.
3) A bfqg is not counted once it has completed all its requests.

With this change, the case in which only one group is activated can be
detected, and the next patch will support concurrent sync I/O in that
case.

Signed-off-by: Yu Kuai <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
---
block/bfq-iosched.c | 42 ------------------------------------------
block/bfq-iosched.h | 18 +++++++++---------
block/bfq-wf2q.c | 19 ++++---------------
3 files changed, 13 insertions(+), 66 deletions(-)
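
For readers following the counting rules in the commit message, the
small user-space model below may help. It is only a sketch under stated
assumptions: 'struct grp', count_before() and count_after() are
hypothetical names that do not exist in the kernel, and the model follows
the before/after rules as worded in the commit message rather than the
actual BFQ code paths.

```c
#include <stddef.h>

/* Hypothetical stand-in for a bfq_group; not a kernel structure. */
struct grp {
	struct grp *parent;       /* NULL for the root group */
	unsigned int own_pending; /* queues directly in this group with pending requests */
};

/* Return 1 if 'anc' is 'g' itself or one of its ancestors. */
static int is_self_or_ancestor(const struct grp *anc, const struct grp *g)
{
	for (; g; g = g->parent)
		if (g == anc)
			return 1;
	return 0;
}

/*
 * "Before" rules: the root group is never counted; any other group is
 * counted if it or one of its descendant groups has pending requests.
 */
static unsigned int count_before(const struct grp *grps, size_t n)
{
	unsigned int count = 0;

	for (size_t i = 0; i < n; i++) {
		if (!grps[i].parent)
			continue;
		for (size_t j = 0; j < n; j++) {
			if (grps[j].own_pending &&
			    is_self_or_ancestor(&grps[i], &grps[j])) {
				count++;
				break;
			}
		}
	}
	return count;
}

/*
 * "After" rules: every group, root included, is counted only if it
 * has pending requests of its own.
 */
static unsigned int count_after(const struct grp *grps, size_t n)
{
	unsigned int count = 0;

	for (size_t i = 0; i < n; i++)
		if (grps[i].own_pending)
			count++;
	return count;
}
```

With a root -> A -> B hierarchy in which only B has pending requests, the
old rules count both A and B, while the new rules count only B. A count
of exactly one is what makes the single-group case detectable.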

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 0ec21018daba..03b04892440c 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -970,48 +970,6 @@ void __bfq_weights_tree_remove(struct bfq_data *bfqd,
 void bfq_weights_tree_remove(struct bfq_data *bfqd,
 			     struct bfq_queue *bfqq)
 {
-	struct bfq_entity *entity = bfqq->entity.parent;
-
-	for_each_entity(entity) {
-		struct bfq_sched_data *sd = entity->my_sched_data;
-
-		if (sd->next_in_service || sd->in_service_entity) {
-			/*
-			 * entity is still active, because either
-			 * next_in_service or in_service_entity is not
-			 * NULL (see the comments on the definition of
-			 * next_in_service for details on why
-			 * in_service_entity must be checked too).
-			 *
-			 * As a consequence, its parent entities are
-			 * active as well, and thus this loop must
-			 * stop here.
-			 */
-			break;
-		}
-
-		/*
-		 * The decrement of num_groups_with_pending_reqs is
-		 * not performed immediately upon the deactivation of
-		 * entity, but it is delayed to when it also happens
-		 * that the first leaf descendant bfqq of entity gets
-		 * all its pending requests completed. The following
-		 * instructions perform this delayed decrement, if
-		 * needed. See the comments on
-		 * num_groups_with_pending_reqs for details.
-		 */
-		if (entity->in_groups_with_pending_reqs) {
-			entity->in_groups_with_pending_reqs = false;
-			bfqd->num_groups_with_pending_reqs--;
-		}
-	}
-
-	/*
-	 * Next function is invoked last, because it causes bfqq to be
-	 * freed if the following holds: bfqq is not in service and
-	 * has no dispatched request. DO NOT use bfqq after the next
-	 * function invocation.
-	 */
 	__bfq_weights_tree_remove(bfqd, bfqq,
 				  &bfqd->queue_weights_tree);
 }
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index de2446a9b7ab..f0fce94583e4 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -496,27 +496,27 @@ struct bfq_data {
 	struct rb_root_cached queue_weights_tree;
 
 	/*
-	 * Number of groups with at least one descendant process that
+	 * Number of groups with at least one process that
 	 * has at least one request waiting for completion. Note that
 	 * this accounts for also requests already dispatched, but not
 	 * yet completed. Therefore this number of groups may differ
 	 * (be larger) than the number of active groups, as a group is
 	 * considered active only if its corresponding entity has
-	 * descendant queues with at least one request queued. This
+	 * queues with at least one request queued. This
 	 * number is used to decide whether a scenario is symmetric.
 	 * For a detailed explanation see comments on the computation
 	 * of the variable asymmetric_scenario in the function
 	 * bfq_better_to_idle().
 	 *
 	 * However, it is hard to compute this number exactly, for
-	 * groups with multiple descendant processes. Consider a group
-	 * that is inactive, i.e., that has no descendant process with
+	 * groups with multiple processes. Consider a group
+	 * that is inactive, i.e., that has no process with
 	 * pending I/O inside BFQ queues. Then suppose that
 	 * num_groups_with_pending_reqs is still accounting for this
-	 * group, because the group has descendant processes with some
+	 * group, because the group has processes with some
 	 * I/O request still in flight. num_groups_with_pending_reqs
 	 * should be decremented when the in-flight request of the
-	 * last descendant process is finally completed (assuming that
+	 * last process is finally completed (assuming that
 	 * nothing else has changed for the group in the meantime, in
 	 * terms of composition of the group and active/inactive state of child
 	 * groups and processes). To accomplish this, an additional
@@ -525,7 +525,7 @@ struct bfq_data {
 	 * we resort to the following tradeoff between simplicity and
 	 * accuracy: for an inactive group that is still counted in
 	 * num_groups_with_pending_reqs, we decrement
-	 * num_groups_with_pending_reqs when the first descendant
+	 * num_groups_with_pending_reqs when the first
 	 * process of the group remains with no request waiting for
 	 * completion.
 	 *
@@ -533,12 +533,12 @@ struct bfq_data {
 	 * carefulness: to avoid multiple decrements, we flag a group,
 	 * more precisely an entity representing a group, as still
 	 * counted in num_groups_with_pending_reqs when it becomes
-	 * inactive. Then, when the first descendant queue of the
+	 * inactive. Then, when the first queue of the
 	 * entity remains with no request waiting for completion,
 	 * num_groups_with_pending_reqs is decremented, and this flag
 	 * is reset. After this flag is reset for the entity,
 	 * num_groups_with_pending_reqs won't be decremented any
-	 * longer in case a new descendant queue of the entity remains
+	 * longer in case a new queue of the entity remains
 	 * with no request waiting for completion.
 	 */
 	unsigned int num_groups_with_pending_reqs;
diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
index 6f36f3fe5cc8..9c2842bedf97 100644
--- a/block/bfq-wf2q.c
+++ b/block/bfq-wf2q.c
@@ -984,19 +984,6 @@ static void __bfq_activate_entity(struct bfq_entity *entity,
 		entity->on_st_or_in_serv = true;
 	}
 
-#ifdef CONFIG_BFQ_GROUP_IOSCHED
-	if (!bfq_entity_to_bfqq(entity)) { /* bfq_group */
-		struct bfq_group *bfqg =
-			container_of(entity, struct bfq_group, entity);
-		struct bfq_data *bfqd = bfqg->bfqd;
-
-		if (!entity->in_groups_with_pending_reqs) {
-			entity->in_groups_with_pending_reqs = true;
-			bfqd->num_groups_with_pending_reqs++;
-		}
-	}
-#endif
-
 	bfq_update_fin_time_enqueue(entity, st, backshifted);
 }
 
@@ -1654,7 +1641,8 @@ void bfq_add_bfqq_in_groups_with_pending_reqs(struct bfq_queue *bfqq)
 	if (!entity->in_groups_with_pending_reqs) {
 		entity->in_groups_with_pending_reqs = true;
 #ifdef CONFIG_BFQ_GROUP_IOSCHED
-		bfqq_group(bfqq)->num_queues_with_pending_reqs++;
+		if (!(bfqq_group(bfqq)->num_queues_with_pending_reqs++))
+			bfqq->bfqd->num_groups_with_pending_reqs++;
 #endif
 	}
 }
@@ -1666,7 +1654,8 @@ void bfq_del_bfqq_in_groups_with_pending_reqs(struct bfq_queue *bfqq)
 	if (entity->in_groups_with_pending_reqs) {
 		entity->in_groups_with_pending_reqs = false;
 #ifdef CONFIG_BFQ_GROUP_IOSCHED
-		bfqq_group(bfqq)->num_queues_with_pending_reqs--;
+		if (!(--bfqq_group(bfqq)->num_queues_with_pending_reqs))
+			bfqq->bfqd->num_groups_with_pending_reqs--;
 #endif
 	}
 }
--
2.31.1


2022-06-23 15:40:01

by Paolo Valente

Subject: Re: [PATCH -next v10 3/4] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'

Sorry for the delay.

> On 10 Jun 2022, at 04:17, Yu Kuai <[email protected]> wrote:
>
> Currently, bfq can't handle sync I/O concurrently unless it is
> issued from the root group. This is because
> 'bfqd->num_groups_with_pending_reqs > 0' is always true in
> bfq_asymmetric_scenario().
>
> How a bfqg is counted in 'num_groups_with_pending_reqs':
>
> Before this patch:
> 1) The root group is never counted.
> 2) A bfqg is counted if it or its child bfqgs have pending requests.
> 3) A bfqg is not counted once it and its child bfqgs have completed
>    all their requests.
>
> After this patch:
> 1) The root group is counted.
> 2) A bfqg is counted if it has pending requests.
> 3) A bfqg is not counted once it has completed all its requests.
>
> With this change, the case in which only one group is activated can be
> detected, and the next patch will support concurrent sync I/O in that
> case.
>
> Signed-off-by: Yu Kuai <[email protected]>
> Reviewed-by: Jan Kara <[email protected]>
> ---
> block/bfq-iosched.c | 42 ------------------------------------------
> block/bfq-iosched.h | 18 +++++++++---------
> block/bfq-wf2q.c | 19 ++++---------------
> 3 files changed, 13 insertions(+), 66 deletions(-)
>
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index 0ec21018daba..03b04892440c 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -970,48 +970,6 @@ void __bfq_weights_tree_remove(struct bfq_data *bfqd,
> void bfq_weights_tree_remove(struct bfq_data *bfqd,
> struct bfq_queue *bfqq)
> {
> - struct bfq_entity *entity = bfqq->entity.parent;
> -
> - for_each_entity(entity) {
> - struct bfq_sched_data *sd = entity->my_sched_data;
> -
> - if (sd->next_in_service || sd->in_service_entity) {
> - /*
> - * entity is still active, because either
> - * next_in_service or in_service_entity is not
> - * NULL (see the comments on the definition of
> - * next_in_service for details on why
> - * in_service_entity must be checked too).
> - *
> - * As a consequence, its parent entities are
> - * active as well, and thus this loop must
> - * stop here.
> - */
> - break;
> - }
> -
> - /*
> - * The decrement of num_groups_with_pending_reqs is
> - * not performed immediately upon the deactivation of
> - * entity, but it is delayed to when it also happens
> - * that the first leaf descendant bfqq of entity gets
> - * all its pending requests completed. The following
> - * instructions perform this delayed decrement, if
> - * needed. See the comments on
> - * num_groups_with_pending_reqs for details.
> - */
> - if (entity->in_groups_with_pending_reqs) {
> - entity->in_groups_with_pending_reqs = false;
> - bfqd->num_groups_with_pending_reqs--;
> - }
> - }

With this part removed, I don't see how you handle the following
sequence of events:
1. a queue Q becomes non-busy but still has dispatched requests, so
it must not be removed from the counter of queues with pending reqs
yet
2. the last request of Q is completed while Q is still idle (non-busy).
At this point Q must be removed from the counter. It seems to
me that this case is no longer handled.

Additional comment: if your changes do not cause the problem above,
then this function only invokes __bfq_weights_tree_remove. So what's
the point of keeping this function?

> -
> - /*
> - * Next function is invoked last, because it causes bfqq to be
> - * freed if the following holds: bfqq is not in service and
> - * has no dispatched request. DO NOT use bfqq after the next
> - * function invocation.
> - */

I would really love it if you left this comment in. I added it after
suffering a lot from a nasty use-after-free (UAF). Of course the first
sentence may need to be adjusted if the code that precedes it is removed.

Thanks,
Paolo


> __bfq_weights_tree_remove(bfqd, bfqq,
> &bfqd->queue_weights_tree);
> }
> diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
> index de2446a9b7ab..f0fce94583e4 100644
> --- a/block/bfq-iosched.h
> +++ b/block/bfq-iosched.h
> @@ -496,27 +496,27 @@ struct bfq_data {
> struct rb_root_cached queue_weights_tree;
>
> /*
> - * Number of groups with at least one descendant process that
> + * Number of groups with at least one process that
> * has at least one request waiting for completion. Note that
> * this accounts for also requests already dispatched, but not
> * yet completed. Therefore this number of groups may differ
> * (be larger) than the number of active groups, as a group is
> * considered active only if its corresponding entity has
> - * descendant queues with at least one request queued. This
> + * queues with at least one request queued. This
> * number is used to decide whether a scenario is symmetric.
> * For a detailed explanation see comments on the computation
> * of the variable asymmetric_scenario in the function
> * bfq_better_to_idle().
> *
> * However, it is hard to compute this number exactly, for
> - * groups with multiple descendant processes. Consider a group
> - * that is inactive, i.e., that has no descendant process with
> + * groups with multiple processes. Consider a group
> + * that is inactive, i.e., that has no process with
> * pending I/O inside BFQ queues. Then suppose that
> * num_groups_with_pending_reqs is still accounting for this
> - * group, because the group has descendant processes with some
> + * group, because the group has processes with some
> * I/O request still in flight. num_groups_with_pending_reqs
> * should be decremented when the in-flight request of the
> - * last descendant process is finally completed (assuming that
> + * last process is finally completed (assuming that
> * nothing else has changed for the group in the meantime, in
> * terms of composition of the group and active/inactive state of child
> * groups and processes). To accomplish this, an additional
> @@ -525,7 +525,7 @@ struct bfq_data {
> * we resort to the following tradeoff between simplicity and
> * accuracy: for an inactive group that is still counted in
> * num_groups_with_pending_reqs, we decrement
> - * num_groups_with_pending_reqs when the first descendant
> + * num_groups_with_pending_reqs when the first
> * process of the group remains with no request waiting for
> * completion.
> *
> @@ -533,12 +533,12 @@ struct bfq_data {
> * carefulness: to avoid multiple decrements, we flag a group,
> * more precisely an entity representing a group, as still
> * counted in num_groups_with_pending_reqs when it becomes
> - * inactive. Then, when the first descendant queue of the
> + * inactive. Then, when the first queue of the
> * entity remains with no request waiting for completion,
> * num_groups_with_pending_reqs is decremented, and this flag
> * is reset. After this flag is reset for the entity,
> * num_groups_with_pending_reqs won't be decremented any
> - * longer in case a new descendant queue of the entity remains
> + * longer in case a new queue of the entity remains
> * with no request waiting for completion.
> */
> unsigned int num_groups_with_pending_reqs;
> diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
> index 6f36f3fe5cc8..9c2842bedf97 100644
> --- a/block/bfq-wf2q.c
> +++ b/block/bfq-wf2q.c
> @@ -984,19 +984,6 @@ static void __bfq_activate_entity(struct bfq_entity *entity,
> entity->on_st_or_in_serv = true;
> }
>
> -#ifdef CONFIG_BFQ_GROUP_IOSCHED
> - if (!bfq_entity_to_bfqq(entity)) { /* bfq_group */
> - struct bfq_group *bfqg =
> - container_of(entity, struct bfq_group, entity);
> - struct bfq_data *bfqd = bfqg->bfqd;
> -
> - if (!entity->in_groups_with_pending_reqs) {
> - entity->in_groups_with_pending_reqs = true;
> - bfqd->num_groups_with_pending_reqs++;
> - }
> - }
> -#endif
> -
> bfq_update_fin_time_enqueue(entity, st, backshifted);
> }
>
> @@ -1654,7 +1641,8 @@ void bfq_add_bfqq_in_groups_with_pending_reqs(struct bfq_queue *bfqq)
> if (!entity->in_groups_with_pending_reqs) {
> entity->in_groups_with_pending_reqs = true;
> #ifdef CONFIG_BFQ_GROUP_IOSCHED
> - bfqq_group(bfqq)->num_queues_with_pending_reqs++;
> + if (!(bfqq_group(bfqq)->num_queues_with_pending_reqs++))
> + bfqq->bfqd->num_groups_with_pending_reqs++;
> #endif
> }
> }
> @@ -1666,7 +1654,8 @@ void bfq_del_bfqq_in_groups_with_pending_reqs(struct bfq_queue *bfqq)
> if (entity->in_groups_with_pending_reqs) {
> entity->in_groups_with_pending_reqs = false;
> #ifdef CONFIG_BFQ_GROUP_IOSCHED
> - bfqq_group(bfqq)->num_queues_with_pending_reqs--;
> + if (!(--bfqq_group(bfqq)->num_queues_with_pending_reqs))
> + bfqq->bfqd->num_groups_with_pending_reqs--;
> #endif
> }
> }
> --
> 2.31.1
>

2022-06-24 01:51:14

by Yu Kuai

Subject: Re: [PATCH -next v10 3/4] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'

On 2022/06/23 23:32, Paolo Valente wrote:
> Sorry for the delay.
>
>> On 10 Jun 2022, at 04:17, Yu Kuai <[email protected]> wrote:
>>
>> Currently, bfq can't handle sync I/O concurrently unless it is
>> issued from the root group. This is because
>> 'bfqd->num_groups_with_pending_reqs > 0' is always true in
>> bfq_asymmetric_scenario().
>>
>> How a bfqg is counted in 'num_groups_with_pending_reqs':
>>
>> Before this patch:
>> 1) The root group is never counted.
>> 2) A bfqg is counted if it or its child bfqgs have pending requests.
>> 3) A bfqg is not counted once it and its child bfqgs have completed
>>    all their requests.
>>
>> After this patch:
>> 1) The root group is counted.
>> 2) A bfqg is counted if it has pending requests.
>> 3) A bfqg is not counted once it has completed all its requests.
>>
>> With this change, the case in which only one group is activated can be
>> detected, and the next patch will support concurrent sync I/O in that
>> case.
>>
>> Signed-off-by: Yu Kuai <[email protected]>
>> Reviewed-by: Jan Kara <[email protected]>
>> ---
>> block/bfq-iosched.c | 42 ------------------------------------------
>> block/bfq-iosched.h | 18 +++++++++---------
>> block/bfq-wf2q.c | 19 ++++---------------
>> 3 files changed, 13 insertions(+), 66 deletions(-)
>>
>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>> index 0ec21018daba..03b04892440c 100644
>> --- a/block/bfq-iosched.c
>> +++ b/block/bfq-iosched.c
>> @@ -970,48 +970,6 @@ void __bfq_weights_tree_remove(struct bfq_data *bfqd,
>> void bfq_weights_tree_remove(struct bfq_data *bfqd,
>> struct bfq_queue *bfqq)
>> {
>> - struct bfq_entity *entity = bfqq->entity.parent;
>> -
>> - for_each_entity(entity) {
>> - struct bfq_sched_data *sd = entity->my_sched_data;
>> -
>> - if (sd->next_in_service || sd->in_service_entity) {
>> - /*
>> - * entity is still active, because either
>> - * next_in_service or in_service_entity is not
>> - * NULL (see the comments on the definition of
>> - * next_in_service for details on why
>> - * in_service_entity must be checked too).
>> - *
>> - * As a consequence, its parent entities are
>> - * active as well, and thus this loop must
>> - * stop here.
>> - */
>> - break;
>> - }
>> -
>> - /*
>> - * The decrement of num_groups_with_pending_reqs is
>> - * not performed immediately upon the deactivation of
>> - * entity, but it is delayed to when it also happens
>> - * that the first leaf descendant bfqq of entity gets
>> - * all its pending requests completed. The following
>> - * instructions perform this delayed decrement, if
>> - * needed. See the comments on
>> - * num_groups_with_pending_reqs for details.
>> - */
>> - if (entity->in_groups_with_pending_reqs) {
>> - entity->in_groups_with_pending_reqs = false;
>> - bfqd->num_groups_with_pending_reqs--;
>> - }
>> - }
>
> With this part removed, I don't see how you handle the following
> sequence of events:
> 1. a queue Q becomes non-busy but still has dispatched requests, so
> it must not be removed from the counter of queues with pending reqs
> yet
> 2. the last request of Q is completed while Q is still idle (non-busy).
> At this point Q must be removed from the counter. It seems to
> me that this case is no longer handled.
>
Hi, Paolo

1) First, patch 1 adds support for tracking whether a bfqq has pending
requests. This is done by setting the flag
'entity->in_groups_with_pending_reqs' when the first request is
inserted into the bfqq, and clearing it when the last request is
completed.

2) Then, patch 2 adds a counter to bfqg that records how many bfqqs
have pending requests. It is updated whenever the per-bfqq flag changes.

3) Finally, patch 3 tracks 'num_groups_with_pending_reqs' based on the
new counter from patch 2:
- if the counter(how many bfqqs have pending requests) increased from 0
to 0, increase 'num_groups_with_pending_reqs'.
- if the counter is decreased from 1 to 0, decrease
'num_groups_with_pending_reqs'
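
The three steps above can be sketched as a small user-space model. This
is a minimal illustration, not the kernel code: 'struct sched_data',
'struct group', 'struct queue' and the two helpers are hypothetical
stand-ins for bfq_data, bfq_group, bfq_queue and the
bfq_{add,del}_bfqq_in_groups_with_pending_reqs() hunks in this patch. It
assumes the increment happens on the 0 -> 1 transition of the per-group
counter, as in the patch hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; not the kernel structures. */
struct sched_data { unsigned int num_groups_with_pending_reqs; };
struct group { unsigned int num_queues_with_pending_reqs; };
struct queue {
	struct sched_data *d;
	struct group *g;
	bool in_groups_with_pending_reqs; /* per-queue flag from patch 1 */
};

/* Model of: the first request is inserted into the queue. */
static void queue_add_pending(struct queue *q)
{
	if (q->in_groups_with_pending_reqs)
		return;
	q->in_groups_with_pending_reqs = true;
	/* Per-group counter goes 0 -> 1: start counting the group. */
	if (!(q->g->num_queues_with_pending_reqs++))
		q->d->num_groups_with_pending_reqs++;
}

/*
 * Model of: the last request of the queue completes. Whether the queue
 * is busy at that moment plays no role here, only the per-queue flag.
 */
static void queue_del_pending(struct queue *q)
{
	if (!q->in_groups_with_pending_reqs)
		return;
	q->in_groups_with_pending_reqs = false;
	assert(q->g->num_queues_with_pending_reqs > 0);
	/* Per-group counter goes 1 -> 0: stop counting the group. */
	if (!(--q->g->num_queues_with_pending_reqs))
		q->d->num_groups_with_pending_reqs--;
}
```

In this model, a queue that has gone idle but still has dispatched
requests keeps its flag set, so its group stays counted until
queue_del_pending() runs on the last completion.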

> Additional comment: if your changes do not cause the problem above,
> then this function only invokes __bfq_weights_tree_remove. So what's
> the point of keeping this function?

If this patchset is applied, follow-up cleanup patches will remove
this function.

multiple cleanup patches for bfq:
https://lore.kernel.org/all/[email protected]/
>
>> -
>> - /*
>> - * Next function is invoked last, because it causes bfqq to be
>> - * freed if the following holds: bfqq is not in service and
>> - * has no dispatched request. DO NOT use bfqq after the next
>> - * function invocation.
>> - */
>
> I would really love it if you leave this comment. I added it after
> suffering a lot for a nasty UAF. Of course the first sentence may
> need to be adjusted if the code that precedes it is to be removed.
>

Same as above, if this patch is applied, this function will be gone.

Thanks,
Kuai
> Thanks,
> Paolo
>
>
>> __bfq_weights_tree_remove(bfqd, bfqq,
>> &bfqd->queue_weights_tree);
>> }
>> diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
>> index de2446a9b7ab..f0fce94583e4 100644
>> --- a/block/bfq-iosched.h
>> +++ b/block/bfq-iosched.h
>> @@ -496,27 +496,27 @@ struct bfq_data {
>> struct rb_root_cached queue_weights_tree;
>>
>> /*
>> - * Number of groups with at least one descendant process that
>> + * Number of groups with at least one process that
>> * has at least one request waiting for completion. Note that
>> * this accounts for also requests already dispatched, but not
>> * yet completed. Therefore this number of groups may differ
>> * (be larger) than the number of active groups, as a group is
>> * considered active only if its corresponding entity has
>> - * descendant queues with at least one request queued. This
>> + * queues with at least one request queued. This
>> * number is used to decide whether a scenario is symmetric.
>> * For a detailed explanation see comments on the computation
>> * of the variable asymmetric_scenario in the function
>> * bfq_better_to_idle().
>> *
>> * However, it is hard to compute this number exactly, for
>> - * groups with multiple descendant processes. Consider a group
>> - * that is inactive, i.e., that has no descendant process with
>> + * groups with multiple processes. Consider a group
>> + * that is inactive, i.e., that has no process with
>> * pending I/O inside BFQ queues. Then suppose that
>> * num_groups_with_pending_reqs is still accounting for this
>> - * group, because the group has descendant processes with some
>> + * group, because the group has processes with some
>> * I/O request still in flight. num_groups_with_pending_reqs
>> * should be decremented when the in-flight request of the
>> - * last descendant process is finally completed (assuming that
>> + * last process is finally completed (assuming that
>> * nothing else has changed for the group in the meantime, in
>> * terms of composition of the group and active/inactive state of child
>> * groups and processes). To accomplish this, an additional
>> @@ -525,7 +525,7 @@ struct bfq_data {
>> * we resort to the following tradeoff between simplicity and
>> * accuracy: for an inactive group that is still counted in
>> * num_groups_with_pending_reqs, we decrement
>> - * num_groups_with_pending_reqs when the first descendant
>> + * num_groups_with_pending_reqs when the first
>> * process of the group remains with no request waiting for
>> * completion.
>> *
>> @@ -533,12 +533,12 @@ struct bfq_data {
>> * carefulness: to avoid multiple decrements, we flag a group,
>> * more precisely an entity representing a group, as still
>> * counted in num_groups_with_pending_reqs when it becomes
>> - * inactive. Then, when the first descendant queue of the
>> + * inactive. Then, when the first queue of the
>> * entity remains with no request waiting for completion,
>> * num_groups_with_pending_reqs is decremented, and this flag
>> * is reset. After this flag is reset for the entity,
>> * num_groups_with_pending_reqs won't be decremented any
>> - * longer in case a new descendant queue of the entity remains
>> + * longer in case a new queue of the entity remains
>> * with no request waiting for completion.
>> */
>> unsigned int num_groups_with_pending_reqs;
>> diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
>> index 6f36f3fe5cc8..9c2842bedf97 100644
>> --- a/block/bfq-wf2q.c
>> +++ b/block/bfq-wf2q.c
>> @@ -984,19 +984,6 @@ static void __bfq_activate_entity(struct bfq_entity *entity,
>> entity->on_st_or_in_serv = true;
>> }
>>
>> -#ifdef CONFIG_BFQ_GROUP_IOSCHED
>> - if (!bfq_entity_to_bfqq(entity)) { /* bfq_group */
>> - struct bfq_group *bfqg =
>> - container_of(entity, struct bfq_group, entity);
>> - struct bfq_data *bfqd = bfqg->bfqd;
>> -
>> - if (!entity->in_groups_with_pending_reqs) {
>> - entity->in_groups_with_pending_reqs = true;
>> - bfqd->num_groups_with_pending_reqs++;
>> - }
>> - }
>> -#endif
>> -
>> bfq_update_fin_time_enqueue(entity, st, backshifted);
>> }
>>
>> @@ -1654,7 +1641,8 @@ void bfq_add_bfqq_in_groups_with_pending_reqs(struct bfq_queue *bfqq)
>> if (!entity->in_groups_with_pending_reqs) {
>> entity->in_groups_with_pending_reqs = true;
>> #ifdef CONFIG_BFQ_GROUP_IOSCHED
>> - bfqq_group(bfqq)->num_queues_with_pending_reqs++;
>> + if (!(bfqq_group(bfqq)->num_queues_with_pending_reqs++))
>> + bfqq->bfqd->num_groups_with_pending_reqs++;
>> #endif
>> }
>> }
>> @@ -1666,7 +1654,8 @@ void bfq_del_bfqq_in_groups_with_pending_reqs(struct bfq_queue *bfqq)
>> if (entity->in_groups_with_pending_reqs) {
>> entity->in_groups_with_pending_reqs = false;
>> #ifdef CONFIG_BFQ_GROUP_IOSCHED
>> - bfqq_group(bfqq)->num_queues_with_pending_reqs--;
>> + if (!(--bfqq_group(bfqq)->num_queues_with_pending_reqs))
>> + bfqq->bfqd->num_groups_with_pending_reqs--;
>> #endif
>> }
>> }
>> --
>> 2.31.1
>>
>
> .
>

2022-06-25 08:17:58

by Yu Kuai

Subject: Re: [PATCH -next v10 3/4] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'

On 2022/06/24 9:26, Yu Kuai wrote:
> On 2022/06/23 23:32, Paolo Valente wrote:
>> Sorry for the delay.
>>
>>> On 10 Jun 2022, at 04:17, Yu Kuai <[email protected]> wrote:
>>>
>>> Currently, bfq can't handle sync I/O concurrently unless it is
>>> issued from the root group. This is because
>>> 'bfqd->num_groups_with_pending_reqs > 0' is always true in
>>> bfq_asymmetric_scenario().
>>>
>>> How a bfqg is counted in 'num_groups_with_pending_reqs':
>>>
>>> Before this patch:
>>> 1) The root group is never counted.
>>> 2) A bfqg is counted if it or its child bfqgs have pending requests.
>>> 3) A bfqg is not counted once it and its child bfqgs have completed
>>>    all their requests.
>>>
>>> After this patch:
>>> 1) The root group is counted.
>>> 2) A bfqg is counted if it has pending requests.
>>> 3) A bfqg is not counted once it has completed all its requests.
>>>
>>> With this change, the case in which only one group is activated can be
>>> detected, and the next patch will support concurrent sync I/O in that
>>> case.
>>>
>>> Signed-off-by: Yu Kuai <[email protected]>
>>> Reviewed-by: Jan Kara <[email protected]>
>>> ---
>>> block/bfq-iosched.c | 42 ------------------------------------------
>>> block/bfq-iosched.h | 18 +++++++++---------
>>> block/bfq-wf2q.c    | 19 ++++---------------
>>> 3 files changed, 13 insertions(+), 66 deletions(-)
>>>
>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>>> index 0ec21018daba..03b04892440c 100644
>>> --- a/block/bfq-iosched.c
>>> +++ b/block/bfq-iosched.c
>>> @@ -970,48 +970,6 @@ void __bfq_weights_tree_remove(struct bfq_data
>>> *bfqd,
>>> void bfq_weights_tree_remove(struct bfq_data *bfqd,
>>>                  struct bfq_queue *bfqq)
>>> {
>>> -    struct bfq_entity *entity = bfqq->entity.parent;
>>> -
>>> -    for_each_entity(entity) {
>>> -        struct bfq_sched_data *sd = entity->my_sched_data;
>>> -
>>> -        if (sd->next_in_service || sd->in_service_entity) {
>>> -            /*
>>> -             * entity is still active, because either
>>> -             * next_in_service or in_service_entity is not
>>> -             * NULL (see the comments on the definition of
>>> -             * next_in_service for details on why
>>> -             * in_service_entity must be checked too).
>>> -             *
>>> -             * As a consequence, its parent entities are
>>> -             * active as well, and thus this loop must
>>> -             * stop here.
>>> -             */
>>> -            break;
>>> -        }
>>> -
>>> -        /*
>>> -         * The decrement of num_groups_with_pending_reqs is
>>> -         * not performed immediately upon the deactivation of
>>> -         * entity, but it is delayed to when it also happens
>>> -         * that the first leaf descendant bfqq of entity gets
>>> -         * all its pending requests completed. The following
>>> -         * instructions perform this delayed decrement, if
>>> -         * needed. See the comments on
>>> -         * num_groups_with_pending_reqs for details.
>>> -         */
>>> -        if (entity->in_groups_with_pending_reqs) {
>>> -            entity->in_groups_with_pending_reqs = false;
>>> -            bfqd->num_groups_with_pending_reqs--;
>>> -        }
>>> -    }
>>
>> With this part removed, I don't see how you handle the following
>> sequence of events:
>> 1. a queue Q becomes non-busy but still has dispatched requests, so
>> it must not be removed from the counter of queues with pending reqs
>> yet
>> 2. the last request of Q is completed while Q is still idle (non-busy).
>> At this point Q must be removed from the counter. It seems to
>> me that this case is no longer handled.
>>
> Hi, Paolo
>
> 1) First, patch 1 adds support for tracking whether a bfqq has pending
> requests. This is done by setting the flag
> 'entity->in_groups_with_pending_reqs' when the first request is
> inserted into the bfqq, and clearing it when the last request is
> completed.
>
> 2) Then, patch 2 adds a counter to bfqg that records how many bfqqs
> have pending requests. It is updated whenever the per-bfqq flag changes.
>
> 3) Finally, patch 3 tracks 'num_groups_with_pending_reqs' based on the
> new counter from patch 2:
>  - if the counter(how many bfqqs have pending requests) increased from 0
>    to 0, increase 'num_groups_with_pending_reqs'.
Hi, Paolo

Sorry, I made a mistake here:
"increased from 0 to 0" should read "increased from 0 to 1".

Looking forward to your reply.
Kuai
>  - if the counter is decreased from 1 to 0, decrease
>    'num_groups_with_pending_reqs'
>
>> Additional comment: if your changes do not cause the problem above,
>> then this function only invokes __bfq_weights_tree_remove. So what's
>> the point of keeping this function?
>
> If this patchset is applied, follow-up cleanup patches will remove
> this function.
>
> multiple cleanup patches for bfq:
> https://lore.kernel.org/all/[email protected]/
>>
>>> -
>>> -    /*
>>> -     * Next function is invoked last, because it causes bfqq to be
>>> -     * freed if the following holds: bfqq is not in service and
>>> -     * has no dispatched request. DO NOT use bfqq after the next
>>> -     * function invocation.
>>> -     */
>>
>> I would really love it if you leave this comment.  I added it after
>> suffering a lot for a nasty UAF.  Of course the first sentence may
>> need to be adjusted if the code that precedes it is to be removed.
>>
>
> Same as above, if this patch is applied, this function will be gone.
>
> Thanks,
> Kuai
>> Thanks,
>> Paolo
>>
>>
>>>     __bfq_weights_tree_remove(bfqd, bfqq,
>>>                   &bfqd->queue_weights_tree);
>>> }
>>> diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
>>> index de2446a9b7ab..f0fce94583e4 100644
>>> --- a/block/bfq-iosched.h
>>> +++ b/block/bfq-iosched.h
>>> @@ -496,27 +496,27 @@ struct bfq_data {
>>>     struct rb_root_cached queue_weights_tree;
>>>
>>>     /*
>>> -     * Number of groups with at least one descendant process that
>>> +     * Number of groups with at least one process that
>>>      * has at least one request waiting for completion. Note that
>>>      * this accounts for also requests already dispatched, but not
>>>      * yet completed. Therefore this number of groups may differ
>>>      * (be larger) than the number of active groups, as a group is
>>>      * considered active only if its corresponding entity has
>>> -     * descendant queues with at least one request queued. This
>>> +     * queues with at least one request queued. This
>>>      * number is used to decide whether a scenario is symmetric.
>>>      * For a detailed explanation see comments on the computation
>>>      * of the variable asymmetric_scenario in the function
>>>      * bfq_better_to_idle().
>>>      *
>>>      * However, it is hard to compute this number exactly, for
>>> -     * groups with multiple descendant processes. Consider a group
>>> -     * that is inactive, i.e., that has no descendant process with
>>> +     * groups with multiple processes. Consider a group
>>> +     * that is inactive, i.e., that has no process with
>>>      * pending I/O inside BFQ queues. Then suppose that
>>>      * num_groups_with_pending_reqs is still accounting for this
>>> -     * group, because the group has descendant processes with some
>>> +     * group, because the group has processes with some
>>>      * I/O request still in flight. num_groups_with_pending_reqs
>>>      * should be decremented when the in-flight request of the
>>> -     * last descendant process is finally completed (assuming that
>>> +     * last process is finally completed (assuming that
>>>      * nothing else has changed for the group in the meantime, in
>>>      * terms of composition of the group and active/inactive state of child
>>>      * groups and processes). To accomplish this, an additional
>>> @@ -525,7 +525,7 @@ struct bfq_data {
>>>      * we resort to the following tradeoff between simplicity and
>>>      * accuracy: for an inactive group that is still counted in
>>>      * num_groups_with_pending_reqs, we decrement
>>> -     * num_groups_with_pending_reqs when the first descendant
>>> +     * num_groups_with_pending_reqs when the first
>>>      * process of the group remains with no request waiting for
>>>      * completion.
>>>      *
>>> @@ -533,12 +533,12 @@ struct bfq_data {
>>>      * carefulness: to avoid multiple decrements, we flag a group,
>>>      * more precisely an entity representing a group, as still
>>>      * counted in num_groups_with_pending_reqs when it becomes
>>> -     * inactive. Then, when the first descendant queue of the
>>> +     * inactive. Then, when the first queue of the
>>>      * entity remains with no request waiting for completion,
>>>      * num_groups_with_pending_reqs is decremented, and this flag
>>>      * is reset. After this flag is reset for the entity,
>>>      * num_groups_with_pending_reqs won't be decremented any
>>> -     * longer in case a new descendant queue of the entity remains
>>> +     * longer in case a new queue of the entity remains
>>>      * with no request waiting for completion.
>>>      */
>>>     unsigned int num_groups_with_pending_reqs;
>>> diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
>>> index 6f36f3fe5cc8..9c2842bedf97 100644
>>> --- a/block/bfq-wf2q.c
>>> +++ b/block/bfq-wf2q.c
>>> @@ -984,19 +984,6 @@ static void __bfq_activate_entity(struct bfq_entity *entity,
>>>         entity->on_st_or_in_serv = true;
>>>     }
>>>
>>> -#ifdef CONFIG_BFQ_GROUP_IOSCHED
>>> -    if (!bfq_entity_to_bfqq(entity)) { /* bfq_group */
>>> -        struct bfq_group *bfqg =
>>> -            container_of(entity, struct bfq_group, entity);
>>> -        struct bfq_data *bfqd = bfqg->bfqd;
>>> -
>>> -        if (!entity->in_groups_with_pending_reqs) {
>>> -            entity->in_groups_with_pending_reqs = true;
>>> -            bfqd->num_groups_with_pending_reqs++;
>>> -        }
>>> -    }
>>> -#endif
>>> -
>>>     bfq_update_fin_time_enqueue(entity, st, backshifted);
>>> }
>>>
>>> @@ -1654,7 +1641,8 @@ void bfq_add_bfqq_in_groups_with_pending_reqs(struct bfq_queue *bfqq)
>>>     if (!entity->in_groups_with_pending_reqs) {
>>>         entity->in_groups_with_pending_reqs = true;
>>> #ifdef CONFIG_BFQ_GROUP_IOSCHED
>>> -        bfqq_group(bfqq)->num_queues_with_pending_reqs++;
>>> +        if (!(bfqq_group(bfqq)->num_queues_with_pending_reqs++))
>>> +            bfqq->bfqd->num_groups_with_pending_reqs++;
>>> #endif
>>>     }
>>> }
>>> @@ -1666,7 +1654,8 @@ void bfq_del_bfqq_in_groups_with_pending_reqs(struct bfq_queue *bfqq)
>>>     if (entity->in_groups_with_pending_reqs) {
>>>         entity->in_groups_with_pending_reqs = false;
>>> #ifdef CONFIG_BFQ_GROUP_IOSCHED
>>> -        bfqq_group(bfqq)->num_queues_with_pending_reqs--;
>>> +        if (!(--bfqq_group(bfqq)->num_queues_with_pending_reqs))
>>> +            bfqq->bfqd->num_groups_with_pending_reqs--;
>>> #endif
>>>     }
>>> }
>>> --
>>> 2.31.1
>>>
>>
>> .
>>

2022-07-12 13:33:56

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next v10 3/4] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'

Hi!

I'm resending my reply from a new mail address, because Paolo doesn't
seem to have received it.

On 2022/06/23 23:32, Paolo Valente wrote:
> Sorry for the delay.
>
>> On 10 Jun 2022, at 04:17, Yu Kuai <[email protected]> wrote:
>>
>> Currently, bfq can't handle sync io concurrently as long as they
>> are not issued from root group. This is because
>> 'bfqd->num_groups_with_pending_reqs > 0' is always true in
>> bfq_asymmetric_scenario().
>>
>> The way that bfqg is counted into 'num_groups_with_pending_reqs':
>>
>> Before this patch:
>> 1) root group will never be counted.
>> 2) Count if bfqg or it's child bfqgs have pending requests.
>> 3) Don't count if bfqg and it's child bfqgs complete all the requests.
>>
>> After this patch:
>> 1) root group is counted.
>> 2) Count if bfqg have pending requests.
>> 3) Don't count if bfqg complete all the requests.
>>
>> With this change, the occasion that only one group is activated can be
>> detected, and next patch will support concurrent sync io in the
>> occasion.
>>
>> Signed-off-by: Yu Kuai <[email protected]>
>> Reviewed-by: Jan Kara <[email protected]>
>> ---
>> block/bfq-iosched.c | 42 ------------------------------------------
>> block/bfq-iosched.h | 18 +++++++++---------
>> block/bfq-wf2q.c | 19 ++++---------------
>> 3 files changed, 13 insertions(+), 66 deletions(-)
>>
>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>> index 0ec21018daba..03b04892440c 100644
>> --- a/block/bfq-iosched.c
>> +++ b/block/bfq-iosched.c
>> @@ -970,48 +970,6 @@ void __bfq_weights_tree_remove(struct bfq_data *bfqd,
>> void bfq_weights_tree_remove(struct bfq_data *bfqd,
>> struct bfq_queue *bfqq)
>> {
>> - struct bfq_entity *entity = bfqq->entity.parent;
>> -
>> - for_each_entity(entity) {
>> - struct bfq_sched_data *sd = entity->my_sched_data;
>> -
>> - if (sd->next_in_service || sd->in_service_entity) {
>> - /*
>> - * entity is still active, because either
>> - * next_in_service or in_service_entity is not
>> - * NULL (see the comments on the definition of
>> - * next_in_service for details on why
>> - * in_service_entity must be checked too).
>> - *
>> - * As a consequence, its parent entities are
>> - * active as well, and thus this loop must
>> - * stop here.
>> - */
>> - break;
>> - }
>> -
>> - /*
>> - * The decrement of num_groups_with_pending_reqs is
>> - * not performed immediately upon the deactivation of
>> - * entity, but it is delayed to when it also happens
>> - * that the first leaf descendant bfqq of entity gets
>> - * all its pending requests completed. The following
>> - * instructions perform this delayed decrement, if
>> - * needed. See the comments on
>> - * num_groups_with_pending_reqs for details.
>> - */
>> - if (entity->in_groups_with_pending_reqs) {
>> - entity->in_groups_with_pending_reqs = false;
>> - bfqd->num_groups_with_pending_reqs--;
>> - }
>> - }
>
> With this part removed, I'm missing how you handle the following
> sequence of events:
> 1. a queue Q becomes non busy but still has dispatched requests, so
> it must not be removed from the counter of queues with pending reqs
> yet
> 2. the last request of Q is completed with Q being still idle (non
> busy). At this point Q must be removed from the counter. It seems to
> me that this case is not handled any longer
>
Hi, Paolo

1) First, patch 1 adds support for tracking whether a bfqq has pending
requests. The flag 'entity->in_groups_with_pending_reqs' is set when the
first request is inserted into the bfqq, and cleared when the last
request is completed (based on weights_tree insertion and removal).

2) Then, patch 2 adds a counter in bfqg for how many bfqqs have pending
requests. It is updated whenever the per-bfqq flag above changes.

3) Finally, patch 3 tracks 'num_groups_with_pending_reqs' based on the
new counter from patch 2:
- if the counter (how many bfqqs have pending requests) increases from 0
to 1, increase 'num_groups_with_pending_reqs'.
- if the counter decreases from 1 to 0, decrease
'num_groups_with_pending_reqs'.
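
The three steps can be sketched in userspace C as follows. This is only a
minimal illustration under simplifying assumptions, not the kernel code:
'struct sched_data', 'struct group', 'struct queue' and the two helper
names are hypothetical stand-ins for bfq_data, bfq_group, bfq_queue and
the bfq_{add,del}_bfqq_in_groups_with_pending_reqs() helpers, and locking
and CONFIG_BFQ_GROUP_IOSCHED handling are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for bfq_data, bfq_group and bfq_queue. */
struct sched_data {
	unsigned int num_groups_with_pending_reqs;
};

struct group {
	struct sched_data *sd;
	unsigned int num_queues_with_pending_reqs;
};

struct queue {
	struct group *grp;
	bool in_groups_with_pending_reqs;
};

/*
 * Called when the first request is inserted into q: flag the queue,
 * bump its group's counter, and count the group itself only when
 * the group's counter goes from 0 to 1.
 */
static void queue_add_pending(struct queue *q)
{
	if (q->in_groups_with_pending_reqs)
		return;
	q->in_groups_with_pending_reqs = true;
	if (!(q->grp->num_queues_with_pending_reqs++))
		q->grp->sd->num_groups_with_pending_reqs++;
}

/*
 * Called when the last request of q completes: clear the flag, drop
 * the group's counter, and stop counting the group only when the
 * group's counter goes from 1 to 0.
 */
static void queue_del_pending(struct queue *q)
{
	if (!q->in_groups_with_pending_reqs)
		return;
	q->in_groups_with_pending_reqs = false;
	if (!(--q->grp->num_queues_with_pending_reqs))
		q->grp->sd->num_groups_with_pending_reqs--;
}
```

The per-queue flag makes both helpers idempotent, so a queue cannot be
counted twice or removed twice, and the group-level counter changes only
on the 0 <-> 1 edges of the per-group counter.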

> Additional comment: if your changes do not cause the problem above,
> then this function only invokes __bfq_weights_tree_remove. So what's
> the point in keeping this function?
>
If this patchset is applied, follow-up cleanup patches will remove
this function.

multiple cleanup patches for bfq:
https://lore.kernel.org/all/[email protected]/
>> -
>> - /*
>> - * Next function is invoked last, because it causes bfqq to be
>> - * freed if the following holds: bfqq is not in service and
>> - * has no dispatched request. DO NOT use bfqq after the next
>> - * function invocation.
>> - */
>
> I would really love it if you leave this comment. I added it after
> suffering a lot for a nasty UAF. Of course the first sentence may
> need to be adjusted if the code that precedes it is to be removed.
>
Same as above, if this patch is applied, this function will be gone.

Thanks,
Kuai
> Thanks,
> Paolo
>
>
>> __bfq_weights_tree_remove(bfqd, bfqq,
>> &bfqd->queue_weights_tree);
>> }
>> diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
>> index de2446a9b7ab..f0fce94583e4 100644
>> --- a/block/bfq-iosched.h
>> +++ b/block/bfq-iosched.h
>> @@ -496,27 +496,27 @@ struct bfq_data {
>> struct rb_root_cached queue_weights_tree;
>>
>> /*
>> - * Number of groups with at least one descendant process that
>> + * Number of groups with at least one process that
>> * has at least one request waiting for completion. Note that
>> * this accounts for also requests already dispatched, but not
>> * yet completed. Therefore this number of groups may differ
>> * (be larger) than the number of active groups, as a group is
>> * considered active only if its corresponding entity has
>> - * descendant queues with at least one request queued. This
>> + * queues with at least one request queued. This
>> * number is used to decide whether a scenario is symmetric.
>> * For a detailed explanation see comments on the computation
>> * of the variable asymmetric_scenario in the function
>> * bfq_better_to_idle().
>> *
>> * However, it is hard to compute this number exactly, for
>> - * groups with multiple descendant processes. Consider a group
>> - * that is inactive, i.e., that has no descendant process with
>> + * groups with multiple processes. Consider a group
>> + * that is inactive, i.e., that has no process with
>> * pending I/O inside BFQ queues. Then suppose that
>> * num_groups_with_pending_reqs is still accounting for this
>> - * group, because the group has descendant processes with some
>> + * group, because the group has processes with some
>> * I/O request still in flight. num_groups_with_pending_reqs
>> * should be decremented when the in-flight request of the
>> - * last descendant process is finally completed (assuming that
>> + * last process is finally completed (assuming that
>> * nothing else has changed for the group in the meantime, in
>> * terms of composition of the group and active/inactive state of child
>> * groups and processes). To accomplish this, an additional
>> @@ -525,7 +525,7 @@ struct bfq_data {
>> * we resort to the following tradeoff between simplicity and
>> * accuracy: for an inactive group that is still counted in
>> * num_groups_with_pending_reqs, we decrement
>> - * num_groups_with_pending_reqs when the first descendant
>> + * num_groups_with_pending_reqs when the first
>> * process of the group remains with no request waiting for
>> * completion.
>> *
>> @@ -533,12 +533,12 @@ struct bfq_data {
>> * carefulness: to avoid multiple decrements, we flag a group,
>> * more precisely an entity representing a group, as still
>> * counted in num_groups_with_pending_reqs when it becomes
>> - * inactive. Then, when the first descendant queue of the
>> + * inactive. Then, when the first queue of the
>> * entity remains with no request waiting for completion,
>> * num_groups_with_pending_reqs is decremented, and this flag
>> * is reset. After this flag is reset for the entity,
>> * num_groups_with_pending_reqs won't be decremented any
>> - * longer in case a new descendant queue of the entity remains
>> + * longer in case a new queue of the entity remains
>> * with no request waiting for completion.
>> */
>> unsigned int num_groups_with_pending_reqs;
>> diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
>> index 6f36f3fe5cc8..9c2842bedf97 100644
>> --- a/block/bfq-wf2q.c
>> +++ b/block/bfq-wf2q.c
>> @@ -984,19 +984,6 @@ static void __bfq_activate_entity(struct bfq_entity *entity,
>> entity->on_st_or_in_serv = true;
>> }
>>
>> -#ifdef CONFIG_BFQ_GROUP_IOSCHED
>> - if (!bfq_entity_to_bfqq(entity)) { /* bfq_group */
>> - struct bfq_group *bfqg =
>> - container_of(entity, struct bfq_group, entity);
>> - struct bfq_data *bfqd = bfqg->bfqd;
>> -
>> - if (!entity->in_groups_with_pending_reqs) {
>> - entity->in_groups_with_pending_reqs = true;
>> - bfqd->num_groups_with_pending_reqs++;
>> - }
>> - }
>> -#endif
>> -
>> bfq_update_fin_time_enqueue(entity, st, backshifted);
>> }
>>
>> @@ -1654,7 +1641,8 @@ void bfq_add_bfqq_in_groups_with_pending_reqs(struct bfq_queue *bfqq)
>> if (!entity->in_groups_with_pending_reqs) {
>> entity->in_groups_with_pending_reqs = true;
>> #ifdef CONFIG_BFQ_GROUP_IOSCHED
>> - bfqq_group(bfqq)->num_queues_with_pending_reqs++;
>> + if (!(bfqq_group(bfqq)->num_queues_with_pending_reqs++))
>> + bfqq->bfqd->num_groups_with_pending_reqs++;
>> #endif
>> }
>> }
>> @@ -1666,7 +1654,8 @@ void bfq_del_bfqq_in_groups_with_pending_reqs(struct bfq_queue *bfqq)
>> if (entity->in_groups_with_pending_reqs) {
>> entity->in_groups_with_pending_reqs = false;
>> #ifdef CONFIG_BFQ_GROUP_IOSCHED
>> - bfqq_group(bfqq)->num_queues_with_pending_reqs--;
>> + if (!(--bfqq_group(bfqq)->num_queues_with_pending_reqs))
>> + bfqq->bfqd->num_groups_with_pending_reqs--;
>> #endif
>> }
>> }
>> --
>> 2.31.1
>>
>
> .
>

2022-07-20 11:49:07

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next v10 3/4] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'

Hi

On 2022/07/20 19:24, Paolo VALENTE wrote:
>
>
>> On 12 Jul 2022, at 15:30, Yu Kuai <[email protected]> wrote:
>>
>> Hi!
>>
>> I'm copying my reply with new mail address, because Paolo seems
>> didn't receive my reply.
>>
>> On 2022/06/23 23:32, Paolo Valente wrote:
>>> Sorry for the delay.
>>>> On 10 Jun 2022, at 04:17, Yu Kuai <[email protected]> wrote:
>>>>
>>>> Currently, bfq can't handle sync io concurrently as long as they
>>>> are not issued from root group. This is because
>>>> 'bfqd->num_groups_with_pending_reqs > 0' is always true in
>>>> bfq_asymmetric_scenario().
>>>>
>>>> The way that bfqg is counted into 'num_groups_with_pending_reqs':
>>>>
>>>> Before this patch:
>>>> 1) root group will never be counted.
>>>> 2) Count if bfqg or it's child bfqgs have pending requests.
>>>> 3) Don't count if bfqg and it's child bfqgs complete all the requests.
>>>>
>>>> After this patch:
>>>> 1) root group is counted.
>>>> 2) Count if bfqg have pending requests.
>>>> 3) Don't count if bfqg complete all the requests.
>>>>
>>>> With this change, the occasion that only one group is activated can be
>>>> detected, and next patch will support concurrent sync io in the
>>>> occasion.
>>>>
>>>> Signed-off-by: Yu Kuai <[email protected] <mailto:[email protected]>>
>>>> Reviewed-by: Jan Kara <[email protected] <mailto:[email protected]>>
>>>> ---
>>>> block/bfq-iosched.c | 42 ------------------------------------------
>>>> block/bfq-iosched.h | 18 +++++++++---------
>>>> block/bfq-wf2q.c    | 19 ++++---------------
>>>> 3 files changed, 13 insertions(+), 66 deletions(-)
>>>>
>>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>>>> index 0ec21018daba..03b04892440c 100644
>>>> --- a/block/bfq-iosched.c
>>>> +++ b/block/bfq-iosched.c
>>>> @@ -970,48 +970,6 @@ void __bfq_weights_tree_remove(struct bfq_data *bfqd,
>>>> void bfq_weights_tree_remove(struct bfq_data *bfqd,
>>>>     struct bfq_queue *bfqq)
>>>> {
>>>> -struct bfq_entity *entity = bfqq->entity.parent;
>>>> -
>>>> -for_each_entity(entity) {
>>>> -struct bfq_sched_data *sd = entity->my_sched_data;
>>>> -
>>>> -if (sd->next_in_service || sd->in_service_entity) {
>>>> -/*
>>>> -* entity is still active, because either
>>>> -* next_in_service or in_service_entity is not
>>>> -* NULL (see the comments on the definition of
>>>> -* next_in_service for details on why
>>>> -* in_service_entity must be checked too).
>>>> -*
>>>> -* As a consequence, its parent entities are
>>>> -* active as well, and thus this loop must
>>>> -* stop here.
>>>> -*/
>>>> -break;
>>>> -}
>>>> -
>>>> -/*
>>>> -* The decrement of num_groups_with_pending_reqs is
>>>> -* not performed immediately upon the deactivation of
>>>> -* entity, but it is delayed to when it also happens
>>>> -* that the first leaf descendant bfqq of entity gets
>>>> -* all its pending requests completed. The following
>>>> -* instructions perform this delayed decrement, if
>>>> -* needed. See the comments on
>>>> -* num_groups_with_pending_reqs for details.
>>>> -*/
>>>> -if (entity->in_groups_with_pending_reqs) {
>>>> -entity->in_groups_with_pending_reqs = false;
>>>> -bfqd->num_groups_with_pending_reqs--;
>>>> -}
>>>> -}
>>> With this part removed, I'm missing how you handle the following
>>> sequence of events:
>>> 1.  a queue Q becomes non busy but still has dispatched requests, so
>>> it must not be removed from the counter of queues with pending reqs
>>> yet
>>> 2.  the last request of Q is completed with Q being still idle (non
>>> busy).  At this point Q must be removed from the counter.  It seems to
>>> me that this case is not handled any longer
>> Hi, Paolo
>>
>> 1) At first, patch 1 support to track if bfqq has pending requests, it's
>> done by setting the flag 'entity->in_groups_with_pending_reqs' when the
>> first request is inserted to bfqq, and it's cleared when the last
>> request is completed(based on weights_tree insertion and removal).
>>
>
> In patch 1 I don't see the flag cleared for the request-completion event :(
>
> The piece of code involved is this:
>
> static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)
> {
> u64 now_ns;
> u32 delta_us;
>
> bfq_update_hw_tag(bfqd);
>
> bfqd->rq_in_driver[bfqq->actuator_idx]--;
> bfqd->tot_rq_in_driver--;
> bfqq->dispatched--;
>
> if (!bfqq->dispatched && !bfq_bfqq_busy(bfqq)) {
> /*
> * Set budget_timeout (which we overload to store the
> * time at which the queue remains with no backlog and
> * no outstanding request; used by the weight-raising
> * mechanism).
> */
> bfqq->budget_timeout = jiffies;
>
> bfq_weights_tree_remove(bfqd, bfqq);
> }
> ...
>
> Am I missing something?

I added a new API, bfq_del_bfqq_in_groups_with_pending_reqs(), in patch 1
to clear the flag. It is called from both bfq_del_bfqq_busy() and
bfq_completed_request(). I think you may have missed the latter:

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 0d46cb728bbf..0ec21018daba 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -6263,6 +6263,7 @@ static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)
*/
bfqq->budget_timeout = jiffies;

+ bfq_del_bfqq_in_groups_with_pending_reqs(bfqq);
bfq_weights_tree_remove(bfqd, bfqq);
}
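
The sequence of events Paolo asked about can be traced with a small
userspace sketch. It is an illustration under stated assumptions, not the
kernel code: 'struct queue' and the three helpers are hypothetical
stand-ins, and the '!dispatched' check in the del-busy path is assumed
to mirror the existing condition around bfq_weights_tree_remove() there.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct bfq_queue. */
struct queue {
	int dispatched;    /* requests sent to the device, not yet completed */
	bool busy;         /* queue still has requests queued in the scheduler */
	bool pending_flag; /* stand-in for entity->in_groups_with_pending_reqs */
};

/* Stand-in for bfq_del_bfqq_in_groups_with_pending_reqs(). */
static void del_pending(struct queue *q)
{
	q->pending_flag = false;
}

/* Event 1: the queue runs out of queued requests and becomes non busy. */
static void queue_del_busy(struct queue *q)
{
	q->busy = false;
	/* Assumed: only clear the flag if nothing is still in flight. */
	if (!q->dispatched)
		del_pending(q);
}

/* Event 2: one dispatched request of the queue is completed. */
static void queue_completed_request(struct queue *q)
{
	q->dispatched--;
	/* Same condition as in bfq_completed_request() quoted above. */
	if (!q->dispatched && !q->busy)
		del_pending(q);
}
```

With one request still in flight, queue_del_busy() leaves the flag set,
and the later queue_completed_request() clears it, so under these
assumptions the idle-but-dispatched case is covered by the completion
path.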

Thanks,
Kuai
>
> Thanks,
> Paolo

2022-07-27 12:50:41

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next v10 3/4] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'

Hi, Paolo

Are you still interested in this patchset?

On 2022/07/20 19:38, Yu Kuai wrote:
> Hi
>
> On 2022/07/20 19:24, Paolo VALENTE wrote:
>>
>>
>>> On 12 Jul 2022, at 15:30, Yu Kuai <[email protected]> wrote:
>>>
>>> Hi!
>>>
>>> I'm copying my reply with new mail address, because Paolo seems
>>> didn't receive my reply.
>>>
>>> On 2022/06/23 23:32, Paolo Valente wrote:
>>>> Sorry for the delay.
>>>>> On 10 Jun 2022, at 04:17, Yu Kuai <[email protected]> wrote:
>>>>>
>>>>> Currently, bfq can't handle sync io concurrently as long as they
>>>>> are not issued from root group. This is because
>>>>> 'bfqd->num_groups_with_pending_reqs > 0' is always true in
>>>>> bfq_asymmetric_scenario().
>>>>>
>>>>> The way that bfqg is counted into 'num_groups_with_pending_reqs':
>>>>>
>>>>> Before this patch:
>>>>> 1) root group will never be counted.
>>>>> 2) Count if bfqg or it's child bfqgs have pending requests.
>>>>> 3) Don't count if bfqg and it's child bfqgs complete all the
>>>>> requests.
>>>>>
>>>>> After this patch:
>>>>> 1) root group is counted.
>>>>> 2) Count if bfqg have pending requests.
>>>>> 3) Don't count if bfqg complete all the requests.
>>>>>
>>>>> With this change, the occasion that only one group is activated
>>>>> can be
>>>>> detected, and next patch will support concurrent sync io in the
>>>>> occasion.
>>>>>
>>>>> Signed-off-by: Yu Kuai <[email protected]
>>>>> <mailto:[email protected]>>
>>>>> Reviewed-by: Jan Kara <[email protected] <mailto:[email protected]>>
>>>>> ---
>>>>> block/bfq-iosched.c | 42 ------------------------------------------
>>>>> block/bfq-iosched.h | 18 +++++++++---------
>>>>> block/bfq-wf2q.c    | 19 ++++---------------
>>>>> 3 files changed, 13 insertions(+), 66 deletions(-)
>>>>>
>>>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>>>>> index 0ec21018daba..03b04892440c 100644
>>>>> --- a/block/bfq-iosched.c
>>>>> +++ b/block/bfq-iosched.c
>>>>> @@ -970,48 +970,6 @@ void __bfq_weights_tree_remove(struct
>>>>> bfq_data *bfqd,
>>>>> void bfq_weights_tree_remove(struct bfq_data *bfqd,
>>>>>     struct bfq_queue *bfqq)
>>>>> {
>>>>> -struct bfq_entity *entity = bfqq->entity.parent;
>>>>> -
>>>>> -for_each_entity(entity) {
>>>>> -struct bfq_sched_data *sd = entity->my_sched_data;
>>>>> -
>>>>> -if (sd->next_in_service || sd->in_service_entity) {
>>>>> -/*
>>>>> -* entity is still active, because either
>>>>> -* next_in_service or in_service_entity is not
>>>>> -* NULL (see the comments on the definition of
>>>>> -* next_in_service for details on why
>>>>> -* in_service_entity must be checked too).
>>>>> -*
>>>>> -* As a consequence, its parent entities are
>>>>> -* active as well, and thus this loop must
>>>>> -* stop here.
>>>>> -*/
>>>>> -break;
>>>>> -}
>>>>> -
>>>>> -/*
>>>>> -* The decrement of num_groups_with_pending_reqs is
>>>>> -* not performed immediately upon the deactivation of
>>>>> -* entity, but it is delayed to when it also happens
>>>>> -* that the first leaf descendant bfqq of entity gets
>>>>> -* all its pending requests completed. The following
>>>>> -* instructions perform this delayed decrement, if
>>>>> -* needed. See the comments on
>>>>> -* num_groups_with_pending_reqs for details.
>>>>> -*/
>>>>> -if (entity->in_groups_with_pending_reqs) {
>>>>> -entity->in_groups_with_pending_reqs = false;
>>>>> -bfqd->num_groups_with_pending_reqs--;
>>>>> -}
>>>>> -}
>>>> With this part removed, I'm missing how you handle the following
>>>> sequence of events:
>>>> 1.  a queue Q becomes non busy but still has dispatched requests, so
>>>> it must not be removed from the counter of queues with pending reqs
>>>> yet
>>>> 2.  the last request of Q is completed with Q being still idle (non
>>>> busy).  At this point Q must be removed from the counter.  It seems to
>>>> me that this case is not handled any longer
>>> Hi, Paolo
>>>
>>> 1) At first, patch 1 support to track if bfqq has pending requests,
>>> it's
>>> done by setting the flag 'entity->in_groups_with_pending_reqs' when the
>>> first request is inserted to bfqq, and it's cleared when the last
>>> request is completed(based on weights_tree insertion and removal).
>>>
>>
>> In patch 1 I don't see the flag cleared for the request-completion
>> event :(
>>
>> The piece of code involved is this:
>>
>> static void bfq_completed_request(struct bfq_queue *bfqq, struct
>> bfq_data *bfqd)
>> {
>> u64 now_ns;
>> u32 delta_us;
>>
>> bfq_update_hw_tag(bfqd);
>>
>> bfqd->rq_in_driver[bfqq->actuator_idx]--;
>> bfqd->tot_rq_in_driver--;
>> bfqq->dispatched--;
>>
>> if (!bfqq->dispatched && !bfq_bfqq_busy(bfqq)) {
>> /*
>> * Set budget_timeout (which we overload to store the
>> * time at which the queue remains with no backlog and
>> * no outstanding request; used by the weight-raising
>> * mechanism).
>> */
>> bfqq->budget_timeout = jiffies;
>>
>> bfq_weights_tree_remove(bfqd, bfqq);
>> }
>> ...
>>
>> Am I missing something?
>
> I add a new api bfq_del_bfqq_in_groups_with_pending_reqs() in patch 1
> to clear the flag, and it's called both from bfq_del_bfqq_busy() and
> bfq_completed_request(). I think you may miss the later:
>
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index 0d46cb728bbf..0ec21018daba 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -6263,6 +6263,7 @@ static void bfq_completed_request(struct
> bfq_queue *bfqq, struct bfq_data *bfqd)
>           */
>          bfqq->budget_timeout = jiffies;
>
> +        bfq_del_bfqq_in_groups_with_pending_reqs(bfqq);
>          bfq_weights_tree_remove(bfqd, bfqq);
>      }
>
> Thanks,
> Kuai
>>
>> Thanks,
>> Paolo

2022-08-05 11:30:38

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next v10 3/4] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'


On 2022/07/27 20:11, Yu Kuai wrote:
> Hi, Paolo
>
> Are you still interested in this patchset?

Friendly ping...
>
> On 2022/07/20 19:38, Yu Kuai wrote:
>> Hi
>>
>> On 2022/07/20 19:24, Paolo VALENTE wrote:
>>>
>>>
>>>> On 12 Jul 2022, at 15:30, Yu Kuai <[email protected]> wrote:
>>>>
>>>> Hi!
>>>>
>>>> I'm copying my reply with new mail address, because Paolo seems
>>>> didn't receive my reply.
>>>>
>>>> On 2022/06/23 23:32, Paolo Valente wrote:
>>>>> Sorry for the delay.
>>>>>> On 10 Jun 2022, at 04:17, Yu Kuai <[email protected]> wrote:
>>>>>>
>>>>>> Currently, bfq can't handle sync io concurrently as long as they
>>>>>> are not issued from root group. This is because
>>>>>> 'bfqd->num_groups_with_pending_reqs > 0' is always true in
>>>>>> bfq_asymmetric_scenario().
>>>>>>
>>>>>> The way that bfqg is counted into 'num_groups_with_pending_reqs':
>>>>>>
>>>>>> Before this patch:
>>>>>> 1) root group will never be counted.
>>>>>> 2) Count if bfqg or it's child bfqgs have pending requests.
>>>>>> 3) Don't count if bfqg and it's child bfqgs complete all the
>>>>>> requests.
>>>>>>
>>>>>> After this patch:
>>>>>> 1) root group is counted.
>>>>>> 2) Count if bfqg have pending requests.
>>>>>> 3) Don't count if bfqg complete all the requests.
>>>>>>
>>>>>> With this change, the occasion that only one group is activated
>>>>>> can be
>>>>>> detected, and next patch will support concurrent sync io in the
>>>>>> occasion.
>>>>>>
>>>>>> Signed-off-by: Yu Kuai <[email protected]
>>>>>> <mailto:[email protected]>>
>>>>>> Reviewed-by: Jan Kara <[email protected] <mailto:[email protected]>>
>>>>>> ---
>>>>>> block/bfq-iosched.c | 42 ------------------------------------------
>>>>>> block/bfq-iosched.h | 18 +++++++++---------
>>>>>> block/bfq-wf2q.c    | 19 ++++---------------
>>>>>> 3 files changed, 13 insertions(+), 66 deletions(-)
>>>>>>
>>>>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>>>>>> index 0ec21018daba..03b04892440c 100644
>>>>>> --- a/block/bfq-iosched.c
>>>>>> +++ b/block/bfq-iosched.c
>>>>>> @@ -970,48 +970,6 @@ void __bfq_weights_tree_remove(struct
>>>>>> bfq_data *bfqd,
>>>>>> void bfq_weights_tree_remove(struct bfq_data *bfqd,
>>>>>>     struct bfq_queue *bfqq)
>>>>>> {
>>>>>> -struct bfq_entity *entity = bfqq->entity.parent;
>>>>>> -
>>>>>> -for_each_entity(entity) {
>>>>>> -struct bfq_sched_data *sd = entity->my_sched_data;
>>>>>> -
>>>>>> -if (sd->next_in_service || sd->in_service_entity) {
>>>>>> -/*
>>>>>> -* entity is still active, because either
>>>>>> -* next_in_service or in_service_entity is not
>>>>>> -* NULL (see the comments on the definition of
>>>>>> -* next_in_service for details on why
>>>>>> -* in_service_entity must be checked too).
>>>>>> -*
>>>>>> -* As a consequence, its parent entities are
>>>>>> -* active as well, and thus this loop must
>>>>>> -* stop here.
>>>>>> -*/
>>>>>> -break;
>>>>>> -}
>>>>>> -
>>>>>> -/*
>>>>>> -* The decrement of num_groups_with_pending_reqs is
>>>>>> -* not performed immediately upon the deactivation of
>>>>>> -* entity, but it is delayed to when it also happens
>>>>>> -* that the first leaf descendant bfqq of entity gets
>>>>>> -* all its pending requests completed. The following
>>>>>> -* instructions perform this delayed decrement, if
>>>>>> -* needed. See the comments on
>>>>>> -* num_groups_with_pending_reqs for details.
>>>>>> -*/
>>>>>> -if (entity->in_groups_with_pending_reqs) {
>>>>>> -entity->in_groups_with_pending_reqs = false;
>>>>>> -bfqd->num_groups_with_pending_reqs--;
>>>>>> -}
>>>>>> -}
>>>>> With this part removed, I'm missing how you handle the following
>>>>> sequence of events:
>>>>> 1.  a queue Q becomes non busy but still has dispatched requests, so
>>>>> it must not be removed from the counter of queues with pending reqs
>>>>> yet
>>>>> 2.  the last request of Q is completed with Q being still idle (non
>>>>> busy).  At this point Q must be removed from the counter.  It seems to
>>>>> me that this case is not handled any longer
>>>> Hi, Paolo
>>>>
>>>> 1) At first, patch 1 support to track if bfqq has pending requests,
>>>> it's
>>>> done by setting the flag 'entity->in_groups_with_pending_reqs' when the
>>>> first request is inserted to bfqq, and it's cleared when the last
>>>> request is completed(based on weights_tree insertion and removal).
>>>>
>>>
>>> In patch 1 I don't see the flag cleared for the request-completion
>>> event :(
>>>
>>> The piece of code involved is this:
>>>
>>> static void bfq_completed_request(struct bfq_queue *bfqq, struct
>>> bfq_data *bfqd)
>>> {
>>> u64 now_ns;
>>> u32 delta_us;
>>>
>>> bfq_update_hw_tag(bfqd);
>>>
>>> bfqd->rq_in_driver[bfqq->actuator_idx]--;
>>> bfqd->tot_rq_in_driver--;
>>> bfqq->dispatched--;
>>>
>>> if (!bfqq->dispatched && !bfq_bfqq_busy(bfqq)) {
>>> /*
>>> * Set budget_timeout (which we overload to store the
>>> * time at which the queue remains with no backlog and
>>> * no outstanding request; used by the weight-raising
>>> * mechanism).
>>> */
>>> bfqq->budget_timeout = jiffies;
>>>
>>> bfq_weights_tree_remove(bfqd, bfqq);
>>> }
>>> ...
>>>
>>> Am I missing something?
>>
>> I add a new api bfq_del_bfqq_in_groups_with_pending_reqs() in patch 1
>> to clear the flag, and it's called both from bfq_del_bfqq_busy() and
>> bfq_completed_request(). I think you may miss the later:
>>
>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>> index 0d46cb728bbf..0ec21018daba 100644
>> --- a/block/bfq-iosched.c
>> +++ b/block/bfq-iosched.c
>> @@ -6263,6 +6263,7 @@ static void bfq_completed_request(struct
>> bfq_queue *bfqq, struct bfq_data *bfqd)
>>           */
>>          bfqq->budget_timeout = jiffies;
>>
>> +        bfq_del_bfqq_in_groups_with_pending_reqs(bfqq);
>>          bfq_weights_tree_remove(bfqd, bfqq);
>>      }
>>
>> Thanks,
>> Kuai
>>>
>>> Thanks,
>>> Paolo
>
> .
>


2022-08-10 10:53:33

by Paolo Valente

[permalink] [raw]
Subject: Re: [PATCH -next v10 3/4] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'



> On 27 Jul 2022, at 14:11, Yu Kuai <[email protected]> wrote:
>
> Hi, Paolo
>

hi

> Are you still interested in this patchset?
>

Yes. Sorry for replying very late again.

The last fix that you suggest is probably enough, but I'm a little
concerned that it may be hasty. Before this fix we exchanged several
messages, and I don't seem to have been very good at convincing you
of the need to also take in-service I/O into account. So, my question
is: are you sure that you now have a clear and complete understanding
of this non-trivial matter? Consequently, are we sure that this last
fix is all we need? Of course, I will check on my own, but if you can
reassure me on this point, I will feel more confident.

Thanks,
Paolo

> On 2022/07/20 19:38, Yu Kuai wrote:
>> Hi
>>
>> On 2022/07/20 19:24, Paolo VALENTE wrote:
>>>
>>>
>>>> On 12 Jul 2022, at 15:30, Yu Kuai <[email protected]> wrote:
>>>>
>>>> Hi!
>>>>
>>>> I'm copying my reply with new mail address, because Paolo seems
>>>> didn't receive my reply.
>>>>
>>>> On 2022/06/23 23:32, Paolo Valente wrote:
>>>>> Sorry for the delay.
>>>>>> On 10 Jun 2022, at 04:17, Yu Kuai <[email protected]> wrote:
>>>>>>
>>>>>> Currently, bfq can't handle sync io concurrently as long as they
>>>>>> are not issued from root group. This is because
>>>>>> 'bfqd->num_groups_with_pending_reqs > 0' is always true in
>>>>>> bfq_asymmetric_scenario().
>>>>>>
>>>>>> The way that bfqg is counted into 'num_groups_with_pending_reqs':
>>>>>>
>>>>>> Before this patch:
>>>>>> 1) root group will never be counted.
>>>>>> 2) Count if bfqg or it's child bfqgs have pending requests.
>>>>>> 3) Don't count if bfqg and it's child bfqgs complete all the requests.
>>>>>>
>>>>>> After this patch:
>>>>>> 1) root group is counted.
>>>>>> 2) Count if bfqg have pending requests.
>>>>>> 3) Don't count if bfqg complete all the requests.
>>>>>>
>>>>>> With this change, the occasion that only one group is activated can be
>>>>>> detected, and next patch will support concurrent sync io in the
>>>>>> occasion.
>>>>>>
>>>>>> Signed-off-by: Yu Kuai <[email protected]>
>>>>>> Reviewed-by: Jan Kara <[email protected]>
>>>>>> ---
>>>>>> block/bfq-iosched.c | 42 ------------------------------------------
>>>>>> block/bfq-iosched.h | 18 +++++++++---------
>>>>>> block/bfq-wf2q.c | 19 ++++---------------
>>>>>> 3 files changed, 13 insertions(+), 66 deletions(-)
>>>>>>
>>>>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>>>>>> index 0ec21018daba..03b04892440c 100644
>>>>>> --- a/block/bfq-iosched.c
>>>>>> +++ b/block/bfq-iosched.c
>>>>>> @@ -970,48 +970,6 @@ void __bfq_weights_tree_remove(struct bfq_data *bfqd,
>>>>>> void bfq_weights_tree_remove(struct bfq_data *bfqd,
>>>>>> struct bfq_queue *bfqq)
>>>>>> {
>>>>>> -struct bfq_entity *entity = bfqq->entity.parent;
>>>>>> -
>>>>>> -for_each_entity(entity) {
>>>>>> -struct bfq_sched_data *sd = entity->my_sched_data;
>>>>>> -
>>>>>> -if (sd->next_in_service || sd->in_service_entity) {
>>>>>> -/*
>>>>>> -* entity is still active, because either
>>>>>> -* next_in_service or in_service_entity is not
>>>>>> -* NULL (see the comments on the definition of
>>>>>> -* next_in_service for details on why
>>>>>> -* in_service_entity must be checked too).
>>>>>> -*
>>>>>> -* As a consequence, its parent entities are
>>>>>> -* active as well, and thus this loop must
>>>>>> -* stop here.
>>>>>> -*/
>>>>>> -break;
>>>>>> -}
>>>>>> -
>>>>>> -/*
>>>>>> -* The decrement of num_groups_with_pending_reqs is
>>>>>> -* not performed immediately upon the deactivation of
>>>>>> -* entity, but it is delayed to when it also happens
>>>>>> -* that the first leaf descendant bfqq of entity gets
>>>>>> -* all its pending requests completed. The following
>>>>>> -* instructions perform this delayed decrement, if
>>>>>> -* needed. See the comments on
>>>>>> -* num_groups_with_pending_reqs for details.
>>>>>> -*/
>>>>>> -if (entity->in_groups_with_pending_reqs) {
>>>>>> -entity->in_groups_with_pending_reqs = false;
>>>>>> -bfqd->num_groups_with_pending_reqs--;
>>>>>> -}
>>>>>> -}
>>>>> With this part removed, I'm missing how you handle the following
>>>>> sequence of events:
>>>>> 1. a queue Q becomes non busy but still has dispatched requests, so
>>>>> it must not be removed from the counter of queues with pending reqs
>>>>> yet
>>>>> 2. the last request of Q is completed with Q being still idle (non
>>>>> busy). At this point Q must be removed from the counter. It seems to
>>>>> me that this case is not handled any longer
>>>> Hi, Paolo
>>>>
>>>> 1) At first, patch 1 support to track if bfqq has pending requests, it's
>>>> done by setting the flag 'entity->in_groups_with_pending_reqs' when the
>>>> first request is inserted to bfqq, and it's cleared when the last
>>>> request is completed(based on weights_tree insertion and removal).
>>>>
>>>
>>> In patch 1 I don't see the flag cleared for the request-completion event :(
>>>
>>> The piece of code involved is this:
>>>
>>> static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)
>>> {
>>> u64 now_ns;
>>> u32 delta_us;
>>>
>>> bfq_update_hw_tag(bfqd);
>>>
>>> bfqd->rq_in_driver[bfqq->actuator_idx]--;
>>> bfqd->tot_rq_in_driver--;
>>> bfqq->dispatched--;
>>>
>>> if (!bfqq->dispatched && !bfq_bfqq_busy(bfqq)) {
>>> /*
>>> * Set budget_timeout (which we overload to store the
>>> * time at which the queue remains with no backlog and
>>> * no outstanding request; used by the weight-raising
>>> * mechanism).
>>> */
>>> bfqq->budget_timeout = jiffies;
>>>
>>> bfq_weights_tree_remove(bfqd, bfqq);
>>> }
>>> ...
>>>
>>> Am I missing something?
>>
>> I add a new api bfq_del_bfqq_in_groups_with_pending_reqs() in patch 1
>> to clear the flag, and it's called both from bfq_del_bfqq_busy() and
>> bfq_completed_request(). I think you may miss the later:
>>
>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>> index 0d46cb728bbf..0ec21018daba 100644
>> --- a/block/bfq-iosched.c
>> +++ b/block/bfq-iosched.c
>> @@ -6263,6 +6263,7 @@ static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)
>> */
>> bfqq->budget_timeout = jiffies;
>>
>> + bfq_del_bfqq_in_groups_with_pending_reqs(bfqq);
>> bfq_weights_tree_remove(bfqd, bfqq);
>> }
>>
>> Thanks,
>> Kuai
>>>
>>> Thanks,
>>> Paolo
>

2022-08-11 01:38:27

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next v10 3/4] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'

Hi, Paolo

On 2022/08/10 18:49, Paolo Valente wrote:
>
>
>> On 27 Jul 2022, at 14:11, Yu Kuai <[email protected]> wrote:
>>
>> Hi, Paolo
>>
>
> hi
>
>> Are you still interested in this patchset?
>>
>
> Yes. Sorry for replying very late again.
>
> Probably the last fix that you suggest is enough, but I'm a little bit
> concerned that it may be a little hasty. In fact, before this fix, we
> exchanged several messages, and I didn't seem to be very good at
> convincing you about the need to keep into account also in-service
> I/O. So, my question is: are you sure that now you have a

I'm confused here. I'm well aware that in-service I/O (referred to as
pending requests in the patchset) should be counted, as you suggested
in v7. Do you still think that the approach in this patchset is
problematic?

I'll try to explain again how to track whether a bfqq has pending
requests; please let me know if you still think there are problems:

Patch 1 adds support for tracking whether a bfqq has pending requests.
This is done by setting the flag 'entity->in_groups_with_pending_reqs'
when the first request is inserted into the bfqq, and clearing it when
the last request is completed. Specifically, the flag is set in
bfq_add_bfqq_busy() when 'bfqq->dispatched' is zero, and it is cleared
in both bfq_completed_request() and bfq_del_bfqq_busy() when
'bfqq->dispatched' is zero.
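
The rule described above can be illustrated with a minimal user-space
sketch. This is only a model under stated assumptions, not the kernel
code: the struct, its fields, and the model_*() helpers are hypothetical
names introduced for illustration, and only the set/clear conditions are
taken from the description in this mail.

```c
#include <stdbool.h>

/*
 * Hypothetical user-space model of one bfq_queue. All names here are
 * illustrative assumptions, not kernel identifiers.
 */
struct model_queue {
	int  dispatched;       /* requests dispatched but not yet completed */
	bool busy;             /* queue currently has backlog */
	bool has_pending_reqs; /* models entity->in_groups_with_pending_reqs */
};

/* Models bfq_add_bfqq_busy(): set the flag if nothing is in flight. */
static void model_add_busy(struct model_queue *q)
{
	q->busy = true;
	if (!q->dispatched)
		q->has_pending_reqs = true;
}

/* Models dispatching one request to the driver. */
static void model_dispatch(struct model_queue *q)
{
	q->dispatched++;
}

/* Models bfq_del_bfqq_busy(): clear only if nothing is in flight. */
static void model_del_busy(struct model_queue *q)
{
	q->busy = false;
	if (!q->dispatched)
		q->has_pending_reqs = false;
}

/* Models bfq_completed_request(): clear on last completion if idle. */
static void model_completed_request(struct model_queue *q)
{
	q->dispatched--;
	if (!q->dispatched && !q->busy)
		q->has_pending_reqs = false;
}
```

In this model, a queue that becomes non-busy while a request is still in
flight keeps the flag set, and the flag is cleared only when that last
request completes, which is the two-step sequence raised earlier in the
thread.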

Thanks,
Kuai
> clear/complete understanding of this non-trivial matter?
> Consequently, are we sure that this last fix is most certainly all we
> need? Of course, I will check on my own, but if you reassure me on
> this point, I will feel more confident.
>
> Thanks,
> Paolo
>
>> On 2022/07/20 19:38, Yu Kuai wrote:
>>> Hi
>>>
>>> On 2022/07/20 19:24, Paolo VALENTE wrote:
>>>>
>>>>
>>>>> On 12 Jul 2022, at 15:30, Yu Kuai <[email protected]> wrote:
>>>>>
>>>>> Hi!
>>>>>
>>>>> I'm copying my reply with new mail address, because Paolo seems
>>>>> didn't receive my reply.
>>>>>
>>>>> On 2022/06/23 23:32, Paolo Valente wrote:
>>>>>> Sorry for the delay.
>>>>>>> On 10 Jun 2022, at 04:17, Yu Kuai <[email protected]> wrote:
>>>>>>>
>>>>>>> Currently, bfq can't handle sync io concurrently as long as they
>>>>>>> are not issued from root group. This is because
>>>>>>> 'bfqd->num_groups_with_pending_reqs > 0' is always true in
>>>>>>> bfq_asymmetric_scenario().
>>>>>>>
>>>>>>> The way that bfqg is counted into 'num_groups_with_pending_reqs':
>>>>>>>
>>>>>>> Before this patch:
>>>>>>> 1) root group will never be counted.
>>>>>>> 2) Count if bfqg or it's child bfqgs have pending requests.
>>>>>>> 3) Don't count if bfqg and it's child bfqgs complete all the requests.
>>>>>>>
>>>>>>> After this patch:
>>>>>>> 1) root group is counted.
>>>>>>> 2) Count if bfqg have pending requests.
>>>>>>> 3) Don't count if bfqg complete all the requests.
>>>>>>>
>>>>>>> With this change, the occasion that only one group is activated can be
>>>>>>> detected, and next patch will support concurrent sync io in the
>>>>>>> occasion.
>>>>>>>
>>>>>>> Signed-off-by: Yu Kuai <[email protected]>
>>>>>>> Reviewed-by: Jan Kara <[email protected]>
>>>>>>> ---
>>>>>>> block/bfq-iosched.c | 42 ------------------------------------------
>>>>>>> block/bfq-iosched.h | 18 +++++++++---------
>>>>>>> block/bfq-wf2q.c | 19 ++++---------------
>>>>>>> 3 files changed, 13 insertions(+), 66 deletions(-)
>>>>>>>
>>>>>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>>>>>>> index 0ec21018daba..03b04892440c 100644
>>>>>>> --- a/block/bfq-iosched.c
>>>>>>> +++ b/block/bfq-iosched.c
>>>>>>> @@ -970,48 +970,6 @@ void __bfq_weights_tree_remove(struct bfq_data *bfqd,
>>>>>>> void bfq_weights_tree_remove(struct bfq_data *bfqd,
>>>>>>> struct bfq_queue *bfqq)
>>>>>>> {
>>>>>>> -struct bfq_entity *entity = bfqq->entity.parent;
>>>>>>> -
>>>>>>> -for_each_entity(entity) {
>>>>>>> -struct bfq_sched_data *sd = entity->my_sched_data;
>>>>>>> -
>>>>>>> -if (sd->next_in_service || sd->in_service_entity) {
>>>>>>> -/*
>>>>>>> -* entity is still active, because either
>>>>>>> -* next_in_service or in_service_entity is not
>>>>>>> -* NULL (see the comments on the definition of
>>>>>>> -* next_in_service for details on why
>>>>>>> -* in_service_entity must be checked too).
>>>>>>> -*
>>>>>>> -* As a consequence, its parent entities are
>>>>>>> -* active as well, and thus this loop must
>>>>>>> -* stop here.
>>>>>>> -*/
>>>>>>> -break;
>>>>>>> -}
>>>>>>> -
>>>>>>> -/*
>>>>>>> -* The decrement of num_groups_with_pending_reqs is
>>>>>>> -* not performed immediately upon the deactivation of
>>>>>>> -* entity, but it is delayed to when it also happens
>>>>>>> -* that the first leaf descendant bfqq of entity gets
>>>>>>> -* all its pending requests completed. The following
>>>>>>> -* instructions perform this delayed decrement, if
>>>>>>> -* needed. See the comments on
>>>>>>> -* num_groups_with_pending_reqs for details.
>>>>>>> -*/
>>>>>>> -if (entity->in_groups_with_pending_reqs) {
>>>>>>> -entity->in_groups_with_pending_reqs = false;
>>>>>>> -bfqd->num_groups_with_pending_reqs--;
>>>>>>> -}
>>>>>>> -}
>>>>>> With this part removed, I'm missing how you handle the following
>>>>>> sequence of events:
>>>>>> 1. a queue Q becomes non busy but still has dispatched requests, so
>>>>>> it must not be removed from the counter of queues with pending reqs
>>>>>> yet
>>>>>> 2. the last request of Q is completed with Q being still idle (non
>>>>>> busy). At this point Q must be removed from the counter. It seems to
>>>>>> me that this case is not handled any longer
>>>>> Hi, Paolo
>>>>>
>>>>> 1) At first, patch 1 support to track if bfqq has pending requests, it's
>>>>> done by setting the flag 'entity->in_groups_with_pending_reqs' when the
>>>>> first request is inserted to bfqq, and it's cleared when the last
>>>>> request is completed(based on weights_tree insertion and removal).
>>>>>
>>>>
>>>> In patch 1 I don't see the flag cleared for the request-completion event :(
>>>>
>>>> The piece of code involved is this:
>>>>
>>>> static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)
>>>> {
>>>> u64 now_ns;
>>>> u32 delta_us;
>>>>
>>>> bfq_update_hw_tag(bfqd);
>>>>
>>>> bfqd->rq_in_driver[bfqq->actuator_idx]--;
>>>> bfqd->tot_rq_in_driver--;
>>>> bfqq->dispatched--;
>>>>
>>>> if (!bfqq->dispatched && !bfq_bfqq_busy(bfqq)) {
>>>> /*
>>>> * Set budget_timeout (which we overload to store the
>>>> * time at which the queue remains with no backlog and
>>>> * no outstanding request; used by the weight-raising
>>>> * mechanism).
>>>> */
>>>> bfqq->budget_timeout = jiffies;
>>>>
>>>> bfq_weights_tree_remove(bfqd, bfqq);
>>>> }
>>>> ...
>>>>
>>>> Am I missing something?
>>>
>>> I add a new api bfq_del_bfqq_in_groups_with_pending_reqs() in patch 1
>>> to clear the flag, and it's called both from bfq_del_bfqq_busy() and
>>> bfq_completed_request(). I think you may miss the later:
>>>
>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>>> index 0d46cb728bbf..0ec21018daba 100644
>>> --- a/block/bfq-iosched.c
>>> +++ b/block/bfq-iosched.c
>>> @@ -6263,6 +6263,7 @@ static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)
>>> */
>>> bfqq->budget_timeout = jiffies;
>>>
>>> + bfq_del_bfqq_in_groups_with_pending_reqs(bfqq);
>>> bfq_weights_tree_remove(bfqd, bfqq);
>>> }
>>>
>>> Thanks,
>>> Kuai
>>>>
>>>> Thanks,
>>>> Paolo
>>
>
> .
>

2022-08-25 12:21:27

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next v10 3/4] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'



On 2022/08/11 9:19, Yu Kuai wrote:
> Hi, Paolo
>
> On 2022/08/10 18:49, Paolo Valente wrote:
>>
>>
>>> On 27 Jul 2022, at 14:11, Yu Kuai <[email protected]> wrote:
>>>
>>> Hi, Paolo
>>>
>>
>> hi
>>
>>> Are you still interested in this patchset?
>>>
>>
>> Yes. Sorry for replying very late again.
>>
>> Probably the last fix that you suggest is enough, but I'm a little bit
>> concerned that it may be a little hasty.  In fact, before this fix, we
>> exchanged several messages, and I didn't seem to be very good at
>> convincing you about the need to keep into account also in-service
>> I/O.  So, my question is: are you sure that now you have a
>
> I'm confused here. I'm well aware that in-service I/O (referred to as
> pending requests in the patchset) should be counted, as you suggested
> in v7. Do you still think that the approach in this patchset is
> problematic?
>
> I'll try to explain again how to track whether a bfqq has pending
> requests; please let me know if you still think there are problems:
>
> Patch 1 adds support for tracking whether a bfqq has pending requests.
> This is done by setting the flag 'entity->in_groups_with_pending_reqs'
> when the first request is inserted into the bfqq, and clearing it when
> the last request is completed. Specifically, the flag is set in
> bfq_add_bfqq_busy() when 'bfqq->dispatched' is zero, and it is cleared
> in both bfq_completed_request() and bfq_del_bfqq_busy() when
> 'bfqq->dispatched' is zero.

Hi, Paolo

Could you please check whether patch 1 is OK?

Thanks,
Kuai
>
> Thanks,
> Kuai
>> clear/complete understanding of this non-trivial matter?
>> Consequently, are we sure that this last fix is most certainly all we
>> need?  Of course, I will check on my own, but if you reassure me on
>> this point, I will feel more confident.
>>
>> Thanks,
>> Paolo
>>
>>> On 2022/07/20 19:38, Yu Kuai wrote:
>>>> Hi
>>>>
>>>> On 2022/07/20 19:24, Paolo VALENTE wrote:
>>>>>
>>>>>
>>>>>> On 12 Jul 2022, at 15:30, Yu Kuai <[email protected]> wrote:
>>>>>>
>>>>>> Hi!
>>>>>>
>>>>>> I'm copying my reply with new mail address, because Paolo seems
>>>>>> didn't receive my reply.
>>>>>>
>>>>>> On 2022/06/23 23:32, Paolo Valente wrote:
>>>>>>> Sorry for the delay.
>>>>>>>> On 10 Jun 2022, at 04:17, Yu Kuai <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Currently, bfq can't handle sync io concurrently as long as they
>>>>>>>> are not issued from root group. This is because
>>>>>>>> 'bfqd->num_groups_with_pending_reqs > 0' is always true in
>>>>>>>> bfq_asymmetric_scenario().
>>>>>>>>
>>>>>>>> The way that bfqg is counted into 'num_groups_with_pending_reqs':
>>>>>>>>
>>>>>>>> Before this patch:
>>>>>>>> 1) root group will never be counted.
>>>>>>>> 2) Count if bfqg or it's child bfqgs have pending requests.
>>>>>>>> 3) Don't count if bfqg and it's child bfqgs complete all the
>>>>>>>> requests.
>>>>>>>>
>>>>>>>> After this patch:
>>>>>>>> 1) root group is counted.
>>>>>>>> 2) Count if bfqg have pending requests.
>>>>>>>> 3) Don't count if bfqg complete all the requests.
>>>>>>>>
>>>>>>>> With this change, the occasion that only one group is activated
>>>>>>>> can be
>>>>>>>> detected, and next patch will support concurrent sync io in the
>>>>>>>> occasion.
>>>>>>>>
>>>>>>>> Signed-off-by: Yu Kuai <[email protected]>
>>>>>>>> Reviewed-by: Jan Kara <[email protected]>
>>>>>>>> ---
>>>>>>>> block/bfq-iosched.c | 42 ------------------------------------------
>>>>>>>> block/bfq-iosched.h | 18 +++++++++---------
>>>>>>>> block/bfq-wf2q.c    | 19 ++++---------------
>>>>>>>> 3 files changed, 13 insertions(+), 66 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>>>>>>>> index 0ec21018daba..03b04892440c 100644
>>>>>>>> --- a/block/bfq-iosched.c
>>>>>>>> +++ b/block/bfq-iosched.c
>>>>>>>> @@ -970,48 +970,6 @@ void __bfq_weights_tree_remove(struct
>>>>>>>> bfq_data *bfqd,
>>>>>>>> void bfq_weights_tree_remove(struct bfq_data *bfqd,
>>>>>>>>      struct bfq_queue *bfqq)
>>>>>>>> {
>>>>>>>> -struct bfq_entity *entity = bfqq->entity.parent;
>>>>>>>> -
>>>>>>>> -for_each_entity(entity) {
>>>>>>>> -struct bfq_sched_data *sd = entity->my_sched_data;
>>>>>>>> -
>>>>>>>> -if (sd->next_in_service || sd->in_service_entity) {
>>>>>>>> -/*
>>>>>>>> -* entity is still active, because either
>>>>>>>> -* next_in_service or in_service_entity is not
>>>>>>>> -* NULL (see the comments on the definition of
>>>>>>>> -* next_in_service for details on why
>>>>>>>> -* in_service_entity must be checked too).
>>>>>>>> -*
>>>>>>>> -* As a consequence, its parent entities are
>>>>>>>> -* active as well, and thus this loop must
>>>>>>>> -* stop here.
>>>>>>>> -*/
>>>>>>>> -break;
>>>>>>>> -}
>>>>>>>> -
>>>>>>>> -/*
>>>>>>>> -* The decrement of num_groups_with_pending_reqs is
>>>>>>>> -* not performed immediately upon the deactivation of
>>>>>>>> -* entity, but it is delayed to when it also happens
>>>>>>>> -* that the first leaf descendant bfqq of entity gets
>>>>>>>> -* all its pending requests completed. The following
>>>>>>>> -* instructions perform this delayed decrement, if
>>>>>>>> -* needed. See the comments on
>>>>>>>> -* num_groups_with_pending_reqs for details.
>>>>>>>> -*/
>>>>>>>> -if (entity->in_groups_with_pending_reqs) {
>>>>>>>> -entity->in_groups_with_pending_reqs = false;
>>>>>>>> -bfqd->num_groups_with_pending_reqs--;
>>>>>>>> -}
>>>>>>>> -}
>>>>>>> With this part removed, I'm missing how you handle the following
>>>>>>> sequence of events:
>>>>>>> 1.  a queue Q becomes non busy but still has dispatched requests, so
>>>>>>> it must not be removed from the counter of queues with pending reqs
>>>>>>> yet
>>>>>>> 2.  the last request of Q is completed with Q being still idle (non
>>>>>>> busy).  At this point Q must be removed from the counter.  It
>>>>>>> seems to
>>>>>>> me that this case is not handled any longer
>>>>>> Hi, Paolo
>>>>>>
>>>>>> 1) At first, patch 1 support to track if bfqq has pending
>>>>>> requests, it's
>>>>>> done by setting the flag 'entity->in_groups_with_pending_reqs'
>>>>>> when the
>>>>>> first request is inserted to bfqq, and it's cleared when the last
>>>>>> request is completed(based on weights_tree insertion and removal).
>>>>>>
>>>>>
>>>>> In patch 1 I don't see the flag cleared for the request-completion
>>>>> event :(
>>>>>
>>>>> The piece of code involved is this:
>>>>>
>>>>> static void bfq_completed_request(struct bfq_queue *bfqq, struct
>>>>> bfq_data *bfqd)
>>>>> {
>>>>> u64 now_ns;
>>>>> u32 delta_us;
>>>>>
>>>>> bfq_update_hw_tag(bfqd);
>>>>>
>>>>> bfqd->rq_in_driver[bfqq->actuator_idx]--;
>>>>> bfqd->tot_rq_in_driver--;
>>>>> bfqq->dispatched--;
>>>>>
>>>>> if (!bfqq->dispatched && !bfq_bfqq_busy(bfqq)) {
>>>>> /*
>>>>> * Set budget_timeout (which we overload to store the
>>>>> * time at which the queue remains with no backlog and
>>>>> * no outstanding request; used by the weight-raising
>>>>> * mechanism).
>>>>> */
>>>>> bfqq->budget_timeout = jiffies;
>>>>>
>>>>> bfq_weights_tree_remove(bfqd, bfqq);
>>>>> }
>>>>> ...
>>>>>
>>>>> Am I missing something?
>>>>
>>>> I add a new api bfq_del_bfqq_in_groups_with_pending_reqs() in patch 1
>>>> to clear the flag, and it's called both from bfq_del_bfqq_busy() and
>>>> bfq_completed_request(). I think you may miss the later:
>>>>
>>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>>>> index 0d46cb728bbf..0ec21018daba 100644
>>>> --- a/block/bfq-iosched.c
>>>> +++ b/block/bfq-iosched.c
>>>> @@ -6263,6 +6263,7 @@ static void bfq_completed_request(struct
>>>> bfq_queue *bfqq, struct bfq_data *bfqd)
>>>>            */
>>>>           bfqq->budget_timeout = jiffies;
>>>>
>>>> +        bfq_del_bfqq_in_groups_with_pending_reqs(bfqq);
>>>>           bfq_weights_tree_remove(bfqd, bfqq);
>>>>       }
>>>>
>>>> Thanks,
>>>> Kuai
>>>>>
>>>>> Thanks,
>>>>> Paolo
>>>
>>
>> .
>>
>
> .
>

2022-08-26 02:44:00

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next v10 3/4] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'

Hi, Paolo!

On 2022/08/25 22:59, Paolo Valente wrote:
>
>
>> On 11 Aug 2022, at 03:19, Yu Kuai <[email protected]> wrote:
>>
>> Hi, Paolo
>>
>> On 2022/08/10 18:49, Paolo Valente wrote:
>>>> On 27 Jul 2022, at 14:11, Yu Kuai <[email protected]> wrote:
>>>>
>>>> Hi, Paolo
>>>>
>>> hi
>>>> Are you still interested in this patchset?
>>>>
>>> Yes. Sorry for replying very late again.
>>> Probably the last fix that you suggest is enough, but I'm a little bit
>>> concerned that it may be a little hasty.  In fact, before this fix, we
>>> exchanged several messages, and I didn't seem to be very good at
>>> convincing you about the need to keep into account also in-service
>>> I/O.  So, my question is: are you sure that now you have a
>>
>> I'm confused here. I'm well aware that in-service I/O (referred to as
>> pending requests in the patchset) should be counted, as you suggested
>> in v7. Do you still think that the approach in this patchset is
>> problematic?
>>
>> I'll try to explain again how to track whether a bfqq has pending
>> requests; please let me know if you still think there are problems:
>>
>> Patch 1 adds support for tracking whether a bfqq has pending requests.
>> This is done by setting the flag 'entity->in_groups_with_pending_reqs'
>> when the first request is inserted into the bfqq, and clearing it when
>> the last request is completed. Specifically, the flag is set in
>> bfq_add_bfqq_busy() when 'bfqq->dispatched' is zero, and it is cleared
>> in both bfq_completed_request() and bfq_del_bfqq_busy() when
>> 'bfqq->dispatched' is zero.
>>
>
> This general description seems correct to me. Have you already sent a
> new version of your patchset?

I'm glad that we are finally on the same page here.

Please take a look at patch 1, which already implements the above
description. It seems to me that there is no need to send a new version
for now. If you think there are still other problems, please let
me know.
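
For reference, the group-level counting that the commit message of this
patch (3/4) describes ("count if bfqg has pending requests", root group
included) could be modelled roughly as follows. This is a hedged
user-space sketch, not the patch itself: the grp_* types and the two
helper functions are hypothetical names, and the per-group queue counter
is an assumption about how such counting can be arranged.

```c
#include <stdbool.h>

/* Hypothetical model types; none of these names are kernel identifiers. */
struct grp_sched {
	int num_groups_with_pending_reqs; /* models the bfqd counter */
};

struct grp_group {
	int num_queues_with_pending_reqs; /* assumed per-group counter */
};

struct grp_queue {
	struct grp_group *group; /* group the queue belongs to */
	bool counted;            /* models the per-queue pending flag */
};

/* Queue starts having pending requests: count its group on 0 -> 1. */
static void queue_gets_pending(struct grp_sched *s, struct grp_queue *q)
{
	if (q->counted)
		return;
	q->counted = true;
	if (q->group->num_queues_with_pending_reqs++ == 0)
		s->num_groups_with_pending_reqs++;
}

/* Queue completed all requests: uncount its group on 1 -> 0. */
static void queue_drained(struct grp_sched *s, struct grp_queue *q)
{
	if (!q->counted)
		return;
	q->counted = false;
	if (--q->group->num_queues_with_pending_reqs == 0)
		s->num_groups_with_pending_reqs--;
}
```

In this model a group is counted only for its own queues, never for
those of its child groups, and the root group is treated like any other
group, so the case where exactly one group has pending requests shows up
as a counter value of 1.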

Thanks,
Kuai
>
> Thanks,
> Paolo
>
>> Thanks,
>> Kuai
>>> clear/complete understanding of this non-trivial matter?
>>> Consequently, are we sure that this last fix is most certainly all we
>>> need?  Of course, I will check on my own, but if you reassure me on
>>> this point, I will feel more confident.
>>> Thanks,
>>> Paolo
>>>> On 2022/07/20 19:38, Yu Kuai wrote:
>>>>> Hi
>>>>>
>>>>> On 2022/07/20 19:24, Paolo VALENTE wrote:
>>>>>>
>>>>>>
>>>>>>> On 12 Jul 2022, at 15:30, Yu Kuai <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi!
>>>>>>>
>>>>>>> I'm copying my reply with new mail address, because Paolo seems
>>>>>>> didn't receive my reply.
>>>>>>>
>>>>>>> On 2022/06/23 23:32, Paolo Valente wrote:
>>>>>>>> Sorry for the delay.
>>>>>>>>> On 10 Jun 2022, at 04:17, Yu Kuai <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Currently, bfq can't handle sync io concurrently as long as they
>>>>>>>>> are not issued from root group. This is because
>>>>>>>>> 'bfqd->num_groups_with_pending_reqs > 0' is always true in
>>>>>>>>> bfq_asymmetric_scenario().
>>>>>>>>>
>>>>>>>>> The way that bfqg is counted into 'num_groups_with_pending_reqs':
>>>>>>>>>
>>>>>>>>> Before this patch:
>>>>>>>>> 1) root group will never be counted.
>>>>>>>>> 2) Count if bfqg or it's child bfqgs have pending requests.
>>>>>>>>> 3) Don't count if bfqg and it's child bfqgs complete all the
>>>>>>>>> requests.
>>>>>>>>>
>>>>>>>>> After this patch:
>>>>>>>>> 1) root group is counted.
>>>>>>>>> 2) Count if bfqg have pending requests.
>>>>>>>>> 3) Don't count if bfqg complete all the requests.
>>>>>>>>>
>>>>>>>>> With this change, the occasion that only one group is activated
>>>>>>>>> can be
>>>>>>>>> detected, and next patch will support concurrent sync io in the
>>>>>>>>> occasion.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Yu Kuai <[email protected]>
>>>>>>>>> Reviewed-by: Jan Kara <[email protected]>
>>>>>>>>> ---
>>>>>>>>> block/bfq-iosched.c | 42 ------------------------------------------
>>>>>>>>> block/bfq-iosched.h | 18 +++++++++---------
>>>>>>>>> block/bfq-wf2q.c    | 19 ++++---------------
>>>>>>>>> 3 files changed, 13 insertions(+), 66 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>>>>>>>>> index 0ec21018daba..03b04892440c 100644
>>>>>>>>> --- a/block/bfq-iosched.c
>>>>>>>>> +++ b/block/bfq-iosched.c
>>>>>>>>> @@ -970,48 +970,6 @@ void __bfq_weights_tree_remove(struct
>>>>>>>>> bfq_data *bfqd,
>>>>>>>>> void bfq_weights_tree_remove(struct bfq_data *bfqd,
>>>>>>>>>     struct bfq_queue *bfqq)
>>>>>>>>> {
>>>>>>>>> -struct bfq_entity *entity = bfqq->entity.parent;
>>>>>>>>> -
>>>>>>>>> -for_each_entity(entity) {
>>>>>>>>> -struct bfq_sched_data *sd = entity->my_sched_data;
>>>>>>>>> -
>>>>>>>>> -if (sd->next_in_service || sd->in_service_entity) {
>>>>>>>>> -/*
>>>>>>>>> -* entity is still active, because either
>>>>>>>>> -* next_in_service or in_service_entity is not
>>>>>>>>> -* NULL (see the comments on the definition of
>>>>>>>>> -* next_in_service for details on why
>>>>>>>>> -* in_service_entity must be checked too).
>>>>>>>>> -*
>>>>>>>>> -* As a consequence, its parent entities are
>>>>>>>>> -* active as well, and thus this loop must
>>>>>>>>> -* stop here.
>>>>>>>>> -*/
>>>>>>>>> -break;
>>>>>>>>> -}
>>>>>>>>> -
>>>>>>>>> -/*
>>>>>>>>> -* The decrement of num_groups_with_pending_reqs is
>>>>>>>>> -* not performed immediately upon the deactivation of
>>>>>>>>> -* entity, but it is delayed to when it also happens
>>>>>>>>> -* that the first leaf descendant bfqq of entity gets
>>>>>>>>> -* all its pending requests completed. The following
>>>>>>>>> -* instructions perform this delayed decrement, if
>>>>>>>>> -* needed. See the comments on
>>>>>>>>> -* num_groups_with_pending_reqs for details.
>>>>>>>>> -*/
>>>>>>>>> -if (entity->in_groups_with_pending_reqs) {
>>>>>>>>> -entity->in_groups_with_pending_reqs = false;
>>>>>>>>> -bfqd->num_groups_with_pending_reqs--;
>>>>>>>>> -}
>>>>>>>>> -}
>>>>>>>> With this part removed, I'm missing how you handle the following
>>>>>>>> sequence of events:
>>>>>>>> 1.  a queue Q becomes non busy but still has dispatched requests, so
>>>>>>>> it must not be removed from the counter of queues with pending reqs
>>>>>>>> yet
>>>>>>>> 2.  the last request of Q is completed with Q being still idle (non
>>>>>>>> busy).  At this point Q must be removed from the counter.  It
>>>>>>>> seems to
>>>>>>>> me that this case is not handled any longer
>>>>>>> Hi, Paolo
>>>>>>>
>>>>>>> 1) At first, patch 1 support to track if bfqq has pending
>>>>>>> requests, it's
>>>>>>> done by setting the flag 'entity->in_groups_with_pending_reqs'
>>>>>>> when the
>>>>>>> first request is inserted to bfqq, and it's cleared when the last
>>>>>>> request is completed(based on weights_tree insertion and removal).
>>>>>>>
>>>>>>
>>>>>> In patch 1 I don't see the flag cleared for the request-completion
>>>>>> event :(
>>>>>>
>>>>>> The piece of code involved is this:
>>>>>>
>>>>>> static void bfq_completed_request(struct bfq_queue *bfqq, struct
>>>>>> bfq_data *bfqd)
>>>>>> {
>>>>>> u64 now_ns;
>>>>>> u32 delta_us;
>>>>>>
>>>>>> bfq_update_hw_tag(bfqd);
>>>>>>
>>>>>> bfqd->rq_in_driver[bfqq->actuator_idx]--;
>>>>>> bfqd->tot_rq_in_driver--;
>>>>>> bfqq->dispatched--;
>>>>>>
>>>>>> if (!bfqq->dispatched && !bfq_bfqq_busy(bfqq)) {
>>>>>> /*
>>>>>> * Set budget_timeout (which we overload to store the
>>>>>> * time at which the queue remains with no backlog and
>>>>>> * no outstanding request; used by the weight-raising
>>>>>> * mechanism).
>>>>>> */
>>>>>> bfqq->budget_timeout = jiffies;
>>>>>>
>>>>>> bfq_weights_tree_remove(bfqd, bfqq);
>>>>>> }
>>>>>> ...
>>>>>>
>>>>>> Am I missing something?
>>>>>
>>>>> I add a new api bfq_del_bfqq_in_groups_with_pending_reqs() in patch 1
>>>>> to clear the flag, and it's called both from bfq_del_bfqq_busy() and
>>>>> bfq_completed_request(). I think you may miss the later:
>>>>>
>>>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>>>>> index 0d46cb728bbf..0ec21018daba 100644
>>>>> --- a/block/bfq-iosched.c
>>>>> +++ b/block/bfq-iosched.c
>>>>> @@ -6263,6 +6263,7 @@ static void bfq_completed_request(struct
>>>>> bfq_queue *bfqq, struct bfq_data *bfqd)
>>>>>           */
>>>>>          bfqq->budget_timeout = jiffies;
>>>>>
>>>>> +        bfq_del_bfqq_in_groups_with_pending_reqs(bfqq);
>>>>>          bfq_weights_tree_remove(bfqd, bfqq);
>>>>>      }
>>>>>
>>>>> Thanks,
>>>>> Kuai
>>>>>>
>>>>>> Thanks,
>>>>>> Paolo
>>>>
>>> .
>

2022-09-06 09:45:31

by Paolo Valente

[permalink] [raw]
Subject: Re: [PATCH -next v10 3/4] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'



> On 26 Aug 2022, at 04:34, Yu Kuai <[email protected]> wrote:
>
> Hi, Paolo!
>
> On 2022/08/25 22:59, Paolo Valente wrote:
>>> On 11 Aug 2022, at 03:19, Yu Kuai <[email protected]> wrote:
>>>
>>> Hi, Paolo
>>>
>>> 在 2022/08/10 18:49, Paolo Valente 写道:
>>>>> Il giorno 27 lug 2022, alle ore 14:11, Yu Kuai <[email protected] <mailto:[email protected]>> ha scritto:
>>>>>
>>>>> Hi, Paolo
>>>>>
>>>> hi
>>>>> Are you still interested in this patchset?
>>>>>
>>>> Yes. Sorry for replying very late again.
>>>> Probably the last fix that you suggest is enough, but I'm a little bit
>>>> concerned that it may be a little hasty. In fact, before this fix, we
>>>> exchanged several messages, and I didn't seem to be very good at
>>>> convincing you about the need to keep into account also in-service
>>>> I/O. So, my question is: are you sure that now you have a
>>>
>>> I'm confused here. I'm well aware that in-service I/O (called pending
>>> requests in the patchset) should be counted, as you suggested in v7. Do
>>> you still think that the approach in this patchset is problematic?
>>>
>>> I'll try to explain again how to track whether bfqq has pending
>>> requests; please let me know if you still see any problems:
>>>
>>> Patch 1 adds support for tracking whether bfqq has pending requests. It
>>> does so by setting the flag 'entity->in_groups_with_pending_reqs' when the
>>> first request is inserted into bfqq and clearing it when the last
>>> request is completed. Specifically, the flag is set in
>>> bfq_add_bfqq_busy() when 'bfqq->dispatched' is zero, and it is cleared
>>> in both bfq_completed_request() and bfq_del_bfqq_busy() when
>>> 'bfqq->dispatched' is zero.
>>>
>> This general description seems correct to me. Have you already sent a new version of your patchset?
>
> I'm glad that we are finally on the same page here.
>

Yep. Sorry for my chronic delay.

> Please take a look at patch 1, which already implements the above
> description. It seems to me there is no need to send a new version
> for now. If you think there are still other problems, please let
> me know.
>

Patch 1 seems OK to me. I have only one pending comment, on this patch (3/4). Let me paste the earlier exchange here for your convenience:

>>
>> - /*
>> - * Next function is invoked last, because it causes bfqq to be
>> - * freed if the following holds: bfqq is not in service and
>> - * has no dispatched request. DO NOT use bfqq after the next
>> - * function invocation.
>> - */
> I would really love it if you leave this comment. I added it after
> suffering a lot for a nasty UAF. Of course the first sentence may
> need to be adjusted if the code that precedes it is to be removed.
> Same as above, if this patch is applied, this function will be gone.

Yes, but this comment must now be moved forward.

Looking forward to a new complete version for a fresh review. I'll do
my best to reply more quickly.

Thanks,
Paolo





> Thanks,
> Kuai
>> Thanks,
>> Paolo
>>> Thanks,
>>> Kuai
>>>> clear/complete understanding of this non-trivial matter?
>>>> Consequently, are we sure that this last fix is most certainly all we
>>>> need? Of course, I will check on my own, but if you reassure me on
>>>> this point, I will feel more confident.
>>>> Thanks,
>>>> Paolo
>>>>> On 2022/07/20 19:38, Yu Kuai wrote:
>>>>>> Hi
>>>>>>
>>>>>> On 2022/07/20 19:24, Paolo VALENTE wrote:
>>>>>>>
>>>>>>>
>>>>>>>> On 12 Jul 2022, at 15:30, Yu Kuai <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hi!
>>>>>>>>
>>>>>>>> I'm resending my reply from a new mail address, because Paolo
>>>>>>>> doesn't seem to have received it.
>>>>>>>>
>>>>>>>> On 2022/06/23 23:32, Paolo Valente wrote:
>>>>>>>>> Sorry for the delay.
>>>>>>>>>> On 10 Jun 2022, at 04:17, Yu Kuai <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Currently, bfq can't handle sync io concurrently as long as they
>>>>>>>>>> are not issued from root group. This is because
>>>>>>>>>> 'bfqd->num_groups_with_pending_reqs > 0' is always true in
>>>>>>>>>> bfq_asymmetric_scenario().
>>>>>>>>>>
>>>>>>>>>> The way that bfqg is counted into 'num_groups_with_pending_reqs':
>>>>>>>>>>
>>>>>>>>>> Before this patch:
>>>>>>>>>> 1) root group will never be counted.
>>>>>>>>>> 2) Count if bfqg or it's child bfqgs have pending requests.
>>>>>>>>>> 3) Don't count if bfqg and it's child bfqgs complete all the requests.
>>>>>>>>>>
>>>>>>>>>> After this patch:
>>>>>>>>>> 1) root group is counted.
>>>>>>>>>> 2) Count if bfqg have pending requests.
>>>>>>>>>> 3) Don't count if bfqg complete all the requests.
>>>>>>>>>>
>>>>>>>>>> With this change, the occasion that only one group is activated can be
>>>>>>>>>> detected, and next patch will support concurrent sync io in the
>>>>>>>>>> occasion.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Yu Kuai <[email protected]>
>>>>>>>>>> Reviewed-by: Jan Kara <[email protected]>
>>>>>>>>>> ---
>>>>>>>>>> block/bfq-iosched.c | 42 ------------------------------------------
>>>>>>>>>> block/bfq-iosched.h | 18 +++++++++---------
>>>>>>>>>> block/bfq-wf2q.c | 19 ++++---------------
>>>>>>>>>> 3 files changed, 13 insertions(+), 66 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>>>>>>>>>> index 0ec21018daba..03b04892440c 100644
>>>>>>>>>> --- a/block/bfq-iosched.c
>>>>>>>>>> +++ b/block/bfq-iosched.c
>>>>>>>>>> @@ -970,48 +970,6 @@ void __bfq_weights_tree_remove(struct bfq_data *bfqd,
>>>>>>>>>> void bfq_weights_tree_remove(struct bfq_data *bfqd,
>>>>>>>>>> struct bfq_queue *bfqq)
>>>>>>>>>> {
>>>>>>>>>> -struct bfq_entity *entity = bfqq->entity.parent;
>>>>>>>>>> -
>>>>>>>>>> -for_each_entity(entity) {
>>>>>>>>>> -struct bfq_sched_data *sd = entity->my_sched_data;
>>>>>>>>>> -
>>>>>>>>>> -if (sd->next_in_service || sd->in_service_entity) {
>>>>>>>>>> -/*
>>>>>>>>>> -* entity is still active, because either
>>>>>>>>>> -* next_in_service or in_service_entity is not
>>>>>>>>>> -* NULL (see the comments on the definition of
>>>>>>>>>> -* next_in_service for details on why
>>>>>>>>>> -* in_service_entity must be checked too).
>>>>>>>>>> -*
>>>>>>>>>> -* As a consequence, its parent entities are
>>>>>>>>>> -* active as well, and thus this loop must
>>>>>>>>>> -* stop here.
>>>>>>>>>> -*/
>>>>>>>>>> -break;
>>>>>>>>>> -}
>>>>>>>>>> -
>>>>>>>>>> -/*
>>>>>>>>>> -* The decrement of num_groups_with_pending_reqs is
>>>>>>>>>> -* not performed immediately upon the deactivation of
>>>>>>>>>> -* entity, but it is delayed to when it also happens
>>>>>>>>>> -* that the first leaf descendant bfqq of entity gets
>>>>>>>>>> -* all its pending requests completed. The following
>>>>>>>>>> -* instructions perform this delayed decrement, if
>>>>>>>>>> -* needed. See the comments on
>>>>>>>>>> -* num_groups_with_pending_reqs for details.
>>>>>>>>>> -*/
>>>>>>>>>> -if (entity->in_groups_with_pending_reqs) {
>>>>>>>>>> -entity->in_groups_with_pending_reqs = false;
>>>>>>>>>> -bfqd->num_groups_with_pending_reqs--;
>>>>>>>>>> -}
>>>>>>>>>> -}
>>>>>>>>> With this part removed, I'm missing how you handle the following
>>>>>>>>> sequence of events:
>>>>>>>>> 1. a queue Q becomes non busy but still has dispatched requests, so
>>>>>>>>> it must not be removed from the counter of queues with pending reqs
>>>>>>>>> yet
>>>>>>>>> 2. the last request of Q is completed with Q being still idle (non
>>>>>>>>> busy). At this point Q must be removed from the counter. It seems to
>>>>>>>>> me that this case is not handled any longer
>>>>>>>> Hi, Paolo
>>>>>>>>
>>>>>>>> 1) At first, patch 1 support to track if bfqq has pending requests, it's
>>>>>>>> done by setting the flag 'entity->in_groups_with_pending_reqs' when the
>>>>>>>> first request is inserted to bfqq, and it's cleared when the last
>>>>>>>> request is completed(based on weights_tree insertion and removal).
>>>>>>>>
>>>>>>>
>>>>>>> In patch 1 I don't see the flag cleared for the request-completion event :(
>>>>>>>
>>>>>>> The piece of code involved is this:
>>>>>>>
>>>>>>> static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)
>>>>>>> {
>>>>>>> u64 now_ns;
>>>>>>> u32 delta_us;
>>>>>>>
>>>>>>> bfq_update_hw_tag(bfqd);
>>>>>>>
>>>>>>> bfqd->rq_in_driver[bfqq->actuator_idx]--;
>>>>>>> bfqd->tot_rq_in_driver--;
>>>>>>> bfqq->dispatched--;
>>>>>>>
>>>>>>> if (!bfqq->dispatched && !bfq_bfqq_busy(bfqq)) {
>>>>>>> /*
>>>>>>> * Set budget_timeout (which we overload to store the
>>>>>>> * time at which the queue remains with no backlog and
>>>>>>> * no outstanding request; used by the weight-raising
>>>>>>> * mechanism).
>>>>>>> */
>>>>>>> bfqq->budget_timeout = jiffies;
>>>>>>>
>>>>>>> bfq_weights_tree_remove(bfqd, bfqq);
>>>>>>> }
>>>>>>> ...
>>>>>>>
>>>>>>> Am I missing something?
>>>>>>
>>>>>> I add a new api bfq_del_bfqq_in_groups_with_pending_reqs() in patch 1
>>>>>> to clear the flag, and it's called both from bfq_del_bfqq_busy() and
>>>>>> bfq_completed_request(). I think you may miss the later:
>>>>>>
>>>>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>>>>>> index 0d46cb728bbf..0ec21018daba 100644
>>>>>> --- a/block/bfq-iosched.c
>>>>>> +++ b/block/bfq-iosched.c
>>>>>> @@ -6263,6 +6263,7 @@ static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)
>>>>>> */
>>>>>> bfqq->budget_timeout = jiffies;
>>>>>>
>>>>>> + bfq_del_bfqq_in_groups_with_pending_reqs(bfqq);
>>>>>> bfq_weights_tree_remove(bfqd, bfqq);
>>>>>> }
>>>>>>
>>>>>> Thanks,
>>>>>> Kuai
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Paolo
>>>>>
>>>> .
>
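
The tracking rule that both sides agree on in this message can be sketched as a small userspace model. This is only an illustration under stated assumptions: the toy_ names and the struct are hypothetical stand-ins, not kernel code, and only the set and clear conditions are taken from the description quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one queue; only the state discussed in the thread. */
struct toy_bfqq {
	int dispatched;                    /* requests in flight at the driver */
	bool busy;                         /* queue currently has backlog */
	bool in_groups_with_pending_reqs;  /* the flag being tracked */
};

/*
 * Queue gets backlog: set the flag unless requests are still in flight
 * (in that case the flag is expected to be set already).
 */
static void toy_add_bfqq_busy(struct toy_bfqq *q)
{
	q->busy = true;
	if (!q->dispatched)
		q->in_groups_with_pending_reqs = true;
}

/* A request leaves the queue for the driver. */
static void toy_dispatch(struct toy_bfqq *q)
{
	q->dispatched++;
}

/* Queue runs out of backlog: clear the flag only if nothing is in flight. */
static void toy_del_bfqq_busy(struct toy_bfqq *q)
{
	q->busy = false;
	if (!q->dispatched)
		q->in_groups_with_pending_reqs = false;
}

/* A request completes: clear the flag once the queue is idle and drained. */
static void toy_completed_request(struct toy_bfqq *q)
{
	assert(q->dispatched > 0);
	q->dispatched--;
	if (!q->dispatched && !q->busy)
		q->in_groups_with_pending_reqs = false;
}
```

Walking the model through the sequence raised earlier in the review (queue becomes non-busy while one request is still dispatched, then that request completes) keeps the flag set after toy_del_bfqq_busy() and clears it only in toy_completed_request().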

2022-09-07 01:46:46

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next v10 3/4] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'

Hi, Paolo!

On 2022/09/06 17:37, Paolo Valente wrote:
>
>
>> On 26 Aug 2022, at 04:34, Yu Kuai <[email protected]> wrote:
>>
>> Hi, Paolo!
>>
>> On 2022/08/25 22:59, Paolo Valente wrote:
>>>> On 11 Aug 2022, at 03:19, Yu Kuai <[email protected]> wrote:
>>>>
>>>> Hi, Paolo
>>>>
>>>> On 2022/08/10 18:49, Paolo Valente wrote:
>>>>>> On 27 Jul 2022, at 14:11, Yu Kuai <[email protected]> wrote:
>>>>>>
>>>>>> Hi, Paolo
>>>>>>
>>>>> hi
>>>>>> Are you still interested in this patchset?
>>>>>>
>>>>> Yes. Sorry for replying very late again.
>>>>> Probably the last fix that you suggest is enough, but I'm a little bit
>>>>> concerned that it may be a little hasty. In fact, before this fix, we
>>>>> exchanged several messages, and I didn't seem to be very good at
>>>>> convincing you about the need to keep into account also in-service
>>>>> I/O. So, my question is: are you sure that now you have a
>>>>
>>>> I'm confused here. I'm well aware that in-service I/O (called pending
>>>> requests in the patchset) should be counted, as you suggested in v7. Do
>>>> you still think that the approach in this patchset is problematic?
>>>>
>>>> I'll try to explain again how to track whether bfqq has pending
>>>> requests; please let me know if you still see any problems:
>>>>
>>>> Patch 1 adds support for tracking whether bfqq has pending requests. It
>>>> does so by setting the flag 'entity->in_groups_with_pending_reqs' when the
>>>> first request is inserted into bfqq and clearing it when the last
>>>> request is completed. Specifically, the flag is set in
>>>> bfq_add_bfqq_busy() when 'bfqq->dispatched' is zero, and it is cleared
>>>> in both bfq_completed_request() and bfq_del_bfqq_busy() when
>>>> 'bfqq->dispatched' is zero.
>>>>
>>> This general description seems correct to me. Have you already sent a new version of your patchset?
>>
>> I'm glad that we are finally on the same page here.
>>
>
> Yep. Sorry for my chronic delay.

Better late than never.
>
>> Please take a look at patch 1, which already implements the above
>> description. It seems to me there is no need to send a new version
>> for now. If you think there are still other problems, please let
>> me know.
>>
>
> Patch 1 seems OK to me. I have only one pending comment, on this patch (3/4). Let me paste the earlier exchange here for your convenience:

That sounds good.

>
>>>
>>> - /*
>>> - * Next function is invoked last, because it causes bfqq to be
>>> - * freed if the following holds: bfqq is not in service and
>>> - * has no dispatched request. DO NOT use bfqq after the next
>>> - * function invocation.
>>> - */
>> I would really love it if you leave this comment. I added it after
>> suffering a lot for a nasty UAF. Of course the first sentence may
>> need to be adjusted if the code that precedes it is to be removed.
>> Same as above, if this patch is applied, this function will be gone.
>
> Yes, but this comment must now be moved forward.
>
> Looking forward to a new complete version for a fresh review. I'll do
> my best to reply more quickly.

I'll send a new version soon; I may also add the follow-up
cleanups to this patchset.

Thanks,
Kuai
>
> Thanks,
> Paolo
>
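
The "after this patch" counting rules from the commit message of this patch can be modelled roughly as follows. This is a hedged sketch, not the patch itself: the toy_ structures and the per-group counter are assumptions made for illustration, and the "exactly one group" check is only a guess at what the next patch in the series might test.

```c
#include <assert.h>
#include <stdbool.h>

/* A group is counted only for queues that sit directly in it. */
struct toy_group {
	int own_pending_queues;  /* queues in this group with pending requests */
};

struct toy_data {
	int num_groups_with_pending_reqs;  /* the root group is counted as well */
};

/* A queue in group g gets its first pending request. */
static void toy_queue_gains_pending(struct toy_data *d, struct toy_group *g)
{
	if (g->own_pending_queues++ == 0)
		d->num_groups_with_pending_reqs++;
}

/* A queue in group g has completed all of its requests. */
static void toy_queue_loses_pending(struct toy_data *d, struct toy_group *g)
{
	assert(g->own_pending_queues > 0);
	if (--g->own_pending_queues == 0)
		d->num_groups_with_pending_reqs--;
}

/* Hypothetical check: is exactly one group active right now? */
static bool toy_only_one_group_active(const struct toy_data *d)
{
	return d->num_groups_with_pending_reqs == 1;
}
```

Because a parent group is no longer counted on behalf of its children, a workload confined to a single child group leaves the counter at 1, which is the case the commit message says can now be detected.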

2022-09-14 03:00:35

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next v10 3/4] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'



On 2022/09/07 9:16, Yu Kuai wrote:
> Hi, Paolo!
>
> On 2022/09/06 17:37, Paolo Valente wrote:
>>
>>
>>> On 26 Aug 2022, at 04:34, Yu Kuai <[email protected]> wrote:
>>>
>>> Hi, Paolo!
>>>
>>> On 2022/08/25 22:59, Paolo Valente wrote:
>>>>> On 11 Aug 2022, at 03:19, Yu Kuai <[email protected]> wrote:
>>>>>
>>>>> Hi, Paolo
>>>>>
>>>>> On 2022/08/10 18:49, Paolo Valente wrote:
>>>>>>> On 27 Jul 2022, at 14:11, Yu Kuai <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi, Paolo
>>>>>>>
>>>>>> hi
>>>>>>> Are you still interested in this patchset?
>>>>>>>
>>>>>> Yes. Sorry for replying very late again.
>>>>>> Probably the last fix that you suggest is enough, but I'm a little bit
>>>>>> concerned that it may be a little hasty. In fact, before this fix, we
>>>>>> exchanged several messages, and I didn't seem to be very good at
>>>>>> convincing you about the need to keep into account also in-service
>>>>>> I/O. So, my question is: are you sure that now you have a
>>>>>
>>>>> I'm confused here. I'm well aware that in-service I/O (called pending
>>>>> requests in the patchset) should be counted, as you suggested in v7. Do
>>>>> you still think that the approach in this patchset is problematic?
>>>>>
>>>>> I'll try to explain again how to track whether bfqq has pending
>>>>> requests; please let me know if you still see any problems:
>>>>>
>>>>> Patch 1 adds support for tracking whether bfqq has pending requests. It
>>>>> does so by setting the flag 'entity->in_groups_with_pending_reqs' when the
>>>>> first request is inserted into bfqq and clearing it when the last
>>>>> request is completed. Specifically, the flag is set in
>>>>> bfq_add_bfqq_busy() when 'bfqq->dispatched' is zero, and it is cleared
>>>>> in both bfq_completed_request() and bfq_del_bfqq_busy() when
>>>>> 'bfqq->dispatched' is zero.
>>>>>
>>>> This general description seems correct to me. Have you already sent
>>>> a new version of your patchset?
>>>
>>> I'm glad that we are finally on the same page here.
>>>
>>
>> Yep. Sorry for my chronic delay.
>
> Better late than never.
>>
>>> Please take a look at patch 1, which already implements the above
>>> description. It seems to me there is no need to send a new version
>>> for now. If you think there are still other problems, please let
>>> me know.
>>>
>>
>> Patch 1 seems OK to me. I have only one pending comment, on this
>> patch (3/4). Let me paste the earlier exchange here for your
>> convenience:
> That sounds good.
>
>>
>>>>
>>>> -    /*
>>>> -     * Next function is invoked last, because it causes bfqq to be
>>>> -     * freed if the following holds: bfqq is not in service and
>>>> -     * has no dispatched request. DO NOT use bfqq after the next
>>>> -     * function invocation.
>>>> -     */
>>> I would really love it if you leave this comment.  I added it after
>>> suffering a lot for a nasty UAF.  Of course the first sentence may
>>> need to be adjusted if the code that precedes it is to be removed.
>>> Same as above, if this patch is applied, this function will be gone.

Hi, while trying to add the comment I got curious: before this
patchset, can bfqq be freed when bfq_weights_tree_remove() is called?

bfq_completed_request
 bfqq->dispatched--
 if (!bfqq->dispatched && !bfq_bfqq_busy(bfqq))
  bfq_weights_tree_remove(bfqd, bfqq);

 // continue to use bfqq

If so, this seems problematic to me, because bfqq is still used after
bfq_weights_tree_remove() is called.

Thanks,
Kuai

2022-09-14 08:16:44

by Paolo Valente

[permalink] [raw]
Subject: Re: [PATCH -next v10 3/4] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'



> On 14 Sep 2022, at 03:55, Yu Kuai <[email protected]> wrote:
>
>
>
> On 2022/09/07 9:16, Yu Kuai wrote:
>> Hi, Paolo!
>> On 2022/09/06 17:37, Paolo Valente wrote:
>>>
>>>
>>>> On 26 Aug 2022, at 04:34, Yu Kuai <[email protected]> wrote:
>>>>
>>>> Hi, Paolo!
>>>>
>>>> On 2022/08/25 22:59, Paolo Valente wrote:
>>>>>> On 11 Aug 2022, at 03:19, Yu Kuai <[email protected]> wrote:
>>>>>>
>>>>>> Hi, Paolo
>>>>>>
>>>>>> On 2022/08/10 18:49, Paolo Valente wrote:
>>>>>>>> On 27 Jul 2022, at 14:11, Yu Kuai <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hi, Paolo
>>>>>>>>
>>>>>>> hi
>>>>>>>> Are you still interested in this patchset?
>>>>>>>>
>>>>>>> Yes. Sorry for replying very late again.
>>>>>>> Probably the last fix that you suggest is enough, but I'm a little bit
>>>>>>> concerned that it may be a little hasty. In fact, before this fix, we
>>>>>>> exchanged several messages, and I didn't seem to be very good at
>>>>>>> convincing you about the need to keep into account also in-service
>>>>>>> I/O. So, my question is: are you sure that now you have a
>>>>>>
>>>>>> I'm confused here. I'm well aware that in-service I/O (called pending
>>>>>> requests in the patchset) should be counted, as you suggested in v7. Do
>>>>>> you still think that the approach in this patchset is problematic?
>>>>>>
>>>>>> I'll try to explain again how to track whether bfqq has pending
>>>>>> requests; please let me know if you still see any problems:
>>>>>>
>>>>>> Patch 1 adds support for tracking whether bfqq has pending requests. It
>>>>>> does so by setting the flag 'entity->in_groups_with_pending_reqs' when the
>>>>>> first request is inserted into bfqq and clearing it when the last
>>>>>> request is completed. Specifically, the flag is set in
>>>>>> bfq_add_bfqq_busy() when 'bfqq->dispatched' is zero, and it is cleared
>>>>>> in both bfq_completed_request() and bfq_del_bfqq_busy() when
>>>>>> 'bfqq->dispatched' is zero.
>>>>>>
>>>>> This general description seems correct to me. Have you already sent a new version of your patchset?
>>>>
>>>> I'm glad that we are finally on the same page here.
>>>>
>>>
>>> Yep. Sorry for my chronic delay.
>> Better late than never.
>>>
>>>> Please take a look at patch 1, which already implements the above
>>>> description. It seems to me there is no need to send a new version
>>>> for now. If you think there are still other problems, please let
>>>> me know.
>>>>
>>>
>>> Patch 1 seems OK to me. I have only one pending comment, on this patch (3/4). Let me paste the earlier exchange here for your convenience:
>> That sounds good.
>>>
>>>>>
>>>>> - /*
>>>>> - * Next function is invoked last, because it causes bfqq to be
>>>>> - * freed if the following holds: bfqq is not in service and
>>>>> - * has no dispatched request. DO NOT use bfqq after the next
>>>>> - * function invocation.
>>>>> - */
>>>> I would really love it if you leave this comment. I added it after
>>>> suffering a lot for a nasty UAF. Of course the first sentence may
>>>> need to be adjusted if the code that precedes it is to be removed.
>>>> Same as above, if this patch is applied, this function will be gone.
>
> Hi, while trying to add the comment I got curious: before this
> patchset, can bfqq be freed when bfq_weights_tree_remove() is called?
>
> bfq_completed_request
>  bfqq->dispatched--
>  if (!bfqq->dispatched && !bfq_bfqq_busy(bfqq))
>   bfq_weights_tree_remove(bfqd, bfqq);
>
>  // continue to use bfqq
>
> If so, this seems problematic to me, because bfqq is still used after
> bfq_weights_tree_remove() is called.
>

It is. Yet, IIRC, I verified that bfqq was not used after that free,
and I added that comment as a heads-up. What is a scenario (before
your pending modifications) where this use-after-free happens?

Thanks,
Paolo

> Thanks,
> Kuai

2022-09-14 08:55:04

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next v10 3/4] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'

Hi, Paolo

On 2022/09/14 15:50, Paolo VALENTE wrote:
>
>
>> On 14 Sep 2022, at 03:55, Yu Kuai <[email protected]> wrote:
>>
>>
>>
>> On 2022/09/07 9:16, Yu Kuai wrote:
>>> Hi, Paolo!
>>> On 2022/09/06 17:37, Paolo Valente wrote:
>>>>
>>>>
>>>>> On 26 Aug 2022, at 04:34, Yu Kuai <[email protected]> wrote:
>>>>>
>>>>> Hi, Paolo!
>>>>>
>>>>> On 2022/08/25 22:59, Paolo Valente wrote:
>>>>>>> On 11 Aug 2022, at 03:19, Yu Kuai <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi, Paolo
>>>>>>>
>>>>>>> On 2022/08/10 18:49, Paolo Valente wrote:
>>>>>>>>> On 27 Jul 2022, at 14:11, Yu Kuai <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Hi, Paolo
>>>>>>>>>
>>>>>>>> hi
>>>>>>>>> Are you still interested in this patchset?
>>>>>>>>>
>>>>>>>> Yes. Sorry for replying very late again.
>>>>>>>> Probably the last fix that you suggest is enough, but I'm a little bit
>>>>>>>> concerned that it may be a little hasty. In fact, before this fix, we
>>>>>>>> exchanged several messages, and I didn't seem to be very good at
>>>>>>>> convincing you about the need to keep into account also in-service
>>>>>>>> I/O. So, my question is: are you sure that now you have a
>>>>>>>
>>>>>>> I'm confused here. I'm well aware that in-service I/O (called pending
>>>>>>> requests in the patchset) should be counted, as you suggested in v7. Do
>>>>>>> you still think that the approach in this patchset is problematic?
>>>>>>>
>>>>>>> I'll try to explain again how to track whether bfqq has pending
>>>>>>> requests; please let me know if you still see any problems:
>>>>>>>
>>>>>>> Patch 1 adds support for tracking whether bfqq has pending requests. It
>>>>>>> does so by setting the flag 'entity->in_groups_with_pending_reqs' when the
>>>>>>> first request is inserted into bfqq and clearing it when the last
>>>>>>> request is completed. Specifically, the flag is set in
>>>>>>> bfq_add_bfqq_busy() when 'bfqq->dispatched' is zero, and it is cleared
>>>>>>> in both bfq_completed_request() and bfq_del_bfqq_busy() when
>>>>>>> 'bfqq->dispatched' is zero.
>>>>>>>
>>>>>> This general description seems correct to me. Have you already sent a new version of your patchset?
>>>>>
>>>>> I'm glad that we are finally on the same page here.
>>>>>
>>>>
>>>> Yep. Sorry for my chronic delay.
>>> Better late than never.
>>>>
>>>>> Please take a look at patch 1, which already implements the above
>>>>> description. It seems to me there is no need to send a new version
>>>>> for now. If you think there are still other problems, please let
>>>>> me know.
>>>>>
>>>>
>>>> Patch 1 seems OK to me. I have only one pending comment, on this patch (3/4). Let me paste the earlier exchange here for your convenience:
>>> That sounds good.
>>>>
>>>>>>
>>>>>> - /*
>>>>>> - * Next function is invoked last, because it causes bfqq to be
>>>>>> - * freed if the following holds: bfqq is not in service and
>>>>>> - * has no dispatched request. DO NOT use bfqq after the next
>>>>>> - * function invocation.
>>>>>> - */
>>>>> I would really love it if you leave this comment. I added it after
>>>>> suffering a lot for a nasty UAF. Of course the first sentence may
>>>>> need to be adjusted if the code that precedes it is to be removed.
>>>>> Same as above, if this patch is applied, this function will be gone.
>>
>> Hi, while trying to add the comment I got curious: before this
>> patchset, can bfqq be freed when bfq_weights_tree_remove() is called?
>>
>> bfq_completed_request
>>  bfqq->dispatched--
>>  if (!bfqq->dispatched && !bfq_bfqq_busy(bfqq))
>>   bfq_weights_tree_remove(bfqd, bfqq);
>>
>>  // continue to use bfqq
>>
>> If so, this seems problematic to me, because bfqq is still used after
>> bfq_weights_tree_remove() is called.
>>
>
> It is. Yet, IIRC, I verified that bfqq was not used after that free,
> and I added that comment as a heads-up. What is a scenario (before
> your pending modifications) where this use-after-free happens?
>

No, it never happens. I only noticed it because it would be odd to
place the comment where bfq_weights_tree_remove() is called, since bfqq
is still accessed afterwards.

If the situation the comment describes is possible, perhaps we should
move bfq_weights_tree_remove() to the end of bfq_completed_request().
However, it seems we haven't hit the problem for quite a long
time...

Thanks,
Kuai

> Thanks,
> Paolo
>
>> Thanks,
>> Kuai
>
> .
>

2022-09-14 09:35:49

by Jan Kara

Subject: Re: [PATCH -next v10 3/4] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'

Hi guys!

On Wed 14-09-22 16:15:26, Yu Kuai wrote:
> On 2022/09/14 15:50, Paolo VALENTE wrote:
> >
> >
> > > On 14 Sep 2022, at 03:55, Yu Kuai <[email protected]> wrote:
> > >
> > >
> > >
> > > On 2022/09/07 9:16, Yu Kuai wrote:
> > > > Hi, Paolo!
> > > > On 2022/09/06 17:37, Paolo Valente wrote:
> > > > >
> > > > >
> > > > > > On 26 Aug 2022, at 04:34, Yu Kuai <[email protected]> wrote:
> > > > > >
> > > > > > Hi, Paolo!
> > > > > >
> > > > > > On 2022/08/25 22:59, Paolo Valente wrote:
> > > > > > > > On 11 Aug 2022, at 03:19, Yu Kuai <[email protected] <mailto:[email protected]>> wrote:
> > > > > > > >
> > > > > > > > Hi, Paolo
> > > > > > > >
> > > > > > > > On 2022/08/10 18:49, Paolo Valente wrote:
> > > > > > > > > > On 27 Jul 2022, at 14:11, Yu Kuai <[email protected] <mailto:[email protected]>> wrote:
> > > > > > > > > >
> > > > > > > > > > Hi, Paolo
> > > > > > > > > >
> > > > > > > > > hi
> > > > > > > > > > Are you still interested in this patchset?
> > > > > > > > > >
> > > > > > > > > Yes. Sorry for replying very late again.
> > > > > > > > > Probably the last fix that you suggest is enough, but I'm a little bit
> > > > > > > > > concerned that it may be a little hasty. In fact, before this fix, we
> > > > > > > > > exchanged several messages, and I didn't seem to be very good at
> > > > > > > > > convincing you of the need to also take into account in-service
> > > > > > > > > I/O. So, my question is: are you sure that now you have a
> > > > > > > >
> > > > > > > > I'm confused here. I'm well aware that in-service I/O (called pending
> > > > > > > > requests in the patchset) should be counted, as you suggested in v7. Do
> > > > > > > > you still think that the approach in this patchset is problematic?
> > > > > > > >
> > > > > > > > I'll try to explain again how we track whether bfqq has pending
> > > > > > > > requests; please let me know if you still think there are problems:
> > > > > > > >
> > > > > > > > Patch 1 adds support for tracking whether bfqq has pending requests. It
> > > > > > > > does so by setting the flag 'entity->in_groups_with_pending_reqs' when the
> > > > > > > > first request is inserted into bfqq, and clearing it when the last
> > > > > > > > request is completed. Specifically, the flag is set in
> > > > > > > > bfq_add_bfqq_busy() when 'bfqq->dispatched' is false, and it is cleared
> > > > > > > > in both bfq_completed_request() and bfq_del_bfqq_busy() when
> > > > > > > > 'bfqq->dispatched' is false.
> > > > > > > >
> > > > > > > This general description seems correct to me. Have you already sent a new version of your patchset?
> > > > > >
> > > > > > I'm glad that we are finally on the same page here.
> > > > > >
> > > > >
> > > > > Yep. Sorry for my chronic delay.
> > > > Better late than never.
> > > > >
> > > > > > Please take a look at patch 1, which already implements the above
> > > > > > descriptions, it seems to me there is no need to send a new version
> > > > > > for now. If you think there are still some other problems, please let
> > > > > > me know.
> > > > > >
> > > > >
> > > > > Patch 1 seems ok to me. I seem to have only one pending comment on this patch (3/4) instead. Let me paste previous stuff here for your convenience:
> > > > That sounds good.
> > > > >
> > > > > > >
> > > > > > > - /*
> > > > > > > - * Next function is invoked last, because it causes bfqq to be
> > > > > > > - * freed if the following holds: bfqq is not in service and
> > > > > > > - * has no dispatched request. DO NOT use bfqq after the next
> > > > > > > - * function invocation.
> > > > > > > - */
> > > > > > I would really love it if you leave this comment. I added it after
> > > > > > suffering a lot for a nasty UAF. Of course the first sentence may
> > > > > > need to be adjusted if the code that precedes it is to be removed.
> > > > > > Same as above, if this patch is applied, this function will be gone.
> > >
> > > Hi, while trying to add the comment I got curious: before this
> > > patchset, can bfqq be freed when bfq_weights_tree_remove() is called?
> > >
> > > bfq_completed_request
> > > 	bfqq->dispatched--
> > > 	if (!bfqq->dispatched && !bfq_bfqq_busy(bfqq))
> > > 		bfq_weights_tree_remove(bfqd, bfqq);
> > >
> > > 	// continue to use bfqq
> > >
> > > If so, this seems problematic to me, because bfqq is used after
> > > bfq_weights_tree_remove() is called.
> > >
> >
> > It is. Yet, IIRC, I verified that bfqq was not used after that free,
> > and I added that comment as a heads-up. What is a scenario (before
> > your pending modifications) where this use-after-free happens?
> >
>
> No, it never happens. I only noticed it because it would be odd to
> place the comment where bfq_weights_tree_remove() is called, since bfqq
> is still accessed after that call.
>
> If the situation that the comment describes is possible, perhaps we should
> move bfq_weights_tree_remove() to the end of bfq_completed_request().
> However, it seems that we haven't hit the problem for quite a long
> time...

I'm a bit confused about which comment you are speaking about, but
bfq_completed_request() gets called only from bfq_finish_requeue_request()
and the request itself still holds a reference to bfqq. Only later in
bfq_finish_requeue_request(), when we do:

	bfqq_request_freed(bfqq);
	bfq_put_queue(bfqq);

can bfqq get freed.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
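
The lifetime argument in the message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `toy_queue`, `toy_queue_new()`, `toy_put_queue()`, `toy_completed_request()` and `toy_finish_request()` are hypothetical stand-ins invented for illustration, not the kernel's bfq code, and the model captures nothing beyond plain reference counting. It shows that while the request still holds its reference, the completion step cannot free the queue; only the final put can.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bfq_queue; not the kernel definition. */
struct toy_queue {
	int ref;        /* number of references held on the queue */
	int dispatched; /* requests dispatched but not yet completed */
};

/* Allocate a queue with one in-flight request that holds one reference. */
static struct toy_queue *toy_queue_new(void)
{
	struct toy_queue *q = calloc(1, sizeof(*q));

	if (!q)
		return NULL;
	q->ref = 1;
	q->dispatched = 1;
	return q;
}

/* Drop one reference; returns true if this call freed the queue. */
static bool toy_put_queue(struct toy_queue *q)
{
	assert(q->ref > 0);
	if (--q->ref > 0)
		return false;
	free(q);
	return true;
}

/* Models the completion step: it touches q but never drops a reference. */
static void toy_completed_request(struct toy_queue *q)
{
	q->dispatched--;
	/* q is still valid here: the request's reference pins it. */
}

/* Models the outer function: complete first, drop the reference last. */
static bool toy_finish_request(struct toy_queue *q)
{
	toy_completed_request(q);
	return toy_put_queue(q); /* q must not be used after this call */
}
```

In this model, calling `toy_finish_request()` on a queue whose only reference belongs to the request returns true (the queue is freed by the final put), whereas a queue with an additional holder survives the call.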

2022-09-15 01:37:16

by Yu Kuai

Subject: Re: [PATCH -next v10 3/4] block, bfq: refactor the counting of 'num_groups_with_pending_reqs'

Hi,

On 2022/09/14 17:00, Jan Kara wrote:
> Hi guys!
>
> On Wed 14-09-22 16:15:26, Yu Kuai wrote:
>> On 2022/09/14 15:50, Paolo VALENTE wrote:
>>>
>>>
>>>> On 14 Sep 2022, at 03:55, Yu Kuai <[email protected]> wrote:
>>>>
>>>>
>>>>
>>>> On 2022/09/07 9:16, Yu Kuai wrote:
>>>>> Hi, Paolo!
>>>>> On 2022/09/06 17:37, Paolo Valente wrote:
>>>>>>
>>>>>>
>>>>>>> On 26 Aug 2022, at 04:34, Yu Kuai <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi, Paolo!
>>>>>>>
>>>>>>> On 2022/08/25 22:59, Paolo Valente wrote:
>>>>>>>>> On 11 Aug 2022, at 03:19, Yu Kuai <[email protected] <mailto:[email protected]>> wrote:
>>>>>>>>>
>>>>>>>>> Hi, Paolo
>>>>>>>>>
>>>>>>>>> On 2022/08/10 18:49, Paolo Valente wrote:
>>>>>>>>>>> On 27 Jul 2022, at 14:11, Yu Kuai <[email protected] <mailto:[email protected]>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi, Paolo
>>>>>>>>>>>
>>>>>>>>>> hi
>>>>>>>>>>> Are you still interested in this patchset?
>>>>>>>>>>>
>>>>>>>>>> Yes. Sorry for replying very late again.
>>>>>>>>>> Probably the last fix that you suggest is enough, but I'm a little bit
>>>>>>>>>> concerned that it may be a little hasty. In fact, before this fix, we
>>>>>>>>>> exchanged several messages, and I didn't seem to be very good at
>>>>>>>>>> convincing you of the need to also take into account in-service
>>>>>>>>>> I/O. So, my question is: are you sure that now you have a
>>>>>>>>>
>>>>>>>>> I'm confused here. I'm well aware that in-service I/O (called pending
>>>>>>>>> requests in the patchset) should be counted, as you suggested in v7. Do
>>>>>>>>> you still think that the approach in this patchset is problematic?
>>>>>>>>>
>>>>>>>>> I'll try to explain again how we track whether bfqq has pending
>>>>>>>>> requests; please let me know if you still think there are problems:
>>>>>>>>>
>>>>>>>>> Patch 1 adds support for tracking whether bfqq has pending requests. It
>>>>>>>>> does so by setting the flag 'entity->in_groups_with_pending_reqs' when the
>>>>>>>>> first request is inserted into bfqq, and clearing it when the last
>>>>>>>>> request is completed. Specifically, the flag is set in
>>>>>>>>> bfq_add_bfqq_busy() when 'bfqq->dispatched' is false, and it is cleared
>>>>>>>>> in both bfq_completed_request() and bfq_del_bfqq_busy() when
>>>>>>>>> 'bfqq->dispatched' is false.
>>>>>>>>>
>>>>>>>> This general description seems correct to me. Have you already sent a new version of your patchset?
>>>>>>>
>>>>>>> I'm glad that we are finally on the same page here.
>>>>>>>
>>>>>>
>>>>>> Yep. Sorry for my chronic delay.
>>>>> Better late than never.
>>>>>>
>>>>>>> Please take a look at patch 1, which already implements the above
>>>>>>> descriptions, it seems to me there is no need to send a new version
>>>>>>> for now. If you think there are still some other problems, please let
>>>>>>> me know.
>>>>>>>
>>>>>>
>>>>>> Patch 1 seems ok to me. I seem to have only one pending comment on this patch (3/4) instead. Let me paste previous stuff here for your convenience:
>>>>> That sounds good.
>>>>>>
>>>>>>>>
>>>>>>>> - /*
>>>>>>>> - * Next function is invoked last, because it causes bfqq to be
>>>>>>>> - * freed if the following holds: bfqq is not in service and
>>>>>>>> - * has no dispatched request. DO NOT use bfqq after the next
>>>>>>>> - * function invocation.
>>>>>>>> - */
>>>>>>> I would really love it if you leave this comment. I added it after
>>>>>>> suffering a lot for a nasty UAF. Of course the first sentence may
>>>>>>> need to be adjusted if the code that precedes it is to be removed.
>>>>>>> Same as above, if this patch is applied, this function will be gone.
>>>>
>>>> Hi, while trying to add the comment I got curious: before this
>>>> patchset, can bfqq be freed when bfq_weights_tree_remove() is called?
>>>>
>>>> bfq_completed_request
>>>> 	bfqq->dispatched--
>>>> 	if (!bfqq->dispatched && !bfq_bfqq_busy(bfqq))
>>>> 		bfq_weights_tree_remove(bfqd, bfqq);
>>>>
>>>> 	// continue to use bfqq
>>>>
>>>> If so, this seems problematic to me, because bfqq is used after
>>>> bfq_weights_tree_remove() is called.
>>>>
>>>
>>> It is. Yet, IIRC, I verified that bfqq was not used after that free,
>>> and I added that comment as a heads-up. What is a scenario (before
>>> your pending modifications) where this use-after-free happens?
>>>
>>
>> No, it never happens. I only noticed it because it would be odd to
>> place the comment where bfq_weights_tree_remove() is called, since bfqq
>> is still accessed after that call.
>>
>> If the situation that the comment describes is possible, perhaps we should
>> move bfq_weights_tree_remove() to the end of bfq_completed_request().
>> However, it seems that we haven't hit the problem for quite a long
>> time...
>
> I'm a bit confused about which comment you are speaking about, but
> bfq_completed_request() gets called only from bfq_finish_requeue_request()
> and the request itself still holds a reference to bfqq. Only later in
> bfq_finish_requeue_request(), when we do:
>
> 	bfqq_request_freed(bfqq);
> 	bfq_put_queue(bfqq);
>
> can bfqq get freed.

Yes, you're right. Then I think the only place where
bfq_weights_tree_remove() can free bfqq is when it is called from
bfq_del_bfqq_busy(). I'll move the following comment there, slightly
adjusted; it comes from bfq_weights_tree_remove() before this patchset:

	/*
	 * Next function is invoked last, because it causes bfqq to be
	 * freed. DO NOT use bfqq after the next function invocation.
	 */

Thanks,
Kuai

>
> Honza
>
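
The "invoke last" rule that the adjusted comment documents can be sketched in a small userspace model. Everything here is an assumption made for illustration: `demo_sched`, `demo_queue`, `demo_queue_new()`, `demo_weights_tree_remove()` and `demo_del_busy()` are hypothetical names, not the kernel's definitions, and the model simply assumes that the tree-removal helper drops a reference that may be the last one. The point is the ordering: every access to the queue happens before the call that may free it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's struct bfq_data / bfq_queue. */
struct demo_sched {
	int busy_queues; /* number of queues currently marked busy */
};

struct demo_queue {
	int ref;         /* references held on the queue */
	bool busy;       /* queue is in the busy set */
	int dispatched;  /* requests dispatched but not yet completed */
};

/* Allocate a busy queue with the given reference and dispatch counts. */
static struct demo_queue *demo_queue_new(int refs, int dispatched)
{
	struct demo_queue *q = calloc(1, sizeof(*q));

	if (!q)
		return NULL;
	q->ref = refs;
	q->busy = true;
	q->dispatched = dispatched;
	return q;
}

/* Assumed behaviour: drops one reference; returns true if q was freed. */
static bool demo_weights_tree_remove(struct demo_queue *q)
{
	assert(q->ref > 0);
	if (--q->ref > 0)
		return false;
	free(q);
	return true;
}

/* Mark q as no longer busy; returns true if q was freed in the process. */
static bool demo_del_busy(struct demo_sched *sd, struct demo_queue *q)
{
	/* All accesses to q happen before the call that may free it. */
	q->busy = false;
	sd->busy_queues--;

	if (q->dispatched)
		return false; /* requests still in flight: keep the reference */

	/*
	 * Next function is invoked last, because it may cause q to be
	 * freed. Do not use q after this call.
	 */
	return demo_weights_tree_remove(q);
}
```

Returning the "freed" result to the caller is a choice made only for this sketch, so that a caller can tell whether the pointer is still valid.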