Hi Jens,
this the V2 of the series. The only change is the removal of ifdefs
from around gets and puts of bfq groups. As I wrote in my previous
cover letter, these patches are mostly fixes for the issues reported
in [1, 2]. All patches have been publicly tested in the dev version of
BFQ.
Thanks,
Paolo
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1767539
[2] https://bugzilla.kernel.org/show_bug.cgi?id=205447
Paolo Valente (7):
block, bfq: do not plug I/O for bfq_queues with no proc refs
block, bfq: do not insert oom queue into position tree
block, bfq: get extra ref to prevent a queue from being freed during a
group move
block, bfq: extend incomplete name of field on_st
block, bfq: remove ifdefs from around gets/puts of bfq groups
block, bfq: get a ref to a group when adding it to a service tree
block, bfq: clarify the goal of bfq_split_bfqq()
block/bfq-cgroup.c | 16 ++++++++++++++--
block/bfq-iosched.c | 26 ++++++++++++++++++++------
block/bfq-iosched.h | 4 +++-
block/bfq-wf2q.c | 23 +++++++++++++++++------
4 files changed, 54 insertions(+), 15 deletions(-)
--
2.20.1
BFQ schedules generic entities, which may represent either bfq_queues
or groups of bfq_queues. When an entity is inserted into a service
tree, a reference must be taken, to make sure that the entity does not
disappear while still referred in the tree. Unfortunately, such a
reference is mistakenly taken only if the entity represents a
bfq_queue. This commit takes a reference also in case the entity
represents a group.
Tested-by: Chris Evich <[email protected]>
Signed-off-by: Paolo Valente <[email protected]>
---
block/bfq-cgroup.c | 2 +-
block/bfq-iosched.h | 1 +
block/bfq-wf2q.c | 12 ++++++++++--
3 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index cae488b58049..09b69a3ed490 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -332,7 +332,7 @@ static void bfqg_put(struct bfq_group *bfqg)
kfree(bfqg);
}
-static void bfqg_and_blkg_get(struct bfq_group *bfqg)
+void bfqg_and_blkg_get(struct bfq_group *bfqg)
{
/* see comments in bfq_bic_update_cgroup for why refcounting bfqg */
bfqg_get(bfqg);
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index 2c7cec737b2a..d1233af9c684 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -985,6 +985,7 @@ struct bfq_group *bfq_find_set_group(struct bfq_data *bfqd,
struct blkcg_gq *bfqg_to_blkg(struct bfq_group *bfqg);
struct bfq_group *bfqq_group(struct bfq_queue *bfqq);
struct bfq_group *bfq_create_group_hierarchy(struct bfq_data *bfqd, int node);
+void bfqg_and_blkg_get(struct bfq_group *bfqg);
void bfqg_and_blkg_put(struct bfq_group *bfqg);
#ifdef CONFIG_BFQ_GROUP_IOSCHED
diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
index 26776bdbdf36..eb0e2a6daabe 100644
--- a/block/bfq-wf2q.c
+++ b/block/bfq-wf2q.c
@@ -533,7 +533,9 @@ static void bfq_get_entity(struct bfq_entity *entity)
bfqq->ref++;
bfq_log_bfqq(bfqq->bfqd, bfqq, "get_entity: %p %d",
bfqq, bfqq->ref);
- }
+ } else
+ bfqg_and_blkg_get(container_of(entity, struct bfq_group,
+ entity));
}
/**
@@ -647,8 +649,14 @@ static void bfq_forget_entity(struct bfq_service_tree *st,
entity->on_st_or_in_serv = false;
st->wsum -= entity->weight;
- if (bfqq && !is_in_service)
+ if (is_in_service)
+ return;
+
+ if (bfqq)
bfq_put_queue(bfqq);
+ else
+ bfqg_and_blkg_put(container_of(entity, struct bfq_group,
+ entity));
}
/**
--
2.20.1
The exact, general goal of the function bfq_split_bfqq() is not that
apparent. Add a comment to make it clear.
Signed-off-by: Paolo Valente <[email protected]>
---
block/bfq-iosched.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index fff76c920968..8c436abfaf14 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -5979,6 +5979,8 @@ static void bfq_finish_requeue_request(struct request *rq)
}
/*
+ * Removes the association between the current task and bfqq, assuming
+ * that bic points to the bfq iocontext of the task.
* Returns NULL if a new bfqq should be allocated, or the old bfqq if this
* was the last process referring to that bfqq.
*/
--
2.20.1
Commit 478de3380c1c ("block, bfq: deschedule empty bfq_queues not
referred by any process") fixed commit 3726112ec731 ("block, bfq:
re-schedule empty queues if they deserve I/O plugging") by
descheduling an empty bfq_queue when it remains with not process
reference. Yet, this still left a case uncovered: an empty bfq_queue
with not process reference that remains in service. This happens for
an in-service sync bfq_queue that is deemed to deserve I/O-dispatch
plugging when it remains empty. Yet no new requests will arrive for
such a bfq_queue if no process sends requests to it any longer. Even
worse, the bfq_queue may happen to be prematurely freed while still in
service (because there may remain no reference to it any longer).
This commit solves this problem by preventing I/O dispatch from being
plugged for the in-service bfq_queue, if the latter has no process
reference (the bfq_queue is then prevented from remaining in service).
Fixes: 3726112ec731 ("block, bfq: re-schedule empty queues if they deserve I/O plugging")
Reported-by: Patrick Dung <[email protected]>
Tested-by: Patrick Dung <[email protected]>
Signed-off-by: Paolo Valente <[email protected]>
---
block/bfq-iosched.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 4686b68b48b4..55d4328e7c12 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -3443,6 +3443,10 @@ static void bfq_dispatch_remove(struct request_queue *q, struct request *rq)
static bool idling_needed_for_service_guarantees(struct bfq_data *bfqd,
struct bfq_queue *bfqq)
{
+ /* No point in idling for bfqq if it won't get requests any longer */
+ if (unlikely(!bfqq_process_refs(bfqq)))
+ return false;
+
return (bfqq->wr_coeff > 1 &&
(bfqd->wr_busy_queues <
bfq_tot_busy_queues(bfqd) ||
@@ -4076,6 +4080,10 @@ static bool idling_boosts_thr_without_issues(struct bfq_data *bfqd,
bfqq_sequential_and_IO_bound,
idling_boosts_thr;
+ /* No point in idling for bfqq if it won't get requests any longer */
+ if (unlikely(!bfqq_process_refs(bfqq)))
+ return false;
+
bfqq_sequential_and_IO_bound = !BFQQ_SEEKY(bfqq) &&
bfq_bfqq_IO_bound(bfqq) && bfq_bfqq_has_short_ttime(bfqq);
@@ -4169,6 +4177,10 @@ static bool bfq_better_to_idle(struct bfq_queue *bfqq)
struct bfq_data *bfqd = bfqq->bfqd;
bool idling_boosts_thr_with_no_issue, idling_needed_for_service_guar;
+ /* No point in idling for bfqq if it won't get requests any longer */
+ if (unlikely(!bfqq_process_refs(bfqq)))
+ return false;
+
if (unlikely(bfqd->strict_guarantees))
return true;
--
2.20.1
ifdefs around gets and puts of bfq groups reduce readability, remove them.
Reported-by: Jens Axboe <[email protected]>
Signed-off-by: Paolo Valente <[email protected]>
---
block/bfq-cgroup.c | 4 ++++
block/bfq-iosched.c | 6 +-----
block/bfq-iosched.h | 1 +
3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index c818c64766e5..cae488b58049 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -1406,6 +1406,10 @@ struct bfq_group *bfqq_group(struct bfq_queue *bfqq)
return bfqq->bfqd->root_group;
}
+void bfqg_and_blkg_get(struct bfq_group *bfqg) {}
+
+void bfqg_and_blkg_put(struct bfq_group *bfqg) {}
+
struct bfq_group *bfq_create_group_hierarchy(struct bfq_data *bfqd, int node)
{
struct bfq_group *bfqg;
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 28770ec7c06f..fff76c920968 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -4825,9 +4825,7 @@ void bfq_put_queue(struct bfq_queue *bfqq)
{
struct bfq_queue *item;
struct hlist_node *n;
-#ifdef CONFIG_BFQ_GROUP_IOSCHED
struct bfq_group *bfqg = bfqq_group(bfqq);
-#endif
if (bfqq->bfqd)
bfq_log_bfqq(bfqq->bfqd, bfqq, "put_queue: %p %d",
@@ -4900,9 +4898,7 @@ void bfq_put_queue(struct bfq_queue *bfqq)
bfqq->bfqd->last_completed_rq_bfqq = NULL;
kmem_cache_free(bfq_pool, bfqq);
-#ifdef CONFIG_BFQ_GROUP_IOSCHED
bfqg_and_blkg_put(bfqg);
-#endif
}
static void bfq_put_cooperator(struct bfq_queue *bfqq)
@@ -6390,10 +6386,10 @@ static void bfq_exit_queue(struct elevator_queue *e)
hrtimer_cancel(&bfqd->idle_slice_timer);
-#ifdef CONFIG_BFQ_GROUP_IOSCHED
/* release oom-queue reference to root group */
bfqg_and_blkg_put(bfqd->root_group);
+#ifdef CONFIG_BFQ_GROUP_IOSCHED
blkcg_deactivate_policy(bfqd->queue, &blkcg_policy_bfq);
#else
spin_lock_irq(&bfqd->lock);
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index f1cb89def7f8..2c7cec737b2a 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -921,6 +921,7 @@ struct bfq_group {
#else
struct bfq_group {
+ struct bfq_entity entity;
struct bfq_sched_data sched_data;
struct bfq_queue *async_bfqq[2][IOPRIO_BE_NR];
--
2.20.1
BFQ maintains an ordered list, implemented with an RB tree, of
head-request positions of non-empty bfq_queues. This position tree,
inherited from CFQ, is used to find bfq_queues that contain I/O close
to each other. BFQ merges these bfq_queues into a single shared queue,
if this boosts throughput on the device at hand.
There is however a special-purpose bfq_queue that does not participate
in queue merging, the oom bfq_queue. Yet, also this bfq_queue could be
wrongly added to the position tree. So bfqq_find_close() could return
the oom bfq_queue, which is a source of further troubles in an
out-of-memory situation. This commit prevents the oom bfq_queue from
being inserted into the position tree.
Tested-by: Patrick Dung <[email protected]>
Signed-off-by: Paolo Valente <[email protected]>
---
block/bfq-iosched.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 55d4328e7c12..15dfb0844644 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -613,6 +613,10 @@ bfq_pos_tree_add_move(struct bfq_data *bfqd, struct bfq_queue *bfqq)
bfqq->pos_root = NULL;
}
+ /* oom_bfqq does not participate in queue merging */
+ if (bfqq == &bfqd->oom_bfqq)
+ return;
+
/*
* bfqq cannot be merged any longer (see comments in
* bfq_setup_cooperator): no point in adding bfqq into the
--
2.20.1
The flag on_st in the bfq_entity data structure is true if the entity
is on a service tree or is in service. Yet the name of the field,
confusingly, does not mention the second, very important case. Extend
the name to mention the second case too.
Signed-off-by: Paolo Valente <[email protected]>
---
block/bfq-cgroup.c | 2 +-
block/bfq-iosched.c | 2 +-
block/bfq-iosched.h | 2 +-
block/bfq-wf2q.c | 11 +++++++----
4 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index 8ab7f18ff8cb..c818c64766e5 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -659,7 +659,7 @@ void bfq_bfqq_move(struct bfq_data *bfqd, struct bfq_queue *bfqq,
if (bfq_bfqq_busy(bfqq))
bfq_deactivate_bfqq(bfqd, bfqq, false, false);
- else if (entity->on_st)
+ else if (entity->on_st_or_in_serv)
bfq_put_idle_entity(bfq_entity_service_tree(entity), entity);
bfqg_and_blkg_put(bfqq_group(bfqq));
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 15dfb0844644..28770ec7c06f 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -1059,7 +1059,7 @@ bfq_bfqq_resume_state(struct bfq_queue *bfqq, struct bfq_data *bfqd,
static int bfqq_process_refs(struct bfq_queue *bfqq)
{
- return bfqq->ref - bfqq->allocated - bfqq->entity.on_st -
+ return bfqq->ref - bfqq->allocated - bfqq->entity.on_st_or_in_serv -
(bfqq->weight_counter != NULL);
}
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index 8526f20c53bc..f1cb89def7f8 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -150,7 +150,7 @@ struct bfq_entity {
* Flag, true if the entity is on a tree (either the active or
* the idle one of its service_tree) or is in service.
*/
- bool on_st;
+ bool on_st_or_in_serv;
/* B-WF2Q+ start and finish timestamps [sectors/weight] */
u64 start, finish;
diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
index ffe9ce9faa89..26776bdbdf36 100644
--- a/block/bfq-wf2q.c
+++ b/block/bfq-wf2q.c
@@ -645,7 +645,7 @@ static void bfq_forget_entity(struct bfq_service_tree *st,
{
struct bfq_queue *bfqq = bfq_entity_to_bfqq(entity);
- entity->on_st = false;
+ entity->on_st_or_in_serv = false;
st->wsum -= entity->weight;
if (bfqq && !is_in_service)
bfq_put_queue(bfqq);
@@ -999,7 +999,7 @@ static void __bfq_activate_entity(struct bfq_entity *entity,
*/
bfq_get_entity(entity);
- entity->on_st = true;
+ entity->on_st_or_in_serv = true;
}
#ifdef CONFIG_BFQ_GROUP_IOSCHED
@@ -1165,7 +1165,10 @@ bool __bfq_deactivate_entity(struct bfq_entity *entity, bool ins_into_idle_tree)
struct bfq_service_tree *st;
bool is_in_service;
- if (!entity->on_st) /* entity never activated, or already inactive */
+ if (!entity->on_st_or_in_serv) /*
+ * entity never activated, or
+ * already inactive
+ */
return false;
/*
@@ -1620,7 +1623,7 @@ bool __bfq_bfqd_reset_in_service(struct bfq_data *bfqd)
* service tree either, then release the service reference to
* the queue it represents (taken with bfq_get_entity).
*/
- if (!in_serv_entity->on_st) {
+ if (!in_serv_entity->on_st_or_in_serv) {
/*
* If no process is referencing in_serv_bfqq any
* longer, then the service reference may be the only
--
2.20.1
In bfq_bfqq_move(), the bfq_queue, say Q, to be moved to a new group
may happen to be deactivated in the scheduling data structures of the
source group (and then activated in the destination group). If Q is
referred only by the data structures in the source group when the
deactivation happens, then Q is freed upon the deactivation.
This commit addresses this issue by getting an extra reference before
the possible deactivation, and releasing this extra reference after Q
has been moved.
Tested-by: Chris Evich <[email protected]>
Signed-off-by: Paolo Valente <[email protected]>
---
block/bfq-cgroup.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index e1419edde2ec..8ab7f18ff8cb 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -651,6 +651,12 @@ void bfq_bfqq_move(struct bfq_data *bfqd, struct bfq_queue *bfqq,
bfq_bfqq_expire(bfqd, bfqd->in_service_queue,
false, BFQQE_PREEMPTED);
+ /*
+ * get extra reference to prevent bfqq from being freed in
+ * next possible deactivate
+ */
+ bfqq->ref++;
+
if (bfq_bfqq_busy(bfqq))
bfq_deactivate_bfqq(bfqd, bfqq, false, false);
else if (entity->on_st)
@@ -670,6 +676,8 @@ void bfq_bfqq_move(struct bfq_data *bfqd, struct bfq_queue *bfqq,
if (!bfqd->in_service_queue && !bfqd->rq_in_driver)
bfq_schedule_dispatch(bfqd);
+ /* release extra ref taken above */
+ bfq_put_queue(bfqq);
}
/**
--
2.20.1
On 03.02.2020 11:40, Paolo Valente wrote:
> Hi Jens,
> this the V2 of the series. The only change is the removal of ifdefs
> from around gets and puts of bfq groups. As I wrote in my previous
> cover letter, these patches are mostly fixes for the issues reported
> in [1, 2]. All patches have been publicly tested in the dev version of
> BFQ.
>
> Thanks,
> Paolo
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1767539
> [2] https://bugzilla.kernel.org/show_bug.cgi?id=205447
>
> Paolo Valente (7):
> block, bfq: do not plug I/O for bfq_queues with no proc refs
> block, bfq: do not insert oom queue into position tree
> block, bfq: get extra ref to prevent a queue from being freed during
> a
> group move
> block, bfq: extend incomplete name of field on_st
> block, bfq: remove ifdefs from around gets/puts of bfq groups
> block, bfq: get a ref to a group when adding it to a service tree
> block, bfq: clarify the goal of bfq_split_bfqq()
>
> block/bfq-cgroup.c | 16 ++++++++++++++--
> block/bfq-iosched.c | 26 ++++++++++++++++++++------
> block/bfq-iosched.h | 4 +++-
> block/bfq-wf2q.c | 23 +++++++++++++++++------
> 4 files changed, 54 insertions(+), 15 deletions(-)
>
> --
> 2.20.1
Feel free to add:
Tested-by: Oleksandr Natalenko <[email protected]>
since I'm running v5.5 with this series applied for a couple of days
already without any visible smoke.
--
Oleksandr Natalenko (post-factum)
On 2/3/20 3:40 AM, Paolo Valente wrote:
> Hi Jens,
> this the V2 of the series. The only change is the removal of ifdefs
> from around gets and puts of bfq groups. As I wrote in my previous
> cover letter, these patches are mostly fixes for the issues reported
> in [1, 2]. All patches have been publicly tested in the dev version of
> BFQ.
I'll apply this one, at least the ifdef cleanup is nice and patch
6/7 no longer makes your eyes burn!
--
Jens Axboe