2022-11-15 23:11:41

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH 0/3] sbitmap: Fix two issues in the per-bitmap wakeup counter code

Jan reported two issues in the original thread.

The first is that wake_index was not updated after returning from
sbq_wake_ptr which meant we'd have to empty the wq before moving to the
next one. Patch 1/3 in this series reorders the code to avoid this
condition, increasing fairness of queue selection and preventing
starvation. I sent this patch already on the other thread and Jan
reviewed it, but since it is a small one, and a dependency for the
other, I'm resending it a along this series.

The second issue is trickier. When the selected queue is emptied after
the waitqueue_active check and before wake_up_nr, there is no waiters to
be awaken in that queue, even if other queues might have it. This
causes us to loose one too many wakeups, and there might not be enough
requests in flight to wake up every queued request.

The proposed fix, is to walk through every queue after doing the atomic
update, such that we ensure any waiters already queued are candidates
for awakening, and that we awake at least 1 waiter in any of the queues.
The patch is a bit more complex than the suggestion since it avoids
partial updates to wake_index, which measurably hurt performance
unnecessarily.

It survived the same tests done on the original patch.

btw, I'm still missing the latency and utilisation reports. I haven't
forgotten about it, but I didn't have a chance to collect them
yet. Sorry. I will follow up with them for completeness, even if the
original patch is already queued.

Gabriel Krisman Bertazi (3):
sbitmap: Advance the queue index before waking up a queue
wait: Return number of exclusive waiters awaken
sbitmap: Try each queue to wake up at least one waiter

include/linux/wait.h | 2 +-
kernel/sched/wait.c | 18 +++++++++++-------
lib/sbitmap.c | 36 +++++++++++++++++++-----------------
3 files changed, 31 insertions(+), 25 deletions(-)

--
2.35.3



2022-11-15 23:55:10

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH 3/3] sbitmap: Try each queue to wake up at least one waiter

Jan reported the new algorithm as merged might be problematic if the
queue being awaken becomes empty between the waitqueue_active inside
sbq_wake_ptr check and the wake up. If that happens, wake_up_nr will
not wake up any waiter and we loose too many wake ups. In order to
guarantee progress, we need to wake up at least one waiter here, if
there are any. This now requires trying to wake up from every queue.

Instead of walking through all the queues with sbq_wake_ptr, this call
moves the wake up inside that function. In a previous version of the
patch, I found that updating wake_index several times when walking
through queues had a measurable overhead. This ensures we only update
it once, at the end.

Fixes: 4f8126bb2308 ("sbitmap: Use single per-bitmap counting to wake up queued tags")
Reported-by: Jan Kara <[email protected]>
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
lib/sbitmap.c | 28 ++++++++++++----------------
1 file changed, 12 insertions(+), 16 deletions(-)

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index bea7984f7987..586deb333237 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -560,12 +560,12 @@ void sbitmap_queue_min_shallow_depth(struct sbitmap_queue *sbq,
}
EXPORT_SYMBOL_GPL(sbitmap_queue_min_shallow_depth);

-static struct sbq_wait_state *sbq_wake_ptr(struct sbitmap_queue *sbq)
+static void __sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr)
{
int i, wake_index;

if (!atomic_read(&sbq->ws_active))
- return NULL;
+ return;

wake_index = atomic_read(&sbq->wake_index);
for (i = 0; i < SBQ_WAIT_QUEUES; i++) {
@@ -579,20 +579,22 @@ static struct sbq_wait_state *sbq_wake_ptr(struct sbitmap_queue *sbq)
*/
wake_index = sbq_index_inc(wake_index);

- if (waitqueue_active(&ws->wait)) {
- if (wake_index != atomic_read(&sbq->wake_index))
- atomic_set(&sbq->wake_index, wake_index);
- return ws;
- }
+ /*
+ * It is sufficient to wake up at least one waiter to
+ * guarantee forward progress.
+ */
+ if (waitqueue_active(&ws->wait) &&
+ wake_up_nr(&ws->wait, nr))
+ break;
}

- return NULL;
+ if (wake_index != atomic_read(&sbq->wake_index))
+ atomic_set(&sbq->wake_index, wake_index);
}

void sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr)
{
unsigned int wake_batch = READ_ONCE(sbq->wake_batch);
- struct sbq_wait_state *ws = NULL;
unsigned int wakeups;

if (!atomic_read(&sbq->ws_active))
@@ -604,16 +606,10 @@ void sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr)
do {
if (atomic_read(&sbq->completion_cnt) - wakeups < wake_batch)
return;
-
- if (!ws) {
- ws = sbq_wake_ptr(sbq);
- if (!ws)
- return;
- }
} while (!atomic_try_cmpxchg(&sbq->wakeup_cnt,
&wakeups, wakeups + wake_batch));

- wake_up_nr(&ws->wait, wake_batch);
+ __sbitmap_queue_wake_up(sbq, wake_batch);
}
EXPORT_SYMBOL_GPL(sbitmap_queue_wake_up);

--
2.35.3


2022-11-16 11:31:27

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH 3/3] sbitmap: Try each queue to wake up at least one waiter

On Tue 15-11-22 17:45:53, Gabriel Krisman Bertazi wrote:
> Jan reported the new algorithm as merged might be problematic if the
> queue being awaken becomes empty between the waitqueue_active inside
> sbq_wake_ptr check and the wake up. If that happens, wake_up_nr will
> not wake up any waiter and we loose too many wake ups. In order to
> guarantee progress, we need to wake up at least one waiter here, if
> there are any. This now requires trying to wake up from every queue.
>
> Instead of walking through all the queues with sbq_wake_ptr, this call
> moves the wake up inside that function. In a previous version of the
> patch, I found that updating wake_index several times when walking
> through queues had a measurable overhead. This ensures we only update
> it once, at the end.
>
> Fixes: 4f8126bb2308 ("sbitmap: Use single per-bitmap counting to wake up queued tags")
> Reported-by: Jan Kara <[email protected]>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>

Elegant and looks good to me! Feel free to add:

Reviewed-by: Jan Kara <[email protected]>

Honza

> ---
> lib/sbitmap.c | 28 ++++++++++++----------------
> 1 file changed, 12 insertions(+), 16 deletions(-)
>
> diff --git a/lib/sbitmap.c b/lib/sbitmap.c
> index bea7984f7987..586deb333237 100644
> --- a/lib/sbitmap.c
> +++ b/lib/sbitmap.c
> @@ -560,12 +560,12 @@ void sbitmap_queue_min_shallow_depth(struct sbitmap_queue *sbq,
> }
> EXPORT_SYMBOL_GPL(sbitmap_queue_min_shallow_depth);
>
> -static struct sbq_wait_state *sbq_wake_ptr(struct sbitmap_queue *sbq)
> +static void __sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr)
> {
> int i, wake_index;
>
> if (!atomic_read(&sbq->ws_active))
> - return NULL;
> + return;
>
> wake_index = atomic_read(&sbq->wake_index);
> for (i = 0; i < SBQ_WAIT_QUEUES; i++) {
> @@ -579,20 +579,22 @@ static struct sbq_wait_state *sbq_wake_ptr(struct sbitmap_queue *sbq)
> */
> wake_index = sbq_index_inc(wake_index);
>
> - if (waitqueue_active(&ws->wait)) {
> - if (wake_index != atomic_read(&sbq->wake_index))
> - atomic_set(&sbq->wake_index, wake_index);
> - return ws;
> - }
> + /*
> + * It is sufficient to wake up at least one waiter to
> + * guarantee forward progress.
> + */
> + if (waitqueue_active(&ws->wait) &&
> + wake_up_nr(&ws->wait, nr))
> + break;
> }
>
> - return NULL;
> + if (wake_index != atomic_read(&sbq->wake_index))
> + atomic_set(&sbq->wake_index, wake_index);
> }
>
> void sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr)
> {
> unsigned int wake_batch = READ_ONCE(sbq->wake_batch);
> - struct sbq_wait_state *ws = NULL;
> unsigned int wakeups;
>
> if (!atomic_read(&sbq->ws_active))
> @@ -604,16 +606,10 @@ void sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr)
> do {
> if (atomic_read(&sbq->completion_cnt) - wakeups < wake_batch)
> return;
> -
> - if (!ws) {
> - ws = sbq_wake_ptr(sbq);
> - if (!ws)
> - return;
> - }
> } while (!atomic_try_cmpxchg(&sbq->wakeup_cnt,
> &wakeups, wakeups + wake_batch));
>
> - wake_up_nr(&ws->wait, wake_batch);
> + __sbitmap_queue_wake_up(sbq, wake_batch);
> }
> EXPORT_SYMBOL_GPL(sbitmap_queue_wake_up);
>
> --
> 2.35.3
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2022-11-16 20:05:39

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 0/3] sbitmap: Fix two issues in the per-bitmap wakeup counter code

On Tue, 15 Nov 2022 17:45:50 -0500, Gabriel Krisman Bertazi wrote:
> Jan reported two issues in the original thread.
>
> The first is that wake_index was not updated after returning from
> sbq_wake_ptr which meant we'd have to empty the wq before moving to the
> next one. Patch 1/3 in this series reorders the code to avoid this
> condition, increasing fairness of queue selection and preventing
> starvation. I sent this patch already on the other thread and Jan
> reviewed it, but since it is a small one, and a dependency for the
> other, I'm resending it a along this series.
>
> [...]

Applied, thanks!

[1/3] sbitmap: Advance the queue index before waking up a queue
commit: 976570b4ecd30d3ec6e1b0910da8e5edc591f2b6
[2/3] wait: Return number of exclusive waiters awaken
commit: ee7dc86b6d3e3b86c2c487f713eda657850de238
[3/3] sbitmap: Try each queue to wake up at least one waiter
commit: 26edb30dd1c0c9be11fa676b4f330ada7b794ba6

Best regards,
--
Jens Axboe