2019-10-07 23:20:16

by Pavel Begunkov

[permalink] [raw]
Subject: [PATCH] io_uring: remove wait loop spurious wakeups

From: Pavel Begunkov <[email protected]>

Any changes interesting to tasks waiting in io_cqring_wait() are
commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
also tries to do that but with no reason, that means spurious wakeups
every io_free_req() and io_uring_enter().

Just use percpu_ref_put() instead.

Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 16 ++++------------
1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index c934f91c51e9..89d77a626063 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -591,14 +591,6 @@ static void io_cqring_add_event(struct io_ring_ctx *ctx, u64 user_data,
io_cqring_ev_posted(ctx);
}

-static void io_ring_drop_ctx_refs(struct io_ring_ctx *ctx, unsigned refs)
-{
- percpu_ref_put_many(&ctx->refs, refs);
-
- if (waitqueue_active(&ctx->wait))
- wake_up(&ctx->wait);
-}
-
static struct io_kiocb *io_get_req(struct io_ring_ctx *ctx,
struct io_submit_state *state)
{
@@ -646,7 +638,7 @@ static struct io_kiocb *io_get_req(struct io_ring_ctx *ctx,
req->result = 0;
return req;
out:
- io_ring_drop_ctx_refs(ctx, 1);
+ percpu_ref_put(&ctx->refs);
return NULL;
}

@@ -654,7 +646,7 @@ static void io_free_req_many(struct io_ring_ctx *ctx, void **reqs, int *nr)
{
if (*nr) {
kmem_cache_free_bulk(req_cachep, *nr, reqs);
- io_ring_drop_ctx_refs(ctx, *nr);
+ percpu_ref_put_many(&ctx->refs, *nr);
*nr = 0;
}
}
@@ -663,7 +655,7 @@ static void __io_free_req(struct io_kiocb *req)
{
if (req->file && !(req->flags & REQ_F_FIXED_FILE))
fput(req->file);
- io_ring_drop_ctx_refs(req->ctx, 1);
+ percpu_ref_put(&req->ctx->refs);
kmem_cache_free(req_cachep, req);
}

@@ -3630,7 +3622,7 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
}
}

- io_ring_drop_ctx_refs(ctx, 1);
+ percpu_ref_put(&ctx->refs);
out_fput:
fdput(f);
return submitted ? submitted : ret;
--
2.23.0


2019-10-08 03:16:57

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] io_uring: remove wait loop spurious wakeups

On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
> From: Pavel Begunkov <[email protected]>
>
> Any changes interesting to tasks waiting in io_cqring_wait() are
> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
> also tries to do that but with no reason, that means spurious wakeups
> every io_free_req() and io_uring_enter().
>
> Just use percpu_ref_put() instead.

Looks good, this is a leftover from when the ctx teardown used
the waitqueue as well.

--
Jens Axboe

2019-10-08 16:46:05

by Pavel Begunkov

[permalink] [raw]
Subject: Re: [PATCH] io_uring: remove wait loop spurious wakeups

On 08/10/2019 06:16, Jens Axboe wrote:
> On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
>> From: Pavel Begunkov <[email protected]>
>>
>> Any changes interesting to tasks waiting in io_cqring_wait() are
>> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
>> also tries to do that but with no reason, that means spurious wakeups
>> every io_free_req() and io_uring_enter().
>>
>> Just use percpu_ref_put() instead.
>
> Looks good, this is a leftover from when the ctx teardown used
> the waitqueue as well.
>
BTW, is there a reason for ref-counting in struct io_kiocb? I understand
the idea behind submission reference, but don't see any actual part
needing it.

Tested with another ref-counting patch and got +5-8% to
nops performance.


--
Yours sincerely,
Pavel Begunkov


Attachments:
signature.asc (849.00 B)
OpenPGP digital signature

2019-10-08 17:04:37

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] io_uring: remove wait loop spurious wakeups

On 10/8/19 10:43 AM, Pavel Begunkov wrote:
> On 08/10/2019 06:16, Jens Axboe wrote:
>> On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
>>> From: Pavel Begunkov <[email protected]>
>>>
>>> Any changes interesting to tasks waiting in io_cqring_wait() are
>>> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
>>> also tries to do that but with no reason, that means spurious wakeups
>>> every io_free_req() and io_uring_enter().
>>>
>>> Just use percpu_ref_put() instead.
>>
>> Looks good, this is a leftover from when the ctx teardown used
>> the waitqueue as well.
>>
> BTW, is there a reason for ref-counting in struct io_kiocb? I understand
> the idea behind submission reference, but don't see any actual part
> needing it.

In short, it's to prevent the completion running before we're done with
the iocb on the submission side.

> Tested with another ref-counting patch and got +5-8% to
> nops performance.
>
>


--
Jens Axboe

2019-10-08 20:59:33

by Pavel Begunkov

[permalink] [raw]
Subject: Re: [PATCH] io_uring: remove wait loop spurious wakeups

On 08/10/2019 20:00, Jens Axboe wrote:
> On 10/8/19 10:43 AM, Pavel Begunkov wrote:
>> On 08/10/2019 06:16, Jens Axboe wrote:
>>> On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
>>>> From: Pavel Begunkov <[email protected]>
>>>>
>>>> Any changes interesting to tasks waiting in io_cqring_wait() are
>>>> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
>>>> also tries to do that but with no reason, that means spurious wakeups
>>>> every io_free_req() and io_uring_enter().
>>>>
>>>> Just use percpu_ref_put() instead.
>>>
>>> Looks good, this is a leftover from when the ctx teardown used
>>> the waitqueue as well.
>>>
>> BTW, is there a reason for ref-counting in struct io_kiocb? I understand
>> the idea behind submission reference, but don't see any actual part
>> needing it.
>
> In short, it's to prevent the completion running before we're done with
> the iocb on the submission side.

Yep, that's what I expected. Perhaps I missed something, but what I've
seen following code paths all the way down, it either
1. gets error / completes synchronously and then frees req locally
2. or passes it further (e.g. async list) and never accesses it after


--
Yours sincerely,
Pavel Begunkov


Attachments:
signature.asc (849.00 B)
OpenPGP digital signature

2019-10-08 21:23:28

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] io_uring: remove wait loop spurious wakeups

On 10/8/19 2:58 PM, Pavel Begunkov wrote:
> On 08/10/2019 20:00, Jens Axboe wrote:
>> On 10/8/19 10:43 AM, Pavel Begunkov wrote:
>>> On 08/10/2019 06:16, Jens Axboe wrote:
>>>> On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
>>>>> From: Pavel Begunkov <[email protected]>
>>>>>
>>>>> Any changes interesting to tasks waiting in io_cqring_wait() are
>>>>> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
>>>>> also tries to do that but with no reason, that means spurious wakeups
>>>>> every io_free_req() and io_uring_enter().
>>>>>
>>>>> Just use percpu_ref_put() instead.
>>>>
>>>> Looks good, this is a leftover from when the ctx teardown used
>>>> the waitqueue as well.
>>>>
>>> BTW, is there a reason for ref-counting in struct io_kiocb? I understand
>>> the idea behind submission reference, but don't see any actual part
>>> needing it.
>>
>> In short, it's to prevent the completion running before we're done with
>> the iocb on the submission side.
>
> Yep, that's what I expected. Perhaps I missed something, but what I've
> seen following code paths all the way down, it either
> 1. gets error / completes synchronously and then frees req locally
> 2. or passes it further (e.g. async list) and never accesses it after

As soon as the IO is passed on, it can complete. In fact, it can complete
even _before_ that call returns. That's the issue. Obviously this isn't
true for purely polled IO, but it is true for IRQ based IO.

--
Jens Axboe

2019-10-08 22:07:18

by Pavel Begunkov

[permalink] [raw]
Subject: Re: [PATCH] io_uring: remove wait loop spurious wakeups

On 09/10/2019 00:22, Jens Axboe wrote:
> On 10/8/19 2:58 PM, Pavel Begunkov wrote:
>> On 08/10/2019 20:00, Jens Axboe wrote:
>>> On 10/8/19 10:43 AM, Pavel Begunkov wrote:
>>>> On 08/10/2019 06:16, Jens Axboe wrote:
>>>>> On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
>>>>>> From: Pavel Begunkov <[email protected]>
>>>>>>
>>>>>> Any changes interesting to tasks waiting in io_cqring_wait() are
>>>>>> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
>>>>>> also tries to do that but with no reason, that means spurious wakeups
>>>>>> every io_free_req() and io_uring_enter().
>>>>>>
>>>>>> Just use percpu_ref_put() instead.
>>>>>
>>>>> Looks good, this is a leftover from when the ctx teardown used
>>>>> the waitqueue as well.
>>>>>
>>>> BTW, is there a reason for ref-counting in struct io_kiocb? I understand
>>>> the idea behind submission reference, but don't see any actual part
>>>> needing it.
>>>
>>> In short, it's to prevent the completion running before we're done with
>>> the iocb on the submission side.
>>
>> Yep, that's what I expected. Perhaps I missed something, but what I've
>> seen following code paths all the way down, it either
>> 1. gets error / completes synchronously and then frees req locally
>> 2. or passes it further (e.g. async list) and never accesses it after
>
> As soon as the IO is passed on, it can complete. In fact, it can complete
> even _before_ that call returns. That's the issue. Obviously this isn't
> true for purely polled IO, but it is true for IRQ based IO.

And the idea was to not use io_kiocb after submission. Except when we know,
that it won't complete asynchronously (e.g. error), that could be checked
with return code, I guess.

Anyway, thanks for the explanation!

--
Yours sincerely,
Pavel Begunkov


Attachments:
signature.asc (849.00 B)
OpenPGP digital signature

2019-10-09 02:55:21

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] io_uring: remove wait loop spurious wakeups

On 10/8/19 4:05 PM, Pavel Begunkov wrote:
> On 09/10/2019 00:22, Jens Axboe wrote:
>> On 10/8/19 2:58 PM, Pavel Begunkov wrote:
>>> On 08/10/2019 20:00, Jens Axboe wrote:
>>>> On 10/8/19 10:43 AM, Pavel Begunkov wrote:
>>>>> On 08/10/2019 06:16, Jens Axboe wrote:
>>>>>> On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
>>>>>>> From: Pavel Begunkov <[email protected]>
>>>>>>>
>>>>>>> Any changes interesting to tasks waiting in io_cqring_wait() are
>>>>>>> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
>>>>>>> also tries to do that but with no reason, that means spurious wakeups
>>>>>>> every io_free_req() and io_uring_enter().
>>>>>>>
>>>>>>> Just use percpu_ref_put() instead.
>>>>>>
>>>>>> Looks good, this is a leftover from when the ctx teardown used
>>>>>> the waitqueue as well.
>>>>>>
>>>>> BTW, is there a reason for ref-counting in struct io_kiocb? I understand
>>>>> the idea behind submission reference, but don't see any actual part
>>>>> needing it.
>>>>
>>>> In short, it's to prevent the completion running before we're done with
>>>> the iocb on the submission side.
>>>
>>> Yep, that's what I expected. Perhaps I missed something, but what I've
>>> seen following code paths all the way down, it either
>>> 1. gets error / completes synchronously and then frees req locally
>>> 2. or passes it further (e.g. async list) and never accesses it after
>>
>> As soon as the IO is passed on, it can complete. In fact, it can complete
>> even _before_ that call returns. That's the issue. Obviously this isn't
>> true for purely polled IO, but it is true for IRQ based IO.
>
> And the idea was to not use io_kiocb after submission. Except when we know,
> that it won't complete asynchronously (e.g. error), that could be checked
> with return code, I guess.

I think you're still missing the point. During the submission it can go
away, it can be deep in a call chain. So it's not enough to say "we
won't touch it after completion returns", we need to hold a reference to
ensure it doesn't go away WHILE being submitted.

Hope that helps!

--
Jens Axboe

2019-10-09 09:22:15

by Pavel Begunkov

[permalink] [raw]
Subject: Re: [PATCH] io_uring: remove wait loop spurious wakeups

On 10/9/2019 5:54 AM, Jens Axboe wrote:
>>>>>> BTW, is there a reason for ref-counting in struct io_kiocb? I understand
>>>>>> the idea behind submission reference, but don't see any actual part
>>>>>> needing it.
>>>>>
>>>>> In short, it's to prevent the completion running before we're done with
>>>>> the iocb on the submission side.
>>>>
>>>> Yep, that's what I expected. Perhaps I missed something, but what I've
>>>> seen following code paths all the way down, it either
>>>> 1. gets error / completes synchronously and then frees req locally
>>>> 2. or passes it further (e.g. async list) and never accesses it after
>>>
>>> As soon as the IO is passed on, it can complete. In fact, it can complete
>>> even _before_ that call returns. That's the issue. Obviously this isn't
>>> true for purely polled IO, but it is true for IRQ based IO.
>>
>> And the idea was to not use io_kiocb after submission. Except when we know,
>> that it won't complete asynchronously (e.g. error), that could be checked
>> with return code, I guess.
>
> I think you're still missing the point. During the submission it can go
> away, it can be deep in a call chain. So it's not enough to say "we
> won't touch it after completion returns", we need to hold a reference to
> ensure it doesn't go away WHILE being submitted.
>
> Hope that helps!

Now I get it, thanks Jens!

--
Yours sincerely,
Pavel Begunkov