2020-06-22 23:35:25

by Pavel Begunkov

[permalink] [raw]
Subject: [PATCH 1/4] io_uring: fix hanging iopoll in case of -EAGAIN

io_do_iopoll() won't do anything with a request unless
req->iopoll_completed is set. So io_complete_rw_iopoll() has to set
it, otherwise io_do_iopoll() will poll a file again and again even
though the request of interest was completed long ago.

Fixes: bbde017a32b3 ("io_uring: add memory barrier to synchronize
io_kiocb's result and iopoll_completed")
Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index c686061c3762..bb0dfc450db5 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2104,10 +2104,8 @@ static void io_complete_rw_iopoll(struct kiocb *kiocb, long res, long res2)

WRITE_ONCE(req->result, res);
/* order with io_poll_complete() checking ->result */
- if (res != -EAGAIN) {
- smp_wmb();
- WRITE_ONCE(req->iopoll_completed, 1);
- }
+ smp_wmb();
+ WRITE_ONCE(req->iopoll_completed, 1);
}

/*
--
2.24.0


2020-06-23 02:14:09

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 1/4] io_uring: fix hanging iopoll in case of -EAGAIN

On 6/22/20 4:16 PM, Pavel Begunkov wrote:
> io_do_iopoll() won't do anything with a request unless
> req->iopoll_completed is set. So io_complete_rw_iopoll() has to set
> it, otherwise io_do_iopoll() will poll a file again and again even
> though the request of interest was completed long ago.

I need to look at this again, because with this change, I previously
got various use-after-free. I haven't seen any issues with it, but
I agree, from a quick look that I'm not quite sure how it's currently
not causing hangs. Yet I haven't seen any, with targeted -EAGAIN
testing.

--
Jens Axboe

2020-06-23 02:21:58

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 1/4] io_uring: fix hanging iopoll in case of -EAGAIN

On 6/22/20 8:07 PM, Jens Axboe wrote:
> On 6/22/20 4:16 PM, Pavel Begunkov wrote:
>> io_do_iopoll() won't do anything with a request unless
>> req->iopoll_completed is set. So io_complete_rw_iopoll() has to set
>> it, otherwise io_do_iopoll() will poll a file again and again even
>> though the request of interest was completed long ago.
>
> I need to look at this again, because with this change, I previously
> got various use-after-free. I haven't seen any issues with it, but
> I agree, from a quick look that I'm not quite sure how it's currently
> not causing hangs. Yet I haven't seen any, with targeted -EAGAIN
> testing.

Ah I think I know what it is - if we run into:

if (req->result == -EAGAIN)
return -EAGAIN

in io_issue_sqe() and race with it, we'll reissue twice potentially.
So the above isn't quite enough, we'll need something a bit broader.

--
Jens Axboe

2020-06-23 12:05:26

by Pavel Begunkov

[permalink] [raw]
Subject: Re: [PATCH 1/4] io_uring: fix hanging iopoll in case of -EAGAIN

On 23/06/2020 05:18, Jens Axboe wrote:
> On 6/22/20 8:07 PM, Jens Axboe wrote:
>> On 6/22/20 4:16 PM, Pavel Begunkov wrote:
>>> io_do_iopoll() won't do anything with a request unless
>>> req->iopoll_completed is set. So io_complete_rw_iopoll() has to set
>>> it, otherwise io_do_iopoll() will poll a file again and again even
>>> though the request of interest was completed long ago.
>>
>> I need to look at this again, because with this change, I previously
>> got various use-after-free. I haven't seen any issues with it, but
>> I agree, from a quick look that I'm not quite sure how it's currently
>> not causing hangs. Yet I haven't seen any, with targeted -EAGAIN
>> testing.

Can io_complete_rw_iopoll() get -EAGAIN after being successfully enqueued
(i.e. EIOCBQUEUED)? It's reliably fails for me, because my hacked nullblk
_can_ (i.e. probabilistically returns BLK_STS_AGAIN from ->iopoll()).

>
> Ah I think I know what it is - if we run into:
>
> if (req->result == -EAGAIN)
> return -EAGAIN
>
> in io_issue_sqe() and race with it, we'll reissue twice potentially.
> So the above isn't quite enough, we'll need something a bit broader.

I see, I'll deal with it.


--
Pavel Begunkov

2020-06-23 19:02:17

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 1/4] io_uring: fix hanging iopoll in case of -EAGAIN

On 6/23/20 5:57 AM, Pavel Begunkov wrote:
> On 23/06/2020 05:18, Jens Axboe wrote:
>> On 6/22/20 8:07 PM, Jens Axboe wrote:
>>> On 6/22/20 4:16 PM, Pavel Begunkov wrote:
>>>> io_do_iopoll() won't do anything with a request unless
>>>> req->iopoll_completed is set. So io_complete_rw_iopoll() has to set
>>>> it, otherwise io_do_iopoll() will poll a file again and again even
>>>> though the request of interest was completed long ago.
>>>
>>> I need to look at this again, because with this change, I previously
>>> got various use-after-free. I haven't seen any issues with it, but
>>> I agree, from a quick look that I'm not quite sure how it's currently
>>> not causing hangs. Yet I haven't seen any, with targeted -EAGAIN
>>> testing.
>
> Can io_complete_rw_iopoll() get -EAGAIN after being successfully enqueued
> (i.e. EIOCBQUEUED)? It's reliably fails for me, because my hacked nullblk
> _can_ (i.e. probabilistically returns BLK_STS_AGAIN from ->iopoll()).

Yes it can. The primary example would be a polled bio that gets split, into
let's say 4 bio's. First one queues fine, but one of the subsequent ones
run into request allocation failures and it gets marked as -EAGAIN.

>> Ah I think I know what it is - if we run into:
>>
>> if (req->result == -EAGAIN)
>> return -EAGAIN
>>
>> in io_issue_sqe() and race with it, we'll reissue twice potentially.
>> So the above isn't quite enough, we'll need something a bit broader.
>
> I see, I'll deal with it.

Thanks!

--
Jens Axboe

2020-06-24 16:59:51

by Pavel Begunkov

[permalink] [raw]
Subject: Re: [PATCH 1/4] io_uring: fix hanging iopoll in case of -EAGAIN

On 23/06/2020 22:01, Jens Axboe wrote:
> On 6/23/20 5:57 AM, Pavel Begunkov wrote:
>> On 23/06/2020 05:18, Jens Axboe wrote:
>>> On 6/22/20 8:07 PM, Jens Axboe wrote:
>>>> On 6/22/20 4:16 PM, Pavel Begunkov wrote:
>>>>> io_do_iopoll() won't do anything with a request unless
>>>>> req->iopoll_completed is set. So io_complete_rw_iopoll() has to set
>>>>> it, otherwise io_do_iopoll() will poll a file again and again even
>>>>> though the request of interest was completed long ago.
>>>>
>>>> I need to look at this again, because with this change, I previously
>>>> got various use-after-free. I haven't seen any issues with it, but
>>>> I agree, from a quick look that I'm not quite sure how it's currently
>>>> not causing hangs. Yet I haven't seen any, with targeted -EAGAIN
>>>> testing.
>>
>> Can io_complete_rw_iopoll() get -EAGAIN after being successfully enqueued
>> (i.e. EIOCBQUEUED)? It's reliably fails for me, because my hacked nullblk
>> _can_ (i.e. probabilistically returns BLK_STS_AGAIN from ->iopoll()).
>
> Yes it can. The primary example would be a polled bio that gets split, into
> let's say 4 bio's. First one queues fine, but one of the subsequent ones
> run into request allocation failures and it gets marked as -EAGAIN.

Right, thanks for the explanation. And that's the case where io_uring fails.
Now I tested all kinds of -EAGAIN to be sure.

--
Pavel Begunkov