From: Masayoshi Mizuma <[email protected]>
exit_aio() is sometimes stuck in wait_for_completion() after aio is issued
with direct IO and the task receives a signal.
That is because kioctx in mm->ioctx_table is in use by aio_kiocb.
aio_kiocb->ki_refcnt is 1 at that time. That means iocb_put() isn't
called correctly.
fuse_get_req() returns -EINTR when it is blocked and receives a signal.
fuse_direct_IO() treats the -EINTR as -EIOCBQUEUED and returns
-EIOCBQUEUED even though the aio isn't queued.
As a result, aio_rw_done() doesn't handle the error, so iocb_put() isn't
called via aio_complete_rw(), which is the callback.
The flow is roughly as follows:
  io_submit
    aio_get_req
      refcount_set(&req->ki_refcnt, 2)
    __io_submit_one
      aio_read
        ...
          fuse_direct_IO # returns -EIOCBQUEUED
            __fuse_direct_read
              ...
                fuse_get_req # returns -EINTR
        aio_rw_done
          # Nothing to do because ret is -EIOCBQUEUED...
    iocb_put
      refcount_dec_and_test(&iocb->ki_refcnt) # 2->1
Return the error code of fuse_direct_io() or __fuse_direct_read() from
fuse_direct_IO() so that aio_rw_done() can handle the error and call
iocb_put().
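The reference counting described above can be sketched as a small
user-space model. This is only an illustration of the flow as described
in this message, not kernel code: every model_* name is hypothetical,
MODEL_EIOCBQUEUED mirrors the kernel-internal EIOCBQUEUED value, and the
"patched" flag stands in for the change to fuse_direct_IO().

```c
#include <assert.h>
#include <errno.h>

/* Kernel-internal errno value, redefined here for the model only. */
#define MODEL_EIOCBQUEUED 529

/* Hypothetical stand-in for struct aio_kiocb; only the refcount matters. */
struct model_iocb {
	int refcnt;
};

/* Stand-in for iocb_put(): drop one reference. */
static void model_iocb_put(struct model_iocb *req)
{
	assert(req->refcnt > 0);
	req->refcnt--;
}

/* Stand-in for aio_complete_rw(): completion drops its own reference. */
static void model_complete(struct model_iocb *req)
{
	model_iocb_put(req);
}

/* Stand-in for aio_rw_done(): "queued" means completion comes later. */
static void model_rw_done(struct model_iocb *req, int ret)
{
	if (ret == -MODEL_EIOCBQUEUED)
		return;
	model_complete(req);
}

/*
 * Stand-in for the non-blocking path of fuse_direct_IO().
 * inner_ret models the result of fuse_direct_io()/__fuse_direct_read().
 */
static int model_direct_io(int inner_ret, int patched)
{
	if (patched && inner_ret < 0)
		return inner_ret;		/* propagate the error */
	return -MODEL_EIOCBQUEUED;		/* claim the aio is queued */
}

/* Stand-in for io_submit(): returns the refcount left after submission. */
int model_submit(int inner_ret, int patched)
{
	struct model_iocb req = { .refcnt = 2 };	/* as in aio_get_req */

	model_rw_done(&req, model_direct_io(inner_ret, patched));
	model_iocb_put(&req);			/* submitter's reference */
	return req.refcnt;
}
```

In this model, model_submit(-EINTR, 0) leaves the count at 1 because
nothing ever runs the completion, while model_submit(-EINTR, 1) drops it
to 0 because the error reaches model_rw_done().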
This issue is tracked as a virtio-fs issue:
https://gitlab.com/virtio-fs/qemu/issues/14
Signed-off-by: Masayoshi Mizuma <[email protected]>
---
fs/fuse/file.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index db48a5cf8620..87b151aec8f2 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -3115,8 +3115,12 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 		fuse_aio_complete(io, ret < 0 ? ret : 0, -1);
 
 		/* we have a non-extending, async request, so return */
-		if (!blocking)
-			return -EIOCBQUEUED;
+		if (!blocking) {
+			if (ret >= 0)
+				return -EIOCBQUEUED;
+			else
+				return ret;
+		}
 
 		wait_for_completion(&wait);
 		ret = fuse_get_res_by_io(io);
--
2.18.1
On 11/18/19 10:24 AM, Masayoshi Mizuma wrote:
> From: Masayoshi Mizuma <[email protected]>
>
> exit_aio() is sometimes stuck in wait_for_completion() after aio is issued
> with direct IO and the task receives a signal.
I couldn't reproduce this issue on kernel v5.4-rc7, but could on v5.4-rc8.
I verified that this patch fixes the case in issue 14 with v5.4-rc8 and
virtiofsd (virtio-fs-dev 5f068fa9).
Tested-by: Cao jin <[email protected]>
--
Sincerely,
Cao jin
On Mon, Nov 25, 2019 at 01:38:38PM +0100, Miklos Szeredi wrote:
> Hi,
>
> Thanks for the report.
>
> Can you please test the attached patch (without your patch)?
The patch you attached works well, thanks! I tested it with virtiofs.
Should I post the patch? Or could you take care of it? Let me know.
Thanks!
Masa
On Mon, Nov 18, 2019 at 3:24 AM Masayoshi Mizuma <[email protected]> wrote:
>
> From: Masayoshi Mizuma <[email protected]>
>
> exit_aio() is sometimes stuck in wait_for_completion() after aio is issued
> with direct IO and the task receives a signal.
Hi,
Thanks for the report.
Can you please test the attached patch (without your patch)?
Thanks,
Miklos