2020-05-01 14:12:41

by Pavel Begunkov

[permalink] [raw]
Subject: [PATCH 0/3] small fixes for-5.7

I split the sent patchset, please consider this part for current.

I'll send a test for [1] in a day or so.

Regarding [3], Jens, I haven't looked properly yet, how long
splice can wait on a inode mutex, but it can be problematic,
especially for latencies. How about go safe for-5.7, and
maybe think something out for next?


Pavel Begunkov (3):
io_uring: fix extra put in sync_file_range()
io_uring: check non-sync defer_list carefully
io_uring: punt splice async because of inode mtx

fs/io_uring.c | 20 ++++----------------
1 file changed, 4 insertions(+), 16 deletions(-)

--
2.24.0


2020-05-01 14:12:42

by Pavel Begunkov

[permalink] [raw]
Subject: [PATCH 1/3] io_uring: fix extra put in sync_file_range()

[ 40.179474] refcount_t: underflow; use-after-free.
[ 40.179499] WARNING: CPU: 6 PID: 1848 at lib/refcount.c:28 refcount_warn_saturate+0xae/0xf0
...
[ 40.179612] RIP: 0010:refcount_warn_saturate+0xae/0xf0
[ 40.179617] Code: 28 44 0a 01 01 e8 d7 01 c2 ff 0f 0b 5d c3 80 3d 15 44 0a 01 00 75 91 48 c7 c7 b8 f5 75 be c6 05 05 44 0a 01 01 e8 b7 01 c2 ff <0f> 0b 5d c3 80 3d f3 43 0a 01 00 0f 85 6d ff ff ff 48 c7 c7 10 f6
[ 40.179619] RSP: 0018:ffffb252423ebe18 EFLAGS: 00010286
[ 40.179623] RAX: 0000000000000000 RBX: ffff98d65e929400 RCX: 0000000000000000
[ 40.179625] RDX: 0000000000000001 RSI: 0000000000000086 RDI: 00000000ffffffff
[ 40.179627] RBP: ffffb252423ebe18 R08: 0000000000000001 R09: 000000000000055d
[ 40.179629] R10: 0000000000000c8c R11: 0000000000000001 R12: 0000000000000000
[ 40.179631] R13: ffff98d68c434400 R14: ffff98d6a9cbaa20 R15: ffff98d6a609ccb8
[ 40.179634] FS: 0000000000000000(0000) GS:ffff98d6af580000(0000) knlGS:0000000000000000
[ 40.179636] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 40.179638] CR2: 00000000033e3194 CR3: 000000006480a003 CR4: 00000000003606e0
[ 40.179641] Call Trace:
[ 40.179652] io_put_req+0x36/0x40
[ 40.179657] io_free_work+0x15/0x20
[ 40.179661] io_worker_handle_work+0x2f5/0x480
[ 40.179667] io_wqe_worker+0x2a9/0x360
[ 40.179674] ? _raw_spin_unlock_irqrestore+0x24/0x40
[ 40.179681] kthread+0x12c/0x170
[ 40.179685] ? io_worker_handle_work+0x480/0x480
[ 40.179690] ? kthread_park+0x90/0x90
[ 40.179695] ret_from_fork+0x35/0x40
[ 40.179702] ---[ end trace 85027405f00110aa ]---

Opcode handler must never put submission ref, but that's what
io_sync_file_range_finish() do. use io_steal_work() there.

Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 75ff4cc9818b..e7d7e83e3d56 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3515,7 +3515,7 @@ static void io_sync_file_range_finish(struct io_wq_work **workptr)
if (io_req_cancelled(req))
return;
__io_sync_file_range(req);
- io_put_req(req); /* put submission ref */
+ io_steal_work(req, workptr);
}

static int io_sync_file_range(struct io_kiocb *req, bool force_nonblock)
--
2.24.0

2020-05-01 14:12:46

by Pavel Begunkov

[permalink] [raw]
Subject: [PATCH 2/3] io_uring: check non-sync defer_list carefully

io_req_defer() do double-checked locking. Use proper helpers for that,
i.e. list_empty_careful().

Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index e7d7e83e3d56..e5d560f2ce12 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5028,7 +5028,7 @@ static int io_req_defer(struct io_kiocb *req, const struct io_uring_sqe *sqe)
int ret;

/* Still need defer if there is pending req in defer list. */
- if (!req_need_defer(req) && list_empty(&ctx->defer_list))
+ if (!req_need_defer(req) && list_empty_careful(&ctx->defer_list))
return 0;

if (!req->io && io_alloc_async_ctx(req))
--
2.24.0

2020-05-01 14:13:09

by Pavel Begunkov

[permalink] [raw]
Subject: [PATCH 3/3] io_uring: punt splice async because of inode mtx

Nonblocking do_splice() still may wait for some time on an inode mutex.
Let's play safe and always punt it async.

Reported-by: Jens Axboe <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
---
fs/io_uring.c | 16 ++--------------
1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index e5d560f2ce12..65458eda2127 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2765,15 +2765,6 @@ static int io_splice_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
return 0;
}

-static bool io_splice_punt(struct file *file, int rw)
-{
- if (get_pipe_info(file))
- return false;
- if (!io_file_supports_async(file, rw))
- return true;
- return !(file->f_flags & O_NONBLOCK);
-}
-
static int io_splice(struct io_kiocb *req, bool force_nonblock)
{
struct io_splice *sp = &req->splice;
@@ -2783,11 +2774,8 @@ static int io_splice(struct io_kiocb *req, bool force_nonblock)
loff_t *poff_in, *poff_out;
long ret;

- if (force_nonblock) {
- if (io_splice_punt(in, READ) || io_splice_punt(out, WRITE))
- return -EAGAIN;
- flags |= SPLICE_F_NONBLOCK;
- }
+ if (force_nonblock)
+ return -EAGAIN;

poff_in = (sp->off_in == -1) ? NULL : &sp->off_in;
poff_out = (sp->off_out == -1) ? NULL : &sp->off_out;
--
2.24.0

2020-05-01 15:09:52

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 0/3] small fixes for-5.7

On 5/1/20 8:09 AM, Pavel Begunkov wrote:
> I split the sent patchset, please consider this part for current.
>
> I'll send a test for [1] in a day or so.
>
> Regarding [3], Jens, I haven't looked properly yet, how long
> splice can wait on a inode mutex, but it can be problematic,
> especially for latencies. How about go safe for-5.7, and
> maybe think something out for next?

Agree, let's play it safe for 5.7, and look further into expanding
this for 5.8 so we don't always have to go async.

Series looks good to me, applied. Thanks Pavel!

--
Jens Axboe