Subject: Re: [PATCH 12/12] io_uring: support true async buffered reads, if file provides it
To: Pavel Begunkov, io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20200523185755.8494-1-axboe@kernel.dk> <20200523185755.8494-13-axboe@kernel.dk> <8d429d6b-81ee-0a28-8533-2e1d4faa6b37@gmail.com>
From: Jens Axboe
Message-ID: <717e474a-5168-8e1e-2e02-c1bdff007bd9@kernel.dk>
Date: Mon, 25 May 2020 13:59:38 -0600
In-Reply-To: <8d429d6b-81ee-0a28-8533-2e1d4faa6b37@gmail.com>

On 5/25/20 1:29 AM, Pavel Begunkov wrote:
> On 23/05/2020 21:57, Jens Axboe wrote:
>> If the file is flagged with FMODE_BUF_RASYNC, then we don't have to punt
>> the buffered read to an io-wq worker. Instead we can rely on page
>> unlocking callbacks to support retry based async IO. This is a lot more
>> efficient than doing async thread offload.
>>
>> The retry is done similarly to how we handle poll based retry. From
>> the unlock callback, we simply queue the retry to a task_work based
>> handler.
>>
>> Signed-off-by: Jens Axboe
>> ---
>>  fs/io_uring.c | 99 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 99 insertions(+)
>>
> ...
>> +
>> +	init_task_work(&rw->task_work, io_async_buf_retry);
>> +	/* submit ref gets dropped, acquire a new one */
>> +	refcount_inc(&req->refs);
>> +	tsk = req->task;
>> +	ret = task_work_add(tsk, &rw->task_work, true);
>> +	if (unlikely(ret)) {
>> +		/* queue just for cancelation */
>> +		init_task_work(&rw->task_work, io_async_buf_cancel);
>> +		tsk = io_wq_get_task(req->ctx->io_wq);
>
> IIRC, task will be put somewhere around io_free_req(). Then shouldn't here be
> some juggling with reassigning req->task with task_{get,put}()?

Not sure I follow? Yes, we'll put this task again when the request is freed,
but I'm not sure what you mean by juggling?

>> +		task_work_add(tsk, &rw->task_work, true);
>> +	}
>> +	wake_up_process(tsk);
>> +	return 1;
>> +}
> ...
>>  static int io_read(struct io_kiocb *req, bool force_nonblock)
>>  {
>>  	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
>> @@ -2601,6 +2696,7 @@ static int io_read(struct io_kiocb *req, bool force_nonblock)
>>  	if (!ret) {
>>  		ssize_t ret2;
>>
>> +retry:
>>  		if (req->file->f_op->read_iter)
>>  			ret2 = call_read_iter(req->file, kiocb, &iter);
>>  		else
>> @@ -2619,6 +2715,9 @@ static int io_read(struct io_kiocb *req, bool force_nonblock)
>>  			if (!(req->flags & REQ_F_NOWAIT) &&
>>  			    !file_can_poll(req->file))
>>  				req->flags |= REQ_F_MUST_PUNT;
>> +			if (io_rw_should_retry(req))
>
> It looks like a state machine with IOCB_WAITQ and gotos. Wouldn't it be cleaner
> to call call_read_iter()/loop_rw_iter() here directly instead of "goto retry" ?

We could, probably making that part a separate helper then. How about the
below incremental?

> BTW, can this async stuff return -EAGAIN ?

Probably? I'd prefer not to make any definitive calls on that being possible
or not, as it's sure to disappoint.
If it does and IOCB_WAITQ is already set, then we'll punt to a thread like
before.

diff --git a/fs/io_uring.c b/fs/io_uring.c
index a5a4d9602915..669dccd81207 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2677,6 +2677,13 @@ static bool io_rw_should_retry(struct io_kiocb *req)
 	return false;
 }
 
+static int __io_read(struct io_kiocb *req, struct iov_iter *iter)
+{
+	if (req->file->f_op->read_iter)
+		return call_read_iter(req->file, &req->rw.kiocb, iter);
+	return loop_rw_iter(READ, req->file, &req->rw.kiocb, iter);
+}
+
 static int io_read(struct io_kiocb *req, bool force_nonblock)
 {
 	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
@@ -2710,11 +2717,7 @@ static int io_read(struct io_kiocb *req, bool force_nonblock)
 	if (!ret) {
 		ssize_t ret2;
 
-retry:
-		if (req->file->f_op->read_iter)
-			ret2 = call_read_iter(req->file, kiocb, &iter);
-		else
-			ret2 = loop_rw_iter(READ, req->file, kiocb, &iter);
+		ret2 = __io_read(req, &iter);
 
 		/* Catch -EAGAIN return for forced non-blocking submission */
 		if (!force_nonblock || ret2 != -EAGAIN) {
@@ -2729,8 +2732,11 @@ static int io_read(struct io_kiocb *req, bool force_nonblock)
 			if (!(req->flags & REQ_F_NOWAIT) &&
 			    !file_can_poll(req->file))
 				req->flags |= REQ_F_MUST_PUNT;
-			if (io_rw_should_retry(req))
-				goto retry;
+			if (io_rw_should_retry(req)) {
+				ret2 = __io_read(req, &iter);
+				if (ret2 != -EAGAIN)
+					goto out_free;
+			}
 			kiocb->ki_flags &= ~IOCB_WAITQ;
 			return -EAGAIN;
 		}

-- 
Jens Axboe
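
For readers who don't track the task_work API, a minimal, self-contained
sketch of the queue-and-wake pattern the quoted hunk relies on is below.
This is not io_uring code: retry_fn() and queue_retry_work() are made-up
names, and the example assumes the same task_work_add(task, work, notify)
signature (boolean notify argument) used in the patch above.

	#include <linux/sched.h>
	#include <linux/task_work.h>

	/*
	 * Runs later in the context of the target task (e.g. on its return
	 * to user space), where it is safe to block and take page locks.
	 */
	static void retry_fn(struct callback_head *cb)
	{
		/* re-drive the deferred work here */
	}

	/*
	 * Queue 'cb' to run in the context of 'tsk' and wake the task,
	 * mirroring the retry-queuing pattern in the quoted hunk.
	 */
	static int queue_retry_work(struct task_struct *tsk,
				    struct callback_head *cb)
	{
		int ret;

		init_task_work(cb, retry_fn);
		ret = task_work_add(tsk, cb, true);	/* true == notify the task */
		if (unlikely(ret))
			return ret;	/* task is exiting; caller must fall back */
		wake_up_process(tsk);	/* kick it in case it is sleeping */
		return 0;
	}

The fallback path in the patch handles exactly the failure case above: if
task_work_add() fails because the original task is exiting, the work is
re-queued to the io-wq task purely so it can be canceled.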