2021-09-01 09:49:50

by Pavel Begunkov

Subject: Re: [RFC PATCH] io_uring: stop issue failed request to fix panic

On 9/1/21 10:39 AM, 王贇 wrote:
> We observed a panic:
> BUG: kernel NULL pointer dereference, address: 0000000000000028
> [skip]
> Oops: 0000 [#1] SMP PTI
> CPU: 1 PID: 737 Comm: a.out Not tainted 5.14.0+ #58
> Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> RIP: 0010:vfs_fadvise+0x1e/0x80
> [skip]
> Call Trace:
> ? tctx_task_work+0x111/0x2a0
> io_issue_sqe+0x524/0x1b90

Most likely it was fixed yesterday. Can you try?
https://git.kernel.dk/cgit/linux-block/log/?h=for-5.15/io_uring

Or these two patches in particular

https://git.kernel.dk/cgit/linux-block/commit/?h=for-5.15/io_uring&id=c6d3d9cbd659de8f2176b4e4721149c88ac096d4
https://git.kernel.dk/cgit/linux-block/commit/?h=for-5.15/io_uring&id=b8ce1b9d25ccf81e1bbabd45b963ed98b2222df8

> This is caused by io_wq_submit_work() calling io_issue_sqe()
> on a failed fadvise request: io_init_req() returned an error
> before initializing the file for it, which leads to the panic
> when vfs_fadvise() tries to access 'req->file'.
>
> This patch adds the missing check and handling for failed requests
> before calling io_issue_sqe().
>
> Signed-off-by: Michael Wang <[email protected]>
> ---
> fs/io_uring.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 6f35b12..bfec7bf 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -2214,7 +2214,8 @@ static void io_req_task_submit(struct io_kiocb *req, bool *locked)
>
> io_tw_lock(ctx, locked);
> /* req->task == current here, checking PF_EXITING is safe */
> - if (likely(!(req->task->flags & PF_EXITING)))
> + if (likely(!(req->task->flags & PF_EXITING) &&
> + !(req->flags & REQ_F_FAIL)))
> __io_queue_sqe(req);
> else
> io_req_complete_failed(req, -EFAULT);
> @@ -6704,7 +6705,10 @@ static void io_wq_submit_work(struct io_wq_work *work)
>
> if (!ret) {
> do {
> - ret = io_issue_sqe(req, 0);
> + if (likely(!(req->flags & REQ_F_FAIL)))
> + ret = io_issue_sqe(req, 0);
> + else
> + io_req_complete_failed(req, -EFAULT);
> /*
> * We can get EAGAIN for polled IO even though we're
> * forcing a sync submission from here, since we can't
>

--
Pavel Begunkov


2021-09-01 09:54:26

by 王贇

Subject: Re: [RFC PATCH] io_uring: stop issue failed request to fix panic



On 2021/9/1 5:47 PM, Pavel Begunkov wrote:
> On 9/1/21 10:39 AM, 王贇 wrote:
>> We observed a panic:
>> BUG: kernel NULL pointer dereference, address: 0000000000000028
>> [skip]
>> Oops: 0000 [#1] SMP PTI
>> CPU: 1 PID: 737 Comm: a.out Not tainted 5.14.0+ #58
>> Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>> RIP: 0010:vfs_fadvise+0x1e/0x80
>> [skip]
>> Call Trace:
>> ? tctx_task_work+0x111/0x2a0
>> io_issue_sqe+0x524/0x1b90
>
> Most likely it was fixed yesterday. Can you try?
> https://git.kernel.dk/cgit/linux-block/log/?h=for-5.15/io_uring
>
> Or these two patches in particular
>
> https://git.kernel.dk/cgit/linux-block/commit/?h=for-5.15/io_uring&id=c6d3d9cbd659de8f2176b4e4721149c88ac096d4
> https://git.kernel.dk/cgit/linux-block/commit/?h=for-5.15/io_uring&id=b8ce1b9d25ccf81e1bbabd45b963ed98b2222df8

Yup, it no longer panics :-)

Regards,
Michael Wang

>
>> This is caused by io_wq_submit_work() calling io_issue_sqe()
>> on a failed fadvise request: io_init_req() returned an error
>> before initializing the file for it, which leads to the panic
>> when vfs_fadvise() tries to access 'req->file'.
>>
>> This patch adds the missing check and handling for failed requests
>> before calling io_issue_sqe().
>>
>> Signed-off-by: Michael Wang <[email protected]>
>> ---
>> fs/io_uring.c | 8 ++++++--
>> 1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 6f35b12..bfec7bf 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -2214,7 +2214,8 @@ static void io_req_task_submit(struct io_kiocb *req, bool *locked)
>>
>> io_tw_lock(ctx, locked);
>> /* req->task == current here, checking PF_EXITING is safe */
>> - if (likely(!(req->task->flags & PF_EXITING)))
>> + if (likely(!(req->task->flags & PF_EXITING) &&
>> + !(req->flags & REQ_F_FAIL)))
>> __io_queue_sqe(req);
>> else
>> io_req_complete_failed(req, -EFAULT);
>> @@ -6704,7 +6705,10 @@ static void io_wq_submit_work(struct io_wq_work *work)
>>
>> if (!ret) {
>> do {
>> - ret = io_issue_sqe(req, 0);
>> + if (likely(!(req->flags & REQ_F_FAIL)))
>> + ret = io_issue_sqe(req, 0);
>> + else
>> + io_req_complete_failed(req, -EFAULT);
>> /*
>> * We can get EAGAIN for polled IO even though we're
>> * forcing a sync submission from here, since we can't
>>
>

2021-09-01 11:03:24

by Pavel Begunkov

Subject: Re: [RFC PATCH] io_uring: stop issue failed request to fix panic

On 9/1/21 10:52 AM, 王贇 wrote:
> On 2021/9/1 5:47 PM, Pavel Begunkov wrote:
>> On 9/1/21 10:39 AM, 王贇 wrote:
>>> We observed a panic:
>>> BUG: kernel NULL pointer dereference, address: 0000000000000028
>>> [skip]
>>> Oops: 0000 [#1] SMP PTI
>>> CPU: 1 PID: 737 Comm: a.out Not tainted 5.14.0+ #58
>>> Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>>> RIP: 0010:vfs_fadvise+0x1e/0x80
>>> [skip]
>>> Call Trace:
>>> ? tctx_task_work+0x111/0x2a0
>>> io_issue_sqe+0x524/0x1b90
>>
>> Most likely it was fixed yesterday. Can you try?
>> https://git.kernel.dk/cgit/linux-block/log/?h=for-5.15/io_uring
>>
>> Or these two patches in particular
>>
>> https://git.kernel.dk/cgit/linux-block/commit/?h=for-5.15/io_uring&id=c6d3d9cbd659de8f2176b4e4721149c88ac096d4
>> https://git.kernel.dk/cgit/linux-block/commit/?h=for-5.15/io_uring&id=b8ce1b9d25ccf81e1bbabd45b963ed98b2222df8
>
> Yup, it no longer panics :-)

awesome, thanks

>
> Regards,
> Michael Wang
>
>>
>>> This is caused by io_wq_submit_work() calling io_issue_sqe()
>>> on a failed fadvise request: io_init_req() returned an error
>>> before initializing the file for it, which leads to the panic
>>> when vfs_fadvise() tries to access 'req->file'.
>>>
>>> This patch adds the missing check and handling for failed requests
>>> before calling io_issue_sqe().
>>>
>>> Signed-off-by: Michael Wang <[email protected]>
>>> ---
>>> fs/io_uring.c | 8 ++++++--
>>> 1 file changed, 6 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>> index 6f35b12..bfec7bf 100644
>>> --- a/fs/io_uring.c
>>> +++ b/fs/io_uring.c
>>> @@ -2214,7 +2214,8 @@ static void io_req_task_submit(struct io_kiocb *req, bool *locked)
>>>
>>> io_tw_lock(ctx, locked);
>>> /* req->task == current here, checking PF_EXITING is safe */
>>> - if (likely(!(req->task->flags & PF_EXITING)))
>>> + if (likely(!(req->task->flags & PF_EXITING) &&
>>> + !(req->flags & REQ_F_FAIL)))
>>> __io_queue_sqe(req);
>>> else
>>> io_req_complete_failed(req, -EFAULT);
>>> @@ -6704,7 +6705,10 @@ static void io_wq_submit_work(struct io_wq_work *work)
>>>
>>> if (!ret) {
>>> do {
>>> - ret = io_issue_sqe(req, 0);
>>> + if (likely(!(req->flags & REQ_F_FAIL)))
>>> + ret = io_issue_sqe(req, 0);
>>> + else
>>> + io_req_complete_failed(req, -EFAULT);
>>> /*
>>> * We can get EAGAIN for polled IO even though we're
>>> * forcing a sync submission from here, since we can't
>>>
>>

--
Pavel Begunkov