2020-09-03 14:55:45

by Jens Axboe

Subject: Re: [PATCH next] io_uring: fix task hung in io_uring_setup

On 9/3/20 7:21 AM, Hillf Danton wrote:
>
> The smart syzbot found the following issue:
>
> INFO: task syz-executor047:6853 blocked for more than 143 seconds.
> Not tainted 5.9.0-rc3-next-20200902-syzkaller #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:syz-executor047 state:D stack:28104 pid: 6853 ppid: 6847 flags:0x00004000
> Call Trace:
> context_switch kernel/sched/core.c:3777 [inline]
> __schedule+0xea9/0x2230 kernel/sched/core.c:4526
> schedule+0xd0/0x2a0 kernel/sched/core.c:4601
> schedule_timeout+0x1d8/0x250 kernel/time/timer.c:1855
> do_wait_for_common kernel/sched/completion.c:85 [inline]
> __wait_for_common kernel/sched/completion.c:106 [inline]
> wait_for_common kernel/sched/completion.c:117 [inline]
> wait_for_completion+0x163/0x260 kernel/sched/completion.c:138
> io_sq_thread_stop fs/io_uring.c:6906 [inline]
> io_finish_async fs/io_uring.c:6920 [inline]
> io_sq_offload_create fs/io_uring.c:7595 [inline]
> io_uring_create fs/io_uring.c:8671 [inline]
> io_uring_setup+0x1495/0x29a0 fs/io_uring.c:8744
> do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> because the sqo_thread kthread is created in io_sq_offload_create() without
> being woken up. Then, in the error branch of that function, we wait for
> the sqo kthread, which never runs. The fix is to wake it up before waiting.

Looks good - applied, thanks.

--
Jens Axboe
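
For readers following the thread: the patch itself is not shown here, so the
following is only a userspace sketch of the hang described in the quoted
report, not the kernel change. It assumes a pthread that stays parked until
it is explicitly started, as a stand-in for a kthread that has been created
but not yet woken. The names struct worker, worker_create(), worker_wake()
and worker_stop() are hypothetical and do not appear in fs/io_uring.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the SQ offload thread state. */
struct worker {
    pthread_t tid;
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool started; /* set by worker_wake(); loosely like wake_up_process() */
    bool done;    /* set by the thread before it exits; loosely like a completion */
};

/* Thread body: stays parked until started, then signals that it is done. */
void *worker_fn(void *arg)
{
    struct worker *w = arg;

    pthread_mutex_lock(&w->lock);
    while (!w->started)
        pthread_cond_wait(&w->cond, &w->lock);
    w->done = true;
    pthread_cond_broadcast(&w->cond);
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

/* Create the thread but leave it parked, as a created-but-not-woken kthread. */
int worker_create(struct worker *w)
{
    w->started = false;
    w->done = false;
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->cond, NULL);
    return pthread_create(&w->tid, NULL, worker_fn, w);
}

/* Let the parked thread run. Safe to call more than once. */
void worker_wake(struct worker *w)
{
    pthread_mutex_lock(&w->lock);
    w->started = true;
    pthread_cond_broadcast(&w->cond);
    pthread_mutex_unlock(&w->lock);
}

/* Stop path, also used on setup errors: wake first, then wait. */
void worker_stop(struct worker *w)
{
    /* Without this wake, a never-started worker never sets done,
     * and the wait below blocks forever. */
    worker_wake(w);

    pthread_mutex_lock(&w->lock);
    while (!w->done)
        pthread_cond_wait(&w->cond, &w->lock);
    pthread_mutex_unlock(&w->lock);

    pthread_join(w->tid, NULL);
    assert(w->done);
    pthread_cond_destroy(&w->cond);
    pthread_mutex_destroy(&w->lock);
}
```

Calling worker_create() and then worker_stop() straight away, as an error
path would, returns promptly. Dropping the worker_wake() call from
worker_stop() leaves the caller blocked in the wait loop, which is the
userspace analogue of the wait_for_completion() frame in the trace.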


2020-09-07 08:54:41

by Pavel Begunkov

Subject: Re: [PATCH next] io_uring: fix task hung in io_uring_setup

On 03/09/2020 17:04, Jens Axboe wrote:
> On 9/3/20 7:21 AM, Hillf Danton wrote:
>>
>> The smart syzbot found the following issue:
>>
>> INFO: task syz-executor047:6853 blocked for more than 143 seconds.
>> Not tainted 5.9.0-rc3-next-20200902-syzkaller #0
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:syz-executor047 state:D stack:28104 pid: 6853 ppid: 6847 flags:0x00004000
>> Call Trace:
>> context_switch kernel/sched/core.c:3777 [inline]
>> __schedule+0xea9/0x2230 kernel/sched/core.c:4526
>> schedule+0xd0/0x2a0 kernel/sched/core.c:4601
>> schedule_timeout+0x1d8/0x250 kernel/time/timer.c:1855
>> do_wait_for_common kernel/sched/completion.c:85 [inline]
>> __wait_for_common kernel/sched/completion.c:106 [inline]
>> wait_for_common kernel/sched/completion.c:117 [inline]
>> wait_for_completion+0x163/0x260 kernel/sched/completion.c:138
>> io_sq_thread_stop fs/io_uring.c:6906 [inline]
>> io_finish_async fs/io_uring.c:6920 [inline]
>> io_sq_offload_create fs/io_uring.c:7595 [inline]
>> io_uring_create fs/io_uring.c:8671 [inline]
>> io_uring_setup+0x1495/0x29a0 fs/io_uring.c:8744
>> do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
>> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>
>> because the sqo_thread kthread is created in io_sq_offload_create() without
>> being woken up. Then, in the error branch of that function, we wait for
>> the sqo kthread, which never runs. The fix is to wake it up before waiting.
>
> Looks good - applied, thanks.

BTW, I don't see the patch itself, and it isn't on the io_uring, block,
or fs mailing lists. Hillf, could you please CC the proper lists next time?

--
Pavel Begunkov

2020-09-07 12:58:40

by Jens Axboe

Subject: Re: [PATCH next] io_uring: fix task hung in io_uring_setup

On 9/7/20 2:50 AM, Pavel Begunkov wrote:
> On 03/09/2020 17:04, Jens Axboe wrote:
>> On 9/3/20 7:21 AM, Hillf Danton wrote:
>>>
>>> The smart syzbot found the following issue:
>>>
>>> INFO: task syz-executor047:6853 blocked for more than 143 seconds.
>>> Not tainted 5.9.0-rc3-next-20200902-syzkaller #0
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:syz-executor047 state:D stack:28104 pid: 6853 ppid: 6847 flags:0x00004000
>>> Call Trace:
>>> context_switch kernel/sched/core.c:3777 [inline]
>>> __schedule+0xea9/0x2230 kernel/sched/core.c:4526
>>> schedule+0xd0/0x2a0 kernel/sched/core.c:4601
>>> schedule_timeout+0x1d8/0x250 kernel/time/timer.c:1855
>>> do_wait_for_common kernel/sched/completion.c:85 [inline]
>>> __wait_for_common kernel/sched/completion.c:106 [inline]
>>> wait_for_common kernel/sched/completion.c:117 [inline]
>>> wait_for_completion+0x163/0x260 kernel/sched/completion.c:138
>>> io_sq_thread_stop fs/io_uring.c:6906 [inline]
>>> io_finish_async fs/io_uring.c:6920 [inline]
>>> io_sq_offload_create fs/io_uring.c:7595 [inline]
>>> io_uring_create fs/io_uring.c:8671 [inline]
>>> io_uring_setup+0x1495/0x29a0 fs/io_uring.c:8744
>>> do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
>>> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>>
>>> because the sqo_thread kthread is created in io_sq_offload_create() without
>>> being woken up. Then, in the error branch of that function, we wait for
>>> the sqo kthread, which never runs. The fix is to wake it up before waiting.
>>
>> Looks good - applied, thanks.
>
> BTW, I don't see the patch itself, and it isn't on the io_uring, block,
> or fs mailing lists. Hillf, could you please CC the proper lists next time?

He did, but I'm guessing that vger didn't like the email for whatever
reason. Hillf, did you get an error back from vger when sending the patch?

--
Jens Axboe