2023-03-20 21:06:57

by syzbot

[permalink] [raw]
Subject: [syzbot] [fs?] INFO: task hung in eventpoll_release_file

Hello,

syzbot found the following issue on:

HEAD commit: 6f72958a49f6 Add linux-next specific files for 20230316
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=178faee6c80000
kernel config: https://syzkaller.appspot.com/x/.config?x=dd64232230d3a0d8
dashboard link: https://syzkaller.appspot.com/bug?extid=e6dab35a08df7f7aa260
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1156a6ecc80000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=179e9648c80000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/3fcac83e3d53/disk-6f72958a.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/37b457d3103a/vmlinux-6f72958a.xz
kernel image: https://storage.googleapis.com/syzbot-assets/b82735625cce/bzImage-6f72958a.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

INFO: task syz-executor397:5874 blocked for more than 143 seconds.
Not tainted 6.3.0-rc2-next-20230316-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor397 state:D stack:27072 pid:5874 ppid:5147 flags:0x00004004
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5305 [inline]
__schedule+0x1d23/0x5650 kernel/sched/core.c:6623
schedule+0xde/0x1a0 kernel/sched/core.c:6699
schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:6758
__mutex_lock_common kernel/locking/mutex.c:679 [inline]
__mutex_lock+0xa3b/0x1350 kernel/locking/mutex.c:747
eventpoll_release_file+0xe2/0x1d0 fs/eventpoll.c:962
eventpoll_release include/linux/eventpoll.h:53 [inline]
__fput+0x7da/0xa90 fs/file_table.c:312
task_work_run+0x16f/0x270 kernel/task_work.c:179
resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
exit_to_user_mode_prepare+0x210/0x240 kernel/entry/common.c:204
__syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:297
do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f0183a1b0b9
RSP: 002b:00007f01831bf208 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
RAX: 0000000000000116 RBX: 00007f0183a9d4a8 RCX: 00007f0183a1b0b9
RDX: 0000000000000318 RSI: 00000000200bd000 RDI: 0000000000000004
RBP: 00007f0183a9d4a0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f0183a9d4ac
R13: 00007fff8a59281f R14: 00007f01831bf300 R15: 0000000000022000
</TASK>

Showing all locks held in the system:
1 lock held by rcu_tasks_kthre/13:
#0: ffffffff8c795830 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x31/0xd80 kernel/rcu/tasks.h:516
1 lock held by rcu_tasks_trace/14:
#0: ffffffff8c795530 (rcu_tasks_trace.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x31/0xd80 kernel/rcu/tasks.h:516
1 lock held by khungtaskd/28:
#0: ffffffff8c796400 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x51/0x390 kernel/locking/lockdep.c:6545
2 locks held by getty/4761:
#0: ffff88814a34b098 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x26/0x80 drivers/tty/tty_ldisc.c:243
#1: ffffc900015902f0 (&ldata->atomic_read_lock){+.+.}-{3:3}, at: n_tty_read+0xef4/0x13e0 drivers/tty/n_tty.c:2177
1 lock held by syz-executor397/5873:
1 lock held by syz-executor397/5874:
#0: ffff88801e3d5c68 (&ep->mtx){+.+.}-{3:3}, at: eventpoll_release_file+0xe2/0x1d0 fs/eventpoll.c:962

=============================================

NMI backtrace for cpu 0
CPU: 0 PID: 28 Comm: khungtaskd Not tainted 6.3.0-rc2-next-20230316-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106
nmi_cpu_backtrace+0x29c/0x350 lib/nmi_backtrace.c:113
nmi_trigger_cpumask_backtrace+0x2a4/0x300 lib/nmi_backtrace.c:62
trigger_all_cpu_backtrace include/linux/nmi.h:148 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:222 [inline]
watchdog+0xe16/0x1090 kernel/hung_task.c:379
kthread+0x33e/0x440 kernel/kthread.c:379
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
</TASK>
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 5873 Comm: syz-executor397 Not tainted 6.3.0-rc2-next-20230316-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023
RIP: 0010:arch_atomic_read arch/x86/include/asm/atomic.h:29 [inline]
RIP: 0010:rcu_dynticks_curr_cpu_in_eqs include/linux/context_tracking.h:121 [inline]
RIP: 0010:rcu_is_watching+0x47/0xb0 kernel/rcu/tree.c:695
Code: c5 77 7a 48 8d 3c ed 00 fa 10 8c 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 75 54 48 03 1c ed 00 fa 10 8c <48> b8 00 00 00 00 00 fc ff df 48 89 da 48 c1 ea 03 0f b6 14 02 48
RSP: 0018:ffffc90004c9fc58 EFLAGS: 00000282
RAX: dffffc0000000000 RBX: ffff8880b9936c68 RCX: ffffffff8164cf65
RDX: 1ffffffff1821f41 RSI: 0000000000000002 RDI: ffffffff8c10fa08
RBP: 0000000000000001 R08: 0000000000000000 R09: ffffffff8e784717
R10: fffffbfff1cf08e2 R11: 0000000000000000 R12: ffff8880772cf1c8
R13: ffff88801e3d5c00 R14: ffff8880772cf180 R15: 0000000000000000
FS: 0000555556477400(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005642a1c9c5c0 CR3: 0000000025100000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
trace_lock_release include/trace/events/lock.h:69 [inline]
lock_release+0x4fb/0x670 kernel/locking/lockdep.c:5702
__raw_spin_unlock include/linux/spinlock_api_smp.h:141 [inline]
_raw_spin_unlock+0x16/0x40 kernel/locking/spinlock.c:186
spin_unlock include/linux/spinlock.h:390 [inline]
__ep_remove+0x6ce/0xa50 fs/eventpoll.c:737
ep_remove_safe fs/eventpoll.c:782 [inline]
ep_clear_and_put+0x21c/0x3b0 fs/eventpoll.c:815
ep_eventpoll_release+0x45/0x60 fs/eventpoll.c:831
__fput+0x27c/0xa90 fs/file_table.c:321
task_work_run+0x16f/0x270 kernel/task_work.c:179
resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
exit_to_user_mode_prepare+0x210/0x240 kernel/entry/common.c:204
__syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:297
do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f01839d3dfb
Code: 0f 05 48 3d 00 f0 ff ff 77 45 c3 0f 1f 40 00 48 83 ec 18 89 7c 24 0c e8 63 fc ff ff 8b 7c 24 0c 41 89 c0 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 89 44 24 0c e8 a1 fc ff ff 8b 44
RSP: 002b:00007fff8a592880 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000006 RCX: 00007f01839d3dfb
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000005
RBP: 0000000000152fad R08: 0000000000000000 R09: 00007fff8a5a7080
R10: 0000000000000000 R11: 0000000000000293 R12: 00007f0183a9d4bc
R13: 00007fff8a592900 R14: 00007fff8a5929a0 R15: 00007f0183a9d4a0
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches


2023-03-21 06:42:33

by syzbot

[permalink] [raw]
Subject: Re: [syzbot] [fs?] INFO: task hung in eventpoll_release_file

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: [email protected]

Tested on:

commit: 6f72958a Add linux-next specific files for 20230316
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=1719c87ac80000
kernel config: https://syzkaller.appspot.com/x/.config?x=dd64232230d3a0d8
dashboard link: https://syzkaller.appspot.com/bug?extid=e6dab35a08df7f7aa260
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=178213a6c80000

Note: testing is done by a robot and is best-effort only.

2023-04-30 06:35:18

by syzbot

[permalink] [raw]
Subject: Re: [syzbot] [fs?] INFO: task hung in eventpoll_release_file

> #syz set subsystem: fs

The specified label "subsystem" is unknown.
Please use one of the supported labels.

The following labels are suported:
no-reminders, prio: {low, normal, high}, subsystems: {.. see below ..}
The list of subsystems: https://syzkaller.appspot.com/upstream/subsystems?all=true

>
> This somehow got tagged with the ext4 label, and not the fs label.
> (And this is not the first one I've noticed). I'm beginning to
> suspect there may have been some syzbot database hiccup? Anyway,
> fixing...
>
> - Ted

2023-04-30 06:51:30

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [syzbot] [fs?] INFO: task hung in eventpoll_release_file

#syz set subsystem: fs

This somehow got tagged with the ext4 label, and not the fs label.
(And this is not the first one I've noticed). I'm beginning to
suspect there may have been some syzbot database hiccup? Anyway,
fixing...

- Ted

2023-04-30 16:18:28

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [syzbot] [fs?] INFO: task hung in eventpoll_release_file

#syz set subsystems: fs

This somehow got tagged with the ext4 label, and not the fs label.
(And this is not the first one I've noticed). I'm beginning to
suspect there may have been some syzbot database hiccup? Anyway,
fixing...

- Ted

2023-05-02 20:09:38

by Aleksandr Nogikh

[permalink] [raw]
Subject: Re: [syzbot] [fs?] INFO: task hung in eventpoll_release_file

Hi Ted,

On Sun, Apr 30, 2023 at 8:34 AM Theodore Ts'o <[email protected]> wrote:
>
> #syz set subsystem: fs
>
> This somehow got tagged with the ext4 label, and not the fs label.
> (And this is not the first one I've noticed). I'm beginning to
> suspect there may have been some syzbot database hiccup? Anyway,
> fixing...

FWIW one of this bug's crashes was attributed to ext4 [1] and syzbot's
logic in this case was to prefer a more specific subsystem (ext4) to a
more generic one (fs), even if it's not mentioned in the majority of
crashes.

[1] https://syzkaller.appspot.com/text?tag=CrashReport&x=171abfaac80000

--
Aleksandr
>
> - Ted
>
> --

2023-05-03 05:08:05

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [syzbot] [fs?] INFO: task hung in eventpoll_release_file

On Tue, May 02, 2023 at 10:08:44PM +0200, Aleksandr Nogikh wrote:
> Hi Ted,
>
> On Sun, Apr 30, 2023 at 8:34 AM Theodore Ts'o <[email protected]> wrote:
> >
> > #syz set subsystem: fs
> >
> > This somehow got tagged with the ext4 label, and not the fs label.
> > (And this is not the first one I've noticed). I'm beginning to
> > suspect there may have been some syzbot database hiccup? Anyway,
> > fixing...
>
> FWIW one of this bug's crashes was attributed to ext4 [1] and syzbot's
> logic in this case was to prefer a more specific subsystem (ext4) to a
> more generic one (fs), even if it's not mentioned in the majority of
> crashes.
>
> [1] https://syzkaller.appspot.com/text?tag=CrashReport&x=171abfaac80000

One of the challenges is that the attribution is not necessasrily
accurate. One of the CPU's was running an ext4 workqueue task (which
was apparntly making forward progress) at the time of the crash.

It should also be noted that apparently there is a potential patch
which seems to fix the problem, and it's solely in the fs/eventpoll.c.
Unfortunately, it was not in the lore.kernel.org archives, since
apparently it wasn't cc'ed there. It's in the syzkaller-bugs Google
Groups archive, though, since Pauolo Abeni cc'ed the
[email protected], but not the lore archive, on his
test:

https://groups.google.com/g/syzkaller-bugs/c/oiBUmGsqz_Q/m/Xi5iOeJNAgAJ

- Ted