2022-10-24 10:23:16

by syzbot

[permalink] [raw]
Subject: [syzbot] INFO: task hung in nilfs_detach_log_writer

Hello,

syzbot found the following issue on:

HEAD commit: bbed346d5a96 Merge branch 'for-next/core' into for-kernelci
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
console output: https://syzkaller.appspot.com/x/log.txt?x=12c63b3c880000
kernel config: https://syzkaller.appspot.com/x/.config?x=3a4a45d2d827c1e
dashboard link: https://syzkaller.appspot.com/bug?extid=e3973c409251e136fdd0
compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
userspace arch: arm64
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13010d3c880000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16c58e6e880000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/e8e91bc79312/disk-bbed346d.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/c1cb3fb3b77e/vmlinux-bbed346d.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/ef6217a5c5b6/mount_0.gz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

INFO: task syz-executor217:3091 blocked for more than 143 seconds.
Not tainted 6.0.0-rc7-syzkaller-18095-gbbed346d5a96 #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor217 state:D stack: 0 pid: 3091 ppid: 3089 flags:0x00000008
Call trace:
__switch_to+0x180/0x298 arch/arm64/kernel/process.c:557
context_switch kernel/sched/core.c:5182 [inline]
__schedule+0x414/0x5a0 kernel/sched/core.c:6494
schedule+0x64/0xa4 kernel/sched/core.c:6570
schedule_timeout+0x64/0x1b4 kernel/time/timer.c:1911
do_wait_for_common+0xf4/0x184 kernel/sched/completion.c:85
__wait_for_common kernel/sched/completion.c:106 [inline]
wait_for_common kernel/sched/completion.c:117 [inline]
wait_for_completion+0x48/0x60 kernel/sched/completion.c:138
__flush_work+0xd4/0x144 kernel/workqueue.c:3073
flush_work+0x24/0x38 kernel/workqueue.c:3094
nilfs_segctor_destroy fs/nilfs2/segment.c:2726 [inline]
nilfs_detach_log_writer+0x1e0/0x4d0 fs/nilfs2/segment.c:2810
nilfs_put_super+0x28/0x9c fs/nilfs2/super.c:468
generic_shutdown_super+0x8c/0x190 fs/super.c:491
kill_block_super+0x30/0x78 fs/super.c:1427
deactivate_locked_super+0x70/0xe8 fs/super.c:332
deactivate_super+0xd0/0xd4 fs/super.c:363
cleanup_mnt+0x1f8/0x234 fs/namespace.c:1186
__cleanup_mnt+0x20/0x30 fs/namespace.c:1193
task_work_run+0xc4/0x14c kernel/task_work.c:177
resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
do_notify_resume+0x174/0x1f0 arch/arm64/kernel/signal.c:1127
prepare_exit_to_user_mode arch/arm64/kernel/entry-common.c:137 [inline]
exit_to_user_mode arch/arm64/kernel/entry-common.c:142 [inline]
el0_svc+0x9c/0x150 arch/arm64/kernel/entry-common.c:637
el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:654
el0t_64_sync+0x18c/0x190 arch/arm64/kernel/entry.S:581

Showing all locks held in the system:
1 lock held by rcu_tasks_kthre/11:
#0: ffff80000d433568 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x3c/0x450 kernel/rcu/tasks.h:507
1 lock held by rcu_tasks_trace/12:
#0: ffff80000d433bb8 (rcu_tasks_trace.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x3c/0x450 kernel/rcu/tasks.h:507
1 lock held by khungtaskd/27:
#0: ffff80000d433440 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x4/0x48 include/linux/rcupdate.h:279
2 locks held by getty/2727:
#0: ffff0000c7906098 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x28/0x58 drivers/tty/tty_ldisc.c:244
#1: ffff80000f6162f0 (&ldata->atomic_read_lock){+.+.}-{3:3}, at: n_tty_read+0x19c/0x89c drivers/tty/n_tty.c:2177
1 lock held by syz-executor217/3091:
#0: ffff0000c90930e0 (&type->s_umount_key#41){+.+.}-{3:3}, at: deactivate_super+0xc8/0xd4 fs/super.c:362
3 locks held by kworker/1:0/3113:
#0: ffff0000c0010738 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x270/0x504 kernel/workqueue.c:2262
#1: ffff800012703d80 ((work_completion)(&sci->sc_iput_work)){+.+.}-{0:0}, at: process_one_work+0x29c/0x504 kernel/workqueue.c:2264
#2: ffff0000c9093650 (sb_internal#2){.+.+}-{0:0}, at: nilfs_evict_inode+0x94/0x1cc fs/nilfs2/inode.c:911

=============================================



---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches


2024-05-20 13:35:32

by Ryusuke Konishi

[permalink] [raw]
Subject: [PATCH 0/3] nilfs2: fix log writer related issues

Hello Andrew,

Please apply this series as bug fixes.

This bug fix series covers three nilfs2 log writer-related issues,
including a timer use-after-free issue and potential deadlock issue on
unmount, and a potential freeze issue in event synchronization found
during their analysis. Details are described in each commit log.

thank you always.

Ryusuke Konishi

Ryusuke Konishi (3):
nilfs2: fix use-after-free of timer for log writer thread
nilfs2: fix unexpected freezing of nilfs_segctor_sync()
nilfs2: fix potential hang in nilfs_detach_log_writer()

fs/nilfs2/segment.c | 63 +++++++++++++++++++++++++++++++++++----------
1 file changed, 50 insertions(+), 13 deletions(-)

--
2.34.1


2024-05-20 13:35:47

by Ryusuke Konishi

[permalink] [raw]
Subject: [PATCH 1/3] nilfs2: fix use-after-free of timer for log writer thread

A use-after-free issue has been reported regarding the timer sc_timer on
the nilfs_sc_info structure.

The problem is that even though it is used to wake up a sleeping log
writer thread, sc_timer is not shut down until the nilfs_sc_info
structure is about to be freed, and is used regardless of the thread's
lifetime.

Fix this issue by limiting the use of sc_timer only while the log writer
thread is alive.

Reported-by: "Bai, Shuangpeng" <[email protected]>
Closes: https://groups.google.com/g/syzkaller/c/MK_LYqtt8ko/m/8rgdWeseAwAJ
Signed-off-by: Ryusuke Konishi <[email protected]>
Fixes: fdce895ea5dd ("nilfs2: change sc_timer from a pointer to an embedded one in struct nilfs_sc_info")
Tested-by: Ryusuke Konishi <[email protected]>
Cc: [email protected]
---
fs/nilfs2/segment.c | 25 +++++++++++++++++++------
1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index 8654ab8ad534..4e274bc8eb79 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -2118,8 +2118,10 @@ static void nilfs_segctor_start_timer(struct nilfs_sc_info *sci)
{
spin_lock(&sci->sc_state_lock);
if (!(sci->sc_state & NILFS_SEGCTOR_COMMIT)) {
- sci->sc_timer.expires = jiffies + sci->sc_interval;
- add_timer(&sci->sc_timer);
+ if (sci->sc_task) {
+ sci->sc_timer.expires = jiffies + sci->sc_interval;
+ add_timer(&sci->sc_timer);
+ }
sci->sc_state |= NILFS_SEGCTOR_COMMIT;
}
spin_unlock(&sci->sc_state_lock);
@@ -2320,10 +2322,21 @@ int nilfs_construct_dsync_segment(struct super_block *sb, struct inode *inode,
*/
static void nilfs_segctor_accept(struct nilfs_sc_info *sci)
{
+ bool thread_is_alive;
+
spin_lock(&sci->sc_state_lock);
sci->sc_seq_accepted = sci->sc_seq_request;
+ thread_is_alive = (bool)sci->sc_task;
spin_unlock(&sci->sc_state_lock);
- del_timer_sync(&sci->sc_timer);
+
+ /*
+ * This function does not race with the log writer thread's
+ * termination. Therefore, deleting sc_timer, which should not be
+ * done after the log writer thread exits, can be done safely outside
+ * the area protected by sc_state_lock.
+ */
+ if (thread_is_alive)
+ del_timer_sync(&sci->sc_timer);
}

/**
@@ -2349,7 +2362,7 @@ static void nilfs_segctor_notify(struct nilfs_sc_info *sci, int mode, int err)
sci->sc_flush_request &= ~FLUSH_DAT_BIT;

/* re-enable timer if checkpoint creation was not done */
- if ((sci->sc_state & NILFS_SEGCTOR_COMMIT) &&
+ if ((sci->sc_state & NILFS_SEGCTOR_COMMIT) && sci->sc_task &&
time_before(jiffies, sci->sc_timer.expires))
add_timer(&sci->sc_timer);
}
@@ -2539,6 +2552,7 @@ static int nilfs_segctor_thread(void *arg)
int timeout = 0;

sci->sc_timer_task = current;
+ timer_setup(&sci->sc_timer, nilfs_construction_timeout, 0);

/* start sync. */
sci->sc_task = current;
@@ -2606,6 +2620,7 @@ static int nilfs_segctor_thread(void *arg)
end_thread:
/* end sync. */
sci->sc_task = NULL;
+ timer_shutdown_sync(&sci->sc_timer);
wake_up(&sci->sc_wait_task); /* for nilfs_segctor_kill_thread() */
spin_unlock(&sci->sc_state_lock);
return 0;
@@ -2669,7 +2684,6 @@ static struct nilfs_sc_info *nilfs_segctor_new(struct super_block *sb,
INIT_LIST_HEAD(&sci->sc_gc_inodes);
INIT_LIST_HEAD(&sci->sc_iput_queue);
INIT_WORK(&sci->sc_iput_work, nilfs_iput_work_func);
- timer_setup(&sci->sc_timer, nilfs_construction_timeout, 0);

sci->sc_interval = HZ * NILFS_SC_DEFAULT_TIMEOUT;
sci->sc_mjcp_freq = HZ * NILFS_SC_DEFAULT_SR_FREQ;
@@ -2748,7 +2762,6 @@ static void nilfs_segctor_destroy(struct nilfs_sc_info *sci)

down_write(&nilfs->ns_segctor_sem);

- timer_shutdown_sync(&sci->sc_timer);
kfree(sci);
}

--
2.34.1