2023-12-17 19:50:37

by syzbot

Subject: [syzbot] [reiserfs?] possible deadlock in __run_timers

Hello,

syzbot found the following issue on:

HEAD commit: 88035e5694a8 Merge tag 'hid-for-linus-2023121201' of git:/..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13467cc6e80000
kernel config: https://syzkaller.appspot.com/x/.config?x=be2bd0a72b52d4da
dashboard link: https://syzkaller.appspot.com/bug?extid=a3981d3c93cde53224be
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15befbfee80000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17b20006e80000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/ce88672b9863/disk-88035e56.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/7509f7d0b113/vmlinux-88035e56.xz
kernel image: https://storage.googleapis.com/syzbot-assets/7465dc030e58/bzImage-88035e56.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/a5134eb638e9/mount_0.gz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

------------[ cut here ]------------
======================================================
WARNING: possible circular locking dependency detected
6.7.0-rc5-syzkaller-00042-g88035e5694a8 #0 Not tainted
------------------------------------------------------
syz-executor221/5060 is trying to acquire lock:
ffffffff8ceb8ea0 (console_owner){..-.}-{0:0}, at: console_trylock_spinning kernel/printk/printk.c:1962 [inline]
ffffffff8ceb8ea0 (console_owner){..-.}-{0:0}, at: vprintk_emit+0x313/0x5f0 kernel/printk/printk.c:2302

but task is already holding lock:
ffff8880b98297d8 (&base->lock){-.-.}-{2:2}, at: expire_timers kernel/time/timer.c:1752 [inline]
ffff8880b98297d8 (&base->lock){-.-.}-{2:2}, at: __run_timers+0x76c/0xb20 kernel/time/timer.c:2022

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #4 (&base->lock){-.-.}-{2:2}:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
lock_timer_base+0x5d/0x200 kernel/time/timer.c:999
__mod_timer+0x420/0xea0 kernel/time/timer.c:1080
worker_enter_idle+0x404/0x550 kernel/workqueue.c:945
create_worker+0x467/0x730 kernel/workqueue.c:2213
maybe_create_worker kernel/workqueue.c:2459 [inline]
manage_workers kernel/workqueue.c:2511 [inline]
worker_thread+0xca1/0x1290 kernel/workqueue.c:2756
kthread+0x2c6/0x3a0 kernel/kthread.c:388
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242

-> #3 (&pool->lock){-.-.}-{2:2}:
__raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
_raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
__queue_work+0x399/0x11d0 kernel/workqueue.c:1760
queue_work_on+0xed/0x110 kernel/workqueue.c:1831
queue_work include/linux/workqueue.h:562 [inline]
rpm_suspend+0x121b/0x16f0 drivers/base/power/runtime.c:660
rpm_idle+0x578/0x6e0 drivers/base/power/runtime.c:534
__pm_runtime_idle+0xbe/0x160 drivers/base/power/runtime.c:1102
pm_runtime_put include/linux/pm_runtime.h:460 [inline]
__device_attach+0x382/0x4b0 drivers/base/dd.c:1048
bus_probe_device+0x17c/0x1c0 drivers/base/bus.c:532
device_add+0x117e/0x1aa0 drivers/base/core.c:3625
serial_base_port_add+0x353/0x4b0 drivers/tty/serial/serial_base_bus.c:178
serial_core_port_device_add drivers/tty/serial/serial_core.c:3316 [inline]
serial_core_register_port+0x137/0x1af0 drivers/tty/serial/serial_core.c:3357
serial8250_register_8250_port+0x140d/0x2080 drivers/tty/serial/8250/8250_core.c:1139
serial_pnp_probe+0x47d/0x880 drivers/tty/serial/8250/8250_pnp.c:478
pnp_device_probe+0x2a3/0x4c0 drivers/pnp/driver.c:111
call_driver_probe drivers/base/dd.c:579 [inline]
really_probe+0x234/0xc90 drivers/base/dd.c:658
__driver_probe_device+0x1de/0x4b0 drivers/base/dd.c:800
driver_probe_device+0x4c/0x1a0 drivers/base/dd.c:830
__driver_attach+0x274/0x570 drivers/base/dd.c:1216
bus_for_each_dev+0x13c/0x1d0 drivers/base/bus.c:368
bus_add_driver+0x2e9/0x630 drivers/base/bus.c:673
driver_register+0x15c/0x4a0 drivers/base/driver.c:246
serial8250_init+0xba/0x4b0 drivers/tty/serial/8250/8250_core.c:1240
do_one_initcall+0x11c/0x650 init/main.c:1236
do_initcall_level init/main.c:1298 [inline]
do_initcalls init/main.c:1314 [inline]
do_basic_setup init/main.c:1333 [inline]
kernel_init_freeable+0x687/0xc10 init/main.c:1551
kernel_init+0x1c/0x2a0 init/main.c:1441
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242

-> #2 (&dev->power.lock){-...}-{2:2}:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
__pm_runtime_resume+0xab/0x170 drivers/base/power/runtime.c:1169
pm_runtime_get include/linux/pm_runtime.h:408 [inline]
__uart_start+0x1b2/0x470 drivers/tty/serial/serial_core.c:148
uart_write+0x2ff/0x5b0 drivers/tty/serial/serial_core.c:616
process_output_block drivers/tty/n_tty.c:574 [inline]
n_tty_write+0x422/0x1130 drivers/tty/n_tty.c:2379
iterate_tty_write drivers/tty/tty_io.c:1021 [inline]
file_tty_write.constprop.0+0x519/0x9b0 drivers/tty/tty_io.c:1092
tty_write drivers/tty/tty_io.c:1113 [inline]
redirected_tty_write drivers/tty/tty_io.c:1136 [inline]
redirected_tty_write+0xa6/0xc0 drivers/tty/tty_io.c:1116
call_write_iter include/linux/fs.h:2020 [inline]
new_sync_write fs/read_write.c:491 [inline]
vfs_write+0x64f/0xdf0 fs/read_write.c:584
ksys_write+0x12f/0x250 fs/read_write.c:637
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b

-> #1 (&port_lock_key){-...}-{2:2}:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
uart_port_lock_irqsave include/linux/serial_core.h:616 [inline]
serial8250_console_write+0xa7c/0x1060 drivers/tty/serial/8250/8250_port.c:3403
console_emit_next_record kernel/printk/printk.c:2901 [inline]
console_flush_all+0x4d5/0xd60 kernel/printk/printk.c:2967
console_unlock+0x10c/0x260 kernel/printk/printk.c:3036
vprintk_emit+0x17f/0x5f0 kernel/printk/printk.c:2303
vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
_printk+0xc8/0x100 kernel/printk/printk.c:2328
register_console+0xa74/0x1060 kernel/printk/printk.c:3542
univ8250_console_init+0x35/0x50 drivers/tty/serial/8250/8250_core.c:717
console_init+0xba/0x5d0 kernel/printk/printk.c:3688
start_kernel+0x25a/0x480 init/main.c:1008
x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:555
x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:536
secondary_startup_64_no_verify+0x166/0x16b

-> #0 (console_owner){..-.}-{0:0}:
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain kernel/locking/lockdep.c:3869 [inline]
__lock_acquire+0x2433/0x3b20 kernel/locking/lockdep.c:5137
lock_acquire kernel/locking/lockdep.c:5754 [inline]
lock_acquire+0x1ae/0x520 kernel/locking/lockdep.c:5719
console_trylock_spinning kernel/printk/printk.c:1962 [inline]
vprintk_emit+0x328/0x5f0 kernel/printk/printk.c:2302
vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
_printk+0xc8/0x100 kernel/printk/printk.c:2328
__report_bug lib/bug.c:195 [inline]
report_bug+0x4a8/0x580 lib/bug.c:219
handle_bug+0x3d/0x70 arch/x86/kernel/traps.c:237
exc_invalid_op+0x17/0x40 arch/x86/kernel/traps.c:258
asm_exc_invalid_op+0x1a/0x20 arch/x86/include/asm/idtentry.h:568
expire_timers kernel/time/timer.c:1738 [inline]
__run_timers+0x8d2/0xb20 kernel/time/timer.c:2022
run_timer_softirq+0x58/0xd0 kernel/time/timer.c:2035
__do_softirq+0x21a/0x8de kernel/softirq.c:553
invoke_softirq kernel/softirq.c:427 [inline]
__irq_exit_rcu kernel/softirq.c:632 [inline]
irq_exit_rcu+0xb7/0x120 kernel/softirq.c:644
sysvec_apic_timer_interrupt+0x95/0xb0 arch/x86/kernel/apic/apic.c:1076
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:649
memmove+0x44/0x1b0 arch/x86/lib/memmove_64.S:67
leaf_insert_into_buf+0x303/0xa30 fs/reiserfs/lbalance.c:933
balance_leaf_new_nodes_insert fs/reiserfs/do_balan.c:1001 [inline]
balance_leaf_new_nodes fs/reiserfs/do_balan.c:1243 [inline]
balance_leaf+0x2ff4/0xcda0 fs/reiserfs/do_balan.c:1450
do_balance+0x337/0x840 fs/reiserfs/do_balan.c:1888
reiserfs_insert_item+0xadd/0xe20 fs/reiserfs/stree.c:2260
indirect2direct+0x6d8/0xa20 fs/reiserfs/tail_conversion.c:283
maybe_indirect_to_direct fs/reiserfs/stree.c:1585 [inline]
reiserfs_cut_from_item+0xa82/0x1a10 fs/reiserfs/stree.c:1692
reiserfs_do_truncate+0x672/0x10b0 fs/reiserfs/stree.c:1971
reiserfs_truncate_file+0x1bf/0x940 fs/reiserfs/inode.c:2302
reiserfs_file_release+0xae3/0xc40 fs/reiserfs/file.c:109
__fput+0x270/0xbb0 fs/file_table.c:394
task_work_run+0x14d/0x240 kernel/task_work.c:180
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0xa92/0x2ae0 kernel/exit.c:871
do_group_exit+0xd4/0x2a0 kernel/exit.c:1021
__do_sys_exit_group kernel/exit.c:1032 [inline]
__se_sys_exit_group kernel/exit.c:1030 [inline]
__x64_sys_exit_group+0x3e/0x50 kernel/exit.c:1030
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b

other info that might help us debug this:

Chain exists of:
console_owner --> &pool->lock --> &base->lock

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(&base->lock);
lock(&pool->lock);
lock(&base->lock);
lock(console_owner);

*** DEADLOCK ***

3 locks held by syz-executor221/5060:
#0: ffff8880766e0df8 (&ei->tailpack){+.+.}-{3:3}, at: reiserfs_file_release+0xdd/0xc40 fs/reiserfs/file.c:41
#1: ffff888078f6b090 (&sbi->lock){+.+.}-{3:3}, at: reiserfs_write_lock_nested+0x69/0xe0 fs/reiserfs/lock.c:78
#2: ffff8880b98297d8 (&base->lock){-.-.}-{2:2}, at: expire_timers kernel/time/timer.c:1752 [inline]
#2: ffff8880b98297d8 (&base->lock){-.-.}-{2:2}, at: __run_timers+0x76c/0xb20 kernel/time/timer.c:2022

stack backtrace:
CPU: 0 PID: 5060 Comm: syz-executor221 Not tainted 6.7.0-rc5-syzkaller-00042-g88035e5694a8 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
check_noncircular+0x317/0x400 kernel/locking/lockdep.c:2187
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain kernel/locking/lockdep.c:3869 [inline]
__lock_acquire+0x2433/0x3b20 kernel/locking/lockdep.c:5137
lock_acquire kernel/locking/lockdep.c:5754 [inline]
lock_acquire+0x1ae/0x520 kernel/locking/lockdep.c:5719
console_trylock_spinning kernel/printk/printk.c:1962 [inline]
vprintk_emit+0x328/0x5f0 kernel/printk/printk.c:2302
vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
_printk+0xc8/0x100 kernel/printk/printk.c:2328
__report_bug lib/bug.c:195 [inline]
report_bug+0x4a8/0x580 lib/bug.c:219
handle_bug+0x3d/0x70 arch/x86/kernel/traps.c:237
exc_invalid_op+0x17/0x40 arch/x86/kernel/traps.c:258
asm_exc_invalid_op+0x1a/0x20 arch/x86/include/asm/idtentry.h:568
RIP: 0010:expire_timers kernel/time/timer.c:1738 [inline]
RIP: 0010:__run_timers+0x8d2/0xb20 kernel/time/timer.c:2022
Code: 6f 48 e8 91 9d 11 00 89 de 31 ff 83 eb 01 e8 f5 98 11 00 8b 44 24 18 85 c0 0f 85 50 fc ff ff e9 50 fb ff ff e8 6f 9d 11 00 90 <0f> 0b 90 e9 b3 fc ff ff e8 61 9d 11 00 90 0f 0b 90 e9 37 fd ff ff
RSP: 0018:ffffc90000007d88 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff88807e909300 RCX: ffffffff8175f032
RDX: ffff888023565940 RSI: ffffffff8175f091 RDI: ffff88807e909318
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000003 R12: ffffc90000007e60
R13: ffffc90000007e60 R14: dffffc0000000000 R15: ffff8880b98297c0
run_timer_softirq+0x58/0xd0 kernel/time/timer.c:2035
__do_softirq+0x21a/0x8de kernel/softirq.c:553
invoke_softirq kernel/softirq.c:427 [inline]
__irq_exit_rcu kernel/softirq.c:632 [inline]
irq_exit_rcu+0xb7/0x120 kernel/softirq.c:644
sysvec_apic_timer_interrupt+0x95/0xb0 arch/x86/kernel/apic/apic.c:1076
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:649
RIP: 0010:memmove+0x44/0x1b0 arch/x86/lib/memmove_64.S:68
Code: 00 48 83 fa 20 0f 82 01 01 00 00 66 0f 1f 44 00 00 48 81 fa a8 02 00 00 72 05 40 38 fe 74 47 48 83 ea 20 48 83 ea 20 4c 8b 1e <4c> 8b 56 08 4c 8b 4e 10 4c 8b 46 18 48 8d 76 20 4c 89 1f 4c 89 57
RSP: 0018:ffffc900039feb60 EFLAGS: 00000282
RAX: ffff88807c4ac0c0 RBX: 0000000000000006 RCX: 0000000000000000
RDX: ffffffffe7ab3e98 RSI: ffff8880949f9040 RDI: ffff8880949f8100
RBP: 00000000000000c0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000f18
R13: ffff8880765cd938 R14: 0000000000000000 R15: ffff88807c4ac0a8
leaf_insert_into_buf+0x303/0xa30 fs/reiserfs/lbalance.c:933
balance_leaf_new_nodes_insert fs/reiserfs/do_balan.c:1001 [inline]
balance_leaf_new_nodes fs/reiserfs/do_balan.c:1243 [inline]
balance_leaf+0x2ff4/0xcda0 fs/reiserfs/do_balan.c:1450
do_balance+0x337/0x840 fs/reiserfs/do_balan.c:1888
reiserfs_insert_item+0xadd/0xe20 fs/reiserfs/stree.c:2260
indirect2direct+0x6d8/0xa20 fs/reiserfs/tail_conversion.c:283
maybe_indirect_to_direct fs/reiserfs/stree.c:1585 [inline]
reiserfs_cut_from_item+0xa82/0x1a10 fs/reiserfs/stree.c:1692
reiserfs_do_truncate+0x672/0x10b0 fs/reiserfs/stree.c:1971
reiserfs_truncate_file+0x1bf/0x940 fs/reiserfs/inode.c:2302
reiserfs_file_release+0xae3/0xc40 fs/reiserfs/file.c:109
__fput+0x270/0xbb0 fs/file_table.c:394
task_work_run+0x14d/0x240 kernel/task_work.c:180
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0xa92/0x2ae0 kernel/exit.c:871
do_group_exit+0xd4/0x2a0 kernel/exit.c:1021
__do_sys_exit_group kernel/exit.c:1032 [inline]
__se_sys_exit_group kernel/exit.c:1030 [inline]
__x64_sys_exit_group+0x3e/0x50 kernel/exit.c:1030
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7fb4f48ae339
Code: Unable to access opcode bytes at 0x7fb4f48ae30f.
RSP: 002b:00007fff27e4b078 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb4f48ae339
RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
RBP: 00007fb4f49292b0 R08: ffffffffffffffb8 R09: 00007fb4f487bbf0
R10: 00007fff27e4b028 R11: 0000000000000246 R12: 00007fb4f49292b0
R13: 0000000000000000 R14: 00007fb4f492a020 R15: 00007fb4f487cc70
</TASK>
WARNING: CPU: 0 PID: 5060 at kernel/time/timer.c:1738 expire_timers kernel/time/timer.c:1738 [inline]
WARNING: CPU: 0 PID: 5060 at kernel/time/timer.c:1738 __run_timers+0x8d2/0xb20 kernel/time/timer.c:2022
Modules linked in:
CPU: 0 PID: 5060 Comm: syz-executor221 Not tainted 6.7.0-rc5-syzkaller-00042-g88035e5694a8 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
RIP: 0010:expire_timers kernel/time/timer.c:1738 [inline]
RIP: 0010:__run_timers+0x8d2/0xb20 kernel/time/timer.c:2022
Code: 6f 48 e8 91 9d 11 00 89 de 31 ff 83 eb 01 e8 f5 98 11 00 8b 44 24 18 85 c0 0f 85 50 fc ff ff e9 50 fb ff ff e8 6f 9d 11 00 90 <0f> 0b 90 e9 b3 fc ff ff e8 61 9d 11 00 90 0f 0b 90 e9 37 fd ff ff
RSP: 0018:ffffc90000007d88 EFLAGS: 00010046

RAX: 0000000000000000 RBX: ffff88807e909300 RCX: ffffffff8175f032
RDX: ffff888023565940 RSI: ffffffff8175f091 RDI: ffff88807e909318
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000003 R12: ffffc90000007e60
R13: ffffc90000007e60 R14: dffffc0000000000 R15: ffff8880b98297c0
FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb4f48f7d08 CR3: 000000000cd77000 CR4: 0000000000350ef0
Call Trace:
<IRQ>
run_timer_softirq+0x58/0xd0 kernel/time/timer.c:2035
__do_softirq+0x21a/0x8de kernel/softirq.c:553
invoke_softirq kernel/softirq.c:427 [inline]
__irq_exit_rcu kernel/softirq.c:632 [inline]
irq_exit_rcu+0xb7/0x120 kernel/softirq.c:644
sysvec_apic_timer_interrupt+0x95/0xb0 arch/x86/kernel/apic/apic.c:1076
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:649
RIP: 0010:memmove+0x44/0x1b0 arch/x86/lib/memmove_64.S:68
Code: 00 48 83 fa 20 0f 82 01 01 00 00 66 0f 1f 44 00 00 48 81 fa a8 02 00 00 72 05 40 38 fe 74 47 48 83 ea 20 48 83 ea 20 4c 8b 1e <4c> 8b 56 08 4c 8b 4e 10 4c 8b 46 18 48 8d 76 20 4c 89 1f 4c 89 57
RSP: 0018:ffffc900039feb60 EFLAGS: 00000282

RAX: ffff88807c4ac0c0 RBX: 0000000000000006 RCX: 0000000000000000
RDX: ffffffffe7ab3e98 RSI: ffff8880949f9040 RDI: ffff8880949f8100
RBP: 00000000000000c0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000f18
R13: ffff8880765cd938 R14: 0000000000000000 R15: ffff88807c4ac0a8
leaf_insert_into_buf+0x303/0xa30 fs/reiserfs/lbalance.c:933
balance_leaf_new_nodes_insert fs/reiserfs/do_balan.c:1001 [inline]
balance_leaf_new_nodes fs/reiserfs/do_balan.c:1243 [inline]
balance_leaf+0x2ff4/0xcda0 fs/reiserfs/do_balan.c:1450
do_balance+0x337/0x840 fs/reiserfs/do_balan.c:1888
reiserfs_insert_item+0xadd/0xe20 fs/reiserfs/stree.c:2260
indirect2direct+0x6d8/0xa20 fs/reiserfs/tail_conversion.c:283
maybe_indirect_to_direct fs/reiserfs/stree.c:1585 [inline]
reiserfs_cut_from_item+0xa82/0x1a10 fs/reiserfs/stree.c:1692
reiserfs_do_truncate+0x672/0x10b0 fs/reiserfs/stree.c:1971
reiserfs_truncate_file+0x1bf/0x940 fs/reiserfs/inode.c:2302
reiserfs_file_release+0xae3/0xc40 fs/reiserfs/file.c:109
__fput+0x270/0xbb0 fs/file_table.c:394
task_work_run+0x14d/0x240 kernel/task_work.c:180
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0xa92/0x2ae0 kernel/exit.c:871
do_group_exit+0xd4/0x2a0 kernel/exit.c:1021
__do_sys_exit_group kernel/exit.c:1032 [inline]
__se_sys_exit_group kernel/exit.c:1030 [inline]
__x64_sys_exit_group+0x3e/0x50 kernel/exit.c:1030
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7fb4f48ae339
Code: Unable to access opcode bytes at 0x7fb4f48ae30f.
RSP: 002b:00007fff27e4b078 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb4f48ae339
RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
RBP: 00007fb4f49292b0 R08: ffffffffffffffb8 R09: 00007fb4f487bbf0
R10: 00007fff27e4b028 R11: 0000000000000246 R12: 00007fb4f49292b0
R13: 0000000000000000 R14: 00007fb4f492a020 R15: 00007fb4f487cc70
</TASK>
irq event stamp: 46901
hardirqs last enabled at (46900): [<ffffffff8a83b6ee>] __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:151 [inline]
hardirqs last enabled at (46900): [<ffffffff8a83b6ee>] _raw_spin_unlock_irqrestore+0x4e/0x70 kernel/locking/spinlock.c:194
hardirqs last disabled at (46901): [<ffffffff8a83b445>] __raw_spin_lock_irq include/linux/spinlock_api_smp.h:117 [inline]
hardirqs last disabled at (46901): [<ffffffff8a83b445>] _raw_spin_lock_irq+0x45/0x50 kernel/locking/spinlock.c:170
softirqs last enabled at (46892): [<ffffffff8a83e307>] softirq_handle_end kernel/softirq.c:399 [inline]
softirqs last enabled at (46892): [<ffffffff8a83e307>] __do_softirq+0x597/0x8de kernel/softirq.c:582
softirqs last disabled at (46895): [<ffffffff814f9757>] invoke_softirq kernel/softirq.c:427 [inline]
softirqs last disabled at (46895): [<ffffffff814f9757>] __irq_exit_rcu kernel/softirq.c:632 [inline]
softirqs last disabled at (46895): [<ffffffff814f9757>] irq_exit_rcu+0xb7/0x120 kernel/softirq.c:644
---[ end trace 0000000000000000 ]---
----------------
Code disassembly (best guess), 1 bytes skipped:
0: 48 83 fa 20 cmp $0x20,%rdx
4: 0f 82 01 01 00 00 jb 0x10b
a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
10: 48 81 fa a8 02 00 00 cmp $0x2a8,%rdx
17: 72 05 jb 0x1e
19: 40 38 fe cmp %dil,%sil
1c: 74 47 je 0x65
1e: 48 83 ea 20 sub $0x20,%rdx
22: 48 83 ea 20 sub $0x20,%rdx
26: 4c 8b 1e mov (%rsi),%r11
* 29: 4c 8b 56 08 mov 0x8(%rsi),%r10 <-- trapping instruction
2d: 4c 8b 4e 10 mov 0x10(%rsi),%r9
31: 4c 8b 46 18 mov 0x18(%rsi),%r8
35: 48 8d 76 20 lea 0x20(%rsi),%rsi
39: 4c 89 1f mov %r11,(%rdi)
3c: 4c rex.WR
3d: 89 .byte 0x89
3e: 57 push %rdi


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup
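
The dependency chain in the report above can be read as a directed graph in which an edge A -> B means "B was acquired while A was held". The sketch below is a simplified userspace model of that idea and not lockdep's actual implementation; the enum names, struct lock_graph, add_dependency() and report_chain_is_circular() are hypothetical names chosen for illustration. It replays the four recorded edges (#1 through #4) and then the #0 acquisition of console_owner under &base->lock, which is rejected because it would close a cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Lock classes named in the report's dependency chain. */
enum lock_class {
	CONSOLE_OWNER,	/* console_owner */
	PORT_LOCK,	/* &port_lock_key */
	POWER_LOCK,	/* &dev->power.lock */
	POOL_LOCK,	/* &pool->lock */
	BASE_LOCK,	/* &base->lock */
	NR_CLASSES
};

struct lock_graph {
	/* edge[a][b] is true when b was acquired while a was held. */
	bool edge[NR_CLASSES][NR_CLASSES];
};

/* Depth-first search: is there a path from 'from' to 'to'? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (g->edge[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	}
	return false;
}

/* Record "held -> wanted"; return false if that edge would close a cycle. */
static bool add_dependency(struct lock_graph *g, int held, int wanted)
{
	bool seen[NR_CLASSES] = { false };

	assert(held >= 0 && held < NR_CLASSES);
	assert(wanted >= 0 && wanted < NR_CLASSES);

	/* An existing path wanted -> ... -> held makes the new edge circular. */
	if (reachable(g, wanted, held, seen))
		return false;
	g->edge[held][wanted] = true;
	return true;
}

/* Replay the chain from the report; true if the last edge is circular. */
static bool report_chain_is_circular(void)
{
	struct lock_graph g;

	memset(&g, 0, sizeof(g));
	if (!add_dependency(&g, CONSOLE_OWNER, PORT_LOCK))	/* #1 */
		return false;
	if (!add_dependency(&g, PORT_LOCK, POWER_LOCK))		/* #2 */
		return false;
	if (!add_dependency(&g, POWER_LOCK, POOL_LOCK))		/* #3 */
		return false;
	if (!add_dependency(&g, POOL_LOCK, BASE_LOCK))		/* #4 */
		return false;
	/* #0: printk from the WARN in expire_timers, with &base->lock held. */
	return !add_dependency(&g, BASE_LOCK, CONSOLE_OWNER);
}
```

Calling report_chain_is_circular() returns true in this model, which matches the "possible circular locking dependency" verdict: the first four edges are accepted and only the final base->lock -> console_owner edge is refused.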


2023-12-18 01:14:06

by Lizhi Xu

Subject: Re: [syzbot] [reiserfs?] possible deadlock in __run_timers

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 88035e5694a8

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 2989b57e154a..33478bfee814 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -941,8 +941,10 @@ static void worker_enter_idle(struct worker *worker)
 	/* idle_list is LIFO */
 	list_add(&worker->entry, &pool->idle_list);

+	raw_spin_unlock_irq(&pool->lock);
 	if (too_many_workers(pool) && !timer_pending(&pool->idle_timer))
 		mod_timer(&pool->idle_timer, jiffies + IDLE_WORKER_TIMEOUT);
+	raw_spin_lock_irq(&pool->lock);

 	/* Sanity check nr_running. */
 	WARN_ON_ONCE(pool->nr_workers == pool->nr_idle && pool->nr_running);

2023-12-18 01:47:16

by syzbot

Subject: Re: [syzbot] [reiserfs?] possible deadlock in __run_timers

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
general protection fault in mac80211_hwsim_netlink_notify

general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 1 PID: 5405 Comm: udevd Not tainted 6.7.0-rc5-syzkaller-00042-g88035e5694a8-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
RIP: 0010:remove_user_radios drivers/net/wireless/virtual/mac80211_hwsim.c:6195 [inline]
RIP: 0010:mac80211_hwsim_netlink_notify+0x1fb/0x8e0 drivers/net/wireless/virtual/mac80211_hwsim.c:6221
Code: 8b ab 94 2c 00 00 8b 3c 24 44 89 ee e8 3e 65 2e fb 44 39 2c 24 0f 84 a2 03 00 00 e8 3f 6a 2e fb 48 89 e8 49 89 ef 48 c1 e8 03 <42> 80 3c 20 00 0f 85 5b 05 00 00 48 81 fd 00 43 0f 8e 48 8b 45 00
RSP: 0018:ffffc90004d8f850 EFLAGS: 00010256
RAX: 0000000000000000 RBX: ffff888073793040 RCX: ffffffff8659238e
RDX: ffff888021b38000 RSI: ffffffff865923e1 RDI: 0000000000000001
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000002 R12: dffffc0000000000
R13: 0000000000000000 R14: ffffc90004d8f898 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000000cd77000 CR4: 0000000000350ef0
Call Trace:
<TASK>
notifier_call_chain+0xb6/0x3b0 kernel/notifier.c:93
blocking_notifier_call_chain kernel/notifier.c:388 [inline]
blocking_notifier_call_chain+0x69/0x90 kernel/notifier.c:376
netlink_release+0x1835/0x1ff0 net/netlink/af_netlink.c:795
__sock_release+0xae/0x260 net/socket.c:659
sock_close+0x1c/0x20 net/socket.c:1419
__fput+0x270/0xbb0 fs/file_table.c:394
task_work_run+0x14d/0x240 kernel/task_work.c:180
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0xa92/0x2ae0 kernel/exit.c:871
do_group_exit+0xd4/0x2a0 kernel/exit.c:1021
get_signal+0x23be/0x2790 kernel/signal.c:2904
arch_do_signal_or_restart+0x90/0x7f0 arch/x86/kernel/signal.c:309
exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
exit_to_user_mode_prepare+0x121/0x240 kernel/entry/common.c:204
__syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
syscall_exit_to_user_mode+0x1e/0x60 kernel/entry/common.c:296
do_syscall_64+0x4d/0x110 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f0965ebe3cd
Code: Unable to access opcode bytes at 0x7f0965ebe3a3.
RSP: 002b:00007ffd49acfb40 EFLAGS: 00000246 ORIG_RAX: 00000000000000ea
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f0965ebe3cd
RDX: 0000000000000006 RSI: 000000000000151d RDI: 000000000000151d
RBP: 000000000000151d R08: 0000000000000000 R09: 0000000000000003
R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000006
R13: 00007ffd49acfd50 R14: 0000000000001000 R15: 0000000000000000
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:remove_user_radios drivers/net/wireless/virtual/mac80211_hwsim.c:6195 [inline]
RIP: 0010:mac80211_hwsim_netlink_notify+0x1fb/0x8e0 drivers/net/wireless/virtual/mac80211_hwsim.c:6221
Code: 8b ab 94 2c 00 00 8b 3c 24 44 89 ee e8 3e 65 2e fb 44 39 2c 24 0f 84 a2 03 00 00 e8 3f 6a 2e fb 48 89 e8 49 89 ef 48 c1 e8 03 <42> 80 3c 20 00 0f 85 5b 05 00 00 48 81 fd 00 43 0f 8e 48 8b 45 00
RSP: 0018:ffffc90004d8f850 EFLAGS: 00010256

RAX: 0000000000000000 RBX: ffff888073793040 RCX: ffffffff8659238e
RDX: ffff888021b38000 RSI: ffffffff865923e1 RDI: 0000000000000001
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000002 R12: dffffc0000000000
R13: 0000000000000000 R14: ffffc90004d8f898 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000000cd77000 CR4: 0000000000350ef0
----------------
Code disassembly (best guess):
0: 8b ab 94 2c 00 00 mov 0x2c94(%rbx),%ebp
6: 8b 3c 24 mov (%rsp),%edi
9: 44 89 ee mov %r13d,%esi
c: e8 3e 65 2e fb call 0xfb2e654f
11: 44 39 2c 24 cmp %r13d,(%rsp)
15: 0f 84 a2 03 00 00 je 0x3bd
1b: e8 3f 6a 2e fb call 0xfb2e6a5f
20: 48 89 e8 mov %rbp,%rax
23: 49 89 ef mov %rbp,%r15
26: 48 c1 e8 03 shr $0x3,%rax
* 2a: 42 80 3c 20 00 cmpb $0x0,(%rax,%r12,1) <-- trapping instruction
2f: 0f 85 5b 05 00 00 jne 0x590
35: 48 81 fd 00 43 0f 8e cmp $0xffffffff8e0f4300,%rbp
3c: 48 8b 45 00 mov 0x0(%rbp),%rax
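
The trapping instruction above is a KASAN shadow-memory check: RAX holds RBP >> 3 and R12 holds 0xdffffc0000000000, so the byte being read sits at (addr >> 3) + offset. The sketch below redoes that arithmetic in userspace. It assumes the usual x86_64 generic-KASAN constants (scale shift 3, shadow offset 0xdffffc0000000000) and 48-bit virtual addresses; kasan_shadow_addr() and is_canonical_48() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed x86_64 generic-KASAN constants; they match R12 in the dump. */
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_OFFSET 0xdffffc0000000000ULL

/* Shadow byte address that covers 'addr' (one shadow byte per 8 bytes). */
static uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* With 48-bit virtual addresses, bits 63..47 must be all 0 or all 1. */
static bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;

	return top == 0 || top == 0x1ffff;
}
```

With RBP = 0, as in the register dump, kasan_shadow_addr(0) evaluates to 0xdffffc0000000000, the non-canonical address named in the general protection fault line. Addresses 0 through 7 all map to that same shadow byte, which is consistent with the reported null-ptr-deref range [0x0000000000000000-0x0000000000000007].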


Tested on:

commit: 88035e56 Merge tag 'hid-for-linus-2023121201' of git:/..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=14d3d969e80000
kernel config: https://syzkaller.appspot.com/x/.config?x=be2bd0a72b52d4da
dashboard link: https://syzkaller.appspot.com/bug?extid=a3981d3c93cde53224be
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=16a9b876e80000


2023-12-25 01:55:12

by Lizhi Xu

Subject: Re: [syzbot] [reiserfs?] possible deadlock in __run_timers

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 88035e5694a8

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 2989b57e154a..9daa5d695dbd 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -941,8 +941,11 @@ static void worker_enter_idle(struct worker *worker)
 	/* idle_list is LIFO */
 	list_add(&worker->entry, &pool->idle_list);

-	if (too_many_workers(pool) && !timer_pending(&pool->idle_timer))
+	if (too_many_workers(pool) && !timer_pending(&pool->idle_timer)) {
+		raw_spin_unlock_irq(&pool->lock);
 		mod_timer(&pool->idle_timer, jiffies + IDLE_WORKER_TIMEOUT);
+		raw_spin_lock_irq(&pool->lock);
+	}

 	/* Sanity check nr_running. */
 	WARN_ON_ONCE(pool->nr_workers == pool->nr_idle && pool->nr_running);
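
The patch above appears to target the &pool->lock --> &base->lock edge (#4 in the original chain) by releasing pool->lock around mod_timer() and retaking it afterwards. A minimal userspace sketch of that drop-and-retake pattern follows. Everything in it is a toy stand-in (struct toy_lock, struct toy_pool, TOY_IDLE_LIMIT and the toy_* functions are hypothetical and are not kernel APIs), and it is single-threaded, so it only shows the shape of the pattern and not its behaviour under contention.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_IDLE_LIMIT 2	/* hypothetical threshold, illustration only */

struct toy_lock {
	bool held;
};

struct toy_pool {
	struct toy_lock lock;
	int nr_idle;
	bool timer_armed;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Models a callee with its own internal lock; must not nest under pool->lock. */
static void toy_arm_timer(struct toy_pool *pool)
{
	assert(!pool->lock.held);
	pool->timer_armed = true;
}

/* Caller holds pool->lock on entry and still holds it on return. */
static void toy_enter_idle(struct toy_pool *pool)
{
	assert(pool->lock.held);
	pool->nr_idle++;

	if (pool->nr_idle > TOY_IDLE_LIMIT && !pool->timer_armed) {
		toy_lock_release(&pool->lock);
		toy_arm_timer(pool);
		toy_lock_acquire(&pool->lock);
		/*
		 * In concurrent code, state read before the release
		 * (nr_idle, timer_armed) may be stale at this point.
		 */
	}
}
```

The cost of this pattern is the window in which the lock is not held: any invariant established before the release has to be treated as unverified after the lock is retaken.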

2023-12-25 02:31:14

by syzbot

Subject: Re: [syzbot] [reiserfs?] possible deadlock in __run_timers

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
inconsistent lock state in unlink_file_vma

================================
WARNING: inconsistent lock state
6.7.0-rc5-syzkaller-00042-g88035e5694a8-dirty #0 Not tainted
--------------------------------
inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
syz-executor.0/5423 [HC0[0]:SC0[0]:HE1:SE1] takes:
ffff888071f79078 (timekeeper_lock){?.-.}-{2:2}, at: i_mmap_lock_write include/linux/fs.h:512 [inline]
ffff888071f79078 (timekeeper_lock){?.-.}-{2:2}, at: unlink_file_vma+0x81/0x120 mm/mmap.c:128
{IN-HARDIRQ-W} state was registered at:
lock_acquire kernel/locking/lockdep.c:5754 [inline]
lock_acquire+0x1ae/0x520 kernel/locking/lockdep.c:5719
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
timekeeping_advance+0x82/0xf10 kernel/time/timekeeping.c:2159
update_wall_time+0x11/0x40 kernel/time/timekeeping.c:2231
tick_periodic+0x18b/0x230 kernel/time/tick-common.c:97
tick_handle_periodic+0x45/0x120 kernel/time/tick-common.c:112
timer_interrupt+0x48/0x70 arch/x86/kernel/time.c:57
__handle_irq_event_percpu+0x22a/0x750 kernel/irq/handle.c:158
handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
handle_irq_event+0xab/0x1e0 kernel/irq/handle.c:210
handle_edge_irq+0x261/0xcf0 kernel/irq/chip.c:831
generic_handle_irq_desc include/linux/irqdesc.h:161 [inline]
handle_irq arch/x86/kernel/irq.c:238 [inline]
__common_interrupt+0xdb/0x240 arch/x86/kernel/irq.c:257
common_interrupt+0xab/0xd0 arch/x86/kernel/irq.c:247
asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:640
console_flush_all+0xa0e/0xd60 kernel/printk/printk.c:2973
console_unlock+0x10c/0x260 kernel/printk/printk.c:3036
vprintk_emit+0x17f/0x5f0 kernel/printk/printk.c:2303
vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
_printk+0xc8/0x100 kernel/printk/printk.c:2328
setup_umip arch/x86/kernel/cpu/common.c:379 [inline]
identify_cpu+0xcfe/0x2390 arch/x86/kernel/cpu/common.c:1878
identify_boot_cpu arch/x86/kernel/cpu/common.c:1980 [inline]
arch_cpu_finalize_init+0x11/0x160 arch/x86/kernel/cpu/common.c:2343
start_kernel+0x32c/0x480 init/main.c:1039
x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:555
x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:536
secondary_startup_64_no_verify+0x166/0x16b
irq event stamp: 165397
hardirqs last enabled at (165397): [<ffffffff81de4612>] kasan_quarantine_put+0x102/0x230 mm/kasan/quarantine.c:242
hardirqs last disabled at (165396): [<ffffffff81de45ba>] kasan_quarantine_put+0xaa/0x230 mm/kasan/quarantine.c:215
softirqs last enabled at (165306): [<ffffffff8130d599>] local_bh_enable include/linux/bottom_half.h:33 [inline]
softirqs last enabled at (165306): [<ffffffff8130d599>] fpregs_unlock arch/x86/include/asm/fpu/api.h:80 [inline]
softirqs last enabled at (165306): [<ffffffff8130d599>] fpu__clear_user_states+0xf9/0x1e0 arch/x86/kernel/fpu/core.c:771
softirqs last disabled at (165304): [<ffffffff8130d4d9>] local_bh_disable include/linux/bottom_half.h:20 [inline]
softirqs last disabled at (165304): [<ffffffff8130d4d9>] fpregs_lock arch/x86/include/asm/fpu/api.h:72 [inline]
softirqs last disabled at (165304): [<ffffffff8130d4d9>] fpu__clear_user_states+0x39/0x1e0 arch/x86/kernel/fpu/core.c:745

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(timekeeper_lock);
<Interrupt>
lock(timekeeper_lock);

*** DEADLOCK ***

1 lock held by syz-executor.0/5423:
#0: ffff888016694420 (&mm->mmap_lock){++++}-{3:3}, at: mmap_write_lock include/linux/mmap_lock.h:108 [inline]
#0: ffff888016694420 (&mm->mmap_lock){++++}-{3:3}, at: exit_mmap+0x1ef/0xa70 mm/mmap.c:3316

stack backtrace:
CPU: 0 PID: 5423 Comm: syz-executor.0 Not tainted 6.7.0-rc5-syzkaller-00042-g88035e5694a8-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
print_usage_bug kernel/locking/lockdep.c:3971 [inline]
valid_state kernel/locking/lockdep.c:4013 [inline]
mark_lock_irq kernel/locking/lockdep.c:4216 [inline]
mark_lock+0x91a/0xc50 kernel/locking/lockdep.c:4678
mark_usage kernel/locking/lockdep.c:4587 [inline]
__lock_acquire+0x931/0x3b20 kernel/locking/lockdep.c:5091
lock_acquire kernel/locking/lockdep.c:5754 [inline]
lock_acquire+0x1ae/0x520 kernel/locking/lockdep.c:5719
down_write+0x3a/0x50 kernel/locking/rwsem.c:1579
i_mmap_lock_write include/linux/fs.h:512 [inline]
unlink_file_vma+0x81/0x120 mm/mmap.c:128
free_pgtables+0x311/0x800 mm/memory.c:401
exit_mmap+0x383/0xa70 mm/mmap.c:3319
__mmput+0x12a/0x4d0 kernel/fork.c:1349
mmput+0x62/0x70 kernel/fork.c:1371
exit_mm kernel/exit.c:567 [inline]
do_exit+0x9ad/0x2ae0 kernel/exit.c:858
do_group_exit+0xd4/0x2a0 kernel/exit.c:1021
__do_sys_exit_group kernel/exit.c:1032 [inline]
__se_sys_exit_group kernel/exit.c:1030 [inline]
__x64_sys_exit_group+0x3e/0x50 kernel/exit.c:1030
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f8e26c7cba9
Code: Unable to access opcode bytes at 0x7f8e26c7cb7f.
RSP: 002b:00007ffc0e242a78 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 000000000000000b RCX: 00007f8e26c7cba9
RDX: 00007f8e26ca7fb5 RSI: 0000000000000000 RDI: 000000000000000b
RBP: 00007ffc0e24314c R08: 0000000000000001 R09: 000000000000000b
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000032
R13: 0000000000014683 R14: 0000000000014581 R15: 0000000000000000
</TASK>


Tested on:

commit: 88035e56 Merge tag 'hid-for-linus-2023121201' of git:/..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=157106d9e80000
kernel config: https://syzkaller.appspot.com/x/.config?x=be2bd0a72b52d4da
dashboard link: https://syzkaller.appspot.com/bug?extid=a3981d3c93cde53224be
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=138e1e16e80000


2023-12-25 03:18:55

by Lizhi Xu

[permalink] [raw]
Subject: Re: [syzbot] [reiserfs?] possible deadlock in __run_timers

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 88035e5694a8

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 2989b57e154a..30427a1f961c 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -941,8 +941,12 @@ static void worker_enter_idle(struct worker *worker)
/* idle_list is LIFO */
list_add(&worker->entry, &pool->idle_list);

- if (too_many_workers(pool) && !timer_pending(&pool->idle_timer))
+ if (too_many_workers(pool) && !timer_pending(&pool->idle_timer)) {
+ unsigned long flags;
+ raw_spin_unlock_irqrestore(&pool->lock, flags);
mod_timer(&pool->idle_timer, jiffies + IDLE_WORKER_TIMEOUT);
+ raw_spin_lock_irqsave(&pool->lock, flags);
+ }

/* Sanity check nr_running. */
WARN_ON_ONCE(pool->nr_workers == pool->nr_idle && pool->nr_running);
@@ -2164,6 +2168,7 @@ static struct worker *create_worker(struct worker_pool *pool)
struct worker *worker;
int id;
char id_buf[23];
+ unsigned long flags;

/* ID is needed to determine kthread name */
id = ida_alloc(&pool->worker_ida, GFP_KERNEL);
@@ -2207,7 +2212,7 @@ static struct worker *create_worker(struct worker_pool *pool)
worker_attach_to_pool(worker, pool);

/* start the newly created worker */
- raw_spin_lock_irq(&pool->lock);
+ raw_spin_lock_irqsave(&pool->lock, flags);

worker->pool->nr_workers++;
worker_enter_idle(worker);
@@ -2220,7 +2225,7 @@ static struct worker *create_worker(struct worker_pool *pool)
*/
wake_up_process(worker->task);

- raw_spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irqrestore(&pool->lock, flags);

return worker;

@@ -2727,15 +2732,16 @@ static int worker_thread(void *__worker)
{
struct worker *worker = __worker;
struct worker_pool *pool = worker->pool;
+ unsigned long flags;

/* tell the scheduler that this is a workqueue worker */
set_pf_worker(true);
woke_up:
- raw_spin_lock_irq(&pool->lock);
+ raw_spin_lock_irqsave(&pool->lock, flags);

/* am I supposed to die? */
if (unlikely(worker->flags & WORKER_DIE)) {
- raw_spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irqsave(&pool->lock, flags);
set_pf_worker(false);

set_task_comm(worker->task, "kworker/dying");
@@ -2792,7 +2798,7 @@ static int worker_thread(void *__worker)
*/
worker_enter_idle(worker);
__set_current_state(TASK_IDLE);
- raw_spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irqrestore(&pool->lock, flags);
schedule();
goto woke_up;
}

2023-12-25 03:28:17

by syzbot

[permalink] [raw]
Subject: Re: [syzbot] [reiserfs?] possible deadlock in __run_timers

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

kernel/workqueue.c:2744:17: error: implicit declaration of function 'raw_spin_unlock_irqsave'; did you mean 'raw_spin_lock_irqsave'? [-Werror=implicit-function-declaration]


Tested on:

commit: 88035e56 Merge tag 'hid-for-linus-2023121201' of git:/..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel config: https://syzkaller.appspot.com/x/.config?x=be2bd0a72b52d4da
dashboard link: https://syzkaller.appspot.com/bug?extid=a3981d3c93cde53224be
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=11c16d9ee80000


2023-12-25 03:55:50

by Lizhi Xu

[permalink] [raw]
Subject: Re: [syzbot] [reiserfs?] possible deadlock in __run_timers

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 88035e5694a8

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 2989b57e154a..b7f3525bedb0 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -941,8 +941,12 @@ static void worker_enter_idle(struct worker *worker)
/* idle_list is LIFO */
list_add(&worker->entry, &pool->idle_list);

- if (too_many_workers(pool) && !timer_pending(&pool->idle_timer))
+ if (too_many_workers(pool) && !timer_pending(&pool->idle_timer)) {
+ unsigned long flags;
+ raw_spin_unlock_irqrestore(&pool->lock, flags);
mod_timer(&pool->idle_timer, jiffies + IDLE_WORKER_TIMEOUT);
+ raw_spin_lock_irqsave(&pool->lock, flags);
+ }

/* Sanity check nr_running. */
WARN_ON_ONCE(pool->nr_workers == pool->nr_idle && pool->nr_running);
@@ -2164,6 +2168,7 @@ static struct worker *create_worker(struct worker_pool *pool)
struct worker *worker;
int id;
char id_buf[23];
+ unsigned long flags;

/* ID is needed to determine kthread name */
id = ida_alloc(&pool->worker_ida, GFP_KERNEL);
@@ -2207,7 +2212,7 @@ static struct worker *create_worker(struct worker_pool *pool)
worker_attach_to_pool(worker, pool);

/* start the newly created worker */
- raw_spin_lock_irq(&pool->lock);
+ raw_spin_lock_irqsave(&pool->lock, flags);

worker->pool->nr_workers++;
worker_enter_idle(worker);
@@ -2220,7 +2225,7 @@ static struct worker *create_worker(struct worker_pool *pool)
*/
wake_up_process(worker->task);

- raw_spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irqrestore(&pool->lock, flags);

return worker;

@@ -2727,15 +2732,16 @@ static int worker_thread(void *__worker)
{
struct worker *worker = __worker;
struct worker_pool *pool = worker->pool;
+ unsigned long flags;

/* tell the scheduler that this is a workqueue worker */
set_pf_worker(true);
woke_up:
- raw_spin_lock_irq(&pool->lock);
+ raw_spin_lock_irqsave(&pool->lock, flags);

/* am I supposed to die? */
if (unlikely(worker->flags & WORKER_DIE)) {
- raw_spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irqrestore(&pool->lock, flags);
set_pf_worker(false);

set_task_comm(worker->task, "kworker/dying");
@@ -2792,7 +2798,7 @@ static int worker_thread(void *__worker)
*/
worker_enter_idle(worker);
__set_current_state(TASK_IDLE);
- raw_spin_unlock_irq(&pool->lock);
+ raw_spin_unlock_irqrestore(&pool->lock, flags);
schedule();
goto woke_up;
}

2023-12-25 04:16:15

by syzbot

[permalink] [raw]
Subject: Re: [syzbot] [reiserfs?] possible deadlock in __run_timers

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
kernel panic: corrupted stack end in reiserfs_file_release

Kernel panic - not syncing: corrupted stack end detected inside scheduler
CPU: 0 PID: 5487 Comm: syz-executor.0 Not tainted 6.7.0-rc5-syzkaller-00042-g88035e5694a8-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
panic+0x6dc/0x790 kernel/panic.c:344
schedule_debug kernel/sched/core.c:5930 [inline]
__schedule+0x56be/0x5af0 kernel/sched/core.c:6581
preempt_schedule_irq+0x52/0x90 kernel/sched/core.c:7008
irqentry_exit+0x36/0x80 kernel/entry/common.c:432
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:649
RIP: 0010:memmove+0x54/0x1b0 arch/x86/lib/memmove_64.S:73
Code: 00 48 81 fa a8 02 00 00 72 05 40 38 fe 74 47 48 83 ea 20 48 83 ea 20 4c 8b 1e 4c 8b 56 08 4c 8b 4e 10 4c 8b 46 18 48 8d 76 20 <4c> 89 1f 4c 89 57 08 4c 89 4f 10 4c 89 47 18 48 8d 7f 20 73 d4 48
RSP: 0018:ffffc9000569ecf0 EFLAGS: 00000282
RAX: ffff88806b84e0c0 RBX: 0000000000000006 RCX: 0000000000000000
RDX: fffffffff8d30c58 RSI: ffff888072b1e2a0 RDI: ffff888072b1d340
RBP: 00000000000000c0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000f18
R13: ffff8880719df028 R14: 0000000000000000 R15: ffff88806b84e0a8
leaf_insert_into_buf+0x303/0xa30 fs/reiserfs/lbalance.c:933
balance_leaf_new_nodes_insert fs/reiserfs/do_balan.c:1001 [inline]
balance_leaf_new_nodes fs/reiserfs/do_balan.c:1243 [inline]
balance_leaf+0x2ff4/0xcda0 fs/reiserfs/do_balan.c:1450
do_balance+0x337/0x840 fs/reiserfs/do_balan.c:1888
reiserfs_insert_item+0xadd/0xe20 fs/reiserfs/stree.c:2260
indirect2direct+0x6d8/0xa20 fs/reiserfs/tail_conversion.c:283
maybe_indirect_to_direct fs/reiserfs/stree.c:1585 [inline]
reiserfs_cut_from_item+0xa82/0x1a10 fs/reiserfs/stree.c:1692
reiserfs_do_truncate+0x672/0x10b0 fs/reiserfs/stree.c:1971
reiserfs_truncate_file+0x1bf/0x940 fs/reiserfs/inode.c:2302
reiserfs_file_release+0xae3/0xc40 fs/reiserfs/file.c:109
__fput+0x270/0xbb0 fs/file_table.c:394
__fput_sync+0x47/0x50 fs/file_table.c:475
__do_sys_close fs/open.c:1590 [inline]
__se_sys_close fs/open.c:1575 [inline]
__x64_sys_close+0x87/0xf0 fs/open.c:1575
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7eff5aa7ba9a
Code: 48 3d 00 f0 ff ff 77 48 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 7c 24 0c e8 03 7f 02 00 8b 7c 24 0c 89 c2 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 36 89 d7 89 44 24 0c e8 63 7f 02 00 8b 44 24
RSP: 002b:00007fff479d5a90 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00007eff5aa7ba9a
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000005
RBP: 0000000000000032 R08: 0000001b2e860000 R09: 00007eff5ab9bf8c
R10: 00007fff479d5be0 R11: 0000000000000293 R12: 00007eff5a6015a8
R13: ffffffffffffffff R14: 00007eff5a600000 R15: 0000000000014283
</TASK>
Kernel Offset: disabled
----------------
Code disassembly (best guess):
0: 00 48 81 add %cl,-0x7f(%rax)
3: fa cli
4: a8 02 test $0x2,%al
6: 00 00 add %al,(%rax)
8: 72 05 jb 0xf
a: 40 38 fe cmp %dil,%sil
d: 74 47 je 0x56
f: 48 83 ea 20 sub $0x20,%rdx
13: 48 83 ea 20 sub $0x20,%rdx
17: 4c 8b 1e mov (%rsi),%r11
1a: 4c 8b 56 08 mov 0x8(%rsi),%r10
1e: 4c 8b 4e 10 mov 0x10(%rsi),%r9
22: 4c 8b 46 18 mov 0x18(%rsi),%r8
26: 48 8d 76 20 lea 0x20(%rsi),%rsi
* 2a: 4c 89 1f mov %r11,(%rdi) <-- trapping instruction
2d: 4c 89 57 08 mov %r10,0x8(%rdi)
31: 4c 89 4f 10 mov %r9,0x10(%rdi)
35: 4c 89 47 18 mov %r8,0x18(%rdi)
39: 48 8d 7f 20 lea 0x20(%rdi),%rdi
3d: 73 d4 jae 0x13
3f: 48 rex.W


Tested on:

commit: 88035e56 Merge tag 'hid-for-linus-2023121201' of git:/..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=17b6fdc9e80000
kernel config: https://syzkaller.appspot.com/x/.config?x=be2bd0a72b52d4da
dashboard link: https://syzkaller.appspot.com/bug?extid=a3981d3c93cde53224be
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=178337a5e80000


2024-01-16 02:39:18

by syzbot

[permalink] [raw]
Subject: Re: [syzbot] [reiserfs?] possible deadlock in __run_timers

syzbot suspects this issue was fixed by commit:

commit 6f861765464f43a71462d52026fbddfc858239a5
Author: Jan Kara <[email protected]>
Date: Wed Nov 1 17:43:10 2023 +0000

fs: Block writes to mounted block devices

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=152ab62be80000
start commit: 88035e5694a8 Merge tag 'hid-for-linus-2023121201' of git:/..
git tree: upstream
kernel config: https://syzkaller.appspot.com/x/.config?x=be2bd0a72b52d4da
dashboard link: https://syzkaller.appspot.com/bug?extid=a3981d3c93cde53224be
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15befbfee80000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17b20006e80000

If the result looks correct, please mark the issue as fixed by replying with:

#syz fix: fs: Block writes to mounted block devices

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

2024-01-16 09:25:31

by Aleksandr Nogikh

[permalink] [raw]
Subject: Re: [syzbot] [reiserfs?] possible deadlock in __run_timers

#syz fix: fs: Block writes to mounted block devices

On Tue, Jan 16, 2024 at 3:39 AM syzbot
<[email protected]> wrote:
>
> syzbot suspects this issue was fixed by commit:
>
> commit 6f861765464f43a71462d52026fbddfc858239a5
> Author: Jan Kara <[email protected]>
> Date: Wed Nov 1 17:43:10 2023 +0000
>
> fs: Block writes to mounted block devices
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=152ab62be80000
> start commit: 88035e5694a8 Merge tag 'hid-for-linus-2023121201' of git:/..
> git tree: upstream
> kernel config: https://syzkaller.appspot.com/x/.config?x=be2bd0a72b52d4da
> dashboard link: https://syzkaller.appspot.com/bug?extid=a3981d3c93cde53224be
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15befbfee80000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17b20006e80000
>
> If the result looks correct, please mark the issue as fixed by replying with:
>
> #syz fix: fs: Block writes to mounted block devices
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
>