2023-11-16 11:08:48

by syzbot

Subject: [syzbot] [cgroups?] possible deadlock in cgroup_free

Hello,

syzbot found the following issue on:

HEAD commit: f31817cbcf48 Add linux-next specific files for 20231116
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=17d0aca7680000
kernel config: https://syzkaller.appspot.com/x/.config?x=f59345f1d0a928c
dashboard link: https://syzkaller.appspot.com/bug?extid=cef555184e66963dabc2
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/987488cb251e/disk-f31817cb.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/6d4a82d8bd4b/vmlinux-f31817cb.xz
kernel image: https://storage.googleapis.com/syzbot-assets/fc43dee9cb86/bzImage-f31817cb.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

=====================================================
WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
6.7.0-rc1-next-20231116-syzkaller #0 Not tainted
-----------------------------------------------------
syz-executor.3/8188 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
ffff88801f641298 (&sighand->siglock){+.+.}-{2:2}, at: __lock_task_sighand+0xc2/0x340 kernel/signal.c:1422

and this task is already holding:
ffffffff8cff86b8 (css_set_lock){..-.}-{2:2}, at: spin_lock_irq include/linux/spinlock.h:376 [inline]
ffffffff8cff86b8 (css_set_lock){..-.}-{2:2}, at: cgroup_migrate_execute+0xd8/0x1230 kernel/cgroup/cgroup.c:2566
which would create a new lock dependency:
(css_set_lock){..-.}-{2:2} -> (&sighand->siglock){+.+.}-{2:2}

but this new dependency connects a SOFTIRQ-irq-safe lock:
(css_set_lock){..-.}-{2:2}

... which became SOFTIRQ-irq-safe at:
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1b1/0x530 kernel/locking/lockdep.c:5718
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
put_css_set kernel/cgroup/cgroup-internal.h:208 [inline]
put_css_set kernel/cgroup/cgroup-internal.h:196 [inline]
cgroup_free+0x7c/0x1d0 kernel/cgroup/cgroup.c:6748
__put_task_struct+0x10b/0x3d0 kernel/fork.c:992
put_task_struct include/linux/sched/task.h:136 [inline]
put_task_struct include/linux/sched/task.h:123 [inline]
delayed_put_task_struct+0x22c/0x2d0 kernel/exit.c:227
rcu_do_batch kernel/rcu/tree.c:2158 [inline]
rcu_core+0x828/0x16b0 kernel/rcu/tree.c:2431
__do_softirq+0x216/0x8d5 kernel/softirq.c:553
invoke_softirq kernel/softirq.c:427 [inline]
__irq_exit_rcu kernel/softirq.c:632 [inline]
irq_exit_rcu+0xb5/0x120 kernel/softirq.c:644
sysvec_apic_timer_interrupt+0x95/0xb0 arch/x86/kernel/apic/apic.c:1076
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
lock_acquire+0x1f2/0x530 kernel/locking/lockdep.c:5721
__raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
_raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
spin_lock include/linux/spinlock.h:351 [inline]
lockref_put_or_lock+0x18/0x80 lib/lockref.c:147
fast_dput fs/dcache.c:775 [inline]
fast_dput fs/dcache.c:765 [inline]
dput+0x4c4/0xd90 fs/dcache.c:900
shmem_unlink+0x1bc/0x310 mm/shmem.c:3373
shmem_rename2+0x1ff/0x3b0 mm/shmem.c:3451
vfs_rename+0xe20/0x1c30 fs/namei.c:4844
do_renameat2+0xc3c/0xdc0 fs/namei.c:4996
__do_sys_rename fs/namei.c:5042 [inline]
__se_sys_rename fs/namei.c:5040 [inline]
__x64_sys_rename+0x81/0xa0 fs/namei.c:5040
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x62/0x6a

to a SOFTIRQ-irq-unsafe lock:
(&sighand->siglock){+.+.}-{2:2}

... which became SOFTIRQ-irq-unsafe at:
...
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1b1/0x530 kernel/locking/lockdep.c:5718
__raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
_raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
spin_lock include/linux/spinlock.h:351 [inline]
class_spinlock_constructor include/linux/spinlock.h:530 [inline]
ptrace_set_stopped kernel/ptrace.c:391 [inline]
ptrace_attach+0x401/0x650 kernel/ptrace.c:478
__do_sys_ptrace+0x204/0x230 kernel/ptrace.c:1290
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x62/0x6a

other info that might help us debug this:

Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&sighand->siglock);
                               local_irq_disable();
                               lock(css_set_lock);
                               lock(&sighand->siglock);
  <Interrupt>
    lock(css_set_lock);

*** DEADLOCK ***

8 locks held by syz-executor.3/8188:
#0: ffff88801d0e0d48 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe7/0x170 fs/file.c:1177
#1: ffff888014b44420 (sb_writers#10){.+.+}-{0:0}, at: ksys_write+0x12f/0x250 fs/read_write.c:637
#2: ffff88807ad7f088 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x27d/0x500 fs/kernfs/file.c:325
#3: ffffffff8cff8768 (cgroup_mutex){+.+.}-{3:3}, at: cgroup_lock include/linux/cgroup.h:368 [inline]
#3: ffffffff8cff8768 (cgroup_mutex){+.+.}-{3:3}, at: cgroup_lock_and_drain_offline+0xad/0x6d0 kernel/cgroup/cgroup.c:3092
#4: ffffffff8ce51f70 (cpu_hotplug_lock){++++}-{0:0}, at: cgroup_attach_lock kernel/cgroup/cgroup.c:2413 [inline]
#4: ffffffff8ce51f70 (cpu_hotplug_lock){++++}-{0:0}, at: cgroup_update_dfl_csses+0x2fb/0x640 kernel/cgroup/cgroup.c:3050
#5: ffffffff8cff8530 (cgroup_threadgroup_rwsem){++++}-{0:0}, at: cgroup_attach_lock kernel/cgroup/cgroup.c:2415 [inline]
#5: ffffffff8cff8530 (cgroup_threadgroup_rwsem){++++}-{0:0}, at: cgroup_attach_lock kernel/cgroup/cgroup.c:2411 [inline]
#5: ffffffff8cff8530 (cgroup_threadgroup_rwsem){++++}-{0:0}, at: cgroup_update_dfl_csses+0x3d1/0x640 kernel/cgroup/cgroup.c:3050
#6: ffffffff8cff86b8 (css_set_lock){..-.}-{2:2}, at: spin_lock_irq include/linux/spinlock.h:376 [inline]
#6: ffffffff8cff86b8 (css_set_lock){..-.}-{2:2}, at: cgroup_migrate_execute+0xd8/0x1230 kernel/cgroup/cgroup.c:2566
#7: ffffffff8cfad060 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:301 [inline]
#7: ffffffff8cfad060 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:747 [inline]
#7: ffffffff8cfad060 (rcu_read_lock){....}-{1:2}, at: __lock_task_sighand+0x3f/0x340 kernel/signal.c:1405

the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
-> (css_set_lock){..-.}-{2:2} {
IN-SOFTIRQ-W at:
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1b1/0x530 kernel/locking/lockdep.c:5718
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
put_css_set kernel/cgroup/cgroup-internal.h:208 [inline]
put_css_set kernel/cgroup/cgroup-internal.h:196 [inline]
cgroup_free+0x7c/0x1d0 kernel/cgroup/cgroup.c:6748
__put_task_struct+0x10b/0x3d0 kernel/fork.c:992
put_task_struct include/linux/sched/task.h:136 [inline]
put_task_struct include/linux/sched/task.h:123 [inline]
delayed_put_task_struct+0x22c/0x2d0 kernel/exit.c:227
rcu_do_batch kernel/rcu/tree.c:2158 [inline]
rcu_core+0x828/0x16b0 kernel/rcu/tree.c:2431
__do_softirq+0x216/0x8d5 kernel/softirq.c:553
invoke_softirq kernel/softirq.c:427 [inline]
__irq_exit_rcu kernel/softirq.c:632 [inline]
irq_exit_rcu+0xb5/0x120 kernel/softirq.c:644
sysvec_apic_timer_interrupt+0x95/0xb0 arch/x86/kernel/apic/apic.c:1076
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
lock_acquire+0x1f2/0x530 kernel/locking/lockdep.c:5721
__raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
_raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
spin_lock include/linux/spinlock.h:351 [inline]
lockref_put_or_lock+0x18/0x80 lib/lockref.c:147
fast_dput fs/dcache.c:775 [inline]
fast_dput fs/dcache.c:765 [inline]
dput+0x4c4/0xd90 fs/dcache.c:900
shmem_unlink+0x1bc/0x310 mm/shmem.c:3373
shmem_rename2+0x1ff/0x3b0 mm/shmem.c:3451
vfs_rename+0xe20/0x1c30 fs/namei.c:4844
do_renameat2+0xc3c/0xdc0 fs/namei.c:4996
__do_sys_rename fs/namei.c:5042 [inline]
__se_sys_rename fs/namei.c:5040 [inline]
__x64_sys_rename+0x81/0xa0 fs/namei.c:5040
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x62/0x6a
INITIAL USE at:
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1b1/0x530 kernel/locking/lockdep.c:5718
__raw_spin_lock_irq include/linux/spinlock_api_smp.h:119 [inline]
_raw_spin_lock_irq+0x36/0x50 kernel/locking/spinlock.c:170
spin_lock_irq include/linux/spinlock.h:376 [inline]
cgroup_setup_root+0x62c/0xa00 kernel/cgroup/cgroup.c:2138
cgroup_init+0x23f/0x1100 kernel/cgroup/cgroup.c:6120
start_kernel+0x385/0x480 init/main.c:1063
x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:555
x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:536
secondary_startup_64_no_verify+0x166/0x16b
}
... key at: [<ffffffff8cff86b8>] css_set_lock+0x18/0x60

the dependencies between the lock to be acquired
and SOFTIRQ-irq-unsafe lock:
-> (&sighand->siglock){+.+.}-{2:2} {
HARDIRQ-ON-W at:
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1b1/0x530 kernel/locking/lockdep.c:5718
__raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
_raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
spin_lock include/linux/spinlock.h:351 [inline]
class_spinlock_constructor include/linux/spinlock.h:530 [inline]
ptrace_set_stopped kernel/ptrace.c:391 [inline]
ptrace_attach+0x401/0x650 kernel/ptrace.c:478
__do_sys_ptrace+0x204/0x230 kernel/ptrace.c:1290
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x62/0x6a
SOFTIRQ-ON-W at:
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1b1/0x530 kernel/locking/lockdep.c:5718
__raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
_raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
spin_lock include/linux/spinlock.h:351 [inline]
class_spinlock_constructor include/linux/spinlock.h:530 [inline]
ptrace_set_stopped kernel/ptrace.c:391 [inline]
ptrace_attach+0x401/0x650 kernel/ptrace.c:478
__do_sys_ptrace+0x204/0x230 kernel/ptrace.c:1290
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x62/0x6a
INITIAL USE at:
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1b1/0x530 kernel/locking/lockdep.c:5718
__raw_spin_lock_irq include/linux/spinlock_api_smp.h:119 [inline]
_raw_spin_lock_irq+0x36/0x50 kernel/locking/spinlock.c:170
spin_lock_irq include/linux/spinlock.h:376 [inline]
calculate_sigpending+0x44/0xa0 kernel/signal.c:197
ret_from_fork+0x23/0x80 arch/x86/kernel/process.c:143
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
}
... key at: [<ffffffff90b49f80>] __key.341+0x0/0x40
... acquired at:
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1b1/0x530 kernel/locking/lockdep.c:5718
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
__lock_task_sighand+0xc2/0x340 kernel/signal.c:1422
lock_task_sighand include/linux/sched/signal.h:748 [inline]
cgroup_freeze_task+0x80/0x190 kernel/cgroup/freezer.c:160
cgroup_freezer_migrate_task+0x1b7/0x3a0 kernel/cgroup/freezer.c:257
cgroup_migrate_execute+0x2d3/0x1230 kernel/cgroup/cgroup.c:2580
cgroup_update_dfl_csses+0x51b/0x640 kernel/cgroup/cgroup.c:3068
cgroup_apply_control kernel/cgroup/cgroup.c:3308 [inline]
cgroup_subtree_control_write+0xb94/0xed0 kernel/cgroup/cgroup.c:3453
cgroup_file_write+0x209/0x7c0 kernel/cgroup/cgroup.c:4092
kernfs_fop_write_iter+0x33f/0x500 fs/kernfs/file.c:334
call_write_iter include/linux/fs.h:2021 [inline]
new_sync_write fs/read_write.c:491 [inline]
vfs_write+0x64d/0xdf0 fs/read_write.c:584
ksys_write+0x12f/0x250 fs/read_write.c:637
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x62/0x6a


stack backtrace:
CPU: 1 PID: 8188 Comm: syz-executor.3 Not tainted 6.7.0-rc1-next-20231116-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
print_bad_irq_dependency kernel/locking/lockdep.c:2626 [inline]
check_irq_usage+0xe18/0x1470 kernel/locking/lockdep.c:2865
check_prev_add kernel/locking/lockdep.c:3138 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain kernel/locking/lockdep.c:3868 [inline]
__lock_acquire+0x247c/0x3b10 kernel/locking/lockdep.c:5136
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1b1/0x530 kernel/locking/lockdep.c:5718
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
__lock_task_sighand+0xc2/0x340 kernel/signal.c:1422
lock_task_sighand include/linux/sched/signal.h:748 [inline]
cgroup_freeze_task+0x80/0x190 kernel/cgroup/freezer.c:160
cgroup_freezer_migrate_task+0x1b7/0x3a0 kernel/cgroup/freezer.c:257
cgroup_migrate_execute+0x2d3/0x1230 kernel/cgroup/cgroup.c:2580
cgroup_update_dfl_csses+0x51b/0x640 kernel/cgroup/cgroup.c:3068
cgroup_apply_control kernel/cgroup/cgroup.c:3308 [inline]
cgroup_subtree_control_write+0xb94/0xed0 kernel/cgroup/cgroup.c:3453
cgroup_file_write+0x209/0x7c0 kernel/cgroup/cgroup.c:4092
kernfs_fop_write_iter+0x33f/0x500 fs/kernfs/file.c:334
call_write_iter include/linux/fs.h:2021 [inline]
new_sync_write fs/read_write.c:491 [inline]
vfs_write+0x64d/0xdf0 fs/read_write.c:584
ksys_write+0x12f/0x250 fs/read_write.c:637
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x62/0x6a
RIP: 0033:0x7f83f387cae9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f83f466d0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f83f399bf80 RCX: 00007f83f387cae9
RDX: 0000000000000006 RSI: 0000000020000100 RDI: 0000000000000004
RBP: 00007f83f38c847a R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007f83f399bf80 R15: 00007ffdda059fd8
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


2023-11-16 14:17:36

by syzbot

Subject: Re: [syzbot] [cgroups?] possible deadlock in cgroup_free

syzbot has found a reproducer for the following issue on:

HEAD commit: f31817cbcf48 Add linux-next specific files for 20231116
git tree: linux-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=14fa5a48e80000
kernel config: https://syzkaller.appspot.com/x/.config?x=f59345f1d0a928c
dashboard link: https://syzkaller.appspot.com/bug?extid=cef555184e66963dabc2
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13fd7920e80000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17d80920e80000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/987488cb251e/disk-f31817cb.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/6d4a82d8bd4b/vmlinux-f31817cb.xz
kernel image: https://storage.googleapis.com/syzbot-assets/fc43dee9cb86/bzImage-f31817cb.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

========================================================
WARNING: possible irq lock inversion dependency detected
6.7.0-rc1-next-20231116-syzkaller #0 Not tainted
--------------------------------------------------------
swapper/0/0 just changed the state of lock:
ffffffff8cff86b8 (css_set_lock){..-.}-{2:2}, at: put_css_set kernel/cgroup/cgroup-internal.h:208 [inline]
ffffffff8cff86b8 (css_set_lock){..-.}-{2:2}, at: put_css_set kernel/cgroup/cgroup-internal.h:196 [inline]
ffffffff8cff86b8 (css_set_lock){..-.}-{2:2}, at: cgroup_free+0x7c/0x1d0 kernel/cgroup/cgroup.c:6748
but this lock took another, SOFTIRQ-unsafe lock in the past:
(&sighand->siglock){+.+.}-{2:2}


and interrupts could create inverse lock ordering between them.


other info that might help us debug this:
Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&sighand->siglock);
                               local_irq_disable();
                               lock(css_set_lock);
                               lock(&sighand->siglock);
  <Interrupt>
    lock(css_set_lock);

*** DEADLOCK ***

2 locks held by swapper/0/0:
#0: ffffffff8cfacf40 (rcu_callback){....}-{0:0}, at: rcu_lock_acquire include/linux/rcupdate.h:301 [inline]
#0: ffffffff8cfacf40 (rcu_callback){....}-{0:0}, at: rcu_do_batch kernel/rcu/tree.c:2152 [inline]
#0: ffffffff8cfacf40 (rcu_callback){....}-{0:0}, at: rcu_core+0x7cc/0x16b0 kernel/rcu/tree.c:2431
#1: ffffffff8ce58800 (put_task_map-wait-type-override){+...}-{3:3}, at: put_task_struct include/linux/sched/task.h:135 [inline]
#1: ffffffff8ce58800 (put_task_map-wait-type-override){+...}-{3:3}, at: put_task_struct include/linux/sched/task.h:123 [inline]
#1: ffffffff8ce58800 (put_task_map-wait-type-override){+...}-{3:3}, at: delayed_put_task_struct+0x21e/0x2d0 kernel/exit.c:227

the shortest dependencies between 2nd lock and 1st lock:
-> (&sighand->siglock){+.+.}-{2:2} {
HARDIRQ-ON-W at:
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1b1/0x530 kernel/locking/lockdep.c:5718
__raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
_raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
spin_lock include/linux/spinlock.h:351 [inline]
class_spinlock_constructor include/linux/spinlock.h:530 [inline]
ptrace_set_stopped kernel/ptrace.c:391 [inline]
ptrace_attach+0x401/0x650 kernel/ptrace.c:478
__do_sys_ptrace+0x204/0x230 kernel/ptrace.c:1290
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x62/0x6a
SOFTIRQ-ON-W at:
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1b1/0x530 kernel/locking/lockdep.c:5718
__raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
_raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
spin_lock include/linux/spinlock.h:351 [inline]
class_spinlock_constructor include/linux/spinlock.h:530 [inline]
ptrace_set_stopped kernel/ptrace.c:391 [inline]
ptrace_attach+0x401/0x650 kernel/ptrace.c:478
__do_sys_ptrace+0x204/0x230 kernel/ptrace.c:1290
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x62/0x6a
INITIAL USE at:
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1b1/0x530 kernel/locking/lockdep.c:5718
__raw_spin_lock_irq include/linux/spinlock_api_smp.h:119 [inline]
_raw_spin_lock_irq+0x36/0x50 kernel/locking/spinlock.c:170
spin_lock_irq include/linux/spinlock.h:376 [inline]
calculate_sigpending+0x44/0xa0 kernel/signal.c:197
ret_from_fork+0x23/0x80 arch/x86/kernel/process.c:143
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
}
... key at: [<ffffffff90b49f80>] __key.341+0x0/0x40
... acquired at:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
__lock_task_sighand+0xc2/0x340 kernel/signal.c:1422
lock_task_sighand include/linux/sched/signal.h:748 [inline]
cgroup_freeze_task+0x80/0x190 kernel/cgroup/freezer.c:160
cgroup_freezer_migrate_task+0x1b7/0x3a0 kernel/cgroup/freezer.c:257
cgroup_migrate_execute+0x2d3/0x1230 kernel/cgroup/cgroup.c:2580
cgroup_update_dfl_csses+0x51b/0x640 kernel/cgroup/cgroup.c:3068
cgroup_apply_control kernel/cgroup/cgroup.c:3308 [inline]
cgroup_subtree_control_write+0xb94/0xed0 kernel/cgroup/cgroup.c:3453
cgroup_file_write+0x209/0x7c0 kernel/cgroup/cgroup.c:4092
kernfs_fop_write_iter+0x33f/0x500 fs/kernfs/file.c:334
call_write_iter include/linux/fs.h:2021 [inline]
new_sync_write fs/read_write.c:491 [inline]
vfs_write+0x64d/0xdf0 fs/read_write.c:584
ksys_write+0x12f/0x250 fs/read_write.c:637
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x62/0x6a

-> (css_set_lock){..-.}-{2:2} {
IN-SOFTIRQ-W at:
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1b1/0x530 kernel/locking/lockdep.c:5718
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
put_css_set kernel/cgroup/cgroup-internal.h:208 [inline]
put_css_set kernel/cgroup/cgroup-internal.h:196 [inline]
cgroup_free+0x7c/0x1d0 kernel/cgroup/cgroup.c:6748
__put_task_struct+0x10b/0x3d0 kernel/fork.c:992
put_task_struct include/linux/sched/task.h:136 [inline]
put_task_struct include/linux/sched/task.h:123 [inline]
delayed_put_task_struct+0x22c/0x2d0 kernel/exit.c:227
rcu_do_batch kernel/rcu/tree.c:2158 [inline]
rcu_core+0x828/0x16b0 kernel/rcu/tree.c:2431
__do_softirq+0x216/0x8d5 kernel/softirq.c:553
invoke_softirq kernel/softirq.c:427 [inline]
__irq_exit_rcu kernel/softirq.c:632 [inline]
irq_exit_rcu+0xb5/0x120 kernel/softirq.c:644
sysvec_apic_timer_interrupt+0x95/0xb0 arch/x86/kernel/apic/apic.c:1076
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
acpi_safe_halt+0x1a/0x20 drivers/acpi/processor_idle.c:112
acpi_idle_enter+0xc5/0x160 drivers/acpi/processor_idle.c:707
cpuidle_enter_state+0x83/0x500 drivers/cpuidle/cpuidle.c:267
cpuidle_enter+0x4e/0xa0 drivers/cpuidle/cpuidle.c:388
cpuidle_idle_call kernel/sched/idle.c:215 [inline]
do_idle+0x314/0x3f0 kernel/sched/idle.c:312
cpu_startup_entry+0x4f/0x60 kernel/sched/idle.c:410
rest_init+0x16f/0x2b0 init/main.c:730
arch_call_rest_init+0x13/0x30 init/main.c:827
start_kernel+0x39e/0x480 init/main.c:1072
x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:555
x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:536
secondary_startup_64_no_verify+0x166/0x16b
INITIAL USE at:
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1b1/0x530 kernel/locking/lockdep.c:5718
__raw_spin_lock_irq include/linux/spinlock_api_smp.h:119 [inline]
_raw_spin_lock_irq+0x36/0x50 kernel/locking/spinlock.c:170
spin_lock_irq include/linux/spinlock.h:376 [inline]
cgroup_setup_root+0x62c/0xa00 kernel/cgroup/cgroup.c:2138
cgroup_init+0x23f/0x1100 kernel/cgroup/cgroup.c:6120
start_kernel+0x385/0x480 init/main.c:1063
x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:555
x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:536
secondary_startup_64_no_verify+0x166/0x16b
}
... key at: [<ffffffff8cff86b8>] css_set_lock+0x18/0x60
... acquired at:
mark_usage kernel/locking/lockdep.c:4566 [inline]
__lock_acquire+0x13c2/0x3b10 kernel/locking/lockdep.c:5090
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1b1/0x530 kernel/locking/lockdep.c:5718
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
put_css_set kernel/cgroup/cgroup-internal.h:208 [inline]
put_css_set kernel/cgroup/cgroup-internal.h:196 [inline]
cgroup_free+0x7c/0x1d0 kernel/cgroup/cgroup.c:6748
__put_task_struct+0x10b/0x3d0 kernel/fork.c:992
put_task_struct include/linux/sched/task.h:136 [inline]
put_task_struct include/linux/sched/task.h:123 [inline]
delayed_put_task_struct+0x22c/0x2d0 kernel/exit.c:227
rcu_do_batch kernel/rcu/tree.c:2158 [inline]
rcu_core+0x828/0x16b0 kernel/rcu/tree.c:2431
__do_softirq+0x216/0x8d5 kernel/softirq.c:553
invoke_softirq kernel/softirq.c:427 [inline]
__irq_exit_rcu kernel/softirq.c:632 [inline]
irq_exit_rcu+0xb5/0x120 kernel/softirq.c:644
sysvec_apic_timer_interrupt+0x95/0xb0 arch/x86/kernel/apic/apic.c:1076
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
acpi_safe_halt+0x1a/0x20 drivers/acpi/processor_idle.c:112
acpi_idle_enter+0xc5/0x160 drivers/acpi/processor_idle.c:707
cpuidle_enter_state+0x83/0x500 drivers/cpuidle/cpuidle.c:267
cpuidle_enter+0x4e/0xa0 drivers/cpuidle/cpuidle.c:388
cpuidle_idle_call kernel/sched/idle.c:215 [inline]
do_idle+0x314/0x3f0 kernel/sched/idle.c:312
cpu_startup_entry+0x4f/0x60 kernel/sched/idle.c:410
rest_init+0x16f/0x2b0 init/main.c:730
arch_call_rest_init+0x13/0x30 init/main.c:827
start_kernel+0x39e/0x480 init/main.c:1072
x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:555
x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:536
secondary_startup_64_no_verify+0x166/0x16b


stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc1-next-20231116-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
print_irq_inversion_bug.part.0+0x3e1/0x590 kernel/locking/lockdep.c:4079
print_irq_inversion_bug kernel/locking/lockdep.c:4032 [inline]
check_usage_forwards kernel/locking/lockdep.c:4110 [inline]
mark_lock_irq kernel/locking/lockdep.c:4242 [inline]
mark_lock+0x570/0xc50 kernel/locking/lockdep.c:4677
mark_usage kernel/locking/lockdep.c:4566 [inline]
__lock_acquire+0x13c2/0x3b10 kernel/locking/lockdep.c:5090
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1b1/0x530 kernel/locking/lockdep.c:5718
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
put_css_set kernel/cgroup/cgroup-internal.h:208 [inline]
put_css_set kernel/cgroup/cgroup-internal.h:196 [inline]
cgroup_free+0x7c/0x1d0 kernel/cgroup/cgroup.c:6748
__put_task_struct+0x10b/0x3d0 kernel/fork.c:992
put_task_struct include/linux/sched/task.h:136 [inline]
put_task_struct include/linux/sched/task.h:123 [inline]
delayed_put_task_struct+0x22c/0x2d0 kernel/exit.c:227
rcu_do_batch kernel/rcu/tree.c:2158 [inline]
rcu_core+0x828/0x16b0 kernel/rcu/tree.c:2431
__do_softirq+0x216/0x8d5 kernel/softirq.c:553
invoke_softirq kernel/softirq.c:427 [inline]
__irq_exit_rcu kernel/softirq.c:632 [inline]
irq_exit_rcu+0xb5/0x120 kernel/softirq.c:644
sysvec_apic_timer_interrupt+0x95/0xb0 arch/x86/kernel/apic/apic.c:1076
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
RIP: 0010:native_irq_disable arch/x86/include/asm/irqflags.h:37 [inline]
RIP: 0010:arch_local_irq_disable arch/x86/include/asm/irqflags.h:72 [inline]
RIP: 0010:acpi_safe_halt+0x1a/0x20 drivers/acpi/processor_idle.c:113
Code: 08 ed c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 65 48 8b 05 a8 8a 82 75 48 8b 00 a8 08 75 0c 66 90 0f 00 2d 78 0a b9 00 fb f4 <fa> c3 0f 1f 40 00 0f b6 47 08 3c 01 74 0b 3c 02 74 05 8b 7f 04 eb
RSP: 0018:ffffffff8cc07d68 EFLAGS: 00000246
RAX: 0000000000004000 RBX: 0000000000000001 RCX: ffffffff8a8117f5
RDX: 0000000000000001 RSI: ffff8880156c2800 RDI: ffff8880156c2864
RBP: ffff8880156c2864 R08: 0000000000000001 R09: ffffed1017306dbd
R10: ffff8880b9836deb R11: 0000000000000000 R12: ffff888147ac4000
R13: ffffffff8db1a520 R14: 0000000000000000 R15: 0000000000000000
acpi_idle_enter+0xc5/0x160 drivers/acpi/processor_idle.c:707
cpuidle_enter_state+0x83/0x500 drivers/cpuidle/cpuidle.c:267
cpuidle_enter+0x4e/0xa0 drivers/cpuidle/cpuidle.c:388
cpuidle_idle_call kernel/sched/idle.c:215 [inline]
do_idle+0x314/0x3f0 kernel/sched/idle.c:312
cpu_startup_entry+0x4f/0x60 kernel/sched/idle.c:410
rest_init+0x16f/0x2b0 init/main.c:730
arch_call_rest_init+0x13/0x30 init/main.c:827
start_kernel+0x39e/0x480 init/main.c:1072
x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:555
x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:536
secondary_startup_64_no_verify+0x166/0x16b
</TASK>
----------------
Code disassembly (best guess):
0: 08 ed or %ch,%ch
2: c3 ret
3: 66 66 2e 0f 1f 84 00 data16 cs nopw 0x0(%rax,%rax,1)
a: 00 00 00 00
e: 66 90 xchg %ax,%ax
10: 65 48 8b 05 a8 8a 82 mov %gs:0x75828aa8(%rip),%rax # 0x75828ac0
17: 75
18: 48 8b 00 mov (%rax),%rax
1b: a8 08 test $0x8,%al
1d: 75 0c jne 0x2b
1f: 66 90 xchg %ax,%ax
21: 0f 00 2d 78 0a b9 00 verw 0xb90a78(%rip) # 0xb90aa0
28: fb sti
29: f4 hlt
* 2a: fa cli <-- trapping instruction
2b: c3 ret
2c: 0f 1f 40 00 nopl 0x0(%rax)
30: 0f b6 47 08 movzbl 0x8(%rdi),%eax
34: 3c 01 cmp $0x1,%al
36: 74 0b je 0x43
38: 3c 02 cmp $0x2,%al
3a: 74 05 je 0x41
3c: 8b 7f 04 mov 0x4(%rdi),%edi
3f: eb .byte 0xeb


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

2023-11-17 01:25:54

by syzbot

Subject: Re: [syzbot] [cgroups?] possible deadlock in cgroup_free

syzbot has bisected this issue to:

commit 2d25a889601d2fbc87ec79b30ea315820f874b78
Author: Peter Zijlstra <[email protected]>
Date: Sun Sep 17 11:24:21 2023 +0000

ptrace: Convert ptrace_attach() to use lock guards

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=130edb3f680000
start commit: f31817cbcf48 Add linux-next specific files for 20231116
git tree: linux-next
final oops: https://syzkaller.appspot.com/x/report.txt?x=108edb3f680000
console output: https://syzkaller.appspot.com/x/log.txt?x=170edb3f680000
kernel config: https://syzkaller.appspot.com/x/.config?x=f59345f1d0a928c
dashboard link: https://syzkaller.appspot.com/bug?extid=cef555184e66963dabc2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13fd7920e80000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17d80920e80000

Reported-by: [email protected]
Fixes: 2d25a889601d ("ptrace: Convert ptrace_attach() to use lock guards")

For information about the bisection process see: https://goo.gl/tpsmEJ#bisection

2023-11-17 10:58:37

by syzbot

Subject: Re: [syzbot] Test

For archival purposes, forwarding an incoming command email to
[email protected], [email protected].

***

Subject: Test
Author: [email protected]

#syz test:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

2023-11-17 11:53:19

by syzbot

Subject: Re: [syzbot] [cgroups?] possible deadlock in cgroup_free

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: [email protected]

Tested on:

commit: 7475e51b Merge tag 'net-6.7-rc2' of git://git.kernel.o..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=10621a14e80000
kernel config: https://syzkaller.appspot.com/x/.config?x=69e454cdc811976a
dashboard link: https://syzkaller.appspot.com/bug?extid=cef555184e66963dabc2
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40

Note: no patches were applied.
Note: testing is done by a robot and is best-effort only.

2023-11-19 15:12:27

by Tejun Heo

Subject: Re: [syzbot] [cgroups?] possible deadlock in cgroup_free

On Thu, Nov 16, 2023 at 05:25:05PM -0800, syzbot wrote:
> syzbot has bisected this issue to:
>
> commit 2d25a889601d2fbc87ec79b30ea315820f874b78
> Author: Peter Zijlstra <[email protected]>
> Date: Sun Sep 17 11:24:21 2023 +0000
>
> ptrace: Convert ptrace_attach() to use lock guards

Looks like the tasklist_lock conversion in ptrace_attach() forgot _irq.
Peter, Oleg?

Thanks.

--
tejun

2023-11-19 15:32:19

by Oleg Nesterov

Subject: Re: [syzbot] [cgroups?] possible deadlock in cgroup_free

On 11/19, Tejun Heo wrote:
>
> On Thu, Nov 16, 2023 at 05:25:05PM -0800, syzbot wrote:
> > syzbot has bisected this issue to:
> >
> > commit 2d25a889601d2fbc87ec79b30ea315820f874b78
> > Author: Peter Zijlstra <[email protected]>
> > Date: Sun Sep 17 11:24:21 2023 +0000
> >
> > ptrace: Convert ptrace_attach() to use lock guards
>
> Looks like the tasklist_lock conversion in ptrace_attach() forgot _irq.
> Peter, Oleg?

Yes, please see

Re: [syzbot] [kernel?] inconsistent lock state in ptrace_attach
https://lore.kernel.org/all/[email protected]/

Oleg.
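
For readers following the lockdep output earlier in this thread, here is a
minimal userspace sketch of the rule being reported. It is not kernel code and
not lockdep's implementation: struct lock_class, note_acquire(),
bad_softirq_dependency() and model_report() are hypothetical names made up for
illustration, and the model only looks at one direct dependency, whereas
lockdep also follows chains of locks. It assumes the reading given above,
namely that the siglock acquisition in ptrace_attach() ran with interrupts
enabled because the outer lock no longer disabled them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for lockdep's per-class usage bits. */
struct lock_class {
	const char *name;
	bool used_in_softirq;  /* ever acquired in softirq context ("IN-SOFTIRQ-W") */
	bool held_softirq_on;  /* ever acquired with softirqs enabled ("SOFTIRQ-ON-W") */
};

/* Record the context of one acquisition of a lock class. */
static void note_acquire(struct lock_class *c, bool in_softirq, bool softirqs_enabled)
{
	/* Code running inside a softirq does not have softirqs enabled. */
	assert(!(in_softirq && softirqs_enabled));
	if (in_softirq)
		c->used_in_softirq = true;
	if (softirqs_enabled)
		c->held_softirq_on = true;
}

/*
 * Taking `next` while holding `prev` is reported when `prev` is used from
 * softirq context and `next` has ever been held with softirqs enabled.
 */
static bool bad_softirq_dependency(const struct lock_class *prev,
				   const struct lock_class *next)
{
	return prev->used_in_softirq && next->held_softirq_on;
}

/* Replay the three acquisitions named in the report. */
static bool model_report(bool attach_disables_irqs)
{
	struct lock_class css_set_lock = { "css_set_lock", false, false };
	struct lock_class siglock = { "&sighand->siglock", false, false };

	/* cgroup_free() called from an RCU callback: softirq context. */
	note_acquire(&css_set_lock, true, false);

	/* ptrace_attach(): process context; softirqs stay on unless irqs are off. */
	note_acquire(&siglock, false, !attach_disables_irqs);

	/* cgroup_migrate_execute(): holds css_set_lock, then takes siglock. */
	return bad_softirq_dependency(&css_set_lock, &siglock);
}
```

In this model, model_report(false) corresponds to the state syzbot observed and
flags the dependency, while model_report(true) corresponds to an attach path
that disables interrupts around the siglock acquisition and does not.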