2023-05-12 06:17:04

by syzbot

[permalink] [raw]
Subject: [syzbot] [kernel?] possible deadlock in __hrtimer_run_queues

Hello,

syzbot found the following issue on:

HEAD commit: 1dc3731daf1f Merge tag 'for-6.4-rc1-tag' of git://git.kern..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14ebdafa280000
kernel config: https://syzkaller.appspot.com/x/.config?x=8bc832f563d8bf38
dashboard link: https://syzkaller.appspot.com/bug?extid=3384541342de0ca933f1
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
userspace arch: i386

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7bc7510fe41f/non_bootable_disk-1dc3731d.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/9ac41c523f85/vmlinux-1dc3731d.xz
kernel image: https://storage.googleapis.com/syzbot-assets/40b82936b92f/bzImage-1dc3731d.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

======================================================
WARNING: possible circular locking dependency detected
6.4.0-rc1-syzkaller-00011-g1dc3731daf1f #0 Not tainted
------------------------------------------------------
syz-fuzzer/5789 is trying to acquire lock:
ffff88807ffdaba0 (&pgdat->kswapd_wait){-...}-{2:2}, at: __wake_up_common_lock+0xb8/0x140 kernel/sched/wait.c:137

but task is already holding lock:
ffff88802c92b858 (hrtimer_bases.lock){-.-.}-{2:2}, at: __run_hrtimer kernel/time/hrtimer.c:1689 [inline]
ffff88802c92b858 (hrtimer_bases.lock){-.-.}-{2:2}, at: __hrtimer_run_queues+0x23e/0xbe0 kernel/time/hrtimer.c:1749

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #4 (hrtimer_bases.lock){-.-.}-{2:2}:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/spinlock.c:162
lock_hrtimer_base kernel/time/hrtimer.c:173 [inline]
hrtimer_start_range_ns+0xe9/0xd80 kernel/time/hrtimer.c:1296
hrtimer_start_expires include/linux/hrtimer.h:432 [inline]
do_start_rt_bandwidth kernel/sched/rt.c:116 [inline]
start_rt_bandwidth kernel/sched/rt.c:127 [inline]
inc_rt_group kernel/sched/rt.c:1241 [inline]
inc_rt_tasks kernel/sched/rt.c:1285 [inline]
__enqueue_rt_entity kernel/sched/rt.c:1461 [inline]
enqueue_rt_entity kernel/sched/rt.c:1510 [inline]
enqueue_task_rt+0xa86/0xfc0 kernel/sched/rt.c:1545
enqueue_task+0xad/0x330 kernel/sched/core.c:2082
__sched_setscheduler.constprop.0+0xb89/0x25d0 kernel/sched/core.c:7774
_sched_setscheduler kernel/sched/core.c:7820 [inline]
sched_setscheduler_nocheck kernel/sched/core.c:7867 [inline]
sched_set_fifo+0xb1/0x110 kernel/sched/core.c:7891
irq_thread+0xe3/0x540 kernel/irq/manage.c:1302
kthread+0x344/0x440 kernel/kthread.c:379
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308

-> #3 (&rt_b->rt_runtime_lock){-.-.}-{2:2}:
__raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
_raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
__enable_runtime kernel/sched/rt.c:876 [inline]
rq_online_rt+0xb3/0x3b0 kernel/sched/rt.c:2485
set_rq_online.part.0+0xf9/0x140 kernel/sched/core.c:9541
set_rq_online kernel/sched/core.c:9533 [inline]
sched_cpu_activate+0x216/0x440 kernel/sched/core.c:9649
cpuhp_invoke_callback+0x645/0xeb0 kernel/cpu.c:192
cpuhp_thread_fun+0x47f/0x700 kernel/cpu.c:815
smpboot_thread_fn+0x659/0x9e0 kernel/smpboot.c:164
kthread+0x344/0x440 kernel/kthread.c:379
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308

-> #2 (&rq->__lock){-.-.}-{2:2}:
_raw_spin_lock_nested+0x34/0x40 kernel/locking/spinlock.c:378
raw_spin_rq_lock_nested+0x2f/0x120 kernel/sched/core.c:558
raw_spin_rq_lock kernel/sched/sched.h:1366 [inline]
rq_lock kernel/sched/sched.h:1653 [inline]
task_fork_fair+0x74/0x4f0 kernel/sched/fair.c:12095
sched_cgroup_fork+0x3d1/0x540 kernel/sched/core.c:4777
copy_process+0x4b8a/0x7600 kernel/fork.c:2618
kernel_clone+0xeb/0x890 kernel/fork.c:2918
user_mode_thread+0xb1/0xf0 kernel/fork.c:2996
rest_init+0x27/0x2b0 init/main.c:700
arch_call_rest_init+0x13/0x30 init/main.c:834
start_kernel+0x3b6/0x490 init/main.c:1088
x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:556
x86_64_start_kernel+0xb3/0xc0 arch/x86/kernel/head64.c:537
secondary_startup_64_no_verify+0xf4/0xfb

-> #1 (&p->pi_lock){-.-.}-{2:2}:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/spinlock.c:162
try_to_wake_up+0xab/0x1c40 kernel/sched/core.c:4191
autoremove_wake_function+0x16/0x150 kernel/sched/wait.c:419
__wake_up_common+0x147/0x650 kernel/sched/wait.c:107
__wake_up_common_lock+0xd4/0x140 kernel/sched/wait.c:138
wakeup_kswapd+0x3fe/0x5c0 mm/vmscan.c:7797
rmqueue mm/page_alloc.c:3057 [inline]
get_page_from_freelist+0x6c5/0x2c00 mm/page_alloc.c:3499
__alloc_pages+0x1cb/0x4a0 mm/page_alloc.c:4768
__folio_alloc+0x16/0x40 mm/page_alloc.c:4800
vma_alloc_folio+0x155/0x890 mm/mempolicy.c:2240
wp_page_copy mm/memory.c:3070 [inline]
do_wp_page+0x173d/0x33c0 mm/memory.c:3432
handle_pte_fault mm/memory.c:4964 [inline]
__handle_mm_fault+0x1635/0x41c0 mm/memory.c:5089
handle_mm_fault+0x2af/0x9f0 mm/memory.c:5243
do_user_addr_fault+0x2ca/0x1210 arch/x86/mm/fault.c:1349
handle_page_fault arch/x86/mm/fault.c:1534 [inline]
exc_page_fault+0x98/0x170 arch/x86/mm/fault.c:1590
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:570

-> #0 (&pgdat->kswapd_wait){-...}-{2:2}:
check_prev_add kernel/locking/lockdep.c:3108 [inline]
check_prevs_add kernel/locking/lockdep.c:3227 [inline]
validate_chain kernel/locking/lockdep.c:3842 [inline]
__lock_acquire+0x2f21/0x5df0 kernel/locking/lockdep.c:5074
lock_acquire kernel/locking/lockdep.c:5691 [inline]
lock_acquire+0x1b1/0x520 kernel/locking/lockdep.c:5656
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/spinlock.c:162
__wake_up_common_lock+0xb8/0x140 kernel/sched/wait.c:137
wakeup_kswapd+0x3fe/0x5c0 mm/vmscan.c:7797
wake_all_kswapds+0x182/0x2d0 mm/page_alloc.c:4028
__alloc_pages_slowpath.constprop.0+0x1724/0x2170 mm/page_alloc.c:4296
__alloc_pages+0x408/0x4a0 mm/page_alloc.c:4781
alloc_pages+0x1aa/0x270 mm/mempolicy.c:2279
alloc_slab_page mm/slub.c:1851 [inline]
allocate_slab+0x25f/0x390 mm/slub.c:1998
new_slab mm/slub.c:2051 [inline]
___slab_alloc+0xa91/0x1400 mm/slub.c:3192
__slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3291
__slab_alloc_node mm/slub.c:3344 [inline]
slab_alloc_node mm/slub.c:3441 [inline]
slab_alloc mm/slub.c:3459 [inline]
__kmem_cache_alloc_lru mm/slub.c:3466 [inline]
kmem_cache_alloc+0x38e/0x3b0 mm/slub.c:3475
kmem_cache_zalloc include/linux/slab.h:670 [inline]
fill_pool+0x264/0x5c0 lib/debugobjects.c:168
debug_objects_fill_pool lib/debugobjects.c:597 [inline]
debug_object_activate+0xfd/0x400 lib/debugobjects.c:693
debug_hrtimer_activate kernel/time/hrtimer.c:420 [inline]
debug_activate kernel/time/hrtimer.c:475 [inline]
enqueue_hrtimer+0x27/0x320 kernel/time/hrtimer.c:1084
__run_hrtimer kernel/time/hrtimer.c:1702 [inline]
__hrtimer_run_queues+0xa5b/0xbe0 kernel/time/hrtimer.c:1749
hrtimer_interrupt+0x320/0x7b0 kernel/time/hrtimer.c:1811
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1095 [inline]
__sysvec_apic_timer_interrupt+0x14a/0x430 arch/x86/kernel/apic/apic.c:1112
sysvec_apic_timer_interrupt+0x44/0xc0 arch/x86/kernel/apic/apic.c:1106
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645

other info that might help us debug this:

Chain exists of:
&pgdat->kswapd_wait --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(hrtimer_bases.lock);
lock(&rt_b->rt_runtime_lock);
lock(hrtimer_bases.lock);
lock(&pgdat->kswapd_wait);

*** DEADLOCK ***

1 lock held by syz-fuzzer/5789:
#0: ffff88802c92b858 (hrtimer_bases.lock){-.-.}-{2:2}, at: __run_hrtimer kernel/time/hrtimer.c:1689 [inline]
#0: ffff88802c92b858 (hrtimer_bases.lock){-.-.}-{2:2}, at: __hrtimer_run_queues+0x23e/0xbe0 kernel/time/hrtimer.c:1749

stack backtrace:
CPU: 3 PID: 5789 Comm: syz-fuzzer Not tainted 6.4.0-rc1-syzkaller-00011-g1dc3731daf1f #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106
check_noncircular+0x25f/0x2e0 kernel/locking/lockdep.c:2188
check_prev_add kernel/locking/lockdep.c:3108 [inline]
check_prevs_add kernel/locking/lockdep.c:3227 [inline]
validate_chain kernel/locking/lockdep.c:3842 [inline]
__lock_acquire+0x2f21/0x5df0 kernel/locking/lockdep.c:5074
lock_acquire kernel/locking/lockdep.c:5691 [inline]
lock_acquire+0x1b1/0x520 kernel/locking/lockdep.c:5656
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/spinlock.c:162
__wake_up_common_lock+0xb8/0x140 kernel/sched/wait.c:137
wakeup_kswapd+0x3fe/0x5c0 mm/vmscan.c:7797
wake_all_kswapds+0x182/0x2d0 mm/page_alloc.c:4028
__alloc_pages_slowpath.constprop.0+0x1724/0x2170 mm/page_alloc.c:4296
__alloc_pages+0x408/0x4a0 mm/page_alloc.c:4781
alloc_pages+0x1aa/0x270 mm/mempolicy.c:2279
alloc_slab_page mm/slub.c:1851 [inline]
allocate_slab+0x25f/0x390 mm/slub.c:1998
new_slab mm/slub.c:2051 [inline]
___slab_alloc+0xa91/0x1400 mm/slub.c:3192
__slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3291
__slab_alloc_node mm/slub.c:3344 [inline]
slab_alloc_node mm/slub.c:3441 [inline]
slab_alloc mm/slub.c:3459 [inline]
__kmem_cache_alloc_lru mm/slub.c:3466 [inline]
kmem_cache_alloc+0x38e/0x3b0 mm/slub.c:3475
kmem_cache_zalloc include/linux/slab.h:670 [inline]
fill_pool+0x264/0x5c0 lib/debugobjects.c:168
debug_objects_fill_pool lib/debugobjects.c:597 [inline]
debug_object_activate+0xfd/0x400 lib/debugobjects.c:693
debug_hrtimer_activate kernel/time/hrtimer.c:420 [inline]
debug_activate kernel/time/hrtimer.c:475 [inline]
enqueue_hrtimer+0x27/0x320 kernel/time/hrtimer.c:1084
__run_hrtimer kernel/time/hrtimer.c:1702 [inline]
__hrtimer_run_queues+0xa5b/0xbe0 kernel/time/hrtimer.c:1749
hrtimer_interrupt+0x320/0x7b0 kernel/time/hrtimer.c:1811
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1095 [inline]
__sysvec_apic_timer_interrupt+0x14a/0x430 arch/x86/kernel/apic/apic.c:1112
sysvec_apic_timer_interrupt+0x44/0xc0 arch/x86/kernel/apic/apic.c:1106
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
RIP: 0033:0x4804d1
Code: ff 74 1c 48 8b 0d 4f 3a ce 00 48 8b 3d 50 3a ce 00 31 c0 31 db 48 8b 6c 24 18 48 83 c4 20 c3 48 89 f1 4c 89 c7 48 8b 6c 24 18 <48> 83 c4 20 c3 48 89 44 24 08 48 89 5c 24 10 e8 fb 71 fe ff 48 8b
RSP: 002b:000000c000e954c0 EFLAGS: 00000246
RAX: 0000000000921f51 RBX: 0000000000000005 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 000000c000e95520 R08: 0000000000000000 R09: 0000000000921f50
R10: 0000000000921f51 R11: 0000000000000000 R12: 000000c000e95468
R13: ffffffffffffffff R14: 000000c000083860 R15: 000000c00dbb6c00
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the bug is already fixed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to change bug's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the bug is a duplicate of another bug, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


2023-05-12 12:28:48

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [syzbot] [kernel?] possible deadlock in __hrtimer_run_queues

On Thu, May 11 2023 at 22:55, syzbot wrote:
> HEAD commit: 1dc3731daf1f Merge tag 'for-6.4-rc1-tag' of git://git.kern..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=14ebdafa280000
> kernel config: https://syzkaller.appspot.com/x/.config?x=8bc832f563d8bf38
> dashboard link: https://syzkaller.appspot.com/bug?extid=3384541342de0ca933f1
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> userspace arch: i386
>
> Unfortunately, I don't have any reproducer for this issue yet.

No problem. The lockdep splat is enough data.

Fix below.

Thanks,

tglx

---

Subject: debugobjects: Prevent pool refill deadlock
From: Thomas Gleixner <[email protected]>
Date: Fri, 12 May 2023 12:23:17 +0200

The unconditional pool refill in debug_objects_activate() and
debug_objects_assert_init() can result in a dead lock when invoked from
the hrtimer [soft]interrupt:

Chain exists of:
&pgdat->kswapd_wait --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock

Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(hrtimer_bases.lock);
lock(&rt_b->rt_runtime_lock);
lock(hrtimer_bases.lock);
lock(&pgdat->kswapd_wait);

*** DEADLOCK ***

This went unnoticed in testing as the test system never tried to wake up
the kswapd_wait queue.

The original behaviour before commit 63a759694eed ("debugobject: Prevent
init race with static objects") was that the pool refill only happened when
the tracking object lookup failed.

Restore this behaviour by trying a lookup first and only if that fails to
invoke pool refill and then do the lookup_or_alloc().

That does not reintroduce the init race problem because lookup_or_alloc()
is still an atomic operation protected by the hash bucket lock and only one
context can do the static object stack and tracking object association and
the other one will observe the tracking object.

Fixes: 0af462f19e63 ("debugobject: Ensure pool refill (again)")
Reported-by: [email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
---
lib/debugobjects.c | 43 +++++++++++++++++++++++--------------------
1 file changed, 23 insertions(+), 20 deletions(-)

--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -124,7 +124,7 @@ static const char *obj_states[ODEBUG_STA
[ODEBUG_STATE_NOTAVAILABLE] = "not available",
};

-static void fill_pool(void)
+static void debug_objects_fill_pool(void)
{
gfp_t gfp = GFP_ATOMIC | __GFP_NORETRY | __GFP_NOWARN;
struct debug_obj *obj;
@@ -157,6 +157,13 @@ static void fill_pool(void)
raw_spin_unlock_irqrestore(&pool_lock, flags);
}

+ /*
+ * On RT enabled kernels the pool refill must happen in preemptible
+ * context:
+ */
+ if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible())
+ return;
+
if (unlikely(!obj_cache))
return;

@@ -587,16 +594,6 @@ static struct debug_obj *lookup_object_o
return NULL;
}

-static void debug_objects_fill_pool(void)
-{
- /*
- * On RT enabled kernels the pool refill must happen in preemptible
- * context:
- */
- if (!IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible())
- fill_pool();
-}
-
static void
__debug_object_init(void *addr, const struct debug_obj_descr *descr, int onstack)
{
@@ -690,13 +687,16 @@ int debug_object_activate(void *addr, co
if (!debug_objects_enabled)
return 0;

- debug_objects_fill_pool();
-
db = get_bucket((unsigned long) addr);
-
raw_spin_lock_irqsave(&db->lock, flags);
+ obj = lookup_object(addr, db);
+ if (!obj) {
+ raw_spin_unlock_irqrestore(&db->lock, flags);
+ debug_objects_fill_pool();
+ raw_spin_lock_irqsave(&db->lock, flags);
+ obj = lookup_object_or_alloc(addr, db, descr, false, true);
+ }

- obj = lookup_object_or_alloc(addr, db, descr, false, true);
if (likely(!IS_ERR_OR_NULL(obj))) {
bool print_object = false;

@@ -901,13 +901,16 @@ void debug_object_assert_init(void *addr
if (!debug_objects_enabled)
return;

- debug_objects_fill_pool();
-
db = get_bucket((unsigned long) addr);
-
raw_spin_lock_irqsave(&db->lock, flags);
- obj = lookup_object_or_alloc(addr, db, descr, false, true);
- raw_spin_unlock_irqrestore(&db->lock, flags);
+ obj = lookup_object(addr, db);
+ if (!obj) {
+ raw_spin_unlock_irqrestore(&db->lock, flags);
+ debug_objects_fill_pool();
+ raw_spin_lock_irqsave(&db->lock, flags);
+ obj = lookup_object_or_alloc(addr, db, descr, false, true);
+ }
+
if (likely(!IS_ERR_OR_NULL(obj)))
return;


2023-05-13 09:53:09

by Ido Schimmel

[permalink] [raw]
Subject: Re: [syzbot] [kernel?] possible deadlock in __hrtimer_run_queues

On Fri, May 12, 2023 at 01:41:47PM +0200, Thomas Gleixner wrote:
> No problem. The lockdep splat is enough data.
>
> Fix below.

Thomas, I applied the fix on top of net-next (which includes
0af462f19e63). I'm getting the splat below during boot and then the
systems hangs.

Thanks

[ 1.708425] ============================================
[ 1.709253] WARNING: possible recursive locking detected
[ 1.710079] 6.4.0-rc1-custom-gba79e9a73284-dirty #52 Not tainted
[ 1.711016] --------------------------------------------
[ 1.711842] swapper/0/0 is trying to acquire lock:
[ 1.712593] ffffffff8863bfb0 (&obj_hash[i].lock){....}-{2:2}, at: debug_object_activate+0x149/0x560
[ 1.714037]
[ 1.714037] but task is already holding lock:
[ 1.714947] ffffffff8863bfb0 (&obj_hash[i].lock){....}-{2:2}, at: debug_object_assert_init+0x279/0x410
[ 1.716411]
[ 1.716411] other info that might help us debug this:
[ 1.717419] Possible unsafe locking scenario:
[ 1.717419]
[ 1.718332] CPU0
[ 1.718716] ----
[ 1.719099] lock(&obj_hash[i].lock);
[ 1.719695] lock(&obj_hash[i].lock);
[ 1.720282]
[ 1.720282] *** DEADLOCK ***
[ 1.720282]
[ 1.721195] May be due to missing lock nesting notation
[ 1.721195]
[ 1.722245] 2 locks held by swapper/0/0:
[ 1.722861] #0: ffffffff8863bfb0 (&obj_hash[i].lock){....}-{2:2}, at: debug_object_assert_init+0x279/0x410
[ 1.724393] #1: ffff8881f61e9618 (&base->lock){....}-{2:2}, at: lock_timer_base+0x5f/0x200
[ 1.725718]
[ 1.725718] stack backtrace:
[ 1.726396] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.0-rc1-custom-gba79e9a73284-dirty #52
[ 1.727737] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
[ 1.729091] Call Trace:
[ 1.729473] <TASK>
[ 1.729816] dump_stack_lvl+0x91/0xf0
[ 1.730400] check_deadlock+0x618/0xb10
[ 1.731023] __lock_acquire+0x2226/0x62d0
[ 1.731664] ? lockdep_hardirqs_on_prepare+0x420/0x420
[ 1.732475] lock_acquire+0x1a7/0x4e0
[ 1.733057] ? debug_object_activate+0x149/0x560
[ 1.733791] ? lock_sync+0x180/0x180
[ 1.734354] ? lock_acquired+0x238/0xa30
[ 1.735000] ? lock_acquire+0x1a7/0x4e0
[ 1.735613] ? find_held_lock+0x34/0x110
[ 1.736238] _raw_spin_lock_irqsave+0x46/0x90
[ 1.736936] ? debug_object_activate+0x149/0x560
[ 1.737669] debug_object_activate+0x149/0x560
[ 1.738372] ? debug_object_destroy+0x210/0x210
[ 1.739095] ? _raw_spin_lock_irqsave+0x60/0x90
[ 1.739820] ? lock_timer_base+0x184/0x200
[ 1.740469] __mod_timer+0x820/0xea0
[ 1.741047] ? timer_fixup_activate+0x2c0/0x2c0
[ 1.741770] ? find_held_lock+0x34/0x110
[ 1.742397] ? lock_release+0x3b5/0xbd0
[ 1.743013] add_timer+0x67/0x90
[ 1.743528] __queue_delayed_work+0x1a2/0x270
[ 1.744226] queue_delayed_work_on+0xce/0xd0
[ 1.744912] crng_reseed+0x124/0x270
[ 1.745485] ? random_write_iter+0x20/0x20
[ 1.746137] ? add_device_randomness+0xbd/0xf0
[ 1.746851] ? blake2s_update+0x106/0x4e0
[ 1.747492] random_init+0xcd/0x1a0
[ 1.748054] ? random_init_early+0x2a0/0x2a0
[ 1.748739] ? hrtimers_prepare_cpu+0xab/0x330
[ 1.749446] start_kernel+0x226/0x520
[ 1.750039] x86_64_start_reservations+0x18/0x30
[ 1.750776] x86_64_start_kernel+0xf4/0x150
[ 1.751432] secondary_startup_64_no_verify+0x106/0x10b
[ 1.752264] </TASK>

2023-05-13 20:18:23

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [syzbot] [kernel?] possible deadlock in __hrtimer_run_queues

Ido!

On Sat, May 13 2023 at 12:37, Ido Schimmel wrote:
> Thomas, I applied the fix on top of net-next (which includes
> 0af462f19e63). I'm getting the splat below during boot and then the
> systems hangs.

Because I'm a moron. I got the same splat while testing, fixed it on the
test machine and then did not sync it back before sending...

Updated fix which also replaces GFP_ATOMIC with __GFP_HIGH to prevent
the allocator to wake up the swap daemons, which causes yet another lock
inversion issue.

Thanks for testing!

tglx
---
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -124,9 +124,9 @@ static const char *obj_states[ODEBUG_STA
[ODEBUG_STATE_NOTAVAILABLE] = "not available",
};

-static void fill_pool(void)
+static void debug_objects_fill_pool(void)
{
- gfp_t gfp = GFP_ATOMIC | __GFP_NORETRY | __GFP_NOWARN;
+ gfp_t gfp = __GFP_HIGH | __GFP_NORETRY | __GFP_NOWARN;
struct debug_obj *obj;
unsigned long flags;

@@ -157,6 +157,13 @@ static void fill_pool(void)
raw_spin_unlock_irqrestore(&pool_lock, flags);
}

+ /*
+ * On RT enabled kernels the pool refill must happen in preemptible
+ * context:
+ */
+ if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible())
+ return;
+
if (unlikely(!obj_cache))
return;

@@ -587,16 +594,6 @@ static struct debug_obj *lookup_object_o
return NULL;
}

-static void debug_objects_fill_pool(void)
-{
- /*
- * On RT enabled kernels the pool refill must happen in preemptible
- * context:
- */
- if (!IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible())
- fill_pool();
-}
-
static void
__debug_object_init(void *addr, const struct debug_obj_descr *descr, int onstack)
{
@@ -690,13 +687,16 @@ int debug_object_activate(void *addr, co
if (!debug_objects_enabled)
return 0;

- debug_objects_fill_pool();
-
db = get_bucket((unsigned long) addr);
-
raw_spin_lock_irqsave(&db->lock, flags);
+ obj = lookup_object(addr, db);
+ if (!obj) {
+ raw_spin_unlock_irqrestore(&db->lock, flags);
+ debug_objects_fill_pool();
+ raw_spin_lock_irqsave(&db->lock, flags);
+ obj = lookup_object_or_alloc(addr, db, descr, false, true);
+ }

- obj = lookup_object_or_alloc(addr, db, descr, false, true);
if (likely(!IS_ERR_OR_NULL(obj))) {
bool print_object = false;

@@ -901,13 +901,17 @@ void debug_object_assert_init(void *addr
if (!debug_objects_enabled)
return;

- debug_objects_fill_pool();
-
db = get_bucket((unsigned long) addr);
-
raw_spin_lock_irqsave(&db->lock, flags);
- obj = lookup_object_or_alloc(addr, db, descr, false, true);
+ obj = lookup_object(addr, db);
+ if (!obj) {
+ raw_spin_unlock_irqrestore(&db->lock, flags);
+ debug_objects_fill_pool();
+ raw_spin_lock_irqsave(&db->lock, flags);
+ obj = lookup_object_or_alloc(addr, db, descr, false, true);
+ }
raw_spin_unlock_irqrestore(&db->lock, flags);
+
if (likely(!IS_ERR_OR_NULL(obj)))
return;


2023-05-14 07:24:59

by Ido Schimmel

[permalink] [raw]
Subject: Re: [syzbot] [kernel?] possible deadlock in __hrtimer_run_queues

On Sat, May 13, 2023 at 09:34:12PM +0200, Thomas Gleixner wrote:
> On Sat, May 13 2023 at 12:37, Ido Schimmel wrote:
> > Thomas, I applied the fix on top of net-next (which includes
> > 0af462f19e63). I'm getting the splat below during boot and then the
> > systems hangs.
>
> Because I'm a moron. I got the same splat while testing, fixed it on the
> test machine and then did not sync it back before sending...
>
> Updated fix which also replaces GFP_ATOMIC with __GFP_HIGH to prevent
> the allocator to wake up the swap daemons, which causes yet another lock
> inversion issue.

Looks good now. Feel free to add:

Tested-by: Ido Schimmel <[email protected]>

Thanks a lot!