2020-10-08 15:03:51

by syzbot

Subject: inconsistent lock state in xa_destroy

Hello,

syzbot found the following issue on:

HEAD commit: e4fb79c7 Add linux-next specific files for 20201008
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=12555227900000
kernel config: https://syzkaller.appspot.com/x/.config?x=568d41fe4341ed0f
dashboard link: https://syzkaller.appspot.com/bug?extid=cdcbdc0bd42e559b52b9
compiler: gcc (GCC) 10.1.0-syz 20200507

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

================================
WARNING: inconsistent lock state
5.9.0-rc8-next-20201008-syzkaller #0 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
syz-executor.2/6913 [HC0[0]:SC1[1]:HE0:SE0] takes:
ffff888023003c18 (&xa->xa_lock#9){+.?.}-{2:2}, at: xa_destroy+0xaa/0x350 lib/xarray.c:2205
{SOFTIRQ-ON-W} state was registered at:
lock_acquire+0x1f2/0xaa0 kernel/locking/lockdep.c:5419
__raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
_raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
spin_lock include/linux/spinlock.h:354 [inline]
io_uring_add_task_file fs/io_uring.c:8607 [inline]
io_uring_add_task_file+0x207/0x430 fs/io_uring.c:8590
io_uring_get_fd fs/io_uring.c:9116 [inline]
io_uring_create fs/io_uring.c:9280 [inline]
io_uring_setup+0x2727/0x3660 fs/io_uring.c:9314
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
irq event stamp: 362445
hardirqs last enabled at (362444): [<ffffffff8847f0df>] __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
hardirqs last enabled at (362444): [<ffffffff8847f0df>] _raw_spin_unlock_irqrestore+0x6f/0x90 kernel/locking/spinlock.c:191
hardirqs last disabled at (362445): [<ffffffff8847f6c9>] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
hardirqs last disabled at (362445): [<ffffffff8847f6c9>] _raw_spin_lock_irqsave+0xa9/0xd0 kernel/locking/spinlock.c:159
softirqs last enabled at (361998): [<ffffffff86db0172>] tcp_close+0x8d2/0x1220 net/ipv4/tcp.c:2576
softirqs last disabled at (362079): [<ffffffff88600f2f>] asm_call_irq_on_stack+0xf/0x20

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&xa->xa_lock#9);
<Interrupt>
lock(&xa->xa_lock#9);

*** DEADLOCK ***

1 lock held by syz-executor.2/6913:
#0: ffffffff8a554c80 (rcu_callback){....}-{0:0}, at: rcu_do_batch kernel/rcu/tree.c:2474 [inline]
#0: ffffffff8a554c80 (rcu_callback){....}-{0:0}, at: rcu_core+0x5d8/0x1240 kernel/rcu/tree.c:2718

stack backtrace:
CPU: 0 PID: 6913 Comm: syz-executor.2 Not tainted 5.9.0-rc8-next-20201008-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x198/0x1fb lib/dump_stack.c:118
print_usage_bug kernel/locking/lockdep.c:3715 [inline]
valid_state kernel/locking/lockdep.c:3726 [inline]
mark_lock_irq kernel/locking/lockdep.c:3929 [inline]
mark_lock.cold+0x32/0x74 kernel/locking/lockdep.c:4396
mark_usage kernel/locking/lockdep.c:4281 [inline]
__lock_acquire+0x118a/0x56d0 kernel/locking/lockdep.c:4771
lock_acquire+0x1f2/0xaa0 kernel/locking/lockdep.c:5419
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x94/0xd0 kernel/locking/spinlock.c:159
xa_destroy+0xaa/0x350 lib/xarray.c:2205
__io_uring_free+0x60/0xc0 fs/io_uring.c:7693
io_uring_free include/linux/io_uring.h:40 [inline]
__put_task_struct+0xff/0x3f0 kernel/fork.c:732
put_task_struct include/linux/sched/task.h:111 [inline]
delayed_put_task_struct+0x1f6/0x340 kernel/exit.c:172
rcu_do_batch kernel/rcu/tree.c:2484 [inline]
rcu_core+0x645/0x1240 kernel/rcu/tree.c:2718
__do_softirq+0x203/0xab6 kernel/softirq.c:298
asm_call_irq_on_stack+0xf/0x20
</IRQ>
__run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline]
run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline]
do_softirq_own_stack+0x9b/0xd0 arch/x86/kernel/irq_64.c:77
invoke_softirq kernel/softirq.c:393 [inline]
__irq_exit_rcu kernel/softirq.c:423 [inline]
irq_exit_rcu+0x235/0x280 kernel/softirq.c:435
sysvec_apic_timer_interrupt+0x51/0xf0 arch/x86/kernel/apic/apic.c:1091
asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:631
RIP: 0010:memset_erms+0x9/0x10 arch/x86/lib/memset_64.S:66
Code: c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 f3 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 <f3> aa 4c 89 c8 c3 90 49 89 fa 40 0f b6 ce 48 b8 01 01 01 01 01 01
RSP: 0018:ffffc900053c7b78 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000002040
RDX: 0000000000008000 RSI: 0000000000000000 RDI: ffffc900161a5fc0
RBP: ffffc900053c7d08 R08: 0000000000000001 R09: ffffc900161a0000
R10: fffff52002c34fff R11: 0000000000000000 R12: ffff88805b9f0380
R13: ffff888010ccae08 R14: 0000000001200000 R15: 0000000000000000
memset include/linux/string.h:384 [inline]
alloc_thread_stack_node kernel/fork.c:232 [inline]
dup_task_struct kernel/fork.c:864 [inline]
copy_process+0x68a/0x6e90 kernel/fork.c:1938
kernel_clone+0xe5/0xae0 kernel/fork.c:2456
__do_sys_clone+0xc8/0x110 kernel/fork.c:2573
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45c3fa
Code: f7 d8 64 89 04 25 d4 02 00 00 64 4c 8b 0c 25 10 00 00 00 31 d2 4d 8d 91 d0 02 00 00 31 f6 bf 11 00 20 01 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 f5 00 00 00 85 c0 41 89 c5 0f 85 fc 00 00
RSP: 002b:00007ffe5dc445b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
RAX: ffffffffffffffda RBX: 00007ffe5dc445b0 RCX: 000000000045c3fa
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
RBP: 00007ffe5dc445f0 R08: 0000000000000001 R09: 0000000002f46940
R10: 0000000002f46c10 R11: 0000000000000246 R12: 0000000000000001
R13: 0000000000000000 R14: 0000000000000001 R15: 00007ffe5dc44640


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.


2020-10-08 15:09:32

by Jens Axboe

Subject: Re: inconsistent lock state in xa_destroy

On 10/8/20 9:05 AM, Matthew Wilcox wrote:
> On Thu, Oct 08, 2020 at 09:01:57AM -0600, Jens Axboe wrote:
>> On 10/8/20 9:00 AM, syzbot wrote:
>>> Hello,
>>>
>>> syzbot found the following issue on:
>>>
>>> HEAD commit: e4fb79c7 Add linux-next specific files for 20201008
>>> git tree: linux-next
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=12555227900000
>>> kernel config: https://syzkaller.appspot.com/x/.config?x=568d41fe4341ed0f
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=cdcbdc0bd42e559b52b9
>>> compiler: gcc (GCC) 10.1.0-syz 20200507
>>>
>>> Unfortunately, I don't have any reproducer for this issue yet.
>>>
>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>> Reported-by: [email protected]
>>
>> Already pushed out a fix for this, it's really an xarray issue where it just
>> assumes that destroy can irq grab the lock.
>
> ... nice of you to report the issue to the XArray maintainer.

This is from not even 12h ago, 10h of which I was offline. It wasn't on
the top of my list of priority items to tackle this morning, but it
is/was on the list.

--
Jens Axboe

2020-10-08 15:35:12

by Jens Axboe

Subject: Re: inconsistent lock state in xa_destroy

On 10/8/20 9:28 AM, Matthew Wilcox wrote:
> On Thu, Oct 08, 2020 at 09:06:56AM -0600, Jens Axboe wrote:
>> On 10/8/20 9:05 AM, Matthew Wilcox wrote:
>>> On Thu, Oct 08, 2020 at 09:01:57AM -0600, Jens Axboe wrote:
>>>> On 10/8/20 9:00 AM, syzbot wrote:
>>>>> Hello,
>>>>>
>>>>> syzbot found the following issue on:
>>>>>
>>>>> HEAD commit: e4fb79c7 Add linux-next specific files for 20201008
>>>>> git tree: linux-next
>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=12555227900000
>>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=568d41fe4341ed0f
>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=cdcbdc0bd42e559b52b9
>>>>> compiler: gcc (GCC) 10.1.0-syz 20200507
>>>>>
>>>>> Unfortunately, I don't have any reproducer for this issue yet.
>>>>>
>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>>> Reported-by: [email protected]
>>>>
>>>> Already pushed out a fix for this, it's really an xarray issue where it just
>>>> assumes that destroy can irq grab the lock.
>>>
>>> ... nice of you to report the issue to the XArray maintainer.
>>
>> This is from not even 12h ago, 10h of which I was offline. It wasn't on
>> the top of my list of priority items to tackle this morning, but it
>> is/was on the list.
>
> How's this?

Looks like that'll do the trick in avoiding similar future lockdep
splats for xa_destroy().

--
Jens Axboe

2020-10-08 15:56:09

by Matthew Wilcox

Subject: Re: inconsistent lock state in xa_destroy

On Thu, Oct 08, 2020 at 09:01:57AM -0600, Jens Axboe wrote:
> On 10/8/20 9:00 AM, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: e4fb79c7 Add linux-next specific files for 20201008
> > git tree: linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=12555227900000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=568d41fe4341ed0f
> > dashboard link: https://syzkaller.appspot.com/bug?extid=cdcbdc0bd42e559b52b9
> > compiler: gcc (GCC) 10.1.0-syz 20200507
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: [email protected]
>
> Already pushed out a fix for this, it's really an xarray issue where it just
> assumes that destroy can irq grab the lock.

... nice of you to report the issue to the XArray maintainer.

2020-10-08 15:56:41

by Matthew Wilcox

Subject: Re: inconsistent lock state in xa_destroy

On Thu, Oct 08, 2020 at 09:06:56AM -0600, Jens Axboe wrote:
> On 10/8/20 9:05 AM, Matthew Wilcox wrote:
> > On Thu, Oct 08, 2020 at 09:01:57AM -0600, Jens Axboe wrote:
> >> On 10/8/20 9:00 AM, syzbot wrote:
> >>> Hello,
> >>>
> >>> syzbot found the following issue on:
> >>>
> >>> HEAD commit: e4fb79c7 Add linux-next specific files for 20201008
> >>> git tree: linux-next
> >>> console output: https://syzkaller.appspot.com/x/log.txt?x=12555227900000
> >>> kernel config: https://syzkaller.appspot.com/x/.config?x=568d41fe4341ed0f
> >>> dashboard link: https://syzkaller.appspot.com/bug?extid=cdcbdc0bd42e559b52b9
> >>> compiler: gcc (GCC) 10.1.0-syz 20200507
> >>>
> >>> Unfortunately, I don't have any reproducer for this issue yet.
> >>>
> >>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >>> Reported-by: [email protected]
> >>
> >> Already pushed out a fix for this, it's really an xarray issue where it just
> >> assumes that destroy can irq grab the lock.
> >
> > ... nice of you to report the issue to the XArray maintainer.
>
> This is from not even 12h ago, 10h of which I was offline. It wasn't on
> the top of my list of priority items to tackle this morning, but it
> is/was on the list.

How's this?

diff --git a/lib/xarray.c b/lib/xarray.c
index 1e4ed5bce5dc..d84cb98d5485 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -1999,21 +1999,32 @@ EXPORT_SYMBOL_GPL(xa_delete_node); /* For the benefit of the test suite */
  * xa_destroy() - Free all internal data structures.
  * @xa: XArray.
  *
- * After calling this function, the XArray is empty and has freed all memory
- * allocated for its internal data structures. You are responsible for
- * freeing the objects referenced by the XArray.
- *
- * Context: Any context. Takes and releases the xa_lock, interrupt-safe.
+ * After calling this function, the XArray is empty and has freed all
+ * memory allocated for its internal data structures. You are responsible
+ * for freeing the objects referenced by the XArray.
+ *
+ * You do not need to call xa_destroy() if you know the XArray is
+ * already empty. The IDR used to require this, so you may see some
+ * old code calling idr_destroy() or xa_destroy() on arrays which we
+ * know to be empty, but new code should not do this.
+ *
+ * Context: If the XArray is protected by an IRQ-safe lock, this function
+ * must not be called from interrupt context or with interrupts disabled.
+ * Otherwise it may be called from any context. It will take and release
+ * the xa_lock with the appropriate disabling & enabling of softirqs
+ * or interrupts.
  */
 void xa_destroy(struct xarray *xa)
 {
 	XA_STATE(xas, xa, 0);
-	unsigned long flags;
+	unsigned int lock_type = xa_lock_type(xa);
 	void *entry;

 	xas.xa_node = NULL;
-	xas_lock_irqsave(&xas, flags);
+	xas_lock_type(&xas, lock_type);
 	entry = xa_head_locked(xa);
+	if (!entry)
+		goto out;
 	RCU_INIT_POINTER(xa->xa_head, NULL);
 	xas_init_marks(&xas);
 	if (xa_zero_busy(xa))
@@ -2021,7 +2032,8 @@ void xa_destroy(struct xarray *xa)
 	/* lockdep checks we're still holding the lock in xas_free_nodes() */
 	if (xa_is_node(entry))
 		xas_free_nodes(&xas, xa_to_node(entry));
-	xas_unlock_irqrestore(&xas, flags);
+out:
+	xas_unlock_type(&xas, lock_type);
 }
 EXPORT_SYMBOL(xa_destroy);
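
Editor's sketch, not part of the patch: the change above stops taking the lock with an unconditional irqsave and instead picks the locking primitive from the lock type the array was initialised with. The userspace model below only illustrates that dispatch idea. Every `toy_` name is hypothetical, the numeric values and the two-bit mask are assumptions about how the lock type is encoded in the flags, and the returned strings merely name the spinlock call that would plausibly be chosen.

```c
/* Toy userspace model of choosing a lock primitive by array lock type.
 * All toy_* names are hypothetical; this is not kernel code. */

enum toy_lock_type {
	TOY_LOCK_PLAIN = 0,	/* array used without disabling anything */
	TOY_LOCK_IRQ   = 1,	/* assumed: array declared IRQ-safe */
	TOY_LOCK_BH    = 2,	/* assumed: array declared softirq-safe */
};

struct toy_xarray {
	unsigned int flags;	/* assumed: low two bits carry the lock type */
};

/* Extract the lock type the array was set up with. */
static enum toy_lock_type toy_lock_type(const struct toy_xarray *xa)
{
	return (enum toy_lock_type)(xa->flags & 3u);
}

/* Name the spinlock call that matches the array's declared lock type. */
static const char *toy_lock_primitive(enum toy_lock_type type)
{
	switch (type) {
	case TOY_LOCK_IRQ:
		return "spin_lock_irq";
	case TOY_LOCK_BH:
		return "spin_lock_bh";
	default:
		return "spin_lock";
	}
}
```

With this model, an array whose flags are 0 maps to a plain `spin_lock`, which is how io_uring's array was being locked in the splat above.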

2020-10-08 16:47:02

by Jens Axboe

Subject: Re: inconsistent lock state in xa_destroy

On 10/8/20 9:00 AM, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: e4fb79c7 Add linux-next specific files for 20201008
> git tree: linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=12555227900000
> kernel config: https://syzkaller.appspot.com/x/.config?x=568d41fe4341ed0f
> dashboard link: https://syzkaller.appspot.com/bug?extid=cdcbdc0bd42e559b52b9
> compiler: gcc (GCC) 10.1.0-syz 20200507
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: [email protected]

Already pushed out a fix for this, it's really an xarray issue where it just
assumes that destroy can irq grab the lock.

#syz fix: io_uring: no need to call xa_destroy() on empty xarray

--
Jens Axboe

2020-10-08 23:43:06

by syzbot

Subject: Re: inconsistent lock state in xa_destroy

syzbot has found a reproducer for the following issue on:

HEAD commit: e4fb79c7 Add linux-next specific files for 20201008
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=17dda29f900000
kernel config: https://syzkaller.appspot.com/x/.config?x=568d41fe4341ed0f
dashboard link: https://syzkaller.appspot.com/bug?extid=cdcbdc0bd42e559b52b9
compiler: gcc (GCC) 10.1.0-syz 20200507
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14860568500000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16367de7900000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

================================
WARNING: inconsistent lock state
5.9.0-rc8-next-20201008-syzkaller #0 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/0/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
ffff888025f65018 (&xa->xa_lock#7){+.?.}-{2:2}, at: xa_destroy+0xaa/0x350 lib/xarray.c:2205
{SOFTIRQ-ON-W} state was registered at:
lock_acquire+0x1f2/0xaa0 kernel/locking/lockdep.c:5419
__raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
_raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
spin_lock include/linux/spinlock.h:354 [inline]
io_uring_add_task_file fs/io_uring.c:8607 [inline]
io_uring_add_task_file+0x207/0x430 fs/io_uring.c:8590
io_uring_get_fd fs/io_uring.c:9116 [inline]
io_uring_create fs/io_uring.c:9280 [inline]
io_uring_setup+0x2727/0x3660 fs/io_uring.c:9314
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
irq event stamp: 120141
hardirqs last enabled at (120140): [<ffffffff8847f0df>] __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
hardirqs last enabled at (120140): [<ffffffff8847f0df>] _raw_spin_unlock_irqrestore+0x6f/0x90 kernel/locking/spinlock.c:191
hardirqs last disabled at (120141): [<ffffffff8847f6c9>] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
hardirqs last disabled at (120141): [<ffffffff8847f6c9>] _raw_spin_lock_irqsave+0xa9/0xd0 kernel/locking/spinlock.c:159
softirqs last enabled at (119956): [<ffffffff814731af>] irq_enter_rcu+0xcf/0xf0 kernel/softirq.c:360
softirqs last disabled at (119957): [<ffffffff88600f2f>] asm_call_irq_on_stack+0xf/0x20

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&xa->xa_lock#7);
<Interrupt>
lock(&xa->xa_lock#7);

*** DEADLOCK ***

1 lock held by swapper/0/0:
#0: ffffffff8a554c80 (rcu_callback){....}-{0:0}, at: rcu_do_batch kernel/rcu/tree.c:2474 [inline]
#0: ffffffff8a554c80 (rcu_callback){....}-{0:0}, at: rcu_core+0x5d8/0x1240 kernel/rcu/tree.c:2718

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.0-rc8-next-20201008-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x198/0x1fb lib/dump_stack.c:118
print_usage_bug kernel/locking/lockdep.c:3715 [inline]
valid_state kernel/locking/lockdep.c:3726 [inline]
mark_lock_irq kernel/locking/lockdep.c:3929 [inline]
mark_lock.cold+0x32/0x74 kernel/locking/lockdep.c:4396
mark_usage kernel/locking/lockdep.c:4281 [inline]
__lock_acquire+0x118a/0x56d0 kernel/locking/lockdep.c:4771
lock_acquire+0x1f2/0xaa0 kernel/locking/lockdep.c:5419
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x94/0xd0 kernel/locking/spinlock.c:159
xa_destroy+0xaa/0x350 lib/xarray.c:2205
__io_uring_free+0x60/0xc0 fs/io_uring.c:7693
io_uring_free include/linux/io_uring.h:40 [inline]
__put_task_struct+0xff/0x3f0 kernel/fork.c:732
put_task_struct include/linux/sched/task.h:111 [inline]
delayed_put_task_struct+0x1f6/0x340 kernel/exit.c:172
rcu_do_batch kernel/rcu/tree.c:2484 [inline]
rcu_core+0x645/0x1240 kernel/rcu/tree.c:2718
__do_softirq+0x203/0xab6 kernel/softirq.c:298
asm_call_irq_on_stack+0xf/0x20
</IRQ>
__run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline]
run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline]
do_softirq_own_stack+0x9b/0xd0 arch/x86/kernel/irq_64.c:77
invoke_softirq kernel/softirq.c:393 [inline]
__irq_exit_rcu kernel/softirq.c:423 [inline]
irq_exit_rcu+0x235/0x280 kernel/softirq.c:435
sysvec_apic_timer_interrupt+0x51/0xf0 arch/x86/kernel/apic/apic.c:1091
asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:631
RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:61
Code: 89 ef e8 b5 62 6f f9 e9 86 fe ff ff 48 89 df e8 a8 62 6f f9 e9 7b ff ff ff cc cc cc e9 07 00 00 00 0f 00 2d 54 08 61 00 fb f4 <c3> 90 e9 07 00 00 00 0f 00 2d 44 08 61 00 f4 c3 cc cc 55 53 e8 09
RSP: 0018:ffffffff8a207d48 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1ffffffff176a7c1
RDX: ffffffff8a29ce40 RSI: ffffffff8847e5c3 RDI: 0000000000000000
RBP: ffff888012d2e064 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
R13: ffff888012d2e000 R14: ffff888012d2e064 R15: ffff8881339b2004
arch_safe_halt arch/x86/include/asm/paravirt.h:150 [inline]
acpi_safe_halt drivers/acpi/processor_idle.c:111 [inline]
acpi_idle_do_entry+0x1e8/0x330 drivers/acpi/processor_idle.c:517
acpi_idle_enter+0x35a/0x550 drivers/acpi/processor_idle.c:648
cpuidle_enter_state+0x1ab/0xdb0 drivers/cpuidle/cpuidle.c:237
cpuidle_enter+0x4a/0xa0 drivers/cpuidle/cpuidle.c:351
call_cpuidle kernel/sched/idle.c:132 [inline]
cpuidle_idle_call kernel/sched/idle.c:213 [inline]
do_idle+0x48e/0x730 kernel/sched/idle.c:273
cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:369
start_kernel+0x490/0x4b1 init/main.c:1049
secondary_startup_64_no_verify+0xa6/0xab

2020-10-09 00:21:05

by Matthew Wilcox

Subject: Re: inconsistent lock state in xa_destroy


If I understand the lockdep report here, this actually isn't an XArray
issue, although I do think there is one.

On Thu, Oct 08, 2020 at 02:14:20PM -0700, syzbot wrote:
> ================================
> WARNING: inconsistent lock state
> 5.9.0-rc8-next-20201008-syzkaller #0 Not tainted
> --------------------------------
> inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> swapper/0/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
> ffff888025f65018 (&xa->xa_lock#7){+.?.}-{2:2}, at: xa_destroy+0xaa/0x350 lib/xarray.c:2205
> {SOFTIRQ-ON-W} state was registered at:
> lock_acquire+0x1f2/0xaa0 kernel/locking/lockdep.c:5419
> __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
> _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
> spin_lock include/linux/spinlock.h:354 [inline]
> io_uring_add_task_file fs/io_uring.c:8607 [inline]

You're using the XArray in a non-interrupt-disabling mode.

> _raw_spin_lock_irqsave+0x94/0xd0 kernel/locking/spinlock.c:159
> xa_destroy+0xaa/0x350 lib/xarray.c:2205
> __io_uring_free+0x60/0xc0 fs/io_uring.c:7693
> io_uring_free include/linux/io_uring.h:40 [inline]
> __put_task_struct+0xff/0x3f0 kernel/fork.c:732
> put_task_struct include/linux/sched/task.h:111 [inline]
> delayed_put_task_struct+0x1f6/0x340 kernel/exit.c:172
> rcu_do_batch kernel/rcu/tree.c:2484 [inline]

But you're calling xa_destroy() from in-interrupt context.
So (as far as lockdep is concerned), no matter what I do in
xa_destroy(), this potential deadlock is there. You'd need to be
using xa_init_flags(XA_FLAGS_LOCK_IRQ) if you actually needed to call
xa_destroy() here.

Fortunately, it seems you don't need to call xa_destroy() at all, so
that problem is solved, but the patch I have here wouldn't help.
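
Editor's sketch of the reasoning above, as a toy userspace model rather than lockdep's real implementation: a lock class remembers whether it was ever taken with softirqs enabled and whether it was ever taken from softirq context, and it is the combination of the two that gets reported as inconsistent. All `toy_` names and the two-bit encoding are assumptions made for illustration.

```c
#include <stdbool.h>

/* Toy model only: real lockdep tracks far more state per lock class. */
enum toy_usage_bit {
	TOY_IN_SOFTIRQ = 1u << 0,	/* taken while running in softirq */
	TOY_SOFTIRQ_ON = 1u << 1,	/* taken with softirqs still enabled */
};

struct toy_lock_class {
	const char *name;
	unsigned int usage;		/* OR of toy_usage_bit seen so far */
};

/*
 * Record one acquisition of the lock class.
 * Returns false once the class has been seen both with softirqs enabled
 * and from softirq context: a softirq could then interrupt the holder on
 * the same CPU and spin on the same lock forever.
 */
static bool toy_mark_usage(struct toy_lock_class *lc,
			   bool in_softirq, bool softirqs_enabled)
{
	const unsigned int both = TOY_IN_SOFTIRQ | TOY_SOFTIRQ_ON;

	if (in_softirq)
		lc->usage |= TOY_IN_SOFTIRQ;
	else if (softirqs_enabled)
		lc->usage |= TOY_SOFTIRQ_ON;

	return (lc->usage & both) != both;
}
```

Replaying the report against this model: the plain `spin_lock()` in `io_uring_add_task_file()` is `toy_mark_usage(&lc, false, true)` and passes, then `xa_destroy()` from the RCU callback is `toy_mark_usage(&lc, true, false)` and fails, whichever lock variant `xa_destroy()` itself uses. If every process-context acquisition ran with interrupts disabled (`false, false`), the later softirq acquisition would pass.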

2020-10-09 03:23:57

by Jens Axboe

[permalink] [raw]
Subject: Re: inconsistent lock state in xa_destroy

On 10/8/20 4:27 PM, Matthew Wilcox wrote:
>
> If I understand the lockdep report here, this actually isn't an XArray
> issue, although I do think there is one.
>
> On Thu, Oct 08, 2020 at 02:14:20PM -0700, syzbot wrote:
>> ================================
>> WARNING: inconsistent lock state
>> 5.9.0-rc8-next-20201008-syzkaller #0 Not tainted
>> --------------------------------
>> inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
>> swapper/0/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
>> ffff888025f65018 (&xa->xa_lock#7){+.?.}-{2:2}, at: xa_destroy+0xaa/0x350 lib/xarray.c:2205
>> {SOFTIRQ-ON-W} state was registered at:
>> lock_acquire+0x1f2/0xaa0 kernel/locking/lockdep.c:5419
>> __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>> _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
>> spin_lock include/linux/spinlock.h:354 [inline]
>> io_uring_add_task_file fs/io_uring.c:8607 [inline]
>
> You're using the XArray in a non-interrupt-disabling mode.
>
>> _raw_spin_lock_irqsave+0x94/0xd0 kernel/locking/spinlock.c:159
>> xa_destroy+0xaa/0x350 lib/xarray.c:2205
>> __io_uring_free+0x60/0xc0 fs/io_uring.c:7693
>> io_uring_free include/linux/io_uring.h:40 [inline]
>> __put_task_struct+0xff/0x3f0 kernel/fork.c:732
>> put_task_struct include/linux/sched/task.h:111 [inline]
>> delayed_put_task_struct+0x1f6/0x340 kernel/exit.c:172
>> rcu_do_batch kernel/rcu/tree.c:2484 [inline]
>
> But you're calling xa_destroy() from in-interrupt context.
> So (as far as lockdep is concerned), no matter what I do in
> xa_destroy(), this potential deadlock is there. You'd need to be
> using xa_init_flags(XA_FLAGS_LOCK_IRQ) if you actually needed to call
> xa_destroy() here.

Yeah good point, I guess that last free is in softirq from RCU.

> Fortunately, it seems you don't need to call xa_destroy() at all, so
> that problem is solved, but the patch I have here wouldn't help.

Right, it wouldn't have helped this case.

--
Jens Axboe