2022-11-16 20:36:46

by Eric Biggers

Subject: KMSAN broken with lockdep again?

Hi,

I'm trying v6.1-rc5 with CONFIG_KMSAN, but the kernel continuously spams
"BUG: KMSAN: uninit-value in __init_waitqueue_head".

I tracked it down to lockdep (CONFIG_PROVE_LOCKING=y). The problem goes away if
I disable that.

I don't see any obvious use of uninitialized memory in __init_waitqueue_head().

The compiler I'm using is tip-of-tree clang (LLVM commit 4155be339ba80fef).

Is this a known issue?

- Eric


2022-11-17 14:26:40

by Alexander Potapenko

Subject: Re: KMSAN broken with lockdep again?

On Wed, Nov 16, 2022 at 9:12 PM Eric Biggers <[email protected]> wrote:
>
> Hi,
>
> I'm trying v6.1-rc5 with CONFIG_KMSAN, but the kernel continuously spams
> "BUG: KMSAN: uninit-value in __init_waitqueue_head".
>
> I tracked it down to lockdep (CONFIG_PROVE_LOCKING=y). The problem goes away if
> I disable that.
>
> I don't see any obvious use of uninitialized memory in __init_waitqueue_head().
>
> The compiler I'm using is tip-of-tree clang (LLVM commit 4155be339ba80fef).
>
> Is this a known issue?
>
> - Eric

Thanks for flagging this!

The reason is that under lockdep we're accessing the
contents of wq_head->lock.dep_map, which KMSAN considers
uninitialized.
The initialization of dep_map happens inside kernel/locking/lockdep.c,
for which KMSAN is deliberately disabled, because lockdep used to
deadlock in the past.

As far as I can tell, removing `KMSAN_SANITIZE_lockdep.o := n` does
not actually break anything now (although the kernel becomes quite
slow with both lockdep and KMSAN). Let me experiment a bit and send a
patch.
If this won't work out, we'll need an explicit call to
kmsan_unpoison_memory() somewhere in lockdep_init_map_type() to
suppress these reports.
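
To illustrate why the fallback would suppress the reports, here is a minimal userspace model, not kernel code. It assumes a one-shadow-byte-per-data-byte scheme and uses hypothetical names (model_dep_map, model_alloc, model_unpoison, model_is_poisoned, model_init_map). Only kmsan_unpoison_memory() itself is a real kernel function, and the actual placement of the call inside lockdep_init_map_type() is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for lockdep's dep_map; not the kernel struct. */
struct model_dep_map {
	int key;
	int class_idx;
};

static struct model_dep_map model_obj;
/* One shadow byte per data byte; nonzero means "uninitialized". */
static uint8_t model_shadow[sizeof(struct model_dep_map)];

/* Models handing out fresh memory: every byte starts poisoned. */
static struct model_dep_map *model_alloc(void)
{
	memset(model_shadow, 0xff, sizeof(model_shadow));
	return &model_obj;
}

/* Models the effect of kmsan_unpoison_memory(): mark a range initialized. */
static void model_unpoison(const void *addr, size_t size)
{
	size_t off = (size_t)((const uint8_t *)addr -
			      (const uint8_t *)&model_obj);

	assert(off <= sizeof(model_shadow) &&
	       size <= sizeof(model_shadow) - off);
	memset(model_shadow + off, 0, size);
}

/* Models the check an instrumented reader performs before using data. */
static int model_is_poisoned(const void *addr, size_t size)
{
	size_t off = (size_t)((const uint8_t *)addr -
			      (const uint8_t *)&model_obj);

	for (size_t i = 0; i < size; i++)
		if (model_shadow[off + i])
			return 1;
	return 0;
}

/*
 * Models an initializer in a file built without instrumentation: the
 * stores update the data but not the shadow, so the object still looks
 * uninitialized unless the range is unpoisoned explicitly.
 */
static void model_init_map(struct model_dep_map *map, int unpoison)
{
	map->key = 1;
	map->class_idx = 0;
	if (unpoison)
		model_unpoison(map, sizeof(*map));
}
```

In this model, calling model_init_map(map, 0) leaves the object flagged as poisoned even though every field was written, which mirrors the false positive described above. Passing 1 clears the shadow for the range.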


--
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Liana Sebastian
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

2022-11-18 04:33:50

by Eric Biggers

Subject: Re: KMSAN broken with lockdep again?

On Thu, Nov 17, 2022 at 02:46:29PM +0100, Alexander Potapenko wrote:
> On Wed, Nov 16, 2022 at 9:12 PM Eric Biggers <[email protected]> wrote:
> >
> > Hi,
> >
> > I'm trying v6.1-rc5 with CONFIG_KMSAN, but the kernel continuously spams
> > "BUG: KMSAN: uninit-value in __init_waitqueue_head".
> >
> > I tracked it down to lockdep (CONFIG_PROVE_LOCKING=y). The problem goes away if
> > I disable that.
> >
> > I don't see any obvious use of uninitialized memory in __init_waitqueue_head().
> >
> > The compiler I'm using is tip-of-tree clang (LLVM commit 4155be339ba80fef).
> >
> > Is this a known issue?
> >
> > - Eric
>
> Thanks for flagging this!
>
> The reason is that under lockdep we're accessing the
> contents of wq_head->lock.dep_map, which KMSAN considers
> uninitialized.
> The initialization of dep_map happens inside kernel/locking/lockdep.c,
> for which KMSAN is deliberately disabled, because lockdep used to
> deadlock in the past.
>
> As far as I can tell, removing `KMSAN_SANITIZE_lockdep.o := n` does
> not actually break anything now (although the kernel becomes quite
> slow with both lockdep and KMSAN). Let me experiment a bit and send a
> patch.
> If this won't work out, we'll need an explicit call to
> kmsan_unpoison_memory() somewhere in lockdep_init_map_type() to
> suppress these reports.

Thanks.

I tried just disabling CONFIG_PROVE_LOCKING, but now KMSAN warnings are being
spammed from check_stack_object() in mm/usercopy.c.

Commenting out the call to arch_within_stack_frames() makes it go away.

- Eric

2022-11-18 14:03:09

by Alexander Potapenko

Subject: Re: KMSAN broken with lockdep again?

> > As far as I can tell, removing `KMSAN_SANITIZE_lockdep.o := n` does
> > not actually break anything now (although the kernel becomes quite
> > slow with both lockdep and KMSAN). Let me experiment a bit and send a
> > patch.

Hm, no, lockdep isn't particularly happy with the nested
lockdep->KMSAN->lockdep calls:

------------[ cut here ]------------
DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
WARNING: CPU: 0 PID: 0 at kernel/locking/lockdep.c:5508 check_flags+0x63/0x180
...
<TASK>
lock_acquire+0x196/0x640 kernel/locking/lockdep.c:5665
__raw_spin_lock_irqsave ./include/linux/spinlock_api_smp.h:110
_raw_spin_lock_irqsave+0xb3/0x110 kernel/locking/spinlock.c:162
__stack_depot_save+0x1b1/0x4b0 lib/stackdepot.c:479
stack_depot_save+0x13/0x20 lib/stackdepot.c:533
__msan_poison_alloca+0x100/0x1a0 mm/kmsan/instrumentation.c:263
native_save_fl ./include/linux/spinlock_api_smp.h:?
arch_local_save_flags ./arch/x86/include/asm/irqflags.h:70
arch_irqs_disabled ./arch/x86/include/asm/irqflags.h:130
__raw_spin_unlock_irqrestore ./include/linux/spinlock_api_smp.h:151
_raw_spin_unlock_irqrestore+0x60/0x100 kernel/locking/spinlock.c:194
tty_register_ldisc+0xcb/0x120 drivers/tty/tty_ldisc.c:68
n_tty_init+0x1f/0x21 drivers/tty/n_tty.c:2521
console_init+0x1f/0x7ee kernel/printk/printk.c:3287
start_kernel+0x577/0xaff init/main.c:1073
x86_64_start_reservations+0x2a/0x2c arch/x86/kernel/head64.c:556
x86_64_start_kernel+0x114/0x119 arch/x86/kernel/head64.c:537
secondary_startup_64_no_verify+0xcf/0xdb arch/x86/kernel/head_64.S:358
</TASK>
---[ end trace 0000000000000000 ]---

> > If this won't work out, we'll need an explicit call to
> > kmsan_unpoison_memory() somewhere in lockdep_init_map_type() to
> > suppress these reports.

I'll go for this option.

> Thanks.
>
> I tried just disabling CONFIG_PROVE_LOCKING, but now KMSAN warnings are being
> spammed from check_stack_object() in mm/usercopy.c.
>
> Commenting out the call to arch_within_stack_frames() makes it go away.

Yeah, arch_within_stack_frames() performs stack frame walking, which
confuses KMSAN.
We'll need to apply __no_kmsan_checks to it, like we did for other
stack unwinding functions.
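
As a rough illustration of the annotation approach, the sketch below marks a frame-walking helper so the sanitizer does not check its reads. This is a userspace analogy under stated assumptions: model_no_msan_checks, struct model_frame and model_within_frames() are hypothetical names, and the attribute used is Clang's userspace no_sanitize("memory"), not the kernel's __no_kmsan_checks definition. The walking logic is also a simplification and does not reproduce arch_within_stack_frames().

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace analog of a "no checks" annotation. Clang
 * accepts no_sanitize("memory"); other compilers get an empty macro.
 */
#if defined(__clang__)
#define model_no_msan_checks __attribute__((no_sanitize("memory")))
#else
#define model_no_msan_checks
#endif

/* Simplified frame record: a link to the caller plus local storage. */
struct model_frame {
	struct model_frame *next;
	char data[16];
};

/*
 * Walks the frame chain and returns 1 if [obj, obj + len) lies entirely
 * inside one frame's storage, 0 otherwise. A walker like this may read
 * memory the sanitizer has no initialization record for, which is why
 * it carries the annotation.
 */
model_no_msan_checks
static int model_within_frames(const struct model_frame *top,
			       const void *obj, size_t len)
{
	uintptr_t lo = (uintptr_t)obj;
	uintptr_t hi = lo + len;

	for (const struct model_frame *f = top; f; f = f->next) {
		uintptr_t start = (uintptr_t)f->data;
		uintptr_t end = start + sizeof(f->data);

		if (lo >= start && hi <= end)
			return 1;
	}
	return 0;
}
```

The annotation only changes how the sanitizer treats the function; the result of the walk is the same with or without it.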


> - Eric


2022-11-18 19:25:14

by Alexander Potapenko

Subject: Re: KMSAN broken with lockdep again?

On Fri, Nov 18, 2022 at 2:39 PM Alexander Potapenko <[email protected]> wrote:
>
> > > As far as I can tell, removing `KMSAN_SANITIZE_lockdep.o := n` does
> > > not actually break anything now (although the kernel becomes quite
> > > slow with both lockdep and KMSAN). Let me experiment a bit and send a
> > > patch.
>
> Hm, no, lockdep isn't particularly happy with the nested
> lockdep->KMSAN->lockdep calls:
>
> ------------[ cut here ]------------
> DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
> WARNING: CPU: 0 PID: 0 at kernel/locking/lockdep.c:5508 check_flags+0x63/0x180
> ...
> <TASK>
> lock_acquire+0x196/0x640 kernel/locking/lockdep.c:5665
> __raw_spin_lock_irqsave ./include/linux/spinlock_api_smp.h:110
> _raw_spin_lock_irqsave+0xb3/0x110 kernel/locking/spinlock.c:162
> __stack_depot_save+0x1b1/0x4b0 lib/stackdepot.c:479
> stack_depot_save+0x13/0x20 lib/stackdepot.c:533
> __msan_poison_alloca+0x100/0x1a0 mm/kmsan/instrumentation.c:263
> native_save_fl ./include/linux/spinlock_api_smp.h:?
> arch_local_save_flags ./arch/x86/include/asm/irqflags.h:70
> arch_irqs_disabled ./arch/x86/include/asm/irqflags.h:130
> __raw_spin_unlock_irqrestore ./include/linux/spinlock_api_smp.h:151
> _raw_spin_unlock_irqrestore+0x60/0x100 kernel/locking/spinlock.c:194
> tty_register_ldisc+0xcb/0x120 drivers/tty/tty_ldisc.c:68
> n_tty_init+0x1f/0x21 drivers/tty/n_tty.c:2521
> console_init+0x1f/0x7ee kernel/printk/printk.c:3287
> start_kernel+0x577/0xaff init/main.c:1073
> x86_64_start_reservations+0x2a/0x2c arch/x86/kernel/head64.c:556
> x86_64_start_kernel+0x114/0x119 arch/x86/kernel/head64.c:537
> secondary_startup_64_no_verify+0xcf/0xdb arch/x86/kernel/head_64.S:358
> </TASK>
> ---[ end trace 0000000000000000 ]---

In fact, this message is printed in both cases: with and without KMSAN
instrumenting kernel/locking/lockdep.c.
I wonder if this is a sign of a real problem in KMSAN, or just an
unavoidable consequence of instrumented code calling lockdep when
taking the stackdepot lock...

> > > If this won't work out, we'll need an explicit call to
> > > kmsan_unpoison_memory() somewhere in lockdep_init_map_type() to
> > > suppress these reports.
>
> I'll go for this option.
>
> > Thanks.
> >
> > I tried just disabling CONFIG_PROVE_LOCKING, but now KMSAN warnings are being
> > spammed from check_stack_object() in mm/usercopy.c.
> >
> > Commenting out the call to arch_within_stack_frames() makes it go away.
>
> Yeah, arch_within_stack_frames() performs stack frame walking, which
> confuses KMSAN.
> We'll need to apply __no_kmsan_checks to it, like we did for other
> stack unwinding functions.

Sent the patch.