2014-12-10 10:46:08

by Miklos Szeredi

Subject: [BUG] uml panics with "Segfault with no mm" in v3.19-rc

The guilty commit is:

00f634bc522d "asm-generic: add generic futex for !CONFIG_SMP"

And the backtrace:

#0 0x00007ffff7866457 in kill () from /lib64/libc.so.6
#1 0x000000006002a454 in uml_abort () at arch/um/os-Linux/util.c:93
#2 0x000000006002a7e5 in os_dump_core () at arch/um/os-Linux/util.c:148
#3 0x000000006001b48a in panic_exit (self=<optimized out>, unused1=<optimized out>, unused2=<optimized out>) at arch/um/kernel/um_arch.c:240
#4 0x000000006004e4df in notifier_call_chain (nl=<optimized out>, val=6, v=0xffffffff, nr_to_call=-1, nr_calls=0x0) at kernel/notifier.c:93
#5 0x000000006004e558 in __atomic_notifier_call_chain (nr_calls=<optimized out>, nr_to_call=<optimized out>, v=<optimized out>, val=<optimized out>, nh=<optimized out>) at kernel/notifier.c:183
#6 atomic_notifier_call_chain (nh=<optimized out>, val=<optimized out>, v=<optimized out>) at kernel/notifier.c:193
#7 0x0000000060238570 in panic (fmt=<optimized out>) at kernel/panic.c:133
#8 0x000000006001ad91 in segv (fi=..., ip=1610803493, is_user=<optimized out>, regs=0x602c1840 <cpu0_irqstack+6208>) at arch/um/kernel/trap.c:218
#9 0x000000006001b0f8 in segv_handler (sig=<optimized out>, unused_si=<optimized out>, regs=<optimized out>) at arch/um/kernel/trap.c:191
#10 0x0000000060029238 in sig_handler_common (sig=11, si=0x602c1d30 <cpu0_irqstack+7472>, mc=<optimized out>) at arch/um/os-Linux/signal.c:44
#11 0x0000000060029304 in sig_handler (sig=<optimized out>, si=<optimized out>, mc=<optimized out>) at arch/um/os-Linux/signal.c:231
#12 0x0000000060028dfd in hard_handler (sig=<optimized out>, si=0x6, p=<optimized out>) at arch/um/os-Linux/signal.c:165
#13 <signal handler called>
#14 memcpy () at arch/x86/um/../lib/memcpy_64.S:160
#15 0x000000006001c13d in copy_from_user (to=0x61c49e28, from=<optimized out>, n=<optimized out>) at arch/um/kernel/skas/uaccess.c:145
#16 0x0000000060072618 in futex_atomic_cmpxchg_inatomic (newval=<optimized out>, oldval=<optimized out>, uaddr=<optimized out>, uval=<optimized out>) at include/asm-generic/futex.h:109
#17 cmpxchg_futex_value_locked (curval=0x61c49e28, uaddr=0x0, uval=<optimized out>, newval=<optimized out>) at kernel/futex.c:596
#18 0x0000000060008b70 in futex_detect_cmpxchg () at kernel/futex.c:3020
#19 futex_init () at kernel/futex.c:3043
#20 0x00000000600166ba in do_one_initcall (fn=0x60008adf <futex_init>) at init/main.c:791

Thanks,
Miklos


2014-12-10 10:49:30

by Richard Weinberger

Subject: Re: [BUG] uml panics with "Segfault with no mm" in v3.19-rc

Hi!

On 10.12.2014 at 11:46, Miklos Szeredi wrote:
> The guilty commit is:
>
> 00f634bc522d "asm-generic: add generic futex for !CONFIG_SMP"

Thanks a lot Miklos!
You're bisecting faster than I do.

Let's dig into the issue!

Thanks,
//richard


2014-12-10 11:03:54

by Arnd Bergmann

Subject: Re: [BUG] uml panics with "Segfault with no mm" in v3.19-rc

On Wednesday 10 December 2014 11:49:23 Richard Weinberger wrote:
> Hi!
>
> On 10.12.2014 at 11:46, Miklos Szeredi wrote:
> > The guilty commit is:
> >
> > 00f634bc522d "asm-generic: add generic futex for !CONFIG_SMP"
>
> Thanks a lot Miklos!
> You're bisecting faster than I do.
>
> Let's dig into the issue!
>

Did this happen on linux-next as well?

Does this happen only on non-SMP UML running on an SMP host, or also
in other combinations?

Arnd

2014-12-10 11:59:29

by Geert Uytterhoeven

Subject: Re: [BUG] uml panics with "Segfault with no mm" in v3.19-rc

On Wed, Dec 10, 2014 at 11:49 AM, Richard Weinberger wrote:
> On 10.12.2014 at 11:46, Miklos Szeredi wrote:
>> The guilty commit is:
>>
>> 00f634bc522d "asm-generic: add generic futex for !CONFIG_SMP"
>
> Thanks a lot Miklos!
> You're bisecting faster than I do.
>
> Let's dig into the issue!

Do you need "select HAVE_FUTEX_CMPXCHG if FUTEX"?

Cfr. commit e571c58f313d35c5 ("m68k: Skip futex_atomic_cmpxchg_inatomic()
test") and commit 03b8c7b623c80af2 ("futex: Allow architectures to skip
futex_atomic_cmpxchg_inatomic() test").

BTW, I still think the real problem is the wrong address space, cfr.
"[PATCH/RFC] futex: Switch to USER_DS for futex test"
(http://www.spinics.net/lists/linux-m68k/msg06597.html), so you may also
want to try that.
However, that caused problems on s390, as it ran too early:
http://permalink.gmane.org/gmane.linux.kernel.next/30165


Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2014-12-10 12:05:42

by Richard Weinberger

Subject: Re: [BUG] uml panics with "Segfault with no mm" in v3.19-rc

On 10.12.2014 at 12:03, Arnd Bergmann wrote:
> On Wednesday 10 December 2014 11:49:23 Richard Weinberger wrote:
>> Hi!
>>
>> On 10.12.2014 at 11:46, Miklos Szeredi wrote:
>>> The guilty commit is:
>>>
>>> 00f634bc522d "asm-generic: add generic futex for !CONFIG_SMP"
>>
>> Thanks a lot Miklos!
>> You're bisecting faster than I do.
>>
>> Let's dig into the issue!
>>
>
> Did this happen on linux-next as well?
>
> Does this happen only on non-SMP UML running on an SMP host, or also
> in other combinations?

Had some time to look into the issue.
UML dies because of futex_detect_cmpxchg().

	/*
	 * This will fail and we want it. Some arch implementations do
	 * runtime detection of the futex_atomic_cmpxchg_inatomic()
	 * functionality. We want to know that before we call in any
	 * of the complex code paths. Also we want to prevent
	 * registration of robust lists in that case. NULL is
	 * guaranteed to fault and we get -EFAULT on functional
	 * implementation, the non-functional ones will return
	 * -ENOSYS.
	 */
	if (cmpxchg_futex_value_locked(&curval, NULL, 0, 0) == -EFAULT)
		futex_cmpxchg_enabled = 1;

The said commit adds a futex_atomic_cmpxchg_inatomic() implementation for the non-SMP case.
As UML is non-SMP it will from now on use that code.

This line of code makes UML die because the kernel tries to access address 0.

	if (unlikely(get_user(val, uaddr) != 0))
		return -EFAULT;

On UML you can access user space memory only from process context.
Theoretically, init calls run in process context, but not in UML's terms.
Since no user space process has called into the kernel, UML has no process
from which to fetch the data using ptrace(), so it falls back to the
"kernel did a boo boo" case and calls panic().
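
To make the probe easier to follow, here is a minimal user-space model in C.
It is only a sketch, not the kernel code: model_get_user(), model_put_user(),
model_cmpxchg(), model_cmpxchg_stub() and model_detect() are hypothetical
names, and the model assumes that NULL is the only faulting address. It shows
why the probe expects -EFAULT from a functional implementation and -ENOSYS
from a stub. On UML the real access to NULL never gets as far as returning
-EFAULT at this point of the boot; the fault ends in panic() instead.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for get_user()/put_user(). A real kernel turns a
 * faulting user access into -EFAULT through its fault handler; this model
 * simply treats NULL as the one faulting address.
 */
static int model_get_user(uint32_t *dst, const uint32_t *uaddr)
{
	if (uaddr == NULL)
		return -EFAULT;
	*dst = *uaddr;
	return 0;
}

static int model_put_user(uint32_t val, uint32_t *uaddr)
{
	if (uaddr == NULL)
		return -EFAULT;
	*uaddr = val;
	return 0;
}

/* Non-SMP style compare-and-exchange: read, compare, conditionally write. */
static int model_cmpxchg(uint32_t *uval, uint32_t *uaddr,
			 uint32_t oldval, uint32_t newval)
{
	uint32_t val;

	if (model_get_user(&val, uaddr) != 0)
		return -EFAULT;
	if (val == oldval && model_put_user(newval, uaddr) != 0)
		return -EFAULT;
	*uval = val;
	return 0;
}

/* Stub modelling an architecture without a working implementation. */
static int model_cmpxchg_stub(uint32_t *uval, uint32_t *uaddr,
			      uint32_t oldval, uint32_t newval)
{
	(void)uval; (void)uaddr; (void)oldval; (void)newval;
	return -ENOSYS;
}

typedef int (*cmpxchg_fn)(uint32_t *, uint32_t *, uint32_t, uint32_t);

/* Boot-time probe: a NULL address must yield -EFAULT on a working backend. */
static int model_detect(cmpxchg_fn fn)
{
	uint32_t curval;

	return fn(&curval, NULL, 0, 0) == -EFAULT;
}

static void model_self_check(void)
{
	uint32_t word = 1, seen = 0;

	/* Matching old value: the word is replaced, the old value reported. */
	assert(model_cmpxchg(&seen, &word, 1, 2) == 0);
	assert(seen == 1 && word == 2);

	/* Mismatch: the word is left alone, the current value is reported. */
	assert(model_cmpxchg(&seen, &word, 1, 3) == 0);
	assert(seen == 2 && word == 2);

	assert(model_detect(model_cmpxchg) == 1);
	assert(model_detect(model_cmpxchg_stub) == 0);
}
```

Calling model_self_check() from a test program exercises both backends; the
probe only ever looks at the return value for the NULL address.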

Maybe we can find an easy way to detect this case in arch/um/kernel/trap.c
but the real question is, does UML need futex_cmpxchg?
Is there a big benefit?
I fear it will open a can of worms.

Thanks,
//richard

2014-12-10 12:13:30

by Richard Weinberger

Subject: Re: [BUG] uml panics with "Segfault with no mm" in v3.19-rc

On 10.12.2014 at 12:59, Geert Uytterhoeven wrote:
> On Wed, Dec 10, 2014 at 11:49 AM, Richard Weinberger wrote:
>> On 10.12.2014 at 11:46, Miklos Szeredi wrote:
>>> The guilty commit is:
>>>
>>> 00f634bc522d "asm-generic: add generic futex for !CONFIG_SMP"
>>
>> Thanks a lot Miklos!
>> You're bisecting faster than I do.
>>
>> Let's dig into the issue!
>
> Do you need "select HAVE_FUTEX_CMPXCHG if FUTEX"?

> Cfr. commit e571c58f313d35c5 ("m68k: Skip futex_atomic_cmpxchg_inatomic()
> test") and commit 03b8c7b623c80af2 ("futex: Allow architectures to skip
> futex_atomic_cmpxchg_inatomic() test").

Bingo!

If UML selects HAVE_FUTEX_CMPXCHG, the code path I described in my previous mail
is no longer taken and UML works again.
I did only a small test but nothing exploded so far.
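
A rough user-space model of why selecting the option helps, assuming that an
architecture which declares a working cmpxchg skips the NULL probe altogether
and treats the feature as always enabled. MODEL_HAVE_FUTEX_CMPXCHG,
cfg_probe(), cfg_detect() and the two variables are hypothetical names for
illustration, not kernel symbols.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical switch: 1 models an arch that selects the option, 0 one that does not. */
#define MODEL_HAVE_FUTEX_CMPXCHG 1

static int cfg_probe_calls;
static int cfg_cmpxchg_enabled = MODEL_HAVE_FUTEX_CMPXCHG;

/* Stand-in for the NULL probe; counts how often it runs. */
static int cfg_probe(uint32_t *curval, const uint32_t *uaddr)
{
	cfg_probe_calls++;
	if (uaddr == NULL)
		return -EFAULT;
	*curval = *uaddr;
	return 0;
}

static void cfg_detect(void)
{
	uint32_t curval;

	/* With the option selected, the faulting access never happens. */
	if (MODEL_HAVE_FUTEX_CMPXCHG)
		return;

	if (cfg_probe(&curval, NULL) == -EFAULT)
		cfg_cmpxchg_enabled = 1;
}
```

With the switch set to 1, cfg_detect() returns before touching address 0, so
nothing can fault during init; with 0 it falls back to the runtime probe.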

> BTW, I still think the real problem is the wrong address space, cfr.
> "[PATCH/RFC] futex: Switch to USER_DS for futex test"
> (http://www.spinics.net/lists/linux-m68k/msg06597.html), so you may also
> want to try that.
> However, that caused problems on s390, as it ran too early:
> http://permalink.gmane.org/gmane.linux.kernel.next/30165

Yeah, this would also make sense for UML.

Thanks,
//richard