2017-12-28 09:38:59

by Alexander Tsoy

[permalink] [raw]
Subject: 4.14.9 with CONFIG_MCORE2 fails to boot

Hello,

4.14.9 fails to boot if CONFIG_MCORE2 is enabled and when compiled with
gcc 6+. More details in the following bug reports:
https://bugzilla.kernel.org/show_bug.cgi?id=198263
https://bugs.gentoo.org/642268

I bisected it to the commit below:

$ git bisect good
2bc9fa0beaf10206a778f02e9e5cb62f50345b1a is the first bad commit
commit 2bc9fa0beaf10206a778f02e9e5cb62f50345b1a
Author: Andy Lutomirski <[email protected]>
Date:   Mon Dec 4 15:07:23 2017 +0100

    x86/entry/64: Use a per-CPU trampoline stack for IDT entries

    commit 7f2590a110b837af5679d08fc25c6227c5a8c497 upstream.

    Historically, IDT entries from usermode have always gone directly
    to the running task's kernel stack.  Rearrange it so that we enter on
    a per-CPU trampoline stack and then manually switch to the task's stack.
    This touches a couple of extra cachelines, but it gives us a chance
    to run some code before we touch the kernel stack.

    The asm isn't exactly beautiful, but I think that fully refactoring
    it can wait.

    Signed-off-by: Andy Lutomirski <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Reviewed-by: Borislav Petkov <[email protected]>
    Reviewed-by: Thomas Gleixner <[email protected]>
    Cc: Boris Ostrovsky <[email protected]>
    Cc: Borislav Petkov <[email protected]>
    Cc: Borislav Petkov <[email protected]>
    Cc: Brian Gerst <[email protected]>
    Cc: Dave Hansen <[email protected]>
    Cc: Dave Hansen <[email protected]>
    Cc: David Laight <[email protected]>
    Cc: Denys Vlasenko <[email protected]>
    Cc: Eduardo Valentin <[email protected]>
    Cc: Greg KH <[email protected]>
    Cc: H. Peter Anvin <[email protected]>
    Cc: Josh Poimboeuf <[email protected]>
    Cc: Juergen Gross <[email protected]>
    Cc: Linus Torvalds <[email protected]>
    Cc: Peter Zijlstra <[email protected]>
    Cc: Rik van Riel <[email protected]>
    Cc: Will Deacon <[email protected]>
    Cc: [email protected]
    Cc: [email protected]
    Cc: [email protected]
    Cc: [email protected]
    Link: https://lkml.kernel.org/r/20171204150606.225330557@linutronix.de
    Signed-off-by: Ingo Molnar <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

:040000 040000 275d4746936a9e521a2b5041856f7dc1d1820dc6 8f8e869fd59c3dd781dceffa76e53e41d733a0cf M      arch

$ git bisect log
git bisect start
# bad: [dad5c1402c570cd07a80113784bc20a7f930c8ae] Linux 4.14.9
git bisect bad dad5c1402c570cd07a80113784bc20a7f930c8ae
# good: [7b3775017f4e6b87dfd2c7f63d1eaf057948f31d] Linux 4.14.8
git bisect good 7b3775017f4e6b87dfd2c7f63d1eaf057948f31d
# good: [d120cd749ef9770ee98b708a83b49547dcf1c0e1] x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0
git bisect good d120cd749ef9770ee98b708a83b49547dcf1c0e1
# bad: [97f41b41c432e5a80c91445d92c2f4b729984d36] powerpc/xmon: Avoid tripping SMP hardlockup watchdog
git bisect bad 97f41b41c432e5a80c91445d92c2f4b729984d36
# bad: [bfd66a406fe7e590055c1d6714adc697f18664c8] PCI: Avoid bus reset if bridge itself is broken
git bisect bad bfd66a406fe7e590055c1d6714adc697f18664c8
# bad: [8388d287e361a2fd0a39bece30a736d692d5c3d8] x86/cpufeatures: Make CPU bugs sticky
git bisect bad 8388d287e361a2fd0a39bece30a736d692d5c3d8
# bad: [bb568391775d4a840992e2d2493f39d6e86401e3] x86/entry/64: Move the IST stacks into struct cpu_entry_area
git bisect bad bb568391775d4a840992e2d2493f39d6e86401e3
# bad: [2bc9fa0beaf10206a778f02e9e5cb62f50345b1a] x86/entry/64: Use a per-CPU trampoline stack for IDT entries
git bisect bad 2bc9fa0beaf10206a778f02e9e5cb62f50345b1a
# good: [c3dbef1bd0f7eb09daf49409ea533aa1b0eeb82e] x86/espfix/64: Stop assuming that pt_regs is on the entry stack
git bisect good c3dbef1bd0f7eb09daf49409ea533aa1b0eeb82e
# first bad commit: [2bc9fa0beaf10206a778f02e9e5cb62f50345b1a] x86/entry/64: Use a per-CPU trampoline stack for IDT entries
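
For anyone repeating the bisection, a minimal sketch of the starting
commands follows. It assumes a linux-stable checkout that carries the
v4.14.8 and v4.14.9 tags (the log above uses the matching commit hashes).
KTREE, RUN and the helper names are made up for illustration. With RUN
unset, the helper only prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical location of the kernel tree; override via the environment.
KTREE="${KTREE:-$HOME/linux-stable}"
GOOD=v4.14.8   # last release reported to boot
BAD=v4.14.9    # first release reported to fail

# Execute the command when RUN=1, otherwise just print it (dry run).
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Start a bisection bounded by the two release tags.
bisect_begin() {
    run git -C "$KTREE" bisect start
    run git -C "$KTREE" bisect bad "$BAD"
    run git -C "$KTREE" bisect good "$GOOD"
}

bisect_begin
```

After each build-and-boot attempt, the result is marked with
"git bisect good" or "git bisect bad" until git names the first bad
commit.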


2017-12-29 09:17:43

by Greg KH

[permalink] [raw]
Subject: Re: 4.14.9 with CONFIG_MCORE2 fails to boot

On Thu, Dec 28, 2017 at 12:33:22PM +0300, Alexander Tsoy wrote:
> Hello,
>
> 4.14.9 fails to boot if CONFIG_MCORE2 is enabled and when compiled with
> gcc 6+. More details in the following bug reports:
> https://bugzilla.kernel.org/show_bug.cgi?id=198263
> https://bugs.gentoo.org/642268
>

Thanks for letting us know. Does Linus's current tree also have this
same problem for you?

greg k-h

2017-12-29 14:33:07

by Alexander Tsoy

[permalink] [raw]
Subject: Re: 4.14.9 with CONFIG_MCORE2 fails to boot

On Fri, 29/12/2017 at 10:17 +0100, Greg KH wrote:
> On Thu, Dec 28, 2017 at 12:33:22PM +0300, Alexander Tsoy wrote:
> > Hello,
> >
> > 4.14.9 fails to boot if CONFIG_MCORE2 is enabled and when compiled with
> > gcc 6+. More details in the following bug reports:
> > https://bugzilla.kernel.org/show_bug.cgi?id=198263
> > https://bugs.gentoo.org/642268
> >
>
> Thanks for letting us know.  Does Linus's current tree also have this
> same problem for you?

Just tested Linus's master branch and it has the same problem. All I
can catch with a serial console is the following:

[    0.000000] ACPI BIOS Warning[    0.498898] Expanded resource
conflict with PCI Bus 0000:00

2017-12-29 14:44:03

by Alexander Tsoy

[permalink] [raw]
Subject: Re: 4.14.9 with CONFIG_MCORE2 fails to boot

On Fri, 29/12/2017 at 17:31 +0300, Alexander Tsoy wrote:
> Just tested Linus's master branch and it has the same problem. All I
> can catch with a serial console is the following:
>
> [    0.000000] ACPI BIOS Warning[    0.498898] Expanded resource
> conflict with PCI Bus 0000:00

Ooops. This one is correct:

[    0.000000] ACPI BIOS Warning (bug): 32/64X length mismatch in
FADT/Gpe0Block: 128/64 (20170831/tbfadt-603)
[    0.000000] ACPI BIOS Warning (bug): Incorrect checksum in table
[TCPA] - 0x00, should be 0x7F (20170831/tbprint-211)
[    0.499627] Expanded resource Reserved due to conflict with PCI Bus
0000:00
[    0.506002] Expanded resource Reserved due to conflict with PCI Bus
0000:00
[   21.776011] INFO: rcu_preempt detected stalls on CPUs/tasks:
[   21.777008]  0-...!: (0 ticks this GP) idle=c56/140000000000000/0
softirq=73/73 fqs=0 
[   21.777008]  (detected by 1, t=21002 jiffies, g=-255, c=-256, q=4)
[    0.775461] NMI backtrace for cpu 0
[    0.775461] CPU: 0 PID: 114 Comm: modprobe Not tainted 4.15.0-rc5+
#1
[    0.775461] Hardware name: Dell Inc. OptiPlex
760                 /0M858N, BIOS A16 08/06/2013
[    0.775461] RIP: 0010:paranoid_entry+0x58/0x70
[    0.775461] RSP: 0000:fffffe8000007f50 EFLAGS: 00000083
[    0.775461] RAX: 0000000077c00000 RBX: 0000000000000001 RCX:
00000000c0000101
[    0.775461] RDX: 00000000ffffa035 RSI: 0000000000000000 RDI:
fffffe8000007f58
[    0.775461] RBP: 0000000000000000 R08: 0000000000000000 R09:
0000000000000000
[    0.775461] R10: 0000000000000000 R11: 0000000000000000 R12:
ffffffffaecb5b36
[    0.775461] R13: 0000000000000000 R14: 0000000000000000 R15:
0000000000000000
[    0.775461] FS:  0000000000000000(0000) GS:ffffa03577c00000(0000)
knlGS:0000000000000000
[    0.775461] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.775461] CR2: fffffe8000006f08 CR3: 000000022952c000 CR4:
00000000000406f0
[    0.775461] Call Trace:
[    0.775461]  <#DF>
[    0.775461]  ? double_fault+0xc/0x30
[    0.775461]  ? page_fault+0x36/0x60
[    0.775461]  do_double_fault+0xb/0x130
[    0.775461]  </#DF>
[    0.775461] Code: 78 4c 89 7c 24 08 4c 89 74 24 10 4c 89 6c 24 18 4c
89 64 24 20 48 89 6c 24 28 48 89 5c 24 30 bb 01 00 00 00 b9 01 01 00 c0
0f 32 <85> d2 78 05 0f 01 f8 31 db c3 0f 1f 40 00 66 2e 0f 1f 84 00 00
[   21.777008] rcu_preempt kthread starved for 21002 jiffies!
g18446744073709551361 c18446744073709551360 f0x0 RCU_GP_WAIT_FQS(3)
->state=0x402 ->cpu=0
[   21.777008] Call Trace:
[   21.777008]  ? __schedule+0x37f/0x7b0
[   21.777008]  ? preempt_count_add+0x64/0xa0
[   21.777008]  schedule+0x4a/0xa0
[   21.777008]  schedule_timeout+0x179/0x380
[   21.777008]  ? __next_timer_interrupt+0xd0/0xd0
[   21.777008]  rcu_gp_kthread+0x96b/0x1050
[   21.777008]  ? calc_global_load_tick+0x61/0x70
[   21.777008]  kthread+0xff/0x130
[   21.777008]  ? force_qs_rnp+0x1d0/0x1d0
[   21.777008]  ? kthread_create_worker_on_cpu+0x70/0x70
[   21.777008]  ret_from_fork+0x1f/0x30

2017-12-29 16:12:30

by Thomas Gleixner

[permalink] [raw]
Subject: Re: 4.14.9 with CONFIG_MCORE2 fails to boot

On Fri, 29 Dec 2017, Alexander Tsoy wrote:
> > Just tested Linus's master branch and it has the same problem. All I
> > can catch with a serial console is the following:
> >

So for completeness' sake:

          MCORE2=y  MCORE2=n
GCC5.x    works     works
GCC6.x    fail      works
GCC7.x    works     works

Is that correct?

Thanks,

tglx

2017-12-29 17:00:51

by Alexander Tsoy

[permalink] [raw]
Subject: Re: 4.14.9 with CONFIG_MCORE2 fails to boot

On Fri, 29/12/2017 at 17:11 +0100, Thomas Gleixner wrote:
> On Fri, 29 Dec 2017, Alexander Tsoy wrote:
> > > Just tested Linus's master branch and it has the same problem.
> > > All I
> > > can catch with a serial console is the following:
> > >
>
> So for completeness' sake:
>
>           MCORE2=y  MCORE2=n
> GCC5.x    works     works
> GCC6.x    fail      works
> GCC7.x    works     works
>
> Is that correct?
>

I haven't tested with GCC7.x, but another user reported [1] that it
also fails. So I guess the table should be:

          MCORE2=y  MCORE2=n
GCC5.x    works     works
GCC6.x    fail      works
GCC7.x    fail      works

[1] https://bugs.gentoo.org/642268#c11

2017-12-29 17:32:30

by Dave Hansen

[permalink] [raw]
Subject: Re: 4.14.9 with CONFIG_MCORE2 fails to boot

Does anyone have the results of a build that they can share? (vmlinux,
vmlinuz/bzImage, System.map, .config). That, plus a corresponding
serial log with an oops would be helpful.

I tried just adding MCORE2=y to my normal config but it didn't reproduce
this.

If you can't send the entire build like that, just running
scripts/faddr2line on __schedule+0x37f/0x7b0 would be very enlightening.
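
A minimal sketch of that faddr2line step, assuming it is run from the
top of the kernel source tree that produced the failing vmlinux and
that vmlinux was built with debug info. The resolve_addr wrapper is a
hypothetical helper, not part of the kernel scripts.

```shell
#!/bin/sh
# Resolve "function+offset/size" entries from a backtrace to source lines.
# Must be run from the top of the kernel source tree, because it calls
# the in-tree scripts/faddr2line helper.
resolve_addr() {
    vmlinux="$1"
    shift
    if [ ! -f "$vmlinux" ]; then
        echo "resolve_addr: $vmlinux not found" >&2
        return 1
    fi
    scripts/faddr2line "$vmlinux" "$@"
}

# Example (not run here):
#   resolve_addr vmlinux __schedule+0x37f/0x7b0
```

Several function+offset entries can be passed in one call, which is
handy when a trace has more than one interesting frame.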

On 12/29/2017 06:41 AM, Alexander Tsoy wrote:
> [ 0.775461] NMI backtrace for cpu 0
> [ 0.775461] CPU: 0 PID: 114 Comm: modprobe Not tainted 4.15.0-rc5+
...
> [    0.775461] Call Trace:
> [    0.775461]  <#DF>
> [    0.775461]  ? double_fault+0xc/0x30
> [    0.775461]  ? page_fault+0x36/0x60
> [    0.775461]  do_double_fault+0xb/0x130
> [    0.775461]  </#DF>
> [    0.775461] Code: 78 4c 89 7c 24 08 4c 89 74 24 10 4c 89 6c 24 18 4c
> 89 64 24 20 48 89 6c 24 28 48 89 5c 24 30 bb 01 00 00 00 b9 01 01 00 c0
> 0f 32 <85> d2 78 05 0f 01 f8 31 db c3 0f 1f 40 00 66 2e 0f 1f 84 00 00

From the various oopses, it looks like this happens when getting a
double fault while trying to go idle. The CPU is probably trying
to return from the double fault, but it didn't do anything useful in the
fault handler so it just continues faulting, but the NMI watchdog can
still get an oops out of it.

It doesn't appear to be recursing *too* far because it's not blowing
through the stack and triple faulting.

Of the several traces, they all appear to be in paths that might call
safe_halt() (including the kvm async page fault code). It makes me
wonder if we've been taking double faults there for a long time, but the
new trampoline stack somehow ends up being more fragile and can't
recover from the double-fault.

Couple more things:

MCORE2 seems to get one oddball compiler flag (-march=core2):

> cflags-$(CONFIG_MCORE2) += \
> $(call cc-option,-march=core2,$(call cc-option,-mtune=generic))

It would be interesting to see if replacing the above "$(call" with:

$(call cc-option,-mtune=generic)

makes the problem go away the same way as changing the .config option.

The MCORE2 config option also sets CONFIG_X86_P6_NOP, which overrides
the normal X86_64 noops, if I'm reading that code correctly. But I
think that's much less likely to be the since there

2017-12-29 18:48:13

by Alexander Tsoy

[permalink] [raw]
Subject: Re: 4.14.9 with CONFIG_MCORE2 fails to boot

On Fri, 29/12/2017 at 09:32 -0800, Dave Hansen wrote:
> Does anyone have the results of build that they can share?  (vmlinux,
> vmlinuz/bzImage, System.map, .config).  That, plus a corresponding
> serial log with an oops would be helpful.

Here you are:
https://www.dropbox.com/s/yesupqgig3uxf73/linux-4.15-rc5%2B.tar.xz?dl=0

2017-12-29 19:31:55

by Linus Torvalds

[permalink] [raw]
Subject: Re: 4.14.9 with CONFIG_MCORE2 fails to boot

On Fri, Dec 29, 2017 at 9:32 AM, Dave Hansen <[email protected]> wrote:
>
> From the various oopses, it looks like this happens when getting a
> double fault while trying to go idle. The CPU is probably trying
> to return from the double fault, but it didn't do anything useful in the
> fault handler so it just continues faulting, but the NMI watchdog can
> still get an oops out of it.

Hmm. Which oops are you looking at? The ones I see in the bugzilla
don't seem to have anything interesting in them.

[ Oh. I think I see the one you think of in the gentoo bug report ]

There does seem to be a lot of odd double faults that don't make progress.

And that in turn indicates that it may be about ESPFIX64 - all other
double fault cases should cause a fault printout, but ESPFIX64 has a
magical silent "turn double fault into a fake #GP fault".

Maybe that one triggers over and over again?

> Couple more things:
>
> MCORE2 seems to get one oddball compiler flag (-march=core2):
>
>> cflags-$(CONFIG_MCORE2) += \
>> $(call cc-option,-march=core2,$(call cc-option,-mtune=generic))
>
> It would be interesting to see if replacing the above "$(call" with:
>
> $(call cc-option,-mtune=generic)
>
> makes the problem go away the same way as changing the .config option.

Definitely.

> The MCORE2 config option also sets CONFIG_X86_P6_NOP, which overrides
> the normal X86_64 noops, if I'm reading that code correctly.

Only for the ASM_NOPx nops, as far as I can tell. The actual
alternative NOP rewriting seems to pick the nops based on machine, not
on config options.

And I don't see anybody who actually uses the ASM_NOPx defines except
for arch/x86/kernel/kprobes/opt.c, which uses ASM_NOP5.

Am I missing something? We actually have a lot of lines in
arch/x86/include/asm/nops.h that set the ASM_NOPx values to the proper
things, but then they are never used. We have that special
"ASM_NOP5_ATOMIC" define that we are so careful about, but again, it's
actually never used as far as I can tell.

Maybe there's some magic token concatenation use that I'm missing in
my trivial grep, but it does seem to be dead code.
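
A rough sketch of that kind of grep, for anyone who wants to repeat
the check. It assumes a kernel source tree as the argument, and the
find_nop_users name is made up. It only matches literal ASM_NOPx
tokens, so a use built by token concatenation would still slip past it.

```shell
#!/bin/sh
# List uses of the ASM_NOPx macros outside the header that defines them.
# $1 is the root of a kernel source tree (defaults to the current directory).
find_nop_users() {
    tree="${1:-.}"
    grep -rnE 'ASM_NOP[0-9]+' "$tree/arch/x86" 2>/dev/null |
        grep -v 'include/asm/nops\.h'
}

# Example (not run here):
#   find_nop_users ~/linux
```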

But double-checking that "-march=core2" case is definitely worth
looking into. Especially since there are clear indications that it's
gcc version-dependent anyway. Alexander?

Linus

2017-12-29 20:23:29

by Alexander Tsoy

[permalink] [raw]
Subject: Re: 4.14.9 with CONFIG_MCORE2 fails to boot

On Fri, 29/12/2017 at 11:31 -0800, Linus Torvalds wrote:
> On Fri, Dec 29, 2017 at 9:32 AM, Dave Hansen <[email protected]> wrote:
>
--------------->%---------------
> >
> > MCORE2 seems to get one oddball compiler flag (-march=core2):
> >
> > >         cflags-$(CONFIG_MCORE2) += \
> > >                 $(call cc-option,-march=core2,$(call cc-option,-mtune=generic))
> >
> > It would be interesting to see if replacing the above "$(call"
> > with:
> >
> >         $(call cc-option,-mtune=generic)
> >
> > makes the problem go away the same way as changing the .config
> > option.
>
> Definitely.
>
--------------->%---------------
> But double-checking that "-march=core2" case is definitely worth
> looking into. Especially since there are clear indications that it's
> gcc version-dependent anyway. Alexander?
>

Yes, the change suggested by Dave makes the problem go away.

2017-12-29 20:34:45

by Linus Torvalds

[permalink] [raw]
Subject: Re: 4.14.9 with CONFIG_MCORE2 fails to boot

On Fri, Dec 29, 2017 at 12:22 PM, Alexander Tsoy <[email protected]> wrote:
>> But double-checking that "-march=core2" case is definitely worth
>> looking into. Especially since there are clear indications that it's
>> gcc version-dependent anyway. Alexander?
>
> Yes, the change suggested by Dave makes the problem go away.

Ok, that's good information.

It doesn't really explain *why* that commit 7f2590a110b8
("x86/entry/64: Use a per-CPU trampoline stack for IDT entries") ends
up being sensitive to that compiler option, though.

So it narrows the cause down, but it doesn't really root-cause the
problem. It tends to be almost impossible to find differences in code
generation, because they are generally all over.

Ho humm. What happens if you change the "-march=core2" to
"-mtune=core2"? Does it still boot?

Because maybe the actual differences that "-march=core2" generates
might be easier to see when compared to "-mtune=core2".
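
One cheap first look before comparing disassembly is to ask gcc which
target options each flag turns on. The sketch below assumes the
compiler supports "-Q --help=target" and that CC points at the gcc used
for the kernel build. The function names are made up. It compares
option state only, not the generated code.

```shell
#!/bin/sh
# Print the target options the compiler reports for the given flags, sorted.
# CC is an assumption; point it at the compiler used for the kernel build.
target_opts() {
    "${CC:-gcc}" -Q --help=target "$@" 2>/dev/null | sort
}

# Show options that differ between -march=core2 and -mtune=core2.
# Returns 0 when the two listings are identical.
diff_march_mtune() {
    a=$(mktemp) || return 2
    b=$(mktemp) || return 2
    target_opts -march=core2 >"$a"
    target_opts -mtune=core2 >"$b"
    diff "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return "$rc"
}

# Example (not run here):
#   CC=gcc-6 diff_march_mtune
```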

Linus

2017-12-29 21:52:12

by Alexander Tsoy

[permalink] [raw]
Subject: Re: 4.14.9 with CONFIG_MCORE2 fails to boot

On Fri, 29/12/2017 at 12:34 -0800, Linus Torvalds wrote:
> On Fri, Dec 29, 2017 at 12:22 PM, Alexander Tsoy <[email protected]>
> wrote:
> > > But double-checking that "-march=core2" case is definitely worth
> > > looking into. Especially since there are clear indications that
> > > it's
> > > gcc version-dependent anyway. Alexander?
> >
> > Yes, the change suggested by Dave makes the problem go away.
>
> Ok, that's good information.
>
> It doesn't really explain *why* that commit 7f2590a110b8
> ("x86/entry/64: Use a per-CPU trampoline stack for IDT entries") ends
> up being sensitive to that compiler option, though.
>
> So it narrows the cause down, but it doesn't really root-cause the
> problem. It tends to be almost impossible to find differences in code
> generation, because they are generally all over.
>
> Ho humm. What happens if you change the "-march=core2" to
> "-mtune=core2"? Does it still boot?
>
> Because maybe the actual differences that "-march=core2" generates
> might be easier to see when compared to "-mtune=core2".

That's interesting. Compiled with -mtune=core2, the kernel fails to
boot.

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 3e73bc255e4e..f4d8f9497666 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -127,8 +127,7 @@ else
         cflags-$(CONFIG_MK8) += $(call cc-option,-march=k8)
         cflags-$(CONFIG_MPSC) += $(call cc-option,-march=nocona)
 
-        cflags-$(CONFIG_MCORE2) += \
-                $(call cc-option,-march=core2,$(call cc-option,-mtune=generic))
+        cflags-$(CONFIG_MCORE2) += $(call cc-option,-mtune=core2)
         cflags-$(CONFIG_MATOM) += $(call cc-option,-march=atom) \
                 $(call cc-option,-mtune=atom,$(call cc-option,-mtune=generic))
         cflags-$(CONFIG_GENERIC_CPU) += $(call cc-option,-mtune=generic)

2017-12-29 22:09:04

by Linus Torvalds

[permalink] [raw]
Subject: Re: 4.14.9 with CONFIG_MCORE2 fails to boot

On Fri, Dec 29, 2017 at 1:50 PM, Alexander Tsoy <[email protected]> wrote:
>>
>> Ho humm. What happens if you change the "-march=core2" to
>> "-mtune=core2"? Does it still boot?
>
> That's interesting. Compiled with -mtune=core2, the kernel fails to
> boot.

[ Insert "twilight zone" theme music ]

Damn. I was hoping that "-march=core2" would enable something specific
that causes the failure, and that "-mtune=core2" would just schedule
for core2 but not fail, and then we could compare the two and see what
triggers things.

But apparently no such luck. It's apparently just fundamentally the
instruction scheduling and selection for core2 that causes problems,
so mtune ends up being the same as march.

It could be something entirely random, and some instruction scheduling
detail just ends up showing it by happenstance.

And sadly, we have almost nothing to go by.

The fact that double faults seem to be implicated does make me want to
try to disable that ESPFIX64 code in the #DF handler.

What happens if you take a failing kernel, and then in
arch/x86/kernel/traps.c do_double_fault(), you change the

#ifdef CONFIG_X86_ESPFIX64

to just a

#if 0

do you then get an actual double-fault oops report instead of the
stall (and NMI oops)?
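
As a rough illustration of what that one-line change does (this is not the code in traps.c, just a stand-alone sketch with made-up function names and a locally defined stand-in for the config symbol):

```c
#include <stdbool.h>

/* Stand-in for the real config symbol; in a kernel build it comes from .config. */
#define CONFIG_X86_ESPFIX64 1

/* Before the change: the special-case block is compiled in whenever
 * the config symbol is defined. */
bool espfix_block_compiled_in(void)
{
#ifdef CONFIG_X86_ESPFIX64
    return true;    /* the ESPFIX64 special case would live here */
#else
    return false;
#endif
}

/* After the change: "#if 0" removes the block unconditionally,
 * regardless of the kernel configuration. */
bool espfix_block_after_change(void)
{
#if 0   /* was: #ifdef CONFIG_X86_ESPFIX64 */
    return true;
#else
    return false;   /* handler falls straight through to the normal #DF path */
#endif
}
```

The point of the experiment is only to take the special case out of the picture, so that whatever reaches the #DF handler gets reported by the ordinary path.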

But honestly, I'm just throwing random ideas out now.

Hopefully somebody else has a better idea than I do. Andy?

Linus

2017-12-29 23:18:07

by Alexander Tsoy

Subject: Re: 4.14.9 with CONFIG_MCORE2 fails to boot

On Fri, 29/12/2017 at 14:09 -0800, Linus Torvalds wrote:
>
...
> The fact that double faults seem to be implicated does make me want to
> try to disable that ESPFIX64 code in the #DF handler.
>
> What happens if you take a failing kernel, and then in
> arch/x86/kernel/traps.c do_double_fault(), you change the
>
>   #ifdef CONFIG_X86_ESPFIX64
>
> to just a
>
>   #if 0
>
> do you then get an actual double-fault oops report instead of the
> stall (and NMI oops)?

This is what I get after disabling ESPFIX64 (see attachment).


Attachments:
linux-4.15-rc5+-console.log (4.85 kB)

2017-12-29 23:56:35

by Linus Torvalds

Subject: Re: 4.14.9 with CONFIG_MCORE2 fails to boot

On Fri, Dec 29, 2017 at 3:15 PM, Alexander Tsoy <[email protected]> wrote:
> On Fri, 29/12/2017 at 14:09 -0800, Linus Torvalds wrote:
>>
>> What happens if you take a failing kernel, and then in
>> arch/x86/kernel/traps.c do_double_fault(), you change the
>>
>> #ifdef CONFIG_X86_ESPFIX64
>>
>> to just a
>>
>> #if 0
>>
>> do you then get an actual double-fault oops report instead of the
>> stall (and NMI oops)?
>
> This is what I get after disabling ESPFIX64 (see attachment).

Ok, looks like it made no difference for you or for Toralf.

So that was a waste of time. Damn. Also very strange how there's that
double fault in the call trace, but no actual output from any double
fault. Without the ESPFIX64 code, I don't see how that happens, but
since I have no idea what is going on here, I'm obviously missing a
lot.

Hopefully somebody else has a clue or sees something I'm missing.

Linus

2017-12-30 01:05:02

by Dave Hansen

Subject: Re: 4.14.9 with CONFIG_MCORE2 fails to boot

On 12/29/2017 10:46 AM, Alexander Tsoy wrote:
> On Fri, 29/12/2017 at 09:32 -0800, Dave Hansen wrote:
>> Does anyone have the results of build that they can share?  (vmlinux,
>> vmlinuz/bzImage, System.map, .config).  That, plus a corresponding
>> serial log with an oops would be helpful.
>
> Here you are:
> https://www.dropbox.com/s/yesupqgig3uxf73/linux-4.15-rc5%2B.tar.xz?dl=0

Alexander, thanks a bunch for the quick turnaround on this. It is much
appreciated!

With your binary, I can reproduce this in a KVM guest. It seems we manage
to get to paranoid_entry with a kernel GS value, but with the user page
tables still in place. We don't smash the #DF stack because we reset the
stack at each new #DF. I think the loop that we get stuck in goes
something like this:

1. Hardware raises #DF and calls double_fault
2. double_fault calls paranoid_entry
3. paranoid_entry checks the MSR for GSBASE, sees it has the kernel
value, and skips both SWAPGS and the switch to kernel page tables
4. touch the stack, which tries to #PF; the #PF can't touch the stack
either, so we #DF and goto 1
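
The steps above can be sketched as a toy state machine. This is only a model, not kernel code: the struct, the function names and the iteration cap are invented for illustration, and it assumes that a kernel GSBASE makes the entry code skip both SWAPGS and the CR3 switch, and that the stack is not reachable while the user page tables are loaded.

```c
#include <stdbool.h>

/* Toy model of the two pieces of CPU state that matter for the loop. */
struct cpu_state {
    bool gsbase_is_kernel;  /* GSBASE MSR already holds the kernel value */
    bool cr3_is_user;       /* user page tables are still loaded */
};

/* Steps 2-3: decide from GSBASE alone whether we came from kernel mode. */
static void model_paranoid_entry(struct cpu_state *cpu)
{
    if (cpu->gsbase_is_kernel)
        return;                     /* skip SWAPGS and the page table switch */
    cpu->gsbase_is_kernel = true;   /* SWAPGS */
    cpu->cr3_is_user = false;       /* switch to kernel page tables */
}

/* Step 4: in this model the stack is unmapped under user page tables. */
static bool model_stack_touch_faults(const struct cpu_state *cpu)
{
    return cpu->cr3_is_user;
}

/*
 * Returns how many #DF deliveries happen before the handler can touch its
 * stack, or -1 if it is still faulting after max_faults deliveries.
 */
int model_double_fault(struct cpu_state cpu, int max_faults)
{
    for (int faults = 1; faults <= max_faults; faults++) {
        model_paranoid_entry(&cpu);             /* steps 1-3 */
        if (!model_stack_touch_faults(&cpu))
            return faults;                      /* handler can proceed */
        /* #PF cannot be delivered: new #DF, stack reset, back to step 1 */
    }
    return -1;
}
```

Starting the model with a kernel GSBASE and user page tables never makes progress, which is the stall; starting it with a user GSBASE gets through on the first #DF.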

The real question is where we double-faulted from in the first place
with a kernel GSBASE and user CR3. I think I just need to disable KASLR
and do a little work in gdb to look at the stack on the first
double-fault, but we'll see.

2017-12-30 01:34:11

by Alexander Tsoy

Subject: Re: 4.14.9 with CONFIG_MCORE2 fails to boot

On Fri, 29/12/2017 at 17:04 -0800, Dave Hansen wrote:
> On 12/29/2017 10:46 AM, Alexander Tsoy wrote:
> > On Fri, 29/12/2017 at 09:32 -0800, Dave Hansen wrote:
> > > Does anyone have the results of build that they can share?  (vmlinux,
> > > vmlinuz/bzImage, System.map, .config).  That, plus a corresponding
> > > serial log with an oops would be helpful.
> >
> > Here you are:
> > https://www.dropbox.com/s/yesupqgig3uxf73/linux-4.15-rc5%2B.tar.xz?dl=0
>
> Alexander, thanks a bunch for the quick turnaround on this.  It is much
> appreciated!

Dave, it turned out that the issue was caused by -fstack-check. See the
thread "4.14.9 doesn't boot (regression)".