2022-03-11 05:54:07

by Nathan Chancellor

Subject: CONFIG_THUMB2_KERNEL=y boot failure after Spectre BHB fixes

Hi Russell,

Apologies if this has already been reported, I did not see anything when
sifting through lore.kernel.org and I can still reproduce this with
current mainline (1db333d9a51f).

I noticed a QEMU boot failure with multi_v7_defconfig with
CONFIG_THUMB2_KERNEL=y in our continuous integration [1]. It does not
appear to be compiler specific, as it reproduces with a bunch of
different clang versions and GCC 11.2.0 (I didn't try other GCC
versions).

At commit 04e91b732476 ("ARM: early traps initialisation"), everything
boots fine.

At commit 8d9d651ff227 ("ARM: use LOADADDR() to get load address of
sections"), there is no output from QEMU at all.

At commit b9baf5c8c5c3 ("ARM: Spectre-BHB workaround"), there is some
output but the boot still hangs before init. I have included a log of
the output of QEMU at this revision along with the command line I am
using, which comes from [2]. If I disable CONFIG_HARDEN_BRANCH_HISTORY,
the kernel boots.
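
For anyone checking whether a given build uses the combination described
above, here is a minimal sketch that parses the text of a kernel .config
and reports whether both options are enabled. The helper names
(parse_kconfig, is_affected_combination) are hypothetical and not part of
any kernel tooling; only the two CONFIG_ symbols come from this report.

```python
def parse_kconfig(text):
    """Parse kernel .config text into a dict mapping CONFIG_* names to values."""
    options = {}
    suffix = " is not set"
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # Disabled options appear as "# CONFIG_FOO is not set".
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def is_affected_combination(options):
    """True if both options named in this report are set to 'y'."""
    return (options.get("CONFIG_THUMB2_KERNEL") == "y"
            and options.get("CONFIG_HARDEN_BRANCH_HISTORY") == "y")


sample = """\
CONFIG_THUMB2_KERNEL=y
CONFIG_HARDEN_BRANCH_HISTORY=y
# CONFIG_EXAMPLE_OPTION is not set
"""
print(is_affected_combination(parse_kconfig(sample)))
```

CONFIG_EXAMPLE_OPTION in the sample is a placeholder used only to show
the "is not set" form.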

If there is any further information I can provide or patches I can try,
I am happy to do so.

[1]: https://github.com/ClangBuiltLinux/continuous-integration2/runs/5496036256?check_suite_focus=true
[2]: https://github.com/ClangBuiltLinux/boot-utils

Cheers,
Nathan


Attachments:
boot.log (3.86 kB)

2022-03-22 22:20:29

by Christian Eggers

Subject: Re: CONFIG_THUMB2_KERNEL=y boot failure after Spectre BHB fixes

Hi Nathan, hi Russell,

I ran into the same problem today (no output on the serial console
with v5.15.28-rt36). During `git bisect`, I also hit some commits
where a few lines of output were visible.

At commit 8d9d651ff227 ("ARM: use LOADADDR() to get load address of
sections"), the system boots up to here:

start_kernel()
+--setup_arch()
+--paging_init()
+--devicemaps_init()
+--early_trap_init(vectors_base = 0xC7FFE000)
+--copy_from_lma(vectors_base = 0xC7FFE000, __vectors_start=0x0, __vectors_end=0x20)
+--__memcpy()

copy_template.S:113
ldr8w r1, r3, r4, r5, r6, r7, r8, ip, lr, abort=20f
r1 = 0
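
The numbers in the trace above can be sanity-checked with a small sketch.
It assumes, based only on the argument names, that copy_from_lma() copies
the bytes between __vectors_start and __vectors_end to vectors_base; the
function name copy_parameters is hypothetical. Under the ARM procedure
call standard, r1 carries the second argument of memcpy(), i.e. the
source pointer, so r1 = 0 is consistent with a source address of 0x0.

```python
WORD_SIZE = 4  # bytes per 32-bit word on ARMv7


def copy_parameters(dest, src_start, src_end):
    """Model a copy of [src_start, src_end) to dest, as the argument names suggest."""
    length = src_end - src_start
    return {
        "dest": dest,
        "src": src_start,
        "bytes": length,
        "words": length // WORD_SIZE,
    }


# Values taken from the trace in this message.
params = copy_parameters(dest=0xC7FFE000, src_start=0x0, src_end=0x20)
print(f"copy {params['bytes']} bytes ({params['words']} words) "
      f"from {params['src']:#x} to {params['dest']:#x}")
```

The eight words match the eight registers loaded by the ldr8w line shown
above.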


With the final v5.15.28-rt36 I found out that the system boots fine
after disabling CONFIG_HARDEN_BRANCH_HISTORY.

Is there anything else I could analyze? My SoC is an NXP i.MX6ULL
with an ARMv7 core. I have access to a JTAG debugger.

regards
Christian



2022-03-30 18:40:56

by Russell King (Oracle)

Subject: Re: CONFIG_THUMB2_KERNEL=y boot failure after Spectre BHB fixes

On Tue, Mar 22, 2022 at 06:49:17PM +0100, Christian Eggers wrote:
> Hi Nathan, hi Russell,
>
> I ran into the same problem today (no output on the serial console
> with v5.15.28-rt36). During `git bisect`, I also hit some commits
> where a few lines of output were visible.
>
> At commit 8d9d651ff227 ("ARM: use LOADADDR() to get load address of
> sections"), the system boots up to here:
>
> start_kernel()
> +--setup_arch()
> +--paging_init()
> +--devicemaps_init()
> +--early_trap_init(vectors_base = 0xC7FFE000)
> +--copy_from_lma(vectors_base = 0xC7FFE000, __vectors_start=0x0, __vectors_end=0x20)
> +--__memcpy()
>
> copy_template.S:113
> ldr8w r1, r3, r4, r5, r6, r7, r8, ip, lr, abort=20f
> r1 = 0
>
>
> With the final v5.15.28-rt36 I found out that the system boots fine
> after disabling CONFIG_HARDEN_BRANCH_HISTORY.
>
> Is there anything else I could analyze? My SoC is an NXP i.MX6ULL
> with an ARMv7 core. I have access to a JTAG debugger.

I think this is already fixed in mainline. Commit:

6c7cb60bff7a ("ARM: fix Thumb2 regression")

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

2022-03-31 03:52:59

by Christian Eggers

Subject: Re: CONFIG_THUMB2_KERNEL=y boot failure after Spectre BHB fixes

On Wednesday, 30 March 2022, 19:42:31 CEST, Ard Biesheuvel wrote:
> On Wed, 30 Mar 2022 at 19:33, Christian Eggers <[email protected]> wrote:
> >
> > I just switched to v5.15.31-rt38 which already includes
> > 6c7cb60bff7a ("ARM: fix Thumb2 regression")
> >
> > This kernel boots fine now, even with CONFIG_HARDEN_BRANCH_HISTORY=y. After
> > applying the patch series from Ard, the system still boots fine.
> >
> > I don't understand what these patches do. Is there anything I should
> > test?
> >
>
> Thanks for confirming. The first fix affects all Thumb2
> configurations, my patch only affects Thumb2 configurations that
> actually enable the loop8 mitigation for Spectre-BHB.
>
> What type of CPU are you booting on?
>

NXP i.MX6ULL (ARM Cortex-A7).



2022-03-31 03:57:29

by Russell King (Oracle)

Subject: Re: CONFIG_THUMB2_KERNEL=y boot failure after Spectre BHB fixes

On Wed, Mar 30, 2022 at 08:27:56PM +0200, Christian Eggers wrote:
> On Wednesday, 30 March 2022, 19:42:31 CEST, Ard Biesheuvel wrote:
> > Thanks for confirming. The first fix affects all Thumb2
> > configurations, my patch only affects Thumb2 configurations that
> > actually enable the loop8 mitigation for Spectre-BHB.
> >
> > What type of CPU are you booting on?
>
> NXP i.MX6ULL (ARM Cortex-A7).

As Cortex-A7 is not listed in Arm Ltd's table for speculative processor
vulnerabilities, the kernel doesn't implement any workarounds.


2022-03-31 04:03:11

by Russell King (Oracle)

Subject: Re: CONFIG_THUMB2_KERNEL=y boot failure after Spectre BHB fixes

On Wed, Mar 30, 2022 at 06:34:25PM +0200, Ard Biesheuvel wrote:
> On Wed, 30 Mar 2022 at 18:12, Russell King (Oracle)
> <[email protected]> wrote:
> >
> > On Tue, Mar 22, 2022 at 06:49:17PM +0100, Christian Eggers wrote:
> > > Hi Nathan, hi Russell,
> > >
> > > I ran into the same problem today (no output on the serial console
> > > with v5.15.28-rt36). During `git bisect`, I also hit some commits
> > > where a few lines of output were visible.
> > >
> > > At commit 8d9d651ff227 ("ARM: use LOADADDR() to get load address of
> > > sections"), the system boots up to here:
> > >
> > > start_kernel()
> > > +--setup_arch()
> > > +--paging_init()
> > > +--devicemaps_init()
> > > +--early_trap_init(vectors_base = 0xC7FFE000)
> > > +--copy_from_lma(vectors_base = 0xC7FFE000, __vectors_start=0x0, __vectors_end=0x20)
> > > +--__memcpy()
> > >
> > > copy_template.S:113
> > > ldr8w r1, r3, r4, r5, r6, r7, r8, ip, lr, abort=20f
> > > r1 = 0
> > >
> > >
> > > With the final v5.15.28-rt36 I found out that the system boots fine
> > > after disabling CONFIG_HARDEN_BRANCH_HISTORY.
> > >
> > > Is there anything else I could analyze? My SoC is an NXP i.MX6ULL
> > > with an ARMv7 core. I have access to a JTAG debugger.
> >
> > I think this is already fixed in mainline. Commit:
> >
> > 6c7cb60bff7a ("ARM: fix Thumb2 regression")
> >
>
> It's still broken - I sent a couple of patches on Monday, among which
> one to fix the boot issue with loop8 on Thumb2. The problem is 'b . +
> 4', which produces a narrow encoding, and so it skips the subsequent
> subs instruction and loops forever.

And what's the current status? Sorry, I've way too much email from the
last 2.5 weeks to find it myself.

Thanks.


2022-03-31 04:11:22

by Ard Biesheuvel

Subject: Re: CONFIG_THUMB2_KERNEL=y boot failure after Spectre BHB fixes

On Wed, 30 Mar 2022 at 19:33, Christian Eggers <[email protected]> wrote:
>
> On Wednesday, 30 March 2022, 18:45:18 CEST, Ard Biesheuvel wrote:
> >
> > On Wed, 30 Mar 2022 at 18:37, Russell King (Oracle)
> > <[email protected]> wrote:
> > >
> > > On Wed, Mar 30, 2022 at 06:34:25PM +0200, Ard Biesheuvel wrote:
> > > > On Wed, 30 Mar 2022 at 18:12, Russell King (Oracle)
> > > > <[email protected]> wrote:
> > > > >
> > > > > On Tue, Mar 22, 2022 at 06:49:17PM +0100, Christian Eggers wrote:
> > > > > > Hi Nathan, hi Russell,
> > > > > >
> > > > > > I ran into the same problem today (no output on the serial console
> > > > > > with v5.15.28-rt36). During `git bisect`, I also hit some commits
> > > > > > where a few lines of output were visible.
> > > > > >
> > > > > > At commit 8d9d651ff227 ("ARM: use LOADADDR() to get load address of
> > > > > > sections"), the system boots up to here:
> > > > > >
> > > > > > start_kernel()
> > > > > > +--setup_arch()
> > > > > > +--paging_init()
> > > > > > +--devicemaps_init()
> > > > > > +--early_trap_init(vectors_base = 0xC7FFE000)
> > > > > > +--copy_from_lma(vectors_base = 0xC7FFE000, __vectors_start=0x0, __vectors_end=0x20)
> > > > > > +--__memcpy()
> > > > > >
> > > > > > copy_template.S:113
> > > > > > ldr8w r1, r3, r4, r5, r6, r7, r8, ip, lr, abort=20f
> > > > > > r1 = 0
> > > > > >
> > > > > >
> > > > > > With the final v5.15.28-rt36 I found out that the system boots fine
> > > > > > after disabling CONFIG_HARDEN_BRANCH_HISTORY.
> > > > > >
> > > > > > Is there anything else I could analyze? My SoC is an NXP i.MX6ULL
> > > > > > with an ARMv7 core. I have access to a JTAG debugger.
> > > > >
> > > > > I think this is already fixed in mainline. Commit:
> > > > >
> > > > > 6c7cb60bff7a ("ARM: fix Thumb2 regression")
> > > > >
> > > >
> > > > It's still broken - I sent a couple of patches on Monday, among which
> > > > one to fix the boot issue with loop8 on Thumb2. The problem is 'b . +
> > > > 4', which produces a narrow encoding, and so it skips the subsequent
> > > > subs instruction and loops forever.
> > >
> > > And what's the current status? Sorry, I've way too much email from the
> > > last 2.5 weeks to find it myself.
> > >
> >
> > https://lore.kernel.org/linux-arm-kernel/[email protected]/
> >
> > Nobody bothered to respond yet, I can drop the first two in the patch
> > tracker if you like.
>
> I just switched to v5.15.31-rt38 which already includes
> 6c7cb60bff7a ("ARM: fix Thumb2 regression")
>
> This kernel boots fine now, even with CONFIG_HARDEN_BRANCH_HISTORY=y. After
> applying the patch series from Ard, the system still boots fine.
>
> I don't understand what these patches do. Is there anything I should
> test?
>

Thanks for confirming. The first fix affects all Thumb2
configurations, my patch only affects Thumb2 configurations that
actually enable the loop8 mitigation for Spectre-BHB.

What type of CPU are you booting on?

2022-03-31 04:22:51

by Ard Biesheuvel

Subject: Re: CONFIG_THUMB2_KERNEL=y boot failure after Spectre BHB fixes

On Wed, 30 Mar 2022 at 18:12, Russell King (Oracle)
<[email protected]> wrote:
>
> On Tue, Mar 22, 2022 at 06:49:17PM +0100, Christian Eggers wrote:
> > Hi Nathan, hi Russell,
> >
> > I ran into the same problem today (no output on the serial console
> > with v5.15.28-rt36). During `git bisect`, I also hit some commits
> > where a few lines of output were visible.
> >
> > At commit 8d9d651ff227 ("ARM: use LOADADDR() to get load address of
> > sections"), the system boots up to here:
> >
> > start_kernel()
> > +--setup_arch()
> > +--paging_init()
> > +--devicemaps_init()
> > +--early_trap_init(vectors_base = 0xC7FFE000)
> > +--copy_from_lma(vectors_base = 0xC7FFE000, __vectors_start=0x0, __vectors_end=0x20)
> > +--__memcpy()
> >
> > copy_template.S:113
> > ldr8w r1, r3, r4, r5, r6, r7, r8, ip, lr, abort=20f
> > r1 = 0
> >
> >
> > With the final v5.15.28-rt36 I found out that the system boots fine
> > after disabling CONFIG_HARDEN_BRANCH_HISTORY.
> >
> > Is there anything else I could analyze? My SoC is an NXP i.MX6ULL
> > with an ARMv7 core. I have access to a JTAG debugger.
>
> I think this is already fixed in mainline. Commit:
>
> 6c7cb60bff7a ("ARM: fix Thumb2 regression")
>

It's still broken - I sent a couple of patches on Monday, among which
one to fix the boot issue with loop8 on Thumb2. The problem is 'b . +
4', which produces a narrow encoding, and so it skips the subsequent
subs instruction and loops forever.
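
To illustrate the point about the narrow encoding, here is a rough model
of where 'b . + 4' lands for different instruction sizes. The
three-instruction loop body (b, subs, bne) and the chosen encodings are
assumptions made for illustration, and branch_landing is a hypothetical
helper; '.' is taken to be the address of the branch instruction itself.

```python
def branch_landing(sizes, branch_index=0, offset=4):
    """Return the index of the instruction that `b . + offset` lands on.

    sizes: encoded size in bytes of each instruction, in program order.
    """
    addresses = []
    addr = 0
    for size in sizes:
        addresses.append(addr)
        addr += size
    target = addresses[branch_index] + offset
    # Raises ValueError if the target is not the start of an instruction.
    return addresses.index(target)


LOOP = ["b . + 4", "subs", "bne"]  # assumed loop body, for illustration

cases = {
    "ARM (all 4-byte)": [4, 4, 4],
    "Thumb2, narrow b": [2, 2, 2],  # assumes every instruction is 16-bit
    "Thumb2, wide b": [4, 2, 2],    # branch forced to a 32-bit encoding
}
for name, sizes in cases.items():
    print(f"{name}: lands on {LOOP[branch_landing(sizes)]}")
```

In this model the ARM case and the wide-branch case land on subs, while
the all-narrow Thumb2 case lands on bne, so the counter is never
decremented. Forcing a wide branch is shown only as one way to restore
the intended target; the thread itself does not spell out the patch.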

2022-03-31 04:26:08

by Ard Biesheuvel

Subject: Re: CONFIG_THUMB2_KERNEL=y boot failure after Spectre BHB fixes

On Wed, 30 Mar 2022 at 18:37, Russell King (Oracle)
<[email protected]> wrote:
>
> On Wed, Mar 30, 2022 at 06:34:25PM +0200, Ard Biesheuvel wrote:
> > On Wed, 30 Mar 2022 at 18:12, Russell King (Oracle)
> > <[email protected]> wrote:
> > >
> > > On Tue, Mar 22, 2022 at 06:49:17PM +0100, Christian Eggers wrote:
> > > > Hi Nathan, hi Russell,
> > > >
> > > > I ran into the same problem today (no output on the serial console
> > > > with v5.15.28-rt36). During `git bisect`, I also hit some commits
> > > > where a few lines of output were visible.
> > > >
> > > > At commit 8d9d651ff227 ("ARM: use LOADADDR() to get load address of
> > > > sections"), the system boots up to here:
> > > >
> > > > start_kernel()
> > > > +--setup_arch()
> > > > +--paging_init()
> > > > +--devicemaps_init()
> > > > +--early_trap_init(vectors_base = 0xC7FFE000)
> > > > +--copy_from_lma(vectors_base = 0xC7FFE000, __vectors_start=0x0, __vectors_end=0x20)
> > > > +--__memcpy()
> > > >
> > > > copy_template.S:113
> > > > ldr8w r1, r3, r4, r5, r6, r7, r8, ip, lr, abort=20f
> > > > r1 = 0
> > > >
> > > >
> > > > With the final v5.15.28-rt36 I found out that the system boots fine
> > > > after disabling CONFIG_HARDEN_BRANCH_HISTORY.
> > > >
> > > > Is there anything else I could analyze? My SoC is an NXP i.MX6ULL
> > > > with an ARMv7 core. I have access to a JTAG debugger.
> > >
> > > I think this is already fixed in mainline. Commit:
> > >
> > > 6c7cb60bff7a ("ARM: fix Thumb2 regression")
> > >
> >
> > It's still broken - I sent a couple of patches on Monday, among which
> > one to fix the boot issue with loop8 on Thumb2. The problem is 'b . +
> > 4', which produces a narrow encoding, and so it skips the subsequent
> > subs instruction and loops forever.
>
> And what's the current status? Sorry, I've way too much email from the
> last 2.5 weeks to find it myself.
>

https://lore.kernel.org/linux-arm-kernel/[email protected]/

Nobody bothered to respond yet, I can drop the first two in the patch
tracker if you like.

2022-03-31 04:51:31

by Christian Eggers

Subject: Re: CONFIG_THUMB2_KERNEL=y boot failure after Spectre BHB fixes

On Wednesday, 30 March 2022, 18:45:18 CEST, Ard Biesheuvel wrote:
> On Wed, 30 Mar 2022 at 18:37, Russell King (Oracle)
> <[email protected]> wrote:
> >
> > On Wed, Mar 30, 2022 at 06:34:25PM +0200, Ard Biesheuvel wrote:
> > > On Wed, 30 Mar 2022 at 18:12, Russell King (Oracle)
> > > <[email protected]> wrote:
> > > >
> > > > On Tue, Mar 22, 2022 at 06:49:17PM +0100, Christian Eggers wrote:
> > > > > Hi Nathan, hi Russell,
> > > > >
> > > > > I ran into the same problem today (no output on the serial console
> > > > > with v5.15.28-rt36). During `git bisect`, I also hit some commits
> > > > > where a few lines of output were visible.
> > > > >
> > > > > At commit 8d9d651ff227 ("ARM: use LOADADDR() to get load address of
> > > > > sections"), the system boots up to here:
> > > > >
> > > > > start_kernel()
> > > > > +--setup_arch()
> > > > > +--paging_init()
> > > > > +--devicemaps_init()
> > > > > +--early_trap_init(vectors_base = 0xC7FFE000)
> > > > > +--copy_from_lma(vectors_base = 0xC7FFE000, __vectors_start=0x0, __vectors_end=0x20)
> > > > > +--__memcpy()
> > > > >
> > > > > copy_template.S:113
> > > > > ldr8w r1, r3, r4, r5, r6, r7, r8, ip, lr, abort=20f
> > > > > r1 = 0
> > > > >
> > > > >
> > > > > With the final v5.15.28-rt36 I found out that the system boots fine
> > > > > after disabling CONFIG_HARDEN_BRANCH_HISTORY.
> > > > >
> > > > > Is there anything else I could analyze? My SoC is an NXP i.MX6ULL
> > > > > with an ARMv7 core. I have access to a JTAG debugger.
> > > >
> > > > I think this is already fixed in mainline. Commit:
> > > >
> > > > 6c7cb60bff7a ("ARM: fix Thumb2 regression")
> > > >
> > >
> > > It's still broken - I sent a couple of patches on Monday, among which
> > > one to fix the boot issue with loop8 on Thumb2. The problem is 'b . +
> > > 4', which produces a narrow encoding, and so it skips the subsequent
> > > subs instruction and loops forever.
> >
> > And what's the current status? Sorry, I've way too much email from the
> > last 2.5 weeks to find it myself.
> >
>
> https://lore.kernel.org/linux-arm-kernel/[email protected]/
>
> Nobody bothered to respond yet, I can drop the first two in the patch
> tracker if you like.

I just switched to v5.15.31-rt38 which already includes
6c7cb60bff7a ("ARM: fix Thumb2 regression")

This kernel boots fine now, even with CONFIG_HARDEN_BRANCH_HISTORY=y. After
applying the patch series from Ard, the system still boots fine.

I don't understand what these patches do. Is there anything I should
test?

regards
Christian