2021-05-19 08:33:46

by John Stultz

Subject: REGRESSION: kernel BUG at arch/arm64/kernel/alternative.c:157!

With v5.13-rc2, I've been seeing an odd boot regression with the
DragonBoard 845c:

Unfortunately, trying to bisect it down (v5.13-rc1 works ok) is giving
me inconsistent results so far. It feels a bit like maybe some config
option gets enabled moving forward, and then sticks around when we go
back. I'll take another swing at bisecting it later today, but I have
to move on to some other work right now, so I figured I'd share (with
folks who better know the recent __apply_alternatives changes) in case
folks have a better idea:

[ 0.254384] CPU features: detected: RAS Extension Support
[ 0.259928] CPU: All CPU(s) started at EL1
[ 0.264127] alternatives: patching kernel code
[ 0.268635] ------------[ cut here ]------------
[ 0.273303] kernel BUG at arch/arm64/kernel/alternative.c:157!
[ 0.279192] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 0.284736] Modules linked in:
[ 0.287833] CPU: 0 PID: 14 Comm: migration/0 Not tainted 5.13.0-rc2-mainline #4501
[ 0.295472] Hardware name: Thundercomm Dragonboard 845c (DT)
[ 0.301182] Stopper: multi_cpu_stop+0x0/0x1a0 <- stop_machine_cpuslocked+0x128/0x160
[ 0.309020] pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO BTYPE=--)
[ 0.315086] pc : __apply_alternatives+0x1f0/0x270
[ 0.319847] lr : __apply_alternatives+0xf4/0x270
[ 0.324515] sp : ffffffc01020bca0
[ 0.327874] x29: ffffffc01020bca0 x28: 00000000000000a0 x27: ffffffd7f5c11124
[ 0.335086] x26: ffffffd7f5c11128 x25: 00000000001b0020 x24: ffffffd7f700ab90
[ 0.342297] x23: 0000000000000000 x22: ffffffc01020bd20 x21: ffffffd7f7bea374
[ 0.349508] x20: ffffffc01020bd30 x19: ffffffd7f72194fc x18: ffffffffffffffff
[ 0.356718] x17: ffffffd7f7bdce40 x16: 000000005c8e1b43 x15: ffffffd7f76d9d10
[ 0.363929] x14: ffffffc09020b967 x13: ffffffc01020b975 x12: ffffffd7f76d9e30
[ 0.371140] x11: 0000000005f5e0ff x10: ffffffc01020b8c0 x9 : 00000000ffffffd0
[ 0.378350] x8 : 6b20676e69686374 x7 : ffffffd7f79b9238 x6 : c0000000ffff7fff
[ 0.385560] x5 : 0000000000000000 x4 : ffffffd7f5c22898 x3 : 0000000000000010
[ 0.392771] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 000000000000003f
[ 0.399982] Call trace:
[ 0.402461] __apply_alternatives+0x1f0/0x270
[ 0.406873] __apply_alternatives_multi_stop+0xc0/0xe0
[ 0.412062] multi_cpu_stop+0xb8/0x1a0
[ 0.415851] cpu_stopper_thread+0xac/0x120
[ 0.419997] smpboot_thread_fn+0x200/0x238
[ 0.424146] kthread+0x14c/0x158
[ 0.427423] ret_from_fork+0x10/0x1c
[ 0.431045] Code: 39402e61 39402a62 6b01005f 54fff500 (d4210000)
[ 0.437199] ---[ end trace 523e13d9d60a992d ]---
[ 0.441868] note: migration/0[14] exited with preempt_count 2
[ 0.447739] migration/0 (14) used greatest stack depth: 12448 bytes left
[ 0.454543] ------------[ cut here ]------------
[ 0.459211] WARNING: CPU: 0 PID: 0 at kernel/rcu/tree.c:638 rcu_eqs_enter.isra.62+0x98/0x138
[ 0.467734] Modules linked in:
[ 0.470826] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G D 5.13.0-rc2-mainline #4501
[ 0.479594] Hardware name: Thundercomm Dragonboard 845c (DT)
[ 0.485303] pstate: 204003c5 (nzCv DAIF +PAN -UAO -TCO BTYPE=--)
[ 0.491366] pc : rcu_eqs_enter.isra.62+0x98/0x138
[ 0.496122] lr : rcu_eqs_enter.isra.62+0x10/0x138
[ 0.500878] sp : ffffffd7f76d3e70
[ 0.504236] x29: ffffffd7f76d3e70 x28: ffffffd7f76e9780 x27: 0000000000000000
[ 0.511448] x26: 0000000000000000 x25: ffffffd7f707a480 x24: ffffffd7f72c14f0
[ 0.518660] x23: ffffffd7f76d9000 x22: ffffffd7f7d4c000 x21: ffffffd7f76d9000
[ 0.525871] x20: ffffffd7f76e9780 x19: ffffff80fd6a1380 x18: ffffffffffffffff
[ 0.533082] x17: 0000000000000000 x16: 000000000000000e x15: ffffffd7f76d9d10
[ 0.540293] x14: ffffffc09020b5f7 x13: ffffffd7f70130b0 x12: ffffffd7f76d9e30
[ 0.547504] x11: 0000000005f5e0ff x10: 0000000000000a10 x9 : ffffffd7f76d3e00
[ 0.554715] x8 : ffffffd7f76ea1f0 x7 : 0000000000000000 x6 : 00000000fffedb36
[ 0.561926] x5 : 00000000ffffffff x4 : ffffffa9063de000 x3 : 0000000000000001
[ 0.569136] x2 : 4000000000000000 x1 : ffffffd7f76da768 x0 : 4000000000000002
[ 0.576347] Call trace:
[ 0.578825] rcu_eqs_enter.isra.62+0x98/0x138
[ 0.583236] rcu_idle_enter+0x14/0x20
[ 0.586941] default_idle_call+0x44/0x1b8
[ 0.591003] do_idle+0x200/0x2a0
[ 0.594279] cpu_startup_entry+0x2c/0x50
[ 0.598251] rest_init+0xd4/0xe0
[ 0.601524] arch_call_rest_init+0x14/0x1c
[ 0.605680] start_kernel+0x504/0x538
[ 0.609382] ---[ end trace 523e13d9d60a992e ]---
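
For anyone reading the first oops above: the parenthesised word on the "Code:" line is the instruction at the reported pc. The following is a minimal Python sketch that decodes it, assuming the standard A64 BRK encoding and the 0x800 immediate that arm64 Linux uses for BUG(). The helper names are invented for illustration and are not part of any kernel tooling.

```python
# Decode the faulting word from the "Code:" line of the oops above.
# Assumption: the parenthesised word is the instruction at the reported pc.
import re

CODE_LINE = "Code: 39402e61 39402a62 6b01005f 54fff500 (d4210000)"

BRK_MASK = 0xFFE0001F   # fixed bits of the A64 BRK encoding
BRK_BASE = 0xD4200000   # encoding of BRK #0
BUG_BRK_IMM = 0x800     # immediate arm64 Linux uses for BUG()


def faulting_word(code_line):
    """Return the parenthesised instruction word as an integer."""
    match = re.search(r"\(([0-9a-fA-F]{8})\)", code_line)
    if match is None:
        raise ValueError("no parenthesised instruction word found")
    return int(match.group(1), 16)


def brk_immediate(insn):
    """Return the 16-bit BRK immediate, or None if insn is not a BRK."""
    if (insn & BRK_MASK) != BRK_BASE:
        return None
    return (insn >> 5) & 0xFFFF


insn = faulting_word(CODE_LINE)
imm = brk_immediate(insn)
print(f"insn={insn:#010x} brk_imm={imm:#x} is_bug={imm == BUG_BRK_IMM}")
```

If the assumptions hold, the decoded immediate matches the BUG() trap, which is consistent with the "kernel BUG at arch/arm64/kernel/alternative.c:157!" line: the trap is the assertion itself firing rather than a fault in already-patched code.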


2021-05-19 17:05:40

by Will Deacon

Subject: Re: REGRESSION: kernel BUG at arch/arm64/kernel/alternative.c:157!

Hi John,

On Mon, May 17, 2021 at 02:52:59PM -0700, John Stultz wrote:
> With v5.13-rc2, I've been seeing an odd boot regression with the
> DragonBoard 845c:
>
> Unfortunately, trying to bisect it down (v5.13-rc1 works ok) is giving
> me inconsistent results so far. It feels a bit like maybe some config
> option gets enabled moving forward, and then sticks around when we go
> back. I'll take another swing at bisecting it later today, but I have
> to move on to some other work right now, so I figured I'd share (with
> folks who better know the recent __apply_alternatives changes) in case
> folks have a better idea:

Please can you try reverting af44068c581c and 0c6c2d3615ef?

Thanks,

Will

2021-05-19 17:19:06

by Miles Chen

Subject: Re: REGRESSION: kernel BUG at arch/arm64/kernel/alternative.c:157!

On Mon, 2021-05-17 at 14:52 -0700, John Stultz wrote:
> With v5.13-rc2, I've been seeing an odd boot regression with the
> DragonBoard 845c:

I also observed the same issue with v5.13-rc2 (by merging
android-mainline). Here is my bisect result so far.

(bad) ccd25efcb4fe Merge tag 'v5.13-rc2' into android-mainline
(bad) 3d86ae4fcdff Merge 25a1298726e9 ("Merge tag 'trace-v5.13-rc1' of
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace") into
android-mainline
(good) 85adc860fdf3 Merge 6efb943b8616e Linux 5.13-rc1 into
android-mainline


>
> Unfortunately, trying to bisect it down (v5.13-rc1 works ok) is giving
> me inconsistent results so far. It feels a bit like maybe some config
> option gets enabled moving forward, and then sticks around when we go
> back. I'll take another swing at bisecting it later today, but I have
> to move on to some other work right now, so I figured I'd share (with
> folks who better know the recent __apply_alternatives changes) in case
> folks have a better idea:
>
> [ 0.254384] CPU features: detected: RAS Extension Support
> [ 0.259928] CPU: All CPU(s) started at EL1
> [ 0.264127] alternatives: patching kernel code
> [ 0.268635] ------------[ cut here ]------------
> [ 0.273303] kernel BUG at arch/arm64/kernel/alternative.c:157!
> [ 0.279192] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> [ 0.284736] Modules linked in:
> [ 0.287833] CPU: 0 PID: 14 Comm: migration/0 Not tainted
> 5.13.0-rc2-mainline #4501
> [ 0.295472] Hardware name: Thundercomm Dragonboard 845c (DT)
> [ 0.301182] Stopper: multi_cpu_stop+0x0/0x1a0 <-
> stop_machine_cpuslocked+0x128/0x160
> [ 0.309020] pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO BTYPE=--)
> [ 0.315086] pc : __apply_alternatives+0x1f0/0x270
> [ 0.319847] lr : __apply_alternatives+0xf4/0x270
> [ 0.324515] sp : ffffffc01020bca0
> [ 0.327874] x29: ffffffc01020bca0 x28: 00000000000000a0 x27: ffffffd7f5c11124
> [ 0.335086] x26: ffffffd7f5c11128 x25: 00000000001b0020 x24: ffffffd7f700ab90
> [ 0.342297] x23: 0000000000000000 x22: ffffffc01020bd20 x21: ffffffd7f7bea374
> [ 0.349508] x20: ffffffc01020bd30 x19: ffffffd7f72194fc x18: ffffffffffffffff
> [ 0.356718] x17: ffffffd7f7bdce40 x16: 000000005c8e1b43 x15: ffffffd7f76d9d10
> [ 0.363929] x14: ffffffc09020b967 x13: ffffffc01020b975 x12: ffffffd7f76d9e30
> [ 0.371140] x11: 0000000005f5e0ff x10: ffffffc01020b8c0 x9 : 00000000ffffffd0
> [ 0.378350] x8 : 6b20676e69686374 x7 : ffffffd7f79b9238 x6 : c0000000ffff7fff
> [ 0.385560] x5 : 0000000000000000 x4 : ffffffd7f5c22898 x3 : 0000000000000010
> [ 0.392771] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 000000000000003f
> [ 0.399982] Call trace:
> [ 0.402461] __apply_alternatives+0x1f0/0x270
> [ 0.406873] __apply_alternatives_multi_stop+0xc0/0xe0
> [ 0.412062] multi_cpu_stop+0xb8/0x1a0
> [ 0.415851] cpu_stopper_thread+0xac/0x120
> [ 0.419997] smpboot_thread_fn+0x200/0x238
> [ 0.424146] kthread+0x14c/0x158
> [ 0.427423] ret_from_fork+0x10/0x1c
> [ 0.431045] Code: 39402e61 39402a62 6b01005f 54fff500 (d4210000)
> [ 0.437199] ---[ end trace 523e13d9d60a992d ]---
> [ 0.441868] note: migration/0[14] exited with preempt_count 2
> [ 0.447739] migration/0 (14) used greatest stack depth: 12448 bytes left
> [ 0.454543] ------------[ cut here ]------------
> [ 0.459211] WARNING: CPU: 0 PID: 0 at kernel/rcu/tree.c:638
> rcu_eqs_enter.isra.62+0x98/0x138
> [ 0.467734] Modules linked in:
> [ 0.470826] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G D
> 5.13.0-rc2-mainline #4501
> [ 0.479594] Hardware name: Thundercomm Dragonboard 845c (DT)
> [ 0.485303] pstate: 204003c5 (nzCv DAIF +PAN -UAO -TCO BTYPE=--)
> [ 0.491366] pc : rcu_eqs_enter.isra.62+0x98/0x138
> [ 0.496122] lr : rcu_eqs_enter.isra.62+0x10/0x138
> [ 0.500878] sp : ffffffd7f76d3e70
> [ 0.504236] x29: ffffffd7f76d3e70 x28: ffffffd7f76e9780 x27: 0000000000000000
> [ 0.511448] x26: 0000000000000000 x25: ffffffd7f707a480 x24: ffffffd7f72c14f0
> [ 0.518660] x23: ffffffd7f76d9000 x22: ffffffd7f7d4c000 x21: ffffffd7f76d9000
> [ 0.525871] x20: ffffffd7f76e9780 x19: ffffff80fd6a1380 x18: ffffffffffffffff
> [ 0.533082] x17: 0000000000000000 x16: 000000000000000e x15: ffffffd7f76d9d10
> [ 0.540293] x14: ffffffc09020b5f7 x13: ffffffd7f70130b0 x12: ffffffd7f76d9e30
> [ 0.547504] x11: 0000000005f5e0ff x10: 0000000000000a10 x9 : ffffffd7f76d3e00
> [ 0.554715] x8 : ffffffd7f76ea1f0 x7 : 0000000000000000 x6 : 00000000fffedb36
> [ 0.561926] x5 : 00000000ffffffff x4 : ffffffa9063de000 x3 : 0000000000000001
> [ 0.569136] x2 : 4000000000000000 x1 : ffffffd7f76da768 x0 : 4000000000000002
> [ 0.576347] Call trace:
> [ 0.578825] rcu_eqs_enter.isra.62+0x98/0x138
> [ 0.583236] rcu_idle_enter+0x14/0x20
> [ 0.586941] default_idle_call+0x44/0x1b8
> [ 0.591003] do_idle+0x200/0x2a0
> [ 0.594279] cpu_startup_entry+0x2c/0x50
> [ 0.598251] rest_init+0xd4/0xe0
> [ 0.601524] arch_call_rest_init+0x14/0x1c
> [ 0.605680] start_kernel+0x504/0x538
> [ 0.609382] ---[ end trace 523e13d9d60a992e ]---
>
> _______________________________________________
> linux-arm-kernel mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

2021-05-19 17:20:19

by John Garry

Subject: Re: REGRESSION: kernel BUG at arch/arm64/kernel/alternative.c:157!

On 18/05/2021 09:49, Will Deacon wrote:
> Hi John,
>
> On Mon, May 17, 2021 at 02:52:59PM -0700, John Stultz wrote:
>> With v5.13-rc2, I've been seeing an odd boot regression with the
>> DragonBoard 845c:
>>
>> Unfortunately, trying to bisect it down (v5.13-rc1 works ok) is giving
>> me inconsistent results so far. It feels a bit like maybe some config
>> option gets enabled moving forward, and then sticks around when we go
>> back. I'll take another swing at bisecting it later today, but I have
>> to move on to some other work right now, so I figured I'd share (with
>> folks who better know the recent __apply_alternatives changes) in case
>> folks have a better idea:
>
> Please can you try reverting af44068c581c and 0c6c2d3615ef?
>
> Thanks,
>
> Will
>

I saw a crash yesterday evening on my Huawei D05 and D06, but it went
away with a clean build...

Here's my draft mail:

I just noticed this on my Huawei D06 this evening and did not see the same
or similar reported elsewhere:

[ 0.000000] Booting Linux on physical CPU 0x0000080000 [0x481fd010]
[ 0.000000] Linux version 5.13.0-rc2 (john@htsatcamb-server)
(aarch64-linux-gnu-gcc (GNU Toolchain for the A-profile Architecture
8.3-2019.03 (arm-rel-8.36)) 8.3.0, GNU ld (GNU Toolchain for the
A-profile Architecture 8.3-2019.03 (arm-rel-8.36)) 2.32.0.20190321) #260 SMP PREEMPT
Mon May 17 19:58:51 BST 2021
[ 0.000000] efi: EFI v2.70 by EDK II
[ 0.000000] efi: ACPI 2.0=0x2f870000 SMBIOS 3.0=0x2f7e0000
MEMATTR=0x31ad1018 ESRT=0x31b17718 MEMRESERVE=0x2f795d18
[ 0.000000] esrt: Reserving ESRT space from 0x0000000031b17718 to
0x0000000031b17750.
[ 0.000000] ACPI: Early table checksum verification disabled
[ 0.000000] ACPI: RSDP 0x000000002F870000 000024 (v02 HISI )
[ 0.000000] ACPI: XSDT 0x000000002F860000 0000AC (v01 HISI HIP08
00000000 01000013)
[ 0.000000] ACPI: FACP 0x000000002F330000 000114 (v06 HISI HIP08
00000000 HISI 20151124)
[ 0.000000] ACPI: DSDT 0x000000002F0A0000 00CE0E (v02 HISI HIP08
00000000 INTL 20181213)
[ 0.000000] ACPI: PCCT 0x000000002F850000 00008A (v01 HISI HIP08
00000000 HISI 20151124)
[ 0.000000] ACPI: SSDT 0x000000002F840000 00E56A (v02 HISI HIP07
00000000 INTL 20181213)
[ 0.000000] ACPI: BERT 0x000000002F780000 000030 (v01 HISI HIP08
00000000 HISI 20151124)
[ 0.000000] ACPI: HEST 0x000000002F760000 00058C (v01 HISI HIP08
00000000 HISI 20151124)
[ 0.000000] ACPI: ERST 0x000000002F720000 000230 (v01 HISI HIP08
00000000 HISI 20151124)
[ 0.000000] ACPI: EINJ 0x000000002F710000 000170 (v01 HISI HIP08
00000000 HISI 20151124)
[ 0.000000] ACPI: GTDT 0x000000002F310000 00007C (v02 HISI HIP08
00000000 HISI 20151124)
[ 0.000000] ACPI: SDEI 0x000000002F2F0000 000030 (v01 HISI HIP08
00000000 HISI 20151124)
[ 0.000000] ACPI: MCFG 0x000000002F0F0000 00003C (v01 HISI HIP08
00000000 HISI 20151124)
[ 0.000000] ACPI: SLIT 0x000000002F0E0000 00003C (v01 HISI HIP08
00000000 HISI 20151124)
[ 0.000000] ACPI: SPCR 0x000000002F0D0000 000050 (v02 HISI HIP08
00000000 HISI 20151124)
[ 0.000000] ACPI: SRAT 0x000000002F0C0000 000A10 (v03 HISI HIP08
00000000 HISI 20151124)
[ 0.000000] ACPI: APIC 0x000000002F0B0000 00286C (v04 HISI HIP08
00000000 HISI 20151124)
[ 0.000000] ACPI: IORT 0x000000002F090000 001678 (v00 HISI HIP08
00000000 INTL 20181213)
[ 0.000000] ACPI: PPTT 0x000000002E6F0000 0041D0 (v01 HISI HIP08
00000000 HISI 20151124)
[ 0.000000] ACPI: SPMI 0x000000002E6E0000 000041 (v05 HISI HIP08
00000000 HISI 20151124)
[ 0.000000] ACPI: SPCR: console: uart,mmio,0x3f00002f8,115200
[ 0.000000] earlycon: uart0 at MMIO 0x00000003f00002f8 (options '115200')
[ 0.000000] printk: bootconsole [uart0] enabled
[ 0.000000] ACPI: SRAT: Node 0 PXM 0 [mem 0x2080000000-0x27ffffffff]
[ 0.000000] ACPI: SRAT: Node 1 PXM 1 [mem 0x2800000000-0x2fffffffff]
[ 0.000000] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x7fffffff]
[ 0.000000] ACPI: SRAT: Node 2 PXM 2 [mem 0x202000000000-0x2027ffffffff]
[ 0.000000] ACPI: SRAT: Node 3 PXM 3 [mem 0x202800000000-0x202fffffffff]
[ 0.000000] NUMA: NODE_DATA [mem 0x27ffffdc00-0x27ffffffff]
[ 0.000000] NUMA: NODE_DATA [mem 0x2fffffdc00-0x2fffffffff]
[ 0.000000] NUMA: NODE_DATA [mem 0x2027ffffdc00-0x2027ffffffff]
[ 0.000000] NUMA: NODE_DATA [mem 0x202feffffc00-0x202ff0001fff]
[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x0000000000000000-0x00000000ffffffff]
[ 0.000000] DMA32 empty
[ 0.000000] Normal [mem 0x0000000100000000-0x0000202fffffffff]
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000000000000-0x000000000000ffff]
[ 0.000000] node 0: [mem 0x0000000000010000-0x000000002f0fffff]
[ 0.000000] node 0: [mem 0x000000002f100000-0x000000002f2effff]
[ 0.000000] node 0: [mem 0x000000002f2f0000-0x000000002f2fffff]
[ 0.000000] node 0: [mem 0x000000002f300000-0x000000002f30ffff]
[ 0.000000] node 0: [mem 0x000000002f310000-0x000000002f31ffff]
[ 0.000000] node 0: [mem 0x000000002f320000-0x000000002f32ffff]
[ 0.000000] node 0: [mem 0x000000002f330000-0x000000002f33ffff]
[ 0.000000] node 0: [mem 0x000000002f340000-0x000000002f41ffff]
[ 0.000000] node 0: [mem 0x000000002f420000-0x000000002f46ffff]
[ 0.000000] node 0: [mem 0x000000002f470000-0x000000002f50ffff]
[ 0.000000] node 0: [mem 0x000000002f510000-0x000000002f52ffff]
[ 0.000000] node 0: [mem 0x000000002f530000-0x000000002f66ffff]
[ 0.000000] node 0: [mem 0x000000002f670000-0x000000002f72ffff]
[ 0.000000] node 0: [mem 0x000000002f730000-0x000000002f731fff]
[ 0.000000] node 0: [mem 0x000000002f732000-0x000000002f73ffff]
[ 0.000000] node 0: [mem 0x000000002f740000-0x000000002f740fff]
[ 0.000000] node 0: [mem 0x000000002f741000-0x000000002f74ffff]
[ 0.000000] node 0: [mem 0x000000002f750000-0x000000002f751fff]
[ 0.000000] node 0: [mem 0x000000002f752000-0x000000002f76ffff]
[ 0.000000] node 0: [mem 0x000000002f770000-0x000000002f771fff]
[ 0.000000] node 0: [mem 0x000000002f772000-0x000000002f78ffff]
[ 0.000000] node 0: [mem 0x000000002f790000-0x000000002f790fff]
[ 0.000000] node 0: [mem 0x000000002f791000-0x000000002f7cffff]
[ 0.000000] node 0: [mem 0x000000002f7d0000-0x000000002f7f0fff]
[ 0.000000] node 0: [mem 0x000000002f7f1000-0x000000002f87ffff]
[ 0.000000] node 0: [mem 0x000000002f880000-0x000000002fb1ffff]
[ 0.000000] node 0: [mem 0x000000002fb20000-0x000000003eecffff]
[ 0.000000] node 0: [mem 0x000000003eed0000-0x000000003eefffff]
[ 0.000000] node 0: [mem 0x000000003ef00000-0x000000003fbfffff]
[ 0.000000] node 0: [mem 0x0000000040000000-0x0000000043ffffff]
[ 0.000000] node 0: [mem 0x0000000044030000-0x000000004fffffff]
[ 0.000000] node 0: [mem 0x0000000050000000-0x000000007fffffff]
[ 0.000000] node 0: [mem 0x0000002080000000-0x00000027ffffffff]
[ 0.000000] node 1: [mem 0x0000002800000000-0x0000002fffffffff]
[ 0.000000] node 2: [mem 0x0000202000000000-0x00002027ffffffff]
[ 0.000000] node 3: [mem 0x0000202800000000-0x0000202fffffffff]
[ 0.000000] Initmem setup node 0 [mem
0x0000000000000000-0x00000027ffffffff]
[ 0.000000] Initmem setup node 1 [mem
0x0000002800000000-0x0000002fffffffff]
[ 0.000000] Initmem setup node 2 [mem
0x0000202000000000-0x00002027ffffffff]
[ 0.000000] Initmem setup node 3 [mem
0x0000202800000000-0x0000202fffffffff]
[ 0.000000] cma: Reserved 32 MiB at 0x000000007e000000
[ 0.000000] crashkernel reserved: 0x0000000002000000 -
0x0000000012000000 (256 MB)
[ 0.000000] psci: probing for conduit method from ACPI.
[ 0.000000] psci: PSCIv1.1 detected in firmware.
[ 0.000000] psci: Using standard PSCI v0.2 function IDs
[ 0.000000] psci: MIGRATE_INFO_TYPE not supported.
[ 0.000000] psci: SMC Calling Convention v1.1
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x80000 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x80100 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x80200 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x80300 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x90000 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x90100 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x90200 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x90300 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xa0000 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xa0100 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xa0200 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xa0300 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xb0000 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xb0100 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xb0200 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xb0300 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xc0000 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xc0100 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xc0200 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xc0300 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xd0000 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xd0100 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xd0200 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xd0300 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xe0000 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xe0100 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xe0200 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xe0300 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xf0000 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xf0100 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xf0200 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xf0300 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x180000 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x180100 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x180200 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x180300 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x190000 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x190100 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x190200 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x190300 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1a0000 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1a0100 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1a0200 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1a0300 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1b0000 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1b0100 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1b0200 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1b0300 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1c0000 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1c0100 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1c0200 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1c0300 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1d0000 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1d0100 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1d0200 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1d0300 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1e0000 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1e0100 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1e0200 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1e0300 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1f0000 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1f0100 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1f0200 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x1f0300 -> Node 1
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x280000 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x280100 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x280200 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x280300 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x290000 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x290100 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x290200 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x290300 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2a0000 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2a0100 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2a0200 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2a0300 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2b0000 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2b0100 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2b0200 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2b0300 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2c0000 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2c0100 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2c0200 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2c0300 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2d0000 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2d0100 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2d0200 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2d0300 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2e0000 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2e0100 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2e0200 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2e0300 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2f0000 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2f0100 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2f0200 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x2f0300 -> Node 2
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x380000 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x380100 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x380200 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x380300 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x390000 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x390100 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x390200 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x390300 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3a0000 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3a0100 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3a0200 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3a0300 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3b0000 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3b0100 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3b0200 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3b0300 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3c0000 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3c0100 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3c0200 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3c0300 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3d0000 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3d0100 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3d0200 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3d0300 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3e0000 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3e0100 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3e0200 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3e0300 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3f0000 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3f0100 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3f0200 -> Node 3
[ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x3f0300 -> Node 3
[ 0.000000] CPU features: detected: GIC system register CPU interface
[ 0.000000] CPU features: detected: Virtualization Host Extensions
[ 0.000000] CPU features: detected: Hardware dirty bit management
[ 0.000000] alternatives: patching kernel code
[ 0.000000] Built 4 zonelists, mobility grouping on. Total pages:
33029088
[ 0.000000] Policy zone: Normal
[ 0.000000] Kernel command line: BOOT_IMAGE=/john/Image rdinit=/init
crashkernel=256M@32M console=ttyAMA0,115200 earlycon acpi=force
pcie_aspm=off noinitrd root=/dev/sda2 rw log_buf_len=16M user_debug=1
iommu.strict=0 nvme.use_threaded_interrupts=1 irqchip.gicv3_pseudo_nmi=1
[ 0.000000] PCIe ASPM is disabled
[ 0.000000] printk: log_buf_len: 16777216 bytes
[ 0.000000] printk: early log buf free: 112920(86%)
[ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[ 0.000000] software IO TLB: mapped [mem
0x000000007a000000-0x000000007e000000] (64MB)
[ 0.000000] Memory: 131103648K/134213440K available (14592K kernel
code, 3038K rwdata, 8012K rodata, 6080K init, 500K bss, 3077024K
reserved, 32768K cma-reserved)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=128, Nodes=4
[ 0.000000] rcu: Preemptible hierarchical RCU implementation.
[ 0.000000] rcu: RCU event tracing is enabled.
[ 0.000000] rcu: RCU restricting CPUs from NR_CPUS=256 to
nr_cpu_ids=128.
[ 0.000000] Trampoline variant of Tasks RCU enabled.
[ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay
is 25 jiffies.
[ 0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16,
nr_cpu_ids=128
[ 0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[ 0.000000] GICv3: GIC: Using split EOI/Deactivate mode
[ 0.000000] GICv3: 640 SPIs implemented
[ 0.000000] GICv3: 0 Extended SPIs implemented
[ 0.000000] GICv3: Distributor has no Range Selector support
[ 0.000000] Root IRQ handler: gic_handle_irq
[ 0.000000] GICv3: 16 PPIs implemented
[ 0.000000] GICv3: GICv4 features: DirectLPI
[ 0.000000] GICv3: CPU0: found redistributor 80000 region
0:0x00000000ae100000
[ 0.000000] SRAT: PXM 0 -> ITS 0 -> Node 0
[ 0.000000] SRAT: PXM 2 -> ITS 1 -> Node 2
[ 0.000000] ITS [mem 0x202100000-0x20211ffff]
[ 0.000000] ITS@0x0000000202100000: Using ITS number 0
[ 0.000000] ITS@0x0000000202100000: allocated 65536 Devices
@2080280000 (flat, esz 8, psz 16K, shr 1)
[ 0.000000] ITS@0x0000000202100000: allocated 65536 Virtual CPUs
@2080300000 (flat, esz 16, psz 4K, shr 1)
[ 0.000000] ITS@0x0000000202100000: allocated 256 Interrupt
Collections @208026b000 (flat, esz 16, psz 4K, shr 1)
[ 0.000000] ITS [mem 0x200202100000-0x20020211ffff]
[ 0.000000] ITS@0x0000200202100000: Using ITS number 1
[ 0.000000] ITS@0x0000200202100000: allocated 65536 Devices
@202000080000 (flat, esz 8, psz 16K, shr 1)
[ 0.000000] ITS@0x0000200202100000: allocated 65536 Virtual CPUs
@202000100000 (flat, esz 16, psz 4K, shr 1)
[ 0.000000] ITS@0x0000200202100000: allocated 256 Interrupt
Collections @202000002000 (flat, esz 16, psz 4K, shr 1)
[ 0.000000] GICv3: using LPI property table @0x0000002080800000
[ 0.000000] ITS: Using DirectLPI for VPE invalidation
[ 0.000000] ITS: Enabling GICv4 support
[ 0.000000] GICv3: CPU0: using allocated LPI pending table
@0x0000002080820000
[ 0.000000] random: get_random_bytes called from
start_kernel+0x350/0x538 with crng_init=0
[ 0.000000] arch_timer: cp15 timer(s) running at 100.00MHz (phys).
[ 0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff
max_cycles: 0x171024e7e0, max_idle_ns: 440795205315 ns
[ 0.000000] sched_clock: 56 bits at 100MHz, resolution 10ns, wraps
every 4398046511100ns
[ 0.009117] Console: colour dummy device 80x25
[ 0.014107] mempolicy: Enabling automatic NUMA balancing. Configure
with numa_balancing= or the kernel.numa_balancing sysctl
[ 0.026557] ACPI: Core revision 20210331
[ 0.031087] Calibrating delay loop (skipped), value calculated using
timer frequency.. 200.00 BogoMIPS (lpj=400000)
[ 0.042656] pid_max: default: 131072 minimum: 1024
[ 0.048014] LSM: Security Framework initializing
[ 0.063347] Dentry cache hash table entries: 8388608 (order: 14,
67108864 bytes, vmalloc)
[ 0.077708] Inode-cache hash table entries: 4194304 (order: 13,
33554432 bytes, vmalloc)
[ 0.086871] Mount-cache hash table entries: 131072 (order: 8, 1048576
bytes, vmalloc)
[ 0.095713] Mountpoint-cache hash table entries: 131072 (order: 8,
1048576 bytes, vmalloc)
645] Platform MSI: ITS@0x200202100000 domain created
[ 0.125919] PCI/MSI: ITS@0x202100000 domain created
[ 0.131424] PCI/MSI: ITS@0x200202100000 domain created
[ 0.137120] fsl-mc MSI: ITS@0x202100000 domain created
[ 0.142816] fsl-mc MSI: ITS@0x200202100000 domain created
[ 0.148809] Remapping and enabling EFI services.
[ 0.154135] efi: memattr: Entry attributes invalid: RO and XP bits
both cleared
[ 0.162239] efi: memattr: ! 0x000000000000-0x00000000ffff [Runtime
Code|RUN| | | | | | | | | | | | | ]
[ 0.175264] smp: Bringing up secondary CPUs ...
[ 0.180420] Detected VIPT I-cache on CPU1
[ 0.180428] GICv3: CPU1: found redistributor 80100 region
1:0x00000000ae140000
[ 0.180438] GICv3: CPU1: using allocated LPI pending table
@0x0000002080830000
[ 0.180477] CPU1: Booted secondary processor 0x0000080100 [0x481fd010]
[ 0.181300] Insufficient stack space to handle exception!
[ 0.181301] ESR: 0x96000044 -- DABT (current EL)
[ 0.181302] FAR: 0x0000000000000100
[ 0.181303] Task stack: [0xffff8000125b0000..0xffff8000125b4000]
[ 0.181303] IRQ stack: [0xffff800010008000..0xffff80001000c000]
[ 0.181304] Overflow stack: [0xffff0027dfb372b0..0xffff0027dfb382b0]
[ 0.181305] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.13.0-rc2 #260
[ 0.181306] pstate: 400003c9 (nZcv DAIF -PAN -UAO -TCO BTYPE=--)
[ 0.181306] pc : el1_sync+0x0/0x100
[ 0.181307] lr : el1_irq+0xb8/0x150
[ 0.181308] sp : 0000000000000100
[ 0.181308] x29: ffff8000125b3f10 x28: ffff0020872f8e00 x27:
0000000000000000
[ 0.181311] x26: 0000000000004000 x25: 0000000000000000 x24:
ffff800011c29b04
[ 0.181313] x23: 0000000040000009 x22: ffff800010e283c0 x21:
ffff8000125b3f30
[ 0.181314] x20: 0000000000000002 x19: ffff8000125b3de0 x18:
0000000000000030
[ 0.181316] x17: 000c0400bb44ffff x16: 004000b5b5503510 x15:
ffffffffffffffff
[ 0.181318] x14: ffff800011c29948 x13: ffff202fceb259a6 x12:
ffff202fceb2599b
[ 0.181320] x11: 0000000000000040 x10: 00000000000009c0 x9 :
ffff8000125b3ea0
[ 0.181322] x8 : ffff0020872f9820 x7 : 0000000000000000 x6 :
ffff0020872f8e00
[ 0.181323] x5 : ffff0027dfb448c0 x4 : ffff0027dfb449e0 x3 :
0000000000000000
[ 0.181325] x2 : 0000000000000006 x1 : ffff8000104c5058 x0 :
ffff8000125b3de0
[ 0.181327] Kernel panic - not syncing: kernel stack overflow
[ 0.181327] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.13.0-rc2 #260
[ 0.181328] Call trace:
[ 0.181329] dump_backtrace+0x0/0x1b0
[ 0.181376] SMP: stopping secondary CPUs



2021-05-19 17:37:35

by Marc Zyngier

Subject: Re: REGRESSION: kernel BUG at arch/arm64/kernel/alternative.c:157!

+ Michael

On Mon, 17 May 2021 22:52:59 +0100,
John Stultz <[email protected]> wrote:
>
> With v5.13-rc2, I've been seeing an odd boot regression with the
> DragonBoard 845c:
>
> Unfortunately, trying to bisect it down (v5.13-rc1 works ok) is giving
> me inconsistent results so far. It feels a bit like maybe some config
> option gets enabled moving forward, and then sticks around when we go
> back. I'll take another swing at bisecting it later today, but I have
> to move on to some other work right now, so I figured I'd share (with
> folks who better know the recent __apply_alternatives changes) in case
> folks have a better idea:
>
> [ 0.254384] CPU features: detected: RAS Extension Support
> [ 0.259928] CPU: All CPU(s) started at EL1
> [ 0.264127] alternatives: patching kernel code
> [ 0.268635] ------------[ cut here ]------------
> [ 0.273303] kernel BUG at arch/arm64/kernel/alternative.c:157!
> [ 0.279192] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> [ 0.284736] Modules linked in:
> [ 0.287833] CPU: 0 PID: 14 Comm: migration/0 Not tainted
> 5.13.0-rc2-mainline #4501
> [ 0.295472] Hardware name: Thundercomm Dragonboard 845c (DT)
> [ 0.301182] Stopper: multi_cpu_stop+0x0/0x1a0 <-
> stop_machine_cpuslocked+0x128/0x160
> [ 0.309020] pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO BTYPE=--)
> [ 0.315086] pc : __apply_alternatives+0x1f0/0x270
> [ 0.319847] lr : __apply_alternatives+0xf4/0x270
> [ 0.324515] sp : ffffffc01020bca0
> [ 0.327874] x29: ffffffc01020bca0 x28: 00000000000000a0 x27: ffffffd7f5c11124
> [ 0.335086] x26: ffffffd7f5c11128 x25: 00000000001b0020 x24: ffffffd7f700ab90
> [ 0.342297] x23: 0000000000000000 x22: ffffffc01020bd20 x21: ffffffd7f7bea374
> [ 0.349508] x20: ffffffc01020bd30 x19: ffffffd7f72194fc x18: ffffffffffffffff
> [ 0.356718] x17: ffffffd7f7bdce40 x16: 000000005c8e1b43 x15: ffffffd7f76d9d10
> [ 0.363929] x14: ffffffc09020b967 x13: ffffffc01020b975 x12: ffffffd7f76d9e30
> [ 0.371140] x11: 0000000005f5e0ff x10: ffffffc01020b8c0 x9 : 00000000ffffffd0
> [ 0.378350] x8 : 6b20676e69686374 x7 : ffffffd7f79b9238 x6 : c0000000ffff7fff
> [ 0.385560] x5 : 0000000000000000 x4 : ffffffd7f5c22898 x3 : 0000000000000010
> [ 0.392771] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 000000000000003f
> [ 0.399982] Call trace:
> [ 0.402461] __apply_alternatives+0x1f0/0x270
> [ 0.406873] __apply_alternatives_multi_stop+0xc0/0xe0
> [ 0.412062] multi_cpu_stop+0xb8/0x1a0
> [ 0.415851] cpu_stopper_thread+0xac/0x120
> [ 0.419997] smpboot_thread_fn+0x200/0x238
> [ 0.424146] kthread+0x14c/0x158
> [ 0.427423] ret_from_fork+0x10/0x1c
> [ 0.431045] Code: 39402e61 39402a62 6b01005f 54fff500 (d4210000)
> [ 0.437199] ---[ end trace 523e13d9d60a992d ]---
> [ 0.441868] note: migration/0[14] exited with preempt_count 2
> [ 0.447739] migration/0 (14) used greatest stack depth: 12448 bytes left

[/me digs in my IRC logs]

This looks a lot like an issue that was reported by Michael Walle a
few days ago on IRC, leading to a crash that looked like this:

[ 0.325238] alternatives: patching kernel code
[ 0.329735] ------------[ cut here ]------------
[ 0.334394] kernel BUG at arch/arm64/kernel/alternative.c:157!
[ 0.340300] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 0.345836] Modules linked in:
[ 0.348916] CPU: 0 PID: 14 Comm: migration/0 Not tainted 5.13.0-rc1-next-20210511+ #536
[ 0.356998] Hardware name: Kontron SMARC-sAL28 (Single PHY) on SMARC Eval 2.0 carrier (DT)
[ 0.365339] Stopper: multi_cpu_stop+0x0/0x1a8 <- stop_cpus.constprop.9+0x78/0xc8
[ 0.372820] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO BTYPE=--)
[ 0.378882] pc : __apply_alternatives.isra.1+0x1c4/0x270
[ 0.384246] lr : __apply_alternatives.isra.1+0x110/0x270
[ 0.389606] sp : ffff800012db3ca0
[ 0.392946] x29: ffff800012db3ca0 x28: 0000000000000000 x27: ffff800010011924
[ 0.400155] x26: ffff800010011928 x25: 00000000001b0020 x24: ffff8000115ad350
[ 0.407364] x23: ffff800012db3d28 x22: 0000000000000000 x21: ffff800011fb24cd
[ 0.414571] x20: ffff800012db3d30 x19: ffff800011840b38 x18: 0000000000000010
[ 0.421779] x17: 0000000044a56c23 x16: 0000000000000002 x15: ffffffffffffffff
[ 0.428986] x14: ffff800011d50a48 x13: ffff800092db3987 x12: ffff800011de6a70
[ 0.436193] x11: 0000000000000003 x10: ffff800011dcea30 x9 : ffff8000105d8928
[ 0.443401] x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000000000001
[ 0.450608] x5 : 0000000000000000 x4 : ffff800010024398 x3 : 0000000000000010
[ 0.457815] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 000000000000003f
[ 0.465022] Call trace:
[ 0.467483] __apply_alternatives.isra.1+0x1c4/0x270
[ 0.472493] __apply_alternatives_multi_stop+0xcc/0xe0
[ 0.477679] multi_cpu_stop+0xac/0x1a8
[ 0.481460] cpu_stopper_thread+0xa4/0x138
[ 0.485592] smpboot_thread_fn+0x12c/0x268
[ 0.489725] kthread+0x164/0x168
[ 0.492980] ret_from_fork+0x10/0x30
[ 0.496588] Code: 39402e61 39402a62 6b01005f 54fff6a0 (d4210000)
[ 0.502742] ---[ end trace 24ef7d65759ab825 ]---
[ 0.507398] note: migration/0[14] exited with preempt_count 2
[ 0.513290] ------------[ cut here ]------------

Michael subsequently reported that:

<quote>
mhh nevermind, I can't reproduce it anymore. Maybe I should have
recompiled with a clean build dir at first
</quote>

My gut feeling is that we can end up with some build leftovers when
going between -rc1 and -rc2, hence the screw-up when the capabilities
get reordered. Dependency issues?

Thanks,

M.

--
Without deviation from the norm, progress is not possible.

2021-05-19 17:41:38

by Mark Rutland

Subject: Re: REGRESSION: kernel BUG at arch/arm64/kernel/alternative.c:157!

On Tue, May 18, 2021 at 10:12:32AM +0100, Marc Zyngier wrote:
> + Michael
>
> On Mon, 17 May 2021 22:52:59 +0100,
> John Stultz <[email protected]> wrote:
> >
> > With v5.13-rc2, I've been seeing an odd boot regression with the
> > DragonBoard 845c:
> >
> > Unfortunately, trying to bisect it down (v5.13-rc1 works ok) is giving
> > me inconsistent results so far. It feels a bit like maybe some config
> > option gets enabled moving forward, and then sticks around when we go
> > back. I'll take another swing at bisecting it later today, but I have
> > to move on to some other work right now, so I figured I'd share (with
> > folks who better know the recent __apply_alternatives changes) in case
> > folks have a better idea:
> >
> > [ 0.254384] CPU features: detected: RAS Extension Support
> > [ 0.259928] CPU: All CPU(s) started at EL1
> > [ 0.264127] alternatives: patching kernel code
> > [ 0.268635] ------------[ cut here ]------------
> > [ 0.273303] kernel BUG at arch/arm64/kernel/alternative.c:157!
> > [ 0.279192] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> > [ 0.284736] Modules linked in:
> > [ 0.287833] CPU: 0 PID: 14 Comm: migration/0 Not tainted
> > 5.13.0-rc2-mainline #4501
> > [ 0.295472] Hardware name: Thundercomm Dragonboard 845c (DT)
> > [ 0.301182] Stopper: multi_cpu_stop+0x0/0x1a0 <-
> > stop_machine_cpuslocked+0x128/0x160
> > [ 0.309020] pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO BTYPE=--)
> > [ 0.315086] pc : __apply_alternatives+0x1f0/0x270
> > [ 0.319847] lr : __apply_alternatives+0xf4/0x270
> > [ 0.324515] sp : ffffffc01020bca0
> > [ 0.327874] x29: ffffffc01020bca0 x28: 00000000000000a0 x27: ffffffd7f5c11124
> > [ 0.335086] x26: ffffffd7f5c11128 x25: 00000000001b0020 x24: ffffffd7f700ab90
> > [ 0.342297] x23: 0000000000000000 x22: ffffffc01020bd20 x21: ffffffd7f7bea374
> > [ 0.349508] x20: ffffffc01020bd30 x19: ffffffd7f72194fc x18: ffffffffffffffff
> > [ 0.356718] x17: ffffffd7f7bdce40 x16: 000000005c8e1b43 x15: ffffffd7f76d9d10
> > [ 0.363929] x14: ffffffc09020b967 x13: ffffffc01020b975 x12: ffffffd7f76d9e30
> > [ 0.371140] x11: 0000000005f5e0ff x10: ffffffc01020b8c0 x9 : 00000000ffffffd0
> > [ 0.378350] x8 : 6b20676e69686374 x7 : ffffffd7f79b9238 x6 : c0000000ffff7fff
> > [ 0.385560] x5 : 0000000000000000 x4 : ffffffd7f5c22898 x3 : 0000000000000010
> > [ 0.392771] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 000000000000003f
> > [ 0.399982] Call trace:
> > [ 0.402461] __apply_alternatives+0x1f0/0x270
> > [ 0.406873] __apply_alternatives_multi_stop+0xc0/0xe0
> > [ 0.412062] multi_cpu_stop+0xb8/0x1a0
> > [ 0.415851] cpu_stopper_thread+0xac/0x120
> > [ 0.419997] smpboot_thread_fn+0x200/0x238
> > [ 0.424146] kthread+0x14c/0x158
> > [ 0.427423] ret_from_fork+0x10/0x1c
> > [ 0.431045] Code: 39402e61 39402a62 6b01005f 54fff500 (d4210000)
> > [ 0.437199] ---[ end trace 523e13d9d60a992d ]---
> > [ 0.441868] note: migration/0[14] exited with preempt_count 2
> > [ 0.447739] migration/0 (14) used greatest stack depth: 12448 bytes left
>
> [/me digs in my IRC logs]
>
> This looks a lot like an issue that was reported by Michael Walle a
> few days ago on IRC, leading to a crash that looked like this:
>
> [ 0.325238] alternatives: patching kernel code
> [ 0.329735] ------------[ cut here ]------------
> [ 0.334394] kernel BUG at arch/arm64/kernel/alternative.c:157!
> [ 0.340300] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> [ 0.345836] Modules linked in:
> [ 0.348916] CPU: 0 PID: 14 Comm: migration/0 Not tainted 5.13.0-rc1-next-20210511+ #536
> [ 0.356998] Hardware name: Kontron SMARC-sAL28 (Single PHY) on SMARC Eval 2.0 carrier (DT)
> [ 0.365339] Stopper: multi_cpu_stop+0x0/0x1a8 <- stop_cpus.constprop.9+0x78/0xc8
> [ 0.372820] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO BTYPE=--)
> [ 0.378882] pc : __apply_alternatives.isra.1+0x1c4/0x270
> [ 0.384246] lr : __apply_alternatives.isra.1+0x110/0x270
> [ 0.389606] sp : ffff800012db3ca0
> [ 0.392946] x29: ffff800012db3ca0 x28: 0000000000000000 x27: ffff800010011924
> [ 0.400155] x26: ffff800010011928 x25: 00000000001b0020 x24: ffff8000115ad350
> [ 0.407364] x23: ffff800012db3d28 x22: 0000000000000000 x21: ffff800011fb24cd
> [ 0.414571] x20: ffff800012db3d30 x19: ffff800011840b38 x18: 0000000000000010
> [ 0.421779] x17: 0000000044a56c23 x16: 0000000000000002 x15: ffffffffffffffff
> [ 0.428986] x14: ffff800011d50a48 x13: ffff800092db3987 x12: ffff800011de6a70
> [ 0.436193] x11: 0000000000000003 x10: ffff800011dcea30 x9 : ffff8000105d8928
> [ 0.443401] x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000000000001
> [ 0.450608] x5 : 0000000000000000 x4 : ffff800010024398 x3 : 0000000000000010
> [ 0.457815] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 000000000000003f
> [ 0.465022] Call trace:
> [ 0.467483] __apply_alternatives.isra.1+0x1c4/0x270
> [ 0.472493] __apply_alternatives_multi_stop+0xcc/0xe0
> [ 0.477679] multi_cpu_stop+0xac/0x1a8
> [ 0.481460] cpu_stopper_thread+0xa4/0x138
> [ 0.485592] smpboot_thread_fn+0x12c/0x268
> [ 0.489725] kthread+0x164/0x168
> [ 0.492980] ret_from_fork+0x10/0x30
> [ 0.496588] Code: 39402e61 39402a62 6b01005f 54fff6a0 (d4210000)
> [ 0.502742] ---[ end trace 24ef7d65759ab825 ]---
> [ 0.507398] note: migration/0[14] exited with preempt_count 2
> [ 0.513290] ------------[ cut here ]------------
>
> Michael subsequently reported that:
>
> <quote>
> mhh nevermind, I can't reproduce it anymore. Maybe I should have
> recompiled with a clean build dir at first
> </quote>
>
> My gut feeling is that we can end up with some build leftovers when
> going between -rc1 and -rc2, hence the screw-up when the capabilities
> get reordered. Dependency issues?

I've just reproduced this, and I'm dissecting what I have. It looks like
something goes wrong when moving from v5.13-rc1 to v5.13-rc2.

Reproduction steps below. Note `usekorg` is my script to run a specific
build of the kernel.org crosstool binaries.

$ git clean -fdx
$ git checkout v5.13-rc1
$ usekorg 10.1.0 make ARCH=arm64 CROSS_COMPILE=aarch64-linux- defconfig
$ usekorg 10.1.0 make ARCH=arm64 CROSS_COMPILE=aarch64-linux- Image -j50
$ git checkout v5.13-rc2
$ usekorg 10.1.0 make ARCH=arm64 CROSS_COMPILE=aarch64-linux- Image -j50

... then when I run the resulting Image in a KVM guest on ThunderX2, I
get a splat at boot:

[ 0.437023] ------------[ cut here ]------------
[ 0.438314] kernel BUG at arch/arm64/kernel/alternative.c:157!
[ 0.439970] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 0.441428] Modules linked in:
[ 0.442293] CPU: 0 PID: 12 Comm: migration/0 Not tainted 5.13.0-rc2 #2
[ 0.444123] Hardware name: linux,dummy-virt (DT)
[ 0.445351] Stopper: multi_cpu_stop+0x0/0x17c <- stop_machine_cpuslocked+0x11c/0x16c
[ 0.447542] pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO BTYPE=--)
[ 0.449155] pc : __apply_alternatives+0x210/0x250
[ 0.450473] lr : __apply_alternatives+0xf8/0x250
[ 0.451800] sp : ffff8000100abcb0
[ 0.452718] x29: ffff8000100abcb0 x28: ffffa8930f011128 x27: ffffa8931070f4c8
[ 0.454836] x26: ffffa8930f01112c x25: 00000000001b0020 x24: ffffa89310470650
[ 0.456854] x23: 0000000000000000 x22: ffffa89310f5ea84 x21: ffff8000100abd30
[ 0.458957] x20: ffff8000100abd40 x19: ffffa8931070f4cc x18: 0000000000000030
[ 0.460944] x17: ffff16ea7efee940 x16: 0000000000000068 x15: ffff16e94016a050
[ 0.463142] x14: ffffffffffffffff x13: ffffa89310ca29d8 x12: 0000000000000135
[ 0.465157] x11: 0000000000000067 x10: ffffa89310cfa9d8 x9 : 00000000fffff000
[ 0.467292] x8 : ffffa89310ca29d8 x7 : ffffa89310cfa9d8 x6 : 0000000000000000
[ 0.469273] x5 : 0000000000000000 x4 : 000000000000003f x3 : ffffffffffffffc0
[ 0.471471] x2 : 0000000000000023 x1 : 0000000000000004 x0 : 0000000000000000
[ 0.473532] Call trace:
[ 0.474216] __apply_alternatives+0x210/0x250
[ 0.475426] __apply_alternatives_multi_stop+0xc0/0xd4
[ 0.476994] multi_cpu_stop+0xa8/0x17c
[ 0.478052] cpu_stopper_thread+0x9c/0x130
[ 0.479235] smpboot_thread_fn+0x254/0x280
[ 0.480434] kthread+0x158/0x160
[ 0.481395] ret_from_fork+0x10/0x30
[ 0.482449] Code: 8b040264 d63f0080 f94006a1 17ffffc9 (d4210000)
[ 0.484144] ---[ end trace f570e0f98f46a6c3 ]---

Thanks,
Mark.

2021-05-19 17:45:33

by Mark Rutland

Subject: Re: REGRESSION: kernel BUG at arch/arm64/kernel/alternative.c:157!

Adding Mark Brown and Masahiro Yamada.

It looks like there's a dependency issue where assembly files don't get rebuilt
when a generated header they depend upon is rebuilt, and from commit:

0c6c2d3615efb7c2 ("arm64: Generate cpucaps.h")

... we can have stale objects with old cpucap values.

More detail below.

On Tue, May 18, 2021 at 10:32:11AM +0100, Mark Rutland wrote:
> On Tue, May 18, 2021 at 10:12:32AM +0100, Marc Zyngier wrote:
> > On Mon, 17 May 2021 22:52:59 +0100,
> > John Stultz <[email protected]> wrote:
> > >
> > > With v5.13-rc2, I've been seeing an odd boot regression with the
> > > DragonBoard 845c:
> > >
> > > Unfortunately, trying to bisect it down (v5.13-rc1 works ok) is giving
> > > me inconsistent results so far. It feels a bit like maybe some config
> > > option gets enabled moving forward, and then sticks around when we go
> > > back. I'll take another swing at bisecting it later today, but I have
> > > to move on to some other work right now, so I figured I'd share (with
> > > folks who better know the recent __apply_alternatives changes) in case
> > > folks have a better idea:
> > >
> > > [ 0.254384] CPU features: detected: RAS Extension Support
> > > [ 0.259928] CPU: All CPU(s) started at EL1
> > > [ 0.264127] alternatives: patching kernel code
> > > [ 0.268635] ------------[ cut here ]------------
> > > [ 0.273303] kernel BUG at arch/arm64/kernel/alternative.c:157!
> > > [ 0.279192] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> > > [ 0.284736] Modules linked in:
> > > [ 0.287833] CPU: 0 PID: 14 Comm: migration/0 Not tainted
> > > 5.13.0-rc2-mainline #4501
> > > [ 0.295472] Hardware name: Thundercomm Dragonboard 845c (DT)
> > > [ 0.301182] Stopper: multi_cpu_stop+0x0/0x1a0 <-
> > > stop_machine_cpuslocked+0x128/0x160
> > > [ 0.309020] pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO BTYPE=--)
> > > [ 0.315086] pc : __apply_alternatives+0x1f0/0x270
> > > [ 0.319847] lr : __apply_alternatives+0xf4/0x270
> > > [ 0.324515] sp : ffffffc01020bca0
> > > [ 0.327874] x29: ffffffc01020bca0 x28: 00000000000000a0 x27: ffffffd7f5c11124
> > > [ 0.335086] x26: ffffffd7f5c11128 x25: 00000000001b0020 x24: ffffffd7f700ab90
> > > [ 0.342297] x23: 0000000000000000 x22: ffffffc01020bd20 x21: ffffffd7f7bea374
> > > [ 0.349508] x20: ffffffc01020bd30 x19: ffffffd7f72194fc x18: ffffffffffffffff
> > > [ 0.356718] x17: ffffffd7f7bdce40 x16: 000000005c8e1b43 x15: ffffffd7f76d9d10
> > > [ 0.363929] x14: ffffffc09020b967 x13: ffffffc01020b975 x12: ffffffd7f76d9e30
> > > [ 0.371140] x11: 0000000005f5e0ff x10: ffffffc01020b8c0 x9 : 00000000ffffffd0
> > > [ 0.378350] x8 : 6b20676e69686374 x7 : ffffffd7f79b9238 x6 : c0000000ffff7fff
> > > [ 0.385560] x5 : 0000000000000000 x4 : ffffffd7f5c22898 x3 : 0000000000000010
> > > [ 0.392771] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 000000000000003f
> > > [ 0.399982] Call trace:
> > > [ 0.402461] __apply_alternatives+0x1f0/0x270
> > > [ 0.406873] __apply_alternatives_multi_stop+0xc0/0xe0
> > > [ 0.412062] multi_cpu_stop+0xb8/0x1a0
> > > [ 0.415851] cpu_stopper_thread+0xac/0x120
> > > [ 0.419997] smpboot_thread_fn+0x200/0x238
> > > [ 0.424146] kthread+0x14c/0x158
> > > [ 0.427423] ret_from_fork+0x10/0x1c
> > > [ 0.431045] Code: 39402e61 39402a62 6b01005f 54fff500 (d4210000)
> > > [ 0.437199] ---[ end trace 523e13d9d60a992d ]---
> > > [ 0.441868] note: migration/0[14] exited with preempt_count 2
> > > [ 0.447739] migration/0 (14) used greatest stack depth: 12448 bytes left
> >
> > [/me digs in my IRC logs]
> >
> > This looks a lot like an issue that was reported by Michael Walle a
> > few days ago on IRC, leading to a crash that looked like this:
> >
> > [ 0.325238] alternatives: patching kernel code
> > [ 0.329735] ------------[ cut here ]------------
> > [ 0.334394] kernel BUG at arch/arm64/kernel/alternative.c:157!
> > [ 0.340300] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> > [ 0.345836] Modules linked in:
> > [ 0.348916] CPU: 0 PID: 14 Comm: migration/0 Not tainted 5.13.0-rc1-next-20210511+ #536
> > [ 0.356998] Hardware name: Kontron SMARC-sAL28 (Single PHY) on SMARC Eval 2.0 carrier (DT)
> > [ 0.365339] Stopper: multi_cpu_stop+0x0/0x1a8 <- stop_cpus.constprop.9+0x78/0xc8
> > [ 0.372820] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO BTYPE=--)
> > [ 0.378882] pc : __apply_alternatives.isra.1+0x1c4/0x270
> > [ 0.384246] lr : __apply_alternatives.isra.1+0x110/0x270
> > [ 0.389606] sp : ffff800012db3ca0
> > [ 0.392946] x29: ffff800012db3ca0 x28: 0000000000000000 x27: ffff800010011924
> > [ 0.400155] x26: ffff800010011928 x25: 00000000001b0020 x24: ffff8000115ad350
> > [ 0.407364] x23: ffff800012db3d28 x22: 0000000000000000 x21: ffff800011fb24cd
> > [ 0.414571] x20: ffff800012db3d30 x19: ffff800011840b38 x18: 0000000000000010
> > [ 0.421779] x17: 0000000044a56c23 x16: 0000000000000002 x15: ffffffffffffffff
> > [ 0.428986] x14: ffff800011d50a48 x13: ffff800092db3987 x12: ffff800011de6a70
> > [ 0.436193] x11: 0000000000000003 x10: ffff800011dcea30 x9 : ffff8000105d8928
> > [ 0.443401] x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000000000001
> > [ 0.450608] x5 : 0000000000000000 x4 : ffff800010024398 x3 : 0000000000000010
> > [ 0.457815] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 000000000000003f
> > [ 0.465022] Call trace:
> > [ 0.467483] __apply_alternatives.isra.1+0x1c4/0x270
> > [ 0.472493] __apply_alternatives_multi_stop+0xcc/0xe0
> > [ 0.477679] multi_cpu_stop+0xac/0x1a8
> > [ 0.481460] cpu_stopper_thread+0xa4/0x138
> > [ 0.485592] smpboot_thread_fn+0x12c/0x268
> > [ 0.489725] kthread+0x164/0x168
> > [ 0.492980] ret_from_fork+0x10/0x30
> > [ 0.496588] Code: 39402e61 39402a62 6b01005f 54fff6a0 (d4210000)
> > [ 0.502742] ---[ end trace 24ef7d65759ab825 ]---
> > [ 0.507398] note: migration/0[14] exited with preempt_count 2
> > [ 0.513290] ------------[ cut here ]------------
> >
> > Michael subsequently reported that:
> >
> > <quote>
> > mhh nevermind, I can't reproduce it anymore. Maybe I should have
> > recompiled with a clean build dir at first
> > </quote>
> >
> > My gut feeling is that we can end up with some build leftovers when
> > going between -rc1 and -rc2, hence the screw-up when the capabilities
> > get reordered. Dependency issues?
>
> I've just reproduced this, and I'm dissecting what I have. It looks like
> something goes wrong when moving from v5.13-rc1 to v5.13-rc2.

I hacked some debug in and got:

[ 0.362573] alternatives: patching kernel code
[ 0.363856] alternatives: Bad alt region at 0xffffd98efe90f4c8
[ 0.365510] alternatives: alt->cpufeature = 63
[ 0.366829] alternatives: alt->orig_len = 4
[ 0.368155] alternatives: alt->alt_len = 0
[ 0.369377] alternatives: ptr = 0xffffd98efd211168, el0_sync_invalid+0x100/0x1a4

The alt->cpufeature value is bad: ARM64_NCAPS has been 61 since commit:

0c6c2d3615efb7c2 ("arm64: Generate cpucaps.h")

... so it looks like we're missing a dependency on the generated header,
and are not rebuilding arch/arm64/kernel/entry.S.
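
To make the mismatch concrete, here is a toy shell model of the length
check, not the kernel's code. It assumes (this is not quoted anywhere in
the thread) that an entry whose cpufeature equals ARM64_NCAPS is treated
as a callback-style entry that must have alt_len == 0, and that every
other entry must have alt_len == orig_len. The old marker value of 63 is
inferred from the debug output above, and check_alt is a made-up helper
name.

```shell
#!/bin/sh
# Toy model only; the real check lives in arch/arm64/kernel/alternative.c.
# check_alt NCAPS CPUFEATURE ORIG_LEN ALT_LEN
# Assumption: cpufeature == NCAPS marks a callback-style entry, which must
# carry alt_len == 0; any other entry must have alt_len == orig_len.
check_alt() {
    ncaps=$1 cpufeature=$2 orig_len=$3 alt_len=$4
    if [ "$cpufeature" -eq "$ncaps" ]; then
        if [ "$alt_len" -eq 0 ]; then echo ok; else echo BUG; fi
    else
        if [ "$alt_len" -eq "$orig_len" ]; then echo ok; else echo BUG; fi
    fi
}

# Values from the debug output: cpufeature=63, orig_len=4, alt_len=0.
check_alt 63 63 4 0   # object and checker agree on the marker value
check_alt 61 63 4 0   # stale object checked against ARM64_NCAPS == 61
```

Under those assumptions the same entry passes when the checker and the
object agree on 63, and trips the check once the checker uses 61, which
is consistent with the splat.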

If I forcefully rebuild entry.S, I instead get:

[ 0.303521] alternatives: patching kernel code
[ 0.304689] alternatives: Bad alt region at 0xffffd4e7567110ac
[ 0.306177] alternatives: alt->cpufeature = 63
[ 0.307302] alternatives: alt->orig_len = 16
[ 0.308440] alternatives: alt->alt_len = 0
[ 0.309540] alternatives: ptr = 0xffffd4e755073808, __bp_harden_hyp_vecs+0x808/0x1810

... and so on, though I'm eventually left with:

[ 0.356180] alternatives: Bad alt region at 0xffffc4db559116f4
[ 0.357621] alternatives: alt->cpufeature = 63
[ 0.358739] alternatives: alt->orig_len = 16
[ 0.359899] alternatives: alt->alt_len = 0
[ 0.361009] alternatives: ptr = 0xffffc4db550a20d8, el1_error_invalid+0x27a4/0xcb94

... and that doesn't seem to exist:

[mark@lakrids:~/src/linux]% ./scripts/faddr2line vmlinux el1_error_invalid+0x27a4/0xcb94
skipping el1_error_invalid address at 0xffff800010014028 due to size mismatch (0xcb94 != 0xbc)
skipping el1_error_invalid address at 0xffff800010ea20d8 due to size mismatch (0xcb94 != 0x6cc)
skipping el1_error_invalid address at 0xffff800010eb08a8 due to size mismatch (0xcb94 != 0x6fc)
no match for el1_error_invalid+0x27a4/0xcb94

... so there might be another issue here (kallsyms?).
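
For reference, the "size mismatch" lines come from faddr2line comparing
the size given after the slash with the size of each candidate symbol in
vmlinux. A rough sketch of that comparison, using only the numbers from
the output above (size_matches is a hypothetical helper, and this is not
how faddr2line itself is written):

```shell
#!/bin/sh
# size_matches SPEC ACTUAL_SIZE
# SPEC has the form sym+0xOFFSET/0xSIZE, as printed in the debug output.
size_matches() {
    claimed=${1##*/}            # text after the last '/', e.g. 0xcb94
    if [ $(($claimed)) -eq $(($2)) ]; then
        echo match
    else
        echo mismatch
    fi
}

spec='el1_error_invalid+0x27a4/0xcb94'
# Candidate sizes reported by faddr2line for el1_error_invalid.
for actual in 0xbc 0x6cc 0x6fc; do
    printf '%s vs %s: %s\n' "$spec" "$actual" "$(size_matches "$spec" "$actual")"
done
```

All three candidates report a mismatch, and each real size is also
smaller than the 0x27a4 offset, so the address cannot lie inside any of
them.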

Thanks,
Mark.

>
> Reproduction steps below. Note `usekorg` is my script to run a specific
> build of the kernel.org crosstool binaries.
>
> $ git clean -fdx
> $ git checkout v5.13-rc1
> $ usekorg 10.1.0 make ARCH=arm64 CROSS_COMPILE=aarch64-linux- defconfig
> $ usekorg 10.1.0 make ARCH=arm64 CROSS_COMPILE=aarch64-linux- Image -j50
> $ git checkout v5.13-rc2
> $ usekorg 10.1.0 make ARCH=arm64 CROSS_COMPILE=aarch64-linux- Image -j50
>
> ... then when I run the resulting Image in a KVM guest on ThunderX2, I
> get a splat at boot:
>
> [ 0.437023] ------------[ cut here ]------------
> [ 0.438314] kernel BUG at arch/arm64/kernel/alternative.c:157!
> [ 0.439970] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> [ 0.441428] Modules linked in:
> [ 0.442293] CPU: 0 PID: 12 Comm: migration/0 Not tainted 5.13.0-rc2 #2
> [ 0.444123] Hardware name: linux,dummy-virt (DT)
> [ 0.445351] Stopper: multi_cpu_stop+0x0/0x17c <- stop_machine_cpuslocked+0x11c/0x16c
> [ 0.447542] pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO BTYPE=--)
> [ 0.449155] pc : __apply_alternatives+0x210/0x250
> [ 0.450473] lr : __apply_alternatives+0xf8/0x250
> [ 0.451800] sp : ffff8000100abcb0
> [ 0.452718] x29: ffff8000100abcb0 x28: ffffa8930f011128 x27: ffffa8931070f4c8
> [ 0.454836] x26: ffffa8930f01112c x25: 00000000001b0020 x24: ffffa89310470650
> [ 0.456854] x23: 0000000000000000 x22: ffffa89310f5ea84 x21: ffff8000100abd30
> [ 0.458957] x20: ffff8000100abd40 x19: ffffa8931070f4cc x18: 0000000000000030
> [ 0.460944] x17: ffff16ea7efee940 x16: 0000000000000068 x15: ffff16e94016a050
> [ 0.463142] x14: ffffffffffffffff x13: ffffa89310ca29d8 x12: 0000000000000135
> [ 0.465157] x11: 0000000000000067 x10: ffffa89310cfa9d8 x9 : 00000000fffff000
> [ 0.467292] x8 : ffffa89310ca29d8 x7 : ffffa89310cfa9d8 x6 : 0000000000000000
> [ 0.469273] x5 : 0000000000000000 x4 : 000000000000003f x3 : ffffffffffffffc0
> [ 0.471471] x2 : 0000000000000023 x1 : 0000000000000004 x0 : 0000000000000000
> [ 0.473532] Call trace:
> [ 0.474216] __apply_alternatives+0x210/0x250
> [ 0.475426] __apply_alternatives_multi_stop+0xc0/0xd4
> [ 0.476994] multi_cpu_stop+0xa8/0x17c
> [ 0.478052] cpu_stopper_thread+0x9c/0x130
> [ 0.479235] smpboot_thread_fn+0x254/0x280
> [ 0.480434] kthread+0x158/0x160
> [ 0.481395] ret_from_fork+0x10/0x30
> [ 0.482449] Code: 8b040264 d63f0080 f94006a1 17ffffc9 (d4210000)
> [ 0.484144] ---[ end trace f570e0f98f46a6c3 ]---
>
> Thanks,
> Mark.

2021-05-19 18:29:28

by John Stultz

Subject: Re: REGRESSION: kernel BUG at arch/arm64/kernel/alternative.c:157!

On Tue, May 18, 2021 at 2:59 AM Mark Rutland <[email protected]> wrote:
>
> Adding Mark Brown and Masahiro Yamada.
>
> It looks like there's a dependency issue where assembly files don't get rebuilt
> when a generated header they depend upon is rebuilt, and from commit:
>
> 0c6c2d3615efb7c2 ("arm64: Generate cpucaps.h")
>
> ... we can have stale objects with old cpucap values.


Thanks for confirming! I've also verified that things do get back to
booting ok w/ v5.13-rc2 after a make clean, so it does seem to be some
sort of dependency issue, which explains the inconsistent bisection.
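
For anyone who wants to check a build tree before resorting to a full
make clean, a rough heuristic is to list objects that are not newer than
the generated header. This is only a sketch: stale_objects is a made-up
helper, the header path in the usage comment is an assumption about where
cpucaps.h gets generated, and the list will also include objects that
never depended on the header.

```shell
#!/bin/sh
# stale_objects HEADER DIR
# Print every *.o file under DIR whose mtime is not newer than HEADER.
stale_objects() {
    find "$2" -type f -name '*.o' ! -newer "$1" | sort
}

# Hypothetical usage from the top of a kernel tree (path is an assumption):
#   stale_objects arch/arm64/include/generated/asm/cpucaps.h arch/arm64
```

Deleting the listed objects and rebuilding is cheaper than a clean build,
but make clean remains the safe option.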

thanks
-john

2021-05-20 01:20:39

by John Stultz

Subject: Re: REGRESSION: kernel BUG at arch/arm64/kernel/alternative.c:157!

On Tue, May 18, 2021 at 1:49 AM Will Deacon <[email protected]> wrote:
>
> Hi John,
>
> On Mon, May 17, 2021 at 02:52:59PM -0700, John Stultz wrote:
> > With v5.13-rc2, I've been seeing an odd boot regression with the
> > DragonBoard 845c:
> >
> > Unfortunately, trying to bisect it down (v5.13-rc1 works ok) is giving
> > me inconsistent results so far. It feels a bit like maybe some config
> > option gets enabled moving forward, and then sticks around when we go
> > back. I'll take another swing at bisecting it later today, but I have
> > to move on to some other work right now, so I figured I'd share (with
> > folks who better know the recent __apply_alternatives changes) in case
> > folks have a better idea:
>
> Please can you try reverting af44068c581c and 0c6c2d3615ef?

Hey Will,
I realized I didn't get back to you on this. As MarkR already noted,
it does seem to be coming from 0c6c2d3615ef. If I jump to 5.13-rc1, do
a make clean, build and boot, then jump to 5.13-rc2 plus the two
reverts above and build and boot again, the issue doesn't appear. If I
just jump to 5.13-rc2, or to 5.13-rc2 with only af44068c581c reverted,
I see the issue after building and booting.

Given it disappears after a make clean, I'm guessing this isn't a
major issue, mostly just a hazard for folks who might accidentally hit
it while bisecting, so I'm not sure if there's anything else to do.

Let me know if you'd like me to try anything else.

thanks
-john