2023-12-18 13:51:08

by Matthias Schiffer

Subject: powerpc: several early boot regressions on MPC52xx

Hi all,

I'm currently in the process of porting our ancient TQM5200 SoM to a modern kernel, and I've
identified a number of regressions that cause early boot failures (before the UART console has been
initialized) with arch/powerpc/configs/52xx/tqm5200_defconfig.

Issue 1) Boot fails with CONFIG_PPC_KUAP enabled (enabled by default since 9f5bd8f1471d7
"powerpc/32s: Activate KUAP and KUEP by default"). The reason is a number of of_iomap() calls in
arch/powerpc/platforms/52xx that should be early_ioremap().

I can fix this up easily enough for mpc5200_simple by changing mpc5200_setup_xlb_arbiter() to use
early_ioremap() and moving mpc52xx_map_common_devices() from the setup_arch to the init hook (one
side effect is that mpc52xx_restart() only works a bit later, as it requires the mpc52xx_wdt mapping
from mpc52xx_map_common_devices(); I'm not sure if there is a better solution).

For the other 52xx platforms (efika, lite5200, media5200) things are a bit more chaotic, and they
create several more temporary mappings from setup_arch. Either they should all be moved to the init
hook as well, or be converted to early_ioremap(), but I can't tell which is more appropriate. As a
first step, I would propose a patch that fixes this for the simple platforms and leaves the other
ones unchanged.

(Side note: I also found that before 16132529cee58 ("powerpc/32s: Rework Kernel Userspace Access
Protection"), boot would succeed even with KUAP enabled without changing the incorrect of_iomap(); I
guess the old implementation was more lenient about the incorrect calls that the kernel warns
about?)

Issue 2) Boot fails with CONFIG_STRICT_KERNEL_RWX enabled, which is also the default nowadays.

I have not found the cause of this boot failure yet; is there any way to debug this if the failure
happens before the UART is available and I currently don't have JTAG for this hardware?

Best regards,
Matthias



--
TQ-Systems GmbH | Mühlstraße 2, Gut Delling | 82229 Seefeld, Germany
Amtsgericht München, HRB 105018
Geschäftsführer: Detlef Schneider, Rüdiger Stahl, Stefan Schneider
https://www.tq-group.com/


2023-12-18 19:50:55

by Christophe Leroy

Subject: Re: powerpc: several early boot regressions on MPC52xx

Hi Matthias,

On 18/12/2023 at 14:48, Matthias Schiffer wrote:
> Hi all,
>
> I'm currently in the process of porting our ancient TQM5200 SoM to a modern kernel, and I've
> identified a number of regressions that cause early boot failures (before the UART console has been
> initialized) with arch/powerpc/configs/52xx/tqm5200_defconfig.

Which kernel version do you mean by "modern"?

>
> Issue 1) Boot fails with CONFIG_PPC_KUAP enabled (enabled by default since 9f5bd8f1471d7
> "powerpc/32s: Activate KUAP and KUEP by default"). The reason is a number of of_iomap() calls in
> arch/powerpc/platforms/52xx that should be early_ioremap().

Can you give more details on what leads you to that conclusion?

There should be no relation between KUAP and of_iomap()/early_ioremap().
Problem is likely somewhere else.

>
> I can fix this up easy enough for mpc5200_simple by changing mpc5200_setup_xlb_arbiter() to use
> early_ioremap() and moving mpc52xx_map_common_devices() from the setup_arch to the init hook (one
> side effect is that mpc52xx_restart() only works a bit later, as it requires the mpc52xx_wdt mapping
> from mpc52xx_map_common_devices(); I'm not sure if there is a better solution).
>
> For the other 52xx platforms (efika, lite5200, media5200) things are a bit more chaotic, and they
> create several more temporary mappings from setup_arch. Either they should all be moved to the init
> hook as well, or be converted to early_ioremap(), but I can't tell which is more appropriate. As a
> first step, I would propose a patch that fixes this for the simple platforms and leaves the other
> ones unchanged.
>
> (Side note: I also found that before 16132529cee58 ("powerpc/32s: Rework Kernel Userspace Access
> Protection"), boot would succeed even with KUAP enabled without changing the incorrect of_iomap(); I
> guess the old implementation was more lenient about the incorrect calls that the kernel warns
> about?)

Interesting.
Again, those supposedly incorrect calls shouldn't have any impact. They are
correct calls; calling ioremap() that early is just a historical method that
we want to get rid of one day.
Could you then provide the dmesg of a boot where it works? I'd also be
interested in a dump of /sys/kernel/debug/kernel_page_tables,
/sys/kernel/debug/powerpc/block_address_translation and
/sys/kernel/debug/powerpc/segment_registers.

For that you'll need CONFIG_PTDUMP_DEBUGFS.

>
> Issue 2) Boot fails with CONFIG_STRICT_KERNEL_RWX enabled, which is also the default nowadays.
>
> I have not found the cause of this boot failure yet; is there any way to debug this if the failure
> happens before the UART is available and I currently don't have JTAG for this hardware?

It shouldn't happen before the UART is available: strict enforcement is
performed by mark_readonly() and free_initmem() in the middle of
kernel_init(). The UART should be on long before that.

So it must be something in the setup that collides with CONFIG_PPC_KUAP and
CONFIG_STRICT_KERNEL_RWX.

Could you send the dmesg of a boot that works (i.e. without
CONFIG_PPC_KUAP/CONFIG_STRICT_KERNEL_RWX) and of one that doesn't, if you
get any initial output?

Also your .config unless you are using one of the defconfigs.

Which UART driver is used?

What's your boot console? Can you use "early boot text console (BootX or
OpenFirmware only) (CONFIG_BOOTX_TEXT)"?

Christophe

2023-12-19 13:34:51

by Matthias Schiffer

Subject: Re: powerpc: several early boot regressions on MPC52xx

On Mon, 2023-12-18 at 19:48 +0000, Christophe Leroy wrote:
>
> Hi Matthias,
>
> On 18/12/2023 at 14:48, Matthias Schiffer wrote:
> > Hi all,
> >
> > I'm currently in the process of porting our ancient TQM5200 SoM to a modern kernel, and I've
> > identified a number of regressions that cause early boot failures (before the UART console has been
> > initialized) with arch/powerpc/configs/52xx/tqm5200_defconfig.
>
> "modern" kernel ==> which version ?

Hi Christophe,

I was testing with torvalds/master as of yesterday, and bisected everything from 4.14 to identify
the commits related to the issues. For my current project, 6.1.y or 6.6.y will likely be our kernel
of choice, but I'd also like to get mainline to a working state again if possible.

>
> >
> > Issue 1) Boot fails with CONFIG_PPC_KUAP enabled (enabled by default since 9f5bd8f1471d7
> > "powerpc/32s: Activate KUAP and KUEP by default"). The reason is a number of of_iomap() calls in
> > arch/powerpc/platforms/52xx that should be early_ioremap().
>
> Can you give more details and what leads you to that conclusion ?
>
> There should be no relation between KUAP and of_iomap()/early_ioremap().
> Problem is likely somewhere else.

You are entirely right, the warnings about early_ioremap() were a red herring. I can no longer
reproduce the difference in boot behavior that I thought I was seeing when changing of_iomap() to
early_ioremap(). I assume I got confused by testing too many variables at once (kernel version +
2 Kconfig settings).


>
> >
> > I can fix this up easy enough for mpc5200_simple by changing mpc5200_setup_xlb_arbiter() to use
> > early_ioremap() and moving mpc52xx_map_common_devices() from the setup_arch to the init hook (one
> > side effect is that mpc52xx_restart() only works a bit later, as it requires the mpc52xx_wdt mapping
> > from mpc52xx_map_common_devices(); I'm not sure if there is a better solution).
> >
> > For the other 52xx platforms (efika, lite5200, media5200) things are a bit more chaotic, and they
> > create several more temporary mappings from setup_arch. Either they should all be moved to the init
> > hook as well, or be converted to early_ioremap(), but I can't tell which is more appropriate. As a
> > first step, I would propose a patch that fixes this for the simple platforms and leaves the other
> > ones unchanged.
> >
> > (Side note: I also found that before 16132529cee58 ("powerpc/32s: Rework Kernel Userspace Access
> > Protection"), boot would succeed even with KUAP enabled without changing the incorrect of_iomap(); I
> > guess the old implementation was more lenient about the incorrect calls that the kernel warns
> > about?)
>
> Interesting.
> Again, there shouldn't be any impact of those incorrect calls. They are
> correct calls, it is just an historical method that we want to get rid
> of one day.
> Could you then provide the dmesg of what/how it works here ? And then
> I'd also be interested in a dump of /sys/kernel/debug/kernel_page_tables
> and /sys/kernel/debug/powerpc/block_address_translation
> and /sys/kernel/debug/powerpc/segment_registers
>
> For that you'll need CONFIG_PTDUMP_DEBUGFS

As it turns out, whatever issue existed with KUAP at the time when it was changed to enabled by
default for 32s (which was in 5.14) has been resolved in current mainline. Current torvalds/master
boots fine with KUAP enabled, and only CONFIG_STRICT_KERNEL_RWX breaks the boot.

>
> >
> > Issue 2) Boot fails with CONFIG_STRICT_KERNEL_RWX enabled, which is also the default nowadays.
> >
> > I have not found the cause of this boot failure yet; is there any way to debug this if the failure
> > happens before the UART is available and I currently don't have JTAG for this hardware?
>
> Shouldn't happen before UART is available, strict enforcement is
> performed by mark_readonly() and free_initmem() in the middle of
> kernel_init(). UART should be ON long before that.
>
> So it must be something in the setup that collides with CONFIG_KUAP and
> CONFIG_STRICT_KERNEL_RWX.
>
> Could you send dmesg of when it works (ie without
> CONFIG_KUAP/CONFIG_STRICT_KERNEL_RWX) and when it doesn't work if you
> get some initial stuff ?

Here's the UART output of a working boot (CONFIG_STRICT_KERNEL_RWX disabled; I have slightly
extended tqm5200.dts to enable UART output of the cuImage wrapper):

Memory <- <0x0 0x8000000> (128MB)
CPU clock-frequency <- 0x179a7b00 (396MHz)
CPU timebase-frequency <- 0x1f78a40 (33MHz)
CPU bus-frequency <- 0x7de2900 (132MHz)

zImage starting: loaded at 0x00a00000 (sp: 0x07f2ea68)
Decompression error: 'Not a gzip file'
No valid compressed data found, assume uncompressed data
Allocating 0x4dde10 bytes for kernel...
0x4b9ee0 bytes of uncompressed data copied

Linux/PowerPC load:
Finalizing device tree... flat tree at 0xedd900
[ 0.000000] Activating Kernel Userspace Access Protection
[ 0.000000] Activating Kernel Userspace Execution Prevention
[ 0.000000] Linux version 6.7.0-rc6-00001-g1eff0ad3c1f9 (schifferm@schifferm-ubuntu) (powerpc-603e-linux-gnu-gcc (OSELAS.Toolchain-2021.07.0 11-20210703) 11.1.1 20210703, GNU ld (GNU Binutils) 2.36.1) #69 Tue Dec 19 13:03:59 CET 2023
[ 0.000000] Hardware name: tqc,tqm5200 G2_LE 0x80822014 mpc5200-simple-platform
[ 0.000000] -----------------------------------------------------
[ 0.000000] phys_mem_size = 0x8000000
[ 0.000000] dcache_bsize = 0x20
[ 0.000000] icache_bsize = 0x20
[ 0.000000] cpu_features = 0x0000000001010108
[ 0.000000] possible = 0x00000000277de148
[ 0.000000] always = 0x0000000001000000
[ 0.000000] cpu_user_features = 0x8c000000 0x00000000
[ 0.000000] mmu_features = 0x00010200
[ 0.000000] Hash_size = 0x0
[ 0.000000] -----------------------------------------------------
[ 0.000000] ioremap() called early from 0xc0393758. Use early_ioremap() instead
[ 0.000000] ioremap() called early from 0xc03937a0. Use early_ioremap() instead
[ 0.000000] ioremap() called early from 0xc03937c0. Use early_ioremap() instead
[ 0.000000] ioremap() called early from 0xc0393604. Use early_ioremap() instead
[ 0.000000] Zone ranges:
[ 0.000000] Normal [mem 0x0000000000000000-0x0000000007ffffff]
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000000000000-0x0000000007ffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000007ffffff]
[ 0.000000] Kernel command line:
[ 0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
[ 0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768 bytes, linear)
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 32512
[ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[ 0.000000] Kernel virtual memory layout:
[ 0.000000] * 0xffbdf000..0xfffff000 : fixmap
[ 0.000000] * 0xffbd7000..0xffbdf000 : early ioremap
[ 0.000000] * 0xc9000000..0xffbd7000 : vmalloc & ioremap
[ 0.000000] * 0xb0000000..0xc0000000 : modules
[ 0.000000] Memory: 124904K/131072K available (3300K kernel code, 196K rwdata, 328K rodata, 1016K init, 143K bss, 6168K reserved, 0K cma-reserved)
[ 0.000000] SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[ 0.000000] NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
[ 0.000000] MPC52xx PIC is up and running!
[ 0.000016] clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x79c5e18f3, max_idle_ns: 440795202740 ns
[ 0.000049] clocksource: timebase mult[1e4d9365] shift[24] registered
[ 0.000504] Console: colour dummy device 80x25
[ 0.000563] printk: legacy console [tty0] enabled
[ 0.001720] pid_max: default: 32768 minimum: 301
[ 0.002067] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[ 0.002233] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[ 0.012433] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[ 0.012622] futex hash table entries: 256 (order: -1, 3072 bytes, linear)
[ 0.026664] mpc52xx-gpt f0000600.timer: can function as watchdog
[ 0.031159] DMA: MPC52xx BestComm driver
[ 0.031608] DMA: MPC52xx BestComm engine @f0001200 ok !
[ 0.032479] usbcore: registered new interface driver usbfs
[ 0.032858] usbcore: registered new interface driver hub
[ 0.033142] usbcore: registered new device driver usb
[ 0.033655] pps_core: LinuxPPS API ver. 1 registered
[ 0.033757] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <[email protected]>
[ 0.035047] clocksource: Switched to clocksource timebase
[ 0.058417] workingset: timestamp_bits=30 max_order=15 bucket_order=0
[ 0.059098] jffs2: version 2.2. (NAND) © 2001-2006 Red Hat, Inc.
[ 0.189566] Serial: MPC52xx PSC UART driver
[ 0.191259] f0002000.serial: ttyPSC0 at MMIO 0xf0002000 (irq = 129, base_baud = 8250000) is a MPC5xxx PSC
[ 0.191594] printk: legacy console [ttyPSC0] enabled
[ 0.581696] f0002200.serial: ttyPSC1 at MMIO 0xf0002200 (irq = 130, base_baud = 8250000) is a MPC5xxx PSC
[ 0.593454] f0002400.serial: ttyPSC2 at MMIO 0xf0002400 (irq = 131, base_baud = 8250000) is a MPC5xxx PSC
[ 0.606025] ppc-of-ohci f0001000.usb: OF OHCI
[ 0.611163] ppc-of-ohci f0001000.usb: new USB bus registered, assigned bus number 1
[ 0.619307] ppc-of-ohci f0001000.usb: irq 134, io mem 0xf0001000
[ 0.689131] hub 1-0:1.0: USB hub found
[ 0.693344] hub 1-0:1.0: 2 ports detected
[ 0.700727] i2c_dev: i2c /dev entries driver
[ 0.706242] mpc-i2c f0003d40.i2c: timeout 1000000 us
[ 0.716567] rtc-ds1307 0-0068: registered as rtc0
[ 0.724583] usbcore: registered new interface driver usbhid
[ 0.730490] usbhid: USB HID core driver
[ 0.734621] drmem: No dynamic reconfiguration memory found
[ 0.758163] clk: Disabling unused clocks
[ 0.786639] Freeing unused kernel image (initmem) memory: 1016K
[ 0.792927] Kernel memory protection not selected by kernel config.
[ 0.799476] Run /sbin/init as init process
[ 0.804502] Run /etc/init as init process
[ 0.809344] Run /bin/init as init process
[ 0.814177] Run /bin/sh as init process
[ 0.818838] Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
[ 0.833478] Rebooting in 180 seconds..

When boot doesn't work, the last messages I see are from the cuImage wrapper ("Finalizing device
tree... flat tree at ..."). The panic at the end of the working log is expected: there is no
rootfs/initramfs in my current setup.


>
> Also your .config unless you are using one of the defconfigs.

This is 52xx/tqm5200_defconfig with some quick modifications to reduce the kernel size (a temporary
measure until I've dealt with a broken bootloader that truncates large kernels... who doesn't love
to deal with platforms that haven't been touched for a decade):
- Disable loadable module support
- Disable networking support
- Disable block layer
- Disable Cryptographic API
- (Unset CONFIG_STRICT_KERNEL_RWX)

I have attached the resulting .config.

>
> What UART driver is used ?

mpc52xx_uart

>
> What's your boot console, can you use "early boot text console (BootX or
> OpenFirmware only) (CONFIG_BOOTX_TEXT)" ?

No, I don't think that is available on our platform, I only have the mpc52xx_uart. I guess I could
try adding earlycon support to that driver if necessary.

Regards,
Matthias


>
> Christophe



Attachments:
.config (58.68 kB)

2023-12-20 14:55:38

by Christophe Leroy

Subject: Re: powerpc: several early boot regressions on MPC52xx



On 19/12/2023 at 14:34, Matthias Schiffer wrote:
>
> On Mon, 2023-12-18 at 19:48 +0000, Christophe Leroy wrote:
>>
>> Hi Matthias,
>>
>> On 18/12/2023 at 14:48, Matthias Schiffer wrote:
>>> Hi all,
>>>
>>> I'm currently in the process of porting our ancient TQM5200 SoM to a modern kernel, and I've
>>> identified a number of regressions that cause early boot failures (before the UART console has been
>>> initialized) with arch/powerpc/configs/52xx/tqm5200_defconfig.
>>
>> "modern" kernel ==> which version ?
>
> Hi Christophe,
>
> I was testing with torvalds/master as of yesterday, and bisected everything from 4.14 to identify
> the commits related to the issues. For my current project, 6.1.y or 6.6.y will likely be our kernel
> of choice, but I'd also like to get mainline to a working state again if possible.
>
>>
>>>
>>> Issue 1) Boot fails with CONFIG_PPC_KUAP enabled (enabled by default since 9f5bd8f1471d7
>>> "powerpc/32s: Activate KUAP and KUEP by default"). The reason is a number of of_iomap() calls in
>>> arch/powerpc/platforms/52xx that should be early_ioremap().
>>
>> Can you give more details and what leads you to that conclusion ?
>>
>> There should be no relation between KUAP and of_iomap()/early_ioremap().
>> Problem is likely somewhere else.
>
> You are entirely right, the warnings about early_ioremap() were a red herring. I can't reproduce any
> difference in boot behavior anymore I thought I was seeing when changing the of_iomap() to
> early_ioremap(). I assume I got confused by testing for too many variables at once (kernel version +
> 2 Kconfig settings).
>
>
>>
>>>
>>> I can fix this up easy enough for mpc5200_simple by changing mpc5200_setup_xlb_arbiter() to use
>>> early_ioremap() and moving mpc52xx_map_common_devices() from the setup_arch to the init hook (one
>>> side effect is that mpc52xx_restart() only works a bit later, as it requires the mpc52xx_wdt mapping
>>> from mpc52xx_map_common_devices(); I'm not sure if there is a better solution).
>>>
>>> For the other 52xx platforms (efika, lite5200, media5200) things are a bit more chaotic, and they
>>> create several more temporary mappings from setup_arch. Either they should all be moved to the init
>>> hook as well, or be converted to early_ioremap(), but I can't tell which is more appropriate. As a
>>> first step, I would propose a patch that fixes this for the simple platforms and leaves the other
>>> ones unchanged.
>>>
>>> (Side note: I also found that before 16132529cee58 ("powerpc/32s: Rework Kernel Userspace Access
>>> Protection"), boot would succeed even with KUAP enabled without changing the incorrect of_iomap(); I
>>> guess the old implementation was more lenient about the incorrect calls that the kernel warns
>>> about?)
>>
>> Interesting.
>> Again, there shouldn't be any impact of those incorrect calls. They are
>> correct calls, it is just an historical method that we want to get rid
>> of one day.
>> Could you then provide the dmesg of what/how it works here ? And then
>> I'd also be interested in a dump of /sys/kernel/debug/kernel_page_tables
>> and /sys/kernel/debug/powerpc/block_address_translation
>> and /sys/kernel/debug/powerpc/segment_registers
>>
>> For that you'll need CONFIG_PTDUMP_DEBUGFS
>
> As it turns out, whatever issue existed with KUAP at the time when it was changed to enabled by
> default for 32s (which was in 5.14) has been resolved in current mainline. Current torvalds/master
> boots fine with KUAP enabled, and only CONFIG_STRICT_KERNEL_RWX breaks the boot.
>
>>
>>>
>>> Issue 2) Boot fails with CONFIG_STRICT_KERNEL_RWX enabled, which is also the default nowadays.
>>>
>>> I have not found the cause of this boot failure yet; is there any way to debug this if the failure
>>> happens before the UART is available and I currently don't have JTAG for this hardware?
>>
>> Shouldn't happen before UART is available, strict enforcement is
>> performed by mark_readonly() and free_initmem() in the middle of
>> kernel_init(). UART should be ON long before that.
>>
>> So it must be something in the setup that collides with CONFIG_KUAP and
>> CONFIG_STRICT_KERNEL_RWX.
>>
>> Could you send dmesg of when it works (ie without
>> CONFIG_KUAP/CONFIG_STRICT_KERNEL_RWX) and when it doesn't work if you
>> get some initial stuff ?
>
> Here's the UART output of a working boot (CONFIG_STRICT_KERNEL_RWX disabled; I have slightly
> extended tqm5200.dts to enable UART output of the cuImage wrapper):
>
...
>
> When boot doesn't work, the last messages I see are from the cuImage wrapper ("Finalizing device
> tree... flat tree at ...). The panic is expected, there is no rootfs/initramfs in my current setup.
>

OK, so let's focus on CONFIG_STRICT_KERNEL_RWX then.

The most efficient approach would be to activate your UART console
earlier and/or implement some PPC_EARLY_DEBUG support to see where
it fails.

In your dmesg output, the "Kernel memory protection not selected by kernel
config" line marks the point where strict RWX would be activated if it
were selected. Your UART is enabled before that, so if the problem were
some driver writing to a RO area, you would see it.

One thing that came to my mind is that your CPU may have only 4 BATs
instead of 8. But I hacked the definition for the e300c2 CPU and my
board still boots with only 4 BATs, so it is not that.
The thing is that to work properly, the BATs should at least cover the
whole kernel. I built your kernel with your .config and GCC 11.3 and got
something that fits within 8M with the RO part stopping at 4M, so you'll
have one 4M BAT set RO, then another 4M BAT set RW, one 8M and one 16M
BAT. That won't cover your entire 128M of memory, but that shouldn't be a
problem, just less performant.

So what else? You said the size of the kernel is a problem for your
bootloader. Could it be that? When built with CONFIG_STRICT_KERNEL_RWX,
__end_rodata is aligned to 0xc0400000, whereas without
CONFIG_STRICT_KERNEL_RWX __end_rodata is at 0xc038c000. The end of
the kernel (as seen in System.map) is therefore 0xc055e000 with
CONFIG_STRICT_KERNEL_RWX and 0xc04de000 without it.

One thing you can try is to see if it works without
CONFIG_STRICT_KERNEL_RWX but with CONFIG_DATA_SHIFT forced to 22 which
is the value set when CONFIG_STRICT_KERNEL_RWX is selected.
To be able to set that value, you'll have to hack arch/powerpc/Kconfig
directly and force it to select value 22 regardless of
CONFIG_STRICT_KERNEL_RWX
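
To make the alignment effect concrete, here is a small user-space sketch
(Python, not kernel code). It assumes that a data shift of 22 simply means
rounding __end_rodata up to the next multiple of 1 << 22 (4 MiB); the helper
name align_up() is made up for illustration.

```python
def align_up(addr, shift):
    """Round addr up to the next multiple of 2**shift."""
    size = 1 << shift
    return (addr + size - 1) & ~(size - 1)

# __end_rodata as reported without CONFIG_STRICT_KERNEL_RWX
end_rodata = 0xC038C000

# Assumed effect of a data shift of 22: align to 4 MiB
aligned = align_up(end_rodata, 22)
padding = aligned - end_rodata

print(hex(aligned))
print(padding // 1024, "KiB of padding before the data section")
```

Under that assumption the aligned address matches the 0xc0400000 quoted
above, and the padding alone grows the image by a few hundred KiB, which
could matter for a bootloader that truncates large kernels.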

Christophe

2023-12-21 10:34:45

by Matthias Schiffer

Subject: Re: powerpc: several early boot regressions on MPC52xx

On Wed, 2023-12-20 at 14:55 +0000, Christophe Leroy wrote:
> > On 19/12/2023 at 14:34, Matthias Schiffer wrote:
> > > >
> > > > On Mon, 2023-12-18 at 19:48 +0000, Christophe Leroy wrote:
> > > > > > Hi Matthias,
> > > > > >
> > > > > > On 18/12/2023 at 14:48, Matthias Schiffer wrote:
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > I'm currently in the process of porting our ancient TQM5200 SoM to a modern kernel, and I've
> > > > > > > > identified a number of regressions that cause early boot failures (before the UART console has been
> > > > > > > > initialized) with arch/powerpc/configs/52xx/tqm5200_defconfig.
> > > > > >
> > > > > > "modern" kernel ==> which version ?
> > > >
> > > > Hi Christophe,
> > > >
> > > > I was testing with torvalds/master as of yesterday, and bisected everything from 4.14 to identify
> > > > the commits related to the issues. For my current project, 6.1.y or 6.6.y will likely be our kernel
> > > > of choice, but I'd also like to get mainline to a working state again if possible.
> > > >
> > > > > >
> > > > > > > >
> > > > > > > > Issue 1) Boot fails with CONFIG_PPC_KUAP enabled (enabled by default since 9f5bd8f1471d7
> > > > > > > > "powerpc/32s: Activate KUAP and KUEP by default"). The reason is a number of of_iomap() calls in
> > > > > > > > arch/powerpc/platforms/52xx that should be early_ioremap().
> > > > > >
> > > > > > Can you give more details and what leads you to that conclusion ?
> > > > > >
> > > > > > There should be no relation between KUAP and of_iomap()/early_ioremap().
> > > > > > Problem is likely somewhere else.
> > > >
> > > > You are entirely right, the warnings about early_ioremap() were a red herring. I can't reproduce any
> > > > difference in boot behavior anymore I thought I was seeing when changing the of_iomap() to
> > > > early_ioremap(). I assume I got confused by testing for too many variables at once (kernel version +
> > > > 2 Kconfig settings).
> > > >
> > > >
> > > > > >
> > > > > > > >
> > > > > > > > I can fix this up easy enough for mpc5200_simple by changing mpc5200_setup_xlb_arbiter() to use
> > > > > > > > early_ioremap() and moving mpc52xx_map_common_devices() from the setup_arch to the init hook (one
> > > > > > > > side effect is that mpc52xx_restart() only works a bit later, as it requires the mpc52xx_wdt mapping
> > > > > > > > from mpc52xx_map_common_devices(); I'm not sure if there is a better solution).
> > > > > > > >
> > > > > > > > For the other 52xx platforms (efika, lite5200, media5200) things are a bit more chaotic, and they
> > > > > > > > create several more temporary mappings from setup_arch. Either they should all be moved to the init
> > > > > > > > hook as well, or be converted to early_ioremap(), but I can't tell which is more appropriate. As a
> > > > > > > > first step, I would propose a patch that fixes this for the simple platforms and leaves the other
> > > > > > > > ones unchanged.
> > > > > > > >
> > > > > > > > (Side note: I also found that before 16132529cee58 ("powerpc/32s: Rework Kernel Userspace Access
> > > > > > > > Protection"), boot would succeed even with KUAP enabled without changing the incorrect of_iomap(); I
> > > > > > > > guess the old implementation was more lenient about the incorrect calls that the kernel warns
> > > > > > > > about?)
> > > > > >
> > > > > > Interesting.
> > > > > > Again, there shouldn't be any impact of those incorrect calls. They are
> > > > > > correct calls, it is just an historical method that we want to get rid
> > > > > > of one day.
> > > > > > Could you then provide the dmesg of what/how it works here ? And then
> > > > > > I'd also be interested in a dump of /sys/kernel/debug/kernel_page_tables
> > > > > > and /sys/kernel/debug/powerpc/block_address_translation
> > > > > > and /sys/kernel/debug/powerpc/segment_registers
> > > > > >
> > > > > > For that you'll need CONFIG_PTDUMP_DEBUGFS
> > > >
> > > > As it turns out, whatever issue existed with KUAP at the time when it was changed to enabled by
> > > > default for 32s (which was in 5.14) has been resolved in current mainline. Current torvalds/master
> > > > boots fine with KUAP enabled, and only CONFIG_STRICT_KERNEL_RWX breaks the boot.
> > > >
> > > > > >
> > > > > > > >
> > > > > > > > Issue 2) Boot fails with CONFIG_STRICT_KERNEL_RWX enabled, which is also the default nowadays.
> > > > > > > >
> > > > > > > > I have not found the cause of this boot failure yet; is there any way to debug this if the failure
> > > > > > > > happens before the UART is available and I currently don't have JTAG for this hardware?
> > > > > >
> > > > > > Shouldn't happen before UART is available, strict enforcement is
> > > > > > performed by mark_readonly() and free_initmem() in the middle of
> > > > > > kernel_init(). UART should be ON long before that.
> > > > > >
> > > > > > So it must be something in the setup that collides with CONFIG_KUAP and
> > > > > > CONFIG_STRICT_KERNEL_RWX.
> > > > > >
> > > > > > Could you send dmesg of when it works (ie without
> > > > > > CONFIG_KUAP/CONFIG_STRICT_KERNEL_RWX) and when it doesn't work if you
> > > > > > get some initial stuff ?
> > > >
> > > > Here's the UART output of a working boot (CONFIG_STRICT_KERNEL_RWX disabled; I have slightly
> > > > extended tqm5200.dts to enable UART output of the cuImage wrapper):
> > > >
> > ...
> > > >
> > > > When boot doesn't work, the last messages I see are from the cuImage wrapper ("Finalizing device
> > > > tree... flat tree at ...). The panic is expected, there is no rootfs/initramfs in my current setup.
> > > >
> >
> > Ok, so let's focus on CONFIG_STRICT_KERNEL_RWX then.
> >
> > The most efficient would be if you were able to activate your UART
> > console earlier and/or implement some PPC_EARLY_DEBUG stuff to see where
> > it fails.
> >
> > In your dmesg output, "Kernel memory protection not selected by kernel
> > config" is when the strict RWX gets activated when selected. Your UART
> > is enabled before that so if there was a problem with some driver
> > writing in a RO area, it would be seen.
> >
> > One thing that came into my mind is that your CPU may have only 4 BATs
> > instead of 8. But I hacked the definition for the e300c2 CPU and my
> > board still boots with only 4 BATs so it is not that.
> > The thing is that to work properly, BATs should at least cover all
> > kernel. But I built your kernel with your .config and GCC 11.3 and I got
> > something that fits within 8M with the RO part stopping at 4M, so you'll
> > have one 4M BAT set RO, then another 4M BAT set RW, one 8M and one 16M
> > BAT. It won't cover your entire 128M memory but shouldn't be a problem,
> > just less performant.

Hi Christophe,

This does indeed seem to have something to do with the issue. mmu_mapin_ram() contains a
strict_kernel_rwx_enabled() check that explains the early boot failure (and as this is a runtime
check, I can actually make the kernel boot by passing rodata=off on the cmdline!). I've added debug
output to show me the addresses in mmu_mapin_ram(): base=00000000 top=08000000 border=00400000.
Modifying mmu_mapin_ram() to always use the !strict_kernel_rwx_enabled() path makes the kernel boot
until mark_readonly().

Removing MMU_FTR_USE_HIGH_BATS from mmu_features or changing find_free_bat() to only use 4 BATs
regardless of MMU_FTR_USE_HIGH_BATS results in a working kernel, but it is unclear to me why that
would be necessary, as the MPC5200B manual clearly states that it has 8.
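
For reference, the BAT split for these addresses can be reproduced outside the kernel. The following
is a minimal user-space sketch, not the kernel code: it assumes each BAT maps a naturally aligned
power-of-two block between 128K and 256M, and that in the strict RWX case the range is mapped in two
passes split at border. The names block_size(), plan_range() and plan_strict() are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

#define BAT_MIN_SIZE (128u << 10)   /* smallest block a BAT can map */
#define BAT_MAX_SIZE (256u << 20)   /* largest block a BAT can map */

/* Largest power-of-two block that is aligned at base and fits in [base, top). */
static uint32_t block_size(uint32_t base, uint32_t top)
{
	uint32_t size = BAT_MAX_SIZE;

	assert(base < top);
	while (size > top - base || (base & (size - 1)) != 0)
		size >>= 1;
	return size;
}

/*
 * Cover [base, top) with blocks until max_bats entries are in use.
 * Block sizes are appended to sizes[], *used counts the entries.
 * Returns the first address that is NOT covered.
 */
static uint32_t plan_range(uint32_t base, uint32_t top,
			   unsigned int max_bats, unsigned int *used,
			   uint32_t *sizes)
{
	while (*used < max_bats && base != top) {
		uint32_t size = block_size(base, top);

		if (size < BAT_MIN_SIZE)
			break;
		sizes[(*used)++] = size;
		base += size;
	}
	return base;
}

/* Strict RWX case: map [base, border) first, then [border, top). */
static uint32_t plan_strict(uint32_t base, uint32_t border, uint32_t top,
			    unsigned int max_bats, unsigned int *used,
			    uint32_t *sizes)
{
	uint32_t done = plan_range(base, border, max_bats, used, sizes);

	if (done != border)
		return done;
	return plan_range(border, top, max_bats, used, sizes);
}
```

Under these assumptions, base=00000000 border=00400000 top=08000000 gives blocks of 4M, 4M, 8M, 16M,
32M and 64M when 8 BATs are available, and stops after the 16M block (32M mapped in total) when only
4 are. The number of BATs actually free in the kernel may differ from what is passed in here.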

Regards,
Matthias

> >
> > So what ? You said the size of the kernel is a problem for your
> > bootloader. Could it be that ? When built with CONFIG_STRICT_KERNEL_RWX,
> > __end_rodata is aligned to 0xc0400000 whereas without
> > CONFIG_STRICT_KERNEL_RWX __end_rodata is at 0xc038c000 and so the end of
> > the kernel (seen from System.map) is 0xc055e000 with
> > CONFIG_STRICT_KERNEL_RWX and 0xc04de000 without it.
> >
> > One thing you can try is to see if it works without
> > CONFIG_STRICT_KERNEL_RWX but with CONFIG_DATA_SHIFT forced to 22 which
> > is the value set when CONFIG_STRICT_KERNEL_RWX is selected.
> > To be able to set that value, you'll have to hack arch/powerpc/Kconfig
> > directly and force it to select value 22 regardless of
> > CONFIG_STRICT_KERNEL_RWX
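
To illustrate the alignment effect described above: a shift of 22 corresponds to a 4M (1 << 22)
boundary, and rounding the non-strict __end_rodata value up to that boundary gives the 0xc0400000
seen with CONFIG_STRICT_KERNEL_RWX. A small sketch of that arithmetic, assuming the alignment is
simply 1 << CONFIG_DATA_SHIFT (align_up() is a hypothetical helper, not a kernel function):

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of (1 << shift). */
static uint32_t align_up(uint32_t addr, unsigned int shift)
{
	uint32_t mask;

	assert(shift < 32);
	mask = ((uint32_t)1 << shift) - 1;
	return (addr + mask) & ~mask;
}
```

align_up(0xc038c000, 22) evaluates to 0xc0400000, i.e. 0x74000 bytes (464K) of padding after the
read-only data.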



> >
> > Christophe

--
TQ-Systems GmbH | Mühlstraße 2, Gut Delling | 82229 Seefeld, Germany
Amtsgericht München, HRB 105018
Geschäftsführer: Detlef Schneider, Rüdiger Stahl, Stefan Schneider
https://www.tq-group.com/

2023-12-21 11:17:59

by Christophe Leroy

Subject: Re: powerpc: several early boot regressions on MPC52xx



On 21/12/2023 at 11:33, Matthias Schiffer wrote:
> On Wed, 2023-12-20 at 14:55 +0000, Christophe Leroy wrote:
>>> On 19/12/2023 at 14:34, Matthias Schiffer wrote:
>>>>> On Mon, 2023-12-18 at 19:48 +0000, Christophe Leroy wrote:
>>>>>>> Hi Matthias,
>>>>>>>
>>>>>>> On 18/12/2023 at 14:48, Matthias Schiffer wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I'm currently in the process of porting our ancient TQM5200 SoM to a modern kernel, and I've
>>>>>>>>> identified a number of regressions that cause early boot failures (before the UART console has been
>>>>>>>>> initialized) with arch/powerpc/configs/52xx/tqm5200_defconfig.
>>>>>>>
>>>>>>> "modern" kernel ==> which version ?
>>>>>
>>>>> Hi Christophe,
>>>>>
>>>>> I was testing with torvalds/master as of yesterday, and bisected everything from 4.14 to identify
>>>>> the commits related to the issues. For my current project, 6.1.y or 6.6.y will likely be our kernel
>>>>> of choice, but I'd also like to get mainline to a working state again if possible.
>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>> Issue 1) Boot fails with CONFIG_PPC_KUAP enabled (enabled by default since 9f5bd8f1471d7
>>>>>>>>> "powerpc/32s: Activate KUAP and KUEP by default"). The reason is a number of of_iomap() calls in
>>>>>>>>> arch/powerpc/platforms/52xx that should be early_ioremap().
>>>>>>>
>>>>>>> Can you give more details and what leads you to that conclusion ?
>>>>>>>
>>>>>>> There should be no relation between KUAP and of_iomap()/early_ioremap().
>>>>>>> Problem is likely somewhere else.
>>>>>
>>>>> You are entirely right, the warnings about early_ioremap() were a red herring. I can no longer
>>>>> reproduce the difference in boot behavior I thought I was seeing when changing the of_iomap() to
>>>>> early_ioremap(). I assume I got confused by testing for too many variables at once (kernel version +
>>>>> 2 Kconfig settings).
>>>>>
>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>> I can fix this up easy enough for mpc5200_simple by changing mpc5200_setup_xlb_arbiter() to use
>>>>>>>>> early_ioremap() and moving mpc52xx_map_common_devices() from the setup_arch to the init hook (one
>>>>>>>>> side effect is that mpc52xx_restart() only works a bit later, as it requires the mpc52xx_wdt mapping
>>>>>>>>> from mpc52xx_map_common_devices(); I'm not sure if there is a better solution).
>>>>>>>>>
>>>>>>>>> For the other 52xx platforms (efika, lite5200, media5200) things are a bit more chaotic, and they
>>>>>>>>> create several more temporary mappings from setup_arch. Either they should all be moved to the init
>>>>>>>>> hook as well, or be converted to early_ioremap(), but I can't tell which is more appropriate. As a
>>>>>>>>> first step, I would propose a patch that fixes this for the simple platforms and leaves the other
>>>>>>>>> ones unchanged.
>>>>>>>>>
>>>>>>>>> (Side note: I also found that before 16132529cee58 ("powerpc/32s: Rework Kernel Userspace Access
>>>>>>>>> Protection"), boot would succeed even with KUAP enabled without changing the incorrect of_iomap(); I
>>>>>>>>> guess the old implementation was more lenient about the incorrect calls that the kernel warns
>>>>>>>>> about?)
>>>>>>>
>>>>>>> Interesting.
>>>>>>> Again, there shouldn't be any impact of those incorrect calls. They are
>>>>>>> correct calls, it is just a historical method that we want to get rid
>>>>>>> of one day.
>>>>>>> Could you then provide the dmesg of what/how it works here ? And then
>>>>>>> I'd also be interested in a dump of /sys/kernel/debug/kernel_page_tables
>>>>>>> and /sys/kernel/debug/powerpc/block_address_translation
>>>>>>> and /sys/kernel/debug/powerpc/segment_registers
>>>>>>>
>>>>>>> For that you'll need CONFIG_PTDUMP_DEBUGFS
>>>>>
>>>>> As it turns out, whatever issue existed with KUAP at the time when it became enabled by
>>>>> default for 32s (which was in 5.14) has been resolved in current mainline. Current torvalds/master
>>>>> boots fine with KUAP enabled, and only CONFIG_STRICT_KERNEL_RWX breaks the boot.
>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>> Issue 2) Boot fails with CONFIG_STRICT_KERNEL_RWX enabled, which is also the default nowadays.
>>>>>>>>>
>>>>>>>>> I have not found the cause of this boot failure yet; is there any way to debug this if the failure
>>>>>>>>> happens before the UART is available and I currently don't have JTAG for this hardware?
>>>>>>>
>>>>>>> Shouldn't happen before UART is available, strict enforcement is
>>>>>>> performed by mark_readonly() and free_initmem() in the middle of
>>>>>>> kernel_init(). UART should be ON long before that.
>>>>>>>
>>>>>>> So it must be something in the setup that collides with CONFIG_KUAP and
>>>>>>> CONFIG_STRICT_KERNEL_RWX.
>>>>>>>
>>>>>>> Could you send dmesg of when it works (ie without
>>>>>>> CONFIG_KUAP/CONFIG_STRICT_KERNEL_RWX) and when it doesn't work if you
>>>>>>> get some initial stuff ?
>>>>>
>>>>> Here's the UART output of a working boot (CONFIG_STRICT_KERNEL_RWX disabled; I have slightly
>>>>> extended tqm5200.dts to enable UART output of the cuImage wrapper):
>>>>>
>>> ...
>>>>>
>>>>> When boot doesn't work, the last messages I see are from the cuImage wrapper ("Finalizing device
>>>>> tree... flat tree at ..."). The panic is expected, there is no rootfs/initramfs in my current setup.
>>>>>
>>>
>>> Ok, so let's focus on CONFIG_STRICT_KERNEL_RWX then.
>>>
>>> The most efficient would be if you were able to activate your UART
>>> console earlier and/or implement some PPC_EARLY_DEBUG stuff to see where
>>> it fails.
>>>
>>> In your dmesg output, "Kernel memory protection not selected by kernel
>>> config" is when the strict RWX gets activated when selected. Your UART
>>> is enabled before that so if there was a problem with some driver
>>> writing in a RO area, it would be seen.
>>>
>>> One thing that came into my mind is that your CPU may have only 4 BATs
>>> instead of 8. But I hacked the definition for the e300c2 CPU and my
>>> board still boots with only 4 BATs so it is not that.
>>> The thing is that to work properly, BATs should at least cover all
>>> kernel. But I built your kernel with your .config and GCC 11.3 and I got
>>> something that fits within 8M with the RO part stopping at 4M, so you'll
>>> have one 4M BAT set RO, then another 4M BAT set RW, one 8M and one 16M
>>> BAT. It won't cover your entire 128M memory but shouldn't be a problem,
>>> just less performant.
>
> Hi Christophe,
>
> this indeed seems to have something to do with the issue. mmu_mapin_ram() contains a
> strict_kernel_rwx_enabled() check that explains the early boot failure (and as this is a runtime
> check, I can actually make the kernel boot by passing rodata=off on the cmdline!). I've added debug
> output to show me the addresses in mmu_mapin_ram(): base=00000000 top=08000000 border=00400000.
> Modifying mmu_mapin_ram() to always use the !strict_kernel_rwx_enabled() path makes the kernel boot
> until mark_readonly().
>
> Removing MMU_FTR_USE_HIGH_BATS from mmu_features or changing find_free_bat() to only use 4 BATs
> regardless of MMU_FTR_USE_HIGH_BATS results in a working kernel, but it is unclear to me why that
> would be necessary, as the MPC5200B manual clearly states that it has 8.
>

Maybe just because something is wrong in the way we set up BATs.

It would be great if you could add debug prints with a bit more detail on what
happens inside __mmu_mapin_ram(), that is idx, base and size for
every call to setbat().

Maybe you can boot to a later point by removing the LOAD_BAT() in
function load_up_mmu() in arch/powerpc/kernel/head_book3s_32.S if it
helps you print stuff. Then maybe you can print and check the contents
of the BATS table.
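
When checking the contents of the BATS table, it can help to decode the raw
register values by hand. Below is a minimal user-space sketch, assuming the
classic 32-bit PowerPC BAT layout (upper word: BEPI in the top 15 bits, the
11-bit BL field above the two low bits, then Vs and Vp; lower word: BRPN in
the top 15 bits, PP in the two low bits). The struct and function names are
made up for illustration, and the layout should be double-checked against the
CPU manual.

```c
#include <assert.h>
#include <stdint.h>

struct bat_info {
	uint32_t virt;   /* effective (virtual) start address */
	uint32_t phys;   /* physical start address */
	uint32_t size;   /* block length in bytes */
	int vs;          /* valid in supervisor mode */
	int vp;          /* valid in user (problem) mode */
	unsigned int pp; /* protection: 0 = no access, 1 or 3 = RO, 2 = RW */
};

static struct bat_info decode_bat(uint32_t batu, uint32_t batl)
{
	struct bat_info info;
	uint32_t bl = (batu >> 2) & 0x7ff;   /* block length field */

	/* BL is a contiguous low mask: 0, 1, 3, 7, ..., 0x7ff. */
	assert((bl & (bl + 1)) == 0);

	info.virt = batu & 0xfffe0000u;
	info.phys = batl & 0xfffe0000u;
	info.size = (bl + 1) << 17;          /* BL = 0 means 128K */
	info.vs = (batu >> 1) & 1;
	info.vp = batu & 1;
	info.pp = batl & 3;
	return info;
}
```

For example, an upper/lower pair of 0xc000007e / 0x00000001 would decode as a
4M block at 0xc0000000 mapped to physical 0, valid for supervisor only and
read-only.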

Christophe