2019-11-27 08:25:03

by Chris Packham

Subject: ARM expectations for location of kernel, ramdisk and dtb

Hi All,

We're updating our systems to use the latest kernel. For many of them
this is a fairly large leap. One problem we've hit is that during boot
the dtb is clobbered by the uncompressed kernel.

Here's a snippet of output from u-boot

## Loading kernel from FIT Image at 62000000 ...
Using 'XS916MXS@2' configuration
Trying 'kernel@1' kernel subimage
Description: linux
Created: 2019-11-27 6:53:48 UTC
Type: Kernel Image
Compression: uncompressed
Data Start: 0x62000174
Data Size: 3495432 Bytes = 3.3 MiB
Architecture: ARM
OS: Linux
Load Address: 0x00800000
Entry Point: 0x60800000
...
Booting using the fdt blob at 0x63b90f6c
Loading Kernel Image ... OK
Loading Ramdisk to 6e7c6000, end 70000000 ... OK
Loading Device Tree to 607fb000, end 607fffd8 ... OK

Starting kernel ...

Uncompressing Linux... done, booting the kernel.

Error: invalid dtb and unrecognized/unsupported machine ID
r1=0x0000206e, r2=0x00000000

Between old and new the location of the devicetree hasn't actually
changed. What has changed is the size of the kernel: the self-extracting
kernel unpacks to 0x60008000 and, with our current configuration, now
extends into where the dtb is located.
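
As a rough check of the numbers in the log above, the sketch below works
out how much room lies between the decompression address (0x60008000) and
the relocated dtb (0x607fb000), and tests whether a given kernel footprint
would reach it. The footprint values are illustrative assumptions, not
measurements from this system, and the helper names are made up for the
example.

```python
# Addresses taken from the u-boot log above.
KERNEL_BASE = 0x60008000  # where the self-extracting kernel unpacks to
DTB_START = 0x607FB000    # "Loading Device Tree to 607fb000"
DTB_END = 0x607FFFD8      # "... end 607fffd8"
MIB = 1 << 20


def headroom(kernel_base, dtb_start):
    """Bytes the decompressed kernel may occupy before reaching the dtb."""
    return dtb_start - kernel_base


def clobbers_dtb(kernel_base, footprint, dtb_start, dtb_end):
    """True if [kernel_base, kernel_base + footprint) intersects the dtb.

    `footprint` is a hypothetical input: decompressed image size plus .bss.
    """
    kernel_end = kernel_base + footprint
    return kernel_base < dtb_end and dtb_start < kernel_end


room = headroom(KERNEL_BASE, DTB_START)
print(f"headroom: {room:#x} bytes ({room / MIB:.2f} MiB)")

# Illustrative footprints only; the real value depends on the kernel build.
for size_mib in (6, 9):
    hit = clobbers_dtb(KERNEL_BASE, size_mib * MIB, DTB_START, DTB_END)
    print(f"{size_mib} MiB footprint clobbers dtb: {hit}")
```

The gap is 0x7f3000 bytes, a little under 8 MiB, so a decompressed image
plus .bss larger than that would run into the dtb at this address.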

Documentation/arm/booting.rst says that "The dtb must be placed in a
region of memory where the kernel decompressor will not overwrite it".

This suggests that the problem is with our u-boot configuration, but
how is u-boot supposed to know where the self-extracting kernel is
going to place things? As far as I can tell u-boot is doing a
reasonable job of finding a place to put the dtb which it thinks is
unused. I'm not sure why it's picked 0x607fb000 instead of putting it
just under the ramdisk, but regardless, with the information u-boot has,
that address is up for grabs.

Has this come up before? The self-extraction code is fairly careful not
to overwrite itself but doesn't seem to pay any attention to the dtb
which surprised me. So I wonder if I'm missing something?

Thanks,
Chris


2019-11-27 09:28:58

by Russell King (Oracle)

Subject: Re: ARM expectations for location of kernel, ramdisk and dtb

On Wed, Nov 27, 2019 at 08:20:12AM +0000, Chris Packham wrote:
> Hi All,
>
> We're updating our systems to use the latest kernel. For many of them
> this is a fairly large leap. One problem we've hit is that during boot
> the dtb is clobbered by the uncompressed kernel.
>
> Here's a snippet of output from u-boot
>
> ## Loading kernel from FIT Image at 62000000 ...
> Using 'XS916MXS@2' configuration
> Trying 'kernel@1' kernel subimage
> Description: linux
> Created: 2019-11-27 6:53:48 UTC
> Type: Kernel Image
> Compression: uncompressed
> Data Start: 0x62000174
> Data Size: 3495432 Bytes = 3.3 MiB
> Architecture: ARM
> OS: Linux
> Load Address: 0x00800000
> Entry Point: 0x60800000
> ...
> Booting using the fdt blob at 0x63b90f6c
> Loading Kernel Image ... OK
> Loading Ramdisk to 6e7c6000, end 70000000 ... OK
> Loading Device Tree to 607fb000, end 607fffd8 ... OK
>
> Starting kernel ...
>
> Uncompressing Linux... done, booting the kernel.
>
> Error: invalid dtb and unrecognized/unsupported machine ID
> r1=0x0000206e, r2=0x00000000
>
> Between old and new the location of the devicetree hasn't actually
> changed. What has changed is the size of the kernel: the self-extracting
> kernel unpacks to 0x60008000 and, with our current configuration, now
> extends into where the dtb is located.
>
> Documentation/arm/booting.rst says that "The dtb must be placed in a
> region of memory where the kernel decompressor will not overwrite it".
>
> This suggests that the problem is with our u-boot configuration, but
> how is u-boot supposed to know where the self-extracting kernel is
> going to place things? As far as I can tell u-boot is doing a
> reasonable job of finding a place to put the dtb which it thinks is
> unused. I'm not sure why it's picked 0x607fb000 instead of putting it
> just under the ramdisk, but regardless, with the information u-boot has,
> that address is up for grabs.
>
> Has this come up before? The self-extraction code is fairly careful not
> to overwrite itself but doesn't seem to pay any attention to the dtb
> which surprised me. So I wonder if I'm missing something?

The self-extraction hasn't changed much over the years, and basically
follows the same method which has worked for the vast majority of
platforms.

Where things fall down is where things are placed too close, and yes,
as the kernel grows, what was reasonable years ago becomes too close
with modern kernels.

The problem has been compounded by the various different compression
algorithms that can now be used for the compressed kernel.

kexec also ran into this problem, and there is now enough information
in a modern kernel to calculate how much space the decompressor is
going to require. Have a look at the current kexec sources for how
it is done.
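
The kind of calculation described above can be sketched as follows. This
is not the kexec code and it does not parse a zImage; it only shows how a
loader could pick a dtb address once it knows the decompressed image size
and .bss size. The function names, the `margin` parameter (standing in for
any extra working space the decompressor needs) and the 4 KiB page size
are assumptions made for the example.

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB pages
MIB = 1 << 20


def align_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def lowest_safe_dtb_address(kernel_base, image_size, bss_size, margin=0):
    """Lowest page-aligned address above the decompressed kernel's footprint.

    All sizes are inputs the caller has to obtain from the kernel image;
    `margin` is a hypothetical allowance for decompressor working space.
    """
    footprint_end = kernel_base + image_size + bss_size + margin
    return align_up(footprint_end, PAGE_SIZE)


# Illustrative sizes only: an 8 MiB image with 1 MiB of .bss.
addr = lowest_safe_dtb_address(0x60008000, 8 * MIB, 1 * MIB)
print(f"place dtb at or above {addr:#x}")
```

A loader would still need to keep the dtb clear of the ramdisk and inside
memory the kernel can reach early in boot; the sketch only covers the
lower bound.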

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

2019-11-27 22:20:21

by Chris Packham

Subject: Re: ARM expectations for location of kernel, ramdisk and dtb

Hi Russell,

On Wed, 2019-11-27 at 09:26 +0000, Russell King - ARM Linux admin
wrote:
> The self-extraction hasn't changed much over the years, and basically
> follows the same method which has worked for the vast majority of
> platforms.
>
> Where things fall down is where things are placed too close, and yes,
> as the kernel grows, what was reasonable years ago becomes too close
> with modern kernels.
>
> The problem has been compounded by the various different compression
> algorithms that can now be used for the compressed kernel.
>

I don't think it's that we don't know how big the extracted kernel will
be. It's just that we aren't doing anything with that information w.r.t
the dtb.

> kexec also ran into this problem, and there is now enough information
> in a modern kernel to calculate how much space the decompressor is
> going to require. Have a look at the current kexec sources for how
> it is done.
>

Thanks, will do. If we get something suitable we'll post a patch.

2019-11-27 23:21:22

by Russell King (Oracle)

Subject: Re: ARM expectations for location of kernel, ramdisk and dtb

On Wed, Nov 27, 2019 at 10:15:57PM +0000, Chris Packham wrote:
> Hi Russell,
>
> On Wed, 2019-11-27 at 09:26 +0000, Russell King - ARM Linux admin
> wrote:
> > The self-extraction hasn't changed much over the years, and basically
> > follows the same method which has worked for the vast majority of
> > platforms.
> >
> > Where things fall down is where things are placed too close, and yes,
> > as the kernel grows, what was reasonable years ago becomes too close
> > with modern kernels.
> >
> > The problem has been compounded by the various different compression
> > algorithms that can now be used for the compressed kernel.
> >
>
> I don't think it's that we don't know how big the extracted kernel will
> be. It's just that we aren't doing anything with that information w.r.t
> the dtb.

I believe u-boot tried at one point to instigate some kind of standard
placement of the kernel / dtb with respect to the available RAM, but
vendors tried hard to ignore u-boot and go their own way - resulting
in systems that didn't boot without customising various u-boot
environment variables. It's very annoying when vendors ignore the
community...


2019-11-27 23:55:36

by Stefan Agner

Subject: Re: ARM expectations for location of kernel, ramdisk and dtb

On 2019-11-28 00:19, Russell King - ARM Linux admin wrote:
> On Wed, Nov 27, 2019 at 10:15:57PM +0000, Chris Packham wrote:
>> Hi Russell,
>>
>> On Wed, 2019-11-27 at 09:26 +0000, Russell King - ARM Linux admin
>> wrote:
>> > The self-extraction hasn't changed much over the years, and basically
>> > follows the same method which has worked for the vast majority of
>> > platforms.
>> >
>> > Where things fall down is where things are placed too close, and yes,
>> > as the kernel grows, what was reasonable years ago becomes too close
>> > with modern kernels.
>> >
>> > The problem has been compounded by the various different compression
>> > algorithms that can now be used for the compressed kernel.
>> >
>>
>> I don't think it's that we don't know how big the extracted kernel will
>> be. It's just that we aren't doing anything with that information w.r.t
>> the dtb.
>
> I believe u-boot tried at one point to instigate some kind of standard
> placement of the kernel / dtb with respect to the available RAM, but
> vendors tried hard to ignore u-boot and go their own way - resulting
> in systems that didn't boot without customising various u-boot
> environment variables. It's very annoying when vendors ignore the
> community...

Indeed, there are too many boards setting fdt_high to 0xffffffff by
default, which disables device tree relocation... Not sure why vendors
are doing this, maybe because they want to save the extra copy. Often
the very same boards have a kernel load address which conflicts with the
TEXT_OFFSET address, requiring the kernel to relocate before decompressing,
which certainly takes longer than relocating a device tree...

Disabling relocation is also problematic when storing the device tree
close to an initrd. I remember I had an issue when using FIT images
without relocation:
https://lists.denx.de/pipermail/u-boot/2016-August/263689.html

From what I remember, I tracked down the exact issue: Linux frees (and
poisons) the initrd's memory page-aligned, which then corrupted the
device tree.
https://elixir.bootlin.com/linux/latest/source/arch/arm/mm/init.c#L315
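
The page-alignment effect described above can be modelled in a few lines.
This is a simplified sketch, not the kernel's code: it assumes 4 KiB pages
and that the freed range is the initrd rounded out to page boundaries at
both ends. The addresses are hypothetical, chosen to mimic a dtb packed
directly behind an initrd inside a FIT image.

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB pages


def page_align_down(addr):
    """Round addr down to the start of its page."""
    return addr & ~(PAGE_SIZE - 1)


def page_align_up(addr):
    """Round addr up to the next page boundary."""
    return (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


def freed_initrd_range(initrd_start, initrd_end):
    """Page-aligned range released (and poisoned) when the initrd is freed."""
    return page_align_down(initrd_start), page_align_up(initrd_end)


def dtb_hit_by_initrd_free(initrd_start, initrd_end, dtb_start, dtb_end):
    """True if the dtb shares a page with the freed initrd range."""
    freed_start, freed_end = freed_initrd_range(initrd_start, initrd_end)
    return dtb_start < freed_end and freed_start < dtb_end


# Hypothetical layout: dtb stored two bytes after the end of the initrd.
initrd = (0x68000000, 0x68123456)
dtb_packed = (0x68123458, 0x68128000)
dtb_next_page = (0x68124000, 0x68129000)

print(dtb_hit_by_initrd_free(*initrd, *dtb_packed))     # shares the last page
print(dtb_hit_by_initrd_free(*initrd, *dtb_next_page))  # starts on its own page
```

In this model, starting the dtb on its own page (or letting u-boot
relocate it) keeps it out of the freed range.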

--
Stefan