2023-12-15 19:05:40

by Chris Koch

[permalink] [raw]
Subject: [PATCH] kexec: allocate kernel above bzImage's pref_address

A relocatable kernel will relocate itself to pref_address if it is
loaded below pref_address. This means a booted kernel may be relocating
itself to an area with reserved memory on modern systems, potentially
clobbering arbitrary data that may be important to the system.

This is often the case, as the default value of PHYSICAL_START is
0x1000000 and kernels are typically loaded at 0x100000 or above by
bootloaders like iPXE or kexec. GRUB behaves like this patch does.

Also fixes the documentation around pref_address and PHYSICAL_START to
be accurate.

Co-developed-by: Cloud Hsu <[email protected]>
Signed-off-by: Cloud Hsu <[email protected]>
Signed-off-by: Chris Koch <[email protected]>
---
Documentation/arch/x86/boot.rst | 3 ++-
arch/x86/Kconfig | 10 +++++-----
arch/x86/kernel/kexec-bzimage64.c | 5 ++++-
3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/Documentation/arch/x86/boot.rst b/Documentation/arch/x86/boot.rst
index 22cc7a040dae..49bea8986620 100644
--- a/Documentation/arch/x86/boot.rst
+++ b/Documentation/arch/x86/boot.rst
@@ -878,7 +878,8 @@ Protocol: 2.10+
address if possible.

A non-relocatable kernel will unconditionally move itself and to run
- at this address.
+ at this address. A relocatable kernel will move itself to this address if it
+ loaded below this address.

============ =======
Field name: init_size
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 3762f41bb092..1370f43328d7 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2109,11 +2109,11 @@ config PHYSICAL_START
help
This gives the physical address where the kernel is loaded.

- If kernel is a not relocatable (CONFIG_RELOCATABLE=n) then
- bzImage will decompress itself to above physical address and
- run from there. Otherwise, bzImage will run from the address where
- it has been loaded by the boot loader and will ignore above physical
- address.
+ If the kernel is not relocatable (CONFIG_RELOCATABLE=n) then bzImage
+ will decompress itself to above physical address and run from there.
+ Otherwise, bzImage will run from the address where it has been loaded
+ by the boot loader. The only exception is if it is loaded below the
+ above physical address, in which case it will relocate itself there.

In normal kdump cases one does not have to set/change this option
as now bzImage can be compiled as a completely relocatable image
diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
index a61c12c01270..5dcd232d58bf 100644
--- a/arch/x86/kernel/kexec-bzimage64.c
+++ b/arch/x86/kernel/kexec-bzimage64.c
@@ -498,7 +498,10 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
kbuf.bufsz = kernel_len - kern16_size;
kbuf.memsz = PAGE_ALIGN(header->init_size);
kbuf.buf_align = header->kernel_alignment;
- kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
+ if (header->pref_address < MIN_KERNEL_LOAD_ADDR)
+ kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
+ else
+ kbuf.buf_min = header->pref_address;
kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
ret = kexec_add_buffer(&kbuf);
if (ret)
--
2.43.0.472.g3155946c3a-goog



2023-12-15 19:32:42

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] kexec: allocate kernel above bzImage's pref_address

On December 15, 2023 11:05:21 AM PST, Chris Koch <[email protected]> wrote:
>A relocatable kernel will relocate itself to pref_address if it is
>loaded below pref_address. This means a booted kernel may be relocating
>itself to an area with reserved memory on modern systems, potentially
>clobbering arbitrary data that may be important to the system.
>
>This is often the case, as the default value of PHYSICAL_START is
>0x1000000 and kernels are typically loaded at 0x100000 or above by
>bootloaders like iPXE or kexec. GRUB behaves like this patch does.
>
>Also fixes the documentation around pref_address and PHYSICAL_START to
>be accurate.
>
>Co-developed-by: Cloud Hsu <[email protected]>
>Signed-off-by: Cloud Hsu <[email protected]>
>Signed-off-by: Chris Koch <[email protected]>
>---
> Documentation/arch/x86/boot.rst | 3 ++-
> arch/x86/Kconfig | 10 +++++-----
> arch/x86/kernel/kexec-bzimage64.c | 5 ++++-
> 3 files changed, 11 insertions(+), 7 deletions(-)
>
>diff --git a/Documentation/arch/x86/boot.rst b/Documentation/arch/x86/boot.rst
>index 22cc7a040dae..49bea8986620 100644
>--- a/Documentation/arch/x86/boot.rst
>+++ b/Documentation/arch/x86/boot.rst
>@@ -878,7 +878,8 @@ Protocol: 2.10+
> address if possible.
>
> A non-relocatable kernel will unconditionally move itself and to run
>- at this address.
>+ at this address. A relocatable kernel will move itself to this address if it
>+ loaded below this address.
>
> ============ =======
> Field name: init_size
>diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>index 3762f41bb092..1370f43328d7 100644
>--- a/arch/x86/Kconfig
>+++ b/arch/x86/Kconfig
>@@ -2109,11 +2109,11 @@ config PHYSICAL_START
> help
> This gives the physical address where the kernel is loaded.
>
>- If kernel is a not relocatable (CONFIG_RELOCATABLE=n) then
>- bzImage will decompress itself to above physical address and
>- run from there. Otherwise, bzImage will run from the address where
>- it has been loaded by the boot loader and will ignore above physical
>- address.
>+ If the kernel is not relocatable (CONFIG_RELOCATABLE=n) then bzImage
>+ will decompress itself to above physical address and run from there.
>+ Otherwise, bzImage will run from the address where it has been loaded
>+ by the boot loader. The only exception is if it is loaded below the
>+ above physical address, in which case it will relocate itself there.
>
> In normal kdump cases one does not have to set/change this option
> as now bzImage can be compiled as a completely relocatable image
>diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
>index a61c12c01270..5dcd232d58bf 100644
>--- a/arch/x86/kernel/kexec-bzimage64.c
>+++ b/arch/x86/kernel/kexec-bzimage64.c
>@@ -498,7 +498,10 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
> kbuf.bufsz = kernel_len - kern16_size;
> kbuf.memsz = PAGE_ALIGN(header->init_size);
> kbuf.buf_align = header->kernel_alignment;
>- kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
>+ if (header->pref_address < MIN_KERNEL_LOAD_ADDR)
>+ kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
>+ else
>+ kbuf.buf_min = header->pref_address;
> kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
> ret = kexec_add_buffer(&kbuf);
> if (ret)

Reviewed-by: H. Peter Anvin (Intel) <[email protected]>

2023-12-15 21:17:41

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH] kexec: allocate kernel above bzImage's pref_address

On 12/15/23 11:05, Chris Koch wrote:
> A relocatable kernel will relocate itself to pref_address if it is
> loaded below pref_address. This means a booted kernel may be relocating
> itself to an area with reserved memory on modern systems, potentially
> clobbering arbitrary data that may be important to the system.
>
> This is often the case, as the default value of PHYSICAL_START is
> 0x1000000 and kernels are typically loaded at 0x100000 or above by
> bootloaders like iPXE or kexec. GRUB behaves like this patch does.
>
> Also fixes the documentation around pref_address and PHYSICAL_START to
> be accurate.

Are you reporting a bug and is this a bug fix? It's not super clear
from the changelog.


> diff --git a/Documentation/arch/x86/boot.rst b/Documentation/arch/x86/boot.rst
> index 22cc7a040dae..49bea8986620 100644
> --- a/Documentation/arch/x86/boot.rst
> +++ b/Documentation/arch/x86/boot.rst
> @@ -878,7 +878,8 @@ Protocol: 2.10+
> address if possible.
>
> A non-relocatable kernel will unconditionally move itself and to run
> - at this address.
> + at this address. A relocatable kernel will move itself to this address if it
> + loaded below this address.

I think we should avoid saying the same things over and over again in
different spots.

Here, it doesn't really help to enumerate the different interpretations
of 'pref_address'. All that matters is that the bootloader can avoid
the overhead of a later copy if it can place the kernel at
'pref_address'. The exact reasons that various kernels might decide to
relocate are unimportant here.

> ============ =======
> Field name: init_size
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 3762f41bb092..1370f43328d7 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -2109,11 +2109,11 @@ config PHYSICAL_START
> help
> This gives the physical address where the kernel is loaded.
>
> - If kernel is a not relocatable (CONFIG_RELOCATABLE=n) then
> - bzImage will decompress itself to above physical address and
> - run from there. Otherwise, bzImage will run from the address where
> - it has been loaded by the boot loader and will ignore above physical
> - address.
> + If the kernel is not relocatable (CONFIG_RELOCATABLE=n) then bzImage
> + will decompress itself to above physical address and run from there.
> + Otherwise, bzImage will run from the address where it has been loaded
> + by the boot loader. The only exception is if it is loaded below the
> + above physical address, in which case it will relocate itself there.

I kinda dislike how this is written. It's written almost like code
where you're spelling out the conditions. I prefer something much
higher-level.

This gives a minimum physical address at which the kernel can be
loaded.

CONFIG_RELOCATABLE=n kernels will be decompressed to and must
run at PHYSICAL_START exactly.

CONFIG_RELOCATABLE=y kernels can run at any address above
PHYSICAL_START. If a kernel is loaded below PHYSICAL_START, it
will relocate itself to PHYSICAL_START.

> In normal kdump cases one does not have to set/change this option
> as now bzImage can be compiled as a completely relocatable image
> diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
> index a61c12c01270..5dcd232d58bf 100644
> --- a/arch/x86/kernel/kexec-bzimage64.c
> +++ b/arch/x86/kernel/kexec-bzimage64.c
> @@ -498,7 +498,10 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
> kbuf.bufsz = kernel_len - kern16_size;
> kbuf.memsz = PAGE_ALIGN(header->init_size);
> kbuf.buf_align = header->kernel_alignment;
> - kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
> + if (header->pref_address < MIN_KERNEL_LOAD_ADDR)
> + kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
> + else
> + kbuf.buf_min = header->pref_address;
> kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
> ret = kexec_add_buffer(&kbuf);
> if (ret)

Comment, please.

It isn't clear from this hunk why or how this fixes the bug. How does
this manage to avoid clobbering reserved areas?

2023-12-15 21:39:07

by Chris Koch

[permalink] [raw]
Subject: Re: [PATCH] kexec: allocate kernel above bzImage's pref_address

On Fri, Dec 15, 2023 at 1:17 PM Dave Hansen <[email protected]> wrote:
>
> On 12/15/23 11:05, Chris Koch wrote:
> > A relocatable kernel will relocate itself to pref_address if it is
> > loaded below pref_address. This means a booted kernel may be relocating
> > itself to an area with reserved memory on modern systems, potentially
> > clobbering arbitrary data that may be important to the system.
> >
> > This is often the case, as the default value of PHYSICAL_START is
> > 0x1000000 and kernels are typically loaded at 0x100000 or above by
> > bootloaders like iPXE or kexec. GRUB behaves like this patch does.
> >
> > Also fixes the documentation around pref_address and PHYSICAL_START to
> > be accurate.
>
> Are you reporting a bug and is this a bug fix? It's not super clear
> from the changelog.

I reported it as a bug yesterday in
https://lkml.org/lkml/2023/12/14/1529 -- I'm happy to reword this in
some way that indicates it's a bug fix.

>
>
> > diff --git a/Documentation/arch/x86/boot.rst b/Documentation/arch/x86/boot.rst
> > index 22cc7a040dae..49bea8986620 100644
> > --- a/Documentation/arch/x86/boot.rst
> > +++ b/Documentation/arch/x86/boot.rst
> > @@ -878,7 +878,8 @@ Protocol: 2.10+
> > address if possible.
> >
> > A non-relocatable kernel will unconditionally move itself and to run
> > - at this address.
> > + at this address. A relocatable kernel will move itself to this address if it
> > + loaded below this address.
>
> I think we should avoid saying the same things over and over again in
> different spots.
>
> Here, it doesn't really help to enumerate the different interpretations
> of 'pref_address'. All that matters is that the bootloader can avoid
> the overhead of a later copy if it can place the kernel at
> 'pref_address'. The exact reasons that various kernels might decide to
> relocate are unimportant here.

I think it's important documentation for bootloader authors. It's not
about avoiding overhead, it's about avoiding clobbering areas of
memory that may be reserved in e820 / EFI memory map, which the kernel
will do when it relocates itself to pref_address without checking
what's reserved and what's not. It emphasizes the importance of
choosing an address above pref_address. Happy to reword some way to
reflect that.

>
> > ============ =======
> > Field name: init_size
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index 3762f41bb092..1370f43328d7 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -2109,11 +2109,11 @@ config PHYSICAL_START
> > help
> > This gives the physical address where the kernel is loaded.
> >
> > - If kernel is a not relocatable (CONFIG_RELOCATABLE=n) then
> > - bzImage will decompress itself to above physical address and
> > - run from there. Otherwise, bzImage will run from the address where
> > - it has been loaded by the boot loader and will ignore above physical
> > - address.
> > + If the kernel is not relocatable (CONFIG_RELOCATABLE=n) then bzImage
> > + will decompress itself to above physical address and run from there.
> > + Otherwise, bzImage will run from the address where it has been loaded
> > + by the boot loader. The only exception is if it is loaded below the
> > + above physical address, in which case it will relocate itself there.
>
> I kinda dislike how this is written. It's written almost like code
> where you're spelling out the conditions. I prefer something much
> higher-level.
>
> This gives a minimum physical address at which the kernel can be
> loaded.
>
> CONFIG_RELOCATABLE=n kernels will be decompressed to and must
> run at PHYSICAL_START exactly.
>
> CONFIG_RELOCATABLE=y kernels can run at any address above
> PHYSICAL_START. If a kernel is loaded below PHYSICAL_START, it
> will relocate itself to PHYSICAL_START.

Happy to change that, yours is better.

>
> > In normal kdump cases one does not have to set/change this option
> > as now bzImage can be compiled as a completely relocatable image
> > diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
> > index a61c12c01270..5dcd232d58bf 100644
> > --- a/arch/x86/kernel/kexec-bzimage64.c
> > +++ b/arch/x86/kernel/kexec-bzimage64.c
> > @@ -498,7 +498,10 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
> > kbuf.bufsz = kernel_len - kern16_size;
> > kbuf.memsz = PAGE_ALIGN(header->init_size);
> > kbuf.buf_align = header->kernel_alignment;
> > - kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
> > + if (header->pref_address < MIN_KERNEL_LOAD_ADDR)
> > + kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
> > + else
> > + kbuf.buf_min = header->pref_address;
> > kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
> > ret = kexec_add_buffer(&kbuf);
> > if (ret)
>
> Comment, please.
>
> It isn't clear from this hunk why or how this fixes the bug. How does
> this manage to avoid clobbering reserved areas?

When allocated above pref_address, the kernel will not relocate itself
to an area that potentially overlaps with reserved memory. I'll add a
comment.

Not sure what the etiquette is on immediately sending a patch v2, or
waiting for more comments. I'll err on waiting on a couple more
comments before sending v2. Thanks for the review

Chris

2023-12-15 22:17:33

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH] kexec: allocate kernel above bzImage's pref_address

On 12/15/23 13:38, Chris Koch wrote:
> On Fri, Dec 15, 2023 at 1:17 PM Dave Hansen <[email protected]> wrote:
...
>> Are you reporting a bug and is this a bug fix? It's not super clear
>> from the changelog.
>
> I reported it as a bug yesterday in
> https://lkml.org/lkml/2023/12/14/1529 -- I'm happy to reword this in
> some way that indicates it's a bug fix.

Ahh, thanks for the link!

Please do include things like this as a "Link:" tag in the changelog
(and use lore.kernel.org URLs when possible).

>>> diff --git a/Documentation/arch/x86/boot.rst b/Documentation/arch/x86/boot.rst
>>> index 22cc7a040dae..49bea8986620 100644
>>> --- a/Documentation/arch/x86/boot.rst
>>> +++ b/Documentation/arch/x86/boot.rst
>>> @@ -878,7 +878,8 @@ Protocol: 2.10+
>>> address if possible.
>>>
>>> A non-relocatable kernel will unconditionally move itself and to run
>>> - at this address.
>>> + at this address. A relocatable kernel will move itself to this address if it
>>> + loaded below this address.
>>
>> I think we should avoid saying the same things over and over again in
>> different spots.
>>
>> Here, it doesn't really help to enumerate the different interpretations
>> of 'pref_address'. All that matters is that the bootloader can avoid
>> the overhead of a later copy if it can place the kernel at
>> 'pref_address'. The exact reasons that various kernels might decide to
>> relocate are unimportant here.
>
> I think it's important documentation for bootloader authors. It's not
> about avoiding overhead, it's about avoiding clobbering areas of
> memory that may be reserved in e820 / EFI memory map, which the kernel
> will do when it relocates itself to pref_address without checking
> what's reserved and what's not. It emphasizes the importance of
> choosing an address above pref_address. Happy to reword some way to
> reflect that.

Could we give this as a direction to bootloader authors? They should
read 'pref_address' as a big, fat, "the kernel may blow this address
away" message.

...
>>> In normal kdump cases one does not have to set/change this option
>>> as now bzImage can be compiled as a completely relocatable image
>>> diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
>>> index a61c12c01270..5dcd232d58bf 100644
>>> --- a/arch/x86/kernel/kexec-bzimage64.c
>>> +++ b/arch/x86/kernel/kexec-bzimage64.c
>>> @@ -498,7 +498,10 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
>>> kbuf.bufsz = kernel_len - kern16_size;
>>> kbuf.memsz = PAGE_ALIGN(header->init_size);
>>> kbuf.buf_align = header->kernel_alignment;
>>> - kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
>>> + if (header->pref_address < MIN_KERNEL_LOAD_ADDR)
>>> + kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
>>> + else
>>> + kbuf.buf_min = header->pref_address;
>>> kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
>>> ret = kexec_add_buffer(&kbuf);
>>> if (ret)
>>
>> Comment, please.
>>
>> It isn't clear from this hunk why or how this fixes the bug. How does
>> this manage to avoid clobbering reserved areas?
>
> When allocated above pref_address, the kernel will not relocate itself
> to an area that potentially overlaps with reserved memory. I'll add a
> comment.

Sounds good.

> Not sure what the etiquette is on immediately sending a patch v2, or
> waiting for more comments. I'll err on waiting on a couple more
> comments before sending v2. Thanks for the review

Please wait another few days.

2023-12-15 23:09:49

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] kexec: allocate kernel above bzImage's pref_address

On December 15, 2023 1:17:00 PM PST, Dave Hansen <[email protected]> wrote:
>On 12/15/23 11:05, Chris Koch wrote:
>> A relocatable kernel will relocate itself to pref_address if it is
>> loaded below pref_address. This means a booted kernel may be relocating
>> itself to an area with reserved memory on modern systems, potentially
>> clobbering arbitrary data that may be important to the system.
>>
>> This is often the case, as the default value of PHYSICAL_START is
>> 0x1000000 and kernels are typically loaded at 0x100000 or above by
>> bootloaders like iPXE or kexec. GRUB behaves like this patch does.
>>
>> Also fixes the documentation around pref_address and PHYSICAL_START to
>> be accurate.
>
>Are you reporting a bug and is this a bug fix? It's not super clear
>from the changelog.
>
>
>> diff --git a/Documentation/arch/x86/boot.rst b/Documentation/arch/x86/boot.rst
>> index 22cc7a040dae..49bea8986620 100644
>> --- a/Documentation/arch/x86/boot.rst
>> +++ b/Documentation/arch/x86/boot.rst
>> @@ -878,7 +878,8 @@ Protocol: 2.10+
>> address if possible.
>>
>> A non-relocatable kernel will unconditionally move itself and to run
>> - at this address.
>> + at this address. A relocatable kernel will move itself to this address if it
>> + loaded below this address.
>
>I think we should avoid saying the same things over and over again in
>different spots.
>
>Here, it doesn't really help to enumerate the different interpretations
>of 'pref_address'. All that matters is that the bootloader can avoid
>the overhead of a later copy if it can place the kernel at
>'pref_address'. The exact reasons that various kernels might decide to
>relocate are unimportant here.
>
>> ============ =======
>> Field name: init_size
>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>> index 3762f41bb092..1370f43328d7 100644
>> --- a/arch/x86/Kconfig
>> +++ b/arch/x86/Kconfig
>> @@ -2109,11 +2109,11 @@ config PHYSICAL_START
>> help
>> This gives the physical address where the kernel is loaded.
>>
>> - If kernel is a not relocatable (CONFIG_RELOCATABLE=n) then
>> - bzImage will decompress itself to above physical address and
>> - run from there. Otherwise, bzImage will run from the address where
>> - it has been loaded by the boot loader and will ignore above physical
>> - address.
>> + If the kernel is not relocatable (CONFIG_RELOCATABLE=n) then bzImage
>> + will decompress itself to above physical address and run from there.
>> + Otherwise, bzImage will run from the address where it has been loaded
>> + by the boot loader. The only exception is if it is loaded below the
>> + above physical address, in which case it will relocate itself there.
>
>I kinda dislike how this is written. It's written almost like code
>where you're spelling out the conditions. I prefer something much
>higher-level.
>
> This gives a minimum physical address at which the kernel can be
> loaded.
>
> CONFIG_RELOCATABLE=n kernels will be decompressed to and must
> run at PHYSICAL_START exactly.
>
> CONFIG_RELOCATABLE=y kernels can run at any address above
> PHYSICAL_START. If a kernel is loaded below PHYSICAL_START, it
> will relocate itself to PHYSICAL_START.
>
>> In normal kdump cases one does not have to set/change this option
>> as now bzImage can be compiled as a completely relocatable image
>> diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
>> index a61c12c01270..5dcd232d58bf 100644
>> --- a/arch/x86/kernel/kexec-bzimage64.c
>> +++ b/arch/x86/kernel/kexec-bzimage64.c
>> @@ -498,7 +498,10 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
>> kbuf.bufsz = kernel_len - kern16_size;
>> kbuf.memsz = PAGE_ALIGN(header->init_size);
>> kbuf.buf_align = header->kernel_alignment;
>> - kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
>> + if (header->pref_address < MIN_KERNEL_LOAD_ADDR)
>> + kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
>> + else
>> + kbuf.buf_min = header->pref_address;
>> kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
>> ret = kexec_add_buffer(&kbuf);
>> if (ret)
>
>Comment, please.
>
>It isn't clear from this hunk why or how this fixes the bug. How does
>this manage to avoid clobbering reserved areas?

It is a bug and a bug fix.

2023-12-15 23:13:26

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] kexec: allocate kernel above bzImage's pref_address

On December 15, 2023 1:17:00 PM PST, Dave Hansen <[email protected]> wrote:
>On 12/15/23 11:05, Chris Koch wrote:
>> A relocatable kernel will relocate itself to pref_address if it is
>> loaded below pref_address. This means a booted kernel may be relocating
>> itself to an area with reserved memory on modern systems, potentially
>> clobbering arbitrary data that may be important to the system.
>>
>> This is often the case, as the default value of PHYSICAL_START is
>> 0x1000000 and kernels are typically loaded at 0x100000 or above by
>> bootloaders like iPXE or kexec. GRUB behaves like this patch does.
>>
>> Also fixes the documentation around pref_address and PHYSICAL_START to
>> be accurate.
>
>Are you reporting a bug and is this a bug fix? It's not super clear
>from the changelog.
>
>
>> diff --git a/Documentation/arch/x86/boot.rst b/Documentation/arch/x86/boot.rst
>> index 22cc7a040dae..49bea8986620 100644
>> --- a/Documentation/arch/x86/boot.rst
>> +++ b/Documentation/arch/x86/boot.rst
>> @@ -878,7 +878,8 @@ Protocol: 2.10+
>> address if possible.
>>
>> A non-relocatable kernel will unconditionally move itself and to run
>> - at this address.
>> + at this address. A relocatable kernel will move itself to this address if it
>> + loaded below this address.
>
>I think we should avoid saying the same things over and over again in
>different spots.
>
>Here, it doesn't really help to enumerate the different interpretations
>of 'pref_address'. All that matters is that the bootloader can avoid
>the overhead of a later copy if it can place the kernel at
>'pref_address'. The exact reasons that various kernels might decide to
>relocate are unimportant here.
>
>> ============ =======
>> Field name: init_size
>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>> index 3762f41bb092..1370f43328d7 100644
>> --- a/arch/x86/Kconfig
>> +++ b/arch/x86/Kconfig
>> @@ -2109,11 +2109,11 @@ config PHYSICAL_START
>> help
>> This gives the physical address where the kernel is loaded.
>>
>> - If kernel is a not relocatable (CONFIG_RELOCATABLE=n) then
>> - bzImage will decompress itself to above physical address and
>> - run from there. Otherwise, bzImage will run from the address where
>> - it has been loaded by the boot loader and will ignore above physical
>> - address.
>> + If the kernel is not relocatable (CONFIG_RELOCATABLE=n) then bzImage
>> + will decompress itself to above physical address and run from there.
>> + Otherwise, bzImage will run from the address where it has been loaded
>> + by the boot loader. The only exception is if it is loaded below the
>> + above physical address, in which case it will relocate itself there.
>
>I kinda dislike how this is written. It's written almost like code
>where you're spelling out the conditions. I prefer something much
>higher-level.
>
> This gives a minimum physical address at which the kernel can be
> loaded.
>
> CONFIG_RELOCATABLE=n kernels will be decompressed to and must
> run at PHYSICAL_START exactly.
>
> CONFIG_RELOCATABLE=y kernels can run at any address above
> PHYSICAL_START. If a kernel is loaded below PHYSICAL_START, it
> will relocate itself to PHYSICAL_START.
>
>> In normal kdump cases one does not have to set/change this option
>> as now bzImage can be compiled as a completely relocatable image
>> diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
>> index a61c12c01270..5dcd232d58bf 100644
>> --- a/arch/x86/kernel/kexec-bzimage64.c
>> +++ b/arch/x86/kernel/kexec-bzimage64.c
>> @@ -498,7 +498,10 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
>> kbuf.bufsz = kernel_len - kern16_size;
>> kbuf.memsz = PAGE_ALIGN(header->init_size);
>> kbuf.buf_align = header->kernel_alignment;
>> - kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
>> + if (header->pref_address < MIN_KERNEL_LOAD_ADDR)
>> + kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
>> + else
>> + kbuf.buf_min = header->pref_address;
>> kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
>> ret = kexec_add_buffer(&kbuf);
>> if (ret)
>
>Comment, please.
>
>It isn't clear from this hunk why or how this fixes the bug. How does
>this manage to avoid clobbering reserved areas?

Also, this really should go to stable, so please add the appropriate Cc:.