2020-02-12 11:06:11

by Dave Young

Subject: [PATCH] x86/kexec: do not reserve kexec setup_data in kexec e820 table

The e820 table for the kexec kernel always marks setup_data as reserved.
That is reasonable for setup_data passed by the 1st kernel's boot loader,
for example SETUP_PCI. But SETUP_EFI is used by kexec itself to enable
EFI in the 2nd kernel, and kexec sets it up again on every load. Thus it
is pointless to reserve kexec-prepared setup_data.

1st physical boot: no SETUP_EFI
kexec loads a new kernel and prepares a SETUP_EFI setup_data, then reboots
-> 2nd kernel sees SETUP_EFI, reserves it in both e820 and kexec e820
another kexec load prepares a new SETUP_EFI, then reboots
-> 3rd kernel has two SETUP_EFI ranges reserved
-> and so on..

Thus skip SETUP_EFI while reserving setup_data for kexec kernel.

Signed-off-by: Dave Young <[email protected]>
---
arch/x86/kernel/e820.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- linux-x86.orig/arch/x86/kernel/e820.c
+++ linux-x86/arch/x86/kernel/e820.c
@@ -999,7 +999,9 @@ void __init e820__reserve_setup_data(voi
 	while (pa_data) {
 		data = early_memremap(pa_data, sizeof(*data));
 		e820__range_update(pa_data, sizeof(*data)+data->len, E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
-		e820__range_update_kexec(pa_data, sizeof(*data)+data->len, E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
+		/* Skip kexec passed setup_data */
+		if (data->type != SETUP_EFI)
+			e820__range_update_kexec(pa_data, sizeof(*data)+data->len, E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);

 		if (data->type == SETUP_INDIRECT &&
 		    ((struct setup_indirect *)data->data)->type != SETUP_INDIRECT) {
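
For readers who want to try the filtering logic outside the kernel, here is a
minimal user-space sketch. It is not the kernel code: struct fake_setup_data,
struct range, FAKE_HDR_SIZE and collect_kexec_reserved() are hypothetical
names made up for illustration, and nodes are linked by ordinary pointers
instead of physical addresses mapped with early_memremap(). The SETUP_* values
follow the x86 boot protocol header (bootparam.h).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* setup_data types as listed in the x86 boot protocol (bootparam.h). */
enum { SETUP_NONE = 0, SETUP_E820_EXT, SETUP_DTB, SETUP_PCI, SETUP_EFI };

/* Hypothetical stand-in for struct setup_data, linked by plain pointers. */
struct fake_setup_data {
	const struct fake_setup_data *next;
	uint32_t type;
	uint32_t len;	/* payload length, header not included */
	uint64_t pa;	/* pretend physical address of this node */
};

struct range {
	uint64_t start;
	uint64_t size;
};

/* Header size of the real struct setup_data: u64 next + u32 type + u32 len. */
#define FAKE_HDR_SIZE 16u

/*
 * Collect the ranges that would be marked reserved in the kexec e820
 * table. SETUP_EFI nodes are skipped because kexec prepares a fresh one
 * on every load. Returns the number of ranges written to out.
 */
static size_t collect_kexec_reserved(const struct fake_setup_data *d,
				     struct range *out, size_t max)
{
	size_t n = 0;

	for (; d; d = d->next) {
		if (d->type == SETUP_EFI)
			continue;
		assert(n < max);	/* caller must size out[] generously */
		out[n].start = d->pa;
		out[n].size = FAKE_HDR_SIZE + (uint64_t)d->len;
		n++;
	}
	return n;
}
```

Given a list holding one SETUP_PCI node and one SETUP_EFI node, only the
SETUP_PCI range is returned, which mirrors what the patched loop passes to
e820__range_update_kexec().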


2020-02-24 07:15:20

by Dave Young

Subject: Re: [PATCH] x86/kexec: do not reserve kexec setup_data in kexec e820 table

On 02/12/20 at 07:04pm, Dave Young wrote:
> The e820 table for the kexec kernel always marks setup_data as reserved.
> That is reasonable for setup_data passed by the 1st kernel's boot loader,
> for example SETUP_PCI. But SETUP_EFI is used by kexec itself to enable
> EFI in the 2nd kernel, and kexec sets it up again on every load. Thus it
> is pointless to reserve kexec-prepared setup_data.
>
> 1st physical boot: no SETUP_EFI
> kexec loads a new kernel and prepares a SETUP_EFI setup_data, then reboots
> -> 2nd kernel sees SETUP_EFI, reserves it in both e820 and kexec e820
> another kexec load prepares a new SETUP_EFI, then reboots
> -> 3rd kernel has two SETUP_EFI ranges reserved
> -> and so on..
>
> Thus skip SETUP_EFI while reserving setup_data for kexec kernel.
>
> Signed-off-by: Dave Young <[email protected]>
> ---
> arch/x86/kernel/e820.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> --- linux-x86.orig/arch/x86/kernel/e820.c
> +++ linux-x86/arch/x86/kernel/e820.c
> @@ -999,7 +999,9 @@ void __init e820__reserve_setup_data(voi
>  	while (pa_data) {
>  		data = early_memremap(pa_data, sizeof(*data));
>  		e820__range_update(pa_data, sizeof(*data)+data->len, E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
> -		e820__range_update_kexec(pa_data, sizeof(*data)+data->len, E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
> +		/* Skip kexec passed setup_data */
> +		if (data->type != SETUP_EFI)
> +			e820__range_update_kexec(pa_data, sizeof(*data)+data->len, E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
>
>  		if (data->type == SETUP_INDIRECT &&
>  		    ((struct setup_indirect *)data->data)->type != SETUP_INDIRECT) {

Ping, can someone review this? It causes memory fragmentation in the kexec
kernel and also wastes memory.

Thanks
Dave

Subject: [tip: x86/kdump] x86/kexec: Do not reserve EFI setup_data in the kexec e820 table

The following commit has been merged into the x86/kdump branch of tip:

Commit-ID: 8efbc518b884e1db2dd6a6fce62d0112ab871dcf
Gitweb: https://git.kernel.org/tip/8efbc518b884e1db2dd6a6fce62d0112ab871dcf
Author: Dave Young <[email protected]>
AuthorDate: Wed, 12 Feb 2020 19:04:24 +08:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Mon, 24 Feb 2020 11:41:57 +01:00

x86/kexec: Do not reserve EFI setup_data in the kexec e820 table

The e820 table for the kexec kernel unconditionally marks setup_data as
reserved because the second kernel can reuse setup_data passed by the
1st kernel's boot loader, for example SETUP_PCI marked regions like PCI
BIOS, etc.

SETUP_EFI types, however, are used by kexec itself to enable EFI in the
2nd kernel. Thus, it is pointless to add this type of setup_data to the
kexec e820 table as reserved.

IOW, what happens is this:

- 1st physical boot: no SETUP_EFI.

- kexec loads a new kernel and prepares a SETUP_EFI setup_data blob, then
reboots the machine.

- 2nd kernel sees SETUP_EFI, reserves it both in the e820 and in the
kexec e820 table.

- If another kexec load is executed, it prepares a new SETUP_EFI blob and
then reboots the machine into the new kernel.

- The 3rd kexec-ed kernel has two SETUP_EFI ranges reserved. And so on...

Thus skip SETUP_EFI while reserving setup_data in the e820_table_kexec
table because it is not needed.

[ bp: Heavily massage commit message, shorten line and improve comment. ]

Signed-off-by: Dave Young <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/e820.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index c5399e8..c920296 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -999,7 +999,15 @@ void __init e820__reserve_setup_data(void)
 	while (pa_data) {
 		data = early_memremap(pa_data, sizeof(*data));
 		e820__range_update(pa_data, sizeof(*data)+data->len, E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
-		e820__range_update_kexec(pa_data, sizeof(*data)+data->len, E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
+
+		/*
+		 * SETUP_EFI is supplied by kexec and does not need to be
+		 * reserved.
+		 */
+		if (data->type != SETUP_EFI)
+			e820__range_update_kexec(pa_data,
+						 sizeof(*data) + data->len,
+						 E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);

 		if (data->type == SETUP_INDIRECT &&
 		    ((struct setup_indirect *)data->data)->type != SETUP_INDIRECT) {
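
The accumulation described in the commit message can be reproduced with a toy
counter model. This is only a sketch under stated assumptions: struct tables
and simulate_kexec_boots() are hypothetical names, each kexec-ed kernel is
assumed to build its memory map from the previous kernel's kexec e820 table,
and every kexec load is assumed to place its SETUP_EFI blob at a new address
so that the reserved ranges do not merge.

```c
#include <assert.h>
#include <stdbool.h>

/* Number of SETUP_EFI ranges marked reserved in each table. */
struct tables {
	unsigned int e820;		/* table used by the running kernel */
	unsigned int e820_kexec;	/* table handed to the next kernel */
};

/*
 * Simulate kexec_boots consecutive kexec reboots after one physical boot.
 * With skip_setup_efi set, the kexec table is left alone, as in the patch.
 */
static struct tables simulate_kexec_boots(unsigned int kexec_boots,
					  bool skip_setup_efi)
{
	struct tables t = { 0, 0 };	/* physical boot: no SETUP_EFI */
	unsigned int i;

	for (i = 0; i < kexec_boots; i++) {
		/* The new kernel starts from the previous kexec table. */
		unsigned int inherited = t.e820_kexec;

		/* The running kernel always reserves the blob it booted with. */
		t.e820 = inherited + 1;
		t.e820_kexec = inherited + (skip_setup_efi ? 0 : 1);
	}
	return t;
}
```

Without the skip, two kexec reboots (the 3rd kernel) leave two SETUP_EFI
ranges reserved and the count keeps growing by one per reboot. With the skip,
the running kernel only ever reserves the one blob it was booted with, and
nothing is carried forward.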

2020-02-25 01:49:56

by Dave Young

Subject: Re: [tip: x86/kdump] x86/kexec: Do not reserve EFI setup_data in the kexec e820 table

On 02/24/20 at 10:51am, tip-bot2 for Dave Young wrote:
> The following commit has been merged into the x86/kdump branch of tip:
>
> Commit-ID: 8efbc518b884e1db2dd6a6fce62d0112ab871dcf
> Gitweb: https://git.kernel.org/tip/8efbc518b884e1db2dd6a6fce62d0112ab871dcf
> Author: Dave Young <[email protected]>
> AuthorDate: Wed, 12 Feb 2020 19:04:24 +08:00
> Committer: Borislav Petkov <[email protected]>
> CommitterDate: Mon, 24 Feb 2020 11:41:57 +01:00
>
> x86/kexec: Do not reserve EFI setup_data in the kexec e820 table
>
> The e820 table for the kexec kernel unconditionally marks setup_data as
> reserved because the second kernel can reuse setup_data passed by the
> 1st kernel's boot loader, for example SETUP_PCI marked regions like PCI
> BIOS, etc.
>
> SETUP_EFI types, however, are used by kexec itself to enable EFI in the
> 2nd kernel. Thus, it is pointless to add this type of setup_data to the
> kexec e820 table as reserved.
>
> IOW, what happens is this:
>
> - 1st physical boot: no SETUP_EFI.
>
> - kexec loads a new kernel and prepares a SETUP_EFI setup_data blob, then
> reboots the machine.
>
> - 2nd kernel sees SETUP_EFI, reserves it both in the e820 and in the
> kexec e820 table.
>
> - If another kexec load is executed, it prepares a new SETUP_EFI blob and
> then reboots the machine into the new kernel.
>
> - The 3rd kexec-ed kernel has two SETUP_EFI ranges reserved. And so on...
>
> Thus skip SETUP_EFI while reserving setup_data in the e820_table_kexec
> table because it is not needed.
>
> [ bp: Heavily massage commit message, shorten line and improve comment. ]

Boris, thanks for amending the patch and massaging the log.
I hesitated over whether to break the long line and in the end chose to
leave it as is. Neither option is great, and either is fine with me.

Thanks
Dave