2019-04-21 03:58:14

by Dave Young

Subject: [PATCH 1/2] X86/kdump: move crashkernel=X to reserve under 4G by default

The kdump crashkernel low reservation is limited to under 896M even for
X86_64. This obscure and miserable limitation exists for old kexec-tools
compatibility, but the reason is not documented anywhere.

Some more tests/investigations about the background:
a) Previously, old kexec-tools could only load purgatory to memory under 2G.
Eric removed that limitation in 2012 in kexec-tools:
Commit b4f9f8599679 ("kexec x86_64: Make purgatory relocatable anywhere
in the 64bit address space.")

b) Back in 2013, Yinghai removed all the limitations in new kexec-tools,
so bzImage64 can be loaded anywhere:
Commit 82c3dd2280d2 ("kexec, x86_64: Load bzImage64 above 4G")

c) Test results with old kexec-tools on old and latest kernels:
1. Old kexec-tools can no longer be built with a modern toolchain;
I built it in a RHEL6 VM.
2. kexec-tools 2.0.0 does not work with the latest kernel even with
memory under 896M, and gives the error
"ELF core (kcore) parse failed". It needs the kexec-tools fix below:
Commit ed15ba1b9977 ("build_mem_phdrs(): check if p_paddr is invalid")
3. Even with a patched kexec-tools that fixes 2), some other fixes are
still needed to work correctly with KASLR-enabled kernels.

So the situation is:
* Old kexec-tools is already broken with the latest kernels.
* We can not keep these limitations forever just for compatibility with
very old kexec-tools.
* If one must use old tools, then he/she can choose crashkernel=X@Y.
* People have reported bugs where crashkernel=384M failed because KASLR
makes the 0-896M space sparse.
* Crashkernel can reserve in the low or high area; it is natural to
understand low as memory under 4G.

Hence drop the 896M limitation, and change crashkernel low reservation to
reserve under 4G by default.

Signed-off-by: Dave Young <[email protected]>
---
arch/x86/kernel/setup.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

--- linux-x86.orig/arch/x86/kernel/setup.c
+++ linux-x86/arch/x86/kernel/setup.c
@@ -71,6 +71,7 @@
#include <linux/tboot.h>
#include <linux/jiffies.h>
#include <linux/mem_encrypt.h>
+#include <linux/sizes.h>

#include <linux/usb/xhci-dbgp.h>
#include <video/edid.h>
@@ -448,18 +449,17 @@ static void __init memblock_x86_reserve_
#ifdef CONFIG_KEXEC_CORE

/* 16M alignment for crash kernel regions */
-#define CRASH_ALIGN (16 << 20)
+#define CRASH_ALIGN SZ_16M

/*
* Keep the crash kernel below this limit. On 32 bits earlier kernels
* would limit the kernel to the low 512 MiB due to mapping restrictions.
- * On 64bit, old kexec-tools need to under 896MiB.
*/
#ifdef CONFIG_X86_32
-# define CRASH_ADDR_LOW_MAX (512 << 20)
-# define CRASH_ADDR_HIGH_MAX (512 << 20)
+# define CRASH_ADDR_LOW_MAX SZ_512M
+# define CRASH_ADDR_HIGH_MAX SZ_512M
#else
-# define CRASH_ADDR_LOW_MAX (896UL << 20)
+# define CRASH_ADDR_LOW_MAX SZ_4G
# define CRASH_ADDR_HIGH_MAX MAXMEM
#endif
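
For readers who want to check the arithmetic behind the macro changes, here is a minimal user-space sketch. It is not kernel code: the DEMO_* macros and demo_* helpers are hypothetical stand-ins, and their values are assumed to match what the SZ_* names in the patch suggest (16 MiB, 512 MiB, 4 GiB). It shows that the 32-bit limits and the alignment are unchanged in value, while the 64-bit low limit moves from 896 MiB to 4 GiB.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical local stand-ins for the kernel's SZ_* macros. */
#define DEMO_SZ_16M   UINT64_C(0x01000000)
#define DEMO_SZ_512M  UINT64_C(0x20000000)
#define DEMO_SZ_4G    UINT64_C(0x100000000)

/* Low reservation ceiling as written before the patch. */
static uint64_t demo_low_max_old(int is_64bit)
{
	return is_64bit ? (UINT64_C(896) << 20) : (UINT64_C(512) << 20);
}

/* Low reservation ceiling as written after the patch. */
static uint64_t demo_low_max_new(int is_64bit)
{
	return is_64bit ? DEMO_SZ_4G : DEMO_SZ_512M;
}

/* Round addr down to a multiple of align (align must be a power of two). */
static uint64_t demo_align_down(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return addr & ~(align - 1);
}
```

For example, demo_low_max_new(1) - demo_low_max_old(1) gives the extra address range (3200 MiB) that becomes eligible for the default low reservation on 64-bit.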




2019-04-21 18:28:06

by Ingo Molnar

Subject: Re: [PATCH 1/2] X86/kdump: move crashkernel=X to reserve under 4G by default


* Dave Young <[email protected]> wrote:

> The kdump crashkernel low reservation is limited to under 896M even for
> X86_64. This obscure and miserable limitation exists for old kexec-tools
> compatibility, but the reason is not documented anywhere.
>
> Some more tests/investigations about the background:
> a) Previously, old kexec-tools could only load purgatory to memory under 2G.
> Eric removed that limitation in 2012 in kexec-tools:
> Commit b4f9f8599679 ("kexec x86_64: Make purgatory relocatable anywhere
> in the 64bit address space.")
>
> b) Back in 2013, Yinghai removed all the limitations in new kexec-tools,
> so bzImage64 can be loaded anywhere:
> Commit 82c3dd2280d2 ("kexec, x86_64: Load bzImage64 above 4G")
>
> c) Test results with old kexec-tools on old and latest kernels:
> 1. Old kexec-tools can no longer be built with a modern toolchain;
> I built it in a RHEL6 VM.
> 2. kexec-tools 2.0.0 does not work with the latest kernel even with
> memory under 896M, and gives the error
> "ELF core (kcore) parse failed". It needs the kexec-tools fix below:
> Commit ed15ba1b9977 ("build_mem_phdrs(): check if p_paddr is invalid")
> 3. Even with a patched kexec-tools that fixes 2), some other fixes are
> still needed to work correctly with KASLR-enabled kernels.
>
> So the situation is:
> * Old kexec-tools is already broken with the latest kernels.
> * We can not keep these limitations forever just for compatibility with
> very old kexec-tools.
> * If one must use old tools, then he/she can choose crashkernel=X@Y.
> * People have reported bugs where crashkernel=384M failed because KASLR
> makes the 0-896M space sparse.
> * Crashkernel can reserve in the low or high area; it is natural to
> understand low as memory under 4G.
>
> Hence drop the 896M limitation, and change crashkernel low reservation to
> reserve under 4G by default.
>
> Signed-off-by: Dave Young <[email protected]>
> ---
> arch/x86/kernel/setup.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> --- linux-x86.orig/arch/x86/kernel/setup.c
> +++ linux-x86/arch/x86/kernel/setup.c
> @@ -71,6 +71,7 @@
> #include <linux/tboot.h>
> #include <linux/jiffies.h>
> #include <linux/mem_encrypt.h>
> +#include <linux/sizes.h>
>
> #include <linux/usb/xhci-dbgp.h>
> #include <video/edid.h>
> @@ -448,18 +449,17 @@ static void __init memblock_x86_reserve_
> #ifdef CONFIG_KEXEC_CORE
>
> /* 16M alignment for crash kernel regions */
> -#define CRASH_ALIGN (16 << 20)
> +#define CRASH_ALIGN SZ_16M
>
> /*
> * Keep the crash kernel below this limit. On 32 bits earlier kernels
> * would limit the kernel to the low 512 MiB due to mapping restrictions.
> - * On 64bit, old kexec-tools need to under 896MiB.
> */
> #ifdef CONFIG_X86_32
> -# define CRASH_ADDR_LOW_MAX (512 << 20)
> -# define CRASH_ADDR_HIGH_MAX (512 << 20)
> +# define CRASH_ADDR_LOW_MAX SZ_512M
> +# define CRASH_ADDR_HIGH_MAX SZ_512M
> #else
> -# define CRASH_ADDR_LOW_MAX (896UL << 20)
> +# define CRASH_ADDR_LOW_MAX SZ_4G
> # define CRASH_ADDR_HIGH_MAX MAXMEM
> #endif

Reviewed-by: Ingo Molnar <[email protected]>

Thanks,

Ingo

Subject: [tip:x86/kdump] x86/kdump: Have crashkernel=X reserve under 4G by default

Commit-ID: 9ca5c8e632ce8f144ec6d00da2dc5e16b41d593c
Gitweb: https://git.kernel.org/tip/9ca5c8e632ce8f144ec6d00da2dc5e16b41d593c
Author: Dave Young <[email protected]>
AuthorDate: Sun, 21 Apr 2019 11:50:59 +0800
Committer: Borislav Petkov <[email protected]>
CommitDate: Mon, 22 Apr 2019 10:15:16 +0200

x86/kdump: Have crashkernel=X reserve under 4G by default

The kdump crashkernel low reservation is limited to under 896M even for
X86_64. This obscure and miserable limitation exists for compatibility
with old kexec-tools but the reason is not documented anywhere.

Some more tests/investigations about the background:

a) Previously, old kexec-tools could only load purgatory to memory under
2G. Eric removed that limitation in 2012 in kexec-tools:

b4f9f8599679 ("kexec x86_64: Make purgatory relocatable anywhere
in the 64bit address space.")

b) Back in 2013 Yinghai removed all the limitations in new kexec-tools,
bzImage64 can be loaded anywhere:

82c3dd2280d2 ("kexec, x86_64: Load bzImage64 above 4G")

c) Test results with old kexec-tools with old and latest kernels:

1. Old kexec-tools can not build with modern toolchain anymore,
I built it in a RHEL6 vm.

2. 2.0.0 kexec-tools does not work with the latest kernel even with
memory under 896M and gives an error:

"ELF core (kcore) parse failed"

For that it needs below kexec-tools fix:

ed15ba1b9977 ("build_mem_phdrs(): check if p_paddr is invalid")

3. Even with patched kexec-tools which fixes 2), it still needs some
other fixes to work correctly for KASLR-enabled kernels.

So the situation is:

* Old kexec-tools is already broken with latest kernels.

* We can not keep these limitations forever just for compatibility with very
old kexec-tools.

* If one must use old tools then he/she can choose crashkernel=X@Y.

* People have reported bugs where crashkernel=384M failed because KASLR
makes the 0-896M space sparse.

* Crashkernel can reserve in low or high area, it is natural to understand
low as memory under 4G.

Hence drop the 896M limitation and change crashkernel low reservation to
reserve under 4G by default.

Signed-off-by: Dave Young <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Ingo Molnar <[email protected]>
Acked-by: Baoquan He <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David Howells <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Petr Tesarik <[email protected]>
Cc: [email protected]
Cc: Ram Pai <[email protected]>
Cc: Sinan Kaya <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: x86-ml <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: Zhimin Gu <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/setup.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 3d872a527cd9..daf7c5650c18 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -71,6 +71,7 @@
#include <linux/tboot.h>
#include <linux/jiffies.h>
#include <linux/mem_encrypt.h>
+#include <linux/sizes.h>

#include <linux/usb/xhci-dbgp.h>
#include <video/edid.h>
@@ -448,18 +449,17 @@ static void __init memblock_x86_reserve_range_setup_data(void)
#ifdef CONFIG_KEXEC_CORE

/* 16M alignment for crash kernel regions */
-#define CRASH_ALIGN (16 << 20)
+#define CRASH_ALIGN SZ_16M

/*
* Keep the crash kernel below this limit. On 32 bits earlier kernels
* would limit the kernel to the low 512 MiB due to mapping restrictions.
- * On 64bit, old kexec-tools need to under 896MiB.
*/
#ifdef CONFIG_X86_32
-# define CRASH_ADDR_LOW_MAX (512 << 20)
-# define CRASH_ADDR_HIGH_MAX (512 << 20)
+# define CRASH_ADDR_LOW_MAX SZ_512M
+# define CRASH_ADDR_HIGH_MAX SZ_512M
#else
-# define CRASH_ADDR_LOW_MAX (896UL << 20)
+# define CRASH_ADDR_LOW_MAX SZ_4G
# define CRASH_ADDR_HIGH_MAX MAXMEM
#endif
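
To illustrate why a sparse 0-896M space can make crashkernel=384M fail while a 4G ceiling succeeds, here is a toy user-space sketch. It is not the kernel's memblock allocator: struct demo_range, demo_find_below() and the free-memory layout in the usage note are all made up for illustration. The sketch only assumes that the reservation must be a single contiguous, 16M-aligned block ending at or below the ceiling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A free physical memory range, [start, end). Hypothetical type. */
struct demo_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Return the highest align-aligned base such that [base, base + size)
 * lies inside one free range and ends at or below limit.
 * Returns 0 when no range can hold the block (0 is never a valid result).
 */
static uint64_t demo_find_below(const struct demo_range *free_ranges, size_t n,
				uint64_t size, uint64_t align, uint64_t limit)
{
	uint64_t best = 0;

	assert(align != 0 && (align & (align - 1)) == 0);

	for (size_t i = 0; i < n; i++) {
		/* Clamp the usable end of this range to the ceiling. */
		uint64_t end = free_ranges[i].end < limit ?
			       free_ranges[i].end : limit;
		uint64_t base;

		if (end < size)
			continue;

		/* Highest aligned base whose block still ends by 'end'. */
		base = (end - size) & ~(align - 1);
		if (base >= free_ranges[i].start && base > best)
			best = base;
	}
	return best;
}
```

With a made-up layout of free ranges 16M-256M, 512M-800M and 1G-3G, a 384M request with 16M alignment finds nothing under an 896M ceiling (no single free range below it is large enough), but returns a base of 2688M under a 4G ceiling.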