2015-08-09 18:55:04

by Martin Schwidefsky

Subject: [PATCH] kexec: override GFP flag for the control page allocation

Greetings,

I got a bug report that kexec does not work on s390 if the system has
lots of memory. It is the kexec load step that dies with out-of-memory.

The reason why this happens is the value of KEXEC_CONTROL_MEMORY_LIMIT
for s390. To start the new kernel the code in the kexec control page
needs to switch to the ESA mode (31-bit), therefore the memory limit
is 2GB for s390. The allocation of the control page is done with
GFP_KERNEL in kimage_alloc_normal_control_pages. If the allocated page
is a target page in the kexec destination range or if its address is
larger than the memory limit, it is put on the list of extra pages
and another page is allocated until one is found that fits the
requirements.

With a large memory size this loop not only takes a long time to
complete (think 10 terabytes of memory), but eventually the OOM killer
steps in and terminates the program that called kexec load.
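
To illustrate the retry loop described above, here is a small userspace
sketch. It is not kernel code: the sim_* names, the 4096-byte page size and
the toy allocator that hands out pages from the top of memory downwards are
all assumptions made for illustration, and the real page allocator does not
behave this simply. The sketch only shows how the number of rejected pages
grows with the amount of memory above the limit, and how restricting the
allocation to a low zone avoids the retries.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_PAGE_SIZE UINT64_C(4096)       /* assumed page size */
#define SIM_LIMIT     (UINT64_C(1) << 31)  /* 2GB, mirrors the s390 limit */

/* Hypothetical allocation policies of the toy allocator. */
enum sim_policy { SIM_ANYWHERE, SIM_LOW_ZONE };

struct sim_mem {
	uint64_t top;     /* end of the free range, used for top-down allocation */
	uint64_t bottom;  /* start of the free range, used for low-zone allocation */
};

/* Hand out one page address, or UINT64_MAX when memory is exhausted. */
static uint64_t sim_alloc_page(struct sim_mem *m, enum sim_policy policy)
{
	uint64_t addr;

	if (m->bottom >= m->top)
		return UINT64_MAX;
	if (policy == SIM_LOW_ZONE) {
		addr = m->bottom;
		m->bottom += SIM_PAGE_SIZE;
		return addr;
	}
	m->top -= SIM_PAGE_SIZE;
	return m->top;
}

/*
 * Keep allocating until a page ends at or below SIM_LIMIT.
 * Rejected pages are only counted here; the kernel parks them on a
 * list of extra pages and frees them after the loop.
 */
static uint64_t sim_find_control_page(uint64_t mem_size,
				      enum sim_policy policy,
				      uint64_t *rejected)
{
	struct sim_mem m = { .top = mem_size, .bottom = 0 };
	uint64_t addr;

	assert(mem_size % SIM_PAGE_SIZE == 0);
	*rejected = 0;
	for (;;) {
		addr = sim_alloc_page(&m, policy);
		if (addr == UINT64_MAX)
			return UINT64_MAX;       /* ran out of memory */
		if (addr + SIM_PAGE_SIZE <= SIM_LIMIT)
			return addr;             /* page fits the limit */
		(*rejected)++;
	}
}
```

With 4GB of simulated memory, sim_find_control_page() under SIM_ANYWHERE
rejects every page above 2GB (524288 pages) before it finds one that fits,
while SIM_LOW_ZONE succeeds on the first attempt. The rejected count scales
linearly with the memory above the limit.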

The fix for s390 is to use a different GFP flag, GFP_DMA instead of
GFP_KERNEL. To do this a new #define for kexec is introduced that
can be overridden by the architecture.
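
The override mechanism is the usual #ifndef default pattern. A minimal
standalone sketch follows; the DEMO_* names and flag values are made up for
illustration and are not the real GFP constants. The "architecture" part
stands in for an arch header that is included before the generic one.

```c
#include <assert.h>

/* Made-up flag values; the real GFP_* flags are defined by the kernel. */
#define DEMO_GFP_KERNEL 0x1u
#define DEMO_GFP_DMA    0x2u

static_assert(DEMO_GFP_KERNEL != DEMO_GFP_DMA, "demo flags must differ");

/* "Architecture header": remove this line to fall back to the default. */
#define DEMO_CONTROL_MEMORY_GFP DEMO_GFP_DMA

/* "Generic header": supply a default only if the architecture gave none. */
#ifndef DEMO_CONTROL_MEMORY_GFP
#define DEMO_CONTROL_MEMORY_GFP DEMO_GFP_KERNEL
#endif

/* The single call site uses the macro and needs no per-arch #ifdef. */
static unsigned int demo_control_page_gfp(void)
{
	return DEMO_CONTROL_MEMORY_GFP;
}
```

Architectures that do not define the macro keep their current behaviour,
so only the one call site in the generic code has to change.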

Martin Schwidefsky (1):
kexec: allocate the kexec control page with KEXEC_CONTROL_MEMORY_GFP

arch/s390/include/asm/kexec.h | 3 +++
include/linux/kexec.h | 4 ++++
kernel/kexec.c | 2 +-
3 files changed, 8 insertions(+), 1 deletion(-)

--
1.9.1


2015-08-09 18:43:20

by Martin Schwidefsky

Subject: [PATCH] kexec: allocate the kexec control page with KEXEC_CONTROL_MEMORY_GFP

Introduce KEXEC_CONTROL_MEMORY_GFP to allow the architecture code
to override the gfp flags of the allocation for the kexec control
page. The loop in kimage_alloc_normal_control_pages allocates pages
with GFP_KERNEL until a page is found that happens to have an
address smaller than the KEXEC_CONTROL_MEMORY_LIMIT. On systems
with a large memory size but a small KEXEC_CONTROL_MEMORY_LIMIT
the loop will keep allocating memory until the OOM killer steps in.

Signed-off-by: Martin Schwidefsky <[email protected]>
---
arch/s390/include/asm/kexec.h | 3 +++
include/linux/kexec.h | 4 ++++
kernel/kexec.c | 2 +-
3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/kexec.h b/arch/s390/include/asm/kexec.h
index 694bcd6..2f924bc 100644
--- a/arch/s390/include/asm/kexec.h
+++ b/arch/s390/include/asm/kexec.h
@@ -26,6 +26,9 @@
 /* Not more than 2GB */
 #define KEXEC_CONTROL_MEMORY_LIMIT (1UL<<31)
 
+/* Allocate control page with GFP_DMA */
+#define KEXEC_CONTROL_MEMORY_GFP GFP_DMA
+
 /* Maximum address we can use for the crash control pages */
 #define KEXEC_CRASH_CONTROL_MEMORY_LIMIT (-1UL)

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index e60a745..e804306 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -40,6 +40,10 @@
 #error KEXEC_CONTROL_MEMORY_LIMIT not defined
 #endif
 
+#ifndef KEXEC_CONTROL_MEMORY_GFP
+#define KEXEC_CONTROL_MEMORY_GFP GFP_KERNEL
+#endif
+
 #ifndef KEXEC_CONTROL_PAGE_SIZE
 #error KEXEC_CONTROL_PAGE_SIZE not defined
 #endif
diff --git a/kernel/kexec.c b/kernel/kexec.c
index 38c25b1..7a36fdc 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -707,7 +707,7 @@ static struct page *kimage_alloc_normal_control_pages(struct kimage *image,
 	do {
 		unsigned long pfn, epfn, addr, eaddr;
 
-		pages = kimage_alloc_pages(GFP_KERNEL, order);
+		pages = kimage_alloc_pages(KEXEC_CONTROL_MEMORY_GFP, order);
 		if (!pages)
 			break;
 		pfn = page_to_pfn(pages);
--
1.9.1