2020-06-28 08:32:45

by chenzhou

[permalink] [raw]
Subject: [PATCH v9 0/5] support reserving crashkernel above 4G on arm64 kdump

This patch series enable reserving crashkernel above 4G in arm64.

There are following issues in arm64 kdump:
1. We use crashkernel=X to reserve crashkernel below 4G, which will fail
when there is no enough low memory.
2. Currently, crashkernel=Y@X can be used to reserve crashkernel above 4G,
in this case, if swiotlb or DMA buffers are required, crash dump kernel
will boot failure because there is no low memory available for allocation.
3. commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32") broken
the arm64 kdump. If the memory reserved for crash dump kernel falled in
ZONE_DMA32, the devices in crash dump kernel need to use ZONE_DMA will alloc
fail.

To solve these issues, introduce crashkernel=X,low to reserve specified
size low memory.
Crashkernel=X tries to reserve memory for the crash dump kernel under
4G. If crashkernel=Y,low is specified simultaneously, reserve spcified
size low memory for crash kdump kernel devices firstly and then reserve
memory above 4G.

When crashkernel is reserved above 4G in memory and crashkernel=X,low
is specified simultaneously, kernel should reserve specified size low memory
for crash dump kernel devices. So there may be two crash kernel regions, one
is below 4G, the other is above 4G.
In order to distinct from the high region and make no effect to the use of
kexec-tools, rename the low region as "Crash kernel (low)", and pass the
low region by reusing DT property "linux,usable-memory-range". We made the low
memory region as the last range of "linux,usable-memory-range" to keep
compatibility with existing user-space and older kdump kernels.

Besides, we need to modify kexec-tools:
arm64: support more than one crash kernel regions(see [1])

Another update is document about DT property 'linux,usable-memory-range':
schemas: update 'linux,usable-memory-range' node schema(see [2])

The previous changes and discussions can be retrieved from:

Changes since [v8]
- Reuse DT property "linux,usable-memory-range".
Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
memory region.
- Fix kdump broken with ZONE_DMA reintroduced.
- Update chosen schema.

Changes since [v7]
- Move x86 CRASH_ALIGN to 2M
Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
- Update Documentation/devicetree/bindings/chosen.txt.
Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
suggested by Arnd.
- Add Tested-by from Jhon and pk.

Changes since [v6]
- Fix build errors reported by kbuild test robot.

Changes since [v5]
- Move reserve_crashkernel_low() into kernel/crash_core.c.
- Delete crashkernel=X,high.
- Modify crashkernel=X,low.
If crashkernel=X,low is specified simultaneously, reserve spcified size low
memory for crash kdump kernel devices firstly and then reserve memory above 4G.
In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
pass to crash dump kernel by DT property "linux,low-memory-range".
- Update Documentation/admin-guide/kdump/kdump.rst.

Changes since [v4]
- Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.

Changes since [v3]
- Add memblock_cap_memory_ranges back for multiple ranges.
- Fix some compiling warnings.

Changes since [v2]
- Split patch "arm64: kdump: support reserving crashkernel above 4G" as
two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
patch.

Changes since [v1]:
- Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
- Remove memblock_cap_memory_ranges() i added in v1 and implement that
in fdt_enforce_memory_region().
There are at most two crash kernel regions, for two crash kernel regions
case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
and then remove the memory range in the middle.

[1]: http://lists.infradead.org/pipermail/kexec/2020-June/020737.html
[2]: https://github.com/robherring/dt-schema/pull/19
[v1]: https://lkml.org/lkml/2019/4/2/1174
[v2]: https://lkml.org/lkml/2019/4/9/86
[v3]: https://lkml.org/lkml/2019/4/9/306
[v4]: https://lkml.org/lkml/2019/4/15/273
[v5]: https://lkml.org/lkml/2019/5/6/1360
[v6]: https://lkml.org/lkml/2019/8/30/142
[v7]: https://lkml.org/lkml/2019/12/23/411
[v8]: https://lkml.org/lkml/2020/5/21/213

Chen Zhou (5):
x86: kdump: move reserve_crashkernel_low() into crash_core.c
arm64: kdump: reserve crashkenel above 4G for crash dump kernel
arm64: kdump: add memory for devices by DT property
linux,usable-memory-range
arm64: kdump: fix kdump broken with ZONE_DMA reintroduced
kdump: update Documentation about crashkernel on arm64

Documentation/admin-guide/kdump/kdump.rst | 13 ++-
.../admin-guide/kernel-parameters.txt | 17 +++-
arch/arm64/kernel/setup.c | 8 +-
arch/arm64/mm/init.c | 74 ++++++++++++---
arch/x86/kernel/setup.c | 66 ++------------
include/linux/crash_core.h | 3 +
include/linux/kexec.h | 2 -
kernel/crash_core.c | 90 +++++++++++++++++++
kernel/kexec_core.c | 17 ----
9 files changed, 196 insertions(+), 94 deletions(-)

--
2.20.1


2020-06-28 08:33:50

by chenzhou

[permalink] [raw]
Subject: [PATCH v9 4/5] arm64: kdump: fix kdump broken with ZONE_DMA reintroduced

commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32")
broken the arm64 kdump. If the memory reserved for crash dump kernel
falled in ZONE_DMA32, the devices in crash dump kernel need to use
ZONE_DMA will alloc fail.

This patch addressed the above issue based on "reserving crashkernel
above 4G". Originally, we reserve low memory below 4G, and now just need
to adjust memory limit to arm64_dma_phys_limit in reserve_crashkernel_low
if ZONE_DMA is enabled. That is, if there are devices need to use ZONE_DMA
in crash dump kernel, it is a good choice to use parameters
"crashkernel=X crashkernel=Y,low".

Signed-off-by: Chen Zhou <[email protected]>
---
kernel/crash_core.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index a7580d291c37..e8ecbbc761a3 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -320,6 +320,7 @@ int __init reserve_crashkernel_low(void)
unsigned long long base, low_base = 0, low_size = 0;
unsigned long total_low_mem;
int ret;
+ phys_addr_t crash_max = 1ULL << 32;

total_low_mem = memblock_mem_size(1UL << (32 - PAGE_SHIFT));

@@ -352,7 +353,11 @@ int __init reserve_crashkernel_low(void)
return 0;
}

- low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
+#ifdef CONFIG_ARM64
+ if (IS_ENABLED(CONFIG_ZONE_DMA))
+ crash_max = arm64_dma_phys_limit;
+#endif
+ low_base = memblock_find_in_range(0, crash_max, low_size, CRASH_ALIGN);
if (!low_base) {
pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n",
(unsigned long)(low_size >> 20));
--
2.20.1

2020-06-28 08:34:59

by chenzhou

[permalink] [raw]
Subject: [PATCH v9 1/5] x86: kdump: move reserve_crashkernel_low() into crash_core.c

In preparation for supporting reserve_crashkernel_low in arm64 as
x86_64 does, move reserve_crashkernel_low() into kernel/crash_core.c.

BTW, move x86_64 CRASH_ALIGN to 2M suggested by Dave. CONFIG_PHYSICAL_ALIGN
can be selected from 2M to 16M, move to the same as arm64.

Note, in arm64, we reserve low memory if and only if crashkernel=X,low
is specified. Different with x86_64, don't set low memory automatically.

Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Chen Zhou <[email protected]>
Tested-by: John Donnelly <[email protected]>
Tested-by: Prabhakar Kushwaha <[email protected]>
---
arch/x86/kernel/setup.c | 66 ++++-------------------------
include/linux/crash_core.h | 3 ++
include/linux/kexec.h | 2 -
kernel/crash_core.c | 85 ++++++++++++++++++++++++++++++++++++++
kernel/kexec_core.c | 17 --------
5 files changed, 96 insertions(+), 77 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index a3767e74c758..33db99ae3035 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -401,8 +401,8 @@ static void __init memblock_x86_reserve_range_setup_data(void)

#ifdef CONFIG_KEXEC_CORE

-/* 16M alignment for crash kernel regions */
-#define CRASH_ALIGN SZ_16M
+/* 2M alignment for crash kernel regions */
+#define CRASH_ALIGN SZ_2M

/*
* Keep the crash kernel below this limit.
@@ -425,59 +425,6 @@ static void __init memblock_x86_reserve_range_setup_data(void)
# define CRASH_ADDR_HIGH_MAX SZ_64T
#endif

-static int __init reserve_crashkernel_low(void)
-{
-#ifdef CONFIG_X86_64
- unsigned long long base, low_base = 0, low_size = 0;
- unsigned long total_low_mem;
- int ret;
-
- total_low_mem = memblock_mem_size(1UL << (32 - PAGE_SHIFT));
-
- /* crashkernel=Y,low */
- ret = parse_crashkernel_low(boot_command_line, total_low_mem, &low_size, &base);
- if (ret) {
- /*
- * two parts from kernel/dma/swiotlb.c:
- * -swiotlb size: user-specified with swiotlb= or default.
- *
- * -swiotlb overflow buffer: now hardcoded to 32k. We round it
- * to 8M for other buffers that may need to stay low too. Also
- * make sure we allocate enough extra low memory so that we
- * don't run out of DMA buffers for 32-bit devices.
- */
- low_size = max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20);
- } else {
- /* passed with crashkernel=0,low ? */
- if (!low_size)
- return 0;
- }
-
- low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
- if (!low_base) {
- pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n",
- (unsigned long)(low_size >> 20));
- return -ENOMEM;
- }
-
- ret = memblock_reserve(low_base, low_size);
- if (ret) {
- pr_err("%s: Error reserving crashkernel low memblock.\n", __func__);
- return ret;
- }
-
- pr_info("Reserving %ldMB of low memory at %ldMB for crashkernel (System low RAM: %ldMB)\n",
- (unsigned long)(low_size >> 20),
- (unsigned long)(low_base >> 20),
- (unsigned long)(total_low_mem >> 20));
-
- crashk_low_res.start = low_base;
- crashk_low_res.end = low_base + low_size - 1;
- insert_resource(&iomem_resource, &crashk_low_res);
-#endif
- return 0;
-}
-
static void __init reserve_crashkernel(void)
{
unsigned long long crash_size, crash_base, total_mem;
@@ -541,9 +488,12 @@ static void __init reserve_crashkernel(void)
return;
}

- if (crash_base >= (1ULL << 32) && reserve_crashkernel_low()) {
- memblock_free(crash_base, crash_size);
- return;
+ if (crash_base >= (1ULL << 32)) {
+ if (reserve_crashkernel_low()) {
+ memblock_free(crash_base, crash_size);
+ return;
+ }
+ insert_resource(&iomem_resource, &crashk_low_res);
}

pr_info("Reserving %ldMB of memory at %ldMB for crashkernel (System RAM: %ldMB)\n",
diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index 525510a9f965..4df8c0bff03e 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -63,6 +63,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
extern unsigned char *vmcoreinfo_data;
extern size_t vmcoreinfo_size;
extern u32 *vmcoreinfo_note;
+extern struct resource crashk_res;
+extern struct resource crashk_low_res;

Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
void *data, size_t data_len);
@@ -74,5 +76,6 @@ int parse_crashkernel_high(char *cmdline, unsigned long long system_ram,
unsigned long long *crash_size, unsigned long long *crash_base);
int parse_crashkernel_low(char *cmdline, unsigned long long system_ram,
unsigned long long *crash_size, unsigned long long *crash_base);
+int __init reserve_crashkernel_low(void);

#endif /* LINUX_CRASH_CORE_H */
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index ea67910ae6b7..a460afdbab0f 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -330,8 +330,6 @@ extern int kexec_load_disabled;

/* Location of a reserved region to hold the crash kernel.
*/
-extern struct resource crashk_res;
-extern struct resource crashk_low_res;
extern note_buf_t __percpu *crash_notes;

/* flag to track if kexec reboot is in progress */
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 9f1557b98468..a7580d291c37 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -7,6 +7,8 @@
#include <linux/crash_core.h>
#include <linux/utsname.h>
#include <linux/vmalloc.h>
+#include <linux/memblock.h>
+#include <linux/swiotlb.h>

#include <asm/page.h>
#include <asm/sections.h>
@@ -19,6 +21,22 @@ u32 *vmcoreinfo_note;
/* trusted vmcoreinfo, e.g. we can make a copy in the crash memory */
static unsigned char *vmcoreinfo_data_safecopy;

+/* Location of the reserved area for the crash kernel */
+struct resource crashk_res = {
+ .name = "Crash kernel",
+ .start = 0,
+ .end = 0,
+ .flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
+ .desc = IORES_DESC_CRASH_KERNEL
+};
+struct resource crashk_low_res = {
+ .name = "Crash kernel",
+ .start = 0,
+ .end = 0,
+ .flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
+ .desc = IORES_DESC_CRASH_KERNEL
+};
+
/*
* parsing the "crashkernel" commandline
*
@@ -292,6 +310,73 @@ int __init parse_crashkernel_low(char *cmdline,
"crashkernel=", suffix_tbl[SUFFIX_LOW]);
}

+#if defined(CONFIG_X86_64) || defined(CONFIG_ARM64)
+#define CRASH_ALIGN SZ_2M
+#endif
+
+int __init reserve_crashkernel_low(void)
+{
+#if defined(CONFIG_X86_64) || defined(CONFIG_ARM64)
+ unsigned long long base, low_base = 0, low_size = 0;
+ unsigned long total_low_mem;
+ int ret;
+
+ total_low_mem = memblock_mem_size(1UL << (32 - PAGE_SHIFT));
+
+ /* crashkernel=Y,low */
+ ret = parse_crashkernel_low(boot_command_line, total_low_mem, &low_size,
+ &base);
+ if (ret) {
+#ifdef CONFIG_X86_64
+ /*
+ * two parts from lib/swiotlb.c:
+ * -swiotlb size: user-specified with swiotlb= or default.
+ *
+ * -swiotlb overflow buffer: now hardcoded to 32k. We round it
+ * to 8M for other buffers that may need to stay low too. Also
+ * make sure we allocate enough extra low memory so that we
+ * don't run out of DMA buffers for 32-bit devices.
+ */
+ low_size = max(swiotlb_size_or_default() + (8UL << 20),
+ 256UL << 20);
+#else
+ /*
+ * in arm64, reserve low memory if and only if crashkernel=X,low
+ * specified.
+ */
+ return -EINVAL;
+#endif
+ } else {
+ /* passed with crashkernel=0,low ? */
+ if (!low_size)
+ return 0;
+ }
+
+ low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
+ if (!low_base) {
+ pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n",
+ (unsigned long)(low_size >> 20));
+ return -ENOMEM;
+ }
+
+ ret = memblock_reserve(low_base, low_size);
+ if (ret) {
+ pr_err("%s: Error reserving crashkernel low memblock.\n",
+ __func__);
+ return ret;
+ }
+
+ pr_info("Reserving %ldMB of low memory at %ldMB for crashkernel (System low RAM: %ldMB)\n",
+ (unsigned long)(low_size >> 20),
+ (unsigned long)(low_base >> 20),
+ (unsigned long)(total_low_mem >> 20));
+
+ crashk_low_res.start = low_base;
+ crashk_low_res.end = low_base + low_size - 1;
+#endif
+ return 0;
+}
+
Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
void *data, size_t data_len)
{
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index c19c0dad1ebe..db66bbabfff3 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -53,23 +53,6 @@ note_buf_t __percpu *crash_notes;
/* Flag to indicate we are going to kexec a new kernel */
bool kexec_in_progress = false;

-
-/* Location of the reserved area for the crash kernel */
-struct resource crashk_res = {
- .name = "Crash kernel",
- .start = 0,
- .end = 0,
- .flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
- .desc = IORES_DESC_CRASH_KERNEL
-};
-struct resource crashk_low_res = {
- .name = "Crash kernel",
- .start = 0,
- .end = 0,
- .flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
- .desc = IORES_DESC_CRASH_KERNEL
-};
-
int kexec_should_crash(struct task_struct *p)
{
/*
--
2.20.1

2020-06-29 21:33:33

by John Donnelly

[permalink] [raw]
Subject: Re: [PATCH v9 0/5] support reserving crashkernel above 4G on arm64 kdump

Hi ,

> On Jun 28, 2020, at 3:34 AM, Chen Zhou <[email protected]> wrote:
>
> This patch series enable reserving crashkernel above 4G in arm64.
>
> There are following issues in arm64 kdump:
> 1. We use crashkernel=X to reserve crashkernel below 4G, which will fail
> when there is no enough low memory.
> 2. Currently, crashkernel=Y@X can be used to reserve crashkernel above 4G,
> in this case, if swiotlb or DMA buffers are required, crash dump kernel
> will boot failure because there is no low memory available for allocation.
> 3. commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32") broken
> the arm64 kdump. If the memory reserved for crash dump kernel falled in
> ZONE_DMA32, the devices in crash dump kernel need to use ZONE_DMA will alloc
> fail.
>
> To solve these issues, introduce crashkernel=X,low to reserve specified
> size low memory.
> Crashkernel=X tries to reserve memory for the crash dump kernel under
> 4G. If crashkernel=Y,low is specified simultaneously, reserve spcified
> size low memory for crash kdump kernel devices firstly and then reserve
> memory above 4G.
>
> When crashkernel is reserved above 4G in memory and crashkernel=X,low
> is specified simultaneously, kernel should reserve specified size low memory
> for crash dump kernel devices. So there may be two crash kernel regions, one
> is below 4G, the other is above 4G.
> In order to distinct from the high region and make no effect to the use of
> kexec-tools, rename the low region as "Crash kernel (low)", and pass the
> low region by reusing DT property "linux,usable-memory-range". We made the low
> memory region as the last range of "linux,usable-memory-range" to keep
> compatibility with existing user-space and older kdump kernels.
>
> Besides, we need to modify kexec-tools:
> arm64: support more than one crash kernel regions(see [1])
>
> Another update is document about DT property 'linux,usable-memory-range':
> schemas: update 'linux,usable-memory-range' node schema(see [2])
>
> The previous changes and discussions can be retrieved from:
>
> Changes since [v8]
> - Reuse DT property "linux,usable-memory-range".
> Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
> memory region.
> - Fix kdump broken with ZONE_DMA reintroduced.
> - Update chosen schema.


Nice job Chen,

Does this need a Ack-by from the Raspberry maintainers on this ZONE_DMA fix ?

This activity has been going on for over a year now. Can we please get this finalized and merged ?

Thank you,
John.



>
> Changes since [v7]
> - Move x86 CRASH_ALIGN to 2M
> Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
> - Update Documentation/devicetree/bindings/chosen.txt.
> Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
> suggested by Arnd.
> - Add Tested-by from Jhon and pk.
>
> Changes since [v6]
> - Fix build errors reported by kbuild test robot.
>
> Changes since [v5]
> - Move reserve_crashkernel_low() into kernel/crash_core.c.
> - Delete crashkernel=X,high.
> - Modify crashkernel=X,low.
> If crashkernel=X,low is specified simultaneously, reserve spcified size low
> memory for crash kdump kernel devices firstly and then reserve memory above 4G.
> In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
> pass to crash dump kernel by DT property "linux,low-memory-range".
> - Update Documentation/admin-guide/kdump/kdump.rst.
>
> Changes since [v4]
> - Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
>
> Changes since [v3]
> - Add memblock_cap_memory_ranges back for multiple ranges.
> - Fix some compiling warnings.
>
> Changes since [v2]
> - Split patch "arm64: kdump: support reserving crashkernel above 4G" as
> two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
> patch.
>
> Changes since [v1]:
> - Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
> - Remove memblock_cap_memory_ranges() i added in v1 and implement that
> in fdt_enforce_memory_region().
> There are at most two crash kernel regions, for two crash kernel regions
> case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
> and then remove the memory range in the middle.
>
> [1]: https://urldefense.com/v3/__http://lists.infradead.org/pipermail/kexec/2020-June/020737.html__;!!GqivPVa7Brio!OnPQrQrzeDmjwp6OTFe3rN1ddb-AUny-Wq5vlaMfxI3rSentYYQy-2H91dqbw-1A43Ss$
> [2]: https://urldefense.com/v3/__https://github.com/robherring/dt-schema/pull/19__;!!GqivPVa7Brio!OnPQrQrzeDmjwp6OTFe3rN1ddb-AUny-Wq5vlaMfxI3rSentYYQy-2H91dqbw9xTB8yT$
> [v1]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/2/1174__;!!GqivPVa7Brio!OnPQrQrzeDmjwp6OTFe3rN1ddb-AUny-Wq5vlaMfxI3rSentYYQy-2H91dqbw2EAOIhM$
> [v2]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/9/86__;!!GqivPVa7Brio!OnPQrQrzeDmjwp6OTFe3rN1ddb-AUny-Wq5vlaMfxI3rSentYYQy-2H91dqbwx3ILnLL$
> [v3]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/9/306__;!!GqivPVa7Brio!OnPQrQrzeDmjwp6OTFe3rN1ddb-AUny-Wq5vlaMfxI3rSentYYQy-2H91dqbw5YJeYoP$
> [v4]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/15/273__;!!GqivPVa7Brio!OnPQrQrzeDmjwp6OTFe3rN1ddb-AUny-Wq5vlaMfxI3rSentYYQy-2H91dqbwxbrySTW$
> [v5]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/5/6/1360__;!!GqivPVa7Brio!OnPQrQrzeDmjwp6OTFe3rN1ddb-AUny-Wq5vlaMfxI3rSentYYQy-2H91dqbwzGOGcMs$
> [v6]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/8/30/142__;!!GqivPVa7Brio!OnPQrQrzeDmjwp6OTFe3rN1ddb-AUny-Wq5vlaMfxI3rSentYYQy-2H91dqbw70P0bKy$
> [v7]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/12/23/411__;!!GqivPVa7Brio!OnPQrQrzeDmjwp6OTFe3rN1ddb-AUny-Wq5vlaMfxI3rSentYYQy-2H91dqbw29m6Wrh$
> [v8]: https://urldefense.com/v3/__https://lkml.org/lkml/2020/5/21/213__;!!GqivPVa7Brio!OnPQrQrzeDmjwp6OTFe3rN1ddb-AUny-Wq5vlaMfxI3rSentYYQy-2H91dqbw-jGzScF$
>
> Chen Zhou (5):
> x86: kdump: move reserve_crashkernel_low() into crash_core.c
> arm64: kdump: reserve crashkenel above 4G for crash dump kernel
> arm64: kdump: add memory for devices by DT property
> linux,usable-memory-range
> arm64: kdump: fix kdump broken with ZONE_DMA reintroduced
> kdump: update Documentation about crashkernel on arm64
>
> Documentation/admin-guide/kdump/kdump.rst | 13 ++-
> .../admin-guide/kernel-parameters.txt | 17 +++-
> arch/arm64/kernel/setup.c | 8 +-
> arch/arm64/mm/init.c | 74 ++++++++++++---
> arch/x86/kernel/setup.c | 66 ++------------
> include/linux/crash_core.h | 3 +
> include/linux/kexec.h | 2 -
> kernel/crash_core.c | 90 +++++++++++++++++++
> kernel/kexec_core.c | 17 ----
> 9 files changed, 196 insertions(+), 94 deletions(-)
>
> --
> 2.20.1
>

2020-07-02 02:50:22

by Dave Young

[permalink] [raw]
Subject: Re: [PATCH v9 1/5] x86: kdump: move reserve_crashkernel_low() into crash_core.c

On 06/28/20 at 04:34pm, Chen Zhou wrote:
> In preparation for supporting reserve_crashkernel_low in arm64 as
> x86_64 does, move reserve_crashkernel_low() into kernel/crash_core.c.
>
> BTW, move x86_64 CRASH_ALIGN to 2M suggested by Dave. CONFIG_PHYSICAL_ALIGN
> can be selected from 2M to 16M, move to the same as arm64.
>
> Note, in arm64, we reserve low memory if and only if crashkernel=X,low
> is specified. Different with x86_64, don't set low memory automatically.
>
> Reported-by: kbuild test robot <[email protected]>
> Signed-off-by: Chen Zhou <[email protected]>
> Tested-by: John Donnelly <[email protected]>
> Tested-by: Prabhakar Kushwaha <[email protected]>
> ---
> arch/x86/kernel/setup.c | 66 ++++-------------------------
> include/linux/crash_core.h | 3 ++
> include/linux/kexec.h | 2 -
> kernel/crash_core.c | 85 ++++++++++++++++++++++++++++++++++++++
> kernel/kexec_core.c | 17 --------
> 5 files changed, 96 insertions(+), 77 deletions(-)
>
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index a3767e74c758..33db99ae3035 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -401,8 +401,8 @@ static void __init memblock_x86_reserve_range_setup_data(void)
>
> #ifdef CONFIG_KEXEC_CORE
>
> -/* 16M alignment for crash kernel regions */
> -#define CRASH_ALIGN SZ_16M
> +/* 2M alignment for crash kernel regions */
> +#define CRASH_ALIGN SZ_2M
>
> /*
> * Keep the crash kernel below this limit.
> @@ -425,59 +425,6 @@ static void __init memblock_x86_reserve_range_setup_data(void)
> # define CRASH_ADDR_HIGH_MAX SZ_64T
> #endif
>
> -static int __init reserve_crashkernel_low(void)
> -{
> -#ifdef CONFIG_X86_64
> - unsigned long long base, low_base = 0, low_size = 0;
> - unsigned long total_low_mem;
> - int ret;
> -
> - total_low_mem = memblock_mem_size(1UL << (32 - PAGE_SHIFT));
> -
> - /* crashkernel=Y,low */
> - ret = parse_crashkernel_low(boot_command_line, total_low_mem, &low_size, &base);
> - if (ret) {
> - /*
> - * two parts from kernel/dma/swiotlb.c:
> - * -swiotlb size: user-specified with swiotlb= or default.
> - *
> - * -swiotlb overflow buffer: now hardcoded to 32k. We round it
> - * to 8M for other buffers that may need to stay low too. Also
> - * make sure we allocate enough extra low memory so that we
> - * don't run out of DMA buffers for 32-bit devices.
> - */
> - low_size = max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20);
> - } else {
> - /* passed with crashkernel=0,low ? */
> - if (!low_size)
> - return 0;
> - }
> -
> - low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
> - if (!low_base) {
> - pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n",
> - (unsigned long)(low_size >> 20));
> - return -ENOMEM;
> - }
> -
> - ret = memblock_reserve(low_base, low_size);
> - if (ret) {
> - pr_err("%s: Error reserving crashkernel low memblock.\n", __func__);
> - return ret;
> - }
> -
> - pr_info("Reserving %ldMB of low memory at %ldMB for crashkernel (System low RAM: %ldMB)\n",
> - (unsigned long)(low_size >> 20),
> - (unsigned long)(low_base >> 20),
> - (unsigned long)(total_low_mem >> 20));
> -
> - crashk_low_res.start = low_base;
> - crashk_low_res.end = low_base + low_size - 1;
> - insert_resource(&iomem_resource, &crashk_low_res);
> -#endif
> - return 0;
> -}
> -
> static void __init reserve_crashkernel(void)
> {
> unsigned long long crash_size, crash_base, total_mem;
> @@ -541,9 +488,12 @@ static void __init reserve_crashkernel(void)
> return;
> }
>
> - if (crash_base >= (1ULL << 32) && reserve_crashkernel_low()) {
> - memblock_free(crash_base, crash_size);
> - return;
> + if (crash_base >= (1ULL << 32)) {
> + if (reserve_crashkernel_low()) {
> + memblock_free(crash_base, crash_size);
> + return;
> + }
> + insert_resource(&iomem_resource, &crashk_low_res);
> }
>
> pr_info("Reserving %ldMB of memory at %ldMB for crashkernel (System RAM: %ldMB)\n",
> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> index 525510a9f965..4df8c0bff03e 100644
> --- a/include/linux/crash_core.h
> +++ b/include/linux/crash_core.h
> @@ -63,6 +63,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
> extern unsigned char *vmcoreinfo_data;
> extern size_t vmcoreinfo_size;
> extern u32 *vmcoreinfo_note;
> +extern struct resource crashk_res;
> +extern struct resource crashk_low_res;
>
> Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
> void *data, size_t data_len);
> @@ -74,5 +76,6 @@ int parse_crashkernel_high(char *cmdline, unsigned long long system_ram,
> unsigned long long *crash_size, unsigned long long *crash_base);
> int parse_crashkernel_low(char *cmdline, unsigned long long system_ram,
> unsigned long long *crash_size, unsigned long long *crash_base);
> +int __init reserve_crashkernel_low(void);
>
> #endif /* LINUX_CRASH_CORE_H */
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index ea67910ae6b7..a460afdbab0f 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -330,8 +330,6 @@ extern int kexec_load_disabled;
>
> /* Location of a reserved region to hold the crash kernel.
> */
> -extern struct resource crashk_res;
> -extern struct resource crashk_low_res;
> extern note_buf_t __percpu *crash_notes;
>
> /* flag to track if kexec reboot is in progress */
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 9f1557b98468..a7580d291c37 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -7,6 +7,8 @@
> #include <linux/crash_core.h>
> #include <linux/utsname.h>
> #include <linux/vmalloc.h>
> +#include <linux/memblock.h>
> +#include <linux/swiotlb.h>
>
> #include <asm/page.h>
> #include <asm/sections.h>
> @@ -19,6 +21,22 @@ u32 *vmcoreinfo_note;
> /* trusted vmcoreinfo, e.g. we can make a copy in the crash memory */
> static unsigned char *vmcoreinfo_data_safecopy;
>
> +/* Location of the reserved area for the crash kernel */
> +struct resource crashk_res = {
> + .name = "Crash kernel",
> + .start = 0,
> + .end = 0,
> + .flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
> + .desc = IORES_DESC_CRASH_KERNEL
> +};
> +struct resource crashk_low_res = {
> + .name = "Crash kernel",
> + .start = 0,
> + .end = 0,
> + .flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
> + .desc = IORES_DESC_CRASH_KERNEL
> +};
> +
> /*
> * parsing the "crashkernel" commandline
> *
> @@ -292,6 +310,73 @@ int __init parse_crashkernel_low(char *cmdline,
> "crashkernel=", suffix_tbl[SUFFIX_LOW]);
> }
>
> +#if defined(CONFIG_X86_64) || defined(CONFIG_ARM64)
> +#define CRASH_ALIGN SZ_2M
> +#endif
> +
> +int __init reserve_crashkernel_low(void)
> +{
> +#if defined(CONFIG_X86_64) || defined(CONFIG_ARM64)
> + unsigned long long base, low_base = 0, low_size = 0;
> + unsigned long total_low_mem;
> + int ret;
> +
> + total_low_mem = memblock_mem_size(1UL << (32 - PAGE_SHIFT));
> +
> + /* crashkernel=Y,low */
> + ret = parse_crashkernel_low(boot_command_line, total_low_mem, &low_size,
> + &base);
> + if (ret) {
> +#ifdef CONFIG_X86_64
> + /*
> + * two parts from lib/swiotlb.c:
> + * -swiotlb size: user-specified with swiotlb= or default.
> + *
> + * -swiotlb overflow buffer: now hardcoded to 32k. We round it
> + * to 8M for other buffers that may need to stay low too. Also
> + * make sure we allocate enough extra low memory so that we
> + * don't run out of DMA buffers for 32-bit devices.
> + */
> + low_size = max(swiotlb_size_or_default() + (8UL << 20),
> + 256UL << 20);
> +#else
> + /*
> + * in arm64, reserve low memory if and only if crashkernel=X,low
> + * specified.
> + */
> + return -EINVAL;
> +#endif
> + } else {
> + /* passed with crashkernel=0,low ? */
> + if (!low_size)
> + return 0;
> + }
> +
> + low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
> + if (!low_base) {
> + pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n",
> + (unsigned long)(low_size >> 20));
> + return -ENOMEM;
> + }
> +
> + ret = memblock_reserve(low_base, low_size);
> + if (ret) {
> + pr_err("%s: Error reserving crashkernel low memblock.\n",
> + __func__);
> + return ret;
> + }
> +
> + pr_info("Reserving %ldMB of low memory at %ldMB for crashkernel (System low RAM: %ldMB)\n",
> + (unsigned long)(low_size >> 20),
> + (unsigned long)(low_base >> 20),
> + (unsigned long)(total_low_mem >> 20));
> +
> + crashk_low_res.start = low_base;
> + crashk_low_res.end = low_base + low_size - 1;
> +#endif
> + return 0;
> +}
> +
> Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
> void *data, size_t data_len)
> {
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index c19c0dad1ebe..db66bbabfff3 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -53,23 +53,6 @@ note_buf_t __percpu *crash_notes;
> /* Flag to indicate we are going to kexec a new kernel */
> bool kexec_in_progress = false;
>
> -
> -/* Location of the reserved area for the crash kernel */
> -struct resource crashk_res = {
> - .name = "Crash kernel",
> - .start = 0,
> - .end = 0,
> - .flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
> - .desc = IORES_DESC_CRASH_KERNEL
> -};
> -struct resource crashk_low_res = {
> - .name = "Crash kernel",
> - .start = 0,
> - .end = 0,
> - .flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
> - .desc = IORES_DESC_CRASH_KERNEL
> -};
> -
> int kexec_should_crash(struct task_struct *p)
> {
> /*
> --
> 2.20.1
>
>
> _______________________________________________
> kexec mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/kexec
>

Acked-by: Dave Young <[email protected]>

Thanks
Dave