2020-08-01 13:06:44

by chenzhou

Subject: [PATCH v11 0/5] support reserving crashkernel above 4G on arm64 kdump

The current arm64 kdump has the following issues:
1. crashkernel=X reserves the crash kernel region below 4G, which
fails when there is not enough low memory.
2. If the crash kernel region is reserved above 4G, the crash dump
kernel fails to boot because no low memory is available for
allocation.
3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
if the memory reserved for the crash dump kernel falls in ZONE_DMA32,
devices in the crash dump kernel that need ZONE_DMA fail to
allocate memory.

To solve these issues, change the behavior of crashkernel=X.
crashkernel=X tries a low allocation in ZONE_DMA first, and falls back
to a high allocation if that fails.

If the requested size X is too large and leaves very little free memory
in ZONE_DMA after a low allocation, the system may not work normally.
So add a threshold and go for a high allocation directly if the
requested size exceeds it. The threshold is set to half of the low
memory.

If crash_base is outside ZONE_DMA, try to allocate at least 256M in
ZONE_DMA automatically. "crashkernel=Y,low" can be used to specify the
size of the low memory reservation.
On non-RPi4 platforms, read ZONE_DMA mentioned above as ZONE_DMA32.
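
The placement policy described above can be sketched in plain C. This is
only an illustration of the decision, not the kernel code: pick_placement(),
enum crash_placement and the low_alloc_ok flag are hypothetical names, and
the real implementation searches memblock ranges instead of taking a flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome labels; the patch itself works on memblock ranges. */
enum crash_placement { CRASH_LOW, CRASH_HIGH };

/*
 * Decide where a crashkernel=X request (without @offset) would be placed.
 * low_mem_total: bytes of memory below the low limit (ZONE_DMA or ZONE_DMA32).
 * low_alloc_ok:  assumed result of searching for crash_size bytes below
 *                that limit.
 */
static enum crash_placement pick_placement(uint64_t crash_size,
					   uint64_t low_mem_total,
					   bool low_alloc_ok)
{
	/* Threshold is half of the low memory. */
	uint64_t threshold = low_mem_total / 2;

	/* Requests above the threshold skip the low attempt entirely. */
	if (crash_size <= threshold && low_alloc_ok)
		return CRASH_LOW;

	/* Either too large, or the low attempt failed: go high. */
	return CRASH_HIGH;
}
```

With 1G of low memory, for example, a 256M request is tried low first,
while a 768M request goes straight to a high allocation.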

When reserving crashkernel in high memory, some low memory is reserved
for crash dump kernel devices. So there may be two regions reserved for
the crash dump kernel, one below 4G and the other above 4G.
In order to distinguish it from the high region without affecting
existing kexec-tools, the low region is named "Crash kernel (low)" and
is passed to the crash dump kernel by reusing the DT property
"linux,usable-memory-range". The low memory region is made the last
range of "linux,usable-memory-range" to keep compatibility with existing
user-space and older kdump kernels.

Besides, kexec-tools needs a corresponding change:
arm64: support more than one crash kernel regions (see [1])

The documentation of the DT property 'linux,usable-memory-range' is
also updated:
schemas: update 'linux,usable-memory-range' node schema (see [2])

Changes since [v10]
- Reimplement crashkernel=X as suggested by Catalin. Many thanks to Catalin.

Changes since [v9]
- Patch 1 add Acked-by from Dave.
- Update patch 5 according to Dave's comments.
- Update chosen schema.

Changes since [v8]
- Reuse DT property "linux,usable-memory-range".
Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
memory region.
- Fix kdump broken with ZONE_DMA reintroduced.
- Update chosen schema.

Changes since [v7]
- Move x86 CRASH_ALIGN to 2M
As suggested by Dave and after some testing, move x86 CRASH_ALIGN to 2M.
- Update Documentation/devicetree/bindings/chosen.txt.
Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
suggested by Arnd.
- Add Tested-by from Jhon and pk.

Changes since [v6]
- Fix build errors reported by kbuild test robot.

Changes since [v5]
- Move reserve_crashkernel_low() into kernel/crash_core.c.
- Delete crashkernel=X,high.
- Modify crashkernel=X,low.
If crashkernel=X,low is specified simultaneously, first reserve the specified
size of low memory for crash dump kernel devices, then reserve memory above 4G.
In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
pass it to the crash dump kernel via DT property "linux,low-memory-range".
- Update Documentation/admin-guide/kdump/kdump.rst.

Changes since [v4]
- Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.

Changes since [v3]
- Add memblock_cap_memory_ranges back for multiple ranges.
- Fix some compiling warnings.

Changes since [v2]
- Split patch "arm64: kdump: support reserving crashkernel above 4G" as
two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
patch.

Changes since [v1]:
- Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
- Remove memblock_cap_memory_ranges() I added in v1 and implement that
in fdt_enforce_memory_region().
There are at most two crash kernel regions. In the two-region case,
we cap the memory range [min(regs[*].start), max(regs[*].end)]
and then remove the memory range in the middle.

[1]: http://lists.infradead.org/pipermail/kexec/2020-June/020737.html
[2]: https://github.com/robherring/dt-schema/pull/19
[v1]: https://lkml.org/lkml/2019/4/2/1174
[v2]: https://lkml.org/lkml/2019/4/9/86
[v3]: https://lkml.org/lkml/2019/4/9/306
[v4]: https://lkml.org/lkml/2019/4/15/273
[v5]: https://lkml.org/lkml/2019/5/6/1360
[v6]: https://lkml.org/lkml/2019/8/30/142
[v7]: https://lkml.org/lkml/2019/12/23/411
[v8]: https://lkml.org/lkml/2020/5/21/213
[v9]: https://lkml.org/lkml/2020/6/28/73
[v10]: https://lkml.org/lkml/2020/7/2/1443

Chen Zhou (5):
arm64: kdump: add macro CRASH_ALIGN and CRASH_ADDR_LOW_MAX
x86: kdump: move reserve_crashkernel_low() into crash_core.c
arm64: kdump: reimplement crashkernel=X
arm64: kdump: add memory for devices by DT property
linux,usable-memory-range
kdump: update Documentation about crashkernel

Documentation/admin-guide/kdump/kdump.rst | 21 +++-
.../admin-guide/kernel-parameters.txt | 11 ++-
arch/arm64/include/asm/kexec.h | 9 ++
arch/arm64/include/asm/processor.h | 1 +
arch/arm64/kernel/setup.c | 8 +-
arch/arm64/mm/init.c | 99 +++++++++++++++----
arch/x86/include/asm/kexec.h | 24 +++++
arch/x86/kernel/setup.c | 86 ++--------------
include/linux/crash_core.h | 3 +
include/linux/kexec.h | 2 -
kernel/crash_core.c | 74 ++++++++++++++
kernel/kexec_core.c | 17 ----
12 files changed, 233 insertions(+), 122 deletions(-)

--
2.20.1


2020-08-01 13:06:47

by chenzhou

Subject: [PATCH v11 4/5] arm64: kdump: add memory for devices by DT property linux,usable-memory-range

When reserving crashkernel in high memory, some low memory is reserved
for crash dump kernel devices and never mapped by the first kernel.
This memory range is advertised to the crash dump kernel via a DT
property under /chosen,
linux,usable-memory-range = <BASE1 SIZE1 [BASE2 SIZE2]>

We reuse the DT property linux,usable-memory-range and make the low
memory region the second range "BASE2 SIZE2", which keeps compatibility
with existing user-space and older kdump kernels.

The crash dump kernel reads this property at boot time and calls
memblock_add() to add the low memory region after
memblock_cap_memory_range() has been called.
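
As a rough user-space illustration of the property layout described above,
the sketch below decodes up to two <BASE SIZE> pairs from a raw big-endian
cell array. It is not the kernel parser: struct mem_range, next_cells() and
parse_usable_ranges() are hypothetical names, and the cell counts are
passed in explicitly instead of coming from the DT root node.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_USABLE_RANGES 2

/* Hypothetical stand-in for the kernel's struct memblock_region. */
struct mem_range {
	uint64_t base;
	uint64_t size;
};

/* Read `cells` big-endian 32-bit cells and advance the cursor. */
static uint64_t next_cells(const uint8_t **p, int cells)
{
	uint64_t v = 0;

	while (cells--) {
		uint32_t cell = ((uint32_t)(*p)[0] << 24) |
				((uint32_t)(*p)[1] << 16) |
				((uint32_t)(*p)[2] << 8) |
				(uint32_t)(*p)[3];

		v = (v << 32) | cell;
		*p += 4;
	}
	return v;
}

/*
 * Parse at most MAX_USABLE_RANGES <base size> pairs from the property
 * bytes. Returns the number of ranges stored in out[]; a trailing
 * partial entry is ignored.
 */
static int parse_usable_ranges(const uint8_t *prop, size_t len,
			       int addr_cells, int size_cells,
			       struct mem_range *out)
{
	const uint8_t *end = prop + len;
	size_t entry = (size_t)(addr_cells + size_cells) * 4;
	int nr = 0;

	while (nr < MAX_USABLE_RANGES && (size_t)(end - prop) >= entry) {
		out[nr].base = next_cells(&prop, addr_cells);
		out[nr].size = next_cells(&prop, size_cells);
		nr++;
	}
	return nr;
}
```

A property holding a single pair yields one range, which matches what an
older crash dump kernel would read; when a second pair is present, out[1]
is the low region.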

Signed-off-by: Chen Zhou <[email protected]>
---
arch/arm64/mm/init.c | 44 ++++++++++++++++++++++++++++++++++----------
1 file changed, 34 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 53c8916fd32f..f385a8281d1b 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -69,6 +69,16 @@ EXPORT_SYMBOL(vmemmap);
phys_addr_t arm64_dma_phys_limit __ro_after_init;
phys_addr_t arm64_dma32_phys_limit __ro_after_init;

+/*
+ * The main usage of linux,usable-memory-range is for crash dump kernel.
+ * Originally, the number of usable-memory regions is one. Now there may
+ * be two regions, low region and high region.
+ * To make compatibility with existing user-space and older kdump, the low
+ * region is always the last range of linux,usable-memory-range if exist.
+ */
+#define MAX_USABLE_RANGES 2
+
+
#ifdef CONFIG_KEXEC_CORE

/*
@@ -286,9 +296,9 @@ early_param("mem", early_mem);
static int __init early_init_dt_scan_usablemem(unsigned long node,
const char *uname, int depth, void *data)
{
- struct memblock_region *usablemem = data;
- const __be32 *reg;
- int len;
+ struct memblock_region *usable_rgns = data;
+ const __be32 *reg, *endp;
+ int len, nr = 0;

if (depth != 1 || strcmp(uname, "chosen") != 0)
return 0;
@@ -297,22 +307,36 @@ static int __init early_init_dt_scan_usablemem(unsigned long node,
if (!reg || (len < (dt_root_addr_cells + dt_root_size_cells)))
return 1;

- usablemem->base = dt_mem_next_cell(dt_root_addr_cells, &reg);
- usablemem->size = dt_mem_next_cell(dt_root_size_cells, &reg);
+ endp = reg + (len / sizeof(__be32));
+ while ((endp - reg) >= (dt_root_addr_cells + dt_root_size_cells)) {
+ usable_rgns[nr].base = dt_mem_next_cell(dt_root_addr_cells, &reg);
+ usable_rgns[nr].size = dt_mem_next_cell(dt_root_size_cells, &reg);
+
+ if (++nr >= MAX_USABLE_RANGES)
+ break;
+ }

return 1;
}

static void __init fdt_enforce_memory_region(void)
{
- struct memblock_region reg = {
- .size = 0,
+ struct memblock_region usable_rgns[MAX_USABLE_RANGES] = {
+ { .size = 0 },
+ { .size = 0 }
};

- of_scan_flat_dt(early_init_dt_scan_usablemem, &reg);
+ of_scan_flat_dt(early_init_dt_scan_usablemem, &usable_rgns);

- if (reg.size)
- memblock_cap_memory_range(reg.base, reg.size);
+ /*
+ * The first range of usable-memory regions is for crash dump
+ * kernel with only one region or for high region with two regions,
+ * the second range is dedicated for low region if exist.
+ */
+ if (usable_rgns[0].size)
+ memblock_cap_memory_range(usable_rgns[0].base, usable_rgns[0].size);
+ if (usable_rgns[1].size)
+ memblock_add(usable_rgns[1].base, usable_rgns[1].size);
}

void __init arm64_memblock_init(void)
--
2.20.1

2020-08-01 13:06:50

by chenzhou

Subject: [PATCH v11 5/5] kdump: update Documentation about crashkernel

The behavior of crashkernel=X has been changed: it now tries a low
allocation in ZONE_DMA first, and falls back to a high allocation if
that fails.

If the requested size X is too large and leaves very little free memory
in ZONE_DMA after a low allocation, the system may not work well.
So add a threshold and go for a high allocation directly if the
requested size exceeds it. The threshold is set to half of low memory.

If crash_base is outside ZONE_DMA, try to allocate at least 256M in
ZONE_DMA automatically. "crashkernel=Y,low" can be used to specify the
size of the low memory reservation. On non-RPi4 platforms, read
ZONE_DMA mentioned above as ZONE_DMA32.

Update the documentation accordingly.

Signed-off-by: Chen Zhou <[email protected]>
---
Documentation/admin-guide/kdump/kdump.rst | 21 ++++++++++++++++---
.../admin-guide/kernel-parameters.txt | 11 ++++++++--
2 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
index 2da65fef2a1c..4b58f97351d5 100644
--- a/Documentation/admin-guide/kdump/kdump.rst
+++ b/Documentation/admin-guide/kdump/kdump.rst
@@ -299,7 +299,15 @@ Boot into System Kernel
"crashkernel=64M@16M" tells the system kernel to reserve 64 MB of memory
starting at physical address 0x01000000 (16MB) for the dump-capture kernel.

- On x86 and x86_64, use "crashkernel=64M@16M".
+ On x86 use "crashkernel=64M@16M".
+
+ On x86_64, use "crashkernel=X" to select a region under 4G first, and
+ fall back to reserve region above 4G.
+ We can also use "crashkernel=X,high" to select a region above 4G, which
+ also tries to allocate at least 256M below 4G automatically and
+ "crashkernel=Y,low" can be used to allocate specified size low memory.
+ Use "crashkernel=Y@X" if you really have to reserve memory from specified
+ start address X.

On ppc64, use "crashkernel=128M@32M".

@@ -316,8 +324,15 @@ Boot into System Kernel
kernel will automatically locate the crash kernel image within the
first 512MB of RAM if X is not given.

- On arm64, use "crashkernel=Y[@X]". Note that the start address of
- the kernel, X if explicitly specified, must be aligned to 2MiB (0x200000).
+ On arm64, use "crashkernel=X" to try low allocation in ZONE_DMA, and
+ fall back to high allocation if it fails. And go for high allocation
+ directly if the required size is too large. If crash_base is outside
+ ZONE_DMA, try to allocate at least 256M in ZONE_DMA automatically.
+ "crashkernel=Y,low" can be used to allocate specified size low memory.
+ For non-RPi4 platforms, change ZONE_DMA memtioned above to ZONE_DMA32.
+ Use "crashkernel=Y@X" if you really have to reserve memory from
+ specified start address X. Note that the start address of the kernel,
+ X if explicitly specified, must be aligned to 2MiB (0x200000).

Load the Dump-capture Kernel
============================
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index fb95fad81c79..d1b6016850d6 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -722,6 +722,10 @@
[KNL, x86_64] select a region under 4G first, and
fall back to reserve region above 4G when '@offset'
hasn't been specified.
+ [KNL, arm64] Try low allocation in ZONE_DMA, fall back
+ to high allocation if it fails when '@offset' hasn't been
+ specified. For non-RPi4 platforms, change ZONE_DMA to
+ ZONE_DMA32.
See Documentation/admin-guide/kdump/kdump.rst for further details.

crashkernel=range1:size1[,range2:size2,...][@offset]
@@ -746,13 +750,16 @@
requires at least 64M+32K low memory, also enough extra
low memory is needed to make sure DMA buffers for 32-bit
devices won't run out. Kernel would try to allocate at
- at least 256M below 4G automatically.
+ least 256M below 4G automatically.
This one let user to specify own low range under 4G
for second kernel instead.
0: to disable low allocation.
It will be ignored when crashkernel=X,high is not used
or memory reserved is below 4G.
-
+ [KNL, arm64] range under 4G.
+ This one let user to specify a low range in ZONE_DMA for
+ crash dump kernel. For non-RPi4 platforms, change ZONE_DMA
+ to ZONE_DMA32.
cryptomgr.notests
[KNL] Disable crypto self-tests

--
2.20.1

2020-08-01 13:06:56

by chenzhou

Subject: [PATCH v11 3/5] arm64: kdump: reimplement crashkernel=X

The current arm64 kdump has the following issues:
1. crashkernel=X reserves the crash kernel region below 4G, which
fails when there is not enough low memory.
2. If the crash kernel region is reserved above 4G, the crash dump
kernel fails to boot because no low memory is available for
allocation.
3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
if the memory reserved for the crash dump kernel falls in ZONE_DMA32,
devices in the crash dump kernel that need ZONE_DMA fail to
allocate memory.

To solve these issues, change the behavior of crashkernel=X.
crashkernel=X tries a low allocation in ZONE_DMA first, and falls back
to a high allocation if that fails.

If the requested size X is too large and leaves very little free memory
in ZONE_DMA after a low allocation, the system may not work normally.
So add a threshold and go for a high allocation directly if the
requested size exceeds it. The threshold is set to half of the low
memory.

If crash_base is outside ZONE_DMA, try to allocate at least 256M in
ZONE_DMA automatically. "crashkernel=Y,low" can be used to specify the
size of the low memory reservation.

On non-RPi4 platforms, read ZONE_DMA mentioned above as ZONE_DMA32.
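
For reference, the size of the automatic low reservation follows the
formula used by reserve_crashkernel_low() in this series: the swiotlb size
plus 8M of slack, with a floor of 256M. A minimal user-space sketch,
assuming the swiotlb size is supplied by the caller (default_low_size() is
a hypothetical helper, standing in for the kernel's use of
swiotlb_size_or_default()):

```c
#include <stdint.h>

#define SZ_1M (UINT64_C(1) << 20)

/*
 * Size of the low reservation when crashkernel=Y,low is not given:
 * swiotlb size plus 8M for other low buffers, but never less than 256M.
 * swiotlb_bytes is an assumed input standing in for
 * swiotlb_size_or_default().
 */
static uint64_t default_low_size(uint64_t swiotlb_bytes)
{
	uint64_t want = swiotlb_bytes + 8 * SZ_1M;
	uint64_t minimum = 256 * SZ_1M;

	return want > minimum ? want : minimum;
}
```

With a 64M swiotlb the result is the 256M floor; only a swiotlb larger
than 248M pushes the reservation above that.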

Another minor change: since there may be two regions reserved for the
crash dump kernel, name the low region "Crash kernel (low)" to
distinguish it from the high region without affecting existing
kexec-tools.

Signed-off-by: Chen Zhou <[email protected]>
---
arch/arm64/include/asm/kexec.h | 4 +++
arch/arm64/kernel/setup.c | 8 +++++-
arch/arm64/mm/init.c | 51 ++++++++++++++++++++++++++++++----
3 files changed, 57 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 1a2f27f12794..92ed53d0bf21 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -28,7 +28,11 @@
/* 2M alignment for crash kernel regions */
#define CRASH_ALIGN SZ_2M

+#ifdef CONFIG_ZONE_DMA
+#define CRASH_ADDR_LOW_MAX arm64_dma_phys_limit
+#else
#define CRASH_ADDR_LOW_MAX arm64_dma32_phys_limit
+#endif

#ifndef __ASSEMBLY__

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 93b3844cf442..4dc51a2ac012 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -238,7 +238,13 @@ static void __init request_standard_resources(void)
kernel_data.end <= res->end)
request_resource(res, &kernel_data);
#ifdef CONFIG_KEXEC_CORE
- /* Userspace will find "Crash kernel" region in /proc/iomem. */
+ /*
+ * Userspace will find "Crash kernel" region in /proc/iomem.
+ * Note: the low region is renamed as Crash kernel (low).
+ */
+ if (crashk_low_res.end && crashk_low_res.start >= res->start &&
+ crashk_low_res.end <= res->end)
+ request_resource(res, &crashk_low_res);
if (crashk_res.end && crashk_res.start >= res->start &&
crashk_res.end <= res->end)
request_resource(res, &crashk_res);
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index a3d0193f6a0a..53c8916fd32f 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -70,6 +70,14 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
phys_addr_t arm64_dma32_phys_limit __ro_after_init;

#ifdef CONFIG_KEXEC_CORE
+
+/*
+ * Add a threshold for required memory size of crashkernel. If required memory
+ * size is greater than threshold, just go for high allocation directly. The
+ * value of threshold is set as half of the total low memory.
+ */
+#define REQUIRED_MEMORY_THRESHOLD (memblock_mem_size(CRASH_ADDR_LOW_MAX >> \
+ PAGE_SHIFT) >> 1)
/*
* reserve_crashkernel() - reserves memory for crash kernel
*
@@ -90,11 +98,22 @@ static void __init reserve_crashkernel(void)

crash_size = PAGE_ALIGN(crash_size);

- if (crash_base == 0) {
- /* Current arm64 boot protocol requires 2MB alignment */
- crash_base = memblock_find_in_range(0, CRASH_ADDR_LOW_MAX,
- crash_size, CRASH_ALIGN);
- if (crash_base == 0) {
+ if (!crash_base) {
+ /*
+ * Current arm64 boot protocol requires 2MB alignment.
+ * If required memory size is greater than threshold, just go
+ * for high allocation directly.
+ * If required memory size is less than or equal to threshold,
+ * try low allocation firstly, and then fall back to high allocation
+ * if it fails.
+ */
+ if (crash_size <= REQUIRED_MEMORY_THRESHOLD)
+ crash_base = memblock_find_in_range(0, CRASH_ADDR_LOW_MAX,
+ crash_size, CRASH_ALIGN);
+ if (!crash_base)
+ crash_base = memblock_find_in_range(0, MEMBLOCK_ALLOC_ACCESSIBLE,
+ crash_size, SZ_2M);
+ if (!crash_base) {
pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
crash_size);
return;
@@ -118,6 +137,28 @@ static void __init reserve_crashkernel(void)
}
memblock_reserve(crash_base, crash_size);

+ if (crash_base >= CRASH_ADDR_LOW_MAX) {
+ const char *rename = "Crash kernel (low)";
+
+ if (reserve_crashkernel_low()) {
+ memblock_free(crash_base, crash_size);
+ return;
+ }
+
+ /*
+ * In order to distinct from the high region and make no effect
+ * to the use of existing kexec-tools, rename the low region as
+ * "Crash kernel (low)".
+ */
+ crashk_low_res.name = rename;
+ /*
+ * The low region is intended to be used for crash dump kernel
+ * devices, just mark the low region as "nomap" simply.
+ */
+ memblock_mark_nomap(crashk_low_res.start,
+ resource_size(&crashk_low_res));
+ }
+
pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
crash_base, crash_base + crash_size, crash_size >> 20);

--
2.20.1

2020-08-01 13:08:40

by chenzhou

Subject: [PATCH v11 1/5] arm64: kdump: add macro CRASH_ALIGN and CRASH_ADDR_LOW_MAX

Expose the variable arm64_dma32_phys_limit for follow-up patches, and
add the macros CRASH_ALIGN (alignment) and CRASH_ADDR_LOW_MAX (upper
bound of low crash memory). Use these macros instead of the open-coded
values.

Signed-off-by: Chen Zhou <[email protected]>
---
arch/arm64/include/asm/kexec.h | 5 +++++
arch/arm64/include/asm/processor.h | 1 +
arch/arm64/mm/init.c | 8 ++++----
3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index d24b527e8c00..1a2f27f12794 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -25,6 +25,11 @@

#define KEXEC_ARCH KEXEC_ARCH_AARCH64

+/* 2M alignment for crash kernel regions */
+#define CRASH_ALIGN SZ_2M
+
+#define CRASH_ADDR_LOW_MAX arm64_dma32_phys_limit
+
#ifndef __ASSEMBLY__

/**
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 240fe5e5b720..af71063f352c 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -95,6 +95,7 @@
#endif /* CONFIG_ARM64_FORCE_52BIT */

extern phys_addr_t arm64_dma_phys_limit;
+extern phys_addr_t arm64_dma32_phys_limit;
#define ARCH_LOW_ADDRESS_LIMIT (arm64_dma_phys_limit - 1)

struct debug_info {
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 1e93cfc7c47a..a3d0193f6a0a 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -67,7 +67,7 @@ EXPORT_SYMBOL(vmemmap);
* bit addressable memory area.
*/
phys_addr_t arm64_dma_phys_limit __ro_after_init;
-static phys_addr_t arm64_dma32_phys_limit __ro_after_init;
+phys_addr_t arm64_dma32_phys_limit __ro_after_init;

#ifdef CONFIG_KEXEC_CORE
/*
@@ -92,8 +92,8 @@ static void __init reserve_crashkernel(void)

if (crash_base == 0) {
/* Current arm64 boot protocol requires 2MB alignment */
- crash_base = memblock_find_in_range(0, arm64_dma32_phys_limit,
- crash_size, SZ_2M);
+ crash_base = memblock_find_in_range(0, CRASH_ADDR_LOW_MAX,
+ crash_size, CRASH_ALIGN);
if (crash_base == 0) {
pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
crash_size);
@@ -111,7 +111,7 @@ static void __init reserve_crashkernel(void)
return;
}

- if (!IS_ALIGNED(crash_base, SZ_2M)) {
+ if (!IS_ALIGNED(crash_base, CRASH_ALIGN)) {
pr_warn("cannot reserve crashkernel: base address is not 2MB aligned\n");
return;
}
--
2.20.1

2020-08-01 13:09:12

by chenzhou

Subject: [PATCH v11 2/5] x86: kdump: move reserve_crashkernel_low() into crash_core.c

In preparation for supporting reserve_crashkernel_low() on arm64 as
x86_64 does, move reserve_crashkernel_low() into kernel/crash_core.c.

Also move x86_64 CRASH_ALIGN to 2M, as suggested by Dave.
CONFIG_PHYSICAL_ALIGN can be set anywhere from 2M to 16M, so use the
same value as arm64.

Signed-off-by: Chen Zhou <[email protected]>
---
arch/x86/include/asm/kexec.h | 24 ++++++++++
arch/x86/kernel/setup.c | 86 +++---------------------------------
include/linux/crash_core.h | 3 ++
include/linux/kexec.h | 2 -
kernel/crash_core.c | 74 +++++++++++++++++++++++++++++++
kernel/kexec_core.c | 17 -------
6 files changed, 107 insertions(+), 99 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 6802c59e8252..f8f9d952e09f 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -18,6 +18,30 @@

# define KEXEC_CONTROL_CODE_MAX_SIZE 2048

+/* 2M alignment for crash kernel regions */
+#define CRASH_ALIGN SZ_2M
+
+/*
+ * Keep the crash kernel below this limit.
+ *
+ * Earlier 32-bits kernels would limit the kernel to the low 512 MB range
+ * due to mapping restrictions.
+ *
+ * 64-bit kdump kernels need to be restricted to be under 64 TB, which is
+ * the upper limit of system RAM in 4-level paging mode. Since the kdump
+ * jump could be from 5-level paging to 4-level paging, the jump will fail if
+ * the kernel is put above 64 TB, and during the 1st kernel bootup there's
+ * no good way to detect the paging mode of the target kernel which will be
+ * loaded for dumping.
+ */
+#ifdef CONFIG_X86_32
+# define CRASH_ADDR_LOW_MAX SZ_512M
+# define CRASH_ADDR_HIGH_MAX SZ_512M
+#else
+# define CRASH_ADDR_LOW_MAX SZ_4G
+# define CRASH_ADDR_HIGH_MAX SZ_64T
+#endif
+
#ifndef __ASSEMBLY__

#include <linux/string.h>
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index a3767e74c758..46763c1e5d9f 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -401,83 +401,6 @@ static void __init memblock_x86_reserve_range_setup_data(void)

#ifdef CONFIG_KEXEC_CORE

-/* 16M alignment for crash kernel regions */
-#define CRASH_ALIGN SZ_16M
-
-/*
- * Keep the crash kernel below this limit.
- *
- * Earlier 32-bits kernels would limit the kernel to the low 512 MB range
- * due to mapping restrictions.
- *
- * 64-bit kdump kernels need to be restricted to be under 64 TB, which is
- * the upper limit of system RAM in 4-level paging mode. Since the kdump
- * jump could be from 5-level paging to 4-level paging, the jump will fail if
- * the kernel is put above 64 TB, and during the 1st kernel bootup there's
- * no good way to detect the paging mode of the target kernel which will be
- * loaded for dumping.
- */
-#ifdef CONFIG_X86_32
-# define CRASH_ADDR_LOW_MAX SZ_512M
-# define CRASH_ADDR_HIGH_MAX SZ_512M
-#else
-# define CRASH_ADDR_LOW_MAX SZ_4G
-# define CRASH_ADDR_HIGH_MAX SZ_64T
-#endif
-
-static int __init reserve_crashkernel_low(void)
-{
-#ifdef CONFIG_X86_64
- unsigned long long base, low_base = 0, low_size = 0;
- unsigned long total_low_mem;
- int ret;
-
- total_low_mem = memblock_mem_size(1UL << (32 - PAGE_SHIFT));
-
- /* crashkernel=Y,low */
- ret = parse_crashkernel_low(boot_command_line, total_low_mem, &low_size, &base);
- if (ret) {
- /*
- * two parts from kernel/dma/swiotlb.c:
- * -swiotlb size: user-specified with swiotlb= or default.
- *
- * -swiotlb overflow buffer: now hardcoded to 32k. We round it
- * to 8M for other buffers that may need to stay low too. Also
- * make sure we allocate enough extra low memory so that we
- * don't run out of DMA buffers for 32-bit devices.
- */
- low_size = max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20);
- } else {
- /* passed with crashkernel=0,low ? */
- if (!low_size)
- return 0;
- }
-
- low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
- if (!low_base) {
- pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n",
- (unsigned long)(low_size >> 20));
- return -ENOMEM;
- }
-
- ret = memblock_reserve(low_base, low_size);
- if (ret) {
- pr_err("%s: Error reserving crashkernel low memblock.\n", __func__);
- return ret;
- }
-
- pr_info("Reserving %ldMB of low memory at %ldMB for crashkernel (System low RAM: %ldMB)\n",
- (unsigned long)(low_size >> 20),
- (unsigned long)(low_base >> 20),
- (unsigned long)(total_low_mem >> 20));
-
- crashk_low_res.start = low_base;
- crashk_low_res.end = low_base + low_size - 1;
- insert_resource(&iomem_resource, &crashk_low_res);
-#endif
- return 0;
-}
-
static void __init reserve_crashkernel(void)
{
unsigned long long crash_size, crash_base, total_mem;
@@ -541,9 +464,12 @@ static void __init reserve_crashkernel(void)
return;
}

- if (crash_base >= (1ULL << 32) && reserve_crashkernel_low()) {
- memblock_free(crash_base, crash_size);
- return;
+ if (crash_base >= (1ULL << 32)) {
+ if (reserve_crashkernel_low()) {
+ memblock_free(crash_base, crash_size);
+ return;
+ }
+ insert_resource(&iomem_resource, &crashk_low_res);
}

pr_info("Reserving %ldMB of memory at %ldMB for crashkernel (System RAM: %ldMB)\n",
diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index 525510a9f965..4df8c0bff03e 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -63,6 +63,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
extern unsigned char *vmcoreinfo_data;
extern size_t vmcoreinfo_size;
extern u32 *vmcoreinfo_note;
+extern struct resource crashk_res;
+extern struct resource crashk_low_res;

Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
void *data, size_t data_len);
@@ -74,5 +76,6 @@ int parse_crashkernel_high(char *cmdline, unsigned long long system_ram,
unsigned long long *crash_size, unsigned long long *crash_base);
int parse_crashkernel_low(char *cmdline, unsigned long long system_ram,
unsigned long long *crash_size, unsigned long long *crash_base);
+int __init reserve_crashkernel_low(void);

#endif /* LINUX_CRASH_CORE_H */
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index ea67910ae6b7..a460afdbab0f 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -330,8 +330,6 @@ extern int kexec_load_disabled;

/* Location of a reserved region to hold the crash kernel.
*/
-extern struct resource crashk_res;
-extern struct resource crashk_low_res;
extern note_buf_t __percpu *crash_notes;

/* flag to track if kexec reboot is in progress */
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 9f1557b98468..c81b15dd78c2 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -7,7 +7,10 @@
#include <linux/crash_core.h>
#include <linux/utsname.h>
#include <linux/vmalloc.h>
+#include <linux/memblock.h>
+#include <linux/swiotlb.h>

+#include <asm/kexec.h>
#include <asm/page.h>
#include <asm/sections.h>

@@ -19,6 +22,22 @@ u32 *vmcoreinfo_note;
/* trusted vmcoreinfo, e.g. we can make a copy in the crash memory */
static unsigned char *vmcoreinfo_data_safecopy;

+/* Location of the reserved area for the crash kernel */
+struct resource crashk_res = {
+ .name = "Crash kernel",
+ .start = 0,
+ .end = 0,
+ .flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
+ .desc = IORES_DESC_CRASH_KERNEL
+};
+struct resource crashk_low_res = {
+ .name = "Crash kernel",
+ .start = 0,
+ .end = 0,
+ .flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
+ .desc = IORES_DESC_CRASH_KERNEL
+};
+
/*
* parsing the "crashkernel" commandline
*
@@ -292,6 +311,61 @@ int __init parse_crashkernel_low(char *cmdline,
"crashkernel=", suffix_tbl[SUFFIX_LOW]);
}

+int __init reserve_crashkernel_low(void)
+{
+#if defined(CONFIG_X86_64) || defined(CONFIG_ARM64)
+ unsigned long long base, low_base = 0, low_size = 0;
+ unsigned long total_low_mem;
+ int ret;
+
+ total_low_mem = memblock_mem_size(CRASH_ADDR_LOW_MAX >> PAGE_SHIFT);
+
+ /* crashkernel=Y,low */
+ ret = parse_crashkernel_low(boot_command_line, total_low_mem, &low_size,
+ &base);
+ if (ret) {
+ /*
+ * two parts from lib/swiotlb.c:
+ * -swiotlb size: user-specified with swiotlb= or default.
+ *
+ * -swiotlb overflow buffer: now hardcoded to 32k. We round it
+ * to 8M for other buffers that may need to stay low too. Also
+ * make sure we allocate enough extra low memory so that we
+ * don't run out of DMA buffers for 32-bit devices.
+ */
+ low_size = max(swiotlb_size_or_default() + (8UL << 20),
+ 256UL << 20);
+ } else {
+ /* passed with crashkernel=0,low ? */
+ if (!low_size)
+ return 0;
+ }
+
+ low_base = memblock_find_in_range(0, CRASH_ADDR_LOW_MAX, low_size, CRASH_ALIGN);
+ if (!low_base) {
+ pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n",
+ (unsigned long)(low_size >> 20));
+ return -ENOMEM;
+ }
+
+ ret = memblock_reserve(low_base, low_size);
+ if (ret) {
+ pr_err("%s: Error reserving crashkernel low memblock.\n",
+ __func__);
+ return ret;
+ }
+
+ pr_info("Reserving %ldMB of low memory at %ldMB for crashkernel (System low RAM: %ldMB)\n",
+ (unsigned long)(low_size >> 20),
+ (unsigned long)(low_base >> 20),
+ (unsigned long)(total_low_mem >> 20));
+
+ crashk_low_res.start = low_base;
+ crashk_low_res.end = low_base + low_size - 1;
+#endif
+ return 0;
+}
+
Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
void *data, size_t data_len)
{
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index c19c0dad1ebe..db66bbabfff3 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -53,23 +53,6 @@ note_buf_t __percpu *crash_notes;
/* Flag to indicate we are going to kexec a new kernel */
bool kexec_in_progress = false;

-
-/* Location of the reserved area for the crash kernel */
-struct resource crashk_res = {
- .name = "Crash kernel",
- .start = 0,
- .end = 0,
- .flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
- .desc = IORES_DESC_CRASH_KERNEL
-};
-struct resource crashk_low_res = {
- .name = "Crash kernel",
- .start = 0,
- .end = 0,
- .flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
- .desc = IORES_DESC_CRASH_KERNEL
-};
-
int kexec_should_crash(struct task_struct *p)
{
/*
--
2.20.1

2020-08-06 16:30:18

by Dave Young

Subject: Re: [PATCH v11 0/5] support reserving crashkernel above 4G on arm64 kdump

Hi Chen,

Thanks for the update. I was busy with other things; I will review your
x86/common changes this weekend or early next week.

On 08/01/20 at 09:08pm, Chen Zhou wrote:
> There are the following issues in arm64 kdump:
> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> will fail when there is not enough low memory.
> 2. If the crashkernel is reserved above 4G, the crash dump
> kernel will fail to boot because there is no low memory available
> for allocation.
> 3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
> if the memory reserved for the crash dump kernel falls in ZONE_DMA32,
> devices in the crash dump kernel that need ZONE_DMA will fail to
> allocate.
>
> To solve these issues, change the behavior of crashkernel=X.
> crashkernel=X tries low allocation in ZONE_DMA, and falls back to
> high allocation if it fails.
>
> If the required size X is too large and leaves very little free memory
> in ZONE_DMA after low allocation, the system may not work normally.
> So add a threshold and go for high allocation directly if the required
> size is too large. The threshold is set to half of the low memory.
>
> If crash_base is outside ZONE_DMA, try to allocate at least 256M in
> ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
> a specified size of low memory.
> For non-RPi4 platforms, change ZONE_DMA mentioned above to ZONE_DMA32.
>
> When reserving crashkernel in high memory, some low memory is reserved
> for crash dump kernel devices. So there may be two regions reserved for
> the crash dump kernel, one below 4G and the other above 4G.
> To distinguish the low region from the high region without affecting
> existing kexec-tools, rename the low region to "Crash kernel (low)",
> and pass the low region by reusing the DT property
> "linux,usable-memory-range". We made the low memory region the last
> range of "linux,usable-memory-range" to keep compatibility with existing
> user-space and older kdump kernels.
>
> Besides, we need to modify kexec-tools:
> arm64: support more than one crash kernel regions(see [1])
>
> Another update documents the DT property 'linux,usable-memory-range':
> schemas: update 'linux,usable-memory-range' node schema(see [2])
>
> Changes since [v10]
> - Reimplement crashkernel=X suggested by Catalin, Many thanks to Catalin.
>
> Changes since [v9]
> - Patch 1 add Acked-by from Dave.
> - Update patch 5 according to Dave's comments.
> - Update chosen schema.
>
> Changes since [v8]
> - Reuse DT property "linux,usable-memory-range".
> Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
> memory region.
> - Fix kdump broken with ZONE_DMA reintroduced.
> - Update chosen schema.
>
> Changes since [v7]
> - Move x86 CRASH_ALIGN to 2M
> Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
> - Update Documentation/devicetree/bindings/chosen.txt.
> Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
> suggested by Arnd.
> - Add Tested-by from Jhon and pk.
>
> Changes since [v6]
> - Fix build errors reported by kbuild test robot.
>
> Changes since [v5]
> - Move reserve_crashkernel_low() into kernel/crash_core.c.
> - Delete crashkernel=X,high.
> - Modify crashkernel=X,low.
> If crashkernel=X,low is specified simultaneously, reserve the specified size of low
> memory for crash kdump kernel devices first and then reserve memory above 4G.
> In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
> pass to crash dump kernel by DT property "linux,low-memory-range".
> - Update Documentation/admin-guide/kdump/kdump.rst.
>
> Changes since [v4]
> - Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
>
> Changes since [v3]
> - Add memblock_cap_memory_ranges back for multiple ranges.
> - Fix some compiling warnings.
>
> Changes since [v2]
> - Split patch "arm64: kdump: support reserving crashkernel above 4G" as
> two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
> patch.
>
> Changes since [v1]:
> - Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
> - Remove memblock_cap_memory_ranges() I added in v1 and implement that
> in fdt_enforce_memory_region().
> There are at most two crash kernel regions, for two crash kernel regions
> case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
> and then remove the memory range in the middle.
>
> [1]: http://lists.infradead.org/pipermail/kexec/2020-June/020737.html
> [2]: https://github.com/robherring/dt-schema/pull/19
> [v1]: https://lkml.org/lkml/2019/4/2/1174
> [v2]: https://lkml.org/lkml/2019/4/9/86
> [v3]: https://lkml.org/lkml/2019/4/9/306
> [v4]: https://lkml.org/lkml/2019/4/15/273
> [v5]: https://lkml.org/lkml/2019/5/6/1360
> [v6]: https://lkml.org/lkml/2019/8/30/142
> [v7]: https://lkml.org/lkml/2019/12/23/411
> [v8]: https://lkml.org/lkml/2020/5/21/213
> [v9]: https://lkml.org/lkml/2020/6/28/73
> [v10]: https://lkml.org/lkml/2020/7/2/1443
>
> Chen Zhou (5):
> arm64: kdump: add macro CRASH_ALIGN and CRASH_ADDR_LOW_MAX
> x86: kdump: move reserve_crashkernel_low() into crash_core.c
> arm64: kdump: reimplement crashkernel=X
> arm64: kdump: add memory for devices by DT property
> linux,usable-memory-range
> kdump: update Documentation about crashkernel
>
> Documentation/admin-guide/kdump/kdump.rst | 21 +++-
> .../admin-guide/kernel-parameters.txt | 11 ++-
> arch/arm64/include/asm/kexec.h | 9 ++
> arch/arm64/include/asm/processor.h | 1 +
> arch/arm64/kernel/setup.c | 8 +-
> arch/arm64/mm/init.c | 99 +++++++++++++++----
> arch/x86/include/asm/kexec.h | 24 +++++
> arch/x86/kernel/setup.c | 86 ++--------------
> include/linux/crash_core.h | 3 +
> include/linux/kexec.h | 2 -
> kernel/crash_core.c | 74 ++++++++++++++
> kernel/kexec_core.c | 17 ----
> 12 files changed, 233 insertions(+), 122 deletions(-)
>
> --
> 2.20.1
>

2020-08-06 17:08:28

by Ingo Molnar

Subject: Re: [PATCH v11 2/5] x86: kdump: move reserve_crashkernel_low() into crash_core.c


* Chen Zhou <[email protected]> wrote:

> In preparation for supporting reserve_crashkernel_low in arm64 as
> x86_64 does, move reserve_crashkernel_low() into kernel/crash_core.c.
>
> BTW, move x86_64 CRASH_ALIGN to 2M suggested by Dave. CONFIG_PHYSICAL_ALIGN
> can be selected from 2M to 16M, move to the same as arm64.
>
> Signed-off-by: Chen Zhou <[email protected]>
> ---
> arch/x86/include/asm/kexec.h | 24 ++++++++++
> arch/x86/kernel/setup.c | 86 +++---------------------------------
> include/linux/crash_core.h | 3 ++
> include/linux/kexec.h | 2 -
> kernel/crash_core.c | 74 +++++++++++++++++++++++++++++++
> kernel/kexec_core.c | 17 -------
> 6 files changed, 107 insertions(+), 99 deletions(-)

Since the changes are centered around arm64, I suppose the arm64 tree
will carry this patchset?

Assuming that this is a 100% invariant moving of code that doesn't
regress on x86:

Acked-by: Ingo Molnar <[email protected]>

Thanks,

Ingo
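
Besides moving code, the quoted description also changes the x86_64 CRASH_ALIGN from 16M to 2M. The following is only a rough userspace sketch of what that alignment granularity means for a candidate base address. The helper names align_up() and is_aligned() are hypothetical and are not the kernel's memblock search, which is not modeled here.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_2M  (UINT64_C(2) << 20)
#define SZ_16M (UINT64_C(16) << 20)

/*
 * Hypothetical helper: round addr up to the next multiple of align.
 * align must be a non-zero power of two.
 */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/* Hypothetical helper: non-zero if addr is a multiple of align. */
static int is_aligned(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr & (align - 1)) == 0;
}
```

With a 2M alignment, an address at 19M rounds up to 20M, whereas a 16M alignment would push it to 32M, so the smaller alignment leaves more candidate bases in a fragmented low range.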

2020-08-08 10:04:05

by Dave Young

Subject: Re: [PATCH v11 2/5] x86: kdump: move reserve_crashkernel_low() into crash_core.c

On 08/01/20 at 09:08pm, Chen Zhou wrote:
> In preparation for supporting reserve_crashkernel_low in arm64 as
> x86_64 does, move reserve_crashkernel_low() into kernel/crash_core.c.
>
> BTW, move x86_64 CRASH_ALIGN to 2M suggested by Dave. CONFIG_PHYSICAL_ALIGN
> can be selected from 2M to 16M, move to the same as arm64.
>
> Signed-off-by: Chen Zhou <[email protected]>
> ---
> arch/x86/include/asm/kexec.h | 24 ++++++++++
> arch/x86/kernel/setup.c | 86 +++---------------------------------
> include/linux/crash_core.h | 3 ++
> include/linux/kexec.h | 2 -
> kernel/crash_core.c | 74 +++++++++++++++++++++++++++++++
> kernel/kexec_core.c | 17 -------
> 6 files changed, 107 insertions(+), 99 deletions(-)
>
> diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
> index 6802c59e8252..f8f9d952e09f 100644
> --- a/arch/x86/include/asm/kexec.h
> +++ b/arch/x86/include/asm/kexec.h
> @@ -18,6 +18,30 @@
>
> # define KEXEC_CONTROL_CODE_MAX_SIZE 2048
>
> +/* 2M alignment for crash kernel regions */
> +#define CRASH_ALIGN SZ_2M
> +
> +/*
> + * Keep the crash kernel below this limit.
> + *
> + * Earlier 32-bits kernels would limit the kernel to the low 512 MB range
> + * due to mapping restrictions.
> + *
> + * 64-bit kdump kernels need to be restricted to be under 64 TB, which is
> + * the upper limit of system RAM in 4-level paging mode. Since the kdump
> + * jump could be from 5-level paging to 4-level paging, the jump will fail if
> + * the kernel is put above 64 TB, and during the 1st kernel bootup there's
> + * no good way to detect the paging mode of the target kernel which will be
> + * loaded for dumping.
> + */
> +#ifdef CONFIG_X86_32
> +# define CRASH_ADDR_LOW_MAX SZ_512M
> +# define CRASH_ADDR_HIGH_MAX SZ_512M
> +#else
> +# define CRASH_ADDR_LOW_MAX SZ_4G
> +# define CRASH_ADDR_HIGH_MAX SZ_64T
> +#endif
> +
> #ifndef __ASSEMBLY__
>
> #include <linux/string.h>
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index a3767e74c758..46763c1e5d9f 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -401,83 +401,6 @@ static void __init memblock_x86_reserve_range_setup_data(void)
>
> #ifdef CONFIG_KEXEC_CORE
>
> -/* 16M alignment for crash kernel regions */
> -#define CRASH_ALIGN SZ_16M
> -
> -/*
> - * Keep the crash kernel below this limit.
> - *
> - * Earlier 32-bits kernels would limit the kernel to the low 512 MB range
> - * due to mapping restrictions.
> - *
> - * 64-bit kdump kernels need to be restricted to be under 64 TB, which is
> - * the upper limit of system RAM in 4-level paging mode. Since the kdump
> - * jump could be from 5-level paging to 4-level paging, the jump will fail if
> - * the kernel is put above 64 TB, and during the 1st kernel bootup there's
> - * no good way to detect the paging mode of the target kernel which will be
> - * loaded for dumping.
> - */
> -#ifdef CONFIG_X86_32
> -# define CRASH_ADDR_LOW_MAX SZ_512M
> -# define CRASH_ADDR_HIGH_MAX SZ_512M
> -#else
> -# define CRASH_ADDR_LOW_MAX SZ_4G
> -# define CRASH_ADDR_HIGH_MAX SZ_64T
> -#endif
> -
> -static int __init reserve_crashkernel_low(void)
> -{
> -#ifdef CONFIG_X86_64
> - unsigned long long base, low_base = 0, low_size = 0;
> - unsigned long total_low_mem;
> - int ret;
> -
> - total_low_mem = memblock_mem_size(1UL << (32 - PAGE_SHIFT));
> -
> - /* crashkernel=Y,low */
> - ret = parse_crashkernel_low(boot_command_line, total_low_mem, &low_size, &base);
> - if (ret) {
> - /*
> - * two parts from kernel/dma/swiotlb.c:
> - * -swiotlb size: user-specified with swiotlb= or default.
> - *
> - * -swiotlb overflow buffer: now hardcoded to 32k. We round it
> - * to 8M for other buffers that may need to stay low too. Also
> - * make sure we allocate enough extra low memory so that we
> - * don't run out of DMA buffers for 32-bit devices.
> - */
> - low_size = max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20);
> - } else {
> - /* passed with crashkernel=0,low ? */
> - if (!low_size)
> - return 0;
> - }
> -
> - low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
> - if (!low_base) {
> - pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n",
> - (unsigned long)(low_size >> 20));
> - return -ENOMEM;
> - }
> -
> - ret = memblock_reserve(low_base, low_size);
> - if (ret) {
> - pr_err("%s: Error reserving crashkernel low memblock.\n", __func__);
> - return ret;
> - }
> -
> - pr_info("Reserving %ldMB of low memory at %ldMB for crashkernel (System low RAM: %ldMB)\n",
> - (unsigned long)(low_size >> 20),
> - (unsigned long)(low_base >> 20),
> - (unsigned long)(total_low_mem >> 20));
> -
> - crashk_low_res.start = low_base;
> - crashk_low_res.end = low_base + low_size - 1;
> - insert_resource(&iomem_resource, &crashk_low_res);
> -#endif
> - return 0;
> -}
> -
> static void __init reserve_crashkernel(void)
> {
> unsigned long long crash_size, crash_base, total_mem;
> @@ -541,9 +464,12 @@ static void __init reserve_crashkernel(void)
> return;
> }
>
> - if (crash_base >= (1ULL << 32) && reserve_crashkernel_low()) {
> - memblock_free(crash_base, crash_size);
> - return;
> + if (crash_base >= (1ULL << 32)) {
> + if (reserve_crashkernel_low()) {
> + memblock_free(crash_base, crash_size);
> + return;
> + }
> + insert_resource(&iomem_resource, &crashk_low_res);
> }
>
> pr_info("Reserving %ldMB of memory at %ldMB for crashkernel (System RAM: %ldMB)\n",
> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> index 525510a9f965..4df8c0bff03e 100644
> --- a/include/linux/crash_core.h
> +++ b/include/linux/crash_core.h
> @@ -63,6 +63,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
> extern unsigned char *vmcoreinfo_data;
> extern size_t vmcoreinfo_size;
> extern u32 *vmcoreinfo_note;
> +extern struct resource crashk_res;
> +extern struct resource crashk_low_res;
>
> Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
> void *data, size_t data_len);
> @@ -74,5 +76,6 @@ int parse_crashkernel_high(char *cmdline, unsigned long long system_ram,
> unsigned long long *crash_size, unsigned long long *crash_base);
> int parse_crashkernel_low(char *cmdline, unsigned long long system_ram,
> unsigned long long *crash_size, unsigned long long *crash_base);
> +int __init reserve_crashkernel_low(void);
>
> #endif /* LINUX_CRASH_CORE_H */
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index ea67910ae6b7..a460afdbab0f 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -330,8 +330,6 @@ extern int kexec_load_disabled;
>
> /* Location of a reserved region to hold the crash kernel.
> */
> -extern struct resource crashk_res;
> -extern struct resource crashk_low_res;
> extern note_buf_t __percpu *crash_notes;
>
> /* flag to track if kexec reboot is in progress */
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 9f1557b98468..c81b15dd78c2 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -7,7 +7,10 @@
> #include <linux/crash_core.h>
> #include <linux/utsname.h>
> #include <linux/vmalloc.h>
> +#include <linux/memblock.h>
> +#include <linux/swiotlb.h>
>
> +#include <asm/kexec.h>
> #include <asm/page.h>
> #include <asm/sections.h>
>
> @@ -19,6 +22,22 @@ u32 *vmcoreinfo_note;
> /* trusted vmcoreinfo, e.g. we can make a copy in the crash memory */
> static unsigned char *vmcoreinfo_data_safecopy;
>
> +/* Location of the reserved area for the crash kernel */
> +struct resource crashk_res = {
> + .name = "Crash kernel",
> + .start = 0,
> + .end = 0,
> + .flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
> + .desc = IORES_DESC_CRASH_KERNEL
> +};
> +struct resource crashk_low_res = {
> + .name = "Crash kernel",
> + .start = 0,
> + .end = 0,
> + .flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
> + .desc = IORES_DESC_CRASH_KERNEL
> +};
> +
> /*
> * parsing the "crashkernel" commandline
> *
> @@ -292,6 +311,61 @@ int __init parse_crashkernel_low(char *cmdline,
> "crashkernel=", suffix_tbl[SUFFIX_LOW]);
> }
>
> +int __init reserve_crashkernel_low(void)
> +{
> +#if defined(CONFIG_X86_64) || defined(CONFIG_ARM64)
> + unsigned long long base, low_base = 0, low_size = 0;
> + unsigned long total_low_mem;
> + int ret;
> +
> + total_low_mem = memblock_mem_size(CRASH_ADDR_LOW_MAX >> PAGE_SHIFT);
> +
> + /* crashkernel=Y,low */
> + ret = parse_crashkernel_low(boot_command_line, total_low_mem, &low_size,
> + &base);
> + if (ret) {
> + /*
> + * two parts from lib/swiotlb.c:
> + * -swiotlb size: user-specified with swiotlb= or default.
> + *
> + * -swiotlb overflow buffer: now hardcoded to 32k. We round it
> + * to 8M for other buffers that may need to stay low too. Also
> + * make sure we allocate enough extra low memory so that we
> + * don't run out of DMA buffers for 32-bit devices.
> + */
> + low_size = max(swiotlb_size_or_default() + (8UL << 20),
> + 256UL << 20);
> + } else {
> + /* passed with crashkernel=0,low ? */
> + if (!low_size)
> + return 0;
> + }
> +
> + low_base = memblock_find_in_range(0, CRASH_ADDR_LOW_MAX, low_size, CRASH_ALIGN);
> + if (!low_base) {
> + pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n",
> + (unsigned long)(low_size >> 20));
> + return -ENOMEM;
> + }
> +
> + ret = memblock_reserve(low_base, low_size);
> + if (ret) {
> + pr_err("%s: Error reserving crashkernel low memblock.\n",
> + __func__);
> + return ret;
> + }
> +
> + pr_info("Reserving %ldMB of low memory at %ldMB for crashkernel (System low RAM: %ldMB)\n",
> + (unsigned long)(low_size >> 20),
> + (unsigned long)(low_base >> 20),
> + (unsigned long)(total_low_mem >> 20));
> +
> + crashk_low_res.start = low_base;
> + crashk_low_res.end = low_base + low_size - 1;
> +#endif
> + return 0;
> +}
> +
> Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
> void *data, size_t data_len)
> {
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index c19c0dad1ebe..db66bbabfff3 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -53,23 +53,6 @@ note_buf_t __percpu *crash_notes;
> /* Flag to indicate we are going to kexec a new kernel */
> bool kexec_in_progress = false;
>
> -
> -/* Location of the reserved area for the crash kernel */
> -struct resource crashk_res = {
> - .name = "Crash kernel",
> - .start = 0,
> - .end = 0,
> - .flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
> - .desc = IORES_DESC_CRASH_KERNEL
> -};
> -struct resource crashk_low_res = {
> - .name = "Crash kernel",
> - .start = 0,
> - .end = 0,
> - .flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
> - .desc = IORES_DESC_CRASH_KERNEL
> -};
> -
> int kexec_should_crash(struct task_struct *p)
> {
> /*
> --
> 2.20.1
>

Acked-by: Dave Young <[email protected]>

Thanks
Dave
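
For readers following the moved function, here is a minimal userspace sketch of the size arithmetic in reserve_crashkernel_low() as quoted above: the default low size when no "crashkernel=Y,low" is given, and the inclusive end address stored in crashk_low_res. The helper names default_low_size() and region_end() are made up for illustration, and the swiotlb size is passed in as a plain parameter instead of calling swiotlb_size_or_default().

```c
#include <assert.h>
#include <stdint.h>

#define SZ_1M (UINT64_C(1) << 20)

/*
 * Hypothetical helper mirroring the fallback branch of the quoted code:
 * swiotlb size plus 8M of slack, but never less than 256M.
 */
static uint64_t default_low_size(uint64_t swiotlb_bytes)
{
	uint64_t wanted = swiotlb_bytes + 8 * SZ_1M;
	uint64_t minimum = 256 * SZ_1M;

	return wanted > minimum ? wanted : minimum;
}

/*
 * Hypothetical helper: inclusive end address of a region, matching
 * "crashk_low_res.end = low_base + low_size - 1" in the patch.
 */
static uint64_t region_end(uint64_t base, uint64_t size)
{
	assert(size > 0);
	return base + size - 1;
}
```

With a 64M swiotlb the 256M floor applies; only a swiotlb larger than 248M raises the default above it.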

2020-08-08 10:04:09

by Dave Young

Subject: Re: [PATCH v11 5/5] kdump: update Documentation about crashkernel

On 08/01/20 at 09:08pm, Chen Zhou wrote:
> The behavior of crashkernel=X has changed: it now tries low
> allocation in ZONE_DMA, and falls back to high allocation if that fails.
>
> If the required size X is too large and leaves very little free memory
> in ZONE_DMA after low allocation, the system may not work well.
> So add a threshold and go for high allocation directly if the required
> size is too large. The threshold is set to half of the low memory.
>
> If crash_base is outside ZONE_DMA, try to allocate at least 256M in
> ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
> a specified size of low memory. For non-RPi4 platforms, change ZONE_DMA
> mentioned above to ZONE_DMA32.
>
> So update the Documentation.
>
> Signed-off-by: Chen Zhou <[email protected]>
> ---
> Documentation/admin-guide/kdump/kdump.rst | 21 ++++++++++++++++---
> .../admin-guide/kernel-parameters.txt | 11 ++++++++--
> 2 files changed, 27 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
> index 2da65fef2a1c..4b58f97351d5 100644
> --- a/Documentation/admin-guide/kdump/kdump.rst
> +++ b/Documentation/admin-guide/kdump/kdump.rst
> @@ -299,7 +299,15 @@ Boot into System Kernel
> "crashkernel=64M@16M" tells the system kernel to reserve 64 MB of memory
> starting at physical address 0x01000000 (16MB) for the dump-capture kernel.
>
> - On x86 and x86_64, use "crashkernel=64M@16M".
> + On x86 use "crashkernel=64M@16M".
> +
> + On x86_64, use "crashkernel=X" to select a region under 4G first, and
> + fall back to reserve region above 4G.
> + We can also use "crashkernel=X,high" to select a region above 4G, which
> + also tries to allocate at least 256M below 4G automatically and
> + "crashkernel=Y,low" can be used to allocate specified size low memory.
> + Use "crashkernel=Y@X" if you really have to reserve memory from specified
> + start address X.
>
> On ppc64, use "crashkernel=128M@32M".
>
> @@ -316,8 +324,15 @@ Boot into System Kernel
> kernel will automatically locate the crash kernel image within the
> first 512MB of RAM if X is not given.
>
> - On arm64, use "crashkernel=Y[@X]". Note that the start address of
> - the kernel, X if explicitly specified, must be aligned to 2MiB (0x200000).
> + On arm64, use "crashkernel=X" to try low allocation in ZONE_DMA, and
> + fall back to high allocation if it fails. And go for high allocation
> + directly if the required size is too large. If crash_base is outside
> + ZONE_DMA, try to allocate at least 256M in ZONE_DMA automatically.
> + "crashkernel=Y,low" can be used to allocate specified size low memory.
> + For non-RPi4 platforms, change ZONE_DMA mentioned above to ZONE_DMA32.
> + Use "crashkernel=Y@X" if you really have to reserve memory from
> + specified start address X. Note that the start address of the kernel,
> + X if explicitly specified, must be aligned to 2MiB (0x200000).
>
> Load the Dump-capture Kernel
> ============================
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index fb95fad81c79..d1b6016850d6 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -722,6 +722,10 @@
> [KNL, x86_64] select a region under 4G first, and
> fall back to reserve region above 4G when '@offset'
> hasn't been specified.
> + [KNL, arm64] Try low allocation in ZONE_DMA, fall back
> + to high allocation if it fails when '@offset' hasn't been
> + specified. For non-RPi4 platforms, change ZONE_DMA to
> + ZONE_DMA32.
> See Documentation/admin-guide/kdump/kdump.rst for further details.
>
> crashkernel=range1:size1[,range2:size2,...][@offset]
> @@ -746,13 +750,16 @@
> requires at least 64M+32K low memory, also enough extra
> low memory is needed to make sure DMA buffers for 32-bit
> devices won't run out. Kernel would try to allocate at
> - at least 256M below 4G automatically.
> + least 256M below 4G automatically.
> This one let user to specify own low range under 4G
> for second kernel instead.
> 0: to disable low allocation.
> It will be ignored when crashkernel=X,high is not used
> or memory reserved is below 4G.
> -
> + [KNL, arm64] range under 4G.
> + This one let user to specify a low range in ZONE_DMA for
> + crash dump kernel. For non-RPi4 platforms, change ZONE_DMA
> + to ZONE_DMA32.
> cryptomgr.notests
> [KNL] Disable crypto self-tests
>
> --
> 2.20.1
>

Hi Chen,

Previously, I remember we talked about using logic similar to x86, but I
also remember you mentioned that on some arm64 platforms there could be no
low memory at all. Is this not a problem now for the fallback? Just
curious. Thanks for the update; the common part looks good.

Acked-by: Dave Young <[email protected]>

Thanks
Dave
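
The crashkernel=X policy described in the quoted patch can be modeled roughly as below. This is a userspace sketch under stated assumptions, not the arm64 implementation: choose_placement() and needs_low_region() are hypothetical names, the result of the low allocation attempt is passed in as a flag, and treating "too large" as strictly greater than half of low memory is an assumption, since the description only says the threshold is half of the low memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum crash_placement { CRASH_LOW, CRASH_HIGH };

/*
 * Hypothetical model of the described policy:
 *  - a request above half of low memory goes high directly
 *    (strict comparison is an assumption);
 *  - otherwise try low first and fall back to high if that fails.
 */
static enum crash_placement choose_placement(uint64_t size,
					     uint64_t total_low_mem,
					     bool low_alloc_ok)
{
	assert(size > 0);
	if (size > total_low_mem / 2)
		return CRASH_HIGH;
	return low_alloc_ok ? CRASH_LOW : CRASH_HIGH;
}

/*
 * Hypothetical helper: a high placement still needs a separate low
 * region (at least 256M by default) for devices in the crash dump kernel.
 */
static bool needs_low_region(enum crash_placement placement)
{
	return placement == CRASH_HIGH;
}
```

In this model a system with no usable low memory always ends up on the high path and then fails the extra low reservation, which is the case the question above is asking about.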

2020-08-10 03:29:26

by chenzhou

Subject: Re: [PATCH v11 5/5] kdump: update Documentation about crashkernel

On 2020/8/8 18:02, Dave Young wrote:
> Hi Chen,
>
> Previously I remember we talked about to use similar logic as X86, but I
> remember you mentioned on some arm64 platform there could be no low
> memory at all. Is this not a problem now for the fallback? Just be
> curious, thanks for the update, for the common part looks good.
Hi Dave,

Did you mean this discussion: https://lkml.org/lkml/2019/12/27/122?
That was about the different implementation, not about arm64 having no low memory.

On an arm64 platform, if there is no low memory, the system will fail to boot.

Thanks,
Chen Zhou
>
> Acked-by: Dave Young <[email protected]>
>
> Thanks
> Dave
>
>
> .
>

2020-08-10 06:00:36

by Dave Young

Subject: Re: [PATCH v11 5/5] kdump: update Documentation about crashkernel

On 08/10/20 at 11:28am, chenzhou wrote:
> On 2020/8/8 18:02, Dave Young wrote:
> > On 08/01/20 at 09:08pm, Chen Zhou wrote:
> >> Now the behavior of crashkernel=X has been changed, which tries low
> >> allocation in ZONE_DMA, and fall back to high allocation if it fails.
> >>
> >> If requized size X is too large and leads to very little free memory
> >> in ZONE_DMA after low allocation, the system may not work well.
> >> So add a threshold and go for high allocation directly if the required
> >> size is too large. The threshold is set as the half of low memory.
> >>
> >> If crash_base is outside ZONE_DMA, try to allocate at least 256M in
> >> ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
> >> specified size low memory. For non-RPi4 platforms, change ZONE_DMA
> >> memtioned above to ZONE_DMA32.
> >>
> >> So update the Documentation.
> >>
> >> Signed-off-by: Chen Zhou <[email protected]>
> >> ---
> >> Documentation/admin-guide/kdump/kdump.rst | 21 ++++++++++++++++---
> >> .../admin-guide/kernel-parameters.txt | 11 ++++++++--
> >> 2 files changed, 27 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
> >> index 2da65fef2a1c..4b58f97351d5 100644
> >> --- a/Documentation/admin-guide/kdump/kdump.rst
> >> +++ b/Documentation/admin-guide/kdump/kdump.rst
> >> @@ -299,7 +299,15 @@ Boot into System Kernel
> >> "crashkernel=64M@16M" tells the system kernel to reserve 64 MB of memory
> >> starting at physical address 0x01000000 (16MB) for the dump-capture kernel.
> >>
> >> - On x86 and x86_64, use "crashkernel=64M@16M".
> >> + On x86 use "crashkernel=64M@16M".
> >> +
> >> + On x86_64, use "crashkernel=X" to select a region under 4G first, and
> >> + fall back to reserve region above 4G.
> >> + We can also use "crashkernel=X,high" to select a region above 4G, which
> >> + also tries to allocate at least 256M below 4G automatically and
> >> + "crashkernel=Y,low" can be used to allocate specified size low memory.
> >> + Use "crashkernel=Y@X" if you really have to reserve memory from specified
> >> + start address X.
> >>
> >> On ppc64, use "crashkernel=128M@32M".
> >>
> >> @@ -316,8 +324,15 @@ Boot into System Kernel
> >> kernel will automatically locate the crash kernel image within the
> >> first 512MB of RAM if X is not given.
> >>
> >> - On arm64, use "crashkernel=Y[@X]". Note that the start address of
> >> - the kernel, X if explicitly specified, must be aligned to 2MiB (0x200000).
> >> + On arm64, use "crashkernel=X" to try low allocation in ZONE_DMA, and
> >> + fall back to high allocation if it fails. And go for high allocation
> >> + directly if the required size is too large. If crash_base is outside
> >> + ZONE_DMA, try to allocate at least 256M in ZONE_DMA automatically.
> >> + "crashkernel=Y,low" can be used to allocate specified size low memory.
> >> + For non-RPi4 platforms, change ZONE_DMA memtioned above to ZONE_DMA32.
> >> + Use "crashkernel=Y@X" if you really have to reserve memory from
> >> + specified start address X. Note that the start address of the kernel,
> >> + X if explicitly specified, must be aligned to 2MiB (0x200000).
> >>
> >> Load the Dump-capture Kernel
> >> ============================
> >> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> >> index fb95fad81c79..d1b6016850d6 100644
> >> --- a/Documentation/admin-guide/kernel-parameters.txt
> >> +++ b/Documentation/admin-guide/kernel-parameters.txt
> >> @@ -722,6 +722,10 @@
> >> [KNL, x86_64] select a region under 4G first, and
> >> fall back to reserve region above 4G when '@offset'
> >> hasn't been specified.
> >> + [KNL, arm64] Try low allocation in ZONE_DMA, fall back
> >> + to high allocation if it fails when '@offset' hasn't been
> >> + specified. For non-RPi4 platforms, change ZONE_DMA to
> >> + ZONE_DMA32.
> >> See Documentation/admin-guide/kdump/kdump.rst for further details.
> >>
> >> crashkernel=range1:size1[,range2:size2,...][@offset]
> >> @@ -746,13 +750,16 @@
> >> requires at least 64M+32K low memory, also enough extra
> >> low memory is needed to make sure DMA buffers for 32-bit
> >> devices won't run out. Kernel would try to allocate at
> >> - at least 256M below 4G automatically.
> >> + least 256M below 4G automatically.
> >> This one let user to specify own low range under 4G
> >> for second kernel instead.
> >> 0: to disable low allocation.
> >> It will be ignored when crashkernel=X,high is not used
> >> or memory reserved is below 4G.
> >> -
> >> + [KNL, arm64] range under 4G.
> >> + This one lets the user specify a low range in ZONE_DMA for
> >> + crash dump kernel. For non-RPi4 platforms, change ZONE_DMA
> >> + to ZONE_DMA32.
> >> cryptomgr.notests
> >> [KNL] Disable crypto self-tests
> >>
> >> --
> >> 2.20.1
> >>
> > Hi Chen,
> >
> > Previously I remember we talked about to use similar logic as X86, but I
> > remember you mentioned on some arm64 platform there could be no low
> > memory at all. Is this not a problem now for the fallback? Just be
> > curious, thanks for the update, for the common part looks good.
> Hi Dave,
>
> Did you mean this discuss: https://lkml.org/lkml/2019/12/27/122?
> This is about the different implementation instead of no low memory in arm64.
>
> On arm64 platform, if there is no low memory, system will boot fail.

James mentioned that some systems have no memory below 4G. If I understand
correctly, that means they can boot without low memory.

Anyway, I like the new implementation in this series if it is good enough
for the arm64 people.

>
> Thanks,
> Chen Zhou
> >
> > Acked-by: Dave Young <[email protected]>
> >
> > Thanks
> > Dave
> >
> >
> > .
> >
>

Thanks
Dave

2020-08-10 06:05:12

by Dave Young

Subject: Re: [PATCH v11 5/5] kdump: update Documentation about crashkernel

Hi,

> > Previously I remember we talked about to use similar logic as X86, but I
> > remember you mentioned on some arm64 platform there could be no low
> > memory at all. Is this not a problem now for the fallback? Just be
> > curious, thanks for the update, for the common part looks good.
> Hi Dave,
>
> Did you mean this discuss: https://lkml.org/lkml/2019/12/27/122?

I meant this reply instead :)
https://lkml.org/lkml/2020/1/16/616

Thanks
Dave

2020-08-18 07:08:57

by chenzhou

Subject: Re: [PATCH v11 5/5] kdump: update Documentation about crashkernel



On 2020/8/10 14:03, Dave Young wrote:
> Hi,
>
>>> Previously I remember we talked about to use similar logic as X86, but I
>>> remember you mentioned on some arm64 platform there could be no low
>>> memory at all. Is this not a problem now for the fallback? Just be
>>> curious, thanks for the update, for the common part looks good.
>> Hi Dave,
>>
>> Did you mean this discuss: https://lkml.org/lkml/2019/12/27/122?
> I meant about this reply instead :)
> https://lkml.org/lkml/2020/1/16/616
Hi Dave,

Sorry for not replying in time; I was on holiday last week.

The kind of platform James mentioned may exist: one that has no devices and needs no low memory.
On our arm64 server platform, there are some devices that need low memory.

I got it. For a platform with no low memory, reserving crashkernel will always fail.
How about something like this:

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index a8e34d97a894..4df18c7ea438 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -147,7 +147,7 @@ static void __init reserve_crashkernel(void)
}
memblock_reserve(crash_base, crash_size);

- if (crash_base >= CRASH_ADDR_LOW_MAX) {
+ if (memstart_addr < CRASH_ADDR_LOW_MAX && crash_base >= CRASH_ADDR_LOW_MAX) {
const char *rename = "Crash kernel (low)";

if (reserve_crashkernel_low()) {


Thanks,
Chen Zhou

>
> Thanks
> Dave
>
>
> .
>


2020-08-19 12:07:31

by Dave Young

Subject: Re: [PATCH v11 5/5] kdump: update Documentation about crashkernel

On 08/18/20 at 03:07pm, chenzhou wrote:
>
>
> On 2020/8/10 14:03, Dave Young wrote:
> > Hi,
> >
> >>> Previously I remember we talked about to use similar logic as X86, but I
> >>> remember you mentioned on some arm64 platform there could be no low
> >>> memory at all. Is this not a problem now for the fallback? Just be
> >>> curious, thanks for the update, for the common part looks good.
> >> Hi Dave,
> >>
> >> Did you mean this discuss: https://lkml.org/lkml/2019/12/27/122?
> > I meant about this reply instead :)
> > https://lkml.org/lkml/2020/1/16/616
> Hi Dave,
>
> Sorry for not replying in time; I was on holiday last week.

Hi, no problem, thanks for following up.

>
> The platform James mentioned may exist for which have no devices and need no low memory.
> For our arm64 server platform, there are some devices and need low memory.
>
> I got it. For the platform with no low memory, reserving crashkernel will always fail.
> How about like this:

I think this question should be left to Catalin or James; I have no
suggestion about this :)

>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index a8e34d97a894..4df18c7ea438 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -147,7 +147,7 @@ static void __init reserve_crashkernel(void)
> }
> memblock_reserve(crash_base, crash_size);
>
> - if (crash_base >= CRASH_ADDR_LOW_MAX) {
> + if (memstart_addr < CRASH_ADDR_LOW_MAX && crash_base >= CRASH_ADDR_LOW_MAX) {
> const char *rename = "Crash kernel (low)";
>
> if (reserve_crashkernel_low()) {
>
>
> Thanks,
> Chen Zhou
>
> >
> > Thanks
> > Dave
> >
> >
> > .
> >
>
>

2020-08-28 02:00:27

by chenzhou

Subject: Re: [PATCH v11 5/5] kdump: update Documentation about crashkernel

Hi Catalin,


On 2020/8/19 20:03, Dave Young wrote:
> On 08/18/20 at 03:07pm, chenzhou wrote:
>>
>> On 2020/8/10 14:03, Dave Young wrote:
>>> Hi,
>>>
>>>>> Previously I remember we talked about to use similar logic as X86, but I
>>>>> remember you mentioned on some arm64 platform there could be no low
>>>>> memory at all. Is this not a problem now for the fallback? Just be
>>>>> curious, thanks for the update, for the common part looks good.
>>>> Hi Dave,
>>>>
>>>> Did you mean this discuss: https://lkml.org/lkml/2019/12/27/122?
>>> I meant about this reply instead :)
>>> https://lkml.org/lkml/2020/1/16/616
>> Hi Dave,
>>
>> Sorry for not replying in time; I was on holiday last week.
> Hi, no problem, thanks for following up.
>
>> The platform James mentioned may exist for which have no devices and need no low memory.
>> For our arm64 server platform, there are some devices and need low memory.
>>
>> I got it. For the platform with no low memory, reserving crashkernel will always fail.
>> How about like this:
> I think the question should leave to Catalin or James, I have no
> suggestion about this:)
Any suggestions about this?

Thanks,
Chen Zhou
>
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index a8e34d97a894..4df18c7ea438 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -147,7 +147,7 @@ static void __init reserve_crashkernel(void)
>> }
>> memblock_reserve(crash_base, crash_size);
>>
>> - if (crash_base >= CRASH_ADDR_LOW_MAX) {
>> + if (memstart_addr < CRASH_ADDR_LOW_MAX && crash_base >= CRASH_ADDR_LOW_MAX) {
>> const char *rename = "Crash kernel (low)";
>>
>> if (reserve_crashkernel_low()) {
>>
>>
>> Thanks,
>> Chen Zhou
>>
>>> Thanks
>>> Dave
>>>
>>>
>>> .
>>>
>>
>
> .
>


2020-09-01 16:52:44

by Catalin Marinas

Subject: Re: [PATCH v11 2/5] x86: kdump: move reserve_crashkernel_low() into crash_core.c

On Thu, Aug 06, 2020 at 03:36:27PM +0200, Ingo Molnar wrote:
>
> * Chen Zhou <[email protected]> wrote:
>
> > In preparation for supporting reserve_crashkernel_low in arm64 as
> > x86_64 does, move reserve_crashkernel_low() into kernel/crash_core.c.
> >
> > BTW, move x86_64 CRASH_ALIGN to 2M suggested by Dave. CONFIG_PHYSICAL_ALIGN
> > can be selected from 2M to 16M, move to the same as arm64.
> >
> > Signed-off-by: Chen Zhou <[email protected]>
> > ---
> > arch/x86/include/asm/kexec.h | 24 ++++++++++
> > arch/x86/kernel/setup.c | 86 +++---------------------------------
> > include/linux/crash_core.h | 3 ++
> > include/linux/kexec.h | 2 -
> > kernel/crash_core.c | 74 +++++++++++++++++++++++++++++++
> > kernel/kexec_core.c | 17 -------
> > 6 files changed, 107 insertions(+), 99 deletions(-)
>
> Since the changes are centered around arm64, I suppose the arm64 tree
> will carry this patchset?
>
> Assuming that this is a 100% invariant moving of code that doesn't
> regress on x86:
>
> Acked-by: Ingo Molnar <[email protected]>

Thanks Ingo. The only difference I see is that CRASH_ALIGN has been
reduced to 2M here from 16M for x86. Would this break configs that have
PHYSICAL_ALIGN > 2M?

--
Catalin

2020-09-01 17:16:28

by Catalin Marinas

Subject: Re: [PATCH v11 5/5] kdump: update Documentation about crashkernel

On Tue, Aug 18, 2020 at 03:07:04PM +0800, chenzhou wrote:
> On 2020/8/10 14:03, Dave Young wrote:
> >>> Previously I remember we talked about to use similar logic as X86, but I
> >>> remember you mentioned on some arm64 platform there could be no low
> >>> memory at all. Is this not a problem now for the fallback? Just be
> >>> curious, thanks for the update, for the common part looks good.
> >>
> >> Did you mean this discuss: https://lkml.org/lkml/2019/12/27/122?
> > I meant about this reply instead :)
> > https://lkml.org/lkml/2020/1/16/616
>
> Sorry for not replying in time; I was on holiday last week.
>
> The platform James mentioned may exist for which have no devices and
> need no low memory.

If there is no memory below 4GB, the arm64 kernel assumes that the
32-bit devices will have some DMA offsets shifting the addresses to the
bottom of the available RAM. So even if RAM starts above 4GB,
ZONE_DMA32 will be allocated in the bottom 4GB of the high memory (and
if the hardware designers forgot to shift those DMA accesses, we don't
have to support the platform ;)).

So the arm64 notion of low memory differs slightly from the x86 one.
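
As a rough illustration of the point above, here is a minimal userspace
sketch of how a zone limit can follow the start of RAM instead of being
pinned at 4GB. The helper name zone_limit() and the sample addresses are
made up for this note; it is a model of the idea, not the kernel's code.

```c
#include <stdint.h>

/*
 * Hypothetical helper: upper physical limit of a zone whose devices can
 * address zone_bits bits, given where DRAM starts and ends.
 * The zone window is anchored at dram_start rounded down to a
 * 2^zone_bits boundary, so it moves up with RAM that starts above 4GB.
 */
static uint64_t zone_limit(uint64_t dram_start, uint64_t dram_end,
			   unsigned int zone_bits)
{
	uint64_t window = 1ULL << zone_bits;
	uint64_t offset = dram_start & ~(window - 1);
	uint64_t limit = offset + window;

	/* The zone cannot extend past the end of RAM. */
	return limit < dram_end ? limit : dram_end;
}
```

With RAM at 2GB..6GB and zone_bits = 32 this gives a 4GB limit, while
RAM at 32GB..40GB gives a 36GB limit, so the "low" zone is simply the
first 4GB-aligned window that contains the start of RAM.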

--
Catalin

2020-09-02 16:44:23

by Catalin Marinas

Subject: Re: [PATCH v11 5/5] kdump: update Documentation about crashkernel

On Tue, Aug 18, 2020 at 03:07:04PM +0800, chenzhou wrote:
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index a8e34d97a894..4df18c7ea438 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -147,7 +147,7 @@ static void __init reserve_crashkernel(void)
> }
> memblock_reserve(crash_base, crash_size);
>
> - if (crash_base >= CRASH_ADDR_LOW_MAX) {
> + if (memstart_addr < CRASH_ADDR_LOW_MAX && crash_base >= CRASH_ADDR_LOW_MAX) {
> const char *rename = "Crash kernel (low)";

Since CRASH_ADDR_LOW_MAX is defined as arm64_dma32_phys_limit and such
limit is always greater than memstart_addr, this additional check
doesn't do anything. See my other reply on how ZONE_DMA32 is created on
arm64.
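
A small sketch of why the extra comparison is a no-op, assuming only the
invariant stated above (the low limit is always greater than the start
of memory). The function names and the low_max parameter, which stands
in for CRASH_ADDR_LOW_MAX, are illustrative and not taken from the
series.

```c
#include <stdbool.h>
#include <stdint.h>

/* The check as it stands: is the crash region above the low limit? */
static bool needs_low_region(uint64_t crash_base, uint64_t low_max)
{
	return crash_base >= low_max;
}

/* The proposed variant with the additional memstart comparison. */
static bool needs_low_region_proposed(uint64_t memstart, uint64_t crash_base,
				      uint64_t low_max)
{
	/* memstart < low_max is always true under the stated invariant. */
	return memstart < low_max && crash_base >= low_max;
}
```

Whenever memstart < low_max holds, both functions return the same value
for every crash_base, so the first term never changes the outcome.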

--
Catalin

2020-09-02 17:12:40

by Catalin Marinas

Subject: Re: [PATCH v11 3/5] arm64: kdump: reimplement crashkernel=X

On Sat, Aug 01, 2020 at 09:08:54PM +0800, Chen Zhou wrote:
> There are the following issues in arm64 kdump:
> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> will fail when there is not enough low memory.
> 2. If crashkernel is reserved above 4G, the crash dump
> kernel will fail to boot because there is no low memory available
> for allocation.
> 3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
> if the memory reserved for the crash dump kernel falls in ZONE_DMA32,
> devices in the crash dump kernel that need ZONE_DMA will fail to
> allocate.
>
> To solve these issues, change the behavior of crashkernel=X.
> crashkernel=X tries low allocation in ZONE_DMA, and falls back to
> high allocation if it fails.
>
> If the required size X is too large and leaves very little free memory
> in ZONE_DMA after low allocation, the system may not work normally.
> So add a threshold and go for high allocation directly if the required
> size is too large. The threshold is set to half of
> the low memory.
>
> If crash_base is outside ZONE_DMA, try to allocate at least 256M in
> ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
> specified size low memory.
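
The policy quoted above can be modeled roughly as follows. This is a toy
sketch, not the code from the series: the names, the boolean inputs, and
the exact comparison against the threshold ("larger than half") are
assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEFAULT_LOW_SIZE (256ULL << 20)	/* 256M, as in the description */

enum crash_placement { CRASH_LOW_ONLY, CRASH_HIGH_WITH_LOW };

/*
 * Decide where crashkernel=X lands.
 * low_total:    size of the low zone (ZONE_DMA or ZONE_DMA32)
 * low_alloc_ok: whether a low allocation of `size` would succeed
 */
static enum crash_placement place_crashkernel(uint64_t size, uint64_t low_total,
					      bool low_alloc_ok)
{
	/* Assumed threshold: skip the low attempt above half of low memory. */
	if (size <= low_total / 2 && low_alloc_ok)
		return CRASH_LOW_ONLY;
	return CRASH_HIGH_WITH_LOW;
}

/* Size of the additional low region reserved alongside a high region. */
static uint64_t extra_low_size(enum crash_placement p, bool user_low_given,
			       uint64_t user_low_size)
{
	if (p == CRASH_LOW_ONLY)
		return 0;
	/* crashkernel=Y,low overrides the automatic 256M. */
	return user_low_given ? user_low_size : DEFAULT_LOW_SIZE;
}
```

For example, a 512M request with 4G of low memory stays low, while a 3G
request goes high and picks up a 256M low region unless
crashkernel=Y,low says otherwise.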

Except for the threshold to preserve ZONE_DMA memory,
reserve_crashkernel() looks very close to the x86 version. Shall we try
to make this generic as well? In the first instance, you could avoid the
threshold check if it takes an explicit ",high" option.

--
Catalin

2020-09-02 17:15:34

by Catalin Marinas

Subject: Re: [PATCH v11 5/5] kdump: update Documentation about crashkernel

On Sat, Aug 01, 2020 at 09:08:56PM +0800, Chen Zhou wrote:
> diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
> index 2da65fef2a1c..4b58f97351d5 100644
> --- a/Documentation/admin-guide/kdump/kdump.rst
> +++ b/Documentation/admin-guide/kdump/kdump.rst
> @@ -299,7 +299,15 @@ Boot into System Kernel
> "crashkernel=64M@16M" tells the system kernel to reserve 64 MB of memory
> starting at physical address 0x01000000 (16MB) for the dump-capture kernel.
>
> - On x86 and x86_64, use "crashkernel=64M@16M".
> + On x86 use "crashkernel=64M@16M".
> +
> + On x86_64, use "crashkernel=X" to select a region under 4G first, and
> + fall back to reserve region above 4G.
> + We can also use "crashkernel=X,high" to select a region above 4G, which
> + also tries to allocate at least 256M below 4G automatically and
> + "crashkernel=Y,low" can be used to allocate specified size low memory.
> + Use "crashkernel=Y@X" if you really have to reserve memory from specified
> + start address X.
>
> On ppc64, use "crashkernel=128M@32M".
>
> @@ -316,8 +324,15 @@ Boot into System Kernel
> kernel will automatically locate the crash kernel image within the
> first 512MB of RAM if X is not given.
>
> - On arm64, use "crashkernel=Y[@X]". Note that the start address of
> - the kernel, X if explicitly specified, must be aligned to 2MiB (0x200000).
> + On arm64, use "crashkernel=X" to try low allocation in ZONE_DMA, and
> + fall back to high allocation if it fails. And go for high allocation
> + directly if the required size is too large. If crash_base is outside

I wouldn't mention crash_base in the admin guide. That's an
implementation detail really and admins are not supposed to read the
source code to make sense of the documentation. ZONE_DMA is also a
kernel internal, so you'd need to define what it is for arm64. At least
the DMA and DMA32 zones are printed during kernel boot.

> + ZONE_DMA, try to allocate at least 256M in ZONE_DMA automatically.
> + "crashkernel=Y,low" can be used to allocate specified size low memory.
> + For non-RPi4 platforms, change ZONE_DMA mentioned above to ZONE_DMA32.
> + Use "crashkernel=Y@X" if you really have to reserve memory from
> + specified start address X. Note that the start address of the kernel,
> + X if explicitly specified, must be aligned to 2MiB (0x200000).

--
Catalin

2020-09-03 14:57:24

by chenzhou

Subject: Re: [PATCH v11 3/5] arm64: kdump: reimplement crashkernel=X



On 2020/9/3 19:26, chenzhou wrote:
> Hi Catalin,
>
>
> On 2020/9/3 1:09, Catalin Marinas wrote:
>> On Sat, Aug 01, 2020 at 09:08:54PM +0800, Chen Zhou wrote:
>>> There are the following issues in arm64 kdump:
>>> 1. We use crashkernel=X to reserve crashkernel below 4G, which
>>> will fail when there is not enough low memory.
>>> 2. If crashkernel is reserved above 4G, the crash dump
>>> kernel will fail to boot because there is no low memory available
>>> for allocation.
>>> 3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
>>> if the memory reserved for the crash dump kernel falls in ZONE_DMA32,
>>> devices in the crash dump kernel that need ZONE_DMA will fail to
>>> allocate.
>>>
>>> To solve these issues, change the behavior of crashkernel=X.
>>> crashkernel=X tries low allocation in ZONE_DMA, and falls back to
>>> high allocation if it fails.
>>>
>>> If the required size X is too large and leaves very little free memory
>>> in ZONE_DMA after low allocation, the system may not work normally.
>>> So add a threshold and go for high allocation directly if the required
>>> size is too large. The threshold is set to half of
>>> the low memory.
>>>
>>> If crash_base is outside ZONE_DMA, try to allocate at least 256M in
>>> ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
>>> specified size low memory.
>> Except for the threshold to keep zone ZONE_DMA memory,
>> reserve_crashkernel() looks very close to the x86 version. Shall we try
>> to make this generic as well? In the first instance, you could avoid the
>> threshold check if it takes an explicit ",high" option.
> Ok, i will try to do this.
>
> I look into the function reserve_crashkernel() of x86 and found the start address is
> CRASH_ALIGN in function memblock_find_in_range(), which is different with arm64.
>
> I don't figure out why is CRASH_ALIGN in x86, is there any specific reason?
Besides, in the x86 function reserve_crashkernel_low(), the start address is 0.

>
> Thanks,
> Chen Zhou
>
>
>
>
> .
>


2020-09-03 15:15:04

by chenzhou

Subject: Re: [PATCH v11 5/5] kdump: update Documentation about crashkernel



On 2020/9/3 1:13, Catalin Marinas wrote:
> On Sat, Aug 01, 2020 at 09:08:56PM +0800, Chen Zhou wrote:
>> diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
>> index 2da65fef2a1c..4b58f97351d5 100644
>> --- a/Documentation/admin-guide/kdump/kdump.rst
>> +++ b/Documentation/admin-guide/kdump/kdump.rst
>> @@ -299,7 +299,15 @@ Boot into System Kernel
>> "crashkernel=64M@16M" tells the system kernel to reserve 64 MB of memory
>> starting at physical address 0x01000000 (16MB) for the dump-capture kernel.
>>
>> - On x86 and x86_64, use "crashkernel=64M@16M".
>> + On x86 use "crashkernel=64M@16M".
>> +
>> + On x86_64, use "crashkernel=X" to select a region under 4G first, and
>> + fall back to reserve region above 4G.
>> + We can also use "crashkernel=X,high" to select a region above 4G, which
>> + also tries to allocate at least 256M below 4G automatically and
>> + "crashkernel=Y,low" can be used to allocate specified size low memory.
>> + Use "crashkernel=Y@X" if you really have to reserve memory from specified
>> + start address X.
>>
>> On ppc64, use "crashkernel=128M@32M".
>>
>> @@ -316,8 +324,15 @@ Boot into System Kernel
>> kernel will automatically locate the crash kernel image within the
>> first 512MB of RAM if X is not given.
>>
>> - On arm64, use "crashkernel=Y[@X]". Note that the start address of
>> - the kernel, X if explicitly specified, must be aligned to 2MiB (0x200000).
>> + On arm64, use "crashkernel=X" to try low allocation in ZONE_DMA, and
>> + fall back to high allocation if it fails. And go for high allocation
>> + directly if the required size is too large. If crash_base is outside
> I wouldn't mention crash_base in the admin guide. That's an
> implementation detail really and admins are not supposed to read the
> source code to make sense of the documentation. ZONE_DMA is also a
> kernel internal, so you'd need to define what it is for arm64. At least
> the DMA and DMA32 zones are printed during kernel boot.
OK, I will fix this in the next version.
>
>> + ZONE_DMA, try to allocate at least 256M in ZONE_DMA automatically.
>> + "crashkernel=Y,low" can be used to allocate specified size low memory.
>> + For non-RPi4 platforms, change ZONE_DMA mentioned above to ZONE_DMA32.
>> + Use "crashkernel=Y@X" if you really have to reserve memory from
>> + specified start address X. Note that the start address of the kernel,
>> + X if explicitly specified, must be aligned to 2MiB (0x200000).


2020-09-03 15:21:11

by chenzhou

Subject: Re: [PATCH v11 3/5] arm64: kdump: reimplement crashkernel=X

Hi Catalin,


On 2020/9/3 1:09, Catalin Marinas wrote:
> On Sat, Aug 01, 2020 at 09:08:54PM +0800, Chen Zhou wrote:
>> There are the following issues in arm64 kdump:
>> 1. We use crashkernel=X to reserve crashkernel below 4G, which
>> will fail when there is not enough low memory.
>> 2. If crashkernel is reserved above 4G, the crash dump
>> kernel will fail to boot because there is no low memory available
>> for allocation.
>> 3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
>> if the memory reserved for the crash dump kernel falls in ZONE_DMA32,
>> devices in the crash dump kernel that need ZONE_DMA will fail to
>> allocate.
>>
>> To solve these issues, change the behavior of crashkernel=X.
>> crashkernel=X tries low allocation in ZONE_DMA, and falls back to
>> high allocation if it fails.
>>
>> If the required size X is too large and leaves very little free memory
>> in ZONE_DMA after low allocation, the system may not work normally.
>> So add a threshold and go for high allocation directly if the required
>> size is too large. The threshold is set to half of
>> the low memory.
>>
>> If crash_base is outside ZONE_DMA, try to allocate at least 256M in
>> ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
>> specified size low memory.
> Except for the threshold to keep zone ZONE_DMA memory,
> reserve_crashkernel() looks very close to the x86 version. Shall we try
> to make this generic as well? In the first instance, you could avoid the
> threshold check if it takes an explicit ",high" option.
OK, I will try to do this.

I looked into the x86 reserve_crashkernel() and found that the start address passed to
memblock_find_in_range() is CRASH_ALIGN, which is different from arm64.

I can't figure out why it is CRASH_ALIGN on x86. Is there any specific reason?

Thanks,
Chen Zhou


2020-09-04 03:07:36

by Dave Young

Subject: Re: [PATCH v11 3/5] arm64: kdump: reimplement crashkernel=X

On 09/03/20 at 07:26pm, chenzhou wrote:
> Hi Catalin,
>
>
> On 2020/9/3 1:09, Catalin Marinas wrote:
> > On Sat, Aug 01, 2020 at 09:08:54PM +0800, Chen Zhou wrote:
> >> There are the following issues in arm64 kdump:
> >> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> >> will fail when there is not enough low memory.
> >> 2. If crashkernel is reserved above 4G, the crash dump
> >> kernel will fail to boot because there is no low memory available
> >> for allocation.
> >> 3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
> >> if the memory reserved for the crash dump kernel falls in ZONE_DMA32,
> >> devices in the crash dump kernel that need ZONE_DMA will fail to
> >> allocate.
> >>
> >> To solve these issues, change the behavior of crashkernel=X.
> >> crashkernel=X tries low allocation in ZONE_DMA, and falls back to
> >> high allocation if it fails.
> >>
> >> If the required size X is too large and leaves very little free memory
> >> in ZONE_DMA after low allocation, the system may not work normally.
> >> So add a threshold and go for high allocation directly if the required
> >> size is too large. The threshold is set to half of
> >> the low memory.
> >>
> >> If crash_base is outside ZONE_DMA, try to allocate at least 256M in
> >> ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
> >> specified size low memory.
> > Except for the threshold to keep zone ZONE_DMA memory,
> > reserve_crashkernel() looks very close to the x86 version. Shall we try
> > to make this generic as well? In the first instance, you could avoid the
> > threshold check if it takes an explicit ",high" option.
> Ok, i will try to do this.
>
> I look into the function reserve_crashkernel() of x86 and found the start address is
> CRASH_ALIGN in function memblock_find_in_range(), which is different with arm64.
>
> I don't figure out why is CRASH_ALIGN in x86, is there any specific reason?

Hmm, I took another look at the option CONFIG_PHYSICAL_ALIGN:

config PHYSICAL_ALIGN
	hex "Alignment value to which kernel should be aligned"
	default "0x200000"
	range 0x2000 0x1000000 if X86_32
	range 0x200000 0x1000000 if X86_64

According to the above, I think the 16M comes from the largest value.
But the default value is 2M, and with a smaller value the reservation has
a better chance to succeed.
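
To make the "better chance to succeed" point concrete, here is a small
sketch that counts how many aligned start addresses fit in a free range
for a given alignment. The helper names and the sample range are
invented for illustration.

```c
#include <stdint.h>

#define MiB (1ULL << 20)

/* Round x up to the next multiple of align (align is a power of two). */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Count the aligned start addresses at which a block of `size` bytes
 * fits entirely inside the free range [start, end).
 */
static uint64_t aligned_slots(uint64_t start, uint64_t end,
			      uint64_t size, uint64_t align)
{
	uint64_t first = align_up(start, align);

	if (first > end || end - first < size)
		return 0;
	return (end - first - size) / align + 1;
}
```

For a free range of 17M..145M and a 120M reservation, 2M alignment
leaves four possible bases (18M, 20M, 22M, 24M), while 16M alignment
leaves none, because the first 16M-aligned base is 32M and only 113M
remain after it.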

It seems we still need an arch-specific CRASH_ALIGN. In the initial
version you added an #ifdef for the different arches; can you move the
macro to arch-specific headers?

Thanks
Dave

2020-09-04 03:11:47

by Dave Young

Subject: Re: [PATCH v11 3/5] arm64: kdump: reimplement crashkernel=X

On 09/04/20 at 11:04am, Dave Young wrote:
> On 09/03/20 at 07:26pm, chenzhou wrote:
> > Hi Catalin,
> >
> >
> > On 2020/9/3 1:09, Catalin Marinas wrote:
> > > On Sat, Aug 01, 2020 at 09:08:54PM +0800, Chen Zhou wrote:
> > >> There are the following issues in arm64 kdump:
> > >> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> > >> will fail when there is not enough low memory.
> > >> 2. If crashkernel is reserved above 4G, the crash dump
> > >> kernel will fail to boot because there is no low memory available
> > >> for allocation.
> > >> 3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
> > >> if the memory reserved for the crash dump kernel falls in ZONE_DMA32,
> > >> devices in the crash dump kernel that need ZONE_DMA will fail to
> > >> allocate.
> > >>
> > >> To solve these issues, change the behavior of crashkernel=X.
> > >> crashkernel=X tries low allocation in ZONE_DMA, and falls back to
> > >> high allocation if it fails.
> > >>
> > >> If the required size X is too large and leaves very little free memory
> > >> in ZONE_DMA after low allocation, the system may not work normally.
> > >> So add a threshold and go for high allocation directly if the required
> > >> size is too large. The threshold is set to half of
> > >> the low memory.
> > >>
> > >> If crash_base is outside ZONE_DMA, try to allocate at least 256M in
> > >> ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
> > >> specified size low memory.
> > > Except for the threshold to keep zone ZONE_DMA memory,
> > > reserve_crashkernel() looks very close to the x86 version. Shall we try
> > > to make this generic as well? In the first instance, you could avoid the
> > > threshold check if it takes an explicit ",high" option.
> > Ok, i will try to do this.
> >
> > I look into the function reserve_crashkernel() of x86 and found the start address is
> > CRASH_ALIGN in function memblock_find_in_range(), which is different with arm64.
> >
> > I don't figure out why is CRASH_ALIGN in x86, is there any specific reason?
>
> Hmm, took another look at the option CONFIG_PHYSICAL_ALIGN
> config PHYSICAL_ALIGN
> hex "Alignment value to which kernel should be aligned"
> default "0x200000"
> range 0x2000 0x1000000 if X86_32
> range 0x200000 0x1000000 if X86_64
>
> According to above, I think the 16M should come from the largest value
> But the default value is 2M, with smaller value reservation can have
> more chance to succeed.
>
> It seems we still need arch specific CRASH_ALIGN, but the initial
> version you added the #ifdef for different arches, can you move the
> macro to arch specific headers?

And just keep the x86 align value as is. I can try to change the x86
value to CONFIG_PHYSICAL_ALIGN later; that way this series can be
cleaner.

>
> Thanks
> Dave

2020-09-04 04:07:25

by chenzhou

Subject: Re: [PATCH v11 3/5] arm64: kdump: reimplement crashkernel=X



On 2020/9/4 11:10, Dave Young wrote:
> On 09/04/20 at 11:04am, Dave Young wrote:
>> On 09/03/20 at 07:26pm, chenzhou wrote:
>>> Hi Catalin,
>>>
>>>
>>> On 2020/9/3 1:09, Catalin Marinas wrote:
>>>> On Sat, Aug 01, 2020 at 09:08:54PM +0800, Chen Zhou wrote:
>>>>> There are the following issues in arm64 kdump:
>>>>> 1. We use crashkernel=X to reserve crashkernel below 4G, which
>>>>> will fail when there is not enough low memory.
>>>>> 2. If crashkernel is reserved above 4G, the crash dump
>>>>> kernel will fail to boot because there is no low memory available
>>>>> for allocation.
>>>>> 3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
>>>>> if the memory reserved for the crash dump kernel falls in ZONE_DMA32,
>>>>> devices in the crash dump kernel that need ZONE_DMA will fail to
>>>>> allocate.
>>>>>
>>>>> To solve these issues, change the behavior of crashkernel=X.
>>>>> crashkernel=X tries low allocation in ZONE_DMA, and falls back to
>>>>> high allocation if it fails.
>>>>>
>>>>> If the required size X is too large and leaves very little free memory
>>>>> in ZONE_DMA after low allocation, the system may not work normally.
>>>>> So add a threshold and go for high allocation directly if the required
>>>>> size is too large. The threshold is set to half of
>>>>> the low memory.
>>>>>
>>>>> If crash_base is outside ZONE_DMA, try to allocate at least 256M in
>>>>> ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
>>>>> specified size low memory.
>>>> Except for the threshold to keep zone ZONE_DMA memory,
>>>> reserve_crashkernel() looks very close to the x86 version. Shall we try
>>>> to make this generic as well? In the first instance, you could avoid the
>>>> threshold check if it takes an explicit ",high" option.
>>> Ok, i will try to do this.
>>>
>>> I look into the function reserve_crashkernel() of x86 and found the start address is
>>> CRASH_ALIGN in function memblock_find_in_range(), which is different with arm64.
>>>
>>> I don't figure out why is CRASH_ALIGN in x86, is there any specific reason?
>> Hmm, took another look at the option CONFIG_PHYSICAL_ALIGN
>> config PHYSICAL_ALIGN
>> hex "Alignment value to which kernel should be aligned"
>> default "0x200000"
>> range 0x2000 0x1000000 if X86_32
>> range 0x200000 0x1000000 if X86_64
>>
>> According to above, I think the 16M should come from the largest value
>> But the default value is 2M, with smaller value reservation can have
>> more chance to succeed.
>>
>> It seems we still need arch specific CRASH_ALIGN, but the initial
>> version you added the #ifdef for different arches, can you move the
>> macro to arch specific headers?
> And just keep the x86 align value as is, I can try to change the x86
> value later to CONFIG_PHYSICAL_ALIGN, in this way this series can be
> cleaner.
OK. My question is not about the value of the macro CRASH_ALIGN,
but about the lower bound passed to memblock_find_in_range().

On x86, reserve_crashkernel() restricts the lower bound of the range to CRASH_ALIGN:
...
crash_base = memblock_find_in_range(CRASH_ALIGN,
                                    CRASH_ADDR_LOW_MAX,
                                    crash_size, CRASH_ALIGN);
...

while reserve_crashkernel_low() has no such restriction:
...
low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
...

How about making all the memblock_find_in_range() calls search from the start of memory?
If that is OK, I will do it this way in the generic version.
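
To show what the lower bound changes, here is a toy range search over a
single free region. It is only a stand-in written for this note, not
memblock_find_in_range() itself; the struct, the function name, and the
top-down search order are assumptions of the sketch.

```c
#include <stdint.h>

struct free_region {
	uint64_t base;	/* inclusive */
	uint64_t end;	/* exclusive */
};

/*
 * Find the highest `align`-aligned base such that [base, base + size)
 * lies inside both the free region and the window [lo, hi).
 * Returns 0 when nothing fits, so a base of 0 itself cannot be
 * reported by this toy (align must be a power of two).
 */
static uint64_t toy_find_in_range(struct free_region r, uint64_t lo,
				  uint64_t hi, uint64_t size, uint64_t align)
{
	uint64_t start = r.base > lo ? r.base : lo;
	uint64_t end = r.end < hi ? r.end : hi;
	uint64_t cand;

	if (end <= start || end - start < size)
		return 0;

	cand = (end - size) & ~(align - 1);	/* round down to alignment */
	return cand >= start ? cand : 0;
}
```

With a free region of 2M..20M, a 16M request and 2M alignment, a lower
bound of 0 finds a base at 4M, while a lower bound of 16M finds nothing
because only 4M of the region remain inside the window.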

Thanks,
Chen Zhou
>
>> Thanks
>> Dave
>
> .
>


2020-09-04 04:19:54

by Dave Young

Subject: Re: [PATCH v11 3/5] arm64: kdump: reimplement crashkernel=X

On 09/04/20 at 12:02pm, chenzhou wrote:
>
>
> On 2020/9/4 11:10, Dave Young wrote:
> > On 09/04/20 at 11:04am, Dave Young wrote:
> >> On 09/03/20 at 07:26pm, chenzhou wrote:
> >>> Hi Catalin,
> >>>
> >>>
> >>> On 2020/9/3 1:09, Catalin Marinas wrote:
> >>>> On Sat, Aug 01, 2020 at 09:08:54PM +0800, Chen Zhou wrote:
> >>>>> There are the following issues in arm64 kdump:
> >>>>> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> >>>>> will fail when there is not enough low memory.
> >>>>> 2. If crashkernel is reserved above 4G, the crash dump
> >>>>> kernel will fail to boot because there is no low memory available
> >>>>> for allocation.
> >>>>> 3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
> >>>>> if the memory reserved for the crash dump kernel falls in ZONE_DMA32,
> >>>>> devices in the crash dump kernel that need ZONE_DMA will fail to
> >>>>> allocate.
> >>>>>
> >>>>> To solve these issues, change the behavior of crashkernel=X.
> >>>>> crashkernel=X tries low allocation in ZONE_DMA, and falls back to
> >>>>> high allocation if it fails.
> >>>>>
> >>>>> If the required size X is too large and leaves very little free memory
> >>>>> in ZONE_DMA after low allocation, the system may not work normally.
> >>>>> So add a threshold and go for high allocation directly if the required
> >>>>> size is too large. The threshold is set to half of
> >>>>> the low memory.
> >>>>>
> >>>>> If crash_base is outside ZONE_DMA, try to allocate at least 256M in
> >>>>> ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
> >>>>> specified size low memory.
> >>>> Except for the threshold to keep zone ZONE_DMA memory,
> >>>> reserve_crashkernel() looks very close to the x86 version. Shall we try
> >>>> to make this generic as well? In the first instance, you could avoid the
> >>>> threshold check if it takes an explicit ",high" option.
> >>> Ok, i will try to do this.
> >>>
> >>> I look into the function reserve_crashkernel() of x86 and found the start address is
> >>> CRASH_ALIGN in function memblock_find_in_range(), which is different with arm64.
> >>>
> >>> I don't figure out why is CRASH_ALIGN in x86, is there any specific reason?
> >> Hmm, took another look at the option CONFIG_PHYSICAL_ALIGN
> >> config PHYSICAL_ALIGN
> >> hex "Alignment value to which kernel should be aligned"
> >> default "0x200000"
> >> range 0x2000 0x1000000 if X86_32
> >> range 0x200000 0x1000000 if X86_64
> >>
> >> According to above, I think the 16M should come from the largest value
> >> But the default value is 2M, with smaller value reservation can have
> >> more chance to succeed.
> >>
> >> It seems we still need arch specific CRASH_ALIGN, but the initial
> >> version you added the #ifdef for different arches, can you move the
> >> macro to arch specific headers?
> > And just keep the x86 align value as is, I can try to change the x86
> > value later to CONFIG_PHYSICAL_ALIGN, in this way this series can be
> > cleaner.
> Ok. I have no question about the value of macro CRASH_ALIGN,
> instead the lower bound of memblock_find_in_range().
>
> For x86, in reserve_crashkernel(),restrict the lower bound of the range to CRASH_ALIGN,
> ...
> crash_base = memblock_find_in_range(CRASH_ALIGN,
> CRASH_ADDR_LOW_MAX,
> crash_size, CRASH_ALIGN);
> ...
>
> in reserve_crashkernel_low(),with no this restriction.
> ...
> low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
> ...
>
> How about all making memblock_find_in_range() search from the start of memory?
> If it is ok, i will do like this in the generic version.

I feel that starting from CRASH_ALIGN is better. Can you just search from
CRASH_ALIGN in the generic version?

Thanks
Dave

2020-09-04 06:40:33

by chenzhou

Subject: Re: [PATCH v11 3/5] arm64: kdump: reimplement crashkernel=X



On 2020/9/4 12:16, Dave Young wrote:
> On 09/04/20 at 12:02pm, chenzhou wrote:
>>
>> On 2020/9/4 11:10, Dave Young wrote:
>>> On 09/04/20 at 11:04am, Dave Young wrote:
>>>> On 09/03/20 at 07:26pm, chenzhou wrote:
>>>>> Hi Catalin,
>>>>>
>>>>>
>>>>> On 2020/9/3 1:09, Catalin Marinas wrote:
>>>>>> On Sat, Aug 01, 2020 at 09:08:54PM +0800, Chen Zhou wrote:
>>>>>>> There are following issues in arm64 kdump:
>>>>>>> 1. We use crashkernel=X to reserve crashkernel below 4G, which
>>>>>>> will fail when there is no enough low memory.
>>>>>>> 2. If reserving crashkernel above 4G, in this case, crash dump
>>>>>>> kernel will boot failure because there is no low memory available
>>>>>>> for allocation.
>>>>>>> 3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
>>>>>>> if the memory reserved for crash dump kernel falled in ZONE_DMA32,
>>>>>>> the devices in crash dump kernel need to use ZONE_DMA will alloc
>>>>>>> fail.
>>>>>>>
>>>>>>> To solve these issues, change the behavior of crashkernel=X.
>>>>>>> crashkernel=X tries low allocation in ZONE_DMA, and fall back to
>>>>>>> high allocation if it fails.
>>>>>>>
>>>>>>> If requized size X is too large and leads to very little free memory
>>>>>>> in ZONE_DMA after low allocation, the system may not work normally.
>>>>>>> So add a threshold and go for high allocation directly if the required
>>>>>>> size is too large. The value of threshold is set as the half of
>>>>>>> the low memory.
>>>>>>>
>>>>>>> If crash_base is outside ZONE_DMA, try to allocate at least 256M in
>>>>>>> ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
>>>>>>> specified size low memory.
>>>>>> Except for the threshold to keep zone ZONE_DMA memory,
>>>>>> reserve_crashkernel() looks very close to the x86 version. Shall we try
>>>>>> to make this generic as well? In the first instance, you could avoid the
>>>>>> threshold check if it takes an explicit ",high" option.
>>>>> Ok, i will try to do this.
>>>>>
>>>>> I look into the function reserve_crashkernel() of x86 and found the start address is
>>>>> CRASH_ALIGN in function memblock_find_in_range(), which is different with arm64.
>>>>>
>>>>> I don't figure out why is CRASH_ALIGN in x86, is there any specific reason?
>>>> Hmm, took another look at the option CONFIG_PHYSICAL_ALIGN
>>>> config PHYSICAL_ALIGN
>>>> hex "Alignment value to which kernel should be aligned"
>>>> default "0x200000"
>>>> range 0x2000 0x1000000 if X86_32
>>>> range 0x200000 0x1000000 if X86_64
>>>>
>>>> According to above, I think the 16M should come from the largest value
>>>> But the default value is 2M, with smaller value reservation can have
>>>> more chance to succeed.
>>>>
>>>> It seems we still need arch specific CRASH_ALIGN, but the initial
>>>> version you added the #ifdef for different arches, can you move the
>>>> macro to arch specific headers?
>>> And just keep the x86 align value as is, I can try to change the x86
>>> value later to CONFIG_PHYSICAL_ALIGN, in this way this series can be
>>> cleaner.
>> Ok. I have no question about the value of macro CRASH_ALIGN,
>> instead the lower bound of memblock_find_in_range().
>>
>> For x86, in reserve_crashkernel(),restrict the lower bound of the range to CRASH_ALIGN,
>> ...
>> crash_base = memblock_find_in_range(CRASH_ALIGN,
>> CRASH_ADDR_LOW_MAX,
>> crash_size, CRASH_ALIGN);
>> ...
>>
>> in reserve_crashkernel_low(),with no this restriction.
>> ...
>> low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
>> ...
>>
>> How about all making memblock_find_in_range() search from the start of memory?
>> If it is ok, i will do like this in the generic version.
> I feel starting with CRASH_ALIGN sounds better, can you just search from
> CRASH_ALIGN in generic version?
ok.
>
> Thanks
> Dave
>