2021-07-14 12:52:44

by Geert Uytterhoeven

[permalink] [raw]
Subject: [PATCH v4 00/10] Add generic support for kdump DT properties

Hi all,

This patch series adds generic support for parsing DT properties related
to crash dump kernels ("linux,elfcorehdr" and "linux,elfcorehdr" under
the "/chosen" node), makes use of it on arm64, arm32, and (partially)
riscv, and performs a few cleanups. It is an evolution of the
combination of [1] and [2].

The series consists of 5 parts:
1. Patches 1-2 prepare (the visibility of) variables used to hold
information retrieved from the DT properties.
2. Patches 3-5 add support to the FDT core for parsing the properties.
This can co-exist safely with architecture-specific parsing, until
the latter has been removed.
3. Patch 6 removes the non-standard handling of "linux,elfcorehdr" on
riscv.
I think this can be applied independently, as the non-standard
handling is in v5.13, but upstream riscv kdump support is still
incomplete.
4. Patches 7-9 convert arm64 to use the generic handling instead of
its own implementation.
5. Patch 10 adds support for kdump properties to arm32.
The corresponding patch for kexec-tools is "[PATCH] arm: kdump: Add
DT properties to crash dump kernel's DTB"[3], which is still valid.

Changes compared to the previous versions:
- Make elfcorehdr_{addr,size} always visible,
- Add variables for usable memory limitation,
- Use IS_ENABLED() instead of #ifdef (incl. initrd and arm64),
- Clarify what architecture-specific code is still responsible for,
- Add generic support for parsing linux,usable-memory-range,
- Remove custom linux,usable-memory-range parsing on arm64,
- Use generic handling on ARM.

This has been tested on arm32 and arm64, and compile-tested on riscv64.

Thanks for your comments!

[1] "[PATCH v3] ARM: Parse kdump DT properties"
https://lore.kernel.org/r/[email protected]/
[2] "[PATCH 0/3] Add generic-support for linux,elfcorehdr and fix riscv"
https://lore.kernel.org/r/[email protected]/
[3] "[PATCH] arm: kdump: Add DT properties to crash dump kernel's DTB"
https://lore.kernel.org/linux-arm-kernel/[email protected]/


Geert Uytterhoeven (10):
crash_dump: Make elfcorehdr_{addr,size} always visible
memblock: Add variables for usable memory limitation
of: fdt: Add generic support for parsing elf core headers property
of: fdt: Add generic support for parsing usable memory range property
of: fdt: Use IS_ENABLED(CONFIG_BLK_DEV_INITRD) instead of #ifdef
riscv: Remove non-standard linux,elfcorehdr handling
arm64: kdump: Remove custom linux,elfcorehdr parsing
arm64: kdump: Remove custom linux,usable-memory-range parsing
arm64: kdump: Use IS_ENABLED(CONFIG_CRASH_DUMP) instead of #ifdef
ARM: Parse kdump DT properties

Documentation/devicetree/bindings/chosen.txt | 12 ++--
.../arm/boot/compressed/fdt_check_mem_start.c | 48 ++++++++++++--
arch/arm/mm/init.c | 30 +++++++++
arch/arm64/mm/init.c | 63 +------------------
arch/riscv/mm/init.c | 20 ------
drivers/of/fdt.c | 63 ++++++++++++++++---
include/linux/crash_dump.h | 7 ++-
include/linux/memblock.h | 6 ++
mm/memblock.c | 2 +
9 files changed, 148 insertions(+), 103 deletions(-)

--
2.25.1

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


2021-07-14 12:53:17

by Geert Uytterhoeven

[permalink] [raw]
Subject: [PATCH v4 07/10] arm64: kdump: Remove custom linux,elfcorehdr parsing

Remove the architecture-specific code for parsing the "linux,elfcorehdr"
property under the "/chosen" node in DT, as the platform-agnostic
handling in the FDT core code already takes care of this.

Signed-off-by: Geert Uytterhoeven <[email protected]>
---
v4:
- s/handlng/parsing/ in patch description.
---
arch/arm64/mm/init.c | 21 ---------------------
1 file changed, 21 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 8490ed2917ff2430..946e246660f2b180 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -125,25 +125,6 @@ static void __init reserve_crashkernel(void)
#endif /* CONFIG_KEXEC_CORE */

#ifdef CONFIG_CRASH_DUMP
-static int __init early_init_dt_scan_elfcorehdr(unsigned long node,
- const char *uname, int depth, void *data)
-{
- const __be32 *reg;
- int len;
-
- if (depth != 1 || strcmp(uname, "chosen") != 0)
- return 0;
-
- reg = of_get_flat_dt_prop(node, "linux,elfcorehdr", &len);
- if (!reg || (len < (dt_root_addr_cells + dt_root_size_cells)))
- return 1;
-
- elfcorehdr_addr = dt_mem_next_cell(dt_root_addr_cells, &reg);
- elfcorehdr_size = dt_mem_next_cell(dt_root_size_cells, &reg);
-
- return 1;
-}
-
/*
* reserve_elfcorehdr() - reserves memory for elf core header
*
@@ -154,8 +135,6 @@ static int __init early_init_dt_scan_elfcorehdr(unsigned long node,
*/
static void __init reserve_elfcorehdr(void)
{
- of_scan_flat_dt(early_init_dt_scan_elfcorehdr, NULL);
-
if (!elfcorehdr_size)
return;

--
2.25.1

2021-07-14 12:53:49

by Geert Uytterhoeven

[permalink] [raw]
Subject: [PATCH v4 01/10] crash_dump: Make elfcorehdr_{addr,size} always visible

Make the forward declarations of elfcorehdr_addr and elfcorehdr_size
always available, like is done for phys_initrd_start and
phys_initrd_size. Code referring to these symbols can then just check
for IS_ENABLED(CONFIG_CRASH_DUMP), instead of requiring conditional
compilation using an #ifdef, thus preparing to increase compile
coverage.

Suggested-by: Rob Herring <[email protected]>
Signed-off-by: Geert Uytterhoeven <[email protected]>
---
v4:
- New.
---
include/linux/crash_dump.h | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/crash_dump.h b/include/linux/crash_dump.h
index a5192b718dbe4f9a..ad31893d13d634de 100644
--- a/include/linux/crash_dump.h
+++ b/include/linux/crash_dump.h
@@ -10,13 +10,14 @@

#include <linux/pgtable.h> /* for pgprot_t */

+/* For IS_ENABLED(CONFIG_CRASH_DUMP) */
+extern unsigned long long elfcorehdr_addr;
+extern unsigned long long elfcorehdr_size;
+
#ifdef CONFIG_CRASH_DUMP
#define ELFCORE_ADDR_MAX (-1ULL)
#define ELFCORE_ADDR_ERR (-2ULL)

-extern unsigned long long elfcorehdr_addr;
-extern unsigned long long elfcorehdr_size;
-
extern int elfcorehdr_alloc(unsigned long long *addr, unsigned long long *size);
extern void elfcorehdr_free(unsigned long long addr);
extern ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos);
--
2.25.1

2021-07-14 12:54:49

by Geert Uytterhoeven

[permalink] [raw]
Subject: [PATCH v4 08/10] arm64: kdump: Remove custom linux,usable-memory-range parsing

Remove the architecture-specific code for parsing the
"linux,usable-memory-range" property under the "/chosen" node in DT, as
the platform-agnostic handling in the FDT core code already takes care
of this.

Signed-off-by: Geert Uytterhoeven <[email protected]>
---
v4:
- New.
---
arch/arm64/mm/init.c | 34 +---------------------------------
1 file changed, 1 insertion(+), 33 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 946e246660f2b180..f90ba99437c0f3c9 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -227,44 +227,12 @@ static int __init early_mem(char *p)
}
early_param("mem", early_mem);

-static int __init early_init_dt_scan_usablemem(unsigned long node,
- const char *uname, int depth, void *data)
-{
- struct memblock_region *usablemem = data;
- const __be32 *reg;
- int len;
-
- if (depth != 1 || strcmp(uname, "chosen") != 0)
- return 0;
-
- reg = of_get_flat_dt_prop(node, "linux,usable-memory-range", &len);
- if (!reg || (len < (dt_root_addr_cells + dt_root_size_cells)))
- return 1;
-
- usablemem->base = dt_mem_next_cell(dt_root_addr_cells, &reg);
- usablemem->size = dt_mem_next_cell(dt_root_size_cells, &reg);
-
- return 1;
-}
-
-static void __init fdt_enforce_memory_region(void)
-{
- struct memblock_region reg = {
- .size = 0,
- };
-
- of_scan_flat_dt(early_init_dt_scan_usablemem, &reg);
-
- if (reg.size)
- memblock_cap_memory_range(reg.base, reg.size);
-}
-
void __init arm64_memblock_init(void)
{
const s64 linear_region_size = PAGE_END - _PAGE_OFFSET(vabits_actual);

/* Handle linux,usable-memory-range property */
- fdt_enforce_memory_region();
+ memblock_cap_memory_range(cap_mem_addr, cap_mem_size);

/* Remove memory above our supported physical address size */
memblock_remove(1ULL << PHYS_MASK_SHIFT, ULLONG_MAX);
--
2.25.1

2021-07-14 12:55:12

by Geert Uytterhoeven

[permalink] [raw]
Subject: [PATCH v4 02/10] memblock: Add variables for usable memory limitation

Add two global variables (cap_mem_addr and cap_mem_size) for storing a
base address and size, describing a limited region in which memory may
be considered available for use by the kernel. If enabled, memory
outside of this range is not available for use.

These variables can by filled by firmware-specific code, and used in
calls to memblock_cap_memory_range() by architecture-specific code.
An example user is the parser of the "linux,usable-memory-range"
property in the DT "/chosen" node.

Signed-off-by: Geert Uytterhoeven <[email protected]>
---
This is similar to how the initial ramdisk (phys_initrd_{start,size})
and ELF core headers (elfcorehdr_{addr,size})) are handled.

Does there exist a suitable place in the common memblock code to call
"memblock_cap_memory_range(cap_mem_addr, cap_mem_size)", or does this
have to be done in architecture-specific code?

v4:
- New.
---
include/linux/memblock.h | 6 ++++++
mm/memblock.c | 2 ++
2 files changed, 8 insertions(+)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index cbf46f56d1053b68..07e2474c4c3901e9 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -25,6 +25,12 @@ extern unsigned long max_pfn;
*/
extern unsigned long long max_possible_pfn;

+/*
+ * limited region in which memory may be considered usable by the kernel
+ */
+extern phys_addr_t cap_mem_addr;
+extern phys_addr_t cap_mem_size;
+
/**
* enum memblock_flags - definition of memory region attributes
* @MEMBLOCK_NONE: no special request
diff --git a/mm/memblock.c b/mm/memblock.c
index 0041ff62c584e7e1..66659f2b10ed40ff 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -101,6 +101,8 @@ unsigned long max_low_pfn;
unsigned long min_low_pfn;
unsigned long max_pfn;
unsigned long long max_possible_pfn;
+phys_addr_t cap_mem_addr;
+phys_addr_t cap_mem_size;

static struct memblock_region memblock_memory_init_regions[INIT_MEMBLOCK_REGIONS] __initdata_memblock;
static struct memblock_region memblock_reserved_init_regions[INIT_MEMBLOCK_RESERVED_REGIONS] __initdata_memblock;
--
2.25.1

2021-07-14 12:55:12

by Geert Uytterhoeven

[permalink] [raw]
Subject: [PATCH v4 03/10] of: fdt: Add generic support for parsing elf core headers property

There are two methods to specify the location of the elf core headers:
using the "elfcorehdr=" kernel parameter, as handled by generic code in
kernel/crash_dump.c, or using the "linux,elfcorehdr" property under the
"/chosen" node in the Device Tree, as handled by architecture-specific
code in arch/arm64/mm/init.c.

Extend support for "linux,elfcorehdr" to all platforms supporting DT by
adding platform-agnostic handling for parsing this property to the FDT
core code. This can co-exist safely with the architecture-specific
parsing, until the latter has been removed.

This requires moving the call to of_scan_flat_dt() up, as the code
scanning the "/chosen" node now needs to be aware of the values of
"#address-cells" and "#size-cells".

Architecture-specific code still has to reserve the memory used by the
elf core headers, if present.

Signed-off-by: Geert Uytterhoeven <[email protected]>
---
About the change to chosen.txt: I have a similar change for
schemas/chosen.yaml in dt-schema.

v4:
- Use IS_ENABLED() instead of #ifdef,
- Clarify what architecture-specific code is still responsible for.
---
Documentation/devicetree/bindings/chosen.txt | 6 ++--
drivers/of/fdt.c | 34 ++++++++++++++++++--
2 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/Documentation/devicetree/bindings/chosen.txt b/Documentation/devicetree/bindings/chosen.txt
index 45e79172a646c537..5b0b94eb2d04e79d 100644
--- a/Documentation/devicetree/bindings/chosen.txt
+++ b/Documentation/devicetree/bindings/chosen.txt
@@ -106,9 +106,9 @@ respectively, of the root node.
linux,elfcorehdr
----------------

-This property (currently used only on arm64) holds the memory range,
-the address and the size, of the elf core header which mainly describes
-the panicked kernel's memory layout as PT_LOAD segments of elf format.
+This property holds the memory range, the address and the size, of the elf
+core header which mainly describes the panicked kernel's memory layout as
+PT_LOAD segments of elf format.
e.g.

/ {
diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 344f16bb04ccf081..f797d52c5b492cb7 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -8,6 +8,7 @@

#define pr_fmt(fmt) "OF: fdt: " fmt

+#include <linux/crash_dump.h>
#include <linux/crc32.h>
#include <linux/kernel.h>
#include <linux/initrd.h>
@@ -908,6 +909,32 @@ static inline void early_init_dt_check_for_initrd(unsigned long node)
}
#endif /* CONFIG_BLK_DEV_INITRD */

+/**
+ * early_init_dt_check_for_elfcorehdr - Decode elfcorehdr location from flat
+ * tree
+ * @node: reference to node containing elfcorehdr location ('chosen')
+ */
+static void __init early_init_dt_check_for_elfcorehdr(unsigned long node)
+{
+ const __be32 *prop;
+ int len;
+
+ if (!IS_ENABLED(CONFIG_CRASH_DUMP))
+ return;
+
+ pr_debug("Looking for elfcorehdr property... ");
+
+ prop = of_get_flat_dt_prop(node, "linux,elfcorehdr", &len);
+ if (!prop || (len < (dt_root_addr_cells + dt_root_size_cells)))
+ return;
+
+ elfcorehdr_addr = dt_mem_next_cell(dt_root_addr_cells, &prop);
+ elfcorehdr_size = dt_mem_next_cell(dt_root_size_cells, &prop);
+
+ pr_debug("elfcorehdr_start=0x%llx elfcorehdr_size=0x%llx\n",
+ elfcorehdr_addr, elfcorehdr_size);
+}
+
#ifdef CONFIG_SERIAL_EARLYCON

int __init early_init_dt_scan_chosen_stdout(void)
@@ -1055,6 +1082,7 @@ int __init early_init_dt_scan_chosen(unsigned long node, const char *uname,
return 0;

early_init_dt_check_for_initrd(node);
+ early_init_dt_check_for_elfcorehdr(node);

/* Retrieve command line */
p = of_get_flat_dt_prop(node, "bootargs", &l);
@@ -1199,14 +1227,14 @@ void __init early_init_dt_scan_nodes(void)
{
int rc = 0;

+ /* Initialize {size,address}-cells info */
+ of_scan_flat_dt(early_init_dt_scan_root, NULL);
+
/* Retrieve various information from the /chosen node */
rc = of_scan_flat_dt(early_init_dt_scan_chosen, boot_command_line);
if (!rc)
pr_warn("No chosen node found, continuing without\n");

- /* Initialize {size,address}-cells info */
- of_scan_flat_dt(early_init_dt_scan_root, NULL);
-
/* Setup memory, calling early_init_dt_add_memory_arch */
of_scan_flat_dt(early_init_dt_scan_memory, NULL);
}
--
2.25.1

2021-07-14 12:55:18

by Geert Uytterhoeven

[permalink] [raw]
Subject: [PATCH v4 10/10] ARM: Parse kdump DT properties

Parse the following DT properties in the crash dump kernel, to provide a
modern interface between kexec and the crash dump kernel:
- linux,elfcorehdr: ELF core header segment, similar to the
"elfcorehdr=" kernel parameter.
- linux,usable-memory-range: Usable memory reserved for the crash dump
kernel.
This makes the memory reservation explicit. If present, Linux no
longer needs to mask the program counter, and rely on the "mem="
kernel parameter to obtain the start and size of usable memory.

For backwards compatibility, the traditional method to derive the start
of memory is still used if "linux,usable-memory-range" is absent, and
the "elfcorehdr=" and "mem=" kernel parameters are still parsed.

Loosely based on the ARM64 version by Akashi Takahiro.

Signed-off-by: Geert Uytterhoeven <[email protected]>
---
The corresponding patch for kexec-tools is "[PATCH] arm: kdump: Add DT
properties to crash dump kernel's DTB", which is still valid:
https://lore.kernel.org/linux-arm-kernel/[email protected]/

v4:
- Remove references to architectures in chosen.txt, to avoid having to
change this again when more architectures copy kdump support,
- Remove the architecture-specific code for parsing
"linux,usable-memory-range" and "linux,elfcorehdr", as the FDT core
code now takes care of this,
- Move chosen.txt change to patch changing the FDT core,
- Use IS_ENABLED(CONFIG_CRASH_DUMP) instead of #ifdef,

v3:
- Rebase on top of accepted solution for DTB memory information
handling, which is part of v5.12-rc1,

v2:
- Rebase on top of reworked DTB memory information handling.
---
.../arm/boot/compressed/fdt_check_mem_start.c | 48 ++++++++++++++++---
arch/arm/mm/init.c | 30 ++++++++++++
2 files changed, 72 insertions(+), 6 deletions(-)

diff --git a/arch/arm/boot/compressed/fdt_check_mem_start.c b/arch/arm/boot/compressed/fdt_check_mem_start.c
index 62450d824c3ca180..9291a2661bdfe57f 100644
--- a/arch/arm/boot/compressed/fdt_check_mem_start.c
+++ b/arch/arm/boot/compressed/fdt_check_mem_start.c
@@ -55,16 +55,17 @@ static uint64_t get_val(const fdt32_t *cells, uint32_t ncells)
* DTB, and, if out-of-range, replace it by the real start address.
* To preserve backwards compatibility (systems reserving a block of memory
* at the start of physical memory, kdump, ...), the traditional method is
- * always used if it yields a valid address.
+ * used if it yields a valid address, unless the "linux,usable-memory-range"
+ * property is present.
*
* Return value: start address of physical memory to use
*/
uint32_t fdt_check_mem_start(uint32_t mem_start, const void *fdt)
{
- uint32_t addr_cells, size_cells, base;
+ uint32_t addr_cells, size_cells, usable_base, base;
uint32_t fdt_mem_start = 0xffffffff;
- const fdt32_t *reg, *endp;
- uint64_t size, end;
+ const fdt32_t *usable, *reg, *endp;
+ uint64_t size, usable_end, end;
const char *type;
int offset, len;

@@ -80,6 +81,27 @@ uint32_t fdt_check_mem_start(uint32_t mem_start, const void *fdt)
if (addr_cells > 2 || size_cells > 2)
return mem_start;

+ /*
+ * Usable memory in case of a crash dump kernel
+ * This property describes a limitation: memory within this range is
+ * only valid when also described through another mechanism
+ */
+ usable = get_prop(fdt, "/chosen", "linux,usable-memory-range",
+ (addr_cells + size_cells) * sizeof(fdt32_t));
+ if (usable) {
+ size = get_val(usable + addr_cells, size_cells);
+ if (!size)
+ return mem_start;
+
+ if (addr_cells > 1 && fdt32_ld(usable)) {
+ /* Outside 32-bit address space */
+ return mem_start;
+ }
+
+ usable_base = fdt32_ld(usable + addr_cells - 1);
+ usable_end = usable_base + size;
+ }
+
/* Walk all memory nodes and regions */
for (offset = fdt_next_node(fdt, -1, NULL); offset >= 0;
offset = fdt_next_node(fdt, offset, NULL)) {
@@ -107,7 +129,20 @@ uint32_t fdt_check_mem_start(uint32_t mem_start, const void *fdt)

base = fdt32_ld(reg + addr_cells - 1);
end = base + size;
- if (mem_start >= base && mem_start < end) {
+ if (usable) {
+ /*
+ * Clip to usable range, which takes precedence
+ * over mem_start
+ */
+ if (base < usable_base)
+ base = usable_base;
+
+ if (end > usable_end)
+ end = usable_end;
+
+ if (end <= base)
+ continue;
+ } else if (mem_start >= base && mem_start < end) {
/* Calculated address is valid, use it */
return mem_start;
}
@@ -123,7 +158,8 @@ uint32_t fdt_check_mem_start(uint32_t mem_start, const void *fdt)
}

/*
- * The calculated address is not usable.
+ * The calculated address is not usable, or was overridden by the
+ * "linux,usable-memory-range" property.
* Use the lowest usable physical memory address from the DTB instead,
* and make sure this is a multiple of 2 MiB for phys/virt patching.
*/
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 6162a070a4104a26..dfaee199554dda97 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -4,6 +4,7 @@
*
* Copyright (C) 1995-2005 Russell King
*/
+#include <linux/crash_dump.h>
#include <linux/kernel.h>
#include <linux/errno.h>
#include <linux/swap.h>
@@ -221,8 +222,35 @@ void check_cpu_icache_size(int cpuid)
}
#endif

+/*
+ * reserve_elfcorehdr() - reserves memory for elf core header
+ *
+ * This function reserves the memory occupied by an elf core header
+ * described in the device tree. This region contains all the
+ * information about primary kernel's core image and is used by a dump
+ * capture kernel to access the system memory on primary kernel.
+ */
+static void __init reserve_elfcorehdr(void)
+{
+ if (!IS_ENABLED(CONFIG_CRASH_DUMP) || !elfcorehdr_size)
+ return;
+
+ if (memblock_is_region_reserved(elfcorehdr_addr, elfcorehdr_size)) {
+ pr_warn("elfcorehdr is overlapped\n");
+ return;
+ }
+
+ memblock_reserve(elfcorehdr_addr, elfcorehdr_size);
+
+ pr_info("Reserving %llu KiB of memory at 0x%llx for elfcorehdr\n",
+ elfcorehdr_size >> 10, elfcorehdr_addr);
+}
+
void __init arm_memblock_init(const struct machine_desc *mdesc)
{
+ /* Handle linux,usable-memory-range property */
+ memblock_cap_memory_range(cap_mem_addr, cap_mem_size);
+
/* Register the kernel text, kernel data and initrd with memblock. */
memblock_reserve(__pa(KERNEL_START), KERNEL_END - KERNEL_START);

@@ -236,6 +264,8 @@ void __init arm_memblock_init(const struct machine_desc *mdesc)

early_init_fdt_scan_reserved_mem();

+ reserve_elfcorehdr();
+
/* reserve memory for DMA contiguous allocations */
dma_contiguous_reserve(arm_dma_limit);

--
2.25.1

2021-07-14 13:49:12

by Rob Herring

[permalink] [raw]
Subject: Re: [PATCH v4 01/10] crash_dump: Make elfcorehdr_{addr,size} always visible

On Wed, Jul 14, 2021 at 02:50:11PM +0200, Geert Uytterhoeven wrote:
> Make the forward declarations of elfcorehdr_addr and elfcorehdr_size
> always available, like is done for phys_initrd_start and
> phys_initrd_size. Code referring to these symbols can then just check
> for IS_ENABLED(CONFIG_CRASH_DUMP), instead of requiring conditional
> compilation using an #ifdef, thus preparing to increase compile
> coverage.
>
> Suggested-by: Rob Herring <[email protected]>
> Signed-off-by: Geert Uytterhoeven <[email protected]>
> ---
> v4:
> - New.
> ---
> include/linux/crash_dump.h | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/crash_dump.h b/include/linux/crash_dump.h
> index a5192b718dbe4f9a..ad31893d13d634de 100644
> --- a/include/linux/crash_dump.h
> +++ b/include/linux/crash_dump.h
> @@ -10,13 +10,14 @@
>
> #include <linux/pgtable.h> /* for pgprot_t */
>
> +/* For IS_ENABLED(CONFIG_CRASH_DUMP) */
> +extern unsigned long long elfcorehdr_addr;
> +extern unsigned long long elfcorehdr_size;
> +
> #ifdef CONFIG_CRASH_DUMP
> #define ELFCORE_ADDR_MAX (-1ULL)
> #define ELFCORE_ADDR_ERR (-2ULL)

Seems like these could be needed and no need to hide them, so perhaps
just move the #ifdef down.

>
> -extern unsigned long long elfcorehdr_addr;
> -extern unsigned long long elfcorehdr_size;
> -
> extern int elfcorehdr_alloc(unsigned long long *addr, unsigned long long *size);
> extern void elfcorehdr_free(unsigned long long addr);
> extern ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos);
> --
> 2.25.1
>
>

2021-07-14 13:53:00

by Rob Herring

[permalink] [raw]
Subject: Re: [PATCH v4 02/10] memblock: Add variables for usable memory limitation

On Wed, Jul 14, 2021 at 02:50:12PM +0200, Geert Uytterhoeven wrote:
> Add two global variables (cap_mem_addr and cap_mem_size) for storing a
> base address and size, describing a limited region in which memory may
> be considered available for use by the kernel. If enabled, memory
> outside of this range is not available for use.
>
> These variables can by filled by firmware-specific code, and used in
> calls to memblock_cap_memory_range() by architecture-specific code.
> An example user is the parser of the "linux,usable-memory-range"
> property in the DT "/chosen" node.
>
> Signed-off-by: Geert Uytterhoeven <[email protected]>
> ---
> This is similar to how the initial ramdisk (phys_initrd_{start,size})
> and ELF core headers (elfcorehdr_{addr,size})) are handled.
>
> Does there exist a suitable place in the common memblock code to call
> "memblock_cap_memory_range(cap_mem_addr, cap_mem_size)", or does this
> have to be done in architecture-specific code?

Can't you just call it from early_init_dt_scan_usablemem? If the
property is present, you want to call it. If the property is not
present, nothing happens.

Rob

2021-07-14 14:54:00

by Rob Herring

[permalink] [raw]
Subject: Re: [PATCH v4 10/10] ARM: Parse kdump DT properties

On Wed, Jul 14, 2021 at 02:50:20PM +0200, Geert Uytterhoeven wrote:
> Parse the following DT properties in the crash dump kernel, to provide a
> modern interface between kexec and the crash dump kernel:
> - linux,elfcorehdr: ELF core header segment, similar to the
> "elfcorehdr=" kernel parameter.
> - linux,usable-memory-range: Usable memory reserved for the crash dump
> kernel.
> This makes the memory reservation explicit. If present, Linux no
> longer needs to mask the program counter, and rely on the "mem="
> kernel parameter to obtain the start and size of usable memory.
>
> For backwards compatibility, the traditional method to derive the start
> of memory is still used if "linux,usable-memory-range" is absent, and
> the "elfcorehdr=" and "mem=" kernel parameters are still parsed.
>
> Loosely based on the ARM64 version by Akashi Takahiro.
>
> Signed-off-by: Geert Uytterhoeven <[email protected]>
> ---
> The corresponding patch for kexec-tools is "[PATCH] arm: kdump: Add DT
> properties to crash dump kernel's DTB", which is still valid:
> https://lore.kernel.org/linux-arm-kernel/[email protected]/
>
> v4:
> - Remove references to architectures in chosen.txt, to avoid having to
> change this again when more architectures copy kdump support,
> - Remove the architecture-specific code for parsing
> "linux,usable-memory-range" and "linux,elfcorehdr", as the FDT core
> code now takes care of this,
> - Move chosen.txt change to patch changing the FDT core,
> - Use IS_ENABLED(CONFIG_CRASH_DUMP) instead of #ifdef,
>
> v3:
> - Rebase on top of accepted solution for DTB memory information
> handling, which is part of v5.12-rc1,
>
> v2:
> - Rebase on top of reworked DTB memory information handling.
> ---
> .../arm/boot/compressed/fdt_check_mem_start.c | 48 ++++++++++++++++---
> arch/arm/mm/init.c | 30 ++++++++++++
> 2 files changed, 72 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm/boot/compressed/fdt_check_mem_start.c b/arch/arm/boot/compressed/fdt_check_mem_start.c
> index 62450d824c3ca180..9291a2661bdfe57f 100644
> --- a/arch/arm/boot/compressed/fdt_check_mem_start.c
> +++ b/arch/arm/boot/compressed/fdt_check_mem_start.c
> @@ -55,16 +55,17 @@ static uint64_t get_val(const fdt32_t *cells, uint32_t ncells)
> * DTB, and, if out-of-range, replace it by the real start address.
> * To preserve backwards compatibility (systems reserving a block of memory
> * at the start of physical memory, kdump, ...), the traditional method is
> - * always used if it yields a valid address.
> + * used if it yields a valid address, unless the "linux,usable-memory-range"
> + * property is present.
> *
> * Return value: start address of physical memory to use
> */
> uint32_t fdt_check_mem_start(uint32_t mem_start, const void *fdt)
> {
> - uint32_t addr_cells, size_cells, base;
> + uint32_t addr_cells, size_cells, usable_base, base;
> uint32_t fdt_mem_start = 0xffffffff;
> - const fdt32_t *reg, *endp;
> - uint64_t size, end;
> + const fdt32_t *usable, *reg, *endp;
> + uint64_t size, usable_end, end;
> const char *type;
> int offset, len;
>
> @@ -80,6 +81,27 @@ uint32_t fdt_check_mem_start(uint32_t mem_start, const void *fdt)
> if (addr_cells > 2 || size_cells > 2)
> return mem_start;
>
> + /*
> + * Usable memory in case of a crash dump kernel
> + * This property describes a limitation: memory within this range is
> + * only valid when also described through another mechanism
> + */
> + usable = get_prop(fdt, "/chosen", "linux,usable-memory-range",
> + (addr_cells + size_cells) * sizeof(fdt32_t));
> + if (usable) {
> + size = get_val(usable + addr_cells, size_cells);
> + if (!size)
> + return mem_start;
> +
> + if (addr_cells > 1 && fdt32_ld(usable)) {
> + /* Outside 32-bit address space */
> + return mem_start;
> + }
> +
> + usable_base = fdt32_ld(usable + addr_cells - 1);
> + usable_end = usable_base + size;
> + }
> +
> /* Walk all memory nodes and regions */
> for (offset = fdt_next_node(fdt, -1, NULL); offset >= 0;
> offset = fdt_next_node(fdt, offset, NULL)) {
> @@ -107,7 +129,20 @@ uint32_t fdt_check_mem_start(uint32_t mem_start, const void *fdt)
>
> base = fdt32_ld(reg + addr_cells - 1);
> end = base + size;
> - if (mem_start >= base && mem_start < end) {
> + if (usable) {
> + /*
> + * Clip to usable range, which takes precedence
> + * over mem_start
> + */
> + if (base < usable_base)
> + base = usable_base;
> +
> + if (end > usable_end)
> + end = usable_end;
> +
> + if (end <= base)
> + continue;
> + } else if (mem_start >= base && mem_start < end) {
> /* Calculated address is valid, use it */
> return mem_start;
> }
> @@ -123,7 +158,8 @@ uint32_t fdt_check_mem_start(uint32_t mem_start, const void *fdt)
> }
>
> /*
> - * The calculated address is not usable.
> + * The calculated address is not usable, or was overridden by the
> + * "linux,usable-memory-range" property.
> * Use the lowest usable physical memory address from the DTB instead,
> * and make sure this is a multiple of 2 MiB for phys/virt patching.
> */
> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index 6162a070a4104a26..dfaee199554dda97 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -4,6 +4,7 @@
> *
> * Copyright (C) 1995-2005 Russell King
> */
> +#include <linux/crash_dump.h>
> #include <linux/kernel.h>
> #include <linux/errno.h>
> #include <linux/swap.h>
> @@ -221,8 +222,35 @@ void check_cpu_icache_size(int cpuid)
> }
> #endif
>
> +/*
> + * reserve_elfcorehdr() - reserves memory for elf core header
> + *
> + * This function reserves the memory occupied by an elf core header
> + * described in the device tree. This region contains all the
> + * information about primary kernel's core image and is used by a dump
> + * capture kernel to access the system memory on primary kernel.
> + */
> +static void __init reserve_elfcorehdr(void)
> +{
> + if (!IS_ENABLED(CONFIG_CRASH_DUMP) || !elfcorehdr_size)
> + return;
> +
> + if (memblock_is_region_reserved(elfcorehdr_addr, elfcorehdr_size)) {
> + pr_warn("elfcorehdr is overlapped\n");
> + return;
> + }
> +
> + memblock_reserve(elfcorehdr_addr, elfcorehdr_size);

Can't this be in common code?

> +
> + pr_info("Reserving %llu KiB of memory at 0x%llx for elfcorehdr\n",
> + elfcorehdr_size >> 10, elfcorehdr_addr);
> +}
> +
> void __init arm_memblock_init(const struct machine_desc *mdesc)
> {
> + /* Handle linux,usable-memory-range property */
> + memblock_cap_memory_range(cap_mem_addr, cap_mem_size);
> +
> /* Register the kernel text, kernel data and initrd with memblock. */
> memblock_reserve(__pa(KERNEL_START), KERNEL_END - KERNEL_START);
>
> @@ -236,6 +264,8 @@ void __init arm_memblock_init(const struct machine_desc *mdesc)
>
> early_init_fdt_scan_reserved_mem();
>
> + reserve_elfcorehdr();
> +
> /* reserve memory for DMA contiguous allocations */
> dma_contiguous_reserve(arm_dma_limit);
>
> --
> 2.25.1
>
>

2021-07-18 09:32:47

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH v4 02/10] memblock: Add variables for usable memory limitation

Hi,

On Wed, Jul 14, 2021 at 07:51:01AM -0600, Rob Herring wrote:
> On Wed, Jul 14, 2021 at 02:50:12PM +0200, Geert Uytterhoeven wrote:
> > Add two global variables (cap_mem_addr and cap_mem_size) for storing a
> > base address and size, describing a limited region in which memory may
> > be considered available for use by the kernel. If enabled, memory
> > outside of this range is not available for use.
> >
> > These variables can by filled by firmware-specific code, and used in
> > calls to memblock_cap_memory_range() by architecture-specific code.
> > An example user is the parser of the "linux,usable-memory-range"
> > property in the DT "/chosen" node.
> >
> > Signed-off-by: Geert Uytterhoeven <[email protected]>
> > ---
> > This is similar to how the initial ramdisk (phys_initrd_{start,size})
> > and ELF core headers (elfcorehdr_{addr,size})) are handled.
> >
> > Does there exist a suitable place in the common memblock code to call
> > "memblock_cap_memory_range(cap_mem_addr, cap_mem_size)", or does this
> > have to be done in architecture-specific code?
>
> Can't you just call it from early_init_dt_scan_usablemem? If the
> property is present, you want to call it. If the property is not
> present, nothing happens.

For memblock_cap_memory_range() to work properly it should be called after
memory is detected and added to memblock with memblock_add[_node]()

I'm not huge fan of adding more globals to memblock so if such ordering can
be implemented on the DT side it would be great.

I don't see a way to actually enforce this ordering, so maybe we'd want to
add warning in memblock_cap_memory_range() if memblock.memory is empty.

> Rob

--
Sincerely yours,
Mike.

2021-07-19 06:59:58

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH v4 02/10] memblock: Add variables for usable memory limitation

Hi Mike,

On Sun, Jul 18, 2021 at 11:31 AM Mike Rapoport <[email protected]> wrote:
> On Wed, Jul 14, 2021 at 07:51:01AM -0600, Rob Herring wrote:
> > On Wed, Jul 14, 2021 at 02:50:12PM +0200, Geert Uytterhoeven wrote:
> > > Add two global variables (cap_mem_addr and cap_mem_size) for storing a
> > > base address and size, describing a limited region in which memory may
> > > be considered available for use by the kernel. If enabled, memory
> > > outside of this range is not available for use.
> > >
> > > These variables can by filled by firmware-specific code, and used in
> > > calls to memblock_cap_memory_range() by architecture-specific code.
> > > An example user is the parser of the "linux,usable-memory-range"
> > > property in the DT "/chosen" node.
> > >
> > > Signed-off-by: Geert Uytterhoeven <[email protected]>
> > > ---
> > > This is similar to how the initial ramdisk (phys_initrd_{start,size})
> > > and ELF core headers (elfcorehdr_{addr,size})) are handled.
> > >
> > > Does there exist a suitable place in the common memblock code to call
> > > "memblock_cap_memory_range(cap_mem_addr, cap_mem_size)", or does this
> > > have to be done in architecture-specific code?
> >
> > Can't you just call it from early_init_dt_scan_usablemem? If the
> > property is present, you want to call it. If the property is not
> > present, nothing happens.

I will have a look...

> For memblock_cap_memory_range() to work properly it should be called after
> memory is detected and added to memblock with memblock_add[_node]()
>
> I'm not huge fan of adding more globals to memblock so if such ordering can
> be implemented on the DT side it would be great.

Me neither ;-)

> I don't see a way to actually enforce this ordering, so maybe we'd want to
> add warning in memblock_cap_memory_range() if memblock.memory is empty.

"linux,usable-memory-range" is optional, and typically used only in
crashdump kernels, so it would be a bad idea to add such a warning.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2021-07-20 05:43:13

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH v4 02/10] memblock: Add variables for usable memory limitation

Hi Geert,

On Mon, Jul 19, 2021 at 08:59:03AM +0200, Geert Uytterhoeven wrote:
> Hi Mike,
>
> On Sun, Jul 18, 2021 at 11:31 AM Mike Rapoport <[email protected]> wrote:
> > On Wed, Jul 14, 2021 at 07:51:01AM -0600, Rob Herring wrote:
> > > On Wed, Jul 14, 2021 at 02:50:12PM +0200, Geert Uytterhoeven wrote:
> > > > Add two global variables (cap_mem_addr and cap_mem_size) for storing a
> > > > base address and size, describing a limited region in which memory may
> > > > be considered available for use by the kernel. If enabled, memory
> > > > outside of this range is not available for use.
> > > >
> > > > These variables can by filled by firmware-specific code, and used in
> > > > calls to memblock_cap_memory_range() by architecture-specific code.
> > > > An example user is the parser of the "linux,usable-memory-range"
> > > > property in the DT "/chosen" node.
> > > >
> > > > Signed-off-by: Geert Uytterhoeven <[email protected]>
> > > > ---
> > > > This is similar to how the initial ramdisk (phys_initrd_{start,size})
> > > > and ELF core headers (elfcorehdr_{addr,size})) are handled.
> > > >
> > > > Does there exist a suitable place in the common memblock code to call
> > > > "memblock_cap_memory_range(cap_mem_addr, cap_mem_size)", or does this
> > > > have to be done in architecture-specific code?
> > >
> > > Can't you just call it from early_init_dt_scan_usablemem? If the
> > > property is present, you want to call it. If the property is not
> > > present, nothing happens.
>
> I will have a look...
>
> > For memblock_cap_memory_range() to work properly it should be called after
> > memory is detected and added to memblock with memblock_add[_node]()
> >
> > I'm not huge fan of adding more globals to memblock so if such ordering can
> > be implemented on the DT side it would be great.
>
> Me neither ;-)
>
> > I don't see a way to actually enforce this ordering, so maybe we'd want to
> > add warning in memblock_cap_memory_range() if memblock.memory is empty.
>
> "linux,usable-memory-range" is optional, and typically used only in
> crashdump kernels, so it would be a bad idea to add such a warning.

If I remember correctly, memblock_cap_memory_range() was added to support
"linux,usable-memory-range" for crasdump kernels on arm64 and if it would
be called before memory is registered we may silently corrupt the memory
because the crash kernel will see all the memory as available.

So while WARN() maybe too much a pr_warn() seems to me quite appropriate.

--
Sincerely yours,
Mike.

2021-07-20 07:24:56

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH v4 02/10] memblock: Add variables for usable memory limitation

Hi Mike,

On Tue, Jul 20, 2021 at 7:41 AM Mike Rapoport <[email protected]> wrote:
> On Mon, Jul 19, 2021 at 08:59:03AM +0200, Geert Uytterhoeven wrote:
> > On Sun, Jul 18, 2021 at 11:31 AM Mike Rapoport <[email protected]> wrote:
> > > On Wed, Jul 14, 2021 at 07:51:01AM -0600, Rob Herring wrote:
> > > > On Wed, Jul 14, 2021 at 02:50:12PM +0200, Geert Uytterhoeven wrote:
> > > > > Add two global variables (cap_mem_addr and cap_mem_size) for storing a
> > > > > base address and size, describing a limited region in which memory may
> > > > > be considered available for use by the kernel. If enabled, memory
> > > > > outside of this range is not available for use.
> > > > >
> > > > > These variables can by filled by firmware-specific code, and used in
> > > > > calls to memblock_cap_memory_range() by architecture-specific code.
> > > > > An example user is the parser of the "linux,usable-memory-range"
> > > > > property in the DT "/chosen" node.
> > > > >
> > > > > Signed-off-by: Geert Uytterhoeven <[email protected]>
> > > > > ---
> > > > > This is similar to how the initial ramdisk (phys_initrd_{start,size})
> > > > > and ELF core headers (elfcorehdr_{addr,size})) are handled.
> > > > >
> > > > > Does there exist a suitable place in the common memblock code to call
> > > > > "memblock_cap_memory_range(cap_mem_addr, cap_mem_size)", or does this
> > > > > have to be done in architecture-specific code?
> > > >
> > > > Can't you just call it from early_init_dt_scan_usablemem? If the
> > > > property is present, you want to call it. If the property is not
> > > > present, nothing happens.
> >
> > I will have a look...
> >
> > > For memblock_cap_memory_range() to work properly it should be called after
> > > memory is detected and added to memblock with memblock_add[_node]()
> > >
> > > I'm not huge fan of adding more globals to memblock so if such ordering can
> > > be implemented on the DT side it would be great.
> >
> > Me neither ;-)
> >
> > > I don't see a way to actually enforce this ordering, so maybe we'd want to
> > > add warning in memblock_cap_memory_range() if memblock.memory is empty.

Sorry, I misread "if memblock.memory is empty" as "if capmem is empty".

> > "linux,usable-memory-range" is optional, and typically used only in
> > crashdump kernels, so it would be a bad idea to add such a warning.
>
> If I remember correctly, memblock_cap_memory_range() was added to support
> "linux,usable-memory-range" for crasdump kernels on arm64 and if it would
> be called before memory is registered we may silently corrupt the memory
> because the crash kernel will see all the memory as available.
>"
> So while WARN() maybe too much a pr_warn() seems to me quite appropriate.

Yes, makes perfect sense now.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2021-08-11 08:15:30

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH v4 02/10] memblock: Add variables for usable memory limitation

On Wed, Jul 14, 2021 at 3:51 PM Rob Herring <[email protected]> wrote:
> On Wed, Jul 14, 2021 at 02:50:12PM +0200, Geert Uytterhoeven wrote:
> > Add two global variables (cap_mem_addr and cap_mem_size) for storing a
> > base address and size, describing a limited region in which memory may
> > be considered available for use by the kernel. If enabled, memory
> > outside of this range is not available for use.
> >
> > These variables can by filled by firmware-specific code, and used in
> > calls to memblock_cap_memory_range() by architecture-specific code.
> > An example user is the parser of the "linux,usable-memory-range"
> > property in the DT "/chosen" node.
> >
> > Signed-off-by: Geert Uytterhoeven <[email protected]>
> > ---
> > This is similar to how the initial ramdisk (phys_initrd_{start,size})
> > and ELF core headers (elfcorehdr_{addr,size})) are handled.
> >
> > Does there exist a suitable place in the common memblock code to call
> > "memblock_cap_memory_range(cap_mem_addr, cap_mem_size)", or does this
> > have to be done in architecture-specific code?
>
> Can't you just call it from early_init_dt_scan_usablemem? If the
> property is present, you want to call it. If the property is not
> present, nothing happens.

Seems to work fine when called from early_init_dt_scan_nodes().
Hence v5 will no longer need to touch memblock.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds