2019-10-29 07:23:27

by Lianbo Jiang

Subject: [PATCH 0/2 v7] x86/kdump: Fix 'kmem -s' reported an invalid freepointer when SME was active

purgatory() mainly does two things:

[1] Verify the sha256 hashes of the various segments.
Let's keep this code and leave its logic untouched.

[2] Copy the first 640k of memory to a backup region.
Let's safely remove it and clean up all the code related to the backup region.

This patch series removes the backup region, because the current
handling of copying the first 640k runs into problems when SME is
active (https://bugzilla.kernel.org/show_bug.cgi?id=204793).

The low 1MiB region will always be reserved when the crashkernel kernel
command line option is specified. This makes it unnecessary to do
anything with the low 1MiB region, because memory allocated later
won't fall into the low 1MiB area.

This series includes two patches:
[1] x86/kdump: always reserve the low 1MiB when the crashkernel option
is specified
The low 1MiB region will always be reserved when the crashkernel
kernel command line option is specified, which ensures that the
memory allocated later won't fall into the low 1MiB area.

[2] x86/kdump: clean up all the code related to the backup region
Remove the backup region and clean up.

Changes since v1:
[1] Add extra checking condition: when the crashkernel option is
specified, reserve the low 640k area.

Changes since v2:
[1] Reserve the low 1MiB region only when the crashkernel option is
specified. (Suggested by Eric)

[2] Remove the unused crash_copy_backup_region()

[3] Remove the backup region and clean up

[4] Split them into three patches

Changes since v3:
[1] Improve the first patch's log

[2] Improve the third patch based on Eric's suggestions

Changes since v4:
[1] Correct some typos, and also improve the first patch's log

[2] Add a new function kexec_reserve_low_1MiB() in kernel/kexec_core.c,
which is called by reserve_real_mode(). (Suggested by Boris)

Changes since v5:
[1] Call cmdline_find_option() instead of strstr() to check for the
crashkernel option. (Suggested by Hatayama)

[2] Add a weak function kexec_reserve_low_1MiB() in kernel/kexec_core.c,
and implement kexec_reserve_low_1MiB() in arch/x86/kernel/
machine_kexec_64.c, so that it does not cause a compile error
on non-x86 kernels and still works as intended on x86
kernels.

Changes since v6:
[1] Move kexec_reserve_low_1MiB() to arch/x86/kernel/crash.c and
move its declaration to arch/x86/include/asm/crash.h.
(Suggested by Dave Young)

[2] Adjust the corresponding header files.

Lianbo Jiang (2):
x86/kdump: always reserve the low 1MiB when the crashkernel option is
specified
x86/kdump: clean up all the code related to the backup region

arch/x86/include/asm/crash.h | 6 ++
arch/x86/include/asm/kexec.h | 10 ---
arch/x86/include/asm/purgatory.h | 10 ---
arch/x86/kernel/crash.c | 102 ++++++++---------------------
arch/x86/kernel/machine_kexec_64.c | 47 -------------
arch/x86/purgatory/purgatory.c | 19 ------
arch/x86/realmode/init.c | 2 +
7 files changed, 34 insertions(+), 162 deletions(-)

--
2.17.1
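
The effect described in this cover letter (once the low 1MiB is reserved,
memory allocated later cannot fall into it) can be modelled in user space.
This is only a minimal sketch under stated assumptions: struct region,
LOW_1MIB and alloc_above_reserved() are hypothetical names made up for
illustration, and the code does not reflect how memblock is implemented.

```c
#include <assert.h>
#include <stdint.h>

#define LOW_1MIB (1ULL << 20)	/* hypothetical name for the 1MiB boundary */

/* Hypothetical description of one reserved physical range. */
struct region {
	uint64_t base;
	uint64_t size;
};

/*
 * Return the lowest aligned address >= min_addr at which a block of
 * 'size' bytes does not overlap the single reserved region.
 * 'align' must be a power of two.
 */
static uint64_t alloc_above_reserved(struct region reserved, uint64_t min_addr,
				     uint64_t size, uint64_t align)
{
	assert(size > 0 && align > 0 && (align & (align - 1)) == 0);

	uint64_t addr = (min_addr + align - 1) & ~(align - 1);
	uint64_t res_end = reserved.base + reserved.size;

	/* If the candidate block overlaps the reservation, move past it. */
	if (addr < res_end && addr + size > reserved.base)
		addr = (res_end + align - 1) & ~(align - 1);

	return addr;
}
```

With a reservation of { 0, LOW_1MIB }, a request starting at address 0 is
pushed up to 0x100000; with an empty reservation the same request would be
placed at address 0, inside the area the trampoline needs.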


2019-10-29 07:23:28

by Lianbo Jiang

Subject: [PATCH 1/2 v7] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified

The kdump kernel reuses the first 640k region because the real mode
trampoline has to work in this area. When the vmcore is dumped, the
old memory in this area may be accessed; therefore, the kernel has to
copy the contents of the first 640k area to a backup region so that
the kdump kernel can read the old memory from the backup area. This
copy is done in purgatory().

However, the current handling of copying the first 640k area runs into
problems when SME is enabled: the kernel does not properly copy this
old memory to the backup area in purgatory(), so the kdump kernel
reads out encrypted contents, because the kdump kernel must access
the first kernel's memory with the encryption bit set when SME is
enabled in the first kernel. Please refer to this link:

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793

As a result, the following errors occur, and the crash tool gets
invalid pointers when parsing the vmcore.

crash> kmem -s|grep -i invalid
kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
crash>

To avoid the above errors, when the crashkernel option is specified,
let's reserve the remaining low 1MiB memory (after reserving real mode
memory) so that allocated memory does not fall into the low 1MiB
area. This removes the need to copy the first 640k content to a backup
region in purgatory(), and it means the low 1MiB does not need to be
included in crash dumps or used for anything except the processor
trampolines that must live in the low 1MiB.

Signed-off-by: Lianbo Jiang <[email protected]>
---
arch/x86/include/asm/crash.h | 6 ++++++
arch/x86/kernel/crash.c | 15 +++++++++++++++
arch/x86/realmode/init.c | 2 ++
3 files changed, 23 insertions(+)

diff --git a/arch/x86/include/asm/crash.h b/arch/x86/include/asm/crash.h
index 0acf5ee45a21..3e966a3dc823 100644
--- a/arch/x86/include/asm/crash.h
+++ b/arch/x86/include/asm/crash.h
@@ -8,4 +8,10 @@ int crash_setup_memmap_entries(struct kimage *image,
struct boot_params *params);
void crash_smp_send_stop(void);

+#ifdef CONFIG_KEXEC_CORE
+void __init kexec_reserve_low_1MiB(void);
+#else
+static inline void __init kexec_reserve_low_1MiB(void) { }
+#endif
+
#endif /* _ASM_X86_CRASH_H */
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index eb651fbde92a..144f519aef29 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -24,6 +24,7 @@
#include <linux/export.h>
#include <linux/slab.h>
#include <linux/vmalloc.h>
+#include <linux/memblock.h>

#include <asm/processor.h>
#include <asm/hardirq.h>
@@ -39,6 +40,7 @@
#include <asm/virtext.h>
#include <asm/intel_pt.h>
#include <asm/crash.h>
+#include <asm/cmdline.h>

/* Used while preparing memory map entries for second kernel */
struct crash_memmap_data {
@@ -68,6 +70,19 @@ static inline void cpu_crash_vmclear_loaded_vmcss(void)
rcu_read_unlock();
}

+/*
+ * When the crashkernel option is specified, only use the low
+ * 1MiB for the real mode trampoline.
+ */
+void __init kexec_reserve_low_1MiB(void)
+{
+ if (cmdline_find_option(boot_command_line, "crashkernel",
+ NULL, 0) > 0) {
+ memblock_reserve(0, 1<<20);
+ pr_info("Reserving the low 1MiB of memory for crashkernel\n");
+ }
+}
+
#if defined(CONFIG_SMP) && defined(CONFIG_X86_LOCAL_APIC)

static void kdump_nmi_callback(int cpu, struct pt_regs *regs)
diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 7dce39c8c034..b8bbd0017ca8 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -8,6 +8,7 @@
#include <asm/pgtable.h>
#include <asm/realmode.h>
#include <asm/tlbflush.h>
+#include <asm/crash.h>

struct real_mode_header *real_mode_header;
u32 *trampoline_cr4_features;
@@ -34,6 +35,7 @@ void __init reserve_real_mode(void)

memblock_reserve(mem, size);
set_real_mode_mem(mem);
+ kexec_reserve_low_1MiB();
}

static void __init setup_real_mode(void)
--
2.17.1
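
The v5 changelog replaced strstr() with cmdline_find_option() for the
crashkernel check used in kexec_reserve_low_1MiB() above. A substring
search can match unrelated text on the command line, whereas matching a
whole option cannot. The user-space sketch below illustrates only that
difference. has_option() is a hypothetical helper, it is not the kernel's
cmdline_find_option() implementation, and it makes no attempt to handle
quoted arguments.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: return true if 'cmdline' contains a
 * whitespace-delimited token of the form "name=...".
 */
static bool has_option(const char *cmdline, const char *name)
{
	assert(cmdline != NULL && name != NULL);

	size_t n = strlen(name);
	const char *p = cmdline;

	while (*p) {
		/* Skip whitespace between tokens. */
		while (*p && isspace((unsigned char)*p))
			p++;

		/* p is at the start of a token, or at the end of the string. */
		if (strncmp(p, name, n) == 0 && p[n] == '=')
			return true;

		/* Skip the rest of this token. */
		while (*p && !isspace((unsigned char)*p))
			p++;
	}

	return false;
}
```

For the command line "root=/dev/sda1 nocrashkernel=1", strstr() finds the
substring "crashkernel" while has_option() reports that the option is
absent.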

2019-10-29 07:25:30

by Lianbo Jiang

Subject: [PATCH 2/2 v7] x86/kdump: clean up all the code related to the backup region

When the crashkernel kernel command line option is specified, the
low 1MiB memory will always be reserved, which ensures that memory
allocated later won't fall into the low 1MiB area. It is therefore not
necessary to create a backup region, and there is no need to copy the
first 640k content to a backup region.

The code related to the backup region can now be safely removed,
so let's clean up.

Signed-off-by: Lianbo Jiang <[email protected]>
---
arch/x86/include/asm/kexec.h | 10 ----
arch/x86/include/asm/purgatory.h | 10 ----
arch/x86/kernel/crash.c | 87 ++++--------------------------
arch/x86/kernel/machine_kexec_64.c | 47 ----------------
arch/x86/purgatory/purgatory.c | 19 -------
5 files changed, 11 insertions(+), 162 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 5e7d6b46de97..6802c59e8252 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -66,10 +66,6 @@ struct kimage;
# define KEXEC_ARCH KEXEC_ARCH_X86_64
#endif

-/* Memory to backup during crash kdump */
-#define KEXEC_BACKUP_SRC_START (0UL)
-#define KEXEC_BACKUP_SRC_END (640 * 1024UL - 1) /* 640K */
-
/*
* This function is responsible for capturing register states if coming
* via panic otherwise just fix up the ss and sp if coming via kernel
@@ -154,12 +150,6 @@ struct kimage_arch {
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
- /* Details of backup region */
- unsigned long backup_src_start;
- unsigned long backup_src_sz;
-
- /* Physical address of backup segment */
- unsigned long backup_load_addr;

/* Core ELF header buffer */
void *elf_headers;
diff --git a/arch/x86/include/asm/purgatory.h b/arch/x86/include/asm/purgatory.h
index 92c34e517da1..5528e9325049 100644
--- a/arch/x86/include/asm/purgatory.h
+++ b/arch/x86/include/asm/purgatory.h
@@ -6,16 +6,6 @@
#include <linux/purgatory.h>

extern void purgatory(void);
-/*
- * These forward declarations serve two purposes:
- *
- * 1) Make sparse happy when checking arch/purgatory
- * 2) Document that these are required to be global so the symbol
- * lookup in kexec works
- */
-extern unsigned long purgatory_backup_dest;
-extern unsigned long purgatory_backup_src;
-extern unsigned long purgatory_backup_sz;
#endif /* __ASSEMBLY__ */

#endif /* _ASM_PURGATORY_H */
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 144f519aef29..baf32a3c6f8c 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -188,8 +188,6 @@ void native_machine_crash_shutdown(struct pt_regs *regs)

#ifdef CONFIG_KEXEC_FILE

-static unsigned long crash_zero_bytes;
-
static int get_nr_ram_ranges_callback(struct resource *res, void *arg)
{
unsigned int *nr_ranges = arg;
@@ -232,6 +230,11 @@ static int elf_header_exclude_ranges(struct crash_mem *cmem)
{
int ret = 0;

+ /* Exclude the low 1MiB because it is always reserved */
+ ret = crash_exclude_mem_range(cmem, 0, 1<<20);
+ if (ret)
+ return ret;
+
/* Exclude crashkernel region */
ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
if (ret)
@@ -261,9 +264,7 @@ static int prepare_elf_headers(struct kimage *image, void **addr,
unsigned long *sz)
{
struct crash_mem *cmem;
- Elf64_Ehdr *ehdr;
- Elf64_Phdr *phdr;
- int ret, i;
+ int ret;

cmem = fill_up_crash_elf_data();
if (!cmem)
@@ -282,22 +283,7 @@ static int prepare_elf_headers(struct kimage *image, void **addr,
/* By default prepare 64bit headers */
ret = crash_prepare_elf64_headers(cmem,
IS_ENABLED(CONFIG_X86_64), addr, sz);
- if (ret)
- goto out;

- /*
- * If a range matches backup region, adjust offset to backup
- * segment.
- */
- ehdr = (Elf64_Ehdr *)*addr;
- phdr = (Elf64_Phdr *)(ehdr + 1);
- for (i = 0; i < ehdr->e_phnum; phdr++, i++)
- if (phdr->p_type == PT_LOAD &&
- phdr->p_paddr == image->arch.backup_src_start &&
- phdr->p_memsz == image->arch.backup_src_sz) {
- phdr->p_offset = image->arch.backup_load_addr;
- break;
- }
out:
vfree(cmem);
return ret;
@@ -336,19 +322,11 @@ static int memmap_exclude_ranges(struct kimage *image, struct crash_mem *cmem,
unsigned long long mend)
{
unsigned long start, end;
- int ret = 0;

cmem->ranges[0].start = mstart;
cmem->ranges[0].end = mend;
cmem->nr_ranges = 1;

- /* Exclude Backup region */
- start = image->arch.backup_load_addr;
- end = start + image->arch.backup_src_sz - 1;
- ret = crash_exclude_mem_range(cmem, start, end);
- if (ret)
- return ret;
-
/* Exclude elf header region */
start = image->arch.elf_load_addr;
end = start + image->arch.elf_headers_sz - 1;
@@ -371,11 +349,11 @@ int crash_setup_memmap_entries(struct kimage *image, struct boot_params *params)
memset(&cmd, 0, sizeof(struct crash_memmap_data));
cmd.params = params;

- /* Add first 640K segment */
- ei.addr = image->arch.backup_src_start;
- ei.size = image->arch.backup_src_sz;
- ei.type = E820_TYPE_RAM;
- add_e820_entry(params, &ei);
+ /* Add the low 1MiB */
+ cmd.type = E820_TYPE_RAM;
+ flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+ walk_iomem_res_desc(IORES_DESC_NONE, flags, 0, (1<<20)-1, &cmd,
+ memmap_entry_callback);

/* Add ACPI tables */
cmd.type = E820_TYPE_ACPI;
@@ -424,55 +402,12 @@ int crash_setup_memmap_entries(struct kimage *image, struct boot_params *params)
return ret;
}

-static int determine_backup_region(struct resource *res, void *arg)
-{
- struct kimage *image = arg;
-
- image->arch.backup_src_start = res->start;
- image->arch.backup_src_sz = resource_size(res);
-
- /* Expecting only one range for backup region */
- return 1;
-}
-
int crash_load_segments(struct kimage *image)
{
int ret;
struct kexec_buf kbuf = { .image = image, .buf_min = 0,
.buf_max = ULONG_MAX, .top_down = false };

- /*
- * Determine and load a segment for backup area. First 640K RAM
- * region is backup source
- */
-
- ret = walk_system_ram_res(KEXEC_BACKUP_SRC_START, KEXEC_BACKUP_SRC_END,
- image, determine_backup_region);
-
- /* Zero or postive return values are ok */
- if (ret < 0)
- return ret;
-
- /* Add backup segment. */
- if (image->arch.backup_src_sz) {
- kbuf.buffer = &crash_zero_bytes;
- kbuf.bufsz = sizeof(crash_zero_bytes);
- kbuf.memsz = image->arch.backup_src_sz;
- kbuf.buf_align = PAGE_SIZE;
- /*
- * Ideally there is no source for backup segment. This is
- * copied in purgatory after crash. Just add a zero filled
- * segment for now to make sure checksum logic works fine.
- */
- ret = kexec_add_buffer(&kbuf);
- if (ret)
- return ret;
- image->arch.backup_load_addr = kbuf.mem;
- pr_debug("Loaded backup region at 0x%lx backup_start=0x%lx memsz=0x%lx\n",
- image->arch.backup_load_addr,
- image->arch.backup_src_start, kbuf.memsz);
- }
-
/* Prepare elf headers and add a segment */
ret = prepare_elf_headers(image, &kbuf.buffer, &kbuf.bufsz);
if (ret)
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 5dcd438ad8f2..16e125a50b33 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -298,48 +298,6 @@ static void load_segments(void)
);
}

-#ifdef CONFIG_KEXEC_FILE
-/* Update purgatory as needed after various image segments have been prepared */
-static int arch_update_purgatory(struct kimage *image)
-{
- int ret = 0;
-
- if (!image->file_mode)
- return 0;
-
- /* Setup copying of backup region */
- if (image->type == KEXEC_TYPE_CRASH) {
- ret = kexec_purgatory_get_set_symbol(image,
- "purgatory_backup_dest",
- &image->arch.backup_load_addr,
- sizeof(image->arch.backup_load_addr), 0);
- if (ret)
- return ret;
-
- ret = kexec_purgatory_get_set_symbol(image,
- "purgatory_backup_src",
- &image->arch.backup_src_start,
- sizeof(image->arch.backup_src_start), 0);
- if (ret)
- return ret;
-
- ret = kexec_purgatory_get_set_symbol(image,
- "purgatory_backup_sz",
- &image->arch.backup_src_sz,
- sizeof(image->arch.backup_src_sz), 0);
- if (ret)
- return ret;
- }
-
- return ret;
-}
-#else /* !CONFIG_KEXEC_FILE */
-static inline int arch_update_purgatory(struct kimage *image)
-{
- return 0;
-}
-#endif /* CONFIG_KEXEC_FILE */
-
int machine_kexec_prepare(struct kimage *image)
{
unsigned long start_pgtable;
@@ -353,11 +311,6 @@ int machine_kexec_prepare(struct kimage *image)
if (result)
return result;

- /* update purgatory as needed */
- result = arch_update_purgatory(image);
- if (result)
- return result;
-
return 0;
}

diff --git a/arch/x86/purgatory/purgatory.c b/arch/x86/purgatory/purgatory.c
index 3b95410ff0f8..2961234d0795 100644
--- a/arch/x86/purgatory/purgatory.c
+++ b/arch/x86/purgatory/purgatory.c
@@ -14,28 +14,10 @@

#include "../boot/string.h"

-unsigned long purgatory_backup_dest __section(.kexec-purgatory);
-unsigned long purgatory_backup_src __section(.kexec-purgatory);
-unsigned long purgatory_backup_sz __section(.kexec-purgatory);
-
u8 purgatory_sha256_digest[SHA256_DIGEST_SIZE] __section(.kexec-purgatory);

struct kexec_sha_region purgatory_sha_regions[KEXEC_SEGMENT_MAX] __section(.kexec-purgatory);

-/*
- * On x86, second kernel requries first 640K of memory to boot. Copy
- * first 640K to a backup region in reserved memory range so that second
- * kernel can use first 640K.
- */
-static int copy_backup_region(void)
-{
- if (purgatory_backup_dest) {
- memcpy((void *)purgatory_backup_dest,
- (void *)purgatory_backup_src, purgatory_backup_sz);
- }
- return 0;
-}
-
static int verify_sha256_digest(void)
{
struct kexec_sha_region *ptr, *end;
@@ -66,7 +48,6 @@ void purgatory(void)
for (;;)
;
}
- copy_backup_region();
}

/*
--
2.17.1
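
This patch adds a crash_exclude_mem_range() call for the low 1MiB in
elf_header_exclude_ranges(). The idea of cutting one range out of another
can be sketched in user space as follows. This is only an illustration:
struct range and exclude_range() are hypothetical names, both ends are
treated as inclusive, and the kernel function operates on the range array
inside struct crash_mem rather than on a single range.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical inclusive address range [start, end]. */
struct range {
	unsigned long long start;
	unsigned long long end;
};

/*
 * Remove range 'x' from range 'r'. The surviving pieces (zero, one or
 * two of them) are written to 'out' and their count is returned.
 */
static size_t exclude_range(struct range r, struct range x,
			    struct range out[2])
{
	size_t n = 0;

	assert(r.start <= r.end && x.start <= x.end);

	/* No overlap: 'r' survives unchanged. */
	if (x.end < r.start || x.start > r.end) {
		out[n++] = r;
		return n;
	}

	/* Piece of 'r' below the excluded range. */
	if (x.start > r.start) {
		out[n].start = r.start;
		out[n].end = x.start - 1;
		n++;
	}

	/* Piece of 'r' above the excluded range. */
	if (x.end < r.end) {
		out[n].start = x.end + 1;
		out[n].end = r.end;
		n++;
	}

	return n;
}
```

Excluding [0, 0xfffff] from a RAM range that starts at 0 and extends past
1MiB leaves a single range beginning at 0x100000; excluding it from a range
that lies entirely below 1MiB leaves nothing.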

2019-10-29 07:26:15

by Baoquan He

Subject: Re: [PATCH 1/2 v7] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified

On 10/29/19 at 10:10am, Lianbo Jiang wrote:
> Kdump kernel will reuse the first 640k region because the real mode
> trampoline has to work in this area. When the vmcore is dumped, the
> old memory in this area may be accessed, therefore, kernel has to
> copy the contents of the first 640k area to a backup region so that
> kdump kernel can read the old memory from the backup area of the
> first 640k area, which is done in the purgatory().
>
> But, the current handling of copying the first 640k area runs into
> problems when SME is enabled, kernel does not properly copy these
> old memory to the backup area in the purgatory(), thereby, kdump
> kernel reads out the encrypted contents, because the kdump kernel
> must access the first kernel's memory with the encryption bit set
> when SME is enabled in the first kernel. Please refer to this link:
>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793
>
> Finally, it causes the following errors, and the crash tool gets
> invalid pointers when parsing the vmcore.
>
> crash> kmem -s|grep -i invalid
> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
> crash>
>
> To avoid the above errors, when the crashkernel option is specified,
> lets reserve the remaining low 1MiB memory(after reserving real mode
> memory) so that the allocated memory does not fall into the low 1MiB
> area, which makes us not to copy the first 640k content to a backup
> region in purgatory(). This indicates that it does not need to be
> included in crash dumps or used for anything except the processor
> trampolines that must live in the low 1MiB.
>
> Signed-off-by: Lianbo Jiang <[email protected]>
> ---
> arch/x86/include/asm/crash.h | 6 ++++++
> arch/x86/kernel/crash.c | 15 +++++++++++++++
> arch/x86/realmode/init.c | 2 ++
> 3 files changed, 23 insertions(+)
>
> diff --git a/arch/x86/include/asm/crash.h b/arch/x86/include/asm/crash.h
> index 0acf5ee45a21..3e966a3dc823 100644
> --- a/arch/x86/include/asm/crash.h
> +++ b/arch/x86/include/asm/crash.h
> @@ -8,4 +8,10 @@ int crash_setup_memmap_entries(struct kimage *image,
> struct boot_params *params);
> void crash_smp_send_stop(void);
>
> +#ifdef CONFIG_KEXEC_CORE
> +void __init kexec_reserve_low_1MiB(void);
> +#else
> +static inline void __init kexec_reserve_low_1MiB(void) { }
> +#endif
> +
> #endif /* _ASM_X86_CRASH_H */
> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> index eb651fbde92a..144f519aef29 100644
> --- a/arch/x86/kernel/crash.c
> +++ b/arch/x86/kernel/crash.c
> @@ -24,6 +24,7 @@
> #include <linux/export.h>
> #include <linux/slab.h>
> #include <linux/vmalloc.h>
> +#include <linux/memblock.h>
>
> #include <asm/processor.h>
> #include <asm/hardirq.h>
> @@ -39,6 +40,7 @@
> #include <asm/virtext.h>
> #include <asm/intel_pt.h>
> #include <asm/crash.h>
> +#include <asm/cmdline.h>
>
> /* Used while preparing memory map entries for second kernel */
> struct crash_memmap_data {
> @@ -68,6 +70,19 @@ static inline void cpu_crash_vmclear_loaded_vmcss(void)
> rcu_read_unlock();
> }
>
> +/*
> + * When the crashkernel option is specified, only use the low
> + * 1MiB for the real mode trampoline.
> + */
> +void __init kexec_reserve_low_1MiB(void)

Thanks for the effort, Lianbo. I believe everyone is confident with this
solution and fix.

I have one tiny concern: why is the function named
kexec_reserve_low_1MiB() rather than kexec_reserve_low_1M()?
I searched the kernel code with the filter below and didn't see MiB
appearing in a function name. I am not sure about it either, just asking.

git grep "_[1-9]*M " arch/ kernel/ mm include/ drivers/ net/ init fs crypto/ certs/ ipc lib

Thanks
Baoquan

> +{
> + if (cmdline_find_option(boot_command_line, "crashkernel",
> + NULL, 0) > 0) {
> + memblock_reserve(0, 1<<20);
> + pr_info("Reserving the low 1MiB of memory for crashkernel\n");
> + }
> +}
> +
> #if defined(CONFIG_SMP) && defined(CONFIG_X86_LOCAL_APIC)
>
> static void kdump_nmi_callback(int cpu, struct pt_regs *regs)
> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
> index 7dce39c8c034..b8bbd0017ca8 100644
> --- a/arch/x86/realmode/init.c
> +++ b/arch/x86/realmode/init.c
> @@ -8,6 +8,7 @@
> #include <asm/pgtable.h>
> #include <asm/realmode.h>
> #include <asm/tlbflush.h>
> +#include <asm/crash.h>
>
> struct real_mode_header *real_mode_header;
> u32 *trampoline_cr4_features;
> @@ -34,6 +35,7 @@ void __init reserve_real_mode(void)
>
> memblock_reserve(mem, size);
> set_real_mode_mem(mem);
> + kexec_reserve_low_1MiB();
> }
>
> static void __init setup_real_mode(void)
> --
> 2.17.1
>

2019-10-29 07:27:59

by Baoquan He

Subject: Re: [PATCH 1/2 v7] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified

On 10/29/19 at 02:06pm, lijiang wrote:
> >> struct crash_memmap_data {
> >> @@ -68,6 +70,19 @@ static inline void cpu_crash_vmclear_loaded_vmcss(void)
> >> rcu_read_unlock();
> >> }
> >>
> >> +/*
> >> + * When the crashkernel option is specified, only use the low
> >> + * 1MiB for the real mode trampoline.
> >> + */
> >> +void __init kexec_reserve_low_1MiB(void)
> >
> > Thanks for the effort, Lianbo. I believe everyone is confident with this
> > solution and fix.
> >
> > I have a tiny concern, why the function name is
> > kexec_reserve_low_1MiB(), but not kexec_reserve_low_1M()?
>
> Thanks for your comment, Baoquan.
>
> It means that kernel will reserve 1M 'Byte' memory, the function name does not
> have special meaning.
>
> Would you mind if i change it to the crash_reserve_low_1M()?

Yes, crash_xx looks better since it's only related to crash dumping. As
for 1M, I'm not very sure; let's see if other people have comments about
it. Anyway, crash_reserve_low_1M() looks good to me. Thanks.

>
> void __init crash_reserve_low_1M(void)
>
> Thanks.
> Lianbo
>
> > I searched in kernel code with below filter, didn't see MiB appearing in
> > a function name. I am not sure about it either, just ask.
> >
> > git grep "_[1-9]*M " arch/ kernel/ mm include/ drivers/ net/ init fs crypto/ certs/ ipc lib
> >
> > Thanks
> > Baoquan
> >
> >> +{
> >> + if (cmdline_find_option(boot_command_line, "crashkernel",
> >> + NULL, 0) > 0) {
> >> + memblock_reserve(0, 1<<20);
> >> + pr_info("Reserving the low 1MiB of memory for crashkernel\n");
> >> + }
> >> +}
> >> +
> >> #if defined(CONFIG_SMP) && defined(CONFIG_X86_LOCAL_APIC)
> >>
> >> static void kdump_nmi_callback(int cpu, struct pt_regs *regs)
> >> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
> >> index 7dce39c8c034..b8bbd0017ca8 100644
> >> --- a/arch/x86/realmode/init.c
> >> +++ b/arch/x86/realmode/init.c
> >> @@ -8,6 +8,7 @@
> >> #include <asm/pgtable.h>
> >> #include <asm/realmode.h>
> >> #include <asm/tlbflush.h>
> >> +#include <asm/crash.h>
> >>
> >> struct real_mode_header *real_mode_header;
> >> u32 *trampoline_cr4_features;
> >> @@ -34,6 +35,7 @@ void __init reserve_real_mode(void)
> >>
> >> memblock_reserve(mem, size);
> >> set_real_mode_mem(mem);
> >> + kexec_reserve_low_1MiB();
> >> }
> >>
> >> static void __init setup_real_mode(void)
> >> --
> >> 2.17.1
> >>

2019-10-29 09:55:24

by Lianbo Jiang

Subject: Re: [PATCH 1/2 v7] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified

On 2019-10-29 13:28, Baoquan He wrote:
> On 10/29/19 at 10:10am, Lianbo Jiang wrote:
>> Kdump kernel will reuse the first 640k region because the real mode
>> trampoline has to work in this area. When the vmcore is dumped, the
>> old memory in this area may be accessed, therefore, kernel has to
>> copy the contents of the first 640k area to a backup region so that
>> kdump kernel can read the old memory from the backup area of the
>> first 640k area, which is done in the purgatory().
>>
>> But, the current handling of copying the first 640k area runs into
>> problems when SME is enabled, kernel does not properly copy these
>> old memory to the backup area in the purgatory(), thereby, kdump
>> kernel reads out the encrypted contents, because the kdump kernel
>> must access the first kernel's memory with the encryption bit set
>> when SME is enabled in the first kernel. Please refer to this link:
>>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204793
>>
>> Finally, it causes the following errors, and the crash tool gets
>> invalid pointers when parsing the vmcore.
>>
>> crash> kmem -s|grep -i invalid
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
>> kmem: dma-kmalloc-512: slab:ffffd77680001c00 invalid freepointer:a6086ac099f0c5a4
>> crash>
>>
>> To avoid the above errors, when the crashkernel option is specified,
>> lets reserve the remaining low 1MiB memory(after reserving real mode
>> memory) so that the allocated memory does not fall into the low 1MiB
>> area, which makes us not to copy the first 640k content to a backup
>> region in purgatory(). This indicates that it does not need to be
>> included in crash dumps or used for anything except the processor
>> trampolines that must live in the low 1MiB.
>>
>> Signed-off-by: Lianbo Jiang <[email protected]>
>> ---
>> arch/x86/include/asm/crash.h | 6 ++++++
>> arch/x86/kernel/crash.c | 15 +++++++++++++++
>> arch/x86/realmode/init.c | 2 ++
>> 3 files changed, 23 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/crash.h b/arch/x86/include/asm/crash.h
>> index 0acf5ee45a21..3e966a3dc823 100644
>> --- a/arch/x86/include/asm/crash.h
>> +++ b/arch/x86/include/asm/crash.h
>> @@ -8,4 +8,10 @@ int crash_setup_memmap_entries(struct kimage *image,
>> struct boot_params *params);
>> void crash_smp_send_stop(void);
>>
>> +#ifdef CONFIG_KEXEC_CORE
>> +void __init kexec_reserve_low_1MiB(void);
>> +#else
>> +static inline void __init kexec_reserve_low_1MiB(void) { }
>> +#endif
>> +
>> #endif /* _ASM_X86_CRASH_H */
>> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
>> index eb651fbde92a..144f519aef29 100644
>> --- a/arch/x86/kernel/crash.c
>> +++ b/arch/x86/kernel/crash.c
>> @@ -24,6 +24,7 @@
>> #include <linux/export.h>
>> #include <linux/slab.h>
>> #include <linux/vmalloc.h>
>> +#include <linux/memblock.h>
>>
>> #include <asm/processor.h>
>> #include <asm/hardirq.h>
>> @@ -39,6 +40,7 @@
>> #include <asm/virtext.h>
>> #include <asm/intel_pt.h>
>> #include <asm/crash.h>
>> +#include <asm/cmdline.h>
>>
>> /* Used while preparing memory map entries for second kernel */
>> struct crash_memmap_data {
>> @@ -68,6 +70,19 @@ static inline void cpu_crash_vmclear_loaded_vmcss(void)
>> rcu_read_unlock();
>> }
>>
>> +/*
>> + * When the crashkernel option is specified, only use the low
>> + * 1MiB for the real mode trampoline.
>> + */
>> +void __init kexec_reserve_low_1MiB(void)
>
> Thanks for the effort, Lianbo. I believe everyone is confident with this
> solution and fix.
>
> I have a tiny concern, why the function name is
> kexec_reserve_low_1MiB(), but not kexec_reserve_low_1M()?

Thanks for your comment, Baoquan.

It means that the kernel will reserve 1M 'Byte' of memory; the function
name does not have any special meaning.

Would you mind if I change it to crash_reserve_low_1M()?

void __init crash_reserve_low_1M(void)

Thanks.
Lianbo

> I searched in kernel code with below filter, didn't see MiB appearing in
> a function name. I am not sure about it either, just ask.
>
> git grep "_[1-9]*M " arch/ kernel/ mm include/ drivers/ net/ init fs crypto/ certs/ ipc lib
>
> Thanks
> Baoquan
>
>> +{
>> + if (cmdline_find_option(boot_command_line, "crashkernel",
>> + NULL, 0) > 0) {
>> + memblock_reserve(0, 1<<20);
>> + pr_info("Reserving the low 1MiB of memory for crashkernel\n");
>> + }
>> +}
>> +
>> #if defined(CONFIG_SMP) && defined(CONFIG_X86_LOCAL_APIC)
>>
>> static void kdump_nmi_callback(int cpu, struct pt_regs *regs)
>> diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
>> index 7dce39c8c034..b8bbd0017ca8 100644
>> --- a/arch/x86/realmode/init.c
>> +++ b/arch/x86/realmode/init.c
>> @@ -8,6 +8,7 @@
>> #include <asm/pgtable.h>
>> #include <asm/realmode.h>
>> #include <asm/tlbflush.h>
>> +#include <asm/crash.h>
>>
>> struct real_mode_header *real_mode_header;
>> u32 *trampoline_cr4_features;
>> @@ -34,6 +35,7 @@ void __init reserve_real_mode(void)
>>
>> memblock_reserve(mem, size);
>> set_real_mode_mem(mem);
>> + kexec_reserve_low_1MiB();
>> }
>>
>> static void __init setup_real_mode(void)
>> --
>> 2.17.1
>>

2019-10-30 19:12:00

by kernel test robot

Subject: Re: [PATCH 1/2 v7] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified

Hi Lianbo,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v5.4-rc5 next-20191030]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url: https://github.com/0day-ci/linux/commits/Lianbo-Jiang/x86-kdump-Fix-kmem-s-reported-an-invalid-freepointer-when-SME-was-active/20191031-001903
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 320000e72ec0613e164ce9608d865396fb2da278
config: i386-defconfig (attached as .config)
compiler: gcc-7 (Debian 7.4.0-14) 7.4.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <[email protected]>

All warnings (new ones prefixed by >>):

In file included from arch/x86/realmode/init.c:11:0:
>> arch/x86/include/asm/crash.h:5:32: warning: 'struct kimage' declared inside parameter list will not be visible outside of this definition or declaration
int crash_load_segments(struct kimage *image);
^~~~~~
arch/x86/include/asm/crash.h:6:37: warning: 'struct kimage' declared inside parameter list will not be visible outside of this definition or declaration
int crash_copy_backup_region(struct kimage *image);
^~~~~~
arch/x86/include/asm/crash.h:7:39: warning: 'struct kimage' declared inside parameter list will not be visible outside of this definition or declaration
int crash_setup_memmap_entries(struct kimage *image,
^~~~~~

vim +5 arch/x86/include/asm/crash.h

dd5f726076cc76 Vivek Goyal 2014-08-08 4
dd5f726076cc76 Vivek Goyal 2014-08-08 @5 int crash_load_segments(struct kimage *image);
dd5f726076cc76 Vivek Goyal 2014-08-08 6 int crash_copy_backup_region(struct kimage *image);
dd5f726076cc76 Vivek Goyal 2014-08-08 7 int crash_setup_memmap_entries(struct kimage *image,
dd5f726076cc76 Vivek Goyal 2014-08-08 8 struct boot_params *params);
89f579ce99f7e0 Yi Wang 2018-11-22 9 void crash_smp_send_stop(void);
dd5f726076cc76 Vivek Goyal 2014-08-08 10

:::::: The code at line 5 was first introduced by commit
:::::: dd5f726076cc7639d9713b334c8c133f77c6757a kexec: support for kexec on panic using new system call

:::::: TO: Vivek Goyal <[email protected]>
:::::: CC: Linus Torvalds <[email protected]>

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation


Attachments:
(No filename) (2.77 kB)
.config.gz (27.49 kB)

2019-10-31 01:34:41

by Lianbo Jiang

Subject: Re: [PATCH 1/2 v7] x86/kdump: always reserve the low 1MiB when the crashkernel option is specified

On 2019-10-31 02:25, kbuild test robot wrote:
> Hi Lianbo,
>
> Thank you for the patch! Perhaps something to improve:
>
> [auto build test WARNING on linus/master]
> [also build test WARNING on v5.4-rc5 next-20191030]
> [if your patch is applied to the wrong git tree, please drop us a note to help
> improve the system. BTW, we also suggest to use '--base' option to specify the
> base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
>
> url: https://github.com/0day-ci/linux/commits/Lianbo-Jiang/x86-kdump-Fix-kmem-s-reported-an-invalid-freepointer-when-SME-was-active/20191031-001903
> base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 320000e72ec0613e164ce9608d865396fb2da278
> config: i386-defconfig (attached as .config)
> compiler: gcc-7 (Debian 7.4.0-14) 7.4.0
> reproduce:
> # save the attached .config to linux build tree
> make ARCH=i386
>
> If you fix the issue, kindly add following tag
> Reported-by: kbuild test robot <[email protected]>
>
> All warnings (new ones prefixed by >>):
>
> In file included from arch/x86/realmode/init.c:11:0:
>>> arch/x86/include/asm/crash.h:5:32: warning: 'struct kimage' declared inside parameter list will not be visible outside of this definition or declaration
> int crash_load_segments(struct kimage *image);
> ^~~~~~
> arch/x86/include/asm/crash.h:6:37: warning: 'struct kimage' declared inside parameter list will not be visible outside of this definition or declaration
> int crash_copy_backup_region(struct kimage *image);
> ^~~~~~
> arch/x86/include/asm/crash.h:7:39: warning: 'struct kimage' declared inside parameter list will not be visible outside of this definition or declaration
> int crash_setup_memmap_entries(struct kimage *image,
> ^~~~~~
>
Hi,

The above warnings also occur without my patches.

But I will fix the warnings in my patch series and resend v8 later.

Thanks.

Lianbo

> vim +5 arch/x86/include/asm/crash.h
>
> dd5f726076cc76 Vivek Goyal 2014-08-08 4
> dd5f726076cc76 Vivek Goyal 2014-08-08 @5 int crash_load_segments(struct kimage *image);
> dd5f726076cc76 Vivek Goyal 2014-08-08 6 int crash_copy_backup_region(struct kimage *image);
> dd5f726076cc76 Vivek Goyal 2014-08-08 7 int crash_setup_memmap_entries(struct kimage *image,
> dd5f726076cc76 Vivek Goyal 2014-08-08 8 struct boot_params *params);
> 89f579ce99f7e0 Yi Wang 2018-11-22 9 void crash_smp_send_stop(void);
> dd5f726076cc76 Vivek Goyal 2014-08-08 10
>
> :::::: The code at line 5 was first introduced by commit
> :::::: dd5f726076cc7639d9713b334c8c133f77c6757a kexec: support for kexec on panic using new system call
>

Exactly.

> :::::: TO: Vivek Goyal <[email protected]>
> :::::: CC: Linus Torvalds <[email protected]>
>
> ---
> 0-DAY kernel test infrastructure Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all Intel Corporation
>
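
The "'struct kimage' declared inside parameter list" warning reported in
this thread arises when a prototype names a struct type that has not yet
been declared at file scope in that translation unit. One common C remedy
is a file-scope forward declaration placed before the prototypes. The
sketch below only illustrates that remedy. The struct layout and
kimage_type() are hypothetical, and nothing here is taken from the v8
series.

```c
#include <assert.h>
#include <stddef.h>

/*
 * File-scope forward declaration: prototypes that follow refer to this
 * one type instead of declaring a new type local to their parameter list.
 */
struct kimage;

/* Hypothetical prototype taking a pointer to the still-incomplete type. */
int kimage_type(const struct kimage *image);

/* Hypothetical definition, normally supplied by another header. */
struct kimage {
	int type;
};

int kimage_type(const struct kimage *image)
{
	assert(image != NULL);
	return image->type;
}
```

Because only a pointer to struct kimage appears in the prototype, the
incomplete type is sufficient there; the full definition is needed only
where members are accessed.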