2018-05-15 01:53:18

by Lianbo Jiang

[permalink] [raw]
Subject: [PATCH 0/2] support kdump for AMD secure memory encryption(sme)

It is convenient to remap the old memory encrypted to the second kernel by
calling ioremap_encrypted().

When sme enabled on AMD server, we also need to support kdump. Because
the memory is encrypted in the first kernel, we will remap the old memory
encrypted to the second kernel(crash kernel), and sme is also enabled in
the second kernel, otherwise the old memory encrypted can not be decrypted.
Because simply changing the value of a C-bit on a page will not
automatically encrypt the existing contents of a page, and any data in the
page prior to the C-bit modification will become unintelligible. A page of
memory that is marked encrypted will be automatically decrypted when read
from DRAM and will be automatically encrypted when written to DRAM.

For the kdump, it is necessary to distinguish whether the memory is
encrypted. Furthermore, we should also know which part of the memory is
encrypted or decrypted. We will appropriately remap the memory according
to the specific situation in order to tell cpu how to deal with the
data(encrypted or decrypted). For example, when sme enabled, if the old
memory is encrypted, we will remap the old memory in encrypted way, which
will automatically decrypt the old memory encrypted when we read those data
from the remapping address.

----------------------------------------------
| first-kernel | second-kernel | kdump support |
| (mem_encrypt=on|off) | (yes|no) |
|--------------+---------------+---------------|
| on | on | yes |
| off | off | yes |
| on | off | no |
| off | on | no |
|______________|_______________|_______________|

Test tools:
makedumpfile[v1.6.3]: https://github.com/LianboJ/makedumpfile
commit e1de103eca8f (A draft for kdump vmcore about AMD SME)
Author: Lianbo Jiang <[email protected]>
Date: Mon May 14 17:02:40 2018 +0800
Note: This patch can only dump vmcore in the case of SME enabled.

crash-7.2.1: https://github.com/crash-utility/crash.git
commit 1e1bd9c4c1be (Fix for the "bpf" command display on Linux 4.17-rc1)
Author: Dave Anderson <[email protected]>
Date: Fri May 11 15:54:32 2018 -0400

Test environment:
HP ProLiant DL385Gen10 AMD EPYC 7251
8-Core Processor
32768 MB memory
600 GB disk space

Linux 4.17-rc4:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
commit 75bc37fefc44 ("Linux 4.17-rc4")
Author: Linus Torvalds <[email protected]>
Date: Sun May 6 16:57:38 2018 -1000

Reference:
AMD64 Architecture Programmer's Manual
https://support.amd.com/TechDocs/24593.pdf

Lianbo Jiang (2):
add a function(ioremap_encrypted) for kdump when AMD sme enabled.
support kdump when AMD secure memory encryption is active

arch/x86/include/asm/dmi.h | 14 +++++++++++++-
arch/x86/include/asm/io.h | 2 ++
arch/x86/kernel/acpi/boot.c | 8 ++++++++
arch/x86/kernel/crash_dump_64.c | 27 +++++++++++++++++++++++++++
arch/x86/mm/ioremap.c | 25 +++++++++++++++++--------
drivers/acpi/tables.c | 14 +++++++++++++-
drivers/iommu/amd_iommu_init.c | 9 ++++++++-
fs/proc/vmcore.c | 36 +++++++++++++++++++++++++++++++-----
include/linux/crash_dump.h | 4 ++++
kernel/kexec_core.c | 12 ++++++++++++
10 files changed, 135 insertions(+), 16 deletions(-)

--
2.9.5



2018-05-15 01:52:19

by Lianbo Jiang

[permalink] [raw]
Subject: [PATCH 2/2] support kdump when AMD secure memory encryption is active

When sme enabled on AMD server, we also need to support kdump. Because
the memory is encrypted in the first kernel, we will remap the old memory
encrypted to the second kernel(crash kernel), and sme is also enabled in
the second kernel, otherwise the old memory encrypted can not be decrypted.
Because simply changing the value of a C-bit on a page will not
automatically encrypt the existing contents of a page, and any data in the
page prior to the C-bit modification will become unintelligible. A page of
memory that is marked encrypted will be automatically decrypted when read
from DRAM and will be automatically encrypted when written to DRAM.

For the kdump, it is necessary to distinguish whether the memory is
encrypted. Furthermore, we should also know which part of the memory is
encrypted or decrypted. We will appropriately remap the memory according
to the specific situation in order to tell cpu how to deal with the data(
encrypted or unencrypted). For example, when sme enabled, if the old memory
is encrypted, we will remap the old memory in encrypted way, which will
automatically decrypt the old memory encrypted when we read those data from
the remapping address.

----------------------------------------------
| first-kernel | second-kernel | kdump support |
| (mem_encrypt=on|off) | (yes|no) |
|--------------+---------------+---------------|
| on | on | yes |
| off | off | yes |
| on | off | no |
| off | on | no |
|______________|_______________|_______________|

Signed-off-by: Lianbo Jiang <[email protected]>
---
arch/x86/include/asm/dmi.h | 14 +++++++++++++-
arch/x86/kernel/acpi/boot.c | 8 ++++++++
arch/x86/kernel/crash_dump_64.c | 27 +++++++++++++++++++++++++++
drivers/acpi/tables.c | 14 +++++++++++++-
drivers/iommu/amd_iommu_init.c | 9 ++++++++-
fs/proc/vmcore.c | 36 +++++++++++++++++++++++++++++++-----
include/linux/crash_dump.h | 4 ++++
kernel/kexec_core.c | 12 ++++++++++++
8 files changed, 116 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/dmi.h b/arch/x86/include/asm/dmi.h
index 0ab2ab2..a5663b4 100644
--- a/arch/x86/include/asm/dmi.h
+++ b/arch/x86/include/asm/dmi.h
@@ -7,6 +7,10 @@

#include <asm/io.h>
#include <asm/setup.h>
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+#include <linux/crash_dump.h>
+#include <linux/mem_encrypt.h>
+#endif

static __always_inline __init void *dmi_alloc(unsigned len)
{
@@ -14,7 +18,15 @@ static __always_inline __init void *dmi_alloc(unsigned len)
}

/* Use early IO mappings for DMI because it's initialized early */
-#define dmi_early_remap early_memremap
+static __always_inline __init void *dmi_early_remap(resource_size_t
+ phys_addr, unsigned long size)
+{
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+ if (sme_active() && is_kdump_kernel())
+ return early_memremap_decrypted(phys_addr, size);
+#endif
+ return early_memremap(phys_addr, size);
+}
#define dmi_early_unmap early_memunmap
#define dmi_remap(_x, _l) memremap(_x, _l, MEMREMAP_WB)
#define dmi_unmap(_x) memunmap(_x)
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 3b20607..354ad66 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -48,6 +48,10 @@
#include <asm/mpspec.h>
#include <asm/smp.h>
#include <asm/i8259.h>
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+#include <linux/crash_dump.h>
+#include <linux/mem_encrypt.h>
+#endif

#include "sleep.h" /* To include x86_acpi_suspend_lowlevel */
static int __initdata acpi_force = 0;
@@ -124,6 +128,10 @@ void __init __iomem *__acpi_map_table(unsigned long phys, unsigned long size)
if (!phys || !size)
return NULL;

+#ifdef CONFIG_AMD_MEM_ENCRYPT
+ if (sme_active() && is_kdump_kernel())
+ return early_memremap_decrypted(phys, size);
+#endif
return early_memremap(phys, size);
}

diff --git a/arch/x86/kernel/crash_dump_64.c b/arch/x86/kernel/crash_dump_64.c
index 4f2e077..2ef67fc 100644
--- a/arch/x86/kernel/crash_dump_64.c
+++ b/arch/x86/kernel/crash_dump_64.c
@@ -48,3 +48,30 @@ ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
iounmap(vaddr);
return csize;
}
+
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char *buf,
+ size_t csize, unsigned long offset, int userbuf)
+{
+ void *vaddr;
+
+ if (!csize)
+ return 0;
+
+ vaddr = ioremap_encrypted(pfn << PAGE_SHIFT, PAGE_SIZE);
+ if (!vaddr)
+ return -ENOMEM;
+
+ if (userbuf) {
+ if (copy_to_user(buf, vaddr + offset, csize)) {
+ iounmap(vaddr);
+ return -EFAULT;
+ }
+ } else
+ memcpy(buf, vaddr + offset, csize);
+
+ set_iounmap_nonlazy();
+ iounmap(vaddr);
+ return csize;
+}
+#endif
diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
index 849c4fb..6da9b0c 100644
--- a/drivers/acpi/tables.c
+++ b/drivers/acpi/tables.c
@@ -36,6 +36,10 @@
#include <linux/memblock.h>
#include <linux/initrd.h>
#include "internal.h"
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+#include <linux/crash_dump.h>
+#include <linux/mem_encrypt.h>
+#endif

#ifdef CONFIG_ACPI_CUSTOM_DSDT
#include CONFIG_ACPI_CUSTOM_DSDT_FILE
@@ -566,7 +570,15 @@ void __init acpi_table_upgrade(void)
clen = size;
if (clen > MAP_CHUNK_SIZE - slop)
clen = MAP_CHUNK_SIZE - slop;
- dest_p = early_memremap(dest_addr & PAGE_MASK,
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+ if (sme_active() && is_kdump_kernel())
+ dest_p = early_memremap_decrypted(
+ dest_addr & PAGE_MASK,
+ clen + slop);
+ else
+#endif
+ dest_p = early_memremap(
+ dest_addr & PAGE_MASK,
clen + slop);
memcpy(dest_p + slop, src_p, clen);
early_memunmap(dest_p, clen + slop);
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 904c575..8ecbddb 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -889,11 +889,18 @@ static bool copy_device_table(void)
}

old_devtb_phys = entry & PAGE_MASK;
+ if (sme_active() && is_kdump_kernel())
+ old_devtb_phys = __sme_clr(old_devtb_phys);
if (old_devtb_phys >= 0x100000000ULL) {
pr_err("The address of old device table is above 4G, not trustworthy!\n");
return false;
}
- old_devtb = memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);
+ if (sme_active() && is_kdump_kernel())
+ old_devtb = ioremap_encrypted(old_devtb_phys,
+ dev_table_size);
+ else
+ old_devtb = memremap(old_devtb_phys,
+ dev_table_size, MEMREMAP_WB);
if (!old_devtb)
return false;

diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
index a45f0af..316e2b0 100644
--- a/fs/proc/vmcore.c
+++ b/fs/proc/vmcore.c
@@ -25,6 +25,10 @@
#include <linux/uaccess.h>
#include <asm/io.h>
#include "internal.h"
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+#include <linux/mem_encrypt.h>
+#include <asm/pgtable.h>
+#endif

/* List representing chunks of contiguous memory areas and their offsets in
* vmcore file.
@@ -86,7 +90,8 @@ static int pfn_is_ram(unsigned long pfn)

/* Reads a page from the oldmem device from given offset. */
static ssize_t read_from_oldmem(char *buf, size_t count,
- u64 *ppos, int userbuf)
+ u64 *ppos, int userbuf,
+ bool encrypted)
{
unsigned long pfn, offset;
size_t nr_bytes;
@@ -108,8 +113,15 @@ static ssize_t read_from_oldmem(char *buf, size_t count,
if (pfn_is_ram(pfn) == 0)
memset(buf, 0, nr_bytes);
else {
- tmp = copy_oldmem_page(pfn, buf, nr_bytes,
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+ if (encrypted)
+ tmp = copy_oldmem_page_encrypted(pfn, buf,
+ nr_bytes, offset, userbuf);
+ else
+#endif
+ tmp = copy_oldmem_page(pfn, buf, nr_bytes,
offset, userbuf);
+
if (tmp < 0)
return tmp;
}
@@ -143,7 +155,7 @@ void __weak elfcorehdr_free(unsigned long long addr)
*/
ssize_t __weak elfcorehdr_read(char *buf, size_t count, u64 *ppos)
{
- return read_from_oldmem(buf, count, ppos, 0);
+ return read_from_oldmem(buf, count, ppos, 0, false);
}

/*
@@ -151,7 +163,11 @@ ssize_t __weak elfcorehdr_read(char *buf, size_t count, u64 *ppos)
*/
ssize_t __weak elfcorehdr_read_notes(char *buf, size_t count, u64 *ppos)
{
- return read_from_oldmem(buf, count, ppos, 0);
+ bool flag = false;
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+ flag = sme_active();
+#endif
+ return read_from_oldmem(buf, count, ppos, 0, flag);
}

/*
@@ -161,6 +177,10 @@ int __weak remap_oldmem_pfn_range(struct vm_area_struct *vma,
unsigned long from, unsigned long pfn,
unsigned long size, pgprot_t prot)
{
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+ if (sme_active())
+ prot = __pgprot(pgprot_val(prot) | _PAGE_ENC);
+#endif
return remap_pfn_range(vma, from, pfn, size, prot);
}

@@ -188,6 +208,11 @@ static ssize_t __read_vmcore(char *buffer, size_t buflen, loff_t *fpos,
size_t tsz;
u64 start;
struct vmcore *m = NULL;
+ bool sme_flag = false;
+
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+ sme_flag = sme_active();
+#endif

if (buflen == 0 || *fpos >= vmcore_size)
return 0;
@@ -235,7 +260,8 @@ static ssize_t __read_vmcore(char *buffer, size_t buflen, loff_t *fpos,
m->offset + m->size - *fpos,
buflen);
start = m->paddr + *fpos - m->offset;
- tmp = read_from_oldmem(buffer, tsz, &start, userbuf);
+ tmp = read_from_oldmem(buffer, tsz, &start,
+ userbuf, sme_flag);
if (tmp < 0)
return tmp;
buflen -= tsz;
diff --git a/include/linux/crash_dump.h b/include/linux/crash_dump.h
index f7ac2aa..024ae9e 100644
--- a/include/linux/crash_dump.h
+++ b/include/linux/crash_dump.h
@@ -25,6 +25,10 @@ extern int remap_oldmem_pfn_range(struct vm_area_struct *vma,

extern ssize_t copy_oldmem_page(unsigned long, char *, size_t,
unsigned long, int);
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+extern ssize_t copy_oldmem_page_encrypted(unsigned long, char *, size_t,
+ unsigned long, int);
+#endif
void vmcore_cleanup(void);

/* Architecture code defines this if there are other possible ELF
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 20fef1a..3c22a9b 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -471,6 +471,16 @@ static struct page *kimage_alloc_crash_control_pages(struct kimage *image,
}
}

+ if (pages) {
+ unsigned int count, i;
+
+ pages->mapping = NULL;
+ set_page_private(pages, order);
+ count = 1 << order;
+ for (i = 0; i < count; i++)
+ SetPageReserved(pages + i);
+ arch_kexec_post_alloc_pages(page_address(pages), 1 << order, 0);
+ }
return pages;
}

@@ -865,6 +875,7 @@ static int kimage_load_crash_segment(struct kimage *image,
result = -ENOMEM;
goto out;
}
+ arch_kexec_post_alloc_pages(page_address(page), 1, 0);
ptr = kmap(page);
ptr += maddr & ~PAGE_MASK;
mchunk = min_t(size_t, mbytes,
@@ -882,6 +893,7 @@ static int kimage_load_crash_segment(struct kimage *image,
result = copy_from_user(ptr, buf, uchunk);
kexec_flush_icache_page(page);
kunmap(page);
+ arch_kexec_pre_free_pages(page_address(page), 1);
if (result) {
result = -EFAULT;
goto out;
--
2.9.5


2018-05-15 01:52:32

by Lianbo Jiang

[permalink] [raw]
Subject: [PATCH 1/2] add a function(ioremap_encrypted) for kdump when AMD sme enabled.

It is convenient to remap the old memory encrypted to the second kernel
by calling ioremap_encrypted().

Signed-off-by: Lianbo Jiang <[email protected]>
---
arch/x86/include/asm/io.h | 2 ++
arch/x86/mm/ioremap.c | 25 +++++++++++++++++--------
2 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index f6e5b93..06d2a9f 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -192,6 +192,8 @@ extern void __iomem *ioremap_cache(resource_size_t offset, unsigned long size);
#define ioremap_cache ioremap_cache
extern void __iomem *ioremap_prot(resource_size_t offset, unsigned long size, unsigned long prot_val);
#define ioremap_prot ioremap_prot
+extern void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned long size);
+#define ioremap_encrypted ioremap_encrypted

/**
* ioremap - map bus memory into CPU space
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index c63a545..7a52d1e 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -131,7 +131,8 @@ static void __ioremap_check_mem(resource_size_t addr, unsigned long size,
* caller shouldn't need to know that small detail.
*/
static void __iomem *__ioremap_caller(resource_size_t phys_addr,
- unsigned long size, enum page_cache_mode pcm, void *caller)
+ unsigned long size, enum page_cache_mode pcm,
+ void *caller, bool encrypted)
{
unsigned long offset, vaddr;
resource_size_t last_addr;
@@ -199,7 +200,8 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
* resulting mapping.
*/
prot = PAGE_KERNEL_IO;
- if (sev_active() && mem_flags.desc_other)
+ if ((sev_active() && mem_flags.desc_other) ||
+ (encrypted && sme_active()))
prot = pgprot_encrypted(prot);

switch (pcm) {
@@ -291,7 +293,7 @@ void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC_MINUS;

return __ioremap_caller(phys_addr, size, pcm,
- __builtin_return_address(0));
+ __builtin_return_address(0), false);
}
EXPORT_SYMBOL(ioremap_nocache);

@@ -324,7 +326,7 @@ void __iomem *ioremap_uc(resource_size_t phys_addr, unsigned long size)
enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC;

return __ioremap_caller(phys_addr, size, pcm,
- __builtin_return_address(0));
+ __builtin_return_address(0), false);
}
EXPORT_SYMBOL_GPL(ioremap_uc);

@@ -341,7 +343,7 @@ EXPORT_SYMBOL_GPL(ioremap_uc);
void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
{
return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WC,
- __builtin_return_address(0));
+ __builtin_return_address(0), false);
}
EXPORT_SYMBOL(ioremap_wc);

@@ -358,14 +360,21 @@ EXPORT_SYMBOL(ioremap_wc);
void __iomem *ioremap_wt(resource_size_t phys_addr, unsigned long size)
{
return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WT,
- __builtin_return_address(0));
+ __builtin_return_address(0), false);
}
EXPORT_SYMBOL(ioremap_wt);

+void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned long size)
+{
+ return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
+ __builtin_return_address(0), true);
+}
+EXPORT_SYMBOL(ioremap_encrypted);
+
void __iomem *ioremap_cache(resource_size_t phys_addr, unsigned long size)
{
return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
- __builtin_return_address(0));
+ __builtin_return_address(0), false);
}
EXPORT_SYMBOL(ioremap_cache);

@@ -374,7 +383,7 @@ void __iomem *ioremap_prot(resource_size_t phys_addr, unsigned long size,
{
return __ioremap_caller(phys_addr, size,
pgprot2cachemode(__pgprot(prot_val)),
- __builtin_return_address(0));
+ __builtin_return_address(0), false);
}
EXPORT_SYMBOL(ioremap_prot);

--
2.9.5


2018-05-15 12:47:21

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 2/2] support kdump when AMD secure memory encryption is active

Hi Lianbo,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v4.17-rc5 next-20180514]
[cannot apply to tip/x86/core]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Lianbo-Jiang/support-kdump-for-AMD-secure-memory-encryption-sme/20180515-135732
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

>> drivers/iommu/amd_iommu_init.c:899:27: sparse: incorrect type in assignment (different address spaces) @@ expected struct dev_table_entry *old_devtb @@ got truct dev_table_entry *old_devtb @@
drivers/iommu/amd_iommu_init.c:899:27: expected struct dev_table_entry *old_devtb
drivers/iommu/amd_iommu_init.c:899:27: got void [noderef] <asn:2>*
drivers/iommu/amd_iommu_init.c:1740:39: sparse: expression using sizeof(void)
drivers/iommu/amd_iommu_init.c:1740:39: sparse: expression using sizeof(void)
drivers/iommu/amd_iommu_init.c:1750:49: sparse: expression using sizeof(void)
drivers/iommu/amd_iommu_init.c:1750:49: sparse: expression using sizeof(void)
drivers/iommu/amd_iommu_init.c:2950:18: sparse: symbol 'get_amd_iommu' was not declared. Should it be static?
drivers/iommu/amd_iommu_init.c:2969:4: sparse: symbol 'amd_iommu_pc_get_max_banks' was not declared. Should it be static?
drivers/iommu/amd_iommu_init.c:2980:6: sparse: symbol 'amd_iommu_pc_supported' was not declared. Should it be static?
drivers/iommu/amd_iommu_init.c:2986:4: sparse: symbol 'amd_iommu_pc_get_max_counters' was not declared. Should it be static?
drivers/iommu/amd_iommu_init.c:3035:5: sparse: symbol 'amd_iommu_pc_get_reg' was not declared. Should it be static?
drivers/iommu/amd_iommu_init.c:3044:5: sparse: symbol 'amd_iommu_pc_set_reg' was not declared. Should it be static?

vim +899 drivers/iommu/amd_iommu_init.c

854
855
856 static bool copy_device_table(void)
857 {
858 u64 int_ctl, int_tab_len, entry = 0, last_entry = 0;
859 struct dev_table_entry *old_devtb = NULL;
860 u32 lo, hi, devid, old_devtb_size;
861 phys_addr_t old_devtb_phys;
862 struct amd_iommu *iommu;
863 u16 dom_id, dte_v, irq_v;
864 gfp_t gfp_flag;
865 u64 tmp;
866
867 if (!amd_iommu_pre_enabled)
868 return false;
869
870 pr_warn("Translation is already enabled - trying to copy translation structures\n");
871 for_each_iommu(iommu) {
872 /* All IOMMUs should use the same device table with the same size */
873 lo = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET);
874 hi = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET + 4);
875 entry = (((u64) hi) << 32) + lo;
876 if (last_entry && last_entry != entry) {
877 pr_err("IOMMU:%d should use the same dev table as others!\n",
878 iommu->index);
879 return false;
880 }
881 last_entry = entry;
882
883 old_devtb_size = ((entry & ~PAGE_MASK) + 1) << 12;
884 if (old_devtb_size != dev_table_size) {
885 pr_err("The device table size of IOMMU:%d is not expected!\n",
886 iommu->index);
887 return false;
888 }
889 }
890
891 old_devtb_phys = entry & PAGE_MASK;
892 if (sme_active() && is_kdump_kernel())
893 old_devtb_phys = __sme_clr(old_devtb_phys);
894 if (old_devtb_phys >= 0x100000000ULL) {
895 pr_err("The address of old device table is above 4G, not trustworthy!\n");
896 return false;
897 }
898 if (sme_active() && is_kdump_kernel())
> 899 old_devtb = ioremap_encrypted(old_devtb_phys,
900 dev_table_size);
901 else
902 old_devtb = memremap(old_devtb_phys,
903 dev_table_size, MEMREMAP_WB);
904 if (!old_devtb)
905 return false;
906
907 gfp_flag = GFP_KERNEL | __GFP_ZERO | GFP_DMA32;
908 old_dev_tbl_cpy = (void *)__get_free_pages(gfp_flag,
909 get_order(dev_table_size));
910 if (old_dev_tbl_cpy == NULL) {
911 pr_err("Failed to allocate memory for copying old device table!\n");
912 return false;
913 }
914
915 for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) {
916 old_dev_tbl_cpy[devid] = old_devtb[devid];
917 dom_id = old_devtb[devid].data[1] & DEV_DOMID_MASK;
918 dte_v = old_devtb[devid].data[0] & DTE_FLAG_V;
919
920 if (dte_v && dom_id) {
921 old_dev_tbl_cpy[devid].data[0] = old_devtb[devid].data[0];
922 old_dev_tbl_cpy[devid].data[1] = old_devtb[devid].data[1];
923 __set_bit(dom_id, amd_iommu_pd_alloc_bitmap);
924 /* If gcr3 table existed, mask it out */
925 if (old_devtb[devid].data[0] & DTE_FLAG_GV) {
926 tmp = DTE_GCR3_VAL_B(~0ULL) << DTE_GCR3_SHIFT_B;
927 tmp |= DTE_GCR3_VAL_C(~0ULL) << DTE_GCR3_SHIFT_C;
928 old_dev_tbl_cpy[devid].data[1] &= ~tmp;
929 tmp = DTE_GCR3_VAL_A(~0ULL) << DTE_GCR3_SHIFT_A;
930 tmp |= DTE_FLAG_GV;
931 old_dev_tbl_cpy[devid].data[0] &= ~tmp;
932 }
933 }
934
935 irq_v = old_devtb[devid].data[2] & DTE_IRQ_REMAP_ENABLE;
936 int_ctl = old_devtb[devid].data[2] & DTE_IRQ_REMAP_INTCTL_MASK;
937 int_tab_len = old_devtb[devid].data[2] & DTE_IRQ_TABLE_LEN_MASK;
938 if (irq_v && (int_ctl || int_tab_len)) {
939 if ((int_ctl != DTE_IRQ_REMAP_INTCTL) ||
940 (int_tab_len != DTE_IRQ_TABLE_LEN)) {
941 pr_err("Wrong old irq remapping flag: %#x\n", devid);
942 return false;
943 }
944
945 old_dev_tbl_cpy[devid].data[2] = old_devtb[devid].data[2];
946 }
947 }
948 memunmap(old_devtb);
949
950 return true;
951 }
952

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation

2018-05-15 13:53:47

by Tom Lendacky

[permalink] [raw]
Subject: Re: [PATCH 0/2] support kdump for AMD secure memory encryption(sme)

On 5/14/2018 8:51 PM, Lianbo Jiang wrote:
> It is convenient to remap the old memory encrypted to the second kernel by
> calling ioremap_encrypted().
>
> When sme enabled on AMD server, we also need to support kdump. Because
> the memory is encrypted in the first kernel, we will remap the old memory
> encrypted to the second kernel(crash kernel), and sme is also enabled in
> the second kernel, otherwise the old memory encrypted can not be decrypted.
> Because simply changing the value of a C-bit on a page will not
> automatically encrypt the existing contents of a page, and any data in the
> page prior to the C-bit modification will become unintelligible. A page of
> memory that is marked encrypted will be automatically decrypted when read
> from DRAM and will be automatically encrypted when written to DRAM.
>
> For the kdump, it is necessary to distinguish whether the memory is
> encrypted. Furthermore, we should also know which part of the memory is
> encrypted or decrypted. We will appropriately remap the memory according
> to the specific situation in order to tell cpu how to deal with the
> data(encrypted or decrypted). For example, when sme enabled, if the old
> memory is encrypted, we will remap the old memory in encrypted way, which
> will automatically decrypt the old memory encrypted when we read those data
> from the remapping address.
>
> ----------------------------------------------
> | first-kernel | second-kernel | kdump support |
> | (mem_encrypt=on|off) | (yes|no) |
> |--------------+---------------+---------------|
> | on | on | yes |
> | off | off | yes |
> | on | off | no |
> | off | on | no |
> |______________|_______________|_______________|
>
> Test tools:
> makedumpfile[v1.6.3]: https://github.com/LianboJ/makedumpfile
> commit e1de103eca8f (A draft for kdump vmcore about AMD SME)
> Author: Lianbo Jiang <[email protected]>
> Date: Mon May 14 17:02:40 2018 +0800
> Note: This patch can only dump vmcore in the case of SME enabled.
>
> crash-7.2.1: https://github.com/crash-utility/crash.git
> commit 1e1bd9c4c1be (Fix for the "bpf" command display on Linux 4.17-rc1)
> Author: Dave Anderson <[email protected]>
> Date: Fri May 11 15:54:32 2018 -0400
>
> Test environment:
> HP ProLiant DL385Gen10 AMD EPYC 7251
> 8-Core Processor
> 32768 MB memory
> 600 GB disk space
>
> Linux 4.17-rc4:
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> commit 75bc37fefc44 ("Linux 4.17-rc4")
> Author: Linus Torvalds <[email protected]>
> Date: Sun May 6 16:57:38 2018 -1000
>
> Reference:
> AMD64 Architecture Programmer's Manual
> https://support.amd.com/TechDocs/24593.pdf
>

Have you also tested this with SEV? It would be nice if the kdump
changes you make work with both SME and SEV.

Thanks,
Tom

> Lianbo Jiang (2):
> add a function(ioremap_encrypted) for kdump when AMD sme enabled.
> support kdump when AMD secure memory encryption is active
>
> arch/x86/include/asm/dmi.h | 14 +++++++++++++-
> arch/x86/include/asm/io.h | 2 ++
> arch/x86/kernel/acpi/boot.c | 8 ++++++++
> arch/x86/kernel/crash_dump_64.c | 27 +++++++++++++++++++++++++++
> arch/x86/mm/ioremap.c | 25 +++++++++++++++++--------
> drivers/acpi/tables.c | 14 +++++++++++++-
> drivers/iommu/amd_iommu_init.c | 9 ++++++++-
> fs/proc/vmcore.c | 36 +++++++++++++++++++++++++++++++-----
> include/linux/crash_dump.h | 4 ++++
> kernel/kexec_core.c | 12 ++++++++++++
> 10 files changed, 135 insertions(+), 16 deletions(-)
>

2018-05-15 14:45:19

by Tom Lendacky

[permalink] [raw]
Subject: Re: [PATCH 1/2] add a function(ioremap_encrypted) for kdump when AMD sme enabled.

On 5/14/2018 8:51 PM, Lianbo Jiang wrote:
> It is convenient to remap the old memory encrypted to the second kernel
> by calling ioremap_encrypted().
>
> Signed-off-by: Lianbo Jiang <[email protected]>
> ---
> arch/x86/include/asm/io.h | 2 ++
> arch/x86/mm/ioremap.c | 25 +++++++++++++++++--------
> 2 files changed, 19 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
> index f6e5b93..06d2a9f 100644
> --- a/arch/x86/include/asm/io.h
> +++ b/arch/x86/include/asm/io.h
> @@ -192,6 +192,8 @@ extern void __iomem *ioremap_cache(resource_size_t offset, unsigned long size);
> #define ioremap_cache ioremap_cache
> extern void __iomem *ioremap_prot(resource_size_t offset, unsigned long size, unsigned long prot_val);
> #define ioremap_prot ioremap_prot
> +extern void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned long size);
> +#define ioremap_encrypted ioremap_encrypted
>
> /**
> * ioremap - map bus memory into CPU space
> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
> index c63a545..7a52d1e 100644
> --- a/arch/x86/mm/ioremap.c
> +++ b/arch/x86/mm/ioremap.c
> @@ -131,7 +131,8 @@ static void __ioremap_check_mem(resource_size_t addr, unsigned long size,
> * caller shouldn't need to know that small detail.
> */
> static void __iomem *__ioremap_caller(resource_size_t phys_addr,
> - unsigned long size, enum page_cache_mode pcm, void *caller)
> + unsigned long size, enum page_cache_mode pcm,
> + void *caller, bool encrypted)
> {
> unsigned long offset, vaddr;
> resource_size_t last_addr;
> @@ -199,7 +200,8 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
> * resulting mapping.
> */
> prot = PAGE_KERNEL_IO;
> - if (sev_active() && mem_flags.desc_other)
> + if ((sev_active() && mem_flags.desc_other) ||
> + (encrypted && sme_active()))

You only need the encrypted check here. If sme is not active,
then the pgprot_encrypted will basically be a no-op.

Also, extra indents.

Thanks,
Tom

> prot = pgprot_encrypted(prot);
>
> switch (pcm) {
> @@ -291,7 +293,7 @@ void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
> enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC_MINUS;
>
> return __ioremap_caller(phys_addr, size, pcm,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
> }
> EXPORT_SYMBOL(ioremap_nocache);
>
> @@ -324,7 +326,7 @@ void __iomem *ioremap_uc(resource_size_t phys_addr, unsigned long size)
> enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC;
>
> return __ioremap_caller(phys_addr, size, pcm,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
> }
> EXPORT_SYMBOL_GPL(ioremap_uc);
>
> @@ -341,7 +343,7 @@ EXPORT_SYMBOL_GPL(ioremap_uc);
> void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
> {
> return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WC,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
> }
> EXPORT_SYMBOL(ioremap_wc);
>
> @@ -358,14 +360,21 @@ EXPORT_SYMBOL(ioremap_wc);
> void __iomem *ioremap_wt(resource_size_t phys_addr, unsigned long size)
> {
> return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WT,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
> }
> EXPORT_SYMBOL(ioremap_wt);
>
> +void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned long size)
> +{
> + return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
> + __builtin_return_address(0), true);
> +}
> +EXPORT_SYMBOL(ioremap_encrypted);
> +
> void __iomem *ioremap_cache(resource_size_t phys_addr, unsigned long size)
> {
> return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
> }
> EXPORT_SYMBOL(ioremap_cache);
>
> @@ -374,7 +383,7 @@ void __iomem *ioremap_prot(resource_size_t phys_addr, unsigned long size,
> {
> return __ioremap_caller(phys_addr, size,
> pgprot2cachemode(__pgprot(prot_val)),
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
> }
> EXPORT_SYMBOL(ioremap_prot);
>
>

2018-05-15 20:20:02

by Tom Lendacky

[permalink] [raw]
Subject: Re: [PATCH 2/2] support kdump when AMD secure memory encryption is active

On 5/14/2018 8:51 PM, Lianbo Jiang wrote:
> When sme enabled on AMD server, we also need to support kdump. Because
> the memory is encrypted in the first kernel, we will remap the old memory
> encrypted to the second kernel(crash kernel), and sme is also enabled in
> the second kernel, otherwise the old memory encrypted can not be decrypted.
> Because simply changing the value of a C-bit on a page will not
> automatically encrypt the existing contents of a page, and any data in the
> page prior to the C-bit modification will become unintelligible. A page of
> memory that is marked encrypted will be automatically decrypted when read
> from DRAM and will be automatically encrypted when written to DRAM.
>
> For the kdump, it is necessary to distinguish whether the memory is
> encrypted. Furthermore, we should also know which part of the memory is
> encrypted or decrypted. We will appropriately remap the memory according
> to the specific situation in order to tell cpu how to deal with the data(
> encrypted or unencrypted). For example, when sme enabled, if the old memory
> is encrypted, we will remap the old memory in encrypted way, which will
> automatically decrypt the old memory encrypted when we read those data from
> the remapping address.
>
> ----------------------------------------------
> | first-kernel | second-kernel | kdump support |
> | (mem_encrypt=on|off) | (yes|no) |
> |--------------+---------------+---------------|
> | on | on | yes |
> | off | off | yes |
> | on | off | no |
> | off | on | no |
> |______________|_______________|_______________|
>
> Signed-off-by: Lianbo Jiang <[email protected]>
> ---
> arch/x86/include/asm/dmi.h | 14 +++++++++++++-
> arch/x86/kernel/acpi/boot.c | 8 ++++++++
> arch/x86/kernel/crash_dump_64.c | 27 +++++++++++++++++++++++++++
> drivers/acpi/tables.c | 14 +++++++++++++-
> drivers/iommu/amd_iommu_init.c | 9 ++++++++-
> fs/proc/vmcore.c | 36 +++++++++++++++++++++++++++++++-----
> include/linux/crash_dump.h | 4 ++++
> kernel/kexec_core.c | 12 ++++++++++++
> 8 files changed, 116 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/include/asm/dmi.h b/arch/x86/include/asm/dmi.h
> index 0ab2ab2..a5663b4 100644
> --- a/arch/x86/include/asm/dmi.h
> +++ b/arch/x86/include/asm/dmi.h
> @@ -7,6 +7,10 @@
>
> #include <asm/io.h>
> #include <asm/setup.h>
> +#ifdef CONFIG_AMD_MEM_ENCRYPT

I don't think you need all of the #ifdef stuff throughout this
patch. Everything should work just fine without it.

> +#include <linux/crash_dump.h>
> +#include <linux/mem_encrypt.h>
> +#endif
>
> static __always_inline __init void *dmi_alloc(unsigned len)
> {
> @@ -14,7 +18,15 @@ static __always_inline __init void *dmi_alloc(unsigned len)
> }
>
> /* Use early IO mappings for DMI because it's initialized early */
> -#define dmi_early_remap early_memremap
> +static __always_inline __init void *dmi_early_remap(resource_size_t
> + phys_addr, unsigned long size)
> +{
> +#ifdef CONFIG_AMD_MEM_ENCRYPT

Again, no need for the #ifdef here. You should probably audit the
code for all of these and truly determine if they are really needed.

> + if (sme_active() && is_kdump_kernel())

Use of sme_active() here is good since under SEV, this area will be
encrypted.

> + return early_memremap_decrypted(phys_addr, size);
> +#endif
> + return early_memremap(phys_addr, size);

Instead of doing this, maybe it makes more sense to put this logic
somewhere in the early_memremap() path. Possibly smarten up the
early_memremap_pgprot_adjust() function with some kdump kernel
related logic. Not sure it's possible, but would be nice since you
have this logic in a couple of places.

> +}
> #define dmi_early_unmap early_memunmap
> #define dmi_remap(_x, _l) memremap(_x, _l, MEMREMAP_WB)
> #define dmi_unmap(_x) memunmap(_x)
> diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
> index 3b20607..354ad66 100644
> --- a/arch/x86/kernel/acpi/boot.c
> +++ b/arch/x86/kernel/acpi/boot.c
> @@ -48,6 +48,10 @@
> #include <asm/mpspec.h>
> #include <asm/smp.h>
> #include <asm/i8259.h>
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +#include <linux/crash_dump.h>
> +#include <linux/mem_encrypt.h>
> +#endif
>
> #include "sleep.h" /* To include x86_acpi_suspend_lowlevel */
> static int __initdata acpi_force = 0;
> @@ -124,6 +128,10 @@ void __init __iomem *__acpi_map_table(unsigned long phys, unsigned long size)
> if (!phys || !size)
> return NULL;
>
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> + if (sme_active() && is_kdump_kernel())
> + return early_memremap_decrypted(phys, size);
> +#endif

Same as previous comment(s).

> return early_memremap(phys, size);
> }
>
> diff --git a/arch/x86/kernel/crash_dump_64.c b/arch/x86/kernel/crash_dump_64.c
> index 4f2e077..2ef67fc 100644
> --- a/arch/x86/kernel/crash_dump_64.c
> +++ b/arch/x86/kernel/crash_dump_64.c
> @@ -48,3 +48,30 @@ ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
> iounmap(vaddr);
> return csize;
> }
> +
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char *buf,
> + size_t csize, unsigned long offset, int userbuf)
> +{
> + void *vaddr;
> +
> + if (!csize)
> + return 0;
> +
> + vaddr = ioremap_encrypted(pfn << PAGE_SHIFT, PAGE_SIZE);
> + if (!vaddr)
> + return -ENOMEM;
> +
> + if (userbuf) {
> + if (copy_to_user(buf, vaddr + offset, csize)) {
> + iounmap(vaddr);
> + return -EFAULT;
> + }
> + } else
> + memcpy(buf, vaddr + offset, csize);
> +
> + set_iounmap_nonlazy();
> + iounmap(vaddr);
> + return csize;
> +}
> +#endif

This seems exactly the same as copy_oldmem_page() with the difference
being the type of ioremap done. Might be better to make the code
after the ioremap's a common piece of code that each of the copy_oldmem
functions would call.

> diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
> index 849c4fb..6da9b0c 100644
> --- a/drivers/acpi/tables.c
> +++ b/drivers/acpi/tables.c
> @@ -36,6 +36,10 @@
> #include <linux/memblock.h>
> #include <linux/initrd.h>
> #include "internal.h"
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +#include <linux/crash_dump.h>
> +#include <linux/mem_encrypt.h>
> +#endif
>
> #ifdef CONFIG_ACPI_CUSTOM_DSDT
> #include CONFIG_ACPI_CUSTOM_DSDT_FILE
> @@ -566,7 +570,15 @@ void __init acpi_table_upgrade(void)
> clen = size;
> if (clen > MAP_CHUNK_SIZE - slop)
> clen = MAP_CHUNK_SIZE - slop;
> - dest_p = early_memremap(dest_addr & PAGE_MASK,
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> + if (sme_active() && is_kdump_kernel())
> + dest_p = early_memremap_decrypted(
> + dest_addr & PAGE_MASK,
> + clen + slop);
> + else
> +#endif
> + dest_p = early_memremap(
> + dest_addr & PAGE_MASK,

So if the dest_addr (based off of acpi_tables_addr) was added to the e820
map as an ACPI area (which it will be), then it would be mapped properly
(in both SME and SEV) without needing the if/then/else.

> clen + slop);
> memcpy(dest_p + slop, src_p, clen);
> early_memunmap(dest_p, clen + slop);
> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> index 904c575..8ecbddb 100644
> --- a/drivers/iommu/amd_iommu_init.c
> +++ b/drivers/iommu/amd_iommu_init.c
> @@ -889,11 +889,18 @@ static bool copy_device_table(void)
> }
>
> old_devtb_phys = entry & PAGE_MASK;
> + if (sme_active() && is_kdump_kernel())

Use mem_encrypt_active() here to cover both SME and SEV.

> + old_devtb_phys = __sme_clr(old_devtb_phys);
> if (old_devtb_phys >= 0x100000000ULL) {
> pr_err("The address of old device table is above 4G, not trustworthy!\n");
> return false;
> }
> - old_devtb = memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);
> + if (sme_active() && is_kdump_kernel())
> + old_devtb = ioremap_encrypted(old_devtb_phys,
> + dev_table_size);
> + else
> + old_devtb = memremap(old_devtb_phys,
> + dev_table_size, MEMREMAP_WB);

What happens to the memremap here, does it fall back to the ioremap and
end up getting mapped decrypted? It would be nice to do the right thing
under the covers of memremap. Not sure what that would take, but it would
keep the code nice and clean.

> if (!old_devtb)
> return false;
>
> diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
> index a45f0af..316e2b0 100644
> --- a/fs/proc/vmcore.c
> +++ b/fs/proc/vmcore.c
> @@ -25,6 +25,10 @@
> #include <linux/uaccess.h>
> #include <asm/io.h>
> #include "internal.h"
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +#include <linux/mem_encrypt.h>
> +#include <asm/pgtable.h>
> +#endif
>
> /* List representing chunks of contiguous memory areas and their offsets in
> * vmcore file.
> @@ -86,7 +90,8 @@ static int pfn_is_ram(unsigned long pfn)
>
> /* Reads a page from the oldmem device from given offset. */
> static ssize_t read_from_oldmem(char *buf, size_t count,
> - u64 *ppos, int userbuf)
> + u64 *ppos, int userbuf,
> + bool encrypted)
> {
> unsigned long pfn, offset;
> size_t nr_bytes;
> @@ -108,8 +113,15 @@ static ssize_t read_from_oldmem(char *buf, size_t count,
> if (pfn_is_ram(pfn) == 0)
> memset(buf, 0, nr_bytes);
> else {
> - tmp = copy_oldmem_page(pfn, buf, nr_bytes,
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> + if (encrypted)
> + tmp = copy_oldmem_page_encrypted(pfn, buf,
> + nr_bytes, offset, userbuf);
> + else
> +#endif
> + tmp = copy_oldmem_page(pfn, buf, nr_bytes,
> offset, userbuf);> +
> if (tmp < 0)
> return tmp;
> }
> @@ -143,7 +155,7 @@ void __weak elfcorehdr_free(unsigned long long addr)
> */
> ssize_t __weak elfcorehdr_read(char *buf, size_t count, u64 *ppos)
> {
> - return read_from_oldmem(buf, count, ppos, 0);
> + return read_from_oldmem(buf, count, ppos, 0, false);

For SEV, this will likely be encrypted, so you can probably replace the
"false" with sev_active() so that under SME it is un-encrypted but under
SEV it is encrypted. Where is the elfcorehdr stored? I wonder if it
could be created as encrypted under SME and then you could actually remove
the encrypted parameter from read_from_oldmem() and always map encrypted.
If SME or SEV are active it will be mapped encrypted and if they aren't
then it is mapped normally.

> }
>
> /*
> @@ -151,7 +163,11 @@ ssize_t __weak elfcorehdr_read(char *buf, size_t count, u64 *ppos)
> */
> ssize_t __weak elfcorehdr_read_notes(char *buf, size_t count, u64 *ppos)
> {
> - return read_from_oldmem(buf, count, ppos, 0);
> + bool flag = false;
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> + flag = sme_active();
> +#endif
> + return read_from_oldmem(buf, count, ppos, 0, flag);
> }
>
> /*
> @@ -161,6 +177,10 @@ int __weak remap_oldmem_pfn_range(struct vm_area_struct *vma,
> unsigned long from, unsigned long pfn,
> unsigned long size, pgprot_t prot)
> {
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> + if (sme_active())

No need for the sme_active() check here, the encryption will be applied,
for both SME and SEV, if memory encryption is active, otherwise it won't.

> + prot = __pgprot(pgprot_val(prot) | _PAGE_ENC);

prot = pgprot_encrypted(prot);

> +#endif> return remap_pfn_range(vma, from, pfn, size, prot);
> }
>
> @@ -188,6 +208,11 @@ static ssize_t __read_vmcore(char *buffer, size_t buflen, loff_t *fpos,
> size_t tsz;
> u64 start;
> struct vmcore *m = NULL;
> + bool sme_flag = false;
> +
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> + sme_flag = sme_active();> +#endif

Probably just want mem_encrypt_active() here to get both SME and SEV
cases mapped as encrypted.

Thanks,
Tom

>
> if (buflen == 0 || *fpos >= vmcore_size)
> return 0;
> @@ -235,7 +260,8 @@ static ssize_t __read_vmcore(char *buffer, size_t buflen, loff_t *fpos,
> m->offset + m->size - *fpos,
> buflen);
> start = m->paddr + *fpos - m->offset;
> - tmp = read_from_oldmem(buffer, tsz, &start, userbuf);
> + tmp = read_from_oldmem(buffer, tsz, &start,
> + userbuf, sme_flag);
> if (tmp < 0)
> return tmp;
> buflen -= tsz;
> diff --git a/include/linux/crash_dump.h b/include/linux/crash_dump.h
> index f7ac2aa..024ae9e 100644
> --- a/include/linux/crash_dump.h
> +++ b/include/linux/crash_dump.h
> @@ -25,6 +25,10 @@ extern int remap_oldmem_pfn_range(struct vm_area_struct *vma,
>
> extern ssize_t copy_oldmem_page(unsigned long, char *, size_t,
> unsigned long, int);
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +extern ssize_t copy_oldmem_page_encrypted(unsigned long, char *, size_t,
> + unsigned long, int);
> +#endif
> void vmcore_cleanup(void);
>
> /* Architecture code defines this if there are other possible ELF
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 20fef1a..3c22a9b 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -471,6 +471,16 @@ static struct page *kimage_alloc_crash_control_pages(struct kimage *image,
> }
> }
>
> + if (pages) {
> + unsigned int count, i;
> +
> + pages->mapping = NULL;
> + set_page_private(pages, order);
> + count = 1 << order;
> + for (i = 0; i < count; i++)
> + SetPageReserved(pages + i);
> + arch_kexec_post_alloc_pages(page_address(pages), 1 << order, 0);
> + }
> return pages;
> }
>
> @@ -865,6 +875,7 @@ static int kimage_load_crash_segment(struct kimage *image,
> result = -ENOMEM;
> goto out;
> }
> + arch_kexec_post_alloc_pages(page_address(page), 1, 0);
> ptr = kmap(page);
> ptr += maddr & ~PAGE_MASK;
> mchunk = min_t(size_t, mbytes,
> @@ -882,6 +893,7 @@ static int kimage_load_crash_segment(struct kimage *image,
> result = copy_from_user(ptr, buf, uchunk);
> kexec_flush_icache_page(page);
> kunmap(page);
> + arch_kexec_pre_free_pages(page_address(page), 1);
> if (result) {
> result = -EFAULT;
> goto out;
>

2018-05-16 13:21:37

by Lianbo Jiang

[permalink] [raw]
Subject: Re: [PATCH 1/2] add a function(ioremap_encrypted) for kdump when AMD sme enabled.



在 2018年05月15日 22:34, Tom Lendacky 写道:
> On 5/14/2018 8:51 PM, Lianbo Jiang wrote:
>> It is convenient to remap the old memory encrypted to the second kernel
>> by calling ioremap_encrypted().
>>
>> Signed-off-by: Lianbo Jiang <[email protected]>
>> ---
>> arch/x86/include/asm/io.h | 2 ++
>> arch/x86/mm/ioremap.c | 25 +++++++++++++++++--------
>> 2 files changed, 19 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
>> index f6e5b93..06d2a9f 100644
>> --- a/arch/x86/include/asm/io.h
>> +++ b/arch/x86/include/asm/io.h
>> @@ -192,6 +192,8 @@ extern void __iomem *ioremap_cache(resource_size_t offset, unsigned long size);
>> #define ioremap_cache ioremap_cache
>> extern void __iomem *ioremap_prot(resource_size_t offset, unsigned long size, unsigned long prot_val);
>> #define ioremap_prot ioremap_prot
>> +extern void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned long size);
>> +#define ioremap_encrypted ioremap_encrypted
>>
>> /**
>> * ioremap - map bus memory into CPU space
>> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
>> index c63a545..7a52d1e 100644
>> --- a/arch/x86/mm/ioremap.c
>> +++ b/arch/x86/mm/ioremap.c
>> @@ -131,7 +131,8 @@ static void __ioremap_check_mem(resource_size_t addr, unsigned long size,
>> * caller shouldn't need to know that small detail.
>> */
>> static void __iomem *__ioremap_caller(resource_size_t phys_addr,
>> - unsigned long size, enum page_cache_mode pcm, void *caller)
>> + unsigned long size, enum page_cache_mode pcm,
>> + void *caller, bool encrypted)
>> {
>> unsigned long offset, vaddr;
>> resource_size_t last_addr;
>> @@ -199,7 +200,8 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
>> * resulting mapping.
>> */
>> prot = PAGE_KERNEL_IO;
>> - if (sev_active() && mem_flags.desc_other)
>> + if ((sev_active() && mem_flags.desc_other) ||
>> + (encrypted && sme_active()))
>
> You only need the encrypted check here. If sme is not active,
> then the pgprot_encrypted will basically be a no-op.
>
Great! Thank you, Tom.
It will be fixed.

Lianbo
> Also, extra indents.
>
> Thanks,
> Tom
>
>> prot = pgprot_encrypted(prot);
>>
>> switch (pcm) {
>> @@ -291,7 +293,7 @@ void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
>> enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC_MINUS;
>>
>> return __ioremap_caller(phys_addr, size, pcm,
>> - __builtin_return_address(0));
>> + __builtin_return_address(0), false);
>> }
>> EXPORT_SYMBOL(ioremap_nocache);
>>
>> @@ -324,7 +326,7 @@ void __iomem *ioremap_uc(resource_size_t phys_addr, unsigned long size)
>> enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC;
>>
>> return __ioremap_caller(phys_addr, size, pcm,
>> - __builtin_return_address(0));
>> + __builtin_return_address(0), false);
>> }
>> EXPORT_SYMBOL_GPL(ioremap_uc);
>>
>> @@ -341,7 +343,7 @@ EXPORT_SYMBOL_GPL(ioremap_uc);
>> void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
>> {
>> return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WC,
>> - __builtin_return_address(0));
>> + __builtin_return_address(0), false);
>> }
>> EXPORT_SYMBOL(ioremap_wc);
>>
>> @@ -358,14 +360,21 @@ EXPORT_SYMBOL(ioremap_wc);
>> void __iomem *ioremap_wt(resource_size_t phys_addr, unsigned long size)
>> {
>> return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WT,
>> - __builtin_return_address(0));
>> + __builtin_return_address(0), false);
>> }
>> EXPORT_SYMBOL(ioremap_wt);
>>
>> +void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned long size)
>> +{
>> + return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
>> + __builtin_return_address(0), true);
>> +}
>> +EXPORT_SYMBOL(ioremap_encrypted);
>> +
>> void __iomem *ioremap_cache(resource_size_t phys_addr, unsigned long size)
>> {
>> return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
>> - __builtin_return_address(0));
>> + __builtin_return_address(0), false);
>> }
>> EXPORT_SYMBOL(ioremap_cache);
>>
>> @@ -374,7 +383,7 @@ void __iomem *ioremap_prot(resource_size_t phys_addr, unsigned long size,
>> {
>> return __ioremap_caller(phys_addr, size,
>> pgprot2cachemode(__pgprot(prot_val)),
>> - __builtin_return_address(0));
>> + __builtin_return_address(0), false);
>> }
>> EXPORT_SYMBOL(ioremap_prot);
>>
>>

2018-05-16 15:07:06

by Lianbo Jiang

[permalink] [raw]
Subject: Re: [PATCH 2/2] support kdump when AMD secure memory encryption is active

在 2018年05月16日 04:18, Tom Lendacky 写道:
> On 5/14/2018 8:51 PM, Lianbo Jiang wrote:
>> When sme enabled on AMD server, we also need to support kdump. Because
>> the memory is encrypted in the first kernel, we will remap the old memory
>> encrypted to the second kernel(crash kernel), and sme is also enabled in
>> the second kernel, otherwise the old memory encrypted can not be decrypted.
>> Because simply changing the value of a C-bit on a page will not
>> automatically encrypt the existing contents of a page, and any data in the
>> page prior to the C-bit modification will become unintelligible. A page of
>> memory that is marked encrypted will be automatically decrypted when read
>> from DRAM and will be automatically encrypted when written to DRAM.
>>
>> For the kdump, it is necessary to distinguish whether the memory is
>> encrypted. Furthermore, we should also know which part of the memory is
>> encrypted or decrypted. We will appropriately remap the memory according
>> to the specific situation in order to tell cpu how to deal with the data(
>> encrypted or unencrypted). For example, when sme enabled, if the old memory
>> is encrypted, we will remap the old memory in encrypted way, which will
>> automatically decrypt the old memory encrypted when we read those data from
>> the remapping address.
>>
>> ----------------------------------------------
>> | first-kernel | second-kernel | kdump support |
>> | (mem_encrypt=on|off) | (yes|no) |
>> |--------------+---------------+---------------|
>> | on | on | yes |
>> | off | off | yes |
>> | on | off | no |
>> | off | on | no |
>> |______________|_______________|_______________|
>>
>> Signed-off-by: Lianbo Jiang <[email protected]>
>> ---
>> arch/x86/include/asm/dmi.h | 14 +++++++++++++-
>> arch/x86/kernel/acpi/boot.c | 8 ++++++++
>> arch/x86/kernel/crash_dump_64.c | 27 +++++++++++++++++++++++++++
>> drivers/acpi/tables.c | 14 +++++++++++++-
>> drivers/iommu/amd_iommu_init.c | 9 ++++++++-
>> fs/proc/vmcore.c | 36 +++++++++++++++++++++++++++++++-----
>> include/linux/crash_dump.h | 4 ++++
>> kernel/kexec_core.c | 12 ++++++++++++
>> 8 files changed, 116 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/dmi.h b/arch/x86/include/asm/dmi.h
>> index 0ab2ab2..a5663b4 100644
>> --- a/arch/x86/include/asm/dmi.h
>> +++ b/arch/x86/include/asm/dmi.h
>> @@ -7,6 +7,10 @@
>>
>> #include <asm/io.h>
>> #include <asm/setup.h>
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>
> I don't think you need all of the #ifdef stuff throughout this
> patch. Everything should work just fine without it.
>
Thanks Tom. The macro will be deleted from this patch.
Thanks.

Lianbo
>> +#include <linux/crash_dump.h>
>> +#include <linux/mem_encrypt.h>
>> +#endif
>>
>> static __always_inline __init void *dmi_alloc(unsigned len)
>> {
>> @@ -14,7 +18,15 @@ static __always_inline __init void *dmi_alloc(unsigned len)
>> }
>>
>> /* Use early IO mappings for DMI because it's initialized early */
>> -#define dmi_early_remap early_memremap
>> +static __always_inline __init void *dmi_early_remap(resource_size_t
>> + phys_addr, unsigned long size)
>> +{
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>
> Again, no need for the #ifdef here. You should probably audit the
> code for all of these and truly determine if they are really needed.
>
It will be fixed in the patch v2.
Thanks.
>> + if (sme_active() && is_kdump_kernel())
>
> Use of sme_active() here is good since under SEV, this area will be
> encrypted.
>
>> + return early_memremap_decrypted(phys_addr, size);
>> +#endif
>> + return early_memremap(phys_addr, size);
>
> Instead of doing this, maybe it makes more sense to put this logic
> somewhere in the early_memremap() path. Possibly smarten up the
> early_memremap_pgprot_adjust() function with some kdump kernel
> related logic. Not sure it's possible, but would be nice since you
> have this logic in a couple of places.
>
Good idea. If we put this logic into the early_memremap path, there are
many codes that will not have to be modified, for example, dmi, acpi.

Thanks.
>> +}
>> #define dmi_early_unmap early_memunmap
>> #define dmi_remap(_x, _l) memremap(_x, _l, MEMREMAP_WB)
>> #define dmi_unmap(_x) memunmap(_x)
>> diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
>> index 3b20607..354ad66 100644
>> --- a/arch/x86/kernel/acpi/boot.c
>> +++ b/arch/x86/kernel/acpi/boot.c
>> @@ -48,6 +48,10 @@
>> #include <asm/mpspec.h>
>> #include <asm/smp.h>
>> #include <asm/i8259.h>
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +#include <linux/crash_dump.h>
>> +#include <linux/mem_encrypt.h>
>> +#endif
>>
>> #include "sleep.h" /* To include x86_acpi_suspend_lowlevel */
>> static int __initdata acpi_force = 0;
>> @@ -124,6 +128,10 @@ void __init __iomem *__acpi_map_table(unsigned long phys, unsigned long size)
>> if (!phys || !size)
>> return NULL;
>>
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> + if (sme_active() && is_kdump_kernel())
>> + return early_memremap_decrypted(phys, size);
>> +#endif
>
> Same as previous comment(s).
>
>> return early_memremap(phys, size);
>> }
>>
>> diff --git a/arch/x86/kernel/crash_dump_64.c b/arch/x86/kernel/crash_dump_64.c
>> index 4f2e077..2ef67fc 100644
>> --- a/arch/x86/kernel/crash_dump_64.c
>> +++ b/arch/x86/kernel/crash_dump_64.c
>> @@ -48,3 +48,30 @@ ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
>> iounmap(vaddr);
>> return csize;
>> }
>> +
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char *buf,
>> + size_t csize, unsigned long offset, int userbuf)
>> +{
>> + void *vaddr;
>> +
>> + if (!csize)
>> + return 0;
>> +
>> + vaddr = ioremap_encrypted(pfn << PAGE_SHIFT, PAGE_SIZE);
>> + if (!vaddr)
>> + return -ENOMEM;
>> +
>> + if (userbuf) {
>> + if (copy_to_user(buf, vaddr + offset, csize)) {
>> + iounmap(vaddr);
>> + return -EFAULT;
>> + }
>> + } else
>> + memcpy(buf, vaddr + offset, csize);
>> +
>> + set_iounmap_nonlazy();
>> + iounmap(vaddr);
>> + return csize;
>> +}
>> +#endif
>
> This seems exactly the same as copy_oldmem_page() with the difference
> being the type of ioremap done. Might be better to make the code
> after the ioremap's a common piece of code that each of the copy_oldmem
> functions would call.
>
>> diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
>> index 849c4fb..6da9b0c 100644
>> --- a/drivers/acpi/tables.c
>> +++ b/drivers/acpi/tables.c
>> @@ -36,6 +36,10 @@
>> #include <linux/memblock.h>
>> #include <linux/initrd.h>
>> #include "internal.h"
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +#include <linux/crash_dump.h>
>> +#include <linux/mem_encrypt.h>
>> +#endif
>>
>> #ifdef CONFIG_ACPI_CUSTOM_DSDT
>> #include CONFIG_ACPI_CUSTOM_DSDT_FILE
>> @@ -566,7 +570,15 @@ void __init acpi_table_upgrade(void)
>> clen = size;
>> if (clen > MAP_CHUNK_SIZE - slop)
>> clen = MAP_CHUNK_SIZE - slop;
>> - dest_p = early_memremap(dest_addr & PAGE_MASK,
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> + if (sme_active() && is_kdump_kernel())
>> + dest_p = early_memremap_decrypted(
>> + dest_addr & PAGE_MASK,
>> + clen + slop);
>> + else
>> +#endif
>> + dest_p = early_memremap(
>> + dest_addr & PAGE_MASK,
>
> So if the dest_addr (based off of acpi_tables_addr) was added to the e820
> map as an ACPI area (which it will be), then it would be mapped properly
> (in both SME and SEV) without needing the if/then/else.
>

>> clen + slop);
>> memcpy(dest_p + slop, src_p, clen);
>> early_memunmap(dest_p, clen + slop);
>> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
>> index 904c575..8ecbddb 100644
>> --- a/drivers/iommu/amd_iommu_init.c
>> +++ b/drivers/iommu/amd_iommu_init.c
>> @@ -889,11 +889,18 @@ static bool copy_device_table(void)
>> }
>>
>> old_devtb_phys = entry & PAGE_MASK;
>> + if (sme_active() && is_kdump_kernel())
>
> Use mem_encrypt_active() here to cover both SME and SEV.
>
>> + old_devtb_phys = __sme_clr(old_devtb_phys);
>> if (old_devtb_phys >= 0x100000000ULL) {
>> pr_err("The address of old device table is above 4G, not trustworthy!\n");
>> return false;
>> }
>> - old_devtb = memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);
>> + if (sme_active() && is_kdump_kernel())
>> + old_devtb = ioremap_encrypted(old_devtb_phys,
>> + dev_table_size);
>> + else
>> + old_devtb = memremap(old_devtb_phys,
>> + dev_table_size, MEMREMAP_WB);
>
> What happens to the memremap here, does it fall back to the ioremap and
> end up getting mapped decrypted? It would be nice to do the right thing
> under the covers of memremap. Not sure what that would take, but it would
> keep the code nice and clean.
>
>> if (!old_devtb)
>> return false;
>>
>> diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
>> index a45f0af..316e2b0 100644
>> --- a/fs/proc/vmcore.c
>> +++ b/fs/proc/vmcore.c
>> @@ -25,6 +25,10 @@
>> #include <linux/uaccess.h>
>> #include <asm/io.h>
>> #include "internal.h"
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +#include <linux/mem_encrypt.h>
>> +#include <asm/pgtable.h>
>> +#endif
>>
>> /* List representing chunks of contiguous memory areas and their offsets in
>> * vmcore file.
>> @@ -86,7 +90,8 @@ static int pfn_is_ram(unsigned long pfn)
>>
>> /* Reads a page from the oldmem device from given offset. */
>> static ssize_t read_from_oldmem(char *buf, size_t count,
>> - u64 *ppos, int userbuf)
>> + u64 *ppos, int userbuf,
>> + bool encrypted)
>> {
>> unsigned long pfn, offset;
>> size_t nr_bytes;
>> @@ -108,8 +113,15 @@ static ssize_t read_from_oldmem(char *buf, size_t count,
>> if (pfn_is_ram(pfn) == 0)
>> memset(buf, 0, nr_bytes);
>> else {
>> - tmp = copy_oldmem_page(pfn, buf, nr_bytes,
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> + if (encrypted)
>> + tmp = copy_oldmem_page_encrypted(pfn, buf,
>> + nr_bytes, offset, userbuf);
>> + else
>> +#endif
>> + tmp = copy_oldmem_page(pfn, buf, nr_bytes,
>> offset, userbuf);> +
>> if (tmp < 0)
>> return tmp;
>> }
>> @@ -143,7 +155,7 @@ void __weak elfcorehdr_free(unsigned long long addr)
>> */
>> ssize_t __weak elfcorehdr_read(char *buf, size_t count, u64 *ppos)
>> {
>> - return read_from_oldmem(buf, count, ppos, 0);
>> + return read_from_oldmem(buf, count, ppos, 0, false);
>
> For SEV, this will likely be encrypted, so you can probably replace the
> "false" with sev_active() so that under SME it is un-encrypted but under
It's fine to use the sev_active.
Thanks.
> SEV it is encrypted. Where is the elfcorehdr stored? I wonder if it
> could be created as encrypted under SME and then you could actually remove
> the encrypted parameter from read_from_oldmem() and always map encrypted.
> If SME or SEV are active it will be mapped encrypted and if they aren't
> then it is mapped normally.
>
Thank you, Tom.
It looks like very well, but its page is actually encrypted, because we will copy the content
from encrypted area(user space) to unencrypted area(kernel space), which leads to it has
completed the decryption. I remember it is like this. It is hard to distinguish the case when
we load kernel image and initrd by kexec. Maybe it will make the problem more complicated.

I am very glad that you can help to review this patch. The patch will be improved in the
patch v2. Hope you would like to help me review the patches.

Thanks
Lianbo
>> }
>>
>> /*
>> @@ -151,7 +163,11 @@ ssize_t __weak elfcorehdr_read(char *buf, size_t count, u64 *ppos)
>> */
>> ssize_t __weak elfcorehdr_read_notes(char *buf, size_t count, u64 *ppos)
>> {
>> - return read_from_oldmem(buf, count, ppos, 0);
>> + bool flag = false;
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> + flag = sme_active();
>> +#endif
>> + return read_from_oldmem(buf, count, ppos, 0, flag);
>> }
>>
>> /*
>> @@ -161,6 +177,10 @@ int __weak remap_oldmem_pfn_range(struct vm_area_struct *vma,
>> unsigned long from, unsigned long pfn,
>> unsigned long size, pgprot_t prot)
>> {
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> + if (sme_active())
>
> No need for the sme_active() check here, the encryption will be applied,
> for both SME and SEV, if memory encryption is active, otherwise it won't.
>
>> + prot = __pgprot(pgprot_val(prot) | _PAGE_ENC);
>
> prot = pgprot_encrypted(prot);
>
Great. Thanks.
>> +#endif> return remap_pfn_range(vma, from, pfn, size, prot);
>> }
>>
>> @@ -188,6 +208,11 @@ static ssize_t __read_vmcore(char *buffer, size_t buflen, loff_t *fpos,
>> size_t tsz;
>> u64 start;
>> struct vmcore *m = NULL;
>> + bool sme_flag = false;
>> +
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> + sme_flag = sme_active();> +#endif
>
> Probably just want mem_encrypt_active() here to get both SME and SEV
> cases mapped as encrypted.
>
Great. Thanks.
> Thanks,
> Tom
>
>>
>> if (buflen == 0 || *fpos >= vmcore_size)
>> return 0;
>> @@ -235,7 +260,8 @@ static ssize_t __read_vmcore(char *buffer, size_t buflen, loff_t *fpos,
>> m->offset + m->size - *fpos,
>> buflen);
>> start = m->paddr + *fpos - m->offset;
>> - tmp = read_from_oldmem(buffer, tsz, &start, userbuf);
>> + tmp = read_from_oldmem(buffer, tsz, &start,
>> + userbuf, sme_flag);
>> if (tmp < 0)
>> return tmp;
>> buflen -= tsz;
>> diff --git a/include/linux/crash_dump.h b/include/linux/crash_dump.h
>> index f7ac2aa..024ae9e 100644
>> --- a/include/linux/crash_dump.h
>> +++ b/include/linux/crash_dump.h
>> @@ -25,6 +25,10 @@ extern int remap_oldmem_pfn_range(struct vm_area_struct *vma,
>>
>> extern ssize_t copy_oldmem_page(unsigned long, char *, size_t,
>> unsigned long, int);
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +extern ssize_t copy_oldmem_page_encrypted(unsigned long, char *, size_t,
>> + unsigned long, int);
>> +#endif
>> void vmcore_cleanup(void);
>>
>> /* Architecture code defines this if there are other possible ELF
>> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
>> index 20fef1a..3c22a9b 100644
>> --- a/kernel/kexec_core.c
>> +++ b/kernel/kexec_core.c
>> @@ -471,6 +471,16 @@ static struct page *kimage_alloc_crash_control_pages(struct kimage *image,
>> }
>> }
>>
>> + if (pages) {
>> + unsigned int count, i;
>> +
>> + pages->mapping = NULL;
>> + set_page_private(pages, order);
>> + count = 1 << order;
>> + for (i = 0; i < count; i++)
>> + SetPageReserved(pages + i);
>> + arch_kexec_post_alloc_pages(page_address(pages), 1 << order, 0);
>> + }
>> return pages;
>> }
>>
>> @@ -865,6 +875,7 @@ static int kimage_load_crash_segment(struct kimage *image,
>> result = -ENOMEM;
>> goto out;
>> }
>> + arch_kexec_post_alloc_pages(page_address(page), 1, 0);
>> ptr = kmap(page);
>> ptr += maddr & ~PAGE_MASK;
>> mchunk = min_t(size_t, mbytes,
>> @@ -882,6 +893,7 @@ static int kimage_load_crash_segment(struct kimage *image,
>> result = copy_from_user(ptr, buf, uchunk);
>> kexec_flush_icache_page(page);
>> kunmap(page);
>> + arch_kexec_pre_free_pages(page_address(page), 1);
>> if (result) {
>> result = -EFAULT;
>> goto out;
>>

2018-05-17 00:49:42

by Lianbo Jiang

[permalink] [raw]
Subject: Re: [PATCH 2/2] support kdump when AMD secure memory encryption is active

在 2018年05月16日 04:18, Tom Lendacky 写道:
> On 5/14/2018 8:51 PM, Lianbo Jiang wrote:
>> When sme enabled on AMD server, we also need to support kdump. Because
>> the memory is encrypted in the first kernel, we will remap the old memory
>> encrypted to the second kernel(crash kernel), and sme is also enabled in
>> the second kernel, otherwise the old memory encrypted can not be decrypted.
>> Because simply changing the value of a C-bit on a page will not
>> automatically encrypt the existing contents of a page, and any data in the
>> page prior to the C-bit modification will become unintelligible. A page of
>> memory that is marked encrypted will be automatically decrypted when read
>> from DRAM and will be automatically encrypted when written to DRAM.
>>
>> For the kdump, it is necessary to distinguish whether the memory is
>> encrypted. Furthermore, we should also know which part of the memory is
>> encrypted or decrypted. We will appropriately remap the memory according
>> to the specific situation in order to tell cpu how to deal with the data(
>> encrypted or unencrypted). For example, when sme enabled, if the old memory
>> is encrypted, we will remap the old memory in encrypted way, which will
>> automatically decrypt the old memory encrypted when we read those data from
>> the remapping address.
>>
>> ----------------------------------------------
>> | first-kernel | second-kernel | kdump support |
>> | (mem_encrypt=on|off) | (yes|no) |
>> |--------------+---------------+---------------|
>> | on | on | yes |
>> | off | off | yes |
>> | on | off | no |
>> | off | on | no |
>> |______________|_______________|_______________|
>>
>> Signed-off-by: Lianbo Jiang <[email protected]>
>> ---
>> arch/x86/include/asm/dmi.h | 14 +++++++++++++-
>> arch/x86/kernel/acpi/boot.c | 8 ++++++++
>> arch/x86/kernel/crash_dump_64.c | 27 +++++++++++++++++++++++++++
>> drivers/acpi/tables.c | 14 +++++++++++++-
>> drivers/iommu/amd_iommu_init.c | 9 ++++++++-
>> fs/proc/vmcore.c | 36 +++++++++++++++++++++++++++++++-----
>> include/linux/crash_dump.h | 4 ++++
>> kernel/kexec_core.c | 12 ++++++++++++
>> 8 files changed, 116 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/dmi.h b/arch/x86/include/asm/dmi.h
>> index 0ab2ab2..a5663b4 100644
>> --- a/arch/x86/include/asm/dmi.h
>> +++ b/arch/x86/include/asm/dmi.h
>> @@ -7,6 +7,10 @@
>>
>> #include <asm/io.h>
>> #include <asm/setup.h>
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>
> I don't think you need all of the #ifdef stuff throughout this
> patch. Everything should work just fine without it.
>
>> +#include <linux/crash_dump.h>
>> +#include <linux/mem_encrypt.h>
>> +#endif
>>
>> static __always_inline __init void *dmi_alloc(unsigned len)
>> {
>> @@ -14,7 +18,15 @@ static __always_inline __init void *dmi_alloc(unsigned len)
>> }
>>
>> /* Use early IO mappings for DMI because it's initialized early */
>> -#define dmi_early_remap early_memremap
>> +static __always_inline __init void *dmi_early_remap(resource_size_t
>> + phys_addr, unsigned long size)
>> +{
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>
> Again, no need for the #ifdef here. You should probably audit the
> code for all of these and truly determine if they are really needed.
>
>> + if (sme_active() && is_kdump_kernel())
>
> Use of sme_active() here is good since under SEV, this area will be
> encrypted.
>
>> + return early_memremap_decrypted(phys_addr, size);
>> +#endif
>> + return early_memremap(phys_addr, size);
>
> Instead of doing this, maybe it makes more sense to put this logic
> somewhere in the early_memremap() path. Possibly smarten up the
> early_memremap_pgprot_adjust() function with some kdump kernel
> related logic. Not sure it's possible, but would be nice since you
> have this logic in a couple of places.
>
>> +}
>> #define dmi_early_unmap early_memunmap
>> #define dmi_remap(_x, _l) memremap(_x, _l, MEMREMAP_WB)
>> #define dmi_unmap(_x) memunmap(_x)
>> diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
>> index 3b20607..354ad66 100644
>> --- a/arch/x86/kernel/acpi/boot.c
>> +++ b/arch/x86/kernel/acpi/boot.c
>> @@ -48,6 +48,10 @@
>> #include <asm/mpspec.h>
>> #include <asm/smp.h>
>> #include <asm/i8259.h>
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +#include <linux/crash_dump.h>
>> +#include <linux/mem_encrypt.h>
>> +#endif
>>
>> #include "sleep.h" /* To include x86_acpi_suspend_lowlevel */
>> static int __initdata acpi_force = 0;
>> @@ -124,6 +128,10 @@ void __init __iomem *__acpi_map_table(unsigned long phys, unsigned long size)
>> if (!phys || !size)
>> return NULL;
>>
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> + if (sme_active() && is_kdump_kernel())
>> + return early_memremap_decrypted(phys, size);
>> +#endif
>
> Same as previous comment(s).
>
>> return early_memremap(phys, size);
>> }
>>
>> diff --git a/arch/x86/kernel/crash_dump_64.c b/arch/x86/kernel/crash_dump_64.c
>> index 4f2e077..2ef67fc 100644
>> --- a/arch/x86/kernel/crash_dump_64.c
>> +++ b/arch/x86/kernel/crash_dump_64.c
>> @@ -48,3 +48,30 @@ ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
>> iounmap(vaddr);
>> return csize;
>> }
>> +
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char *buf,
>> + size_t csize, unsigned long offset, int userbuf)
>> +{
>> + void *vaddr;
>> +
>> + if (!csize)
>> + return 0;
>> +
>> + vaddr = ioremap_encrypted(pfn << PAGE_SHIFT, PAGE_SIZE);
>> + if (!vaddr)
>> + return -ENOMEM;
>> +
>> + if (userbuf) {
>> + if (copy_to_user(buf, vaddr + offset, csize)) {
>> + iounmap(vaddr);
>> + return -EFAULT;
>> + }
>> + } else
>> + memcpy(buf, vaddr + offset, csize);
>> +
>> + set_iounmap_nonlazy();
>> + iounmap(vaddr);
>> + return csize;
>> +}
>> +#endif
>
> This seems exactly the same as copy_oldmem_page() with the difference
> being the type of ioremap done. Might be better to make the code
> after the ioremap's a common piece of code that each of the copy_oldmem
> functions would call.
>
>> diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
>> index 849c4fb..6da9b0c 100644
>> --- a/drivers/acpi/tables.c
>> +++ b/drivers/acpi/tables.c
>> @@ -36,6 +36,10 @@
>> #include <linux/memblock.h>
>> #include <linux/initrd.h>
>> #include "internal.h"
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +#include <linux/crash_dump.h>
>> +#include <linux/mem_encrypt.h>
>> +#endif
>>
>> #ifdef CONFIG_ACPI_CUSTOM_DSDT
>> #include CONFIG_ACPI_CUSTOM_DSDT_FILE
>> @@ -566,7 +570,15 @@ void __init acpi_table_upgrade(void)
>> clen = size;
>> if (clen > MAP_CHUNK_SIZE - slop)
>> clen = MAP_CHUNK_SIZE - slop;
>> - dest_p = early_memremap(dest_addr & PAGE_MASK,
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> + if (sme_active() && is_kdump_kernel())
>> + dest_p = early_memremap_decrypted(
>> + dest_addr & PAGE_MASK,
>> + clen + slop);
>> + else
>> +#endif
>> + dest_p = early_memremap(
>> + dest_addr & PAGE_MASK,
>
> So if the dest_addr (based off of acpi_tables_addr) was added to the e820
> map as an ACPI area (which it will be), then it would be mapped properly
> (in both SME and SEV) without needing the if/then/else.
>
>> clen + slop);
>> memcpy(dest_p + slop, src_p, clen);
>> early_memunmap(dest_p, clen + slop);
>> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
>> index 904c575..8ecbddb 100644
>> --- a/drivers/iommu/amd_iommu_init.c
>> +++ b/drivers/iommu/amd_iommu_init.c
>> @@ -889,11 +889,18 @@ static bool copy_device_table(void)
>> }
>>
>> old_devtb_phys = entry & PAGE_MASK;
>> + if (sme_active() && is_kdump_kernel())
>
> Use mem_encrypt_active() here to cover both SME and SEV.
>
>> + old_devtb_phys = __sme_clr(old_devtb_phys);
>> if (old_devtb_phys >= 0x100000000ULL) {
>> pr_err("The address of old device table is above 4G, not trustworthy!\n");
>> return false;
>> }
>> - old_devtb = memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);
>> + if (sme_active() && is_kdump_kernel())
>> + old_devtb = ioremap_encrypted(old_devtb_phys,
>> + dev_table_size);
>> + else
>> + old_devtb = memremap(old_devtb_phys,
>> + dev_table_size, MEMREMAP_WB);
>
> What happens to the memremap here, does it fall back to the ioremap and
> end up getting mapped decrypted? It would be nice to do the right thing
> under the covers of memremap. Not sure what that would take, but it would
> keep the code nice and clean.
>
Thank you, Tom.
For this issue, it will copy the old content of device table entries, that is to
say, it will copy the device table from the first kernel to the second kernel in
kdump mode and the content of device table is encrypted, the address includes the
sme_me_mask. For the old memory encrypted, it is suitable to use ioremap_encrypted,
the kdump kernel can not directly access the first kernel's memory.

Lianbo
>> if (!old_devtb)
>> return false;
>>
>> diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
>> index a45f0af..316e2b0 100644
>> --- a/fs/proc/vmcore.c
>> +++ b/fs/proc/vmcore.c
>> @@ -25,6 +25,10 @@
>> #include <linux/uaccess.h>
>> #include <asm/io.h>
>> #include "internal.h"
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +#include <linux/mem_encrypt.h>
>> +#include <asm/pgtable.h>
>> +#endif
>>
>> /* List representing chunks of contiguous memory areas and their offsets in
>> * vmcore file.
>> @@ -86,7 +90,8 @@ static int pfn_is_ram(unsigned long pfn)
>>
>> /* Reads a page from the oldmem device from given offset. */
>> static ssize_t read_from_oldmem(char *buf, size_t count,
>> - u64 *ppos, int userbuf)
>> + u64 *ppos, int userbuf,
>> + bool encrypted)
>> {
>> unsigned long pfn, offset;
>> size_t nr_bytes;
>> @@ -108,8 +113,15 @@ static ssize_t read_from_oldmem(char *buf, size_t count,
>> if (pfn_is_ram(pfn) == 0)
>> memset(buf, 0, nr_bytes);
>> else {
>> - tmp = copy_oldmem_page(pfn, buf, nr_bytes,
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> + if (encrypted)
>> + tmp = copy_oldmem_page_encrypted(pfn, buf,
>> + nr_bytes, offset, userbuf);
>> + else
>> +#endif
>> + tmp = copy_oldmem_page(pfn, buf, nr_bytes,
>> offset, userbuf);> +
>> if (tmp < 0)
>> return tmp;
>> }
>> @@ -143,7 +155,7 @@ void __weak elfcorehdr_free(unsigned long long addr)
>> */
>> ssize_t __weak elfcorehdr_read(char *buf, size_t count, u64 *ppos)
>> {
>> - return read_from_oldmem(buf, count, ppos, 0);
>> + return read_from_oldmem(buf, count, ppos, 0, false);
>
> For SEV, this will likely be encrypted, so you can probably replace the
> "false" with sev_active() so that under SME it is un-encrypted but under
> SEV it is encrypted. Where is the elfcorehdr stored? I wonder if it
> could be created as encrypted under SME and then you could actually remove
> the encrypted parameter from read_from_oldmem() and always map encrypted.
> If SME or SEV are active it will be mapped encrypted and if they aren't
> then it is mapped normally.
>
>> }
>>
>> /*
>> @@ -151,7 +163,11 @@ ssize_t __weak elfcorehdr_read(char *buf, size_t count, u64 *ppos)
>> */
>> ssize_t __weak elfcorehdr_read_notes(char *buf, size_t count, u64 *ppos)
>> {
>> - return read_from_oldmem(buf, count, ppos, 0);
>> + bool flag = false;
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> + flag = sme_active();
>> +#endif
>> + return read_from_oldmem(buf, count, ppos, 0, flag);
>> }
>>
>> /*
>> @@ -161,6 +177,10 @@ int __weak remap_oldmem_pfn_range(struct vm_area_struct *vma,
>> unsigned long from, unsigned long pfn,
>> unsigned long size, pgprot_t prot)
>> {
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> + if (sme_active())
>
> No need for the sme_active() check here, the encryption will be applied,
> for both SME and SEV, if memory encryption is active, otherwise it won't.
>
>> + prot = __pgprot(pgprot_val(prot) | _PAGE_ENC);
>
> prot = pgprot_encrypted(prot);
>
>> +#endif> return remap_pfn_range(vma, from, pfn, size, prot);
>> }
>>
>> @@ -188,6 +208,11 @@ static ssize_t __read_vmcore(char *buffer, size_t buflen, loff_t *fpos,
>> size_t tsz;
>> u64 start;
>> struct vmcore *m = NULL;
>> + bool sme_flag = false;
>> +
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> + sme_flag = sme_active();> +#endif
>
> Probably just want mem_encrypt_active() here to get both SME and SEV
> cases mapped as encrypted.
>
> Thanks,
> Tom
>
>>
>> if (buflen == 0 || *fpos >= vmcore_size)
>> return 0;
>> @@ -235,7 +260,8 @@ static ssize_t __read_vmcore(char *buffer, size_t buflen, loff_t *fpos,
>> m->offset + m->size - *fpos,
>> buflen);
>> start = m->paddr + *fpos - m->offset;
>> - tmp = read_from_oldmem(buffer, tsz, &start, userbuf);
>> + tmp = read_from_oldmem(buffer, tsz, &start,
>> + userbuf, sme_flag);
>> if (tmp < 0)
>> return tmp;
>> buflen -= tsz;
>> diff --git a/include/linux/crash_dump.h b/include/linux/crash_dump.h
>> index f7ac2aa..024ae9e 100644
>> --- a/include/linux/crash_dump.h
>> +++ b/include/linux/crash_dump.h
>> @@ -25,6 +25,10 @@ extern int remap_oldmem_pfn_range(struct vm_area_struct *vma,
>>
>> extern ssize_t copy_oldmem_page(unsigned long, char *, size_t,
>> unsigned long, int);
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +extern ssize_t copy_oldmem_page_encrypted(unsigned long, char *, size_t,
>> + unsigned long, int);
>> +#endif
>> void vmcore_cleanup(void);
>>
>> /* Architecture code defines this if there are other possible ELF
>> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
>> index 20fef1a..3c22a9b 100644
>> --- a/kernel/kexec_core.c
>> +++ b/kernel/kexec_core.c
>> @@ -471,6 +471,16 @@ static struct page *kimage_alloc_crash_control_pages(struct kimage *image,
>> }
>> }
>>
>> + if (pages) {
>> + unsigned int count, i;
>> +
>> + pages->mapping = NULL;
>> + set_page_private(pages, order);
>> + count = 1 << order;
>> + for (i = 0; i < count; i++)
>> + SetPageReserved(pages + i);
>> + arch_kexec_post_alloc_pages(page_address(pages), 1 << order, 0);
>> + }
>> return pages;
>> }
>>
>> @@ -865,6 +875,7 @@ static int kimage_load_crash_segment(struct kimage *image,
>> result = -ENOMEM;
>> goto out;
>> }
>> + arch_kexec_post_alloc_pages(page_address(page), 1, 0);
>> ptr = kmap(page);
>> ptr += maddr & ~PAGE_MASK;
>> mchunk = min_t(size_t, mbytes,
>> @@ -882,6 +893,7 @@ static int kimage_load_crash_segment(struct kimage *image,
>> result = copy_from_user(ptr, buf, uchunk);
>> kexec_flush_icache_page(page);
>> kunmap(page);
>> + arch_kexec_pre_free_pages(page_address(page), 1);
>> if (result) {
>> result = -EFAULT;
>> goto out;
>>

2018-05-17 13:48:02

by Lianbo Jiang

[permalink] [raw]
Subject: Re: [PATCH 0/2] support kdump for AMD secure memory encryption(sme)

在 2018年05月15日 21:31, Tom Lendacky 写道:
> On 5/14/2018 8:51 PM, Lianbo Jiang wrote:
>> It is convenient to remap the old memory encrypted to the second kernel by
>> calling ioremap_encrypted().
>>
>> When sme enabled on AMD server, we also need to support kdump. Because
>> the memory is encrypted in the first kernel, we will remap the old memory
>> encrypted to the second kernel(crash kernel), and sme is also enabled in
>> the second kernel, otherwise the old memory encrypted can not be decrypted.
>> Because simply changing the value of a C-bit on a page will not
>> automatically encrypt the existing contents of a page, and any data in the
>> page prior to the C-bit modification will become unintelligible. A page of
>> memory that is marked encrypted will be automatically decrypted when read
>> from DRAM and will be automatically encrypted when written to DRAM.
>>
>> For the kdump, it is necessary to distinguish whether the memory is
>> encrypted. Furthermore, we should also know which part of the memory is
>> encrypted or decrypted. We will appropriately remap the memory according
>> to the specific situation in order to tell cpu how to deal with the
>> data(encrypted or decrypted). For example, when sme enabled, if the old
>> memory is encrypted, we will remap the old memory in encrypted way, which
>> will automatically decrypt the old memory encrypted when we read those data
>> from the remapping address.
>>
>> ----------------------------------------------
>> | first-kernel | second-kernel | kdump support |
>> | (mem_encrypt=on|off) | (yes|no) |
>> |--------------+---------------+---------------|
>> | on | on | yes |
>> | off | off | yes |
>> | on | off | no |
>> | off | on | no |
>> |______________|_______________|_______________|
>>
>> Test tools:
>> makedumpfile[v1.6.3]: https://github.com/LianboJ/makedumpfile
>> commit e1de103eca8f (A draft for kdump vmcore about AMD SME)
>> Author: Lianbo Jiang <[email protected]>
>> Date: Mon May 14 17:02:40 2018 +0800
>> Note: This patch can only dump vmcore in the case of SME enabled.
>>
>> crash-7.2.1: https://github.com/crash-utility/crash.git
>> commit 1e1bd9c4c1be (Fix for the "bpf" command display on Linux 4.17-rc1)
>> Author: Dave Anderson <[email protected]>
>> Date: Fri May 11 15:54:32 2018 -0400
>>
>> Test environment:
>> HP ProLiant DL385Gen10 AMD EPYC 7251
>> 8-Core Processor
>> 32768 MB memory
>> 600 GB disk space
>>
>> Linux 4.17-rc4:
>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>> commit 75bc37fefc44 ("Linux 4.17-rc4")
>> Author: Linus Torvalds <[email protected]>
>> Date: Sun May 6 16:57:38 2018 -1000
>>
>> Reference:
>> AMD64 Architecture Programmer's Manual
>> https://support.amd.com/TechDocs/24593.pdf
>>
>
> Have you also tested this with SEV? It would be nice if the kdump
> changes you make work with both SME and SEV.
>
Thank you, Tom.
This is a great question, we originally plan to implement SEV in subsequent patches, and
we are also working on SEV at present.
Furthermore, we have another known issue that the system can't jump into the second kernel
when SME is enabled and kaslr is disabled in kdump mode. It seems that is a complex problems,
maybe it is related to kaslr and SME, currently, i'm not sure the root cause, but we will
also plan to fix it. Can you give me any advice about this issue?

Thanks.
Lianbo
> Thanks,
> Tom
>
>> Lianbo Jiang (2):
>> add a function(ioremap_encrypted) for kdump when AMD sme enabled.
>> support kdump when AMD secure memory encryption is active
>>
>> arch/x86/include/asm/dmi.h | 14 +++++++++++++-
>> arch/x86/include/asm/io.h | 2 ++
>> arch/x86/kernel/acpi/boot.c | 8 ++++++++
>> arch/x86/kernel/crash_dump_64.c | 27 +++++++++++++++++++++++++++
>> arch/x86/mm/ioremap.c | 25 +++++++++++++++++--------
>> drivers/acpi/tables.c | 14 +++++++++++++-
>> drivers/iommu/amd_iommu_init.c | 9 ++++++++-
>> fs/proc/vmcore.c | 36 +++++++++++++++++++++++++++++++-----
>> include/linux/crash_dump.h | 4 ++++
>> kernel/kexec_core.c | 12 ++++++++++++
>> 10 files changed, 135 insertions(+), 16 deletions(-)
>>

2018-05-21 03:46:49

by Lianbo Jiang

[permalink] [raw]
Subject: Re: [PATCH 0/2] support kdump for AMD secure memory encryption(sme)

在 2018年05月17日 21:45, lijiang 写道:
> 在 2018年05月15日 21:31, Tom Lendacky 写道:
>> On 5/14/2018 8:51 PM, Lianbo Jiang wrote:
>>> It is convenient to remap the old memory encrypted to the second kernel by
>>> calling ioremap_encrypted().
>>>
>>> When sme enabled on AMD server, we also need to support kdump. Because
>>> the memory is encrypted in the first kernel, we will remap the old memory
>>> encrypted to the second kernel(crash kernel), and sme is also enabled in
>>> the second kernel, otherwise the old memory encrypted can not be decrypted.
>>> Because simply changing the value of a C-bit on a page will not
>>> automatically encrypt the existing contents of a page, and any data in the
>>> page prior to the C-bit modification will become unintelligible. A page of
>>> memory that is marked encrypted will be automatically decrypted when read
>>> from DRAM and will be automatically encrypted when written to DRAM.
>>>
>>> For the kdump, it is necessary to distinguish whether the memory is
>>> encrypted. Furthermore, we should also know which part of the memory is
>>> encrypted or decrypted. We will appropriately remap the memory according
>>> to the specific situation in order to tell cpu how to deal with the
>>> data(encrypted or decrypted). For example, when sme enabled, if the old
>>> memory is encrypted, we will remap the old memory in encrypted way, which
>>> will automatically decrypt the old memory encrypted when we read those data
>>> from the remapping address.
>>>
>>> ----------------------------------------------
>>> | first-kernel | second-kernel | kdump support |
>>> | (mem_encrypt=on|off) | (yes|no) |
>>> |--------------+---------------+---------------|
>>> | on | on | yes |
>>> | off | off | yes |
>>> | on | off | no |
>>> | off | on | no |
>>> |______________|_______________|_______________|
>>>
>>> Test tools:
>>> makedumpfile[v1.6.3]: https://github.com/LianboJ/makedumpfile
>>> commit e1de103eca8f (A draft for kdump vmcore about AMD SME)
>>> Author: Lianbo Jiang <[email protected]>
>>> Date: Mon May 14 17:02:40 2018 +0800
>>> Note: This patch can only dump vmcore in the case of SME enabled.
>>>
>>> crash-7.2.1: https://github.com/crash-utility/crash.git
>>> commit 1e1bd9c4c1be (Fix for the "bpf" command display on Linux 4.17-rc1)
>>> Author: Dave Anderson <[email protected]>
>>> Date: Fri May 11 15:54:32 2018 -0400
>>>
>>> Test environment:
>>> HP ProLiant DL385Gen10 AMD EPYC 7251
>>> 8-Core Processor
>>> 32768 MB memory
>>> 600 GB disk space
>>>
>>> Linux 4.17-rc4:
>>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>>> commit 75bc37fefc44 ("Linux 4.17-rc4")
>>> Author: Linus Torvalds <[email protected]>
>>> Date: Sun May 6 16:57:38 2018 -1000
>>>
>>> Reference:
>>> AMD64 Architecture Programmer's Manual
>>> https://support.amd.com/TechDocs/24593.pdf
>>>
>>
>> Have you also tested this with SEV? It would be nice if the kdump
>> changes you make work with both SME and SEV.
>>
> Thank you, Tom.
> This is a great question, we originally plan to implement SEV in subsequent patches, and
> we are also working on SEV at present.
> Furthermore, we have another known issue that the system can't jump into the second kernel
> when SME is enabled and kaslr is disabled in kdump mode. It seems that is a complex problems,
> maybe it is related to kaslr and SME, currently, i'm not sure the root cause, but we will
> also plan to fix it. Can you give me any advice about this issue?
>
Based on upstream 4.17-rc5, we have recently found a new issue that the system can't boot to
use kexec when SME is enabled and kaslr is disabled. Have you ever run into this issue?
They have similar reproduction scenarios.

CC Tom & Baoquan

Thanks.
Lianbo

> Thanks.
> Lianbo
>> Thanks,
>> Tom
>>
>>> Lianbo Jiang (2):
>>> add a function(ioremap_encrypted) for kdump when AMD sme enabled.
>>> support kdump when AMD secure memory encryption is active
>>>
>>> arch/x86/include/asm/dmi.h | 14 +++++++++++++-
>>> arch/x86/include/asm/io.h | 2 ++
>>> arch/x86/kernel/acpi/boot.c | 8 ++++++++
>>> arch/x86/kernel/crash_dump_64.c | 27 +++++++++++++++++++++++++++
>>> arch/x86/mm/ioremap.c | 25 +++++++++++++++++--------
>>> drivers/acpi/tables.c | 14 +++++++++++++-
>>> drivers/iommu/amd_iommu_init.c | 9 ++++++++-
>>> fs/proc/vmcore.c | 36 +++++++++++++++++++++++++++++++-----
>>> include/linux/crash_dump.h | 4 ++++
>>> kernel/kexec_core.c | 12 ++++++++++++
>>> 10 files changed, 135 insertions(+), 16 deletions(-)
>>>

2018-05-21 13:23:54

by Tom Lendacky

[permalink] [raw]
Subject: Re: [PATCH 0/2] support kdump for AMD secure memory encryption(sme)

On 5/20/2018 10:45 PM, lijiang wrote:
> 在 2018年05月17日 21:45, lijiang 写道:
>> 在 2018年05月15日 21:31, Tom Lendacky 写道:
>>> On 5/14/2018 8:51 PM, Lianbo Jiang wrote:
>>>> It is convenient to remap the old memory encrypted to the second kernel by
>>>> calling ioremap_encrypted().
>>>>
>>>> When sme enabled on AMD server, we also need to support kdump. Because
>>>> the memory is encrypted in the first kernel, we will remap the old memory
>>>> encrypted to the second kernel(crash kernel), and sme is also enabled in
>>>> the second kernel, otherwise the old memory encrypted can not be decrypted.
>>>> Because simply changing the value of a C-bit on a page will not
>>>> automatically encrypt the existing contents of a page, and any data in the
>>>> page prior to the C-bit modification will become unintelligible. A page of
>>>> memory that is marked encrypted will be automatically decrypted when read
>>>> from DRAM and will be automatically encrypted when written to DRAM.
>>>>
>>>> For the kdump, it is necessary to distinguish whether the memory is
>>>> encrypted. Furthermore, we should also know which part of the memory is
>>>> encrypted or decrypted. We will appropriately remap the memory according
>>>> to the specific situation in order to tell cpu how to deal with the
>>>> data(encrypted or decrypted). For example, when sme enabled, if the old
>>>> memory is encrypted, we will remap the old memory in encrypted way, which
>>>> will automatically decrypt the old memory encrypted when we read those data
>>>> from the remapping address.
>>>>
>>>> ----------------------------------------------
>>>> | first-kernel | second-kernel | kdump support |
>>>> | (mem_encrypt=on|off) | (yes|no) |
>>>> |--------------+---------------+---------------|
>>>> | on | on | yes |
>>>> | off | off | yes |
>>>> | on | off | no |
>>>> | off | on | no |
>>>> |______________|_______________|_______________|
>>>>
>>>> Test tools:
>>>> makedumpfile[v1.6.3]: https://github.com/LianboJ/makedumpfile
>>>> commit e1de103eca8f (A draft for kdump vmcore about AMD SME)
>>>> Author: Lianbo Jiang <[email protected]>
>>>> Date: Mon May 14 17:02:40 2018 +0800
>>>> Note: This patch can only dump vmcore in the case of SME enabled.
>>>>
>>>> crash-7.2.1: https://github.com/crash-utility/crash.git
>>>> commit 1e1bd9c4c1be (Fix for the "bpf" command display on Linux 4.17-rc1)
>>>> Author: Dave Anderson <[email protected]>
>>>> Date: Fri May 11 15:54:32 2018 -0400
>>>>
>>>> Test environment:
>>>> HP ProLiant DL385Gen10 AMD EPYC 7251
>>>> 8-Core Processor
>>>> 32768 MB memory
>>>> 600 GB disk space
>>>>
>>>> Linux 4.17-rc4:
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>>>> commit 75bc37fefc44 ("Linux 4.17-rc4")
>>>> Author: Linus Torvalds <[email protected]>
>>>> Date: Sun May 6 16:57:38 2018 -1000
>>>>
>>>> Reference:
>>>> AMD64 Architecture Programmer's Manual
>>>> https://support.amd.com/TechDocs/24593.pdf
>>>>
>>>
>>> Have you also tested this with SEV? It would be nice if the kdump
>>> changes you make work with both SME and SEV.
>>>
>> Thank you, Tom.
>> This is a great question, we originally plan to implement SEV in subsequent patches, and
>> we are also working on SEV at present.
>> Furthermore, we have another known issue that the system can't jump into the second kernel
>> when SME is enabled and kaslr is disabled in kdump mode. It seems that is a complex problems,
>> maybe it is related to kaslr and SME, currently, i'm not sure the root cause, but we will
>> also plan to fix it. Can you give me any advice about this issue?
>>
> Based on upstream 4.17-rc5, we have recently found a new issue that the system can't boot to
> use kexec when SME is enabled and kaslr is disabled. Have you ever run into this issue?
> They have similar reproduction scenarios.
>
> CC Tom & Baoquan

I haven't encountered this issue. Can you send the kernel config that you
are using? And your kernel command line? I'll try your config and see if
I can reproduce the issue.

Just to be clarify, you perform a power-on boot with SME enabled and KASLR
disabled (e.g. mem_encrypt=on nokaslr), correct, and that won't boot?

Thanks,
Tom

>
> Thanks.
> Lianbo
>
>> Thanks.
>> Lianbo
>>> Thanks,
>>> Tom
>>>
>>>> Lianbo Jiang (2):
>>>> add a function(ioremap_encrypted) for kdump when AMD sme enabled.
>>>> support kdump when AMD secure memory encryption is active
>>>>
>>>> arch/x86/include/asm/dmi.h | 14 +++++++++++++-
>>>> arch/x86/include/asm/io.h | 2 ++
>>>> arch/x86/kernel/acpi/boot.c | 8 ++++++++
>>>> arch/x86/kernel/crash_dump_64.c | 27 +++++++++++++++++++++++++++
>>>> arch/x86/mm/ioremap.c | 25 +++++++++++++++++--------
>>>> drivers/acpi/tables.c | 14 +++++++++++++-
>>>> drivers/iommu/amd_iommu_init.c | 9 ++++++++-
>>>> fs/proc/vmcore.c | 36 +++++++++++++++++++++++++++++++-----
>>>> include/linux/crash_dump.h | 4 ++++
>>>> kernel/kexec_core.c | 12 ++++++++++++
>>>> 10 files changed, 135 insertions(+), 16 deletions(-)
>>>>

2018-05-23 02:03:44

by Lianbo Jiang

[permalink] [raw]
Subject: Re: [PATCH 0/2] support kdump for AMD secure memory encryption(sme)

在 2018年05月21日 21:23, Tom Lendacky 写道:
> On 5/20/2018 10:45 PM, lijiang wrote:
>> 在 2018年05月17日 21:45, lijiang 写道:
>>> 在 2018年05月15日 21:31, Tom Lendacky 写道:
>>>> On 5/14/2018 8:51 PM, Lianbo Jiang wrote:
>>>>> It is convenient to remap the old memory encrypted to the second kernel by
>>>>> calling ioremap_encrypted().
>>>>>
>>>>> When sme enabled on AMD server, we also need to support kdump. Because
>>>>> the memory is encrypted in the first kernel, we will remap the old memory
>>>>> encrypted to the second kernel(crash kernel), and sme is also enabled in
>>>>> the second kernel, otherwise the old memory encrypted can not be decrypted.
>>>>> Because simply changing the value of a C-bit on a page will not
>>>>> automatically encrypt the existing contents of a page, and any data in the
>>>>> page prior to the C-bit modification will become unintelligible. A page of
>>>>> memory that is marked encrypted will be automatically decrypted when read
>>>>> from DRAM and will be automatically encrypted when written to DRAM.
>>>>>
>>>>> For the kdump, it is necessary to distinguish whether the memory is
>>>>> encrypted. Furthermore, we should also know which part of the memory is
>>>>> encrypted or decrypted. We will appropriately remap the memory according
>>>>> to the specific situation in order to tell cpu how to deal with the
>>>>> data(encrypted or decrypted). For example, when sme enabled, if the old
>>>>> memory is encrypted, we will remap the old memory in encrypted way, which
>>>>> will automatically decrypt the old memory encrypted when we read those data
>>>>> from the remapping address.
>>>>>
>>>>> ----------------------------------------------
>>>>> | first-kernel | second-kernel | kdump support |
>>>>> | (mem_encrypt=on|off) | (yes|no) |
>>>>> |--------------+---------------+---------------|
>>>>> | on | on | yes |
>>>>> | off | off | yes |
>>>>> | on | off | no |
>>>>> | off | on | no |
>>>>> |______________|_______________|_______________|
>>>>>
>>>>> Test tools:
>>>>> makedumpfile[v1.6.3]: https://github.com/LianboJ/makedumpfile
>>>>> commit e1de103eca8f (A draft for kdump vmcore about AMD SME)
>>>>> Author: Lianbo Jiang <[email protected]>
>>>>> Date: Mon May 14 17:02:40 2018 +0800
>>>>> Note: This patch can only dump vmcore in the case of SME enabled.
>>>>>
>>>>> crash-7.2.1: https://github.com/crash-utility/crash.git
>>>>> commit 1e1bd9c4c1be (Fix for the "bpf" command display on Linux 4.17-rc1)
>>>>> Author: Dave Anderson <[email protected]>
>>>>> Date: Fri May 11 15:54:32 2018 -0400
>>>>>
>>>>> Test environment:
>>>>> HP ProLiant DL385Gen10 AMD EPYC 7251
>>>>> 8-Core Processor
>>>>> 32768 MB memory
>>>>> 600 GB disk space
>>>>>
>>>>> Linux 4.17-rc4:
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>>>>> commit 75bc37fefc44 ("Linux 4.17-rc4")
>>>>> Author: Linus Torvalds <[email protected]>
>>>>> Date: Sun May 6 16:57:38 2018 -1000
>>>>>
>>>>> Reference:
>>>>> AMD64 Architecture Programmer's Manual
>>>>> https://support.amd.com/TechDocs/24593.pdf
>>>>>
>>>>
>>>> Have you also tested this with SEV? It would be nice if the kdump
>>>> changes you make work with both SME and SEV.
>>>>
>>> Thank you, Tom.
>>> This is a great question, we originally plan to implement SEV in subsequent patches, and
>>> we are also working on SEV at present.
>>> Furthermore, we have another known issue that the system can't jump into the second kernel
>>> when SME is enabled and kaslr is disabled in kdump mode. It seems that is a complex problems,
>>> maybe it is related to kaslr and SME, currently, i'm not sure the root cause, but we will
>>> also plan to fix it. Can you give me any advice about this issue?
>>>
>> Based on upstream 4.17-rc5, we have recently found a new issue that the system can't boot to
>> use kexec when SME is enabled and kaslr is disabled. Have you ever run into this issue?
>> They have similar reproduction scenarios.
>>
>> CC Tom & Baoquan
>
> I haven't encountered this issue. Can you send the kernel config that you
> are using? And your kernel command line? I'll try your config and see if
> I can reproduce the issue.
>
Thank you, Tom.
It doesn't have anything special config, and it's always reproduced.
I will send another email to you about kernel config and command line.
> Just to be clarify, you perform a power-on boot with SME enabled and KASLR
> disabled (e.g. mem_encrypt=on nokaslr), correct, and that won't boot?
>
The system can boot in this case.

Thanks.
Lianbo
> Thanks,
> Tom
>
>>
>> Thanks.
>> Lianbo
>>
>>> Thanks.
>>> Lianbo
>>>> Thanks,
>>>> Tom
>>>>
>>>>> Lianbo Jiang (2):
>>>>> add a function(ioremap_encrypted) for kdump when AMD sme enabled.
>>>>> support kdump when AMD secure memory encryption is active
>>>>>
>>>>> arch/x86/include/asm/dmi.h | 14 +++++++++++++-
>>>>> arch/x86/include/asm/io.h | 2 ++
>>>>> arch/x86/kernel/acpi/boot.c | 8 ++++++++
>>>>> arch/x86/kernel/crash_dump_64.c | 27 +++++++++++++++++++++++++++
>>>>> arch/x86/mm/ioremap.c | 25 +++++++++++++++++--------
>>>>> drivers/acpi/tables.c | 14 +++++++++++++-
>>>>> drivers/iommu/amd_iommu_init.c | 9 ++++++++-
>>>>> fs/proc/vmcore.c | 36 +++++++++++++++++++++++++++++++-----
>>>>> include/linux/crash_dump.h | 4 ++++
>>>>> kernel/kexec_core.c | 12 ++++++++++++
>>>>> 10 files changed, 135 insertions(+), 16 deletions(-)
>>>>>