When SME is enabled on an AMD machine, kdump also needs to be supported.
Because the memory is encrypted in the first kernel, the old memory has to
be remapped into the kdump kernel for dumping, and SME must also be enabled
in the kdump kernel; otherwise the old memory cannot be decrypted.
For kdump, it is necessary to distinguish whether the memory is encrypted,
and furthermore which parts of the memory are encrypted or decrypted. The
memory is then remapped appropriately for the specific situation in order
to tell the CPU how to access it.
A page of memory that is marked as encrypted is automatically decrypted
when read from DRAM and automatically encrypted when written to DRAM. If
the old memory is encrypted, we have to remap it with the memory
encryption mask, so that the hardware automatically decrypts the data
when we read it.
For kdump with SME, there are two cases that are not supported:
----------------------------------------------
| first-kernel | second-kernel | kdump support |
| (mem_encrypt=on|off) | (yes|no) |
|--------------+---------------+---------------|
| on | on | yes |
| off | off | yes |
| on | off | no |
| off | on | no |
|______________|_______________|_______________|
1. SME is enabled in the first kernel, but disabled in the kdump kernel.
In this case, because the old memory is encrypted, we cannot decrypt it.
2. SME is disabled in the first kernel, but enabled in the kdump kernel.
It is unnecessary to support this case: the old memory is unencrypted and
can be dumped as usual, so there is no need to enable SME in the kdump
kernel. Moreover, supporting this scenario would increase the complexity
of the code, since we would have to consider how to pass the SME flag
from the first kernel to the kdump kernel so that the kdump kernel knows
whether the old memory is encrypted.
There are two methods to pass the SME flag to the kdump kernel. The first
is to modify the assembly code, but that touches common code and the path
is too long. The second is to use kexec-tools: the first kernel would
export the SME flag via procfs or sysfs, kexec-tools would read the flag
when loading the image and save it in boot_params, and the kdump kernel
could then remap the old memory according to the saved flag. But doing
this is too expensive.
These patches are only for SME kdump; they do not support SEV kdump.
Test tools:
makedumpfile[v1.6.3]: https://github.com/LianboJ/makedumpfile
commit e1de103eca8f (A draft for kdump vmcore about AMD SME)
Note: This patch can only dump vmcore in the case of SME enabled.
crash-7.2.3: https://github.com/crash-utility/crash.git
commit 001f77a05585 (Fix for Linux 4.19-rc1 and later kernels that contain
kernel commit 7290d58095712a89f845e1bca05334796dd49ed2)
kexec-tools-2.0.17: git://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
commit b9de21ef51a7 (kexec: fix for "Unhandled rela relocation: R_X86_64_PLT32" error)
Note:
Before you load the kernel and initramfs for kdump, this patch
(http://lists.infradead.org/pipermail/kexec/2018-September/021460.html)
must be merged into kexec-tools; the kdump kernel will then work well. This
is needed because one patch was removed from the series after v6
(x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to
adjust encryption mask).
Test environment:
HP ProLiant DL385Gen10 AMD EPYC 7251
8-Core Processor
32768 MB memory
600 GB disk space
Linux 4.19-rc2:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
commit 57361846b52bc686112da6ca5368d11210796804
Reference:
AMD64 Architecture Programmer's Manual
https://support.amd.com/TechDocs/24593.pdf
Changes since v6:
1. Removed one patch that was present in v6
(x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask).
Dave Young suggested that this patch can be dropped and kexec-tools fixed instead.
Reference: http://lists.infradead.org/pipermail/kexec/2018-September/021460.html
2. Updated the patch log.
Some known issues:
1. About SME:
The upstream kernel hangs on the HP machine (DL385Gen10 AMD EPYC 7251) when
we execute the kexec commands as follows:
# kexec -l /boot/vmlinuz-4.19.0-rc2+ --initrd=/boot/initramfs-4.19.0-rc2+.img --command-line="root=/dev/mapper/rhel_hp--dl385g10--03-root ro mem_encrypt=on rd.lvm.lv=rhel_hp-dl385g10-03/root rd.lvm.lv=rhel_hp-dl385g10-03/swap console=ttyS0,115200n81 LANG=en_US.UTF-8 earlyprintk=serial debug nokaslr"
# kexec -e (or reboot)
But this issue cannot be reproduced on the Speedway machine, and it is
unrelated to the posted patches.
The kernel log:
[ 1248.932239] kexec_core: Starting new kernel
early console in extract_kernel
input_data: 0x000000087e91c3b4
input_len: 0x000000000067fcbd
output: 0x000000087d400000
output_len: 0x0000000001b6fa90
kernel_total_size: 0x0000000001a9d000
trampoline_32bit: 0x0000000000099000
Decompressing Linux...
Parsing ELF... [---Here the system will hang]
Lianbo Jiang (4):
x86/ioremap: add a function ioremap_encrypted() to remap kdump old
memory
kexec: allocate unencrypted control pages for kdump in case SME is
enabled
amd_iommu: remap the device table of IOMMU with the memory encryption
mask for kdump
kdump/vmcore: support encrypted old memory with SME enabled
arch/x86/include/asm/io.h | 3 ++
arch/x86/kernel/Makefile | 1 +
arch/x86/kernel/crash_dump_encrypt.c | 53 ++++++++++++++++++++++++++++
arch/x86/mm/ioremap.c | 25 ++++++++-----
drivers/iommu/amd_iommu_init.c | 14 ++++++--
fs/proc/vmcore.c | 21 +++++++----
include/linux/crash_dump.h | 12 +++++++
kernel/kexec_core.c | 12 +++++++
8 files changed, 125 insertions(+), 16 deletions(-)
create mode 100644 arch/x86/kernel/crash_dump_encrypt.c
--
2.17.1
When SME is enabled in the first kernel, we allocate unencrypted pages
for kdump so that the kdump kernel can be booted just like with kexec.
Signed-off-by: Lianbo Jiang <[email protected]>
---
kernel/kexec_core.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 23a83a4da38a..e7efcd1a977b 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -471,6 +471,16 @@ static struct page *kimage_alloc_crash_control_pages(struct kimage *image,
}
}
+ if (pages) {
+ /*
+	 * For kdump, ensure that these pages are
+	 * unencrypted if SME is enabled.
+	 * Note that a matching arch_kexec_pre_free_pages()
+	 * call is not needed for these pages, which keeps
+	 * the code simpler.
+ */
+ arch_kexec_post_alloc_pages(page_address(pages), 1 << order, 0);
+ }
return pages;
}
@@ -867,6 +877,7 @@ static int kimage_load_crash_segment(struct kimage *image,
result = -ENOMEM;
goto out;
}
+ arch_kexec_post_alloc_pages(page_address(page), 1, 0);
ptr = kmap(page);
ptr += maddr & ~PAGE_MASK;
mchunk = min_t(size_t, mbytes,
@@ -884,6 +895,7 @@ static int kimage_load_crash_segment(struct kimage *image,
result = copy_from_user(ptr, buf, uchunk);
kexec_flush_icache_page(page);
kunmap(page);
+ arch_kexec_pre_free_pages(page_address(page), 1);
if (result) {
result = -EFAULT;
goto out;
--
2.17.1
When SME is enabled on an AMD machine, the memory is encrypted in the
first kernel. In this case, SME also needs to be enabled in the kdump
kernel, and the old memory has to be remapped with the memory encryption
mask.
Signed-off-by: Lianbo Jiang <[email protected]>
---
arch/x86/include/asm/io.h | 3 +++
arch/x86/mm/ioremap.c | 25 +++++++++++++++++--------
2 files changed, 20 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index 6de64840dd22..f8795f9581c7 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -192,6 +192,9 @@ extern void __iomem *ioremap_cache(resource_size_t offset, unsigned long size);
#define ioremap_cache ioremap_cache
extern void __iomem *ioremap_prot(resource_size_t offset, unsigned long size, unsigned long prot_val);
#define ioremap_prot ioremap_prot
+extern void __iomem *ioremap_encrypted(resource_size_t phys_addr,
+ unsigned long size);
+#define ioremap_encrypted ioremap_encrypted
/**
* ioremap - map bus memory into CPU space
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index c63a545ec199..e01e6c695add 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -24,6 +24,7 @@
#include <asm/pgalloc.h>
#include <asm/pat.h>
#include <asm/setup.h>
+#include <linux/crash_dump.h>
#include "physaddr.h"
@@ -131,7 +132,8 @@ static void __ioremap_check_mem(resource_size_t addr, unsigned long size,
* caller shouldn't need to know that small detail.
*/
static void __iomem *__ioremap_caller(resource_size_t phys_addr,
- unsigned long size, enum page_cache_mode pcm, void *caller)
+ unsigned long size, enum page_cache_mode pcm,
+ void *caller, bool encrypted)
{
unsigned long offset, vaddr;
resource_size_t last_addr;
@@ -199,7 +201,7 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
* resulting mapping.
*/
prot = PAGE_KERNEL_IO;
- if (sev_active() && mem_flags.desc_other)
+ if ((sev_active() && mem_flags.desc_other) || encrypted)
prot = pgprot_encrypted(prot);
switch (pcm) {
@@ -291,7 +293,7 @@ void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC_MINUS;
return __ioremap_caller(phys_addr, size, pcm,
- __builtin_return_address(0));
+ __builtin_return_address(0), false);
}
EXPORT_SYMBOL(ioremap_nocache);
@@ -324,7 +326,7 @@ void __iomem *ioremap_uc(resource_size_t phys_addr, unsigned long size)
enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC;
return __ioremap_caller(phys_addr, size, pcm,
- __builtin_return_address(0));
+ __builtin_return_address(0), false);
}
EXPORT_SYMBOL_GPL(ioremap_uc);
@@ -341,7 +343,7 @@ EXPORT_SYMBOL_GPL(ioremap_uc);
void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
{
return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WC,
- __builtin_return_address(0));
+ __builtin_return_address(0), false);
}
EXPORT_SYMBOL(ioremap_wc);
@@ -358,14 +360,21 @@ EXPORT_SYMBOL(ioremap_wc);
void __iomem *ioremap_wt(resource_size_t phys_addr, unsigned long size)
{
return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WT,
- __builtin_return_address(0));
+ __builtin_return_address(0), false);
}
EXPORT_SYMBOL(ioremap_wt);
+void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned long size)
+{
+ return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
+ __builtin_return_address(0), true);
+}
+EXPORT_SYMBOL(ioremap_encrypted);
+
void __iomem *ioremap_cache(resource_size_t phys_addr, unsigned long size)
{
return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
- __builtin_return_address(0));
+ __builtin_return_address(0), false);
}
EXPORT_SYMBOL(ioremap_cache);
@@ -374,7 +383,7 @@ void __iomem *ioremap_prot(resource_size_t phys_addr, unsigned long size,
{
return __ioremap_caller(phys_addr, size,
pgprot2cachemode(__pgprot(prot_val)),
- __builtin_return_address(0));
+ __builtin_return_address(0), false);
}
EXPORT_SYMBOL(ioremap_prot);
--
2.17.1
The kdump kernel copies the IOMMU device table from the old device
table, which is encrypted when SME is enabled in the first kernel. So we
have to remap the old device table with the memory encryption mask.
Signed-off-by: Lianbo Jiang <[email protected]>
---
drivers/iommu/amd_iommu_init.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 84b3e4445d46..3931c7de7c69 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -902,12 +902,22 @@ static bool copy_device_table(void)
}
}
- old_devtb_phys = entry & PAGE_MASK;
+ /*
+	 * When SME is enabled in the first kernel, the entry includes
+	 * the memory encryption mask (sme_me_mask); clear it to obtain
+	 * the true physical address in the kdump kernel.
+ */
+ old_devtb_phys = __sme_clr(entry) & PAGE_MASK;
+
if (old_devtb_phys >= 0x100000000ULL) {
pr_err("The address of old device table is above 4G, not trustworthy!\n");
return false;
}
- old_devtb = memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);
+ old_devtb = (sme_active() && is_kdump_kernel())
+ ? (__force void *)ioremap_encrypted(old_devtb_phys,
+ dev_table_size)
+ : memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);
+
if (!old_devtb)
return false;
--
2.17.1
In the kdump kernel, we need to dump the old memory into the vmcore
file. If SME is enabled in the first kernel, we have to remap the old
memory with the memory encryption mask, so that it is automatically
decrypted when read from DRAM.
For SME kdump, there are two cases that are not supported:
----------------------------------------------
| first-kernel | second-kernel | kdump support |
| (mem_encrypt=on|off) | (yes|no) |
|--------------+---------------+---------------|
| on | on | yes |
| off | off | yes |
| on | off | no |
| off | on | no |
|______________|_______________|_______________|
1. SME is enabled in the first kernel, but disabled in the kdump kernel.
In this case, because the old memory is encrypted, we cannot decrypt it.
2. SME is disabled in the first kernel, but enabled in the kdump kernel.
On the one hand, the old memory is unencrypted and can be dumped as
usual, so there is no need to enable SME in the kdump kernel; on the
other hand, supporting it would increase the complexity of the code,
since we would have to consider how to pass the SME flag from the first
kernel to the kdump kernel, which is really too expensive.
These patches are only for SME kdump; they do not support SEV kdump.
Signed-off-by: Lianbo Jiang <[email protected]>
---
arch/x86/kernel/Makefile | 1 +
arch/x86/kernel/crash_dump_encrypt.c | 53 ++++++++++++++++++++++++++++
fs/proc/vmcore.c | 21 +++++++----
include/linux/crash_dump.h | 12 +++++++
4 files changed, 81 insertions(+), 6 deletions(-)
create mode 100644 arch/x86/kernel/crash_dump_encrypt.c
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 8824d01c0c35..dfbeae0e35ce 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -97,6 +97,7 @@ obj-$(CONFIG_KEXEC_CORE) += machine_kexec_$(BITS).o
obj-$(CONFIG_KEXEC_CORE) += relocate_kernel_$(BITS).o crash.o
obj-$(CONFIG_KEXEC_FILE) += kexec-bzimage64.o
obj-$(CONFIG_CRASH_DUMP) += crash_dump_$(BITS).o
+obj-$(CONFIG_AMD_MEM_ENCRYPT) += crash_dump_encrypt.o
obj-y += kprobes/
obj-$(CONFIG_MODULES) += module.o
obj-$(CONFIG_DOUBLEFAULT) += doublefault.o
diff --git a/arch/x86/kernel/crash_dump_encrypt.c b/arch/x86/kernel/crash_dump_encrypt.c
new file mode 100644
index 000000000000..e1b1a577f197
--- /dev/null
+++ b/arch/x86/kernel/crash_dump_encrypt.c
@@ -0,0 +1,53 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Memory preserving reboot related code.
+ *
+ * Created by: Lianbo Jiang ([email protected])
+ * Copyright (C) RedHat Corporation, 2018. All rights reserved
+ */
+
+#include <linux/errno.h>
+#include <linux/crash_dump.h>
+#include <linux/uaccess.h>
+#include <linux/io.h>
+
+/**
+ * copy_oldmem_page_encrypted - copy one page from "oldmem encrypted"
+ * @pfn: page frame number to be copied
+ * @buf: target memory address for the copy; this can be in kernel address
+ * space or user address space (see @userbuf)
+ * @csize: number of bytes to copy
+ * @offset: offset in bytes into the page (based on pfn) to begin the copy
+ * @userbuf: if set, @buf is in user address space, use copy_to_user(),
+ * otherwise @buf is in kernel address space, use memcpy().
+ *
+ * Copy a page from "oldmem encrypted". For this page, there is no pte
+ * mapped in the current kernel. We stitch up a pte, similar to
+ * kmap_atomic.
+ */
+
+ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char *buf,
+ size_t csize, unsigned long offset, int userbuf)
+{
+ void *vaddr;
+
+ if (!csize)
+ return 0;
+
+ vaddr = (__force void *)ioremap_encrypted(pfn << PAGE_SHIFT,
+ PAGE_SIZE);
+ if (!vaddr)
+ return -ENOMEM;
+
+ if (userbuf) {
+ if (copy_to_user((void __user *)buf, vaddr + offset, csize)) {
+ iounmap((void __iomem *)vaddr);
+ return -EFAULT;
+ }
+ } else
+ memcpy(buf, vaddr + offset, csize);
+
+ set_iounmap_nonlazy();
+ iounmap((void __iomem *)vaddr);
+ return csize;
+}
diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
index cbde728f8ac6..3065c8bada6a 100644
--- a/fs/proc/vmcore.c
+++ b/fs/proc/vmcore.c
@@ -25,6 +25,9 @@
#include <linux/pagemap.h>
#include <linux/uaccess.h>
#include <asm/io.h>
+#include <linux/io.h>
+#include <linux/mem_encrypt.h>
+#include <asm/pgtable.h>
#include "internal.h"
/* List representing chunks of contiguous memory areas and their offsets in
@@ -98,7 +101,8 @@ static int pfn_is_ram(unsigned long pfn)
/* Reads a page from the oldmem device from given offset. */
static ssize_t read_from_oldmem(char *buf, size_t count,
- u64 *ppos, int userbuf)
+ u64 *ppos, int userbuf,
+ bool encrypted)
{
unsigned long pfn, offset;
size_t nr_bytes;
@@ -120,8 +124,11 @@ static ssize_t read_from_oldmem(char *buf, size_t count,
if (pfn_is_ram(pfn) == 0)
memset(buf, 0, nr_bytes);
else {
- tmp = copy_oldmem_page(pfn, buf, nr_bytes,
- offset, userbuf);
+ tmp = encrypted ? copy_oldmem_page_encrypted(pfn,
+ buf, nr_bytes, offset, userbuf)
+ : copy_oldmem_page(pfn, buf, nr_bytes,
+ offset, userbuf);
+
if (tmp < 0)
return tmp;
}
@@ -155,7 +162,7 @@ void __weak elfcorehdr_free(unsigned long long addr)
*/
ssize_t __weak elfcorehdr_read(char *buf, size_t count, u64 *ppos)
{
- return read_from_oldmem(buf, count, ppos, 0);
+ return read_from_oldmem(buf, count, ppos, 0, false);
}
/*
@@ -163,7 +170,7 @@ ssize_t __weak elfcorehdr_read(char *buf, size_t count, u64 *ppos)
*/
ssize_t __weak elfcorehdr_read_notes(char *buf, size_t count, u64 *ppos)
{
- return read_from_oldmem(buf, count, ppos, 0);
+ return read_from_oldmem(buf, count, ppos, 0, sme_active());
}
/*
@@ -173,6 +180,7 @@ int __weak remap_oldmem_pfn_range(struct vm_area_struct *vma,
unsigned long from, unsigned long pfn,
unsigned long size, pgprot_t prot)
{
+ prot = pgprot_encrypted(prot);
return remap_pfn_range(vma, from, pfn, size, prot);
}
@@ -351,7 +359,8 @@ static ssize_t __read_vmcore(char *buffer, size_t buflen, loff_t *fpos,
m->offset + m->size - *fpos,
buflen);
start = m->paddr + *fpos - m->offset;
- tmp = read_from_oldmem(buffer, tsz, &start, userbuf);
+ tmp = read_from_oldmem(buffer, tsz, &start, userbuf,
+ sme_active());
if (tmp < 0)
return tmp;
buflen -= tsz;
diff --git a/include/linux/crash_dump.h b/include/linux/crash_dump.h
index 3e4ba9d753c8..a7e7be8b0502 100644
--- a/include/linux/crash_dump.h
+++ b/include/linux/crash_dump.h
@@ -26,6 +26,18 @@ extern int remap_oldmem_pfn_range(struct vm_area_struct *vma,
extern ssize_t copy_oldmem_page(unsigned long, char *, size_t,
unsigned long, int);
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+extern ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char *buf,
+ size_t csize, unsigned long offset,
+ int userbuf);
+#else
+static inline
+ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char *buf, size_t csize,
+ unsigned long offset, int userbuf)
+{
+ return 0;
+}
+#endif
void vmcore_cleanup(void);
/* Architecture code defines this if there are other possible ELF
--
2.17.1
On Fri, Sep 07, 2018 at 04:18:04PM +0800, Lianbo Jiang wrote:
> In kdump kernel, it will copy the device table of IOMMU from the old device
> table, which is encrypted when SME is enabled in the first kernel. So we
> have to remap the old device table with the memory encryption mask.
>
> Signed-off-by: Lianbo Jiang <[email protected]>
Please change the subject to:
iommu/amd: Remap the device table of IOMMU with the memory encryption mask for kdump
With that you can add my:
Acked-by: Joerg Roedel <[email protected]>
On 09/07/2018 03:18 AM, Lianbo Jiang wrote:
> When SME is enabled on AMD machine, we also need to support kdump. Because
> the memory is encrypted in the first kernel, we will remap the old memory
> to the kdump kernel for dumping data, and SME is also enabled in the kdump
> kernel, otherwise the old memory can not be decrypted.
>
> For the kdump, it is necessary to distinguish whether the memory is encrypted.
> Furthermore, we should also know which part of the memory is encrypted or
> decrypted. We will appropriately remap the memory according to the specific
> situation in order to tell cpu how to access the memory.
>
> As we know, a page of memory that is marked as encrypted, which will be
> automatically decrypted when read from DRAM, and will also be automatically
> encrypted when written to DRAM. If the old memory is encrypted, we have to
> remap the old memory with the memory encryption mask, which will automatically
> decrypt the old memory when we read those data.
>
> For kdump(SME), there are two cases that doesn't support:
>
> ----------------------------------------------
> | first-kernel | second-kernel | kdump support |
> | (mem_encrypt=on|off) | (yes|no) |
> |--------------+---------------+---------------|
> | on | on | yes |
> | off | off | yes |
> | on | off | no |
> | off | on | no |
> |______________|_______________|_______________|
>
> 1. SME is enabled in the first kernel, but SME is disabled in kdump kernel
> In this case, because the old memory is encrypted, we can't decrypt the
> old memory.
>
> 2. SME is disabled in the first kernel, but SME is enabled in kdump kernel
> It is unnecessary to support in this case, because the old memory is
> unencrypted, the old memory can be dumped as usual, we don't need to enable
> SME in kdump kernel. Another, If we must support the scenario, it will
> increase the complexity of the code, we will have to consider how to pass
> the SME flag from the first kernel to the kdump kernel, in order to let the
> kdump kernel know that whether the old memory is encrypted.
>
> There are two methods to pass the SME flag to the kdump kernel. The first
> method is to modify the assembly code, which includes some common code and
> the path is too long. The second method is to use kexec tool, which could
> require the SME flag to be exported in the first kernel by "proc" or "sysfs",
> kexec tools will read the SME flag from "proc" or "sysfs" when we use kexec
> tools to load image, subsequently the SME flag will be saved in boot_params,
> we can properly remap the old memory according to the previously saved SME
> flag. But it is too expensive to do this.
>
> This patches are only for SME kdump, the patches don't support SEV kdump.
Reviewed-by: Tom Lendacky <[email protected]>
Just curious, are you planning to add SEV kdump support after this?
Also, a question below...
>
> Test tools:
> makedumpfile[v1.6.3]: https://github.com/LianboJ/makedumpfile
> commit e1de103eca8f (A draft for kdump vmcore about AMD SME)
> Note: This patch can only dump vmcore in the case of SME enabled.
>
> crash-7.2.3: https://github.com/crash-utility/crash.git
> commit 001f77a05585 (Fix for Linux 4.19-rc1 and later kernels that contain
> kernel commit7290d58095712a89f845e1bca05334796dd49ed2)
>
> kexec-tools-2.0.17: git://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
> commit b9de21ef51a7 (kexec: fix for "Unhandled rela relocation: R_X86_64_PLT32" error)
> Note:
> Before you load the kernel and initramfs for kdump, this patch(http://lists.infradead.org/pipermail/kexec/2018-September/021460.html)
> must be merged to kexec-tools, and then the kdump kernel will work well. Because there
> is a patch which is removed based on v6(x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust()
> to adjust encryption mask).
>
> Test environment:
> HP ProLiant DL385Gen10 AMD EPYC 7251
> 8-Core Processor
> 32768 MB memory
> 600 GB disk space
>
> Linux 4.19-rc2:
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> commit 57361846b52bc686112da6ca5368d11210796804
>
> Reference:
> AMD64 Architecture Programmer's Manual
> https://support.amd.com/TechDocs/24593.pdf
>
> Changes since v6:
> 1. There is a patch which is removed based on v6.
> (x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask)
> Dave Young suggests that this patch can be removed and fix the kexec-tools.
> Reference: http://lists.infradead.org/pipermail/kexec/2018-September/021460.html)
> 2. Update the patch log.
>
> Some known issues:
> 1. about SME
> Upstream kernel will hang on HP machine(DL385Gen10 AMD EPYC 7251) when
> we execute the kexec command as follow:
>
> # kexec -l /boot/vmlinuz-4.19.0-rc2+ --initrd=/boot/initramfs-4.19.0-rc2+.img --command-line="root=/dev/mapper/rhel_hp--dl385g10--03-root ro mem_encrypt=on rd.lvm.lv=rhel_hp-dl385g10-03/root rd.lvm.lv=rhel_hp-dl385g10-03/swap console=ttyS0,115200n81 LANG=en_US.UTF-8 earlyprintk=serial debug nokaslr"
> # kexec -e (or reboot)
>
> But this issue can not be reproduced on speedway machine, and this issue
> is irrelevant to my posted patches.
>
> The kernel log:
> [ 1248.932239] kexec_core: Starting new kernel
> early console in extract_kernel
> input_data: 0x000000087e91c3b4
> input_len: 0x000000000067fcbd
> output: 0x000000087d400000
> output_len: 0x0000000001b6fa90
> kernel_total_size: 0x0000000001a9d000
> trampoline_32bit: 0x0000000000099000
>
> Decompressing Linux...
> Parsing ELF... [---Here the system will hang]
Do you know the reason for the hang? It looks like it is hanging in
parse_elf(). Can you add some debug to parse_elf() to see if the
value of ehdr.e_phnum is valid (maybe it is not a valid value and so
the loop takes forever)?
Thanks,
Tom
>
>
> Lianbo Jiang (4):
> x86/ioremap: add a function ioremap_encrypted() to remap kdump old
> memory
> kexec: allocate unencrypted control pages for kdump in case SME is
> enabled
> amd_iommu: remap the device table of IOMMU with the memory encryption
> mask for kdump
> kdump/vmcore: support encrypted old memory with SME enabled
>
> arch/x86/include/asm/io.h | 3 ++
> arch/x86/kernel/Makefile | 1 +
> arch/x86/kernel/crash_dump_encrypt.c | 53 ++++++++++++++++++++++++++++
> arch/x86/mm/ioremap.c | 25 ++++++++-----
> drivers/iommu/amd_iommu_init.c | 14 ++++++--
> fs/proc/vmcore.c | 21 +++++++----
> include/linux/crash_dump.h | 12 +++++++
> kernel/kexec_core.c | 12 +++++++
> 8 files changed, 125 insertions(+), 16 deletions(-)
> create mode 100644 arch/x86/kernel/crash_dump_encrypt.c
>
Hi Lianbo,
On 09/07/18 at 04:18pm, Lianbo Jiang wrote:
> When SME is enabled on AMD machine, the memory is encrypted in the first
> kernel. In this case, SME also needs to be enabled in kdump kernel, and
> we have to remap the old memory with the memory encryption mask.
This patch series looks good to me. One thing: in your v5 post, Boris
reviewed it and complained about the git log; we worked together to make a
document to explain it, so I am wondering why you didn't rearrange that
into the log of this patch. Other than this, all looks fine.
http://lkml.kernel.org/r/[email protected]
>
> Signed-off-by: Lianbo Jiang <[email protected]>
> ---
> arch/x86/include/asm/io.h | 3 +++
> arch/x86/mm/ioremap.c | 25 +++++++++++++++++--------
> 2 files changed, 20 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
> index 6de64840dd22..f8795f9581c7 100644
> --- a/arch/x86/include/asm/io.h
> +++ b/arch/x86/include/asm/io.h
> @@ -192,6 +192,9 @@ extern void __iomem *ioremap_cache(resource_size_t offset, unsigned long size);
> #define ioremap_cache ioremap_cache
> extern void __iomem *ioremap_prot(resource_size_t offset, unsigned long size, unsigned long prot_val);
> #define ioremap_prot ioremap_prot
> +extern void __iomem *ioremap_encrypted(resource_size_t phys_addr,
> + unsigned long size);
> +#define ioremap_encrypted ioremap_encrypted
>
> /**
> * ioremap - map bus memory into CPU space
> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
> index c63a545ec199..e01e6c695add 100644
> --- a/arch/x86/mm/ioremap.c
> +++ b/arch/x86/mm/ioremap.c
> @@ -24,6 +24,7 @@
> #include <asm/pgalloc.h>
> #include <asm/pat.h>
> #include <asm/setup.h>
> +#include <linux/crash_dump.h>
>
> #include "physaddr.h"
>
> @@ -131,7 +132,8 @@ static void __ioremap_check_mem(resource_size_t addr, unsigned long size,
> * caller shouldn't need to know that small detail.
> */
> static void __iomem *__ioremap_caller(resource_size_t phys_addr,
> - unsigned long size, enum page_cache_mode pcm, void *caller)
> + unsigned long size, enum page_cache_mode pcm,
> + void *caller, bool encrypted)
> {
> unsigned long offset, vaddr;
> resource_size_t last_addr;
> @@ -199,7 +201,7 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
> * resulting mapping.
> */
> prot = PAGE_KERNEL_IO;
> - if (sev_active() && mem_flags.desc_other)
> + if ((sev_active() && mem_flags.desc_other) || encrypted)
> prot = pgprot_encrypted(prot);
>
> switch (pcm) {
> @@ -291,7 +293,7 @@ void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
> enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC_MINUS;
>
> return __ioremap_caller(phys_addr, size, pcm,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
> }
> EXPORT_SYMBOL(ioremap_nocache);
>
> @@ -324,7 +326,7 @@ void __iomem *ioremap_uc(resource_size_t phys_addr, unsigned long size)
> enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC;
>
> return __ioremap_caller(phys_addr, size, pcm,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
> }
> EXPORT_SYMBOL_GPL(ioremap_uc);
>
> @@ -341,7 +343,7 @@ EXPORT_SYMBOL_GPL(ioremap_uc);
> void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
> {
> return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WC,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
> }
> EXPORT_SYMBOL(ioremap_wc);
>
> @@ -358,14 +360,21 @@ EXPORT_SYMBOL(ioremap_wc);
> void __iomem *ioremap_wt(resource_size_t phys_addr, unsigned long size)
> {
> return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WT,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
> }
> EXPORT_SYMBOL(ioremap_wt);
>
> +void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned long size)
> +{
> + return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
> + __builtin_return_address(0), true);
> +}
> +EXPORT_SYMBOL(ioremap_encrypted);
> +
> void __iomem *ioremap_cache(resource_size_t phys_addr, unsigned long size)
> {
> return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
> }
> EXPORT_SYMBOL(ioremap_cache);
>
> @@ -374,7 +383,7 @@ void __iomem *ioremap_prot(resource_size_t phys_addr, unsigned long size,
> {
> return __ioremap_caller(phys_addr, size,
> pgprot2cachemode(__pgprot(prot_val)),
> - __builtin_return_address(0));
> + __builtin_return_address(0), false);
> }
> EXPORT_SYMBOL(ioremap_prot);
>
> --
> 2.17.1
>
Also cc'd the maintainer and other reviewers. Thanks.
On 2018-09-26 13:52, lijiang wrote:
> On 2018-09-26 03:10, Lendacky, Thomas wrote:
>> On 09/07/2018 03:18 AM, Lianbo Jiang wrote:
>>> When SME is enabled on AMD machine, we also need to support kdump. Because
>>> the memory is encrypted in the first kernel, we will remap the old memory
>>> to the kdump kernel for dumping data, and SME is also enabled in the kdump
>>> kernel, otherwise the old memory can not be decrypted.
>>>
>>> For the kdump, it is necessary to distinguish whether the memory is encrypted.
>>> Furthermore, we should also know which part of the memory is encrypted or
>>> decrypted. We will appropriately remap the memory according to the specific
>>> situation in order to tell cpu how to access the memory.
>>>
>>> As we know, a page of memory that is marked as encrypted, which will be
>>> automatically decrypted when read from DRAM, and will also be automatically
>>> encrypted when written to DRAM. If the old memory is encrypted, we have to
>>> remap the old memory with the memory encryption mask, which will automatically
>>> decrypt the old memory when we read those data.
>>>
>>> For kdump(SME), there are two cases that are not supported:
>>>
>>>  -----------------------------------------------
>>> | first-kernel  | second-kernel | kdump support |
>>> | (mem_encrypt) | (mem_encrypt) |   (yes|no)    |
>>> |---------------+---------------+---------------|
>>> |      on       |      on       |      yes      |
>>> |      off      |      off      |      yes      |
>>> |      on       |      off      |      no       |
>>> |      off      |      on       |      no       |
>>> |_______________|_______________|_______________|
>>>
>>> 1. SME is enabled in the first kernel, but SME is disabled in the kdump kernel
>>> In this case, because the old memory is encrypted, we can't decrypt the
>>> old memory.
>>>
>>> 2. SME is disabled in the first kernel, but SME is enabled in the kdump kernel
>>> There is no need to support this case: because the old memory is
>>> unencrypted, it can be dumped as usual, and we don't need to enable SME in
>>> the kdump kernel. Moreover, if we had to support this scenario, it would
>>> increase the complexity of the code; we would have to consider how to pass
>>> the SME flag from the first kernel to the kdump kernel, in order to let the
>>> kdump kernel know whether the old memory is encrypted.
>>>
>>> There are two methods to pass the SME flag to the kdump kernel. The first
>>> method is to modify the assembly code, which touches some common code, and
>>> the path is too long. The second method is to use the kexec tool: the SME
>>> flag would be exported in the first kernel via "proc" or "sysfs",
>>> kexec-tools would read the SME flag from "proc" or "sysfs" when loading the
>>> image, and subsequently the SME flag would be saved in boot_params, so we
>>> could properly remap the old memory according to the previously saved SME
>>> flag. But it is too expensive to do this.
>>>
>>> These patches are only for SME kdump; they don't support SEV kdump.
>>
>> Reviewed-by: Tom Lendacky <[email protected]>
>>
>
> Thank you, Tom. I'm very glad that you took the time to review my patches
> and also gave me some advice to improve them.
>
>> Just curious, are you planning to add SEV kdump support after this?
>>
>
> Yes, we are planning to add SEV kdump support after this.
> And I would also welcome your review of the SEV kdump patches.
>
>> Also, a question below...
>>
>>>
>>> Test tools:
>>> makedumpfile[v1.6.3]: https://github.com/LianboJ/makedumpfile
>>> commit e1de103eca8f (A draft for kdump vmcore about AMD SME)
>>> Note: this patch can only dump a vmcore when SME is enabled.
>>>
>>> crash-7.2.3: https://github.com/crash-utility/crash.git
>>> commit 001f77a05585 (Fix for Linux 4.19-rc1 and later kernels that contain
>>> kernel commit 7290d58095712a89f845e1bca05334796dd49ed2)
>>>
>>> kexec-tools-2.0.17: git://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
>>> commit b9de21ef51a7 (kexec: fix for "Unhandled rela relocation: R_X86_64_PLT32" error)
>>> Note:
>>> Before you load the kernel and initramfs for kdump, this patch (http://lists.infradead.org/pipermail/kexec/2018-September/021460.html)
>>> must be merged into kexec-tools; the kdump kernel will then work well. This
>>> is needed because a patch was removed based on v6 (x86/ioremap: strengthen
>>> the logic in early_memremap_pgprot_adjust() to adjust encryption mask).
>>>
>>> Test environment:
>>> HP ProLiant DL385Gen10 AMD EPYC 7251
>>> 8-Core Processor
>>> 32768 MB memory
>>> 600 GB disk space
>>>
>>> Linux 4.19-rc2:
>>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>>> commit 57361846b52bc686112da6ca5368d11210796804
>>>
>>> Reference:
>>> AMD64 Architecture Programmer's Manual
>>> https://support.amd.com/TechDocs/24593.pdf
>>>
>>> Changes since v6:
>>> 1. A patch was removed based on v6:
>>> (x86/ioremap: strengthen the logic in early_memremap_pgprot_adjust() to adjust encryption mask)
>>> Dave Young suggested that this patch could be removed and kexec-tools fixed instead.
>>> Reference: http://lists.infradead.org/pipermail/kexec/2018-September/021460.html
>>> 2. Update the patch log.
>>>
>>> Some known issues:
>>> 1. about SME
>>> The upstream kernel will hang on the HP machine (DL385Gen10 AMD EPYC 7251)
>>> when we execute the kexec command as follows:
>>>
>>> # kexec -l /boot/vmlinuz-4.19.0-rc2+ --initrd=/boot/initramfs-4.19.0-rc2+.img --command-line="root=/dev/mapper/rhel_hp--dl385g10--03-root ro mem_encrypt=on rd.lvm.lv=rhel_hp-dl385g10-03/root rd.lvm.lv=rhel_hp-dl385g10-03/swap console=ttyS0,115200n81 LANG=en_US.UTF-8 earlyprintk=serial debug nokaslr"
>>> # kexec -e (or reboot)
>>>
>>> But this issue cannot be reproduced on the Speedway machine, and it is
>>> unrelated to my posted patches.
>>>
>>> The kernel log:
>>> [ 1248.932239] kexec_core: Starting new kernel
>>> early console in extract_kernel
>>> input_data: 0x000000087e91c3b4
>>> input_len: 0x000000000067fcbd
>>> output: 0x000000087d400000
>>> output_len: 0x0000000001b6fa90
>>> kernel_total_size: 0x0000000001a9d000
>>> trampoline_32bit: 0x0000000000099000
>>>
>>> Decompressing Linux...
>>> Parsing ELF... [---Here the system will hang]
>>
>> Do you know the reason for the hang? It looks like it is hanging in
>> parse_elf(). Can you add some debug to parse_elf() to see if the
>> value of ehdr.e_phnum is valid (maybe it is not a valid value and so
>> the loop takes forever)?
>>
>
> Previously, I had borrowed a Speedway machine, but I could not reproduce this
> issue on it. On the 'HP ProLiant DL385Gen10' machine, however, this issue
> was always reproducible. (The code is the same on both.)
>
> I'm not sure whether this issue is related to hardware. I had printed those
> values, and I remember that the value of ehdr.e_phnum was valid.
>
> Because this issue is only reproduced on the DL385Gen10 machine, I lowered
> the priority of dealing with it.
>
> If you also care about this issue, I can create a new email thread to track
> it. What do you think?
>
> Thanks
> Lianbo
>
>> Thanks,
>> Tom
>>
>>>
>>>
>>> Lianbo Jiang (4):
>>> x86/ioremap: add a function ioremap_encrypted() to remap kdump old
>>> memory
>>> kexec: allocate unencrypted control pages for kdump in case SME is
>>> enabled
>>> amd_iommu: remap the device table of IOMMU with the memory encryption
>>> mask for kdump
>>> kdump/vmcore: support encrypted old memory with SME enabled
>>>
>>> arch/x86/include/asm/io.h | 3 ++
>>> arch/x86/kernel/Makefile | 1 +
>>> arch/x86/kernel/crash_dump_encrypt.c | 53 ++++++++++++++++++++++++++++
>>> arch/x86/mm/ioremap.c | 25 ++++++++-----
>>> drivers/iommu/amd_iommu_init.c | 14 ++++++--
>>> fs/proc/vmcore.c | 21 +++++++----
>>> include/linux/crash_dump.h | 12 +++++++
>>> kernel/kexec_core.c | 12 +++++++
>>> 8 files changed, 125 insertions(+), 16 deletions(-)
>>> create mode 100644 arch/x86/kernel/crash_dump_encrypt.c
>>>
>> _______________________________________________
>> kexec mailing list
>> [email protected]
>> http://lists.infradead.org/mailman/listinfo/kexec
>>
>
> _______________________________________________
> kexec mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/kexec
>
Also cc'ing the maintainer and other reviewers. Thanks.
On 2018-09-26 14:18, lijiang wrote:
> On 2018-09-26 10:21, Baoquan He wrote:
>> Hi Lianbo,
>>
>> On 09/07/18 at 04:18pm, Lianbo Jiang wrote:
>>> When SME is enabled on AMD machine, the memory is encrypted in the first
>>> kernel. In this case, SME also needs to be enabled in kdump kernel, and
>>> we have to remap the old memory with the memory encryption mask.
>>
>> This patch series looks good to me. One thing: in your v5 post, Boris
>> reviewed and complained about the git log, and we worked together to write
>> a document to explain it. I'm wondering why you didn't rearrange that into
>> the log of this patch. Other than this, all looks fine.
>>
>> http://lkml.kernel.org/r/[email protected]
>>
> Thank you, Baoquan.
>
> Previously I had considered whether I should put these explanations into the
> patch log. Because the content is rather long, I might just put the
> description of Solution A into this patch log and post this patch again.
>
> Lianbo
>>
>>>
>>> Signed-off-by: Lianbo Jiang <[email protected]>
>>> ---
>>> arch/x86/include/asm/io.h | 3 +++
>>> arch/x86/mm/ioremap.c | 25 +++++++++++++++++--------
>>> 2 files changed, 20 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
>>> index 6de64840dd22..f8795f9581c7 100644
>>> --- a/arch/x86/include/asm/io.h
>>> +++ b/arch/x86/include/asm/io.h
>>> @@ -192,6 +192,9 @@ extern void __iomem *ioremap_cache(resource_size_t offset, unsigned long size);
>>> #define ioremap_cache ioremap_cache
>>> extern void __iomem *ioremap_prot(resource_size_t offset, unsigned long size, unsigned long prot_val);
>>> #define ioremap_prot ioremap_prot
>>> +extern void __iomem *ioremap_encrypted(resource_size_t phys_addr,
>>> + unsigned long size);
>>> +#define ioremap_encrypted ioremap_encrypted
>>>
>>> /**
>>> * ioremap - map bus memory into CPU space
>>> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
>>> index c63a545ec199..e01e6c695add 100644
>>> --- a/arch/x86/mm/ioremap.c
>>> +++ b/arch/x86/mm/ioremap.c
>>> @@ -24,6 +24,7 @@
>>> #include <asm/pgalloc.h>
>>> #include <asm/pat.h>
>>> #include <asm/setup.h>
>>> +#include <linux/crash_dump.h>
>>>
>>> #include "physaddr.h"
>>>
>>> @@ -131,7 +132,8 @@ static void __ioremap_check_mem(resource_size_t addr, unsigned long size,
>>> * caller shouldn't need to know that small detail.
>>> */
>>> static void __iomem *__ioremap_caller(resource_size_t phys_addr,
>>> - unsigned long size, enum page_cache_mode pcm, void *caller)
>>> + unsigned long size, enum page_cache_mode pcm,
>>> + void *caller, bool encrypted)
>>> {
>>> unsigned long offset, vaddr;
>>> resource_size_t last_addr;
>>> @@ -199,7 +201,7 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
>>> * resulting mapping.
>>> */
>>> prot = PAGE_KERNEL_IO;
>>> - if (sev_active() && mem_flags.desc_other)
>>> + if ((sev_active() && mem_flags.desc_other) || encrypted)
>>> prot = pgprot_encrypted(prot);
>>>
>>> switch (pcm) {
>>> @@ -291,7 +293,7 @@ void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
>>> enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC_MINUS;
>>>
>>> return __ioremap_caller(phys_addr, size, pcm,
>>> - __builtin_return_address(0));
>>> + __builtin_return_address(0), false);
>>> }
>>> EXPORT_SYMBOL(ioremap_nocache);
>>>
>>> @@ -324,7 +326,7 @@ void __iomem *ioremap_uc(resource_size_t phys_addr, unsigned long size)
>>> enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC;
>>>
>>> return __ioremap_caller(phys_addr, size, pcm,
>>> - __builtin_return_address(0));
>>> + __builtin_return_address(0), false);
>>> }
>>> EXPORT_SYMBOL_GPL(ioremap_uc);
>>>
>>> @@ -341,7 +343,7 @@ EXPORT_SYMBOL_GPL(ioremap_uc);
>>> void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
>>> {
>>> return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WC,
>>> - __builtin_return_address(0));
>>> + __builtin_return_address(0), false);
>>> }
>>> EXPORT_SYMBOL(ioremap_wc);
>>>
>>> @@ -358,14 +360,21 @@ EXPORT_SYMBOL(ioremap_wc);
>>> void __iomem *ioremap_wt(resource_size_t phys_addr, unsigned long size)
>>> {
>>> return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WT,
>>> - __builtin_return_address(0));
>>> + __builtin_return_address(0), false);
>>> }
>>> EXPORT_SYMBOL(ioremap_wt);
>>>
>>> +void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned long size)
>>> +{
>>> + return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
>>> + __builtin_return_address(0), true);
>>> +}
>>> +EXPORT_SYMBOL(ioremap_encrypted);
>>> +
>>> void __iomem *ioremap_cache(resource_size_t phys_addr, unsigned long size)
>>> {
>>> return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
>>> - __builtin_return_address(0));
>>> + __builtin_return_address(0), false);
>>> }
>>> EXPORT_SYMBOL(ioremap_cache);
>>>
>>> @@ -374,7 +383,7 @@ void __iomem *ioremap_prot(resource_size_t phys_addr, unsigned long size,
>>> {
>>> return __ioremap_caller(phys_addr, size,
>>> pgprot2cachemode(__pgprot(prot_val)),
>>> - __builtin_return_address(0));
>>> + __builtin_return_address(0), false);
>>> }
>>> EXPORT_SYMBOL(ioremap_prot);
>>>
>>> --
>>> 2.17.1
>>>
>>
>> _______________________________________________
>> kexec mailing list
>> [email protected]
>> http://lists.infradead.org/mailman/listinfo/kexec
>>
>
> _______________________________________________
> kexec mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/kexec
>
When SME is enabled on AMD machine, the memory is encrypted in the first
kernel. In this case, SME also needs to be enabled in kdump kernel, and
we have to remap the old memory with the memory encryption mask.
Here we only talk about the case where SME is active in the first kernel,
and only care about the case where it is also active in the kdump kernel.
There are four cases we need to consider.
a. dump vmcore
It is encrypted in the first kernel, and needs to be read out in the kdump
kernel.
b. crash notes
When dumping vmcore, people usually need to read useful information from
the notes, and the notes are also encrypted.
c. iommu device table
It is allocated by the kernel, and its pointer needs to be filled into the
mmio of the amd iommu. It's encrypted in the first kernel; we need to read
the old content to analyze it and get useful information.
d. mmio of amd iommu
A register range reported by the amd firmware; it's not RAM, so we don't
encrypt it in either the first kernel or the kdump kernel.
To achieve the goal, the solution is:
1. add a new bool parameter "encrypted" to __ioremap_caller()
It is a low-level function that checks the newly added parameter; if it's
true and we are in the kdump kernel, it will remap the memory with the sme
mask.
2. add a new function ioremap_encrypted() to explicitly pass in a "true"
value for "encrypted".
For a, b and c above, we will call ioremap_encrypted();
3. adjust all existing ioremap wrapper functions, passing in "false" for
encrypted to keep them behaving as before.
ioremap_encrypted()\
ioremap_cache() |
ioremap_prot() |
ioremap_wt() |->__ioremap_caller()
ioremap_wc() |
ioremap_uc() |
ioremap_nocache() /
Signed-off-by: Lianbo Jiang <[email protected]>
---
Changes since v7:
1. Only modify patch log(suggested by Baoquan He)
arch/x86/include/asm/io.h | 3 +++
arch/x86/mm/ioremap.c | 25 +++++++++++++++++--------
2 files changed, 20 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index 6de64840dd22..f8795f9581c7 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -192,6 +192,9 @@ extern void __iomem *ioremap_cache(resource_size_t offset, unsigned long size);
#define ioremap_cache ioremap_cache
extern void __iomem *ioremap_prot(resource_size_t offset, unsigned long size, unsigned long prot_val);
#define ioremap_prot ioremap_prot
+extern void __iomem *ioremap_encrypted(resource_size_t phys_addr,
+ unsigned long size);
+#define ioremap_encrypted ioremap_encrypted
/**
* ioremap - map bus memory into CPU space
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index c63a545ec199..e01e6c695add 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -24,6 +24,7 @@
#include <asm/pgalloc.h>
#include <asm/pat.h>
#include <asm/setup.h>
+#include <linux/crash_dump.h>
#include "physaddr.h"
@@ -131,7 +132,8 @@ static void __ioremap_check_mem(resource_size_t addr, unsigned long size,
* caller shouldn't need to know that small detail.
*/
static void __iomem *__ioremap_caller(resource_size_t phys_addr,
- unsigned long size, enum page_cache_mode pcm, void *caller)
+ unsigned long size, enum page_cache_mode pcm,
+ void *caller, bool encrypted)
{
unsigned long offset, vaddr;
resource_size_t last_addr;
@@ -199,7 +201,7 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
* resulting mapping.
*/
prot = PAGE_KERNEL_IO;
- if (sev_active() && mem_flags.desc_other)
+ if ((sev_active() && mem_flags.desc_other) || encrypted)
prot = pgprot_encrypted(prot);
switch (pcm) {
@@ -291,7 +293,7 @@ void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC_MINUS;
return __ioremap_caller(phys_addr, size, pcm,
- __builtin_return_address(0));
+ __builtin_return_address(0), false);
}
EXPORT_SYMBOL(ioremap_nocache);
@@ -324,7 +326,7 @@ void __iomem *ioremap_uc(resource_size_t phys_addr, unsigned long size)
enum page_cache_mode pcm = _PAGE_CACHE_MODE_UC;
return __ioremap_caller(phys_addr, size, pcm,
- __builtin_return_address(0));
+ __builtin_return_address(0), false);
}
EXPORT_SYMBOL_GPL(ioremap_uc);
@@ -341,7 +343,7 @@ EXPORT_SYMBOL_GPL(ioremap_uc);
void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
{
return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WC,
- __builtin_return_address(0));
+ __builtin_return_address(0), false);
}
EXPORT_SYMBOL(ioremap_wc);
@@ -358,14 +360,21 @@ EXPORT_SYMBOL(ioremap_wc);
void __iomem *ioremap_wt(resource_size_t phys_addr, unsigned long size)
{
return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WT,
- __builtin_return_address(0));
+ __builtin_return_address(0), false);
}
EXPORT_SYMBOL(ioremap_wt);
+void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned long size)
+{
+ return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
+ __builtin_return_address(0), true);
+}
+EXPORT_SYMBOL(ioremap_encrypted);
+
void __iomem *ioremap_cache(resource_size_t phys_addr, unsigned long size)
{
return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
- __builtin_return_address(0));
+ __builtin_return_address(0), false);
}
EXPORT_SYMBOL(ioremap_cache);
@@ -374,7 +383,7 @@ void __iomem *ioremap_prot(resource_size_t phys_addr, unsigned long size,
{
return __ioremap_caller(phys_addr, size,
pgprot2cachemode(__pgprot(prot_val)),
- __builtin_return_address(0));
+ __builtin_return_address(0), false);
}
EXPORT_SYMBOL(ioremap_prot);
--
2.17.1
Hi Lianbo,
On 09/26/18 at 05:34pm, lijiang wrote:
> When SME is enabled on AMD machine, the memory is encrypted in the first
> kernel. In this case, SME also needs to be enabled in kdump kernel, and
> we have to remap the old memory with the memory encryption mask.
>
> Here we only talk about the case where SME is active in the first kernel,
> and only care about the case where it is also active in the kdump kernel.
> There are four cases we need to consider.
>
> a. dump vmcore
> It is encrypted in the first kernel, and needs to be read out in the kdump
> kernel.
>
> b. crash notes
> When dumping vmcore, people usually need to read useful information from
> the notes, and the notes are also encrypted.
>
> c. iommu device table
> It is allocated by the kernel, and its pointer needs to be filled into the
> mmio of the amd iommu. It's encrypted in the first kernel; we need to read
> the old content to analyze it and get useful information.
>
> d. mmio of amd iommu
> A register range reported by the amd firmware; it's not RAM, so we don't
> encrypt it in either the first kernel or the kdump kernel.
>
> To achieve the goal, the solution is:
> 1. add a new bool parameter "encrypted" to __ioremap_caller()
> It is a low-level function that checks the newly added parameter; if it's
> true and we are in the kdump kernel, it will remap the memory with the sme
> mask.
>
> 2. add a new function ioremap_encrypted() to explicitly pass in a "true"
> value for "encrypted".
> For a, b and c above, we will call ioremap_encrypted();
>
> 3. adjust all existing ioremap wrapper functions, passing in "false" for
> encrypted to keep them behaving as before.
>
> ioremap_encrypted()\
> ioremap_cache() |
> ioremap_prot() |
> ioremap_wt() |->__ioremap_caller()
> ioremap_wc() |
> ioremap_uc() |
> ioremap_nocache() /
Thanks, I think it's better. Since there is no code change, just a patch log
improvement, maybe you can repost the series and carry both Tom's and
Joerg's ACKs.
On 2018-09-25 20:04, Joerg Roedel wrote:
> On Fri, Sep 07, 2018 at 04:18:04PM +0800, Lianbo Jiang wrote:
>> In kdump kernel, it will copy the device table of IOMMU from the old device
>> table, which is encrypted when SME is enabled in the first kernel. So we
>> have to remap the old device table with the memory encryption mask.
>>
>> Signed-off-by: Lianbo Jiang <[email protected]>
>
> Please change the subject to:
>
> iommu/amd: Remap the device table of IOMMU with the memory encryption mask for kdump
>
> With that you can add my:
>
> Acked-by: Joerg Roedel <[email protected]>
>
Thank you, Joerg.
Sorry for my late reply. I will change the subject and resend this patch.
Thanks.
Lianbo
On 2018-09-27 10:06, Baoquan He wrote:
> Hi Lianbo,
>
> On 09/26/18 at 05:34pm, lijiang wrote:
>> When SME is enabled on AMD machine, the memory is encrypted in the first
>> kernel. In this case, SME also needs to be enabled in kdump kernel, and
>> we have to remap the old memory with the memory encryption mask.
>>
>> Here we only talk about the case where SME is active in the first kernel,
>> and only care about the case where it is also active in the kdump kernel.
>> There are four cases we need to consider.
>>
>> a. dump vmcore
>> It is encrypted in the first kernel, and needs to be read out in the kdump
>> kernel.
>>
>> b. crash notes
>> When dumping vmcore, people usually need to read useful information from
>> the notes, and the notes are also encrypted.
>>
>> c. iommu device table
>> It is allocated by the kernel, and its pointer needs to be filled into the
>> mmio of the amd iommu. It's encrypted in the first kernel; we need to read
>> the old content to analyze it and get useful information.
>>
>> d. mmio of amd iommu
>> A register range reported by the amd firmware; it's not RAM, so we don't
>> encrypt it in either the first kernel or the kdump kernel.
>>
>> To achieve the goal, the solution is:
>> 1. add a new bool parameter "encrypted" to __ioremap_caller()
>> It is a low-level function that checks the newly added parameter; if it's
>> true and we are in the kdump kernel, it will remap the memory with the sme
>> mask.
>>
>> 2. add a new function ioremap_encrypted() to explicitly pass in a "true"
>> value for "encrypted".
>> For a, b and c above, we will call ioremap_encrypted();
>>
>> 3. adjust all existing ioremap wrapper functions, passing in "false" for
>> encrypted to keep them behaving as before.
>>
>> ioremap_encrypted()\
>> ioremap_cache() |
>> ioremap_prot() |
>> ioremap_wt() |->__ioremap_caller()
>> ioremap_wc() |
>> ioremap_uc() |
>> ioremap_nocache() /
>
> Thanks, I think it's better. Since there is no code change, just a patch log
> improvement, maybe you can repost the series and carry both Tom's and
> Joerg's ACKs.
>
Thank you, Baoquan.
I will resend the series, add Tom's Reviewed-by to all patches, and also
add Joerg's Acked-by to patch 3/4.
Thanks.
Lianbo