Subject: [PATCH v1 0/7] Add TDX Guest Support (shared-mm support)

Hi All,

Intel's Trust Domain Extensions (TDX) protect guest VMs from malicious
hosts and some physical attacks. Since the VMM is an untrusted entity,
TDX does not allow it to access guest private memory. Any memory that is
required for communication with the VMM must be shared explicitly. This
series adds support for securely sharing guest memory with the VMM when
the guest requires it.
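
As an illustration, here is a hypothetical driver snippet (not part of
this series; it only uses the set_memory_decrypted()/
set_memory_encrypted() interfaces that patch 5 wires up for TDX
guests):

	/* Hypothetical example: share one page with the VMM, then reclaim it. */
	static int example_share_page(void)
	{
		unsigned long vaddr = __get_free_page(GFP_KERNEL);
		int ret;

		if (!vaddr)
			return -ENOMEM;

		ret = set_memory_decrypted(vaddr, 1);	/* private -> shared */
		if (ret)
			goto out;

		/* ... communicate with the VMM through this page ... */

		ret = set_memory_encrypted(vaddr, 1);	/* shared -> private */
	out:
		free_page(vaddr);
		return ret;
	}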

This series is a continuation of the patch series titled "Add TDX Guest
Support (Initial support)", "Add TDX Guest Support (#VE handler support)"
and "Add TDX Guest Support (boot fixes)", which added initial support,
#VE handler support and boot fixes for TDX guests. You can find the
related patchsets at the following links.

https://lore.kernel.org/patchwork/project/lkml/list/?series=502143
https://lore.kernel.org/patchwork/project/lkml/list/?series=503701
https://lore.kernel.org/patchwork/project/lkml/list/?series=503702

Also please note that this series alone is not fully functional. You
need to apply all three of the above patch series to get a fully
functional TDX guest.

You can find TDX-related documents at the following link.

https://software.intel.com/content/www/br/pt/develop/articles/intel-trust-domain-extensions.html

Isaku Yamahata (1):
x86/tdx: ioapic: Add shared bit for IOAPIC base address

Kirill A. Shutemov (6):
x86/mm: Move force_dma_unencrypted() to common code
x86/tdx: Exclude Shared bit from physical_mask
x86/tdx: Make pages shared in ioremap()
x86/tdx: Add helper to do MapGPA hypercall
x86/tdx: Make DMA pages shared
x86/kvm: Use bounce buffers for TD guest

arch/x86/Kconfig | 9 +++-
arch/x86/include/asm/mem_encrypt_common.h | 20 ++++++++
arch/x86/include/asm/pgtable.h | 5 ++
arch/x86/include/asm/tdx.h | 23 +++++++++
arch/x86/kernel/apic/io_apic.c | 17 ++++++-
arch/x86/kernel/tdx.c | 58 +++++++++++++++++++++++
arch/x86/mm/Makefile | 2 +
arch/x86/mm/ioremap.c | 9 ++--
arch/x86/mm/mem_encrypt.c | 10 ++--
arch/x86/mm/mem_encrypt_common.c | 39 +++++++++++++++
arch/x86/mm/pat/set_memory.c | 46 +++++++++++++++---
11 files changed, 218 insertions(+), 20 deletions(-)
create mode 100644 arch/x86/include/asm/mem_encrypt_common.h
create mode 100644 arch/x86/mm/mem_encrypt_common.c

--
2.25.1


Subject: [PATCH v1 3/7] x86/tdx: Make pages shared in ioremap()

From: "Kirill A. Shutemov" <[email protected]>

All ioremap()ed pages that are not backed by normal memory (NONE or
RESERVED) have to be mapped as shared.

Reuse the infrastructure we have for AMD SEV.

Note that the DMA code doesn't use ioremap() to convert memory to
shared, since DMA buffers are backed by normal memory. The DMA code
makes buffers shared with set_memory_decrypted().
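
For a TDX guest, the effect on an MMIO ioremap() is roughly the
following (a sketch; tdg_shared_mask() is provided by the earlier
TDX patches):

	/* Sketch: MMIO is not normal RAM, so map it shared with the VMM. */
	pgprot_t prot = PAGE_KERNEL_IO;

	if (prot_guest_has(PR_GUEST_SHARED_MAPPING_INIT))
		prot = pgprot_protected_guest(prot);	/* ORs in the Shared bit */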

Signed-off-by: Kirill A. Shutemov <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
---
arch/x86/include/asm/pgtable.h | 4 ++++
arch/x86/mm/ioremap.c | 9 ++++++---
2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index b1099f2d9800..5b77843dfa10 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -21,6 +21,10 @@
#define pgprot_encrypted(prot) __pgprot(__sme_set(pgprot_val(prot)))
#define pgprot_decrypted(prot) __pgprot(__sme_clr(pgprot_val(prot)))

+/* Make the page accessible by the VMM for protected guests */
+#define pgprot_protected_guest(prot) __pgprot(pgprot_val(prot) | \
+ tdg_shared_mask())
+
#ifndef __ASSEMBLY__
#include <asm/x86_init.h>
#include <asm/fpu/xstate.h>
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 12c686c65ea9..94718396e9e6 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -17,6 +17,7 @@
#include <linux/mem_encrypt.h>
#include <linux/efi.h>
#include <linux/pgtable.h>
+#include <linux/protected_guest.h>

#include <asm/set_memory.h>
#include <asm/e820/api.h>
@@ -87,12 +88,12 @@ static unsigned int __ioremap_check_ram(struct resource *res)
}

/*
- * In a SEV guest, NONE and RESERVED should not be mapped encrypted because
- * there the whole memory is already encrypted.
+ * In a SEV or TDX guest, NONE and RESERVED should not be mapped encrypted (or
+ * private in the TDX case) because the whole of memory is already encrypted.
*/
static unsigned int __ioremap_check_encrypted(struct resource *res)
{
- if (!sev_active())
+ if (!sev_active() && !prot_guest_has(PR_GUEST_MEM_ENCRYPT))
return 0;

switch (res->desc) {
@@ -244,6 +245,8 @@ __ioremap_caller(resource_size_t phys_addr, unsigned long size,
prot = PAGE_KERNEL_IO;
if ((io_desc.flags & IORES_MAP_ENCRYPTED) || encrypted)
prot = pgprot_encrypted(prot);
+ else if (prot_guest_has(PR_GUEST_SHARED_MAPPING_INIT))
+ prot = pgprot_protected_guest(prot);

switch (pcm) {
case _PAGE_CACHE_MODE_UC:
--
2.25.1

Subject: [PATCH v1 5/7] x86/tdx: Make DMA pages shared

From: "Kirill A. Shutemov" <[email protected]>

Just like MKTME, TDX reassigns bits of the physical address for
metadata. MKTME used several bits for an encryption KeyID. TDX
uses a single bit in guests to communicate whether a physical page
should be protected by TDX as private memory (bit set to 0) or
unprotected and shared with the VMM (bit set to 1).
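
Concretely, with the helpers this series adds (a sketch;
tdg_shared_mask() returns the Shared bit as a physical address mask):

	/* Sketch: the Shared bit is address metadata, not addressable memory. */
	phys_addr_t shared_gpa  = gpa | tdg_shared_mask();	/* shared with VMM */
	phys_addr_t private_gpa = gpa & ~tdg_shared_mask();	/* guest private */

	/* Hence the bit is excluded from the usable physical address mask: */
	physical_mask &= ~tdg_shared_mask();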

__set_memory_enc_dec() is now aware of TDX: it sets the Shared bit
accordingly and follows up with the relevant TDX hypercall.

Also, do TDACCEPTPAGE on every 4k page after mapping the GPA range
when converting memory to private. The 4k page size limit is a
restriction of the current TDX spec. If the GPA (range) was already
mapped as an active, private page, the host VMM may remove the private
page from the TD by following the “Removing TD Private Pages” sequence
in the Intel TDX-module specification [1] to safely block the
mapping(s), flush the TLB and cache, and remove the mapping(s).

BUG() if TDACCEPTPAGE fails (except in the “previously accepted page”
case), as the guest is completely hosed if it can't access memory.

[1] https://software.intel.com/content/dam/develop/external/us/en/documents/tdx-module-1eas-v0.85.039.pdf

Tested-by: Kai Huang <[email protected]>
Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
---
arch/x86/include/asm/pgtable.h | 1 +
arch/x86/kernel/tdx.c | 34 ++++++++++++++++++-----
arch/x86/mm/mem_encrypt_common.c | 3 +++
arch/x86/mm/pat/set_memory.c | 46 +++++++++++++++++++++++++++-----
4 files changed, 71 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 5b77843dfa10..41c8d3ace070 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -24,6 +24,7 @@
/* Make the page accessible by the VMM for protected guests */
#define pgprot_protected_guest(prot) __pgprot(pgprot_val(prot) | \
tdg_shared_mask())
+#define pgprot_pg_shared_mask() __pgprot(tdg_shared_mask())

#ifndef __ASSEMBLY__
#include <asm/x86_init.h>
diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
index 591643abae88..c90871a10443 100644
--- a/arch/x86/kernel/tdx.c
+++ b/arch/x86/kernel/tdx.c
@@ -16,10 +16,14 @@
/* TDX Module call Leaf IDs */
#define TDINFO 1
#define TDGETVEINFO 3
+#define TDACCEPTPAGE 6

/* TDX hypercall Leaf IDs */
#define TDVMCALL_MAP_GPA 0x10001

+/* TDX Module call error codes */
+#define TDX_PAGE_ALREADY_ACCEPTED 0x8000000000000001
+
#define VE_IS_IO_OUT(exit_qual) (((exit_qual) & 8) ? 0 : 1)
#define VE_GET_IO_SIZE(exit_qual) (((exit_qual) & 7) + 1)
#define VE_GET_PORT_NUM(exit_qual) ((exit_qual) >> 16)
@@ -124,25 +128,43 @@ static void tdg_get_info(void)
physical_mask &= ~tdg_shared_mask();
}

+static void tdg_accept_page(phys_addr_t gpa)
+{
+ u64 ret;
+
+ ret = __tdx_module_call(TDACCEPTPAGE, gpa, 0, 0, 0, NULL);
+
+ BUG_ON(ret && ret != TDX_PAGE_ALREADY_ACCEPTED);
+}
+
/*
* Inform the VMM of the guest's intent for this physical page:
* shared with the VMM or private to the guest. The VMM is
* expected to change its mapping of the page in response.
- *
- * Note: shared->private conversions require further guest
- * action to accept the page.
*/
int tdx_hcall_gpa_intent(phys_addr_t gpa, int numpages,
enum tdx_map_type map_type)
{
- u64 ret;
+ int ret = 0;
+ int i;

if (map_type == TDX_MAP_SHARED)
gpa |= tdg_shared_mask();

- ret = tdx_hypercall(TDVMCALL_MAP_GPA, gpa, PAGE_SIZE * numpages, 0, 0);
+ if (tdx_hypercall(TDVMCALL_MAP_GPA, gpa, PAGE_SIZE * numpages, 0, 0))
+ ret = -EIO;

- return ret ? -EIO : 0;
+ if (ret || map_type == TDX_MAP_SHARED)
+ return ret;
+
+ /*
+ * For shared->private conversion, accept the page using TDACCEPTPAGE
+ * TDX module call.
+ */
+ for (i = 0; i < numpages; i++)
+ tdg_accept_page(gpa + i * PAGE_SIZE);
+
+ return 0;
}

static __cpuidle void tdg_halt(void)
diff --git a/arch/x86/mm/mem_encrypt_common.c b/arch/x86/mm/mem_encrypt_common.c
index 4a9a4d5f36cd..8053b43298ff 100644
--- a/arch/x86/mm/mem_encrypt_common.c
+++ b/arch/x86/mm/mem_encrypt_common.c
@@ -16,5 +16,8 @@ bool force_dma_unencrypted(struct device *dev)
if (sev_active() || sme_active())
return amd_force_dma_unencrypted(dev);

+ if (prot_guest_has(PR_GUEST_MEM_ENCRYPT))
+ return true;
+
return false;
}
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 156cd235659f..fa0f2de20617 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -29,6 +29,7 @@
#include <asm/proto.h>
#include <asm/memtype.h>
#include <asm/set_memory.h>
+#include <asm/tdx.h>

#include "../mm_internal.h"

@@ -1980,13 +1981,16 @@ int set_memory_global(unsigned long addr, int numpages)
__pgprot(_PAGE_GLOBAL), 0);
}

-static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
+static int __set_memory_protect(unsigned long addr, int numpages, bool protect)
{
+ pgprot_t mem_protected_bits, mem_plain_bits;
struct cpa_data cpa;
+ enum tdx_map_type map_type;
int ret;

/* Nothing to do if memory encryption is not active */
- if (!mem_encrypt_active())
+ if (!mem_encrypt_active() &&
+ !prot_guest_has(PR_GUEST_MEM_ENCRYPT_ACTIVE))
return 0;

/* Should not be working on unaligned addresses */
@@ -1996,8 +2000,25 @@ static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
memset(&cpa, 0, sizeof(cpa));
cpa.vaddr = &addr;
cpa.numpages = numpages;
- cpa.mask_set = enc ? __pgprot(_PAGE_ENC) : __pgprot(0);
- cpa.mask_clr = enc ? __pgprot(0) : __pgprot(_PAGE_ENC);
+
+ if (prot_guest_has(PR_GUEST_SHARED_MAPPING_INIT)) {
+ mem_protected_bits = __pgprot(0);
+ mem_plain_bits = pgprot_pg_shared_mask();
+ } else {
+ mem_protected_bits = __pgprot(_PAGE_ENC);
+ mem_plain_bits = __pgprot(0);
+ }
+
+ if (protect) {
+ cpa.mask_set = mem_protected_bits;
+ cpa.mask_clr = mem_plain_bits;
+ map_type = TDX_MAP_PRIVATE;
+ } else {
+ cpa.mask_set = mem_plain_bits;
+ cpa.mask_clr = mem_protected_bits;
+ map_type = TDX_MAP_SHARED;
+ }
+
cpa.pgd = init_mm.pgd;

/* Must avoid aliasing mappings in the highmem code */
@@ -2006,8 +2027,16 @@ static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)

/*
* Before changing the encryption attribute, we need to flush caches.
+ *
+ * For TDX we need to flush caches on private->shared. VMM is
+ * responsible for flushing on shared->private.
*/
- cpa_flush(&cpa, !this_cpu_has(X86_FEATURE_SME_COHERENT));
+ if (is_tdx_guest()) {
+ if (map_type == TDX_MAP_SHARED)
+ cpa_flush(&cpa, 1);
+ } else {
+ cpa_flush(&cpa, !this_cpu_has(X86_FEATURE_SME_COHERENT));
+ }

ret = __change_page_attr_set_clr(&cpa, 1);

@@ -2020,18 +2049,21 @@ static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
*/
cpa_flush(&cpa, 0);

+ if (!ret && prot_guest_has(PR_GUEST_SHARED_MAPPING_INIT))
+ ret = tdx_hcall_gpa_intent(__pa(addr), numpages, map_type);
+
return ret;
}

int set_memory_encrypted(unsigned long addr, int numpages)
{
- return __set_memory_enc_dec(addr, numpages, true);
+ return __set_memory_protect(addr, numpages, true);
}
EXPORT_SYMBOL_GPL(set_memory_encrypted);

int set_memory_decrypted(unsigned long addr, int numpages)
{
- return __set_memory_enc_dec(addr, numpages, false);
+ return __set_memory_protect(addr, numpages, false);
}
EXPORT_SYMBOL_GPL(set_memory_decrypted);

--
2.25.1

Subject: [PATCH v1 6/7] x86/kvm: Use bounce buffers for TD guest

From: "Kirill A. Shutemov" <[email protected]>

Intel TDX doesn't allow the VMM to directly access guest private
memory. Any memory that is required for communication with the
VMM must be shared explicitly. The same rule applies for any
DMA to and from a TDX guest. All DMA pages have to be marked as
shared pages. A generic way to achieve this without any changes
to device drivers is to use the SWIOTLB framework.

This method of handling is similar to AMD SEV, so extend that
support to TDX guests as well. Also, since there is some code
common to AMD SEV and TDX guests in mem_encrypt_init(), move it
to mem_encrypt_common.c and call the AMD-specific init function
from there.
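
For context, swiotlb_update_mem_attributes() (called from
mem_encrypt_init() below) conceptually does the following. This is a
simplified sketch; the swiotlb globals shown (io_tlb_start,
io_tlb_nslabs) are illustrative of the swiotlb internals of this era:

	/* Sketch: make the SWIOTLB bounce pool shared with the VMM. */
	void *vaddr = phys_to_virt(io_tlb_start);
	unsigned long bytes = PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT);

	set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
	memset(vaddr, 0, bytes);	/* don't leak stale private contents */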

Signed-off-by: Kirill A. Shutemov <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
---
arch/x86/include/asm/mem_encrypt_common.h | 2 ++
arch/x86/kernel/tdx.c | 3 +++
arch/x86/mm/mem_encrypt.c | 5 +----
arch/x86/mm/mem_encrypt_common.c | 16 ++++++++++++++++
4 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/mem_encrypt_common.h b/arch/x86/include/asm/mem_encrypt_common.h
index 697bc40a4e3d..48d98a3d64fd 100644
--- a/arch/x86/include/asm/mem_encrypt_common.h
+++ b/arch/x86/include/asm/mem_encrypt_common.h
@@ -8,11 +8,13 @@

#ifdef CONFIG_AMD_MEM_ENCRYPT
bool amd_force_dma_unencrypted(struct device *dev);
+void __init amd_mem_encrypt_init(void);
#else /* CONFIG_AMD_MEM_ENCRYPT */
static inline bool amd_force_dma_unencrypted(struct device *dev)
{
return false;
}
+static inline void amd_mem_encrypt_init(void) {}
#endif /* CONFIG_AMD_MEM_ENCRYPT */

#endif
diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
index c90871a10443..1caf9fa5bb30 100644
--- a/arch/x86/kernel/tdx.c
+++ b/arch/x86/kernel/tdx.c
@@ -9,6 +9,7 @@
#include <asm/insn.h>
#include <asm/insn-eval.h>
#include <linux/sched/signal.h> /* force_sig_fault() */
+#include <linux/swiotlb.h>

#include <linux/cpu.h>
#include <linux/protected_guest.h>
@@ -535,6 +536,8 @@ void __init tdx_early_init(void)

legacy_pic = &null_legacy_pic;

+ swiotlb_force = SWIOTLB_FORCE;
+
cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "tdg:cpu_hotplug",
NULL, tdg_cpu_offline_prepare);

diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 9c55a3209c88..84ee14446139 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -467,14 +467,11 @@ static void print_mem_encrypt_feature_info(void)
}

/* Architecture __weak replacement functions */
-void __init mem_encrypt_init(void)
+void __init amd_mem_encrypt_init(void)
{
if (!sme_me_mask)
return;

- /* Call into SWIOTLB to update the SWIOTLB DMA buffers */
- swiotlb_update_mem_attributes();
-
/*
* With SEV, we need to unroll the rep string I/O instructions,
* but SEV-ES supports them through the #VC handler.
diff --git a/arch/x86/mm/mem_encrypt_common.c b/arch/x86/mm/mem_encrypt_common.c
index 8053b43298ff..2da70f58b208 100644
--- a/arch/x86/mm/mem_encrypt_common.c
+++ b/arch/x86/mm/mem_encrypt_common.c
@@ -9,6 +9,7 @@

#include <asm/mem_encrypt_common.h>
#include <linux/dma-mapping.h>
+#include <linux/swiotlb.h>

/* Override for DMA direct allocation check - ARCH_HAS_FORCE_DMA_UNENCRYPTED */
bool force_dma_unencrypted(struct device *dev)
@@ -21,3 +22,18 @@ bool force_dma_unencrypted(struct device *dev)

return false;
}
+
+/* Architecture __weak replacement functions */
+void __init mem_encrypt_init(void)
+{
+ /*
+ * For TDX guest or SEV/SME, call into SWIOTLB to update
+ * the SWIOTLB DMA buffers
+ */
+ if (sme_me_mask || prot_guest_has(PR_GUEST_MEM_ENCRYPT))
+ swiotlb_update_mem_attributes();
+
+ if (sme_me_mask)
+ amd_mem_encrypt_init();
+}
+
--
2.25.1

Subject: [PATCH v1 4/7] x86/tdx: Add helper to do MapGPA hypercall

From: "Kirill A. Shutemov" <[email protected]>

The MapGPA hypercall is used by TDX guests to request that the VMM
convert the existing mapping of a given GPA address range between
private and shared.

tdx_hcall_gpa_intent() is the wrapper used for making the MapGPA
hypercall.
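
A hypothetical caller (for illustration only; the real users arrive
later in this series) would look like:

	/* Hypothetical example: ask the VMM to treat one page as shared. */
	int ret = tdx_hcall_gpa_intent(__pa(vaddr), 1, TDX_MAP_SHARED);

	if (ret)	/* -EIO: the MapGPA hypercall failed */
		return ret;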

Signed-off-by: Kirill A. Shutemov <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
---
arch/x86/include/asm/tdx.h | 17 +++++++++++++++++
arch/x86/kernel/tdx.c | 24 ++++++++++++++++++++++++
2 files changed, 41 insertions(+)

diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 70e0931bbf52..f20b1f056cdd 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -5,6 +5,15 @@

#define TDX_CPUID_LEAF_ID 0x21

+/*
+ * Page mapping type enum. This is software construct not
+ * part of any hardware or VMM ABI.
+ */
+enum tdx_map_type {
+ TDX_MAP_PRIVATE,
+ TDX_MAP_SHARED,
+};
+
#ifdef CONFIG_INTEL_TDX_GUEST

#include <asm/cpufeature.h>
@@ -123,6 +132,8 @@ do { \
#endif

extern phys_addr_t tdg_shared_mask(void);
+extern int tdx_hcall_gpa_intent(phys_addr_t gpa, int numpages,
+ enum tdx_map_type map_type);

#else // !CONFIG_INTEL_TDX_GUEST

@@ -147,6 +158,12 @@ static inline phys_addr_t tdg_shared_mask(void)
{
return 0;
}
+
+static inline int tdx_hcall_gpa_intent(phys_addr_t gpa, int numpages,
+ enum tdx_map_type map_type)
+{
+ return -ENODEV;
+}
#endif /* CONFIG_INTEL_TDX_GUEST */

#ifdef CONFIG_INTEL_TDX_GUEST_KVM
diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
index 1cd572a35eea..591643abae88 100644
--- a/arch/x86/kernel/tdx.c
+++ b/arch/x86/kernel/tdx.c
@@ -17,6 +17,9 @@
#define TDINFO 1
#define TDGETVEINFO 3

+/* TDX hypercall Leaf IDs */
+#define TDVMCALL_MAP_GPA 0x10001
+
#define VE_IS_IO_OUT(exit_qual) (((exit_qual) & 8) ? 0 : 1)
#define VE_GET_IO_SIZE(exit_qual) (((exit_qual) & 7) + 1)
#define VE_GET_PORT_NUM(exit_qual) ((exit_qual) >> 16)
@@ -121,6 +124,27 @@ static void tdg_get_info(void)
physical_mask &= ~tdg_shared_mask();
}

+/*
+ * Inform the VMM of the guest's intent for this physical page:
+ * shared with the VMM or private to the guest. The VMM is
+ * expected to change its mapping of the page in response.
+ *
+ * Note: shared->private conversions require further guest
+ * action to accept the page.
+ */
+int tdx_hcall_gpa_intent(phys_addr_t gpa, int numpages,
+ enum tdx_map_type map_type)
+{
+ u64 ret;
+
+ if (map_type == TDX_MAP_SHARED)
+ gpa |= tdg_shared_mask();
+
+ ret = tdx_hypercall(TDVMCALL_MAP_GPA, gpa, PAGE_SIZE * numpages, 0, 0);
+
+ return ret ? -EIO : 0;
+}
+
static __cpuidle void tdg_halt(void)
{
u64 ret;
--
2.25.1

Subject: [PATCH v1 7/7] x86/tdx: ioapic: Add shared bit for IOAPIC base address

From: Isaku Yamahata <[email protected]>

The kernel interacts with each bare-metal IOAPIC with a special
MMIO page. When running under KVM, the guest's IOAPICs are
emulated by KVM.

When running as a TDX guest, the guest needs to mark each IOAPIC
mapping as "shared" with the host. This ensures that TDX private
protections are not applied to the page, which allows the TDX host
emulation to work.

Earlier patches in this series modified ioremap() so that
ioremap()-created mappings such as virtio will be marked as
shared. However, the IOAPIC code does not use ioremap() and instead
uses the fixmap mechanism.

Introduce a special fixmap helper just for the IOAPIC code. Ensure
that it marks IOAPIC pages as "shared". This replaces
set_fixmap_nocache() with __set_fixmap() since __set_fixmap()
allows custom 'prot' values.

Signed-off-by: Isaku Yamahata <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]>
---
arch/x86/kernel/apic/io_apic.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index d5c691a3208b..95639072c986 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -49,6 +49,7 @@
#include <linux/slab.h>
#include <linux/memblock.h>
#include <linux/msi.h>
+#include <linux/protected_guest.h>

#include <asm/irqdomain.h>
#include <asm/io.h>
@@ -2675,6 +2676,18 @@ static struct resource * __init ioapic_setup_resources(void)
return res;
}

+static void io_apic_set_fixmap_nocache(enum fixed_addresses idx,
+ phys_addr_t phys)
+{
+ pgprot_t flags = FIXMAP_PAGE_NOCACHE;
+
+ /* Set TDX guest shared bit in pgprot flags */
+ if (prot_guest_has(PR_GUEST_SHARED_MAPPING_INIT))
+ flags = pgprot_protected_guest(flags);
+
+ __set_fixmap(idx, phys, flags);
+}
+
void __init io_apic_init_mappings(void)
{
unsigned long ioapic_phys, idx = FIX_IO_APIC_BASE_0;
@@ -2707,7 +2720,7 @@ void __init io_apic_init_mappings(void)
__func__, PAGE_SIZE, PAGE_SIZE);
ioapic_phys = __pa(ioapic_phys);
}
- set_fixmap_nocache(idx, ioapic_phys);
+ io_apic_set_fixmap_nocache(idx, ioapic_phys);
apic_printk(APIC_VERBOSE, "mapped IOAPIC to %08lx (%08lx)\n",
__fix_to_virt(idx) + (ioapic_phys & ~PAGE_MASK),
ioapic_phys);
@@ -2836,7 +2849,7 @@ int mp_register_ioapic(int id, u32 address, u32 gsi_base,
ioapics[idx].mp_config.flags = MPC_APIC_USABLE;
ioapics[idx].mp_config.apicaddr = address;

- set_fixmap_nocache(FIX_IO_APIC_BASE_0 + idx, address);
+ io_apic_set_fixmap_nocache(FIX_IO_APIC_BASE_0 + idx, address);
if (bad_ioapic_register(idx)) {
clear_fixmap(FIX_IO_APIC_BASE_0 + idx);
return -ENODEV;
--
2.25.1

From: Tom Lendacky
Date: 2021-06-11 14:54:24
Subject: Re: [PATCH v1 6/7] x86/kvm: Use bounce buffers for TD guest

On 6/9/21 4:55 PM, Kuppuswamy Sathyanarayanan wrote:
> From: "Kirill A. Shutemov" <[email protected]>
> diff --git a/arch/x86/mm/mem_encrypt_common.c b/arch/x86/mm/mem_encrypt_common.c
> index 8053b43298ff..2da70f58b208 100644
> --- a/arch/x86/mm/mem_encrypt_common.c
> +++ b/arch/x86/mm/mem_encrypt_common.c
> @@ -9,6 +9,7 @@
>
> #include <asm/mem_encrypt_common.h>
> #include <linux/dma-mapping.h>
> +#include <linux/swiotlb.h>
>
> /* Override for DMA direct allocation check - ARCH_HAS_FORCE_DMA_UNENCRYPTED */
> bool force_dma_unencrypted(struct device *dev)
> @@ -21,3 +22,18 @@ bool force_dma_unencrypted(struct device *dev)
>
> return false;
> }
> +
> +/* Architecture __weak replacement functions */
> +void __init mem_encrypt_init(void)
> +{
> + /*
> + * For TDX guest or SEV/SME, call into SWIOTLB to update
> + * the SWIOTLB DMA buffers
> + */
> + if (sme_me_mask || prot_guest_has(PR_GUEST_MEM_ENCRYPT))
> + swiotlb_update_mem_attributes();
> +
> + if (sme_me_mask)
> + amd_mem_encrypt_init();

The sme_me_mask is checked in amd_mem_encrypt_init(), so you should just
invoke amd_mem_encrypt_init() unconditionally.

Thanks,
Tom

> +}
> +
>

From: Sathyanarayanan Kuppuswamy
Subject: Re: [PATCH v1 6/7] x86/kvm: Use bounce buffers for TD guest

On 6/11/21 7:52 AM, Tom Lendacky wrote:
> The sme_me_mask is checked in amd_mem_encrypt_init(), so you should just
> invoke amd_mem_encrypt_init() unconditionally.

Ok. I will fix it in the next version.
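
For reference, a sketch of how the fixed version might look (per the
suggestion above; not yet posted):

	/* Architecture __weak replacement functions */
	void __init mem_encrypt_init(void)
	{
		/*
		 * For TDX guests or SEV/SME, call into SWIOTLB to update
		 * the SWIOTLB DMA buffers.
		 */
		if (sme_me_mask || prot_guest_has(PR_GUEST_MEM_ENCRYPT))
			swiotlb_update_mem_attributes();

		/* amd_mem_encrypt_init() checks sme_me_mask internally. */
		amd_mem_encrypt_init();
	}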

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer