Hi Peter, Kirill, all,
This series unifies the assembly code for TDCALL/SEAMCALL and TDVMCALL.
Now all of them use one singe TDX_MODULE_CALL asm macro. More
information please see cover letter of v2 (see link below).
This version mainly addressed Peter's comment to add patch to adjust
'struct tdx_module_args' to match KVM's "vcpu::regs".
Tested by booting TDX guest, initializing TDX module, and running TDX
guest successfully, all with this series applied.
------- Histroy --------
v2 -> v3:
- New patch (patch 12) to adjust 'struct tdx_module_args' layout to
match KVM's "vcpu::regs[]" for VP.ENTER. (Peter)
- Added __seamcall_saved_ret() wrapper to support VP.ENTER (merged to
patch 10).
- Fixed a 'noinstr' check build regression found by LKP (patch 7).
- Rebased to latest Linus's tree (6.5-rc3 + 2 commits).
v2: https://lore.kernel.org/lkml/[email protected]/T/
v1 -> v2:
- Rebased to 6.5-rc2.
- Fixed comments from Peter and others.
- Split patch "x86/tdx: Unify TDX_HYPERCALL and TDX_MODULE_CALL assembly"
into three smaller patches for better review.
- A new patch to skip saving output registers when SEAMCALL fails due to
VMFailInvalid.
- Removed patch "x86/tdx: Use cmovc to save a label in TDX_MODULE_CALL asm"
- Merged patch "x86/tdx: Move FRAME_BEGIN/END to TDX_MODULE_CALL asm macro"
to the new patch mentioned above.
v1: https://lore.kernel.org/lkml/[email protected]/T/
Kai Huang (12):
x86/tdx: Zero out the missing RSI in TDX_HYPERCALL macro
x86/tdx: Skip saving output regs when SEAMCALL fails with
VMFailInvalid
x86/tdx: Make macros of TDCALLs consistent with the spec
x86/tdx: Rename __tdx_module_call() to __tdcall()
x86/tdx: Pass TDCALL/SEAMCALL input/output registers via a structure
x86/tdx: Extend TDX_MODULE_CALL to support more TDCALL/SEAMCALL leafs
x86/tdx: Make TDX_HYPERCALL asm similar to TDX_MODULE_CALL
x86/tdx: Reimplement __tdx_hypercall() using TDX_MODULE_CALL asm
x86/tdx: Remove 'struct tdx_hypercall_args'
x86/virt/tdx: Wire up basic SEAMCALL functions
x86/virt/tdx: Allow SEAMCALL to handle #UD and #GP
x86/virt/tdx: Adjust 'struct tdx_module_args' to use x86 "register
index" layout
arch/x86/Kconfig | 12 ++
arch/x86/Makefile | 2 +
arch/x86/boot/compressed/tdx.c | 6 +-
arch/x86/coco/tdx/tdcall.S | 231 ++++--------------------------
arch/x86/coco/tdx/tdx-shared.c | 28 +++-
arch/x86/coco/tdx/tdx.c | 69 +++++----
arch/x86/include/asm/shared/tdx.h | 92 +++++++-----
arch/x86/include/asm/tdx.h | 11 ++
arch/x86/kernel/asm-offsets.c | 33 ++---
arch/x86/virt/Makefile | 2 +
arch/x86/virt/vmx/Makefile | 2 +
arch/x86/virt/vmx/tdx/Makefile | 2 +
arch/x86/virt/vmx/tdx/seamcall.S | 61 ++++++++
arch/x86/virt/vmx/tdx/tdxcall.S | 227 ++++++++++++++++++++++-------
14 files changed, 433 insertions(+), 345 deletions(-)
create mode 100644 arch/x86/virt/Makefile
create mode 100644 arch/x86/virt/vmx/Makefile
create mode 100644 arch/x86/virt/vmx/tdx/Makefile
create mode 100644 arch/x86/virt/vmx/tdx/seamcall.S
base-commit: 18b44bc5a67275641fb26f2c54ba7eef80ac5950
--
2.41.0
Intel Trust Domain Extensions (TDX) protects guest VMs from malicious
host and certain physical attacks. A CPU-attested software module
called 'the TDX module' runs inside a new isolated memory range as a
trusted hypervisor to manage and run protected VMs.
TDX introduces a new CPU mode: Secure Arbitration Mode (SEAM). This
mode runs only the TDX module itself or other code to load the TDX
module.
The host kernel communicates with SEAM software via a new SEAMCALL
instruction. This is conceptually similar to a guest->host hypercall,
except it is made from the host to SEAM software instead. The TDX
module establishes a new SEAMCALL ABI which allows the host to
initialize the module and to manage VMs.
The SEAMCALL ABI is very similar to the TDCALL ABI and leverages much
TDCALL infrastructure. Wire up basic functions to make SEAMCALLs for
the basic support of running TDX guests: __seamcall(), __seamcall_ret(),
and __seamcall_saved_ret() for TDH.VP.ENTER. All SEAMCALLs involved in
the basic TDX support don't use "callee-saved" registers as input and
output, except the TDH.VP.ENTER.
To start to support TDX, create a new arch/x86/virt/vmx/tdx/tdx.c for
TDX host kernel support. Add a new Kconfig option CONFIG_INTEL_TDX_HOST
to opt-in TDX host kernel support (to distinguish with TDX guest kernel
support). So far only KVM uses TDX. Make the new config option depend
on KVM_INTEL.
Signed-off-by: Kai Huang <[email protected]>
---
v2 -> v3:
- Added __seamcall_saved_ret() back for TDH.VP.ENTER, given the new
patch to adjust 'struct tdx_module_args' layout.
v1 -> v2:
- Removed __seamcall_saved_ret() and leave it to KVM TDX patches.
---
arch/x86/Kconfig | 12 +++++++
arch/x86/Makefile | 2 ++
arch/x86/include/asm/tdx.h | 7 ++++
arch/x86/virt/Makefile | 2 ++
arch/x86/virt/vmx/Makefile | 2 ++
arch/x86/virt/vmx/tdx/Makefile | 2 ++
arch/x86/virt/vmx/tdx/seamcall.S | 61 ++++++++++++++++++++++++++++++++
7 files changed, 88 insertions(+)
create mode 100644 arch/x86/virt/Makefile
create mode 100644 arch/x86/virt/vmx/Makefile
create mode 100644 arch/x86/virt/vmx/tdx/Makefile
create mode 100644 arch/x86/virt/vmx/tdx/seamcall.S
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 7422db409770..0558dd98abd7 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1949,6 +1949,18 @@ config X86_SGX
If unsure, say N.
+config INTEL_TDX_HOST
+ bool "Intel Trust Domain Extensions (TDX) host support"
+ depends on CPU_SUP_INTEL
+ depends on X86_64
+ depends on KVM_INTEL
+ help
+ Intel Trust Domain Extensions (TDX) protects guest VMs from malicious
+ host and certain physical attacks. This option enables necessary TDX
+ support in the host kernel to run confidential VMs.
+
+ If unsure, say N.
+
config EFI
bool "EFI runtime service support"
depends on ACPI
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index fdc2e3abd615..5d8d1892aae9 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -252,6 +252,8 @@ archheaders:
libs-y += arch/x86/lib/
+core-y += arch/x86/virt/
+
# drivers-y are linked after core-y
drivers-$(CONFIG_MATH_EMULATION) += arch/x86/math-emu/
drivers-$(CONFIG_PCI) += arch/x86/pci/
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 603e6d1e9d4a..a69bb7d3061b 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -72,5 +72,12 @@ static inline long tdx_kvm_hypercall(unsigned int nr, unsigned long p1,
return -ENODEV;
}
#endif /* CONFIG_INTEL_TDX_GUEST && CONFIG_KVM_GUEST */
+
+#ifdef CONFIG_INTEL_TDX_HOST
+u64 __seamcall(u64 fn, struct tdx_module_args *args);
+u64 __seamcall_ret(u64 fn, struct tdx_module_args *args);
+u64 __seamcall_saved_ret(u64 fn, struct tdx_module_args *args);
+#endif /* CONFIG_INTEL_TDX_HOST */
+
#endif /* !__ASSEMBLY__ */
#endif /* _ASM_X86_TDX_H */
diff --git a/arch/x86/virt/Makefile b/arch/x86/virt/Makefile
new file mode 100644
index 000000000000..1e36502cd738
--- /dev/null
+++ b/arch/x86/virt/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+obj-y += vmx/
diff --git a/arch/x86/virt/vmx/Makefile b/arch/x86/virt/vmx/Makefile
new file mode 100644
index 000000000000..feebda21d793
--- /dev/null
+++ b/arch/x86/virt/vmx/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+obj-$(CONFIG_INTEL_TDX_HOST) += tdx/
diff --git a/arch/x86/virt/vmx/tdx/Makefile b/arch/x86/virt/vmx/tdx/Makefile
new file mode 100644
index 000000000000..46ef8f73aebb
--- /dev/null
+++ b/arch/x86/virt/vmx/tdx/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+obj-y += seamcall.o
diff --git a/arch/x86/virt/vmx/tdx/seamcall.S b/arch/x86/virt/vmx/tdx/seamcall.S
new file mode 100644
index 000000000000..5b1f2286aea9
--- /dev/null
+++ b/arch/x86/virt/vmx/tdx/seamcall.S
@@ -0,0 +1,61 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#include <linux/linkage.h>
+#include <asm/frame.h>
+
+#include "tdxcall.S"
+
+/*
+ * __seamcall() - Host-side interface functions to SEAM software
+ * (the P-SEAMLDR or the TDX module).
+ *
+ * __seamcall() function ABI:
+ *
+ * @fn (RDI) - SEAMCALL Leaf number, moved to RAX
+ * @args (RSI) - struct tdx_module_args for input
+ *
+ * Only RCX/RDX/R8-R11 are used as input registers.
+ *
+ * Return (via RAX) TDX_SEAMCALL_VMFAILINVALID if the SEAMCALL itself
+ * fails, or the completion status of the SEAMCALL leaf function.
+ */
+SYM_FUNC_START(__seamcall)
+ TDX_MODULE_CALL host=1
+SYM_FUNC_END(__seamcall)
+
+/*
+ * __seamcall_ret() - Host-side interface functions to SEAM software
+ * (the P-SEAMLDR or the TDX module), with saving output registers to
+ * the 'struct tdx_module_args' used as input.
+ *
+ * __seamcall_ret() function ABI:
+ *
+ * @fn (RDI) - SEAMCALL Leaf number, moved to RAX
+ * @args (RSI) - struct tdx_module_args for input and output
+ *
+ * Only RCX/RDX/R8-R11 are used as input/output registers.
+ *
+ * Return (via RAX) TDX_SEAMCALL_VMFAILINVALID if the SEAMCALL itself
+ * fails, or the completion status of the SEAMCALL leaf function.
+ */
+SYM_FUNC_START(__seamcall_ret)
+ TDX_MODULE_CALL host=1 ret=1
+SYM_FUNC_END(__seamcall_ret)
+
+/*
+ * __seamcall_saved_ret() - Host-side interface functions to SEAM software
+ * (the P-SEAMLDR or the TDX module), with saving output registers to the
+ * 'struct tdx_module_args' used as input.
+ *
+ * __seamcall_saved_ret() function ABI:
+ *
+ * @fn (RDI) - SEAMCALL Leaf number, moved to RAX
+ * @args (RSI) - struct tdx_module_args for input and output
+ *
+ * All registers in @args are used as input/output registers.
+ *
+ * Return (via RAX) TDX_SEAMCALL_VMFAILINVALID if the SEAMCALL itself
+ * fails, or the completion status of the SEAMCALL leaf function.
+ */
+SYM_FUNC_START(__seamcall_saved_ret)
+ TDX_MODULE_CALL host=1 ret=1 saved=1
+SYM_FUNC_END(__seamcall_saved_ret)
--
2.41.0
Now 'struct tdx_hypercall_args' is basically 'struct tdx_module_args'
minus RCX. Although from __tdx_hypercall()'s perspective RCX isn't
used as shared register thus not part of input/output registers, it's
not worth to have a separate structure just due to one register.
Remove the 'struct tdx_hypercall_args' and use 'struct tdx_module_args'
instead in __tdx_hypercall() related code. This also saves the memory
copy between the two structures within __tdx_hypercall().
Cc: Kirill A. Shutemov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Kai Huang <[email protected]>
---
v2 -> v3:
- No change.
v1 -> v2:
- The third patch (new) split from v1 "x86/tdx: Unify TDX_HYPERCALL
and TDX_MODULE_CALL assembly"
- Rebase to 6.5-rc2.
---
arch/x86/boot/compressed/tdx.c | 4 ++--
arch/x86/coco/tdx/tdx-shared.c | 37 ++++++-------------------------
arch/x86/coco/tdx/tdx.c | 16 ++++++-------
arch/x86/include/asm/shared/tdx.h | 25 ++-------------------
4 files changed, 19 insertions(+), 63 deletions(-)
diff --git a/arch/x86/boot/compressed/tdx.c b/arch/x86/boot/compressed/tdx.c
index bc03eae2c560..8451d6a1030c 100644
--- a/arch/x86/boot/compressed/tdx.c
+++ b/arch/x86/boot/compressed/tdx.c
@@ -18,7 +18,7 @@ void __tdx_hypercall_failed(void)
static inline unsigned int tdx_io_in(int size, u16 port)
{
- struct tdx_hypercall_args args = {
+ struct tdx_module_args args = {
.r10 = TDX_HYPERCALL_STANDARD,
.r11 = hcall_func(EXIT_REASON_IO_INSTRUCTION),
.r12 = size,
@@ -34,7 +34,7 @@ static inline unsigned int tdx_io_in(int size, u16 port)
static inline void tdx_io_out(int size, u16 port, u32 value)
{
- struct tdx_hypercall_args args = {
+ struct tdx_module_args args = {
.r10 = TDX_HYPERCALL_STANDARD,
.r11 = hcall_func(EXIT_REASON_IO_INSTRUCTION),
.r12 = size,
diff --git a/arch/x86/coco/tdx/tdx-shared.c b/arch/x86/coco/tdx/tdx-shared.c
index 344b3818f4c3..78e413269791 100644
--- a/arch/x86/coco/tdx/tdx-shared.c
+++ b/arch/x86/coco/tdx/tdx-shared.c
@@ -70,45 +70,22 @@ bool tdx_accept_memory(phys_addr_t start, phys_addr_t end)
return true;
}
-noinstr u64 __tdx_hypercall(struct tdx_hypercall_args *args)
+noinstr u64 __tdx_hypercall(struct tdx_module_args *args)
{
- struct tdx_module_args margs = {
- .rcx = TDVMCALL_EXPOSE_REGS_MASK,
- .rdx = args->rdx,
- .r8 = args->r8,
- .r9 = args->r9,
- .r10 = args->r10,
- .r11 = args->r11,
- .r12 = args->r12,
- .r13 = args->r13,
- .r14 = args->r14,
- .r15 = args->r15,
- .rbx = args->rbx,
- .rdi = args->rdi,
- .rsi = args->rsi,
- };
+ /*
+ * For TDVMCALL explicitly set RCX to the bitmap of shared registers.
+ * The caller isn't expected to set @args->rcx anyway.
+ */
+ args->rcx = TDVMCALL_EXPOSE_REGS_MASK;
/*
* Failure of __tdcall_saved_ret() indicates a failure of the TDVMCALL
* mechanism itself and that something has gone horribly wrong with
* the TDX module. __tdx_hypercall_failed() never returns.
*/
- if (__tdcall_saved_ret(TDG_VP_VMCALL, &margs))
+ if (__tdcall_saved_ret(TDG_VP_VMCALL, args))
__tdx_hypercall_failed();
- args->r8 = margs.r8;
- args->r9 = margs.r9;
- args->r10 = margs.r10;
- args->r11 = margs.r11;
- args->r12 = margs.r12;
- args->r13 = margs.r13;
- args->r14 = margs.r14;
- args->r15 = margs.r15;
- args->rdi = margs.rdi;
- args->rsi = margs.rsi;
- args->rbx = margs.rbx;
- args->rdx = margs.rdx;
-
/* TDVMCALL leaf return code is in R10 */
return args->r10;
}
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index d2c2f1ccef94..619af4465cf2 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -48,7 +48,7 @@ noinstr void __tdx_hypercall_failed(void)
long tdx_kvm_hypercall(unsigned int nr, unsigned long p1, unsigned long p2,
unsigned long p3, unsigned long p4)
{
- struct tdx_hypercall_args args = {
+ struct tdx_module_args args = {
.r10 = nr,
.r11 = p1,
.r12 = p2,
@@ -108,7 +108,7 @@ EXPORT_SYMBOL_GPL(tdx_mcall_get_report0);
static void __noreturn tdx_panic(const char *msg)
{
- struct tdx_hypercall_args args = {
+ struct tdx_module_args args = {
.r10 = TDX_HYPERCALL_STANDARD,
.r11 = TDVMCALL_REPORT_FATAL_ERROR,
.r12 = 0, /* Error code: 0 is Panic */
@@ -230,7 +230,7 @@ static int ve_instr_len(struct ve_info *ve)
static u64 __cpuidle __halt(const bool irq_disabled)
{
- struct tdx_hypercall_args args = {
+ struct tdx_module_args args = {
.r10 = TDX_HYPERCALL_STANDARD,
.r11 = hcall_func(EXIT_REASON_HLT),
.r12 = irq_disabled,
@@ -274,7 +274,7 @@ void __cpuidle tdx_safe_halt(void)
static int read_msr(struct pt_regs *regs, struct ve_info *ve)
{
- struct tdx_hypercall_args args = {
+ struct tdx_module_args args = {
.r10 = TDX_HYPERCALL_STANDARD,
.r11 = hcall_func(EXIT_REASON_MSR_READ),
.r12 = regs->cx,
@@ -295,7 +295,7 @@ static int read_msr(struct pt_regs *regs, struct ve_info *ve)
static int write_msr(struct pt_regs *regs, struct ve_info *ve)
{
- struct tdx_hypercall_args args = {
+ struct tdx_module_args args = {
.r10 = TDX_HYPERCALL_STANDARD,
.r11 = hcall_func(EXIT_REASON_MSR_WRITE),
.r12 = regs->cx,
@@ -315,7 +315,7 @@ static int write_msr(struct pt_regs *regs, struct ve_info *ve)
static int handle_cpuid(struct pt_regs *regs, struct ve_info *ve)
{
- struct tdx_hypercall_args args = {
+ struct tdx_module_args args = {
.r10 = TDX_HYPERCALL_STANDARD,
.r11 = hcall_func(EXIT_REASON_CPUID),
.r12 = regs->ax,
@@ -357,7 +357,7 @@ static int handle_cpuid(struct pt_regs *regs, struct ve_info *ve)
static bool mmio_read(int size, unsigned long addr, unsigned long *val)
{
- struct tdx_hypercall_args args = {
+ struct tdx_module_args args = {
.r10 = TDX_HYPERCALL_STANDARD,
.r11 = hcall_func(EXIT_REASON_EPT_VIOLATION),
.r12 = size,
@@ -486,7 +486,7 @@ static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
static bool handle_in(struct pt_regs *regs, int size, int port)
{
- struct tdx_hypercall_args args = {
+ struct tdx_module_args args = {
.r10 = TDX_HYPERCALL_STANDARD,
.r11 = hcall_func(EXIT_REASON_IO_INSTRUCTION),
.r12 = size,
diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
index 188e8bf3332c..74fc466dfdcd 100644
--- a/arch/x86/include/asm/shared/tdx.h
+++ b/arch/x86/include/asm/shared/tdx.h
@@ -83,29 +83,8 @@ u64 __tdcall(u64 fn, struct tdx_module_args *args);
u64 __tdcall_ret(u64 fn, struct tdx_module_args *args);
u64 __tdcall_saved_ret(u64 fn, struct tdx_module_args *args);
-/*
- * Used in __tdx_hypercall() to pass down and get back registers' values of
- * the TDCALL instruction when requesting services from the VMM.
- *
- * This is a software only structure and not part of the TDX module/VMM ABI.
- */
-struct tdx_hypercall_args {
- u64 r8;
- u64 r9;
- u64 r10;
- u64 r11;
- u64 r12;
- u64 r13;
- u64 r14;
- u64 r15;
- u64 rdi;
- u64 rsi;
- u64 rbx;
- u64 rdx;
-};
-
/* Used to request services from the VMM */
-u64 __tdx_hypercall(struct tdx_hypercall_args *args);
+u64 __tdx_hypercall(struct tdx_module_args *args);
/*
* Wrapper for standard use of __tdx_hypercall with no output aside from
@@ -113,7 +92,7 @@ u64 __tdx_hypercall(struct tdx_hypercall_args *args);
*/
static inline u64 _tdx_hypercall(u64 fn, u64 r12, u64 r13, u64 r14, u64 r15)
{
- struct tdx_hypercall_args args = {
+ struct tdx_module_args args = {
.r10 = TDX_HYPERCALL_STANDARD,
.r11 = fn,
.r12 = r12,
--
2.41.0
For TDX guest, KVM needs to call __seamcall_saved_ret() to make the
TDH.VP.ENTER SEAMCALL to enter the guest, possibly taking all registers
in 'struct tdx_module_args' as input/output.
KVM caches guest's GPRs in 'kvm_vcpu_arch::regs[]', which follows the
"register index" hardware layout of x86 GPRs. On the other hand, the
__seamcall_saved_ret() takes the pointer of 'struct tdx_module_args' as
argument, thus there's a mismatch.
KVM could choose to copy input registers from 'vcpu::regs[]' to a
'struct tdx_module_args' and use that as argument to make the SEAMCALL,
but such memory copy isn't desired and should be avoided if possible.
It's not feasible to change KVM's 'vcpu::regs[]' layout due to various
reasons (e.g., emulation code uses decoded register index as array index
to access the register). Therefore, adjust 'struct tdx_module_args' to
match KVM's 'vcpu::regs[]' layout so that KVM can simply do below:
__seamcall_saved_ret(TDH_VP_ENTER,
(struct tdx_module_args *)vcpu->arch.regs);
Note RAX/RSP/RBP are not used by the TDX_MODULE_CALL assembly, but they
are necessary in order match the layout of 'struct tdx_module_args' to
KVM's 'vcpu::regs[]'. Thus add them to the structure, but name them as
*_unused. Also don't include them to asm-offset.c so that any misuse of
them in the assembly would result in build error.
Cc: Kirill A. Shutemov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Isaku Yamahata <[email protected]>
Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Kai Huang <[email protected]>
---
v2 -> v3:
- New patch
---
arch/x86/include/asm/shared/tdx.h | 19 +++++++++++++------
arch/x86/kernel/asm-offsets.c | 6 +++---
2 files changed, 16 insertions(+), 9 deletions(-)
diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
index 74fc466dfdcd..8d1427562c63 100644
--- a/arch/x86/include/asm/shared/tdx.h
+++ b/arch/x86/include/asm/shared/tdx.h
@@ -58,24 +58,31 @@
* Used in __tdcall*() to gather the input/output registers' values of the
* TDCALL instruction when requesting services from the TDX module. This is a
* software only structure and not part of the TDX module/VMM ABI
+ *
+ * Note those *_unused are not used by the TDX_MODULE_CALL assembly.
+ * The layout of this structure also matches KVM's kvm_vcpu_arch::regs[]
+ * layout, which follows the "register index" order of x86 GPRs. KVM
+ * then can simply type cast kvm_vcpu_arch::regs[] to this structure to
+ * avoid the extra memory copy between two structures when making
+ * TDH.VP.ENTER SEAMCALL.
*/
struct tdx_module_args {
- /* callee-clobbered */
+ u64 rax_unused;
u64 rcx;
u64 rdx;
+ u64 rbx;
+ u64 rsp_unused;
+ u64 rbp_unused;
+ u64 rsi;
+ u64 rdi;
u64 r8;
u64 r9;
- /* extra callee-clobbered */
u64 r10;
u64 r11;
- /* callee-saved + rdi/rsi */
u64 r12;
u64 r13;
u64 r14;
u64 r15;
- u64 rbx;
- u64 rdi;
- u64 rsi;
};
/* Used to communicate with the TDX module */
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 6913b372ccf7..e4ad822d3acd 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -70,6 +70,9 @@ static void __used common(void)
BLANK();
OFFSET(TDX_MODULE_rcx, tdx_module_args, rcx);
OFFSET(TDX_MODULE_rdx, tdx_module_args, rdx);
+ OFFSET(TDX_MODULE_rbx, tdx_module_args, rbx);
+ OFFSET(TDX_MODULE_rsi, tdx_module_args, rsi);
+ OFFSET(TDX_MODULE_rdi, tdx_module_args, rdi);
OFFSET(TDX_MODULE_r8, tdx_module_args, r8);
OFFSET(TDX_MODULE_r9, tdx_module_args, r9);
OFFSET(TDX_MODULE_r10, tdx_module_args, r10);
@@ -78,9 +81,6 @@ static void __used common(void)
OFFSET(TDX_MODULE_r13, tdx_module_args, r13);
OFFSET(TDX_MODULE_r14, tdx_module_args, r14);
OFFSET(TDX_MODULE_r15, tdx_module_args, r15);
- OFFSET(TDX_MODULE_rbx, tdx_module_args, rbx);
- OFFSET(TDX_MODULE_rdi, tdx_module_args, rdi);
- OFFSET(TDX_MODULE_rsi, tdx_module_args, rsi);
BLANK();
OFFSET(BP_scratch, boot_params, scratch);
--
2.41.0
If SEAMCALL fails with VMFailInvalid, the SEAM software (e.g., the TDX
module) won't have chance to set any output register. Skip saving the
output registers to the structure in this case.
Also, as '.Lno_output_struct' is the very last symbol before RET, rename
it to '.Lout' to make it short.
Opportunistically make the asm directives unindented.
Cc: Kirill A. Shutemov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Kai Huang <[email protected]>
---
v2 -> v3:
- No change.
v1 -> v2:
- A new patch to improve SEAMCALL VMFailInvalid failure, with v1 patch
"x86/tdx: Move FRAME_BEGIN/END to TDX_MODULE_CALL asm macro" merged.
---
arch/x86/coco/tdx/tdcall.S | 3 ---
arch/x86/virt/vmx/tdx/tdxcall.S | 29 ++++++++++++++++++++---------
2 files changed, 20 insertions(+), 12 deletions(-)
diff --git a/arch/x86/coco/tdx/tdcall.S b/arch/x86/coco/tdx/tdcall.S
index 2eca5f43734f..e5d4b7d8ecd4 100644
--- a/arch/x86/coco/tdx/tdcall.S
+++ b/arch/x86/coco/tdx/tdcall.S
@@ -78,10 +78,7 @@
* Return status of TDCALL via RAX.
*/
SYM_FUNC_START(__tdx_module_call)
- FRAME_BEGIN
TDX_MODULE_CALL host=0
- FRAME_END
- RET
SYM_FUNC_END(__tdx_module_call)
/*
diff --git a/arch/x86/virt/vmx/tdx/tdxcall.S b/arch/x86/virt/vmx/tdx/tdxcall.S
index 49a54356ae99..6bdf6e137953 100644
--- a/arch/x86/virt/vmx/tdx/tdxcall.S
+++ b/arch/x86/virt/vmx/tdx/tdxcall.S
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: GPL-2.0 */
#include <asm/asm-offsets.h>
+#include <asm/frame.h>
#include <asm/tdx.h>
/*
@@ -18,6 +19,7 @@
* TDX module.
*/
.macro TDX_MODULE_CALL host:req
+ FRAME_BEGIN
/*
* R12 will be used as temporary storage for struct tdx_module_output
* pointer. Since R12-R15 registers are not used by TDCALL/SEAMCALL
@@ -44,7 +46,7 @@
mov %rsi, %rcx
/* Leave input param 2 in RDX */
- .if \host
+.if \host
seamcall
/*
* SEAMCALL instruction is essentially a VMExit from VMX root
@@ -57,13 +59,10 @@
* This value will never be used as actual SEAMCALL error code as
* it is from the Reserved status code class.
*/
- jnc .Lno_vmfailinvalid
- mov $TDX_SEAMCALL_VMFAILINVALID, %rax
-.Lno_vmfailinvalid:
-
- .else
+ jc .Lseamcall_vmfailinvalid
+.else
tdcall
- .endif
+.endif
/*
* Fetch output pointer from stack to R12 (It is used
@@ -80,7 +79,7 @@
* Other registers may contain details of the failure.
*/
test %r12, %r12
- jz .Lno_output_struct
+ jz .Lout
/* Copy result registers to output struct: */
movq %rcx, TDX_MODULE_rcx(%r12)
@@ -90,7 +89,19 @@
movq %r10, TDX_MODULE_r10(%r12)
movq %r11, TDX_MODULE_r11(%r12)
-.Lno_output_struct:
+.Lout:
/* Restore the state of R12 register */
pop %r12
+
+ FRAME_END
+ RET
+
+.if \host
+.Lseamcall_vmfailinvalid:
+ mov $TDX_SEAMCALL_VMFAILINVALID, %rax
+ /* pop the unused output pointer back to %r9 */
+ pop %r9
+ jmp .Lout
+.endif /* \host */
+
.endm
--
2.41.0
Now the TDX_HYPERCALL asm is basically identical to the TDX_MODULE_CALL
with both '\saved' and '\ret' enabled, with two minor things though:
1) The way to restore the structure pointer is different
The TDX_HYPERCALL uses RCX as spare to restore the structure pointer,
but the TDX_MODULE_CALL assumes no spare register can be used. In other
words, TDX_MODULE_CALL already covers what TDX_HYPERCALL does.
2) TDX_MODULE_CALL only clears shared registers for TDH.VP.ENTER
For this just need to make that code available for the non-host case.
Thus, remove the TDX_HYPERCALL and reimplement the __tdx_hypercall()
using the TDX_MODULE_CALL.
Extend the TDX_MODULE_CALL to cover "clear shared registers" for
TDG.VP.VMCALL. Introduce a new __tdcall_saved_ret() to replace the
temporary __tdcall_hypercall().
The __tdcall_saved_ret() can also be used for those new TDCALLs which
require more input/output registers than the basic TDCALLs do.
Cc: Kirill A. Shutemov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Kai Huang <[email protected]>
---
v2 -> v3:
- No update
v1 -> v2:
- The second patch (new) split from v1 "x86/tdx: Unify TDX_HYPERCALL
and TDX_MODULE_CALL assembly"
- Rebase to 6.5-rc2.
---
arch/x86/coco/tdx/tdcall.S | 132 ++----------------------------
arch/x86/coco/tdx/tdx-shared.c | 4 +-
arch/x86/include/asm/shared/tdx.h | 2 +-
arch/x86/virt/vmx/tdx/tdxcall.S | 8 +-
4 files changed, 15 insertions(+), 131 deletions(-)
diff --git a/arch/x86/coco/tdx/tdcall.S b/arch/x86/coco/tdx/tdcall.S
index 086afe888e5e..21242c28b54b 100644
--- a/arch/x86/coco/tdx/tdcall.S
+++ b/arch/x86/coco/tdx/tdcall.S
@@ -48,135 +48,19 @@ SYM_FUNC_START(__tdcall_ret)
SYM_FUNC_END(__tdcall_ret)
/*
- * TDX_HYPERCALL - Make hypercalls to a TDX VMM using TDVMCALL leaf of TDCALL
- * instruction
- *
- * Transforms values in function call argument struct tdx_module_args @args
- * into the TDCALL register ABI. After TDCALL operation, VMM output is saved
- * back in @args, if \ret is 1.
- *
- * Depends on the caller to pass TDG.VP.VMCALL as the TDCALL leaf, and set
- * @args::rcx to TDVMCALL_EXPOSE_REGS_MASK.
- *
- *-------------------------------------------------------------------------
- * TD VMCALL ABI:
- *-------------------------------------------------------------------------
- *
- * Input Registers:
- *
- * RAX - TDCALL instruction leaf number (0 - TDG.VP.VMCALL)
- * RCX - BITMAP which controls which part of TD Guest GPR
- * is passed as-is to the VMM and back.
- * R10 - Set 0 to indicate TDCALL follows standard TDX ABI
- * specification. Non zero value indicates vendor
- * specific ABI.
- * R11 - VMCALL sub function number
- * RBX, RDX, RDI, RSI - Used to pass VMCALL sub function specific arguments.
- * R8-R9, R12-R15 - Same as above.
- *
- * Output Registers:
- *
- * RAX - TDCALL instruction status (Not related to hypercall
- * output).
- * RBX, RDX, RDI, RSI - Hypercall sub function specific output values.
- * R8-R15 - Same as above.
- *
- */
-.macro TDX_HYPERCALL
- FRAME_BEGIN
-
- /* Save callee-saved GPRs as mandated by the x86_64 ABI */
- push %r15
- push %r14
- push %r13
- push %r12
- push %rbx
-
- /* Move Leaf ID to RAX */
- movq %rdi, %rax
-
- /* Move bitmap of shared registers to RCX */
- movq TDX_MODULE_rcx(%rsi), %rcx
-
- /* Copy hypercall registers from arg struct: */
- movq TDX_MODULE_r8(%rsi), %r8
- movq TDX_MODULE_r9(%rsi), %r9
- movq TDX_MODULE_r10(%rsi), %r10
- movq TDX_MODULE_r11(%rsi), %r11
- movq TDX_MODULE_r12(%rsi), %r12
- movq TDX_MODULE_r13(%rsi), %r13
- movq TDX_MODULE_r14(%rsi), %r14
- movq TDX_MODULE_r15(%rsi), %r15
- movq TDX_MODULE_rdi(%rsi), %rdi
- movq TDX_MODULE_rbx(%rsi), %rbx
- movq TDX_MODULE_rdx(%rsi), %rdx
-
- pushq %rsi
- movq TDX_MODULE_rsi(%rsi), %rsi
-
- tdcall
-
- /*
- * Restore the pointer of the structure to save output registers.
- *
- * RCX is used as bitmap of shared registers and doesn't hold any
- * value provided by the VMM, thus it can be used as spare to
- * restore the structure pointer.
- */
- popq %rcx
- movq %rsi, TDX_MODULE_rsi(%rcx)
- movq %rcx, %rsi
-
- movq %r8, TDX_MODULE_r8(%rsi)
- movq %r9, TDX_MODULE_r9(%rsi)
- movq %r10, TDX_MODULE_r10(%rsi)
- movq %r11, TDX_MODULE_r11(%rsi)
- movq %r12, TDX_MODULE_r12(%rsi)
- movq %r13, TDX_MODULE_r13(%rsi)
- movq %r14, TDX_MODULE_r14(%rsi)
- movq %r15, TDX_MODULE_r15(%rsi)
- movq %rdi, TDX_MODULE_rdi(%rsi)
- movq %rbx, TDX_MODULE_rbx(%rsi)
- movq %rdx, TDX_MODULE_rdx(%rsi)
-
- /*
- * Zero out registers exposed to the VMM to avoid speculative execution
- * with VMM-controlled values. This needs to include all registers
- * present in TDVMCALL_EXPOSE_REGS_MASK, except RBX, and R12-R15 which
- * will be restored.
- */
- xor %r8d, %r8d
- xor %r9d, %r9d
- xor %r10d, %r10d
- xor %r11d, %r11d
- xor %rdi, %rdi
- xor %rsi, %rsi
- xor %rdx, %rdx
-
- /* Restore callee-saved GPRs as mandated by the x86_64 ABI */
- pop %rbx
- pop %r12
- pop %r13
- pop %r14
- pop %r15
-
- FRAME_END
-
- RET
-.endm
-
-/*
+ * __tdcall_saved_ret() - Used by TDX guests to request services from the
+ * TDX module (including VMM services) using TDCALL instruction, with
+ * saving output registers to the 'struct tdx_module_args' used as input.
*
- * __tdcall_hypercall() function ABI:
+ * __tdcall_saved_ret() function ABI:
*
* @fn (RDI) - TDCALL leaf ID, moved to RAX
* @args (RSI) - struct tdx_module_args for input/output
*
- * @fn and @args::rcx from the caller must be TDG_VP_VMCALL and
- * TDVMCALL_EXPOSE_REGS_MASK respectively.
+ * All registers in @args are used as input/output registers.
*
* On successful completion, return the hypercall error code.
*/
-SYM_FUNC_START(__tdcall_hypercall)
- TDX_HYPERCALL
-SYM_FUNC_END(__tdcall_hypercall)
+SYM_FUNC_START(__tdcall_saved_ret)
+ TDX_MODULE_CALL host=0 ret=1 saved=1
+SYM_FUNC_END(__tdcall_saved_ret)
diff --git a/arch/x86/coco/tdx/tdx-shared.c b/arch/x86/coco/tdx/tdx-shared.c
index b47c8cce91b0..344b3818f4c3 100644
--- a/arch/x86/coco/tdx/tdx-shared.c
+++ b/arch/x86/coco/tdx/tdx-shared.c
@@ -89,11 +89,11 @@ noinstr u64 __tdx_hypercall(struct tdx_hypercall_args *args)
};
/*
- * Failure of __tdcall_hypercall() indicates a failure of the TDVMCALL
+ * Failure of __tdcall_saved_ret() indicates a failure of the TDVMCALL
* mechanism itself and that something has gone horribly wrong with
* the TDX module. __tdx_hypercall_failed() never returns.
*/
- if (__tdcall_hypercall(TDG_VP_VMCALL, &margs))
+ if (__tdcall_saved_ret(TDG_VP_VMCALL, &margs))
__tdx_hypercall_failed();
args->r8 = margs.r8;
diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
index da5205daa53d..188e8bf3332c 100644
--- a/arch/x86/include/asm/shared/tdx.h
+++ b/arch/x86/include/asm/shared/tdx.h
@@ -81,6 +81,7 @@ struct tdx_module_args {
/* Used to communicate with the TDX module */
u64 __tdcall(u64 fn, struct tdx_module_args *args);
u64 __tdcall_ret(u64 fn, struct tdx_module_args *args);
+u64 __tdcall_saved_ret(u64 fn, struct tdx_module_args *args);
/*
* Used in __tdx_hypercall() to pass down and get back registers' values of
@@ -104,7 +105,6 @@ struct tdx_hypercall_args {
};
/* Used to request services from the VMM */
-u64 __tdcall_hypercall(u64 fn, struct tdx_module_args *args);
u64 __tdx_hypercall(struct tdx_hypercall_args *args);
/*
diff --git a/arch/x86/virt/vmx/tdx/tdxcall.S b/arch/x86/virt/vmx/tdx/tdxcall.S
index 42f9225a530d..3ed6d8b8d2a9 100644
--- a/arch/x86/virt/vmx/tdx/tdxcall.S
+++ b/arch/x86/virt/vmx/tdx/tdxcall.S
@@ -143,10 +143,10 @@
movq %r11, TDX_MODULE_r11(%rsi)
.endif /* \ret */
-.if \host && \saved && \ret
+.if \saved && \ret
/*
- * Clear registers shared by guest for VP.ENTER to prevent
- * speculative use of guest's values, including those are
+ * Clear registers shared by guest for VP.VMCALL/VP.ENTER to prevent
+ * speculative use of guest's/VMM's values, including those are
* restored from the stack.
*
* See arch/x86/kvm/vmx/vmenter.S:
@@ -171,7 +171,7 @@
xorl %r15d, %r15d
xorl %ebx, %ebx
xorl %edi, %edi
-.endif /* \host && \ret && \host */
+.endif /* \ret && \host */
.if \host
.Lout\@:
--
2.41.0
TDX memory has integrity and confidentiality protections. Violations of
this integrity protection are supposed to only affect TDX operations and
are never supposed to affect the host kernel itself. In other words,
the host kernel should never, itself, see machine checks induced by the
TDX integrity hardware.
Alas, the first few generations of TDX hardware have an erratum. A
partial write to a TDX private memory cacheline will silently "poison"
the line. Subsequent reads will consume the poison and generate a
machine check. According to the TDX hardware spec, neither of these
things should have happened.
On the platform with this erratum, it would be nice if the #MC handler
could print additional information to show the #MC was TDX private
memory error due to possible kernel bug.
To do that, the machine check handler needs to use SEAMCALL to query
page type of the error memory from the TDX module, because there's no
existing infrastructure to track TDX private pages.
SEAMCALL instruction causes #UD if CPU isn't in VMX operation. In #MC
handler, it is legal that CPU isn't in VMX operation when making this
SEAMCALL. Extend the TDX_MODULE_CALL macro to handle #UD so the
SEAMCALL can return error code instead of Oops in the #MC handler.
Opportunistically handles #GP too since they share the same code.
A bonus is when kernel mistakenly calls SEAMCALL when CPU isn't in VMX
operation, or when TDX isn't enabled by the BIOS, or when the BIOS is
buggy, the kernel can get a nicer error message rather than a less
understandable Oops.
This is basically based on Peter's code.
Cc: Kirill A. Shutemov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Kai Huang <[email protected]>
---
v2 -> v3:
- Added more material to the changelog to explain the TDX erratum
(hoping it can be merged together with this series).
- Changed to use %rdi to hold $TDX_SW_ERROR instead of %r12 as the
latter is callee-saved.
v1 -> v2:
- Skip saving output registers when SEAMCALL #UD/#GP
---
arch/x86/include/asm/tdx.h | 4 ++++
arch/x86/virt/vmx/tdx/tdxcall.S | 19 +++++++++++++++++++
2 files changed, 23 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index a69bb7d3061b..adcbe3f1de30 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -8,6 +8,7 @@
#include <asm/errno.h>
#include <asm/ptrace.h>
+#include <asm/trapnr.h>
#include <asm/shared/tdx.h>
/*
@@ -20,6 +21,9 @@
#define TDX_SW_ERROR (TDX_ERROR | GENMASK_ULL(47, 40))
#define TDX_SEAMCALL_VMFAILINVALID (TDX_SW_ERROR | _UL(0xFFFF0000))
+#define TDX_SEAMCALL_GP (TDX_SW_ERROR | X86_TRAP_GP)
+#define TDX_SEAMCALL_UD (TDX_SW_ERROR | X86_TRAP_UD)
+
#ifndef __ASSEMBLY__
/*
diff --git a/arch/x86/virt/vmx/tdx/tdxcall.S b/arch/x86/virt/vmx/tdx/tdxcall.S
index 3ed6d8b8d2a9..ccb7ff08078a 100644
--- a/arch/x86/virt/vmx/tdx/tdxcall.S
+++ b/arch/x86/virt/vmx/tdx/tdxcall.S
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0 */
#include <asm/asm-offsets.h>
#include <asm/frame.h>
+#include <asm/asm.h>
#include <asm/tdx.h>
/*
@@ -85,6 +86,7 @@
.endif /* \saved */
.if \host
+.Lseamcall\@:
seamcall
/*
* SEAMCALL instruction is essentially a VMExit from VMX root
@@ -192,11 +194,28 @@
.if \host
.Lseamcall_vmfailinvalid\@:
mov $TDX_SEAMCALL_VMFAILINVALID, %rax
+ jmp .Lseamcall_fail\@
+
+.Lseamcall_trap\@:
+ /*
+ * SEAMCALL caused #GP or #UD. By reaching here %eax contains
+ * the trap number. Convert the trap number to the TDX error
+ * code by setting TDX_SW_ERROR to the high 32-bits of %rax.
+ *
+ * Note cannot OR TDX_SW_ERROR directly to %rax as OR instruction
+ * only accepts 32-bit immediate at most.
+ */
+ movq $TDX_SW_ERROR, %rdi
+ orq %rdi, %rax
+
+.Lseamcall_fail\@:
.if \ret && \saved
/* pop the unused structure pointer back to %rsi */
popq %rsi
.endif
jmp .Lout\@
+
+ _ASM_EXTABLE_FAULT(.Lseamcall\@, .Lseamcall_trap\@)
.endif /* \host */
.endm
--
2.41.0
The TDX spec names all TDCALLs with prefix "TDG". Currently, the kernel
doesn't follow such convention for the macros of those TDCALLs but uses
prefix "TDX_" for all of them. Although it's arguable whether the TDX
spec names those TDCALLs properly, it's better for the kernel to follow
the spec when naming those macros.
Change all macros of TDCALLs to make them consistent with the spec. As
a bonus, they get distinguished easily from the host-side SEAMCALLs,
which all have prefix "TDH".
No functional change intended.
Signed-off-by: Kai Huang <[email protected]>
---
v2 -> v3:
- No change.
v1 -> v2:
- Rebase to 6.5-rc2.
---
arch/x86/coco/tdx/tdx-shared.c | 4 ++--
arch/x86/coco/tdx/tdx.c | 8 ++++----
arch/x86/include/asm/shared/tdx.h | 10 +++++-----
3 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/arch/x86/coco/tdx/tdx-shared.c b/arch/x86/coco/tdx/tdx-shared.c
index ef20ddc37b58..f10cd3e4a04e 100644
--- a/arch/x86/coco/tdx/tdx-shared.c
+++ b/arch/x86/coco/tdx/tdx-shared.c
@@ -35,7 +35,7 @@ static unsigned long try_accept_one(phys_addr_t start, unsigned long len,
}
tdcall_rcx = start | page_size;
- if (__tdx_module_call(TDX_ACCEPT_PAGE, tdcall_rcx, 0, 0, 0, NULL))
+ if (__tdx_module_call(TDG_MEM_PAGE_ACCEPT, tdcall_rcx, 0, 0, 0, NULL))
return 0;
return accept_size;
@@ -45,7 +45,7 @@ bool tdx_accept_memory(phys_addr_t start, phys_addr_t end)
{
/*
* For shared->private conversion, accept the page using
- * TDX_ACCEPT_PAGE TDX module call.
+ * TDG_MEM_PAGE_ACCEPT TDX module call.
*/
while (start < end) {
unsigned long len = end - start;
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 1d6b863c42b0..05785df66b1c 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -91,7 +91,7 @@ int tdx_mcall_get_report0(u8 *reportdata, u8 *tdreport)
{
u64 ret;
- ret = __tdx_module_call(TDX_GET_REPORT, virt_to_phys(tdreport),
+ ret = __tdx_module_call(TDG_MR_REPORT, virt_to_phys(tdreport),
virt_to_phys(reportdata), TDREPORT_SUBTYPE_0,
0, NULL);
if (ret) {
@@ -152,7 +152,7 @@ static void tdx_parse_tdinfo(u64 *cc_mask)
* Guest-Host-Communication Interface (GHCI), section 2.4.2 TDCALL
* [TDG.VP.INFO].
*/
- tdx_module_call(TDX_GET_INFO, 0, 0, 0, 0, &out);
+ tdx_module_call(TDG_VP_INFO, 0, 0, 0, 0, &out);
/*
* The highest bit of a guest physical address is the "sharing" bit.
@@ -594,7 +594,7 @@ void tdx_get_ve_info(struct ve_info *ve)
* Note, the TDX module treats virtual NMIs as inhibited if the #VE
* valid flag is set. It means that NMI=>#VE will not result in a #DF.
*/
- tdx_module_call(TDX_GET_VEINFO, 0, 0, 0, 0, &out);
+ tdx_module_call(TDG_VP_VEINFO_GET, 0, 0, 0, 0, &out);
/* Transfer the output parameters */
ve->exit_reason = out.rcx;
@@ -774,7 +774,7 @@ void __init tdx_early_init(void)
cc_set_mask(cc_mask);
/* Kernel does not use NOTIFY_ENABLES and does not need random #VEs */
- tdx_module_call(TDX_WR, 0, TDCS_NOTIFY_ENABLES, 0, -1ULL, NULL);
+ tdx_module_call(TDG_VM_WR, 0, TDCS_NOTIFY_ENABLES, 0, -1ULL, NULL);
/*
* All bits above GPA width are reserved and kernel treats shared bit
diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
index 7513b3bb69b7..78f109446da6 100644
--- a/arch/x86/include/asm/shared/tdx.h
+++ b/arch/x86/include/asm/shared/tdx.h
@@ -11,11 +11,11 @@
#define TDX_IDENT "IntelTDX "
/* TDX module Call Leaf IDs */
-#define TDX_GET_INFO 1
-#define TDX_GET_VEINFO 3
-#define TDX_GET_REPORT 4
-#define TDX_ACCEPT_PAGE 6
-#define TDX_WR 8
+#define TDG_VP_INFO 1
+#define TDG_VP_VEINFO_GET 3
+#define TDG_MR_REPORT 4
+#define TDG_MEM_PAGE_ACCEPT 6
+#define TDG_VM_WR 8
/* TDCS fields. To be used by TDG.VM.WR and TDG.VM.RD module calls */
#define TDCS_NOTIFY_ENABLES 0x9100000000000010
--
2.41.0
In the TDX_HYPERCALL asm, after the TDCALL instruction returns from the
untrusted VMM, the registers that the TDX guest shares to the VMM need
to be cleared to avoid speculative execution of VMM-provided values.
RSI is specified in the bitmap of those registers, but it is missing
when zeroing out those registers in the current TDX_HYPERCALL.
It was there when it was originally added in commit 752d13305c78
("x86/tdx: Expand __tdx_hypercall() to handle more arguments"), but was
later removed in commit 1e70c680375a ("x86/tdx: Do not corrupt
frame-pointer in __tdx_hypercall()"), which was correct because %rsi is
later restored in the "pop %rsi". However a later commit 7a3a401874be
("x86/tdx: Drop flags from __tdx_hypercall()") removed that "pop %rsi"
but forgot to add the "xor %rsi, %rsi" back.
Fix by adding it back.
Fixes: 7a3a401874be ("x86/tdx: Drop flags from __tdx_hypercall()")
Signed-off-by: Kai Huang <[email protected]>
Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
---
arch/x86/coco/tdx/tdcall.S | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/coco/tdx/tdcall.S b/arch/x86/coco/tdx/tdcall.S
index b193c0a1d8db..2eca5f43734f 100644
--- a/arch/x86/coco/tdx/tdcall.S
+++ b/arch/x86/coco/tdx/tdcall.S
@@ -195,6 +195,7 @@ SYM_FUNC_END(__tdx_module_call)
xor %r10d, %r10d
xor %r11d, %r11d
xor %rdi, %rdi
+ xor %rsi, %rsi
xor %rdx, %rdx
/* Restore callee-saved GPRs as mandated by the x86_64 ABI */
--
2.41.0
__tdx_module_call() is only used by the TDX guest to issue TDCALL to the
TDX module. Rename it to __tdcall() to match its behaviour, e.g., it
cannot be used to make host-side SEAMCALL.
Also rename tdx_module_call() which is a wrapper of __tdx_module_call()
to tdcall().
No functional change intended.
Signed-off-by: Kai Huang <[email protected]>
---
v2 -> v3:
- No change.
v1 -> v2:
- Rebase to 6.5-rc2.
---
arch/x86/coco/tdx/tdcall.S | 10 +++++-----
arch/x86/coco/tdx/tdx-shared.c | 2 +-
arch/x86/coco/tdx/tdx.c | 18 +++++++++---------
arch/x86/include/asm/shared/tdx.h | 4 ++--
4 files changed, 17 insertions(+), 17 deletions(-)
diff --git a/arch/x86/coco/tdx/tdcall.S b/arch/x86/coco/tdx/tdcall.S
index e5d4b7d8ecd4..6aebac08f2bf 100644
--- a/arch/x86/coco/tdx/tdcall.S
+++ b/arch/x86/coco/tdx/tdcall.S
@@ -40,8 +40,8 @@
.section .noinstr.text, "ax"
/*
- * __tdx_module_call() - Used by TDX guests to request services from
- * the TDX module (does not include VMM services) using TDCALL instruction.
+ * __tdcall() - Used by TDX guests to request services from the TDX
+ * module (does not include VMM services) using TDCALL instruction.
*
* Transforms function call register arguments into the TDCALL register ABI.
* After TDCALL operation, TDX module output is saved in @out (if it is
@@ -62,7 +62,7 @@
*
*-------------------------------------------------------------------------
*
- * __tdx_module_call() function ABI:
+ * __tdcall() function ABI:
*
* @fn (RDI) - TDCALL Leaf ID, moved to RAX
* @rcx (RSI) - Input parameter 1, moved to RCX
@@ -77,9 +77,9 @@
*
* Return status of TDCALL via RAX.
*/
-SYM_FUNC_START(__tdx_module_call)
+SYM_FUNC_START(__tdcall)
TDX_MODULE_CALL host=0
-SYM_FUNC_END(__tdx_module_call)
+SYM_FUNC_END(__tdcall)
/*
* TDX_HYPERCALL - Make hypercalls to a TDX VMM using TDVMCALL leaf of TDCALL
diff --git a/arch/x86/coco/tdx/tdx-shared.c b/arch/x86/coco/tdx/tdx-shared.c
index f10cd3e4a04e..90631abdac34 100644
--- a/arch/x86/coco/tdx/tdx-shared.c
+++ b/arch/x86/coco/tdx/tdx-shared.c
@@ -35,7 +35,7 @@ static unsigned long try_accept_one(phys_addr_t start, unsigned long len,
}
tdcall_rcx = start | page_size;
- if (__tdx_module_call(TDG_MEM_PAGE_ACCEPT, tdcall_rcx, 0, 0, 0, NULL))
+ if (__tdcall(TDG_MEM_PAGE_ACCEPT, tdcall_rcx, 0, 0, 0, NULL))
return 0;
return accept_size;
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 05785df66b1c..8c13944925e3 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -66,10 +66,10 @@ EXPORT_SYMBOL_GPL(tdx_kvm_hypercall);
* should only be used for calls that have no legitimate reason to fail
* or where the kernel can not survive the call failing.
*/
-static inline void tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
- struct tdx_module_output *out)
+static inline void tdcall(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
+ struct tdx_module_output *out)
{
- if (__tdx_module_call(fn, rcx, rdx, r8, r9, out))
+ if (__tdcall(fn, rcx, rdx, r8, r9, out))
panic("TDCALL %lld failed (Buggy TDX module!)\n", fn);
}
@@ -91,9 +91,9 @@ int tdx_mcall_get_report0(u8 *reportdata, u8 *tdreport)
{
u64 ret;
- ret = __tdx_module_call(TDG_MR_REPORT, virt_to_phys(tdreport),
- virt_to_phys(reportdata), TDREPORT_SUBTYPE_0,
- 0, NULL);
+ ret = __tdcall(TDG_MR_REPORT, virt_to_phys(tdreport),
+ virt_to_phys(reportdata), TDREPORT_SUBTYPE_0,
+ 0, NULL);
if (ret) {
if (TDCALL_RETURN_CODE(ret) == TDCALL_INVALID_OPERAND)
return -EINVAL;
@@ -152,7 +152,7 @@ static void tdx_parse_tdinfo(u64 *cc_mask)
* Guest-Host-Communication Interface (GHCI), section 2.4.2 TDCALL
* [TDG.VP.INFO].
*/
- tdx_module_call(TDG_VP_INFO, 0, 0, 0, 0, &out);
+ tdcall(TDG_VP_INFO, 0, 0, 0, 0, &out);
/*
* The highest bit of a guest physical address is the "sharing" bit.
@@ -594,7 +594,7 @@ void tdx_get_ve_info(struct ve_info *ve)
* Note, the TDX module treats virtual NMIs as inhibited if the #VE
* valid flag is set. It means that NMI=>#VE will not result in a #DF.
*/
- tdx_module_call(TDG_VP_VEINFO_GET, 0, 0, 0, 0, &out);
+ tdcall(TDG_VP_VEINFO_GET, 0, 0, 0, 0, &out);
/* Transfer the output parameters */
ve->exit_reason = out.rcx;
@@ -774,7 +774,7 @@ void __init tdx_early_init(void)
cc_set_mask(cc_mask);
/* Kernel does not use NOTIFY_ENABLES and does not need random #VEs */
- tdx_module_call(TDG_VM_WR, 0, TDCS_NOTIFY_ENABLES, 0, -1ULL, NULL);
+ tdcall(TDG_VM_WR, 0, TDCS_NOTIFY_ENABLES, 0, -1ULL, NULL);
/*
* All bits above GPA width are reserved and kernel treats shared bit
diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
index 78f109446da6..9e3699b751ef 100644
--- a/arch/x86/include/asm/shared/tdx.h
+++ b/arch/x86/include/asm/shared/tdx.h
@@ -88,8 +88,8 @@ struct tdx_module_output {
};
/* Used to communicate with the TDX module */
-u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
- struct tdx_module_output *out);
+u64 __tdcall(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
+ struct tdx_module_output *out);
bool tdx_accept_memory(phys_addr_t start, phys_addr_t end);
--
2.41.0
On Wed, Jul 26, 2023 at 11:25:03PM +1200, Kai Huang wrote:
> In the TDX_HYPERCALL asm, after the TDCALL instruction returns from the
> untrusted VMM, the registers that the TDX guest shares to the VMM need
> to be cleared to avoid speculative execution of VMM-provided values.
>
> RSI is specified in the bitmap of those registers, but it is missing
> when zeroing out those registers in the current TDX_HYPERCALL.
>
> It was there when it was originally added in commit 752d13305c78
> ("x86/tdx: Expand __tdx_hypercall() to handle more arguments"), but was
> later removed in commit 1e70c680375a ("x86/tdx: Do not corrupt
> frame-pointer in __tdx_hypercall()"), which was correct because %rsi is
> later restored in the "pop %rsi". However a later commit 7a3a401874be
> ("x86/tdx: Drop flags from __tdx_hypercall()") removed that "pop %rsi"
> but forgot to add the "xor %rsi, %rsi" back.
>
> Fix by adding it back.
>
> Fixes: 7a3a401874be ("x86/tdx: Drop flags from __tdx_hypercall()")
> Signed-off-by: Kai Huang <[email protected]>
> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
Reviewed-by: Kirill A. Shutemov <[email protected]>
--
Kiryl Shutsemau / Kirill A. Shutemov
On Wed, Jul 26, 2023 at 11:25:05PM +1200, Kai Huang wrote:
> The TDX spec names all TDCALLs with prefix "TDG". Currently, the kernel
> doesn't follow such convention for the macros of those TDCALLs but uses
> prefix "TDX_" for all of them. Although it's arguable whether the TDX
> spec names those TDCALLs properly, it's better for the kernel to follow
> the spec when naming those macros.
>
> Change all macros of TDCALLs to make them consistent with the spec. As
> a bonus, they get distinguished easily from the host-side SEAMCALLs,
> which all have prefix "TDH".
>
> No functional change intended.
>
> Signed-off-by: Kai Huang <[email protected]>
I remember we had few back-and-forth on naming this defines. I think
matching names to the spec is helpful for future readers:
Reviewed-by: Kirill A. Shutemov <[email protected]>
--
Kiryl Shutsemau / Kirill A. Shutemov
On Wed, Jul 26, 2023 at 11:25:06PM +1200, Kai Huang wrote:
> __tdx_module_call() is only used by the TDX guest to issue TDCALL to the
> TDX module. Rename it to __tdcall() to match its behaviour, e.g., it
> cannot be used to make host-side SEAMCALL.
>
> Also rename tdx_module_call() which is a wrapper of __tdx_module_call()
> to tdcall().
>
> No functional change intended.
>
> Signed-off-by: Kai Huang <[email protected]>
Fair enough:
Reviewed-by: Kirill A. Shutemov <[email protected]>
--
Kiryl Shutsemau / Kirill A. Shutemov
On Wed, Jul 26, 2023 at 11:25:04PM +1200, Kai Huang wrote:
> If SEAMCALL fails with VMFailInvalid, the SEAM software (e.g., the TDX
> module) won't have chance to set any output register. Skip saving the
> output registers to the structure in this case.
>
> Also, as '.Lno_output_struct' is the very last symbol before RET, rename
> it to '.Lout' to make it short.
>
> Opportunistically make the asm directives unindented.
>
> Cc: Kirill A. Shutemov <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Suggested-by: Peter Zijlstra <[email protected]>
> Signed-off-by: Kai Huang <[email protected]>
> ---
>
> v2 -> v3:
> - No change.
>
> v1 -> v2:
> - A new patch to improve SEAMCALL VMFailInvalid failure, with v1 patch
> "x86/tdx: Move FRAME_BEGIN/END to TDX_MODULE_CALL asm macro" merged.
>
> ---
> arch/x86/coco/tdx/tdcall.S | 3 ---
> arch/x86/virt/vmx/tdx/tdxcall.S | 29 ++++++++++++++++++++---------
> 2 files changed, 20 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/coco/tdx/tdcall.S b/arch/x86/coco/tdx/tdcall.S
> index 2eca5f43734f..e5d4b7d8ecd4 100644
> --- a/arch/x86/coco/tdx/tdcall.S
> +++ b/arch/x86/coco/tdx/tdcall.S
> @@ -78,10 +78,7 @@
> * Return status of TDCALL via RAX.
> */
> SYM_FUNC_START(__tdx_module_call)
> - FRAME_BEGIN
> TDX_MODULE_CALL host=0
> - FRAME_END
> - RET
> SYM_FUNC_END(__tdx_module_call)
>
Do we still need to include asm/frame.h after the change?
--
Kiryl Shutsemau / Kirill A. Shutemov
On Thu, 2023-07-27 at 15:52 +0300, [email protected] wrote:
> On Wed, Jul 26, 2023 at 11:25:04PM +1200, Kai Huang wrote:
> > If SEAMCALL fails with VMFailInvalid, the SEAM software (e.g., the TDX
> > module) won't have chance to set any output register. Skip saving the
> > output registers to the structure in this case.
> >
> > Also, as '.Lno_output_struct' is the very last symbol before RET, rename
> > it to '.Lout' to make it short.
> >
> > Opportunistically make the asm directives unindented.
> >
> > Cc: Kirill A. Shutemov <[email protected]>
> > Cc: Dave Hansen <[email protected]>
> > Cc: Peter Zijlstra <[email protected]>
> > Suggested-by: Peter Zijlstra <[email protected]>
> > Signed-off-by: Kai Huang <[email protected]>
> > ---
> >
> > v2 -> v3:
> > - No change.
> >
> > v1 -> v2:
> > - A new patch to improve SEAMCALL VMFailInvalid failure, with v1 patch
> > "x86/tdx: Move FRAME_BEGIN/END to TDX_MODULE_CALL asm macro" merged.
> >
> > ---
> > arch/x86/coco/tdx/tdcall.S | 3 ---
> > arch/x86/virt/vmx/tdx/tdxcall.S | 29 ++++++++++++++++++++---------
> > 2 files changed, 20 insertions(+), 12 deletions(-)
> >
> > diff --git a/arch/x86/coco/tdx/tdcall.S b/arch/x86/coco/tdx/tdcall.S
> > index 2eca5f43734f..e5d4b7d8ecd4 100644
> > --- a/arch/x86/coco/tdx/tdcall.S
> > +++ b/arch/x86/coco/tdx/tdcall.S
> > @@ -78,10 +78,7 @@
> > * Return status of TDCALL via RAX.
> > */
> > SYM_FUNC_START(__tdx_module_call)
> > - FRAME_BEGIN
> > TDX_MODULE_CALL host=0
> > - FRAME_END
> > - RET
> > SYM_FUNC_END(__tdx_module_call)
> >
>
> Do we still need to include asm/frame.h after the change?
>
For this patch yes because we still have TDX_HYPERCALL macro here.
We probably can remove it in the patch where TDX_HYPERCALL gets removed.
On 7/26/23 4:25 AM, Kai Huang wrote:
> The TDX spec names all TDCALLs with prefix "TDG". Currently, the kernel
> doesn't follow such convention for the macros of those TDCALLs but uses
> prefix "TDX_" for all of them. Although it's arguable whether the TDX
> spec names those TDCALLs properly, it's better for the kernel to follow
> the spec when naming those macros.
>
> Change all macros of TDCALLs to make them consistent with the spec. As
> a bonus, they get distinguished easily from the host-side SEAMCALLs,
> which all have prefix "TDH".
>
> No functional change intended.
>
When upstreaming the TDX guest patches, there was a discussion about using
TDG vs TDX. Final agreement is to use TDX_ prefix. I think it makes sense
to align with the spec, but it is up to the maintainer.
What about the function name prefix? Are you planning to change them to tdg_*?
Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
> Signed-off-by: Kai Huang <[email protected]>
> ---
>
> v2 -> v3:
> - No change.
>
> v1 -> v2:
> - Rebase to 6.5-rc2.
>
> ---
> arch/x86/coco/tdx/tdx-shared.c | 4 ++--
> arch/x86/coco/tdx/tdx.c | 8 ++++----
> arch/x86/include/asm/shared/tdx.h | 10 +++++-----
> 3 files changed, 11 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/coco/tdx/tdx-shared.c b/arch/x86/coco/tdx/tdx-shared.c
> index ef20ddc37b58..f10cd3e4a04e 100644
> --- a/arch/x86/coco/tdx/tdx-shared.c
> +++ b/arch/x86/coco/tdx/tdx-shared.c
> @@ -35,7 +35,7 @@ static unsigned long try_accept_one(phys_addr_t start, unsigned long len,
> }
>
> tdcall_rcx = start | page_size;
> - if (__tdx_module_call(TDX_ACCEPT_PAGE, tdcall_rcx, 0, 0, 0, NULL))
> + if (__tdx_module_call(TDG_MEM_PAGE_ACCEPT, tdcall_rcx, 0, 0, 0, NULL))
> return 0;
>
> return accept_size;
> @@ -45,7 +45,7 @@ bool tdx_accept_memory(phys_addr_t start, phys_addr_t end)
> {
> /*
> * For shared->private conversion, accept the page using
> - * TDX_ACCEPT_PAGE TDX module call.
> + * TDG_MEM_PAGE_ACCEPT TDX module call.
> */
> while (start < end) {
> unsigned long len = end - start;
> diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
> index 1d6b863c42b0..05785df66b1c 100644
> --- a/arch/x86/coco/tdx/tdx.c
> +++ b/arch/x86/coco/tdx/tdx.c
> @@ -91,7 +91,7 @@ int tdx_mcall_get_report0(u8 *reportdata, u8 *tdreport)
> {
> u64 ret;
>
> - ret = __tdx_module_call(TDX_GET_REPORT, virt_to_phys(tdreport),
> + ret = __tdx_module_call(TDG_MR_REPORT, virt_to_phys(tdreport),
> virt_to_phys(reportdata), TDREPORT_SUBTYPE_0,
> 0, NULL);
> if (ret) {
> @@ -152,7 +152,7 @@ static void tdx_parse_tdinfo(u64 *cc_mask)
> * Guest-Host-Communication Interface (GHCI), section 2.4.2 TDCALL
> * [TDG.VP.INFO].
> */
> - tdx_module_call(TDX_GET_INFO, 0, 0, 0, 0, &out);
> + tdx_module_call(TDG_VP_INFO, 0, 0, 0, 0, &out);
>
> /*
> * The highest bit of a guest physical address is the "sharing" bit.
> @@ -594,7 +594,7 @@ void tdx_get_ve_info(struct ve_info *ve)
> * Note, the TDX module treats virtual NMIs as inhibited if the #VE
> * valid flag is set. It means that NMI=>#VE will not result in a #DF.
> */
> - tdx_module_call(TDX_GET_VEINFO, 0, 0, 0, 0, &out);
> + tdx_module_call(TDG_VP_VEINFO_GET, 0, 0, 0, 0, &out);
>
> /* Transfer the output parameters */
> ve->exit_reason = out.rcx;
> @@ -774,7 +774,7 @@ void __init tdx_early_init(void)
> cc_set_mask(cc_mask);
>
> /* Kernel does not use NOTIFY_ENABLES and does not need random #VEs */
> - tdx_module_call(TDX_WR, 0, TDCS_NOTIFY_ENABLES, 0, -1ULL, NULL);
> + tdx_module_call(TDG_VM_WR, 0, TDCS_NOTIFY_ENABLES, 0, -1ULL, NULL);
>
> /*
> * All bits above GPA width are reserved and kernel treats shared bit
> diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
> index 7513b3bb69b7..78f109446da6 100644
> --- a/arch/x86/include/asm/shared/tdx.h
> +++ b/arch/x86/include/asm/shared/tdx.h
> @@ -11,11 +11,11 @@
> #define TDX_IDENT "IntelTDX "
>
> /* TDX module Call Leaf IDs */
> -#define TDX_GET_INFO 1
> -#define TDX_GET_VEINFO 3
> -#define TDX_GET_REPORT 4
> -#define TDX_ACCEPT_PAGE 6
> -#define TDX_WR 8
> +#define TDG_VP_INFO 1
> +#define TDG_VP_VEINFO_GET 3
> +#define TDG_MR_REPORT 4
> +#define TDG_MEM_PAGE_ACCEPT 6
> +#define TDG_VM_WR 8
>
> /* TDCS fields. To be used by TDG.VM.WR and TDG.VM.RD module calls */
> #define TDCS_NOTIFY_ENABLES 0x9100000000000010
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
On Thu, 2023-07-27 at 18:54 -0700, Sathyanarayanan Kuppuswamy wrote:
>
> On 7/26/23 4:25 AM, Kai Huang wrote:
> > The TDX spec names all TDCALLs with prefix "TDG". Currently, the kernel
> > doesn't follow such convention for the macros of those TDCALLs but uses
> > prefix "TDX_" for all of them. Although it's arguable whether the TDX
> > spec names those TDCALLs properly, it's better for the kernel to follow
> > the spec when naming those macros.
> >
> > Change all macros of TDCALLs to make them consistent with the spec. As
> > a bonus, they get distinguished easily from the host-side SEAMCALLs,
> > which all have prefix "TDH".
> >
> > No functional change intended.
> >
>
> When upstreaming the TDX guest patches, there was a discussion about using
> TDG vs TDX. Final agreement is to use TDX_ prefix. I think it makes sense
> to align with the spec, but it is up to the maintainer.
>
> What about the function name prefix? Are you planning to change them to tdg_*?
I changed to __tdcall() in the next patch.
>
> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
Thanks.
On 7/26/23 4:25 AM, Kai Huang wrote:
> __tdx_module_call() is only used by the TDX guest to issue TDCALL to the
> TDX module. Rename it to __tdcall() to match its behaviour, e.g., it
> cannot be used to make host-side SEAMCALL.
>
> Also rename tdx_module_call() which is a wrapper of __tdx_module_call()
> to tdcall().
>
> No functional change intended.
>
> Signed-off-by: Kai Huang <[email protected]>
> ---
Looks fine to me.
Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
>
> v2 -> v3:
> - No change.
>
> v1 -> v2:
> - Rebase to 6.5-rc2.
>
> ---
> arch/x86/coco/tdx/tdcall.S | 10 +++++-----
> arch/x86/coco/tdx/tdx-shared.c | 2 +-
> arch/x86/coco/tdx/tdx.c | 18 +++++++++---------
> arch/x86/include/asm/shared/tdx.h | 4 ++--
> 4 files changed, 17 insertions(+), 17 deletions(-)
>
> diff --git a/arch/x86/coco/tdx/tdcall.S b/arch/x86/coco/tdx/tdcall.S
> index e5d4b7d8ecd4..6aebac08f2bf 100644
> --- a/arch/x86/coco/tdx/tdcall.S
> +++ b/arch/x86/coco/tdx/tdcall.S
> @@ -40,8 +40,8 @@
> .section .noinstr.text, "ax"
>
> /*
> - * __tdx_module_call() - Used by TDX guests to request services from
> - * the TDX module (does not include VMM services) using TDCALL instruction.
> + * __tdcall() - Used by TDX guests to request services from the TDX
> + * module (does not include VMM services) using TDCALL instruction.
> *
> * Transforms function call register arguments into the TDCALL register ABI.
> * After TDCALL operation, TDX module output is saved in @out (if it is
> @@ -62,7 +62,7 @@
> *
> *-------------------------------------------------------------------------
> *
> - * __tdx_module_call() function ABI:
> + * __tdcall() function ABI:
> *
> * @fn (RDI) - TDCALL Leaf ID, moved to RAX
> * @rcx (RSI) - Input parameter 1, moved to RCX
> @@ -77,9 +77,9 @@
> *
> * Return status of TDCALL via RAX.
> */
> -SYM_FUNC_START(__tdx_module_call)
> +SYM_FUNC_START(__tdcall)
> TDX_MODULE_CALL host=0
> -SYM_FUNC_END(__tdx_module_call)
> +SYM_FUNC_END(__tdcall)
>
> /*
> * TDX_HYPERCALL - Make hypercalls to a TDX VMM using TDVMCALL leaf of TDCALL
> diff --git a/arch/x86/coco/tdx/tdx-shared.c b/arch/x86/coco/tdx/tdx-shared.c
> index f10cd3e4a04e..90631abdac34 100644
> --- a/arch/x86/coco/tdx/tdx-shared.c
> +++ b/arch/x86/coco/tdx/tdx-shared.c
> @@ -35,7 +35,7 @@ static unsigned long try_accept_one(phys_addr_t start, unsigned long len,
> }
>
> tdcall_rcx = start | page_size;
> - if (__tdx_module_call(TDG_MEM_PAGE_ACCEPT, tdcall_rcx, 0, 0, 0, NULL))
> + if (__tdcall(TDG_MEM_PAGE_ACCEPT, tdcall_rcx, 0, 0, 0, NULL))
> return 0;
>
> return accept_size;
> diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
> index 05785df66b1c..8c13944925e3 100644
> --- a/arch/x86/coco/tdx/tdx.c
> +++ b/arch/x86/coco/tdx/tdx.c
> @@ -66,10 +66,10 @@ EXPORT_SYMBOL_GPL(tdx_kvm_hypercall);
> * should only be used for calls that have no legitimate reason to fail
> * or where the kernel can not survive the call failing.
> */
> -static inline void tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
> - struct tdx_module_output *out)
> +static inline void tdcall(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
> + struct tdx_module_output *out)
> {
> - if (__tdx_module_call(fn, rcx, rdx, r8, r9, out))
> + if (__tdcall(fn, rcx, rdx, r8, r9, out))
> panic("TDCALL %lld failed (Buggy TDX module!)\n", fn);
> }
>
> @@ -91,9 +91,9 @@ int tdx_mcall_get_report0(u8 *reportdata, u8 *tdreport)
> {
> u64 ret;
>
> - ret = __tdx_module_call(TDG_MR_REPORT, virt_to_phys(tdreport),
> - virt_to_phys(reportdata), TDREPORT_SUBTYPE_0,
> - 0, NULL);
> + ret = __tdcall(TDG_MR_REPORT, virt_to_phys(tdreport),
> + virt_to_phys(reportdata), TDREPORT_SUBTYPE_0,
> + 0, NULL);
> if (ret) {
> if (TDCALL_RETURN_CODE(ret) == TDCALL_INVALID_OPERAND)
> return -EINVAL;
> @@ -152,7 +152,7 @@ static void tdx_parse_tdinfo(u64 *cc_mask)
> * Guest-Host-Communication Interface (GHCI), section 2.4.2 TDCALL
> * [TDG.VP.INFO].
> */
> - tdx_module_call(TDG_VP_INFO, 0, 0, 0, 0, &out);
> + tdcall(TDG_VP_INFO, 0, 0, 0, 0, &out);
>
> /*
> * The highest bit of a guest physical address is the "sharing" bit.
> @@ -594,7 +594,7 @@ void tdx_get_ve_info(struct ve_info *ve)
> * Note, the TDX module treats virtual NMIs as inhibited if the #VE
> * valid flag is set. It means that NMI=>#VE will not result in a #DF.
> */
> - tdx_module_call(TDG_VP_VEINFO_GET, 0, 0, 0, 0, &out);
> + tdcall(TDG_VP_VEINFO_GET, 0, 0, 0, 0, &out);
>
> /* Transfer the output parameters */
> ve->exit_reason = out.rcx;
> @@ -774,7 +774,7 @@ void __init tdx_early_init(void)
> cc_set_mask(cc_mask);
>
> /* Kernel does not use NOTIFY_ENABLES and does not need random #VEs */
> - tdx_module_call(TDG_VM_WR, 0, TDCS_NOTIFY_ENABLES, 0, -1ULL, NULL);
> + tdcall(TDG_VM_WR, 0, TDCS_NOTIFY_ENABLES, 0, -1ULL, NULL);
>
> /*
> * All bits above GPA width are reserved and kernel treats shared bit
> diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
> index 78f109446da6..9e3699b751ef 100644
> --- a/arch/x86/include/asm/shared/tdx.h
> +++ b/arch/x86/include/asm/shared/tdx.h
> @@ -88,8 +88,8 @@ struct tdx_module_output {
> };
>
> /* Used to communicate with the TDX module */
> -u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
> - struct tdx_module_output *out);
> +u64 __tdcall(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
> + struct tdx_module_output *out);
>
> bool tdx_accept_memory(phys_addr_t start, phys_addr_t end);
>
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
On Wed, Jul 26, 2023 at 11:25:14PM +1200,
Kai Huang <[email protected]> wrote:
> For TDX guest, KVM needs to call __seamcall_saved_ret() to make the
> TDH.VP.ENTER SEAMCALL to enter the guest, possibly taking all registers
> in 'struct tdx_module_args' as input/output.
>
> KVM caches guest's GPRs in 'kvm_vcpu_arch::regs[]', which follows the
> "register index" hardware layout of x86 GPRs. On the other hand, the
> __seamcall_saved_ret() takes the pointer of 'struct tdx_module_args' as
> argument, thus there's a mismatch.
>
> KVM could choose to copy input registers from 'vcpu::regs[]' to a
> 'struct tdx_module_args' and use that as argument to make the SEAMCALL,
> but such memory copy isn't desired and should be avoided if possible.
>
> It's not feasible to change KVM's 'vcpu::regs[]' layout due to various
> reasons (e.g., emulation code uses decoded register index as array index
> to access the register). Therefore, adjust 'struct tdx_module_args' to
> match KVM's 'vcpu::regs[]' layout so that KVM can simply do below:
>
> __seamcall_saved_ret(TDH_VP_ENTER,
> (struct tdx_module_args *)vcpu->arch.regs);
>
> Note RAX/RSP/RBP are not used by the TDX_MODULE_CALL assembly, but they
> are necessary in order match the layout of 'struct tdx_module_args' to
> KVM's 'vcpu::regs[]'. Thus add them to the structure, but name them as
> *_unused. Also don't include them to asm-offset.c so that any misuse of
> them in the assembly would result in build error.
>
> Cc: Kirill A. Shutemov <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Sean Christopherson <[email protected]>
> Cc: Paolo Bonzini <[email protected]>
> Cc: Isaku Yamahata <[email protected]>
> Suggested-by: Peter Zijlstra <[email protected]>
> Signed-off-by: Kai Huang <[email protected]>
> ---
>
> v2 -> v3:
> - New patch
>
> ---
> arch/x86/include/asm/shared/tdx.h | 19 +++++++++++++------
> arch/x86/kernel/asm-offsets.c | 6 +++---
> 2 files changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
> index 74fc466dfdcd..8d1427562c63 100644
> --- a/arch/x86/include/asm/shared/tdx.h
> +++ b/arch/x86/include/asm/shared/tdx.h
> @@ -58,24 +58,31 @@
> * Used in __tdcall*() to gather the input/output registers' values of the
> * TDCALL instruction when requesting services from the TDX module. This is a
> * software only structure and not part of the TDX module/VMM ABI
> + *
> + * Note those *_unused are not used by the TDX_MODULE_CALL assembly.
> + * The layout of this structure also matches KVM's kvm_vcpu_arch::regs[]
> + * layout, which follows the "register index" order of x86 GPRs. KVM
> + * then can simply type cast kvm_vcpu_arch::regs[] to this structure to
> + * avoid the extra memory copy between two structures when making
> + * TDH.VP.ENTER SEAMCALL.
> */
> struct tdx_module_args {
> - /* callee-clobbered */
> + u64 rax_unused;
> u64 rcx;
> u64 rdx;
> + u64 rbx;
> + u64 rsp_unused;
> + u64 rbp_unused;
> + u64 rsi;
> + u64 rdi;
> u64 r8;
> u64 r9;
> - /* extra callee-clobbered */
> u64 r10;
> u64 r11;
> - /* callee-saved + rdi/rsi */
> u64 r12;
> u64 r13;
> u64 r14;
> u64 r15;
> - u64 rbx;
> - u64 rdi;
> - u64 rsi;
> };
>
> /* Used to communicate with the TDX module */
> diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> index 6913b372ccf7..e4ad822d3acd 100644
> --- a/arch/x86/kernel/asm-offsets.c
> +++ b/arch/x86/kernel/asm-offsets.c
> @@ -70,6 +70,9 @@ static void __used common(void)
> BLANK();
> OFFSET(TDX_MODULE_rcx, tdx_module_args, rcx);
> OFFSET(TDX_MODULE_rdx, tdx_module_args, rdx);
> + OFFSET(TDX_MODULE_rbx, tdx_module_args, rbx);
> + OFFSET(TDX_MODULE_rsi, tdx_module_args, rsi);
> + OFFSET(TDX_MODULE_rdi, tdx_module_args, rdi);
> OFFSET(TDX_MODULE_r8, tdx_module_args, r8);
> OFFSET(TDX_MODULE_r9, tdx_module_args, r9);
> OFFSET(TDX_MODULE_r10, tdx_module_args, r10);
> @@ -78,9 +81,6 @@ static void __used common(void)
> OFFSET(TDX_MODULE_r13, tdx_module_args, r13);
> OFFSET(TDX_MODULE_r14, tdx_module_args, r14);
> OFFSET(TDX_MODULE_r15, tdx_module_args, r15);
> - OFFSET(TDX_MODULE_rbx, tdx_module_args, rbx);
> - OFFSET(TDX_MODULE_rdi, tdx_module_args, rdi);
> - OFFSET(TDX_MODULE_rsi, tdx_module_args, rsi);
>
> BLANK();
> OFFSET(BP_scratch, boot_params, scratch);
> --
> 2.41.0
I replaced the current TDX KVM TDH.VP.ENTER with this function and it worked.
Test-by: Isaku Yamahata <[email protected]>
--
Isaku Yamahata <[email protected]>
On Wed, Jul 26, 2023 at 11:25:14PM +1200,
Kai Huang <[email protected]> wrote:
> For TDX guest, KVM needs to call __seamcall_saved_ret() to make the
> TDH.VP.ENTER SEAMCALL to enter the guest, possibly taking all registers
> in 'struct tdx_module_args' as input/output.
>
> KVM caches guest's GPRs in 'kvm_vcpu_arch::regs[]', which follows the
> "register index" hardware layout of x86 GPRs. On the other hand, the
> __seamcall_saved_ret() takes the pointer of 'struct tdx_module_args' as
> argument, thus there's a mismatch.
>
> KVM could choose to copy input registers from 'vcpu::regs[]' to a
> 'struct tdx_module_args' and use that as argument to make the SEAMCALL,
> but such memory copy isn't desired and should be avoided if possible.
>
> It's not feasible to change KVM's 'vcpu::regs[]' layout due to various
> reasons (e.g., emulation code uses decoded register index as array index
> to access the register). Therefore, adjust 'struct tdx_module_args' to
> match KVM's 'vcpu::regs[]' layout so that KVM can simply do below:
>
> __seamcall_saved_ret(TDH_VP_ENTER,
> (struct tdx_module_args *)vcpu->arch.regs);
>
> Note RAX/RSP/RBP are not used by the TDX_MODULE_CALL assembly, but they
> are necessary in order match the layout of 'struct tdx_module_args' to
> KVM's 'vcpu::regs[]'. Thus add them to the structure, but name them as
> *_unused. Also don't include them to asm-offset.c so that any misuse of
> them in the assembly would result in build error.
Maybe we can have static check if the offsets match.
e.g. BUILD_BUG_ON(__VCPU_REGS_RAX * 8 != TDX_MODULE_rax); etc...
Anyway, I can have such a patch in TDX KVM side when I use this function for
TDH.VP.ENTER.
Thansk,
>
> Cc: Kirill A. Shutemov <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Sean Christopherson <[email protected]>
> Cc: Paolo Bonzini <[email protected]>
> Cc: Isaku Yamahata <[email protected]>
> Suggested-by: Peter Zijlstra <[email protected]>
> Signed-off-by: Kai Huang <[email protected]>
> ---
>
> v2 -> v3:
> - New patch
>
> ---
> arch/x86/include/asm/shared/tdx.h | 19 +++++++++++++------
> arch/x86/kernel/asm-offsets.c | 6 +++---
> 2 files changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
> index 74fc466dfdcd..8d1427562c63 100644
> --- a/arch/x86/include/asm/shared/tdx.h
> +++ b/arch/x86/include/asm/shared/tdx.h
> @@ -58,24 +58,31 @@
> * Used in __tdcall*() to gather the input/output registers' values of the
> * TDCALL instruction when requesting services from the TDX module. This is a
> * software only structure and not part of the TDX module/VMM ABI
> + *
> + * Note those *_unused are not used by the TDX_MODULE_CALL assembly.
> + * The layout of this structure also matches KVM's kvm_vcpu_arch::regs[]
> + * layout, which follows the "register index" order of x86 GPRs. KVM
> + * then can simply type cast kvm_vcpu_arch::regs[] to this structure to
> + * avoid the extra memory copy between two structures when making
> + * TDH.VP.ENTER SEAMCALL.
> */
> struct tdx_module_args {
> - /* callee-clobbered */
> + u64 rax_unused;
> u64 rcx;
> u64 rdx;
> + u64 rbx;
> + u64 rsp_unused;
> + u64 rbp_unused;
> + u64 rsi;
> + u64 rdi;
> u64 r8;
> u64 r9;
> - /* extra callee-clobbered */
> u64 r10;
> u64 r11;
> - /* callee-saved + rdi/rsi */
> u64 r12;
> u64 r13;
> u64 r14;
> u64 r15;
> - u64 rbx;
> - u64 rdi;
> - u64 rsi;
> };
>
> /* Used to communicate with the TDX module */
> diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> index 6913b372ccf7..e4ad822d3acd 100644
> --- a/arch/x86/kernel/asm-offsets.c
> +++ b/arch/x86/kernel/asm-offsets.c
> @@ -70,6 +70,9 @@ static void __used common(void)
> BLANK();
> OFFSET(TDX_MODULE_rcx, tdx_module_args, rcx);
> OFFSET(TDX_MODULE_rdx, tdx_module_args, rdx);
> + OFFSET(TDX_MODULE_rbx, tdx_module_args, rbx);
> + OFFSET(TDX_MODULE_rsi, tdx_module_args, rsi);
> + OFFSET(TDX_MODULE_rdi, tdx_module_args, rdi);
> OFFSET(TDX_MODULE_r8, tdx_module_args, r8);
> OFFSET(TDX_MODULE_r9, tdx_module_args, r9);
> OFFSET(TDX_MODULE_r10, tdx_module_args, r10);
> @@ -78,9 +81,6 @@ static void __used common(void)
> OFFSET(TDX_MODULE_r13, tdx_module_args, r13);
> OFFSET(TDX_MODULE_r14, tdx_module_args, r14);
> OFFSET(TDX_MODULE_r15, tdx_module_args, r15);
> - OFFSET(TDX_MODULE_rbx, tdx_module_args, rbx);
> - OFFSET(TDX_MODULE_rdi, tdx_module_args, rdi);
> - OFFSET(TDX_MODULE_rsi, tdx_module_args, rsi);
>
> BLANK();
> OFFSET(BP_scratch, boot_params, scratch);
> --
> 2.41.0
>
--
Isaku Yamahata <[email protected]>
>
> I replaced the current TDX KVM TDH.VP.ENTER with this function and it worked.
>
> Test-by: Isaku Yamahata <[email protected]>
I suppose you mean:
Tested-by: ... :-)
Anyway I only sent this series out after I got confirmation from you that
VP.ENTER worked using the SEAMCALL function in this series, but I forgot to add
your Tested-by before sending out this series. Will add. Thanks!
On Wed, Jul 26, 2023 at 11:25:10PM +1200, Kai Huang wrote:
> Now the TDX_HYPERCALL asm is basically identical to the TDX_MODULE_CALL
> with both '\saved' and '\ret' enabled, with two minor things though:
>
> 1) The way to restore the structure pointer is different
>
> The TDX_HYPERCALL uses RCX as spare to restore the structure pointer,
> but the TDX_MODULE_CALL assumes no spare register can be used. In other
> words, TDX_MODULE_CALL already covers what TDX_HYPERCALL does.
>
> 2) TDX_MODULE_CALL only clears shared registers for TDH.VP.ENTER
>
> For this just need to make that code available for the non-host case.
>
> Thus, remove the TDX_HYPERCALL and reimplement the __tdx_hypercall()
> using the TDX_MODULE_CALL.
>
> Extend the TDX_MODULE_CALL to cover "clear shared registers" for
> TDG.VP.VMCALL. Introduce a new __tdcall_saved_ret() to replace the
> temporary __tdcall_hypercall().
>
> The __tdcall_saved_ret() can also be used for those new TDCALLs which
> require more input/output registers than the basic TDCALLs do.
>
> Cc: Kirill A. Shutemov <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Suggested-by: Peter Zijlstra <[email protected]>
> Signed-off-by: Kai Huang <[email protected]>
Reviewed-by: Kirill A. Shutemov <[email protected]>
--
Kiryl Shutsemau / Kirill A. Shutemov
On Wed, Jul 26, 2023 at 11:25:12PM +1200, Kai Huang wrote:
> Intel Trust Domain Extensions (TDX) protects guest VMs from malicious
> host and certain physical attacks. A CPU-attested software module
> called 'the TDX module' runs inside a new isolated memory range as a
> trusted hypervisor to manage and run protected VMs.
>
> TDX introduces a new CPU mode: Secure Arbitration Mode (SEAM). This
> mode runs only the TDX module itself or other code to load the TDX
> module.
>
> The host kernel communicates with SEAM software via a new SEAMCALL
> instruction. This is conceptually similar to a guest->host hypercall,
> except it is made from the host to SEAM software instead. The TDX
> module establishes a new SEAMCALL ABI which allows the host to
> initialize the module and to manage VMs.
>
> The SEAMCALL ABI is very similar to the TDCALL ABI and leverages much
> TDCALL infrastructure. Wire up basic functions to make SEAMCALLs for
> the basic support of running TDX guests: __seamcall(), __seamcall_ret(),
> and __seamcall_saved_ret() for TDH.VP.ENTER. All SEAMCALLs involved in
> the basic TDX support don't use "callee-saved" registers as input and
> output, except the TDH.VP.ENTER.
>
> To start to support TDX, create a new arch/x86/virt/vmx/tdx/tdx.c for
> TDX host kernel support. Add a new Kconfig option CONFIG_INTEL_TDX_HOST
> to opt-in TDX host kernel support (to distinguish with TDX guest kernel
> support). So far only KVM uses TDX. Make the new config option depend
> on KVM_INTEL.
>
> Signed-off-by: Kai Huang <[email protected]>
> ---
>
> v2 -> v3:
> - Added __seamcall_saved_ret() back for TDH.VP.ENTER, given the new
> patch to adjust 'struct tdx_module_args' layout.
>
> v1 -> v2:
> - Removed __seamcall_saved_ret() and leave it to KVM TDX patches.
>
> ---
> arch/x86/Kconfig | 12 +++++++
> arch/x86/Makefile | 2 ++
> arch/x86/include/asm/tdx.h | 7 ++++
> arch/x86/virt/Makefile | 2 ++
> arch/x86/virt/vmx/Makefile | 2 ++
> arch/x86/virt/vmx/tdx/Makefile | 2 ++
> arch/x86/virt/vmx/tdx/seamcall.S | 61 ++++++++++++++++++++++++++++++++
> 7 files changed, 88 insertions(+)
> create mode 100644 arch/x86/virt/Makefile
> create mode 100644 arch/x86/virt/vmx/Makefile
> create mode 100644 arch/x86/virt/vmx/tdx/Makefile
> create mode 100644 arch/x86/virt/vmx/tdx/seamcall.S
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 7422db409770..0558dd98abd7 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1949,6 +1949,18 @@ config X86_SGX
>
> If unsure, say N.
>
> +config INTEL_TDX_HOST
> + bool "Intel Trust Domain Extensions (TDX) host support"
> + depends on CPU_SUP_INTEL
> + depends on X86_64
> + depends on KVM_INTEL
Hm. I expected KVM_INTEL to depend on CPU_SUP_INTEL, but apparently no.
Any reasons why?
Anyway, it is not strictly related to the patch:
Reviewed-by: Kirill A. Shutemov <[email protected]>
--
Kiryl Shutsemau / Kirill A. Shutemov
On Wed, Jul 26, 2023 at 11:25:13PM +1200, Kai Huang wrote:
> @@ -20,6 +21,9 @@
> #define TDX_SW_ERROR (TDX_ERROR | GENMASK_ULL(47, 40))
> #define TDX_SEAMCALL_VMFAILINVALID (TDX_SW_ERROR | _UL(0xFFFF0000))
>
> +#define TDX_SEAMCALL_GP (TDX_SW_ERROR | X86_TRAP_GP)
> +#define TDX_SEAMCALL_UD (TDX_SW_ERROR | X86_TRAP_UD)
Is there any explantion how these error codes got chosen? Looks very
arbitrary and may collide with other error codes in the future.
--
Kiryl Shutsemau / Kirill A. Shutemov
On Wed, Jul 26, 2023 at 11:25:14PM +1200, Kai Huang wrote:
> For TDX guest, KVM needs to call __seamcall_saved_ret() to make the
> TDH.VP.ENTER SEAMCALL to enter the guest, possibly taking all registers
> in 'struct tdx_module_args' as input/output.
>
> KVM caches guest's GPRs in 'kvm_vcpu_arch::regs[]', which follows the
> "register index" hardware layout of x86 GPRs. On the other hand, the
> __seamcall_saved_ret() takes the pointer of 'struct tdx_module_args' as
> argument, thus there's a mismatch.
>
> KVM could choose to copy input registers from 'vcpu::regs[]' to a
> 'struct tdx_module_args' and use that as argument to make the SEAMCALL,
> but such memory copy isn't desired and should be avoided if possible.
I doubt the copy will be visible on any profile.
I personally don't like that kvm implementation detail leaks here. It
suppose to be generic TDX code.
--
Kiryl Shutsemau / Kirill A. Shutemov
On Wed, Jul 26, 2023 at 11:25:11PM +1200, Kai Huang wrote:
> Now 'struct tdx_hypercall_args' is basically 'struct tdx_module_args'
> minus RCX. Although from __tdx_hypercall()'s perspective RCX isn't
> used as shared register thus not part of input/output registers, it's
> not worth to have a separate structure just due to one register.
>
> Remove the 'struct tdx_hypercall_args' and use 'struct tdx_module_args'
> instead in __tdx_hypercall() related code. This also saves the memory
> copy between the two structures within __tdx_hypercall().
>
> Cc: Kirill A. Shutemov <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Suggested-by: Peter Zijlstra <[email protected]>
> Signed-off-by: Kai Huang <[email protected]>
Reviewed-by: Kirill A. Shutemov <[email protected]>
--
Kiryl Shutsemau / Kirill A. Shutemov
> >
> > +config INTEL_TDX_HOST
> > + bool "Intel Trust Domain Extensions (TDX) host support"
> > + depends on CPU_SUP_INTEL
> > + depends on X86_64
> > + depends on KVM_INTEL
>
> Hm. I expected KVM_INTEL to depend on CPU_SUP_INTEL, but apparently no.
> Any reasons why?
Hmm.. Not sure :-)
On Sun, 2023-08-06 at 14:50 +0300, [email protected] wrote:
> On Wed, Jul 26, 2023 at 11:25:14PM +1200, Kai Huang wrote:
> > For TDX guest, KVM needs to call __seamcall_saved_ret() to make the
> > TDH.VP.ENTER SEAMCALL to enter the guest, possibly taking all registers
> > in 'struct tdx_module_args' as input/output.
> >
> > KVM caches guest's GPRs in 'kvm_vcpu_arch::regs[]', which follows the
> > "register index" hardware layout of x86 GPRs. On the other hand, the
> > __seamcall_saved_ret() takes the pointer of 'struct tdx_module_args' as
> > argument, thus there's a mismatch.
> >
> > KVM could choose to copy input registers from 'vcpu::regs[]' to a
> > 'struct tdx_module_args' and use that as argument to make the SEAMCALL,
> > but such memory copy isn't desired and should be avoided if possible.
>
> I doubt the copy will be visible on any profile.
>
> I personally don't like that kvm implementation detail leaks here. It
> suppose to be generic TDX code.
>
>
Well I kinda agree with you. But it seems Peter wanted this to be done:
https://lore.kernel.org/lkml/[email protected]/T/#m37f39493e9f2bf0a4c9ccc72aaf4938927375dc1
On Sun, 2023-08-06 at 14:41 +0300, [email protected] wrote:
> On Wed, Jul 26, 2023 at 11:25:13PM +1200, Kai Huang wrote:
> > @@ -20,6 +21,9 @@
> > #define TDX_SW_ERROR (TDX_ERROR | GENMASK_ULL(47, 40))
> > #define TDX_SEAMCALL_VMFAILINVALID (TDX_SW_ERROR | _UL(0xFFFF0000))
> >
> > +#define TDX_SEAMCALL_GP (TDX_SW_ERROR | X86_TRAP_GP)
> > +#define TDX_SEAMCALL_UD (TDX_SW_ERROR | X86_TRAP_UD)
>
> Is there any explantion how these error codes got chosen? Looks very
> arbitrary and may collide with other error codes in the future.
>
Any error code has TDX_SW_ERROR is reserved to software use so the TDX module
can never return any error code which conflicts with those software ones.
For why to choose these two, I believe XOR the TRAP number to TDX_SW_ERROR is
the simplest way to achieve: 1) costing minimal assembly code; 2)
opportunistically handling #GP too, allowing caller to distinguish the two
errors.
I can add this to the changelog.
Btw, as I chatted to you I believe we have another justification to handle
#UD/#GP in the assembly: emergency virtualization disable. Thus we can even get
rid of the erratum staff in the changelog.
How does below look like?
commit d3ff21da1083a525eb2cac6576490045e22f6f5d
Author: Kai Huang <[email protected]>
Date: Mon Jun 26 16:04:08 2023 +1200
x86/virt/tdx: Allow SEAMCALL to handle #UD and #GP
SEAMCALL instruction causes #UD if the CPU isn't in VMX operation.
Currently the TDX_MODULE_CALL assembly doesn't handle #UD, thus making
SEAMCALL when VMX is disabled would cause Oops.
Unfortunately, there are legal cases that SEAMCALL can be made when VMX
is disabled. For instance, VMX can be disabled due to emergency reboot
while there are still TDX guest is running.
Extend the TDX_MODULE_CALL assembly to return an error code for #UD to
handle this case gracefully, e.g., KVM can then quitely eat all SEAMCALL
errors caused by emergency reboot.
SEAMCALL instruction also causes #GP when TDX isn't enabled by the BIOS.
Use _ASM_EXTABLE_FAULT() to catch both exceptions with the trap number
recorded, and define two new error codes by XORing the trap number to
the TDX_SW_ERROR. This opportunistically handles #GP too while using
the same simple assembly code.
A bonus is when kernel mistakenly calls SEAMCALL when CPU isn't in VMX
operation, or when TDX isn't enabled by the BIOS, or when the BIOS is
buggy, the kernel can get a nicer error code rather than a less
understandable Oops.
This is basically based on Peter's code.
Cc: Kirill A. Shutemov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Kai Huang <[email protected]>
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 603e6d1e9d4a..6b8547dc40fd 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -8,6 +8,7 @@
#include <asm/errno.h>
#include <asm/ptrace.h>
+#include <asm/trapnr.h>
#include <asm/shared/tdx.h>
/*
@@ -20,6 +21,9 @@
#define TDX_SW_ERROR (TDX_ERROR | GENMASK_ULL(47, 40))
#define TDX_SEAMCALL_VMFAILINVALID (TDX_SW_ERROR | _UL(0xFFFF0000))
+#define TDX_SEAMCALL_GP (TDX_SW_ERROR | X86_TRAP_GP)
+#define TDX_SEAMCALL_UD (TDX_SW_ERROR | X86_TRAP_UD)
+
#ifndef __ASSEMBLY__
/*
diff --git a/arch/x86/virt/vmx/tdx/tdxcall.S b/arch/x86/virt/vmx/tdx/tdxcall.S
index 3f0b83a9977e..016a2a1ec1d6 100644
--- a/arch/x86/virt/vmx/tdx/tdxcall.S
+++ b/arch/x86/virt/vmx/tdx/tdxcall.S
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0 */
#include <asm/asm-offsets.h>
#include <asm/frame.h>
+#include <asm/asm.h>
#include <asm/tdx.h>
/*
@@ -85,6 +86,7 @@
.endif /* \saved */
.if \host
+.Lseamcall\@:
seamcall
/*
* SEAMCALL instruction is essentially a VMExit from VMX root
@@ -191,11 +193,28 @@
.if \host
.Lseamcall_vmfailinvalid\@:
mov $TDX_SEAMCALL_VMFAILINVALID, %rax
+ jmp .Lseamcall_fail\@
+
+.Lseamcall_trap\@:
+ /*
+ * SEAMCALL caused #GP or #UD. By reaching here RAX contains
+ * the trap number. Convert the trap number to the TDX error
+ * code by setting TDX_SW_ERROR to the high 32-bits of RAX.
+ *
+ * Note cannot OR TDX_SW_ERROR directly to RAX as OR instruction
+ * only accepts 32-bit immediate at most.
+ */
+ movq $TDX_SW_ERROR, %rdi
+ orq %rdi, %rax
+
+.Lseamcall_fail\@:
.if \ret && \saved
/* pop the unused structure pointer back to RSI */
popq %rsi
.endif
jmp .Lout\@
+
+ _ASM_EXTABLE_FAULT(.Lseamcall\@, .Lseamcall_trap\@)
.endif /* \host */
.endm
On Mon, Aug 07, 2023 at 02:14:37AM +0000, Huang, Kai wrote:
> On Sun, 2023-08-06 at 14:41 +0300, [email protected] wrote:
> > On Wed, Jul 26, 2023 at 11:25:13PM +1200, Kai Huang wrote:
> > > @@ -20,6 +21,9 @@
> > > #define TDX_SW_ERROR (TDX_ERROR | GENMASK_ULL(47, 40))
> > > #define TDX_SEAMCALL_VMFAILINVALID (TDX_SW_ERROR | _UL(0xFFFF0000))
> > >
> > > +#define TDX_SEAMCALL_GP (TDX_SW_ERROR | X86_TRAP_GP)
> > > +#define TDX_SEAMCALL_UD (TDX_SW_ERROR | X86_TRAP_UD)
> >
> > Is there any explantion how these error codes got chosen? Looks very
> > arbitrary and may collide with other error codes in the future.
> >
>
> Any error code has TDX_SW_ERROR is reserved to software use so the TDX module
> can never return any error code which conflicts with those software ones.
>
> For why to choose these two, I believe XOR the TRAP number to TDX_SW_ERROR is
> the simplest way to achieve: 1) costing minimal assembly code; 2)
> opportunistically handling #GP too, allowing caller to distinguish the two
> errors.
My problem is that it is going to conflict with errno-based errors if we
going to take this path in the future. Like these errors are the same as
(TDX_SW_ERROR | EACCES) and (TDX_SW_ERROR | ENXIO) respectively.
--
Kiryl Shutsemau / Kirill A. Shutemov
On Mon, Aug 07, 2023 at 12:41:13PM +0000, Huang, Kai wrote:
> On Mon, 2023-08-07 at 12:53 +0300, [email protected] wrote:
> > On Mon, Aug 07, 2023 at 02:14:37AM +0000, Huang, Kai wrote:
> > > On Sun, 2023-08-06 at 14:41 +0300, [email protected] wrote:
> > > > On Wed, Jul 26, 2023 at 11:25:13PM +1200, Kai Huang wrote:
> > > > > @@ -20,6 +21,9 @@
> > > > > #define TDX_SW_ERROR (TDX_ERROR | GENMASK_ULL(47, 40))
> > > > > #define TDX_SEAMCALL_VMFAILINVALID (TDX_SW_ERROR | _UL(0xFFFF0000))
> > > > >
> > > > > +#define TDX_SEAMCALL_GP (TDX_SW_ERROR | X86_TRAP_GP)
> > > > > +#define TDX_SEAMCALL_UD (TDX_SW_ERROR | X86_TRAP_UD)
> > > >
> > > > Is there any explantion how these error codes got chosen? Looks very
> > > > arbitrary and may collide with other error codes in the future.
> > > >
> > >
> > > Any error code has TDX_SW_ERROR is reserved to software use so the TDX module
> > > can never return any error code which conflicts with those software ones.
> > >
> > > For why to choose these two, I believe XOR the TRAP number to TDX_SW_ERROR is
> > > the simplest way to achieve: 1) costing minimal assembly code; 2)
> > > opportunistically handling #GP too, allowing caller to distinguish the two
> > > errors.
> >
> > My problem is that it is going to conflict with errno-based errors if we
> > going to take this path in the future. Like these errors are the same as
> > (TDX_SW_ERROR | EACCES) and (TDX_SW_ERROR | ENXIO) respectively.
> >
>
> Is there any use case that we need those definitions?
>
> Even we have such requirement in the future, we still have many bits available
> after taking out the bits of TDX_SW_ERROR thus I assume we can do some bit shift
> when this really happens??
Okay, fair enough.
--
Kiryl Shutsemau / Kirill A. Shutemov
On Mon, 2023-08-07 at 12:53 +0300, [email protected] wrote:
> On Mon, Aug 07, 2023 at 02:14:37AM +0000, Huang, Kai wrote:
> > On Sun, 2023-08-06 at 14:41 +0300, [email protected] wrote:
> > > On Wed, Jul 26, 2023 at 11:25:13PM +1200, Kai Huang wrote:
> > > > @@ -20,6 +21,9 @@
> > > > #define TDX_SW_ERROR (TDX_ERROR | GENMASK_ULL(47, 40))
> > > > #define TDX_SEAMCALL_VMFAILINVALID (TDX_SW_ERROR | _UL(0xFFFF0000))
> > > >
> > > > +#define TDX_SEAMCALL_GP (TDX_SW_ERROR | X86_TRAP_GP)
> > > > +#define TDX_SEAMCALL_UD (TDX_SW_ERROR | X86_TRAP_UD)
> > >
> > > Is there any explantion how these error codes got chosen? Looks very
> > > arbitrary and may collide with other error codes in the future.
> > >
> >
> > Any error code has TDX_SW_ERROR is reserved to software use so the TDX module
> > can never return any error code which conflicts with those software ones.
> >
> > For why to choose these two, I believe XOR the TRAP number to TDX_SW_ERROR is
> > the simplest way to achieve: 1) costing minimal assembly code; 2)
> > opportunistically handling #GP too, allowing caller to distinguish the two
> > errors.
>
> My problem is that it is going to conflict with errno-based errors if we
> going to take this path in the future. Like these errors are the same as
> (TDX_SW_ERROR | EACCES) and (TDX_SW_ERROR | ENXIO) respectively.
>
Is there any use case that we need those definitions?
Even we have such requirement in the future, we still have many bits available
after taking out the bits of TDX_SW_ERROR thus I assume we can do some bit shift
when this really happens??
On Mon, Aug 07, 2023, Kai Huang wrote:
>
> > >
> > > +config INTEL_TDX_HOST
> > > + bool "Intel Trust Domain Extensions (TDX) host support"
> > > + depends on CPU_SUP_INTEL
> > > + depends on X86_64
> > > + depends on KVM_INTEL
> >
> > Hm. I expected KVM_INTEL to depend on CPU_SUP_INTEL, but apparently no.
> > Any reasons why?
>
> Hmm.. Not sure :-)
Centuar and Zhaoxin CPUs also support VMX.
commit 8f63aaf5c493c6502a058585cdfa3c71cdf8c44a
Author: Sean Christopherson <[email protected]>
Date: Fri Dec 20 20:45:13 2019 -0800
KVM: VMX: Allow KVM_INTEL when building for Centaur and/or Zhaoxin CPUs
Change the dependency for KVM_INTEL, i.e. KVM w/ VMX, from Intel CPUs to
any CPU that supports the IA32_FEAT_CTL MSR and thus VMX functionality.
This effectively allows building KVM_INTEL for Centaur and Zhaoxin CPUs.
On 8/6/23 04:41, [email protected] wrote:
> On Wed, Jul 26, 2023 at 11:25:13PM +1200, Kai Huang wrote:
>> @@ -20,6 +21,9 @@
>> #define TDX_SW_ERROR (TDX_ERROR | GENMASK_ULL(47, 40))
>> #define TDX_SEAMCALL_VMFAILINVALID (TDX_SW_ERROR | _UL(0xFFFF0000))
>>
>> +#define TDX_SEAMCALL_GP (TDX_SW_ERROR | X86_TRAP_GP)
>> +#define TDX_SEAMCALL_UD (TDX_SW_ERROR | X86_TRAP_UD)
> Is there any explantion how these error codes got chosen? Looks very
> arbitrary and may collide with other error codes in the future.
If they collide, we can just fix it then.
So, please, do comment what the limitations are and what must be avoided
in the future, but I don't think we need to go mucking with this at all.
On Mon, 2023-08-07 at 07:30 -0700, Sean Christopherson wrote:
> On Mon, Aug 07, 2023, Kai Huang wrote:
> >
> > > >
> > > > +config INTEL_TDX_HOST
> > > > + bool "Intel Trust Domain Extensions (TDX) host support"
> > > > + depends on CPU_SUP_INTEL
> > > > + depends on X86_64
> > > > + depends on KVM_INTEL
> > >
> > > Hm. I expected KVM_INTEL to depend on CPU_SUP_INTEL, but apparently no.
> > > Any reasons why?
> >
> > Hmm.. Not sure :-)
>
> Centuar and Zhaoxin CPUs also support VMX.
>
> commit 8f63aaf5c493c6502a058585cdfa3c71cdf8c44a
> Author: Sean Christopherson <[email protected]>
> Date: Fri Dec 20 20:45:13 2019 -0800
>
> KVM: VMX: Allow KVM_INTEL when building for Centaur and/or Zhaoxin CPUs
>
> Change the dependency for KVM_INTEL, i.e. KVM w/ VMX, from Intel CPUs to
> any CPU that supports the IA32_FEAT_CTL MSR and thus VMX functionality.
> This effectively allows building KVM_INTEL for Centaur and Zhaoxin CPUs.
>
Ah. Thanks Sean!