This part of the SEV-SNP (Secure Nested Paging) series focuses on the changes
required in a guest OS for SEV-SNP support.
SEV-SNP builds upon existing SEV and SEV-ES functionality while adding
new hardware-based memory protections. SEV-SNP adds strong memory integrity
protection to help prevent malicious hypervisor-based attacks such as data
replay and memory re-mapping, creating an isolated memory encryption
environment.
This series provides the basic building blocks to support booting SEV-SNP
VMs; it does not cover all of the security enhancements introduced by SEV-SNP,
such as interrupt protection.
Many of the integrity guarantees of SEV-SNP are enforced through a new
structure called the Reverse Map Table (RMP). Adding a new page to an SEV-SNP
VM requires a two-step process. First, the hypervisor assigns the page to the
guest using the new RMPUPDATE instruction. This transitions the page to the
guest-invalid state. Second, the guest validates the page using the new
PVALIDATE instruction. SEV-SNP VMs can use the new "Page State Change Request
NAE" defined in the GHCB specification to ask the hypervisor to add or remove
pages from the RMP table.
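Conceptually, this two-step ownership flow can be modeled as a small state
machine. The sketch below is a simplified userspace illustration only; the
state names and helpers are made up for this example and do not reflect the
actual RMP entry encoding:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the RMP-tracked states a guest page moves through. */
enum rmp_state {
	HYPERVISOR_OWNED,	/* page not assigned to the guest */
	GUEST_INVALID,		/* assigned via RMPUPDATE, not yet validated */
	GUEST_VALID		/* validated by the guest via PVALIDATE */
};

/* Step 1: the hypervisor assigns the page (models RMPUPDATE). */
static bool rmpupdate_assign(enum rmp_state *s)
{
	if (*s != HYPERVISOR_OWNED)
		return false;
	*s = GUEST_INVALID;
	return true;
}

/* Step 2: the guest validates the page (models PVALIDATE). */
static bool pvalidate_page(enum rmp_state *s)
{
	if (*s != GUEST_INVALID)
		return false;
	*s = GUEST_VALID;
	return true;
}
```

Note how validating an already-valid page fails in this model; the real RMP
similarly lets the guest detect a hypervisor replaying or re-assigning pages.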
Each page assigned to the SEV-SNP VM can either be validated or unvalidated,
as indicated by the Validated flag in the page's RMP entry. There are two
approaches that can be taken for the page validation: Pre-validation and
Lazy Validation.
Under pre-validation, pages are validated prior to first use. Under lazy
validation, pages are validated when first accessed: an access to an
unvalidated page results in a #VC exception, at which time the exception
handler may validate the page. Lazy validation requires careful tracking of
the validated pages to avoid validating the same GPA more than once. The
recently introduced "Unaccepted" memory type can be used to communicate the
unvalidated memory ranges to the guest OS.
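One way to do such tracking is a bitmap over page frames, consulted before
issuing PVALIDATE. This is a hedged sketch with a toy fixed-size range, not
the mechanism this series (or the unaccepted-memory work) actually uses:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT	12
#define MAX_PFNS	1024	/* toy range for illustration */

static uint8_t validated_bitmap[MAX_PFNS / 8];

/*
 * Returns true if this GPA still needed validation (first access),
 * false if it was already validated and PVALIDATE must be skipped.
 */
static bool mark_gpa_validated(uint64_t gpa)
{
	uint64_t pfn = gpa >> PAGE_SHIFT;

	if (validated_bitmap[pfn / 8] & (1 << (pfn % 8)))
		return false;
	validated_bitmap[pfn / 8] |= 1 << (pfn % 8);
	return true;
}
```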
At this time, only pre-validation is supported: the OVMF guest BIOS
validates the entire RAM before control is handed over to the guest kernel.
The early_set_memory_{encrypt,decrypt}() and set_memory_{encrypt,decrypt}()
helpers are enlightened to perform the page validation or invalidation while
setting or clearing the encryption attribute in the page tables.
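The ordering matters here: a page must be invalidated before it is made
shared, and made private before it is validated. The toy model below
illustrates just that ordering; the struct and helper names are invented for
this sketch and are not the kernel's:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: track whether a page is validated and whether its C-bit is set. */
struct page_attrs {
	bool validated;
	bool encrypted;
};

/*
 * Making a page shared: rescind validation first (PVALIDATE with
 * validate=0), then clear the C-bit in the page table.
 */
static void make_shared(struct page_attrs *p)
{
	p->validated = false;
	p->encrypted = false;
}

/*
 * Making a page private: set the C-bit first, then validate the page
 * (PVALIDATE with validate=1), mirroring the two-step RMP flow.
 */
static void make_private(struct page_attrs *p)
{
	p->encrypted = true;
	p->validated = true;
}
```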
This series does not provide support for the following SEV-SNP features yet:
* Lazy validation
* Interrupt security
The series is based on tip/master
e53fbd0a2509 (origin/master) Merge branch 'core/urgent'
Additional resources
---------------------
SEV-SNP whitepaper
https://www.amd.com/system/files/TechDocs/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf
APM 2: https://www.amd.com/system/files/TechDocs/24593.pdf
(section 15.36)
GHCB spec:
https://developer.amd.com/wp-content/resources/56421.pdf
SEV-SNP firmware specification:
https://developer.amd.com/sev/
Changes since v3:
* Add support to use the PSP-filtered CPUID.
* Add support for the extended guest request.
* Move the sevguest driver into drivers/virt/coco.
* Add documentation for the sevguest ioctl.
* Add support to check for VMPL0.
* Pass the VM encryption key and ID to be used for encrypting guest messages
through the platform driver data.
* Multiple cleanups and fixes to address review feedback.
Changes since v2:
* Add support for AP startup using the SNP-specific VMGEXIT.
* Add the snp_prep_memory() helper.
* Drop the sev_snp_active() helper.
* Add the sev_feature_enabled() helper to check which SEV feature is active.
* Sync the SNP guest message request header with the latest SNP FW spec.
* Multiple cleanups and fixes to address review feedback.
Changes since v1:
* Integrate the SNP support into sev.{ch}.
* Add support to query the hypervisor features and detect whether SNP is supported.
* Define Linux-specific reason codes for SNP guest termination.
* Extend the setup_header to provide a way for the hypervisor to pass the secrets
and CPUID pages.
* Add support to create a platform device and driver to query the attestation
report and derive a key.
* Multiple cleanups and fixes to address Boris's review feedback.
Brijesh Singh (23):
x86/sev: shorten GHCB terminate macro names
x86/sev: Save the negotiated GHCB version
x86/sev: Add support for hypervisor feature VMGEXIT
x86/mm: Add sev_feature_enabled() helper
x86/sev: Define the Linux specific guest termination reasons
x86/sev: check SEV-SNP features support
x86/sev: Add a helper for the PVALIDATE instruction
x86/sev: check the vmpl level
x86/compressed: Add helper for validating pages in the decompression
stage
x86/compressed: Register GHCB memory when SEV-SNP is active
x86/sev: Register GHCB memory when SEV-SNP is active
x86/sev: Add helper for validating pages in early enc attribute
changes
x86/kernel: Make the bss.decrypted section shared in RMP table
x86/kernel: Validate rom memory before accessing when SEV-SNP is
active
x86/mm: Add support to validate memory when changing C-bit
KVM: SVM: define new SEV_FEATURES field in the VMCB Save State Area
x86/boot: Add Confidential Computing type to setup_data
x86/sev: Provide support for SNP guest request NAEs
x86/sev: Add snp_msg_seqno() helper
x86/sev: Register SNP guest request platform device
virt: Add SEV-SNP guest driver
virt: sevguest: Add support to derive key
virt: sevguest: Add support to get extended report
Michael Roth (9):
x86/head/64: set up a startup %gs for stack protector
x86/sev: move MSR-based VMGEXITs for CPUID to helper
KVM: x86: move lookup of indexed CPUID leafs to helper
x86/compressed/acpi: move EFI config table access to common code
x86/compressed/64: enable SEV-SNP-validated CPUID in #VC handler
x86/boot: add a pointer to Confidential Computing blob in bootparams
x86/compressed/64: store Confidential Computing blob address in
bootparams
x86/compressed/64: add identity mapping for Confidential Computing
blob
x86/sev: enable SEV-SNP-validated CPUID in #VC handlers
Tom Lendacky (4):
KVM: SVM: Create a separate mapping for the SEV-ES save area
KVM: SVM: Create a separate mapping for the GHCB save area
KVM: SVM: Update the SEV-ES save area mapping
x86/sev: Use SEV-SNP AP creation to start secondary CPUs
Documentation/virt/coco/sevguest.rst | 109 +++
arch/x86/boot/compressed/Makefile | 1 +
arch/x86/boot/compressed/acpi.c | 124 ++--
arch/x86/boot/compressed/efi-config-table.c | 224 ++++++
arch/x86/boot/compressed/head_64.S | 1 +
arch/x86/boot/compressed/ident_map_64.c | 35 +-
arch/x86/boot/compressed/idt_64.c | 7 +-
arch/x86/boot/compressed/misc.h | 66 ++
arch/x86/boot/compressed/sev.c | 115 +++-
arch/x86/include/asm/bootparam_utils.h | 1 +
arch/x86/include/asm/cpuid-indexed.h | 26 +
arch/x86/include/asm/mem_encrypt.h | 10 +
arch/x86/include/asm/msr-index.h | 2 +
arch/x86/include/asm/realmode.h | 1 +
arch/x86/include/asm/setup.h | 5 +-
arch/x86/include/asm/sev-common.h | 80 ++-
arch/x86/include/asm/sev.h | 79 ++-
arch/x86/include/asm/svm.h | 176 ++++-
arch/x86/include/uapi/asm/bootparam.h | 4 +-
arch/x86/include/uapi/asm/svm.h | 13 +
arch/x86/kernel/head64.c | 46 +-
arch/x86/kernel/head_64.S | 6 +-
arch/x86/kernel/probe_roms.c | 13 +-
arch/x86/kernel/setup.c | 3 +
arch/x86/kernel/sev-internal.h | 12 +
arch/x86/kernel/sev-shared.c | 572 +++++++++++++++-
arch/x86/kernel/sev.c | 716 +++++++++++++++++++-
arch/x86/kernel/smpboot.c | 5 +
arch/x86/kvm/cpuid.c | 17 +-
arch/x86/kvm/svm/sev.c | 24 +-
arch/x86/kvm/svm/svm.c | 4 +-
arch/x86/kvm/svm/svm.h | 2 +-
arch/x86/mm/mem_encrypt.c | 65 +-
arch/x86/mm/pat/set_memory.c | 15 +
drivers/virt/Kconfig | 3 +
drivers/virt/Makefile | 1 +
drivers/virt/coco/sevguest/Kconfig | 9 +
drivers/virt/coco/sevguest/Makefile | 2 +
drivers/virt/coco/sevguest/sevguest.c | 623 +++++++++++++++++
drivers/virt/coco/sevguest/sevguest.h | 63 ++
include/linux/efi.h | 1 +
include/linux/sev-guest.h | 90 +++
include/uapi/linux/sev-guest.h | 81 +++
43 files changed, 3253 insertions(+), 199 deletions(-)
create mode 100644 Documentation/virt/coco/sevguest.rst
create mode 100644 arch/x86/boot/compressed/efi-config-table.c
create mode 100644 arch/x86/include/asm/cpuid-indexed.h
create mode 100644 arch/x86/kernel/sev-internal.h
create mode 100644 drivers/virt/coco/sevguest/Kconfig
create mode 100644 drivers/virt/coco/sevguest/Makefile
create mode 100644 drivers/virt/coco/sevguest/sevguest.c
create mode 100644 drivers/virt/coco/sevguest/sevguest.h
create mode 100644 include/linux/sev-guest.h
create mode 100644 include/uapi/linux/sev-guest.h
--
2.17.1
Reviewed-by: Venu Busireddy <[email protected]>
Suggested-by: Borislav Petkov <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/boot/compressed/sev.c | 6 +++---
arch/x86/include/asm/sev-common.h | 4 ++--
arch/x86/kernel/sev-shared.c | 2 +-
arch/x86/kernel/sev.c | 4 ++--
4 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 670e998fe930..28bcf04c022e 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -122,7 +122,7 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
static bool early_setup_sev_es(void)
{
if (!sev_es_negotiate_protocol())
- sev_es_terminate(GHCB_SEV_ES_REASON_PROTOCOL_UNSUPPORTED);
+ sev_es_terminate(GHCB_SEV_ES_PROT_UNSUPPORTED);
if (set_page_decrypted((unsigned long)&boot_ghcb_page))
return false;
@@ -175,7 +175,7 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
enum es_result result;
if (!boot_ghcb && !early_setup_sev_es())
- sev_es_terminate(GHCB_SEV_ES_REASON_GENERAL_REQUEST);
+ sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
vc_ghcb_invalidate(boot_ghcb);
result = vc_init_em_ctxt(&ctxt, regs, exit_code);
@@ -202,5 +202,5 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
if (result == ES_OK)
vc_finish_insn(&ctxt);
else if (result != ES_RETRY)
- sev_es_terminate(GHCB_SEV_ES_REASON_GENERAL_REQUEST);
+ sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
}
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 629c3df243f0..11b7d9cea775 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -54,8 +54,8 @@
(((((u64)reason_set) & GHCB_MSR_TERM_REASON_SET_MASK) << GHCB_MSR_TERM_REASON_SET_POS) | \
((((u64)reason_val) & GHCB_MSR_TERM_REASON_MASK) << GHCB_MSR_TERM_REASON_POS))
-#define GHCB_SEV_ES_REASON_GENERAL_REQUEST 0
-#define GHCB_SEV_ES_REASON_PROTOCOL_UNSUPPORTED 1
+#define GHCB_SEV_ES_GEN_REQ 0
+#define GHCB_SEV_ES_PROT_UNSUPPORTED 1
#define GHCB_RESP_CODE(v) ((v) & GHCB_MSR_INFO_MASK)
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 9f90f460a28c..114f62fe2529 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -208,7 +208,7 @@ void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
fail:
/* Terminate the guest */
- sev_es_terminate(GHCB_SEV_ES_REASON_GENERAL_REQUEST);
+ sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
}
static enum es_result vc_insn_string_read(struct es_em_ctxt *ctxt,
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 87a4b00f028e..66b7f63ad041 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1429,7 +1429,7 @@ DEFINE_IDTENTRY_VC_KERNEL(exc_vmm_communication)
show_regs(regs);
/* Ask hypervisor to sev_es_terminate */
- sev_es_terminate(GHCB_SEV_ES_REASON_GENERAL_REQUEST);
+ sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
/* If that fails and we get here - just panic */
panic("Returned from Terminate-Request to Hypervisor\n");
@@ -1477,7 +1477,7 @@ bool __init handle_vc_boot_ghcb(struct pt_regs *regs)
/* Do initial setup or terminate the guest */
if (unlikely(boot_ghcb == NULL && !sev_es_setup_ghcb()))
- sev_es_terminate(GHCB_SEV_ES_REASON_GENERAL_REQUEST);
+ sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
vc_ghcb_invalidate(boot_ghcb);
--
2.17.1
An SEV-ES guest calls sev_es_negotiate_protocol() to negotiate the
GHCB protocol version before establishing the GHCB. Cache the negotiated
GHCB version so that it can be used later.
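The negotiation itself reduces to clamping the hypervisor's advertised range
against the guest's supported range. A minimal userspace sketch of that
logic (the function name is invented; the MIN/MAX values match this patch's
pre-bump state):

```c
#include <stdint.h>

#define GHCB_PROTOCOL_MIN	1
#define GHCB_PROTOCOL_MAX	1

/*
 * Pick the protocol version to cache: the highest version both the guest
 * and the hypervisor support, or 0 if the two ranges do not overlap.
 */
static uint16_t negotiate_ghcb_version(uint16_t hv_min, uint16_t hv_max)
{
	if (hv_max < GHCB_PROTOCOL_MIN || hv_min > GHCB_PROTOCOL_MAX)
		return 0;
	return hv_max < GHCB_PROTOCOL_MAX ? hv_max : GHCB_PROTOCOL_MAX;
}
```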
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/sev.h | 2 +-
arch/x86/kernel/sev-shared.c | 17 ++++++++++++++---
2 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index fa5cd05d3b5b..7ec91b1359df 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -12,7 +12,7 @@
#include <asm/insn.h>
#include <asm/sev-common.h>
-#define GHCB_PROTO_OUR 0x0001UL
+#define GHCB_PROTOCOL_MIN 1ULL
#define GHCB_PROTOCOL_MAX 1ULL
#define GHCB_DEFAULT_USAGE 0ULL
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 114f62fe2529..19c2306ac02d 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -14,6 +14,15 @@
#define has_cpuflag(f) boot_cpu_has(f)
#endif
+/*
+ * Since feature negotiation related variables are set early in the boot
+ * process they must reside in the .data section so as not to be zeroed
+ * out when the .bss section is later cleared.
+ *
+ * GHCB protocol version negotiated with the hypervisor.
+ */
+static u16 ghcb_version __section(".data..ro_after_init");
+
static bool __init sev_es_check_cpu_features(void)
{
if (!has_cpuflag(X86_FEATURE_RDRAND)) {
@@ -54,10 +63,12 @@ static bool sev_es_negotiate_protocol(void)
if (GHCB_MSR_INFO(val) != GHCB_MSR_SEV_INFO_RESP)
return false;
- if (GHCB_MSR_PROTO_MAX(val) < GHCB_PROTO_OUR ||
- GHCB_MSR_PROTO_MIN(val) > GHCB_PROTO_OUR)
+ if (GHCB_MSR_PROTO_MAX(val) < GHCB_PROTOCOL_MIN ||
+ GHCB_MSR_PROTO_MIN(val) > GHCB_PROTOCOL_MAX)
return false;
+ ghcb_version = min_t(size_t, GHCB_MSR_PROTO_MAX(val), GHCB_PROTOCOL_MAX);
+
return true;
}
@@ -102,7 +113,7 @@ static enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
enum es_result ret;
/* Fill in protocol and format specifiers */
- ghcb->protocol_version = GHCB_PROTOCOL_MAX;
+ ghcb->protocol_version = ghcb_version;
ghcb->ghcb_usage = GHCB_DEFAULT_USAGE;
ghcb_set_sw_exit_code(ghcb, exit_code);
--
2.17.1
Version 2 of the GHCB specification introduced advertisement of the features
that are supported by the hypervisor. Define the GHCB MSR protocol and NAE
for the hypervisor feature request, and query the features during the GHCB
protocol negotiation. See the GHCB specification for more details.
Version 2 of the GHCB specification adds several new NAEs, most of which are
optional except for the hypervisor feature NAE. Now that the hypervisor
feature NAE is implemented, bump the GHCB maximum supported protocol version.
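The MSR-protocol response carries the response code in its low 12 bits and
the feature bitmap above them, as in the patch's GHCB_MSR_HV_FT_RESP_VAL()
macro. A self-contained model of that decoding (the parse_hv_features()
wrapper is invented for this sketch):

```c
#include <stdint.h>

#define GHCB_MSR_INFO_MASK	0xfffULL	/* low 12 bits: response code */
#define GHCB_MSR_HV_FT_RESP	0x081ULL
#define GHCB_MSR_HV_FT_POS	12
#define GHCB_MSR_HV_FT_MASK	((1ULL << 52) - 1)	/* GENMASK_ULL(51, 0) */

/* Extract the hypervisor feature bitmap from the GHCB MSR response. */
static int parse_hv_features(uint64_t msr_val, uint64_t *features)
{
	if ((msr_val & GHCB_MSR_INFO_MASK) != GHCB_MSR_HV_FT_RESP)
		return -1;	/* not a feature response */

	*features = (msr_val >> GHCB_MSR_HV_FT_POS) & GHCB_MSR_HV_FT_MASK;
	return 0;
}
```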
Reviewed-by: Venu Busireddy <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/mem_encrypt.h | 2 ++
arch/x86/include/asm/sev-common.h | 9 +++++++++
arch/x86/include/asm/sev.h | 2 +-
arch/x86/include/uapi/asm/svm.h | 2 ++
arch/x86/kernel/sev-shared.c | 23 +++++++++++++++++++++++
arch/x86/kernel/sev.c | 3 +++
6 files changed, 40 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 9c80c68d75b5..8cc2fd308f65 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -20,6 +20,7 @@
extern u64 sme_me_mask;
extern u64 sev_status;
+extern u64 sev_hv_features;
void sme_encrypt_execute(unsigned long encrypted_kernel_vaddr,
unsigned long decrypted_kernel_vaddr,
@@ -59,6 +60,7 @@ bool sev_es_active(void);
#else /* !CONFIG_AMD_MEM_ENCRYPT */
#define sme_me_mask 0ULL
+#define sev_hv_features 0ULL
static inline void __init sme_early_encrypt(resource_size_t paddr,
unsigned long size) { }
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 11b7d9cea775..23929a3010df 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -45,6 +45,15 @@
(((unsigned long)reg & GHCB_MSR_CPUID_REG_MASK) << GHCB_MSR_CPUID_REG_POS) | \
(((unsigned long)fn) << GHCB_MSR_CPUID_FUNC_POS))
+/* GHCB Hypervisor Feature Request */
+#define GHCB_MSR_HV_FT_REQ 0x080
+#define GHCB_MSR_HV_FT_RESP 0x081
+#define GHCB_MSR_HV_FT_POS 12
+#define GHCB_MSR_HV_FT_MASK GENMASK_ULL(51, 0)
+
+#define GHCB_MSR_HV_FT_RESP_VAL(v) \
+ (((unsigned long)((v) >> GHCB_MSR_HV_FT_POS) & GHCB_MSR_HV_FT_MASK))
+
#define GHCB_MSR_TERM_REQ 0x100
#define GHCB_MSR_TERM_REASON_SET_POS 12
#define GHCB_MSR_TERM_REASON_SET_MASK 0xf
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 7ec91b1359df..134a7c9d91b6 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -13,7 +13,7 @@
#include <asm/sev-common.h>
#define GHCB_PROTOCOL_MIN 1ULL
-#define GHCB_PROTOCOL_MAX 1ULL
+#define GHCB_PROTOCOL_MAX 2ULL
#define GHCB_DEFAULT_USAGE 0ULL
#define VMGEXIT() { asm volatile("rep; vmmcall\n\r"); }
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index 554f75fe013c..7fbc311e2de1 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -108,6 +108,7 @@
#define SVM_VMGEXIT_AP_JUMP_TABLE 0x80000005
#define SVM_VMGEXIT_SET_AP_JUMP_TABLE 0
#define SVM_VMGEXIT_GET_AP_JUMP_TABLE 1
+#define SVM_VMGEXIT_HYPERVISOR_FEATURES 0x8000fffd
#define SVM_VMGEXIT_UNSUPPORTED_EVENT 0x8000ffff
#define SVM_EXIT_ERR -1
@@ -215,6 +216,7 @@
{ SVM_VMGEXIT_NMI_COMPLETE, "vmgexit_nmi_complete" }, \
{ SVM_VMGEXIT_AP_HLT_LOOP, "vmgexit_ap_hlt_loop" }, \
{ SVM_VMGEXIT_AP_JUMP_TABLE, "vmgexit_ap_jump_table" }, \
+ { SVM_VMGEXIT_HYPERVISOR_FEATURES, "vmgexit_hypervisor_feature" }, \
{ SVM_EXIT_ERR, "invalid_guest_state" }
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 19c2306ac02d..34821da5f05e 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -23,6 +23,9 @@
*/
static u16 ghcb_version __section(".data..ro_after_init");
+/* Bitmap of SEV features supported by the hypervisor */
+u64 sev_hv_features __section(".data..ro_after_init") = 0;
+
static bool __init sev_es_check_cpu_features(void)
{
if (!has_cpuflag(X86_FEATURE_RDRAND)) {
@@ -51,6 +54,22 @@ static void __noreturn sev_es_terminate(unsigned int reason)
asm volatile("hlt\n" : : : "memory");
}
+static bool get_hv_features(void)
+{
+ u64 val;
+
+ sev_es_wr_ghcb_msr(GHCB_MSR_HV_FT_REQ);
+ VMGEXIT();
+
+ val = sev_es_rd_ghcb_msr();
+ if (GHCB_RESP_CODE(val) != GHCB_MSR_HV_FT_RESP)
+ return false;
+
+ sev_hv_features = GHCB_MSR_HV_FT_RESP_VAL(val);
+
+ return true;
+}
+
static bool sev_es_negotiate_protocol(void)
{
u64 val;
@@ -69,6 +88,10 @@ static bool sev_es_negotiate_protocol(void)
ghcb_version = min_t(size_t, GHCB_MSR_PROTO_MAX(val), GHCB_PROTOCOL_MAX);
+ /* The hypervisor features are available from version 2 onward. */
+ if ((ghcb_version >= 2) && !get_hv_features())
+ return false;
+
return true;
}
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 66b7f63ad041..540b81ac54c9 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -96,6 +96,9 @@ struct ghcb_state {
static DEFINE_PER_CPU(struct sev_es_runtime_data*, runtime_data);
DEFINE_STATIC_KEY_FALSE(sev_es_enable_key);
+/* Bitmap of SEV features supported by the hypervisor */
+EXPORT_SYMBOL(sev_hv_features);
+
/* Needed in vc_early_forward_exception */
void do_early_exception(struct pt_regs *regs, int trapnr);
--
2.17.1
The GHCB specification defines the reason codes for reason set 0. The reason
codes defined in set 0 do not cover all possible causes for a guest to
request termination.
Reason sets 1 to 255 are reserved for vendor-specific codes. Reserve reason
set 1 for the Linux guest and define error codes for it.
While at it, change sev_es_terminate() to accept the reason set as a
parameter.
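The termination request packs the reason set and reason code into the GHCB
MSR value, as the GHCB_SEV_TERM_REASON() macro does in this series. A
standalone model of that encoding (field positions for the reason code
follow the GHCB spec; the helper name is invented):

```c
#include <stdint.h>

#define GHCB_MSR_TERM_REQ		0x100ULL
#define GHCB_MSR_TERM_REASON_SET_POS	12
#define GHCB_MSR_TERM_REASON_SET_MASK	0xfULL
#define GHCB_MSR_TERM_REASON_POS	16
#define GHCB_MSR_TERM_REASON_MASK	0xffULL

/* Build the MSR value for a termination request with a reason set/code. */
static uint64_t term_msr_val(uint64_t set, uint64_t reason)
{
	return GHCB_MSR_TERM_REQ |
	       ((set & GHCB_MSR_TERM_REASON_SET_MASK) << GHCB_MSR_TERM_REASON_SET_POS) |
	       ((reason & GHCB_MSR_TERM_REASON_MASK) << GHCB_MSR_TERM_REASON_POS);
}
```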
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/boot/compressed/sev.c | 6 +++---
arch/x86/include/asm/sev-common.h | 8 ++++++++
arch/x86/kernel/sev-shared.c | 11 ++++-------
arch/x86/kernel/sev.c | 4 ++--
4 files changed, 17 insertions(+), 12 deletions(-)
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 28bcf04c022e..7760959fe96d 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -122,7 +122,7 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
static bool early_setup_sev_es(void)
{
if (!sev_es_negotiate_protocol())
- sev_es_terminate(GHCB_SEV_ES_PROT_UNSUPPORTED);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_PROT_UNSUPPORTED);
if (set_page_decrypted((unsigned long)&boot_ghcb_page))
return false;
@@ -175,7 +175,7 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
enum es_result result;
if (!boot_ghcb && !early_setup_sev_es())
- sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
vc_ghcb_invalidate(boot_ghcb);
result = vc_init_em_ctxt(&ctxt, regs, exit_code);
@@ -202,5 +202,5 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
if (result == ES_OK)
vc_finish_insn(&ctxt);
else if (result != ES_RETRY)
- sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
}
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 23929a3010df..e75e29c05f59 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -63,9 +63,17 @@
(((((u64)reason_set) & GHCB_MSR_TERM_REASON_SET_MASK) << GHCB_MSR_TERM_REASON_SET_POS) | \
((((u64)reason_val) & GHCB_MSR_TERM_REASON_MASK) << GHCB_MSR_TERM_REASON_POS))
+/* Error code from reason set 0 */
+#define SEV_TERM_SET_GEN 0
#define GHCB_SEV_ES_GEN_REQ 0
#define GHCB_SEV_ES_PROT_UNSUPPORTED 1
#define GHCB_RESP_CODE(v) ((v) & GHCB_MSR_INFO_MASK)
+/* Linux specific reason codes (used with reason set 1) */
+#define SEV_TERM_SET_LINUX 1
+#define GHCB_TERM_REGISTER 0 /* GHCB GPA registration failure */
+#define GHCB_TERM_PSC 1 /* Page State Change failure */
+#define GHCB_TERM_PVALIDATE 2 /* Pvalidate failure */
+
#endif
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 34821da5f05e..c54be2698df0 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -36,15 +36,12 @@ static bool __init sev_es_check_cpu_features(void)
return true;
}
-static void __noreturn sev_es_terminate(unsigned int reason)
+static void __noreturn sev_es_terminate(unsigned int set, unsigned int reason)
{
u64 val = GHCB_MSR_TERM_REQ;
- /*
- * Tell the hypervisor what went wrong - only reason-set 0 is
- * currently supported.
- */
- val |= GHCB_SEV_TERM_REASON(0, reason);
+ /* Tell the hypervisor what went wrong. */
+ val |= GHCB_SEV_TERM_REASON(set, reason);
/* Request Guest Termination from Hypvervisor */
sev_es_wr_ghcb_msr(val);
@@ -242,7 +239,7 @@ void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
fail:
/* Terminate the guest */
- sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
}
static enum es_result vc_insn_string_read(struct es_em_ctxt *ctxt,
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 540b81ac54c9..0245fe986615 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1432,7 +1432,7 @@ DEFINE_IDTENTRY_VC_KERNEL(exc_vmm_communication)
show_regs(regs);
/* Ask hypervisor to sev_es_terminate */
- sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
/* If that fails and we get here - just panic */
panic("Returned from Terminate-Request to Hypervisor\n");
@@ -1480,7 +1480,7 @@ bool __init handle_vc_boot_ghcb(struct pt_regs *regs)
/* Do initial setup or terminate the guest */
if (unlikely(boot_ghcb == NULL && !sev_es_setup_ghcb()))
- sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
vc_ghcb_invalidate(boot_ghcb);
--
2.17.1
The sev_feature_enabled() helper can be used by the guest to query whether
the SEV-SNP (Secure Nested Paging) feature is active.
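The helper is a straightforward decode of the MSR_AMD64_SEV status bits, as
the patch below shows. A userspace model of the same logic, with sev_status
passed in explicitly rather than read from the MSR:

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_AMD64_SEV_ENABLED		(1ULL << 0)
#define MSR_AMD64_SEV_ES_ENABLED	(1ULL << 1)
#define MSR_AMD64_SEV_SNP_ENABLED	(1ULL << 2)

enum sev_feature_type { SEV, SEV_ES, SEV_SNP };

/* Decode which SEV feature is active from a cached MSR_AMD64_SEV value. */
static bool feature_enabled(uint64_t sev_status, enum sev_feature_type t)
{
	switch (t) {
	case SEV:	return sev_status & MSR_AMD64_SEV_ENABLED;
	case SEV_ES:	return sev_status & MSR_AMD64_SEV_ES_ENABLED;
	case SEV_SNP:	return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
	default:	return false;
	}
}
```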
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/mem_encrypt.h | 8 ++++++++
arch/x86/include/asm/msr-index.h | 2 ++
arch/x86/mm/mem_encrypt.c | 14 ++++++++++++++
3 files changed, 24 insertions(+)
diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 8cc2fd308f65..fb857f2e72cb 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -16,6 +16,12 @@
#include <asm/bootparam.h>
+enum sev_feature_type {
+ SEV,
+ SEV_ES,
+ SEV_SNP
+};
+
#ifdef CONFIG_AMD_MEM_ENCRYPT
extern u64 sme_me_mask;
@@ -54,6 +60,7 @@ void __init sev_es_init_vc_handling(void);
bool sme_active(void);
bool sev_active(void);
bool sev_es_active(void);
+bool sev_feature_enabled(unsigned int feature_type);
#define __bss_decrypted __section(".bss..decrypted")
@@ -87,6 +94,7 @@ static inline int __init
early_set_memory_encrypted(unsigned long vaddr, unsigned long size) { return 0; }
static inline void mem_encrypt_free_decrypted_mem(void) { }
+static bool sev_feature_enabled(unsigned int feature_type) { return false; }
#define __bss_decrypted
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index a7c413432b33..37589da0282e 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -481,8 +481,10 @@
#define MSR_AMD64_SEV 0xc0010131
#define MSR_AMD64_SEV_ENABLED_BIT 0
#define MSR_AMD64_SEV_ES_ENABLED_BIT 1
+#define MSR_AMD64_SEV_SNP_ENABLED_BIT 2
#define MSR_AMD64_SEV_ENABLED BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
#define MSR_AMD64_SEV_ES_ENABLED BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT)
+#define MSR_AMD64_SEV_SNP_ENABLED BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)
#define MSR_AMD64_VIRT_SPEC_CTRL 0xc001011f
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index ff08dc463634..63e7799a9a86 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -389,6 +389,16 @@ bool noinstr sev_es_active(void)
return sev_status & MSR_AMD64_SEV_ES_ENABLED;
}
+bool sev_feature_enabled(unsigned int type)
+{
+ switch (type) {
+ case SEV: return sev_status & MSR_AMD64_SEV_ENABLED;
+ case SEV_ES: return sev_status & MSR_AMD64_SEV_ES_ENABLED;
+ case SEV_SNP: return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
+ default: return false;
+ }
+}
+
/* Override for DMA direct allocation check - ARCH_HAS_FORCE_DMA_UNENCRYPTED */
bool force_dma_unencrypted(struct device *dev)
{
@@ -461,6 +471,10 @@ static void print_mem_encrypt_feature_info(void)
if (sev_es_active())
pr_cont(" SEV-ES");
+ /* Secure Nested Paging */
+ if (sev_feature_enabled(SEV_SNP))
+ pr_cont(" SEV-SNP");
+
pr_cont("\n");
}
--
2.17.1
Version 2 of the GHCB specification added the advertisement of features
that are supported by the hypervisor. If the hypervisor supports SEV-SNP,
it must set the SEV-SNP feature bit to indicate that the base SEV-SNP
functionality is supported.
Check the SEV-SNP feature while establishing the GHCB; if the check fails,
terminate the guest.
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/boot/compressed/sev.c | 26 ++++++++++++++++++++++++--
arch/x86/include/asm/sev-common.h | 3 +++
arch/x86/kernel/sev.c | 8 ++++++--
3 files changed, 33 insertions(+), 4 deletions(-)
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 7760959fe96d..7be325d9b09f 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -25,6 +25,7 @@
struct ghcb boot_ghcb_page __aligned(PAGE_SIZE);
struct ghcb *boot_ghcb;
+static u64 msr_sev_status;
/*
* Copy a version of this function here - insn-eval.c can't be used in
@@ -119,11 +120,32 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
/* Include code for early handlers */
#include "../../kernel/sev-shared.c"
-static bool early_setup_sev_es(void)
+static inline bool sev_snp_enabled(void)
+{
+ unsigned long low, high;
+
+ if (!msr_sev_status) {
+ asm volatile("rdmsr\n"
+ : "=a" (low), "=d" (high)
+ : "c" (MSR_AMD64_SEV));
+ msr_sev_status = (high << 32) | low;
+ }
+
+ return msr_sev_status & MSR_AMD64_SEV_SNP_ENABLED;
+}
+
+static bool do_early_sev_setup(void)
{
if (!sev_es_negotiate_protocol())
sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_PROT_UNSUPPORTED);
+ /*
+ * If SEV-SNP is enabled, then check if the hypervisor supports the SEV-SNP
+ * features.
+ */
+ if (sev_snp_enabled() && !(sev_hv_features & GHCB_HV_FT_SNP))
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+
if (set_page_decrypted((unsigned long)&boot_ghcb_page))
return false;
@@ -174,7 +196,7 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
struct es_em_ctxt ctxt;
enum es_result result;
- if (!boot_ghcb && !early_setup_sev_es())
+ if (!boot_ghcb && !do_early_sev_setup())
sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
vc_ghcb_invalidate(boot_ghcb);
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index e75e29c05f59..8abc5eb7ca3d 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -54,6 +54,8 @@
#define GHCB_MSR_HV_FT_RESP_VAL(v) \
(((unsigned long)((v) >> GHCB_MSR_HV_FT_POS) & GHCB_MSR_HV_FT_MASK))
+#define GHCB_HV_FT_SNP BIT_ULL(0)
+
#define GHCB_MSR_TERM_REQ 0x100
#define GHCB_MSR_TERM_REASON_SET_POS 12
#define GHCB_MSR_TERM_REASON_SET_MASK 0xf
@@ -67,6 +69,7 @@
#define SEV_TERM_SET_GEN 0
#define GHCB_SEV_ES_GEN_REQ 0
#define GHCB_SEV_ES_PROT_UNSUPPORTED 1
+#define GHCB_SNP_UNSUPPORTED 2
#define GHCB_RESP_CODE(v) ((v) & GHCB_MSR_INFO_MASK)
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 0245fe986615..504169f1966b 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -665,12 +665,16 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
* This function runs on the first #VC exception after the kernel
* switched to virtual addresses.
*/
-static bool __init sev_es_setup_ghcb(void)
+static bool __init setup_ghcb(void)
{
/* First make sure the hypervisor talks a supported protocol. */
if (!sev_es_negotiate_protocol())
return false;
+ /* If SNP is active, make sure that hypervisor supports the feature. */
+ if (sev_feature_enabled(SEV_SNP) && !(sev_hv_features & GHCB_HV_FT_SNP))
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+
/*
* Clear the boot_ghcb. The first exception comes in before the bss
* section is cleared.
@@ -1479,7 +1483,7 @@ bool __init handle_vc_boot_ghcb(struct pt_regs *regs)
enum es_result result;
/* Do initial setup or terminate the guest */
- if (unlikely(boot_ghcb == NULL && !sev_es_setup_ghcb()))
+ if (unlikely(boot_ghcb == NULL && !setup_ghcb()))
sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
vc_ghcb_invalidate(boot_ghcb);
--
2.17.1
An SNP-active guest uses the PVALIDATE instruction to validate or
rescind the validation of a guest page's RMP entry. Upon completion,
a return code is stored in EAX and rFLAGS bits are set based on the
return code. If the instruction completed successfully, the carry flag (CF)
indicates whether the contents of the RMP entry were changed or not.
See AMD APM Volume 3 for additional details.
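That gives callers three outcomes to distinguish: the RMP entry was updated,
it was already in the requested state (CF=1, mapped to the software-defined
PVALIDATE_FAIL_NOUPDATE below), or a hard failure. A hedged sketch of such a
classifier (the enum and helper are invented for illustration, not part of
this series):

```c
#define PVALIDATE_FAIL_NOUPDATE	255	/* software-defined: CF=1, no RMP change */

enum pv_outcome { PV_UPDATED, PV_ALREADY_DONE, PV_ERROR };

/* Classify the return value of the pvalidate() helper. */
static enum pv_outcome classify_pvalidate(int rc)
{
	if (rc == 0)
		return PV_UPDATED;
	if (rc == PVALIDATE_FAIL_NOUPDATE)
		return PV_ALREADY_DONE;
	return PV_ERROR;	/* e.g. FAIL_INPUT or FAIL_SIZEMISMATCH */
}
```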
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/sev.h | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 134a7c9d91b6..b308815a2c01 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -59,6 +59,9 @@ extern void vc_no_ghcb(void);
extern void vc_boot_ghcb(void);
extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
+/* Software defined (when rFlags.CF = 1) */
+#define PVALIDATE_FAIL_NOUPDATE 255
+
#ifdef CONFIG_AMD_MEM_ENCRYPT
extern struct static_key_false sev_es_enable_key;
extern void __sev_es_ist_enter(struct pt_regs *regs);
@@ -81,12 +84,30 @@ static __always_inline void sev_es_nmi_complete(void)
__sev_es_nmi_complete();
}
extern int __init sev_es_efi_map_ghcbs(pgd_t *pgd);
+static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
+{
+ bool no_rmpupdate;
+ int rc;
+
+ /* "pvalidate" mnemonic support in binutils 2.36 and newer */
+ asm volatile(".byte 0xF2, 0x0F, 0x01, 0xFF\n\t"
+ CC_SET(c)
+ : CC_OUT(c) (no_rmpupdate), "=a"(rc)
+ : "a"(vaddr), "c"(rmp_psize), "d"(validate)
+ : "memory", "cc");
+
+ if (no_rmpupdate)
+ return PVALIDATE_FAIL_NOUPDATE;
+
+ return rc;
+}
#else
static inline void sev_es_ist_enter(struct pt_regs *regs) { }
static inline void sev_es_ist_exit(void) { }
static inline int sev_es_setup_ap_jump_table(struct real_mode_header *rmh) { return 0; }
static inline void sev_es_nmi_complete(void) { }
static inline int sev_es_efi_map_ghcbs(pgd_t *pgd) { return 0; }
+static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) { return 0; }
#endif
#endif
--
2.17.1
Virtual Machine Privilege Level (VMPL) is an optional feature in the
SEV-SNP architecture, which allows a guest VM to divide its address space
into four levels. The levels can be used to provide hardware-isolated
abstraction layers within a VM. VMPL0 is the highest privilege level, and
VMPL3 is the least privileged. Certain operations must be done by the
VMPL0 software, such as:
* Validate or invalidate memory range (PVALIDATE instruction)
* Allocate VMSA page (RMPADJUST instruction when VMSA=1)
The initial SEV-SNP support assumes that the guest kernel is running on
VMPL0. Add a check to make sure that the kernel is running at VMPL0
before continuing the boot. There is no easy method to query the current
VMPL level, so use the RMPADJUST instruction to determine whether the
guest is booted at VMPL0.
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/boot/compressed/sev.c | 41 ++++++++++++++++++++++++++++---
arch/x86/include/asm/sev-common.h | 1 +
arch/x86/include/asm/sev.h | 3 +++
3 files changed, 42 insertions(+), 3 deletions(-)
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 7be325d9b09f..2f3081e9c78c 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -134,6 +134,36 @@ static inline bool sev_snp_enabled(void)
return msr_sev_status & MSR_AMD64_SEV_SNP_ENABLED;
}
+static bool is_vmpl0(void)
+{
+ u64 attrs, va;
+ int err;
+
+ /*
+ * There is no straightforward way to query the current VMPL level. The
+ * simplest method is to use the RMPADJUST instruction to change a page
+ * permission to a VMPL level-1, and if the guest kernel is launched at
+ * a level other than VMPL0, then the RMPADJUST instruction will return an error.
+ */
+ attrs = 1;
+
+ /*
+ * Any page-aligned virtual address is sufficient to test the VMPL level.
+ * The boot_ghcb_page is page-aligned memory, so use it for the test.
+ */
+ va = (u64)&boot_ghcb_page;
+
+ /* Instruction mnemonic supported in binutils versions v2.36 and later */
+ asm volatile (".byte 0xf3,0x0f,0x01,0xfe\n\t"
+ : "=a" (err)
+ : "a" (va), "c" (RMP_PG_SIZE_4K), "d" (attrs)
+ : "memory", "cc");
+ if (err)
+ return false;
+
+ return true;
+}
+
static bool do_early_sev_setup(void)
{
if (!sev_es_negotiate_protocol())
@@ -141,10 +171,15 @@ static bool do_early_sev_setup(void)
/*
* If SEV-SNP is enabled, then check if the hypervisor supports the SEV-SNP
- * features.
+ * features and that the guest is launched at VMPL0.
*/
- if (sev_snp_enabled() && !(sev_hv_features & GHCB_HV_FT_SNP))
- sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+ if (sev_snp_enabled()) {
+ if (!(sev_hv_features & GHCB_HV_FT_SNP))
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+
+ if (!is_vmpl0())
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_NOT_VMPL0);
+ }
if (set_page_decrypted((unsigned long)&boot_ghcb_page))
return false;
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 8abc5eb7ca3d..ea508835ab33 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -78,5 +78,6 @@
#define GHCB_TERM_REGISTER 0 /* GHCB GPA registration failure */
#define GHCB_TERM_PSC 1 /* Page State Change failure */
#define GHCB_TERM_PVALIDATE 2 /* Pvalidate failure */
+#define GHCB_TERM_NOT_VMPL0 3 /* SNP guest is not running at VMPL-0 */
#endif
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index b308815a2c01..242af1154e49 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -62,6 +62,9 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
/* Software defined (when rFlags.CF = 1) */
#define PVALIDATE_FAIL_NOUPDATE 255
+/* RMP page size */
+#define RMP_PG_SIZE_4K 0
+
#ifdef CONFIG_AMD_MEM_ENCRYPT
extern struct static_key_false sev_es_enable_key;
extern void __sev_es_ist_enter(struct pt_regs *regs);
--
2.17.1
The SEV-SNP guest is required to perform GHCB GPA registration. This is
because the hypervisor may prefer that a guest use a consistent and/or
specific GPA for the GHCB associated with a vCPU. For more information,
see the GHCB specification.
If the hypervisor cannot work with the guest-provided GPA, then terminate
the guest boot.
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/boot/compressed/sev.c | 4 ++++
arch/x86/include/asm/sev-common.h | 11 +++++++++++
arch/x86/kernel/sev-shared.c | 16 ++++++++++++++++
3 files changed, 31 insertions(+)
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index f386d45a57b6..d4cbadf80838 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -233,6 +233,10 @@ static bool do_early_sev_setup(void)
/* Initialize lookup tables for the instruction decoder */
inat_init_tables();
+ /* The SEV-SNP guest requires that the GHCB GPA be registered */
+ if (sev_snp_enabled())
+ snp_register_ghcb_early(__pa(&boot_ghcb_page));
+
return true;
}
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index aee07d1bb138..b19d8d301f5d 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -45,6 +45,17 @@
(((unsigned long)reg & GHCB_MSR_CPUID_REG_MASK) << GHCB_MSR_CPUID_REG_POS) | \
(((unsigned long)fn) << GHCB_MSR_CPUID_FUNC_POS))
+/* GHCB GPA Register */
+#define GHCB_MSR_GPA_REG_REQ 0x012
+#define GHCB_MSR_GPA_REG_VALUE_POS 12
+#define GHCB_MSR_GPA_REG_GFN_MASK GENMASK_ULL(51, 0)
+#define GHCB_MSR_GPA_REQ_GFN_VAL(v) \
(((unsigned long)((v) & GHCB_MSR_GPA_REG_GFN_MASK) << GHCB_MSR_GPA_REG_VALUE_POS) | \
+ GHCB_MSR_GPA_REG_REQ)
+
+#define GHCB_MSR_GPA_REG_RESP 0x013
+#define GHCB_MSR_GPA_REG_RESP_VAL(v) ((v) >> GHCB_MSR_GPA_REG_VALUE_POS)
+
/* SNP Page State Change */
#define GHCB_MSR_PSC_REQ 0x014
#define SNP_PAGE_STATE_PRIVATE 1
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index c54be2698df0..be4025f14b4f 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -67,6 +67,22 @@ static bool get_hv_features(void)
return true;
}
+static void snp_register_ghcb_early(unsigned long paddr)
+{
+ unsigned long pfn = paddr >> PAGE_SHIFT;
+ u64 val;
+
+ sev_es_wr_ghcb_msr(GHCB_MSR_GPA_REQ_GFN_VAL(pfn));
+ VMGEXIT();
+
+ val = sev_es_rd_ghcb_msr();
+
+ /* If the response GPA is not ours then abort the guest */
+ if ((GHCB_RESP_CODE(val) != GHCB_MSR_GPA_REG_RESP) ||
+ (GHCB_MSR_GPA_REG_RESP_VAL(val) != pfn))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_REGISTER);
+}
+
static bool sev_es_negotiate_protocol(void)
{
u64 val;
--
2.17.1
Many of the integrity guarantees of SEV-SNP are enforced through the
Reverse Map Table (RMP). Each RMP entry contains the GPA at which a
particular page of DRAM should be mapped. The VMs can request the
hypervisor to add pages in the RMP table via the Page State Change VMGEXIT
defined in the GHCB specification. Inside each RMP entry is a Validated
flag; this flag is automatically cleared to 0 by the CPU hardware when a
new RMP entry is created for a guest. Each VM page can be either
validated or invalidated, as indicated by the Validated flag in the RMP
entry. Memory access to a private page that is not validated generates
a #VC exception. A VM must use the PVALIDATE instruction to validate a
private page before using it.
To maintain the security guarantee of SEV-SNP guests, when transitioning
pages from private to shared, the guest must invalidate the pages before
asking the hypervisor to change the page state to shared in the RMP table.
After the pages are mapped private in the page table, the guest must issue
a page state change VMGEXIT to make the pages private in the RMP table and
validate them.
On boot, the BIOS should have validated the entire system memory. During
the kernel decompression stage, the VC handler uses set_page_decrypted()
to make the GHCB page shared (i.e., clear the encryption attribute). And
while exiting from the decompression, it calls set_page_encrypted() to
make the page private again.
Add snp_set_page_{private,shared}() helpers that are used by
set_clr_page_flags() to change the page state in the RMP table.
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/boot/compressed/ident_map_64.c | 17 +++++++++-
arch/x86/boot/compressed/misc.h | 6 ++++
arch/x86/boot/compressed/sev.c | 41 +++++++++++++++++++++++++
arch/x86/include/asm/sev-common.h | 17 ++++++++++
4 files changed, 80 insertions(+), 1 deletion(-)
diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
index f7213d0943b8..59befc610993 100644
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -274,16 +274,31 @@ static int set_clr_page_flags(struct x86_mapping_info *info,
/*
* Changing encryption attributes of a page requires to flush it from
* the caches.
+ *
+ * If the encryption attribute is being cleared, then change the page
+ * state to shared in the RMP table.
*/
- if ((set | clr) & _PAGE_ENC)
+ if ((set | clr) & _PAGE_ENC) {
clflush_page(address);
+ if (clr)
+ snp_set_page_shared(pte_pfn(*ptep) << PAGE_SHIFT);
+ }
+
/* Update PTE */
pte = *ptep;
pte = pte_set_flags(pte, set);
pte = pte_clear_flags(pte, clr);
set_pte(ptep, pte);
+ /*
+ * If the encryption attribute is being set, then change the page state
+ * to private in the RMP entry. The page state change must be done after
+ * the PTE is updated.
+ */
+ if (set & _PAGE_ENC)
+ snp_set_page_private(pte_pfn(*ptep) << PAGE_SHIFT);
+
/* Flush TLB after changing encryption attribute */
write_cr3(top_level_pgt);
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 31139256859f..822e0c254b9a 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -121,12 +121,18 @@ void set_sev_encryption_mask(void);
#ifdef CONFIG_AMD_MEM_ENCRYPT
void sev_es_shutdown_ghcb(void);
extern bool sev_es_check_ghcb_fault(unsigned long address);
+void snp_set_page_private(unsigned long paddr);
+void snp_set_page_shared(unsigned long paddr);
+
#else
static inline void sev_es_shutdown_ghcb(void) { }
static inline bool sev_es_check_ghcb_fault(unsigned long address)
{
return false;
}
+static inline void snp_set_page_private(unsigned long paddr) { }
+static inline void snp_set_page_shared(unsigned long paddr) { }
+
#endif
/* acpi.c */
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 2f3081e9c78c..f386d45a57b6 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -164,6 +164,47 @@ static bool is_vmpl0(void)
return true;
}
+static void __page_state_change(unsigned long paddr, int op)
+{
+ u64 val;
+
+ if (!sev_snp_enabled())
+ return;
+
+ /*
+ * If private -> shared then invalidate the page before requesting the
+ * state change in the RMP table.
+ */
+ if ((op == SNP_PAGE_STATE_SHARED) && pvalidate(paddr, RMP_PG_SIZE_4K, 0))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);
+
+ /* Issue VMGEXIT to change the page state in RMP table. */
+ sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op));
+ VMGEXIT();
+
+ /* Read the response of the VMGEXIT. */
+ val = sev_es_rd_ghcb_msr();
+ if ((GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP) || GHCB_MSR_PSC_RESP_VAL(val))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
+
+ /*
+ * Now that page is added in the RMP table, validate it so that it is
+ * consistent with the RMP entry.
+ */
+ if ((op == SNP_PAGE_STATE_PRIVATE) && pvalidate(paddr, RMP_PG_SIZE_4K, 1))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);
+}
+
+void snp_set_page_private(unsigned long paddr)
+{
+ __page_state_change(paddr, SNP_PAGE_STATE_PRIVATE);
+}
+
+void snp_set_page_shared(unsigned long paddr)
+{
+ __page_state_change(paddr, SNP_PAGE_STATE_SHARED);
+}
+
static bool do_early_sev_setup(void)
{
if (!sev_es_negotiate_protocol())
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index ea508835ab33..aee07d1bb138 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -45,6 +45,23 @@
(((unsigned long)reg & GHCB_MSR_CPUID_REG_MASK) << GHCB_MSR_CPUID_REG_POS) | \
(((unsigned long)fn) << GHCB_MSR_CPUID_FUNC_POS))
+/* SNP Page State Change */
+#define GHCB_MSR_PSC_REQ 0x014
+#define SNP_PAGE_STATE_PRIVATE 1
+#define SNP_PAGE_STATE_SHARED 2
+#define GHCB_MSR_PSC_GFN_POS 12
+#define GHCB_MSR_PSC_GFN_MASK GENMASK_ULL(39, 0)
+#define GHCB_MSR_PSC_OP_POS 52
+#define GHCB_MSR_PSC_OP_MASK 0xf
+#define GHCB_MSR_PSC_REQ_GFN(gfn, op) \
+ (((unsigned long)((op) & GHCB_MSR_PSC_OP_MASK) << GHCB_MSR_PSC_OP_POS) | \
+ ((unsigned long)((gfn) & GHCB_MSR_PSC_GFN_MASK) << GHCB_MSR_PSC_GFN_POS) | \
+ GHCB_MSR_PSC_REQ)
+
+#define GHCB_MSR_PSC_RESP 0x015
+#define GHCB_MSR_PSC_ERROR_POS 32
+#define GHCB_MSR_PSC_RESP_VAL(val) ((val) >> GHCB_MSR_PSC_ERROR_POS)
+
/* GHCB Hypervisor Feature Request */
#define GHCB_MSR_HV_FT_REQ 0x080
#define GHCB_MSR_HV_FT_RESP 0x081
--
2.17.1
The encryption attribute for the bss.decrypted region is cleared in the
initial page table build. This is because the section contains data that
needs to be shared between the guest and the hypervisor.
When SEV-SNP is active, just clearing the encryption attribute in the
page table is not enough. The page state also needs to be updated in the
RMP table.
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/kernel/head64.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index de01903c3735..f4c3e632345a 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -288,7 +288,14 @@ unsigned long __head __startup_64(unsigned long physaddr,
if (mem_encrypt_active()) {
vaddr = (unsigned long)__start_bss_decrypted;
vaddr_end = (unsigned long)__end_bss_decrypted;
+
for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
+ /*
+ * When SEV-SNP is active, transition the page to shared in the RMP
+ * table so that it is consistent with the page table attribute change.
+ */
+ early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);
+
i = pmd_index(vaddr);
pmd[i] -= sme_get_me_mask();
}
--
2.17.1
The SEV-SNP guest is required to perform GHCB GPA registration. This is
because the hypervisor may prefer that a guest use a consistent and/or
specific GPA for the GHCB associated with a vCPU. For more information,
see the GHCB specification section GHCB GPA Registration.
During boot, init_ghcb() allocates a per-CPU GHCB page. On the very first
VC exception, the exception handler switches to using the per-CPU GHCB
page allocated during init_ghcb(). The GHCB page must be registered in
the current vCPU context.
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/kernel/sev-internal.h | 12 ++++++++++++
arch/x86/kernel/sev.c | 29 +++++++++++++++++++++++++++++
2 files changed, 41 insertions(+)
create mode 100644 arch/x86/kernel/sev-internal.h
diff --git a/arch/x86/kernel/sev-internal.h b/arch/x86/kernel/sev-internal.h
new file mode 100644
index 000000000000..0fb7324803b4
--- /dev/null
+++ b/arch/x86/kernel/sev-internal.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Forward declarations for sev-shared.c
+ *
+ * Author: Brijesh Singh <[email protected]>
+ */
+
+#ifndef __X86_SEV_INTERNAL_H__
+#define __X86_SEV_INTERNAL_H__
+static void snp_register_ghcb_early(unsigned long paddr);
+
+#endif /* __X86_SEV_INTERNAL_H__ */
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 504169f1966b..c7ff60b55bde 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -31,6 +31,8 @@
#include <asm/smp.h>
#include <asm/cpu.h>
+#include "sev-internal.h"
+
#define DR7_RESET_VALUE 0x400
/* For early boot hypervisor communication in SEV-ES enabled guests */
@@ -87,6 +89,13 @@ struct sev_es_runtime_data {
* is currently unsupported in SEV-ES guests.
*/
unsigned long dr7;
+
+ /*
+ * SEV-SNP requires that the GHCB must be registered before using it.
+ * The flag below indicates whether the GHCB is registered; if it is
+ * not registered, then __sev_get_ghcb() will perform the registration.
+ */
+ bool snp_ghcb_registered;
};
struct ghcb_state {
@@ -194,6 +203,17 @@ void noinstr __sev_es_ist_exit(void)
this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC], *(unsigned long *)ist);
}
+static void snp_register_ghcb(struct sev_es_runtime_data *data, unsigned long paddr)
+{
+ if (data->snp_ghcb_registered)
+ return;
+
+ snp_register_ghcb_early(paddr);
+
+ data->snp_ghcb_registered = true;
+}
+
+
/*
* Nothing shall interrupt this code path while holding the per-CPU
* GHCB. The backup GHCB is only for NMIs interrupting this path.
@@ -240,6 +260,10 @@ static noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state)
data->ghcb_active = true;
}
+ /* SEV-SNP guest requires that GHCB must be registered. */
+ if (sev_feature_enabled(SEV_SNP))
+ snp_register_ghcb(data, __pa(ghcb));
+
return ghcb;
}
@@ -684,6 +708,10 @@ static bool __init setup_ghcb(void)
/* Alright - Make the boot-ghcb public */
boot_ghcb = &boot_ghcb_page;
+ /* SEV-SNP guest requires that GHCB GPA must be registered. */
+ if (sev_feature_enabled(SEV_SNP))
+ snp_register_ghcb_early(__pa(&boot_ghcb_page));
+
return true;
}
@@ -773,6 +801,7 @@ static void __init init_ghcb(int cpu)
data->ghcb_active = false;
data->backup_ghcb_active = false;
+ data->snp_ghcb_registered = false;
}
void __init sev_es_init_vc_handling(void)
--
2.17.1
From: Tom Lendacky <[email protected]>
The save area for SEV-ES/SEV-SNP guests, as used by the hardware, is
different from the save area of a non SEV-ES/SEV-SNP guest.
This is the first step in defining the multiple save areas to keep them
separate and ensuring proper operation amongst the different types of
guests. Create an SEV-ES/SEV-SNP save area and adjust usage to the new
save area definition where needed.
Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/svm.h | 83 +++++++++++++++++++++++++++++---------
arch/x86/kvm/svm/sev.c | 24 +++++------
arch/x86/kvm/svm/svm.h | 2 +-
3 files changed, 77 insertions(+), 32 deletions(-)
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index ff614cdcf628..a7fa24ec8ddf 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -228,6 +228,7 @@ struct vmcb_seg {
u64 base;
} __packed;
+/* Save area definition for legacy and SEV-MEM guests */
struct vmcb_save_area {
struct vmcb_seg es;
struct vmcb_seg cs;
@@ -244,8 +245,58 @@ struct vmcb_save_area {
u8 cpl;
u8 reserved_2[4];
u64 efer;
+ u8 reserved_3[112];
+ u64 cr4;
+ u64 cr3;
+ u64 cr0;
+ u64 dr7;
+ u64 dr6;
+ u64 rflags;
+ u64 rip;
+ u8 reserved_4[88];
+ u64 rsp;
+ u64 s_cet;
+ u64 ssp;
+ u64 isst_addr;
+ u64 rax;
+ u64 star;
+ u64 lstar;
+ u64 cstar;
+ u64 sfmask;
+ u64 kernel_gs_base;
+ u64 sysenter_cs;
+ u64 sysenter_esp;
+ u64 sysenter_eip;
+ u64 cr2;
+ u8 reserved_5[32];
+ u64 g_pat;
+ u64 dbgctl;
+ u64 br_from;
+ u64 br_to;
+ u64 last_excp_from;
+ u64 last_excp_to;
+ u8 reserved_6[72];
+ u32 spec_ctrl; /* Guest version of SPEC_CTRL at 0x2E0 */
+} __packed;
+
+/* Save area definition for SEV-ES and SEV-SNP guests */
+struct sev_es_save_area {
+ struct vmcb_seg es;
+ struct vmcb_seg cs;
+ struct vmcb_seg ss;
+ struct vmcb_seg ds;
+ struct vmcb_seg fs;
+ struct vmcb_seg gs;
+ struct vmcb_seg gdtr;
+ struct vmcb_seg ldtr;
+ struct vmcb_seg idtr;
+ struct vmcb_seg tr;
+ u8 reserved_1[43];
+ u8 cpl;
+ u8 reserved_2[4];
+ u64 efer;
u8 reserved_3[104];
- u64 xss; /* Valid for SEV-ES only */
+ u64 xss;
u64 cr4;
u64 cr3;
u64 cr0;
@@ -273,22 +324,14 @@ struct vmcb_save_area {
u64 br_to;
u64 last_excp_from;
u64 last_excp_to;
-
- /*
- * The following part of the save area is valid only for
- * SEV-ES guests when referenced through the GHCB or for
- * saving to the host save area.
- */
- u8 reserved_7[72];
- u32 spec_ctrl; /* Guest version of SPEC_CTRL at 0x2E0 */
- u8 reserved_7b[4];
+ u8 reserved_7[80];
u32 pkru;
- u8 reserved_7a[20];
- u64 reserved_8; /* rax already available at 0x01f8 */
+ u8 reserved_9[20];
+ u64 reserved_10; /* rax already available at 0x01f8 */
u64 rcx;
u64 rdx;
u64 rbx;
- u64 reserved_9; /* rsp already available at 0x01d8 */
+ u64 reserved_11; /* rsp already available at 0x01d8 */
u64 rbp;
u64 rsi;
u64 rdi;
@@ -300,21 +343,21 @@ struct vmcb_save_area {
u64 r13;
u64 r14;
u64 r15;
- u8 reserved_10[16];
+ u8 reserved_12[16];
u64 sw_exit_code;
u64 sw_exit_info_1;
u64 sw_exit_info_2;
u64 sw_scratch;
u64 sev_features;
- u8 reserved_11[48];
+ u8 reserved_13[48];
u64 xcr0;
u8 valid_bitmap[16];
u64 x87_state_gpa;
} __packed;
struct ghcb {
- struct vmcb_save_area save;
- u8 reserved_save[2048 - sizeof(struct vmcb_save_area)];
+ struct sev_es_save_area save;
+ u8 reserved_save[2048 - sizeof(struct sev_es_save_area)];
u8 shared_buffer[2032];
@@ -324,13 +367,15 @@ struct ghcb {
} __packed;
-#define EXPECTED_VMCB_SAVE_AREA_SIZE 1032
+#define EXPECTED_VMCB_SAVE_AREA_SIZE 740
+#define EXPECTED_SEV_ES_SAVE_AREA_SIZE 1032
#define EXPECTED_VMCB_CONTROL_AREA_SIZE 272
#define EXPECTED_GHCB_SIZE PAGE_SIZE
static inline void __unused_size_checks(void)
{
BUILD_BUG_ON(sizeof(struct vmcb_save_area) != EXPECTED_VMCB_SAVE_AREA_SIZE);
+ BUILD_BUG_ON(sizeof(struct sev_es_save_area) != EXPECTED_SEV_ES_SAVE_AREA_SIZE);
BUILD_BUG_ON(sizeof(struct vmcb_control_area) != EXPECTED_VMCB_CONTROL_AREA_SIZE);
BUILD_BUG_ON(sizeof(struct ghcb) != EXPECTED_GHCB_SIZE);
}
@@ -401,7 +446,7 @@ struct vmcb {
/* GHCB Accessor functions */
#define GHCB_BITMAP_IDX(field) \
- (offsetof(struct vmcb_save_area, field) / sizeof(u64))
+ (offsetof(struct sev_es_save_area, field) / sizeof(u64))
#define DEFINE_GHCB_ACCESSORS(field) \
static inline bool ghcb_##field##_is_valid(const struct ghcb *ghcb) \
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 8d36f0c73071..751a4604a51d 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -552,12 +552,20 @@ static int sev_launch_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
static int sev_es_sync_vmsa(struct vcpu_svm *svm)
{
- struct vmcb_save_area *save = &svm->vmcb->save;
+ struct sev_es_save_area *save = svm->vmsa;
/* Check some debug related fields before encrypting the VMSA */
- if (svm->vcpu.guest_debug || (save->dr7 & ~DR7_FIXED_1))
+ if (svm->vcpu.guest_debug || (svm->vmcb->save.dr7 & ~DR7_FIXED_1))
return -EINVAL;
+ /*
+ * SEV-ES will use a VMSA that is pointed to by the VMCB, not
+ * the traditional VMSA that is part of the VMCB. Copy the
+ * traditional VMSA as it has been built so far (in prep
+ * for LAUNCH_UPDATE_VMSA) to be the initial SEV-ES state.
+ */
+ memcpy(save, &svm->vmcb->save, sizeof(svm->vmcb->save));
+
/* Sync registgers */
save->rax = svm->vcpu.arch.regs[VCPU_REGS_RAX];
save->rbx = svm->vcpu.arch.regs[VCPU_REGS_RBX];
@@ -584,14 +592,6 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
save->pkru = svm->vcpu.arch.pkru;
save->xss = svm->vcpu.arch.ia32_xss;
- /*
- * SEV-ES will use a VMSA that is pointed to by the VMCB, not
- * the traditional VMSA that is part of the VMCB. Copy the
- * traditional VMSA as it has been built so far (in prep
- * for LAUNCH_UPDATE_VMSA) to be the initial SEV-ES state.
- */
- memcpy(svm->vmsa, save, sizeof(*save));
-
return 0;
}
@@ -2606,7 +2606,7 @@ void sev_es_create_vcpu(struct vcpu_svm *svm)
void sev_es_prepare_guest_switch(struct vcpu_svm *svm, unsigned int cpu)
{
struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
- struct vmcb_save_area *hostsa;
+ struct sev_es_save_area *hostsa;
/*
* As an SEV-ES guest, hardware will restore the host state on VMEXIT,
@@ -2616,7 +2616,7 @@ void sev_es_prepare_guest_switch(struct vcpu_svm *svm, unsigned int cpu)
vmsave(__sme_page_pa(sd->save_area));
/* XCR0 is restored on VMEXIT, save the current host value */
- hostsa = (struct vmcb_save_area *)(page_address(sd->save_area) + 0x400);
+ hostsa = (struct sev_es_save_area *)(page_address(sd->save_area) + 0x400);
hostsa->xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
/* PKRU is restored on VMEXIT, save the current host value */
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 2908c6ab5bb4..bb64b7f1b433 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -170,7 +170,7 @@ struct vcpu_svm {
} shadow_msr_intercept;
/* SEV-ES support */
- struct vmcb_save_area *vmsa;
+ struct sev_es_save_area *vmsa;
struct ghcb *ghcb;
struct kvm_host_map ghcb_map;
bool received_first_sipi;
--
2.17.1
The early_set_memory_{encrypted,decrypted}() functions are used for
changing a page from decrypted (shared) to encrypted (private) and vice
versa.
When SEV-SNP is active, the page state transition needs to go through
additional steps.
If the page is transitioned from shared to private, then perform the
following after the encryption attribute is set in the page table:
1. Issue the page state change VMGEXIT to add the page as private
in the RMP table.
2. Validate the page after it is successfully added in the RMP table.
To maintain the security guarantees, if the page is transitioned from
private to shared, then perform the following before clearing the
encryption attribute from the page table.
1. Invalidate the page.
2. Issue the page state change VMGEXIT to make the page shared in the
RMP table.
Because early_set_memory_{encrypted,decrypted}() can be called before the
GHCB is established, use the SNP page state MSR protocol VMGEXIT defined
in the GHCB specification to request the page state change in the RMP
table.
While at it, add a helper snp_prep_memory() that can be used outside
the SEV-specific files to change the page state for a specified memory
range.
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/sev.h | 10 ++++
arch/x86/kernel/sev.c | 99 ++++++++++++++++++++++++++++++++++++++
arch/x86/mm/mem_encrypt.c | 51 ++++++++++++++++++--
3 files changed, 156 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 242af1154e49..9a676fb0929d 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -104,6 +104,11 @@ static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
return rc;
}
+void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
+ unsigned int npages);
+void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+ unsigned int npages);
+void __init snp_prep_memory(unsigned long paddr, unsigned int sz, int op);
#else
static inline void sev_es_ist_enter(struct pt_regs *regs) { }
static inline void sev_es_ist_exit(void) { }
@@ -111,6 +116,11 @@ static inline int sev_es_setup_ap_jump_table(struct real_mode_header *rmh) { ret
static inline void sev_es_nmi_complete(void) { }
static inline int sev_es_efi_map_ghcbs(pgd_t *pgd) { return 0; }
static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) { return 0; }
+static inline void __init
+early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
+static inline void __init
+early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
+static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, int op) { }
#endif
#endif
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index c7ff60b55bde..62034879fb3f 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -595,6 +595,105 @@ static u64 get_jump_table_addr(void)
return ret;
}
+static void pvalidate_pages(unsigned long vaddr, unsigned int npages, bool validate)
+{
+ unsigned long vaddr_end;
+ int rc;
+
+ vaddr = vaddr & PAGE_MASK;
+ vaddr_end = vaddr + (npages << PAGE_SHIFT);
+
+ while (vaddr < vaddr_end) {
+ rc = pvalidate(vaddr, RMP_PG_SIZE_4K, validate);
+ if (WARN(rc, "Failed to validate address 0x%lx ret %d", vaddr, rc))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);
+
+ vaddr = vaddr + PAGE_SIZE;
+ }
+}
+
+static void __init early_set_page_state(unsigned long paddr, unsigned int npages, int op)
+{
+ unsigned long paddr_end;
+ u64 val;
+
+ paddr = paddr & PAGE_MASK;
+ paddr_end = paddr + (npages << PAGE_SHIFT);
+
+ while (paddr < paddr_end) {
+ /*
+ * Use the MSR protocol because this function can be called before the GHCB
+ * is established.
+ */
+ sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op));
+ VMGEXIT();
+
+ val = sev_es_rd_ghcb_msr();
+
+ if (WARN(GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP,
+ "Wrong PSC response code: 0x%x\n",
+ (unsigned int)GHCB_RESP_CODE(val)))
+ goto e_term;
+
+ if (WARN(GHCB_MSR_PSC_RESP_VAL(val),
+ "Failed to change page state to '%s' paddr 0x%lx error 0x%llx\n",
+ op == SNP_PAGE_STATE_PRIVATE ? "private" : "shared",
+ paddr, GHCB_MSR_PSC_RESP_VAL(val)))
+ goto e_term;
+
+ paddr = paddr + PAGE_SIZE;
+ }
+
+ return;
+
+e_term:
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
+}
+
+void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
+ unsigned int npages)
+{
+ if (!sev_feature_enabled(SEV_SNP))
+ return;
+
+ /* Ask the hypervisor to add the memory pages to the RMP table as private. */
+ early_set_page_state(paddr, npages, SNP_PAGE_STATE_PRIVATE);
+
+ /* Validate the memory pages after they've been added in the RMP table. */
+ pvalidate_pages(vaddr, npages, 1);
+}
+
+void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+ unsigned int npages)
+{
+ if (!sev_feature_enabled(SEV_SNP))
+ return;
+
+ /*
+ * Invalidate the memory pages before they are marked shared in the
+ * RMP table.
+ */
+ pvalidate_pages(vaddr, npages, 0);
+
+ /* Ask hypervisor to make the memory pages shared in the RMP table. */
+ early_set_page_state(paddr, npages, SNP_PAGE_STATE_SHARED);
+}
+
+void __init snp_prep_memory(unsigned long paddr, unsigned int sz, int op)
+{
+ unsigned long vaddr, npages;
+
+ vaddr = (unsigned long)__va(paddr);
+ npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+
+ if (op == SNP_PAGE_STATE_PRIVATE)
+ early_snp_set_memory_private(vaddr, paddr, npages);
+ else if (op == SNP_PAGE_STATE_SHARED)
+ early_snp_set_memory_shared(vaddr, paddr, npages);
+ else
+ WARN(1, "invalid memory op %d\n", op);
+}
+
int sev_es_setup_ap_jump_table(struct real_mode_header *rmh)
{
u16 startup_cs, startup_ip;
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 63e7799a9a86..d434376568de 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -30,6 +30,7 @@
#include <asm/processor-flags.h>
#include <asm/msr.h>
#include <asm/cmdline.h>
+#include <asm/sev.h>
#include "mm_internal.h"
@@ -48,6 +49,34 @@ EXPORT_SYMBOL_GPL(sev_enable_key);
/* Buffer used for early in-place encryption by BSP, no locking needed */
static char sme_early_buffer[PAGE_SIZE] __initdata __aligned(PAGE_SIZE);
+/*
+ * When SNP is active, change the page state from private to shared before
+ * copying the data from the source to destination and restore after the copy.
+ * This is required because the source address is mapped as decrypted by the
+ * caller of the routine.
+ */
+static inline void __init snp_memcpy(void *dst, void *src, size_t sz,
+ unsigned long paddr, bool decrypt)
+{
+ unsigned long npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+
+ if (!sev_feature_enabled(SEV_SNP) || !decrypt) {
+ memcpy(dst, src, sz);
+ return;
+ }
+
+ /*
+ * With SNP, the paddr needs to be accessed decrypted, mark the page
+ * shared in the RMP table before copying it.
+ */
+ early_snp_set_memory_shared((unsigned long)__va(paddr), paddr, npages);
+
+ memcpy(dst, src, sz);
+
+ /* Restore the page state after the memcpy. */
+ early_snp_set_memory_private((unsigned long)__va(paddr), paddr, npages);
+}
+
/*
* This routine does not change the underlying encryption setting of the
* page(s) that map this memory. It assumes that eventually the memory is
@@ -96,8 +125,8 @@ static void __init __sme_early_enc_dec(resource_size_t paddr,
* Use a temporary buffer, of cache-line multiple size, to
* avoid data corruption as documented in the APM.
*/
- memcpy(sme_early_buffer, src, len);
- memcpy(dst, sme_early_buffer, len);
+ snp_memcpy(sme_early_buffer, src, len, paddr, enc);
+ snp_memcpy(dst, sme_early_buffer, len, paddr, !enc);
early_memunmap(dst, len);
early_memunmap(src, len);
@@ -272,14 +301,28 @@ static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
clflush_cache_range(__va(pa), size);
/* Encrypt/decrypt the contents in-place */
- if (enc)
+ if (enc) {
sme_early_encrypt(pa, size);
- else
+ } else {
sme_early_decrypt(pa, size);
+ /*
+ * On SNP, the page state change in the RMP table must happen
+ * before the page table updates.
+ */
+ early_snp_set_memory_shared((unsigned long)__va(pa), pa, 1);
+ }
+
/* Change the page encryption mask. */
new_pte = pfn_pte(pfn, new_prot);
set_pte_atomic(kpte, new_pte);
+
+ /*
+ * If the page is mapped encrypted in the page table, then update the
+ * RMP table to make this page private.
+ */
+ if (enc)
+ early_snp_set_memory_private((unsigned long)__va(pa), pa, 1);
}
static int __init early_set_memory_enc_dec(unsigned long vaddr,
--
2.17.1
probe_roms() accesses the memory range (0xc0000 - 0x100000) to probe
various ROMs. This range is not part of the E820 system RAM, but it is
mapped as private (i.e. encrypted) in the page table.
When SEV-SNP is active, all private memory must be validated before it
is accessed. Since the ROM range is not part of the E820 map, the guest
BIOS did not validate it, and an access to unvalidated memory causes a
#VC exception. The guest does not yet support handling #VC exceptions
for unvalidated memory, so validate the ROM memory regions before they
are accessed.
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/kernel/probe_roms.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
index 9e1def3744f2..9c09df86d167 100644
--- a/arch/x86/kernel/probe_roms.c
+++ b/arch/x86/kernel/probe_roms.c
@@ -21,6 +21,7 @@
#include <asm/sections.h>
#include <asm/io.h>
#include <asm/setup_arch.h>
+#include <asm/sev.h>
static struct resource system_rom_resource = {
.name = "System ROM",
@@ -197,11 +198,21 @@ static int __init romchecksum(const unsigned char *rom, unsigned long length)
void __init probe_roms(void)
{
- const unsigned char *rom;
unsigned long start, length, upper;
+ const unsigned char *rom;
unsigned char c;
int i;
+ /*
+ * The ROM memory range is not part of the E820 system RAM and is not
+ * pre-validated by the BIOS. The kernel page table maps the ROM region
+ * as encrypted memory, and SEV-SNP requires that encrypted memory be
+ * validated before access. Validate the ROM range before accessing it.
+ */
+ snp_prep_memory(video_rom_resource.start,
+ ((system_rom_resource.end + 1) - video_rom_resource.start),
+ SNP_PAGE_STATE_PRIVATE);
+
/* video rom */
upper = adapter_rom_resources[0].start;
for (start = video_rom_resource.start; start < upper; start += 2048) {
--
2.17.1
While launching encrypted guests, the hypervisor may need to provide
some additional information during guest boot. When booting under an
EFI-based BIOS, the EFI configuration table contains an entry for the
confidential computing blob that contains the required information.
To support booting encrypted guests on non-EFI VMs, the hypervisor needs
a different method to pass this additional information to the kernel.
For this purpose, introduce the SETUP_CC_BLOB type in setup_data to hold
the physical address of the confidential computing blob. The boot loader
or hypervisor may choose to use this method instead of the EFI
configuration table. The CC blob location scanning should give
preference to the setup_data entry over the EFI configuration table.
In AMD SEV-SNP, the CC blob contains the addresses of the secrets and
CPUID pages. The secrets page includes information such as a VM-to-PSP
communication key, and the CPUID page contains PSP-filtered CPUID values.
Define the AMD SEV confidential computing blob structure.
While at it, define the EFI GUID for the confidential computing blob.
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/sev.h | 12 ++++++++++++
arch/x86/include/uapi/asm/bootparam.h | 1 +
include/linux/efi.h | 1 +
3 files changed, 14 insertions(+)
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index f68c9e2c3851..e41bd55dba5d 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -44,6 +44,18 @@ struct es_em_ctxt {
void do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code);
+/* AMD SEV Confidential computing blob structure */
+#define CC_BLOB_SEV_HDR_MAGIC 0x45444d41
+struct cc_blob_sev_info {
+ u32 magic;
+ u16 version;
+ u16 reserved;
+ u64 secrets_phys;
+ u32 secrets_len;
+ u64 cpuid_phys;
+ u32 cpuid_len;
+};
+
static inline u64 lower_bits(u64 val, unsigned int bits)
{
u64 mask = (1ULL << bits) - 1;
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index b25d3f82c2f3..1ac5acca72ce 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -10,6 +10,7 @@
#define SETUP_EFI 4
#define SETUP_APPLE_PROPERTIES 5
#define SETUP_JAILHOUSE 6
+#define SETUP_CC_BLOB 7
#define SETUP_INDIRECT (1<<31)
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 6b5d36babfcc..75aeb2a56888 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -344,6 +344,7 @@ void efi_native_runtime_setup(void);
#define EFI_CERT_SHA256_GUID EFI_GUID(0xc1c41626, 0x504c, 0x4092, 0xac, 0xa9, 0x41, 0xf9, 0x36, 0x93, 0x43, 0x28)
#define EFI_CERT_X509_GUID EFI_GUID(0xa5c059a1, 0x94e4, 0x4aa7, 0x87, 0xb5, 0xab, 0x15, 0x5c, 0x2b, 0xf0, 0x72)
#define EFI_CERT_X509_SHA256_GUID EFI_GUID(0x3bd2a492, 0x96c0, 0x4079, 0xb4, 0x20, 0xfc, 0xf9, 0x8e, 0xf1, 0x03, 0xed)
+#define EFI_CC_BLOB_GUID EFI_GUID(0x067b1f5f, 0xcf26, 0x44c5, 0x85, 0x54, 0x93, 0xd7, 0x77, 0x91, 0x2d, 0x42)
/*
* This GUID is used to pass to the kernel proper the struct screen_info
--
2.17.1
From: Tom Lendacky <[email protected]>
This is the final step in defining the multiple save areas to keep them
separate and ensuring proper operation amongst the different types of
guests. Update the SEV-ES/SEV-SNP save area to match the APM. This save
area will be used for the upcoming SEV-SNP AP Creation NAE event support.
Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/svm.h | 66 +++++++++++++++++++++++++++++---------
1 file changed, 50 insertions(+), 16 deletions(-)
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index f679018685b6..5e72faa00cf2 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -291,7 +291,13 @@ struct sev_es_save_area {
struct vmcb_seg ldtr;
struct vmcb_seg idtr;
struct vmcb_seg tr;
- u8 reserved_1[43];
+ u64 vmpl0_ssp;
+ u64 vmpl1_ssp;
+ u64 vmpl2_ssp;
+ u64 vmpl3_ssp;
+ u64 u_cet;
+ u8 reserved_1[2];
+ u8 vmpl;
u8 cpl;
u8 reserved_2[4];
u64 efer;
@@ -304,9 +310,19 @@ struct sev_es_save_area {
u64 dr6;
u64 rflags;
u64 rip;
- u8 reserved_4[88];
+ u64 dr0;
+ u64 dr1;
+ u64 dr2;
+ u64 dr3;
+ u64 dr0_addr_mask;
+ u64 dr1_addr_mask;
+ u64 dr2_addr_mask;
+ u64 dr3_addr_mask;
+ u8 reserved_4[24];
u64 rsp;
- u8 reserved_5[24];
+ u64 s_cet;
+ u64 ssp;
+ u64 isst_addr;
u64 rax;
u64 star;
u64 lstar;
@@ -317,7 +333,7 @@ struct sev_es_save_area {
u64 sysenter_esp;
u64 sysenter_eip;
u64 cr2;
- u8 reserved_6[32];
+ u8 reserved_5[32];
u64 g_pat;
u64 dbgctl;
u64 br_from;
@@ -326,12 +342,12 @@ struct sev_es_save_area {
u64 last_excp_to;
u8 reserved_7[80];
u32 pkru;
- u8 reserved_9[20];
- u64 reserved_10; /* rax already available at 0x01f8 */
+ u8 reserved_8[20];
+ u64 reserved_9; /* rax already available at 0x01f8 */
u64 rcx;
u64 rdx;
u64 rbx;
- u64 reserved_11; /* rsp already available at 0x01d8 */
+ u64 reserved_10; /* rsp already available at 0x01d8 */
u64 rbp;
u64 rsi;
u64 rdi;
@@ -343,16 +359,34 @@ struct sev_es_save_area {
u64 r13;
u64 r14;
u64 r15;
- u8 reserved_12[16];
- u64 sw_exit_code;
- u64 sw_exit_info_1;
- u64 sw_exit_info_2;
- u64 sw_scratch;
+ u8 reserved_11[16];
+ u64 guest_exit_info_1;
+ u64 guest_exit_info_2;
+ u64 guest_exit_int_info;
+ u64 guest_nrip;
u64 sev_features;
- u8 reserved_13[48];
+ u64 vintr_ctrl;
+ u64 guest_exit_code;
+ u64 virtual_tom;
+ u64 tlb_id;
+ u64 pcpu_id;
+ u64 event_inj;
u64 xcr0;
- u8 valid_bitmap[16];
- u64 x87_state_gpa;
+ u8 reserved_12[16];
+
+ /* Floating point area */
+ u64 x87_dp;
+ u32 mxcsr;
+ u16 x87_ftw;
+ u16 x87_fsw;
+ u16 x87_fcw;
+ u16 x87_fop;
+ u16 x87_ds;
+ u16 x87_cs;
+ u64 x87_rip;
+ u8 fpreg_x87[80];
+ u8 fpreg_xmm[256];
+ u8 fpreg_ymm[256];
} __packed;
struct ghcb_save_area {
@@ -409,7 +443,7 @@ struct ghcb {
#define EXPECTED_VMCB_SAVE_AREA_SIZE 740
#define EXPECTED_GHCB_SAVE_AREA_SIZE 1032
-#define EXPECTED_SEV_ES_SAVE_AREA_SIZE 1032
+#define EXPECTED_SEV_ES_SAVE_AREA_SIZE 1648
#define EXPECTED_VMCB_CONTROL_AREA_SIZE 272
#define EXPECTED_GHCB_SIZE PAGE_SIZE
--
2.17.1
From: Michael Roth <[email protected]>
When the Confidential Computing blob is located by the boot/compressed
kernel, store a pointer to it in boot_params->cc_blob_address to avoid
the need for the run-time kernel to rescan the EFI config table to find
it again.
Since this function is also shared by the run-time kernel, this patch
also adds the logic to make use of boot_params->cc_blob_address when it
has been initialized.
Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/kernel/sev-shared.c | 38 +++++++++++++++++++++++++-----------
1 file changed, 27 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 5e0e8e208a8c..23328727caf4 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -820,7 +820,6 @@ static enum es_result vc_handle_rdtsc(struct ghcb *ghcb,
return ES_OK;
}
-#ifdef BOOT_COMPRESSED
static struct setup_data *get_cc_setup_data(struct boot_params *bp)
{
struct setup_data *hdr = (struct setup_data *)bp->hdr.setup_data;
@@ -840,6 +839,16 @@ static struct setup_data *get_cc_setup_data(struct boot_params *bp)
* 1) Search for CC blob in the following order/precedence:
* - via linux boot protocol / setup_data entry
* - via EFI configuration table
+ * 2) If found, initialize boot_params->cc_blob_address to point to the
+ * blob so that the uncompressed kernel can easily access it during
+ * very early boot without the need to re-parse the EFI config table
+ * 3) Return a pointer to the CC blob, NULL otherwise.
+ *
+ * For run-time/uncompressed kernel:
+ *
+ * 1) Search for CC blob in the following order/precedence:
+ * - via linux boot protocol / setup_data entry
+ * - via boot_params->cc_blob_address
* 2) Return a pointer to the CC blob, NULL otherwise.
*/
static struct cc_blob_sev_info *sev_snp_probe_cc_blob(struct boot_params *bp)
@@ -857,27 +866,34 @@ static struct cc_blob_sev_info *sev_snp_probe_cc_blob(struct boot_params *bp)
goto out_verify;
}
+#ifdef __BOOT_COMPRESSED
/* CC blob isn't in setup_data, see if it's in the EFI config table */
(void)efi_bp_find_vendor_table(bp, EFI_CC_BLOB_GUID,
(unsigned long *)&cc_info);
+#else
+ /*
+ * CC blob isn't in setup_data, see if boot kernel passed it via
+ * boot_params.
+ */
+ if (bp->cc_blob_address)
+ cc_info = (struct cc_blob_sev_info *)(unsigned long)bp->cc_blob_address;
+#endif
out_verify:
/* CC blob should be either valid or not present. Fail otherwise. */
if (cc_info && cc_info->magic != CC_BLOB_SEV_HDR_MAGIC)
sev_es_terminate(1, GHCB_SNP_UNSUPPORTED);
+#ifdef __BOOT_COMPRESSED
+ /*
+ * Pass the run-time kernel a pointer to the CC info via boot_params
+ * for easier access during early boot.
+ */
+ bp->cc_blob_address = (u32)(unsigned long)cc_info;
+#endif
+
return cc_info;
}
-#else
-/*
- * Probing for CC blob for run-time kernel will be enabled in a subsequent
- * patch. For now we need to stub this out.
- */
-static struct cc_blob_sev_info *sev_snp_probe_cc_blob(struct boot_params *bp)
-{
- return NULL;
-}
-#endif
/*
* Initial set up of CPUID table when running identity-mapped.
--
2.17.1
From: Michael Roth <[email protected]>
The previously defined Confidential Computing blob is provided to the
kernel via a setup_data structure or an EFI config table entry.
Currently these are both checked for by the boot/compressed kernel to
access the CPUID table address within it for use with SEV-SNP CPUID
enforcement.
To also enable SEV-SNP CPUID enforcement for the run-time kernel,
similar early access to the CPUID table is needed while the kernel is
still using the identity-mapped page table set up by boot/compressed,
where global pointers need to be accessed via fixup_pointer().
This is not much of an issue for accessing setup_data, and the EFI
config table helper code currently used in boot/compressed *could* be
used in this case as well since both rely on identity-mapping. However,
it has some reliance on EFI helpers/string constants that would need to
be accessed via fixup_pointer(), and fixing that up while keeping the
code shareable between boot/compressed and the run-time kernel is
fragile and introduces a good bit of ugliness.
Instead, this patch adds a boot_params->cc_blob_address pointer that
boot/compressed can initialize so that the run-time kernel can access
the previously-located CC blob that way instead.
Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/bootparam_utils.h | 1 +
arch/x86/include/uapi/asm/bootparam.h | 3 ++-
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/bootparam_utils.h b/arch/x86/include/asm/bootparam_utils.h
index 981fe923a59f..53e9b0620d96 100644
--- a/arch/x86/include/asm/bootparam_utils.h
+++ b/arch/x86/include/asm/bootparam_utils.h
@@ -74,6 +74,7 @@ static void sanitize_boot_params(struct boot_params *boot_params)
BOOT_PARAM_PRESERVE(hdr),
BOOT_PARAM_PRESERVE(e820_table),
BOOT_PARAM_PRESERVE(eddbuf),
+ BOOT_PARAM_PRESERVE(cc_blob_address),
};
memset(&scratch, 0, sizeof(scratch));
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index 1ac5acca72ce..bea5cdcdf532 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -188,7 +188,8 @@ struct boot_params {
__u32 ext_ramdisk_image; /* 0x0c0 */
__u32 ext_ramdisk_size; /* 0x0c4 */
__u32 ext_cmd_line_ptr; /* 0x0c8 */
- __u8 _pad4[116]; /* 0x0cc */
+ __u8 _pad4[112]; /* 0x0cc */
+ __u32 cc_blob_address; /* 0x13c */
struct edid_info edid_info; /* 0x140 */
struct efi_info efi_info; /* 0x1c0 */
__u32 alt_mem_k; /* 0x1e0 */
--
2.17.1
From: Tom Lendacky <[email protected]>
To provide a more secure way to start APs under SEV-SNP, use the SEV-SNP
AP Creation NAE event. This allows for guest control over the AP register
state rather than trusting the hypervisor with the SEV-ES Jump Table
address.
During native_smp_prepare_cpus(), invoke an SEV-SNP function that, if
SEV-SNP is active, will set/override apic->wakeup_secondary_cpu. This
will allow the SEV-SNP AP Creation NAE event method to be used to boot
the APs. As a result of installing the override when SEV-SNP is active,
this method of starting the APs becomes the required method. The override
function will fail to start the AP if the hypervisor does not have
support for AP creation.
Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/sev-common.h | 1 +
arch/x86/include/asm/sev.h | 6 +
arch/x86/include/uapi/asm/svm.h | 5 +
arch/x86/kernel/sev.c | 205 ++++++++++++++++++++++++++++++
arch/x86/kernel/smpboot.c | 3 +
5 files changed, 220 insertions(+)
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 2277c8085b13..5da5f5147623 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -85,6 +85,7 @@
(((unsigned long)((v) >> GHCB_MSR_HV_FT_POS) & GHCB_MSR_HV_FT_MASK))
#define GHCB_HV_FT_SNP BIT_ULL(0)
+#define GHCB_HV_FT_SNP_AP_CREATION (BIT_ULL(1) | GHCB_HV_FT_SNP)
/* SNP Page State Change NAE event */
#define VMGEXIT_PSC_MAX_ENTRY 253
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 2385651c810e..f68c9e2c3851 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -65,6 +65,8 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
/* RMP page size */
#define RMP_PG_SIZE_4K 0
+#define RMPADJUST_VMSA_PAGE_BIT BIT(16)
+
#ifdef CONFIG_AMD_MEM_ENCRYPT
extern struct static_key_false sev_es_enable_key;
extern void __sev_es_ist_enter(struct pt_regs *regs);
@@ -111,6 +113,8 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
void __init snp_prep_memory(unsigned long paddr, unsigned int sz, int op);
void snp_set_memory_shared(unsigned long vaddr, unsigned int npages);
void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
+void snp_set_wakeup_secondary_cpu(void);
+
#else
static inline void sev_es_ist_enter(struct pt_regs *regs) { }
static inline void sev_es_ist_exit(void) { }
@@ -125,6 +129,8 @@ early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned i
static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, int op) { }
static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npages) { }
static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npages) { }
+static inline void snp_set_wakeup_secondary_cpu(void) { }
+
#endif
#endif
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index f7f65febff70..997918f0a89a 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -109,6 +109,10 @@
#define SVM_VMGEXIT_SET_AP_JUMP_TABLE 0
#define SVM_VMGEXIT_GET_AP_JUMP_TABLE 1
#define SVM_VMGEXIT_PSC 0x80000010
+#define SVM_VMGEXIT_AP_CREATION 0x80000013
+#define SVM_VMGEXIT_AP_CREATE_ON_INIT 0
+#define SVM_VMGEXIT_AP_CREATE 1
+#define SVM_VMGEXIT_AP_DESTROY 2
#define SVM_VMGEXIT_HYPERVISOR_FEATURES 0x8000fffd
#define SVM_VMGEXIT_UNSUPPORTED_EVENT 0x8000ffff
@@ -218,6 +222,7 @@
{ SVM_VMGEXIT_AP_HLT_LOOP, "vmgexit_ap_hlt_loop" }, \
{ SVM_VMGEXIT_AP_JUMP_TABLE, "vmgexit_ap_jump_table" }, \
{ SVM_VMGEXIT_PSC, "vmgexit_page_state_change" }, \
+ { SVM_VMGEXIT_AP_CREATION, "vmgexit_ap_creation" }, \
{ SVM_VMGEXIT_HYPERVISOR_FEATURES, "vmgexit_hypervisor_feature" }, \
{ SVM_EXIT_ERR, "invalid_guest_state" }
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 5fef7fc46282..59e0dd04cb02 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -18,6 +18,7 @@
#include <linux/memblock.h>
#include <linux/kernel.h>
#include <linux/mm.h>
+#include <linux/cpumask.h>
#include <asm/cpu_entry_area.h>
#include <asm/stacktrace.h>
@@ -30,6 +31,7 @@
#include <asm/svm.h>
#include <asm/smp.h>
#include <asm/cpu.h>
+#include <asm/apic.h>
#include "sev-internal.h"
@@ -108,6 +110,8 @@ DEFINE_STATIC_KEY_FALSE(sev_es_enable_key);
/* Bitmap of SEV features supported by the hypervisor */
EXPORT_SYMBOL(sev_hv_features);
+static DEFINE_PER_CPU(struct sev_es_save_area *, snp_vmsa);
+
/* Needed in vc_early_forward_exception */
void do_early_exception(struct pt_regs *regs, int trapnr);
@@ -854,6 +858,207 @@ void snp_set_memory_private(unsigned long vaddr, unsigned int npages)
pvalidate_pages(vaddr, npages, 1);
}
+static int vmsa_rmpadjust(void *va, bool vmsa)
+{
+ u64 attrs;
+ int err;
+
+ /*
+ * The RMPADJUST instruction is used to set or clear the VMSA bit for
+ * a page. A change to the VMSA bit is only performed when running
+ * at VMPL0 and is ignored at other VMPL levels. If too low of a target
+ * VMPL level is specified, the instruction can succeed without changing
+ * the VMSA bit should the kernel not be in VMPL0. Using a target VMPL
+ * level of 1 will return a FAIL_PERMISSION error if the kernel is not
+ * at VMPL0, thus ensuring that the VMSA bit has been properly set when
+ * no error is returned.
+ */
+ attrs = 1;
+ if (vmsa)
+ attrs |= RMPADJUST_VMSA_PAGE_BIT;
+
+ /* Instruction mnemonic supported in binutils versions v2.36 and later */
+ asm volatile (".byte 0xf3,0x0f,0x01,0xfe\n\t"
+ : "=a" (err)
+ : "a" (va), "c" (RMP_PG_SIZE_4K), "d" (attrs)
+ : "memory", "cc");
+
+ return err;
+}
+
+#define __ATTR_BASE (SVM_SELECTOR_P_MASK | SVM_SELECTOR_S_MASK)
+#define INIT_CS_ATTRIBS (__ATTR_BASE | SVM_SELECTOR_READ_MASK | SVM_SELECTOR_CODE_MASK)
+#define INIT_DS_ATTRIBS (__ATTR_BASE | SVM_SELECTOR_WRITE_MASK)
+
+#define INIT_LDTR_ATTRIBS (SVM_SELECTOR_P_MASK | 2)
+#define INIT_TR_ATTRIBS (SVM_SELECTOR_P_MASK | 3)
+
+static int wakeup_cpu_via_vmgexit(int apic_id, unsigned long start_ip)
+{
+ struct sev_es_save_area *cur_vmsa, *vmsa;
+ struct ghcb_state state;
+ unsigned long flags;
+ struct ghcb *ghcb;
+ int cpu, err, ret;
+ u8 sipi_vector;
+ u64 cr4;
+
+ if ((sev_hv_features & GHCB_HV_FT_SNP_AP_CREATION) != GHCB_HV_FT_SNP_AP_CREATION)
+ return -EOPNOTSUPP;
+
+ /*
+ * Verify the desired start IP against the known trampoline start IP
+ * to catch any future new trampolines that may be introduced that
+ * would require a new protected guest entry point.
+ */
+ if (WARN_ONCE(start_ip != real_mode_header->trampoline_start,
+ "unsupported SEV-SNP start_ip: %lx\n", start_ip))
+ return -EINVAL;
+
+ /* Override start_ip with known protected guest start IP */
+ start_ip = real_mode_header->sev_es_trampoline_start;
+
+ /* Find the logical CPU for the APIC ID */
+ for_each_present_cpu(cpu) {
+ if (arch_match_cpu_phys_id(cpu, apic_id))
+ break;
+ }
+ if (cpu >= nr_cpu_ids)
+ return -EINVAL;
+
+ cur_vmsa = per_cpu(snp_vmsa, cpu);
+
+ /*
+ * A new VMSA is created each time because there is no guarantee that
+ * the current VMSA is the kernel's or that the vCPU is not running. If
+ * an attempt were made to use the current VMSA with a running vCPU, a
+ * #VMEXIT of that vCPU would wipe out all of the settings being done
+ * here.
+ */
+ vmsa = (struct sev_es_save_area *)get_zeroed_page(GFP_KERNEL);
+ if (!vmsa)
+ return -ENOMEM;
+
+ /* CR4 should maintain the MCE value */
+ cr4 = native_read_cr4() & ~X86_CR4_MCE;
+
+ /* Set the CS value based on the start_ip converted to a SIPI vector */
+ sipi_vector = (start_ip >> 12);
+ vmsa->cs.base = sipi_vector << 12;
+ vmsa->cs.limit = 0xffff;
+ vmsa->cs.attrib = INIT_CS_ATTRIBS;
+ vmsa->cs.selector = sipi_vector << 8;
+
+ /* Set the RIP value based on start_ip */
+ vmsa->rip = start_ip & 0xfff;
+
+ /* Set VMSA entries to the INIT values as documented in the APM */
+ vmsa->ds.limit = 0xffff;
+ vmsa->ds.attrib = INIT_DS_ATTRIBS;
+ vmsa->es = vmsa->ds;
+ vmsa->fs = vmsa->ds;
+ vmsa->gs = vmsa->ds;
+ vmsa->ss = vmsa->ds;
+
+ vmsa->gdtr.limit = 0xffff;
+ vmsa->ldtr.limit = 0xffff;
+ vmsa->ldtr.attrib = INIT_LDTR_ATTRIBS;
+ vmsa->idtr.limit = 0xffff;
+ vmsa->tr.limit = 0xffff;
+ vmsa->tr.attrib = INIT_TR_ATTRIBS;
+
+ vmsa->efer = 0x1000; /* Must set SVME bit */
+ vmsa->cr4 = cr4;
+ vmsa->cr0 = 0x60000010;
+ vmsa->dr7 = 0x400;
+ vmsa->dr6 = 0xffff0ff0;
+ vmsa->rflags = 0x2;
+ vmsa->g_pat = 0x0007040600070406ULL;
+ vmsa->xcr0 = 0x1;
+ vmsa->mxcsr = 0x1f80;
+ vmsa->x87_ftw = 0x5555;
+ vmsa->x87_fcw = 0x0040;
+
+ /*
+ * Set the SNP-specific fields for this VMSA:
+ * VMPL level
+ * SEV_FEATURES (matches the SEV STATUS MSR right shifted 2 bits)
+ */
+ vmsa->vmpl = 0;
+ vmsa->sev_features = sev_status >> 2;
+
+ /* Switch the page over to a VMSA page now that it is initialized */
+ ret = vmsa_rmpadjust(vmsa, true);
+ if (ret) {
+ pr_err("set VMSA page failed (%u)\n", ret);
+ free_page((unsigned long)vmsa);
+
+ return -EINVAL;
+ }
+
+ /* Issue VMGEXIT AP Creation NAE event */
+ local_irq_save(flags);
+
+ ghcb = __sev_get_ghcb(&state);
+
+ vc_ghcb_invalidate(ghcb);
+ ghcb_set_rax(ghcb, vmsa->sev_features);
+ ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_AP_CREATION);
+ ghcb_set_sw_exit_info_1(ghcb, ((u64)apic_id << 32) | SVM_VMGEXIT_AP_CREATE);
+ ghcb_set_sw_exit_info_2(ghcb, __pa(vmsa));
+
+ sev_es_wr_ghcb_msr(__pa(ghcb));
+ VMGEXIT();
+
+ if (!ghcb_sw_exit_info_1_is_valid(ghcb) ||
+ lower_32_bits(ghcb->save.sw_exit_info_1)) {
+ pr_alert("SNP AP Creation error\n");
+ ret = -EINVAL;
+ }
+
+ __sev_put_ghcb(&state);
+
+ local_irq_restore(flags);
+
+ /* Perform cleanup if there was an error */
+ if (ret) {
+ err = vmsa_rmpadjust(vmsa, false);
+ if (err)
+ pr_err("clear VMSA page failed (%u), leaking page\n", err);
+ else
+ free_page((unsigned long)vmsa);
+
+ vmsa = NULL;
+ }
+
+ /* Free up any previous VMSA page */
+ if (cur_vmsa) {
+ err = vmsa_rmpadjust(cur_vmsa, false);
+ if (err)
+ pr_err("clear VMSA page failed (%u), leaking page\n", err);
+ else
+ free_page((unsigned long)cur_vmsa);
+ }
+
+ /* Record the current VMSA page */
+ per_cpu(snp_vmsa, cpu) = vmsa;
+
+ return ret;
+}
+
+void snp_set_wakeup_secondary_cpu(void)
+{
+ if (!sev_feature_enabled(SEV_SNP))
+ return;
+
+ /*
+ * Always set this override if SEV-SNP is enabled. This makes it the
+ * required method to start APs under SEV-SNP. If the hypervisor does
+ * not support AP creation, then no APs will be started.
+ */
+ apic->wakeup_secondary_cpu = wakeup_cpu_via_vmgexit;
+}
+
int sev_es_setup_ap_jump_table(struct real_mode_header *rmh)
{
u16 startup_cs, startup_ip;
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 9320285a5e29..4fc07006f7f8 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -82,6 +82,7 @@
#include <asm/spec-ctrl.h>
#include <asm/hw_irq.h>
#include <asm/stackprotector.h>
+#include <asm/sev.h>
#ifdef CONFIG_ACPI_CPPC_LIB
#include <acpi/cppc_acpi.h>
@@ -1377,6 +1378,8 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
smp_quirk_init_udelay();
speculative_store_bypass_ht_init();
+
+ snp_set_wakeup_secondary_cpu();
}
void arch_thaw_secondary_cpus_begin(void)
--
2.17.1
From: Michael Roth <[email protected]>
CPUID instructions generate a #VC exception for SEV-ES/SEV-SNP guests,
and early handlers are currently set up to handle them. In the case
of SEV-SNP, guests can use a special location in the guest memory
address space that has been pre-populated with firmware-validated CPUID
information to look up the relevant CPUID values rather than
requesting them from the hypervisor via a VMGEXIT.
Determine the location of the CPUID memory address in advance of any
CPUID instructions/exceptions and, when available, use it to handle
the CPUID lookup.
Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/boot/compressed/efi-config-table.c | 44 +++
arch/x86/boot/compressed/head_64.S | 1 +
arch/x86/boot/compressed/idt_64.c | 7 +-
arch/x86/boot/compressed/misc.h | 10 +
arch/x86/boot/compressed/sev.c | 3 +
arch/x86/include/asm/sev-common.h | 2 +
arch/x86/include/asm/sev.h | 3 +
arch/x86/kernel/sev-shared.c | 322 ++++++++++++++++++++
arch/x86/kernel/sev.c | 4 +
9 files changed, 394 insertions(+), 2 deletions(-)
diff --git a/arch/x86/boot/compressed/efi-config-table.c b/arch/x86/boot/compressed/efi-config-table.c
index d1a34aa7cefd..678fc4236030 100644
--- a/arch/x86/boot/compressed/efi-config-table.c
+++ b/arch/x86/boot/compressed/efi-config-table.c
@@ -178,3 +178,47 @@ efi_bp_get_conf_table(struct boot_params *boot_params,
return 0;
}
+
+/*
+ * Given boot_params, locate the EFI system/config table from it and search
+ * for the physical address of the vendor table associated with the given GUID.
+ *
+ * @boot_params: pointer to boot_params
+ * @guid: GUID of vendor table
+ * @vendor_table_pa: location to store physical address of vendor table
+ *
+ * Returns 0 on success. On error, return params are left unchanged.
+ */
+int
+efi_bp_find_vendor_table(struct boot_params *boot_params, efi_guid_t guid,
+ unsigned long *vendor_table_pa)
+{
+ unsigned long conf_table_pa = 0;
+ unsigned int conf_table_len = 0;
+ unsigned int i;
+ bool efi_64;
+ int ret;
+
+ ret = efi_bp_get_conf_table(boot_params, &conf_table_pa,
+ &conf_table_len, &efi_64);
+ if (ret)
+ return ret;
+
+ for (i = 0; i < conf_table_len; i++) {
+ unsigned long vendor_table_pa_tmp;
+ efi_guid_t vendor_table_guid;
+ int ret;
+
+ if (get_vendor_table((void *)conf_table_pa, i,
+ &vendor_table_pa_tmp,
+ &vendor_table_guid, efi_64))
+ return -EINVAL;
+
+ if (!efi_guidcmp(guid, vendor_table_guid)) {
+ *vendor_table_pa = vendor_table_pa_tmp;
+ return 0;
+ }
+ }
+
+ return -ENOENT;
+}
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index a2347ded77ea..1c1658693fc9 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -441,6 +441,7 @@ SYM_CODE_START(startup_64)
.Lon_kernel_cs:
pushq %rsi
+ movq %rsi, %rdi /* real mode address */
call load_stage1_idt
popq %rsi
diff --git a/arch/x86/boot/compressed/idt_64.c b/arch/x86/boot/compressed/idt_64.c
index 9b93567d663a..1f6511a6625d 100644
--- a/arch/x86/boot/compressed/idt_64.c
+++ b/arch/x86/boot/compressed/idt_64.c
@@ -3,6 +3,7 @@
#include <asm/segment.h>
#include <asm/trapnr.h>
#include "misc.h"
+#include <asm/sev.h>
static void set_idt_entry(int vector, void (*handler)(void))
{
@@ -28,13 +29,15 @@ static void load_boot_idt(const struct desc_ptr *dtr)
}
/* Setup IDT before kernel jumping to .Lrelocated */
-void load_stage1_idt(void)
+void load_stage1_idt(void *rmode)
{
boot_idt_desc.address = (unsigned long)boot_idt;
- if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
+ if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
+ sev_snp_cpuid_init(rmode);
set_idt_entry(X86_TRAP_VC, boot_stage1_vc);
+ }
load_boot_idt(&boot_idt_desc);
}
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 522baf8ff04a..74c3cf3b982c 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -193,6 +193,10 @@ int efi_bp_get_conf_table(struct boot_params *boot_params,
unsigned long *conf_table_pa,
unsigned int *conf_table_len,
bool *is_efi_64);
+
+int efi_bp_find_vendor_table(struct boot_params *boot_params, efi_guid_t guid,
+ unsigned long *vendor_table_pa);
+
#else
static inline int
efi_foreach_conf_entry(void *conf_table, unsigned int conf_table_len,
@@ -222,6 +226,12 @@ efi_bp_get_conf_table(struct boot_params *boot_params,
{
return -ENOENT;
}
+
+static inline int efi_bp_find_vendor_table(struct boot_params *boot_params,
+ efi_guid_t guid, unsigned long *vendor_table_pa)
+{
+ return -ENOENT;
+}
#endif /* CONFIG_EFI */
#endif /* BOOT_COMPRESSED_MISC_H */
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index d4cbadf80838..13a6ce74f320 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -20,6 +20,9 @@
#include <asm/fpu/xcr.h>
#include <asm/ptrace.h>
#include <asm/svm.h>
+#include <asm/cpuid-indexed.h>
+#include <linux/efi.h>
+#include <linux/log2.h>
#include "error.h"
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 5da5f5147623..e14d24f0950c 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -132,5 +132,7 @@ struct __packed snp_psc_desc {
#define GHCB_TERM_PSC 1 /* Page State Change failure */
#define GHCB_TERM_PVALIDATE 2 /* Pvalidate failure */
#define GHCB_TERM_NOT_VMPL0 3 /* SNP guest is not running at VMPL-0 */
+#define GHCB_TERM_CPUID 4 /* CPUID-validation failure */
+#define GHCB_TERM_CPUID_HYP 5 /* CPUID failure during hypervisor fallback */
#endif
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index e41bd55dba5d..e403bd1fcb23 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -11,6 +11,7 @@
#include <linux/types.h>
#include <asm/insn.h>
#include <asm/sev-common.h>
+#include <asm/bootparam.h>
#define GHCB_PROTOCOL_MIN 1ULL
#define GHCB_PROTOCOL_MAX 2ULL
@@ -127,6 +128,7 @@ void snp_set_memory_shared(unsigned long vaddr, unsigned int npages);
void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
void snp_set_wakeup_secondary_cpu(void);
+void sev_snp_cpuid_init(struct boot_params *bp);
#else
static inline void sev_es_ist_enter(struct pt_regs *regs) { }
static inline void sev_es_ist_exit(void) { }
@@ -143,6 +145,7 @@ static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npage
static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npages) { }
static inline void snp_set_wakeup_secondary_cpu(void) { }
+static inline void sev_snp_cpuid_init(struct boot_params *bp) { }
#endif
#endif
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 4884de256a49..5e0e8e208a8c 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -14,6 +14,25 @@
#define has_cpuflag(f) boot_cpu_has(f)
#endif
+struct sev_snp_cpuid_fn {
+ u32 eax_in;
+ u32 ecx_in;
+ u64 unused;
+ u64 unused2;
+ u32 eax;
+ u32 ebx;
+ u32 ecx;
+ u32 edx;
+ u64 reserved;
+} __packed;
+
+struct sev_snp_cpuid_info {
+ u32 count;
+ u32 reserved1;
+ u64 reserved2;
+ struct sev_snp_cpuid_fn fn[0];
+} __packed;
+
/*
* Since feature negotiation related variables are set early in the boot
* process they must reside in the .data section so as not to be zeroed
@@ -26,6 +45,15 @@ static u16 ghcb_version __section(".data..ro_after_init");
/* Bitmap of SEV features supported by the hypervisor */
u64 sev_hv_features __section(".data..ro_after_init") = 0;
+/*
+ * These are also stored in .data section to avoid the need to re-parse
+ * boot_params and re-determine CPUID memory range when .bss is cleared.
+ */
+static int sev_snp_cpuid_enabled __section(".data");
+static unsigned long sev_snp_cpuid_pa __section(".data");
+static unsigned long sev_snp_cpuid_sz __section(".data");
+static const struct sev_snp_cpuid_info *cpuid_info __section(".data");
+
static bool __init sev_es_check_cpu_features(void)
{
if (!has_cpuflag(X86_FEATURE_RDRAND)) {
@@ -236,6 +264,171 @@ static int sev_es_cpuid_msr_proto(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
return 0;
}
+static bool sev_snp_cpuid_active(void)
+{
+ return sev_snp_cpuid_enabled;
+}
+
+static int sev_snp_cpuid_xsave_size(u64 xfeatures_en, u32 base_size,
+ u32 *xsave_size, bool compacted)
+{
+ u64 xfeatures_found = 0;
+ int i;
+
+ *xsave_size = base_size;
+
+ for (i = 0; i < cpuid_info->count; i++) {
+ const struct sev_snp_cpuid_fn *fn = &cpuid_info->fn[i];
+
+ if (!(fn->eax_in == 0xd && fn->ecx_in > 1 && fn->ecx_in < 64))
+ continue;
+ if (!(xfeatures_en & (1UL << fn->ecx_in)))
+ continue;
+ if (xfeatures_found & (1UL << fn->ecx_in))
+ continue;
+
+ xfeatures_found |= (1UL << fn->ecx_in);
+ if (compacted)
+ *xsave_size += fn->eax;
+ else
+ *xsave_size = max(*xsave_size, fn->eax + fn->ebx);
+ }
+
+ /*
+ * Either the guest set unsupported XCR0/XSS bits, or the corresponding
+ * entries in the CPUID table were not present. This is not a valid
+ * state to be in.
+ */
+ if (xfeatures_found != (xfeatures_en & ~3ULL))
+ return -EINVAL;
+
+ return 0;
+}
+
+static void sev_snp_cpuid_hyp(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
+ u32 *ecx, u32 *edx)
+{
+ /*
+ * Currently the MSR protocol is sufficient to handle fallback cases, but
+ * should that change, make sure we terminate rather than grabbing random
+ * values. Handling can be added in the future to use the GHCB-page
+ * protocol for cases late enough in boot that the GHCB page is available.
+ */
+ if (cpuid_function_is_indexed(func) && subfunc != 0)
+ sev_es_terminate(1, GHCB_TERM_CPUID_HYP);
+
+ if (sev_es_cpuid_msr_proto(func, 0, eax, ebx, ecx, edx))
+ sev_es_terminate(1, GHCB_TERM_CPUID_HYP);
+}
+
+/*
+ * Returns -EOPNOTSUPP if feature not enabled. Any other return value should be
+ * treated as fatal by caller since we cannot fall back to hypervisor to fetch
+ * the values for security reasons (outside of the specific cases handled here)
+ */
+static int sev_snp_cpuid(u32 func, u32 subfunc, u32 *eax, u32 *ebx, u32 *ecx,
+ u32 *edx)
+{
+ bool found = false;
+ int i;
+
+ if (!sev_snp_cpuid_active())
+ return -EOPNOTSUPP;
+
+ if (!cpuid_info)
+ return -EIO;
+
+ for (i = 0; i < cpuid_info->count; i++) {
+ const struct sev_snp_cpuid_fn *fn = &cpuid_info->fn[i];
+
+ if (fn->eax_in != func)
+ continue;
+
+ if (cpuid_function_is_indexed(func) && fn->ecx_in != subfunc)
+ continue;
+
+ *eax = fn->eax;
+ *ebx = fn->ebx;
+ *ecx = fn->ecx;
+ *edx = fn->edx;
+ found = true;
+
+ break;
+ }
+
+ if (!found) {
+ *eax = *ebx = *ecx = *edx = 0;
+ goto out;
+ }
+
+ if (func == 0x1) {
+ u32 ebx2, edx2;
+
+ sev_snp_cpuid_hyp(func, subfunc, NULL, &ebx2, NULL, &edx2);
+ /* initial APIC ID */
+ *ebx = (*ebx & 0x00FFFFFF) | (ebx2 & 0xFF000000);
+ /* APIC enabled bit */
+ *edx = (*edx & ~BIT_ULL(9)) | (edx2 & BIT_ULL(9));
+
+ /* OSXSAVE enabled bit */
+ if (native_read_cr4() & X86_CR4_OSXSAVE)
+ *ecx |= BIT_ULL(27);
+ } else if (func == 0x7) {
+ /* OSPKE enabled bit */
+ *ecx &= ~BIT_ULL(4);
+ if (native_read_cr4() & X86_CR4_PKE)
+ *ecx |= BIT_ULL(4);
+ } else if (func == 0xB) {
+ /* extended APIC ID */
+ sev_snp_cpuid_hyp(func, 0, NULL, NULL, NULL, edx);
+ } else if (func == 0xd && (subfunc == 0x0 || subfunc == 0x1)) {
+ bool compacted = false;
+ u64 xcr0 = 1, xss = 0;
+ u32 xsave_size;
+
+ if (native_read_cr4() & X86_CR4_OSXSAVE)
+ xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
+ if (subfunc == 1) {
+ /* boot/compressed doesn't set XSS so 0 is fine there */
+#ifndef __BOOT_COMPRESSED
+ if (*eax & 0x8) /* XSAVES */
+ if (boot_cpu_has(X86_FEATURE_XSAVES))
+ rdmsrl(MSR_IA32_XSS, xss);
+#endif
+ /*
+ * The PPR and APM aren't clear on what size should be
+ * encoded in 0xD:0x1:EBX when compaction is not enabled
+ * by either XSAVEC or XSAVES since SNP-capable hardware
+ * has the entries fixed as 1. KVM sets it to 0 in this
+ * case, but to avoid this becoming an issue it's safer
+ * to simply treat this as unsupported for SNP guests.
+ */
+ if (!(*eax & 0xA)) /* (XSAVEC|XSAVES) */
+ return -EINVAL;
+
+ compacted = true;
+ }
+
+ if (sev_snp_cpuid_xsave_size(xcr0 | xss, *ebx, &xsave_size,
+ compacted))
+ return -EINVAL;
+
+ *ebx = xsave_size;
+ } else if (func == 0x8000001E) {
+ u32 ebx2, ecx2;
+
+ /* extended APIC ID */
+ sev_snp_cpuid_hyp(func, subfunc, eax, &ebx2, &ecx2, NULL);
+ /* compute ID */
+ *ebx = (*ebx & 0xFFFFFF00) | (ebx2 & 0x000000FF);
+ /* node ID */
+ *ecx = (*ecx & 0xFFFFFF00) | (ecx2 & 0x000000FF);
+ }
+
+out:
+ return 0;
+}
+
/*
* Boot VC Handler - This is the first VC handler during boot, there is no GHCB
* page yet, so it only supports the MSR based communication with the
@@ -244,15 +437,25 @@ static int sev_es_cpuid_msr_proto(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
{
unsigned int fn = lower_bits(regs->ax, 32);
+ unsigned int subfn = lower_bits(regs->cx, 32);
u32 eax, ebx, ecx, edx;
+ int ret;
/* Only CPUID is supported via MSR protocol */
if (exit_code != SVM_EXIT_CPUID)
goto fail;
+ ret = sev_snp_cpuid(fn, subfn, &eax, &ebx, &ecx, &edx);
+ if (ret == 0)
+ goto out;
+
+ if (ret != -EOPNOTSUPP)
+ goto fail;
+
if (sev_es_cpuid_msr_proto(fn, 0, &eax, &ebx, &ecx, &edx))
goto fail;
+out:
regs->ax = eax;
regs->bx = ebx;
regs->cx = ecx;
@@ -552,6 +755,19 @@ static enum es_result vc_handle_cpuid(struct ghcb *ghcb,
struct pt_regs *regs = ctxt->regs;
u32 cr4 = native_read_cr4();
enum es_result ret;
+ u32 eax, ebx, ecx, edx;
+ int cpuid_ret;
+
+ cpuid_ret = sev_snp_cpuid(regs->ax, regs->cx, &eax, &ebx, &ecx, &edx);
+ if (cpuid_ret == 0) {
+ regs->ax = eax;
+ regs->bx = ebx;
+ regs->cx = ecx;
+ regs->dx = edx;
+ return ES_OK;
+ }
+ if (cpuid_ret != -EOPNOTSUPP)
+ return ES_VMM_ERROR;
ghcb_set_rax(ghcb, regs->ax);
ghcb_set_rcx(ghcb, regs->cx);
@@ -603,3 +819,109 @@ static enum es_result vc_handle_rdtsc(struct ghcb *ghcb,
return ES_OK;
}
+
+#ifdef __BOOT_COMPRESSED
+static struct setup_data *get_cc_setup_data(struct boot_params *bp)
+{
+ struct setup_data *hdr = (struct setup_data *)bp->hdr.setup_data;
+
+ while (hdr) {
+ if (hdr->type == SETUP_CC_BLOB)
+ return hdr;
+ hdr = (struct setup_data *)hdr->next;
+ }
+
+ return NULL;
+}
+
+/*
+ * For boot/compressed kernel:
+ *
+ * 1) Search for CC blob in the following order/precedence:
+ * - via linux boot protocol / setup_data entry
+ * - via EFI configuration table
+ * 2) Return a pointer to the CC blob, NULL otherwise.
+ */
+static struct cc_blob_sev_info *sev_snp_probe_cc_blob(struct boot_params *bp)
+{
+ struct cc_blob_sev_info *cc_info = NULL;
+ struct setup_data_cc {
+ struct setup_data header;
+ u32 cc_blob_address;
+ } *sd;
+
+ /* Try to get CC blob via setup_data */
+ sd = (struct setup_data_cc *)get_cc_setup_data(bp);
+ if (sd) {
+ cc_info = (struct cc_blob_sev_info *)(unsigned long)sd->cc_blob_address;
+ goto out_verify;
+ }
+
+ /* CC blob isn't in setup_data, see if it's in the EFI config table */
+ (void)efi_bp_find_vendor_table(bp, EFI_CC_BLOB_GUID,
+ (unsigned long *)&cc_info);
+
+out_verify:
+ /* CC blob should be either valid or not present. Fail otherwise. */
+ if (cc_info && cc_info->magic != CC_BLOB_SEV_HDR_MAGIC)
+ sev_es_terminate(1, GHCB_SNP_UNSUPPORTED);
+
+ return cc_info;
+}
+#else
+/*
+ * Probing for CC blob for run-time kernel will be enabled in a subsequent
+ * patch. For now we need to stub this out.
+ */
+static struct cc_blob_sev_info *sev_snp_probe_cc_blob(struct boot_params *bp)
+{
+ return NULL;
+}
+#endif
+
+/*
+ * Initial set up of CPUID table when running identity-mapped.
+ *
+ * NOTE: Since SEV_SNP feature partly relies on CPUID checks that can't
+ * happen until we access CPUID page, we skip the check and hope the
+ * bootloader is providing sane values. Current code relies on all CPUID
+ * page lookups originating from #VC handler, which at least provides
+ * indication that SEV-ES is enabled. Subsequent init levels will check for
+ * SEV_SNP feature once available to also take SEV MSR value into account.
+ */
+void sev_snp_cpuid_init(struct boot_params *bp)
+{
+ struct cc_blob_sev_info *cc_info;
+
+ if (!bp)
+ sev_es_terminate(1, GHCB_TERM_CPUID);
+
+ cc_info = sev_snp_probe_cc_blob(bp);
+
+ if (!cc_info)
+ return;
+
+ sev_snp_cpuid_pa = cc_info->cpuid_phys;
+ sev_snp_cpuid_sz = cc_info->cpuid_len;
+
+ /*
+ * These should always be valid values for SNP, even if guest isn't
+ * actually configured to use the CPUID table.
+ */
+ if (!sev_snp_cpuid_pa || sev_snp_cpuid_sz < PAGE_SIZE)
+ sev_es_terminate(1, GHCB_TERM_CPUID);
+
+ cpuid_info = (const struct sev_snp_cpuid_info *)sev_snp_cpuid_pa;
+
+ /*
+ * We should be able to trust the 'count' value in the CPUID table
+ * area, but ensure it agrees with CC blob value to be safe.
+ */
+ if (sev_snp_cpuid_sz < (sizeof(struct sev_snp_cpuid_info) +
+ sizeof(struct sev_snp_cpuid_fn) *
+ cpuid_info->count))
+ sev_es_terminate(1, GHCB_TERM_CPUID);
+
+ if (cpuid_info->count > 0)
+ sev_snp_cpuid_enabled = 1;
+}
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 59e0dd04cb02..04ef5e79fa12 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -19,6 +19,8 @@
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/cpumask.h>
+#include <linux/log2.h>
+#include <linux/efi.h>
#include <asm/cpu_entry_area.h>
#include <asm/stacktrace.h>
@@ -32,6 +34,8 @@
#include <asm/smp.h>
#include <asm/cpu.h>
#include <asm/apic.h>
+#include <asm/efi.h>
+#include <asm/cpuid-indexed.h>
#include "sev-internal.h"
--
2.17.1
The hypervisor uses the SEV_FEATURES field (offset 3B0h) in the Save State
Area to control the SEV-SNP guest features such as SNPActive, vTOM,
ReflectVC etc. An SEV-SNP guest can read the SEV_FEATURES field through
the SEV_STATUS MSR.
While at it, update dump_vmcb() to log the VMPL level.
See APM2 Table 15-34 and B-4 for more details.
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/svm.h | 15 +++++++++++++--
arch/x86/kvm/svm/svm.c | 4 ++--
2 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 772e60efe243..ff614cdcf628 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -212,6 +212,15 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
#define SVM_NESTED_CTL_SEV_ENABLE BIT(1)
#define SVM_NESTED_CTL_SEV_ES_ENABLE BIT(2)
+#define SVM_SEV_FEATURES_SNP_ACTIVE BIT(0)
+#define SVM_SEV_FEATURES_VTOM BIT(1)
+#define SVM_SEV_FEATURES_REFLECT_VC BIT(2)
+#define SVM_SEV_FEATURES_RESTRICTED_INJECTION BIT(3)
+#define SVM_SEV_FEATURES_ALTERNATE_INJECTION BIT(4)
+#define SVM_SEV_FEATURES_DEBUG_SWAP BIT(5)
+#define SVM_SEV_FEATURES_PREVENT_HOST_IBS BIT(6)
+#define SVM_SEV_FEATURES_BTB_ISOLATION BIT(7)
+
struct vmcb_seg {
u16 selector;
u16 attrib;
@@ -230,7 +239,8 @@ struct vmcb_save_area {
struct vmcb_seg ldtr;
struct vmcb_seg idtr;
struct vmcb_seg tr;
- u8 reserved_1[43];
+ u8 reserved_1[42];
+ u8 vmpl;
u8 cpl;
u8 reserved_2[4];
u64 efer;
@@ -295,7 +305,8 @@ struct vmcb_save_area {
u64 sw_exit_info_1;
u64 sw_exit_info_2;
u64 sw_scratch;
- u8 reserved_11[56];
+ u64 sev_features;
+ u8 reserved_11[48];
u64 xcr0;
u8 valid_bitmap[16];
u64 x87_state_gpa;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e088086f3de6..293c9e03da5a 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3184,8 +3184,8 @@ static void dump_vmcb(struct kvm_vcpu *vcpu)
"tr:",
save01->tr.selector, save01->tr.attrib,
save01->tr.limit, save01->tr.base);
- pr_err("cpl: %d efer: %016llx\n",
- save->cpl, save->efer);
+ pr_err("vmpl: %d cpl: %d efer: %016llx\n",
+ save->vmpl, save->cpl, save->efer);
pr_err("%-15s %016llx %-13s %016llx\n",
"cr0:", save->cr0, "cr2:", save->cr2);
pr_err("%-15s %016llx %-13s %016llx\n",
--
2.17.1
The SNP_GET_DERIVED_KEY ioctl interface can be used by the SNP guest to
ask the firmware to provide a key derived from a root key. The derived
key may be used by the guest for any purpose it chooses, such as a
sealing key or for communicating with external entities.
See SEV-SNP firmware spec for more information.
Signed-off-by: Brijesh Singh <[email protected]>
---
Documentation/virt/coco/sevguest.rst | 18 ++++++++++
drivers/virt/coco/sevguest/sevguest.c | 48 +++++++++++++++++++++++++++
include/uapi/linux/sev-guest.h | 24 ++++++++++++++
3 files changed, 90 insertions(+)
diff --git a/Documentation/virt/coco/sevguest.rst b/Documentation/virt/coco/sevguest.rst
index 52d5915037ef..25446670d816 100644
--- a/Documentation/virt/coco/sevguest.rst
+++ b/Documentation/virt/coco/sevguest.rst
@@ -67,3 +67,21 @@ provided by the SEV-SNP firmware to query the attestation report.
On success, the snp_report_resp.data will contain the report. The report
format is described in the SEV-SNP specification. See the SEV-SNP specification
for further details.
+
+2.2 SNP_GET_DERIVED_KEY
+-----------------------
+:Technology: sev-snp
+:Type: guest ioctl
+:Parameters (in): struct snp_derived_key_req
+:Returns (out): struct snp_derived_key_resp on success, -negative on error
+
+The SNP_GET_DERIVED_KEY ioctl can be used to get a key derived from a root key.
+The derived key can be used by the guest for any purpose, such as sealing keys
+or communicating with external entities.
+
+The ioctl uses the SNP_GUEST_REQUEST (MSG_KEY_REQ) command provided by the
+SEV-SNP firmware to derive the key. See the SEV-SNP specification for further
+details on the various fields passed in the key derivation request.
+
+On success, the snp_derived_key_resp.data will contain the derived key
+value.
diff --git a/drivers/virt/coco/sevguest/sevguest.c b/drivers/virt/coco/sevguest/sevguest.c
index f3f86f9b5b22..2f20db80490a 100644
--- a/drivers/virt/coco/sevguest/sevguest.c
+++ b/drivers/virt/coco/sevguest/sevguest.c
@@ -304,6 +304,50 @@ static int get_report(struct snp_guest_dev *snp_dev, struct snp_user_guest_reque
return rc;
}
+static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_user_guest_request *arg)
+{
+ struct snp_guest_crypto *crypto = snp_dev->crypto;
+ struct snp_derived_key_resp *resp;
+ struct snp_derived_key_req req;
+ int rc, resp_len;
+
+ if (!arg->req_data || !arg->resp_data)
+ return -EINVAL;
+
+ /* Copy the request payload from the userspace */
+ if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
+ return -EFAULT;
+
+ /* Message version must be non-zero */
+ if (!req.msg_version)
+ return -EINVAL;
+
+ /*
+ * The intermediate response buffer is used while decrypting the
+ * response payload. Make sure that it has enough space to cover the
+ * authtag.
+ */
+ resp_len = sizeof(resp->data) + crypto->a_len;
+ resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
+ if (!resp)
+ return -ENOMEM;
+
+ /* Issue the command to get the derived key */
+ rc = handle_guest_request(snp_dev, req.msg_version, SNP_MSG_KEY_REQ,
+ &req.data, sizeof(req.data), resp->data, resp_len,
+ &arg->fw_err);
+ if (rc)
+ goto e_free;
+
+ /* Copy the response payload to userspace */
+ if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp)))
+ rc = -EFAULT;
+
+e_free:
+ kfree(resp);
+ return rc;
+}
+
static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
{
struct snp_guest_dev *snp_dev = to_snp_dev(file);
@@ -321,6 +365,10 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
ret = get_report(snp_dev, &input);
break;
}
+ case SNP_GET_DERIVED_KEY: {
+ ret = get_derived_key(snp_dev, &input);
+ break;
+ }
default:
break;
}
diff --git a/include/uapi/linux/sev-guest.h b/include/uapi/linux/sev-guest.h
index e8cfd15133f3..621a9167df7a 100644
--- a/include/uapi/linux/sev-guest.h
+++ b/include/uapi/linux/sev-guest.h
@@ -36,9 +36,33 @@ struct snp_user_guest_request {
__u64 fw_err;
};
+struct __snp_derived_key_req {
+ __u32 root_key_select;
+ __u32 rsvd;
+ __u64 guest_field_select;
+ __u32 vmpl;
+ __u32 guest_svn;
+ __u64 tcb_version;
+};
+
+struct snp_derived_key_req {
+ /* message version number (must be non-zero) */
+ __u8 msg_version;
+
+ struct __snp_derived_key_req data;
+};
+
+struct snp_derived_key_resp {
+ /* response data, see SEV-SNP spec for the format */
+ __u8 data[64];
+};
+
#define SNP_GUEST_REQ_IOC_TYPE 'S'
/* Get SNP attestation report */
#define SNP_GET_REPORT _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x0, struct snp_user_guest_request)
+/* Get a derived key from the root */
+#define SNP_GET_DERIVED_KEY _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x1, struct snp_user_guest_request)
+
#endif /* __UAPI_LINUX_SEV_GUEST_H_ */
--
2.17.1
From: Michael Roth <[email protected]>
The run-time kernel will need to access the Confidential Computing
blob very early in boot to access the CPUID table it points to. At that
stage of boot it will be relying on the identity-mapped page table set
up by boot/compressed kernel, so make sure we have both of them mapped
in advance.
Signed-off-by: Michael Roth <[email protected]>
---
arch/x86/boot/compressed/ident_map_64.c | 18 ++++++++++++++++++
arch/x86/boot/compressed/sev.c | 2 +-
arch/x86/include/asm/sev.h | 8 ++++++++
3 files changed, 27 insertions(+), 1 deletion(-)
diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
index 59befc610993..91e5ab433be4 100644
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -37,6 +37,9 @@
#include <asm/setup.h> /* For COMMAND_LINE_SIZE */
#undef _SETUP
+#define __BOOT_COMPRESSED
+#include <asm/sev.h> /* For sev_snp_active() + ConfidentialComputing blob */
+
extern unsigned long get_cmd_line_ptr(void);
/* Used by PAGE_KERN* macros: */
@@ -163,6 +166,21 @@ void initialize_identity_maps(void *rmode)
cmdline = get_cmd_line_ptr();
add_identity_map(cmdline, cmdline + COMMAND_LINE_SIZE);
+ /*
+ * The ConfidentialComputing blob is used very early in uncompressed
+ * kernel to find CPUID memory to handle cpuid instructions. Make sure
+ * an identity-mapping exists so they can be accessed after switchover.
+ */
+ if (sev_snp_enabled()) {
+ struct cc_blob_sev_info *cc_info =
+ (void *)(unsigned long)boot_params->cc_blob_address;
+
+ add_identity_map((unsigned long)cc_info,
+ (unsigned long)cc_info + sizeof(*cc_info));
+ add_identity_map((unsigned long)cc_info->cpuid_phys,
+ (unsigned long)cc_info->cpuid_phys + cc_info->cpuid_len);
+ }
+
/* Load the new page-table. */
sev_verify_cbit(top_level_pgt);
write_cr3(top_level_pgt);
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 13a6ce74f320..87080bc4a574 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -123,7 +123,7 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
/* Include code for early handlers */
#include "../../kernel/sev-shared.c"
-static inline bool sev_snp_enabled(void)
+bool sev_snp_enabled(void)
{
unsigned long low, high;
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index e403bd1fcb23..b5715a26361a 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -128,6 +128,10 @@ void snp_set_memory_shared(unsigned long vaddr, unsigned int npages);
void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
void snp_set_wakeup_secondary_cpu(void);
+#ifdef __BOOT_COMPRESSED
+bool sev_snp_enabled(void);
+#endif
+
void sev_snp_cpuid_init(struct boot_params *bp);
#else
static inline void sev_es_ist_enter(struct pt_regs *regs) { }
@@ -145,6 +149,10 @@ static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npage
static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npages) { }
static inline void snp_set_wakeup_secondary_cpu(void) { }
+#ifdef __BOOT_COMPRESSED
+static inline bool sev_snp_enabled(void) { return false; }
+#endif
+
static inline void sev_snp_cpuid_init(struct boot_params *bp) { }
#endif
--
2.17.1
From: Michael Roth <[email protected]>
This CPUID lookup code will also be used later by the SEV-SNP-validated
CPUID support in some cases, so move it to a common helper.
Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/kernel/sev-shared.c | 84 +++++++++++++++++++++++++-----------
1 file changed, 58 insertions(+), 26 deletions(-)
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index be4025f14b4f..4884de256a49 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -184,6 +184,58 @@ static enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
return ret;
}
+static int sev_es_cpuid_msr_proto(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
+ u32 *ecx, u32 *edx)
+{
+ u64 val;
+
+ if (eax) {
+ sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EAX));
+ VMGEXIT();
+ val = sev_es_rd_ghcb_msr();
+
+ if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
+ return -EIO;
+
+ *eax = (val >> 32);
+ }
+
+ if (ebx) {
+ sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EBX));
+ VMGEXIT();
+ val = sev_es_rd_ghcb_msr();
+
+ if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
+ return -EIO;
+
+ *ebx = (val >> 32);
+ }
+
+ if (ecx) {
+ sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_ECX));
+ VMGEXIT();
+ val = sev_es_rd_ghcb_msr();
+
+ if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
+ return -EIO;
+
+ *ecx = (val >> 32);
+ }
+
+ if (edx) {
+ sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(func, GHCB_CPUID_REQ_EDX));
+ VMGEXIT();
+ val = sev_es_rd_ghcb_msr();
+
+ if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
+ return -EIO;
+
+ *edx = (val >> 32);
+ }
+
+ return 0;
+}
+
/*
* Boot VC Handler - This is the first VC handler during boot, there is no GHCB
* page yet, so it only supports the MSR based communication with the
@@ -192,39 +244,19 @@ static enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
{
unsigned int fn = lower_bits(regs->ax, 32);
- unsigned long val;
+ u32 eax, ebx, ecx, edx;
/* Only CPUID is supported via MSR protocol */
if (exit_code != SVM_EXIT_CPUID)
goto fail;
- sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EAX));
- VMGEXIT();
- val = sev_es_rd_ghcb_msr();
- if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
+ if (sev_es_cpuid_msr_proto(fn, 0, &eax, &ebx, &ecx, &edx))
goto fail;
- regs->ax = val >> 32;
- sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EBX));
- VMGEXIT();
- val = sev_es_rd_ghcb_msr();
- if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
- goto fail;
- regs->bx = val >> 32;
-
- sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_ECX));
- VMGEXIT();
- val = sev_es_rd_ghcb_msr();
- if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
- goto fail;
- regs->cx = val >> 32;
-
- sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EDX));
- VMGEXIT();
- val = sev_es_rd_ghcb_msr();
- if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
- goto fail;
- regs->dx = val >> 32;
+ regs->ax = eax;
+ regs->bx = ebx;
+ regs->cx = ecx;
+ regs->dx = edx;
/*
* This is a VC handler and the #VC is only raised when SEV-ES is
--
2.17.1
From: Michael Roth <[email protected]>
This adds support for utilizing the SEV-SNP-validated CPUID table in
the various #VC handler routines used throughout boot/run-time. Mostly
this is handled by re-using the CPUID lookup code introduced earlier
for the boot/compressed kernel, but at various stages of boot some work
needs to be done to ensure the CPUID table is set up and remains
accessible throughout. The following init routines are introduced to
handle this:
sev_snp_cpuid_init():
This sets up access to the CPUID memory range for the #VC handler
that gets set up just after entry to startup_64(). Since the code is
still using an identity mapping, the existing sev_snp_cpuid_init()
used by boot/compressed is used here as well, but annotated as __init
so it can be cleaned up later (boot/compressed/sev.c already defines
away __init when it pulls in shared SEV code). The boot/compressed
kernel handles any necessary lookup of ConfidentialComputing blob
from EFI and puts it into boot_params if present, so only boot_params
needs to be checked.
sev_snp_cpuid_init_virtual():
This is called when the previous identity mapping is gone and the
memory used for the CPUID memory range needs to be mapped into the
new page table with encryption bit set and accessed via __va().
Since this path is also entered later by APs to set up their initial
VC handlers, a function pointer is used to switch them to a handler
that doesn't attempt to re-initialize the SNP CPUID feature, as at
that point it will have already been set up.
sev_snp_cpuid_init_remap_early():
This is called when the previous mapping of CPUID memory range is no
longer present. early_memremap() is now available, so use that to
create a new one that can be used until memremap() is available.
sev_snp_cpuid_init_remap():
This switches away from using early_memremap() to ioremap_encrypted()
to map CPUID memory range, otherwise the leak detector will complain.
This mapping is what gets used for the remaining life of the guest.
Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/realmode.h | 1 +
arch/x86/include/asm/setup.h | 5 +-
arch/x86/include/asm/sev.h | 8 +++
arch/x86/kernel/head64.c | 21 ++++++--
arch/x86/kernel/head_64.S | 6 ++-
arch/x86/kernel/setup.c | 3 ++
arch/x86/kernel/sev-shared.c | 93 ++++++++++++++++++++++++++++++++-
arch/x86/kernel/smpboot.c | 2 +
8 files changed, 129 insertions(+), 10 deletions(-)
diff --git a/arch/x86/include/asm/realmode.h b/arch/x86/include/asm/realmode.h
index 5db5d083c873..ff0eecee4235 100644
--- a/arch/x86/include/asm/realmode.h
+++ b/arch/x86/include/asm/realmode.h
@@ -63,6 +63,7 @@ extern unsigned long initial_stack;
#ifdef CONFIG_AMD_MEM_ENCRYPT
extern unsigned long initial_vc_handler;
#endif
+extern unsigned long initial_idt_setup;
extern unsigned char real_mode_blob[];
extern unsigned char real_mode_relocs[];
diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index a12458a7a8d4..12fc52894ad8 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -50,8 +50,9 @@ extern void reserve_standard_io_resources(void);
extern void i386_reserve_resources(void);
extern unsigned long __startup_64(unsigned long physaddr, struct boot_params *bp);
extern unsigned long __startup_secondary_64(void);
-extern void startup_64_setup_env(unsigned long physbase);
-extern void early_setup_idt(void);
+extern void startup_64_setup_env(unsigned long physbase, struct boot_params *bp);
+extern void early_setup_idt_common(void *rmode);
+extern void __init early_setup_idt(void *rmode);
extern void __init do_early_exception(struct pt_regs *regs, int trapnr);
#ifdef CONFIG_X86_INTEL_MID
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index b5715a26361a..6c23e694a109 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -133,6 +133,10 @@ bool sev_snp_enabled(void);
#endif
void sev_snp_cpuid_init(struct boot_params *bp);
+#ifndef __BOOT_COMPRESSED
+void sev_snp_cpuid_init_virtual(void);
+void sev_snp_cpuid_init_remap_early(void);
+#endif /* __BOOT_COMPRESSED */
#else
static inline void sev_es_ist_enter(struct pt_regs *regs) { }
static inline void sev_es_ist_exit(void) { }
static inline bool sev_snp_enabled(void) { return false; }
#endif
static inline void sev_snp_cpuid_init(struct boot_params *bp) { }
+#ifndef __BOOT_COMPRESSED
+static inline void sev_snp_cpuid_init_virtual(void) { }
+static inline void sev_snp_cpuid_init_remap_early(void) { }
+#endif /* __BOOT_COMPRESSED */
#endif
#endif
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 8615418f98f1..de3b4f1afbfe 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -567,7 +567,7 @@ static void set_bringup_idt_handler(gate_desc *idt, int n, void *handler)
}
/* This runs while still in the direct mapping */
-static void startup_64_load_idt(unsigned long physbase)
+static void startup_64_load_idt(unsigned long physbase, struct boot_params *bp)
{
struct desc_ptr *desc = fixup_pointer(&bringup_idt_descr, physbase);
gate_desc *idt = fixup_pointer(bringup_idt_table, physbase);
@@ -577,6 +577,7 @@ static void startup_64_load_idt(unsigned long physbase)
void *handler;
/* VMM Communication Exception */
+ sev_snp_cpuid_init(bp); /* used by #VC handler */
handler = fixup_pointer(vc_no_ghcb, physbase);
set_bringup_idt_handler(idt, X86_TRAP_VC, handler);
}
@@ -585,8 +586,8 @@ static void startup_64_load_idt(unsigned long physbase)
native_load_idt(desc);
}
-/* This is used when running on kernel addresses */
-void early_setup_idt(void)
+/* Used for all CPUs */
+void early_setup_idt_common(void *rmode)
{
/* VMM Communication Exception */
if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
@@ -596,10 +597,20 @@ void early_setup_idt(void)
native_load_idt(&bringup_idt_descr);
}
+/* This is used by boot processor when running on kernel addresses */
+void __init early_setup_idt(void *rmode)
+{
+ /* SEV-SNP CPUID setup for use by #VC handler */
+ if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
+ sev_snp_cpuid_init_virtual();
+
+ early_setup_idt_common(rmode);
+}
+
/*
* Setup boot CPU state needed before kernel switches to virtual addresses.
*/
-void __head startup_64_setup_env(unsigned long physbase)
+void __head startup_64_setup_env(unsigned long physbase, struct boot_params *bp)
{
u64 gs_area = (u64)fixup_pointer(startup_gs_area, physbase);
@@ -623,5 +634,5 @@ void __head startup_64_setup_env(unsigned long physbase)
*/
native_wrmsr(MSR_GS_BASE, gs_area, gs_area >> 32);
- startup_64_load_idt(physbase);
+ startup_64_load_idt(physbase, bp);
}
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index d8b3ebd2bb85..78f35e446498 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -218,7 +218,10 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
/* Setup and Load IDT */
pushq %rsi
- call early_setup_idt
+ movq %rsi, %rdi
+ movq initial_idt_setup(%rip), %rax
+ ANNOTATE_RETPOLINE_SAFE
+ call *%rax
popq %rsi
/* Check if nx is implemented */
@@ -341,6 +344,7 @@ SYM_DATA(initial_gs, .quad INIT_PER_CPU_VAR(fixed_percpu_data))
#ifdef CONFIG_AMD_MEM_ENCRYPT
SYM_DATA(initial_vc_handler, .quad handle_vc_boot_ghcb)
#endif
+SYM_DATA(initial_idt_setup, .quad early_setup_idt)
/*
* The FRAME_SIZE gap is a convention which helps the in-kernel unwinder
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 85acd22f8022..5ff264917b5b 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -47,6 +47,7 @@
#include <asm/thermal.h>
#include <asm/unwind.h>
#include <asm/vsyscall.h>
+#include <asm/sev.h>
#include <linux/vmalloc.h>
/*
@@ -1077,6 +1078,8 @@ void __init setup_arch(char **cmdline_p)
init_mem_mapping();
+ sev_snp_cpuid_init_remap_early();
+
idt_setup_early_pf();
/*
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 23328727caf4..dbc5c2600d9d 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -264,7 +264,7 @@ static int sev_es_cpuid_msr_proto(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
return 0;
}
-static bool sev_snp_cpuid_active(void)
+static inline bool sev_snp_cpuid_active(void)
{
return sev_snp_cpuid_enabled;
}
@@ -905,7 +905,7 @@ static struct cc_blob_sev_info *sev_snp_probe_cc_blob(struct boot_params *bp)
* indication that SEV-ES is enabled. Subsequent init levels will check for
* SEV_SNP feature once available to also take SEV MSR value into account.
*/
-void sev_snp_cpuid_init(struct boot_params *bp)
+void __init sev_snp_cpuid_init(struct boot_params *bp)
{
struct cc_blob_sev_info *cc_info;
@@ -941,3 +941,92 @@ void sev_snp_cpuid_init(struct boot_params *bp)
if (cpuid_info->count > 0)
sev_snp_cpuid_enabled = 1;
}
+
+#ifndef __BOOT_COMPRESSED
+
+static bool __init early_make_pgtable_enc(unsigned long physaddr)
+{
+ pmdval_t pmd;
+
+ /* early_pmd_flags hasn't been updated with SME bit yet; add it */
+ pmd = (physaddr & PMD_MASK) + early_pmd_flags + sme_get_me_mask();
+
+ return __early_make_pgtable((unsigned long)__va(physaddr), pmd);
+}
+
+/*
+ * This is called when we switch to virtual kernel addresses, before #PF
+ * handler is set up. boot_params have already been parsed at this point,
+ * but CPUID page is no longer identity-mapped so we need to create a
+ * virtual mapping.
+ */
+void __init sev_snp_cpuid_init_virtual(void)
+{
+ /*
+ * We rely on sev_snp_cpuid_init() to do initial parsing of bootparams
+ * and initial setup. If that didn't enable the feature then don't try
+ * to enable it here.
+ */
+ if (!sev_snp_cpuid_active())
+ return;
+
+ /*
+ * Either boot_params/EFI advertised the feature even though SNP isn't
+ * enabled, or something else went wrong. Bail out.
+ */
+ if (!sev_feature_enabled(SEV_SNP))
+ sev_es_terminate(1, GHCB_TERM_CPUID);
+
+ /* If feature is enabled, but we can't map CPUID info, we're hosed */
+ if (!early_make_pgtable_enc(sev_snp_cpuid_pa))
+ sev_es_terminate(1, GHCB_TERM_CPUID);
+
+ cpuid_info = (const struct sev_snp_cpuid_info *)__va(sev_snp_cpuid_pa);
+}
+
+/* Called after early_ioremap_init() */
+void __init sev_snp_cpuid_init_remap_early(void)
+{
+ if (!sev_snp_cpuid_active())
+ return;
+
+ /*
+ * This really shouldn't be possible at this point.
+ */
+ if (!sev_feature_enabled(SEV_SNP))
+ sev_es_terminate(1, GHCB_TERM_CPUID);
+
+ cpuid_info = early_memremap(sev_snp_cpuid_pa, sev_snp_cpuid_sz);
+}
+
+/* Final switch to run-time mapping */
+static int __init sev_snp_cpuid_init_remap(void)
+{
+ if (!sev_snp_cpuid_active())
+ return 0;
+
+ /*
+ * This really shouldn't be possible at this point either.
+ */
+ if (!sev_feature_enabled(SEV_SNP))
+ sev_es_terminate(1, GHCB_TERM_CPUID);
+
+ /* Clean up earlier mapping. */
+ if (cpuid_info)
+ early_memunmap((void *)cpuid_info, sev_snp_cpuid_sz);
+
+ /*
+ * We need ioremap_encrypted() to get an encrypted mapping, but this
+ * is normal RAM so it can be accessed directly.
+ */
+ cpuid_info = (__force void *)ioremap_encrypted(sev_snp_cpuid_pa,
+ sev_snp_cpuid_sz);
+ if (!cpuid_info)
+ return -EIO;
+
+ return 0;
+}
+
+arch_initcall(sev_snp_cpuid_init_remap);
+
+#endif /* __BOOT_COMPRESSED */
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 4fc07006f7f8..d3f4993b89cc 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1041,6 +1041,8 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle,
early_gdt_descr.address = (unsigned long)get_cpu_gdt_rw(cpu);
initial_code = (unsigned long)start_secondary;
initial_stack = idle->thread.sp;
+ /* don't repeat IDT setup work specific to the BSP */
+ initial_idt_setup = (unsigned long)early_setup_idt_common;
/* Enable the espfix hack for this CPU */
init_espfix_ap(cpu);
--
2.17.1
From: Michael Roth <[email protected]>
Determining which CPUID leaves have significant ECX/index values is
also needed by guest kernel code when doing SEV-SNP-validated CPUID
lookups. Move this check to common code so future updates stay in sync.
Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/cpuid-indexed.h | 26 ++++++++++++++++++++++++++
arch/x86/kvm/cpuid.c | 17 ++---------------
2 files changed, 28 insertions(+), 15 deletions(-)
create mode 100644 arch/x86/include/asm/cpuid-indexed.h
diff --git a/arch/x86/include/asm/cpuid-indexed.h b/arch/x86/include/asm/cpuid-indexed.h
new file mode 100644
index 000000000000..f5ab746f5712
--- /dev/null
+++ b/arch/x86/include/asm/cpuid-indexed.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_CPUID_INDEXED_H
+#define _ASM_X86_CPUID_INDEXED_H
+
+static __always_inline bool cpuid_function_is_indexed(u32 function)
+{
+ switch (function) {
+ case 4:
+ case 7:
+ case 0xb:
+ case 0xd:
+ case 0xf:
+ case 0x10:
+ case 0x12:
+ case 0x14:
+ case 0x17:
+ case 0x18:
+ case 0x1f:
+ case 0x8000001d:
+ return true;
+ }
+
+ return false;
+}
+
+#endif /* _ASM_X86_CPUID_INDEXED_H */
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b4da665bb892..be6b226f50e4 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -19,6 +19,7 @@
#include <asm/user.h>
#include <asm/fpu/xstate.h>
#include <asm/sgx.h>
+#include <asm/cpuid-indexed.h>
#include "cpuid.h"
#include "lapic.h"
#include "mmu.h"
@@ -608,22 +609,8 @@ static struct kvm_cpuid_entry2 *do_host_cpuid(struct kvm_cpuid_array *array,
cpuid_count(entry->function, entry->index,
&entry->eax, &entry->ebx, &entry->ecx, &entry->edx);
- switch (function) {
- case 4:
- case 7:
- case 0xb:
- case 0xd:
- case 0xf:
- case 0x10:
- case 0x12:
- case 0x14:
- case 0x17:
- case 0x18:
- case 0x1f:
- case 0x8000001d:
+ if (cpuid_function_is_indexed(function))
entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
- break;
- }
return entry;
}
--
2.17.1
Version 2 of the GHCB specification defines an NAE event for retrieving the
extended guest request. It is similar to the SNP_GET_REPORT ioctl; the main
difference is the additional data that can be returned. The additional data
returned is a certificate blob that can be used by the SNP guest user. The
certificate blob layout is defined in the GHCB specification. The driver
treats the blob as opaque data and simply copies it to userspace.
Signed-off-by: Brijesh Singh <[email protected]>
---
Documentation/virt/coco/sevguest.rst | 22 +++++
drivers/virt/coco/sevguest/sevguest.c | 126 ++++++++++++++++++++++++++
include/uapi/linux/sev-guest.h | 13 +++
3 files changed, 161 insertions(+)
diff --git a/Documentation/virt/coco/sevguest.rst b/Documentation/virt/coco/sevguest.rst
index 25446670d816..7acb8696fca4 100644
--- a/Documentation/virt/coco/sevguest.rst
+++ b/Documentation/virt/coco/sevguest.rst
@@ -85,3 +85,25 @@ on the various fileds passed in the key derivation request.
On success, the snp_derived_key_resp.data will contains the derived key
value.
+
+2.2 SNP_GET_EXT_REPORT
+----------------------
+:Technology: sev-snp
+:Type: guest ioctl
+:Parameters (in/out): struct snp_ext_report_req
+:Returns (out): struct snp_report_resp on success, -negative on error
+
+The SNP_GET_EXT_REPORT ioctl is similar to SNP_GET_REPORT. The difference is
+the additional certificate data that is returned with the report.
+The certificate data returned is provided by the hypervisor through
+SNP_SET_EXT_CONFIG.
+
+The ioctl uses the SNP_GUEST_REQUEST (MSG_REPORT_REQ) command provided by the SEV-SNP
+firmware to get the attestation report.
+
+On success, snp_ext_report_resp.data will contain the attestation report
+and snp_ext_report_req.certs_address will contain the certificate blob. If the
+length of the blob is less than expected, then snp_ext_report_req.certs_len will
+be updated with the expected value.
+
+See GHCB specification for further detail on how to parse the certificate blob.
diff --git a/drivers/virt/coco/sevguest/sevguest.c b/drivers/virt/coco/sevguest/sevguest.c
index 2f20db80490a..d39667a62002 100644
--- a/drivers/virt/coco/sevguest/sevguest.c
+++ b/drivers/virt/coco/sevguest/sevguest.c
@@ -39,6 +39,7 @@ struct snp_guest_dev {
struct device *dev;
struct miscdevice misc;
+ void *certs_data;
struct snp_guest_crypto *crypto;
struct snp_guest_msg *request, *response;
};
@@ -348,6 +349,117 @@ static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_user_guest_
return rc;
}
+static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_user_guest_request *arg)
+{
+ struct snp_guest_crypto *crypto = snp_dev->crypto;
+ struct snp_guest_request_data input = {};
+ struct snp_ext_report_req req;
+ int ret, npages = 0, resp_len;
+ struct snp_report_resp *resp;
+ struct snp_report_req *rreq;
+ unsigned long fw_err = 0;
+
+ if (!arg->req_data || !arg->resp_data)
+ return -EINVAL;
+
+ /* Copy the request payload from the userspace */
+ if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
+ return -EFAULT;
+
+ rreq = &req.data;
+
+ /* Message version must be non-zero */
+ if (!rreq->msg_version)
+ return -EINVAL;
+
+ if (req.certs_len) {
+ if ((req.certs_len > SEV_FW_BLOB_MAX_SIZE) ||
+ !IS_ALIGNED(req.certs_len, PAGE_SIZE))
+ return -EINVAL;
+ }
+
+ if (req.certs_address && req.certs_len) {
+ if (!access_ok(req.certs_address, req.certs_len))
+ return -EFAULT;
+
+ /*
+ * Initialize the intermediate buffer with all zeros. This buffer
+ * is used in the guest request message to get the certs blob from
+ * the host. If the host does not supply any certs, then we copy
+ * zeros to indicate that certificate data was not provided.
+ */
+ memset(snp_dev->certs_data, 0, req.certs_len);
+
+ input.data_gpa = __pa(snp_dev->certs_data);
+ npages = req.certs_len >> PAGE_SHIFT;
+ }
+
+ /*
+ * The intermediate response buffer is used while decrypting the
+ * response payload. Make sure that it has enough space to cover the
+ * authtag.
+ */
+ resp_len = sizeof(resp->data) + crypto->a_len;
+ resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
+ if (!resp)
+ return -ENOMEM;
+
+ if (copy_from_user(resp, (void __user *)arg->resp_data, sizeof(*resp))) {
+ ret = -EFAULT;
+ goto e_free;
+ }
+
+ /* Encrypt the userspace provided payload */
+ ret = enc_payload(snp_dev, rreq->msg_version, SNP_MSG_REPORT_REQ,
+ &rreq->user_data, sizeof(rreq->user_data));
+ if (ret)
+ goto e_free;
+
+ /* Call firmware to process the request */
+ input.req_gpa = __pa(snp_dev->request);
+ input.resp_gpa = __pa(snp_dev->response);
+ input.data_npages = npages;
+ memset(snp_dev->response, 0, sizeof(*snp_dev->response));
+ ret = snp_issue_guest_request(EXT_GUEST_REQUEST, &input, &fw_err);
+
+ /* Propagate any firmware error to userspace */
+ arg->fw_err = fw_err;
+
+ /* If certs length is invalid then copy the returned length */
+ if (arg->fw_err == SNP_GUEST_REQ_INVALID_LEN) {
+ req.certs_len = input.data_npages << PAGE_SHIFT;
+
+ if (copy_to_user((void __user *)arg->req_data, &req, sizeof(req)))
+ ret = -EFAULT;
+
+ goto e_free;
+ }
+
+ if (ret)
+ goto e_free;
+
+ /* Decrypt the response payload */
+ ret = verify_and_dec_payload(snp_dev, resp->data, resp_len);
+ if (ret)
+ goto e_free;
+
+ /* Copy the certificate data blob to userspace */
+ if (req.certs_address &&
+ copy_to_user((void __user *)req.certs_address, snp_dev->certs_data,
+ req.certs_len)) {
+ ret = -EFAULT;
+ goto e_free;
+ }
+
+ /* Copy the response payload to userspace */
+ if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp)))
+ ret = -EFAULT;
+
+e_free:
+ kfree(resp);
+ return ret;
+}
+
static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
{
struct snp_guest_dev *snp_dev = to_snp_dev(file);
@@ -369,6 +481,10 @@ static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long
ret = get_derived_key(snp_dev, &input);
break;
}
+ case SNP_GET_EXT_REPORT: {
+ ret = get_ext_report(snp_dev, &input);
+ break;
+ }
default:
break;
}
@@ -454,6 +570,12 @@ static int __init snp_guest_probe(struct platform_device *pdev)
goto e_free_req;
}
+ snp_dev->certs_data = alloc_shared_pages(SEV_FW_BLOB_MAX_SIZE);
+ if (IS_ERR(snp_dev->certs_data)) {
+ ret = PTR_ERR(snp_dev->certs_data);
+ goto e_free_resp;
+ }
+
misc = &snp_dev->misc;
misc->minor = MISC_DYNAMIC_MINOR;
misc->name = DEVICE_NAME;
@@ -461,6 +583,9 @@ static int __init snp_guest_probe(struct platform_device *pdev)
return misc_register(misc);
+e_free_resp:
+ free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
+
e_free_req:
free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
@@ -476,6 +601,7 @@ static int __exit snp_guest_remove(struct platform_device *pdev)
free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
+ free_shared_pages(snp_dev->certs_data, SEV_FW_BLOB_MAX_SIZE);
deinit_crypto(snp_dev->crypto);
misc_deregister(&snp_dev->misc);
diff --git a/include/uapi/linux/sev-guest.h b/include/uapi/linux/sev-guest.h
index 621a9167df7a..23659215fcfb 100644
--- a/include/uapi/linux/sev-guest.h
+++ b/include/uapi/linux/sev-guest.h
@@ -57,6 +57,16 @@ struct snp_derived_key_resp {
__u8 data[64];
};
+struct snp_ext_report_req {
+ struct snp_report_req data;
+
+ /* where to copy the certificate blob */
+ __u64 certs_address;
+
+ /* length of the certificate blob */
+ __u32 certs_len;
+};
+
#define SNP_GUEST_REQ_IOC_TYPE 'S'
/* Get SNP attestation report */
@@ -65,4 +75,7 @@ struct snp_derived_key_resp {
/* Get a derived key from the root */
#define SNP_GET_DERIVED_KEY _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x1, struct snp_user_guest_request)
+/* Get SNP extended report as defined in the GHCB specification version 2. */
+#define SNP_GET_EXT_REPORT _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x2, struct snp_user_guest_request)
+
#endif /* __UAPI_LINUX_SEV_GUEST_H_ */
--
2.17.1
Version 2 of the GHCB specification provides the SNP_GUEST_REQUEST and
SNP_EXT_GUEST_REQUEST NAE events that an SNP guest can use to communicate
with the PSP.
While at it, add a snp_issue_guest_request() helper that drivers and other
subsystems can use to issue requests to the PSP.
See the SEV-SNP and GHCB specifications for more details.
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/uapi/asm/svm.h | 4 +++
arch/x86/kernel/sev.c | 57 +++++++++++++++++++++++++++++++++
include/linux/sev-guest.h | 48 +++++++++++++++++++++++++++
3 files changed, 109 insertions(+)
create mode 100644 include/linux/sev-guest.h
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index 997918f0a89a..9aaf0ab386ef 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -109,6 +109,8 @@
#define SVM_VMGEXIT_SET_AP_JUMP_TABLE 0
#define SVM_VMGEXIT_GET_AP_JUMP_TABLE 1
#define SVM_VMGEXIT_PSC 0x80000010
+#define SVM_VMGEXIT_GUEST_REQUEST 0x80000011
+#define SVM_VMGEXIT_EXT_GUEST_REQUEST 0x80000012
#define SVM_VMGEXIT_AP_CREATION 0x80000013
#define SVM_VMGEXIT_AP_CREATE_ON_INIT 0
#define SVM_VMGEXIT_AP_CREATE 1
@@ -221,6 +223,8 @@
{ SVM_VMGEXIT_NMI_COMPLETE, "vmgexit_nmi_complete" }, \
{ SVM_VMGEXIT_AP_HLT_LOOP, "vmgexit_ap_hlt_loop" }, \
{ SVM_VMGEXIT_AP_JUMP_TABLE, "vmgexit_ap_jump_table" }, \
+ { SVM_VMGEXIT_GUEST_REQUEST, "vmgexit_guest_request" }, \
+ { SVM_VMGEXIT_EXT_GUEST_REQUEST, "vmgexit_ext_guest_request" }, \
{ SVM_VMGEXIT_PSC, "vmgexit_page_state_change" }, \
{ SVM_VMGEXIT_AP_CREATION, "vmgexit_ap_creation" }, \
{ SVM_VMGEXIT_HYPERVISOR_FEATURES, "vmgexit_hypervisor_feature" }, \
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 04ef5e79fa12..b85cab838372 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -21,6 +21,7 @@
#include <linux/cpumask.h>
#include <linux/log2.h>
#include <linux/efi.h>
+#include <linux/sev-guest.h>
#include <asm/cpu_entry_area.h>
#include <asm/stacktrace.h>
@@ -2024,3 +2025,59 @@ bool __init handle_vc_boot_ghcb(struct pt_regs *regs)
while (true)
halt();
}
+
+int snp_issue_guest_request(int type, struct snp_guest_request_data *input, unsigned long *fw_err)
+{
+ struct ghcb_state state;
+ unsigned long id, flags;
+ struct ghcb *ghcb;
+ int ret;
+
+ if (!sev_feature_enabled(SEV_SNP))
+ return -ENODEV;
+
+ local_irq_save(flags);
+
+ ghcb = __sev_get_ghcb(&state);
+ if (!ghcb) {
+ local_irq_restore(flags);
+ return -ENODEV;
+ }
+
+ vc_ghcb_invalidate(ghcb);
+
+ if (type == GUEST_REQUEST) {
+ id = SVM_VMGEXIT_GUEST_REQUEST;
+ } else if (type == EXT_GUEST_REQUEST) {
+ id = SVM_VMGEXIT_EXT_GUEST_REQUEST;
+ ghcb_set_rax(ghcb, input->data_gpa);
+ ghcb_set_rbx(ghcb, input->data_npages);
+ } else {
+ ret = -EINVAL;
+ goto e_put;
+ }
+
+ ret = sev_es_ghcb_hv_call(ghcb, NULL, id, input->req_gpa, input->resp_gpa);
+ if (ret)
+ goto e_put;
+
+ if (ghcb->save.sw_exit_info_2) {
+
+ /* The expected number of pages is returned in RBX */
+ if (type == EXT_GUEST_REQUEST)
+ input->data_npages = ghcb_get_rbx(ghcb);
+
+ if (fw_err)
+ *fw_err = ghcb->save.sw_exit_info_2;
+
+ ret = -EIO;
+ goto e_put;
+ }
+
+e_put:
+ __sev_put_ghcb(&state);
+ local_irq_restore(flags);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(snp_issue_guest_request);
diff --git a/include/linux/sev-guest.h b/include/linux/sev-guest.h
new file mode 100644
index 000000000000..24dd17507789
--- /dev/null
+++ b/include/linux/sev-guest.h
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * AMD Secure Encrypted Virtualization (SEV) guest driver interface
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <[email protected]>
+ *
+ */
+
+#ifndef __LINUX_SEV_GUEST_H_
+#define __LINUX_SEV_GUEST_H_
+
+#include <linux/types.h>
+
+enum vmgexit_type {
+ GUEST_REQUEST,
+ EXT_GUEST_REQUEST,
+
+ GUEST_REQUEST_MAX
+};
+
+/*
+ * The error code when the data_npages is too small. The error code
+ * is defined in the GHCB specification.
+ */
+#define SNP_GUEST_REQ_INVALID_LEN 0x100000000ULL
+
+struct snp_guest_request_data {
+ unsigned long req_gpa;
+ unsigned long resp_gpa;
+ unsigned long data_gpa;
+ unsigned int data_npages;
+};
+
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+int snp_issue_guest_request(int vmgexit_type, struct snp_guest_request_data *input,
+ unsigned long *fw_err);
+#else
+
+static inline int snp_issue_guest_request(int type, struct snp_guest_request_data *input,
+ unsigned long *fw_err)
+{
+ return -ENODEV;
+}
+
+#endif /* CONFIG_AMD_MEM_ENCRYPT */
+#endif /* __LINUX_SEV_GUEST_H__ */
--
2.17.1
The SNP guest request message header contains a message count. The
message count is used while building the IV. The PSP firmware increments
the message count by 1 and expects the next message to use the
incremented count. The snp_msg_seqno() helper will be used by the driver
to get the message sequence counter used in the request message header,
and the counter is automatically incremented after a successful
snp_issue_guest_request(). The incremented value is saved in the secrets
page so that a kexec'ed kernel knows where to begin.
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/kernel/sev.c | 79 +++++++++++++++++++++++++++++++++++++++
include/linux/sev-guest.h | 37 ++++++++++++++++++
2 files changed, 116 insertions(+)
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index b85cab838372..663cfe96c186 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -51,6 +51,8 @@ static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
*/
static struct ghcb __initdata *boot_ghcb;
+static u64 snp_secrets_phys;
+
/* #VC handler runtime per-CPU data */
struct sev_es_runtime_data {
struct ghcb ghcb_page;
@@ -2026,6 +2028,80 @@ bool __init handle_vc_boot_ghcb(struct pt_regs *regs)
halt();
}
+static struct snp_secrets_page_layout *snp_map_secrets_page(void)
+{
+ u16 __iomem *secrets;
+
+ if (!snp_secrets_phys || !sev_feature_enabled(SEV_SNP))
+ return NULL;
+
+ secrets = ioremap_encrypted(snp_secrets_phys, PAGE_SIZE);
+ if (!secrets)
+ return NULL;
+
+ return (struct snp_secrets_page_layout *)secrets;
+}
+
+static inline u64 snp_read_msg_seqno(void)
+{
+ struct snp_secrets_page_layout *layout;
+ u64 count;
+
+ layout = snp_map_secrets_page();
+ if (layout == NULL)
+ return 0;
+
+ /* Read the current message sequence counter from secrets pages */
+ count = readl(&layout->os_area.msg_seqno_0);
+
+ iounmap(layout);
+
+ /* The sequence counter must begin with 1 */
+ if (!count)
+ return 1;
+
+ return count + 1;
+}
+
+u64 snp_msg_seqno(void)
+{
+ u64 count = snp_read_msg_seqno();
+
+ if (unlikely(!count))
+ return 0;
+
+ /*
+ * The message sequence counter for the SNP guest request is a
+ * 64-bit value, but version 2 of the GHCB specification defines a
+ * 32-bit storage for it.
+ */
+ if (count >= UINT_MAX)
+ return 0;
+
+ return count;
+}
+EXPORT_SYMBOL_GPL(snp_msg_seqno);
+
+static void snp_gen_msg_seqno(void)
+{
+ struct snp_secrets_page_layout *layout;
+ u64 count;
+
+ layout = snp_map_secrets_page();
+ if (layout == NULL)
+ return;
+
+ /*
+ * The counter is also incremented by the PSP, so increment it by 2
+ * and save in secrets page.
+ */
+ count = readl(&layout->os_area.msg_seqno_0);
+ count += 2;
+
+ writel(count, &layout->os_area.msg_seqno_0);
+ iounmap(layout);
+}
+
int snp_issue_guest_request(int type, struct snp_guest_request_data *input, unsigned long *fw_err)
{
struct ghcb_state state;
@@ -2074,6 +2150,9 @@ int snp_issue_guest_request(int type, struct snp_guest_request_data *input, unsi
goto e_put;
}
+ /* The command was successful, increment the sequence counter */
+ snp_gen_msg_seqno();
+
e_put:
__sev_put_ghcb(&state);
local_irq_restore(flags);
diff --git a/include/linux/sev-guest.h b/include/linux/sev-guest.h
index 24dd17507789..5f6c4d634e1f 100644
--- a/include/linux/sev-guest.h
+++ b/include/linux/sev-guest.h
@@ -20,6 +20,41 @@ enum vmgexit_type {
GUEST_REQUEST_MAX
};
+/*
+ * The secrets page contains a 96-byte reserved area that can be used by
+ * the guest OS. The guest OS uses this area to save the message sequence
+ * number for each VMPL level.
+ *
+ * See the secrets page layout section of the GHCB spec for the format of this area.
+ */
+struct secrets_os_area {
+ u32 msg_seqno_0;
+ u32 msg_seqno_1;
+ u32 msg_seqno_2;
+ u32 msg_seqno_3;
+ u64 ap_jump_table_pa;
+ u8 rsvd[40];
+ u8 guest_usage[32];
+} __packed;
+
+#define VMPCK_KEY_LEN 32
+
+/* See the SNP spec secrets page layout section for the structure */
+struct snp_secrets_page_layout {
+ u32 version;
+ u32 imiEn : 1,
+ rsvd1 : 31;
+ u32 fms;
+ u32 rsvd2;
+ u8 gosvw[16];
+ u8 vmpck0[VMPCK_KEY_LEN];
+ u8 vmpck1[VMPCK_KEY_LEN];
+ u8 vmpck2[VMPCK_KEY_LEN];
+ u8 vmpck3[VMPCK_KEY_LEN];
+ struct secrets_os_area os_area;
+ u8 rsvd3[3840];
+} __packed;
+
/*
* The error code when the data_npages is too small. The error code
* is defined in the GHCB specification.
@@ -36,6 +71,7 @@ struct snp_guest_request_data {
#ifdef CONFIG_AMD_MEM_ENCRYPT
int snp_issue_guest_request(int vmgexit_type, struct snp_guest_request_data *input,
unsigned long *fw_err);
+u64 snp_msg_seqno(void);
#else
static inline int snp_issue_guest_request(int type, struct snp_guest_request_data *input,
@@ -43,6 +79,7 @@ static inline int snp_issue_guest_request(int type, struct snp_guest_request_dat
{
return -ENODEV;
}
+static inline u64 snp_msg_seqno(void) { return 0; }
#endif /* CONFIG_AMD_MEM_ENCRYPT */
#endif /* __LINUX_SEV_GUEST_H__ */
--
2.17.1
From: Michael Roth <[email protected]>
Future patches for SEV-SNP-validated CPUID will also require early
parsing of the EFI configuration. Move the related code into a set of
helpers that can be re-used for that purpose.
Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/boot/compressed/Makefile | 1 +
arch/x86/boot/compressed/acpi.c | 124 +++++---------
arch/x86/boot/compressed/efi-config-table.c | 180 ++++++++++++++++++++
arch/x86/boot/compressed/misc.h | 50 ++++++
4 files changed, 272 insertions(+), 83 deletions(-)
create mode 100644 arch/x86/boot/compressed/efi-config-table.c
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 431bf7f846c3..b41aecfda49c 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -100,6 +100,7 @@ endif
vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o
vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o
+vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi-config-table.o
efi-obj-$(CONFIG_EFI_STUB) = $(objtree)/drivers/firmware/efi/libstub/lib.a
$(obj)/vmlinux: $(vmlinux-objs-y) $(efi-obj-y) FORCE
diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
index 8bcbcee54aa1..e087dcaf43b3 100644
--- a/arch/x86/boot/compressed/acpi.c
+++ b/arch/x86/boot/compressed/acpi.c
@@ -24,42 +24,36 @@ struct mem_vector immovable_mem[MAX_NUMNODES*2];
* Search EFI system tables for RSDP. If both ACPI_20_TABLE_GUID and
* ACPI_TABLE_GUID are found, take the former, which has more features.
*/
+#ifdef CONFIG_EFI
+static bool
+rsdp_find_fn(efi_guid_t guid, unsigned long vendor_table, bool efi_64,
+ void *opaque)
+{
+ acpi_physical_address *rsdp_addr = opaque;
+
+ if (!(efi_guidcmp(guid, ACPI_TABLE_GUID))) {
+ *rsdp_addr = vendor_table;
+ } else if (!(efi_guidcmp(guid, ACPI_20_TABLE_GUID))) {
+ *rsdp_addr = vendor_table;
+ return false;
+ }
+
+ return true;
+}
+#endif
+
static acpi_physical_address
-__efi_get_rsdp_addr(unsigned long config_tables, unsigned int nr_tables,
+__efi_get_rsdp_addr(unsigned long config_table_pa, unsigned int config_table_len,
bool efi_64)
{
acpi_physical_address rsdp_addr = 0;
-
#ifdef CONFIG_EFI
- int i;
-
- /* Get EFI tables from systab. */
- for (i = 0; i < nr_tables; i++) {
- acpi_physical_address table;
- efi_guid_t guid;
-
- if (efi_64) {
- efi_config_table_64_t *tbl = (efi_config_table_64_t *)config_tables + i;
-
- guid = tbl->guid;
- table = tbl->table;
-
- if (!IS_ENABLED(CONFIG_X86_64) && table >> 32) {
- debug_putstr("Error getting RSDP address: EFI config table located above 4GB.\n");
- return 0;
- }
- } else {
- efi_config_table_32_t *tbl = (efi_config_table_32_t *)config_tables + i;
-
- guid = tbl->guid;
- table = tbl->table;
- }
+ int ret;
- if (!(efi_guidcmp(guid, ACPI_TABLE_GUID)))
- rsdp_addr = table;
- else if (!(efi_guidcmp(guid, ACPI_20_TABLE_GUID)))
- return table;
- }
+ ret = efi_foreach_conf_entry((void *)config_table_pa, config_table_len,
+ efi_64, rsdp_find_fn, &rsdp_addr);
+ if (ret)
+ debug_putstr("Error getting RSDP address.\n");
#endif
return rsdp_addr;
}
@@ -87,7 +81,9 @@ static acpi_physical_address kexec_get_rsdp_addr(void)
efi_system_table_64_t *systab;
struct efi_setup_data *esd;
struct efi_info *ei;
+ bool efi_64;
char *sig;
+ int ret;
esd = (struct efi_setup_data *)get_kexec_setup_data_addr();
if (!esd)
@@ -98,18 +94,16 @@ static acpi_physical_address kexec_get_rsdp_addr(void)
return 0;
}
- ei = &boot_params->efi_info;
- sig = (char *)&ei->efi_loader_signature;
- if (strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
+ /* Get systab from boot params. */
+ ret = efi_bp_get_system_table(boot_params, (unsigned long *)&systab, &efi_64);
+ if (ret)
+ error("EFI system table not found in kexec boot_params.");
+
+ if (!efi_64) {
debug_putstr("Wrong kexec EFI loader signature.\n");
return 0;
}
- /* Get systab from boot params. */
- systab = (efi_system_table_64_t *) (ei->efi_systab | ((__u64)ei->efi_systab_hi << 32));
- if (!systab)
- error("EFI system table not found in kexec boot_params.");
-
return __efi_get_rsdp_addr((unsigned long)esd->tables, systab->nr_tables, true);
}
#else
@@ -119,54 +113,18 @@ static acpi_physical_address kexec_get_rsdp_addr(void) { return 0; }
static acpi_physical_address efi_get_rsdp_addr(void)
{
#ifdef CONFIG_EFI
- unsigned long systab, config_tables;
- unsigned int nr_tables;
- struct efi_info *ei;
+ unsigned long config_table_pa = 0;
+ unsigned int config_table_len;
bool efi_64;
- char *sig;
-
- ei = &boot_params->efi_info;
- sig = (char *)&ei->efi_loader_signature;
-
- if (!strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
- efi_64 = true;
- } else if (!strncmp(sig, EFI32_LOADER_SIGNATURE, 4)) {
- efi_64 = false;
- } else {
- debug_putstr("Wrong EFI loader signature.\n");
- return 0;
- }
-
- /* Get systab from boot params. */
-#ifdef CONFIG_X86_64
- systab = ei->efi_systab | ((__u64)ei->efi_systab_hi << 32);
-#else
- if (ei->efi_systab_hi || ei->efi_memmap_hi) {
- debug_putstr("Error getting RSDP address: EFI system table located above 4GB.\n");
- return 0;
- }
- systab = ei->efi_systab;
-#endif
- if (!systab)
- error("EFI system table not found.");
-
- /* Handle EFI bitness properly */
- if (efi_64) {
- efi_system_table_64_t *stbl = (efi_system_table_64_t *)systab;
-
- config_tables = stbl->tables;
- nr_tables = stbl->nr_tables;
- } else {
- efi_system_table_32_t *stbl = (efi_system_table_32_t *)systab;
-
- config_tables = stbl->tables;
- nr_tables = stbl->nr_tables;
- }
+ int ret;
- if (!config_tables)
- error("EFI config tables not found.");
+ ret = efi_bp_get_conf_table(boot_params, &config_table_pa,
+ &config_table_len, &efi_64);
+ if (ret || !config_table_pa)
+ error("EFI config table not found.");
- return __efi_get_rsdp_addr(config_tables, nr_tables, efi_64);
+ return __efi_get_rsdp_addr(config_table_pa, config_table_len,
+ efi_64);
#else
return 0;
#endif
diff --git a/arch/x86/boot/compressed/efi-config-table.c b/arch/x86/boot/compressed/efi-config-table.c
new file mode 100644
index 000000000000..d1a34aa7cefd
--- /dev/null
+++ b/arch/x86/boot/compressed/efi-config-table.c
@@ -0,0 +1,180 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Helpers for early access to EFI configuration table
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Michael Roth <[email protected]>
+ */
+
+#include "misc.h"
+#include <linux/efi.h>
+#include <asm/efi.h>
+
+/* Get vendor table address/guid from EFI config table at the given index */
+static int get_vendor_table(void *conf_table, unsigned int idx,
+ unsigned long *vendor_table_pa,
+ efi_guid_t *vendor_table_guid,
+ bool efi_64)
+{
+ if (efi_64) {
+ efi_config_table_64_t *table_entry =
+ (efi_config_table_64_t *)conf_table + idx;
+
+ if (!IS_ENABLED(CONFIG_X86_64) &&
+ table_entry->table >> 32) {
+ debug_putstr("Error: EFI config table entry located above 4GB.\n");
+ return -EINVAL;
+ }
+
+ *vendor_table_pa = table_entry->table;
+ *vendor_table_guid = table_entry->guid;
+
+ } else {
+ efi_config_table_32_t *table_entry =
+ (efi_config_table_32_t *)conf_table + idx;
+
+ *vendor_table_pa = table_entry->table;
+ *vendor_table_guid = table_entry->guid;
+ }
+
+ return 0;
+}
+
+/*
+ * Iterate through the entries in the EFI configuration table and pass the
+ * associated GUID/physical address of each entry on the provided callback
+ * function.
+ *
+ * @conf_table: pointer to EFI configuration table
+ * @conf_table_len: number of entries in EFI configuration table
+ * @efi_64: true if using 64-bit EFI
+ * @fn: callback function that returns true if iteration
+ * should continue
+ * @opaque: optional caller-provided data structure to pass to
+ * callback function on each iteration
+ *
+ * Returns 0 on success.
+ */
+int
+efi_foreach_conf_entry(void *conf_table, unsigned int conf_table_len,
+ bool efi_64, bool (*fn)(efi_guid_t vendor_table_guid,
+ unsigned long vendor_table_pa,
+ bool efi_64,
+ void *opaque),
+ void *opaque)
+{
+ unsigned int i;
+
+ for (i = 0; i < conf_table_len; i++) {
+ unsigned long vendor_table_pa;
+ efi_guid_t vendor_table_guid;
+
+ if (get_vendor_table(conf_table, i, &vendor_table_pa,
+ &vendor_table_guid, efi_64))
+ return -EINVAL;
+
+ if (!fn(vendor_table_guid, vendor_table_pa, efi_64, opaque))
+ break;
+ }
+
+ return 0;
+}
+
+/*
+ * Given boot_params, retrieve the physical address of the EFI system table.
+ *
+ * @boot_params: pointer to boot_params
+ * @sys_table_pa: location to store physical address of system table
+ * @is_efi_64: location to store whether using 64-bit EFI or not
+ *
+ * Returns 0 on success. On error, return params are left unchanged.
+ */
+int
+efi_bp_get_system_table(struct boot_params *boot_params,
+ unsigned long *sys_table_pa, bool *is_efi_64)
+{
+ unsigned long sys_table;
+ struct efi_info *ei;
+ bool efi_64;
+ char *sig;
+
+ if (!sys_table_pa || !is_efi_64)
+ return -EINVAL;
+
+ ei = &boot_params->efi_info;
+ sig = (char *)&ei->efi_loader_signature;
+
+ if (!strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
+ efi_64 = true;
+ } else if (!strncmp(sig, EFI32_LOADER_SIGNATURE, 4)) {
+ efi_64 = false;
+ } else {
+ debug_putstr("Wrong EFI loader signature.\n");
+ return -ENOENT;
+ }
+
+ /* Get systab from boot params. */
+#ifdef CONFIG_X86_64
+ sys_table = ei->efi_systab | ((__u64)ei->efi_systab_hi << 32);
+#else
+ if (ei->efi_systab_hi || ei->efi_memmap_hi) {
+ debug_putstr("Error: EFI system table located above 4GB.\n");
+ return -EINVAL;
+ }
+ sys_table = ei->efi_systab;
+#endif
+ if (!sys_table) {
+ debug_putstr("EFI system table not found.\n");
+ return -ENOENT;
+ }
+
+ *sys_table_pa = sys_table;
+ *is_efi_64 = efi_64;
+ return 0;
+}
+
+/*
+ * Given boot_params, locate the EFI system table from it and return the
+ * physical address of the EFI configuration table.
+ *
+ * @boot_params: pointer to boot_params
+ * @conf_table_pa: location to store physical address of config table
+ * @conf_table_len: location to store number of config table entries
+ * @is_efi_64: location to store whether using 64-bit EFI or not
+ *
+ * Returns 0 on success. On error, return params are left unchanged.
+ */
+int
+efi_bp_get_conf_table(struct boot_params *boot_params,
+ unsigned long *conf_table_pa,
+ unsigned int *conf_table_len,
+ bool *is_efi_64)
+{
+ unsigned long sys_table_pa = 0;
+ int ret;
+
+ if (!conf_table_pa || !conf_table_len || !is_efi_64)
+ return -EINVAL;
+
+ ret = efi_bp_get_system_table(boot_params, &sys_table_pa, is_efi_64);
+ if (ret)
+ return ret;
+
+ /* Handle EFI bitness properly */
+ if (*is_efi_64) {
+ efi_system_table_64_t *stbl =
+ (efi_system_table_64_t *)sys_table_pa;
+
+ *conf_table_pa = stbl->tables;
+ *conf_table_len = stbl->nr_tables;
+ } else {
+ efi_system_table_32_t *stbl =
+ (efi_system_table_32_t *)sys_table_pa;
+
+ *conf_table_pa = stbl->tables;
+ *conf_table_len = stbl->nr_tables;
+ }
+
+ return 0;
+}
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 822e0c254b9a..522baf8ff04a 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -21,6 +21,7 @@
#include <linux/screen_info.h>
#include <linux/elf.h>
#include <linux/io.h>
+#include <linux/efi.h>
#include <asm/page.h>
#include <asm/boot.h>
#include <asm/bootparam.h>
@@ -174,4 +175,53 @@ void boot_stage2_vc(void);
unsigned long sev_verify_cbit(unsigned long cr3);
+#ifdef CONFIG_EFI
+/* helpers for early EFI config table access */
+int efi_foreach_conf_entry(void *conf_table, unsigned int conf_table_len,
+ bool efi_64,
+ bool (*fn)(efi_guid_t guid,
+ unsigned long vendor_table_pa,
+ bool efi_64,
+ void *opaque),
+ void *opaque);
+
+int efi_bp_get_system_table(struct boot_params *boot_params,
+ unsigned long *sys_table_pa,
+ bool *is_efi_64);
+
+int efi_bp_get_conf_table(struct boot_params *boot_params,
+ unsigned long *conf_table_pa,
+ unsigned int *conf_table_len,
+ bool *is_efi_64);
+#else
+static inline int
+efi_foreach_conf_entry(void *conf_table, unsigned int conf_table_len,
+ bool efi_64,
+ bool (*fn)(efi_guid_t guid,
+ unsigned long vendor_table_pa,
+ bool efi_64,
+ void *opaque),
+ void *opaque)
+{
+ return -ENOENT;
+}
+
+static inline int
+efi_bp_get_system_table(struct boot_params *boot_params,
+ unsigned long *sys_table_pa,
+ bool *is_efi_64)
+{
+ return -ENOENT;
+}
+
+static inline int
+efi_bp_get_conf_table(struct boot_params *boot_params,
+ unsigned long *conf_table_pa,
+ unsigned int *conf_table_len,
+ bool *is_efi_64)
+{
+ return -ENOENT;
+}
+#endif /* CONFIG_EFI */
+
#endif /* BOOT_COMPRESSED_MISC_H */
--
2.17.1
Version 2 of GHCB specification provides NAEs that can be used by the SNP
guest to communicate with the PSP without risk from a malicious hypervisor
who wishes to read, alter, drop or replay the messages sent.
In order to communicate with the PSP, the guest needs to locate the secrets
page inserted by the hypervisor during the SEV-SNP guest launch. The
secrets page contains the communication keys used to send and receive the
encrypted messages between the guest and the PSP.
The secrets page location is passed through the setup_data.
Create a platform device that the SNP guest driver can bind to get the
platform resources such as encryption key and message id to use to
communicate with the PSP. The SNP guest driver can provide userspace
interface to get the attestation report, key derivation, extended
attestation report etc.
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/kernel/sev.c | 68 +++++++++++++++++++++++++++++++++++++++
include/linux/sev-guest.h | 5 +++
2 files changed, 73 insertions(+)
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 663cfe96c186..566625707132 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -22,6 +22,8 @@
#include <linux/log2.h>
#include <linux/efi.h>
#include <linux/sev-guest.h>
+#include <linux/platform_device.h>
+#include <linux/io.h>
#include <asm/cpu_entry_area.h>
#include <asm/stacktrace.h>
@@ -37,6 +39,7 @@
#include <asm/apic.h>
#include <asm/efi.h>
#include <asm/cpuid-indexed.h>
+#include <asm/setup.h>
#include "sev-internal.h"
@@ -2160,3 +2163,68 @@ int snp_issue_guest_request(int type, struct snp_guest_request_data *input, unsi
return ret;
}
EXPORT_SYMBOL_GPL(snp_issue_guest_request);
+
+static struct platform_device guest_req_device = {
+ .name = "snp-guest",
+ .id = -1,
+};
+
+static u64 find_secrets_paddr(void)
+{
+ u64 pa_data = boot_params.cc_blob_address;
+ struct cc_blob_sev_info info;
+ void *map;
+
+ /*
+ * The CC blob contains the address of the secrets page, check if the
+ * blob is present.
+ */
+ if (!pa_data)
+ return 0;
+
+ map = early_memremap(pa_data, sizeof(info));
+ memcpy(&info, map, sizeof(info));
+ early_memunmap(map, sizeof(info));
+
+ /* Verify that secrets page address is passed */
+ if (info.secrets_phys && info.secrets_len == PAGE_SIZE)
+ return info.secrets_phys;
+
+ return 0;
+}
+
+static int __init add_snp_guest_request(void)
+{
+ struct snp_secrets_page_layout *layout;
+ struct snp_guest_platform_data data;
+
+ if (!sev_feature_enabled(SEV_SNP))
+ return -ENODEV;
+
+ snp_secrets_phys = find_secrets_paddr();
+ if (!snp_secrets_phys)
+ return -ENODEV;
+
+ layout = snp_map_secrets_page();
+ if (!layout)
+ return -ENODEV;
+
+ /*
+ * The secrets page contains three VMPCKs that can be used for
+ * communicating with the PSP. We choose VMPCK0 to encrypt the guest
+ * messages sent to and received from the PSP. Provide the key and
+ * its id to the driver through the platform data.
+ */
+ data.vmpck_id = 0;
+ memcpy_fromio(data.vmpck, layout->vmpck0, sizeof(data.vmpck));
+
+ iounmap(layout);
+
+ platform_device_add_data(&guest_req_device, &data, sizeof(data));
+
+ if (!platform_device_register(&guest_req_device))
+ dev_info(&guest_req_device.dev, "secret phys 0x%llx\n", snp_secrets_phys);
+
+ return 0;
+}
+device_initcall(add_snp_guest_request);
diff --git a/include/linux/sev-guest.h b/include/linux/sev-guest.h
index 5f6c4d634e1f..d13f2e2cd367 100644
--- a/include/linux/sev-guest.h
+++ b/include/linux/sev-guest.h
@@ -68,6 +68,11 @@ struct snp_guest_request_data {
unsigned int data_npages;
};
+struct snp_guest_platform_data {
+ u8 vmpck_id;
+ char vmpck[VMPCK_KEY_LEN];
+};
+
#ifdef CONFIG_AMD_MEM_ENCRYPT
int snp_issue_guest_request(int vmgexit_type, struct snp_guest_request_data *input,
unsigned long *fw_err);
--
2.17.1
* Brijesh Singh ([email protected]) wrote:
> The sev_feature_enabled() helper can be used by the guest to query whether
> the SNP - Secure Nested Paging feature is active.
>
> Signed-off-by: Brijesh Singh <[email protected]>
> ---
> arch/x86/include/asm/mem_encrypt.h | 8 ++++++++
> arch/x86/include/asm/msr-index.h | 2 ++
> arch/x86/mm/mem_encrypt.c | 14 ++++++++++++++
> 3 files changed, 24 insertions(+)
>
> diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
> index 8cc2fd308f65..fb857f2e72cb 100644
> --- a/arch/x86/include/asm/mem_encrypt.h
> +++ b/arch/x86/include/asm/mem_encrypt.h
> @@ -16,6 +16,12 @@
>
> #include <asm/bootparam.h>
>
> +enum sev_feature_type {
> + SEV,
> + SEV_ES,
> + SEV_SNP
> +};
Is this ....
> #ifdef CONFIG_AMD_MEM_ENCRYPT
>
> extern u64 sme_me_mask;
> @@ -54,6 +60,7 @@ void __init sev_es_init_vc_handling(void);
> bool sme_active(void);
> bool sev_active(void);
> bool sev_es_active(void);
> +bool sev_feature_enabled(unsigned int feature_type);
>
> #define __bss_decrypted __section(".bss..decrypted")
>
> @@ -87,6 +94,7 @@ static inline int __init
> early_set_memory_encrypted(unsigned long vaddr, unsigned long size) { return 0; }
>
> static inline void mem_encrypt_free_decrypted_mem(void) { }
> +static bool sev_feature_enabled(unsigned int feature_type) { return false; }
>
> #define __bss_decrypted
>
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index a7c413432b33..37589da0282e 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -481,8 +481,10 @@
> #define MSR_AMD64_SEV 0xc0010131
> #define MSR_AMD64_SEV_ENABLED_BIT 0
> #define MSR_AMD64_SEV_ES_ENABLED_BIT 1
> +#define MSR_AMD64_SEV_SNP_ENABLED_BIT 2
Just the same as this ?
> #define MSR_AMD64_SEV_ENABLED BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
> #define MSR_AMD64_SEV_ES_ENABLED BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT)
> +#define MSR_AMD64_SEV_SNP_ENABLED BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)
>
> #define MSR_AMD64_VIRT_SPEC_CTRL 0xc001011f
>
> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index ff08dc463634..63e7799a9a86 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -389,6 +389,16 @@ bool noinstr sev_es_active(void)
> return sev_status & MSR_AMD64_SEV_ES_ENABLED;
> }
>
> +bool sev_feature_enabled(unsigned int type)
In which case, if you want the enum then that would be enum
sev_feature_type type ?
> +{
> + switch (type) {
> + case SEV: return sev_status & MSR_AMD64_SEV_ENABLED;
> + case SEV_ES: return sev_status & MSR_AMD64_SEV_ES_ENABLED;
> + case SEV_SNP: return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
> + default: return false;
or, could you just go for making that whole thing a bit test on 1<<type?
> + }
> +}
> +
> /* Override for DMA direct allocation check - ARCH_HAS_FORCE_DMA_UNENCRYPTED */
> bool force_dma_unencrypted(struct device *dev)
> {
> @@ -461,6 +471,10 @@ static void print_mem_encrypt_feature_info(void)
> if (sev_es_active())
> pr_cont(" SEV-ES");
>
> + /* Secure Nested Paging */
> + if (sev_feature_enabled(SEV_SNP))
> + pr_cont(" SEV-SNP");
> +
> pr_cont("\n");
> }
Dave
> --
> 2.17.1
>
>
--
Dr. David Alan Gilbert / [email protected] / Manchester, UK
On 08/07/21 10:50, Dr. David Alan Gilbert wrote:
>> +enum sev_feature_type {
>> + SEV,
>> + SEV_ES,
>> + SEV_SNP
>> +};
> Is this ....
>
>> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
>> index a7c413432b33..37589da0282e 100644
>> --- a/arch/x86/include/asm/msr-index.h
>> +++ b/arch/x86/include/asm/msr-index.h
>> @@ -481,8 +481,10 @@
>> #define MSR_AMD64_SEV 0xc0010131
>> #define MSR_AMD64_SEV_ENABLED_BIT 0
>> #define MSR_AMD64_SEV_ES_ENABLED_BIT 1
>> +#define MSR_AMD64_SEV_SNP_ENABLED_BIT 2
> Just the same as this ?
>
No, it's just a coincidence.
Paolo
On Wed, Jul 07, 2021 at 01:14:32PM -0500, Brijesh Singh wrote:
> diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
> index 114f62fe2529..19c2306ac02d 100644
> --- a/arch/x86/kernel/sev-shared.c
> +++ b/arch/x86/kernel/sev-shared.c
> @@ -14,6 +14,15 @@
> #define has_cpuflag(f) boot_cpu_has(f)
> #endif
>
> +/*
> + * Since feature negotiation related variables are set early in the boot
> + * process they must reside in the .data section so as not to be zeroed
> + * out when the .bss section is later cleared.
> + *
> + * GHCB protocol version negotiated with the hypervisor.
> + */
> +static u16 ghcb_version __section(".data..ro_after_init");
There's a define for that section specifier: __ro_after_init
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Jul 07, 2021 at 01:14:32PM -0500, Brijesh Singh wrote:
> The SEV-ES guest calls the sev_es_negotiate_protocol() to negotiate the
> GHCB protocol version before establishing the GHCB. Cache the negotiated
> GHCB version so that it can be used later.
>
> Signed-off-by: Brijesh Singh <[email protected]>
> ---
> arch/x86/include/asm/sev.h | 2 +-
> arch/x86/kernel/sev-shared.c | 17 ++++++++++++++---
> 2 files changed, 15 insertions(+), 4 deletions(-)
Also, while looking at this, all those defines in sev-common.h were
bothering me for a while now because they're an unreadable mess because
I have to go look at the GHCB spec, decode the corresponding MSR
protocol request and then go and piece together each request value by
verifying the masks...
So I did the below which is, IMO, a lot more readable as you can follow
it directly with the spec opened in parallel.
Thus, if you don't have a better idea, I'd ask you to please add this to
your set and continue defining the new MSR protocol requests this way so
that it can be readable.
Thx.
---
From: Borislav Petkov <[email protected]>
Date: Tue, 10 Aug 2021 11:39:57 +0200
Subject: [PATCH] x86/sev: Get rid of excessive use of defines
Remove all the defines of masks and bit positions for the GHCB MSR
protocol and use comments instead which correspond directly to the spec
so that following those can be a lot easier and straightforward with the
spec opened in parallel to the code.
Align vertically while at it.
No functional changes.
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/include/asm/sev-common.h | 51 +++++++++++++++++--------------
1 file changed, 28 insertions(+), 23 deletions(-)
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 855b0ec9c4e8..28244f8ebde2 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -18,20 +18,19 @@
/* SEV Information Request/Response */
#define GHCB_MSR_SEV_INFO_RESP 0x001
#define GHCB_MSR_SEV_INFO_REQ 0x002
-#define GHCB_MSR_VER_MAX_POS 48
-#define GHCB_MSR_VER_MAX_MASK 0xffff
-#define GHCB_MSR_VER_MIN_POS 32
-#define GHCB_MSR_VER_MIN_MASK 0xffff
-#define GHCB_MSR_CBIT_POS 24
-#define GHCB_MSR_CBIT_MASK 0xff
-#define GHCB_MSR_SEV_INFO(_max, _min, _cbit) \
- ((((_max) & GHCB_MSR_VER_MAX_MASK) << GHCB_MSR_VER_MAX_POS) | \
- (((_min) & GHCB_MSR_VER_MIN_MASK) << GHCB_MSR_VER_MIN_POS) | \
- (((_cbit) & GHCB_MSR_CBIT_MASK) << GHCB_MSR_CBIT_POS) | \
+
+#define GHCB_MSR_SEV_INFO(_max, _min, _cbit) \
+ /* GHCBData[63:48] */ \
+ ((((_max) & 0xffff) << 48) | \
+ /* GHCBData[47:32] */ \
+ (((_min) & 0xffff) << 32) | \
+ /* GHCBData[31:24] */ \
+ (((_cbit) & 0xff) << 24) | \
GHCB_MSR_SEV_INFO_RESP)
+
#define GHCB_MSR_INFO(v) ((v) & 0xfffUL)
-#define GHCB_MSR_PROTO_MAX(v) (((v) >> GHCB_MSR_VER_MAX_POS) & GHCB_MSR_VER_MAX_MASK)
-#define GHCB_MSR_PROTO_MIN(v) (((v) >> GHCB_MSR_VER_MIN_POS) & GHCB_MSR_VER_MIN_MASK)
+#define GHCB_MSR_PROTO_MAX(v) (((v) >> 48) & 0xffff)
+#define GHCB_MSR_PROTO_MIN(v) (((v) >> 32) & 0xffff)
/* CPUID Request/Response */
#define GHCB_MSR_CPUID_REQ 0x004
@@ -46,27 +45,33 @@
#define GHCB_CPUID_REQ_EBX 1
#define GHCB_CPUID_REQ_ECX 2
#define GHCB_CPUID_REQ_EDX 3
-#define GHCB_CPUID_REQ(fn, reg) \
- (GHCB_MSR_CPUID_REQ | \
- (((unsigned long)reg & GHCB_MSR_CPUID_REG_MASK) << GHCB_MSR_CPUID_REG_POS) | \
- (((unsigned long)fn) << GHCB_MSR_CPUID_FUNC_POS))
+#define GHCB_CPUID_REQ(fn, reg) \
+ /* GHCBData[11:0] */ \
+ (GHCB_MSR_CPUID_REQ | \
+ /* GHCBData[31:12] */ \
+ (((unsigned long)reg & 0x3) << 30) | \
+ /* GHCBData[63:32] */ \
+ (((unsigned long)fn) << 32))
/* AP Reset Hold */
-#define GHCB_MSR_AP_RESET_HOLD_REQ 0x006
-#define GHCB_MSR_AP_RESET_HOLD_RESP 0x007
+#define GHCB_MSR_AP_RESET_HOLD_REQ 0x006
+#define GHCB_MSR_AP_RESET_HOLD_RESP 0x007
/* GHCB Hypervisor Feature Request/Response */
-#define GHCB_MSR_HV_FT_REQ 0x080
-#define GHCB_MSR_HV_FT_RESP 0x081
+#define GHCB_MSR_HV_FT_REQ 0x080
+#define GHCB_MSR_HV_FT_RESP 0x081
#define GHCB_MSR_TERM_REQ 0x100
#define GHCB_MSR_TERM_REASON_SET_POS 12
#define GHCB_MSR_TERM_REASON_SET_MASK 0xf
#define GHCB_MSR_TERM_REASON_POS 16
#define GHCB_MSR_TERM_REASON_MASK 0xff
-#define GHCB_SEV_TERM_REASON(reason_set, reason_val) \
- (((((u64)reason_set) & GHCB_MSR_TERM_REASON_SET_MASK) << GHCB_MSR_TERM_REASON_SET_POS) | \
- ((((u64)reason_val) & GHCB_MSR_TERM_REASON_MASK) << GHCB_MSR_TERM_REASON_POS))
+
+#define GHCB_SEV_TERM_REASON(reason_set, reason_val) \
+ /* GHCBData[15:12] */ \
+ (((((u64)reason_set) & 0xf) << 12) | \
+ /* GHCBData[23:16] */ \
+ ((((u64)reason_val) & 0xff) << 16))
#define GHCB_SEV_ES_GEN_REQ 0
#define GHCB_SEV_ES_PROT_UNSUPPORTED 1
--
2.29.2
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Jul 07, 2021 at 01:14:33PM -0500, Brijesh Singh wrote:
> diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
> index 11b7d9cea775..23929a3010df 100644
> --- a/arch/x86/include/asm/sev-common.h
> +++ b/arch/x86/include/asm/sev-common.h
> @@ -45,6 +45,15 @@
> (((unsigned long)reg & GHCB_MSR_CPUID_REG_MASK) << GHCB_MSR_CPUID_REG_POS) | \
> (((unsigned long)fn) << GHCB_MSR_CPUID_FUNC_POS))
>
> +/* GHCB Hypervisor Feature Request */
> +#define GHCB_MSR_HV_FT_REQ 0x080
> +#define GHCB_MSR_HV_FT_RESP 0x081
> +#define GHCB_MSR_HV_FT_POS 12
> +#define GHCB_MSR_HV_FT_MASK GENMASK_ULL(51, 0)
> +
> +#define GHCB_MSR_HV_FT_RESP_VAL(v) \
> + (((unsigned long)((v) >> GHCB_MSR_HV_FT_POS) & GHCB_MSR_HV_FT_MASK))
As I suggested...
> @@ -215,6 +216,7 @@
> { SVM_VMGEXIT_NMI_COMPLETE, "vmgexit_nmi_complete" }, \
> { SVM_VMGEXIT_AP_HLT_LOOP, "vmgexit_ap_hlt_loop" }, \
> { SVM_VMGEXIT_AP_JUMP_TABLE, "vmgexit_ap_jump_table" }, \
> + { SVM_VMGEXIT_HYPERVISOR_FEATURES, "vmgexit_hypervisor_feature" }, \
SVM_VMGEXIT_HV_FEATURES
> { SVM_EXIT_ERR, "invalid_guest_state" }
>
>
> diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
> index 19c2306ac02d..34821da5f05e 100644
> --- a/arch/x86/kernel/sev-shared.c
> +++ b/arch/x86/kernel/sev-shared.c
> @@ -23,6 +23,9 @@
> */
> static u16 ghcb_version __section(".data..ro_after_init");
>
> +/* Bitmap of SEV features supported by the hypervisor */
> +u64 sev_hv_features __section(".data..ro_after_init") = 0;
__ro_after_init
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index 66b7f63ad041..540b81ac54c9 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -96,6 +96,9 @@ struct ghcb_state {
> static DEFINE_PER_CPU(struct sev_es_runtime_data*, runtime_data);
> DEFINE_STATIC_KEY_FALSE(sev_es_enable_key);
>
> +/* Bitmap of SEV features supported by the hypervisor */
> +EXPORT_SYMBOL(sev_hv_features);
Why is this exported and why not a _GPL export?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Jul 07, 2021 at 01:14:34PM -0500, Brijesh Singh wrote:
> The sev_feature_enabled() helper can be used by the guest to query whether
> the SNP - Secure Nested Paging feature is active.
>
> Signed-off-by: Brijesh Singh <[email protected]>
> ---
> arch/x86/include/asm/mem_encrypt.h | 8 ++++++++
> arch/x86/include/asm/msr-index.h | 2 ++
> arch/x86/mm/mem_encrypt.c | 14 ++++++++++++++
> 3 files changed, 24 insertions(+)
This will get replaced by this I presume:
https://lkml.kernel.org/r/[email protected]
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Jul 07, 2021 at 01:14:35PM -0500, Brijesh Singh wrote:
> diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
> index 23929a3010df..e75e29c05f59 100644
> --- a/arch/x86/include/asm/sev-common.h
> +++ b/arch/x86/include/asm/sev-common.h
> @@ -63,9 +63,17 @@
> (((((u64)reason_set) & GHCB_MSR_TERM_REASON_SET_MASK) << GHCB_MSR_TERM_REASON_SET_POS) | \
> ((((u64)reason_val) & GHCB_MSR_TERM_REASON_MASK) << GHCB_MSR_TERM_REASON_POS))
>
> +/* Error code from reason set 0 */
... Error codes...
> +#define SEV_TERM_SET_GEN 0
> #define GHCB_SEV_ES_GEN_REQ 0
> #define GHCB_SEV_ES_PROT_UNSUPPORTED 1
>
> #define GHCB_RESP_CODE(v) ((v) & GHCB_MSR_INFO_MASK)
>
> +/* Linux specific reason codes (used with reason set 1) */
... Linux-specific ...
> +#define SEV_TERM_SET_LINUX 1
GHCB doc says:
"This document defines and owns reason code set 0x0"
Should it also say, reason code set 1 is allocated for Linux guest use?
I don't see why not...
Tom?
> +#define GHCB_TERM_REGISTER 0 /* GHCB GPA registration failure */
> +#define GHCB_TERM_PSC 1 /* Page State Change failure */
> +#define GHCB_TERM_PVALIDATE 2 /* Pvalidate failure */
> +
> #endif
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On 8/10/21 4:17 AM, Borislav Petkov wrote:
> On Wed, Jul 07, 2021 at 01:14:32PM -0500, Brijesh Singh wrote:
>> diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
>> index 114f62fe2529..19c2306ac02d 100644
>> --- a/arch/x86/kernel/sev-shared.c
>> +++ b/arch/x86/kernel/sev-shared.c
>> @@ -14,6 +14,15 @@
>> #define has_cpuflag(f) boot_cpu_has(f)
>> #endif
>>
>> +/*
>> + * Since feature negotiation related variables are set early in the boot
>> + * process they must reside in the .data section so as not to be zeroed
>> + * out when the .bss section is later cleared.
>> + *
>> + * GHCB protocol version negotiated with the hypervisor.
>> + */
>> +static u16 ghcb_version __section(".data..ro_after_init");
>
> There's a define for that section specifier: __ro_after_init
>
Noted.
thanks
On 8/10/21 5:01 AM, Borislav Petkov wrote:
> On Wed, Jul 07, 2021 at 01:14:32PM -0500, Brijesh Singh wrote:
>> The SEV-ES guest calls the sev_es_negotiate_protocol() to negotiate the
>> GHCB protocol version before establishing the GHCB. Cache the negotiated
>> GHCB version so that it can be used later.
>>
>> Signed-off-by: Brijesh Singh <[email protected]>
>> ---
>> arch/x86/include/asm/sev.h | 2 +-
>> arch/x86/kernel/sev-shared.c | 17 ++++++++++++++---
>> 2 files changed, 15 insertions(+), 4 deletions(-)
>
> Also, while looking at this, all those defines in sev-common.h were
> bothering me for a while now because they're an unreadable mess because
> I have to go look at the GHCB spec, decode the corresponding MSR
> protocol request and then go and piece together each request value by
> verifying the masks...
>
> So I did the below which is, IMO, a lot more readable as you can follow
> it directly with the spec opened in parallel.
>
> Thus, if you don't have a better idea, I'd ask you to please add this to
> your set and continue defining the new MSR protocol requests this way so
> that it can be readable.
>
thanks Boris, I will include this in my series and rework the remaining
patches to align with it.
-brijesh
On 8/10/21 6:22 AM, Borislav Petkov wrote:
>
> SVM_VMGEXIT_HV_FEATURES
>
Noted.
>>
>> +/* Bitmap of SEV features supported by the hypervisor */
>> +u64 sev_hv_features __section(".data..ro_after_init") = 0;
>
> __ro_after_init
Noted.
>
>> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
>> index 66b7f63ad041..540b81ac54c9 100644
>> --- a/arch/x86/kernel/sev.c
>> +++ b/arch/x86/kernel/sev.c
>> @@ -96,6 +96,9 @@ struct ghcb_state {
>> static DEFINE_PER_CPU(struct sev_es_runtime_data*, runtime_data);
>> DEFINE_STATIC_KEY_FALSE(sev_es_enable_key);
>>
>> +/* Bitmap of SEV features supported by the hypervisor */
>> +EXPORT_SYMBOL(sev_hv_features);
>
> Why is this exported and why not a _GPL export?
>
I was thinking that some driver may need it in future, but nothing in my
series needs it yet. I will drop it and we can revisit it later.
-Brijesh
On Tue, Aug 10, 2021 at 08:39:02AM -0500, Brijesh Singh wrote:
> I was thinking that some driver may need it in future, but nothing in my
> series needs it yet. I will drop it and we can revisit it later.
Yeah, please never do such exports in anticipation.
And if we *ever* need them, they should be _GPL ones - not
EXPORT_SYMBOL. And then the API needs to be discussed and potentially
proper accessors added instead of exporting naked variables...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On 8/10/21 6:25 AM, Borislav Petkov wrote:
> On Wed, Jul 07, 2021 at 01:14:34PM -0500, Brijesh Singh wrote:
>> The sev_feature_enabled() helper can be used by the guest to query whether
>> the SNP - Secure Nested Paging feature is active.
>>
>> Signed-off-by: Brijesh Singh <[email protected]>
>> ---
>> arch/x86/include/asm/mem_encrypt.h | 8 ++++++++
>> arch/x86/include/asm/msr-index.h | 2 ++
>> arch/x86/mm/mem_encrypt.c | 14 ++++++++++++++
>> 3 files changed, 24 insertions(+)
>
> This will get replaced by this I presume:
>
> https://lkml.kernel.org/r/[email protected]
>
Yes.
thanks
On 8/10/21 6:33 AM, Borislav Petkov wrote:
> On Wed, Jul 07, 2021 at 01:14:35PM -0500, Brijesh Singh wrote:
>> diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
>> index 23929a3010df..e75e29c05f59 100644
>> --- a/arch/x86/include/asm/sev-common.h
>> +++ b/arch/x86/include/asm/sev-common.h
>> @@ -63,9 +63,17 @@
>> (((((u64)reason_set) & GHCB_MSR_TERM_REASON_SET_MASK) << GHCB_MSR_TERM_REASON_SET_POS) | \
>> ((((u64)reason_val) & GHCB_MSR_TERM_REASON_MASK) << GHCB_MSR_TERM_REASON_POS))
>>
>> +/* Error code from reason set 0 */
>
> ... Error codes...
>
Noted.
>> +#define SEV_TERM_SET_GEN 0
>> #define GHCB_SEV_ES_GEN_REQ 0
>> #define GHCB_SEV_ES_PROT_UNSUPPORTED 1
>>
>> #define GHCB_RESP_CODE(v) ((v) & GHCB_MSR_INFO_MASK)
>>
>> +/* Linux specific reason codes (used with reason set 1) */
>
> ... Linux-specific ...
Noted.
>
>> +#define SEV_TERM_SET_LINUX 1
>
> GHCB doc says:
>
> "This document defines and owns reason code set 0x0"
>
> Should it also say, reason code set 1 is allocated for Linux guest use?
> I don't see why not...
> Tom?
>
If Tom is okay with it then maybe the next version of the GHCB doc can
add this text.
On 8/10/21 9:59 AM, Brijesh Singh wrote:
> On 8/10/21 6:33 AM, Borislav Petkov wrote:
>> On Wed, Jul 07, 2021 at 01:14:35PM -0500, Brijesh Singh wrote:
>>> +#define SEV_TERM_SET_LINUX 1
>>
>> GHCB doc says:
>>
>> "This document defines and owns reason code set 0x0"
>>
>> Should it also say, reason code set 1 is allocated for Linux guest use?
>> I don't see why not...
>> Tom?
>>
>
> If Tom is okay with it then maybe the next version of the GHCB doc can add
> this text.
IIRC, during the review of the first GHCB version there was discussion
about assigning reason sets outside of 0 within the spec and the overall
feeling was to not do that as part of the spec.
We can re-open that discussion for the next version of the GHCB document.
Thanks,
Tom
On Wed, Jul 07, 2021 at 01:14:38PM -0500, Brijesh Singh wrote:
> Virtual Machine Privilege Level (VMPL) is an optional feature in the
> SEV-SNP architecture, which allows a guest VM to divide its address space
> into four levels. The level can be used to provide the hardware isolated
> abstraction layers within a VM. VMPL0 is the highest privilege, and
> VMPL3 is the least privilege. Certain operations must be done by the VMPL0
> software, such as:
>
> * Validate or invalidate memory range (PVALIDATE instruction)
> * Allocate VMSA page (RMPADJUST instruction when VMSA=1)
>
> The initial SEV-SNP support assumes that the guest kernel is running on
> VMPL0. Let's add a check to make sure that kernel is running at VMPL0
> before continuing the boot. There is no easy method to query the current
> VMPL level, so use the RMPADJUST instruction to determine whether it's
> booted at VMPL0.
>
> Signed-off-by: Brijesh Singh <[email protected]>
> ---
> arch/x86/boot/compressed/sev.c | 41 ++++++++++++++++++++++++++++---
> arch/x86/include/asm/sev-common.h | 1 +
> arch/x86/include/asm/sev.h | 3 +++
> 3 files changed, 42 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
> index 7be325d9b09f..2f3081e9c78c 100644
> --- a/arch/x86/boot/compressed/sev.c
> +++ b/arch/x86/boot/compressed/sev.c
> @@ -134,6 +134,36 @@ static inline bool sev_snp_enabled(void)
> return msr_sev_status & MSR_AMD64_SEV_SNP_ENABLED;
> }
>
> +static bool is_vmpl0(void)
> +{
> + u64 attrs, va;
> + int err;
> +
> + /*
> + * There is no straightforward way to query the current VMPL level. The
So this is not nice at all.
And this VMPL level checking can't be part of the GHCB MSR protocol
because the HV can tell us any VPML level it wants to.
Is there a way to disable VMPL levels and say, this guest should run
only at VMPL0?
Err, I see SYSCFG[VMPLEn]:
"VMPLEn. Bit 25. Setting this bit to 1 enables the VMPL feature (Section
15.36.7 “Virtual Machine Privilege Levels,” on page 580). Software
should set this bit to 1 when SecureNestedPagingEn is being set to 1.
Once SecureNestedPagingEn is set to 1, VMPLEn cannot be changed."
But why should that bit be set if SNP is enabled? Can I run a SNP guest
without VPMLs, i.e, at an implicit VPML level 0?
It says above VPML is optional...
Also, why do you even need to do this at all since the guest controls
and validates its memory with the RMP? It can simply go and check the
VMPLs of every page it owns to make sure it is 0.
Also, if you really wanna support guests with multiple VMPLs, then
prevalidating its memory is going to be a useless exercise because it'll
have to go and revalidate the VMPL levels...
I also see this:
"When the hypervisor assigns a page to a guest using RMPUPDATE, full
permissions are enabled for VMPL0 and are disabled for all other VMPLs."
so you get your memory at VMPL0 by the HV. So what is that check for?
Questions over questions, I'm sure I'm missing an aspect.
> + * simplest method is to use the RMPADJUST instruction to change a page
> + * permission to a VMPL level-1, and if the guest kernel is launched at
> + * at a level <= 1, then RMPADJUST instruction will return an error.
WARNING: Possible repeated word: 'at'
#156: FILE: arch/x86/boot/compressed/sev.c:146:
+ * permission to a VMPL level-1, and if the guest kernel is launched at
+ * at a level <= 1, then RMPADJUST instruction will return an error.
How many times do I have to say:
Please integrate scripts/checkpatch.pl into your patch creation
workflow. Some of the warnings/errors *actually* make sense.
?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Jul 07, 2021 at 01:14:39PM -0500, Brijesh Singh wrote:
> @@ -274,16 +274,31 @@ static int set_clr_page_flags(struct x86_mapping_info *info,
> /*
> * Changing encryption attributes of a page requires to flush it from
> * the caches.
> + *
> + * If the encryption attribute is being cleared, then change the page
> + * state to shared in the RMP table.
That comment...
> */
> - if ((set | clr) & _PAGE_ENC)
> + if ((set | clr) & _PAGE_ENC) {
> clflush_page(address);
>
... goes here:
<---
> + if (clr)
> + snp_set_page_shared(pte_pfn(*ptep) << PAGE_SHIFT);
> + }
> +
...
> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
> index 2f3081e9c78c..f386d45a57b6 100644
> --- a/arch/x86/boot/compressed/sev.c
> +++ b/arch/x86/boot/compressed/sev.c
> @@ -164,6 +164,47 @@ static bool is_vmpl0(void)
> return true;
> }
>
> +static void __page_state_change(unsigned long paddr, int op)
That op should be:
enum psc_op {
SNP_PAGE_STATE_SHARED,
SNP_PAGE_STATE_PRIVATE,
};
and have
static void __page_state_change(unsigned long paddr, enum psc_op op)
so that the compiler can check you're at least passing from the correct
set of defines.
> diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
> index ea508835ab33..aee07d1bb138 100644
> --- a/arch/x86/include/asm/sev-common.h
> +++ b/arch/x86/include/asm/sev-common.h
> @@ -45,6 +45,23 @@
> (((unsigned long)reg & GHCB_MSR_CPUID_REG_MASK) << GHCB_MSR_CPUID_REG_POS) | \
> (((unsigned long)fn) << GHCB_MSR_CPUID_FUNC_POS))
>
> +/* SNP Page State Change */
> +#define GHCB_MSR_PSC_REQ 0x014
> +#define SNP_PAGE_STATE_PRIVATE 1
> +#define SNP_PAGE_STATE_SHARED 2
> +#define GHCB_MSR_PSC_GFN_POS 12
> +#define GHCB_MSR_PSC_GFN_MASK GENMASK_ULL(39, 0)
> +#define GHCB_MSR_PSC_OP_POS 52
> +#define GHCB_MSR_PSC_OP_MASK 0xf
> +#define GHCB_MSR_PSC_REQ_GFN(gfn, op) \
> + (((unsigned long)((op) & GHCB_MSR_PSC_OP_MASK) << GHCB_MSR_PSC_OP_POS) | \
> + ((unsigned long)((gfn) & GHCB_MSR_PSC_GFN_MASK) << GHCB_MSR_PSC_GFN_POS) | \
> + GHCB_MSR_PSC_REQ)
> +
> +#define GHCB_MSR_PSC_RESP 0x015
> +#define GHCB_MSR_PSC_ERROR_POS 32
> +#define GHCB_MSR_PSC_RESP_VAL(val) ((val) >> GHCB_MSR_PSC_ERROR_POS)
> +
Also get rid of excessive defines...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Jul 07, 2021 at 01:14:40PM -0500, Brijesh Singh wrote:
> diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
> index aee07d1bb138..b19d8d301f5d 100644
> --- a/arch/x86/include/asm/sev-common.h
> +++ b/arch/x86/include/asm/sev-common.h
> @@ -45,6 +45,17 @@
> (((unsigned long)reg & GHCB_MSR_CPUID_REG_MASK) << GHCB_MSR_CPUID_REG_POS) | \
> (((unsigned long)fn) << GHCB_MSR_CPUID_FUNC_POS))
>
> +/* GHCB GPA Register */
> +#define GHCB_MSR_GPA_REG_REQ 0x012
> +#define GHCB_MSR_GPA_REG_VALUE_POS 12
> +#define GHCB_MSR_GPA_REG_GFN_MASK GENMASK_ULL(51, 0)
> +#define GHCB_MSR_GPA_REQ_GFN_VAL(v) \
> + (((unsigned long)((v) & GHCB_MSR_GPA_REG_GFN_MASK) << GHCB_MSR_GPA_REG_VALUE_POS)| \
> + GHCB_MSR_GPA_REG_REQ)
> +
> +#define GHCB_MSR_GPA_REG_RESP 0x013
> +#define GHCB_MSR_GPA_REG_RESP_VAL(v) ((v) >> GHCB_MSR_GPA_REG_VALUE_POS)
Simplify...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Jul 07, 2021 at 01:14:42PM -0500, Brijesh Singh wrote:
> +void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
> + unsigned int npages)
> +{
> + if (!sev_feature_enabled(SEV_SNP))
> + return;
> +
> + /* Ask hypervisor to add the memory pages in RMP table as a 'private'. */
From a previous review:
Ask the hypervisor to mark the memory pages as private in the RMP table.
Are you missing my comments, do I have to write them more prominently or
what is the problem?
DO I NEED TO WRITE IN CAPS ONLY MAYBE?
> +void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
> + unsigned int npages)
> +{
> + if (!sev_feature_enabled(SEV_SNP))
> + return;
> +
> + /*
> + * Invalidate the memory pages before they are marked shared in the
> + * RMP table.
> + */
> + pvalidate_pages(vaddr, npages, 0);
> +
> + /* Ask hypervisor to make the memory pages shared in the RMP table. */
From a previous review:
s/make/mark/
> + early_set_page_state(paddr, npages, SNP_PAGE_STATE_SHARED);
> +}
> +
> +void __init snp_prep_memory(unsigned long paddr, unsigned int sz, int op)
that op should be:
enum psc_op {
SNP_PAGE_STATE_SHARED,
SNP_PAGE_STATE_PRIVATE,
};
too.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On 8/13/21 2:25 AM, Borislav Petkov wrote:
> On Wed, Jul 07, 2021 at 01:14:38PM -0500, Brijesh Singh wrote:
>> Virtual Machine Privilege Level (VMPL) is an optional feature in the
>> SEV-SNP architecture, which allows a guest VM to divide its address space
>> into four levels. The level can be used to provide the hardware isolated
>> abstraction layers with a VM. The VMPL0 is the highest privilege, and
>> VMPL3 is the least privilege. Certain operations must be done by the VMPL0
>> software, such as:
>>
>> * Validate or invalidate memory range (PVALIDATE instruction)
>> * Allocate VMSA page (RMPADJUST instruction when VMSA=1)
>>
>> The initial SEV-SNP support assumes that the guest kernel is running on
>> VMPL0. Let's add a check to make sure that kernel is running at VMPL0
>> before continuing the boot. There is no easy method to query the current
>> VMPL level, so use the RMPADJUST instruction to determine whether its
>> booted at the VMPL0.
>>
>> Signed-off-by: Brijesh Singh <[email protected]>
>> ---
>> arch/x86/boot/compressed/sev.c | 41 ++++++++++++++++++++++++++++---
>> arch/x86/include/asm/sev-common.h | 1 +
>> arch/x86/include/asm/sev.h | 3 +++
>> 3 files changed, 42 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
>> index 7be325d9b09f..2f3081e9c78c 100644
>> --- a/arch/x86/boot/compressed/sev.c
>> +++ b/arch/x86/boot/compressed/sev.c
>> @@ -134,6 +134,36 @@ static inline bool sev_snp_enabled(void)
>> return msr_sev_status & MSR_AMD64_SEV_SNP_ENABLED;
>> }
>>
>> +static bool is_vmpl0(void)
>> +{
>> + u64 attrs, va;
>> + int err;
>> +
>> + /*
>> + * There is no straightforward way to query the current VMPL level. The
> So this is not nice at all.
>
> And this VMPL level checking can't be part of the GHCB MSR protocol
> because the HV can tell us any VPML level it wants to.
>
> Is there a way to disable VMPL levels and say, this guest should run
> only at VMPL0?
No.
>
> Err, I see SYSCFG[VMPLEn]:
>
> "VMPLEn. Bit 25. Setting this bit to 1 enables the VMPL feature (Section
> 15.36.7 “Virtual Machine Privilege Levels,” on page 580). Software
> should set this bit to 1 when SecureNestedPagingEn is being set to 1.
> Once SecureNestedPagingEn is set to 1, VMPLEn cannot be changed."
>
> But why should that bit be set if SNP is enabled? Can I run an SNP guest
> without VMPLs, i.e., at an implicit VMPL level 0?
During the firmware initialization the PSP requires that the VMPLEn is
set. See SNP firmware spec [1] section 8.6. To run the SNP guest you
*must* specify a VMPL level during the vCPU creation.
>
> It says above VMPL is optional...
I should not say it's optional when we know from the SEV-SNP spec that
VMPLEn must be set to launch an SEV-SNP guest. I will fix the description.
> Also, why do you even need to do this at all since the guest controls
> and validates its memory with the RMP? It can simply go and check the
> VMPLs of every page it owns to make sure it is 0.
There is no easy way for a guest to query its VMPL level. The VMPL level
is set during the vCPU creation. The boot cpu is created by the HV and
thus its VMPL level is set by the HV. If HV chooses a lower VMPL level
for the boot CPU then Linux guest will not be able to validate its
memory because the PVALIDATE instruction will cause #GP when the vCPU is
running at !VMPL0. The patch tries to detect the boot CPU VMPL level and
terminate the boot.
>
> Also, if you really wanna support guests with multiple VMPLs, then
> prevalidating its memory is going to be a useless exercise because it'll
> have to go and revalidate the VMPL levels...
We do not need to re-validate memory when changing to a different VMPL
level. The RMPADJUST instruction inside the guest can be used to change
the VMPL permissions.
> I also see this:
>
> "When the hypervisor assigns a page to a guest using RMPUPDATE, full
> permissions are enabled for VMPL0 and are disabled for all other VMPLs."
>
> so you get your memory at VMPL0 by the HV. So what is that check for?
Validating the memory is a two-step process:
#1 The HV adds the memory to the RMP table using RMPUPDATE.
#2 The guest issues PVALIDATE.
If the guest is not running at VMPL0 then step #2 will cause a #GP. The
check is there to prevent the #GP and terminate the boot early.
thanks
On 8/13/21 5:22 AM, Borislav Petkov wrote:
>> +static void __page_state_change(unsigned long paddr, int op)
>
> That op should be:
>
> enum psc_op {
> SNP_PAGE_STATE_SHARED,
> SNP_PAGE_STATE_PRIVATE,
> };
>
Noted.
> and have
>
> static void __page_state_change(unsigned long paddr, enum psc_op op)
>
> so that the compiler can check you're at least passing from the correct
> set of defines.
>
>> diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
>> index ea508835ab33..aee07d1bb138 100644
>> --- a/arch/x86/include/asm/sev-common.h
>> +++ b/arch/x86/include/asm/sev-common.h
>> @@ -45,6 +45,23 @@
>> (((unsigned long)reg & GHCB_MSR_CPUID_REG_MASK) << GHCB_MSR_CPUID_REG_POS) | \
>> (((unsigned long)fn) << GHCB_MSR_CPUID_FUNC_POS))
>>
>> +/* SNP Page State Change */
>> +#define GHCB_MSR_PSC_REQ 0x014
>> +#define SNP_PAGE_STATE_PRIVATE 1
>> +#define SNP_PAGE_STATE_SHARED 2
>> +#define GHCB_MSR_PSC_GFN_POS 12
>> +#define GHCB_MSR_PSC_GFN_MASK GENMASK_ULL(39, 0)
>> +#define GHCB_MSR_PSC_OP_POS 52
>> +#define GHCB_MSR_PSC_OP_MASK 0xf
>> +#define GHCB_MSR_PSC_REQ_GFN(gfn, op) \
>> + (((unsigned long)((op) & GHCB_MSR_PSC_OP_MASK) << GHCB_MSR_PSC_OP_POS) | \
>> + ((unsigned long)((gfn) & GHCB_MSR_PSC_GFN_MASK) << GHCB_MSR_PSC_GFN_POS) | \
>> + GHCB_MSR_PSC_REQ)
>> +
>> +#define GHCB_MSR_PSC_RESP 0x015
>> +#define GHCB_MSR_PSC_ERROR_POS 32
>> +#define GHCB_MSR_PSC_RESP_VAL(val) ((val) >> GHCB_MSR_PSC_ERROR_POS)
>> +
>
> Also get rid of excessive defines...
I am getting conflicting review comments on function naming, comment
style, macro etc. While addressing the feedback I try to incorporate all
those comments; let's see how I do in the next rev.
thanks
On Fri, Aug 13, 2021 at 08:13:20AM -0500, Brijesh Singh wrote:
> During the firmware initialization the PSP requires that the VMPLEn is
> set. See SNP firmware spec [1] section 8.6. To run the SNP guest you
> *must* specify a VMPL level during the vCPU creation.
Yes, that's why I said "implicit VMPL level 0"! When you don't specify
it, it should be implied as 0.
Right now that "enable" bit is useless as it is *forced* to be enabled.
I sincerely hope querying the VMPL level is going to be made
straightforward in future versions.
> I should not say its optional when we know from the SEV-SNP spec that
> VMPLEn must be set to launch SEV-SNP guest. I will fix the description.
It probably wasn't required when that bit was invented - why would you
call it "enable" otherwise - but some decision later made it required,
I'd guess.
> There is no easy way for a guest to query its VMPL level.
Yes, and there should be.
> The VMPL level is set during the vCPU creation. The boot cpu is
> created by the HV and thus its VMPL level is set by the HV. If HV
> chooses a lower VMPL level for the boot CPU then Linux guest will
> not be able to validate its memory because the PVALIDATE instruction
> will cause #GP when the vCPU is running at !VMPL0. The patch tries to
> detect the boot CPU VMPL level and terminate the boot.
I figured as much. All I don't like is the VMPL checking method.
> If guest is not running at VMPL0 then step #2 will cause #GP. The check
> is prevent the #GP and terminate the boot early.
Yah, Tom helped me understand the design of the permission masks in the
RMP on IRC.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Fri, Aug 13, 2021 at 09:21:14AM -0500, Brijesh Singh wrote:
> > Also get rid of excessive defines...
>
> I am getting conflicting review comments on function naming, comment style,
> macro etc. While addressing the feedback I try to incorporate all those
> comments; let's see how I do in the next rev.
What do you mean? Example?
With "Also get rid of excessive defines..." I mean this here from
earlier this week:
https://lkml.kernel.org/r/YRJOkQe0W9/[email protected]
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Jul 07, 2021 at 01:14:43PM -0500, Brijesh Singh wrote:
> The encryption attribute for the bss.decrypted region is cleared in the
> initial page table build. This is because the section contains data
> that needs to be shared between the guest and the hypervisor.
>
> When SEV-SNP is active, just clearing the encryption attribute in the
> page table is not enough. The page state needs to be updated in the RMP
> table.
>
> Signed-off-by: Brijesh Singh <[email protected]>
> ---
> arch/x86/kernel/head64.c | 7 +++++++
> 1 file changed, 7 insertions(+)
Please apply this cleanup before this one.
Thx.
---
From: Borislav Petkov <[email protected]>
Subject: [PATCH] x86/head64: Carve out the guest encryption postprocessing into a helper
Carve it out so that it is abstracted out of the main boot path. All
other encrypted guest-relevant processing should be placed in there.
No functional changes.
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/kernel/head64.c | 55 ++++++++++++++++++++++------------------
1 file changed, 31 insertions(+), 24 deletions(-)
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index de01903c3735..eee24b427237 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -126,6 +126,36 @@ static bool __head check_la57_support(unsigned long physaddr)
}
#endif
+static unsigned long sme_postprocess_startup(struct boot_params *bp, pmdval_t *pmd)
+{
+ unsigned long vaddr, vaddr_end;
+ int i;
+
+ /* Encrypt the kernel and related (if SME is active) */
+ sme_encrypt_kernel(bp);
+
+ /*
+ * Clear the memory encryption mask from the .bss..decrypted section.
+ * The bss section will be memset to zero later in the initialization so
+ * there is no need to zero it after changing the memory encryption
+ * attribute.
+ */
+ if (mem_encrypt_active()) {
+ vaddr = (unsigned long)__start_bss_decrypted;
+ vaddr_end = (unsigned long)__end_bss_decrypted;
+ for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
+ i = pmd_index(vaddr);
+ pmd[i] -= sme_get_me_mask();
+ }
+ }
+
+ /*
+ * Return the SME encryption mask (if SME is active) to be used as a
+ * modifier for the initial pgdir entry programmed into CR3.
+ */
+ return sme_get_me_mask();
+}
+
/* Code in __startup_64() can be relocated during execution, but the compiler
* doesn't have to generate PC-relative relocations when accessing globals from
* that function. Clang actually does not generate them, which leads to
@@ -135,7 +165,6 @@ static bool __head check_la57_support(unsigned long physaddr)
unsigned long __head __startup_64(unsigned long physaddr,
struct boot_params *bp)
{
- unsigned long vaddr, vaddr_end;
unsigned long load_delta, *p;
unsigned long pgtable_flags;
pgdval_t *pgd;
@@ -276,29 +305,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
*/
*fixup_long(&phys_base, physaddr) += load_delta - sme_get_me_mask();
- /* Encrypt the kernel and related (if SME is active) */
- sme_encrypt_kernel(bp);
-
- /*
- * Clear the memory encryption mask from the .bss..decrypted section.
- * The bss section will be memset to zero later in the initialization so
- * there is no need to zero it after changing the memory encryption
- * attribute.
- */
- if (mem_encrypt_active()) {
- vaddr = (unsigned long)__start_bss_decrypted;
- vaddr_end = (unsigned long)__end_bss_decrypted;
- for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
- i = pmd_index(vaddr);
- pmd[i] -= sme_get_me_mask();
- }
- }
-
- /*
- * Return the SME encryption mask (if SME is active) to be used as a
- * modifier for the initial pgdir entry programmed into CR3.
- */
- return sme_get_me_mask();
+ return sme_postprocess_startup(bp, pmd);
}
unsigned long __startup_secondary_64(void)
--
2.29.2
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Jul 07, 2021 at 01:14:46PM -0500, Brijesh Singh wrote:
> The hypervisor uses the SEV_FEATURES field (offset 3B0h) in the Save State
> Area to control the SEV-SNP guest features such as SNPActive, vTOM,
> ReflectVC etc. An SEV-SNP guest can read the SEV_FEATURES fields through
> the SEV_STATUS MSR.
>
> While at it, update the dump_vmcb() to log the VMPL level.
>
> See APM2 Table 15-34 and B-4 for more details.
>
> Signed-off-by: Brijesh Singh <[email protected]>
> ---
> arch/x86/include/asm/svm.h | 15 +++++++++++++--
> arch/x86/kvm/svm/svm.c | 4 ++--
> 2 files changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index 772e60efe243..ff614cdcf628 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -212,6 +212,15 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
> #define SVM_NESTED_CTL_SEV_ENABLE BIT(1)
> #define SVM_NESTED_CTL_SEV_ES_ENABLE BIT(2)
>
> +#define SVM_SEV_FEATURES_SNP_ACTIVE BIT(0)
> +#define SVM_SEV_FEATURES_VTOM BIT(1)
> +#define SVM_SEV_FEATURES_REFLECT_VC BIT(2)
> +#define SVM_SEV_FEATURES_RESTRICTED_INJECTION BIT(3)
> +#define SVM_SEV_FEATURES_ALTERNATE_INJECTION BIT(4)
> +#define SVM_SEV_FEATURES_DEBUG_SWAP BIT(5)
> +#define SVM_SEV_FEATURES_PREVENT_HOST_IBS BIT(6)
> +#define SVM_SEV_FEATURES_BTB_ISOLATION BIT(7)
Only some of those get used and only later. Please introduce only those
with the patch that adds usage.
Also,
s/SVM_SEV_FEATURES_/SVM_SEV_FEAT_/g
at least.
And by the way, why is this patch and the next 3 part of the guest set?
They look like they belong into the hypervisor set.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Tue, Aug 17, 2021 at 07:54:15PM +0200, Borislav Petkov wrote:
> And by the way, why is this patch and the next 3 part of the guest set?
> They look like they belong into the hypervisor set.
Aha, patch 20 and further need the definitions.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On 8/17/21 12:54 PM, Borislav Petkov wrote:
> On Wed, Jul 07, 2021 at 01:14:46PM -0500, Brijesh Singh wrote:
>> The hypervisor uses the SEV_FEATURES field (offset 3B0h) in the Save State
>> Area to control the SEV-SNP guest features such as SNPActive, vTOM,
>> ReflectVC etc. An SEV-SNP guest can read the SEV_FEATURES fields through
>> the SEV_STATUS MSR.
>>
>> While at it, update the dump_vmcb() to log the VMPL level.
>>
>> See APM2 Table 15-34 and B-4 for more details.
>>
>> Signed-off-by: Brijesh Singh <[email protected]>
>> ---
>> arch/x86/include/asm/svm.h | 15 +++++++++++++--
>> arch/x86/kvm/svm/svm.c | 4 ++--
>> 2 files changed, 15 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
>> index 772e60efe243..ff614cdcf628 100644
>> --- a/arch/x86/include/asm/svm.h
>> +++ b/arch/x86/include/asm/svm.h
>> @@ -212,6 +212,15 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
>> #define SVM_NESTED_CTL_SEV_ENABLE BIT(1)
>> #define SVM_NESTED_CTL_SEV_ES_ENABLE BIT(2)
>>
>> +#define SVM_SEV_FEATURES_SNP_ACTIVE BIT(0)
>> +#define SVM_SEV_FEATURES_VTOM BIT(1)
>> +#define SVM_SEV_FEATURES_REFLECT_VC BIT(2)
>> +#define SVM_SEV_FEATURES_RESTRICTED_INJECTION BIT(3)
>> +#define SVM_SEV_FEATURES_ALTERNATE_INJECTION BIT(4)
>> +#define SVM_SEV_FEATURES_DEBUG_SWAP BIT(5)
>> +#define SVM_SEV_FEATURES_PREVENT_HOST_IBS BIT(6)
>> +#define SVM_SEV_FEATURES_BTB_ISOLATION BIT(7)
>
> Only some of those get used and only later. Please introduce only those
> with the patch that adds usage.
>
Okay.
> Also,
>
> s/SVM_SEV_FEATURES_/SVM_SEV_FEAT_/g
>
I can do that.
> at least.
>
> And by the way, why is this patch and the next 3 part of the guest set?
> They look like they belong into the hypervisor set.
>
This is needed by the AP creation: in SNP, the AP creation needs to
populate the VMSA page and thus needs to use some of the macros and fields etc.
On Wed, Jul 07, 2021 at 01:14:50PM -0500, Brijesh Singh wrote:
> @@ -854,6 +858,207 @@ void snp_set_memory_private(unsigned long vaddr, unsigned int npages)
> pvalidate_pages(vaddr, npages, 1);
> }
>
> +static int vmsa_rmpadjust(void *va, bool vmsa)
I know, I know it gets a bool vmsa param but you can still call it
simply rmpadjust() because this is what it does - it is a wrapper around
the insn. Just like pvalidate() and so on.
...
> +static int wakeup_cpu_via_vmgexit(int apic_id, unsigned long start_ip)
> +{
> + struct sev_es_save_area *cur_vmsa, *vmsa;
> + struct ghcb_state state;
> + unsigned long flags;
> + struct ghcb *ghcb;
> + int cpu, err, ret;
> + u8 sipi_vector;
> + u64 cr4;
> +
> + if ((sev_hv_features & GHCB_HV_FT_SNP_AP_CREATION) != GHCB_HV_FT_SNP_AP_CREATION)
> + return -EOPNOTSUPP;
> +
> + /*
> + * Verify the desired start IP against the known trampoline start IP
> + * to catch any future new trampolines that may be introduced that
> + * would require a new protected guest entry point.
> + */
> + if (WARN_ONCE(start_ip != real_mode_header->trampoline_start,
> + "unsupported SEV-SNP start_ip: %lx\n", start_ip))
"Unsupported... " - with a capital letter
> + return -EINVAL;
> +
> + /* Override start_ip with known protected guest start IP */
> + start_ip = real_mode_header->sev_es_trampoline_start;
> +
...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On 8/17/21 3:04 PM, Borislav Petkov wrote:
> On Wed, Jul 07, 2021 at 01:14:50PM -0500, Brijesh Singh wrote:
>> @@ -854,6 +858,207 @@ void snp_set_memory_private(unsigned long vaddr, unsigned int npages)
>> pvalidate_pages(vaddr, npages, 1);
>> }
>>
>> +static int vmsa_rmpadjust(void *va, bool vmsa)
>
> I know, I know it gets a bool vmsa param but you can still call it
> simply rmpadjust() because this is what it does - it is a wrapper around
> the insn. Just like pvalidate() and so on.
Well, yes and no. It really is just setting or clearing the VMSA page
attribute. It isn't trying to update permissions for the lower VMPLs, so I
didn't want to mislabel it as a general rmpadjust function. But it's a
simple enough thing to change and if multiple VMPL levels are ever
supported it can be evaluated at that time.
>
> ...
>
>> +static int wakeup_cpu_via_vmgexit(int apic_id, unsigned long start_ip)
>> +{
>> + struct sev_es_save_area *cur_vmsa, *vmsa;
>> + struct ghcb_state state;
>> + unsigned long flags;
>> + struct ghcb *ghcb;
>> + int cpu, err, ret;
>> + u8 sipi_vector;
>> + u64 cr4;
>> +
>> + if ((sev_hv_features & GHCB_HV_FT_SNP_AP_CREATION) != GHCB_HV_FT_SNP_AP_CREATION)
>> + return -EOPNOTSUPP;
>> +
>> + /*
>> + * Verify the desired start IP against the known trampoline start IP
>> + * to catch any future new trampolines that may be introduced that
>> + * would require a new protected guest entry point.
>> + */
>> + if (WARN_ONCE(start_ip != real_mode_header->trampoline_start,
>> + "unsupported SEV-SNP start_ip: %lx\n", start_ip))
>
> "Unsupported... " - with a capital letter
Will do.
Thanks,
Tom
>
>> + return -EINVAL;
>> +
>> + /* Override start_ip with known protected guest start IP */
>> + start_ip = real_mode_header->sev_es_trampoline_start;
>> +
>
> ...
>
On Tue, Aug 17, 2021 at 05:13:54PM -0500, Tom Lendacky wrote:
> Well, yes and no. It really is just setting or clearing the VMSA page
> attribute. It isn't trying to update permissions for the lower VMPLs, so I
> didn't want to mislabel it as a general rmpadjust function. But it's a
> simple enough thing to change and if multiple VMPL levels are ever
> supported it can be evaluated at that time.
You got it - when we need more RMPADJUST functionality, then that should
be the function that gets the beefing up.
:-)
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
Finally drop this bouncing npmccallum at RH email address from the Cc
list.
On Wed, Jul 07, 2021 at 01:14:52PM -0500, Brijesh Singh wrote:
> From: Michael Roth <[email protected]>
>
> This code will also be used later for SEV-SNP-validated CPUID code in
> some cases, so move it to a common helper.
>
> Signed-off-by: Michael Roth <[email protected]>
> Signed-off-by: Brijesh Singh <[email protected]>
> ---
> arch/x86/kernel/sev-shared.c | 84 +++++++++++++++++++++++++-----------
> 1 file changed, 58 insertions(+), 26 deletions(-)
>
> diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
> index be4025f14b4f..4884de256a49 100644
> --- a/arch/x86/kernel/sev-shared.c
> +++ b/arch/x86/kernel/sev-shared.c
> @@ -184,6 +184,58 @@ static enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
> return ret;
> }
>
> +static int sev_es_cpuid_msr_proto(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
Since it is not only SEV-ES, it should be prefixed with "sev_" like
we do for the other such functions. I guess simply
sev_cpuid()
?
> + u32 *ecx, u32 *edx)
> +{
> + u64 val;
> +
> + if (eax) {
What's the protection for? Is it ever going to be called with NULL ptrs
for the regs? That's not the case in this patchset at least...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Jul 07, 2021 at 01:14:53PM -0500, Brijesh Singh wrote:
> From: Michael Roth <[email protected]>
>
> Determining which CPUID leafs have significant ECX/index values is
> also needed by guest kernel code when doing SEV-SNP-validated CPUID
> lookups. Move this to common code to keep future updates in sync.
>
> Signed-off-by: Michael Roth <[email protected]>
> Signed-off-by: Brijesh Singh <[email protected]>
> ---
> arch/x86/include/asm/cpuid-indexed.h | 26 ++++++++++++++++++++++++++
> arch/x86/kvm/cpuid.c | 17 ++---------------
> 2 files changed, 28 insertions(+), 15 deletions(-)
> create mode 100644 arch/x86/include/asm/cpuid-indexed.h
>
> diff --git a/arch/x86/include/asm/cpuid-indexed.h b/arch/x86/include/asm/cpuid-indexed.h
> new file mode 100644
> index 000000000000..f5ab746f5712
> --- /dev/null
> +++ b/arch/x86/include/asm/cpuid-indexed.h
Just call it arch/x86/include/asm/cpuid.h
And if you feel bored, you can move the cpuid* primitives from
asm/processor.h to it, in another patch so that processor.h gets
slimmer.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Jul 07, 2021 at 01:14:54PM -0500, Brijesh Singh wrote:
> From: Michael Roth <[email protected]>
>
> Future patches for SEV-SNP-validated CPUID will also require early
> parsing of the EFI configuration. Move the related code into a set of
> helpers that can be re-used for that purpose.
>
> Signed-off-by: Michael Roth <[email protected]>
> Signed-off-by: Brijesh Singh <[email protected]>
> ---
> arch/x86/boot/compressed/Makefile | 1 +
> arch/x86/boot/compressed/acpi.c | 124 +++++---------
> arch/x86/boot/compressed/efi-config-table.c | 180 ++++++++++++++++++++
> arch/x86/boot/compressed/misc.h | 50 ++++++
> 4 files changed, 272 insertions(+), 83 deletions(-)
> create mode 100644 arch/x86/boot/compressed/efi-config-table.c
arch/x86/boot/compressed/efi.c
should be good enough.
And in general, this patch is hard to review because it does a bunch of
things at the same time. You should split it:
- the first patch should carve out only the functionality into helpers
without adding or changing the existing functionality.
- later ones should add the new functionality, in single logical steps.
Some preliminary comments below as far as I can:
> diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
> index 431bf7f846c3..b41aecfda49c 100644
> --- a/arch/x86/boot/compressed/Makefile
> +++ b/arch/x86/boot/compressed/Makefile
> @@ -100,6 +100,7 @@ endif
> vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o
>
> vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o
> +vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi-config-table.o
> efi-obj-$(CONFIG_EFI_STUB) = $(objtree)/drivers/firmware/efi/libstub/lib.a
>
> $(obj)/vmlinux: $(vmlinux-objs-y) $(efi-obj-y) FORCE
> diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
> index 8bcbcee54aa1..e087dcaf43b3 100644
> --- a/arch/x86/boot/compressed/acpi.c
> +++ b/arch/x86/boot/compressed/acpi.c
> @@ -24,42 +24,36 @@ struct mem_vector immovable_mem[MAX_NUMNODES*2];
> * Search EFI system tables for RSDP. If both ACPI_20_TABLE_GUID and
> * ACPI_TABLE_GUID are found, take the former, which has more features.
> */
> +#ifdef CONFIG_EFI
> +static bool
> +rsdp_find_fn(efi_guid_t guid, unsigned long vendor_table, bool efi_64,
> + void *opaque)
> +{
> + acpi_physical_address *rsdp_addr = opaque;
> +
> + if (!(efi_guidcmp(guid, ACPI_TABLE_GUID))) {
> + *rsdp_addr = vendor_table;
> + } else if (!(efi_guidcmp(guid, ACPI_20_TABLE_GUID))) {
> + *rsdp_addr = vendor_table;
> + return false;
No "return false" in the ACPI_TABLE_GUID branch above? Maybe this has to
do with the preference for ACPI_20_TABLE_GUID.
In any case, this looks silly. Please do the iteration simple
and stupid without the function pointer and get rid of that
efi_foreach_conf_entry() thing - this is not firmware.
> diff --git a/arch/x86/boot/compressed/efi-config-table.c b/arch/x86/boot/compressed/efi-config-table.c
> new file mode 100644
> index 000000000000..d1a34aa7cefd
> --- /dev/null
> +++ b/arch/x86/boot/compressed/efi-config-table.c
...
> +/*
If you're going to add proper comments, make them kernel-doc. I.e., it
should start with
/**
and then use
./scripts/kernel-doc -none arch/x86/boot/compressed/efi-config-table.c
to check that they're all proper.
> + * Given boot_params, retrieve the physical address of EFI system table.
> + *
> + * @boot_params: pointer to boot_params
> + * @sys_table_pa: location to store physical address of system table
> + * @is_efi_64: location to store whether using 64-bit EFI or not
> + *
> + * Returns 0 on success. On error, return params are left unchanged.
> + */
> +int
> +efi_bp_get_system_table(struct boot_params *boot_params,
There's no need for the "_bp_" - just efi_get_system_table(). Ditto for
the other naming.
I'll review the rest properly after you've split it.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
(Moving Ard to To: for the EFI bits)
On Wed, Jul 07, 2021 at 01:14:55PM -0500, Brijesh Singh wrote:
> While launching the encrypted guests, the hypervisor may need to provide
> some additional information during the guest boot. When booting under an
> EFI-based BIOS, the EFI configuration table contains an entry for the
> confidential computing blob that contains the required information.
>
> To support booting encrypted guests on a non-EFI VM, the hypervisor needs to
> pass this additional information to the kernel with a different method.
>
> For this purpose, introduce SETUP_CC_BLOB type in setup_data to hold the
> physical address of the confidential computing blob location. The boot
> loader or hypervisor may choose to use this method instead of EFI
> configuration table. The CC blob location scanning should give preference
> to setup_data over the EFI configuration table.
>
> In AMD SEV-SNP, the CC blob contains the address of the secrets and CPUID
> pages. The secrets page includes information such as a VM-to-PSP
> communication key, and the CPUID page contains PSP-filtered CPUID values.
> Define the AMD SEV confidential computing blob structure.
>
> While at it, define the EFI GUID for the confidential computing blob.
>
> Signed-off-by: Brijesh Singh <[email protected]>
> ---
> arch/x86/include/asm/sev.h | 12 ++++++++++++
> arch/x86/include/uapi/asm/bootparam.h | 1 +
> include/linux/efi.h | 1 +
> 3 files changed, 14 insertions(+)
>
> diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> index f68c9e2c3851..e41bd55dba5d 100644
> --- a/arch/x86/include/asm/sev.h
> +++ b/arch/x86/include/asm/sev.h
> @@ -44,6 +44,18 @@ struct es_em_ctxt {
>
> void do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code);
>
> +/* AMD SEV Confidential computing blob structure */
> +#define CC_BLOB_SEV_HDR_MAGIC 0x45444d41
> +struct cc_blob_sev_info {
> + u32 magic;
> + u16 version;
> + u16 reserved;
> + u64 secrets_phys;
> + u32 secrets_len;
> + u64 cpuid_phys;
> + u32 cpuid_len;
> +};
> +
> static inline u64 lower_bits(u64 val, unsigned int bits)
> {
> u64 mask = (1ULL << bits) - 1;
> diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
> index b25d3f82c2f3..1ac5acca72ce 100644
> --- a/arch/x86/include/uapi/asm/bootparam.h
> +++ b/arch/x86/include/uapi/asm/bootparam.h
> @@ -10,6 +10,7 @@
> #define SETUP_EFI 4
> #define SETUP_APPLE_PROPERTIES 5
> #define SETUP_JAILHOUSE 6
> +#define SETUP_CC_BLOB 7
>
> #define SETUP_INDIRECT (1<<31)
>
> diff --git a/include/linux/efi.h b/include/linux/efi.h
> index 6b5d36babfcc..75aeb2a56888 100644
> --- a/include/linux/efi.h
> +++ b/include/linux/efi.h
> @@ -344,6 +344,7 @@ void efi_native_runtime_setup(void);
> #define EFI_CERT_SHA256_GUID EFI_GUID(0xc1c41626, 0x504c, 0x4092, 0xac, 0xa9, 0x41, 0xf9, 0x36, 0x93, 0x43, 0x28)
> #define EFI_CERT_X509_GUID EFI_GUID(0xa5c059a1, 0x94e4, 0x4aa7, 0x87, 0xb5, 0xab, 0x15, 0x5c, 0x2b, 0xf0, 0x72)
> #define EFI_CERT_X509_SHA256_GUID EFI_GUID(0x3bd2a492, 0x96c0, 0x4079, 0xb4, 0x20, 0xfc, 0xf9, 0x8e, 0xf1, 0x03, 0xed)
> +#define EFI_CC_BLOB_GUID EFI_GUID(0x067b1f5f, 0xcf26, 0x44c5, 0x85, 0x54, 0x93, 0xd7, 0x77, 0x91, 0x2d, 0x42)
>
> /*
> * This GUID is used to pass to the kernel proper the struct screen_info
> --
> 2.17.1
>
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Thu, Aug 19, 2021 at 12:47:59PM +0200, Borislav Petkov wrote:
> On Wed, Jul 07, 2021 at 01:14:54PM -0500, Brijesh Singh wrote:
> > From: Michael Roth <[email protected]>
> >
> > Future patches for SEV-SNP-validated CPUID will also require early
> > parsing of the EFI configuration. Move the related code into a set of
> > helpers that can be re-used for that purpose.
> >
> > Signed-off-by: Michael Roth <[email protected]>
> > Signed-off-by: Brijesh Singh <[email protected]>
> > ---
> > arch/x86/boot/compressed/Makefile | 1 +
> > arch/x86/boot/compressed/acpi.c | 124 +++++---------
> > arch/x86/boot/compressed/efi-config-table.c | 180 ++++++++++++++++++++
> > arch/x86/boot/compressed/misc.h | 50 ++++++
> > 4 files changed, 272 insertions(+), 83 deletions(-)
> > create mode 100644 arch/x86/boot/compressed/efi-config-table.c
>
Hi Boris,
Thanks for reviewing. Just FYI, Brijesh is prepping v5 to post soon, and I
will work to get all your comments addressed as part of that, but there has
also been a change to the CPUID handling in the #VC handlers in case you
wanted to wait for that to land.
> arch/x86/boot/compressed/efi.c
>
> should be good enough.
>
> And in general, this patch is hard to review because it does a bunch of
> things at the same time. You should split it:
>
> > - the first patch should carve out only the functionality into helpers
> without adding or changing the existing functionality.
>
> - later ones should add the new functionality, in single logical steps.
Not sure what you mean here. All the interfaces introduced here are used
by acpi.c. There is another helper added later (efi_bp_find_vendor_table())
in "enable SEV-SNP-validated CPUID in #VC handler", since it's not used
here by acpi.c.
>
> Some preliminary comments below as far as I can:
>
> > diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
> > index 431bf7f846c3..b41aecfda49c 100644
> > --- a/arch/x86/boot/compressed/Makefile
> > +++ b/arch/x86/boot/compressed/Makefile
> > @@ -100,6 +100,7 @@ endif
> > vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o
> >
> > vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o
> > +vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi-config-table.o
> > efi-obj-$(CONFIG_EFI_STUB) = $(objtree)/drivers/firmware/efi/libstub/lib.a
> >
> > $(obj)/vmlinux: $(vmlinux-objs-y) $(efi-obj-y) FORCE
> > diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
> > index 8bcbcee54aa1..e087dcaf43b3 100644
> > --- a/arch/x86/boot/compressed/acpi.c
> > +++ b/arch/x86/boot/compressed/acpi.c
> > @@ -24,42 +24,36 @@ struct mem_vector immovable_mem[MAX_NUMNODES*2];
> > * Search EFI system tables for RSDP. If both ACPI_20_TABLE_GUID and
> > * ACPI_TABLE_GUID are found, take the former, which has more features.
> > */
> > +#ifdef CONFIG_EFI
> > +static bool
> > +rsdp_find_fn(efi_guid_t guid, unsigned long vendor_table, bool efi_64,
> > + void *opaque)
> > +{
> > + acpi_physical_address *rsdp_addr = opaque;
> > +
> > + if (!(efi_guidcmp(guid, ACPI_TABLE_GUID))) {
> > + *rsdp_addr = vendor_table;
> > + } else if (!(efi_guidcmp(guid, ACPI_20_TABLE_GUID))) {
> > + *rsdp_addr = vendor_table;
> > + return false;
>
> No "return false" in the ACPI_TABLE_GUID branch above? Maybe this has to
> do with the preference to ACPI_20_TABLE_GUID.
Right, the current acpi.c keeps searching in case the preferred
ACPI_20_TABLE_GUID is found later.
>
> In any case, this looks silly. Please do the iteration simple
> and stupid without the function pointer and get rid of that
> efi_foreach_conf_entry() thing - this is not firmware.
There is the aforementioned efi_bp_find_vendor_table() that does the
simple iteration, but I wasn't sure how to build the "find one of these,
but this one is preferred" logic into it in a reasonable way.
I could just call it once for each of these GUIDs though. I was hesitant
to do so since it's less efficient than existing code, but if it's worth
it for the simplification then I'm all for it.
So I'll pull efi_bp_find_vendor_table() into this patch, rename to
efi_find_vendor_table(), and drop efi_foreach_conf_entry() in favor
of it.
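The two-call approach being settled on here might look roughly like this (user-space sketch with placeholder types and GUID bytes, not the real efi_guid_t values or the actual config-table walk):

```c
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins; real code compares efi_guid_t entries found
 * via boot_params. The GUID byte values below are placeholders. */
typedef struct { unsigned char b[16]; } guid_t;

struct conf_entry {
	guid_t   guid;
	uint64_t table;   /* physical address of the vendor table */
};

static const guid_t ACPI_TABLE_GUID    = { { 0x10 } };
static const guid_t ACPI_20_TABLE_GUID = { { 0x20 } };

/* Return the vendor table address matching `guid`, or 0 if absent. */
static uint64_t efi_find_vendor_table(const struct conf_entry *tbl, int nr,
				      guid_t guid)
{
	for (int i = 0; i < nr; i++)
		if (!memcmp(&tbl[i].guid, &guid, sizeof(guid)))
			return tbl[i].table;
	return 0;
}

/* The caller expresses the preference: try ACPI 2.0 first, then fall back. */
static uint64_t locate_rsdp(const struct conf_entry *tbl, int nr)
{
	uint64_t rsdp = efi_find_vendor_table(tbl, nr, ACPI_20_TABLE_GUID);

	return rsdp ? rsdp : efi_find_vendor_table(tbl, nr, ACPI_TABLE_GUID);
}
```

This keeps the library helper free of GUID-preference policy at the cost of a possible second pass over the table.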
>
> > diff --git a/arch/x86/boot/compressed/efi-config-table.c b/arch/x86/boot/compressed/efi-config-table.c
> > new file mode 100644
> > index 000000000000..d1a34aa7cefd
> > --- /dev/null
> > +++ b/arch/x86/boot/compressed/efi-config-table.c
>
> ...
>
> > +/*
>
> If you're going to add proper comments, make them kernel-doc. I.e., it
> should start with
>
> /**
>
> and then use
>
> ./scripts/kernel-doc -none arch/x86/boot/compressed/efi-config-table.c
>
> to check that they're all proper.
Nice, thanks for the tip.
>
>
> > + * Given boot_params, retrieve the physical address of EFI system table.
> > + *
> > + * @boot_params: pointer to boot_params
> > + * @sys_table_pa: location to store physical address of system table
> > + * @is_efi_64: location to store whether using 64-bit EFI or not
> > + *
> > + * Returns 0 on success. On error, return params are left unchanged.
> > + */
> > +int
> > +efi_bp_get_system_table(struct boot_params *boot_params,
>
> There's no need for the "_bp_" - just efi_get_system_table(). Ditto for
> the other naming.
There used to also be an efi_get_conf_table() that did the lookup given
a direct pointer to systab rather than going through bootparams, so I
needed some way to differentiate, but looks like I dropped that at some
point, so I'll rename these as suggested.
>
> I'll review the rest properly after you've split it.
>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
On Thu, Aug 19, 2021 at 11:45:35AM +0200, Borislav Petkov wrote:
> Finally drop this bouncing npmccallum at RH email address from the Cc
> list.
>
> On Wed, Jul 07, 2021 at 01:14:52PM -0500, Brijesh Singh wrote:
> > From: Michael Roth <[email protected]>
> >
> > This code will also be used later for SEV-SNP-validated CPUID code in
> > some cases, so move it to a common helper.
> >
> > Signed-off-by: Michael Roth <[email protected]>
> > Signed-off-by: Brijesh Singh <[email protected]>
> > ---
> > arch/x86/kernel/sev-shared.c | 84 +++++++++++++++++++++++++-----------
> > 1 file changed, 58 insertions(+), 26 deletions(-)
> >
> > diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
> > index be4025f14b4f..4884de256a49 100644
> > --- a/arch/x86/kernel/sev-shared.c
> > +++ b/arch/x86/kernel/sev-shared.c
> > @@ -184,6 +184,58 @@ static enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
> > return ret;
> > }
> >
> > +static int sev_es_cpuid_msr_proto(u32 func, u32 subfunc, u32 *eax, u32 *ebx,
>
> Since it is not only SEV-ES, then it should be prefixed with "sev_" like
> we do for the other such functions. I guess simply
>
> sev_cpuid()
>
> ?
That makes sense, but I think it helps in making sense of the security
aspects of the code to know that sev_cpuid() would be fetching cpuid
information from the hypervisor. "msr_proto" is meant to be an indicator
that it will be using the GHCB MSR protocol to do it, but maybe just
"_hyp" is enough to get the idea across? I use the convention elsewhere
in the series as well.
So sev_cpuid_hyp() maybe?
>
> > + u32 *ecx, u32 *edx)
> > +{
> > + u64 val;
> > +
> > + if (eax) {
>
> What's the protection for? Is it ever going to be called with NULL ptrs
> for the regs? That's not the case in this patchset at least...
In "enable SEV-SNP-validated CPUID in #VC handler", it does:
sev_snp_cpuid() -> sev_snp_cpuid_hyp(),
which will call this with NULL e{a,b,c,d}x arguments in some cases. There
are enough call-sites in sev_snp_cpuid() that it seemed worthwhile to
add the guards so we wouldn't need to declare dummy variables for arguments.
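The NULL-guard pattern being described could be sketched like this (a user-space mock: the GHCB MSR protocol round trip is replaced by a stub that fabricates deterministic values, so only the guard logic is real):

```c
#include <stdint.h>
#include <stddef.h>

/* Stand-in for one GHCB MSR protocol request: ask the "hypervisor"
 * for a single CPUID register. Here it just fabricates a value. */
static int ghcb_msr_cpuid(uint32_t func, uint32_t reg_idx, uint64_t *val)
{
	*val = ((uint64_t)func << 8) | reg_idx;   /* fake response */
	return 0;
}

/*
 * Fetch CPUID from the hypervisor, skipping any register the caller
 * passed as NULL, so call sites don't need dummy variables.
 */
static int sev_cpuid_hv(uint32_t func, uint32_t *eax, uint32_t *ebx,
			uint32_t *ecx, uint32_t *edx)
{
	uint32_t *regs[4] = { eax, ebx, ecx, edx };
	uint64_t val;

	for (int i = 0; i < 4; i++) {
		if (!regs[i])
			continue;               /* caller not interested */
		if (ghcb_msr_cpuid(func, i, &val))
			return -1;
		*regs[i] = (uint32_t)val;
	}
	return 0;
}
```

A caller that only needs EAX and ECX can then pass NULL for the other two outputs.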
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
On Thu, Aug 19, 2021 at 10:37:41AM -0500, Michael Roth wrote:
> That makes sense, but I think it helps in making sense of the security
> aspects of the code to know that sev_cpuid() would be fetching cpuid
> information from the hypervisor.
Why is it important for the callers to know where we fetch the CPUID
info from?
> "msr_proto" is meant to be an indicator that it will be using the GHCB
> MSR protocol to do it, but maybe just "_hyp" is enough to get the idea
> across? I use the convention elsewhere in the series as well.
>
> So sev_cpuid_hyp() maybe?
sev_cpuid_hv() pls. We abbreviate the hypervisor as HV usually.
> In "enable SEV-SNP-validated CPUID in #VC handler", it does:
>
> sev_snp_cpuid() -> sev_snp_cpuid_hyp(),
>
> which will call this with NULL e{a,b,c,d}x arguments in some cases. There
> are enough call-sites in sev_snp_cpuid() that it seemed worthwhile to
> add the guards so we wouldn't need to declare dummy variables for arguments.
Yah, saw that in the later patches.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Thu, Aug 19, 2021 at 09:58:31AM -0500, Michael Roth wrote:
> Not sure what you mean here. All the interfaces introduced here are used
> by acpi.c. There is another helper added later (efi_bp_find_vendor_table())
> in "enable SEV-SNP-validated CPUID in #VC handler", since it's not used
> here by acpi.c.
Maybe I got confused by the amount of changes in a single patch. I'll
try harder with your v5. :)
> There is the aforementioned efi_bp_find_vendor_table() that does the
> simple iteration, but I wasn't sure how to build the "find one of these,
> but this one is preferred" logic into it in a reasonable way.
Instead of efi_foreach_conf_entry() you simply do a bog-standard simple
loop and each time you stop at a table, you examine it and overwrite
pointers, if you've found something better.
With "overwrite pointers" I mean you cache the pointers to those conf
tables you iterate over and dig out so that you don't have to do it a
second time. That is, *if* you need them a second time. I believe you
call at least efi_bp_get_conf_table() twice... you get the idea.
> I could just call it once for each of these GUIDs though. I was
> hesitant to do so since it's less efficient than existing code, but if
> it's worth it for the simplification then I'm all for it.
Yeah, this is executed once during boot so I don't think you can make it
more efficient than a single iteration over the config table blobs.
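The single-pass loop with pointer caching described above might look like this (a sketch with placeholder types and GUID bytes, not the real kernel definitions):

```c
#include <stdint.h>
#include <string.h>

/* Placeholder GUIDs; real code compares efi_guid_t values. */
typedef struct { unsigned char b[16]; } guid_t;

struct conf_entry {
	guid_t   guid;
	uint64_t table;
};

static const guid_t ACPI_TABLE_GUID    = { { 0x10 } };
static const guid_t ACPI_20_TABLE_GUID = { { 0x20 } };

/*
 * One pass over the config table: remember an ACPI 1.0 RSDP if we see
 * one, but return immediately on the preferred ACPI 2.0 entry.
 */
static uint64_t find_rsdp_single_pass(const struct conf_entry *tbl, int nr)
{
	uint64_t rsdp = 0;

	for (int i = 0; i < nr; i++) {
		if (!memcmp(&tbl[i].guid, &ACPI_20_TABLE_GUID, sizeof(guid_t)))
			return tbl[i].table;    /* preferred table, done */
		if (!memcmp(&tbl[i].guid, &ACPI_TABLE_GUID, sizeof(guid_t)))
			rsdp = tbl[i].table;    /* fallback, keep looking */
	}
	return rsdp;
}
```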
I hope that makes more sense.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Thu, Aug 19, 2021 at 07:09:42PM +0200, Borislav Petkov wrote:
> On Thu, Aug 19, 2021 at 09:58:31AM -0500, Michael Roth wrote:
> > Not sure what you mean here. All the interfaces introduced here are used
> > by acpi.c. There is another helper added later (efi_bp_find_vendor_table())
> > in "enable SEV-SNP-validated CPUID in #VC handler", since it's not used
> > here by acpi.c.
>
> Maybe I got confused by the amount of changes in a single patch. I'll
> try harder with your v5. :)
>
> > There is the aforementioned efi_bp_find_vendor_table() that does the
> > simple iteration, but I wasn't sure how to build the "find one of these,
> > but this one is preferred" logic into it in a reasonable way.
>
> Instead of efi_foreach_conf_entry() you simply do a bog-standard simple
> loop and each time you stop at a table, you examine it and overwrite
> pointers, if you've found something better.
>
> With "overwrite pointers" I mean you cache the pointers to those conf
> tables you iterate over and dig out so that you don't have to do it a
> second time. That is, *if* you need them a second time. I believe you
> call at least efi_bp_get_conf_table() twice... you get the idea.
Sorry, I'm still a little confused about how to determine "something better",
since it's acpi.c that decides which GUID is preferred, whereas
efi_find_vendor_table() is a library function with no outside knowledge
other than its arguments, so to return the preferred pointer it would need
both/multiple GUIDs passed in as arguments, wouldn't it? (or a callback)
Another alternative is something like what
drivers/firmware/efi/efi.c:common_tables does, where interesting table
GUIDs are each associated with a pointer, and all the pointers can then
be initialized to the corresponding table address in a single pass.
But would need to be careful to re-initialize those pointers when BSS gets
cleared, or declare them in __section(".data"). Is that closer to what
you were thinking?
>
> > I could just call it once for each of these GUIDs though. I was
> > hesitant to do so since it's less efficient than existing code, but if
> > it's worth it for the simplification then I'm all for it.
>
> Yeah, this is executed once during boot so I don't think you can make it
> more efficient than a single iteration over the config table blobs.
In v5, I've simplified things to just call efi_find_vendor_table() once
for ACPI_20_TABLE_GUID, then once for ACPI_TABLE_GUID if that's not
available. So definitely doesn't sound like what you are suggesting here,
but does at least simplify code and gets rid of the efi_foreach* stuff. But
happy to rework things if you had something else in mind.
>
> I hope that makes more sense.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
On Thu, Aug 19, 2021 at 06:46:48PM +0200, Borislav Petkov wrote:
> On Thu, Aug 19, 2021 at 10:37:41AM -0500, Michael Roth wrote:
> > That makes sense, but I think it helps in making sense of the security
> > aspects of the code to know that sev_cpuid() would be fetching cpuid
> > information from the hypervisor.
>
> Why is it important for the callers to know where we fetch the CPUID
> info from?
The select cases where we still fetch CPUID values from hypervisor in
SNP need careful consideration, so for the purposes of auditing the code
for security, or just noticing things in patches, I think it's important
to make it clear what is the "normal" SNP case (not trusting hypervisor
CPUID values) and what are exceptional cases (getting select values from
hypervisor). If something got added in the future, I think something
like:
+sev_cpuid_hv(0x8000001f, ...)
would be more likely to raise eyebrows and get more scrutiny than:
+sev_cpuid(0x8000001f, ...)
where it might get lost in the noise or mistaken as similar to
sev_snp_cpuid().
Maybe a bit contrived, and probably not a big deal in practice, but
conveying the source in the naming does at least seem slightly better
than not doing so.
>
> > "msr_proto" is meant to be an indicator that it will be using the GHCB
> > MSR protocol to do it, but maybe just "_hyp" is enough to get the idea
> > across? I use the convention elsewhere in the series as well.
> >
> > So sev_cpuid_hyp() maybe?
>
> sev_cpuid_hv() pls. We abbreviate the hypervisor as HV usually.
Ah yes, much nicer. I've gone with this for v5 and adopted the
convention in the rest of the code.
>
> > In "enable SEV-SNP-validated CPUID in #VC handler", it does:
> >
> > sev_snp_cpuid() -> sev_snp_cpuid_hyp(),
> >
> > which will call this with NULL e{a,b,c,d}x arguments in some cases. There
> > are enough call-sites in sev_snp_cpuid() that it seemed worthwhile to
> > add the guards so we wouldn't need to declare dummy variables for arguments.
>
> Yah, saw that in the later patches.
>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
On Thu, Aug 19, 2021 at 10:29:08PM -0500, Michael Roth wrote:
> The select cases where we still fetch CPUID values from hypervisor in
> SNP need careful consideration, so for the purposes of auditing the code
> for security, or just noticing things in patches, I think it's important
> to make it clear what is the "normal" SNP case (not trusting hypervisor
> CPUID values) and what are exceptional cases (getting select values from
> hypervisor). If something got added in the future, I think something
> like:
>
> +sev_cpuid_hv(0x8000001f, ...)
>
> would be more likely to raise eyebrows and get more scrutiny than:
>
> +sev_cpuid(0x8000001f, ...)
>
> where it might get lost in the noise or mistaken as similar to
> sev_snp_cpuid().
>
> Maybe a bit contrived, and probably not a big deal in practice, but
> conveying the source in the naming does at least seem slightly better
> than not doing so.
Ok, makes sense.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Thu, Aug 19, 2021 at 06:42:58PM -0500, Michael Roth wrote:
> In v5, I've simplified things to just call efi_find_vendor_table() once
> for ACPI_20_TABLE_GUID, then once for ACPI_TABLE_GUID if that's not
> available. So definitely doesn't sound like what you are suggesting here,
> but does at least simplify code and gets rid of the efi_foreach* stuff. But
> happy to rework things if you had something else in mind.
Ok, thanks. Lemme get to that version and I'll holler if something's
still bothering me.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette