2022-03-08 14:29:18

by Brijesh Singh

Subject: [PATCH v12 00/46] Add AMD Secure Nested Paging (SEV-SNP) Guest Support

This part of the Secure Nested Paging (SEV-SNP) series focuses on the changes
required in a guest OS for SEV-SNP support.

SEV-SNP builds upon existing SEV and SEV-ES functionality while adding
new hardware-based memory protections. SEV-SNP adds strong memory integrity
protection to help prevent malicious hypervisor-based attacks like data
replay, memory re-mapping and more in order to create an isolated memory
encryption environment.

This series provides the basic building blocks to support booting SEV-SNP
VMs; it does not cover all of the security enhancements introduced by SEV-SNP,
such as interrupt protection.

Many of the integrity guarantees of SEV-SNP are enforced through a new
structure called the Reverse Map Table (RMP). Adding a new page to an SEV-SNP
VM requires a 2-step process. First, the hypervisor assigns the page to the
guest using the new RMPUPDATE instruction. This transitions the page to the
guest-invalid state. Second, the guest validates the page using the new
PVALIDATE instruction. SEV-SNP VMs can use the new "Page State Change
Request NAE" event defined in the GHCB specification to ask the hypervisor
to add or remove pages from the RMP table.
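
For reference, the guest-side validation step boils down to a single new
instruction. Below is a sketch of the pvalidate() helper this series adds
(see patch "x86/sev: Add a helper for the PVALIDATE instruction");
CC_SET()/CC_OUT() come from <asm/asm.h>, and raw opcode bytes are used
because the mnemonic requires binutils 2.36 or newer:

/* RMP entry was already in the requested state, nothing was changed. */
#define PVALIDATE_FAIL_NOUPDATE		255

static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
{
	bool no_rmpupdate;
	int rc;

	/* PVALIDATE: rAX = vaddr, rCX = page size, rDX = validate flag */
	asm volatile(".byte 0xF2, 0x0F, 0x01, 0xFF\n\t"
		     CC_SET(c)
		     : CC_OUT(c) (no_rmpupdate), "=a" (rc)
		     : "a" (vaddr), "c" (rmp_psize), "d" (validate)
		     : "memory", "cc");

	/* The carry flag indicates the RMP entry was not updated. */
	if (no_rmpupdate)
		return PVALIDATE_FAIL_NOUPDATE;

	return rc;
}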

Each page assigned to an SEV-SNP VM is either validated or unvalidated, as
indicated by the Validated flag in the page's RMP entry. There are two
possible approaches to page validation: pre-validation and lazy validation.

Under pre-validation, pages are validated prior to first use. Under lazy
validation, pages are validated when first accessed: an access to an
unvalidated page results in a #VC exception, at which time the exception
handler may validate the page. Lazy validation requires careful tracking of
the validated pages to avoid validating the same GPA more than once. The
recently introduced "Unaccepted" memory type can be used to communicate the
unvalidated memory ranges to the guest OS.
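
Purely as an illustration of the lazy scheme (which, again, this series does
not implement), the #VC path would look roughly like the sketch below. The
range_is_unaccepted()/mark_range_accepted() helpers are hypothetical
placeholders for the tracking structure mentioned above:

/* Hypothetical lazy-validation path, invoked from the #VC handler. */
static bool vc_handle_unvalidated_page(unsigned long vaddr)
{
	/* Not a page we are lazily validating - some other #VC cause. */
	if (!range_is_unaccepted(vaddr))
		return false;

	/* Validate the page, terminating on failure like the real code does. */
	if (pvalidate(vaddr & PAGE_MASK, RMP_PG_SIZE_4K, true))
		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);

	/* Record the GPA so it is never validated a second time. */
	mark_range_accepted(vaddr);
	return true;
}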

At this time, only pre-validation is supported: the OVMF guest BIOS validates
the entire RAM before control is handed over to the guest kernel. The
early_set_memory_{encrypted,decrypted}() and set_memory_{encrypted,decrypted}()
helpers are enlightened to perform the page validation or invalidation while
setting or clearing the encryption attribute in the page tables.
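
The ordering of the RMP update and the validation matters; a minimal sketch
of the rule these helpers follow (using the function names from patch 18,
as illustration only):

/* shared -> private: change the RMP state first, then validate. */
early_set_pages_state(paddr, npages, SNP_PAGE_STATE_PRIVATE);
pvalidate_pages(vaddr, npages, true);

/* private -> shared: invalidate first, then change the RMP state. */
pvalidate_pages(vaddr, npages, false);
early_set_pages_state(paddr, npages, SNP_PAGE_STATE_SHARED);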

This series does not yet provide support for interrupt security; that will
be added on top of this base support.

The complete branch is at https://github.com/AMDESE/linux/tree/sev-snp-v12

Patches 1-4 define multiple VMSA save areas to support SEV, SEV-ES and
SEV-SNP guests. They are a prerequisite for the SEV-SNP guest support and are
included in the series for completeness. These patches are queued here:
https://git.kernel.org/pub/scm/virt/kvm/kvm.git, branch svm-for-snp.

Additional resources
---------------------
SEV-SNP whitepaper
https://www.amd.com/system/files/TechDocs/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf

APM 2: https://www.amd.com/system/files/TechDocs/24593.pdf
(section 15.36)

GHCB spec:
https://developer.amd.com/wp-content/resources/56421.pdf

SEV-SNP firmware specification:
https://developer.amd.com/sev/

v11: https://lore.kernel.org/all/[email protected]/
v10: https://lore.kernel.org/linux-mm/[email protected]/T/
v9: https://lore.kernel.org/linux-mm/[email protected]
v8: https://lore.kernel.org/lkml/[email protected]/
v7: https://lore.kernel.org/linux-mm/[email protected]/
v6: https://lore.kernel.org/linux-mm/[email protected]/
v5: https://lore.kernel.org/lkml/[email protected]/

Changes since v11:
* Simplify the memory allocation for the VMSA to address Dave Hansen's feedback.
* Drop the unneeded memset for the request and response buffers in the sevguest command handling.
* Make fw_err required for the SNP guest request API.
* Simplify the error code checking for the SNP_GET_EXT_REPORT command handling to address Boris's feedback.
* Rename the command-line option from "sev_debug" to "sev=debug" for dumping the CPUID values.

Changes since v10:
* Rebase patches to x86/cc.
* Integrate the SNP page state change functions into the x86_platform.guest_{prepare,finish} hooks.

Changes since v9:
* Removed unnecessary checks on CPUID table contents, added kernel param to dump CPUID table during boot
* Added boot_{rd,wr}msr() helpers
* Renamed/refactored SNP CPUID code/definitions for clarity/consistency
* Re-worked comments for clarity and avoid redundancies
* Moved SNP CPUID table documentation to Documentation/virt/coco/sevguest.rst
* Documented cc_blob_address/acpi_rsdp_addr in zero-page.rst

Changes since v8:
* Setup the GHCB before taking the first #VC.
* Make the CC blob structure size invariant.
* Define the AP INIT macros and update the AP creation to use those macros
instead of the hardcoded values.
* Expand the comments to address some of the previous feedback.
* Fix the commit messages based on the feedback.
* Multiple fixes/cleanup on cpuid patches (based on Boris and Dave feedback)
* drop is_efi64 return arguments in favor of a separate efi_get_type() helper.
* drop is_efi64 input arguments in favor of calling efi_get_type() as-needed.
* move acpi.c's kexec-specific handling into library code.
* fix stack protection for 32/64-bit builds.
* Export add_identity_map() to avoid SEV-specific code in ident_map_64.c.
* use snp_abort() when terminating via initial ccblob scan.
* fix the copyright header after the code refactor.
* remove code duplication wherever possible.

Changes since v7:
* sevguest: extend the get report structure to accept the vmpl from userspace.
* In the compressed path, move the GHCB protocol negotiation from VC handler
to sev_enable().
* sev_enable(): don't expect SEV bit in status MSR when cpuid bit is present, update comments.
* sme_enable(): call directly from head_64.S rather than as part of startup_64_setup_env, add comments
* snp_find_cc_blob(), sev_prep_identity_maps(): add missing 'static' keywords to function prototypes

Changes since v6:
* Add an rmpadjust() helper to be used by the AP creation and VMPL0 detection functions.
* Clear the VM communication key if the guest detects that the hypervisor is
modifying the SNP_GUEST_REQ response header.
* Move the per-cpu GHCB registration from first #VC to idt setup.
* Consolidate initial SEV/SME setup into a common entry point that gets called
early enough to also be used for SEV-SNP CPUID table setup.
* SNP CPUID: separate initial SEV-SNP feature detection out into standalone
snp_init() routines, then add CPUID table setup to it as a separate patch.
* SNP CPUID: fix boot issue with Seabios due to ACPI relying on certain EFI
config table lookup failures as fallthrough cases rather than error cases.
* SNP CPUID: drop the use of a separate init routines to handle pointer fixups
after switching to kernel virtual addresses, instead use a helper that uses
RIP-relative addressing to access CPUID table when either on identity mapping
or kernel virtual addresses.

Changes since v5:
* move the seqno allocation in the sevguest driver.
* extend snp_issue_guest_request() to accept the exit_info to simplify the logic.
* use smaller structure names based on feedback.
* explicitly clear the memory after the SNP guest request is completed.
* cpuid validation: use a local copy of cpuid table instead of keeping
firmware table mapped throughout boot.
* cpuid validation: coding style fix-ups and refactor cpuid-related helpers
as suggested.
* cpuid validation: drop a number of BOOT_COMPRESSED-guarded defs/declarations
by moving things like snp_cpuid_init*() out of sev-shared.c and keeping only
the common bits there.
* Break up EFI config table helpers and related acpi.c changes into separate
patches.
* re-enable stack protection for 32-bit kernels as well, not just 64-bit

Changes since v4:
* Address the CPUID-specific review comments
* Simplify the macro based on the review feedback
* Move the macro definition to the patch that needs it
* Fix the issues reported by checkpatch
* Address the AP-creation-specific review comments

Changes since v3:
* Add support to use the PSP-filtered CPUID.
* Add support for the extended guest request.
* Move the sevguest driver to drivers/virt/coco.
* Add documentation for the sevguest ioctl.
* Add support to check for VMPL0.
* Pass the VM encryption key and ID to be used for encrypting guest messages
through the platform driver data.
* Multiple cleanups and fixes to address the review feedback.

Changes since v2:
* Add support for AP startup using the SNP-specific VMGEXIT.
* Add the snp_prep_memory() helper.
* Drop the sev_snp_active() helper.
* Add the sev_feature_enabled() helper to check which SEV feature is active.
* Sync the SNP guest message request header with the latest SNP FW spec.
* Multiple cleanups and fixes to address the review feedback.

Changes since v1:
* Integrate the SNP support into sev.{c,h}.
* Add support to query the hypervisor features and detect whether SNP is supported.
* Define Linux-specific reason codes for the SNP guest termination.
* Extend the setup_header to provide a way for the hypervisor to pass the secrets and CPUID pages.
* Add support to create a platform device and driver to query the attestation report
and derive a key.
* Multiple cleanups and fixes to address Boris's review feedback.

Brijesh Singh (20):
KVM: SVM: Define sev_features and vmpl field in the VMSA
x86/mm: Extend cc_attr to include AMD SEV-SNP
x86/sev: Define the Linux specific guest termination reasons
x86/sev: Save the negotiated GHCB version
x86/sev: Check SEV-SNP features support
x86/sev: Add a helper for the PVALIDATE instruction
x86/sev: Check the vmpl level
x86/compressed: Add helper for validating pages in the decompression
stage
x86/compressed: Register GHCB memory when SEV-SNP is active
x86/sev: Register GHCB memory when SEV-SNP is active
x86/sev: Add helper for validating pages in early enc attribute
changes
x86/kernel: Make the .bss..decrypted section shared in RMP table
x86/kernel: Validate ROM memory before accessing when SEV-SNP is
active
x86/mm: Validate memory when changing the C-bit
x86/boot: Add Confidential Computing type to setup_data
x86/sev: Provide support for SNP guest request NAEs
x86/sev: Register SEV-SNP guest request platform device
virt: Add SEV-SNP guest driver
virt: sevguest: Add support to derive key
virt: sevguest: Add support to get extended report

Michael Roth (22):
x86/boot: Introduce helpers for MSR reads/writes
x86/boot: Use MSR read/write helpers instead of inline assembly
x86/compressed/64: Detect/setup SEV/SME features earlier in boot
x86/sev: Detect/setup SEV/SME features earlier in boot
x86/head/64: Re-enable stack protection
x86/compressed/acpi: Move EFI detection to helper
x86/compressed/acpi: Move EFI system table lookup to helper
x86/compressed/acpi: Move EFI config table lookup to helper
x86/compressed/acpi: Move EFI vendor table lookup to helper
x86/compressed/acpi: Move EFI kexec handling into common code
KVM: x86: Move lookup of indexed CPUID leafs to helper
x86/sev: Move MSR-based VMGEXITs for CPUID to helper
x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers
x86/boot: Add a pointer to Confidential Computing blob in bootparams
x86/compressed: Add SEV-SNP feature detection/setup
x86/compressed: Use firmware-validated CPUID leaves for SEV-SNP guests
x86/compressed: Export and rename add_identity_map()
x86/compressed/64: Add identity mapping for Confidential Computing
blob
x86/sev: Add SEV-SNP feature detection/setup
x86/sev: Use firmware-validated CPUID for SEV-SNP guests
x86/sev: add sev=debug cmdline option to dump SNP CPUID table
virt: sevguest: Add documentation for SEV-SNP CPUID Enforcement

Tom Lendacky (4):
KVM: SVM: Create a separate mapping for the SEV-ES save area
KVM: SVM: Create a separate mapping for the GHCB save area
KVM: SVM: Update the SEV-ES save area mapping
x86/sev: Use SEV-SNP AP creation to start secondary CPUs

.../admin-guide/kernel-parameters.txt | 2 +
Documentation/virt/coco/sevguest.rst | 155 ++++
Documentation/virt/index.rst | 1 +
Documentation/x86/x86_64/boot-options.rst | 14 +
Documentation/x86/zero-page.rst | 2 +
arch/x86/boot/compressed/Makefile | 1 +
arch/x86/boot/compressed/acpi.c | 173 +---
arch/x86/boot/compressed/efi.c | 238 +++++
arch/x86/boot/compressed/head_64.S | 37 +-
arch/x86/boot/compressed/ident_map_64.c | 39 +-
arch/x86/boot/compressed/idt_64.c | 18 +-
arch/x86/boot/compressed/mem_encrypt.S | 36 -
arch/x86/boot/compressed/misc.h | 55 +-
arch/x86/boot/compressed/sev.c | 263 +++++-
arch/x86/boot/cpucheck.c | 30 +-
arch/x86/boot/msr.h | 28 +
arch/x86/coco/core.c | 3 +
arch/x86/include/asm/bootparam_utils.h | 1 +
arch/x86/include/asm/cpuid.h | 34 +
arch/x86/include/asm/msr-index.h | 2 +
arch/x86/include/asm/msr.h | 11 +-
arch/x86/include/asm/setup.h | 1 -
arch/x86/include/asm/sev-common.h | 82 ++
arch/x86/include/asm/sev.h | 102 ++-
arch/x86/include/asm/shared/msr.h | 15 +
arch/x86/include/asm/svm.h | 171 +++-
arch/x86/include/uapi/asm/bootparam.h | 4 +-
arch/x86/include/uapi/asm/svm.h | 13 +
arch/x86/kernel/Makefile | 2 -
arch/x86/kernel/cpu/common.c | 4 +
arch/x86/kernel/head64.c | 29 +-
arch/x86/kernel/head_64.S | 37 +-
arch/x86/kernel/probe_roms.c | 13 +-
arch/x86/kernel/sev-shared.c | 529 ++++++++++-
arch/x86/kernel/sev.c | 834 +++++++++++++++++-
arch/x86/kernel/smpboot.c | 3 +
arch/x86/kvm/cpuid.c | 19 +-
arch/x86/kvm/svm/sev.c | 24 +-
arch/x86/kvm/svm/svm.c | 4 +-
arch/x86/kvm/svm/svm.h | 2 +-
arch/x86/mm/mem_encrypt.c | 4 +
arch/x86/mm/mem_encrypt_amd.c | 71 +-
arch/x86/mm/mem_encrypt_identity.c | 8 +
drivers/virt/Kconfig | 3 +
drivers/virt/Makefile | 1 +
drivers/virt/coco/sevguest/Kconfig | 14 +
drivers/virt/coco/sevguest/Makefile | 2 +
drivers/virt/coco/sevguest/sevguest.c | 740 ++++++++++++++++
drivers/virt/coco/sevguest/sevguest.h | 98 ++
include/linux/cc_platform.h | 8 +
include/linux/efi.h | 1 +
include/uapi/linux/sev-guest.h | 80 ++
52 files changed, 3688 insertions(+), 373 deletions(-)
create mode 100644 Documentation/virt/coco/sevguest.rst
create mode 100644 arch/x86/boot/compressed/efi.c
create mode 100644 arch/x86/boot/msr.h
create mode 100644 arch/x86/include/asm/cpuid.h
create mode 100644 arch/x86/include/asm/shared/msr.h
create mode 100644 drivers/virt/coco/sevguest/Kconfig
create mode 100644 drivers/virt/coco/sevguest/Makefile
create mode 100644 drivers/virt/coco/sevguest/sevguest.c
create mode 100644 drivers/virt/coco/sevguest/sevguest.h
create mode 100644 include/uapi/linux/sev-guest.h

--
2.25.1


2022-03-08 16:38:50

by Brijesh Singh

Subject: [PATCH v12 18/46] x86/sev: Add helper for validating pages in early enc attribute changes

The early_set_memory_{encrypted,decrypted}() functions are used for changing
a page from decrypted (shared) to encrypted (private) and vice versa.
When SEV-SNP is active, the page state transition needs to go through
additional steps.

If a page is transitioned from shared to private, then perform the
following after the encryption attribute is set in the page table:

1. Issue the page state change VMGEXIT to add the page as private
in the RMP table.
2. Validate the page after it has been successfully added in the RMP table.

To maintain the security guarantees, if a page is transitioned from
private to shared, then perform the following before clearing the
encryption attribute from the page table:

1. Invalidate the page.
2. Issue the page state change VMGEXIT to make the page shared in the
RMP table.

The early_set_memory_{encrypted,decrypted}() functions can be called before
the GHCB is set up, so use the SNP page state MSR protocol VMGEXIT defined
in the GHCB specification to request the page state change in the RMP
table.
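
For reference, the MSR-protocol PSC request packs the operation and the guest
frame number into the GHCB MSR. The sketch below is based on the GHCB
specification and mirrors the macros this series adds to
arch/x86/include/asm/sev-common.h:

#define GHCB_MSR_PSC_REQ	0x014
#define GHCB_MSR_PSC_REQ_GFN(gfn, op)			\
	/* GHCBData[55:52] - page operation */		\
	((((u64)(op) & 0xf) << 52) |			\
	/* GHCBData[51:12] - guest frame number */	\
	(((u64)(gfn) & GENMASK_ULL(39, 0)) << 12) |	\
	GHCB_MSR_PSC_REQ)

#define GHCB_MSR_PSC_RESP	0x015
/* GHCBData[63:32] holds the error code, 0 on success. */
#define GHCB_MSR_PSC_RESP_VAL(v)	((v) >> 32)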

While at it, add a snp_prep_memory() helper, which will be used in
probe_roms() in a later patch.

Reviewed-by: Venu Busireddy <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/sev.h | 10 ++++
arch/x86/kernel/sev.c | 99 +++++++++++++++++++++++++++++++++++
arch/x86/mm/mem_encrypt_amd.c | 58 ++++++++++++++++++--
3 files changed, 163 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 48df02713ee0..f65d257e3d4a 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -123,6 +123,11 @@ static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
return rc;
}
void setup_ghcb(void);
+void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
+ unsigned int npages);
+void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+ unsigned int npages);
+void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
#else
static inline void sev_es_ist_enter(struct pt_regs *regs) { }
static inline void sev_es_ist_exit(void) { }
@@ -132,6 +137,11 @@ static inline int sev_es_efi_map_ghcbs(pgd_t *pgd) { return 0; }
static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) { return 0; }
static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs) { return 0; }
static inline void setup_ghcb(void) { }
+static inline void __init
+early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
+static inline void __init
+early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
+static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
#endif

#endif
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index cc382c4f89ef..1e8dc71e7ba6 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -556,6 +556,105 @@ static u64 get_jump_table_addr(void)
return ret;
}

+static void pvalidate_pages(unsigned long vaddr, unsigned int npages, bool validate)
+{
+ unsigned long vaddr_end;
+ int rc;
+
+ vaddr = vaddr & PAGE_MASK;
+ vaddr_end = vaddr + (npages << PAGE_SHIFT);
+
+ while (vaddr < vaddr_end) {
+ rc = pvalidate(vaddr, RMP_PG_SIZE_4K, validate);
+ if (WARN(rc, "Failed to validate address 0x%lx ret %d", vaddr, rc))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);
+
+ vaddr = vaddr + PAGE_SIZE;
+ }
+}
+
+static void __init early_set_pages_state(unsigned long paddr, unsigned int npages, enum psc_op op)
+{
+ unsigned long paddr_end;
+ u64 val;
+
+ paddr = paddr & PAGE_MASK;
+ paddr_end = paddr + (npages << PAGE_SHIFT);
+
+ while (paddr < paddr_end) {
+ /*
+ * Use the MSR protocol because this function can be called before
+ * the GHCB is established.
+ */
+ sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op));
+ VMGEXIT();
+
+ val = sev_es_rd_ghcb_msr();
+
+ if (WARN(GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP,
+ "Wrong PSC response code: 0x%x\n",
+ (unsigned int)GHCB_RESP_CODE(val)))
+ goto e_term;
+
+ if (WARN(GHCB_MSR_PSC_RESP_VAL(val),
+ "Failed to change page state to '%s' paddr 0x%lx error 0x%llx\n",
+ op == SNP_PAGE_STATE_PRIVATE ? "private" : "shared",
+ paddr, GHCB_MSR_PSC_RESP_VAL(val)))
+ goto e_term;
+
+ paddr = paddr + PAGE_SIZE;
+ }
+
+ return;
+
+e_term:
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
+}
+
+void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
+ unsigned int npages)
+{
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return;
+
+ /*
+ * Ask the hypervisor to mark the memory pages as private in the RMP
+ * table.
+ */
+ early_set_pages_state(paddr, npages, SNP_PAGE_STATE_PRIVATE);
+
+ /* Validate the memory pages after they've been added in the RMP table. */
+ pvalidate_pages(vaddr, npages, true);
+}
+
+void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+ unsigned int npages)
+{
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return;
+
+ /* Invalidate the memory pages before they are marked shared in the RMP table. */
+ pvalidate_pages(vaddr, npages, false);
+
+ /* Ask hypervisor to mark the memory pages shared in the RMP table. */
+ early_set_pages_state(paddr, npages, SNP_PAGE_STATE_SHARED);
+}
+
+void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
+{
+ unsigned long vaddr, npages;
+
+ vaddr = (unsigned long)__va(paddr);
+ npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+
+ if (op == SNP_PAGE_STATE_PRIVATE)
+ early_snp_set_memory_private(vaddr, paddr, npages);
+ else if (op == SNP_PAGE_STATE_SHARED)
+ early_snp_set_memory_shared(vaddr, paddr, npages);
+ else
+ WARN(1, "invalid memory op %d\n", op);
+}
+
int sev_es_setup_ap_jump_table(struct real_mode_header *rmh)
{
u16 startup_cs, startup_ip;
diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
index 6169053c2854..8539dd6f24ff 100644
--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -31,6 +31,7 @@
#include <asm/processor-flags.h>
#include <asm/msr.h>
#include <asm/cmdline.h>
+#include <asm/sev.h>

#include "mm_internal.h"

@@ -47,6 +48,36 @@ EXPORT_SYMBOL(sme_me_mask);
/* Buffer used for early in-place encryption by BSP, no locking needed */
static char sme_early_buffer[PAGE_SIZE] __initdata __aligned(PAGE_SIZE);

+/*
+ * SNP-specific routine which needs to additionally change the page state from
+ * private to shared before copying the data from the source to destination and
+ * restore after the copy.
+ */
+static inline void __init snp_memcpy(void *dst, void *src, size_t sz,
+ unsigned long paddr, bool decrypt)
+{
+ unsigned long npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+
+ if (decrypt) {
+ /*
+ * @paddr needs to be accessed decrypted, mark the page shared in
+ * the RMP table before copying it.
+ */
+ early_snp_set_memory_shared((unsigned long)__va(paddr), paddr, npages);
+
+ memcpy(dst, src, sz);
+
+ /* Restore the page state after the memcpy. */
+ early_snp_set_memory_private((unsigned long)__va(paddr), paddr, npages);
+ } else {
+ /*
+ * @paddr needs to be accessed encrypted, no need for the page state
+ * change.
+ */
+ memcpy(dst, src, sz);
+ }
+}
+
/*
* This routine does not change the underlying encryption setting of the
* page(s) that map this memory. It assumes that eventually the memory is
@@ -95,8 +126,13 @@ static void __init __sme_early_enc_dec(resource_size_t paddr,
* Use a temporary buffer, of cache-line multiple size, to
* avoid data corruption as documented in the APM.
*/
- memcpy(sme_early_buffer, src, len);
- memcpy(dst, sme_early_buffer, len);
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
+ snp_memcpy(sme_early_buffer, src, len, paddr, enc);
+ snp_memcpy(dst, sme_early_buffer, len, paddr, !enc);
+ } else {
+ memcpy(sme_early_buffer, src, len);
+ memcpy(dst, sme_early_buffer, len);
+ }

early_memunmap(dst, len);
early_memunmap(src, len);
@@ -322,14 +358,28 @@ static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
clflush_cache_range(__va(pa), size);

/* Encrypt/decrypt the contents in-place */
- if (enc)
+ if (enc) {
sme_early_encrypt(pa, size);
- else
+ } else {
sme_early_decrypt(pa, size);

+ /*
+ * On SNP, the page state change in the RMP table must happen
+ * before the page table updates.
+ */
+ early_snp_set_memory_shared((unsigned long)__va(pa), pa, 1);
+ }
+
/* Change the page encryption mask. */
new_pte = pfn_pte(pfn, new_prot);
set_pte_atomic(kpte, new_pte);
+
+ /*
+ * If the page is set encrypted in the page table, then update the RMP table to
+ * add this page as private.
+ */
+ if (enc)
+ early_snp_set_memory_private((unsigned long)__va(pa), pa, 1);
}

static int __init early_set_memory_enc_dec(unsigned long vaddr,
--
2.25.1

2022-03-08 16:40:39

by Brijesh Singh

Subject: [PATCH v12 46/46] virt: sevguest: Add documentation for SEV-SNP CPUID Enforcement

From: Michael Roth <[email protected]>

Update the documentation with details regarding SEV-SNP CPUID Enforcement
and the assurances it provides to guests.

Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
Documentation/virt/coco/sevguest.rst | 29 ++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)

diff --git a/Documentation/virt/coco/sevguest.rst b/Documentation/virt/coco/sevguest.rst
index 0f352056572d..48d66e10305b 100644
--- a/Documentation/virt/coco/sevguest.rst
+++ b/Documentation/virt/coco/sevguest.rst
@@ -118,6 +118,35 @@ be updated with the expected value.

See GHCB specification for further detail on how to parse the certificate blob.

+3. SEV-SNP CPUID Enforcement
+============================
+
+SEV-SNP guests can access a special page that contains a table of CPUID values
+that have been validated by the PSP as part of the SNP_LAUNCH_UPDATE firmware
+command. It provides the following assurances regarding the validity of CPUID
+values:
+
+ - Its address is obtained via bootloader/firmware (via CC blob), and those
+ binaries will be measured as part of the SEV-SNP attestation report.
+ - Its initial state will be encrypted/pvalidated, so attempts to modify
+ it during run-time will result in garbage being written, or #VC exceptions
+ being generated due to changes in validation state if the hypervisor tries
+ to swap the backing page.
+ - Attempts to bypass PSP checks by the hypervisor by using a normal page, or
+ a non-CPUID encrypted page will change the measurement provided by the
+ SEV-SNP attestation report.
+ - The CPUID page contents are *not* measured, but attempts to modify the
+ expected contents of a CPUID page as part of guest initialization will be
+ gated by the PSP CPUID enforcement policy checks performed on the page
+ during SNP_LAUNCH_UPDATE, and noticeable later if the guest owner
+ implements their own checks of the CPUID values.
+
+It is important to note that this last assurance is only useful if the kernel
+has taken care to make use of the SEV-SNP CPUID throughout all stages of boot.
+Otherwise, guest owner attestation provides no assurance that the kernel wasn't
+fed incorrect values at some point during boot.
+
+
Reference
---------

--
2.25.1

2022-03-08 16:55:52

by Brijesh Singh

Subject: [PATCH v12 06/46] x86/boot: Use MSR read/write helpers instead of inline assembly

From: Michael Roth <[email protected]>

Update all C code to use the new boot_rdmsr()/boot_wrmsr() helpers
instead of relying on inline assembly.
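
For reference, the helpers introduced by the preceding patch are roughly as
below (a sketch of arch/x86/boot/msr.h as added by this series; struct msr
is the {l, h}/q union from <asm/shared/msr.h>):

#include <asm/shared/msr.h>

static inline void boot_rdmsr(unsigned int reg, struct msr *m)
{
	asm volatile("rdmsr" : "=a" (m->l), "=d" (m->h) : "c" (reg));
}

static inline void boot_wrmsr(unsigned int reg, const struct msr *m)
{
	asm volatile("wrmsr" : : "c" (reg), "a" (m->l), "d" (m->h) : "memory");
}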

Suggested-by: Borislav Petkov <[email protected]>
Signed-off-by: Michael Roth <[email protected]>
---
arch/x86/boot/compressed/sev.c | 17 +++++++----------
arch/x86/boot/cpucheck.c | 30 +++++++++++++++---------------
2 files changed, 22 insertions(+), 25 deletions(-)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 28bcf04c022e..4e82101b7d13 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -22,6 +22,7 @@
#include <asm/svm.h>

#include "error.h"
+#include "../msr.h"

struct ghcb boot_ghcb_page __aligned(PAGE_SIZE);
struct ghcb *boot_ghcb;
@@ -56,23 +57,19 @@ static unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)

static inline u64 sev_es_rd_ghcb_msr(void)
{
- unsigned long low, high;
+ struct msr m;

- asm volatile("rdmsr" : "=a" (low), "=d" (high) :
- "c" (MSR_AMD64_SEV_ES_GHCB));
+ boot_rdmsr(MSR_AMD64_SEV_ES_GHCB, &m);

- return ((high << 32) | low);
+ return m.q;
}

static inline void sev_es_wr_ghcb_msr(u64 val)
{
- u32 low, high;
+ struct msr m;

- low = val & 0xffffffffUL;
- high = val >> 32;
-
- asm volatile("wrmsr" : : "c" (MSR_AMD64_SEV_ES_GHCB),
- "a"(low), "d" (high) : "memory");
+ m.q = val;
+ boot_wrmsr(MSR_AMD64_SEV_ES_GHCB, &m);
}

static enum es_result vc_decode_insn(struct es_em_ctxt *ctxt)
diff --git a/arch/x86/boot/cpucheck.c b/arch/x86/boot/cpucheck.c
index e1478d32de1a..fed8d13ce252 100644
--- a/arch/x86/boot/cpucheck.c
+++ b/arch/x86/boot/cpucheck.c
@@ -27,6 +27,7 @@
#include <asm/required-features.h>
#include <asm/msr-index.h>
#include "string.h"
+#include "msr.h"

static u32 err_flags[NCAPINTS];

@@ -130,12 +131,11 @@ int check_cpu(int *cpu_level_ptr, int *req_level_ptr, u32 **err_flags_ptr)
/* If this is an AMD and we're only missing SSE+SSE2, try to
turn them on */

- u32 ecx = MSR_K7_HWCR;
- u32 eax, edx;
+ struct msr m;

- asm("rdmsr" : "=a" (eax), "=d" (edx) : "c" (ecx));
- eax &= ~(1 << 15);
- asm("wrmsr" : : "a" (eax), "d" (edx), "c" (ecx));
+ boot_rdmsr(MSR_K7_HWCR, &m);
+ m.l &= ~(1 << 15);
+ boot_wrmsr(MSR_K7_HWCR, &m);

get_cpuflags(); /* Make sure it really did something */
err = check_cpuflags();
@@ -145,28 +145,28 @@ int check_cpu(int *cpu_level_ptr, int *req_level_ptr, u32 **err_flags_ptr)
/* If this is a VIA C3, we might have to enable CX8
explicitly */

- u32 ecx = MSR_VIA_FCR;
- u32 eax, edx;
+ struct msr m;

- asm("rdmsr" : "=a" (eax), "=d" (edx) : "c" (ecx));
- eax |= (1<<1)|(1<<7);
- asm("wrmsr" : : "a" (eax), "d" (edx), "c" (ecx));
+ boot_rdmsr(MSR_VIA_FCR, &m);
+ m.l |= (1 << 1) | (1 << 7);
+ boot_wrmsr(MSR_VIA_FCR, &m);

set_bit(X86_FEATURE_CX8, cpu.flags);
err = check_cpuflags();
} else if (err == 0x01 && is_transmeta()) {
/* Transmeta might have masked feature bits in word 0 */

- u32 ecx = 0x80860004;
- u32 eax, edx;
+ struct msr m, m_tmp;
u32 level = 1;

- asm("rdmsr" : "=a" (eax), "=d" (edx) : "c" (ecx));
- asm("wrmsr" : : "a" (~0), "d" (edx), "c" (ecx));
+ boot_rdmsr(0x80860004, &m);
+ m_tmp = m;
+ m_tmp.l = ~0;
+ boot_wrmsr(0x80860004, &m_tmp);
asm("cpuid"
: "+a" (level), "=d" (cpu.flags[0])
: : "ecx", "ebx");
- asm("wrmsr" : : "a" (eax), "d" (edx), "c" (ecx));
+ boot_wrmsr(0x80860004, &m);

err = check_cpuflags();
} else if (err == 0x01 &&
--
2.25.1

2022-03-08 17:22:09

by Brijesh Singh

Subject: [PATCH v12 08/46] x86/sev: Detect/setup SEV/SME features earlier in boot

From: Michael Roth <[email protected]>

sme_enable() handles feature detection for both SEV and SME. Future
patches will also use it for SEV-SNP feature detection/setup, which
will need to be done immediately after the first #VC handler is set up.
Move it now in preparation.

Reviewed-by: Venu Busireddy <[email protected]>
Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/kernel/head64.c | 3 ---
arch/x86/kernel/head_64.S | 13 +++++++++++++
2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 4f5ecbbaae77..cbc285ddc4ac 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -192,9 +192,6 @@ unsigned long __head __startup_64(unsigned long physaddr,
if (load_delta & ~PMD_PAGE_MASK)
for (;;);

- /* Activate Secure Memory Encryption (SME) if supported and enabled */
- sme_enable(bp);
-
/* Include the SME encryption mask in the fixup value */
load_delta += sme_get_me_mask();

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 9c63fc5988cd..9c2c3aff5ee4 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -69,6 +69,19 @@ SYM_CODE_START_NOALIGN(startup_64)
call startup_64_setup_env
popq %rsi

+#ifdef CONFIG_AMD_MEM_ENCRYPT
+ /*
+ * Activate SEV/SME memory encryption if supported/enabled. This needs to
+ * be done now, since this also includes setup of the SEV-SNP CPUID table,
+ * which needs to be done before any CPUID instructions are executed in
+ * subsequent code.
+ */
+ movq %rsi, %rdi
+ pushq %rsi
+ call sme_enable
+ popq %rsi
+#endif
+
/* Now switch to __KERNEL_CS so IRET works reliably */
pushq $__KERNEL_CS
leaq .Lon_kernel_cs(%rip), %rax
--
2.25.1

2022-03-08 23:08:53

by Brijesh Singh

Subject: [PATCH v12 38/46] x86/sev: Add SEV-SNP feature detection/setup

From: Michael Roth <[email protected]>

Initial/preliminary detection of SEV-SNP is done via the Confidential
Computing blob. Check for it prior to the normal SEV/SME feature
initialization, and add some sanity checks to confirm it agrees with
SEV-SNP CPUID/MSR bits.

Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/boot/compressed/sev.c | 27 -------------
arch/x86/include/asm/sev.h | 2 +
arch/x86/kernel/sev-shared.c | 27 +++++++++++++
arch/x86/kernel/sev.c | 64 ++++++++++++++++++++++++++++++
arch/x86/mm/mem_encrypt_identity.c | 8 ++++
5 files changed, 101 insertions(+), 27 deletions(-)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 2a48f3a3f372..2911137bf37f 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -352,33 +352,6 @@ static struct cc_blob_sev_info *find_cc_blob_efi(struct boot_params *bp)
EFI_CC_BLOB_GUID);
}

-struct cc_setup_data {
- struct setup_data header;
- u32 cc_blob_address;
-};
-
-/*
- * Search for a Confidential Computing blob passed in as a setup_data entry
- * via the Linux Boot Protocol.
- */
-static struct cc_blob_sev_info *find_cc_blob_setup_data(struct boot_params *bp)
-{
- struct cc_setup_data *sd = NULL;
- struct setup_data *hdr;
-
- hdr = (struct setup_data *)bp->hdr.setup_data;
-
- while (hdr) {
- if (hdr->type == SETUP_CC_BLOB) {
- sd = (struct cc_setup_data *)hdr;
- return (struct cc_blob_sev_info *)(unsigned long)sd->cc_blob_address;
- }
- hdr = (struct setup_data *)hdr->next;
- }
-
- return NULL;
-}
-
/*
* Initial set up of SNP relies on information provided by the
* Confidential Computing blob, which can be passed to the boot kernel
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 4e3909042001..219abb4590f2 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -153,6 +153,7 @@ void snp_set_memory_shared(unsigned long vaddr, unsigned int npages);
void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
void snp_set_wakeup_secondary_cpu(void);
bool snp_init(struct boot_params *bp);
+void snp_abort(void);
#else
static inline void sev_es_ist_enter(struct pt_regs *regs) { }
static inline void sev_es_ist_exit(void) { }
@@ -171,6 +172,7 @@ static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npage
static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npages) { }
static inline void snp_set_wakeup_secondary_cpu(void) { }
static inline bool snp_init(struct boot_params *bp) { return false; }
+static inline void snp_abort(void) { }
#endif

#endif
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 0f1375164ff0..a7a1c0fb298e 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -937,3 +937,30 @@ static enum es_result vc_handle_rdtsc(struct ghcb *ghcb,

return ES_OK;
}
+
+struct cc_setup_data {
+ struct setup_data header;
+ u32 cc_blob_address;
+};
+
+/*
+ * Search for a Confidential Computing blob passed in as a setup_data entry
+ * via the Linux Boot Protocol.
+ */
+static struct cc_blob_sev_info *find_cc_blob_setup_data(struct boot_params *bp)
+{
+ struct cc_setup_data *sd = NULL;
+ struct setup_data *hdr;
+
+ hdr = (struct setup_data *)bp->hdr.setup_data;
+
+ while (hdr) {
+ if (hdr->type == SETUP_CC_BLOB) {
+ sd = (struct cc_setup_data *)hdr;
+ return (struct cc_blob_sev_info *)(unsigned long)sd->cc_blob_address;
+ }
+ hdr = (struct setup_data *)hdr->next;
+ }
+
+ return NULL;
+}
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 97c0aa44ce2d..5a7df617394b 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1977,3 +1977,67 @@ bool __init handle_vc_boot_ghcb(struct pt_regs *regs)
while (true)
halt();
}
+
+/*
+ * Initial set up of SNP relies on information provided by the
+ * Confidential Computing blob, which can be passed to the kernel
+ * in the following ways, depending on how it is booted:
+ *
+ * - when booted via the boot/decompress kernel:
+ * - via boot_params
+ *
+ * - when booted directly by firmware/bootloader (e.g. CONFIG_PVH):
+ * - via a setup_data entry, as defined by the Linux Boot Protocol
+ *
+ * Scan for the blob in that order.
+ */
+static __init struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp)
+{
+ struct cc_blob_sev_info *cc_info;
+
+ /* Boot kernel would have passed the CC blob via boot_params. */
+ if (bp->cc_blob_address) {
+ cc_info = (struct cc_blob_sev_info *)(unsigned long)bp->cc_blob_address;
+ goto found_cc_info;
+ }
+
+ /*
+ * If kernel was booted directly, without the use of the
+ * boot/decompression kernel, the CC blob may have been passed via
+ * setup_data instead.
+ */
+ cc_info = find_cc_blob_setup_data(bp);
+ if (!cc_info)
+ return NULL;
+
+found_cc_info:
+ if (cc_info->magic != CC_BLOB_SEV_HDR_MAGIC)
+ snp_abort();
+
+ return cc_info;
+}
+
+bool __init snp_init(struct boot_params *bp)
+{
+ struct cc_blob_sev_info *cc_info;
+
+ if (!bp)
+ return false;
+
+ cc_info = find_cc_blob(bp);
+ if (!cc_info)
+ return false;
+
+ /*
+ * The CC blob will be used later to access the secrets page. Cache
+ * it here like the boot kernel does.
+ */
+ bp->cc_blob_address = (u32)(unsigned long)cc_info;
+
+ return true;
+}
+
+void __init snp_abort(void)
+{
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+}
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index b43bc24d2bb6..f415498d3175 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -45,6 +45,7 @@
#include <asm/sections.h>
#include <asm/cmdline.h>
#include <asm/coco.h>
+#include <asm/sev.h>

#include "mm_internal.h"

@@ -509,8 +510,11 @@ void __init sme_enable(struct boot_params *bp)
bool active_by_default;
unsigned long me_mask;
char buffer[16];
+ bool snp;
u64 msr;

+ snp = snp_init(bp);
+
/* Check for the SME/SEV support leaf */
eax = 0x80000000;
ecx = 0;
@@ -542,6 +546,10 @@ void __init sme_enable(struct boot_params *bp)
sev_status = __rdmsr(MSR_AMD64_SEV);
feature_mask = (sev_status & MSR_AMD64_SEV_ENABLED) ? AMD_SEV_BIT : AMD_SME_BIT;

+ /* The SEV-SNP CC blob should never be present unless SEV-SNP is enabled. */
+ if (snp && !(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
+ snp_abort();
+
/* Check if memory encryption is enabled */
if (feature_mask == AMD_SME_BIT) {
/*
--
2.25.1

2022-03-08 23:13:36

by Brijesh Singh

Subject: [PATCH v12 29/46] x86/boot: Add Confidential Computing type to setup_data

While launching encrypted guests, the hypervisor may need to provide
some additional information during the guest boot. When booting under an
EFI-based BIOS, the EFI configuration table contains an entry for the
confidential computing blob that contains the required information.

To support booting encrypted guests on non-EFI VMs, the hypervisor needs to
pass this additional information to the kernel using a different method.

For this purpose, introduce the SETUP_CC_BLOB type in setup_data to hold the
physical address of the confidential computing blob location. The boot
loader or hypervisor may choose to use this method instead of the EFI
configuration table. The CC blob location scanning should give preference
to setup_data over the EFI configuration table.

In AMD SEV-SNP, the CC blob contains the addresses of the secrets and CPUID
pages. The secrets page includes information such as a VM-to-PSP
communication key, and the CPUID page contains the PSP-filtered CPUID values.
Define the AMD SEV confidential computing blob structure.

While at it, define the EFI GUID for the confidential computing blob.
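
As an illustration of how a (hypothetical) non-EFI boot loader could use the
new type, the entry is simply chained into the boot_params setup_data list;
add_cc_blob_entry() here is a made-up name, not part of this series:

struct cc_setup_data {
	struct setup_data header;
	u32 cc_blob_address;
};

/* Hypothetical boot-loader side: publish the CC blob via setup_data. */
static void add_cc_blob_entry(struct boot_params *bp, u32 cc_blob_paddr)
{
	static struct cc_setup_data csd;

	csd.header.type = SETUP_CC_BLOB;
	csd.header.len = sizeof(csd.cc_blob_address);
	csd.header.next = bp->hdr.setup_data;
	csd.cc_blob_address = cc_blob_paddr;

	bp->hdr.setup_data = (unsigned long)&csd;
}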

Acked-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/sev.h | 18 ++++++++++++++++++
arch/x86/include/uapi/asm/bootparam.h | 1 +
include/linux/efi.h | 1 +
3 files changed, 20 insertions(+)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index a3203b2caaca..1a7e21bb6eea 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -42,6 +42,24 @@ struct es_em_ctxt {
struct es_fault_info fi;
};

+/*
+ * AMD SEV Confidential computing blob structure. The structure is
+ * defined in OVMF UEFI firmware header:
+ * https://github.com/tianocore/edk2/blob/master/OvmfPkg/Include/Guid/ConfidentialComputingSevSnpBlob.h
+ */
+#define CC_BLOB_SEV_HDR_MAGIC 0x45444d41
+struct cc_blob_sev_info {
+ u32 magic;
+ u16 version;
+ u16 reserved;
+ u64 secrets_phys;
+ u32 secrets_len;
+ u32 rsvd1;
+ u64 cpuid_phys;
+ u32 cpuid_len;
+ u32 rsvd2;
+};
+
void do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code);

static inline u64 lower_bits(u64 val, unsigned int bits)
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index b25d3f82c2f3..1ac5acca72ce 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -10,6 +10,7 @@
#define SETUP_EFI 4
#define SETUP_APPLE_PROPERTIES 5
#define SETUP_JAILHOUSE 6
+#define SETUP_CC_BLOB 7

#define SETUP_INDIRECT (1<<31)

diff --git a/include/linux/efi.h b/include/linux/efi.h
index ccd4d3f91c98..984aa688997a 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -390,6 +390,7 @@ void efi_native_runtime_setup(void);
#define EFI_CERT_SHA256_GUID EFI_GUID(0xc1c41626, 0x504c, 0x4092, 0xac, 0xa9, 0x41, 0xf9, 0x36, 0x93, 0x43, 0x28)
#define EFI_CERT_X509_GUID EFI_GUID(0xa5c059a1, 0x94e4, 0x4aa7, 0x87, 0xb5, 0xab, 0x15, 0x5c, 0x2b, 0xf0, 0x72)
#define EFI_CERT_X509_SHA256_GUID EFI_GUID(0x3bd2a492, 0x96c0, 0x4079, 0xb4, 0x20, 0xfc, 0xf9, 0x8e, 0xf1, 0x03, 0xed)
+#define EFI_CC_BLOB_GUID EFI_GUID(0x067b1f5f, 0xcf26, 0x44c5, 0x85, 0x54, 0x93, 0xd7, 0x77, 0x91, 0x2d, 0x42)

/*
* This GUID is used to pass to the kernel proper the struct screen_info
--
2.25.1

2022-03-08 23:13:52

by Brijesh Singh

Subject: [PATCH v12 10/46] x86/sev: Define the Linux specific guest termination reasons

The GHCB specification defines the reason codes for reason set 0. The reason
codes defined in set 0 do not cover all possible causes for a guest
to request termination.

Reason sets 1 to 255 are reserved for vendor-specific codes.
Reserve reason set 1 for the Linux guest. Define the error codes for
reason set 1 so that one can have meaningful termination reasons and thus
better guest failure diagnosis.

While at it, change sev_es_terminate() to accept the reason set as a
parameter.

Reviewed-by: Venu Busireddy <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/boot/compressed/sev.c | 6 +++---
arch/x86/include/asm/sev-common.h | 8 ++++++++
arch/x86/kernel/sev-shared.c | 11 ++++-------
arch/x86/kernel/sev.c | 4 ++--
4 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 27ccd5a5ff60..56e941d5e092 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -119,7 +119,7 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
static bool early_setup_sev_es(void)
{
if (!sev_es_negotiate_protocol())
- sev_es_terminate(GHCB_SEV_ES_PROT_UNSUPPORTED);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_PROT_UNSUPPORTED);

if (set_page_decrypted((unsigned long)&boot_ghcb_page))
return false;
@@ -172,7 +172,7 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
enum es_result result;

if (!boot_ghcb && !early_setup_sev_es())
- sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);

vc_ghcb_invalidate(boot_ghcb);
result = vc_init_em_ctxt(&ctxt, regs, exit_code);
@@ -199,7 +199,7 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
if (result == ES_OK)
vc_finish_insn(&ctxt);
else if (result != ES_RETRY)
- sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
}

void sev_enable(struct boot_params *bp)
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 1b2fd32b42fe..94f0ea574049 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -73,9 +73,17 @@
/* GHCBData[23:16] */ \
((((u64)reason_val) & 0xff) << 16))

+/* Error codes from reason set 0 */
+#define SEV_TERM_SET_GEN 0
#define GHCB_SEV_ES_GEN_REQ 0
#define GHCB_SEV_ES_PROT_UNSUPPORTED 1

+/* Linux-specific reason codes (used with reason set 1) */
+#define SEV_TERM_SET_LINUX 1
+#define GHCB_TERM_REGISTER 0 /* GHCB GPA registration failure */
+#define GHCB_TERM_PSC 1 /* Page State Change failure */
+#define GHCB_TERM_PVALIDATE 2 /* Pvalidate failure */
+
#define GHCB_RESP_CODE(v) ((v) & GHCB_MSR_INFO_MASK)

/*
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index ce987688bbc0..2abf8a7d75e5 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -24,15 +24,12 @@ static bool __init sev_es_check_cpu_features(void)
return true;
}

-static void __noreturn sev_es_terminate(unsigned int reason)
+static void __noreturn sev_es_terminate(unsigned int set, unsigned int reason)
{
u64 val = GHCB_MSR_TERM_REQ;

- /*
- * Tell the hypervisor what went wrong - only reason-set 0 is
- * currently supported.
- */
- val |= GHCB_SEV_TERM_REASON(0, reason);
+ /* Tell the hypervisor what went wrong. */
+ val |= GHCB_SEV_TERM_REASON(set, reason);

/* Request Guest Termination from Hypervisor */
sev_es_wr_ghcb_msr(val);
@@ -221,7 +218,7 @@ void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)

fail:
/* Terminate the guest */
- sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
}

static enum es_result vc_insn_string_read(struct es_em_ctxt *ctxt,
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index e6d316a01fdd..19ad09712902 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1337,7 +1337,7 @@ DEFINE_IDTENTRY_VC_KERNEL(exc_vmm_communication)
show_regs(regs);

/* Ask hypervisor to sev_es_terminate */
- sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);

/* If that fails and we get here - just panic */
panic("Returned from Terminate-Request to Hypervisor\n");
@@ -1385,7 +1385,7 @@ bool __init handle_vc_boot_ghcb(struct pt_regs *regs)

/* Do initial setup or terminate the guest */
if (unlikely(boot_ghcb == NULL && !sev_es_setup_ghcb()))
- sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);

vc_ghcb_invalidate(boot_ghcb);

--
2.25.1

2022-03-08 23:14:54

by Brijesh Singh

Subject: [PATCH v12 21/46] x86/mm: Validate memory when changing the C-bit

Add the needed functionality to change the page state from shared
to private and vice versa, using the Page State Change VMGEXIT as
documented in the GHCB spec.

Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/sev-common.h | 22 ++++
arch/x86/include/asm/sev.h | 4 +
arch/x86/include/uapi/asm/svm.h | 2 +
arch/x86/kernel/sev.c | 168 ++++++++++++++++++++++++++++++
arch/x86/mm/mem_encrypt_amd.c | 13 +++
5 files changed, 209 insertions(+)

diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index f077a6c95e67..1aa72b5c2490 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -105,6 +105,28 @@ enum psc_op {

#define GHCB_HV_FT_SNP BIT_ULL(0)

+/* SNP Page State Change NAE event */
+#define VMGEXIT_PSC_MAX_ENTRY 253
+
+struct psc_hdr {
+ u16 cur_entry;
+ u16 end_entry;
+ u32 reserved;
+} __packed;
+
+struct psc_entry {
+ u64 cur_page : 12,
+ gfn : 40,
+ operation : 4,
+ pagesize : 1,
+ reserved : 7;
+} __packed;
+
+struct snp_psc_desc {
+ struct psc_hdr hdr;
+ struct psc_entry entries[VMGEXIT_PSC_MAX_ENTRY];
+} __packed;
+
#define GHCB_MSR_TERM_REQ 0x100
#define GHCB_MSR_TERM_REASON_SET_POS 12
#define GHCB_MSR_TERM_REASON_SET_MASK 0xf
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index f65d257e3d4a..feeb93e6ec97 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -128,6 +128,8 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
unsigned int npages);
void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
+void snp_set_memory_shared(unsigned long vaddr, unsigned int npages);
+void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
#else
static inline void sev_es_ist_enter(struct pt_regs *regs) { }
static inline void sev_es_ist_exit(void) { }
@@ -142,6 +144,8 @@ early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned
static inline void __init
early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
+static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npages) { }
+static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npages) { }
#endif

#endif
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index b0ad00f4c1e1..0dcdb6e0c913 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -108,6 +108,7 @@
#define SVM_VMGEXIT_AP_JUMP_TABLE 0x80000005
#define SVM_VMGEXIT_SET_AP_JUMP_TABLE 0
#define SVM_VMGEXIT_GET_AP_JUMP_TABLE 1
+#define SVM_VMGEXIT_PSC 0x80000010
#define SVM_VMGEXIT_HV_FEATURES 0x8000fffd
#define SVM_VMGEXIT_UNSUPPORTED_EVENT 0x8000ffff

@@ -219,6 +220,7 @@
{ SVM_VMGEXIT_NMI_COMPLETE, "vmgexit_nmi_complete" }, \
{ SVM_VMGEXIT_AP_HLT_LOOP, "vmgexit_ap_hlt_loop" }, \
{ SVM_VMGEXIT_AP_JUMP_TABLE, "vmgexit_ap_jump_table" }, \
+ { SVM_VMGEXIT_PSC, "vmgexit_page_state_change" }, \
{ SVM_VMGEXIT_HV_FEATURES, "vmgexit_hypervisor_feature" }, \
{ SVM_EXIT_ERR, "invalid_guest_state" }

diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 1e8dc71e7ba6..4315be1602d1 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -655,6 +655,174 @@ void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op
WARN(1, "invalid memory op %d\n", op);
}

+static int vmgexit_psc(struct snp_psc_desc *desc)
+{
+ int cur_entry, end_entry, ret = 0;
+ struct snp_psc_desc *data;
+ struct ghcb_state state;
+ struct es_em_ctxt ctxt;
+ unsigned long flags;
+ struct ghcb *ghcb;
+
+ /*
+ * __sev_get_ghcb() needs to run with IRQs disabled because it is using
+ * a per-CPU GHCB.
+ */
+ local_irq_save(flags);
+
+ ghcb = __sev_get_ghcb(&state);
+ if (!ghcb) {
+ ret = 1;
+ goto out_unlock;
+ }
+
+ /* Copy the input desc into GHCB shared buffer */
+ data = (struct snp_psc_desc *)ghcb->shared_buffer;
+ memcpy(ghcb->shared_buffer, desc, min_t(int, GHCB_SHARED_BUF_SIZE, sizeof(*desc)));
+
+ /*
+ * As per the GHCB specification, the hypervisor can resume the guest
+ * before processing all the entries. Check whether all the entries
+ * are processed. If not, then keep retrying. Note, the hypervisor
+ * will update the data memory directly to indicate the status, so
+ * reference the data->hdr everywhere.
+ *
+ * The strategy here is to wait for the hypervisor to change the page
+ * state in the RMP table before guest accesses the memory pages. If the
+ * page state change was not successful, then later memory access will
+ * result in a crash.
+ */
+ cur_entry = data->hdr.cur_entry;
+ end_entry = data->hdr.end_entry;
+
+ while (data->hdr.cur_entry <= data->hdr.end_entry) {
+ ghcb_set_sw_scratch(ghcb, (u64)__pa(data));
+
+ /* The hypervisor advances data->hdr.cur_entry as entries are processed. */
+ ret = sev_es_ghcb_hv_call(ghcb, true, &ctxt, SVM_VMGEXIT_PSC, 0, 0);
+
+ /*
+ * Page State Change VMGEXIT can pass error code through
+ * exit_info_2.
+ */
+ if (WARN(ret || ghcb->save.sw_exit_info_2,
+ "SNP: PSC failed ret=%d exit_info_2=%llx\n",
+ ret, ghcb->save.sw_exit_info_2)) {
+ ret = 1;
+ goto out;
+ }
+
+ /* Verify that reserved bit is not set */
+ if (WARN(data->hdr.reserved, "Reserved bit is set in the PSC header\n")) {
+ ret = 1;
+ goto out;
+ }
+
+ /*
+ * Sanity check that entry processing is not going backwards.
+ * This will happen only if the hypervisor is tricking us.
+ */
+ if (WARN(data->hdr.end_entry > end_entry || cur_entry > data->hdr.cur_entry,
+"SNP: PSC processing going backward, end_entry %d (got %d) cur_entry %d (got %d)\n",
+ end_entry, data->hdr.end_entry, cur_entry, data->hdr.cur_entry)) {
+ ret = 1;
+ goto out;
+ }
+ }
+
+out:
+ __sev_put_ghcb(&state);
+
+out_unlock:
+ local_irq_restore(flags);
+
+ return ret;
+}
+
+static void __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
+ unsigned long vaddr_end, int op)
+{
+ struct psc_hdr *hdr;
+ struct psc_entry *e;
+ unsigned long pfn;
+ int i;
+
+ hdr = &data->hdr;
+ e = data->entries;
+
+ memset(data, 0, sizeof(*data));
+ i = 0;
+
+ while (vaddr < vaddr_end) {
+ if (is_vmalloc_addr((void *)vaddr))
+ pfn = vmalloc_to_pfn((void *)vaddr);
+ else
+ pfn = __pa(vaddr) >> PAGE_SHIFT;
+
+ e->gfn = pfn;
+ e->operation = op;
+ hdr->end_entry = i;
+
+ /*
+ * Current SNP implementation doesn't keep track of the RMP page
+ * size so use 4K for simplicity.
+ */
+ e->pagesize = RMP_PG_SIZE_4K;
+
+ vaddr = vaddr + PAGE_SIZE;
+ e++;
+ i++;
+ }
+
+ if (vmgexit_psc(data))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
+}
+
+static void set_pages_state(unsigned long vaddr, unsigned int npages, int op)
+{
+ unsigned long vaddr_end, next_vaddr;
+ struct snp_psc_desc *desc;
+
+ desc = kmalloc(sizeof(*desc), GFP_KERNEL_ACCOUNT);
+ if (!desc)
+ panic("SNP: failed to allocate memory for PSC descriptor\n");
+
+ vaddr = vaddr & PAGE_MASK;
+ vaddr_end = vaddr + (npages << PAGE_SHIFT);
+
+ while (vaddr < vaddr_end) {
+ /* Calculate the last vaddr that fits in one struct snp_psc_desc. */
+ next_vaddr = min_t(unsigned long, vaddr_end,
+ (VMGEXIT_PSC_MAX_ENTRY * PAGE_SIZE) + vaddr);
+
+ __set_pages_state(desc, vaddr, next_vaddr, op);
+
+ vaddr = next_vaddr;
+ }
+
+ kfree(desc);
+}
+
+void snp_set_memory_shared(unsigned long vaddr, unsigned int npages)
+{
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return;
+
+ pvalidate_pages(vaddr, npages, false);
+
+ set_pages_state(vaddr, npages, SNP_PAGE_STATE_SHARED);
+}
+
+void snp_set_memory_private(unsigned long vaddr, unsigned int npages)
+{
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return;
+
+ set_pages_state(vaddr, npages, SNP_PAGE_STATE_PRIVATE);
+
+ pvalidate_pages(vaddr, npages, true);
+}
+
int sev_es_setup_ap_jump_table(struct real_mode_header *rmh)
{
u16 startup_cs, startup_ip;
diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
index 8539dd6f24ff..d3c88d9ef8d6 100644
--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -316,11 +316,24 @@ static void enc_dec_hypercall(unsigned long vaddr, int npages, bool enc)

static void amd_enc_status_change_prepare(unsigned long vaddr, int npages, bool enc)
{
+ /*
+ * To maintain the security guarantees of SEV-SNP guests, make sure
+ * to invalidate the memory before encryption attribute is cleared.
+ */
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && !enc)
+ snp_set_memory_shared(vaddr, npages);
}

/* Return true unconditionally: return value doesn't matter for the SEV side */
static bool amd_enc_status_change_finish(unsigned long vaddr, int npages, bool enc)
{
+ /*
+ * After memory is mapped encrypted in the page table, validate it
+ * so that it is consistent with the page table updates.
+ */
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && enc)
+ snp_set_memory_private(vaddr, npages);
+
if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
enc_dec_hypercall(vaddr, npages, enc);

--
2.25.1

2022-03-08 23:21:55

by Brijesh Singh

Subject: [PATCH v12 36/46] x86/compressed: Export and rename add_identity_map()

From: Michael Roth <[email protected]>

SEV-specific code will need to add some additional mappings, but doing
this within ident_map_64.c requires some SEV-specific helpers to be
exported and some SEV-specific struct definitions to be pulled into
ident_map_64.c. Instead, export add_identity_map() so SEV-specific (and
other subsystem-specific) code can be better contained outside of
ident_map_64.c.

While at it, rename the function to kernel_add_identity_map(), similar
to the kernel_ident_mapping_init() function it relies upon.

No functional changes.

Suggested-by: Borislav Petkov <[email protected]>
Signed-off-by: Michael Roth <[email protected]>
---
arch/x86/boot/compressed/ident_map_64.c | 18 +++++++++---------
arch/x86/boot/compressed/misc.h | 1 +
2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
index 3d566964b829..7975680f521f 100644
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -90,7 +90,7 @@ static struct x86_mapping_info mapping_info;
/*
* Adds the specified range to the identity mappings.
*/
-static void add_identity_map(unsigned long start, unsigned long end)
+void kernel_add_identity_map(unsigned long start, unsigned long end)
{
int ret;

@@ -157,11 +157,11 @@ void initialize_identity_maps(void *rmode)
* explicitly here in case the compressed kernel does not touch them,
* or does not touch all the pages covering them.
*/
- add_identity_map((unsigned long)_head, (unsigned long)_end);
+ kernel_add_identity_map((unsigned long)_head, (unsigned long)_end);
boot_params = rmode;
- add_identity_map((unsigned long)boot_params, (unsigned long)(boot_params + 1));
+ kernel_add_identity_map((unsigned long)boot_params, (unsigned long)(boot_params + 1));
cmdline = get_cmd_line_ptr();
- add_identity_map(cmdline, cmdline + COMMAND_LINE_SIZE);
+ kernel_add_identity_map(cmdline, cmdline + COMMAND_LINE_SIZE);

/* Load the new page-table. */
sev_verify_cbit(top_level_pgt);
@@ -246,10 +246,10 @@ static int set_clr_page_flags(struct x86_mapping_info *info,
* It should already exist, but keep things generic.
*
* To map the page just read from it and fault it in if there is no
- * mapping yet. add_identity_map() can't be called here because that
- * would unconditionally map the address on PMD level, destroying any
- * PTE-level mappings that might already exist. Use assembly here so
- * the access won't be optimized away.
+ * mapping yet. kernel_add_identity_map() can't be called here because
+ * that would unconditionally map the address on PMD level, destroying
+ * any PTE-level mappings that might already exist. Use assembly here
+ * so the access won't be optimized away.
*/
asm volatile("mov %[address], %%r9"
:: [address] "g" (*(unsigned long *)address)
@@ -363,5 +363,5 @@ void do_boot_page_fault(struct pt_regs *regs, unsigned long error_code)
* Error code is sane - now identity map the 2M region around
* the faulting address.
*/
- add_identity_map(address, end);
+ kernel_add_identity_map(address, end);
}
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index ba538af37e90..aae2722c6e9a 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -156,6 +156,7 @@ static inline int count_immovable_mem_regions(void) { return 0; }
#ifdef CONFIG_X86_5LEVEL
extern unsigned int __pgtable_l5_enabled, pgdir_shift, ptrs_per_p4d;
#endif
+extern void kernel_add_identity_map(unsigned long start, unsigned long end);

/* Used by PAGE_KERN* macros: */
extern pteval_t __default_kernel_pte_mask;
--
2.25.1
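
As an example of the out-of-ident_map_64.c usage this export enables, a
later patch in this series maps the Confidential Computing blob from
SEV-specific code. A hedged sketch of such a call site (the function name
is illustrative; cc_blob_address is added to boot_params elsewhere in the
series):

	/* arch/x86/boot/compressed/sev.c -- illustrative sketch */
	void sev_prep_identity_maps(void)
	{
		/* Map the CC blob before it is dereferenced. */
		unsigned long cc_info_pa = boot_params->cc_blob_address;

		kernel_add_identity_map(cc_info_pa,
					cc_info_pa + sizeof(struct cc_blob_sev_info));
	}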

2022-03-08 23:36:14

by Brijesh Singh

[permalink] [raw]
Subject: [PATCH v12 03/46] KVM: SVM: Create a separate mapping for the GHCB save area

From: Tom Lendacky <[email protected]>

The initial implementation of the GHCB spec was based on trying to keep
the register state offsets the same relative to the VM save area. However,
the SEV-ES save area has since changed within the hardware, so its layout
no longer lines up with the GHCB save area.

This is the second step in defining the multiple save areas to keep them
separate and ensuring proper operation amongst the different types of
guests. Create a GHCB save area that matches the GHCB specification.

Reviewed-by: Venu Busireddy <[email protected]>
Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/svm.h | 48 +++++++++++++++++++++++++++++++++++---
1 file changed, 45 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index e748aa33c355..eae5c7ab9c6c 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -390,11 +390,51 @@ struct sev_es_save_area {
u64 x87_state_gpa;
} __packed;

+struct ghcb_save_area {
+ u8 reserved_1[203];
+ u8 cpl;
+ u8 reserved_2[116];
+ u64 xss;
+ u8 reserved_3[24];
+ u64 dr7;
+ u8 reserved_4[16];
+ u64 rip;
+ u8 reserved_5[88];
+ u64 rsp;
+ u8 reserved_6[24];
+ u64 rax;
+ u8 reserved_7[264];
+ u64 rcx;
+ u64 rdx;
+ u64 rbx;
+ u8 reserved_8[8];
+ u64 rbp;
+ u64 rsi;
+ u64 rdi;
+ u64 r8;
+ u64 r9;
+ u64 r10;
+ u64 r11;
+ u64 r12;
+ u64 r13;
+ u64 r14;
+ u64 r15;
+ u8 reserved_9[16];
+ u64 sw_exit_code;
+ u64 sw_exit_info_1;
+ u64 sw_exit_info_2;
+ u64 sw_scratch;
+ u8 reserved_10[56];
+ u64 xcr0;
+ u8 valid_bitmap[16];
+ u64 x87_state_gpa;
+} __packed;
+
#define GHCB_SHARED_BUF_SIZE 2032

struct ghcb {
- struct sev_es_save_area save;
- u8 reserved_save[2048 - sizeof(struct sev_es_save_area)];
+ struct ghcb_save_area save;
+ u8 reserved_save[2048 - sizeof(struct ghcb_save_area)];

u8 shared_buffer[GHCB_SHARED_BUF_SIZE];

@@ -405,6 +445,7 @@ struct ghcb {


#define EXPECTED_VMCB_SAVE_AREA_SIZE 740
+#define EXPECTED_GHCB_SAVE_AREA_SIZE 1032
#define EXPECTED_SEV_ES_SAVE_AREA_SIZE 1032
#define EXPECTED_VMCB_CONTROL_AREA_SIZE 1024
#define EXPECTED_GHCB_SIZE PAGE_SIZE
@@ -412,6 +453,7 @@ struct ghcb {
static inline void __unused_size_checks(void)
{
BUILD_BUG_ON(sizeof(struct vmcb_save_area) != EXPECTED_VMCB_SAVE_AREA_SIZE);
+ BUILD_BUG_ON(sizeof(struct ghcb_save_area) != EXPECTED_GHCB_SAVE_AREA_SIZE);
BUILD_BUG_ON(sizeof(struct sev_es_save_area) != EXPECTED_SEV_ES_SAVE_AREA_SIZE);
BUILD_BUG_ON(sizeof(struct vmcb_control_area) != EXPECTED_VMCB_CONTROL_AREA_SIZE);
BUILD_BUG_ON(sizeof(struct ghcb) != EXPECTED_GHCB_SIZE);
@@ -482,7 +524,7 @@ struct vmcb {
/* GHCB Accessor functions */

#define GHCB_BITMAP_IDX(field) \
- (offsetof(struct sev_es_save_area, field) / sizeof(u64))
+ (offsetof(struct ghcb_save_area, field) / sizeof(u64))

#define DEFINE_GHCB_ACCESSORS(field) \
static inline bool ghcb_##field##_is_valid(const struct ghcb *ghcb) \
--
2.25.1
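
For reference, DEFINE_GHCB_ACCESSORS (elided from the hunk above) expands
per field into a valid-bit test and a setter, which is why GHCB_BITMAP_IDX
must now index into struct ghcb_save_area rather than the (larger) SEV-ES
save area. Roughly, for rax:

	static inline bool ghcb_rax_is_valid(const struct ghcb *ghcb)
	{
		return test_bit(GHCB_BITMAP_IDX(rax),
				(unsigned long *)&ghcb->save.valid_bitmap);
	}

	static inline void ghcb_set_rax(struct ghcb *ghcb, u64 value)
	{
		__set_bit(GHCB_BITMAP_IDX(rax),
			  (unsigned long *)&ghcb->save.valid_bitmap);
		ghcb->save.rax = value;
	}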

2022-03-08 23:49:55

by Brijesh Singh

[permalink] [raw]
Subject: [PATCH v12 23/46] x86/head/64: Re-enable stack protection

From: Michael Roth <[email protected]>

Due to commit 103a4908ad4d ("x86/head/64: Disable stack protection for
head$(BITS).o"), kernel/head{32,64}.c are compiled with
-fno-stack-protector to allow a call to set_bringup_idt_handler(), which
would otherwise have stack protection enabled with
CONFIG_STACKPROTECTOR_STRONG.

While sufficient for that case, there may still be issues with calls to
any external functions that were compiled with stack protection enabled
that in turn make stack-protected calls, or if the exception handlers
set up by set_bringup_idt_handler() make calls to stack-protected
functions.

Subsequent patches for SEV-SNP CPUID validation support will introduce
both such cases. Attempting to disable stack protection for everything
in scope to address that is prohibitive since much of the code, like
the SEV-ES #VC handler, is shared code that remains in use after boot and
could benefit from having stack protection enabled. Attempting to inline
calls is brittle and can quickly balloon out to library/helper code
where that's not really an option.

Instead, re-enable stack protection for head32.c/head64.c, and make the
appropriate changes to ensure the segment used for the stack canary is
initialized in advance of any stack-protected C calls.

For head64.c:

- The BSP will enter from startup_64() and call into C code
(startup_64_setup_env()) shortly after setting up the stack, which
may result in calls to stack-protected code. Set up %gs early to allow
for this safely.
- APs will enter from secondary_startup_64*(), and %gs will be set up
soon after. There is one call to C code prior to %gs being set up
(__startup_secondary_64()), but it is only to fetch 'sme_me_mask'
global, so just load 'sme_me_mask' directly instead, and remove the
now-unused __startup_secondary_64() function.

For head32.c:

- BSPs/APs will set %fs to __BOOT_DS prior to any C calls. In recent
kernels, the compiler is configured to access the stack canary at
%fs:__stack_chk_guard [1], which overlaps with the initial per-cpu
'__stack_chk_guard' variable in the initial/"master" .data..percpu
area. This is sufficient to allow access to the canary for use
during initial startup, so no changes are needed there.

[1] commit 3fb0fdb3bbe7 ("x86/stackprotector/32: Make the canary into a regular percpu variable")

Suggested-by: Joerg Roedel <[email protected]> #for 64-bit %gs set up
Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/setup.h | 1 -
arch/x86/kernel/Makefile | 2 --
arch/x86/kernel/head64.c | 9 ---------
arch/x86/kernel/head_64.S | 24 +++++++++++++++++++++---
4 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index a12458a7a8d4..72ede9159951 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -49,7 +49,6 @@ extern unsigned long saved_video_mode;
extern void reserve_standard_io_resources(void);
extern void i386_reserve_resources(void);
extern unsigned long __startup_64(unsigned long physaddr, struct boot_params *bp);
-extern unsigned long __startup_secondary_64(void);
extern void startup_64_setup_env(unsigned long physbase);
extern void early_setup_idt(void);
extern void __init do_early_exception(struct pt_regs *regs, int trapnr);
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 6462e3dd98f4..ff4da5784d63 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -46,8 +46,6 @@ endif
# non-deterministic coverage.
KCOV_INSTRUMENT := n

-CFLAGS_head$(BITS).o += -fno-stack-protector
-
CFLAGS_irq.o := -I $(srctree)/$(src)/../include/asm/trace

obj-y := process_$(BITS).o signal.o
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 656d2f3e2cf0..c185f4831498 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -318,15 +318,6 @@ unsigned long __head __startup_64(unsigned long physaddr,
return sme_postprocess_startup(bp, pmd);
}

-unsigned long __startup_secondary_64(void)
-{
- /*
- * Return the SME encryption mask (if SME is active) to be used as a
- * modifier for the initial pgdir entry programmed into CR3.
- */
- return sme_get_me_mask();
-}
-
/* Wipe all early page tables except for the kernel symbol map */
static void __init reset_early_page_tables(void)
{
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 9c2c3aff5ee4..9e84263bcb94 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -65,6 +65,22 @@ SYM_CODE_START_NOALIGN(startup_64)
leaq (__end_init_task - FRAME_SIZE)(%rip), %rsp

leaq _text(%rip), %rdi
+
+ /*
+ * initial_gs points to initial fixed_percpu_data struct with storage for
+ * the stack protector canary. Global pointer fixups are needed at this
+ * stage, so apply them as is done in fixup_pointer(), and initialize %gs
+ * such that the canary can be accessed at %gs:40 for subsequent C calls.
+ */
+ movl $MSR_GS_BASE, %ecx
+ movq initial_gs(%rip), %rax
+ movq $_text, %rdx
+ subq %rdx, %rax
+ addq %rdi, %rax
+ movq %rax, %rdx
+ shrq $32, %rdx
+ wrmsr
+
pushq %rsi
call startup_64_setup_env
popq %rsi
@@ -145,9 +161,11 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
* Retrieve the modifier (SME encryption mask if SME is active) to be
* added to the initial pgdir entry that will be programmed into CR3.
*/
- pushq %rsi
- call __startup_secondary_64
- popq %rsi
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+ movq sme_me_mask, %rax
+#else
+ xorq %rax, %rax
+#endif

/* Form the CR3 value being sure to include the CR3 modifier */
addq $(init_top_pgt - __START_KERNEL_map), %rax
--
2.25.1
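
For reference, the %gs:40 slot accessed by stack-protected code on 64-bit
corresponds to the canary's position within fixed_percpu_data, which is
what initial_gs points at. A sketch of the upstream layout (shown for
illustration):

	struct fixed_percpu_data {
		/*
		 * GCC hardcodes the stack canary as %gs:40, so gs_base
		 * pads the struct such that stack_canary lands at
		 * offset 40.
		 */
		char		gs_base[40];
		unsigned long	stack_canary;
	};

This is why loading MSR_GS_BASE with the (fixed-up) initial_gs value is
sufficient to make stack-protected C calls safe so early in boot.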

2022-03-09 00:06:53

by Brijesh Singh

[permalink] [raw]
Subject: [PATCH v12 42/46] x86/sev: Register SEV-SNP guest request platform device

Version 2 of the GHCB specification provides a Non Automatic Exit (NAE)
event type that can be used by the SEV-SNP guest to communicate with the
PSP without risk from a malicious hypervisor that wishes to read, alter,
drop or replay the messages sent.

SNP_LAUNCH_UPDATE can insert two special pages into the guest’s memory:
the secrets page and the CPUID page. The PSP firmware populates the
contents of the secrets page. The secrets page contains encryption keys
used by the guest to interact with the firmware. Because the secrets
page is encrypted with the guest’s memory encryption key, the hypervisor
cannot read the keys. See SEV-SNP firmware spec for further details on
the secrets page format.

Create a platform device that the SEV-SNP guest driver can bind to in
order to obtain platform resources, such as the encryption key and the
message ID, needed to communicate with the PSP. The SEV-SNP guest driver
provides a userspace interface to get the attestation report, perform
key derivation, get the extended attestation report, etc.

Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/sev.h | 4 +++
arch/x86/kernel/sev.c | 56 ++++++++++++++++++++++++++++++++++++++
2 files changed, 60 insertions(+)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 9830ee1d6ef0..ca977493eb72 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -95,6 +95,10 @@ struct snp_req_data {
unsigned int data_npages;
};

+struct snp_guest_platform_data {
+ u64 secrets_gpa;
+};
+
#ifdef CONFIG_AMD_MEM_ENCRYPT
extern struct static_key_false sev_es_enable_key;
extern void __sev_es_ist_enter(struct pt_regs *regs);
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 97c86541b9c6..5f48263bff8a 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -19,6 +19,9 @@
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/cpumask.h>
+#include <linux/efi.h>
+#include <linux/platform_device.h>
+#include <linux/io.h>

#include <asm/cpu_entry_area.h>
#include <asm/stacktrace.h>
@@ -2180,3 +2183,56 @@ int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, unsigned
return ret;
}
EXPORT_SYMBOL_GPL(snp_issue_guest_request);
+
+static struct platform_device guest_req_device = {
+ .name = "snp-guest",
+ .id = -1,
+};
+
+static u64 get_secrets_page(void)
+{
+ u64 pa_data = boot_params.cc_blob_address;
+ struct cc_blob_sev_info info;
+ void *map;
+
+ /*
+	 * The CC blob contains the address of the secrets page; check that
+	 * the blob is present.
+ */
+ if (!pa_data)
+ return 0;
+
+ map = early_memremap(pa_data, sizeof(info));
+ memcpy(&info, map, sizeof(info));
+ early_memunmap(map, sizeof(info));
+
+ /* smoke-test the secrets page passed */
+ if (!info.secrets_phys || info.secrets_len != PAGE_SIZE)
+ return 0;
+
+ return info.secrets_phys;
+}
+
+static int __init snp_init_platform_device(void)
+{
+ struct snp_guest_platform_data data;
+ u64 gpa;
+
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return -ENODEV;
+
+ gpa = get_secrets_page();
+ if (!gpa)
+ return -ENODEV;
+
+ data.secrets_gpa = gpa;
+ if (platform_device_add_data(&guest_req_device, &data, sizeof(data)))
+ return -ENODEV;
+
+ if (platform_device_register(&guest_req_device))
+ return -ENODEV;
+
+ pr_info("SNP guest platform device initialized.\n");
+ return 0;
+}
+device_initcall(snp_init_platform_device);
--
2.25.1
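
On the consumer side, the guest driver added later in this series binds to
the "snp-guest" device and pulls the secrets page address out of the
platform data. A hedged sketch of such a probe routine (the mapping choice
and the key-extraction step are illustrative):

	static int snp_guest_probe(struct platform_device *pdev)
	{
		struct snp_guest_platform_data *data = dev_get_platdata(&pdev->dev);
		void __iomem *secrets;

		if (!data)
			return -ENODEV;

		/* The secrets page is guest-private, so map it encrypted. */
		secrets = ioremap_encrypted(data->secrets_gpa, PAGE_SIZE);
		if (!secrets)
			return -ENOMEM;

		/* ... extract the VMPCK keys and message sequence numbers ... */
		return 0;
	}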

2022-03-09 00:21:04

by Brijesh Singh

[permalink] [raw]
Subject: [PATCH v12 04/46] KVM: SVM: Update the SEV-ES save area mapping

From: Tom Lendacky <[email protected]>

This is the final step in defining the multiple save areas to keep them
separate and ensuring proper operation amongst the different types of
guests. Update the SEV-ES/SEV-SNP save area to match the APM. This save
area will be used for the upcoming SEV-SNP AP Creation NAE event support.

Reviewed-by: Venu Busireddy <[email protected]>
Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/svm.h | 66 +++++++++++++++++++++++++++++---------
1 file changed, 50 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index eae5c7ab9c6c..7ab508fd8c4c 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -326,7 +326,13 @@ struct sev_es_save_area {
struct vmcb_seg ldtr;
struct vmcb_seg idtr;
struct vmcb_seg tr;
- u8 reserved_1[43];
+ u64 vmpl0_ssp;
+ u64 vmpl1_ssp;
+ u64 vmpl2_ssp;
+ u64 vmpl3_ssp;
+ u64 u_cet;
+ u8 reserved_1[2];
+ u8 vmpl;
u8 cpl;
u8 reserved_2[4];
u64 efer;
@@ -339,9 +345,19 @@ struct sev_es_save_area {
u64 dr6;
u64 rflags;
u64 rip;
- u8 reserved_4[88];
+ u64 dr0;
+ u64 dr1;
+ u64 dr2;
+ u64 dr3;
+ u64 dr0_addr_mask;
+ u64 dr1_addr_mask;
+ u64 dr2_addr_mask;
+ u64 dr3_addr_mask;
+ u8 reserved_4[24];
u64 rsp;
- u8 reserved_5[24];
+ u64 s_cet;
+ u64 ssp;
+ u64 isst_addr;
u64 rax;
u64 star;
u64 lstar;
@@ -352,7 +368,7 @@ struct sev_es_save_area {
u64 sysenter_esp;
u64 sysenter_eip;
u64 cr2;
- u8 reserved_6[32];
+ u8 reserved_5[32];
u64 g_pat;
u64 dbgctl;
u64 br_from;
@@ -361,12 +377,12 @@ struct sev_es_save_area {
u64 last_excp_to;
u8 reserved_7[80];
u32 pkru;
- u8 reserved_9[20];
- u64 reserved_10; /* rax already available at 0x01f8 */
+ u8 reserved_8[20];
+ u64 reserved_9; /* rax already available at 0x01f8 */
u64 rcx;
u64 rdx;
u64 rbx;
- u64 reserved_11; /* rsp already available at 0x01d8 */
+ u64 reserved_10; /* rsp already available at 0x01d8 */
u64 rbp;
u64 rsi;
u64 rdi;
@@ -378,16 +394,34 @@ struct sev_es_save_area {
u64 r13;
u64 r14;
u64 r15;
- u8 reserved_12[16];
- u64 sw_exit_code;
- u64 sw_exit_info_1;
- u64 sw_exit_info_2;
- u64 sw_scratch;
+ u8 reserved_11[16];
+ u64 guest_exit_info_1;
+ u64 guest_exit_info_2;
+ u64 guest_exit_int_info;
+ u64 guest_nrip;
u64 sev_features;
- u8 reserved_13[48];
+ u64 vintr_ctrl;
+ u64 guest_exit_code;
+ u64 virtual_tom;
+ u64 tlb_id;
+ u64 pcpu_id;
+ u64 event_inj;
u64 xcr0;
- u8 valid_bitmap[16];
- u64 x87_state_gpa;
+ u8 reserved_12[16];
+
+ /* Floating point area */
+ u64 x87_dp;
+ u32 mxcsr;
+ u16 x87_ftw;
+ u16 x87_fsw;
+ u16 x87_fcw;
+ u16 x87_fop;
+ u16 x87_ds;
+ u16 x87_cs;
+ u64 x87_rip;
+ u8 fpreg_x87[80];
+ u8 fpreg_xmm[256];
+ u8 fpreg_ymm[256];
} __packed;

struct ghcb_save_area {
@@ -446,7 +480,7 @@ struct ghcb {

#define EXPECTED_VMCB_SAVE_AREA_SIZE 740
#define EXPECTED_GHCB_SAVE_AREA_SIZE 1032
-#define EXPECTED_SEV_ES_SAVE_AREA_SIZE 1032
+#define EXPECTED_SEV_ES_SAVE_AREA_SIZE 1648
#define EXPECTED_VMCB_CONTROL_AREA_SIZE 1024
#define EXPECTED_GHCB_SIZE PAGE_SIZE

--
2.25.1
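
Since several fields sit at architectural offsets (see the "rax already
available at 0x01f8" and "rsp already available at 0x01d8" comments), the
layout can also be sanity-checked field by field. Illustrative assertions,
alongside the whole-struct size check the patch relies on:

	BUILD_BUG_ON(offsetof(struct sev_es_save_area, rsp) != 0x1d8);
	BUILD_BUG_ON(offsetof(struct sev_es_save_area, rax) != 0x1f8);
	BUILD_BUG_ON(sizeof(struct sev_es_save_area) !=
		     EXPECTED_SEV_ES_SAVE_AREA_SIZE);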

2022-03-09 00:22:16

by Brijesh Singh

[permalink] [raw]
Subject: [PATCH v12 11/46] x86/sev: Save the negotiated GHCB version

An SEV-ES guest calls sev_es_negotiate_protocol() to negotiate the
GHCB protocol version before establishing the GHCB. Cache the negotiated
GHCB version so that it can be used later.

Reviewed-by: Venu Busireddy <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/sev.h | 2 +-
arch/x86/kernel/sev-shared.c | 17 ++++++++++++++---
2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index ec060c433589..9b9c190e8c3b 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -12,7 +12,7 @@
#include <asm/insn.h>
#include <asm/sev-common.h>

-#define GHCB_PROTO_OUR 0x0001UL
+#define GHCB_PROTOCOL_MIN 1ULL
#define GHCB_PROTOCOL_MAX 1ULL
#define GHCB_DEFAULT_USAGE 0ULL

diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 2abf8a7d75e5..91105f5a02a8 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -14,6 +14,15 @@
#define has_cpuflag(f) boot_cpu_has(f)
#endif

+/*
+ * Since feature negotiation related variables are set early in the boot
+ * process they must reside in the .data section so as not to be zeroed
+ * out when the .bss section is later cleared.
+ *
+ * GHCB protocol version negotiated with the hypervisor.
+ */
+static u16 ghcb_version __ro_after_init;
+
static bool __init sev_es_check_cpu_features(void)
{
if (!has_cpuflag(X86_FEATURE_RDRAND)) {
@@ -51,10 +60,12 @@ static bool sev_es_negotiate_protocol(void)
if (GHCB_MSR_INFO(val) != GHCB_MSR_SEV_INFO_RESP)
return false;

- if (GHCB_MSR_PROTO_MAX(val) < GHCB_PROTO_OUR ||
- GHCB_MSR_PROTO_MIN(val) > GHCB_PROTO_OUR)
+ if (GHCB_MSR_PROTO_MAX(val) < GHCB_PROTOCOL_MIN ||
+ GHCB_MSR_PROTO_MIN(val) > GHCB_PROTOCOL_MAX)
return false;

+ ghcb_version = min_t(size_t, GHCB_MSR_PROTO_MAX(val), GHCB_PROTOCOL_MAX);
+
return true;
}

@@ -127,7 +138,7 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb, bool set_ghcb_msr,
u64 exit_info_1, u64 exit_info_2)
{
/* Fill in protocol and format specifiers */
- ghcb->protocol_version = GHCB_PROTOCOL_MAX;
+ ghcb->protocol_version = ghcb_version;
ghcb->ghcb_usage = GHCB_DEFAULT_USAGE;

ghcb_set_sw_exit_code(ghcb, exit_code);
--
2.25.1
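
Put together with the request side (not visible in the hunk), the
negotiation is a single MSR-protocol round trip. A sketch of the full
flow as it stands after this patch:

	sev_es_wr_ghcb_msr(GHCB_MSR_SEV_INFO_REQ);
	VMGEXIT();
	val = sev_es_rd_ghcb_msr();

	if (GHCB_MSR_INFO(val) != GHCB_MSR_SEV_INFO_RESP)
		return false;

	/*
	 * The response encodes the hypervisor's [min, max] supported
	 * range; fail unless it overlaps [GHCB_PROTOCOL_MIN,
	 * GHCB_PROTOCOL_MAX].
	 */
	if (GHCB_MSR_PROTO_MAX(val) < GHCB_PROTOCOL_MIN ||
	    GHCB_MSR_PROTO_MIN(val) > GHCB_PROTOCOL_MAX)
		return false;

	/* Use the highest version both sides understand. */
	ghcb_version = min_t(size_t, GHCB_MSR_PROTO_MAX(val), GHCB_PROTOCOL_MAX);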

2022-03-09 00:40:33

by Brijesh Singh

[permalink] [raw]
Subject: [PATCH v12 09/46] x86/mm: Extend cc_attr to include AMD SEV-SNP

CC_ATTR_GUEST_SEV_SNP can be used by the guest to query whether the
SNP (Secure Nested Paging) feature is active.

Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/coco/core.c | 3 +++
arch/x86/include/asm/msr-index.h | 2 ++
arch/x86/mm/mem_encrypt.c | 4 ++++
include/linux/cc_platform.h | 8 ++++++++
4 files changed, 17 insertions(+)

diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
index fc1365dd927e..dafd4881ce29 100644
--- a/arch/x86/coco/core.c
+++ b/arch/x86/coco/core.c
@@ -57,6 +57,9 @@ static bool amd_cc_platform_has(enum cc_attr attr)
return (sev_status & MSR_AMD64_SEV_ENABLED) &&
!(sev_status & MSR_AMD64_SEV_ES_ENABLED);

+ case CC_ATTR_GUEST_SEV_SNP:
+ return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
+
default:
return false;
}
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index a142cab6882e..1315531e66ef 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -484,8 +484,10 @@
#define MSR_AMD64_SEV 0xc0010131
#define MSR_AMD64_SEV_ENABLED_BIT 0
#define MSR_AMD64_SEV_ES_ENABLED_BIT 1
+#define MSR_AMD64_SEV_SNP_ENABLED_BIT 2
#define MSR_AMD64_SEV_ENABLED BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
#define MSR_AMD64_SEV_ES_ENABLED BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT)
+#define MSR_AMD64_SEV_SNP_ENABLED BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)

#define MSR_AMD64_VIRT_SPEC_CTRL 0xc001011f

diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 50d209939c66..f85868c031c6 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -62,6 +62,10 @@ static void print_mem_encrypt_feature_info(void)
if (cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
pr_cont(" SEV-ES");

+ /* Secure Nested Paging */
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ pr_cont(" SEV-SNP");
+
pr_cont("\n");
}

diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
index efd8205282da..d08dd65b5c43 100644
--- a/include/linux/cc_platform.h
+++ b/include/linux/cc_platform.h
@@ -72,6 +72,14 @@ enum cc_attr {
* Examples include TDX guest & SEV.
*/
CC_ATTR_GUEST_UNROLL_STRING_IO,
+
+ /**
+	 * @CC_ATTR_GUEST_SEV_SNP: Guest SNP is active.
+ *
+ * The platform/OS is running as a guest/virtual machine and actively
+ * using AMD SEV-SNP features.
+ */
+ CC_ATTR_GUEST_SEV_SNP,
};

#ifdef CONFIG_ARCH_HAS_CC_PLATFORM
--
2.25.1
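
Call sites can then gate SNP-only work on the new attribute, for example
(illustrative; snp_set_memory_private() is added later in the series):

	if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
		/* SNP-only work, e.g. a page state change plus PVALIDATE */
		snp_set_memory_private(vaddr, npages);
	}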

2022-03-09 00:45:08

by Brijesh Singh

[permalink] [raw]
Subject: [PATCH v12 14/46] x86/sev: Check the vmpl level

The Virtual Machine Privilege Level (VMPL) feature in the SEV-SNP
architecture allows a guest VM to divide its address space into four
levels. The levels can be used to provide hardware-isolated abstraction
layers within a VM. VMPL0 is the most privileged level, and VMPL3 is the
least privileged.
Certain operations must be done by the VMPL0 software, such as:

* Validate or invalidate memory range (PVALIDATE instruction)
* Allocate VMSA page (RMPADJUST instruction when VMSA=1)

The initial SEV-SNP support requires that the guest kernel is running on
VMPL0. Add a check to make sure that the kernel is running at VMPL0
before continuing the boot. There is no easy method to query the current
VMPL level, so use the RMPADJUST instruction to determine whether the
guest is running at VMPL0.

Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/boot/compressed/sev.c | 28 ++++++++++++++++++++++++++--
arch/x86/include/asm/sev-common.h | 1 +
arch/x86/include/asm/sev.h | 16 ++++++++++++++++
3 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 5b389310be87..84e7d45afa9e 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -199,6 +199,26 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
}

+static void enforce_vmpl0(void)
+{
+ u64 attrs;
+ int err;
+
+ /*
+ * RMPADJUST modifies RMP permissions of a lesser-privileged (numerically
+ * higher) privilege level. Here, clear the VMPL1 permission mask of the
+ * GHCB page. If the guest is not running at VMPL0, this will fail.
+ *
+ * If the guest is running at VMPL0, it will succeed. Even if that operation
+ * modifies permission bits, it is still ok to do currently because Linux
+	 * SNP guests are supported only on VMPL0, so any change to a VMPL1-or-higher
+	 * permission mask is a don't-care.
+ */
+ attrs = 1;
+ if (rmpadjust((unsigned long)&boot_ghcb_page, RMP_PG_SIZE_4K, attrs))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_NOT_VMPL0);
+}
+
void sev_enable(struct boot_params *bp)
{
unsigned int eax, ebx, ecx, edx;
@@ -242,8 +262,12 @@ void sev_enable(struct boot_params *bp)
* SNP is supported in v2 of the GHCB spec which mandates support for HV
* features.
*/
- if (sev_status & MSR_AMD64_SEV_SNP_ENABLED && !(get_hv_features() & GHCB_HV_FT_SNP))
- sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+ if (sev_status & MSR_AMD64_SEV_SNP_ENABLED) {
+ if (!(get_hv_features() & GHCB_HV_FT_SNP))
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+
+ enforce_vmpl0();
+ }

sme_me_mask = BIT_ULL(ebx & 0x3f);
}
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 6f037c29a46e..7ac5842e32b6 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -89,6 +89,7 @@
#define GHCB_TERM_REGISTER 0 /* GHCB GPA registration failure */
#define GHCB_TERM_PSC 1 /* Page State Change failure */
#define GHCB_TERM_PVALIDATE 2 /* Pvalidate failure */
+#define GHCB_TERM_NOT_VMPL0 3 /* SNP guest is not running at VMPL-0 */

#define GHCB_RESP_CODE(v) ((v) & GHCB_MSR_INFO_MASK)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 4ee98976aed8..e37451849165 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -63,6 +63,9 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
/* Software defined (when rFlags.CF = 1) */
#define PVALIDATE_FAIL_NOUPDATE 255

+/* RMP page size */
+#define RMP_PG_SIZE_4K 0
+
#ifdef CONFIG_AMD_MEM_ENCRYPT
extern struct static_key_false sev_es_enable_key;
extern void __sev_es_ist_enter(struct pt_regs *regs);
@@ -90,6 +93,18 @@ extern enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
struct es_em_ctxt *ctxt,
u64 exit_code, u64 exit_info_1,
u64 exit_info_2);
+static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs)
+{
+ int rc;
+
+ /* "rmpadjust" mnemonic support in binutils 2.36 and newer */
+ asm volatile(".byte 0xF3,0x0F,0x01,0xFE\n\t"
+ : "=a"(rc)
+ : "a"(vaddr), "c"(rmp_psize), "d"(attrs)
+ : "memory", "cc");
+
+ return rc;
+}
static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
{
bool no_rmpupdate;
@@ -114,6 +129,7 @@ static inline int sev_es_setup_ap_jump_table(struct real_mode_header *rmh) { ret
static inline void sev_es_nmi_complete(void) { }
static inline int sev_es_efi_map_ghcbs(pgd_t *pgd) { return 0; }
static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) { return 0; }
+static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs) { return 0; }
#endif

#endif
--
2.25.1
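
For reference, the attrs operand passed to RMPADJUST packs the target VMPL
together with the permission mask to assign at that level (layout per the
APM's RMPADJUST description; treat this as illustrative):

	/*
	 * RMPADJUST attrs (RDX) layout:
	 *   bits [7:0]  - target VMPL
	 *   bits [15:8] - permission mask to assign for that VMPL
	 *   bit  16     - page is a VMSA page
	 *
	 * attrs == 1 therefore means "clear VMPL1's permission mask",
	 * an operation only VMPL0 may perform -- which is what makes
	 * it usable as a VMPL0 probe in enforce_vmpl0().
	 */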

2022-03-09 01:23:22

by Brijesh Singh

[permalink] [raw]
Subject: [PATCH v12 19/46] x86/kernel: Make the .bss..decrypted section shared in RMP table

The encryption attribute for the .bss..decrypted section is cleared in the
initial page table build. This is because the section contains data that
needs to be shared between the guest and the hypervisor.

When SEV-SNP is active, just clearing the encryption attribute in the
page table is not enough. The page state needs to be updated in the RMP
table.

Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/kernel/head64.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 83514b9827e6..656d2f3e2cf0 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -143,7 +143,20 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdv
if (sme_get_me_mask()) {
vaddr = (unsigned long)__start_bss_decrypted;
vaddr_end = (unsigned long)__end_bss_decrypted;
+
for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
+ /*
+ * On SNP, transition the page to shared in the RMP table so that
+ * it is consistent with the page table attribute change.
+ *
+ * __start_bss_decrypted has a virtual address in the high range
+ * mapping (kernel .text). PVALIDATE, by way of
+ * early_snp_set_memory_shared(), requires a valid virtual
+ * address but the kernel is currently running off of the identity
+ * mapping so use __pa() to get a *currently* valid virtual address.
+ */
+ early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);
+
i = pmd_index(vaddr);
pmd[i] -= sme_get_me_mask();
}
--
2.25.1

2022-03-09 01:37:00

by Brijesh Singh

[permalink] [raw]
Subject: [PATCH v12 24/46] x86/compressed/acpi: Move EFI detection to helper

From: Michael Roth <[email protected]>

Future patches for SEV-SNP-validated CPUID will also require early
parsing of the EFI configuration. Incrementally move the related
code into a set of helpers that can be re-used for that purpose.

Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/boot/compressed/Makefile | 1 +
arch/x86/boot/compressed/acpi.c | 28 +++++++----------
arch/x86/boot/compressed/efi.c | 50 +++++++++++++++++++++++++++++++
arch/x86/boot/compressed/misc.h | 16 ++++++++++
4 files changed, 77 insertions(+), 18 deletions(-)
create mode 100644 arch/x86/boot/compressed/efi.c

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 6115274fe10f..e69c3d2e0628 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -103,6 +103,7 @@ endif
vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o

vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o
+vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi.o
efi-obj-$(CONFIG_EFI_STUB) = $(objtree)/drivers/firmware/efi/libstub/lib.a

$(obj)/vmlinux: $(vmlinux-objs-y) $(efi-obj-y) FORCE
diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
index 8bcbcee54aa1..db6c561920f0 100644
--- a/arch/x86/boot/compressed/acpi.c
+++ b/arch/x86/boot/compressed/acpi.c
@@ -87,7 +87,7 @@ static acpi_physical_address kexec_get_rsdp_addr(void)
efi_system_table_64_t *systab;
struct efi_setup_data *esd;
struct efi_info *ei;
- char *sig;
+ enum efi_type et;

esd = (struct efi_setup_data *)get_kexec_setup_data_addr();
if (!esd)
@@ -98,10 +98,9 @@ static acpi_physical_address kexec_get_rsdp_addr(void)
return 0;
}

- ei = &boot_params->efi_info;
- sig = (char *)&ei->efi_loader_signature;
- if (strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
- debug_putstr("Wrong kexec EFI loader signature.\n");
+ et = efi_get_type(boot_params);
+ if (et != EFI_TYPE_64) {
+ debug_putstr("Unexpected kexec EFI environment (expected 64-bit EFI).\n");
return 0;
}

@@ -122,29 +121,22 @@ static acpi_physical_address efi_get_rsdp_addr(void)
unsigned long systab, config_tables;
unsigned int nr_tables;
struct efi_info *ei;
+ enum efi_type et;
bool efi_64;
- char *sig;
-
- ei = &boot_params->efi_info;
- sig = (char *)&ei->efi_loader_signature;

- if (!strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
+ et = efi_get_type(boot_params);
+ if (et == EFI_TYPE_64)
efi_64 = true;
- } else if (!strncmp(sig, EFI32_LOADER_SIGNATURE, 4)) {
+ else if (et == EFI_TYPE_32)
efi_64 = false;
- } else {
- debug_putstr("Wrong EFI loader signature.\n");
+ else
return 0;
- }

/* Get systab from boot params. */
+ ei = &boot_params->efi_info;
#ifdef CONFIG_X86_64
systab = ei->efi_systab | ((__u64)ei->efi_systab_hi << 32);
#else
- if (ei->efi_systab_hi || ei->efi_memmap_hi) {
- debug_putstr("Error getting RSDP address: EFI system table located above 4GB.\n");
- return 0;
- }
systab = ei->efi_systab;
#endif
if (!systab)
diff --git a/arch/x86/boot/compressed/efi.c b/arch/x86/boot/compressed/efi.c
new file mode 100644
index 000000000000..bad0ce3e2ef6
--- /dev/null
+++ b/arch/x86/boot/compressed/efi.c
@@ -0,0 +1,50 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Helpers for early access to EFI configuration table.
+ *
+ * Originally derived from arch/x86/boot/compressed/acpi.c
+ */
+
+#include "misc.h"
+#include <linux/efi.h>
+#include <asm/efi.h>
+
+/**
+ * efi_get_type - Given a pointer to boot_params, determine the type of EFI environment.
+ *
+ * @bp: pointer to boot_params
+ *
+ * Return: EFI_TYPE_{32,64} for valid EFI environments, EFI_TYPE_NONE otherwise.
+ */
+enum efi_type efi_get_type(struct boot_params *bp)
+{
+ struct efi_info *ei;
+ enum efi_type et;
+ const char *sig;
+
+ ei = &bp->efi_info;
+ sig = (char *)&ei->efi_loader_signature;
+
+ if (!strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
+ et = EFI_TYPE_64;
+ } else if (!strncmp(sig, EFI32_LOADER_SIGNATURE, 4)) {
+ et = EFI_TYPE_32;
+ } else {
+ debug_putstr("No EFI environment detected.\n");
+ et = EFI_TYPE_NONE;
+ }
+
+#ifndef CONFIG_X86_64
+ /*
+ * Existing callers like acpi.c treat this case as an indicator to
+ * fall-through to non-EFI, rather than an error, so maintain that
+ * functionality here as well.
+ */
+ if (ei->efi_systab_hi || ei->efi_memmap_hi) {
+ debug_putstr("EFI system table is located above 4GB and cannot be accessed.\n");
+ et = EFI_TYPE_NONE;
+ }
+#endif
+
+ return et;
+}
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 01cc13c12059..fede1afa39e9 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -176,4 +176,20 @@ void boot_stage2_vc(void);

unsigned long sev_verify_cbit(unsigned long cr3);

+enum efi_type {
+ EFI_TYPE_64,
+ EFI_TYPE_32,
+ EFI_TYPE_NONE,
+};
+
+#ifdef CONFIG_EFI
+/* helpers for early EFI config table access */
+enum efi_type efi_get_type(struct boot_params *bp);
+#else
+static inline enum efi_type efi_get_type(struct boot_params *bp)
+{
+ return EFI_TYPE_NONE;
+}
+#endif /* CONFIG_EFI */
+
#endif /* BOOT_COMPRESSED_MISC_H */
--
2.25.1

2022-03-09 01:48:47

by Brijesh Singh

[permalink] [raw]
Subject: [PATCH v12 17/46] x86/sev: Register GHCB memory when SEV-SNP is active

The SEV-SNP guest is required by the GHCB spec to register the GHCB's
Guest Physical Address (GPA). This is because the hypervisor may prefer
that a guest use a consistent and/or specific GPA for the GHCB associated
with a vCPU. For more information, see the GHCB specification section
"GHCB GPA Registration".

Signed-off-by: Brijesh Singh <[email protected]>
---
arch/x86/include/asm/sev.h | 2 ++
arch/x86/kernel/cpu/common.c | 4 +++
arch/x86/kernel/head64.c | 4 ++-
arch/x86/kernel/sev-shared.c | 2 +-
arch/x86/kernel/sev.c | 47 +++++++++++++++++++++++++++---------
5 files changed, 45 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index e37451849165..48df02713ee0 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -122,6 +122,7 @@ static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)

return rc;
}
+void setup_ghcb(void);
#else
static inline void sev_es_ist_enter(struct pt_regs *regs) { }
static inline void sev_es_ist_exit(void) { }
@@ -130,6 +131,7 @@ static inline void sev_es_nmi_complete(void) { }
static inline int sev_es_efi_map_ghcbs(pgd_t *pgd) { return 0; }
static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) { return 0; }
static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs) { return 0; }
+static inline void setup_ghcb(void) { }
#endif

#endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 64deb7727d00..2e0dd7f4018e 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -59,6 +59,7 @@
#include <asm/cpu_device_id.h>
#include <asm/uv/uv.h>
#include <asm/sigframe.h>
+#include <asm/sev.h>

#include "cpu.h"

@@ -2067,6 +2068,9 @@ void cpu_init_exception_handling(void)

load_TR_desc();

+	/* GHCB needs to be set up to handle #VC. */
+ setup_ghcb();
+
/* Finally load the IDT */
load_current_idt();
}
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index cbc285ddc4ac..83514b9827e6 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -597,8 +597,10 @@ static void startup_64_load_idt(unsigned long physbase)
void early_setup_idt(void)
{
/* VMM Communication Exception */
- if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
+ if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
+ setup_ghcb();
set_bringup_idt_handler(bringup_idt_table, X86_TRAP_VC, vc_boot_ghcb);
+ }

bringup_idt_descr.address = (unsigned long)bringup_idt_table;
native_load_idt(&bringup_idt_descr);
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index e9ff13cd90b0..3aaef1a18ffe 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -68,7 +68,7 @@ static u64 get_hv_features(void)
return GHCB_MSR_HV_FT_RESP_VAL(val);
}

-static void __maybe_unused snp_register_ghcb_early(unsigned long paddr)
+static void snp_register_ghcb_early(unsigned long paddr)
{
unsigned long pfn = paddr >> PAGE_SHIFT;
u64 val;
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index cb20fb0c608e..cc382c4f89ef 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -41,7 +41,7 @@ static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
* Needs to be in the .data section because we need it NULL before bss is
* cleared
*/
-static struct ghcb __initdata *boot_ghcb;
+static struct ghcb *boot_ghcb __section(".data");

/* Bitmap of SEV features supported by the hypervisor */
static u64 sev_hv_features __ro_after_init;
@@ -647,15 +647,40 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
return ret;
}

-/*
- * This function runs on the first #VC exception after the kernel
- * switched to virtual addresses.
- */
-static bool __init sev_es_setup_ghcb(void)
+static void snp_register_per_cpu_ghcb(void)
{
+ struct sev_es_runtime_data *data;
+ struct ghcb *ghcb;
+
+ data = this_cpu_read(runtime_data);
+ ghcb = &data->ghcb_page;
+
+ snp_register_ghcb_early(__pa(ghcb));
+}
+
+void setup_ghcb(void)
+{
+ if (!cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
+ return;
+
/* First make sure the hypervisor talks a supported protocol. */
if (!sev_es_negotiate_protocol())
- return false;
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
+
+ /*
+ * Check whether the runtime #VC exception handler is active.
+ * The runtime exception handler uses the per-CPU GHCB page, and
+	 * the GHCB page is set up by sev_es_init_vc_handling().
+ *
+ * If SNP is active, then register the per-CPU GHCB page so that
+ * the runtime exception handler can use it.
+ */
+ if (initial_vc_handler == (unsigned long)kernel_exc_vmm_communication) {
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ snp_register_per_cpu_ghcb();
+
+ return;
+ }

/*
* Clear the boot_ghcb. The first exception comes in before the bss
@@ -666,7 +691,9 @@ static bool __init sev_es_setup_ghcb(void)
/* Alright - Make the boot-ghcb public */
boot_ghcb = &boot_ghcb_page;

- return true;
+ /* SNP guest requires that GHCB GPA must be registered. */
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ snp_register_ghcb_early(__pa(&boot_ghcb_page));
}

#ifdef CONFIG_HOTPLUG_CPU
@@ -1397,10 +1424,6 @@ bool __init handle_vc_boot_ghcb(struct pt_regs *regs)
struct es_em_ctxt ctxt;
enum es_result result;

- /* Do initial setup or terminate the guest */
- if (unlikely(boot_ghcb == NULL && !sev_es_setup_ghcb()))
- sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
-
vc_ghcb_invalidate(boot_ghcb);

result = vc_init_em_ctxt(&ctxt, regs, exit_code);
--
2.25.1
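
For completeness, snp_register_ghcb_early() (introduced earlier in the
series and now called unconditionally above) performs the registration
through the GHCB MSR protocol. Roughly:

	static void snp_register_ghcb_early(unsigned long paddr)
	{
		unsigned long pfn = paddr >> PAGE_SHIFT;
		u64 val;

		sev_es_wr_ghcb_msr(GHCB_MSR_REG_GPA_REQ_VAL(pfn));
		VMGEXIT();

		val = sev_es_rd_ghcb_msr();

		/* If the response GPA is not ours then abort the guest */
		if ((GHCB_RESP_CODE(val) != GHCB_MSR_REG_GPA_RESP) ||
		    (GHCB_MSR_REG_GPA_RESP_VAL(val) != pfn))
			sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_REGISTER);
	}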

2022-03-15 09:58:47

by Peter Gonda

[permalink] [raw]
Subject: Re: [PATCH v12 46/46] virt: sevguest: Add documentation for SEV-SNP CPUID Enforcement

On Mon, Mar 7, 2022 at 2:35 PM Brijesh Singh <[email protected]> wrote:
>
> From: Michael Roth <[email protected]>
>
> Update the documentation with information regarding SEV-SNP CPUID
> Enforcement details and what sort of assurances it provides to guests.
>
> Signed-off-by: Michael Roth <[email protected]>
> Signed-off-by: Brijesh Singh <[email protected]>

Tested-by: Peter Gonda <[email protected]>

I've booted these V12 patches on our internal KVM stack and then
tested these new driver ioctls with some basic usage. Feel free to add
this tag to all the driver patches; I am not sure if a basic boot test
qualifies this for the entire series though.

2022-04-07 05:51:07

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v12 29/46] x86/boot: Add Confidential Computing type to setup_data

On Mon, Mar 07 2022 at 15:33, Brijesh Singh wrote:
>
> +/*
> + * AMD SEV Confidential computing blob structure. The structure is
> + * defined in OVMF UEFI firmware header:
> + * https://github.com/tianocore/edk2/blob/master/OvmfPkg/Include/Guid/ConfidentialComputingSevSnpBlob.h
> + */
> +#define CC_BLOB_SEV_HDR_MAGIC 0x45444d41
> +struct cc_blob_sev_info {
> + u32 magic;
> + u16 version;
> + u16 reserved;
> + u64 secrets_phys;
> + u32 secrets_len;
> + u32 rsvd1;
> + u64 cpuid_phys;
> + u32 cpuid_len;
> + u32 rsvd2;
> +};

Shouldn't this be packed?

Thanks,

tglx

2022-04-07 19:51:12

by Brijesh Singh

[permalink] [raw]
Subject: Re: [PATCH v12 29/46] x86/boot: Add Confidential Computing type to setup_data



On 4/6/22 16:19, Thomas Gleixner wrote:
> On Mon, Mar 07 2022 at 15:33, Brijesh Singh wrote:
>>
>> +/*
>> + * AMD SEV Confidential computing blob structure. The structure is
>> + * defined in OVMF UEFI firmware header:
>> + * https://github.com/tianocore/edk2/blob/master/OvmfPkg/Include/Guid/ConfidentialComputingSevSnpBlob.h
>> + */
>> +#define CC_BLOB_SEV_HDR_MAGIC 0x45444d41
>> +struct cc_blob_sev_info {
>> + u32 magic;
>> + u16 version;
>> + u16 reserved;
>> + u64 secrets_phys;
>> + u32 secrets_len;
>> + u32 rsvd1;
>> + u64 cpuid_phys;
>> + u32 cpuid_len;
>> + u32 rsvd2;
>> +};
>
> Shouldn't this be packed?
>

Yep, to avoid any additional compiler alignment we should pack it.

thanks

2022-04-07 20:35:22

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v12 29/46] x86/boot: Add Confidential Computing type to setup_data

On Wed, Apr 06, 2022 at 11:19:10PM +0200, Thomas Gleixner wrote:
> On Mon, Mar 07 2022 at 15:33, Brijesh Singh wrote:
> >
> > +/*
> > + * AMD SEV Confidential computing blob structure. The structure is
> > + * defined in OVMF UEFI firmware header:
> > + * https://github.com/tianocore/edk2/blob/master/OvmfPkg/Include/Guid/ConfidentialComputingSevSnpBlob.h
> > + */
> > +#define CC_BLOB_SEV_HDR_MAGIC 0x45444d41
> > +struct cc_blob_sev_info {
> > + u32 magic;
> > + u16 version;
> > + u16 reserved;
> > + u64 secrets_phys;
> > + u32 secrets_len;
> > + u32 rsvd1;
> > + u64 cpuid_phys;
> > + u32 cpuid_len;
> > + u32 rsvd2;
> > +};
>
> Shouldn't this be packed?

Done.

Thx.

--
Regards/Gruss,
Boris.

SUSE Software Solutions Germany GmbH, GF: Ivo Totev, HRB 36809, AG Nürnberg
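
The fix applied on top amounts to the attribute on the closing brace:

 struct cc_blob_sev_info {
 	...
 	u32 rsvd2;
-};
+} __packed;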

Subject: [tip: x86/sev] x86/kernel: Mark the .bss..decrypted section as shared in the RMP table

The following commit has been merged into the x86/sev branch of tip:

Commit-ID: efac0eedfab515e523cde5cb7a62289eb2ee58f8
Gitweb: https://git.kernel.org/tip/efac0eedfab515e523cde5cb7a62289eb2ee58f8
Author: Brijesh Singh <[email protected]>
AuthorDate: Wed, 09 Feb 2022 12:10:13 -06:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Wed, 06 Apr 2022 13:23:00 +02:00

x86/kernel: Mark the .bss..decrypted section as shared in the RMP table

The encryption attribute for the .bss..decrypted section is cleared in the
initial page table build. This is because the section contains data that
needs to be shared between the guest and the hypervisor.

When SEV-SNP is active, just clearing the encryption attribute in the
page table is not enough. The page state needs to be updated in the RMP
table.

Signed-off-by: Brijesh Singh <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/kernel/head64.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 83514b9..656d2f3 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -143,7 +143,20 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdv
if (sme_get_me_mask()) {
vaddr = (unsigned long)__start_bss_decrypted;
vaddr_end = (unsigned long)__end_bss_decrypted;
+
for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
+ /*
+ * On SNP, transition the page to shared in the RMP table so that
+ * it is consistent with the page table attribute change.
+ *
+ * __start_bss_decrypted has a virtual address in the high range
+ * mapping (kernel .text). PVALIDATE, by way of
+ * early_snp_set_memory_shared(), requires a valid virtual
+ * address but the kernel is currently running off of the identity
+ * mapping so use __pa() to get a *currently* valid virtual address.
+ */
+ early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);
+
i = pmd_index(vaddr);
pmd[i] -= sme_get_me_mask();
}

Subject: [tip: x86/sev] x86/sev: Define the Linux-specific guest termination reasons

The following commit has been merged into the x86/sev branch of tip:

Commit-ID: 6c0f74d678c94060932683738b3e227995b363d3
Gitweb: https://git.kernel.org/tip/6c0f74d678c94060932683738b3e227995b363d3
Author: Brijesh Singh <[email protected]>
AuthorDate: Wed, 09 Feb 2022 12:10:04 -06:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Wed, 06 Apr 2022 13:02:41 +02:00

x86/sev: Define the Linux-specific guest termination reasons

The GHCB specification defines the reason code for reason set 0. The
reason codes defined in the set 0 do not cover all possible causes for a
guest to request termination.

The reason sets 1 to 255 are reserved for the vendor-specific codes.
Reserve the reason set 1 for the Linux guest. Define the error codes for
reason set 1 so that one can have meaningful termination reasons and thus
better guest failure diagnosis.

While at it, change sev_es_terminate() to accept a reason set parameter.

[ bp: Massage commit message. ]

Signed-off-by: Brijesh Singh <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Venu Busireddy <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/boot/compressed/sev.c | 6 +++---
arch/x86/include/asm/sev-common.h | 8 ++++++++
arch/x86/kernel/sev-shared.c | 11 ++++-------
arch/x86/kernel/sev.c | 4 ++--
4 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 27ccd5a..56e941d 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -119,7 +119,7 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
static bool early_setup_sev_es(void)
{
if (!sev_es_negotiate_protocol())
- sev_es_terminate(GHCB_SEV_ES_PROT_UNSUPPORTED);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_PROT_UNSUPPORTED);

if (set_page_decrypted((unsigned long)&boot_ghcb_page))
return false;
@@ -172,7 +172,7 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
enum es_result result;

if (!boot_ghcb && !early_setup_sev_es())
- sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);

vc_ghcb_invalidate(boot_ghcb);
result = vc_init_em_ctxt(&ctxt, regs, exit_code);
@@ -199,7 +199,7 @@ finish:
if (result == ES_OK)
vc_finish_insn(&ctxt);
else if (result != ES_RETRY)
- sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
}

void sev_enable(struct boot_params *bp)
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 1b2fd32..94f0ea5 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -73,9 +73,17 @@
/* GHCBData[23:16] */ \
((((u64)reason_val) & 0xff) << 16))

+/* Error codes from reason set 0 */
+#define SEV_TERM_SET_GEN 0
#define GHCB_SEV_ES_GEN_REQ 0
#define GHCB_SEV_ES_PROT_UNSUPPORTED 1

+/* Linux-specific reason codes (used with reason set 1) */
+#define SEV_TERM_SET_LINUX 1
+#define GHCB_TERM_REGISTER 0 /* GHCB GPA registration failure */
+#define GHCB_TERM_PSC 1 /* Page State Change failure */
+#define GHCB_TERM_PVALIDATE 2 /* Pvalidate failure */
+
#define GHCB_RESP_CODE(v) ((v) & GHCB_MSR_INFO_MASK)

/*
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index ce98768..2abf8a7 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -24,15 +24,12 @@ static bool __init sev_es_check_cpu_features(void)
return true;
}

-static void __noreturn sev_es_terminate(unsigned int reason)
+static void __noreturn sev_es_terminate(unsigned int set, unsigned int reason)
{
u64 val = GHCB_MSR_TERM_REQ;

- /*
- * Tell the hypervisor what went wrong - only reason-set 0 is
- * currently supported.
- */
- val |= GHCB_SEV_TERM_REASON(0, reason);
+ /* Tell the hypervisor what went wrong. */
+ val |= GHCB_SEV_TERM_REASON(set, reason);

/* Request Guest Termination from Hypervisor */
sev_es_wr_ghcb_msr(val);
@@ -221,7 +218,7 @@ void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)

fail:
/* Terminate the guest */
- sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
}

static enum es_result vc_insn_string_read(struct es_em_ctxt *ctxt,
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index e6d316a..19ad097 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1337,7 +1337,7 @@ DEFINE_IDTENTRY_VC_KERNEL(exc_vmm_communication)
show_regs(regs);

/* Ask hypervisor to sev_es_terminate */
- sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);

/* If that fails and we get here - just panic */
panic("Returned from Terminate-Request to Hypervisor\n");
@@ -1385,7 +1385,7 @@ bool __init handle_vc_boot_ghcb(struct pt_regs *regs)

/* Do initial setup or terminate the guest */
if (unlikely(boot_ghcb == NULL && !sev_es_setup_ghcb()))
- sev_es_terminate(GHCB_SEV_ES_GEN_REQ);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);

vc_ghcb_invalidate(boot_ghcb);

Subject: [tip: x86/sev] x86/sev: Register GHCB memory when SEV-SNP is active

The following commit has been merged into the x86/sev branch of tip:

Commit-ID: 95d33bfaa3e169cfec1926e0d0f0c6b0ea75d763
Gitweb: https://git.kernel.org/tip/95d33bfaa3e169cfec1926e0d0f0c6b0ea75d763
Author: Brijesh Singh <[email protected]>
AuthorDate: Wed, 09 Feb 2022 12:10:11 -06:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Wed, 06 Apr 2022 13:16:58 +02:00

x86/sev: Register GHCB memory when SEV-SNP is active

The SEV-SNP guest is required by the GHCB spec to register the GHCB's
Guest Physical Address (GPA). This is because the hypervisor may prefer
that a guest uses a consistent and/or specific GPA for the GHCB associated
with a vCPU. For more information, see the GHCB specification section
"GHCB GPA Registration".

[ bp: Cleanup comments. ]

Signed-off-by: Brijesh Singh <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/include/asm/sev.h | 2 ++-
arch/x86/kernel/cpu/common.c | 4 +++-
arch/x86/kernel/head64.c | 4 ++-
arch/x86/kernel/sev-shared.c | 2 +-
arch/x86/kernel/sev.c | 46 +++++++++++++++++++++++++----------
5 files changed, 44 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index e374518..48df027 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -122,6 +122,7 @@ static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)

return rc;
}
+void setup_ghcb(void);
#else
static inline void sev_es_ist_enter(struct pt_regs *regs) { }
static inline void sev_es_ist_exit(void) { }
@@ -130,6 +131,7 @@ static inline void sev_es_nmi_complete(void) { }
static inline int sev_es_efi_map_ghcbs(pgd_t *pgd) { return 0; }
static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) { return 0; }
static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs) { return 0; }
+static inline void setup_ghcb(void) { }
#endif

#endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index ed44175..9e45521 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -60,6 +60,7 @@
#include <asm/uv/uv.h>
#include <asm/sigframe.h>
#include <asm/traps.h>
+#include <asm/sev.h>

#include "cpu.h"

@@ -2124,6 +2125,9 @@ void cpu_init_exception_handling(void)

load_TR_desc();

+ /* GHCB needs to be setup to handle #VC. */
+ setup_ghcb();
+
/* Finally load the IDT */
load_current_idt();
}
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index cbc285d..83514b9 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -597,8 +597,10 @@ static void startup_64_load_idt(unsigned long physbase)
void early_setup_idt(void)
{
/* VMM Communication Exception */
- if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
+ if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
+ setup_ghcb();
set_bringup_idt_handler(bringup_idt_table, X86_TRAP_VC, vc_boot_ghcb);
+ }

bringup_idt_descr.address = (unsigned long)bringup_idt_table;
native_load_idt(&bringup_idt_descr);
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index e9ff13c..3aaef1a 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -68,7 +68,7 @@ static u64 get_hv_features(void)
return GHCB_MSR_HV_FT_RESP_VAL(val);
}

-static void __maybe_unused snp_register_ghcb_early(unsigned long paddr)
+static void snp_register_ghcb_early(unsigned long paddr)
{
unsigned long pfn = paddr >> PAGE_SHIFT;
u64 val;
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index cb20fb0..39c8746 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -41,7 +41,7 @@ static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
* Needs to be in the .data section because we need it NULL before bss is
* cleared
*/
-static struct ghcb __initdata *boot_ghcb;
+static struct ghcb *boot_ghcb __section(".data");

/* Bitmap of SEV features supported by the hypervisor */
static u64 sev_hv_features __ro_after_init;
@@ -647,15 +647,39 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
return ret;
}

-/*
- * This function runs on the first #VC exception after the kernel
- * switched to virtual addresses.
- */
-static bool __init sev_es_setup_ghcb(void)
+static void snp_register_per_cpu_ghcb(void)
{
+ struct sev_es_runtime_data *data;
+ struct ghcb *ghcb;
+
+ data = this_cpu_read(runtime_data);
+ ghcb = &data->ghcb_page;
+
+ snp_register_ghcb_early(__pa(ghcb));
+}
+
+void setup_ghcb(void)
+{
+ if (!cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
+ return;
+
/* First make sure the hypervisor talks a supported protocol. */
if (!sev_es_negotiate_protocol())
- return false;
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
+
+ /*
+ * Check whether the runtime #VC exception handler is active. It uses
+ * the per-CPU GHCB page which is set up by sev_es_init_vc_handling().
+ *
+ * If SNP is active, register the per-CPU GHCB page so that the runtime
+ * exception handler can use it.
+ */
+ if (initial_vc_handler == (unsigned long)kernel_exc_vmm_communication) {
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ snp_register_per_cpu_ghcb();
+
+ return;
+ }

/*
* Clear the boot_ghcb. The first exception comes in before the bss
@@ -666,7 +690,9 @@ static bool __init sev_es_setup_ghcb(void)
/* Alright - Make the boot-ghcb public */
boot_ghcb = &boot_ghcb_page;

- return true;
+ /* SNP guest requires that GHCB GPA must be registered. */
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ snp_register_ghcb_early(__pa(&boot_ghcb_page));
}

#ifdef CONFIG_HOTPLUG_CPU
@@ -1397,10 +1423,6 @@ bool __init handle_vc_boot_ghcb(struct pt_regs *regs)
struct es_em_ctxt ctxt;
enum es_result result;

- /* Do initial setup or terminate the guest */
- if (unlikely(boot_ghcb == NULL && !sev_es_setup_ghcb()))
- sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
-
vc_ghcb_invalidate(boot_ghcb);

result = vc_init_em_ctxt(&ctxt, regs, exit_code);

Subject: [tip: x86/sev] KVM: SVM: Update the SEV-ES save area mapping

The following commit has been merged into the x86/sev branch of tip:

Commit-ID: 6d3b3d34e39eb4ee9a7dbe7e28e2588e160f9c0f
Gitweb: https://git.kernel.org/tip/6d3b3d34e39eb4ee9a7dbe7e28e2588e160f9c0f
Author: Tom Lendacky <[email protected]>
AuthorDate: Mon, 07 Mar 2022 15:33:14 -06:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Wed, 06 Apr 2022 12:19:51 +02:00

KVM: SVM: Update the SEV-ES save area mapping

This is the final step in defining the multiple save areas to keep them
separate and ensuring proper operation amongst the different types of
guests. Update the SEV-ES/SEV-SNP save area to match the APM. This save
area will be used for the upcoming SEV-SNP AP Creation NAE event support.
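
The EXPECTED_SEV_ES_SAVE_AREA_SIZE update below pairs with the build-time
size checks in svm.h; in sketch form:

  BUILD_BUG_ON(sizeof(struct sev_es_save_area) != EXPECTED_SEV_ES_SAVE_AREA_SIZE);

so a layout mistake fails the build instead of silently corrupting guest
state.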

Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Venu Busireddy <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/include/asm/svm.h | 66 ++++++++++++++++++++++++++++---------
1 file changed, 50 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 0789ad8..ec623e7 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -334,7 +334,13 @@ struct sev_es_save_area {
struct vmcb_seg ldtr;
struct vmcb_seg idtr;
struct vmcb_seg tr;
- u8 reserved_1[43];
+ u64 vmpl0_ssp;
+ u64 vmpl1_ssp;
+ u64 vmpl2_ssp;
+ u64 vmpl3_ssp;
+ u64 u_cet;
+ u8 reserved_1[2];
+ u8 vmpl;
u8 cpl;
u8 reserved_2[4];
u64 efer;
@@ -347,9 +353,19 @@ struct sev_es_save_area {
u64 dr6;
u64 rflags;
u64 rip;
- u8 reserved_4[88];
+ u64 dr0;
+ u64 dr1;
+ u64 dr2;
+ u64 dr3;
+ u64 dr0_addr_mask;
+ u64 dr1_addr_mask;
+ u64 dr2_addr_mask;
+ u64 dr3_addr_mask;
+ u8 reserved_4[24];
u64 rsp;
- u8 reserved_5[24];
+ u64 s_cet;
+ u64 ssp;
+ u64 isst_addr;
u64 rax;
u64 star;
u64 lstar;
@@ -360,7 +376,7 @@ struct sev_es_save_area {
u64 sysenter_esp;
u64 sysenter_eip;
u64 cr2;
- u8 reserved_6[32];
+ u8 reserved_5[32];
u64 g_pat;
u64 dbgctl;
u64 br_from;
@@ -369,12 +385,12 @@ struct sev_es_save_area {
u64 last_excp_to;
u8 reserved_7[80];
u32 pkru;
- u8 reserved_9[20];
- u64 reserved_10; /* rax already available at 0x01f8 */
+ u8 reserved_8[20];
+ u64 reserved_9; /* rax already available at 0x01f8 */
u64 rcx;
u64 rdx;
u64 rbx;
- u64 reserved_11; /* rsp already available at 0x01d8 */
+ u64 reserved_10; /* rsp already available at 0x01d8 */
u64 rbp;
u64 rsi;
u64 rdi;
@@ -386,16 +402,34 @@ struct sev_es_save_area {
u64 r13;
u64 r14;
u64 r15;
- u8 reserved_12[16];
- u64 sw_exit_code;
- u64 sw_exit_info_1;
- u64 sw_exit_info_2;
- u64 sw_scratch;
+ u8 reserved_11[16];
+ u64 guest_exit_info_1;
+ u64 guest_exit_info_2;
+ u64 guest_exit_int_info;
+ u64 guest_nrip;
u64 sev_features;
- u8 reserved_13[48];
+ u64 vintr_ctrl;
+ u64 guest_exit_code;
+ u64 virtual_tom;
+ u64 tlb_id;
+ u64 pcpu_id;
+ u64 event_inj;
u64 xcr0;
- u8 valid_bitmap[16];
- u64 x87_state_gpa;
+ u8 reserved_12[16];
+
+ /* Floating point area */
+ u64 x87_dp;
+ u32 mxcsr;
+ u16 x87_ftw;
+ u16 x87_fsw;
+ u16 x87_fcw;
+ u16 x87_fop;
+ u16 x87_ds;
+ u16 x87_cs;
+ u64 x87_rip;
+ u8 fpreg_x87[80];
+ u8 fpreg_xmm[256];
+ u8 fpreg_ymm[256];
} __packed;

struct ghcb_save_area {
@@ -454,7 +488,7 @@ struct ghcb {

#define EXPECTED_VMCB_SAVE_AREA_SIZE 740
#define EXPECTED_GHCB_SAVE_AREA_SIZE 1032
-#define EXPECTED_SEV_ES_SAVE_AREA_SIZE 1032
+#define EXPECTED_SEV_ES_SAVE_AREA_SIZE 1648
#define EXPECTED_VMCB_CONTROL_AREA_SIZE 1024
#define EXPECTED_GHCB_SIZE PAGE_SIZE

Subject: [tip: x86/sev] x86/sev: Add helper for validating pages in early enc attribute changes

The following commit has been merged into the x86/sev branch of tip:

Commit-ID: 5e5ccff60a2977142d39b987a8b90e422d9fc634
Gitweb: https://git.kernel.org/tip/5e5ccff60a2977142d39b987a8b90e422d9fc634
Author: Brijesh Singh <[email protected]>
AuthorDate: Wed, 09 Feb 2022 12:10:12 -06:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Wed, 06 Apr 2022 13:22:54 +02:00

x86/sev: Add helper for validating pages in early enc attribute changes

early_set_memory_{encrypted,decrypted}() are used for changing the page
state from decrypted (shared) to encrypted (private) and vice versa.

When SEV-SNP is active, the page state transition needs to go through
additional steps.

If the page is transitioned from shared to private, then perform the
following after the encryption attribute is set in the page table:

1. Issue the page state change VMGEXIT to add the page as private
   in the RMP table.
2. Validate the page after it has been successfully added in the RMP table.

To maintain the security guarantees, if the page is transitioned from
private to shared, then perform the following before clearing the
encryption attribute from the page table:

1. Invalidate the page.
2. Issue the page state change VMGEXIT to make the page shared in the
RMP table.

early_set_memory_{encrypted,decrypted}() can be called before the GHCB
is set up, so use the SNP page state MSR protocol VMGEXIT defined in the
GHCB specification to request the page state change in the RMP table.
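
For illustration only (not part of this patch), a caller of the new
helpers only picks the direction; the helpers themselves enforce the
required ordering (vaddr/paddr/npages below describe a hypothetical
range being converted):

  /* shared -> private: RMP update first, then PVALIDATE */
  early_snp_set_memory_private(vaddr, paddr, npages);

  /* ... access the memory encrypted ... */

  /* private -> shared: PVALIDATE (invalidate) first, then RMP update */
  early_snp_set_memory_shared(vaddr, paddr, npages);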

While at it, add a helper snp_prep_memory() which will be used in
probe_roms() in a later patch.

[ bp: Massage commit message. ]

Signed-off-by: Brijesh Singh <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Venu Busireddy <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/include/asm/sev.h | 10 +++-
arch/x86/kernel/sev.c | 99 +++++++++++++++++++++++++++++++++-
arch/x86/mm/mem_encrypt_amd.c | 58 +++++++++++++++++--
3 files changed, 163 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 48df027..f65d257 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -123,6 +123,11 @@ static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
return rc;
}
void setup_ghcb(void);
+void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
+ unsigned int npages);
+void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+ unsigned int npages);
+void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
#else
static inline void sev_es_ist_enter(struct pt_regs *regs) { }
static inline void sev_es_ist_exit(void) { }
@@ -132,6 +137,11 @@ static inline int sev_es_efi_map_ghcbs(pgd_t *pgd) { return 0; }
static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) { return 0; }
static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs) { return 0; }
static inline void setup_ghcb(void) { }
+static inline void __init
+early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
+static inline void __init
+early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
+static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
#endif

#endif
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 39c8746..e3fca86 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -556,6 +556,105 @@ static u64 get_jump_table_addr(void)
return ret;
}

+static void pvalidate_pages(unsigned long vaddr, unsigned int npages, bool validate)
+{
+ unsigned long vaddr_end;
+ int rc;
+
+ vaddr = vaddr & PAGE_MASK;
+ vaddr_end = vaddr + (npages << PAGE_SHIFT);
+
+ while (vaddr < vaddr_end) {
+ rc = pvalidate(vaddr, RMP_PG_SIZE_4K, validate);
+ if (WARN(rc, "Failed to validate address 0x%lx ret %d", vaddr, rc))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);
+
+ vaddr = vaddr + PAGE_SIZE;
+ }
+}
+
+static void __init early_set_pages_state(unsigned long paddr, unsigned int npages, enum psc_op op)
+{
+ unsigned long paddr_end;
+ u64 val;
+
+ paddr = paddr & PAGE_MASK;
+ paddr_end = paddr + (npages << PAGE_SHIFT);
+
+ while (paddr < paddr_end) {
+ /*
+ * Use the MSR protocol because this function can be called before
+ * the GHCB is established.
+ */
+ sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op));
+ VMGEXIT();
+
+ val = sev_es_rd_ghcb_msr();
+
+ if (WARN(GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP,
+ "Wrong PSC response code: 0x%x\n",
+ (unsigned int)GHCB_RESP_CODE(val)))
+ goto e_term;
+
+ if (WARN(GHCB_MSR_PSC_RESP_VAL(val),
+ "Failed to change page state to '%s' paddr 0x%lx error 0x%llx\n",
+ op == SNP_PAGE_STATE_PRIVATE ? "private" : "shared",
+ paddr, GHCB_MSR_PSC_RESP_VAL(val)))
+ goto e_term;
+
+ paddr = paddr + PAGE_SIZE;
+ }
+
+ return;
+
+e_term:
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
+}
+
+void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
+ unsigned int npages)
+{
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return;
+
+ /*
+ * Ask the hypervisor to mark the memory pages as private in the RMP
+ * table.
+ */
+ early_set_pages_state(paddr, npages, SNP_PAGE_STATE_PRIVATE);
+
+ /* Validate the memory pages after they've been added in the RMP table. */
+ pvalidate_pages(vaddr, npages, true);
+}
+
+void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+ unsigned int npages)
+{
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return;
+
+ /* Invalidate the memory pages before they are marked shared in the RMP table. */
+ pvalidate_pages(vaddr, npages, false);
+
+ /* Ask hypervisor to mark the memory pages shared in the RMP table. */
+ early_set_pages_state(paddr, npages, SNP_PAGE_STATE_SHARED);
+}
+
+void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
+{
+ unsigned long vaddr, npages;
+
+ vaddr = (unsigned long)__va(paddr);
+ npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+
+ if (op == SNP_PAGE_STATE_PRIVATE)
+ early_snp_set_memory_private(vaddr, paddr, npages);
+ else if (op == SNP_PAGE_STATE_SHARED)
+ early_snp_set_memory_shared(vaddr, paddr, npages);
+ else
+ WARN(1, "invalid memory op %d\n", op);
+}
+
int sev_es_setup_ap_jump_table(struct real_mode_header *rmh)
{
u16 startup_cs, startup_ip;
diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
index 6169053..8539dd6 100644
--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -31,6 +31,7 @@
#include <asm/processor-flags.h>
#include <asm/msr.h>
#include <asm/cmdline.h>
+#include <asm/sev.h>

#include "mm_internal.h"

@@ -48,6 +49,36 @@ EXPORT_SYMBOL(sme_me_mask);
static char sme_early_buffer[PAGE_SIZE] __initdata __aligned(PAGE_SIZE);

/*
+ * SNP-specific routine which needs to additionally change the page state from
+ * private to shared before copying the data from the source to the destination
+ * and restore the page state after the copy.
+ */
+static inline void __init snp_memcpy(void *dst, void *src, size_t sz,
+ unsigned long paddr, bool decrypt)
+{
+ unsigned long npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+
+ if (decrypt) {
+ /*
+ * @paddr needs to be accessed decrypted, mark the page shared in
+ * the RMP table before copying it.
+ */
+ early_snp_set_memory_shared((unsigned long)__va(paddr), paddr, npages);
+
+ memcpy(dst, src, sz);
+
+ /* Restore the page state after the memcpy. */
+ early_snp_set_memory_private((unsigned long)__va(paddr), paddr, npages);
+ } else {
+ /*
+ * @paddr needs to be accessed encrypted, no need for the page state
+ * change.
+ */
+ memcpy(dst, src, sz);
+ }
+}
+
+/*
* This routine does not change the underlying encryption setting of the
* page(s) that map this memory. It assumes that eventually the memory is
* meant to be accessed as either encrypted or decrypted but the contents
@@ -95,8 +126,13 @@ static void __init __sme_early_enc_dec(resource_size_t paddr,
* Use a temporary buffer, of cache-line multiple size, to
* avoid data corruption as documented in the APM.
*/
- memcpy(sme_early_buffer, src, len);
- memcpy(dst, sme_early_buffer, len);
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
+ snp_memcpy(sme_early_buffer, src, len, paddr, enc);
+ snp_memcpy(dst, sme_early_buffer, len, paddr, !enc);
+ } else {
+ memcpy(sme_early_buffer, src, len);
+ memcpy(dst, sme_early_buffer, len);
+ }

early_memunmap(dst, len);
early_memunmap(src, len);
@@ -322,14 +358,28 @@ static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
clflush_cache_range(__va(pa), size);

/* Encrypt/decrypt the contents in-place */
- if (enc)
+ if (enc) {
sme_early_encrypt(pa, size);
- else
+ } else {
sme_early_decrypt(pa, size);

+ /*
+ * On SNP, the page state change in the RMP table must happen
+ * before the page table updates.
+ */
+ early_snp_set_memory_shared((unsigned long)__va(pa), pa, 1);
+ }
+
/* Change the page encryption mask. */
new_pte = pfn_pte(pfn, new_prot);
set_pte_atomic(kpte, new_pte);
+
+ /*
+ * If the page is set encrypted in the page table, then update the RMP table to
+ * add this page as private.
+ */
+ if (enc)
+ early_snp_set_memory_private((unsigned long)__va(pa), pa, 1);
}

static int __init early_set_memory_enc_dec(unsigned long vaddr,

Subject: [tip: x86/sev] x86/sev: Register SEV-SNP guest request platform device

The following commit has been merged into the x86/sev branch of tip:

Commit-ID: 3a45b3753849c4a12cca2dd176c0192cd2a63e62
Gitweb: https://git.kernel.org/tip/3a45b3753849c4a12cca2dd176c0192cd2a63e62
Author: Brijesh Singh <[email protected]>
AuthorDate: Thu, 24 Feb 2022 10:56:21 -06:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Thu, 07 Apr 2022 16:47:12 +02:00

x86/sev: Register SEV-SNP guest request platform device

Version 2 of the GHCB specification provides a Non-Automatic Exit (NAE)
event type that can be used by the SEV-SNP guest to communicate with the
PSP without risk from a malicious hypervisor that wishes to read, alter,
drop or replay the messages sent.

SNP_LAUNCH_UPDATE can insert two special pages into the guest's memory:
the secrets page and the CPUID page. The PSP firmware populates the
contents of the secrets page. The secrets page contains encryption keys
used by the guest to interact with the firmware. Because the secrets
page is encrypted with the guest's memory encryption key, the hypervisor
cannot read the keys. See the SEV-SNP firmware spec for further details
on the secrets page format.

Create a platform device that the SEV-SNP guest driver can bind to in
order to get the platform resources, such as the encryption key and the
message ID, to use to communicate with the PSP. The SEV-SNP guest driver
provides a userspace interface to get the attestation report, key
derivation, extended attestation report, etc.
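
For illustration, a minimal sketch of how the guest driver might bind to
this device (the driver name must match the "snp-guest" device registered
below; snp_guest_probe is a hypothetical name and error handling is
elided):

  static int snp_guest_probe(struct platform_device *pdev)
  {
  	struct snp_guest_platform_data *data = dev_get_platdata(&pdev->dev);

  	if (!data)
  		return -ENODEV;

  	/* Map the secrets page at data->secrets_gpa and set up messaging. */
  	return 0;
  }

  static struct platform_driver snp_guest_driver = {
  	.probe	= snp_guest_probe,
  	.driver	= { .name = "snp-guest" },
  };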

Signed-off-by: Brijesh Singh <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/include/asm/sev.h | 4 +++-
arch/x86/kernel/sev.c | 56 +++++++++++++++++++++++++++++++++++++-
2 files changed, 60 insertions(+)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index c63a1d4..9c2d33f 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -95,6 +95,10 @@ struct snp_req_data {
unsigned int data_npages;
};

+struct snp_guest_platform_data {
+ u64 secrets_gpa;
+};
+
#ifdef CONFIG_AMD_MEM_ENCRYPT
extern struct static_key_false sev_es_enable_key;
extern void __sev_es_ist_enter(struct pt_regs *regs);
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 7237b41..ace43e1 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -19,6 +19,9 @@
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/cpumask.h>
+#include <linux/efi.h>
+#include <linux/platform_device.h>
+#include <linux/io.h>

#include <asm/cpu_entry_area.h>
#include <asm/stacktrace.h>
@@ -2163,3 +2166,56 @@ e_restore_irq:
return ret;
}
EXPORT_SYMBOL_GPL(snp_issue_guest_request);
+
+static struct platform_device guest_req_device = {
+ .name = "snp-guest",
+ .id = -1,
+};
+
+static u64 get_secrets_page(void)
+{
+ u64 pa_data = boot_params.cc_blob_address;
+ struct cc_blob_sev_info info;
+ void *map;
+
+ /*
+ * The CC blob contains the address of the secrets page, check if the
+ * blob is present.
+ */
+ if (!pa_data)
+ return 0;
+
+ map = early_memremap(pa_data, sizeof(info));
+ memcpy(&info, map, sizeof(info));
+ early_memunmap(map, sizeof(info));
+
+ /* smoke-test the secrets page passed */
+ if (!info.secrets_phys || info.secrets_len != PAGE_SIZE)
+ return 0;
+
+ return info.secrets_phys;
+}
+
+static int __init snp_init_platform_device(void)
+{
+ struct snp_guest_platform_data data;
+ u64 gpa;
+
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return -ENODEV;
+
+ gpa = get_secrets_page();
+ if (!gpa)
+ return -ENODEV;
+
+ data.secrets_gpa = gpa;
+ if (platform_device_add_data(&guest_req_device, &data, sizeof(data)))
+ return -ENODEV;
+
+ if (platform_device_register(&guest_req_device))
+ return -ENODEV;
+
+ pr_info("SNP guest platform device initialized.\n");
+ return 0;
+}
+device_initcall(snp_init_platform_device);

Subject: [tip: x86/sev] x86/compressed: Export and rename add_identity_map()

The following commit has been merged into the x86/sev branch of tip:

Commit-ID: a9ee679b1f8c3803490ed2eeffb688aaee56583f
Gitweb: https://git.kernel.org/tip/a9ee679b1f8c3803490ed2eeffb688aaee56583f
Author: Michael Roth <[email protected]>
AuthorDate: Thu, 24 Feb 2022 10:56:16 -06:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Thu, 07 Apr 2022 16:47:11 +02:00

x86/compressed: Export and rename add_identity_map()

SEV-specific code will need to add some additional mappings, but doing
this within ident_map_64.c requires some SEV-specific helpers to be
exported and some SEV-specific struct definitions to be pulled into
ident_map_64.c. Instead, export add_identity_map() so SEV-specific (and
other subsystem-specific) code can be better contained outside of
ident_map_64.c.

While at it, rename the function to kernel_add_identity_map(), similar
to the kernel_ident_mapping_init() function it relies upon.

No functional changes.
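
For illustration, a hypothetical SEV-side caller outside of
ident_map_64.c could then map, say, a cc_info structure (cc_info being a
placeholder pointer) with:

  kernel_add_identity_map((unsigned long)cc_info,
  			  (unsigned long)cc_info + sizeof(*cc_info));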

Suggested-by: Borislav Petkov <[email protected]>
Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/boot/compressed/ident_map_64.c | 18 +++++++++---------
arch/x86/boot/compressed/misc.h | 1 +
2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
index 613367e..99348ee 100644
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -90,7 +90,7 @@ static struct x86_mapping_info mapping_info;
/*
* Adds the specified range to the identity mappings.
*/
-static void add_identity_map(unsigned long start, unsigned long end)
+void kernel_add_identity_map(unsigned long start, unsigned long end)
{
int ret;

@@ -157,11 +157,11 @@ void initialize_identity_maps(void *rmode)
* explicitly here in case the compressed kernel does not touch them,
* or does not touch all the pages covering them.
*/
- add_identity_map((unsigned long)_head, (unsigned long)_end);
+ kernel_add_identity_map((unsigned long)_head, (unsigned long)_end);
boot_params = rmode;
- add_identity_map((unsigned long)boot_params, (unsigned long)(boot_params + 1));
+ kernel_add_identity_map((unsigned long)boot_params, (unsigned long)(boot_params + 1));
cmdline = get_cmd_line_ptr();
- add_identity_map(cmdline, cmdline + COMMAND_LINE_SIZE);
+ kernel_add_identity_map(cmdline, cmdline + COMMAND_LINE_SIZE);

/* Load the new page-table. */
sev_verify_cbit(top_level_pgt);
@@ -246,10 +246,10 @@ static int set_clr_page_flags(struct x86_mapping_info *info,
* It should already exist, but keep things generic.
*
* To map the page just read from it and fault it in if there is no
- * mapping yet. add_identity_map() can't be called here because that
- * would unconditionally map the address on PMD level, destroying any
- * PTE-level mappings that might already exist. Use assembly here so
- * the access won't be optimized away.
+ * mapping yet. kernel_add_identity_map() can't be called here because
+ * that would unconditionally map the address on PMD level, destroying
+ * any PTE-level mappings that might already exist. Use assembly here
+ * so the access won't be optimized away.
*/
asm volatile("mov %[address], %%r9"
:: [address] "g" (*(unsigned long *)address)
@@ -363,5 +363,5 @@ void do_boot_page_fault(struct pt_regs *regs, unsigned long error_code)
* Error code is sane - now identity map the 2M region around
* the faulting address.
*/
- add_identity_map(address, end);
+ kernel_add_identity_map(address, end);
}
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index ba538af..aae2722 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -156,6 +156,7 @@ static inline int count_immovable_mem_regions(void) { return 0; }
#ifdef CONFIG_X86_5LEVEL
extern unsigned int __pgtable_l5_enabled, pgdir_shift, ptrs_per_p4d;
#endif
+extern void kernel_add_identity_map(unsigned long start, unsigned long end);

/* Used by PAGE_KERN* macros: */
extern pteval_t __default_kernel_pte_mask;

Subject: [tip: x86/sev] x86/compressed/acpi: Move EFI detection to helper

The following commit has been merged into the x86/sev branch of tip:

Commit-ID: 7c4146e8885512719a50b641e9277a1712e052ff
Gitweb: https://git.kernel.org/tip/7c4146e8885512719a50b641e9277a1712e052ff
Author: Michael Roth <[email protected]>
AuthorDate: Wed, 09 Feb 2022 12:10:18 -06:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Wed, 06 Apr 2022 17:07:02 +02:00

x86/compressed/acpi: Move EFI detection to helper

Future patches for SEV-SNP-validated CPUID will also require early
parsing of the EFI configuration. Incrementally move the related code
into a set of helpers that can be re-used for that purpose.

First, carve out the functionality which determines the EFI environment
type the machine is booting on.
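
Callers can then replace open-coded loader-signature checks with the new
helper; in sketch form:

  enum efi_type et = efi_get_type(boot_params);

  if (et != EFI_TYPE_64)
  	return 0;	/* fall through to the non-EFI path */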

[ bp: Massage commit message. ]

Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/boot/compressed/Makefile | 1 +-
arch/x86/boot/compressed/acpi.c | 28 ++++++-----------
arch/x86/boot/compressed/efi.c | 50 ++++++++++++++++++++++++++++++-
arch/x86/boot/compressed/misc.h | 16 ++++++++++-
4 files changed, 77 insertions(+), 18 deletions(-)
create mode 100644 arch/x86/boot/compressed/efi.c

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 6115274..e69c3d2 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -103,6 +103,7 @@ endif
vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o

vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o
+vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi.o
efi-obj-$(CONFIG_EFI_STUB) = $(objtree)/drivers/firmware/efi/libstub/lib.a

$(obj)/vmlinux: $(vmlinux-objs-y) $(efi-obj-y) FORCE
diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
index 8bcbcee..db6c561 100644
--- a/arch/x86/boot/compressed/acpi.c
+++ b/arch/x86/boot/compressed/acpi.c
@@ -87,7 +87,7 @@ static acpi_physical_address kexec_get_rsdp_addr(void)
efi_system_table_64_t *systab;
struct efi_setup_data *esd;
struct efi_info *ei;
- char *sig;
+ enum efi_type et;

esd = (struct efi_setup_data *)get_kexec_setup_data_addr();
if (!esd)
@@ -98,10 +98,9 @@ static acpi_physical_address kexec_get_rsdp_addr(void)
return 0;
}

- ei = &boot_params->efi_info;
- sig = (char *)&ei->efi_loader_signature;
- if (strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
- debug_putstr("Wrong kexec EFI loader signature.\n");
+ et = efi_get_type(boot_params);
+ if (et != EFI_TYPE_64) {
+ debug_putstr("Unexpected kexec EFI environment (expected 64-bit EFI).\n");
return 0;
}

@@ -122,29 +121,22 @@ static acpi_physical_address efi_get_rsdp_addr(void)
unsigned long systab, config_tables;
unsigned int nr_tables;
struct efi_info *ei;
+ enum efi_type et;
bool efi_64;
- char *sig;
-
- ei = &boot_params->efi_info;
- sig = (char *)&ei->efi_loader_signature;

- if (!strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
+ et = efi_get_type(boot_params);
+ if (et == EFI_TYPE_64)
efi_64 = true;
- } else if (!strncmp(sig, EFI32_LOADER_SIGNATURE, 4)) {
+ else if (et == EFI_TYPE_32)
efi_64 = false;
- } else {
- debug_putstr("Wrong EFI loader signature.\n");
+ else
return 0;
- }

/* Get systab from boot params. */
+ ei = &boot_params->efi_info;
#ifdef CONFIG_X86_64
systab = ei->efi_systab | ((__u64)ei->efi_systab_hi << 32);
#else
- if (ei->efi_systab_hi || ei->efi_memmap_hi) {
- debug_putstr("Error getting RSDP address: EFI system table located above 4GB.\n");
- return 0;
- }
systab = ei->efi_systab;
#endif
if (!systab)
diff --git a/arch/x86/boot/compressed/efi.c b/arch/x86/boot/compressed/efi.c
new file mode 100644
index 0000000..bad0ce3
--- /dev/null
+++ b/arch/x86/boot/compressed/efi.c
@@ -0,0 +1,50 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Helpers for early access to EFI configuration table.
+ *
+ * Originally derived from arch/x86/boot/compressed/acpi.c
+ */
+
+#include "misc.h"
+#include <linux/efi.h>
+#include <asm/efi.h>
+
+/**
+ * efi_get_type - Given a pointer to boot_params, determine the type of EFI environment.
+ *
+ * @bp: pointer to boot_params
+ *
+ * Return: EFI_TYPE_{32,64} for valid EFI environments, EFI_TYPE_NONE otherwise.
+ */
+enum efi_type efi_get_type(struct boot_params *bp)
+{
+ struct efi_info *ei;
+ enum efi_type et;
+ const char *sig;
+
+ ei = &bp->efi_info;
+ sig = (char *)&ei->efi_loader_signature;
+
+ if (!strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
+ et = EFI_TYPE_64;
+ } else if (!strncmp(sig, EFI32_LOADER_SIGNATURE, 4)) {
+ et = EFI_TYPE_32;
+ } else {
+ debug_putstr("No EFI environment detected.\n");
+ et = EFI_TYPE_NONE;
+ }
+
+#ifndef CONFIG_X86_64
+ /*
+ * Existing callers like acpi.c treat this case as an indicator to
+ * fall-through to non-EFI, rather than an error, so maintain that
+ * functionality here as well.
+ */
+ if (ei->efi_systab_hi || ei->efi_memmap_hi) {
+ debug_putstr("EFI system table is located above 4GB and cannot be accessed.\n");
+ et = EFI_TYPE_NONE;
+ }
+#endif
+
+ return et;
+}
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 01cc13c..fede1af 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -176,4 +176,20 @@ void boot_stage2_vc(void);

unsigned long sev_verify_cbit(unsigned long cr3);

+enum efi_type {
+ EFI_TYPE_64,
+ EFI_TYPE_32,
+ EFI_TYPE_NONE,
+};
+
+#ifdef CONFIG_EFI
+/* helpers for early EFI config table access */
+enum efi_type efi_get_type(struct boot_params *bp);
+#else
+static inline enum efi_type efi_get_type(struct boot_params *bp)
+{
+ return EFI_TYPE_NONE;
+}
+#endif /* CONFIG_EFI */
+
#endif /* BOOT_COMPRESSED_MISC_H */

Subject: [tip: x86/sev] x86/sev: Detect/setup SEV/SME features earlier in boot

The following commit has been merged into the x86/sev branch of tip:

Commit-ID: bcce829083339bf862d66df602cbb111943da8fb
Gitweb: https://git.kernel.org/tip/bcce829083339bf862d66df602cbb111943da8fb
Author: Michael Roth <[email protected]>
AuthorDate: Wed, 09 Feb 2022 12:10:02 -06:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Wed, 06 Apr 2022 13:02:26 +02:00

x86/sev: Detect/setup SEV/SME features earlier in boot

sme_enable() handles feature detection for both SEV and SME. Future
patches will also use it for SEV-SNP feature detection/setup, which
will need to be done immediately after the first #VC handler is set up.
Move it now in preparation.

Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Venu Busireddy <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/kernel/head64.c | 3 ---
arch/x86/kernel/head_64.S | 13 +++++++++++++
2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 4f5ecbb..cbc285d 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -192,9 +192,6 @@ unsigned long __head __startup_64(unsigned long physaddr,
if (load_delta & ~PMD_PAGE_MASK)
for (;;);

- /* Activate Secure Memory Encryption (SME) if supported and enabled */
- sme_enable(bp);
-
/* Include the SME encryption mask in the fixup value */
load_delta += sme_get_me_mask();

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index b8e3019..6bf340c 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -69,6 +69,19 @@ SYM_CODE_START_NOALIGN(startup_64)
call startup_64_setup_env
popq %rsi

+#ifdef CONFIG_AMD_MEM_ENCRYPT
+ /*
+	 * Activate SEV/SME memory encryption if supported/enabled. This must be
+	 * done now, since it also includes setup of the SEV-SNP CPUID table,
+	 * which has to happen before any CPUID instructions are executed in
+	 * subsequent code.
+ */
+ movq %rsi, %rdi
+ pushq %rsi
+ call sme_enable
+ popq %rsi
+#endif
+
/* Now switch to __KERNEL_CS so IRET works reliably */
pushq $__KERNEL_CS
leaq .Lon_kernel_cs(%rip), %rax

Subject: [tip: x86/sev] x86/mm: Validate memory when changing the C-bit

The following commit has been merged into the x86/sev branch of tip:

Commit-ID: dc3f3d2474b80eaee8be89f4c5eb344f10648f42
Gitweb: https://git.kernel.org/tip/dc3f3d2474b80eaee8be89f4c5eb344f10648f42
Author: Brijesh Singh <[email protected]>
AuthorDate: Thu, 24 Feb 2022 10:56:01 -06:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Wed, 06 Apr 2022 13:24:53 +02:00

x86/mm: Validate memory when changing the C-bit

Add the needed functionality to change the page state from shared to
private and vice versa using the Page State Change VMGEXIT as
documented in the GHCB spec.
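
With the prepare/finish hooks below wired up, the generic
set_memory_{encrypted,decrypted}() paths do the right thing on SNP. A
sketch of a driver sharing one page with the hypervisor (error handling
elided):

  unsigned long addr = __get_free_page(GFP_KERNEL);

  /* On SNP: invalidates the page and issues the Page State Change VMGEXIT. */
  set_memory_decrypted(addr, 1);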

Signed-off-by: Brijesh Singh <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/include/asm/sev-common.h | 22 ++++-
arch/x86/include/asm/sev.h | 4 +-
arch/x86/include/uapi/asm/svm.h | 2 +-
arch/x86/kernel/sev.c | 168 +++++++++++++++++++++++++++++-
arch/x86/mm/mem_encrypt_amd.c | 13 ++-
5 files changed, 209 insertions(+)

diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index f077a6c..1aa72b5 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -105,6 +105,28 @@ enum psc_op {

#define GHCB_HV_FT_SNP BIT_ULL(0)

+/* SNP Page State Change NAE event */
+#define VMGEXIT_PSC_MAX_ENTRY 253
+
+struct psc_hdr {
+ u16 cur_entry;
+ u16 end_entry;
+ u32 reserved;
+} __packed;
+
+struct psc_entry {
+ u64 cur_page : 12,
+ gfn : 40,
+ operation : 4,
+ pagesize : 1,
+ reserved : 7;
+} __packed;
+
+struct snp_psc_desc {
+ struct psc_hdr hdr;
+ struct psc_entry entries[VMGEXIT_PSC_MAX_ENTRY];
+} __packed;
+
#define GHCB_MSR_TERM_REQ 0x100
#define GHCB_MSR_TERM_REASON_SET_POS 12
#define GHCB_MSR_TERM_REASON_SET_MASK 0xf
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index f65d257..feeb93e 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -128,6 +128,8 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
unsigned int npages);
void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
+void snp_set_memory_shared(unsigned long vaddr, unsigned int npages);
+void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
#else
static inline void sev_es_ist_enter(struct pt_regs *regs) { }
static inline void sev_es_ist_exit(void) { }
@@ -142,6 +144,8 @@ early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned
static inline void __init
early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
+static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npages) { }
+static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npages) { }
#endif

#endif
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index b0ad00f..64404b4 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -108,6 +108,7 @@
#define SVM_VMGEXIT_AP_JUMP_TABLE 0x80000005
#define SVM_VMGEXIT_SET_AP_JUMP_TABLE 0
#define SVM_VMGEXIT_GET_AP_JUMP_TABLE 1
+#define SVM_VMGEXIT_PSC 0x80000010
#define SVM_VMGEXIT_HV_FEATURES 0x8000fffd
#define SVM_VMGEXIT_UNSUPPORTED_EVENT 0x8000ffff

@@ -219,6 +220,7 @@
{ SVM_VMGEXIT_NMI_COMPLETE, "vmgexit_nmi_complete" }, \
{ SVM_VMGEXIT_AP_HLT_LOOP, "vmgexit_ap_hlt_loop" }, \
{ SVM_VMGEXIT_AP_JUMP_TABLE, "vmgexit_ap_jump_table" }, \
+ { SVM_VMGEXIT_PSC, "vmgexit_page_state_change" }, \
{ SVM_VMGEXIT_HV_FEATURES, "vmgexit_hypervisor_feature" }, \
{ SVM_EXIT_ERR, "invalid_guest_state" }

diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index e3fca86..bf4b578 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -655,6 +655,174 @@ void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op
WARN(1, "invalid memory op %d\n", op);
}

+static int vmgexit_psc(struct snp_psc_desc *desc)
+{
+ int cur_entry, end_entry, ret = 0;
+ struct snp_psc_desc *data;
+ struct ghcb_state state;
+ struct es_em_ctxt ctxt;
+ unsigned long flags;
+ struct ghcb *ghcb;
+
+ /*
+ * __sev_get_ghcb() needs to run with IRQs disabled because it is using
+ * a per-CPU GHCB.
+ */
+ local_irq_save(flags);
+
+ ghcb = __sev_get_ghcb(&state);
+ if (!ghcb) {
+ ret = 1;
+ goto out_unlock;
+ }
+
+ /* Copy the input desc into GHCB shared buffer */
+ data = (struct snp_psc_desc *)ghcb->shared_buffer;
+ memcpy(ghcb->shared_buffer, desc, min_t(int, GHCB_SHARED_BUF_SIZE, sizeof(*desc)));
+
+ /*
+ * As per the GHCB specification, the hypervisor can resume the guest
+ * before processing all the entries. Check whether all the entries
+ * are processed. If not, then keep retrying. Note, the hypervisor
+ * will update the data memory directly to indicate the status, so
+ * reference the data->hdr everywhere.
+ *
+ * The strategy here is to wait for the hypervisor to change the page
+	 * state in the RMP table before the guest accesses the memory pages. If
+	 * the page state change was not successful, then a later memory access
+	 * will result in a crash.
+ */
+ cur_entry = data->hdr.cur_entry;
+ end_entry = data->hdr.end_entry;
+
+ while (data->hdr.cur_entry <= data->hdr.end_entry) {
+ ghcb_set_sw_scratch(ghcb, (u64)__pa(data));
+
+		/* This call advances the entry that the shared buffer data points to. */
+ ret = sev_es_ghcb_hv_call(ghcb, true, &ctxt, SVM_VMGEXIT_PSC, 0, 0);
+
+ /*
+ * Page State Change VMGEXIT can pass error code through
+ * exit_info_2.
+ */
+ if (WARN(ret || ghcb->save.sw_exit_info_2,
+ "SNP: PSC failed ret=%d exit_info_2=%llx\n",
+ ret, ghcb->save.sw_exit_info_2)) {
+ ret = 1;
+ goto out;
+ }
+
+ /* Verify that reserved bit is not set */
+ if (WARN(data->hdr.reserved, "Reserved bit is set in the PSC header\n")) {
+ ret = 1;
+ goto out;
+ }
+
+ /*
+ * Sanity check that entry processing is not going backwards.
+		 * This will happen only if the hypervisor is tricking us.
+ */
+ if (WARN(data->hdr.end_entry > end_entry || cur_entry > data->hdr.cur_entry,
+"SNP: PSC processing going backward, end_entry %d (got %d) cur_entry %d (got %d)\n",
+ end_entry, data->hdr.end_entry, cur_entry, data->hdr.cur_entry)) {
+ ret = 1;
+ goto out;
+ }
+ }
+
+out:
+ __sev_put_ghcb(&state);
+
+out_unlock:
+ local_irq_restore(flags);
+
+ return ret;
+}
+
+static void __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
+ unsigned long vaddr_end, int op)
+{
+ struct psc_hdr *hdr;
+ struct psc_entry *e;
+ unsigned long pfn;
+ int i;
+
+ hdr = &data->hdr;
+ e = data->entries;
+
+ memset(data, 0, sizeof(*data));
+ i = 0;
+
+ while (vaddr < vaddr_end) {
+ if (is_vmalloc_addr((void *)vaddr))
+ pfn = vmalloc_to_pfn((void *)vaddr);
+ else
+ pfn = __pa(vaddr) >> PAGE_SHIFT;
+
+ e->gfn = pfn;
+ e->operation = op;
+ hdr->end_entry = i;
+
+ /*
+		 * The current SNP implementation doesn't keep track of the RMP page
+ * size so use 4K for simplicity.
+ */
+ e->pagesize = RMP_PG_SIZE_4K;
+
+ vaddr = vaddr + PAGE_SIZE;
+ e++;
+ i++;
+ }
+
+ if (vmgexit_psc(data))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
+}
+
+static void set_pages_state(unsigned long vaddr, unsigned int npages, int op)
+{
+ unsigned long vaddr_end, next_vaddr;
+ struct snp_psc_desc *desc;
+
+ desc = kmalloc(sizeof(*desc), GFP_KERNEL_ACCOUNT);
+ if (!desc)
+ panic("SNP: failed to allocate memory for PSC descriptor\n");
+
+ vaddr = vaddr & PAGE_MASK;
+ vaddr_end = vaddr + (npages << PAGE_SHIFT);
+
+ while (vaddr < vaddr_end) {
+ /* Calculate the last vaddr that fits in one struct snp_psc_desc. */
+ next_vaddr = min_t(unsigned long, vaddr_end,
+ (VMGEXIT_PSC_MAX_ENTRY * PAGE_SIZE) + vaddr);
+
+ __set_pages_state(desc, vaddr, next_vaddr, op);
+
+ vaddr = next_vaddr;
+ }
+
+ kfree(desc);
+}
+
+void snp_set_memory_shared(unsigned long vaddr, unsigned int npages)
+{
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return;
+
+ pvalidate_pages(vaddr, npages, false);
+
+ set_pages_state(vaddr, npages, SNP_PAGE_STATE_SHARED);
+}
+
+void snp_set_memory_private(unsigned long vaddr, unsigned int npages)
+{
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return;
+
+ set_pages_state(vaddr, npages, SNP_PAGE_STATE_PRIVATE);
+
+ pvalidate_pages(vaddr, npages, true);
+}
+
int sev_es_setup_ap_jump_table(struct real_mode_header *rmh)
{
u16 startup_cs, startup_ip;
diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
index 8539dd6..d3c88d9 100644
--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -316,11 +316,24 @@ static void enc_dec_hypercall(unsigned long vaddr, int npages, bool enc)

static void amd_enc_status_change_prepare(unsigned long vaddr, int npages, bool enc)
{
+ /*
+ * To maintain the security guarantees of SEV-SNP guests, make sure
+	 * to invalidate the memory before the encryption attribute is cleared.
+ */
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && !enc)
+ snp_set_memory_shared(vaddr, npages);
}

/* Return true unconditionally: return value doesn't matter for the SEV side */
static bool amd_enc_status_change_finish(unsigned long vaddr, int npages, bool enc)
{
+ /*
+ * After memory is mapped encrypted in the page table, validate it
+ * so that it is consistent with the page table updates.
+ */
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && enc)
+ snp_set_memory_private(vaddr, npages);
+
if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
enc_dec_hypercall(vaddr, npages, enc);

Subject: [tip: x86/sev] x86/sev: Add SEV-SNP feature detection/setup

The following commit has been merged into the x86/sev branch of tip:

Commit-ID: b190a043c49af4587f5e157053f909192820522a
Gitweb: https://git.kernel.org/tip/b190a043c49af4587f5e157053f909192820522a
Author: Michael Roth <[email protected]>
AuthorDate: Thu, 24 Feb 2022 10:56:18 -06:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Thu, 07 Apr 2022 16:47:11 +02:00

x86/sev: Add SEV-SNP feature detection/setup

Initial/preliminary detection of SEV-SNP is done via the Confidential
Computing blob. Check for it prior to the normal SEV/SME feature
initialization, and add some sanity checks to confirm it agrees with
SEV-SNP CPUID/MSR bits.
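
The resulting detection flow in sme_enable() is, in sketch form (see the
diff below):

  snp = snp_init(bp);	/* scan boot_params/setup_data for the CC blob */
  ...
  sev_status = __rdmsr(MSR_AMD64_SEV);
  if (snp && !(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
  	snp_abort();	/* blob present but SNP not enabled: terminate */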

Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/boot/compressed/sev.c | 27 +------------
arch/x86/include/asm/sev.h | 2 +-
arch/x86/kernel/sev-shared.c | 27 ++++++++++++-
arch/x86/kernel/sev.c | 64 +++++++++++++++++++++++++++++-
arch/x86/mm/mem_encrypt_identity.c | 8 ++++-
5 files changed, 101 insertions(+), 27 deletions(-)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 1e4930c..82079ce 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -352,33 +352,6 @@ static struct cc_blob_sev_info *find_cc_blob_efi(struct boot_params *bp)
EFI_CC_BLOB_GUID);
}

-struct cc_setup_data {
- struct setup_data header;
- u32 cc_blob_address;
-};
-
-/*
- * Search for a Confidential Computing blob passed in as a setup_data entry
- * via the Linux Boot Protocol.
- */
-static struct cc_blob_sev_info *find_cc_blob_setup_data(struct boot_params *bp)
-{
- struct cc_setup_data *sd = NULL;
- struct setup_data *hdr;
-
- hdr = (struct setup_data *)bp->hdr.setup_data;
-
- while (hdr) {
- if (hdr->type == SETUP_CC_BLOB) {
- sd = (struct cc_setup_data *)hdr;
- return (struct cc_blob_sev_info *)(unsigned long)sd->cc_blob_address;
- }
- hdr = (struct setup_data *)hdr->next;
- }
-
- return NULL;
-}
-
/*
* Initial set up of SNP relies on information provided by the
* Confidential Computing blob, which can be passed to the boot kernel
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 31b3b10..84d3d51 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -153,6 +153,7 @@ void snp_set_memory_shared(unsigned long vaddr, unsigned int npages);
void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
void snp_set_wakeup_secondary_cpu(void);
bool snp_init(struct boot_params *bp);
+void snp_abort(void);
#else
static inline void sev_es_ist_enter(struct pt_regs *regs) { }
static inline void sev_es_ist_exit(void) { }
@@ -171,6 +172,7 @@ static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npage
static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npages) { }
static inline void snp_set_wakeup_secondary_cpu(void) { }
static inline bool snp_init(struct boot_params *bp) { return false; }
+static inline void snp_abort(void) { }
#endif

#endif
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 0f13751..a7a1c0f 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -937,3 +937,30 @@ static enum es_result vc_handle_rdtsc(struct ghcb *ghcb,

return ES_OK;
}
+
+struct cc_setup_data {
+ struct setup_data header;
+ u32 cc_blob_address;
+};
+
+/*
+ * Search for a Confidential Computing blob passed in as a setup_data entry
+ * via the Linux Boot Protocol.
+ */
+static struct cc_blob_sev_info *find_cc_blob_setup_data(struct boot_params *bp)
+{
+ struct cc_setup_data *sd = NULL;
+ struct setup_data *hdr;
+
+ hdr = (struct setup_data *)bp->hdr.setup_data;
+
+ while (hdr) {
+ if (hdr->type == SETUP_CC_BLOB) {
+ sd = (struct cc_setup_data *)hdr;
+ return (struct cc_blob_sev_info *)(unsigned long)sd->cc_blob_address;
+ }
+ hdr = (struct setup_data *)hdr->next;
+ }
+
+ return NULL;
+}
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 0c2bf39..692da7b 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1974,3 +1974,67 @@ fail:
while (true)
halt();
}
+
+/*
+ * Initial set up of SNP relies on information provided by the
+ * Confidential Computing blob, which can be passed to the kernel
+ * in the following ways, depending on how it is booted:
+ *
+ * - when booted via the boot/decompress kernel:
+ * - via boot_params
+ *
+ * - when booted directly by firmware/bootloader (e.g. CONFIG_PVH):
+ * - via a setup_data entry, as defined by the Linux Boot Protocol
+ *
+ * Scan for the blob in that order.
+ */
+static __init struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp)
+{
+ struct cc_blob_sev_info *cc_info;
+
+ /* Boot kernel would have passed the CC blob via boot_params. */
+ if (bp->cc_blob_address) {
+ cc_info = (struct cc_blob_sev_info *)(unsigned long)bp->cc_blob_address;
+ goto found_cc_info;
+ }
+
+ /*
+ * If kernel was booted directly, without the use of the
+ * boot/decompression kernel, the CC blob may have been passed via
+ * setup_data instead.
+ */
+ cc_info = find_cc_blob_setup_data(bp);
+ if (!cc_info)
+ return NULL;
+
+found_cc_info:
+ if (cc_info->magic != CC_BLOB_SEV_HDR_MAGIC)
+ snp_abort();
+
+ return cc_info;
+}
+
+bool __init snp_init(struct boot_params *bp)
+{
+ struct cc_blob_sev_info *cc_info;
+
+ if (!bp)
+ return false;
+
+ cc_info = find_cc_blob(bp);
+ if (!cc_info)
+ return false;
+
+ /*
+ * The CC blob will be used later to access the secrets page. Cache
+ * it here like the boot kernel does.
+ */
+ bp->cc_blob_address = (u32)(unsigned long)cc_info;
+
+ return true;
+}
+
+void __init snp_abort(void)
+{
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+}
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index b43bc24..f415498 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -45,6 +45,7 @@
#include <asm/sections.h>
#include <asm/cmdline.h>
#include <asm/coco.h>
+#include <asm/sev.h>

#include "mm_internal.h"

@@ -509,8 +510,11 @@ void __init sme_enable(struct boot_params *bp)
bool active_by_default;
unsigned long me_mask;
char buffer[16];
+ bool snp;
u64 msr;

+ snp = snp_init(bp);
+
/* Check for the SME/SEV support leaf */
eax = 0x80000000;
ecx = 0;
@@ -542,6 +546,10 @@ void __init sme_enable(struct boot_params *bp)
sev_status = __rdmsr(MSR_AMD64_SEV);
feature_mask = (sev_status & MSR_AMD64_SEV_ENABLED) ? AMD_SEV_BIT : AMD_SME_BIT;

+ /* The SEV-SNP CC blob should never be present unless SEV-SNP is enabled. */
+ if (snp && !(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
+ snp_abort();
+
/* Check if memory encryption is enabled */
if (feature_mask == AMD_SME_BIT) {
/*

Subject: [tip: x86/sev] x86/boot: Add Confidential Computing type to setup_data

The following commit has been merged into the x86/sev branch of tip:

Commit-ID: 5ea98e01ab524cbc53dad8aebd27b434ebe5d074
Gitweb: https://git.kernel.org/tip/5ea98e01ab524cbc53dad8aebd27b434ebe5d074
Author: Brijesh Singh <[email protected]>
AuthorDate: Mon, 07 Mar 2022 15:33:39 -06:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Thu, 07 Apr 2022 16:46:33 +02:00

x86/boot: Add Confidential Computing type to setup_data

While launching encrypted guests, the hypervisor may need to provide
some additional information during the guest boot. When booting under an
EFI-based BIOS, the EFI configuration table contains an entry for the
confidential computing blob that contains the required information.

To support booting encrypted guests on non-EFI VMs, the hypervisor
needs to pass this additional information to the guest kernel using a
different method.

For this purpose, introduce a SETUP_CC_BLOB type in setup_data to hold
the physical address of the confidential computing blob. The
boot loader or hypervisor may choose to use this method instead of an
EFI configuration table. The CC blob location scanning should give
preference to a setup_data blob over an EFI configuration table.
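
For illustration, a loader or hypervisor passing the blob this way would
chain an entry like the following into boot_params.hdr.setup_data
(cc_blob_paddr is a hypothetical physical address; cc_setup_data is the
wrapper used by the kernel-side scanning code):

  struct cc_setup_data csd = {
  	.header.type	 = SETUP_CC_BLOB,
  	.header.len	 = sizeof(csd.cc_blob_address),
  	.cc_blob_address = cc_blob_paddr,
  };
  /* then link csd.header into the bp->hdr.setup_data list */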

In AMD SEV-SNP, the CC blob contains the address of the secrets and
CPUID pages. The secrets page includes information such as a VM to PSP
communication key and the CPUID page contains PSP-filtered CPUID values.
Define the AMD SEV confidential computing blob structure.

While at it, define the EFI GUID for the confidential computing blob.

[ bp: Massage commit message, mark struct __packed. ]

Signed-off-by: Brijesh Singh <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/include/asm/sev.h | 18 ++++++++++++++++++
arch/x86/include/uapi/asm/bootparam.h | 1 +
include/linux/efi.h | 1 +
3 files changed, 20 insertions(+)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index a3203b2..8c934bd 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -42,6 +42,24 @@ struct es_em_ctxt {
struct es_fault_info fi;
};

+/*
+ * AMD SEV Confidential computing blob structure. The structure is
+ * defined in OVMF UEFI firmware header:
+ * https://github.com/tianocore/edk2/blob/master/OvmfPkg/Include/Guid/ConfidentialComputingSevSnpBlob.h
+ */
+#define CC_BLOB_SEV_HDR_MAGIC 0x45444d41
+struct cc_blob_sev_info {
+ u32 magic;
+ u16 version;
+ u16 reserved;
+ u64 secrets_phys;
+ u32 secrets_len;
+ u32 rsvd1;
+ u64 cpuid_phys;
+ u32 cpuid_len;
+ u32 rsvd2;
+} __packed;
+
void do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code);

static inline u64 lower_bits(u64 val, unsigned int bits)
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index b25d3f8..1ac5acc 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -10,6 +10,7 @@
#define SETUP_EFI 4
#define SETUP_APPLE_PROPERTIES 5
#define SETUP_JAILHOUSE 6
+#define SETUP_CC_BLOB 7

#define SETUP_INDIRECT (1<<31)

diff --git a/include/linux/efi.h b/include/linux/efi.h
index ccd4d3f..984aa68 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -390,6 +390,7 @@ void efi_native_runtime_setup(void);
#define EFI_CERT_SHA256_GUID EFI_GUID(0xc1c41626, 0x504c, 0x4092, 0xac, 0xa9, 0x41, 0xf9, 0x36, 0x93, 0x43, 0x28)
#define EFI_CERT_X509_GUID EFI_GUID(0xa5c059a1, 0x94e4, 0x4aa7, 0x87, 0xb5, 0xab, 0x15, 0x5c, 0x2b, 0xf0, 0x72)
#define EFI_CERT_X509_SHA256_GUID EFI_GUID(0x3bd2a492, 0x96c0, 0x4079, 0xb4, 0x20, 0xfc, 0xf9, 0x8e, 0xf1, 0x03, 0xed)
+#define EFI_CC_BLOB_GUID EFI_GUID(0x067b1f5f, 0xcf26, 0x44c5, 0x85, 0x54, 0x93, 0xd7, 0x77, 0x91, 0x2d, 0x42)

/*
* This GUID is used to pass to the kernel proper the struct screen_info

Subject: [tip: x86/sev] x86/sev: Check the VMPL level

The following commit has been merged into the x86/sev branch of tip:

Commit-ID: 81cc3df9a90e7817494421ecc48ede6bd5e8132b
Gitweb: https://git.kernel.org/tip/81cc3df9a90e7817494421ecc48ede6bd5e8132b
Author: Brijesh Singh <[email protected]>
AuthorDate: Wed, 09 Feb 2022 12:10:08 -06:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Wed, 06 Apr 2022 13:10:34 +02:00

x86/sev: Check the VMPL level

The Virtual Machine Privilege Level (VMPL) feature in the SEV-SNP
architecture allows a guest VM to divide its address space into four
levels. The level can be used to provide hardware isolated abstraction
layers within a VM. VMPL0 is the highest privilege level, and VMPL3 is
the least privileged level. Certain operations must be done by the VMPL0
software, such as:

* Validate or invalidate memory range (PVALIDATE instruction)
* Allocate VMSA page (RMPADJUST instruction when VMSA=1)

The initial SNP support requires that the guest kernel is running at
VMPL0. Add such a check to verify the guest is running at level 0 before
continuing the boot. There is no easy method to query the current VMPL
level, so use the RMPADJUST instruction to determine whether the guest
is running at VMPL0.
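
In sketch form, the probe boils down to a single RMPADJUST on the GHCB
page (per the APM, the attrs operand encodes the target VMPL in bits 7:0
and the permission mask in bits 15:8, so attrs == 1 means "clear the
VMPL1 permission mask"):

  if (rmpadjust((unsigned long)&boot_ghcb_page, RMP_PG_SIZE_4K, 1))
  	sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_NOT_VMPL0);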

[ bp: Massage commit message. ]

Signed-off-by: Brijesh Singh <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/boot/compressed/sev.c | 28 ++++++++++++++++++++++++++--
arch/x86/include/asm/sev-common.h | 1 +
arch/x86/include/asm/sev.h | 16 ++++++++++++++++
3 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 5b38931..eb42178 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -199,6 +199,26 @@ finish:
sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
}

+static void enforce_vmpl0(void)
+{
+ u64 attrs;
+ int err;
+
+ /*
+ * RMPADJUST modifies RMP permissions of a lesser-privileged (numerically
+ * higher) privilege level. Here, clear the VMPL1 permission mask of the
+ * GHCB page. If the guest is not running at VMPL0, this will fail.
+ *
+ * If the guest is running at VMPL0, it will succeed. Even if that operation
+	 * modifies permission bits, it is still OK to do so currently because Linux
+	 * SNP guests are supported only on VMPL0, so changing the VMPL1-or-higher
+	 * permission masks is a don't-care.
+ */
+ attrs = 1;
+ if (rmpadjust((unsigned long)&boot_ghcb_page, RMP_PG_SIZE_4K, attrs))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_NOT_VMPL0);
+}
+
void sev_enable(struct boot_params *bp)
{
unsigned int eax, ebx, ecx, edx;
@@ -242,8 +262,12 @@ void sev_enable(struct boot_params *bp)
* SNP is supported in v2 of the GHCB spec which mandates support for HV
* features.
*/
- if (sev_status & MSR_AMD64_SEV_SNP_ENABLED && !(get_hv_features() & GHCB_HV_FT_SNP))
- sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+ if (sev_status & MSR_AMD64_SEV_SNP_ENABLED) {
+ if (!(get_hv_features() & GHCB_HV_FT_SNP))
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+
+ enforce_vmpl0();
+ }

sme_me_mask = BIT_ULL(ebx & 0x3f);
}
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 6f037c2..7ac5842 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -89,6 +89,7 @@
#define GHCB_TERM_REGISTER 0 /* GHCB GPA registration failure */
#define GHCB_TERM_PSC 1 /* Page State Change failure */
#define GHCB_TERM_PVALIDATE 2 /* Pvalidate failure */
+#define GHCB_TERM_NOT_VMPL0 3 /* SNP guest is not running at VMPL-0 */

#define GHCB_RESP_CODE(v) ((v) & GHCB_MSR_INFO_MASK)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 4ee9897..e374518 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -63,6 +63,9 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
/* Software defined (when rFlags.CF = 1) */
#define PVALIDATE_FAIL_NOUPDATE 255

+/* RMP page size */
+#define RMP_PG_SIZE_4K 0
+
#ifdef CONFIG_AMD_MEM_ENCRYPT
extern struct static_key_false sev_es_enable_key;
extern void __sev_es_ist_enter(struct pt_regs *regs);
@@ -90,6 +93,18 @@ extern enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb,
struct es_em_ctxt *ctxt,
u64 exit_code, u64 exit_info_1,
u64 exit_info_2);
+static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs)
+{
+ int rc;
+
+ /* "rmpadjust" mnemonic support in binutils 2.36 and newer */
+ asm volatile(".byte 0xF3,0x0F,0x01,0xFE\n\t"
+ : "=a"(rc)
+ : "a"(vaddr), "c"(rmp_psize), "d"(attrs)
+ : "memory", "cc");
+
+ return rc;
+}
static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
{
bool no_rmpupdate;
@@ -114,6 +129,7 @@ static inline int sev_es_setup_ap_jump_table(struct real_mode_header *rmh) { ret
static inline void sev_es_nmi_complete(void) { }
static inline int sev_es_efi_map_ghcbs(pgd_t *pgd) { return 0; }
static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) { return 0; }
+static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs) { return 0; }
#endif

#endif

Subject: [tip: x86/sev] x86/sev: Save the negotiated GHCB version

The following commit has been merged into the x86/sev branch of tip:

Commit-ID: 2ea29c5abbc27147c2d9e2ab5e05436aca706b65
Gitweb: https://git.kernel.org/tip/2ea29c5abbc27147c2d9e2ab5e05436aca706b65
Author: Brijesh Singh <[email protected]>
AuthorDate: Wed, 09 Feb 2022 12:10:05 -06:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Wed, 06 Apr 2022 13:10:18 +02:00

x86/sev: Save the negotiated GHCB version

The SEV-ES guest calls sev_es_negotiate_protocol() to negotiate the GHCB
protocol version before establishing the GHCB. Cache the negotiated GHCB
version so that it can be used later.

Signed-off-by: Brijesh Singh <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Venu Busireddy <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/include/asm/sev.h | 2 +-
arch/x86/kernel/sev-shared.c | 17 ++++++++++++++---
2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index ec060c4..9b9c190 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -12,7 +12,7 @@
#include <asm/insn.h>
#include <asm/sev-common.h>

-#define GHCB_PROTO_OUR 0x0001UL
+#define GHCB_PROTOCOL_MIN 1ULL
#define GHCB_PROTOCOL_MAX 1ULL
#define GHCB_DEFAULT_USAGE 0ULL

diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 2abf8a7..91105f5 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -14,6 +14,15 @@
#define has_cpuflag(f) boot_cpu_has(f)
#endif

+/*
+ * Since feature negotiation-related variables are set early in the boot
+ * process, they must reside in the .data section so as not to be zeroed
+ * out when the .bss section is later cleared.
+ *
+ * GHCB protocol version negotiated with the hypervisor.
+ */
+static u16 ghcb_version __ro_after_init;
+
static bool __init sev_es_check_cpu_features(void)
{
if (!has_cpuflag(X86_FEATURE_RDRAND)) {
@@ -51,10 +60,12 @@ static bool sev_es_negotiate_protocol(void)
if (GHCB_MSR_INFO(val) != GHCB_MSR_SEV_INFO_RESP)
return false;

- if (GHCB_MSR_PROTO_MAX(val) < GHCB_PROTO_OUR ||
- GHCB_MSR_PROTO_MIN(val) > GHCB_PROTO_OUR)
+ if (GHCB_MSR_PROTO_MAX(val) < GHCB_PROTOCOL_MIN ||
+ GHCB_MSR_PROTO_MIN(val) > GHCB_PROTOCOL_MAX)
return false;

+ ghcb_version = min_t(size_t, GHCB_MSR_PROTO_MAX(val), GHCB_PROTOCOL_MAX);
+
return true;
}

@@ -127,7 +138,7 @@ enum es_result sev_es_ghcb_hv_call(struct ghcb *ghcb, bool set_ghcb_msr,
u64 exit_info_1, u64 exit_info_2)
{
/* Fill in protocol and format specifiers */
- ghcb->protocol_version = GHCB_PROTOCOL_MAX;
+ ghcb->protocol_version = ghcb_version;
ghcb->ghcb_usage = GHCB_DEFAULT_USAGE;

ghcb_set_sw_exit_code(ghcb, exit_code);
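
The negotiation rule above amounts to a range-overlap check followed by
picking the highest mutually supported version. A standalone sketch of that
logic (the helper itself is illustrative; the macros are the ones used in
the patch):

static u16 negotiate_ghcb_version(u64 sev_info_resp)
{
	u64 hv_min = GHCB_MSR_PROTO_MIN(sev_info_resp);
	u64 hv_max = GHCB_MSR_PROTO_MAX(sev_info_resp);

	/* The guest's [MIN, MAX] range must overlap the hypervisor's */
	if (hv_max < GHCB_PROTOCOL_MIN || hv_min > GHCB_PROTOCOL_MAX)
		return 0;	/* no common version, negotiation fails */

	/* Run the highest version both sides support */
	return min_t(u16, hv_max, GHCB_PROTOCOL_MAX);
}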

Subject: [tip: x86/sev] x86/mm: Extend cc_attr to include AMD SEV-SNP

The following commit has been merged into the x86/sev branch of tip:

Commit-ID: f742b90e61bb53b27771f64bdae05db03a6ab1f2
Gitweb: https://git.kernel.org/tip/f742b90e61bb53b27771f64bdae05db03a6ab1f2
Author: Brijesh Singh <[email protected]>
AuthorDate: Thu, 24 Feb 2022 10:55:49 -06:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Wed, 06 Apr 2022 13:02:34 +02:00

x86/mm: Extend cc_attr to include AMD SEV-SNP

The CC_ATTR_GUEST_SEV_SNP can be used by the guest to query whether the
SNP (Secure Nested Paging) feature is active.

Signed-off-by: Brijesh Singh <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/coco/core.c | 3 +++
arch/x86/include/asm/msr-index.h | 2 ++
arch/x86/mm/mem_encrypt.c | 4 ++++
include/linux/cc_platform.h | 8 ++++++++
4 files changed, 17 insertions(+)

diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
index fc1365d..dafd488 100644
--- a/arch/x86/coco/core.c
+++ b/arch/x86/coco/core.c
@@ -57,6 +57,9 @@ static bool amd_cc_platform_has(enum cc_attr attr)
return (sev_status & MSR_AMD64_SEV_ENABLED) &&
!(sev_status & MSR_AMD64_SEV_ES_ENABLED);

+ case CC_ATTR_GUEST_SEV_SNP:
+ return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
+
default:
return false;
}
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 0eb90d2..ef96f16 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -502,8 +502,10 @@
#define MSR_AMD64_SEV 0xc0010131
#define MSR_AMD64_SEV_ENABLED_BIT 0
#define MSR_AMD64_SEV_ES_ENABLED_BIT 1
+#define MSR_AMD64_SEV_SNP_ENABLED_BIT 2
#define MSR_AMD64_SEV_ENABLED BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
#define MSR_AMD64_SEV_ES_ENABLED BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT)
+#define MSR_AMD64_SEV_SNP_ENABLED BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)

#define MSR_AMD64_VIRT_SPEC_CTRL 0xc001011f

diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 50d2099..f85868c 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -62,6 +62,10 @@ static void print_mem_encrypt_feature_info(void)
if (cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
pr_cont(" SEV-ES");

+ /* Secure Nested Paging */
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ pr_cont(" SEV-SNP");
+
pr_cont("\n");
}

diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
index efd8205..d08dd65 100644
--- a/include/linux/cc_platform.h
+++ b/include/linux/cc_platform.h
@@ -72,6 +72,14 @@ enum cc_attr {
* Examples include TDX guest & SEV.
*/
CC_ATTR_GUEST_UNROLL_STRING_IO,
+
+ /**
+ * @CC_ATTR_GUEST_SEV_SNP: Guest SNP is active.
+ *
+ * The platform/OS is running as a guest/virtual machine and actively
+ * using AMD SEV-SNP features.
+ */
+ CC_ATTR_GUEST_SEV_SNP,
};

#ifdef CONFIG_ARCH_HAS_CC_PLATFORM
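
An illustrative consumer (hypothetical, not from this patch): guest code
that must behave differently under SNP can now key off the new attribute
instead of reading sev_status directly:

static void __init snp_guest_example(void)
{
	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
		return;

	/*
	 * SNP is active: e.g. issue page state changes through the GHCB
	 * before flipping encryption attributes in the page tables.
	 */
}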

Subject: [tip: x86/sev] virt: sevguest: Add documentation for SEV-SNP CPUID Enforcement

The following commit has been merged into the x86/sev branch of tip:

Commit-ID: 92a99584d965b930988b28f36d925bd9675828b3
Gitweb: https://git.kernel.org/tip/92a99584d965b930988b28f36d925bd9675828b3
Author: Michael Roth <[email protected]>
AuthorDate: Thu, 24 Feb 2022 10:56:25 -06:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Thu, 07 Apr 2022 16:47:12 +02:00

virt: sevguest: Add documentation for SEV-SNP CPUID Enforcement

Update the documentation with information regarding SEV-SNP CPUID
Enforcement details and what sort of assurances it provides to guests.

Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
Documentation/virt/coco/sevguest.rst | 29 +++++++++++++++++++++++++++-
1 file changed, 29 insertions(+)

diff --git a/Documentation/virt/coco/sevguest.rst b/Documentation/virt/coco/sevguest.rst
index 625de22..bf593e8 100644
--- a/Documentation/virt/coco/sevguest.rst
+++ b/Documentation/virt/coco/sevguest.rst
@@ -118,6 +118,35 @@ be updated with the expected value.

See GHCB specification for further detail on how to parse the certificate blob.

+3. SEV-SNP CPUID Enforcement
+============================
+
+SEV-SNP guests can access a special page that contains a table of CPUID values
+that have been validated by the PSP as part of the SNP_LAUNCH_UPDATE firmware
+command. It provides the following assurances regarding the validity of CPUID
+values:
+
+ - Its address is obtained via bootloader/firmware (via CC blob), and those
+ binaries will be measured as part of the SEV-SNP attestation report.
+ - Its initial state will be encrypted/pvalidated, so attempts to modify
+ it during run-time will result in garbage being written, or #VC exceptions
+ being generated due to changes in validation state if the hypervisor tries
+ to swap the backing page.
+ - Attempts to bypass PSP checks by the hypervisor by using a normal page, or
+ a non-CPUID encrypted page will change the measurement provided by the
+ SEV-SNP attestation report.
+ - The CPUID page contents are *not* measured, but attempts to modify the
+ expected contents of a CPUID page as part of guest initialization will be
+ gated by the PSP CPUID enforcement policy checks performed on the page
+ during SNP_LAUNCH_UPDATE, and noticeable later if the guest owner
+ implements their own checks of the CPUID values.
+
+It is important to note that this last assurance is only useful if the kernel
+has taken care to make use of the SEV-SNP CPUID throughout all stages of boot.
+Otherwise, guest owner attestation provides no assurance that the kernel wasn't
+fed incorrect values at some point during boot.
+
+
Reference
---------
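
Stepping back from the diff for a moment: as a toy example of the last
assurance above ("guest owner implements their own checks of the CPUID
values"), a guest-side userspace program could cross-check an attested
expectation against what CPUID actually reports. This check is hypothetical;
the bit position is per APM Fn8000_001F:

#include <cpuid.h>
#include <stdbool.h>

static bool snp_cpuid_bit_set(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* Fn8000_001F_EAX bit 4 advertises SEV-SNP */
	if (!__get_cpuid(0x8000001f, &eax, &ebx, &ecx, &edx))
		return false;

	return eax & (1u << 4);
}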

Subject: [tip: x86/sev] KVM: SVM: Create a separate mapping for the GHCB save area

The following commit has been merged into the x86/sev branch of tip:

Commit-ID: a4690359eaec985a1351786da887df1ba92440a0
Gitweb: https://git.kernel.org/tip/a4690359eaec985a1351786da887df1ba92440a0
Author: Tom Lendacky <[email protected]>
AuthorDate: Mon, 07 Mar 2022 15:33:13 -06:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Wed, 06 Apr 2022 12:13:34 +02:00

KVM: SVM: Create a separate mapping for the GHCB save area

The initial implementation of the GHCB spec was based on trying to keep
the register state offsets the same relative to the VM save area. However,
the SEV-ES save area has since changed within the hardware, so its layout
no longer lines up with the GHCB save area.

This is the second step in defining the multiple save areas to keep them
separate and ensuring proper operation amongst the different types of
guests. Create a GHCB save area that matches the GHCB specification.

Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Venu Busireddy <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/include/asm/svm.h | 48 ++++++++++++++++++++++++++++++++++---
1 file changed, 45 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 788a43f..0789ad8 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -398,11 +398,51 @@ struct sev_es_save_area {
u64 x87_state_gpa;
} __packed;

+struct ghcb_save_area {
+ u8 reserved_1[203];
+ u8 cpl;
+ u8 reserved_2[116];
+ u64 xss;
+ u8 reserved_3[24];
+ u64 dr7;
+ u8 reserved_4[16];
+ u64 rip;
+ u8 reserved_5[88];
+ u64 rsp;
+ u8 reserved_6[24];
+ u64 rax;
+ u8 reserved_7[264];
+ u64 rcx;
+ u64 rdx;
+ u64 rbx;
+ u8 reserved_8[8];
+ u64 rbp;
+ u64 rsi;
+ u64 rdi;
+ u64 r8;
+ u64 r9;
+ u64 r10;
+ u64 r11;
+ u64 r12;
+ u64 r13;
+ u64 r14;
+ u64 r15;
+ u8 reserved_9[16];
+ u64 sw_exit_code;
+ u64 sw_exit_info_1;
+ u64 sw_exit_info_2;
+ u64 sw_scratch;
+ u8 reserved_10[56];
+ u64 xcr0;
+ u8 valid_bitmap[16];
+ u64 x87_state_gpa;
+} __packed;
+
#define GHCB_SHARED_BUF_SIZE 2032

struct ghcb {
- struct sev_es_save_area save;
- u8 reserved_save[2048 - sizeof(struct sev_es_save_area)];
+ struct ghcb_save_area save;
+ u8 reserved_save[2048 - sizeof(struct ghcb_save_area)];

u8 shared_buffer[GHCB_SHARED_BUF_SIZE];

@@ -413,6 +453,7 @@ struct ghcb {


#define EXPECTED_VMCB_SAVE_AREA_SIZE 740
+#define EXPECTED_GHCB_SAVE_AREA_SIZE 1032
#define EXPECTED_SEV_ES_SAVE_AREA_SIZE 1032
#define EXPECTED_VMCB_CONTROL_AREA_SIZE 1024
#define EXPECTED_GHCB_SIZE PAGE_SIZE
@@ -420,6 +461,7 @@ struct ghcb {
static inline void __unused_size_checks(void)
{
BUILD_BUG_ON(sizeof(struct vmcb_save_area) != EXPECTED_VMCB_SAVE_AREA_SIZE);
+ BUILD_BUG_ON(sizeof(struct ghcb_save_area) != EXPECTED_GHCB_SAVE_AREA_SIZE);
BUILD_BUG_ON(sizeof(struct sev_es_save_area) != EXPECTED_SEV_ES_SAVE_AREA_SIZE);
BUILD_BUG_ON(sizeof(struct vmcb_control_area) != EXPECTED_VMCB_CONTROL_AREA_SIZE);
BUILD_BUG_ON(sizeof(struct ghcb) != EXPECTED_GHCB_SIZE);
@@ -490,7 +532,7 @@ struct vmcb {
/* GHCB Accessor functions */

#define GHCB_BITMAP_IDX(field) \
- (offsetof(struct sev_es_save_area, field) / sizeof(u64))
+ (offsetof(struct ghcb_save_area, field) / sizeof(u64))

#define DEFINE_GHCB_ACCESSORS(field) \
static inline bool ghcb_##field##_is_valid(const struct ghcb *ghcb) \
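
For reference, DEFINE_GHCB_ACCESSORS() expands roughly along these lines for
a field such as rax; the only behavioral change from this patch is that the
bitmap index is now derived from struct ghcb_save_area:

static inline bool ghcb_rax_is_valid(const struct ghcb *ghcb)
{
	return test_bit(GHCB_BITMAP_IDX(rax),
			(unsigned long *)&ghcb->save.valid_bitmap);
}

static inline void ghcb_set_rax(struct ghcb *ghcb, u64 value)
{
	__set_bit(GHCB_BITMAP_IDX(rax),
		  (unsigned long *)&ghcb->save.valid_bitmap);

	ghcb->save.rax = value;
}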

Subject: [tip: x86/sev] x86/head/64: Re-enable stack protection

The following commit has been merged into the x86/sev branch of tip:

Commit-ID: 469693d8f62299709e8ba56d8fb3da9ea990213c
Gitweb: https://git.kernel.org/tip/469693d8f62299709e8ba56d8fb3da9ea990213c
Author: Michael Roth <[email protected]>
AuthorDate: Wed, 09 Feb 2022 12:10:17 -06:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Wed, 06 Apr 2022 17:06:55 +02:00

x86/head/64: Re-enable stack protection

Due to

103a4908ad4d ("x86/head/64: Disable stack protection for head$(BITS).o")

kernel/head{32,64}.c are compiled with -fno-stack-protector to allow
a call to set_bringup_idt_handler(), which would otherwise have stack
protection enabled with CONFIG_STACKPROTECTOR_STRONG.

While sufficient for that case, there may still be issues with calls to
any external functions that were compiled with stack protection enabled
that in turn make stack-protected calls, or if the exception handlers
set up by set_bringup_idt_handler() make calls to stack-protected
functions.

Subsequent patches for SEV-SNP CPUID validation support will introduce
both such cases. Attempting to disable stack protection for everything
in scope to address that is prohibitive since much of the code, like the
SEV-ES #VC handler, is shared code that remains in use after boot and
could benefit from having stack protection enabled. Attempting to inline
calls is brittle and can quickly balloon out to library/helper code
where that's not really an option.

Instead, re-enable stack protection for head32.c/head64.c, and make the
appropriate changes to ensure the segment used for the stack canary is
initialized in advance of any stack-protected C calls.

For head64.c:

- The BSP will enter from startup_64() and call into C code
(startup_64_setup_env()) shortly after setting up the stack, which
may result in calls to stack-protected code. Set up %gs early to allow
for this safely.
- APs will enter from secondary_startup_64*(), and %gs will be set up
soon after. There is one call to C code prior to %gs being setup
(__startup_secondary_64()), but it is only to fetch 'sme_me_mask'
global, so just load 'sme_me_mask' directly instead, and remove the
now-unused __startup_secondary_64() function.

For head32.c:

- BSPs/APs will set %fs to __BOOT_DS prior to any C calls. In recent
kernels, the compiler is configured to access the stack canary at
%fs:__stack_chk_guard [1], which overlaps with the initial per-cpu
'__stack_chk_guard' variable in the initial/"master" .data..percpu
area. This is sufficient to allow access to the canary for use
during initial startup, so no changes are needed there.

[1] 3fb0fdb3bbe7 ("x86/stackprotector/32: Make the canary into a regular percpu variable")

[ bp: Massage commit message. ]

Suggested-by: Joerg Roedel <[email protected]> #for 64-bit %gs set up
Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Brijesh Singh <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/include/asm/setup.h | 1 -
arch/x86/kernel/Makefile | 2 --
arch/x86/kernel/head64.c | 9 ---------
arch/x86/kernel/head_64.S | 24 +++++++++++++++++++++---
4 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index 896e48d..a1b107f 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -50,7 +50,6 @@ extern unsigned long saved_video_mode;
extern void reserve_standard_io_resources(void);
extern void i386_reserve_resources(void);
extern unsigned long __startup_64(unsigned long physaddr, struct boot_params *bp);
-extern unsigned long __startup_secondary_64(void);
extern void startup_64_setup_env(unsigned long physbase);
extern void early_setup_idt(void);
extern void __init do_early_exception(struct pt_regs *regs, int trapnr);
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index c41ef42..1a2dc32 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -46,8 +46,6 @@ endif
# non-deterministic coverage.
KCOV_INSTRUMENT := n

-CFLAGS_head$(BITS).o += -fno-stack-protector
-
CFLAGS_irq.o := -I $(srctree)/$(src)/../include/asm/trace

obj-y := process_$(BITS).o signal.o
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 656d2f3..c185f48 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -318,15 +318,6 @@ unsigned long __head __startup_64(unsigned long physaddr,
return sme_postprocess_startup(bp, pmd);
}

-unsigned long __startup_secondary_64(void)
-{
- /*
- * Return the SME encryption mask (if SME is active) to be used as a
- * modifier for the initial pgdir entry programmed into CR3.
- */
- return sme_get_me_mask();
-}
-
/* Wipe all early page tables except for the kernel symbol map */
static void __init reset_early_page_tables(void)
{
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 6bf340c..7bac9a4 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -65,6 +65,22 @@ SYM_CODE_START_NOALIGN(startup_64)
leaq (__end_init_task - FRAME_SIZE)(%rip), %rsp

leaq _text(%rip), %rdi
+
+ /*
+ * initial_gs points to initial fixed_percpu_data struct with storage for
+ * the stack protector canary. Global pointer fixups are needed at this
+ * stage, so apply them as is done in fixup_pointer(), and initialize %gs
+ * such that the canary can be accessed at %gs:40 for subsequent C calls.
+ */
+ movl $MSR_GS_BASE, %ecx
+ movq initial_gs(%rip), %rax
+ movq $_text, %rdx
+ subq %rdx, %rax
+ addq %rdi, %rax
+ movq %rax, %rdx
+ shrq $32, %rdx
+ wrmsr
+
pushq %rsi
call startup_64_setup_env
popq %rsi
@@ -147,9 +163,11 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
* Retrieve the modifier (SME encryption mask if SME is active) to be
* added to the initial pgdir entry that will be programmed into CR3.
*/
- pushq %rsi
- call __startup_secondary_64
- popq %rsi
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+ movq sme_me_mask, %rax
+#else
+ xorq %rax, %rax
+#endif

/* Form the CR3 value being sure to include the CR3 modifier */
addq $(init_top_pgt - __START_KERNEL_map), %rax
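
For illustration, the reason %gs must be valid first: with stack protection
enabled, the compiler brackets every protected function with canary accesses
at %gs:40, sketched below in AT&T syntax (exact codegen varies by compiler
and options):

func:
	movq	%gs:0x28, %rax		# load the canary (0x28 == 40)
	movq	%rax, 8(%rsp)		# stash a copy in the frame
	...				# function body
	movq	8(%rsp), %rcx
	xorq	%gs:0x28, %rcx		# canary must be intact on exit
	je	.Lreturn
	callq	__stack_chk_fail	# mismatch: stack smashed
.Lreturn:
	ret

Until MSR_GS_BASE points at initial_gs, the %gs:0x28 load reads from
whatever GS base was left behind, so any stack-protected call made before
the wrmsr above is undefined behavior.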

Subject: [tip: x86/sev] x86/boot: Use MSR read/write helpers instead of inline assembly

The following commit has been merged into the x86/sev branch of tip:

Commit-ID: 950d00558a920227b5703d1fcc4751cfe03853cd
Gitweb: https://git.kernel.org/tip/950d00558a920227b5703d1fcc4751cfe03853cd
Author: Michael Roth <[email protected]>
AuthorDate: Wed, 09 Feb 2022 12:10:00 -06:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Wed, 06 Apr 2022 13:02:13 +02:00

x86/boot: Use MSR read/write helpers instead of inline assembly

Update all C code to use the new boot_rdmsr()/boot_wrmsr() helpers
instead of relying on inline assembly.

Suggested-by: Borislav Petkov <[email protected]>
Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/boot/compressed/sev.c | 17 +++++++----------
arch/x86/boot/cpucheck.c | 30 +++++++++++++++---------------
2 files changed, 22 insertions(+), 25 deletions(-)

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 28bcf04..4e82101 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -22,6 +22,7 @@
#include <asm/svm.h>

#include "error.h"
+#include "../msr.h"

struct ghcb boot_ghcb_page __aligned(PAGE_SIZE);
struct ghcb *boot_ghcb;
@@ -56,23 +57,19 @@ static unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx)

static inline u64 sev_es_rd_ghcb_msr(void)
{
- unsigned long low, high;
+ struct msr m;

- asm volatile("rdmsr" : "=a" (low), "=d" (high) :
- "c" (MSR_AMD64_SEV_ES_GHCB));
+ boot_rdmsr(MSR_AMD64_SEV_ES_GHCB, &m);

- return ((high << 32) | low);
+ return m.q;
}

static inline void sev_es_wr_ghcb_msr(u64 val)
{
- u32 low, high;
+ struct msr m;

- low = val & 0xffffffffUL;
- high = val >> 32;
-
- asm volatile("wrmsr" : : "c" (MSR_AMD64_SEV_ES_GHCB),
- "a"(low), "d" (high) : "memory");
+ m.q = val;
+ boot_wrmsr(MSR_AMD64_SEV_ES_GHCB, &m);
}

static enum es_result vc_decode_insn(struct es_em_ctxt *ctxt)
diff --git a/arch/x86/boot/cpucheck.c b/arch/x86/boot/cpucheck.c
index e1478d3..fed8d13 100644
--- a/arch/x86/boot/cpucheck.c
+++ b/arch/x86/boot/cpucheck.c
@@ -27,6 +27,7 @@
#include <asm/required-features.h>
#include <asm/msr-index.h>
#include "string.h"
+#include "msr.h"

static u32 err_flags[NCAPINTS];

@@ -130,12 +131,11 @@ int check_cpu(int *cpu_level_ptr, int *req_level_ptr, u32 **err_flags_ptr)
/* If this is an AMD and we're only missing SSE+SSE2, try to
turn them on */

- u32 ecx = MSR_K7_HWCR;
- u32 eax, edx;
+ struct msr m;

- asm("rdmsr" : "=a" (eax), "=d" (edx) : "c" (ecx));
- eax &= ~(1 << 15);
- asm("wrmsr" : : "a" (eax), "d" (edx), "c" (ecx));
+ boot_rdmsr(MSR_K7_HWCR, &m);
+ m.l &= ~(1 << 15);
+ boot_wrmsr(MSR_K7_HWCR, &m);

get_cpuflags(); /* Make sure it really did something */
err = check_cpuflags();
@@ -145,28 +145,28 @@ int check_cpu(int *cpu_level_ptr, int *req_level_ptr, u32 **err_flags_ptr)
/* If this is a VIA C3, we might have to enable CX8
explicitly */

- u32 ecx = MSR_VIA_FCR;
- u32 eax, edx;
+ struct msr m;

- asm("rdmsr" : "=a" (eax), "=d" (edx) : "c" (ecx));
- eax |= (1<<1)|(1<<7);
- asm("wrmsr" : : "a" (eax), "d" (edx), "c" (ecx));
+ boot_rdmsr(MSR_VIA_FCR, &m);
+ m.l |= (1 << 1) | (1 << 7);
+ boot_wrmsr(MSR_VIA_FCR, &m);

set_bit(X86_FEATURE_CX8, cpu.flags);
err = check_cpuflags();
} else if (err == 0x01 && is_transmeta()) {
/* Transmeta might have masked feature bits in word 0 */

- u32 ecx = 0x80860004;
- u32 eax, edx;
+ struct msr m, m_tmp;
u32 level = 1;

- asm("rdmsr" : "=a" (eax), "=d" (edx) : "c" (ecx));
- asm("wrmsr" : : "a" (~0), "d" (edx), "c" (ecx));
+ boot_rdmsr(0x80860004, &m);
+ m_tmp = m;
+ m_tmp.l = ~0;
+ boot_wrmsr(0x80860004, &m_tmp);
asm("cpuid"
: "+a" (level), "=d" (cpu.flags[0])
: : "ecx", "ebx");
- asm("wrmsr" : : "a" (eax), "d" (edx), "c" (ecx));
+ boot_wrmsr(0x80860004, &m);

err = check_cpuflags();
} else if (err == 0x01 &&
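
The helpers themselves, introduced earlier in the series as
arch/x86/boot/msr.h, are thin wrappers; roughly:

static inline void boot_rdmsr(unsigned int reg, struct msr *m)
{
	asm volatile("rdmsr" : "=a" (m->l), "=d" (m->h) : "c" (reg));
}

static inline void boot_wrmsr(unsigned int reg, const struct msr *m)
{
	asm volatile("wrmsr" : : "c" (reg), "a" (m->l), "d" (m->h) : "memory");
}

with struct msr being the shared union that exposes the register pair both
as 32-bit halves (m->l, m->h) and as a single 64-bit value (m->q).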

2022-06-14 00:57:21

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v12 19/46] x86/kernel: Make the .bss..decrypted section shared in RMP table

s/Brijesh/Michael

On Mon, Mar 07, 2022, Brijesh Singh wrote:
> The encryption attribute for the .bss..decrypted section is cleared in the
> initial page table build. This is because the section contains the data
> that need to be shared between the guest and the hypervisor.
>
> When SEV-SNP is active, just clearing the encryption attribute in the
> page table is not enough. The page state need to be updated in the RMP
> table.
>
> Signed-off-by: Brijesh Singh <[email protected]>
> ---
> arch/x86/kernel/head64.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> index 83514b9827e6..656d2f3e2cf0 100644
> --- a/arch/x86/kernel/head64.c
> +++ b/arch/x86/kernel/head64.c
> @@ -143,7 +143,20 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdv
> if (sme_get_me_mask()) {
> vaddr = (unsigned long)__start_bss_decrypted;
> vaddr_end = (unsigned long)__end_bss_decrypted;
> +
> for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
> + /*
> + * On SNP, transition the page to shared in the RMP table so that
> + * it is consistent with the page table attribute change.
> + *
> + * __start_bss_decrypted has a virtual address in the high range
> + * mapping (kernel .text). PVALIDATE, by way of
> + * early_snp_set_memory_shared(), requires a valid virtual
> + * address but the kernel is currently running off of the identity
> + * mapping so use __pa() to get a *currently* valid virtual address.
> + */
> + early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);

This breaks SME on Rome and Milan when compiling with clang-13. I haven't been
able to figure out exactly what goes wrong. printk isn't functional at this point,
and interactive debug during boot on our test systems is beyond me. I can't even
verify that the bug is specific to clang because the draconian build system for our
test systems apparently is stuck pointing at gcc-4.9.

I suspect the issue is related to relocation and/or encrypting memory, as skipping
the call to early_snp_set_memory_shared() if SNP isn't active masks the issue.
I've dug through the assembly and haven't spotted a smoking gun, e.g. no obvious
use of absolute addresses.

Forcing a VM through the same path doesn't fail. I can't test an SEV guest at the
moment because INIT_EX is also broken.

The crash incurs a very, very slow reboot, and I was out of cycles to work on this
about three hours ago. If someone on the AMD side can repro, it would be much
appreciated.

2022-06-14 16:07:27

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v12 19/46] x86/kernel: Make the .bss..decrypted section shared in RMP table

On Tue, Jun 14, 2022, Sean Christopherson wrote:
> s/Brijesh/Michael
>
> On Mon, Mar 07, 2022, Brijesh Singh wrote:
> > The encryption attribute for the .bss..decrypted section is cleared in the
> > initial page table build. This is because the section contains the data
> > that need to be shared between the guest and the hypervisor.
> >
> > When SEV-SNP is active, just clearing the encryption attribute in the
> > page table is not enough. The page state need to be updated in the RMP
> > table.
> >
> > Signed-off-by: Brijesh Singh <[email protected]>
> > ---
> > arch/x86/kernel/head64.c | 13 +++++++++++++
> > 1 file changed, 13 insertions(+)
> >
> > diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> > index 83514b9827e6..656d2f3e2cf0 100644
> > --- a/arch/x86/kernel/head64.c
> > +++ b/arch/x86/kernel/head64.c
> > @@ -143,7 +143,20 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdv
> > if (sme_get_me_mask()) {
> > vaddr = (unsigned long)__start_bss_decrypted;
> > vaddr_end = (unsigned long)__end_bss_decrypted;
> > +
> > for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
> > + /*
> > + * On SNP, transition the page to shared in the RMP table so that
> > + * it is consistent with the page table attribute change.
> > + *
> > + * __start_bss_decrypted has a virtual address in the high range
> > + * mapping (kernel .text). PVALIDATE, by way of
> > + * early_snp_set_memory_shared(), requires a valid virtual
> > + * address but the kernel is currently running off of the identity
> > + * mapping so use __pa() to get a *currently* valid virtual address.
> > + */
> > + early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);
>
> This breaks SME on Rome and Milan when compiling with clang-13. I haven't been
> able to figure out exactly what goes wrong. printk isn't functional at this point,
> and interactive debug during boot on our test systems is beyond me. I can't even
> verify that the bug is specific to clang because the draconian build system for our
> test systems apparently is stuck pointing at gcc-4.9.
>
> I suspect the issue is related to relocation and/or encrypting memory, as skipping
> the call to early_snp_set_memory_shared() if SNP isn't active masks the issue.
> I've dug through the assembly and haven't spotted a smoking gun, e.g. no obvious
> use of absolute addresses.
>
> Forcing a VM through the same path doesn't fail. I can't test an SEV guest at the
> moment because INIT_EX is also broken.

The SEV INIT_EX was a PEBKAC issue. An SEV guest boots just fine with a clang-built
kernel, so either it's a finicky relocation issue or something specific to SME.

> The crash incurs a very, very slow reboot, and I was out of cycles to work on this
> about three hours ago. If someone on the AMD side can repro, it would be much
> appreciated.

2022-06-14 16:16:58

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v12 19/46] x86/kernel: Make the .bss..decrypted section shared in RMP table

On Tue, Jun 14, 2022, Tom Lendacky wrote:
> On 6/14/22 10:43, Sean Christopherson wrote:
> > On Tue, Jun 14, 2022, Sean Christopherson wrote:
> > > s/Brijesh/Michael
> > >
> > > On Mon, Mar 07, 2022, Brijesh Singh wrote:
> > > > The encryption attribute for the .bss..decrypted section is cleared in the
> > > > initial page table build. This is because the section contains the data
> > > > that need to be shared between the guest and the hypervisor.
> > > >
> > > > When SEV-SNP is active, just clearing the encryption attribute in the
> > > > page table is not enough. The page state need to be updated in the RMP
> > > > table.
> > > >
> > > > Signed-off-by: Brijesh Singh <[email protected]>
> > > > ---
> > > > arch/x86/kernel/head64.c | 13 +++++++++++++
> > > > 1 file changed, 13 insertions(+)
> > > >
> > > > diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> > > > index 83514b9827e6..656d2f3e2cf0 100644
> > > > --- a/arch/x86/kernel/head64.c
> > > > +++ b/arch/x86/kernel/head64.c
> > > > @@ -143,7 +143,20 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdv
> > > > if (sme_get_me_mask()) {
> > > > vaddr = (unsigned long)__start_bss_decrypted;
> > > > vaddr_end = (unsigned long)__end_bss_decrypted;
> > > > +
> > > > for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
> > > > + /*
> > > > + * On SNP, transition the page to shared in the RMP table so that
> > > > + * it is consistent with the page table attribute change.
> > > > + *
> > > > + * __start_bss_decrypted has a virtual address in the high range
> > > > + * mapping (kernel .text). PVALIDATE, by way of
> > > > + * early_snp_set_memory_shared(), requires a valid virtual
> > > > + * address but the kernel is currently running off of the identity
> > > > + * mapping so use __pa() to get a *currently* valid virtual address.
> > > > + */
> > > > + early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);
> > >
> > > This breaks SME on Rome and Milan when compiling with clang-13. I haven't been
> > > able to figure out exactly what goes wrong. printk isn't functional at this point,
> > > and interactive debug during boot on our test systems is beyond me. I can't even
> > > verify that the bug is specific to clang because the draconian build system for our
> > > test systems apparently is stuck pointing at gcc-4.9.
> > >
> > > I suspect the issue is related to relocation and/or encrypting memory, as skipping
> > > the call to early_snp_set_memory_shared() if SNP isn't active masks the issue.
> > > I've dug through the assembly and haven't spotted a smoking gun, e.g. no obvious
> > > use of absolute addresses.
> > >
> > > Forcing a VM through the same path doesn't fail. I can't test an SEV guest at the
> > > moment because INIT_EX is also broken.
> >
> > The SEV INIT_EX was a PEBKAC issue. An SEV guest boots just fine with a clang-built
> > kernel, so either it's a finnicky relocation issue or something specific to SME.
>
> I just built and booted 5.19-rc2 with clang-13 and SME enabled without issue:
>
> [ 4.118226] Memory Encryption Features active: AMD SME

Phooey.

> Maybe something with your kernel config? Can you send me your config?

Attached. If you can't repro, I'll find someone on our end to work on this.

Thanks!


Attachments:
.config.bz2 (27.83 kB)

2022-06-14 16:23:01

by Tom Lendacky

[permalink] [raw]
Subject: Re: [PATCH v12 19/46] x86/kernel: Make the .bss..decrypted section shared in RMP table

On 6/14/22 10:43, Sean Christopherson wrote:
> On Tue, Jun 14, 2022, Sean Christopherson wrote:
>> s/Brijesh/Michael
>>
>> On Mon, Mar 07, 2022, Brijesh Singh wrote:
>>> The encryption attribute for the .bss..decrypted section is cleared in the
>>> initial page table build. This is because the section contains the data
>>> that need to be shared between the guest and the hypervisor.
>>>
>>> When SEV-SNP is active, just clearing the encryption attribute in the
>>> page table is not enough. The page state need to be updated in the RMP
>>> table.
>>>
>>> Signed-off-by: Brijesh Singh <[email protected]>
>>> ---
>>> arch/x86/kernel/head64.c | 13 +++++++++++++
>>> 1 file changed, 13 insertions(+)
>>>
>>> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
>>> index 83514b9827e6..656d2f3e2cf0 100644
>>> --- a/arch/x86/kernel/head64.c
>>> +++ b/arch/x86/kernel/head64.c
>>> @@ -143,7 +143,20 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdv
>>> if (sme_get_me_mask()) {
>>> vaddr = (unsigned long)__start_bss_decrypted;
>>> vaddr_end = (unsigned long)__end_bss_decrypted;
>>> +
>>> for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
>>> + /*
>>> + * On SNP, transition the page to shared in the RMP table so that
>>> + * it is consistent with the page table attribute change.
>>> + *
>>> + * __start_bss_decrypted has a virtual address in the high range
>>> + * mapping (kernel .text). PVALIDATE, by way of
>>> + * early_snp_set_memory_shared(), requires a valid virtual
>>> + * address but the kernel is currently running off of the identity
>>> + * mapping so use __pa() to get a *currently* valid virtual address.
>>> + */
>>> + early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);
>>
>> This breaks SME on Rome and Milan when compiling with clang-13. I haven't been
>> able to figure out exactly what goes wrong. printk isn't functional at this point,
>> and interactive debug during boot on our test systems is beyond me. I can't even
>> verify that the bug is specific to clang because the draconian build system for our
>> test systems apparently is stuck pointing at gcc-4.9.
>>
>> I suspect the issue is related to relocation and/or encrypting memory, as skipping
>> the call to early_snp_set_memory_shared() if SNP isn't active masks the issue.
>> I've dug through the assembly and haven't spotted a smoking gun, e.g. no obvious
>> use of absolute addresses.
>>
>> Forcing a VM through the same path doesn't fail. I can't test an SEV guest at the
>> moment because INIT_EX is also broken.
>
> The SEV INIT_EX was a PEBKAC issue. An SEV guest boots just fine with a clang-built
> kernel, so either it's a finnicky relocation issue or something specific to SME.

I just built and booted 5.19-rc2 with clang-13 and SME enabled without issue:

[ 4.118226] Memory Encryption Features active: AMD SME

Maybe something with your kernel config? Can you send me your config?

Thanks,
Tom

>
>> The crash incurs a very, very slow reboot, and I was out of cycles to work on this
>> about three hours ago. If someone on the AMD side can repro, it would be much
>> appreciated.

2022-06-14 19:16:27

by Tom Lendacky

[permalink] [raw]
Subject: Re: [PATCH v12 19/46] x86/kernel: Make the .bss..decrypted section shared in RMP table

On 6/14/22 11:13, Sean Christopherson wrote:
> On Tue, Jun 14, 2022, Tom Lendacky wrote:
>> On 6/14/22 10:43, Sean Christopherson wrote:
>>> On Tue, Jun 14, 2022, Sean Christopherson wrote:
>>>> s/Brijesh/Michael
>>>>
>>>> On Mon, Mar 07, 2022, Brijesh Singh wrote:
>>>>> The encryption attribute for the .bss..decrypted section is cleared in the
>>>>> initial page table build. This is because the section contains the data
>>>>> that need to be shared between the guest and the hypervisor.
>>>>>
>>>>> When SEV-SNP is active, just clearing the encryption attribute in the
>>>>> page table is not enough. The page state need to be updated in the RMP
>>>>> table.
>>>>>
>>>>> Signed-off-by: Brijesh Singh <[email protected]>
>>>>> ---
>>>>> arch/x86/kernel/head64.c | 13 +++++++++++++
>>>>> 1 file changed, 13 insertions(+)
>>>>>
>>>>> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
>>>>> index 83514b9827e6..656d2f3e2cf0 100644
>>>>> --- a/arch/x86/kernel/head64.c
>>>>> +++ b/arch/x86/kernel/head64.c
>>>>> @@ -143,7 +143,20 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdv
>>>>> if (sme_get_me_mask()) {
>>>>> vaddr = (unsigned long)__start_bss_decrypted;
>>>>> vaddr_end = (unsigned long)__end_bss_decrypted;
>>>>> +
>>>>> for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
>>>>> + /*
>>>>> + * On SNP, transition the page to shared in the RMP table so that
>>>>> + * it is consistent with the page table attribute change.
>>>>> + *
>>>>> + * __start_bss_decrypted has a virtual address in the high range
>>>>> + * mapping (kernel .text). PVALIDATE, by way of
>>>>> + * early_snp_set_memory_shared(), requires a valid virtual
>>>>> + * address but the kernel is currently running off of the identity
>>>>> + * mapping so use __pa() to get a *currently* valid virtual address.
>>>>> + */
>>>>> + early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);
>>>>
>>>> This breaks SME on Rome and Milan when compiling with clang-13. I haven't been
>>>> able to figure out exactly what goes wrong. printk isn't functional at this point,
>>>> and interactive debug during boot on our test systems is beyond me. I can't even
>>>> verify that the bug is specific to clang because the draconian build system for our
>>>> test systems apparently is stuck pointing at gcc-4.9.
>>>>
>>>> I suspect the issue is related to relocation and/or encrypting memory, as skipping
>>>> the call to early_snp_set_memory_shared() if SNP isn't active masks the issue.
>>>> I've dug through the assembly and haven't spotted a smoking gun, e.g. no obvious
>>>> use of absolute addresses.
>>>>
>>>> Forcing a VM through the same path doesn't fail. I can't test an SEV guest at the
>>>> moment because INIT_EX is also broken.
>>>
>>> The SEV INIT_EX was a PEBKAC issue. An SEV guest boots just fine with a clang-built
>>> kernel, so either it's a finnicky relocation issue or something specific to SME.
>>
>> I just built and booted 5.19-rc2 with clang-13 and SME enabled without issue:
>>
>> [ 4.118226] Memory Encryption Features active: AMD SME
>
> Phooey.
>
>> Maybe something with your kernel config? Can you send me your config?
>
> Attached. If you can't repro, I'll find someone on our end to work on this.

I was able to repro. It dies in the cc_platform_has() code, where it is
trying to do an indirect jump based on the attribute (actually in
amd_cc_platform_has(), which I think has been inlined):

bool cc_platform_has(enum cc_attr attr)
{
ffffffff81002140: 55 push %rbp
ffffffff81002141: 48 89 e5 mov %rsp,%rbp
switch (vendor) {
ffffffff81002144: 8b 05 c6 e9 3a 01 mov 0x13ae9c6(%rip),%eax # ffffffff823b0b10 <vendor>
ffffffff8100214a: 83 f8 03 cmp $0x3,%eax
ffffffff8100214d: 74 25 je ffffffff81002174 <cc_platform_has+0x34>
ffffffff8100214f: 83 f8 02 cmp $0x2,%eax
ffffffff81002152: 74 2f je ffffffff81002183 <cc_platform_has+0x43>
ffffffff81002154: 83 f8 01 cmp $0x1,%eax
ffffffff81002157: 75 26 jne ffffffff8100217f <cc_platform_has+0x3f>
switch (attr) {
ffffffff81002159: 83 ff 05 cmp $0x5,%edi
ffffffff8100215c: 77 21 ja ffffffff8100217f <cc_platform_has+0x3f>
ffffffff8100215e: 89 f8 mov %edi,%eax
ffffffff81002160: ff 24 c5 c0 01 00 82 jmp *-0x7dfffe40(,%rax,8)

This last line is what causes the reset. I'm guessing that the jump isn't
valid because, at this point, we are still running in identity-mapped mode
rather than at a kernel virtual address.

Trying to see what the difference was between your config and mine, the
indirect jump led me to check the setting of CONFIG_RETPOLINE. Your config
did not have it enabled, so I set CONFIG_RETPOLINE=y, and with that, the
kernel boots successfully. With retpolines, the code is completely different
around here:

bool cc_platform_has(enum cc_attr attr)
{
ffffffff81001f30: 55 push %rbp
ffffffff81001f31: 48 89 e5 mov %rsp,%rbp
switch (vendor) {
ffffffff81001f34: 8b 05 26 8f 37 01 mov 0x1378f26(%rip),%eax # ffffffff8237ae60 <vendor>
ffffffff81001f3a: 83 f8 03 cmp $0x3,%eax
ffffffff81001f3d: 74 29 je ffffffff81001f68 <cc_platform_has+0x38>
ffffffff81001f3f: 83 f8 02 cmp $0x2,%eax
ffffffff81001f42: 74 33 je ffffffff81001f77 <cc_platform_has+0x47>
ffffffff81001f44: 83 f8 01 cmp $0x1,%eax
ffffffff81001f47: 75 2a jne ffffffff81001f73 <cc_platform_has+0x43>
ffffffff81001f49: 31 c0 xor %eax,%eax
switch (attr) {
ffffffff81001f4b: 83 ff 02 cmp $0x2,%edi
ffffffff81001f4e: 7f 2f jg ffffffff81001f7f <cc_platform_has+0x4f>
ffffffff81001f50: 85 ff test %edi,%edi
ffffffff81001f52: 74 47 je ffffffff81001f9b <cc_platform_has+0x6b>
ffffffff81001f54: 83 ff 01 cmp $0x1,%edi
ffffffff81001f57: 74 5b je ffffffff81001fb4 <cc_platform_has+0x84>
ffffffff81001f59: 83 ff 02 cmp $0x2,%edi
ffffffff81001f5c: 75 08 jne ffffffff81001f66 <cc_platform_has+0x36>
return sev_status & MSR_AMD64_SEV_ENABLED;
ffffffff81001f5e: 8a 05 44 3f 64 01 mov 0x1643f44(%rip),%al # ffffffff82645ea8 <sev_status>
ffffffff81001f64: 24 01 and $0x1,%al
case CC_VENDOR_HYPERV:
return hyperv_cc_platform_has(attr);
default:
return false;
}
}
ffffffff81001f66: 5d pop %rbp
ffffffff81001f67: c3 ret
switch (attr) {
ffffffff81001f68: 83 ff 07 cmp $0x7,%edi
ffffffff81001f6b: 73 06 jae ffffffff81001f73 <cc_platform_has+0x43>
ffffffff81001f6d: 40 f6 c7 01 test $0x1,%dil
ffffffff81001f71: eb 07 jmp ffffffff81001f7a <cc_platform_has+0x4a>
ffffffff81001f73: 31 c0 xor %eax,%eax
}
ffffffff81001f75: 5d pop %rbp
ffffffff81001f76: c3 ret
return attr == CC_ATTR_GUEST_MEM_ENCRYPT;
ffffffff81001f77: 83 ff 02 cmp $0x2,%edi
ffffffff81001f7a: 0f 94 c0 sete %al
}
ffffffff81001f7d: 5d pop %rbp
ffffffff81001f7e: c3 ret
switch (attr) {
.
.
.

I'm not sure if there's a way to remove the jump table optimization for
the arch/x86/coco/core.c file when retpolines aren't configured.

Thanks,
Tom

>
> Thanks!

2022-06-14 20:25:30

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v12 19/46] x86/kernel: Make the .bss..decrypted section shared in RMP table

On Tue, Jun 14, 2022, Tom Lendacky wrote:
> On 6/14/22 11:13, Sean Christopherson wrote:
> > > > > This breaks SME on Rome and Milan when compiling with clang-13. I haven't been
> > > > > able to figure out exactly what goes wrong. printk isn't functional at this point,
> > > > > and interactive debug during boot on our test systems is beyond me. I can't even
> > > > > verify that the bug is specific to clang because the draconian build system for our
> > > > > test systems apparently is stuck pointing at gcc-4.9.
> > > > >
> > > > > I suspect the issue is related to relocation and/or encrypting memory, as skipping
> > > > > the call to early_snp_set_memory_shared() if SNP isn't active masks the issue.
> > > > > I've dug through the assembly and haven't spotted a smoking gun, e.g. no obvious
> > > > > use of absolute addresses.
> > > > >
> > > > > Forcing a VM through the same path doesn't fail. I can't test an SEV guest at the
> > > > > moment because INIT_EX is also broken.
> > > >
> > > > The SEV INIT_EX was a PEBKAC issue. An SEV guest boots just fine with a clang-built
> > > > kernel, so either it's a finnicky relocation issue or something specific to SME.
> > >
> > > I just built and booted 5.19-rc2 with clang-13 and SME enabled without issue:
> > >
> > > [ 4.118226] Memory Encryption Features active: AMD SME
> >
> > Phooey.
> >
> > > Maybe something with your kernel config? Can you send me your config?
> >
> > Attached. If you can't repro, I'll find someone on our end to work on this.
>
> I was able to repro. It dies in the cc_platform_has() code, where it is
> trying to do an indirect jump based on the attribute (actually in the
> amd_cc_platform_has() which I think has been optimized in):
>
> bool cc_platform_has(enum cc_attr attr)

...

> ffffffff81002160: ff 24 c5 c0 01 00 82 jmp *-0x7dfffe40(,%rax,8)
>
> This last line is what causes the reset. I'm guessing that the jump isn't
> valid at this point because we are running in identity mapped mode and not
> with a kernel virtual address at this point.
>
> Trying to see what the difference was between your config and mine, the
> indirect jump lead me to check the setting of CONFIG_RETPOLINE. Your config
> did not have it enabled, so I set CONFIG_RETPOLINE=y, and with that, the
> kernel boots successfully.

That would explain why my VMs didn't fail, I build those kernels with CONFIG_RETPOLINE=y.

> With retpolines, the code is completely different around here:

...

> I'm not sure if there's a way to remove the jump table optimization for
> the arch/x86/coco/core.c file when retpolines aren't configured.

And for post-boot I don't think we'd want to disable any such optimizations.

A possible "fix" would be to do what sme_encrypt_kernel() does and just query
sev_status directly. But even if that works, the fragility of the boot code is
terrifying :-( I can't think of any clever solutions though.

Many thanks again Tom!

---
arch/x86/include/asm/sev.h | 4 ++++
arch/x86/kernel/head64.c | 10 +++++++---
arch/x86/kernel/sev.c | 16 +++++++++++-----
3 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 19514524f0f8..701c561fdf08 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -193,6 +193,8 @@ static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
void setup_ghcb(void);
void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
unsigned int npages);
+void __init __early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+ unsigned int npages);
void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
unsigned int npages);
void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
@@ -214,6 +216,8 @@ static inline void setup_ghcb(void) { }
static inline void __init
early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
static inline void __init
+__early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
+static inline void __init
early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npages) { }
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index bd4a34100ed0..5efab0d8e49d 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -127,7 +127,9 @@ static bool __head check_la57_support(unsigned long physaddr)
}
#endif

-static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdval_t *pmd)
+static unsigned long __head sme_postprocess_startup(struct boot_params *bp,
+ pmdval_t *pmd,
+ unsigned long physaddr)
{
unsigned long vaddr, vaddr_end;
int i;
@@ -156,7 +158,9 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdv
* address but the kernel is currently running off of the identity
* mapping so use __pa() to get a *currently* valid virtual address.
*/
- early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);
+ if (sev_status & MSR_AMD64_SEV_SNP_ENABLED)
+ __early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr),
+ PTRS_PER_PMD);

i = pmd_index(vaddr);
pmd[i] -= sme_get_me_mask();
@@ -316,7 +320,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
*/
*fixup_long(&phys_base, physaddr) += load_delta - sme_get_me_mask();

- return sme_postprocess_startup(bp, pmd);
+ return sme_postprocess_startup(bp, pmd, physaddr);
}

/* Wipe all early page tables except for the kernel symbol map */
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index c05f0124c410..48966ecc520e 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -714,12 +714,9 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
pvalidate_pages(vaddr, npages, true);
}

-void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
- unsigned int npages)
+void __init __early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+ unsigned int npages)
{
- if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
- return;
-
/* Invalidate the memory pages before they are marked shared in the RMP table. */
pvalidate_pages(vaddr, npages, false);

@@ -727,6 +724,15 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
early_set_pages_state(paddr, npages, SNP_PAGE_STATE_SHARED);
}

+void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+ unsigned int npages)
+{
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return;
+
+ __early_snp_set_memory_shared(vaddr, paddr, npages);
+}
+
void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
{
unsigned long vaddr, npages;

base-commit: b13baccc3850ca8b8cccbf8ed9912dbaa0fdf7f3
--

2022-06-16 16:25:44

by Tom Lendacky

[permalink] [raw]
Subject: Re: [PATCH v12 19/46] x86/kernel: Make the .bss..decrypted section shared in RMP table

On 6/14/22 14:52, Sean Christopherson wrote:
> On Tue, Jun 14, 2022, Tom Lendacky wrote:
>> On 6/14/22 11:13, Sean Christopherson wrote:
>>>>>> This breaks SME on Rome and Milan when compiling with clang-13. I haven't been
>>>>>> able to figure out exactly what goes wrong. printk isn't functional at this point,
>>>>>> and interactive debug during boot on our test systems is beyond me. I can't even
>>>>>> verify that the bug is specific to clang because the draconian build system for our
>>>>>> test systems apparently is stuck pointing at gcc-4.9.
>>>>>>
>>>>>> I suspect the issue is related to relocation and/or encrypting memory, as skipping
>>>>>> the call to early_snp_set_memory_shared() if SNP isn't active masks the issue.
>>>>>> I've dug through the assembly and haven't spotted a smoking gun, e.g. no obvious
>>>>>> use of absolute addresses.
>>>>>>
>>>>>> Forcing a VM through the same path doesn't fail. I can't test an SEV guest at the
>>>>>> moment because INIT_EX is also broken.
>>>>>

>
>> I'm not sure if there's a way to remove the jump table optimization for
>> the arch/x86/coco/core.c file when retpolines aren't configured.
>
> And for post-boot I don't think we'd want to disable any such optimizations.
>
> A possibled "fix" would be to do what sme_encrypt_kernel() does and just query
> sev_status directly. But even that works, the fragility of the boot code is
> terrifying :-( I can't think of any clever solutions though.

I worry that another use of cc_platform_has() could creep in at some point
and cause the same issue. Not sure how bad it would be, performance-wise,
to remove the jump table optimization for arch/x86/coco/core.c.

I guess we can wait for Boris to get back and chime in.

> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> index bd4a34100ed0..5efab0d8e49d 100644
> --- a/arch/x86/kernel/head64.c
> +++ b/arch/x86/kernel/head64.c
> @@ -127,7 +127,9 @@ static bool __head check_la57_support(unsigned long physaddr)
> }
> #endif
>
> -static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdval_t *pmd)
> +static unsigned long __head sme_postprocess_startup(struct boot_params *bp,
> + pmdval_t *pmd,
> + unsigned long physaddr)

I noticed that you added the physaddr parameter but never use it...

Thanks,
Tom

> {
> unsigned long vaddr, vaddr_end;
> int i;
> @@ -156,7 +158,9 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdv
> * address but the kernel is currently running off of the identity
> * mapping so use __pa() to get a *currently* valid virtual address.
> */
> - early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);
> + if (sev_status & MSR_AMD64_SEV_SNP_ENABLED)
> + __early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr),
> + PTRS_PER_PMD);
>
> i = pmd_index(vaddr);
> pmd[i] -= sme_get_me_mask();
> @@ -316,7 +320,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
> */
> *fixup_long(&phys_base, physaddr) += load_delta - sme_get_me_mask();
>
> - return sme_postprocess_startup(bp, pmd);
> + return sme_postprocess_startup(bp, pmd, physaddr);
> }
>
> /* Wipe all early page tables except for the kernel symbol map */

2022-06-16 16:55:22

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v12 19/46] x86/kernel: Make the .bss..decrypted section shared in RMP table

On Thu, Jun 16, 2022, Tom Lendacky wrote:
> On 6/14/22 14:52, Sean Christopherson wrote:
> > On Tue, Jun 14, 2022, Tom Lendacky wrote:
> > > On 6/14/22 11:13, Sean Christopherson wrote:
> > > > > > > This breaks SME on Rome and Milan when compiling with clang-13. I haven't been
> > > > > > > able to figure out exactly what goes wrong. printk isn't functional at this point,
> > > > > > > and interactive debug during boot on our test systems is beyond me. I can't even
> > > > > > > verify that the bug is specific to clang because the draconian build system for our
> > > > > > > test systems apparently is stuck pointing at gcc-4.9.
> > > > > > >
> > > > > > > I suspect the issue is related to relocation and/or encrypting memory, as skipping
> > > > > > > the call to early_snp_set_memory_shared() if SNP isn't active masks the issue.
> > > > > > > I've dug through the assembly and haven't spotted a smoking gun, e.g. no obvious
> > > > > > > use of absolute addresses.
> > > > > > >
> > > > > > > Forcing a VM through the same path doesn't fail. I can't test an SEV guest at the
> > > > > > > moment because INIT_EX is also broken.
> > > > > >
>
> >
> > > I'm not sure if there's a way to remove the jump table optimization for
> > > the arch/x86/coco/core.c file when retpolines aren't configured.
> >
> > And for post-boot I don't think we'd want to disable any such optimizations.
> >
> > A possibled "fix" would be to do what sme_encrypt_kernel() does and just query
> > sev_status directly. But even that works, the fragility of the boot code is
> > terrifying :-( I can't think of any clever solutions though.
>
> I worry that another use of cc_platform_has() could creep in at some point
> and cause the same issue. Not sure how bad it would be, performance-wise, to
> remove the jump table optimization for arch/x86/coco/core.c.

One thought would be to initialize "vendor" to a bogus value, disallow calls to
cc_set_vendor() until after the kernel has gotten to a safe point, and then WARN
(or panic?) if cc_platform_has() is called before "vendor" is explicitly set.
New calls can still get in, but they'll be much easier to detect and less likely
to escape initial testing.

diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
index 49b44f881484..803220cd34a6 100644
--- a/arch/x86/coco/core.c
+++ b/arch/x86/coco/core.c
@@ -13,7 +13,11 @@
#include <asm/coco.h>
#include <asm/processor.h>

-static enum cc_vendor vendor __ro_after_init;
+/*
+ * Initialize the vendor to garbage to detect usage of cc_platform_has() before
+ * the vendor has been set.
+ */
+static enum cc_vendor vendor __ro_after_init = CC_NR_VENDORS;
static u64 cc_mask __ro_after_init;

static bool intel_cc_platform_has(enum cc_attr attr)
@@ -90,7 +94,10 @@ bool cc_platform_has(enum cc_attr attr)
return intel_cc_platform_has(attr);
case CC_VENDOR_HYPERV:
return hyperv_cc_platform_has(attr);
+ case CC_VENDOR_NONE:
+ return false;
default:
+ WARN_ONCE(1, "blah blah blah");
return false;
}
}
diff --git a/arch/x86/include/asm/coco.h b/arch/x86/include/asm/coco.h
index 3d98c3a60d34..adfd2fbce7ac 100644
--- a/arch/x86/include/asm/coco.h
+++ b/arch/x86/include/asm/coco.h
@@ -9,6 +9,7 @@ enum cc_vendor {
CC_VENDOR_AMD,
CC_VENDOR_HYPERV,
CC_VENDOR_INTEL,
+ CC_NR_VENDORS,
};

void cc_set_vendor(enum cc_vendor v);

> I guess we can wait for Boris to get back and chime in.
>
> > diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> > index bd4a34100ed0..5efab0d8e49d 100644
> > --- a/arch/x86/kernel/head64.c
> > +++ b/arch/x86/kernel/head64.c
> > @@ -127,7 +127,9 @@ static bool __head check_la57_support(unsigned long physaddr)
> > }
> > #endif
> >
> > -static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdval_t *pmd)
> > +static unsigned long __head sme_postprocess_startup(struct boot_params *bp,
> > + pmdval_t *pmd,
> > + unsigned long physaddr)
>
> I noticed that you added the physaddr parameter but never use it...

Likely just garbage on my end, I was trying various ideas.

2022-07-01 17:17:14

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v12 19/46] x86/kernel: Make the .bss..decrypted section shared in RMP table

On Thu, Jun 16, 2022 at 04:41:05PM +0000, Sean Christopherson wrote:
> > I worry that another use of cc_platform_has() could creep in at some point
> > and cause the same issue. Not sure how bad it would be, performance-wise, to
> > remove the jump table optimization for arch/x86/coco/core.c.

Is there a gcc switch for that?

> One thought would be to initialize "vendor" to a bogus value, disallow calls to
> cc_set_vendor() until after the kernel as gotten to a safe point, and then WARN
> (or panic?) if cc_platform_has() is called before "vendor" is explicitly set.
> New calls can still get in, but they'll be much easier to detect and less likely
> to escape initial testing.

The invalid vendor thing makes sense but I don't think it'll help in
this case.

We set vendor in sme_enable() which comes before the

__startup_64 -> sme_postprocess_startup

path you're hitting.

We could keep only the checking aspect, i.e., warn when the vendor hasn't
been set yet, in order to make the usage more robust...

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2022-07-07 20:48:04

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v12 19/46] x86/kernel: Make the .bss..decrypted section shared in RMP table

On Fri, Jul 01, 2022, Borislav Petkov wrote:
> On Thu, Jun 16, 2022 at 04:41:05PM +0000, Sean Christopherson wrote:
> > > I worry that another use of cc_platform_has() could creep in at some point
> > > and cause the same issue. Not sure how bad it would be, performance-wise, to
> > > remove the jump table optimization for arch/x86/coco/core.c.
>
> Is there a gcc switch for that?

I believe -fno-jump-tables will do the trick. That also reminds me exactly why
CONFIG_RETPOLINE=y isn't broken, jump tables are disabled when retpolines are enabled[*].

[*] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86952
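
A per-file disable would presumably be a one-liner, e.g. (placement in
arch/x86/coco/Makefile assumed):

CFLAGS_core.o += -fno-jump-tables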

> > One thought would be to initialize "vendor" to a bogus value, disallow calls to
> > cc_set_vendor() until after the kernel as gotten to a safe point, and then WARN
> > (or panic?) if cc_platform_has() is called before "vendor" is explicitly set.
> > New calls can still get in, but they'll be much easier to detect and less likely
> > to escape initial testing.
>
> The invalid vendor thing makes sense but I don't think it'll help in
> this case.
>
> We set vendor in sme_enable() which comes before the
>
> __startup_64 -> sme_postprocess_startup
>
> path you're hitting.

Right, but that's easily solved, no? E.g.

diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
index e8f7953fda83..ed3118f5bf62 100644
--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -487,6 +487,8 @@ void __init sme_early_init(void)
if (!sme_me_mask)
return;

+ cc_set_vendor(CC_VENDOR_AMD);
+
early_pmd_flags = __sme_set(early_pmd_flags);

__supported_pte_mask = __sme_set(__supported_pte_mask);
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index f415498d3175..6b1c60032400 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -611,7 +611,6 @@ void __init sme_enable(struct boot_params *bp)
out:
if (sme_me_mask) {
physical_mask &= ~sme_me_mask;
- cc_set_vendor(CC_VENDOR_AMD);
cc_set_mask(sme_me_mask);
}
}

And disallow cc_set_vendor() before x86_64_start_kernel(), then fix any fallout.

> We could do only the aspect of checking whether it hasn't been set yet
> and warn then, in order to make the usage more robust...
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

2022-07-17 06:16:09

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH v12 29/46] x86/boot: Add Confidential Computing type to setup_data

On April 7, 2022 7:57:22 AM PDT, Brijesh Singh <[email protected]> wrote:
>
>
>On 4/6/22 16:19, Thomas Gleixner wrote:
>> On Mon, Mar 07 2022 at 15:33, Brijesh Singh wrote:
>>>
>>> +/*
>>> + * AMD SEV Confidential computing blob structure. The structure is
>>> + * defined in OVMF UEFI firmware header:
>>> + * https://github.com/tianocore/edk2/blob/master/OvmfPkg/Include/Guid/ConfidentialComputingSevSnpBlob.h
>>> + */
>>> +#define CC_BLOB_SEV_HDR_MAGIC 0x45444d41
>>> +struct cc_blob_sev_info {
>>> + u32 magic;
>>> + u16 version;
>>> + u16 reserved;
>>> + u64 secrets_phys;
>>> + u32 secrets_len;
>>> + u32 rsvd1;
>>> + u64 cpuid_phys;
>>> + u32 cpuid_len;
>>> + u32 rsvd2;
>>> +};
>>
>> Shouldn't this be packed?
>>
>
>Yep, to avoid any additional compiler alignment we should pack it.
>
>thanks

It shouldn't be *necessary*, any more than it is necessary for kernel-user space structures, since EFI is a C-based ABI. On x86 it doesn't hurt, either, though, so might as well.
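
For reference, the change being agreed on above is simply appending the
attribute to the structure definition quoted earlier in the thread:

struct cc_blob_sev_info {
	u32 magic;
	u16 version;
	u16 reserved;
	u64 secrets_phys;
	u32 secrets_len;
	u32 rsvd1;
	u64 cpuid_phys;
	u32 cpuid_len;
	u32 rsvd2;
} __packed;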