Received: by 2002:a05:6a10:144:0:0:0:0 with SMTP id 4csp514530pxw; Fri, 8 Apr 2022 13:28:38 -0700 (PDT) X-Google-Smtp-Source: ABdhPJz+qKOOyb9VUvK1nShYqUS1izyR6C/vrq0RBylLhn5nsInzbH8Lp6kv09CItLRzo4ei4+oE X-Received: by 2002:a17:902:82c8:b0:156:8918:8db5 with SMTP id u8-20020a17090282c800b0015689188db5mr20717709plz.124.1649449718266; Fri, 08 Apr 2022 13:28:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1649449718; cv=none; d=google.com; s=arc-20160816; b=V7yaqMv2RuRB3lu0D4OHWFSmn1cLSU5f6teU9vzvKAfWv7YJtea+T6QPCAink5Xpte tEBugrXgtHchaimA7XpKwKCi5npXgoSCDtdwM4UuIcHOuYwJuSJD8dHaQ+w1QGV3S2MW cD6+cLLWFcCdJ/ZBvowZfnePkVymUk9cNqFnczZ2T5zRezfO/qm9cnqWBszJK8OjouMq NadPGIr64VbXLsSuoWvRsNK49VyXqaLiImbMreXYFLSc4iem3V8kSDbYdXnkTDhbFt1P VZGRmQiwtL+paoUnlQXxVHb5WHaXx/nuFjbYwx3fcPe7dY+DSzgoB4qw0NmR7mPku15b NjfA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:robot-unsubscribe :robot-id:message-id:mime-version:references:in-reply-to:cc:subject :to:reply-to:sender:from:dkim-signature:dkim-signature:date; bh=iwojwRcQcDjGFBBu8xOojK+y4o745zdudHBrtwX2Td8=; b=WelKEN5H5BLXlWyrpBlJT7GUhAcnHeROBcy456r8d/tIASyb27loXIPHRbGUyLgxWV qPlP1LvwXH8DploKt7tEaxtcUDFcqLlfTT1/DePyH///nw/k5LzrCMBbi9g8CWPqqTcV +3mTdTBT1gzVbPEdJmmJcWeX7iNYS3NPjccTlncezDLaJfmjMyA6Fp89EkYnTqwgbGAY HFw+P5w1r6B3ErT1AiwhjJSiIcBN11iwJ/iZMK0TzoBiv7wxnpmj+oJuFil4zjcUI5Jo XrWFzGMXXQM5rqIgoNUnAp3SZgidRpYuvvI+cTKUKmswJEkjQHWL1xZaTUC7rlE1NEyQ ibMg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=bnSph3kM; dkim=neutral (no key) header.i=@linutronix.de; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id j2-20020a63e742000000b00398be20ca1csi2220425pgk.537.2022.04.08.13.28.23; Fri, 08 Apr 2022 13:28:38 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=bnSph3kM; dkim=neutral (no key) header.i=@linutronix.de; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233265AbiDHJMg (ORCPT + 99 others); Fri, 8 Apr 2022 05:12:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36604 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232685AbiDHJLN (ORCPT ); Fri, 8 Apr 2022 05:11:13 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BD59512C6EE; Fri, 8 Apr 2022 02:08:53 -0700 (PDT) Date: Fri, 08 Apr 2022 09:08:50 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1649408932; h=from:from:sender:sender:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=iwojwRcQcDjGFBBu8xOojK+y4o745zdudHBrtwX2Td8=; b=bnSph3kMl9kf/a88V50XfU0QyMxdO8hOURgy4C0hb8TbKiRKhu5vth5gSePM1XOxCH/r3K Sl0PjSp632aUnAw/+UVmQI6vutlyEEFOT266F3kOvNf7LwmNcEqKo6AjpfHAEVitSEnJhe PYjJbP7nw1ArUZ0Y3qJRR4nrHwVlKlaRbNoEkjILAB/9AwyV05dLhjT9I2gFTsRogeNcPG oxUZUvK+9hDOWE+vGz+iUEugNNPwk3TEGAUE/5Sf9cXdSPHhOsdZVy9Jm5PaqMYstFDIV2 opg0ViG7eQvJXYFY3UBaczWlZvWQDBX9LmvC4eG2gzAfYW+clSR4x5xc1PNgKg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1649408932; h=from:from:sender:sender:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=iwojwRcQcDjGFBBu8xOojK+y4o745zdudHBrtwX2Td8=; b=AfEkByd5Q8r45Vs9W2ozJOOZE6Mq67PfijPWmrNp71/lH3rcrcGiyG0mdXolVWzcI3avSp QF/pbETH+cKT1/Bg== From: "tip-bot2 for Michael Roth" Sender: tip-bot2@linutronix.de Reply-to: linux-kernel@vger.kernel.org To: linux-tip-commits@vger.kernel.org Subject: [tip: x86/sev] x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers Cc: Michael Roth , Brijesh Singh , Borislav Petkov , x86@kernel.org, linux-kernel@vger.kernel.org In-Reply-To: <20220307213356.2797205-33-brijesh.singh@amd.com> References: <20220307213356.2797205-33-brijesh.singh@amd.com> MIME-Version: 1.0 Message-ID: <164940893100.389.14709127887160231666.tip-bot2@tip-bot2> Robot-ID: Robot-Unsubscribe: Contact to get blacklisted from these emails Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org The following commit has been merged into the x86/sev branch of tip: Commit-ID: ee0bfa08a345370df28c07288e886abcbaac481f Gitweb: https://git.kernel.org/tip/ee0bfa08a345370df28c07288e886abcbaac481f Author: Michael Roth AuthorDate: Thu, 24 Feb 2022 10:56:12 -06:00 Committer: Borislav Petkov CommitterDate: Thu, 07 Apr 2022 16:47:11 +02:00 x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers CPUID instructions generate a #VC exception for SEV-ES/SEV-SNP guests, for which early handlers are currently set up to handle. In the case of SEV-SNP, guests can use a configurable location in guest memory that has been pre-populated with a firmware-validated CPUID table to look up the relevant CPUID values rather than requesting them from hypervisor via a VMGEXIT. Add the various hooks in the #VC handlers to allow CPUID instructions to be handled via the table. The code to actually configure/enable the table will be added in a subsequent commit. Signed-off-by: Michael Roth Signed-off-by: Brijesh Singh Signed-off-by: Borislav Petkov Link: https://lore.kernel.org/r/20220307213356.2797205-33-brijesh.singh@amd.com --- arch/x86/include/asm/sev-common.h | 2 +- arch/x86/kernel/sev-shared.c | 324 +++++++++++++++++++++++++++++- 2 files changed, 326 insertions(+) diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h index e9b6815..0759af9 100644 --- a/arch/x86/include/asm/sev-common.h +++ b/arch/x86/include/asm/sev-common.h @@ -152,6 +152,8 @@ struct snp_psc_desc { #define GHCB_TERM_PSC 1 /* Page State Change failure */ #define GHCB_TERM_PVALIDATE 2 /* Pvalidate failure */ #define GHCB_TERM_NOT_VMPL0 3 /* SNP guest is not running at VMPL-0 */ +#define GHCB_TERM_CPUID 4 /* CPUID-validation failure */ +#define GHCB_TERM_CPUID_HV 5 /* CPUID failure during hypervisor fallback */ #define GHCB_RESP_CODE(v) ((v) & GHCB_MSR_INFO_MASK) diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c index b4d5558..0f13751 100644 --- a/arch/x86/kernel/sev-shared.c +++ b/arch/x86/kernel/sev-shared.c @@ -25,6 +25,36 @@ struct cpuid_leaf { }; /* + * Individual entries of the SNP CPUID table, as defined by the SNP + * Firmware ABI, Revision 0.9, Section 7.1, Table 14. + */ +struct snp_cpuid_fn { + u32 eax_in; + u32 ecx_in; + u64 xcr0_in; + u64 xss_in; + u32 eax; + u32 ebx; + u32 ecx; + u32 edx; + u64 __reserved; +} __packed; + +/* + * SNP CPUID table, as defined by the SNP Firmware ABI, Revision 0.9, + * Section 8.14.2.6. Also noted there is the SNP firmware-enforced limit + * of 64 entries per CPUID table. + */ +#define SNP_CPUID_COUNT_MAX 64 + +struct snp_cpuid_table { + u32 count; + u32 __reserved1; + u64 __reserved2; + struct snp_cpuid_fn fn[SNP_CPUID_COUNT_MAX]; +} __packed; + +/* * Since feature negotiation related variables are set early in the boot * process they must reside in the .data section so as not to be zeroed * out when the .bss section is later cleared. @@ -33,6 +63,19 @@ struct cpuid_leaf { */ static u16 ghcb_version __ro_after_init; +/* Copy of the SNP firmware's CPUID page. */ +static struct snp_cpuid_table cpuid_table_copy __ro_after_init; + +/* + * These will be initialized based on CPUID table so that non-present + * all-zero leaves (for sparse tables) can be differentiated from + * invalid/out-of-range leaves. This is needed since all-zero leaves + * still need to be post-processed. + */ +static u32 cpuid_std_range_max __ro_after_init; +static u32 cpuid_hyp_range_max __ro_after_init; +static u32 cpuid_ext_range_max __ro_after_init; + static bool __init sev_es_check_cpu_features(void) { if (!has_cpuflag(X86_FEATURE_RDRAND)) { @@ -243,6 +286,252 @@ static int sev_cpuid_hv(struct cpuid_leaf *leaf) } /* + * This may be called early while still running on the initial identity + * mapping. Use RIP-relative addressing to obtain the correct address + * while running with the initial identity mapping as well as the + * switch-over to kernel virtual addresses later. + */ +static const struct snp_cpuid_table *snp_cpuid_get_table(void) +{ + void *ptr; + + asm ("lea cpuid_table_copy(%%rip), %0" + : "=r" (ptr) + : "p" (&cpuid_table_copy)); + + return ptr; +} + +/* + * The SNP Firmware ABI, Revision 0.9, Section 7.1, details the use of + * XCR0_IN and XSS_IN to encode multiple versions of 0xD subfunctions 0 + * and 1 based on the corresponding features enabled by a particular + * combination of XCR0 and XSS registers so that a guest can look up the + * version corresponding to the features currently enabled in its XCR0/XSS + * registers. The only values that differ between these versions/table + * entries is the enabled XSAVE area size advertised via EBX. + * + * While hypervisors may choose to make use of this support, it is more + * robust/secure for a guest to simply find the entry corresponding to the + * base/legacy XSAVE area size (XCR0=1 or XCR0=3), and then calculate the + * XSAVE area size using subfunctions 2 through 64, as documented in APM + * Volume 3, Rev 3.31, Appendix E.3.8, which is what is done here. + * + * Since base/legacy XSAVE area size is documented as 0x240, use that value + * directly rather than relying on the base size in the CPUID table. + * + * Return: XSAVE area size on success, 0 otherwise. + */ +static u32 snp_cpuid_calc_xsave_size(u64 xfeatures_en, bool compacted) +{ + const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table(); + u64 xfeatures_found = 0; + u32 xsave_size = 0x240; + int i; + + for (i = 0; i < cpuid_table->count; i++) { + const struct snp_cpuid_fn *e = &cpuid_table->fn[i]; + + if (!(e->eax_in == 0xD && e->ecx_in > 1 && e->ecx_in < 64)) + continue; + if (!(xfeatures_en & (BIT_ULL(e->ecx_in)))) + continue; + if (xfeatures_found & (BIT_ULL(e->ecx_in))) + continue; + + xfeatures_found |= (BIT_ULL(e->ecx_in)); + + if (compacted) + xsave_size += e->eax; + else + xsave_size = max(xsave_size, e->eax + e->ebx); + } + + /* + * Either the guest set unsupported XCR0/XSS bits, or the corresponding + * entries in the CPUID table were not present. This is not a valid + * state to be in. + */ + if (xfeatures_found != (xfeatures_en & GENMASK_ULL(63, 2))) + return 0; + + return xsave_size; +} + +static bool +snp_cpuid_get_validated_func(struct cpuid_leaf *leaf) +{ + const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table(); + int i; + + for (i = 0; i < cpuid_table->count; i++) { + const struct snp_cpuid_fn *e = &cpuid_table->fn[i]; + + if (e->eax_in != leaf->fn) + continue; + + if (cpuid_function_is_indexed(leaf->fn) && e->ecx_in != leaf->subfn) + continue; + + /* + * For 0xD subfunctions 0 and 1, only use the entry corresponding + * to the base/legacy XSAVE area size (XCR0=1 or XCR0=3, XSS=0). + * See the comments above snp_cpuid_calc_xsave_size() for more + * details. + */ + if (e->eax_in == 0xD && (e->ecx_in == 0 || e->ecx_in == 1)) + if (!(e->xcr0_in == 1 || e->xcr0_in == 3) || e->xss_in) + continue; + + leaf->eax = e->eax; + leaf->ebx = e->ebx; + leaf->ecx = e->ecx; + leaf->edx = e->edx; + + return true; + } + + return false; +} + +static void snp_cpuid_hv(struct cpuid_leaf *leaf) +{ + if (sev_cpuid_hv(leaf)) + sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID_HV); +} + +static int snp_cpuid_postprocess(struct cpuid_leaf *leaf) +{ + struct cpuid_leaf leaf_hv = *leaf; + + switch (leaf->fn) { + case 0x1: + snp_cpuid_hv(&leaf_hv); + + /* initial APIC ID */ + leaf->ebx = (leaf_hv.ebx & GENMASK(31, 24)) | (leaf->ebx & GENMASK(23, 0)); + /* APIC enabled bit */ + leaf->edx = (leaf_hv.edx & BIT(9)) | (leaf->edx & ~BIT(9)); + + /* OSXSAVE enabled bit */ + if (native_read_cr4() & X86_CR4_OSXSAVE) + leaf->ecx |= BIT(27); + break; + case 0x7: + /* OSPKE enabled bit */ + leaf->ecx &= ~BIT(4); + if (native_read_cr4() & X86_CR4_PKE) + leaf->ecx |= BIT(4); + break; + case 0xB: + leaf_hv.subfn = 0; + snp_cpuid_hv(&leaf_hv); + + /* extended APIC ID */ + leaf->edx = leaf_hv.edx; + break; + case 0xD: { + bool compacted = false; + u64 xcr0 = 1, xss = 0; + u32 xsave_size; + + if (leaf->subfn != 0 && leaf->subfn != 1) + return 0; + + if (native_read_cr4() & X86_CR4_OSXSAVE) + xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK); + if (leaf->subfn == 1) { + /* Get XSS value if XSAVES is enabled. */ + if (leaf->eax & BIT(3)) { + unsigned long lo, hi; + + asm volatile("rdmsr" : "=a" (lo), "=d" (hi) + : "c" (MSR_IA32_XSS)); + xss = (hi << 32) | lo; + } + + /* + * The PPR and APM aren't clear on what size should be + * encoded in 0xD:0x1:EBX when compaction is not enabled + * by either XSAVEC (feature bit 1) or XSAVES (feature + * bit 3) since SNP-capable hardware has these feature + * bits fixed as 1. KVM sets it to 0 in this case, but + * to avoid this becoming an issue it's safer to simply + * treat this as unsupported for SNP guests. + */ + if (!(leaf->eax & (BIT(1) | BIT(3)))) + return -EINVAL; + + compacted = true; + } + + xsave_size = snp_cpuid_calc_xsave_size(xcr0 | xss, compacted); + if (!xsave_size) + return -EINVAL; + + leaf->ebx = xsave_size; + } + break; + case 0x8000001E: + snp_cpuid_hv(&leaf_hv); + + /* extended APIC ID */ + leaf->eax = leaf_hv.eax; + /* compute ID */ + leaf->ebx = (leaf->ebx & GENMASK(31, 8)) | (leaf_hv.ebx & GENMASK(7, 0)); + /* node ID */ + leaf->ecx = (leaf->ecx & GENMASK(31, 8)) | (leaf_hv.ecx & GENMASK(7, 0)); + break; + default: + /* No fix-ups needed, use values as-is. */ + break; + } + + return 0; +} + +/* + * Returns -EOPNOTSUPP if feature not enabled. Any other non-zero return value + * should be treated as fatal by caller. + */ +static int snp_cpuid(struct cpuid_leaf *leaf) +{ + const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table(); + + if (!cpuid_table->count) + return -EOPNOTSUPP; + + if (!snp_cpuid_get_validated_func(leaf)) { + /* + * Some hypervisors will avoid keeping track of CPUID entries + * where all values are zero, since they can be handled the + * same as out-of-range values (all-zero). This is useful here + * as well as it allows virtually all guest configurations to + * work using a single SNP CPUID table. + * + * To allow for this, there is a need to distinguish between + * out-of-range entries and in-range zero entries, since the + * CPUID table entries are only a template that may need to be + * augmented with additional values for things like + * CPU-specific information during post-processing. So if it's + * not in the table, set the values to zero. Then, if they are + * within a valid CPUID range, proceed with post-processing + * using zeros as the initial values. Otherwise, skip + * post-processing and just return zeros immediately. + */ + leaf->eax = leaf->ebx = leaf->ecx = leaf->edx = 0; + + /* Skip post-processing for out-of-range zero leafs. */ + if (!(leaf->fn <= cpuid_std_range_max || + (leaf->fn >= 0x40000000 && leaf->fn <= cpuid_hyp_range_max) || + (leaf->fn >= 0x80000000 && leaf->fn <= cpuid_ext_range_max))) + return 0; + } + + return snp_cpuid_postprocess(leaf); +} + +/* * Boot VC Handler - This is the first VC handler during boot, there is no GHCB * page yet, so it only supports the MSR based communication with the * hypervisor and only the CPUID exit-code. @@ -252,6 +541,7 @@ void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code) unsigned int subfn = lower_bits(regs->cx, 32); unsigned int fn = lower_bits(regs->ax, 32); struct cpuid_leaf leaf; + int ret; /* Only CPUID is supported via MSR protocol */ if (exit_code != SVM_EXIT_CPUID) @@ -259,9 +549,18 @@ void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code) leaf.fn = fn; leaf.subfn = subfn; + + ret = snp_cpuid(&leaf); + if (!ret) + goto cpuid_done; + + if (ret != -EOPNOTSUPP) + goto fail; + if (sev_cpuid_hv(&leaf)) goto fail; +cpuid_done: regs->ax = leaf.eax; regs->bx = leaf.ebx; regs->cx = leaf.ecx; @@ -556,12 +855,37 @@ static enum es_result vc_handle_ioio(struct ghcb *ghcb, struct es_em_ctxt *ctxt) return ret; } +static int vc_handle_cpuid_snp(struct pt_regs *regs) +{ + struct cpuid_leaf leaf; + int ret; + + leaf.fn = regs->ax; + leaf.subfn = regs->cx; + ret = snp_cpuid(&leaf); + if (!ret) { + regs->ax = leaf.eax; + regs->bx = leaf.ebx; + regs->cx = leaf.ecx; + regs->dx = leaf.edx; + } + + return ret; +} + static enum es_result vc_handle_cpuid(struct ghcb *ghcb, struct es_em_ctxt *ctxt) { struct pt_regs *regs = ctxt->regs; u32 cr4 = native_read_cr4(); enum es_result ret; + int snp_cpuid_ret; + + snp_cpuid_ret = vc_handle_cpuid_snp(regs); + if (!snp_cpuid_ret) + return ES_OK; + if (snp_cpuid_ret != -EOPNOTSUPP) + return ES_VMM_ERROR; ghcb_set_rax(ghcb, regs->ax); ghcb_set_rcx(ghcb, regs->cx);