Received: by 2002:a05:6a11:4021:0:0:0:0 with SMTP id ky33csp869068pxb; Thu, 23 Sep 2021 12:19:44 -0700 (PDT) X-Google-Smtp-Source: ABdhPJw2lsHvkCT4q49OomuzpF73FCMrtkpjnFXAUcqzL2bWJz6xMXy3FOzPLrkiHNNk3xgCokE9 X-Received: by 2002:a17:906:f8d4:: with SMTP id lh20mr7156326ejb.382.1632424784781; Thu, 23 Sep 2021 12:19:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1632424784; cv=none; d=google.com; s=arc-20160816; b=so1mqcPgvtgpJ6CT6QoTrywKQxPJrhkT1/SdcRKB4CamXgNTQ6FU6A/iX3GCwRXEqh NZEUx/IXpvOf0Ok0z3N1aSSadbiEvAEgH8Bzhfx0JDulm9OvLPDpyAGMIBbQCTIk7L4o Zti0WancnPcHsRGPzKd8H1oQgz9+fo+8Bro8U78bxz9UGKunxn2mDiUb75Y3QgFNw4PI enQh/ea036sPmdCfuI+12+a85QHHR8wYOaCNZj2RVuW65pnkZQ1COG1AyLvCQk/m5Q7o d1wfbmsC3ta1EZy02KjOGK53vSmmI6aggUnDSv+tACR/wPlupDAmJ1ye8Q2fDLOi4PvH +khA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=LljGpfowrgivlRJFftpyDSU2BC04nSpPhM5eTXaFlr8=; b=QPcdJrJF6Bx+NKi8r4ZNr3X8cr6BcBB4jtPI7AmDU0EKjAWG0f+h+HkPekaJhWhVz5 gllHtlYRYoXXeEBe8SCdv6gSb7deaLjEZTflhA2nbN/0eNugJx7QwtNzwvQ9QRUJNsLl BGHitLFkNIzR9/aFRLKhS5bhHkSYg0mGCL9+hSo1feGlOaq0cjol3bedOAL/8ZdaP+ly mR3r1UTwY/d08FJPqradBpThEG2CzU/wal8+DU7twjg6hZwDqZ2WrypROuaQvmbT0IWG zy+2vfX/Sdcsl0o4wFVLXCx/B+WtZy45YsKHzZw5Qx/tASr1B92mt0Y5ZRDfvlR2agAm yH4Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=RAEdX72B; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id p22si3246621eds.582.2021.09.23.12.19.13; Thu, 23 Sep 2021 12:19:44 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=RAEdX72B; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242903AbhIWTTR (ORCPT + 99 others); Thu, 23 Sep 2021 15:19:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59032 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237009AbhIWTTQ (ORCPT ); Thu, 23 Sep 2021 15:19:16 -0400 Received: from mail-ot1-x32c.google.com (mail-ot1-x32c.google.com [IPv6:2607:f8b0:4864:20::32c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DE9DAC061756 for ; Thu, 23 Sep 2021 12:17:44 -0700 (PDT) Received: by mail-ot1-x32c.google.com with SMTP id 77-20020a9d0ed3000000b00546e10e6699so10007663otj.2 for ; Thu, 23 Sep 2021 12:17:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=LljGpfowrgivlRJFftpyDSU2BC04nSpPhM5eTXaFlr8=; b=RAEdX72B6FJ8jw2x0FecoX4YekDfJKpt4KQDs/G7WIW+nNaklHbXDWQAUYj0t/JaJA W6hAufJVFkCXJxttIR9GzclXE2yluJI6Bwx53jtS0jbPDjn1rlKV6tBgBi4ArQLfsZCi Cn6S6RQyKUVhromBPbhmSLdKQCprrgOesWVX0ED1SN/D63o/DXwt4XTlk33AJPUYBK6R wLtbGlC8TXtukuJd+LDL+BkSYLD/qSgLHPM/TmI5pJq7f+uqj1ABsIC00RlD+0Nka0Gu mvuMEdkYIFQ171A8dVY0zC5YcScytshrd/pyp7zQDJqlNYIx5SnZso5D+J3/EI/CELhR D6Xg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=LljGpfowrgivlRJFftpyDSU2BC04nSpPhM5eTXaFlr8=; b=lWbtdLeQfU8OOK51Huf/t/wQE0H4I5Q2eAVbPSC7x1DHrH0NFSQ3O8wUZSBYFxNPLm aXI21egrYUbgstTf7U2+tIPfdquyWSVQjGY32ARRjV5ezCnQh/bt1I5CE9/iwTgoWy2S x9erFhL04pCY4S7xWZap2rsxadcvXzkZP5GhewBzr0R56iK2oC8eMA0IWi8v8hGzpe0X Ipmqa9jP+nqcPaCO5Ej+RH8crVhYB5dbPY59VsuO5RYSQJ4X9i4KmZpM/a0kYsj1N1sv qJc/xkDA70Y+iT89HOgtDCQg/uFj0heGlwGaDWArs6rbrW36CsilbyEgEUau5lVa266b JOpQ== X-Gm-Message-State: AOAM533rKGXuvbjgske4+hKTROrDj0XCv/00g3oLfmimLpk79Vgf+Kbn OvW9L0l/QGegsIZ+FjDmt+cV4gSq96YpHuW8wfttIQ== X-Received: by 2002:a9d:71d4:: with SMTP id z20mr309962otj.29.1632424663779; Thu, 23 Sep 2021 12:17:43 -0700 (PDT) MIME-Version: 1.0 References: <20210820155918.7518-1-brijesh.singh@amd.com> <20210820155918.7518-22-brijesh.singh@amd.com> In-Reply-To: From: Marc Orr Date: Thu, 23 Sep 2021 12:17:32 -0700 Message-ID: Subject: Re: [PATCH Part2 v5 21/45] KVM: SVM: Make AVIC backing, VMSA and VMCB memory allocation SNP safe To: Brijesh Singh Cc: "Dr. David Alan Gilbert" , x86 , LKML , kvm list , linux-coco@lists.linux.dev, linux-mm@kvack.org, linux-crypto@vger.kernel.org, Thomas Gleixner , Ingo Molnar , Joerg Roedel , Tom Lendacky , "H. Peter Anvin" , Ard Biesheuvel , Paolo Bonzini , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Andy Lutomirski , Dave Hansen , Sergio Lopez , Peter Gonda , Peter Zijlstra , Srinivas Pandruvada , David Rientjes , Dov Murik , Tobin Feldman-Fitzthum , Borislav Petkov , Michael Roth , Vlastimil Babka , "Kirill A . Shutemov" , Andi Kleen , tony.luck@intel.com, sathyanarayanan.kuppuswamy@linux.intel.com Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org On Thu, Sep 23, 2021 at 11:09 AM Brijesh Singh wrote: > > > On 9/22/21 1:55 PM, Dr. David Alan Gilbert wrote: > > * Brijesh Singh (brijesh.singh@amd.com) wrote: > >> Implement a workaround for an SNP erratum where the CPU will incorrectly > >> signal an RMP violation #PF if a hugepage (2mb or 1gb) collides with the > >> RMP entry of a VMCB, VMSA or AVIC backing page. > >> > >> When SEV-SNP is globally enabled, the CPU marks the VMCB, VMSA, and AVIC > >> backing pages as "in-use" in the RMP after a successful VMRUN. This is > >> done for _all_ VMs, not just SNP-Active VMs. > > Can you explain what 'globally enabled' means? > > This means that SNP is enabled in host SYSCFG_MSR.Snp=1. Once its > enabled then RMP checks are enforced. > > > > Or more specifically, can we trip this bug on public hardware that has > > the SNP enabled in the bios, but no SNP init in the host OS? > > Enabling the SNP support on host is 3 step process: > > step1 (bios): reserve memory for the RMP table. > > step2 (host): initialize the RMP table memory, set the SYSCFG msr to > enable the SNP feature > > step3 (host): call the SNP_INIT to initialize the SNP firmware (this is > needed only if you ever plan to launch SNP guest from this host). > > The "SNP globally enabled" means the step 1 to 2. The RMP checks are > enforced as soon as step 2 is completed. It might be good to update the code to reflect this reply, such that the new `snp_safe_alloc_page()` function introduced in this patch only deviates from "normal" page allocation logic when these two conditions are met. We could introduce a global variable, `snp_globally_enabled` that defaults to false and gets set to true after we write the SYSCFG MSR. > thanks > > > > > Dave > > > >> If the hypervisor accesses an in-use page through a writable translation, > >> the CPU will throw an RMP violation #PF. On early SNP hardware, if an > >> in-use page is 2mb aligned and software accesses any part of the associated > >> 2mb region with a hupage, the CPU will incorrectly treat the entire 2mb > >> region as in-use and signal a spurious RMP violation #PF. > >> > >> The recommended is to not use the hugepage for the VMCB, VMSA or > >> AVIC backing page. Add a generic allocator that will ensure that the page > >> returns is not hugepage (2mb or 1gb) and is safe to be used when SEV-SNP > >> is enabled. > >> > >> Co-developed-by: Marc Orr > >> Signed-off-by: Marc Orr > >> Signed-off-by: Brijesh Singh > >> --- > >> arch/x86/include/asm/kvm-x86-ops.h | 1 + > >> arch/x86/include/asm/kvm_host.h | 1 + > >> arch/x86/kvm/lapic.c | 5 ++++- > >> arch/x86/kvm/svm/sev.c | 35 ++++++++++++++++++++++++++++++ > >> arch/x86/kvm/svm/svm.c | 16 ++++++++++++-- > >> arch/x86/kvm/svm/svm.h | 1 + > >> 6 files changed, 56 insertions(+), 3 deletions(-) > >> > >> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h > >> index a12a4987154e..36a9c23a4b27 100644 > >> --- a/arch/x86/include/asm/kvm-x86-ops.h > >> +++ b/arch/x86/include/asm/kvm-x86-ops.h > >> @@ -122,6 +122,7 @@ KVM_X86_OP_NULL(enable_direct_tlbflush) > >> KVM_X86_OP_NULL(migrate_timers) > >> KVM_X86_OP(msr_filter_changed) > >> KVM_X86_OP_NULL(complete_emulated_msr) > >> +KVM_X86_OP(alloc_apic_backing_page) > >> > >> #undef KVM_X86_OP > >> #undef KVM_X86_OP_NULL > >> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > >> index 974cbfb1eefe..5ad6255ff5d5 100644 > >> --- a/arch/x86/include/asm/kvm_host.h > >> +++ b/arch/x86/include/asm/kvm_host.h > >> @@ -1453,6 +1453,7 @@ struct kvm_x86_ops { > >> int (*complete_emulated_msr)(struct kvm_vcpu *vcpu, int err); > >> > >> void (*vcpu_deliver_sipi_vector)(struct kvm_vcpu *vcpu, u8 vector); > >> + void *(*alloc_apic_backing_page)(struct kvm_vcpu *vcpu); > >> }; > >> > >> struct kvm_x86_nested_ops { > >> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c > >> index ba5a27879f1d..05b45747b20b 100644 > >> --- a/arch/x86/kvm/lapic.c > >> +++ b/arch/x86/kvm/lapic.c > >> @@ -2457,7 +2457,10 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu, int timer_advance_ns) > >> > >> vcpu->arch.apic = apic; > >> > >> - apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT); > >> + if (kvm_x86_ops.alloc_apic_backing_page) > >> + apic->regs = static_call(kvm_x86_alloc_apic_backing_page)(vcpu); > >> + else > >> + apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT); > >> if (!apic->regs) { > >> printk(KERN_ERR "malloc apic regs error for vcpu %x\n", > >> vcpu->vcpu_id); > >> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c > >> index 1644da5fc93f..8771b878193f 100644 > >> --- a/arch/x86/kvm/svm/sev.c > >> +++ b/arch/x86/kvm/svm/sev.c > >> @@ -2703,3 +2703,38 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector) > >> break; > >> } > >> } > >> + > >> +struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu) > >> +{ > >> + unsigned long pfn; > >> + struct page *p; > >> + > >> + if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP)) > >> + return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO); Continuing my other comment, above: if we introduce a `snp_globally_enabled` var, we could use that here, rather than `cpu_feature_enabled(X86_FEATURE_SEV_SNP)`. > >> + > >> + /* > >> + * Allocate an SNP safe page to workaround the SNP erratum where > >> + * the CPU will incorrectly signal an RMP violation #PF if a > >> + * hugepage (2mb or 1gb) collides with the RMP entry of VMCB, VMSA > >> + * or AVIC backing page. The recommeded workaround is to not use the > >> + * hugepage. > >> + * > >> + * Allocate one extra page, use a page which is not 2mb aligned > >> + * and free the other. > >> + */ > >> + p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1); > >> + if (!p) > >> + return NULL; > >> + > >> + split_page(p, 1); > >> + > >> + pfn = page_to_pfn(p); > >> + if (IS_ALIGNED(__pfn_to_phys(pfn), PMD_SIZE)) { > >> + pfn++; > >> + __free_page(p); > >> + } else { > >> + __free_page(pfn_to_page(pfn + 1)); > >> + } > >> + > >> + return pfn_to_page(pfn); > >> +} > >> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c > >> index 25773bf72158..058eea8353c9 100644 > >> --- a/arch/x86/kvm/svm/svm.c > >> +++ b/arch/x86/kvm/svm/svm.c > >> @@ -1368,7 +1368,7 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu) > >> svm = to_svm(vcpu); > >> > >> err = -ENOMEM; > >> - vmcb01_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO); > >> + vmcb01_page = snp_safe_alloc_page(vcpu); > >> if (!vmcb01_page) > >> goto out; > >> > >> @@ -1377,7 +1377,7 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu) > >> * SEV-ES guests require a separate VMSA page used to contain > >> * the encrypted register state of the guest. > >> */ > >> - vmsa_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO); > >> + vmsa_page = snp_safe_alloc_page(vcpu); > >> if (!vmsa_page) > >> goto error_free_vmcb_page; > >> > >> @@ -4539,6 +4539,16 @@ static int svm_vm_init(struct kvm *kvm) > >> return 0; > >> } > >> > >> +static void *svm_alloc_apic_backing_page(struct kvm_vcpu *vcpu) > >> +{ > >> + struct page *page = snp_safe_alloc_page(vcpu); > >> + > >> + if (!page) > >> + return NULL; > >> + > >> + return page_address(page); > >> +} > >> + > >> static struct kvm_x86_ops svm_x86_ops __initdata = { > >> .hardware_unsetup = svm_hardware_teardown, > >> .hardware_enable = svm_hardware_enable, > >> @@ -4667,6 +4677,8 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { > >> .complete_emulated_msr = svm_complete_emulated_msr, > >> > >> .vcpu_deliver_sipi_vector = svm_vcpu_deliver_sipi_vector, > >> + > >> + .alloc_apic_backing_page = svm_alloc_apic_backing_page, > >> }; > >> > >> static struct kvm_x86_init_ops svm_init_ops __initdata = { > >> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h > >> index d1f1512a4b47..e40800e9c998 100644 > >> --- a/arch/x86/kvm/svm/svm.h > >> +++ b/arch/x86/kvm/svm/svm.h > >> @@ -575,6 +575,7 @@ void sev_es_create_vcpu(struct vcpu_svm *svm); > >> void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector); > >> void sev_es_prepare_guest_switch(struct vcpu_svm *svm, unsigned int cpu); > >> void sev_es_unmap_ghcb(struct vcpu_svm *svm); > >> +struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu); > >> > >> /* vmenter.S */ > >> > >> -- > >> 2.17.1 > >> > >>