Received: by 2002:a05:6358:d09b:b0:dc:cd0c:909e with SMTP id jc27csp1723467rwb; Thu, 8 Dec 2022 14:33:44 -0800 (PST) X-Google-Smtp-Source: AA0mqf5ykLL6kM+rFbdr4sy6qQOXq/qIYG7VxS7AI8oY5zFfKOT8d0Tw3QF9TiP/L9ZC+HPTwTmF X-Received: by 2002:a17:906:7f8e:b0:7c1:3a0:c09c with SMTP id f14-20020a1709067f8e00b007c103a0c09cmr10652602ejr.354.1670538824367; Thu, 08 Dec 2022 14:33:44 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1670538824; cv=none; d=google.com; s=arc-20160816; b=cM8LHOLR6CnGLn9ffaOOWx8VG9lonT2Q3NL9fmChBccq9ILDEw1PGBeNVPqz/P7JR+ xyA7ZjISZZnhCsU9vd1XOPeijgw4LUW9D7OWoYjj9wLc7SOP+qOXyO8FNnZajin1x4A2 neVjJawkr5f7uq+Dk0D0cdm6tYLTE6LDnMQtOeykb6JAABD4iStxsiwIFbrNQuI7MvLw xMKiNS9weG+KbwMUH53XOPNDL4jChFEp4dUmWudnIj/Sj8g//OuXe6vmJ7bNKHRqnMHi bguoC76chLZ+qO9eeRJy2wksBCZ5AuEvxhKMtNIAyV42+tdL64uMTPOV58SjYHmYXHS7 +UoQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:date:cc:to:from:subject :message-id:dkim-signature; bh=fZEl3N04EMAP2CKa6UNjlTejw3q2BTFmKhNwbtfSXdQ=; b=CyYy0LROwpZj8xrEeeCWuisIy1Y3o+NLGvnfI7yUVXQ73FWpROkKevfD9VurInzDnh B1beSt7dv8qGm1H6rmkbHsc3hPNx0bKrgFWM5EhvKzpPa52Yg3E/N/J9GKJqsdl+a86y RLgF7QM15rsAlWeFxI/9ptL/+y2R65K+U7krhJc67hQJ8waooDVFHSxWz238OERjlA8B yVusUyFTkT993Jz3OZUuO9ALqzz8lg5H4sJgAcK7pa6w3D8dSWPw+w+XCiR3prZ2u1Mx bO7sNqYtw/WFLOt19I/YE6rjKdZ5S2meTdClOZLhYBnjnAJoE+C3qgvsvgoug9ImlqQc 9FFw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=FBFylfhy; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id dd17-20020a1709069b9100b007c0cb833625si14888161ejc.478.2022.12.08.14.33.26; Thu, 08 Dec 2022 14:33:44 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=FBFylfhy; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229738AbiLHV4p (ORCPT + 73 others); Thu, 8 Dec 2022 16:56:45 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56202 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229556AbiLHV4g (ORCPT ); Thu, 8 Dec 2022 16:56:36 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 87DC16B9BD for ; Thu, 8 Dec 2022 13:55:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1670536524; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fZEl3N04EMAP2CKa6UNjlTejw3q2BTFmKhNwbtfSXdQ=; b=FBFylfhyWgP5zi41KT1AUJqfaYISFIFtVuXF06ZmzOU5VB1fcFuqt5XgQklXiBgbaRSPAq CV3DlTO7LS854s/fg/j383k8fCF1o+8g+oNUss23CdhH3AMlYfiiVdQqsXQEDQe3186P81 fqO8bkYzU7s5xsE9qpYtVz+DYHsYbGc= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-256-Pkd-WK6wM6SjpHbiMn6xxw-1; Thu, 08 Dec 2022 16:55:21 -0500 X-MC-Unique: Pkd-WK6wM6SjpHbiMn6xxw-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 4A516185A78B; Thu, 8 Dec 2022 21:55:21 +0000 (UTC) Received: from starship (unknown [10.35.206.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id D64B6140EBF5; Thu, 8 Dec 2022 21:55:19 +0000 (UTC) Message-ID: Subject: Re: [PATCH v4 10/32] KVM: x86: Move APIC access page helper to common x86 code From: Maxim Levitsky To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Alejandro Jimenez , Suravee Suthikulpanit , Li RongQing Date: Thu, 08 Dec 2022 23:55:18 +0200 In-Reply-To: <20221001005915.2041642-11-seanjc@google.com> References: <20221001005915.2041642-1-seanjc@google.com> <20221001005915.2041642-11-seanjc@google.com> Content-Type: text/plain; charset="UTF-8" User-Agent: Evolution 3.36.5 (3.36.5-2.fc32) MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, 2022-10-01 at 00:58 +0000, Sean Christopherson wrote: > Move the APIC access page allocation helper function to common x86 code, > the allocation routine is virtually identical between APICv (VMX) and > AVIC (SVM). Keep APICv's gfn_to_page() + put_page() sequence, which > verifies that a backing page can be allocated, i.e. that the system isn't > under heavy memory pressure. Forcing the backing page to be populated > isn't strictly necessary, but skipping the effective prefetch only delays > the inevitable. Just a note on the way we deal with that dummy page differs between APICv and AVIC a bit: On APICv we need physical address of the dummy page, since APICv has physical address of this page in VMCB, and when guest "accesses" only then APICv specific handling is invoked. Since we don't want to pin it, we have a mmu notifier telling us where it is currently and we update the vmcb each time it moves. If the page is swapped out, mmu notifier will call us, and we will pretty much swap it back in on next VM entry, in 'vmx_set_apic_access_page_addr' I hope that this code doesn't have races, I never looked at depth at it. On AVIC on the other hand we don't force the page to be mapped at all, since AVIC stores in the vmcb the virtual address of the page and only checkes that 'something' is mapped there. If we get nothing mapped there, AVIC will not intercept, we will get normal NPT fault and hopefully swap that page in. Reviewed-by: Maxim Levitsky Best regards, Maxim Levitsky > > Signed-off-by: Sean Christopherson > --- > arch/x86/kvm/lapic.c | 35 +++++++++++++++++++++++++++++++++++ > arch/x86/kvm/lapic.h | 1 + > arch/x86/kvm/svm/avic.c | 41 +++++++---------------------------------- > arch/x86/kvm/vmx/vmx.c | 35 +---------------------------------- > 4 files changed, 44 insertions(+), 68 deletions(-) > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c > index 316b61b56cca..80e8b1cc6dc2 100644 > --- a/arch/x86/kvm/lapic.c > +++ b/arch/x86/kvm/lapic.c > @@ -2436,6 +2436,41 @@ void kvm_apic_update_apicv(struct kvm_vcpu *vcpu) > } > EXPORT_SYMBOL_GPL(kvm_apic_update_apicv); > > +int kvm_alloc_apic_access_page(struct kvm *kvm) > +{ > + struct page *page; > + void __user *hva; > + int ret = 0; > + > + mutex_lock(&kvm->slots_lock); > + if (kvm->arch.apic_access_memslot_enabled) > + goto out; > + > + hva = __x86_set_memory_region(kvm, APIC_ACCESS_PAGE_PRIVATE_MEMSLOT, > + APIC_DEFAULT_PHYS_BASE, PAGE_SIZE); > + if (IS_ERR(hva)) { > + ret = PTR_ERR(hva); > + goto out; > + } > + > + page = gfn_to_page(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT); > + if (is_error_page(page)) { > + ret = -EFAULT; > + goto out; > + } > + > + /* > + * Do not pin the page in memory, so that memory hot-unplug > + * is able to migrate it. > + */ > + put_page(page); > + kvm->arch.apic_access_memslot_enabled = true; > +out: > + mutex_unlock(&kvm->slots_lock); > + return ret; > +} > +EXPORT_SYMBOL_GPL(kvm_alloc_apic_access_page); > + > void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event) > { > struct kvm_lapic *apic = vcpu->arch.apic; > diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h > index a5ac4a5a5179..0587a8282cb3 100644 > --- a/arch/x86/kvm/lapic.h > +++ b/arch/x86/kvm/lapic.h > @@ -112,6 +112,7 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq, > struct dest_map *dest_map); > int kvm_apic_local_deliver(struct kvm_lapic *apic, int lvt_type); > void kvm_apic_update_apicv(struct kvm_vcpu *vcpu); > +int kvm_alloc_apic_access_page(struct kvm *kvm); > > bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src, > struct kvm_lapic_irq *irq, int *r, struct dest_map *dest_map); > diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c > index 97ad0661f963..ec28ba4c5f1b 100644 > --- a/arch/x86/kvm/svm/avic.c > +++ b/arch/x86/kvm/svm/avic.c > @@ -256,39 +256,6 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu, > return &avic_physical_id_table[index]; > } > > -/* > - * Note: > - * AVIC hardware walks the nested page table to check permissions, > - * but does not use the SPA address specified in the leaf page > - * table entry since it uses address in the AVIC_BACKING_PAGE pointer > - * field of the VMCB. Therefore, we set up the > - * APIC_ACCESS_PAGE_PRIVATE_MEMSLOT (4KB) here. > - */ > -static int avic_alloc_access_page(struct kvm *kvm) > -{ > - void __user *ret; > - int r = 0; > - > - mutex_lock(&kvm->slots_lock); > - > - if (kvm->arch.apic_access_memslot_enabled) > - goto out; > - > - ret = __x86_set_memory_region(kvm, > - APIC_ACCESS_PAGE_PRIVATE_MEMSLOT, > - APIC_DEFAULT_PHYS_BASE, > - PAGE_SIZE); > - if (IS_ERR(ret)) { > - r = PTR_ERR(ret); > - goto out; > - } > - > - kvm->arch.apic_access_memslot_enabled = true; > -out: > - mutex_unlock(&kvm->slots_lock); > - return r; > -} > - > static int avic_init_backing_page(struct kvm_vcpu *vcpu) > { > u64 *entry, new_entry; > @@ -305,7 +272,13 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu) > if (kvm_apicv_activated(vcpu->kvm)) { > int ret; > > - ret = avic_alloc_access_page(vcpu->kvm); > + /* > + * Note, AVIC hardware walks the nested page table to check > + * permissions, but does not use the SPA address specified in > + * the leaf SPTE since it uses address in the AVIC_BACKING_PAGE > + * pointer field of the VMCB. > + */ > + ret = kvm_alloc_apic_access_page(vcpu->kvm); > if (ret) > return ret; > } > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > index 9dba04b6b019..974d9a366d5d 100644 > --- a/arch/x86/kvm/vmx/vmx.c > +++ b/arch/x86/kvm/vmx/vmx.c > @@ -3771,39 +3771,6 @@ static void seg_setup(int seg) > vmcs_write32(sf->ar_bytes, ar); > } > > -static int alloc_apic_access_page(struct kvm *kvm) > -{ > - struct page *page; > - void __user *hva; > - int ret = 0; > - > - mutex_lock(&kvm->slots_lock); > - if (kvm->arch.apic_access_memslot_enabled) > - goto out; > - hva = __x86_set_memory_region(kvm, APIC_ACCESS_PAGE_PRIVATE_MEMSLOT, > - APIC_DEFAULT_PHYS_BASE, PAGE_SIZE); > - if (IS_ERR(hva)) { > - ret = PTR_ERR(hva); > - goto out; > - } > - > - page = gfn_to_page(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT); > - if (is_error_page(page)) { > - ret = -EFAULT; > - goto out; > - } > - > - /* > - * Do not pin the page in memory, so that memory hot-unplug > - * is able to migrate it. > - */ > - put_page(page); > - kvm->arch.apic_access_memslot_enabled = true; > -out: > - mutex_unlock(&kvm->slots_lock); > - return ret; > -} > - > int allocate_vpid(void) > { > int vpid; > @@ -7348,7 +7315,7 @@ static int vmx_vcpu_create(struct kvm_vcpu *vcpu) > vmx->loaded_vmcs = &vmx->vmcs01; > > if (cpu_need_virtualize_apic_accesses(vcpu)) { > - err = alloc_apic_access_page(vcpu->kvm); > + err = kvm_alloc_apic_access_page(vcpu->kvm); > if (err) > goto free_vmcs; > }