Reply-To: Sean Christopherson
Date: Thu, 1 Jun 2023 18:15:16 -0700
In-Reply-To: <20230602011518.787006-1-seanjc@google.com>
References: <20230602011518.787006-1-seanjc@google.com>
Message-ID: <20230602011518.787006-2-seanjc@google.com>
Subject: [PATCH 1/3] KVM: VMX: Retry APIC-access page reload if invalidation is in-progress
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Jason Gunthorpe, Alistair Popple, Robin Murphy
X-Mailing-List: linux-kernel@vger.kernel.org

Re-request an APIC-access page reload if there is a relevant mmu_notifier
invalidation in-progress when KVM retrieves the backing pfn, i.e. stall
vCPUs until the backing pfn for the APIC-access page is "officially"
stable. Relying on the primary MMU to not make changes after invoking
->invalidate_range() works, e.g. any additional changes to a PRESENT PTE
would also trigger an ->invalidate_range(), but using ->invalidate_range()
to fudge around KVM not honoring past and in-progress invalidations is a
bit hacky.

Honoring invalidations will allow using KVM's standard mmu_notifier hooks
to detect APIC-access page reloads, which will in turn allow removing
KVM's implementation of ->invalidate_range() (the APIC-access page case is
a true one-off).

Opportunistically add a comment to explain why doing nothing if a memslot
isn't found is functionally correct.

Suggested-by: Jason Gunthorpe
Cc: Alistair Popple
Cc: Robin Murphy
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/vmx.c | 50 +++++++++++++++++++++++++++++++++++++-----
 1 file changed, 45 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 44fb619803b8..59195f0dc7a5 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6708,7 +6708,12 @@ void vmx_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
 
 static void vmx_set_apic_access_page_addr(struct kvm_vcpu *vcpu)
 {
-	struct page *page;
+	const gfn_t gfn = APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT;
+	struct kvm *kvm = vcpu->kvm;
+	struct kvm_memslots *slots = kvm_memslots(kvm);
+	struct kvm_memory_slot *slot;
+	unsigned long mmu_seq;
+	kvm_pfn_t pfn;
 
 	/* Defer reload until vmcs01 is the current VMCS. */
 	if (is_guest_mode(vcpu)) {
@@ -6720,18 +6725,53 @@ static void vmx_set_apic_access_page_addr(struct kvm_vcpu *vcpu)
 	    SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES))
 		return;
 
-	page = gfn_to_page(vcpu->kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
-	if (is_error_page(page))
+	/*
+	 * Grab the memslot so that the hva lookup for the mmu_notifier retry
+	 * is guaranteed to use the same memslot as the pfn lookup, i.e. rely
+	 * on the pfn lookup's validation of the memslot to ensure a valid hva
+	 * is used for the retry check.
+	 */
+	slot = id_to_memslot(slots, APIC_ACCESS_PAGE_PRIVATE_MEMSLOT);
+	if (!slot || slot->flags & KVM_MEMSLOT_INVALID)
 		return;
 
-	vmcs_write64(APIC_ACCESS_ADDR, page_to_phys(page));
+	/*
+	 * Ensure that the mmu_notifier sequence count is read before KVM
+	 * retrieves the pfn from the primary MMU. Note, the memslot is
+	 * protected by SRCU, not the mmu_notifier. Pairs with the smp_wmb()
+	 * in kvm_mmu_invalidate_end().
+	 */
+	mmu_seq = kvm->mmu_invalidate_seq;
+	smp_rmb();
+
+	/*
+	 * No need to retry if the memslot does not exist or is invalid. KVM
+	 * controls the APIC-access page memslot, and only deletes the memslot
+	 * if APICv is permanently inhibited, i.e. the memslot won't reappear.
+	 */
+	pfn = gfn_to_pfn_memslot(slot, gfn);
+	if (is_error_noslot_pfn(pfn))
+		return;
+
+	read_lock(&vcpu->kvm->mmu_lock);
+	if (mmu_invalidate_retry_hva(kvm, mmu_seq,
+				     gfn_to_hva_memslot(slot, gfn))) {
+		kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu);
+		read_unlock(&vcpu->kvm->mmu_lock);
+		goto out;
+	}
+
+	vmcs_write64(APIC_ACCESS_ADDR, pfn_to_hpa(pfn));
+	read_unlock(&vcpu->kvm->mmu_lock);
+
 	vmx_flush_tlb_current(vcpu);
 
+out:
 	/*
 	 * Do not pin apic access page in memory, the MMU notifier
 	 * will call us again if it is migrated or swapped out.
 	 */
-	put_page(page);
+	kvm_release_pfn_clean(pfn);
 }
 
 static void vmx_hwapic_isr_update(int max_isr)
-- 
2.41.0.rc2.161.g9c6817b8e7-goog
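For readers unfamiliar with the retry protocol this patch relies on, here is a
minimal single-threaded userspace sketch of the snapshot-then-recheck idea:
read the sequence count, do the lookup, then retry if an invalidation covering
the hva is in flight or has completed since the snapshot. Everything named
toy_* is hypothetical and is not a kernel API. The sketch assumes one thread,
so it omits the smp_rmb()/smp_wmb() pairing and mmu_lock that the real code
needs, and it tracks only the most recently started range.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the invalidation state KVM tracks per VM. */
struct toy_mmu {
	unsigned long invalidate_seq;	/* bumped when an invalidation ends */
	unsigned long in_progress;	/* invalidations currently in flight */
	unsigned long range_start;	/* start of the range being invalidated */
	unsigned long range_end;	/* end (exclusive) of that range */
};

/* Mark [start, end) as being invalidated by the primary MMU. */
static void toy_invalidate_begin(struct toy_mmu *m, unsigned long start,
				 unsigned long end)
{
	m->in_progress++;
	m->range_start = start;
	m->range_end = end;
}

/* Finish the invalidation; the bump makes older snapshots stale. */
static void toy_invalidate_end(struct toy_mmu *m)
{
	assert(m->in_progress > 0);
	m->invalidate_seq++;
	m->in_progress--;
}

/*
 * Return true if a lookup of 'hva' that snapshotted 'seq' beforehand
 * may have raced with an invalidation and must be retried.
 */
static bool toy_retry_hva(const struct toy_mmu *m, unsigned long seq,
			  unsigned long hva)
{
	/* An in-flight invalidation that covers the hva forces a retry. */
	if (m->in_progress && hva >= m->range_start && hva < m->range_end)
		return true;

	/* Any invalidation that completed after the snapshot also does. */
	return m->invalidate_seq != seq;
}
```

A caller would snapshot invalidate_seq, perform its lookup, and only then call
toy_retry_hva(); a true result corresponds to the path in the patch that
re-requests KVM_REQ_APIC_PAGE_RELOAD instead of writing the VMCS field.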