Received: by 2002:a05:6359:c8b:b0:c7:702f:21d4 with SMTP id go11csp1058901rwb; Thu, 22 Sep 2022 09:34:17 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6t0U9SEDSABMGXPjy52yUSHKsIR1xCPheATfXm6vMXZR2zBqYJvf6rIWYRXvtod005/7yk X-Received: by 2002:a17:907:e9e:b0:77f:9688:2714 with SMTP id ho30-20020a1709070e9e00b0077f96882714mr3523522ejc.208.1663864456876; Thu, 22 Sep 2022 09:34:16 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663864456; cv=none; d=google.com; s=arc-20160816; b=ecpDDyA4NrNB6aRiyJCKYacmyQ66aviuauyTwgVEY65UwQnCxSzZxCtaY0yKy+tvg9 MkcYKq29tHEpbD54dspBMm63sSMlUVBxpsTzEUNFcOqpU4N8uyD2OxlGftkJRx9LVNt4 IlIikxnHa17Dl2PDLn67tdg0naa8AVxH02V/yMAWC3sIsZoi2r8RwlRey9PQpG6u6wtX DfB5mmY/XlzrkzbQbWoroALGW+c7LwSTdJlU/A915K4m3Sfw3PjLxvd5h24uziUJoiXx HSAOuUd8pYvzSqVQMJD4XPwzpbCUkumiYiNhZZ0C7OXpX/NYlztkGb7GPYIk3neogl9Z RjhA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:message-id:date:references :in-reply-to:subject:cc:to:from:dkim-signature; bh=Mpy+ZkaV8qU9h3mLap4FWRmJ5KwCWQIoj50k/FL3Cgo=; b=nRbz2ZfJ0hiOf7vvmrPYg7AgiQhS2ds2tRcFFab7IFkmMG2tJHsu2lYoppcyHuty02 bKlrcYg9JL8+miP5AECB1C7Uh2GRoQTONfa2+aC8dIpbmTR1LjJU0l8G47nrU0Cr+4aR kzj1yCHqlWyf0khyHjK2pnO3P+TjkYfdePIxsaAhSadzsT2KV7xbxKexowy6lSdE+uSB oef1SC8B88xnMzixi+kpxTlkYHO6v0OT95tWSpypOp3+pOXKCN2o63boIqS+T1jl/Yor CHOrd+E1fTCpwjiy2oPoFxqpxKwrviEhun9YTl5pl/E2K3Bih0tvXaV8EaC/7N81qa1z I+Wg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Zrr5sbG9; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id dp20-20020a170906c15400b0074b26b6bbbcsi5711061ejc.419.2022.09.22.09.33.48; Thu, 22 Sep 2022 09:34:16 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Zrr5sbG9; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230484AbiIVPhO (ORCPT + 99 others); Thu, 22 Sep 2022 11:37:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44566 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229792AbiIVPhM (ORCPT ); Thu, 22 Sep 2022 11:37:12 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EB832100ABD for ; Thu, 22 Sep 2022 08:37:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1663861030; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=Mpy+ZkaV8qU9h3mLap4FWRmJ5KwCWQIoj50k/FL3Cgo=; b=Zrr5sbG9yD7l1oJMqWHn/Q4VyxWhenp+/WJLQXYV/RvA7S/s/4JUPb683HRxKq6eXE0QPg 77CwXIPh6vHtXk+f3FNtbxkxPKuatJ3PUMkOwNgY6m2gaV+STmIuoevnm5zIqmk68xVtCy +/QBO8TlMqXU8fIESUtwy1n50P7gAiI= Received: from mail-wm1-f70.google.com (mail-wm1-f70.google.com [209.85.128.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-488-Efm75d45Mi2nNYobq2PqMQ-1; Thu, 22 Sep 2022 11:37:09 -0400 X-MC-Unique: Efm75d45Mi2nNYobq2PqMQ-1 Received: by mail-wm1-f70.google.com with SMTP id d5-20020a05600c34c500b003b4fb42ccdeso1258229wmq.8 for ; Thu, 22 Sep 2022 08:37:08 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=mime-version:message-id:date:references:in-reply-to:subject:cc:to :from:x-gm-message-state:from:to:cc:subject:date; bh=Mpy+ZkaV8qU9h3mLap4FWRmJ5KwCWQIoj50k/FL3Cgo=; b=hb0n8kU+UX22srtCWa9Y63T1eQ0ENQcuXtdIFGVpyUk0z+Sy4wdbSzZnUe7GTqp5JY FkETWfhM8MFZd8v4ZmkgiasG5UktdYgz7XzCS39LuyRYvqrYZn04qtt2JEryehiSLUqa elsQ+fdwv1sly5bHijc1kvx2B3xQQs1sdBXnG9vsQ9IaHx9uQ/SqBA51WYQIk3pF6XMG sKbo2rqbgCghVmNuXaaMVIuntakl8jLpRDBjkBjJYsNSCM9w6iklhu84GMQqgZ/t1Sbs 2yEP1nNKjonL04OR+cRundz68WAqQ3tXq2oHEgZCu2VPIwpDXFEELFhJ9vf7Ou9AuQ+T b1oQ== X-Gm-Message-State: ACrzQf1/tx0DpRPJJe/IRAtcGMMUX2JYqMDaiFD8zHxZy1hBdmUnO+2J txRrxPmO1V+8HW//2ZtoPqFQGwNTLqDMUXtWN0E+2OKUJkopbWr+UTGCGXGsWkkMHDs9OyNm0YI WjBRSX8y8sespCEUp/vIgGUAWCfl6oTJz23jMewkJmUampQCadGYxA+Lcwiq4b+9by1wKKEvhLc NW X-Received: by 2002:a05:600c:1f18:b0:3b4:c4ae:f666 with SMTP id bd24-20020a05600c1f1800b003b4c4aef666mr9900261wmb.88.1663861027619; Thu, 22 Sep 2022 08:37:07 -0700 (PDT) X-Received: by 2002:a05:600c:1f18:b0:3b4:c4ae:f666 with SMTP id bd24-20020a05600c1f1800b003b4c4aef666mr9900223wmb.88.1663861027244; Thu, 22 Sep 2022 08:37:07 -0700 (PDT) Received: from fedora (nat-2.ign.cz. [91.219.240.2]) by smtp.gmail.com with ESMTPSA id z5-20020a5d6405000000b0022af9555669sm6245246wru.99.2022.09.22.08.37.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Sep 2022 08:37:06 -0700 (PDT) From: Vitaly Kuznetsov To: Sean Christopherson Cc: kvm@vger.kernel.org, Paolo Bonzini , Wanpeng Li , Jim Mattson , Michael Kelley , Siddharth Chandrasekaran , Yuan Yao , Maxim Levitsky , linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v10 02/39] KVM: x86: hyper-v: Resurrect dedicated KVM_REQ_HV_TLB_FLUSH flag In-Reply-To: References: <20220921152436.3673454-1-vkuznets@redhat.com> <20220921152436.3673454-3-vkuznets@redhat.com> <877d1voiuz.fsf@redhat.com> Date: Thu, 22 Sep 2022 17:37:05 +0200 Message-ID: <87sfkjmndq.fsf@redhat.com> MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Sean Christopherson writes: > On Thu, Sep 22, 2022, Vitaly Kuznetsov wrote: >> Now let's get to VMX and the point of my confusion (and thanks in >> advance for educating me!): >> AFAIU, when EPT is in use: >> KVM_REQ_TLB_FLUSH_CURRENT == invept >> KVM_REQ_TLB_FLUSH_GUEST = invvpid >> >> For "normal" mappings (which are mapped on both stages) this is the same >> thing as they're 'tagged' with both VPID and 'EPT root'. The question is >> what's left. Given your comment, do I understand correctly that in case >> of an invalid mapping in the guest (GVA doesn't resolve to a GPA), this >> will only be tagged with VPID but not with 'EPT root' (as the CPU never >> reached to the second translation stage)? We certainly can't ignore >> these. Another (probably pure theoretical question) is what are the >> mappings which are tagged with 'EPT root' but don't have a VPID tag? > > Intel puts mappings into three categories, which for non-root mode equates to: > > linear == GVA => GPA > guest-physical == GPA => HPA > combined == GVA => HPA > > and essentially the categories that consume the GVA are tagged with the VPID > (linear and combined), and categories that consume the GPA are tagged with the > EPTP address (guest-physical and combined). > >> Are these the mapping which happen when e.g. vCPU has paging disabled? > > No, these mappings can be created at all times. Even with CR0.PG=1, the guest > can generate GPAs without going through a GVA=>GPA translation, e.g. the page tables > themselves, RTIT (Intel PT) addresses, etc... And even for combined/full > translations, the CPU can insert TLB entries for just the GPA=>HPA part. > > E.g. when a page is allocated by/for userspace, the kernel will zero the page using > the kernel's direct map, but userspace will access the page via a different GVA. > I.e. the guest effectively aliases GPA(x) with GVA(k) and GVA(u). By inserting > the GPA(x) => HPA(y) into the TLB, when guest userspace access GVA(u), the CPU > encounters a TLB miss on GVA(u) => GPA(x), but gets a TLB hit on GPA(x) => HPA(y). > > Separating EPT flushes from VPID (and PCID) flushes allows the CPU to retain > the partial TLB entries, e.g. a host change in the EPT tables will result in the > guest-physical and combined mappings being invalidated, but linear mappings can > be kept. > Thanks a bunch! For some reason I though it's always the full thing (combined) which is tagged with both VPID/PCID and EPTP and linear/guest-physical are just 'corner' cases (but are still combined and tagged). Apparently, it's not like that. > I'm 99% certain AMD also caches partial entries, e.g. see the blurb on INVLPGA > not affecting NPT translations, AMD just doesn't provide a way for the host to > flush _only_ NPT translations. Maybe the performance benefits weren't significant > enough to justify the extra complexity? > >> These are probably unrelated to Hyper-V TLB flushing. >> >> To preserve the 'small' optimization, we can probably move >> kvm_clear_request(KVM_REQ_HV_TLB_FLUSH, vcpu); >> >> to nested_svm_transition_tlb_flush() or, in case this sounds too >> hackish > > Move it to svm_flush_tlb_current(), because the justification is that on SVM, > flushing "current" TLB entries also flushes "guest" TLB entries due to the more > coarse-grained ASID-based TLB flush. E.g. > > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c > index dd599afc85f5..a86b41503723 100644 > --- a/arch/x86/kvm/svm/svm.c > +++ b/arch/x86/kvm/svm/svm.c > @@ -3737,6 +3737,13 @@ static void svm_flush_tlb_current(struct kvm_vcpu *vcpu) > { > struct vcpu_svm *svm = to_svm(vcpu); > > + /* > + * Unlike VMX, SVM doesn't provide a way to flush only NPT TLB entries. > + * A TLB flush for the current ASID flushes both "host" and "guest" TLB > + * entries, and thus is a superset of Hyper-V's fine grained flushing. > + */ > + kvm_hv_vcpu_purge_flush_tlb(vcpu); > + > /* > * Flush only the current ASID even if the TLB flush was invoked via > * kvm_flush_remote_tlbs(). Although flushing remote TLBs requires all > >> we can drop it for now and add it to the (already overfull) >> bucket of the "optimize nested_svm_transition_tlb_flush()". > > I think even long term, purging Hyper-V's FIFO in svm_flush_tlb_current() is the > correct/desired behavior. This doesn't really have anything to do with nSVM, > it's all about SVM not providing a way to flush only NPT entries. True that, silly me forgot that even without any nesting, Hyper-V TLB flush after svm_flush_tlb_current() makes no sense. > -- Vitaly