Date: Mon, 14 Feb 2022 19:24:03 +0000
From: Sean Christopherson
To: Paolo Bonzini
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, vkuznets@redhat.com,
	mlevitsk@redhat.com, dmatlack@google.com
Subject: Re: [PATCH 12/12] KVM: x86: do not unload MMU roots on all role changes
References: <20220209170020.1775368-1-pbonzini@redhat.com>
	<20220209170020.1775368-13-pbonzini@redhat.com>
	<5f42d1ef-f6b7-c339-32b9-f4cf48c21841@redhat.com>
In-Reply-To: <5f42d1ef-f6b7-c339-32b9-f4cf48c21841@redhat.com>

On Mon, Feb 14, 2022, Paolo Bonzini wrote:
> On 2/11/22 19:48, Sean Christopherson wrote:
> > On Wed, Feb 09, 2022, Paolo Bonzini wrote:
> > > @@ -5045,8 +5046,8 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > >  void kvm_mmu_reset_context(struct kvm_vcpu *vcpu)
> > >  {
> > > -	kvm_mmu_unload(vcpu);
> > >  	kvm_init_mmu(vcpu);
> > > +	kvm_mmu_new_pgd(vcpu, vcpu->arch.cr3);
> >
> > This is too risky IMO, there are far more flows than just MOV CR0/CR4 that are
> > affected, e.g. SMM transitions, KVM_SET_SREG, etc...
>
> SMM exit does flush the TLB because RSM clears CR0.PG (I did check this :)).
> SMM re-entry then does not need to flush.  But I don't think SMM exit should
> flush the TLB *for non-SMM roles*.

I'm not concerned about the TLB flush aspects so much as the addition of
kvm_mmu_new_pgd() in new paths.

> For KVM_SET_SREGS I'm not sure if it should flush the TLB, but I agree it is
> certainly safer to keep it that way.
>
> > Given that kvm_post_set_cr{0,4}() and kvm_vcpu_reset() explicitly handle CR0.PG
> > and CR4.SMEP toggling, I highly doubt the other flows are correct in all
> > instances.  The call to kvm_mmu_new_pgd() is also
>
> *white noise*
>
> > To minimize risk, we should leave kvm_mmu_reset_context() as is (rename it if
> > necessary) and instead add a new helper to handle kvm_post_set_cr{0,4}().  In
> > the future we can/should work on avoiding unload in all paths, but again,
> > future problem.
>
> I disagree on this.  There aren't many calls to kvm_mmu_reset_context.

All the more reason to do things incrementally.  I have no objection to allowing
all flows to reuse a cached (or current) root, I'm objecting to converting them
all in a single patch.

> > > -	if ((cr0 ^ old_cr0) & KVM_MMU_CR0_ROLE_BITS)
> > > +	if ((cr0 ^ old_cr0) & KVM_MMU_CR0_ROLE_BITS) {
> > > +		/* Flush the TLB if CR0 is changed 1 -> 0. */
> > > +		if ((old_cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PG))
> > > +			kvm_mmu_unload(vcpu);
> >
> > Calling kvm_mmu_unload() instead of requesting a flush isn't coherent with
> > respect to the comment, or with SMEP handling.  And the SMEP handling isn't
> > coherent with respect to the changelog.  Please elaborate :-)
>
> Yep, will do (the CR0.PG=0 case is similar to the CR0.PCIDE=0 case below).

Oh, you're freeing all roots to ensure a future MOV CR3 with NO_FLUSH and
PCIDE=1 can't reuse a stale root.  That's necessary if and only if the MMU is
shadowing the guest; non-nested TDP MMUs just need to flush the guest's TLB.

The same is true for the PCIDE case, i.e. we could optimize that too, though
the main motivation would be to clarify why all roots are unloaded.
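Roughly something like this, i.e. split on whether the MMU is shadowing the
guest (only a sketch of the idea, not the actual patch):

	/*
	 * Sketch: when CR0.PG is cleared, a shadow MMU must free all roots
	 * so that a future MOV CR3 with NO_FLUSH and PCIDE=1 can't reuse a
	 * stale root, whereas a non-nested TDP MMU only needs to flush the
	 * guest's TLB (KVM's TDP root doesn't depend on guest paging).
	 */
	if ((old_cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PG)) {
		if (tdp_enabled && !mmu_is_nested(vcpu))
			kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu);
		else
			kvm_mmu_unload(vcpu);
	}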
> Using kvm_mmu_unload() avoids loading a cached root just to throw it away
> immediately after,

The shadow paging case will throw it away, but not the non-nested TDP MMU case?

> but I can change this to a new KVM_REQ_MMU_UPDATE_ROOT flag that does
>
> 	kvm_mmu_new_pgd(vcpu, vcpu->arch.cr3);

I don't think that's necessary, I was just confused by the discrepancy.

> By the way, I have a possibly stupid question.  In kvm_set_cr3 (called e.g.
> from emulator_set_cr()) there is
>
> 	if (cr3 != kvm_read_cr3(vcpu))
> 		kvm_mmu_new_pgd(vcpu, cr3);
>
> What makes this work if mmu_is_nested(vcpu)?

Hmm, nothing.  VMX is "broken" anyways because it will kick out to userspace
with X86EMUL_UNHANDLEABLE due to the CR3 intercept check.  SVM is also broken
in that it doesn't check INTERCEPT_CR3_WRITE, e.g. it will do the wrong thing
even if L1 wants to intercept CR3 accesses.

> Should this also have an "if (... && !tdp_enabled)"?

Yes?  That should avoid the nested mess.

This patch also needs to handle CR0 and CR4 modifications if L2 is active,
e.g. if L1 chooses not to intercept CR0/CR4.  kvm_post_set_cr_reinit_mmu()
would be a lovely landing spot for that check :-D
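For completeness, the kvm_set_cr3 guard discussed above would be something
like the following (a sketch of the idea being floated, not necessarily the
exact fix that lands):

	/*
	 * Sketch: with TDP enabled, the guest's CR3 doesn't select one of
	 * KVM's roots, and with a nested MMU the root is managed via the
	 * nested walk, so only shadow paging wants the root switch here.
	 */
	if (cr3 != kvm_read_cr3(vcpu) && !tdp_enabled)
		kvm_mmu_new_pgd(vcpu, cr3);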