Received: by 2002:a05:6358:3188:b0:123:57c1:9b43 with SMTP id q8csp21129157rwd; Thu, 29 Jun 2023 11:19:54 -0700 (PDT) X-Google-Smtp-Source: APBJJlElFfafDaUARyavFgTrHuy27oR9zZB0mBNu5Bh4WFZpgVcMRIrQe4BI5eBa5KW87gMxHnaC X-Received: by 2002:a17:90a:f6c8:b0:263:3567:f99 with SMTP id er8-20020a17090af6c800b0026335670f99mr717034pjb.15.1688062793727; Thu, 29 Jun 2023 11:19:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1688062793; cv=none; d=google.com; s=arc-20160816; b=J4Vf1dRreHQnvUiTpwHUeQHra4wcZwhyScn2IY0w+UNgWZWKwYe+SBwp3m7Omkso3+ Dzv8b5I469cPMe6tNgzzDi9Mn1YqTZ00gT/23fTqh0aEjeSgpTeUdfoq6rKajBg2jb/0 tgf60r72gQZbYpQdgdHgcHgm9jVsnZgyWQOx/YjT2IOnkND26TfDlm5BCNveQDw2IigM q/tZbTa6Kk9pfQ4uFygL0k7h2fwSYczxQmYTWofL7aozVcz+WZTdTE+Ga0mu2aE4oCLL Zcczqmozekp1ExiUR6yQwZscE/yHEfcIBO6P6RvKwLG8UsNnXo9iMTBFseRup+Y0Dc88 ESzA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=4FkhFrr8w6aneHxfl/h27nLQhmkU8QpQuoyru1TLP4c=; fh=wgZVZObMu1lXOGhwgl6yitZ+9nQ81QjIXcT1fkQvf6s=; b=rXOqInghYh6vITo89KJu+cbjS9X1uCoFwbcdKfVjoBvtJ1ohDRFNeY5fKWl5zfqd5z pOiT8c7m2+srHy4ZDWVlqv04J/i2pUx7HfvydnnnreBUw8k4d98oSlATNHL1/NhQt9jJ WCeeD2cbFtrjmMcf+Ci1nSkKL7XtfKpSsqXTnSLvxdaMlr/z+a4E3Hv/jTCjo6AbTgZx UXvXptaT76Qj1E02mtyd1TSKbDJ2ZktRsY5iaXjc1URiRcZwhS9c91EOqU1X1WbeJHGI 8UUSgQ3uzGljFlRLiF6c15MdlxS7z3xrnbRDqDaUxeY/SHpT+qhpk7GaTA3D7P/SBfjh EPzQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=QBg6Sw4t; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id h5-20020a17090ac38500b00262ecc71b13si8160477pjt.109.2023.06.29.11.19.38; Thu, 29 Jun 2023 11:19:53 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=QBg6Sw4t; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231777AbjF2R4a (ORCPT + 99 others); Thu, 29 Jun 2023 13:56:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36916 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229910AbjF2R42 (ORCPT ); Thu, 29 Jun 2023 13:56:28 -0400 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A0D501FD8 for ; Thu, 29 Jun 2023 10:56:27 -0700 (PDT) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-5704e551e8bso7882437b3.3 for ; Thu, 29 Jun 2023 10:56:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1688061386; x=1690653386; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=4FkhFrr8w6aneHxfl/h27nLQhmkU8QpQuoyru1TLP4c=; b=QBg6Sw4tvE6/pClUmxNda3MWefmw+W4KGc9WF6HW+rF0vuDvCBcf5loU3jRbHfnYig i861uID1dQ78wFmjMi+gpotuXmWiokksT3ABBYgthg9VEjTpLXe4TrzIJhyzwNUk5HFh 9qHbnx2awjERX0lkB5i5W6ZngQIDkJb01cP7TPCOqlgtsO8uP2HPC/wVmmjbjWZigHXx TtWWEoEkv5wmw7fV8AyWtGUG3tqc19WEekI6RjGmXtxPGURI/AmhqGmmHdaXHn7Rp0jF OXszuPhwDcVKGF9EUrX3uHG91CZmh4bEum8bnNqC5COQJGndLYd7mfCX/mwj0KT/id4B xIug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688061386; x=1690653386; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=4FkhFrr8w6aneHxfl/h27nLQhmkU8QpQuoyru1TLP4c=; b=Hs3n2wq3n97CP/J9KbQMMpjqdzgqR8VosNmVQPrR8QMkaz0dMr+ZIimtuH/xxplead jV2FP/tYmvbxdDywkLPKT7Rzixj9LbEs44NebPCvaebEw5oSP30X4lfOBdfXwUS1iy/E 0Jy0E03ZZyUli0E/hcWOdQp5VKID7uX6Hba8tNoBYhHO/+RvpJjZwTkMWBMp/k88GzIK ox3WClmPfedlRV3WXp6mLZ/tl86P/nduIrJEpx1ux93IRL+Vy4uCSxHw+DzBstWlbTPE rbI77Ce0/wrgqT0kZNlR7Hf5SG+GUoPog2u0Q6whuM7r2sLyukdfSYqr5FPA1eciKYfN RdcA== X-Gm-Message-State: ABy/qLbSIa/Yf41uXSysVgkIXfvJrCKBc48ZOOYR4NchUCVm7NziC+iT fWuXZy0+K6LlZfyV+B/dwhepKhkzI8Q= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a81:c845:0:b0:56c:f8b7:d4f7 with SMTP id k5-20020a81c845000000b0056cf8b7d4f7mr819ywl.6.1688061386259; Thu, 29 Jun 2023 10:56:26 -0700 (PDT) Date: Thu, 29 Jun 2023 10:56:24 -0700 In-Reply-To: Mime-Version: 1.0 References: <20230628071217.71126-1-ishiir@g.ecc.u-tokyo.ac.jp> Message-ID: Subject: Re: [PATCH] KVM: nVMX: Prevent vmlaunch with EPTP pointing outside assigned memory area From: Sean Christopherson To: Yan Zhao Cc: Reima Ishii , shina@ecc.u-tokyo.ac.jp, Paolo Bonzini , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" , kvm@vger.kernel.org, linux-kernel@vger.kernel.org, yuan.yao@intel.com Content-Type: text/plain; charset="us-ascii" X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jun 29, 2023, Yan Zhao wrote: > On Wed, Jun 28, 2023 at 08:37:45AM -0700, Sean Christopherson wrote: > ... > > So I think we should try this: > > > > --- > > arch/x86/kvm/mmu/mmu.c | 19 ------------------- > > include/linux/kvm_host.h | 1 - > > virt/kvm/kvm_main.c | 13 ++----------- > > 3 files changed, 2 insertions(+), 31 deletions(-) > > > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c > > index 60397a1beda3..e305737edf84 100644 > > --- a/arch/x86/kvm/mmu/mmu.c > > +++ b/arch/x86/kvm/mmu/mmu.c > > @@ -3671,19 +3671,6 @@ void kvm_mmu_free_guest_mode_roots(struct kvm *kvm, struct kvm_mmu *mmu) > > } > > EXPORT_SYMBOL_GPL(kvm_mmu_free_guest_mode_roots); > > > > - > > -static int mmu_check_root(struct kvm_vcpu *vcpu, gfn_t root_gfn) > > -{ > > - int ret = 0; > > - > > - if (!kvm_vcpu_is_visible_gfn(vcpu, root_gfn)) { > > - kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu); > > - ret = 1; > > - } > > - > > - return ret; > > -} > > - > > static hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, int quadrant, > > u8 level) > > { > > @@ -3821,9 +3808,6 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu) > > root_pgd = kvm_mmu_get_guest_pgd(vcpu, mmu); > > root_gfn = root_pgd >> PAGE_SHIFT; > > > > - if (mmu_check_root(vcpu, root_gfn)) > > - return 1; > > - > Hi Sean, > The checking of existence of memslot is still useful, > Otherwise NULL pointer dereference would be met as in below call trace. > > mmu_alloc_shadow_roots > |->mmu_alloc_root > |->mmu_alloc_root(root_gfn) > |->kvm_mmu_get_shadow_page > |->__kvm_mmu_get_shadow_page > |->kvm_mmu_alloc_shadow_page > |->account_shadowed > |->slot = __gfn_to_memslot(slots, gfn); ==>NULL pointer > | kvm_slot_page_track_add_page(kvm, slot, gfn,KVM_PAGE_TRACK_WRITE); > |->update_gfn_track(slot, gfn, mode, 1); > |->index = gfn_to_index(gfn, slot->base_gfn, PG_LEVEL_4K); ==>NULL pointer dereference Argh, right, the internal memslot might "work", but the no memslot case will not. The non-root path effectively injects a page fault if there's no memslot. Oof, and simply skipping the accounting for the no-slot case would result in an imbalanced world if userspace creates a valid memslot before unaccount_shadowed() is called. As much as it pains me to propagate KVM's arbitrary behavior, doing the right from an architectural perspective is really gross for KVM, e.g. would need all kinds of dedicated code in the MMU. I still don't like adding a non-architectural check to nested_vmx_check_eptp(), especially since there would be a race, e.g. if a memslot were deleted between nested_vmx_check_eptp() and allocating the root. This is the least awful solution I can think of (not yet tested): From: Sean Christopherson Date: Thu, 29 Jun 2023 10:54:12 -0700 Subject: [PATCH] KVM: nVMX: Allow triple fault shutdown in L2 to cancel nested VM-Enter Reported-by: Reima Ishii Cc: Yan Zhao Signed-off-by: Sean Christopherson --- arch/x86/kvm/vmx/nested.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 22e08d30baef..6da6db575a27 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -4729,8 +4729,15 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason, /* Pending MTF traps are discarded on VM-Exit. */ vmx->nested.mtf_pending = false; - /* trying to cancel vmlaunch/vmresume is a bug */ - WARN_ON_ONCE(vmx->nested.nested_run_pending); + /* + * Canceling VMLAUNCH/VMRESUME is a KVM bug, but triple fault shutdown + * is allowed due to limitations with KVM's shadow MMU, which requires + * shadowed page tables to be associated with a valid memslot, and KVM + * can't complete VM-Enter without a valid shadow root. + */ + WARN_ON_ONCE(vmx->nested.nested_run_pending && + vm_exit_reason != EXIT_REASON_TRIPLE_FAULT); + vmx->nested.nested_run_pending = false; if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu)) { /* base-commit: 61eb0bb88e6c154a5e19e62edd99299a86a2c6a7 --