Received: by 2002:a05:7412:cfc7:b0:fc:a2b0:25d7 with SMTP id by7csp2105096rdb; Tue, 20 Feb 2024 18:17:04 -0800 (PST) X-Forwarded-Encrypted: i=2; AJvYcCUrpQeyQXyoBrkfkI4ptb5eznTqMH/KlWR0kdkv31DHcfzNZmJZwcad+OCCYI6hNhZHSRwHGSOFo53+bWV9nekdLX1XU1Q7031GmHz7NQ== X-Google-Smtp-Source: AGHT+IG05vT22Qe/tL1E/3ixyGp+bkGW0W90nZTFlrXzQY3dRh61pbN8AcBoqEuwlYlwe2gscQyn X-Received: by 2002:a17:906:f1c6:b0:a3e:5ffa:d564 with SMTP id gx6-20020a170906f1c600b00a3e5ffad564mr6099153ejb.8.1708481824655; Tue, 20 Feb 2024 18:17:04 -0800 (PST) Return-Path: Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [2604:1380:4601:e00::3]) by mx.google.com with ESMTPS id h13-20020a17090634cd00b00a3e6feb8f01si2824732ejb.287.2024.02.20.18.17.04 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Feb 2024 18:17:04 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-73936-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) client-ip=2604:1380:4601:e00::3; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@google.com header.s=20230601 header.b=Fh5Q7uaR; arc=fail (body hash mismatch); spf=pass (google.com: domain of linux-kernel+bounces-73936-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-73936-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=REJECT sp=REJECT dis=QUARANTINE) header.from=google.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 09EB51F2603C for ; Wed, 21 Feb 2024 02:10:39 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id C6ADC63CB; Wed, 21 Feb 2024 02:10:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=google.com header.i=@google.com header.b="Fh5Q7uaR" Received: from mail-yw1-f201.google.com (mail-yw1-f201.google.com [209.85.128.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4CA864C99 for ; Wed, 21 Feb 2024 02:10:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708481427; cv=none; b=GlfkcFntAGX6RaXxyQa0+9I154RCfjYjZsa8Jomwwl57ZEp0Y5iaUXpPVvWqYwmXK3L6wiY/zVwXPpYc3SWiPo5335Sdf9K05Nd16NDXsmcz49hqICFs6BZIuZqAgZc1aUupQZdVjP8yLp5OiVgJ5899kKgohaaOZG4dkKDJjck= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708481427; c=relaxed/simple; bh=j41BACY3bbl0dJLSj+GBS3NKl9wXXB4Sfy90HrTAWok=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=Fp0ubp5ZdiznfU82/avSaRvYksHeVCQdVAX+Oid2WY03bLiuyeCTw/rYpNHElqnB3GFBkqdtvHGOp5bepuXQcQ6GQjU8R37gZBg4NPntk1WKeyfjjM/ybpWtkauqNptdcEbs2Ekrp0hFK4Nuw8AYIGKfe/y+9cCMWqmFMmI08mM= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=Fh5Q7uaR; arc=none smtp.client-ip=209.85.128.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Received: by mail-yw1-f201.google.com with SMTP id 00721157ae682-602dae507caso100431957b3.0 for ; Tue, 20 Feb 2024 18:10:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1708481425; x=1709086225; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=gKDDhVDI/caWn5TUGufa5pdgsoyIgpZ0Gl6DWfyCRzw=; b=Fh5Q7uaRUFyhtMNW/LtWAS/w+WBOcLEMJXBhLKAuHPSYjInNYEhTUihpZ2yiFyAZHn HWQ4+EmST8/QVPjIgViS91TZ1aOp9x7m67izw+IuDv0gSQcRajNu4gAmDgRzi697Xa19 sHI10FUJS0qQO3+UDNad5Sg3Pff2zduqlGMkNs7zfk8LKvpe00olcm96sUxTsyTZxBEd He2KSJzmtiQliVd8ndpOUiAGKFRzZ+F7NACHg76fzqdxzEOVqT5UVCrBF/F3zzrRrSkn 90cXg+VB/XEmQpWXbHRBiF3Uv1/VyIuqSER+FBmsXBEJ5QpgikyQ3Gjuo94dY9DkKzqm yBnA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1708481425; x=1709086225; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=gKDDhVDI/caWn5TUGufa5pdgsoyIgpZ0Gl6DWfyCRzw=; b=O2sxRwco8RUZY/he6D4j5qk28dVefdtKpSn8eaH0HKY3qkjrxBwL+ACyWTdoiqN02p Z+KI9Qc+mYJ+Gb1ZU/PYUPXcF+/SbG9it+pZyb70VemkxNxZ4Eutj9PdOW7As7W/BdTS brE8VbapT8OZOzCjkdkexyg4mUV9pjPF00J6S6rWBx8pT9sBMztkP5gDuxdZOmpLdEtb z8/t+l4uXknqralFfhQw523kdwAOLig5FsGs3JAB0Hu3CvLzhAkkbgoXQEl0uSgswXhq IxMxcA5MCnDMNcM+/svJnbi3x2KmIaL1iPJzF89+a5eQajQOOmqps8wIqcZK0p9vvXpV nkvw== X-Forwarded-Encrypted: i=1; AJvYcCW0YKUGaTR6CFJK5YIchUytf3tsxiXo6c7++b0Z3Vn93bWghosvrQi+uajR1doCz8UR2A+Rp3Vk1+3Quhb56tD74x9CFZ2EzXH0WtJt X-Gm-Message-State: AOJu0YzuEuQ3mgcNNMZIh+9XgCsG/41LF8kqYq5DHMc70WdqCHWNTecI YwrKKYX90rGAY1pSrxnmqrmOJed16dfvU1ibiFsa7ZKb1Jjnfr647TQCRaSNiIPbCOfqloWWtQg pKg== X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a81:494c:0:b0:608:72fe:b8a1 with SMTP id w73-20020a81494c000000b0060872feb8a1mr495006ywa.4.1708481425305; Tue, 20 Feb 2024 18:10:25 -0800 (PST) Date: Tue, 20 Feb 2024 18:10:23 -0800 In-Reply-To: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20240209222858.396696-1-seanjc@google.com> <20240209222858.396696-4-seanjc@google.com> Message-ID: Subject: Re: [PATCH v4 3/4] KVM: x86/mmu: Move slot checks from __kvm_faultin_pfn() to kvm_faultin_pfn() From: Sean Christopherson To: Yan Zhao Cc: Paolo Bonzini , kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Friedrich Weber , Kai Huang , Yuan Yao , Xu Yilun , Yu Zhang , Chao Peng , Fuad Tabba , Michael Roth , Isaku Yamahata , David Matlack Content-Type: text/plain; charset="us-ascii" On Tue, Feb 20, 2024, Yan Zhao wrote: > On Mon, Feb 19, 2024 at 11:44:54AM -0800, Sean Christopherson wrote: > > If KVM is using TDP, but L1 is using shadow paging for L2, then routing through > > kvm_handle_noslot_fault() will incorrectly cache the gfn as MMIO, and create an > > MMIO SPTE. Creating an MMIO SPTE is ok, but only because kvm_mmu_page_role.guest_mode > > ensure KVM uses different roots for L1 vs. L2. But mmio_gfn will remain valid, > > and could (quite theoretically) cause KVM to incorrectly treat an L1 access to > > the private TSS or identity mapped page tables as MMIO. > Why would KVM treat L1 access to the private TSS and identity mapped page > tables as MMIO even though mmio_gfn is valid? Because KVM doesn't need to take an EPT Violation or Misconfig to trigger emulation, those just happen to be (by far) the most common ways KVM gets into the emulator on modern CPUs. > It looks that (for Intel platform) EPT for L1 will only install normal SPTEs > (non-MMIO SPTEs) for the two private slots, so there would not have EPT > misconfiguration and would not go to emulation path incorrectly. > Am I missing something? .. > > -- > > Subject: [PATCH] KVM: x86/mmu: Don't force emulation of L2 accesses to > > non-APIC internal slots > > > > Signed-off-by: Sean Christopherson > > --- > > arch/x86/kvm/mmu/mmu.c | 17 +++++++++++++---- > > 1 file changed, 13 insertions(+), 4 deletions(-) > > > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c > > index 488f522f09c6..4ce824cec5b9 100644 > > --- a/arch/x86/kvm/mmu/mmu.c > > +++ b/arch/x86/kvm/mmu/mmu.c > > @@ -4341,8 +4341,18 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault > > if (slot && (slot->flags & KVM_MEMSLOT_INVALID)) > > return RET_PF_RETRY; > > > > - if (!kvm_is_visible_memslot(slot)) { > > - /* Don't expose private memslots to L2. */ > > + if (slot && slot->id == APIC_ACCESS_PAGE_PRIVATE_MEMSLOT) { > > + /* > > + * Don't map L1's APIC access page into L2, KVM doesn't support > > + * using APICv/AVIC to accelerate L2 accesses to L1's APIC, > > + * i.e. the access needs to be emulated. Emulating access to > > + * L1's APIC is also correct if L1 is accelerating L2's own > > + * virtual APIC, but for some reason L1 also maps _L1's_ APIC > > + * into L2. Note, vcpu_is_mmio_gpa() always treats access to > > + * the APIC as MMIO. Allow an MMIO SPTE to be created, as KVM > > + * uses different roots for L1 vs. L2, i.e. there is no danger > > + * of breaking APICv/AVIC for L1. > > + */ > > if (is_guest_mode(vcpu)) { > > fault->slot = NULL; > > fault->pfn = KVM_PFN_NOSLOT; > Checking fault->is_private before calling kvm_handle_noslot_fault()? Ya, the actual series will perform that check, this slots in halfway through. > And do we need a centralized check of fault->is_private in kvm_mmu_do_page_fault() > before returning RET_PF_EMULATE? Oof, yes. > > @@ -4355,8 +4365,7 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault > > * MMIO SPTE. That way the cache doesn't need to be purged > > * when the AVIC is re-enabled. > > */ > > - if (slot && slot->id == APIC_ACCESS_PAGE_PRIVATE_MEMSLOT && > > - !kvm_apicv_activated(vcpu->kvm)) > > + if (!kvm_apicv_activated(vcpu->kvm)) > > return RET_PF_EMULATE; > Otherwise, here also needs a checking of fault->is_private? > Maybe also for where RET_PF_EMULATE is returned when page_fault_handle_page_track() > is true (though I know it's always false for TDX). Ya, and practically speaking it should always be false for functional setups (software-protected VMs don't yet play nice with shadow paging or any form of emulation), but it's easy enough to guard against RET_PF_EMULATE in kvm_mmu_do_page_fault(). I'm going to post _just_ patch 1 as v5 so that it can land in 6.8 (assuming I don't screw it up again). I'll post a separate series to tackle the refactor and is_private cleanups and fixes as that has ballooned to 17 patches :-/