Reply-To: Sean Christopherson
Date: Tue, 27 Feb 2024 18:41:41 -0800
In-Reply-To: <20240228024147.41573-1-seanjc@google.com>
References: <20240228024147.41573-1-seanjc@google.com>
Message-ID: <20240228024147.41573-11-seanjc@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
X-Mailer: git-send-email 2.44.0.278.ge034bb2e1d-goog
Subject:
[PATCH 10/16] KVM: x86/mmu: Don't force emulation of L2 accesses to non-APIC internal slots
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yan Zhao,
    Isaku Yamahata, Michael Roth, Yu Zhang, Chao Peng, Fuad Tabba,
    David Matlack
Content-Type: text/plain; charset="UTF-8"

Allow mapping KVM's internal memslots used for EPT without unrestricted
guest into L2, i.e. allow mapping the hidden TSS and the identity mapped
page tables into L2.  Unlike the APIC access page, there is no correctness
issue with letting L2 access the "hidden" memory.  Allowing these memslots
to be mapped into L2 fixes a largely theoretical bug where KVM could
incorrectly emulate subsequent _L1_ accesses as MMIO, and also ensures
consistent KVM behavior for L2.

If KVM is using TDP, but L1 is using shadow paging for L2, then routing
through kvm_handle_noslot_fault() will incorrectly cache the gfn as MMIO,
and create an MMIO SPTE.  Creating an MMIO SPTE is ok, but only because
kvm_mmu_page_role.guest_mode ensures KVM uses different roots for L1 vs.
L2.  But vcpu->arch.mmio_gfn will remain valid, and could cause KVM to
incorrectly treat an L1 access to the hidden TSS or identity mapped page
tables as MMIO.

Furthermore, forcing L2 accesses to be treated as "no slot" faults doesn't
actually prevent exposing KVM's internal memslots to L2, it simply forces
KVM to emulate the access.  In most cases, that will trigger MMIO,
amusingly due to filling vcpu->arch.mmio_gfn, but also because
vcpu_is_mmio_gpa() unconditionally treats APIC accesses as MMIO, i.e. APIC
accesses are ok.  But the hidden TSS and identity mapped page tables could
go either way (MMIO or access the private memslot's backing memory).

Alternatively, the inconsistent emulator behavior could be addressed by
forcing MMIO emulation for L2 access to all internal memslots, not just to
the APIC.  But that's arguably less correct than letting L2 access the
hidden TSS and identity mapped page tables, not to mention that it's
*extremely* unlikely anyone cares what KVM does in this case.  From L1's
perspective there is R/W memory at those memslots, the memory just happens
to be initialized with non-zero data.  Making the memory disappear when it
is accessed by L2 is far more magical and arbitrary than the memory
existing in the first place.

The APIC access page is special because KVM _must_ emulate the access to
do the right thing (emulate an APIC access instead of reading/writing the
APIC access page).  And despite what commit 3a2936dedd20 ("kvm: mmu: Don't
expose private memslots to L2") said, it's not just necessary when L1 is
accelerating L2's virtual APIC, it's just as important (likely *more*
important) for correctness when L1 is passing through its own APIC to L2.

Fixes: 3a2936dedd20 ("kvm: mmu: Don't expose private memslots to L2")
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/mmu/mmu.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 58c5ae8be66c..5c8caab64ba2 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4346,8 +4346,18 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	if (slot && (slot->flags & KVM_MEMSLOT_INVALID))
 		return RET_PF_RETRY;
 
-	if (!kvm_is_visible_memslot(slot)) {
-		/* Don't expose private memslots to L2. */
+	if (slot && slot->id == APIC_ACCESS_PAGE_PRIVATE_MEMSLOT) {
+		/*
+		 * Don't map L1's APIC access page into L2, KVM doesn't support
+		 * using APICv/AVIC to accelerate L2 accesses to L1's APIC,
+		 * i.e. the access needs to be emulated.  Emulating access to
+		 * L1's APIC is also correct if L1 is accelerating L2's own
+		 * virtual APIC, but for some reason L1 also maps _L1's_ APIC
+		 * into L2.  Note, vcpu_is_mmio_gpa() always treats access to
+		 * the APIC as MMIO.  Allow an MMIO SPTE to be created, as KVM
+		 * uses different roots for L1 vs. L2, i.e. there is no danger
+		 * of breaking APICv/AVIC for L1.
+		 */
 		if (is_guest_mode(vcpu)) {
 			fault->slot = NULL;
 			fault->pfn = KVM_PFN_NOSLOT;
@@ -4360,8 +4370,7 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 		 * MMIO SPTE.  That way the cache doesn't need to be purged
 		 * when the AVIC is re-enabled.
 		 */
-		if (slot && slot->id == APIC_ACCESS_PAGE_PRIVATE_MEMSLOT &&
-		    !kvm_apicv_activated(vcpu->kvm))
+		if (!kvm_apicv_activated(vcpu->kvm))
 			return RET_PF_EMULATE;
 	}
 
-- 
2.44.0.278.ge034bb2e1d-goog
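
For readers following along outside the kernel tree, here is a minimal userspace sketch of the slot handling after this patch. It is not kernel code: `slot_kind`, `outcome`, `model_slot`, `model_fault`, and `classify_fault()` are hypothetical names made up for illustration, and only the branch order is taken from the diff. The sketch also assumes that a gfn with no backing memslot ends up on the "no slot" path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the memslot identities discussed in the patch. */
enum slot_kind {
	SLOT_USER,         /* ordinary memslot */
	SLOT_HIDDEN_TSS,   /* internal: hidden TSS */
	SLOT_IDENTITY_PT,  /* internal: identity mapped page tables */
	SLOT_APIC_ACCESS,  /* internal: APIC access page */
};

/* Simplified outcomes of the fault-in step (illustrative only). */
enum outcome {
	OUT_RETRY,    /* memslot is being removed, retry the fault */
	OUT_NOSLOT,   /* handled as a "no slot" fault */
	OUT_EMULATE,  /* force emulation of the access */
	OUT_MAP,      /* continue and map the backing memory */
};

struct model_slot {
	enum slot_kind kind;
	bool invalid;             /* models KVM_MEMSLOT_INVALID */
};

struct model_fault {
	const struct model_slot *slot;  /* NULL if no memslot backs the gfn */
	bool guest_mode;                /* true when the vCPU is running L2 */
	bool apicv_active;              /* models kvm_apicv_activated() */
};

/* Mirrors the branch order of the patched code, nothing more. */
static enum outcome classify_fault(const struct model_fault *f)
{
	assert(f != NULL);

	if (f->slot && f->slot->invalid)
		return OUT_RETRY;

	/* Only the APIC access page is special-cased after the patch. */
	if (f->slot && f->slot->kind == SLOT_APIC_ACCESS) {
		if (f->guest_mode)
			return OUT_NOSLOT;   /* never map L1's APIC page into L2 */
		if (!f->apicv_active)
			return OUT_EMULATE;  /* APICv/AVIC inhibited: emulate */
	}

	/* Assumption: a gfn without a memslot takes the "no slot" path. */
	if (!f->slot)
		return OUT_NOSLOT;

	/* Hidden TSS and identity page tables now fall through, even for L2. */
	return OUT_MAP;
}
```

In this model, an L2 access to the hidden TSS or the identity mapped page tables yields `OUT_MAP`, whereas the pre-patch `!kvm_is_visible_memslot(slot)` check would have sent it down the "no slot" path.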