From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini,
    erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack,
    Sean Christopherson
Subject: [PATCH v11 031/113] KVM: x86/mmu: Allow non-zero value for non-present SPTE and removed SPTE
Date: Thu, 12 Jan 2023 08:31:39 -0800
X-Mailer: git-send-email 2.25.1
X-Mailing-List: linux-kernel@vger.kernel.org

From: Sean Christopherson

For a TD guest, the current way of emulating MMIO no longer works, because
KVM cannot access the private memory of the TD guest to do the emulation.
Instead, the TD guest expects to receive a #VE when it accesses MMIO, and
then explicitly makes a hypercall to KVM to get the expected information.

To achieve this, the TDX module always enables "EPT-violation #VE" in the
VMCS control.  Accordingly, for the MMIO SPTE for a shared GPA:
1. KVM needs to set the "suppress #VE" bit in the non-present SPTE so that
   an EPT violation happens when the TD accesses the MMIO range.
2. On EPT violation, KVM sets the MMIO SPTE with the "suppress #VE" bit
   cleared so that the TD guest receives a #VE instead of an EPT
   misconfiguration, unlike the VMX case.

For a shared GPA that is not populated yet, an EPT violation needs to be
triggered when the TD guest accesses that GPA, so the non-present SPTE value
for a shared GPA should also set the "suppress #VE" bit.

Add the "suppress #VE" bit (bit 63) to SHADOW_NONPRESENT_VALUE and
REMOVED_SPTE.  Set the bit unconditionally for both AMD and Intel because:
1) AMD hardware doesn't use this bit when the present bit is off;
2) for a normal VMX guest, KVM never enables "EPT-violation #VE" in the VMCS
   control, so the "suppress #VE" bit is ignored by hardware.

Signed-off-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
---
 arch/x86/include/asm/vmx.h |  1 +
 arch/x86/kvm/mmu/spte.h    | 15 ++++++++++++++-
 arch/x86/kvm/mmu/tdp_mmu.c |  8 ++++++++
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 498dc600bd5c..cdbf12c1a83c 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -511,6 +511,7 @@ enum vmcs_field {
 #define VMX_EPT_IPAT_BIT			(1ull << 6)
 #define VMX_EPT_ACCESS_BIT			(1ull << 8)
 #define VMX_EPT_DIRTY_BIT			(1ull << 9)
+#define VMX_EPT_SUPPRESS_VE_BIT			(1ull << 63)
 #define VMX_EPT_RWX_MASK			(VMX_EPT_READABLE_MASK |	\
 						 VMX_EPT_WRITABLE_MASK |	\
 						 VMX_EPT_EXECUTABLE_MASK)
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index f190eaf6b2b5..471378ee9071 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -148,7 +148,20 @@ static_assert(MMIO_SPTE_GEN_LOW_BITS == 8 && MMIO_SPTE_GEN_HIGH_BITS == 11);
 
 #define MMIO_SPTE_GEN_MASK		GENMASK_ULL(MMIO_SPTE_GEN_LOW_BITS + MMIO_SPTE_GEN_HIGH_BITS - 1, 0)
 
+/*
+ * Non-present SPTE value for both VMX and SVM for TDP MMU.
+ * For SVM NPT, for non-present spte (bit 0 = 0), other bits are ignored.
+ * For VMX EPT, bit 63 is ignored if #VE is disabled. (EPT_VIOLATION_VE=0)
+ *              bit 63 is #VE suppress if #VE is enabled. (EPT_VIOLATION_VE=1)
+ * For TDX:
+ *   TDX module sets EPT_VIOLATION_VE for Secure-EPT and conventional EPT
+ */
+#ifdef CONFIG_X86_64
+#define SHADOW_NONPRESENT_VALUE	BIT_ULL(63)
+static_assert(!(SHADOW_NONPRESENT_VALUE & SPTE_MMU_PRESENT_MASK));
+#else
 #define SHADOW_NONPRESENT_VALUE	0ULL
+#endif
 
 extern u64 __read_mostly shadow_host_writable_mask;
 extern u64 __read_mostly shadow_mmu_writable_mask;
@@ -195,7 +208,7 @@ extern u64 __read_mostly shadow_nonpresent_or_rsvd_mask;
  *
  * Only used by the TDP MMU.
  */
-#define REMOVED_SPTE	0x5a0ULL
+#define REMOVED_SPTE	(SHADOW_NONPRESENT_VALUE | 0x5a0ULL)
 
 /* Removed SPTEs must not be misconstrued as shadow present PTEs. */
 static_assert(!(REMOVED_SPTE & SPTE_MMU_PRESENT_MASK));
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 9cf5844dd34a..6111e3e9266d 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -700,6 +700,14 @@ static inline int tdp_mmu_zap_spte_atomic(struct kvm *kvm,
 	 * overwrite the special removed SPTE value. No bookkeeping is needed
 	 * here since the SPTE is going from non-present to non-present.  Use
 	 * the raw write helper to avoid an unnecessary check on volatile bits.
+	 *
+	 * Set non-present value to SHADOW_NONPRESENT_VALUE, rather than 0.
+	 * It is because when TDX is enabled, TDX module always
+	 * enables "EPT-violation #VE", so KVM needs to set
+	 * "suppress #VE" bit in EPT table entries, in order to get
+	 * real EPT violation, rather than TDVMCALL.  KVM sets
+	 * SHADOW_NONPRESENT_VALUE (which sets "suppress #VE" bit) so it
+	 * can be set when EPT table entries are zapped.
 	 */
 	__kvm_tdp_mmu_write_spte(iter->sptep, SHADOW_NONPRESENT_VALUE);
 
-- 
2.25.1
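
The bit arithmetic behind the new constants can be checked outside the kernel. The following is a minimal userspace sketch, not part of the patch. It assumes the 64-bit (CONFIG_X86_64) definitions from the diff above, assumes that SPTE_MMU_PRESENT_MASK is bit 11 as in arch/x86/kvm/mmu/spte.h, and defines a local stand-in for BIT_ULL(). The helpers spte_is_mmu_present() and spte_suppresses_ve() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's BIT_ULL() so the sketch builds in userspace. */
#define BIT_ULL(n)			(1ULL << (n))

/* Assumption: the MMU-present software bit is bit 11, as in spte.h. */
#define SPTE_MMU_PRESENT_MASK		BIT_ULL(11)

/* From the patch: bit 63 of an EPT entry is the "suppress #VE" bit. */
#define VMX_EPT_SUPPRESS_VE_BIT		BIT_ULL(63)

/* From the patch, 64-bit case only. */
#define SHADOW_NONPRESENT_VALUE		BIT_ULL(63)
#define REMOVED_SPTE			(SHADOW_NONPRESENT_VALUE | 0x5a0ULL)

/* Same compile-time checks as in the patch. */
static_assert(!(SHADOW_NONPRESENT_VALUE & SPTE_MMU_PRESENT_MASK),
	      "non-present value must not look present");
static_assert(!(REMOVED_SPTE & SPTE_MMU_PRESENT_MASK),
	      "removed SPTE must not look present");

/* Hypothetical helper: true if the SPTE has the MMU-present bit set. */
static inline bool spte_is_mmu_present(uint64_t spte)
{
	return (spte & SPTE_MMU_PRESENT_MASK) != 0;
}

/* Hypothetical helper: true if the SPTE has "suppress #VE" (bit 63) set. */
static inline bool spte_suppresses_ve(uint64_t spte)
{
	return (spte & VMX_EPT_SUPPRESS_VE_BIT) != 0;
}
```

With these definitions both SHADOW_NONPRESENT_VALUE and REMOVED_SPTE carry bit 63 while remaining non-present, so an access to a zapped or removed entry would be reported to KVM as an EPT violation rather than delivered to the guest as a #VE. An all-zero SPTE, by contrast, leaves "suppress #VE" clear.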