From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: isaku.yamahata@intel.com, xiaoyao.li@intel.com,
    binbin.wu@linux.intel.com, chao.gao@intel.com, Sean Christopherson
Subject: [PATCH v2 03/10] KVM: x86/mmu: Allow non-zero value for non-present SPTE and removed SPTE
Date: Tue, 16 Apr 2024 16:19:28 -0400
Message-ID: <20240416201935.3525739-4-pbonzini@redhat.com>
In-Reply-To: <20240416201935.3525739-1-pbonzini@redhat.com>
References: <20240416201935.3525739-1-pbonzini@redhat.com>

From: Sean Christopherson

For a TD guest, the current way of emulating MMIO no longer works, because
KVM cannot access the TD guest's private memory to perform the emulation.
Instead, the TD guest expects to receive a #VE when it accesses MMIO, and
it can then explicitly make a hypercall to KVM to get the expected
information.

To achieve this, the TDX module always enables "EPT-violation #VE" in the
VMCS control. Accordingly, for the MMIO SPTE of a shared GPA:

1. KVM needs to set the "suppress #VE" bit in the non-present SPTE so that
   an EPT violation happens when the TD accesses the MMIO range.
2. On EPT violation, KVM sets the MMIO SPTE with the "suppress #VE" bit
   cleared, so that the TD guest receives a #VE instead of an EPT
   misconfiguration, unlike the VMX case.

For a shared GPA that is not populated yet, an EPT violation needs to be
triggered when the TD guest accesses it. The non-present SPTE value for a
shared GPA should therefore set the "suppress #VE" bit.

Add the "suppress #VE" bit (bit 63) to SHADOW_NONPRESENT_VALUE and
REMOVED_SPTE. Set the bit unconditionally for both AMD and Intel because:

1) AMD hardware doesn't use this bit when the present bit is off;
2) for a normal VMX guest, KVM never enables "EPT-violation #VE" in the
   VMCS control, so the "suppress #VE" bit is ignored by hardware.

Signed-off-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
Reviewed-by: Binbin Wu
Reviewed-by: Xiaoyao Li
Message-Id:
Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/mmu/paging_tmpl.h | 12 ++++++------
 arch/x86/kvm/mmu/spte.c        | 14 +++++++-------
 arch/x86/kvm/mmu/spte.h        | 16 +++++++++++++++-
 3 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index bebd73cd61bb..9aac3aa93d88 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -933,13 +933,13 @@ static int FNAME(sync_spte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, int
 		return 0;
 
 	/*
-	 * Drop the SPTE if the new protections would result in a RWX=0
-	 * SPTE or if the gfn is changing. The RWX=0 case only affects
-	 * EPT with execute-only support, i.e. EPT without an effective
-	 * "present" bit, as all other paging modes will create a
-	 * read-only SPTE if pte_access is zero.
+	 * Drop the SPTE if the new protections result in no effective
+	 * "present" bit or if the gfn is changing. The former case
+	 * only affects EPT with execute-only support with pte_access==0;
+	 * all other paging modes will create a read-only SPTE if
+	 * pte_access is zero.
 	 */
-	if ((!pte_access && !shadow_present_mask) ||
+	if ((pte_access | shadow_present_mask) == SHADOW_NONPRESENT_VALUE ||
 	    gfn != kvm_mmu_page_get_gfn(sp, i)) {
 		drop_spte(vcpu->kvm, &sp->spt[i]);
 		return 1;
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 6c7ab3aa6aa7..768aaeddf5fa 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -144,19 +144,19 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	u64 spte = SPTE_MMU_PRESENT_MASK;
 	bool wrprot = false;
 
-	WARN_ON_ONCE(!pte_access && !shadow_present_mask);
+	/*
+	 * For the EPT case, shadow_present_mask has no RWX bits set if
+	 * exec-only page table entries are supported. In that case,
+	 * ACC_USER_MASK and shadow_user_mask are used to represent
+	 * read access. See FNAME(gpte_access) in paging_tmpl.h.
+	 */
+	WARN_ON_ONCE((pte_access | shadow_present_mask) == SHADOW_NONPRESENT_VALUE);
 
 	if (sp->role.ad_disabled)
 		spte |= SPTE_TDP_AD_DISABLED;
 	else if (kvm_mmu_page_ad_need_write_protect(sp))
 		spte |= SPTE_TDP_AD_WRPROT_ONLY;
 
-	/*
-	 * For the EPT case, shadow_present_mask is 0 if hardware
-	 * supports exec-only page table entries. In that case,
-	 * ACC_USER_MASK and shadow_user_mask are used to represent
-	 * read access. See FNAME(gpte_access) in paging_tmpl.h.
-	 */
 	spte |= shadow_present_mask;
 	if (!prefetch)
 		spte |= spte_shadow_accessed_mask(spte);
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 0f4ec2859474..8056b7853a79 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -149,7 +149,21 @@ static_assert(MMIO_SPTE_GEN_LOW_BITS == 8 && MMIO_SPTE_GEN_HIGH_BITS == 11);
 
 #define MMIO_SPTE_GEN_MASK		GENMASK_ULL(MMIO_SPTE_GEN_LOW_BITS + MMIO_SPTE_GEN_HIGH_BITS - 1, 0)
 
+/*
+ * Non-present SPTE value needs to set bit 63 for TDX, in order to suppress
+ * #VE and get EPT violations on non-present PTEs. We can use the
+ * same value also without TDX for both VMX and SVM:
+ *
+ * For SVM NPT, for non-present spte (bit 0 = 0), other bits are ignored.
+ * For VMX EPT, bit 63 is ignored if #VE is disabled. (EPT_VIOLATION_VE=0)
+ *              bit 63 is #VE suppress if #VE is enabled. (EPT_VIOLATION_VE=1)
+ */
+#ifdef CONFIG_X86_64
+#define SHADOW_NONPRESENT_VALUE	BIT_ULL(63)
+static_assert(!(SHADOW_NONPRESENT_VALUE & SPTE_MMU_PRESENT_MASK));
+#else
 #define SHADOW_NONPRESENT_VALUE	0ULL
+#endif
 
 extern u64 __read_mostly shadow_host_writable_mask;
 extern u64 __read_mostly shadow_mmu_writable_mask;
@@ -192,7 +206,7 @@ extern u64 __read_mostly shadow_nonpresent_or_rsvd_mask;
  *
  * Use a semi-arbitrary value that doesn't set RWX bits, i.e. is not-present on
  * both AMD and Intel CPUs, and doesn't set PFN bits, i.e. doesn't create a L1TF
- * vulnerability. Use only low bits to avoid 64-bit immediates.
+ * vulnerability.
  *
  * Only used by the TDP MMU.
  */
-- 
2.43.0
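
Illustration only, not part of the patch: a minimal userspace sketch of why a non-zero non-present value is safe as long as presence is tested through the MMU-present bit rather than by comparing against zero. The macros below are stand-ins and not the kernel headers; placing SPTE_MMU_PRESENT_MASK at bit 11 is an assumption for the example, and the demo_* helpers are hypothetical names that do not exist in KVM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace stand-ins for the kernel macros; these are not the kernel headers.
 * Bit 11 for SPTE_MMU_PRESENT_MASK is an assumption made for illustration.
 */
#define BIT_ULL(n)		(1ULL << (n))
#define SPTE_MMU_PRESENT_MASK	BIT_ULL(11)
#define SHADOW_NONPRESENT_VALUE	BIT_ULL(63)	/* EPT "suppress #VE" bit */
#define EPT_RWX_MASK		0x7ULL		/* EPT read/write/execute, bits 2:0 */

/* Mirrors the compile-time check the patch adds to spte.h. */
static_assert(!(SHADOW_NONPRESENT_VALUE & SPTE_MMU_PRESENT_MASK),
	      "non-present value must not look present to the MMU");

/* Hypothetical helper: presence is decided by the MMU-present bit only. */
static inline bool demo_is_present(uint64_t spte)
{
	return (spte & SPTE_MMU_PRESENT_MASK) != 0;
}

/* Hypothetical helper: a test that wrongly assumes "non-present == 0". */
static inline bool demo_is_nonzero(uint64_t spte)
{
	return spte != 0;
}

/*
 * Hypothetical helper: with RWX all clear, EPT treats the entry as not
 * present (ignoring mode-based execute control for simplicity).
 */
static inline bool demo_hw_not_present(uint64_t spte)
{
	return (spte & EPT_RWX_MASK) == 0;
}
```

With these definitions, demo_is_present(SHADOW_NONPRESENT_VALUE) is false and demo_hw_not_present(SHADOW_NONPRESENT_VALUE) is true, while demo_is_nonzero(SHADOW_NONPRESENT_VALUE) is true. The last result is why the "!pte_access && !shadow_present_mask" tests are rewritten as comparisons against SHADOW_NONPRESENT_VALUE.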