Received: by 2002:ab2:710b:0:b0:1ef:a325:1205 with SMTP id z11csp1119659lql; Tue, 12 Mar 2024 07:51:22 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCVUYBq0kEkSB80nir8AyHNLEpErC3Y3hJkM9UtDo5Nwca+20XKHPf9rkrGIEERI6K3r67SAHCAPIgHUFrTNBfMRkaLTTMmYFwaOsk6YZQ== X-Google-Smtp-Source: AGHT+IGE1dwnZgo104Xs/yvayW9uUn5ooC/tIo31spttg8yj9mDD+NAZFfu7EXAYUbvkg2mJg5lg X-Received: by 2002:a05:620a:22f4:b0:788:22e4:aec5 with SMTP id p20-20020a05620a22f400b0078822e4aec5mr3108249qki.57.1710255081782; Tue, 12 Mar 2024 07:51:21 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1710255081; cv=pass; d=google.com; s=arc-20160816; b=gtMw0DKM+1TVFkVWlNIlm8ZghXnFRQmeJzG0kMWeAv1hzwSP+AnUkpCb6n1scIdBRy Hk8xFc0MEgYcOOto87pPOLzmzHVXr/VnODg0Fzn02a2MkHJVQAri/SJ7H9kHHP9mZfZH J9pGQxooFxVGZ+nEa+MQ78fMIfOm00HlUnXNYiPU4Vk2Eds3Mv17sdH6ahfvEGO4L9wg kdnUjHGnCK4gewcDXiq6/ti1XQcCJqF9n6dxjUT+wK8o1oJflX/TBczrCLYfMjfDjHjF TtBuKoyJ03tZ9pWkZlByB+Osb4l7B6keEoFEG0AmRWrDPasxGB7L7q88kAM6WqBKEB6l Jr/g== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:in-reply-to:from:references:cc:to:subject :user-agent:mime-version:list-unsubscribe:list-subscribe:list-id :precedence:date:message-id:dkim-signature; bh=4UfDQT3f9gGslw1lHRgral7m7g6QVZuMTBbDsYz8xC0=; fh=8scsGQn+hJ1HvbMI23kGOdGh/wtWpH4wdCi8Q1zCFc4=; b=Jl+sBFuvD1CXhg6KrjRBzxBchmSduwkbT01NO9wKtX2ULRIReC+cigRMD7swK1drnI 9KStdj00Zx/2dFy7aG+BfrvbS5IbBBKABiphynE+EPrkVlSpE7yAXvubl/qGC/LTO5Ue nQXmadsgDu5I/cZ83Uni84N/pvP0p1HHjQtk7uWaNvm6wLPXkxdn3VK+9fjh1BwNVzW2 ezJFr8ByBUNPvDlrYwJ5XGJ/tx8yDtJnQIKTkzpQF6+t3qGZXpftRv34rkP3lPCKk2aX lioLIq/9knvF140TKjOLBm6QFk097CdAuuhF3t1nKoH43ZTh9rklKf5zZTn/liriNSF5 34/g==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=TGMb3hDc; arc=pass (i=1 dkim=pass dkdomain=intel.com dmarc=pass fromdomain=linux.intel.com); spf=pass (google.com: domain of linux-kernel+bounces-100295-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-100295-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [147.75.199.223]) by mx.google.com with ESMTPS id bk30-20020a05620a1a1e00b007882e4e5330si8151667qkb.267.2024.03.12.07.51.21 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 12 Mar 2024 07:51:21 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-100295-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) client-ip=147.75.199.223; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=TGMb3hDc; arc=pass (i=1 dkim=pass dkdomain=intel.com dmarc=pass fromdomain=linux.intel.com); spf=pass (google.com: domain of linux-kernel+bounces-100295-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-100295-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 5DF821C2186D for ; Tue, 12 Mar 2024 13:33:50 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id DD6507A709; Tue, 12 Mar 2024 13:33:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="TGMb3hDc" Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.13]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 38CD779B91; Tue, 12 Mar 2024 13:33:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.13 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710250421; cv=none; b=Ff9qAM/bGMopZO3mkolmTnykdEMiVyrjT4s2vrUvJhV9lT5oiklwpZQdQyIbH1aFdryw4LZjUObFlMxvHoq4pqX/L7sOXx4qGcTDM6T9vUpn062r6KDuCsqEI3cnRcHseSnD1cZFuYSrAN+mnnHf7L4H/j84koR61SXWcQGx3qc= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710250421; c=relaxed/simple; bh=bzbXVrzjpIxE+1u40J+dNUpW+UEPjzDf9Zam2JO9mbg=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=W7OMJlRc5A+6ryoQToCylNmlNnotxoCtDyoNJOye0U+vc5JCCX8zN1MvR+gyK60lUPPaVZynZr+M7HqPIpbw96bs3P0KxVgfl4BOZ0Ilk47gCE0YYk8L/xrRd5cWHkiXvX/Mu8RfX25DXe4ROJPueU5c99CH3kammQQcB8bE6hk= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=TGMb3hDc; arc=none smtp.client-ip=192.198.163.13 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1710250419; x=1741786419; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=bzbXVrzjpIxE+1u40J+dNUpW+UEPjzDf9Zam2JO9mbg=; b=TGMb3hDc4IxefPv9idXFn/J2m2h6p+yTRWY8wLwLGGzqpypqEeWxB7Jv MJ1RnAvBKYW3BKjTExz/1MdBpw8blyW7bK3TonXbjr7B9iId//hiTiGxs c/p3uq6NZhUoNBu4D1BgaRGsWg3o5A8LfBTisMKbgjNHQcEPwjCRUsKPf 08s0nFFiD32iC+5x5dsOmCqLKyKF4e4uZ+Vjr4T9HfUQi4EXmWeBdr/Tp wvnwl5Spyw5Eevjvycxk97cjZvMWBazOqoMGKPvDon3QGhPZhU16QJ8Cb DTfJMz5C8K8E16IOxgktpm2RQ2lziV8r4Txfa/fW14FcTqjkFrsMRlMyK Q==; X-IronPort-AV: E=McAfee;i="6600,9927,11010"; a="7904350" X-IronPort-AV: E=Sophos;i="6.07,119,1708416000"; d="scan'208";a="7904350" Received: from orviesa001.jf.intel.com ([10.64.159.141]) by fmvoesa107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Mar 2024 06:33:38 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.07,119,1708416000"; d="scan'208";a="48978613" Received: from binbinwu-mobl.ccr.corp.intel.com (HELO [10.125.242.247]) ([10.125.242.247]) by smtpauth.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Mar 2024 06:33:35 -0700 Message-ID: <2daf03ae-6b5a-44ae-806e-76d09fb5273b@linux.intel.com> Date: Tue, 12 Mar 2024 21:33:31 +0800 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v19 011/130] KVM: Add new members to struct kvm_gfn_range to operate on To: isaku.yamahata@intel.com Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , Kai Huang , chen.bo@intel.com, hang.yuan@intel.com, tina.zhang@intel.com References: From: Binbin Wu In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit On 2/26/2024 4:25 PM, isaku.yamahata@intel.com wrote: > From: Isaku Yamahata > > Add new members to strut kvm_gfn_range to indicate which mapping > (private-vs-shared) to operate on. only_private and only_shared. Update > mmu notifier, set memory attributes ioctl or KVM gmem callback to > initialize them. > > It was premature for set_memory_attributes ioctl to call > kvm_unmap_gfn_range(). Instead, let kvm_arch_ste_memory_attributes() "kvm_arch_ste_memory_attributes()" -> "kvm_vm_set_mem_attributes()" ? > handle it and add a new x86 vendor callback to react to memory attribute > change. [1] Which new x86 vendor callback? > > - If it's from the mmu notifier, zap shared pages only > - If it's from the KVM gmem, zap private pages only > - If setting memory attributes, vendor callback checks new attributes > and make decisions. > SNP would do nothing and handle it later with gmem callback > TDX callback would do as follows. > When it converts pages to shared, zap private pages only. > When it converts pages to private, zap shared pages only. > > TDX needs to know which mapping to operate on. Shared-EPT vs. Secure-EPT. > The following sequence to convert the GPA to private doesn't work for TDX > because the page can already be private. > > 1) Update memory attributes to private in memory attributes xarray > 2) Zap the GPA range irrespective of private-or-shared. > Even if the page is already private, zap the entry. > 3) EPT violation on the GPA > 4) Populate the GPA as private > The page is zeroed, and the guest has to accept the page again. > > In step 2, TDX wants to zap only shared pages and skip private ones. > > [1] https://lore.kernel.org/all/ZJX0hk+KpQP0KUyB@google.com/ > > Suggested-by: Sean Christopherson > Signed-off-by: Isaku Yamahata > > --- > Changes v18: > - rebased to kvm-next > > Changes v2 -> v3: > - Drop the KVM_GFN_RANGE flags > - Updated struct kvm_gfn_range > - Change kvm_arch_set_memory_attributes() to return bool for flush > - Added set_memory_attributes x86 op for vendor backends > - Refined commit message to describe TDX care concretely > > Changes v1 -> v2: > - consolidate KVM_GFN_RANGE_FLAGS_GMEM_{PUNCH_HOLE, RELEASE} into > KVM_GFN_RANGE_FLAGS_GMEM. > - Update the commit message to describe TDX more. Drop SEV_SNP. > > Signed-off-by: Isaku Yamahata > --- > include/linux/kvm_host.h | 2 ++ > virt/kvm/guest_memfd.c | 3 +++ > virt/kvm/kvm_main.c | 17 +++++++++++++++++ > 3 files changed, 22 insertions(+) > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > index 7e7fd25b09b3..0520cd8d03cc 100644 > --- a/include/linux/kvm_host.h > +++ b/include/linux/kvm_host.h > @@ -264,6 +264,8 @@ struct kvm_gfn_range { > gfn_t start; > gfn_t end; > union kvm_mmu_notifier_arg arg; > + bool only_private; > + bool only_shared; IMO, an enum will be clearer than the two flags.     enum {         PROCESS_PRIVATE_AND_SHARED,         PROCESS_ONLY_PRIVATE,         PROCESS_ONLY_SHARED,     }; > bool may_block; > }; > bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range); > diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c > index 0f4e0cf4f158..3830d50b9b67 100644 > --- a/virt/kvm/guest_memfd.c > +++ b/virt/kvm/guest_memfd.c > @@ -64,6 +64,9 @@ static void kvm_gmem_invalidate_begin(struct kvm_gmem *gmem, pgoff_t start, > .end = slot->base_gfn + min(pgoff + slot->npages, end) - pgoff, > .slot = slot, > .may_block = true, > + /* guest memfd is relevant to only private mappings. */ > + .only_private = true, > + .only_shared = false, > }; > > if (!found_memslot) { > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > index 10bfc88a69f7..0349e1f241d1 100644 > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -634,6 +634,12 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm, > */ > gfn_range.arg = range->arg; > gfn_range.may_block = range->may_block; > + /* > + * HVA-based notifications aren't relevant to private > + * mappings as they don't have a userspace mapping. > + */ > + gfn_range.only_private = false; > + gfn_range.only_shared = true; > > /* > * {gfn(page) | page intersects with [hva_start, hva_end)} = > @@ -2486,6 +2492,16 @@ static __always_inline void kvm_handle_gfn_range(struct kvm *kvm, > gfn_range.arg = range->arg; > gfn_range.may_block = range->may_block; > > + /* > + * If/when KVM supports more attributes beyond private .vs shared, this > + * _could_ set only_{private,shared} appropriately if the entire target > + * range already has the desired private vs. shared state (it's unclear > + * if that is a net win). For now, KVM reaches this point if and only > + * if the private flag is being toggled, i.e. all mappings are in play. > + */ > + gfn_range.only_private = false; > + gfn_range.only_shared = false; > + > for (i = 0; i < kvm_arch_nr_memslot_as_ids(kvm); i++) { > slots = __kvm_memslots(kvm, i); > > @@ -2542,6 +2558,7 @@ static int kvm_vm_set_mem_attributes(struct kvm *kvm, gfn_t start, gfn_t end, > struct kvm_mmu_notifier_range pre_set_range = { > .start = start, > .end = end, > + .arg.attributes = attributes, > .handler = kvm_pre_set_memory_attributes, > .on_lock = kvm_mmu_invalidate_begin, > .flush_on_ret = true,