Received: by 2002:ab2:6c55:0:b0:1fd:c486:4f03 with SMTP id v21csp650917lqp; Wed, 12 Jun 2024 11:56:19 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCUanibj557jQkZkxh2xYFZ6/jp+C+8P4Zj0EJ8IRpaJDViv1GbhcdC/GIeW0HS4EvMhuoGVykKvVcaJ/08DL7AhWhlIickqQMegIu41iw== X-Google-Smtp-Source: AGHT+IEQamK1wivHhyRtRwoxkaMi2EJNXrWcqYzJoJpC/U7kO41OXggYWNlp16pzC7iVJimNfBTE X-Received: by 2002:a17:907:2d28:b0:a6f:fbe:bc22 with SMTP id a640c23a62f3a-a6f47c9b57emr211040466b.15.1718218579120; Wed, 12 Jun 2024 11:56:19 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1718218579; cv=pass; d=google.com; s=arc-20160816; b=xKrWmkEx2hio0x0RBqbHdZYbxEdInoPgSjdOUwtXJ/kSZXb5+wM5QdKV+sN8XnKU8w PpDpisZOtofn5+Prc6VqtuK+u9xh7GpPrOVkLdb80fm8aOZp4rvHhiIYia4t8FLi0q6G C8EBp8VzoufWIJAOIatBpGhtCuzyrC9uHy8PT2qYChXnXSCBaV1cLnfe9SAmsTUqfjch Uf/cnd3ao9GKxtCXH2FjvQsWoWl0RnWlVQMnOqfgw2kfl1FqA474DbWPEyUeIMwrQhcx 0lTZ9Hei3iaA427Cd+W9zonQpkmXaWsM7uMzBM0c8FDYvQ57uZb9mJMXkz3hsQpFoUPE /8Pw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:list-unsubscribe:list-subscribe:list-id:precedence :references:message-id:subject:cc:to:from:date:dkim-signature; bh=KaxtRXH8AGZdcOWl2Asd0BwXsoyjtk9xMoDHIzMGTCo=; fh=6HAu122piWIUzoAQ2M419baelUibhUtQH6bzSqq2sDM=; b=sUpsp34SW25q/IOE7GLv2UVeKeKsBOT+5i0uocwEOrarMXvLhKpQmebgULd0St/Dry hL10kuZ8cLOz0DvKA9RavMjsJjwGU/kqfjVaSFQKXKXIub6oK96tJzqgAnblFtlS+eHH cMgGuZYTk7zzw9LbeBVDZHbhxOlH92FU8d+33B0vUU5tMrArwhxdxBmUq06W/blAuRFg KVXSnniAqvTvJJ4XaFQLeVwvGYz14RQc6FYsZ01ov28aH0qmV1gXFSk04ogKFd+vxf94 o55eNAutc6Ou9J+Hb9gOpjKf47lUlUxUbZ8N1ZRrvfy68OIzwm3EZ23aWyczkK+doN4a JN6Q==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="k+VxNY9/"; arc=pass (i=1 spf=pass spfdomain=intel.com dkim=pass dkdomain=intel.com dmarc=pass fromdomain=intel.com); spf=pass (google.com: domain of linux-kernel+bounces-212130-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) smtp.mailfrom="linux-kernel+bounces-212130-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [147.75.80.249]) by mx.google.com with ESMTPS id a640c23a62f3a-a6c8074950csi689809866b.954.2024.06.12.11.56.19 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 12 Jun 2024 11:56:19 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-212130-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) client-ip=147.75.80.249; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="k+VxNY9/"; arc=pass (i=1 spf=pass spfdomain=intel.com dkim=pass dkdomain=intel.com dmarc=pass fromdomain=intel.com); spf=pass (google.com: domain of linux-kernel+bounces-212130-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) smtp.mailfrom="linux-kernel+bounces-212130-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id A0A9E1F21AA3 for ; Wed, 12 Jun 2024 18:56:18 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 5B2A97E574; Wed, 12 Jun 2024 18:56:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="k+VxNY9/" Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.13]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 708525FB8A; Wed, 12 Jun 2024 18:56:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.13 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718218567; cv=none; b=lxsfbZwwNHjdzY5iooeRc7uVN+Mz/kgz5z7jJ8PPpzIHW/mIuu0r2atYRAavg0bGI03inPZyuS668Y/mHkLrg6W0u4DpEc25cl09DZGOE72PlvOClfFMIbHVU1QU4fV8coL3jcnhgKnTMX1z2DyML3LA4VlqIUqBqlXgcjOC26k= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718218567; c=relaxed/simple; bh=fE9vto68Rt85S2SZXDTIB7EibUnpLIdueFbtiR5ca+s=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=HsMGA5G+0b5eLQ0CoCK3pbMabyzI46XXcdrBQLp9U6itQ5V3ZyTnOueiO6zdlcCwWKddw0gUBL5tAIzNA43SojWrP0ey9v2zz8o8a9W/g+6a8C2OAe5mRJRGCK4rZlDZDTddS2SobNZ9lKrex+ZXJWUjk0MDxhePrdoaClMwBgg= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=k+VxNY9/; arc=none smtp.client-ip=192.198.163.13 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1718218565; x=1749754565; h=date:from:to:cc:subject:message-id:references: mime-version:content-transfer-encoding:in-reply-to; bh=fE9vto68Rt85S2SZXDTIB7EibUnpLIdueFbtiR5ca+s=; b=k+VxNY9/2qqOFeDODnqr5YFFkuQ4SmdsMnJYZUOG6P5EHMDrzXjSu/OU SzS5H7WCWcZadxts6khUyuSQYGwSvPyqLfMKpvqkohEMtNzpVz5Q0qK5h kLvOZ7IJd+vby2i7RIIUgC2825xgkL8jWjGQrnRRlhGSg6SIWNxIZsrSV SYOEbV20QvtjUngw0I5OxG/jSWjmmT8e7m7t8RZGLsEzIJ8W0HyqVb7w1 V9v1ZJC44b3gEXOuvDTT7uQuuovWFpwH+e97zzjtqgIcLv7/u9tZ7tA70 2AEIgtu7RSScNWdqvwng1I+o/ZFpoq3KQ1vARDrRaXDu8K5dRZJzu/wGm A==; X-CSE-ConnectionGUID: ungPepIHQ3Wc9B/cj+yWnA== X-CSE-MsgGUID: rUZUJ+6pQhOGqd7IDaN05w== X-IronPort-AV: E=McAfee;i="6700,10204,11101"; a="17927199" X-IronPort-AV: E=Sophos;i="6.08,234,1712646000"; d="scan'208";a="17927199" Received: from orviesa006.jf.intel.com ([10.64.159.146]) by fmvoesa107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jun 2024 11:56:04 -0700 X-CSE-ConnectionGUID: kAKJKxBhThSMMnn3RVpG1g== X-CSE-MsgGUID: lQKMqU64TvaTvNOzQvS+Xg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.08,234,1712646000"; d="scan'208";a="40357216" Received: from ls.sc.intel.com (HELO localhost) ([172.25.112.54]) by orviesa006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jun 2024 11:56:04 -0700 Date: Wed, 12 Jun 2024 11:56:03 -0700 From: Isaku Yamahata To: "Edgecombe, Rick P" Cc: "pbonzini@redhat.com" , "seanjc@google.com" , "Huang, Kai" , "sagis@google.com" , "linux-kernel@vger.kernel.org" , "Aktas, Erdem" , "Zhao, Yan Y" , "dmatlack@google.com" , "kvm@vger.kernel.org" , "Yamahata, Isaku" , "isaku.yamahata@gmail.com" Subject: Re: [PATCH v2 15/15] KVM: x86/tdp_mmu: Add a helper function to walk down the TDP MMU Message-ID: <20240612185603.GK386318@ls.amr.corp.intel.com> References: <20240530210714.364118-1-rick.p.edgecombe@intel.com> <20240530210714.364118-16-rick.p.edgecombe@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: On Fri, Jun 07, 2024 at 11:39:14PM +0000, "Edgecombe, Rick P" wrote: > On Fri, 2024-06-07 at 11:31 +0200, Paolo Bonzini wrote: > > > -int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, > > > -                        int *root_level) > > > +static int __kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 > > > *sptes, > > > +                                 enum kvm_tdp_mmu_root_types root_type) > > >   { > > > -       struct kvm_mmu_page *root = root_to_sp(vcpu->arch.mmu->root.hpa); > > > +       struct kvm_mmu_page *root = tdp_mmu_get_root(vcpu, root_type); > > > > I think this function should take the struct kvm_mmu_page * directly. > > > > > +{ > > > +       *root_level = vcpu->arch.mmu->root_role.level; > > > + > > > +       return __kvm_tdp_mmu_get_walk(vcpu, addr, sptes, KVM_DIRECT_ROOTS); > > > > Here you pass root_to_sp(vcpu->arch.mmu->root.hpa); > > I see. It is another case of more indirection to try to send the decision making > through the helpers. We can try to open code things more. > > > > > > +int kvm_tdp_mmu_get_walk_mirror_pfn(struct kvm_vcpu *vcpu, u64 gpa, > > > +                                    kvm_pfn_t *pfn) > > > +{ > > > +       u64 sptes[PT64_ROOT_MAX_LEVEL + 1], spte; > > > +       int leaf; > > > + > > > +       lockdep_assert_held(&vcpu->kvm->mmu_lock); > > > + > > > +       rcu_read_lock(); > > > +       leaf = __kvm_tdp_mmu_get_walk(vcpu, gpa, sptes, KVM_MIRROR_ROOTS); > > > > and likewise here. > > > > You might also consider using a kvm_mmu_root_info for the mirror root, > > even though the pgd field is not used. > > This came up on the last version actually. The reason against it was that it > used that tiny bit of extra memory for the pgd. It does look more symmetrical > though. > > > > > Then __kvm_tdp_mmu_get_walk() can take a struct kvm_mmu_root_info * instead. > > Ahh, I see. Yes, that's a good reason. > > > > > kvm_tdp_mmu_get_walk_mirror_pfn() doesn't belong in this series, but > > introducing __kvm_tdp_mmu_get_walk() can stay here. > > Ok, we can split it. > > > > > Looking at the later patch, which uses > > kvm_tdp_mmu_get_walk_mirror_pfn(), I think this function is a bit > > overkill. I'll do a pre-review of the init_mem_region function, > > especially the usage of kvm_gmem_populate: > > > > +    slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn); > > +    if (!kvm_slot_can_be_private(slot) || !kvm_mem_is_private(kvm, gfn)) { > > +        ret = -EFAULT; > > +        goto out_put_page; > > +    } > > > > The slots_lock is taken, so checking kvm_slot_can_be_private is unnecessary. > > > > Checking kvm_mem_is_private perhaps should also be done in > > kvm_gmem_populate() itself. I'll send a patch. > > > > +    read_lock(&kvm->mmu_lock); > > + > > +    ret = kvm_tdp_mmu_get_walk_mirror_pfn(vcpu, gpa, &mmu_pfn); > > +    if (ret < 0) > > +        goto out; > > +    if (ret > PG_LEVEL_4K) { > > +        ret = -EINVAL; > > +        goto out; > > +    } > > +    if (mmu_pfn != pfn) { > > +        ret = -EAGAIN; > > +        goto out; > > +    } > > > > If you require pre-faulting, you don't need to return mmu_pfn and > > things would be seriously wrong if the two didn't match, wouldn't > > they? > > Yea, I'm not sure why it would be a normal condition. Maybe Isaku can comment on > the thinking? Sean suggested for KVM_TDX_INIT_MEM_REGION to check if the two pfn from TDP MMU and guest_memfd match. As pointed out, the two PFNs should match under appropriate lock (or heavily broken). Personally I'm fine to remove such check and to avoid returning pfn. https://lore.kernel.org/kvm/Ze-TJh0BBOWm9spT@google.com/ Then KVM provides a dedicated TDX ioctl(), i.e. what is/was KVM_TDX_INIT_MEM_REGION, to do PAGE.ADD. KVM_TDX_INIT_MEM_REGION wouldn't need to map anything, it would simply need to verify that the pfn from guest_memfd() is the same as what's in the TDP MMU. -- Isaku Yamahata