Date: Fri, 6 Aug 2021 09:59:22 +1200
From: Kai Huang
To: Sean Christopherson
Cc: isaku.yamahata@intel.com, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 "H. Peter Anvin", Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
 Joerg Roedel, erdemaktas@google.com, Connor Kuehl, x86@kernel.org,
 linux-kernel@vger.kernel.org, kvm@vger.kernel.org, isaku.yamahata@gmail.com,
 Rick Edgecombe
Subject: Re: [RFC PATCH v2 41/69] KVM: x86: Add infrastructure for stolen GPA bits
Message-Id: <20210806095922.6e2ca6587dc6f5b4fe8d52e7@intel.com>
References: <20210805234424.d14386b79413845b990a18ac@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 5 Aug 2021 16:06:41 +0000 Sean Christopherson wrote:

> On Thu, Aug 05, 2021, Kai Huang wrote:
> > On Fri, 2 Jul 2021 15:04:47 -0700 isaku.yamahata@intel.com wrote:
> > > From: Rick Edgecombe
> > >
> > > @@ -2020,6 +2032,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
> > >  	sp = kvm_mmu_alloc_page(vcpu, direct);
> > >
> > >  	sp->gfn = gfn;
> > > +	sp->gfn_stolen_bits = gfn_stolen_bits;
> > >  	sp->role = role;
> > >  	hlist_add_head(&sp->hash_link, sp_list);
> > >  	if (!direct) {
> > > @@ -2044,6 +2057,13 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
> > >  	return sp;
> > >  }
> >
> > Sorry for replying to an old thread,
>
> Ha, one month isn't old, it's barely even mature.
>
> > but to me it looks weird to have gfn_stolen_bits
> > in 'struct kvm_mmu_page'. If I understand correctly, the above code basically
> > means that GFNs with different stolen bits will have different 'struct
> > kvm_mmu_page', but in the context of this patch, mappings with different
> > stolen bits still use the same root,
>
> You're conflating "mapping" with "PTE". The GFN is a per-PTE value. Yes, there
> is a final GFN that is representative of the mapping, but more directly the final
> GFN is associated with the leaf PTE.
>
> TDX effectively adds the restriction that all PTEs used for a mapping must have
> the same shared/private status, so mapping and PTE are somewhat interchangeable
> when talking about stolen bits (the shared bit), but in the context of this patch,
> the stolen bits are a property of the PTE.

Yes, it is a property of the PTE, and that is exactly why I think it's weird to
have stolen bits in 'struct kvm_mmu_page'. Doesn't having stolen bits in
'struct kvm_mmu_page' imply that all PTEs (whether leaf or not) share the same
stolen bits?

>
> Back to your statement, it's incorrect. PTEs (effectively mappings in TDX) with
> different stolen bits will _not_ use the same root. kvm_mmu_get_page() includes
> the stolen bits in both the hash lookup and in the comparison, i.e. restores the
> stolen bits when looking for an existing shadow page at the target GFN.
>
> @@ -1978,9 +1990,9 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
> 		role.quadrant = quadrant;
> 	}
>
> -	sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)];
> +	sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn_and_stolen)];
> 	for_each_valid_sp(vcpu->kvm, sp, sp_list) {
> -		if (sp->gfn != gfn) {
> +		if ((sp->gfn | sp->gfn_stolen_bits) != gfn_and_stolen) {
> 			collisions++;
> 			continue;
> 		}

This only works for non-root tables, but there is only a single
vcpu->arch.mmu->root_hpa; we don't keep an array with one root per stolen-bit
value (i.e. loop in mmu_alloc_direct_roots()), so effectively all stolen bits
share one single root.

> > which means gfn_stolen_bits doesn't make a lot of sense at least for root
> > page table.
>
> It does make sense, even without a follow-up patch. In Rick's original series,
> stealing a bit for execute-only guest memory, there was only a single root. And
> except for TDX, there can only ever be a single root because the shared EPTP isn't
> usable, i.e. there's only the regular/private EPTP.
>
> > Instead, having gfn_stolen_bits in 'struct kvm_mmu_page' only makes sense in
> > the context of TDX, since TDX requires two separate roots for private and
> > shared mappings.
> >
> > So given we cannot tell whether the same root, or different roots, should be
> > used for different stolen bits, I think we should not add 'gfn_stolen_bits' to
> > 'struct kvm_mmu_page' and use it to determine whether to allocate a new table
> > for the same GFN, but should use a new role (i.e. role.private) to determine.
>
> A new role would work, too, but it has the disadvantage of not automagically
> working for all uses of stolen bits, e.g. XO support would have to add another
> role bit.

A new role bit can be defined for each use of a particular stolen bit. For
instance, in __direct_map(), if you see the stolen bit is the TDX shared bit,
you don't set role.private (otherwise you set role.private). For XO, if you see
the stolen bit is the XO bit, you set role.xo. We already have the
'gfn_stolen_mask' info in the vcpu, so we just need to make sure all code paths
can find the actual stolen bits based on sp->role and the vcpu (I haven't gone
through all of them, though; I assume the annoying part is the rmap).

> > And removing 'gfn_stolen_bits' in 'struct kvm_mmu_page' could also save some
> > memory.
>
> But I do like saving memory... One potentially bad idea would be to unionize
> gfn and stolen bits by shifting the stolen bits after they're extracted from the
> gpa, e.g.
>
> 	union {
> 		gfn_t gfn_and_stolen;
> 		struct {
> 			gfn_t gfn:52;
> 			gfn_t stolen:12;
> 		};
> 	};
>
> the downsides being that accessing just the gfn would require an additional
> masking operation, and the stolen bits wouldn't align with reality.