Date: Tue, 10 Oct 2023 17:17:36 +0800
From: Xu Yilun
To: isaku.yamahata@intel.com
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, isaku.yamahata@gmail.com, Michael Roth, Paolo Bonzini, Sean Christopherson, erdemaktas@google.com, Sagi Shahar, David Matlack, Kai Huang, Zhi Wang, chen.bo@intel.com, linux-coco@lists.linux.dev, Chao Peng, Ackerley Tng, Vishal Annapurve, Yuan Yao, Jarkko Sakkinen, Quentin Perret, wei.w.wang@intel.com, Fuad Tabba
Subject: Re: [PATCH 6/8] KVM: gmem, x86: Add gmem hook for invalidating private memory
In-Reply-To: <8c9f0470ba6e5dc122f3f4e37c4dcfb6fb97b184.1692119201.git.isaku.yamahata@intel.com>
On 2023-08-15 at 10:18:53 -0700, isaku.yamahata@intel.com wrote:
> From: Michael Roth
>
> TODO: add a CONFIG option that can be to completely skip arch
> invalidation loop and avoid __weak references for arch/platforms that
> don't need an additional invalidation hook.
>
> In some cases, like with SEV-SNP, guest memory needs to be updated in a
> platform-specific manner before it can be safely freed back to the host.
> Add hooks to wire up handling of this sort when freeing memory in
> response to FALLOC_FL_PUNCH_HOLE operations.
>
> Also issue invalidations of all allocated pages when releasing the gmem
> file so that the pages are not left in an unusable state when they get
> freed back to the host.
>
> Signed-off-by: Michael Roth
> Link: https://lore.kernel.org/r/20230612042559.375660-3-michael.roth@amd.com
> [...]
> +/* Handle arch-specific hooks needed before releasing guarded pages. */
> +static void kvm_gmem_issue_arch_invalidate(struct kvm *kvm, struct file *file,
> +					   pgoff_t start, pgoff_t end)
> +{
> +	pgoff_t file_end = i_size_read(file_inode(file)) >> PAGE_SHIFT;
> +	pgoff_t index = start;
> +
> +	end = min(end, file_end);
> +
> +	while (index < end) {
> +		struct folio *folio;
> +		unsigned int order;
> +		struct page *page;
> +		kvm_pfn_t pfn;
> +
> +		folio = __filemap_get_folio(file->f_mapping, index,
> +					    FGP_LOCK, 0);
> +		if (!folio) {
> +			index++;
> +			continue;
> +		}
> +
> +		page = folio_file_page(folio, index);
> +		pfn = page_to_pfn(page);
> +		order = folio_order(folio);
> +
> +		kvm_arch_gmem_invalidate(kvm, pfn, pfn + min((1ul << order), end - index));

Observed an issue here. The page may not be the first page in the folio,
in which case the range [pfn, pfn + (1ul << order)) extends into the next
folio. Part of those pages then get invalidated a second time when the
loop reaches the next folio. On TDX, this makes TDH_PHYMEM_PAGE_WBINVD
fail.
> +
> +		index = folio_next_index(folio);
> +		folio_unlock(folio);
> +		folio_put(folio);
> +
> +		cond_resched();
> +	}
> +}

My fix would be:

diff --git a/virt/kvm/guest_mem.c b/virt/kvm/guest_mem.c
index e629782d73d5..3665003c3746 100644
--- a/virt/kvm/guest_mem.c
+++ b/virt/kvm/guest_mem.c
@@ -155,7 +155,7 @@ static void kvm_gmem_issue_arch_invalidate(struct kvm *kvm, struct inode *inode,
 	while (index < end) {
 		struct folio *folio;
-		unsigned int order;
+		pgoff_t ntails;
 		struct page *page;
 		kvm_pfn_t pfn;
 
@@ -168,9 +168,9 @@ static void kvm_gmem_issue_arch_invalidate(struct kvm *kvm, struct inode *inode,
 		page = folio_file_page(folio, index);
 		pfn = page_to_pfn(page);
-		order = folio_order(folio);
+		ntails = folio_nr_pages(folio) - folio_page_idx(folio, page);
 
-		kvm_arch_gmem_invalidate(kvm, pfn, pfn + min((1ul << order), end - index));
+		kvm_arch_gmem_invalidate(kvm, pfn, pfn + min(ntails, end - index));
 
 		index = folio_next_index(folio);
 		folio_unlock(folio);

Thanks,
Yilun