Message-ID: <217bb956-b9f6-1057-914b-436d4c775a8b@bytedance.com>
Date: Fri, 22 Sep 2023 10:54:32 +0800
Subject: Re: [PATCH v1 8/8] arm64: hugetlb: Fix set_huge_pte_at() to work with all swap entries
To: Ryan Roberts
Cc: Catalin Marinas, Will Deacon, "James E.J. Bottomley", Helge Deller,
 Nicholas Piggin, Christophe Leroy, Paul Walmsley, Palmer Dabbelt,
 Albert Ou, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
 Christian Borntraeger, Sven Schnelle, Gerald Schaefer,
 "David S. Miller", Arnd Bergmann, Mike Kravetz, Muchun Song,
 SeongJae Park, Andrew Morton, Uladzislau Rezki, Christoph Hellwig,
 Lorenzo Stoakes, Anshuman Khandual, Peter Xu, Axel Rasmussen, Qi Zheng,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
 linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
 sparclinux@vger.kernel.org, linux-mm@kvack.org, stable@vger.kernel.org
References: <20230921162007.1630149-1-ryan.roberts@arm.com>
 <20230921162007.1630149-9-ryan.roberts@arm.com>
From: Qi Zheng
In-Reply-To: <20230921162007.1630149-9-ryan.roberts@arm.com>

Hi Ryan,

On 2023/9/22 00:20, Ryan Roberts wrote:
> When called with a swap entry that does not embed a PFN (e.g.
> PTE_MARKER_POISONED or PTE_MARKER_UFFD_WP), the previous implementation
> of set_huge_pte_at() would either cause a BUG() to fire (if
> CONFIG_DEBUG_VM is enabled) or cause a dereference of an invalid address
> and subsequent panic.
>
> arm64's huge pte implementation supports multiple huge page sizes, some
> of which are implemented in the page table with contiguous mappings. So
> set_huge_pte_at() needs to work out how big the logical pte is, so that
> it can also work out how many physical ptes (or pmds) need to be
> written. It does this by grabbing the folio out of the pte and querying
> its size.
>
> However, there are cases when the pte being set is actually a swap
> entry. But this also used to work fine, because for huge ptes, we only
> ever saw migration entries and hwpoison entries. And both of these types
> of swap entries have a PFN embedded, so the code would grab that and
> everything still worked out.
>
> But over time, more calls to set_huge_pte_at() have been added that set
> swap entry types that do not embed a PFN. And this causes the code to go
> bang. The triggering case is for the uffd poison test, commit
> 99aa77215ad0 ("selftests/mm: add uffd unit test for UFFDIO_POISON"),
> which sets a PTE_MARKER_POISONED swap entry. But review shows there are
> other places too (PTE_MARKER_UFFD_WP).
>
> So the root cause is due to commit 18f3962953e4 ("mm: hugetlb: kill
> set_huge_swap_pte_at()"), which aimed to simplify the interface to the
> core code by removing set_huge_swap_pte_at() (which took a page size
> parameter) and replacing it with calls to set_huge_pte_at() where
> the size was inferred from the folio, as described above. While that
> commit didn't break anything at the time,

If it didn't break anything at that time, should the Fixes tag really
point to this commit?

> it did break the interface
> because it couldn't handle swap entries without PFNs. And since then new
> callers have come along which rely on this working.

So should the Fixes tag be added only for the commit that introduced the
first new callers?

Other than that, LGTM.

Thanks,
Qi

>
> Now that we have modified the set_huge_pte_at() interface to pass the
> vma, we can extract the huge page size from it and fix this issue.
>
> I'm tagging the commit that added the uffd poison feature, since that is
> what exposed the problem, as well as the original change that broke the
> interface. Hopefully this is valuable for people doing bisect.
>
> Signed-off-by: Ryan Roberts
> Fixes: 18f3962953e4 ("mm: hugetlb: kill set_huge_swap_pte_at()")
> Fixes: 8a13897fb0da ("mm: userfaultfd: support UFFDIO_POISON for hugetlbfs")
> ---
>  arch/arm64/mm/hugetlbpage.c | 17 +++--------------
>  1 file changed, 3 insertions(+), 14 deletions(-)
>
> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> index 844832511c1e..a08601a14689 100644
> --- a/arch/arm64/mm/hugetlbpage.c
> +++ b/arch/arm64/mm/hugetlbpage.c
> @@ -241,13 +241,6 @@ static void clear_flush(struct mm_struct *mm,
>  	flush_tlb_range(&vma, saddr, addr);
>  }
>
> -static inline struct folio *hugetlb_swap_entry_to_folio(swp_entry_t entry)
> -{
> -	VM_BUG_ON(!is_migration_entry(entry) && !is_hwpoison_entry(entry));
> -
> -	return page_folio(pfn_to_page(swp_offset_pfn(entry)));
> -}
> -
>  void set_huge_pte_at(struct vm_area_struct *vma, unsigned long addr,
>  		    pte_t *ptep, pte_t pte)
>  {
> @@ -258,13 +251,10 @@ void set_huge_pte_at(struct vm_area_struct *vma, unsigned long addr,
>  	unsigned long pfn, dpfn;
>  	pgprot_t hugeprot;
>
> -	if (!pte_present(pte)) {
> -		struct folio *folio;
> -
> -		folio = hugetlb_swap_entry_to_folio(pte_to_swp_entry(pte));
> -		ncontig = num_contig_ptes(folio_size(folio), &pgsize);
> +	ncontig = num_contig_ptes(huge_page_size(hstate_vma(vma)), &pgsize);
>
> -		for (i = 0; i < ncontig; i++, ptep++)
> +	if (!pte_present(pte)) {
> +		for (i = 0; i < ncontig; i++, ptep++, addr += pgsize)
>  			set_pte_at(mm, addr, ptep, pte);
>  		return;
>  	}
> @@ -274,7 +264,6 @@ void set_huge_pte_at(struct vm_area_struct *vma, unsigned long addr,
>  		return;
>  	}
>
> -	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
>  	pfn = pte_pfn(pte);
>  	dpfn = pgsize >> PAGE_SHIFT;
>  	hugeprot = pte_pgprot(pte);
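
For anyone following the thread, here is a minimal user-space sketch of the
idea in the patch: the number of entries to write is derived from the huge
page size supplied by the caller (the vma's hstate in the patch), never from
a PFN inside the entry being written. It is not the kernel code. The
sketch_* names, the flat slots[] array standing in for the page table, and
the 4K-granule constants (16 contiguous ptes, 16 contiguous pmds) are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed 4K-granule geometry; illustrative values, not kernel headers. */
#define SKETCH_PAGE_SIZE (1UL << 12)
#define SKETCH_PMD_SIZE  (1UL << 21)
#define SKETCH_PUD_SIZE  (1UL << 30)
#define SKETCH_CONT_PTES 16
#define SKETCH_CONT_PMDS 16

/*
 * Hypothetical stand-in for num_contig_ptes(): map a huge page size to the
 * number of page table entries backing it and the size each entry covers.
 * Returns 0 for a size this sketch does not model.
 */
int sketch_num_contig(unsigned long size, unsigned long *pgsize)
{
	*pgsize = size;

	if (size == SKETCH_PUD_SIZE || size == SKETCH_PMD_SIZE)
		return 1;			/* a single block entry */
	if (size == SKETCH_CONT_PMDS * SKETCH_PMD_SIZE) {
		*pgsize = SKETCH_PMD_SIZE;	/* contiguous pmds */
		return SKETCH_CONT_PMDS;
	}
	if (size == SKETCH_CONT_PTES * SKETCH_PAGE_SIZE) {
		*pgsize = SKETCH_PAGE_SIZE;	/* contiguous ptes */
		return SKETCH_CONT_PTES;
	}
	return 0;
}

/*
 * Write the same non-present entry into every slot of the logical huge pte.
 * The entry value is never inspected, so an entry without an embedded PFN
 * (a marker, for example) is handled like any other. The address advances
 * by pgsize per slot, mirroring the loop in the patch.
 */
int sketch_set_swap_entries(unsigned long *slots, size_t max_slots,
			    unsigned long addr, unsigned long huge_size,
			    unsigned long entry, unsigned long *end_addr)
{
	unsigned long pgsize;
	int i, ncontig = sketch_num_contig(huge_size, &pgsize);

	assert((size_t)ncontig <= max_slots);

	for (i = 0; i < ncontig; i++, addr += pgsize)
		slots[i] = entry;

	*end_addr = addr;
	return ncontig;
}
```

Under these assumptions, a 64K huge page yields 16 writes and the address
ends 64K past where it started, whatever value the entry holds.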