Date: Thu, 17 Feb 2022 13:43:40 +0000
From: Matthew Wilcox
To: Hugh Dickins
Cc: Mike Kravetz, cgel.zte@gmail.com, kirill@shutemov.name,
    songliubraving@fb.com, akpm@linux-foundation.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, yang.yang29@zte.com.cn,
    wang.yong12@zte.com.cn, Zeal Robot
Subject: Re: [PATCH linux-next] Fix shmem huge page failed to set F_SEAL_WRITE attribute problem
References: <20220215073743.1769979-1-cgel.zte@gmail.com> <1f486393-3829-4618-39a1-931afc580835@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Feb 16, 2022 at 05:25:17PM -0800, Hugh Dickins wrote:
> On Wed, 16 Feb 2022, Mike Kravetz wrote:
> > On 2/14/22 23:37, cgel.zte@gmail.com wrote:
> > > From: wangyong
> > >
> > > After enabling tmpfs filesystem to support transparent hugepage with the
> > > following command:
> > >  echo always > /sys/kernel/mm/transparent_hugepage/shmem_enabled
> > > The docker program adds F_SEAL_WRITE through the following command will
> > > prompt EBUSY.
> > >  fcntl(5, F_ADD_SEALS, F_SEAL_WRITE)=-1.
> > >
> > > It is found that in memfd_wait_for_pins function, the page_count of
> > > hugepage is 512 and page_mapcount is 0, which does not meet the
> > > conditions:
> > >  page_count(page) - page_mapcount(page) != 1.
> > > But the page is not busy at this time, therefore, the page_order of
> > > hugepage should be taken into account in the calculation.
> > >
> > > Reported-by: Zeal Robot
> > > Signed-off-by: wangyong
> > > ---
> > >  mm/memfd.c | 16 +++++++++++++---
> > >  1 file changed, 13 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/mm/memfd.c b/mm/memfd.c
> > > index 9f80f162791a..26d1d390a22a 100644
> > > --- a/mm/memfd.c
> > > +++ b/mm/memfd.c
> > > @@ -31,6 +31,7 @@
> > >  static void memfd_tag_pins(struct xa_state *xas)
> > >  {
> > >  	struct page *page;
> > > +	int count = 0;
> > >  	unsigned int tagged = 0;
> > >
> > >  	lru_add_drain();
> > > @@ -39,8 +40,12 @@ static void memfd_tag_pins(struct xa_state *xas)
> > >  	xas_for_each(xas, page, ULONG_MAX) {
> > >  		if (xa_is_value(page))
> > >  			continue;
> > > +
> > >  		page = find_subpage(page, xas->xa_index);
> > > -		if (page_count(page) - page_mapcount(page) > 1)
> > > +		count = page_count(page);
> > > +		if (PageTransCompound(page))
> >
> > PageTransCompound() is true for hugetlb pages as well as THP.  And, hugetlb
> > pages will not have a ref per subpage as THP does.  So, I believe this will
> > break hugetlb seal usage.
>
> Yes, I think so too; and that is not the only issue with the patch
> (I don't think page_mapcount is enough, I had to use total_mapcount).
>
> It's a good find, and thank you WangYong for the report.
> I found the same issue when testing my MFD_HUGEPAGE patch last year,
> and devised a patch to fix it (and keep MFD_HUGETLB working) then; but
> never sent that in because there wasn't time to re-present MFD_HUGEPAGE.
>
> I'm currently retesting my patch: just found something failing which
> I thought should pass; but maybe I'm confused, or maybe the xarray is
> working differently now.  I'm rushing to reply now because I don't want
> others to waste their own time on it.

I did change how the XArray works for THP recently.

Kirill's original patch stored:

512: p
513: p+1
514: p+2
...
1023: p+511

A couple of years ago, I changed it to store:

512: p
513: p
514: p
...
1023: p

And in January, Linus merged the commit which changes it to:

512-575: p
576-639: (sibling of 512)
640-703: (sibling of 512)
...
960-1023: (sibling of 512)

That is, I removed a level of the tree and store sibling entries
rather than duplicate entries.  That wasn't for fun; I needed to do
that in order to make msync() work with large folios.  Commit
6b24ca4a1a8d has more detail and hopefully can inspire whatever
changes you need to make to your patch.
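
For anyone trying to picture the three layouts, here is a rough
user-space model of them.  It is only a sketch, not kernel code and not
the XArray API: slot_for(), real_entries(), enum layout and the
constants are names made up for this illustration, and the slot span of
64 is simply read off the index ranges listed in this mail.  It assumes
a single 512-page THP stored at indices 512-1023.

```c
#include <assert.h>
#include <stdbool.h>

#define THP_FIRST 512UL /* first index of the example huge page */
#define THP_NR    512UL /* number of subpages in the example */
#define SLOT_SPAN 64UL  /* indices per slot in the sibling layout (assumed) */

/* Hypothetical labels for the three layouts described in the mail. */
enum layout {
	LAYOUT_SUBPAGES,  /* 512: p, 513: p+1, ..., 1023: p+511 */
	LAYOUT_HEAD_COPY, /* 512: p, 513: p, ..., 1023: p */
	LAYOUT_SIBLINGS,  /* 512-575: p, 576-639: sibling of 512, ... */
};

/* What the slot covering a given index holds in this model. */
struct slot {
	unsigned long first;   /* first index covered by the slot */
	unsigned long last;    /* last index covered by the slot */
	bool sibling;          /* true if it only points back at 512 */
	unsigned long subpage; /* stored page as an offset from p */
};

static struct slot slot_for(enum layout l, unsigned long index)
{
	struct slot s;

	assert(index >= THP_FIRST && index < THP_FIRST + THP_NR);

	if (l == LAYOUT_SIBLINGS) {
		/* One slot per 64 indices; only the first holds p. */
		s.first = index - (index % SLOT_SPAN);
		s.last = s.first + SLOT_SPAN - 1;
		s.sibling = (s.first != THP_FIRST);
		s.subpage = 0;
	} else {
		/* One slot per index, holding p+n or a copy of p. */
		s.first = index;
		s.last = index;
		s.sibling = false;
		s.subpage = (l == LAYOUT_SUBPAGES) ? index - THP_FIRST : 0;
	}
	return s;
}

/* Count the slots in the range that hold a page rather than a sibling. */
static unsigned long real_entries(enum layout l)
{
	unsigned long i, n = 0;

	for (i = THP_FIRST; i < THP_FIRST + THP_NR; i++) {
		struct slot s = slot_for(l, i);

		if (i == s.first && !s.sibling)
			n++;
	}
	return n;
}
```

In this model real_entries() gives 512 for the two older layouts and 1
for the sibling layout, so code that assumes it meets the huge page
once per subpage index would presumably need its per-page arithmetic
revisited.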