Date: Mon, 4 Jul 2022 18:56:19 +0800
From: Muchun Song
To: Matthew Wilcox
Cc: akpm@linux-foundation.org, jgg@ziepe.ca, jhubbard@nvidia.com,
	william.kucharski@oracle.com, dan.j.williams@intel.com, jack@suse.cz,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev
Subject: Re: [PATCH] mm: fix missing wake-up event for FSDAX pages
References: <20220704074054.32310-1-songmuchun@bytedance.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
X-Mailing-List:
linux-kernel@vger.kernel.org

On Mon, Jul 04, 2022 at 11:38:16AM +0100, Matthew Wilcox wrote:
> On Mon, Jul 04, 2022 at 03:40:54PM +0800, Muchun Song wrote:
> > FSDAX page refcounts are 1-based, rather than 0-based: if refcount is
> > 1, then the page is freed.  The FSDAX pages can be pinned through GUP,
> > then they will be unpinned via unpin_user_page() using a folio variant
> > to put the page, however, folio variants did not consider this special
> > case, the result will be to miss a wakeup event (like the user of
> > __fuse_dax_break_layouts()).
>
> Argh, no.  The 1-based refcounts are a blight on the entire kernel.
> They need to go away, not be pushed into folios as well.  I think

I would be happy if this could go away.

> we're close to having that fixed, but until then, this should do
> the trick?
>

The following fix looks good to me since it lowers the overhead as
much as possible.

Thanks.

> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index cc98ab012a9b..4cef5e0f78b6 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1129,18 +1129,18 @@ static inline bool is_zone_movable_page(const struct page *page)
>  #if defined(CONFIG_ZONE_DEVICE) && defined(CONFIG_FS_DAX)
>  DECLARE_STATIC_KEY_FALSE(devmap_managed_key);
>
> -bool __put_devmap_managed_page(struct page *page);
> -static inline bool put_devmap_managed_page(struct page *page)
> +bool __put_devmap_managed_page(struct page *page, int refs);
> +static inline bool put_devmap_managed_page(struct page *page, int refs)
>  {
>  	if (!static_branch_unlikely(&devmap_managed_key))
>  		return false;
>  	if (!is_zone_device_page(page))
>  		return false;
> -	return __put_devmap_managed_page(page);
> +	return __put_devmap_managed_page(page, refs);
>  }
>
>  #else /* CONFIG_ZONE_DEVICE && CONFIG_FS_DAX */
> -static inline bool put_devmap_managed_page(struct page *page)
> +static inline bool put_devmap_managed_page(struct page *page, int refs)
>  {
>  	return false;
>  }
> @@ -1246,7 +1246,7 @@ static inline void put_page(struct page *page)
>  	 * For some devmap managed pages we need to catch refcount transition
>  	 * from 2 to 1:
>  	 */
> -	if (put_devmap_managed_page(&folio->page))
> +	if (put_devmap_managed_page(&folio->page, 1))
>  		return;
>  	folio_put(folio);
>  }
> diff --git a/mm/gup.c b/mm/gup.c
> index d1132b39aa8f..28df02121c78 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -88,7 +88,8 @@ static inline struct folio *try_get_folio(struct page *page, int refs)
>  	 * belongs to this folio.
>  	 */
>  	if (unlikely(page_folio(page) != folio)) {
> -		folio_put_refs(folio, refs);
> +		if (!put_devmap_managed_page(&folio->page, refs))
> +			folio_put_refs(folio, refs);
>  		goto retry;
>  	}
>
> @@ -177,6 +178,8 @@ static void gup_put_folio(struct folio *folio, int refs, unsigned int flags)
>  		refs *= GUP_PIN_COUNTING_BIAS;
>  	}
>
> +	if (put_devmap_managed_page(&folio->page, refs))
> +		return;
>  	folio_put_refs(folio, refs);
>  }
>
> diff --git a/mm/memremap.c b/mm/memremap.c
> index b870a659eee6..b25e40e3a11e 100644
> --- a/mm/memremap.c
> +++ b/mm/memremap.c
> @@ -499,7 +499,7 @@ void free_zone_device_page(struct page *page)
>  }
>
>  #ifdef CONFIG_FS_DAX
> -bool __put_devmap_managed_page(struct page *page)
> +bool __put_devmap_managed_page(struct page *page, int refs)
>  {
>  	if (page->pgmap->type != MEMORY_DEVICE_FS_DAX)
>  		return false;
> @@ -509,7 +509,7 @@ bool __put_devmap_managed_page(struct page *page)
>  	 * refcount is 1, then the page is free and the refcount is
>  	 * stable because nobody holds a reference on the page.
>  	 */
> -	if (page_ref_dec_return(page) == 1)
> +	if (page_ref_sub_return(page, refs) == 1)
>  		wake_up_var(&page->_refcount);
>  	return true;
>  }
> diff --git a/mm/swap.c b/mm/swap.c
> index c6194cfa2af6..94e42a9bab92 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -960,7 +960,7 @@ void release_pages(struct page **pages, int nr)
>  				unlock_page_lruvec_irqrestore(lruvec, flags);
>  				lruvec = NULL;
>  			}
> -			if (put_devmap_managed_page(&folio->page))
> +			if (put_devmap_managed_page(&folio->page, 1))
>  				continue;
>  			if (folio_put_testzero(folio))
>  				free_zone_device_page(&folio->page);
>
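
For anyone following along, here is a rough user-space sketch of the
1-based refcount rule that the patch above depends on. It is only a
model under stated assumptions, not kernel code: struct model_page and
the model_*() helpers are hypothetical names, C11 atomics stand in for
the page_ref_*() helpers, and a plain counter stands in for
wake_up_var(). What it shows is that when several references are
dropped in one call, the wake-up has to be keyed on the value after the
subtraction being exactly 1.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct page; not the kernel type. */
struct model_page {
	atomic_int refcount;	/* 1-based: a value of 1 means "free" */
	int wakeups;		/* counts simulated wake_up_var() calls */
};

/* Start in the "free" state, i.e. refcount == 1. */
static void model_page_init(struct model_page *p)
{
	atomic_init(&p->refcount, 1);
	p->wakeups = 0;
}

/* Take @refs references, as a GUP pin would. */
static void model_get(struct model_page *p, int refs)
{
	atomic_fetch_add(&p->refcount, refs);
}

/* Same idea as page_ref_sub_return(): subtract, return the new value. */
static int model_ref_sub_return(struct model_page *p, int refs)
{
	return atomic_fetch_sub(&p->refcount, refs) - refs;
}

/*
 * Models the patched __put_devmap_managed_page(): drop @refs references
 * in one step and "wake" waiters only when the count lands on 1.
 */
static bool model_put_devmap_managed(struct model_page *p, int refs)
{
	if (model_ref_sub_return(p, refs) == 1)
		p->wakeups++;	/* the kernel would call wake_up_var() here */
	return true;
}
```

In this model, taking three references and then dropping one followed
by two produces exactly one wake-up, on the second put, because only
that call brings the count back to 1.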