Date: Sun, 29 May 2022 04:31:07 +0100
From: Matthew Wilcox
To: Muchun Song
Cc: bh1scw@gmail.com, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Vlastimil Babka, Roman Gushchin,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/slub: replace alloc_pages with folio_alloc
References: <20220528161157.3934825-1-bh1scw@gmail.com>
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, May 29, 2022 at 10:58:18AM +0800, Muchun Song wrote:
> On Sat, May 28, 2022 at 05:27:11PM +0100, Matthew Wilcox wrote:
> > On Sun, May 29, 2022 at 12:11:58AM +0800, bh1scw@gmail.com wrote:
> > > From: Fanjun Kong
> > >
> > > This patch will use folio allocation functions for allocating pages.
> >
> > That's not actually a good idea.  folio_alloc() will do the
> > prep_transhuge_page() step which isn't needed for slab.
>
> You mean folio_alloc() is dedicated for THP allocation?  It is a little
> surprise to me.  I thought folio_alloc() is just a variant of alloc_page(),
> which returns a folio struct instead of a page.  Seems like I was wrong.
> May I ask what made us decide to do this?

Yeah, the naming isn't great here.  The problem didn't really occur to
me until I saw this patch, and I don't have a good solution yet.
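For reference, folio_alloc() at this point in time is roughly the following
thin wrapper (a paraphrase from memory, not a verbatim copy of the source);
the prep_transhuge_page() call is the step slab has no use for:

	struct folio *folio_alloc(gfp_t gfp, unsigned int order)
	{
		struct page *page = alloc_pages(gfp | __GFP_COMP, order);

		/* Set up the deferred-split list and THP destructor --
		 * needed by filesystem folios, wasted work for slab. */
		if (page && order > 1)
			prep_transhuge_page(page);
		return (struct folio *)page;
	}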
We're in the middle of a transition that is likely to take years, and I
don't think we necessarily have its final form agreed on or even fully
understood yet, so we should come up with something better for the
interim.

Ignoring the naming here: memory allocated to filesystems can be split,
but the split can fail, so those pages need the page-deferred-list and
the destructor.  Memory allocated to slab cannot be split, so
initialising the page-deferred-list is a waste of time.

The end-goal is to split apart allocating the memory from allocating
its memory descriptor (which I like to call a memdesc).  So for
filesystem folios, we'd call slab to allocate a struct folio and then
tell the buddy allocator "here is a memdesc of type folio; allocate me
2^n pages and make pfn_to_memdesc() return this memdesc for each of the
2^n pages in it".  In this end-goal, slab would also allocate a struct
slab (... there's a recursion problem here which has a solution ...) and
then allocate 2^n pages.  But until we're ready to shrink struct page
down to one or two words, doing this is just a waste of memory and time.

So I still don't have a good solution to receiving patches like this,
other than maybe adding a comment like

	/* Do not change this to allocate a folio */

which will be ignored.
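To make the memdesc end-goal concrete, it might look something like the
sketch below.  Every name in it is hypothetical -- alloc_pages_memdesc(),
folio_cache, and MEMDESC_TYPE_FOLIO do not exist; they are invented
purely to illustrate the "allocate descriptor first, then attach pages"
ordering described above:

	/* HYPOTHETICAL sketch only -- none of these functions or types
	 * exist in the kernel today. */
	struct folio *alloc_fs_folio(gfp_t gfp, unsigned int order)
	{
		struct folio *folio;

		/* Step 1: allocate the memory descriptor from slab. */
		folio = kmem_cache_alloc(folio_cache, gfp);
		if (!folio)
			return NULL;

		/* Step 2: ask the buddy allocator for 2^order pages and
		 * register the descriptor, so that pfn_to_memdesc()
		 * returns it for each of the 2^order pages. */
		if (alloc_pages_memdesc(gfp, order, MEMDESC_TYPE_FOLIO, folio)) {
			kmem_cache_free(folio_cache, folio);
			return NULL;
		}
		return folio;
	}

The point of the ordering is that struct page shrinks to little more
than a pointer to its memdesc, and only users who can actually split
memory pay for split-related state.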