Date: Thu, 6 May 2021 20:30:26 +0100
From: Matthew Wilcox
To: David Hildenbrand
Cc: Zi Yan, Oscar
Salvador, Michael Ellerman, Benjamin Herrenschmidt, Thomas Gleixner,
 x86@kernel.org, Andy Lutomirski, "Rafael J . Wysocki", Andrew Morton,
 Mike Rapoport, Anshuman Khandual, Michal Hocko, Dan Williams, Wei Yang,
 linux-ia64@vger.kernel.org, linux-kernel@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org
Subject: Re: [RFC PATCH 0/7] Memory hotplug/hotremove at subsection size
Message-ID: <20210506193026.GE388843@casper.infradead.org>
References: <20210506152623.178731-1-zi.yan@sent.com>
 <9D7FD316-988E-4B11-AC1C-64FF790BA79E@nvidia.com>
 <3a51f564-f3d1-c21f-93b5-1b91639523ec@redhat.com>
 <16962E62-7D1E-4E06-B832-EC91F54CC359@nvidia.com>
 <3A6D54CF-76F4-4401-A434-84BEB813A65A@nvidia.com>
 <0e850dcb-c69a-188b-7ab9-09e6644af3ab@redhat.com>
In-Reply-To: <0e850dcb-c69a-188b-7ab9-09e6644af3ab@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, May 06, 2021 at 09:10:52PM +0200, David Hildenbrand wrote:
> I have to admit that I am not really a friend of that. I still think our
> target goal should be to have gigantic THP *in addition to* ordinary THP.
> Use gigantic THP where enabled and possible, and just use ordinary THP
> everywhere else. Having one pageblock granularity is a real limitation IMHO
> and requires us to hack the system to support it to some degree.

You're thinking too small with only two THP sizes ;-)  I'm aiming to
support arbitrary power-of-two memory allocations.  I think there's a
fruitful discussion to be had about how that works for anonymous memory --
with page cache, we have readahead to tell us when our predictions of use
are actually fulfilled.  It doesn't tell us what percentage of the pages
allocated were actually used, but it's a hint.  It's a big lift to go from
2MB all the way to 1GB ...
if you can look back to see that the previous 1GB was
basically fully populated, then maybe jump up from allocating 2MB folios
to allocating a 1GB folio, but wow, that's a big step.

This goal really does mean that we want to allocate from the page
allocator, and so we do want to grow MAX_ORDER.  I suppose we could
do something ugly like

	if (order <= MAX_ORDER)
		alloc_page()
	else
		alloc_really_big_page()

but that feels like unnecessary hardship to place on the user.

I know that for the initial implementation, we're going to rely on hints
from the user to use 1GB pages, but it'd be nice to not do that.
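
To put rough numbers on the 2MB-to-1GB step and the look-back idea, here
is a minimal userspace sketch.  It is not kernel code and not a proposed
interface.  Everything in it is an assumption made for illustration: the
4KiB base page size (SKETCH_PAGE_SHIFT), the helper names size_to_order()
and pick_order(), and the 15/16 cutoff standing in for "basically fully
populated".

```c
/* Assumption for illustration only: 4KiB base pages. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Smallest order whose block of (1 << order) base pages covers `bytes`.
 * Hypothetical helper, loosely modelled on what an order lookup does.
 */
unsigned int size_to_order(unsigned long long bytes)
{
	unsigned int order = 0;
	unsigned long long block = 1ULL << SKETCH_PAGE_SHIFT;

	while (block < bytes) {
		block <<= 1;
		order++;
	}
	return order;
}

/*
 * Hypothetical look-back heuristic: step up to big_order only when the
 * previous big_order-sized region was almost entirely populated;
 * otherwise stay at small_order.  The 15/16 threshold is an arbitrary
 * stand-in, not a value taken from this thread.
 */
unsigned int pick_order(unsigned long long populated_pages,
			unsigned int small_order,
			unsigned int big_order)
{
	unsigned long long total = 1ULL << big_order;

	if (populated_pages >= total - (total >> 4))
		return big_order;
	return small_order;
}
```

Under those assumptions, size_to_order(2ULL << 20) is 9 and
size_to_order(1ULL << 30) is 18, so the jump is nine doublings in one
step.  pick_order() returns 18 only when the previous region was at least
15/16 populated, and 9 otherwise.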