Date: Fri, 23 Oct 2020 13:55:16 +0100
From: Matthew Wilcox
To: Rik van Riel
Cc: Hugh
  Dickins, Yu Xu, Andrew Morton, Mel Gorman, Andrea Arcangeli,
  Michal Hocko, Vlastimil Babka, "Kirill A. Shutemov",
  linux-mm@kvack.org, kernel-team@fb.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm,thp,shmem: limit shmem THP alloc gfp_mask
Message-ID: <20201023125516.GA20115@casper.infradead.org>
References: <20201022124511.72448a5f@imladris.surriel.com>
 <932f5931911e5ad7d730127b0784b0913045639c.camel@surriel.com>
In-Reply-To: <932f5931911e5ad7d730127b0784b0913045639c.camel@surriel.com>

On Thu, Oct 22, 2020 at 11:40:53PM -0400, Rik van Riel wrote:
> On Thu, 2020-10-22 at 19:54 -0700, Hugh Dickins wrote:
> > Michal is right to remember pushback before, because tmpfs is a
> > filesystem, and "huge=" is a mount option: in using a huge=always
> > filesystem, the user has already declared a preference for huge pages.
> > Whereas the original anon THP had to deduce that preference from sys
> > tunables and vma madvice.
> ...
> > But it's likely that they have accumulated some defrag wisdom, which
> > tmpfs can take on board - but please accept that in using a huge mount,
> > the preference for huge has already been expressed, so I don't expect
> > anon THP alloc_hugepage_direct_gfpmask() choices will map one to one.
>
> In my mind, the huge= mount options for tmpfs corresponded
> to the "enabled" anon THP options, denoting a desired end
> state, not necessarily how much we will stall allocations
> to get there immediately.
>
> The underlying allocation behavior has been changed repeatedly,
> with changes to the direct reclaim code and the compaction
> deferral code.
>
> The shmem THP gfp_mask never tried really hard anyway,
> with __GFP_NORETRY being the default, which matches what
> is used for non-VM_HUGEPAGE anon VMAs.
>
> Likewise, the direct reclaim done from the opportunistic
> THP allocations done by the shmem code limited itself to
> reclaiming 32 4kB pages per THP allocation.
>
> In other words, mounting with huge=always has never behaved
> the same as the more aggressive allocations done for
> MADV_HUGEPAGE VMAs.
>
> This patch would leave shmem THP allocations for non-MADV_HUGEPAGE
> mapped files opportunistic like today, and make shmem THP
> allocations for files mapped with MADV_HUGEPAGE more aggressive
> than today.
>
> However, I would like to know what people think the shmem
> huge= mount options should do, and how things should behave
> when memory gets low, before pushing in a patch just because
> it makes the system run smoother "without changing current
> behavior too much".
>
> What do people want tmpfs THP allocations to do?

I'm also interested for non-tmpfs THP allocations.  In my patchset,
THPs are no longer limited to being PMD sized, and allocating smaller
pages isn't such a tax on the VM.  So currently I'm doing:

	gfp_t gfp = readahead_gfp_mask(mapping);
	...
	struct page *page = __page_cache_alloc_order(gfp, order);

which translates to:

	mapping_gfp_mask(mapping) | __GFP_NORETRY | __GFP_NOWARN;
	gfp |= GFP_TRANSHUGE_LIGHT;
	gfp &= ~__GFP_DIRECT_RECLAIM;

Everything's very willing to fall back to order-0 pages, but I can see
that, eg, for a VM_HUGEPAGE vma, we should perhaps be less willing to
fall back to small pages.

I would prefer not to add a mount option to every filesystem.  People
will only get it wrong.
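[Editorial aside: the mask arithmetic described above can be sketched as a
self-contained userspace C program.  The flag values and the composite
GFP_TRANSHUGE_LIGHT definition below are illustrative placeholders, not the
kernel's real bit assignments or exact composition; the point is only how
clearing __GFP_DIRECT_RECLAIM makes the allocation fail fast while still
allowing background (kswapd) reclaim.]

	#include <assert.h>
	#include <stdio.h>

	typedef unsigned int gfp_t;

	/* Hypothetical bit values, for illustration only. */
	#define __GFP_DIRECT_RECLAIM  0x001u
	#define __GFP_KSWAPD_RECLAIM  0x002u
	#define __GFP_IO              0x004u
	#define __GFP_FS              0x008u
	#define __GFP_NORETRY         0x010u
	#define __GFP_NOWARN          0x020u
	#define __GFP_COMP            0x040u
	#define GFP_KERNEL (__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM | \
	                    __GFP_IO | __GFP_FS)
	/* Simplified stand-in for the kernel's GFP_TRANSHUGE_LIGHT. */
	#define GFP_TRANSHUGE_LIGHT (GFP_KERNEL | __GFP_COMP | \
	                             __GFP_NOWARN | __GFP_NORETRY)

	/* Stand-in for mapping_gfp_mask(mapping): assume GFP_KERNEL. */
	static gfp_t mapping_gfp_mask(void) { return GFP_KERNEL; }

	/* readahead_gfp_mask(): mapping mask plus NORETRY and NOWARN. */
	static gfp_t readahead_gfp_mask(void)
	{
		return mapping_gfp_mask() | __GFP_NORETRY | __GFP_NOWARN;
	}

	/* The composition quoted in the mail above. */
	static gfp_t compose_thp_gfp(void)
	{
		gfp_t gfp = readahead_gfp_mask();
		gfp |= GFP_TRANSHUGE_LIGHT;
		gfp &= ~__GFP_DIRECT_RECLAIM; /* never stall the caller */
		return gfp;
	}

	int main(void)
	{
		gfp_t gfp = compose_thp_gfp();
		/* No direct reclaim: a huge page either exists or we
		 * fall back to order-0 immediately. */
		assert(!(gfp & __GFP_DIRECT_RECLAIM));
		assert(gfp & __GFP_NORETRY);
		/* Background reclaim is still permitted. */
		assert(gfp & __GFP_KSWAPD_RECLAIM);
		printf("ok\n");
		return 0;
	}

The design consequence is what the thread is debating: with
__GFP_DIRECT_RECLAIM masked off, a huge=always mount never pays a
compaction/reclaim stall for a THP, which may be weaker than what a
MADV_HUGEPAGE caller expects.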