Message-ID: <6c7a8602-5b88-424c-a8c4-8a9502865d94@linux.alibaba.com>
Date: Tue, 11 Jun 2024 10:04:31 +0800
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 3/6] mm: shmem: add multi-size THP sysfs interface for anonymous shmem
To: Daniel Gomez
Cc: "akpm@linux-foundation.org", "hughd@google.com", "willy@infradead.org",
 "david@redhat.com", "wangkefeng.wang@huawei.com", "ying.huang@intel.com",
 "21cnbao@gmail.com" <21cnbao@gmail.com>, "ryan.roberts@arm.com",
 "shy828301@gmail.com", "ziy@nvidia.com", "ioworker0@gmail.com",
 Pankaj Raghav, "linux-mm@kvack.org", "linux-kernel@vger.kernel.org"
References: <119966ae28bf2e2d362ae3d369ac1a1cd27ba866.1717495894.git.baolin.wang@linux.alibaba.com>
From: Baolin Wang

On 2024/6/10 20:23, Daniel Gomez wrote:
> Hi Baolin,
> On Tue, Jun 04, 2024 at 06:17:47PM +0800, Baolin Wang wrote:
>> To support the use of mTHP with anonymous shmem, add a new sysfs interface
>> 'shmem_enabled' in the '/sys/kernel/mm/transparent_hugepage/hugepages-kB/'
>> directory for each mTHP to control whether shmem is enabled for that mTHP,
>> with a value similar to the top level 'shmem_enabled', which can be set to:
>> "always", "inherit (to inherit the top level setting)", "within_size", "advise",
>> "never". An 'inherit' option is added to ensure compatibility with these
>> global settings, and the options 'force' and 'deny' are dropped, which are
>> rather testing artifacts from the old ages.
>>
>> By default, PMD-sized hugepages have enabled="inherit" and all other hugepage
>> sizes have enabled="never" for '/sys/kernel/mm/transparent_hugepage/hugepages-xxkB/shmem_enabled'.
>>
>> In addition, if top level value is 'force', then only PMD-sized hugepages
>> have enabled="inherit", otherwise configuration will be failed and vice versa.
>> That means now we will avoid using non-PMD sized THP to override the global
>> huge allocation.
>>
>> Signed-off-by: Baolin Wang
>> ---
>>  Documentation/admin-guide/mm/transhuge.rst | 23 ++++++
>>  include/linux/huge_mm.h                    | 10 +++
>>  mm/huge_memory.c                           | 11 +--
>>  mm/shmem.c                                 | 96 ++++++++++++++++++++++
>>  4 files changed, 132 insertions(+), 8 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
>> index d414d3f5592a..b76d15e408b3 100644
>> --- a/Documentation/admin-guide/mm/transhuge.rst
>> +++ b/Documentation/admin-guide/mm/transhuge.rst
>> @@ -332,6 +332,29 @@ deny
>>  force
>>      Force the huge option on for all - very useful for testing;
>>
>> +Shmem can also use "multi-size THP" (mTHP) by adding a new sysfs knob to control
>> +mTHP allocation: '/sys/kernel/mm/transparent_hugepage/hugepages-kB/shmem_enabled',
>> +and its value for each mTHP is essentially consistent with the global setting.
>> +An 'inherit' option is added to ensure compatibility with these global settings.
>> +Conversely, the options 'force' and 'deny' are dropped, which are rather testing
>> +artifacts from the old ages.
>> +always
>> +    Attempt to allocate huge pages every time we need a new page;
>> +
>> +inherit
>> +    Inherit the top-level "shmem_enabled" value. By default, PMD-sized hugepages
>> +    have enabled="inherit" and all other hugepage sizes have enabled="never";
>> +
>> +never
>> +    Do not allocate huge pages;
>> +
>> +within_size
>> +    Only allocate huge page if it will be fully within i_size.
>> +    Also respect fadvise()/madvise() hints;
>> +
>> +advise
>> +    Only allocate huge pages if requested with fadvise()/madvise();
>> +
>>  Need of application restart
>>  ===========================
>>
>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>> index 020e2344eb86..fac21548c5de 100644
>> --- a/include/linux/huge_mm.h
>> +++ b/include/linux/huge_mm.h
>> @@ -6,6 +6,7 @@
>>  #include
>>
>>  #include /* only for vma_is_dax() */
>> +#include
>>
>>  vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf);
>>  int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
>> @@ -63,6 +64,7 @@ ssize_t single_hugepage_flag_show(struct kobject *kobj,
>>  				  struct kobj_attribute *attr, char *buf,
>>  				  enum transparent_hugepage_flag flag);
>>  extern struct kobj_attribute shmem_enabled_attr;
>> +extern struct kobj_attribute thpsize_shmem_enabled_attr;
>>
>>  /*
>>   * Mask of all large folio orders supported for anonymous THP; all orders up to
>> @@ -265,6 +267,14 @@ unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma,
>>  	return __thp_vma_allowable_orders(vma, vm_flags, tva_flags, orders);
>>  }
>>
>> +struct thpsize {
>> +	struct kobject kobj;
>> +	struct list_head node;
>> +	int order;
>> +};
>> +
>> +#define to_thpsize(kobj) container_of(kobj, struct thpsize, kobj)
>> +
>>  enum mthp_stat_item {
>>  	MTHP_STAT_ANON_FAULT_ALLOC,
>>  	MTHP_STAT_ANON_FAULT_FALLBACK,
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 8e49f402d7c7..1360a1903b66 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -449,14 +449,6 @@ static void thpsize_release(struct kobject *kobj);
>>  static DEFINE_SPINLOCK(huge_anon_orders_lock);
>>  static LIST_HEAD(thpsize_list);
>>
>> -struct thpsize {
>> -	struct kobject kobj;
>> -	struct list_head node;
>> -	int order;
>> -};
>> -
>> -#define to_thpsize(kobj) container_of(kobj, struct thpsize, kobj)
>> -
>>  static ssize_t thpsize_enabled_show(struct kobject *kobj,
>>  				    struct kobj_attribute *attr, char *buf)
>>  {
>> @@ -517,6 +509,9 @@ static struct kobj_attribute thpsize_enabled_attr =
>>
>>  static struct attribute *thpsize_attrs[] = {
>>  	&thpsize_enabled_attr.attr,
>> +#ifdef CONFIG_SHMEM
>> +	&thpsize_shmem_enabled_attr.attr,
>> +#endif
>>  	NULL,
>>  };
>>
>> diff --git a/mm/shmem.c b/mm/shmem.c
>> index ae358efc397a..643ff7516b4d 100644
>> --- a/mm/shmem.c
>> +++ b/mm/shmem.c
>> @@ -131,6 +131,14 @@ struct shmem_options {
>>  #define SHMEM_SEEN_QUOTA 32
>>  };
>>
>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> +static unsigned long huge_anon_shmem_orders_always __read_mostly;
>> +static unsigned long huge_anon_shmem_orders_madvise __read_mostly;
>> +static unsigned long huge_anon_shmem_orders_inherit __read_mostly;
>> +static unsigned long huge_anon_shmem_orders_within_size __read_mostly;
>> +static DEFINE_SPINLOCK(huge_anon_shmem_orders_lock);
>> +#endif
>
> Since we are also applying the new sysfs knob controls to tmpfs and anon mm,
> should we rename this to get rid of the anon prefix?

Sure. I originally wanted to do this in the patch set that adds mTHP support
for tmpfs, but yes, I can just drop the 'anon' prefix now as preparation.
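
To make the rename concrete, here is a minimal userspace sketch, not the kernel
code itself. It assumes the four masks simply lose the 'anon' prefix
(huge_shmem_orders_*), that each order's bit lives in at most one mask, and
that the "advise" value maps onto the madvise mask. The helpers
shmem_order_store() and shmem_order_show() are hypothetical names; the real
store/show logic is not part of the hunk quoted above, and the spinlock is
left out because the model is single-threaded.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Assumed post-rename names: the quoted masks with the 'anon' prefix dropped. */
static unsigned long huge_shmem_orders_always;
static unsigned long huge_shmem_orders_madvise;
static unsigned long huge_shmem_orders_inherit;
static unsigned long huge_shmem_orders_within_size;

#define ORDER_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: record `mode` for `order`.
 * Returns 0 on success, -1 for a value the per-size knob does not accept.
 */
static int shmem_order_store(unsigned int order, const char *mode)
{
	unsigned long *target = NULL;
	unsigned long bit;

	assert(order < ORDER_BITS);
	bit = 1UL << order;

	if (!strcmp(mode, "always"))
		target = &huge_shmem_orders_always;
	else if (!strcmp(mode, "inherit"))
		target = &huge_shmem_orders_inherit;
	else if (!strcmp(mode, "within_size"))
		target = &huge_shmem_orders_within_size;
	else if (!strcmp(mode, "advise"))
		target = &huge_shmem_orders_madvise;
	else if (strcmp(mode, "never"))
		return -1;	/* e.g. "force" and "deny" are rejected */

	/* Clear the order everywhere so it ends up in at most one mask. */
	huge_shmem_orders_always &= ~bit;
	huge_shmem_orders_madvise &= ~bit;
	huge_shmem_orders_inherit &= ~bit;
	huge_shmem_orders_within_size &= ~bit;

	if (target)
		*target |= bit;	/* "never" leaves every mask clear */
	return 0;
}

/* Hypothetical helper: report the value currently recorded for `order`. */
static const char *shmem_order_show(unsigned int order)
{
	unsigned long bit;

	assert(order < ORDER_BITS);
	bit = 1UL << order;

	if (huge_shmem_orders_always & bit)
		return "always";
	if (huge_shmem_orders_inherit & bit)
		return "inherit";
	if (huge_shmem_orders_within_size & bit)
		return "within_size";
	if (huge_shmem_orders_madvise & bit)
		return "advise";
	return "never";
}
```

In this model, shmem_order_store(4, "always") followed by
shmem_order_store(4, "inherit") leaves order 4 set only in the inherit mask,
and a rejected value such as "force" returns -1 without touching any mask.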