Received: by 2002:a25:6193:0:0:0:0:0 with SMTP id v141csp4690742ybb; Tue, 24 Mar 2020 03:33:59 -0700 (PDT) X-Google-Smtp-Source: ADFU+vuKBGAeYnZSFkNrik0bXiqEgd7PzB5SOebFvEPoCJGUVPvd4gZuAP2Y+XuDJ5vlprSuSSGU X-Received: by 2002:a9d:7a9:: with SMTP id 38mr20962291oto.60.1585046038947; Tue, 24 Mar 2020 03:33:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1585046038; cv=none; d=google.com; s=arc-20160816; b=Bk1rfp5JKlRUsJyOe6dixN+Gbd8LvuGgSdD6Y6pbE70QRsCvK+zN6Qgh9Fn2nNyQZl PI/QLQBnKLwxKZCO7OqP2a+HOiTj+7F4HSnYpzlj+HN6018E+oZhIQRiiN/l+ag4NFs5 tw7D8RnnRUwSu0NeF+APGl/YyyaRxf+kIdP67YzBrrHEyfEqb/voMv8kGY7DK4fw5g6I va+FsK08fCdOpOgNntnm0ZYxBfIsafLPX90RxizTk+GPEAQOMvtE3BMruEwepHrTMWn1 VBLmWmTvhVwS2BWS+ZeqiY6MYvxJEGe+P3eziDz9OeWhHPjlg2/sLP5htE99OXgalp8t oNWw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:message-id:date:subject:cc:to:from; bh=8YJYD7tuiIIZrKcPubGSe271mPybqwGJ9Ed/KmNj46Y=; b=gNRbsjw+LvE7Ut2oTO/xfd+nSojrosbhHCm4iavVCb+h/EpKr4WOP1CE+rGZXhBgm4 F4HxeQeaqcj4rsspyhio/Ae7zENtfm0gCxdK4Vg9hcDQ/+AZwy/Ukaa6lGVvXpV7OoNL Zmm/clXE6A3o+wSMtw4+O059UqxlTAAj6511eotOnRoY+jwA+qgygbSfdBbalNMVJAX4 8IJlyx4tRNMCsSPwmsHqY2jpAxVoWvW/Uzn4wJZ7WJNN76wWrf56gJgUkTGJYYYyqz72 n5SZso0t1i0RC+/ogHKvb00EeFqn0lNGxUx4fa2kgQHam56l4JAup2m+FjJR20NLUWcr Zqxw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id u1si9459733otg.231.2020.03.24.03.33.45; Tue, 24 Mar 2020 03:33:58 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727297AbgCXKcQ (ORCPT + 99 others); Tue, 24 Mar 2020 06:32:16 -0400 Received: from out30-42.freemail.mail.aliyun.com ([115.124.30.42]:52655 "EHLO out30-42.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726167AbgCXKcP (ORCPT ); Tue, 24 Mar 2020 06:32:15 -0400 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R471e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e07488;MF=teawaterz@linux.alibaba.com;NM=1;PH=DS;RN=21;SR=0;TI=SMTPD_---0TtVTLwS_1585045921; Received: from localhost(mailfrom:teawaterz@linux.alibaba.com fp:SMTPD_---0TtVTLwS_1585045921) by smtp.aliyun-inc.com(127.0.0.1); Tue, 24 Mar 2020 18:32:06 +0800 From: Hui Zhu To: hannes@cmpxchg.org, mhocko@kernel.org, vdavydov.dev@gmail.com, akpm@linux-foundation.org, hughd@google.com, yang.shi@linux.alibaba.com, kirill@shutemov.name, dan.j.williams@intel.com, aneesh.kumar@linux.ibm.com, sean.j.christopherson@intel.com, thellstrom@vmware.com, guro@fb.com, shakeelb@google.com, chris@chrisdown.name, tj@kernel.org, tglx@linutronix.de, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org Cc: Hui Zhu , Hui Zhu Subject: [PATCH] mm, memcg: Add memory.transparent_hugepage_disabled Date: Tue, 24 Mar 2020 18:31:56 +0800 Message-Id: <1585045916-27339-1-git-send-email-teawater@gmail.com> X-Mailer: git-send-email 2.7.4 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org /sys/kernel/mm/transparent_hugepage/enabled is the only interface to control if the application can use THP in system level. Sometime, we would not want an application use THP even if transparent_hugepage/enabled is set to "always" or "madvise" because thp may need more cpu and memory resources in some cases. This commit add a new interface memory.transparent_hugepage_disabled in memcg. When it set to 1, the application inside the cgroup cannot use THP except dax. Signed-off-by: Hui Zhu --- include/linux/huge_mm.h | 18 ++++++++++++++++-- include/linux/memcontrol.h | 2 ++ mm/memcontrol.c | 42 ++++++++++++++++++++++++++++++++++++++++++ mm/shmem.c | 4 ++++ 4 files changed, 64 insertions(+), 2 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 5aca3d1..fd81479 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -91,6 +91,16 @@ extern bool is_vma_temporary_stack(struct vm_area_struct *vma); extern unsigned long transparent_hugepage_flags; +#ifdef CONFIG_MEMCG +extern bool memcg_transparent_hugepage_disabled(struct vm_area_struct *vma); +#else +static inline bool +memcg_transparent_hugepage_disabled(struct vm_area_struct *vma) +{ + return false; +} +#endif + /* * to be used on vmas which are known to support THP. * Use transparent_hugepage_enabled otherwise @@ -106,8 +116,6 @@ static inline bool __transparent_hugepage_enabled(struct vm_area_struct *vma) if (test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags)) return false; - if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_FLAG)) - return true; /* * For dax vmas, try to always use hugepage mappings. If the kernel does * not support hugepages, fsdax mappings will fallback to PAGE_SIZE @@ -117,6 +125,12 @@ static inline bool __transparent_hugepage_enabled(struct vm_area_struct *vma) if (vma_is_dax(vma)) return true; + if (memcg_transparent_hugepage_disabled(vma)) + return false; + + if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_FLAG)) + return true; + if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG)) return !!(vma->vm_flags & VM_HUGEPAGE); diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index a7a0a1a5..abc3142 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -320,6 +320,8 @@ struct mem_cgroup { #ifdef CONFIG_TRANSPARENT_HUGEPAGE struct deferred_split deferred_split_queue; + + bool transparent_hugepage_disabled; #endif struct mem_cgroup_per_node *nodeinfo[0]; diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 7a4bd8b..b6d91b6 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -5011,6 +5011,14 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css) if (parent) { memcg->swappiness = mem_cgroup_swappiness(parent); memcg->oom_kill_disable = parent->oom_kill_disable; +#ifdef CONFIG_TRANSPARENT_HUGEPAGE + memcg->transparent_hugepage_disabled + = parent->transparent_hugepage_disabled; +#endif + } else { +#ifdef CONFIG_TRANSPARENT_HUGEPAGE + memcg->transparent_hugepage_disabled = false; +#endif } if (parent && parent->use_hierarchy) { memcg->use_hierarchy = true; @@ -6126,6 +6134,24 @@ static ssize_t memory_oom_group_write(struct kernfs_open_file *of, return nbytes; } +static u64 transparent_hugepage_disabled_read(struct cgroup_subsys_state *css, + struct cftype *cft) +{ + struct mem_cgroup *memcg = mem_cgroup_from_css(css); + + return memcg->transparent_hugepage_disabled; +} + +static int transparent_hugepage_disabled_write(struct cgroup_subsys_state *css, + struct cftype *cft, u64 val) +{ + struct mem_cgroup *memcg = mem_cgroup_from_css(css); + + memcg->transparent_hugepage_disabled = !!val; + + return 0; +} + static struct cftype memory_files[] = { { .name = "current", @@ -6179,6 +6205,12 @@ static struct cftype memory_files[] = { .seq_show = memory_oom_group_show, .write = memory_oom_group_write, }, + { + .name = "transparent_hugepage_disabled", + .flags = CFTYPE_NOT_ON_ROOT, + .read_u64 = transparent_hugepage_disabled_read, + .write_u64 = transparent_hugepage_disabled_write, + }, { } /* terminate */ }; @@ -6787,6 +6819,16 @@ void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages) refill_stock(memcg, nr_pages); } +bool memcg_transparent_hugepage_disabled(struct vm_area_struct *vma) +{ + struct mem_cgroup *memcg = get_mem_cgroup_from_mm(vma->vm_mm); + + if (memcg && memcg->transparent_hugepage_disabled) + return true; + + return false; +} + static int __init cgroup_memory(char *s) { char *token; diff --git a/mm/shmem.c b/mm/shmem.c index aad3ba7..253b63b 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1810,6 +1810,10 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index, goto alloc_nohuge; if (shmem_huge == SHMEM_HUGE_DENY || sgp_huge == SGP_NOHUGE) goto alloc_nohuge; +#ifdef CONFIG_TRANSPARENT_HUGEPAGE + if (memcg_transparent_hugepage_disabled(vma)) + goto alloc_nohuge; +#endif if (shmem_huge == SHMEM_HUGE_FORCE) goto alloc_huge; switch (sbinfo->huge) { -- 2.7.4