From: Shakeel Butt <shakeelb@google.com>
To: Jan Kara, Amir Goldstein, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, Greg Thelen,
	Johannes Weiner, Michal Hocko, Vladimir Davydov, Mel Gorman,
	Vlastimil Babka
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	Shakeel Butt
Subject: [PATCH v2 2/3] mm: memcg: plumbing memcg for kmalloc allocations
Date: Tue, 20 Feb 2018 19:01:00 -0800
Message-Id: <20180221030101.221206-3-shakeelb@google.com>
X-Mailer: git-send-email 2.16.1.291.g4437f3f132-goog
In-Reply-To: <20180221030101.221206-1-shakeelb@google.com>
References: <20180221030101.221206-1-shakeelb@google.com>

Introducing the memcg variant for the kmalloc allocation functions.
The kmalloc allocations are internally served from the kmem caches
unless the size of the allocation request is larger than
KMALLOC_MAX_CACHE_SIZE, in which case the kmem caches are bypassed and
the request is routed directly to the page allocator. In either case,
for __GFP_ACCOUNT kmalloc allocations, the memcg of the current task
is charged. This patch introduces memcg variants of the kmalloc
functions to allow callers to provide a specific memcg to charge.
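For illustration only (not part of this patch or series): a caller
holding a reference to a target memcg could use the proposed interface
roughly as sketched below. The struct and function names are
hypothetical; passing a NULL memcg keeps the existing behaviour of
charging the current task's memcg.

	/* Hypothetical usage sketch, not part of this patch. */
	struct foo_ctx {
		void *buf;
	};

	static struct foo_ctx *foo_ctx_alloc(struct mem_cgroup *memcg)
	{
		struct foo_ctx *ctx;

		/* Charge the caller-specified memcg instead of current's. */
		ctx = kmalloc_memcg(sizeof(*ctx),
				    GFP_KERNEL | __GFP_ACCOUNT, memcg);
		if (!ctx)
			return NULL;

		/*
		 * A request larger than KMALLOC_MAX_CACHE_SIZE bypasses the
		 * kmem caches; the page allocator path charges the same
		 * memcg. The 1MB size here is only to illustrate that path.
		 */
		ctx->buf = kmalloc_memcg(1 << 20,
					 GFP_KERNEL | __GFP_ACCOUNT, memcg);
		if (!ctx->buf) {
			kfree(ctx);
			return NULL;
		}
		return ctx;
	}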
Signed-off-by: Shakeel Butt <shakeelb@google.com>
---
Changelog since v1:
- Fixed build for SLOB

 include/linux/memcontrol.h |  3 +-
 include/linux/slab.h       | 45 +++++++++++++++++++++++---
 mm/memcontrol.c            |  9 ++++--
 mm/page_alloc.c            |  2 +-
 mm/slab.c                  | 31 +++++++++++++-----
 mm/slab_common.c           | 41 +++++++++++++++++++++++-
 mm/slob.c                  |  6 ++++
 mm/slub.c                  | 65 +++++++++++++++++++++++++++++++-------
 8 files changed, 172 insertions(+), 30 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 48eaf19859e9..9dec8a5c0ca2 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1179,7 +1179,8 @@ struct kmem_cache *memcg_kmem_get_cache(struct kmem_cache *cachep,
 void memcg_kmem_put_cache(struct kmem_cache *cachep);
 int memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
 			    struct mem_cgroup *memcg);
-int memcg_kmem_charge(struct page *page, gfp_t gfp, int order);
+int memcg_kmem_charge(struct page *page, gfp_t gfp, int order,
+		      struct mem_cgroup *memcg);
 void memcg_kmem_uncharge(struct page *page, int order);
 
 #if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 24355bc9e655..9df5d6279b38 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -352,6 +352,8 @@ static __always_inline int kmalloc_index(size_t size)
 #endif /* !CONFIG_SLOB */
 
 void *__kmalloc(size_t size, gfp_t flags) __assume_kmalloc_alignment __malloc;
+void *__kmalloc_memcg(size_t size, gfp_t flags,
+		      struct mem_cgroup *memcg) __assume_kmalloc_alignment __malloc;
 void *kmem_cache_alloc(struct kmem_cache *, gfp_t flags) __assume_slab_alignment __malloc;
 void *kmem_cache_alloc_memcg(struct kmem_cache *, gfp_t flags,
 			     struct mem_cgroup *memcg) __assume_slab_alignment __malloc;
@@ -378,6 +380,8 @@ static __always_inline void kfree_bulk(size_t size, void **p)
 
 #ifdef CONFIG_NUMA
 void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment __malloc;
+void *__kmalloc_node_memcg(size_t size, gfp_t flags, int node,
+			   struct mem_cgroup *memcg) __assume_kmalloc_alignment __malloc;
 void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node) __assume_slab_alignment __malloc;
 void *kmem_cache_alloc_node_memcg(struct kmem_cache *, gfp_t flags, int node,
 				  struct mem_cgroup *memcg) __assume_slab_alignment __malloc;
@@ -387,6 +391,12 @@ static __always_inline void *__kmalloc_node(size_t size, gfp_t flags, int node)
 {
 	return __kmalloc(size, flags);
 }
+static __always_inline void *__kmalloc_node_memcg(size_t size, gfp_t flags,
+						   struct mem_cgroup *memcg, int node)
+{
+	return __kmalloc_memcg(size, flags, memcg);
+}
+
 static __always_inline void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t flags, int node)
 {
 	return kmem_cache_alloc(s, flags);
@@ -470,15 +480,26 @@ kmem_cache_alloc_node_memcg_trace(struct kmem_cache *s, gfp_t gfpflags,
 #endif /* CONFIG_TRACING */
 
 extern void *kmalloc_order(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment __malloc;
+extern void *kmalloc_order_memcg(size_t size, gfp_t flags, unsigned int order,
+				 struct mem_cgroup *memcg) __assume_page_alignment __malloc;
 
 #ifdef CONFIG_TRACING
 extern void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment __malloc;
+extern void *kmalloc_order_memcg_trace(size_t size, gfp_t flags,
+				       unsigned int order,
+				       struct mem_cgroup *memcg) __assume_page_alignment __malloc;
 #else
 static __always_inline void *
 kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order)
 {
 	return kmalloc_order(size, flags, order);
 }
+static __always_inline void *
+kmalloc_order_memcg_trace(size_t size, gfp_t flags, unsigned int order,
+			  struct mem_cgroup *memcg)
+{
+	return kmalloc_order_memcg(size, flags, order, memcg);
+}
 #endif
 
 static __always_inline void *kmalloc_large(size_t size, gfp_t flags)
@@ -487,6 +508,14 @@ static __always_inline void *kmalloc_large(size_t size, gfp_t flags)
 	return kmalloc_order_trace(size, flags, order);
 }
 
+static __always_inline void *kmalloc_large_memcg(size_t size, gfp_t flags,
+						  struct mem_cgroup *memcg)
+{
+	unsigned int order = get_order(size);
+
+	return kmalloc_order_memcg_trace(size, flags, order, memcg);
+}
+
 /**
  * kmalloc - allocate memory
  * @size: how many bytes of memory are required.
@@ -538,11 +567,12 @@ static __always_inline void *kmalloc_large(size_t size, gfp_t flags)
  * for general use, and so are not documented here. For a full list of
  * potential flags, always refer to linux/gfp.h.
  */
-static __always_inline void *kmalloc(size_t size, gfp_t flags)
+static __always_inline void *
+kmalloc_memcg(size_t size, gfp_t flags, struct mem_cgroup *memcg)
 {
 	if (__builtin_constant_p(size)) {
 		if (size > KMALLOC_MAX_CACHE_SIZE)
-			return kmalloc_large(size, flags);
+			return kmalloc_large_memcg(size, flags, memcg);
 #ifndef CONFIG_SLOB
 		if (!(flags & GFP_DMA)) {
 			int index = kmalloc_index(size);
@@ -550,12 +580,17 @@ static __always_inline void *kmalloc(size_t size, gfp_t flags)
 			if (!index)
 				return ZERO_SIZE_PTR;
 
-			return kmem_cache_alloc_trace(kmalloc_caches[index],
-					flags, size);
+			return kmem_cache_alloc_memcg_trace(
+				kmalloc_caches[index], flags, size, memcg);
 		}
 #endif
 	}
-	return __kmalloc(size, flags);
+	return __kmalloc_memcg(size, flags, memcg);
+}
+
+static __always_inline void *kmalloc(size_t size, gfp_t flags)
+{
+	return kmalloc_memcg(size, flags, NULL);
 }
 
 /*
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index bd37e855e277..0dcd6ab6cc94 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2348,15 +2348,18 @@ int memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
  *
  * Returns 0 on success, an error code on failure.
  */
-int memcg_kmem_charge(struct page *page, gfp_t gfp, int order)
+int memcg_kmem_charge(struct page *page, gfp_t gfp, int order,
+		      struct mem_cgroup *memcg)
 {
-	struct mem_cgroup *memcg;
 	int ret = 0;
 
 	if (memcg_kmem_bypass())
 		return 0;
 
-	memcg = get_mem_cgroup_from_mm(current->mm);
+	if (memcg)
+		memcg = get_mem_cgroup(memcg);
+	if (!memcg)
+		memcg = get_mem_cgroup_from_mm(current->mm);
 	if (!mem_cgroup_is_root(memcg)) {
 		ret = memcg_kmem_charge_memcg(page, gfp, order, memcg);
 		if (!ret)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e2b42f603b1a..d65d58045893 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4261,7 +4261,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid,
 
 out:
 	if (memcg_kmem_enabled() && (gfp_mask & __GFP_ACCOUNT) && page &&
-	    unlikely(memcg_kmem_charge(page, gfp_mask, order) != 0)) {
+	    unlikely(memcg_kmem_charge(page, gfp_mask, order, NULL) != 0)) {
 		__free_pages(page, order);
 		page = NULL;
 	}
diff --git a/mm/slab.c b/mm/slab.c
index 3daeda62bd0c..4282f5a84dcd 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3715,7 +3715,8 @@ EXPORT_SYMBOL(kmem_cache_alloc_node_memcg_trace);
 #endif
 
 static __always_inline void *
-__do_kmalloc_node(size_t size, gfp_t flags, int node, unsigned long caller)
+__do_kmalloc_node(size_t size, gfp_t flags, int node, struct mem_cgroup *memcg,
+		  unsigned long caller)
 {
 	struct kmem_cache *cachep;
 	void *ret;
@@ -3723,7 +3724,8 @@ __do_kmalloc_node(size_t size, gfp_t flags, int node, unsigned long caller)
 	cachep = kmalloc_slab(size, flags);
 	if (unlikely(ZERO_OR_NULL_PTR(cachep)))
 		return cachep;
-	ret = kmem_cache_alloc_node_trace(cachep, flags, node, size);
+	ret = kmem_cache_alloc_node_memcg_trace(cachep, flags, node, size,
+						memcg);
 	kasan_kmalloc(cachep, ret, size, flags);
 
 	return ret;
@@ -3731,14 +3733,21 @@ void *__kmalloc_node(size_t size, gfp_t flags, int node)
 {
-	return __do_kmalloc_node(size, flags, node, _RET_IP_);
+	return __do_kmalloc_node(size, flags, node, NULL, _RET_IP_);
 }
 EXPORT_SYMBOL(__kmalloc_node);
 
+void *__kmalloc_node_memcg(size_t size, gfp_t flags, int node,
+			   struct mem_cgroup *memcg)
+{
+	return __do_kmalloc_node(size, flags, node, memcg, _RET_IP_);
+}
+EXPORT_SYMBOL(__kmalloc_node_memcg);
+
 void *__kmalloc_node_track_caller(size_t size, gfp_t flags,
 		int node, unsigned long caller)
 {
-	return __do_kmalloc_node(size, flags, node, caller);
+	return __do_kmalloc_node(size, flags, node, NULL, caller);
 }
 EXPORT_SYMBOL(__kmalloc_node_track_caller);
 #endif /* CONFIG_NUMA */
@@ -3750,7 +3759,7 @@ EXPORT_SYMBOL(__kmalloc_node_track_caller);
  * @caller: function caller for debug tracking of the caller
  */
 static __always_inline void *__do_kmalloc(size_t size, gfp_t flags,
-					  unsigned long caller)
+					  struct mem_cgroup *memcg, unsigned long caller)
 {
 	struct kmem_cache *cachep;
 	void *ret;
@@ -3758,7 +3767,7 @@ static __always_inline void *__do_kmalloc(size_t size, gfp_t flags,
 	cachep = kmalloc_slab(size, flags);
 	if (unlikely(ZERO_OR_NULL_PTR(cachep)))
 		return cachep;
-	ret = slab_alloc(cachep, flags, NULL, caller);
+	ret = slab_alloc(cachep, flags, memcg, caller);
 	kasan_kmalloc(cachep, ret, size, flags);
 
 	trace_kmalloc(caller, ret,
@@ -3769,13 +3778,19 @@ static __always_inline void *__do_kmalloc(size_t size, gfp_t flags,
 
 void *__kmalloc(size_t size, gfp_t flags)
 {
-	return __do_kmalloc(size, flags, _RET_IP_);
+	return __do_kmalloc(size, flags, NULL, _RET_IP_);
 }
 EXPORT_SYMBOL(__kmalloc);
 
+void *__kmalloc_memcg(size_t size, gfp_t flags, struct mem_cgroup *memcg)
+{
+	return __do_kmalloc(size, flags, memcg, _RET_IP_);
+}
+EXPORT_SYMBOL(__kmalloc_memcg);
+
 void *__kmalloc_track_caller(size_t size, gfp_t flags, unsigned long caller)
 {
-	return __do_kmalloc(size, flags, caller);
+	return __do_kmalloc(size, flags, NULL, caller);
 }
 EXPORT_SYMBOL(__kmalloc_track_caller);
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 10f127b2de7c..49aea3b0725d 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1155,20 +1155,49 @@ void __init create_kmalloc_caches(slab_flags_t flags)
  * directly to the page allocator. We use __GFP_COMP, because we will need to
  * know the allocation order to free the pages properly in kfree.
  */
-void *kmalloc_order(size_t size, gfp_t flags, unsigned int order)
+static __always_inline void *__kmalloc_order_memcg(size_t size, gfp_t flags,
+						    unsigned int order,
+						    struct mem_cgroup *memcg)
 {
 	void *ret;
 	struct page *page;
 
 	flags |= __GFP_COMP;
+
+	/*
+	 * Do explicit targeted memcg charging instead of
+	 * __alloc_pages_nodemask charging current memcg.
+	 */
+	if (memcg && (flags & __GFP_ACCOUNT))
+		flags &= ~__GFP_ACCOUNT;
+
 	page = alloc_pages(flags, order);
+
+	if (memcg && page && memcg_kmem_enabled() &&
+	    memcg_kmem_charge(page, flags, order, memcg)) {
+		__free_pages(page, order);
+		page = NULL;
+	}
+
 	ret = page ? page_address(page) : NULL;
 	kmemleak_alloc(ret, size, 1, flags);
 	kasan_kmalloc_large(ret, size, flags);
 	return ret;
 }
+
+void *kmalloc_order(size_t size, gfp_t flags, unsigned int order)
+{
+	return __kmalloc_order_memcg(size, flags, order, NULL);
+}
 EXPORT_SYMBOL(kmalloc_order);
 
+void *kmalloc_order_memcg(size_t size, gfp_t flags, unsigned int order,
+			  struct mem_cgroup *memcg)
+{
+	return __kmalloc_order_memcg(size, flags, order, memcg);
+}
+EXPORT_SYMBOL(kmalloc_order_memcg);
+
 #ifdef CONFIG_TRACING
 void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order)
 {
@@ -1177,6 +1206,16 @@ void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order)
 	return ret;
 }
 EXPORT_SYMBOL(kmalloc_order_trace);
+
+void *kmalloc_order_memcg_trace(size_t size, gfp_t flags, unsigned int order,
+				struct mem_cgroup *memcg)
+{
+	void *ret = kmalloc_order_memcg(size, flags, order, memcg);
+
+	trace_kmalloc(_RET_IP_, ret, size, PAGE_SIZE << order, flags);
+	return ret;
+}
+EXPORT_SYMBOL(kmalloc_order_memcg_trace);
 #endif
 
 #ifdef CONFIG_SLAB_FREELIST_RANDOM
diff --git a/mm/slob.c b/mm/slob.c
index 49cdd24424b0..696baf517bda 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -470,6 +470,12 @@ void *__kmalloc(size_t size, gfp_t gfp)
 }
 EXPORT_SYMBOL(__kmalloc);
 
+void *__kmalloc_memcg(size_t size, gfp_t gfp, struct mem_cgroup *memcg)
+{
+	return __do_kmalloc_node(size, gfp, NUMA_NO_NODE, _RET_IP_);
+}
+EXPORT_SYMBOL(__kmalloc_memcg);
+
 void *__kmalloc_track_caller(size_t size, gfp_t gfp, unsigned long caller)
 {
 	return __do_kmalloc_node(size, gfp, NUMA_NO_NODE, caller);
diff --git a/mm/slub.c b/mm/slub.c
index 061cfbc7c3d7..5b119f4fb6bc 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3791,13 +3791,14 @@ static int __init setup_slub_min_objects(char *str)
 
 __setup("slub_min_objects=", setup_slub_min_objects);
 
-void *__kmalloc(size_t size, gfp_t flags)
+static __always_inline void *__do_kmalloc(size_t size, gfp_t flags,
+					  struct mem_cgroup *memcg, unsigned long caller)
 {
 	struct kmem_cache *s;
 	void *ret;
 
 	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
-		return kmalloc_large(size, flags);
+		return kmalloc_large_memcg(size, flags, memcg);
 
 	s = kmalloc_slab(size, flags);
@@ -3806,22 +3807,50 @@ void *__kmalloc(size_t size, gfp_t flags)
 
 	ret = slab_alloc(s, flags, NULL, _RET_IP_);
 
-	trace_kmalloc(_RET_IP_, ret, size, s->size, flags);
+	trace_kmalloc(caller, ret, size, s->size, flags);
 
 	kasan_kmalloc(s, ret, size, flags);
 
 	return ret;
 }
+
+void *__kmalloc(size_t size, gfp_t flags)
+{
+	return __do_kmalloc(size, flags, NULL, _RET_IP_);
+}
 EXPORT_SYMBOL(__kmalloc);
 
+void *__kmalloc_memcg(size_t size, gfp_t flags, struct mem_cgroup *memcg)
+{
+	return __do_kmalloc(size, flags, memcg, _RET_IP_);
+}
+EXPORT_SYMBOL(__kmalloc_memcg);
+
 #ifdef CONFIG_NUMA
-static void *kmalloc_large_node(size_t size, gfp_t flags, int node)
+static void *kmalloc_large_node(size_t size, gfp_t flags, int node,
+				struct mem_cgroup *memcg)
 {
 	struct page *page;
 	void *ptr = NULL;
+	unsigned int order = get_order(size);
 
 	flags |= __GFP_COMP;
-	page = alloc_pages_node(node, flags, get_order(size));
+
+	/*
+	 * Do explicit targeted memcg charging instead of
+	 * __alloc_pages_nodemask charging current memcg.
+	 */
+	if (memcg && (flags & __GFP_ACCOUNT))
+		flags &= ~__GFP_ACCOUNT;
+
+	page = alloc_pages_node(node, flags, order);
+
+	if (memcg && page && memcg_kmem_enabled() &&
+	    memcg_kmem_charge(page, flags, order, memcg)) {
+		__free_pages(page, order);
+		page = NULL;
+	}
+
 	if (page)
 		ptr = page_address(page);
@@ -3829,15 +3858,17 @@ static void *kmalloc_large_node(size_t size, gfp_t flags, int node)
 	return ptr;
 }
 
-void *__kmalloc_node(size_t size, gfp_t flags, int node)
+static __always_inline void *
+__do_kmalloc_node_memcg(size_t size, gfp_t flags, int node,
+			struct mem_cgroup *memcg, unsigned long caller)
 {
 	struct kmem_cache *s;
 	void *ret;
 
 	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE)) {
-		ret = kmalloc_large_node(size, flags, node);
+		ret = kmalloc_large_node(size, flags, node, memcg);
 
-		trace_kmalloc_node(_RET_IP_, ret,
+		trace_kmalloc_node(caller, ret,
 				   size, PAGE_SIZE << get_order(size),
 				   flags, node);
@@ -3849,15 +3880,27 @@ void *__kmalloc_node(size_t size, gfp_t flags, int node)
 	if (unlikely(ZERO_OR_NULL_PTR(s)))
 		return s;
 
-	ret = slab_alloc_node(s, flags, node, NULL, _RET_IP_);
+	ret = slab_alloc_node(s, flags, node, memcg, caller);
 
-	trace_kmalloc_node(_RET_IP_, ret, size, s->size, flags, node);
+	trace_kmalloc_node(caller, ret, size, s->size, flags, node);
 
 	kasan_kmalloc(s, ret, size, flags);
 
 	return ret;
 }
+
+void *__kmalloc_node(size_t size, gfp_t flags, int node)
+{
+	return __do_kmalloc_node_memcg(size, flags, node, NULL, _RET_IP_);
+}
 EXPORT_SYMBOL(__kmalloc_node);
+
+void *__kmalloc_node_memcg(size_t size, gfp_t flags, int node,
+			   struct mem_cgroup *memcg)
+{
+	return __do_kmalloc_node_memcg(size, flags, node, memcg, _RET_IP_);
+}
+EXPORT_SYMBOL(__kmalloc_node_memcg);
 #endif
 
 #ifdef CONFIG_HARDENED_USERCOPY
@@ -4370,7 +4413,7 @@ void *__kmalloc_node_track_caller(size_t size, gfp_t gfpflags,
 	void *ret;
 
 	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE)) {
-		ret = kmalloc_large_node(size, gfpflags, node);
+		ret = kmalloc_large_node(size, gfpflags, node, NULL);
 
 		trace_kmalloc_node(caller, ret,
 				   size, PAGE_SIZE << get_order(size),
-- 
2.16.1.291.g4437f3f132-goog