Received: by 2002:a05:6a10:1287:0:0:0:0 with SMTP id d7csp3785957pxv; Mon, 26 Jul 2021 12:01:43 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxiolq3mgP5lVU/XfoRgGG7je/ObJN/CHRxFFHNpsvprhYSaHW4coqSWm2c0EtWtK5bT5z4 X-Received: by 2002:a02:cb92:: with SMTP id u18mr17986485jap.78.1627326103841; Mon, 26 Jul 2021 12:01:43 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1627326103; cv=none; d=google.com; s=arc-20160816; b=ngvSGVoQbmCyNxYgxGYzKDMP7CiFqGv7YVsUEwtdesJtzPVg6WtZfLdDi3/QrPMIc8 /WIjTliMYTEIjdWjQRrSshObBR4Oxb+6NLNJLGMeMiTQ99WfHM/RfzY0558pmtqEm2KS PPSDJELFNYW2sBy0myX4QdQf2MaScb/3B1aqtxtjm3CCk9uW8fwFkZsc1ubFR+3gydOM 8RnN5QE1gV5vNnOFsgVkXn0OY8vYvVLofFBvEBOKFSmJx15/uq0HgJkK4Hbg7DtQpe/h 5sMwCINCjrZHDIN3aA9c+b2VuVT6MBsqCLVJm0nsMqrDr0AvEE5s6tVuzHa23gMRz97p PiwQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:references:cc :to:subject:from:dkim-signature; bh=Mmyov/gGbqGKw7raDVRA7gmUoQ0+HKnkbSQKhbmo2a0=; b=ii6/zkze218KmLO9kPT4Oe2BMmwt6mtdPRsjcy3RqSYgXCXXtH8RbhUE0jwUm0KmwL dwNBh6dUsz63hR3JVe6PCP+hBg/kcuai7RuC9YbuyuPYlsDsl7sjS1/uERTU/UGlm9g1 x3hoSXtmOFmVP8OVWLW/seGqgLIkhvnxtUVu30b/nWmy5lBzSgFzbvaB333bZaOE/wMS bPpvhxQTM5Td9GnEbza6XM7Wv54Ej2w0VMlLTUn37Ce9AhSZjtwhF3qDU/3cmkzGorxA eHUNKwtJVyR8KNCfsMICGVfXbD4386UAU22Y8EkOPruPK51K5j/sezEblkJOEh5wBMys 7QiA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@virtuozzo.com header.s=relay header.b=BcoP0TsV; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=virtuozzo.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id g17si630466jaq.116.2021.07.26.12.01.31; Mon, 26 Jul 2021 12:01:43 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@virtuozzo.com header.s=relay header.b=BcoP0TsV; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=virtuozzo.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232541AbhGZSTo (ORCPT + 99 others); Mon, 26 Jul 2021 14:19:44 -0400 Received: from relay.sw.ru ([185.231.240.75]:55016 "EHLO relay.sw.ru" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232332AbhGZSTk (ORCPT ); Mon, 26 Jul 2021 14:19:40 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=virtuozzo.com; s=relay; h=Content-Type:MIME-Version:Date:Message-ID:Subject :From; bh=Mmyov/gGbqGKw7raDVRA7gmUoQ0+HKnkbSQKhbmo2a0=; b=BcoP0TsVgn7sQK+GAfX l06/xEY10rF6nb4u8PIkU99hqVQaktpHXV9skZOCSEDXyuCwtGXJq49jQ9ptZop4TJu30fr8Gb2rv ytornZ/BC8yPb8NkcmDfZcHwOZHJPOxI3pST7mFJv/3uI9qKRZz4e5ZNEgT2hkJ7ENAu8hj2XjA=; Received: from [10.93.0.56] by relay.sw.ru with esmtp (Exim 4.94.2) (envelope-from ) id 1m85pf-005JRc-Du; Mon, 26 Jul 2021 22:00:07 +0300 From: Vasily Averin Subject: [PATCH v6 02/16] memcg: enable accounting for IP address and routing-related objects To: Andrew Morton Cc: cgroups@vger.kernel.org, Michal Hocko , Shakeel Butt , Johannes Weiner , Vladimir Davydov , Roman Gushchin , "David S. Miller" , Jakub Kicinski , Hideaki YOSHIFUJI , David Ahern , netdev@vger.kernel.org, linux-kernel@vger.kernel.org References: <9bf9d9bd-03b1-2adb-17b4-5d59a86a9394@virtuozzo.com> Message-ID: <3f1754de-0abd-d480-8d23-c03469ca072e@virtuozzo.com> Date: Mon, 26 Jul 2021 22:00:06 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org An netadmin inside container can use 'ip a a' and 'ip r a' to assign a large number of ipv4/ipv6 addresses and routing entries and force kernel to allocate megabytes of unaccounted memory for long-lived per-netdevice related kernel objects: 'struct in_ifaddr', 'struct inet6_ifaddr', 'struct fib6_node', 'struct rt6_info', 'struct fib_rules' and ip_fib caches. These objects can be manually removed, though usually they lives in memory till destroy of its net namespace. It makes sense to account for them to restrict the host's memory consumption from inside the memcg-limited container. One of such objects is the 'struct fib6_node' mostly allocated in net/ipv6/route.c::__ip6_ins_rt() inside the lock_bh()/unlock_bh() section: write_lock_bh(&table->tb6_lock); err = fib6_add(&table->tb6_root, rt, info, mxc); write_unlock_bh(&table->tb6_lock); In this case it is not enough to simply add SLAB_ACCOUNT to corresponding kmem cache. The proper memory cgroup still cannot be found due to the incorrect 'in_interrupt()' check used in memcg_kmem_bypass(). Obsoleted in_interrupt() does not describe real execution context properly. From include/linux/preempt.h: The following macros are deprecated and should not be used in new code: in_interrupt() - We're in NMI,IRQ,SoftIRQ context or have BH disabled To verify the current execution context new macro should be used instead: in_task() - We're in task context Signed-off-by: Vasily Averin --- mm/memcontrol.c | 2 +- net/core/fib_rules.c | 4 ++-- net/ipv4/devinet.c | 2 +- net/ipv4/fib_trie.c | 4 ++-- net/ipv6/addrconf.c | 2 +- net/ipv6/ip6_fib.c | 4 ++-- net/ipv6/route.c | 2 +- 7 files changed, 10 insertions(+), 10 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index ae1f5d0..1bbf239 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -968,7 +968,7 @@ static __always_inline bool memcg_kmem_bypass(void) return false; /* Memcg to charge can't be determined. */ - if (in_interrupt() || !current->mm || (current->flags & PF_KTHREAD)) + if (!in_task() || !current->mm || (current->flags & PF_KTHREAD)) return true; return false; diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c index a9f9379..79df7cd 100644 --- a/net/core/fib_rules.c +++ b/net/core/fib_rules.c @@ -57,7 +57,7 @@ int fib_default_rule_add(struct fib_rules_ops *ops, { struct fib_rule *r; - r = kzalloc(ops->rule_size, GFP_KERNEL); + r = kzalloc(ops->rule_size, GFP_KERNEL_ACCOUNT); if (r == NULL) return -ENOMEM; @@ -541,7 +541,7 @@ static int fib_nl2rule(struct sk_buff *skb, struct nlmsghdr *nlh, goto errout; } - nlrule = kzalloc(ops->rule_size, GFP_KERNEL); + nlrule = kzalloc(ops->rule_size, GFP_KERNEL_ACCOUNT); if (!nlrule) { err = -ENOMEM; goto errout; diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c index 73721a4..d38124b 100644 --- a/net/ipv4/devinet.c +++ b/net/ipv4/devinet.c @@ -215,7 +215,7 @@ static void devinet_sysctl_unregister(struct in_device *idev) static struct in_ifaddr *inet_alloc_ifa(void) { - return kzalloc(sizeof(struct in_ifaddr), GFP_KERNEL); + return kzalloc(sizeof(struct in_ifaddr), GFP_KERNEL_ACCOUNT); } static void inet_rcu_free_ifa(struct rcu_head *head) diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c index 25cf387..8060524 100644 --- a/net/ipv4/fib_trie.c +++ b/net/ipv4/fib_trie.c @@ -2380,11 +2380,11 @@ void __init fib_trie_init(void) { fn_alias_kmem = kmem_cache_create("ip_fib_alias", sizeof(struct fib_alias), - 0, SLAB_PANIC, NULL); + 0, SLAB_PANIC | SLAB_ACCOUNT, NULL); trie_leaf_kmem = kmem_cache_create("ip_fib_trie", LEAF_SIZE, - 0, SLAB_PANIC, NULL); + 0, SLAB_PANIC | SLAB_ACCOUNT, NULL); } struct fib_table *fib_trie_table(u32 id, struct fib_table *alias) diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 3bf685f..8eaeade 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -1080,7 +1080,7 @@ static int ipv6_add_addr_hash(struct net_device *dev, struct inet6_ifaddr *ifa) goto out; } - ifa = kzalloc(sizeof(*ifa), gfp_flags); + ifa = kzalloc(sizeof(*ifa), gfp_flags | __GFP_ACCOUNT); if (!ifa) { err = -ENOBUFS; goto out; diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index 2d650dc..a8f118e 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -2449,8 +2449,8 @@ int __init fib6_init(void) int ret = -ENOMEM; fib6_node_kmem = kmem_cache_create("fib6_nodes", - sizeof(struct fib6_node), - 0, SLAB_HWCACHE_ALIGN, + sizeof(struct fib6_node), 0, + SLAB_HWCACHE_ALIGN | SLAB_ACCOUNT, NULL); if (!fib6_node_kmem) goto out; diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 7b756a7..5f7286a 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -6638,7 +6638,7 @@ int __init ip6_route_init(void) ret = -ENOMEM; ip6_dst_ops_template.kmem_cachep = kmem_cache_create("ip6_dst_cache", sizeof(struct rt6_info), 0, - SLAB_HWCACHE_ALIGN, NULL); + SLAB_HWCACHE_ALIGN | SLAB_ACCOUNT, NULL); if (!ip6_dst_ops_template.kmem_cachep) goto out; -- 1.8.3.1