Message-ID: <7e867cb0-89d6-402c-33d2-9b9ba0ba1523@openvz.org>
Date: Wed, 27 Apr 2022 13:37:50 +0300
From: Vasily Averin
Subject: [PATCH] memcg: accounting for objects allocated for new netdevice
To: Roman Gushchin, Vlastimil Babka, Shakeel Butt
Cc: kernel@openvz.org, Florian Westphal, linux-kernel@vger.kernel.org, Michal Hocko, cgroups@vger.kernel.org, netdev@vger.kernel.org, "David S.
Miller", Jakub Kicinski, Paolo Abeni, Greg Kroah-Hartman, Tejun Heo, Luis Chamberlain, Kees Cook, Iurii Zaikin, linux-fsdevel@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Creating a new netdevice allocates at least ~50 KB of memory for various
kernel objects, but only ~5 KB of it is accounted to memcg. As a result,
creating an unlimited number of netdevices inside a memcg-limited container
is not constrained by the memcg limits, consumes a significant part of the
host's memory, and can cause a global OOM that leads to random kills of
host processes.

The main consumers of non-accounted memory are:
 ~10 KB  80+ kernfs nodes
  ~6 KB  ipv6_add_dev() allocations
   6 KB  __register_sysctl_table() allocations
   4 KB  neigh_sysctl_register() allocations
   4 KB  __devinet_sysctl_register() allocations
   4 KB  __addrconf_sysctl_register() allocations

Accounting for these objects raises the share of memcg-accounted memory
to 60-70% (~38 KB accounted vs ~54 KB total for a dummy netdevice on a
typical VM with the default Fedora 35 kernel), which should be enough to
give the host some protection from misuse inside a container.

Other related objects are quite small and are left unaccounted to
minimize the expected performance degradation.

The ~300 bytes of percpu allocation of struct ipstats_mib in
snmp6_alloc_dev() deserve a separate mention: on huge multi-CPU nodes
this allocation can become the main consumer of memory.

Signed-off-by: Vasily Averin
---
RFC was discussed here:
https://lore.kernel.org/all/a5e09e93-106d-0527-5b1e-48dbf3b48b4e@virtuozzo.com/
---
 fs/kernfs/mount.c     | 2 +-
 fs/proc/proc_sysctl.c | 2 +-
 net/core/neighbour.c  | 2 +-
 net/ipv4/devinet.c    | 2 +-
 net/ipv6/addrconf.c   | 8 ++++----
 5 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index cfa79715fc1a..2881aeeaa880 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -391,7 +391,7 @@ void __init kernfs_init(void)
 {
 	kernfs_node_cache = kmem_cache_create("kernfs_node_cache",
 					      sizeof(struct kernfs_node),
-					      0, SLAB_PANIC, NULL);
+					      0, SLAB_PANIC | SLAB_ACCOUNT, NULL);
 
 	/* Creates slab cache for kernfs inode attributes */
 	kernfs_iattrs_cache = kmem_cache_create("kernfs_iattrs_cache",
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 7d9cfc730bd4..df4604fea4f8 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -1333,7 +1333,7 @@ struct ctl_table_header *__register_sysctl_table(
 		nr_entries++;
 
 	header = kzalloc(sizeof(struct ctl_table_header) +
-			 sizeof(struct ctl_node)*nr_entries, GFP_KERNEL);
+			 sizeof(struct ctl_node)*nr_entries, GFP_KERNEL_ACCOUNT);
 	if (!header)
 		return NULL;
 
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index ec0bf737b076..3dcda2a54f86 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -3728,7 +3728,7 @@ int neigh_sysctl_register(struct net_device *dev, struct neigh_parms *p,
 	char neigh_path[ sizeof("net//neigh/") + IFNAMSIZ + IFNAMSIZ ];
 	char *p_name;
 
-	t = kmemdup(&neigh_sysctl_template, sizeof(*t), GFP_KERNEL);
+	t = kmemdup(&neigh_sysctl_template, sizeof(*t), GFP_KERNEL_ACCOUNT);
 	if (!t)
 		goto err;
 
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index fba2bffd65f7..47523fe5b891 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -2566,7 +2566,7 @@ static int __devinet_sysctl_register(struct net *net, char *dev_name,
 	struct devinet_sysctl_table *t;
 	char path[sizeof("net/ipv4/conf/") + IFNAMSIZ];
 
-	t = kmemdup(&devinet_sysctl, sizeof(*t), GFP_KERNEL);
+	t = kmemdup(&devinet_sysctl, sizeof(*t), GFP_KERNEL_ACCOUNT);
 	if (!t)
 		goto out;
 
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index f908e2fd30b2..e79621ee4a0a 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -342,7 +342,7 @@ static int snmp6_alloc_dev(struct inet6_dev *idev)
 {
 	int i;
 
-	idev->stats.ipv6 = alloc_percpu(struct ipstats_mib);
+	idev->stats.ipv6 = alloc_percpu_gfp(struct ipstats_mib, GFP_KERNEL_ACCOUNT);
 	if (!idev->stats.ipv6)
 		goto err_ip;
 
@@ -358,7 +358,7 @@ static int snmp6_alloc_dev(struct inet6_dev *idev)
 	if (!idev->stats.icmpv6dev)
 		goto err_icmp;
 	idev->stats.icmpv6msgdev = kzalloc(sizeof(struct icmpv6msg_mib_device),
-					   GFP_KERNEL);
+					   GFP_KERNEL_ACCOUNT);
 	if (!idev->stats.icmpv6msgdev)
 		goto err_icmpmsg;
 
@@ -382,7 +382,7 @@ static struct inet6_dev *ipv6_add_dev(struct net_device *dev)
 	if (dev->mtu < IPV6_MIN_MTU)
 		return ERR_PTR(-EINVAL);
 
-	ndev = kzalloc(sizeof(struct inet6_dev), GFP_KERNEL);
+	ndev = kzalloc(sizeof(struct inet6_dev), GFP_KERNEL_ACCOUNT);
 	if (!ndev)
 		return ERR_PTR(err);
 
@@ -7029,7 +7029,7 @@ static int __addrconf_sysctl_register(struct net *net, char *dev_name,
 	struct ctl_table *table;
 	char path[sizeof("net/ipv6/conf/") + IFNAMSIZ];
 
-	table = kmemdup(addrconf_sysctl, sizeof(addrconf_sysctl), GFP_KERNEL);
+	table = kmemdup(addrconf_sysctl, sizeof(addrconf_sysctl), GFP_KERNEL_ACCOUNT);
 	if (!table)
 		goto out;
 
-- 
2.31.1