Date: Wed, 4 May 2022 13:50:41 -0700
From: Luis Chamberlain
To: Vasily Averin
Cc: Shakeel Butt, kernel@openvz.org, Florian Westphal, linux-kernel@vger.kernel.org, Roman Gushchin, Vlastimil Babka, Michal Hocko, cgroups@vger.kernel.org, netdev@vger.kernel.org, "David S. Miller", Jakub Kicinski, Paolo Abeni, Kees Cook, Iurii Zaikin, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH memcg v2] memcg: accounting for objects allocated for new netdevice

On Mon, May 02, 2022 at 03:15:51PM +0300, Vasily Averin wrote:
> Creating a new netdevice allocates at least ~50Kb of memory for various
> kernel objects, but only ~5Kb of them are accounted to memcg. As a result,
> creating an unlimited number of netdevices inside a memcg-limited container
> does not fall within memcg restrictions, consumes a significant part
> of the host's memory, can cause global OOM and lead to random kills of
> host processes.
>
> The main consumers of non-accounted memory are:
>  ~10Kb   80+ kernfs nodes
>  ~6Kb    ipv6_add_dev() allocations
>   6Kb    __register_sysctl_table() allocations
>   4Kb    neigh_sysctl_register() allocations
>   4Kb    __devinet_sysctl_register() allocations
>   4Kb    __addrconf_sysctl_register() allocations
>
> Accounting for these objects increases the share of memcg-accounted
> memory to 60-70% (~38Kb accounted vs ~54Kb total for a dummy netdevice
> on a typical VM with the default Fedora 35 kernel), and this should be
> enough to somewhat protect the host from misuse inside a container.
>
> Other related objects are quite small and may be left unaccounted
> to minimize the expected performance degradation.
>
> The ~300 bytes of percpu allocation of struct ipstats_mib in
> snmp6_alloc_dev() should be mentioned separately: on huge multi-cpu
> nodes it can become the main consumer of memory.
>
> This patch does not enable kernfs accounting as it affects
> other parts of the kernel and should be discussed separately.
> However, even without kernfs, this patch significantly improves the
> current situation and accounts for more than half
> of all netdevice allocations.
>
> Signed-off-by: Vasily Averin
> ---
> v2: 1) kernfs accounting moved into separate patch, suggested by
>        Shakeel and mkoutny@.
>     2) in ipv6_add_dev() changed original "sizeof(struct inet6_dev)"
>        to "sizeof(*ndev)", according to checkpatch.pl recommendation:
>          CHECK: Prefer kzalloc(sizeof(*ndev)...) over
>          kzalloc(sizeof(struct inet6_dev)...)
> ---
>  fs/proc/proc_sysctl.c    |  2 +-

for proc_sysctl:

Acked-by: Luis Chamberlain

  Luis
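
As a rough sanity check on the figures quoted above, the listed consumers can be summed and compared against the ~54Kb total. This is only a sketch: all sizes are the approximate values from the patch description, and the variable and function names (`listed_kb`, `share`, and so on) are made up for this illustration. It assumes the ~5Kb already accounted does not overlap with the listed consumers.

```python
# Approximate sizes (in Kb) of the non-accounted consumers listed in the
# quoted patch description. The keys are labels for this sketch only.
listed_kb = {
    "kernfs nodes": 10,  # listed, but kernfs accounting is NOT enabled by this patch
    "ipv6_add_dev()": 6,
    "__register_sysctl_table()": 6,
    "neigh_sysctl_register()": 4,
    "__devinet_sysctl_register()": 4,
    "__addrconf_sysctl_register()": 4,
}

already_accounted_kb = 5  # "only ~5Kb of them are accounted to memcg"
total_kb = 54             # "~54Kb total for dummy netdevice"


def share(accounted_kb, total):
    """Fraction of the total allocation that is charged to the memcg."""
    return accounted_kb / total


listed_total = sum(listed_kb.values())
with_kernfs = already_accounted_kb + listed_total
without_kernfs = with_kernfs - listed_kb["kernfs nodes"]

print(f"with kernfs:    ~{with_kernfs}Kb, {share(with_kernfs, total_kb):.0%}")
print(f"without kernfs: ~{without_kernfs}Kb, {share(without_kernfs, total_kb):.0%}")
```

Under these assumptions, the sum with kernfs lands close to the quoted ~38Kb (roughly 70% of the total), and the sum without kernfs still exceeds half of the total, which agrees with the "more than half" claim for this version of the patch.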