Received: by 2002:a05:6a10:f3d0:0:0:0:0 with SMTP id a16csp118619pxv; Tue, 13 Jul 2021 23:33:13 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwGc0fZJjwj0E+6HVRd/N+I8bvFUVRc+gCPJ5zvMWmK2aRcoMoewXlnx9qQqxNIpTxchaqp X-Received: by 2002:a17:906:4a0a:: with SMTP id w10mr10555062eju.371.1626244393453; Tue, 13 Jul 2021 23:33:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1626244393; cv=none; d=google.com; s=arc-20160816; b=U04X503+TZPpkQSfc7vh9daIYU9PTb/NXNE+npJ5SBUYIIQVWCptIiCbmHoRyYSZhU oVIHck5Vybi+7fWRauptj0bh4bziTRrm48RH19ulqiqzx+5vPMd04A6qAXxJqKWae0+r hsBLLhsThBMaItiCCyQT4h9+DpMbQh70wwQpQ8oV9jrjHiVjyG6sH1jJR7jMwk1JlC6/ 9fYurfHUj72cRZgYaI0/3JCHs5Pxjdw7eILkVJf93AE4gmm+1hC9wG0EaWizaTY24CzK eU2gNhJ3vfAwje1bg5gkHPKa0p/1uFd9n9V0q7NchLDkel6mmXifJvYad+IP87nb/ulR e9Bw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:references:cc :to:from:subject:dkim-signature; bh=CmW4WEPrBYc9JkO1ZpHwc4AZ/QqQIR1GbHwoX4kgVJY=; b=Wsc39Px220sYmPWh8e4yehaREVsXUKVm2joFx/+dqWuQ6aw8w/2Sgt7w6kEEwkQlrd HEq8b8LJduPzyOpBRoKf4+rdJdduGu/exTnKFFBlN7IBm5dcXSauHc8rjIaIxDhawUCl sKoo0fXJf7QUGpy+fGcXW6R4X4dMHF9g0dVp71SSs2OX2p59B+ys7l0zD9CO+Oho9oyp cw3105xABNL6DecI8VnMD7OEkfMCbtGar+WXOLcrciUVq5UZSqS4CnU9R0Tha6oUMgzP aJWY8CKIZcGOUWhDp0yM3JvxnIwG8blNSnmVYaBEtkknf2o+QMR0jNUM0rRkH/pF3jco f+9Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@virtuozzo.com header.s=relay header.b=o6m5sOcY; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=virtuozzo.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id y9si1567752eju.362.2021.07.13.23.32.49; Tue, 13 Jul 2021 23:33:13 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@virtuozzo.com header.s=relay header.b=o6m5sOcY; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=virtuozzo.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238061AbhGNGed (ORCPT + 99 others); Wed, 14 Jul 2021 02:34:33 -0400 Received: from relay.sw.ru ([185.231.240.75]:34486 "EHLO relay.sw.ru" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238003AbhGNGed (ORCPT ); Wed, 14 Jul 2021 02:34:33 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=virtuozzo.com; s=relay; h=Content-Type:MIME-Version:Date:Message-ID:From: Subject; bh=CmW4WEPrBYc9JkO1ZpHwc4AZ/QqQIR1GbHwoX4kgVJY=; b=o6m5sOcYtTchWFIWd TJWZ1yyazhKhz8n6I4Mzq8WRrP5nncMBYc4zFUWIYbiTxTH7Ggl7xiv/7LteuVnDqRbTsUdZd+E0o dgPq+uUl7Y4KhHVL6YmNtWaNe/uNpa865AZKZ+sqKfeq73BLmfg/ZQHSUyLCLQU8wEFCuWjR5H5HI =; Received: from [10.93.0.56] by relay.sw.ru with esmtp (Exim 4.94.2) (envelope-from ) id 1m3YQh-003uXN-Pc; Wed, 14 Jul 2021 09:31:35 +0300 Subject: Re: [PATCH v2 1/1] memcg: enable accounting for pids in nested pid namespaces From: Vasily Averin To: Andrew Morton Cc: linux-kernel@vger.kernel.org, Roman Gushchin , Christian Brauner , =?UTF-8?Q?Michal_Koutn=c3=bd?= , Serge Hallyn , cgroups@vger.kernel.org, Michal Hocko References: <7b777e22-5b0d-7444-343d-92cbfae5f8b4@virtuozzo.com> <8b6de616-fd1a-02c6-cbdb-976ecdcfa604@virtuozzo.com> Message-ID: <21db0c2c-45ea-fded-9633-7b76ab2b1083@virtuozzo.com> Date: Wed, 14 Jul 2021 09:31:34 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: <8b6de616-fd1a-02c6-cbdb-976ecdcfa604@virtuozzo.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Dear Andrew, could you please pick up this patch and add Reviewed-by: Shakeel Butt Thank you, Vasily Averin On 4/24/21 2:54 PM, Vasily Averin wrote: > Commit 5d097056c9a0 ("kmemcg: account certain kmem allocations to memcg") > enabled memcg accounting for pids allocated from init_pid_ns.pid_cachep, > but forgot to adjust the setting for nested pid namespaces. > As a result, pid memory is not accounted exactly where it is really needed, > inside memcg-limited containers with their own pid namespaces. > > Pid was one the first kernel objects enabled for memcg accounting. > init_pid_ns.pid_cachep marked by SLAB_ACCOUNT and we can expect that > any new pids in the system are memcg-accounted. > > Though recently I've noticed that it is wrong. nested pid namespaces creates > own slab caches for pid objects, nested pids have increased size because contain > id both for all parent and for own pid namespaces. The problem is that these slab > caches are _NOT_ marked by SLAB_ACCOUNT, as a result any pids allocated in > nested pid namespaces are not memcg-accounted. > > Pid struct in nested pid namespace consumes up to 500 bytes memory, > 100000 such objects gives us up to ~50Mb unaccounted memory, > this allow container to exceed assigned memcg limits. > > Fixes: 5d097056c9a0 ("kmemcg: account certain kmem allocations to memcg") > Cc: stable@vger.kernel.org > Signed-off-by: Vasily Averin > Reviewed-by: Michal Koutný > Acked-by: Christian Brauner > Acked-by: Roman Gushchin > --- > kernel/pid_namespace.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c > index 6cd6715..a46a372 100644 > --- a/kernel/pid_namespace.c > +++ b/kernel/pid_namespace.c > @@ -51,7 +51,8 @@ static struct kmem_cache *create_pid_cachep(unsigned int level) > mutex_lock(&pid_caches_mutex); > /* Name collision forces to do allocation under mutex. */ > if (!*pkc) > - *pkc = kmem_cache_create(name, len, 0, SLAB_HWCACHE_ALIGN, 0); > + *pkc = kmem_cache_create(name, len, 0, > + SLAB_HWCACHE_ALIGN | SLAB_ACCOUNT, 0); > mutex_unlock(&pid_caches_mutex); > /* current can fail, but someone else can succeed. */ > return READ_ONCE(*pkc); >