Received: by 2002:ac0:a5b6:0:0:0:0:0 with SMTP id m51-v6csp4842757imm; Tue, 19 Jun 2018 00:22:38 -0700 (PDT) X-Google-Smtp-Source: ADUXVKILPqAEliwma7FJQalWACamG1bRqXsHqDUkevpVpExDOAB+do8MnDbQbDARgyAzVj5U23Hu X-Received: by 2002:a17:902:8e87:: with SMTP id bg7-v6mr17502395plb.129.1529392958753; Tue, 19 Jun 2018 00:22:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529392958; cv=none; d=google.com; s=arc-20160816; b=ogxhVYmJC0qLG1IQqG8cUWr0YxfKW7ZRRpxJBIk/HvKt2mzd71oN+ZDwrBFX4vSNF5 xAW3aElmZI7swMTOxQMw15ZaptqBiP5OaU3TDc9vTlwQCqG6b752O2TonYRqDOPqs5tm VTeNnVnWn2l533ARYhlzg9h7I63Wg86vFrM3myngQZym+B+UoRTbkHBRnul/JyIe1dwq +fUndglRW99bTYZRqbCkyXg4QBJF8ab5O2QuO3kRXfW9J9KlSvbjzX/JeUzvjubR1XB0 7M+ELHXimLa/+EVhMdK2y/1XFQ8/ePxaVKYfTKB2uJZMNdnPxLRKRWslJXldTUdcPg8P /qnw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature :arc-authentication-results; bh=IIIluWCNyf8WZcOvbWOkP62FJRvfEyzTJeQNR73PNHY=; b=mIL8/AwOZGYRwe2mcTIEBjZIUerVrhr09iIT1PJ5X4H4o5IMZviz2bMzDB3DGyJkZq jjuHUmZDBoeIVy5GCGQGKGwO9/KoSMTrhw7ADhIhqSIgqkBMx9CBjfrXTKYir9s8ZsNZ 2q2+rhoHCbka/a/eZD1a6chFGrA5k3+a0a5si8B1mrKMcbxcBqI9fKY4FcFhZpvBdB8J dV+NnYRDBdpvMDHF7Te5yAGtnAWwSnTh7zc87U+uEP1clfYW9qdXRgB5DxiylnebMy++ 9LhNRgA8EL/W4GkdLjL4HOM5UABLAGfsmDL/eix9mg6BNgzc+mU6p6U8ewpLdqoCVw+5 Ii1w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=prjl03Q5; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id z73-v6si13623783pgd.122.2018.06.19.00.22.25; Tue, 19 Jun 2018 00:22:38 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=prjl03Q5; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756178AbeFSHUy (ORCPT + 99 others); Tue, 19 Jun 2018 03:20:54 -0400 Received: from mail-yb0-f194.google.com ([209.85.213.194]:33922 "EHLO mail-yb0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755674AbeFSHUw (ORCPT ); Tue, 19 Jun 2018 03:20:52 -0400 Received: by mail-yb0-f194.google.com with SMTP id n23-v6so6930851ybg.1; Tue, 19 Jun 2018 00:20:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=IIIluWCNyf8WZcOvbWOkP62FJRvfEyzTJeQNR73PNHY=; b=prjl03Q5Ox7pBLB9KVKVs2qNuJ1GTZ9tEMjz170EaEp6MCSVY77Vag++bU5L0dvlcl KCo4qr6CttLinXN5hbNmEtfT9jP5LFReDYK0lERu3YhwPw9yO8qdriYXKoMoMdFCpm14 nFDbS9qCWgzzlhXL1FL4YJ57hSFu14wnlgpuLw2KGYyZNEiy1ZF5D50VS1LVeThrVl+o xqOrVLrGA555xLNl4g1CCLZTMuI2GZBoTtX254s8GtHJ9cvR43RYp5rkAq14XnIzq+VL 5O7sv1QcwoG0r9HfRoTNpBEGZd8lqNEAsYfWopPliZQuftPM1GGH0F0RcsNVtj0Byogz X9TQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=IIIluWCNyf8WZcOvbWOkP62FJRvfEyzTJeQNR73PNHY=; b=Pk2ylM0gzuyFnfWbpZw+nSaPUjNm8V1o8Gsw5HxDVRYhirI+jGgkbazMzmqO5+rzHZ F+ew2VYZvZPCPzOTHphMsRiyTR9le6oMUCJsuLqyOZ3pKxktMnEk+leixv3yh7kfX0XN VwPTKs9LvrPA4F5zqRxRKP0rkskLXA9MnnNEg7nvaQPjsXmCIzGs4JyjyeVpplc0A87R /9q04SOfA6PsD3IjsUYK3XQiN+unrPEtmK0EId0aTeEhy8rGpnRxIVrP4N49hXqiMRoS NBj77uPyojwBzOcqq2wx5z704wgr4rUi/eiVY7AReZ/fqpMPmMPgcfgI9LRhTUn203ef CLHQ== X-Gm-Message-State: APt69E1WQa0zUzkypsy+yapdYAqq3To3FxMIkPtlPVTHhyfIXUB+TqnD 2QdOH74dlfy+Jg0RaSp4+qwdMT55JzkxxaKzmMo= X-Received: by 2002:a5b:146:: with SMTP id c6-v6mr73527ybp.377.1529392851844; Tue, 19 Jun 2018 00:20:51 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a81:79cf:0:0:0:0:0 with HTTP; Tue, 19 Jun 2018 00:20:51 -0700 (PDT) In-Reply-To: <20180619051327.149716-3-shakeelb@google.com> References: <20180619051327.149716-1-shakeelb@google.com> <20180619051327.149716-3-shakeelb@google.com> From: Amir Goldstein Date: Tue, 19 Jun 2018 10:20:51 +0300 Message-ID: Subject: Re: [PATCH 2/3] fs: fsnotify: account fsnotify metadata to kmemcg To: Shakeel Butt Cc: Andrew Morton , Michal Hocko , Johannes Weiner , Vladimir Davydov , Jan Kara , Greg Thelen , linux-kernel , cgroups@vger.kernel.org, linux-fsdevel , Linux MM , Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim , Mel Gorman , Vlastimil Babka , Alexander Viro Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jun 19, 2018 at 8:13 AM, Shakeel Butt wrote: > A lot of memory can be consumed by the events generated for the huge or > unlimited queues if there is either no or slow listener. This can cause > system level memory pressure or OOMs. So, it's better to account the > fsnotify kmem caches to the memcg of the listener. > > There are seven fsnotify kmem caches and among them allocations from > dnotify_struct_cache, dnotify_mark_cache, fanotify_mark_cache and > inotify_inode_mark_cachep happens in the context of syscall from the > listener. So, SLAB_ACCOUNT is enough for these caches. > > The objects from fsnotify_mark_connector_cachep are not accounted as they > are small compared to the notification mark or events and it is unclear > whom to account connector to since it is shared by all events attached to > the inode. > > The allocations from the event caches happen in the context of the event > producer. For such caches we will need to remote charge the allocations > to the listener's memcg. Thus we save the memcg reference in the > fsnotify_group structure of the listener. > > This patch has also moved the members of fsnotify_group to keep the size > same, at least for 64 bit build, even with additional member by filling > the holes. > > Signed-off-by: Shakeel Butt > Acked-by: Jan Kara > Cc: Michal Hocko > Cc: Amir Goldstein > Cc: Christoph Lameter > Cc: Pekka Enberg > Cc: David Rientjes > Cc: Joonsoo Kim > Cc: Greg Thelen > Cc: Johannes Weiner > Cc: Vladimir Davydov > Cc: Mel Gorman > Cc: Vlastimil Babka > Cc: Alexander Viro > Cc: Andrew Morton > --- > Changelog since v5: > - None > > Changelog since v4: > - Fixed the build for CONFIG_MEMCG=n > > Changelog since v3: > - Rebased over Jan's patches. > - Some cleanup based on Amir's comments. > > Changelog since v2: > - None > > Changelog since v1: > - no more charging fsnotify_mark_connector objects > - Fixed the build for SLOB > > fs/notify/dnotify/dnotify.c | 5 +++-- > fs/notify/fanotify/fanotify.c | 6 ++++-- > fs/notify/fanotify/fanotify_user.c | 5 ++++- > fs/notify/group.c | 6 ++++++ > fs/notify/inotify/inotify_fsnotify.c | 2 +- > fs/notify/inotify/inotify_user.c | 5 ++++- > include/linux/fsnotify_backend.h | 12 ++++++++---- > include/linux/memcontrol.h | 7 +++++++ > mm/memcontrol.c | 15 +++++++++++++-- > 9 files changed, 50 insertions(+), 13 deletions(-) > > diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c > index e2bea2ac5dfb..a6365e6bc047 100644 > --- a/fs/notify/dnotify/dnotify.c > +++ b/fs/notify/dnotify/dnotify.c > @@ -384,8 +384,9 @@ int fcntl_dirnotify(int fd, struct file *filp, unsigned long arg) > > static int __init dnotify_init(void) > { > - dnotify_struct_cache = KMEM_CACHE(dnotify_struct, SLAB_PANIC); > - dnotify_mark_cache = KMEM_CACHE(dnotify_mark, SLAB_PANIC); > + dnotify_struct_cache = KMEM_CACHE(dnotify_struct, > + SLAB_PANIC|SLAB_ACCOUNT); > + dnotify_mark_cache = KMEM_CACHE(dnotify_mark, SLAB_PANIC|SLAB_ACCOUNT); > > dnotify_group = fsnotify_alloc_group(&dnotify_fsnotify_ops); > if (IS_ERR(dnotify_group)) > diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c > index f90842efea13..c8d6e37a4855 100644 > --- a/fs/notify/fanotify/fanotify.c > +++ b/fs/notify/fanotify/fanotify.c > @@ -154,14 +154,16 @@ struct fanotify_event_info *fanotify_alloc_event(struct fsnotify_group *group, > if (fanotify_is_perm_event(mask)) { > struct fanotify_perm_event_info *pevent; > > - pevent = kmem_cache_alloc(fanotify_perm_event_cachep, gfp); > + pevent = kmem_cache_alloc_memcg(fanotify_perm_event_cachep, gfp, > + group->memcg); > if (!pevent) > return NULL; > event = &pevent->fae; > pevent->response = 0; > goto init; > } > - event = kmem_cache_alloc(fanotify_event_cachep, gfp); > + event = kmem_cache_alloc_memcg(fanotify_event_cachep, gfp, > + group->memcg); > if (!event) > return NULL; > init: __maybe_unused > diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c > index ec4d8c59d0e3..0cf45041dc32 100644 > --- a/fs/notify/fanotify/fanotify_user.c > +++ b/fs/notify/fanotify/fanotify_user.c > @@ -16,6 +16,7 @@ > #include > #include > #include > +#include > > #include > > @@ -756,6 +757,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags) > > group->fanotify_data.user = user; > atomic_inc(&user->fanotify_listeners); > + group->memcg = get_mem_cgroup_from_mm(current->mm); > It seem to me that you should export a wrapper to modules with the mem_cgroup_ prefix and not sure that need to pass current->mm to this wrapper. > oevent = fanotify_alloc_event(group, NULL, FS_Q_OVERFLOW, NULL); > if (unlikely(!oevent)) { > @@ -957,7 +959,8 @@ COMPAT_SYSCALL_DEFINE6(fanotify_mark, > */ > static int __init fanotify_user_setup(void) > { > - fanotify_mark_cache = KMEM_CACHE(fsnotify_mark, SLAB_PANIC); > + fanotify_mark_cache = KMEM_CACHE(fsnotify_mark, > + SLAB_PANIC|SLAB_ACCOUNT); > fanotify_event_cachep = KMEM_CACHE(fanotify_event_info, SLAB_PANIC); > if (IS_ENABLED(CONFIG_FANOTIFY_ACCESS_PERMISSIONS)) { > fanotify_perm_event_cachep = > diff --git a/fs/notify/group.c b/fs/notify/group.c > index aa5468f23e45..300fc0f62115 100644 > --- a/fs/notify/group.c > +++ b/fs/notify/group.c > @@ -22,6 +22,7 @@ > #include > #include > #include > +#include > > #include > #include "fsnotify.h" > @@ -36,6 +37,11 @@ static void fsnotify_final_destroy_group(struct fsnotify_group *group) > if (group->ops->free_group_priv) > group->ops->free_group_priv(group); > > +#ifdef CONFIG_MEMCG > + if (group->memcg) > + css_put(&group->memcg->css); > +#endif > + This looks very much like an internal implementation detail that has no business in an external module. I see evidence that you have used mem_cgroup_put() in the patch at some point and I agree that you need to export a pair of matching helpers to external modules. Thanks, Amir.