Received: by 2002:ac0:a5b6:0:0:0:0:0 with SMTP id m51-v6csp5237754imm; Tue, 19 Jun 2018 07:17:55 -0700 (PDT) X-Google-Smtp-Source: ADUXVKIQiP166oQswvHDNSyapPG7y7wyvCS6WkCc/0chs1e87OK1Y3zb1Y8LKhaGqWYH/EOcOmFD X-Received: by 2002:a17:902:7c16:: with SMTP id x22-v6mr18739183pll.77.1529417875373; Tue, 19 Jun 2018 07:17:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529417875; cv=none; d=google.com; s=arc-20160816; b=c7USo4m4KJ2J7HFyU+MP7BdtI5t4WjZP3rMoRz7A1cL/tqne2jwvnMEj21yh+5UUhL aAWXUxrQNHdu4c34teOp3lWnqAhOAskxlo8D4fiFjaUyf/OdTBoGB4ge/jsCX5NWejLZ Z7DqI5wiqNomtebtPs4A+4z9ePq8xElfnkTBd4N8DJ6klFJVtep8QJQWTKJMSLfjNMvL EINHEeSUCrZ9+pwcbTNuSjDfE+dQz7lnPf5zQUyRtbFN+6JI8rWrEhKndCT7zQuZjRJ2 SNSh8/8X8AtNPxr8s3lNiA8ydqLosPNjQaI/XJcx4S8+3FvZfLZD+psmOq5CoAzJ30gI +OFA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature :arc-authentication-results; bh=Pbp3RnfkzKVCDuM/1bn+SJUC8WkWYTm8XZSitE1r5s8=; b=xLUEDVTsl0CIdT8U6lWfJqYkIXRZF0ts1I+yEpoa2psu2OU/P+WxkPHLPXlY2eMqRR tEET/WRmzdhKgerPgRW3Mf80O9VKdbfre7S/CR2KEUv8Q5w2cQC6vcoMnbCVyT/0X3cT y1CBDYDGUJ5KgLBwZ2U7n34yaq4jZOvXdO8SOIUwtyo7zJDwjYTiMUj/dG4mqPv0BV+n /lPbnJYI7pXf0crKE9dPzj+af1wT5yAN6kuwrwZaI9+A8gXXSPHk9j92Nf86mXV7Atfz g9gqNG9JGg6IFwlQ1CUnc4Jg0E99BeMlyyMSKqJSWegIV6zeDnrYUNodlRlZnJgatmlk EYFg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=HfCNx5ML; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id k91-v6si17802310pld.248.2018.06.19.07.17.31; Tue, 19 Jun 2018 07:17:55 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=HfCNx5ML; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S937576AbeFSOPj (ORCPT + 99 others); Tue, 19 Jun 2018 10:15:39 -0400 Received: from mail-wm0-f68.google.com ([74.125.82.68]:53989 "EHLO mail-wm0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S937572AbeFSOPg (ORCPT ); Tue, 19 Jun 2018 10:15:36 -0400 Received: by mail-wm0-f68.google.com with SMTP id x6-v6so660594wmc.3 for ; Tue, 19 Jun 2018 07:15:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=Pbp3RnfkzKVCDuM/1bn+SJUC8WkWYTm8XZSitE1r5s8=; b=HfCNx5MLjRITR9Je5oukk0srQZHLvNSY7truzaMsaWBkOSO2/TxKWgRNgScj69QCou TlE0vTpjZw0sAVu5t5a6ZOn3jmavlk/zNKDVb8wEh1raraZimefTrriSZvGbzDF7dZxD vcTKxeLgTpXAoYtT8Y0I3trsTbS3mtIjLkCaDfy7kjQytwO8MH0fCriqftBnyRikrKJM IWB61bXYuj6Z0aF8Rnuv6j0H+0jH563gdNx2C7rMOS57pZLSJzKZR/C47tC9yRy8UrCN lSWqMt+uby391Et8pBexxFKxs1q2iYhqXfPEfa4aT4caZB4tfJaQe8u7aVxNbAGzv1b4 h9EA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=Pbp3RnfkzKVCDuM/1bn+SJUC8WkWYTm8XZSitE1r5s8=; b=epF31zwhzNHmQniBUKEYYTtgbeA0it696aalQ605j+dx10WZAcQP0B+5xKwR8Dnjv6 AIDp65A6xgdD32zAatgDNtxzWbOKYW/zuZmVcnhC29plUzCb1P/pvq9qqpbyTMrxLVIA mkvbp+fx054ZFR0XyFugPWp1fqMRGVOkI0AHANQQyRgQJz0sxfb9z0pmBbcu9iQnhzID 9dA3GkzKUWmi0jsRjL/A71Qai+GLI/JKseyOQQ/glnX7s1lQr43RpR+MolVf5Fze/SPD OkiLRTZZ++Xrt2eZaFzc3PPPSlCzPjcKf3xioMPOGzTkmt/QDnEeuvvx2Lla5JO8SN1t brKw== X-Gm-Message-State: APt69E3s8zeauovQ26veI8yXe081+8qn+Iw41fSSjY9m/RmFeq3584RH 4cqzFORyHZaIlOPCBzrZQUwzgRsfWXZ4QgIYysTekQ== X-Received: by 2002:a1c:443:: with SMTP id 64-v6mr12888077wme.140.1529417735017; Tue, 19 Jun 2018 07:15:35 -0700 (PDT) MIME-Version: 1.0 References: <20180619051327.149716-1-shakeelb@google.com> <20180619051327.149716-3-shakeelb@google.com> In-Reply-To: From: Shakeel Butt Date: Tue, 19 Jun 2018 07:15:20 -0700 Message-ID: Subject: Re: [PATCH 2/3] fs: fsnotify: account fsnotify metadata to kmemcg To: Amir Goldstein Cc: Andrew Morton , Michal Hocko , Johannes Weiner , Vladimir Davydov , Jan Kara , Greg Thelen , LKML , Cgroups , linux-fsdevel , Linux MM , Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim , Mel Gorman , Vlastimil Babka , Alexander Viro Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jun 19, 2018 at 12:20 AM Amir Goldstein wrote: > > On Tue, Jun 19, 2018 at 8:13 AM, Shakeel Butt wrote: > > A lot of memory can be consumed by the events generated for the huge or > > unlimited queues if there is either no or slow listener. This can cause > > system level memory pressure or OOMs. So, it's better to account the > > fsnotify kmem caches to the memcg of the listener. > > > > There are seven fsnotify kmem caches and among them allocations from > > dnotify_struct_cache, dnotify_mark_cache, fanotify_mark_cache and > > inotify_inode_mark_cachep happens in the context of syscall from the > > listener. So, SLAB_ACCOUNT is enough for these caches. > > > > The objects from fsnotify_mark_connector_cachep are not accounted as they > > are small compared to the notification mark or events and it is unclear > > whom to account connector to since it is shared by all events attached to > > the inode. > > > > The allocations from the event caches happen in the context of the event > > producer. For such caches we will need to remote charge the allocations > > to the listener's memcg. Thus we save the memcg reference in the > > fsnotify_group structure of the listener. > > > > This patch has also moved the members of fsnotify_group to keep the size > > same, at least for 64 bit build, even with additional member by filling > > the holes. > > > > Signed-off-by: Shakeel Butt > > Acked-by: Jan Kara > > Cc: Michal Hocko > > Cc: Amir Goldstein > > Cc: Christoph Lameter > > Cc: Pekka Enberg > > Cc: David Rientjes > > Cc: Joonsoo Kim > > Cc: Greg Thelen > > Cc: Johannes Weiner > > Cc: Vladimir Davydov > > Cc: Mel Gorman > > Cc: Vlastimil Babka > > Cc: Alexander Viro > > Cc: Andrew Morton > > --- > > Changelog since v5: > > - None > > > > Changelog since v4: > > - Fixed the build for CONFIG_MEMCG=n > > > > Changelog since v3: > > - Rebased over Jan's patches. > > - Some cleanup based on Amir's comments. > > > > Changelog since v2: > > - None > > > > Changelog since v1: > > - no more charging fsnotify_mark_connector objects > > - Fixed the build for SLOB > > > > fs/notify/dnotify/dnotify.c | 5 +++-- > > fs/notify/fanotify/fanotify.c | 6 ++++-- > > fs/notify/fanotify/fanotify_user.c | 5 ++++- > > fs/notify/group.c | 6 ++++++ > > fs/notify/inotify/inotify_fsnotify.c | 2 +- > > fs/notify/inotify/inotify_user.c | 5 ++++- > > include/linux/fsnotify_backend.h | 12 ++++++++---- > > include/linux/memcontrol.h | 7 +++++++ > > mm/memcontrol.c | 15 +++++++++++++-- > > 9 files changed, 50 insertions(+), 13 deletions(-) > > > > diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c > > index e2bea2ac5dfb..a6365e6bc047 100644 > > --- a/fs/notify/dnotify/dnotify.c > > +++ b/fs/notify/dnotify/dnotify.c > > @@ -384,8 +384,9 @@ int fcntl_dirnotify(int fd, struct file *filp, unsigned long arg) > > > > static int __init dnotify_init(void) > > { > > - dnotify_struct_cache = KMEM_CACHE(dnotify_struct, SLAB_PANIC); > > - dnotify_mark_cache = KMEM_CACHE(dnotify_mark, SLAB_PANIC); > > + dnotify_struct_cache = KMEM_CACHE(dnotify_struct, > > + SLAB_PANIC|SLAB_ACCOUNT); > > + dnotify_mark_cache = KMEM_CACHE(dnotify_mark, SLAB_PANIC|SLAB_ACCOUNT); > > > > dnotify_group = fsnotify_alloc_group(&dnotify_fsnotify_ops); > > if (IS_ERR(dnotify_group)) > > diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c > > index f90842efea13..c8d6e37a4855 100644 > > --- a/fs/notify/fanotify/fanotify.c > > +++ b/fs/notify/fanotify/fanotify.c > > @@ -154,14 +154,16 @@ struct fanotify_event_info *fanotify_alloc_event(struct fsnotify_group *group, > > if (fanotify_is_perm_event(mask)) { > > struct fanotify_perm_event_info *pevent; > > > > - pevent = kmem_cache_alloc(fanotify_perm_event_cachep, gfp); > > + pevent = kmem_cache_alloc_memcg(fanotify_perm_event_cachep, gfp, > > + group->memcg); > > if (!pevent) > > return NULL; > > event = &pevent->fae; > > pevent->response = 0; > > goto init; > > } > > - event = kmem_cache_alloc(fanotify_event_cachep, gfp); > > + event = kmem_cache_alloc_memcg(fanotify_event_cachep, gfp, > > + group->memcg); > > if (!event) > > return NULL; > > init: __maybe_unused > > diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c > > index ec4d8c59d0e3..0cf45041dc32 100644 > > --- a/fs/notify/fanotify/fanotify_user.c > > +++ b/fs/notify/fanotify/fanotify_user.c > > @@ -16,6 +16,7 @@ > > #include > > #include > > #include > > +#include > > > > #include > > > > @@ -756,6 +757,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags) > > > > group->fanotify_data.user = user; > > atomic_inc(&user->fanotify_listeners); > > + group->memcg = get_mem_cgroup_from_mm(current->mm); > > > > It seem to me that you should export a wrapper to modules with > the mem_cgroup_ prefix and not sure that need to pass current->mm > to this wrapper. > Thanks, I will test with fsnotify as module. > > oevent = fanotify_alloc_event(group, NULL, FS_Q_OVERFLOW, NULL); > > if (unlikely(!oevent)) { > > @@ -957,7 +959,8 @@ COMPAT_SYSCALL_DEFINE6(fanotify_mark, > > */ > > static int __init fanotify_user_setup(void) > > { > > - fanotify_mark_cache = KMEM_CACHE(fsnotify_mark, SLAB_PANIC); > > + fanotify_mark_cache = KMEM_CACHE(fsnotify_mark, > > + SLAB_PANIC|SLAB_ACCOUNT); > > fanotify_event_cachep = KMEM_CACHE(fanotify_event_info, SLAB_PANIC); > > if (IS_ENABLED(CONFIG_FANOTIFY_ACCESS_PERMISSIONS)) { > > fanotify_perm_event_cachep = > > diff --git a/fs/notify/group.c b/fs/notify/group.c > > index aa5468f23e45..300fc0f62115 100644 > > --- a/fs/notify/group.c > > +++ b/fs/notify/group.c > > @@ -22,6 +22,7 @@ > > #include > > #include > > #include > > +#include > > > > #include > > #include "fsnotify.h" > > @@ -36,6 +37,11 @@ static void fsnotify_final_destroy_group(struct fsnotify_group *group) > > if (group->ops->free_group_priv) > > group->ops->free_group_priv(group); > > > > +#ifdef CONFIG_MEMCG > > + if (group->memcg) > > + css_put(&group->memcg->css); > > +#endif > > + > > This looks very much like an internal implementation detail that has no > business in an external module. I see evidence that you have used > mem_cgroup_put() in the patch at some point and I agree that you > need to export a pair of matching helpers to external modules. > Do not worry. Andrew will change this to mem_cgroup_put(). Basically mem_cgroup_put() was defined by still-in-review patch series by Roman. There were build failure reports where someone was taking this series either without Roman's series or applying series out of order. thanks, Shakeel