References: <20180619051327.149716-1-shakeelb@google.com> <20180619051327.149716-4-shakeelb@google.com> <20180619162741.GC27423@cmpxchg.org> <20180619174040.GA4304@castle.DHCP.thefacebook.com>
In-Reply-To: <20180619174040.GA4304@castle.DHCP.thefacebook.com>
From: Shakeel Butt
Date: Tue, 19 Jun 2018 12:51:15 -0700
Subject: Re: [PATCH 3/3] fs, mm: account buffer_head to kmemcg
To: Roman Gushchin
Cc: Johannes Weiner, Andrew Morton, Michal Hocko, Vladimir Davydov, Jan Kara, Greg Thelen, LKML, Cgroups, linux-fsdevel, Linux MM, Jan Kara, Alexander Viro
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jun 19, 2018 at 10:41 AM Roman Gushchin wrote:
>
> On Tue, Jun 19, 2018 at 12:27:41PM -0400, Johannes Weiner wrote:
> > On Mon, Jun 18, 2018 at 10:13:27PM -0700, Shakeel Butt wrote:
> > > The buffer_head can consume a significant amount of system memory and
> > > is directly related to the amount of page cache. In our production
> > > environment we have observed that a lot of machines are spending a
> > > significant amount of memory as buffer_head and can not be left as
> > > system memory overhead.
> > >
> > > Charging buffer_head is not as simple as adding __GFP_ACCOUNT to the
> > > allocation. The buffer_heads can be allocated in a memcg different from
> > > the memcg of the page for which buffer_heads are being allocated. One
> > > concrete example is memory reclaim. The reclaim can trigger I/O of pages
> > > of any memcg on the system. So, the right way to charge buffer_head is
> > > to extract the memcg from the page for which buffer_heads are being
> > > allocated and then use targeted memcg charging API.
> > >
> > > Signed-off-by: Shakeel Butt
> > > Cc: Jan Kara
> > > Cc: Greg Thelen
> > > Cc: Michal Hocko
> > > Cc: Johannes Weiner
> > > Cc: Vladimir Davydov
> > > Cc: Alexander Viro
> > > Cc: Andrew Morton
> > > ---
> > >  fs/buffer.c                | 14 +++++++++++++-
> > >  include/linux/memcontrol.h |  7 +++++++
> > >  mm/memcontrol.c            | 21 +++++++++++++++++++++
> > >  3 files changed, 41 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/buffer.c b/fs/buffer.c
> > > index 8194e3049fc5..26389b7a3cab 100644
> > > --- a/fs/buffer.c
> > > +++ b/fs/buffer.c
> > > @@ -815,10 +815,17 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size,
> > >  	struct buffer_head *bh, *head;
> > >  	gfp_t gfp = GFP_NOFS;
> > >  	long offset;
> > > +	struct mem_cgroup *old_memcg;
> > > +	struct mem_cgroup *memcg = get_mem_cgroup_from_page(page);
> > >
> > >  	if (retry)
> > >  		gfp |= __GFP_NOFAIL;
> > >
> > > +	if (memcg) {
> > > +		gfp |= __GFP_ACCOUNT;
> > > +		old_memcg = memalloc_memcg_save(memcg);
> > > +	}
> >
> > Please move the get_mem_cgroup_from_page() call out of the
> > declarations and down to right before the if (memcg) branch.
> >
> > >  	head = NULL;
> > >  	offset = PAGE_SIZE;
> > >  	while ((offset -= size) >= 0) {
> > > @@ -835,6 +842,11 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size,
> > >  		/* Link the buffer to its page */
> > >  		set_bh_page(bh, page, offset);
> > >  	}
> > > +out:
> > > +	if (memcg) {
> > > +		memalloc_memcg_restore(old_memcg);
> > > +#ifdef CONFIG_MEMCG
> > > +		css_put(&memcg->css);
> > > +#endif
> >
> > Please add a put_mem_cgroup() ;)
>
> I've added such helper by commit 8a34a8b7fd62 ("mm, oom: cgroup-aware OOM killer").
> It's in the mm tree.
>

I was using mem_cgroup_put() defined by Roman's patch, but there were a
lot of build failure reports from people who were taking this series
without Roman's series or applying the series out of order. Andrew asked
me to keep it like this; he will convert these call sites to
mem_cgroup_put() after making sure Roman's series is applied in the mm
tree. I will recheck with him how he wants to handle it now.

thanks,
Shakeel
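
For readers following the thread, the save/charge/restore pattern used by
the quoted hunk can be modelled in plain userspace C. This is only a
minimal sketch under stated assumptions, not the kernel implementation:
struct group, active_group, group_save(), group_restore(), charge() and
alloc_for_page() are hypothetical stand-ins invented for illustration, and
the reference counting done by css_put() in the patch is not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup: a name and a byte counter. */
struct group {
	const char *name;
	size_t charged;
};

/* Hypothetical stand-in for the per-task charge-target override. */
static struct group *active_group;

/* Install g as the charge target and return the previous override. */
static struct group *group_save(struct group *g)
{
	struct group *old = active_group;

	active_group = g;
	return old;
}

/* Put back whatever override was in place before group_save(). */
static void group_restore(struct group *old)
{
	active_group = old;
}

/* Charge bytes to the override if one is set, else to the caller's group. */
static void charge(struct group *caller, size_t bytes)
{
	struct group *target = active_group ? active_group : caller;

	target->charged += bytes;
}

/*
 * Model a task running in "reclaimer" that allocates metadata for a page
 * owned by "owner": the charge is redirected to the owner for the duration
 * of the allocation, then the previous override is restored.
 */
static void alloc_for_page(struct group *reclaimer, struct group *owner,
			   size_t bytes)
{
	struct group *old;

	assert(reclaimer != NULL && owner != NULL);
	old = group_save(owner);
	charge(reclaimer, bytes);
	group_restore(old);
}
```

Calling alloc_for_page(&reclaimer, &owner, n) in this model adds n to
owner.charged and leaves reclaimer.charged untouched, which is the
behaviour the commit message asks for when reclaim triggers I/O on pages
of another memcg.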