From: Shakeel Butt
To: Andrew Morton
Cc: Michal Hocko, Johannes Weiner, Vladimir Davydov, Jan Kara,
    Greg Thelen, Amir Goldstein, Roman Gushchin, Alexander Viro,
    linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Shakeel Butt,
    Jan Kara
Subject: [PATCH 2/2] fs, mm: account buffer_head to kmemcg
Date: Wed, 27 Jun 2018 12:12:50 -0700
Message-Id: <20180627191250.209150-3-shakeelb@google.com>
X-Mailer: git-send-email 2.18.0.rc2.346.g013aa6912e-goog
In-Reply-To: <20180627191250.209150-1-shakeelb@google.com>
References: <20180627191250.209150-1-shakeelb@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The buffer_head can consume a significant amount of system memory and
is directly related to the amount of page cache. In our production
environment we have observed many machines spending a significant
amount of memory on buffer_heads; this memory cannot be left
unaccounted as system memory overhead.

Charging buffer_head is not as simple as adding __GFP_ACCOUNT to the
allocation.
The buffer_heads can be allocated in the context of a memcg different
from the memcg of the page for which they are being allocated. One
concrete example is memory reclaim, which can trigger I/O on pages of
any memcg on the system. So the right way to charge buffer_head is to
extract the memcg from the page for which the buffer_heads are being
allocated and then use the targeted memcg charging API.

Signed-off-by: Shakeel Butt
Cc: Michal Hocko
Cc: Jan Kara
Cc: Amir Goldstein
Cc: Greg Thelen
Cc: Johannes Weiner
Cc: Vladimir Davydov
Cc: Roman Gushchin
Cc: Andrew Morton
Cc: Alexander Viro
---
Changelog since v2:
- get_mem_cgroup_from_page() returns root_mem_cgroup if page->memcg is
  NULL or css_tryget_online() fails.

Changelog since v1:
- simple code cleanups

 fs/buffer.c                | 10 +++++++++-
 include/linux/memcontrol.h |  7 +++++++
 mm/memcontrol.c            | 22 ++++++++++++++++++++++
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 8194e3049fc5..235826333936 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -45,6 +45,7 @@
 #include
 #include
 #include
+#include
 #include
 
 static int fsync_buffers_list(spinlock_t *lock, struct list_head *list);
@@ -815,10 +816,14 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size,
 	struct buffer_head *bh, *head;
 	gfp_t gfp = GFP_NOFS;
 	long offset;
+	struct mem_cgroup *memcg;
 
 	if (retry)
 		gfp |= __GFP_NOFAIL;
 
+	memcg = get_mem_cgroup_from_page(page);
+	memalloc_use_memcg(memcg);
+
 	head = NULL;
 	offset = PAGE_SIZE;
 	while ((offset -= size) >= 0) {
@@ -835,6 +840,9 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size,
 		/* Link the buffer to its page */
 		set_bh_page(bh, page, offset);
 	}
+out:
+	memalloc_unuse_memcg();
+	mem_cgroup_put(memcg);
 	return head;
 /*
  * In case anything failed, we just free everything we got.
@@ -848,7 +856,7 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size,
 		} while (head);
 	}
 
-	return NULL;
+	goto out;
 }
 EXPORT_SYMBOL_GPL(alloc_page_buffers);
 
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index cb04b382c8d2..919b98ddda45 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -380,6 +380,8 @@ struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p);
 
 struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm);
 
+struct mem_cgroup *get_mem_cgroup_from_page(struct page *page);
+
 static inline
 struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *css){
 	return css ? container_of(css, struct mem_cgroup, css) : NULL;
@@ -865,6 +867,11 @@ static inline struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm)
 	return NULL;
 }
 
+static inline struct mem_cgroup *get_mem_cgroup_from_page(struct page *page)
+{
+	return NULL;
+}
+
 static inline void mem_cgroup_put(struct mem_cgroup *memcg)
 {
 }
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b25ca5c13196..21a7c2fb8097 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -713,6 +713,28 @@ struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm)
 }
 EXPORT_SYMBOL(get_mem_cgroup_from_mm);
 
+/**
+ * get_mem_cgroup_from_page: Obtain a reference on given page's memcg.
+ * @page: page from which memcg should be extracted.
+ *
+ * Obtain a reference on page->memcg and returns it if successful. Otherwise
+ * root_mem_cgroup is returned.
+ */
+struct mem_cgroup *get_mem_cgroup_from_page(struct page *page)
+{
+	struct mem_cgroup *memcg = page->mem_cgroup;
+
+	if (mem_cgroup_disabled())
+		return NULL;
+
+	rcu_read_lock();
+	if (!memcg || !css_tryget_online(&memcg->css))
+		memcg = root_mem_cgroup;
+	rcu_read_unlock();
+	return memcg;
+}
+EXPORT_SYMBOL(get_mem_cgroup_from_page);
+
 /**
  * If current->active_memcg is non-NULL, do not fallback to current->mm->memcg.
  */
-- 
2.18.0.rc2.346.g013aa6912e-goog
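
For readers who want to see the charging flow in isolation, here is a
rough userspace model of what alloc_page_buffers() does after this
patch: take the memcg from the page, make it the active charging
target, allocate, then clear the target and drop the reference. Every
type and helper below (struct memcg, struct page, get_memcg_from_page(),
use_memcg(), unuse_memcg(), memcg_put(), charge_alloc(),
alloc_buffers_for()) is a hypothetical stand-in written for
illustration only. None of it is kernel code, and the reference
counting is simplified (no RCU, no css_tryget_online()).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory cgroup; not the kernel's struct. */
struct memcg {
	const char *name;
	size_t charged;	/* bytes charged to this group */
	int refs;	/* simplified reference count */
	int online;	/* 0 once the group is going away */
};

/* Hypothetical stand-in for a page that remembers its owning group. */
struct page {
	struct memcg *memcg;	/* may be NULL */
};

static struct memcg root_memcg = { "root", 0, 1, 1 };

/* Models the per-task override that a targeted charge consults. */
static struct memcg *active_memcg;
/* Models the group of whichever task is running, e.g. reclaim. */
static struct memcg *current_task_memcg = &root_memcg;

/* Return the page's group with a reference held, or root as fallback. */
static struct memcg *get_memcg_from_page(struct page *page)
{
	struct memcg *memcg;

	assert(page != NULL);
	memcg = page->memcg;
	if (!memcg || !memcg->online)
		memcg = &root_memcg;
	memcg->refs++;
	return memcg;
}

static void memcg_put(struct memcg *memcg)
{
	memcg->refs--;
}

static void use_memcg(struct memcg *memcg)
{
	active_memcg = memcg;
}

static void unuse_memcg(void)
{
	active_memcg = NULL;
}

/* Charge the override if one is set, else the running task's group. */
static void charge_alloc(size_t size)
{
	struct memcg *target = active_memcg ? active_memcg : current_task_memcg;

	target->charged += size;
}

/* Allocate 'count' objects of 'size' bytes on behalf of 'page'. */
static void alloc_buffers_for(struct page *page, size_t size, int count)
{
	struct memcg *memcg = get_memcg_from_page(page);
	int i;

	use_memcg(memcg);
	for (i = 0; i < count; i++)
		charge_alloc(size);
	unuse_memcg();
	memcg_put(memcg);
}
```

In this model, calling alloc_buffers_for() on a page owned by some
group charges that group even though current_task_memcg points at
root, which is the reclaim case described in the changelog. A page
with no group, or with an offline group, is charged to root instead.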