Date: Wed, 9 Dec 2015 10:07:35 +0100
From: Michal Hocko
To: Andrew Morton
Cc: Johannes Weiner, Minchan Kim, "Kirill A. Shutemov", Vladimir Davydov,
	linux-mm@kvack.org, LKML
Subject: Re: [PATCH mmotm] memcg: Ignore partial THP when moving task
Message-ID: <20151209090735.GA30907@dhcp22.suse.cz>
In-Reply-To: <1449594789-15866-1-git-send-email-mhocko@kernel.org>

Dohh, forgot to git add after s@PageCoumpound@PageTransCompound@
Updated patch is below:
---
From efff9d4696cbce6710827a8422a5b285bf9b8052 Mon Sep 17 00:00:00 2001
From: Michal Hocko
Date: Fri, 4 Dec 2015 08:30:22 +0100
Subject: [PATCH] memcg: Ignore partial THP when moving task

After "mm: rework mapcount accounting to enable 4k mapping of THPs" it
is possible to have a partial THP accessible via ptes. Memcg task
migration code is not prepared for this situation and uncharges the tail
page from the original memcg while the original THP is still charged via
the head page, which is not mapped to the moved task. The page counter of
the origin memcg will underflow when the whole THP is uncharged later on
and lead to:
WARNING: CPU: 0 PID: 1340 at mm/page_counter.c:26 page_counter_cancel+0x34/0x40()
reported by Minchan Kim.

This patch prevents the underflow by skipping any partial THP pages in
mem_cgroup_move_charge_pte_range.
PageTransCompound is checked when we do the pte walk. This means that a
process might leave a partial THP behind in the original memcg if there
is no other process mapping it via pmd, but this is considered
acceptable because it shouldn't happen often. It is not considered a
memory leak either, because the original THP is still accessible and
reclaimable. Moreover, the task migration has always been racy and never
guaranteed to move all pages.

Reported-by: Minchan Kim
Acked-by: Johannes Weiner
Signed-off-by: Michal Hocko
---
 mm/memcontrol.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 79a29d564bff..4cecefa4a3b0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4895,6 +4895,14 @@ static int mem_cgroup_move_charge_pte_range(pmd_t *pmd,
 		switch (get_mctgt_type(vma, addr, ptent, &target)) {
 		case MC_TARGET_PAGE:
 			page = target.page;
+			/*
+			 * We can have a part of the split pmd here. Moving it
+			 * can be done but it would be too convoluted so simply
+			 * ignore such a partial THP and keep it in original
+			 * memcg. There should be somebody mapping the head.
+			 */
+			if (PageTransCompound(page))
+				goto put;
 			if (isolate_lru_page(page))
 				goto put;
 			if (!mem_cgroup_move_account(page, false,
-- 
2.6.2

-- 
Michal Hocko
SUSE Labs