Date: Tue, 31 May 2011 22:33:40 +0900
From: Minchan Kim
To: Andrea Arcangeli
Cc: Mel Gorman, akpm@linux-foundation.org, Ury Stankevich,
	KOSAKI Motohiro, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	stable@kernel.org
Subject: Re: [PATCH] mm: compaction: Abort compaction if too many pages are
	isolated and caller is asynchronous
Message-ID: <20110531133340.GB3490@barrios-laptop>
References: <20110530131300.GQ5044@csn.ul.ie>
	<20110530143109.GH19505@random.random>
	<20110530153748.GS5044@csn.ul.ie>
	<20110530165546.GC5118@suse.de>
	<20110530175334.GI19505@random.random>
	<20110531121620.GA3490@barrios-laptop>
	<20110531122437.GJ19505@random.random>
In-Reply-To: <20110531122437.GJ19505@random.random>

On Tue, May 31, 2011 at 02:24:37PM +0200, Andrea Arcangeli wrote:
> On Tue, May 31, 2011 at 09:16:20PM +0900, Minchan Kim wrote:
> > I am not sure this is related to the problem you have seen.
> > If he used hwpoison by madivse, it is possible.
>
> CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
> # CONFIG_MEMORY_FAILURE is not set
>
> > Anyway, we can see negative value by count mismatch in UP build.
> > Let's fix it.
>
> Definitely let's fix it, but it's probably not related to this one.
>
> >
> > From 1d3ebce2e8aa79dcc912da16b7a8d0611b6f9f1a Mon Sep 17 00:00:00 2001
> > From: Minchan Kim
> > Date: Tue, 31 May 2011 21:11:58 +0900
> > Subject: [PATCH] Fix page isolated count mismatch
> >
> > If migration is failed, normally we call putback_lru_pages which
> > decreases NR_ISOLATE_[ANON|FILE].
> > It means we should increase NR_ISOLATE_[ANON|FILE] before calling
> > putback_lru_pages. But soft_offline_page dosn't it.
> >
> > It can make NR_ISOLATE_[ANON|FILE] with negative value and in UP build,
> > zone_page_state will say huge isolated pages so too_many_isolated
> > functions be deceived completely. At last, some process stuck in D state
> > as it expect while loop ending with congestion_wait.
> > But it's never ending story.
> >
> > If it is right, it would be -stable stuff.
> >
> > Cc: Mel Gorman
> > Cc: Andrea Arcangeli
> > Signed-off-by: Minchan Kim
> > ---
> >  mm/memory-failure.c |    4 +++-
> >  1 files changed, 3 insertions(+), 1 deletions(-)
> >
> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > index 5c8f7e0..eac0ba5 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -52,6 +52,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >  #include "internal.h"
> >
> >  int sysctl_memory_failure_early_kill __read_mostly = 0;
> > @@ -1468,7 +1469,8 @@ int soft_offline_page(struct page *page, int flags)
> >  		put_page(page);
> >  	if (!ret) {
> >  		LIST_HEAD(pagelist);
> > -
> > +		inc_zone_page_state(page, NR_ISOLATED_ANON +
> > +					    page_is_file_cache(page));
> >  		list_add(&page->lru, &pagelist);
> >  		ret = migrate_pages(&pagelist, new_page, MPOL_MF_MOVE_ALL,
> >  								0, true);
>
> Reviewed-by: Andrea Arcangeli

Thanks, Andrea.

>
> Let's check all other migrate_pages callers too...

I checked them before sending the patch, but I failed to find anything
strange. :(

Now I am checking whether the page's SwapBacked flag can change between
the point before migrate_pages and the point after it, which would make
the NR_ISOLATED_XX accounting go wrong.

I am getting closer to the failure, too. Hmm.

--
Kind regards
Minchan Kim