Date: Thu, 30 Jun 2011 10:39:33 +0100
From: Mel Gorman
To: Andrew Morton
Cc: Pádraig Brady, James Bottomley, Colin King, Minchan Kim, Andrew Lutomirski, Rik van Riel, Johannes Weiner, linux-mm, linux-kernel
Subject: Re: [PATCH 1/4] mm: vmscan: Correct check for kswapd sleeping in sleeping_prematurely
Message-ID: <20110630093933.GY9396@suse.de>
References: <1308926697-22475-1-git-send-email-mgorman@suse.de> <1308926697-22475-2-git-send-email-mgorman@suse.de> <20110628144900.b33412c6.akpm@linux-foundation.org>
In-Reply-To: <20110628144900.b33412c6.akpm@linux-foundation.org>

On Tue, Jun 28, 2011 at 02:49:00PM -0700, Andrew Morton wrote:
> On Fri, 24 Jun 2011 15:44:54 +0100
> Mel Gorman wrote:
>
> > During allocator-intensive workloads, kswapd will be woken frequently
> > causing free memory to oscillate between the high and min watermark.
> > This is expected behaviour.
> >
> > A problem occurs if the highest zone is small. balance_pgdat()
> > only considers unreclaimable zones when priority is DEF_PRIORITY
> > but sleeping_prematurely considers all zones. It's possible for this
> > sequence to occur
> >
> >   1. kswapd wakes up and enters balance_pgdat()
> >   2. At DEF_PRIORITY, marks highest zone unreclaimable
> >   3. At DEF_PRIORITY-1, ignores highest zone setting end_zone
> >   4. At DEF_PRIORITY-1, calls shrink_slab freeing memory from
> >      highest zone, clearing all_unreclaimable. Highest zone
> >      is still unbalanced
> >   5. kswapd returns and calls sleeping_prematurely
> >   6. sleeping_prematurely looks at *all* zones, not just the ones
> >      being considered by balance_pgdat. The highest small zone
> >      has all_unreclaimable cleared but the zone is not
> >      balanced. all_zones_ok is false so kswapd stays awake
> >
> > This patch corrects the behaviour of sleeping_prematurely to check
> > the zones balance_pgdat() checked.
>
> But kswapd is making progress: it's reclaiming slab. Eventually that
> won't work any more and all_unreclaimable will not be cleared and the
> condition will fix itself up?
>

It might, but at that point we've dumped as much slab as we can, which
is very aggressive, and there is no guarantee the condition is fixed
up. For example, if fork is happening often enough due to terminal
usage, there may be just enough allocation requests satisfied from the
highest zone to clear all_unreclaimable during exit.

> btw,
>
> 	if (!sleeping_prematurely(...))
> 		sleep();
>
> hurts my brain. My brain would prefer
>
> 	if (kswapd_should_sleep(...))
> 		sleep();
>
> no?
>

kswapd_try_to_sleep -> should_sleep feels like it would hurt too. I
prefer the sleeping_prematurely name because it indicates what
condition we are checking, but I'm biased and generally suck at naming.

> > Reported-and-tested-by: Pádraig Brady
>
> But what were the before-and-after observations? I don't understand
> how this can cause a permanent cpuchew by kswapd.
>

Pádraig has reported on his before-and-after observations. On its own,
this patch doesn't entirely fix his problem because all the patches
are required, but I felt that a rolled-up patch would be too hard to
review.
-- 
Mel Gorman
SUSE Labs
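
The six-step sequence quoted in the message can be sketched as a toy model. This is an illustration only and not kernel code: struct toy_zone, stays_awake_all_zones() and stays_awake_considered_zones() are hypothetical names, and the assumption that zones marked all_unreclaimable are skipped by the check is inferred from step 6 rather than taken from the patch itself.

```c
#include <stdbool.h>

/* Toy stand-in for a memory zone. NOT the kernel's struct zone. */
struct toy_zone {
	const char *name;
	bool balanced;          /* assumed: free pages are above the high watermark */
	bool all_unreclaimable; /* assumed: reclaim has given up on this zone */
};

/*
 * Hypothetical model of the behaviour described in step 6: every zone
 * in the node is inspected. Returns true if kswapd should stay awake.
 */
bool stays_awake_all_zones(const struct toy_zone *zones, int nr_zones)
{
	for (int i = 0; i < nr_zones; i++) {
		if (zones[i].all_unreclaimable)
			continue; /* assumed: given-up zones do not count */
		if (!zones[i].balanced)
			return true;
	}
	return false;
}

/*
 * Hypothetical model of the corrected behaviour: only zones
 * 0..end_zone, the ones balance_pgdat() considered, are inspected.
 */
bool stays_awake_considered_zones(const struct toy_zone *zones, int end_zone)
{
	for (int i = 0; i <= end_zone; i++) {
		if (zones[i].all_unreclaimable)
			continue;
		if (!zones[i].balanced)
			return true;
	}
	return false;
}
```

Take a two-zone node whose small highest zone is unbalanced with all_unreclaimable cleared (the state after step 4), and end_zone pointing at the lower zone. In this model the all-zones check returns true, so kswapd stays awake, while the check limited to end_zone returns false and lets it sleep.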