Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755044AbZF0M6L (ORCPT ); Sat, 27 Jun 2009 08:58:11 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1752996AbZF0M56 (ORCPT ); Sat, 27 Jun 2009 08:57:58 -0400
Received: from cmpxchg.org ([85.214.51.133]:33185 "EHLO cmpxchg.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751810AbZF0M55 (ORCPT ); Sat, 27 Jun 2009 08:57:57 -0400
Date: Sat, 27 Jun 2009 14:54:12 +0200
From: Johannes Weiner
To: David Howells
Cc: Wu Fengguang, "riel@redhat.com", "minchan.kim@gmail.com",
	Andrew Morton, LKML, Christoph Lameter, KOSAKI Motohiro,
	"peterz@infradead.org", "tytso@mit.edu", "linux-mm@kvack.org",
	"elladan@eskimo.com", "npiggin@suse.de"
Subject: Re: Found the commit that causes the OOMs
Message-ID: <20090627125412.GA1667@cmpxchg.org>
References: <3901.1245848839@redhat.com> <20090624023251.GA16483@localhost>
	<20090620043303.GA19855@localhost> <32411.1245336412@redhat.com>
	<20090517022327.280096109@intel.com> <2015.1245341938@redhat.com>
	<20090618095729.d2f27896.akpm@linux-foundation.org>
	<7561.1245768237@redhat.com> <26537.1246086769@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <26537.1246086769@redhat.com>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 5191
Lines: 128

On Sat, Jun 27, 2009 at 08:12:49AM +0100, David Howells wrote:
>
> I've managed to bisect things to find the commit that causes the OOMs. It's:
>
> commit 69c854817566db82c362797b4a6521d0b00fe1d8
> Author: MinChan Kim
> Date:   Tue Jun 16 15:32:44 2009 -0700
>
>     vmscan: prevent shrinking of active anon lru list in case of no swap space V3
>
>     shrink_zone() can deactivate active anon pages even if we don't have a
>     swap device. Many embedded products don't have a swap device. So the
>     deactivation of anon pages is unnecessary.
>
>     This patch prevents unnecessary deactivation of anon lru pages. But, it
>     don't prevent aging of anon pages to swap out.
>
>     Signed-off-by: Minchan Kim
>     Acked-by: KOSAKI Motohiro
>     Cc: Johannes Weiner
>     Acked-by: Rik van Riel
>     Signed-off-by: Andrew Morton
>     Signed-off-by: Linus Torvalds
>
> This exhibits the problem. The previous commit:
>
> commit 35282a2de4e5e4e173ab61aa9d7015886021a821
> Author: Brice Goglin
> Date:   Tue Jun 16 15:32:43 2009 -0700
>
>     migration: only migrate_prep() once per move_pages()
>
> survives 16 iterations of the LTP syscall testsuite without exhibiting the
> problem.

Here is the patch in question:

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 7592d8e..879d034 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1570,7 +1570,7 @@ static void shrink_zone(int priority, struct zone *zone,
 	 * Even if we did not try to evict anon pages at all, we want to
 	 * rebalance the anon lru active/inactive ratio.
 	 */
-	if (inactive_anon_is_low(zone, sc))
+	if (inactive_anon_is_low(zone, sc) && nr_swap_pages > 0)
 		shrink_active_list(SWAP_CLUSTER_MAX, zone, sc, priority, 0);
 
 	throttle_vm_writeout(sc->gfp_mask);

When this was discussed, I think we missed that nr_swap_pages can
actually get zero on swap systems as well and this should have been
total_swap_pages - otherwise we also stop balancing the two anon lists
when swap is _full_ which was not the intention of this change at all.

[ There is another one hiding in shrink_zone() that does the same - it
was moved from get_scan_ratio() and is pretty old but we still kept
the inactive/active ratio halfway sane without MinChan's patch. ]

This is from your OOM-run dmesg, David:

  Adding 32k swap on swapfile22. Priority:-21 extents:1 across:32k
  Adding 32k swap on swapfile23. Priority:-22 extents:1 across:32k
  Adding 32k swap on swapfile24. Priority:-23 extents:3 across:44k
  Adding 32k swap on swapfile25. Priority:-24 extents:1 across:32k

So we actually have swap? Or are those removed again before the OOM?
If not, I think we let the anon lists rot while swap is full and when
some swap space gets freed up and we should be able to evict anon pages
again, we don't find any candidates.

The following patch should improve on that. If it's not true for your
particular situation, I think we still need it for the scenario
described above.

---
From: Johannes Weiner
Subject: vmscan: keep balancing anon lists on swap-full conditions

Page reclaim doesn't scan and balance the anon LRU lists when
nr_swap_pages is zero to save the scan overhead for swapless systems.

Unfortunately, this variable can reach zero when all present swap
space is occupied as well and we don't want to stop balancing in that
case or we encounter an unreclaimable mess of anon lists when swap
space gets freed up and we are theoretically in the position to page
out again.

Use the total_swap_pages variable to have a better indicator when to
scan the anon LRU lists.

We still might have unbalanced anon lists when swap space is added
during run time but it is a less dynamic change in state and we still
save the scanning overhead for CONFIG_SWAP systems that never actually
set up swap space.

Signed-off-by: Johannes Weiner
---

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 5415526..5ea7fc3 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1524,7 +1524,7 @@ static void shrink_zone(int priority, struct zone *zone,
 	int noswap = 0;
 
 	/* If we have no swap space, do not bother scanning anon pages. */
-	if (!sc->may_swap || (nr_swap_pages <= 0)) {
+	if (!sc->may_swap || (total_swap_pages <= 0)) {
 		noswap = 1;
 		percent[0] = 0;
 		percent[1] = 100;
@@ -1578,7 +1578,7 @@ static void shrink_zone(int priority, struct zone *zone,
 	 * Even if we did not try to evict anon pages at all, we want to
 	 * rebalance the anon lru active/inactive ratio.
 	 */
-	if (inactive_anon_is_low(zone, sc) && nr_swap_pages > 0)
+	if (inactive_anon_is_low(zone, sc) && total_swap_pages > 0)
 		shrink_active_list(SWAP_CLUSTER_MAX, zone, sc, priority, 0);
 
 	throttle_vm_writeout(sc->gfp_mask);