From: Frans Pop
To: Mel Gorman
Subject: Re: [PATCH 0/5] Candidate fix for increased number of GFP_ATOMIC failures V2
Date: Mon, 26 Oct 2009 23:17:50 +0100
Cc: Jiri Kosina, Sven Geggus, Karol Lewandowski, Tobias Oetiker,
    "Rafael J. Wysocki", David Miller, Reinette Chatre, Kalle Valo,
    David Rientjes, KOSAKI Motohiro, Mohamed Abbas, Jens Axboe,
    "John W. Linville", Pekka Enberg, Bartlomiej Zolnierkiewicz,
    Greg Kroah-Hartman, Stephan von Krawczynski, Kernel Testers List,
    netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
    "linux-mm@kvack.org"
References: <1256221356-26049-1-git-send-email-mel@csn.ul.ie>
In-Reply-To: <1256221356-26049-1-git-send-email-mel@csn.ul.ie>
Message-Id: <200910262317.55960.elendil@planet.nl>

On Thursday 22 October 2009, Mel Gorman wrote:
> Test 1: Verify your problem occurs on 2.6.32-rc5 if you can

I've tested against 2.6.31.1 as it's easier for me to compare behaviors
with that than with .32. All patches applied without problems against .31.

I've also tested 2.6.31.1 with SLAB instead of SLUB, but that does not seem
to make a significant difference for my test.

> Test 2: Apply the following two patches and test again
>   1/5 page allocator: Always wake kswapd when restarting an allocation
>       attempt after direct reclaim failed
>   2/5 page allocator: Do not allow interrupts to use ALLOC_HARDER

Does not look to make any difference. Possibly causes more variation in the
duration of the test (increases timing effects)?

> Test 3: If you are getting allocation failures, try with the following
>         patch
>   3/5 vmscan: Force kswapd to take notice faster when high-order
>       watermarks are being hit

Applied on top of patches 1-2. Does not look to make any difference.

> Test 4: If you are still getting failures, apply the following
>   4/5 page allocator: Pre-emptively wake kswapd when high-order
>       watermarks are hit

Applied on top of patches 1-3. Does not look to make any difference.

> Test 5: If things are still screwed, apply the following
>   5/5 Revert 373c0a7e, 8aa7e847: Fix congestion_wait() sync/async vs
>       read/write confusion

Applied on top of patches 1-4. Despite Jens' scepticism, this is still the
patch that makes the most significant difference in my test.
The reading of commits in gitk is much more fluent and music skips are a
lot less severe. But most important is that there is no long total freeze
of the system halfway during the reading of commits, and gitk loads fastest.
It also gives by far the most consistent results.
The likelihood of SKB allocation errors during the test is a lot smaller.

See also http://lkml.org/lkml/2009/10/26/455.

Detailed test results follow. I've done 2 test runs with each kernel (3 for
the last).

The columns below give the following info:
- time at which all commits have been read by gitk
- time at which gitk fills in "branch", "follows" and "precedes" data for
  the current commit
- time at which there's no longer any disk activity, i.e.
  when gitk is fully loaded and all swapping is done
- total number of SKB allocation errors during the test

A "freeze" during the reading of commits is indicated by an "f" (short
freeze) or "F" (long "hard" freeze). An "S" shows when there were SKB
allocation errors.

            end commits   show branch   done    SKB errs
1) vanilla .31.1
   run 1:   1:20 fFS      2:10 S        2:30    44   a)
   run 2:   1:35 FS       1:45          2:10    13
2) .31.1 + patches 1-2
   run1:    2:30 fFS      2:45          3:00    58
   run2:    1:15 fS       2:00          2:20    2    a)
3) .31.1 + patches 1-3
   run1:    1:00 fS       1:15          1:45    1    *)
   run2:    3:00 fFS      3:15          3:30    33
   *) unexpected; fortunate timing?
4) .31.1 + patches 1-4
   run1:    1:10 ffS      1:55 S        2:20    35   a)
   run2:    3:05 fFS      3:15          3:25    36
5) .31.1 + patches 1-5
   run1:    1:00          1:15          1:35    0
   run2:    0:50          1:15 S        1:45    45   *)
   run3:    1:00          1:15          1:45    0
   *) unexpected; unfortunate timing?

a) fast in 1st phase; slow in 2nd and 3rd

Note that without the congestion_wait() reverts, the occurrence of SKB
errors, the long freezes and the time it takes for gitk to load seem roughly
related; with the reverts, total time is not affected even with many SKB
errors.

Cheers,
FJP