Date: Wed, 21 Nov 2012 15:08:50 +0000
From: Mel Gorman
To: Josh Boyer
Cc: Zdenek Kabelac, Seth Jennings, Jiri Slaby, Valdis.Kletnieks@vt.edu,
    linux-mm@kvack.org, LKML, Andrew Morton, Rik van Riel,
    Robert Jennings, Thorsten Leemhuis, bruno@wolff.to
Subject: Re: [PATCH] Revert "mm: remove __GFP_NO_KSWAPD"
Message-ID: <20121121150850.GF8218@suse.de>

On Tue, Nov 20, 2012 at 10:38:45AM -0500, Josh Boyer wrote:
> On Fri, Nov 16, 2012 at 3:06 PM, Mel Gorman wrote:
> > On Fri, Nov 16, 2012 at 02:14:47PM -0500, Josh Boyer wrote:
> >> On Mon, Nov 12, 2012 at 6:37 AM, Mel Gorman wrote:
> >> > With "mm: vmscan: scale number of pages reclaimed by reclaim/compaction
> >> > based on failures" reverted, Zdenek Kabelac reported the following
> >> >
> >> >   Hmm, so it's just took longer to hit the problem and observe
> >> >   kswapd0 spinning on my CPU again - it's not as endless like before -
> >> >   but still it easily eats minutes - it helps to turn off Firefox
> >> >   or TB (memory hungry apps) so kswapd0 stops soon - and restart
> >> >   those apps again.
> >> >   (And I still have like >1GB of cached memory)
> >> >
> >> >   kswapd0         R  running task        0    30      2 0x00000000
> >> >    ffff8801331efae8 0000000000000082 0000000000000018 0000000000000246
> >> >    ffff880135b9a340 ffff8801331effd8 ffff8801331effd8 ffff8801331effd8
> >> >    ffff880055dfa340 ffff880135b9a340 00000000331efad8 ffff8801331ee000
> >> >   Call Trace:
> >> >    [] preempt_schedule+0x42/0x60
> >> >    [] _raw_spin_unlock+0x55/0x60
> >> >    [] put_super+0x31/0x40
> >> >    [] drop_super+0x22/0x30
> >> >    [] prune_super+0x149/0x1b0
> >> >    [] shrink_slab+0xba/0x510
> >> >
> >> > The sysrq+m indicates the system has no swap so it'll never reclaim
> >> > anonymous pages as part of reclaim/compaction. That is one part of the
> >> > problem but not the root cause as file-backed pages could also be reclaimed.
> >> >
> >> > The likely underlying problem is that kswapd is woken up or kept awake
> >> > for each THP allocation request in the page allocator slow path.
> >> >
> >> > If compaction fails for the requesting process then compaction will be
> >> > deferred for a time and direct reclaim is avoided. However, if there
> >> > is a storm of THP requests that are simply rejected, it will still
> >> > be the case that kswapd is awake for a prolonged period of time
> >> > as pgdat->kswapd_max_order is updated each time. This is noticed by
> >> > the main kswapd() loop and it will not call kswapd_try_to_sleep().
> >> > Instead it will loop, shrinking a small number of pages and calling
> >> > shrink_slab() on each iteration.
> >> >
> >> > The temptation is to supply a patch that checks if kswapd was woken for
> >> > THP and if so ignore pgdat->kswapd_max_order but it'll be a hack and not
> >> > backed up by proper testing. As 3.7 is very close to release and this is
> >> > not a bug we should release with, a safer path is to revert "mm: remove
> >> > __GFP_NO_KSWAPD" for now and revisit it with the view to ironing out the
> >> > balance_pgdat() logic in general.
> >> >
> >> > Signed-off-by: Mel Gorman
> >>
> >> Does anyone know if this is queued to go into 3.7 somewhere?  I looked
> >> a bit and can't find it in a tree.  We have a few reports of Fedora
> >> rawhide users hitting this.
> >>
> >
> > No, because I was waiting to hear if a) it worked and preferably if the
> > alternative "less safe" option worked. This close to release it might be
> > better to just go with the safe option.
>
> We've been tracking it in https://bugzilla.redhat.com/show_bug.cgi?id=866988
> and people say this revert patch doesn't seem to make the issue go away
> fully.  Thorsten has created another kernel with the other patch applied
> for testing.
>

There is also a potential accounting bug that could be affecting this.
https://lkml.org/lkml/2012/11/20/613 . NR_FREE_PAGES affects watermark
calculations. If it drifts too far, processes would keep entering
direct reclaim and waking kswapd even when there is no need to.

--
Mel Gorman
SUSE Labs
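
To make the NR_FREE_PAGES point concrete, here is a toy userspace model. It
is only a sketch and not kernel code: struct toy_zone, toy_account_drift()
and toy_needs_reclaim() are hypothetical names invented for illustration,
and the single "low watermark" comparison is a simplification of what the
real watermark checks do. It only shows why a counter that has drifted
below the true number of free pages makes the check fire when no reclaim
is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only. Every name here is hypothetical, not a kernel symbol. */
struct toy_zone {
	long real_free;    /* pages that are actually free */
	long counted_free; /* what the (possibly drifted) counter reports */
	long low_wmark;    /* below this, reclaim is considered necessary */
};

/* Simulate an accounting bug: the counter loses pages, reality does not. */
static void toy_account_drift(struct toy_zone *z, long lost_pages)
{
	assert(z != NULL);
	z->counted_free -= lost_pages;
}

/*
 * The decision is made from the counter, not from the real state, so a
 * drifted counter can demand reclaim even when plenty of pages are free.
 */
static bool toy_needs_reclaim(const struct toy_zone *z)
{
	assert(z != NULL);
	return z->counted_free < z->low_wmark;
}
```

With real_free = counted_free = 5000 and low_wmark = 1000, the check says
no reclaim is needed. After toy_account_drift(&z, 4500) the counter reads
500 while 5000 pages are still really free, and the check starts asking
for reclaim on every call.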