In-Reply-To: <20110209154606.GJ27110@cmpxchg.org>
References: <20110209154606.GJ27110@cmpxchg.org>
Date: Thu, 10 Feb 2011 13:04:51 +0900
Subject: Re: [patch] vmscan: fix zone shrinking exit when scan work is done
From: Minchan Kim
To: Johannes Weiner
Cc: Andrew Morton, Andrea Arcangeli, Mel Gorman, Rik van Riel,
    Michal Hocko, Kent Overstreet, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org

On Thu, Feb 10, 2011 at 12:46 AM, Johannes Weiner wrote:
> Hi,
>
> I think this should fix the problem of processes getting stuck in
> reclaim that has been reported several times.  Kent actually
> single-stepped through this code and noted that it was never exiting
> shrink_zone(), which really narrowed it down a lot, considering the
> tons of nested loops from the allocator down to the list shrinking.
>
>        Hannes
>
> ---
> From: Johannes Weiner
> Subject: vmscan: fix zone shrinking exit when scan work is done
>
> '3e7d344 mm: vmscan: reclaim order-0 and use compaction instead of
> lumpy reclaim' introduced an indefinite loop in shrink_zone().
>
> It meant to break out of this loop when no pages had been reclaimed
> and not a single page was even scanned.  The way it would detect the
> latter is by taking a snapshot of sc->nr_scanned at the beginning of
> the function and comparing it against the new sc->nr_scanned after the
> scan loop.  But it would re-iterate without updating that snapshot,
> looping forever if sc->nr_scanned changed at least once since
> shrink_zone() was invoked.
>
> This is not the sole condition that would exit that loop, but it
> requires other processes to change the zone state, as the reclaimer
> that is stuck obviously can not anymore.
>
> This is only happening for higher-order allocations, where reclaim is
> run back to back with compaction.
>
> Reported-by: Michal Hocko
> Reported-by: Kent Overstreet
> Signed-off-by: Johannes Weiner

Reviewed-by: Minchan Kim

--
Kind regards,
Minchan Kim
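
The stale-snapshot problem described in the quoted changelog can be
illustrated with a small userspace model. This is only a sketch under
stated assumptions, not the kernel code: struct scan_ctl, scan_pass(),
shrink_model() and the max_passes cap are hypothetical names introduced
here, and the toy scan pass reclaims nothing. Only the nr_scanned
counter and the "snapshot at function entry" idea come from the
changelog.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the scan control state; not the kernel struct. */
struct scan_ctl {
	unsigned long nr_scanned;
	unsigned long nr_reclaimed;
};

/* Toy scan pass (assumption): only the first pass scans pages, none reclaim. */
static void scan_pass(struct scan_ctl *sc, int pass)
{
	if (pass == 0)
		sc->nr_scanned += 32;
}

/*
 * Model of the retry loop. Returns the number of passes executed.
 * max_passes is a safety cap so the stale-snapshot variant terminates here.
 */
static int shrink_model(bool refresh_snapshot, int max_passes)
{
	struct scan_ctl sc = { 0, 0 };
	unsigned long nr_scanned = sc.nr_scanned;	/* snapshot at entry */
	unsigned long nr_reclaimed = sc.nr_reclaimed;
	int pass = 0;

	for (;;) {
		if (refresh_snapshot) {
			/* Re-take both snapshots at the start of every pass. */
			nr_scanned = sc.nr_scanned;
			nr_reclaimed = sc.nr_reclaimed;
		}

		scan_pass(&sc, pass);
		pass++;

		/* Exit when nothing was reclaimed and nothing was scanned. */
		if (sc.nr_reclaimed == nr_reclaimed && sc.nr_scanned == nr_scanned)
			break;
		if (pass >= max_passes)
			break;
	}
	return pass;
}
```

With the snapshot taken only once, shrink_model(false, 1000) keeps
seeing a nonzero scan delta and runs until the cap. With the snapshot
refreshed per pass, shrink_model(true, 1000) stops after the second
pass, which is the first one that scans nothing.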