From: Andrew Lutomirski
Date: Tue, 24 May 2011 07:24:15 -0400
Subject: Re: Kernel falls apart under light memory pressure (i.e. linking vmlinux)
To: Minchan Kim
Cc: KOSAKI Motohiro, Andrea Arcangeli, KAMEZAWA Hiroyuki, fengguang.wu@intel.com, andi@firstfloor.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, mgorman@suse.de, hannes@cmpxchg.org, riel@redhat.com
References: <4DD5DC06.6010204@jp.fujitsu.com> <20110520140856.fdf4d1c8.kamezawa.hiroyu@jp.fujitsu.com> <20110520101120.GC11729@random.random> <20110520153346.GA1843@barrios-desktop> <20110520161934.GA2386@barrios-desktop>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, May 23, 2011 at 9:34 PM, Minchan Kim wrote:
> On Tue, May 24, 2011 at 10:19 AM, Andrew Lutomirski wrote:
>> On Sun, May 22, 2011 at 7:12 PM, Minchan Kim wrote:
>>> Could you test below patch based on vanilla 2.6.38.6?
>>> The expect result is that system hang never should happen.
>>> I hope this is last test about hang.
>>>
>>> Thanks.
>>>
>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>> index 292582c..1663d24 100644
>>> --- a/mm/vmscan.c
>>> +++ b/mm/vmscan.c
>>> @@ -231,8 +231,11 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>>>         if (scanned == 0)
>>>                 scanned = SWAP_CLUSTER_MAX;
>>>
>>> -       if (!down_read_trylock(&shrinker_rwsem))
>>> -               return 1;       /* Assume we'll be able to shrink next time */
>>> +       if (!down_read_trylock(&shrinker_rwsem)) {
>>> +               /* Assume we'll be able to shrink next time */
>>> +               ret = 1;
>>> +               goto out;
>>> +       }
>>>
>>>         list_for_each_entry(shrinker, &shrinker_list, list) {
>>>                 unsigned long long delta;
>>> @@ -286,6 +289,8 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>>>                 shrinker->nr += total_scan;
>>>         }
>>>         up_read(&shrinker_rwsem);
>>> +out:
>>> +       cond_resched();
>>>         return ret;
>>>  }
>>>
>>> @@ -2331,7 +2336,7 @@ static bool sleeping_prematurely(pg_data_t *pgdat, int order, long remaining,
>>>          * must be balanced
>>>          */
>>>         if (order)
>>> -               return pgdat_balanced(pgdat, balanced, classzone_idx);
>>> +               return !pgdat_balanced(pgdat, balanced, classzone_idx);
>>>         else
>>>                 return !all_zones_ok;
>>>  }
>>
>> So far with this patch I can't reproduce the hang or the bogus OOM.
>>
>> To be completely clear, I have COMPACTION, MIGRATION, and THP off, I'm
>> running 2.6.38.6, and I have exactly two patches applied.  One is the
>> attached patch and the other is a the fpu.ko/aesni_intel.ko merger
>> which I need to get dracut to boot my box.
>>
>> For fun, I also upgraded to 8GB of RAM and it still works.
>>
>
> Hmm. Could you test it with enable thp and 2G RAM?
> Isn't it a original test environment?
> Please don't change test environment. :)

The test that passed last night was an environment (hardware and
config) that I had confirmed earlier as failing without the patch.

I just re-tested my original config (from a backup -- migration,
compaction, and thp "always" are enabled).  I get bogus OOMs but not a
hang.  (I'm running with mem=2G right now -- I'll swap the DIMMs back
out later on if you want.)

I attached the bogus OOM (actually several that happened in sequence).
They look readahead-related.  There was plenty of free swap space.

--Andy