Date: Wed, 18 Aug 2010 23:21:03 +0800
From: Wu Fengguang
To: Chris Webb
Cc: Minchan Kim, "linux-mm@kvack.org", "linux-kernel@vger.kernel.org",
    KOSAKI Motohiro, Pekka Enberg, Andi Kleen, Lee Schermerhorn,
    Christoph Lameter
Subject: Re: Over-eager swapping
Message-ID: <20100818152103.GA11268@localhost>
References: <20100803042835.GA17377@localhost>
    <20100803214945.GA2326@arachsys.com>
    <20100804022148.GA5922@localhost>
    <20100804032400.GA14141@localhost>
    <20100804095811.GC2326@arachsys.com>
    <20100804114933.GA13527@localhost>
    <20100804120430.GB23551@arachsys.com>
    <20100818143801.GA9086@localhost>
    <20100818144655.GX2370@arachsys.com>
In-Reply-To: <20100818144655.GX2370@arachsys.com>

Andi, Christoph and Lee:

This looks like an "unbalanced NUMA memory usage leading to premature
swapping" problem.

Thanks,
Fengguang

On Wed, Aug 18, 2010 at 10:46:59PM +0800, Chris Webb wrote:
> Wu Fengguang writes:
>
> > Did you enable any NUMA policy? That could start swapping even if
> > there are lots of free pages in some nodes.
>
> Hi. Thanks for the follow-up. We haven't done any configuration or tuning of
> NUMA behaviour, but NUMA support is definitely compiled into the kernel:
>
>   # zgrep NUMA /proc/config.gz
>   CONFIG_NUMA_IRQ_DESC=y
>   CONFIG_NUMA=y
>   CONFIG_K8_NUMA=y
>   CONFIG_X86_64_ACPI_NUMA=y
>   # CONFIG_NUMA_EMU is not set
>   CONFIG_ACPI_NUMA=y
>   # grep -i numa /var/log/dmesg.boot
>   NUMA: Allocated memnodemap from b000 - 1b540
>   NUMA: Using 20 for the hash shift.
>
> > Are your free pages equally distributed over the nodes? Or limited to
> > some of the nodes? Try this command:
> >
> >         grep MemFree /sys/devices/system/node/node*/meminfo
>
> My worst-case machines currently have swap completely turned off to make them
> usable for clients, but I have one machine which is about 3GB into swap with
> 8GB of buffers and 3GB free. This shows
>
>   # grep MemFree /sys/devices/system/node/node*/meminfo
>   /sys/devices/system/node/node0/meminfo:Node 0 MemFree:   954500 kB
>   /sys/devices/system/node/node1/meminfo:Node 1 MemFree:  2374528 kB
>
> I could definitely imagine that one of the nodes could have dipped down to
> zero in the past. I'll try enabling swap on one of our machines with the bad
> problem late tonight and repeat the experiment. The node meminfo on this box
> currently looks like
>
>   # grep MemFree /sys/devices/system/node/node*/meminfo
>   /sys/devices/system/node/node0/meminfo:Node 0 MemFree:    82732 kB
>   /sys/devices/system/node/node1/meminfo:Node 1 MemFree:  1723896 kB
>
> Best wishes,
>
> Chris.
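
For anyone repeating the per-node check from this thread on several
machines, the grep output can be summarised with a small script. This is
only a rough sketch, not something posted in the thread: the function
names (parse_node_memfree, low_nodes) and the 10% threshold are
hypothetical choices for illustration, and a low MemFree figure on one
node is a hint to look closer, not proof that it caused the swapping.

```python
import re

# Matches grep-style lines such as:
#   /sys/devices/system/node/node0/meminfo:Node 0 MemFree: 954500 kB
MEMFREE_RE = re.compile(r"Node\s+(\d+)\s+MemFree:\s+(\d+)\s+kB")


def parse_node_memfree(text):
    """Return {node_id: free_kB} parsed from per-node MemFree lines."""
    free = {}
    for line in text.splitlines():
        match = MEMFREE_RE.search(line)
        if match:
            free[int(match.group(1))] = int(match.group(2))
    return free


def low_nodes(free, fraction=0.1):
    """List nodes whose free memory is below `fraction` of the node
    with the most free memory (an arbitrary, hypothetical heuristic)."""
    if not free:
        return []
    most = max(free.values())
    return sorted(node for node, kb in free.items() if kb < fraction * most)


# Figures quoted in the message above for the second machine.
sample = """\
/sys/devices/system/node/node0/meminfo:Node 0 MemFree: 82732 kB
/sys/devices/system/node/node1/meminfo:Node 1 MemFree: 1723896 kB
"""

free = parse_node_memfree(sample)
for node, kb in sorted(free.items()):
    print(f"node {node}: {kb} kB free")
print("nodes short of free memory:", low_nodes(free))
```

On a live system the same text can be obtained by piping the output of
"grep MemFree /sys/devices/system/node/node*/meminfo" into the script in
place of the sample string.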