Date: Wed, 4 Aug 2010 19:49:33 +0800
From: Wu Fengguang
To: Chris Webb
Cc: Minchan Kim, linux-mm@kvack.org, linux-kernel@vger.kernel.org, KOSAKI Motohiro, Pekka Enberg
Subject: Re: Over-eager swapping
Message-ID: <20100804114933.GA13527@localhost>
In-Reply-To: <20100804095811.GC2326@arachsys.com>

On Wed, Aug 04, 2010 at 05:58:12PM +0800, Chris Webb wrote:
> Wu Fengguang writes:
>
> > This is interesting. Why is it waiting for 1m here? Are there high CPU
> > loads? Would you do a
> >
> >         echo t > /proc/sysrq-trigger
> >
> > and show us the dmesg?
>
> Annoyingly, magic-sysrq isn't compiled in on these kernels. Is there another
> way I can get this info for you? Replacing the kernels on the machines is a
> painful job as I have to give the clients running on them quite a bit of
> notice of the reboot, and I haven't been able to reproduce the problem on a
> test machine.

Maybe turn off KSM? It would help to isolate the problem. It's a relatively
new and complex feature, after all.

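For reference, KSM can be inspected and switched off at runtime through sysfs, so no reboot is needed. A minimal sketch, assuming the kernel exposes the standard /sys/kernel/mm/ksm interface; the helper names ksm_status and ksm_disable and the KSM_DIR override are made up here for illustration:

```shell
#!/bin/sh
# KSM_DIR is an illustrative override so the helpers can be tried against a
# scratch directory; on a real system the files live in /sys/kernel/mm/ksm.
KSM_DIR="${KSM_DIR:-/sys/kernel/mm/ksm}"

# Print the KSM run state and how much is currently merged.
# run: 0 = stopped, 1 = running, 2 = stopped and all merged pages unmerged
ksm_status() {
    printf 'run=%s pages_shared=%s pages_sharing=%s\n' \
        "$(cat "$KSM_DIR/run")" \
        "$(cat "$KSM_DIR/pages_shared")" \
        "$(cat "$KSM_DIR/pages_sharing")"
}

# Stop ksmd and unmerge every page it has merged so far (needs root on a
# real system). Writing 0 instead would stop ksmd but keep merged pages.
ksm_disable() {
    echo 2 > "$KSM_DIR/run"
}
```

Calling ksm_status before and after ksm_disable shows whether the pages were actually unmerged. Note that unmerging raises memory use, so it may be safer to try this on a host with some headroom first.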

> I also think the swap use is much better following a reboot, and only starts
> to spiral out of control after the machines have been running for a week or
> so.

Something deteriorates over a long time... It may take time to catch this bug.

> However, your suggestion is right that the CPU loads on these machines are
> typically quite high. The large number of kvm virtual machines they run mean
> that loads of eight or even sixteen in /proc/loadavg are not unusual, and
> these are higher when there's swap than after it has been removed. I assume
> this is mostly because of increased IO wait, as this number increases
> significantly in top.

iowait = CPU (idle) waiting for disk IO

So a high iowait figure indicates disk load rather than CPU load :)

Thanks,
Fengguang