Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755799AbZGAEVS (ORCPT ); Wed, 1 Jul 2009 00:21:18 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1750715AbZGAEVJ (ORCPT ); Wed, 1 Jul 2009 00:21:09 -0400 Received: from mga03.intel.com ([143.182.124.21]:1786 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750736AbZGAEVH (ORCPT ); Wed, 1 Jul 2009 00:21:07 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="4.42,321,1243839600"; d="scan'208";a="160319300" Date: Wed, 1 Jul 2009 12:21:03 +0800 From: Wu Fengguang To: Attila Kinali Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: Long lasting MM bug when swap is smaller than RAM Message-ID: <20090701042103.GA13911@localhost> References: <20090630115819.38b40ba4.attila@kinali.ch> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20090630115819.38b40ba4.attila@kinali.ch> User-Agent: Mutt/1.5.18 (2008-05-17) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6719 Lines: 154 On Tue, Jun 30, 2009 at 11:58:19AM +0200, Attila Kinali wrote: > Moin, > > There has been a bug back in the 2.4.17 days that is somehow > triggered by swap being smaller than RAM, which i thought had > been fixed long ago, reappeared on one of the machines i manage. > > > Back in 2002, i had a few machines, running 2.4.x kernels, which > i upgraded heavily from some 16-64MB RAM to a couple 100MB > (changing mainboards at times, but keeping the harddisks). > Due to the upgrade of RAM, the swap size became a lot smaller > than RAM size, sometimes not even half as much. > Under most conditions these machines worked fine, but sometimes, > they showed a strange behavior: At times, the swap use would grow > (depending on the machine and its use faster or slower, sometimes > at 1MB/minute) until it was full. I couldnt figure out what filled > swap back then, couldnt find any programm that used a lot of memory. > And even more, the RAM portion that was used as cache and buffers > was most times still very large, ie it didnt seem like something using > a lot of memory. > After swap was full, nothing happend. No programms crashing, no errors > in the logs, nothing.... Until later (between hours and a few weeks), > the OOM would suddenly start to kick in and kill applications. This > time, something would use a lot of memory, but i couldn't figure out > what. None of the applications running would use more than usual. > And even killing the usual culprits (Mozilla, X11,...) wouldnt help. > The only cure was to reboot. > > All the machines back then were running Debian, a vanilla kernel, > and had more RAM than swap and were x86 boxes. Other than that, > they didnt had much in common. One was a machine with an Adaptec > 2940UW, others had IDE, one had a K6-III CPU, others were Intel. > Some had a lot of disk, others very little. Machine usage was > fileserver, firewall/router, desktop, laptop. > > I reported this bug back then but never got an answer, so i used > the only fix i had available back then: disable swap completely. > > > Now, 7 years later, i have a machine that shows the same behavior. > > Some data: > > We have a HP DL380 G4 currently running a 2.6.29.4 vanilla kernel, > compiled for x86 32 bit. > It was originaly purchased in 2005 with 2GB RAM and a few weeks > ago upgraded to 6GB (no other changes beside this and a kernel upgrade). > The machine, being the MPlayer main server, runs a lighttpd, svnserve, > mailman, postfix, bind. Ie nothing unusual and the applications didn't > change in the last months (since the update from debian/etch to lenny). > > --- > root@natsuki:/home/attila# uname -a > Linux natsuki 2.6.29.4 #1 SMP Sun May 31 22:13:21 CEST 2009 i686 GNU/Linux > root@natsuki:/home/attila# uptime > 11:41:07 up 29 days, 13:17, 5 users, load average: 0.15, 0.36, 0.54 > root@natsuki:/home/attila# free -m > total used free shared buffers cached > Mem: 6023 5919 103 0 415 3873 > -/+ buffers/cache: 1630 4393 > Swap: 3812 879 2932 > --- Hi Attila, What's your /proc/meminfo and/or /proc/vmstat contents? Are you making use of tmpfs, or intel graphics devices with drm? Thanks, Fengguang > I want to point your attention at the fact that the machine has now > more RAM installed than it previously had RAM+Swap (ie before the upgrade). > Ie there is no reason it would need to swap out, at least not so much. > > What is even more interesting is the amount of swap used over time. > Sampled every day at 10:00 CEST: > > --- > Date: Wed, 17 Jun 2009 10:00:01 +0200 (CEST) > Mem: 6023 5893 130 0 405 3834 > Swap: 3812 190 3622 > > Date: Thu, 18 Jun 2009 10:00:01 +0200 (CEST) > Mem: 6023 5793 229 0 340 3939 > Swap: 3812 225 3586 > > Date: Fri, 19 Jun 2009 10:00:01 +0200 (CEST) > Mem: 6023 5820 203 0 341 3899 > Swap: 3812 275 3536 > > Date: Sun, 21 Jun 2009 10:00:01 +0200 (CEST) > Mem: 6023 5264 758 0 459 3181 > Swap: 3812 325 3486 > > Date: Sat, 20 Jun 2009 10:00:01 +0200 (CEST) > Mem: 6023 5761 262 0 348 3865 > Swap: 3812 297 3514 > > Date: Mon, 22 Jun 2009 10:00:01 +0200 (CEST) > Mem: 6023 5875 147 0 397 3681 > Swap: 3812 353 3458 > > Date: Tue, 23 Jun 2009 10:00:01 +0200 (CEST) > Mem: 6023 5748 275 0 193 3949 > Swap: 3812 415 3396 > > Date: Wed, 24 Jun 2009 10:00:01 +0200 (CEST) > Mem: 6023 5779 244 0 176 3924 > Swap: 3812 519 3292 > > Date: Thu, 25 Jun 2009 10:00:01 +0200 (CEST) > Mem: 6023 5812 210 0 345 3856 > Swap: 3812 611 3200 > > Date: Fri, 26 Jun 2009 10:00:01 +0200 (CEST) > Mem: 6023 5830 192 0 431 3688 > Swap: 3812 682 3129 > > Date: Sat, 27 Jun 2009 10:00:01 +0200 (CEST) > Mem: 6023 5697 326 0 442 3621 > Swap: 3812 719 3093 > > Date: Sun, 28 Jun 2009 10:00:02 +0200 (CEST) > Mem: 6023 5890 132 0 402 3886 > Swap: 3812 784 3028 > > Date: Mon, 29 Jun 2009 10:00:01 +0200 (CEST) > Mem: 6023 5388 635 0 425 3321 > Swap: 3812 826 2985 > --- > > As you can see, although memory usage didnt change much over time, > swap usage increased from 190MB to 826MB in about two weeks. > > As i'm pretty much clueless when it commes to how the linux VM works, > i would appreciate it if someone could give me some pointers on how > to figure out what causes this bug so that it could be fixed finally. > > Thanks a lot in advance > > Attila Kinali > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: email@kvack.org -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/