Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1756109AbYJEQSt (ORCPT ); Sun, 5 Oct 2008 12:18:49 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
        id S1754662AbYJEQMz (ORCPT ); Sun, 5 Oct 2008 12:12:55 -0400
Received: from nef2.ens.fr ([129.199.96.40]:1845 "EHLO nef2.ens.fr"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1757384AbYJEQMy (ORCPT ); Sun, 5 Oct 2008 12:12:54 -0400
Date: Sun, 5 Oct 2008 18:12:15 +0200
From: Quentin Godfroy
To: Theodore Tso, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: possible (ext4 related?) memory leak in kernel 2.6.26
Message-ID: <20081005161214.GA2985@goelette.ens.fr>
References: <20080930211854.GZ10831@mit.edu>
        <20080930222358.1FF30EAC415@quatramaran.ens.fr>
        <20081003003548.GA18138@mit.edu>
        <20081005091526.GA678@goelette.ens.fr>
        <20081005122752.GB27335@mit.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20081005122752.GB27335@mit.edu>
User-Agent: Mutt/1.4.2.2i
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-3.1.4
        (nef2.ens.fr [129.199.96.32]); Sun, 05 Oct 2008 18:12:31 +0200 (CEST)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3511
Lines: 88

On Sun, Oct 05, 2008 at 08:27:52AM -0400, Theodore Tso wrote:
> > In both cases it freezes solid the machine for more than a minute or so, and
> > it overflows the dmesg with messages.
>
> Can you check and see if you got more of the messages recorded in
> /var/log/messages?  Once you do, can you take the block numbers, and
> pull them out into a single command file to feed to debugfs.

Unfortunately, there are no more messages in the syslog than in the dmesg.

...
> Take the column of block numbers, and tack on "icheck " at the
> beginning, like so:
>
>     icheck 35491 35493 35494 35496 35497 35498 35499 35501 ...
>
> You can put a thousand or so block numbers on each line; then it's
> probably better to start a new line with "icheck " at the beginning.
> Then take that script and run it through debugfs:
>
>     debugfs /dev/XXX < icheck.in > icheck.out
>
> That will result in a file icheck.out that looks like this:
>
>     debugfs: icheck 33347
>     Block   Inode number
>     33347   8193
>     33348   8193
>     33350   8196
>     33351   8197
>     ...
>
> Now you'll need to take the inode numbers returned in icheck.out, and
> create another file called ncheck.in that will take the inode numbers
> and turn them into pathnames.  (I find that using emacs's
> kill-rectangle command very handy for doing this sort of thing, but
> other people will like to use awk, and I'm sure there's some way to do
> this using vi but I don't know what it is.  :-)  It's also a good idea
> to take the inode numbers and run them through "sort -u" to get rid of
> duplicates before putting them on a single line and adding ncheck to
> them.  So what you want is to create a file ncheck.in that looks like this:
>
>     ncheck 8193 8196 8197 ....
>
> ... and then feed that to debugfs again:
>
>     debugfs /dev/XXX < ncheck.in > ncheck.out
>
> That will produce a file that looks like this:
>
>     debugfs: ncheck 8193
>     Inode   Pathname
>     8193    /ext4
>     ...
>
>
> The next thing I'd ask you to do is to look at the pathnames and
> eyeball them; are they all directories?  Files?  Files that you have
> modified earlier?  If you're not sure, you can look at a particular
> inode either by giving its pathname:
>
>     debugfs: stat /ext4
>
> or by its inode number, in angle brackets:
>
>     debugfs: stat <8193>
>
> What I'm trying to do here is to get a pattern of what might be going
> on.  I'm assuming that your filesystem is too big (and probably
> contains private information) for you to send it to me.
> (Although if you're willing to send me a compressed raw e2image --- see
> the "RAW IMAGE FILES" section of the e2image man page --- and the
> portions of the buffer information dumped in /var/log/messages, I can
> try to do some of the analysis for you.)

For both filesystems, the only inode that shows up is inode 8 (this seems
to be the journal). According to 'stat <8>' in debugfs, the journal looks
to be 134 MB long. I don't remember exactly how I created the fs, but I'm
sure I did not specify the journal size. Does that seem reasonable for a
6.6G fs?

However, seeing only the journal inode could be because I only had the last
300 blocks left in the dmesg. A small number of them did not belong to any
inode. Should I reboot with a very big ring buffer?
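
The icheck/ncheck steps quoted above are mechanical enough to script. What
follows is only a minimal Python sketch under stated assumptions: the block
numbers have already been pulled out of the dmesg into a list; the function
names are made up for illustration and are not part of e2fsprogs; and data
rows in icheck.out are assumed to be a block number followed by an inode
number, so every other line (the echoed command, the header, blocks that map
to no inode) is skipped.

```python
def build_icheck_script(blocks, per_line=1000):
    """Return the text of icheck.in: one 'icheck ...' command per line,
    with at most per_line block numbers on each line."""
    blocks = list(blocks)
    lines = []
    for start in range(0, len(blocks), per_line):
        chunk = blocks[start:start + per_line]
        lines.append("icheck " + " ".join(str(b) for b in chunk))
    return "".join(line + "\n" for line in lines)


def inodes_from_icheck_output(text):
    """Return the unique inode numbers found in icheck.out, sorted
    (the equivalent of the 'sort -u' step)."""
    inodes = set()
    for line in text.splitlines():
        fields = line.split()
        # Assumption: a data row is exactly two integers, block then inode.
        # Anything else (echoed command, header, unmapped block) is ignored.
        if len(fields) == 2 and fields[0].isdigit() and fields[1].isdigit():
            inodes.add(int(fields[1]))
    return sorted(inodes)


def build_ncheck_script(inodes):
    """Return the text of ncheck.in: a single 'ncheck ...' command."""
    inodes = list(inodes)
    if not inodes:
        return ""
    return "ncheck " + " ".join(str(i) for i in inodes) + "\n"


if __name__ == "__main__":
    # Sample rows taken from the icheck.out excerpt quoted above.
    sample = (
        "debugfs: icheck 33347\n"
        "Block\tInode number\n"
        "33347\t8193\n"
        "33348\t8193\n"
        "33350\t8196\n"
        "33351\t8197\n"
    )
    print(build_ncheck_script(inodes_from_icheck_output(sample)), end="")
    # prints "ncheck 8193 8196 8197"
```

Writing the returned strings to icheck.in and ncheck.in and feeding them to
debugfs is left as in the quoted instructions. Skipping unparseable rows
rather than failing is a deliberate choice here, since some of the blocks
did not belong to any inode.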