2003-02-20 13:18:01

by Martin Josefsson

Subject: cifs leaks memory like crazy in 2.5.61

Hi Steven,

I've been using cifs in 2.5.61 instead of smbfs (it has problems with
slab debugging) for 4.5 days now and I just noticed a bad thing.
I was looking at /proc/slabinfo and this jumped out at me:

size-64 1843081 1847421 72 .....

that's a _lot_ of allocations.
I tried to figure out what was leaking and every time I listed a
directory mounted from a samba server it increased. A simple ls -R was
scary to see.
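
For anyone who wants to watch the same counter while running ls -R, here is a minimal sketch in C. It assumes each /proc/slabinfo line starts with the cache name followed by active objects, total objects and object size, as the size-64 line quoted above suggests; the struct and function names are made up for illustration and are not part of any kernel or library interface.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record for one /proc/slabinfo line. The field order
 * (name, active objects, total objects, object size) is an assumption
 * taken from the size-64 line in the report above. */
struct slab_line {
    char name[32];
    unsigned long active;
    unsigned long total;
    unsigned long objsize;
};

/* Parse the first four fields of a slabinfo line.
 * Returns 1 on success, 0 if the line does not have that shape. */
int parse_slab_line(const char *line, struct slab_line *out)
{
    assert(line != NULL && out != NULL);
    return sscanf(line, "%31s %lu %lu %lu",
                  out->name, &out->active, &out->total, &out->objsize) == 4;
}

/* Change in active objects between two samples of the same cache. */
long slab_growth(const struct slab_line *before, const struct slab_line *after)
{
    return (long)after->active - (long)before->active;
}
```

Sampling the size-64 line before and after a directory listing and passing both records to slab_growth gives the number of objects that were not returned to the cache.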

Then I unmounted all my cifs filesystems (two) and removed the cifs
module and got this:

kmem_cache_destroy: Can't free all objects e8eefd00
cifs_destroy_request_cache: error not all structures were freed

Is this a known problem?

--
/Martin

Never argue with an idiot. They drag you down to their level, then beat you with experience.


2003-02-20 16:21:56

by Steven French

Subject: Re: cifs leaks memory like crazy in 2.5.61

Hadn't run into this - I had been focusing on the readahead and write page
improvements (which have improved especially write performance
spectacularly) and also have just fixed a problem with redundant lookups of
directory inodes but had not been doing readdir (cifs
Trans2FindFirst/Trans2FindNext) testing recently. I just did - and the
situation looks worse than you describe and probably related to what you
are running into. I found a readdir test case that hangs my post 2.5.62
system pretty fast and the last two unrelated cifs changesets don't fix it.
The cifs readdir code needed some rework anyway - I will crawl through it
today. Thanks for finding this.

>kmem_cache_destroy: Can't free all objects e8eefd00
>cifs_destroy_request_cache: error not all structures were freed
>
>Is this a known problem?

Steve French
Senior Software Engineer
Linux Technology Center - IBM Austin
phone: 512-838-2294
email: [email protected]

2003-02-20 16:35:48

by Jeff Garzik

Subject: Re: cifs leaks memory like crazy in 2.5.61

On Thu, Feb 20, 2003 at 10:28:07AM -0600, Steven French wrote:
> Hadn't run into this - I had been focusing on the readahead and write page
> improvements (which have improved especially write performance
> spectacularly) and also have just fixed a problem with redundant lookups of
> directory inodes but had not been doing readdir (cifs
> Trans2FindFirst/Trans2FindNext) testing recently. I just did - and the

Well, here is a humble request to actually test your stuff,
before adding more features!

Jeff



2003-02-20 16:48:42

by Steven French

Subject: Re: cifs leaks memory like crazy in 2.5.61

I run three file API tests regularly against it - fsx, the Connectathon
"nfs" tests and iozone and use them as a sort of regression test bucket
(which unfortunately didn't pick this problem up) - as a result of this I
will add "ls -R" of a deep directory tree to the list (ls -R of a shallow
tree doesn't seem to show this problem up) - if there are other useful
filesystem regression cases that I could automate and run, I would love to
know about them.

> request to actually test your stuff, before adding more features!

Steve French

2003-02-20 17:06:39

by Jeff Garzik

Subject: Re: cifs leaks memory like crazy in 2.5.61

On Thu, Feb 20, 2003 at 10:56:04AM -0600, Steven French wrote:
> I run three file API tests regularly against it - fsx, the Connectathon
> "nfs" tests and iozone and use them as a sort of regression test bucket
> (which unfortunately didn't pick this problem up) - as a result of this I
> will add "ls -R" of a deep directory tree to the list (ls -R of a shallow
> tree doesn't seem to show this problem up) - if there are other useful
> filesystem regression cases that I could automate and run, I would love to
> know about them.

Those are more stress tests than regression tests. You have to know
what you regressed from, and progressed to, before you have regression
tests. ;-)

It sounds like unit tests are lacking...

Jeff



2003-02-20 21:14:14

by Steven French

Subject: Re: cifs leaks memory like crazy in 2.5.61

Fixed now.

The obvious warning message that Martin noted (from kmem_cache_destroy of
the request buffer cache when the cifs module was unloaded) turned out
to be a side effect of the global kernel change for masking signals for
users of daemonize that went in about 10 days ago. With signals now
masked by default, the cifsd captive thread was not exiting fully at
unmount time, leaving an unused buffer allocated at rmmod time. With
Andrew Morton's recent exports, the cifs vfs can be built as a module
again so the fix will be timely (since others would be likely to notice
it). The fix is changeset 1.1004 at http://cifs.bkbits.net/linux-2.5cifs

The unrelated 64 byte object allocation growth in the slab cache turns out
to have been around for quite a long time and was caused by a path in
which file->private_data was reallocated when search rewinding
occurred (which "ls -R" does). The file->private_data field is freed on
release in cifs_closedir, but in this path it could be allocated more than
once (specifically, when file->f_pos was rewound to the second search
entry as readdir was reinvoked). I found this while rechecking all the
kmalloc invocations in the cifs vfs today. This particular case (readdir
handles) was not instrumented with a counter as many of the other memory
allocations are. I will post this fix later today. In retrospect this
reminds me of bugs in various filesystems and network servers generated by
that other OS's "tree" utility, which also did search rewind.
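
The leak pattern described above can be sketched in user-space C. This is only an illustration under stated assumptions, not the cifs code and not the fix that was committed: the structs, the small fixed pool standing in for the slab cache, and all function names are hypothetical. It shows how restarting a search overwrites the per-file pointer unless the restart path checks for an existing allocation, and how a live-object counter makes the leak visible.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins only; none of these names come from cifs. */
struct search_state { int resume_entry; };
struct dir_file {
    struct search_state *private_data; /* role of file->private_data */
    long pos;                          /* role of file->f_pos */
};

/* Tiny fixed pool standing in for the slab cache, so that live
 * objects can be counted (the instrumentation the message mentions). */
#define POOL_SLOTS 16
static struct search_state pool[POOL_SLOTS];
static int in_use[POOL_SLOTS];

struct search_state *pool_alloc(void)
{
    for (int i = 0; i < POOL_SLOTS; i++) {
        if (!in_use[i]) {
            in_use[i] = 1;
            pool[i].resume_entry = 0;
            return &pool[i];
        }
    }
    return NULL; /* pool exhausted */
}

void pool_free(struct search_state *s)
{
    if (s == NULL)
        return;
    assert(s >= pool && s < pool + POOL_SLOTS);
    in_use[s - pool] = 0;
}

int live_objects(void)
{
    int n = 0;
    for (int i = 0; i < POOL_SLOTS; i++)
        n += in_use[i];
    return n;
}

/* Leaky restart: allocates on every rewind and overwrites the old
 * pointer, so the earlier object can never be freed. */
int start_search_leaky(struct dir_file *f)
{
    f->private_data = pool_alloc();
    f->pos = 0;
    return f->private_data ? 0 : -1;
}

/* Guarded restart: reuses the existing state when there is one. */
int start_search_fixed(struct dir_file *f)
{
    if (f->private_data == NULL)
        f->private_data = pool_alloc();
    if (f->private_data == NULL)
        return -1;
    f->private_data->resume_entry = 0;
    f->pos = 0;
    return 0;
}

/* Role of the release path: frees only the pointer currently stored. */
void close_dir(struct dir_file *f)
{
    pool_free(f->private_data);
    f->private_data = NULL;
}
```

With the leaky variant, three restarts followed by one close leave two objects live; with the guarded variant the count returns to where it started.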

Steve French