Forwarding to LKML as per Neil Brown.
---------- Forwarded message ----------
Date: Sun, 9 Nov 2003 17:21:47 -0500 (EST)
From: Burton Windle <[email protected]>
To: Trond Myklebust <[email protected]>, [email protected]
Subject: nfsd caused slab corruption in test9
Hello. I'm not quite sure whom to send this to, so my appologizes if you
aren't the right people.
Two linux boxes, both running Debian Testing, both x86 with 2.6.0-test9.
The nfs client was running the nhfsstone stress test against an ext3 NFS
exported share. After about an hour (I forgot it was running), the console on the
server woke up and spit this out:
Slab corruption: start=c9df6bcc, expend=c9df6c8b, problemat=c9df6bcc
Last user: [<c017f87e>](d_callback+0x1e/0x40)
Data: 6A**********************************************************************************************************************************************************************************************A5
Next: 71 F0 2C .7E F8 17 C0 A5 C2 0F 17 00 00 00 00 08 00 00 00 3C 4B 24 1D 00 00 00 00 0A 00 00 00 slab error in check_poison_obj(): cache
`dentry_cache': object was modified after freeing Call Trace:
[<c0143653>] check_poison_obj+0xf3/0x180
[<c0181d19>] d_alloc+0x19/0x360
[<c01457da>] kmem_cache_alloc+0x13a/0x180
[<c0181d19>] d_alloc+0x19/0x360
[<c0174a3e>] cached_lookup+0x5e/0x80
[<c0175ebb>] __lookup_hash+0x5b/0xa0
[<c0175f10>] lookup_hash+0x10/0x20
[<c0175f75>] lookup_one_len+0x55/0x80
[<c01e7d05>] nfsd_lookup+0xa5/0x5e0
[<c01f020b>] nfsd3_proc_lookup+0x8b/0x100
[<c01e5147>] nfsd_dispatch+0xc7/0x1a0
[<c0313a7f>] svc_process+0x49f/0x640
[<c01e4c47>] nfsd+0x307/0x740
[<c01e4940>] nfsd+0x0/0x740
[<c0107429>] kernel_thread_helper+0x5/0x1c
# CONFIG_SMP is not set
CONFIG_PREEMPT=y
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
# CONFIG_NFS_V4 is not set
# CONFIG_NFS_DIRECTIO is not set
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
# CONFIG_NFSD_V4 is not set
# CONFIG_NFSD_TCP is not set
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_STACKOVERFLOW=y
CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_IOVIRT=y
CONFIG_DEBUG_SPINLOCK=y
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_DEBUG_INFO is not set
CONFIG_DEBUG_SPINLOCK_SLEEP=y
Linux version 2.6.0-test9 (root@morpheus) (gcc version 3.3.2 20030908
(Debian prerelease)) #2 Sat Oct 25 19:33:41 EDT 2003
Gnu C 3.3.2
Gnu make 3.80
util-linux 2.12
mount 2.12
module-init-tools implemented
e2fsprogs 1.35-WIP
nfs-utils 1.0.5
Linux C Library 2.3.2
Dynamic linker (ldd) 2.3.2
Procps 3.1.12
Net-tools 1.60
Console-tools 0.2.3
Sh-utils 5.0
--
Burton Windle [email protected]
Linux: the "grim reaper of innocent orphaned children."
from /usr/src/linux-2.4.18/init/main.c:461
Burton Windle <[email protected]> wrote:
>
> Forwarding to LKML as per Neil Brown.
>
>
> ---------- Forwarded message ----------
> Date: Sun, 9 Nov 2003 17:21:47 -0500 (EST)
> From: Burton Windle <[email protected]>
> To: Trond Myklebust <[email protected]>, [email protected]
> Subject: nfsd caused slab corruption in test9
>
> Hello. I'm not quite sure whom to send this to, so my appologizes if you
> aren't the right people.
>
> Two linux boxes, both running Debian Testing, both x86 with 2.6.0-test9.
> The nfs client was running the nhfsstone stress test against an ext3 NFS
> exported share. After about an hour (I forgot it was running), the console on the
> server woke up and spit this out:
>
>
> Slab corruption: start=c9df6bcc, expend=c9df6c8b, problemat=c9df6bcc
> Last user: [<c017f87e>](d_callback+0x1e/0x40)
> Data: 6A**********************************************************************************************************************************************************************************************A5
> Next: 71 F0 2C .7E F8 17 C0 A5 C2 0F 17 00 00 00 00 08 00 00 00 3C 4B 24 1D 00 00 00 00 0A 00 00 00 slab error in check_poison_obj(): cache
> `dentry_cache': object was modified after freeing Call Trace:
> [<c0143653>] check_poison_obj+0xf3/0x180
> [<c0181d19>] d_alloc+0x19/0x360
> [<c01457da>] kmem_cache_alloc+0x13a/0x180
> [<c0181d19>] d_alloc+0x19/0x360
> [<c0174a3e>] cached_lookup+0x5e/0x80
> [<c0175ebb>] __lookup_hash+0x5b/0xa0
> [<c0175f10>] lookup_hash+0x10/0x20
> [<c0175f75>] lookup_one_len+0x55/0x80
> [<c01e7d05>] nfsd_lookup+0xa5/0x5e0
> [<c01f020b>] nfsd3_proc_lookup+0x8b/0x100
> [<c01e5147>] nfsd_dispatch+0xc7/0x1a0
> [<c0313a7f>] svc_process+0x49f/0x640
> [<c01e4c47>] nfsd+0x307/0x740
> [<c01e4940>] nfsd+0x0/0x740
> [<c0107429>] kernel_thread_helper+0x5/0x1c
>
Someone did a dput() of a freed dentry. This is quite possibly an nfsd bug.
There's also http://bugme.osdl.org/show_bug.cgi?id=1497, in which someone
did a scribble on the buffer_head slab's metadata. Could be unrelated.
There's a debug patch in -mm which should catch the rogue dput(). Perhaps
you could include it in your testing. It is at
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test9/2.6.0-test9-mm2/broken-out/atomic_dec-debug.patch
On Sunday November 9, [email protected] wrote:
> Burton Windle <[email protected]> wrote:
> > Slab corruption: start=c9df6bcc, expend=c9df6c8b, problemat=c9df6bcc
> > Last user: [<c017f87e>](d_callback+0x1e/0x40)
> > Data: 6A**********************************************************************************************************************************************************************************************A5
> > Next: 71 F0 2C .7E F8 17 C0 A5 C2 0F 17 00 00 00 00 08 00 00 00 3C 4B 24 1D 00 00 00 00 0A 00 00 00 slab error in check_poison_obj(): cache
> >
>
> Someone did a dput() of a freed dentry. This is quite possibly an
> nfsd bug.
Ahhh. That would be:
===============================================
An extra dput was introduced in nfsd_rename 20 months ago....
time to remove it.
----------- Diffstat output ------------
./fs/nfsd/vfs.c | 1 -
1 files changed, 1 deletion(-)
diff ./fs/nfsd/vfs.c~current~ ./fs/nfsd/vfs.c
--- ./fs/nfsd/vfs.c~current~ 2003-11-05 11:25:45.000000000 +1100
+++ ./fs/nfsd/vfs.c 2003-11-05 11:25:59.000000000 +1100
@@ -1368,7 +1368,6 @@ nfsd_rename(struct svc_rqst *rqstp, stru
nfsd_sync_dir(tdentry);
nfsd_sync_dir(fdentry);
}
- dput(ndentry);
out_dput_new:
dput(ndentry);
==============================================
NeilBrown
On Mon, 10 Nov 2003, Neil Brown wrote:
>
> An extra dput was introduced in nfsd_rename 20 months ago....
>
> time to remove it.
Oh, you stand-up comedian you.
I'm just wondering how the hell this hasn't bit us seriously until now?
What's up?
In other words, your patch certainly looks obviously correct, but it also
looks _so_ obviously correct that my alarm bells are going off. If the
code was quite that broken at counting dentries, how the hell did it ever
work AT ALL?
Call me suspicious, but I find this really strange..
Linus
On Sun, Nov 09, 2003 at 08:09:40PM -0800, Linus Torvalds wrote:
> In other words, your patch certainly looks obviously correct, but it also
> looks _so_ obviously correct that my alarm bells are going off. If the
> code was quite that broken at counting dentries, how the hell did it ever
> work AT ALL?
From Jargon File (4.3.0, 30 APR 2001) [jargon]:
schroedinbug /shroh'din-buhg/ n. [MIT: from the Schroedinger's Cat
thought-experiment in quantum physics] A design or implementation bug in
a program that doesn't manifest until someone reading source or using
the program in an unusual way notices that it never should have worked,
at which point the program promptly stops working for everybody until
fixed. Though (like {bit rot}) this sounds impossible, it happens; some
programs have harbored latent schroedinbugs for years. Compare
{heisenbug}, {Bohr bug}, {mandelbug}.
Hold on to your horses everyone? :)
--
Joshua Kwan
On 9 Nov 2003, Trond Myklebust wrote:
>
> Given that d_free() now uses rcu, and hence defers the actual call to
> kmem_cache_free(), might that not suffice to explain why actual
> consequences are rare?
The thing is, since it gets free'd before all users have let go of their
pointers to it, I'd expect people to now have a pointer to a _totally_
stale entry somewhere.
Although I guess in 99% of all cases there just won't be any other users
for the new dentry in particular (it might be different if it was the old
dentry that got dropped too much), and clearly it hasn't been noticed for
the last 20 months.
Even so, I'm a bit surprised. But I obviously applied the patch.
Linus
>>>>> " " == Linus Torvalds <[email protected]> writes:
> In other words, your patch certainly looks obviously correct,
> but it also looks _so_ obviously correct that my alarm bells
> are going off. If the code was quite that broken at counting
> dentries, how the hell did it ever work AT ALL?
Given that d_free() now uses rcu, and hence defers the actual call to
kmem_cache_free(), might that not suffice to explain why actual
consequences are rare?
Cheers,
Trond
On Sunday November 9, [email protected] wrote:
>
> On Mon, 10 Nov 2003, Neil Brown wrote:
> >
> > An extra dput was introduced in nfsd_rename 20 months ago....
> >
> > time to remove it.
>
> Oh, you stand-up comedian you.
>
> I'm just wondering how the hell this hasn't bit us seriously until now?
> What's up?
>
> In other words, your patch certainly looks obviously correct, but it also
> looks _so_ obviously correct that my alarm bells are going off. If the
> code was quite that broken at counting dentries, how the hell did it ever
> work AT ALL?
>
> Call me suspicious, but I find this really strange..
>
> Linus
Me too.
I had to read through the code several times, and then find the patch
that introduced the bug and make sure it looked wrong too, which it
does.
The only time that the dentry is used after the dput that reduces
d_count to 0, is that dput is called once more, and it is very soon
and only decrements the count (down to -1).
So there is a very small window for someone else to get that piece of
memory and have it corrupted.
This bug was first brought to my attention by the NFSv4 team who have
obviously been hammering nfsd harder than the rest of us (an in more
interesting ways, as is the nature of nfsv4).
NeilBrown
Linus Torvalds <[email protected]> wrote:
>
>
> On Mon, 10 Nov 2003, Neil Brown wrote:
> >
> > An extra dput was introduced in nfsd_rename 20 months ago....
> >
> > time to remove it.
>
> Oh, you stand-up comedian you.
>
> I'm just wondering how the hell this hasn't bit us seriously until now?
> What's up?
Me too.
If the dentry was on the dentry_unused list then it would get a whacky
refcount and we would simply leak it.
It is only in the very rare case that the dentry gets freed up between the
two dput()s that the slab debugging will trip.
So probably, we've been leaking dentries as a result of rename activity by
NFS clients, but nobody noticed.
On Mon, 10 Nov 2003, Neil Brown wrote:
>
> I had to read through the code several times, and then find the patch
> that introduced the bug and make sure it looked wrong too, which it
> does.
Yeah, me too. I looked up the original patch from Feb 2002 to verify that
there wasn't something subtle going on. But I agree - there's nothing
subtle there at all, it really is an obvious bug.
Linus
On Sun, Nov 09, 2003 at 08:09:40PM -0800, Linus Torvalds wrote:
>
> On Mon, 10 Nov 2003, Neil Brown wrote:
> >
> > An extra dput was introduced in nfsd_rename 20 months ago....
> >
> > time to remove it.
>
> Oh, you stand-up comedian you.
>
> I'm just wondering how the hell this hasn't bit us seriously until now?
> What's up?
>
> In other words, your patch certainly looks obviously correct, but it also
> looks _so_ obviously correct that my alarm bells are going off. If the
> code was quite that broken at counting dentries, how the hell did it ever
> work AT ALL?
>
> Call me suspicious, but I find this really strange..
Arrgh...
No, Neil is right - it was a plain and simple fsckup on my part. No hidden
logics is there; his fix is correct.