2010-04-28 19:54:09

by Robert Henney

Subject: NULL pointer dereference in 2.6.32.12 on mount attempt

this bug has so far been reproducible.

I have an nfs server running Debian lenny with the stock 2.6.26-2-686
kernel, and a client machine also running Debian lenny but with a
2.6.32.12 kernel (kernel config attached).

/etc/exports on the server, possibly bogus although the server never
complains and still probably shouldn't trigger a NULL dereference in
the client:
/stow *(ro,fsid=0,crossmnt,no_subtree_check)
/stow -mp,ro,all_squash,async,no_subtree_check \
199.125.85.51 \
199.125.85.134 \
66.55.209.223

/etc/fstab on the client:
199.125.85.39:/stow /stow nfs4 noatime

the OOPS (attached below) occurs when attempting a mount from the client

# mount /stow
# echo $?
2

the mount command never outputs but has a return code of 2 and the mount
is not successful.


Attachments:
(No filename) (797.00 B)
kern.log (2.84 kB)
kernel_config (65.18 kB)

2010-04-28 22:40:49

by Robert Henney

Subject: Re: NULL pointer dereference in 2.6.32.12 on mount attempt

On Wed, Apr 28, 2010 at 06:29:33PM -0400, Trond Myklebust wrote:

> I was thinking more about the contents of your kernel syslog.
>
> IOW: if you do
>
> dmesg -s 900000 > /tmp/dmesg.txt
>
> what are the contents of the file /tmp/dmesg.txt?

attached. are compressed attachments allowed on this list?


Attachments:
(No filename) (309.00 B)
dmesg.txt.bz2 (10.14 kB)

2010-04-28 22:29:38

by Trond Myklebust

Subject: Re: NULL pointer dereference in 2.6.32.12 on mount attempt

On Wed, 2010-04-28 at 18:23 -0400, Robert Henney wrote:
> On Wed, Apr 28, 2010 at 04:24:23PM -0400, Trond Myklebust wrote:
> > What happens if you do
> >
> > echo 1025 > /proc/sys/sunrpc/nfs_debug
> >
> > prior to trying the mount?
>
> the client becomes slow enough to be unusable after the value of nfs_debug
> is changed to 1025, which is probably due to it being a diskless client.
> although the root filesystem is not the mount causing the issue, I can
> try and get a dedicated test machine set up soon to aid further testing.

I was thinking more about the contents of your kernel syslog.

IOW: if you do

dmesg -s 900000 > /tmp/dmesg.txt

what are the contents of the file /tmp/dmesg.txt?
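
The two requests in this thread (enable NFS debugging, then save the kernel log) can be combined into one step. What follows is only a rough sketch, not something posted by either participant: the function name capture_mount_debug and the NFS_DEBUG_CTL override variable are hypothetical, and it assumes a root shell and that writing 0 to nfs_debug turns the debugging back off.

```shell
#!/bin/sh
# Sketch only: capture_mount_debug and NFS_DEBUG_CTL are hypothetical names.
# Must run as root. Assumes writing 0 to nfs_debug disables NFS debugging.
capture_mount_debug() {
    mountpoint=$1
    outfile=$2
    ctl=${NFS_DEBUG_CTL:-/proc/sys/sunrpc/nfs_debug}

    # Turn on NFS client debugging with the value suggested in the thread.
    echo 1025 > "$ctl" || return 1

    # Attempt the mount and remember its exit status.
    mount "$mountpoint"
    rc=$?

    # Turn debugging off again so the client stays usable.
    echo 0 > "$ctl"

    # Save the kernel ring buffer, using the buffer size from the thread.
    dmesg -s 900000 > "$outfile"

    return "$rc"
}
```

It would be called as `capture_mount_debug /stow /tmp/dmesg.txt`. Keeping the debug window this short may help on a diskless client, where the extra logging also slows the NFS root.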

Cheers
Trond


2010-04-28 22:23:40

by Robert Henney

Subject: Re: NULL pointer dereference in 2.6.32.12 on mount attempt

On Wed, Apr 28, 2010 at 04:24:23PM -0400, Trond Myklebust wrote:
> On Wed, 2010-04-28 at 15:17 -0400, Robert Henney wrote:

> > /etc/exports on the server, possibly bogus although the server never
> > complains and still probably shouldn't trigger a NULL dereference in
> > the client:
> > /stow *(ro,fsid=0,crossmnt,no_subtree_check)
> > /stow -mp,ro,all_squash,async,no_subtree_check \
> > 199.125.85.51 \
> > 199.125.85.134 \
> > 66.55.209.223
>
> You probably want to add at least a 'fsid=0' option to that second line.
>
> > /etc/fstab on the client:
> > 199.125.85.39:/stow /stow nfs4 noatime
>
> Should be
>
> 199.125.85.39:/ /stow nfs4

if I correct both of the above as you say, then it works. :)

I should mention though that occasionally when reproducing the bug on
the client it caused the server kernel (debian lenny
linux-image-2.6.26-2-686) to report its own kernel bug and nfsd on the
server became hosed and unusable for all clients until the server was
rebooted. kern.log output attached.

since I can only reproduce the kernel bugs using a "wrong" exports
file, I'm not sure how critical they are anymore.

> > the mount command never outputs but has a return code of 2 and the mount
> > is not successful.
>
> That looks like a stack overflow to me, but it's hard to tell.
>
> What happens if you do
>
> echo 1025 > /proc/sys/sunrpc/nfs_debug
>
> prior to trying the mount?

the client becomes slow enough to be unusable after the value of nfs_debug
is changed to 1025, which is probably due to it being a diskless client.
although the root filesystem is not the mount causing the issue, I can
try and get a dedicated test machine set up soon to aid further testing.


Attachments:
(No filename) (1.68 kB)
server_kern.log (2.51 kB)

2010-04-28 20:24:40

by Myklebust, Trond

Subject: Re: NULL pointer dereference in 2.6.32.12 on mount attempt

On Wed, 2010-04-28 at 15:17 -0400, Robert Henney wrote:
> this bug has so far been reproducible.
>
> I have an nfs server running Debian lenny with the stock 2.6.26-2-686
> kernel, and a client machine also running Debian lenny but with a
> 2.6.32.12 kernel (kernel config attached).
>
> /etc/exports on the server, possibly bogus although the server never
> complains and still probably shouldn't trigger a NULL dereference in
> the client:
> /stow *(ro,fsid=0,crossmnt,no_subtree_check)
> /stow -mp,ro,all_squash,async,no_subtree_check \
> 199.125.85.51 \
> 199.125.85.134 \
> 66.55.209.223

You probably want to add at least a 'fsid=0' option to that second line.

> /etc/fstab on the client:
> 199.125.85.39:/stow /stow nfs4 noatime

Should be

199.125.85.39:/ /stow nfs4


> the OOPS (attached below) occurs when attempting a mount from the client
>
> # mount /stow
> # echo $?
> 2
>
> the mount command never outputs but has a return code of 2 and the mount
> is not successful.

That looks like a stack overflow to me, but it's hard to tell.

What happens if you do

echo 1025 > /proc/sys/sunrpc/nfs_debug

prior to trying the mount?
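
For reference, the value 1025 is a bitmask of NFS client debug flags. The sketch below decodes such a value into flag names. The name-to-bit table is an assumption based on the NFSDBG_* constants in include/linux/nfs_fs.h and should be checked against the kernel tree in use; the helper name decode_nfs_debug is hypothetical.

```shell
#!/bin/sh
# Sketch only: decode an nfs_debug bitmask into flag names.
# The bit assignments are assumed from the NFSDBG_* constants in
# include/linux/nfs_fs.h; verify them against your kernel source.
decode_nfs_debug() {
    value=$1
    names=""
    for entry in 1:VFS 2:DIRCACHE 4:LOOKUPCACHE 8:PAGECACHE 16:PROC 32:XDR \
                 64:FILE 128:ROOT 256:CALLBACK 512:CLIENT 1024:MOUNT; do
        bit=${entry%%:*}     # numeric part before the colon
        name=${entry#*:}     # flag name after the colon
        if [ $(( value & bit )) -ne 0 ]; then
            names="${names:+$names }$name"
        fi
    done
    echo "$names"
}

decode_nfs_debug 1025    # prints "VFS MOUNT"
```

Under that assumption, 1025 (0x401) enables only the VFS and mount-path messages and leaves the other flags in the table off.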

Cheers
Trond