2002-10-20 17:25:39

by bert hubert

Subject: nfsd/sunrpc boot on reboot in 2.5.44

My Debian sid machine oopses when I run 'sudo reboot'.

$ mount
/dev/hdb2 on / type ext3 (rw,errors=remount-ro)
proc on /proc type proc (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/hdb4 on /mnt type ext2 (rw,errors=remount-ro)
nodev on /dev/oprofile type oprofilefs (rw)
/dev/hda1 on /images type ext2 (rw)

No NFS activity involved.

By the way, can anybody tell me how to convert this:
Oct 20 19:21:32 hubert kernel: [<c8831060>] auth_domain_drop+0x50/0x60 [sunrpc]

To a line in auth_domain_drop()?

I'm looking if I can reproduce this.

Oct 20 13:15:28 hubert kernel: EXT2-fs warning (device ide0(3,68)): ext2_fill_super: mounting ext3 filesystem as ext2
Oct 20 13:15:28 hubert kernel:
Oct 20 13:15:30 hubert kernel: Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
Oct 20 13:15:49 hubert kernel: MTRR: setting reg 1
(I mount my /images kernel images partition: )
Oct 20 19:21:01 hubert kernel: EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
(I entered reboot: )
Oct 20 19:21:28 hubert kernel: MTRR: setting reg 1
Oct 20 19:21:32 hubert kernel: nfsd: last server has exited
Oct 20 19:21:32 hubert kernel: nfsd: unexporting all filesystems
Oct 20 19:21:32 hubert kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
Oct 20 19:21:32 hubert kernel: printing eip:
Oct 20 19:21:32 hubert kernel: 00000000
Oct 20 19:21:32 hubert kernel: *pde = 00000000
Oct 20 19:21:32 hubert kernel: Oops: 0000
Oct 20 19:21:32 hubert kernel: soundcore nfsd lockd sunrpc exportfs
Oct 20 19:21:32 hubert kernel: CPU: 0
Oct 20 19:21:32 hubert kernel: EIP: 0060:[<00000000>] Not tainted
Oct 20 19:21:32 hubert kernel: EFLAGS: 00010202
Oct 20 19:21:32 hubert kernel: eax: c88388b8 ebx: 7fffffff ecx: 00000001 edx: c7643280
Oct 20 19:21:32 hubert kernel: esi: c8839c3c edi: 00000001 ebp: c78c6000 esp: c78c7f7c
Oct 20 19:21:32 hubert kernel: ds: 0068 es: 0068 ss: 0068
Oct 20 19:21:32 hubert kernel: Process nfsd (pid: 231, threadinfo=c78c6000 task=c12826c0)
Oct 20 19:21:32 hubert kernel: Stack: c8831060 c7643280 c88330dd c7643280 c8838800 c78c6000 c77281e0 c1327e00
Oct 20 19:21:32 hubert kernel: c88330f5 c8862c84 c8831acb c8838800 c8859fe0 c8862c40 00000042 00000001
Oct 20 19:21:32 hubert kernel: c885137c c8861280 c1327e00 000493e0 c78c6000 c8862520 c8862520 c12826c0
Oct 20 19:21:32 hubert kernel: Call Trace:
Oct 20 19:21:32 hubert kernel: [<c8831060>] auth_domain_drop+0x50/0x60 [sunrpc]
Oct 20 19:21:32 hubert kernel: [<c88330dd>] cache_clean+0x16d/0x170 [sunrpc]
Oct 20 19:21:32 hubert kernel: [<c8838800>] auth_domain_cache+0x0/0x60 [sunrpc]
Oct 20 19:21:32 hubert kernel: [<c88330f5>] cache_flush+0x15/0x50 [sunrpc]
Oct 20 19:21:32 hubert kernel: [<c8862c84>] hash_sem+0x0/0x1c [nfsd]
Oct 20 19:21:32 hubert kernel: [<c8831acb>] svcauth_unix_purge+0x1b/0x20 [sunrpc]
Oct 20 19:21:32 hubert kernel: [<c8838800>] auth_domain_cache+0x0/0x60 [sunrpc]
Oct 20 19:21:32 hubert kernel: [<c8859fe0>] nfsd_export_shutdown+0x70/0xe0 [nfsd]
Oct 20 19:21:32 hubert kernel: [<c8862c40>] svc_export_cache+0x0/0x44 [nfsd]
Oct 20 19:21:32 hubert kernel: [<c885137c>] nfsd+0x19c/0x1f0 [nfsd]
Oct 20 19:21:32 hubert kernel: [<c8861280>] .rodata.str1.32+0x80/0x1320 [nfsd]
Oct 20 19:21:32 hubert kernel: [<c8862520>] nfsd_list+0x0/0x8 [nfsd]
Oct 20 19:21:32 hubert kernel: [<c8862520>] nfsd_list+0x0/0x8 [nfsd]
Oct 20 19:21:32 hubert kernel: [<c88511e0>] nfsd+0x0/0x1f0 [nfsd]
Oct 20 19:21:32 hubert kernel: [kernel_thread_helper+5/16] kernel_thread_helper+0x5/0x10
Oct 20 19:21:32 hubert kernel:
Oct 20 19:21:32 hubert kernel: Code: Bad EIP value.


--
http://www.PowerDNS.com Versatile DNS Software & Services
http://lartc.org Linux Advanced Routing & Traffic Control HOWTO


2002-10-20 18:15:10

by bert hubert

Subject: Re: nfsd/sunrpc boot on reboot in 2.5.44

On Sun, Oct 20, 2002 at 07:31:42PM +0200, bert hubert wrote:

> I'm looking if I can reproduce this.

Like clockwork, it happens just after 'Unexporting directories'. If you run
'exportfs -au' on its own first, nothing happens. The oops only occurs when
the entire stop script runs.

  stop)
        printf "Stopping $DESC: mountd"
        start-stop-daemon --stop --oknodo --quiet \
            --name rpc.mountd --user 0
        printf " nfsd"
        start-stop-daemon --stop --oknodo --quiet \
            --name nfsd --user 0 --signal 2
        echo "."

        printf "Unexporting directories for $DESC..."
        $PREFIX/sbin/exportfs -au
        echo "done."
        ;;

This is /etc/exports:
# /etc/exports: the access control list for filesystems which may be exported
# to NFS clients. See exports(5)
/ 10.0.0.0/255.0.0.0(rw)
/mnt 10.0.0.0/255.0.0.0(rw)

Regards,

bert


2002-10-20 23:22:34

by Hirokazu Takahashi

Subject: Re: nfsd/sunrpc boot on reboot in 2.5.44

--- linux/net/sunrpc/svcauth.c.ORG Sun Oct 20 15:47:26 2030
+++ linux/net/sunrpc/svcauth.c Sun Oct 20 15:50:37 2030
@@ -143,8 +143,9 @@ static struct cache_head *auth_domain_ta
void auth_domain_drop(struct cache_head *item, struct cache_detail *cd)
{
struct auth_domain *dom = container_of(item, struct auth_domain, h);
- if (cache_put(item,cd))
- authtab[dom->flavour]->domain_release(dom);
+ void (*fn)(struct auth_domain *) = authtab[dom->flavour]->domain_release;
+ if (cache_put(item,cd) && fn != NULL)
+ fn(dom);
}



Attachments:
rpcdomain-fix2.5.43.patch (532.00 B)
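
The idea in the patch above can be tried outside the kernel: read the release hook out of the ops table before the reference is dropped, and skip the call if the hook is NULL. What follows is a minimal userspace sketch, not the sunrpc code. The names `auth_ops_model`, `domain`, `ops_table`, `domain_put`, `domain_drop` and `mark_released` are all made up for illustration, and the extra check for an empty table slot is an addition of the sketch that the patch does not make.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct domain {
    int refcount;   /* number of outstanding references */
    int flavour;    /* index into ops_table */
    int released;   /* set by the release hook, for demonstration only */
};

struct auth_ops_model {
    void (*domain_release)(struct domain *);   /* may legitimately be NULL */
};

#define MAX_FLAVOUR 2
struct auth_ops_model *ops_table[MAX_FLAVOUR];

/* Example release hook: just records that it ran. */
void mark_released(struct domain *dom)
{
    dom->released = 1;
}

/* Drop one reference; returns 1 when the last reference is gone. */
int domain_put(struct domain *dom)
{
    assert(dom->refcount > 0);
    return --dom->refcount == 0;
}

/* Drop a reference and call the release hook only if one is installed. */
void domain_drop(struct domain *dom)
{
    struct auth_ops_model *ops = ops_table[dom->flavour];
    /* Fetch the hook before the put; tolerate a missing slot or hook. */
    void (*fn)(struct domain *) = ops ? ops->domain_release : NULL;

    if (domain_put(dom) && fn != NULL)
        fn(dom);
}
```

With a hook installed, the second of two `domain_drop()` calls on an object holding two references runs the hook. With a NULL hook the last drop simply returns instead of jumping to address 0, which is what the oops in this thread shows (EIP 00000000).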

2002-10-21 03:20:46

by NeilBrown

Subject: Re: nfsd/sunrpc boot on reboot in 2.5.44

On Sunday October 20, bert hubert wrote:
>
> By the way, can anybody tell me how to convert this:
> Oct 20 19:21:32 hubert kernel: [<c8831060>] auth_domain_drop+0x50/0x60 [sunrpc]
>
> To a line in auth_domain_drop()?
>

gdb sunrpc.o
disassemble auth_domain_drop

stare at assembler listing, stare at source code....

NeilBrown
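
For reference, a trace entry like the one quoted above packs several facts into one line: the address found on the stack, the enclosing symbol, the offset into that symbol, the symbol's total size, and the module name in brackets. The offset is the number to look for in the disassembly. Below is a minimal C sketch that splits such a line. The struct and function names are invented for illustration; this is not a kernel or klogd interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One decoded call-trace entry, e.g.
 * "[<c8831060>] auth_domain_drop+0x50/0x60 [sunrpc]". */
struct trace_entry {
    unsigned long addr;     /* address printed in the trace */
    char symbol[64];        /* enclosing function */
    unsigned long offset;   /* bytes from the start of the function */
    unsigned long size;     /* total size of the function in bytes */
    char module[32];        /* module name, or "" if none was printed */
};

/* Find "[<" in a log line and decode the entry that starts there.
 * Returns 0 on success, -1 if the line holds no such entry. */
int parse_trace_entry(const char *line, struct trace_entry *out)
{
    const char *p;
    int n;

    assert(line != NULL && out != NULL);
    p = strstr(line, "[<");
    if (p == NULL)
        return -1;
    out->module[0] = '\0';
    n = sscanf(p, "[<%lx>] %63[^+]+0x%lx/0x%lx [%31[^]]]",
               &out->addr, out->symbol, &out->offset, &out->size,
               out->module);
    /* The module part is optional; the first four fields are not. */
    return n >= 4 ? 0 : -1;
}

/* Address at which the enclosing function starts. */
unsigned long function_start(const struct trace_entry *e)
{
    return e->addr - e->offset;
}
```

For the entry above this yields offset 0x50 in a function 0x60 bytes long, starting at c8831010. Addresses in a call trace are return addresses, so the instruction of interest is normally the call just before that offset.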

2002-10-21 03:19:33

by NeilBrown

Subject: Re: nfsd/sunrpc boot on reboot in 2.5.44

On Sunday October 20, bert hubert wrote:
> My Debian sid machine oopses when I run 'sudo reboot'.
>

Sorry 'bout that. Appended patch should fix it.
When the last daemon exits, all exports are automatically unexported.
That path was releasing too much, so when "exportfs -a" came along it
tried to access some deallocated data structures.

NeilBrown

-------------------------
Fix nfs shutdown problem.

The 'unexport everything' that happens when the
last nfsd thread dies was shutting down too much -
things that should only be shut down on module unload.



----------- Diffstat output ------------
./fs/nfsd/export.c | 31 ++++++++++++-------------------
./fs/nfsd/nfssvc.c | 2 +-
./include/linux/nfsd/export.h | 1 +
./net/sunrpc/sunrpc_syms.c | 1 +
4 files changed, 15 insertions(+), 20 deletions(-)

--- ./fs/nfsd/export.c 2002/10/20 23:48:51 1.1
+++ ./fs/nfsd/export.c 2002/10/21 03:23:44 1.2
@@ -738,23 +738,6 @@ exp_do_unexport(svc_export *unexp)
exp_fsid_unhash(unexp);
}

-/*
- * Revoke all exports for a given client.
- */
-static void
-exp_unexport_all(svc_client *clp)
-{
- struct svc_export *exp;
- int index;
-
- dprintk("unexporting all fs's for clnt %p\n", clp);
-
- cache_for_each(exp, &svc_export_cache, index, h)
- if (exp->ex_client == clp)
- exp_do_unexport(exp);
- cache_flush();
-
-}

/*
* unexport syscall.
@@ -1109,6 +1092,18 @@ nfsd_export_init(void)
}

/*
+ * Flush exports table - called when last nfsd thread is killed
+ */
+void
+nfsd_export_flush(void)
+{
+ exp_writelock();
+ cache_purge(&svc_expkey_cache);
+ cache_purge(&svc_export_cache);
+ exp_writeunlock();
+}
+
+/*
* Shutdown the exports module.
*/
void
@@ -1119,8 +1114,6 @@ nfsd_export_shutdown(void)

exp_writelock();

- exp_unexport_all(NULL);
-
if (cache_unregister(&svc_expkey_cache))
printk(KERN_ERR "nfsd: failed to unregister expkey cache\n");
if (cache_unregister(&svc_export_cache))
--- ./fs/nfsd/nfssvc.c 2002/10/20 23:53:59 1.1
+++ ./fs/nfsd/nfssvc.c 2002/10/21 03:23:44 1.2
@@ -238,7 +238,7 @@ nfsd(struct svc_rqst *rqstp)
printk(KERN_WARNING "nfsd: last server has exited\n");
if (err != SIG_NOCLEAN) {
printk(KERN_WARNING "nfsd: unexporting all filesystems\n");
- nfsd_export_shutdown();
+ nfsd_export_flush();
}
nfsd_serv = NULL;
nfsd_racache_shutdown(); /* release read-ahead cache */
--- ./include/linux/nfsd/export.h 2002/10/20 23:53:38 1.1
+++ ./include/linux/nfsd/export.h 2002/10/21 03:23:44 1.2
@@ -83,6 +83,7 @@ struct svc_expkey {
*/
void nfsd_export_init(void);
void nfsd_export_shutdown(void);
+void nfsd_export_flush(void);
void exp_readlock(void);
void exp_readunlock(void);
struct svc_expkey * exp_find_key(struct auth_domain *clp,
--- ./net/sunrpc/sunrpc_syms.c 2002/10/20 23:53:09 1.1
+++ ./net/sunrpc/sunrpc_syms.c 2002/10/21 03:23:44 1.2
@@ -101,6 +101,7 @@ EXPORT_SYMBOL(auth_unix_lookup);
EXPORT_SYMBOL(cache_check);
EXPORT_SYMBOL(cache_clean);
EXPORT_SYMBOL(cache_flush);
+EXPORT_SYMBOL(cache_purge);
EXPORT_SYMBOL(cache_fresh);
EXPORT_SYMBOL(cache_init);
EXPORT_SYMBOL(cache_register);
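
The distinction the patch draws, between flushing a cache and shutting it down, can be shown with a small userspace model. This is only a sketch under loud assumptions: `cache_model` and its functions are hypothetical and are not the sunrpc cache API. The `registered` flag exists only to make a use-after-shutdown visible as an error return; in the kernel the same mistake touches freed structures instead.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of a registered cache; not the sunrpc cache API. */
struct entry {
    int key;
    struct entry *next;
};

struct cache_model {
    struct entry *head;   /* cached entries */
    int registered;       /* 1 between init and shutdown */
};

void cache_model_init(struct cache_model *c)
{
    c->head = NULL;
    c->registered = 1;
}

/* Add an entry; returns -1 if the cache was shut down or malloc failed. */
int cache_model_add(struct cache_model *c, int key)
{
    struct entry *e;

    if (!c->registered)
        return -1;
    e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->key = key;
    e->next = c->head;
    c->head = e;
    return 0;
}

/* Number of entries currently held. */
int cache_model_count(const struct cache_model *c)
{
    const struct entry *e;
    int n = 0;

    for (e = c->head; e != NULL; e = e->next)
        n++;
    return n;
}

/* Flush: free every entry but leave the cache usable afterwards. */
void cache_model_flush(struct cache_model *c)
{
    while (c->head != NULL) {
        struct entry *e = c->head;
        c->head = e->next;
        free(e);
    }
}

/* Shutdown: flush and unregister; only appropriate at "module unload". */
void cache_model_shutdown(struct cache_model *c)
{
    cache_model_flush(c);
    c->registered = 0;
}
```

After `cache_model_flush()` a later add still succeeds, which is the behaviour wanted when the last nfsd thread exits but the module stays loaded. After `cache_model_shutdown()` any further use is an error.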

2002-10-21 06:28:49

by bert hubert

Subject: Re: nfsd/sunrpc boot on reboot in 2.5.44

On Mon, Oct 21, 2002 at 01:26:26PM +1000, Neil Brown wrote:

> > By the way, can anybody tell me how to convert this:
> > Oct 20 19:21:32 hubert kernel: [<c8831060>] auth_domain_drop+0x50/0x60 [sunrpc]
> >
> > To a line in auth_domain_drop()?
>
> gdb sunrpc.o
> disassemble auth_domain_drop
>
> stare at assembler listing, stare at source code....

I also found this to work:

touch sunrpc.c
make
[ observe how sunrpc.o gets compiled ]
[ add a -g to the commandline ]
gdb sunrpc.o
l *(auth_domain_drop+0x50)

"A #kernelnewbies discovery".

Thanks for the swift patch! Will check if it helps.

Regards,

bert hubert


2002-10-21 14:31:19

by Kai Germaschewski

Subject: Re: nfsd/sunrpc boot on reboot in 2.5.44

On Mon, 21 Oct 2002, bert hubert wrote:

> I also found this to work:
>
> touch sunrpc.c
> make
> [ observe how sunrpc.o gets compiled ]
> [ add a -g to the commandline ]
> gdb sunrpc.o
> l *(auth_domain_drop+0x50)

Or even

make net/sunrpc/svcauth.lst

and stare at net/sunrpc/svcauth.lst ;)

(it also gives you a svcauth.o compiled with -g, so you can use the gdb
thing above, too)

--Kai