2002-07-29 00:21:36

by Jonathan Woithe

Subject: Linux 2.2 client trouble after 2.4.18 server reboot

Hi all

[Please CC me replies since I'm not subscribed to this list]

Recently I upgraded our NFS server to Linux 2.4.18. This server provides
exported home directories for our user Linux boxes which are currently
running 2.2.19.

Most of the time everything seems to be going well. However, if the server
is rebooted the clients seem unable to restore the NFS connection to a
usable state. The logs note the server becoming unavailable and then
following the reboot they note that the server is now "ok". However, any
attempt to then access the mounted volumes is greeted with a "permission
denied" error. Even root can't access the drives.

If the NFS volumes are unmounted and remounted the problem (predictably)
goes away. Of course such an unmount can't occur until people have logged
off, which makes it a little cumbersome.

The exported directory on the server is a ReiserFS filesystem, exported with
only "rw" options. The client mounts with "hard,intr,bg" options. NFSv3
support is not compiled in on the 2.2.19 clients. The server is a default
slackware configuration (from the point of view of NFS):
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
# CONFIG_ROOT_NFS is not set
CONFIG_NFSD=m
CONFIG_NFSD_V3=y
CONFIG_SUNRPC=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
The server also has the lockd oops fix from Trond.

I would appreciate any pointers to ways to stop this from happening.
Requiring that all users log out before a server reboot is rather
inconvenient.

Best regards
jonathan
--
* Jonathan Woithe [email protected] *
* http://www.physics.adelaide.edu.au/~jwoithe *
***-----------------------------------------------------------------------***
** "Time is an illusion; lunchtime doubly so" **
* "...you wouldn't recognize a subtle plan if it painted itself purple and *
* danced naked on a harpsichord singing 'subtle plans are here again'" *


-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2002-07-29 10:56:17

by James Pearson

Subject: Re: Linux 2.2 client trouble after 2.4.18 server reboot

Have a look through the archives of this list at:

http://marc.theaimsgroup.com/?l=linux-nfs&r=1&w=2

It could be similar to the problem reported in thread:

http://marc.theaimsgroup.com/?l=linux-nfs&m=101796943306978&w=2

James Pearson




2002-07-29 11:05:27

by Trond Myklebust

Subject: Linux 2.2 client trouble after 2.4.18 server reboot

>>>>> " " == Jonathan Woithe <[email protected]> writes:

> The exported directory on the server is a ReiserFS filesystem,
> exported with only "rw" options. The client mounts with
> "hard,intr,bg" options. NFSv3 support is not compiled in on
> the 2.2.19 clients. The server is a default slackware
> configuration (from the point of view of NFS):

Was that ReiserFS system created on a 2.2.x system? If so, it is a
known bug in ReiserFS itself. There is a mount option (can't remember
the name on the spur of the moment, but you'll find it described on
the 'mount' manpage in http://www.namesys.com and several times
in the NFS list archives) that updates the filesystem
to the latest ReiserFS format.

Cheers,
Trond



2002-07-29 13:01:56

by Sampo Nurmentaus

Subject: Weird error with nfs server and an embedded linux system.



Hello folks,

I ported all the nfs stuff to our embedded system running 2.4.14
on cris platform. All the daemons start up nicely except
the statd that prints to log that:

Jul 29 12:58:06 noisy25 statd[124]: Version from Hell! Starting

But it starts up anyway.

When I try to mount a share, syslog says that the mount request is
authenticated and my laptop's IP appears in xtab, but the mount still fails
with the following error message:

[root@kombo root]# mount -t nfs -o nolock 120.100.100.25:/tmp/hd /mnt/keke/
mount: RPC: Remote system error


When I run rpcinfo on my laptop against the server
I get the following output:

[root@kombo root]# rpcinfo -p 120.100.100.25
   program vers proto   port
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100003    2   udp   2049  nfs
    100003    3   udp   2049  nfs
    100021    1   udp   2048  nlockmgr
    100021    3   udp   2048  nlockmgr
    100021    4   udp   2048  nlockmgr
    100011    1   udp    715  rquotad
    100011    2   udp    715  rquotad
    100005    1   udp   2050  mountd
    100005    1   tcp   2059  mountd
    100005    2   udp   2050  mountd
    100005    2   tcp   2059  mountd
    100005    3   udp   2050  mountd
    100005    3   tcp   2059  mountd
    100024    1   udp   2051  status
    100024    1   tcp   2060  status
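
A listing like the one above can be summarised per service, which makes it easier to see which transports each RPC program is registered on. The following is only a minimal sketch in Python: the function names `parse_rpcinfo` and `transports` are hypothetical helpers, and the parser assumes the five-column layout shown in this message.

```python
def parse_rpcinfo(text):
    """Parse `rpcinfo -p` style output into (program, version, proto, port, name) tuples.

    Assumes the five-column layout shown in the message; the header line
    and anything else that does not match is skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        prog, vers, proto, port, name = fields
        rows.append((int(prog), int(vers), proto, int(port), name))
    return rows


def transports(rows):
    """Map each service name to the sorted list of transports it is registered on."""
    seen = {}
    for _prog, _vers, proto, _port, name in rows:
        seen.setdefault(name, set()).add(proto)
    return {name: sorted(protos) for name, protos in seen.items()}


if __name__ == "__main__":
    # A few lines copied from the listing above.
    sample = """program vers proto port
100003 2 udp 2049 nfs
100005 1 udp 2050 mountd
100005 1 tcp 2059 mountd
"""
    for name, protos in transports(parse_rpcinfo(sample)).items():
        print(name, ",".join(protos))
```

Run against the full listing, this would report nfs on udp only and mountd on both tcp and udp, which is simply a restatement of the table.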

And my xtab on the embedded board includes the ip of my laptop
as it should.


Any ideas what might be going wrong?


Thanks,

Sampo Nurmentaus,
Software Developers,
Audio Riders Inc.




2002-07-29 13:14:50

by Trond Myklebust

Subject: Re: Weird error with nfs server and an embedded linux system.

>>>>> " " == Sampo Nurmentaus <[email protected]> writes:

> Hello folks,

> I ported all the nfs stuff to our embedded system running
> 2.4.14 on cris platform. All the daemons start up nicely
> except the statd that prints to log that:

> Jul 29 12:58:06 noisy25 statd[124]: Version from Hell! Starting

> But it starts up anyway.

> When I try to mount a share, syslog says that the mount request
> is authenticated and my laptop's IP appears in xtab, but the
> mount still fails with the following error message:

> [root@kombo root]# mount -t nfs -o nolock 120.100.100.25:/tmp/hd /mnt/keke/
> mount: RPC: Remote system error

If you are mounting using 'nolock', then you won't care if statd is
running or not. rpc.statd is only required if you need file locking
over NFS.

Your mount error is therefore likely to be due to some other
problem. I suggest you have a look at the RPC exchange between client
and server using the 'ethereal' program. That should show you what is
going wrong...

Cheers,
Trond



2002-07-29 14:11:43

by Sampo Nurmentaus

Subject: Re: Weird error with nfs server and an embedded linux system.


Thanks for the tip Trond,

It really seems that there is something weird going on.

Portmapping seems to work fine and the mount
appears in rmtab, but then ethereal reports
something it calls: "Unknown WCCP package"

I first thought that this was related to some other traffic
between these machines, but this weird packet is sent
from my laptop to the server every time the error occurs!

For some reason ethereal was not able to parse all the
RPC calls, only those made by portmap.

Has anyone seen this kind of behaviour before?


Sampo




2002-07-29 23:30:30

by Jonathan Woithe

Subject: Re: Linux 2.2 client trouble after 2.4.18 server reboot

Hi Trond

> >>>>> " " == Jonathan Woithe <[email protected]> writes:
> > The exported directory on the server is a ReiserFS filesystem,
> > exported with only "rw" options. The client mounts with
> > "hard,intr,bg" options. NFSv3 support is not compiled in on
> > the 2.2.19 clients. The server is a default slackware
> > configuration (from the point of view of NFS):
>
> Was that ReiserFS system created on a 2.2.x system? If so, it is a
> known bug in ReiserFS itself.

No. The ReiserFS system was freshly created under 2.4.18.

> There is a mount option ... that updates the filesystem to the latest
> ReiserFS format.

I'll look at this and give it a go anyway ... in the absence of any other
ideas.

Regards
jonathan



2002-07-30 07:44:34

by Jonathan Woithe

Subject: Re: Linux 2.2 client trouble after 2.4.18 server reboot

Hi Trond and others

[ NFS client troubles after server rebooted ]

> >>>>> " " == Jonathan Woithe <[email protected]> writes:
> > The exported directory on the server is a ReiserFS filesystem,
> > exported with only "rw" options. The client mounts with
> > "hard,intr,bg" options. NFSv3 support is not compiled in on
> > the 2.2.19 clients. The server is a default slackware
> > configuration (from the point of view of NFS):
>
> Was that ReiserFS system created on a 2.2.x system? If so, it is a
> known bug in ReiserFS itself. There is a mount option (can't remember
> the name on the spur of the moment, but you'll find it described on
> the 'mount' manpage in http://www.namesys.com and several times
> in the NFS list archives) that updates the filesystem
> to the latest ReiserFS format.

A few more details from tests I've done today. Under Slackware 8.1 (what
we're using), the NFS system is shut down with an `exportfs -au' before a
reboot. If I carry out a stop/start cycle on the NFS system, the clients
have no problem. It seems the problem only occurs across reboots, not NFS
server restarts.

Furthermore, I've found that if I do an `exportfs -a' immediately after the
server reboot then the clients are able to access the NFS volume again. And
yes, exportfs is called before rpc.mountd under Slackware 8.1. Before the
second exportfs invocation, the /var/lib/nfs/xtab file is empty, whereas
afterwards it contains details of the exported filesystems. I don't know if
this is relevant, but it is the only file which changes as a result of the
second call.

Does this mean that an extra `exportfs -r' (or `exportfs -a') is required
during boot to resync this, or does it indicate another problem?
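
The check described above (is xtab empty after boot, and if so re-run the export) could be sketched roughly as follows. This is an illustration only, not what Slackware's boot scripts do: the helper names are hypothetical, the path and the `exportfs -a` command are taken from this message, and any non-blank line in xtab is simply assumed to be an export entry.

```python
import subprocess

XTAB_PATH = "/var/lib/nfs/xtab"  # path quoted in the message above


def read_xtab(path=XTAB_PATH):
    """Return the contents of the xtab file, or an empty string if it is missing."""
    try:
        with open(path) as fh:
            return fh.read()
    except FileNotFoundError:
        return ""


def exported_entries(xtab_text):
    """Return the non-blank lines of xtab (assumption: one export entry per line)."""
    return [line.strip() for line in xtab_text.splitlines() if line.strip()]


def reexport_if_empty(xtab_text, run=subprocess.run):
    """Run `exportfs -a` when xtab lists nothing; return True if it was run."""
    if exported_entries(xtab_text):
        return False
    run(["exportfs", "-a"], check=True)
    return True


if __name__ == "__main__":
    # Would need to run as root on the NFS server, after rpc.mountd has started.
    if reexport_if_empty(read_xtab()):
        print("xtab was empty; re-ran exportfs -a")
```

The `run` parameter exists only so the logic can be exercised without actually calling exportfs.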

In regards to the ReiserFS version, `debugreiserfs <dev>' reports a
superblock of version 3.6.

Regards
jonathan



2002-08-12 23:44:57

by Jonathan Woithe

Subject: Re: Linux 2.2 client trouble after 2.4.18 server reboot

Hi all

I have finally managed to track down what appears to have caused this
problem. In short, the list of currently exported volumes used machine
names to identify which machines had an active mount at the time the machine
was shut down. During boot (under Slackware 8.1 at least), NFS was started
before our local named server, which meant that the machine names didn't
resolve at the time the NFS server was started. It seems that when this
occurs the NFS system doesn't retry the name resolution at a later time at
all. Because the names don't resolve, it seems the NFS server decides to
make these prior mounts inaccessible. The situation can be fixed by doing
a re-export following the start of named.

As far as I can see there's no reason why the NFS system has to start before
named (and NIS/LDAP for that matter). If it were started afterwards, name
mapping would definitely be available when NFS started, and so prior mounts
would be resumed without trouble. On our server I have moved NFS startup to
after named with no ill effects. Is there any reason not to do this
routinely?
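
Where reordering the boot scripts is not convenient, the workaround mentioned above (a re-export once named is answering) might be sketched along these lines. This is a rough illustration under stated assumptions: the client names in `CLIENT_HOSTS` are hypothetical placeholders, the retry count and delay are arbitrary, and `exportfs -r` is the re-export command mentioned earlier in this thread.

```python
import socket
import subprocess
import time

# Hypothetical client names; in practice these would be the hosts in /etc/exports.
CLIENT_HOSTS = ["client1.example.org", "client2.example.org"]


def unresolved(hosts, resolve=socket.gethostbyname):
    """Return the hosts that do not currently resolve."""
    missing = []
    for host in hosts:
        try:
            resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            missing.append(host)
    return missing


def reexport_when_resolvable(hosts, resolve=socket.gethostbyname,
                             run=subprocess.run, attempts=30, delay=2.0,
                             sleep=time.sleep):
    """Wait until every host resolves, then run `exportfs -r`.

    Returns True if the re-export was run, False if name resolution
    never became available within the given number of attempts.
    """
    for _ in range(attempts):
        if not unresolved(hosts, resolve):
            run(["exportfs", "-r"], check=True)
            return True
        sleep(delay)
    return False


if __name__ == "__main__":
    # Would need to run as root on the NFS server, late in the boot sequence.
    if not reexport_when_resolvable(CLIENT_HOSTS):
        print("client names still do not resolve; exports left untouched")
```

Starting NFS after named, as described above, avoids the need for any polling like this.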

Regards
jonathan



