2006-01-16 09:16:11

by Wally Winchester

Subject: Stale nfs file handle - cwd without path - workaround?

Hello,

I have an NFS setup with heartbeat failover that is working nicely. I
have only one problem, and I hope someone can help.

When nfs1 dies, and nfs2 takes over, everything works - as long as the
clients are writing files using the full paths.
If the path is omitted, i.e. they are expecting a write to their cwd, a
"stale nfs file handle" warning is issued.

Is there a way around this?

e.g. this works
touch /nfs-mount/file
(failover)
touch /nfs-mount/file

e.g. this doesn't work
cd /nfs-mount/
touch file
(failover)
touch file
(stale nfs file handle)

Thanks
--
Wally Winchester
[email protected]

--
http://www.fastmail.fm - Access all of your messages and folders
wherever you are



-------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
for problems? Stop! Download the new AJAX search engine that makes
searching your log files as easy as surfing the web. DOWNLOAD SPLUNK!
http://ads.osdn.com/?ad_id=7637&alloc_id=16865&op=click
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2006-01-18 18:18:38

by Bernd Schubert

Subject: Re: Stale nfs file handle - cwd without path - workaround?

Hi Wally,

sorry for the late response, but I'm very busy right now.

> The setup is two virtual machines on VMWare ESX server.
> They talk to a shared partition which is really a disk file on a SAN.
>
> Perhaps interestingly, this works:
> cd /nfs-mount/
> touch file
> (failover)
> touch file
> (stale nfs file handle)
> cd .
> touch file
> (works)
>
> $ cat /etc/exports
> /srv/nfs *(rw,no_root_squash,insecure,sync)
>
> $ ls -la /var/lib/|grep nfs
> nfs -> /srv/nfs/nfs/
>
> $ ls /srv/nfs/nfs/
> etab rmtab sm state xtab
>
> This is a Debian Sarge install.

Which kernel version are you using? If it's the Debian kernel, try a newer one
(on the client and on the server). It's most likely not related, but you
should also add an fsid entry to your exports.
Sorry, I can't help more, since I'm not at all familiar with your virtual
machine + SAN setup.
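
A minimal sketch of the fsid suggestion, assuming the /srv/nfs export shown earlier in this thread. The value fsid=1 and the scratch file are illustrative assumptions, not something taken from the original setup. Without an explicit fsid, the server usually derives part of the file handle from the device number of the exported filesystem, which may differ between the two servers.

```shell
# Sketch only: write the candidate export line to a scratch file
# rather than editing the live /etc/exports.
exports_file=$(mktemp)

# fsid=1 is an arbitrary illustrative value; what matters is that both
# servers export the shared filesystem with the same fsid.
cat > "$exports_file" <<'EOF'
/srv/nfs *(rw,no_root_squash,insecure,sync,fsid=1)
EOF

cat "$exports_file"
```

Once the same line is in /etc/exports on both nfs1 and nfs2, `exportfs -ra` re-exports it. fsid=0 is best avoided here, since it has a special meaning for NFSv4.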

Cheers,
Bernd


--
Bernd Schubert
PCI / Theoretische Chemie
Universität Heidelberg
INF 229
69120 Heidelberg




2006-01-18 21:06:24

by Trond Myklebust

Subject: Re: Stale nfs file handle - cwd without path - workaround?

On Wed, 2006-01-18 at 08:43 +0000, Wally Winchester wrote:
> > > When nfs1 dies, and nfs2 takes over, everything works - as long as the
> > > clients are writing files using the full paths.
> > > If the path is omitted, i.e. they are expecting a write to their cwd, a
> > > "stale nfs file handle" warning is issued.
> > >
> > > Is there a way around this?
> > >
> > > e.g. this works
> > > touch /nfs-mount/file
> > > (failover)
> > > touch /nfs-mount/file
> > >
> > > e.g. this doesn't work
> > > cd /nfs-mount/
> > > touch file
> > > (failover)
> > > touch file
> > > (stale nfs file handle)
> >
> > really strange, we never had this problem. I think you have to provide
> > some
> > more information (/etc/exports of both servers, kernel version,
> > distribution,
> > how you handle /var/lib/nfs, drbd? configuration, etc.).
> >
> > Cheers,
> > Bernd
> >
> >
> >
>
> As a follow up to my reply,
> can anyone suggest how to troubleshoot this? Is there a way of getting
> an nfs client to not cache the details of the cwd (if this is the
> problem)?
>
> I hope someone knows!

You have never actually told anybody which kernel version you are using.
My guess, however, would be either a 2.4.x kernel, or an early 2.6.x.

IOW: this sort of thing should be fixed in more recent 2.6.x kernels.
Please upgrade, and try again.

Cheers,
Trond




2006-01-23 13:23:37

by Wally Winchester

Subject: Re: Stale nfs file handle - cwd without path - workaround?

On Wed, 18 Jan 2006 19:18:17 +0100, "Bernd Schubert"
<[email protected]> said:
> Hi Wally,
>
> sorry for the late response, but I'm very busy right now.
>
> > The setup is two virtual machines on VMWare ESX server.
> > They talk to a shared partition which is really a disk file on a SAN.
> >
> > Perhaps interestingly, this works:
> > cd /nfs-mount/
> > touch file
> > (failover)
> > touch file
> > (stale nfs file handle)
> > cd .
> > touch file
> > (works)
> >
> > $ cat /etc/exports
> > /srv/nfs *(rw,no_root_squash,insecure,sync)
> >
> > $ ls -la /var/lib/|grep nfs
> > nfs -> /srv/nfs/nfs/
> >
> > $ ls /srv/nfs/nfs/
> > etab rmtab sm state xtab
> >
> > This is a Debian Sarge install.
>
> Which kernel version are you using? If it's the Debian kernel, try a newer
> one (on the client and on the server). It's most likely not related, but you
> should also add an fsid entry to your exports.
> Sorry, I can't help more, since I'm not at all familiar with your virtual
> machine + SAN setup.
>
> Cheers,
> Bernd
>
>
> --
> Bernd Schubert
> PCI / Theoretische Chemie
> Universität Heidelberg
> INF 229
> 69120 Heidelberg
>


Hello,

Thanks very much for the replies.

The client is running 2.6.12-1.1398_FC4smp
The server is now Ubuntu, running 2.6.12-9-686.

Ubuntu seems to work better (although not as I would expect).
After the failover, it takes 10-15 minutes for a result to come back. Is
there some way of tuning this?
To make matters worse, not all files written to one NFS server show up on the
other after the failover.

I have tried setting the fsid, but it doesn't seem to help either.


The setup: Two virtual machines, connected by heartbeat. When heartbeat
is stopped on one machine, it shuts down nfs-kernel-server and unmounts the
shared SCSI drive; the other machine then mounts it and starts the NFS
server. They also have a shared IP address which moves in the same way. The
shared SCSI disk is really a VMware disk image which is shared using
"physical" sharing in the VMware setup.

Eck!
--
Wally Winchester
[email protected]





2006-01-27 11:32:24

by Wally Winchester

Subject: [SOLVED] Re: Stale nfs file handle - cwd without path - workaround?


The problem was related to VMware's locking. Disabling locking makes the
problem go away (heartbeat handles the locking).

On failover, there is no delay. On failback, there is a 15-minute delay, every time.

Where is this 15-minute delay coming from, and can it be tweaked?

Thanks very much :)
--
Wally Winchester
[email protected]





2006-01-16 23:43:30

by Bernd Schubert

Subject: Re: Stale nfs file handle - cwd without path - workaround?

Hello Wally,

> When nfs1 dies, and nfs2 takes over, everything works - as long as the
> clients are writing files using the full paths.
> If the path is omitted, i.e. they are expecting a write to their cwd, a
> "stale nfs file handle" warning is issued.
>
> Is there a way around this?
>
> e.g. this works
> touch /nfs-mount/file
> (failover)
> touch /nfs-mount/file
>
> e.g. this doesn't work
> cd /nfs-mount/
> touch file
> (failover)
> touch file
> (stale nfs file handle)

really strange, we never had this problem. I think you have to provide some
more information (/etc/exports of both servers, kernel version, distribution,
how you handle /var/lib/nfs, drbd? configuration, etc.).

Cheers,
Bernd






2006-01-17 08:45:16

by Wally Winchester

Subject: Re: Stale nfs file handle - cwd without path - workaround?

On Tue, 17 Jan 2006 00:43:17 +0100, "Bernd Schubert"
<[email protected]> said:
> Hello Wally,
>
> > When nfs1 dies, and nfs2 takes over, everything works - as long as the
> > clients are writing files using the full paths.
> > If the path is omitted, i.e. they are expecting a write to their cwd, a
> > "stale nfs file handle" warning is issued.
> >
> > Is there a way around this?
> >
> > e.g. this works
> > touch /nfs-mount/file
> > (failover)
> > touch /nfs-mount/file
> >
> > e.g. this doesn't work
> > cd /nfs-mount/
> > touch file
> > (failover)
> > touch file
> > (stale nfs file handle)
>
> really strange, we never had this problem. I think you have to provide
> some
> more information (/etc/exports of both servers, kernel version,
> distribution,
> how you handle /var/lib/nfs, drbd? configuration, etc.).
>
> Cheers,
> Bernd
>
>
>

The setup is two virtual machines on VMWare ESX server.
They talk to a shared partition which is really a disk file on a SAN.

Perhaps interestingly, this works:
cd /nfs-mount/
touch file
(failover)
touch file
(stale nfs file handle)
cd .
touch file
(works)
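
The `cd .` observation above can be turned into a rough client-side workaround. This is only a sketch: `retry_in_cwd` is a hypothetical helper name, and it assumes that re-entering the directory by its absolute path makes the client look the directory up again and obtain a fresh file handle, as the `cd .` test suggests. It retries after any failure, not only after a stale handle.

```shell
# Hypothetical helper: run a command; if it fails, re-enter the current
# directory by its absolute path and retry the command once.
retry_in_cwd() {
    "$@" && return 0
    # Re-resolve the working directory by path (same idea as "cd .").
    cd "$PWD" || return 1
    "$@"
}

# Example use on a client, from inside the NFS mount:
#   cd /nfs-mount/
#   retry_in_cwd touch file
```

This only papers over the symptom on the client; it does not explain why the handle goes stale on failover.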

$ cat /etc/exports
/srv/nfs *(rw,no_root_squash,insecure,sync)

$ ls -la /var/lib/|grep nfs
nfs -> /srv/nfs/nfs/

$ ls /srv/nfs/nfs/
etab rmtab sm state xtab

This is a Debian Sarge install.
--
Wally Winchester
[email protected]





2006-01-18 08:43:45

by Wally Winchester

Subject: Re: Stale nfs file handle - cwd without path - workaround?


On Tue, 17 Jan 2006 00:43:17 +0100, "Bernd Schubert"
<[email protected]> said:
> Hello Wally,
>
> > When nfs1 dies, and nfs2 takes over, everything works - as long as the
> > clients are writing files using the full paths.
> > If the path is omitted, i.e. they are expecting a write to their cwd, a
> > "stale nfs file handle" warning is issued.
> >
> > Is there a way around this?
> >
> > e.g. this works
> > touch /nfs-mount/file
> > (failover)
> > touch /nfs-mount/file
> >
> > e.g. this doesn't work
> > cd /nfs-mount/
> > touch file
> > (failover)
> > touch file
> > (stale nfs file handle)
>
> really strange, we never had this problem. I think you have to provide
> some
> more information (/etc/exports of both servers, kernel version,
> distribution,
> how you handle /var/lib/nfs, drbd? configuration, etc.).
>
> Cheers,
> Bernd
>
>
>

As a follow up to my reply,
can anyone suggest how to troubleshoot this? Is there a way of getting
an nfs client to not cache the details of the cwd (if this is the
problem)?

I hope someone knows!
--
Wally Winchester
[email protected]



