2006-02-16 15:14:28

by mehta kiran

[permalink] [raw]
Subject: Stale handle error on RHEL4 (2.6.9-22.ELsmp)

Hi,
When nfs(v4) server fails over from one system to
other it gets stale handle error.
All systems have RHEL4 installed (same kernel)
uname -a = 2.6.9-22.ELsmp #1 SMP

Details :

[root@vcslinux118 bin]# mount
<---snip>
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)
/dev/sdi2 on /dir2 type ext3 (rw)
/dev/sdi1 on /dir1 type ext3 (rw)
/dir2 on /export/exp2 type bind (rw,bind)
/dir1 on /export/exp1 type bind (rw,bind)
</---snip>

[root@vcslinux118 bin]# exportfs -v
/export/exp1
<world>(rw,wdelay,nohide,no_root_squash)
/export/exp2
<world>(rw,wdelay,nohide,no_root_squash)
/export
<world>(rw,wdelay,nohide,no_root_squash,fsid=0)
[root@vcslinux118 bin]# uname -a
Linux vcslinux118.vxindia.veritas.com 2.6.9-22.ELsmp
#1 SMP Mon Sep 19 18:00:54 EDT 2005 x86_64 x86_64
x86_64 GNU/Linux

I have mounted this on client as
#mount -t nfs4 10.212.102.70:/ /kiran

After failover i get stale handle error on client.
Is this a known issue ?

thanks,
kiran

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com


-------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
for problems? Stop! Download the new AJAX search engine that makes
searching your log files as easy as surfing the web. DOWNLOAD SPLUNK!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2006-02-16 15:27:59

by Trond Myklebust

[permalink] [raw]
Subject: Re: Stale handle error on RHEL4 (2.6.9-22.ELsmp)

On Thu, 2006-02-16 at 07:14 -0800, mehta kiran wrote:
> Hi,
> When nfs(v4) server fails over from one system to
> other it gets stale handle error.
> All systems have RHEL4 installed (same kernel)
> uname -a = 2.6.9-22.ELsmp #1 SMP
>
>
> Details :
>
> [root@vcslinux118 bin]# mount
> <---snip>
> sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
> nfsd on /proc/fs/nfsd type nfsd (rw)
> /dev/sdi2 on /dir2 type ext3 (rw)
> /dev/sdi1 on /dir1 type ext3 (rw)
> /dir2 on /export/exp2 type bind (rw,bind)
> /dir1 on /export/exp1 type bind (rw,bind)
> </---snip>
>
> [root@vcslinux118 bin]# exportfs -v
> /export/exp1
> <world>(rw,wdelay,nohide,no_root_squash)
> /export/exp2
> <world>(rw,wdelay,nohide,no_root_squash)
> /export
> <world>(rw,wdelay,nohide,no_root_squash,fsid=0)
> [root@vcslinux118 bin]# uname -a
> Linux vcslinux118.vxindia.veritas.com 2.6.9-22.ELsmp
> #1 SMP Mon Sep 19 18:00:54 EDT 2005 x86_64 x86_64
> x86_64 GNU/Linux
>
> I have mounted this on client as
> #mount -t nfs4 10.212.102.70:/ /kiran
>
> After failover i get stale handle error on client.
> Is this a known issue ?

It is expected, yes. Your failover system has to be a precise copy of
the original filesystem. If the inode number differs for a given file or
directory, then the filehandles will differ on the original system and
the failover system, and so you will get ESTALE errors.

Cheers,
Trond
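
One way to check the point about inode numbers, as a minimal sketch only: print the device and inode number of each exported path on every node that can run the server, then compare the output between nodes. The function name export_ids is made up for illustration, and the sketch assumes GNU stat (the -c format option) is installed. The paths in the usage comment are the ones from the exportfs output earlier in the thread.

```shell
#!/bin/sh
# Print "device:inode path" for each path given, one line per path.
# Run it on each node and diff the results: if the inode number of the
# same path differs between nodes, filehandles handed out by one node
# will not be valid on the other.
export_ids() {
    for p in "$@"; do
        # GNU stat format: %d = device number, %i = inode number, %n = name
        stat -c '%d:%i %n' "$p" || return 1
    done
}

# Example (paths from the exportfs output in this thread):
#   export_ids /export /export/exp1 /export/exp2 > /tmp/export_ids.$(hostname)
```

If /export lives on a local disk on each node, its line will most likely differ between nodes even when the lines for /export/exp1 and /export/exp2 match.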




2006-02-16 15:34:09

by mehta kiran

[permalink] [raw]
Subject: Re: Stale handle error on RHEL4 (2.6.9-22.ELsmp)

Hi Trond,
partitions /dev/sdi1 and /dev/sdi2 are
seen by both the systems on which server
can start and their major/minor numbers
are same on both the systems.
After server fails over to other system,
output of exportfs and mount looks the same.
Should /export directory which is exported
with fsid=0 be on shared disk as well ?

thanks,
kiran




2006-02-16 15:36:22

by Trond Myklebust

[permalink] [raw]
Subject: Re: Stale handle error on RHEL4 (2.6.9-22.ELsmp)

On Thu, 2006-02-16 at 07:34 -0800, mehta kiran wrote:
> Hi Trond,
> partitions /dev/sdi1 and /dev/sdi2 are
> seen by both the systems on which server
> can start and their major/minor numbers
> are same on both the systems.
> After server fails over to other system,
> output of exportfs and mount looks the same.
> Should /export directory which is exported
> with fsid=0 be on shared disk as well ?

Probably, yes.

Cheers,
Trond




2006-02-16 15:44:13

by mehta kiran

[permalink] [raw]
Subject: Re: Stale handle error on RHEL4 (2.6.9-22.ELsmp)

Hi Trond,
But this is different than behaviour of NFS
on other platforms. On other platforms like
AIX, root of exported fses need not be
on shared disk.
Anyways, i will try out the thing suggested
by you and post the result tomorrow.

thanks,
kiran





2006-02-18 08:00:15

by mehta kiran

[permalink] [raw]
Subject: Re: Stale handle error on RHEL4 (2.6.9-22.ELsmp)

Hi Trond,
I tried with a setup in which filesystem
exported with fsid=0 is also on shared disk.
It worked that way. However it is still
interesting to see that on AIX it works
with a little less restriction as mentioned
in my mail below.

thanks,
kiran
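
For anyone reproducing the working setup, here is a rough sketch of the start sequence on whichever node takes over, with the fsid=0 root on shared storage as well. It is not the actual configuration used above. SHARED_ROOT_DEV is a hypothetical placeholder, since the thread never names the shared partition that holds /export. The run and start_exports helper names are also made up. The other devices, paths and export options are copied from the mount and exportfs output in the first mail. Setting DRY_RUN prints the commands instead of running them.

```shell
#!/bin/sh
# ASSUMPTION: hypothetical shared partition holding the /export tree;
# replace with the real device. The thread does not say which one was used.
SHARED_ROOT_DEV=${SHARED_ROOT_DEV:-/dev/sdX}

# With DRY_RUN set, print each command instead of executing it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

start_exports() {
    # Mount the fsid=0 root from shared storage first, so that /export
    # has the same inode numbers no matter which node serves it.
    run mount "$SHARED_ROOT_DEV" /export || return 1
    run mount /dev/sdi1 /dir1 || return 1
    run mount /dev/sdi2 /dir2 || return 1
    # /export/exp1 and /export/exp2 must already exist as directories
    # on the shared /export filesystem.
    run mount --bind /dir1 /export/exp1 || return 1
    run mount --bind /dir2 /export/exp2 || return 1
    run exportfs -o rw,nohide,no_root_squash,fsid=0 '*:/export' || return 1
    run exportfs -o rw,nohide,no_root_squash '*:/export/exp1' || return 1
    run exportfs -o rw,nohide,no_root_squash '*:/export/exp2' || return 1
}

# Example: DRY_RUN=1; start_exports
```

Mounting the shared root before the bind mounts matters, because the bind mount targets live inside it.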

--- mehta kiran <[email protected]> wrote:

> Hi Trond,
> But this is different than behaviour of NFS
> on other platforms. On other platforms like
> AIX, root of exported fses need not be
> on shared disk.
> Anyways, i will try out the thing suggested
> by you and post the result tomorrow.
>
> thanks,
> kiran

