Hi,
We have a serious problem using NFSv4 with recent openSUSE versions.
Provided that I have write permission to the directory I touch, I can
reproduce a stale file handle as follows:
> touch ..
> ls
ls: cannot open directory .: Stale NFS file handle
This makes the filesystem quite unusable, as anything that touches a
directory above the current one (like writing a file into it) can stop an
application.
If I leave the directory and cd into it again, its content is listable.
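The reproduction above can also be scripted, which makes it easier to repeat on several clients. The following is a minimal sketch in Python, not a tested diagnostic tool: the function name `listing_goes_stale` is made up for illustration, and it assumes that `os.utime("..")` followed by `os.listdir(".")` exercises roughly the same path as `touch ..` followed by `ls`.

```python
import errno
import os


def listing_goes_stale(directory):
    """Touch the parent of `directory`, then list `directory` via ".".

    Returns True if the listing fails with ESTALE and False if it succeeds.
    Any other error is re-raised unchanged.
    """
    previous = os.getcwd()
    os.chdir(directory)
    try:
        # Update the timestamps of the parent, like `touch ..`.
        os.utime("..")
        try:
            # Open and read the current directory, like `ls`.
            os.listdir(".")
        except OSError as exc:
            if exc.errno == errno.ESTALE:
                return True
            raise
        return False
    finally:
        # Always return to the directory the caller started in.
        os.chdir(previous)
```

Calling `listing_goes_stale("/home/someuser/testdir")` (a hypothetical path on the NFS mount) would be expected to return True on an affected client and False on a local filesystem.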
Our computing center, which runs the server, told me
that it is NetApp ONTAP 7.3.6P4.
I observed the problem here with openSUSE-11.4 (kernel: 2.6.37.6-0.20-desktop),
openSUSE-12.1 (kernel: 3.4.4-1-desktop) and openSUSE-12.2 (kernel: 3.4.6-2.10-desktop),
all on the x86_64 architecture, but _not_ with openSUSE-11.3 (kernel: 2.6.34.10-0.6-desktop).
We use NFSv4 without encryption, and user data comes from NIS.
I'm looking for suggestions to resolve this problem.
Best regards, David
--
David Werner
Universitaet Stuttgart
Institut für Wasser- und Umweltsystemmodellierung
Lehrstuhl fuer Hydromechanik & Hydrosystemmodellierung
Pfaffenwaldring 61 ** 70569 Stuttgart
Tel.: ++49-711-685 67010 ** Fax: ++49-711-685 60430
[email protected]
http://www.hydrosys.uni-stuttgart.de/
Hi Emmanuel,
Mount shows the following options which resulted from "defaults,_netdev"
in fstab:
rus4iws.rus.uni-stuttgart.de:/vol/rus4iws_data0/ on /home type nfs4
(rw,relatime,vers=4.0,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=129.69.98.101,local_lock=none,addr=129.69.201.103,_netdev)
I have now also tried with "noac" and without _netdev (I forgot what it means;
I think it was a recommendation to circumvent some systemd boot problem),
like the following:
rus4iws.rus.uni-stuttgart.de:/vol/rus4iws_data0/ on /home type nfs4
(rw,relatime,sync,vers=4.0,rsize=65536,wsize=65536,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,hard,noac,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=129.69.98.101,local_lock=none,addr=129.69.201.103)
But it did not resolve anything.
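
To see exactly what changed between the two mounts, the option strings can be compared mechanically. This is only an illustrative sketch: the helper names `parse_options` and `diff_options` are invented here, and the two strings are copied from the mount output above.

```python
def parse_options(option_string):
    """Turn '(rw,vers=4.0,hard)' into {'rw': None, 'vers': '4.0', 'hard': None}."""
    options = {}
    for item in option_string.strip("()").split(","):
        if not item:
            continue
        key, _, value = item.partition("=")
        options[key] = value if value else None
    return options


def diff_options(before, after):
    """Return (added, removed, changed) option names between two mounts."""
    old, new = parse_options(before), parse_options(after)
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(k for k in set(old) & set(new) if old[k] != new[k])
    return added, removed, changed


# Option strings as reported by mount for the two attempts.
DEFAULT_MOUNT = (
    "(rw,relatime,vers=4.0,rsize=65536,wsize=65536,namlen=255,hard,"
    "proto=tcp,port=0,timeo=600,retrans=2,sec=sys,"
    "clientaddr=129.69.98.101,local_lock=none,addr=129.69.201.103,_netdev)"
)
NOAC_MOUNT = (
    "(rw,relatime,sync,vers=4.0,rsize=65536,wsize=65536,namlen=255,"
    "acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,hard,noac,"
    "proto=tcp,port=0,timeo=600,retrans=2,sec=sys,"
    "clientaddr=129.69.98.101,local_lock=none,addr=129.69.201.103)"
)

added, removed, changed = diff_options(DEFAULT_MOUNT, NOAC_MOUNT)
```

With these inputs, only "sync", "noac" and the four attribute-cache timeouts are added and "_netdev" is removed. Neither mount sets lookupcache explicitly.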
Best regards,
David
On Tue, Oct 09, 2012 at 06:15:11PM +0200, Emmanuel Florac wrote:
> On Tue, 9 Oct 2012 17:09:59 +0200
> David Werner <[email protected]> wrote:
>
> > rus4iws.rus.uni-stuttgart.de:/vol/rus4iws_data0/ on /home
> > type nfs4
> > (rw,relatime,sync,vers=4.0,rsize=65536,wsize=65536,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,hard,noac,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=129.69.98.101,local_lock=none,addr=129.69.201.103)
> >
> > But it did not resolve anything.
>
> Is the NetApp exported volume compressed by any chance? What does the
> output from "showmount -e <filer>" looks like for the related exports?
Output of showmount -e:
Export list for rus4iws.rus.uni-stuttgart.de:
/vol/rus4iws_data0 129.69.201.31,129.69.201.39,ikearw,ikea1,ikea2,ikea3,teleloch
/vol/rus4iws_vol0  129.69.201.31,129.69.201.39
The first line contains our netgroups and is the directory we mount.
I think some deduplication is enabled.
Today I ran a test with an Ubuntu 12.04.1 client, which shows the same problem.
--
David Werner
Universitaet Stuttgart
Institut für Wasser- und Umweltsystemmodellierung
Lehrstuhl fuer Hydromechanik & Hydrosystemmodellierung
Pfaffenwaldring 61 ** 70569 Stuttgart
Tel.: ++49-711-685 67010 ** Fax: ++49-711-685 60430
[email protected]
Hi,
I have now also tried the mount option 'nordirplus', which the shipped
man page lists only for NFSv3, but which, according to this mailing list, is
also available for NFSv4. At first glance this also seems to resolve the problem,
and it seems to give better performance than the previously mentioned 'lookupcache=none'.
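
To put a number on the performance impression, one could time a listing that also stats every entry, once per mount option. This is a rough sketch under assumptions: `time_listing` is a made-up helper, a single run is heavily influenced by client-side caching, and the result says nothing about other workloads.

```python
import os
import time


def time_listing(directory):
    """Stat every entry in `directory`; return (entry_count, seconds)."""
    start = time.perf_counter()
    count = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            # Force an attribute lookup per entry, similar to `ls -l`.
            entry.stat(follow_symlinks=False)
            count += 1
    return count, time.perf_counter() - start
```

Running `time_listing` on the same large directory after mounting with 'nordirplus' and again with 'lookupcache=none' would give two figures to compare, ideally averaged over several runs.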
Best regards, David
--
David Werner
Universitaet Stuttgart
Institut für Wasser- und Umweltsystemmodellierung
Lehrstuhl fuer Hydromechanik & Hydrosystemmodellierung
Pfaffenwaldring 61 ** 70569 Stuttgart
Tel.: ++49-711-685 67010 ** Fax: ++49-711-685 60430
[email protected]
http://www.hydrosys.uni-stuttgart.de/
On Tue, 9 Oct 2012 15:59:19 +0200
David Werner <[email protected]> wrote:
> I'm looking for suggestions to resolve this problem.
Did you try mounting the export with the noac option?
--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| <[email protected]>
| +33 1 78 94 84 02
------------------------------------------------------------------------
On Tue, 9 Oct 2012 17:09:59 +0200
David Werner <[email protected]> wrote:
> rus4iws.rus.uni-stuttgart.de:/vol/rus4iws_data0/ on /home
> type nfs4
> (rw,relatime,sync,vers=4.0,rsize=65536,wsize=65536,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,hard,noac,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=129.69.98.101,local_lock=none,addr=129.69.201.103)
>
> But it did not resolve anything.
Is the NetApp exported volume compressed by any chance? What does the
output of "showmount -e <filer>" look like for the related exports?
--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| <[email protected]>
| +33 1 78 94 84 02
------------------------------------------------------------------------
Hi,
To provide more details, I made a tcpdump capture of the
"touch .." on two hosts. The output files, which can be inspected with
Wireshark and similar tools, are at:
http://maultier.iws.uni-stuttgart.de:8080/nfs
But I do not understand much of the NFS protocol.
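
For anyone who wants to take a comparable capture on their own client, the sketch below assembles a tcpdump command line. It is an assumption-laden example, not the command used for the files above: the interface name "eth0", the output file name and the helper `tcpdump_command` are hypothetical, port 2049 is assumed as the standard NFS port, and the server address is the one shown in the mount output.

```python
def tcpdump_command(server, interface="eth0", outfile="nfs-touch.pcap", port=2049):
    """Build an argument list that captures full packets to and from `server`."""
    return [
        "tcpdump",
        "-i", interface,   # capture interface (assumed name)
        "-s", "0",         # do not truncate packets
        "-w", outfile,     # write raw packets for later analysis
        "host", server, "and", "port", str(port),
    ]


command = tcpdump_command("129.69.201.103")
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run` with root privileges while "touch .." and "ls" are run in another shell.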
Best regards, David
David Werner [[email protected]] wrote:
> Hi,
>
> to provide more details I made tcpdump of the
> "touch .." on two hosts. The outputfiles for looking in with wireshark
> & co are on:
>
> http://maultier.iws.uni-stuttgart.de:8080/nfs
>
> But I do not understand much of nfs-protocol.
I took a quick look and didn't see anything wrong in the NFS trace.
Are there any syslog messages on the client? The ESTALE error must be
generated on the client itself.
Regards, Malahal.
Some more news about my problem. Thanks to all who read this thread and
gave it some thought.
* I did not find any significant kernel messages regarding stale
handles in syslog or dmesg.
* The question about the "noac" mount parameter led me to check more mount
options. The parameter lookupcache is quite significant. If I set it
to "none", the problem disappears, while with the values "positive" or "all"
the problem persists.
- However, if I mount first with "none" and later with "all",
the problem disappears, but only until the reboot, after which it
was mounted with "all".
- I checked this under openSUSE-12.2.
- I also made a quick check with openSUSE 11.4 to see whether "none"
takes away the problem there as well.
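
Since the behaviour depends on which lookupcache value is actually in effect, it may help to read the active value back from the kernel's mount table rather than from fstab. A minimal sketch, assuming the client lists the option in /proc/mounts when it is set and omits it otherwise (treated here as "all", the documented default); the function name `nfs_lookupcache` and the sample text are illustrative only.

```python
def nfs_lookupcache(mounts_text):
    """Map each NFS mount point in /proc/mounts-style text to its lookupcache value."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, dump, pass.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        setting = "all"  # assumed default when the option is not listed
        for option in fields[3].split(","):
            if option.startswith("lookupcache="):
                setting = option.split("=", 1)[1]
        result[fields[1]] = setting
    return result


# Made-up sample in /proc/mounts format, for illustration only.
SAMPLE = """\
rus4iws.rus.uni-stuttgart.de:/vol/rus4iws_data0/ /home nfs4 rw,relatime,vers=4.0,hard,lookupcache=none 0 0
/dev/sda1 / ext4 rw,relatime 0 0
"""
```

On a live client the function would be called as `nfs_lookupcache(open("/proc/mounts").read())`.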
Best regards, David
On Tue, 9 Oct 2012 21:27:24 +0200 you wrote:
> The first line contains our netgroups and is the directory we mount.
Yes, nothing special apparently.
> I think some deduplication is enabled.
> Today I made a test with ubuntu-12.04.1 client, with the same problem.
>
So this may be a kernel bug. I'll have a quick look.
--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| <[email protected]>
| +33 1 78 94 84 02
------------------------------------------------------------------------