2010-01-25 19:09:03

by Chuck Lever

Subject: Re: nfs client performance while server is down

On Jan 25, 2010, at 2:02 PM, Whoop Whouzer wrote:
> Ok, I did that, after shutting down the server and enabling debug
> trace I tried to open the home folder of the current account (totally
> unrelated to the nfsshare), it wouldn't open at all, I got no nautilus
> at all. During the time my cursor was in busy mode I got the following
> messages in kern.log (for ubuntu 10.04 client):
> Jan 25 19:30:13 whoop-desktop kernel: [ 160.719262] NFS call fsstat
> Jan 25 19:30:37 whoop-desktop kernel: [ 184.458611] NFS:
> permission(0:16/74386), mask=0x10, res=0
> Jan 25 19:30:37 whoop-desktop kernel: [ 184.458647] NFS call access
> Jan 25 19:30:43 whoop-desktop kernel: [ 190.721086] nfs: server
> 192.168.1.130 not responding, timed out
> Jan 25 19:30:43 whoop-desktop kernel: [ 190.721113] NFS reply
> statfs: -5
> Jan 25 19:30:43 whoop-desktop kernel: [ 190.721116] nfs_statfs:
> statfs error = 5
> These series of traces are repeating over and over again at a set
> interval (there is no flooding of the logs), even if I do nothing.
> It's even worse than I thought because when I tried to shutdown, the
> machine wouldn't shutdown because it claimed
> the "File manager" was still running (although it was not visible on
> screen); so I had to kill that before I could shutdown (properly).
>
> In Fedora 12 I had a similar user experience (nautilus did show up
> without showing any contents and it was hanging). I had enabled
> tracing and it seems to be logged to /var/log/messages. I got this
> output in fedora:
> Jan 25 20:48:38 localhost kernel: NFS reply statfs: -5
> Jan 25 20:48:38 localhost kernel: nfs_statfs: statfs error = 5
> Jan 25 20:48:38 localhost kernel: NFS call fsstat
> Jan 25 20:49:14 localhost kernel: nfs: server 192.168.1.130 not
> responding, timed out
> Jan 25 20:49:14 localhost kernel: NFS reply getattr: -5
> Jan 25 20:49:14 localhost kernel: nfs_revalidate_inode: (0:14/74386)
> getattr failed, error=-5
> Jan 25 20:49:25 localhost kernel: NFS: revalidating (0:14/74386)
> Jan 25 20:49:25 localhost kernel: NFS call getattr
> Jan 25 20:50:14 localhost kernel: nfs: server 192.168.1.130 not
> responding, timed out
> Jan 25 20:50:14 localhost kernel: NFS reply access: -5
> Jan 25 20:50:14 localhost kernel: NFS: permission(0:14/74386),
> mask=0x1, res=-5
> Jan 25 20:50:14 localhost kernel: NFS call access
> Jan 25 20:51:14 localhost kernel: nfs: server 192.168.1.130 not
> responding, timed out
> Jan 25 20:51:14 localhost kernel: NFS reply statfs: -5
> Jan 25 20:51:14 localhost kernel: nfs_statfs: statfs error = 5
> Jan 25 20:51:14 localhost kernel: NFS call fsstat
> Most of the trace is repeating in set intervals as well, there is no
> flooding of the logs...
> Fedora would not shutdown normally either

This verifies that your client is attempting to access the NFS server,
but doesn't tell us which file it's attempting to access. Essentially
the EIO means "failed to connect".
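
As a rough illustration (not part of the original trace), the "-5" in those reply lines can be decoded with a few lines of Python. The regular expression and the helper name decode_replies are assumptions made for this sketch; the sample lines are copied from the Fedora log quoted above.

```python
import errno
import os
import re

# Matches rpcdebug lines such as "NFS reply statfs: -5".
REPLY_RE = re.compile(r"NFS reply (\w+): (-?\d+)")


def decode_replies(log_text):
    """Return (operation, errno name, message) for each failed NFS reply."""
    results = []
    for op, code in REPLY_RE.findall(log_text):
        value = abs(int(code))  # the kernel logs errors as negative errno values
        if value == 0:
            continue  # a zero result means success, nothing to decode
        name = errno.errorcode.get(value, "UNKNOWN")
        results.append((op, name, os.strerror(value)))
    return results


# Sample lines taken from the Fedora 12 log in this thread.
SAMPLE = """\
Jan 25 20:48:38 localhost kernel: NFS reply statfs: -5
Jan 25 20:49:14 localhost kernel: NFS reply getattr: -5
Jan 25 20:50:14 localhost kernel: NFS reply access: -5
"""

if __name__ == "__main__":
    for op, name, message in decode_replies(SAMPLE):
        print(f"{op}: {name} ({message})")
```

This only names the error; it still does not say which file or which process triggered the call.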

Maybe try an strace of the nautilus process next?

> On Mon, Jan 25, 2010 at 5:48 PM, Chuck Lever
> <[email protected]> wrote:
>> On Jan 24, 2010, at 7:09 PM, Whoop Whouzer wrote:
>>>
>>> I did some network traces and there is nothing strange happening as
>>> far as I can tell. I shut down the server (some network traffic
>>> occurred as is to be expected). It got quiet again, I launched
>>> nautilus, it got stuck without displaying anything and there was no
>>> real network activity except 3 broadcasts using the ARP protocol
>>> asking where the server was (could be just coincidence).
>>
>> That sounds like the client does want to reconnect with the server.
>>
>> You could try enabling debug tracing on your client (sudo rpcdebug -m nfs -s all)
>> after shutting down your server, then try to start nautilus. The kernel log
>> would then contain NFS-related messages that might indicate where to look next.
>>
>>> Closing
>>> nautilus and launching it again will let it hang again but I see no
>>> additional network traffic. After a while nautilus will display the
>>> contents of the folder without any network traffic.
>>>
>>> On Sun, Jan 24, 2010 at 10:34 PM, Muntz, Daniel <[email protected]> wrote:
>>>>
>>>> Perhaps something in your $PATH is in the NFS mount? Do a network trace
>>>> and maybe you can see if, in fact, there are actually NFS operations being
>>>> attempted that you weren't expecting. Then try to figure out why.
>>>>
>>>> -Dan
>>>>
>>>>> -----Original Message-----
>>>>> From: Whoop Whouzer [mailto:[email protected]]
>>>>> Sent: Saturday, January 23, 2010 8:28 AM
>>>>> To: Peter Chacko
>>>>> Cc: [email protected]
>>>>> Subject: Re: nfs client performance while server is down
>>>>>
>>>>> I don't remember all the different set-ups I tried it on, but I just
>>>>> confirmed this with the following combinations:
>>>>>
>>>>> ubuntu server 10.04 (alpha 2) --> ubuntu desktop 9.10, ubuntu desktop
>>>>> 10.04 (alpha 2), fedora 12
>>>>> ubuntu server 9.10 --> ubuntu desktop 9.10, ubuntu desktop 10.04
>>>>> (alpha 2), fedora 12
>>>>>
>>>>> I'll be happy to test it on another client machine (distro) even
>>>>> another server (although it would require a little more time)
>>>>>
>>>>> Here are some examples on the bugreports I noticed and how they do not
>>>>> seem to get solved:
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=175283
>>>>> https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/164120
>>>>>
>>>>> regards,
>>>>> Whoop
>>>>>
>>>>> On Sat, Jan 23, 2010 at 4:57 PM, Peter Chacko
>>>>> <[email protected]> wrote:
>>>>>>
>>>>>> On which client OS did you observe this behavior? This has nothing to do
>>>>>> with NFS design, which is purely stateless. It's up to the client OS
>>>>>> implementation how to deal with aspects like local IO when an NFS
>>>>>> share gets disconnected.
>>>>>>
>>>>>> It may be a VFS bug on the local OS where you found this problem.
>>>>>>
>>>>>> thanks
>>>>>>
>>>>>> On Sat, Jan 23, 2010 at 9:15 PM, Whoop Whouzer <[email protected]> wrote:
>>>>>>>
>>>>>>> Howdy,
>>>>>>>
>>>>>>> I was wondering why nfs is designed in such a way that the performance
>>>>>>> of an nfs client machine gets very bad when the nfs server is offline?
>>>>>>> This is even the case with a soft mount (either via mount or fstab).
>>>>>>> Just about every application that requires disk access (not talking
>>>>>>> about nfs share access) gets really slow to unresponsive. For instance
>>>>>>> nautilus becomes unresponsive when displaying the contents of any
>>>>>>> folder on the local disk,
>>>>>>> playing movie files (stored on local disk) let totem or vlc get stuck
>>>>>>> on set intervals, even the terminal becomes unresponsive at times.
>>>>>>>
>>>>>>> I could understand that these problems would occur while accessing the
>>>>>>> nfs share directory while the server is offline, but why for totally
>>>>>>> unrelated directories?
>>>>>>>
>>>>>>> I have experienced this behaviour on various distros, and also found
>>>>>>> various bug reports on this issue, they don't seem to get solved as
>>>>>>> this is viewed as nfs design.
>>>>>>> I see this as a flaw because clients are totally dependent on the
>>>>>>> server. This would be less of a deal if the entire home directory
>>>>>>> would be stored on nfs (although I even think some sort of
>>>>>>> synchronisation technology could and should be implemented in this
>>>>>>> case). It is a bit odd that (technically) one machine serving some
>>>>>>> "useless" files to a non-trivial directory on client machines can take
>>>>>>> down these client machines.
>>>>>>>
>>>>>>> For me the preferred functionality would be:
>>>>>>> *If an nfs server gets offline the client's nfs share becomes
>>>>>>> unaccessible, but local directories and applications (that only
>>>>>>> require local disk access) stay responsive.
>>>>>>> *If an nfs server gets online (after being offline while the client
>>>>>>> has not been restarted) the nfs share becomes reconnected.
>>>>>>>
>>>>>>> regards,
>>>>>>> Whoop
>>>>>>> --
>>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>>>>>> the body of a message to [email protected]
>>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>>>>>
>>>>>>
>>>>>
>>>>
>>
>> --
>> Chuck Lever
>> chuck[dot]lever[at]oracle[dot]com
>>
>>
>>
>>
>>

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2010-01-27 18:47:38

by Whoop Whouzer

Subject: Re: nfs client performance while server is down

ok, but it's not just GNOME/nautilus behaviour. For one, I am
experiencing problems with just about all applications that require
(local) disk access. Furthermore, problems have also been reported
with xfce/thunar and also with KDE.

A bug for this issue has just been created for xfce/thunar:
http://bugzilla.xfce.org/show_bug.cgi?id=6185

On Wed, Jan 27, 2010 at 7:40 PM, Trond Myklebust
<[email protected]> wrote:
> On Wed, 2010-01-27 at 13:23 -0500, Chuck Lever wrote:
>> On 01/26/2010 06:21 PM, J. Bruce Fields wrote:
>> > I wonder if nautilus (or some library it uses) likes to regularly
>> > "statfs" all the filesystems it knows about?
>>
>> The NFS client seems to like to send these periodically, but I've never
>> looked into why. It's probably triggered by some cache timeout, and
>> gathers recent server file system information.
>
> No. It is entirely application driven. Furthermore, most of the statfs
> data is uncached, since it should not be performance critical in any
> sane application environment.
>
> IOW: I agree with Bruce that this is most likely GNOME or nautilus
> triggering statfs calls. Indeed, when I do actually open a window on
> some directory it also appears to display the free space.
>
> Trond
>
>
>

2010-01-27 00:40:09

by Whoop Whouzer

Subject: Re: nfs client performance while server is down

Could be, although not very likely as it was also reported happening
with thunar (although I have not tested this myself).
But I am also experiencing similar problems with other applications
even gnome-terminal (basically all applications requiring (local) disk
access).
So this would lead me to think it is some sub-process, that is used by
all applications requiring disk access, that is to blame...

On Wed, Jan 27, 2010 at 12:21 AM, J. Bruce Fields <[email protected]> wrote:
> On Mon, Jan 25, 2010 at 02:08:47PM -0500, Chuck Lever wrote:
>> On Jan 25, 2010, at 2:02 PM, Whoop Whouzer wrote:
>>> Ok, I did that, after shutting down the server and enabling debug
>>> trace I tried to open the home folder of the current account (totally
>>> unrelated to the nfsshare), it wouldn't open at all, I got no nautilus
>>> at all. During the time my cursor was in busy mode I got the following
>>> messages in kern.log (for ubuntu 10.04 client):
>>> Jan 25 19:30:13 whoop-desktop kernel: [  160.719262] NFS call  fsstat
>>> Jan 25 19:30:37 whoop-desktop kernel: [  184.458611] NFS:
>>> permission(0:16/74386), mask=0x10, res=0
>>> Jan 25 19:30:37 whoop-desktop kernel: [  184.458647] NFS call  access
>>> Jan 25 19:30:43 whoop-desktop kernel: [  190.721086] nfs: server
>>> 192.168.1.130 not responding, timed out
>>> Jan 25 19:30:43 whoop-desktop kernel: [  190.721113] NFS reply statfs:
>>> -5
>>> Jan 25 19:30:43 whoop-desktop kernel: [  190.721116] nfs_statfs:
>>> statfs error = 5
> ...
>> This verifies that your client is attempting to access the NFS server,
>> but doesn't tell us which file it's attempting to access. Essentially
>> the EIO means "failed to connect".
>
> I wonder if nautilus (or some library it uses) likes to regularly
> "statfs" all the filesystems it knows about?
>
> --b.
>

2010-01-27 19:09:42

by Trond Myklebust

Subject: Re: nfs client performance while server is down

So? I don't see why that would be an NFS problem.

As far as I can see from this thread, you are basically asking us to fix
these broken applications by implementing a "disconnected NFS" mode.
While that may indeed be a cool thing to support, I haven't seen anybody
so far stepping up and saying that they have the time and resources to
work on it. Are you volunteering?

Trond

On Wed, 2010-01-27 at 19:47 +0100, Whoop Whouzer wrote:
> ok, but it's not just GNOME/nautilus behaviour. For one, I am
> experiencing problems with just about all applications that require
> (local) disk access. Furthermore, problems have also been reported
> with xfce/thunar and also with KDE.
>
> A bug for this issue has just been created for xfce/thunar:
> http://bugzilla.xfce.org/show_bug.cgi?id=6185
>
> On Wed, Jan 27, 2010 at 7:40 PM, Trond Myklebust
> <[email protected]> wrote:
> > On Wed, 2010-01-27 at 13:23 -0500, Chuck Lever wrote:
> >> On 01/26/2010 06:21 PM, J. Bruce Fields wrote:
> >> > I wonder if nautilus (or some library it uses) likes to regularly
> >> > "statfs" all the filesystems it knows about?
> >>
> >> The NFS client seems to like to send these periodically, but I've never
> >> looked into why. It's probably triggered by some cache timeout, and
> >> gathers recent server file system information.
> >
> > No. It is entirely application driven. Furthermore, most of the statfs
> > data is uncached, since it should not be performance critical in any
> > sane application environment.
> >
> > IOW: I agree with Bruce that this is most likely GNOME or nautilus
> > triggering statfs calls. Indeed, when I do actually open a window on
> > some directory it also appears to display the free space.
> >
> > Trond
> >
> >
> >




2010-01-27 17:09:13

by J. Bruce Fields

Subject: Re: nfs client performance while server is down

On Wed, Jan 27, 2010 at 01:40:07AM +0100, Whoop Whouzer wrote:
> Could be, although not very likely as it was also reported happening
> with thunar (although I have not tested this myself).
> But I am also experiencing similar problems with other applications
> even gnome-terminal (basically all applications requiring (local) disk
> access).
> So this would lead me to think it is some sub-process, that is used by
> all applications requiring disk access, that is to blame...

You could patch the kernel to add printk()'s in statfs showing who is
calling it (and with what path).

But probably there's some tracing infrastructure that would make this
possible without patching.

--b.

>
> On Wed, Jan 27, 2010 at 12:21 AM, J. Bruce Fields <[email protected]> wrote:
> > On Mon, Jan 25, 2010 at 02:08:47PM -0500, Chuck Lever wrote:
> >> On Jan 25, 2010, at 2:02 PM, Whoop Whouzer wrote:
> >>> Ok, I did that, after shutting down the server and enabling debug
> >>> trace I tried to open the home folder of the current account (totally
> >>> unrelated to the nfsshare), it wouldn't open at all, I got no nautilus
> >>> at all. During the time my cursor was in busy mode I got the following
> >>> messages in kern.log (for ubuntu 10.04 client):
> >>> Jan 25 19:30:13 whoop-desktop kernel: [  160.719262] NFS call  fsstat
> >>> Jan 25 19:30:37 whoop-desktop kernel: [  184.458611] NFS:
> >>> permission(0:16/74386), mask=0x10, res=0
> >>> Jan 25 19:30:37 whoop-desktop kernel: [  184.458647] NFS call  access
> >>> Jan 25 19:30:43 whoop-desktop kernel: [  190.721086] nfs: server
> >>> 192.168.1.130 not responding, timed out
> >>> Jan 25 19:30:43 whoop-desktop kernel: [  190.721113] NFS reply statfs:
> >>> -5
> >>> Jan 25 19:30:43 whoop-desktop kernel: [  190.721116] nfs_statfs:
> >>> statfs error = 5
> > ...
> >> This verifies that your client is attempting to access the NFS server,
> >> but doesn't tell us which file it's attempting to access.  Essentially
> >> the EIO means "failed to connect".
> >
> > I wonder if nautilus (or some library it uses) likes to regularly
> > "statfs" all the filesystems it knows about?
> >
> > --b.
> >

2010-01-27 19:32:03

by Peter Staubach

Subject: Re: nfs client performance while server is down

Whoop Whouzer wrote:
> I am not stating this is an NFS problem at all. I am not asking anybody to fix
> anything.
> I asked if this issue was by design. I was told it wasn't (as nfs is stateless).
> So, therefore I considered it as a bug (which I don't believe to
> reside in either nfs or nautilus). I am just trying to figure out
> where the problem lies.
>

There is a misconception here. NFS is not stateless. To
be accurate, the NFSv2 and NFSv3 protocols were defined in
such a way as to allow the NFS server to be stateless. The server
was not supposed to be required to remember anything about
what a client was doing from one operation to the next. (In
reality, there are non-idempotent operations, i.e. operations
which cannot be done twice and get the same results, so it
is very helpful if the server remembers some state.)
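
A toy model may make the non-idempotent case concrete. This is only a sketch under stated assumptions: ToyServer, its remove() method, and the in-memory reply cache are invented for illustration and do not correspond to any real NFS server code. The xid stands for the RPC transaction id that a client reuses when it retransmits a request.

```python
class ToyServer:
    """Hypothetical in-memory file server used only to illustrate retries."""

    def __init__(self, files, remember_replies):
        self.files = set(files)
        self.remember_replies = remember_replies
        self.reply_cache = {}  # xid -> reply already sent for that request

    def remove(self, xid, name):
        # A server that keeps some state can answer a retransmission
        # with the reply it produced the first time.
        if self.remember_replies and xid in self.reply_cache:
            return self.reply_cache[xid]
        if name in self.files:
            self.files.remove(name)
            reply = "OK"
        else:
            reply = "ENOENT"
        if self.remember_replies:
            self.reply_cache[xid] = reply
        return reply


if __name__ == "__main__":
    for remember in (False, True):
        server = ToyServer({"report.txt"}, remember_replies=remember)
        first = server.remove(xid=7, name="report.txt")
        # Pretend the first reply was lost, so the client retransmits.
        retry = server.remove(xid=7, name="report.txt")
        print(f"remember_replies={remember}: first={first} retry={retry}")
```

Without the cache, the retransmitted remove reports ENOENT even though the original request succeeded, which is the kind of surprise that remembered state avoids.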

NFS clients have always been _very_ stateful. They have to
know about all mounted file systems, open files, current
directories, etc.

The problem here is some application which is attempting to
touch all of the mounted file systems. When it tries to
touch one from a non-responsive NFS server, then it hangs.
This represents an architectural problem with the
application assuming that it is okay and acceptable
to access all file systems which are currently mounted.
This assumption leads to situations such as you are observing.
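
The pattern described above can be sketched roughly as follows. This is an assumption about what such an application might do, not code taken from nautilus or any library; the function names, the export path and the mount point /mnt/nfsshare in the sample are hypothetical (only the server address comes from this thread).

```python
import os


def parse_mounts(mounts_text):
    """Yield (device, mount_point, fs_type) from /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            yield fields[0], fields[1], fields[2]


def is_nfs(fs_type):
    return fs_type in ("nfs", "nfs4")


def free_space(mounts_text, skip_nfs=False):
    """Collect free bytes per mount point by calling statvfs() on each one.

    With skip_nfs=False this mirrors the naive behaviour: one statvfs()
    per mounted file system, which stalls on an unresponsive NFS server.
    """
    report = {}
    for _device, mount_point, fs_type in parse_mounts(mounts_text):
        if skip_nfs and is_nfs(fs_type):
            continue
        try:
            st = os.statvfs(mount_point)  # may block on a dead NFS server
            report[mount_point] = st.f_bavail * st.f_frsize
        except OSError:
            report[mount_point] = None
    return report


# Hypothetical mount table; the NFS export and mount point are made up.
SAMPLE = (
    "/dev/sda1 / ext4 rw 0 0\n"
    "192.168.1.130:/export /mnt/nfsshare nfs rw,soft 0 0\n"
)

if __name__ == "__main__":
    print(free_space(SAMPLE, skip_nfs=True))
```

Skipping NFS entries outright is only one possible policy; it trades away free-space information for the share in exchange for never blocking on it.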

This isn't new.

ps

> I am not talking about implementing "disconnected NFS" mode,
> synchronisation or anything like that. There is not something missing,
> there is something not working properly, somewhere, and I'm trying to
> find out where..
>
> On Wed, Jan 27, 2010 at 8:09 PM, Trond Myklebust
> <[email protected]> wrote:
>> So? I don't see why that would be an NFS problem.
>>
>> As far as I can see from this thread, you are basically asking us to fix
>> these broken applications by implementing a "disconnected NFS" mode.
>> While that may indeed be a cool thing to support, I haven't seen anybody
>> so far stepping up and saying that they have the time and resources to
>> work on it. Are you volunteering?
>>
>> Trond
>>
>> On Wed, 2010-01-27 at 19:47 +0100, Whoop Whouzer wrote:
>>> ok, but it's not just GNOME/nautilus behaviour. For one, I am
>>> experiencing problems with just about all applications that require
>>> (local) disk access. Furthermore, problems have also been reported
>>> with xfce/thunar and also with KDE.
>>>
>>> A bug for this issue has just been created for xfce/thunar:
>>> http://bugzilla.xfce.org/show_bug.cgi?id=6185
>>>
>>> On Wed, Jan 27, 2010 at 7:40 PM, Trond Myklebust
>>> <[email protected]> wrote:
>>>> On Wed, 2010-01-27 at 13:23 -0500, Chuck Lever wrote:
>>>>> On 01/26/2010 06:21 PM, J. Bruce Fields wrote:
>>>>>> I wonder if nautilus (or some library it uses) likes to regularly
>>>>>> "statfs" all the filesystems it knows about?
>>>>> The NFS client seems to like to send these periodically, but I've never
>>>>> looked into why. It's probably triggered by some cache timeout, and
>>>>> gathers recent server file system information.
>>>> No. It is entirely application driven. Furthermore, most of the statfs
>>>> data is uncached, since it should not be performance critical in any
>>>> sane application environment.
>>>>
>>>> IOW: I agree with Bruce that this is most likely GNOME or nautilus
>>>> triggering statfs calls. Indeed, when I do actually open a window on
>>>> some directory it also appears to display the free space.
>>>>
>>>> Trond
>>>>
>>>>
>>>>
>>
>>
>>


2010-01-26 23:20:18

by J. Bruce Fields

Subject: Re: nfs client performance while server is down

On Mon, Jan 25, 2010 at 02:08:47PM -0500, Chuck Lever wrote:
> On Jan 25, 2010, at 2:02 PM, Whoop Whouzer wrote:
>> Ok, I did that, after shutting down the server and enabling debug
>> trace I tried to open the home folder of the current account (totally
>> unrelated to the nfsshare), it wouldn't open at all, I got no nautilus
>> at all. During the time my cursor was in busy mode I got the following
>> messages in kern.log (for ubuntu 10.04 client):
>> Jan 25 19:30:13 whoop-desktop kernel: [ 160.719262] NFS call fsstat
>> Jan 25 19:30:37 whoop-desktop kernel: [ 184.458611] NFS:
>> permission(0:16/74386), mask=0x10, res=0
>> Jan 25 19:30:37 whoop-desktop kernel: [ 184.458647] NFS call access
>> Jan 25 19:30:43 whoop-desktop kernel: [ 190.721086] nfs: server
>> 192.168.1.130 not responding, timed out
>> Jan 25 19:30:43 whoop-desktop kernel: [ 190.721113] NFS reply statfs:
>> -5
>> Jan 25 19:30:43 whoop-desktop kernel: [ 190.721116] nfs_statfs:
>> statfs error = 5
...
> This verifies that your client is attempting to access the NFS server,
> but doesn't tell us which file it's attempting to access. Essentially
> the EIO means "failed to connect".

I wonder if nautilus (or some library it uses) likes to regularly
"statfs" all the filesystems it knows about?

--b.

2010-01-27 18:23:40

by Chuck Lever

Subject: Re: nfs client performance while server is down

On 01/26/2010 06:21 PM, J. Bruce Fields wrote:
> On Mon, Jan 25, 2010 at 02:08:47PM -0500, Chuck Lever wrote:
>> On Jan 25, 2010, at 2:02 PM, Whoop Whouzer wrote:
>>> Ok, I did that, after shutting down the server and enabling debug
>>> trace I tried to open the home folder of the current account (totally
>>> unrelated to the nfsshare), it wouldn't open at all, I got no nautilus
>>> at all. During the time my cursor was in busy mode I got the following
>>> messages in kern.log (for ubuntu 10.04 client):
>>> Jan 25 19:30:13 whoop-desktop kernel: [ 160.719262] NFS call fsstat
>>> Jan 25 19:30:37 whoop-desktop kernel: [ 184.458611] NFS:
>>> permission(0:16/74386), mask=0x10, res=0
>>> Jan 25 19:30:37 whoop-desktop kernel: [ 184.458647] NFS call access
>>> Jan 25 19:30:43 whoop-desktop kernel: [ 190.721086] nfs: server
>>> 192.168.1.130 not responding, timed out
>>> Jan 25 19:30:43 whoop-desktop kernel: [ 190.721113] NFS reply statfs:
>>> -5
>>> Jan 25 19:30:43 whoop-desktop kernel: [ 190.721116] nfs_statfs:
>>> statfs error = 5
> ...
>> This verifies that your client is attempting to access the NFS server,
>> but doesn't tell us which file it's attempting to access. Essentially
>> the EIO means "failed to connect".
>
> I wonder if nautilus (or some library it uses) likes to regularly
> "statfs" all the filesystems it knows about?

The NFS client seems to like to send these periodically, but I've never
looked into why. It's probably triggered by some cache timeout, and
gathers recent server file system information.

The ACCESS command is for a particular file, however. That's probably
where we will get the most interesting and specific information. A
network trace would capture the FH in the ACCESS request. When the
server is up, you could match that FH to other requests where the actual
pathname of the file is known.

Simply run wireshark on your client, and it should automatically sniff
the FH information. Wireshark would need to be running while the server
is up and continue running after the server is taken down.

--
chuck[dot]lever[at]oracle[dot]com

2010-01-27 19:25:41

by Whoop Whouzer

Subject: Re: nfs client performance while server is down

I am not stating this is an NFS problem at all. I am not asking anybody to fix
anything.
I asked if this issue was by design. I was told it wasn't (as nfs is stateless).
So, therefore I considered it as a bug (which I don't believe to
reside in either nfs or nautilus). I am just trying to figure out
where the problem lies.

I am not talking about implementing "disconnected NFS" mode,
synchronisation or anything like that. There is not something missing,
there is something not working properly, somewhere, and I'm trying to
find out where..

On Wed, Jan 27, 2010 at 8:09 PM, Trond Myklebust
<[email protected]> wrote:
> So? I don't see why that would be an NFS problem.
>
> As far as I can see from this thread, you are basically asking us to fix
> these broken applications by implementing a "disconnected NFS" mode.
> While that may indeed be a cool thing to support, I haven't seen anybody
> so far stepping up and saying that they have the time and resources to
> work on it. Are you volunteering?
>
> Trond
>
> On Wed, 2010-01-27 at 19:47 +0100, Whoop Whouzer wrote:
>> ok, but it's not just GNOME/nautilus behaviour. For one, I am
>> experiencing problems with just about all applications that require
>> (local) disk access. Furthermore, problems have also been reported
>> with xfce/thunar and also with KDE.
>>
>> A bug for this issue has just been created for xfce/thunar:
>> http://bugzilla.xfce.org/show_bug.cgi?id=6185
>>
>> On Wed, Jan 27, 2010 at 7:40 PM, Trond Myklebust
>> <[email protected]> wrote:
>> > On Wed, 2010-01-27 at 13:23 -0500, Chuck Lever wrote:
>> >> On 01/26/2010 06:21 PM, J. Bruce Fields wrote:
>> >> > I wonder if nautilus (or some library it uses) likes to regularly
>> >> > "statfs" all the filesystems it knows about?
>> >>
>> >> The NFS client seems to like to send these periodically, but I've never
>> >> looked into why. It's probably triggered by some cache timeout, and
>> >> gathers recent server file system information.
>> >
>> > No. It is entirely application driven. Furthermore, most of the statfs
>> > data is uncached, since it should not be performance critical in any
>> > sane application environment.
>> >
>> > IOW: I agree with Bruce that this is most likely GNOME or nautilus
>> > triggering statfs calls. Indeed, when I do actually open a window on
>> > some directory it also appears to display the free space.
>> >
>> > Trond
>> >
>> >
>> >
>
>
>
>

2010-01-27 19:30:30

by Ray Van Dolson

Subject: Re: nfs client performance while server is down

On Wed, Jan 27, 2010 at 11:25:39AM -0800, Whoop Whouzer wrote:
> I am not stating this is an NFS problem at all. I am not asking
> anybody to fix anything. I asked if this issue was by design. I was
> told it wasn't (as nfs is stateless). So, therefore I considered it
> as a bug (which I don't believe to reside in either nfs or nautilus).
> I am just trying to figure out where the problem lies.
>
> I am not talking about implementing "disconnected NFS" mode,
> synchronisation or anything like that. There is not something
> missing, there is something not working properly, somewhere, and I'm
> trying to find out where..

My impression is that this is "by design" in that NFS mounts, when
mounted in "hard" mode (which is the case by default) will "block"
until the remote server responds.

For the most part this is a good thing. Applications expect their
filesystem calls to behave a certain way regardless of what type of
filesystem is underneath.

In this case, it seems like Nautilus tries to open the mount point and
it just hangs... forever. This would be expected with an NFS mount in
my view.

One way I could think of getting around it would be to ensure that the
NFS mount is mounted with "intr", and then get Nautilus to monitor how
long it takes to read a mount point and "terminate" after a timeout is
reached, perhaps flagging that mount so future accesses are quicker.
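
A minimal sketch of that timeout idea, assuming the probe runs in a separate child process so the caller never blocks on the mount itself. The helper name probe_mount and the 2 second default are hypothetical, and this is not how Nautilus works today.

```python
import subprocess
import sys

# One-line child program: statvfs() the path given as its first argument.
PROBE = "import os, sys; os.statvfs(sys.argv[1])"


def probe_mount(path, timeout=2.0):
    """Return 'ok', 'error', or 'timeout' for a statvfs() on path.

    The call runs in a child process because a thread stuck on a hung
    mount cannot be cancelled from inside the same process.
    """
    child = subprocess.Popen(
        [sys.executable, "-c", PROBE, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        status = child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Best effort only: the kill may not take effect until the mount
        # responds, so do not wait for the child to exit here.
        child.kill()
        return "timeout"
    return "ok" if status == 0 else "error"


if __name__ == "__main__":
    print("/", probe_mount("/"))
```

A caller could remember which mount points returned "timeout" and skip them for a while. Note that the sketch deliberately does not wait for a timed-out child, so a stuck probe process can linger until the server comes back.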

Anyways, that goes beyond the scope of NFS... and good luck convincing
the GNOME developers to make a change like that. :)

Also, even if you mount an NFS share with "intr", you can't always
guarantee that you'll be able to kill a process trying to access said
mount..... at least in my experience.

Ray

2010-01-27 18:40:16

by Trond Myklebust

Subject: Re: nfs client performance while server is down

On Wed, 2010-01-27 at 13:23 -0500, Chuck Lever wrote:
> On 01/26/2010 06:21 PM, J. Bruce Fields wrote:
> > I wonder if nautilus (or some library it uses) likes to regularly
> > "statfs" all the filesystems it knows about?
>
> The NFS client seems to like to send these periodically, but I've never
> looked into why. It's probably triggered by some cache timeout, and
> gathers recent server file system information.

No. It is entirely application driven. Furthermore, most of the statfs
data is uncached, since it should not be performance critical in any
sane application environment.

IOW: I agree with Bruce that this is most likely GNOME or nautilus
triggering statfs calls. Indeed, when I do actually open a window on
some directory it also appears to display the free space.

Trond