2004-01-23 10:01:03

by Perceval Anichini

Subject: NFS : open is failing too slowly...

Hello !

Here is my problem :
I have an application whose job is to record an IP stream on an
NFS server. If the write() fails, the application continues to record
on a local disk.
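
The fallback described here could be sketched roughly as follows, in Python for brevity. The function name `write_with_fallback` and the two file-like objects are hypothetical; the actual recorder is not shown in this thread, and the sketch assumes, as the post does, that a failed write is reported to the application as an error.

```python
def write_with_fallback(primary, fallback, chunk):
    """Write chunk to primary; if that raises OSError, write it to fallback.

    Returns "primary" or "fallback" to tell the caller where the chunk went.
    """
    try:
        primary.write(chunk)
        primary.flush()  # hand buffered data to the OS so an error can surface
        return "primary"
    except OSError:
        # Assumption: an unreachable server shows up as an OS-level error.
        fallback.write(chunk)
        fallback.flush()
        return "fallback"
```

The caller would pass the file opened on the NFS mount as `primary` and the file on the local disk as `fallback`, and use the return value to decide when to start probing the server again.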

When the application is recording to the local disk, it still
has to check whether or not the main filesystem is back. To perform
that operation, I try to open() a file on the main filesystem
regularly.

My problem is that open() takes far too long to fail
(~0.3s), which leads the application to lose data.
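
A minimal way to measure how long such a probe takes, assuming a hypothetical helper named `timed_open` and a probe file of your choosing on the NFS mount, might look like this:

```python
import os
import time


def timed_open(path):
    """Try to open path read-only; return (succeeded, elapsed_seconds)."""
    start = time.monotonic()
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        # The open failed; report how long the failure took.
        return False, time.monotonic() - start
    os.close(fd)
    return True, time.monotonic() - start
```

Calling `timed_open()` on a file under the mount point while the server is down gives the figure to compare against the ~0.3s reported here.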

So my question is: is there a fast way to know (ioctl, or whatever) whether
an NFS server is available or in timeout mode?

For info: I am running Red Hat 9.0 with a hand-compiled 2.6.1 kernel.
The filesystem is mounted with the options nfsvers=3 soft timeo=1 retrans=1

(Soft mode because I don't want my process to hang when the
server is stuck, and timeo, retrans = 1 in order to reduce the timeout
as much as possible. Tell me if I'm wrong...)

Thanks a lot !

Perceval Anichini.



-------------------------------------------------------
The SF.Net email is sponsored by EclipseCon 2004
Premiere Conference on Open Tools Development and Integration
See the breadth of Eclipse activity. February 3-5 in Anaheim, CA.
http://www.eclipsecon.org/osdn
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2004-01-23 14:58:40

by Lever, Charles

Subject: RE: NFS : open is failing too slowly...

> I have an application whose job is to record an IP stream on an
> NFS server. If the write() fails, the application continues to record
> on a local disk.
>
> When the application is recording to the local disk, it still
> has to check whether or not the main filesystem is back. To perform
> that operation, I try to open() a file on the main filesystem
> regularly.
>
> My problem is that open() takes far too long to fail
> (~0.3s), which leads the application to lose data.
>
> So my question is: is there a fast way to know (ioctl, or whatever) whether
> an NFS server is available or in timeout mode?

in general there is no way for a client to indicate to an application
that it is no longer in touch with a server.

> For info: I am running Red Hat 9.0 with a hand-compiled 2.6.1 kernel.
> The filesystem is mounted with the options nfsvers=3 soft timeo=1 retrans=1
>
> (Soft mode because I don't want my process to hang when the
> server is stuck, and timeo, retrans = 1 in order to reduce the timeout
> as much as possible. Tell me if I'm wrong...)

if you are using UDP, then you don't need the timeo= option at all.
the read and write retransmit timeout is set by the RPC client, and
usually is much faster than a tenth of a second.

if you are using TCP, then timeo=1 is also not advisable. this will
cause the RPC and TCP layers to generate competing retransmissions,
which wastes resources.

retrans=1 with soft is an open invitation for silent data corruption.
i highly encourage you not to do this.

you really don't want soft either. rather, using "hard" instead will
guarantee that the client will continue to retry your writes until the
server is back up. no data will be lost unless the client crashes.
if you write() but don't flush() the file, your application should not
hang during a normal write() until the client has filled its memory.

if your file server is so unreliable that this is even an issue, then
you have a problem with your server, and not in your application. if
you are concerned about the application becoming unresponsive, then
you should consider using threads or nonblocking I/O.
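
One way to read the threads suggestion, sketched in Python under stated assumptions: a background thread performs the potentially slow open() and publishes the result through a flag, so the recording loop only ever reads the flag. The class name `MountProbe`, its methods, and the probe path are all hypothetical and not part of any NFS API.

```python
import os
import threading


class MountProbe:
    """Probe a path from a background thread and publish the result."""

    def __init__(self, probe_path, interval=1.0):
        self.probe_path = probe_path
        self.interval = interval  # seconds to wait between probes
        self._available = threading.Event()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()

    def is_available(self):
        # Never blocks: it only reads a flag set by the probe thread.
        return self._available.is_set()

    def probe_once(self):
        """Try one open(); update and return the availability flag."""
        try:
            fd = os.open(self.probe_path, os.O_RDONLY)
        except OSError:
            self._available.clear()
            return False
        os.close(fd)
        self._available.set()
        return True

    def _run(self):
        while not self._stop.is_set():
            self.probe_once()  # may be slow; only this thread waits
            self._stop.wait(self.interval)
```

The recording loop would call `is_available()` before each write and keep writing locally while it returns False. Note that on a hard mount the probe thread itself may stay blocked for as long as the server is down; the flag simply remains unset during that time.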



2004-01-23 15:26:25

by Perceval Anichini

Subject: RE: NFS : open is failing too slowly...

First of all, thanks for your answer !

> in general there is no way for a client to indicate to an application
> that it is no longer in touch with a server.
Ok.

> if you are using UDP, then you don't need the timeo= option at all.
> the read and write retransmit timeout is set by the RPC client, and
> usually is much faster than a tenth of a second.
I am indeed using UDP.

Thanks for the information. What misled me was the "man nfs"
page, which says the default value is 7 tenths of a second.

> if you are using TCP, then timeo=1 is also not advisable. this will
> cause the RPC and TCP layers to generate competing retransmissions,
> which wastes resources.
Ok.

> retrans=1 with soft is an open invitation for silent data corruption.
> i highly encourage you not to do this.
Ok. I'll put back retrans=3.

> you really don't want soft either. rather, using "hard" instead will
> guarantee that the client will continue to retry your writes until the
> server is back up. no data will be lost unless the client crashes.
> if you write() but don't flush() the file, your application should not
> hang during a normal write() until the client has filled its memory.
That can happen really fast, as I'm recording a bunch of video streams
encoded in MPEG-2 :) That's why I really want to switch to the local hard
drive.

> if your file server is so unreliable that this is even an issue, then
> you have a problem with your server, and not in your application. if
> you are concerned about the application becoming unresponsive, then
> you should consider using threads or nonblocking I/O.
The NFS server is really reliable (according to your email, thanks).
The problem is that I don't want to lose data!!! (Even if an outage occurs only once.)
So if the server has to shut down (even for a few minutes), I must be
able to continue the recording.


