2003-05-15 22:07:47

by Jeffrey B. Layton

Subject: Re: network storage solutions

Josip Loncaric wrote:

> Glen Kaukola wrote:
>
>> Greg Lindahl wrote:
>>
>>> There are soft and hard mounts, and there are interruptible mounts
>>> ("intr" -- check out "man nfs").
>>>
>>> A hard mount will never time out. If you make it interruptible, then
>>> the user can choose to ^C. This is the safe option.
>>>
>>>
>>
>> You know, I thought that's how it was supposed to work too. I do use
>> the intr option, but even with that option, when an NFS drive is down
>> and something like a df command gets stuck, hitting ctrl-c doesn't
>> seem to do a thing. All I can ever do is just kill my xterm or
>> whatever.
>
>
> We've had similar problems while I was at ICASE. "Hard" mounts would
> lock up client processes (even unmount) when the NFS server went down,
> but "soft" mounts were "too soft" for some of our users. A reasonable
> solution is to "harden" your soft mounts by insisting on longer major
> timeouts, as in "retrans=15" (the default is 3).


I still think this is dangerous. With soft mounts you can
still get silent data corruption despite the longer timeouts.
Chuck, do you agree?

Jeff

>
>
> Sincerely,
> Josip
>
> P.S. Our NFS servers virtually never went down, except due to
> hardware problems or service, so indefinite retransmissions were
> highly undesirable.
>




-------------------------------------------------------
Enterprise Linux Forum Conference & Expo, June 4-6, 2003, Santa Clara
The only event dedicated to issues related to Linux enterprise solutions
http://www.enterpriselinuxforum.com

_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2003-05-15 22:25:49

by Joe Landman

Subject: RE: Re: network storage solutions

On Thu, 2003-05-15 at 17:01, Lever, Charles wrote:

> > I would like to know that as well. I would like to believe
> > it will not
> > continue with corrupt data, but return an error code/condition which
> > should be handled.
>
> to quote:
>
> If a soft time-out interrupts a write operation, there is no
> guarantee that the file on the server is correct, nor is there
> any guarantee that the client's cached version of the file matches
> what is on the server.

This text isn't in my man page for mount (RH8.0). Thanks for the
pointer. What purpose do the "soft" semantics serve, then? Is this
a "performance" switch that benchmarking folks like?

I used it for remote file backups: mount the file system to be
backed up with the soft and intr options (plus large rsize/wsize and
vers=3), then dump. With the hard option, an error meant machine
crashes. With soft, I could recover.
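A mount invocation along those lines would look roughly like the
following (the server name and paths are illustrative, not from the
original post):

```shell
# Soft, interruptible NFSv3 mount with large transfer sizes, as
# described above. Host and paths are hypothetical.
mount -t nfs -o soft,intr,rsize=32768,wsize=32768,vers=3 \
    fileserver:/export/data /mnt/backup
```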


--
Joseph Landman, Ph.D
Scalable Informatics LLC
email: [email protected]
web: http://scalableinformatics.com
phone: +1 734 612 4615




2003-05-16 06:07:25

by Brian Pawlowski

Subject: Re: Re: network storage solutions

> People use soft mounts for (1) improved performance (you can juice
> up cheap servers by caching data), or (2) prevent hung clients
> in face of unreliable networks and servers (when client is accessing
> many NFS servers).

Skip (1) - that is (a)sync on the server (not soft mounts). I should
think before I type :-)

So, Windows CIFS has dramatic "soft mount"-like behaviour (a popup
like "Delayed writes lost" on session disconnect). Always a pain -
it makes me want hard-mount NFS behaviour.



2003-05-16 13:21:54

by Robert G. Brown

Subject: Re: network storage solutions

On Thu, 15 May 2003, Jeffrey B. Layton wrote:

> > We've had similar problems while I was at ICASE. "Hard" mounts would
> > lock up client processes (even unmount) when the NFS server went down,
> > but "soft" mounts were "too soft" for some of our users. A reasonable
> > solution is to "harden" your soft mounts by insisting on longer major
> > timeouts, as in "retrans=15" (the default is 3).
>
>
> I still think this is dangerous. With soft mounts you can
> still get silent data corruption despite the longer timeouts.
> Chuck, do you agree?
>
> Jeff

Perhaps it is a question of probability, and what people are willing to
accept in terms of data loss in a given environment. It is a
cost-benefit equation, as always, so acceptable solutions do have to at
least examine the cost of a corrupted file against other costs
associated with using hard mounts everywhere.

In one somewhat jaded view, one says "crashes happen, and if a crash
occurs in the middle of a file write there is a distinct chance of
losing that file". This is (I suspect) true anyway for both hard and
soft mounts, depending on the cause of the crash and what has to be done
to fix it. If the exported filesystem is left in an inconsistent state
post-crash and is modified before being re-exported to the clients, they
are likely to see a stale mount and not be able to complete the ongoing
write transaction.

Nowadays (within Linux) it is indeed pretty rare, as Greg noted, for a
client not to recover gracefully from a server crash and reboot,
although I confess to being less lucky -- we still see stale NFS mounts
after certain crashes, and generally plan on being ABLE to reboot all
the clients in the department after any major, planned downtime of our
principal servers. There seems to be a bit of state dependence here --
"most" clients recover, but one or two sometimes seem to hang and need
either a therapeutic reboot or at least a remount to clear some
state-dependent problem.

A question for the experts out there -- does the use of a journalling
filesystem affect the probability of NFS file corruption on a soft
mount? As in, is there any interaction between the journal and the NFS
server that causes an incomplete or corrupted transaction to be
interpreted as cause for invoking some of the protections journalling
provides? I'm just curious... one would think that NFS would effectively
"journal" itself so as to consistently end up in a "reliable" state
(which might well cost one the latest writes to the file!) even on a
soft mount.

The probability and cost-benefit issues are often related to LAN
architecture. In a common architecture, one has a single (or perhaps
2-3) "major server(s)" that have lots of capacity in all dimensions.
This is where users manipulate "critical data" (e.g. home directories,
project directories), and one EXPECTS the LAN to effectively go down
when these servers are down so the mounts should definitely be hard
mounts (although they might well be automounts, so your system isn't
hung if YOUR home directory server stays up). To protect against
anomalous amounts of downtime (which DEFINITELY costs one work at a
fixed rate, compared to the stochastic expectation of loss in the case
of possible data corruption) one makes the servers as reliable as
possible -- they are architected "not to go down" and have things like
four-hour service or hot mirror spares.

In a few cases, as in Greg's example, lots of people with desktop
workstations export workspace and crossmount it all over the place.
Then the issue becomes one of cumulative stability of the workstation
space. Because of the nasty behaviors of e.g. stat, it is quite common
for a system or at least a session to effectively hang when ANY of its
hard mounts go down -- perhaps not to crash, and to recover gracefully
when the offending server comes back up, but in the meantime you can
lose access to your workstation and ability to do work -- a real cost,
potentially multiplied by N on a big network where NOBODY gets to do
work until the workstation is back up. The downed workstation might NOT
be so reliable and might NOT have hot and cold running service and might
stay down a day or more, and a decision might well be made to take
draconian measures to free up the (not really) "hung" clients. This
also can cost work, e.g. work in progress, where a user may have to
choose between not being able to work interactively while their
background task completes or e.g. killing the background task with a
reboot to come back up without the hung mount.

Obviously the "best" solution to this sort of situation is to not put
NFS exports on your path where you can avoid it, and to use the
automounter to effectively reduce the number of mounts that can hang on
a general path stat or df to the unavoidable main (hopefully reliable)
filesystems plus those exported spaces belonging to your buddies that
you actually are using NOW. However, in a small/informal LAN (like a
home network), where the workstations that are providing the mounts
aren't horribly overloaded at either the network or CPU or memory level
(so they aren't at all likely to timeout on an NFS request) and where
the admin either doesn't want to figure out the automounter or just
isn't that concerned about the (low) probability of data corruption, one
might choose as a "quick and dirty" solution to use a soft mount and bet
that data corruption never occurs.
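For a small LAN, the automounter arrangement described above can be
sketched with a pair of autofs maps; the server names, export paths,
and mount point here are hypothetical:

```shell
# /etc/auto.master -- mount entries under /net/work on demand,
# expiring after 60 idle seconds so an unused server going down
# cannot hang every client shell
/net/work  /etc/auto.work  --timeout=60

# /etc/auto.work -- one entry per exported workspace; each is a
# hard,intr mount, but is only held while actually in use
proj1  -rw,hard,intr  fileserver1:/export/proj1
proj2  -rw,hard,intr  fileserver2:/export/proj2
```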

Back a LONG time ago, when NFS recovery on a hard mount was basically
nonexistent (e.g. SunOS, Irix, etc.), I sometimes used soft mounts
on crossmounted workstation spaces, and our LAN (with much slower, much
less powerful client "servers") never knowingly had a problem with data
actually being corrupted -- although files were sometimes lost, leaving
one of those lovely .nfs323112114 tags -- so I'd guess that the
>>probability<< of silent corruption is actually pretty low. On the
other hand, even a soft mount was never really all that recoverable
either -- NFS just plain had a way of stubbornly hanging whenever a
server went down, no matter what.

It proved smarter, more cost-beneficial, and more professional in the
long run (in a production LAN environment, with real costs associated
with EVERYTHING) to consolidate exported space, including e.g. project
space, into a very few, very reliable servers, period, and just not LET
"everybody mount everybody else", soft OR hard. Um, so to speak;-) I AM
talking about networking here, after all...

In summary, while soft mounts exist(ed) "for a reason", they never
worked terribly well or reliably in the past, still don't, and the
reasons for using them have mostly passed on. There are better ways
to cope with the cost/benefit dilemma between de facto hung workstations
and possibly lost/corrupt data. The vast improvements in automounters
(which back in those same old days sucked incredibly and were as likely
to produce problems as to solve them:-) make automounters with hard
mounts, from a few, reliable, consolidated servers, by far the
preferable solution to the problem. The fact is that most users of
single-user workstations are most unlikely to have more than one or two
automount directories mounted at any one time (within the mount timeout
window) simply because they will typically be "working" at one path
location or perhaps two at a time.

Server consolidation also makes it MUCH easier to back things up,
another "chronic" problem in cowboy networks where there otherwise would
be project directories on fifty workstations, most of them with
relatively unreliable IDE disks, every one being used by somebody that
would whine or bluster and threaten if their data went away upon the
crash of their cheap, three year old disk. Then there are the "control
and security issues" -- NFS is a bleeding wound as far as security is
concerned anyway (or at least has been historically) and all those
crossmounts on private workstations offer a cracker or evil employee
numerous opportunities to be naughty. True, one generally keeps a
sucker rod handy to school the latter, but cleaning it afterwards is
such a mess. Workstations just aren't architected (without effort and
additional expense) to be good, secure, reliable servers in a LAN
serving hundreds of clients.

The "best" solution for "most" LAN architectures is thus to automount
basically everything but the home directories or other "critical"
filesystems mounted from a few reliable servers -- maybe even automount
the home directories (if you have more than one home server, e.g.)!
That way a desktop client system doesn't (generally) "hang"
(recoverably, but hang nonetheless as far as the user is concerned) or
become otherwise difficult to work with if a non-critical or currently
unused server dies -- at most one might lose a tty window when one tries
to access an automount, but if one keeps the automounts off of one's
path then the path stat won't hang (almost) every shell transaction.
Administrative control is concentrated in a relatively few points of
failure and systems to secure and back up, and one gets data reliability
and protection against loss of work time and access at the same time.

rgb

--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:[email protected]






2003-05-16 16:18:28

by Lever, Charles

Subject: RE: Re: network storage solutions



> -----Original Message-----
> From: Joseph Landman [mailto:[email protected]]
> Sent: Thursday, May 15, 2003 6:26 PM
> To: Lever, Charles
> Cc: Beowulf; [email protected]
> Subject: RE: [NFS] Re: network storage solutions
>

[ ... snip ... ]

> I used it for remote file backup purposes, mount the file system to be
> backed up with the soft, intr options (and large rsize/wsize, and
> vers=3), and then dump. With the hard option, when there was
> an error, I had machine crashes. With soft, I could recover.

caveat: using a soft mount for backups increases the probability
that your backups will be unusable for all the reasons we've
discussed.

if an error caused a machine crash, then there are bugs that should
be fixed.



2003-05-15 21:01:53

by Lever, Charles

Subject: RE: Re: network storage solutions

> > Since we use our cluster for production work (please, I'm
> > not trying to offend anyone), we HAVE to have non-corrupted
> > data. This is why we use hard mounts with 'sync' as well as
> > a few other options. The URL above to Chuck's paper has
> > several examples of "good" mount options.
>
> Hmmm. I am reasonably sure that when the IO system returns
> an error, it does in fact get propagated to the appropriate
> user-land calling program.

yes, that's true.

> The program then makes the determination as to whether or not
> to continue. There are quite a few programs that rarely inspect
> return codes from file operations.

not all programs are well-written. most, in fact, do not
assume that ESTALE can happen, nor were they designed to
work with NFS locking (they may use dot-file locking, for
example, which doesn't work on NFS). most programs assume
that they run on a local disk-based file system, and were
never tested over NFS.
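The corollary for anything that does run over NFS is to check every
return code, since errors like ESTALE or a soft-mount timeout surface
only there. A minimal sketch of the pattern follows; local temp paths
stand in for an NFS mount so that the example is self-contained:

```shell
#!/bin/sh
# Sketch: treat every file operation as fallible, as NFS demands.
# DST would sit on an NFS mount in real use; mktemp paths are used
# here only to keep the example runnable anywhere.
SRC=$(mktemp)
DST=$(mktemp -d)
printf 'payload\n' > "$SRC"

# Check cp's exit status -- a soft-mount timeout or ESTALE would
# show up here as a nonzero status.
if ! cp "$SRC" "$DST/copy"; then
    echo "copy failed" >&2
    exit 1
fi

# sync flushes cached writes; over NFS, a deferred write error may
# only be reported at flush/close time, so check it as well.
if ! sync; then
    echo "flush failed" >&2
    exit 1
fi
echo "backup ok"
```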

NFS systems are becoming more and more reliable all the
time, and that means that some corner-case behaviors are
not well tested, or not tested at all, because they
rarely occur any more.

> > > The way I and others who use soft mounts view it, data
> > > lossage occurs when the server crashes, as you cannot
> > > guarantee (except with sync) that the data was committed
> > > to disk.
> > >
>
> > However, if I read Chuck's paper correctly, with soft mount
> > you can get a soft time-out that can interrupt an operation
> > but the client will then continue with corrupted data. Am I
> > understanding this correctly? Therefore, the clients may be
> > up, but now the data is corrupt and the application doesn't
> > know it.
>
> I would like to know that as well. I would like to believe
> it will not continue with corrupt data, but return an error
> code/condition which should be handled.

to quote:

If a soft time-out interrupts a write operation, there is no
guarantee that the file on the server is correct, nor is there
any guarantee that the client's cached version of the file matches
what is on the server.

in the face of a soft timeout, there is no way to synchronize
write requests remaining on the client with what exists on the
server.

if a soft timeout occurs during a read operation, the client
should purge its data cache, but i'm not sure it does that
today. if there are pending write requests on the client
when a read timeout occurs, there is still the problem of
what to do with the waiting writes.

if you absolutely must use soft mounts, you can reduce the
probability of data corruption by making sure that:

+ you have specified a large number of retransmissions on
your mount command line: retrans=10 for example

+ you specify a conservative retransmission timeout:
timeo=7 or more for UDP, and timeo=600 for TCP

+ you use NFS over TCP if possible

+ on Linux clients, you specify rsize and wsize larger
than the client's page size to ensure the client
generates NFS operations asynchronously. synchronous
operations (especially a lot of synchronous small
writes or directory operations) will slow the server
down and be more likely to trigger a timeout.
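Pulling those suggestions together, a "hardened" soft-mount command
line might look like this (the host and paths are hypothetical):

```shell
# Soft mount over TCP with many retransmissions, a conservative
# timeout, and large transfer sizes, per the checklist above.
mount -t nfs -o soft,proto=tcp,retrans=10,timeo=600,rsize=32768,wsize=32768 \
    fileserver:/export/data /mnt/data
```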

> > I'm not sure... If the server crashes, I think this is true.
> > But what if you get an interrupt? Soft mounts will allow
> > the application to continue with corrupted data while hard
> > mounts will produce an error, but not corrupt data (I think).
>
> I hope not. The programs that I send an INTR to on an NFS
> system (with the intr flag allowed) seem to accept the signal
> and die. I guess the question here is: what should be the state
> of the filesystem upon acceptance of that signal? Can you assume
> it is in a known state?

usually interrupting an application waiting on a file system
mounted with "intr" is harmless, but there is also a probability
of corruption when using "intr," although it is smaller than
when using "soft."

for databases, i recommend "hard,nointr." that is the safest
combination.
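As an /etc/fstab entry, that recommendation would read as follows
(the server, export, and mount point are hypothetical):

```shell
# /etc/fstab -- "hard,nointr" is the safest combination for
# database files, per the recommendation above
fileserver:/export/db  /var/db  nfs  hard,nointr  0  0
```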

