2002-07-17 23:58:08

by Pete Zaitcev

Subject: Re: close return value

>> From: David S. Miller <[email protected]>

>> From: Alan Cox <[email protected]>
>> Date: 17 Jul 2002 02:35:41 +0100
>>
>> Our NFS can return errors from close().
>>
>> Better tell Linus.
>
> Oh, Linus knows. In fact, Linus wrote some of the code in question.
>
> But the thing is, Linus doesn't want to have people have the same issues
> with local filesystems. I _know_ there are broken applications that do
> not test the error return from close(), and I think it is a politeness
> issue to return error codes that you can know about as soon as humanly
> possible.

> For NFS, you simply cannot do any reasonable performance without doing
> deferred error reporting. The same isn't true of other filesystems.
> Even in the presense of delayed block allocation, a local filesystem can
> _reserve_ the blocks early, and has no excuse for giving errors late
> (except, of course, for actual IO errors).

I really hate to disagree with the chief penguin here, but
it's extremely dumb to return errors from close(). The last
time we trashed this issue on this list was when a newbie used
an error return from release() to communicate with his driver.

The problem with errors from close() is that NOTHING SMART can be
done by the application when it receives it. An application can:

a) print a message "Your data are lost, have a nice day\n".
b) loop retrying close() until it works.
c) do (a) then (b).

The thing about (b) is that the kernel can do it much better.
Another thing proponents of errors from close() had better ask themselves
is whether the file descriptor stays open or closed when close() abends.
If it remains open, your exit() is bust. If it is closed, you
cannot retry as in (b).
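
To make that concrete, here is a rough sketch (made up for
illustration, assuming plain POSIX; close_retrying() is not a real
helper) of what option (b) amounts to, with the ambiguity spelled
out in the comments:

    /* Sketch only: option (b), retrying close() until it "works". */
    #include <errno.h>
    #include <unistd.h>

    static int close_retrying(int fd)
    {
        for (;;) {
            if (close(fd) == 0)
                return 0;
            if (errno != EINTR)
                return -1;      /* give up and let the caller do (a) */
            /*
             * Here is the ambiguity: after a failed close() it is not
             * defined whether fd is still open.  If it has already been
             * torn down, retrying may close a descriptor that some other
             * part of the program has since been handed.
             */
        }
    }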

-- Pete


2002-07-18 00:08:06

by Thunder from the hill

Subject: Re: close return value

Hi,

On Wed, 17 Jul 2002, Pete Zaitcev wrote:
> The problem with errors from close() is that NOTHING SMART can be
> done by the application when it receives it. An application can:
>
> a) print a message "Your data are lost, have a nice day\n".
> b) loop retrying close() until it works.
> c) do (a) then (b).

(a) is much saner than silently losing data.

Regards,
Thunder
--
(Use http://www.ebb.org/ungeek if you can't decode)
------BEGIN GEEK CODE BLOCK------
Version: 3.12
GCS/E/G/S/AT d- s++:-- a? C++$ ULAVHI++++$ P++$ L++++(+++++)$ E W-$
N--- o? K? w-- O- M V$ PS+ PE- Y- PGP+ t+ 5+ X+ R- !tv b++ DI? !D G
e++++ h* r--- y-
------END GEEK CODE BLOCK------

2002-07-18 20:06:52

by Hildo.Biersma

Subject: Re: close return value

>>>>> "Pete" == Pete Zaitcev <[email protected]> writes:

Pete> I really hate to disagree with the chief penguin here, but it's
Pete> extremely dumb to return errors from close(). The last time we
Pete> trashed this issue on this list was when a newbie used an error
Pete> return from release() to communicate with his driver.

Pete> The problem with errors from close() is that NOTHING SMART can be
Pete> done by the application when it receives it. An application can:

Pete> a) print a message "Your data are lost, have a nice day\n".
Pete> b) loop retrying close() until it works.
Pete> c) do (a) then (b).

I must disagree with you. We run the Andrew File System (AFS), which
has client-side caching with write-on-close semantics. If an error
occurs at close() time, a well-written application can
actually do something useful - such as sending an alert, or letting
the user know the action failed.

2002-07-18 23:52:03

by Pete Zaitcev

Subject: Re: close return value

> Date: Thu, 18 Jul 2002 16:09:51 -0400 (EDT)
> From: [email protected]

> Pete> The problem with errors from close() is that NOTHING SMART can be
> Pete> done by the application when it receives it. An application can:
>
> Pete> a) print a message "Your data are lost, have a nice day\n".
> Pete> b) loop retrying close() until it works.
> Pete> c) do (a) then (b).
>
> I must disagree with you. We run the Andrew File System (AFS), which
> has client-side caching with write-on-close semantics. If an error
> occurs at close() time, a well-written application can
> actually do something useful - such as sending an alert, or letting
> the user know the action failed.

The above is an example of an application covering up for
a filesystem that breaks the general expectations for the
operating environment. Remember your precursor with a broken
driver who received his beating a couple of months ago.
He also had an application which processed his errors from
close() just fine. A workaround can be done in every specific
instance, but it does not make this practice any smarter.

What AFS designers should have done if they had a brain larger
than a pea was:

1. Make close() block indefinitely, retrying writes.
Allow overlapping writes, and let clients sort it out.
2. Provide an ioctl to flush writes before close(), or
make fsync() work right. Your "smart" applications would have
had to use that, so that no ambiguity existed between tearing down
the descriptor and writing out the data.

This way, naive applications such as cat and cc would
continue to work. There is no reason to penalize them just
because some application _could_ possibly post idiotic alerts
(Abort, Retry, Fail).

-- Pete

2002-07-19 11:28:54

by Hildo.Biersma

Subject: Re: close return value

>>>>> "Pete" == Pete Zaitcev <[email protected]> writes:

>> Date: Thu, 18 Jul 2002 16:09:51 -0400 (EDT)
>> From: [email protected]

Pete> The problem with errors from close() is that NOTHING SMART can be
Pete> done by the application when it receives it. An application can:
>>
Pete> a) print a message "Your data are lost, have a nice day\n".
Pete> b) loop retrying close() until it works.
Pete> c) do (a) then (b).
>>
>> I must disagree with you. We run the Andrew File System (AFS), which
>> has client-side caching with write-on-close semantics. If an error
>> occurs at close() time, a well-written application can
>> actually do something useful - such as sending an alert, or letting
>> the user know the action failed.

Pete> The above is an example of an application covering up for
Pete> a filesystem that breaks the general expectations for the
Pete> operating environment. Remember your precursor with a broken
Pete> driver who received his beating a couple of months ago.
Pete> He also had an application which processed his errors from
Pete> close() just fine. A workaround can be done in every specific
Pete> instance, but it does not make this practice any smarter.

I agree in general, but you should realize that there are valid
reasons why Unix filesystem semantics are sometimes violated.

We have slightly over 8,000 Unix hosts using the same networked
filesystem against the same set of file-servers. This is only
feasible if you minimize the number of client<->server interactions.

This is done in two ways:
- persistent (disk-based) client-side caching, where the server will
  let a client know if a file is updated and needs to be evicted from
  the client's cache
- write-on-close semantics for files

Pete> What AFS designers should have done if they had a brain larger
Pete> than a pea was:

Pete> 1. Make close() block indefinitely, retrying writes.
Pete> Allow overlapping writes, and let clients sort it out.

None of these things work: access may be denied, a volume may be
taken off-line, and having overlapping writes from clients increases
the amount of client<->server interaction.

Pete> 2. Provide an ioctl to flush writes before close(), or
Pete> make fsync() work right. Your "smart" applications would have
Pete> had to use that, so that no ambiguity existed between tearing down
Pete> the descriptor and writing out the data.

This is provided - sync, fsync, msync all work.

Pete> This way, naive applications such as cat and cc would
Pete> continue to work. There is no reason to penalize them just
Pete> because some application _could_ possibly post idiotic alerts
Pete> (Abort, Retry, Fail).

That's where the trade-offs come in. The AFS designers found that
relaxing the Unix filesystem semantics vastly improves scalability.

Many of the high-performance filesystems (not XFS, the _really_
high-performance filesystems) that you run on supercomputers also
violate Unix semantics in various ways. Yes, that breaks naïve
apps, but that trade-off is generally accepted.

2002-07-19 16:13:06

by Pete Zaitcev

Subject: Re: close return value

> Date: Fri, 19 Jul 2002 07:31:54 -0400 (EDT)
> From: [email protected]

> Pete> 1. Make close() block indefinitely, retrying writes.
> Pete> Allow overlapping writes, and let clients sort it out.
>
> None of these things work: access may be denied, a volume may be
> taken off-line, and having overlapping writes from clients increases
> the amount of client<->server interaction.
>
> Pete> 2. Provide an ioctl to flush writes before close(), or
> Pete> make fsync() work right. Your "smart" applications would have
> Pete> had to use that, so that no ambiguity existed between tearing down
> Pete> the descriptor and writing out the data.
>
> This is provided - sync, fsync, msync all work.

It is unfair of you to separate 1. and 2. They should work
together. Remember, you said "returning an error from close() is
useful BECAUSE my smart application may deal with it."
If fsync() works, the argument does not hold water at all.
Your smart application can do fsync() just as easily.
If it does, it does not need the return code from close().
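
Roughly, and only as a sketch with a made-up name, assuming nothing
beyond POSIX fsync()/close():

    /* Sketch: flush explicitly, so the write error surfaces at fsync()
     * and close() is left as plain descriptor teardown. */
    #include <stdio.h>
    #include <unistd.h>

    static int write_out_and_close(int fd)
    {
        int err = 0;

        if (fsync(fd) < 0) {    /* any deferred write error shows up here */
            perror("fsync");
            err = -1;
        }
        if (close(fd) < 0)      /* by now this only tears down the descriptor */
            err = -1;
        return err;
    }

Checking the fsync() return is the part a "smart" application cannot
skip anyway.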

> That's where the trade-offs come in. The AFS designers found that
> relaxing the Unix filesystem semantics vastly improves scalability.

I know about the improvements. They are applicable to NFS too.
What I am trying to tell you is that there was NO reason to break
close in particular. Even on ancient AIXes without fsync they
could have used an ioctl.

-- Pete

2002-07-23 22:22:01

by Bill Davidsen

Subject: Re: close return value

On Thu, 18 Jul 2002, Pete Zaitcev wrote:


> 1. Make close() block indefinitely, retrying writes.

We went through this with sync() a while ago. You don't want things to
loop forever. That's what status returns are for; if the program wants to
retry, it can. Consider the f/s being out of space: the write can't work,
the process can't die, the f/s can't unmount because there's I/O in
progress, and the system can't shut down cleanly.

Let the program handle the problems, and decide what to retry.
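
As a sketch (made up here, not anybody's real code), this is all a
cat-like utility has to do with that status return:

    /* Sketch: report the close() status and let whoever ran the program
     * decide what, if anything, to retry. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        /* ... copy the data to stdout ... */
        if (close(STDOUT_FILENO) < 0) {
            perror("close");
            return EXIT_FAILURE;    /* the shell or make sees this and can act */
        }
        return EXIT_SUCCESS;
    }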

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
for (;;) exit(0);