2002-07-17 01:05:29

by Linus Torvalds

[permalink] [raw]
Subject: Re: close return value

In article <[email protected]>,
David S. Miller <[email protected]> wrote:
> From: Alan Cox <[email protected]>
> Date: 17 Jul 2002 02:35:41 +0100
>
> Our NFS can return errors from close().
>
>Better tell Linus.

Oh, Linus knows. In fact, Linus wrote some of the code in question.

But the thing is, Linus doesn't want to have people have the same issues
with local filesystems. I _know_ there are broken applications that do
not test the error return from close(), and I think it is a politeness
issue to return error codes that you can know about as soon as humanly
possible.

For NFS, you simply cannot do any reasonable performance without doing
deferred error reporting. The same isn't true of other filesystems.
Even in the presense of delayed block allocation, a local filesystem can
_reserve_ the blocks early, and has no excuse for giving errors late
(except, of course, for actual IO errors).

Linus


2002-07-17 01:11:55

by David Miller

[permalink] [raw]
Subject: Re: close return value

From: [email protected] (Linus Torvalds)
Date: Wed, 17 Jul 2002 01:05:00 +0000 (UTC)

In article <[email protected]>,
David S. Miller <[email protected]> wrote:
>Better tell Linus.

Oh, Linus knows. In fact, Linus wrote some of the code in question.

Ok, I think the issue here is different.

Several years ago we were returning -EAGAIN from close() via NFS and
that is what caused the problems.

2002-07-17 01:19:21

by Linus Torvalds

[permalink] [raw]
Subject: Re: close return value



On Tue, 16 Jul 2002, David S. Miller wrote:
>
> Oh, Linus knows. In fact, Linus wrote some of the code in question.
>
> Ok, I think the issue here is different.
>
> Several years ago we were returning -EAGAIN from close() via NFS and
> that is what caused the problems.

Oh.

Yes, EAGAIN doesn't really work as a close return value, simply because
_nobody_ expects that (and leaving the file descriptor open after a
close() is definitely unexpected, ie people can very validly complain
about buggy behaviour).

Linus

2002-07-17 11:48:32

by Matthias Andree

[permalink] [raw]
Subject: Re: close return value

On Tue, 16 Jul 2002, Linus Torvalds wrote:

> Yes, EAGAIN doesn't really work as a close return value, simply because
> _nobody_ expects that (and leaving the file descriptor open after a
> close() is definitely unexpected, ie people can very validly complain
> about buggy behaviour).

non-issue, since EAGAIN would violates the specs that don't list EGAIN
(and EAGAIN in response does not make sense either, the kernel should
then try harder to get the I/O completed).

--
Matthias Andree

2002-07-17 17:20:54

by Andries Brouwer

[permalink] [raw]
Subject: Re: close return value

On Wed, Jul 17, 2002 at 01:51:25PM +0200, Matthias Andree wrote:

> non-issue, since EAGAIN would violates the specs that don't list EGAIN

"Implementations may support additional errors not included in this
list, may generate errors included in this list under circumstances
other than those described here, or may contain extensions or
limitations that prevent some errors from occurring. The ERRORS
section on each reference page specifies whether an error shall be
returned, or whether it may be returned. Implementations shall not
generate a different error number from the ones described here for
error conditions described in this volume of IEEE Std 1003.1-2001, but
may generate additional errors unless explicitly disallowed for a
particular function."


Not listing an error in the spec does not mean it cannot occur.
Especially EFAULT is not usually listed.

Andries

2002-07-20 07:56:57

by Florian Weimer

[permalink] [raw]
Subject: Re: close return value

Linus Torvalds <[email protected]> writes:

> Yes, EAGAIN doesn't really work as a close return value, simply because
> _nobody_ expects that (and leaving the file descriptor open after a
> close() is definitely unexpected, ie people can very validly complain
> about buggy behaviour).

Returning an error and still doing the operation is slightly awkward.
Are there any other syscalls which do similar things?

Of course, a significant portion of TCP related code would leak
descriptors like hell if the behavior of close() ischanged (there are
quite a few protocols which do not avoid race conditions resulting in
ECONNRESET connection teardown).

--
Florian Weimer [email protected]
University of Stuttgart http://CERT.Uni-Stuttgart.DE/people/fw/
RUS-CERT fax +49-711-685-5898

2002-07-20 16:41:59

by Linus Torvalds

[permalink] [raw]
Subject: Re: close return value



On Sat, 20 Jul 2002, Florian Weimer wrote:
>
> Returning an error and still doing the operation is slightly awkward.
> Are there any other syscalls which do similar things?

mmap(MAP_FIXED) may have already unmapped any underlying old area if an
error occurs.

And EFAULT may have strange behaviour for left-over stuff. If I remember
correctly, at some point, for example, EFAULT on a write to a TCP socket
(if the fault happened in the middle) would still send out the full-sized
packet zero-padded, because not doing so would have screwed up the state
machine.

(But EFAULT is a special case in general, it's documented to be undefined
behaviour).

I can't think of any others, but at least close() isn't _completely_
alone.

And as you say, we really cannot change it anyway, even if we wanted to
(which I'm personally convinced we do not).

Linus

2002-07-26 09:17:39

by Pavel Machek

[permalink] [raw]
Subject: EFAULT vs. SIGSEGV [was Re: close return value]

Hi!

> > Returning an error and still doing the operation is slightly awkward.
> > Are there any other syscalls which do similar things?
>
> mmap(MAP_FIXED) may have already unmapped any underlying old area if an
> error occurs.
>
> And EFAULT may have strange behaviour for left-over stuff. If I remember
> correctly, at some point, for example, EFAULT on a write to a TCP socket
> (if the fault happened in the middle) would still send out the full-sized
> packet zero-padded, because not doing so would have screwed up the state
> machine.
>
> (But EFAULT is a special case in general, it's documented to be undefined
> behaviour).

SOme time ago you said you'd agree to doing SIGSEGV in addition to
segfault. What about following patch? It should make difference
between VSYSCALL and normal syscall smaller...

Pavel

--- clean.2.5/arch/i386/mm/fault.c Thu Jul 25 22:21:08 2002
+++ linux/arch/i386/mm/fault.c Thu Jul 25 22:21:24 2002
@@ -305,6 +305,15 @@
no_context:
/* Are we prepared to handle this kernel fault? */
if ((fixup = search_exception_table(regs->eip)) != 0) {
+ tsk->thread.cr2 = address;
+ tsk->thread.error_code = error_code;
+ tsk->thread.trap_no = 14;
+ info.si_signo = SIGSEGV;
+ info.si_errno = 0;
+ /* info.si_code has been set above */
+ info.si_addr = (void *)address;
+ force_sig_info(SIGSEGV, &info, tsk);
+
regs->eip = fixup;
return;
}

--
I'm [email protected]. "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at [email protected]