LinuxLists.cc - Re: close return value

2002-07-18 14:39:31

Subject: Re: close return value

Pete Zaitcev <[email protected]> writes:

> The problem with errors from close() is that NOTHING SMART can be
> done by the application when it receives it.

This is like saying "nothing smart" can be done when write() returns
ENOSPC. Such statements are either trivially true or blatantly false,
depending on what you mean by "smart".

Failures happen. They can happen on write(), they can happen on
close(), and they can happen on any system call for which the API
allows it. There is no difference! Your application either deals
with them and is correct or fails to deal with them and is broken.

If the API allows an error return, you *must* check for it, period.
This includes "impossible" errors. You may think it is impossible for
gettimeofday() to return an error in some case, but if it ever did,
you should darn well want to know about it right away.

If you are that convinced that close() can not return an error in your
particular application (e.g., because you "know" you are using a local
disk, or the file descriptor is read-only), then treat such errors
like assertion failures. Because that is what they are.

Checking system calls for errors, always, is fundamental to writing
reliable code. Failing to check them is shoddy and amateurish
programming. It is amazing that so many people would argue this
point. Then again, maybe not, given how bad most software is...

- Pat

2002-07-18 15:09:49

by Richard B. Johnson

[permalink] [raw]

Subject: Re: close return value

On 18 Jul 2002, Patrick J. LoPresti wrote:

> Pete Zaitcev <[email protected]> writes:
>
> > The problem with errors from close() is that NOTHING SMART can be
> > done by the application when it receives it.
>
> This is like saying "nothing smart" can be done when write() returns
> ENOSPC. Such statements are either trivially true or blatantly false,
> depending on what you mean by "smart".
>
> Failures happen. They can happen on write(), they can happen on
> close(), and they can happen on any system call for which the API
> allows it. There is no difference! Your application either deals
> with them and is correct or fails to deal with them and is broken.
>
> If the API allows an error return, you *must* check for it, period.
[SNIPPED..]

Well no. Many procedures are called for effect. When is the last
time you checked the return-value of printf() or puts()? If your
code does this it's wasting CPU cycles.

When it is necessary to perform code reviews, because your company
does FDA or some similar critical software, then you show that
you know you are ignoring a return value by casting it to void.
This shows that the writer knew that he or she was deliberately
ignoring a return-value.

In the specific close(fd) function, my reading of the man page
on this system says that it can only return an error of EBADF
on Linux. Which means that if you make Linux-only code, you
can ignore any error because the fd has become invalid somehow
and subsequent attempts to close with the same fd will surely
fail in the exact same way.

But most systems can return -1 and have an error code of EINTR
(interrupted system call) on any system call. Also, deferred
writing, such as happens in network file-systems, may not return
an error during the write. Such systems are supposed to return
an error during a later call that uses the same file descriptor.
If that call is a close(), then you may get an error. I don't
know what you do under those circumstances, but at the very least,
somebody/something should 'know' that the network write didn't
go as planned.

Cheers,
Dick Johnson

Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).

Windows-2000/Professional isn't.

2002-07-18 23:44:39

by Albert D. Cahalan

[permalink] [raw]

Subject: Re: close return value

Patrick J. LoPrest writes:

> Failures happen. They can happen on write(), they can happen on
> close(), and they can happen on any system call for which the API
> allows it. There is no difference! Your application either deals
> with them and is correct or fails to deal with them and is broken.
>
> If the API allows an error return, you *must* check for it, period.
> This includes "impossible" errors. You may think it is impossible for
> gettimeofday() to return an error in some case, but if it ever did,
> you should darn well want to know about it right away.
>
> If you are that convinced that close() can not return an error in your
> particular application (e.g., because you "know" you are using a local
> disk, or the file descriptor is read-only), then treat such errors
> like assertion failures. Because that is what they are.
>
> Checking system calls for errors, always, is fundamental to writing
> reliable code. Failing to check them is shoddy and amateurish
> programming. It is amazing that so many people would argue this
> point. Then again, maybe not, given how bad most software is...

You check printf() and fprintf() then? Like this?

///////////////////////////////////////////
void err_print(int err){
const char *msg;
int rc;

msg = strerror(err);
if(!msg) err_print(errno);

do{
rc = fprintf(stderr,"Problem: %s\n",msg);
}while(rc<0 && errno==EINTR);
if(rc<0) err_print(errno);
}
///////////////////////////////////////////

Get off your high horse.

2002-07-19 16:09:23

by Patrick J. LoPresti

[permalink] [raw]

Subject: Re: close return value

"Albert D. Cahalan" <[email protected]> writes:

> You check printf() and fprintf() then? Like this?
>
> ///////////////////////////////////////////
> void err_print(int err){
> const char *msg;
> int rc;
>
> msg = strerror(err);
> if(!msg) err_print(errno);
>
> do{
> rc = fprintf(stderr,"Problem: %s\n",msg);
> }while(rc<0 && errno==EINTR);
> if(rc<0) err_print(errno);
> }
> ///////////////////////////////////////////

Wow, I hardly know where to begin.

I could point out that, at least according to my man page, fprintf()
returns the number of characters printed; it tells you nothing about
errors. Also, fprintf() is a library funciton, not a system call, so
you cannot expect it to put anything meaningful in errno. (I am not
sure whether these mistakes were part of your sarcasm or your
ignorance.)

Or I could ask, what part of "assertion failure" did you not
understand? Yes, the code above is idiotic. But checking that
fprintf() did not return zero, and calling abort() otherwise, is often
the right thing to do.

Yes, I exaggerated. There are times when you can reasonably skip
checking a system call for errors; namely, when you have coded
defensively enough that any error can do no harm. If you can show
that the rest of your program operates correctly whether the call
succeeded or not, then you can skip the error check.

But my main point still holds: You should *not* skip error checks
because you "know" that the error is "impossible". It takes little
experience with real-world systems to learn that the "impossible"
happens with alarming frequency. And when it does, aborting
immediately is much better than proceeding, because your subsequent
code is unpredictable and therefore dangerous when your assumptions
have been violated.

Once you have taken the hit of making a system call, the additional
cost of checking the return value is irrelevant. So do yourself and
your users a favor and add the checks.

> Get off your high horse.

Actually, I would rather give others a lift to join me. The view is
pretty good from up here.

- Pat