Here goes my liitle patchy, once again :o)
Whats it's about?
When close()ing an charcter device one expects the return value of the
charcter drivers release() call to be forwarded to the close() called in
userspace. However thats not the case, the kernel swallows the release()
value, and always returns 0 to the userspace's close(). (tha char drivers
release() function is called in fput() as it would have a void return value)
It may sound weired at first but there are actually device drivers than can
fail on close(), in my case it's a driver to program a LCA, the userspace
application signals end of data by closing the device, the driver finalizes
the download, and the LCA reports if it has accepted it's new program, if not
close() should return a non-zero value, indicating the operation did not
complete successfully.
----
diff -r -u linux-2.4.18-orig/fs/file_table.c linux-2.4.18/fs/file_table.c
--- linux-2.4.18-orig/fs/file_table.c Mon Sep 17 22:16:30 2001
+++ linux-2.4.18/fs/file_table.c Wed Mar 20 16:35:34 2002
@@ -97,11 +97,12 @@
return 0;
}
-void fput(struct file * file)
+int fput(struct file * file)
{
struct dentry * dentry = file->f_dentry;
struct vfsmount * mnt = file->f_vfsmnt;
struct inode * inode = dentry->d_inode;
+ int retval = 0;
if (atomic_dec_and_test(&file->f_count)) {
locks_remove_flock(file);
@@ -110,7 +111,7 @@
free_kiovec(1, &file->f_iobuf);
if (file->f_op && file->f_op->release)
- file->f_op->release(inode, file);
+ retval = file->f_op->release(inode, file);
fops_put(file->f_op);
if (file->f_mode & FMODE_WRITE)
put_write_access(inode);
@@ -124,6 +125,7 @@
dput(dentry);
mntput(mnt);
}
+ return retval;
}
struct file * fget(unsigned int fd)
diff -r -u linux-2.4.18-orig/fs/open.c linux-2.4.18/fs/open.c
--- linux-2.4.18-orig/fs/open.c Fri Oct 12 22:48:42 2001
+++ linux-2.4.18/fs/open.c Wed Mar 20 16:34:12 2002
@@ -835,7 +835,10 @@
}
fcntl_dirnotify(0, filp, 0);
locks_remove_posix(filp, id);
- fput(filp);
+ if (retval == 0)
+ retval = fput(filp);
+ else
+ fput(filp);
return retval;
}
diff -r -u linux-2.4.18-orig/include/linux/file.h
linux-2.4.18/include/linux/file.h
--- linux-2.4.18-orig/include/linux/file.h Wed Aug 23 20:22:26 2000
+++ linux-2.4.18/include/linux/file.h Wed Mar 20 16:32:36 2002
@@ -5,7 +5,7 @@
#ifndef __LINUX_FILE_H
#define __LINUX_FILE_H
-extern void FASTCALL(fput(struct file *));
+extern int FASTCALL(fput(struct file *));
extern struct file * FASTCALL(fget(unsigned int fd));
static inline int get_close_on_exec(unsigned int fd)
Axel Kittenberger wrote:
>Here goes my liitle patchy, once again :o)
>
>Whats it's about?
>
>When close()ing an charcter device one expects the return value of the
>charcter drivers release() call to be forwarded to the close() called in
>userspace. However thats not the case, the kernel swallows the release()
>value, and always returns 0 to the userspace's close(). (tha char drivers
>release() function is called in fput() as it would have a void return value)
>
>It may sound weired at first but there are actually device drivers than can
>fail on close(), in my case it's a driver to program a LCA, the userspace
>application signals end of data by closing the device, the driver finalizes
>the download, and the LCA reports if it has accepted it's new program, if not
>close() should return a non-zero value, indicating the operation did not
>complete successfully.
>
Do you see how many places call fput() ? Are you going to audit
__all__ those paths and prove to us that changing the semantics of
close(2) in Linux doesn't break things in the kernel or in userland?
Your driver is buggy, if it thinks it can fail f_op->release.
Jeff
Axel Kittenberger wrote:
>Here goes my liitle patchy, once again :o)
>
>Whats it's about?
>
>When close()ing an charcter device one expects the return value of the
>charcter drivers release() call to be forwarded to the close() called in
>userspace. However thats not the case, the kernel swallows the release()
>value, and always returns 0 to the userspace's close(). (tha char drivers
>release() function is called in fput() as it would have a void return value)
>
>It may sound weired at first but there are actually device drivers than can
>fail on close(), in my case it's a driver to program a LCA, the userspace
>application signals end of data by closing the device, the driver finalizes
>the download, and the LCA reports if it has accepted it's new program, if not
>close() should return a non-zero value, indicating the operation did not
>complete successfully.
>
Note also that the Single Unix Specification v3 says that the state of
the filedes is undefined. If your applications are relying on such
fail-on-close(2) behavior that keeps the file descriptor completely
intact, they are non-standard at best, buggy at worst.
Jeff
Whoops, my apologies. The patch looks ok to me.
I read your text closely and the patch not close enough. As I said, it
is indeed wrong for a device driver to fail f_op->release(), "fail"
being defined as leaving fd state lying around, assuming that the system
will fail the fput().
But your patch merely propagates a return value, not change behavior,
which seems sane to me.
Jeff
On Thursday 21 March 2002 09:27, Jeff Garzik wrote:
> Whoops, my apologies. The patch looks ok to me.
>
> I read your text closely and the patch not close enough. As I said, it
> is indeed wrong for a device driver to fail f_op->release(), "fail"
> being defined as leaving fd state lying around, assuming that the system
> will fail the fput().
>
> But your patch merely propagates a return value, not change behavior,
> which seems sane to me.
Hi,
close() does not directly map to release().
If you want your device to return error
information reliably, you need to implement flush().
Regards
Oliver
Oliver Neukum wrote:
>On Thursday 21 March 2002 09:27, Jeff Garzik wrote:
>
>>Whoops, my apologies. The patch looks ok to me.
>>
>>I read your text closely and the patch not close enough. As I said, it
>>is indeed wrong for a device driver to fail f_op->release(), "fail"
>>being defined as leaving fd state lying around, assuming that the system
>>will fail the fput().
>>
>>But your patch merely propagates a return value, not change behavior,
>>which seems sane to me.
>>
>
>Hi,
>
>close() does not directly map to release().
>If you want your device to return error
>information reliably, you need to implement flush().
>
Agreed.
I still think propagating f_op->release's return value is a good idea,
though.
Jeff
On Thu, 21 Mar 2002, Jeff Garzik wrote:
> Oliver Neukum wrote:
>
> >On Thursday 21 March 2002 09:27, Jeff Garzik wrote:
> >
> >>Whoops, my apologies. The patch looks ok to me.
> >>
> >>I read your text closely and the patch not close enough. As I said, it
> >>is indeed wrong for a device driver to fail f_op->release(), "fail"
> >>being defined as leaving fd state lying around, assuming that the system
> >>will fail the fput().
> >>
> >>But your patch merely propagates a return value, not change behavior,
> >>which seems sane to me.
> >>
> >
> >Hi,
> >
> >close() does not directly map to release().
> >If you want your device to return error
> >information reliably, you need to implement flush().
> >
>
> Agreed.
>
> I still think propagating f_op->release's return value is a good idea,
> though.
>
> Jeff
Probably. Throwing away information without need is bad.
Regards
Oliver
Hi:
Forgive me if I'm not completely understanding this, but isn't release()
only called when the refcount of the file structure drops to zero? e.g.:
[18]Note that release isn't invoked every time a process calls close.
Whenever a file structure is shared (for example, after a fork or a
dup), release won't be invoked until all copies are closed. If you need
to flush pending data when any copy is closed, you should implement the
flush method.
Seems to me that even /with/ a change in semantics of close(), only the
last call to close() would actually get notified of the failure.
Thinking to NFS, would flush() be a better file_operation to use?
Regards,
- Dave
[email protected]
On Thu, Mar 21, 2002 at 03:13:39AM -0500, Jeff Garzik wrote:
> Axel Kittenberger wrote:
>
> >Here goes my liitle patchy, once again :o)
> >
> >Whats it's about?
> >
> >When close()ing an charcter device one expects the return value of the
> >charcter drivers release() call to be forwarded to the close() called in
> >userspace. However thats not the case, the kernel swallows the release()
> >value, and always returns 0 to the userspace's close(). (tha char drivers
> >release() function is called in fput() as it would have a void return value)
> >
> >It may sound weired at first but there are actually device drivers than can
> >fail on close(), in my case it's a driver to program a LCA, the userspace
> >application signals end of data by closing the device, the driver finalizes
> >the download, and the LCA reports if it has accepted it's new program, if not
> >close() should return a non-zero value, indicating the operation did not
> >complete successfully.
> >
>
> Do you see how many places call fput() ? Are you going to audit
> __all__ those paths and prove to us that changing the semantics of
> close(2) in Linux doesn't break things in the kernel or in userland?
>
> Your driver is buggy, if it thinks it can fail f_op->release.
>
> Jeff
> Forgive me if I'm not completely understanding this, but isn't release()
> only called when the refcount of the file structure drops to zero? e.g.:
>
> [18]Note that release isn't invoked every time a process calls close.
> Whenever a file structure is shared (for example, after a fork or a
> dup), release won't be invoked until all copies are closed. If you need
> to flush pending data when any copy is closed, you should implement the
> flush method.
Oh that might be, however in my case the device is exclusivly locked, so it
doesn't matter that much, since there can be only one writer.
To explain what it actually is, it's a driver to program a LCA chip, an
userspace opens the device, writes the program for the LCA, and closes the
device. The driver itself can not understand the file format transmitted
trough it, and has so no chance to know where the supposed end of stream is
until the userspace closes the device. At close() it let's the LCA go, and
looks if it's starting up, if so it has accpeted it's program, if not the
file was not a valid image or the LCA might be damaged, and this error
condition should be signaled.
Programming a LCA simply works then from the shell like in example this:
$> cp my-lca-image /dev/lca || echo "LCA failure :("
Where cp fails if the lca has not accepted it's image.
flush() can also be called in middle of stream, and is not a good indication
for an end of stream.
> > Your driver is buggy, if it thinks it can fail f_op->release.
If this is would be really (always) true, wouldn't it at least be better to
have the release() prototype with a void return value?
> This is a relatively sane part, though I would ask Viro first
> if changing the prototype of fput is such a great idea.
> However, the justification below is completely brain damaged.
I would love to hear his opinion,
(I send him this patch some months ago...)
> So, we're here with an error from close. NOW WHAT DO WE DO?!
> That's right, you have NO CHOICE but to CLOSE AGAIN, and loop
> until that condition clears (think about exit() for a moment).
Well waybe I picked with "fail on close" the wrong words, close always
succeeds, but yet it's return value can signal error conditions.
$> man close
NOTES
Not checking the return value of close is a common but nevertheless
serious programming error. File system imple?
mentations which use techniques as ``write-behind'' to increase
performance may lead to write(2) succeeding,
although the data has not been written yet. The error status may be
reported at a later write operation, but it is
guaranteed to be reported on closing the file. Not checking the
return value when closing the file may lead to
silent loss of data. This can especially be observed with NFS and
disk quotas.
So when closing a file with write-behind cache, close() signals in example an
error (it """fails"""), as application I do not have close it again or to
circle, but I have to live with the fact that something went wrong, and at
least have to inform the user about this, (or return with the error level
set, like dd or cp do it, if a close() """fails""".)
> From: Axel Kittenberger <[email protected]>
> Date: Fri, 22 Mar 2002 09:06:38 +0100
>[...]
> So when closing a file with write-behind cache, close() signals in example an
> error (it """fails"""), as application I do not have close it again or to
> circle, but I have to live with the fact that something went wrong, [...]
I do not think it's true. And what about exit()?
-- Pete
On Fri, Mar 22, 2002 at 08:58:02AM +0100, Axel Kittenberger wrote:
> > Forgive me if I'm not completely understanding this, but isn't release()
> > only called when the refcount of the file structure drops to zero? e.g.:
> >
> > [18]Note that release isn't invoked every time a process calls close.
> > Whenever a file structure is shared (for example, after a fork or a
> > dup), release won't be invoked until all copies are closed. If you need
> > to flush pending data when any copy is closed, you should implement the
> > flush method.
>
> Oh that might be, however in my case the device is exclusivly locked, so it
> doesn't matter that much, since there can be only one writer.
>
> To explain what it actually is, it's a driver to program a LCA chip, an
> userspace opens the device, writes the program for the LCA, and closes the
> device. The driver itself can not understand the file format transmitted
> trough it, and has so no chance to know where the supposed end of stream is
> until the userspace closes the device. At close() it let's the LCA go, and
> looks if it's starting up, if so it has accpeted it's program, if not the
> file was not a valid image or the LCA might be damaged, and this error
> condition should be signaled.
>
> Programming a LCA simply works then from the shell like in example this:
> $> cp my-lca-image /dev/lca || echo "LCA failure :("
>
> Where cp fails if the lca has not accepted it's image.
>
> flush() can also be called in middle of stream, and is not a good indication
> for an end of stream.
This is me talking prior to having coffee, but Chapter 3 of the
Rubini/Corbet book says:
The flush operation is invoked when a process closes its copy of a file
descriptor for a device; it should execute (and wait for) any outstanding
operations on the device. This must not be confused with the fsync operation
requested by user programs. Currently, flush is used only in the network file
system (NFS) code. If flush is NULL, it is simply not invoked.
I guess it doesn't specifically say it's not called in midstream, but
it reads as if flush() is called on /only/ close(). I may test this
today, just for fun.
>
> > > Your driver is buggy, if it thinks it can fail f_op->release.
>
> If this is would be really (always) true, wouldn't it at least be better to
> have the release() prototype with a void return value?
That wasn't me, that was Jeff. :)
- Dave
[email protected]
> This is me talking prior to having coffee, but Chapter 3 of the
> Rubini/Corbet book says:
>
> The flush operation is invoked when a process closes its copy of a file
> descriptor for a device; it should execute (and wait for) any outstanding
> operations on the device. This must not be confused with the fsync
> operation requested by user programs. Currently, flush is used only in the
> network file system (NFS) code. If flush is NULL, it is simply not invoked.
>
> I guess it doesn't specifically say it's not called in midstream, but
> it reads as if flush() is called on /only/ close(). I may test this
> today, just for fun.
Oh thats interesting, indeed, so the function name "flush" is just
contra-intentional. Oxay I know now how I could have written this driver
without patching the kernel.....
Still the basic issue/idea is remaining. release() is defined as int return
type, but everywhere it's called it's value is discarded. (except internally
in "intermezzo" whatever that is)
btw: blkdev_put() has int return type and seems to return correctly the
return value from release()s for block devices, so I guess it would be the
right thing for char devs to do also.
The other way I would also see as okay is to state release() can't return
anything senseful to anybody, bet then declare it as void return instead. But
as the state is currently it's suboptimal from both views.
On Mon, Mar 25, 2002 at 10:50:17AM +0100, Axel Kittenberger wrote:
>
> > This is me talking prior to having coffee, but Chapter 3 of the
> > Rubini/Corbet book says:
> >
> > The flush operation is invoked when a process closes its copy of a file
> > descriptor for a device; it should execute (and wait for) any outstanding
> > operations on the device. This must not be confused with the fsync
> > operation requested by user programs. Currently, flush is used only in the
> > network file system (NFS) code. If flush is NULL, it is simply not invoked.
> >
> > I guess it doesn't specifically say it's not called in midstream, but
> > it reads as if flush() is called on /only/ close(). I may test this
> > today, just for fun.
>
> Oh thats interesting, indeed, so the function name "flush" is just
> contra-intentional. Oxay I know now how I could have written this driver
> without patching the kernel.....
>
And FWIW, I tested this behavior with a dummy chardev and printks()
around open(), release, and flush(). flush() is indeed called only on
each close.
> Still the basic issue/idea is remaining. release() is defined as int return
> type, but everywhere it's called it's value is discarded. (except internally
> in "intermezzo" whatever that is)
>
> btw: blkdev_put() has int return type and seems to return correctly the
> return value from release()s for block devices, so I guess it would be the
> right thing for char devs to do also.
>
> The other way I would also see as okay is to state release() can't return
> anything senseful to anybody, bet then declare it as void return instead. But
> as the state is currently it's suboptimal from both views.
Agreed, but the question is which approach to use. :) Declaring it as void
sounds like it may involve a lot of driver fixup work.
Regards,
- Dave
[email protected]
> Agreed, but the question is which approach to use. :) Declaring it as void
> sounds like it may involve a lot of driver fixup work.
For the first way of doing I already provided a patch, which started this
thread. (returning the release() value proparly to the close())
However if I get a word from the applicate maintaners (linus or viro) that a
patch declaring release() with void return type will be accepted for 2.5.x, I
would volunteer for providing it. Should not be that much of a work, once you
concentrate on it. However I'm not doing it for the birds :o) (without
consultation first).
Personlaly I'm unsure which of both decisions would be better, but am
unsatisfied with the way it's currently.