2005-11-15 02:00:23

by NeilBrown

[permalink] [raw]
Subject: [PATCH ] Fix some problems with truncate and mtime semantics.

Resubmitting this patch to fix truncate/mtime semantics.

It is against 2.6.14-mm2 and is probably suitable for 2.6.15, but can
be held over to 2.6.16 if you are feeling cautious.

NeilBrown

### Comments for Changeset

SUS requires that when truncating a file to the size that it currently
is:
truncate and ftruncate should NOT modify ctime or mtime
O_EXCL SHOULD modify ctime and mtime.

Currently mtime and ctime are always modified on most local
filesystems (side effect of ->truncate) or never modified (on NFS).

With this patch:
ATTR_CTIME|ATTR_MTIME are sent with ATTR_SIZE precisely when
an update of these times is required whether size changes or not
(via a new argument to do_truncate). This allows NFS to do
the right thing for O_EXCL.
inode_setattr nolonger forces ATTR_MTIME|ATTR_CTIME when the ATTR_SIZE
sets the size to it's current value. This allows local filesystems
to do the right thing for f?truncate.

Also, the logic in inode_setattr is changed a bit so there are two
return points. One returns the error from vmtruncate if it failed,
the other returns 0 (there can be no other failure).

Finally, if vmtruncate succeeds, and ATTR_SIZE is the only change
requested, we now fall-through and mark_inode_dirty. If a filesystem
did not have a ->truncate function, then vmtruncate will have changed
i_size, without marking the inode as 'dirty', and I think this is wrong.


Signed-off-by: Neil Brown <[email protected]>

### Diffstat output
./fs/attr.c | 22 +++++++---------------
./fs/exec.c | 2 +-
./fs/namei.c | 2 +-
./fs/open.c | 9 +++++----
./include/linux/fs.h | 3 ++-
5 files changed, 16 insertions(+), 22 deletions(-)

diff ./fs/attr.c~current~ ./fs/attr.c
--- ./fs/attr.c~current~ 2005-11-15 10:29:55.000000000 +1100
+++ ./fs/attr.c 2005-11-15 10:29:55.000000000 +1100
@@ -67,20 +67,12 @@ EXPORT_SYMBOL(inode_change_ok);
int inode_setattr(struct inode * inode, struct iattr * attr)
{
unsigned int ia_valid = attr->ia_valid;
- int error = 0;

- if (ia_valid & ATTR_SIZE) {
- if (attr->ia_size != i_size_read(inode)) {
- error = vmtruncate(inode, attr->ia_size);
- if (error || (ia_valid == ATTR_SIZE))
- goto out;
- } else {
- /*
- * We skipped the truncate but must still update
- * timestamps
- */
- ia_valid |= ATTR_MTIME|ATTR_CTIME;
- }
+ if (ia_valid & ATTR_SIZE &&
+ attr->ia_size != i_size_read(inode)) {
+ int error = vmtruncate(inode, attr->ia_size);
+ if (error)
+ return error;
}

if (ia_valid & ATTR_UID)
@@ -104,8 +96,8 @@ int inode_setattr(struct inode * inode,
inode->i_mode = mode;
}
mark_inode_dirty(inode);
-out:
- return error;
+
+ return 0;
}
EXPORT_SYMBOL(inode_setattr);


diff ./fs/exec.c~current~ ./fs/exec.c
--- ./fs/exec.c~current~ 2005-11-15 10:29:55.000000000 +1100
+++ ./fs/exec.c 2005-11-15 10:29:55.000000000 +1100
@@ -1516,7 +1516,7 @@ int do_coredump(long signr, int exit_cod
goto close_fail;
if (!file->f_op->write)
goto close_fail;
- if (do_truncate(file->f_dentry, 0, file) != 0)
+ if (do_truncate(file->f_dentry, 0, 0, file) != 0)
goto close_fail;

retval = binfmt->core_dump(signr, regs, file);

diff ./fs/namei.c~current~ ./fs/namei.c
--- ./fs/namei.c~current~ 2005-11-15 10:29:55.000000000 +1100
+++ ./fs/namei.c 2005-11-15 10:29:55.000000000 +1100
@@ -1491,7 +1491,7 @@ int may_open(struct nameidata *nd, int a
if (!error) {
DQUOT_INIT(inode);

- error = do_truncate(dentry, 0, NULL);
+ error = do_truncate(dentry, 0, ATTR_MTIME|ATTR_CTIME, NULL);
}
put_write_access(inode);
if (error)

diff ./fs/open.c~current~ ./fs/open.c
--- ./fs/open.c~current~ 2005-11-15 10:29:55.000000000 +1100
+++ ./fs/open.c 2005-11-15 10:29:55.000000000 +1100
@@ -194,7 +194,8 @@ out:
return error;
}

-int do_truncate(struct dentry *dentry, loff_t length, struct file *filp)
+int do_truncate(struct dentry *dentry, loff_t length, unsigned int time_attrs,
+ struct file *filp)
{
int err;
struct iattr newattrs;
@@ -204,7 +205,7 @@ int do_truncate(struct dentry *dentry, l
return -EINVAL;

newattrs.ia_size = length;
- newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME;
+ newattrs.ia_valid = ATTR_SIZE | time_attrs;
if (filp) {
newattrs.ia_file = filp;
newattrs.ia_valid |= ATTR_FILE;
@@ -266,7 +267,7 @@ static inline long do_sys_truncate(const
error = locks_verify_truncate(inode, NULL, length);
if (!error) {
DQUOT_INIT(inode);
- error = do_truncate(nd.dentry, length, NULL);
+ error = do_truncate(nd.dentry, length, 0, NULL);
}
put_write_access(inode);

@@ -318,7 +319,7 @@ static inline long do_sys_ftruncate(unsi

error = locks_verify_truncate(inode, file, length);
if (!error)
- error = do_truncate(dentry, length, file);
+ error = do_truncate(dentry, length, 0, file);
out_putf:
fput(file);
out:

diff ./include/linux/fs.h~current~ ./include/linux/fs.h
--- ./include/linux/fs.h~current~ 2005-11-15 10:29:55.000000000 +1100
+++ ./include/linux/fs.h 2005-11-15 10:29:55.000000000 +1100
@@ -1340,7 +1340,8 @@ static inline int break_lease(struct ino

/* fs/open.c */

-extern int do_truncate(struct dentry *, loff_t start, struct file *filp);
+extern int do_truncate(struct dentry *, loff_t start, unsigned int time_attrs,
+ struct file *filp);
extern long do_sys_open(const char __user *filename, int flags, int mode);
extern struct file *filp_open(const char *, int, int);
extern struct file * dentry_open(struct dentry *, struct vfsmount *, int);


2005-11-15 02:03:26

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH ] Fix some problems with truncate and mtime semantics.

On Tue, 2005-11-15 at 13:00 +1100, NeilBrown wrote:
> Resubmitting this patch to fix truncate/mtime semantics.
>
> It is against 2.6.14-mm2 and is probably suitable for 2.6.15, but can
> be held over to 2.6.16 if you are feeling cautious.
>
> NeilBrown
>
> ### Comments for Changeset
>
> SUS requires that when truncating a file to the size that it currently
> is:
> truncate and ftruncate should NOT modify ctime or mtime
> O_EXCL SHOULD modify ctime and mtime.
^^^^^ O_CREAT ;-)

> Currently mtime and ctime are always modified on most local
> filesystems (side effect of ->truncate) or never modified (on NFS).
>
> With this patch:
> ATTR_CTIME|ATTR_MTIME are sent with ATTR_SIZE precisely when
> an update of these times is required whether size changes or not
> (via a new argument to do_truncate). This allows NFS to do
> the right thing for O_EXCL.
^^^^^

Cheers,
Trond

2005-11-15 02:09:52

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH ] Fix some problems with truncate and mtime semantics.

On Monday November 14, [email protected] wrote:
> On Tue, 2005-11-15 at 13:00 +1100, NeilBrown wrote:
> > Resubmitting this patch to fix truncate/mtime semantics.
> >
> > It is against 2.6.14-mm2 and is probably suitable for 2.6.15, but can
> > be held over to 2.6.16 if you are feeling cautious.
> >
> > NeilBrown
> >
> > ### Comments for Changeset
> >
> > SUS requires that when truncating a file to the size that it currently
> > is:
> > truncate and ftruncate should NOT modify ctime or mtime
> > O_EXCL SHOULD modify ctime and mtime.
> ^^^^^ O_CREAT ;-)

Groan... thanks Trond.
Andrew .. if you could just do a little edit there for me, I'd really
appreciate it :-(

NeilBrown

>
> > Currently mtime and ctime are always modified on most local
> > filesystems (side effect of ->truncate) or never modified (on NFS).
> >
> > With this patch:
> > ATTR_CTIME|ATTR_MTIME are sent with ATTR_SIZE precisely when
> > an update of these times is required whether size changes or not
> > (via a new argument to do_truncate). This allows NFS to do
> > the right thing for O_EXCL.
> ^^^^^
>
> Cheers,
> Trond

2005-11-15 09:56:16

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH ] Fix some problems with truncate and mtime semantics.

> -int do_truncate(struct dentry *dentry, loff_t length, struct file *filp)
> +int do_truncate(struct dentry *dentry, loff_t length, unsigned int time_attrs,
> + struct file *filp)
> {
> int err;
> struct iattr newattrs;
> @@ -204,7 +205,7 @@ int do_truncate(struct dentry *dentry, l
> return -EINVAL;
>
> newattrs.ia_size = length;
> - newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME;
> + newattrs.ia_valid = ATTR_SIZE | time_attrs;

I'd rather make the argument and boolean update_times flag and this:


newattrs.ia_valid = ATTR_SIZE;
if (update_times)
newattrs.ia_valid |= ATTR_CTIME|ATTR_MTIME;

but except for that little nitpick the patch looks good. thanks.

2005-11-15 10:16:34

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH ] Fix some problems with truncate and mtime semantics.

Christoph Hellwig <[email protected]> wrote:
>
> > -int do_truncate(struct dentry *dentry, loff_t length, struct file *filp)
> > +int do_truncate(struct dentry *dentry, loff_t length, unsigned int time_attrs,
> > + struct file *filp)
> > {
> > int err;
> > struct iattr newattrs;
> > @@ -204,7 +205,7 @@ int do_truncate(struct dentry *dentry, l
> > return -EINVAL;
> >
> > newattrs.ia_size = length;
> > - newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME;
> > + newattrs.ia_valid = ATTR_SIZE | time_attrs;
>
> I'd rather make the argument and boolean update_times flag and this:
>

That sentence is incomprehensible. Want to have another go?

2005-11-15 10:41:30

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH ] Fix some problems with truncate and mtime semantics.

On Tuesday November 15, [email protected] wrote:
> > Finally, if vmtruncate succeeds, and ATTR_SIZE is the only change
> > requested, we now fall-through and mark_inode_dirty. If a filesystem
> > did not have a ->truncate function, then vmtruncate will have changed
> > i_size, without marking the inode as 'dirty', and I think this is wrong.
>
> And if filesystem does not have a ->truncate() function and it caller
> was [f]truncate(), ctime and mtime won't be set.
>
> That's wrong too, isn't it?

I don't think that is a problem.
The filesystem would have had a ->setattr function which would have
been told that the size had changed, and it would have had to change
the inode, and so would have updated the ctime and probably mtime.

Changing i_size without marking the inode dirty is an issue of
consistency of common data structures and inode_setattr should be
careful about that.

Changing ctime/mtime when the isize changes is an issue of filesystem
correctness, and the filesystem callbacks should handle that.

Changing mtime when isize DOESN'T change is an issue of obscure
POSIX/SUS semantics. It was less clear where this should be handled.
I think the handle of this is now better.

NeilBrown

2005-11-15 11:15:35

by Anton Altaparmakov

[permalink] [raw]
Subject: Re: [PATCH ] Fix some problems with truncate and mtime semantics.

On Mon, 14 Nov 2005, Trond Myklebust wrote:
> On Tue, 2005-11-15 at 13:00 +1100, NeilBrown wrote:
> > Resubmitting this patch to fix truncate/mtime semantics.
> >
> > It is against 2.6.14-mm2 and is probably suitable for 2.6.15, but can
> > be held over to 2.6.16 if you are feeling cautious.
> >
> > NeilBrown
> >
> > ### Comments for Changeset
> >
> > SUS requires that when truncating a file to the size that it currently
> > is:
> > truncate and ftruncate should NOT modify ctime or mtime
> > O_EXCL SHOULD modify ctime and mtime.
> ^^^^^ O_CREAT ;-)

O_TRUNC actually ;-) See:

http://www.opengroup.org/onlinepubs/009695399/functions/open.html

<quote>
If O_CREAT is set and the file did not previously exist, upon successful
completion, open() shall mark for update the st_atime, st_ctime, and
st_mtime fields of the file and the st_ctime and st_mtime fields of the
parent directory.

If O_TRUNC is set and the file did previously exist, upon successful
completion, open() shall mark for update the st_ctime and st_mtime fields
of the file.
</quote>

Given we are talking about truncate, we only care about O_TRUNC.
O_CREAT/O_EXCL of course set A/C/M time as the file is newly created and
hence all three values are set to the current system time!

> > Currently mtime and ctime are always modified on most local
> > filesystems (side effect of ->truncate) or never modified (on NFS).
> >
> > With this patch:
> > ATTR_CTIME|ATTR_MTIME are sent with ATTR_SIZE precisely when
> > an update of these times is required whether size changes or not
> > (via a new argument to do_truncate). This allows NFS to do
> > the right thing for O_EXCL.
> ^^^^^

O_TRUNC...

Best regards,

Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/

2005-11-15 12:38:09

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [PATCH ] Fix some problems with truncate and mtime semantics.

> Finally, if vmtruncate succeeds, and ATTR_SIZE is the only change
> requested, we now fall-through and mark_inode_dirty. If a filesystem
> did not have a ->truncate function, then vmtruncate will have changed
> i_size, without marking the inode as 'dirty', and I think this is wrong.

And if filesystem does not have a ->truncate() function and it caller
was [f]truncate(), ctime and mtime won't be set.

That's wrong too, isn't it?

Miklos

2005-11-15 12:49:05

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [PATCH ] Fix some problems with truncate and mtime semantics.

> > > Finally, if vmtruncate succeeds, and ATTR_SIZE is the only change
> > > requested, we now fall-through and mark_inode_dirty. If a filesystem
> > > did not have a ->truncate function, then vmtruncate will have changed
> > > i_size, without marking the inode as 'dirty', and I think this is wrong.
> >
> > And if filesystem does not have a ->truncate() function and it caller
> > was [f]truncate(), ctime and mtime won't be set.
> >
> > That's wrong too, isn't it?
>
> I don't think that is a problem.
> The filesystem would have had a ->setattr function which would have
> been told that the size had changed, and it would have had to change
> the inode, and so would have updated the ctime and probably mtime.

What if filesystem has neither ->truncate() nor ->setattr()?
E.g. ramfs and derivatives.

> Changing i_size without marking the inode dirty is an issue of
> consistency of common data structures and inode_setattr should be
> careful about that.
>
> Changing ctime/mtime when the isize changes is an issue of filesystem
> correctness, and the filesystem callbacks should handle that.

Yes, but VFS should have a sane default when filesystem doesn't have
the methods. It's the same with all the other methods.

Miklos