Hi,
There is an inconsistency in the way truncate works which was introduced
(relatively) recently.
fs/open.c::sys_truncate
-> do_sys_truncate
-> do_truncate does:
newattrs.ia_size = length;
newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME;
down(&dentry->d_inode->i_sem);
err = notify_change(dentry, &newattrs);
up(&dentry->d_inode->i_sem);
This changes the ctime only.
But fs/attr.c::notify_change
-> inode_setattr [happens when !inode->i_op || !inode->i_op->setattr]
which does:
if (ia_valid & ATTR_SIZE) {
if (attr->ia_size != i_size_read(inode)) {
error = vmtruncate(inode, attr->ia_size);
if (error || (ia_valid == ATTR_SIZE))
goto out;
} else {
/*
* We skipped the truncate but must still update
* timestamps
*/
ia_valid |= ATTR_MTIME|ATTR_CTIME;
}
}
Which changes both mtime and ctime when the size does not change.
Also, the man page for stat(2) says:
<quote>
The field st_mtime is changed by file modifications, e.g. by mknod(2),
truncate(2), utime(2) and write(2) (of more than zero bytes).
Moreover, st_mtime of a directory is changed by the creation or deletion
of files in that directory. The st_mtime field is not changed for
changes in owner, group, hard link count, or mode.
</quote>
So it suggests mtime is always modified on truncate.
So what is correct? mtime and ctime or just ctime? If just ctime,
Linus please apply below patch to remove the superfluous ATTR_MTIME from
fs/attr.c::inode_setattr().
Also someone ought to fix the man pages if they are not fixed already
(mine are man-pages-2.01-2 from SUSE 9.3)...
Signed-off-by: Anton Altaparmakov <[email protected]>
Best regards,
Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
--- ntfs-2.6/fs/attr.c.old 2005-09-30 16:00:29.000000000 +0100
+++ ntfs-2.6/fs/attr.c 2005-09-30 16:00:47.000000000 +0100
@@ -79,7 +79,7 @@ int inode_setattr(struct inode * inode,
* We skipped the truncate but must still update
* timestamps
*/
- ia_valid |= ATTR_MTIME|ATTR_CTIME;
+ ia_valid |= ATTR_CTIME;
}
}
On Fri, 30 Sep 2005, Anton Altaparmakov wrote:
>
> There is an inconsistency in the way truncate works which was introduced
> (relatively) recently.
>
> fs/open.c::sys_truncate
> -> do_sys_truncate
> -> do_truncate does:
>
> newattrs.ia_size = length;
> newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME;
>
> down(&dentry->d_inode->i_sem);
> err = notify_change(dentry, &newattrs);
> up(&dentry->d_inode->i_sem);
>
> This changes the ctime only.
Hmm.. That looks wrong, partly because I don't think it should even set
ATTR_CTIME _either_. However, I don't see any recent changes to that code,
so it must have been logn for a long time. That line in do_truncate() has
been like that since at _least_ 2002.
So what changed to make you notice?
The _code_ actually expects the low-level filesystem to just do it when it
does the actual truncate itself. That's certainly what ext3 does, for
example.
A comment or two to that effect might be useful, though.
In other words: some attributes are "implicit". For example, mtime is
supposed to be set automatically by the filesystem whenever it sees a
write. The VFS layer will _not_ do a "inode state notify" event for that.
The same is true of truncation.
But I agree that it's inconsistent. Anybody have any deep opinions?
Linus
On Fri, 2005-09-30 at 08:36 -0700, Linus Torvalds wrote:
> On Fri, 30 Sep 2005, Anton Altaparmakov wrote:
> > There is an inconsistency in the way truncate works which was introduced
> > (relatively) recently.
> >
> > fs/open.c::sys_truncate
> > -> do_sys_truncate
> > -> do_truncate does:
> >
> > newattrs.ia_size = length;
> > newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME;
> >
> > down(&dentry->d_inode->i_sem);
> > err = notify_change(dentry, &newattrs);
> > up(&dentry->d_inode->i_sem);
> >
> > This changes the ctime only.
>
> Hmm.. That looks wrong, partly because I don't think it should even set
> ATTR_CTIME _either_. However, I don't see any recent changes to that code,
> so it must have been logn for a long time. That line in do_truncate() has
> been like that since at _least_ 2002.
Indeed. What _has_ changed recently is that IIRC do_truncate used to
set ATTR_SIZE | ATTR_CTIME | ATTR_MTIME at which point the two were
consistent. But (relatively) recently someone removed the ATTR_MTIME
from do_truncate and forgot to remove it from inode_setattr...
> So what changed to make you notice?
I have finished coding the initial cut of ntfs_truncate() (regular files
only, i.e. uncompressed, unencrypted, and
not-attribute-list-attribute-containing). (-:
Now I am at the testing it stage and I was doing:
$ stat ntfs/file
$ truncate ntfs/file <someval>
$ stat ntfs/file
And noticed that the ctime was chaning but not the mtime so I read the
man page which suggested mtime ought to change so I had a quick look at
relevant kernel code and the inconsistency stared me in the face. (-;
> The _code_ actually expects the low-level filesystem to just do it when it
> does the actual truncate itself. That's certainly what ext3 does, for
> example.
Ah, I thought the VFS took care of _all_ {a,c,m} time changes and we
filesystem developers only ever had to write the changed times to
disk...
Obviously I was wrong...
I will have a look at ext2/3.
> A comment or two to that effect might be useful, though.
Indeed, it would be useful for unsuspecting fs developers like me. (-;
> In other words: some attributes are "implicit". For example, mtime is
> supposed to be set automatically by the filesystem whenever it sees a
> write. The VFS layer will _not_ do a "inode state notify" event for that.
Right. I seem to remember this being different some time ago. IIRC the
kernel did update {a,m,c}time in many places...
> The same is true of truncation.
>
> But I agree that it's inconsistent. Anybody have any deep opinions?
I certainly don't. It just should be consistent. (-:
Best regards,
Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
Hi Linus,
Here is a (tested) patch to remove the setting of ATTR_CTIME in
do_truncate() as you suggested.
Signed-off-by: Anton Altaparmakov <[email protected]>
---
On Fri, 2005-09-30 at 08:36 -0700, Linus Torvalds wrote:
> On Fri, 30 Sep 2005, Anton Altaparmakov wrote:
> > There is an inconsistency in the way truncate works which was introduced
> > (relatively) recently.
> >
> > fs/open.c::sys_truncate
> > -> do_sys_truncate
> > -> do_truncate does:
> >
> > newattrs.ia_size = length;
> > newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME;
> >
> > Hmm.. That looks wrong, partly because I don't think it should even set
> > ATTR_CTIME _either_. However, I don't see any recent changes to that code,
> > so it must have been logn for a long time. That line in do_truncate() has
> > been like that since at _least_ 2002.
Having gone through looking at ext2/3 and having made ntfs_truncate()
behave the same way as ext2/3_truncate() wrt m/ctime, I now agree with
you. do_truncate() should only set ATTR_SIZE, not ATTR_CTIME. The file
system will set ATTR_CTIME and ATTR_MTIME anyway so it's a waste for the
attribute to also set ATTR_CTIME. That way it happens twice.
Below is a patch to remove it...
Best regards,
Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
diff --git a/fs/open.c b/fs/open.c
--- a/fs/open.c
+++ b/fs/open.c
@@ -204,7 +204,7 @@ int do_truncate(struct dentry *dentry, l
return -EINVAL;
newattrs.ia_size = length;
- newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME;
+ newattrs.ia_valid = ATTR_SIZE;
down(&dentry->d_inode->i_sem);
err = notify_change(dentry, &newattrs);
> Here is a (tested) patch to remove the setting of ATTR_CTIME in
> do_truncate() as you suggested.
This is still inconsistent: all other "setattr" operations (chmod,
chown, chgrp, utime) still set ATTR_CTIME.
I think either ATTR_CTIME should be implicit in ->setattr() in which
case you should remove it in all those cases, or leave it as it is.
Miklos