2003-05-22 18:56:44

by Andries E. Brouwer

[permalink] [raw]
Subject: [patch?] truncate and timestamps

Investigating why some SuSE init script no longer works, I find:
The shell command
> file
does not update the time stamp of file in case it existed and had size 0.
The corresponding system call is
open(file, O_WRONLY | O_TRUNC);

Time stamp here is st_mtime, and the behaviour is not unreasonable:
after all, the contents do not change.

And indeed, in do_truncate() we have
newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME;
that is, ATTR_MTIME is not mentioned.

Older Linux systems did update st_mtime, old FreeBSD systems did not,
SunOS 5.7 does, SunOS 5.8 does not. There is no uniform Unix behaviour.

Remains the question what POSIX says.
For opening with O_TRUNC, and for the ftruncate system call
POSIX has always said and still says: st_ctime and st_mtime
are updated.
For the truncate call POSIX.1-2001 says: st_ctime and st_mtime
are updated if the size changed.

If we want to follow this, then do_truncate() needs an additional
parameter indicating what kind of call we have.
A little bit ugly. A possible patch follows.

Andries

------------------
diff -u --recursive --new-file -X /linux/dontdiff a/fs/exec.c b/fs/exec.c
--- a/fs/exec.c Thu May 22 13:17:02 2003
+++ b/fs/exec.c Thu May 22 20:14:02 2003
@@ -1352,7 +1352,7 @@
goto close_fail;
if (!file->f_op->write)
goto close_fail;
- if (do_truncate(file->f_dentry, 0) != 0)
+ if (do_truncate(file->f_dentry, 0, 1) != 0)
goto close_fail;

retval = binfmt->core_dump(signr, regs, file);
diff -u --recursive --new-file -X /linux/dontdiff a/fs/namei.c b/fs/namei.c
--- a/fs/namei.c Thu May 22 13:17:02 2003
+++ b/fs/namei.c Thu May 22 20:13:41 2003
@@ -1178,7 +1178,7 @@
if (!error) {
DQUOT_INIT(inode);

- error = do_truncate(dentry, 0);
+ error = do_truncate(dentry, 0, 1);
}
put_write_access(inode);
if (error)
diff -u --recursive --new-file -X /linux/dontdiff a/fs/open.c b/fs/open.c
--- a/fs/open.c Thu May 22 13:17:02 2003
+++ b/fs/open.c Thu May 22 20:15:25 2003
@@ -75,7 +75,7 @@
return error;
}

-int do_truncate(struct dentry *dentry, loff_t length)
+int do_truncate(struct dentry *dentry, loff_t length, int update_timestamp)
{
int err;
struct iattr newattrs;
@@ -85,7 +85,9 @@
return -EINVAL;

newattrs.ia_size = length;
- newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME;
+ newattrs.ia_valid = ATTR_SIZE;
+ if (update_timestamp || length != dentry->d_inode->i_size)
+ newattrs.ia_valid |= ATTR_CTIME | ATTR_MTIME;
down(&dentry->d_inode->i_sem);
err = notify_change(dentry, &newattrs);
up(&dentry->d_inode->i_sem);
@@ -142,7 +144,7 @@
error = locks_verify_truncate(inode, NULL, length);
if (!error) {
DQUOT_INIT(inode);
- error = do_truncate(nd.dentry, length);
+ error = do_truncate(nd.dentry, length, 0);
}
put_write_access(inode);

@@ -194,7 +196,7 @@

error = locks_verify_truncate(inode, file, length);
if (!error)
- error = do_truncate(dentry, length);
+ error = do_truncate(dentry, length, 1);
out_putf:
fput(file);
out:
diff -u --recursive --new-file -X /linux/dontdiff a/include/linux/fs.h b/include/linux/fs.h
--- a/include/linux/fs.h Thu May 22 13:17:05 2003
+++ b/include/linux/fs.h Thu May 22 20:14:39 2003
@@ -1019,7 +1019,7 @@

asmlinkage long sys_open(const char __user *, int, int);
asmlinkage long sys_close(unsigned int); /* yes, it's really unsigned */
-extern int do_truncate(struct dentry *, loff_t start);
+extern int do_truncate(struct dentry *, loff_t start, int update_timestamp);

extern struct file *filp_open(const char *, int, int);
extern struct file * dentry_open(struct dentry *, struct vfsmount *, int);


2003-05-22 23:08:36

by Andrew Morton

[permalink] [raw]
Subject: Re: [patch?] truncate and timestamps

[email protected] wrote:
>
> Investigating why some SuSE init script no longer works, I find:
> The shell command
> > file
> does not update the time stamp of file in case it existed and had size 0.

oops. That's due to me "don't call vmtruncate if i_size didn't change"
speedup. It was a pretty good speedup too.

Does this look sane?


25-akpm/fs/attr.c | 13 ++++++++++---
1 files changed, 10 insertions(+), 3 deletions(-)

diff -puN fs/attr.c~a fs/attr.c
--- 25/fs/attr.c~a Thu May 22 16:16:33 2003
+++ 25-akpm/fs/attr.c Thu May 22 16:18:13 2003
@@ -68,10 +68,17 @@ int inode_setattr(struct inode * inode,
int error = 0;

if (ia_valid & ATTR_SIZE) {
- if (attr->ia_size != inode->i_size)
+ if (attr->ia_size != inode->i_size) {
error = vmtruncate(inode, attr->ia_size);
- if (error || (ia_valid == ATTR_SIZE))
- goto out;
+ if (error)
+ goto out;
+ } else {
+ /*
+ * We skipped the truncate but must still update
+ * timestamps
+ */
+ ia_valid |= ATTR_MTIME|ATTR_CTIME;
+ }
}

lock_kernel();

_

2003-05-23 00:04:51

by Andries E. Brouwer

[permalink] [raw]
Subject: Re: [patch?] truncate and timestamps

From: Andrew Morton <[email protected]>

> Investigating why some SuSE init script no longer works, I find:
> The shell command
> > file
> does not update the time stamp of file in case it existed and had size 0.

oops. That's due to me "don't call vmtruncate if i_size didn't change"
speedup. It was a pretty good speedup too.

Does this look sane?

Well, before this 2.5.66 patch vmtruncate() was called, which called
foofs->truncate(), which probably did
inode->i_mtime = inode->i_ctime = CURRENT_TIME;
and it looks like you do the same now. So, yes.

[Why is this in foofs->truncate() and not in vfs code?]

[It looks like the only way ATTR_SIZE can be set is from the
do_truncate, so maybe your patch is equivalent to changing open.c:

88c88
< newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME;
---
> newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME | ATTR_MTIME;

("It looks like" means: nfs does unspeakable things,
intermezzo has izo_do_truncate(), a copy of do_truncate(),
hpfs_unlink() tries a truncate if there is no space for a delete,
ncp_notify_change() may have ATTR_SIZE without ATTR_MTIME.)]

On the other hand, my question was really a different one:
do we want to follow POSIX, also in the silly requirement
that truncate only sets mtime when the size changes, while
O_TRUNC and ftruncate always set mtime.

If so, we have to uglify do_truncate().

Andries

2003-05-23 00:17:48

by Linus Torvalds

[permalink] [raw]
Subject: Re: [patch?] truncate and timestamps


On Fri, 23 May 2003 [email protected] wrote:
>
> On the other hand, my question was really a different one:
> do we want to follow POSIX, also in the silly requirement
> that truncate only sets mtime when the size changes, while
> O_TRUNC and ftruncate always set mtime.

Does POSIX really say that? What a crock. If so, we should probably add
the ATTR_xxx mask as an argument to do_truncate() itself, and then make
sure that may_open() sets the ATTR_MTIME bit.

Linus

2003-05-23 01:04:49

by Al Viro

[permalink] [raw]
Subject: Re: [patch?] truncate and timestamps

On Thu, May 22, 2003 at 05:30:33PM -0700, Linus Torvalds wrote:
>
> On Fri, 23 May 2003 [email protected] wrote:
> >
> > On the other hand, my question was really a different one:
> > do we want to follow POSIX, also in the silly requirement
> > that truncate only sets mtime when the size changes, while
> > O_TRUNC and ftruncate always set mtime.
>
> Does POSIX really say that? What a crock. If so, we should probably add
> the ATTR_xxx mask as an argument to do_truncate() itself, and then make
> sure that may_open() sets the ATTR_MTIME bit.

"POSIX says" has value only if there is at least some consensus among
implementations. Otherwise it's worthless, simply because any program
that cares about portability can't rely on specified behaviour and
any program that doesn't couldn't care less anyway - it will rely on
actual behaviour on system it's supposed to run on.

Andries had shown that there is _no_ consensus. Ergo, POSIX can take
a hike and we should go with the behaviour convenient for us. It's that
simple...

2003-05-23 02:26:13

by Andrew Morton

[permalink] [raw]
Subject: Re: [patch?] truncate and timestamps

[email protected] wrote:
>
> POSIX can take a hike

aye. But given a choice we should continue to do what 2.4 did.

I assume every foo_truncate() is doing

inode->i_mtime = inode->i_ctime = CURRENT_TIME;
mark_inode_dirty(inode);

and as Andries says, we can probably pull all that up to the VFS level.

2003-05-23 03:01:48

by David Schwartz

[permalink] [raw]
Subject: RE: [patch?] truncate and timestamps


> "POSIX says" has value only if there is at least some consensus among
> implementations. Otherwise it's worthless, simply because any program
> that cares about portability can't rely on specified behaviour and
> any program that doesn't couldn't care less anyway - it will rely on
> actual behaviour on system it's supposed to run on.

This type of attitude ensures there never will be a consensus among
implementations. A lack of consensus today is not grounds for failing to
comply with a standard specifically designed to eliminate that lack.

On the other hand, that has to be balanced by how objectively reasonable or
unreasonable the standard is. However, there should be an extremely strong
preference for concurring with the standard, even against the weight of
other implementations.

DS


2003-05-23 04:58:39

by Trond Myklebust

[permalink] [raw]
Subject: Re: [patch?] truncate and timestamps

>>>>> " " == Andrew Morton <[email protected]> writes:

> I assume every foo_truncate() is doing

inode-> i_mtime = inode->i_ctime = CURRENT_TIME;
> mark_inode_dirty(inode);

> and as Andries says, we can probably pull all that up to the
> VFS level.

No. Please do not assume that the above is equivalent to calling
notify_change() with ATTR_MTIME|ATTR_CTIME.

NFS tends to leave the above to the server side, since the clocks may
be desynchronized between client and server.

As far as NFS is concerned, we should only be setting ATTR_*TIME if/when
the *user* specifies it through a utimes() call or something like that.

Cheers,
Trond

2003-05-23 05:12:31

by Trond Myklebust

[permalink] [raw]
Subject: Re: [patch?] truncate and timestamps

>>>>> " " == Trond Myklebust <[email protected]> writes:

> As far as NFS is concerned, we should only be setting
> ATTR_*TIME if/when the *user* specifies it through a utimes()
> call or something like that.

Duh... errno = -ENOCOFFEE;

I'm confusing ATTR_*TIME and ATTR_*TIME_SET...

Cheers,
Trond

2003-05-23 17:52:42

by kaih

[permalink] [raw]
Subject: Re: [patch?] truncate and timestamps

[email protected] (Linus Torvalds) wrote on 22.05.03 in <[email protected]>:

> On Fri, 23 May 2003 [email protected] wrote:
> >
> > On the other hand, my question was really a different one:
> > do we want to follow POSIX, also in the silly requirement
> > that truncate only sets mtime when the size changes, while
> > O_TRUNC and ftruncate always set mtime.
>
> Does POSIX really say that? What a crock.

That's why POSIX says no such thing.

What it *does* say is

Upon successful completion, if fildes refers to a regular file, the
ftruncate() function shall mark for update the st_ctime and st_mtime
fields of the file and the S_ISUID and S_ISGID bits of the file mode may
be cleared. If the ftruncate() function is unsuccessful, the file is
unaffected.

See:

http://www.opengroup.org/onlinepubs/007904975/functions/ftruncate.html

Is it really so hard to look it up that we need to spout FUD instead?

MfG Kai

2003-05-23 18:54:44

by Andries Brouwer

[permalink] [raw]
Subject: Re: [patch?] truncate and timestamps

On Fri, May 23, 2003 at 08:02:00PM +0200, Kai Henningsen wrote:

> > On Fri, 23 May 2003 [email protected] wrote:
> > >
> > > On the other hand, my question was really a different one:
> > > do we want to follow POSIX, also in the silly requirement
> > > that truncate only sets mtime when the size changes, while
> > > O_TRUNC and ftruncate always set mtime.
>
> See:
>
> http://www.opengroup.org/onlinepubs/007904975/functions/ftruncate.html
>
> Is it really so hard to look it up that we need to spout FUD instead?

Look up ftruncate, look up truncate, loop up open with O_TRUNC and compare.
You will understand what the discussion is about.


2003-05-27 00:27:29

by Alan

[permalink] [raw]
Subject: Re: [patch?] truncate and timestamps

On Gwe, 2003-05-23 at 02:17, [email protected]
wrote:
> Andries had shown that there is _no_ consensus. Ergo, POSIX can take
> a hike and we should go with the behaviour convenient for us. It's that
> simple...

Skipping the update on a truncate not changing size is a performance win
although not a very useful one. I don't think we can ignore the standard
however. For one it simply means all the vendors have to fix it so they
can sell to Government etc.

Now we can certainly put the fix in -every- vendor tree on the planet
and not base, I'm just not sure for something so trivial it isnt better
just to fix it to the spec *or* beat the spec authors up to fix the
spec.

2003-05-27 01:06:32

by Andrew Morton

[permalink] [raw]
Subject: Re: [patch?] truncate and timestamps

Alan Cox <[email protected]> wrote:
>
> On Gwe, 2003-05-23 at 02:17, [email protected]
> wrote:
> > Andries had shown that there is _no_ consensus. Ergo, POSIX can take
> > a hike and we should go with the behaviour convenient for us. It's that
> > simple...
>
> Skipping the update on a truncate not changing size is a performance win
> although not a very useful one.

It makes the AIM7 & AIM9 numbers look good ;)

An open(O_TRUNC) of a zero-length file is presumably not totally uncommon,
so not calling into the fs there may benefit some things.

> I don't think we can ignore the standard
> however. For one it simply means all the vendors have to fix it so they
> can sell to Government etc.

I think this patch will put us back to the 2.4 behaviour while preserving
the speedup. It's a bit dopey (why do the timestamp update in the fs at
all?) but changing this stuff tends to cause subtle problems...




diff -puN fs/attr.c~truncate-timestamp-fix fs/attr.c
--- 25/fs/attr.c~truncate-timestamp-fix 2003-05-22 22:10:18.000000000 -0700
+++ 25-akpm/fs/attr.c 2003-05-22 22:14:06.000000000 -0700
@@ -68,10 +68,17 @@ int inode_setattr(struct inode * inode,
int error = 0;

if (ia_valid & ATTR_SIZE) {
- if (attr->ia_size != inode->i_size)
+ if (attr->ia_size != inode->i_size) {
error = vmtruncate(inode, attr->ia_size);
- if (error || (ia_valid == ATTR_SIZE))
- goto out;
+ if (error || (ia_valid == ATTR_SIZE))
+ goto out;
+ } else {
+ /*
+ * We skipped the truncate but must still update
+ * timestamps
+ */
+ ia_valid |= ATTR_MTIME|ATTR_CTIME;
+ }
}

lock_kernel();

_