2005-09-19 17:57:01

by Steve French

[permalink] [raw]
Subject: ctime set by truncate even if NOCMTIME requested

I am seeing requests to set ctime on truncate which does not make any
sense to me as I was testing with the flag that should have turned that
off. ie my inodes having S_NOCMTIME set, as NFS does.

do_truncate (line 206 of open.c) sets
newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME
instead of
newattrs.ia_valid = ATTR_SIZE;
if(!IS_NOCMTIME(inode))
newattrs.ia_valid |= ATTR_CTIME;

I thought that the correct way to handle this for network filesystems,
is to let the server set the mtime and ctime unless the application
explicitly sets the attributes (in the case of the sys call truncate or
ftruncate the application is not explicitly setting the ctime/mtime as
it would on a backup/restore so they should be ignored for the network
fs so the server will set it correctl to its time, reducing traffic and
more accurately representing the time it got updated).

Shouldn't there be a IS_NOCMTIME check in the truncate path in fs/open.c?


2005-09-19 18:58:41

by Trond Myklebust

[permalink] [raw]
Subject: Re: ctime set by truncate even if NOCMTIME requested

må den 19.09.2005 Klokka 12:51 (-0500) skreiv Steve French:
> I am seeing requests to set ctime on truncate which does not make any
> sense to me as I was testing with the flag that should have turned that
> off. ie my inodes having S_NOCMTIME set, as NFS does.
>
> do_truncate (line 206 of open.c) sets
> newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME
> instead of
> newattrs.ia_valid = ATTR_SIZE;
> if(!IS_NOCMTIME(inode))
> newattrs.ia_valid |= ATTR_CTIME;
>
> I thought that the correct way to handle this for network filesystems,
> is to let the server set the mtime and ctime unless the application
> explicitly sets the attributes (in the case of the sys call truncate or
> ftruncate the application is not explicitly setting the ctime/mtime as
> it would on a backup/restore so they should be ignored for the network
> fs so the server will set it correctl to its time, reducing traffic and
> more accurately representing the time it got updated).
>
> Shouldn't there be a IS_NOCMTIME check in the truncate path in fs/open.c?

See the discussion on this a couple of weeks back.

It is quite correct for the kernel to request that the filesystem set
ctime/mtime on successful calls to open(O_TRUNC).
http://www.opengroup.org/onlinepubs/009695399/toc.htm

It is _incorrect_ for it to request that ctime/mtime be set (or that the
suid/sgid mode bit be cleared) if a truncate()/ftruncate() call results
in no size change.
http://www.opengroup.org/onlinepubs/009695399/toc.htm

So the current do_truncate() does have to be changed. Adding a check for
IS_NOCMTIME would be wrong, though.

Cheers,
Trond

2005-09-19 21:04:07

by Steve French

[permalink] [raw]
Subject: Re: ctime set by truncate even if NOCMTIME requested

Trond Myklebust wrote:

>It is quite correct for the kernel to request that the filesystem set
>ctime/mtime on successful calls to open(O_TRUNC).
> http://www.opengroup.org/onlinepubs/009695399/toc.htm
>
>
I agree that it is correct to set ctime/mtime here, just not convinced
it it worth setting it twice which is what will happen for non-local fs.

client truncate -> client setattr(for both size and ctime) -> server
setattr (which will have a sideffect of ctime to be set on the server,
not just the change to file size) and then another call to the server to
set the ctime (which will end up setting ctime twice - one to the
(correct) server's time and once to the less correct client's time.

Problem is I can't tell the difference inside the fs between the case of
an application that needs to set ctime explicitly to something different
than the server would want (e.g. presumably the touch utility) and
something which can be ignored. I noticed this when I found a server
(windows me) that was returning an error for setting timestamps - and
this was causing errors to be returned to truncate. In this case
(server refusing to let the client fs (re)set the timestamps), I would
like to return an error on touch, but succeed on truncate but I don't
see a way to tell the difference reliably.

2005-09-19 21:28:42

by Trond Myklebust

[permalink] [raw]
Subject: Re: ctime set by truncate even if NOCMTIME requested

må den 19.09.2005 Klokka 15:58 (-0500) skreiv Steve French:
> Trond Myklebust wrote:
>
> >It is quite correct for the kernel to request that the filesystem set
> >ctime/mtime on successful calls to open(O_TRUNC).
> > http://www.opengroup.org/onlinepubs/009695399/toc.htm
> >
> >
> I agree that it is correct to set ctime/mtime here, just not convinced
> it it worth setting it twice which is what will happen for non-local fs.
>
> client truncate -> client setattr(for both size and ctime) -> server
> setattr (which will have a sideffect of ctime to be set on the server,
> not just the change to file size) and then another call to the server to
> set the ctime (which will end up setting ctime twice - one to the
> (correct) server's time and once to the less correct client's time.

If the VFS sets ATTR_[ACM]TIME, you should always assume that you are
being requested to set the [acm]time to server time. Doesn't CIFS allow
you that option?

Cheers,
Trond

2005-09-20 01:37:04

by Trond Myklebust

[permalink] [raw]
Subject: Re: ctime set by truncate even if NOCMTIME requested

må den 19.09.2005 Klokka 19:35 (-0500) skreiv Steve French:

> CIFS time format is similar to "DCE time" ie expressed as 100
> nanoseconds units since 1600 UTC. In many large networks cifs clients
> in a domain will share an ntp or equivalent time source with the
> servers in the domain and will agree reasonably closely on what the
> UTC time is. Ideally (at least from a performance perspective) most
> time stamps would be set implicitly by the server rather than
> explicitly by the client (in particular time of file creation, which
> is set by the server when the file is created, and last write time,
> set by the server after every write). truncate of course is another
> example where time could be set implicitly by the server rather than
> explicitly by the client, but is hard to distinguish from explicit
> setattr.

That sucks... Even if the clients are synchronised using ntp, there is
no guarantee that two almost-simultaneous setattr RPC calls can't race
and cause time to go backwards if they have to explicitly set to the
"current" client time.

However if you know the cases where time is set implicitly by the
server, why can't you simply optimise away the ATTR_CTIME and/or
ATTR_MTIME?

Cheers,
Trond

2005-09-20 02:21:42

by Steve French

[permalink] [raw]
Subject: Re: ctime set by truncate even if NOCMTIME requested

Trond Myklebust wrote:

>
>However if you know the cases where time is set implicitly by the
>server, why can't you simply optimise away the ATTR_CTIME and/or
>ATTR_MTIME?
>
>
>
I don't see any case other than explicit application calls on the client
to utime in which we would ever want the client to send its timestamp to
the server.

The cases that have to be checked for for the cifs client case are few -
setting the file size (truncate/ftruncate) and chmod/chown (which sets
CTIME on the client, not just the setting the mode) and in my traces
they do look different in setattr than calls to setattr that come
through utimes. For example, it probably is safe to assume that ctime
never has to be sent to the server when also one or more of the the
mode, owner or size is being set - and no other user space application
(just via the truncate system call) would ever call notify_change
(setattr) for both size and time, but it does make me nervous to throw
away any request to update the timestamps remotely if the size is also
changing (the change of the file size does need to go to the server).
It does seem like
utime(filename, timeval)
may be the only time we want to send time changes to the server but I am
not certain how risky such an approach is even after scanning fs/open.c
to ignore time changes except when both ATIME/MTIME/CTIME are set at the
same time (as they are in sys_utime and do_utimes). Most people
probably don't care if the server and client clocks are not too far off,
but it does affect performance (presumably even noticeable on something
like fsx test)

2005-09-20 08:50:14

by Miklos Szeredi

[permalink] [raw]
Subject: Re: ctime set by truncate even if NOCMTIME requested

> See the discussion on this a couple of weeks back.
>
> It is quite correct for the kernel to request that the filesystem set
> ctime/mtime on successful calls to open(O_TRUNC).
> http://www.opengroup.org/onlinepubs/009695399/toc.htm
>
> It is _incorrect_ for it to request that ctime/mtime be set (or that the
> suid/sgid mode bit be cleared) if a truncate()/ftruncate() call results
> in no size change.
> http://www.opengroup.org/onlinepubs/009695399/toc.htm
>
> So the current do_truncate() does have to be changed. Adding a check for
> IS_NOCMTIME would be wrong, though.

These are othogonal problems.

IS_NOCMTIME is the filesystem's way of saying that it doesn't need
->setattr on truncate(), write(), etc. Why? Because it can do the
[cm]time change implicitly _within_ the operation.

If IS_NOCMTIME is set, the only place where the filesystem should get
ATTR_MTIME or ATTR_CTIME (and for that matter ATTR_MTIME_SET) is from
sys_utime[s].

Miklos

2005-09-20 08:54:17

by Miklos Szeredi

[permalink] [raw]
Subject: Re: ctime set by truncate even if NOCMTIME requested

> However if you know the cases where time is set implicitly by the
> server, why can't you simply optimise away the ATTR_CTIME and/or
> ATTR_MTIME?

How? By checking whether ATTR_SIZE is also set? Yes, that would work
for truncate(), but it's rather more ugly than the proposed check for
IS_NOCMTIME.

Miklos

2005-09-20 10:07:22

by Miklos Szeredi

[permalink] [raw]
Subject: Re: ctime set by truncate even if NOCMTIME requested

> > However if you know the cases where time is set implicitly by the
> > server, why can't you simply optimise away the ATTR_CTIME and/or
> > ATTR_MTIME?
>
> How? By checking whether ATTR_SIZE is also set? Yes, that would work
> for truncate(), but it's rather more ugly than the proposed check for
> IS_NOCMTIME.

I have to correct myself: it seems, the check is not needed. Ctime is
never set explicitly. So filesystem can simply ignore ATTR_CTIME, and
do the right thing without it.

ATTR_MTIME is _only_ set in utime[s], which all filesystems want to
honor.

So I think the only thing CIFS needs is simply to ignore ATTR_CTIME.

Miklos

2005-09-20 12:12:47

by Trond Myklebust

[permalink] [raw]
Subject: Re: ctime set by truncate even if NOCMTIME requested

ty den 20.09.2005 Klokka 12:05 (+0200) skreiv Miklos Szeredi:
> These are othogonal problems.
>
> IS_NOCMTIME is the filesystem's way of saying that it doesn't need
> ->setattr on truncate(), write(), etc. Why? Because it can do the
> [cm]time change implicitly _within_ the operation.

No. IS_NOCMTIME is the filesystem's way of telling the VFS never to
screw around with the values of inode->i_mtime and inode->i_ctime.

The reason is that crap like inode_update_time() explicitly sets these
values to the local current time instead of using server timestamps.

OTOH, ->setattr with an ATTR_MTIME or ATTR_CTIME argument is telling the
filesystem to update the timestamp. The filesystem then has a choice of
whether or not to use current time, server time, or to just ignore it if
the timestamp is going to be be updated by the other ->setattr arguments
anyway.

> ATTR_MTIME is _only_ set in utime[s], which all filesystems want to
> honor.

ATTR_MTIME is set in both utimes and truncate. In the latter case, CIFS
could optimise it away by noting that ATTR_SIZE will set mtime anyway.
As for ATTR_CTIME, that is also set in chown(), chmod(). It too can be
optimised away for those operations, assuming that CIFS servers
automatically update ctime.

Cheers,
Trond

2005-09-20 12:16:58

by Andreas Dilger

[permalink] [raw]
Subject: Re: ctime set by truncate even if NOCMTIME requested

On Sep 19, 2005 21:16 -0500, Steve French wrote:
> It does seem like
> utime(filename, timeval)
> may be the only time we want to send time changes to the server but I am
> not certain how risky such an approach is even after scanning fs/open.c
> to ignore time changes except when both ATIME/MTIME/CTIME are set at the
> same time (as they are in sys_utime and do_utimes). Most people
> probably don't care if the server and client clocks are not too far off,
> but it does affect performance (presumably even noticeable on something
> like fsx test)

For Lustre (since we are patching the VFS anyways) we have added an
ATTR_CTIME_SET flag to distinguish whether the client has explicitly
set the ia_ctime field, or if it is an implicit update.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

2005-09-20 12:22:03

by Miklos Szeredi

[permalink] [raw]
Subject: Re: ctime set by truncate even if NOCMTIME requested

> > ATTR_MTIME is _only_ set in utime[s], which all filesystems want to
> > honor.
>
> ATTR_MTIME is set in both utimes and truncate.

Not in truncate:

int do_truncate(struct dentry *dentry, loff_t length)
{
/* ... */

newattrs.ia_size = length;
newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME;

down(&dentry->d_inode->i_sem);
err = notify_change(dentry, &newattrs);
up(&dentry->d_inode->i_sem);
return err;
}

> As for ATTR_CTIME, that is also set in chown(), chmod(). It too can be
> optimised away for those operations, assuming that CIFS servers
> automatically update ctime.

Yes, I realized this shortly after posting.

Miklos

2005-09-20 12:28:33

by Trond Myklebust

[permalink] [raw]
Subject: Re: ctime set by truncate even if NOCMTIME requested

ty den 20.09.2005 Klokka 14:20 (+0200) skreiv Miklos Szeredi:
> > > ATTR_MTIME is _only_ set in utime[s], which all filesystems want to
> > > honor.
> >
> > ATTR_MTIME is set in both utimes and truncate.
>
> Not in truncate:
>
> int do_truncate(struct dentry *dentry, loff_t length)
> {
> /* ... */
>
> newattrs.ia_size = length;
> newattrs.ia_valid = ATTR_SIZE | ATTR_CTIME;
>
> down(&dentry->d_inode->i_sem);
> err = notify_change(dentry, &newattrs);
> up(&dentry->d_inode->i_sem);
> return err;
> }

Oops. You're right. That is a recent change that I missed...

Cheers,
Trond