On Thu, 2023-07-06 at 08:19 +0900, Damien Le Moal wrote:
> On 7/6/23 03:58, Jeff Layton wrote:
> > A rename potentially involves updating 4 different inode timestamps. Add
> > a function that handles the details sanely, and convert the libfs.c
> > callers to use it.
> >
> > Signed-off-by: Jeff Layton <[email protected]>
> > ---
> > fs/libfs.c | 36 +++++++++++++++++++++++++++---------
> > include/linux/fs.h | 2 ++
> > 2 files changed, 29 insertions(+), 9 deletions(-)
> >
> > diff --git a/fs/libfs.c b/fs/libfs.c
> > index a7e56baf8bbd..9ee79668c909 100644
> > --- a/fs/libfs.c
> > +++ b/fs/libfs.c
> > @@ -692,6 +692,31 @@ int simple_rmdir(struct inode *dir, struct dentry *dentry)
> > }
> > EXPORT_SYMBOL(simple_rmdir);
> >
> > +/**
> > + * simple_rename_timestamp - update the various inode timestamps for rename
> > + * @old_dir: old parent directory
> > + * @old_dentry: dentry that is being renamed
> > + * @new_dir: new parent directory
> > + * @new_dentry: target for rename
> > + *
> > + * POSIX mandates that the old and new parent directories have their ctime and
> > + * mtime updated, and that inodes of @old_dentry and @new_dentry (if any), have
> > + * their ctime updated.
> > + */
> > +void simple_rename_timestamp(struct inode *old_dir, struct dentry *old_dentry,
> > +			     struct inode *new_dir, struct dentry *new_dentry)
> > +{
> > +	struct inode *newino = d_inode(new_dentry);
> > +
> > +	old_dir->i_mtime = inode_set_ctime_current(old_dir);
> > +	if (new_dir != old_dir)
> > +		new_dir->i_mtime = inode_set_ctime_current(new_dir);
> > +	inode_set_ctime_current(d_inode(old_dentry));
> > +	if (newino)
> > +		inode_set_ctime_current(newino);
> > +}
> > +EXPORT_SYMBOL_GPL(simple_rename_timestamp);
> > +
> > int simple_rename_exchange(struct inode *old_dir, struct dentry *old_dentry,
> > struct inode *new_dir, struct dentry *new_dentry)
> > {
> > @@ -707,11 +732,7 @@ int simple_rename_exchange(struct inode *old_dir, struct dentry *old_dentry,
> > inc_nlink(old_dir);
> > }
> > }
> > - old_dir->i_ctime = old_dir->i_mtime =
> > - new_dir->i_ctime = new_dir->i_mtime =
> > - d_inode(old_dentry)->i_ctime =
> > - d_inode(new_dentry)->i_ctime = current_time(old_dir);
> > -
> > + simple_rename_timestamp(old_dir, old_dentry, new_dir, new_dentry);
>
> This is somewhat changing the current behavior: before the patch, the mtime and
> ctime of old_dir, new_dir and the inodes associated with the dentries are always
> equal. But given that simple_rename_timestamp() calls inode_set_ctime_current()
> 4 times, the times could potentially be different.
>
> I am not sure if that is an issue, but it seems that calling
> inode_set_ctime_current() once, recording the "now" time it sets and using that
> value to set all times may be more efficient and preserve the existing behavior.
>
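To make the concern in the quoted review concrete, here is a minimal
userspace sketch (not kernel code) contrasting "sample the clock once and
copy it" with "sample the clock per inode". The names toy_inode,
toy_clock_fn, stamp_shared, stamp_each and ticking_clock are all
hypothetical, and the ticking clock is a deliberately pessimistic
assumption that time advances between every call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an inode; only ctime is modeled here. */
struct toy_inode {
	int64_t ctime_ns;
};

/* Hypothetical clock callback standing in for the notion of "now". */
typedef int64_t (*toy_clock_fn)(void *ctx);

/* Pre-patch pattern: sample the clock once and copy that value everywhere. */
static void stamp_shared(struct toy_inode **inodes, size_t n,
			 toy_clock_fn now, void *ctx)
{
	int64_t t;

	assert(inodes && now);
	t = now(ctx);
	for (size_t i = 0; i < n; i++)
		inodes[i]->ctime_ns = t;
}

/* Post-patch pattern: every inode samples the clock for itself. */
static void stamp_each(struct toy_inode **inodes, size_t n,
		       toy_clock_fn now, void *ctx)
{
	assert(inodes && now);
	for (size_t i = 0; i < n; i++)
		inodes[i]->ctime_ns = now(ctx);
}

/* Test clock that advances by one tick on every call (worst case). */
static int64_t ticking_clock(void *ctx)
{
	int64_t *t = ctx;

	return (*t)++;
}
```

With a clock that never advances between calls, both helpers give
identical results; they only diverge when time moves during the update.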
I don't believe it's an issue. I've seen nothing in the POSIX spec that
mandates that timestamp updates to different inodes involved in an
operation be set to the _same_ value. It just says they must be updated.

It's also hard to believe that any software would depend on this either,
given that it's very inconsistent across filesystems today. AFAICT, this
was mostly done in the past just as a matter of convenience.

The other problem with doing it that way is that it assumes that
current_time(inode) should always return the same value when given
different inodes. Is it really correct to do this?

    inode_set_ctime(dir, inode_set_ctime_current(inode));

"dir" and "inode" are different inodes, after all, and you're setting
dir's timestamp to "inode"'s value. It's not a big deal today since
they're always on the same sb, but the ultimate goal of these changes is
to implement multigrain timestamps. That will mean fetching a
fine-grained timestamp for an update when the existing mtime or ctime
value has been queried via getattr.

With that change, I think it's best that we treat updates to different
inodes individually, as some of them may require updating with a
fine-grained timestamp and some may not.

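As a rough illustration of why per-inode updates matter once the
granularity depends on inode state, here is a toy userspace model. It is
only a sketch of the idea described above, not the kernel's
implementation: toy_inode, toy_getattr_ctime, toy_update_ctime and the
1 ms TOY_COARSE_NS granularity are all assumptions made up for this
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed coarse granularity for the toy model: 1 ms, in nanoseconds. */
#define TOY_COARSE_NS 1000000LL

/* Hypothetical inode: a ctime plus a "someone looked at it" flag. */
struct toy_inode {
	int64_t ctime_ns;
	bool ctime_queried;
};

/* getattr-like read: hands out the ctime and remembers it was observed. */
static int64_t toy_getattr_ctime(struct toy_inode *ino)
{
	assert(ino);
	ino->ctime_queried = true;
	return ino->ctime_ns;
}

/*
 * Update: an inode whose ctime was observed gets the fine-grained value,
 * any other inode gets "now" rounded down to the coarse granularity.
 */
static void toy_update_ctime(struct toy_inode *ino, int64_t now_ns)
{
	assert(ino);
	if (ino->ctime_queried)
		ino->ctime_ns = now_ns;
	else
		ino->ctime_ns = now_ns - (now_ns % TOY_COARSE_NS);
	ino->ctime_queried = false;
}
```

In this model two inodes updated at the same instant can legitimately end
up with different values, so copying one inode's result onto another
would defeat the purpose of the per-inode decision.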
> > return 0;
> > }
> > EXPORT_SYMBOL_GPL(simple_rename_exchange);
> > @@ -720,7 +741,6 @@ int simple_rename(struct mnt_idmap *idmap, struct inode *old_dir,
> > struct dentry *old_dentry, struct inode *new_dir,
> > struct dentry *new_dentry, unsigned int flags)
> > {
> > - struct inode *inode = d_inode(old_dentry);
> > int they_are_dirs = d_is_dir(old_dentry);
> >
> > if (flags & ~(RENAME_NOREPLACE | RENAME_EXCHANGE))
> > @@ -743,9 +763,7 @@ int simple_rename(struct mnt_idmap *idmap, struct inode *old_dir,
> > inc_nlink(new_dir);
> > }
> >
> > - old_dir->i_ctime = old_dir->i_mtime = new_dir->i_ctime =
> > - new_dir->i_mtime = inode->i_ctime = current_time(old_dir);
> > -
> > + simple_rename_timestamp(old_dir, old_dentry, new_dir, new_dentry);
> > return 0;
> > }
> > EXPORT_SYMBOL(simple_rename);
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index bdfbd11a5811..14e38bd900f1 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -2979,6 +2979,8 @@ extern int simple_open(struct inode *inode, struct file *file);
> > extern int simple_link(struct dentry *, struct inode *, struct dentry *);
> > extern int simple_unlink(struct inode *, struct dentry *);
> > extern int simple_rmdir(struct inode *, struct dentry *);
> > +void simple_rename_timestamp(struct inode *old_dir, struct dentry *old_dentry,
> > + struct inode *new_dir, struct dentry *new_dentry);
> > extern int simple_rename_exchange(struct inode *old_dir, struct dentry *old_dentry,
> > struct inode *new_dir, struct dentry *new_dentry);
> > extern int simple_rename(struct mnt_idmap *, struct inode *,
>
--
Jeff Layton <[email protected]>
On Wed, Jul 05, 2023 at 08:04:41PM -0400, Jeff Layton wrote:
>
> I don't believe it's an issue. I've seen nothing in the POSIX spec that
> mandates that timestamp updates to different inodes involved in an
> operation be set to the _same_ value. It just says they must be updated.
>
> It's also hard to believe that any software would depend on this either,
> given that it's very inconsistent across filesystems today. AFAICT, this
> was mostly done in the past just as a matter of convenience.
I've seen this assumption in several programs:

mutt buffy.c
https://sources.debian.org/src/mutt/2.2.9-1/buffy.c/?hl=625#L625

    if (mailbox->newly_created &&
        (sb->st_ctime != sb->st_mtime || sb->st_ctime != sb->st_atime))
        mailbox->newly_created = 0;

neomutt mbox/mbox.c
https://sources.debian.org/src/neomutt/20220429+dfsg1-4.1/mbox/mbox.c/?hl=1820#L1820

    if (m->newly_created && ((st.st_ctime != st.st_mtime) || (st.st_ctime != st.st_atime)))
        m->newly_created = false;

screen logfile.c
https://sources.debian.org/src/screen/4.9.0-4/logfile.c/?hl=130#L130

    if ((!s->st_dev && !s->st_ino) || /* stat failed, that's new! */
        !s->st_nlink || /* red alert: file unlinked */
        (s->st_size < o.st_size) || /* file truncated */
        (s->st_mtime != o.st_mtime) || /* file modified */
        ((s->st_ctime != o.st_ctime) && /* file changed (moved) */
         !(s->st_mtime == s->st_ctime && /* and it was not a change */
           o.st_ctime < s->st_ctime))) /* due to delayed nfs write */
    {

nemo libnemo-private/nemo-vfs-file.c
https://sources.debian.org/src/nemo/5.6.5-1/libnemo-private/nemo-vfs-file.c/?hl=344#L344

    /* mtime is when the contents changed; ctime is when the
     * contents or the permissions (inc. owner/group) changed.
     * So we can only know when the permissions changed if mtime
     * and ctime are different.
     */
    if (file->details->mtime == file->details->ctime) {
        return FALSE;
    }

While looking for more examples, I found a perl test that seems to suggest
that at least Solaris, AFS, AmigaOS, DragonFly BSD do as you suggest:
https://sources.debian.org/src/perl/5.36.0-7/t/op/stat.t/?hl=158#L140

Thanks
On Thu, 2023-07-06 at 21:02 +0000, Seth Arnold wrote:
> On Wed, Jul 05, 2023 at 08:04:41PM -0400, Jeff Layton wrote:
> >
> > I don't believe it's an issue. I've seen nothing in the POSIX spec that
> > mandates that timestamp updates to different inodes involved in an
> > operation be set to the _same_ value. It just says they must be updated.
> >
> > It's also hard to believe that any software would depend on this either,
> > given that it's very inconsistent across filesystems today. AFAICT, this
> > was mostly done in the past just as a matter of convenience.
>
> I've seen this assumption in several programs:
>
Thanks for looking into this!

To be clear, POSIX doesn't require that _different_ inodes ever be set
to the same timestamp value. IOW, it certainly doesn't require that the
source and target directories on a rename() end up with the exact same
timestamp value.

Granted, POSIX is rather vague on timestamps in general, but most of the
examples below involve comparing different timestamps on the _same_
inode.

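For anyone who wants to look at the timestamps of the different inodes
involved in a rename() from userspace, a small sketch along these lines
may help. It assumes Linux with glibc and a writable /tmp; the
rename_stamps struct, the rename_and_stat() helper and the temporary
directory layout are made up for this example. It only records the
values and makes no claim about whether they match on any given
filesystem.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Timestamps of the three inodes touched by a cross-directory rename. */
struct rename_stamps {
	struct timespec old_dir_mtime;
	struct timespec new_dir_mtime;
	struct timespec file_ctime;
};

/*
 * Create <tmp>/a/f and <tmp>/b, rename a/f to b/f, then stat both
 * directories and the moved file. Returns 0 on success, -1 on failure.
 * Everything created here is removed again before returning.
 */
static int rename_and_stat(struct rename_stamps *res)
{
	char base[] = "/tmp/rename-ts-XXXXXX"; /* assumed writable location */
	char a[32], b[32], src[64], dst[64];
	struct stat st;
	int fd, ret = -1;

	assert(res);
	if (!mkdtemp(base))
		return -1;
	snprintf(a, sizeof(a), "%s/a", base);
	snprintf(b, sizeof(b), "%s/b", base);
	snprintf(src, sizeof(src), "%s/f", a);
	snprintf(dst, sizeof(dst), "%s/f", b);

	if (mkdir(a, 0700) || mkdir(b, 0700))
		goto cleanup;
	fd = open(src, O_CREAT | O_EXCL | O_WRONLY, 0600);
	if (fd < 0)
		goto cleanup;
	close(fd);
	if (rename(src, dst))
		goto cleanup;

	if (stat(a, &st))
		goto cleanup;
	res->old_dir_mtime = st.st_mtim;
	if (stat(b, &st))
		goto cleanup;
	res->new_dir_mtime = st.st_mtim;
	if (stat(dst, &st))
		goto cleanup;
	res->file_ctime = st.st_ctim;
	ret = 0;

cleanup:
	/* Best-effort removal; failures on missing paths are ignored. */
	unlink(src);
	unlink(dst);
	rmdir(a);
	rmdir(b);
	rmdir(base);
	return ret;
}
```

A caller can print the tv_sec and tv_nsec fields of each entry and
compare them by eye on whichever filesystem backs /tmp.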
> mutt buffy.c
> https://sources.debian.org/src/mutt/2.2.9-1/buffy.c/?hl=625#L625
>
> if (mailbox->newly_created &&
> (sb->st_ctime != sb->st_mtime || sb->st_ctime != sb->st_atime))
> mailbox->newly_created = 0;
>
This should be fine with this patchset. Note that this is comparing
a/c/mtime on the same inode, and our usual pattern on inode
instantiation is:

    inode->i_atime = inode->i_mtime = inode_set_ctime_current(inode);

...which should result in all of inode's timestamps being synchronized.
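For reference, the shape of that userspace check can be sketched as a
small helper operating on a single struct stat. The function name
looks_newly_created() is hypothetical; it only mirrors the comparison in
the quoted mutt snippet, which looks at three timestamps of one inode at
one-second resolution.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: true when atime, mtime and ctime of one inode
 * still carry the same second, i.e. nothing has touched the file since
 * all three were set together.
 */
static bool looks_newly_created(const struct stat *sb)
{
	assert(sb);
	return sb->st_ctime == sb->st_mtime && sb->st_ctime == sb->st_atime;
}
```

A caller would fill the struct with stat() or fstat() on the mailbox and
pass it in; no second inode is involved in the comparison.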
>
> neomutt mbox/mbox.c
> https://sources.debian.org/src/neomutt/20220429+dfsg1-4.1/mbox/mbox.c/?hl=1820#L1820
>
> if (m->newly_created && ((st.st_ctime != st.st_mtime) || (st.st_ctime != st.st_atime)))
> m->newly_created = false;
>
Ditto here.
>
> screen logfile.c
> https://sources.debian.org/src/screen/4.9.0-4/logfile.c/?hl=130#L130
>
> if ((!s->st_dev && !s->st_ino) || /* stat failed, that's new! */
> !s->st_nlink || /* red alert: file unlinked */
> (s->st_size < o.st_size) || /* file truncated */
> (s->st_mtime != o.st_mtime) || /* file modified */
> ((s->st_ctime != o.st_ctime) && /* file changed (moved) */
> !(s->st_mtime == s->st_ctime && /* and it was not a change */
> o.st_ctime < s->st_ctime))) /* due to delayed nfs write */
> {
>
This one is really weird. You have two different struct stat's, "o" and
"s". I assume though that these should be stat values from the same
inode, because otherwise this comparison would make no sense:

    ((s->st_ctime != o.st_ctime) && /* file changed (moved) */

In general, we can never contrive to ensure that the ctimes of two
different inodes are the same, since the ctime is always set by the
kernel to the current time, and you'd have to ensure that they were
created within the same jiffy (at least with today's code).

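If a program wants to be sure that two stat snapshots really describe the
same inode before comparing their ctimes, it can check st_dev and st_ino
first. The following is a minimal sketch under that assumption; the
helpers same_inode() and ctime_changed() are hypothetical and are not
taken from screen.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>

/* True when both stat results describe the same inode on the same device. */
static bool same_inode(const struct stat *a, const struct stat *b)
{
	assert(a && b);
	return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/*
 * Hypothetical helper: report a ctime change only when "older" and
 * "newer" are snapshots of one inode. Comparing ctimes across different
 * inodes says nothing useful, so that case returns false.
 */
static bool ctime_changed(const struct stat *older, const struct stat *newer)
{
	return same_inode(older, newer) &&
	       older->st_ctime != newer->st_ctime;
}
```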
> nemo libnemo-private/nemo-vfs-file.c
> https://sources.debian.org/src/nemo/5.6.5-1/libnemo-private/nemo-vfs-file.c/?hl=344#L344
>
> /* mtime is when the contents changed; ctime is when the
> * contents or the permissions (inc. owner/group) changed.
> * So we can only know when the permissions changed if mtime
> * and ctime are different.
> */
> if (file->details->mtime == file->details->ctime) {
> return FALSE;
> }
>
As with the first examples, this involves comparing timestamps on the
same inode, which should be fine.
>
> While looking for more examples, I found a perl test that seems to suggest
> that at least Solaris, AFS, AmigaOS, DragonFly BSD do as you suggest:
> https://sources.debian.org/src/perl/5.36.0-7/t/op/stat.t/?hl=158#L140
>
(I kinda miss Perl. I wrote a bunch of stuff in it in the 90's and early
naughties)

I think this test is supposed to be testing whether the mtime changes on
link()?

-----------------8<----------------
    my($nlink, $mtime, $ctime) = (stat($tmpfile))[$NLINK, $MTIME, $CTIME];
    [...]
    skip "Solaris tmpfs has different mtime/ctime link semantics", 2
        if $Is_Solaris and $cwd =~ m#^/tmp# and
           $mtime && $mtime == $ctime;
-----------------8<----------------

...again, I think this would be ok too since it's just comparing the
mtime and ctime of the same inode. Granted this is a Solaris-specific
test, but Linux would be fine here too.

So in conclusion, I don't think this patchset will cause problems with
any of the above code.

--
Jeff Layton <[email protected]>