2008-01-11 00:44:21

by Anton Salikhmetov

[permalink] [raw]
Subject: [PATCH 2/2][RFC][BUG] msync: updating ctime and mtime at syncing

From: Anton Salikhmetov <[email protected]>

The patch contains changes for updating the ctime and mtime fields for memory mapped files:

1) adding a new flag triggering update of the inode data;
2) implementing a helper function for checking that flag and updating ctime and mtime;
3) updating time stamps for mapped files in sys_msync() and do_fsync().

Signed-off-by: Anton Salikhmetov <[email protected]>

---

diff --git a/fs/buffer.c b/fs/buffer.c
index 7249e01..09adf7e 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -719,6 +719,7 @@ static int __set_page_dirty(struct page *page,
}
write_unlock_irq(&mapping->tree_lock);
__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
+ set_bit(AS_MCTIME, &mapping->flags);

return 1;
}
diff --git a/fs/inode.c b/fs/inode.c
index ed35383..c5b954e 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -22,6 +22,7 @@
#include <linux/bootmem.h>
#include <linux/inotify.h>
#include <linux/mount.h>
+#include <linux/file.h>

/*
* This is needed for the following functions:
@@ -1282,6 +1283,18 @@ void file_update_time(struct file *file)

EXPORT_SYMBOL(file_update_time);

+/*
+ * Update the ctime and mtime stamps after checking if they are to be updated.
+ */
+void mapped_file_update_time(struct file *file)
+{
+ if (test_and_clear_bit(AS_MCTIME, &file->f_mapping->flags)) {
+ get_file(file);
+ file_update_time(file);
+ fput(file);
+ }
+}
+
int inode_needs_sync(struct inode *inode)
{
if (IS_SYNC(inode))
diff --git a/fs/sync.c b/fs/sync.c
index 7cd005e..df57507 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -87,6 +87,8 @@ long do_fsync(struct file *file, int datasync)
goto out;
}

+ mapped_file_update_time(file);
+
ret = filemap_fdatawrite(mapping);

/*
diff --git a/include/linux/fs.h b/include/linux/fs.h
index b3ec4a4..0b05118 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1978,6 +1978,7 @@ extern int inode_change_ok(struct inode *, struct iattr *);
extern int __must_check inode_setattr(struct inode *, struct iattr *);

extern void file_update_time(struct file *file);
+extern void mapped_file_update_time(struct file *file);

static inline ino_t parent_ino(struct dentry *dentry)
{
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index db8a410..bf0f9e7 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -17,8 +17,9 @@
* Bits in mapping->flags. The lower __GFP_BITS_SHIFT bits are the page
* allocation mode flags.
*/
-#define AS_EIO (__GFP_BITS_SHIFT + 0) /* IO error on async write */
+#define AS_EIO (__GFP_BITS_SHIFT + 0) /* IO error on async write */
#define AS_ENOSPC (__GFP_BITS_SHIFT + 1) /* ENOSPC on async write */
+#define AS_MCTIME (__GFP_BITS_SHIFT + 2) /* mtime and ctime to update */

static inline void mapping_set_error(struct address_space *mapping, int error)
{
diff --git a/mm/msync.c b/mm/msync.c
index e788f7b..9d0a8f9 100644
--- a/mm/msync.c
+++ b/mm/msync.c
@@ -5,6 +5,7 @@
* Copyright (C) 1994-1999 Linus Torvalds
*
* Substantial code cleanup.
+ * Updating the ctime and mtime stamps for memory mapped files.
* Copyright (C) 2008 Anton Salikhmetov <[email protected]>
*/

@@ -22,6 +23,10 @@
* Nor does it mark the relevant pages dirty (it used to up to 2.6.17).
* Now it doesn't do anything, since dirty pages are properly tracked.
*
+ * The msync() system call updates the ctime and mtime fields for
+ * the mapped file when called with the MS_SYNC or MS_ASYNC flags
+ * according to the POSIX standard.
+ *
* The application may now run fsync() to
* write out the dirty pages and wait on the writeout and check the result.
* Or the application may run fadvise(FADV_DONTNEED) against the fd to start
@@ -74,14 +79,17 @@ asmlinkage long sys_msync(unsigned long start, size_t len, int flags)
break;
}
file = vma->vm_file;
- if ((flags & MS_SYNC) && file && (vma->vm_flags & VM_SHARED)) {
- get_file(file);
- up_read(&mm->mmap_sem);
- error = do_fsync(file, 0);
- fput(file);
- if (error)
- return error;
- down_read(&mm->mmap_sem);
+ if (file && (vma->vm_flags & VM_SHARED)) {
+ mapped_file_update_time(file);
+ if (flags & MS_SYNC) {
+ get_file(file);
+ up_read(&mm->mmap_sem);
+ error = do_fsync(file, 0);
+ fput(file);
+ if (error)
+ return error;
+ down_read(&mm->mmap_sem);
+ }
}

start = vma->vm_end;
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index d55cfca..a85df0b 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1025,6 +1025,7 @@ int __set_page_dirty_nobuffers(struct page *page)
if (mapping->host) {
/* !PageAnon && !swapper_space */
__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
+ set_bit(AS_MCTIME, &mapping->flags);
}
return 1;
}


2008-01-11 18:55:26

by Peter Staubach

[permalink] [raw]
Subject: Re: [PATCH 2/2][RFC][BUG] msync: updating ctime and mtime at syncing

Anton Salikhmetov wrote:
> From: Anton Salikhmetov <[email protected]>
>
> The patch contains changes for updating the ctime and mtime fields for memory mapped files:
>
> 1) adding a new flag triggering update of the inode data;
> 2) implementing a helper function for checking that flag and updating ctime and mtime;
> 3) updating time stamps for mapped files in sys_msync() and do_fsync().
>
>

What happens if the application does not issue either an msync
or an fsync call, but either just munmap's the region or just
keeps on manipulating it? It appears to me that the file times
will never be updated in these cases.

It seems to me that the file times should be updated eventually,
and perhaps even regularly if the file is being constantly
updated via the mmap'd region.

Thanx...

ps

> Signed-off-by: Anton Salikhmetov <[email protected]>
>
> ---
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 7249e01..09adf7e 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -719,6 +719,7 @@ static int __set_page_dirty(struct page *page,
> }
> write_unlock_irq(&mapping->tree_lock);
> __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
> + set_bit(AS_MCTIME, &mapping->flags);
>
> return 1;
> }
> diff --git a/fs/inode.c b/fs/inode.c
> index ed35383..c5b954e 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -22,6 +22,7 @@
> #include <linux/bootmem.h>
> #include <linux/inotify.h>
> #include <linux/mount.h>
> +#include <linux/file.h>
>
> /*
> * This is needed for the following functions:
> @@ -1282,6 +1283,18 @@ void file_update_time(struct file *file)
>
> EXPORT_SYMBOL(file_update_time);
>
> +/*
> + * Update the ctime and mtime stamps after checking if they are to be updated.
> + */
> +void mapped_file_update_time(struct file *file)
> +{
> + if (test_and_clear_bit(AS_MCTIME, &file->f_mapping->flags)) {
> + get_file(file);
> + file_update_time(file);
> + fput(file);
> + }
> +}
> +
> int inode_needs_sync(struct inode *inode)
> {
> if (IS_SYNC(inode))
> diff --git a/fs/sync.c b/fs/sync.c
> index 7cd005e..df57507 100644
> --- a/fs/sync.c
> +++ b/fs/sync.c
> @@ -87,6 +87,8 @@ long do_fsync(struct file *file, int datasync)
> goto out;
> }
>
> + mapped_file_update_time(file);
> +
> ret = filemap_fdatawrite(mapping);
>
> /*
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index b3ec4a4..0b05118 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1978,6 +1978,7 @@ extern int inode_change_ok(struct inode *, struct iattr *);
> extern int __must_check inode_setattr(struct inode *, struct iattr *);
>
> extern void file_update_time(struct file *file);
> +extern void mapped_file_update_time(struct file *file);
>
> static inline ino_t parent_ino(struct dentry *dentry)
> {
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index db8a410..bf0f9e7 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -17,8 +17,9 @@
> * Bits in mapping->flags. The lower __GFP_BITS_SHIFT bits are the page
> * allocation mode flags.
> */
> -#define AS_EIO (__GFP_BITS_SHIFT + 0) /* IO error on async write */
> +#define AS_EIO (__GFP_BITS_SHIFT + 0) /* IO error on async write */
> #define AS_ENOSPC (__GFP_BITS_SHIFT + 1) /* ENOSPC on async write */
> +#define AS_MCTIME (__GFP_BITS_SHIFT + 2) /* mtime and ctime to update */
>
> static inline void mapping_set_error(struct address_space *mapping, int error)
> {
> diff --git a/mm/msync.c b/mm/msync.c
> index e788f7b..9d0a8f9 100644
> --- a/mm/msync.c
> +++ b/mm/msync.c
> @@ -5,6 +5,7 @@
> * Copyright (C) 1994-1999 Linus Torvalds
> *
> * Substantial code cleanup.
> + * Updating the ctime and mtime stamps for memory mapped files.
> * Copyright (C) 2008 Anton Salikhmetov <[email protected]>
> */
>
> @@ -22,6 +23,10 @@
> * Nor does it mark the relevant pages dirty (it used to up to 2.6.17).
> * Now it doesn't do anything, since dirty pages are properly tracked.
> *
> + * The msync() system call updates the ctime and mtime fields for
> + * the mapped file when called with the MS_SYNC or MS_ASYNC flags
> + * according to the POSIX standard.
> + *
> * The application may now run fsync() to
> * write out the dirty pages and wait on the writeout and check the result.
> * Or the application may run fadvise(FADV_DONTNEED) against the fd to start
> @@ -74,14 +79,17 @@ asmlinkage long sys_msync(unsigned long start, size_t len, int flags)
> break;
> }
> file = vma->vm_file;
> - if ((flags & MS_SYNC) && file && (vma->vm_flags & VM_SHARED)) {
> - get_file(file);
> - up_read(&mm->mmap_sem);
> - error = do_fsync(file, 0);
> - fput(file);
> - if (error)
> - return error;
> - down_read(&mm->mmap_sem);
> + if (file && (vma->vm_flags & VM_SHARED)) {
> + mapped_file_update_time(file);
> + if (flags & MS_SYNC) {
> + get_file(file);
> + up_read(&mm->mmap_sem);
> + error = do_fsync(file, 0);
> + fput(file);
> + if (error)
> + return error;
> + down_read(&mm->mmap_sem);
> + }
> }
>
> start = vma->vm_end;
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index d55cfca..a85df0b 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1025,6 +1025,7 @@ int __set_page_dirty_nobuffers(struct page *page)
> if (mapping->host) {
> /* !PageAnon && !swapper_space */
> __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
> + set_bit(AS_MCTIME, &mapping->flags);
> }
> return 1;
> }
>
>

2008-01-11 18:59:36

by Peter Staubach

[permalink] [raw]
Subject: Re: [PATCH 2/2][RFC][BUG] msync: updating ctime and mtime at syncing

--- linux-2.6.20.i686/fs/buffer.c.org
+++ linux-2.6.20.i686/fs/buffer.c
@@ -710,6 +710,7 @@ EXPORT_SYMBOL(mark_buffer_dirty_inode);
int __set_page_dirty_buffers(struct page *page)
{
struct address_space * const mapping = page_mapping(page);
+ int ret = 0;

if (unlikely(!mapping))
return !TestSetPageDirty(page);
@@ -727,7 +728,7 @@ int __set_page_dirty_buffers(struct page
spin_unlock(&mapping->private_lock);

if (TestSetPageDirty(page))
- return 0;
+ goto out;

write_lock_irq(&mapping->tree_lock);
if (page->mapping) { /* Race with truncate? */
@@ -740,7 +741,11 @@ int __set_page_dirty_buffers(struct page
}
write_unlock_irq(&mapping->tree_lock);
__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
- return 1;
+ ret = 1;
+out:
+ if (page_mapped(page))
+ set_bit(AS_MCTIME, &mapping->flags);
+ return ret;
}
EXPORT_SYMBOL(__set_page_dirty_buffers);

--- linux-2.6.20.i686/fs/fs-writeback.c.org
+++ linux-2.6.20.i686/fs/fs-writeback.c
@@ -167,6 +167,13 @@ __sync_single_inode(struct inode *inode,

spin_unlock(&inode_lock);

+ if (test_and_clear_bit(AS_MCTIME, &mapping->flags)) {
+ if (S_ISBLK(inode->i_mode))
+ bd_inode_update_time(inode);
+ else
+ inode_update_time(inode);
+ }
+
ret = do_writepages(mapping, wbc);

/* Don't write the inode if only I_DIRTY_PAGES was set */
--- linux-2.6.20.i686/fs/inode.c.org
+++ linux-2.6.20.i686/fs/inode.c
@@ -1201,8 +1201,8 @@ void touch_atime(struct vfsmount *mnt, s
EXPORT_SYMBOL(touch_atime);

/**
- * file_update_time - update mtime and ctime time
- * @file: file accessed
+ * inode_update_time - update mtime and ctime time
+ * @inode: file accessed
*
* Update the mtime and ctime members of an inode and mark the inode
* for writeback. Note that this function is meant exclusively for
@@ -1212,9 +1212,8 @@ EXPORT_SYMBOL(touch_atime);
* timestamps are handled by the server.
*/

-void file_update_time(struct file *file)
+void inode_update_time(struct inode *inode)
{
- struct inode *inode = file->f_path.dentry->d_inode;
struct timespec now;
int sync_it = 0;

@@ -1238,7 +1237,7 @@ void file_update_time(struct file *file)
mark_inode_dirty_sync(inode);
}

-EXPORT_SYMBOL(file_update_time);
+EXPORT_SYMBOL(inode_update_time);

int inode_needs_sync(struct inode *inode)
{
--- linux-2.6.20.i686/fs/block_dev.c.org
+++ linux-2.6.20.i686/fs/block_dev.c
@@ -608,6 +608,22 @@ void bdput(struct block_device *bdev)

EXPORT_SYMBOL(bdput);

+void bd_inode_update_time(struct inode *inode)
+{
+ struct block_device *bdev = inode->i_bdev;
+ struct list_head *p;
+
+ if (bdev == NULL)
+ return;
+
+ spin_lock(&bdev_lock);
+ list_for_each(p, &bdev->bd_inodes) {
+ inode = list_entry(p, struct inode, i_devices);
+ inode_update_time(inode);
+ }
+ spin_unlock(&bdev_lock);
+}
+
static struct block_device *bd_acquire(struct inode *inode)
{
struct block_device *bdev;
--- linux-2.6.20.i686/include/linux/fs.h.org
+++ linux-2.6.20.i686/include/linux/fs.h
@@ -1488,6 +1488,7 @@ extern struct block_device *bdget(dev_t)
extern void bd_set_size(struct block_device *, loff_t size);
extern void bd_forget(struct inode *inode);
extern void bdput(struct block_device *);
+extern void bd_inode_update_time(struct inode *);
extern struct block_device *open_by_devnum(dev_t, unsigned);
extern const struct address_space_operations def_blk_aops;
#else
@@ -1892,7 +1893,11 @@ extern int buffer_migrate_page(struct ad
extern int inode_change_ok(struct inode *, struct iattr *);
extern int __must_check inode_setattr(struct inode *, struct iattr *);

-extern void file_update_time(struct file *file);
+extern void inode_update_time(struct inode *inode);
+static inline void file_update_time(struct file *file)
+{
+ inode_update_time(file->f_path.dentry->d_inode);
+}

static inline ino_t parent_ino(struct dentry *dentry)
{
--- linux-2.6.20.i686/include/linux/pagemap.h.org
+++ linux-2.6.20.i686/include/linux/pagemap.h
@@ -16,8 +16,9 @@
* Bits in mapping->flags. The lower __GFP_BITS_SHIFT bits are the page
* allocation mode flags.
*/
-#define AS_EIO (__GFP_BITS_SHIFT + 0) /* IO error on async write */
+#define AS_EIO (__GFP_BITS_SHIFT + 0) /* IO error on async write */
#define AS_ENOSPC (__GFP_BITS_SHIFT + 1) /* ENOSPC on async write */
+#define AS_MCTIME (__GFP_BITS_SHIFT + 2) /* need m/ctime change */

static inline gfp_t mapping_gfp_mask(struct address_space * mapping)
{
--- linux-2.6.20.i686/mm/page-writeback.c.org
+++ linux-2.6.20.i686/mm/page-writeback.c
@@ -760,8 +760,10 @@ int __set_page_dirty_no_writeback(struct
*/
int __set_page_dirty_nobuffers(struct page *page)
{
+ struct address_space *mapping = page_mapping(page);
+ int ret = 0;
+
if (!TestSetPageDirty(page)) {
- struct address_space *mapping = page_mapping(page);
struct address_space *mapping2;

if (!mapping)
@@ -783,9 +785,11 @@ int __set_page_dirty_nobuffers(struct pa
/* !PageAnon && !swapper_space */
__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
}
- return 1;
+ ret = 1;
}
- return 0;
+ if (page_mapped(page))
+ set_bit(AS_MCTIME, &mapping->flags);
+ return ret;
}
EXPORT_SYMBOL(__set_page_dirty_nobuffers);

--- linux-2.6.20.i686/mm/msync.c.org
+++ linux-2.6.20.i686/mm/msync.c
@@ -12,6 +12,7 @@
#include <linux/mman.h>
#include <linux/file.h>
#include <linux/syscalls.h>
+#include <linux/pagemap.h>

/*
* MS_SYNC syncs the entire file - including mappings.
@@ -77,11 +78,16 @@ asmlinkage long sys_msync(unsigned long
}
file = vma->vm_file;
start = vma->vm_end;
- if ((flags & MS_SYNC) && file &&
- (vma->vm_flags & VM_SHARED)) {
+ if ((flags & (MS_SYNC|MS_ASYNC)) &&
+ file && (vma->vm_flags & VM_SHARED)) {
get_file(file);
up_read(&mm->mmap_sem);
- error = do_fsync(file, 0);
+ error = 0;
+ if (flags & MS_SYNC)
+ error = do_fsync(file, 0);
+ else if (test_and_clear_bit(AS_MCTIME,
+ &file->f_mapping->flags))
+ file_update_time(file);
fput(file);
if (error || start >= end)
goto out;


Attachments:
mmap_mctime.devel.v2 (5.88 kB)

2008-01-11 19:29:47

by Anton Salikhmetov

[permalink] [raw]
Subject: Re: [PATCH 2/2][RFC][BUG] msync: updating ctime and mtime at syncing

2008/1/11, Peter Staubach <[email protected]>:
> Anton Salikhmetov wrote:
> > From: Anton Salikhmetov <[email protected]>
> >
> > The patch contains changes for updating the ctime and mtime fields for memory mapped files:
> >
> > 1) adding a new flag triggering update of the inode data;
> > 2) implementing a helper function for checking that flag and updating ctime and mtime;
> > 3) updating time stamps for mapped files in sys_msync() and do_fsync().
> >
> >
>
> What happens if the application does not issue either an msync
> or an fsync call, but either just munmap's the region or just
> keeps on manipulating it? It appears to me that the file times
> will never be updated in these cases.
>
> It seems to me that the file times should be updated eventually,
> and perhaps even regularly if the file is being constantly
> updated via the mmap'd region.

Indeed, FreeBSD, for example, implements updating ctime and mtime
in exactly the way you described. Many people I've spoken with do
interpret the POSIX requirement the same way as you do.

I had thoroughly investigated the possibility of implementing
the feature you are talking about, and came to the conclusion
that it would lead to quite massive changes and would require
a very complicated work with locks. At least, within
the current kernel design for the memory management.
So, I believe that the "auto-updating" feature should be
implemented later and outside of the bug #2645.

Finally, my solution puts the Linux kernel much closer to
the POSIX standard (the msync() and fsync() requirements), anyway.
And the changes which my patch contains, to the best of my knowledge,
do not intersect with the possible implementation of
the "auto-updating" feature.

I'm planning on adding the "auto-updating" feature in
the nearest future, but it looks like a separate project to me.

>
> Thanx...
>
> ps
>
> > Signed-off-by: Anton Salikhmetov <[email protected]>
> >
> > ---
> >
> > diff --git a/fs/buffer.c b/fs/buffer.c
> > index 7249e01..09adf7e 100644
> > --- a/fs/buffer.c
> > +++ b/fs/buffer.c
> > @@ -719,6 +719,7 @@ static int __set_page_dirty(struct page *page,
> > }
> > write_unlock_irq(&mapping->tree_lock);
> > __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
> > + set_bit(AS_MCTIME, &mapping->flags);
> >
> > return 1;
> > }
> > diff --git a/fs/inode.c b/fs/inode.c
> > index ed35383..c5b954e 100644
> > --- a/fs/inode.c
> > +++ b/fs/inode.c
> > @@ -22,6 +22,7 @@
> > #include <linux/bootmem.h>
> > #include <linux/inotify.h>
> > #include <linux/mount.h>
> > +#include <linux/file.h>
> >
> > /*
> > * This is needed for the following functions:
> > @@ -1282,6 +1283,18 @@ void file_update_time(struct file *file)
> >
> > EXPORT_SYMBOL(file_update_time);
> >
> > +/*
> > + * Update the ctime and mtime stamps after checking if they are to be updated.
> > + */
> > +void mapped_file_update_time(struct file *file)
> > +{
> > + if (test_and_clear_bit(AS_MCTIME, &file->f_mapping->flags)) {
> > + get_file(file);
> > + file_update_time(file);
> > + fput(file);
> > + }
> > +}
> > +
> > int inode_needs_sync(struct inode *inode)
> > {
> > if (IS_SYNC(inode))
> > diff --git a/fs/sync.c b/fs/sync.c
> > index 7cd005e..df57507 100644
> > --- a/fs/sync.c
> > +++ b/fs/sync.c
> > @@ -87,6 +87,8 @@ long do_fsync(struct file *file, int datasync)
> > goto out;
> > }
> >
> > + mapped_file_update_time(file);
> > +
> > ret = filemap_fdatawrite(mapping);
> >
> > /*
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index b3ec4a4..0b05118 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -1978,6 +1978,7 @@ extern int inode_change_ok(struct inode *, struct iattr *);
> > extern int __must_check inode_setattr(struct inode *, struct iattr *);
> >
> > extern void file_update_time(struct file *file);
> > +extern void mapped_file_update_time(struct file *file);
> >
> > static inline ino_t parent_ino(struct dentry *dentry)
> > {
> > diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> > index db8a410..bf0f9e7 100644
> > --- a/include/linux/pagemap.h
> > +++ b/include/linux/pagemap.h
> > @@ -17,8 +17,9 @@
> > * Bits in mapping->flags. The lower __GFP_BITS_SHIFT bits are the page
> > * allocation mode flags.
> > */
> > -#define AS_EIO (__GFP_BITS_SHIFT + 0) /* IO error on async write */
> > +#define AS_EIO (__GFP_BITS_SHIFT + 0) /* IO error on async write */
> > #define AS_ENOSPC (__GFP_BITS_SHIFT + 1) /* ENOSPC on async write */
> > +#define AS_MCTIME (__GFP_BITS_SHIFT + 2) /* mtime and ctime to update */
> >
> > static inline void mapping_set_error(struct address_space *mapping, int error)
> > {
> > diff --git a/mm/msync.c b/mm/msync.c
> > index e788f7b..9d0a8f9 100644
> > --- a/mm/msync.c
> > +++ b/mm/msync.c
> > @@ -5,6 +5,7 @@
> > * Copyright (C) 1994-1999 Linus Torvalds
> > *
> > * Substantial code cleanup.
> > + * Updating the ctime and mtime stamps for memory mapped files.
> > * Copyright (C) 2008 Anton Salikhmetov <[email protected]>
> > */
> >
> > @@ -22,6 +23,10 @@
> > * Nor does it mark the relevant pages dirty (it used to up to 2.6.17).
> > * Now it doesn't do anything, since dirty pages are properly tracked.
> > *
> > + * The msync() system call updates the ctime and mtime fields for
> > + * the mapped file when called with the MS_SYNC or MS_ASYNC flags
> > + * according to the POSIX standard.
> > + *
> > * The application may now run fsync() to
> > * write out the dirty pages and wait on the writeout and check the result.
> > * Or the application may run fadvise(FADV_DONTNEED) against the fd to start
> > @@ -74,14 +79,17 @@ asmlinkage long sys_msync(unsigned long start, size_t len, int flags)
> > break;
> > }
> > file = vma->vm_file;
> > - if ((flags & MS_SYNC) && file && (vma->vm_flags & VM_SHARED)) {
> > - get_file(file);
> > - up_read(&mm->mmap_sem);
> > - error = do_fsync(file, 0);
> > - fput(file);
> > - if (error)
> > - return error;
> > - down_read(&mm->mmap_sem);
> > + if (file && (vma->vm_flags & VM_SHARED)) {
> > + mapped_file_update_time(file);
> > + if (flags & MS_SYNC) {
> > + get_file(file);
> > + up_read(&mm->mmap_sem);
> > + error = do_fsync(file, 0);
> > + fput(file);
> > + if (error)
> > + return error;
> > + down_read(&mm->mmap_sem);
> > + }
> > }
> >
> > start = vma->vm_end;
> > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> > index d55cfca..a85df0b 100644
> > --- a/mm/page-writeback.c
> > +++ b/mm/page-writeback.c
> > @@ -1025,6 +1025,7 @@ int __set_page_dirty_nobuffers(struct page *page)
> > if (mapping->host) {
> > /* !PageAnon && !swapper_space */
> > __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
> > + set_bit(AS_MCTIME, &mapping->flags);
> > }
> > return 1;
> > }
> >
> >
>
>

2008-01-11 21:40:38

by Anton Salikhmetov

[permalink] [raw]
Subject: Re: [PATCH 2/2][RFC][BUG] msync: updating ctime and mtime at syncing

2008/1/11, Peter Staubach <[email protected]>:
> Anton Salikhmetov wrote:
> > From: Anton Salikhmetov <[email protected]>
> >
> > The patch contains changes for updating the ctime and mtime fields for memory mapped files:
> >
> > 1) adding a new flag triggering update of the inode data;
> > 2) implementing a helper function for checking that flag and updating ctime and mtime;
> > 3) updating time stamps for mapped files in sys_msync() and do_fsync().
>
> Sorry, one other issue to throw out too -- an mmap'd block device
> should also have its inode time fields updated. This is a little
> tricky because the inode referenced via mapping->host isn't the
> one that needs to have the time fields updated on.
>
> I have attached the patch that I submitted last. It is quite out
> of date, but does show my attempt to resolve some of these issues.

Thanks for your feedback!

Now I'm looking at your solution and thinking about which parts of it
I could adapt to the infrastructure I'm trying to develop.

However, I would like to address the block device case within
a separate project. But for now, I want the msync() and fsync()
system calls to update ctime and mtime at least for memory-mapped
regular files properly. I feel that even this little improvement could address
many customer's troubles such as the one Jacob Oestergaard reported
in the bug #2645.

>
> Thanx...
>
> ps
>
> --- linux-2.6.20.i686/fs/buffer.c.org
> +++ linux-2.6.20.i686/fs/buffer.c
> @@ -710,6 +710,7 @@ EXPORT_SYMBOL(mark_buffer_dirty_inode);
> int __set_page_dirty_buffers(struct page *page)
> {
> struct address_space * const mapping = page_mapping(page);
> + int ret = 0;
>
> if (unlikely(!mapping))
> return !TestSetPageDirty(page);
> @@ -727,7 +728,7 @@ int __set_page_dirty_buffers(struct page
> spin_unlock(&mapping->private_lock);
>
> if (TestSetPageDirty(page))
> - return 0;
> + goto out;
>
> write_lock_irq(&mapping->tree_lock);
> if (page->mapping) { /* Race with truncate? */
> @@ -740,7 +741,11 @@ int __set_page_dirty_buffers(struct page
> }
> write_unlock_irq(&mapping->tree_lock);
> __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
> - return 1;
> + ret = 1;
> +out:
> + if (page_mapped(page))
> + set_bit(AS_MCTIME, &mapping->flags);
> + return ret;
> }
> EXPORT_SYMBOL(__set_page_dirty_buffers);
>
> --- linux-2.6.20.i686/fs/fs-writeback.c.org
> +++ linux-2.6.20.i686/fs/fs-writeback.c
> @@ -167,6 +167,13 @@ __sync_single_inode(struct inode *inode,
>
> spin_unlock(&inode_lock);
>
> + if (test_and_clear_bit(AS_MCTIME, &mapping->flags)) {
> + if (S_ISBLK(inode->i_mode))
> + bd_inode_update_time(inode);
> + else
> + inode_update_time(inode);
> + }
> +
> ret = do_writepages(mapping, wbc);
>
> /* Don't write the inode if only I_DIRTY_PAGES was set */
> --- linux-2.6.20.i686/fs/inode.c.org
> +++ linux-2.6.20.i686/fs/inode.c
> @@ -1201,8 +1201,8 @@ void touch_atime(struct vfsmount *mnt, s
> EXPORT_SYMBOL(touch_atime);
>
> /**
> - * file_update_time - update mtime and ctime time
> - * @file: file accessed
> + * inode_update_time - update mtime and ctime time
> + * @inode: file accessed
> *
> * Update the mtime and ctime members of an inode and mark the inode
> * for writeback. Note that this function is meant exclusively for
> @@ -1212,9 +1212,8 @@ EXPORT_SYMBOL(touch_atime);
> * timestamps are handled by the server.
> */
>
> -void file_update_time(struct file *file)
> +void inode_update_time(struct inode *inode)
> {
> - struct inode *inode = file->f_path.dentry->d_inode;
> struct timespec now;
> int sync_it = 0;
>
> @@ -1238,7 +1237,7 @@ void file_update_time(struct file *file)
> mark_inode_dirty_sync(inode);
> }
>
> -EXPORT_SYMBOL(file_update_time);
> +EXPORT_SYMBOL(inode_update_time);
>
> int inode_needs_sync(struct inode *inode)
> {
> --- linux-2.6.20.i686/fs/block_dev.c.org
> +++ linux-2.6.20.i686/fs/block_dev.c
> @@ -608,6 +608,22 @@ void bdput(struct block_device *bdev)
>
> EXPORT_SYMBOL(bdput);
>
> +void bd_inode_update_time(struct inode *inode)
> +{
> + struct block_device *bdev = inode->i_bdev;
> + struct list_head *p;
> +
> + if (bdev == NULL)
> + return;
> +
> + spin_lock(&bdev_lock);
> + list_for_each(p, &bdev->bd_inodes) {
> + inode = list_entry(p, struct inode, i_devices);
> + inode_update_time(inode);
> + }
> + spin_unlock(&bdev_lock);
> +}
> +
> static struct block_device *bd_acquire(struct inode *inode)
> {
> struct block_device *bdev;
> --- linux-2.6.20.i686/include/linux/fs.h.org
> +++ linux-2.6.20.i686/include/linux/fs.h
> @@ -1488,6 +1488,7 @@ extern struct block_device *bdget(dev_t)
> extern void bd_set_size(struct block_device *, loff_t size);
> extern void bd_forget(struct inode *inode);
> extern void bdput(struct block_device *);
> +extern void bd_inode_update_time(struct inode *);
> extern struct block_device *open_by_devnum(dev_t, unsigned);
> extern const struct address_space_operations def_blk_aops;
> #else
> @@ -1892,7 +1893,11 @@ extern int buffer_migrate_page(struct ad
> extern int inode_change_ok(struct inode *, struct iattr *);
> extern int __must_check inode_setattr(struct inode *, struct iattr *);
>
> -extern void file_update_time(struct file *file);
> +extern void inode_update_time(struct inode *inode);
> +static inline void file_update_time(struct file *file)
> +{
> + inode_update_time(file->f_path.dentry->d_inode);
> +}
>
> static inline ino_t parent_ino(struct dentry *dentry)
> {
> --- linux-2.6.20.i686/include/linux/pagemap.h.org
> +++ linux-2.6.20.i686/include/linux/pagemap.h
> @@ -16,8 +16,9 @@
> * Bits in mapping->flags. The lower __GFP_BITS_SHIFT bits are the page
> * allocation mode flags.
> */
> -#define AS_EIO (__GFP_BITS_SHIFT + 0) /* IO error on async write */
> +#define AS_EIO (__GFP_BITS_SHIFT + 0) /* IO error on async write */
> #define AS_ENOSPC (__GFP_BITS_SHIFT + 1) /* ENOSPC on async write */
> +#define AS_MCTIME (__GFP_BITS_SHIFT + 2) /* need m/ctime change */
>
> static inline gfp_t mapping_gfp_mask(struct address_space * mapping)
> {
> --- linux-2.6.20.i686/mm/page-writeback.c.org
> +++ linux-2.6.20.i686/mm/page-writeback.c
> @@ -760,8 +760,10 @@ int __set_page_dirty_no_writeback(struct
> */
> int __set_page_dirty_nobuffers(struct page *page)
> {
> + struct address_space *mapping = page_mapping(page);
> + int ret = 0;
> +
> if (!TestSetPageDirty(page)) {
> - struct address_space *mapping = page_mapping(page);
> struct address_space *mapping2;
>
> if (!mapping)
> @@ -783,9 +785,11 @@ int __set_page_dirty_nobuffers(struct pa
> /* !PageAnon && !swapper_space */
> __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
> }
> - return 1;
> + ret = 1;
> }
> - return 0;
> + if (page_mapped(page))
> + set_bit(AS_MCTIME, &mapping->flags);
> + return ret;
> }
> EXPORT_SYMBOL(__set_page_dirty_nobuffers);
>
> --- linux-2.6.20.i686/mm/msync.c.org
> +++ linux-2.6.20.i686/mm/msync.c
> @@ -12,6 +12,7 @@
> #include <linux/mman.h>
> #include <linux/file.h>
> #include <linux/syscalls.h>
> +#include <linux/pagemap.h>
>
> /*
> * MS_SYNC syncs the entire file - including mappings.
> @@ -77,11 +78,16 @@ asmlinkage long sys_msync(unsigned long
> }
> file = vma->vm_file;
> start = vma->vm_end;
> - if ((flags & MS_SYNC) && file &&
> - (vma->vm_flags & VM_SHARED)) {
> + if ((flags & (MS_SYNC|MS_ASYNC)) &&
> + file && (vma->vm_flags & VM_SHARED)) {
> get_file(file);
> up_read(&mm->mmap_sem);
> - error = do_fsync(file, 0);
> + error = 0;
> + if (flags & MS_SYNC)
> + error = do_fsync(file, 0);
> + else if (test_and_clear_bit(AS_MCTIME,
> + &file->f_mapping->flags))
> + file_update_time(file);
> fput(file);
> if (error || start >= end)
> goto out;
>
>

2008-01-11 21:59:58

by Peter Staubach

[permalink] [raw]
Subject: Re: [PATCH 2/2][RFC][BUG] msync: updating ctime and mtime at syncing

Anton Salikhmetov wrote:
> 2008/1/11, Peter Staubach <[email protected]>:
>
>> Anton Salikhmetov wrote:
>>
>>> From: Anton Salikhmetov <[email protected]>
>>>
>>> The patch contains changes for updating the ctime and mtime fields for memory mapped files:
>>>
>>> 1) adding a new flag triggering update of the inode data;
>>> 2) implementing a helper function for checking that flag and updating ctime and mtime;
>>> 3) updating time stamps for mapped files in sys_msync() and do_fsync().
>>>
>> Sorry, one other issue to throw out too -- an mmap'd block device
>> should also have its inode time fields updated. This is a little
>> tricky because the inode referenced via mapping->host isn't the
>> one that needs to have the time fields updated on.
>>
>> I have attached the patch that I submitted last. It is quite out
>> of date, but does show my attempt to resolve some of these issues.
>>
>
> Thanks for your feedback!
>
> Now I'm looking at your solution and thinking about which parts of it
> I could adapt to the infrastructure I'm trying to develop.
>
> However, I would like to address the block device case within
> a separate project. But for now, I want the msync() and fsync()
> system calls to update ctime and mtime at least for memory-mapped
> regular files properly. I feel that even this little improvement could address
> many customer's troubles such as the one Jacob Oestergaard reported
> in the bug #2645.

Not that I disagree and I also have customers who would really like
to see this situation addressed so that I can then fix it in RHEL,
but the block device issue was raised by Andrew Morton during my
first attempt to get a patch integrated.

Just so that you are aware of who has raised which issues... :-)

Thanx...

ps

2008-01-11 22:15:55

by Anton Salikhmetov

[permalink] [raw]
Subject: Re: [PATCH 2/2][RFC][BUG] msync: updating ctime and mtime at syncing

2008/1/12, Peter Staubach <[email protected]>:
> Anton Salikhmetov wrote:
> > 2008/1/11, Peter Staubach <[email protected]>:
> >
> >> Anton Salikhmetov wrote:
> >>
> >>> From: Anton Salikhmetov <[email protected]>
> >>>
> >>> The patch contains changes for updating the ctime and mtime fields for memory mapped files:
> >>>
> >>> 1) adding a new flag triggering update of the inode data;
> >>> 2) implementing a helper function for checking that flag and updating ctime and mtime;
> >>> 3) updating time stamps for mapped files in sys_msync() and do_fsync().
> >>>
> >> Sorry, one other issue to throw out too -- an mmap'd block device
> >> should also have its inode time fields updated. This is a little
> >> tricky because the inode referenced via mapping->host isn't the
> >> one that needs to have the time fields updated on.
> >>
> >> I have attached the patch that I submitted last. It is quite out
> >> of date, but does show my attempt to resolve some of these issues.
> >>
> >
> > Thanks for your feedback!
> >
> > Now I'm looking at your solution and thinking about which parts of it
> > I could adapt to the infrastructure I'm trying to develop.
> >
> > However, I would like to address the block device case within
> > a separate project. But for now, I want the msync() and fsync()
> > system calls to update ctime and mtime at least for memory-mapped
> > regular files properly. I feel that even this little improvement could address
> > many customer's troubles such as the one Jacob Oestergaard reported
> > in the bug #2645.
>
> Not that I disagree and I also have customers who would really like
> to see this situation addressed so that I can then fix it in RHEL,
> but the block device issue was raised by Andrew Morton during my
> first attempt to get a patch integrated.
>
> Just so that you are aware of who has raised which issues... :-)

Yes, I remember that email by Andrew Morton (http://lkml.org/lkml/2006/6/19/6).
In fact, I went over that thread many times while working on my
solution for this bug.

Nevertheless, I presume the block device case to be addressed in a
separate patch
series, just like the "auto-updating" feature.

>
> Thanx...
>
> ps
>

2008-01-12 02:25:16

by Anton Salikhmetov

[permalink] [raw]
Subject: Re: [PATCH 2/2][RFC][BUG] msync: updating ctime and mtime at syncing

2008/1/11, Peter Staubach <[email protected]>:
> Anton Salikhmetov wrote:
> > From: Anton Salikhmetov <[email protected]>
> >
> > The patch contains changes for updating the ctime and mtime fields for memory mapped files:
> >
> > 1) adding a new flag triggering update of the inode data;
> > 2) implementing a helper function for checking that flag and updating ctime and mtime;
> > 3) updating time stamps for mapped files in sys_msync() and do_fsync().
>
> Sorry, one other issue to throw out too -- an mmap'd block device
> should also have its inode time fields updated. This is a little
> tricky because the inode referenced via mapping->host isn't the
> one that needs to have the time fields updated on.
>
> I have attached the patch that I submitted last. It is quite out
> of date, but does show my attempt to resolve some of these issues.

It looks very strange to me that your patch received no reaction whatsoever:

http://lkml.org/lkml/2007/2/20/255

I've just tried to adapt your patch to my solution. I'm afraid ctime
and mtime stamps
for memory-mapped block device files are still not updated with your
code integrated
into what I'm doing. I set up the loopback device /dev/loop0, then ran
my test program:

http://bugzilla.kernel.org/attachment.cgi?id=14398

The unit test shows that ctime and mtime are not updated.
However, regular files are updated properly.

Also I have a couple of questions about your patch. Please see below.

>
> Thanx...
>
> ps
>
> --- linux-2.6.20.i686/fs/buffer.c.org
> +++ linux-2.6.20.i686/fs/buffer.c
> @@ -710,6 +710,7 @@ EXPORT_SYMBOL(mark_buffer_dirty_inode);
> int __set_page_dirty_buffers(struct page *page)
> {
> struct address_space * const mapping = page_mapping(page);
> + int ret = 0;
>
> if (unlikely(!mapping))
> return !TestSetPageDirty(page);
> @@ -727,7 +728,7 @@ int __set_page_dirty_buffers(struct page
> spin_unlock(&mapping->private_lock);
>
> if (TestSetPageDirty(page))
> - return 0;
> + goto out;
>
> write_lock_irq(&mapping->tree_lock);
> if (page->mapping) { /* Race with truncate? */
> @@ -740,7 +741,11 @@ int __set_page_dirty_buffers(struct page
> }
> write_unlock_irq(&mapping->tree_lock);
> __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
> - return 1;
> + ret = 1;
> +out:
> + if (page_mapped(page))
> + set_bit(AS_MCTIME, &mapping->flags);
> + return ret;
> }
> EXPORT_SYMBOL(__set_page_dirty_buffers);
>
> --- linux-2.6.20.i686/fs/fs-writeback.c.org
> +++ linux-2.6.20.i686/fs/fs-writeback.c
> @@ -167,6 +167,13 @@ __sync_single_inode(struct inode *inode,
>
> spin_unlock(&inode_lock);
>
> + if (test_and_clear_bit(AS_MCTIME, &mapping->flags)) {
> + if (S_ISBLK(inode->i_mode))
> + bd_inode_update_time(inode);
> + else
> + inode_update_time(inode);
> + }
> +

Why the S_ISBLK check is done only here? I think sys_msync() should contain
the same logic.

> ret = do_writepages(mapping, wbc);
>
> /* Don't write the inode if only I_DIRTY_PAGES was set */
> --- linux-2.6.20.i686/fs/inode.c.org
> +++ linux-2.6.20.i686/fs/inode.c
> @@ -1201,8 +1201,8 @@ void touch_atime(struct vfsmount *mnt, s
> EXPORT_SYMBOL(touch_atime);
>
> /**
> - * file_update_time - update mtime and ctime time
> - * @file: file accessed
> + * inode_update_time - update mtime and ctime time
> + * @inode: file accessed
> *
> * Update the mtime and ctime members of an inode and mark the inode
> * for writeback. Note that this function is meant exclusively for
> @@ -1212,9 +1212,8 @@ EXPORT_SYMBOL(touch_atime);
> * timestamps are handled by the server.
> */
>
> -void file_update_time(struct file *file)
> +void inode_update_time(struct inode *inode)
> {
> - struct inode *inode = file->f_path.dentry->d_inode;
> struct timespec now;
> int sync_it = 0;
>
> @@ -1238,7 +1237,7 @@ void file_update_time(struct file *file)
> mark_inode_dirty_sync(inode);
> }
>
> -EXPORT_SYMBOL(file_update_time);
> +EXPORT_SYMBOL(inode_update_time);
>
> int inode_needs_sync(struct inode *inode)
> {
> --- linux-2.6.20.i686/fs/block_dev.c.org
> +++ linux-2.6.20.i686/fs/block_dev.c
> @@ -608,6 +608,22 @@ void bdput(struct block_device *bdev)
>
> EXPORT_SYMBOL(bdput);
>
> +void bd_inode_update_time(struct inode *inode)
> +{
> + struct block_device *bdev = inode->i_bdev;
> + struct list_head *p;
> +
> + if (bdev == NULL)
> + return;
> +
> + spin_lock(&bdev_lock);
> + list_for_each(p, &bdev->bd_inodes) {
> + inode = list_entry(p, struct inode, i_devices);
> + inode_update_time(inode);
> + }
> + spin_unlock(&bdev_lock);
> +}
> +
> static struct block_device *bd_acquire(struct inode *inode)
> {
> struct block_device *bdev;
> --- linux-2.6.20.i686/include/linux/fs.h.org
> +++ linux-2.6.20.i686/include/linux/fs.h
> @@ -1488,6 +1488,7 @@ extern struct block_device *bdget(dev_t)
> extern void bd_set_size(struct block_device *, loff_t size);
> extern void bd_forget(struct inode *inode);
> extern void bdput(struct block_device *);
> +extern void bd_inode_update_time(struct inode *);
> extern struct block_device *open_by_devnum(dev_t, unsigned);
> extern const struct address_space_operations def_blk_aops;
> #else
> @@ -1892,7 +1893,11 @@ extern int buffer_migrate_page(struct ad
> extern int inode_change_ok(struct inode *, struct iattr *);
> extern int __must_check inode_setattr(struct inode *, struct iattr *);
>
> -extern void file_update_time(struct file *file);
> +extern void inode_update_time(struct inode *inode);
> +static inline void file_update_time(struct file *file)
> +{
> + inode_update_time(file->f_path.dentry->d_inode);
> +}
>
> static inline ino_t parent_ino(struct dentry *dentry)
> {
> --- linux-2.6.20.i686/include/linux/pagemap.h.org
> +++ linux-2.6.20.i686/include/linux/pagemap.h
> @@ -16,8 +16,9 @@
> * Bits in mapping->flags. The lower __GFP_BITS_SHIFT bits are the page
> * allocation mode flags.
> */
> -#define AS_EIO (__GFP_BITS_SHIFT + 0) /* IO error on async write */
> +#define AS_EIO (__GFP_BITS_SHIFT + 0) /* IO error on async write */
> #define AS_ENOSPC (__GFP_BITS_SHIFT + 1) /* ENOSPC on async write */
> +#define AS_MCTIME (__GFP_BITS_SHIFT + 2) /* need m/ctime change */
>
> static inline gfp_t mapping_gfp_mask(struct address_space * mapping)
> {
> --- linux-2.6.20.i686/mm/page-writeback.c.org
> +++ linux-2.6.20.i686/mm/page-writeback.c
> @@ -760,8 +760,10 @@ int __set_page_dirty_no_writeback(struct
> */
> int __set_page_dirty_nobuffers(struct page *page)
> {
> + struct address_space *mapping = page_mapping(page);
> + int ret = 0;
> +
> if (!TestSetPageDirty(page)) {
> - struct address_space *mapping = page_mapping(page);
> struct address_space *mapping2;
>
> if (!mapping)
> @@ -783,9 +785,11 @@ int __set_page_dirty_nobuffers(struct pa
> /* !PageAnon && !swapper_space */
> __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
> }
> - return 1;
> + ret = 1;
> }
> - return 0;
> + if (page_mapped(page))
> + set_bit(AS_MCTIME, &mapping->flags);
> + return ret;
> }
> EXPORT_SYMBOL(__set_page_dirty_nobuffers);
>
> --- linux-2.6.20.i686/mm/msync.c.org
> +++ linux-2.6.20.i686/mm/msync.c
> @@ -12,6 +12,7 @@
> #include <linux/mman.h>
> #include <linux/file.h>
> #include <linux/syscalls.h>
> +#include <linux/pagemap.h>
>
> /*
> * MS_SYNC syncs the entire file - including mappings.
> @@ -77,11 +78,16 @@ asmlinkage long sys_msync(unsigned long
> }
> file = vma->vm_file;
> start = vma->vm_end;
> - if ((flags & MS_SYNC) && file &&
> - (vma->vm_flags & VM_SHARED)) {
> + if ((flags & (MS_SYNC|MS_ASYNC)) &&
> + file && (vma->vm_flags & VM_SHARED)) {
> get_file(file);
> up_read(&mm->mmap_sem);
> - error = do_fsync(file, 0);
> + error = 0;
> + if (flags & MS_SYNC)
> + error = do_fsync(file, 0);
> + else if (test_and_clear_bit(AS_MCTIME,
> + &file->f_mapping->flags))
> + file_update_time(file);

The msync() function might be called for a block device as well.

> fput(file);
> if (error || start >= end)
> goto out;
>
>

2008-01-12 09:36:06

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 2/2][RFC][BUG] msync: updating ctime and mtime at syncing


On Fri, 2008-01-11 at 03:44 +0300, Anton Salikhmetov wrote:

> +/*
> + * Update the ctime and mtime stamps after checking if they are to be updated.
> + */
> +void mapped_file_update_time(struct file *file)
> +{
> + if (test_and_clear_bit(AS_MCTIME, &file->f_mapping->flags)) {
> + get_file(file);
> + file_update_time(file);
> + fput(file);
> + }
> +}
> +

I don't think you need the get/put file stuff here, because

> @@ -87,6 +87,8 @@ long do_fsync(struct file *file, int datasync)
> goto out;
> }
>
> + mapped_file_update_time(file);
> +
> ret = filemap_fdatawrite(mapping);
>
> /*

at this call-site we already hold an extra reference on the file, and

> @@ -74,14 +79,17 @@ asmlinkage long sys_msync(unsigned long start, size_t len, int flags)
> break;
> }
> file = vma->vm_file;
> - if ((flags & MS_SYNC) && file && (vma->vm_flags & VM_SHARED)) {
> - get_file(file);
> - up_read(&mm->mmap_sem);
> - error = do_fsync(file, 0);
> - fput(file);
> - if (error)
> - return error;
> - down_read(&mm->mmap_sem);
> + if (file && (vma->vm_flags & VM_SHARED)) {
> + mapped_file_update_time(file);
> + if (flags & MS_SYNC) {
> + get_file(file);
> + up_read(&mm->mmap_sem);
> + error = do_fsync(file, 0);
> + fput(file);
> + if (error)
> + return error;
> + down_read(&mm->mmap_sem);
> + }
> }
>
> start = vma->vm_end;

here we hold the mmap_sem so the vma reference on the file can't go
away.

2008-01-12 09:40:48

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 2/2][RFC][BUG] msync: updating ctime and mtime at syncing


On Sat, 2008-01-12 at 10:36 +0100, Peter Zijlstra wrote:
> On Fri, 2008-01-11 at 03:44 +0300, Anton Salikhmetov wrote:
>
> > +/*
> > + * Update the ctime and mtime stamps after checking if they are to be updated.
> > + */
> > +void mapped_file_update_time(struct file *file)
> > +{
> > + if (test_and_clear_bit(AS_MCTIME, &file->f_mapping->flags)) {
> > + get_file(file);
> > + file_update_time(file);
> > + fput(file);
> > + }
> > +}
> > +
>
> I don't think you need the get/put file stuff here, because

BTW, the reason for me noticing this is that if it would be needed there
is a race condition right there, who is to say that the file pointer
you're deref'ing in your test condition isn't a dead one already.

2008-01-12 12:18:08

by Anton Salikhmetov

[permalink] [raw]
Subject: Re: [PATCH 2/2][RFC][BUG] msync: updating ctime and mtime at syncing

2008/1/12, Peter Zijlstra <[email protected]>:
>
> On Fri, 2008-01-11 at 03:44 +0300, Anton Salikhmetov wrote:
>
> > +/*
> > + * Update the ctime and mtime stamps after checking if they are to be updated.
> > + */
> > +void mapped_file_update_time(struct file *file)
> > +{
> > + if (test_and_clear_bit(AS_MCTIME, &file->f_mapping->flags)) {
> > + get_file(file);
> > + file_update_time(file);
> > + fput(file);
> > + }
> > +}
> > +
>
> I don't think you need the get/put file stuff here, because
>
> > @@ -87,6 +87,8 @@ long do_fsync(struct file *file, int datasync)
> > goto out;
> > }
> >
> > + mapped_file_update_time(file);
> > +
> > ret = filemap_fdatawrite(mapping);
> >
> > /*
>
> at this call-site we already hold an extra reference on the file, and
>
> > @@ -74,14 +79,17 @@ asmlinkage long sys_msync(unsigned long start, size_t len, int flags)
> > break;
> > }
> > file = vma->vm_file;
> > - if ((flags & MS_SYNC) && file && (vma->vm_flags & VM_SHARED)) {
> > - get_file(file);
> > - up_read(&mm->mmap_sem);
> > - error = do_fsync(file, 0);
> > - fput(file);
> > - if (error)
> > - return error;
> > - down_read(&mm->mmap_sem);
> > + if (file && (vma->vm_flags & VM_SHARED)) {
> > + mapped_file_update_time(file);
> > + if (flags & MS_SYNC) {
> > + get_file(file);
> > + up_read(&mm->mmap_sem);
> > + error = do_fsync(file, 0);
> > + fput(file);
> > + if (error)
> > + return error;
> > + down_read(&mm->mmap_sem);
> > + }
> > }
> >
> > start = vma->vm_end;
>
> here we hold the mmap_sem so the vma reference on the file can't go
> away.

These get_file() and fput() calls were in the sys_msync() function form before
all my changes. I did not add them here.

>
>

2008-01-12 12:38:52

by Anton Salikhmetov

[permalink] [raw]
Subject: Re: [PATCH 2/2][RFC][BUG] msync: updating ctime and mtime at syncing

2008/1/12, Peter Zijlstra <[email protected]>:
>
> On Sat, 2008-01-12 at 10:36 +0100, Peter Zijlstra wrote:
> > On Fri, 2008-01-11 at 03:44 +0300, Anton Salikhmetov wrote:
> >
> > > +/*
> > > + * Update the ctime and mtime stamps after checking if they are to be updated.
> > > + */
> > > +void mapped_file_update_time(struct file *file)
> > > +{
> > > + if (test_and_clear_bit(AS_MCTIME, &file->f_mapping->flags)) {
> > > + get_file(file);
> > > + file_update_time(file);
> > > + fput(file);
> > > + }
> > > +}
> > > +
> >
> > I don't think you need the get/put file stuff here, because
>
> BTW, the reason for me noticing this is that if it would be needed there
> is a race condition right there, who is to say that the file pointer
> you're deref'ing in your test condition isn't a dead one already.

So, in your opinion, is it at all needed here to play with the file reference
counter? May I drop the get_file() and fput() calls from the
sys_msync() function?

>
>

2008-01-12 13:09:04

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 2/2][RFC][BUG] msync: updating ctime and mtime at syncing


On Sat, 2008-01-12 at 15:38 +0300, Anton Salikhmetov wrote:
> 2008/1/12, Peter Zijlstra <[email protected]>:
> >
> > On Sat, 2008-01-12 at 10:36 +0100, Peter Zijlstra wrote:
> > > On Fri, 2008-01-11 at 03:44 +0300, Anton Salikhmetov wrote:
> > >
> > > > +/*
> > > > + * Update the ctime and mtime stamps after checking if they are to be updated.
> > > > + */
> > > > +void mapped_file_update_time(struct file *file)
> > > > +{
> > > > + if (test_and_clear_bit(AS_MCTIME, &file->f_mapping->flags)) {
> > > > + get_file(file);
> > > > + file_update_time(file);
> > > > + fput(file);
> > > > + }
> > > > +}
> > > > +
> > >
> > > I don't think you need the get/put file stuff here, because
> >
> > BTW, the reason for me noticing this is that if it would be needed there
> > is a race condition right there, who is to say that the file pointer
> > you're deref'ing in your test condition isn't a dead one already.
>
> So, in your opinion, is it at all needed here to play with the file reference
> counter? May I drop the get_file() and fput() calls from the
> sys_msync() function?

No, the ones in sys_msync() around calling do_fsync() are most
definately needed because we release mmap_sem there.

What I'm saying is that you can remove the get_file()/fput() calls from
your new mapped_file_update_time() function.

2008-01-12 13:51:39

by Anton Salikhmetov

[permalink] [raw]
Subject: Re: [PATCH 2/2][RFC][BUG] msync: updating ctime and mtime at syncing

2008/1/12, Peter Zijlstra <[email protected]>:
>
> On Sat, 2008-01-12 at 15:38 +0300, Anton Salikhmetov wrote:
> > 2008/1/12, Peter Zijlstra <[email protected]>:
> > >
> > > On Sat, 2008-01-12 at 10:36 +0100, Peter Zijlstra wrote:
> > > > On Fri, 2008-01-11 at 03:44 +0300, Anton Salikhmetov wrote:
> > > >
> > > > > +/*
> > > > > + * Update the ctime and mtime stamps after checking if they are to be updated.
> > > > > + */
> > > > > +void mapped_file_update_time(struct file *file)
> > > > > +{
> > > > > + if (test_and_clear_bit(AS_MCTIME, &file->f_mapping->flags)) {
> > > > > + get_file(file);
> > > > > + file_update_time(file);
> > > > > + fput(file);
> > > > > + }
> > > > > +}
> > > > > +
> > > >
> > > > I don't think you need the get/put file stuff here, because
> > >
> > > BTW, the reason for me noticing this is that if it would be needed there
> > > is a race condition right there, who is to say that the file pointer
> > > you're deref'ing in your test condition isn't a dead one already.
> >
> > So, in your opinion, is it at all needed here to play with the file reference
> > counter? May I drop the get_file() and fput() calls from the
> > sys_msync() function?
>
> No, the ones in sys_msync() around calling do_fsync() are most
> definately needed because we release mmap_sem there.
>
> What I'm saying is that you can remove the get_file()/fput() calls from
> your new mapped_file_update_time() function.

OK, thank you very much for your answer. I'm planning to submit my
next solution which is going to take your suggestion into account.

But I'm not sure how to process memory-mapped block device files.
Staubach's approach did not work for me after adapting it to my
solution.

>
>