2009-07-06 19:24:57

by Andi Kleen

[permalink] [raw]
Subject: [PATCH] [1/2] Optimization for touch_atime


Some benchmark testing shows touch_atime to be high up in profile
logs for IO intensive workloads. Most likely that's due to the lock
in mnt_want_write(). Unfortunately touch_atime first takes the lock,
and then does all the other tests that could avoid atime updates (like
noatime or relatime).

Do it the other way round -- first try to avoid the update and only
then if that didn't succeed take the lock. That works because none of
the atime avoidance tests rely on locking.

This also eliminates a goto.

Signed-off-by: Andi Kleen <[email protected]>

---
fs/inode.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)

Index: linux-2.6.30-ak/fs/inode.c
===================================================================
--- linux-2.6.30-ak.orig/fs/inode.c
+++ linux-2.6.30-ak/fs/inode.c
@@ -1361,31 +1361,31 @@ void touch_atime(struct vfsmount *mnt, s
struct inode *inode = dentry->d_inode;
struct timespec now;

- if (mnt_want_write(mnt))
- return;
if (inode->i_flags & S_NOATIME)
- goto out;
+ return;
if (IS_NOATIME(inode))
- goto out;
+ return;
if ((inode->i_sb->s_flags & MS_NODIRATIME) && S_ISDIR(inode->i_mode))
- goto out;
+ return;

if (mnt->mnt_flags & MNT_NOATIME)
- goto out;
+ return;
if ((mnt->mnt_flags & MNT_NODIRATIME) && S_ISDIR(inode->i_mode))
- goto out;
+ return;

now = current_fs_time(inode->i_sb);

if (!relatime_need_update(mnt, inode, now))
- goto out;
+ return;

if (timespec_equal(&inode->i_atime, &now))
- goto out;
+ return;
+
+ if (mnt_want_write(mnt))
+ return;

inode->i_atime = now;
mark_inode_dirty_sync(inode);
-out:
mnt_drop_write(mnt);
}
EXPORT_SYMBOL(touch_atime);


2009-07-06 19:25:11

by Andi Kleen

[permalink] [raw]
Subject: [PATCH] [2/2] Optimize touch_time too


Do a similar optimization as earlier for touch_atime. Getting
the lock in mnt_get_write is relatively costly, so try all
avenues to avoid it first.

This patch is careful to still only update inode fields
inside the lock region.

This didn't show up in benchmarks, but it's easy enough
to do.

Signed-off-by: Andi Kleen <[email protected]>

---
fs/inode.c | 43 +++++++++++++++++++++++--------------------
1 file changed, 23 insertions(+), 20 deletions(-)

Index: linux-2.6.31-rc1-ak/fs/inode.c
===================================================================
--- linux-2.6.31-rc1-ak.orig/fs/inode.c
+++ linux-2.6.31-rc1-ak/fs/inode.c
@@ -1431,34 +1431,37 @@ void file_update_time(struct file *file)
{
struct inode *inode = file->f_path.dentry->d_inode;
struct timespec now;
- int sync_it = 0;
- int err;
+ enum { S_MTIME = 1, S_CTIME = 2, S_VERSION = 4 } sync_it = 0;

+ /* First try to exhause all avenues to not sync */
if (IS_NOCMTIME(inode))
return;

- err = mnt_want_write_file(file);
- if (err)
- return;
-
now = current_fs_time(inode->i_sb);
- if (!timespec_equal(&inode->i_mtime, &now)) {
- inode->i_mtime = now;
- sync_it = 1;
- }
+ if (!timespec_equal(&inode->i_mtime, &now))
+ sync_it = S_MTIME;

- if (!timespec_equal(&inode->i_ctime, &now)) {
- inode->i_ctime = now;
- sync_it = 1;
- }
+ if (!timespec_equal(&inode->i_ctime, &now))
+ sync_it |= S_CTIME;

- if (IS_I_VERSION(inode)) {
- inode_inc_iversion(inode);
- sync_it = 1;
- }
+ if (IS_I_VERSION(inode))
+ sync_it |= S_VERSION;
+
+ if (!sync_it)
+ return;
+
+ /* Finally allowed to write? Takes lock. */
+ if (!mnt_want_write_file(file))
+ return;

- if (sync_it)
- mark_inode_dirty_sync(inode);
+ /* Only change inode inside the lock region */
+ if (sync_it & S_VERSION)
+ inode_inc_iversion(inode);
+ if (sync_it & S_CTIME)
+ inode->i_ctime = now;
+ if (sync_it & S_MTIME)
+ inode->i_mtime = now;
+ mark_inode_dirty_sync(inode);
mnt_drop_write(file->f_path.mnt);
}
EXPORT_SYMBOL(file_update_time);

2009-07-07 10:49:22

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] [1/2] Optimization for touch_atime

On Mon, Jul 06, 2009 at 09:24:46PM +0200, Andi Kleen wrote:
>
> Some benchmark testing shows touch_atime to be high up in profile
> logs for IO intensive workloads. Most likely that's due to the lock
> in mnt_want_write(). Unfortunately touch_atime first takes the lock,
> and then does all the other tests that could avoid atime updates (like
> noatime or relatime).
>
> Do it the other way round -- first try to avoid the update and only
> then if that didn't succeed take the lock. That works because none of
> the atime avoidance tests rely on locking.
>
> This also eliminates a goto.
>
> Signed-off-by: Andi Kleen <[email protected]>


Looks good to me.

2009-07-07 10:50:26

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] [2/2] Optimize touch_time too

On Mon, Jul 06, 2009 at 09:24:47PM +0200, Andi Kleen wrote:
>
> Do a similar optimization as earlier for touch_atime. Getting
> the lock in mnt_get_write is relatively costly, so try all
> avenues to avoid it first.
>
> This patch is careful to still only update inode fields
> inside the lock region.
>
> This didn't show up in benchmarks, but it's easy enough
> to do.
>
> Signed-off-by: Andi Kleen <[email protected]>
>
> ---
> fs/inode.c | 43 +++++++++++++++++++++++--------------------
> 1 file changed, 23 insertions(+), 20 deletions(-)
>
> Index: linux-2.6.31-rc1-ak/fs/inode.c
> ===================================================================
> --- linux-2.6.31-rc1-ak.orig/fs/inode.c
> +++ linux-2.6.31-rc1-ak/fs/inode.c
> @@ -1431,34 +1431,37 @@ void file_update_time(struct file *file)
> {
> struct inode *inode = file->f_path.dentry->d_inode;
> struct timespec now;
> - int sync_it = 0;
> - int err;
> + enum { S_MTIME = 1, S_CTIME = 2, S_VERSION = 4 } sync_it = 0;

Looks good, and makes sense to keep thise in sync with
file_update_atime.

2009-07-08 20:25:26

by Valerie Aurora

[permalink] [raw]
Subject: Re: [PATCH] [1/2] Optimization for touch_atime

On Mon, Jul 06, 2009 at 09:24:46PM +0200, Andi Kleen wrote:
>
> Some benchmark testing shows touch_atime to be high up in profile
> logs for IO intensive workloads. Most likely that's due to the lock
> in mnt_want_write(). Unfortunately touch_atime first takes the lock,
> and then does all the other tests that could avoid atime updates (like
> noatime or relatime).
>
> Do it the other way round -- first try to avoid the update and only
> then if that didn't succeed take the lock. That works because none of
> the atime avoidance tests rely on locking.
>
> This also eliminates a goto.
>
> Signed-off-by: Andi Kleen <[email protected]>
>
> ---
> fs/inode.c | 20 ++++++++++----------
> 1 file changed, 10 insertions(+), 10 deletions(-)
>
> Index: linux-2.6.30-ak/fs/inode.c
> ===================================================================
> --- linux-2.6.30-ak.orig/fs/inode.c
> +++ linux-2.6.30-ak/fs/inode.c
> @@ -1361,31 +1361,31 @@ void touch_atime(struct vfsmount *mnt, s
> struct inode *inode = dentry->d_inode;
> struct timespec now;
>
> - if (mnt_want_write(mnt))
> - return;
> if (inode->i_flags & S_NOATIME)
> - goto out;
> + return;
> if (IS_NOATIME(inode))
> - goto out;
> + return;
> if ((inode->i_sb->s_flags & MS_NODIRATIME) && S_ISDIR(inode->i_mode))
> - goto out;
> + return;
>
> if (mnt->mnt_flags & MNT_NOATIME)
> - goto out;
> + return;
> if ((mnt->mnt_flags & MNT_NODIRATIME) && S_ISDIR(inode->i_mode))
> - goto out;
> + return;
>
> now = current_fs_time(inode->i_sb);
>
> if (!relatime_need_update(mnt, inode, now))
> - goto out;
> + return;
>
> if (timespec_equal(&inode->i_atime, &now))
> - goto out;
> + return;
> +
> + if (mnt_want_write(mnt))
> + return;
>
> inode->i_atime = now;
> mark_inode_dirty_sync(inode);
> -out:
> mnt_drop_write(mnt);
> }
> EXPORT_SYMBOL(touch_atime);

Nice!

Reviewed-by: Valerie Aurora <[email protected]>

-VAL