2001-11-30 06:09:45

by Andrew Morton

[permalink] [raw]
Subject: [patch] smarter atime updates

mark_inode_dirty() is quite expensive for journalling filesystems,
and we're calling it a lot more than we need to.

--- linux-2.4.17-pre1/fs/inode.c Mon Nov 26 11:52:07 2001
+++ linux-akpm/fs/inode.c Thu Nov 29 21:53:02 2001
@@ -1187,6 +1187,8 @@ void __init inode_init(unsigned long mem

void update_atime (struct inode *inode)
{
+ if (inode->i_atime == CURRENT_TIME)
+ return;
if ( IS_NOATIME (inode) ) return;
if ( IS_NODIRATIME (inode) && S_ISDIR (inode->i_mode) ) return;
if ( IS_RDONLY (inode) ) return;


with this patch, the time to read a 10 meg file with 10 million
read()s falls from 38 seconds (ext3), 39 seconds (reiserfs) and
11.6 seconds (ext2) down to 10.5 seconds.

-


2001-11-30 09:46:38

by OGAWA Hirofumi

[permalink] [raw]
Subject: Re: [patch] smarter atime updates

Hi,

Andrew Morton <[email protected]> writes:

> mark_inode_dirty() is quite expensive for journalling filesystems,
> and we're calling it a lot more than we need to.
>
> --- linux-2.4.17-pre1/fs/inode.c Mon Nov 26 11:52:07 2001
> +++ linux-akpm/fs/inode.c Thu Nov 29 21:53:02 2001
> @@ -1187,6 +1187,8 @@ void __init inode_init(unsigned long mem
>
> void update_atime (struct inode *inode)
> {
> + if (inode->i_atime == CURRENT_TIME)
> + return;
> if ( IS_NOATIME (inode) ) return;
> if ( IS_NODIRATIME (inode) && S_ISDIR (inode->i_mode) ) return;
> if ( IS_RDONLY (inode) ) return;

in include/linux/fs.h:

#define UPDATE_ATIME(inode) \
do { \
if ((inode)->i_atime != CURRENT_TIME) \
update_atime (inode); \
} while (0)

How about this macro? (add likely()?)
--
OGAWA Hirofumi <[email protected]>

2001-11-30 09:56:48

by Andrew Morton

[permalink] [raw]
Subject: Re: [patch] smarter atime updates

OGAWA Hirofumi wrote:
>
> #define UPDATE_ATIME(inode) \
> do { \
> if ((inode)->i_atime != CURRENT_TIME) \
> update_atime (inode); \
> } while (0)
>

yes, that'd be fine. The more conventional approach
would be to blow away the strange UPPER CASE name and:

static inline void update_atime(struct inode *inode)
{
if (inode->i_atime != CURRENT_TIME)
__update_atime(inode);
}

But that would be a bigger patch, and I rather like shaving
off three quarters of the sys_read() overhead with a two-liner ;)

-

2001-11-30 10:41:34

by OGAWA Hirofumi

[permalink] [raw]
Subject: Re: [patch] smarter atime updates

Andrew Morton <[email protected]> writes:

> OGAWA Hirofumi wrote:
> >
> > #define UPDATE_ATIME(inode) \
> > do { \
> > if ((inode)->i_atime != CURRENT_TIME) \
> > update_atime (inode); \
> > } while (0)
> >
>
> yes, that'd be fine. The more conventional approach
> would be to blow away the strange UPPER CASE name and:
>
> static inline void update_atime(struct inode *inode)
> {
> if (inode->i_atime != CURRENT_TIME)
> __update_atime(inode);
> }
>
> But that would be a bigger patch, and I rather like shaving
> off three quarters of the sys_read() overhead with a two-liner ;)

Umm, I think only fs/autofs4/root.c use update_atime() directly.
--
OGAWA Hirofumi <[email protected]>

2001-11-30 17:02:08

by Marcelo Tosatti

[permalink] [raw]
Subject: Re: [patch] smarter atime updates


Now are you sure this can't break anything ?

On Thu, 29 Nov 2001, Andrew Morton wrote:

> mark_inode_dirty() is quite expensive for journalling filesystems,
> and we're calling it a lot more than we need to.
>
> --- linux-2.4.17-pre1/fs/inode.c Mon Nov 26 11:52:07 2001
> +++ linux-akpm/fs/inode.c Thu Nov 29 21:53:02 2001
> @@ -1187,6 +1187,8 @@ void __init inode_init(unsigned long mem
>
> void update_atime (struct inode *inode)
> {
> + if (inode->i_atime == CURRENT_TIME)
> + return;
> if ( IS_NOATIME (inode) ) return;
> if ( IS_NODIRATIME (inode) && S_ISDIR (inode->i_mode) ) return;
> if ( IS_RDONLY (inode) ) return;
>
>
> with this patch, the time to read a 10 meg file with 10 million
> read()s falls from 38 seconds (ext3), 39 seconds (reiserfs) and
> 11.6 seconds (ext2) down to 10.5 seconds.
>
> -
>

2001-11-30 17:30:31

by Chris Mason

[permalink] [raw]
Subject: Re: [patch] smarter atime updates



On Friday, November 30, 2001 01:44:42 PM -0200 Marcelo Tosatti
<[email protected]> wrote:

>
> Now are you sure this can't break anything ?
>

It shouldn't hurt reiserfs at least, I like it.

-chris

> On Thu, 29 Nov 2001, Andrew Morton wrote:
>
>> mark_inode_dirty() is quite expensive for journalling filesystems,
>> and we're calling it a lot more than we need to.
>>
>> --- linux-2.4.17-pre1/fs/inode.c Mon Nov 26 11:52:07 2001
>> +++ linux-akpm/fs/inode.c Thu Nov 29 21:53:02 2001
>> @@ -1187,6 +1187,8 @@ void __init inode_init(unsigned long mem
>>
>> void update_atime (struct inode *inode)
>> {
>> + if (inode->i_atime == CURRENT_TIME)
>> + return;
>> if ( IS_NOATIME (inode) ) return;
>> if ( IS_NODIRATIME (inode) && S_ISDIR (inode->i_mode) ) return;
>> if ( IS_RDONLY (inode) ) return;
>>
>>
>> with this patch, the time to read a 10 meg file with 10 million
>> read()s falls from 38 seconds (ext3), 39 seconds (reiserfs) and
>> 11.6 seconds (ext2) down to 10.5 seconds.

2001-11-30 20:04:34

by Robert Love

[permalink] [raw]
Subject: Re: [patch] smarter atime updates

On Fri, 2001-11-30 at 10:44, Marcelo Tosatti wrote:

> Now are you sure this can't break anything ?

It certainly won't break ext2 and ext3, and it probably does offer a
nice performance benefit.

Robert Love

2001-11-30 21:53:57

by Andreas Dilger

[permalink] [raw]
Subject: Re: [patch] smarter atime updates

On Nov 30, 2001 13:44 -0200, Marcelo Tosatti wrote:
> Now are you sure this can't break anything ?

What fun would that be, if you couldn't follow in Linus' footsteps?
People would get complacent if things didn't break now and again ;-).

Anyways, you never want to put a change that is not a specific bug
fix in a -rc patch anyways. Save it for a -pre patch, where you expect
at least some testing before it is released.

> On Thu, 29 Nov 2001, Andrew Morton wrote:
> > mark_inode_dirty() is quite expensive for journalling filesystems,
> > and we're calling it a lot more than we need to.
> >
> > --- linux-2.4.17-pre1/fs/inode.c Mon Nov 26 11:52:07 2001
> > +++ linux-akpm/fs/inode.c Thu Nov 29 21:53:02 2001
> > @@ -1187,6 +1187,8 @@ void __init inode_init(unsigned long mem
> >
> > void update_atime (struct inode *inode)
> > {
> > + if (inode->i_atime == CURRENT_TIME)
> > + return;
> > if ( IS_NOATIME (inode) ) return;
> > if ( IS_NODIRATIME (inode) && S_ISDIR (inode->i_mode) ) return;
> > if ( IS_RDONLY (inode) ) return;
> >
> >
> > with this patch, the time to read a 10 meg file with 10 million
> > read()s falls from 38 seconds (ext3), 39 seconds (reiserfs) and
> > 11.6 seconds (ext2) down to 10.5 seconds.

Well, just doing a code check of the update_atime() and UPDATE_ATIME()
users, and they are all in readlink(), follow_link(), open_namei(),
and various fs _readdir() codes. None of them (AFAICS) depend on the
mark_inode_dirty() as a side-effect. This means it should be safe.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2001-11-30 21:56:18

by Linus Torvalds

[permalink] [raw]
Subject: Re: [patch] smarter atime updates


On Fri, 30 Nov 2001, Andreas Dilger wrote:
>
> Well, just doing a code check of the update_atime() and UPDATE_ATIME()
> users, and they are all in readlink(), follow_link(), open_namei(),
> and various fs _readdir() codes. None of them (AFAICS) depend on the
> mark_inode_dirty() as a side-effect. This means it should be safe.

More importantly, _if_ somebody depended on the side effects, they'd have
been thwarted by the "noatime" mount option anyway, so any such bug would
not be a new bug.

Linus

2001-11-30 22:30:38

by Simon Kirby

[permalink] [raw]
Subject: Re: [patch] smarter atime updates

On Fri, Nov 30, 2001 at 01:50:21PM -0800, Linus Torvalds wrote:

> On Fri, 30 Nov 2001, Andreas Dilger wrote:
> >
> > Well, just doing a code check of the update_atime() and UPDATE_ATIME()
> > users, and they are all in readlink(), follow_link(), open_namei(),
> > and various fs _readdir() codes. None of them (AFAICS) depend on the
> > mark_inode_dirty() as a side-effect. This means it should be safe.
>
> More importantly, _if_ somebody depended on the side effects, they'd have
> been thwarted by the "noatime" mount option anyway, so any such bug would
> not be a new bug.

I've always thought filesystems should mount with noatime,nodiratime by
default and only actually update atime if specifically mounted with
"atime", as it's so rarely used. Out of all of the servers here, none
actually use atime (every file system on _every_ server is mounted
noatime,nodiratime). It's such a waste and just sounds fundamentally
broken to issue a write because somebody read from a file.

...But there's probably some POSIX standard which would make such a
change illegal. Blah blah...

(Not to say that atime isn't useful, but in most cases where it might be
useful, it is so easily broken by backup processes, etc., that it really
wants to be a different sort of mechanism.)

Simon-

[ Stormix Technologies Inc. ][ NetNation Communications Inc. ]
[ [email protected] ][ [email protected] ]
[ Opinions expressed are not necessarily those of my employers. ]

2001-11-30 22:36:38

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [patch] smarter atime updates

Followup to: <[email protected]>
By author: OGAWA Hirofumi <[email protected]>
In newsgroup: linux.dev.kernel
>
> Hi,
>
> Andrew Morton <[email protected]> writes:
>
> > mark_inode_dirty() is quite expensive for journalling filesystems,
> > and we're calling it a lot more than we need to.
> >
> > --- linux-2.4.17-pre1/fs/inode.c Mon Nov 26 11:52:07 2001
> > +++ linux-akpm/fs/inode.c Thu Nov 29 21:53:02 2001
> > @@ -1187,6 +1187,8 @@ void __init inode_init(unsigned long mem
> >
> > void update_atime (struct inode *inode)
> > {
> > + if (inode->i_atime == CURRENT_TIME)
> > + return;
> > if ( IS_NOATIME (inode) ) return;
> > if ( IS_NODIRATIME (inode) && S_ISDIR (inode->i_mode) ) return;
> > if ( IS_RDONLY (inode) ) return;
>
> in include/linux/fs.h:
>
> #define UPDATE_ATIME(inode) \
> do { \
> if ((inode)->i_atime != CURRENT_TIME) \
> update_atime (inode); \
> } while (0)
>
> How about this macro? (add likely()?)
>

The only potential issue I can see (with either approach) is that it
seems to break filesystems for which atime has a granularity finer
than 1 s.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2001-11-30 23:32:10

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [patch] smarter atime updates

Followup to: <[email protected]>
By author: Simon Kirby <[email protected]>
In newsgroup: linux.dev.kernel
>
> I've always thought filesystems should mount with noatime,nodiratime by
> default and only actually update atime if specifically mounted with
> "atime", as it's so rarely used. Out of all of the servers here, none
> actually use atime (every file system on _every_ server is mounted
> noatime,nodiratime). It's such a waste and just sounds fundamentally
> broken to issue a write because somebody read from a file.
>
> ...But there's probably some POSIX standard which would make such a
> change illegal. Blah blah...
>
> (Not to say that atime isn't useful, but in most cases where it might be
> useful, it is so easily broken by backup processes, etc., that it really
> wants to be a different sort of mechanism.)
>

Edit /etc/fstab and be happy. I'm sorry, but you even know why your
request is unacceptable.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2001-12-01 09:23:32

by Hans Reiser

[permalink] [raw]
Subject: Re: [patch] smarter atime updates

Marcelo Tosatti wrote:

>Now are you sure this can't break anything ?
>
>On Thu, 29 Nov 2001, Andrew Morton wrote:
>
>>mark_inode_dirty() is quite expensive for journalling filesystems,
>>and we're calling it a lot more than we need to.
>>
>>--- linux-2.4.17-pre1/fs/inode.c Mon Nov 26 11:52:07 2001
>>+++ linux-akpm/fs/inode.c Thu Nov 29 21:53:02 2001
>>@@ -1187,6 +1187,8 @@ void __init inode_init(unsigned long mem
>>
>> void update_atime (struct inode *inode)
>> {
>>+ if (inode->i_atime == CURRENT_TIME)
>>+ return;
>> if ( IS_NOATIME (inode) ) return;
>> if ( IS_NODIRATIME (inode) && S_ISDIR (inode->i_mode) ) return;
>> if ( IS_RDONLY (inode) ) return;
>>
>>
>>with this patch, the time to read a 10 meg file with 10 million
>>read()s falls from 38 seconds (ext3), 39 seconds (reiserfs) and
>>11.6 seconds (ext2) down to 10.5 seconds.
>>
>>-
>>
>
>-
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to [email protected]
>More majordomo info at http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at http://www.tux.org/lkml/
>
>
nice optimization.

Hans