2004-06-12 01:15:23

by Cesar Eduardo Barros

Subject: [PATCH] O_NOATIME support

(not subscribed to lkml, please CC: me on replies)

This patch adds support for the O_NOATIME open flag (GNU extension):

int O_NOATIME Macro
If this bit is set, read will not update the access time of the file.
See File Times. This is used by programs that do backups, so that
backing a file up does not count as reading it. Only the owner of the
file or the superuser may use this bit.

It is useful if you want to do something with the file atime (for
instance, moving files that have not been accessed in a while to
somewhere else, or something like Debian's popularity-contest) but you
also want to read all files periodically (for instance, tripwire or
debsums).

Currently, the program that reads all files periodically has to use
utimes, which can race with the atime update:

A               B
open
fstat
read
                open
                read
                close
close
utimes

And the file still has the old atime, instead of the new one from when B
did the read from it. This problem does not happen if A uses O_NOATIME
instead of utimes to preserve the atime.
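
As a rough userspace sketch of the approach described above (not part of the patch), a scanner could open each file with O_NOATIME and fall back to a plain open when the kernel refuses with EPERM. The helper names open_noatime and scan_file are made up for illustration, and both the fallback policy and the "define to 0 if the headers lack the flag" guard are assumptions.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc only exposes O_NOATIME with this */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef O_NOATIME
#define O_NOATIME 0             /* headers without the flag: plain open */
#endif

/* Hypothetical helper: open read-only, asking the kernel not to touch atime.
 * Falls back to a normal open if we are neither owner nor privileged. */
static int open_noatime(const char *path)
{
    int fd = open(path, O_RDONLY | O_NOATIME);
    if (fd < 0 && errno == EPERM)
        fd = open(path, O_RDONLY);      /* atime will be updated as usual */
    return fd;
}

/* Hypothetical helper: read the whole file once, as a checksummer would.
 * Returns the number of bytes read, or -1 on error. */
static long long scan_file(const char *path)
{
    char buf[4096];
    long long total = 0;
    ssize_t n;
    int fd = open_noatime(path);

    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;                     /* a real tool would hash buf here */
    close(fd);
    return n < 0 ? -1 : total;
}
```

Because the reader never writes the atime back, there is no window in which a concurrent reader's update can be overwritten.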

This patch adds the O_NOATIME constant for all architectures, but it
would also be possible to add it one architecture at a time by defining
it to 0 when not defined in asm-*.
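
The "define it to 0" alternative can be illustrated with a small sketch. DEMO_O_NOATIME is a hypothetical stand-in for O_NOATIME on an architecture whose header has not been updated, and should_touch_atime only mirrors the shape of the file_accessed() check from the patch; neither name exists in the kernel.

```c
/* Hypothetical stand-in for O_NOATIME on an architecture whose fcntl.h
 * does not define the flag yet. */
#ifndef DEMO_O_NOATIME
#define DEMO_O_NOATIME 0
#endif

/* Same shape as the check added to file_accessed(): returns 1 when the
 * access time should be updated. With the flag defined as 0 the mask never
 * matches, so atime is always updated and the flag is a harmless no-op. */
static int should_touch_atime(unsigned int f_flags)
{
    return !(f_flags & DEMO_O_NOATIME);
}
```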

Based on patch by Marek Michalkiewicz <[email protected]> at
http://www.uwsg.iu.edu/hypermail/linux/kernel/9811.2/0118.html

Lightly tested on i386.


fs/fcntl.c | 7 ++++++-
fs/namei.c | 5 +++++
include/asm-alpha/fcntl.h | 1 +
include/asm-arm/fcntl.h | 1 +
include/asm-arm26/fcntl.h | 1 +
include/asm-cris/fcntl.h | 1 +
include/asm-h8300/fcntl.h | 1 +
include/asm-i386/fcntl.h | 1 +
include/asm-ia64/fcntl.h | 1 +
include/asm-m68k/fcntl.h | 1 +
include/asm-mips/fcntl.h | 1 +
include/asm-parisc/fcntl.h | 1 +
include/asm-ppc/fcntl.h | 1 +
include/asm-ppc64/fcntl.h | 1 +
include/asm-s390/fcntl.h | 1 +
include/asm-sh/fcntl.h | 1 +
include/asm-sparc/fcntl.h | 1 +
include/asm-sparc64/fcntl.h | 1 +
include/asm-v850/fcntl.h | 1 +
include/asm-x86_64/fcntl.h | 1 +
include/linux/fs.h | 3 ++-
21 files changed, 31 insertions(+), 2 deletions(-)


diff -Nur linux-2.6.6.orig/fs/fcntl.c linux-2.6.6/fs/fcntl.c
--- linux-2.6.6.orig/fs/fcntl.c 2004-05-14 18:21:42.000000000 -0300
+++ linux-2.6.6/fs/fcntl.c 2004-06-10 18:14:28.000000000 -0300
@@ -212,7 +212,7 @@
return ret;
}

-#define SETFL_MASK (O_APPEND | O_NONBLOCK | O_NDELAY | FASYNC | O_DIRECT)
+#define SETFL_MASK (O_APPEND | O_NONBLOCK | O_NDELAY | FASYNC | O_DIRECT | O_NOATIME)

static int setfl(int fd, struct file * filp, unsigned long arg)
{
@@ -223,6 +223,11 @@
if (!(arg & O_APPEND) && IS_APPEND(inode))
return -EPERM;

+ /* O_NOATIME can only be set by the owner or superuser */
+ if ((arg & O_NOATIME) && !(filp->f_flags & O_NOATIME))
+ if (current->fsuid != inode->i_uid && !capable(CAP_FOWNER))
+ return -EPERM;
+
/* required for strict SunOS emulation */
if (O_NONBLOCK != O_NDELAY)
if (arg & O_NDELAY)
diff -Nur linux-2.6.6.orig/fs/namei.c linux-2.6.6/fs/namei.c
--- linux-2.6.6.orig/fs/namei.c 2004-05-14 18:21:43.000000000 -0300
+++ linux-2.6.6/fs/namei.c 2004-06-10 18:30:07.000000000 -0300
@@ -1206,6 +1206,11 @@
return -EPERM;
}

+ /* O_NOATIME can only be set by the owner or superuser */
+ if (flag & O_NOATIME)
+ if (current->fsuid != inode->i_uid && !capable(CAP_FOWNER))
+ return -EPERM;
+
/*
* Ensure there are no outstanding leases on the file.
*/
diff -Nur linux-2.6.6.orig/include/asm-alpha/fcntl.h linux-2.6.6/include/asm-alpha/fcntl.h
--- linux-2.6.6.orig/include/asm-alpha/fcntl.h 2004-04-04 00:37:24.000000000 -0300
+++ linux-2.6.6/include/asm-alpha/fcntl.h 2004-06-10 18:36:01.000000000 -0300
@@ -21,6 +21,7 @@
#define O_NOFOLLOW 0200000 /* don't follow links */
#define O_LARGEFILE 0400000 /* will be set by the kernel on every open */
#define O_DIRECT 02000000 /* direct disk access - should check with OSF/1 */
+#define O_NOATIME 04000000

#define F_DUPFD 0 /* dup */
#define F_GETFD 1 /* get close_on_exec */
diff -Nur linux-2.6.6.orig/include/asm-arm/fcntl.h linux-2.6.6/include/asm-arm/fcntl.h
--- linux-2.6.6.orig/include/asm-arm/fcntl.h 2004-04-04 00:36:27.000000000 -0300
+++ linux-2.6.6/include/asm-arm/fcntl.h 2004-06-10 18:36:55.000000000 -0300
@@ -20,6 +20,7 @@
#define O_NOFOLLOW 0100000 /* don't follow links */
#define O_DIRECT 0200000 /* direct disk access hint - currently ignored */
#define O_LARGEFILE 0400000
+#define O_NOATIME 01000000

#define F_DUPFD 0 /* dup */
#define F_GETFD 1 /* get close_on_exec */
diff -Nur linux-2.6.6.orig/include/asm-arm26/fcntl.h linux-2.6.6/include/asm-arm26/fcntl.h
--- linux-2.6.6.orig/include/asm-arm26/fcntl.h 2004-04-04 00:37:40.000000000 -0300
+++ linux-2.6.6/include/asm-arm26/fcntl.h 2004-06-10 18:37:42.000000000 -0300
@@ -20,6 +20,7 @@
#define O_NOFOLLOW 0100000 /* don't follow links */
#define O_DIRECT 0200000 /* direct disk access hint - currently ignored */
#define O_LARGEFILE 0400000
+#define O_NOATIME 01000000

#define F_DUPFD 0 /* dup */
#define F_GETFD 1 /* get close_on_exec */
diff -Nur linux-2.6.6.orig/include/asm-cris/fcntl.h linux-2.6.6/include/asm-cris/fcntl.h
--- linux-2.6.6.orig/include/asm-cris/fcntl.h 2004-04-04 00:36:25.000000000 -0300
+++ linux-2.6.6/include/asm-cris/fcntl.h 2004-06-10 18:37:59.000000000 -0300
@@ -22,6 +22,7 @@
#define O_LARGEFILE 0100000
#define O_DIRECTORY 0200000 /* must be a directory */
#define O_NOFOLLOW 0400000 /* don't follow links */
+#define O_NOATIME 01000000

#define F_DUPFD 0 /* dup */
#define F_GETFD 1 /* get f_flags */
diff -Nur linux-2.6.6.orig/include/asm-h8300/fcntl.h linux-2.6.6/include/asm-h8300/fcntl.h
--- linux-2.6.6.orig/include/asm-h8300/fcntl.h 2004-04-04 00:37:43.000000000 -0300
+++ linux-2.6.6/include/asm-h8300/fcntl.h 2004-06-10 18:38:16.000000000 -0300
@@ -20,6 +20,7 @@
#define O_NOFOLLOW 0100000 /* don't follow links */
#define O_DIRECT 0200000 /* direct disk access hint - currently ignored */
#define O_LARGEFILE 0400000
+#define O_NOATIME 01000000

#define F_DUPFD 0 /* dup */
#define F_GETFD 1 /* get close_on_exec */
diff -Nur linux-2.6.6.orig/include/asm-i386/fcntl.h linux-2.6.6/include/asm-i386/fcntl.h
--- linux-2.6.6.orig/include/asm-i386/fcntl.h 2004-04-04 00:37:23.000000000 -0300
+++ linux-2.6.6/include/asm-i386/fcntl.h 2004-06-10 18:38:26.000000000 -0300
@@ -20,6 +20,7 @@
#define O_LARGEFILE 0100000
#define O_DIRECTORY 0200000 /* must be a directory */
#define O_NOFOLLOW 0400000 /* don't follow links */
+#define O_NOATIME 01000000

#define F_DUPFD 0 /* dup */
#define F_GETFD 1 /* get close_on_exec */
diff -Nur linux-2.6.6.orig/include/asm-ia64/fcntl.h linux-2.6.6/include/asm-ia64/fcntl.h
--- linux-2.6.6.orig/include/asm-ia64/fcntl.h 2004-04-04 00:37:23.000000000 -0300
+++ linux-2.6.6/include/asm-ia64/fcntl.h 2004-06-10 18:38:38.000000000 -0300
@@ -28,6 +28,7 @@
#define O_LARGEFILE 0100000
#define O_DIRECTORY 0200000 /* must be a directory */
#define O_NOFOLLOW 0400000 /* don't follow links */
+#define O_NOATIME 01000000

#define F_DUPFD 0 /* dup */
#define F_GETFD 1 /* get close_on_exec */
diff -Nur linux-2.6.6.orig/include/asm-m68k/fcntl.h linux-2.6.6/include/asm-m68k/fcntl.h
--- linux-2.6.6.orig/include/asm-m68k/fcntl.h 2004-04-04 00:36:53.000000000 -0300
+++ linux-2.6.6/include/asm-m68k/fcntl.h 2004-06-10 18:38:49.000000000 -0300
@@ -20,6 +20,7 @@
#define O_NOFOLLOW 0100000 /* don't follow links */
#define O_DIRECT 0200000 /* direct disk access hint - currently ignored */
#define O_LARGEFILE 0400000
+#define O_NOATIME 01000000

#define F_DUPFD 0 /* dup */
#define F_GETFD 1 /* get close_on_exec */
diff -Nur linux-2.6.6.orig/include/asm-mips/fcntl.h linux-2.6.6/include/asm-mips/fcntl.h
--- linux-2.6.6.orig/include/asm-mips/fcntl.h 2004-04-04 00:37:43.000000000 -0300
+++ linux-2.6.6/include/asm-mips/fcntl.h 2004-06-10 18:39:12.000000000 -0300
@@ -26,6 +26,7 @@
#define O_DIRECT 0x8000 /* direct disk access hint */
#define O_DIRECTORY 0x10000 /* must be a directory */
#define O_NOFOLLOW 0x20000 /* don't follow links */
+#define O_NOATIME 0x40000

#define O_NDELAY O_NONBLOCK

diff -Nur linux-2.6.6.orig/include/asm-parisc/fcntl.h linux-2.6.6/include/asm-parisc/fcntl.h
--- linux-2.6.6.orig/include/asm-parisc/fcntl.h 2004-04-04 00:37:07.000000000 -0300
+++ linux-2.6.6/include/asm-parisc/fcntl.h 2004-06-10 18:40:03.000000000 -0300
@@ -19,6 +19,7 @@
#define O_NOCTTY 00400000 /* not fcntl */
#define O_DSYNC 01000000 /* HPUX only */
#define O_RSYNC 02000000 /* HPUX only */
+#define O_NOATIME 04000000

#define FASYNC 00020000 /* fcntl, for BSD compatibility */
#define O_DIRECT 00040000 /* direct disk access hint - currently ignored */
diff -Nur linux-2.6.6.orig/include/asm-ppc/fcntl.h linux-2.6.6/include/asm-ppc/fcntl.h
--- linux-2.6.6.orig/include/asm-ppc/fcntl.h 2004-04-04 00:37:07.000000000 -0300
+++ linux-2.6.6/include/asm-ppc/fcntl.h 2004-06-10 18:40:14.000000000 -0300
@@ -20,6 +20,7 @@
#define O_NOFOLLOW 0100000 /* don't follow links */
#define O_LARGEFILE 0200000
#define O_DIRECT 0400000 /* direct disk access hint */
+#define O_NOATIME 01000000

#define F_DUPFD 0 /* dup */
#define F_GETFD 1 /* get close_on_exec */
diff -Nur linux-2.6.6.orig/include/asm-ppc64/fcntl.h linux-2.6.6/include/asm-ppc64/fcntl.h
--- linux-2.6.6.orig/include/asm-ppc64/fcntl.h 2004-04-04 00:36:15.000000000 -0300
+++ linux-2.6.6/include/asm-ppc64/fcntl.h 2004-06-10 18:40:25.000000000 -0300
@@ -27,6 +27,7 @@
#define O_NOFOLLOW 0100000 /* don't follow links */
#define O_LARGEFILE 0200000
#define O_DIRECT 0400000 /* direct disk access hint */
+#define O_NOATIME 01000000

#define F_DUPFD 0 /* dup */
#define F_GETFD 1 /* get close_on_exec */
diff -Nur linux-2.6.6.orig/include/asm-s390/fcntl.h linux-2.6.6/include/asm-s390/fcntl.h
--- linux-2.6.6.orig/include/asm-s390/fcntl.h 2004-04-04 00:36:12.000000000 -0300
+++ linux-2.6.6/include/asm-s390/fcntl.h 2004-06-10 18:40:42.000000000 -0300
@@ -27,6 +27,7 @@
#define O_LARGEFILE 0100000
#define O_DIRECTORY 0200000 /* must be a directory */
#define O_NOFOLLOW 0400000 /* don't follow links */
+#define O_NOATIME 01000000

#define F_DUPFD 0 /* dup */
#define F_GETFD 1 /* get close_on_exec */
diff -Nur linux-2.6.6.orig/include/asm-sh/fcntl.h linux-2.6.6/include/asm-sh/fcntl.h
--- linux-2.6.6.orig/include/asm-sh/fcntl.h 2004-04-04 00:37:42.000000000 -0300
+++ linux-2.6.6/include/asm-sh/fcntl.h 2004-06-10 18:40:52.000000000 -0300
@@ -20,6 +20,7 @@
#define O_LARGEFILE 0100000
#define O_DIRECTORY 0200000 /* must be a directory */
#define O_NOFOLLOW 0400000 /* don't follow links */
+#define O_NOATIME 01000000

#define F_DUPFD 0 /* dup */
#define F_GETFD 1 /* get close_on_exec */
diff -Nur linux-2.6.6.orig/include/asm-sparc/fcntl.h linux-2.6.6/include/asm-sparc/fcntl.h
--- linux-2.6.6.orig/include/asm-sparc/fcntl.h 2004-04-04 00:38:20.000000000 -0300
+++ linux-2.6.6/include/asm-sparc/fcntl.h 2004-06-10 18:41:14.000000000 -0300
@@ -21,6 +21,7 @@
#define O_NOFOLLOW 0x20000 /* don't follow links */
#define O_LARGEFILE 0x40000
#define O_DIRECT 0x100000 /* direct disk access hint */
+#define O_NOATIME 0x200000

#define F_DUPFD 0 /* dup */
#define F_GETFD 1 /* get close_on_exec */
diff -Nur linux-2.6.6.orig/include/asm-sparc64/fcntl.h linux-2.6.6/include/asm-sparc64/fcntl.h
--- linux-2.6.6.orig/include/asm-sparc64/fcntl.h 2004-04-04 00:38:20.000000000 -0300
+++ linux-2.6.6/include/asm-sparc64/fcntl.h 2004-06-10 18:41:27.000000000 -0300
@@ -21,6 +21,7 @@
#define O_NOFOLLOW 0x20000 /* don't follow links */
#define O_LARGEFILE 0x40000
#define O_DIRECT 0x100000 /* direct disk access hint */
+#define O_NOATIME 0x200000


#define F_DUPFD 0 /* dup */
diff -Nur linux-2.6.6.orig/include/asm-v850/fcntl.h linux-2.6.6/include/asm-v850/fcntl.h
--- linux-2.6.6.orig/include/asm-v850/fcntl.h 2004-04-04 00:36:53.000000000 -0300
+++ linux-2.6.6/include/asm-v850/fcntl.h 2004-06-10 18:41:56.000000000 -0300
@@ -20,6 +20,7 @@
#define O_NOFOLLOW 0100000 /* don't follow links */
#define O_DIRECT 0200000 /* direct disk access hint - currently ignored */
#define O_LARGEFILE 0400000
+#define O_NOATIME 01000000

#define F_DUPFD 0 /* dup */
#define F_GETFD 1 /* get close_on_exec */
diff -Nur linux-2.6.6.orig/include/asm-x86_64/fcntl.h linux-2.6.6/include/asm-x86_64/fcntl.h
--- linux-2.6.6.orig/include/asm-x86_64/fcntl.h 2004-04-04 00:36:26.000000000 -0300
+++ linux-2.6.6/include/asm-x86_64/fcntl.h 2004-06-10 18:42:13.000000000 -0300
@@ -20,6 +20,7 @@
#define O_LARGEFILE 0100000
#define O_DIRECTORY 0200000 /* must be a directory */
#define O_NOFOLLOW 0400000 /* don't follow links */
+#define O_NOATIME 01000000

#define F_DUPFD 0 /* dup */
#define F_GETFD 1 /* get close_on_exec */
diff -Nur linux-2.6.6.orig/include/linux/fs.h linux-2.6.6/include/linux/fs.h
--- linux-2.6.6.orig/include/linux/fs.h 2004-05-14 18:21:59.000000000 -0300
+++ linux-2.6.6/include/linux/fs.h 2004-06-10 17:57:30.000000000 -0300
@@ -974,7 +974,8 @@

static inline void file_accessed(struct file *file)
{
- touch_atime(file->f_vfsmnt, file->f_dentry);
+ if (!(file->f_flags & O_NOATIME))
+ touch_atime(file->f_vfsmnt, file->f_dentry);
}

int sync_inode(struct inode *inode, struct writeback_control *wbc);


--
Cesar Eduardo Barros
[email protected]
[email protected]


2004-06-12 16:44:08

by Bernd Eckenfels

Subject: Re: [PATCH] O_NOATIME support

In article <[email protected]> you wrote:
> +++ linux-2.6.6/include/asm-arm/fcntl.h 2004-06-10 18:36:55.000000000 -0300
> +#define O_NOATIME 01000000
> +++ linux-2.6.6/include/asm-arm26/fcntl.h 2004-06-10 18:37:42.000000000 -0300
> +#define O_NOATIME 01000000
> +++ linux-2.6.6/include/asm-cris/fcntl.h 2004-06-10 18:37:59.000000000 -0300
> +#define O_NOATIME 01000000
> +++ linux-2.6.6/include/asm-h8300/fcntl.h 2004-06-10 18:38:16.000000000 -0300
> +#define O_NOATIME 01000000
> +++ linux-2.6.6/include/asm-i386/fcntl.h 2004-06-10 18:38:26.000000000 -0300
> +#define O_NOATIME 01000000
> +++ linux-2.6.6/include/asm-ia64/fcntl.h 2004-06-10 18:38:38.000000000 -0300
> +#define O_NOATIME 01000000
> +++ linux-2.6.6/include/asm-m68k/fcntl.h 2004-06-10 18:38:49.000000000 -0300
> +#define O_NOATIME 01000000
> +++ linux-2.6.6/include/asm-mips/fcntl.h 2004-06-10 18:39:12.000000000 -0300
> +#define O_NOATIME 0x40000
> +++ linux-2.6.6/include/asm-parisc/fcntl.h 2004-06-10 18:40:03.000000000 -0300
> +#define O_NOATIME 04000000
> +++ linux-2.6.6/include/asm-ppc/fcntl.h 2004-06-10 18:40:14.000000000 -0300
> +#define O_NOATIME 01000000
> +++ linux-2.6.6/include/asm-ppc64/fcntl.h 2004-06-10 18:40:25.000000000 -0300
> +#define O_NOATIME 01000000
> +++ linux-2.6.6/include/asm-s390/fcntl.h 2004-06-10 18:40:42.000000000 -0300
> +#define O_NOATIME 01000000
> +++ linux-2.6.6/include/asm-sh/fcntl.h 2004-06-10 18:40:52.000000000 -0300
> +#define O_NOATIME 01000000
> +++ linux-2.6.6/include/asm-sparc/fcntl.h 2004-06-10 18:41:14.000000000 -0300
> +#define O_NOATIME 0x200000
> +++ linux-2.6.6/include/asm-sparc64/fcntl.h 2004-06-10 18:41:27.000000000 -0300
> +#define O_NOATIME 0x200000
> +++ linux-2.6.6/include/asm-v850/fcntl.h 2004-06-10 18:41:56.000000000 -0300
> +#define O_NOATIME 01000000
> +++ linux-2.6.6/include/asm-x86_64/fcntl.h 2004-06-10 18:42:13.000000000 -0300
> +#define O_NOATIME 01000000

This is less related to your patch (I like this feature!) and more to the
current source layout: is there a reason for not sharing those open flags in
a non-architecture-specific file?

And should you not try to use the same value on all architectures to make
that especially easy to change later?

Greetings
Bernd


--
eckes privat - http://www.eckes.org/
Project Freefire - http://www.freefire.org/

2004-06-12 18:09:30

by Chris Wedgwood

Subject: Re: [PATCH] O_NOATIME support

On Sat, Jun 12, 2004 at 06:44:01PM +0200, Bernd Eckenfels wrote:

> This is less related to your patch (I like this feature!) and more
> to the current source layout: is there a reason for not sharing
> those open flags in a non-architecture-specific file?

We just never did it that way, and they are not all the same across
all architectures.

> And should you not try to use the same value on all architectures to
> make that especially easy to change later?

They are ABI specific and will never change.
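
For reference, the values assigned in the patch above can be tabulated in a short sketch. The struct and function names are illustrative only; the numbers are copied from the diff. Note that octal 01000000 and hex 0x40000 are the same number, so the mips value matches i386 even though it is written differently.

```c
#include <stddef.h>

/* One row per architecture header touched by the patch. */
struct arch_flag {
    const char *arch;
    unsigned long o_noatime;
};

static const struct arch_flag patch_values[] = {
    { "alpha",   04000000 }, { "arm",     01000000 }, { "arm26",   01000000 },
    { "cris",    01000000 }, { "h8300",   01000000 }, { "i386",    01000000 },
    { "ia64",    01000000 }, { "m68k",    01000000 }, { "mips",    0x40000  },
    { "parisc",  04000000 }, { "ppc",     01000000 }, { "ppc64",   01000000 },
    { "s390",    01000000 }, { "sh",      01000000 }, { "sparc",   0x200000 },
    { "sparc64", 0x200000 }, { "v850",    01000000 }, { "x86_64",  01000000 },
};

#define N_ARCH (sizeof patch_values / sizeof patch_values[0])

/* Count how many different numeric values appear in the table. */
static size_t count_distinct_values(void)
{
    size_t distinct = 0;

    for (size_t i = 0; i < N_ARCH; i++) {
        size_t j;

        for (j = 0; j < i; j++)
            if (patch_values[j].o_noatime == patch_values[i].o_noatime)
                break;
        if (j == i)             /* no earlier row had this value */
            distinct++;
    }
    return distinct;
}
```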


--cw

2004-06-12 18:22:57

by Bernd Eckenfels

Subject: Re: [PATCH] O_NOATIME support

On Sat, Jun 12, 2004 at 11:09:18AM -0700, Chris Wedgwood wrote:
> They are ABI specific and will never change.

Yes, that's even more of an argument for introducing a value which is the same
on all architectures, since it can never be changed again.

Greetings
Bernd
--
(OO) -- Bernd_Eckenfels@Mörscher_Strasse_8.76185Karlsruhe.de --
( .. ) ecki@{inka.de,linux.de,debian.org} http://www.eckes.org/
o--o 1024D/E383CD7E eckes@IRCNet v:+497211603874 f:+497211606754
(O____O) When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl!

2004-06-14 09:55:38

by Christoph Hellwig

Subject: Re: [PATCH] O_NOATIME support

On Fri, Jun 11, 2004 at 10:11:29PM -0300, Cesar Eduardo Barros wrote:
> (not subscribed to lkml, please CC: me on replies)
>
> This patch adds support for the O_NOATIME open flag (GNU extension):
>
> int O_NOATIME Macro
> If this bit is set, read will not update the access time of the file.
> See File Times. This is used by programs that do backups, so that
> backing a file up does not count as reading it. Only the owner of the
> file or the superuser may use this bit.
>
> It is useful if you want to do something with the file atime (for
> instance, moving files that have not been accessed in a while to
> somewhere else, or something like Debian's popularity-contest) but you
> also want to read all files periodically (for instance, tripwire or
> debsums).
>
> Currently, the program that reads all files periodically has to use
> utimes, which can race with the atime update:

Any chance we could change the flag to also not update mtime and ctime
for updates on a fd opened with it (and rename it to O_INVISIBLE, for
example)? That's needed for your above scenario of moving infrequently
used files away (aka an HSM).

2004-06-14 14:04:00

by Christoph Hellwig

Subject: Re: [PATCH] O_NOATIME support

On Mon, Jun 14, 2004 at 10:46:52AM -0300, Cesar Eduardo Barros wrote:
> I don't see why preserving the mtime and ctime would be necessary, since
> to move a file away you either don't touch it (using rename) or only
> read and unlink it (to write to a tape or other filesystem, and you can
> save the atime and mtime while doing it). So O_NOATIME is enough for
> both behaviours.

Maybe some day the file needs to come back from the tape ;-) Or rather
in the HSM scenario a part of the file.

2004-06-14 14:11:41

by Cesar Eduardo Barros

Subject: Re: [PATCH] O_NOATIME support

On Mon, Jun 14, 2004 at 10:55:29AM +0100, Christoph Hellwig wrote:
> On Fri, Jun 11, 2004 at 10:11:29PM -0300, Cesar Eduardo Barros wrote:
> > (not subscribed to lkml, please CC: me on replies)
> >
> > This patch adds support for the O_NOATIME open flag (GNU extension):
> >
> > int O_NOATIME Macro
> > If this bit is set, read will not update the access time of the file.
> > See File Times. This is used by programs that do backups, so that
> > backing a file up does not count as reading it. Only the owner of the
> > file or the superuser may use this bit.
> >
> > It is useful if you want to do something with the file atime (for
> > instance, moving files that have not been accessed in a while to
> > somewhere else, or something like Debian's popularity-contest) but you
> > also want to read all files periodically (for instance, tripwire or
> > debsums).
> >
> > Currently, the program that reads all files periodically has to use
> > utimes, which can race with the atime update:
>
> Any chance we could change the flag to also not update mtime and ctime
> for updates on a fd opened with it (and renaming it to O_INVISIBLE for
> example). That's needed for your above moving infrequently used files
> away scenario (aka a HSM)

I don't see why preserving the mtime and ctime would be necessary, since
to move a file away you either don't touch it (using rename) or only
read and unlink it (to write to a tape or other filesystem, and you can
save the atime and mtime while doing it). So O_NOATIME is enough for
both behaviours.

Besides, O_NOATIME is most important not for the program that's moving
the files elsewhere, but for these checksum-the-world utilities that
read every single file they can see, and in the process manage to
destroy the usefulness of the atime, or backup programs that also read
everything they can touch. Both currently have to use utimes after
reading the whole file to restore the atime it had when they began
reading, which can take a long time if the file is huge (but note that
the mtime doesn't change since they are all reading, not writing).

The ctime changing is not a problem, since programs that want to move
infrequently used files away will use only the atime and mtime to make
their decisions, not the ctime. Also, wouldn't not changing the atime
make the ctime not change too, since nothing in the inode has changed?

O_NOATIME would also be useful for things like tar --atime-preserve,
cpio --reset-access-time, star -atime, pax -t, and others.

--
Cesar Eduardo Barros
[email protected]
[email protected]

2004-06-14 15:29:24

by Paul Jackson

Subject: Re: [PATCH] O_NOATIME support

> O_INVISIBLE

If I were writing Linux viruses, I'd like that option.
One additional easy way to cover my tracks a bit more.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-14 16:57:50

by David Lang

Subject: Re: [PATCH] O_NOATIME support

On Mon, 14 Jun 2004, Cesar Eduardo Barros wrote:
> On Mon, Jun 14, 2004 at 10:55:29AM +0100, Christoph Hellwig wrote:
>> On Fri, Jun 11, 2004 at 10:11:29PM -0300, Cesar Eduardo Barros wrote:
>>> (not subscribed to lkml, please CC: me on replies)
>>>
>>> This patch adds support for the O_NOATIME open flag (GNU extension):
>>>
>>> int O_NOATIME Macro
>>> If this bit is set, read will not update the access time of the file.
>>> See File Times. This is used by programs that do backups, so that
>>> backing a file up does not count as reading it. Only the owner of the
>>> file or the superuser may use this bit.
>>>
>>> It is useful if you want to do something with the file atime (for
>>> instance, moving files that have not been accessed in a while to
>>> somewhere else, or something like Debian's popularity-contest) but you
>>> also want to read all files periodically (for instance, tripwire or
>>> debsums).
>>>
>>> Currently, the program that reads all files periodically has to use
>>> utimes, which can race with the atime update:
>>
>> Any chance we could change the flag to also not update mtime and ctime
>> for updates on a fd opened with it (and renaming it to O_INVISIBLE for
>> example). That's needed for your above moving infrequently used files
>> away scenario (aka a HSM)
>
> I don't see why preserving the mtime and ctime would be necessary, since
> to move a file away you either don't touch it (using rename) or only
> read and unlink it (to write to a tape or other filesystem, and you can
> save the atime and mtime while doing it). So O_NOATIME is enough for
> both behaviours.
>
> Besides, O_NOATIME is most important not for the program that's moving
> the files elsewhere, but for these checksum-the-world utilities that
> read every single file they can see, and in the process manage to
> destroy the usefulness of the atime, or backup programs that also read
> everything they can touch. Both currently have to use utimes after
> reading the whole file to restore the atime it had when they began
> reading, which can take a long time if the file is huge (but note that
> the mtime doesn't change since they are all reading, not writing).
>
> The ctime changing is not a problem, since programs that want to move
> infrequently used files away will use only the atime and mtime to make
> their decisions, not the ctime. Also, wouldn't not changing the atime
> make the ctime not change too, since nothing in the inode has changed?
>
> O_NOATIME would also be useful for things like tar --atime-preserve,
> cpio --reset-access-time, star -atime, pax -t, and others.

This sounds like the same category of use that does a single pass through
the data and is destroying our memory usage. Should this flag also imply
that the data gets thrown away immediately after being freed by the
program?

That way you don't have to worry whether the software reads the data once or
ten times, as long as it doesn't go back to it after it has freed it.

David Lang

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan

2004-06-14 19:26:32

by Cesar Eduardo Barros

Subject: Re: [PATCH] O_NOATIME support

On Mon, Jun 14, 2004 at 03:03:56PM +0100, Christoph Hellwig wrote:
> On Mon, Jun 14, 2004 at 10:46:52AM -0300, Cesar Eduardo Barros wrote:
> > I don't see why preserving the mtime and ctime would be necessary, since
> > to move a file away you either don't touch it (using rename) or only
> > read and unlink it (to write to a tape or other filesystem, and you can
> > save the atime and mtime while doing it). So O_NOATIME is enough for
> > both behaviours.
>
> Maybe some day the file needs to come back from the tape ;-) Or rather
> in the HSM scenario a part of the file.

When it comes back, it can be written to a temporary file, have its
atime/mtime set with utimes, and atomically renamed to the right place.

If you want to play with parts of files, you would need an atime for
each block of the file ;-)

--
Cesar Eduardo Barros
[email protected]
[email protected]

2004-06-14 19:36:17

by Cesar Eduardo Barros

Subject: Re: [PATCH] O_NOATIME support

On Mon, Jun 14, 2004 at 09:57:11AM -0700, David Lang wrote:
> On Mon, 14 Jun 2004, Cesar Eduardo Barros wrote:
> >On Mon, Jun 14, 2004 at 10:55:29AM +0100, Christoph Hellwig wrote:
> >>On Fri, Jun 11, 2004 at 10:11:29PM -0300, Cesar Eduardo Barros wrote:
> >>>(not subscribed to lkml, please CC: me on replies)
> >>>
> >>>This patch adds support for the O_NOATIME open flag (GNU extension):
> >>>
> >>>int O_NOATIME Macro
> >>> If this bit is set, read will not update the access time of the file.
> >>> See File Times. This is used by programs that do backups, so that
> >>> backing a file up does not count as reading it. Only the owner of the
> >>> file or the superuser may use this bit.
> >>>
> >>>It is useful if you want to do something with the file atime (for
> >>>instance, moving files that have not been accessed in a while to
> >>>somewhere else, or something like Debian's popularity-contest) but you
> >>>also want to read all files periodically (for instance, tripwire or
> >>>debsums).
> >>>
> >
> >Besides, O_NOATIME is most important not for the program that's moving
> >the files elsewhere, but for these checksum-the-world utilities that
> >read every single file they can see, and in the process manage to
> >destroy the usefulness of the atime, or backup programs that also read
> >everything they can touch. Both currently have to use utimes after
> >reading the whole file to restore the atime it had when they began
> >reading, which can take a long time if the file is huge (but note that
> >the mtime doesn't change since they are all reading, not writing).
> >
> >O_NOATIME would also be useful for things like tar --atime-preserve,
> >cpio --reset-access-time, star -atime, pax -t, and others.
>
> This sounds like the same category of use that does a single pass through
> the data and is destroying our memory usage. Should this flag also imply
> that the data gets thrown away immediately after being freed by the
> program?
>
> That way you don't have to worry whether the software reads the data once or
> ten times, as long as it doesn't go back to it after it has freed it.

No, that would be surprising behaviour. O_NOATIME means the atime
shouldn't be changed -- no more, no less. Nothing prevents me from doing
complex read patterns on the file while using O_NOATIME (for instance,
if I know the internal format of the file, I might use a random access
pattern, read parts of the file more than once, or something like that).

If you want drop-behind, you should be able to say it explicitly (and in
fact most people would probably want it but not O_NOATIME -- for
instance, a media player, after reading the headers, reads the file
mostly in sequence). I believe that would work better as a fcntl (since
you would want to read the headers before setting it to sequential).

--
Cesar Eduardo Barros
[email protected]
[email protected]

2004-06-14 21:13:24

by Alexandre Oliva

Subject: Re: [PATCH] O_NOATIME support

On Jun 11, 2004, Cesar Eduardo Barros <[email protected]> wrote:

> int O_NOATIME Macro
> If this bit is set, read will not update the access time of the file.
> See File Times. This is used by programs that do backups, so that
> backing a file up does not count as reading it. Only the owner of the
> file or the superuser may use this bit.

IMHO it's a bad idea to enable the owner of the file to avoid changing
the atime of their files. I've heard more than once about the atime
bit being used as proof that a user had actually seen the contents
of a file although s/he claimed s/he hadn't. If it was root-only,
atime could still be used for the same purpose, and would enable
backups with tools that accessed the filesystem through the FS layer,
as opposed to through the block layer, to keep such proof unchanged.

--
Alexandre Oliva http://www.ic.unicamp.br/~oliva/
Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org}

2004-06-14 21:59:21

by Valdis Klētnieks

Subject: Re: [PATCH] O_NOATIME support

On Mon, 14 Jun 2004 18:12:59 -0300, Alexandre Oliva said:

> IMHO it's a bad idea to enable the owner of the file to avoid changing
> the atime of their files. I've heard more than once about the atime
> bit being used as proof that a user had actually seen the contents
> of a file although s/he claimed s/he hadn't. If it was root-only,
> atime could still be used for the same purpose, and would enable
> backups with tools that accessed the filesystem through the FS layer,
> as opposed to through the block layer, to keep such proof unchanged.

Of course, such "proof" is broken. Consider that something so simple as a
'find . | xargs wc -l' will break that "proof" - as will any file manager that
looks at magic (anything from 'nautilus' to 'file' - if it uses /etc/magic or
/usr/share/file/magic or wherever your distro keeps it, you have a problem).

If you don't have O_NOATIME, it doesn't strengthen the "proof" any, because any
tool can look at the file and then call utime() to clean up behind itself. Of
course, at that point the kernel still has to write that dirty inode back.....

If you want *proof* a given userid did/didn't open a file, do up a proper
set of audit trail hooks (keep in mind it will likely be even more intrusive
than the LSM hooks).

And trying to prove a connection from "file opened" to "contents displayed to
user" is challenging enough without a *proper* audit trail (one that can cross-correlate
open/read/write on the input and output file descriptors). Figuring out
how to get from there to "user saw it" will likely require major work
(and, in fact, absent an auditable event generated by the user that proves
they read the information, almost impossible).

cd /usr/src/linux-2.6.6; find . -name '*.[ch]' | xargs cat

Let me know if you actually *see* anything. My laptop makes it through the first 200
*files* (comprising some 3168K) in 3.45 seconds or so.



2004-06-14 22:09:50

by Matthias Schniedermeyer

Subject: Re: [PATCH] O_NOATIME support

On Mon, Jun 14, 2004 at 06:12:59PM -0300, Alexandre Oliva wrote:
> On Jun 11, 2004, Cesar Eduardo Barros <[email protected]> wrote:
>
> > int O_NOATIME Macro
> > If this bit is set, read will not update the access time of the file.
> > See File Times. This is used by programs that do backups, so that
> > backing a file up does not count as reading it. Only the owner of the
> > file or the superuser may use this bit.
>
> IMHO it's a bad idea to enable the owner of the file to avoid changing
> the atime of their files. I've heard more than once about the atime
> bit being used as proof that a user had actually seen the contents
> of a file although s/he claimed s/he hadn't. If it was root-only,
> atime could still be used for the same purpose, and would enable
> backups with tools that accessed the filesystem through the FS layer,
> as opposed to through the block layer, to keep such proof unchanged.

man mount
/noatime
-> You can disable updating the atime for the whole filesystem.

man utimes/touch -a
-> You can modify "at will" the atime & mtime of a file.


Or in other words, nothing you can't already manipulate at will today.



See you

--
Real Programmers consider "what you see is what you get" to be just as
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated,
cryptic, powerful, unforgiving, dangerous.

2004-06-14 22:41:16

by Cesar Eduardo Barros

Subject: Re: [PATCH] O_NOATIME support

On Mon, Jun 14, 2004 at 06:12:59PM -0300, Alexandre Oliva wrote:
> On Jun 11, 2004, Cesar Eduardo Barros <[email protected]> wrote:
>
> > int O_NOATIME Macro
> > If this bit is set, read will not update the access time of the file.
> > See File Times. This is used by programs that do backups, so that
> > backing a file up does not count as reading it. Only the owner of the
> > file or the superuser may use this bit.
>
> IMHO it's a bad idea to enable the owner of the file to avoid changing
> the atime of their files. I've heard more than once about the atime
> bit being used as proof that a user had actually seen the contents
> of a file although s/he claimed s/he hadn't. If it was root-only,
> atime could still be used for the same purpose, and would enable
> backups with tools that accessed the filesystem through the FS layer,
> as opposed to through the block layer, to keep such proof unchanged.

I'm not the one who invented O_NOATIME; it's the GNU people, and so I
wanted to avoid diverging from their description (the text you quoted
above is from the glibc manual).

The semantics of O_NOATIME are the same as using utimes or variants, and
utimes has the same security restriction (only the file owner or the
superuser). The only thing O_NOATIME gains is the absence of a race
condition where another program can read the file without it being noted
in the atime.
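
For illustration, a small userland sketch of reading a file with O_NOATIME, assuming Linux with the flag available and a file owned by the caller. The function name read_without_touching_atime is hypothetical; os.O_NOATIME is the constant Python exposes on platforms that define the flag.

```python
import os


def read_without_touching_atime(path):
    """Read a whole file through a descriptor opened with O_NOATIME."""
    # Fails with EPERM if the caller neither owns the file nor has privilege.
    fd = os.open(path, os.O_RDONLY | os.O_NOATIME)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

Because the atime is never touched, there is nothing to restore afterwards, and so no window in which another reader's atime update could be overwritten.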

I believe the Unix philosophy is that a user can do anything with the
files he owns, with the exception that root can do anything with any
file. This is why various functions (chown, chmod, etc) check if
current->fsuid equals inode->i_uid or CAP_FOWNER is set.

If you want to use the atime as proof of wrongdoing, you probably want a
root-only O_NOATIME to avoid checksummers and backup daemons creating a
race condition due to their restoring of the old atime; however, I fail
to see how reading a file you own would be wrongdoing. If the file
isn't yours, you can't use O_NOATIME (you get -EPERM). So, a user
ignoring atime updates on his own files is no big deal.

And finally, nothing prevents a user from running his own backup
programs/checksummers/storage managers, which would benefit from
O_NOATIME.

The atime was never intended as an auditing feature (if it were, utimes
and related functions would be root only).

--
Cesar Eduardo Barros

2004-06-14 23:14:05

by Bernd Eckenfels

Subject: Re: [PATCH] O_NOATIME support

In article <[email protected]> you wrote:
> superuser). The only thing O_NOATIME gains is the absence of a race
> condition where another program can read the file without it being noted
> in the atime.

And it will not dirty the inode, which is a fairly big saving for filesystem
scanning tools.

Greetings
Bernd
--
eckes privat - http://www.eckes.org/
Project Freefire - http://www.freefire.org/

2004-06-15 19:03:05

by Alexandre Oliva

Subject: Re: [PATCH] O_NOATIME support

On Jun 14, 2004, Matthias Schniedermeyer <[email protected]> wrote:

> On Mon, Jun 14, 2004 at 06:12:59PM -0300, Alexandre Oliva wrote:

>> I've heard more than once about the atime bit being used as
>> proof that a user had actually seen the contents of a file although
>> s/he claimed s/he hadn't. If it was root-only,

> man mount
> /noatime
> -> You can disable updating the atime for the whole filesystem.

As a sysadmin that intends to use atime as proof, you don't do that.

--
Alexandre Oliva http://www.ic.unicamp.br/~oliva/
Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org}

2004-06-15 19:02:30

by Alexandre Oliva

Subject: Re: [PATCH] O_NOATIME support

On Jun 14, 2004, Cesar Eduardo Barros <[email protected]> wrote:

> The atime was never intended as an auditing feature (if it were, utimes
> and related functions would be root only).

But utimes updates the inode modification time, so you can still tell
something happened to the file.


2004-06-15 19:32:42

by Matthias Schniedermeyer

Subject: Re: [PATCH] O_NOATIME support

On Tue, Jun 15, 2004 at 04:01:23PM -0300, Alexandre Oliva wrote:
> On Jun 14, 2004, Cesar Eduardo Barros <[email protected]> wrote:
>
> > The atime was never intended as an auditing feature (if it were, utimes
> > and related functions would be root only).
>
> But utimes updates the inode modification time, so you can still tell
> something happened to the file.

No.




See you


2004-06-15 21:54:10

by Paul Jackson

Subject: Re: [PATCH] O_NOATIME support

Matthias, replying to Alexandre:
> > But utimes updates the inode modification time, so you can still tell
> > something happened to the file.
>
> No.

A less terse answer:

Utimes modifies the inode ctime - time of last inode change.

So, yes, you can still tell something happened to the file.
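
A rough way to check this from userland, sketched in Python under the assumption of a POSIX filesystem with sub-second timestamps. The helper name restore_times_and_compare and its pause parameter are made up for illustration.

```python
import os
import time


def restore_times_and_compare(path, pause=0.05):
    """Re-set a file's atime/mtime to their current values; return stat before and after."""
    before = os.stat(path)
    # Wait so the clock moves past the filesystem's timestamp granularity.
    time.sleep(pause)
    # Writes back the very same atime and mtime; the kernel still stamps ctime with "now".
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
    after = os.stat(path)
    return before, after
```

Comparing st_ctime_ns in the two results should show the later value moving forward while st_atime_ns and st_mtime_ns stay put. On a filesystem with one-second timestamps, a pause above one second is needed to see the difference.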

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-16 06:23:30

by Matthias Schniedermeyer

Subject: Re: [PATCH] O_NOATIME support

On Tue, Jun 15, 2004 at 03:03:49PM -0700, Paul Jackson wrote:
> Matthias, replying to Alexandre:
> > > But utimes updates the inode modification time, so you can still tell
> > > something happened to the file.
> >
> > No.
>
> A less terse answer:
>
> Utimes modifies the inode ctime - time of last inode change.
>
> So, yes, you can still tell something happened to the file.

Hmm. The man page doesn't mention this, but I tried it:

stat <file>
touch <file>
stat <file>

and all three timestamps (atime, mtime and ctime) were the same after touching it.

man touch
- snip -
Update the access and modification times of each FILE to the current time.
- snip -

I would have guessed that changing atime/mtime doesn't change ctime.





See you


2004-06-16 14:28:00

by Horst H. von Brand

Subject: Re: [PATCH] O_NOATIME support

Alexandre Oliva <[email protected]> said:
> On Jun 14, 2004, Matthias Schniedermeyer <[email protected]> wrote:

[...]

> -> You can disable updating the atime for the whole filesystem.
>
> As a sysadmin that intends to use atime as proof, you don't do that.

And you disable touch(1) too?
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513