2009-06-30 09:56:36

by Miklos Szeredi

[permalink] [raw]
Subject: [RFC PATCH] vfs: new open(2) flag to open filesystem node

This patch adds a new open flag, O_NODE. This flag means: open just
the filesystem node instead of the object referenced by the node.

Opening a file will not call the driver's ->open() method and will not
have any side effect other than referencing the dentry/vfsmount from
the struct file pointer.

Currently O_NODE can only be used together with O_NOACCESS (value 3),
meaning that read(2) and write(2) will return EBADF and no permission
is necessary to open the file. This is logical for device nodes and
fifos. For directories we could allow O_RDONLY, and for regular files
any access mode, but this is not done yet.

O_NODE used together with O_NOFOLLOW will open a symlink node itself
instead of returning ELOOP. O_NOFOLLOW will not prevent following
mountpoints if used together with O_NODE. In theory we could allow
O_RDONLY for symlinks too and return the contents of the link on
read(2).

Filesystems which want to install special file operations for O_NODE
opens (e.g. ioctl) may set S_OPEN_NODE flag in the inode. This will
cause dentry_open() to call inode->i_fop->open, and the filesystem is
responsible for handling the O_NODE flag and installing the right
filesystem operations in file->f_op.

Signed-off-by: Miklos Szeredi <[email protected]>
---
fs/file_table.c | 6 +++-
fs/locks.c | 3 ++
fs/namei.c | 54 ++++++++++++++++++++++++++------------------
fs/open.c | 12 ++++++++-
include/asm-generic/fcntl.h | 4 +++
include/linux/fs.h | 1
6 files changed, 56 insertions(+), 24 deletions(-)

Index: linux-2.6/fs/locks.c
===================================================================
--- linux-2.6.orig/fs/locks.c 2009-06-29 13:47:41.000000000 +0200
+++ linux-2.6/fs/locks.c 2009-06-29 13:48:10.000000000 +0200
@@ -1183,6 +1183,9 @@ int __break_lease(struct inode *inode, u
unsigned long break_time;
int i_have_this_lease = 0;

+ if (!(mode & (FMODE_READ | FMODE_WRITE)))
+ return 0;
+
new_fl = lease_alloc(NULL, mode & FMODE_WRITE ? F_WRLCK : F_RDLCK);

lock_kernel();
Index: linux-2.6/fs/namei.c
===================================================================
--- linux-2.6.orig/fs/namei.c 2009-06-29 13:48:09.000000000 +0200
+++ linux-2.6/fs/namei.c 2009-06-29 13:48:10.000000000 +0200
@@ -1160,7 +1160,9 @@ static int path_lookup_open(int dfd, con
nd->intent.open.file = filp;
nd->intent.open.flags = open_flags;
nd->intent.open.create_mode = 0;
- err = do_path_lookup(dfd, name, lookup_flags|LOOKUP_OPEN, nd);
+ if (!(open_flags & O_NODE))
+ lookup_flags |= LOOKUP_OPEN;
+ err = do_path_lookup(dfd, name, lookup_flags, nd);
if (IS_ERR(nd->intent.open.file)) {
if (err == 0) {
err = PTR_ERR(nd->intent.open.file);
@@ -1511,22 +1513,27 @@ int may_open(struct path *path, int acc_
if (!inode)
return -ENOENT;

- switch (inode->i_mode & S_IFMT) {
- case S_IFLNK:
- return -ELOOP;
- case S_IFDIR:
- if (acc_mode & MAY_WRITE)
- return -EISDIR;
- break;
- case S_IFBLK:
- case S_IFCHR:
- if (path->mnt->mnt_flags & MNT_NODEV)
+ if ((flag & O_NODE)) {
+ if (acc_mode & (MAY_READ | MAY_WRITE))
return -EACCES;
- /*FALLTHRU*/
- case S_IFIFO:
- case S_IFSOCK:
- flag &= ~O_TRUNC;
- break;
+ } else {
+ switch (inode->i_mode & S_IFMT) {
+ case S_IFLNK:
+ return -ELOOP;
+ case S_IFDIR:
+ if (acc_mode & MAY_WRITE)
+ return -EISDIR;
+ break;
+ case S_IFBLK:
+ case S_IFCHR:
+ if (path->mnt->mnt_flags & MNT_NODEV)
+ return -EACCES;
+ /*FALLTHRU*/
+ case S_IFIFO:
+ case S_IFSOCK:
+ flag &= ~O_TRUNC;
+ break;
+ }
}

error = inode_permission(inode, acc_mode);
@@ -1629,14 +1636,16 @@ out_unlock:
* 11 - read-write
* for the internal routines (ie open_namei()/follow_link() etc)
* This is more logical, and also allows the 00 "no perm needed"
- * to be used for symlinks (where the permissions are checked
- * later).
+ * to be used for opening a symlink, pipe, socket or device with
+ * O_NODE.
*
*/
static inline int open_to_namei_flags(int flag)
{
if ((flag+1) & O_ACCMODE)
flag++;
+ else if (flag & O_NODE)
+ flag &= ~O_ACCMODE;
return flag;
}

@@ -1724,7 +1733,9 @@ struct file *do_filp_open(int dfd, const
nd.intent.open.create_mode = mode;
dir = nd.path.dentry;
nd.flags &= ~LOOKUP_PARENT;
- nd.flags |= LOOKUP_CREATE | LOOKUP_OPEN;
+ nd.flags |= LOOKUP_CREATE;
+ if (!(open_flag & O_NODE))
+ nd.flags |= LOOKUP_OPEN;
if (flag & O_EXCL)
nd.flags |= LOOKUP_EXCL;
mutex_lock(&dir->d_inode->i_mutex);
@@ -1779,7 +1790,7 @@ do_last:

if (__follow_mount(&path)) {
error = -ELOOP;
- if (flag & O_NOFOLLOW)
+ if ((flag & O_NOFOLLOW) & !(flag & O_NODE))
goto exit_dput;
}

@@ -1790,7 +1801,8 @@ do_last:
if (!(flag & O_NOFOLLOW))
goto do_link;
error = -ELOOP;
- goto exit_dput;
+ if (!(flag & O_NODE))
+ goto exit_dput;
}

path_to_nameidata(&path, &nd);
Index: linux-2.6/include/asm-generic/fcntl.h
===================================================================
--- linux-2.6.orig/include/asm-generic/fcntl.h 2009-06-29 13:47:41.000000000 +0200
+++ linux-2.6/include/asm-generic/fcntl.h 2009-06-29 13:48:10.000000000 +0200
@@ -9,6 +9,7 @@
#define O_RDONLY 00000000
#define O_WRONLY 00000001
#define O_RDWR 00000002
+#define O_NOACCESS 00000003
#ifndef O_CREAT
#define O_CREAT 00000100 /* not fcntl */
#endif
@@ -51,6 +52,9 @@
#ifndef O_CLOEXEC
#define O_CLOEXEC 02000000 /* set close_on_exec */
#endif
+#ifndef O_NODE
+#define O_NODE 04000000 /* open filesystem node, not device */
+#endif
#ifndef O_NDELAY
#define O_NDELAY O_NONBLOCK
#endif
Index: linux-2.6/include/linux/fs.h
===================================================================
--- linux-2.6.orig/include/linux/fs.h 2009-06-29 13:47:41.000000000 +0200
+++ linux-2.6/include/linux/fs.h 2009-06-29 13:48:10.000000000 +0200
@@ -231,6 +231,7 @@ struct inodes_stat_t {
#define S_NOCMTIME 128 /* Do not update file c/mtime */
#define S_SWAPFILE 256 /* Do not truncate: swapon got its bmaps */
#define S_PRIVATE 512 /* Inode is fs-internal */
+#define S_OPEN_NODE 1024 /* Fs is prepared for O_NODE opens */

/*
* Note that nosuid etc flags are inode-specific: setting some file-system
Index: linux-2.6/fs/open.c
===================================================================
--- linux-2.6.orig/fs/open.c 2009-06-29 13:47:41.000000000 +0200
+++ linux-2.6/fs/open.c 2009-06-29 13:48:10.000000000 +0200
@@ -803,12 +803,17 @@ static inline int __get_file_write_acces
return error;
}

+static const struct file_operations default_node_fops = {
+ .llseek = no_llseek,
+};
+
static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
int flags, struct file *f,
int (*open)(struct inode *, struct file *),
const struct cred *cred)
{
struct inode *inode;
+ const struct file_operations *fops;
int error;

f->f_flags = flags;
@@ -827,7 +832,12 @@ static struct file *__dentry_open(struct
f->f_path.dentry = dentry;
f->f_path.mnt = mnt;
f->f_pos = 0;
- f->f_op = fops_get(inode->i_fop);
+
+ fops = inode->i_fop;
+ if ((flags & O_NODE) && !(inode->i_flags & S_OPEN_NODE))
+ fops = &default_node_fops;
+ f->f_op = fops_get(fops);
+
file_move(f, &inode->i_sb->s_files);

error = security_dentry_open(f, cred);
Index: linux-2.6/fs/file_table.c
===================================================================
--- linux-2.6.orig/fs/file_table.c 2009-06-29 13:47:51.000000000 +0200
+++ linux-2.6/fs/file_table.c 2009-06-29 13:48:15.000000000 +0200
@@ -281,8 +281,10 @@ void __fput(struct file *file)
file->f_op->release(inode, file);
security_file_free(file);
ima_file_free(file);
- if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL))
- cdev_put(inode->i_cdev);
+ if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL)) {
+ if (!(file->f_flags & O_NODE))
+ cdev_put(inode->i_cdev);
+ }
fops_put(file->f_op);
put_pid(file->f_owner.pid);
file_kill(file);


2009-06-30 20:26:30

by David Howells

[permalink] [raw]
Subject: Re: [RFC PATCH] vfs: new open(2) flag to open filesystem node

Miklos Szeredi <[email protected]> wrote:

> @@ -1790,7 +1801,8 @@ do_last:
> if (!(flag & O_NOFOLLOW))
> goto do_link;
> error = -ELOOP;
> - goto exit_dput;
> + if (!(flag & O_NODE))
> + goto exit_dput;
> }
>
> path_to_nameidata(&path, &nd);

This hunk doesn't apply.

David

2009-07-01 04:59:32

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [RFC PATCH] vfs: new open(2) flag to open filesystem node

On Tue, 30 Jun 2009, David Howells wrote:
> This hunk doesn't apply.

Sorry... here's the correct patch.

Thanks,
Miklos

----
This patch adds a new open flag, O_NODE. This flag means: open just
the filesystem node instead of the object referenced by the node.

Opening a file will not call the driver's ->open() method and will not
have any side effect other than referencing the dentry/vfsmount from
the struct file pointer.

Currently O_NODE can only be used together with O_NOACCESS (value 3),
meaning that read(2) and write(2) will return EBADF and no permission
is necessary to open the file. This is logical for device nodes and
fifos. For directories we could allow O_RDONLY, and for regular files
any access mode, but this is not done yet.

O_NODE used together with O_NOFOLLOW will open a symlink node itself
instead of returning ELOOP. O_NOFOLLOW will not prevent following
mountpoints if used together with O_NODE. In theory we could allow
O_RDONLY for symlinks too and return the contents of the link on
read(2).

Filesystems which want to install special file operations for O_NODE
opens (e.g. ioctl) may set S_OPEN_NODE flag in the inode. This will
cause dentry_open() to call inode->i_fop->open, and the filesystem is
responsible for handling the O_NODE flag and installing the right
filesystem operations in file->f_op.

Signed-off-by: Miklos Szeredi <[email protected]>
---
fs/file_table.c | 6 ++--
fs/locks.c | 3 ++
fs/namei.c | 65 ++++++++++++++++++++++++++------------------
fs/open.c | 12 +++++++-
include/asm-generic/fcntl.h | 4 ++
include/linux/fs.h | 1
6 files changed, 62 insertions(+), 29 deletions(-)

Index: linux-2.6/fs/file_table.c
===================================================================
--- linux-2.6.orig/fs/file_table.c 2009-07-01 06:46:51.000000000 +0200
+++ linux-2.6/fs/file_table.c 2009-07-01 06:47:18.000000000 +0200
@@ -281,8 +281,10 @@ void __fput(struct file *file)
file->f_op->release(inode, file);
security_file_free(file);
ima_file_free(file);
- if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL))
- cdev_put(inode->i_cdev);
+ if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL)) {
+ if (!(file->f_flags & O_NODE))
+ cdev_put(inode->i_cdev);
+ }
fops_put(file->f_op);
put_pid(file->f_owner.pid);
file_kill(file);
Index: linux-2.6/fs/locks.c
===================================================================
--- linux-2.6.orig/fs/locks.c 2009-07-01 06:46:51.000000000 +0200
+++ linux-2.6/fs/locks.c 2009-07-01 06:47:18.000000000 +0200
@@ -1183,6 +1183,9 @@ int __break_lease(struct inode *inode, u
unsigned long break_time;
int i_have_this_lease = 0;

+ if (!(mode & (FMODE_READ | FMODE_WRITE)))
+ return 0;
+
new_fl = lease_alloc(NULL, mode & FMODE_WRITE ? F_WRLCK : F_RDLCK);

lock_kernel();
Index: linux-2.6/fs/namei.c
===================================================================
--- linux-2.6.orig/fs/namei.c 2009-07-01 06:46:52.000000000 +0200
+++ linux-2.6/fs/namei.c 2009-07-01 06:47:18.000000000 +0200
@@ -1160,7 +1160,9 @@ static int path_lookup_open(int dfd, con
nd->intent.open.file = filp;
nd->intent.open.flags = open_flags;
nd->intent.open.create_mode = 0;
- err = do_path_lookup(dfd, name, lookup_flags|LOOKUP_OPEN, nd);
+ if (!(open_flags & O_NODE))
+ lookup_flags |= LOOKUP_OPEN;
+ err = do_path_lookup(dfd, name, lookup_flags, nd);
if (IS_ERR(nd->intent.open.file)) {
if (err == 0) {
err = PTR_ERR(nd->intent.open.file);
@@ -1511,22 +1513,27 @@ int may_open(struct path *path, int acc_
if (!inode)
return -ENOENT;

- switch (inode->i_mode & S_IFMT) {
- case S_IFLNK:
- return -ELOOP;
- case S_IFDIR:
- if (acc_mode & MAY_WRITE)
- return -EISDIR;
- break;
- case S_IFBLK:
- case S_IFCHR:
- if (path->mnt->mnt_flags & MNT_NODEV)
+ if ((flag & O_NODE)) {
+ if (acc_mode & (MAY_READ | MAY_WRITE))
return -EACCES;
- /*FALLTHRU*/
- case S_IFIFO:
- case S_IFSOCK:
- flag &= ~O_TRUNC;
- break;
+ } else {
+ switch (inode->i_mode & S_IFMT) {
+ case S_IFLNK:
+ return -ELOOP;
+ case S_IFDIR:
+ if (acc_mode & MAY_WRITE)
+ return -EISDIR;
+ break;
+ case S_IFBLK:
+ case S_IFCHR:
+ if (path->mnt->mnt_flags & MNT_NODEV)
+ return -EACCES;
+ /*FALLTHRU*/
+ case S_IFIFO:
+ case S_IFSOCK:
+ flag &= ~O_TRUNC;
+ break;
+ }
}

error = inode_permission(inode, acc_mode);
@@ -1629,14 +1636,16 @@ out_unlock:
* 11 - read-write
* for the internal routines (ie open_namei()/follow_link() etc)
* This is more logical, and also allows the 00 "no perm needed"
- * to be used for symlinks (where the permissions are checked
- * later).
+ * to be used for opening a symlink, pipe, socket or device with
+ * O_NODE.
*
*/
static inline int open_to_namei_flags(int flag)
{
if ((flag+1) & O_ACCMODE)
flag++;
+ else if (flag & O_NODE)
+ flag &= ~O_ACCMODE;
return flag;
}

@@ -1724,7 +1733,9 @@ struct file *do_filp_open(int dfd, const
nd.intent.open.create_mode = mode;
dir = nd.path.dentry;
nd.flags &= ~LOOKUP_PARENT;
- nd.flags |= LOOKUP_CREATE | LOOKUP_OPEN;
+ nd.flags |= LOOKUP_CREATE;
+ if (!(open_flag & O_NODE))
+ nd.flags |= LOOKUP_OPEN;
if (flag & O_EXCL)
nd.flags |= LOOKUP_EXCL;
mutex_lock(&dir->d_inode->i_mutex);
@@ -1783,19 +1794,24 @@ do_last:

if (__follow_mount(&path)) {
error = -ELOOP;
- if (flag & O_NOFOLLOW)
+ if ((flag & O_NOFOLLOW) & !(flag & O_NODE))
goto exit_dput;
}

error = -ENOENT;
if (!path.dentry->d_inode)
goto exit_dput;
- if (path.dentry->d_inode->i_op->follow_link)
- goto do_link;
+ if (path.dentry->d_inode->i_op->follow_link) {
+ if (!(flag & O_NOFOLLOW))
+ goto do_link;
+ error = -ELOOP;
+ if (!(flag & O_NODE))
+ goto exit_dput;
+ }

path_to_nameidata(&path, &nd);
error = -EISDIR;
- if (path.dentry->d_inode && S_ISDIR(path.dentry->d_inode->i_mode))
+ if (S_ISDIR(path.dentry->d_inode->i_mode))
goto exit;
ok:
/*
@@ -1849,9 +1865,6 @@ exit_parent:
return ERR_PTR(error);

do_link:
- error = -ELOOP;
- if (flag & O_NOFOLLOW)
- goto exit_dput;
/*
* This is subtle. Instead of calling do_follow_link() we do the
* thing by hands. The reason is that this way we have zero link_count
Index: linux-2.6/fs/open.c
===================================================================
--- linux-2.6.orig/fs/open.c 2009-07-01 06:46:51.000000000 +0200
+++ linux-2.6/fs/open.c 2009-07-01 06:47:18.000000000 +0200
@@ -803,12 +803,17 @@ static inline int __get_file_write_acces
return error;
}

+static const struct file_operations default_node_fops = {
+ .llseek = no_llseek,
+};
+
static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
int flags, struct file *f,
int (*open)(struct inode *, struct file *),
const struct cred *cred)
{
struct inode *inode;
+ const struct file_operations *fops;
int error;

f->f_flags = flags;
@@ -827,7 +832,12 @@ static struct file *__dentry_open(struct
f->f_path.dentry = dentry;
f->f_path.mnt = mnt;
f->f_pos = 0;
- f->f_op = fops_get(inode->i_fop);
+
+ fops = inode->i_fop;
+ if ((flags & O_NODE) && !(inode->i_flags & S_OPEN_NODE))
+ fops = &default_node_fops;
+ f->f_op = fops_get(fops);
+
file_move(f, &inode->i_sb->s_files);

error = security_dentry_open(f, cred);
Index: linux-2.6/include/asm-generic/fcntl.h
===================================================================
--- linux-2.6.orig/include/asm-generic/fcntl.h 2009-07-01 06:46:51.000000000 +0200
+++ linux-2.6/include/asm-generic/fcntl.h 2009-07-01 06:47:18.000000000 +0200
@@ -9,6 +9,7 @@
#define O_RDONLY 00000000
#define O_WRONLY 00000001
#define O_RDWR 00000002
+#define O_NOACCESS 00000003
#ifndef O_CREAT
#define O_CREAT 00000100 /* not fcntl */
#endif
@@ -51,6 +52,9 @@
#ifndef O_CLOEXEC
#define O_CLOEXEC 02000000 /* set close_on_exec */
#endif
+#ifndef O_NODE
+#define O_NODE 04000000 /* open filesystem node, not device */
+#endif
#ifndef O_NDELAY
#define O_NDELAY O_NONBLOCK
#endif
Index: linux-2.6/include/linux/fs.h
===================================================================
--- linux-2.6.orig/include/linux/fs.h 2009-07-01 06:46:51.000000000 +0200
+++ linux-2.6/include/linux/fs.h 2009-07-01 06:47:18.000000000 +0200
@@ -231,6 +231,7 @@ struct inodes_stat_t {
#define S_NOCMTIME 128 /* Do not update file c/mtime */
#define S_SWAPFILE 256 /* Do not truncate: swapon got its bmaps */
#define S_PRIVATE 512 /* Inode is fs-internal */
+#define S_OPEN_NODE 1024 /* Fs is prepared for O_NODE opens */

/*
* Note that nosuid etc flags are inode-specific: setting some file-system

2009-07-05 19:36:25

by Ulrich Drepper

[permalink] [raw]
Subject: Re: [RFC PATCH] vfs: new open(2) flag to open filesystem node

On Tue, Jun 30, 2009 at 21:59, Miklos Szeredi<[email protected]> wrote:
> This patch adds a new open flag, O_NODE.  This flag means: open just
> the filesystem node instead of the object referenced by the node.

I am still not sure whether this shouldn't rather be an implementation
of POSIX's O_SEARCH or whether it is already an O_SEARCH
implementation. If it's close enough it seems wasteful to add yet
another mode. I probably cannot research it this coming week but can
try to look at it the week after that.

2009-07-06 00:41:54

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC PATCH] vfs: new open(2) flag to open filesystem node



On Sun, 5 Jul 2009, Ulrich Drepper wrote:
>
> I am still not sure whether this shouldn't rather be an implementation
> of POSIX's O_SEARCH or whether it is already an O_SEARCH
> implementation.

Umm. That makes no sense at all.

O_SEARCH is only meaningful for directories. For anything else, it's not
at all POSIX - it's expressly defined to be "undefined".

So clearly, O_SEARCH is absolutely the _wrong_ thing to do. Claiming that
"it is POSIX" is pure and utter garbage, because it is _not_ POSIX in any
relevant way. Sure, POSIX allows us to have flying monkeys out of our
butts when you specify O_SEARCH, but where's the advantage? Undefined
behavior is undefined behavior.

It would be clearly and unambiguously _better_ to have another O_xyz flag,
that people can then do

#ifdef O_xyz
.. get some well-defined Linux extension behavior ..
#endif

in their source code, rather than overlead an undefined case with random
crap.

Linus

2009-07-06 05:50:34

by Ulrich Drepper

[permalink] [raw]
Subject: Re: [RFC PATCH] vfs: new open(2) flag to open filesystem node

On Sun, Jul 5, 2009 at 17:40, Linus
Torvalds<[email protected]> wrote:
> O_SEARCH is only meaningful for directories. For anything else, it's not
> at all POSIX - it's expressly defined to be "undefined".

And this is why there is the differentiation with O_EXEC. Yes, i
didn't mention it in the last email. But I mentioned it when it came
up the first time.

I don't say this is indeed what is wanted/needed here. But there are
IMO some similarities and I think implementing O_SEARCH and O_EXEC is
desirable. If it means completely different implementations from te
proposed O_NODE, so be it. But my gut tells me there is some overlay.

2009-07-06 12:30:16

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [RFC PATCH] vfs: new open(2) flag to open filesystem node

On Sun, 5 Jul 2009, Ulrich Drepper wrote:
> On Sun, Jul 5, 2009 at 17:40, Linus
> Torvalds<[email protected]> wrote:
> > O_SEARCH is only meaningful for directories. For anything else, it's not
> > at all POSIX - it's expressly defined to be "undefined".
>
> And this is why there is the differentiation with O_EXEC. Yes, i
> didn't mention it in the last email. But I mentioned it when it came
> up the first time.
>
> I don't say this is indeed what is wanted/needed here. But there are
> IMO some similarities and I think implementing O_SEARCH and O_EXEC is
> desirable.

O_SEARCH loosens the security model somewhat: a process could keep
search access to a directory even after the permissions have been
changed.

O_EXEC is similar, but "execute" is really not an access, just a flag,
so...

Thanks,
Miklos

2009-07-06 16:00:22

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC PATCH] vfs: new open(2) flag to open filesystem node



On Sun, 5 Jul 2009, Ulrich Drepper wrote:

> On Sun, Jul 5, 2009 at 17:40, Linus
> Torvalds<[email protected]> wrote:
> > O_SEARCH is only meaningful for directories. For anything else, it's not
> > at all POSIX - it's expressly defined to be "undefined".
>
> And this is why there is the differentiation with O_EXEC. Yes, i
> didn't mention it in the last email. But I mentioned it when it came
> up the first time.
>
> I don't say this is indeed what is wanted/needed here. But there are
> IMO some similarities and I think implementing O_SEARCH and O_EXEC is
> desirable. If it means completely different implementations from te
> proposed O_NODE, so be it. But my gut tells me there is some overlay.

I suspect that what we _could_ possibly do is to have something like
O_NODE, and after that - if the semantics (for directories) match what
O_SEARCH/O_EXEC wants, we could just do

#define O_SEARCH O_NODE

but my point is that we should _not_ start from O_SEARCH and make that the
"core" part, since its semantics are badly defined (undefined) to begin
with.

Put another way: it's better to start with something that is well-defined
(for us), and then say "in the special case of directories, this becomes
the same thing as O_SEARCH", than start with something that is defined
only for directories, and then say "ok, the behavior for other things is
undefined, so we could hijack it for our own uses".

Linus