2010-07-13 21:55:56

by David Howells

[permalink] [raw]
Subject: [RFC][PATCH] Add a dentry op to handle automounting rather than abusing follow_link


Add a dentry op (d_automount) to handle automounting directories rather than
abusing the follow_link() inode operation.

I've only changed __follow_mount() to handle automount points, but it might be
necessary to change follow_mount() too. The latter is only used from
follow_dotdot(), but any automounts on ".." should be pinned whilst we're using
a child of it.

AFS is made to use this facility so that it can be tested. Other filesystems
abusing the follow_mount() inode operation will also need to be modified.

Not-yet-signed-off-by: David Howells <[email protected]>
---

Documentation/filesystems/Locking | 2 +
Documentation/filesystems/vfs.txt | 13 +++++
fs/afs/dir.c | 12 ++++-
fs/afs/internal.h | 1
fs/afs/mntpt.c | 45 +++++------------
fs/namei.c | 96 +++++++++++++++++++++++++------------
include/linux/dcache.h | 7 +++
7 files changed, 112 insertions(+), 64 deletions(-)


diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index 96d4293..ccbfa98 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -16,6 +16,7 @@ prototypes:
void (*d_release)(struct dentry *);
void (*d_iput)(struct dentry *, struct inode *);
char *(*d_dname)((struct dentry *dentry, char *buffer, int buflen);
+ struct vfsmount *(*d_automount)(struct path *path);

locking rules:
none have BKL
@@ -27,6 +28,7 @@ d_delete: yes no yes no
d_release: no no no yes
d_iput: no no no yes
d_dname: no no no no
+d_automount: no no no yes

--------------------------- inode_operations ---------------------------
prototypes:
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 94677e7..040a85f 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -851,6 +851,7 @@ struct dentry_operations {
void (*d_release)(struct dentry *);
void (*d_iput)(struct dentry *, struct inode *);
char *(*d_dname)(struct dentry *, char *, int);
+ struct vfsmount *(*d_automount)(struct path *);
};

d_revalidate: called when the VFS needs to revalidate a dentry. This
@@ -885,6 +886,18 @@ struct dentry_operations {
at the end of the buffer, and returns a pointer to the first char.
dynamic_dname() helper function is provided to take care of this.

+ d_automount: called when an automount dentry is to be traversed (optional).
+ This should create a new VFS mount record, mount it on the directory
+ and return the record to the caller. The caller is supplied with a
+ path parameter giving the automount directory to describe the automount
+ target and the parent VFS mount record to provide inheritable mount
+ parameters. NULL should be returned if someone else managed to make
+ the automount first. If the automount failed, then an error code
+ should be returned.
+
+ The presence of a non-NULL d_automount for a dentry is taken to
+ indicate that the directory is an automount point.
+
Example :

static char *pipefs_dname(struct dentry *dent, char *buffer, int buflen)
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index afb9ff8..cc2eb94 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -67,6 +67,13 @@ static const struct dentry_operations afs_fs_dentry_operations = {
.d_release = afs_d_release,
};

+static const struct dentry_operations afs_mntpt_dentry_operations = {
+ .d_revalidate = afs_d_revalidate,
+ .d_delete = afs_d_delete,
+ .d_release = afs_d_release,
+ .d_automount = afs_mntpt_d_automount,
+};
+
#define AFS_DIR_HASHTBL_SIZE 128
#define AFS_DIR_DIRENT_SIZE 32
#define AFS_DIRENT_PER_BLOCK 64
@@ -539,7 +546,10 @@ static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry,
return ERR_CAST(inode);
}

- dentry->d_op = &afs_fs_dentry_operations;
+ if (test_bit(AFS_VNODE_MOUNTPOINT, &AFS_FS_I(inode)->flags))
+ dentry->d_op = &afs_mntpt_dentry_operations;
+ else
+ dentry->d_op = &afs_fs_dentry_operations;

d_add(dentry, inode);
_leave(" = 0 { vn=%u u=%u } -> { ino=%lu v=%u }",
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 5f679b7..47b99b5 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -583,6 +583,7 @@ extern int afs_abort_to_error(u32);
extern const struct inode_operations afs_mntpt_inode_operations;
extern const struct file_operations afs_mntpt_file_operations;

+extern struct vfsmount *afs_mntpt_d_automount(struct path *);
extern int afs_mntpt_check_symlink(struct afs_vnode *, struct key *);
extern void afs_mntpt_kill_timer(void);

diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c
index a9e2303..5b95370 100644
--- a/fs/afs/mntpt.c
+++ b/fs/afs/mntpt.c
@@ -24,7 +24,6 @@ static struct dentry *afs_mntpt_lookup(struct inode *dir,
struct dentry *dentry,
struct nameidata *nd);
static int afs_mntpt_open(struct inode *inode, struct file *file);
-static void *afs_mntpt_follow_link(struct dentry *dentry, struct nameidata *nd);
static void afs_mntpt_expiry_timed_out(struct work_struct *work);

const struct file_operations afs_mntpt_file_operations = {
@@ -33,7 +32,6 @@ const struct file_operations afs_mntpt_file_operations = {

const struct inode_operations afs_mntpt_inode_operations = {
.lookup = afs_mntpt_lookup,
- .follow_link = afs_mntpt_follow_link,
.readlink = page_readlink,
.getattr = afs_getattr,
};
@@ -205,52 +203,37 @@ error_no_devname:
}

/*
- * follow a link from a mountpoint directory, thus causing it to be mounted
+ * handle an automount point
*/
-static void *afs_mntpt_follow_link(struct dentry *dentry, struct nameidata *nd)
+struct vfsmount *afs_mntpt_d_automount(struct path *path)
{
struct vfsmount *newmnt;
int err;

- _enter("%p{%s},{%s:%p{%s},}",
- dentry,
- dentry->d_name.name,
- nd->path.mnt->mnt_devname,
- dentry,
- nd->path.dentry->d_name.name);
-
- dput(nd->path.dentry);
- nd->path.dentry = dget(dentry);
+ _enter("{%s,%s}", path->mnt->mnt_devname, path->dentry->d_name.name);

- newmnt = afs_mntpt_do_automount(nd->path.dentry);
- if (IS_ERR(newmnt)) {
- path_put(&nd->path);
- return (void *)newmnt;
- }
+ newmnt = afs_mntpt_do_automount(path->dentry);
+ if (IS_ERR(newmnt))
+ return newmnt;

mntget(newmnt);
- err = do_add_mount(newmnt, &nd->path, MNT_SHRINKABLE, &afs_vfsmounts);
+ err = do_add_mount(newmnt, path, MNT_SHRINKABLE, &afs_vfsmounts);
switch (err) {
case 0:
- path_put(&nd->path);
- nd->path.mnt = newmnt;
- nd->path.dentry = dget(newmnt->mnt_root);
schedule_delayed_work(&afs_mntpt_expiry_timer,
afs_mntpt_expiry_timeout * HZ);
- break;
+ _leave(" = %p {%s}", newmnt, newmnt->mnt_devname);
+ return newmnt;
case -EBUSY:
/* someone else made a mount here whilst we were busy */
- while (d_mountpoint(nd->path.dentry) &&
- follow_down(&nd->path))
- ;
- err = 0;
+ mntput(newmnt);
+ _leave(" = NULL [EBUSY]");
+ return NULL;
default:
mntput(newmnt);
- break;
+ _leave(" = %d", err);
+ return ERR_PTR(err);
}
-
- _leave(" = %d", err);
- return ERR_PTR(err);
}

/*
diff --git a/fs/namei.c b/fs/namei.c
index 868d0cb..19b99f9 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -617,24 +617,65 @@ int follow_up(struct path *path)
return 1;
}

+/*
+ * Perform an automount
+ */
+static int follow_automount(struct path *path, int res)
+{
+ struct vfsmount *mnt;
+
+ current->total_link_count++;
+ if (current->total_link_count >= 40)
+ return -ELOOP;
+
+ mnt = path->dentry->d_op->d_automount(path);
+ if (IS_ERR(mnt))
+ return PTR_ERR(mnt);
+ if (!mnt) /* mount collision */
+ return 0;
+
+ if (mnt->mnt_sb == path->mnt->mnt_sb && mnt->mnt_root == path->dentry)
+ return -ELOOP;
+
+ dput(path->dentry);
+ if (res)
+ mntput(path->mnt);
+ path->mnt = mnt;
+ path->dentry = dget(mnt->mnt_root);
+ return 0;
+}
+
/* no need for dcache_lock, as serialization is taken care in
* namespace.c
*/
-static int __follow_mount(struct path *path)
+static int __follow_mount(struct path *path, unsigned nofollow)
{
- int res = 0;
- while (d_mountpoint(path->dentry)) {
- struct vfsmount *mounted = lookup_mnt(path);
- if (!mounted)
+ struct vfsmount *mounted;
+ int ret, res = 0;
+ for (;;) {
+ while (d_mountpoint(path->dentry)) {
+ if (nofollow)
+ return -ELOOP;
+ mounted = lookup_mnt(path);
+ if (!mounted)
+ break;
+ dput(path->dentry);
+ if (res)
+ mntput(path->mnt);
+ path->mnt = mounted;
+ path->dentry = dget(mounted->mnt_root);
+ res = 1;
+ }
+ if (!d_automount_point(path->dentry))
break;
- dput(path->dentry);
- if (res)
- mntput(path->mnt);
- path->mnt = mounted;
- path->dentry = dget(mounted->mnt_root);
+ if (nofollow)
+ return -ELOOP;
+ ret = follow_automount(path, res);
+ if (ret < 0)
+ return ret;
res = 1;
}
- return res;
+ return 0;
}

static void follow_mount(struct path *path)
@@ -702,6 +743,8 @@ static int do_lookup(struct nameidata *nd, struct qstr *name,
struct vfsmount *mnt = nd->path.mnt;
struct dentry *dentry, *parent;
struct inode *dir;
+ int ret;
+
/*
* See if the low-level filesystem might want
* to use its own hash..
@@ -720,8 +763,10 @@ static int do_lookup(struct nameidata *nd, struct qstr *name,
done:
path->mnt = mnt;
path->dentry = dentry;
- __follow_mount(path);
- return 0;
+ ret = __follow_mount(path, 0);
+ if (unlikely(ret < 0))
+ path_put(path);
+ return ret;

need_lookup:
parent = nd->path.dentry;
@@ -794,17 +839,6 @@ fail:
}

/*
- * This is a temporary kludge to deal with "automount" symlinks; proper
- * solution is to trigger them on follow_mount(), so that do_lookup()
- * would DTRT. To be killed before 2.6.34-final.
- */
-static inline int follow_on_final(struct inode *inode, unsigned lookup_flags)
-{
- return inode && unlikely(inode->i_op->follow_link) &&
- ((lookup_flags & LOOKUP_FOLLOW) || S_ISDIR(inode->i_mode));
-}
-
-/*
* Name resolution.
* This is the basic name resolution function, turning a pathname into
* the final dentry. We expect 'base' to be positive and a directory.
@@ -924,7 +958,8 @@ last_component:
if (err)
break;
inode = next.dentry->d_inode;
- if (follow_on_final(inode, lookup_flags)) {
+ if (inode && unlikely(inode->i_op->follow_link) &&
+ (lookup_flags & LOOKUP_FOLLOW)) {
err = do_follow_link(&next, nd);
if (err)
goto return_err;
@@ -1721,11 +1756,9 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
if (open_flag & O_EXCL)
goto exit_dput;

- if (__follow_mount(path)) {
- error = -ELOOP;
- if (open_flag & O_NOFOLLOW)
- goto exit_dput;
- }
+ error = __follow_mount(path, open_flag & O_NOFOLLOW);
+ if (error < 0)
+ goto exit_dput;

error = -ENOENT;
if (!path->dentry->d_inode)
@@ -1839,8 +1872,7 @@ reval:
struct inode *inode = path.dentry->d_inode;
void *cookie;
error = -ELOOP;
- /* S_ISDIR part is a temporary automount kludge */
- if (!(nd.flags & LOOKUP_FOLLOW) && !S_ISDIR(inode->i_mode))
+ if (!(nd.flags & LOOKUP_FOLLOW))
goto exit_dput;
if (count++ == 32)
goto exit_dput;
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index eebb617..0cfc4ac 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -139,6 +139,7 @@ struct dentry_operations {
void (*d_release)(struct dentry *);
void (*d_iput)(struct dentry *, struct inode *);
char *(*d_dname)(struct dentry *, char *, int);
+ struct vfsmount *(*d_automount)(struct path *);
};

/* the dentry parameter passed to d_hash and d_compare is the parent
@@ -157,6 +158,7 @@ d_compare: no yes yes no
d_delete: no yes no no
d_release: no no no yes
d_iput: no no no yes
+d_automount: no no no yes
*/

/* d_flags entries */
@@ -389,6 +391,11 @@ static inline int d_mountpoint(struct dentry *dentry)
return dentry->d_mounted;
}

+static inline bool d_automount_point(struct dentry *dentry)
+{
+ return !!dentry->d_op && dentry->d_op->d_automount;
+}
+
extern struct vfsmount *lookup_mnt(struct path *);
extern struct dentry *lookup_create(struct nameidata *nd, int is_dir);


2010-07-22 04:16:06

by Nick Piggin

[permalink] [raw]
Subject: Re: [RFC][PATCH] Add a dentry op to handle automounting rather than abusing follow_link

This hasn't had much comments yet.

On Tue, Jul 13, 2010 at 10:55:50PM +0100, David Howells wrote:
>
> Add a dentry op (d_automount) to handle automounting directories rather than
> abusing the follow_link() inode operation.
>
> I've only changed __follow_mount() to handle automount points, but it might be
> necessary to change follow_mount() too. The latter is only used from
> follow_dotdot(), but any automounts on ".." should be pinned whilst we're using
> a child of it.
>
> AFS is made to use this facility so that it can be tested. Other filesystems
> abusing the follow_mount() inode operation will also need to be modified.

How about having a .follow_mount op, and using that instead of
default follow_mount in case mounted is incremented?

Also I would prefer the patch to add this call keep basically the
same API as follow_mount, so if you are going to change that to
return an error and do the NOFOLLOW handling in there, then
could you do that first, as a more trivial patch?

Then your addition of the d_op should not touch outside
*follow_mount.

2010-07-22 12:36:35

by David Howells

[permalink] [raw]
Subject: Re: [RFC][PATCH] Add a dentry op to handle automounting rather than abusing follow_link

Nick Piggin <[email protected]> wrote:

> > AFS is made to use this facility so that it can be tested. Other
> > filesystems abusing the follow_mount() inode operation will also need to
> > be modified.

I meant follow_link() here of course... Too many followy things:-)

> How about having a .follow_mount op, and using that instead of
> default follow_mount in case mounted is incremented?

But what if d_mounted is not incremented, though? That's usually the point
you'd want to call the automount code. Why would you want to call into the
filesystem just to skip over possibly mounted dentries? A dentry may have an
elevated d_mount on it, but nothing mounted at that {vfsmount,dentry} point I
suppose, but still jumping into the filesystem just so it can skip an already
mounted point would seem a waste of time.

> Also I would prefer the patch to add this call

Meaning i_op->follow_mount()?

> keep basically the same API as follow_mount, so if you are going to change
> that to return an error and do the NOFOLLOW handling in there, then could
> you do that first, as a more trivial patch?

Ummm... I'm not sure I follow you. I changed __follow_mount() not
follow_mount(). I don't think changing the latter is necessary.

> Then your addition of the d_op should not touch outside *follow_mount.

But calling i_op->follow_mount() would, so what does this gain you? And why
not touch the inside of __follow_mount()?

Are you suggesting doing i_op->follow_mount() instead of or as well as
d_op->d_automount()? I'm not entirely sure.

David

2010-07-22 14:57:47

by Nick Piggin

[permalink] [raw]
Subject: Re: [RFC][PATCH] Add a dentry op to handle automounting rather than abusing follow_link

On Thu, Jul 22, 2010 at 01:36:27PM +0100, David Howells wrote:
> Nick Piggin <[email protected]> wrote:
>
> > > AFS is made to use this facility so that it can be tested. Other
> > > filesystems abusing the follow_mount() inode operation will also need to
> > > be modified.
>
> I meant follow_link() here of course... Too many followy things:-)
>
> > How about having a .follow_mount op, and using that instead of
> > default follow_mount in case mounted is incremented?
>
> But what if d_mounted is not incremented, though?

Nothing?


> That's usually the point
> you'd want to call the automount code.

I think you have it the wrong way around. If you wanted to call
the automount code, you would have incremented d_mounted.


> Why would you want to call into the
> filesystem just to skip over possibly mounted dentries? A dentry may have an
> elevated d_mount on it, but nothing mounted at that {vfsmount,dentry} point I
> suppose, but still jumping into the filesystem just so it can skip an already
> mounted point would seem a waste of time.

Those that don't care wouldn't set ->follow_mount though.
Following a mount is a fairly heavy operation already, it
does take a global lock (before vfs scalability patches,
anyway).

I like the flexibility of doing one's own ->follow_mount,
although Al might object to allowing filesystems to follow
mounts in ways that are not published to the core
namespace structures.


> > Also I would prefer the patch to add this call
>
> Meaning i_op->follow_mount()?

Either one, just make the follow_mount/__follow_mount API
changes in one patch, and add the callback in another.


> > keep basically the same API as follow_mount, so if you are going to change
> > that to return an error and do the NOFOLLOW handling in there, then could
> > you do that first, as a more trivial patch?
>
> Ummm... I'm not sure I follow you. I changed __follow_mount() not
> follow_mount(). I don't think changing the latter is necessary.

I meant __follow_mount.


> > Then your addition of the d_op should not touch outside *follow_mount.
>
> But calling i_op->follow_mount() would, so what does this gain you? And why
> not touch the inside of __follow_mount()?
>
> Are you suggesting doing i_op->follow_mount() instead of or as well as
> d_op->d_automount()? I'm not entirely sure.

Two suggestions. Firstly a d_op->d_follow_mount() (does following
a mount even make sense at the inode level?)

Secondly, just simply to split the patch so you change the
__follow_mount API in namespace first.

2010-07-22 15:33:23

by David Howells

[permalink] [raw]
Subject: Re: [RFC][PATCH] Add a dentry op to handle automounting rather than abusing follow_link

Nick Piggin <[email protected]> wrote:

> I think you have it the wrong way around. If you wanted to call the
> automount code, you would have incremented d_mounted.

Why? d_mounted indicates how many things are mounted on a dentry, doesn't it?
Before the automounter is invoked there isn't anything mounted there.

> Those that don't care wouldn't set ->follow_mount though. Following a mount
> is a fairly heavy operation already, it does take a global lock (before vfs
> scalability patches, anyway).

I wonder if we could do it with a lock on vfsmount instead and use mnt_mounts
to find it.

> I like the flexibility of doing one's own ->follow_mount, although Al might
> object to allowing filesystems to follow mounts in ways that are not
> published to the core namespace structures.

But why would you want to delegate mountpoint traversal to the filesystem?

David

2010-07-22 16:04:12

by David Howells

[permalink] [raw]
Subject: Re: [RFC][PATCH] Add a dentry op to handle automounting rather than abusing follow_link

David Howells <[email protected]> wrote:

> But why would you want to delegate mountpoint traversal to the filesystem?

s/mountpoint traversal/mounted filesystem iteration/

David