autofs4 currently doesn't support RCU-walk - it immediately
aborts any attempt at RCU-walk to force REF-walk for path name
lookup.
This can cause a significant performance impact on multi-core
systems.
I have a client with a test case which spends >80% of its time
waiting for spinlocks with a "make -j 40" on a 40 core system.
This patchset aims to remove most of these spinlocks. To be fully
effective in the particular case it needs a second patch set which
makes NFS RCU-walk friendly, but one thing at a time.
This has only been lightly tested so far so I'm really after feed-back
rather than to have the patch set accepted, though the first two
patches are trivial and could be taken immediately.
The last two patches are the most interesting so review comments on
those are particularly welcome.
Thanks,
NeilBrown
---
NeilBrown (6):
autofs4: remove unused autofs4_ispending()
autofs4: remove a redundant assignment
autofs4: allow RCU-walk to walk through autofs4.
autofs4: factor should_expire() out of autofs4_expire_indirect.
autofs4: avoid taking fs_lock during rcu-walk
autofs4: don't take spinlock when not needed in autofs4_lookup_expiring
fs/autofs4/autofs_i.h | 20 +----
fs/autofs4/dev-ioctl.c | 2 -
fs/autofs4/expire.c | 192 +++++++++++++++++++++++++++++-------------------
fs/autofs4/root.c | 46 ++++++++----
4 files changed, 151 insertions(+), 109 deletions(-)
--
Signature
Signed-off-by: NeilBrown <[email protected]>
---
fs/autofs4/autofs_i.h | 14 --------------
1 file changed, 14 deletions(-)
diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
index acf32054edd8..22a280151e45 100644
--- a/fs/autofs4/autofs_i.h
+++ b/fs/autofs4/autofs_i.h
@@ -143,20 +143,6 @@ static inline int autofs4_oz_mode(struct autofs_sb_info *sbi) {
return sbi->catatonic || task_pgrp(current) == sbi->oz_pgrp;
}
-/* Does a dentry have some pending activity? */
-static inline int autofs4_ispending(struct dentry *dentry)
-{
- struct autofs_info *inf = autofs4_dentry_ino(dentry);
-
- if (inf->flags & AUTOFS_INF_PENDING)
- return 1;
-
- if (inf->flags & AUTOFS_INF_EXPIRING)
- return 1;
-
- return 0;
-}
-
struct inode *autofs4_get_inode(struct super_block *, umode_t);
void autofs4_free_ino(struct autofs_info *);
The variable 'ino' already exists and already
has the correct value. The d_fsdata of a dentry
is never changed after the d_fsdata is instantiated,
so this new assignment cannot be necessary.
It was introduced in
commit b5b801779d59165c4ecf1009009109545bd1f642
autofs4: Add d_manage() dentry operation
Signed-off-by: NeilBrown <[email protected]>
---
fs/autofs4/expire.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
index 394e90b02c5e..a7be57e39be7 100644
--- a/fs/autofs4/expire.c
+++ b/fs/autofs4/expire.c
@@ -333,7 +333,6 @@ struct dentry *autofs4_expire_direct(struct super_block *sb,
if (ino->flags & AUTOFS_INF_PENDING)
goto out;
if (!autofs4_direct_busy(mnt, root, timeout, do_now)) {
- struct autofs_info *ino = autofs4_dentry_ino(root);
ino->flags |= AUTOFS_INF_EXPIRING;
init_completion(&ino->expire_complete);
spin_unlock(&sbi->fs_lock);
Any attempt to look up a pathname that passes though an
autofs4 mount is currently forced out of RCU-walk into
REF-walk.
This can significantly hurt performance of many-thread work
loads on many-core systems, especially if the automounted
filesystem supports RCU-walk but doesn't get to benefit from
it.
So if autofs4_d_manage is called with rcu_walk set, only
fail with -ECHILD if it is necessary to wait longer than
a spinlock, and avoid even the spinlock in one trivial case.
Signed-off-by: NeilBrown <[email protected]>
---
fs/autofs4/autofs_i.h | 2 +-
fs/autofs4/dev-ioctl.c | 2 +-
fs/autofs4/expire.c | 4 +++-
fs/autofs4/root.c | 44 +++++++++++++++++++++++++++++---------------
4 files changed, 34 insertions(+), 18 deletions(-)
diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
index 22a280151e45..99dbb05d6148 100644
--- a/fs/autofs4/autofs_i.h
+++ b/fs/autofs4/autofs_i.h
@@ -148,7 +148,7 @@ void autofs4_free_ino(struct autofs_info *);
/* Expiration */
int is_autofs4_dentry(struct dentry *);
-int autofs4_expire_wait(struct dentry *dentry);
+int autofs4_expire_wait(struct dentry *dentry, int rcu_walk);
int autofs4_expire_run(struct super_block *, struct vfsmount *,
struct autofs_sb_info *,
struct autofs_packet_expire __user *);
diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
index 5b570b6efa28..aaf96cb25452 100644
--- a/fs/autofs4/dev-ioctl.c
+++ b/fs/autofs4/dev-ioctl.c
@@ -450,7 +450,7 @@ static int autofs_dev_ioctl_requester(struct file *fp,
ino = autofs4_dentry_ino(path.dentry);
if (ino) {
err = 0;
- autofs4_expire_wait(path.dentry);
+ autofs4_expire_wait(path.dentry, 0);
spin_lock(&sbi->fs_lock);
param->requester.uid = from_kuid_munged(current_user_ns(), ino->uid);
param->requester.gid = from_kgid_munged(current_user_ns(), ino->gid);
diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
index a7be57e39be7..7e2f22ce6954 100644
--- a/fs/autofs4/expire.c
+++ b/fs/autofs4/expire.c
@@ -467,7 +467,7 @@ found:
return expired;
}
-int autofs4_expire_wait(struct dentry *dentry)
+int autofs4_expire_wait(struct dentry *dentry, int rcu_walk)
{
struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
struct autofs_info *ino = autofs4_dentry_ino(dentry);
@@ -477,6 +477,8 @@ int autofs4_expire_wait(struct dentry *dentry)
spin_lock(&sbi->fs_lock);
if (ino->flags & AUTOFS_INF_EXPIRING) {
spin_unlock(&sbi->fs_lock);
+ if (rcu_walk)
+ return -ECHILD;
DPRINTK("waiting for expire %p name=%.*s",
dentry, dentry->d_name.len, dentry->d_name.name);
diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
index cc87c1abac97..1ad119407e2f 100644
--- a/fs/autofs4/root.c
+++ b/fs/autofs4/root.c
@@ -208,7 +208,8 @@ next:
return NULL;
}
-static struct dentry *autofs4_lookup_expiring(struct dentry *dentry)
+static struct dentry *autofs4_lookup_expiring(struct dentry *dentry,
+ bool rcu_walk)
{
struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
struct dentry *parent = dentry->d_parent;
@@ -225,6 +226,11 @@ static struct dentry *autofs4_lookup_expiring(struct dentry *dentry)
struct dentry *expiring;
struct qstr *qstr;
+ if (rcu_walk) {
+ spin_unlock(&sbi->lookup_lock);
+ return ERR_PTR(-ECHILD);
+ }
+
ino = list_entry(p, struct autofs_info, expiring);
expiring = ino->dentry;
@@ -260,13 +266,15 @@ next:
return NULL;
}
-static int autofs4_mount_wait(struct dentry *dentry)
+static int autofs4_mount_wait(struct dentry *dentry, bool rcu_walk)
{
struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
struct autofs_info *ino = autofs4_dentry_ino(dentry);
int status = 0;
if (ino->flags & AUTOFS_INF_PENDING) {
+ if (rcu_walk)
+ return -ECHILD;
DPRINTK("waiting for mount name=%.*s",
dentry->d_name.len, dentry->d_name.name);
status = autofs4_wait(sbi, dentry, NFY_MOUNT);
@@ -276,20 +284,22 @@ static int autofs4_mount_wait(struct dentry *dentry)
return status;
}
-static int do_expire_wait(struct dentry *dentry)
+static int do_expire_wait(struct dentry *dentry, bool rcu_walk)
{
struct dentry *expiring;
- expiring = autofs4_lookup_expiring(dentry);
+ expiring = autofs4_lookup_expiring(dentry, rcu_walk);
+ if (IS_ERR(expiring))
+ return PTR_ERR(expiring);
if (!expiring)
- return autofs4_expire_wait(dentry);
+ return autofs4_expire_wait(dentry, rcu_walk);
else {
/*
* If we are racing with expire the request might not
* be quite complete, but the directory has been removed
* so it must have been successful, just wait for it.
*/
- autofs4_expire_wait(expiring);
+ autofs4_expire_wait(expiring, 0);
autofs4_del_expiring(expiring);
dput(expiring);
}
@@ -341,7 +351,7 @@ static struct vfsmount *autofs4_d_automount(struct path *path)
* and the directory was removed, so just go ahead and try
* the mount.
*/
- status = do_expire_wait(dentry);
+ status = do_expire_wait(dentry, 0);
if (status && status != -EAGAIN)
return NULL;
@@ -349,7 +359,7 @@ static struct vfsmount *autofs4_d_automount(struct path *path)
spin_lock(&sbi->fs_lock);
if (ino->flags & AUTOFS_INF_PENDING) {
spin_unlock(&sbi->fs_lock);
- status = autofs4_mount_wait(dentry);
+ status = autofs4_mount_wait(dentry, 0);
if (status)
return ERR_PTR(status);
goto done;
@@ -390,7 +400,7 @@ static struct vfsmount *autofs4_d_automount(struct path *path)
}
ino->flags |= AUTOFS_INF_PENDING;
spin_unlock(&sbi->fs_lock);
- status = autofs4_mount_wait(dentry);
+ status = autofs4_mount_wait(dentry, 0);
spin_lock(&sbi->fs_lock);
ino->flags &= ~AUTOFS_INF_PENDING;
if (status) {
@@ -426,21 +436,25 @@ static int autofs4_d_manage(struct dentry *dentry, bool rcu_walk)
return 0;
}
- /* We need to sleep, so we need pathwalk to be in ref-mode */
- if (rcu_walk)
- return -ECHILD;
-
/* Wait for pending expires */
- do_expire_wait(dentry);
+ if (do_expire_wait(dentry, rcu_walk) == -ECHILD)
+ return -ECHILD;
/*
* This dentry may be under construction so wait on mount
* completion.
*/
- status = autofs4_mount_wait(dentry);
+ status = autofs4_mount_wait(dentry, rcu_walk);
if (status)
return status;
+ if (rcu_walk)
+ /* No need to consider returning -EISDIR as
+ * there is no risk that ->d_automount will
+ * be called
+ */
+ return status;
+
spin_lock(&sbi->fs_lock);
/*
* If the dentry has been selected for expire while we slept
Future patch will potentially call this twice, so make it
separate.
Signed-off-by: NeilBrown <[email protected]>
---
fs/autofs4/expire.c | 153 ++++++++++++++++++++++++++-------------------------
1 file changed, 79 insertions(+), 74 deletions(-)
diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
index 7e2f22ce6954..fb0b5003353f 100644
--- a/fs/autofs4/expire.c
+++ b/fs/autofs4/expire.c
@@ -345,6 +345,80 @@ out:
return NULL;
}
+static struct dentry *should_expire(struct dentry *dentry,
+ struct vfsmount *mnt,
+ unsigned long timeout,
+ int how)
+{
+ int do_now = how & AUTOFS_EXP_IMMEDIATE;
+ int exp_leaves = how & AUTOFS_EXP_LEAVES;
+ struct autofs_info *ino = autofs4_dentry_ino(dentry);
+ unsigned int ino_count;
+
+ /* No point expiring a pending mount */
+ if (ino->flags & AUTOFS_INF_PENDING)
+ return NULL;
+
+ /*
+ * Case 1: (i) indirect mount or top level pseudo direct mount
+ * (autofs-4.1).
+ * (ii) indirect mount with offset mount, check the "/"
+ * offset (autofs-5.0+).
+ */
+ if (d_mountpoint(dentry)) {
+ DPRINTK("checking mountpoint %p %.*s",
+ dentry, (int)dentry->d_name.len, dentry->d_name.name);
+
+ /* Can we umount this guy */
+ if (autofs4_mount_busy(mnt, dentry))
+ return NULL;
+
+ /* Can we expire this guy */
+ if (autofs4_can_expire(dentry, timeout, do_now))
+ return dentry;
+ return NULL;
+ }
+
+ if (dentry->d_inode && S_ISLNK(dentry->d_inode->i_mode)) {
+ DPRINTK("checking symlink %p %.*s",
+ dentry, (int)dentry->d_name.len, dentry->d_name.name);
+ /*
+ * A symlink can't be "busy" in the usual sense so
+ * just check last used for expire timeout.
+ */
+ if (autofs4_can_expire(dentry, timeout, do_now))
+ return dentry;
+ return NULL;
+ }
+
+ if (simple_empty(dentry))
+ return NULL;
+
+ /* Case 2: tree mount, expire iff entire tree is not busy */
+ if (!exp_leaves) {
+ /* Path walk currently on this dentry? */
+ ino_count = atomic_read(&ino->count) + 1;
+ if (d_count(dentry) > ino_count)
+ return NULL;
+
+ if (!autofs4_tree_busy(mnt, dentry, timeout, do_now))
+ return dentry;
+ /*
+ * Case 3: pseudo direct mount, expire individual leaves
+ * (autofs-4.1).
+ */
+ } else {
+ /* Path walk currently on this dentry? */
+ ino_count = atomic_read(&ino->count) + 1;
+ if (d_count(dentry) > ino_count)
+ return NULL;
+
+ dentry = autofs4_check_leaves(mnt, dentry, timeout, do_now);
+ if (dentry)
+ return dentry;
+ }
+ return NULL;
+}
/*
* Find an eligible tree to time-out
* A tree is eligible if :-
@@ -359,11 +433,8 @@ struct dentry *autofs4_expire_indirect(struct super_block *sb,
unsigned long timeout;
struct dentry *root = sb->s_root;
struct dentry *dentry;
- struct dentry *expired = NULL;
- int do_now = how & AUTOFS_EXP_IMMEDIATE;
- int exp_leaves = how & AUTOFS_EXP_LEAVES;
+ struct dentry *expired;
struct autofs_info *ino;
- unsigned int ino_count;
if (!root)
return NULL;
@@ -374,78 +445,12 @@ struct dentry *autofs4_expire_indirect(struct super_block *sb,
dentry = NULL;
while ((dentry = get_next_positive_subdir(dentry, root))) {
spin_lock(&sbi->fs_lock);
- ino = autofs4_dentry_ino(dentry);
- /* No point expiring a pending mount */
- if (ino->flags & AUTOFS_INF_PENDING)
- goto next;
-
- /*
- * Case 1: (i) indirect mount or top level pseudo direct mount
- * (autofs-4.1).
- * (ii) indirect mount with offset mount, check the "/"
- * offset (autofs-5.0+).
- */
- if (d_mountpoint(dentry)) {
- DPRINTK("checking mountpoint %p %.*s",
- dentry, (int)dentry->d_name.len, dentry->d_name.name);
-
- /* Can we umount this guy */
- if (autofs4_mount_busy(mnt, dentry))
- goto next;
-
- /* Can we expire this guy */
- if (autofs4_can_expire(dentry, timeout, do_now)) {
- expired = dentry;
- goto found;
- }
- goto next;
- }
-
- if (dentry->d_inode && S_ISLNK(dentry->d_inode->i_mode)) {
- DPRINTK("checking symlink %p %.*s",
- dentry, (int)dentry->d_name.len, dentry->d_name.name);
- /*
- * A symlink can't be "busy" in the usual sense so
- * just check last used for expire timeout.
- */
- if (autofs4_can_expire(dentry, timeout, do_now)) {
- expired = dentry;
- goto found;
- }
- goto next;
- }
-
- if (simple_empty(dentry))
- goto next;
-
- /* Case 2: tree mount, expire iff entire tree is not busy */
- if (!exp_leaves) {
- /* Path walk currently on this dentry? */
- ino_count = atomic_read(&ino->count) + 1;
- if (d_count(dentry) > ino_count)
- goto next;
-
- if (!autofs4_tree_busy(mnt, dentry, timeout, do_now)) {
- expired = dentry;
- goto found;
- }
- /*
- * Case 3: pseudo direct mount, expire individual leaves
- * (autofs-4.1).
- */
- } else {
- /* Path walk currently on this dentry? */
- ino_count = atomic_read(&ino->count) + 1;
- if (d_count(dentry) > ino_count)
- goto next;
-
- expired = autofs4_check_leaves(mnt, dentry, timeout, do_now);
- if (expired) {
+ expired = should_expire(dentry, mnt, timeout, how);
+ if (expired) {
+ if (expired != dentry)
dput(dentry);
- goto found;
- }
+ goto found;
}
-next:
spin_unlock(&sbi->fs_lock);
}
return NULL;
->fs_lock protects AUTOFS_INF_EXPIRING. We need to be sure
that once the flag is set, no new references beneath the dentry
are taken. So rcu-walk currently needs to take fs_lock before
checking the flag. This hurts performance.
Change the expiry to a two-stage process.
First set AUTHFS_INF_NO_RCU which forces any path walk into
ref-walk mode, then drop the lock and call synchronize_rcu().
Once that returns we can be sure no rcu-walk is active beneath
the dentry and we can check reference counts again.
Now during an RCU-walk we can test AUTHFS_INF_EXPIRING without
taking the lock as along as we test AUTHFS_INF_NO_RCU too.
If either are set, we must abort the RCU-walk
If neither are set, we know that refcounts will be tested again
after we finish the RCU-walk so we are safe to continue.
Signed-off-by: NeilBrown <[email protected]>
---
fs/autofs4/autofs_i.h | 4 ++++
fs/autofs4/expire.c | 46 ++++++++++++++++++++++++++++++++++++++--------
2 files changed, 42 insertions(+), 8 deletions(-)
diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
index 99dbb05d6148..469724d7568c 100644
--- a/fs/autofs4/autofs_i.h
+++ b/fs/autofs4/autofs_i.h
@@ -79,6 +79,10 @@ struct autofs_info {
};
#define AUTOFS_INF_EXPIRING (1<<0) /* dentry is in the process of expiring */
+#define AUTOFS_INF_NO_RCU (1<<1) /* the dentry is being considered
+ * for expiry, so RCU_walk is
+ * not permitted
+ */
#define AUTOFS_INF_PENDING (1<<2) /* dentry pending mount */
struct autofs_wait_queue {
diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
index fb0b5003353f..98a6fd4957f8 100644
--- a/fs/autofs4/expire.c
+++ b/fs/autofs4/expire.c
@@ -333,10 +333,19 @@ struct dentry *autofs4_expire_direct(struct super_block *sb,
if (ino->flags & AUTOFS_INF_PENDING)
goto out;
if (!autofs4_direct_busy(mnt, root, timeout, do_now)) {
- ino->flags |= AUTOFS_INF_EXPIRING;
- init_completion(&ino->expire_complete);
+ ino->flags |= AUTOFS_INF_NO_RCU;
spin_unlock(&sbi->fs_lock);
- return root;
+ synchronize_rcu();
+ spin_lock(&sbi->fs_lock);
+ if (!autofs4_direct_busy(mnt, root, timeout, do_now)) {
+ ino->flags |= AUTOFS_INF_EXPIRING;
+ smp_mb()
+ ino->flags &= ~AUTOFS_INF_NO_RCU;
+ init_completion(&ino->expire_complete);
+ spin_unlock(&sbi->fs_lock);
+ return root;
+ }
+ ino->flags &= ~AUTOFS_INF_NO_RCU;
}
out:
spin_unlock(&sbi->fs_lock);
@@ -445,12 +454,29 @@ struct dentry *autofs4_expire_indirect(struct super_block *sb,
dentry = NULL;
while ((dentry = get_next_positive_subdir(dentry, root))) {
spin_lock(&sbi->fs_lock);
- expired = should_expire(dentry, mnt, timeout, how);
- if (expired) {
+ ino = autofs4_dentry_ino(dentry);
+ if (ino->flags & AUTOFS_INF_NO_RCU)
+ expired = NULL;
+ else
+ expired = should_expire(dentry, mnt, timeout, how);
+ if (!expired) {
+ spin_unlock(&sbi->fs_lock);
+ continue;
+ }
+ ino = autofs4_dentry_ino(expired);
+ ino->flags |= AUTOFS_INF_NO_RCU;
+ spin_unlock(&sbi->fs_lock);
+ synchronize_rcu();
+ spin_lock(&sbi->fs_lock);
+ if (should_expire(expired, mnt, timeout, how)) {
if (expired != dentry)
dput(dentry);
goto found;
}
+
+ ino->flags &= ~AUTOFS_INF_NO_RCU;
+ if (expired != dentry)
+ dput(expired);
spin_unlock(&sbi->fs_lock);
}
return NULL;
@@ -458,8 +484,9 @@ struct dentry *autofs4_expire_indirect(struct super_block *sb,
found:
DPRINTK("returning %p %.*s",
expired, (int)expired->d_name.len, expired->d_name.name);
- ino = autofs4_dentry_ino(expired);
ino->flags |= AUTOFS_INF_EXPIRING;
+ smp_mb()
+ ino->flags &= ~AUTOFS_INF_NO_RCU;
init_completion(&ino->expire_complete);
spin_unlock(&sbi->fs_lock);
spin_lock(&sbi->lookup_lock);
@@ -479,11 +506,14 @@ int autofs4_expire_wait(struct dentry *dentry, int rcu_walk)
int status;
/* Block on any pending expire */
+ if (!(ino->flags & (AUTOFS_INF_EXPIRING | AUTOFS_INF_NO_RCU)))
+ return 0;
+
+ if (rcu_walk)
+ return -ECHILD;
spin_lock(&sbi->fs_lock);
if (ino->flags & AUTOFS_INF_EXPIRING) {
spin_unlock(&sbi->fs_lock);
- if (rcu_walk)
- return -ECHILD;
DPRINTK("waiting for expire %p name=%.*s",
dentry, dentry->d_name.len, dentry->d_name.name);
If the expiring_list is empty, we can avoid a costly spinlock
in the rcu-walk path through authfs4_d_manage.
Signed-off-by: NeilBrown <[email protected]>
---
fs/autofs4/root.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
index 1ad119407e2f..774c2dab331b 100644
--- a/fs/autofs4/root.c
+++ b/fs/autofs4/root.c
@@ -219,6 +219,8 @@ static struct dentry *autofs4_lookup_expiring(struct dentry *dentry,
const unsigned char *str = name->name;
struct list_head *p, *head;
+ if (list_empty(&sbi->expiring_list))
+ return NULL;
spin_lock(&sbi->lookup_lock);
head = &sbi->expiring_list;
list_for_each(p, head) {
On Thu, 2014-07-10 at 09:41 +1000, NeilBrown wrote:
> autofs4 currently doesn't support RCU-walk - it immediately
> aborts any attempt at RCU-walk to force REF-walk for path name
> lookup.
>
> This can cause a significant performance impact on multi-core
> systems.
> I have a client with a test case which spends >80% of its time
> waiting for spinlocks with a "make -j 40" on a 40 core system.
Right, sounds worth the effort.
>
> This patchset aims to remove most of these spinlocks. To be fully
> effective in the particular case it needs a second patch set which
> makes NFS RCU-walk friendly, but one thing at a time.
>
> This has only been lightly tested so far so I'm really after feed-back
> rather than to have the patch set accepted, though the first two
> patches are trivial and could be taken immediately.
I've only scanned the patches so far, I'll need to spend a bit more time
on them before I can comment.
I'm going to be pressed for time for at least several days so I won't be
able to get to this right away.
I expect the submount_test I use to stress path walking and expire to
mount transitions will likely be a good test to use. I haven't used it
in my personal environment for quite a while now so I'll need to have a
look around and see if I can still find a suitable set of scripts.
Otherwise I'll need to decouple it from the RedHat automated test
environment.
>
> The last two patches are the most interesting so review comments on
> those are particularly welcome.
Again I haven't looked closely at these but don't you mean the last
three patches or am I just fussing over an obviously straight forward
patch 3?
Thanks for your effort Bruce,
Ian
On Thu, 2014-07-10 at 15:43 +0800, Ian Kent wrote:
>
> Thanks for your effort Bruce,
Oh boy, where did that come from, sorry Neil, ;)
> Ian
>
On Thu, 10 Jul 2014 15:43:40 +0800 Ian Kent <[email protected]> wrote:
> On Thu, 2014-07-10 at 09:41 +1000, NeilBrown wrote:
> > autofs4 currently doesn't support RCU-walk - it immediately
> > aborts any attempt at RCU-walk to force REF-walk for path name
> > lookup.
> >
> > This can cause a significant performance impact on multi-core
> > systems.
> > I have a client with a test case which spends >80% of its time
> > waiting for spinlocks with a "make -j 40" on a 40 core system.
>
> Right, sounds worth the effort.
>
> >
> > This patchset aims to remove most of these spinlocks. To be fully
> > effective in the particular case it needs a second patch set which
> > makes NFS RCU-walk friendly, but one thing at a time.
> >
> > This has only been lightly tested so far so I'm really after feed-back
> > rather than to have the patch set accepted, though the first two
> > patches are trivial and could be taken immediately.
>
> I've only scanned the patches so far, I'll need to spend a bit more time
> on them before I can comment.
>
> I'm going to be pressed for time for at least several days so I won't be
> able to get to this right away.
>
> I expect the submount_test I use to stress path walking and expire to
> mount transitions will likely be a good test to use. I haven't used it
> in my personal environment for quite a while now so I'll need to have a
> look around and see if I can still find a suitable set of scripts.
> Otherwise I'll need to decouple it from the RedHat automated test
> environment.
>
> >
> > The last two patches are the most interesting so review comments on
> > those are particularly welcome.
>
> Again I haven't looked closely at these but don't you mean the last
> three patches or am I just fussing over an obviously straight forward
> patch 3?
Exactly right - that thirds last patch was "obviously straight forward", so
is naturally the one that I have already found a bug in (the patch assumes
that autofs4_check_leaves returns a different dentry, which clearly isn't
true).
I'll repost it, probably on Monday.
>
> Thanks for your effort Bruce,
> Ian
>
http://www.youtube.com/watch?v=_f_p0CgPeyA
(Usually when people get my name wrong they call me "Ian", so you calling me
Bruce is both slightly ironic and quite refreshing!)
Thanks,
NeilBrown
On Thu, 2014-07-10 at 18:25 +1000, NeilBrown wrote:
> On Thu, 10 Jul 2014 15:43:40 +0800 Ian Kent <[email protected]> wrote:
>
> > On Thu, 2014-07-10 at 09:41 +1000, NeilBrown wrote:
> > > autofs4 currently doesn't support RCU-walk - it immediately
> > > aborts any attempt at RCU-walk to force REF-walk for path name
> > > lookup.
> > >
> > > This can cause a significant performance impact on multi-core
> > > systems.
> > > I have a client with a test case which spends >80% of its time
> > > waiting for spinlocks with a "make -j 40" on a 40 core system.
> >
> > Right, sounds worth the effort.
> >
> > >
> > > This patchset aims to remove most of these spinlocks. To be fully
> > > effective in the particular case it needs a second patch set which
> > > makes NFS RCU-walk friendly, but one thing at a time.
> > >
> > > This has only been lightly tested so far so I'm really after feed-back
> > > rather than to have the patch set accepted, though the first two
> > > patches are trivial and could be taken immediately.
> >
> > I've only scanned the patches so far, I'll need to spend a bit more time
> > on them before I can comment.
> >
> > I'm going to be pressed for time for at least several days so I won't be
> > able to get to this right away.
> >
> > I expect the submount_test I use to stress path walking and expire to
> > mount transitions will likely be a good test to use. I haven't used it
> > in my personal environment for quite a while now so I'll need to have a
> > look around and see if I can still find a suitable set of scripts.
> > Otherwise I'll need to decouple it from the RedHat automated test
> > environment.
> >
> > >
> > > The last two patches are the most interesting so review comments on
> > > those are particularly welcome.
> >
> > Again I haven't looked closely at these but don't you mean the last
> > three patches or am I just fussing over an obviously straight forward
> > patch 3?
>
> Exactly right - that thirds last patch was "obviously straight forward", so
> is naturally the one that I have already found a bug in (the patch assumes
> that autofs4_check_leaves returns a different dentry, which clearly isn't
> true).
Ha, IIRC that was to support the old pseudo direct mounts that would
expire the leaves of a tree independently. That was a long time ago now
and probably isn't used but it should remain think.
I'll need to have a look at that too, to refresh my memory.
>
> I'll repost it, probably on Monday.
>
> >
> > Thanks for your effort Bruce,
> > Ian
> >
>
> http://www.youtube.com/watch?v=_f_p0CgPeyA
>
> (Usually when people get my name wrong they call me "Ian", so you calling me
> Bruce is both slightly ironic and quite refreshing!)
LOL, but I have a lame excuse!
I was thinking Bruce Fields when replying since have the impression you
both work in similar areas.
Ian
Here is a revised version of this one patch.
This one fixes a problem with refcounts on dentry and adds a comment to
clarify the behaviour of should_expire().
thanks,
NeilBrown
From: NeilBrown <[email protected]>Date: Tue, 8 Jul 2014 17:14:53 +1000
Subject: [PATCH] autofs4: factor should_expire() out of autofs4_expire_indirect.
Future patch will potentially call this twice, so make it
separate.
Signed-off-by: NeilBrown <[email protected]>
diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
index 7e2f22ce6954..402ee7f1461a 100644
--- a/fs/autofs4/expire.c
+++ b/fs/autofs4/expire.c
@@ -345,6 +345,89 @@ out:
return NULL;
}
+/* Check if 'dentry' should expire, or return a nearby
+ * dentry that is suitable.
+ * If returned dentry is different from arg dentry,
+ * then a dget() reference was taken, else not.
+ */
+static struct dentry *should_expire(struct dentry *dentry,
+ struct vfsmount *mnt,
+ unsigned long timeout,
+ int how)
+{
+ int do_now = how & AUTOFS_EXP_IMMEDIATE;
+ int exp_leaves = how & AUTOFS_EXP_LEAVES;
+ struct autofs_info *ino = autofs4_dentry_ino(dentry);
+ unsigned int ino_count;
+
+ /* No point expiring a pending mount */
+ if (ino->flags & AUTOFS_INF_PENDING)
+ return NULL;
+
+ /*
+ * Case 1: (i) indirect mount or top level pseudo direct mount
+ * (autofs-4.1).
+ * (ii) indirect mount with offset mount, check the "/"
+ * offset (autofs-5.0+).
+ */
+ if (d_mountpoint(dentry)) {
+ DPRINTK("checking mountpoint %p %.*s",
+ dentry, (int)dentry->d_name.len, dentry->d_name.name);
+
+ /* Can we umount this guy */
+ if (autofs4_mount_busy(mnt, dentry))
+ return NULL;
+
+ /* Can we expire this guy */
+ if (autofs4_can_expire(dentry, timeout, do_now))
+ return dentry;
+ return NULL;
+ }
+
+ if (dentry->d_inode && S_ISLNK(dentry->d_inode->i_mode)) {
+ DPRINTK("checking symlink %p %.*s",
+ dentry, (int)dentry->d_name.len, dentry->d_name.name);
+ /*
+ * A symlink can't be "busy" in the usual sense so
+ * just check last used for expire timeout.
+ */
+ if (autofs4_can_expire(dentry, timeout, do_now))
+ return dentry;
+ return NULL;
+ }
+
+ if (simple_empty(dentry))
+ return NULL;
+
+ /* Case 2: tree mount, expire iff entire tree is not busy */
+ if (!exp_leaves) {
+ /* Path walk currently on this dentry? */
+ ino_count = atomic_read(&ino->count) + 1;
+ if (d_count(dentry) > ino_count)
+ return NULL;
+
+ if (!autofs4_tree_busy(mnt, dentry, timeout, do_now))
+ return dentry;
+ /*
+ * Case 3: pseudo direct mount, expire individual leaves
+ * (autofs-4.1).
+ */
+ } else {
+ /* Path walk currently on this dentry? */
+ struct dentry *expired;
+ ino_count = atomic_read(&ino->count) + 1;
+ if (d_count(dentry) > ino_count)
+ return NULL;
+
+ expired = autofs4_check_leaves(mnt, dentry, timeout, do_now);
+ if (expired) {
+ if (expired == dentry)
+ dput(dentry);
+ return dentry;
+ }
+ }
+ return NULL;
+}
/*
* Find an eligible tree to time-out
* A tree is eligible if :-
@@ -359,11 +442,8 @@ struct dentry *autofs4_expire_indirect(struct super_block *sb,
unsigned long timeout;
struct dentry *root = sb->s_root;
struct dentry *dentry;
- struct dentry *expired = NULL;
- int do_now = how & AUTOFS_EXP_IMMEDIATE;
- int exp_leaves = how & AUTOFS_EXP_LEAVES;
+ struct dentry *expired;
struct autofs_info *ino;
- unsigned int ino_count;
if (!root)
return NULL;
@@ -374,78 +454,12 @@ struct dentry *autofs4_expire_indirect(struct super_block *sb,
dentry = NULL;
while ((dentry = get_next_positive_subdir(dentry, root))) {
spin_lock(&sbi->fs_lock);
- ino = autofs4_dentry_ino(dentry);
- /* No point expiring a pending mount */
- if (ino->flags & AUTOFS_INF_PENDING)
- goto next;
-
- /*
- * Case 1: (i) indirect mount or top level pseudo direct mount
- * (autofs-4.1).
- * (ii) indirect mount with offset mount, check the "/"
- * offset (autofs-5.0+).
- */
- if (d_mountpoint(dentry)) {
- DPRINTK("checking mountpoint %p %.*s",
- dentry, (int)dentry->d_name.len, dentry->d_name.name);
-
- /* Can we umount this guy */
- if (autofs4_mount_busy(mnt, dentry))
- goto next;
-
- /* Can we expire this guy */
- if (autofs4_can_expire(dentry, timeout, do_now)) {
- expired = dentry;
- goto found;
- }
- goto next;
- }
-
- if (dentry->d_inode && S_ISLNK(dentry->d_inode->i_mode)) {
- DPRINTK("checking symlink %p %.*s",
- dentry, (int)dentry->d_name.len, dentry->d_name.name);
- /*
- * A symlink can't be "busy" in the usual sense so
- * just check last used for expire timeout.
- */
- if (autofs4_can_expire(dentry, timeout, do_now)) {
- expired = dentry;
- goto found;
- }
- goto next;
- }
-
- if (simple_empty(dentry))
- goto next;
-
- /* Case 2: tree mount, expire iff entire tree is not busy */
- if (!exp_leaves) {
- /* Path walk currently on this dentry? */
- ino_count = atomic_read(&ino->count) + 1;
- if (d_count(dentry) > ino_count)
- goto next;
-
- if (!autofs4_tree_busy(mnt, dentry, timeout, do_now)) {
- expired = dentry;
- goto found;
- }
- /*
- * Case 3: pseudo direct mount, expire individual leaves
- * (autofs-4.1).
- */
- } else {
- /* Path walk currently on this dentry? */
- ino_count = atomic_read(&ino->count) + 1;
- if (d_count(dentry) > ino_count)
- goto next;
-
- expired = autofs4_check_leaves(mnt, dentry, timeout, do_now);
- if (expired) {
+ expired = should_expire(dentry, mnt, timeout, how);
+ if (expired) {
+ if (expired != dentry)
dput(dentry);
- goto found;
- }
+ goto found;
}
-next:
spin_unlock(&sbi->fs_lock);
}
return NULL;
On Mon, 2014-07-14 at 10:53 +1000, NeilBrown wrote:
> Here is a revised version of this one patch.
> This one fixes a problem with refcounts on dentry and adds a comment to
> clarify the behaviour of should_expire().
I have some bad news I'm afraid, not about the patches.
As I mentioned I'm well behind in some time sensitive work I need to do
and I've suddenly run into a completely unexpected and quite serious
problem related to the amd map format parser I've added to autofs.
I did most of the development for that on the 3.10 series kernel and
that worked fine, and since I made hardly any kernel changes, I expected
there wouldn't be further problems.
Now I come to develop RHEL-6 tests for it for QA purposes (which I
started to do on Fedora 20, 3.14 series kernel) and the symlink handling
is suddenly broken.
The symptom is an autofs mount that has contained a symlink (whether
it's still present or not) can't be umounted and I use symlinks a lot
with the amd parser.
So far I've tracked this to something that was introduced between 3.11
and 3.12. One change that went into 3.12 was Jeff Laytons' umount
specific path resolution for umount. I've found this is also broken on
recent RHEL-6 kernels and that change is present in them too so the
evidence is leaning toward that being the problem. Just how could happen
with this change I have no clue so far, it just doesn't make sense.
Anyway that's going to make testing these patches impossible until I can
work out what the problem is and fix it.
Very sorry to hold you up on this but it can't be avoided.
>
> thanks,
> NeilBrown
>
>
> From: NeilBrown <[email protected]>Date: Tue, 8 Jul 2014 17:14:53 +1000
> Subject: [PATCH] autofs4: factor should_expire() out of autofs4_expire_indirect.
>
> Future patch will potentially call this twice, so make it
> separate.
>
> Signed-off-by: NeilBrown <[email protected]>
>
> diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
> index 7e2f22ce6954..402ee7f1461a 100644
> --- a/fs/autofs4/expire.c
> +++ b/fs/autofs4/expire.c
> @@ -345,6 +345,89 @@ out:
> return NULL;
> }
>
> +/* Check if 'dentry' should expire, or return a nearby
> + * dentry that is suitable.
> + * If returned dentry is different from arg dentry,
> + * then a dget() reference was taken, else not.
> + */
> +static struct dentry *should_expire(struct dentry *dentry,
> + struct vfsmount *mnt,
> + unsigned long timeout,
> + int how)
> +{
> + int do_now = how & AUTOFS_EXP_IMMEDIATE;
> + int exp_leaves = how & AUTOFS_EXP_LEAVES;
> + struct autofs_info *ino = autofs4_dentry_ino(dentry);
> + unsigned int ino_count;
> +
> + /* No point expiring a pending mount */
> + if (ino->flags & AUTOFS_INF_PENDING)
> + return NULL;
> +
> + /*
> + * Case 1: (i) indirect mount or top level pseudo direct mount
> + * (autofs-4.1).
> + * (ii) indirect mount with offset mount, check the "/"
> + * offset (autofs-5.0+).
> + */
> + if (d_mountpoint(dentry)) {
> + DPRINTK("checking mountpoint %p %.*s",
> + dentry, (int)dentry->d_name.len, dentry->d_name.name);
> +
> + /* Can we umount this guy */
> + if (autofs4_mount_busy(mnt, dentry))
> + return NULL;
> +
> + /* Can we expire this guy */
> + if (autofs4_can_expire(dentry, timeout, do_now))
> + return dentry;
> + return NULL;
> + }
> +
> + if (dentry->d_inode && S_ISLNK(dentry->d_inode->i_mode)) {
> + DPRINTK("checking symlink %p %.*s",
> + dentry, (int)dentry->d_name.len, dentry->d_name.name);
> + /*
> + * A symlink can't be "busy" in the usual sense so
> + * just check last used for expire timeout.
> + */
> + if (autofs4_can_expire(dentry, timeout, do_now))
> + return dentry;
> + return NULL;
> + }
> +
> + if (simple_empty(dentry))
> + return NULL;
> +
> + /* Case 2: tree mount, expire iff entire tree is not busy */
> + if (!exp_leaves) {
> + /* Path walk currently on this dentry? */
> + ino_count = atomic_read(&ino->count) + 1;
> + if (d_count(dentry) > ino_count)
> + return NULL;
> +
> + if (!autofs4_tree_busy(mnt, dentry, timeout, do_now))
> + return dentry;
> + /*
> + * Case 3: pseudo direct mount, expire individual leaves
> + * (autofs-4.1).
> + */
> + } else {
> + /* Path walk currently on this dentry? */
> + struct dentry *expired;
> + ino_count = atomic_read(&ino->count) + 1;
> + if (d_count(dentry) > ino_count)
> + return NULL;
> +
> + expired = autofs4_check_leaves(mnt, dentry, timeout, do_now);
> + if (expired) {
> + if (expired == dentry)
> + dput(dentry);
> + return dentry;
> + }
> + }
> + return NULL;
> +}
> /*
> * Find an eligible tree to time-out
> * A tree is eligible if :-
> @@ -359,11 +442,8 @@ struct dentry *autofs4_expire_indirect(struct super_block *sb,
> unsigned long timeout;
> struct dentry *root = sb->s_root;
> struct dentry *dentry;
> - struct dentry *expired = NULL;
> - int do_now = how & AUTOFS_EXP_IMMEDIATE;
> - int exp_leaves = how & AUTOFS_EXP_LEAVES;
> + struct dentry *expired;
> struct autofs_info *ino;
> - unsigned int ino_count;
>
> if (!root)
> return NULL;
> @@ -374,78 +454,12 @@ struct dentry *autofs4_expire_indirect(struct super_block *sb,
> dentry = NULL;
> while ((dentry = get_next_positive_subdir(dentry, root))) {
> spin_lock(&sbi->fs_lock);
> - ino = autofs4_dentry_ino(dentry);
> - /* No point expiring a pending mount */
> - if (ino->flags & AUTOFS_INF_PENDING)
> - goto next;
> -
> - /*
> - * Case 1: (i) indirect mount or top level pseudo direct mount
> - * (autofs-4.1).
> - * (ii) indirect mount with offset mount, check the "/"
> - * offset (autofs-5.0+).
> - */
> - if (d_mountpoint(dentry)) {
> - DPRINTK("checking mountpoint %p %.*s",
> - dentry, (int)dentry->d_name.len, dentry->d_name.name);
> -
> - /* Can we umount this guy */
> - if (autofs4_mount_busy(mnt, dentry))
> - goto next;
> -
> - /* Can we expire this guy */
> - if (autofs4_can_expire(dentry, timeout, do_now)) {
> - expired = dentry;
> - goto found;
> - }
> - goto next;
> - }
> -
> - if (dentry->d_inode && S_ISLNK(dentry->d_inode->i_mode)) {
> - DPRINTK("checking symlink %p %.*s",
> - dentry, (int)dentry->d_name.len, dentry->d_name.name);
> - /*
> - * A symlink can't be "busy" in the usual sense so
> - * just check last used for expire timeout.
> - */
> - if (autofs4_can_expire(dentry, timeout, do_now)) {
> - expired = dentry;
> - goto found;
> - }
> - goto next;
> - }
> -
> - if (simple_empty(dentry))
> - goto next;
> -
> - /* Case 2: tree mount, expire iff entire tree is not busy */
> - if (!exp_leaves) {
> - /* Path walk currently on this dentry? */
> - ino_count = atomic_read(&ino->count) + 1;
> - if (d_count(dentry) > ino_count)
> - goto next;
> -
> - if (!autofs4_tree_busy(mnt, dentry, timeout, do_now)) {
> - expired = dentry;
> - goto found;
> - }
> - /*
> - * Case 3: pseudo direct mount, expire individual leaves
> - * (autofs-4.1).
> - */
> - } else {
> - /* Path walk currently on this dentry? */
> - ino_count = atomic_read(&ino->count) + 1;
> - if (d_count(dentry) > ino_count)
> - goto next;
> -
> - expired = autofs4_check_leaves(mnt, dentry, timeout, do_now);
> - if (expired) {
> + expired = should_expire(dentry, mnt, timeout, how);
> + if (expired) {
> + if (expired != dentry)
> dput(dentry);
> - goto found;
> - }
> + goto found;
> }
> -next:
> spin_unlock(&sbi->fs_lock);
> }
> return NULL;
On Tue, 15 Jul 2014 11:48:33 +0800 Ian Kent <[email protected]> wrote:
...
> So far I've tracked this to something that was introduced between 3.11
> and 3.12. One change that went into 3.12 was Jeff Laytons' umount
> specific path resolution for umount. I've found this is also broken on
> recent RHEL-6 kernels and that change is present in them too so the
> evidence is leaning toward that being the problem. Just how could happen
> with this change I have no clue so far, it just doesn't make sense.
Such a familiar story! I've not yet failed to find bugs like that, but
sometimes it takes a looooong time. Good luck!
>
> Anyway that's going to make testing these patches impossible until I can
> work out what the problem is and fix it.
>
> Very sorry to hold you up on this but it can't be avoided.
Thanks a lot for letting me know the situation!
While I always appreciate upstream review I've learnt not to depend on it.
I'll proceed with my customers based on my judgement and look forward to any
input you can provide whenever you are able to provide it.
Thanks,
NeilBrown
On Tue, 2014-07-15 at 14:05 +1000, NeilBrown wrote:
> On Tue, 15 Jul 2014 11:48:33 +0800 Ian Kent <[email protected]> wrote:
> ...
> > So far I've tracked this to something that was introduced between 3.11
> > and 3.12. One change that went into 3.12 was Jeff Laytons' umount
> > specific path resolution for umount. I've found this is also broken on
> > recent RHEL-6 kernels and that change is present in them too so the
> > evidence is leaning toward that being the problem. Just how could happen
> > with this change I have no clue so far, it just doesn't make sense.
>
> Such a familiar story! I've not yet failed to find bugs like that, but
> sometimes it takes a looooong time. Good luck!
>
> >
> > Anyway that's going to make testing these patches impossible until I can
> > work out what the problem is and fix it.
> >
> > Very sorry to hold you up on this but it can't be avoided.
>
> Thanks a lot for letting me know the situation!
> While I always appreciate upstream review I've learnt not to depend on it.
> I'll proceed with my customers based on my judgement and look forward to any
> input you can provide whenever you are able to provide it.
Yeah, I planed to spend some time to have a look on Wednesday and still
will but, as I say, testing is going to take more time than I have just
now.
Usual autofs usage rarely, if ever, uses symlinks so your customers
should be fine but be aware of the difficulty if you run into people
that want to use amd maps.
>
> Thanks,
> NeilBrown
On Thu, 2014-07-10 at 09:41 +1000, NeilBrown wrote:
> The variable 'ino' already exists and already
> has the correct value. The d_fsdata of a dentry
> is never changed after the d_fsdata is instantiated,
> so this new assignment cannot be necessary.
>
> It was introduced in
> commit b5b801779d59165c4ecf1009009109545bd1f642
> autofs4: Add d_manage() dentry operation
>
> Signed-off-by: NeilBrown <[email protected]>
Again, an obvious cleanup, thanks for that.
Acked-by: Ian Kent <[email protected]>
> ---
> fs/autofs4/expire.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
> index 394e90b02c5e..a7be57e39be7 100644
> --- a/fs/autofs4/expire.c
> +++ b/fs/autofs4/expire.c
> @@ -333,7 +333,6 @@ struct dentry *autofs4_expire_direct(struct super_block *sb,
> if (ino->flags & AUTOFS_INF_PENDING)
> goto out;
> if (!autofs4_direct_busy(mnt, root, timeout, do_now)) {
> - struct autofs_info *ino = autofs4_dentry_ino(root);
> ino->flags |= AUTOFS_INF_EXPIRING;
> init_completion(&ino->expire_complete);
> spin_unlock(&sbi->fs_lock);
>
>
On Thu, 2014-07-10 at 09:41 +1000, NeilBrown wrote:
> autofs4 currently doesn't support RCU-walk - it immediately
> aborts any attempt at RCU-walk to force REF-walk for path name
> lookup.
As discussed I don't have time to properly test these just now but I'll
do my best to review the patches and return to test them later.
My impression is that you will be submitting these patches rather than
expecting me to pick them up and submit them. If that's not what your
expecting please let me know.
I appreciate you including me in this work, all to often things get
merged that I'm miss and while I may not have identified any problem
with them at the time at least I would be aware of what I might need to
look at when problems arise.
>
> This can cause a significant performance impact on multi-core
> systems.
> I have a client with a test case which spends >80% of its time
> waiting for spinlocks with a "make -j 40" on a 40 core system.
>
> This patchset aims to remove most of these spinlocks. To be fully
> effective in the particular case it needs a second patch set which
> makes NFS RCU-walk friendly, but one thing at a time.
>
> This has only been lightly tested so far so I'm really after feed-back
> rather than to have the patch set accepted, though the first two
> patches are trivial and could be taken immediately.
>
> The last two patches are the most interesting so review comments on
> those are particularly welcome.
>
> Thanks,
> NeilBrown
>
>
> ---
>
> NeilBrown (6):
> autofs4: remove unused autofs4_ispending()
> autofs4: remove a redundant assignment
> autofs4: allow RCU-walk to walk through autofs4.
> autofs4: factor should_expire() out of autofs4_expire_indirect.
> autofs4: avoid taking fs_lock during rcu-walk
> autofs4: don't take spinlock when not needed in autofs4_lookup_expiring
>
>
> fs/autofs4/autofs_i.h | 20 +----
> fs/autofs4/dev-ioctl.c | 2 -
> fs/autofs4/expire.c | 192 +++++++++++++++++++++++++++++-------------------
> fs/autofs4/root.c | 46 ++++++++----
> 4 files changed, 151 insertions(+), 109 deletions(-)
>
On Thu, 2014-07-10 at 09:41 +1000, NeilBrown wrote:
> Signed-off-by: NeilBrown <[email protected]>
Obviously straight forward.
Acked-by: Ian Kent <[email protected]>
> ---
> fs/autofs4/autofs_i.h | 14 --------------
> 1 file changed, 14 deletions(-)
>
> diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
> index acf32054edd8..22a280151e45 100644
> --- a/fs/autofs4/autofs_i.h
> +++ b/fs/autofs4/autofs_i.h
> @@ -143,20 +143,6 @@ static inline int autofs4_oz_mode(struct autofs_sb_info *sbi) {
> return sbi->catatonic || task_pgrp(current) == sbi->oz_pgrp;
> }
>
> -/* Does a dentry have some pending activity? */
> -static inline int autofs4_ispending(struct dentry *dentry)
> -{
> - struct autofs_info *inf = autofs4_dentry_ino(dentry);
> -
> - if (inf->flags & AUTOFS_INF_PENDING)
> - return 1;
> -
> - if (inf->flags & AUTOFS_INF_EXPIRING)
> - return 1;
> -
> - return 0;
> -}
> -
> struct inode *autofs4_get_inode(struct super_block *, umode_t);
> void autofs4_free_ino(struct autofs_info *);
>
>
>
On Thu, 2014-07-10 at 09:41 +1000, NeilBrown wrote:
> If the expiring_list is empty, we can avoid a costly spinlock
> in the rcu-walk path through authfs4_d_manage.
>
> Signed-off-by: NeilBrown <[email protected]>
I know it should be straight forward to say this is OK but I always
think twice and again about areas that are subject to race pressure,
such as the expire to mount pressure of this code.
After thinking about it for a while now I don't have any reason to think
this would be a problem. Perhaps later pressure testing will reveal
something I missed.
It occurs to me that autofs4_lookup_active() might benefit from a
similar addition. Multiple calls to ->lookup() made while the dentry is
still unhashed should have enforced ordering due to the directory
i_mutex so there shouldn't be a problem adding this. But perhaps you
haven't seen delays in that function.
Acked-by: Ian Kent <[email protected]>
> ---
> fs/autofs4/root.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
> index 1ad119407e2f..774c2dab331b 100644
> --- a/fs/autofs4/root.c
> +++ b/fs/autofs4/root.c
> @@ -219,6 +219,8 @@ static struct dentry *autofs4_lookup_expiring(struct dentry *dentry,
> const unsigned char *str = name->name;
> struct list_head *p, *head;
>
> + if (list_empty(&sbi->expiring_list))
> + return NULL;
> spin_lock(&sbi->lookup_lock);
> head = &sbi->expiring_list;
> list_for_each(p, head) {
>
>
On Thu, 2014-07-10 at 09:41 +1000, NeilBrown wrote:
> Any attempt to look up a pathname that passes though an
> autofs4 mount is currently forced out of RCU-walk into
> REF-walk.
>
> This can significantly hurt performance of many-thread work
> loads on many-core systems, especially if the automounted
> filesystem supports RCU-walk but doesn't get to benefit from
> it.
>
> So if autofs4_d_manage is called with rcu_walk set, only
> fail with -ECHILD if it is necessary to wait longer than
> a spinlock, and avoid even the spinlock in one trivial case.
I've looked more closely at this one now and mostly it looks good to me.
But I'm probably a bit slow, there's just one place where I can't work
out the reasoning ....
>
> Signed-off-by: NeilBrown <[email protected]>
> ---
> fs/autofs4/autofs_i.h | 2 +-
> fs/autofs4/dev-ioctl.c | 2 +-
> fs/autofs4/expire.c | 4 +++-
> fs/autofs4/root.c | 44 +++++++++++++++++++++++++++++---------------
> 4 files changed, 34 insertions(+), 18 deletions(-)
>
> diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
> index 22a280151e45..99dbb05d6148 100644
> --- a/fs/autofs4/autofs_i.h
> +++ b/fs/autofs4/autofs_i.h
> @@ -148,7 +148,7 @@ void autofs4_free_ino(struct autofs_info *);
>
> /* Expiration */
> int is_autofs4_dentry(struct dentry *);
> -int autofs4_expire_wait(struct dentry *dentry);
> +int autofs4_expire_wait(struct dentry *dentry, int rcu_walk);
> int autofs4_expire_run(struct super_block *, struct vfsmount *,
> struct autofs_sb_info *,
> struct autofs_packet_expire __user *);
> diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
> index 5b570b6efa28..aaf96cb25452 100644
> --- a/fs/autofs4/dev-ioctl.c
> +++ b/fs/autofs4/dev-ioctl.c
> @@ -450,7 +450,7 @@ static int autofs_dev_ioctl_requester(struct file *fp,
> ino = autofs4_dentry_ino(path.dentry);
> if (ino) {
> err = 0;
> - autofs4_expire_wait(path.dentry);
> + autofs4_expire_wait(path.dentry, 0);
> spin_lock(&sbi->fs_lock);
> param->requester.uid = from_kuid_munged(current_user_ns(), ino->uid);
> param->requester.gid = from_kgid_munged(current_user_ns(), ino->gid);
> diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
> index a7be57e39be7..7e2f22ce6954 100644
> --- a/fs/autofs4/expire.c
> +++ b/fs/autofs4/expire.c
> @@ -467,7 +467,7 @@ found:
> return expired;
> }
>
> -int autofs4_expire_wait(struct dentry *dentry)
> +int autofs4_expire_wait(struct dentry *dentry, int rcu_walk)
> {
> struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
> struct autofs_info *ino = autofs4_dentry_ino(dentry);
> @@ -477,6 +477,8 @@ int autofs4_expire_wait(struct dentry *dentry)
> spin_lock(&sbi->fs_lock);
> if (ino->flags & AUTOFS_INF_EXPIRING) {
> spin_unlock(&sbi->fs_lock);
> + if (rcu_walk)
> + return -ECHILD;
>
> DPRINTK("waiting for expire %p name=%.*s",
> dentry, dentry->d_name.len, dentry->d_name.name);
> diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
> index cc87c1abac97..1ad119407e2f 100644
> --- a/fs/autofs4/root.c
> +++ b/fs/autofs4/root.c
> @@ -208,7 +208,8 @@ next:
> return NULL;
> }
>
> -static struct dentry *autofs4_lookup_expiring(struct dentry *dentry)
> +static struct dentry *autofs4_lookup_expiring(struct dentry *dentry,
> + bool rcu_walk)
> {
> struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
> struct dentry *parent = dentry->d_parent;
> @@ -225,6 +226,11 @@ static struct dentry *autofs4_lookup_expiring(struct dentry *dentry)
> struct dentry *expiring;
> struct qstr *qstr;
>
> + if (rcu_walk) {
> + spin_unlock(&sbi->lookup_lock);
> + return ERR_PTR(-ECHILD);
> + }
> +
> ino = list_entry(p, struct autofs_info, expiring);
> expiring = ino->dentry;
>
> @@ -260,13 +266,15 @@ next:
> return NULL;
> }
>
> -static int autofs4_mount_wait(struct dentry *dentry)
> +static int autofs4_mount_wait(struct dentry *dentry, bool rcu_walk)
> {
> struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
> struct autofs_info *ino = autofs4_dentry_ino(dentry);
> int status = 0;
>
> if (ino->flags & AUTOFS_INF_PENDING) {
> + if (rcu_walk)
> + return -ECHILD;
> DPRINTK("waiting for mount name=%.*s",
> dentry->d_name.len, dentry->d_name.name);
> status = autofs4_wait(sbi, dentry, NFY_MOUNT);
> @@ -276,20 +284,22 @@ static int autofs4_mount_wait(struct dentry *dentry)
> return status;
> }
>
> -static int do_expire_wait(struct dentry *dentry)
> +static int do_expire_wait(struct dentry *dentry, bool rcu_walk)
> {
> struct dentry *expiring;
>
> - expiring = autofs4_lookup_expiring(dentry);
> + expiring = autofs4_lookup_expiring(dentry, rcu_walk);
> + if (IS_ERR(expiring))
> + return PTR_ERR(expiring);
> if (!expiring)
> - return autofs4_expire_wait(dentry);
> + return autofs4_expire_wait(dentry, rcu_walk);
> else {
> /*
> * If we are racing with expire the request might not
> * be quite complete, but the directory has been removed
> * so it must have been successful, just wait for it.
> */
> - autofs4_expire_wait(expiring);
> + autofs4_expire_wait(expiring, 0);
> autofs4_del_expiring(expiring);
> dput(expiring);
> }
> @@ -341,7 +351,7 @@ static struct vfsmount *autofs4_d_automount(struct path *path)
> * and the directory was removed, so just go ahead and try
> * the mount.
> */
> - status = do_expire_wait(dentry);
> + status = do_expire_wait(dentry, 0);
> if (status && status != -EAGAIN)
> return NULL;
>
> @@ -349,7 +359,7 @@ static struct vfsmount *autofs4_d_automount(struct path *path)
> spin_lock(&sbi->fs_lock);
> if (ino->flags & AUTOFS_INF_PENDING) {
> spin_unlock(&sbi->fs_lock);
> - status = autofs4_mount_wait(dentry);
> + status = autofs4_mount_wait(dentry, 0);
> if (status)
> return ERR_PTR(status);
> goto done;
> @@ -390,7 +400,7 @@ static struct vfsmount *autofs4_d_automount(struct path *path)
> }
> ino->flags |= AUTOFS_INF_PENDING;
> spin_unlock(&sbi->fs_lock);
> - status = autofs4_mount_wait(dentry);
> + status = autofs4_mount_wait(dentry, 0);
> spin_lock(&sbi->fs_lock);
> ino->flags &= ~AUTOFS_INF_PENDING;
> if (status) {
> @@ -426,21 +436,25 @@ static int autofs4_d_manage(struct dentry *dentry, bool rcu_walk)
> return 0;
> }
>
> - /* We need to sleep, so we need pathwalk to be in ref-mode */
> - if (rcu_walk)
> - return -ECHILD;
> -
> /* Wait for pending expires */
> - do_expire_wait(dentry);
> + if (do_expire_wait(dentry, rcu_walk) == -ECHILD)
> + return -ECHILD;
>
> /*
> * This dentry may be under construction so wait on mount
> * completion.
> */
> - status = autofs4_mount_wait(dentry);
> + status = autofs4_mount_wait(dentry, rcu_walk);
> if (status)
> return status;
>
> + if (rcu_walk)
> + /* No need to consider returning -EISDIR as
> + * there is no risk that ->d_automount will
> + * be called
> + */
> + return status;
> +
This bit here.
The test below says (accepting that d_mountpoint() is unreliable in a
multi-namespace environment and needs to be fixed), if this dentry isn't
being expired and it's not a mount and the directory isn't empty then we
can walk straight through the dentry.
For certain automount map constructs it's possible for a dentry that is
not itself a mount point to be expired in the same way a mount or mount
tree is expired.
What's the thinking behind this one please?
> spin_lock(&sbi->fs_lock);
> /*
> * If the dentry has been selected for expire while we slept
>
>
On Wed, 16 Jul 2014 12:44:17 +0800 Ian Kent <[email protected]> wrote:
> On Thu, 2014-07-10 at 09:41 +1000, NeilBrown wrote:
> > Any attempt to look up a pathname that passes though an
> > autofs4 mount is currently forced out of RCU-walk into
> > REF-walk.
> >
> > This can significantly hurt performance of many-thread work
> > loads on many-core systems, especially if the automounted
> > filesystem supports RCU-walk but doesn't get to benefit from
> > it.
> >
> > So if autofs4_d_manage is called with rcu_walk set, only
> > fail with -ECHILD if it is necessary to wait longer than
> > a spinlock, and avoid even the spinlock in one trivial case.
>
> I've looked more closely at this one now and mostly it looks good to me.
>
> But I'm probably a bit slow, there's just one place where I can't work
> out the reasoning ....
>
> >
> > Signed-off-by: NeilBrown <[email protected]>
> > ---
> > fs/autofs4/autofs_i.h | 2 +-
> > fs/autofs4/dev-ioctl.c | 2 +-
> > fs/autofs4/expire.c | 4 +++-
> > fs/autofs4/root.c | 44 +++++++++++++++++++++++++++++---------------
> > 4 files changed, 34 insertions(+), 18 deletions(-)
> >
> > diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
> > index 22a280151e45..99dbb05d6148 100644
> > --- a/fs/autofs4/autofs_i.h
> > +++ b/fs/autofs4/autofs_i.h
> > @@ -148,7 +148,7 @@ void autofs4_free_ino(struct autofs_info *);
> >
> > /* Expiration */
> > int is_autofs4_dentry(struct dentry *);
> > -int autofs4_expire_wait(struct dentry *dentry);
> > +int autofs4_expire_wait(struct dentry *dentry, int rcu_walk);
> > int autofs4_expire_run(struct super_block *, struct vfsmount *,
> > struct autofs_sb_info *,
> > struct autofs_packet_expire __user *);
> > diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
> > index 5b570b6efa28..aaf96cb25452 100644
> > --- a/fs/autofs4/dev-ioctl.c
> > +++ b/fs/autofs4/dev-ioctl.c
> > @@ -450,7 +450,7 @@ static int autofs_dev_ioctl_requester(struct file *fp,
> > ino = autofs4_dentry_ino(path.dentry);
> > if (ino) {
> > err = 0;
> > - autofs4_expire_wait(path.dentry);
> > + autofs4_expire_wait(path.dentry, 0);
> > spin_lock(&sbi->fs_lock);
> > param->requester.uid = from_kuid_munged(current_user_ns(), ino->uid);
> > param->requester.gid = from_kgid_munged(current_user_ns(), ino->gid);
> > diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
> > index a7be57e39be7..7e2f22ce6954 100644
> > --- a/fs/autofs4/expire.c
> > +++ b/fs/autofs4/expire.c
> > @@ -467,7 +467,7 @@ found:
> > return expired;
> > }
> >
> > -int autofs4_expire_wait(struct dentry *dentry)
> > +int autofs4_expire_wait(struct dentry *dentry, int rcu_walk)
> > {
> > struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
> > struct autofs_info *ino = autofs4_dentry_ino(dentry);
> > @@ -477,6 +477,8 @@ int autofs4_expire_wait(struct dentry *dentry)
> > spin_lock(&sbi->fs_lock);
> > if (ino->flags & AUTOFS_INF_EXPIRING) {
> > spin_unlock(&sbi->fs_lock);
> > + if (rcu_walk)
> > + return -ECHILD;
> >
> > DPRINTK("waiting for expire %p name=%.*s",
> > dentry, dentry->d_name.len, dentry->d_name.name);
> > diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
> > index cc87c1abac97..1ad119407e2f 100644
> > --- a/fs/autofs4/root.c
> > +++ b/fs/autofs4/root.c
> > @@ -208,7 +208,8 @@ next:
> > return NULL;
> > }
> >
> > -static struct dentry *autofs4_lookup_expiring(struct dentry *dentry)
> > +static struct dentry *autofs4_lookup_expiring(struct dentry *dentry,
> > + bool rcu_walk)
> > {
> > struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
> > struct dentry *parent = dentry->d_parent;
> > @@ -225,6 +226,11 @@ static struct dentry *autofs4_lookup_expiring(struct dentry *dentry)
> > struct dentry *expiring;
> > struct qstr *qstr;
> >
> > + if (rcu_walk) {
> > + spin_unlock(&sbi->lookup_lock);
> > + return ERR_PTR(-ECHILD);
> > + }
> > +
> > ino = list_entry(p, struct autofs_info, expiring);
> > expiring = ino->dentry;
> >
> > @@ -260,13 +266,15 @@ next:
> > return NULL;
> > }
> >
> > -static int autofs4_mount_wait(struct dentry *dentry)
> > +static int autofs4_mount_wait(struct dentry *dentry, bool rcu_walk)
> > {
> > struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
> > struct autofs_info *ino = autofs4_dentry_ino(dentry);
> > int status = 0;
> >
> > if (ino->flags & AUTOFS_INF_PENDING) {
> > + if (rcu_walk)
> > + return -ECHILD;
> > DPRINTK("waiting for mount name=%.*s",
> > dentry->d_name.len, dentry->d_name.name);
> > status = autofs4_wait(sbi, dentry, NFY_MOUNT);
> > @@ -276,20 +284,22 @@ static int autofs4_mount_wait(struct dentry *dentry)
> > return status;
> > }
> >
> > -static int do_expire_wait(struct dentry *dentry)
> > +static int do_expire_wait(struct dentry *dentry, bool rcu_walk)
> > {
> > struct dentry *expiring;
> >
> > - expiring = autofs4_lookup_expiring(dentry);
> > + expiring = autofs4_lookup_expiring(dentry, rcu_walk);
> > + if (IS_ERR(expiring))
> > + return PTR_ERR(expiring);
> > if (!expiring)
> > - return autofs4_expire_wait(dentry);
> > + return autofs4_expire_wait(dentry, rcu_walk);
> > else {
> > /*
> > * If we are racing with expire the request might not
> > * be quite complete, but the directory has been removed
> > * so it must have been successful, just wait for it.
> > */
> > - autofs4_expire_wait(expiring);
> > + autofs4_expire_wait(expiring, 0);
> > autofs4_del_expiring(expiring);
> > dput(expiring);
> > }
> > @@ -341,7 +351,7 @@ static struct vfsmount *autofs4_d_automount(struct path *path)
> > * and the directory was removed, so just go ahead and try
> > * the mount.
> > */
> > - status = do_expire_wait(dentry);
> > + status = do_expire_wait(dentry, 0);
> > if (status && status != -EAGAIN)
> > return NULL;
> >
> > @@ -349,7 +359,7 @@ static struct vfsmount *autofs4_d_automount(struct path *path)
> > spin_lock(&sbi->fs_lock);
> > if (ino->flags & AUTOFS_INF_PENDING) {
> > spin_unlock(&sbi->fs_lock);
> > - status = autofs4_mount_wait(dentry);
> > + status = autofs4_mount_wait(dentry, 0);
> > if (status)
> > return ERR_PTR(status);
> > goto done;
> > @@ -390,7 +400,7 @@ static struct vfsmount *autofs4_d_automount(struct path *path)
> > }
> > ino->flags |= AUTOFS_INF_PENDING;
> > spin_unlock(&sbi->fs_lock);
> > - status = autofs4_mount_wait(dentry);
> > + status = autofs4_mount_wait(dentry, 0);
> > spin_lock(&sbi->fs_lock);
> > ino->flags &= ~AUTOFS_INF_PENDING;
> > if (status) {
> > @@ -426,21 +436,25 @@ static int autofs4_d_manage(struct dentry *dentry, bool rcu_walk)
> > return 0;
> > }
> >
> > - /* We need to sleep, so we need pathwalk to be in ref-mode */
> > - if (rcu_walk)
> > - return -ECHILD;
> > -
> > /* Wait for pending expires */
> > - do_expire_wait(dentry);
> > + if (do_expire_wait(dentry, rcu_walk) == -ECHILD)
> > + return -ECHILD;
> >
> > /*
> > * This dentry may be under construction so wait on mount
> > * completion.
> > */
> > - status = autofs4_mount_wait(dentry);
> > + status = autofs4_mount_wait(dentry, rcu_walk);
> > if (status)
> > return status;
> >
> > + if (rcu_walk)
> > + /* No need to consider returning -EISDIR as
> > + * there is no risk that ->d_automount will
> > + * be called
> > + */
> > + return status;
> > +
>
> This bit here.
>
> The test below says (accepting that d_mountpoint() is unreliable in a
> multi-namespace environment and needs to be fixed), if this dentry isn't
> being expired and it's not a mount and the directory isn't empty then we
> can walk straight through the dentry.
>
> For certain automount map constructs it's possible for a dentry that is
> not itself a mount point to be expired in the same way a mount or mount
> tree is expired.
>
> What's the thinking behind this one please?
I admit that I don't fully understand the role of -EISDIR being returned by
d_manage.
However the comment below suggests that it is an optimisation. It tells the
VFS not to even bother calling ->d_automount on this dentry.
In RCU-walk this doesn't apply. ->d_automount will not be called if
->d_manage returns zero. So the optimisation is not needed.
I might have missed a subtlety here but it looks like:
in rcu_walk mode:
->d_manage returns zero if everything is stable, no automount is
needed, and nothing is expiring
and returns non-zero (-ECHILD) if anything doesn't quite look right
and we need to re-try with in refwalk mode with spinlocks etc.
while in refwalk mode:
->d_manage returns zero, possibly after waiting, if everything looks
stable and ->d_automount is needed
or it returns -EISDIR if the dentry never needs automounting
So while the 'success' condition is similar, the 'error' case is quite
different. The errors that make sense in one case don't make sense in the
other.
So this patch fragment is avoiding a refwalk error return in the rcuwalk case.
On reflection, there might be something wrong here.
If a dentry should eventually be mounted on but isn't yet, ->d_manage needs
to fail.
So if !d_mountpoint and simple_empty(), maybe d_manage should fail in rcuwalk
mode.
That looks a bit messy ... I wonder if we could have a new "ino" flag which
says "This dentry is mounted-on if it needs to be. Gets set by ->lookup
and cleared by ->d_automount or when ->d_manage returns -EISDIR.
Then set again when it expires.
I'll probably have to understand the code more deeply before I can be sure.
Thanks,
NeilBrown
>
> > spin_lock(&sbi->fs_lock);
> > /*
> > * If the dentry has been selected for expire while we slept
> >
> >
>
On Wed, 16 Jul 2014 11:24:58 +0800 Ian Kent <[email protected]> wrote:
> On Thu, 2014-07-10 at 09:41 +1000, NeilBrown wrote:
> > autofs4 currently doesn't support RCU-walk - it immediately
> > aborts any attempt at RCU-walk to force REF-walk for path name
> > lookup.
>
> As discussed I don't have time to properly test these just now but I'll
> do my best to review the patches and return to test them later.
Thanks.
>
> My impression is that you will be submitting these patches rather than
> expecting me to pick them up and submit them. If that's not what your
> expecting please let me know.
I had assumed that you would take them as you are listed as the maintainer.
However if you would like me to send them on I can certainly do that.
You seem to send via Andrew Morton so I'll do that when they seem to be ready
if you like.
>
> I appreciate you including me in this work, all to often things get
> merged that I'm miss and while I may not have identified any problem
> with them at the time at least I would be aware of what I might need to
> look at when problems arise.
>
Maybe I'm old fashioned, but I have this idea that all patches *must* at
least be Cc:ed to the maintainer and if they are as intrusive as these, they
*must* be approved. Maybe others behave different?
Thanks,
NeilBrown
On Wed, 16 Jul 2014 11:42:12 +0800 Ian Kent <[email protected]> wrote:
> On Thu, 2014-07-10 at 09:41 +1000, NeilBrown wrote:
> > If the expiring_list is empty, we can avoid a costly spinlock
> > in the rcu-walk path through authfs4_d_manage.
> >
> > Signed-off-by: NeilBrown <[email protected]>
>
> I know it should be straight forward to say this is OK but I always
> think twice and again about areas that are subject to race pressure,
> such as the expire to mount pressure of this code.
>
> After thinking about it for a while now I don't have any reason to think
> this would be a problem. Perhaps later pressure testing will reveal
> something I missed.
>
> It occurs to me that autofs4_lookup_active() might benefit from a
> similar addition. Multiple calls to ->lookup() made while the dentry is
> still unhashed should have enforced ordering due to the directory
> i_mutex so there shouldn't be a problem adding this. But perhaps you
> haven't seen delays in that function.
Yes I think the same optimisation could apply to autofs4_lookup_active(). It
wouldn't benefit as much though.
autofs4_lookup_active() is only called from autofs4_lookup() which is only
called when the dentry doesn't already exist. So it isn't called so often
and isn't on the fast path.
However ... maybe "is on the active list" is exactly the "flag" I was wanting
earlier to see if a dentry might be waiting to be mounted on - in which case
d_managed already does the right thing.
I think it is time to read through the code again. I seem to be
understanding more of it which always helps:-)
Thanks,
NeilBrown
>
> Acked-by: Ian Kent <[email protected]>
>
> > ---
> > fs/autofs4/root.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
> > index 1ad119407e2f..774c2dab331b 100644
> > --- a/fs/autofs4/root.c
> > +++ b/fs/autofs4/root.c
> > @@ -219,6 +219,8 @@ static struct dentry *autofs4_lookup_expiring(struct dentry *dentry,
> > const unsigned char *str = name->name;
> > struct list_head *p, *head;
> >
> > + if (list_empty(&sbi->expiring_list))
> > + return NULL;
> > spin_lock(&sbi->lookup_lock);
> > head = &sbi->expiring_list;
> > list_for_each(p, head) {
> >
> >
>
On Wed, 2014-07-16 at 15:51 +1000, NeilBrown wrote:
> On Wed, 16 Jul 2014 12:44:17 +0800 Ian Kent <[email protected]> wrote:
>
> > On Thu, 2014-07-10 at 09:41 +1000, NeilBrown wrote:
> > > Any attempt to look up a pathname that passes though an
> > > autofs4 mount is currently forced out of RCU-walk into
> > > REF-walk.
> > >
> > > This can significantly hurt performance of many-thread work
> > > loads on many-core systems, especially if the automounted
> > > filesystem supports RCU-walk but doesn't get to benefit from
> > > it.
> > >
> > > So if autofs4_d_manage is called with rcu_walk set, only
> > > fail with -ECHILD if it is necessary to wait longer than
> > > a spinlock, and avoid even the spinlock in one trivial case.
> >
> > I've looked more closely at this one now and mostly it looks good to me.
> >
> > But I'm probably a bit slow, there's just one place where I can't work
> > out the reasoning ....
> >
> > >
> > > Signed-off-by: NeilBrown <[email protected]>
> > > ---
> > > fs/autofs4/autofs_i.h | 2 +-
> > > fs/autofs4/dev-ioctl.c | 2 +-
> > > fs/autofs4/expire.c | 4 +++-
> > > fs/autofs4/root.c | 44 +++++++++++++++++++++++++++++---------------
> > > 4 files changed, 34 insertions(+), 18 deletions(-)
> > >
> > > diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
> > > index 22a280151e45..99dbb05d6148 100644
> > > --- a/fs/autofs4/autofs_i.h
> > > +++ b/fs/autofs4/autofs_i.h
> > > @@ -148,7 +148,7 @@ void autofs4_free_ino(struct autofs_info *);
> > >
> > > /* Expiration */
> > > int is_autofs4_dentry(struct dentry *);
> > > -int autofs4_expire_wait(struct dentry *dentry);
> > > +int autofs4_expire_wait(struct dentry *dentry, int rcu_walk);
> > > int autofs4_expire_run(struct super_block *, struct vfsmount *,
> > > struct autofs_sb_info *,
> > > struct autofs_packet_expire __user *);
> > > diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
> > > index 5b570b6efa28..aaf96cb25452 100644
> > > --- a/fs/autofs4/dev-ioctl.c
> > > +++ b/fs/autofs4/dev-ioctl.c
> > > @@ -450,7 +450,7 @@ static int autofs_dev_ioctl_requester(struct file *fp,
> > > ino = autofs4_dentry_ino(path.dentry);
> > > if (ino) {
> > > err = 0;
> > > - autofs4_expire_wait(path.dentry);
> > > + autofs4_expire_wait(path.dentry, 0);
> > > spin_lock(&sbi->fs_lock);
> > > param->requester.uid = from_kuid_munged(current_user_ns(), ino->uid);
> > > param->requester.gid = from_kgid_munged(current_user_ns(), ino->gid);
> > > diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
> > > index a7be57e39be7..7e2f22ce6954 100644
> > > --- a/fs/autofs4/expire.c
> > > +++ b/fs/autofs4/expire.c
> > > @@ -467,7 +467,7 @@ found:
> > > return expired;
> > > }
> > >
> > > -int autofs4_expire_wait(struct dentry *dentry)
> > > +int autofs4_expire_wait(struct dentry *dentry, int rcu_walk)
> > > {
> > > struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
> > > struct autofs_info *ino = autofs4_dentry_ino(dentry);
> > > @@ -477,6 +477,8 @@ int autofs4_expire_wait(struct dentry *dentry)
> > > spin_lock(&sbi->fs_lock);
> > > if (ino->flags & AUTOFS_INF_EXPIRING) {
> > > spin_unlock(&sbi->fs_lock);
> > > + if (rcu_walk)
> > > + return -ECHILD;
> > >
> > > DPRINTK("waiting for expire %p name=%.*s",
> > > dentry, dentry->d_name.len, dentry->d_name.name);
> > > diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
> > > index cc87c1abac97..1ad119407e2f 100644
> > > --- a/fs/autofs4/root.c
> > > +++ b/fs/autofs4/root.c
> > > @@ -208,7 +208,8 @@ next:
> > > return NULL;
> > > }
> > >
> > > -static struct dentry *autofs4_lookup_expiring(struct dentry *dentry)
> > > +static struct dentry *autofs4_lookup_expiring(struct dentry *dentry,
> > > + bool rcu_walk)
> > > {
> > > struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
> > > struct dentry *parent = dentry->d_parent;
> > > @@ -225,6 +226,11 @@ static struct dentry *autofs4_lookup_expiring(struct dentry *dentry)
> > > struct dentry *expiring;
> > > struct qstr *qstr;
> > >
> > > + if (rcu_walk) {
> > > + spin_unlock(&sbi->lookup_lock);
> > > + return ERR_PTR(-ECHILD);
> > > + }
> > > +
> > > ino = list_entry(p, struct autofs_info, expiring);
> > > expiring = ino->dentry;
> > >
> > > @@ -260,13 +266,15 @@ next:
> > > return NULL;
> > > }
> > >
> > > -static int autofs4_mount_wait(struct dentry *dentry)
> > > +static int autofs4_mount_wait(struct dentry *dentry, bool rcu_walk)
> > > {
> > > struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
> > > struct autofs_info *ino = autofs4_dentry_ino(dentry);
> > > int status = 0;
> > >
> > > if (ino->flags & AUTOFS_INF_PENDING) {
> > > + if (rcu_walk)
> > > + return -ECHILD;
> > > DPRINTK("waiting for mount name=%.*s",
> > > dentry->d_name.len, dentry->d_name.name);
> > > status = autofs4_wait(sbi, dentry, NFY_MOUNT);
> > > @@ -276,20 +284,22 @@ static int autofs4_mount_wait(struct dentry *dentry)
> > > return status;
> > > }
> > >
> > > -static int do_expire_wait(struct dentry *dentry)
> > > +static int do_expire_wait(struct dentry *dentry, bool rcu_walk)
> > > {
> > > struct dentry *expiring;
> > >
> > > - expiring = autofs4_lookup_expiring(dentry);
> > > + expiring = autofs4_lookup_expiring(dentry, rcu_walk);
> > > + if (IS_ERR(expiring))
> > > + return PTR_ERR(expiring);
> > > if (!expiring)
> > > - return autofs4_expire_wait(dentry);
> > > + return autofs4_expire_wait(dentry, rcu_walk);
> > > else {
> > > /*
> > > * If we are racing with expire the request might not
> > > * be quite complete, but the directory has been removed
> > > * so it must have been successful, just wait for it.
> > > */
> > > - autofs4_expire_wait(expiring);
> > > + autofs4_expire_wait(expiring, 0);
> > > autofs4_del_expiring(expiring);
> > > dput(expiring);
> > > }
> > > @@ -341,7 +351,7 @@ static struct vfsmount *autofs4_d_automount(struct path *path)
> > > * and the directory was removed, so just go ahead and try
> > > * the mount.
> > > */
> > > - status = do_expire_wait(dentry);
> > > + status = do_expire_wait(dentry, 0);
> > > if (status && status != -EAGAIN)
> > > return NULL;
> > >
> > > @@ -349,7 +359,7 @@ static struct vfsmount *autofs4_d_automount(struct path *path)
> > > spin_lock(&sbi->fs_lock);
> > > if (ino->flags & AUTOFS_INF_PENDING) {
> > > spin_unlock(&sbi->fs_lock);
> > > - status = autofs4_mount_wait(dentry);
> > > + status = autofs4_mount_wait(dentry, 0);
> > > if (status)
> > > return ERR_PTR(status);
> > > goto done;
> > > @@ -390,7 +400,7 @@ static struct vfsmount *autofs4_d_automount(struct path *path)
> > > }
> > > ino->flags |= AUTOFS_INF_PENDING;
> > > spin_unlock(&sbi->fs_lock);
> > > - status = autofs4_mount_wait(dentry);
> > > + status = autofs4_mount_wait(dentry, 0);
> > > spin_lock(&sbi->fs_lock);
> > > ino->flags &= ~AUTOFS_INF_PENDING;
> > > if (status) {
> > > @@ -426,21 +436,25 @@ static int autofs4_d_manage(struct dentry *dentry, bool rcu_walk)
> > > return 0;
> > > }
> > >
> > > - /* We need to sleep, so we need pathwalk to be in ref-mode */
> > > - if (rcu_walk)
> > > - return -ECHILD;
> > > -
> > > /* Wait for pending expires */
> > > - do_expire_wait(dentry);
> > > + if (do_expire_wait(dentry, rcu_walk) == -ECHILD)
> > > + return -ECHILD;
> > >
> > > /*
> > > * This dentry may be under construction so wait on mount
> > > * completion.
> > > */
> > > - status = autofs4_mount_wait(dentry);
> > > + status = autofs4_mount_wait(dentry, rcu_walk);
> > > if (status)
> > > return status;
> > >
> > > + if (rcu_walk)
> > > + /* No need to consider returning -EISDIR as
> > > + * there is no risk that ->d_automount will
> > > + * be called
> > > + */
> > > + return status;
> > > +
> >
> > This bit here.
> >
> > The test below says (accepting that d_mountpoint() is unreliable in a
> > multi-namespace environment and needs to be fixed), if this dentry isn't
> > being expired and it's not a mount and the directory isn't empty then we
> > can walk straight through the dentry.
> >
> > For certain automount map constructs it's possible for a dentry that is
> > not itself a mount point to be expired in the same way a mount or mount
> > tree is expired.
> >
> > What's the thinking behind this one please?
>
> I admit that I don't fully understand the role of -EISDIR being returned by
> d_manage.
Yeah, it's to accommodate the so called autofs rootless mutli-mount
where the root of the tree doesn't itself have a mount and path walks
need to walk through them if they aren't expiring or under construction.
> However the comment below suggests that it is an optimisation. It tells the
> VFS not to even bother calling ->d_automount on this dentry.
> In RCU-walk this doesn't apply. ->d_automount will not be called if
> ->d_manage returns zero. So the optimisation is not needed.
Oh yes, I remember now, your talking about __follow_mount_rcu(), the
only place ->d_manage() gets called in an RCU-walk mode.
I get what's going on now, thanks.
>
> I might have missed a subtlety here but it looks like:
>
> in rcu_walk mode:
> ->d_manage returns zero if everything is stable, no automount is
> needed, and nothing is expiring
> and returns non-zero (-ECHILD) if anything doesn't quite look right
> and we need to re-try with in refwalk mode with spinlocks etc.
>
> while in refwalk mode:
> ->d_manage returns zero, possibly after waiting, if everything looks
> stable and ->d_automount is needed
> or it returns -EISDIR if the dentry never needs automounting
Yep, that sounds right to me.
>
> So while the 'success' condition is similar, the 'error' case is quite
> different. The errors that make sense in one case don't make sense in the
> other.
>
> So this patch fragment is avoiding a refwalk error return in the rcuwalk case.
>
> On reflection, there might be something wrong here.
> If a dentry should eventually be mounted on but isn't yet, ->d_manage needs
> to fail.
> So if !d_mountpoint and simple_empty(), maybe d_manage should fail in rcuwalk
> mode.
Yep, that's right, I missed that.
> That looks a bit messy ... I wonder if we could have a new "ino" flag which
> says "This dentry is mounted-on if it needs to be. Gets set by ->lookup
> and cleared by ->d_automount or when ->d_manage returns -EISDIR.
At one point DCACHE_NEED_AUTOMOUNT and DCACHE_MANAGE_TRANSIT were
handled separately and DCACHE_NEED_AUTOMOUNT was cleared for rootless
multi-mount dentrys following a mount and set again at expire. Not
having to worry about managing that flag was also part of the
optimization.
We could go back to managing DCACHE_NEED_AUTOMOUNT or add a new flag.
I'm not fussy how it's done as long as it works. IIRC there was one
quite convoluted if check (in the expire code) that was removed due to
the optimization.
> Then set again when it expires.
> I'll probably have to understand the code more deeply before I can be sure.
Rootless multi-mount entries have always caused things to be that much
more difficult.
Fyi, here's an example of an autofs map construct that has no root
mount:
root /one server:/export/one \
/two server2:/export/two \
/three server:/export/three
I'll continue, and have a look at the other two patches.
Ian
On Wed, 2014-07-16 at 16:00 +1000, NeilBrown wrote:
> On Wed, 16 Jul 2014 11:24:58 +0800 Ian Kent <[email protected]> wrote:
>
> > On Thu, 2014-07-10 at 09:41 +1000, NeilBrown wrote:
> > > autofs4 currently doesn't support RCU-walk - it immediately
> > > aborts any attempt at RCU-walk to force REF-walk for path name
> > > lookup.
> >
> > As discussed I don't have time to properly test these just now but I'll
> > do my best to review the patches and return to test them later.
>
> Thanks.
>
> >
> > My impression is that you will be submitting these patches rather than
> > expecting me to pick them up and submit them. If that's not what your
> > expecting please let me know.
>
> I had assumed that you would take them as you are listed as the maintainer.
> However if you would like me to send them on I can certainly do that.
> You seem to send via Andrew Morton so I'll do that when they seem to be ready
> if you like.
Either or, we can decide that once we're done with the patches.
I don't maintain an autofs tree because the module isn't volatile enough
to warrant it. I was sloppy with patches I sent to Linus a few times and
Andrew politely asked me to send them via him, and rightly so, ;)
I guess it comes down to how urgent it is to get these merged since I'd
like to hammer them with my submount test before passing them on and at
this point I don't know when I'll have time to work through that.
I could just set aside some time to run the test a few times and not try
and fix any problems at the time. Point being the test has a habit of
exposing problems that can be difficult to work out let alone fix.
Ian
On Mon, 2014-07-14 at 10:53 +1000, NeilBrown wrote:
> Here is a revised version of this one patch.
> This one fixes a problem with refcounts on dentry and adds a comment to
> clarify the behaviour of should_expire().
I had some problems with this patch.
Looked like it got munged by the email client.
I had to get rid of ^M line ends and fix several other translations and
wrapped lines.
Hopefully I didn't mess it up.
>
> thanks,
> NeilBrown
>
>
> From: NeilBrown <[email protected]>Date: Tue, 8 Jul 2014 17:14:53 +1000
> Subject: [PATCH] autofs4: factor should_expire() out of autofs4_expire_indirect.
>
> Future patch will potentially call this twice, so make it
> separate.
>
> Signed-off-by: NeilBrown <[email protected]>
>
> diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
> index 7e2f22ce6954..402ee7f1461a 100644
> --- a/fs/autofs4/expire.c
> +++ b/fs/autofs4/expire.c
> @@ -345,6 +345,89 @@ out:
> return NULL;
> }
>
> +/* Check if 'dentry' should expire, or return a nearby
> + * dentry that is suitable.
> + * If returned dentry is different from arg dentry,
> + * then a dget() reference was taken, else not.
> + */
> +static struct dentry *should_expire(struct dentry *dentry,
> + struct vfsmount *mnt,
> + unsigned long timeout,
> + int how)
> +{
> + int do_now = how & AUTOFS_EXP_IMMEDIATE;
> + int exp_leaves = how & AUTOFS_EXP_LEAVES;
> + struct autofs_info *ino = autofs4_dentry_ino(dentry);
> + unsigned int ino_count;
> +
> + /* No point expiring a pending mount */
> + if (ino->flags & AUTOFS_INF_PENDING)
> + return NULL;
> +
> + /*
> + * Case 1: (i) indirect mount or top level pseudo direct mount
> + * (autofs-4.1).
> + * (ii) indirect mount with offset mount, check the "/"
> + * offset (autofs-5.0+).
> + */
> + if (d_mountpoint(dentry)) {
> + DPRINTK("checking mountpoint %p %.*s",
> + dentry, (int)dentry->d_name.len, dentry->d_name.name);
> +
> + /* Can we umount this guy */
> + if (autofs4_mount_busy(mnt, dentry))
> + return NULL;
> +
> + /* Can we expire this guy */
> + if (autofs4_can_expire(dentry, timeout, do_now))
> + return dentry;
> + return NULL;
> + }
> +
> + if (dentry->d_inode && S_ISLNK(dentry->d_inode->i_mode)) {
> + DPRINTK("checking symlink %p %.*s",
> + dentry, (int)dentry->d_name.len, dentry->d_name.name);
> + /*
> + * A symlink can't be "busy" in the usual sense so
> + * just check last used for expire timeout.
> + */
> + if (autofs4_can_expire(dentry, timeout, do_now))
> + return dentry;
> + return NULL;
> + }
> +
> + if (simple_empty(dentry))
> + return NULL;
> +
> + /* Case 2: tree mount, expire iff entire tree is not busy */
> + if (!exp_leaves) {
> + /* Path walk currently on this dentry? */
> + ino_count = atomic_read(&ino->count) + 1;
> + if (d_count(dentry) > ino_count)
> + return NULL;
> +
> + if (!autofs4_tree_busy(mnt, dentry, timeout, do_now))
> + return dentry;
> + /*
> + * Case 3: pseudo direct mount, expire individual leaves
> + * (autofs-4.1).
> + */
I recommend removing one of the tab stops on this since it relates to
the else case but feels like it's "attached" to the code above the else.
> + } else {
> + /* Path walk currently on this dentry? */
> + struct dentry *expired;
> + ino_count = atomic_read(&ino->count) + 1;
> + if (d_count(dentry) > ino_count)
> + return NULL;
> +
> + expired = autofs4_check_leaves(mnt, dentry, timeout, do_now);
> + if (expired) {
> + if (expired == dentry)
> + dput(dentry);
> + return dentry;
Umm .. I think this should be "return expired".
Otherwise this looks like a straight refactor.
> + }
> + }
> + return NULL;
> +}
> /*
> * Find an eligible tree to time-out
> * A tree is eligible if :-
> @@ -359,11 +442,8 @@ struct dentry *autofs4_expire_indirect(struct super_block *sb,
> unsigned long timeout;
> struct dentry *root = sb->s_root;
> struct dentry *dentry;
> - struct dentry *expired = NULL;
> - int do_now = how & AUTOFS_EXP_IMMEDIATE;
> - int exp_leaves = how & AUTOFS_EXP_LEAVES;
> + struct dentry *expired;
> struct autofs_info *ino;
> - unsigned int ino_count;
>
> if (!root)
> return NULL;
> @@ -374,78 +454,12 @@ struct dentry *autofs4_expire_indirect(struct super_block *sb,
> dentry = NULL;
> while ((dentry = get_next_positive_subdir(dentry, root))) {
> spin_lock(&sbi->fs_lock);
> - ino = autofs4_dentry_ino(dentry);
> - /* No point expiring a pending mount */
> - if (ino->flags & AUTOFS_INF_PENDING)
> - goto next;
> -
> - /*
> - * Case 1: (i) indirect mount or top level pseudo direct mount
> - * (autofs-4.1).
> - * (ii) indirect mount with offset mount, check the "/"
> - * offset (autofs-5.0+).
> - */
> - if (d_mountpoint(dentry)) {
> - DPRINTK("checking mountpoint %p %.*s",
> - dentry, (int)dentry->d_name.len, dentry->d_name.name);
> -
> - /* Can we umount this guy */
> - if (autofs4_mount_busy(mnt, dentry))
> - goto next;
> -
> - /* Can we expire this guy */
> - if (autofs4_can_expire(dentry, timeout, do_now)) {
> - expired = dentry;
> - goto found;
> - }
> - goto next;
> - }
> -
> - if (dentry->d_inode && S_ISLNK(dentry->d_inode->i_mode)) {
> - DPRINTK("checking symlink %p %.*s",
> - dentry, (int)dentry->d_name.len, dentry->d_name.name);
> - /*
> - * A symlink can't be "busy" in the usual sense so
> - * just check last used for expire timeout.
> - */
> - if (autofs4_can_expire(dentry, timeout, do_now)) {
> - expired = dentry;
> - goto found;
> - }
> - goto next;
> - }
> -
> - if (simple_empty(dentry))
> - goto next;
> -
> - /* Case 2: tree mount, expire iff entire tree is not busy */
> - if (!exp_leaves) {
> - /* Path walk currently on this dentry? */
> - ino_count = atomic_read(&ino->count) + 1;
> - if (d_count(dentry) > ino_count)
> - goto next;
> -
> - if (!autofs4_tree_busy(mnt, dentry, timeout, do_now)) {
> - expired = dentry;
> - goto found;
> - }
> - /*
> - * Case 3: pseudo direct mount, expire individual leaves
> - * (autofs-4.1).
> - */
> - } else {
> - /* Path walk currently on this dentry? */
> - ino_count = atomic_read(&ino->count) + 1;
> - if (d_count(dentry) > ino_count)
> - goto next;
> -
> - expired = autofs4_check_leaves(mnt, dentry, timeout, do_now);
> - if (expired) {
> + expired = should_expire(dentry, mnt, timeout, how);
> + if (expired) {
> + if (expired != dentry)
> dput(dentry);
> - goto found;
> - }
> + goto found;
> }
> -next:
> spin_unlock(&sbi->fs_lock);
> }
> return NULL;
On Thu, 2014-07-10 at 09:41 +1000, NeilBrown wrote:
> ->fs_lock protects AUTOFS_INF_EXPIRING. We need to be sure
> that once the flag is set, no new references beneath the dentry
> are taken. So rcu-walk currently needs to take fs_lock before
> checking the flag. This hurts performance.
>
> Change the expiry to a two-stage process.
> First set AUTHFS_INF_NO_RCU which forces any path walk into
> ref-walk mode, then drop the lock and call synchronize_rcu().
> Once that returns we can be sure no rcu-walk is active beneath
> the dentry and we can check reference counts again.
>
> Now during an RCU-walk we can test AUTHFS_INF_EXPIRING without
> taking the lock as along as we test AUTHFS_INF_NO_RCU too.
Couple of typos above, eeek!
> If either are set, we must abort the RCU-walk
> If neither are set, we know that refcounts will be tested again
> after we finish the RCU-walk so we are safe to continue.
I believe the idea is sound and the patch looks good.
Nevertheless I think this is probably the tricky bit and if there is a
problem I'm not seeing it's probably in this patch.
The submount-test will probably help with that.
>
> Signed-off-by: NeilBrown <[email protected]>
> ---
> fs/autofs4/autofs_i.h | 4 ++++
> fs/autofs4/expire.c | 46 ++++++++++++++++++++++++++++++++++++++--------
> 2 files changed, 42 insertions(+), 8 deletions(-)
>
> diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
> index 99dbb05d6148..469724d7568c 100644
> --- a/fs/autofs4/autofs_i.h
> +++ b/fs/autofs4/autofs_i.h
> @@ -79,6 +79,10 @@ struct autofs_info {
> };
>
> #define AUTOFS_INF_EXPIRING (1<<0) /* dentry is in the process of expiring */
> +#define AUTOFS_INF_NO_RCU (1<<1) /* the dentry is being considered
> + * for expiry, so RCU_walk is
> + * not permitted
> + */
> #define AUTOFS_INF_PENDING (1<<2) /* dentry pending mount */
>
> struct autofs_wait_queue {
> diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
> index fb0b5003353f..98a6fd4957f8 100644
> --- a/fs/autofs4/expire.c
> +++ b/fs/autofs4/expire.c
> @@ -333,10 +333,19 @@ struct dentry *autofs4_expire_direct(struct super_block *sb,
> if (ino->flags & AUTOFS_INF_PENDING)
> goto out;
> if (!autofs4_direct_busy(mnt, root, timeout, do_now)) {
> - ino->flags |= AUTOFS_INF_EXPIRING;
> - init_completion(&ino->expire_complete);
> + ino->flags |= AUTOFS_INF_NO_RCU;
> spin_unlock(&sbi->fs_lock);
> - return root;
> + synchronize_rcu();
> + spin_lock(&sbi->fs_lock);
> + if (!autofs4_direct_busy(mnt, root, timeout, do_now)) {
> + ino->flags |= AUTOFS_INF_EXPIRING;
> + smp_mb()
> + ino->flags &= ~AUTOFS_INF_NO_RCU;
> + init_completion(&ino->expire_complete);
> + spin_unlock(&sbi->fs_lock);
> + return root;
> + }
> + ino->flags &= ~AUTOFS_INF_NO_RCU;
> }
> out:
> spin_unlock(&sbi->fs_lock);
> @@ -445,12 +454,29 @@ struct dentry *autofs4_expire_indirect(struct super_block *sb,
> dentry = NULL;
> while ((dentry = get_next_positive_subdir(dentry, root))) {
> spin_lock(&sbi->fs_lock);
> - expired = should_expire(dentry, mnt, timeout, how);
> - if (expired) {
> + ino = autofs4_dentry_ino(dentry);
> + if (ino->flags & AUTOFS_INF_NO_RCU)
> + expired = NULL;
> + else
> + expired = should_expire(dentry, mnt, timeout, how);
> + if (!expired) {
> + spin_unlock(&sbi->fs_lock);
> + continue;
> + }
> + ino = autofs4_dentry_ino(expired);
> + ino->flags |= AUTOFS_INF_NO_RCU;
> + spin_unlock(&sbi->fs_lock);
> + synchronize_rcu();
> + spin_lock(&sbi->fs_lock);
> + if (should_expire(expired, mnt, timeout, how)) {
> if (expired != dentry)
> dput(dentry);
> goto found;
> }
> +
> + ino->flags &= ~AUTOFS_INF_NO_RCU;
> + if (expired != dentry)
> + dput(expired);
> spin_unlock(&sbi->fs_lock);
> }
> return NULL;
> @@ -458,8 +484,9 @@ struct dentry *autofs4_expire_indirect(struct super_block *sb,
> found:
> DPRINTK("returning %p %.*s",
> expired, (int)expired->d_name.len, expired->d_name.name);
> - ino = autofs4_dentry_ino(expired);
> ino->flags |= AUTOFS_INF_EXPIRING;
> + smp_mb()
> + ino->flags &= ~AUTOFS_INF_NO_RCU;
> init_completion(&ino->expire_complete);
> spin_unlock(&sbi->fs_lock);
> spin_lock(&sbi->lookup_lock);
> @@ -479,11 +506,14 @@ int autofs4_expire_wait(struct dentry *dentry, int rcu_walk)
> int status;
>
> /* Block on any pending expire */
> + if (!(ino->flags & (AUTOFS_INF_EXPIRING | AUTOFS_INF_NO_RCU)))
> + return 0;
> +
> + if (rcu_walk)
> + return -ECHILD;
Be nice to add a blank line here.
> spin_lock(&sbi->fs_lock);
> if (ino->flags & AUTOFS_INF_EXPIRING) {
> spin_unlock(&sbi->fs_lock);
> - if (rcu_walk)
> - return -ECHILD;
>
> DPRINTK("waiting for expire %p name=%.*s",
> dentry, dentry->d_name.len, dentry->d_name.name);
>
>
On Wed, 16 Jul 2014 15:50:17 +0800 Ian Kent <[email protected]> wrote:
> > + if (simple_empty(dentry))
> > + return NULL;
> > +
> > + /* Case 2: tree mount, expire iff entire tree is not busy */
> > + if (!exp_leaves) {
> > + /* Path walk currently on this dentry? */
> > + ino_count = atomic_read(&ino->count) + 1;
> > + if (d_count(dentry) > ino_count)
> > + return NULL;
> > +
> > + if (!autofs4_tree_busy(mnt, dentry, timeout, do_now))
> > + return dentry;
> > + /*
> > + * Case 3: pseudo direct mount, expire individual leaves
> > + * (autofs-4.1).
> > + */
>
> I recommend removing one of the tab stops on this since it relates to
> the else case but feels like it's "attached" to the code above the else.
The auto-reindent feature of emacs put it there and I didn't double check.
I've fix it.
>
> > + } else {
> > + /* Path walk currently on this dentry? */
> > + struct dentry *expired;
> > + ino_count = atomic_read(&ino->count) + 1;
> > + if (d_count(dentry) > ino_count)
> > + return NULL;
> > +
> > + expired = autofs4_check_leaves(mnt, dentry, timeout, do_now);
> > + if (expired) {
> > + if (expired == dentry)
> > + dput(dentry);
> > + return dentry;
>
> Umm .. I think this should be "return expired".
Yes, of course it should. Fixed.
>
> Otherwise this looks like a straight refactor.
Thanks,
NeilBrown
On Wed, 2014-07-16 at 14:56 +0800, Ian Kent wrote:
> > That looks a bit messy ... I wonder if we could have a new "ino" flag which
> > says "This dentry is mounted-on if it needs to be. Gets set by ->lookup
> > and cleared by ->d_automount or when ->d_manage returns -EISDIR.
>
> At one point DCACHE_NEED_AUTOMOUNT and DCACHE_MANAGE_TRANSIT were
> handled separately and DCACHE_NEED_AUTOMOUNT was cleared for rootless
> multi-mount dentrys following a mount and set again at expire. Not
> having to worry about managing that flag was also part of the
> optimization.
>
> We could go back to managing DCACHE_NEED_AUTOMOUNT or add a new flag.
> I'm not fussy how it's done as long as it works. IIRC there was one
> quite convoluted if check (in the expire code) that was removed due to
> the optimization.
Umm ... using DCACHE_NEED_AUTOMOUNT rather than a new flag means using
an additional lock so perhaps a new flag would be preferred since
reducing lock overhead was the point of this.
Just a thought
Ian
On Thu, 17 Jul 2014 13:00:56 +0800 Ian Kent <[email protected]> wrote:
> On Wed, 2014-07-16 at 14:56 +0800, Ian Kent wrote:
> > > That looks a bit messy ... I wonder if we could have a new "ino" flag which
> > > says "This dentry is mounted-on if it needs to be. Gets set by ->lookup
> > > and cleared by ->d_automount or when ->d_manage returns -EISDIR.
> >
> > At one point DCACHE_NEED_AUTOMOUNT and DCACHE_MANAGE_TRANSIT were
> > handled separately and DCACHE_NEED_AUTOMOUNT was cleared for rootless
> > multi-mount dentrys following a mount and set again at expire. Not
> > having to worry about managing that flag was also part of the
> > optimization.
> >
> > We could go back to managing DCACHE_NEED_AUTOMOUNT or add a new flag.
> > I'm not fussy how it's done as long as it works. IIRC there was one
> > quite convoluted if check (in the expire code) that was removed due to
> > the optimization.
>
> Umm ... using DCACHE_NEED_AUTOMOUNT rather than a new flag means using
> an additional lock so perhaps a new flag would be preferred since
> reducing lock overhead was the point of this.
Actually, I've managed to convince myself that there is nothing that needs
fixing here.
autofs4 already sets DCACHE_NEED_AUTOMOUNT on any directory that might need
to be mounted on. It does this before the dentry is added to the dcache, so
before rcuwalk could possibly see the dentry.
If __follow_mount_rcu find that the dentry is d_mountpoint(), it will
follow down to the mounted directory, which is always correct.
If it isn't d_mountpoint, then lookup_fast, which is the only caller for
__follow_mount_rcu, will check if DCACHE_NEED_AUTOMOUNT is set, and if so
will goto unlazy, which drops back to refwalk and tries again. So that is
always safe.
So it is currently either correct or safe.
I'm not 100% sure about the v4 code paths, but it seems OK.
If there some documentation about the interactions between the automountd and
the kernel?
It looks like:
With V5, every name in the root directory gets something mounted on it,
either another autofs or the target filesystem.
With V4, you don't mount an autofs onto an autofs, but when a name in the
root is automounted, all the names beneath there are created and the
target filesystems are mounted before the top level automount is
acknowledged.
Does that sound at all right (I suspect I haven't explained it very clearly).
Also I'd like to send a few of the simple patches to Andrew Morton now to get
them out of the way. Then the patches which need more thought and testing
can wait until we are ready. I'm in no hurry - as long as there is
expectation of forward progress I don't need to target any particular release.
As well as the first two that you provided an Acked-by for, I'd like to send
akpm:
- A revised version of the last patch which also avoids the spinlock for
autofs4_lookup_active
- A patch which removes some more unused 'inlines' from the header file.
- a typo-fix.
I've included the changed one inline below. If you are happy I'll send them
all to Andrew.
Thanks,
NeilBrown
From 31b8be2565c72b053d0a92ec4fb2bfadd9545557 Mon Sep 17 00:00:00 2001
From: NeilBrown <[email protected]>
Date: Wed, 9 Jul 2014 12:33:15 +1000
Subject: [PATCH 1/3] autofs4: don't take spinlock when not needed in
autofs4_lookup_expiring
If the expiring_list is empty, we can avoid a costly spinlock
in the rcu-walk path through autofs4_d_manage (once the rest
of the path becomes rcu-walk friendly).
Signed-off-by: NeilBrown <[email protected]>
Acked-by: Ian Kent <[email protected]>
---
fs/autofs4/root.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
index cc87c1abac97..fb202cadd4b3 100644
--- a/fs/autofs4/root.c
+++ b/fs/autofs4/root.c
@@ -166,8 +166,10 @@ static struct dentry *autofs4_lookup_active(struct dentry *dentry)
const unsigned char *str = name->name;
struct list_head *p, *head;
- spin_lock(&sbi->lookup_lock);
head = &sbi->active_list;
+ if (list_empty(head))
+ return NULL;
+ spin_lock(&sbi->lookup_lock);
list_for_each(p, head) {
struct autofs_info *ino;
struct dentry *active;
@@ -218,8 +220,10 @@ static struct dentry *autofs4_lookup_expiring(struct dentry *dentry)
const unsigned char *str = name->name;
struct list_head *p, *head;
- spin_lock(&sbi->lookup_lock);
head = &sbi->expiring_list;
+ if (list_empty(head))
+ return NULL;
+ spin_lock(&sbi->lookup_lock);
list_for_each(p, head) {
struct autofs_info *ino;
struct dentry *expiring;
--
2.0.0
From 7ea8366c9b2022e05c4a5bceee1d495bc6517bee Mon Sep 17 00:00:00 2001
From: NeilBrown <[email protected]>
Date: Thu, 17 Jul 2014 14:49:37 +1000
Subject: [PATCH 2/3] AUTOFS: remove some unused inline functions.
{__,}manage_dentry_{set,clear}_{automount,transit}
are 4 unused inline functions. Discard them.
Signed-off-by: NeilBrown <[email protected]>
---
fs/autofs4/autofs_i.h | 49 -------------------------------------------------
1 file changed, 49 deletions(-)
diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
index 22a280151e45..9e359fb20c0a 100644
--- a/fs/autofs4/autofs_i.h
+++ b/fs/autofs4/autofs_i.h
@@ -177,55 +177,6 @@ extern const struct file_operations autofs4_root_operations;
extern const struct dentry_operations autofs4_dentry_operations;
/* VFS automount flags management functions */
-
-static inline void __managed_dentry_set_automount(struct dentry *dentry)
-{
- dentry->d_flags |= DCACHE_NEED_AUTOMOUNT;
-}
-
-static inline void managed_dentry_set_automount(struct dentry *dentry)
-{
- spin_lock(&dentry->d_lock);
- __managed_dentry_set_automount(dentry);
- spin_unlock(&dentry->d_lock);
-}
-
-static inline void __managed_dentry_clear_automount(struct dentry *dentry)
-{
- dentry->d_flags &= ~DCACHE_NEED_AUTOMOUNT;
-}
-
-static inline void managed_dentry_clear_automount(struct dentry *dentry)
-{
- spin_lock(&dentry->d_lock);
- __managed_dentry_clear_automount(dentry);
- spin_unlock(&dentry->d_lock);
-}
-
-static inline void __managed_dentry_set_transit(struct dentry *dentry)
-{
- dentry->d_flags |= DCACHE_MANAGE_TRANSIT;
-}
-
-static inline void managed_dentry_set_transit(struct dentry *dentry)
-{
- spin_lock(&dentry->d_lock);
- __managed_dentry_set_transit(dentry);
- spin_unlock(&dentry->d_lock);
-}
-
-static inline void __managed_dentry_clear_transit(struct dentry *dentry)
-{
- dentry->d_flags &= ~DCACHE_MANAGE_TRANSIT;
-}
-
-static inline void managed_dentry_clear_transit(struct dentry *dentry)
-{
- spin_lock(&dentry->d_lock);
- __managed_dentry_clear_transit(dentry);
- spin_unlock(&dentry->d_lock);
-}
-
static inline void __managed_dentry_set_managed(struct dentry *dentry)
{
dentry->d_flags |= (DCACHE_NEED_AUTOMOUNT|DCACHE_MANAGE_TRANSIT);
--
2.0.0
From e1b74448bf2d0ab7e3c5d5b219d7a938bfd82d99 Mon Sep 17 00:00:00 2001
From: NeilBrown <[email protected]>
Date: Thu, 17 Jul 2014 14:58:08 +1000
Subject: [PATCH 3/3] AUTOFS: comment typo: remove a a doubled word.
Signed-off-by: NeilBrown <[email protected]>
---
fs/autofs4/root.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
index fb202cadd4b3..cdb25ebccc4c 100644
--- a/fs/autofs4/root.c
+++ b/fs/autofs4/root.c
@@ -377,7 +377,7 @@ static struct vfsmount *autofs4_d_automount(struct path *path)
* this because the leaves of the directory tree under the
* mount never trigger mounts themselves (they have an autofs
* trigger mount mounted on them). But v4 pseudo direct mounts
- * do need the leaves to to trigger mounts. In this case we
+ * do need the leaves to trigger mounts. In this case we
* have no choice but to use the list_empty() check and
* require user space behave.
*/
--
2.0.0
On Thu, 2014-07-17 at 18:04 +1000, NeilBrown wrote:
> On Thu, 17 Jul 2014 13:00:56 +0800 Ian Kent <[email protected]> wrote:
>
> > On Wed, 2014-07-16 at 14:56 +0800, Ian Kent wrote:
> > > > That looks a bit messy ... I wonder if we could have a new "ino" flag which
> > > > says "This dentry is mounted-on if it needs to be. Gets set by ->lookup
> > > > and cleared by ->d_automount or when ->d_manage returns -EISDIR.
> > >
> > > At one point DCACHE_NEED_AUTOMOUNT and DCACHE_MANAGE_TRANSIT were
> > > handled separately and DCACHE_NEED_AUTOMOUNT was cleared for rootless
> > > multi-mount dentrys following a mount and set again at expire. Not
> > > having to worry about managing that flag was also part of the
> > > optimization.
> > >
> > > We could go back to managing DCACHE_NEED_AUTOMOUNT or add a new flag.
> > > I'm not fussy how it's done as long as it works. IIRC there was one
> > > quite convoluted if check (in the expire code) that was removed due to
> > > the optimization.
> >
> > Umm ... using DCACHE_NEED_AUTOMOUNT rather than a new flag means using
> > an additional lock so perhaps a new flag would be preferred since
> > reducing lock overhead was the point of this.
>
> Actually, I've managed to convince myself that there is nothing that needs
> fixing here.
>
> autofs4 already sets DCACHE_NEED_AUTOMOUNT on any directory that might need
> to be mounted on. It does this before the dentry is added to the dcache, so
> before rcuwalk could possibly see the dentry.
Yep, that's right and that takes care of indirect mounts, but also have
a quick look at fs/autofs4/inode.c:autofs4_fill_super() which will set
the flags on the root dentry for direct mounts.
Actually they are often called triggers in the code as the so called
direct and offset mounts have the same semantics, so in
autofs4_fill_super() you'll see:
if (autofs_type_trigger(sbi->type))
__managed_dentry_set_managed(root);
which is what I'm referring to.
>
> If __follow_mount_rcu find that the dentry is d_mountpoint(), it will
> follow down to the mounted directory, which is always correct.
> If it isn't d_mountpoint, then lookup_fast, which is the only caller for
> __follow_mount_rcu, will check if DCACHE_NEED_AUTOMOUNT is set, and if so
> will goto unlazy, which drops back to refwalk and tries again. So that is
> always safe.
That all sound right having now looked at the rcu-walk code.
>
> So it is currently either correct or safe.
>
> I'm not 100% sure about the v4 code paths, but it seems OK.
> If there some documentation about the interactions between the automountd and
> the kernel?
Not really, only the ioctl interaction described in
Documentation/filesystems/autofs4-mount-control.txt
> It looks like:
> With V5, every name in the root directory gets something mounted on it,
> either another autofs or the target filesystem.
> With V4, you don't mount an autofs onto an autofs, but when a name in the
> root is automounted, all the names beneath there are created and the
> target filesystems are mounted before the top level automount is
> acknowledged.
>
> Does that sound at all right (I suspect I haven't explained it very clearly).
Mostly, but let me try and simplify it (or perhaps confuse you even more
and bore you as a bonus, ;)) and offer a broader picture.
I always relate to this in terms of automount maps and some map features
weren't available in version 4, in particular direct mounts. That lead
to some of the hard to understand code in the kernel, more so because
there's virtually nothing left of it in user space now.
Basically all autofs mounted file systems are either indirect or direct
mounts that are driven by automount maps.
These maps (stored in a file, nis, ldap or other) are associated with
autofs mounts by the master map (poor example below).
Indirect mounts have a mounted autofs filesystem and mounts are entries
in the root of that file system as enumerated by map entry keys (or
matched by a wildcard map entry) in the map. In this case the keys are a
single directory component.
This is where the everything in the root directory gets mounted comes
from, but there are exceptions.
Direct mounts are distinct autofs filesystem mounts that correspond to
distinct map entries in automount maps. Each direct mount map entry
represents a full path that can trigger a mount directly. But there are
exception here too.
This is where the setting in autofs4_fill_super() comes from.
Indirect maps can have one map only associated with a specific autofs
mounted filesystem whereas direct mount maps can have multiple entries
in the master map and therefore can have multiple maps associated with
them. Direct mount entries all have a pseudo mount path of "/-" in the
master map.
For example the master map used by autofs connectathon, before it gets
translated by the test driver script, is (ignore the fact that "_" is
used in map names here, even though the Linux convention is to use "."):
/net -hosts -nosuid
#/home auto_home -nosuid
/- AUTOMAP_DIR/auto_dcthon
/- AUTOMAP_DIR/auto_test3_direct
/- AUTOMAP_DIR/auto_test4_direct
AUTO_CLIENT_MNTPNT/iparse AUTOMAP_DIR/auto_icthon
AUTO_CLIENT_MNTPNT/eparse AUTOMAP_DIR/auto_ecthon
AUTO_CLIENT_MNTPNT/oparse AUTOMAP_DIR/auto_octhon
AUTO_CLIENT_MNTPNT/test1 AUTOMAP_DIR/auto_icthon
AUTO_CLIENT_MNTPNT/test2 AUTOMAP_DIR/auto_ecthon
AUTO_CLIENT_MNTPNT/test4/indirect AUTOMAP_DIR/auto_icthon
AUTO_CLIENT_MNTPNT/options AUTOMAP_DIR/auto_octhon
AUTO_CLIENT_MNTPNT/nested AUTOMAP_DIR/auto_nested
AUTO_CLIENT_MNTPNT/badnames AUTOMAP_DIR/auto_badnames
AUTO_CLIENT_MNTPNT/vers AUTOMAP_DIR/auto_octhon
+auto_master
I'm thinking this is much more than you want or need so I'll cut it
short right now.
Two notable exceptions to the above common map entry types are:
1) multi-mount map entries (as I briefly described before) that can give
rise to the rootless mount trees we have special handling for in kernel.
They can also be used by direct mounts but they always have at least an
autofs file system mounted at the root so are simpler to handle.
2) sub-mounts which are always indirect mounts, mounted within an autofs
mounted filesystem (hence sub-mount). Typically you'll see those as map
entries that have -fstype=autofs in the map entry.
So 1) gives rise to those pesky special cases you see in the kernel
requiring walks to pass through them and 2) gives rise to special cases
mostly in the expire code because they are managed almost entirely by
user space.
And finally the inbuilt hosts map (often mounted on /net) is essentially
a multi-mount map entry of the exports of each host.
So that's version 5.
The differences between v4 and v5 are few.
Most notably direct mounts had no kernel support (and were not
originally implemented in the daemon at all) and this is where the
pseudo-direct mounts of version 4 came from. They where made up of a
directory tree corresponding to the map entries and a convoluted path
matching was done by the daemon to mount and expire leaves of the
directory tree.
And in version 4 multi-mount map entries were mounted and expired as a
whole whereas in version 5 they are mounted and expired on demand (well
almost anyway) using mount triggers (the offset mount type seen in the
kernel). This on demand functionality gives rise to the "mount the base,
mount the triggers inside (and expire in reverse)" that I think you
referred to above.
Other than that there isn't really much difference (although I've
probably forgotten some).
>
>
> Also I'd like to send a few of the simple patches to Andrew Morton now to get
> them out of the way. Then the patches which need more thought and testing
> can wait until we are ready. I'm in no hurry - as long as there is
> expectation of forward progress I don't need to target any particular release.
>
> As well as the first two that you provided an Acked-by for, I'd like to send
> akpm:
Feel free to forward those patches and add any attribution you think is
warranted, Acked-by, Reviewed-by or even Signed-off-by.
I think I've made forward progress with my ref-counting problem so, with
some luck, I might be back on track by the end of the weekend, fingers
crossed.
>
> - A revised version of the last patch which also avoids the spinlock for
> autofs4_lookup_active
> - A patch which removes some more unused 'inlines' from the header file.
> - a typo-fix.
>
> I've included the changed one inline below. If you are happy I'll send them
> all to Andrew.
>
> Thanks,
> NeilBrown
>
>
> From 31b8be2565c72b053d0a92ec4fb2bfadd9545557 Mon Sep 17 00:00:00 2001
> From: NeilBrown <[email protected]>
> Date: Wed, 9 Jul 2014 12:33:15 +1000
> Subject: [PATCH 1/3] autofs4: don't take spinlock when not needed in
> autofs4_lookup_expiring
>
> If the expiring_list is empty, we can avoid a costly spinlock
> in the rcu-walk path through autofs4_d_manage (once the rest
> of the path becomes rcu-walk friendly).
>
> Signed-off-by: NeilBrown <[email protected]>
> Acked-by: Ian Kent <[email protected]>
> ---
> fs/autofs4/root.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
> index cc87c1abac97..fb202cadd4b3 100644
> --- a/fs/autofs4/root.c
> +++ b/fs/autofs4/root.c
> @@ -166,8 +166,10 @@ static struct dentry *autofs4_lookup_active(struct dentry *dentry)
> const unsigned char *str = name->name;
> struct list_head *p, *head;
>
> - spin_lock(&sbi->lookup_lock);
> head = &sbi->active_list;
> + if (list_empty(head))
> + return NULL;
> + spin_lock(&sbi->lookup_lock);
> list_for_each(p, head) {
> struct autofs_info *ino;
> struct dentry *active;
> @@ -218,8 +220,10 @@ static struct dentry *autofs4_lookup_expiring(struct dentry *dentry)
> const unsigned char *str = name->name;
> struct list_head *p, *head;
>
> - spin_lock(&sbi->lookup_lock);
> head = &sbi->expiring_list;
> + if (list_empty(head))
> + return NULL;
> + spin_lock(&sbi->lookup_lock);
> list_for_each(p, head) {
> struct autofs_info *ino;
> struct dentry *expiring;
Yep, looks fine.
> --
> 2.0.0
>
> From 7ea8366c9b2022e05c4a5bceee1d495bc6517bee Mon Sep 17 00:00:00 2001
> From: NeilBrown <[email protected]>
> Date: Thu, 17 Jul 2014 14:49:37 +1000
> Subject: [PATCH 2/3] AUTOFS: remove some unused inline functions.
>
> {__,}manage_dentry_{set,clear}_{automount,transit}
>
> are 4 unused inline functions. Discard them.
>
> Signed-off-by: NeilBrown <[email protected]>
> ---
> fs/autofs4/autofs_i.h | 49 -------------------------------------------------
> 1 file changed, 49 deletions(-)
>
> diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
> index 22a280151e45..9e359fb20c0a 100644
> --- a/fs/autofs4/autofs_i.h
> +++ b/fs/autofs4/autofs_i.h
> @@ -177,55 +177,6 @@ extern const struct file_operations autofs4_root_operations;
> extern const struct dentry_operations autofs4_dentry_operations;
>
> /* VFS automount flags management functions */
> -
> -static inline void __managed_dentry_set_automount(struct dentry *dentry)
> -{
> - dentry->d_flags |= DCACHE_NEED_AUTOMOUNT;
> -}
> -
> -static inline void managed_dentry_set_automount(struct dentry *dentry)
> -{
> - spin_lock(&dentry->d_lock);
> - __managed_dentry_set_automount(dentry);
> - spin_unlock(&dentry->d_lock);
> -}
> -
> -static inline void __managed_dentry_clear_automount(struct dentry *dentry)
> -{
> - dentry->d_flags &= ~DCACHE_NEED_AUTOMOUNT;
> -}
> -
> -static inline void managed_dentry_clear_automount(struct dentry *dentry)
> -{
> - spin_lock(&dentry->d_lock);
> - __managed_dentry_clear_automount(dentry);
> - spin_unlock(&dentry->d_lock);
> -}
> -
> -static inline void __managed_dentry_set_transit(struct dentry *dentry)
> -{
> - dentry->d_flags |= DCACHE_MANAGE_TRANSIT;
> -}
> -
> -static inline void managed_dentry_set_transit(struct dentry *dentry)
> -{
> - spin_lock(&dentry->d_lock);
> - __managed_dentry_set_transit(dentry);
> - spin_unlock(&dentry->d_lock);
> -}
> -
> -static inline void __managed_dentry_clear_transit(struct dentry *dentry)
> -{
> - dentry->d_flags &= ~DCACHE_MANAGE_TRANSIT;
> -}
> -
> -static inline void managed_dentry_clear_transit(struct dentry *dentry)
> -{
> - spin_lock(&dentry->d_lock);
> - __managed_dentry_clear_transit(dentry);
> - spin_unlock(&dentry->d_lock);
> -}
> -
> static inline void __managed_dentry_set_managed(struct dentry *dentry)
> {
> dentry->d_flags |= (DCACHE_NEED_AUTOMOUNT|DCACHE_MANAGE_TRANSIT);
Yeah, I kept these around deliberately and some were never used and
others became unused over time.
I can always put them back if I need any of them.
> --
> 2.0.0
>
> From e1b74448bf2d0ab7e3c5d5b219d7a938bfd82d99 Mon Sep 17 00:00:00 2001
> From: NeilBrown <[email protected]>
> Date: Thu, 17 Jul 2014 14:58:08 +1000
> Subject: [PATCH 3/3] AUTOFS: comment typo: remove a a doubled word.
>
> Signed-off-by: NeilBrown <[email protected]>
> ---
> fs/autofs4/root.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
> index fb202cadd4b3..cdb25ebccc4c 100644
> --- a/fs/autofs4/root.c
> +++ b/fs/autofs4/root.c
> @@ -377,7 +377,7 @@ static struct vfsmount *autofs4_d_automount(struct path *path)
> * this because the leaves of the directory tree under the
> * mount never trigger mounts themselves (they have an autofs
> * trigger mount mounted on them). But v4 pseudo direct mounts
> - * do need the leaves to to trigger mounts. In this case we
> + * do need the leaves to trigger mounts. In this case we
> * have no choice but to use the list_empty() check and
> * require user space behave.
> */
Right, not much to say about this one, ;)
Ian
On Thu, 17 Jul 2014 18:17:32 +0800 Ian Kent <[email protected]> wrote:
>
> On Thu, 2014-07-17 at 18:04 +1000, NeilBrown wrote:
> > If there some documentation about the interactions between the automountd and
> > the kernel?
>
> Not really, only the ioctl interaction described in
> Documentation/filesystems/autofs4-mount-control.txt
>
> > It looks like:
> > With V5, every name in the root directory gets something mounted on it,
> > either another autofs or the target filesystem.
> > With V4, you don't mount an autofs onto an autofs, but when a name in the
> > root is automounted, all the names beneath there are created and the
> > target filesystems are mounted before the top level automount is
> > acknowledged.
> >
> > Does that sound at all right (I suspect I haven't explained it very clearly).
>
> Mostly, but let me try and simplify it (or perhaps confuse you even more
> and bore you as a bonus, ;)) and offer a broader picture.
Thanks for going into all that detail! Having read that, and read the code
(a few times) and having tried to document it myself I think I really
understand it all now.
I wanted to be sure I understood from the perspective of what the module
actually supports rather than just how the automount daemon uses it, so I put
together a document describing the module. At 3000 words, I'm not sure
whether to call it "verbose" or "thorough". I'll post it separately.
I did discover one thing that is important for the RCU-walk work.
If you have a "root-less multimount" then the top-level directory will have
DCACHE_NEED_AUTOMOUNT set even though it is not a mount-trap.
As it contains a sub-directory (or more), d_manage will return -EISDIR so
normal REF-walking will side-step d_automount and it will be treated as a
normal directory.
However in RCU-walk mode, d_manage cannot usefully return -EISDIR so the VFS
will always drop into REF-walk mode, which negates some of the value of my
RCU-walk patches.
I see two ways to fix this:
1- clear DCACHE_NEED_AUTOMOUNT when creating a subdir or symlink, just like
we do with protocol-version 4
2- teach the VFS to accept -EISDIR from d_manage and to ignore
DCACHE_NEED_AUTOMOUNT in that case. I have a patch to do this and it is
fairly straight forward.
I think I prefer 2 (it seems easier to document:-) but thought I'd ask if you
have an opinion before I post it (I haven't even tested it properly yet).
Thanks,
NeilBrown
On Tue, 2014-07-29 at 11:51 +1000, NeilBrown wrote:
> On Thu, 17 Jul 2014 18:17:32 +0800 Ian Kent <[email protected]> wrote:
>
> >
> > On Thu, 2014-07-17 at 18:04 +1000, NeilBrown wrote:
> > > If there some documentation about the interactions between the automountd and
> > > the kernel?
> >
> > Not really, only the ioctl interaction described in
> > Documentation/filesystems/autofs4-mount-control.txt
> >
> > > It looks like:
> > > With V5, every name in the root directory gets something mounted on it,
> > > either another autofs or the target filesystem.
> > > With V4, you don't mount an autofs onto an autofs, but when a name in the
> > > root is automounted, all the names beneath there are created and the
> > > target filesystems are mounted before the top level automount is
> > > acknowledged.
> > >
> > > Does that sound at all right (I suspect I haven't explained it very clearly).
> >
> > Mostly, but let me try and simplify it (or perhaps confuse you even more
> > and bore you as a bonus, ;)) and offer a broader picture.
>
> Thanks for going into all that detail! Having read that, and read the code
> (a few times) and having tried to document it myself I think I really
> understand it all now.
Glad you appreciated the effort.
>
> I wanted to be sure I understood from the perspective of what the module
> actually supports rather than just how the automount daemon uses it, so I put
> together a document describing the module. At 3000 words, I'm not sure
> whether to call it "verbose" or "thorough". I'll post it separately.
Indeed, that's very sensible since it can easily be used by others.
I still way behind and I'm ill, in fact the whole family is ill (winter
here as you know), so I'll have a look at what you've put together soon
as I can.
FWICS you've put a lot of effort into writing it and I appreciate it
very much.
Once we're both happy with it do you think it should go somewhere in
Documents in the kernel tree or did you have somewhere else in mind?
>
> I did discover one thing that is important for the RCU-walk work.
> If you have a "root-less multimount" then the top-level directory will have
> DCACHE_NEED_AUTOMOUNT set even though it is not a mount-trap.
> As it contains a sub-directory (or more), d_manage will return -EISDIR so
> normal REF-walking will side-step d_automount and it will be treated as a
> normal directory.
> However in RCU-walk mode, d_manage cannot usefully return -EISDIR so the VFS
> will always drop into REF-walk mode, which negates some of the value of my
> RCU-walk patches.
Yes, think we discussed that before.
>
> I see two ways to fix this:
> 1- clear DCACHE_NEED_AUTOMOUNT when creating a subdir or symlink, just like
> we do with protocol-version 4
> 2- teach the VFS to accept -EISDIR from d_manage and to ignore
> DCACHE_NEED_AUTOMOUNT in that case. I have a patch to do this and it is
> fairly straight forward.
>
> I think I prefer 2 (it seems easier to document:-) but thought I'd ask if you
> have an opinion before I post it (I haven't even tested it properly yet).
I definitely prefer option 2 also.
Trying to juggle the flag adds at least one seriously ugly test in the
expire code and the path walk code and likely eliminates the ability to
use the optimization of not calling ->d_automount() for these.
Keep in mind that there's also the multiple namespace problem I
mentioned before.
d_mountpoint() says, "is this dentry mounted (in any namespace)" which
is clearly unreliable.
Since autofs path walks were always done in ref-walk mode I haven't
looked at how this would be done in rcu-walk mode but in ref-walk mode I
believe the only solution is to use lookup_mnt() and that won't work in
rcu-walk and would require kicking the process back to ref-walk mode.
Ian