2013-10-15 20:40:00

by J. Bruce Fields

[permalink] [raw]
Subject: simplify reconnecting dentries looked up by filehandle

When we lookup a filehandle for a directory not already in the dentry
cache, fs/exportfs/expfs.c:reconnect_path() is what populates the dentry
cache with the directory and its parents.

The current code looks more complicated than necessary:

- after looking up the parent, it searches the parent directory
for our target dentry. If that search fails with -ENOENT, it
retries up to 10 times. I don't understand why. This should
only happen if there's a concurrent rename or rmdir, in which
case that concurrent operation should have reconnected the
dentry for us, and we're done.

- each time it succesfully connects a dentry to its parent, it
restarts from scratch with the original dentry. Why not
continue working with the parent we just found?

Have I missed some reason that we need the more complicated approach?

Patches showing my attempted simplification follow.

Most of the work is in the last patch (which I didn't manage to break up
as well as I'd like), but there's also a smaller optimization in the
first patch.

I tested performance with a script that creates an N-deep directory
tree, gets a filehandle for the bottom directory, writes 2 to
/proc/sys/vm/drop_caches, then times an open_by_handle_at() of the
filehandle. Code at

git://linux-nfs.org/~bfields/fhtests.git

For directories of various depths, some example observed times (median
results of 3 similar runs, in seconds), were:

depth: 8000 2000 200
no patches: 11 0.7 0.02
first patch: 6 0.4 0.01
all patches: 0.1 0.03 0.01

For depths < 2000 I used an ugly hack to shrink_slab_node() to force
drop_caches to free more dentries. Difference look lost in the noise
for much smaller depths.

Nesting that deep sounds crazy to me, and cold lookup of a
filehandle isn't the common case. But maybe it's worth not having to
worry about the awful performance in these pathalogical cases. And it
does simplify the code.

--b.



2013-10-15 20:40:01

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 2/5] exportfs: move most of reconnect_path to helper function

From: "J. Bruce Fields" <[email protected]>

Just cleanup, no change in functionality.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/exportfs/expfs.c | 147 +++++++++++++++++++++++++++------------------------
1 file changed, 79 insertions(+), 68 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index 455b0bb..63996d2 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -107,6 +107,82 @@ static void clear_disconnected(struct dentry *dentry)
}

/*
+ * Return the parent directory on success.
+ *
+ * Return NULL to keep trying.
+ *
+ * Otherwise return an error.
+ */
+static int reconnect_one(struct vfsmount *mnt, struct dentry *pd, char *nbuf, int *noprogress)
+{
+ struct dentry *ppd;
+ struct dentry *npd;
+ int err;
+ /*
+ * Getting the parent can't be supported generically, the
+ * locking is too icky.
+ *
+ * If it can't be done, we just return EACCES. If you were
+ * depending on the dcache finding the parent for you, you lose
+ * if there's a reboot or inodes get flushed.
+ */
+ ppd = ERR_PTR(-EACCES);
+
+ mutex_lock(&pd->d_inode->i_mutex);
+ if (mnt->mnt_sb->s_export_op->get_parent)
+ ppd = mnt->mnt_sb->s_export_op->get_parent(pd);
+ mutex_unlock(&pd->d_inode->i_mutex);
+
+ if (IS_ERR(ppd)) {
+ err = PTR_ERR(ppd);
+ dprintk("%s: get_parent of %ld failed, err %d\n",
+ __func__, dentry->d_inode->i_ino, err);
+ return err;
+ }
+
+ dprintk("%s: find name of %lu in %lu\n", __func__,
+ dentry->d_inode->i_ino, parent->d_inode->i_ino);
+ err = exportfs_get_name(mnt, ppd, nbuf, pd);
+ if (err) {
+ dput(ppd);
+ if (err == -ENOENT)
+ /* some race between get_parent and
+ * get_name? just try again
+ */
+ return 0;
+ return err;
+ }
+ dprintk("%s: found name: %s\n", __func__, nbuf);
+ mutex_lock(&ppd->d_inode->i_mutex);
+ npd = lookup_one_len(nbuf, ppd, strlen(nbuf));
+ mutex_unlock(&ppd->d_inode->i_mutex);
+ if (IS_ERR(npd)) {
+ err = PTR_ERR(npd);
+ dprintk("%s: lookup failed: %d\n",
+ __func__, err);
+ dput(ppd);
+ return err;
+ }
+ /* we didn't really want npd, we really wanted
+ * a side-effect of the lookup.
+ * hopefully, npd == pd, though it isn't really
+ * a problem if it isn't
+ */
+ if (npd == pd)
+ *noprogress = 0;
+ else
+ printk("%s: npd != pd\n", __func__);
+ dput(npd);
+ dput(ppd);
+ if (IS_ROOT(pd)) {
+ /* something went wrong, we have to give up */
+ dput(pd);
+ return -ESTALE;
+ }
+ return 0;
+}
+
+/*
* Make sure target_dir is fully connected to the dentry tree.
*
* It may already be, as the flag isn't always updated when connection happens.
@@ -140,75 +216,10 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
/*
* We have hit the top of a disconnected path, try to
* find parent and connect.
- *
- * Racing with some other process renaming a directory
- * isn't much of a problem here. If someone renames
- * the directory, it will end up properly connected,
- * which is what we want
- *
- * Getting the parent can't be supported generically,
- * the locking is too icky.
- *
- * Instead we just return EACCES. If server reboots
- * or inodes get flushed, you lose
- */
- struct dentry *ppd = ERR_PTR(-EACCES);
- struct dentry *npd;
-
- mutex_lock(&pd->d_inode->i_mutex);
- if (mnt->mnt_sb->s_export_op->get_parent)
- ppd = mnt->mnt_sb->s_export_op->get_parent(pd);
- mutex_unlock(&pd->d_inode->i_mutex);
-
- if (IS_ERR(ppd)) {
- err = PTR_ERR(ppd);
- dprintk("%s: get_parent of %ld failed, err %d\n",
- __func__, pd->d_inode->i_ino, err);
- dput(pd);
- break;
- }
-
- dprintk("%s: find name of %lu in %lu\n", __func__,
- pd->d_inode->i_ino, ppd->d_inode->i_ino);
- err = exportfs_get_name(mnt, ppd, nbuf, pd);
- if (err) {
- dput(ppd);
- dput(pd);
- if (err == -ENOENT)
- /* some race between get_parent and
- * get_name? just try again
- */
- continue;
- break;
- }
- dprintk("%s: found name: %s\n", __func__, nbuf);
- mutex_lock(&ppd->d_inode->i_mutex);
- npd = lookup_one_len(nbuf, ppd, strlen(nbuf));
- mutex_unlock(&ppd->d_inode->i_mutex);
- if (IS_ERR(npd)) {
- err = PTR_ERR(npd);
- dprintk("%s: lookup failed: %d\n",
- __func__, err);
- dput(ppd);
- dput(pd);
- break;
- }
- /* we didn't really want npd, we really wanted
- * a side-effect of the lookup.
- * hopefully, npd == pd, though it isn't really
- * a problem if it isn't
*/
- if (npd == pd)
- noprogress = 0;
- else
- printk("%s: npd != pd\n", __func__);
- dput(npd);
- dput(ppd);
- if (IS_ROOT(pd)) {
- /* something went wrong, we have to give up */
- dput(pd);
- break;
- }
+ err = reconnect_one(mnt, pd, nbuf, &noprogress);
+ if (err)
+ return err;
}
dput(pd);
}
--
1.7.9.5


2013-10-15 20:40:01

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 1/5] exportfs: clear DISCONNECTED on all parents sooner

From: "J. Bruce Fields" <[email protected]>

Once we've found any connected parent, we know all our parents are
connected--that's true even if there's a concurrent rename. May as well
clear them all at once.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/exportfs/expfs.c | 24 ++++++++++++++++++------
1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index c43fe9b..455b0bb 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -90,6 +90,22 @@ find_disconnected_root(struct dentry *dentry)
return dentry;
}

+static void clear_disconnected(struct dentry *dentry)
+{
+ dget(dentry);
+ while (dentry->d_flags & DCACHE_DISCONNECTED) {
+ struct dentry *parent = dget_parent(dentry);
+
+ spin_lock(&dentry->d_lock);
+ dentry->d_flags &= ~DCACHE_DISCONNECTED;
+ spin_unlock(&dentry->d_lock);
+
+ dput(dentry);
+ dentry = parent;
+ }
+ dput(dentry);
+}
+
/*
* Make sure target_dir is fully connected to the dentry tree.
*
@@ -114,15 +130,11 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)

if (!IS_ROOT(pd)) {
/* must have found a connected parent - great */
- spin_lock(&pd->d_lock);
- pd->d_flags &= ~DCACHE_DISCONNECTED;
- spin_unlock(&pd->d_lock);
+ clear_disconnected(target_dir);
noprogress = 0;
} else if (pd == mnt->mnt_sb->s_root) {
printk(KERN_ERR "export: Eeek filesystem root is not connected, impossible\n");
- spin_lock(&pd->d_lock);
- pd->d_flags &= ~DCACHE_DISCONNECTED;
- spin_unlock(&pd->d_lock);
+ clear_disconnected(target_dir);
noprogress = 0;
} else {
/*
--
1.7.9.5


2013-10-15 20:40:02

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 5/5] exportfs: fix quadratic behavior in filehandle lookup

From: "J. Bruce Fields" <[email protected]>

We shouldn't really have to re-lookup the parents on each pass through.
This should make behavior linear instead of quadratic as a function of
directory depth.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/exportfs/expfs.c | 105 +++++++++++++++++++--------------------------------
1 file changed, 39 insertions(+), 66 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index afa7c9c..08544ae 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -69,11 +69,7 @@ find_acceptable_alias(struct dentry *result,
return NULL;
}

-/*
- * Find root of a disconnected subtree and return a reference to it.
- */
-static struct dentry *
-find_disconnected_root(struct dentry *dentry)
+static bool dentry_connected(struct dentry *dentry)
{
dget(dentry);
while (!IS_ROOT(dentry)) {
@@ -81,13 +77,14 @@ find_disconnected_root(struct dentry *dentry)

if (!(parent->d_flags & DCACHE_DISCONNECTED)) {
dput(parent);
- break;
+ dput(dentry);
+ return true;
}

dput(dentry);
dentry = parent;
}
- return dentry;
+ return false;
}

static void clear_disconnected(struct dentry *dentry)
@@ -109,11 +106,13 @@ static void clear_disconnected(struct dentry *dentry)
/*
* Return the parent directory on success.
*
- * Return NULL to keep trying.
+ * If it races with a rename or unlink, assume the other operation
+ * connected it, but return NULL to indicate the caller should check
+ * this just to make sure the filesystem isn't nuts.
*
* Otherwise return an error.
*/
-static int reconnect_one(struct vfsmount *mnt, struct dentry *dentry, char *nbuf, int *noprogress)
+static struct dentry *reconnect_one(struct vfsmount *mnt, struct dentry *dentry, char *nbuf)
{
struct dentry *parent;
struct dentry *tmp;
@@ -137,20 +136,17 @@ static int reconnect_one(struct vfsmount *mnt, struct dentry *dentry, char *nbuf
err = PTR_ERR(parent);
dprintk("%s: get_parent of %ld failed, err %d\n",
__func__, dentry->d_inode->i_ino, err);
- return err;
+ return parent;
}

dprintk("%s: find name of %lu in %lu\n", __func__,
dentry->d_inode->i_ino, parent->d_inode->i_ino);
err = exportfs_get_name(mnt, parent, nbuf, dentry);
if (err) {
- dput(parent);
if (err == -ENOENT)
- /* some race between get_parent and
- * get_name? just try again
- */
- return 0;
- return err;
+ return NULL;
+ dput(parent);
+ return ERR_PTR(err);
}
dprintk("%s: found name: %s\n", __func__, nbuf);
mutex_lock(&parent->d_inode->i_mutex);
@@ -160,26 +156,13 @@ static int reconnect_one(struct vfsmount *mnt, struct dentry *dentry, char *nbuf
err = PTR_ERR(tmp);
dprintk("%s: lookup failed: %d\n",
__func__, err);
- dput(parent);
- return err;
+ return NULL;
}
- /* we didn't really want tmp, we really wanted
- * a side-effect of the lookup.
- * hopefully, tmp == dentry, though it isn't really
- * a problem if it isn't
- */
- if (tmp == dentry)
- *noprogress = 0;
- else
- printk("%s: tmp != dentry\n", __func__);
+ /* Note tmp != dentry is possible at this point, but that's OK */
dput(tmp);
- dput(parent);
- if (IS_ROOT(dentry)) {
- /* something went wrong, we have to give up */
- dput(dentry);
- return -ESTALE;
- }
- return 0;
+ if (IS_ROOT(dentry))
+ return ERR_PTR(-ESTALE);
+ return parent;
}

/*
@@ -190,51 +173,41 @@ static int reconnect_one(struct vfsmount *mnt, struct dentry *dentry, char *nbuf
static int
reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
{
- int noprogress = 0;
- int err = -ESTALE;
+ struct dentry *dentry, *parent;

- /*
- * It is possible that a confused file system might not let us complete
- * the path to the root. For example, if get_parent returns a directory
- * in which we cannot find a name for the child. While this implies a
- * very sick filesystem we don't want it to cause knfsd to spin. Hence
- * the noprogress counter. If we go through the loop 10 times (2 is
- * probably enough) without getting anywhere, we just give up
- */
- while (target_dir->d_flags & DCACHE_DISCONNECTED && noprogress++ < 10) {
- struct dentry *dentry = find_disconnected_root(target_dir);
+ dentry = dget(target_dir);

- if (!IS_ROOT(dentry)) {
- /* must have found a connected parent - great */
- clear_disconnected(target_dir);
- noprogress = 0;
- dput(dentry);
- continue;
- }
+ while (dentry->d_flags & DCACHE_DISCONNECTED) {
if (dentry == mnt->mnt_sb->s_root) {
- printk(KERN_ERR "export: Eeek filesystem root is not connected, impossible\n");
- clear_disconnected(target_dir);
- noprogress = 0;
+ WARN_ON_ONCE(1);
+ break;
+ }
+ if (!IS_ROOT(dentry)) {
+ parent = dget_parent(dentry);
dput(dentry);
+ dentry = parent;
continue;
}
/*
* We have hit the top of a disconnected path, try to
* find parent and connect.
*/
- err = reconnect_one(mnt, dentry, nbuf, &noprogress);
- if (err)
- return err;
+ parent = reconnect_one(mnt, dentry, nbuf);
dput(dentry);
+ if (!parent)
+ break;
+ if (IS_ERR(parent))
+ return PTR_ERR(parent);
+ dentry = parent;
}
-
- if (target_dir->d_flags & DCACHE_DISCONNECTED) {
- /* something went wrong - oh-well */
- if (!err)
- err = -ESTALE;
- return err;
+ /*
+ * Should be unnecessary, but let's be careful:
+ */
+ if (!dentry_connected(target_dir)) {
+ WARN_ON_ONCE(1);
+ return -ESTALE;
}
-
+ clear_disconnected(target_dir);
return 0;
}

--
1.7.9.5


2013-10-16 08:04:08

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 5/5] exportfs: fix quadratic behavior in filehandle lookup

On Tue, Oct 15, 2013 at 04:39:33PM -0400, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <[email protected]>
>
> We shouldn't really have to re-lookup the parents on each pass through.
> This should make behavior linear instead of quadratic as a function of
> directory depth.

This looks valid after going through the nasty cases like a
cross-directory rename. But I'd really prefer the changelog to be way
more detailed.

Also a few untested fixups below:
- document that reconnect_one can only be called for directories
- make sure we always properly drop the parent dentry on
reconnect_one failure
- warn then the tmp dentry is not the same as the one we started
with. Given that we deal with directories this should not
happen
- add/update a few comments.
- tidy up the reconnect_path loop so the reconnect case shares
more logic with the dget_parent case


Index: linux/fs/exportfs/expfs.c
===================================================================
--- linux.orig/fs/exportfs/expfs.c 2013-10-16 09:58:44.544119517 +0200
+++ linux/fs/exportfs/expfs.c 2013-10-16 10:04:19.176127470 +0200
@@ -104,38 +104,30 @@ static void clear_disconnected(struct de
}

/*
- * Return the parent directory on success.
+ * Reconnect a directory dentry with its parent.
*
- * If it races with a rename or unlink, assume the other operation
- * connected it, but return NULL to indicate the caller should check
- * this just to make sure the filesystem isn't nuts.
+ * Return the parent directory when successfully reconnecting, NULL if it
+ * raced with another VFS operation, in which case it got connected or
+ * the whole file handle will become stale.
*
* Otherwise return an error.
*/
-static struct dentry *reconnect_one(struct vfsmount *mnt, struct dentry *dentry, char *nbuf)
+static struct dentry *reconnect_one(struct vfsmount *mnt,
+ struct dentry *dentry, char *nbuf)
{
struct dentry *parent;
struct dentry *tmp;
int err;
- /*
- * Getting the parent can't be supported generically, the
- * locking is too icky.
- *
- * If it can't be done, we just return EACCES. If you were
- * depending on the dcache finding the parent for you, you lose
- * if there's a reboot or inodes get flushed.
- */
- parent = ERR_PTR(-EACCES);

+ parent = ERR_PTR(-EACCES);
mutex_lock(&dentry->d_inode->i_mutex);
if (mnt->mnt_sb->s_export_op->get_parent)
parent = mnt->mnt_sb->s_export_op->get_parent(dentry);
mutex_unlock(&dentry->d_inode->i_mutex);

if (IS_ERR(parent)) {
- err = PTR_ERR(parent);
dprintk("%s: get_parent of %ld failed, err %d\n",
- __func__, dentry->d_inode->i_ino, err);
+ __func__, dentry->d_inode->i_ino, PTR_ERR(parent));
return parent;
}

@@ -143,26 +135,45 @@ static struct dentry *reconnect_one(stru
dentry->d_inode->i_ino, parent->d_inode->i_ino);
err = exportfs_get_name(mnt, parent, nbuf, dentry);
if (err) {
+ /*
+ * Someone must have renamed our entry into another parent,
+ * in which case it has been reconnected by the rename.
+ *
+ * Or someone removed it entirely, in which case the file
+ * handle has become stale.
+ */
if (err == -ENOENT)
- return NULL;
- dput(parent);
- return ERR_PTR(err);
+ goto out_reconnected;
+ goto out_err;
}
dprintk("%s: found name: %s\n", __func__, nbuf);
mutex_lock(&parent->d_inode->i_mutex);
tmp = lookup_one_len(nbuf, parent, strlen(nbuf));
mutex_unlock(&parent->d_inode->i_mutex);
if (IS_ERR(tmp)) {
- err = PTR_ERR(tmp);
- dprintk("%s: lookup failed: %d\n",
- __func__, err);
- return NULL;
+ dprintk("%s: lookup failed: %d\n", __func__, PTR_ERR(tmp));
+ goto out_reconnected;
+ }
+ if (WARN_ON(tmp != dentry)) {
+ dprintk("%s: got different dentry while reconnecting.\n",
+ __func__);
+ dput(tmp);
+ err = -ESTALE;
+ goto out_err;
}
- /* Note tmp != dentry is possible at this point, but that's OK */
dput(tmp);
- if (IS_ROOT(dentry))
- return ERR_PTR(-ESTALE);
+ if (IS_ROOT(dentry)) {
+ err = -ESTALE;
+ goto out_err;
+ }
return parent;
+
+out_err:
+ dput(parent);
+ return ERR_PTR(err);
+out_reconnected:
+ dput(parent);
+ return NULL;
}

/*
@@ -178,21 +189,18 @@ reconnect_path(struct vfsmount *mnt, str
dentry = dget(target_dir);

while (dentry->d_flags & DCACHE_DISCONNECTED) {
- if (dentry == mnt->mnt_sb->s_root) {
- WARN_ON_ONCE(1);
- break;
- }
- if (!IS_ROOT(dentry)) {
+ BUG_ON(dentry == mnt->mnt_sb->s_root);
+
+ if (IS_ROOT(dentry)) {
+ /*
+ * We have hit the top of a disconnected path, try to
+ * find parent and connect.
+ */
+ parent = reconnect_one(mnt, dentry, nbuf);
+ } else {
parent = dget_parent(dentry);
- dput(dentry);
- dentry = parent;
- continue;
}
- /*
- * We have hit the top of a disconnected path, try to
- * find parent and connect.
- */
- parent = reconnect_one(mnt, dentry, nbuf);
+
dput(dentry);
if (!parent)
break;
@@ -200,6 +208,7 @@ reconnect_path(struct vfsmount *mnt, str
return PTR_ERR(parent);
dentry = parent;
}
+
/*
* Should be unnecessary, but let's be careful:
*/

2013-10-16 13:56:48

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 1/5] exportfs: clear DISCONNECTED on all parents sooner

On Wed, Oct 16, 2013 at 12:13:43AM -0700, Christoph Hellwig wrote:
> On Tue, Oct 15, 2013 at 04:39:29PM -0400, J. Bruce Fields wrote:
> > From: "J. Bruce Fields" <[email protected]>
> >
> > Once we've found any connected parent, we know all our parents are
> > connected--that's true even if there's a concurrent rename. May as well
> > clear them all at once.
> >
> > Signed-off-by: J. Bruce Fields <[email protected]>
>
> Looks good,
>
> Reviewed-by: Christoph Hellwig <[email protected]>
>
> > if (!IS_ROOT(pd)) {
> > /* must have found a connected parent - great */
> > - spin_lock(&pd->d_lock);
> > - pd->d_flags &= ~DCACHE_DISCONNECTED;
> > - spin_unlock(&pd->d_lock);
> > + clear_disconnected(target_dir);
> > noprogress = 0;
> > } else if (pd == mnt->mnt_sb->s_root) {
> > printk(KERN_ERR "export: Eeek filesystem root is not connected, impossible\n");
>
> You might as well throw a patch in to make this condition a BUG_ON, it
> would indicate a nasty bug in the dcache and has never triggered in the
> last 10 years as far as I know.

We've seen some ugly crashes made uglier by unnecessary BUG()s. I guess
that wouldn't be the case here as we're not holding any critical locks
and all we're going to kill is an nfsd thread or open_by_handle caller.
And like you say it's a very unlikely case anyway.

So, OK, I'll turn this case into a BUG_ON in a separate patch.

--b.

2013-10-16 22:39:22

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 5/5] exportfs: fix quadratic behavior in filehandle lookup

On Wed, Oct 16, 2013 at 01:04:07AM -0700, Christoph Hellwig wrote:
> On Tue, Oct 15, 2013 at 04:39:33PM -0400, J. Bruce Fields wrote:
> > From: "J. Bruce Fields" <[email protected]>
> >
> > We shouldn't really have to re-lookup the parents on each pass through.
> > This should make behavior linear instead of quadratic as a function of
> > directory depth.
>
> This looks valid after going through the nasty cases like a
> cross-directory rename. But I'd really prefer the changelog to be way
> more detailed.
>
> Also a few untested fixups below:
> - document that reconnect_one can only be called for directories
> - make sure we always properly drop the parent dentry on
> reconnect_one failure
> - warn then the tmp dentry is not the same as the one we started
> with. Given that we deal with directories this should not
> happen
> - add/update a few comments.
> - tidy up the reconnect_path loop so the reconnect case shares
> more logic with the dget_parent case
>
>
> Index: linux/fs/exportfs/expfs.c
> ===================================================================
> --- linux.orig/fs/exportfs/expfs.c 2013-10-16 09:58:44.544119517 +0200
> +++ linux/fs/exportfs/expfs.c 2013-10-16 10:04:19.176127470 +0200
> @@ -104,38 +104,30 @@ static void clear_disconnected(struct de
> }
>
> /*
> - * Return the parent directory on success.
> + * Reconnect a directory dentry with its parent.
> *
> - * If it races with a rename or unlink, assume the other operation
> - * connected it, but return NULL to indicate the caller should check
> - * this just to make sure the filesystem isn't nuts.
> + * Return the parent directory when successfully reconnecting, NULL if it
> + * raced with another VFS operation, in which case it got connected or
> + * the whole file handle will become stale.
> *
> * Otherwise return an error.
> */
> -static struct dentry *reconnect_one(struct vfsmount *mnt, struct dentry *dentry, char *nbuf)
> +static struct dentry *reconnect_one(struct vfsmount *mnt,
> + struct dentry *dentry, char *nbuf)
> {
> struct dentry *parent;
> struct dentry *tmp;
> int err;
> - /*
> - * Getting the parent can't be supported generically, the
> - * locking is too icky.
> - *
> - * If it can't be done, we just return EACCES. If you were
> - * depending on the dcache finding the parent for you, you lose
> - * if there's a reboot or inodes get flushed.
> - */
> - parent = ERR_PTR(-EACCES);
>
> + parent = ERR_PTR(-EACCES);
> mutex_lock(&dentry->d_inode->i_mutex);
> if (mnt->mnt_sb->s_export_op->get_parent)
> parent = mnt->mnt_sb->s_export_op->get_parent(dentry);
> mutex_unlock(&dentry->d_inode->i_mutex);
>
> if (IS_ERR(parent)) {
> - err = PTR_ERR(parent);
> dprintk("%s: get_parent of %ld failed, err %d\n",
> - __func__, dentry->d_inode->i_ino, err);
> + __func__, dentry->d_inode->i_ino, PTR_ERR(parent));
> return parent;
> }
>
> @@ -143,26 +135,45 @@ static struct dentry *reconnect_one(stru
> dentry->d_inode->i_ino, parent->d_inode->i_ino);
> err = exportfs_get_name(mnt, parent, nbuf, dentry);
> if (err) {
> + /*
> + * Someone must have renamed our entry into another parent,
> + * in which case it has been reconnected by the rename.
> + *
> + * Or someone removed it entirely, in which case the file
> + * handle has become stale.
> + */
> if (err == -ENOENT)
> - return NULL;
> - dput(parent);
> - return ERR_PTR(err);
> + goto out_reconnected;
> + goto out_err;
> }
> dprintk("%s: found name: %s\n", __func__, nbuf);
> mutex_lock(&parent->d_inode->i_mutex);
> tmp = lookup_one_len(nbuf, parent, strlen(nbuf));
> mutex_unlock(&parent->d_inode->i_mutex);
> if (IS_ERR(tmp)) {
> - err = PTR_ERR(tmp);
> - dprintk("%s: lookup failed: %d\n",
> - __func__, err);
> - return NULL;
> + dprintk("%s: lookup failed: %d\n", __func__, PTR_ERR(tmp));
> + goto out_reconnected;
> + }
> + if (WARN_ON(tmp != dentry)) {
> + dprintk("%s: got different dentry while reconnecting.\n",
> + __func__);
> + dput(tmp);
> + err = -ESTALE;
> + goto out_err;

Wait, no, this isn't right. This isn't a multiple-directory-alias case.

It's just another "out_reconnected" case. Somebody renamed or removed
between get_name and lookup_one_len and lookup_one_len() and now
whatever lookup_one_len() finds could be something totally different.

--b.

> }
> - /* Note tmp != dentry is possible at this point, but that's OK */
> dput(tmp);
> - if (IS_ROOT(dentry))
> - return ERR_PTR(-ESTALE);
> + if (IS_ROOT(dentry)) {
> + err = -ESTALE;
> + goto out_err;
> + }
> return parent;
> +
> +out_err:
> + dput(parent);
> + return ERR_PTR(err);
> +out_reconnected:
> + dput(parent);
> + return NULL;
> }
>
> /*
> @@ -178,21 +189,18 @@ reconnect_path(struct vfsmount *mnt, str
> dentry = dget(target_dir);
>
> while (dentry->d_flags & DCACHE_DISCONNECTED) {
> - if (dentry == mnt->mnt_sb->s_root) {
> - WARN_ON_ONCE(1);
> - break;
> - }
> - if (!IS_ROOT(dentry)) {
> + BUG_ON(dentry == mnt->mnt_sb->s_root);
> +
> + if (IS_ROOT(dentry)) {
> + /*
> + * We have hit the top of a disconnected path, try to
> + * find parent and connect.
> + */
> + parent = reconnect_one(mnt, dentry, nbuf);
> + } else {
> parent = dget_parent(dentry);
> - dput(dentry);
> - dentry = parent;
> - continue;
> }
> - /*
> - * We have hit the top of a disconnected path, try to
> - * find parent and connect.
> - */
> - parent = reconnect_one(mnt, dentry, nbuf);
> +
> dput(dentry);
> if (!parent)
> break;
> @@ -200,6 +208,7 @@ reconnect_path(struct vfsmount *mnt, str
> return PTR_ERR(parent);
> dentry = parent;
> }
> +
> /*
> * Should be unnecessary, but let's be careful:
> */

2013-10-16 19:22:28

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 5/5] exportfs: fix quadratic behavior in filehandle lookup

On Wed, Oct 16, 2013 at 01:04:07AM -0700, Christoph Hellwig wrote:
> On Tue, Oct 15, 2013 at 04:39:33PM -0400, J. Bruce Fields wrote:
> > From: "J. Bruce Fields" <[email protected]>
> >
> > We shouldn't really have to re-lookup the parents on each pass through.
> > This should make behavior linear instead of quadratic as a function of
> > directory depth.
>
> This looks valid after going through the nasty cases like a
> cross-directory rename. But I'd really prefer the changelog to be way
> more detailed.
>
> Also a few untested fixups below:
> - document that reconnect_one can only be called for directories
> - make sure we always properly drop the parent dentry on
> reconnect_one failure
> - warn then the tmp dentry is not the same as the one we started
> with. Given that we deal with directories this should not
> happen
> - add/update a few comments.
> - tidy up the reconnect_path loop so the reconnect case shares
> more logic with the dget_parent case
>
>
> Index: linux/fs/exportfs/expfs.c
> ===================================================================
> --- linux.orig/fs/exportfs/expfs.c 2013-10-16 09:58:44.544119517 +0200
> +++ linux/fs/exportfs/expfs.c 2013-10-16 10:04:19.176127470 +0200
> @@ -104,38 +104,30 @@ static void clear_disconnected(struct de
> }
>
> /*
> - * Return the parent directory on success.
> + * Reconnect a directory dentry with its parent.
> *
> - * If it races with a rename or unlink, assume the other operation
> - * connected it, but return NULL to indicate the caller should check
> - * this just to make sure the filesystem isn't nuts.
> + * Return the parent directory when successfully reconnecting, NULL if it
> + * raced with another VFS operation, in which case it got connected or
> + * the whole file handle will become stale.
> *
> * Otherwise return an error.
> */
> -static struct dentry *reconnect_one(struct vfsmount *mnt, struct dentry *dentry, char *nbuf)
> +static struct dentry *reconnect_one(struct vfsmount *mnt,
> + struct dentry *dentry, char *nbuf)
> {
> struct dentry *parent;
> struct dentry *tmp;
> int err;
> - /*
> - * Getting the parent can't be supported generically, the
> - * locking is too icky.
> - *
> - * If it can't be done, we just return EACCES. If you were
> - * depending on the dcache finding the parent for you, you lose
> - * if there's a reboot or inodes get flushed.
> - */
> - parent = ERR_PTR(-EACCES);
>
> + parent = ERR_PTR(-EACCES);
> mutex_lock(&dentry->d_inode->i_mutex);
> if (mnt->mnt_sb->s_export_op->get_parent)
> parent = mnt->mnt_sb->s_export_op->get_parent(dentry);
> mutex_unlock(&dentry->d_inode->i_mutex);
>
> if (IS_ERR(parent)) {
> - err = PTR_ERR(parent);
> dprintk("%s: get_parent of %ld failed, err %d\n",
> - __func__, dentry->d_inode->i_ino, err);
> + __func__, dentry->d_inode->i_ino, PTR_ERR(parent));
> return parent;
> }
>
> @@ -143,26 +135,45 @@ static struct dentry *reconnect_one(stru
> dentry->d_inode->i_ino, parent->d_inode->i_ino);
> err = exportfs_get_name(mnt, parent, nbuf, dentry);
> if (err) {
> + /*
> + * Someone must have renamed our entry into another parent,
> + * in which case it has been reconnected by the rename.
> + *
> + * Or someone removed it entirely, in which case the file
> + * handle has become stale.
> + */
> if (err == -ENOENT)
> - return NULL;
> - dput(parent);
> - return ERR_PTR(err);
> + goto out_reconnected;
> + goto out_err;
> }
> dprintk("%s: found name: %s\n", __func__, nbuf);
> mutex_lock(&parent->d_inode->i_mutex);
> tmp = lookup_one_len(nbuf, parent, strlen(nbuf));
> mutex_unlock(&parent->d_inode->i_mutex);
> if (IS_ERR(tmp)) {
> - err = PTR_ERR(tmp);
> - dprintk("%s: lookup failed: %d\n",
> - __func__, err);
> - return NULL;
> + dprintk("%s: lookup failed: %d\n", __func__, PTR_ERR(tmp));
> + goto out_reconnected;
> + }
> + if (WARN_ON(tmp != dentry)) {
> + dprintk("%s: got different dentry while reconnecting.\n",
> + __func__);
> + dput(tmp);
> + err = -ESTALE;
> + goto out_err;
> }
> - /* Note tmp != dentry is possible at this point, but that's OK */
> dput(tmp);
> - if (IS_ROOT(dentry))
> - return ERR_PTR(-ESTALE);
> + if (IS_ROOT(dentry)) {
> + err = -ESTALE;
> + goto out_err;
> + }
> return parent;
> +
> +out_err:
> + dput(parent);
> + return ERR_PTR(err);
> +out_reconnected:
> + dput(parent);
> + return NULL;
> }
>
> /*
> @@ -178,21 +189,18 @@ reconnect_path(struct vfsmount *mnt, str
> dentry = dget(target_dir);
>
> while (dentry->d_flags & DCACHE_DISCONNECTED) {
> - if (dentry == mnt->mnt_sb->s_root) {
> - WARN_ON_ONCE(1);
> - break;
> - }
> - if (!IS_ROOT(dentry)) {
> + BUG_ON(dentry == mnt->mnt_sb->s_root);
> +
> + if (IS_ROOT(dentry)) {
> + /*
> + * We have hit the top of a disconnected path, try to
> + * find parent and connect.
> + */
> + parent = reconnect_one(mnt, dentry, nbuf);
> + } else {
> parent = dget_parent(dentry);
> - dput(dentry);
> - dentry = parent;
> - continue;
> }
> - /*
> - * We have hit the top of a disconnected path, try to
> - * find parent and connect.
> - */
> - parent = reconnect_one(mnt, dentry, nbuf);
> +
> dput(dentry);
> if (!parent)
> break;
> @@ -200,6 +208,7 @@ reconnect_path(struct vfsmount *mnt, str
> return PTR_ERR(parent);
> dentry = parent;
> }

I think this still leaves a leak of dentry in this case the loop
condition fails.

Other suggestions look good, thanks!

Now that I look at it I think this all works out a bit more neatly if
the dget_parent() moves to reconnect_one().

I'll fold all this in, retest, and repost....

--b.

> +
> /*
> * Should be unnecessary, but let's be careful:
> */

2013-10-16 07:13:43

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/5] exportfs: clear DISCONNECTED on all parents sooner

On Tue, Oct 15, 2013 at 04:39:29PM -0400, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <[email protected]>
>
> Once we've found any connected parent, we know all our parents are
> connected--that's true even if there's a concurrent rename. May as well
> clear them all at once.
>
> Signed-off-by: J. Bruce Fields <[email protected]>

Looks good,

Reviewed-by: Christoph Hellwig <[email protected]>

> if (!IS_ROOT(pd)) {
> /* must have found a connected parent - great */
> - spin_lock(&pd->d_lock);
> - pd->d_flags &= ~DCACHE_DISCONNECTED;
> - spin_unlock(&pd->d_lock);
> + clear_disconnected(target_dir);
> noprogress = 0;
> } else if (pd == mnt->mnt_sb->s_root) {
> printk(KERN_ERR "export: Eeek filesystem root is not connected, impossible\n");

You might as well throw a patch in to make this condition a BUG_ON, it
would indicate a nasty bug in the dcache and has never triggered in the
last 10 years as far as I know.


2013-10-16 18:24:47

by Christoph Hellwig

[permalink] [raw]
Subject: Re: simplify reconnecting dentries looked up by filehandle

On Tue, Oct 15, 2013 at 04:39:28PM -0400, J. Bruce Fields wrote:
> I tested performance with a script that creates an N-deep directory
> tree, gets a filehandle for the bottom directory, writes 2 to
> /proc/sys/vm/drop_caches, then times an open_by_handle_at() of the
> filehandle. Code at
>
> git://linux-nfs.org/~bfields/fhtests.git
>
> For directories of various depths, some example observed times (median
> results of 3 similar runs, in seconds), were:
>
> depth: 8000 2000 200
> no patches: 11 0.7 0.02
> first patch: 6 0.4 0.01
> all patches: 0.1 0.03 0.01
>
> For depths < 2000 I used an ugly hack to shrink_slab_node() to force
> drop_caches to free more dentries. Difference look lost in the noise
> for much smaller depths.

Btw, it would be good to get this wired up in xfstests - add xfs_io
commands for the by handle ops and then just wire up the script driving
them.

I'd also really like to see a stress test for cold handle conversion vs
various VFS ops based on that sort of infrastructure.


2013-10-15 20:40:02

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 3/5] exportfs: make variable names more helpful

From: "J. Bruce Fields" <[email protected]>

The use of 3 easily-confused TLAs is not helpful here.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/exportfs/expfs.c | 60 +++++++++++++++++++++++++--------------------------
1 file changed, 30 insertions(+), 30 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index 63996d2..24fb763 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -113,10 +113,10 @@ static void clear_disconnected(struct dentry *dentry)
*
* Otherwise return an error.
*/
-static int reconnect_one(struct vfsmount *mnt, struct dentry *pd, char *nbuf, int *noprogress)
+static int reconnect_one(struct vfsmount *mnt, struct dentry *dentry, char *nbuf, int *noprogress)
{
- struct dentry *ppd;
- struct dentry *npd;
+ struct dentry *parent;
+ struct dentry *tmp;
int err;
/*
* Getting the parent can't be supported generically, the
@@ -126,15 +126,15 @@ static int reconnect_one(struct vfsmount *mnt, struct dentry *pd, char *nbuf, in
* depending on the dcache finding the parent for you, you lose
* if there's a reboot or inodes get flushed.
*/
- ppd = ERR_PTR(-EACCES);
+ parent = ERR_PTR(-EACCES);

- mutex_lock(&pd->d_inode->i_mutex);
+ mutex_lock(&dentry->d_inode->i_mutex);
if (mnt->mnt_sb->s_export_op->get_parent)
- ppd = mnt->mnt_sb->s_export_op->get_parent(pd);
- mutex_unlock(&pd->d_inode->i_mutex);
+ parent = mnt->mnt_sb->s_export_op->get_parent(dentry);
+ mutex_unlock(&dentry->d_inode->i_mutex);

- if (IS_ERR(ppd)) {
- err = PTR_ERR(ppd);
+ if (IS_ERR(parent)) {
+ err = PTR_ERR(parent);
dprintk("%s: get_parent of %ld failed, err %d\n",
__func__, dentry->d_inode->i_ino, err);
return err;
@@ -142,9 +142,9 @@ static int reconnect_one(struct vfsmount *mnt, struct dentry *pd, char *nbuf, in

dprintk("%s: find name of %lu in %lu\n", __func__,
dentry->d_inode->i_ino, parent->d_inode->i_ino);
- err = exportfs_get_name(mnt, ppd, nbuf, pd);
+ err = exportfs_get_name(mnt, parent, nbuf, dentry);
if (err) {
- dput(ppd);
+ dput(parent);
if (err == -ENOENT)
/* some race between get_parent and
* get_name? just try again
@@ -153,30 +153,30 @@ static int reconnect_one(struct vfsmount *mnt, struct dentry *pd, char *nbuf, in
return err;
}
dprintk("%s: found name: %s\n", __func__, nbuf);
- mutex_lock(&ppd->d_inode->i_mutex);
- npd = lookup_one_len(nbuf, ppd, strlen(nbuf));
- mutex_unlock(&ppd->d_inode->i_mutex);
- if (IS_ERR(npd)) {
- err = PTR_ERR(npd);
+ mutex_lock(&parent->d_inode->i_mutex);
+ tmp = lookup_one_len(nbuf, parent, strlen(nbuf));
+ mutex_unlock(&parent->d_inode->i_mutex);
+ if (IS_ERR(tmp)) {
+ err = PTR_ERR(tmp);
dprintk("%s: lookup failed: %d\n",
__func__, err);
- dput(ppd);
+ dput(parent);
return err;
}
- /* we didn't really want npd, we really wanted
+ /* we didn't really want tmp, we really wanted
* a side-effect of the lookup.
- * hopefully, npd == pd, though it isn't really
+ * hopefully, tmp == dentry, though it isn't really
* a problem if it isn't
*/
- if (npd == pd)
+ if (tmp == dentry)
*noprogress = 0;
else
- printk("%s: npd != pd\n", __func__);
- dput(npd);
- dput(ppd);
- if (IS_ROOT(pd)) {
+ printk("%s: tmp != dentry\n", __func__);
+ dput(tmp);
+ dput(parent);
+ if (IS_ROOT(dentry)) {
/* something went wrong, we have to give up */
- dput(pd);
+ dput(dentry);
return -ESTALE;
}
return 0;
@@ -202,13 +202,13 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
* probably enough) without getting anywhere, we just give up
*/
while (target_dir->d_flags & DCACHE_DISCONNECTED && noprogress++ < 10) {
- struct dentry *pd = find_disconnected_root(target_dir);
+ struct dentry *dentry = find_disconnected_root(target_dir);

- if (!IS_ROOT(pd)) {
+ if (!IS_ROOT(dentry)) {
/* must have found a connected parent - great */
clear_disconnected(target_dir);
noprogress = 0;
- } else if (pd == mnt->mnt_sb->s_root) {
+ } else if (dentry == mnt->mnt_sb->s_root) {
printk(KERN_ERR "export: Eeek filesystem root is not connected, impossible\n");
clear_disconnected(target_dir);
noprogress = 0;
@@ -217,11 +217,11 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
* We have hit the top of a disconnected path, try to
* find parent and connect.
*/
- err = reconnect_one(mnt, pd, nbuf, &noprogress);
+ err = reconnect_one(mnt, dentry, nbuf, &noprogress);
if (err)
return err;
}
- dput(pd);
+ dput(dentry);
}

if (target_dir->d_flags & DCACHE_DISCONNECTED) {
--
1.7.9.5


2013-10-16 07:19:21

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 4/5] exportfs: slight reorganization of reconnect loop

On Tue, Oct 15, 2013 at 04:39:32PM -0400, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <[email protected]>
>
> Just cleanup, no change in functionality

Looks reasonable, but as said I'd have just turned the root check into a
BUG_ON earlier.


2013-10-16 07:18:31

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 3/5] exportfs: make variable names more helpful

On Tue, Oct 15, 2013 at 04:39:31PM -0400, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <[email protected]>
>
> The use of 3 easily-confused TLAs is not helpful here.
>
> Signed-off-by: J. Bruce Fields <[email protected]>

Looks good,

Reviewed-by: Christoph Hellwig <[email protected]>


2013-10-16 07:16:49

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 2/5] exportfs: move most of reconnect_path to helper function

> /*
> + * Return the parent directory on success.
> + *
> + * Return NULL to keep trying.
> + *
> + * Otherwise return an error.
> + */
> +static int reconnect_one(struct vfsmount *mnt, struct dentry *pd, char *nbuf, int *noprogress)

Please keep the lines under 80 characters.

> + struct dentry *ppd;
> + struct dentry *npd;

Might as well use sensible names already when splitting it out instead
of renaming them later down the series?

Otherwise looks good,

Reviewed-by: Christoph Hellwig <[email protected]>

2013-10-16 18:30:20

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 5/5] exportfs: fix quadratic behavior in filehandle lookup

On Wed, Oct 16, 2013 at 01:04:07AM -0700, Christoph Hellwig wrote:
> On Tue, Oct 15, 2013 at 04:39:33PM -0400, J. Bruce Fields wrote:
> > From: "J. Bruce Fields" <[email protected]>
> >
> > We shouldn't really have to re-lookup the parents on each pass through.
> > This should make behavior linear instead of quadratic as a function of
> > directory depth.
>
> This looks valid after going through the nasty cases like a
> cross-directory rename. But I'd really prefer the changelog to be way
> more detailed.
>
> Also a few untested fixups below:
> - document that reconnect_one can only be called for directories
> - make sure we always properly drop the parent dentry on
> reconnect_one failure
> - warn then the tmp dentry is not the same as the one we started
> with. Given that we deal with directories this should not
> happen
> - add/update a few comments.
> - tidy up the reconnect_path loop so the reconnect case shares
> more logic with the dget_parent case
>
>
> Index: linux/fs/exportfs/expfs.c
> ===================================================================
> --- linux.orig/fs/exportfs/expfs.c 2013-10-16 09:58:44.544119517 +0200
> +++ linux/fs/exportfs/expfs.c 2013-10-16 10:04:19.176127470 +0200
> @@ -104,38 +104,30 @@ static void clear_disconnected(struct de
> }
>
> /*
> - * Return the parent directory on success.
> + * Reconnect a directory dentry with its parent.
> *
> - * If it races with a rename or unlink, assume the other operation
> - * connected it, but return NULL to indicate the caller should check
> - * this just to make sure the filesystem isn't nuts.
> + * Return the parent directory when successfully reconnecting, NULL if it
> + * raced with another VFS operation, in which case it got connected or
> + * the whole file handle will become stale.
> *
> * Otherwise return an error.
> */
> -static struct dentry *reconnect_one(struct vfsmount *mnt, struct dentry *dentry, char *nbuf)
> +static struct dentry *reconnect_one(struct vfsmount *mnt,
> + struct dentry *dentry, char *nbuf)
> {
> struct dentry *parent;
> struct dentry *tmp;
> int err;
> - /*
> - * Getting the parent can't be supported generically, the
> - * locking is too icky.
> - *
> - * If it can't be done, we just return EACCES. If you were
> - * depending on the dcache finding the parent for you, you lose
> - * if there's a reboot or inodes get flushed.
> - */
> - parent = ERR_PTR(-EACCES);
>
> + parent = ERR_PTR(-EACCES);
> mutex_lock(&dentry->d_inode->i_mutex);
> if (mnt->mnt_sb->s_export_op->get_parent)
> parent = mnt->mnt_sb->s_export_op->get_parent(dentry);
> mutex_unlock(&dentry->d_inode->i_mutex);
>
> if (IS_ERR(parent)) {
> - err = PTR_ERR(parent);
> dprintk("%s: get_parent of %ld failed, err %d\n",
> - __func__, dentry->d_inode->i_ino, err);
> + __func__, dentry->d_inode->i_ino, PTR_ERR(parent));
> return parent;
> }
>
> @@ -143,26 +135,45 @@ static struct dentry *reconnect_one(stru
> dentry->d_inode->i_ino, parent->d_inode->i_ino);
> err = exportfs_get_name(mnt, parent, nbuf, dentry);
> if (err) {
> + /*
> + * Someone must have renamed our entry into another parent,
> + * in which case it has been reconnected by the rename.
> + *
> + * Or someone removed it entirely, in which case the file
> + * handle has become stale.

Well, not necessarily. In the removal case, I believe what happens in
practice is that filehandle lookup succeeds and then subsequent
operations on it fail (e.g. any op that calls may_create hits the
IS_DEADDIR test and returns -ENOENT).

Which is fine, and could always happen before: even if filehandle lookup
itself didn't have this race, callers aren't holding any lock that would
prevent removal before they use the returned dentry.

But then that comment isn't quite right. Maybe:

* Or something removed it entirely, in which case
* filehandle lookup will succeed but the directory is
* now IS_DEAD and subsequent operations will fail.

--b.


> + */
> if (err == -ENOENT)
> - return NULL;
> - dput(parent);
> - return ERR_PTR(err);
> + goto out_reconnected;
> + goto out_err;
> }
> dprintk("%s: found name: %s\n", __func__, nbuf);
> mutex_lock(&parent->d_inode->i_mutex);
> tmp = lookup_one_len(nbuf, parent, strlen(nbuf));
> mutex_unlock(&parent->d_inode->i_mutex);
> if (IS_ERR(tmp)) {
> - err = PTR_ERR(tmp);
> - dprintk("%s: lookup failed: %d\n",
> - __func__, err);
> - return NULL;
> + dprintk("%s: lookup failed: %d\n", __func__, PTR_ERR(tmp));
> + goto out_reconnected;
> + }
> + if (WARN_ON(tmp != dentry)) {
> + dprintk("%s: got different dentry while reconnecting.\n",
> + __func__);
> + dput(tmp);
> + err = -ESTALE;
> + goto out_err;
> }
> - /* Note tmp != dentry is possible at this point, but that's OK */
> dput(tmp);
> - if (IS_ROOT(dentry))
> - return ERR_PTR(-ESTALE);
> + if (IS_ROOT(dentry)) {
> + err = -ESTALE;
> + goto out_err;
> + }
> return parent;
> +
> +out_err:
> + dput(parent);
> + return ERR_PTR(err);
> +out_reconnected:
> + dput(parent);
> + return NULL;
> }
>
> /*
> @@ -178,21 +189,18 @@ reconnect_path(struct vfsmount *mnt, str
> dentry = dget(target_dir);
>
> while (dentry->d_flags & DCACHE_DISCONNECTED) {
> - if (dentry == mnt->mnt_sb->s_root) {
> - WARN_ON_ONCE(1);
> - break;
> - }
> - if (!IS_ROOT(dentry)) {
> + BUG_ON(dentry == mnt->mnt_sb->s_root);
> +
> + if (IS_ROOT(dentry)) {
> + /*
> + * We have hit the top of a disconnected path, try to
> + * find parent and connect.
> + */
> + parent = reconnect_one(mnt, dentry, nbuf);
> + } else {
> parent = dget_parent(dentry);
> - dput(dentry);
> - dentry = parent;
> - continue;
> }
> - /*
> - * We have hit the top of a disconnected path, try to
> - * find parent and connect.
> - */
> - parent = reconnect_one(mnt, dentry, nbuf);
> +
> dput(dentry);
> if (!parent)
> break;
> @@ -200,6 +208,7 @@ reconnect_path(struct vfsmount *mnt, str
> return PTR_ERR(parent);
> dentry = parent;
> }
> +
> /*
> * Should be unnecessary, but let's be careful:
> */
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2013-10-15 20:40:00

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 4/5] exportfs: slight reorganization of reconnect loop

From: "J. Bruce Fields" <[email protected]>

Just cleanup, no change in functionality

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/exportfs/expfs.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index 24fb763..afa7c9c 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -208,19 +208,23 @@ reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
/* must have found a connected parent - great */
clear_disconnected(target_dir);
noprogress = 0;
- } else if (dentry == mnt->mnt_sb->s_root) {
+ dput(dentry);
+ continue;
+ }
+ if (dentry == mnt->mnt_sb->s_root) {
printk(KERN_ERR "export: Eeek filesystem root is not connected, impossible\n");
clear_disconnected(target_dir);
noprogress = 0;
- } else {
- /*
- * We have hit the top of a disconnected path, try to
- * find parent and connect.
- */
- err = reconnect_one(mnt, dentry, nbuf, &noprogress);
- if (err)
- return err;
+ dput(dentry);
+ continue;
}
+ /*
+ * We have hit the top of a disconnected path, try to
+ * find parent and connect.
+ */
+ err = reconnect_one(mnt, dentry, nbuf, &noprogress);
+ if (err)
+ return err;
dput(dentry);
}

--
1.7.9.5