2003-09-18 05:53:16

by Trond Myklebust

[permalink] [raw]
Subject: Re: Re: [autofs] VFS: Busy inodes after unmount on 2 way SMP

>>>>> " " == Olaf Hering <[email protected]> writes:

> On Wed, Sep 17, H. Peter Anvin wrote:
>> Okay... let me ask this so I get it straight...
>>
>> Has anyone seen this (busy inodes after stopping the
>> automounter) using autofs v3 kernel module and daemon?

> I'm not sure, but everyone who has seen this should fiddle this
> patch into the kernel and see how it goes. The whole thing
> started very recently (post 2.4.21) for us.

> This patch is untested, any feedback appreciated.

...and is indeed wrong... It does the exact opposite of what
sillydelete should do. Instead of causing the last application that
closes the file to perform the sillydelete, you are asking the *first*
application that closes it to do so.
Sillydelete *has* to be tied to dentries. Not files, and not
inodes. It is purely a namespace operation...

So exactly what are you trying to do, and why?

Cheers,
Trond


-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2003-09-29 05:46:37

by Frank Cusack

[permalink] [raw]
Subject: Re: Re: [autofs] VFS: Busy inodes after unmount on 2 way SMP

Does this patch help?

--- linux-2.4.22/fs/namei.c 2003-08-25 04:44:43.000000000 -0700
+++ linux-2.4.22-gg4/fs/namei.c 2003-09-26 00:03:16.000000000 -0700
@@ -893,6 +893,8 @@ static inline int check_sticky(struct in
* 7. If we were asked to remove a directory and victim isn't one - ENOTDIR.
* 8. If we were asked to remove a non-directory and victim isn't one - EISDIR.
* 9. We can't remove a root or mountpoint.
+ * 10. We don't allow removal of NFS sillyrenamed files; it's handled by
+ * nfs_async_unlink().
*/
static inline int may_delete(struct inode *dir,struct dentry *victim, int isdir)
{
@@ -916,6 +918,8 @@ static inline int may_delete(struct inod
return -EISDIR;
if (IS_DEADDIR(dir))
return -ENOENT;
+ if (victim->d_flags & DCACHE_NFSFS_RENAMED)
+ return -EBUSY;
return 0;
}

@@ -1484,13 +1488,14 @@ int vfs_unlink(struct inode *dir, struct
lock_kernel();
error = dir->i_op->unlink(dir, dentry);
unlock_kernel();
- if (!error)
+ if (!error &&
+ !(dentry->d_flags & DCACHE_NFSFS_RENAMED))
d_delete(dentry);
}
}
}
up(&dir->i_zombie);
- if (!error)
+ if (!error && !(dentry->d_flags & DCACHE_NFSFS_RENAMED))
inode_dir_notify(dir, DN_DELETE);
return error;
}


-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2003-09-18 08:27:03

by Olaf Kirch

[permalink] [raw]
Subject: Re: Re: [autofs] VFS: Busy inodes after unmount on 2 way SMP

On Thu, Sep 18, 2003 at 01:52:05AM -0400, Trond Myklebust wrote:
> ...and is indeed wrong... It does the exact opposite of what
> sillydelete should do. Instead of causing the last application that
> closes the file to perform the sillydelete, you are asking the *first*
> application that closes it to do so.

I know. I was not claiming it's perfectly right, what I'm interested
in whether it cures the oopses we're seeing. If that were true, one
could still look for The Right Solution...

> Sillydelete *has* to be tied to dentries. Not files, and not
> inodes. It is purely a namespace operation...
>
> So exactly what are you trying to do, and why?

We have seen several reports of autofs+nfs leading to oopses, preceded by
"Busy inodes on umount, self-destruct in 5 seconds".
It's somewhat hard to reproduce, so I'm currently trying to come up with
possible patches by looking at the source.

You cannot umount if the vfsmount mnt_count is != 2. So either something
hosed the mnt_count completely (by doing too many mntput() calls for
instance), or something holds dentry references without bumping mnt_count
along with it. Do you agree?

The only place in the nfs client code that actually does a dget is
the sillydelete stuff in unlink.c. So my idea in generating this patch
was that the nfs_complete_unlink was happening too late, when the file
is already closed and the vfsmount reference is gone.

It's a long shot admittedly...

Have a nice day,
Olaf
--
Olaf Kirch | Anyone who has had to work with X.509 has probably
[email protected] | experienced what can best be described as
---------------+ ISO water torture. -- Peter Gutmann


-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2003-09-25 23:18:26

by Matt C

[permalink] [raw]
Subject: Re: Re: [autofs] VFS: Busy inodes after unmount on 2 way SMP

Hi All-

We've been having the 'VFS: Busy inodes after unmount' problems like crazy
on our desktop and server linux boxes running autofs under 2.4.21
(vanilla). We're using the autofs-3.1.7 userspace tools, but with the
autofs4 kernel module.

I applied the patch that was attached earlier in this thread to our kernel
and have been running it for the past few days on my desktop. These errors
have completely disappeared, and autofs is behaving as expected so far.

So, while it might be the wrong thing to do, it does fix our problems.

-matt

On 18 Sep 2003, Trond Myklebust wrote:

> >>>>> " " == Olaf Hering <[email protected]> writes:
>
> > On Wed, Sep 17, H. Peter Anvin wrote:
> >> Okay... let me ask this so I get it straight...
> >>
> >> Has anyone seen this (busy inodes after stopping the
> >> automounter) using autofs v3 kernel module and daemon?
>
> > I'm not sure, but everyone who has seen this should fiddle this
> > patch into the kernel and see how it goes. The whole thing
> > started very recently (post 2.4.21) for us.
>
> > This patch is untested, any feedback appreciated.
>
> ...and is indeed wrong... It does the exact opposite of what
> sillydelete should do. Instead of causing the last application that
> closes the file to perform the sillydelete, you are asking the *first*
> application that closes it to do so.
> Sillydelete *has* to be tied to dentries. Not files, and not
> inodes. It is purely a namespace operation...
>
> So exactly what are you trying to do, and why?
>
> Cheers,
> Trond
>
>
> -------------------------------------------------------
> This sf.net email is sponsored by:ThinkGeek
> Welcome to geek heaven.
> http://thinkgeek.com/sf
> _______________________________________________
> NFS maillist - [email protected]
> https://lists.sourceforge.net/lists/listinfo/nfs
>



-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2003-09-26 00:25:44

by Trond Myklebust

[permalink] [raw]
Subject: Re: Re: [autofs] VFS: Busy inodes after unmount on 2 way SMP

>>>>> " " == Matt C <Matt> writes:

> Hi All- We've been having the 'VFS: Busy inodes after unmount'
> problems like crazy on our desktop and server linux boxes
> running autofs under 2.4.21 (vanilla). We're using the
> autofs-3.1.7 userspace tools, but with the autofs4 kernel
> module.

> I applied the patch that was attached earlier in this thread to
> our kernel and have been running it for the past few days on my
> desktop. These errors have completely disappeared, and autofs
> is behaving as expected so far.

> So, while it might be the wrong thing to do, it does fix our
> problems.

That changes nothing. It is still not a "fix" that I want to see go
into the main kernel.

The problem here is that sillyrename needs to hold onto the parent
directory's inode until it is done, and yet it cannot know which
vfsmount struct to hold onto because that information is not passed
down by the VFS.

I would rather then have something like the appended patch go in to
which force the last process to close the file to actually wait on
completion (unless someone signals it with a 'kill -9' - in which case
they are knowingly breaking things).

Cheers,
Trond

diff -u --recursive --new-file linux-2.4.22-03-soft2/fs/nfs/dir.c linux-2.4.22-04-fix_unlink/fs/nfs/dir.c
--- linux-2.4.22-03-soft2/fs/nfs/dir.c 2002-10-15 06:59:27.000000000 +0200
+++ linux-2.4.22-04-fix_unlink/fs/nfs/dir.c 2003-07-03 12:29:28.000000000 +0200
@@ -994,7 +994,7 @@
struct inode *old_inode = old_dentry->d_inode;
struct inode *new_inode = new_dentry->d_inode;
struct dentry *dentry = NULL, *rehash = NULL;
- int error = -EBUSY;
+ int error;

/*
* To prevent any new references to the target during the rename,
@@ -1020,6 +1020,12 @@
*/
if (!new_inode)
goto go_ahead;
+ /* If target is a hard link to the source, then noop */
+ error = 0;
+ if (NFS_FILEID(new_inode) == NFS_FILEID(old_inode))
+ goto out;
+
+ error = -EBUSY;
if (S_ISDIR(new_inode->i_mode))
goto out;
else if (atomic_read(&new_dentry->d_count) > 1) {
diff -u --recursive --new-file linux-2.4.22-03-soft2/fs/nfs/unlink.c linux-2.4.22-04-fix_unlink/fs/nfs/unlink.c
--- linux-2.4.22-03-soft2/fs/nfs/unlink.c 2002-08-11 13:34:02.000000000 +0200
+++ linux-2.4.22-04-fix_unlink/fs/nfs/unlink.c 2003-07-03 12:29:28.000000000 +0200
@@ -12,6 +12,7 @@
#include <linux/sunrpc/sched.h>
#include <linux/sunrpc/clnt.h>
#include <linux/nfs_fs.h>
+#include <linux/wait.h>


struct nfs_unlinkdata {
@@ -21,6 +22,9 @@
struct rpc_task task;
struct rpc_cred *cred;
unsigned int count;
+
+ wait_queue_head_t waitq;
+ int completed;
};

static struct nfs_unlinkdata *nfs_deletes;
@@ -133,6 +137,8 @@
put_rpccred(data->cred);
data->cred = NULL;
dput(dir);
+ data->completed = 1;
+ wake_up(&data->waitq);
}

/**
@@ -175,6 +181,8 @@
nfs_deletes = data;
data->count = 1;

+ init_waitqueue_head(&data->waitq);
+
task = &data->task;
rpc_init_task(task, clnt, nfs_async_unlink_done , RPC_TASK_ASYNC);
task->tk_calldata = data;
@@ -212,7 +220,10 @@
data->count++;
nfs_copy_dname(dentry, data);
dentry->d_flags &= ~DCACHE_NFSFS_RENAMED;
- if (data->task.tk_rpcwait == &nfs_delete_queue)
+ if (data->task.tk_rpcwait == &nfs_delete_queue) {
+ struct rpc_clnt *clnt = data->task.tk_client;
rpc_wake_up_task(&data->task);
+ nfs_wait_event(clnt, data->waitq, data->completed == 1);
+ }
nfs_put_unlinkdata(data);
}


-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs