Hi,
d_delete() doesn't appear to ever create negative dentries when
called via vfs_unlink() due to the extra reference on the dentry.
In fact, a printk() in the d_delete() spot never ever triggers...
This could explain the behavior change, wherein rm -r on a large
directory no longer works in ramfs/tmpfs, because the dentries are
now destroyed instead of going negative. [1] Is it intentional
that no negative dentries are created via vfs_unlink() method?
-Mike
1. if that's true, memory pressure could make readdir fail too.
On Sat, 30 Mar 2002, Mike Galbraith wrote:
> Hi,
>
> d_delete() doesn't appear to ever create negative dentries when
> called via vfs_unlink() due to the extra reference on the dentry.
> In fact, a printk() in the d_delete() spot never ever triggers...
Well shoot. I guess I've chased this about as far as I can, and
hope this thread wasn't a total waste. I found a better way to
get my rm -r to work as before fwiw. Rewinding the directory on
seek failure (yeah, could do in three lines, but not the point)
works, but is kinda b0rken. I think the only interesting thing
in the below is the FIXME :)) but I'll post it anyway.
EOT,
-Mike
Private comments on this quite welcome. Locking is a bitch :-)
--- fs/dcache.c.org Sat Mar 30 12:27:34 2002
+++ fs/dcache.c Sat Mar 30 17:32:43 2002
@@ -803,7 +803,10 @@
* @dentry: The dentry to delete
*
* Turn the dentry into a negative dentry if possible, otherwise
- * remove it from the hash queues so it can be deleted later
+ * remove it from the hash queues so it can be deleted later.
+ * FIXME: this assumes more than GOD allows :)
+ * This function must be called while holding an extra reference
+ * to the dentry, and with the dentry's d_inode pinned down.
*/
void d_delete(struct dentry * dentry)
@@ -812,7 +815,7 @@
* Are we the only user?
*/
spin_lock(&dcache_lock);
- if (atomic_read(&dentry->d_count) == 1) {
+ if (atomic_read(&dentry->d_count) == 2) {
dentry_iput(dentry);
return;
}
--- fs/namei.c.org Sat Mar 30 12:54:29 2002
+++ fs/namei.c Sat Mar 30 17:17:02 2002
@@ -1453,6 +1453,7 @@
int vfs_unlink(struct inode *dir, struct dentry *dentry)
{
int error = may_delete(dir, dentry, 0);
+ struct inode *inode;
if (error)
return error;
@@ -1463,7 +1464,12 @@
DQUOT_INIT(dir);
dget(dentry);
- down(&dentry->d_inode->i_sem);
+ inode = igrab(dentry->d_inode);
+ if (!inode) {
+ dput(dentry);
+ return -EBUSY;
+ }
+ down(&inode->i_sem);
if (d_mountpoint(dentry))
error = -EBUSY;
else {
@@ -1471,7 +1477,10 @@
if (!error)
d_delete(dentry);
}
- up(&dentry->d_inode->i_sem);
+ if (inode) {
+ up(&inode->i_sem);
+ iput(inode);
+ }
dput(dentry);
if (!error)
(Hmm. Actually, rewinding on seek failure is a more useful lie
than a NULL when the thing ain't empty, so I think I'll go apply
the three liner again and see if anything gripes)
On Sat, 30 Mar 2002, Mike Galbraith wrote:
> On Sat, 30 Mar 2002, Mike Galbraith wrote:
>
> > Hi,
> >
> > d_delete() doesn't appear to ever create negative dentries when
> > called via vfs_unlink() due to the extra reference on the dentry.
> > In fact, a printk() in the d_delete() spot never ever triggers...
>
> Well shoot. I guess I've chased this about as far as I can, and
> hope this thread wasn't a total waste. I found a better way to
> get my rm -r to work as before fwiw. Rewinding the directory on
> seek failure (yeah, could do in three lines, but not the point)
> works, but is kinda b0rken. I think the only interesting thing
> in the below is the FIXME :)) but I'll post it anyway.
Your patch is broken. FWIW, there are several real issues:
a) d_delete() being called too early in vfs_unlink(). Not a big
deal, it's easy to move outside of dget()/dput(). However, you _can't_
expect unlink() to make dentry negative. It's always possible that it
will be left positive and unhashed - that's what we have to do if file
we are unlinking is opened.
b) rm -rf expecting offsets in directory to stay stable after
unlink(). B0rken, complain to GNU folks. Sorry, I'm not touching that
code - GNU fileutils source is too yucky.
c) dcache_readdir() behaviour. There was an old patch that makes
it slightly more forgiving; I'll dig it out.
On Sat, 30 Mar 2002, Alexander Viro wrote:
> On Sat, 30 Mar 2002, Mike Galbraith wrote:
>
> > On Sat, 30 Mar 2002, Mike Galbraith wrote:
> >
> > > Hi,
> > >
> > > d_delete() doesn't appear to ever create negative dentries when
> > > called via vfs_unlink() due to the extra reference on the dentry.
> > > In fact, a printk() in the d_delete() spot never ever triggers...
> >
> > Well shoot. I guess I've chased this about as far as I can, and
> > hope this thread wasn't a total waste. I found a better way to
> > get my rm -r to work as before fwiw. Rewinding the directory on
> > seek failure (yeah, could do in three lines, but not the point)
> > works, but is kinda b0rken. I think the only interesting thing
> > in the below is the FIXME :)) but I'll post it anyway.
>
> Your patch is broken. FWIW, there are several real issues:
I'm not in the least suprised. (Actually, I'm rather pleased with
that assesment;)
> a) d_delete() being called too early in vfs_unlink(). Not a big
> deal, it's easy to move outside of dget()/dput(). However, you _can't_
Ok, at least here I was onto something.
> expect unlink() to make dentry negative. It's always possible that it
> will be left positive and unhashed - that's what we have to do if file
> we are unlinking is opened.
This one I'll have to fiddle with (thanks!). I was just looking at
the behavior delta, missing any real understanding of the 'why'.
> b) rm -rf expecting offsets in directory to stay stable after
> unlink(). B0rken, complain to GNU folks. Sorry, I'm not touching that
> code - GNU fileutils source is too yucky.
That was my starting point. (I kinda agree, but it's better than anything
_I_ could write from scratch, so..)
> c) dcache_readdir() behaviour. There was an old patch that makes
> it slightly more forgiving; I'll dig it out.
I'd love to see it
-Mike
On Sat, 30 Mar 2002, Alexander Viro wrote:
> a) d_delete() being called too early in vfs_unlink(). Not a big
> deal, it's easy to move outside of dget()/dput().
Indeed, that worked just fine. I ran Bill Hawes' fs-test scripts in
ext2, ext3 and tmpfs after doing so, and nothing interesting happened.
When I ran them in a knfsd loopback mounted ext2 fs however, a couple
of gripes popped out.
nfs_safe_remove: dir1/.nfs00009d5000000b6e busy, d_count=2
nfs_safe_remove: dir2/.nfs00009d5d00000c40 busy, d_count=2
-Mike
For reference:
--- linux-2.5.7/fs/namei.c.org Sat Mar 30 12:54:29 2002
+++ linux-2.5.7/fs/namei.c Sun Mar 31 05:59:24 2002
@@ -1466,16 +1466,15 @@
down(&dentry->d_inode->i_sem);
if (d_mountpoint(dentry))
error = -EBUSY;
- else {
+ else
error = dir->i_op->unlink(dir, dentry);
- if (!error)
- d_delete(dentry);
- }
up(&dentry->d_inode->i_sem);
dput(dentry);
- if (!error)
+ if (!error) {
+ d_delete(dentry);
inode_dir_notify(dir, DN_DELETE);
+ }
return error;
}