2002-03-30 13:02:32

by Mike Galbraith

[permalink] [raw]
Subject: vfs_unlink() >=2.5.5-pre1 question

Hi,

d_delete() doesn't appear to ever create negative dentries when
called via vfs_unlink() due to the extra reference on the dentry.
In fact, a printk() in the d_delete() spot never ever triggers...

This could explain the behavior change, wherein rm -r on a large
directory no longer works in ramfs/tmpfs, because the dentries are
now destroyed instead of going negative. [1] Is it intentional
that no negative dentries are created via vfs_unlink() method?

-Mike

1. if that's true, memory pressure could make readdir fail too.


2002-03-30 16:50:24

by Mike Galbraith

[permalink] [raw]
Subject: Re: vfs_unlink() >=2.5.5-pre1 question

On Sat, 30 Mar 2002, Mike Galbraith wrote:

> Hi,
>
> d_delete() doesn't appear to ever create negative dentries when
> called via vfs_unlink() due to the extra reference on the dentry.
> In fact, a printk() in the d_delete() spot never ever triggers...

Well shoot. I guess I've chased this about as far as I can, and
hope this thread wasn't a total waste. I found a better way to
get my rm -r to work as before fwiw. Rewinding the directory on
seek failure (yeah, could do in three lines, but not the point)
works, but is kinda b0rken. I think the only interesting thing
in the below is the FIXME :)) but I'll post it anyway.

EOT,

-Mike

Private comments on this quite welcome. Locking is a bitch :-)

--- fs/dcache.c.org Sat Mar 30 12:27:34 2002
+++ fs/dcache.c Sat Mar 30 17:32:43 2002
@@ -803,7 +803,10 @@
* @dentry: The dentry to delete
*
* Turn the dentry into a negative dentry if possible, otherwise
- * remove it from the hash queues so it can be deleted later
+ * remove it from the hash queues so it can be deleted later.
+ * FIXME: this assumes more than GOD allows :)
+ * This function must be called while holding an extra reference
+ * to the dentry, and with the dentry's d_inode pinned down.
*/

void d_delete(struct dentry * dentry)
@@ -812,7 +815,7 @@
* Are we the only user?
*/
spin_lock(&dcache_lock);
- if (atomic_read(&dentry->d_count) == 1) {
+ if (atomic_read(&dentry->d_count) == 2) {
dentry_iput(dentry);
return;
}
--- fs/namei.c.org Sat Mar 30 12:54:29 2002
+++ fs/namei.c Sat Mar 30 17:17:02 2002
@@ -1453,6 +1453,7 @@
int vfs_unlink(struct inode *dir, struct dentry *dentry)
{
int error = may_delete(dir, dentry, 0);
+ struct inode *inode;

if (error)
return error;
@@ -1463,7 +1464,12 @@
DQUOT_INIT(dir);

dget(dentry);
- down(&dentry->d_inode->i_sem);
+ inode = igrab(dentry->d_inode);
+ if (!inode) {
+ dput(dentry);
+ return -EBUSY;
+ }
+ down(&inode->i_sem);
if (d_mountpoint(dentry))
error = -EBUSY;
else {
@@ -1471,7 +1477,10 @@
if (!error)
d_delete(dentry);
}
- up(&dentry->d_inode->i_sem);
+ if (inode) {
+ up(&inode->i_sem);
+ iput(inode);
+ }
dput(dentry);

if (!error)

(Hmm. Actually, rewinding on seek failure is a more useful lie
than a NULL when the thing ain't empty, so I think I'll go apply
the three liner again and see if anything gripes)

2002-03-30 18:28:40

by Alexander Viro

[permalink] [raw]
Subject: Re: vfs_unlink() >=2.5.5-pre1 question



On Sat, 30 Mar 2002, Mike Galbraith wrote:

> On Sat, 30 Mar 2002, Mike Galbraith wrote:
>
> > Hi,
> >
> > d_delete() doesn't appear to ever create negative dentries when
> > called via vfs_unlink() due to the extra reference on the dentry.
> > In fact, a printk() in the d_delete() spot never ever triggers...
>
> Well shoot. I guess I've chased this about as far as I can, and
> hope this thread wasn't a total waste. I found a better way to
> get my rm -r to work as before fwiw. Rewinding the directory on
> seek failure (yeah, could do in three lines, but not the point)
> works, but is kinda b0rken. I think the only interesting thing
> in the below is the FIXME :)) but I'll post it anyway.

Your patch is broken. FWIW, there are several real issues:
a) d_delete() being called too early in vfs_unlink(). Not a big
deal, it's easy to move outside of dget()/dput(). However, you _can't_
expect unlink() to make dentry negative. It's always possible that it
will be left positive and unhashed - that's what we have to do if file
we are unlinking is opened.
b) rm -rf expecting offsets in directory to stay stable after
unlink(). B0rken, complain to GNU folks. Sorry, I'm not touching that
code - GNU fileutils source is too yucky.
c) dcache_readdir() behaviour. There was an old patch that makes
it slightly more forgiving; I'll dig it out.

2002-03-30 19:46:15

by Mike Galbraith

[permalink] [raw]
Subject: Re: vfs_unlink() >=2.5.5-pre1 question

On Sat, 30 Mar 2002, Alexander Viro wrote:

> On Sat, 30 Mar 2002, Mike Galbraith wrote:
>
> > On Sat, 30 Mar 2002, Mike Galbraith wrote:
> >
> > > Hi,
> > >
> > > d_delete() doesn't appear to ever create negative dentries when
> > > called via vfs_unlink() due to the extra reference on the dentry.
> > > In fact, a printk() in the d_delete() spot never ever triggers...
> >
> > Well shoot. I guess I've chased this about as far as I can, and
> > hope this thread wasn't a total waste. I found a better way to
> > get my rm -r to work as before fwiw. Rewinding the directory on
> > seek failure (yeah, could do in three lines, but not the point)
> > works, but is kinda b0rken. I think the only interesting thing
> > in the below is the FIXME :)) but I'll post it anyway.
>
> Your patch is broken. FWIW, there are several real issues:

I'm not in the least suprised. (Actually, I'm rather pleased with
that assesment;)

> a) d_delete() being called too early in vfs_unlink(). Not a big
> deal, it's easy to move outside of dget()/dput(). However, you _can't_

Ok, at least here I was onto something.

> expect unlink() to make dentry negative. It's always possible that it
> will be left positive and unhashed - that's what we have to do if file
> we are unlinking is opened.

This one I'll have to fiddle with (thanks!). I was just looking at
the behavior delta, missing any real understanding of the 'why'.

> b) rm -rf expecting offsets in directory to stay stable after
> unlink(). B0rken, complain to GNU folks. Sorry, I'm not touching that
> code - GNU fileutils source is too yucky.

That was my starting point. (I kinda agree, but it's better than anything
_I_ could write from scratch, so..)

> c) dcache_readdir() behaviour. There was an old patch that makes
> it slightly more forgiving; I'll dig it out.

I'd love to see it

-Mike

2002-03-31 06:02:37

by Mike Galbraith

[permalink] [raw]
Subject: [NFS] Re: vfs_unlink() >=2.5.5-pre1 question

On Sat, 30 Mar 2002, Alexander Viro wrote:

> a) d_delete() being called too early in vfs_unlink(). Not a big
> deal, it's easy to move outside of dget()/dput().

Indeed, that worked just fine. I ran Bill Hawes' fs-test scripts in
ext2, ext3 and tmpfs after doing so, and nothing interesting happened.

When I ran them in a knfsd loopback mounted ext2 fs however, a couple
of gripes popped out.

nfs_safe_remove: dir1/.nfs00009d5000000b6e busy, d_count=2
nfs_safe_remove: dir2/.nfs00009d5d00000c40 busy, d_count=2

-Mike

For reference:

--- linux-2.5.7/fs/namei.c.org Sat Mar 30 12:54:29 2002
+++ linux-2.5.7/fs/namei.c Sun Mar 31 05:59:24 2002
@@ -1466,16 +1466,15 @@
down(&dentry->d_inode->i_sem);
if (d_mountpoint(dentry))
error = -EBUSY;
- else {
+ else
error = dir->i_op->unlink(dir, dentry);
- if (!error)
- d_delete(dentry);
- }
up(&dentry->d_inode->i_sem);
dput(dentry);

- if (!error)
+ if (!error) {
+ d_delete(dentry);
inode_dir_notify(dir, DN_DELETE);
+ }

return error;
}