2016-08-10 18:06:36

by Linus Torvalds

[permalink] [raw]
Subject: Re: [lkp] [nfsd] b44061d0b9: BUG: Dentry ffff880027d7c540{i=1846f,n=0a} still in use (1) [unmount of btrfs vda]

On Tue, Aug 9, 2016 at 10:39 PM, kernel test robot
<[email protected]> wrote:
>
> [ 1537.558739] nfsd: last server has exited, flushing export cache
> [ 1540.627795] BUG: Dentry ffff880027d7c540{i=1846f,n=0a} still in use (1) [unmount of btrfs vda]
> [ 1540.633915] ------------[ cut here ]------------
> [ 1540.636551] WARNING: CPU: 0 PID: 20552 at fs/dcache.c:1474 umount_check+0x72/0x80

Hmm. Adding Al and the btrfs people to the cc, and expanding the nfs
list. Unlike the flakey-IO warning, this one sounds like a real issue.
Whether it's some dentry leak by nfsd or a VFS or btrfs issue, I can't
begin to guess. Al, ideas?

More information in the original email on lkml.

Linus


2016-08-10 18:34:21

by Josef Bacik

[permalink] [raw]
Subject: Re: [lkp] [nfsd] b44061d0b9: BUG: Dentry ffff880027d7c540{i=1846f,n=0a} still in use (1) [unmount of btrfs vda]

On 08/10/2016 02:25 PM, Linus Torvalds wrote:
> On Wed, Aug 10, 2016 at 11:22 AM, Josef Bacik <[email protected]> wrote:
>> On 08/10/2016 02:06 PM, Linus Torvalds wrote:
>>>
>>> More information in the original email on lkml.
>>
>> I'm not subscribed to lkml and for some reason I can't find the original
>> email in any of the lkml/linux-nfs archives. Could you forward more of the
>> details?
>
> Done.
>

Looks like an NFS problem. Before in the !resfhp->fh_dentry case we would do a
lookup_one_len() and carry on. Now we do the lookup_one_len(), call into
nfsd_create_locked() and do a dget on resfhp->f_dentry, which from looks of it
is the dentry we just did a lookup_one_len() on, so we have one more ref on the
dentry than we did previously. Thanks,

Josef


2016-08-10 18:48:32

by Josef Bacik

[permalink] [raw]
Subject: Re: [lkp] [nfsd] b44061d0b9: BUG: Dentry ffff880027d7c540{i=1846f,n=0a} still in use (1) [unmount of btrfs vda]

On 08/10/2016 02:25 PM, Linus Torvalds wrote:
> On Wed, Aug 10, 2016 at 11:22 AM, Josef Bacik <[email protected]> wrote:
>> On 08/10/2016 02:06 PM, Linus Torvalds wrote:
>>>
>>> More information in the original email on lkml.
>>
>> I'm not subscribed to lkml and for some reason I can't find the original
>> email in any of the lkml/linux-nfs archives. Could you forward more of the
>> details?
>
> Done.
>

So my naive fix would be something like this


From: Josef Bacik <[email protected]>
Date: Wed, 10 Aug 2016 14:43:08 -0400
Subject: [PATCH] nfsd: fix dentry refcounting problem

b44061d0b9 introduced a dentry ref counting bug, previously we were grabbing one
ref to dchild in nfsd_create(), but with the creation of nfsd_create_locked() we
have a ref for dchild from the lookup in nfsd_create(), and then another ref in
nfsd_create_locked(). The ref from the lookup in nfsd_create() is never dropped
and results in dentries still in use at unmount.

Signed-off-by: Josef Bacik <[email protected]>
---
fs/nfsd/vfs.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index ba944123..ff476e6 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1252,10 +1252,13 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
if (IS_ERR(dchild))
return nfserrno(host_err);
err = fh_compose(resfhp, fhp->fh_export, dchild, fhp);
- if (err) {
- dput(dchild);
+ /*
+ * We unconditionally drop our ref to dchild as fh_compose will have
+ * already grabbed its own ref for it.
+ */
+ dput(dchild);
+ if (err)
return err;
- }
return nfsd_create_locked(rqstp, fhp, fname, flen, iap, type,
rdev, resfhp);
}
--
2.5.5


2016-08-10 19:06:27

by Jeff Layton

[permalink] [raw]
Subject: Re: [lkp] [nfsd] b44061d0b9: BUG: Dentry ffff880027d7c540{i=1846f,n=0a} still in use (1) [unmount of btrfs vda]

On Wed, 2016-08-10 at 14:46 -0400, Josef Bacik wrote:
> On 08/10/2016 02:25 PM, Linus Torvalds wrote:
> >
> > > > On Wed, Aug 10, 2016 at 11:22 AM, Josef Bacik <[email protected]> wrote:
> > >
> > > On 08/10/2016 02:06 PM, Linus Torvalds wrote:
> > > >
> > > >
> > > > More information in the original email on lkml.
> > >
> > > I'm not subscribed to lkml and for some reason I can't find the original
> > > email in any of the lkml/linux-nfs archives.  Could you forward more of the
> > > details?
> >
> > Done.
> >
>
> So my naive fix would be something like this
>
>
> > From: Josef Bacik <[email protected]>
> Date: Wed, 10 Aug 2016 14:43:08 -0400
> Subject: [PATCH] nfsd: fix dentry refcounting problem
>
> b44061d0b9 introduced a dentry ref counting bug, previously we were grabbing one
> ref to dchild in nfsd_create(), but with the creation of nfsd_create_locked() we
> have a ref for dchild from the lookup in nfsd_create(), and then another ref in
> nfsd_create_locked().  The ref from the lookup in nfsd_create() is never dropped
> and results in dentries still in use at unmount.
>
> > Signed-off-by: Josef Bacik <[email protected]>
> ---
>   fs/nfsd/vfs.c | 9 ++++++---
>   1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index ba944123..ff476e6 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -1252,10 +1252,13 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
> >    if (IS_ERR(dchild))
> >    return nfserrno(host_err);
> >    err = fh_compose(resfhp, fhp->fh_export, dchild, fhp);
> > - if (err) {
> > - dput(dchild);
> > + /*
> > +  * We unconditionally drop our ref to dchild as fh_compose will have
> > +  * already grabbed its own ref for it.
> > +  */
> > + dput(dchild);
> > + if (err)
> >    return err;
> > - }
> >    return nfsd_create_locked(rqstp, fhp, fname, flen, iap, type,
> >    rdev, resfhp);
>   }

Looks correct to me:

Reviewed-by: Jeff Layton <[email protected]>

2016-08-10 19:09:17

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [lkp] [nfsd] b44061d0b9: BUG: Dentry ffff880027d7c540{i=1846f,n=0a} still in use (1) [unmount of btrfs vda]

On Wed, Aug 10, 2016 at 11:56:15AM -0700, Linus Torvalds wrote:
> On Wed, Aug 10, 2016 at 11:46 AM, Josef Bacik <[email protected]> wrote:
> >
> > So my naive fix would be something like this
>
> Bruce? Josef's patch looks ObviouslyCorrect(tm) to me now that I look
> at it - all the other callers of fh_compose() also seem to just drop
> the dentry unconditionally, knowing that fh_compose() took a ref to it
> if needed.
>
> In fact, the only thing I'd do differently would be to not even put
> the comment there at all, since this call site isn't any different
> from any of the others. If anything, it could go on fh_compose() if we
> want to add comments.

Yep, makes sense to me too.

OK with me if you want to take it, otherwise I'll run it through my
usual tests and send you a pull request probably today or tomorrow.

Thanks, Josef! (And kernel test robot folks--the speed with which
they're catching stuff, and bisecting down to individual commits, is
really amazing to me.)

--b.

2016-08-10 19:15:14

by Linus Torvalds

[permalink] [raw]
Subject: Re: [lkp] [nfsd] b44061d0b9: BUG: Dentry ffff880027d7c540{i=1846f,n=0a} still in use (1) [unmount of btrfs vda]

On Wed, Aug 10, 2016 at 12:09 PM, J. Bruce Fields <[email protected]> wrote:
>
> OK with me if you want to take it, otherwise I'll run it through my
> usual tests and send you a pull request probably today or tomorrow.

I'll let it go through your tree and your usual tests - it's not like
this seems to be a "needs to be in my tree *IMMEDIATELY* or the world
will end!!!11!" kind of issue.

Thanks everybody,

Linus

2016-08-10 18:56:17

by Linus Torvalds

[permalink] [raw]
Subject: Re: [lkp] [nfsd] b44061d0b9: BUG: Dentry ffff880027d7c540{i=1846f,n=0a} still in use (1) [unmount of btrfs vda]

On Wed, Aug 10, 2016 at 11:46 AM, Josef Bacik <[email protected]> wrote:
>
> So my naive fix would be something like this

Bruce? Josef's patch looks ObviouslyCorrect(tm) to me now that I look
at it - all the other callers of fh_compose() also seem to just drop
the dentry unconditionally, knowing that fh_compose() took a ref to it
if needed.

In fact, the only thing I'd do differently would be to not even put
the comment there at all, since this call site isn't any different
from any of the others. If anything, it could go on fh_compose() if we
want to add comments.

Linus

2016-08-10 18:25:54

by Linus Torvalds

[permalink] [raw]
Subject: Re: [lkp] [nfsd] b44061d0b9: BUG: Dentry ffff880027d7c540{i=1846f,n=0a} still in use (1) [unmount of btrfs vda]

On Wed, Aug 10, 2016 at 11:22 AM, Josef Bacik <[email protected]> wrote:
> On 08/10/2016 02:06 PM, Linus Torvalds wrote:
>>
>> More information in the original email on lkml.
>
> I'm not subscribed to lkml and for some reason I can't find the original
> email in any of the lkml/linux-nfs archives. Could you forward more of the
> details?

Done.

Linus

2016-08-10 18:22:20

by Josef Bacik

[permalink] [raw]
Subject: Re: [lkp] [nfsd] b44061d0b9: BUG: Dentry ffff880027d7c540{i=1846f,n=0a} still in use (1) [unmount of btrfs vda]

On 08/10/2016 02:06 PM, Linus Torvalds wrote:
> On Tue, Aug 9, 2016 at 10:39 PM, kernel test robot
> <[email protected]> wrote:
>>
>> [ 1537.558739] nfsd: last server has exited, flushing export cache
>> [ 1540.627795] BUG: Dentry ffff880027d7c540{i=1846f,n=0a} still in use (1) [unmount of btrfs vda]
>> [ 1540.633915] ------------[ cut here ]------------
>> [ 1540.636551] WARNING: CPU: 0 PID: 20552 at fs/dcache.c:1474 umount_check+0x72/0x80
>
> Hmm. Adding Al and the btrfs people to the cc, and expanding the nfs
> list. Unlike the flakey-IO warning, this one sounds like a real issue.
> Whether it's some dentry leak by nfsd or a VFS or btrfs issue, I can't
> begin to guess. Al, ideas?
>
> More information in the original email on lkml.
>

I'm not subscribed to lkml and for some reason I can't find the original email
in any of the lkml/linux-nfs archives. Could you forward more of the details?
Thanks,

Josef


2016-08-11 02:19:27

by Al Viro

[permalink] [raw]
Subject: Re: [lkp] [nfsd] b44061d0b9: BUG: Dentry ffff880027d7c540{i=1846f,n=0a} still in use (1) [unmount of btrfs vda]

On Wed, Aug 10, 2016 at 02:46:27PM -0400, Josef Bacik wrote:
> On 08/10/2016 02:25 PM, Linus Torvalds wrote:
> > On Wed, Aug 10, 2016 at 11:22 AM, Josef Bacik <[email protected]> wrote:
> > > On 08/10/2016 02:06 PM, Linus Torvalds wrote:
> > > >
> > > > More information in the original email on lkml.
> > >
> > > I'm not subscribed to lkml and for some reason I can't find the original
> > > email in any of the lkml/linux-nfs archives. Could you forward more of the
> > > details?
> >
> > Done.
> >
>
> So my naive fix would be something like this
>
>
> From: Josef Bacik <[email protected]>
> Date: Wed, 10 Aug 2016 14:43:08 -0400
> Subject: [PATCH] nfsd: fix dentry refcounting problem
>
> b44061d0b9 introduced a dentry ref counting bug, previously we were grabbing one
> ref to dchild in nfsd_create(), but with the creation of nfsd_create_locked() we
> have a ref for dchild from the lookup in nfsd_create(), and then another ref in
> nfsd_create_locked(). The ref from the lookup in nfsd_create() is never dropped
> and results in dentries still in use at unmount.
>
> Signed-off-by: Josef Bacik <[email protected]>

[sorry, had been off-line since yesterday]

Patch looks sane; feel free to slap Acked-by: Al Viro <[email protected]>
on it. I think it should go through nfsd tree.