LinuxLists.cc - NFSroot hangs with bad unlock balance in Linux next

2016-05-05 22:03:48

Subject: NFSroot hangs with bad unlock balance in Linux next

Hi,

Looks like Linux next with NFSroot hangs for me at some point booting
into init. Then after a while it produces "BUG: bad unlock balance
detected!".

This happens at least with omap5-uevm and igepv5. Not sure yet if it
also happens on other boards, the ones I'm seeing it happen both have
USB Ethernet controller. They usually hang after the system starts
being idle some tens of seconds into booting.

I tried to bisect it down with no luck. I do have the following
trace, does that provide any clues?

Regards,

Tony

8< --------------------------
[ BUG: bad unlock balance detected! ]
4.6.0-rc6-next-20160505+ #1252 Not tainted
-------------------------------------
kworker/0:2/112 is trying to release lock (&nfsi->rmdir_sem) at:
[<c03a7f98>] nfs_async_unlink_release+0x20/0x68
but there are no more locks to release!

other info that might help us debug this:
2 locks held by kworker/0:2/112:
#0: ("nfsiod"){.+.+..}, at: [<c0154774>] process_one_work+0x120/0x6bc
#1: ((&task->u.tk_work)#2){+.+...}, at: [<c0154774>] process_one_work+0x120/0x6bc

stack backtrace:
CPU: 0 PID: 112 Comm: kworker/0:2 Not tainted 4.6.0-rc6-next-20160505+ #1252
Hardware name: Generic OMAP5 (Flattened Device Tree)
Workqueue: nfsiod rpc_async_release
[<c0110328>] (unwind_backtrace) from [<c010c3cc>] (show_stack+0x10/0x14)
[<c010c3cc>] (show_stack) from [<c0481348>] (dump_stack+0xb0/0xe4)
[<c0481348>] (dump_stack) from [<c018d638>] (print_unlock_imbalance_bug+0xb0/0xe0)
[<c018d638>] (print_unlock_imbalance_bug) from [<c0191444>] (lock_release+0x2ec/0x4c4)
[<c0191444>] (lock_release) from [<c018abf0>] (up_read+0x18/0x58)
[<c018abf0>] (up_read) from [<c03a7f98>] (nfs_async_unlink_release+0x20/0x68)
[<c03a7f98>] (nfs_async_unlink_release) from [<c0785320>] (rpc_free_task+0x24/0x44)
[<c0785320>] (rpc_free_task) from [<c015483c>] (process_one_work+0x1e8/0x6bc)
[<c015483c>] (process_one_work) from [<c0154e54>] (worker_thread+0x144/0x4e8)
[<c0154e54>] (worker_thread) from [<c015ad0c>] (kthread+0xdc/0xf8)
[<c015ad0c>] (kthread) from [<c01078f0>] (ret_from_fork+0x14/0x24)

2016-05-08 14:16:39

by Al Viro

[permalink] [raw]

Subject: Re: NFSroot hangs with bad unlock balance in Linux next

On Thu, May 05, 2016 at 03:03:44PM -0700, Tony Lindgren wrote:
> Hi,
>
> Looks like Linux next with NFSroot hangs for me at some point booting
> into init. Then after a while it produces "BUG: bad unlock balance
> detected!".
>
> This happens at least with omap5-uevm and igepv5. Not sure yet if it
> also happens on other boards, the ones I'm seeing it happen both have
> USB Ethernet controller. They usually hang after the system starts
> being idle some tens of seconds into booting.
>
> I tried to bisect it down with no luck. I do have the following
> trace, does that provide any clues?

> kworker/0:2/112 is trying to release lock (&nfsi->rmdir_sem) at:
> [<c03a7f98>] nfs_async_unlink_release+0x20/0x68
> but there are no more locks to release!

Very strange. We grab that rwsem at the entry into nfs_call_unlink()
and then either release it there and return or call nfs_do_call_unlink().
Which arranges for eventual call of nfs_async_unlink_release() (via
->rpc_release); nfs_async_unlink_release() releases the rwsem. Nobody else
releases it (on the read side, that is).

The only kinda-sorta possibility I see here is that the inode we are
unlocking in that nfs_async_unlink_release() is not the one we'd locked
in nfs_call_unlink() that has lead to it. That really shouldn't happen,
though... Just to verify whether that's what we are hitting, could you
try to reproduce that thing with the patch below on top of -next and see
if it triggers any of those WARN_ON?

diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
index d367b06..dbbb4c9 100644
--- a/fs/nfs/unlink.c
+++ b/fs/nfs/unlink.c
@@ -64,6 +64,10 @@ static void nfs_async_unlink_release(void *calldata)
struct dentry *dentry = data->dentry;
struct super_block *sb = dentry->d_sb;

+ if (WARN_ON(data->parent != dentry->d_parent) ||
+ WARN_ON(data->parent_inode != dentry->d_parent->d_inode)) {
+ printk(KERN_ERR "WTF2[%pd4]", dentry);
+ }
up_read(&NFS_I(d_inode(dentry->d_parent))->rmdir_sem);
d_lookup_done(dentry);
nfs_free_unlinkdata(data);
@@ -114,7 +118,8 @@ static void nfs_do_call_unlink(struct nfs_unlinkdata *data)

static int nfs_call_unlink(struct dentry *dentry, struct nfs_unlinkdata *data)
{
- struct inode *dir = d_inode(dentry->d_parent);
+ struct dentry *parent = dentry->d_parent;
+ struct inode *dir = d_inode(parent);
struct dentry *alias;

down_read(&NFS_I(dir)->rmdir_sem);
@@ -152,6 +157,12 @@ static int nfs_call_unlink(struct dentry *dentry, struct nfs_unlinkdata *data)
return ret;
}
data->dentry = alias;
+ data->parent = parent;
+ data->parent_inode = dir;
+ if (WARN_ON(parent != alias->d_parent) ||
+ WARN_ON(dir != parent->d_inode)) {
+ printk(KERN_ERR "WTF1[%pd4]", alias);
+ }
nfs_do_call_unlink(data);
return 1;
}
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index ee8491d..b01a7f1 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1471,6 +1471,8 @@ struct nfs_unlinkdata {
struct nfs_removeargs args;
struct nfs_removeres res;
struct dentry *dentry;
+ struct dentry *parent;
+ struct inode *parent_inode;
wait_queue_head_t wq;
struct rpc_cred *cred;
struct nfs_fattr dir_attr;

2016-05-09 07:32:43

by Al Viro

[permalink] [raw]

Subject: Re: NFSroot hangs with bad unlock balance in Linux next

On Sun, May 08, 2016 at 03:16:29PM +0100, Al Viro wrote:

> Very strange. We grab that rwsem at the entry into nfs_call_unlink()
> and then either release it there and return or call nfs_do_call_unlink().
> Which arranges for eventual call of nfs_async_unlink_release() (via
> ->rpc_release); nfs_async_unlink_release() releases the rwsem. Nobody else
> releases it (on the read side, that is).
>
> The only kinda-sorta possibility I see here is that the inode we are
> unlocking in that nfs_async_unlink_release() is not the one we'd locked
> in nfs_call_unlink() that has lead to it. That really shouldn't happen,
> though... Just to verify whether that's what we are hitting, could you
> try to reproduce that thing with the patch below on top of -next and see
> if it triggers any of those WARN_ON?

D'oh... Lockdep warnings are easy to trigger (and, AFAICS, bogus).
up_read/down_read in fs/nfs/unlink.c should be replaced with
up_read_non_owner/down_read_non_owner, lest the lockdep gets confused.
Hangs are different - I've no idea what's triggering those. I've seen
something similar on that -next, but not on work.lookups.

The joy of bisecting -next... <a couple of hours later>
9317bb69824ec8d078b0b786b6971aedb0af3d4f is the first bad commit
commit 9317bb69824ec8d078b0b786b6971aedb0af3d4f
Author: Eric Dumazet <[email protected]>
Date: Mon Apr 25 10:39:32 2016 -0700

net: SOCKWQ_ASYNC_NOSPACE optimizations

Reverting changes to sk_set_bit/sk_clear_bit gets rid of the hangs. Plain
revert gives a conflict, since there had been additional change in
"net: SOCKWQ_ASYNC_WAITDATA optimizations"; removing both fixed the hangs.

Note that hangs appear without any fs/nfs/unlink.c modifications being
there. When the hang happens it affects NFS traffic; ssh session still
works fine until it steps on a filesystem operation on NFS (i.e. you
can use builtins, access procfs, etc.)

2016-05-09 14:14:43

by Eric Dumazet

[permalink] [raw]

Subject: Re: NFSroot hangs with bad unlock balance in Linux next

On Mon, May 9, 2016 at 12:32 AM, Al Viro <[email protected]> wrote:
> On Sun, May 08, 2016 at 03:16:29PM +0100, Al Viro wrote:
>
>> Very strange. We grab that rwsem at the entry into nfs_call_unlink()
>> and then either release it there and return or call nfs_do_call_unlink().
>> Which arranges for eventual call of nfs_async_unlink_release() (via
>> ->rpc_release); nfs_async_unlink_release() releases the rwsem. Nobody else
>> releases it (on the read side, that is).
>>
>> The only kinda-sorta possibility I see here is that the inode we are
>> unlocking in that nfs_async_unlink_release() is not the one we'd locked
>> in nfs_call_unlink() that has lead to it. That really shouldn't happen,
>> though... Just to verify whether that's what we are hitting, could you
>> try to reproduce that thing with the patch below on top of -next and see
>> if it triggers any of those WARN_ON?
>
> D'oh... Lockdep warnings are easy to trigger (and, AFAICS, bogus).
> up_read/down_read in fs/nfs/unlink.c should be replaced with
> up_read_non_owner/down_read_non_owner, lest the lockdep gets confused.
> Hangs are different - I've no idea what's triggering those. I've seen
> something similar on that -next, but not on work.lookups.
>
> The joy of bisecting -next... <a couple of hours later>
> 9317bb69824ec8d078b0b786b6971aedb0af3d4f is the first bad commit
> commit 9317bb69824ec8d078b0b786b6971aedb0af3d4f
> Author: Eric Dumazet <[email protected]>
> Date: Mon Apr 25 10:39:32 2016 -0700
>
> net: SOCKWQ_ASYNC_NOSPACE optimizations
>
> Reverting changes to sk_set_bit/sk_clear_bit gets rid of the hangs. Plain
> revert gives a conflict, since there had been additional change in
> "net: SOCKWQ_ASYNC_WAITDATA optimizations"; removing both fixed the hangs.
>
> Note that hangs appear without any fs/nfs/unlink.c modifications being
> there. When the hang happens it affects NFS traffic; ssh session still
> works fine until it steps on a filesystem operation on NFS (i.e. you
> can use builtins, access procfs, etc.)

Yeah, the issue was reported last week (
http://www.spinics.net/lists/netdev/msg375777.html ),
and I could not convince myself to add a new sock flag, like
SOCK_FASYNC_STICKY.

(Just in case NFS would ever call sock_fasync() with an empty
fasync_list, and SOCK_FASYNC would be cleared again.

2016-05-09 15:12:24

by Tony Lindgren

[permalink] [raw]

Subject: Re: NFSroot hangs with bad unlock balance in Linux next

* Eric Dumazet <[email protected]> [160509 07:16]:
> On Mon, May 9, 2016 at 12:32 AM, Al Viro <[email protected]> wrote:
> > On Sun, May 08, 2016 at 03:16:29PM +0100, Al Viro wrote:
> >
> >> Very strange. We grab that rwsem at the entry into nfs_call_unlink()
> >> and then either release it there and return or call nfs_do_call_unlink().
> >> Which arranges for eventual call of nfs_async_unlink_release() (via
> >> ->rpc_release); nfs_async_unlink_release() releases the rwsem. Nobody else
> >> releases it (on the read side, that is).
> >>
> >> The only kinda-sorta possibility I see here is that the inode we are
> >> unlocking in that nfs_async_unlink_release() is not the one we'd locked
> >> in nfs_call_unlink() that has lead to it. That really shouldn't happen,
> >> though... Just to verify whether that's what we are hitting, could you
> >> try to reproduce that thing with the patch below on top of -next and see
> >> if it triggers any of those WARN_ON?

Thanks no warnings with that patch though.

> > D'oh... Lockdep warnings are easy to trigger (and, AFAICS, bogus).
> > up_read/down_read in fs/nfs/unlink.c should be replaced with
> > up_read_non_owner/down_read_non_owner, lest the lockdep gets confused.
> > Hangs are different - I've no idea what's triggering those. I've seen
> > something similar on that -next, but not on work.lookups.
> >
> > The joy of bisecting -next... <a couple of hours later>
> > 9317bb69824ec8d078b0b786b6971aedb0af3d4f is the first bad commit
> > commit 9317bb69824ec8d078b0b786b6971aedb0af3d4f
> > Author: Eric Dumazet <[email protected]>
> > Date: Mon Apr 25 10:39:32 2016 -0700
> >
> > net: SOCKWQ_ASYNC_NOSPACE optimizations
> >
> > Reverting changes to sk_set_bit/sk_clear_bit gets rid of the hangs. Plain
> > revert gives a conflict, since there had been additional change in
> > "net: SOCKWQ_ASYNC_WAITDATA optimizations"; removing both fixed the hangs.
> >
> > Note that hangs appear without any fs/nfs/unlink.c modifications being
> > there. When the hang happens it affects NFS traffic; ssh session still
> > works fine until it steps on a filesystem operation on NFS (i.e. you
> > can use builtins, access procfs, etc.)
>
> Yeah, the issue was reported last week (
> http://www.spinics.net/lists/netdev/msg375777.html ),
> and I could not convince myself to add a new sock flag, like
> SOCK_FASYNC_STICKY.
>
> (Just in case NFS would ever call sock_fasync() with an empty
> fasync_list, and SOCK_FASYNC would be cleared again.

Yeah applying the test patch from the url above makes things work
for me again.

Regards,

Tony

2016-05-09 15:21:43

by Tony Lindgren

[permalink] [raw]

Subject: Re: NFSroot hangs with bad unlock balance in Linux next

* Tony Lindgren <[email protected]> [160509 08:15]:
> * Eric Dumazet <[email protected]> [160509 07:16]:
> > On Mon, May 9, 2016 at 12:32 AM, Al Viro <[email protected]> wrote:
> > > On Sun, May 08, 2016 at 03:16:29PM +0100, Al Viro wrote:
> > >
> > >> Very strange. We grab that rwsem at the entry into nfs_call_unlink()
> > >> and then either release it there and return or call nfs_do_call_unlink().
> > >> Which arranges for eventual call of nfs_async_unlink_release() (via
> > >> ->rpc_release); nfs_async_unlink_release() releases the rwsem. Nobody else
> > >> releases it (on the read side, that is).
> > >>
> > >> The only kinda-sorta possibility I see here is that the inode we are
> > >> unlocking in that nfs_async_unlink_release() is not the one we'd locked
> > >> in nfs_call_unlink() that has lead to it. That really shouldn't happen,
> > >> though... Just to verify whether that's what we are hitting, could you
> > >> try to reproduce that thing with the patch below on top of -next and see
> > >> if it triggers any of those WARN_ON?
>
> Thanks no warnings with that patch though.
>
> > > D'oh... Lockdep warnings are easy to trigger (and, AFAICS, bogus).
> > > up_read/down_read in fs/nfs/unlink.c should be replaced with
> > > up_read_non_owner/down_read_non_owner, lest the lockdep gets confused.
> > > Hangs are different - I've no idea what's triggering those. I've seen
> > > something similar on that -next, but not on work.lookups.
> > >
> > > The joy of bisecting -next... <a couple of hours later>
> > > 9317bb69824ec8d078b0b786b6971aedb0af3d4f is the first bad commit
> > > commit 9317bb69824ec8d078b0b786b6971aedb0af3d4f
> > > Author: Eric Dumazet <[email protected]>
> > > Date: Mon Apr 25 10:39:32 2016 -0700
> > >
> > > net: SOCKWQ_ASYNC_NOSPACE optimizations
> > >
> > > Reverting changes to sk_set_bit/sk_clear_bit gets rid of the hangs. Plain
> > > revert gives a conflict, since there had been additional change in
> > > "net: SOCKWQ_ASYNC_WAITDATA optimizations"; removing both fixed the hangs.
> > >
> > > Note that hangs appear without any fs/nfs/unlink.c modifications being
> > > there. When the hang happens it affects NFS traffic; ssh session still
> > > works fine until it steps on a filesystem operation on NFS (i.e. you
> > > can use builtins, access procfs, etc.)
> >
> > Yeah, the issue was reported last week (
> > http://www.spinics.net/lists/netdev/msg375777.html ),
> > and I could not convince myself to add a new sock flag, like
> > SOCK_FASYNC_STICKY.
> >
> > (Just in case NFS would ever call sock_fasync() with an empty
> > fasync_list, and SOCK_FASYNC would be cleared again.
>
> Yeah applying the test patch from the url above makes things work
> for me again.

Looks like with both patches applied I still also get this eventually:

=====================================
[ BUG: bad unlock balance detected! ]
4.6.0-rc7-next-20160509+ #1264 Not tainted
-------------------------------------
kworker/0:1/18 is trying to release lock (&nfsi->rmdir_sem) at:
[<c03a894c>] nfs_async_unlink_release+0x3c/0xc0
but there are no more locks to release!

other info that might help us debug this:
2 locks held by kworker/0:1/18:
#0: ("nfsiod"){.+.+..}, at: [<c015473c>] process_one_work+0x120/0x6bc
#1: ((&task->u.tk_work)#2){+.+...}, at: [<c015473c>] process_one_work+0x120/0x6bc

stack backtrace:
CPU: 0 PID: 18 Comm: kworker/0:1 Not tainted 4.6.0-rc7-next-20160509+ #1264
Hardware name: Generic OMAP5 (Flattened Device Tree)
Workqueue: nfsiod rpc_async_release
[<c0110318>] (unwind_backtrace) from [<c010c3bc>] (show_stack+0x10/0x14)
[<c010c3bc>] (show_stack) from [<c0481da8>] (dump_stack+0xb0/0xe4)
[<c0481da8>] (dump_stack) from [<c018d090>] (print_unlock_imbalance_bug+0xb0/0xe0)
[<c018d090>] (print_unlock_imbalance_bug) from [<c0190e9c>] (lock_release+0x2ec/0x4c0)
[<c0190e9c>] (lock_release) from [<c018a4b8>] (up_read+0x18/0x58)
[<c018a4b8>] (up_read) from [<c03a894c>] (nfs_async_unlink_release+0x3c/0xc0)
[<c03a894c>] (nfs_async_unlink_release) from [<c07860e0>] (rpc_free_task+0x24/0x44)
[<c07860e0>] (rpc_free_task) from [<c0154804>] (process_one_work+0x1e8/0x6bc)
[<c0154804>] (process_one_work) from [<c0154e1c>] (worker_thread+0x144/0x4e8)
[<c0154e1c>] (worker_thread) from [<c015acd4>] (kthread+0xdc/0xf8)
[<c015acd4>] (kthread) from [<c01078f0>] (ret_from_fork+0x14/0x24)

After the warning, NFSroot keeps working with Eric's patch.

Regards,

Tony

2016-05-09 15:39:39

by Al Viro

[permalink] [raw]

Subject: Re: NFSroot hangs with bad unlock balance in Linux next

On Mon, May 09, 2016 at 08:21:38AM -0700, Tony Lindgren wrote:

> Looks like with both patches applied I still also get this eventually:
>
> =====================================
> [ BUG: bad unlock balance detected! ]
> 4.6.0-rc7-next-20160509+ #1264 Not tainted
> -------------------------------------

Lockdep warnings are noise. To make them STFU try the following incremental;
I'll fold it into #work.lookups and #for-next. Note that it will do nothing
to hangs - those are completely unrelated and you need Eric's patch to deal
with them.

diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
index d367b06..1868246 100644
--- a/fs/nfs/unlink.c
+++ b/fs/nfs/unlink.c
@@ -64,7 +64,7 @@ static void nfs_async_unlink_release(void *calldata)
struct dentry *dentry = data->dentry;
struct super_block *sb = dentry->d_sb;

- up_read(&NFS_I(d_inode(dentry->d_parent))->rmdir_sem);
+ up_read_non_owner(&NFS_I(d_inode(dentry->d_parent))->rmdir_sem);
d_lookup_done(dentry);
nfs_free_unlinkdata(data);
dput(dentry);
@@ -117,10 +117,10 @@ static int nfs_call_unlink(struct dentry *dentry, struct nfs_unlinkdata *data)
struct inode *dir = d_inode(dentry->d_parent);
struct dentry *alias;

- down_read(&NFS_I(dir)->rmdir_sem);
+ down_read_non_owner(&NFS_I(dir)->rmdir_sem);
alias = d_alloc_parallel(dentry->d_parent, &data->args.name, &data->wq);
if (IS_ERR(alias)) {
- up_read(&NFS_I(dir)->rmdir_sem);
+ up_read_non_owner(&NFS_I(dir)->rmdir_sem);
return 0;
}
if (!d_in_lookup(alias)) {
@@ -142,7 +142,7 @@ static int nfs_call_unlink(struct dentry *dentry, struct nfs_unlinkdata *data)
ret = 0;
spin_unlock(&alias->d_lock);
dput(alias);
- up_read(&NFS_I(dir)->rmdir_sem);
+ up_read_non_owner(&NFS_I(dir)->rmdir_sem);
/*
* If we'd displaced old cached devname, free it. At that
* point dentry is definitely not a root, so we won't need

2016-05-09 19:40:58

by Tony Lindgren

[permalink] [raw]

Subject: Re: NFSroot hangs with bad unlock balance in Linux next

* Al Viro <[email protected]> [160509 08:41]:
> On Mon, May 09, 2016 at 08:21:38AM -0700, Tony Lindgren wrote:
>
> > Looks like with both patches applied I still also get this eventually:
> >
> > =====================================
> > [ BUG: bad unlock balance detected! ]
> > 4.6.0-rc7-next-20160509+ #1264 Not tainted
> > -------------------------------------
>
> Lockdep warnings are noise. To make them STFU try the following incremental;
> I'll fold it into #work.lookups and #for-next. Note that it will do nothing
> to hangs - those are completely unrelated and you need Eric's patch to deal
> with them.

OK yeah that helps thanks:

Tested-by: Tony Lindgren <[email protected]>

> diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
> index d367b06..1868246 100644
> --- a/fs/nfs/unlink.c
> +++ b/fs/nfs/unlink.c
> @@ -64,7 +64,7 @@ static void nfs_async_unlink_release(void *calldata)
> struct dentry *dentry = data->dentry;
> struct super_block *sb = dentry->d_sb;
>
> - up_read(&NFS_I(d_inode(dentry->d_parent))->rmdir_sem);
> + up_read_non_owner(&NFS_I(d_inode(dentry->d_parent))->rmdir_sem);
> d_lookup_done(dentry);
> nfs_free_unlinkdata(data);
> dput(dentry);
> @@ -117,10 +117,10 @@ static int nfs_call_unlink(struct dentry *dentry, struct nfs_unlinkdata *data)
> struct inode *dir = d_inode(dentry->d_parent);
> struct dentry *alias;
>
> - down_read(&NFS_I(dir)->rmdir_sem);
> + down_read_non_owner(&NFS_I(dir)->rmdir_sem);
> alias = d_alloc_parallel(dentry->d_parent, &data->args.name, &data->wq);
> if (IS_ERR(alias)) {
> - up_read(&NFS_I(dir)->rmdir_sem);
> + up_read_non_owner(&NFS_I(dir)->rmdir_sem);
> return 0;
> }
> if (!d_in_lookup(alias)) {
> @@ -142,7 +142,7 @@ static int nfs_call_unlink(struct dentry *dentry, struct nfs_unlinkdata *data)
> ret = 0;
> spin_unlock(&alias->d_lock);
> dput(alias);
> - up_read(&NFS_I(dir)->rmdir_sem);
> + up_read_non_owner(&NFS_I(dir)->rmdir_sem);
> /*
> * If we'd displaced old cached devname, free it. At that
> * point dentry is definitely not a root, so we won't need