2011-04-07 19:44:08

by Nick Bowler

Subject: Lockdep splat in autofs with 2.6.39-rc2

Just saw this on 2.6.39-rc2 after half a day or so of uptime. I've
never seen it before today so it may be a regression from 2.6.38.
Nothing seems to have failed as a result. Please let me know if you
need any more info.

=============================================
[ INFO: possible recursive locking detected ]
2.6.39-rc2 #177
---------------------------------------------
automount/23324 is trying to acquire lock:
(&(&dentry->d_lock)->rlock/1){+.+...}, at: [<ffffffffa034ea28>] autofs4_expire_indirect+0x307/0x484 [autofs4]

but task is already holding lock:
(&(&dentry->d_lock)->rlock/1){+.+...}, at: [<ffffffffa034ea28>] autofs4_expire_indirect+0x307/0x484 [autofs4]

other info that might help us debug this:
2 locks held by automount/23324:
#0: (&(&sbi->lookup_lock)->rlock){+.+...}, at: [<ffffffffa034e9aa>] autofs4_expire_indirect+0x289/0x484 [autofs4]
#1: (&(&dentry->d_lock)->rlock/1){+.+...}, at: [<ffffffffa034ea28>] autofs4_expire_indirect+0x307/0x484 [autofs4]

stack backtrace:
Pid: 23324, comm: automount Not tainted 2.6.39-rc2 #177
Call Trace:
[<ffffffff81061dc7>] __lock_acquire+0xc83/0xcfa
[<ffffffff8105f200>] ? static_obj+0x3d/0x4d
[<ffffffff81062073>] ? lock_release_non_nested+0x1c8/0x227
[<ffffffffa034ea28>] ? autofs4_expire_indirect+0x307/0x484 [autofs4]
[<ffffffffa034ea28>] ? autofs4_expire_indirect+0x307/0x484 [autofs4]
[<ffffffff81061e95>] lock_acquire+0x57/0x6d
[<ffffffffa034ea28>] ? autofs4_expire_indirect+0x307/0x484 [autofs4]
[<ffffffff8131f4e3>] _raw_spin_lock_nested+0x39/0x48
[<ffffffffa034ea28>] ? autofs4_expire_indirect+0x307/0x484 [autofs4]
[<ffffffff8131fb10>] ? _raw_spin_unlock+0x3e/0x4b
[<ffffffffa034ea28>] autofs4_expire_indirect+0x307/0x484 [autofs4]
[<ffffffffa034f184>] ? autofs_dev_ioctl_askumount+0x2d/0x2d [autofs4]
[<ffffffffa034edbd>] autofs4_do_expire_multi+0x30/0xe9 [autofs4]
[<ffffffffa034f184>] ? autofs_dev_ioctl_askumount+0x2d/0x2d [autofs4]
[<ffffffffa034f184>] ? autofs_dev_ioctl_askumount+0x2d/0x2d [autofs4]
[<ffffffffa034f19e>] autofs_dev_ioctl_expire+0x1a/0x1c [autofs4]
[<ffffffffa034f738>] _autofs_dev_ioctl+0x2a3/0x348 [autofs4]
[<ffffffffa034f7eb>] autofs_dev_ioctl+0xe/0x12 [autofs4]
[<ffffffff810c750d>] do_vfs_ioctl+0x45f/0x4ae
[<ffffffff810bab4e>] ? rcu_read_unlock+0x21/0x23
[<ffffffff810c759e>] sys_ioctl+0x42/0x65
[<ffffffff8132463b>] system_call_fastpath+0x16/0x1b

--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)


2011-04-13 19:26:49

by Maciej Rutecki

Subject: Re: Lockdep splat in autofs with 2.6.39-rc2

I created a Bugzilla entry at
https://bugzilla.kernel.org/show_bug.cgi?id=33242
for your bug report, please add your address to the CC list in there, thanks!


On Thursday, 7 April 2011 at 21:44:03 Nick Bowler wrote:
> Just saw this on 2.6.39-rc2 after half a day or so of uptime. I've
> never seen it before today so it may be a regression from 2.6.38.
> Nothing seems have failed as a result. Please let me know if you
> need any more info.
>
> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.39-rc2 #177
> ---------------------------------------------
> automount/23324 is trying to acquire lock:
> (&(&dentry->d_lock)->rlock/1){+.+...}, at: [<ffffffffa034ea28>]
> autofs4_expire_indirect+0x307/0x484 [autofs4]
>
> but task is already holding lock:
> (&(&dentry->d_lock)->rlock/1){+.+...}, at: [<ffffffffa034ea28>]
> autofs4_expire_indirect+0x307/0x484 [autofs4]
>
> other info that might help us debug this:
> 2 locks held by automount/23324:
> #0: (&(&sbi->lookup_lock)->rlock){+.+...}, at: [<ffffffffa034e9aa>] autofs4_expire_indirect+0x289/0x484 [autofs4]
> #1: (&(&dentry->d_lock)->rlock/1){+.+...}, at: [<ffffffffa034ea28>] autofs4_expire_indirect+0x307/0x484 [autofs4]
>
> stack backtrace:
> Pid: 23324, comm: automount Not tainted 2.6.39-rc2 #177
> Call Trace:
> [<ffffffff81061dc7>] __lock_acquire+0xc83/0xcfa
> [<ffffffff8105f200>] ? static_obj+0x3d/0x4d
> [<ffffffff81062073>] ? lock_release_non_nested+0x1c8/0x227
> [<ffffffffa034ea28>] ? autofs4_expire_indirect+0x307/0x484 [autofs4]
> [<ffffffffa034ea28>] ? autofs4_expire_indirect+0x307/0x484 [autofs4]
> [<ffffffff81061e95>] lock_acquire+0x57/0x6d
> [<ffffffffa034ea28>] ? autofs4_expire_indirect+0x307/0x484 [autofs4]
> [<ffffffff8131f4e3>] _raw_spin_lock_nested+0x39/0x48
> [<ffffffffa034ea28>] ? autofs4_expire_indirect+0x307/0x484 [autofs4]
> [<ffffffff8131fb10>] ? _raw_spin_unlock+0x3e/0x4b
> [<ffffffffa034ea28>] autofs4_expire_indirect+0x307/0x484 [autofs4]
> [<ffffffffa034f184>] ? autofs_dev_ioctl_askumount+0x2d/0x2d [autofs4]
> [<ffffffffa034edbd>] autofs4_do_expire_multi+0x30/0xe9 [autofs4]
> [<ffffffffa034f184>] ? autofs_dev_ioctl_askumount+0x2d/0x2d [autofs4]
> [<ffffffffa034f184>] ? autofs_dev_ioctl_askumount+0x2d/0x2d [autofs4]
> [<ffffffffa034f19e>] autofs_dev_ioctl_expire+0x1a/0x1c [autofs4]
> [<ffffffffa034f738>] _autofs_dev_ioctl+0x2a3/0x348 [autofs4]
> [<ffffffffa034f7eb>] autofs_dev_ioctl+0xe/0x12 [autofs4]
> [<ffffffff810c750d>] do_vfs_ioctl+0x45f/0x4ae
> [<ffffffff810bab4e>] ? rcu_read_unlock+0x21/0x23
> [<ffffffff810c759e>] sys_ioctl+0x42/0x65
> [<ffffffff8132463b>] system_call_fastpath+0x16/0x1b

--
Maciej Rutecki
http://www.maciek.unixy.pl

2011-04-13 19:43:05

by Nick Bowler

Subject: Re: Lockdep splat in autofs with 2.6.39-rc2

On 2011-04-13 21:26 +0200, Maciej Rutecki wrote:
> I created a Bugzilla entry at
> https://bugzilla.kernel.org/show_bug.cgi?id=33242 for your bug report,
> please add your address to the CC list in there, thanks!

Unfortunately, that site does not appear to let me add this email to the
CC list:

Kernel Bug Tracker was unable to make any match at all for one or more
of the names and/or email addresses you entered on the previous page.
Please go back and try other names or email addresses.

CC: [email protected] did not match anything

--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)

2011-04-14 19:38:08

by Maciej Rutecki

Subject: Re: Lockdep splat in autofs with 2.6.39-rc2

On Wednesday, 13 April 2011 at 21:43:00 Nick Bowler wrote:
> On 2011-04-13 21:26 +0200, Maciej Rutecki wrote:
> > I created a Bugzilla entry at
> > https://bugzilla.kernel.org/show_bug.cgi?id=33242 for your bug report,
> > please add your address to the CC list in there, thanks!
>
> Unfortunately, that site does not appear to let me add this email to the
> CC list:
>
> Kernel Bug Tracker was unable to make any match at all for one or more
> of the names and/or email addresses you entered on the previous page.
> Please go back and try other names or email addresses.
>
> CC: [email protected] did not match anything

First you need to create an account in Bugzilla, then add your e-mail
address to the CC list. That is why I asked you to add it yourself; I
can't do it for you.

Regards
--
Maciej Rutecki
http://www.maciek.unixy.pl

2011-04-21 21:25:41

by Steven Rostedt

Subject: Re: Lockdep splat in autofs with 2.6.39-rc2

On Thu, Apr 07, 2011 at 03:44:03PM -0400, Nick Bowler wrote:
> Just saw this on 2.6.39-rc2 after half a day or so of uptime. I've
> never seen it before today so it may be a regression from 2.6.38.
> Nothing seems have failed as a result. Please let me know if you
> need any more info.
>

Could you try this patch? I know it may be hard to reproduce, but the
issue is that we are recursing down the locks in a tree/list, and a lock
that was taken as nested then becomes the parent. This patch tells
lockdep about that change.

Signed-off-by: Steven Rostedt <[email protected]>


diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
index 450f529..1feb68e 100644
--- a/fs/autofs4/expire.c
+++ b/fs/autofs4/expire.c
@@ -124,6 +124,7 @@ start:
 	/* Negative dentry - try next */
 	if (!simple_positive(q)) {
 		spin_unlock(&p->d_lock);
+		lock_set_subclass(&q->d_lock.dep_map, 0, _RET_IP_);
 		p = q;
 		goto again;
 	}
@@ -186,6 +187,7 @@ again:
 	/* Negative dentry - try next */
 	if (!simple_positive(ret)) {
 		spin_unlock(&p->d_lock);
+		lock_set_subclass(&ret->d_lock.dep_map, 0, _RET_IP_);
 		p = ret;
 		goto again;
 	}
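
To illustrate why the annotation matters, here is a minimal userspace
sketch of the situation. It is not kernel code and it is a heavy
simplification of what lockdep really tracks: it only models "warn if a
lock of the same class and subclass is already held". Every name in it
(toy_lock, toy_acquire, toy_release, toy_set_subclass, toy_walk) is
hypothetical and exists only for this example. Subclass 0 stands in for
a plain spin_lock(), subclass 1 for spin_lock_nested() with
DENTRY_D_LOCK_NESTED.

```c
#include <assert.h>
#include <stdio.h>

/* Toy model only: all locks belong to one class (think dentry->d_lock). */
#define MAX_HELD 8

struct toy_lock {
	const char *name;
	int subclass;	/* subclass recorded when the lock was taken */
};

static struct toy_lock *held[MAX_HELD];
static int nheld;
static int warnings;

/* Take a lock; complain if a held lock already uses this subclass. */
static void toy_acquire(struct toy_lock *l, int subclass)
{
	assert(nheld < MAX_HELD);
	for (int i = 0; i < nheld; i++) {
		if (held[i]->subclass == subclass) {
			printf("possible recursive locking: %s while holding %s (subclass %d)\n",
			       l->name, held[i]->name, subclass);
			warnings++;
		}
	}
	l->subclass = subclass;
	held[nheld++] = l;
}

/* Drop a lock from the held list. */
static void toy_release(struct toy_lock *l)
{
	for (int i = 0; i < nheld; i++) {
		if (held[i] == l) {
			held[i] = held[--nheld];
			return;
		}
	}
}

/* Stand-in for the relabeling step the patch adds. */
static void toy_set_subclass(struct toy_lock *l, int subclass)
{
	l->subclass = subclass;
}

/* Walk three locks hand over hand; return the number of warnings. */
int toy_walk(int annotate)
{
	struct toy_lock d[3] = { { "d0", 0 }, { "d1", 0 }, { "d2", 0 } };
	struct toy_lock *p = &d[0];

	nheld = 0;
	warnings = 0;
	toy_acquire(p, 0);			/* parent, plain lock */
	for (int i = 1; i < 3; i++) {
		struct toy_lock *q = &d[i];

		toy_acquire(q, 1);		/* child, taken as nested */
		toy_release(p);			/* drop the old parent */
		if (annotate)
			toy_set_subclass(q, 0);	/* child is now the parent */
		p = q;
	}
	toy_release(p);
	return warnings;
}
```

In this model, toy_walk(0) warns once: d1 is still labeled as nested
when d2 is taken as nested. toy_walk(1) relabels d1 after the old parent
is dropped and produces no warning, which is the effect the patch is
after.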

2011-04-27 13:22:34

by Nick Bowler

Subject: Re: Lockdep splat in autofs with 2.6.39-rc2

On 2011-04-21 17:25 -0400, Steven Rostedt wrote:
> On Thu, Apr 07, 2011 at 03:44:03PM -0400, Nick Bowler wrote:
> > Just saw this on 2.6.39-rc2 after half a day or so of uptime. I've
> > never seen it before today so it may be a regression from 2.6.38.
> > Nothing seems have failed as a result. Please let me know if you
> > need any more info.
>
> Could you try this patch. I know it may be hard to reproduce, but the
> issue is that we are recursing down the locks in a tree/list and we changed a
> lock from being nested to being a parent. This patch tells lockdep about
> what we did.

OK, I've built 2.6.39-rc5 with this patch applied. However, it took ~5
days before I saw any splat with -rc4, thus it's unlikely that I'll be
able to say for sure that it works.

--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)

2011-05-11 19:13:14

by Nick Bowler

Subject: Re: Lockdep splat in autofs with 2.6.39-rc2

On 2011-04-27 09:22 -0400, Nick Bowler wrote:
> On 2011-04-21 17:25 -0400, Steven Rostedt wrote:
> > On Thu, Apr 07, 2011 at 03:44:03PM -0400, Nick Bowler wrote:
> > > Just saw this on 2.6.39-rc2 after half a day or so of uptime. I've
> > > never seen it before today so it may be a regression from 2.6.38.
> > > Nothing seems have failed as a result. Please let me know if you
> > > need any more info.
> >
> > Could you try this patch. I know it may be hard to reproduce, but the
> > issue is that we are recursing down the locks in a tree/list and we changed a
> > lock from being nested to being a parent. This patch tells lockdep about
> > what we did.
>
> OK, I've built 2.6.39-rc5 with this patch applied. However, it took ~5
> days before I saw any splat with -rc4, thus it's unlikely that I'll be
> able to say for sure that it works.

FWIW, haven't had any problems with this kernel (+ patch) during the
last two weeks.

Cheers,
--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)

2011-06-14 00:43:19

by Steven Rostedt

Subject: Re: Lockdep splat in autofs with 2.6.39-rc2

On Wed, 2011-05-11 at 15:13 -0400, Nick Bowler wrote:
> On 2011-04-27 09:22 -0400, Nick Bowler wrote:
> > On 2011-04-21 17:25 -0400, Steven Rostedt wrote:
> > > On Thu, Apr 07, 2011 at 03:44:03PM -0400, Nick Bowler wrote:
> > > > Just saw this on 2.6.39-rc2 after half a day or so of uptime. I've
> > > > never seen it before today so it may be a regression from 2.6.38.
> > > > Nothing seems have failed as a result. Please let me know if you
> > > > need any more info.
> > >
> > > Could you try this patch. I know it may be hard to reproduce, but the
> > > issue is that we are recursing down the locks in a tree/list and we changed a
> > > lock from being nested to being a parent. This patch tells lockdep about
> > > what we did.
> >
> > OK, I've built 2.6.39-rc5 with this patch applied. However, it took ~5
> > days before I saw any splat with -rc4, thus it's unlikely that I'll be
> > able to say for sure that it works.
>
> FWIW, haven't had any problems with this kernel (+ patch) during the
> last two weeks.

I'm going to wrap this up and send it out as a proper patch. Can I add
your "Reported-by" and "Tested-by" tags?

Thanks,

-- Steve

2011-06-20 14:48:31

by Nick Bowler

Subject: Re: Lockdep splat in autofs with 2.6.39-rc2

Hi Steven, and sorry for the delay.

On 2011-06-13 20:43 -0400, Steven Rostedt wrote:
> I'm going to wrap this up and send it out as a proper patch. Can I add
> your "Reported-by" and "Tested-by" tags?

Sure, but please only add Tested-by if the code changes are the same as
the patch I actually tested.

Cheers,
--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)