2014-04-06 05:12:27

by Michael L. Semon

[permalink] [raw]
Subject: 3.14.0+/x86: lockdep and mutexes not getting along

Hi! Starting early in this merge window for 3.15, lockdep has been
giving me trouble. Normally, a splat will happen, lockdep will shut
itself off, and my i686 Pentium 4 PC will continue. Now, after the
splat, it will allow one key of input at either a VGA console or over
serial. After that, only the magic SysRq keys and KDB still work.
File activity stops, and many processes are stuck in the D state.

Bisect brought me here:

root@plbearer:/usr/src/kernel-git/linux# git bisect good
6f008e72cd111a119b5d8de8c5438d892aae99eb is the first bad commit
commit 6f008e72cd111a119b5d8de8c5438d892aae99eb
Author: Peter Zijlstra <[email protected]>
Date: Wed Mar 12 13:24:42 2014 +0100

locking/mutex: Fix debug checks

OK, so commit:

1d8fe7dc8078 ("locking/mutexes: Unlock the mutex without the wait_lock")

generates this boot warning when CONFIG_DEBUG_MUTEXES=y:

WARNING: CPU: 0 PID: 139 at /usr/src/linux-2.6/kernel/locking/mutex-debug.c:82 debug_mutex_unlock+0x155/0x180() DEBUG_LOCKS_WARN_ON(lock->owner != current)

And that makes sense, because as soon as we release the lock a
new owner can come in...

One would think that !__mutex_slowpath_needs_to_unlock()
implementations suffer the same, but for DEBUG we fall back to
mutex-null.h which has an unconditional 1 for that.

The mutex debug code requires the mutex to be unlocked after
doing the debug checks, otherwise it can find inconsistent
state.

Reported-by: Ingo Molnar <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

:040000 040000 80e40c2009942a31f98127c4f9fa958f34b3947b f46ed4b70c4f30fc665fe8f810d3c13920cd765a M kernel

Indeed, my issues are solved (so far) simply by reverting this commit.

Might someone test lockdep on x86 to see if this is a consistent
issue that needs to be adjusted? My lockdep splats are generated by
running xfstests test generic/113 on XFS, but splats caused by other
issues should still create the same symptoms.

Otherwise, this 3.15 kernel has been rather kind to me so far.

PC is an i686 Pentium 4 with 1280 MB RAM and old IDE hardware,
running Slackware 14.1.

Thanks!

Michael


2014-04-09 12:22:49

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: 3.14.0+/x86: lockdep and mutexes not getting along

On Sun, Apr 06, 2014 at 01:12:14AM -0400, Michael L. Semon wrote:
> Hi! Starting early in this merge window for 3.15, lockdep has been
> giving me trouble. Normally, a splat will happen, lockdep will shut
> itself off, and my i686 Pentium 4 PC will continue. Now, after the
> splat, it will allow one key of input at either a VGA console or over
> serial. After that, only the magic SysRq keys and KDB still work.
> File activity stops, and many processes are stuck in the D state.
>
> Bisect brought me here:
>
> root@plbearer:/usr/src/kernel-git/linux# git bisect good
> 6f008e72cd111a119b5d8de8c5438d892aae99eb is the first bad commit
> commit 6f008e72cd111a119b5d8de8c5438d892aae99eb
> Author: Peter Zijlstra <[email protected]>
> Date: Wed Mar 12 13:24:42 2014 +0100
>
> locking/mutex: Fix debug checks
>
> OK, so commit:
>
> 1d8fe7dc8078 ("locking/mutexes: Unlock the mutex without the wait_lock")
>
> generates this boot warning when CONFIG_DEBUG_MUTEXES=y:
>
> WARNING: CPU: 0 PID: 139 at /usr/src/linux-2.6/kernel/locking/mutex-debug.c:82 debug_mutex_unlock+0x155/0x180() DEBUG_LOCKS_WARN_ON(lock->owner != current)
>
> And that makes sense, because as soon as we release the lock a
> new owner can come in...
>
> One would think that !__mutex_slowpath_needs_to_unlock()
> implementations suffer the same, but for DEBUG we fall back to
> mutex-null.h which has an unconditional 1 for that.
>
> The mutex debug code requires the mutex to be unlocked after
> doing the debug checks, otherwise it can find inconsistent
> state.
>
> Reported-by: Ingo Molnar <[email protected]>
> Signed-off-by: Peter Zijlstra <[email protected]>
> Cc: [email protected]
> Link: http://lkml.kernel.org/r/[email protected]
> Signed-off-by: Ingo Molnar <[email protected]>
>
> :040000 040000 80e40c2009942a31f98127c4f9fa958f34b3947b f46ed4b70c4f30fc665fe8f810d3c13920cd765a M kernel
>
> Indeed, my issues are solved (so far) simply by reverting this commit.
>
> Might someone test lockdep on x86 to see if this is a consistent
> issue that needs to be adjusted? My lockdep splats are generated by
> running xfstests test generic/113 on XFS, but splats caused by other
> issues should still create the same symptoms.
>
> Otherwise, this 3.15 kernel has been rather kind to me so far.
>
> PC is an i686 Pentium 4 with 1280 MB RAM and old IDE hardware,
> running Slackware 14.1.

Hi Michael,

+Ingo, +PeterZ and +Jason.
It usually helps to write to relevant people.

I have troble with the commit as well:

[ 26.745741]
[ 26.747484] ======================================================
[ 26.748725] [ INFO: possible circular locking dependency detected ]
[ 26.748725] 3.13.0-11331-g6f008e72cd11 #1162 Not tainted
[ 26.748725] -------------------------------------------------------
[ 26.748725] trinity-c5/848 is trying to acquire lock:
[ 26.748725] (&p->lock){+.+.+.}, at: [<ffffffff811774a8>] seq_read+0x38/0x3c0
[ 26.748725]
[ 26.748725] but task is already holding lock:
[ 26.748725] (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff811579db>] prepare_bprm_creds+0x2b/0x80
[ 26.748725]
[ 26.748725] which lock already depends on the new lock.
[ 26.748725]
[ 26.748725]
[ 26.748725] the existing dependency chain (in reverse order) is:
[ 26.748725]
-> #1 (&sig->cred_guard_mutex){+.+.+.}:
[ 26.748725] [<ffffffff810998d8>] __lock_acquire+0x3a8/0xc20
[ 26.748725] [<ffffffff8109a1c6>] lock_acquire+0x76/0xc0
[ 26.748725] [<ffffffff8173634d>] mutex_lock_killable_nested+0x6d/0x460
[ 26.748725] [<ffffffff81049924>] mm_access+0x24/0xb0
[ 26.748725] [<ffffffff811b6307>] m_start+0x67/0x1e0
[ 26.748725] [<ffffffff811775a0>] seq_read+0x130/0x3c0
[ 26.748725] [<ffffffff8114ffaa>] do_loop_readv_writev+0x5a/0x80
[ 26.748725] [<ffffffff81150c4d>] compat_do_readv_writev+0x20d/0x220
[ 26.748725] [<ffffffff81150c92>] compat_readv+0x32/0x70
[ 26.748725] [<ffffffff81151c07>] compat_SyS_readv+0x47/0xa0
[ 26.748725] [<ffffffff81742179>] ia32_sysret+0x0/0x5
[ 26.748725]
-> #0 (&p->lock){+.+.+.}:
[ 26.780481] [<ffffffff81097a4a>] validate_chain.isra.37+0x105a/0x10d0
[ 26.780481] [<ffffffff810998d8>] __lock_acquire+0x3a8/0xc20
[ 26.780481] [<ffffffff8109a1c6>] lock_acquire+0x76/0xc0
[ 26.780481] [<ffffffff81735bad>] mutex_lock_nested+0x6d/0x3d0
[ 26.780481] [<ffffffff811774a8>] seq_read+0x38/0x3c0
[ 26.780481] [<ffffffff811b7af8>] proc_reg_read+0x38/0x70
[ 26.780481] [<ffffffff81150799>] vfs_read+0x99/0x160
[ 26.780481] [<ffffffff8115635c>] kernel_read+0x3c/0x50
[ 26.780481] [<ffffffff811565e7>] prepare_binprm+0x137/0x1d0
[ 26.780481] [<ffffffff81157f82>] do_execve_common.isra.34+0x4d2/0x730
[ 26.780481] [<ffffffff81158461>] SyS_execve+0x31/0x50
[ 26.780481] [<ffffffff81741099>] stub_execve+0x69/0xa0
[ 26.780481]
[ 26.780481] other info that might help us debug this:
[ 26.780481]
[ 26.780481] Possible unsafe locking scenario:
[ 26.780481]
[ 26.780481] CPU0 CPU1
[ 26.780481] ---- ----
[ 26.780481] lock(&sig->cred_guard_mutex);
[ 26.780481] lock(&p->lock);
[ 26.780481] lock(&sig->cred_guard_mutex);
[ 26.780481] lock(&p->lock);
[ 26.780481]
[ 26.780481] *** DEADLOCK ***
[ 26.780481]
[ 26.780481] 1 lock held by trinity-c5/848:
[ 26.780481] #0: (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff811579db>] prepare_bprm_creds+0x2b/0x80
[ 26.780481]
[ 26.780481] stack backtrace:
[ 26.780481] CPU: 5 PID: 848 Comm: trinity-c5 Not tainted 3.13.0-11331-g6f008e72cd11 #1162
[ 26.780481] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011
[ 26.780481] ffffffff824f1130 ffff8803b6973b58 ffffffff8172fc76 ffffffff824f1130
[ 26.780481] ffff8803b6973b98 ffffffff8172b6de ffff8803b6973bd0 ffff8803b72550e0
[ 26.780481] ffff8803b72550e0 0000000000000000 ffff8803b72549d0 0000000000000001
[ 26.780481] Call Trace:
[ 26.780481] [<ffffffff8172fc76>] dump_stack+0x4d/0x66
[ 26.780481] [<ffffffff8172b6de>] print_circular_bug+0x201/0x20f
[ 26.780481] [<ffffffff81097a4a>] validate_chain.isra.37+0x105a/0x10d0
[ 26.780481] [<ffffffff810998d8>] __lock_acquire+0x3a8/0xc20
[ 26.780481] [<ffffffff8109a1c6>] lock_acquire+0x76/0xc0
[ 26.780481] [<ffffffff811774a8>] ? seq_read+0x38/0x3c0
[ 26.780481] [<ffffffff81735bad>] mutex_lock_nested+0x6d/0x3d0
[ 26.780481] [<ffffffff811774a8>] ? seq_read+0x38/0x3c0
[ 26.780481] [<ffffffff811774a8>] ? seq_read+0x38/0x3c0
[ 26.780481] [<ffffffff81082e18>] ? sched_clock_cpu+0xa8/0xd0
[ 26.780481] [<ffffffff811774a8>] seq_read+0x38/0x3c0
[ 26.780481] [<ffffffff811b7af8>] proc_reg_read+0x38/0x70
[ 26.780481] [<ffffffff81167b9e>] ? dput+0x1e/0x110
[ 26.780481] [<ffffffff81150799>] vfs_read+0x99/0x160
[ 26.780481] [<ffffffff8115635c>] kernel_read+0x3c/0x50
[ 26.780481] [<ffffffff811565e7>] prepare_binprm+0x137/0x1d0
[ 26.780481] [<ffffffff81157f82>] do_execve_common.isra.34+0x4d2/0x730
[ 26.780481] [<ffffffff81157ba9>] ? do_execve_common.isra.34+0xf9/0x730
[ 26.780481] [<ffffffff8115c100>] ? mountpoint_last+0x1a0/0x1b0
[ 26.780481] [<ffffffff81158461>] SyS_execve+0x31/0x50
[ 26.780481] [<ffffffff81741099>] stub_execve+0x69/0xa0

It's easy triggable by trinity. Config is attached.

--
Kirill A. Shutemov


Attachments:
(No filename) (7.50 kB)
config (97.75 kB)
Download all attachments

2014-04-10 05:43:04

by Jason Low

[permalink] [raw]
Subject: Re: 3.14.0+/x86: lockdep and mutexes not getting along

On Wed, 2014-04-09 at 15:19 +0300, Kirill A. Shutemov wrote:
> On Sun, Apr 06, 2014 at 01:12:14AM -0400, Michael L. Semon wrote:
> > Hi! Starting early in this merge window for 3.15, lockdep has been
> > giving me trouble. Normally, a splat will happen, lockdep will shut
> > itself off, and my i686 Pentium 4 PC will continue. Now, after the
> > splat, it will allow one key of input at either a VGA console or over
> > serial. After that, only the magic SysRq keys and KDB still work.
> > File activity stops, and many processes are stuck in the D state.
> >
> > Bisect brought me here:
> >
> > root@plbearer:/usr/src/kernel-git/linux# git bisect good
> > 6f008e72cd111a119b5d8de8c5438d892aae99eb is the first bad commit
> > commit 6f008e72cd111a119b5d8de8c5438d892aae99eb
> > Author: Peter Zijlstra <[email protected]>
> > Date: Wed Mar 12 13:24:42 2014 +0100
> >
> > locking/mutex: Fix debug checks
> >
> > OK, so commit:
> >
> > 1d8fe7dc8078 ("locking/mutexes: Unlock the mutex without the wait_lock")
> >
> > generates this boot warning when CONFIG_DEBUG_MUTEXES=y:
> >
> > WARNING: CPU: 0 PID: 139 at /usr/src/linux-2.6/kernel/locking/mutex-debug.c:82 debug_mutex_unlock+0x155/0x180() DEBUG_LOCKS_WARN_ON(lock->owner != current)
> >
> > And that makes sense, because as soon as we release the lock a
> > new owner can come in...
> >
> > One would think that !__mutex_slowpath_needs_to_unlock()
> > implementations suffer the same, but for DEBUG we fall back to
> > mutex-null.h which has an unconditional 1 for that.
> >
> > The mutex debug code requires the mutex to be unlocked after
> > doing the debug checks, otherwise it can find inconsistent
> > state.
> >
> > Reported-by: Ingo Molnar <[email protected]>
> > Signed-off-by: Peter Zijlstra <[email protected]>
> > Cc: [email protected]

Hello,

As a starting point, would either of you like to test the following
patch to see if it fixes the issue? This patch essentially generates the
same code as in older kernels in the debug case. This applies on top of
kernels with both commits 6f008e72cd11 and 1d8fe7dc8078.

Thanks.

-----
diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
index e1191c9..faf6f5b 100644
--- a/kernel/locking/mutex-debug.c
+++ b/kernel/locking/mutex-debug.c
@@ -83,12 +83,6 @@ void debug_mutex_unlock(struct mutex *lock)

DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
mutex_clear_owner(lock);
-
- /*
- * __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug
- * mutexes so that we can do it here after we've verified state.
- */
- atomic_set(&lock->count, 1);
}

void debug_mutex_init(struct mutex *lock, const char *name,
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index bc73d33..f1f672e 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -34,13 +34,6 @@
#ifdef CONFIG_DEBUG_MUTEXES
# include "mutex-debug.h"
# include <asm-generic/mutex-null.h>
-/*
- * Must be 0 for the debug case so we do not do the unlock outside of the
- * wait_lock region. debug_mutex_unlock() will do the actual unlock in this
- * case.
- */
-# undef __mutex_slowpath_needs_to_unlock
-# define __mutex_slowpath_needs_to_unlock() 0
#else
# include "mutex.h"
# include <asm/mutex.h>
@@ -688,6 +681,17 @@ __mutex_unlock_common_slowpath(atomic_t *lock_count, int nested)
unsigned long flags;

/*
+ * In the debug cases, obtain the wait_lock first
+ * before calling the following debugging functions.
+ */
+#if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_DEBUG_LOCK_ALLOC)
+ spin_lock_mutex(&lock->wait_lock, flags);
+#endif
+
+ mutex_release(&lock->dep_map, nested, _RET_IP_);
+ debug_mutex_unlock(lock);
+
+ /*
* some architectures leave the lock unlocked in the fastpath failure
* case, others need to leave it locked. In the later case we have to
* unlock it here
@@ -695,9 +699,9 @@ __mutex_unlock_common_slowpath(atomic_t *lock_count, int nested)
if (__mutex_slowpath_needs_to_unlock())
atomic_set(&lock->count, 1);

+#if !defined(CONFIG_DEBUG_MUTEXES) && !defined(CONFIG_DEBUG_LOCK_ALLOC)
spin_lock_mutex(&lock->wait_lock, flags);
- mutex_release(&lock->dep_map, nested, _RET_IP_);
- debug_mutex_unlock(lock);
+#endif

if (!list_empty(&lock->wait_list)) {
/* get the first entry from the wait-list: */

2014-04-10 08:12:27

by Peter Zijlstra

[permalink] [raw]
Subject: Re: 3.14.0+/x86: lockdep and mutexes not getting along

On Wed, Apr 09, 2014 at 03:19:40PM +0300, Kirill A. Shutemov wrote:
> I have troble with the commit as well:

How so? As far as I can tell the below is a genuine bug and not related
to the mutex debug thing.

> [ 26.745741]
> [ 26.747484] ======================================================
> [ 26.748725] [ INFO: possible circular locking dependency detected ]
> [ 26.748725] 3.13.0-11331-g6f008e72cd11 #1162 Not tainted
> [ 26.748725] -------------------------------------------------------
> [ 26.748725] trinity-c5/848 is trying to acquire lock:
> [ 26.748725] (&p->lock){+.+.+.}, at: [<ffffffff811774a8>] seq_read+0x38/0x3c0
> [ 26.748725]
> [ 26.748725] but task is already holding lock:
> [ 26.748725] (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff811579db>] prepare_bprm_creds+0x2b/0x80
> [ 26.748725]
> [ 26.748725] which lock already depends on the new lock.
> [ 26.748725]
> [ 26.748725]
> [ 26.748725] the existing dependency chain (in reverse order) is:
> [ 26.748725] -> #1 (&sig->cred_guard_mutex){+.+.+.}:
> [ 26.748725] [<ffffffff810998d8>] __lock_acquire+0x3a8/0xc20
> [ 26.748725] [<ffffffff8109a1c6>] lock_acquire+0x76/0xc0
> [ 26.748725] [<ffffffff8173634d>] mutex_lock_killable_nested+0x6d/0x460
> [ 26.748725] [<ffffffff81049924>] mm_access+0x24/0xb0
> [ 26.748725] [<ffffffff811b6307>] m_start+0x67/0x1e0
> [ 26.748725] [<ffffffff811775a0>] seq_read+0x130/0x3c0
> [ 26.748725] [<ffffffff8114ffaa>] do_loop_readv_writev+0x5a/0x80
> [ 26.748725] [<ffffffff81150c4d>] compat_do_readv_writev+0x20d/0x220
> [ 26.748725] [<ffffffff81150c92>] compat_readv+0x32/0x70
> [ 26.748725] [<ffffffff81151c07>] compat_SyS_readv+0x47/0xa0
> [ 26.748725] [<ffffffff81742179>] ia32_sysret+0x0/0x5
> [ 26.748725] -> #0 (&p->lock){+.+.+.}:
> [ 26.780481] [<ffffffff81097a4a>] validate_chain.isra.37+0x105a/0x10d0
> [ 26.780481] [<ffffffff810998d8>] __lock_acquire+0x3a8/0xc20
> [ 26.780481] [<ffffffff8109a1c6>] lock_acquire+0x76/0xc0
> [ 26.780481] [<ffffffff81735bad>] mutex_lock_nested+0x6d/0x3d0
> [ 26.780481] [<ffffffff811774a8>] seq_read+0x38/0x3c0
> [ 26.780481] [<ffffffff811b7af8>] proc_reg_read+0x38/0x70
> [ 26.780481] [<ffffffff81150799>] vfs_read+0x99/0x160
> [ 26.780481] [<ffffffff8115635c>] kernel_read+0x3c/0x50
> [ 26.780481] [<ffffffff811565e7>] prepare_binprm+0x137/0x1d0
> [ 26.780481] [<ffffffff81157f82>] do_execve_common.isra.34+0x4d2/0x730
> [ 26.780481] [<ffffffff81158461>] SyS_execve+0x31/0x50
> [ 26.780481] [<ffffffff81741099>] stub_execve+0x69/0xa0
> [ 26.780481]
> [ 26.780481] other info that might help us debug this:
> [ 26.780481]
> [ 26.780481] Possible unsafe locking scenario:
> [ 26.780481]
> [ 26.780481] CPU0 CPU1
> [ 26.780481] ---- ----
> [ 26.780481] lock(&sig->cred_guard_mutex);
> [ 26.780481] lock(&p->lock);
> [ 26.780481] lock(&sig->cred_guard_mutex);
> [ 26.780481] lock(&p->lock);
> [ 26.780481]
> [ 26.780481] *** DEADLOCK ***
> [ 26.780481]
> [ 26.780481] 1 lock held by trinity-c5/848:
> [ 26.780481] #0: (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff811579db>] prepare_bprm_creds+0x2b/0x80
> [ 26.780481]
> [ 26.780481] stack backtrace:
> [ 26.780481] CPU: 5 PID: 848 Comm: trinity-c5 Not tainted 3.13.0-11331-g6f008e72cd11 #1162
> [ 26.780481] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011
> [ 26.780481] ffffffff824f1130 ffff8803b6973b58 ffffffff8172fc76 ffffffff824f1130
> [ 26.780481] ffff8803b6973b98 ffffffff8172b6de ffff8803b6973bd0 ffff8803b72550e0
> [ 26.780481] ffff8803b72550e0 0000000000000000 ffff8803b72549d0 0000000000000001
> [ 26.780481] Call Trace:
> [ 26.780481] [<ffffffff8172fc76>] dump_stack+0x4d/0x66
> [ 26.780481] [<ffffffff8172b6de>] print_circular_bug+0x201/0x20f
> [ 26.780481] [<ffffffff81097a4a>] validate_chain.isra.37+0x105a/0x10d0
> [ 26.780481] [<ffffffff810998d8>] __lock_acquire+0x3a8/0xc20
> [ 26.780481] [<ffffffff8109a1c6>] lock_acquire+0x76/0xc0
> [ 26.780481] [<ffffffff811774a8>] ? seq_read+0x38/0x3c0
> [ 26.780481] [<ffffffff81735bad>] mutex_lock_nested+0x6d/0x3d0
> [ 26.780481] [<ffffffff811774a8>] ? seq_read+0x38/0x3c0
> [ 26.780481] [<ffffffff811774a8>] ? seq_read+0x38/0x3c0
> [ 26.780481] [<ffffffff81082e18>] ? sched_clock_cpu+0xa8/0xd0
> [ 26.780481] [<ffffffff811774a8>] seq_read+0x38/0x3c0
> [ 26.780481] [<ffffffff811b7af8>] proc_reg_read+0x38/0x70
> [ 26.780481] [<ffffffff81167b9e>] ? dput+0x1e/0x110
> [ 26.780481] [<ffffffff81150799>] vfs_read+0x99/0x160
> [ 26.780481] [<ffffffff8115635c>] kernel_read+0x3c/0x50
> [ 26.780481] [<ffffffff811565e7>] prepare_binprm+0x137/0x1d0
> [ 26.780481] [<ffffffff81157f82>] do_execve_common.isra.34+0x4d2/0x730
> [ 26.780481] [<ffffffff81157ba9>] ? do_execve_common.isra.34+0xf9/0x730
> [ 26.780481] [<ffffffff8115c100>] ? mountpoint_last+0x1a0/0x1b0
> [ 26.780481] [<ffffffff81158461>] SyS_execve+0x31/0x50
> [ 26.780481] [<ffffffff81741099>] stub_execve+0x69/0xa0

2014-04-10 08:13:59

by Peter Zijlstra

[permalink] [raw]
Subject: Re: 3.14.0+/x86: lockdep and mutexes not getting along

On Wed, Apr 09, 2014 at 03:19:40PM +0300, Kirill A. Shutemov wrote:
> On Sun, Apr 06, 2014 at 01:12:14AM -0400, Michael L. Semon wrote:
> > Hi! Starting early in this merge window for 3.15, lockdep has been
> > giving me trouble. Normally, a splat will happen, lockdep will shut
> > itself off, and my i686 Pentium 4 PC will continue. Now, after the
> > splat, it will allow one key of input at either a VGA console or over
> > serial. After that, only the magic SysRq keys and KDB still work.
> > File activity stops, and many processes are stuck in the D state.
> >
> > Bisect brought me here:
> >
> > root@plbearer:/usr/src/kernel-git/linux# git bisect good
> > 6f008e72cd111a119b5d8de8c5438d892aae99eb is the first bad commit
> > commit 6f008e72cd111a119b5d8de8c5438d892aae99eb
> > Author: Peter Zijlstra <[email protected]>
> > Date: Wed Mar 12 13:24:42 2014 +0100
> >
> > locking/mutex: Fix debug checks
> >
> > OK, so commit:
> >
> > 1d8fe7dc8078 ("locking/mutexes: Unlock the mutex without the wait_lock")
> >
> > generates this boot warning when CONFIG_DEBUG_MUTEXES=y:
> >
> > WARNING: CPU: 0 PID: 139 at /usr/src/linux-2.6/kernel/locking/mutex-debug.c:82 debug_mutex_unlock+0x155/0x180() DEBUG_LOCKS_WARN_ON(lock->owner != current)
> >
> > And that makes sense, because as soon as we release the lock a
> > new owner can come in...
> >
> > One would think that !__mutex_slowpath_needs_to_unlock()
> > implementations suffer the same, but for DEBUG we fall back to
> > mutex-null.h which has an unconditional 1 for that.
> >
> > The mutex debug code requires the mutex to be unlocked after
> > doing the debug checks, otherwise it can find inconsistent
> > state.
> >
> > Reported-by: Ingo Molnar <[email protected]>
> > Signed-off-by: Peter Zijlstra <[email protected]>
> > Cc: [email protected]
> > Link: http://lkml.kernel.org/r/[email protected]
> > Signed-off-by: Ingo Molnar <[email protected]>
> >
> > :040000 040000 80e40c2009942a31f98127c4f9fa958f34b3947b f46ed4b70c4f30fc665fe8f810d3c13920cd765a M kernel
> >
> > Indeed, my issues are solved (so far) simply by reverting this commit.
> >
> > Might someone test lockdep on x86 to see if this is a consistent
> > issue that needs to be adjusted? My lockdep splats are generated by
> > running xfstests test generic/113 on XFS, but splats caused by other
> > issues should still create the same symptoms.
> >
> > Otherwise, this 3.15 kernel has been rather kind to me so far.
> >
> > PC is an i686 Pentium 4 with 1280 MB RAM and old IDE hardware,
> > running Slackware 14.1.

So what does lockdep say when it messes up your p4?

2014-04-10 08:14:54

by Peter Zijlstra

[permalink] [raw]
Subject: Re: 3.14.0+/x86: lockdep and mutexes not getting along

On Wed, Apr 09, 2014 at 10:42:59PM -0700, Jason Low wrote:
> +#if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_DEBUG_LOCK_ALLOC)

DEBUG_LOCK_ALLOC always implies DEBUG_MUTEXES, see lib/Kconfig.debug

2014-04-10 09:16:33

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: 3.14.0+/x86: lockdep and mutexes not getting along

On Wed, Apr 09, 2014 at 10:42:59PM -0700, Jason Low wrote:
> On Wed, 2014-04-09 at 15:19 +0300, Kirill A. Shutemov wrote:
> > On Sun, Apr 06, 2014 at 01:12:14AM -0400, Michael L. Semon wrote:
> > > Hi! Starting early in this merge window for 3.15, lockdep has been
> > > giving me trouble. Normally, a splat will happen, lockdep will shut
> > > itself off, and my i686 Pentium 4 PC will continue. Now, after the
> > > splat, it will allow one key of input at either a VGA console or over
> > > serial. After that, only the magic SysRq keys and KDB still work.
> > > File activity stops, and many processes are stuck in the D state.
> > >
> > > Bisect brought me here:
> > >
> > > root@plbearer:/usr/src/kernel-git/linux# git bisect good
> > > 6f008e72cd111a119b5d8de8c5438d892aae99eb is the first bad commit
> > > commit 6f008e72cd111a119b5d8de8c5438d892aae99eb
> > > Author: Peter Zijlstra <[email protected]>
> > > Date: Wed Mar 12 13:24:42 2014 +0100
> > >
> > > locking/mutex: Fix debug checks
> > >
> > > OK, so commit:
> > >
> > > 1d8fe7dc8078 ("locking/mutexes: Unlock the mutex without the wait_lock")
> > >
> > > generates this boot warning when CONFIG_DEBUG_MUTEXES=y:
> > >
> > > WARNING: CPU: 0 PID: 139 at /usr/src/linux-2.6/kernel/locking/mutex-debug.c:82 debug_mutex_unlock+0x155/0x180() DEBUG_LOCKS_WARN_ON(lock->owner != current)
> > >
> > > And that makes sense, because as soon as we release the lock a
> > > new owner can come in...
> > >
> > > One would think that !__mutex_slowpath_needs_to_unlock()
> > > implementations suffer the same, but for DEBUG we fall back to
> > > mutex-null.h which has an unconditional 1 for that.
> > >
> > > The mutex debug code requires the mutex to be unlocked after
> > > doing the debug checks, otherwise it can find inconsistent
> > > state.
> > >
> > > Reported-by: Ingo Molnar <[email protected]>
> > > Signed-off-by: Peter Zijlstra <[email protected]>
> > > Cc: [email protected]
>
> Hello,
>
> As a starting point, would either of you like to test the following
> patch to see if it fixes the issue? This patch essentially generates the
> same code as in older kernels in the debug case. This applies on top of
> kernels with both commits 6f008e72cd11 and 1d8fe7dc8078.

I'm not able to trigger the lockdep report with the patch applied so far.

--
Kirill A. Shutemov

2014-04-10 09:18:57

by Peter Zijlstra

[permalink] [raw]
Subject: Re: 3.14.0+/x86: lockdep and mutexes not getting along

On Wed, Apr 09, 2014 at 10:42:59PM -0700, Jason Low wrote:
> As a starting point, would either of you like to test the following
> patch to see if it fixes the issue? This patch essentially generates the
> same code as in older kernels in the debug case. This applies on top of
> kernels with both commits 6f008e72cd11 and 1d8fe7dc8078.


So I managed to reproduce, and the below makes it go away. I just don't
understand why though. will stare more.

---
--- a/kernel/locking/mutex-debug.c
+++ b/kernel/locking/mutex-debug.c
@@ -83,12 +83,6 @@ void debug_mutex_unlock(struct mutex *lo

DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
mutex_clear_owner(lock);
-
- /*
- * __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug
- * mutexes so that we can do it here after we've verified state.
- */
- atomic_set(&lock->count, 1);
}

void debug_mutex_init(struct mutex *lock, const char *name,
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -34,13 +34,6 @@
#ifdef CONFIG_DEBUG_MUTEXES
# include "mutex-debug.h"
# include <asm-generic/mutex-null.h>
-/*
- * Must be 0 for the debug case so we do not do the unlock outside of the
- * wait_lock region. debug_mutex_unlock() will do the actual unlock in this
- * case.
- */
-# undef __mutex_slowpath_needs_to_unlock
-# define __mutex_slowpath_needs_to_unlock() 0
#else
# include "mutex.h"
# include <asm/mutex.h>
@@ -688,6 +681,17 @@ __mutex_unlock_common_slowpath(atomic_t
unsigned long flags;

/*
+ * In the debug cases, obtain the wait_lock first
+ * before calling the following debugging functions.
+ */
+#ifdef CONFIG_DEBUG_MUTEXES
+ spin_lock_mutex(&lock->wait_lock, flags);
+ mutex_release(&lock->dep_map, nested, _RET_IP_);
+ debug_mutex_unlock(lock);
+#endif
+
+
+ /*
* some architectures leave the lock unlocked in the fastpath failure
* case, others need to leave it locked. In the later case we have to
* unlock it here
@@ -695,9 +699,11 @@ __mutex_unlock_common_slowpath(atomic_t
if (__mutex_slowpath_needs_to_unlock())
atomic_set(&lock->count, 1);

+#ifndef CONFIG_DEBUG_MUTEXES
spin_lock_mutex(&lock->wait_lock, flags);
mutex_release(&lock->dep_map, nested, _RET_IP_);
debug_mutex_unlock(lock);
+#endif

if (!list_empty(&lock->wait_list)) {
/* get the first entry from the wait-list: */

2014-04-10 17:15:19

by Jason Low

[permalink] [raw]
Subject: Re: 3.14.0+/x86: lockdep and mutexes not getting along

On Thu, 2014-04-10 at 11:18 +0200, Peter Zijlstra wrote:
> On Wed, Apr 09, 2014 at 10:42:59PM -0700, Jason Low wrote:
> > As a starting point, would either of you like to test the following
> > patch to see if it fixes the issue? This patch essentially generates the
> > same code as in older kernels in the debug case. This applies on top of
> > kernels with both commits 6f008e72cd11 and 1d8fe7dc8078.
>
>
> So I managed to reproduce, and the below makes it go away. I just don't
> understand why though. will stare more.

So one thing I noticed that is different in the current code is that in
debug_mutex_unlock(), there is is a possibility that it does not unlock
the mutex (when !debug_locks). May be interesting to try out this
patch too:

-----
diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
index e1191c9..97f8df0 100644
--- a/kernel/locking/mutex-debug.c
+++ b/kernel/locking/mutex-debug.c
@@ -72,7 +72,7 @@ void mutex_remove_waiter(struct mutex *lock, struct mutex_waiter *waiter,
void debug_mutex_unlock(struct mutex *lock)
{
if (unlikely(!debug_locks))
- return;
+ goto out;

DEBUG_LOCKS_WARN_ON(lock->magic != lock);

@@ -84,6 +84,7 @@ void debug_mutex_unlock(struct mutex *lock)
DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
mutex_clear_owner(lock);

+out:
/*
* __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug
* mutexes so that we can do it here after we've verified state.

2014-04-10 18:43:41

by Peter Zijlstra

[permalink] [raw]
Subject: Re: 3.14.0+/x86: lockdep and mutexes not getting along

On Thu, Apr 10, 2014 at 10:14:44AM -0700, Jason Low wrote:
> So one thing I noticed that is different in the current code is that in
> debug_mutex_unlock(), there is is a possibility that it does not unlock
> the mutex (when !debug_locks). May be interesting to try out this
> patch too:

Yeah; look at a few mails down :-) I spotted it too when I woke up.

2014-04-10 18:51:48

by Peter Zijlstra

[permalink] [raw]
Subject: Re: 3.14.0+/x86: lockdep and mutexes not getting along

On Thu, Apr 10, 2014 at 11:18:24AM +0200, Peter Zijlstra wrote:
> So I managed to reproduce, and the below makes it go away. I just don't
> understand why though. will stare more.

/me kicks himself.. bloody obvious fail there :-)

Not unlocking the lock after a lockdep trigger will make things get
stuck real fast :-)

---
kernel/locking/mutex-debug.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
index e1191c996c59..5cf6731b98e9 100644
--- a/kernel/locking/mutex-debug.c
+++ b/kernel/locking/mutex-debug.c
@@ -71,18 +71,17 @@ void mutex_remove_waiter(struct mutex *lock, struct mutex_waiter *waiter,

void debug_mutex_unlock(struct mutex *lock)
{
- if (unlikely(!debug_locks))
- return;
+ if (likely(debug_locks)) {
+ DEBUG_LOCKS_WARN_ON(lock->magic != lock);

- DEBUG_LOCKS_WARN_ON(lock->magic != lock);
+ if (!lock->owner)
+ DEBUG_LOCKS_WARN_ON(!lock->owner);
+ else
+ DEBUG_LOCKS_WARN_ON(lock->owner != current);

- if (!lock->owner)
- DEBUG_LOCKS_WARN_ON(!lock->owner);
- else
- DEBUG_LOCKS_WARN_ON(lock->owner != current);
-
- DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
- mutex_clear_owner(lock);
+ DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
+ mutex_clear_owner(lock);
+ }

/*
* __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug

2014-04-10 18:55:54

by Peter Zijlstra

[permalink] [raw]
Subject: Re: 3.14.0+/x86: lockdep and mutexes not getting along

On Thu, Apr 10, 2014 at 12:15:44PM +0300, Kirill A. Shutemov wrote:
> I'm not able to trigger the lockdep report with the patch applied so far.

So what I've found it that the lockdep reports are valid; the only
difference is a lockup after the report or not.

So linus.git will hang after a lockdep splat, whereas the patched kernel
will continue after giving the exact same lockdep splat.

2014-04-10 18:58:51

by Peter Zijlstra

[permalink] [raw]
Subject: cred_guard_mutex vs seq_file::lock [was: Re: 3.14.0+/x86: lockdep and mutexes not getting along]

On Wed, Apr 09, 2014 at 03:19:40PM +0300, Kirill A. Shutemov wrote:
> [ 26.747484] ======================================================
> [ 26.748725] [ INFO: possible circular locking dependency detected ]
> [ 26.748725] 3.13.0-11331-g6f008e72cd11 #1162 Not tainted
> [ 26.748725] -------------------------------------------------------
> [ 26.748725] trinity-c5/848 is trying to acquire lock:
> [ 26.748725] (&p->lock){+.+.+.}, at: [<ffffffff811774a8>] seq_read+0x38/0x3c0
> [ 26.748725]
> [ 26.748725] but task is already holding lock:
> [ 26.748725] (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff811579db>] prepare_bprm_creds+0x2b/0x80
> [ 26.748725]
> [ 26.748725] which lock already depends on the new lock.
> [ 26.748725]
> [ 26.748725]
> [ 26.748725] the existing dependency chain (in reverse order) is:
> [ 26.748725] -> #1 (&sig->cred_guard_mutex){+.+.+.}:
> [ 26.748725] [<ffffffff810998d8>] __lock_acquire+0x3a8/0xc20
> [ 26.748725] [<ffffffff8109a1c6>] lock_acquire+0x76/0xc0
> [ 26.748725] [<ffffffff8173634d>] mutex_lock_killable_nested+0x6d/0x460
> [ 26.748725] [<ffffffff81049924>] mm_access+0x24/0xb0
> [ 26.748725] [<ffffffff811b6307>] m_start+0x67/0x1e0
> [ 26.748725] [<ffffffff811775a0>] seq_read+0x130/0x3c0
> [ 26.748725] [<ffffffff8114ffaa>] do_loop_readv_writev+0x5a/0x80
> [ 26.748725] [<ffffffff81150c4d>] compat_do_readv_writev+0x20d/0x220
> [ 26.748725] [<ffffffff81150c92>] compat_readv+0x32/0x70
> [ 26.748725] [<ffffffff81151c07>] compat_SyS_readv+0x47/0xa0
> [ 26.748725] [<ffffffff81742179>] ia32_sysret+0x0/0x5
> [ 26.748725] -> #0 (&p->lock){+.+.+.}:
> [ 26.780481] [<ffffffff81097a4a>] validate_chain.isra.37+0x105a/0x10d0
> [ 26.780481] [<ffffffff810998d8>] __lock_acquire+0x3a8/0xc20
> [ 26.780481] [<ffffffff8109a1c6>] lock_acquire+0x76/0xc0
> [ 26.780481] [<ffffffff81735bad>] mutex_lock_nested+0x6d/0x3d0
> [ 26.780481] [<ffffffff811774a8>] seq_read+0x38/0x3c0
> [ 26.780481] [<ffffffff811b7af8>] proc_reg_read+0x38/0x70
> [ 26.780481] [<ffffffff81150799>] vfs_read+0x99/0x160
> [ 26.780481] [<ffffffff8115635c>] kernel_read+0x3c/0x50
> [ 26.780481] [<ffffffff811565e7>] prepare_binprm+0x137/0x1d0
> [ 26.780481] [<ffffffff81157f82>] do_execve_common.isra.34+0x4d2/0x730
> [ 26.780481] [<ffffffff81158461>] SyS_execve+0x31/0x50
> [ 26.780481] [<ffffffff81741099>] stub_execve+0x69/0xa0
> [ 26.780481]
> [ 26.780481] other info that might help us debug this:
> [ 26.780481]
> [ 26.780481] Possible unsafe locking scenario:
> [ 26.780481]
> [ 26.780481] CPU0 CPU1
> [ 26.780481] ---- ----
> [ 26.780481] lock(&sig->cred_guard_mutex);
> [ 26.780481] lock(&p->lock);
> [ 26.780481] lock(&sig->cred_guard_mutex);
> [ 26.780481] lock(&p->lock);
> [ 26.780481]
> [ 26.780481] *** DEADLOCK ***
> [ 26.780481]
> [ 26.780481] 1 lock held by trinity-c5/848:
> [ 26.780481] #0: (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff811579db>] prepare_bprm_creds+0x2b/0x80
> [ 26.780481]
> [ 26.780481] stack backtrace:
> [ 26.780481] CPU: 5 PID: 848 Comm: trinity-c5 Not tainted 3.13.0-11331-g6f008e72cd11 #1162
> [ 26.780481] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011
> [ 26.780481] ffffffff824f1130 ffff8803b6973b58 ffffffff8172fc76 ffffffff824f1130
> [ 26.780481] ffff8803b6973b98 ffffffff8172b6de ffff8803b6973bd0 ffff8803b72550e0
> [ 26.780481] ffff8803b72550e0 0000000000000000 ffff8803b72549d0 0000000000000001
> [ 26.780481] Call Trace:
> [ 26.780481] [<ffffffff8172fc76>] dump_stack+0x4d/0x66
> [ 26.780481] [<ffffffff8172b6de>] print_circular_bug+0x201/0x20f
> [ 26.780481] [<ffffffff81097a4a>] validate_chain.isra.37+0x105a/0x10d0
> [ 26.780481] [<ffffffff810998d8>] __lock_acquire+0x3a8/0xc20
> [ 26.780481] [<ffffffff8109a1c6>] lock_acquire+0x76/0xc0
> [ 26.780481] [<ffffffff811774a8>] ? seq_read+0x38/0x3c0
> [ 26.780481] [<ffffffff81735bad>] mutex_lock_nested+0x6d/0x3d0
> [ 26.780481] [<ffffffff811774a8>] ? seq_read+0x38/0x3c0
> [ 26.780481] [<ffffffff811774a8>] ? seq_read+0x38/0x3c0
> [ 26.780481] [<ffffffff81082e18>] ? sched_clock_cpu+0xa8/0xd0
> [ 26.780481] [<ffffffff811774a8>] seq_read+0x38/0x3c0
> [ 26.780481] [<ffffffff811b7af8>] proc_reg_read+0x38/0x70
> [ 26.780481] [<ffffffff81167b9e>] ? dput+0x1e/0x110
> [ 26.780481] [<ffffffff81150799>] vfs_read+0x99/0x160
> [ 26.780481] [<ffffffff8115635c>] kernel_read+0x3c/0x50
> [ 26.780481] [<ffffffff811565e7>] prepare_binprm+0x137/0x1d0
> [ 26.780481] [<ffffffff81157f82>] do_execve_common.isra.34+0x4d2/0x730
> [ 26.780481] [<ffffffff81157ba9>] ? do_execve_common.isra.34+0xf9/0x730
> [ 26.780481] [<ffffffff8115c100>] ? mountpoint_last+0x1a0/0x1b0
> [ 26.780481] [<ffffffff81158461>] SyS_execve+0x31/0x50
> [ 26.780481] [<ffffffff81741099>] stub_execve+0x69/0xa0

So as far as I can tell the bug that led you here only wrecks the lock
state after you hit a lockdep error, so all actual lockdep reports are
still entirely valid.

This means the above is 'interesting'. I talked with David Howells
earlier today, it looks like trinity manages to exec() a /proc file and
create the lock inversion that way.

Now all /proc files that take cred_guard_mutex inside seq_read() aren't
executable, and David also tried to use them as loaders and that didn't
work either.

However, any executable file that uses seq_read() is sufficient, the
file doesn't need to actually take cred_guard_mutex too. We just need
two tasks: one doing a seq_read() exec and the other reading a
seq_read()->cred_guard_mutex file.

Al, David, any bright ideas on how to best fix this?

2014-04-10 19:04:49

by Jason Low

[permalink] [raw]
Subject: Re: 3.14.0+/x86: lockdep and mutexes not getting along

On Thu, 2014-04-10 at 19:28 +0200, Peter Zijlstra wrote:
> On Thu, Apr 10, 2014 at 10:14:44AM -0700, Jason Low wrote:
> > So one thing I noticed that is different in the current code is that in
> > debug_mutex_unlock(), there is is a possibility that it does not unlock
> > the mutex (when !debug_locks). May be interesting to try out this
> > patch too:
>
> Yeah; look at a few mails down :-) I spotted it too when I woke up.

Ah, I've received those mails in my inbox just now. :)

2014-04-10 23:27:46

by Dave Jones

[permalink] [raw]
Subject: Re: 3.14.0+/x86: lockdep and mutexes not getting along

On Thu, Apr 10, 2014 at 10:14:44AM -0700, Jason Low wrote:
> On Thu, 2014-04-10 at 11:18 +0200, Peter Zijlstra wrote:
> > On Wed, Apr 09, 2014 at 10:42:59PM -0700, Jason Low wrote:
> > > As a starting point, would either of you like to test the following
> > > patch to see if it fixes the issue? This patch essentially generates the
> > > same code as in older kernels in the debug case. This applies on top of
> > > kernels with both commits 6f008e72cd11 and 1d8fe7dc8078.
> >
> >
> > So I managed to reproduce, and the below makes it go away. I just don't
> > understand why though. will stare more.
>
> So one thing I noticed that is different in the current code is that in
> debug_mutex_unlock(), there is is a possibility that it does not unlock
> the mutex (when !debug_locks). May be interesting to try out this
> patch too:

I've been seeing lockups this last week or two too, that manifested by
RCU spewing stall messages, and the box being totally unresponsive.
This patch seems to cure that for me, which I'm sure will make Paul
happy to hear.

Dave

2014-04-10 23:32:27

by Dave Jones

[permalink] [raw]
Subject: Re: 3.14.0+/x86: lockdep and mutexes not getting along

On Thu, Apr 10, 2014 at 07:26:52PM -0400, Dave Jones wrote:
> On Thu, Apr 10, 2014 at 10:14:44AM -0700, Jason Low wrote:
> > On Thu, 2014-04-10 at 11:18 +0200, Peter Zijlstra wrote:
> > > On Wed, Apr 09, 2014 at 10:42:59PM -0700, Jason Low wrote:
> > > > As a starting point, would either of you like to test the following
> > > > patch to see if it fixes the issue? This patch essentially generates the
> > > > same code as in older kernels in the debug case. This applies on top of
> > > > kernels with both commits 6f008e72cd11 and 1d8fe7dc8078.
> > >
> > >
> > > So I managed to reproduce, and the below makes it go away. I just don't
> > > understand why though. will stare more.
> >
> > So one thing I noticed that is different in the current code is that in
> > debug_mutex_unlock(), there is is a possibility that it does not unlock
> > the mutex (when !debug_locks). May be interesting to try out this
> > patch too:
>
> I've been seeing lockups this last week or two too, that manifested by
> RCU spewing stall messages, and the box being totally unresponsive.
> This patch seems to cure that for me, which I'm sure will make Paul
> happy to hear.

duh, replied to the wrong mail. I tested peterz's last patch, not Jason's.

Dave

2014-04-11 03:48:23

by Paul E. McKenney

[permalink] [raw]
Subject: Re: 3.14.0+/x86: lockdep and mutexes not getting along

On Thu, Apr 10, 2014 at 07:26:52PM -0400, Dave Jones wrote:
> On Thu, Apr 10, 2014 at 10:14:44AM -0700, Jason Low wrote:
> > On Thu, 2014-04-10 at 11:18 +0200, Peter Zijlstra wrote:
> > > On Wed, Apr 09, 2014 at 10:42:59PM -0700, Jason Low wrote:
> > > > As a starting point, would either of you like to test the following
> > > > patch to see if it fixes the issue? This patch essentially generates the
> > > > same code as in older kernels in the debug case. This applies on top of
> > > > kernels with both commits 6f008e72cd11 and 1d8fe7dc8078.
> > >
> > >
> > > So I managed to reproduce, and the below makes it go away. I just don't
> > > understand why though. will stare more.
> >
> > So one thing I noticed that is different in the current code is that in
> > debug_mutex_unlock(), there is is a possibility that it does not unlock
> > the mutex (when !debug_locks). May be interesting to try out this
> > patch too:
>
> I've been seeing lockups this last week or two too, that manifested by
> RCU spewing stall messages, and the box being totally unresponsive.
> This patch seems to cure that for me, which I'm sure will make Paul
> happy to hear.

Whew!!! ;-)

Thanx, Paul

2014-04-11 13:41:35

by Michael L. Semon

[permalink] [raw]
Subject: Re: 3.14.0+/x86: lockdep and mutexes not getting along

On Wed, 9 Apr 2014, Jason Low wrote:

> On Wed, 2014-04-09 at 15:19 +0300, Kirill A. Shutemov wrote:
> > On Sun, Apr 06, 2014 at 01:12:14AM -0400, Michael L. Semon wrote:
> > > Hi! Starting early in this merge window for 3.15, lockdep has been
> > > giving me trouble. Normally, a splat will happen, lockdep will shut
> > > itself off, and my i686 Pentium 4 PC will continue. Now, after the
> > > splat, it will allow one key of input at either a VGA console or over
> > > serial. After that, only the magic SysRq keys and KDB still work.
> > > File activity stops, and many processes are stuck in the D state.
> > >
> > > Bisect brought me here:
> > >
> > > root@plbearer:/usr/src/kernel-git/linux# git bisect good
> > > 6f008e72cd111a119b5d8de8c5438d892aae99eb is the first bad commit
> > > commit 6f008e72cd111a119b5d8de8c5438d892aae99eb
> > > Author: Peter Zijlstra <[email protected]>
> > > Date: Wed Mar 12 13:24:42 2014 +0100
> > >
> > > locking/mutex: Fix debug checks
> > >
> > > OK, so commit:
> > >
> > > 1d8fe7dc8078 ("locking/mutexes: Unlock the mutex without the wait_lock")
> > >
> > > generates this boot warning when CONFIG_DEBUG_MUTEXES=y:
> > >
> > > WARNING: CPU: 0 PID: 139 at /usr/src/linux-2.6/kernel/locking/mutex-debug.c:82 debug_mutex_unlock+0x155/0x180() DEBUG_LOCKS_WARN_ON(lock->owner != current)
> > >
> > > And that makes sense, because as soon as we release the lock a
> > > new owner can come in...
> > >
> > > One would think that !__mutex_slowpath_needs_to_unlock()
> > > implementations suffer the same, but for DEBUG we fall back to
> > > mutex-null.h which has an unconditional 1 for that.
> > >
> > > The mutex debug code requires the mutex to be unlocked after
> > > doing the debug checks, otherwise it can find inconsistent
> > > state.
> > >
> > > Reported-by: Ingo Molnar <[email protected]>
> > > Signed-off-by: Peter Zijlstra <[email protected]>
> > > Cc: [email protected]
>
> Hello,
>
> As a starting point, would either of you like to test the following
> patch to see if it fixes the issue? This patch essentially generates the
> same code as in older kernels in the debug case. This applies on top of
> kernels with both commits 6f008e72cd11 and 1d8fe7dc8078.
>
> Thanks.
>
> -----
> diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
> index e1191c9..faf6f5b 100644
> --- a/kernel/locking/mutex-debug.c
> +++ b/kernel/locking/mutex-debug.c
> @@ -83,12 +83,6 @@ void debug_mutex_unlock(struct mutex *lock)
>
> DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
> mutex_clear_owner(lock);
> -
> - /*
> - * __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug
> - * mutexes so that we can do it here after we've verified state.
> - */
> - atomic_set(&lock->count, 1);
> }
>
> void debug_mutex_init(struct mutex *lock, const char *name,
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index bc73d33..f1f672e 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -34,13 +34,6 @@
> #ifdef CONFIG_DEBUG_MUTEXES
> # include "mutex-debug.h"
> # include <asm-generic/mutex-null.h>
> -/*
> - * Must be 0 for the debug case so we do not do the unlock outside of the
> - * wait_lock region. debug_mutex_unlock() will do the actual unlock in this
> - * case.
> - */
> -# undef __mutex_slowpath_needs_to_unlock
> -# define __mutex_slowpath_needs_to_unlock() 0
> #else
> # include "mutex.h"
> # include <asm/mutex.h>
> @@ -688,6 +681,17 @@ __mutex_unlock_common_slowpath(atomic_t *lock_count, int nested)
> unsigned long flags;
>
> /*
> + * In the debug cases, obtain the wait_lock first
> + * before calling the following debugging functions.
> + */
> +#if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_DEBUG_LOCK_ALLOC)
> + spin_lock_mutex(&lock->wait_lock, flags);
> +#endif
> +
> + mutex_release(&lock->dep_map, nested, _RET_IP_);
> + debug_mutex_unlock(lock);
> +
> + /*
> * some architectures leave the lock unlocked in the fastpath failure
> * case, others need to leave it locked. In the later case we have to
> * unlock it here
> @@ -695,9 +699,9 @@ __mutex_unlock_common_slowpath(atomic_t *lock_count, int nested)
> if (__mutex_slowpath_needs_to_unlock())
> atomic_set(&lock->count, 1);
>
> +#if !defined(CONFIG_DEBUG_MUTEXES) && !defined(CONFIG_DEBUG_LOCK_ALLOC)
> spin_lock_mutex(&lock->wait_lock, flags);
> - mutex_release(&lock->dep_map, nested, _RET_IP_);
> - debug_mutex_unlock(lock);
> +#endif
>
> if (!list_empty(&lock->wait_list)) {
> /* get the first entry from the wait-list: */
>
>
>

This works and was given an overnight xfstests run on XFS to prove
itself. The other patch worked, too, but it was sent on a failing
mission to find an elusive JFS lockdep splat by find'ing, untarring,
and fs_mark'ing everything in sight. The other patch did fine on
the original xfstests generic/113 on devel/debug XFS.

Thanks!

Michael

2014-04-11 14:01:55

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: 3.14.0+/x86: lockdep and mutexes not getting along

On Thu, 10 Apr 2014 16:15:59 +0200, Peter Zijlstra said:
> On Thu, Apr 10, 2014 at 11:18:24AM +0200, Peter Zijlstra wrote:
> > So I managed to reproduce, and the below makes it go away. I just don't
> > understand why though. will stare more.
>
> /me kicks himself.. bloody obvious fail there :-)
>
> Not unlocking the lock after a lockdep trigger will make things get
> stuck real fast :-)
>
> ---
> kernel/locking/mutex-debug.c | 19 +++++++++----------
> 1 file changed, 9 insertions(+), 10 deletions(-)

Confirming that this makes next-20140409 boot without hanging for me.
Feel free to stick a Tested-by: on it. ;)


Attachments:
(No filename) (848.00 B)

2014-04-11 14:51:12

by David Howells

[permalink] [raw]
Subject: Re: cred_guard_mutex vs seq_file::lock [was: Re: 3.14.0+/x86: lockdep and mutexes not getting along]

Peter Zijlstra <[email protected]> wrote:

> Al, David, any bright ideas on how to best fix this?

Have the seq_xxx() code throw an error if current->in_execve is true. I can't
think of any circumstance where execve() should be reading anything that uses
seq_xxx().

David

2014-04-11 15:08:07

by Al Viro

[permalink] [raw]
Subject: Re: cred_guard_mutex vs seq_file::lock [was: Re: 3.14.0+/x86: lockdep and mutexes not getting along]

On Fri, Apr 11, 2014 at 03:50:27PM +0100, David Howells wrote:
> Peter Zijlstra <[email protected]> wrote:
>
> > Al, David, any bright ideas on how to best fix this?
>
> Have the seq_xxx() code throw an error if current->in_execve is true. I can't
> think of any circumstance where execve() should be reading anything that uses
> seq_xxx().

*cringe*

I don't like it. That really should be a responsiblity of specific ->show();
"I'm going to take that mutex, bugger off if we are in execve()" makes a lot
more sense than having e.g. seq_read() care of that. IOW, I would very
much prefer the patch you've sent last week.

And yes, it might leave lockdep false positives, but that's better dealt with
by annotating the sucker ("this guy has a separate lockdep class for its
->lock"). E.g. by splitting proc_single_file_operations in two and having
the one used for those files do lockdep_set_class() in its ->open().

Subject: [tip:core/urgent] locking/mutex: Fix debug_mutexes

Commit-ID: a227960fe0cafcc229a8d6bb8b454a3a0b33719d
Gitweb: http://git.kernel.org/tip/a227960fe0cafcc229a8d6bb8b454a3a0b33719d
Author: Peter Zijlstra <[email protected]>
AuthorDate: Thu, 10 Apr 2014 16:15:59 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Fri, 11 Apr 2014 10:40:35 +0200

locking/mutex: Fix debug_mutexes

debug_mutex_unlock() would bail when !debug_locks and forgets to
actually unlock.

Reported-by: "Michael L. Semon" <[email protected]>
Reported-by: "Kirill A. Shutemov" <[email protected]>
Reported-by: Valdis Kletnieks <[email protected]>
Fixes: 6f008e72cd11 ("locking/mutex: Fix debug checks")
Tested-by: Dave Jones <[email protected]>
Cc: Jason Low <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/locking/mutex-debug.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
index e1191c9..5cf6731 100644
--- a/kernel/locking/mutex-debug.c
+++ b/kernel/locking/mutex-debug.c
@@ -71,18 +71,17 @@ void mutex_remove_waiter(struct mutex *lock, struct mutex_waiter *waiter,

void debug_mutex_unlock(struct mutex *lock)
{
- if (unlikely(!debug_locks))
- return;
+ if (likely(debug_locks)) {
+ DEBUG_LOCKS_WARN_ON(lock->magic != lock);

- DEBUG_LOCKS_WARN_ON(lock->magic != lock);
+ if (!lock->owner)
+ DEBUG_LOCKS_WARN_ON(!lock->owner);
+ else
+ DEBUG_LOCKS_WARN_ON(lock->owner != current);

- if (!lock->owner)
- DEBUG_LOCKS_WARN_ON(!lock->owner);
- else
- DEBUG_LOCKS_WARN_ON(lock->owner != current);
-
- DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
- mutex_clear_owner(lock);
+ DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
+ mutex_clear_owner(lock);
+ }

/*
* __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug

2014-07-30 22:35:14

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: cred_guard_mutex vs seq_file::lock [was: Re: 3.14.0+/x86: lockdep and mutexes not getting along]

On Fri, Apr 11, 2014 at 04:07:25PM +0100, Al Viro wrote:
> On Fri, Apr 11, 2014 at 03:50:27PM +0100, David Howells wrote:
> > Peter Zijlstra <[email protected]> wrote:
> >
> > > Al, David, any bright ideas on how to best fix this?
> >
> > Have the seq_xxx() code throw an error if current->in_execve is true. I can't
> > think of any circumstance where execve() should be reading anything that uses
> > seq_xxx().
>
> *cringe*
>
> I don't like it. That really should be a responsiblity of specific ->show();
> "I'm going to take that mutex, bugger off if we are in execve()" makes a lot
> more sense than having e.g. seq_read() care of that. IOW, I would very
> much prefer the patch you've sent last week.
>
> And yes, it might leave lockdep false positives, but that's better dealt with
> by annotating the sucker ("this guy has a separate lockdep class for its
> ->lock"). E.g. by splitting proc_single_file_operations in two and having
> the one used for those files do lockdep_set_class() in its ->open().

I've got annoyed by the lockdep warning. What about the patch below?

>From 54d8c463e12f23c09d6a2dbf93a4dc9bcb493c67 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <[email protected]>
Date: Thu, 31 Jul 2014 00:59:52 +0300
Subject: [PATCH] procfs: silence lockdep warning about read vs. exec seq_file

Testcase:

cat /proc/self/maps >/dev/null
chmod +x /proc/self/net/packet
exec /proc/self/net/packet

It triggers lockdep warning:

[ INFO: possible circular locking dependency detected ]
3.16.0-rc7-00064-g26bcd8b72563 #8 Not tainted
-------------------------------------------------------
sh/157 is trying to acquire lock:
(&p->lock){+.+.+.}, at: [<ffffffff8117f4f8>] seq_read+0x38/0x3e0

but task is already holding lock:
(&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff81160018>] prepare_bprm_creds+0x28/0x90

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&sig->cred_guard_mutex){+.+.+.}:
[<ffffffff8109a9c1>] __lock_acquire+0x531/0xde0
[<ffffffff8109b959>] lock_acquire+0x79/0xd0
[<ffffffff8173f838>] mutex_lock_killable_nested+0x68/0x460
[<ffffffff811c0d9f>] lock_trace+0x1f/0x60
[<ffffffff811c0ed7>] proc_pid_personality+0x17/0x60
[<ffffffff811be39b>] proc_single_show+0x4b/0x90
[<ffffffff8117f5a0>] seq_read+0xe0/0x3e0
[<ffffffff81158f1e>] vfs_read+0x8e/0x170
[<ffffffff81159be8>] SyS_read+0x48/0xc0
[<ffffffff81743712>] system_call_fastpath+0x16/0x1b

-> #0 (&p->lock){+.+.+.}:
[<ffffffff81097437>] validate_chain.isra.37+0xfe7/0x13b0
[<ffffffff8109a9c1>] __lock_acquire+0x531/0xde0
[<ffffffff8109b959>] lock_acquire+0x79/0xd0
[<ffffffff8173f09a>] mutex_lock_nested+0x6a/0x3d0
[<ffffffff8117f4f8>] seq_read+0x38/0x3e0
[<ffffffff811bd5f3>] proc_reg_read+0x43/0x70
[<ffffffff81158f1e>] vfs_read+0x8e/0x170
[<ffffffff8115ea13>] kernel_read+0x43/0x60
[<ffffffff8115ec65>] prepare_binprm+0xd5/0x170
[<ffffffff811605c8>] do_execve_common.isra.32+0x548/0x800
[<ffffffff81160893>] do_execve+0x13/0x20
[<ffffffff81160b70>] SyS_execve+0x20/0x30
[<ffffffff81743c89>] stub_execve+0x69/0xa0

other info that might help us debug this:

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(&sig->cred_guard_mutex);
lock(&p->lock);
lock(&sig->cred_guard_mutex);
lock(&p->lock);

*** DEADLOCK ***

1 lock held by sh/157:
#0: (&sig->cred_guard_mutex){+.+.+.}, at: [<ffffffff81160018>] prepare_bprm_creds+0x28/0x90

It's a false positive: seq files which take cred_guard_mutex are never
executable. Let's use separate lock class for them.

I don't know why we allow "chmod +x" on some proc files, notably net-related.
Is it a bug?

Also I suspect eb94cd96e05d fixes non-existing bug, like this one.

Signed-off-by: Kirill A. Shutemov <[email protected]>
---
fs/proc/base.c | 24 +++++++++++++++++++++++-
fs/proc/task_mmu.c | 14 ++++++++++++++
fs/proc/task_nommu.c | 4 ++++
3 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 2d696b0c93bf..c05b4a227acb 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -655,9 +655,31 @@ static int proc_single_show(struct seq_file *m, void *v)
return ret;
}

+/*
+ * proc_pid_personality() and proc_pid_stack() take cred_guard_mutex via
+ * lock_trace() with seq_file->lock held.
+ * execve(2) calls vfs_read() with cred_guard_mutex held.
+ *
+ * So if you will try to execute a seq_file, lockdep will report a possible
+ * circular locking dependency. It's false-positive, since ONE() files are
+ * never executable.
+ *
+ * Let's set separate lock class for seq_file->lock of ONE() files.
+ */
+static struct lock_class_key proc_single_open_lock_class;
+
static int proc_single_open(struct inode *inode, struct file *filp)
{
- return single_open(filp, proc_single_show, inode);
+ struct seq_file *m;
+ int ret;
+
+ ret = single_open(filp, proc_single_show, inode);
+ if (ret)
+ return ret;
+
+ m = filp->private_data;
+ lockdep_set_class(&m->lock, &proc_single_open_lock_class);
+ return 0;
}

static const struct file_operations proc_single_file_operations = {
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index cfa63ee92c96..536b9f9a9ff5 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -19,6 +19,18 @@
#include <asm/tlbflush.h>
#include "internal.h"

+/*
+ * m_start() takes cred_guard_mutex via mm_access() with seq_file->lock held.
+ * execve(2) calls vfs_read() with cred_guard_mutex held.
+ *
+ * So if you will try to execute a seq_file, lockdep will report a possible
+ * circular locking dependency. It's false positive, since m_start() users are
+ * never executable.
+ *
+ * Let's set separate class lock for seq_file->lock of m_start() users.
+ */
+static struct lock_class_key pid_maps_seq_file_lock;
+
void task_mem(struct seq_file *m, struct mm_struct *mm)
{
unsigned long data, text, lib, swap;
@@ -242,6 +254,7 @@ static int do_maps_open(struct inode *inode, struct file *file,
ret = seq_open(file, ops);
if (!ret) {
struct seq_file *m = file->private_data;
+ lockdep_set_class(&m->lock, &pid_maps_seq_file_lock);
m->private = priv;
} else {
kfree(priv);
@@ -1512,6 +1525,7 @@ static int numa_maps_open(struct inode *inode, struct file *file,
ret = seq_open(file, ops);
if (!ret) {
struct seq_file *m = file->private_data;
+ lockdep_set_class(&m->lock, &pid_maps_seq_file_lock);
m->private = priv;
} else {
kfree(priv);
diff --git a/fs/proc/task_nommu.c b/fs/proc/task_nommu.c
index 678455d2d683..35a799443990 100644
--- a/fs/proc/task_nommu.c
+++ b/fs/proc/task_nommu.c
@@ -9,6 +9,9 @@
#include <linux/seq_file.h>
#include "internal.h"

+/* See comment in task_mmu.c */
+static struct lock_class_key pid_maps_seq_file_lock;
+
/*
* Logic: we've got two memory sums for each process, "shared", and
* "non-shared". Shared memory may get counted more than once, for
@@ -277,6 +280,7 @@ static int maps_open(struct inode *inode, struct file *file,
ret = seq_open(file, ops);
if (!ret) {
struct seq_file *m = file->private_data;
+ lockdep_set_class(&m->lock, &pid_maps_seq_file_lock);
m->private = priv;
} else {
kfree(priv);
--
Kirill A. Shutemov

2014-07-30 23:03:51

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: cred_guard_mutex vs seq_file::lock [was: Re: 3.14.0+/x86: lockdep and mutexes not getting along]

On Thu, Jul 31, 2014 at 01:31:30AM +0300, Kirill A. Shutemov wrote:
> I don't know why we allow "chmod +x" on some proc files, notably net-related.
> Is it a bug?

# ls -l /proc/{1,157}/net/packet
-r--r--r-- 1 root 0 0 Jul 30 23:01 /proc/1/net/packet
-r--r--r-- 1 root 0 0 Jul 30 23:01 /proc/157/net/packet
# chmod +x /proc/157/net/packet
# ls -l /proc/{1,157}/net/packet
-r-xr-xr-x 1 root 0 0 Jul 30 23:02 /proc/1/net/packet
-r-xr-xr-x 1 root 0 0 Jul 30 23:02 /proc/157/net/packet

Ouch.. Or is it expected?

--
Kirill A. Shutemov

2014-07-31 07:26:21

by Cyrill Gorcunov

[permalink] [raw]
Subject: Re: cred_guard_mutex vs seq_file::lock [was: Re: 3.14.0+/x86: lockdep and mutexes not getting along]

On Thu, Jul 31, 2014 at 01:31:30AM +0300, Kirill A. Shutemov wrote:
...
> >
> > *cringe*
> >
> > I don't like it. That really should be a responsiblity of specific ->show();
> > "I'm going to take that mutex, bugger off if we are in execve()" makes a lot
> > more sense than having e.g. seq_read() care of that. IOW, I would very
> > much prefer the patch you've sent last week.
> >
> > And yes, it might leave lockdep false positives, but that's better dealt with
> > by annotating the sucker ("this guy has a separate lockdep class for its
> > ->lock"). E.g. by splitting proc_single_file_operations in two and having
> > the one used for those files do lockdep_set_class() in its ->open().
>
> I've got annoyed by the lockdep warning. What about the patch below?
>
> From 54d8c463e12f23c09d6a2dbf93a4dc9bcb493c67 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <[email protected]>
> Date: Thu, 31 Jul 2014 00:59:52 +0300
> Subject: [PATCH] procfs: silence lockdep warning about read vs. exec seq_file
>
> Testcase:
>
> cat /proc/self/maps >/dev/null
> chmod +x /proc/self/net/packet
> exec /proc/self/net/packet
>
> It triggers lockdep warning:
...
>
> It's a false positive: seq files which take cred_guard_mutex are never
> executable. Let's use separate lock class for them.
>
> I don't know why we allow "chmod +x" on some proc files, notably net-related.
> Is it a bug?
>
> Also I suspect eb94cd96e05d fixes non-existing bug, like this one.

Yeah, it should. And the patch looks good to me. Still I think maybe indeed
we need to prevent ability to set executable attribute on proc seq files?