2014-02-10 19:19:47

by Tommi Rantala

[permalink] [raw]
Subject: lockdep: strange %s#5 lock name

Hello,

Noticed a suspicious "%s#5" lock name in a lockdep splat while fuzzing
with trinity.

Tommi

[249844.491141] INFO: task kworker/u2:2:32113 blocked for more than 120 seconds.
[249844.493268] Not tainted v3.13-11268-g8a1f006 #3
[249844.494731] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[249844.496922] kworker/u2:2 D ffff880074a92520 0 32113 2 0x00000000
[249844.498985] Workqueue: netns cleanup_net
[249844.500188] ffff8800956d1c78 0000000000000046 ffff880074a92520
00000000001d4f80
[249844.502360] ffff8800956d1fd8 00000000001d4f80 ffff880074a90000
ffff880074a92520
[249844.504617] ffff8800b414dab8 ffffffff82db4d20 ffffffff82db4d28
0000000000000246
[249844.506647] Call Trace:
[249844.507331] [<ffffffff82378b75>] schedule+0x65/0x70
[249844.508576] [<ffffffff82378b91>] schedule_preempt_disabled+0x11/0x20
[249844.510185] [<ffffffff8237a0e5>] mutex_lock_nested+0x285/0x4a0
[249844.511777] [<ffffffff81f89160>] ? cleanup_net+0x80/0x1e0
[249844.513422] [<ffffffff81f89160>] ? cleanup_net+0x80/0x1e0
[249844.514998] [<ffffffff81f89160>] cleanup_net+0x80/0x1e0
[249844.516523] [<ffffffff8115b096>] process_one_work+0x366/0x690
[249844.518172] [<ffffffff8115af70>] ? process_one_work+0x240/0x690
[249844.519840] [<ffffffff8115b8de>] worker_thread+0x21e/0x370
[249844.521422] [<ffffffff8115b6c0>] ? rescuer_thread+0x2c0/0x2c0
[249844.523040] [<ffffffff81164420>] kthread+0xf0/0x100
[249844.524425] [<ffffffff8116dde1>] ? finish_task_switch+0x81/0x130
[249844.525987] [<ffffffff81164330>] ? insert_kthread_work+0x90/0x90
[249844.527487] [<ffffffff8237f50c>] ret_from_fork+0x7c/0xb0
[249844.528959] [<ffffffff81164330>] ? insert_kthread_work+0x90/0x90
[249844.530451] 3 locks held by kworker/u2:2/32113:
[249844.531638] #0: (%s#5){.+.+.+}, at: [<ffffffff8115af70>]
process_one_work+0x240/0x690
[249844.533891] #1: (net_cleanup_work){+.+.+.}, at:
[<ffffffff8115af70>] process_one_work+0x240/0x690
[249844.536317] #2: (net_mutex){+.+.+.}, at: [<ffffffff81f89160>]
cleanup_net+0x80/0x1e0
[249844.538744] INFO: task trinity-c10:23911 blocked for more than 120 seconds.
[249844.540723] Not tainted v3.13-11268-g8a1f006 #3
[249844.542192] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[249844.544785] trinity-c10 D ffff8800b963a520 0 23911 22465 0x00000004
[249844.546929] ffff880020d49df8 0000000000000046 ffff8800b963a520
00000000001d4f80
[249844.549183] ffff880020d49fd8 00000000001d4f80 ffff8800bb42a520
ffff8800b963a520
[249844.551401] ffff880024e58000 ffffffff82db4d20 ffffffff82db4d28
0000000000000246
[249844.553376] Call Trace:
[249844.553998] [<ffffffff82378b75>] schedule+0x65/0x70
[249844.555237] [<ffffffff82378b91>] schedule_preempt_disabled+0x11/0x20
[249844.556844] [<ffffffff8237a0e5>] mutex_lock_nested+0x285/0x4a0
[249844.558349] [<ffffffff81f89357>] ? copy_net_ns+0x97/0x150
[249844.559748] [<ffffffff81f89357>] ? copy_net_ns+0x97/0x150
[249844.561305] [<ffffffff81f89357>] copy_net_ns+0x97/0x150
[249844.562789] [<ffffffff811693e6>] create_new_namespaces+0x126/0x1c0
[249844.564551] [<ffffffff81fffffd>] ? ip_build_and_send_pkt+0x1cd/0x260
[249844.566484] [<ffffffff81169797>] unshare_nsproxy_namespaces+0xa7/0xe0
[249844.568325] [<ffffffff81139876>] SyS_unshare+0x116/0x2c0
[249844.569838] [<ffffffff81fffffd>] ? ip_build_and_send_pkt+0x1cd/0x260
[249844.571664] [<ffffffff8237f5b9>] system_call_fastpath+0x16/0x1b
[249844.573328] [<ffffffff81fffffd>] ? ip_build_and_send_pkt+0x1cd/0x260
[249844.574886] 1 lock held by trinity-c10/23911:
[249844.576036] #0: (net_mutex){+.+.+.}, at: [<ffffffff81f89357>]
copy_net_ns+0x97/0x150


2014-02-10 19:28:55

by Peter Zijlstra

[permalink] [raw]
Subject: Re: lockdep: strange %s#5 lock name

On Mon, Feb 10, 2014 at 09:19:43PM +0200, Tommi Rantala wrote:
> Hello,
>
> Noticed a suspicious "%s#5" lock name in a lockdep splat while fuzzing
> with trinity.
>
> [249844.531638] #0: (%s#5){.+.+.+}, at: [<ffffffff8115af70>]
> process_one_work+0x240/0x690

Lol.. its correct afaict:

struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
unsigned int flags,
int max_active,
struct lock_class_key *key,
const char *lock_name, ...)
{
...
lockdep_init_map(&wq->lockdep_map, lock_name, key, 0);



So while its called lock_name, it looks like the argument is a format.

I suppose the below was what was intended...

---
kernel/workqueue.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 82ef9f3b7473..861d8ddd92a2 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4202,7 +4202,7 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
INIT_LIST_HEAD(&wq->flusher_overflow);
INIT_LIST_HEAD(&wq->maydays);

- lockdep_init_map(&wq->lockdep_map, lock_name, key, 0);
+ lockdep_init_map(&wq->lockdep_map, wq->name, key, 0);
INIT_LIST_HEAD(&wq->list);

if (alloc_and_link_pwqs(wq) < 0)

2014-02-10 21:52:35

by Tejun Heo

[permalink] [raw]
Subject: Re: lockdep: strange %s#5 lock name

On Mon, Feb 10, 2014 at 08:28:46PM +0100, Peter Zijlstra wrote:
> Lol.. its correct afaict:
>
> struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
> unsigned int flags,
> int max_active,
> struct lock_class_key *key,
> const char *lock_name, ...)
> {
> ...
> lockdep_init_map(&wq->lockdep_map, lock_name, key, 0);
>
>
>
> So while its called lock_name, it looks like the argument is a format.

Heh, yeah, that's silly.

> I suppose the below was what was intended...
>
> ---
> kernel/workqueue.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 82ef9f3b7473..861d8ddd92a2 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -4202,7 +4202,7 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
> INIT_LIST_HEAD(&wq->flusher_overflow);
> INIT_LIST_HEAD(&wq->maydays);
>
> - lockdep_init_map(&wq->lockdep_map, lock_name, key, 0);
> + lockdep_init_map(&wq->lockdep_map, wq->name, key, 0);

Looks good to me. Can you please post the patch with SOB?

Thanks!

--
tejun

2014-02-11 11:00:47

by Peter Zijlstra

[permalink] [raw]
Subject: Re: lockdep: strange %s#5 lock name

> Looks good to me. Can you please post the patch with SOB?

---
Subject: workqueue: Fix workqueue lockdep name

Tommi noticed a 'funny' lock class name: "%s#5" from a lock acquired in
process_one_work(). It turns out that commit b196be89cdc14 forgot to
change the lockdep_init_map() when it changed the @lock_name argument
from a string to a format.

Fixes: b196be89cdc14 ("workqueue: make alloc_workqueue() take printf fmt and args for name")
Reported-by: Tommi Rantala <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
---
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 82ef9f3b7473..861d8ddd92a2 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4202,7 +4202,7 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
INIT_LIST_HEAD(&wq->flusher_overflow);
INIT_LIST_HEAD(&wq->maydays);

- lockdep_init_map(&wq->lockdep_map, lock_name, key, 0);
+ lockdep_init_map(&wq->lockdep_map, wq->name, key, 0);
INIT_LIST_HEAD(&wq->list);

if (alloc_and_link_pwqs(wq) < 0)

2014-02-11 15:27:47

by Tejun Heo

[permalink] [raw]
Subject: Re: lockdep: strange %s#5 lock name

On Tue, Feb 11, 2014 at 12:00:36PM +0100, Peter Zijlstra wrote:
> > Looks good to me. Can you please post the patch with SOB?
>
> ---
> Subject: workqueue: Fix workqueue lockdep name
>
> Tommi noticed a 'funny' lock class name: "%s#5" from a lock acquired in
> process_one_work(). It turns out that commit b196be89cdc14 forgot to
> change the lockdep_init_map() when it changed the @lock_name argument
> from a string to a format.
>
> Fixes: b196be89cdc14 ("workqueue: make alloc_workqueue() take printf fmt and args for name")
> Reported-by: Tommi Rantala <[email protected]>
> Signed-off-by: Peter Zijlstra <[email protected]>

Applied to wq/for-3.14-fixes.

Thanks!

--
tejun

2014-02-13 04:35:37

by Li Zhong

[permalink] [raw]
Subject: Re: lockdep: strange %s#5 lock name

On Tue, 2014-02-11 at 10:27 -0500, Tejun Heo wrote:
> On Tue, Feb 11, 2014 at 12:00:36PM +0100, Peter Zijlstra wrote:
> > > Looks good to me. Can you please post the patch with SOB?
> >
> > ---
> > Subject: workqueue: Fix workqueue lockdep name
> >
> > Tommi noticed a 'funny' lock class name: "%s#5" from a lock acquired in
> > process_one_work(). It turns out that commit b196be89cdc14 forgot to
> > change the lockdep_init_map() when it changed the @lock_name argument
> > from a string to a format.
> >
> > Fixes: b196be89cdc14 ("workqueue: make alloc_workqueue() take printf fmt and args for name")
> > Reported-by: Tommi Rantala <[email protected]>
> > Signed-off-by: Peter Zijlstra <[email protected]>
>
> Applied to wq/for-3.14-fixes.

Hi Tejun, Peter,

I found following lockdep warning caused by this commit in next-tree:

[ 5.251993] ------------[ cut here ]------------
[ 5.252019] WARNING: CPU: 0 PID: 221 at kernel/locking/lockdep.c:710 __lock_acquire+0x1761/0x1f60()
[ 5.252019] Modules linked in: e1000
[ 5.252019] CPU: 0 PID: 221 Comm: lvm Not tainted 3.14.0-rc2-next-20140212 #1
[ 5.252019] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 5.252019] 0000000000000009 ffff880118e91938 ffffffff8155fe12 ffff880118e91978
[ 5.252019] ffffffff8105c195 ffff880118e91958 ffffffff81eb33d0 0000000000000002
[ 5.252019] ffff880118dd2318 0000000000000000 0000000000000000 ffff880118e91988
[ 5.252019] Call Trace:
[ 5.252019] [<ffffffff8155fe12>] dump_stack+0x19/0x1b
[ 5.252019] [<ffffffff8105c195>] warn_slowpath_common+0x85/0xb0
[ 5.252019] [<ffffffff8105c1da>] warn_slowpath_null+0x1a/0x20
[ 5.252019] [<ffffffff810a1721>] __lock_acquire+0x1761/0x1f60
[ 5.252019] [<ffffffff8109ec2e>] ? mark_held_locks+0xae/0x120
[ 5.252019] [<ffffffff8109ef4e>] ? debug_check_no_locks_freed+0x8e/0x160
[ 5.252019] [<ffffffff810a264c>] ? lockdep_init_map+0xac/0x600
[ 5.252019] [<ffffffff810a251a>] lock_acquire+0x9a/0x120
[ 5.252019] [<ffffffff810793f5>] ? flush_workqueue+0x5/0x750
[ 5.252019] [<ffffffff810794f9>] flush_workqueue+0x109/0x750
[ 5.252019] [<ffffffff810793f5>] ? flush_workqueue+0x5/0x750
[ 5.252019] [<ffffffff81566890>] ? _raw_spin_unlock_irq+0x30/0x40
[ 5.252019] [<ffffffff810b7720>] ? srcu_reschedule+0xe0/0xf0
[ 5.252019] [<ffffffff81405889>] dm_suspend+0xe9/0x1e0
[ 5.252019] [<ffffffff8140a853>] dev_suspend+0x1e3/0x270
[ 5.252019] [<ffffffff8140a670>] ? table_load+0x350/0x350
[ 5.252019] [<ffffffff8140b40c>] ctl_ioctl+0x26c/0x510
[ 5.252019] [<ffffffff810a03dc>] ? __lock_acquire+0x41c/0x1f60
[ 5.252019] [<ffffffff810923d8>] ? vtime_account_user+0x98/0xb0
[ 5.252019] [<ffffffff8140b6c3>] dm_ctl_ioctl+0x13/0x20
[ 5.252019] [<ffffffff811986c8>] do_vfs_ioctl+0x88/0x570
[ 5.252019] [<ffffffff811a5579>] ? __fget_light+0x129/0x150
[ 5.252019] [<ffffffff81198c41>] SyS_ioctl+0x91/0xb0
[ 5.252019] [<ffffffff8157049d>] tracesys+0xcf/0xd4
[ 5.252019] ---[ end trace ff1fa506f34be3bc ]---

It seems to me that when the second time alloc_workqueue() is called
from the same code path, it would have two locks with the same key, but
not the same &wq->name, which doesn't meet lockdep's assumption.

Maybe we could use the #fmt plus #args as the lock_name to avoid the
above issue? It could also give some more information for some string
like %s#5. Draft code attached below.

I removed __builtin_constant_p() check (I don't know how to check all
the args). However, by removing the check, it only adds two additional
two "s for those constants.

Some lockdep name examples I printed out after the change:

lockdep name wq->name

"events_long" events_long
"%s"("khelper") khelper
"xfs-data/%s"mp->m_fsname xfs-data/dm-3
...

A little bit ugly, not sure whether we could generate some better names
here?

Thanks,
Zhong

---
diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 594521b..704f4f6 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -419,10 +419,7 @@ __alloc_workqueue_key(const char *fmt, unsigned int flags, int max_active,
static struct lock_class_key __key; \
const char *__lock_name; \
\
- if (__builtin_constant_p(fmt)) \
- __lock_name = (fmt); \
- else \
- __lock_name = #fmt; \
+ __lock_name = #fmt#args; \
\
__alloc_workqueue_key((fmt), (flags), (max_active), \
&__key, __lock_name, ##args); \
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 861d8dd..82ef9f3 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4202,7 +4202,7 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
INIT_LIST_HEAD(&wq->flusher_overflow);
INIT_LIST_HEAD(&wq->maydays);

- lockdep_init_map(&wq->lockdep_map, wq->name, key, 0);
+ lockdep_init_map(&wq->lockdep_map, lock_name, key, 0);
INIT_LIST_HEAD(&wq->list);

if (alloc_and_link_pwqs(wq) < 0)

2014-02-13 21:26:51

by Tejun Heo

[permalink] [raw]
Subject: Re: lockdep: strange %s#5 lock name

On Thu, Feb 13, 2014 at 12:35:24PM +0800, Li Zhong wrote:
> [ 5.251993] ------------[ cut here ]------------
> [ 5.252019] WARNING: CPU: 0 PID: 221 at kernel/locking/lockdep.c:710 __lock_acquire+0x1761/0x1f60()
> [ 5.252019] Modules linked in: e1000
> [ 5.252019] CPU: 0 PID: 221 Comm: lvm Not tainted 3.14.0-rc2-next-20140212 #1
> [ 5.252019] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> [ 5.252019] 0000000000000009 ffff880118e91938 ffffffff8155fe12 ffff880118e91978
> [ 5.252019] ffffffff8105c195 ffff880118e91958 ffffffff81eb33d0 0000000000000002
> [ 5.252019] ffff880118dd2318 0000000000000000 0000000000000000 ffff880118e91988
> [ 5.252019] Call Trace:
> [ 5.252019] [<ffffffff8155fe12>] dump_stack+0x19/0x1b
> [ 5.252019] [<ffffffff8105c195>] warn_slowpath_common+0x85/0xb0
> [ 5.252019] [<ffffffff8105c1da>] warn_slowpath_null+0x1a/0x20
> [ 5.252019] [<ffffffff810a1721>] __lock_acquire+0x1761/0x1f60
> [ 5.252019] [<ffffffff8109ec2e>] ? mark_held_locks+0xae/0x120
> [ 5.252019] [<ffffffff8109ef4e>] ? debug_check_no_locks_freed+0x8e/0x160
> [ 5.252019] [<ffffffff810a264c>] ? lockdep_init_map+0xac/0x600
> [ 5.252019] [<ffffffff810a251a>] lock_acquire+0x9a/0x120
> [ 5.252019] [<ffffffff810793f5>] ? flush_workqueue+0x5/0x750
> [ 5.252019] [<ffffffff810794f9>] flush_workqueue+0x109/0x750
> [ 5.252019] [<ffffffff810793f5>] ? flush_workqueue+0x5/0x750
> [ 5.252019] [<ffffffff81566890>] ? _raw_spin_unlock_irq+0x30/0x40
> [ 5.252019] [<ffffffff810b7720>] ? srcu_reschedule+0xe0/0xf0
> [ 5.252019] [<ffffffff81405889>] dm_suspend+0xe9/0x1e0
> [ 5.252019] [<ffffffff8140a853>] dev_suspend+0x1e3/0x270
> [ 5.252019] [<ffffffff8140a670>] ? table_load+0x350/0x350
> [ 5.252019] [<ffffffff8140b40c>] ctl_ioctl+0x26c/0x510
> [ 5.252019] [<ffffffff810a03dc>] ? __lock_acquire+0x41c/0x1f60
> [ 5.252019] [<ffffffff810923d8>] ? vtime_account_user+0x98/0xb0
> [ 5.252019] [<ffffffff8140b6c3>] dm_ctl_ioctl+0x13/0x20
> [ 5.252019] [<ffffffff811986c8>] do_vfs_ioctl+0x88/0x570
> [ 5.252019] [<ffffffff811a5579>] ? __fget_light+0x129/0x150
> [ 5.252019] [<ffffffff81198c41>] SyS_ioctl+0x91/0xb0
> [ 5.252019] [<ffffffff8157049d>] tracesys+0xcf/0xd4
> [ 5.252019] ---[ end trace ff1fa506f34be3bc ]---
>
> It seems to me that when the second time alloc_workqueue() is called
> from the same code path, it would have two locks with the same key, but
> not the same &wq->name, which doesn't meet lockdep's assumption.

Dang... I reverted the previous patch for now. Peter, does this
approach sound good to you?

Thanks.

--
tejun

2014-02-13 21:33:04

by Peter Zijlstra

[permalink] [raw]
Subject: Re: lockdep: strange %s#5 lock name

On Thu, Feb 13, 2014 at 04:26:45PM -0500, Tejun Heo wrote:
> > It seems to me that when the second time alloc_workqueue() is called
> > from the same code path, it would have two locks with the same key, but
> > not the same &wq->name, which doesn't meet lockdep's assumption.
>
> Dang... I reverted the previous patch for now. Peter, does this
> approach sound good to you?

Whatever works I suppose; like Li said; a little ugly but not sure we
can do better.

2014-02-13 22:15:21

by Tejun Heo

[permalink] [raw]
Subject: Re: lockdep: strange %s#5 lock name

On Thu, Feb 13, 2014 at 10:32:53PM +0100, Peter Zijlstra wrote:
> On Thu, Feb 13, 2014 at 04:26:45PM -0500, Tejun Heo wrote:
> > > It seems to me that when the second time alloc_workqueue() is called
> > > from the same code path, it would have two locks with the same key, but
> > > not the same &wq->name, which doesn't meet lockdep's assumption.
> >
> > Dang... I reverted the previous patch for now. Peter, does this
> > approach sound good to you?
>
> Whatever works I suppose; like Li said; a little ugly but not sure we
> can do better.

Cool, Li, can you please spin it on top of v3.14-rc1 w/ SOB and
description.

Thanks.

--
tejun

2014-02-14 02:53:10

by Li Zhong

[permalink] [raw]
Subject: [PATCH ] workqueue: add args to workqueue lockdep name

Tommi noticed a 'funny' lock class name: "%s#5" from a lock acquired in
process_one_work().

Maybe #fmt plus #args could be used as the lock_name to give some more
information for some fmt string like the above.

__builtin_constant_p() check is removed (as there seems no good way to
check all the variables in args list). However, by removing the check,
it only adds two additional "s for those constants.

Some lockdep name examples printed out after the change:

lockdep name wq->name

"events_long" events_long
"%s"("khelper") khelper
"xfs-data/%s"mp->m_fsname xfs-data/dm-3

Signed-off-by: Li Zhong <[email protected]>
---
include/linux/workqueue.h | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 594521b..704f4f6 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -419,10 +419,7 @@ __alloc_workqueue_key(const char *fmt, unsigned int flags, int max_active,
static struct lock_class_key __key; \
const char *__lock_name; \
\
- if (__builtin_constant_p(fmt)) \
- __lock_name = (fmt); \
- else \
- __lock_name = #fmt; \
+ __lock_name = #fmt#args; \
\
__alloc_workqueue_key((fmt), (flags), (max_active), \
&__key, __lock_name, ##args); \


2014-02-14 14:06:20

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH ] workqueue: add args to workqueue lockdep name

On Fri, Feb 14, 2014 at 10:52:57AM +0800, Li Zhong wrote:
> Tommi noticed a 'funny' lock class name: "%s#5" from a lock acquired in
> process_one_work().
>
> Maybe #fmt plus #args could be used as the lock_name to give some more
> information for some fmt string like the above.
>
> __builtin_constant_p() check is removed (as there seems no good way to
> check all the variables in args list). However, by removing the check,
> it only adds two additional "s for those constants.
>
> Some lockdep name examples printed out after the change:
>
> lockdep name wq->name
>
> "events_long" events_long
> "%s"("khelper") khelper
> "xfs-data/%s"mp->m_fsname xfs-data/dm-3
>
> Signed-off-by: Li Zhong <[email protected]>

Applied to wq/for-3.14-fixes.

Thanks.

--
tejun