2010-02-11 17:11:50

by John Kacur

[permalink] [raw]
Subject: [BUG]: Possibe recursive locking detected in sysfs

I'm not sure if this one has already been reported. - thanks.

Feb 11 07:24:15 localhost kernel: [ INFO: possible recursive locking detected ]
Feb 11 07:24:15 localhost kernel: 2.6.33-rc7 #1
Feb 11 07:24:15 localhost kernel: ---------------------------------------------
Feb 11 07:24:15 localhost kernel: 94cpufreq/2933 is trying to acquire lock:
Feb 11 07:24:15 localhost kernel: (s_active){++++.+}, at:
[<ffffffff81157408>] sysfs_hash_and_remove+0x53/0x6a
Feb 11 07:24:15 localhost kernel:
Feb 11 07:24:15 localhost kernel: but task is already holding lock:
Feb 11 07:24:15 localhost kernel: (s_active){++++.+}, at:
[<ffffffff81159530>] sysfs_get_active_two+0x24/0x48
Feb 11 07:24:15 localhost kernel:
Feb 11 07:24:15 localhost kernel: other info that might help us debug this:
Feb 11 07:24:15 localhost kernel: 4 locks held by 94cpufreq/2933:
Feb 11 07:24:15 localhost kernel: #0: (&buffer->mutex){+.+.+.}, at:
[<ffffffff81157dbd>] sysfs_write_file+0x3e/0x12b
Feb 11 07:24:15 localhost kernel: #1: (s_active){++++.+}, at:
[<ffffffff81159530>] sysfs_get_active_two+0x24/0x48
Feb 11 07:24:15 localhost kernel: #2: (s_active){++++.+}, at:
[<ffffffff8115953d>] sysfs_get_active_two+0x31/0x48
Feb 11 07:24:15 localhost kernel: #3: (dbs_mutex){+.+.+.}, at:
[<ffffffffa04bcdb3>] cpufreq_governor_dbs+0x29b/0x348
[cpufreq_ondemand]
Feb 11 07:24:15 localhost kernel:
Feb 11 07:24:15 localhost kernel: stack backtrace:
Feb 11 07:24:15 localhost kernel: Pid: 2933, comm: 94cpufreq Not
tainted 2.6.33-rc7 #1
Feb 11 07:24:15 localhost kernel: Call Trace:
Feb 11 07:24:15 localhost kernel: [<ffffffff81071f53>]
__lock_acquire+0xcf6/0xd8b
Feb 11 07:24:15 localhost kernel: [<ffffffff81070914>] ?
debug_check_no_locks_freed+0x120/0x12f
Feb 11 07:24:15 localhost kernel: [<ffffffff8107020e>] ?
trace_hardirqs_on_caller+0x11f/0x14a
Feb 11 07:24:15 localhost kernel: [<ffffffff810720c0>] lock_acquire+0xd8/0xf5
Feb 11 07:24:15 localhost kernel: [<ffffffff81157408>] ?
sysfs_hash_and_remove+0x53/0x6a
Feb 11 07:24:15 localhost kernel: [<ffffffff81159231>]
sysfs_addrm_finish+0xe1/0x175
Feb 11 07:24:15 localhost kernel: [<ffffffff81157408>] ?
sysfs_hash_and_remove+0x53/0x6a
Feb 11 07:24:15 localhost kernel: [<ffffffff8137b1b5>] ?
sub_preempt_count+0xa3/0xb6
Feb 11 07:24:15 localhost kernel: [<ffffffff81157408>]
sysfs_hash_and_remove+0x53/0x6a
Feb 11 07:24:15 localhost kernel: [<ffffffff8115a59d>]
sysfs_remove_group+0x91/0xc9
Feb 11 07:24:15 localhost kernel: [<ffffffffa04bcdc6>]
cpufreq_governor_dbs+0x2ae/0x348 [cpufreq_ondemand]
Feb 11 07:24:15 localhost kernel: [<ffffffff8137b1b5>] ?
sub_preempt_count+0xa3/0xb6
Feb 11 07:24:15 localhost kernel: [<ffffffff812ab9cc>]
__cpufreq_governor+0x89/0xc7
Feb 11 07:24:15 localhost kernel: [<ffffffff812ac7b9>]
__cpufreq_set_policy+0x18e/0x22a
Feb 11 07:24:15 localhost kernel: [<ffffffff812accd6>]
store_scaling_governor+0x199/0x1ed
Feb 11 07:24:15 localhost kernel: [<ffffffff812ad50d>] ? handle_update+0x0/0x39
Feb 11 07:24:15 localhost kernel: [<ffffffff8137685d>] ? down_write+0x76/0x7e
Feb 11 07:24:15 localhost kernel: [<ffffffff812ac607>] store+0x67/0x8b
Feb 11 07:24:15 localhost kernel: [<ffffffff81157e75>]
sysfs_write_file+0xf6/0x12b
Feb 11 07:24:15 localhost kernel: [<ffffffff811003c9>] vfs_write+0xb0/0x10a
Feb 11 07:24:15 localhost kernel: [<ffffffff811004f1>] sys_write+0x4c/0x75
Feb 11 07:24:15 localhost kernel: [<ffffffff81002d9b>]
system_call_fastpath+0x16/0x1b


2010-02-14 13:10:19

by John Kacur

[permalink] [raw]
Subject: Re: [BUG]: Possibe recursive locking detected in sysfs

On Thu, Feb 11, 2010 at 12:11 PM, John Kacur <[email protected]> wrote:
> I'm not sure if this one has already been reported. - thanks.
>
> Feb 11 07:24:15 localhost kernel: [ INFO: possible recursive locking detected ]
> Feb 11 07:24:15 localhost kernel: 2.6.33-rc7 #1
> Feb 11 07:24:15 localhost kernel: ---------------------------------------------
> Feb 11 07:24:15 localhost kernel: 94cpufreq/2933 is trying to acquire lock:
> Feb 11 07:24:15 localhost kernel: (s_active){++++.+}, at:
> [<ffffffff81157408>] sysfs_hash_and_remove+0x53/0x6a
> Feb 11 07:24:15 localhost kernel:
> Feb 11 07:24:15 localhost kernel: but task is already holding lock:
> Feb 11 07:24:15 localhost kernel: (s_active){++++.+}, at:
> [<ffffffff81159530>] sysfs_get_active_two+0x24/0x48
> Feb 11 07:24:15 localhost kernel:
> Feb 11 07:24:15 localhost kernel: other info that might help us debug this:
> Feb 11 07:24:15 localhost kernel: 4 locks held by 94cpufreq/2933:
> Feb 11 07:24:15 localhost kernel: #0: ?(&buffer->mutex){+.+.+.}, at:
> [<ffffffff81157dbd>] sysfs_write_file+0x3e/0x12b
> Feb 11 07:24:15 localhost kernel: #1: ?(s_active){++++.+}, at:
> [<ffffffff81159530>] sysfs_get_active_two+0x24/0x48
> Feb 11 07:24:15 localhost kernel: #2: ?(s_active){++++.+}, at:
> [<ffffffff8115953d>] sysfs_get_active_two+0x31/0x48
> Feb 11 07:24:15 localhost kernel: #3: ?(dbs_mutex){+.+.+.}, at:
> [<ffffffffa04bcdb3>] cpufreq_governor_dbs+0x29b/0x348
> [cpufreq_ondemand]
> Feb 11 07:24:15 localhost kernel:
> Feb 11 07:24:15 localhost kernel: stack backtrace:
> Feb 11 07:24:15 localhost kernel: Pid: 2933, comm: 94cpufreq Not
> tainted 2.6.33-rc7 #1
> Feb 11 07:24:15 localhost kernel: Call Trace:
> Feb 11 07:24:15 localhost kernel: [<ffffffff81071f53>]
> __lock_acquire+0xcf6/0xd8b
> Feb 11 07:24:15 localhost kernel: [<ffffffff81070914>] ?
> debug_check_no_locks_freed+0x120/0x12f
> Feb 11 07:24:15 localhost kernel: [<ffffffff8107020e>] ?
> trace_hardirqs_on_caller+0x11f/0x14a
> Feb 11 07:24:15 localhost kernel: [<ffffffff810720c0>] lock_acquire+0xd8/0xf5
> Feb 11 07:24:15 localhost kernel: [<ffffffff81157408>] ?
> sysfs_hash_and_remove+0x53/0x6a
> Feb 11 07:24:15 localhost kernel: [<ffffffff81159231>]
> sysfs_addrm_finish+0xe1/0x175
> Feb 11 07:24:15 localhost kernel: [<ffffffff81157408>] ?
> sysfs_hash_and_remove+0x53/0x6a
> Feb 11 07:24:15 localhost kernel: [<ffffffff8137b1b5>] ?
> sub_preempt_count+0xa3/0xb6
> Feb 11 07:24:15 localhost kernel: [<ffffffff81157408>]
> sysfs_hash_and_remove+0x53/0x6a
> Feb 11 07:24:15 localhost kernel: [<ffffffff8115a59d>]
> sysfs_remove_group+0x91/0xc9
> Feb 11 07:24:15 localhost kernel: [<ffffffffa04bcdc6>]
> cpufreq_governor_dbs+0x2ae/0x348 [cpufreq_ondemand]
> Feb 11 07:24:15 localhost kernel: [<ffffffff8137b1b5>] ?
> sub_preempt_count+0xa3/0xb6
> Feb 11 07:24:15 localhost kernel: [<ffffffff812ab9cc>]
> __cpufreq_governor+0x89/0xc7
> Feb 11 07:24:15 localhost kernel: [<ffffffff812ac7b9>]
> __cpufreq_set_policy+0x18e/0x22a
> Feb 11 07:24:15 localhost kernel: [<ffffffff812accd6>]
> store_scaling_governor+0x199/0x1ed
> Feb 11 07:24:15 localhost kernel: [<ffffffff812ad50d>] ? handle_update+0x0/0x39
> Feb 11 07:24:15 localhost kernel: [<ffffffff8137685d>] ? down_write+0x76/0x7e
> Feb 11 07:24:15 localhost kernel: [<ffffffff812ac607>] store+0x67/0x8b
> Feb 11 07:24:15 localhost kernel: [<ffffffff81157e75>]
> sysfs_write_file+0xf6/0x12b
> Feb 11 07:24:15 localhost kernel: [<ffffffff811003c9>] vfs_write+0xb0/0x10a
> Feb 11 07:24:15 localhost kernel: [<ffffffff811004f1>] sys_write+0x4c/0x75
> Feb 11 07:24:15 localhost kernel: [<ffffffff81002d9b>]
> system_call_fastpath+0x16/0x1b
>

This is still present in 2.6.33-rc8.
I can trigger it every time. I have T500 laptop, the trick is to boot
it up without plugging it in, so it is working on battery power, and
then to shut the lid.
first you get the message from above, and then it does a few more
things before it freezes and I have to hard reboot it.
The above problem does not occur if I have the laptop plugged into the wall.

Here is the 2.6.33-rc8 version of above. If anyone has further
suggestions for debugging, pls let me know.
I've cced the folks that get_maintainer.pl reports for
fs/sysfs/inode.c and fs/sysfs/dir.c

Feb 14 07:35:49 localhost kernel: =============================================
Feb 14 07:35:49 localhost kernel: [ INFO: possible recursive locking detected ]
Feb 14 07:35:49 localhost kernel: 2.6.33-rc8 #1
Feb 14 07:35:49 localhost kernel: ---------------------------------------------
Feb 14 07:35:49 localhost kernel: 94cpufreq/2914 is trying to acquire lock:
Feb 14 07:35:49 localhost kernel: (s_active){++++.+}, at:
[<ffffffff81157478>] sysfs_hash_and_remove+0x53/0x6a
Feb 14 07:35:49 localhost kernel:
Feb 14 07:35:49 localhost kernel: but task is already holding lock:
Feb 14 07:35:49 localhost kernel: (s_active){++++.+}, at:
[<ffffffff811595a0>] sysfs_get_active_two+0x24/0x48
Feb 14 07:35:49 localhost kernel:
Feb 14 07:35:49 localhost kernel: other info that might help us debug this:
Feb 14 07:35:49 localhost kernel: 4 locks held by 94cpufreq/2914:
Feb 14 07:35:49 localhost kernel: #0: (&buffer->mutex){+.+.+.}, at:
[<ffffffff81157e2d>] sysfs_write_file+0x3e/0x12b
Feb 14 07:35:49 localhost kernel: #1: (s_active){++++.+}, at:
[<ffffffff811595a0>] sysfs_get_active_two+0x24/0x48
Feb 14 07:35:49 localhost kernel: #2: (s_active){++++.+}, at:
[<ffffffff811595ad>] sysfs_get_active_two+0x31/0x48
Feb 14 07:35:49 localhost kernel: #3: (dbs_mutex){+.+.+.}, at:
[<ffffffffa04c0dbe>] cpufreq_governor_dbs+0x29b/0x345
[cpufreq_ondemand]
Feb 14 07:35:49 localhost kernel:
Feb 14 07:35:49 localhost kernel: stack backtrace:
Feb 14 07:35:49 localhost kernel: Pid: 2914, comm: 94cpufreq Not
tainted 2.6.33-rc8 #1
Feb 14 07:35:49 localhost kernel: Call Trace:
Feb 14 07:35:49 localhost kernel: [<ffffffff81071f53>]
__lock_acquire+0xcf6/0xd8b
Feb 14 07:35:49 localhost kernel: [<ffffffff81070914>] ?
debug_check_no_locks_freed+0x120/0x12f
Feb 14 07:35:49 localhost kernel: [<ffffffff8107020e>] ?
trace_hardirqs_on_caller+0x11f/0x14a
Feb 14 07:35:49 localhost kernel: [<ffffffff810720c0>] lock_acquire+0xd8/0xf5
Feb 14 07:35:49 localhost kernel: [<ffffffff81157478>] ?
sysfs_hash_and_remove+0x53/0x6a
Feb 14 07:35:49 localhost kernel: [<ffffffff811592a1>]
sysfs_addrm_finish+0xe1/0x175
Feb 14 07:35:49 localhost kernel: [<ffffffff81157478>] ?
sysfs_hash_and_remove+0x53/0x6a
Feb 14 07:35:49 localhost kernel: [<ffffffff8137b335>] ?
sub_preempt_count+0xa3/0xb6
Feb 14 07:35:49 localhost kernel: [<ffffffff81157478>]
sysfs_hash_and_remove+0x53/0x6a
Feb 14 07:35:49 localhost kernel: [<ffffffff8115a60d>]
sysfs_remove_group+0x91/0xc9
Feb 14 07:35:49 localhost kernel: [<ffffffffa04c0dd1>]
cpufreq_governor_dbs+0x2ae/0x345 [cpufreq_ondemand]
Feb 14 07:35:49 localhost kernel: [<ffffffff8137b335>] ?
sub_preempt_count+0xa3/0xb6
Feb 14 07:35:49 localhost kernel: [<ffffffff812aba60>]
__cpufreq_governor+0x89/0xc7
Feb 14 07:35:49 localhost kernel: [<ffffffff812ac84d>]
__cpufreq_set_policy+0x18e/0x22a
Feb 14 07:35:49 localhost kernel: [<ffffffff812acd6a>]
store_scaling_governor+0x199/0x1ed
Feb 14 07:35:49 localhost kernel: [<ffffffff812ad5a1>] ? handle_update+0x0/0x39
Feb 14 07:35:49 localhost kernel: [<ffffffff813769ad>] ? down_write+0x76/0x7e
Feb 14 07:35:49 localhost kernel: [<ffffffff812ac69b>] store+0x67/0x8b
Feb 14 07:35:49 localhost kernel: [<ffffffff81157ee5>]
sysfs_write_file+0xf6/0x12b
Feb 14 07:35:49 localhost kernel: [<ffffffff811003c9>] vfs_write+0xb0/0x10a
Feb 14 07:35:49 localhost kernel: [<ffffffff811004f1>] sys_write+0x4c/0x75
Feb 14 07:35:49 localhost kernel: [<ffffffff81002d9b>]
system_call_fastpath+0x16/0x1b

2010-02-14 14:57:40

by Eric Biederman

[permalink] [raw]
Subject: Re: [BUG]: Possibe recursive locking detected in sysfs

On Sun, Feb 14, 2010 at 5:10 AM, John Kacur <[email protected]> wrote:
> On Thu, Feb 11, 2010 at 12:11 PM, John Kacur <[email protected]> wrote:
>> I'm not sure if this one has already been reported. - thanks.
>>
>> Feb 11 07:24:15 localhost kernel: [ INFO: possible recursive locking detected ]
>> Feb 11 07:24:15 localhost kernel: 2.6.33-rc7 #1
>> Feb 11 07:24:15 localhost kernel: ---------------------------------------------
>> Feb 11 07:24:15 localhost kernel: 94cpufreq/2933 is trying to acquire lock:
>> Feb 11 07:24:15 localhost kernel: (s_active){++++.+}, at:
>> [<ffffffff81157408>] sysfs_hash_and_remove+0x53/0x6a
>> Feb 11 07:24:15 localhost kernel:
>> Feb 11 07:24:15 localhost kernel: but task is already holding lock:
>> Feb 11 07:24:15 localhost kernel: (s_active){++++.+}, at:
>> [<ffffffff81159530>] sysfs_get_active_two+0x24/0x48
>> Feb 11 07:24:15 localhost kernel:
>> Feb 11 07:24:15 localhost kernel: other info that might help us debug this:
>> Feb 11 07:24:15 localhost kernel: 4 locks held by 94cpufreq/2933:
>> Feb 11 07:24:15 localhost kernel: #0: ?(&buffer->mutex){+.+.+.}, at:
>> [<ffffffff81157dbd>] sysfs_write_file+0x3e/0x12b
>> Feb 11 07:24:15 localhost kernel: #1: ?(s_active){++++.+}, at:
>> [<ffffffff81159530>] sysfs_get_active_two+0x24/0x48
>> Feb 11 07:24:15 localhost kernel: #2: ?(s_active){++++.+}, at:
>> [<ffffffff8115953d>] sysfs_get_active_two+0x31/0x48
>> Feb 11 07:24:15 localhost kernel: #3: ?(dbs_mutex){+.+.+.}, at:
>> [<ffffffffa04bcdb3>] cpufreq_governor_dbs+0x29b/0x348
>> [cpufreq_ondemand]
>> Feb 11 07:24:15 localhost kernel:
>> Feb 11 07:24:15 localhost kernel: stack backtrace:
>> Feb 11 07:24:15 localhost kernel: Pid: 2933, comm: 94cpufreq Not
>> tainted 2.6.33-rc7 #1
>> Feb 11 07:24:15 localhost kernel: Call Trace:
>> Feb 11 07:24:15 localhost kernel: [<ffffffff81071f53>]
>> __lock_acquire+0xcf6/0xd8b
>> Feb 11 07:24:15 localhost kernel: [<ffffffff81070914>] ?
>> debug_check_no_locks_freed+0x120/0x12f
>> Feb 11 07:24:15 localhost kernel: [<ffffffff8107020e>] ?
>> trace_hardirqs_on_caller+0x11f/0x14a
>> Feb 11 07:24:15 localhost kernel: [<ffffffff810720c0>] lock_acquire+0xd8/0xf5
>> Feb 11 07:24:15 localhost kernel: [<ffffffff81157408>] ?
>> sysfs_hash_and_remove+0x53/0x6a
>> Feb 11 07:24:15 localhost kernel: [<ffffffff81159231>]
>> sysfs_addrm_finish+0xe1/0x175
>> Feb 11 07:24:15 localhost kernel: [<ffffffff81157408>] ?
>> sysfs_hash_and_remove+0x53/0x6a
>> Feb 11 07:24:15 localhost kernel: [<ffffffff8137b1b5>] ?
>> sub_preempt_count+0xa3/0xb6
>> Feb 11 07:24:15 localhost kernel: [<ffffffff81157408>]
>> sysfs_hash_and_remove+0x53/0x6a
>> Feb 11 07:24:15 localhost kernel: [<ffffffff8115a59d>]
>> sysfs_remove_group+0x91/0xc9
>> Feb 11 07:24:15 localhost kernel: [<ffffffffa04bcdc6>]
>> cpufreq_governor_dbs+0x2ae/0x348 [cpufreq_ondemand]
>> Feb 11 07:24:15 localhost kernel: [<ffffffff8137b1b5>] ?
>> sub_preempt_count+0xa3/0xb6
>> Feb 11 07:24:15 localhost kernel: [<ffffffff812ab9cc>]
>> __cpufreq_governor+0x89/0xc7
>> Feb 11 07:24:15 localhost kernel: [<ffffffff812ac7b9>]
>> __cpufreq_set_policy+0x18e/0x22a
>> Feb 11 07:24:15 localhost kernel: [<ffffffff812accd6>]
>> store_scaling_governor+0x199/0x1ed
>> Feb 11 07:24:15 localhost kernel: [<ffffffff812ad50d>] ? handle_update+0x0/0x39
>> Feb 11 07:24:15 localhost kernel: [<ffffffff8137685d>] ? down_write+0x76/0x7e
>> Feb 11 07:24:15 localhost kernel: [<ffffffff812ac607>] store+0x67/0x8b
>> Feb 11 07:24:15 localhost kernel: [<ffffffff81157e75>]
>> sysfs_write_file+0xf6/0x12b
>> Feb 11 07:24:15 localhost kernel: [<ffffffff811003c9>] vfs_write+0xb0/0x10a
>> Feb 11 07:24:15 localhost kernel: [<ffffffff811004f1>] sys_write+0x4c/0x75
>> Feb 11 07:24:15 localhost kernel: [<ffffffff81002d9b>]
>> system_call_fastpath+0x16/0x1b
>>
>
> This is still present in 2.6.33-rc8.
> I can trigger it every time. I have T500 laptop, the trick is to boot
> it up without plugging it in, so it is working on battery power, and
> then to shut the lid.
> first you get the message from above, and then it does a few more
> things before it freezes and I have to hard reboot it.
> The above problem does not occur if I have the laptop plugged into the wall.
>
> Here is the 2.6.33-rc8 version of above. If anyone has further
> suggestions for debugging, pls let me know.
> I've cced the folks that get_maintainer.pl reports for
> fs/sysfs/inode.c and fs/sysfs/dir.c

We have a real hang and a possibly spurious lockdep report.
Could you try my latest patches that I sent to gregkh to clear up the sysfs
lockdep reports? I don't expect it will fix your issue but it will
at least give
us a better idea of where to look.

Eric


> Feb 14 07:35:49 localhost kernel: =============================================
> Feb 14 07:35:49 localhost kernel: [ INFO: possible recursive locking detected ]
> Feb 14 07:35:49 localhost kernel: 2.6.33-rc8 #1
> Feb 14 07:35:49 localhost kernel: ---------------------------------------------
> Feb 14 07:35:49 localhost kernel: 94cpufreq/2914 is trying to acquire lock:
> Feb 14 07:35:49 localhost kernel: (s_active){++++.+}, at:
> [<ffffffff81157478>] sysfs_hash_and_remove+0x53/0x6a
> Feb 14 07:35:49 localhost kernel:
> Feb 14 07:35:49 localhost kernel: but task is already holding lock:
> Feb 14 07:35:49 localhost kernel: (s_active){++++.+}, at:
> [<ffffffff811595a0>] sysfs_get_active_two+0x24/0x48
> Feb 14 07:35:49 localhost kernel:
> Feb 14 07:35:49 localhost kernel: other info that might help us debug this:
> Feb 14 07:35:49 localhost kernel: 4 locks held by 94cpufreq/2914:
> Feb 14 07:35:49 localhost kernel: #0: ?(&buffer->mutex){+.+.+.}, at:
> [<ffffffff81157e2d>] sysfs_write_file+0x3e/0x12b
> Feb 14 07:35:49 localhost kernel: #1: ?(s_active){++++.+}, at:
> [<ffffffff811595a0>] sysfs_get_active_two+0x24/0x48
> Feb 14 07:35:49 localhost kernel: #2: ?(s_active){++++.+}, at:
> [<ffffffff811595ad>] sysfs_get_active_two+0x31/0x48
> Feb 14 07:35:49 localhost kernel: #3: ?(dbs_mutex){+.+.+.}, at:
> [<ffffffffa04c0dbe>] cpufreq_governor_dbs+0x29b/0x345
> [cpufreq_ondemand]
> Feb 14 07:35:49 localhost kernel:
> Feb 14 07:35:49 localhost kernel: stack backtrace:
> Feb 14 07:35:49 localhost kernel: Pid: 2914, comm: 94cpufreq Not
> tainted 2.6.33-rc8 #1
> Feb 14 07:35:49 localhost kernel: Call Trace:
> Feb 14 07:35:49 localhost kernel: [<ffffffff81071f53>]
> __lock_acquire+0xcf6/0xd8b
> Feb 14 07:35:49 localhost kernel: [<ffffffff81070914>] ?
> debug_check_no_locks_freed+0x120/0x12f
> Feb 14 07:35:49 localhost kernel: [<ffffffff8107020e>] ?
> trace_hardirqs_on_caller+0x11f/0x14a
> Feb 14 07:35:49 localhost kernel: [<ffffffff810720c0>] lock_acquire+0xd8/0xf5
> Feb 14 07:35:49 localhost kernel: [<ffffffff81157478>] ?
> sysfs_hash_and_remove+0x53/0x6a
> Feb 14 07:35:49 localhost kernel: [<ffffffff811592a1>]
> sysfs_addrm_finish+0xe1/0x175
> Feb 14 07:35:49 localhost kernel: [<ffffffff81157478>] ?
> sysfs_hash_and_remove+0x53/0x6a
> Feb 14 07:35:49 localhost kernel: [<ffffffff8137b335>] ?
> sub_preempt_count+0xa3/0xb6
> Feb 14 07:35:49 localhost kernel: [<ffffffff81157478>]
> sysfs_hash_and_remove+0x53/0x6a
> Feb 14 07:35:49 localhost kernel: [<ffffffff8115a60d>]
> sysfs_remove_group+0x91/0xc9
> Feb 14 07:35:49 localhost kernel: [<ffffffffa04c0dd1>]
> cpufreq_governor_dbs+0x2ae/0x345 [cpufreq_ondemand]
> Feb 14 07:35:49 localhost kernel: [<ffffffff8137b335>] ?
> sub_preempt_count+0xa3/0xb6
> Feb 14 07:35:49 localhost kernel: [<ffffffff812aba60>]
> __cpufreq_governor+0x89/0xc7
> Feb 14 07:35:49 localhost kernel: [<ffffffff812ac84d>]
> __cpufreq_set_policy+0x18e/0x22a
> Feb 14 07:35:49 localhost kernel: [<ffffffff812acd6a>]
> store_scaling_governor+0x199/0x1ed
> Feb 14 07:35:49 localhost kernel: [<ffffffff812ad5a1>] ? handle_update+0x0/0x39
> Feb 14 07:35:49 localhost kernel: [<ffffffff813769ad>] ? down_write+0x76/0x7e
> Feb 14 07:35:49 localhost kernel: [<ffffffff812ac69b>] store+0x67/0x8b
> Feb 14 07:35:49 localhost kernel: [<ffffffff81157ee5>]
> sysfs_write_file+0xf6/0x12b
> Feb 14 07:35:49 localhost kernel: [<ffffffff811003c9>] vfs_write+0xb0/0x10a
> Feb 14 07:35:49 localhost kernel: [<ffffffff811004f1>] sys_write+0x4c/0x75
> Feb 14 07:35:49 localhost kernel: [<ffffffff81002d9b>]
> system_call_fastpath+0x16/0x1b
>

2010-02-14 20:54:43

by John Kacur

[permalink] [raw]
Subject: Re: [BUG]: Possibe recursive locking detected in sysfs

On Sun, Feb 14, 2010 at 9:57 AM, Eric Biederman
<[email protected]> wrote:
> On Sun, Feb 14, 2010 at 5:10 AM, John Kacur <[email protected]> wrote:
>> On Thu, Feb 11, 2010 at 12:11 PM, John Kacur <[email protected]> wrote:
>>> I'm not sure if this one has already been reported. - thanks.
>>>
>>> Feb 11 07:24:15 localhost kernel: [ INFO: possible recursive locking detected ]
>>> Feb 11 07:24:15 localhost kernel: 2.6.33-rc7 #1
>>> Feb 11 07:24:15 localhost kernel: ---------------------------------------------
>>> Feb 11 07:24:15 localhost kernel: 94cpufreq/2933 is trying to acquire lock:
>>> Feb 11 07:24:15 localhost kernel: (s_active){++++.+}, at:
>>> [<ffffffff81157408>] sysfs_hash_and_remove+0x53/0x6a
>>> Feb 11 07:24:15 localhost kernel:
>>> Feb 11 07:24:15 localhost kernel: but task is already holding lock:
>>> Feb 11 07:24:15 localhost kernel: (s_active){++++.+}, at:
>>> [<ffffffff81159530>] sysfs_get_active_two+0x24/0x48
>>> Feb 11 07:24:15 localhost kernel:
>>> Feb 11 07:24:15 localhost kernel: other info that might help us debug this:
>>> Feb 11 07:24:15 localhost kernel: 4 locks held by 94cpufreq/2933:
>>> Feb 11 07:24:15 localhost kernel: #0: ?(&buffer->mutex){+.+.+.}, at:
>>> [<ffffffff81157dbd>] sysfs_write_file+0x3e/0x12b
>>> Feb 11 07:24:15 localhost kernel: #1: ?(s_active){++++.+}, at:
>>> [<ffffffff81159530>] sysfs_get_active_two+0x24/0x48
>>> Feb 11 07:24:15 localhost kernel: #2: ?(s_active){++++.+}, at:
>>> [<ffffffff8115953d>] sysfs_get_active_two+0x31/0x48
>>> Feb 11 07:24:15 localhost kernel: #3: ?(dbs_mutex){+.+.+.}, at:
>>> [<ffffffffa04bcdb3>] cpufreq_governor_dbs+0x29b/0x348
>>> [cpufreq_ondemand]
>>> Feb 11 07:24:15 localhost kernel:
>>> Feb 11 07:24:15 localhost kernel: stack backtrace:
>>> Feb 11 07:24:15 localhost kernel: Pid: 2933, comm: 94cpufreq Not
>>> tainted 2.6.33-rc7 #1
>>> Feb 11 07:24:15 localhost kernel: Call Trace:
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff81071f53>]
>>> __lock_acquire+0xcf6/0xd8b
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff81070914>] ?
>>> debug_check_no_locks_freed+0x120/0x12f
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff8107020e>] ?
>>> trace_hardirqs_on_caller+0x11f/0x14a
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff810720c0>] lock_acquire+0xd8/0xf5
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff81157408>] ?
>>> sysfs_hash_and_remove+0x53/0x6a
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff81159231>]
>>> sysfs_addrm_finish+0xe1/0x175
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff81157408>] ?
>>> sysfs_hash_and_remove+0x53/0x6a
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff8137b1b5>] ?
>>> sub_preempt_count+0xa3/0xb6
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff81157408>]
>>> sysfs_hash_and_remove+0x53/0x6a
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff8115a59d>]
>>> sysfs_remove_group+0x91/0xc9
>>> Feb 11 07:24:15 localhost kernel: [<ffffffffa04bcdc6>]
>>> cpufreq_governor_dbs+0x2ae/0x348 [cpufreq_ondemand]
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff8137b1b5>] ?
>>> sub_preempt_count+0xa3/0xb6
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff812ab9cc>]
>>> __cpufreq_governor+0x89/0xc7
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff812ac7b9>]
>>> __cpufreq_set_policy+0x18e/0x22a
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff812accd6>]
>>> store_scaling_governor+0x199/0x1ed
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff812ad50d>] ? handle_update+0x0/0x39
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff8137685d>] ? down_write+0x76/0x7e
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff812ac607>] store+0x67/0x8b
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff81157e75>]
>>> sysfs_write_file+0xf6/0x12b
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff811003c9>] vfs_write+0xb0/0x10a
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff811004f1>] sys_write+0x4c/0x75
>>> Feb 11 07:24:15 localhost kernel: [<ffffffff81002d9b>]
>>> system_call_fastpath+0x16/0x1b
>>>
>>
>> This is still present in 2.6.33-rc8.
>> I can trigger it every time. I have T500 laptop, the trick is to boot
>> it up without plugging it in, so it is working on battery power, and
>> then to shut the lid.
>> first you get the message from above, and then it does a few more
>> things before it freezes and I have to hard reboot it.
>> The above problem does not occur if I have the laptop plugged into the wall.
>>
>> Here is the 2.6.33-rc8 version of above. If anyone has further
>> suggestions for debugging, pls let me know.
>> I've cced the folks that get_maintainer.pl reports for
>> fs/sysfs/inode.c and fs/sysfs/dir.c
>
> We have a real hang and a possibly spurious lockdep report.
> Could you try my latest patches that I sent to gregkh to clear up the sysfs
> lockdep reports? ? I don't expect it will fix your issue but it will
> at least give
> us a better idea of where to look.
>
> Eric
>
Sure, are you referring to the patch-set that begins with
"[PATCH 1/6] sysfs: Serialize updates to the vfs inode"?

2010-02-14 21:50:12

by Eric Biederman

[permalink] [raw]
Subject: Re: [BUG]: Possibe recursive locking detected in sysfs

> Sure, are you referring to the patch-set that begins with
> "[PATCH 1/6] sysfs: Serialize updates to the vfs inode"?

Sorry no.

http://lkml.org/lkml/2010/2/11/329

Eric

2010-02-15 02:22:11

by John Kacur

[permalink] [raw]
Subject: Re: [BUG]: Possibe recursive locking detected in sysfs

On Sun, Feb 14, 2010 at 4:50 PM, Eric Biederman
<[email protected]> wrote:
>> Sure, are you referring to the patch-set that begins with
>> "[PATCH 1/6] sysfs: Serialize updates to the vfs inode"?
>
> Sorry no.
>
> http://lkml.org/lkml/2010/2/11/329
>

I applied your patch, and yes, it removed the possible recursive
locking detected message, but everything still froze.
I don't think I really have any good info from the crash to report.
Your patch seems to have added the symptom of a huge number of
BUG: key ffff880126269e40 not in .data!
BUG: key ffff880136fc03f0 not in .data!

messages to my log.

If there is any other info that I can provide that would be useful,
please let me know.

2010-02-15 10:04:08

by Eric Biederman

[permalink] [raw]
Subject: Re: [BUG]: Possibe recursive locking detected in sysfs

On Sun, Feb 14, 2010 at 6:22 PM, John Kacur <[email protected]> wrote:
> On Sun, Feb 14, 2010 at 4:50 PM, Eric Biederman
> <[email protected]> wrote:
>>> Sure, are you referring to the patch-set that begins with
>>> "[PATCH 1/6] sysfs: Serialize updates to the vfs inode"?
>>
>> Sorry no.
>>
>> http://lkml.org/lkml/2010/2/11/329
>>
>
> I applied your patch, and yes, it removed the possible recursive
> locking detected message, but everything still froze.
> I don't think I really have any good info from the crash to report.
> Your patch seems to have added the symptom of a huge number of
> BUG: key ffff880126269e40 not in .data!
> BUG: key ffff880136fc03f0 not in .data!

Those are from dynamic sysfs entries that I have not yet annoted
with sysfs_attr_init, and are generally harmless. If you happen
to see the first one. I would appreciate having the backtrace so I
can see about fixing it.

With respect to your problem the important point is that lockdep does
not throw a warning and disable itself. Can you verify that?

Assuming that lockdep has not complained and disabled itself than
my patches are successful at disabling the sysfs lockdep false positives
(except those BUG: key ... not in .data messages). and the lockdep
warnings are just a coincidence in your case.

I believe the cause of your hang is somewhere else entirely. Perhaps
a driver regression.

Eric

2010-02-15 10:30:31

by Cong Wang

[permalink] [raw]
Subject: Re: [BUG]: Possibe recursive locking detected in sysfs

On Mon, Feb 15, 2010 at 01:56:45AM -0800, Eric Biederman wrote:
>On Sun, Feb 14, 2010 at 6:22 PM, John Kacur <[email protected]> wrote:
>> On Sun, Feb 14, 2010 at 4:50 PM, Eric Biederman
>> <[email protected]> wrote:
>>>> Sure, are you referring to the patch-set that begins with
>>>> "[PATCH 1/6] sysfs: Serialize updates to the vfs inode"?
>>>
>>> Sorry no.
>>>
>>> http://lkml.org/lkml/2010/2/11/329
>>>
>>
>> I applied your patch, and yes, it removed the possible recursive
>> locking detected message, but everything still froze.
>> I don't think I really have any good info from the crash to report.
>> Your patch seems to have added the symptom of a huge number of
>> BUG: key ffff880126269e40 not in .data!
>> BUG: key ffff880136fc03f0 not in .data!
>
>Those are from dynamic sysfs entries that I have not yet annoted
>with sysfs_attr_init, and are generally harmless. If you happen
>to see the first one. I would appreciate having the backtrace so I
>can see about fixing it.
>
>With respect to your problem the important point is that lockdep does
>not throw a warning and disable itself. Can you verify that?
>
>Assuming that lockdep has not complained and disabled itself than
>my patches are successful at disabling the sysfs lockdep false positives
>(except those BUG: key ... not in .data messages). and the lockdep
>warnings are just a coincidence in your case.
>
>I believe the cause of your hang is somewhere else entirely. Perhaps
>a driver regression.
>

Right, we got a real deadlock here:

http://lkml.org/lkml/2010/1/28/320

2010-02-15 19:41:24

by Maciej Rutecki

[permalink] [raw]
Subject: Re: [BUG]: Possibe recursive locking detected in sysfs

On czwartek, 11 lutego 2010 o 18:11:46 John Kacur wrote:
> I'm not sure if this one has already been reported. - thanks.
>
> Feb 11 07:24:15 localhost kernel: [ INFO: possible recursive locking
> detected ] Feb 11 07:24:15 localhost kernel: 2.6.33-rc7 #1
[...]

I created a Bugzilla entry at http://bugzilla.kernel.org/show_bug.cgi?id=15319
for your bug report, please add your address to the CC list in there, thanks!


--
Maciej Rutecki
http://www.maciek.unixy.pl