Hi,
I was playing around with the tracers I encountered a lockdep
warning when the system is using threaded IRQs. I can trigger it by
adding threadirqs to the kernel options and then do a
echo preemptoff > /sys/kernel/debug/tracing/current_tracer
[ 206.610079] ------------[ cut here ]------------
[ 206.610122] WARNING: CPU: 0 PID: 3 at kernel/locking/lockdep.c:3574 check_flags.part.36+0x1bc/0x210()
[ 206.610166] DEBUG_LOCKS_WARN_ON(current->softirqs_enabled)
[ 206.610192] Modules linked in:
[ 206.610224] x86_pkg_temp_thermal coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel iTCO_wdt iTCO_vendor_support ipmi_devintf dcdbas joydev tpm_tis pcspkr acpi_pad ipmi_si ipmi_msghandler acpi_power_meter wmi mei_me tpm sb_edac shpchp edac_core mei lpc_ich mfd_core nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c mgag200 drm_kms_helper ttm drm ixgbe igb mdio vxlan ip6_udp_tunnel udp_tunnel megaraid_sas ptp pps_core dca i2c_algo_bit
[ 206.610892] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.1.0-rc4 #99
[ 206.610952] Hardware name: Dell Inc. PowerEdge R820/066N7P, BIOS 2.0.20 01/16/2014
[ 206.611021] ffffffff8186e017 000000005a61066b ffff881fb11d3b28 ffffffff8186e017
[ 206.611118] 0000000000000000 ffff881fb11d3b80 ffff881fb11d3b68 ffffffff810af8fa
[ 206.611211] ffff881fb11d3b88 ffffffff81e6cb58 0000000000000000 0000000000000000
[ 206.611302] Call Trace:
[ 206.611338] [<ffffffff8186e017>] ? dump_stack+0x4f/0x7b
[ 206.611392] [<ffffffff8186e017>] dump_stack+0x4f/0x7b
[ 206.611448] [<ffffffff810af8fa>] warn_slowpath_common+0x8a/0xc0
[ 206.611507] [<ffffffff810af985>] warn_slowpath_fmt+0x55/0x70
[ 206.611565] [<ffffffff810af935>] ? warn_slowpath_fmt+0x5/0x70
[ 206.611623] [<ffffffff811126fc>] check_flags.part.36+0x1bc/0x210
[ 206.611682] [<ffffffff81113479>] lock_acquire+0x199/0x2d0
[ 206.611738] [<ffffffff8187920b>] ? _raw_spin_lock_irqsave+0x2b/0xa0
[ 206.611800] [<ffffffff810e5925>] ? preempt_count_add+0x5/0xc0
[ 206.611860] [<ffffffff81879239>] _raw_spin_lock_irqsave+0x59/0xa0
[ 206.611922] [<ffffffff811b773e>] ? check_critical_timing+0xae/0x170
[ 206.611982] [<ffffffff818791e5>] ? _raw_spin_lock_irqsave+0x5/0xa0
[ 206.612043] [<ffffffff811b773e>] check_critical_timing+0xae/0x170
[ 206.612102] [<ffffffff810b606c>] ? __do_softirq+0x24c/0x6b0
[ 206.612158] [<ffffffff810b434b>] ? __local_bh_enable+0x3b/0x70
[ 206.612213] [<ffffffff810b606c>] ? __do_softirq+0x24c/0x6b0
[ 206.612266] [<ffffffff810b434b>] ? __local_bh_enable+0x3b/0x70
[ 206.612322] [<ffffffff811b8401>] trace_preempt_on+0x101/0x120
[ 206.612376] [<ffffffff810e59e5>] ? preempt_count_sub+0x5/0xf0
[ 206.612431] [<ffffffff810e5a8b>] preempt_count_sub+0xab/0xf0
[ 206.612485] [<ffffffff810b434b>] ? __local_bh_enable+0x3b/0x70
[ 206.612540] [<ffffffff810b434b>] __local_bh_enable+0x3b/0x70
[ 206.612595] [<ffffffff810b606c>] ? __do_softirq+0x24c/0x6b0
[ 206.612649] [<ffffffff810b606c>] __do_softirq+0x24c/0x6b0
[ 206.612703] [<ffffffff810b64f5>] run_ksoftirqd+0x25/0x70
[ 206.614370] [<ffffffff810db453>] smpboot_thread_fn+0x193/0x2a0
[ 206.616013] [<ffffffff810db2c0>] ? sort_range+0x30/0x30
[ 206.616948] [<ffffffff810d71f4>] kthread+0x104/0x120
[ 206.617733] [<ffffffff81873293>] ? wait_for_completion+0xd3/0x140
[ 206.618512] [<ffffffff810d70f0>] ? kthread_create_on_node+0x260/0x260
[ 206.619281] [<ffffffff818798a2>] ret_from_fork+0x42/0x70
[ 206.620054] [<ffffffff810d70f0>] ? kthread_create_on_node+0x260/0x260
[ 206.620831] ---[ end trace 9e22068fb8699756 ]---
[ 206.621607] irq event stamp: 636192
[ 206.622362] hardirqs last enabled at (636190): [<ffffffff818787e5>] _raw_spin_unlock_irqrestore+0x65/0x90
[ 206.623140] hardirqs last disabled at (636191): [<ffffffff810b5ffe>] __do_softirq+0x1de/0x6b0
[ 206.623904] softirqs last enabled at (636192): [<ffffffff810b606c>] __do_softirq+0x24c/0x6b0
[ 206.624653] softirqs last disabled at (635925): [<ffffffff810b64f5>] run_ksoftirqd+0x25/0x70
There is no warning without threadirqs.
The same warning can also be triggered inside KVM. I tried to
bisect it but kernel version before v3.15 didn't boot due to
some systemd dependencies.
Any ideas?
cheers,
daniel
On 05/20/2015 10:46 AM, Daniel Wagner wrote:
> [ 206.610079] ------------[ cut here ]------------
> [ 206.610122] WARNING: CPU: 0 PID: 3 at kernel/locking/lockdep.c:3574 check_flags.part.36+0x1bc/0x210()
> [ 206.610166] DEBUG_LOCKS_WARN_ON(current->softirqs_enabled)
> [ 206.610192] Modules linked in:
> [ 206.610224] x86_pkg_temp_thermal coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel iTCO_wdt iTCO_vendor_support ipmi_devintf dcdbas joydev tpm_tis pcspkr acpi_pad ipmi_si ipmi_msghandler acpi_power_meter wmi mei_me tpm sb_edac shpchp edac_core mei lpc_ich mfd_core nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c mgag200 drm_kms_helper ttm drm ixgbe igb mdio vxlan ip6_udp_tunnel udp_tunnel megaraid_sas ptp pps_core dca i2c_algo_bit
> [ 206.610892] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.1.0-rc4 #99
> [ 206.610952] Hardware name: Dell Inc. PowerEdge R820/066N7P, BIOS 2.0.20 01/16/2014
> [ 206.611021] ffffffff8186e017 000000005a61066b ffff881fb11d3b28 ffffffff8186e017
> [ 206.611118] 0000000000000000 ffff881fb11d3b80 ffff881fb11d3b68 ffffffff810af8fa
> [ 206.611211] ffff881fb11d3b88 ffffffff81e6cb58 0000000000000000 0000000000000000
> [ 206.611302] Call Trace:
> [ 206.611338] [<ffffffff8186e017>] ? dump_stack+0x4f/0x7b
> [ 206.611392] [<ffffffff8186e017>] dump_stack+0x4f/0x7b
> [ 206.611448] [<ffffffff810af8fa>] warn_slowpath_common+0x8a/0xc0
> [ 206.611507] [<ffffffff810af985>] warn_slowpath_fmt+0x55/0x70
> [ 206.611565] [<ffffffff810af935>] ? warn_slowpath_fmt+0x5/0x70
> [ 206.611623] [<ffffffff811126fc>] check_flags.part.36+0x1bc/0x210
> [ 206.611682] [<ffffffff81113479>] lock_acquire+0x199/0x2d0
> [ 206.611738] [<ffffffff8187920b>] ? _raw_spin_lock_irqsave+0x2b/0xa0
> [ 206.611800] [<ffffffff810e5925>] ? preempt_count_add+0x5/0xc0
> [ 206.611860] [<ffffffff81879239>] _raw_spin_lock_irqsave+0x59/0xa0
> [ 206.611922] [<ffffffff811b773e>] ? check_critical_timing+0xae/0x170
> [ 206.611982] [<ffffffff818791e5>] ? _raw_spin_lock_irqsave+0x5/0xa0
> [ 206.612043] [<ffffffff811b773e>] check_critical_timing+0xae/0x170
> [ 206.612102] [<ffffffff810b606c>] ? __do_softirq+0x24c/0x6b0
> [ 206.612158] [<ffffffff810b434b>] ? __local_bh_enable+0x3b/0x70
> [ 206.612213] [<ffffffff810b606c>] ? __do_softirq+0x24c/0x6b0
> [ 206.612266] [<ffffffff810b434b>] ? __local_bh_enable+0x3b/0x70
> [ 206.612322] [<ffffffff811b8401>] trace_preempt_on+0x101/0x120
> [ 206.612376] [<ffffffff810e59e5>] ? preempt_count_sub+0x5/0xf0
> [ 206.612431] [<ffffffff810e5a8b>] preempt_count_sub+0xab/0xf0
> [ 206.612485] [<ffffffff810b434b>] ? __local_bh_enable+0x3b/0x70
> [ 206.612540] [<ffffffff810b434b>] __local_bh_enable+0x3b/0x70
> [ 206.612595] [<ffffffff810b606c>] ? __do_softirq+0x24c/0x6b0
> [ 206.612649] [<ffffffff810b606c>] __do_softirq+0x24c/0x6b0
> [ 206.612703] [<ffffffff810b64f5>] run_ksoftirqd+0x25/0x70
> [ 206.614370] [<ffffffff810db453>] smpboot_thread_fn+0x193/0x2a0
> [ 206.616013] [<ffffffff810db2c0>] ? sort_range+0x30/0x30
> [ 206.616948] [<ffffffff810d71f4>] kthread+0x104/0x120
> [ 206.617733] [<ffffffff81873293>] ? wait_for_completion+0xd3/0x140
> [ 206.618512] [<ffffffff810d70f0>] ? kthread_create_on_node+0x260/0x260
> [ 206.619281] [<ffffffff818798a2>] ret_from_fork+0x42/0x70
> [ 206.620054] [<ffffffff810d70f0>] ? kthread_create_on_node+0x260/0x260
> [ 206.620831] ---[ end trace 9e22068fb8699756 ]---
> [ 206.621607] irq event stamp: 636192
> [ 206.622362] hardirqs last enabled at (636190): [<ffffffff818787e5>] _raw_spin_unlock_irqrestore+0x65/0x90
> [ 206.623140] hardirqs last disabled at (636191): [<ffffffff810b5ffe>] __do_softirq+0x1de/0x6b0
> [ 206.623904] softirqs last enabled at (636192): [<ffffffff810b606c>] __do_softirq+0x24c/0x6b0
> [ 206.624653] softirqs last disabled at (635925): [<ffffffff810b64f5>] run_ksoftirqd+0x25/0x70
I starred for a while at the code and saw the comment in
__local_bh_disable_ip():
/*
* The preempt tracer hooks into preempt_count_add and will break
* lockdep because it calls back into lockdep after SOFTIRQ_OFFSET
* is set and before current->softirq_enabled is cleared.
* We must manually increment preempt_count here and manually
* call the trace_preempt_off later.
*/
>From this I figured this might be the same problem just in the
enable path. The below patch made the lockdep warning disappear.
Though I am not sure what I breaking with this.
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 479e443..6fc0449 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -128,9 +128,13 @@ static void __local_bh_enable(unsigned int cnt)
{
WARN_ON_ONCE(!irqs_disabled());
+ if (preempt_count() == cnt)
+ trace_preempt_on(CALLER_ADDR0, get_parent_ip(CALLER_ADDR1));
+
if (softirq_count() == (cnt & SOFTIRQ_MASK))
trace_softirqs_on(_RET_IP_);
- preempt_count_sub(cnt);
+
+ __preempt_count_sub(cnt);
}
/*
cheers,
daniel
On Tue, 26 May 2015 13:52:02 +0200
Daniel Wagner <[email protected]> wrote:
> On 05/20/2015 10:46 AM, Daniel Wagner wrote:
> > [ 206.610079] ------------[ cut here ]------------
> > [ 206.610122] WARNING: CPU: 0 PID: 3 at kernel/locking/lockdep.c:3574 check_flags.part.36+0x1bc/0x210()
> > [ 206.610166] DEBUG_LOCKS_WARN_ON(current->softirqs_enabled)
> > [ 206.610192] Modules linked in:
> > [ 206.610224] x86_pkg_temp_thermal coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel iTCO_wdt iTCO_vendor_support ipmi_devintf dcdbas joydev tpm_tis pcspkr acpi_pad ipmi_si ipmi_msghandler acpi_power_meter wmi mei_me tpm sb_edac shpchp edac_core mei lpc_ich mfd_core nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c mgag200 drm_kms_helper ttm drm ixgbe igb mdio vxlan ip6_udp_tunnel udp_tunnel megaraid_sas ptp pps_core dca i2c_algo_bit
> > [ 206.610892] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.1.0-rc4 #99
> > [ 206.610952] Hardware name: Dell Inc. PowerEdge R820/066N7P, BIOS 2.0.20 01/16/2014
> > [ 206.611021] ffffffff8186e017 000000005a61066b ffff881fb11d3b28 ffffffff8186e017
> > [ 206.611118] 0000000000000000 ffff881fb11d3b80 ffff881fb11d3b68 ffffffff810af8fa
> > [ 206.611211] ffff881fb11d3b88 ffffffff81e6cb58 0000000000000000 0000000000000000
> > [ 206.611302] Call Trace:
> > [ 206.611338] [<ffffffff8186e017>] ? dump_stack+0x4f/0x7b
> > [ 206.611392] [<ffffffff8186e017>] dump_stack+0x4f/0x7b
> > [ 206.611448] [<ffffffff810af8fa>] warn_slowpath_common+0x8a/0xc0
> > [ 206.611507] [<ffffffff810af985>] warn_slowpath_fmt+0x55/0x70
> > [ 206.611565] [<ffffffff810af935>] ? warn_slowpath_fmt+0x5/0x70
> > [ 206.611623] [<ffffffff811126fc>] check_flags.part.36+0x1bc/0x210
> > [ 206.611682] [<ffffffff81113479>] lock_acquire+0x199/0x2d0
> > [ 206.611738] [<ffffffff8187920b>] ? _raw_spin_lock_irqsave+0x2b/0xa0
> > [ 206.611800] [<ffffffff810e5925>] ? preempt_count_add+0x5/0xc0
> > [ 206.611860] [<ffffffff81879239>] _raw_spin_lock_irqsave+0x59/0xa0
> > [ 206.611922] [<ffffffff811b773e>] ? check_critical_timing+0xae/0x170
> > [ 206.611982] [<ffffffff818791e5>] ? _raw_spin_lock_irqsave+0x5/0xa0
> > [ 206.612043] [<ffffffff811b773e>] check_critical_timing+0xae/0x170
> > [ 206.612102] [<ffffffff810b606c>] ? __do_softirq+0x24c/0x6b0
> > [ 206.612158] [<ffffffff810b434b>] ? __local_bh_enable+0x3b/0x70
> > [ 206.612213] [<ffffffff810b606c>] ? __do_softirq+0x24c/0x6b0
> > [ 206.612266] [<ffffffff810b434b>] ? __local_bh_enable+0x3b/0x70
> > [ 206.612322] [<ffffffff811b8401>] trace_preempt_on+0x101/0x120
> > [ 206.612376] [<ffffffff810e59e5>] ? preempt_count_sub+0x5/0xf0
> > [ 206.612431] [<ffffffff810e5a8b>] preempt_count_sub+0xab/0xf0
> > [ 206.612485] [<ffffffff810b434b>] ? __local_bh_enable+0x3b/0x70
> > [ 206.612540] [<ffffffff810b434b>] __local_bh_enable+0x3b/0x70
> > [ 206.612595] [<ffffffff810b606c>] ? __do_softirq+0x24c/0x6b0
> > [ 206.612649] [<ffffffff810b606c>] __do_softirq+0x24c/0x6b0
> > [ 206.612703] [<ffffffff810b64f5>] run_ksoftirqd+0x25/0x70
> > [ 206.614370] [<ffffffff810db453>] smpboot_thread_fn+0x193/0x2a0
> > [ 206.616013] [<ffffffff810db2c0>] ? sort_range+0x30/0x30
> > [ 206.616948] [<ffffffff810d71f4>] kthread+0x104/0x120
> > [ 206.617733] [<ffffffff81873293>] ? wait_for_completion+0xd3/0x140
> > [ 206.618512] [<ffffffff810d70f0>] ? kthread_create_on_node+0x260/0x260
> > [ 206.619281] [<ffffffff818798a2>] ret_from_fork+0x42/0x70
> > [ 206.620054] [<ffffffff810d70f0>] ? kthread_create_on_node+0x260/0x260
> > [ 206.620831] ---[ end trace 9e22068fb8699756 ]---
> > [ 206.621607] irq event stamp: 636192
> > [ 206.622362] hardirqs last enabled at (636190): [<ffffffff818787e5>] _raw_spin_unlock_irqrestore+0x65/0x90
> > [ 206.623140] hardirqs last disabled at (636191): [<ffffffff810b5ffe>] __do_softirq+0x1de/0x6b0
> > [ 206.623904] softirqs last enabled at (636192): [<ffffffff810b606c>] __do_softirq+0x24c/0x6b0
> > [ 206.624653] softirqs last disabled at (635925): [<ffffffff810b64f5>] run_ksoftirqd+0x25/0x70
>
> I starred for a while at the code and saw the comment in
> __local_bh_disable_ip():
>
> /*
> * The preempt tracer hooks into preempt_count_add and will break
> * lockdep because it calls back into lockdep after SOFTIRQ_OFFSET
> * is set and before current->softirq_enabled is cleared.
> * We must manually increment preempt_count here and manually
> * call the trace_preempt_off later.
> */
>
> >From this I figured this might be the same problem just in the
> enable path. The below patch made the lockdep warning disappear.
> Though I am not sure what I breaking with this.
Don't you want to modify __local_bh_enable_ip()?
-- Steve
>
>
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 479e443..6fc0449 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -128,9 +128,13 @@ static void __local_bh_enable(unsigned int cnt)
> {
> WARN_ON_ONCE(!irqs_disabled());
>
> + if (preempt_count() == cnt)
> + trace_preempt_on(CALLER_ADDR0, get_parent_ip(CALLER_ADDR1));
> +
> if (softirq_count() == (cnt & SOFTIRQ_MASK))
> trace_softirqs_on(_RET_IP_);
> - preempt_count_sub(cnt);
> +
> + __preempt_count_sub(cnt);
> }
>
> /*
>
>
> cheers,
> daniel
On 05/26/2015 02:32 PM, Steven Rostedt wrote:
> Daniel Wagner <[email protected]> wrote:
>> On 05/20/2015 10:46 AM, Daniel Wagner wrote:
>> I starred for a while at the code and saw the comment in
>> __local_bh_disable_ip():
>>
>> /*
>> * The preempt tracer hooks into preempt_count_add and will break
>> * lockdep because it calls back into lockdep after SOFTIRQ_OFFSET
>> * is set and before current->softirq_enabled is cleared.
>> * We must manually increment preempt_count here and manually
>> * call the trace_preempt_off later.
>> */
>>
>> >From this I figured this might be the same problem just in the
>> enable path. The below patch made the lockdep warning disappear.
>> Though I am not sure what I breaking with this.
>
> Don't you want to modify __local_bh_enable_ip()?
__do_softirq() calls __local_bh_disable_ip() and __local_bh_enable() and
not __local_bh_enable_ip(). I have no clue why this is done this way.
Need to look into the history.
On Tue, 26 May 2015 14:51:28 +0200
Daniel Wagner <[email protected]> wrote:
> > Don't you want to modify __local_bh_enable_ip()?
>
> __do_softirq() calls __local_bh_disable_ip() and __local_bh_enable() and
> not __local_bh_enable_ip(). I have no clue why this is done this way.
> Need to look into the history.
Yeah, that probably needs to be fixed. Find out why we call *_ip() and
it would make more sense to make that symmetric, as it may break
something else otherwise.
-- Steve